The Cochrane Database of Systematic Reviews. 2018 Dec 4;2018(12):CD013194. doi: 10.1002/14651858.CD013194

Visual inspection for diagnosing cutaneous melanoma in adults

Jacqueline Dinnes 1,2, Jonathan J Deeks 1,2, Matthew J Grainge 3, Naomi Chuchu 1, Lavinia Ferrante di Ruffano 1, Rubeta N Matin 4, David R Thomson 5, Kai Yuen Wong 6, Roger Benjamin Aldridge 7, Rachel Abbott 8, Monica Fawzy 9, Susan E Bayliss 1, Yemisi Takwoingi 1,2, Clare Davenport 1, Kathie Godfrey 10, Fiona M Walter 11, Hywel C Williams 12; Cochrane Skin Cancer Diagnostic Test Accuracy Group 1
Editor: Cochrane Skin Group
PMCID: PMC6492463  PMID: 30521684

Abstract

Background

Melanoma has one of the fastest rising incidence rates of any cancer. It accounts for a small percentage of skin cancer cases but is responsible for the majority of skin cancer deaths. History‐taking and visual inspection of a suspicious lesion by a clinician is usually the first in a series of ‘tests’ to diagnose skin cancer. Establishing the accuracy of visual inspection alone is critical to understanding the potential contribution of additional tests to assist in the diagnosis of melanoma.

Objectives

To determine the diagnostic accuracy of visual inspection for the detection of cutaneous invasive melanoma and atypical intraepidermal melanocytic variants in adults with limited prior testing and in those referred for further evaluation of a suspicious lesion. Studies were separated according to whether the diagnosis was recorded face‐to‐face (in‐person) or based on remote (image‐based) assessment.

Search methods

We undertook a comprehensive search of the following databases from inception up to August 2016: CENTRAL; CINAHL; CPCI; Zetoc; Science Citation Index; US National Institutes of Health Ongoing Trials Register; NIHR Clinical Research Network Portfolio Database; and the World Health Organization International Clinical Trials Registry Platform. We studied reference lists and published systematic review articles.

Selection criteria

Test accuracy studies of any design that evaluated visual inspection in adults with lesions suspicious for melanoma, compared with a reference standard of either histological confirmation or clinical follow‐up. We excluded studies reporting data for ‘clinical diagnosis’ where dermoscopy may or may not have been used.

Data collection and analysis

Two review authors independently extracted all data using a standardised data extraction and quality assessment form (based on QUADAS‐2). We contacted authors of included studies where information related to the target condition or diagnostic threshold was missing. We estimated summary sensitivities and specificities per algorithm and threshold using the bivariate hierarchical model. We investigated the impact of: in‐person test interpretation; use of a purposely developed algorithm to assist diagnosis; and observer expertise.

Main results

We included 49 publications reporting on a total of 51 study cohorts with 34,351 lesions (including 2499 cases), providing 134 datasets for visual inspection. Across almost all study quality domains, the majority of study reports provided insufficient information to allow us to judge the risk of bias, while in three of four domains that we assessed we scored concerns regarding applicability of study findings as 'high'. Selective participant recruitment, lack of detail regarding the threshold for deciding on a positive test result, and lack of detail on observer expertise were particularly problematic.

Attempts to analyse studies by degree of prior testing were hampered by a lack of relevant information and by the restricted inclusion of lesions selected for biopsy or excision. Accuracy was generally much higher for in‐person diagnosis compared to image‐based evaluations (relative diagnostic odds ratio of 8.54, 95% CI 2.89 to 25.3, P < 0.001). Meta‐analysis of in‐person evaluations that could be clearly placed on the clinical pathway showed a general trade‐off between sensitivity and specificity, with the highest sensitivity (92.4%, 95% CI 26.2% to 99.8%) and lowest specificity (79.7%, 95% CI 73.7% to 84.7%) observed in participants with limited prior testing (n = 3 datasets). Summary sensitivities were lower for those referred for specialist assessment but with much higher specificities (e.g. sensitivity 76.7% (95% CI 61.7% to 87.1%) and specificity 95.7% (95% CI 89.7% to 98.3%) for lesions selected for excision, n = 8 datasets). These differences may be related to differences in the spectrum of included lesions, differences in the definition of a positive test result, or to variations in observer expertise. We did not find clear evidence that accuracy is improved by the use of any algorithm to assist diagnosis in all settings. Attempts to examine the effect of observer expertise in melanoma diagnosis were hindered by poor reporting.

Authors' conclusions

Visual inspection is a fundamental component of the assessment of a suspicious skin lesion; however, the evidence suggests that melanomas will be missed if visual inspection is used on its own. The evidence to support its accuracy in the range of settings in which it is used is flawed and very poorly reported. Although published algorithms do not appear to improve accuracy, there is insufficient evidence to suggest that the ‘no algorithm’ approach should be preferred in all settings. Despite the volume of research evaluating visual inspection, further prospective evaluation of the potential added value of using established algorithms according to the prior testing or diagnostic difficulty of lesions may be warranted.

Keywords: Adult, Aged, Humans, Middle Aged, Algorithms, Diagnostic Errors, Melanoma, Melanoma/diagnosis, Melanoma/diagnostic imaging, Physical Examination, Physical Examination/methods, Sensitivity and Specificity, Skin Neoplasms, Skin Neoplasms/diagnosis, Skin Neoplasms/diagnostic imaging

Plain language summary

How accurate is visual inspection of skin lesions with the naked eye for diagnosis of melanoma in adults?

What is the aim of the review?

Melanoma is one of the most dangerous forms of skin cancer. The aim of this Cochrane Review was to find out how accurate checking suspicious skin lesions (lumps, bumps, wounds, scratches or grazes) with the naked eye (visual inspection) can be to diagnose melanoma (diagnostic accuracy). The Review also investigated whether diagnostic accuracy was different depending on whether the clinician was face to face with the patient (in‐person visual inspection), or looked at an image of the lesion (image‐based visual inspection). Cochrane researchers included 49 studies to answer this question.

Why is it important to know the diagnostic accuracy of visual examination of skin lesions suspected to be melanomas?

Not recognising a melanoma when it is present (a false‐negative test result) delays surgery to remove it (excision), risking cancer spreading to other organs in the body and possibly death. Diagnosing a skin lesion (a mole or area of skin with an unusual appearance in comparison with the surrounding skin) as a melanoma when it is not (a false‐positive result) may result in unnecessary surgery, further investigations, and patient anxiety. Visual inspection of suspicious skin lesions by a clinician using the naked eye is usually the first of a series of ‘tests’ to diagnose melanoma. Knowing the diagnostic accuracy of visual inspection alone is important to decide whether additional tests, such as a biopsy (removing a part of the lesion for examination under a microscope) are needed to improve accuracy to an acceptable level.

What did the review study?

Researchers wanted to find out the diagnostic accuracy of in‐person compared with image‐based visual inspection of suspicious skin lesions. Researchers also wanted to find out whether diagnostic accuracy was improved if doctors used a 'visual inspection checklist' or depending on how experienced in visual inspection they were (level of clinical expertise). They considered the diagnostic accuracy of the first visual inspection of a lesion, for example, by a general practitioner (GP), and of lesions that had been referred for further evaluation, for example, by a dermatologist (doctor specialising in skin problems).

What are the main results of the review?

Only 19 studies (17 in‐person studies and 2 image‐based studies) were clear whether the test was the first visual inspection of a lesion or was a visual inspection following referral (for example, when patients are referred by a GP to skin specialists for visual inspection).

First in‐person visual inspection (3 studies)

The results of three studies of 1339 suspicious skin lesions suggest that in a group of 1000 lesions, of which 90 (9%) actually are melanoma:

‐ An estimated 268 will have a visual inspection result indicating melanoma is present. Of these, 185 will not be melanoma and will result in an unnecessary biopsy (false‐positive results).

‐ An estimated 732 will have a visual inspection result indicating that melanoma is not present. Of these, seven will actually have melanoma and would not be sent for biopsy (false‐negative results).

Two further studies restricted to 4228 suspicious skin lesions that were all selected to be excised found similar results.

In‐person visual inspection after referral, all lesions selected to be excised (8 studies)

The results of eight studies of 5331 suspicious skin lesions suggest that in a group of 1000 lesions, of which 90 (9%) actually are melanoma:

‐ An estimated 108 will have a visual inspection result indicating melanoma is present, and of these, 39 will not be melanoma and will result in an unnecessary biopsy (false‐positive results).

‐ Of the 892 lesions with a visual inspection result indicating that melanoma is not present, 21 will actually be melanoma and would not be sent for biopsy (false‐negative results).

Overall, the number of false‐positive results (diagnosing a skin lesion as a melanoma when it is not) was observed to be higher and the number of false‐negative results (not recognising a melanoma when it is present) lower for first visual inspections of suspicious skin lesions compared to visual inspection following referral.

Visual inspection of images of suspicious skin lesions (2 studies)

Accuracy was much lower for visual inspection of images of lesions compared to visual inspection in person.

Value of visual inspection checklists

There was no evidence that use of a visual inspection checklist or the level of clinical expertise changed diagnostic accuracy.

How reliable are the results of the studies of this review?

The majority of included studies diagnosed melanoma by lesion biopsy and confirmed that melanoma was not present by biopsy or by follow‐up over time to make sure the skin lesion remained negative for melanoma. In these studies, biopsy, clinical follow‐up, or specialist clinician diagnosis were the reference standards (means of establishing final diagnoses). Biopsy or follow‐up are likely to have been reliable methods for deciding whether patients really had melanoma. In a few studies, experts diagnosed the absence of melanoma (expert diagnosis), which is less likely to have been a reliable method for deciding whether patients really had melanoma. There was lots of variation in the results of the studies in this review and the studies did not always describe fully the methods they used, which made it difficult to assess their reliability.

Who do the results of this review apply to?

Thirteen studies were undertaken in Europe (68%), with the remainder undertaken in Asia (n = 1), Oceania (n = 4), and North America (n = 1). Mean age ranged from 30 to 73.6 years (reported in 10 studies). The percentage of individuals with melanoma ranged between 4% and 20% in first visualised lesions and between 1% and 50% in studies of referred lesions. In the majority of studies, the lesions were unlikely to be representative of the range of those seen in practice, for example, only including skin lesions of a certain size or with a specific appearance. In addition, variation in the expertise of clinicians performing visual inspection and in the definition used to decide whether or not melanoma was present across studies makes it unclear as to how visual inspection should be carried out and by whom in order to achieve the accuracy observed in studies.

What are the implications of this review?

Error rates from visual inspection are too high for it to be relied upon alone. Although not evaluated in this review, other technologies need to be used to ensure accurate diagnosis of skin cancer. There is considerable variation and uncertainty about the diagnostic accuracy of visual inspection alone for the diagnosis of melanoma. There is no evidence to suggest that visual inspection checklists reliably improve the diagnostic accuracy of visual inspection, so recommendations cannot be made about when they should be used. Despite the existence of numerous research studies, further, well‐reported studies assessing the diagnostic accuracy of visual inspection with and without visual inspection checklists and by clinicians with different levels of expertise are needed.

How up‐to‐date is this review?

The review authors searched for and used studies published up to August 2016.

Summary of findings

Summary of findings. What is the diagnostic accuracy of visual inspection for the detection of cutaneous invasive melanoma and atypical intraepidermal melanocytic variants in adults?

Question What is the diagnostic accuracy of visual inspection for the detection of cutaneous invasive melanoma and atypical intraepidermal melanocytic variants in adults?
Population Adults with lesions suspicious for melanoma, including:
  • those with limited prior testing (presenting in primary, community or private dermatology settings)

  • referred populations (presenting in secondary care or specialist skin cancer clinics)

Index test Visual inspection with or without the use of any established algorithms or checklist to aid diagnosis, including:
  • in‐person evaluations (face‐to‐face diagnosis)

  • image‐based evaluations (diagnosis based on assessment of a clinical image)

Target condition Cutaneous invasive melanoma and atypical intraepidermal melanocytic variants
Reference standard Histology with or without long‐term follow‐up
Action If accurate, positive results ensure melanoma lesions are not missed but are appropriately referred and excised and those with negative results can be safely reassured and discharged.
Quantity of evidence: Number of studies 49 (footnote a); Total lesions 34,351; Total cases 2499
Limitations
Risk of bias Potential risk for participant selection from case‐control design (6), inappropriate exclusion criteria (7) or lack of detail (27/49)
All index test interpretation was blinded to reference standard diagnosis. Index test thresholds not clearly pre‐specified (22/33 in‐person evaluations; 13/16 image‐based)
Low risk for reference standard (42/49); high concern from use of expert diagnosis (6). Blinding of reference standard to visual inspection diagnosis not reported in any study.
High risk for participant flow due to differential verification (11), and exclusions following recruitment (15); 37 studies did not mention timing of tests
Applicability of evidence to question Participant selection restricted to those with melanocytic lesions only (10), or to those with histopathology results (37) and included multiple lesions per participant (14)
No description of index test diagnostic thresholds (24 in‐person; 13 image‐based) or reporting of average or consensus diagnoses (7 in‐person; 13 image‐based).
Clinical images interpreted blinded to clinical information (11/16). Little information given concerning the expertise of the histopathologist (40/49).
Findings
37 studies (providing 39 datasets) reported accuracy data for the primary target condition. We separated them a priori into in‐person (n = 28) and image‐based (n = 11) evaluations. Subsequent analysis confirmed differences in accuracy according to the different approaches to diagnosis (P < 0.001). Attempts to analyse studies by degree of prior testing were hampered by a lack of relevant information provided in the study publications and by the inclusion of lesions selected for biopsy or excision. Of the 28 in‐person evaluations, we could only clearly place 17 on the clinical pathway, and considered 11 to have provided insufficient information to allow us to identify the pathway (coded ‘unclear’ on pathway). The findings presented are based on results for in‐person evaluations that could be clearly placed on the clinical pathway.
Test: In‐person visual inspection using any or no algorithm at any threshold
Data: Number of datasets Total lesions Total melanomas
All in‐person evaluations 28 25,604 1748
Studies clearly placed on the clinical pathway 17 15,322 622
Place on pathway: participants with limited prior testing (all lesions)
Datasets (n) Lesions (n) Melanomas (n) Sensitivity (95% CI) Specificity (95% CI)
3 1339 55 92% (26 to 100) 80% (74 to 85)
Numbers in a cohort of 1000 lesions (footnote b):
At a prevalence of 4%: TP 37 (10 to 40); FP 195 (252 to 147); FN 3 (30 to 0); TN 765 (708 to 813); PPV 16% (4 to 21); NPV 100% (96 to 100)
At a prevalence of 9%: TP 83 (24 to 90); FP 185 (239 to 139); FN 7 (66 to 0); TN 725 (671 to 771); PPV 31% (9 to 39); NPV 99% (91 to 100)
At a prevalence of 16%: TP 148 (42 to 160); FP 171 (221 to 129); FN 12 (118 to 0); TN 669 (619 to 711); PPV 46% (16 to 55); NPV 98% (84 to 100)
Place on pathway: participants with limited prior testing (only lesions selected for excision)
Datasets (n) Lesions (n) Melanomas (n) Sensitivity (95% CI) Specificity (95% CI)
2 4228 160 90% (70 to 97) 81% (67 to 90)
Numbers in a cohort of 1000 lesions (footnote b):
At a prevalence of 4%: TP 36 (28 to 39); FP 180 (312 to 96); FN 4 (12 to 1); TN 780 (648 to 864); PPV 17% (8 to 29); NPV 99% (98 to 100)
At a prevalence of 9%: TP 81 (63 to 88); FP 170 (296 to 91); FN 9 (27 to 2); TN 740 (614 to 819); PPV 32% (18 to 49); NPV 99% (96 to 100)
At a prevalence of 16%: TP 144 (112 to 156); FP 157 (273 to 84); FN 16 (48 to 4); TN 683 (567 to 756); PPV 48% (29 to 65); NPV 98% (92 to 99)
Place on pathway: referred participants (all lesions)
Datasets (n) Lesions (n) Melanomas (n) Sensitivity (95% CI) Specificity (95% CI)
2 3494 61 75% (49 to 90) 99% (95 to 100)
Numbers in a cohort of 1000 lesions (footnote b):
At a prevalence of 4%: TP 30 (20 to 36); FP 13 (51 to 4); FN 10 (20 to 4); TN 947 (909 to 956); PPV 69% (28 to 90); NPV 99% (98 to 100)
At a prevalence of 9%: TP 67 (44 to 81); FP 13 (48 to 4); FN 23 (46 to 9); TN 897 (862 to 906); PPV 84% (48 to 96); NPV 98% (95 to 99)
At a prevalence of 16%: TP 119 (78 to 144); FP 12 (45 to 3); FN 41 (82 to 16); TN 828 (795 to 837); PPV 91% (64 to 98); NPV 95% (91 to 98)
Referred participants (only lesions selected for excision)
Datasets (n) Lesions (n) Melanomas (n) Sensitivity (95% CI) Specificity (95% CI)
8 5331 258 77% (62 to 87) 96% (90 to 98)
Numbers in a cohort of 1000 lesions (footnote b):
At a prevalence of 4%: TP 31 (25 to 35); FP 41 (99 to 16); FN 9 (15 to 5); TN 919 (861 to 944); PPV 43% (20 to 68); NPV 99% (98 to 99)
At a prevalence of 9%: TP 69 (56 to 78); FP 39 (94 to 15); FN 21 (34 to 12); TN 871 (816 to 895); PPV 64% (37 to 84); NPV 98% (96 to 99)
At a prevalence of 16%: TP 123 (99 to 139); FP 36 (87 to 14); FN 37 (61 to 21); TN 804 (753 to 826); PPV 77% (53 to 91); NPV 96% (92 to 98)
Referred participants with equivocal lesions (only lesions selected for excision)
Datasets (n) Lesions (n) Melanomas (n) Sensitivity (95% CI) Specificity (95% CI)
2 930 88 85% (56 to 96) 89% (79 to 95)
Numbers in a cohort of 1000 lesions (footnote b):
At a prevalence of 4%: TP 34 (22 to 38); FP 101 (197 to 48); FN 6 (18 to 2); TN 859 (763 to 912); PPV 25% (10 to 44); NPV 99% (98 to 100)
At a prevalence of 9%: TP 76 (50 to 86); FP 96 (187 to 46); FN 14 (40 to 4); TN 814 (723 to 865); PPV 44% (21 to 66); NPV 98% (95 to 100)
At a prevalence of 16%: TP 136 (89 to 154); FP 88 (172 to 42); FN 24 (71 to 6); TN 752 (668 to 798); PPV 61% (34 to 79); NPV 97% (90 to 99)
CI: confidence interval; FN: false‐negative; FP: false‐positive; NPV: negative predictive value; PPV: positive predictive value; TN: true negative; TP: true positive

(a) 37 of the 49 included studies (reporting on 39 cohorts of lesions) provide data for the primary target condition (defined as detection of cutaneous invasive melanoma and atypical intraepidermal melanocytic variants) and are the main focus of this 'Summary of findings' table; the summary of methodological quality is based on the full sample of 49 studies.
(b) We estimated the number of true positives (TP), false‐positives (FP), false‐negatives (FN) and true negatives (TN) for a hypothetical cohort of 1000 lesions at the median and interquartile ranges of prevalence (25th and 75th percentiles), at average sensitivity and specificity and using the lower and upper limits of the 95% confidence intervals, denoted in brackets (lower limit to upper limit).
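
The cohort calculation described in footnote b can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the review authors' code: the function name `cohort_counts` and its parameters are hypothetical, and the inputs are the summary point estimates for participants with limited prior testing (sensitivity 92.4%, specificity 79.7%) at the median prevalence of 9%.

```python
import math


def cohort_counts(prevalence, sensitivity, specificity, cohort_size=1000):
    """Expected TP, FP, FN, TN, PPV and NPV for a hypothetical cohort.

    All three inputs are proportions between 0 and 1. The name and
    signature are illustrative assumptions, not taken from the review.
    """
    for name, value in (
        ("prevalence", prevalence),
        ("sensitivity", sensitivity),
        ("specificity", specificity),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a proportion between 0 and 1")

    with_melanoma = prevalence * cohort_size
    without_melanoma = cohort_size - with_melanoma

    tp = sensitivity * with_melanoma       # melanomas correctly flagged
    fn = with_melanoma - tp                # melanomas missed
    tn = specificity * without_melanoma    # benign lesions correctly cleared
    fp = without_melanoma - tn             # benign lesions flagged

    ppv = tp / (tp + fp) if (tp + fp) > 0 else math.nan
    npv = tn / (tn + fn) if (tn + fn) > 0 else math.nan
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn, "PPV": ppv, "NPV": npv}


result = cohort_counts(prevalence=0.09, sensitivity=0.924, specificity=0.797)
rounded = {key: round(result[key]) for key in ("TP", "FP", "FN", "TN")}
print(rounded)  # {'TP': 83, 'FP': 185, 'FN': 7, 'TN': 725}
print(round(100 * result["PPV"]), round(100 * result["NPV"]))  # 31 99
```

The rounded counts agree with the 9% prevalence row for participants with limited prior testing (all lesions). Re-running the function with the lower and upper confidence limits in place of the point estimates gives the bracketed ranges.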

Background

This review is one of a series of Cochrane Diagnostic Test Accuracy (DTA) reviews on the diagnosis and staging of melanoma and keratinocyte skin cancers conducted for the National Institute for Health Research (NIHR) Cochrane Systematic Reviews Programme. Appendix 1 shows the content and structure of the programme. Appendix 2 provides a glossary of terms used, and a table of acronyms used is provided in Appendix 3.

Target condition being diagnosed

Melanoma is one of the most aggressive forms of skin cancer, with the potential to metastasise to other parts of the body via the lymphatic system and blood stream. It accounts for a small percentage of skin cancer cases but is responsible for up to 75% of skin cancer deaths (Boring 1994; Cancer Research UK 2017).

Melanoma arises from uncontrolled proliferation of melanocytes, the epidermal cells that produce pigment or melanin. It most commonly arises in the skin but can occur in any organ that contains melanocytes, including mucosal surfaces, the back of the eye, and lining around the spinal cord and brain. Cutaneous melanoma refers to a skin lesion with malignant melanocytes present in the dermis, and includes superficial spreading, nodular, acral lentiginous, and lentigo maligna melanoma variants (see Figure 1). Melanoma in situ refers to malignant melanocytes that are contained within the epidermis and have not yet invaded the dermis, but are at risk of progression to melanoma if left untreated. Lentigo maligna, a subtype of melanoma‐in‐situ in chronically sun‐damaged skin, denotes another form of proliferation of abnormal melanocytes. Lentigo maligna can progress to invasive melanoma if its growth breaches the dermo‐epidermal junction during a vertical growth phase (when it becomes known as 'lentigo maligna melanoma'); however, its rate of malignant transformation is both lower and slower than for melanoma in situ (Kasprzak 2015). Melanoma in situ and lentigo maligna are both atypical intraepidermal melanocytic variants.

Figure 1. Sample photographs of superficial spreading melanoma (left) and nodular melanoma (right). Copyright © 2010 Dr Rubeta Matin: reproduced with permission.

The incidence of melanoma rose to over 200,000 newly diagnosed cases worldwide in 2012 (Erdmann 2013; Ferlay 2015), with an estimated 55,000 deaths (Ferlay 2015). The highest incidence is observed in Australia with 13,134 new cases of melanoma of the skin in 2014 (ACIM 2017) and in New Zealand with 2341 registered cases in 2010 (HPA and MelNet NZ 2014). For 2014 in the USA, the predicted incidence was 73,870 per annum and the predicted number of deaths was 9940 (Siegel 2015). The highest rates in Europe are seen in north‐western Europe and the Scandinavian countries, with the highest incidence reported in Switzerland: 25.8 per 100,000 in 2012. Rates in England have tripled from 4.6 and 6.0 per 100,000 in men and women, respectively, in 1990, to 18.6 and 19.6 per 100,000 in 2012 (EUCAN 2012). In the UK, melanoma has one of the fastest rising incidence rates of any cancer and has the biggest projected increase in incidence between 2007 and 2030 (Mistry 2011). In the decade leading up to 2013, age‐standardised incidence increased by 46%, with 14,500 new cases in 2013 and 2459 deaths in 2014 (Cancer Research UK 2017). While overall incidence rates are higher in women than in men, the rate of incidence in men is increasing faster than in women (Arnold 2014).

The rising incidence in melanoma is thought to be primarily related to an increase in recreational sun exposure and use of tanning beds, and an increasingly ageing population with higher lifetime ultraviolet (UV) exposure, in conjunction with possible earlier detection (Belbasis 2016; Linos 2009). Putative risk factors are reviewed in detail elsewhere (Belbasis 2016), but can be broadly divided into host or environmental factors. Host factors include fair skin and light hair or eye colour; older age (Geller 2002); male sex (Geller 2002); previous skin cancer (Tucker 1985); predisposing skin lesions, for example, high melanocytic naevus counts (Gandini 2005), clinically atypical naevi (Gandini 2005), or large congenital naevi (Swerdlow 1995); genetically inherited skin disorders, for example, xeroderma pigmentosum (Lehmann 2011); and a family history of melanoma (Gandini 2005). Environmental factors include recreational, occupational, and work‐related exposure to sunlight (both cumulative and episodic burning) (Armstrong 2017; Gandini 2005); artificial tanning (Boniol 2012); and immunosuppression, for example, in organ transplant recipients or HIV‐positive individuals (DePry 2011). Lower socioeconomic class may be associated with delayed presentation and thus more advanced disease at diagnosis (Reyes‐Ortiz 2006).

A database of over 40,000 US patients from 1998 onwards, which assisted the development of the Eighth Edition American Joint Committee on Cancer (AJCC) Staging System, indicated a five‐year survival of 99% for stage IA melanoma (melanoma ≤ 1 mm thick without ulceration, mitosis or involvement of the lymph nodes), dropping to anything between 32% and 93% in stage III disease (melanoma of any thickness with metastasis to the lymph nodes) depending on tumour thickness, the presence of ulceration and number of involved nodes (Gershenwald 2017). Before the advent of targeted and immuno‐therapies, stage IV melanoma (melanoma disseminated to distant sites/visceral organs) was associated with median survival of six to nine months, one‐year survival rate of 25%, and three‐year survival of 15% (Balch 2009; Korn 2008).

Between 1975 and 2010, five‐year relative survival for melanoma (i.e. not including deaths from other causes) in the USA increased from 80% to 94%, with survival for localised, regional, and distant disease estimated at 99%, 70%, and 18%, respectively in 2010 (Cho 2014). Overall mortality rates, however, showed little change, at 2.1 deaths per 100,000 in 1975 and 2.7 per 100,000 in 2010 (Cho 2014). Increasing incidence in localised disease over the same period (from 5.7 to 21 per 100,000) suggests that much of the observed improvement in survival may be due to earlier detection and heightened vigilance (Cho 2014). New targeted therapies for stage IV melanoma (e.g. BRAF inhibitors) have improved survival and immunotherapies are evolving such that long‐term survival is being documented (Pasquali 2018). No new data regarding the survival prospects for people with stage IV disease were analysed for the AJCC Eighth Edition Staging Guidelines due to lack of contemporary data (Gershenwald 2017).

Treatment of melanoma

For primary melanoma, the mainstay of definitive treatment is early detection and excision of the lesion, to remove both the tumour and any malignant cells that might have spread into the surrounding skin (Garbe 2016; Marsden 2010; NICE 2015a; SIGN 2017; Sladden 2009). Recommended surgical margins vary according to tumour thickness (Garbe 2016) and stage of disease at presentation (NICE 2015a).

Index test(s)

For the purposes of our series of reviews, each component of the diagnostic process, including visual inspection or clinical examination, is considered a diagnostic or index ‘test', the accuracy of which can be established in comparison with a reference standard of diagnosis, either alone or in combination with other available technologies that may assist the diagnostic process.

Clinical history‐taking to identify risk factors and visual inspection of the lesion, surrounding skin and comparison with other lesions on the rest of the body is fundamental to the diagnosis of skin cancer. The strongest common phenotypic risk factor is the presence of atypical naevi; typically the presence of over a hundred moles or naevi of abnormal appearance that may pose diagnostic challenges (Goodson 2010; Rademaker 2010; Salerni 2012). In the UK, clinical examination is typically done at two decision points – first in the general practice (GP) surgery, where a decision is made to refer or not to refer, and then a second time by a dermatologist or other secondary care clinician, where a decision is made to biopsy or not. Specialist advice can also be sought using teledermatology, where lesion images are forwarded with variable clinical information (such as age, gender, and location of lesion) to specialist clinics or to commercial organisations for interpretation. The accuracy of these diagnostic encounters (defined as the proportion of 'correct' diagnoses, i.e. true positive plus true negative diagnoses out of the total number of diagnoses) is known to vary according to qualifications and experience (Morton 1998; Westerhoff 2000); the accuracy of ‘image‐based’ as opposed to face‐to‐face diagnosis is less clear.
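
The definition of accuracy given above (true positive plus true negative diagnoses out of the total number of diagnoses) can be written as a short function. This is a minimal sketch: the name `overall_accuracy` is an illustrative assumption, and the counts are the hypothetical 1000‐lesion cohorts at 9% prevalence from the Summary of findings table.

```python
def overall_accuracy(tp, fp, fn, tn):
    """Proportion of 'correct' diagnoses: (TP + TN) / all diagnoses."""
    total = tp + fp + fn + tn
    if total <= 0:
        raise ValueError("at least one diagnosis is required")
    return (tp + tn) / total


# Limited prior testing, all lesions, at 9% prevalence
first_inspection = overall_accuracy(tp=83, fp=185, fn=7, tn=725)

# Referred participants, lesions selected for excision, at 9% prevalence
after_referral = overall_accuracy(tp=69, fp=39, fn=21, tn=871)

print(round(first_inspection, 3))  # 0.808
print(round(after_referral, 3))    # 0.94
```

In this example the referred cohort has the higher overall accuracy even though it misses more melanomas (21 versus 7), so the single proportion is best read alongside sensitivity and specificity.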

Research into the cognitive processes involved in dermatological diagnoses suggests that two main strategies are employed simultaneously and iteratively (Elstein 2002; Norman 1989; Norman 2009). Non‐analytical pattern recognition formulates an initial hypothesis; identification is made implicitly, without conscious thought or reference to specific rules and hidden from the conscious view of the diagnostician (Norman 2009). Analytical pattern recognition, using more explicit rules based on conscious analytical reasoning, is then employed to test the initial hypothesis. Analytical pattern recognition has been described as the “careful and systematic gathering of data and weighing the elicited information against mental rules” (Norman 2009). The balance between non‐analytical and analytical reasoning varies between clinicians, according to factors such as experience and familiarity with the diagnostic question.

Various attempts have been made to formalise the 'mental rules' involved in analytical pattern recognition for melanoma, ranging from setting out criteria that should be considered (e.g. ‘pattern analysis’; Friedman 1985; Sober 1979) to formal scoring systems with explicit numerical thresholds (MacKie 1985; MacKie 1990). The most commonly used algorithms are described in detail in Appendix 4.

The ABCD (asymmetry, border irregularity, colour variegation, diameter > 6 mm) algorithm of clinical warning signs was developed in 1985 to help distinguish melanoma from a benign naevus (Friedman 1985), and then extended to include an E for 'enlargement' criterion (Thomas 1998). As a result of its simplicity, ABCD(E) is now widely advocated for use by non‐experts or lay persons (American Academy of Dermatology 2015). The approach has been criticised for its inability to capture nodular and amelanotic melanomas, which account for a relatively small proportion (~15% to 20%) of incident melanomas but a large proportion (~50%) of melanoma‐related deaths (Moreau 2013; Shaikh 2012). In addition, up to a third of melanomas may be smaller than 6 mm in diameter (Maley 2014), a proportion which is likely to increase due to improved skin surveillance. The validity of ABCD(E) as a useful tool for the lay public has also been called into question (Aldridge 2011a; Girardi 2006; Liu 2005). Subsequent modifications have been suggested, including altering the meaning of the ABCD acronym for use in paediatric populations (Cordoro 2013); changing 'D' to 'dark' (Goldsmith 2014); or changing the acronym altogether (e.g. CCC for colour, contour, and change (Moynihan 1994); or "Do UC" the melanoma for different, uneven, changing (Yagerman 2014)). To date, the latter three have not been evaluated in populations with lesions suggestive of melanoma.

The seven‐point checklist, which assesses change in size, shape, colour, inflammation, crusting or bleeding, sensory change, or diameter of 7 mm or more, was developed by UK researchers as a guide to help non‐dermatologists detect possible melanoma (MacKie 1985; MacKie 1990). The revised, weighted version (MacKie 1990) is currently recommended for GP use in the evaluation of pigmented lesions (NICE 2015a). A primary care‐based evaluation found moderately good performance for the identification of clinically significant lesions (including malignant and premalignant lesions as disease‐positive): sensitivity and specificity for the presence of at least three features were 62.7% and 65.0%, respectively. Sensitivity was higher for the detection of melanoma (80.6%), at the expense of low specificity (61.7%) (Walter 2013).

Unlike most formalised rules, the 'ugly duckling' sign is based on differential pattern recognition, where abnormal lesion identification is achieved by noticing the odd one out, that is, a melanoma will be the pigmented lesion that does not match the rest of a person's naevi, for example a very dark or pale/pink lesion that is different in colour compared to the rest of the pigmented naevi (Grob 1998). Although 'ugly duckling' is inherently a form of subjective pattern recognition, sensitivity has been reported to be 100% for pigmented‐lesion experts and 85% for non‐clinicians (Scope 2008). The assumption that an individual has a "normal" naevus phenotype is debatable, however. Many individuals have multiple 'atypical' pigmented lesions which, although very similar morphologically, allow malignancy to easily disguise itself amidst an abnormal complex of pigmented lesions (also referred to as ‘The Little Red Riding Hood’ phenomenon) (Mascaro 1998).

Clinical pathway

The diagnosis of melanoma can take place in primary, secondary, and tertiary care settings by both generalist and specialist healthcare providers. In the UK, people with concerns about a new or changing lesion will usually present first to their GP or, less commonly, directly to a specialist in secondary care, which could include a dermatologist, plastic surgeon, general surgeon or other specialist surgeon (such as an ear, nose, and throat (ENT) specialist or maxillofacial surgeon), or ophthalmologist (Figure 2). Current UK guidelines recommend that all suspicious pigmented lesions presenting in primary care should be assessed by taking a clinical history and visual inspection using the seven‐point checklist (MacKie 1990); lesions suspected to be melanoma should be referred urgently for appropriate specialist assessment within two weeks (Chao 2013; Marsden 2010; NICE 2015b; SIGN 2017).

Figure 2. Current clinical pathway for people with skin lesions.

Teledermatology consultations can aid more appropriate triage of lesions into urgent referral; non‐urgent secondary care referral (e.g. for suspected basal cell carcinoma (BCC)); or where available, referral to an intermediate care setting, for example, clinics run by GPs with a special interest in dermatology. The distinction between setting and examiner qualifications and experience is important as specialist clinicians might work in primary care settings (for example, in the UK, GPs with a special interest in dermatology and skin surgery who have undergone appropriate training), and generalists might practice in secondary care settings (for example, plastic surgeons who do not specialise in skin cancer). The level of skill and experience in skin cancer diagnosis will vary for both generalist and specialist care providers and will also impact on test accuracy.

The specialist clinician will also use history‐taking and visual inspection of the lesion (in comparison with other lesions on the skin), usually in conjunction with dermoscopic examination, to inform a clinical decision. If melanoma is suspected, then urgent excision biopsy is recommended; for suspected cutaneous squamous cell carcinoma (cSCC), urgent excision with predetermined surgical margins is recommended. Other lesions such as BCC or pre‐malignant lesions such as lentigo maligna may also be referred for a diagnostic biopsy, followed by appropriate treatment or further surveillance or reassurance and discharge.

Prior test(s)

Although smartphone applications and community‐based teledermatology services can increasingly be directly accessed by people who have concerns about a skin lesion (Chuchu 2018), visual inspection of a suspicious lesion by a clinician is usually the first in a series of tests to diagnose skin cancer. In the UK first visual inspection of a suspicious lesion usually takes place in primary care; however, in some countries, people with suspicious lesions can present directly to a secondary care setting. Considering the degree of prior testing that study participants have undergone is key to interpretation of resulting test accuracy indices, which are known to vary according to the spectrum or case‐mix of included participants (Lachs 1992; Leeflang 2013; Moons 1997; Usher‐Smith 2016). Studies of people with suspicious lesions at the initial clinical presentation stage ('test‐naïve'), are likely to have a wider range of differential diagnoses and include a higher proportion of people with benign diagnoses compared with studies of participants who have been referred for a specialist opinion on the basis of visual inspection (with or without dermoscopy) by a generalist practitioner. Furthermore, studies in more specialist settings may focus on equivocal or difficult‐to‐diagnose lesions, rather than lesions with a more general level of clinical suspicion. A simple categorisation of studies according to primary, secondary, or specialist setting may not always adequately reflect differences in spectrum.

Role of index test(s)

Visual inspection and history‐taking are key to diagnosing skin cancer and are always undertaken as part of a clinical examination regardless of examiner experience and whatever additional technologies are available. For the generalist practitioner, the key is to minimise the proportion of people who are referred unnecessarily and identify those lesions that require urgent referral. For the specialist, the aim is not only to identify those in need of urgent excision due to invasive cancer, but also to identify high‐risk lesions, with considerable potential to progress to invasive disease, such as those with severe dysplasia or in situ disease, for example, lentigo maligna. Given differences in setting, prior testing, observer qualifications, experience and training, the anticipated performance in terms of accuracy is likely to vary.

When diagnosing potentially life‐threatening conditions such as melanoma, the consequences of falsely reassuring a person that they do not have skin cancer can be serious and potentially fatal, as the resulting delay to diagnosis means that the window for successful early treatment may be missed. To minimise these false‐negative diagnoses, a good diagnostic test will demonstrate high sensitivity and a high negative predictive value (NPV), where very few of those with a negative test result will actually have a melanoma. Giving falsely positive test results (meaning the test has poor specificity and a high false‐positive rate), resulting in the removal of lesions that turn out to be benign, is arguably less of an error than missing a potentially fatal melanoma, but is not cost‐free. False‐positive diagnoses not only cause unnecessary scarring from the biopsy or excision procedure, but also increase patient anxiety whilst they await the definitive histology results and increase healthcare costs as the number needed to remove to yield one melanoma diagnosis increases.
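As an illustration of how the accuracy measures discussed above relate to a 2x2 contingency table, the following minimal sketch computes sensitivity, specificity, NPV, and a number needed to remove. The counts are hypothetical and are not taken from any included study; the class and method names are our own, and the number needed to remove is computed under the simplifying assumption that every test‐positive lesion is excised.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """2x2 table of index test result against reference standard."""
    tp: int  # melanomas that the index test flagged
    fp: int  # benign lesions that the index test flagged
    fn: int  # melanomas that the index test missed
    tn: int  # benign lesions that the index test cleared

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def npv(self) -> float:
        # Proportion of test-negative lesions that are truly benign.
        return self.tn / (self.tn + self.fn)

    def number_needed_to_remove(self) -> float:
        # Assumption: all test-positive lesions are excised, so this is
        # the number of excisions per melanoma found.
        return (self.tp + self.fp) / self.tp


# Hypothetical counts, for illustration only.
table = TwoByTwo(tp=45, fp=90, fn=5, tn=860)
print(f"sensitivity = {table.sensitivity():.3f}")
print(f"specificity = {table.specificity():.3f}")
print(f"NPV         = {table.npv():.3f}")
print(f"NNR         = {table.number_needed_to_remove():.1f}")
```

In this made‐up example the NPV is high even though five melanomas are missed, which shows why NPV should be read alongside sensitivity when prevalence is low.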

Alternative test(s)

We have reviewed a number of other tests as part of our series of Cochrane diagnostic test accuracy (DTA) reviews on the diagnosis of melanoma. In particular, dermoscopy has become an essential tool for the specialist clinician and is increasingly being taken up in primary care settings. Dermoscopy (also referred to as dermatoscopy or epiluminescence microscopy or ELM) uses a hand‐held microscope and incident light (with or without oil immersion) to reveal subsurface images of the skin at increased magnification of x 10 to x 100 (Kittler 2011). Used alongside clinical examination, dermoscopy has been shown in some studies to increase the sensitivity of clinical diagnosis of melanoma from around 60% to as much as 90% (Bono 2006; Carli 2002a; Kittler 1999; Stanganelli 2000) with much smaller effects in others (Benelli 1999; Bono 2002a). The accuracy of dermoscopy depends on the experience of the examiner (Kittler 2011), with accuracy when used by untrained or less experienced examiners potentially no better than clinical inspection alone (Binder 1997; Kittler 2002).

Pattern analysis (Pehamberger 1993; Steiner 1987) is thought to be the most specific and reliable technique to aid dermoscopy interpretation when used by specialists (Maley 2014); however, dermoscopic histological correlations have been established and diagnostic algorithms developed based on colour, aspect, pigmentation pattern, and skin vessels, e.g. the ABCD rule for dermoscopy (Nachbar 1994; Stolz 1994), the Menzies algorithm (Menzies 1996), and the seven‐point dermoscopy checklist (Annessi 2007; Argenziano 1998; Argenziano 2001; Gereli 2010), amongst others. Dermoscopy used in addition to visual inspection (in‐person evaluations) or used alone (dermoscopic image interpretation remotely from the patient concerned) is the subject of a separate systematic review (Dinnes 2018).

Other relevant tests that we have looked at as part of this series of reviews include teledermatology, mobile phone applications, reflectance confocal microscopy, optical coherence tomography, computer‐assisted diagnosis or artificial intelligence‐based techniques, and high‐frequency ultrasound (Dinnes 2015a). Evidence permitting, we will compare the accuracy of available tests in an overview review, exploiting within‐study comparisons of tests and allowing the analysis and comparison of commonly used diagnostic strategies where tests may be used singly or in combination.

We also considered and excluded a number of tests from review, including tests used in the context of monitoring people, such as total body photography of those with large numbers of typical or atypical naevi, and finally, histopathological confirmation following lesion excision. The latter is the established reference standard for melanoma diagnosis and will be one of the standards against which we evaluate the index tests in these reviews.

Rationale

Our series of reviews of diagnostic tests used to assist clinical diagnosis in either clinical practice or in a research setting aims to identify the most accurate approaches to diagnosis and to provide clinical and policy decision‐makers with the highest possible standard of evidence on which to base diagnostic and treatment decisions. With increasing rates of melanoma and a trend to adopt the use of dermoscopy and other high‐resolution image analysis in primary care, the anxiety around missing early cases needs to be balanced against the risk of over‐referrals, to avoid sending too many people with benign lesions for a specialist opinion. It is questionable whether all skin cancers picked up by sophisticated techniques contribute to morbidity and mortality or whether newer technologies run the risk of increasing false‐positive diagnoses. It is also possible that use of some technologies, for example, widespread use of dermoscopy in primary care with no training, could actually result in harm by missing melanomas if they are used as replacement technologies for traditional history‐taking and clinical examination of the entire skin. Many branches of medicine have noted the danger of such "gizmo idolatry" amongst doctors (Leff 2008). The trend toward remote interpretation of dermatology images (whether clinical or dermoscopic images) and the use of remote technologies that do not involve clinicians without substantive evidence could further disrupt clinical pathways and healthcare payments as they may attract custom from the worried well, leaving an ever decreasing pool of qualified doctors to pick up any resulting problems.

There are few available systematic reviews in the field. The literature searches for the most comprehensive systematic reviews of visual inspection were carried out up to 2007 (Vestergaard 2008) or are focused on specific clinical questions, for example, specific healthcare professionals (Corbo 2012 including only direct comparisons of the accuracy of primary care physicians versus dermatologists, and Loescher 2011 reviewing the skin cancer detection skills of advanced practice nurses) or settings (Herschorn 2012 including direct comparisons of visual inspection versus dermoscopy in primary care). More recently, Harrington and colleagues (Harrington 2017) published a systematic review of clinical prediction rules (or published algorithms) used to assist the diagnosis of melanoma; however, the requirement for a clinical prediction rule does not allow comparison of accuracy with and without the use of an algorithm.

The critical question about the accuracy of visual inspection alone and the impact of examiner, prior patient testing, underlying risk status, and the use of images for diagnosis needs to be answered before the potential contribution of additional diagnostic tests can be set in context and appropriately placed in the diagnostic pathway.

This review follows a generic protocol that covers the full series of Cochrane DTA reviews for the diagnosis of melanoma (Dinnes 2015a). The Background and Methods sections of this review therefore use some text that was originally published in the protocol (Dinnes 2015a) and text that overlaps some of our other reviews (Dinnes 2018).

Objectives

To determine the diagnostic accuracy of visual inspection for the detection of cutaneous invasive melanoma and atypical intraepidermal melanocytic variants in adults.

Accuracy was estimated separately according to the prior testing undergone by study participants:

  • those with limited prior testing, that is, primary presentation; and

  • those referred for further evaluation of a suspicious lesion, that is, referred participants.

Accuracy was also estimated separately according to whether the diagnosis was recorded based on a face‐to‐face (in‐person) encounter or based on remote (image‐based) assessment.

Secondary objectives

For the identification of cutaneous invasive melanoma and atypical intraepidermal melanocytic variants:

  • to determine the diagnostic accuracy of individual algorithms used to assist visual inspection; and

  • to determine the effect of observer experience on diagnostic accuracy.

For the alternative definitions of the target condition:

  • to determine the diagnostic accuracy of visual inspection for the detection of invasive melanoma alone in adults;

  • to determine the diagnostic accuracy of visual inspection for the detection of any skin cancer or skin lesion with a high risk of progression to melanoma in adults (i.e. requiring excision).

Investigation of sources of heterogeneity

We set out to address a range of potential sources of heterogeneity for investigation across our series of reviews, as outlined in our generic protocol (Dinnes 2015a) and described in Appendix 5; however, our ability to investigate these was necessarily limited by the available data on each individual test reviewed.

The sources of heterogeneity that we investigated for visual inspection were:

  • in‐person versus image‐based evaluations;

  • study setting: primary, community or private care versus secondary versus specialist clinics;

  • use of a diagnostic algorithm: no algorithm reported versus any named algorithm used;

  • type of reference standard: histology alone versus histology plus clinical follow‐up or other reference standard; and

  • disease prevalence: ≤ 10% versus > 10%. We chose the 10% cut‐off based on advice from clinical co‐authors (RB, HW).

Methods

Criteria for considering studies for this review

Types of studies

We included test accuracy studies that allow comparison of the result of the index test with that of a reference standard, including the following:

  • studies where all participants receive a single index test and a reference standard;

  • studies where all participants receive more than one index test and a reference standard;

  • studies where participants are allocated (by any method) to receive different index tests or combinations of index tests and all receive a reference standard (between‐person comparative studies (BPC));

  • studies that recruit series of participants unselected by true disease status (referred to as case series for the purposes of this review);

  • diagnostic case‐control studies that separately recruit diseased and non‐diseased groups (see Rutjes 2005); however, we did not include studies that compared results for malignant lesions to those for healthy skin (i.e. with no lesion present);

  • both prospective and retrospective studies; and

  • studies where previously acquired clinical or dermoscopic images were retrieved and prospectively interpreted for study purposes.

We excluded studies from which we could not extract 2x2 contingency data or if they included fewer than five melanoma cases or fewer than five benign lesions. The size threshold of five is arbitrary. However, such small studies are unlikely to add precision to the estimate of accuracy.
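The size‐based exclusion rule above can be written as a simple check. This is only a sketch: the function name and the use of `None` to represent non‐extractable 2x2 data are our own conventions for illustration.

```python
from typing import Optional

MIN_PER_GROUP = 5  # arbitrary threshold, as stated in the text


def meets_size_criteria(n_melanoma: Optional[int],
                        n_benign: Optional[int]) -> bool:
    """Return True if a study passes the 2x2 and size criteria.

    None stands for counts that could not be extracted (hypothetical
    convention used only in this sketch).
    """
    if n_melanoma is None or n_benign is None:
        return False
    return n_melanoma >= MIN_PER_GROUP and n_benign >= MIN_PER_GROUP
```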

Studies available only as conference abstracts were excluded; however, attempts were made to identify full papers for potentially relevant conference abstracts (Searching other resources).

Participants

We included studies in adults with pigmented skin lesions or lesions suspicious for melanoma or those at high risk of developing melanoma, including those with a family history or previous history of melanoma skin cancer, atypical or dysplastic naevus syndrome, or genetic cancer syndromes.

We excluded studies that recruited only participants with malignant or benign diagnoses.

We excluded studies conducted in children or that clearly reported inclusion of more than 50% of participants aged 16 and under.

Index tests

Studies reporting accuracy data for visual inspection alone, with either image‐based or in‐person diagnosis, were eligible for inclusion. For in‐person visual inspection, diagnosis is undertaken in a clinic setting with the patient present (face‐to‐face diagnosis). For these studies we assumed that patient history‐taking would have taken place and is likely to have contributed to lesion diagnosis; however, we did not specifically extract details of patient history‐taking due to anticipated poor reporting in the primary studies. For image‐based studies, diagnosis is based on clinical or ‘macro’ images (photographs), remotely from the study participant. For these studies, we extracted any additional patient information that was provided to assist diagnosis.

We included all established algorithms or checklists to assist diagnosis by visual inspection. We included studies developing new algorithms or methods of diagnosis (i.e. derivation studies) if they:

  • used a separate independent 'test set' of participants or images to evaluate the new approach; or

  • investigated lesion characteristics that had previously been suggested as associated with melanoma and the study reported accuracy based on the presence or absence of particular combinations of characteristics.

We excluded studies if they:

  • used a statistical model to produce a data‐driven equation, or algorithm based on multiple diagnostic features, with no separate test set;

  • used cross‐validation approaches such as 'leave‐one‐out' cross‐validation (Efron 1983);

  • evaluated the accuracy of the presence or absence of individual lesion characteristics or morphological features, with no overall diagnosis of malignancy;

  • reported accuracy data for ‘clinical diagnosis’ with no clear description as to whether the reported data related to visual inspection alone;

  • were based on the experience of a particular skin cancer clinic, where dermoscopy may or may not have been used on an individual patient‐basis.

Although primary care clinicians can in practice be specialists in skin cancer, we considered primary care physicians as generalist practitioners and dermatologists as specialists. Within each group, we extracted any reporting of special interest or accreditation in skin cancer.

Target conditions

We defined the primary target condition as the detection of:

  • any form of invasive cutaneous melanoma or atypical intraepidermal melanocytic variants (i.e. including melanoma in situ, or lentigo maligna, which has a risk of progression to invasive melanoma).

We considered two additional definitions of the target condition in secondary analyses, namely the detection of:

  • any form of invasive cutaneous melanoma alone;

  • any skin lesion requiring excision. This latter definition includes melanoma plus other forms of skin cancer, such as BCC and cSCC, as well as melanoma in situ, lentigo maligna, and lesions with severe melanocytic dysplasia.

The diagnosis of the keratinocyte skin cancers, BCC and cSCC, as primary target conditions is the subject of a separate series of reviews (Dinnes 2015b).

Reference standards

The ideal reference standard is histopathological diagnosis in all eligible lesions. A qualified pathologist or dermatopathologist should perform histopathology. Ideally, reporting should be standardised detailing a minimum dataset to include the histopathological features of melanoma to determine the American Joint Committee on Cancer (AJCC) Staging System (e.g. Slater 2014). We did not apply reporting of a minimum dataset as a necessary inclusion criterion, but extracted any pertinent information.

Partial verification (applying the reference test only to a subset of those undergoing the index test) was of concern given that lesion excision or biopsy are unlikely to be carried out for all benign‐appearing lesions within a representative population sample. Therefore, to reflect what happens in reality, we accepted clinical follow‐up of benign‐appearing lesions as an eligible reference standard, whilst recognising the risk of differential verification bias (as misclassification rates of histopathology and follow‐up will differ).

Additional eligible reference standards included cancer registry follow‐up and 'expert opinion' with no histology or clinical follow‐up. Cancer registry follow‐up is considered less desirable than active clinical follow‐up, as follow‐up is not carried out within the control of the study investigators. Furthermore, if participant‐based analyses as opposed to lesion‐based analyses are presented, it may be difficult to determine whether the detection of a malignant lesion during follow‐up is the same lesion that originally tested negative on the index test.

All of the above were considered eligible reference standards with the following caveats:

  • all study participants with a final diagnosis of the target disorder must have a histological diagnosis, either subsequent to the application of the index test or after a period of clinical follow‐up; and

  • at least 50% of all participants with benign lesions must have either a histological diagnosis or clinical follow‐up to confirm benignity.
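The two caveats above amount to a pair of checks on verification counts. The sketch below expresses them as a function; the argument names are hypothetical and the counts would in practice come from each study's reported verification procedure.

```python
def reference_standard_eligible(n_positive: int,
                                n_positive_with_histology: int,
                                n_benign: int,
                                n_benign_verified: int) -> bool:
    """Apply the two reference standard caveats.

    n_benign_verified counts benign lesions confirmed by either
    histology or clinical follow-up.
    """
    # Caveat 1: every disease-positive participant has a histological diagnosis.
    all_positives_histology = n_positive_with_histology == n_positive
    # Caveat 2: at least 50% of benign lesions have histology or follow-up.
    half_benign_verified = n_benign_verified >= 0.5 * n_benign
    return all_positives_histology and half_benign_verified
```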

Search methods for identification of studies

Electronic searches

The Information Specialist (SB) carried out a comprehensive search for published and unpublished studies. A single large literature search was conducted to cover all topics in the programme grant (see Appendix 1 for a summary of reviews included in the programme grant). This allowed for the screening of search results for potentially relevant papers for all reviews at the same time. A search combining disease related terms with terms related to the test names, using both text words and subject headings was formulated. The search strategy was designed to capture studies evaluating tests for the diagnosis or staging of skin cancer. As the majority of records were related to the searches for tests for staging of disease, a filter using terms related to cancer staging and to accuracy indices was applied to the staging test search, to try to eliminate irrelevant studies, for example, those using imaging tests to assess treatment effectiveness. A sample of 300 records that would be missed by applying this filter was screened and the filter adjusted to include potentially relevant studies. When piloted on MEDLINE, inclusion of the filter for the staging tests reduced the overall numbers by around 6000. The final search strategy, incorporating the filter, was subsequently applied to all bibliographic databases as listed below (Appendix 6). The final search result was cross‐checked against the list of studies included in five systematic reviews; our search identified all but one of the studies, and this study was not indexed on MEDLINE. The Information Specialist devised the search strategy, with input from the Information Specialist from Cochrane Skin. No additional limits were used.

We searched the following bibliographic databases to 29 August 2016 for relevant published studies:

  • MEDLINE via OVID (from 1946);

  • MEDLINE In‐Process & Other Non‐Indexed Citations via OVID; and

  • Embase via OVID (from 1980).

We searched the following bibliographic databases to 30 August 2016 for relevant published studies:

  • the Cochrane Central Register of Controlled Trials (CENTRAL; 2016, Issue 7) in the Cochrane Library;

  • the Cochrane Database of Systematic Reviews (CDSR; 2016, Issue 8) in the Cochrane Library;

  • Cochrane Database of Abstracts of Reviews of Effects (DARE; 2015, Issue 2);

  • CRD HTA (Health Technology Assessment) database, 2016, Issue 3; and

  • CINAHL (Cumulative Index to Nursing and Allied Health Literature via EBSCO from 1960).

We searched the following databases for relevant unpublished studies using a strategy based on the MEDLINE search:

  • CPCI (Conference Proceedings Citation Index), via Web of Science™ (from 1990; searched 28 August 2016); and

  • SCI Science Citation Index Expanded™ via Web of Science™ (from 1900, using the 'Proceedings and Meetings Abstracts' Limit function; searched 29 August 2016).

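We searched the following trials registers using the search terms 'melanoma', 'squamous cell', 'basal cell' and 'skin cancer' combined with 'diagnosis':

  • the US National Institutes of Health Ongoing Trials Register;

  • the NIHR Clinical Research Network Portfolio Database; and

  • the World Health Organization International Clinical Trials Registry Platform.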

We aimed to identify all relevant studies regardless of language or publication status (published, unpublished, in press, or in progress) and applied no date limits.

Searching other resources

We have included information about potentially relevant ongoing studies in the Characteristics of ongoing studies tables. We have screened relevant systematic reviews identified by the searches for their included primary studies, and included any missed by our searches. We have checked the reference lists of all included papers, and subject experts within the author team reviewed the final list of included studies. We did not conduct any citation searching.

Data collection and analysis

Selection of studies

At least one review author (JDi or NC) screened titles and abstracts, with any queries discussed and resolved by consensus. A pilot screen of 539 MEDLINE references showed good agreement (89% with a kappa of 0.77) between screeners. At initial screening, we included primary test accuracy studies and test accuracy reviews (for scanning of reference lists) of any test used to investigate suspected melanoma, BCC, or cSCC. Both a clinical reviewer (one of a team of twelve clinician reviewers) and a methodologist reviewer (JDi or NC) independently applied the inclusion criteria (Appendix 7) to all full‐text articles; disagreements were resolved by consensus or by a third party (JDe, CD, HW, and RM). We contacted authors of eligible studies when insufficient data were presented to allow for the construction of 2x2 contingency tables.
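For readers unfamiliar with the agreement statistics quoted for the pilot screen, the sketch below shows how percentage agreement and Cohen's kappa are obtained from a cross‐tabulation of two screeners' include/exclude decisions. The counts used here are hypothetical; the cell counts behind the pilot screen are not reported in this review, so the example does not attempt to reproduce the 89% and 0.77 figures.

```python
def agreement_and_kappa(both_include: int, only_a: int,
                        only_b: int, both_exclude: int):
    """Return (observed agreement, Cohen's kappa) for two screeners."""
    n = both_include + only_a + only_b + both_exclude
    observed = (both_include + both_exclude) / n
    # Marginal proportion of 'include' decisions for each screener.
    a_include = (both_include + only_a) / n
    b_include = (both_include + only_b) / n
    # Agreement expected by chance if the screeners decided independently.
    expected = a_include * b_include + (1 - a_include) * (1 - b_include)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts, for illustration only.
observed, kappa = agreement_and_kappa(both_include=40, only_a=5,
                                      only_b=5, both_exclude=50)
print(f"agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

Kappa is lower than raw agreement because it discounts the agreement that two screeners would reach by chance alone.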

Data extraction and management

One clinical (as detailed above) and one methodologist reviewer (JDi, NC or LFR) independently extracted data concerning details of the study design, participants, index test(s) or test combinations and criteria for index test positivity, reference standards, and data required to populate a 2x2 diagnostic contingency table for each index test using a piloted data extraction form. We extracted data at all available index test thresholds. We resolved disagreements by consensus or by consulting a third party (JDe, CD, HW, and RM).

We contacted authors of included studies where information related to final lesion diagnoses or diagnostic thresholds was missing. In particular, invasive cSCC (included as disease‐positive for one of our secondary objectives) is not always differentiated from ‘in situ’ variants such as Bowen’s disease (which we did not consider as disease‐positive for any of our definitions of the target condition). We contacted authors of conference abstracts published from 2013 to 2015 to ask whether full data were available. If no full paper was identified, we marked conference abstracts as 'pending' and will revisit them in a future review update.

Dealing with multiple publications and companion papers

Where we identified multiple reports of a primary study, we maximised yield of information by collating all available data. Where there were inconsistencies in reporting or overlapping study populations, we contacted study authors for clarification in the first instance. If this contact with authors was unsuccessful, we used the most complete and up‐to‐date data source where possible.

Assessment of methodological quality

We assessed risk of bias and applicability of included studies using the QUADAS‐2 checklist (Whiting 2011), tailored to the review topic (see Appendix 8) and piloted it on a small number of included full‐text articles. One clinical (as detailed above) and one methodologist reviewer (JDi, NC or LFR) independently assessed quality for the remaining studies; we resolved any disagreement by consensus or by consulting a third party where necessary (JDe, CD, HW, and RM).

Statistical analysis and data synthesis

We conducted separate analyses according to the point that study participants reached in the clinical pathway (numbered from 1 to 7 in Figure 3), the clarity with which the pathway could be determined (clear or unclear), and the evaluation of in‐person versus image‐based diagnosis.

Figure 3. Clinical pathway

Our unit of analysis was the lesion rather than the participant. This is because firstly, in skin cancer, initial treatment is directed to the lesion rather than systemically (thus it is important to be able to correctly identify cancerous lesions for each person), and secondly, it is the most common way in which the primary studies reported data. Although there is a theoretical possibility of correlations of test errors when the same people contribute data for multiple lesions, most studies include very few people with multiple lesions and any potential impact on findings is likely to be very small, particularly in comparison with other concerns regarding risk of bias and applicability. Where an individual study assessed multiple algorithms, we selected datasets on the following preferential basis:

  • ‘no algorithm’ reported; data presented for clinician’s overall diagnosis or management decision;

  • pattern analysis or pattern recognition;

  • ABCD algorithm (or derivatives of);

  • seven‐point checklist (also referred to as Glasgow/MacKie checklist).

Where multiple thresholds per algorithm were reported, we included the standard or most commonly used threshold. If data for multiple observers were reported, we used data for the most experienced observer, using single observer diagnosis in preference to a consensus or average across observers. If we were unable to choose a dataset based on the above ‘rules’, we made a random selection of one dataset per study. To allow comparisons of tests, we have included data on the accuracy of dermoscopy in a separate review in our series (Dinnes 2018).
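The selection 'rules' above can be sketched as a ranking followed by a random draw among ties. This is an illustration under stated assumptions rather than the procedure as actually implemented: the dictionary keys are hypothetical, years of experience stand in for 'most experienced observer', the relative ordering of the single‐observer and experience criteria is our own choice, and the choice of threshold within an algorithm is not modelled.

```python
import random

# Preference order for algorithms, as listed in the text.
ALGORITHM_PRIORITY = [
    "no algorithm",
    "pattern analysis",
    "ABCD",
    "seven-point checklist",
]


def select_dataset(datasets, rng=None):
    """Pick one dataset per study.

    Each dataset is a dict with hypothetical keys 'algorithm',
    'single_observer' (bool) and 'experience_years' (number).
    """
    rng = rng or random.Random()

    def rank(d):
        if d["algorithm"] in ALGORITHM_PRIORITY:
            algo = ALGORITHM_PRIORITY.index(d["algorithm"])
        else:
            algo = len(ALGORITHM_PRIORITY)  # unlisted algorithms rank last
        # Lower tuples are preferred: algorithm order, then single observer
        # over consensus or average, then more experience.
        return (algo, 0 if d["single_observer"] else 1, -d["experience_years"])

    best = min(rank(d) for d in datasets)
    tied = [d for d in datasets if rank(d) == best]
    # Random selection only when the rules cannot separate the candidates.
    return tied[0] if len(tied) == 1 else rng.choice(tied)


study = [
    {"algorithm": "ABCD", "single_observer": True, "experience_years": 20},
    {"algorithm": "no algorithm", "single_observer": True, "experience_years": 5},
    {"algorithm": "no algorithm", "single_observer": True, "experience_years": 12},
]
chosen = select_dataset(study)
```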

For each analysis, we plotted estimates of sensitivity and specificity on coupled forest plots and in receiver operating characteristic (ROC) space. For tests where commonly used thresholds were reported, we estimated summary operating points (summary sensitivities and specificities) with 95% confidence intervals and prediction regions using the bivariate hierarchical model (Chu 2006; Reitsma 2005). Where the data were inadequate for the model to converge, we simplified the model, first by assuming no correlation between estimates of sensitivity and specificity and secondly by setting estimates of near‐zero variance terms to zero (Takwoingi 2015). Where all studies reported 100% sensitivity (or 100% specificity), we summed the number with disease (or no disease) across studies and used it to compute a binomial exact 95% confidence interval.
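As an illustration of the binomial exact interval mentioned above, the following is a minimal sketch of a Clopper‐Pearson calculation using only the Python standard library. It is not the software used in the review; the function names and the pooled counts (12 of 12) are hypothetical. The second call uses 2 of 9, the count implied by a sensitivity of 22.2% in a study with 9 melanomas, which is an inference from the reported percentage rather than a figure stated in the text.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root_of_increasing(g, tol=1e-12):
    """Bisection on [0, 1] for a function that rises from negative to positive."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, alpha=0.05):
    """Clopper-Pearson (binomial exact) interval for x events out of n."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    # Lower limit: p at which P(X >= x) equals alpha / 2 (zero when x == 0).
    lower = 0.0 if x == 0 else _root_of_increasing(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper limit: p at which P(X <= x) equals alpha / 2 (one when x == n).
    upper = 1.0 if x == n else _root_of_increasing(
        lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


# Hypothetical pooling: every one of 12 diseased lesions across studies detected.
pooled_lower, pooled_upper = exact_ci(12, 12)

# 2 of 9 melanomas detected (count inferred from a reported 22.2% sensitivity).
single_lower, single_upper = exact_ci(2, 9)
```

When all cases are detected the upper limit is 1 and the lower limit reduces to (alpha/2)^(1/n), so the width of the interval depends only on the summed number with disease.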

For computation of likely numbers of true‐positive, false‐positive, false‐negative and true‐negative findings in the 'Summary of findings' tables, we applied these indicative values to lower quartile, median and upper quartiles of the prevalence observed in the study groups. We have reported these numbers for the average operating point on the SROC curve in 'Summary of findings' tables.
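The arithmetic behind such 'Summary of findings' numbers can be sketched as follows. This is a generic illustration, assuming a hypothetical cohort of 1000 lesions and made-up prevalence values; the actual quartiles of prevalence and the cohort size used in the review are not given in this section.

```python
def expected_counts(sensitivity, specificity, prevalence, n=1000):
    """Expected 2x2 cell counts for a hypothetical cohort of n lesions."""
    diseased = n * prevalence
    non_diseased = n - diseased
    tp = diseased * sensitivity          # melanomas correctly identified
    fn = diseased - tp                   # melanomas missed
    tn = non_diseased * specificity      # benign lesions correctly identified
    fp = non_diseased - tn               # benign lesions flagged as melanoma
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Hypothetical summary operating point and hypothetical prevalence quartiles.
for prevalence in (0.05, 0.10, 0.20):
    counts = expected_counts(sensitivity=0.80, specificity=0.90, prevalence=prevalence)
    print(prevalence, {cell: round(value) for cell, value in counts.items()})
```

At a fixed operating point, raising the prevalence increases both true positives and false negatives while reducing false positives, which is why the tables report numbers at several prevalence values.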

Investigations of heterogeneity

We investigated heterogeneity, and made comparisons between algorithms and according to observer experience by comparing summary ROC curves using the hierarchical summary receiver‐operator curves (HSROC) model (Rutter 2001). HSROC curves allow incorporation of data at different thresholds and from different algorithms or checklists. We used an HSROC model that assumed a constant SROC shape between tests and subgroups, but allowed for differences in threshold and accuracy by addition of covariates. We assessed the significance of the differences between tests or subgroups by the likelihood ratio test assessing differences in both accuracy and threshold, and by a Wald test on the parameter estimate testing for differences in accuracy alone. We fitted simpler models when convergence was not achieved due to small numbers of studies, first assuming symmetric SROC curves (setting the shape term to zero), and then setting random‐effects variance estimates to zero. We have presented estimates of accuracy from HSROC models as diagnostic odds ratios (DORs) (estimated where the SROC curve crosses the sensitivity=specificity line) with 95% confidence intervals. We have presented differences between tests and subgroups from HSROC analyses as relative diagnostic odds ratios (RDORs) with 95% confidence intervals.
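For readers less familiar with these measures, the sketch below shows the definitional relationships only: the DOR implied by a sensitivity and specificity pair, the point on a symmetric curve where sensitivity equals specificity for a given DOR, and the RDOR as a ratio of two DORs. The review's estimates come from fitted HSROC models, so these functions (whose names are hypothetical) are not a substitute for that analysis.

```python
from math import sqrt


def diagnostic_odds_ratio(sensitivity, specificity):
    """Odds of a positive test in diseased divided by the odds in non-diseased."""
    return (sensitivity / (1 - sensitivity)) * (specificity / (1 - specificity))


def accuracy_where_sens_equals_spec(dor):
    """Value s with s = sensitivity = specificity, where DOR = (s / (1 - s)) ** 2."""
    root = sqrt(dor)
    return root / (1 + root)


def relative_dor(dor_a, dor_b):
    """Relative diagnostic odds ratio of group A compared with group B."""
    return dor_a / dor_b


# Hypothetical example values.
dor = diagnostic_odds_ratio(sensitivity=0.80, specificity=0.90)
s_equal = accuracy_where_sens_equals_spec(dor)
rdor = relative_dor(40.0, 20.0)
```

An RDOR above 1 indicates higher accuracy in the first group, and an RDOR whose confidence interval includes 1 is compatible with no difference.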

We fitted bivariate models using the xtmelogit command in STATA 15 and HSROC models using the NLMIXED procedure in the SAS statistical software package (SAS 2012; version 9.3; SAS Institute, Cary, NC, USA) and the metadas macro (Takwoingi 2010).

Sensitivity analyses

We planned sensitivity analyses, restricting analyses to studies at the least risk of bias; however, these were not carried out due to insufficient study numbers.

Assessment of reporting bias

Because of uncertainty about the determinants of publication bias for diagnostic accuracy studies and the inadequacy of tests for detecting funnel plot asymmetry (Deeks 2005), we did not perform tests to detect publication bias.

Results

Results of the search

The Information Specialist identified a total of 34,517 unique references and we screened them for inclusion. Of these, we reviewed 1051 full‐text papers for eligibility for any one of the suite of reviews of tests to assist in the diagnosis of melanoma or keratinocyte skin cancer. Of the 1051 full‐text papers assessed, we excluded 848 from all reviews in our series (see Figure 4; PRISMA flow diagram of search and eligibility results).

Figure 4. PRISMA flow diagram.

Of the 232 studies tagged as potentially eligible for this review of visual inspection, we included 49 publications, reporting 49 individual studies. Exclusions were mainly due to the inability to construct a 2x2 contingency table based on the data presented (n = 54); the use of ineligible index tests (n = 39) (for example: reporting of data for visual inspection and dermoscopy only (n = 12), reporting of data for ‘clinical diagnosis’ (n = 11), or for serial use of the index test in a follow‐up context (n = 7)); or not meeting our requirements for an eligible reference standard (n = 23). Other reasons for exclusion included ineligible study populations (n = 20) (for example, recruiting only malignant or only benign lesions (n = 18)), inadequate sample size (n = 14), ineligible definition of the target condition (n = 14) or with test interpretation by medical students or laypeople (n = 6). A list of the 183 publications excluded from this review with reasons for exclusion is provided in Characteristics of excluded studies, with a list of all studies excluded from the full series of reviews available as a separate pdf (please contact skin.cochrane.org for a copy of the pdf).

We contacted the authors of 14 publications for the purposes of this review of visual inspection and, to date, have received responses about seven publications. One response allowed the inclusion of the study in the review (Walter 2012), five provided clarifications on methods used on studies included (Bono 2006; Bourne 2012; Rosendahl 2011; Stanganelli 2000; Walter 2012); one replied with the information needed but the two studies could not be included due to the evaluation of ‘clinical diagnosis’ (Youl 2007a; Youl 2007b); and five replied but were not able to provide the information requested in relation to eight study publications, one of which we could still include (Menzies 2009) and seven we could not (Fabbrocini 2008; Freeman 1963; Heal 2008; Menzies 2009; Warshaw 2009a; Warshaw 2009b; Warshaw 2010).

The 49 included study publications report on a total of 51 cohorts of lesions and 134 datasets with 34,351 lesions and 2499 malignancies. The total number of study participants with suspicious lesions cannot be estimated due to lack of reporting in study publications. Two thirds of studies (n = 32; 65%) also reported accuracy data for diagnosis using dermoscopy; these comparisons are reported in Dinnes 2018. Seven studies reported data for additional tests including teledermatology (n = 1) and computer‐assisted diagnosis techniques (n = 6).

Methodological quality of included studies

We have summarised the overall methodological quality of all included studies (n = 49) in Figure 5 and Figure 6.

Figure 5. Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

Figure 6. Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study

The majority of study reports provided insufficient information across almost all study quality domains to allow us to judge the risk of bias, while we scored applicability of study findings as of ‘High’ concern in three of four domains assessed.

Participant selection

We judged only 22% of studies (n = 11) at low risk of bias for participant selection and 27% (n = 13) at high risk of bias. Ten studies (20%) either used a case‐control type design with separate selection of melanoma cases and lesions with benign diagnoses (n = 6) or did not clearly describe the study design used (n = 5). Over half (55%; n = 27) reported random or consecutive participant recruitment; the remaining 45% did not describe recruitment methods. Over half of studies (53%) did not describe whether they had applied any exclusion criteria and we judged them at unclear risk of bias. Seven studies (14%) applied inappropriate participant exclusions, excluding ‘difficult to diagnose’ lesions such as awkwardly located lesions (Bono 2002a; Morales Callaghan 2008; Unlu 2014); those with disagreement on histopathology (de Giorgi 2012; Ek 2005; Zaumseil 1983); or dermoscopically ‘peculiar’ lesions (Carli 2003a).

We considered almost all cohorts (96%; n = 47) at high concern for applicability of participants. In the majority of cases (n = 41), high concern was due to restricted study populations: inclusion of only melanocytic (n = 10) or amelanotic (n = 1) lesions; restriction by lesion diameter (Bono 2002b; Bono 2006; Steiner 1987); or, most commonly, inclusion of lesions selected for excision based on the clinical or dermoscopic diagnosis or selected retrospectively from histopathology databases (n = 37). We judged only four cohorts to have included a representative patient population (Grimaldi 2009; Menzies 2009; Stanganelli 2000; Walter 2012). Fourteen cohorts also included multiple lesions per participant, with only eight clearly including a similar number of participants and lesions (Bono 2002a; Bono 2002b; Bono 2006; Bourne 2012; Collas 1999; Krahn 1998; Pizzichetta 2004; Unlu 2014).

Index test

For the index test domain, we considered studies separately according to whether they reported in‐person evaluations of visual inspection (n = 33) or evaluations based on interpretation of clinical images (image‐based evaluations; n = 16). For the in‐person evaluations, we judged 24% (n = 8) at low risk of bias, and 9% (n = 3) at high risk; 22 (67%) did not provide sufficient information to allow us to judge the risk of bias fully. We considered that all studies made the diagnosis blinded to the reference standard result: 24% (n = 8) also clearly reported pre‐specification of the diagnostic threshold; five of the eight used named algorithms (Argenziano 2006; Cristofolini 1994; Stanganelli 2000; Walter 2012; Zaumseil 1983), and three by the same author team (Bono 2002a; Bono 2002b; Bono 2006) described the process by which they had reached the diagnosis. Three studies developed new algorithms (Thomas 1998) or evaluated multiple thresholds for test positivity (Benelli 2001; McGovern 1992). Reporting was poorer for the image‐based evaluations, with over three quarters of studies (n = 13) not providing sufficient information to allow us to judge the risk of bias fully, one study (6%) judged at low risk of bias and two (12%) at high risk. Again, we considered that all the studies had made the diagnosis blinded to the reference standard result, with one prospectively testing two pre‐specified diagnostic thresholds (Benelli 2001) and two (de Giorgi 2012; Scope 2008) testing multiple diagnostic thresholds.

We recorded high concern for the applicability of the index tests for 85% (n = 28) of in‐person evaluations. High concern was primarily due to a lack of description of the diagnostic thresholds used (n = 24), but also as a result of presentation of average (Argenziano 2006) or consensus diagnoses (Barzegari 2005; Benelli 1999; Carli 2002a; Cristofolini 1997; Morales Callaghan 2008; Steiner 1987) as opposed to the diagnosis of a single observer. Two studies were also judged to have reported diagnosis by non‐expert observers (Menzies 2009; Walter 2012), both of which reported diagnoses by large groups of primary care practitioners. In reality, specific expertise in diagnosing pigmented lesions does vary amongst examiners, for example Menzies 2009 requiring a history of excision or referral of at least 10 pigmented skin lesions over the previous 12‐month period but excluding those already using dermoscopy or digital monitoring of lesions, and Walter 2012 excluding those with specialist dermatology training but reporting some training in dermatology for almost a quarter of participating GPs. We judged almost three quarters of studies (n = 24) to have applied and interpreted the ‘test’ in a clinically applicable manner, nine (27%) provided sufficient detail of the threshold used and 11 (33%) described the observers as expert or experienced. All image‐based studies were of high concern for applicability, due to the image‐based nature of interpretation limiting the clinical applicability of findings but also the lack of detail on the thresholds used (n = 13). A higher proportion (62%; n = 10) described the observers as expert or experienced.

Reference standard

Of the 49 included cohorts, we judged 85% at low risk of bias for the reference standard due to the use of an acceptable reference standard (n = 42). Six did not meet our criteria for an acceptable reference standard, with more than 20% of the benign lesions having only expert diagnosis with no clinical follow‐up (Bono 1996; Green 1991; Grimaldi 2009; Menzies 2009; Stanganelli 2000; Walter 2012), three of which were primary care‐based studies (Grimaldi 2009; Menzies 2009; Walter 2012). We recorded blinding of the reference standard to the index test (in this case the pathology referral diagnosis) but it did not contribute to the overall risk of bias for the reference standard domain. Three studies implemented no blinding of the reference standard (Menzies 2009 and Walter 2012 referring patients for excision under standard practice, and Thomas 1998 describing a form, recording the presence or absence of each ABCDE criterion, that accompanied the usual pathology form) and the remaining 46 studies did not describe blinding (94%). The applicability of the reference standard was of low concern in nine studies (18%), high in seven (14%), and unclear for 33 (67%). In all cases, high concern was due to the use of expert opinion for classifying the final diagnosis of some lesions. The majority of studies (n = 40; 82%) did not report histopathology interpretation by an experienced histopathologist or by a dermatopathologist.

Participant flow

In terms of flow and timing, we judged 20 cohorts at high risk of bias, seven at low risk, and 22 did not provide enough information on which to judge this domain. Of those at high risk, 11 cohorts did not use the same reference standard for all participants (differential verification), and 15 did not include all participants in the analysis either due to incomplete information (Argenziano 2006; Bono 1996; Ek 2005; McGovern 1992; Menzies 2009; Pizzichetta 2004; Walter 2012); inadequate images (Chang 2013; Dolianitis 2005; Green 1994; Lorentzen 1999; Pizzichetta 2004; Rosendahl 2011; Scope 2008); and exclusion of particular lesion groups following recruitment (Bourne 2012; Dummer 1993; Menzies 2009). A further 37 cohorts were unclear on the interval between the application of the index test and excision for histology with 12 reporting consecutive diagnosis and excision or biopsy.

Findings

1. Target condition: invasive melanoma and atypical intraepidermal melanocytic variants

Thirty‐seven studies reported accuracy data for the detection of invasive melanoma and atypical intraepidermal melanocytic variants, one of which reported data for three different sets of lesions (Morton 1998a; Morton 1998b; Morton 1998c), giving a total of 39 datasets; the studies conducted 28 evaluations in person and 11 were image‐based.

We have summarised details of the in‐person studies in Appendix 9, with quality assessments in Appendix 10. Summary details of the image‐based studies are in Appendix 11, with quality assessments in Appendix 12. Established algorithms used to assist diagnosis are described in Appendix 4. Results for the primary analyses are presented in Table 2. Forest plots of the study data for each analysis in Table 2 are presented in Figure 7 and Figure 8; summary estimates are depicted in Figure 9 and Figure 10. Table 3 reports heterogeneity investigations, Table 4 compares test algorithms and Table 5 compares observers.

1. Primary analyses for detection of invasive melanoma or atypical intraepidermal melanocytic variants by position on the clinical pathway.

a. In‐person evaluations (n = 28)

| Position on pathway | Datasets | Lesions (melanomas) | Sensitivity % (95% CI %) | Variance | Specificity % (95% CI %) | Variance |
|---|---|---|---|---|---|---|
| Participants with limited prior testing (unselected on reference standard) | | | | | | |
| Clear | 3 | 1339 (55) | 92.4 (26.2 to 99.8) | 6.26 | 79.7 (73.7 to 84.7) | 0.07 |
| Participants with limited prior testing (selected for excision) | | | | | | |
| Clear | 2ᵃ | 4228 (160) | 90.1 (70.0 to 97.3) | 0.53 | 81.3 (67.5 to 90.0) | 0.25 |
| Unclear | 1 | 353 (38) | 78.9 (62.7 to 90.4) | | 94.0 (90.7 to 96.3) | |
| Combined | 3 | 4581 (198) | 87.2 (73.2 to 94.4) | 0.45 | 87.1 (74.6 to 94.0) | 0.51 |
| Referred participants (unselected on reference standard) | | | | | | |
| Clear | 2 | 3494 (61) | 74.6 (48.9 to 90.0) | 0.14 | 98.6 (94.7 to 99.6) | 0.77 |
| Referred participants (selected for excision) | | | | | | |
| Clear | 8 | 5331 (258) | 76.7 (61.7 to 87.1) | 0.78 | 95.7 (89.7 to 98.3) | 1.73 |
| Unclear | 9 | 9611 (1015) | 82.8 (74.4 to 88.9) | 0.34 | 89.2 (71.1 to 96.5) | 3.21 |
| Combined | 17 | 14942 (1273) | 79.7 (71.7 to 85.8) | 0.59 | 93.0 (85.4 to 96.8) | 2.59 |
| Referred participants with equivocal lesions (selected for excision) | | | | | | |
| Clear | 2ᵃ | 930 (88) | 84.7 (55.5 to 96.1) | 0.93 | 89.5 (79.5 to 95.0) | 0.27 |
| Unclear | 1 | 318 (73) | 61.4 (49.0 to 72.9) | | 87.3 (82.5 to 91.2) | |
| Combined | 3 | 1248 (161) | 76.4 (48.4 to 91.8) | 1.03 | 88.8 (81.8 to 93.3) | 0.21 |

b. Image‐based evaluations (n = 11)

| Position on pathway | Datasets | Lesions (melanomas) | Sensitivity % (95% CI %) | Variance | Specificity % (95% CI %) | Variance |
|---|---|---|---|---|---|---|
| Participants with limited prior testing (selected for excision) | | | | | | |
| Clear | 1 | 50 (9) | 22.2 (2.8 to 60.0) | | 70.7 (54.4 to 83.9) | |
| Unclear | 1 | 463 (29) | 20.7 (8.0 to 39.7) | | 96.8 (94.6 to 98.2) | |
| Combined | 2 | 513 (38) | 21.4 (10.0 to 40.1) | 0 | 90.9 (60.7 to 98.1) | 1.50 |
| Referred participants (unselected on reference standard) | | | | | | |
| Clear | 1 | 134 (31) | 74.2 (55.4 to 88.1) | | 82.5 (73.8 to 89.3) | 1 |
| Referred participants (selected for excision) | | | | | | |
| Unclear | 6 | 293 (96) | 60.3 (49.2 to 70.5) | 0.02 | 77.0 (63.9 to 86.4) | 0.40 |
| Referred participants with equivocal lesions (selected for excision) | | | | | | |
| Unclear | 2 | 303 (98) | 61.9 (46.7 to 75.0) | 0.10 | 81.8 (75.2 to 87.0) | 0.01 |

CI: confidence interval

ᵃ Sensitivity and specificity estimated independently in separate models due to sparse data.

Figure 7. Forest plot of in‐person evaluations of visual inspection for detection of invasive melanoma and atypical intraepidermal melanocytic variants by point on the clinical pathway where they are diagnosed

Figure 8. Forest plot of image‐based evaluations of visual inspection for detection of invasive melanoma and melanocytic intraepidermal variants by point on the clinical pathway where they are diagnosed

Figure 9. Summary estimates of accuracy of in‐person visual inspection for the detection of invasive melanoma and melanocytic intraepidermal variants by point on the clinical pathway where they are diagnosed (confidence regions are not plotted due to small numbers of studies)

Figure 10. Summary estimates of accuracy of image‐based visual inspection for the detection of invasive melanoma and melanocytic intraepidermal variants by point on the clinical pathway where they are diagnosed (confidence regions are not plotted due to small numbers of studies)

2. Secondary analyses for primary target condition by covariate.

| Subgroup | Datasets | Lesions (melanomas) | Diagnostic odds ratio (DOR) (95% CI) | Relative DOR (95% CI) | P value (DOR) | P valueᵃ (HSROC models) |
|---|---|---|---|---|---|---|
| Differences: in‐person and image‐based evaluations | | | | | | |
| In‐person | 28 | 25,604 (1748) | 37.5 (21.7 to 64.7) | 8.54 (2.89 to 25.3) | < 0.001 | 0.001 |
| Image‐based | 11 | 1243 (263) | 4.38 (1.79 to 10.8) | | | |
| Analyses based on in‐person evaluations only (n = 28) | | | | | | |
| Study setting | | | | | | |
| Primary/community/private | 6 | 5920 (253) | 27.6 (6.95 to 109) | | | |
| Secondary | 10 | 10,419 (1019) | 39.0 (13.8 to 110) | | | |
| Specialist clinic | 12 | 9265 (476) | 44.4 (17.2 to 115) | Secondary/specialist vs primaryᵇ: 1.51 (0.32 to 7.09) | 0.59 | 0.62 |
| Use of a diagnostic algorithm | | | | | | |
| No algorithm used | 21 | 19,330 (1076) | 37.3 (18.0 to 77.3) | | | |
| Any algorithm used | 7 | 6274 (672) | 38.5 (11.3 to 132) | 1.03 (0.25 to 4.34) | 0.96 | 0.55 |
| Type of reference standard used | | | | | | |
| Histology alone | 22 | 20,783 (1627) | 39.1 (19.7 to 77.8) | | | |
| Histology plus any other | 6 | 4821 (121) | 29.7 (6.60 to 134) | 0.76 (0.14 to 4.02) | 0.74 | 0.68 |
| Prevalence | | | | | | |
| Prevalence ≤ 0.1 | 16 | 21,907 (811) | 63.7 (28.6 to 142) | | | |
| Prevalence > 0.1 | 12 | 3697 (937) | 19.6 (8.39 to 45.8) | 0.31 (0.09 to 1.00) | 0.05 | 0.06 |

CI: confidence interval; DOR: diagnostic odds ratio; HSROC: hierarchical summary receiver‐operator curves; RDOR: relative diagnostic odds ratio

ᵃ Likelihood ratio test assessing differences in both accuracy and threshold.

ᵇ Secondary vs primary 1.41 (0.25 to 7.93), P = 0.68; specialist vs primary 1.61 (0.30 to 8.63), P = 0.56; specialist vs secondary 1.14 (0.28 to 4.68), P = 0.85.
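The relationship between an RDOR, its 95% confidence interval and a Wald‐type P value can be illustrated by back‐calculation on the log scale. This is a rough sketch under the assumption that the interval is symmetric on the log scale; the P values reported in the table come from the fitted models, so a calculation from rounded table entries is expected to agree only approximately. The function name is hypothetical.

```python
from math import erfc, log, sqrt


def wald_p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Two-sided P value for a ratio measure, recovered from its 95% CI."""
    # Standard error of the log ratio implied by the width of the interval.
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(estimate) / se
    # erfc(|z| / sqrt(2)) equals twice the upper tail of the standard normal.
    return erfc(abs(z) / sqrt(2))


# RDOR values and intervals as reported in the table above.
p_in_person_vs_image = wald_p_from_ci(8.54, 2.89, 25.3)
p_algorithm = wald_p_from_ci(1.03, 0.25, 4.34)
p_reference_standard = wald_p_from_ci(0.76, 0.14, 4.02)
```

The first comparison yields a P value well below 0.001, in line with the reported '< 0.001', while the other two yield large P values consistent with no evidence of a difference.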

3. Visual inspection for detection of melanoma and atypical intraepidermal melanocytic variants ‐ by algorithm.

| Test (threshold) | Datasets | Lesions (melanomas) | Pooled sensitivity (95% CI %) | Pooled specificity (95% CI %) | Diagnostic odds ratio (DOR) (95% CI) |
|---|---|---|---|---|---|
| In‐person evaluations | | | | | |
| No algorithm | 21 | 19,330 (1076) | 78% (68 to 85) | 93% (88 to 96) | 46.2 (21.9 to 97.5) |
| (A)BCD(E)ᵃ | 6ᵇ | 5501 (654) | 83% (75 to 88) | 88% (64 to 97) | 36.6 (7.94 to 168) |
| 7‐point checklist at ≥ 2 | 1 | 205 (12) | 92% (62 to 100) | 65% (58 to 72) | 22.8 (2.08 to 176) |
| 7‐point checklist at ≥ 3 | 1 | 205 (12) | 42% (15 to 72) | 93% (89 to 96) | 11.8 (3.22 to 43.3) |
| 7‐point checklist at ≥ 4 | 1 | 205 (12) | 25% (7 to 57) | 98% (96 to 100) | 31.8 (4.71 to 215) |
| 7‐point checklist (revised) at ≥ 3 | 1 | 773 (18) | 94% (73 to 100) | 80% (77 to 83) | |
| Collas algorithmᶜ at ≥ 1 | 1 | 353 (38) | 76% (60 to 89) | 50% (44 to 56) | 3.24 (1.49 to 7.07) |
| Image‐based evaluations | | | | | |
| No algorithm | 9 | 1090 (217) | 58% (43 to 71) | 84% (76 to 90) | 7.47 (4.12 to 13.5) |
| ABCD(E)ᵈ | 2 | 153 (46) | 53% (37 to 70) | 71% (45 to 88) | 2.87 (0.93 to 8.79) |

CI: confidence interval

ᵃ Combines data from studies using ABCD with threshold not reported (n = 2), ABCDE with at least 2 characteristics present (n = 3) and BCD with at least 2 characteristics present (n = 1).

ᵇ Due to non‐convergence, the bivariate models were fitted assuming zero correlation between the logit sensitivity and logit specificity and removing the random‐effects term for specificity when estimating sensitivity and the random‐effects term for sensitivity when estimating specificity.

ᶜ Study authors developed and used own algorithm.

ᵈ Combines data from studies using ABCD with at least 2 characteristics present (n = 1) and ABCDE with at least 2 characteristics present (n = 1).

4. Secondary analyses for detection of melanoma and atypical intraepidermal melanocytic variants by observer.

| Subgroup | Datasets | Lesions | Diagnostic odds ratio (DOR) (95% CI) | Relative DOR (RDOR) (95% CI) | P value (for RDOR) | P valueᵃ (HSROC models) |
|---|---|---|---|---|---|---|
| In‐person evaluations | | | | | | |
| Expert consultant | 9 | 3547 | 29.0 (11.0 to 76.2) | 1 | | 0.36 |
| Consultant | 13 | 16,858 | 38.4 (16.9 to 87.6) | 1.32 (0.37 to 4.71) | 0.65 | |
| Resident/registrar | 2 | 1339 | 12.9 (1.99 to 84.0) | 0.45 (0.05 to 3.67) | 0.44 | |
| Mixed (secondary care) | 2 | 2704 | 48.0 (4.54 to 507) | 1.65 (0.13 to 21.4) | 0.69 | |
| GP | 3 | 1236 | 211 (24.9 to 1788) | 7.28 (0.69 to 76.3) | 0.09 | |
| Image‐based evaluations | | | | | | |
| Expert consultant | 6 | 974 | 20.5 (4.82 to 86.9) | 1 | | 0.22 |
| Consultant | 4 | 200 | 3.76 (1.15 to 12.3) | 0.18 (0.04 to 0.90) | 0.04 | |
| Mixed (secondary care) | 1 | 200 | 10.9 (2.02 to 59.2) | 0.53 (0.07 to 3.97) | 0.50 | |
| Mixed (secondary/primary care) | 1 | 40 | 11.5 (0.94 to 142) | 0.56 (0.04 to 7.51) | 0.63 | |
| Mixed (primary care) | 2 | 184 | 6.60 (1.73 to 25.2) | 0.32 (0.07 to 1.40) | 0.11 | |

CI: confidence interval; DOR: diagnostic odds ratio; HSROC: hierarchical summary receiver‐operator curves; RDOR: relative diagnostic odds ratio

ᵃ Likelihood ratio test assessing differences in both accuracy and threshold.

In‐person evaluations

Of the 28 evaluations conducted on an in‐person basis, 17 contained enough information to describe where on the clinical pathway they had assessed participants (coded as ‘clear’ on pathway), and we considered 11 not to have provided sufficient information to allow us to identify the pathway (coded ‘unclear’ on pathway). We considered these evaluations according to position on the pathway and clear versus unclear pathway classification (Table 2). Figure 7 presents the results of the individual studies grouped by their position on the pathway; Figure 9 depicts the summary estimates at each point on the pathway.

Studies in participants with limited prior testing

Six in‐person evaluations of visual inspection recruited series of participants with pigmented lesions, who were presenting for a first structured clinical assessment of a suspicious lesion (Collas 1999; Gachon 2005; Grimaldi 2009; McGovern 1992; Menzies 2009; Walter 2012) (Appendix 9; Appendix 10). All studies included participants with pigmented lesions; Gachon 2005 restricted inclusion to melanocytic lesions only. The prevalence of disease ranged from 4% to 6% in four studies, with Collas 1999 (11%) and Grimaldi 2009 (20%) reporting higher prevalence of melanoma.

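Three studies prospectively included all participants presenting in primary care within a given time frame and were clearly positioned on the clinical pathway (Pathway 2‐c in Figure 9):

  • summary sensitivity was 92.4% (95% CI 26.2% to 99.8%) and specificity 79.7% (95% CI 73.7% to 84.7%) (1339 lesions and 55 melanomas; Grimaldi 2009; Menzies 2009; Walter 2012).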

The studies supplemented histological diagnosis with clinical follow‐up of at least three months for lesions considered benign (all three studies) and two included expert clinical diagnosis without follow‐up for some benign lesions (Menzies 2009; Walter 2012).

Three studies included only participants with lesions selected for excision (Pathway 3‐c and 3‐u in Figure 9): two were conducted in private dermatology clinics (Collas 1999; Gachon 2005) and one at an open access veterans’ dermatology clinic (McGovern 1992) (Appendix 9):

  • summary sensitivity was 90.1% (95% CI 70.0% to 97.3%) and specificity 81.3% (95% CI 67.5% to 90.0%) for two studies clearly positioned on the clinical pathway (Pathway 3‐c; 4228 lesions and 160 melanomas; Gachon 2005; McGovern 1992);

  • sensitivity was 78.9% (95% CI 62.7% to 90.4%) and specificity 94.0% (95% CI 90.7% to 96.3%) (353 lesions and 38 melanomas; Collas 1999) in the single study that could not be clearly positioned on the clinical pathway (Pathway 3‐u).

Diagnosis was recorded by primary care physicians with a range of experience (Grimaldi 2009; Menzies 2009; Walter 2012) or by dermatologists (Collas 1999; Gachon 2005; McGovern 1992) with no obvious differences in sensitivity or specificity. Four studies reported no formal algorithm to assist diagnosis. Two of these classified lesions ‘suspicious for malignancy’ as test‐positive (Gachon 2005; Grimaldi 2009) and two reported data for ‘correct’ or ‘primary’ diagnosis of melanoma (Collas 1999; Menzies 2009). Walter 2012 reported data for MacKie's revised seven‐point checklist (MacKie 1990) at a threshold of ≥ 3, and McGovern 1992 used the BCD algorithm at ≥ 2 characteristics present (this study also reported data using the original seven‐point checklist, see 'Analyses by algorithm' reported below).

Studies in referred participants

Studies conducted 22 in‐person evaluations of visual inspection in participants referred for specialist assessment. We were able to position 12 clearly on the clinical pathway (three evaluations from a single study) and 10 did not provide sufficient information for us to make a clear assessment (Figure 7; Appendix 9; Appendix 10).

We judged two studies to include all participants referred for further assessment (Pathway 4‐c in Figure 9) and both were clearly positioned on the clinical pathway:

  • summary sensitivity was 74.6% (95% CI 48.9% to 90.0%) and specificity 98.6% (95% CI 94.7% to 99.6%) (3494 lesions and 61 melanomas; Barzegari 2005; Stanganelli 2000).

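Fifteen studies providing 17 datasets included only those with any lesion selected for excision (Pathway 5‐c and 5‐u in Figure 9):

  • summary sensitivity was 76.7% (95% CI 61.7% to 87.1%) and specificity 95.7% (95% CI 89.7% to 98.3%) (5331 lesions and 258 melanomas) for the eight datasets clearly positioned on the clinical pathway (Pathway 5‐c);

  • summary sensitivity was 82.8% (95% CI 74.4% to 88.9%) and specificity 89.2% (95% CI 71.1% to 96.5%) (9611 lesions and 1015 melanomas) for the nine datasets not clearly positioned on the clinical pathway (Pathway 5‐u).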

We considered three studies to report data only for those participants with equivocal or difficult‐to‐diagnose lesions selected for excision (Pathway 5*‐c and 5*‐u in Figure 7 and Figure 9):

  • summary sensitivity was 84.7% (95% CI 55.5% to 96.1%) and specificity 89.5% (95% CI 79.5% to 95.0%) (930 lesions and 88 melanomas) for two studies clearly positioned on the clinical pathway (Pathway 5*‐c; Dummer 1993; Soyer 1995);

  • sensitivity was 61.4% (95% CI 49.0% to 72.9%) and specificity 87.3% (95% CI 82.5% to 91.2%) (318 lesions and 73 melanomas) in one study not clearly positioned on the clinical pathway (Pathway 5*‐u; Steiner 1987).

Studies included pigmented lesions referred for further evaluation at a dermatology or pigmented lesion clinic, two restricting to melanocytic lesions only (Morales Callaghan 2008; Unlu 2014) and four restricting by lesion diameter (≤ 3 mm (Bono 2006), ≤ 6 mm (Bono 2002b), < 10 mm (Steiner 1987), or ≤ 15 mm (Barzegari 2005)). The prevalence of disease ranged from 1% (Ek 2005) to 41% (Soyer 1995). Disease prevalence was generally lower in studies clearly positioned on the clinical pathway (11% or less in 7 of 10 datasets) compared to those that could not be clearly positioned (7 of 9 datasets reporting disease prevalence of 15% or over (Appendix 9)). The prevalence of melanoma in studies of equivocal lesions was 3% (Dummer 1993), 23% (Steiner 1987) and 41% (Soyer 1995).

Diagnoses were recorded by dermatologists or dermatology residents (or were assumed to be by dermatologists based on study authors’ institutions or study settings), by surgical oncologists or by plastic surgeons (Appendix 9). Observer experience was poorly reported, with only seven studies referring to ‘experienced’ or ‘expert’ observers; three studies were clearly positioned on the pathway and four not clearly positioned. All studies reported observer diagnosis with no formal algorithm, apart from five using ABCD or ABCDE algorithms (Benelli 1999; Cristofolini 1994; Cristofolini 1997; Thomas 1998; Stanganelli 2000). Diagnosis was more often based on the opinion of a single observer as opposed to a consensus or average decision in studies clearly positioned on the pathway (10 of 12 datasets; Stanganelli 2000; Bono 2002a; Bono 2002b; Bono 2006; Green 1991; Morton 1998a; Morton 1998b; Morton 1998c; Dummer 1993; Soyer 1995) compared to those not clearly positioned (3 of 10 datasets; Thomas 1998; Unlu 2014; Zaumseil 1983).

Image‐based evaluations

Of the 11 image‐based evaluations, two contained enough information to describe where on the clinical pathway they had assessed participants (coded as ‘clear’ on pathway), and we considered nine to have provided insufficient information to allow us to identify the pathway (coded ‘unclear’ on pathway) (Appendix 11; Appendix 12). We have presented the results in Table 2. Figure 8 presents the results of the individual studies grouped by their position on the pathway; Figure 10 depicts the summary estimates at each point on the pathway.

Studies in participants with limited prior testing

Two studies retrospectively reviewed clinical images from participants with lesions excised in primary care settings (Pathway 3‐c and 3‐u in Figure 10):

  • sensitivity was 22.2% (95% CI 2.8% to 60.0%) and specificity 70.7% (95% CI 54.4% to 83.9%) (50 lesions and 9 melanomas) in one study clearly positioned on the clinical pathway (Pathway 3‐c; Bourne 2012);

  • sensitivity was 20.7% (95% CI 8.0% to 39.7%) and specificity 96.8% (95% CI 94.6% to 98.2%) (463 lesions and 29 melanomas) in the study not clearly positioned on the clinical pathway (Pathway 3‐u) (Rosendahl 2011). The study report was unclear as to whether the excisions were undertaken at the primary care practice or in a referral setting.

The prevalence of melanoma was 6% (Rosendahl 2011) and 20% (Bourne 2012) and both studies included a range of different types of lesions. Three GPs and a clinical nurse, with varying levels of dermoscopy experience, reviewed the lesion images in Bourne 2012 and an expert dermatologist reviewed the images in Rosendahl 2011. They made their diagnoses without the aid of a published algorithm.

Studies in referred participants

Nine evaluations of clinical images were conducted in participants referred for specialist assessment; we could clearly position one on the clinical pathway and eight did not provide sufficient information for us to make a clear assessment.

We considered the one study clearly positioned on the clinical pathway to have included all participants referred for further assessment (Pathway 4‐c in Figure 10):

  • sensitivity was 74.2% (95% CI 55.4% to 88.1%) and specificity 82.5% (95% CI 73.8% to 89.3%) (134 lesions and 31 melanomas; Stanganelli 2005).

Although the remaining eight studies did not provide sufficient information to allow us to clearly position them on the clinical pathway, we assumed that they had obtained lesion images from referral settings (Pathway 5‐u and 5*‐u in Figure 10):

  • summary sensitivity was 60.3% (95% CI 49.2% to 70.5%) and specificity 77.0% (95% CI 63.9% to 86.4%) (293 lesions and 96 melanomas) for six studies that included all lesions selected for excision (Pathway 5‐u) (Benelli 2001; Carli 2002b; Dolianitis 2005; Pizzichetta 2004; Stanganelli 1998a; Winkelmann 2016);

  • summary sensitivity was 61.9% (95% CI 46.7% to 75.0%) and specificity 81.8% (95% CI 75.2% to 87.0%) (303 lesions and 98 melanomas) across two studies that included participants with equivocal lesions selected for excision (Pathway 5*‐u) (Carli 2003a; de Giorgi 2012).

Studies were retrospective case series apart from two case‐control type studies (Dolianitis 2005; Winkelmann 2016) and one with an unclear design (Benelli 2001). Three studies (Benelli 2001; Dolianitis 2005; Stanganelli 1998a) evaluated observer accuracy before and after dermoscopy training. All the studies reviewed images of pigmented or melanocytic lesions apart from one that focused on hypomelanotic (≤ 30% pigmentation) or amelanotic lesions (Pizzichetta 2004). The prevalence of melanoma ranged from 19% (Carli 2002b) to 50% (Dolianitis 2005); four studies included only melanomas (including in situ) and benign naevi (Carli 2003a; de Giorgi 2012; Stanganelli 2005; Winkelmann 2016).

Dermatologists or observers with mixed qualifications undertook lesion diagnosis; observer experience was poorly reported (Appendix 11). Stanganelli 2005 also provided accuracy data for the average of three GPs (data reported in section 1.3.2). Most studies presented average accuracy across observers; only two reported accuracy for a single observer (Benelli 2001; Pizzichetta 2004). All studies except Benelli 2001 (ABCDE algorithm) and de Giorgi 2012 (ABCD) made diagnoses without the use of diagnostic algorithms.

Secondary analyses

We conducted secondary analyses for the detection of invasive melanoma and atypical intraepidermal melanocytic variants, regardless of classification by clinical pathway.

Covariate investigations

A preliminary analysis across the 39 datasets contributing to the primary analyses described above found a large difference in accuracy for in‐person evaluations compared to those based on the assessment of clinical images (RDOR 8.54, 95% CI 2.89 to 25.3, P < 0.001; Table 3; Figure 11). The observed difference is so large, and raises such serious concerns about the applicability of visual inspection studies based on image observation alone, that we elected to undertake all subsequent covariate investigations using in‐person evaluations only (n = 28).
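As a rough plausibility check, a relative diagnostic odds ratio (RDOR) and its 95% CI can be converted into an approximate two‐sided P value. The sketch below assumes a Wald‐type interval that is symmetric on the log scale, which may not be exactly how the fitted models derived their P values; the function name is hypothetical, and because the inputs are rounded published figures the result should be read as approximate.

```python
import math

def p_from_ratio_ci(estimate, lower, upper, z_crit=1.959964):
    """Approximate two-sided P value for a ratio measure such as an RDOR.

    Assumes the 95% CI is symmetric around the estimate on the log scale.
    """
    # Standard error of log(estimate), recovered from the CI width.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# In-person versus image-based evaluations: RDOR 8.54 (95% CI 2.89 to 25.3).
p_value = p_from_ratio_ci(8.54, 2.89, 25.3)
print(f"approximate P = {p_value:.5f}")
```

With these inputs the approximation falls well below 0.001, in line with the reported P value.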

Figure 11. Summary ROC comparing in‐person and image‐based evaluations of visual inspection for detection of invasive melanoma or atypical intraepidermal melanocytic variants (MEL)

For the 28 in‐person evaluations, only one of the four covariate investigations approached statistical significance (Table 3); observed accuracy was lower in studies where disease prevalence of melanoma (percentage of cases in the study that tested positive for the reference standard) was over 10% compared to those with disease prevalence of 10% or less (RDOR 0.31, 95% CI 0.09 to 1.00; P = 0.05). The RDOR for study setting (secondary care or specialist clinic compared to primary care) was 1.51 (95% CI 0.32 to 7.09; P = 0.59; Figure 12), for use of a named algorithm to aid diagnosis compared to no algorithm reported was 1.03 (95% CI 0.25 to 4.34; P = 0.96; Figure 13), and for use of histology plus clinical follow‐up or other reference standard compared to histology alone was 0.76 (95% CI 0.14 to 4.02; P = 0.74; Figure 14).

Figure 12. Summary ROC plot of in‐person visual inspection evaluations stratified by study setting for detection of invasive melanoma and atypical intraepidermal melanocytic variants (MEL)

Figure 13. Summary ROC plot of in‐person visual inspection evaluations stratified by use of a published algorithm for detection of invasive melanoma and atypical intraepidermal melanocytic variants (MEL)

Figure 14. Summary ROC plot of in‐person visual inspection evaluations stratified by reference standard for detection of invasive melanoma and atypical intraepidermal melanocytic variants (MEL)

Analyses by algorithms used to assist visual inspection

Of the 28 in‐person evaluations, only seven reported using an algorithm to assist visual inspection, limiting our ability to make meaningful comparisons between algorithms (Table 4). Observer diagnosis without the use of a formal algorithm (n = 21 datasets) had the highest diagnostic accuracy (DOR 46.2, 95% CI 21.9 to 97.5), with an average sensitivity of 78% (95% CI 68% to 85%) and average specificity of 93% (95% CI 88% to 96%). Pooled sensitivity was slightly higher and specificity slightly lower for variations on the (A)BCD(E) algorithm (n = 6 datasets), but with overlapping confidence intervals (summary sensitivity 83% (95% CI 75% to 88%); summary specificity 88% (95% CI 64% to 97%)). Two datasets reported data for either the original seven‐point checklist at a number of thresholds (McGovern 1992) or for the revised seven‐point checklist (Walter 2012). At the standard threshold of 3 or above for both algorithms, the highest observed sensitivity and specificity were 94% (95% CI 73% to 100%) and 80% (95% CI 77% to 83%), respectively, for the revised version (Walter 2012).

The image‐based evaluations reported data for either no algorithm or for variations of ABCD(E); we observed a similar pattern with much lower levels of overall accuracy (Table 4).

Analyses by observer experience

Analyses by observer expertise were restricted by the limited amount of information provided in the study reports (Table 5; Appendix 9; Appendix 11). Our analyses are therefore based primarily on study subgroups by observer qualifications (consultant/registrar/mixed qualifications/primary care practitioners), with the ‘consultant’ category separated into ‘Expert consultant’ (for any study describing observers as expert or experienced) and ‘Consultant’ where experience or expertise was not otherwise reported (for example, for those that described observers as dermatologists) (Table 5).

No clear pattern according to observer experience could be discerned for in‐person evaluations. RDORs in comparison to the ‘Expert consultant’ group (9 studies) ranged from 0.45 (95% CI 0.05 to 3.67; P = 0.44) for observers at resident/registrar level (2 studies) to 7.28 (95% CI 0.69 to 76.3; P = 0.09) for GPs (3 studies).

For image‐based evaluations, accuracy was highest for the ‘Expert consultant’ group (DOR 20.5, 95% CI 4.82 to 86.9); RDORs in comparison to the ‘expert’ group ranged from 0.18 (95% CI 0.04 to 0.90; P = 0.04) for observers described as ‘dermatologists’ (4 studies) to 0.56 (95% CI 0.04 to 7.51; P = 0.63) for mixed secondary and primary care observers (1 study).

Across all definitions of the target condition, seven studies provided comparative data according to observer qualifications or experience (Table 6). Most were image‐based assessments, using no prescribed algorithm to aid diagnosis and reporting average results across groups of observers. We observed some evidence of increased sensitivity and smaller increases in specificity with increasing experience; however, wide variations in accuracy remained, with sensitivity ranging from 58% to 91% for expert dermatologists and specificities from 53% to 99%.

5. Results for studies reporting data for more than one observer.

Study | Algorithm (diagnostic approach) | Dis/non‐dis; prevalence (a) | Observer qualification | Sensitivity (95% CI %) | Specificity (95% CI %)

Target condition: invasive melanoma and/or atypical intraepidermal melanocytic variants
Benelli 2001 | ABCDE (i‐b) | 12/38; 24% | Dermatologist (n = 65) | 50% (21 to 79) | 50% (33 to 67)
Benelli 2001 | ABCDE (i‐b) | 12/38; 24% | Expert dermatologists (n = 1) | 58% (28 to 85) | 53% (36 to 69)
Morton 1998a; Morton 1998b; Morton 1998c | No algorithm (in‐p) | 69/694; 9% (different lesions per obs) | Registrar (n = 6) | 79% (59 to 92) | 98% (97 to 99)
Morton 1998a; Morton 1998b; Morton 1998c | No algorithm (in‐p) | 31/536; 5% (different lesions per obs) | Senior registrar (n = 2) | 90% (74 to 98) | 97% (96 to 99)
Morton 1998a; Morton 1998b; Morton 1998c | No algorithm (in‐p) | 28/641; 4% (different lesions per obs) | Expert dermatologists (n = 2) | 91% (82 to 97) | 99% (97 to 99)
Stanganelli 2005 | No algorithm (i‐b) | 31/103; 23% | GP (n = 3) | 81% (63 to 93) | 73% (63 to 81)
Stanganelli 2005 | No algorithm (i‐b) | 31/103; 23% | Experienced dermatologists (n = 3) | 74% (55 to 88) | 83% (74 to 89)

Target condition: invasive melanoma alone
Lorentzen 1999 | No algorithm (i‐b) | 49/183; 21% | Non‐expert dermatology residents (n = 5) | 61% (46 to 75) | 88% (82 to 92)
Lorentzen 1999 | No algorithm (i‐b) | 49/183; 21% | Experienced dermatologists (n = 4) | 78% (63 to 88) | 89% (84 to 93)
Rao 1997 | ABCD (i‐b) | 21/51; 29% | Melanoma Fellow 1 (n = 1) | 90% (70 to 99) | 80% (67 to 90)
Rao 1997 | ABCD (i‐b) | 21/51; 29% | Dermatologist 1 (n = 1) | 76% (53 to 92) | 82% (69 to 92)
Rao 1997 | ABCD (i‐b) | 21/51; 29% | Melanoma Fellow 2 (n = 1) | 86% (64 to 97) | 75% (60 to 86)
Rao 1997 | ABCD (i‐b) | 21/51; 29% | Dermatologist 2 (n = 1) | 86% (64 to 97) | 75% (60 to 86)
Scope 2008 | Ugly duckling (i‐b) | 5/140; 3% | Dermatology nurse + medical photographer (n = 5) | 60% (15 to 95) | 96% (91 to 98)
Scope 2008 | Ugly duckling (i‐b) | 5/140; 3% | General dermatologists (n = 13) | 80% (28 to 99) | 86% (79 to 91)
Scope 2008 | Ugly duckling (i‐b) | 5/140; 3% | Expert dermatologists (n = 8) | 80% (28 to 99) | 95% (90 to 98)
Westerhoff 2000 | No algorithm (i‐b) | 50/50; 50% | GP pre‐dermoscopy training (n = 37) | 54% (39 to 68) | 53% (38 to 67)
Westerhoff 2000 | No algorithm (i‐b) | 50/50; 50% | GP post‐dermoscopy training (n = 37) | 62% (47 to 75) | 54% (39 to 68)

CI: confidence interval; GP: general practitioner; in‐p: in‐person; i‐b: image‐based; obs: observer

(a) Number of diseased/number of non‐diseased (prevalence of disease), for each definition of the target condition

2. Target condition: invasive melanoma only

In this section, we present the results for studies of visual inspection for the identification of invasive melanoma, according to the approach taken for diagnosis: in‐person or image‐based evaluations. We have presented summary characteristics of studies in Appendix 13 and results of meta‐analyses in Table 7. Table 8 compares results in studies reporting data for invasive melanoma alone and for invasive melanoma plus atypical intraepidermal melanocytic variants.

6. Secondary analyses for alternative definitions of the target condition.

Subgroup | Datasets | Participants (cases) | Diagnostic odds ratio (DOR) (95% CI) | Sensitivity (95% CI %) | Specificity (95% CI %) | Relative DOR (RDOR) (95% CI) | P value (RDOR) | P value (a) (hierarchical summary receiver‐operator curves (HSROC) models)

Differences between in‐person and image‐based evaluations

Detection of invasive melanoma alone
In‐person | 7 | 6857 (208) | 62.4 (17.6 to 222) | 86% (68 to 94) | 91% (81 to 96) | 4.21 (0.62 to 28.6) | 0.13 | 0.27
Image‐based | 5 | 599 (150) | 14.8 (3.56 to 61.9) | 76% (50 to 91) | 83% (62 to 93) | | |

Detection of any skin lesion requiring excision
In‐person | 7 | 8091 (2187) | 20.5 (7.11 to 59.3) | 81% (68 to 90) | 81% (56 to 93) | 1.70 (0.24 to 12.3) | 0.55 | 0.87
Image‐based | 3 | 547 (138) | 11.9 (2.22 to 65.3) | 75% (49 to 90) | 79% (38 to 96) | | |

(a) Likelihood ratio test assessing differences in both accuracy and threshold.

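7. Results for studies reporting data for more than one definition of the target condition.

Study author | Target condition | Dis/non‐dis; prev (a) | Sensitivity (95% CIs) | Specificity (95% CIs)

In‐person
Ek 2005 | Invasive melanoma or atypical intraepidermal melanocytic variants | 23/2559; 1% | 48% (27 to 69) | 99% (99 to 99)
Ek 2005 | Any lesion requiring excision | 1754/828; 68% | 98% (97 to 98) | 13% (11 to 15)
McGovern 1992 | Invasive melanoma | 6/186; 3% | 100% (54 to 100) | 89% (83 to 93)
McGovern 1992 | Invasive melanoma or atypical intraepidermal melanocytic variants | 11/181; 6% | 73% (39 to 94) | 88% (83 to 93)
McGovern 1992 | Any lesion requiring excision | 15/177; 8% | 73% (45 to 92) | 88% (82 to 93)
Stanganelli 2000 | Invasive melanoma or atypical intraepidermal melanocytic variants | 55/3317; 2% | 67% (53 to 79) | 99% (99 to 100)
Stanganelli 2000 | Any lesion requiring excision | 98/3274; 3% | 71% (61 to 80) | 99% (99 to 99)
Steiner 1987 | Invasive melanoma or atypical intraepidermal melanocytic variants | 73/245; 23% | 59% (47 to 70) | 87% (83 to 91)
Steiner 1987 | Any lesion requiring excision | 93/225; 29% | 67% (56 to 76) | 86% (81 to 90)
Walter 2012 | Invasive melanoma | 16/757; 2% | 94% (70 to 100) | 80% (77 to 83)
Walter 2012 | Invasive melanoma or atypical intraepidermal melanocytic variants | 18/755; 2% | 94% (73 to 100) | 80% (77 to 83)
Walter 2012 | Any lesion requiring excision | 22/751; 3% | 82% (60 to 95) | 80% (77 to 83)

Image‐based
Carli 2002b | Invasive melanoma or atypical intraepidermal melanocytic variants | 10/43; 19% | 80% (44 to 97) | 84% (69 to 93)
Carli 2002b | Any lesion requiring excision | 20/34; 37% | 80% (56 to 94) | 74% (56 to 87)
Rosendahl 2011 | Invasive melanoma or atypical intraepidermal melanocytic variants | 29/434; 6% | 21% (8 to 40) | 97% (95 to 98)
Rosendahl 2011 | Any lesion requiring excision | 104/359; 22% | 76% (67 to 84) | 85% (81 to 88)
Stanganelli 1998a | Invasive melanoma or atypical intraepidermal melanocytic variants | 10/20; 33% | 40% (12 to 74) | 75% (51 to 91)
Stanganelli 1998a | Any lesion requiring excision | 14/16; 47% | 64% (35 to 87) | 75% (48 to 93)

(a) Number of diseased/number of non‐diseased; prevalence of disease, for each definition of the target condition.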

Seven datasets evaluated the accuracy of in‐person visual inspection for the detection of invasive melanoma (Bono 1996; Green 1994; Kopf 1975; Krahn 1998; McGovern 1992; Viglizzo 2004; Walter 2012), only two of which also reported data for the primary target condition (McGovern 1992; Walter 2012). All studies were based in secondary care or specialist units apart from Walter 2012 (primary care) and McGovern 1992 (army medical centre dermatology clinic). Studies used a modified version of the ABCD checklist (McGovern 1992), the revised seven‐point checklist (Walter 2012), or no algorithm (n = 5; 71%) to assist diagnosis. The prevalence of melanoma ranged from 2% (Kopf 1975; Walter 2012) to 49% (Krahn 1998). Two studies supplemented a histological reference standard with clinical follow‐up (Walter 2012) and expert diagnosis of some benign lesions (Bono 1996; Walter 2012).

Sensitivities ranged from 67% to 100% and specificities ranged from 76% to 100%. In meta‐analysis the DOR was 62.4 (95% CI 17.6 to 222) (6857 lesions and 208 melanoma cases). Sensitivity and specificity at the average operating point on the SROC curve were 86% (95% CI 68% to 94%) and 91% (95% CI 81% to 96%) respectively. For the two in‐person evaluations that also reported data for the primary target condition (Table 8), specificity estimates were hardly affected because of the small numbers of included melanoma in situ lesions (five in McGovern 1992 and two in Walter 2012). Sensitivity, however, was higher for detection of invasive melanoma alone in McGovern 1992 (100% versus 73% for detection of invasive melanoma or atypical intraepidermal melanocytic variants) due to correct diagnosis of only two of five in situ melanomas, and was marginally lower in Walter 2012 (93.8% versus 94.4% for detection of invasive melanoma or atypical intraepidermal melanocytic variants) due to correct identification of both in situ melanomas with one invasive melanoma missed.

Five datasets reported the accuracy of image‐based visual inspection for the detection of invasive melanoma (Lorentzen 1999; Rao 1997; Scope 2008; Troyanova 2003; Westerhoff 2000), but none of them reported data for the primary target condition. Only two studies used images from normal practice settings (Lorentzen 1999; Rao 1997); one obtained images from a teledermatology company (Scope 2008) and two selected images of melanoma cases and controls for use in dermoscopy training studies (Troyanova 2003; Westerhoff 2000). The prevalence of melanoma ranged from 3% (Scope 2008) to 50% (Troyanova 2003; Westerhoff 2000). Studies used the ABCD checklist (Rao 1997), the ugly duckling approach (Scope 2008), or no algorithm (n = 3) to assist diagnosis. Four evaluations (80%) clearly presented only the clinical image with no further patient information, and one (Rao 1997) may have presented observers with a concurrent dermoscopic image of the lesion, as blinding between images was not clearly described.

Sensitivities ranged from 62% to 86%; specificities ranged from 54% to 95%. In meta‐analysis the DOR was 14.8 (95% CI 3.56 to 61.9) (599 lesions and 150 melanoma cases). Sensitivity and specificity at the average operating point on the SROC curve were 76% (95% CI 50% to 91%) and 83% (95% CI 62% to 93%) respectively.

Accuracy was non‐significantly higher for in‐person compared to image‐based evaluations (RDOR 4.21; 95% CI 0.62 to 28.6; P = 0.13).
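The relationship between the summary measures reported above can be sketched numerically. The example below uses the standard definition of the diagnostic odds ratio and treats the RDOR as a simple ratio of two DORs; this is an illustration only, since the review's estimates come from hierarchical models, so values recomputed from rounded summary sensitivities and specificities are expected to differ slightly from the reported DORs. Function names are hypothetical.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """Odds of a positive test in diseased divided by odds in non-diseased."""
    return (sensitivity / (1 - sensitivity)) * (specificity / (1 - specificity))

def relative_dor(dor_a, dor_b):
    """Ratio of two diagnostic odds ratios (group A relative to group B)."""
    return dor_a / dor_b

# Summary operating points for detection of invasive melanoma alone.
in_person = diagnostic_odds_ratio(0.86, 0.91)    # reported DOR: 62.4
image_based = diagnostic_odds_ratio(0.76, 0.83)  # reported DOR: 14.8
print(f"recomputed DORs: {in_person:.1f} and {image_based:.1f}")

# Ratio of the reported DORs, for comparison with the reported RDOR of 4.21.
print(f"ratio of reported DORs: {relative_dor(62.4, 14.8):.2f}")
```

The recomputed values are close to, but not identical with, the model‐based estimates, which is the expected consequence of rounding and of estimating accuracy and threshold jointly.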

3. Target condition: any skin lesion requiring excision

In this section, we present the results for studies of visual inspection for the identification of any skin lesion requiring excision (for each study, we could only extract data for the detection of any skin cancer), according to the approach taken for diagnosis: in‐person or image‐based evaluations. Summary characteristics of studies are presented in Appendix 14 and results of meta‐analyses in Table 7 and Figure 15. Table 8 compares results in studies reporting data for invasive melanoma alone and for invasive melanoma plus atypical intraepidermal melanocytic variants.

Figure 15. Summary ROC comparing in‐person and image‐based evaluations of visual inspection for detection of any skin lesion requiring excision (any)

Seven datasets evaluated the accuracy of in‐person visual inspection for the detection of any skin lesion requiring excision (Argenziano 2006; Chang 2013; Ek 2005; McGovern 1992; Stanganelli 2000; Steiner 1987; Walter 2012). Five of these also reported data for the primary target condition (Ek 2005; McGovern 1992; Stanganelli 2000; Steiner 1987; Walter 2012). Three studies were based in primary care (Argenziano 2006; Walter 2012) or community dermatology clinics (McGovern 1992); the others were based in secondary care or specialist referral clinics. The prevalence of skin cancer ranged from 3% (Walter 2012) to 68% (Ek 2005). Studies used the ABCD algorithm (Argenziano 2006; McGovern 1992; Stanganelli 2000), the revised seven‐point checklist (Walter 2012), or no algorithm (n = 3) to assist diagnosis. Two studies supplemented a histological reference standard with clinical follow‐up (Stanganelli 2000; Walter 2012) and expert diagnosis of some benign lesions (Walter 2012).

Sensitivities ranged from 57% to 98%; specificities ranged from 13% to 99%. In meta‐analysis the DOR was 20.5 (95% CI 7.11 to 59.3; 8091 lesions and 2187 skin cancer cases). Sensitivity and specificity at the average operating point on the SROC curve were 81% (95% CI 68% to 90%) and 81% (95% CI 56% to 93%) respectively. For the in‐person evaluations that also reported data for the primary target condition (Table 8), specificity estimates were not affected in four of the five studies due to the relatively small percentage of other skin cancers in the study populations (BCCs making up 2% of all lesions in McGovern 1992; 1% in Stanganelli 2000 and Walter 2012; and 6% in Steiner 1987). Sensitivities increased in two studies because the majority of BCCs were correctly identified (Stanganelli 2000; Steiner 1987); sensitivity fell in Walter 2012 because three of four BCCs were not picked up by the revised seven‐point checklist; and sensitivity remained the same in McGovern 1992. In Ek 2005, however, we observed a large increase in sensitivity and fall in specificity, as BCCs made up 47% of the total study population and invasive SCCs comprised 20%. When these two lesion groups were considered as disease‐positive, sensitivity increased from 48% to 98% and specificity fell from 99% to 13% due to the largely correct identification of BCC and SCC as malignant and a high number of false positives in the remaining group of lesions considered disease‐negative (including large proportions with Bowen's disease, solar keratoses, or seborrhoeic keratoses).

Three datasets reported the accuracy of image‐based visual inspection for the detection of any skin lesion requiring excision (Carli 2002b; Rosendahl 2011; Stanganelli 1998a), all of which also reported data for the primary condition. All studies selected images from normal practice settings, two in secondary care (Carli 2002b; Stanganelli 1998a) and one from a primary care practice (Rosendahl 2011). The prevalence of lesions suitable for excision ranged from 22% (Rosendahl 2011) to 47% (Stanganelli 1998a); the latter selecting images for use in a dermoscopy training study. Rosendahl 2011 presented data for a single dermatologist, Carli 2002b for a consensus of two dermatologists, and Stanganelli 1998a presented the average across 20 dermatologists. None of these studies used an algorithm to assist diagnosis (n = 3) and none presented any further participant information to assist diagnosis.

Sensitivities ranged from 64% to 80%; specificities ranged from 74% to 85%. In meta‐analysis the DOR was 11.9 (95% CI 2.22 to 65.3; 547 lesions and 138 skin cancer cases). Sensitivity and specificity at the average operating point on the SROC curve were 75% (95% CI 49% to 90%) and 79% (95% CI 38% to 96%) respectively. For the three studies that also reported data for the primary target condition (Table 8), sensitivities increased in two due to correct identification of BCCs (Rosendahl 2011; Stanganelli 1998a). Specificity decreased in Carli 2002b due to small sample size and high prevalence of malignancy (20 of 53; 38%) and decreased in Rosendahl 2011 due to the use of a different threshold for the primary target condition (‘is this lesion a melanoma?’) compared to the target condition of any lesion requiring excision (‘should this lesion be excised?’).

We did not identify any significant difference in accuracy between in‐person and image‐based evaluations (RDOR 1.70; 95% CI 0.24 to 12.3; P = 0.55).

Discussion

Summary of main results

The included studies evaluated visual inspection in a range of study populations, on an in‐person basis and using clinical images, and both with and without the use of published algorithms to assist diagnosis. We observed wide variations in sensitivity and specificity for all definitions of the target condition.

There are five main findings from our review:

1) There is an almost universal problem with poor reporting in the primary studies, hindering attempts to analyse studies according to their position on the clinical pathway and to fully assess sources of heterogeneity and methodological quality.

Fewer than two thirds of in‐person evaluations of visual inspection contained enough information to describe where on the clinical pathway participants were assessed. This was particularly the case for studies apparently conducted in referred populations, where almost half of studies neither described participants as ‘referred’, nor provided any description of participants’ prior testing or pathway followed prior to presentation for specialist review. Observer experience and expertise in pigmented lesion diagnosis is likely to affect test accuracy; however, this information was rarely provided in any detail, making it difficult to assess any differences in accuracy according to clinician experience. Analyses by reported observer qualifications and descriptions of observers as ‘expert’ or ‘experienced’ showed no significant differences between groups.

In terms of methodological quality, studies were at unclear risk of bias due to poor reporting of key items around participant selection, pre‐specification of thresholds used, and timing of diagnosis in relation to reference standard diagnosis. Concern around the applicability of studies was almost universally high due to restricted inclusion of lesions and lack of reproducibility of diagnostic thresholds. Given these limitations and the heterogeneity in various aspects of the primary studies, our results cannot be considered conclusive regarding the accuracy of visual inspection for melanoma diagnosis.

2) Prior testing of participants or study position on the clinical pathway does appear to matter.

Focusing on in‐person evaluations that could be clearly positioned on the clinical pathway (Table 1), we observed the highest sensitivity (92.4%, 95% CI 26.2% to 99.8%) and lowest specificity (79.7%, 95% CI 73.7% to 84.7%) for the primary target condition of invasive melanoma or atypical intraepidermal melanocytic variants in three datasets from participants with limited prior testing; however, confidence intervals were wide and heterogeneity high, particularly for sensitivity. Data for referred participants suggest that summary sensitivities fall to around 75%, but with much higher specificities (e.g. sensitivity 76.7% (95% CI 61.7% to 87.1%) and specificity 95.7% (95% CI 89.7% to 98.3%) for lesions selected for excision, n = 8 datasets). Sensitivity was higher for equivocal lesion populations but with very wide confidence intervals (84.7%, 95% CI 55.5% to 96.1%) with summary specificity of 89.5% (95% CI 79.5% to 95.0%; 2 datasets).

The general trade‐off between sensitivity and specificity along the pathway could be due to differences in the spectrum or ‘case mix’ of included lesions, differences in the definition of a positive test result, or may be linked to variations in observer expertise. Spectrum effects can be observed when tests that are developed further down the referral pathway have lower sensitivity and higher specificity when applied in settings with participants with limited prior testing (Usher‐Smith 2016). Classic examples include the use of dipstick tests for detection of urinary tract infection (UTI) (Lachs 1992) and the D‐dimer test to detect pulmonary embolism (PE) (Ginsberg 1993). In both studies, as the prior probability of having UTI or PE increased (and so the prevalence of disease increased), test sensitivity increased (from 79% to 93% in Ginsberg 1993, and from 58% to 92% in Lachs 1992) while specificities decreased (from 76% to 45% in Ginsberg 1993 and from 77% to 42% in Lachs 1992). However, this direction of effect is not consistent across tests and diseases, as Leeflang 2013 clearly demonstrates; the mechanisms in action are often more complex than prevalence alone and can be difficult to identify.

Using disease prevalence as a proxy for disease spectrum, our classification of studies did result in a somewhat lower prevalence of disease (suggesting a wider spectrum of lesion types) in limited prior testing studies (median prevalence 5%, interquartile range (IQR) 3% to 9%) compared to referral settings (median prevalence 15%, IQR 10% to 21%), but with overlapping ranges (2% to 11%, and 1% to 41%, respectively). The lower specificity observed in limited prior testing studies is likely related to the presence of a wider range of benign lesions with similar characteristics to melanoma, leading to more referrals. Observers in primary care are also likely to have a lower threshold for considering benign lesions as possibly malignant due to the risk of missing true cases of melanoma, contributing both to higher sensitivity and a higher false‐positive rate. Referred populations, on the other hand, may have a higher proportion of equivocal or ‘difficult‐to‐diagnose’ melanomas.

In terms of eligibility criteria, the studies required varying degrees of clinical suspicion of malignancy to include lesions in limited prior‐testing populations, ranging from lesions that could not immediately be diagnosed as benign to there being a requirement for a teledermatology second opinion. In referral populations, eligibility was frequently based on lesion excision, the basis or rationale for which was not described. The restriction to lesions deemed to be suitable for excision would decrease specificity, as more obviously benign lesions would be excluded. The spectrum of lesion types in the disease‐negative groups also varied across studies, with a number of studies restricting inclusion only to those with melanocytic lesions (such that all benign lesions were benign melanocytic naevi) and others reporting high proportions of other types of skin cancers (BCC or SCC), or of benign keratotic lesions, such as seborrhoeic or actinic keratoses, or of Spitz naevi, which may be difficult to differentiate from melanoma.

3) Visual inspection alone is not sufficiently sensitive for the detection of melanoma, and there is no clear evidence that accuracy is improved by the use of any named or published algorithm to assist diagnosis in all settings.

Test sensitivity was greater than 90% (i.e. fewer than 1 in 10 melanomas missed) in only six of the 28 in‐person evaluations of the primary target condition, and confidence intervals for the pooled estimates were wide, raising the question of whether visual inspection can be relied on to rule out the presence of melanoma. Applying the sensitivity and specificity estimates for the limited prior testing studies cited above to a hypothetical cohort of 1000 lesions at disease prevalence of 4%, 9%, and 16% (see Table 1) shows that on average, visual inspection would miss 3, 7 or 12 melanomas, with 195, 185 and 171 false‐positive results (potentially leading to unnecessary excisions or lesion referral or follow‐up depending on the anticipated clinical action following a positive result). The wide confidence intervals, however, mean that the number of melanomas missed could range from 0 to 118, with false‐positives from 129 to 252. For a cohort of 1000 lesions in a referred population at prevalence of 4%, 9%, and 16% (Table 1), the pooled sensitivity of 76.7% and specificity of 95.7% translate to 9, 21, and 37 melanomas missed on average (range: 5 to 61) and 41, 39, and 36 false‐positive results (range: 14 to 99).
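The hypothetical cohort figures in the preceding paragraph follow from simple arithmetic on prevalence, sensitivity, and specificity. The sketch below reproduces the point estimates under the assumption that results are rounded to the nearest whole lesion; the function name and the rounding rule are illustrative rather than taken from the review.

```python
def expected_errors(prevalence, sensitivity, specificity, cohort=1000):
    """Return (melanomas missed, false positives) for a hypothetical cohort."""
    diseased = cohort * prevalence
    non_diseased = cohort - diseased
    missed = round(diseased * (1 - sensitivity))                # false negatives
    false_positives = round(non_diseased * (1 - specificity))
    return missed, false_positives

for prevalence in (0.04, 0.09, 0.16):
    # Summary estimates: limited prior testing, then referred participants.
    limited = expected_errors(prevalence, 0.924, 0.797)
    referred = expected_errors(prevalence, 0.767, 0.957)
    print(f"prevalence {prevalence:.0%}: limited prior testing {limited}, "
          f"referred {referred}")
```

Substituting the confidence limits for sensitivity and specificity at the lowest and highest prevalence gives the quoted ranges in the same way; for example, the lower sensitivity limit of 26.2% at 16% prevalence corresponds to 118 melanomas missed.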

The evidence to support the use of available algorithms to assist visual inspection was limited, and results are likely to be confounded by patient spectrum and observer experience. We also observed considerable variation in definitions of test positivity across studies that did not report using any algorithm, that is, where observer diagnosis was based on observers’ own interpretation of lesion characteristics. Where reported, visual inspection was considered to be positive on the basis of observers’ ‘correct diagnosis of melanoma’, ‘suspicion of malignancy’, or ‘selection for excision’, each of which is likely to result in varying proportions of test‐positive or test‐negative results for any given population.

Nevertheless, covariate investigations for the primary analysis across all study settings suggested no difference in accuracy according to the reported use of any named or published algorithm to assist diagnosis. This result was supported by limited subgroup analysis according to algorithm used. Only one eligible study directly compared the accuracy of visual inspection with and without the use of an algorithm (Collas 1999); however, the study authors developed their own new algorithm for the study and found sensitivity to be higher without the use of the algorithm. Comparing different algorithms, McGovern 1992 reported highest sensitivities from the BCD algorithm (any one characteristic present) and the original seven‐point checklist (at least two characteristics present). Current guidelines in the UK support the use of the revised seven‐point checklist in primary care (NICE 2015a). A number of studies assessing the revised seven‐point checklist algorithm did not meet the stringent inclusion criteria for our review (Healsmith 1994; Higgins 1992; Osborne 1999; Walter 2013); however, the single eligible study using the revised seven‐point checklist as part of a large randomised controlled trial reported high sensitivity (94%) when used by GPs (Walter 2012).

4) The definition of the target condition has an effect on diagnostic accuracy.

Results from studies reporting data for more than one definition of the target condition show that sensitivity in particular is affected by the inclusion of, and percentage of, melanoma in situ and BCC lesions considered disease‐positive. The direction of effect depends on observers’ ability to correctly identify these lesions as malignant. It is likely that similar effects have an impact on results observed across all included studies. Clear identification of the target condition was not provided in 11 of the 28 datasets included in our primary analyses, so that the inclusion of melanoma in situ lesions as disease‐positive was assumed on the basis that the disease‐positive group was described as ‘melanoma’ and not as ‘invasive melanoma’ or ‘malignant melanoma’. Of those studies that clearly reported including in situ lesions, the percentage of the disease‐positive group (invasive melanoma and atypical intraepidermal melanocytic variants) described as being in situ ranged from 10% to 50%. Where studies included other invasive skin cancers (mainly BCCs or SCCs) in the study population (lesions considered disease‐negative for detection of the primary target), we attempted to class any that were correctly identified by observers as malignant as ‘true negative’ results as opposed to ‘false‐positive’ (thereby increasing observed specificities), on the basis that removal of any skin cancer in the attempt to identify melanomas would not be a negative consequence of the test. Our ability to reclassify lesions relied on studies providing a disaggregation of test results according to final lesion classification and was not always possible, particularly when invasive SCCs were not separated from ‘in situ’ lesions such as Bowen’s disease.

5) There are substantial differences in diagnostic accuracy between in‐person and image‐based assessments.

Accuracy was much lower and reporting was poorer for evaluations of a diagnosis based on the interpretation of clinical images as opposed to in‐person evaluations. Other than possible differences in patient spectrum between in‐person and image‐based studies, one possible explanation for the observed difference is that even using the highest quality clinical image, a remote assessment is not equivalent to a physical, face‐to‐face, patient‐to‐clinician interaction, which will include patient history‐taking as well as a total body examination. We were unable to examine any impact from history‐taking over and above inspection of the lesion itself; however, history‐taking and, in particular, assessment and knowledge of patients’ other lesions could have a significant impact on the decision as to whether or not a patient has melanoma (Aldridge 2013; Grob 1998). Subtle differences in lesion shape and colour can be assessed in an in‐person consultation, for example, by stretching the lesion in the axis perpendicular to the skin creases, which may distort the lesion shape, and by altering the light intensity and direction used during lesion inspection. Palpation of the lesion (and regional lymph nodes) is also possible during in‐person examination. Likely variation between studies in image quality and in the time taken to review each image, together with the considerable variation in supplementary information provided to observers (ranging from no clinical information, to clinical details regarding patient age, gender or lesion site and information on lesion change over time), will have further contributed to variation in accuracy and to lower accuracy estimates in comparison to in‐person evaluations. Furthermore, the diagnostic context may have a key influence on observer decisions.
In a face‐to‐face diagnostic encounter and for the examination of lesion images for a teledermatology consultation, the clinicians concerned know that their assessment has a direct consequence on patient management and potentially on patient outcomes. The image‐based evaluations included in our primary analysis however were not conducted for teledermatology purposes, but were studies using lesion images to compare accuracy between clinical‐image diagnosis and dermoscopic‐image diagnosis, or to compare observer or algorithm performance, for example. Observers would have been aware that their assessment of the lesion image was done in an experimental setting, and would not have an impact on patients; this could potentially have affected interpretation.

Strengths and weaknesses of the review

The strengths of our review include an in‐depth and comprehensive electronic literature search, systematic review methods including double extraction of papers by both clinicians and methodologists, and contact with study authors to allow study inclusion or clarify data. In order to estimate test accuracy in different study populations, we adopted a clear analysis structure according to approach to diagnosis, the definition of the target condition, and the patient pathway. We undertook a detailed and replicable analysis of methodological quality.

In comparison to other available systematic reviews, our review extends the time period searched for eligible studies to August 2016 (from 2007 in Vestergaard 2008 and from March 2015 in Harrington 2017), and we include all eligible studies regardless of availability of a direct comparison with dermoscopic examination (as required in Vestergaard 2008) or requirement for an algorithm or clinical prediction rule to be included (Harrington 2017). Our stringent application of review inclusion criteria meant that we excluded several otherwise eligible studies. For example, we excluded those reporting accuracy data for ‘clinical diagnosis’, where dermoscopy may or may not have been used to assist diagnosis, on the basis that the contribution of visual inspection of the lesion could not be discerned.

We also excluded from our review studies evaluating eligible algorithms (that were included in Harrington 2017), due to lack of data to construct a 2x2 contingency table, the serial use of the algorithm in the context of lesion follow‐up, or use of inadequate reference standards. Without these restrictions, the observed data would likely have been considerably more heterogeneous and of poorer methodological quality. At the same time, our inclusion of all studies reporting data for visual inspection meant that we were able to make an overall assessment of observer accuracy, regardless of the use of a named algorithm. Harrington and colleagues rightly point out that lower sensitivity associated with the use of a clinical prediction rule “should not prevent [its] use unless usual decisions, made without the rule, are demonstrably better”; however, unless the accuracy of ‘usual decisions’ is examined, any benefit from the use of an algorithm cannot be established.

The main concerns for the review result from the poor reporting of primary studies, which in particular forced us to make some assumptions in order to split studies by pathway and to separate them by the different definitions of the target condition. Our inability to clearly separate studies by pathway is of real concern given the evidence for the effect on accuracy according to the spectrum or case‐mix of included participants (Lachs 1992; Leeflang 2013; Moons 1997).

Finally, observer expertise is key for any diagnostic process based on visual inspection, with both non‐analytical pattern recognition (implicit identification) and analytical pattern recognition (using more explicit ‘rules’ based on conscious analytical reasoning) employed to varying extents between clinicians, according to factors such as experience and familiarity with the diagnostic question (Norman 2009). A lack of clear reporting of observer training and experience made analysis difficult.

Applicability of findings to the review question

Varying definitions of the eligible study populations and lack of clarity regarding the patient pathway and any prior testing may restrict the extent to which our findings are applicable to the clinical setting. Varying definitions of test positivity and lack of reproducibility of diagnostic thresholds, variability in the use of published algorithms, and in observer qualifications and experience, further restrict the transferability of results to a clinical setting.

Authors' conclusions

Implications for practice.

Visual inspection is an essential, fundamental component of the assessment of a suspicious skin lesion; however, the evidence suggests that melanomas will be missed if visual inspection is used on its own. The evidence to support its accuracy in the range of settings in which it is used is both flawed and poorly reported, resulting in an inability to produce meaningful summary results and clear pointers as to where visual inspection is most useful. Overall, the use of published algorithms to assist diagnosis does not appear to improve accuracy; however, neither is there sufficient evidence to suggest that the ‘no algorithm’ approach should be preferred in all settings, for example, for training junior staff. Further investigation may lend support to the theory that expert observers are more reliant on non‐analytical pattern recognition, while attempts to assist analytical pattern recognition are of more benefit for less experienced or more generalist observers.

Implications for research.

Despite the vast volume of research that has been funded to evaluate visual inspection, further prospective evaluation of the added value of established algorithms according to the prior testing or diagnostic difficulty of lesions may be warranted. Prospective recruitment of consecutive series of participants, with systematic follow‐up of non‐excised lesions to avoid over‐reliance on a histological reference standard, would allow results to be more generalisable to routine practice. A clear identification of the level of training and experience required to achieve good results is also required. Any future research study needs to be clear about the diagnostic pathway followed by study participants prior to study enrolment, and should conform to the updated Standards for Reporting of Diagnostic Accuracy (STARD) guideline (Bossuyt 2015).

What's new

Date | Event | Description
19 December 2018 | Amended | Affiliations, Disclaimer and Sources of support updated

Acknowledgements

Members of the Cochrane Skin Cancer Diagnostic Test Accuracy Group include:

  • the full project team (Susan Bayliss, Naomi Chuchu, Clare Davenport, Jonathan Deeks, Jacqueline Dinnes, Lavinia Ferrante di Ruffano, Kathie Godfrey, Rubeta Matin, Colette O'Sullivan, Yemisi Takwoingi, Hywel Williams);

  • our 12 clinical reviewers (Rachel Abbott, Ben Aldridge, Oliver Bassett, Sue Ann Chan, Alana Durack, Monica Fawzy, Abha Gulati, Jacqui Moreau, Lopa Patel, Daniel Saleh, David Thompson, Kai Yuen Wong) and two methodologists (Lavinia Ferrante di Ruffano and Louise Johnston), who assisted with full‐text screening, data extraction and quality assessment across the entire suite of reviews of diagnosis and staging of skin cancer;

  • our expert advisor and co‐author Fiona Walter; and

  • all members of our Advisory Group (Jonathan Bowling, Seau Tak Cheung, Colin Fleming, Matthew Gardiner, Abhilash Jain, Susan O’Connell, Pat Lawton, John Lear, Mariska Leeflang, Richard Motley, Paul Nathan, Julia Newton‐Bishop, Miranda Payne, Rachael Robinson, Simon Rodwell, Julia Schofield, Neil Shroff, Hamid Tehrani, Zoe Traill, Fiona Walter, Angela Webster).

Cochrane Skin editorial base wishes to thank Michael Bigby, who was the Dermatology Editor for this review; and the clinical referees, Andrew Affleck and Chris Bower. We also wish to thank the Cochrane DTA editorial base and colleagues, as well as Denise Mitchell, who copy‐edited this review.

Appendices

Appendix 1. Current content and structure of the Programme Grant

No. | List of reviews | Number of studies
Diagnosis of melanoma
1 | Visual inspection | 49
2 | Dermoscopy +/‐ visual inspection | 104
3 | Teledermatology | 22
4 | Smartphone applications | 2
5a | Computer‐assisted diagnosis – dermoscopy‐based techniques | 42
5b | Computer‐assisted diagnosis – spectroscopy‐based techniques | Review amalgamated into 5a
6 | Reflectance confocal microscopy | 18
7 | High‐frequency ultrasound | 5
Diagnosis of keratinocyte skin cancer (BCC and cSCC)
8 | Visual inspection +/‐ Dermoscopy | 24
5c | Computer‐assisted diagnosis – dermoscopy‐based techniques | Review amalgamated into 5a
5d | Computer‐assisted diagnosis – spectroscopy‐based techniques | Review amalgamated into 5a
9 | Optical coherence tomography | 5
10 | Reflectance confocal microscopy | 10
11 | Exfoliative cytology | 9
Staging of melanoma
12 | Imaging tests (ultrasound, CT, MRI, PET‐CT) | 38
13 | Sentinel lymph node biopsy | 160
Staging of cSCC
 | Imaging tests review | Review dropped; only one study identified
13 | Sentinel lymph node biopsy | Review amalgamated into 13 above (n = 15 studies)

Appendix 2. Glossary of terms

Term Definition
Atypical intraepidermal melanocytic variant Unusual area of darker pigmentation contained within the epidermis that may progress to an invasive melanoma; includes melanoma in situ and lentigo maligna
Atypical naevi Unusual looking but noncancerous mole or area of darker pigmentation of the skin
BRAF V600 mutation BRAF is a human gene that makes a protein called B‐Raf which is involved in the control of cell growth. BRAF mutations (damaged DNA) occur in around 40% of melanomas, which can then be treated with particular drugs.
BRAF inhibitors Therapeutic agents that inhibit the serine‐threonine protein kinase BRAF; used to treat BRAF‐mutated metastatic melanoma
Breslow thickness A scale for measuring the thickness of melanomas by the pathologist using a microscope, measured in mm from the top layer of skin to the bottom of the tumour
Congenital naevi A type of mole found on infants at birth
Dermoscopy Whereby a handheld microscope is used to allow more detailed, magnified, examination of the skin compared to examination by the naked eye alone
False‐negative An individual who is truly positive for a disease, but whom a diagnostic test classifies as disease‐free
False‐positive An individual who is truly disease‐free, but whom a diagnostic test classifies as having the disease
Histopathology/histology The study of tissue, usually obtained by biopsy or excision, for example under a microscope
Incidence The number of new cases of a disease in a given time period
Index test A diagnostic test under evaluation in a primary study
Lentigo maligna Unusual area of darker pigmentation contained within the epidermis that includes malignant cells but with no invasive growth. May progress to an invasive melanoma
Lymph node Lymph nodes filter the lymphatic fluid (clear fluid containing white blood cells) that travels around the body to help fight disease; they are located throughout the body often in clusters (nodal basins)
Melanocytic naevus An area of skin with darker pigmentation (or melanocytes) also referred to as ‘moles’
Meta‐analysis A form of statistical analysis used to synthesise results from a collection of individual studies
Metastases/metastatic disease Spread of cancer away from the primary site to somewhere else through the bloodstream or the lymphatic system
Micrometastases Micrometastases are metastases so small that they can only be seen under a microscope.
Mitotic rate Microscopic evaluation of number of cells actively dividing in a tumour
Morbidity Detrimental effects on health
Mortality Either (1) the condition of being subject to death; or (2) the death rate, which reflects the number of deaths per unit of population in relation to any specific region, age group, disease, treatment or other classification, usually expressed as deaths per 100, 1000, 10,000 or 100,000 people
Multidisciplinary team A team with members from different healthcare professions and specialties (e.g. urology, oncology, pathology, radiology, and nursing). Cancer care in the National Health Service (NHS) uses this system to ensure that all relevant health professionals are engaged to discuss the best possible care for a patient.
Prevalence The proportion of a population found to have a condition
Prognostic factors/indicators Specific characteristics of a cancer or the person who has it, which might affect the patient’s prognosis
Receiver operating characteristic (ROC) plot A plot of the sensitivity against 1 minus the specificity of a test at different thresholds for test positivity; represents the diagnostic capability of a test with a range of binary test results
Receiver operating characteristic (ROC) analysis The analysis of a ROC plot of a test to select an optimal threshold for test positivity
Recurrence Recurrence is when new cancer cells are detected following treatment. This can occur either at the site of the original tumour or at other sites in the body.
Reference standard A test or combination of tests used to establish the final or ‘true’ diagnosis of a patient in an evaluation of a diagnostic test
Reflectance confocal microscopy (RCM) A microscopic technique using infrared light (either in a handheld device or a static unit) that can create images of the deeper layers of the skin
Sensitivity In this context the term is used to mean the proportion of individuals with a disease who have that disease correctly identified by the study test.
Specificity The proportion of individuals without the disease of interest (in this case with benign skin lesions) who have that absence of disease correctly identified by the study test.
Staging Clinical description of the size and spread of a patient’s tumour, fitting into internationally agreed categories
Subclinical (disease) Disease that is usually asymptomatic and not easily observable, e.g. by clinical or physical examination
Systemic treatment Treatment, usually given by mouth or by injection, that reaches and affects cancer cells throughout the body rather than targeting one specific area.
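
The glossary entries for sensitivity, specificity and the ROC plot can be tied together with a short Python sketch. It assumes a hypothetical ordinal test score (for example, the number of checklist criteria present) and invented lesion data; `roc_points` is an illustrative helper, not a method used in the review. For each threshold it reports sensitivity and the false‐positive rate (1 minus specificity), which are the two axes of a ROC plot.

```python
from typing import List, Sequence, Tuple


def roc_points(
    scores: Sequence[int],
    diseased: Sequence[bool],
    thresholds: Sequence[int],
) -> List[Tuple[int, float, float]]:
    """Return (threshold, sensitivity, false-positive rate) for each threshold.

    A lesion is test-positive when its score is >= the threshold.
    """
    points = []
    for t in thresholds:
        tp = sum(1 for s, d in zip(scores, diseased) if d and s >= t)
        fn = sum(1 for s, d in zip(scores, diseased) if d and s < t)
        fp = sum(1 for s, d in zip(scores, diseased) if not d and s >= t)
        tn = sum(1 for s, d in zip(scores, diseased) if not d and s < t)
        sensitivity = tp / (tp + fn)
        false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
        points.append((t, sensitivity, false_positive_rate))
    return points


# Invented data: ten lesions, five of them melanomas (assumption for illustration).
scores = [0, 1, 1, 2, 2, 3, 3, 4, 5, 6]
diseased = [False, False, False, False, True, False, True, True, True, True]

for threshold, sens, fpr in roc_points(scores, diseased, thresholds=[1, 3, 5]):
    print(f"threshold >= {threshold}: sensitivity {sens:.2f}, 1 - specificity {fpr:.2f}")
```

In this invented example, raising the threshold from 1 to 5 lowers sensitivity from 1.00 to 0.40 and lowers the false‐positive rate from 0.80 to 0.00, which is the trade‐off a ROC plot displays.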

Appendix 3. Table of acronyms and abbreviations used

Acronym Definition
3PCL three‐point checklist
7FFM seven features for melanoma
7PCL seven‐point checklist
ABCD(E) asymmetry, border, colour, differential structures (enlargement)
AHM amelanotic or hypomelanotic melanoma
AK actinic keratosis
AMN atypical melanocytic naevi
AUC area under the curve
BCC basal cell carcinoma
BD Bowen’s disease
BN benign naevi
BNM benign non‐melanocytic
BPC between‐person comparison (of tests)
CAD computer‐assisted diagnosis
CCS case‐control study
CD compact disc
CM cutaneous melanoma
CMM cutaneous malignant melanoma
CS case series
CSCC cutaneous squamous cell carcinoma
D‐ disease‐negative
D+ disease‐positive
DF dermatofibroma
Dx diagnosis
ELM epiluminescence microscopy
FN false‐negative
FP false‐positive
FU follow‐up
GP general practitioner
H&E haematoxylin and eosin stain
LPLK lichen planus‐like keratosis
LS lentigo simplex
MiS melanoma in situ (or lentigo maligna)
MM malignant (invasive) melanoma
MN melanocytic naevi
MSDSLA multispectral digital skin lesion analysis device
N/A not applicable
NC non comparative
NMLs non melanocytic lesions
NPV negative predictive value
NR not reported
P prospective
PCPs primary care providers
PLC pigmented lesion clinic
PPV positive predictive value
PSL pigmented skin lesion
R retrospective
RCM reflectance confocal microscopy
RCT randomised controlled trial
SCC squamous cell carcinoma
SD standard deviation
SDDI Short term sequential digital dermoscopy imaging
se sensitivity
sp specificity
SK seborrhoeic keratosis
SN Spitz naevi
SSM superficial spreading melanoma
SVS support vector system
TD teledermatology
TN true negative
TWR two‐week rule
VI visual inspection
WPC within‐person comparison (of tests)
WPC‐algs within‐person comparison (of algorithms)

Appendix 4. Content of algorithms used to assist melanoma diagnosis by visual inspection alone

ABCD (Friedman 1985; Rigel 1993; Pehamberger 1993)
ABCDE (Abbasi 2004; Benelli 1999; Benelli 2001; Carli 1994; Cristofolini 1994; Thomas 1998)
BCD (McGovern 1992)
Seven‐point checklist (Keefe 1990; MacKie 1985; MacKie 1990)
Seven‐point checklist (revised) (Healsmith 1994; MacKie 1990)
A ‐ asymmetry
  • variable centripetal growth of melanocytes (Friedman 1985)

  • “geometrical asymmetry in two axes of the tumour” (Benelli 1999; Benelli 2001; Thomas 1998)

  • “one half does not match the other half” (McGovern 1992); not separately scored in study “because we believed that asymmetry and border irregularity were linked”


B ‐ irregular borders

C ‐ colour
  • variable pigmentation, multiple colours; various hues of brown, also black, blue, red and white (Friedman 1985)

  • “pigmentation is not uniform; shades of tan, brown and black are present with dashes of red, white, or blue” (McGovern 1992)

  • “mottled‐haphazard display” (Cristofolini 1994)

  • “presence of at least two different colours within the lesion (with the exception of the usual symmetrical darkening of the lesion in its center)” (Benelli 2001; Thomas 1998)

  • “multiple colours” (Abbasi 2004)


D ‐ diameter equal to or greater than 6 mm
  • all studies agree


E ‐ evolution
  • “changes in pigmentation” (Cristofolini 1994)

  • “enlargement of the surface (and not in height) of the lesion; anamnestic criterion based on the patient’s description of the natural history of the lesion” (Thomas 1998)

  • “elevation, enlargement or change in the color of the lesion” (Benelli 1999; Benelli 2001)

  • “evolving (with respect to size, shape, shades of colour, surface features, or symptoms)” (Abbasi 2004)


McGovern 1992 describes 7 characteristics as: “increasing size, variegation, inflammation, irregular outline, greater than 1cm diameter, itch, bleeding”
These are expanded on in MacKie 1990, who describes the original (1985) criteria as:
  • sensory change, often described as a greater awareness of the lesion but also as a mild itch;

  • diameter of 1 cm or greater;

  • growth of the lesion;

  • an irregular edge;

  • irregular pigment with different shades of brown and black in the lesion;

  • inflammation (a reddish tinge within the lesion); and

  • crusting, oozing, or bleeding.

  • ≥ 3 criteria should prompt referral (MacKie 1990)

  • sensory change (greater awareness of the lesion or mild itch);

  • diameter of ≥ 1 cm;

  • growth of the lesion;

  • an irregular edge;

  • irregular pigment with different shades of brown and black in the lesion;

  • inflammation

  • crusting, oozing, or bleeding.


Presence of 3 or more suggestive of melanoma
Healsmith 1994, MacKie 1990 and Mackie 1991 describe the revised criteria as:
major signs
  • change in size

  • change in shape

  • change in colour


minor signs
  • inflammation

  • crusting or bleeding

  • sensory change

  • diameter ≥ 7 mm


“a patient with a pigmented lesion with any one of the major signs should be considered for referral and that the presence of any of the minor signs should be a further stimulus to referral.” (MacKie 1990)
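
The decision rules described in this appendix for the original and revised seven‐point checklists can be sketched in Python as follows. The criterion labels and function names are hypothetical shorthand introduced for illustration. The threshold of three criteria for the original checklist follows MacKie 1990 as quoted above; for the revised checklist, the sketch assumes that any major sign is enough to consider referral and that minor signs are only counted as a further stimulus, which is one reading of the quoted sentence and not a validated scoring scheme.

```python
from typing import Dict, Iterable

# Hypothetical labels for the criteria listed in this appendix.
ORIGINAL_CRITERIA = frozenset({
    "sensory_change", "diameter_ge_1cm", "growth", "irregular_edge",
    "irregular_pigment", "inflammation", "crusting_oozing_bleeding",
})
MAJOR_SIGNS = frozenset({"change_in_size", "change_in_shape", "change_in_colour"})
MINOR_SIGNS = frozenset({
    "inflammation", "crusting_or_bleeding", "sensory_change", "diameter_ge_7mm",
})


def original_checklist_positive(findings: Iterable[str], threshold: int = 3) -> bool:
    """Original checklist: positive when at least `threshold` criteria are present."""
    present = set(findings)
    unknown = present - ORIGINAL_CRITERIA
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    return len(present) >= threshold


def revised_checklist(findings: Iterable[str]) -> Dict[str, object]:
    """Revised checklist: any major sign prompts consideration of referral."""
    present = set(findings)
    unknown = present - (MAJOR_SIGNS | MINOR_SIGNS)
    if unknown:
        raise ValueError(f"unknown signs: {sorted(unknown)}")
    major = present & MAJOR_SIGNS
    minor = present & MINOR_SIGNS
    return {
        "consider_referral": bool(major),
        "major_count": len(major),
        "minor_count": len(minor),  # further stimulus to referral
    }


print(original_checklist_positive({"growth", "irregular_edge", "inflammation"}))
print(revised_checklist({"change_in_size", "inflammation"}))
```

The `threshold` parameter is left adjustable because studies applied different cut‐offs to the same checklist; for example, McGovern 1992 reported results for the original checklist with at least two characteristics present.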

Appendix 5. Proposed sources of heterogeneity

i. Population characteristics

  • general versus higher risk populations

  • patient population: primary/secondary/specialist unit

  • lesion suspicion: general suspicion/atypical/equivocal/NR

  • lesion type: any pigmented; melanocytic

  • inclusion of multiple lesions per participant

  • ethnicity

ii. Index test characteristics

  • the nature of and definition of criteria for test positivity

  • observer experience with the index test

  • approaches to lesion preparation (e.g. the use of oil or antiseptic gel for dermoscopy)

iii. Reference standard characteristics

  • reference standard used

  • whether histology‐reporting meets pathology‐reporting guidelines

  • use of excisional versus diagnostic biopsy

  • whether two independent dermatopathologists reviewed histological diagnosis

iv. Study quality

  • consecutive or random sample of participants recruited

  • index test interpreted blinded to the reference standard result

  • index test interpreted blinded to the result of any other index test

  • presence of partial or differential verification bias (whereby only a sample of those subject to the index test are verified by the reference test or by the same reference test with selection dependent on the index test result)

  • use of an adequate reference standard

  • overall risk of bias

Appendix 6. Final search strategies

Database: Ovid MEDLINE(R) 1946 to August week 3 2016

Search strategy:

1 exp melanoma/

2 exp skin cancer/

3 exp basal cell carcinoma/

4 basalioma$1.ti,ab.

5 ((basal cell or skin) adj2 (cancer$1 or carcinoma$1 or mass or masses or tumour$1 or tumor$1 or neoplasm$1 or adenoma$1 or epithelioma$1 or lesion$1 or malignan$ or nodule$1)).ti,ab.

6 (pigmented adj2 (lesion$1 or mole$ or nevus or naevi or naevus or naevi or skin)).ti,ab.

7 (melanom$1 or nonmelanoma$1 or non‐melanoma$1 or melanocyt$ or non‐melanocyt$ or nonmelanocyt$ or keratinocyt$).ti,ab.

8 nmsc.ti,ab.

9 (squamous cell adj2 (cancer$1 or carcinoma$1 or mass or masses or tumor$1 or tumour$1 or neoplasm$1 or adenoma$1 or epithelioma$1 or epithelial or lesion$1 or malignan$ or nodule$1) adj2 (skin or epiderm$ or cutaneous)).ti,ab.

10 (BCC or CSCC or NMSC).ti,ab.

11 keratinocy$.ti,ab.

12 Keratinocytes/

13 or/1‐12

14 dermoscop$.ti,ab.

15 dermatoscop$.ti,ab.

16 photomicrograph$.ti,ab.

17 exp epiluminescence microscopy/

18 (epiluminescence adj2 microscop$).ti,ab.

19 (confocal adj2 microscop$).ti,ab.

20 (incident light adj2 microscop$).ti,ab.

21 (surface adj2 microscop$).ti,ab.

22 (visual adj (inspect$ or examin$)).ti,ab.

23 ((clinical or physical) adj examin$).ti,ab.

24 3 point.ti,ab.

25 three point.ti,ab.

26 pattern analys$.ti,ab.

27 ABCD$.ti,ab.

28 menzies.ti,ab.

29 7 point.ti,ab.

30 seven point.ti,ab.

31 (digital adj2 (dermoscop$ or dermatoscop$)).ti,ab.

32 artificial intelligence.ti,ab.

33 AI.ti,ab.

34 computer assisted.ti,ab.

35 computer aided.ti,ab.

36 neural network$.ti,ab.

37 exp diagnosis, computer‐assisted/

38 MoleMax.ti,ab.

39 image process$.ti,ab.

40 automatic classif$.ti,ab.

41 image analysis.ti,ab.

42 SIAscop$.ti,ab.

43 Aura.ti,ab.

44 (optical adj2 scan$).ti,ab.

45 MelaFind.ti,ab.

46 SIMSYS.ti,ab.

47 MoleMate.ti,ab.

48 SolarScan.ti,ab.

49 VivaScope.ti,ab.

50 (high adj3 ultraso$).ti,ab.

51 (canine adj2 detect$).ti,ab.

52 ((mobile or cell or cellular or smart) adj ((phone$1 adj2 app$1) or application$1)).ti,ab.

53 smartphone$.ti,ab.

54 (DermoScan or SkinVision or DermLink or SpotCheck).ti,ab.

55 Mole Detective.ti,ab.

56 Spot Check.ti,ab.

57 (mole$1 adj2 map$).ti,ab.

58 (total adj2 body).ti,ab.

59 exfoliative cytolog$.ti,ab.

60 digital analys$.ti,ab.

61 (image$1 adj3 software).ti,ab.

62 (teledermatolog$ or tele‐dermatolog$ or telederm or tele‐derm or teledermoscop$ or tele‐dermoscop$ or teledermatoscop$ or tele‐dermatoscop$).ti,ab.

63 (optical coherence adj (technolog$ or tomog$)).ti,ab.

64 (computer adj2 diagnos$).ti,ab.

65 exp sentinel lymph node biopsy/

66 (sentinel adj2 node).ti,ab.

67 naevisense.mp. or HFUS.ti,ab.

68 electrical impedance spectroscopy.ti,ab.

69 history taking.ti,ab.

70 patient history.ti,ab.

71 (naked eye adj (exam$ or assess$)).ti,ab.

72 (skin adj exam$).ti,ab.

73 physical examination/

74 ugly duckling.mp. or UD.ti,ab.

75 ((physician$ or clinical or physical) adj (exam$ or triage or recog$)).ti,ab.

76 ABCDE.mp. or VOC.ti,ab.

77 clinical accuracy.ti,ab.

78 Family Practice/ or Physicians, Family/ or clinical competence/

79 (confocal adj2 microscop$).ti,ab.

80 diagnostic algorithm$1.ti,ab.

81 checklist$.ti,ab.

82 virtual imag$1.ti,ab.

83 volatile organic compound$1.ti,ab.

84 dog$1.ti,ab.

85 gene expression analy$.ti,ab.

86 reflex transmission imag$.ti,ab.

87 thermal imaging.ti,ab.

88 elastography.ti,ab.

89 or/14‐88

90 (CT or PET).ti,ab.

91 PET‐CT.ti,ab.

92 (FDG or F18 or Fluorodeoxyglucose or radiopharmaceutical$).ti,ab.

93 exp Deoxyglucose/

94 deoxy‐glucose.ti,ab.

95 deoxyglucose.ti,ab.

96 CATSCAN.ti,ab.

97 exp Tomography, Emission‐Computed/

98 exp Tomography, X‐ray computed/

99 positron emission tomograph$.ti,ab.

100 exp magnetic resonance imaging/

101 (MRI or fMRI or NMRI or scintigraph$).ti,ab.

102 exp echography/

103 Doppler echography.ti,ab.

104 sonograph$.ti,ab.

105 ultraso$.ti,ab.

106 doppler.ti,ab.

107 magnetic resonance imag$.ti,ab.

108 or/90‐107

109 (stage$ or staging or metasta$ or recurrence or sensitivity or specificity or false negative$ or thickness$).ti,ab.

110 "Sensitivity and Specificity"/

111 exp cancer staging/

112 or/109‐111

113 108 and 112

114 89 or 113

115 13 and 114
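
The way the final lines of this strategy combine the earlier sets can be sketched with Python set operations. The record identifiers below are invented and the function name is hypothetical; the sketch only mirrors the Boolean structure of lines 113 to 115, in which imaging terms are retained only when they co‐occur with staging or accuracy terms, then pooled with the other index test terms and intersected with the skin cancer terms.

```python
from typing import Set


def combine_search_sets(
    condition: Set[int],            # line 13: skin cancer terms (or/1-12)
    index_tests: Set[int],          # line 89: index test terms (or/14-88)
    imaging: Set[int],              # line 108: imaging terms (or/90-107)
    staging_or_accuracy: Set[int],  # line 112: staging/accuracy terms (or/109-111)
) -> Set[int]:
    """Mirror lines 113-115 of the strategy using set algebra."""
    imaging_filtered = imaging & staging_or_accuracy  # line 113: 108 and 112
    any_test = index_tests | imaging_filtered         # line 114: 89 or 113
    return condition & any_test                       # line 115: 13 and 114


# Invented record identifiers, for illustration only.
result = combine_search_sets(
    condition={1, 2, 3, 4},
    index_tests={2, 5},
    imaging={3, 4, 6},
    staging_or_accuracy={4, 6, 7},
)
print(sorted(result))
```

In this invented example, record 3 matches both a skin cancer term and an imaging term but is not retrieved, because it matches no staging or accuracy term and no other index test term.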

Database: Ovid MEDLINE(R) In‐Process & Other Non‐Indexed Citations 29 August, 2016

Search strategy:

1 basalioma$1.ti,ab.

2 ((basal cell or skin) adj2 (cancer$1 or carcinoma$1 or mass or masses or tumour$1 or tumor$1 or neoplasm$1 or adenoma$1 or epithelioma$1 or lesion$1 or malignan$ or nodule$1)).ti,ab.

3 (pigmented adj2 (lesion$1 or mole$ or nevus or naevi or naevus or naevi or skin)).ti,ab.

4 (melanom$1 or nonmelanoma$1 or non‐melanoma$1 or melanocyt$ or non‐melanocyt$ or nonmelanocyt$ or keratinocyt$).ti,ab.

5 nmsc.ti,ab.

6 (squamous cell adj2 (cancer$1 or carcinoma$1 or mass or masses or tumor$1 or tumour$1 or neoplasm$1 or adenoma$1 or epithelioma$1 or epithelial or lesion$1 or malignan$ or nodule$1) adj2 (skin or epiderm$ or cutaneous)).ti,ab.

7 (BCC or CSCC or NMSC).ti,ab.

8 keratinocy$.ti,ab.

9 or/1‐8

10 dermoscop$.ti,ab.

11 dermatoscop$.ti,ab.

12 photomicrograph$.ti,ab.

13 (epiluminescence adj2 microscop$).ti,ab.

14 (confocal adj2 microscop$).ti,ab.

15 (incident light adj2 microscop$).ti,ab.

16 (surface adj2 microscop$).ti,ab.

17 (visual adj (inspect$ or examin$)).ti,ab.

18 ((clinical or physical) adj examin$).ti,ab.

19 3 point.ti,ab.

20 three point.ti,ab.

21 pattern analys$.ti,ab.

22 ABCD$.ti,ab.

23 menzies.ti,ab.

24 7 point.ti,ab.

25 seven point.ti,ab.

26 (digital adj2 (dermoscop$ or dermatoscop$)).ti,ab.

27 artificial intelligence.ti,ab.

28 AI.ti,ab.

29 computer assisted.ti,ab.

30 computer aided.ti,ab.

31 neural network$.ti,ab.

32 MoleMax.ti,ab.

33 image process$.ti,ab.

34 automatic classif$.ti,ab.

35 image analysis.ti,ab.

36 SIAscop$.ti,ab.

37 Aura.ti,ab.

38 (optical adj2 scan$).ti,ab.

39 MelaFind.ti,ab.

40 SIMSYS.ti,ab.

41 MoleMate.ti,ab.

42 SolarScan.ti,ab.

43 VivaScope.ti,ab.

44 (high adj3 ultraso$).ti,ab.

45 (canine adj2 detect$).ti,ab.

46 ((mobile or cell or cellular or smart) adj ((phone$1 adj2 app$1) or application$1)).ti,ab.

47 smartphone$.ti,ab.

48 (DermoScan or SkinVision or DermLink or SpotCheck).ti,ab.

49 Mole Detective.ti,ab.

50 Spot Check.ti,ab.

51 (mole$1 adj2 map$).ti,ab.

52 (total adj2 body).ti,ab.

53 exfoliative cytolog$.ti,ab.

54 digital analys$.ti,ab.

55 (image$1 adj3 software).ti,ab.

56 (teledermatolog$ or tele‐dermatolog$ or telederm or tele‐derm or teledermoscop$ or tele‐dermoscop$ or teledermatoscop$ or tele‐dermatoscop$).ti,ab.

57 (optical coherence adj (technolog$ or tomog$)).ti,ab.

58 (computer adj2 diagnos$).ti,ab.

59 (sentinel adj2 node).ti,ab.

60 naevisense.mp. or HFUS.ti,ab.

61 electrical impedance spectroscopy.ti,ab.

62 history taking.ti,ab.

63 patient history.ti,ab.

64 (naked eye adj (exam$ or assess$)).ti,ab.

65 (skin adj exam$).ti,ab.

66 ugly duckling.mp. or UD.ti,ab.

67 ((physician$ or clinical or physical) adj (exam$ or triage or recog$)).ti,ab.

68 ABCDE.mp. or VOC.ti,ab.

69 clinical accuracy.ti,ab.

70 (Family adj (Practice or Physicians)).ti,ab.

71 (confocal adj2 microscop$).ti,ab.

72 clinical competence.ti,ab.

73 diagnostic algorithm$1.ti,ab.

74 checklist$.ti,ab.

75 virtual imag$1.ti,ab.

76 volatile organic compound$1.ti,ab.

77 dog$1.ti,ab.

78 gene expression analy$.ti,ab.

79 reflex transmission imag$.ti,ab.

80 thermal imaging.ti,ab.

81 elastography.ti,ab.

82 or/10‐81

83 (CT or PET).ti,ab.

84 PET‐CT.ti,ab.

85 (FDG or F18 or Fluorodeoxyglucose or radiopharmaceutical$).ti,ab.

86 deoxy‐glucose.ti,ab.

87 deoxyglucose.ti,ab.

88 CATSCAN.ti,ab.

89 positron emission tomograph$.ti,ab.

90 (MRI or fMRI or NMRI or scintigraph$).ti,ab.

91 Doppler echography.ti,ab.

92 sonograph$.ti,ab.

93 ultraso$.ti,ab.

94 doppler.ti,ab.

95 magnetic resonance imag$.ti,ab.

96 or/83‐95

97 (stage$ or staging or metasta$ or recurrence or sensitivity or specificity or false negative$ or thickness$).ti,ab.

98 96 and 97

99 82 or 98

100 9 and 99
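
The truncation operators used throughout these strategies can be approximated with regular expressions. The sketch below assumes the usual Ovid conventions, in which `$` allows any number of additional characters and `$1` allows at most one; `ovid_term_to_regex` is a hypothetical helper that handles only single ASCII words and does not attempt to reproduce field codes such as `.ti,ab.` or adjacency operators such as `adj2`.

```python
import re


def ovid_term_to_regex(term: str) -> re.Pattern:
    """Translate a single Ovid-style truncated term into a word-level regex.

    Assumed semantics: 'stem$' allows unlimited trailing characters and
    'stem$N' allows at most N trailing characters. Terms without '$' are
    matched as whole words.
    """
    match = re.fullmatch(r"([A-Za-z]+)\$(\d*)", term)
    if match is None:
        return re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    stem, limit = match.group(1), match.group(2)
    suffix = rf"\w{{0,{limit}}}" if limit else r"\w*"
    return re.compile(rf"\b{re.escape(stem)}{suffix}\b", re.IGNORECASE)


# Limited truncation: 'dog$1' should match 'dog' and 'dogs' but not 'dogma'.
print(bool(ovid_term_to_regex("dog$1").search("trained dogs")))
print(bool(ovid_term_to_regex("dog$1").search("dogma")))

# Unlimited truncation: 'dermoscop$' should match 'dermoscopy' and 'dermoscopic'.
print(bool(ovid_term_to_regex("dermoscop$").search("Dermoscopic features")))
```

Under these assumptions, limited truncation is what keeps a short stem from retrieving unrelated longer words that merely begin with the same letters.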

Database: Embase 1974 to 29 August 2016

Search strategy:

1 *melanoma/

2 *skin cancer/

3 *basal cell carcinoma/

4 basalioma$.ti,ab.

5 ((basal cell or skin) adj2 (cancer$1 or carcinoma$1 or mass or masses or tumour$1 or tumor$1 or neoplasm$ or adenoma$ or epithelioma$ or lesion$ or malignan$ or nodule$)).ti,ab.

6 (pigmented adj2 (lesion$1 or mole$ or nevus or naevi or naevus or naevi or skin)).ti,ab.

7 (melanom$1 or nonmelanoma$1 or non‐melanoma$1 or melanocyt$ or non‐melanocyt$ or nonmelanocyt$ or keratinocyt$).ti,ab.

8 nmsc.ti,ab.

9 (squamous cell adj2 (cancer$1 or carcinoma$1 or mass or tumor$1 or tumour$1 or neoplasm$1 or adenoma$1 or epithelioma$1 or epithelial or lesion$1 or malignan$ or nodule$1) adj2 (skin or epiderm$ or cutaneous)).ti,ab.

10 (BCC or cscc).mp. or NMSC.ti,ab.

11 keratinocyte.ti,ab.

12 keratinocy$.ti,ab.

13 or/1‐12

14 dermoscop$.ti,ab.

15 dermatoscop$.ti,ab.

16 photomicrograph$.ti,ab.

17 *epiluminescence microscopy/

18 (epiluminescence adj2 microscop$).ti,ab.

19 (confocal adj2 microscop$).ti,ab.

20 (incident light adj2 microscop$).ti,ab.

21 (surface adj2 microscop$).ti,ab.

22 (visual adj (inspect$ or examin$)).ti,ab.

23 ((clinical or physical) adj examin$).ti,ab.

24 3 point.ti,ab.

25 three point.ti,ab.

26 pattern analys$.ti,ab.

27 ABCD$.ti,ab.

28 menzies.ti,ab.

29 7 point.ti,ab.

30 seven point.ti,ab.

31 (digital adj2 (dermoscop$ or dermatoscop$)).ti,ab.

32 artificial intelligence.ti,ab.

33 AI.ti,ab.

34 computer assisted.ti,ab.

35 computer aided.ti,ab.

36 neural network$.ti,ab.

37 MoleMax.ti,ab.

38 exp diagnosis, computer‐assisted/

39 image process$.ti,ab.

40 automatic classif$.ti,ab.

41 image analysis.ti,ab.

42 SIAscop$.ti,ab.

43 (optical adj2 scan$).ti,ab.

44 Aura.ti,ab.

45 MelaFind.ti,ab.

46 SIMSYS.ti,ab.

47 MoleMate.ti,ab.

48 SolarScan.ti,ab.

49 VivaScope.ti,ab.

50 confocal microscop$.ti,ab.

51 (high adj3 ultraso$).ti,ab.

52 (canine adj2 detect$).ti,ab.

53 ((mobile or cell$ or cellular or smart) adj ((phone$1 adj2 app$1) or application$1)).ti,ab.

54 smartphone$.ti,ab.

55 (DermoScan or SkinVision or DermLink or SpotCheck).ti,ab.

56 Spot Check.ti,ab.

57 Mole Detective.ti,ab.

58 (mole$1 adj2 map$).ti,ab.

59 (total adj2 body).ti,ab.

60 exfoliative cytolog$.ti,ab.

61 digital analys$.ti,ab.

62 (image$1 adj3 software).ti,ab.

63 (optical coherence adj (technolog$ or tomog$)).ti,ab.

64 (teledermatolog$ or tele‐dermatolog$ or telederm or tele‐derm or teledermoscop$ or tele‐dermoscop$ or teledermatoscop$).mp. or tele‐dermatoscop$.ti,ab.

65 (computer adj2 diagnos$).ti,ab.

66 *sentinel lymph node biopsy/

67 (sentinel adj2 node).ti,ab.

68 naevisense.ti,ab.

69 HFUS.ti,ab.

70 electrical impedance spectroscopy.ti,ab.

71 history taking.ti,ab.

72 patient history.ti,ab.

73 (naked eye adj (exam$ or assess$)).ti,ab.

74 (skin adj exam$).ti,ab.

75 *physical examination/

76 ugly duckling.ti,ab.

77 UD sign$.ti,ab.

78 ((physician$ or clinical or physical) adj (exam$ or recog$ or triage)).ti,ab.

79 ABCDE.ti,ab.

80 clinical accuracy.ti,ab.

81 *general practice/

82 (confocal adj2 microscop$).ti,ab.

83 clinical competence/

84 diagnostic algorithm$.ti,ab.

85 checklist$1.ti,ab.

86 virtual image$1.ti,ab.

87 volatile organic compound$1.ti,ab.

88 VOC.ti,ab.

89 dog$1.ti,ab.

90 gene expression analys$.ti,ab.

91 reflex transmission imaging.ti,ab.

92 thermal imaging.ti,ab.

93 elastography.ti,ab.

94 dog$1.ti,ab.

95 gene expression analys$.ti,ab.

96 reflex transmission imaging.ti,ab.

97 thermal imaging.ti,ab.

98 elastography.ti,ab.

99 or/14‐93

100 PET‐CT.ti,ab.

101 (CT or PET).ti,ab.

102 (FDG or F18 or Fluorodeoxyglucose or radiopharmaceutical$).ti,ab.

103 exp Deoxyglucose/

104 CATSCAN.ti,ab.

105 deoxyglucose.ti,ab.

106 deoxy‐glucose.ti,ab.

107 *positron emission tomography/

108 *computer assisted tomography/

109 positron emission tomograph$.ti,ab.

110 *nuclear magnetic resonance imaging/

111 (MRI or fMRI or NMRI or scintigraph$).ti,ab.

112 *echography/

113 Doppler.ti,ab.

114 sonograph$.ti,ab.

115 ultraso$.ti,ab.

116 magnetic resonance imag$.ti,ab.

117 or/100‐116

118 (stage$ or staging or metasta$ or recurrence or sensitivity or specificity or false negative$ or thickness$).ti,ab.

119 "Sensitivity and Specificity"/

120 *cancer staging/

121 or/118‐120

122 117 and 121

123 99 or 122

124 13 and 123
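
As an illustration only, and not part of the search record, the final three lines of the Embase strategy combine earlier result sets with Boolean operators: imaging terms are restricted by staging or accuracy terms (line 122), pooled with the other index test terms (line 123), and then intersected with the target condition terms (line 124). A minimal sketch of that logic using Python sets follows; the function name, argument names, and record identifiers are hypothetical.

```python
def combine_final_sets(target_condition, index_tests, imaging, staging_terms):
    """Mirror lines 122 to 124 of the Embase strategy using set operations.

    Each argument is a set of (hypothetical) record identifiers retrieved
    by the corresponding block of search lines.
    """
    imaging_for_staging = imaging & staging_terms  # line 122: 117 and 121
    any_test = index_tests | imaging_for_staging   # line 123: 99 or 122
    return target_condition & any_test             # line 124: 13 and 123


# Toy usage with made-up record identifiers
final = combine_final_sets(
    target_condition={1, 2, 3, 4},
    index_tests={2, 5},
    imaging={3, 4, 6},
    staging_terms={4, 6, 7},
)
```

In this toy example record 3 is dropped: it matches the target condition and an imaging term but no staging or accuracy term.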

Database: Cochrane Library (Wiley) 2016 searched 30 August 2016 CDSR Issue 8 of 12 2016 CENTRAL Issue 7 of 12 2016 HTA Issue 3 of 4 July 2016 DARE Issue 3 of 4 2015

Search strategy:

#1 melanoma* or nonmelanoma* or non‐melanoma* or melanocyt* or non‐melanocyt* or nonmelanocyt* or keratinocyte*

#2 MeSH descriptor: [Melanoma] explode all trees

#3 "skin cancer*"

#4 MeSH descriptor: [Skin Neoplasms] explode all trees

#5 skin near/2 (cancer* or carcinoma* or mass or masses or tumour* or tumor* or neoplasm* or adenoma* or epithelioma* or lesion* or malignan* or nodule*)

#6 nmsc

#7 "squamous cell" near/2 (cancer* or carcinoma* or mass or masses or tumour* or tumor* or neoplasm* or adenoma* or epithelioma* or lesion* or malignan* or nodule*) near/2 (skin or epiderm* or cutaneous)

#8 "basal cell" near/2 (cancer* or carcinoma* or mass or masses or tumour* or tumor* or neoplasm* or adenoma* or epithelioma* or lesion* or malignan* or nodule*)

#9 pigmented near/2 (lesion* or nevus or mole* or naevi or naevus or naevi or skin)

#10 #1 or #2 or #3 or #4 or #5 or #6 or #7 or #8 or #9

#11 dermoscop*

#12 dermatoscop*

#13 Photomicrograph*

#14 MeSH descriptor: [Dermoscopy] explode all trees

#15 confocal near/2 microscop*

#16 epiluminescence near/2 microscop*

#17 incident next light near/2 microscop*

#18 surface near/2 microscop*

#19 "visual inspect*"

#20 "visual exam*"

#21 (clinical or physical) next (exam*)

#22 "3 point"

#23 "three point"

#24 "pattern analys*"

#25 ABDC

#26 menzies

#27 "7 point"

#28 "seven point"

#29 digital near/2 (dermoscop* or dermatoscop*)

#30 "artificial intelligence"

#31 "AI"

#32 "computer assisted"

#33 "computer aided"

#34 AI

#35 "neural network*"

#36 MoleMax

#37 "computer diagnosis"

#38 "image process*"

#39 "automatic classif*"

#40 SIAscope

#41 "image analysis"

#42 "optical near/2 scan*"

#43 Aura

#44 MelaFind

#45 SIMSYS

#46 MoleMate

#47 SolarScan

#48 Vivascope

#49 "confocal microscopy"

#50 high near/3 ultraso*

#51 canine near/2 detect*

#52 Mole* near/2 map*

#53 total near/2 body

#54 mobile* or smart near/2 phone*

#55 cell next phone*

#56 smartphone*

#57 "mitotic index"

#58 DermoScan or SkinVision or DermLink or SpotCheck

#59 "Mole Detective"

#60 "Spot Check"

#61 mole* near/2 map*

#62 total near/2 body

#63 "exfoliative cytolog*"

#64 "digital analys*"

#65 image near/3 software

#66 teledermatolog* or tele‐dermatolog* or telederm or tele‐derm or teledermoscop* or tele‐dermoscop* or teledermatoscop* or tele‐dermatolog*

#67 "optical coherence" next (technolog* or tomog*)

#68 computer near/2 diagnos*

#69 sentinel near/2 node*

#70 #11 or #12 or #13 or #14 or #15 or #16 or #17 or #18 or #19 or #20 or #21 or #22 or #23 or #24 or #25 or #26 or #27 or #28 or #29 or #30 or #31 or #32 or #33 or #34 or #35 or #36 or #37 or #38 or #39 or #40 or #41 or #42 or #43 or #44 or #45 or #46 or #47 or #48 or #49 or #50 or #51 or #52 or #53 or #54 or #55 or #56 or #57 or #58 or #59 or #60 or #61 or #62 or #63 or #64 or #65 or #66 or #67 or #68 or #69

#71 ultraso*

#72 sonograph*

#73 MeSH descriptor: [Ultrasonography] explode all trees

#74 Doppler

#75 CT or PET or PET‐CT

#76 "CAT SCAN" or "CATSCAN"

#77 MeSH descriptor: [Positron‐Emission Tomography] explode all trees

#78 MeSH descriptor: [Tomography, X‐Ray Computed] explode all trees

#79 MRI

#80 MeSH descriptor: [Magnetic Resonance Imaging] explode all trees

#81 MRI or fMRI or NMRI or scintigraph*

#82 "magnetic resonance imag*"

#83 MeSH descriptor: [Deoxyglucose] explode all trees

#84 deoxyglucose or deoxy‐glucose

#85 "positron emission tomograph*"

#86 #71 or #72 or #73 or #74 or #75 or #76 or #77 or #78 or #79 or #80 or #81 or #82 or #83 or #84 or #85

#87 stage* or staging or metasta* or recurrence or sensitivity or specificity or "false negative*" or thickness*

#88 MeSH descriptor: [Neoplasm Staging] explode all trees

#89 #87 or #88

#90 #89 and #86

#91 #70 or #90

#92 #10 and #91

#93 BCC or CSCC or NMCS

#94 keratinocy*

#95 #93 or #94

#96 #10 or #95

#97 naevisense

#98 HFUS

#99 "electrical impedance spectroscopy"

#100 "history taking"

#101 "patient history"

#102 naked next eye near/1 (exam* or assess*)

#103 skin next exam*

#104 "ugly duckling" or (UD sign*)

#105 MeSH descriptor: [Physical Examination] explode all trees

#106 (physician* or clinical or physical) near/1 (exam* or recog* or triage*)

#107 ABCDE

#108 "clinical accuracy"

#109 MeSH descriptor: [General Practice] explode all trees

#110 confocal near microscop*

#111 "diagnostic algorithm*"

#112 MeSH descriptor: [Clinical Competence] explode all trees

#113 checklist*

#114 "virtual image*"

#115 "volatile organic compound*"

#116 dog or dogs

#117 VOC

#118 "gene expression analys*"

#119 "reflex transmission imaging"

#120 "thermal imaging"

#121 elastography

#122 #97 or #98 or #99 or #100 or #101 or #102 or #103 or #104 or #105 or #106 or #107 or #108 or #109 or #110 or #111 or #112 or #113 or #114 or #115 or #116 or #117 or #118 or #119 or #120 or #121

#123 #70 or #122

#124 #96 and #123

#125 #96 and #90

#126 #125 or #124

#127 #10 and #126

Database: CINAHL Plus (EBSCO) 1937 to 30 August 2016

Search strategy:

S1 (MH "Melanoma") OR (MH "naevi and Melanomas+")

S2 (MH "Skin Neoplasms+")

S3 (MH "Carcinoma, Basal Cell+")

S4 basalioma*

S5 (basal cell) N2 (cancer* or carcinoma* or mass or masses or tumor* or tumour* or neoplasm* or adenoma* or epithelioma* or lesion* or malignan* or nodule*)

S6 (pigmented) N2 (lesion* or mole* or nevus or naevi or naevus or naevi or skin)

S7 melanom* or nonmelanoma* or non‐melanoma* or melanocyt* or non‐melanocyt* or nonmelanocyt*

S8 nmsc

S9 TX BCC or cscc or NMSC

S10 (MH "Keratinocytes")

S11 keratinocyt*

S12 S1 OR S2 OR S3 OR S4 OR S5 OR S6 OR S7 OR S8 OR S9 OR S10 OR S11

S13 dermoscop* or dermatoscop* or photomicrograph* or (3 point) or (three point) or ABCD* or menzies or (7 point) or (seven point) or AI or Molemax or SIASCOP* or Aura or MelaFind or SIMSYS or MoleMate or SolarScan or smartphone* or DermoScan or SkinVision or DermLink or SpotCheck

S14 (epiluminescence or confocal or incident or surface) N2 (microscop*)

S15 visual N1 (inspect* or examin*)

S16 (clinical or physical) N1 (examin*)

S17 pattern analys*

S18 (digital) N2 (dermoscop* or dermatoscop*)

S19 (artificial intelligence)

S20 (computer) N2 (assisted or aided)

S21 (neural network*)

S22 (MH "Diagnosis, Computer Assisted+")

S23 (image process*)

S24 (automatic classif*)

S25 (image analysis)

S26 SIAScop*

S27 (optical) N2 (scan*)

S28 (high) N3 (ultraso*)

S29 elastography

S30 (mobile or cell or cellular or smart) N2 (phone*) N2 (app or application*)

S31 (mole*) N2 (map*)

S32 total N2 body

S33 exfoliative cytolog*

S34 digital analys*

S35 image N3 software

S36 teledermatolog* or tele‐dermatolog* or telederm or tele‐derm or teledermoscop* or tele‐dermoscop* or teledermatoscop* or tele‐dermatoscop* teledermatolog* or tele‐dermatolog* or telederm or tele‐derm or teledermoscop*

S37 (optical coherence) N1 (technolog* or tomog*)

S38 computer N2 diagnos*

S39 sentinel N2 node

S40 (MH "Sentinel Lymph Node Biopsy")

S41 naevisense or HFUS or checklist* or VOC or dog*

S42 electrical impedance spectroscopy

S43 history taking

S44 "Patient history"

S45 naked eye

S46 skin exam*

S47 physical exam*

S48 ugly duckling

S49 UD sign*

S50 (physician* or clinical or physical) N1 (exam*)

S51 clinical accuracy

S52 general practice

S53 (physician* or clinical or physical) N1 (recog* or triage)

S54 confocal microscop*

S55 clinical competence

S56 diagnostic algorithm*

S57 checklist*

S58 virtual image*

S59 volatile organic compound*

S60 gene expression analys*

S61 reflex transmission imag*

S62 thermal imaging

S63 S13 or S14 or S15 OR S16 OR S17 OR S18 OR S19 OR S20 OR S21 OR S22 OR S23 OR S24 OR S25 OR S26 OR S27 OR S28 OR S29 OR S30 OR S31 OR S32 OR S33 OR S34 OR S35 OR S36 OR S37 OR S38 OR S39 OR S40 OR S41 OR S42 OR S43 OR S44 OR S45 OR S46 OR S47 OR S48 OR S49 OR S50 OR S51 OR S52 OR S53 OR S54 OR S55 OR S56 OR S57 OR S58 OR S59 OR S60 OR S61 OR S62

S64 CT or PET

S65 PET‐CT

S66 FDG or F18 or Fluorodeoxyglucose or radiopharmaceutical*

S67 (MH "Deoxyglucose+")

S68 deoxy‐glucose or deoxyglucose

S69 CATSCAN

S70 CAT‐SCAN

S71 (MH "Deoxyglucose+")

S72 (MH "Tomography, Emission‐Computed+")

S73 (MH "Tomography, X‐Ray Computed")

S74 positron emission tomograph*

S75 (MH "Magnetic Resonance Imaging+")

S76 MRI or fMRI or NMRI or scintigraph*

S77 echography

S78 doppler

S79 sonograph*

S80 ultraso*

S81 magnetic resonance imag*

S82 S64 OR S65 OR S66 OR S67 OR S68 OR S69 OR S70 OR S71 OR S72 OR S73 OR S74 OR S75 OR S76 OR S77 OR S78 OR S79 OR S80 OR S81

S83 stage* or staging or metasta* or recurrence or sensitivity or specificity or (false negative*) or thickness

S84 (MH "Neoplasm Staging")

S85 S83 OR S84

S86 S82 AND S85

S87 S63 OR S86

S88 S12 AND S87

Database: Science Citation Index SCI Expanded (Web of Science) 1900 to 30 August 2016

Conference Proceedings Citation Index (Web of Science) 1900 to 1 September 2016

Search strategy:

#1 (melanom* or nonmelanom* or non‐melanoma* or melanocyt* or non‐melanocyt* or nonmelanocyt* or keratinocyt*)

#2 (basalioma*)

#3 ((skin) near/2 (cancer* or carcinoma or mass or masses or tumour* or tumor* or neoplasm* or adenoma* or epithelioma* or lesion* or malignan* or nodule*))

#4 ((basal) near/2 (cancer* or carcinoma* or mass or masses or tumour* or tumor* or neoplasm* or adenoma* or epithelioma* or lesion* or malignan* or nodule*))

#5 ((pigmented) near/2 (lesion* or mole* or nevus or naevi or naevus or naevi or skin))

#6 (nmsc or BCC or NMSC or keratinocy*)

#7 ((squamous cell (cancer* or carcinoma* or mass or masses or tumour* or tumor* or neoplasm* or adenoma* or epithelioma* or lesion* or malignan* or nodule*))

#8 (skin or epiderm* or cutaneous)

#9 #8 AND #7

#10 #9 OR #6 OR #5 OR #4 OR #3 OR #2 OR #1

#11 ((dermoscop* or dermatoscop* or photomicrograph* or epiluminescence or confocal or "incident light" or "surface microscop*" or "visual inspect*" or "physical exam*" or 3 point or three point or pattern analy* or ABCDE or menzies or 7 point or seven point or dermoscop* or dermatoscop* or AI or artificial or computer aided or computer assisted or neural network* or Molemax or image process* or automatic classif* or image analysis or siascope or optical scan* or Aura or melafind or simsys or molemate or solarscan or vivascope or confocal microscop* or high ultraso* or canine detect* or cellphone* or mobile* or phone* or smartphone or dermoscan or skinvision or dermlink or spotcheck or spot check or mole detective or mole map* or total body or exfoliative psychology or digital or image software or optical coherence or teledermatology or telederm* or teledermoscop* or teledermatoscop* or computer diagnos* or sentinel))

#12 ((naevisense or HFUS or impedance spectroscopy or history taking or patient history or naked eye or skin exam* or physical exam* or ugly duckling or UD sign* or physician* exam* or physical exam* or ABCDE or clinical accuracy or general practice or confocal microscop* or clinical competence or diagnostic algorithm* or checklist* or virtual image* or volatile organic or VOC or dog* or gene expression or reflex transmission or thermal imag* or elastography))

#13 #11 or #12

#14 ((PET or CT or FDG or deoxyglucose or deoxy‐glucose or fluorodeoxy* or radiopharma* or CATSCAN or positron emission or computer assisted or nuclear magnetic or MRI or FMRI or NMRI or scintigraph* or echograph* or Doppler or sonograph* or ultraso* or magnetic reson*))

#15 ((stage* or staging or metast* or recurrence or sensitivity or specificity or false negative* or thickness*))

#16 #14 AND #15

#17 #16 OR #13

#18 #10 AND #17

Refined by: DOCUMENT TYPES: (MEETING ABSTRACT OR PROCEEDINGS PAPER)

Appendix 7. Full text inclusion criteria

Criterion Inclusion Exclusion
Study design
Inclusion: For diagnostic and staging reviews
  • Any study for which a 2×2 contingency table can be extracted, e.g.

    • diagnostic case‐control studies

    • 'cross‐sectional' test accuracy study with retrospective or prospective data collection

    • studies where estimation of test accuracy was not the primary objective but test results for both index and reference standard were available

    • RCTs of tests or testing strategies where participants were randomised between index tests and all undergo a reference standard (i.e. accuracy RCTs)

Exclusion:
  • < 5 melanoma cases (diagnosis reviews)

  • < 10 participants (staging reviews)

  • Studies developing new criteria for diagnosis unless a separate 'test set' of images was used to evaluate the criteria (mainly digital dermoscopy)

  • Studies using 'normal' skin as controls

  • Letters, editorials, comment papers, narrative reviews

  • Insufficient data to construct a 2×2 table

Target condition
Inclusion:
  • Melanoma

  • Keratinocyte skin cancer (or non‐melanoma skin cancer)

    • BCC or epithelioma

    • cSCC

Exclusion:
  • Studies exclusively conducted in children

  • Studies of non‐cutaneous melanoma or SCC

Population
Inclusion: For diagnostic reviews
  • Adults with a skin lesion suspicious for melanoma, BCC, or cSCC (other terms include pigmented skin lesion/naevi, melanocytic, keratinocyte, etc.)

  • Adults at high risk of developing melanoma skin cancer, BCC, or cSCC


For staging reviews
  • Adults with a diagnosis of melanoma or cSCC undergoing tests for staging of lymph nodes or distant metastases or both

Exclusion:
  • People suspected of other forms of skin cancer

  • Studies conducted exclusively in children

Index tests
Inclusion: For diagnosis
  • Visual inspection/clinical examination

  • Dermoscopy/dermatoscopy

  • Teledermoscopy

  • Smartphone/mobile phone applications

  • Digital dermoscopy/artificial intelligence

  • Confocal microscopy

  • Optical coherence tomography

  • Exfoliative cytology

  • High‐frequency ultrasound

  • Canine odour detection

  • DNA expression analysis/gene chip analysis

  • Other


For staging
  • CT

  • PET

  • PET‐CT

  • MRI

  • Ultrasound +/‐ fine needle aspiration cytology (FNAC)

  • SLNB +/‐ high‐frequency ultrasound

  • Other


Any test combination and in any order
Any test positivity threshold
Any variation in testing procedure (e.g. radioisotope used)
Exclusion:
  • Sentinel lymph node biopsy for therapeutic rather than staging purposes

  • Tests to determine melanoma thickness

  • Tests to determine surgical margins/lesion borders

  • Tests to improve histopathology diagnoses

  • LND

Reference standard
Inclusion: For diagnostic studies
  • Histopathology of the excised lesion

  • Clinical follow‐up of non‐excised/benign appearing lesions with later histopathology if suspicious

  • Expert diagnosis (studies should not be included if expert diagnosis is the sole reference standard)


For studies of imaging tests for staging
  • Histopathology (via LND or SLNB)

  • Clinical/radiological follow‐up

  • A combination of the above


For studies of SLNB accuracy for staging
  • LND of both SLN+ and SLn participants to identify all diseased nodes

  • LND of SLN+ participants and follow‐up of SLn participants to identify a subsequent nodal recurrence in a previously investigated nodal basin

Exclusion: For diagnostic studies
  • Exclude if any disease‐positive participants have diagnosis unconfirmed by histology

  • Exclude if > 50% of disease‐negative participants have diagnosis confirmed by expert opinion with no histology or follow‐up

  • Exclude studies of referral accuracy, i.e. comparing referral decision with expert diagnosis, unless evaluations of teledermatology or mobile phone applications

BCC: basal cell carcinoma; cSCC: cutaneous squamous cell carcinoma; CT: computed tomography; FNAC: fine needle aspiration cytology; LND: lymph node dissection; MRI: magnetic resonance imaging; PET: positron emission tomography; PET‐CT: positron emission tomography computed tomography; RCT: randomised controlled trial; SCC: squamous cell carcinoma; SLN+: positive sentinel lymph node; SLn: negative sentinel lymph node; SLNB: sentinel lymph node biopsy
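
The study design criteria above hinge on whether a 2×2 contingency table can be extracted and on a minimum number of melanoma cases. As a minimal sketch of what such a table holds and how it could be screened against the "< 5 melanoma cases" exclusion, the following assumes a hypothetical `TwoByTwo` class and helper function; neither name comes from the review, and the counts are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-classification of index test result against reference standard."""
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent

    @property
    def sensitivity(self) -> float:
        # Proportion of disease-positive lesions that the index test detects
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Proportion of disease-negative lesions that the index test rules out
        return self.tn / (self.tn + self.fp)


def meets_minimum_cases(table: TwoByTwo, min_cases: int = 5) -> bool:
    """Return True when the study has at least `min_cases` disease-positive cases."""
    return table.tp + table.fn >= min_cases


# Invented counts, for illustration only
example = TwoByTwo(tp=18, fp=30, fn=2, tn=150)
```

With these invented counts the study has 20 melanoma cases, so it passes the minimum case threshold for a diagnostic review.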

Appendix 8. Quality assessment (based on QUADAS‐2)

We tailored the QUADAS‐2 checklist (Whiting 2011) to the review topic as follows below.

Patient selection domain (1)

Selective recruitment of study participants can be a key influence on test accuracy. In general terms, all participants eligible to undergo a test should be included in a study, allowing for the intended use of that test within the context of the study. We considered studies that separately sampled malignant and benign lesions to have used a case‐control design; and those that supplemented a series of suspicious lesions with additional malignant or benign lesions to be at unclear risk of bias.

In terms of exclusions, we considered studies that excluded particular lesion types (e.g. lentigo maligna), particular lesion sites, or that excluded lesions on the basis of image quality or lack of observer agreement (e.g. on histopathology) to be at high risk of bias.

In judging the applicability of patient populations to the review question, we considered restriction to particular lesion populations, such as melanocytic, nodular, high risk or restrictions by size to be of high concern for applicability.

Given that diagnosis of skin cancer is primarily lesion‐based, there is the potential for study participants with multiple lesions to contribute disproportionately to estimates of test accuracy, especially if they are at particular risk of having skin cancer. We considered studies that included a high number of lesions in relation to the number of study participants to be less representative than studies conducted in a more general population (i.e. if the difference between the number of included lesions and number of included participants is greater than 5%).
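
The 5% rule above could be checked as in the sketch below. The text does not state the denominator for the percentage, so this sketch assumes the difference is expressed relative to the number of participants; the function names are hypothetical, and a difference of exactly 5% (which the text does not classify) is treated here as not meeting the criterion.

```python
def excess_lesion_fraction(n_lesions: int, n_participants: int) -> float:
    """Difference between lesions and participants, relative to participants.

    Assumption: the denominator is the number of participants; the source
    text does not specify it.
    """
    if n_participants <= 0:
        raise ValueError("n_participants must be positive")
    return (n_lesions - n_participants) / n_participants


def avoids_multiple_lesions(n_lesions: int, n_participants: int,
                            threshold: float = 0.05) -> bool:
    """Return True when the excess of lesions over participants is below 5%."""
    return excess_lesion_fraction(n_lesions, n_participants) < threshold
```

For example, a study of 104 lesions in 100 participants would meet the criterion under this assumption, whereas 120 lesions in 100 participants would not.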

Index test domain (2)

Given the potential for subjective differences in test interpretation for melanoma, the interpretation of the index test blinded to the result of the reference standard is a key means of reducing bias. For prospective studies and retrospective studies that used the original index test interpretation, the diagnosis will by nature be interpreted and recorded before the result of the reference standard is known; however, studies using previously acquired images could be particularly susceptible to information bias. For these studies to be at low risk of bias, we required a clear indication that observers were unaware of the reference standard diagnosis at time of test interpretation. We also added an item to assess the presence of blinding between interpretations of different algorithms; however, we did not include this item in the overall assessment of risk of bias.

We considered pre‐specification of the index test threshold to be present if the study clearly reported that the threshold used was not data driven, that is, was not based on study results. We considered studies that did not clearly describe the threshold used, but that required clinicians to record a diagnosis or management decision for a lesion, to be unclear on this criterion. We considered studies that reported accuracy for multiple numeric thresholds, that used ROC analysis to select the threshold, or that reported accuracy for the presence of independently significant lesion characteristics with no separate test set of lesions, to be at high risk of bias.

In terms of applicability of the index test to the review question, we required the test to be applied and interpreted as it would be in a clinical practice setting, that is, in‐person or face‐to‐face with the patient, and by a single observer as opposed to a consensus decision or average across multiple observers. We considered image‐based studies to be of high concern, although we considered reflectance confocal microscopy (RCM) image interpretations, where the observer was also supplied with a clinical or dermoscopic image of the lesion along with some patient characteristics, to be ‘unclear’.

Despite the often subjective nature of test interpretation, it is also important for study authors to outline the particular lesion characteristics that were considered to be indicative for melanoma, particularly where established algorithms or checklists were not used. We considered studies to be of low concern if the threshold used was established in a prior study or sufficient threshold details were presented to allow replication.

The experience of the examiner will also impact on the applicability of study results. We required studies to describe the test interpreter as ‘experienced’ or ‘expert’ in RCM to have low concern about applicability.

Reference standard domain (3)

In an ideal study, consecutively recruited participants should all undergo incisional or excisional biopsy of the skin lesion regardless of level of clinical suspicion of melanoma. In reality, both partial and differential verification bias are likely. Partial verification bias may occur where histology is the only reference standard used, and only those participants with a certain degree of suspicion of malignancy based on the result of the index test undergo verification, the others either being excluded from the study or defined as being disease‐negative without further assessment or follow‐up, as discussed above.

Differential verification bias will be present where other reference standards are used in addition to histological verification of suspicious lesions. A typical example of verification bias in skin cancer occurs when investigators do not biopsy people with benign‐appearing lesions but instead follow them up for a period of time to determine whether any malignancy subsequently develops (these would be false‐negatives on the index test). We defined an 'adequate' reference standard as: all disease‐positive individuals having a histological reference standard either at the time of application of the index test or after a period of clinical follow‐up; and at least 80% of disease‐negative participants have received a histological diagnosis, with up to 20% undergoing at least three months' follow‐up of benign‐appearing lesions.
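
The definition of an 'adequate' reference standard above can be read as three conditions on study counts. The sketch below is one possible encoding, assuming that every disease‐negative participant without histology must have at least three months' follow‐up; the function and parameter names are hypothetical and do not come from the review.

```python
def reference_standard_adequate(pos_total: int, pos_histology: int,
                                neg_total: int, neg_histology: int,
                                neg_followup_3m: int) -> bool:
    """Check study counts against the 'adequate' reference standard definition.

    neg_followup_3m counts disease-negative participants without histology
    who had at least three months' clinical follow-up.
    """
    counts = (pos_total, pos_histology, neg_total, neg_histology, neg_followup_3m)
    if min(counts) < 0 or pos_histology > pos_total \
            or neg_histology + neg_followup_3m > neg_total:
        raise ValueError("inconsistent counts")

    # Condition 1: every disease-positive participant has histology
    all_pos_verified = pos_histology == pos_total
    if neg_total == 0:
        return all_pos_verified

    # Condition 2: at least 80% of disease-negative participants have histology
    enough_histology = neg_histology / neg_total >= 0.80
    # Condition 3 (assumption): the remainder all had adequate follow-up
    rest_followed = neg_histology + neg_followup_3m == neg_total
    return all_pos_verified and enough_histology and rest_followed
```

Under this reading, a study with 85 of 100 disease‐negative participants verified by histology and the remaining 15 followed for at least three months would be adequate, provided all disease‐positive participants had histology.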

A further challenge is the potential for incorporation bias, that is, where the result of the index test is used to help determine the reference standard diagnosis. It is normal practice for the clinical diagnosis (usually by visual inspection or dermoscopy) to be included on pathology request forms and for the histopathologist to use this diagnosis to help with the pathology interpretation. Although inclusion of such clinical information on the histopathology request form is theoretically a form of incorporation bias, blinded interpretation of the histopathology reference standard is not normal practice, and enforcement of such conditions would significantly limit the generalisability of the study results. For studies evaluating RCM, we divided this item into two questions, firstly whether the reference standard was blinded to the index test result (RCM), and secondly whether it was blinded to the clinical diagnosis. We included only the response to the first part (i.e. blinding to RCM) in our overall assessment of risk of bias for the reference standard domain.

In judging the applicability of the reference standard to our review question, we scored studies as high concern around applicability if they used expert diagnosis (with no follow‐up) as a reference standard in any participant, or did not report histology interpretation by a dermatopathologist.

Flow and timing domain (4)

In the ideal study, the diagnosis based on the index test and reference standard should be made consecutively or as near to each other in time as possible to avoid changes in lesion over time. For lesions with a histological reference standard, we have defined a one‐month period as an appropriate interval between application of the index test and the reference standard. For studies using clinical follow‐up, we defined a minimum three‐month follow‐up period as at low risk of bias for detecting false‐negatives. We chose this interval based on a study showing that most false‐negative melanomas will be diagnosed within three months of the initial negative index test although a small number will be diagnosed up to 12 months subsequently (Altamura 2008).

In assessing whether all participants were included in the analysis, we considered studies at high risk of bias if they excluded participants following recruitment.

Comparative domain

We added a comparative domain to the QUADAS‐2 checklist for studies comparing the accuracy of RCM and dermoscopy. We included items to assess the presence of blinding of interpretation between tests, and to specify a maximum one‐month interval between application of index tests, as intervals greater than this may be accompanied by changes in tumour characteristics. As it would not be normal practice for RCM to be interpreted blinded to the clinical or dermoscopic diagnosis, the scoring of this item did not contribute to our overall assessment of risk of bias. We also considered whether both tests were applied and interpreted in a clinically applicable manner.

The following tables use text that was originally published in the QUADAS‐2 tool by Whiting and colleagues (Whiting 2011).

Item Response (delete as required)
Participant selection 1. Risk of bias
1. Was a consecutive or random sample of participants or images enrolled? Yes – if paper states consecutive or random
No – if paper describes other method of sampling
Unclear – if participant sampling not described
2. Was a case‐control design avoided? Yes – if consecutive or random or case‐control design clearly not used
No – if study described as case‐control or describes sampling specific numbers of participants with particular diagnoses
Unclear – if not described
3. Did the study avoid inappropriate exclusions, e.g.
  • 'difficult‐to‐diagnose' lesions not excluded

  • lesions not excluded on basis of disagreement between evaluators

Yes – if inappropriate exclusions were avoided
No – if lesions were excluded that might affect test accuracy, e.g. 'difficult‐to‐diagnose' lesions, or where disagreement between evaluators was observed
Unclear – if not clearly reported but there is suspicion that difficult‐to‐diagnose lesions may have been excluded
4. For between‐person comparative studies only (i.e. allocating different tests to different study participants):
  • A. were the same participant selection criteria used for those allocated to each test?

  • B. was the potential for biased allocation between tests avoided through adequate generation of a randomised sequence?

  • C. was the potential for biased allocation between tests avoided through concealment of allocation prior to assignment?

For A
  • Yes – if same selection criteria were used for each index test,

  • No – if different selection criteria were used for each index test,

  • Unclear – if selection criteria per test were not described,

  • N/A – if only 1 index test was evaluated or all participants received all tests


For B
  • Yes – if adequate randomisation procedures are described,

  • No – if inadequate randomisation procedures are described,

  • Unclear – if the method of allocation to groups is not described (a description of 'random' or 'randomised' is insufficient),

  • N/A – if only 1 index test was evaluated or all participants received all tests


For C
  • Yes – if appropriate methods of allocation concealment are described,

  • No – if inappropriate methods of allocation concealment are described,

  • Unclear – if the method of allocation concealment is not described (sufficient detail to allow a definite judgement is required),

  • N/A – if only 1 index test was evaluated

Could the selection of participants have introduced bias?
For non‐comparative and within‐person‐comparative studies
  1. If answers to all of questions 1, 2, and 3 'Yes': Risk is low

  2. If answers to any 1 of questions 1, 2, or 3 'No': Risk is high

  3. If answers to any 1 of questions 1, 2, or 3 'Unclear': Risk unclear


For between‐person comparative studies
  1. If answers to all of questions 1, 2, 3, and 4 'Yes': Risk is low

  2. If answers to any 1 of questions 1, 2, 3, or 4 'No': Risk is high

  3. If answers to any 1 of questions 1, 2, 3, or 4 'Unclear': Risk unclear
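
The mapping from signalling question responses to an overall risk of bias judgement can be sketched as a small function. This is an illustrative reading rather than the review's procedure: it assumes that a 'No' response takes precedence over an 'Unclear' response (following the order in which the rules are listed) and that 'N/A' responses are ignored. The function name is hypothetical.

```python
from typing import Iterable


def risk_of_bias(answers: Iterable[str]) -> str:
    """Derive a domain-level risk of bias from signalling question answers.

    Accepts 'Yes', 'No', 'Unclear' or 'N/A' (case-insensitive).
    Assumptions: 'No' outranks 'Unclear'; 'N/A' answers are skipped.
    """
    normalised = [a.strip().lower() for a in answers]
    unknown = set(normalised) - {"yes", "no", "unclear", "n/a"}
    if unknown:
        raise ValueError(f"unrecognised answers: {sorted(unknown)}")

    applicable = [a for a in normalised if a != "n/a"]
    if not applicable:
        raise ValueError("no applicable answers supplied")

    if "no" in applicable:
        return "high"
    if "unclear" in applicable:
        return "unclear"
    return "low"
```

For a non‐comparative study, the three responses to questions 1 to 3 would be passed in; for a between‐person comparative study, the responses to question 4 (parts A to C) would be appended.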

Participant selection 1. Concerns regarding applicability
1. Are the included participants and chosen study setting appropriate to answer the review question, i.e. are the study results generalisable?
  • This item is not asking whether exclusion of certain participant groups might bias the study's results (as in Risk of bias above), but is asking whether the chosen study participants and setting are appropriate to answer our review question. Because we are looking to establish test accuracy in both primary presentation and referred participants, a study could be appropriate for 1 setting and not for the other, or it could be unclear as to whether the study can appropriately answer either question

  • For each study assessed, please consider whether it is more relevant for A, participants with a primary presentation of a skin lesion or B, referred participants, and respond to the questions in either A or B accordingly. If the study gives insufficient details, please respond Unclear to both parts of the question

A. For studies that will contribute to the analysis of participants with a primary presentation of a skin lesion (i.e. test naive)
  • Yes – if participants included in the study appear to be generally representative of those who might present in a usual practice setting

  • No – if study participants appear to be unrepresentative of usual practice, e.g. in terms of severity of disease, demographic features, presence of differential diagnosis or comorbidity, setting of the study, and previous testing protocols

  • Unclear – if insufficient details are provided to determine the generalisability of study participants


B. For studies that will contribute to the analysis of referred participants (i.e. who have already undergone some form of testing)
  • Yes – if study participants appear to be representative of those who might be referred for further investigation. If the study focuses only on those with equivocal lesions, for example, we would suggest that this is not representative of the wider referred population

  • No – if study participants appear to be unrepresentative of usual practice, e.g. if a particularly high proportion of participants have been self‐referred or referred for cosmetic reasons. Other factors to consider include severity of disease, demographic features, presence of differential diagnosis or comorbidity, setting of the study, and previous testing protocols

  • Unclear – if insufficient details are provided to determine the generalisability of study participants

2. Did the study avoid including participants with multiple lesions?
  • Yes – if the difference between the number of included lesions and number of included participants is less than 5%

  • No – if the difference between the number of included lesions and number of included participants is greater than 5%

  • Unclear – if it is not possible to assess

Is there concern that the included participants do not match the review question?
  1. If the answer to question 1 or 2 'Yes': Concern is low

  2. If the answer to question 1 or 2 'No': Concern is high

  3. If the answer to question 1 or 2 'Unclear': Concern is unclear
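
Question 2 of this domain (avoiding participants with multiple lesions) rests on a simple numerical comparison, sketched below in Python. This is a hedged illustration, not the review's implementation. The function name is hypothetical. The sketch assumes that the "difference" is expressed relative to the number of participants and that a difference of exactly 5%, which the checklist does not cover, is scored 'No'.

```python
from typing import Optional


def avoided_multiple_lesions(n_participants: Optional[int],
                             n_lesions: Optional[int]) -> str:
    """Score question 2: 'Yes', 'No', or 'Unclear'."""
    # A count that is not reported (None) or zero cannot be assessed.
    if not n_participants or not n_lesions:
        return "Unclear"

    # Assumption: the difference is taken relative to the participant count.
    relative_difference = (n_lesions - n_participants) / n_participants

    # Assumption: exactly 5% is scored 'No' (the checklist leaves this open).
    return "Yes" if relative_difference < 0.05 else "No"
```

For example, a study reporting 197 participants with 235 lesions gives a relative difference of about 19% and would be scored 'No', whereas a study reporting 353 participants with 353 lesions would be scored 'Yes'.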

Index test 2. Risk of bias (to be completed per test evaluated)
1. Was the index test or testing strategy result interpreted without knowledge of the results of the reference standard?
  • Yes – if index test described as interpreted without knowledge of reference standard result or, for prospective studies, if index test is always conducted and interpreted prior to the reference standard

  • No – if index test described as interpreted in knowledge of reference standard result

  • Unclear – if index test blinding is not described

2. Was the diagnostic threshold at which the test was considered positive (i.e. melanoma present) prespecified?
  • Yes – if threshold was prespecified (i.e. prior to analysing study results)

  • No – if threshold was not prespecified

  • Unclear – if not possible to tell whether or not diagnostic threshold was prespecified

3. For within‐person comparisons of index tests or testing strategies (i.e. > 1 index test applied per participant), was each index test result interpreted without knowledge of the results of other index tests or testing strategies?
  • Yes – if all index tests were described as interpreted without knowledge of the results of the others

  • No – if the index tests were described as interpreted in the knowledge of the results of the others

  • Unclear – if it is not possible to tell whether knowledge of other index tests could have influenced test interpretation

  • N/A – if only 1 index test was evaluated

Could the conduct or interpretation of the index test have introduced bias?
For non‐comparative and between‐person comparison studies
  1. If answers to questions 1 and 2 'Yes': Risk is low

  2. If answers to either questions 1 or 2 'No': Risk is high

  3. If answers to either questions 1 or 2 'Unclear': Risk is unclear


For within‐person comparative studies
  1. If answers to all questions 1, 2, for any index test and 3 'Yes': Risk is low

  2. If answers to any 1 of questions 1 or 2 for any index test or 3 'No': Risk is high

  3. If answers to any 1 of questions 1 or 2 for any index test or 3 'Unclear': Risk is unclear

Index test 2. Concern about applicability
1. Was the diagnostic threshold to determine presence or absence of disease established in a previously published study?
E.g. previously evaluated/established
  • algorithm/checklist used

  • lesion characteristics indicative of melanoma used

  • objective (usually numerical) threshold used

  • Yes – if a previously evaluated/established tool to aid diagnosis of melanoma was used or if the diagnostic threshold used was established in a previously published study

  • No – if an unfamiliar/new tool to aid diagnosis of melanoma was used, if no particular algorithm was used, or if the objective threshold reported was chosen based on results in the current study

  • Unclear – if insufficient information was reported

2. Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication?
Study results can only be reproduced if the diagnostic threshold is described in sufficient detail. This item applies equally to studies using pattern recognition and those using checklists or algorithms to aid test interpretation
  • Yes – if the criteria for diagnosis of melanoma were reported in sufficient detail to allow replication

  • No – if the criteria for diagnosis of melanoma were not reported in sufficient detail to allow replication

  • Unclear – if some, but not sufficient, information on the criteria for diagnosis was provided to allow replication

3. Was the test interpretation carried out by an experienced examiner?
  • Yes – if the test was interpreted by 1 or more speciality‐accredited dermatologists, or by examiners of any clinical background with special interest in dermatology and with any formal training in the use of the test

  • No – if the test was not interpreted by an experienced examiner (see above)

  • Unclear – if the experience of the examiner(s) was not reported in sufficient detail to judge or if examiners were described as 'Expert' with no further detail given

  • N/A – if system‐based diagnosis, i.e. no observer interpretation

Is there concern that the index test, its conduct, or interpretation differ from the review question?
  1. If answers to questions 1, 2, and 3 'Yes': Concern is low

  2. If answers to questions 1, 2, or 3 'No': Concern is high

  3. If answers to questions 1, 2, or 3 'Unclear': Concern is unclear

Reference standard 3. Risk of bias
1. Is the reference standard likely to correctly classify the target condition?
A. Disease‐positive – 1 or more of the following:
  • histological confirmation of melanoma following biopsy or lesion excision

  • clinical follow‐up of benign‐appearing lesions for at least 3 months following the application of the index test, leading to a histological diagnosis of melanoma


B. Disease‐negative – 1 or more of the following:
  • histological confirmation of absence of melanoma following biopsy or lesion excision in at least 80% of disease‐negative participants

  • clinical follow‐up of benign‐appearing lesions for a minimum of 3 months following the index test in up to 20% of disease‐negative participants

A. Disease‐positive
  • Yes – if all participants with a final diagnosis of melanoma underwent 1 of the listed reference standards

  • No – if a final diagnosis of melanoma for any participant was reached without histopathology

  • Unclear – if the method of final diagnosis was not reported for any participant with a final diagnosis of melanoma or if the length of clinical follow‐up used was not clear or if a clinical follow‐up reference standard was reported in combination with a participant‐based analysis and it was not possible to determine whether the detection of a malignant lesion during follow‐up is the same lesion that originally tested negative on the index test


B. Disease‐negative
  • Yes – if at least 80% of benign diagnoses were reached by histology and up to 20% were reached by clinical follow‐up for a minimum of 3 months following the index test

  • No – if more than 20% of benign diagnoses were reached by clinical follow‐up for a minimum of 3 months following the index test or if clinical follow‐up period was less than 3 months

  • Unclear – if the method of final diagnosis was not reported for any participant with benign or non‐melanoma diagnosis

2. Were the reference standard results interpreted without knowledge of the results of the index test?
Please score this item for all studies even though histopathology interpretation is usually conducted with knowledge of the clinical diagnosis (from visual inspection or dermoscopy or both). We will deal with this by not including the response to this item in the 'Risk of bias' assessment for these tests. For reviews of all other tests, this item will be retained
  • Yes – if the reference standard diagnosis was reached blinded to the index test result

  • No – if the reference standard diagnosis was reached with knowledge of the index test result

  • Unclear – if blinded reference test interpretation was not clearly reported

Could the reference standard, its conduct, or its interpretation have introduced bias?
For visual inspection/dermoscopy evaluations
  1. If answer to question 1 'Yes': Risk is low

  2. If answer to question 1 'No': Risk is high

  3. If answer to question 1 'Unclear': Risk is unclear


For all other tests
  1. If answers to questions 1 and 2 'Yes': Risk is low

  2. If answers to questions 1 or 2 'No': Risk is high

  3. If answers to questions 1 or 2 'Unclear': Risk is unclear
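
The disease‐negative criterion in question 1B of this domain combines a proportion threshold (at least 80% of benign diagnoses by histology, up to 20% by clinical follow‐up) with a minimum follow‐up of 3 months. The Python sketch below illustrates one way to encode it. It is not taken from the review: the function and parameter names are hypothetical, and scoring 'Unclear' when the follow‐up length is missing is an assumption carried over from the disease‐positive criterion.

```python
from typing import Optional


def disease_negative_reference(n_histology: int,
                               n_follow_up: int,
                               follow_up_months: Optional[float] = None,
                               n_unreported: int = 0) -> str:
    """Score question 1B: 'Yes', 'No', or 'Unclear'."""
    # Method of final diagnosis missing for any benign participant.
    if n_unreported > 0:
        return "Unclear"

    total = n_histology + n_follow_up
    if total == 0:
        return "Unclear"  # nothing to assess

    # All benign diagnoses reached by histology.
    if n_follow_up == 0:
        return "Yes"

    # Assumption: an unreported follow-up length is scored 'Unclear'.
    if follow_up_months is None:
        return "Unclear"
    if follow_up_months < 3:
        return "No"

    # Up to 20% of benign diagnoses may rest on clinical follow-up.
    return "Yes" if n_follow_up / total <= 0.20 else "No"
```

With hypothetical counts, 90 benign diagnoses by histology and 10 by 3 months' follow‐up would be scored 'Yes', whereas 70 by histology and 30 by follow‐up would be scored 'No'.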

Reference standard 3. Concern about applicability
1. Are index test results presented separately for each component of the target condition (i.e. separate results presented for those with invasive melanoma, melanoma in situ, lentigo maligna, severe dysplasia, BCC, and cSCC)?
  • Yes – if index test results for each component of the target condition can be disaggregated

  • No – if index test results for the different components of the target condition cannot be disaggregated

  • Unclear – if not clearly reported

2. Expert opinion (with no histological confirmation) was not used as a reference standard
'Expert opinion' means diagnosis based on the standard clinical examination, with no histology or lesion follow‐up
***do not complete this item for teledermatology studies
  • Yes – if expert opinion was not used as a reference standard for any participant

  • No – if expert opinion was used as a reference standard for any participant

  • Unclear – if not clearly reported

3. Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist?
  • Yes – if histology interpretation was reported to be carried out by an experienced histopathologist or dermatopathologist

  • No – if histology interpretation was reported to be carried out by a less experienced histopathologist

  • Unclear – if the experience/qualifications of the pathologist were not reported

Is there concern that the target condition as defined by the reference standard does not match the review question?
  1. If answers to all questions 1, 2, and 3 'Yes': Concern is low

  2. If answers to any 1 of questions 1, 2, or 3 'No': Concern is high

  3. If answers to any 1 of questions 1, 2, or 3 'Unclear': Concern is unclear


***For teledermatology studies only
  1. If answers to all questions 1 and 3 'Yes': Concern is low

  2. If answers to questions 1 or 3 'No': Concern is high

  3. If answers to questions 1 or 3 'Unclear': Concern is unclear

Flow and timing 4. Risk of bias
1. Was there an appropriate interval between index test and reference standard?
A. For histopathological reference standard, was the interval between index test and reference standard ≤ 1 month?
B. If the reference standard includes clinical follow‐up of borderline/benign‐appearing lesions, was there at least 3 months' follow‐up following application of index test(s)?
A
  • Yes – if study reports ≤ 1 month between index and reference standard

  • No – if study reports > 1 month between index and reference standard

  • Unclear – if study does not report interval between index and reference standard


B
  • Yes – if study reports ≥ 3 months' follow‐up

  • No – if study reports < 3 months' follow‐up

  • Unclear – if study does not report the length of clinical follow‐up

2. Did all participants receive the same reference standard?
  • Yes – if all participants underwent the same reference standard

  • No – if more than 1 reference standard was used

  • Unclear – if not clearly reported

3. Were all participants included in the analysis?
  • Yes – if all participants were included in the analysis

  • No – if some participants were excluded from the analysis

  • Unclear – if not clearly reported

4. For within‐person comparisons of index tests, was the interval between application of index tests ≤ 1 month?
  • Yes – if study reports ≤ 1 month between index tests

  • No – if study reports > 1 month between index tests

  • Unclear – if study does not report the interval between index tests

Could the participant flow have introduced bias?
For non‐comparative and between‐person comparison studies
  1. If answers to questions 1, 2, and 3 'Yes': Risk is low

  2. If answers to any 1 of questions 1, 2, or 3 'No': Risk is high

  3. If answers to any 1 of questions 1, 2, or 3 'Unclear': Risk is unclear


For within‐person comparative studies
  1. If answers to all questions 1, 2, 3, and 4 'Yes': Risk is low

  2. If answers to any 1 of questions 1, 2, 3, or 4 'No': Risk is high

  3. If answers to any 1 of questions 1, 2, 3, or 4 'Unclear': Risk is unclear
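
The interval criteria in question 1 of this domain (≤ 1 month for a histopathological reference standard, ≥ 3 months for clinical follow‐up) can be sketched as a small Python function. This is an illustration under stated assumptions, not the review's procedure: the function name and the labels `"histology"` and `"follow_up"` are hypothetical, and intervals are assumed to be recorded in months.

```python
from typing import Optional


def interval_appropriate(reference_type: str,
                         interval_months: Optional[float]) -> str:
    """Score question 1 (parts A and B): 'Yes', 'No', or 'Unclear'."""
    # The study does not report the interval or follow-up length.
    if interval_months is None:
        return "Unclear"

    if reference_type == "histology":
        # Part A: index test to histology within 1 month.
        return "Yes" if interval_months <= 1 else "No"

    if reference_type == "follow_up":
        # Part B: at least 3 months of clinical follow-up.
        return "Yes" if interval_months >= 3 else "No"

    raise ValueError(f"Unknown reference standard type: {reference_type!r}")
```

A study using both reference standards would be scored separately on part A and part B, and the answers then combined with those to questions 2 and 3.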

BCC: basal cell carcinoma; cSCC: cutaneous squamous cell carcinoma

Appendix 9. Summary study details: in‐person evaluations

Study
Position on clinical pathway a, b
Outcomes reported
Study type
Country
Setting
Inclusion criteria Number participants/lesions Index tests (algorithm)
Diagnostic approach
Threshold Observer qualification (number)
Experience
Reference standard
Final diagnoses
Prevalence (invasive melanoma or atypical intraepidermal melanocytic variants)
Exclusions
Limited prior testing (position 2 on clinical pathway)
Grimaldi 2009
Pathway: clear
MEL
WPC
P‐CS
Italy
Primary
Cutaneous PSL requiring confirmation of diagnosis by teledermatology 197/235 VI (no algorithm)
 Dermoscopy (no algorithm)
In‐person (single)
Subjective impression ("suspicious for malignancy") GP (n = 13)
Assumed to be low (expertise NR; simple protocols for diagnosis provided for study purposes)
Histology/clinical FU (6 months)
MEL 5;
BCC 0; BN 230 (NR)
20%
None reported
Menzies 2009
Pathway: clear
MEL
Any
WPC
P‐CS
Australia
Primary
PSL that would be biopsied or referred on after routine naked eye examination NR/374 VI (no algorithm)
 Dermoscopy (no algorithm)
In‐person (single)
Subjective impression ("correct diagnosis of melanoma") GP (n = 62)
Assumed to be low (trained for study; required history of excision or referral of ≥ 10 pigmented skin lesions over the previous 12‐month period but no prior dermoscopy use)
Histology/clinical FU (3‐6 months)/expert dx
MEL 32;
BD 2; BN 323; Unknown 9
4%
6 BCC and 2 BD excluded by study authors, 43 excluded as both VI + dermoscopic diagnoses not available
Walter 2012
Pathway: clear
MM
MEL
Any
BPC
RCT
UK
Primary
Any suspicious PSL that could not immediately be diagnosed as benign 654/792 (control arm only) VI (7‐point)
Siascope (iv arm)
In‐person (single)
7PCL: ≥ 3 GP (n = 28)
Nurse practitioner (n = 2)
Low (excluded if specialist dermatology training)
Histology/clinical FU (3‐6 months)/expert dx
Control group only:
MM 16; MiS 2
BCC 4; SK 20; DF 2; lentigo 5; "benign" 686; unknown 10
6%
19 (5 due to violation of recruitment criteria or discontinued protocol; 1 died; 4 did not attend for dermatology assessment; 2 missing histology; 7 not clearly accounted for)
Limited prior testing (selected for excision) (position 3 on clinical pathway)
Collas 1999
Pathway – unclear
MEL
NC
P‐CS
France
Mixed (private/hospital)
PSL undergoing excision by dermatologists in private practice, and by hospital dermatologists 353/353 VI (1. no algorithm; 2. own new algorithm)
In‐person
1. subjective impression
2. ≥ 1 of 3 characteristics present
Dermatologist (n = NR; exp NR)
Single observer
Histology
MEL 38
BN 249; other pigmented 55
38/353; 11%
None reported
Gachon 2005
Pathway – clear
NC
P‐CS
France
Private
Melanocytic skin lesions removed for any reason NR/4036 VI (no algorithm)
In‐person; single
Subjective impression ("considered suspicious") Dermatologists (135/200)
Exp NR
Histology
MM 113; MiS 36
BN 3887
149/4036; 4%
None reported
McGovern 1992
Pathway – clear
WPC‐algs
P‐CS
USA
Community (Army Medical Center DermClinic)
PSL (> 10 mm) excised to rule out dysplasia, MiS or MM 179/237 VI (7‐point; (A)BCD)
In‐person; single
7‐point: ≥ 2, ≥ 3, ≥ 4 characteristics present
 (A)BCD: ≥ 1, ≥ 2, ≥ 3 characteristics present NR (presume dermatologist)
Exp NR
Histology
MM 6; MiS 6
BCC 4; SK 32; BN 138; AK 6; other 45
12/205; 6%
32 lesions unaccounted for; 13 excluded due to lesion size of ≤ 8 mm. 192 evaluated for ABCD and 3‐point; 205 evaluated for 7‐point
Referred for further assessment (position 4 on clinical pathway)
Barzegari 2005
Pathway – clear
MEL
WPC
NR‐CS
Iran
Secondary
PSL ≤ 15 mm diameter referred to dermatology clinic for diagnostic evaluation or cosmetic reasons 91/122 VI (no algorithm)
In‐person (consensus diagnosis of 2)
Melanoma likely/melanoma possible Mixed (n = 2; 1 attending dermatologist and 1 third‐year dermatology resident)
Histology
MM 3; MiS 3
SK 2; AK 1; BN 106; DF 7
6/122; 5%
None reported
Stanganelli 2000
Pathway – clear
MEL
Any
WPC
R‐CS
Italy
Specialist clinic
PSL referred by dermatologists and GPs either for pre‐surgical assessment or consultation NR/3372 VI (ABCD)
 Dermoscopy (no algorithm)
In‐person (single)
NR
 Subjective impression NR (assumed dermatologist: described as one of the co‐authors; n = 1) Histology/registry FU
MEL 55
BCC 43; BN 3274
55/3372; 2%
None reported
Referred for further assessment (selected for excision) (position 5 on clinical pathway)
Benelli 1999
Pathway – unclear
MEL
WPC
P‐CS
Italy
Secondary
All PSL observed and excised at the Dermatologic Surgery Department NR/401 1. VI (ABCDE)
 2. Dermoscopy (7FFM)
In‐person
1. ≥ 1 characteristic present; ≥ 2 characteristics present; ≥ 3 characteristics present; ≥ 4 characteristics present; all 5 characteristics present
 2. Score ≥ 2 Dermatologist (n = 2; exp NR)
Consensus of 2
Histology
MM 54; MiS 6
BCC 1
BN 337; LS 5; SK 1
60/401; 15%
None reported
Bono 2002a
Pathway – clear
MEL
WPC
P‐CS
Italy
Specialist clinic
PSL with a more or less important suspicion for MM on VI and/or dermoscopy 298/313 VI (no algorithm)
 Dermoscopy (no algorithm)
In‐person
VI: subjective impression
 Dermoscopy: ≥ 1 characteristic present Surgical oncologist (n = 4; high)
Single observer
Histology
MM 55; MiS 11
BCC 6; 8 SK; 3 SN; BN 230
66/313; 21%
None reported
Bono 2002b
Pathway – clear
MEL
WPC
P‐CS
Italy
Specialist clinic
PSL ≤ 6 mm requiring surgical biopsy for diagnosis based on clinical or dermoscopic suspicion of MM 157/161 VI (no algorithm)
 Dermoscopy (no algorithm)
In‐person
VI: subjective impression
 Dermoscopy: ≥ 1 characteristic present Surgical oncologist (n = 2; high)
Single observer
Histology
MM 10; MiS 3
BCC 2; SK 4; SN 5; BN 124
13/161; 8%
None reported
Bono 2006
Pathway – clear
MEL
WPC
R‐CS
Italy
Specialist clinic
PSL ≤ 3 mm undergoing excision due to a more or less important suspicion for MM on VI and/or dermoscopy 204/206 VI (no algorithm)
 Dermoscopy (Menzies)
In‐person
VI: subjective impression
 Dermoscopy: NR NR; assumed surgical oncologist as per Bono 2002a; Bono 2002b (n = 4; exp NR)
Single observer
Histology
MM 19; MiS 4
SN 3; BN 169; Other 11
23/206; 11%
None reported
Carli 2002a
Pathway – unclear
MEL
WPC
R‐CS
Italy
Secondary
Clinically equivocal and suspicious PSL subjected to excisional biopsy at the Institute of Dermatology NR/256 1. VI (no algorithm)
 2. Dermoscopy (pattern)
In‐person (dermoscopy, image‐based)
Subjective impression Dermatologist (n = 2; high exp – “extensive experience in both clinical and dermoscopic diagnosis”)
Consensus of 2
Histology
MM 40; MiS 14
BCC 5
BN 177; SN 16; SK 4
54/256; 21%
None reported
Cristofolini 1994
Pathway – unclear
MEL
WPC
P‐CS
Italy
Secondary
Patients with PSL presenting during a campaign for the early diagnosis of cutaneous melanoma at the Dermatology Department NR/220 1. VI (ABCDE)
 2. Dermoscopy (pattern)
In‐person
1. ≥ 2 characteristics present
 2. ≥ 1 characteristic present Dermatologist (n = 4; high exp: dermatologists had all been trained in the recognition of pigmented lesions)
Unclear observer interpretation
Histology
MEL 33
BCC 0
BN 181; SK 4; 2 thrombosed angioma
33/220; 15%
None reported
Cristofolini 1997
Pathway – unclear
MEL
WPC‐algs
NR‐CS
Italy
Secondary
Patients with small and flat common and atypical PSL recruited during a health campaign for the early diagnosis of melanoma; all underwent skin biopsy. 176/176 VI (ABCD)
In‐person
NR Dermatologist (n = 3; high experience)
Consensus of 3
Histology
MEL 35
BN 141
35/176; 20%
None reported
Ek 2005
Pathway – clear
MEL
Any
NC
P‐CS
Australia
Specialist clinic
Lesions excised for which malignancy could not be excluded 1223/2582 VI (no algorithm)
In‐person
Subjective impression Plastic surgeon (n = 4 or 5; mixed experience; 3 consultants, 1 plastic surgery trainee (usually 1st year, on 6‐month rotation) and a clinical assistant)
Unclear
Histology
MEL 23
BCC 1214; SCC 517; BD 188; SK 63; 577 other BN (including 330 solar keratosis)
23/2582; 1%
Incomplete or incorrectly entered proformas were excluded – 79 participants with 96 lesions
Green 1991
Pathway – clear
MEL
NC
NR‐CS
Australia
Secondary
PSL for excision 81/89 VI (no algorithm)
In‐person
Subjective impression NR (n = NR; exp NR "in the majority of cases a surgeon or a dermatologist")
Single observer
Histology
MEL 5
BCC 2; SK 7; BN 54; Other 2
5/70; 7%
19/89 lesions excluded (number of participants not reported) due to incomplete clinical and histology records.
Langley 2001
Pathway – unclear
MEL
NC
P‐CS
USA
Specialist clinic
Patients with lesions scheduled for excision at the pigmented lesion clinic to either remove atypical naevi or to rule out melanoma or for cosmetic reasons NR/38 VI (no algorithm)
In‐person
NR NR (presume dermatologist; n = NR; exp NR)
Unclear
Histology
MM 3; MiS 3
BN 32
6/38; 16%
None reported
Morales Callaghan 2008
Pathway – unclear
MEL
WPC
P‐CS
Spain
Secondary
Randomly selected melanocytic lesions; melanocytic on both clinical and dermoscopic criteria 166/200 1. VI (no algorithm)
 2. Dermoscopy (no algorithm)
In‐person
NR Dermatologist (n = 2; high exp – “experience in dermoscopy”)
Consensus of 2
Histology
MEL 6
BN 184; SN 1; Other 9
6/200; 3%
None reported
Morton 1998a (high exp), Morton 1998b (mod exp), and Morton 1998c (low exp)
Pathway – clear
MEL
NC
R‐CS
UK
Specialist clinic
Patients referred by their GP to the clinic NR/1999 VI (no algorithm)
In‐person
NR Dermatologist (n = 2; high); Dermatology senior registrar (n = 1; moderate); Dermatology registrar (n = 1; low)
Single observer per lesion
Histology
MM 104; MiS 24
BN 1871
High exp: 69/763; 9%
Moderate exp: 31/567; 5%
Low exp: 28/669; 4%
None reported
Thomas 1998
Pathway – unclear
MEL
NC
CCS
France
Secondary
All cases of melanoma and a nonselected consecutive group of "non‐melanoma" PSL NR/1140 VI (ABCDE)
In‐person
≥ 1 characteristic present
 ≥ 2 characteristics present
 ≥ 3 characteristics present
 ≥ 4 characteristics present
 all 5 characteristics present Dermatologist (n = 2; high exp: described as "trained dermatologists")
Single observer
Histology
MEL 460
BCC 8
BN 638; SN 2; Other 13
460/1140; 40%
None reported
Unlu 2014
Pathway – unclear
MEL
WPC‐algs
R‐CS
Turkey
Specialist clinic
Melanocytic lesions excised at Department of Dermatology Pigmented Lesion Clinic 115/115 1. VI (no algorithm)
 2. Dermoscopy (7‐point; 3‐point; CASH; ABCD)
In‐person
1. subjective impression
2. score ≥ 3; ≥ 2 characteristics present; score ≥ 8; score > 5.44
NR (presume dermatologist; n = 1 for VI; n = 3 for dermoscopy; Exp NR for VI)
Single observer (VI); consensus of 3 (dermoscopy)
Histology
MEL 24
BN 91
24/115; 21%
None reported
Zaumseil 1983
Pathway – unclear
MEL
NC
NR‐CS
Germany
Secondary
Skin lesions undergoing excision NR/7063 VI (no algorithm)
In‐person
Subjective impression NR (n = NR; exp NR)
Single observer
Histology
MEL 337
Not melanoma 6726 (dx listed only for FPs)
337/7063; 5%
None reported
Equivocal referred for further assessment (selected for excision) (position 5* on clinical pathway)
Dummer 1993
Pathway – clear
MEL
WPC
P‐CS
Germany
Patients with melanocytic skin lesions difficult to diagnose clinically NR/771 VI (no algorithm)
Dermoscopy (pattern)
In‐person (image‐based for dermoscopy)
NR NR (assumed dermatologist; n = 2; exp NR)
Single observer
Histology
MM 19; MiS 4
SK 4; BN 706; BN NML 32; other 6
23/771; 3%
53 non‐melanocytic lesions not included in the final analysis (no melanomas present in this group)
Soyer 1995
Pathway – clear
MEL
WPC
NR‐CS
Austria
PSL difficult to diagnose on clinical grounds alone NR/159 VI (no algorithm)
Dermoscopy (pattern)
In‐person
NR Dermatologist (n = 2; exp high; "the examination was performed by a dermatologist expert in dermoscopy")
Single observer
Histology
MM 50; MiS 15
BCC 3; SK 18; AK 4; BN 61; other 7
65/159; 41%
None reported
Steiner 1987
Pathway – unclear
MEL
P‐CS
Austria
Specialist clinic
Small (< 10 mm) diagnostically equivocal PSL; no absolute agreement on clinical diagnosis among investigating clinicians at a pigmented lesion clinic. NR/318 1. VI (no algorithm)
2. Dermoscopy (pattern)
In‐person
Subjective impression Dermatologists (n = 3; high exp: "experienced dermatologists")
Consensus diagnosis of 3 observers
Histology
MM 49; MiS 24
BCC 20
BN 143; SK 20; lentigo simplex and nevoid lentigo 19; other 15
73/318; 23%
None reported
a Positions on the clinical pathway are described in Figure 3.
b Clear or unclear position on the clinical pathway.
AHM: atypical melanocytic naevi; AK: actinic keratosis; BCC: basal cell carcinoma; BD: Bowen’s disease; BN: benign naevi; BNM: benign non‐melanocytic; BPC: between‐person comparison (of tests); CCS: case control study; CS: case series; cSCC: cutaneous squamous cell carcinoma; DF: dermatofibroma; dx: diagnosis; ELM: epiluminescence microscopy; Exp: experience; FP: false‐positive; FU: follow‐up; GP: general practitioner; LS: lentigo simplex; MEL: invasive melanoma or atypical intraepidermal melanocytic variants; MM: malignant (invasive) melanoma; MiS: melanoma in situ (or lentigo maligna); NC: noncomparative; NR: not reported; P: prospective; PLC: pigmented lesion clinic; PSL: pigmented skin lesion; R: retrospective; RCT: randomised controlled trial; SCC: squamous cell carcinoma; SK: seborrhoeic keratosis; SN: Spitz naevi; VI: visual inspection; WPC: within‐person comparison (of tests); WPC‐algs: within‐person comparison (of algorithms); 7FFM: seven features for melanoma; 7PCL: seven‐point checklist
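
The prevalence column in this table appears to report the number of lesions with the target condition over the number of lesions analysed, rounded to a whole percentage (for example, 38/353; 11%). The Python sketch below reproduces that arithmetic as a consistency check. The function name is hypothetical, and rounding to the nearest whole percentage is an assumption inferred from the reported values.

```python
def prevalence_percent(n_target: int, n_lesions: int) -> int:
    """Return prevalence as a whole-number percentage of lesions analysed."""
    if n_lesions <= 0:
        raise ValueError("Number of lesions must be positive")
    if not 0 <= n_target <= n_lesions:
        raise ValueError("Target-condition count must lie between 0 and n_lesions")
    # Note: Python's round() sends exact halves to the nearest even integer.
    return round(100 * n_target / n_lesions)
```

Applied to the counts reported above, the function returns 11 for 38/353 (Collas 1999), 4 for 149/4036 (Gachon 2005), and 40 for 460/1140 (Thomas 1998), matching the tabulated percentages.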

Appendix 10. Summary QUADAS: in‐person evaluations

  Studies clearly placed on clinical pathway Studies not clearly placed on clinical pathway
Pathway a, b Risk of bias Concerns about applicability Risk of bias Concerns about applicability
Limited prior testing (position 2 on clinical pathway)
Studies N = 3; Grimaldi 2009; Menzies 2009; Walter 2012 N = 0
Participant selection Low (3/3) High (2/3)
Unclear (1/3)
Inclusion of multiple lesions per participant (Grimaldi 2009; Walter 2012); patient numbers NR (Menzies 2009)
Index test Low (1/3)
Unclear (2/3)
Lack of clear pre‐specification of threshold (Grimaldi 2009; Menzies 2009)
Low (1/3)
High (2/3)
Lack of description of diagnostic threshold (Grimaldi 2009; Menzies 2009). Non‐expert test interpretation (Menzies 2009; Walter 2012); not clear in Grimaldi 2009
Reference standard High (3/3)
< 80% of disease‐negative participants had histological or clinical follow‐up reference standard
High (2/3)
Unclear (1/3)
Expert diagnosis as reference standard (Menzies 2009; Walter 2012); unclear histopathologist expertise (3/3)
Flow and timing High (3/3)
Mixed reference standards (3/3); participant exclusions (Menzies 2009; Walter 2012); all unclear on index to reference interval
Limited prior testing (selected for excision) (position 3 on clinical pathway)
Studies N = 2; Gachon 2005; McGovern 1992 N = 1; Collas 1999
Participant selection Low (1/2)
Unclear (1/2)
Unclear exclusion criteria (1/2; Gachon 2005).
High (2/2)
Restriction to melanocytic (1/2; Gachon 2005) or primarily excised lesions (2/2); multiple lesions per participant (1/2; McGovern 1992); number of participants NR (1/2; Gachon 2005)
Unclear (1/1)
Participant sampling not described; exclusion criteria NR
High (1/1)
Excised only included
Index test Unclear (1/2)
High (1/2)
Lack of clear pre‐specification of the threshold (1/2; Gachon 2005) or testing of multiple thresholds (1/2; McGovern 1992)
High (1/2)
Unclear (1/2)
Lack of threshold detail (1/2; Gachon 2005); unclear description of observer expertise (2/2)
Low (1/1) Unclear (1/1)
Observer expertise not described
Reference standard Low (2/2) Low (1/2)
Unclear (1/2)
Lack of description of histopathology expertise (1/2; Gachon 2005)
Low (1/1) Unclear (1/1)
Histology expertise not described (histologically analysed by different private and hospital pathologists and reviewed by one of the study authors)
Flow and timing High (1/2)
Unclear (1/2)
Participant exclusions (1/2; McGovern 1992); unclear reference interval (2/2).
Low (1/1)
Referred for further assessment (position 4 on clinical pathway)
Studies N = 2; Barzegari 2005; Stanganelli 2000 N = 0
Participant selection Low (2/2) High (2/2)
Included excisions for cosmetic reasons (1/2; Barzegari 2005), or multiple lesions per participant (2/2)
Index test Low (1/2)
Unclear (1/2)
Lack of clear pre‐specification of the threshold (Barzegari 2005)
High (1/2)
Unclear (1/2)
Consensus result (1/2; Barzegari 2005); insufficient threshold detail (1/2; Barzegari 2005); observer expertise not clear (2/2)
Reference standard Low (1/2)
High (1/2)
< 80% of disease‐negative participants had histological or clinical follow‐up reference standard (Stanganelli 2000)
Unclear (2/2)
Lack of description of histopathology expertise (2/2)
Flow and timing High (1/2)
Unclear (1/2)
Unclear reference interval (2/2); use of different reference standards (1/2; Stanganelli 2000)
Referred for further assessment (selected for excision) (position 5 on clinical pathway)
Studies N = 6; Bono 2002a; Bono 2002b; Bono 2006; Ek 2005; Green 1991; Morton 1998a; Morton 1998b; Morton 1998c b N = 9; Benelli 1999; Carli 2002b; Cristofolini 1994; Cristofolini 1997; Langley 2001; Morales Callaghan 2008; Thomas 1998; Unlu 2014; Zaumseil 1983
Participant selection Low (2/6)
High (2/6)
Unclear (2/6)
Inappropriate (2/6; Bono 2002a; Ek 2005) or unclear (2/6; Green 1991; Morton 1998a; Morton 1998b; Morton 1998c) exclusions; consecutive recruitment not reported (1/6; Green 1991)
High (6/6)
Unrepresentative (6/6) participants; all excised. Multiple lesions per participant (2/6; Ek 2005; Green 1991) or number of participants NR (Morton 1998a; Morton 1998b; Morton 1998c)
High (4/9)
Unclear (5/9)
Inappropriate exclusions (4/9) due to restriction to melanocytic only (Morales Callaghan 2008; Unlu 2014), disagreement on histology (Zaumseil 1983). Use of case‐control type design (1/9; Thomas 1998). Unclear participant sampling (6/9; Benelli 1999; Carli 2002b; Cristofolini 1994; Cristofolini 1997; Langley 2001; Zaumseil 1983).
High (9/9)
Inclusion of only excised lesions (9/9). Multiple lesions per participant (2/9; Langley 2001; Morales Callaghan 2008); number of participants not reported (6/9; Benelli 1999; Carli 2002b; Cristofolini 1994; Cristofolini 1997; Thomas 1998; Zaumseil 1983)
Index test Low (3/6)
Unclear (3/6)
Pre‐specification of threshold not reported (Ek 2005; Green 1991; Morton 1998a; Morton 1998b; Morton 1998c)
High (6/6)
Test application clinically applicable in all studies. No threshold details (6/6). Observer experience unclear (3/6; Bono 2006; Ek 2005; Green 1991).
Low (2/9)
High (2/9)
Unclear (5/9)
Threshold not prespecified (2/9; Benelli 1999; Thomas 1998) or not clear whether prespecified (Carli 2002b; Cristofolini 1997; Langley 2001; Morales Callaghan 2008; Unlu 2014).
Low (1/9)
High (7/9)
Unclear (1/9)
Test application not clinically applicable (4/9; Benelli 1999; Carli 2002b; Cristofolini 1997; Morales Callaghan 2008) or not clear (Cristofolini 1994; Langley 2001). No threshold detail (5/9; Carli 2002b; Langley 2001; Morales Callaghan 2008; Unlu 2014; Zaumseil 1983)
Reference standard Low (5/6)
High (1/6)
Inadequate reference standard (1/6; Green 1991)
Low (1/6)
High (1/6)
Unclear (4/6)
Expert diagnosis used (1/6; Green 1991). Lack of description of histopathology expertise (5/6; all except Morton 1998a; Morton 1998b; Morton 1998c)
Low (9/9) Low (2/9)
High (1/9)
Unclear (6/9)
Use of expert diagnosis (1/9; Langley 2001). Histopathology expertise not reported (7/9; Benelli 1999; Carli 2002b; Cristofolini 1994; Cristofolini 1997; Langley 2001; Morales Callaghan 2008; Zaumseil 1983)
Flow and timing High (2/6)
Unclear (4/6)
Index to reference interval not reported (5/6; Bono 2002a; Bono 2002b; Bono 2006; Green 1991; Morton 1998a; Morton 1998b; Morton 1998c). Participant exclusions due to incomplete data (2/6; Ek 2005; Green 1991)
Low (3/9)
Unclear (6/9)
Interval to reference standard not reported (6/9; Benelli 1999; Cristofolini 1994; Langley 2001; Thomas 1998; Unlu 2014; Zaumseil 1983)
Equivocal referred for further assessment (selected for excision) (position 5* on clinical pathway)
Studies N = 2; Dummer 1993; Soyer 1995 N = 1; Steiner 1987
Participant selection Unclear (2/2)
Unclear sampling methods (2/2); Unclear exclusions (1/2; Soyer 1995)
High (1/2)
Unclear (1/2)
Participants not representative (1/2; Dummer 1993) or unclear (1/2; Soyer 1995). Number of participants NR (2/2)
Unclear (1/1)
Participant sampling not described; exclusion criteria not reported
High (1/1)
Restricted to small < 10 mm pigmented skin lesions; all excised
Index test Unclear (2/2)
Pre‐specification of threshold not reported (2/2)
High (2/2)
No threshold details (2/2). Observer experience unclear (1/2; Dummer 1993)
Unclear (1/1)
Pre‐specification of threshold NR
High (1/1)
Consensus decision reported and no threshold detail
Reference standard Low (2/2) Unclear (2/2)
Lack of description of histopathology expertise (2/2)
Low (1/1) Unclear (1/1)
Histology expertise not described
Flow and timing High (1/2)
Unclear (1/2)
Participant exclusions (1/2; Dummer 1993). Index to reference interval not reported (2/2)
Low (1/1)
a positions on the clinical pathway described in Figure 3.
b The study by Morton et al is considered as a single study for quality assessment purposes but as three studies (Morton 1998a; Morton 1998b; Morton 1998c) for the analyses due to the reporting of three separate cohorts of participants.
NR: not reported

Appendix 11. Summary study details: image‐based evaluations

Study
Position on pathway a, b
 
 Outcomes reported
Study type
Country
Setting
Inclusion criteria Number of participants/lesions Index tests (algorithm)
Diagnostic approach
Threshold Observer qualification (number)
Experience
Reference standard
Final diagnoses
Prevalence (MEL)
Exclusions
Limited prior testing (with selection on reference standard) (position 3 on clinical pathway)
Bourne 2012
Pathway ‐ clear
WPC‐tests
R‐CS
Australia
Primary
All skin lesions excised to exclude skin cancer (and 3 examples of common lesions assessed as clearly benign and not biopsied) 46/50 VI (no algorithm)
 Dermoscopy (3‐point; Menzies; BLINCK (excluded))
Image‐based (blinded)
NR GP (n = 3)
Clinical nurse (n = 1)
Mixed experience “varying levels of dermatoscopic experience”
Average
Histology/clinical FU/expert dx
MM 1; MiS 8
BCC 6; SK 5; BN 11; other 19
9/45; 20%
5 non‐pigmented specimens (not further identified) in the set of 50 were excluded from dermoscopic evaluations
Rosendahl 2011
Pathway – unclear
NC
R‐CS
Australia
Primary
PSL submitted for histology from the primary care skin cancer practice of one study author 389/463 1. VI (no algorithm)
2. Dermoscopy (pattern)
1. Subjective impression
2. Both characteristics present
Dermatologist (n = 1)
Image‐based; high experience (confirmed by study author); single observer
Histology
MM 9; MiS 20
BCC 72; SCC 5
BN 217; BD 18; AK 14*; BNM 140
*considered malignant by study authors
29/463; 6%
3 poor‐quality images excluded
Referred for further assessment (position 4 on clinical pathway)
Stanganelli 2005
Pathway ‐ clear
MEL
WPC
R‐CS
Italy
Specialist clinic
Melanocytic lesions referred to Skin Cancer Unit for clinical and dermoscopic evaluation NR/477 VI (no algorithm)
 Dermoscopy (no algorithm)
Image‐based (average)
NR Dermatologist (n = 3); GP (n = 3)
Dermatologists ‐ high experience (“2 years dermoscopy experience”); experience NR for GPs, assumed low
Histology/registry FU
MEL 31
BN 103
31/134; 23%
None reported
Referred for further assessment (with selection on reference standard) (position 5 on clinical pathway)
Benelli 2001
Pathway – unclear
WPC
R‐CS
Italy
Training images
Slides of PSL selected for evaluation during a training course on dermoscopy. Lesions not located on head, palms or soles NR/49 1. VI (ABCDE)
 2. Dermoscopy (7FFM) 1. ≥ 3 & ≥ 2
 2. ≥ 2 Expert author (n = 1); dermatologists (n = 65)
Image‐based; single author ‐ high experience; Average result for dermatologist group; experience NR
Histology
MM 10, MiS 2
BCC 2
BN 25, SN 5, SK 3,
other 2 (1 missing)
12/50; 24%
None reported
Carli 2002b
Pathway – unclear
WPC
R‐CS
Italy
Secondary
Clinically suspicious or equivocal PSL undergoing excision for diagnostic purposes; all ≤ 14 mm diameter NR/57 1. VI (NR)
 2. Dermoscopy (NR) NR Dermatologists (n = 2)
Image‐based; high experience ("with experience in the field of PSL"); consensus of 2
Histology
MM 6, MiS 5
BCC 10
BN 31, SK 1; other 4
11/57; 19%
4 "not evaluables" excluded (1 MM, 3 benign)
Dolianitis 2005
Pathway – unclear
WPC
CCS
Multi‐centre
Training images
Melanocytic skin lesions selected from a collection of dermoscopic images belonging to one study author NR/40 1. VI (no algorithm)
 2. Dermoscopy (pattern analysis; Menzies criteria; 7‐point; ABCD) 1. Subjective impression
 2. Subjective impression; NR; NR; > 4.75 Dermatologists (n = 16); dermatology trainees (n = 16); GPs (n = 35)
Image‐based; mixed experience (“range of experience levels with assessment of skin lesions”); average result
Histology (n = 39); Expert diagnosis (n = 1)
MM 18, MiS 2
BN 12; SN 3; other 4
20/40; 50%
None reported; poor‐quality images exclusion criterion
Pizzichetta 2004
Pathway – unclear
WPC
R‐CS
USA/Italy
Secondary
Clinical and/or dermoscopic hypomelanotic (extent of pigmentation ≤ 30%) and amelanotic skin lesions 151/151 1. VI (no algorithm)
 2. Dermoscopy (pattern) Subjective impression NR (presume dermatologist; n = 1)
Image‐based; experience NR; single observer
Histology
AHM 34, MiS 5
BCC 25, SCC 5
BN 47, SN 5, SK 8, other 18
39/108; 36% (analysed)
23 lesions excluded due to image quality; further 43 lesions were not available for evaluation by clinical images ("mainly benign melanocytic lesions")
Stanganelli 1998a
Pathway – unclear
WPC
R‐CS
Italy
Training images
PSL images selected from computerised files of the skin cancer clinic NR/30 1. VI (no algorithm)
 2. Dermoscopy (no algorithm) NR Dermatologists (n = 20)
Image‐based; experience NR (“experience in ELM but (with) no formal training”); average
Histology
MEL 10
BCC 4
BN 10, SK 3, other 3
10/30; 33%
None reported
Winkelmann 2016
Pathway – unclear
WPC
CCS
Unclear
Training images
Selected images previously analysed by MSDSLA NR/12 1. VI (no algorithm)
 2. Dermoscopy (no algorithm) NR Dermatologists (n = 70)
Image‐based; experience NR; average
Histology
MM 3; MiS 2
BN 7
5/12; 42%
None reported
Equivocal referred for further assessment (with selection on reference standard) (position 5* on clinical pathway)
Carli 2003a
Pathway – unclear
WPC
R‐CS
Italy
Secondary
Clinically difficult to diagnose or equivocal melanocytic lesions randomly selected from image database; all melanomas < 1 mm thickness NR/200 1. VI (no algorithm)
2. Dermoscopy (own choice)
Subjective impression Dermatology registrar (n = 2); dermatologists (senior experts n = 2; practicing dermatologists n = 4)
Classed as high experience (both dermatologists and registrars “formally trained in dermoscopy”); Average result
Histology
MM 40; MiS 24
BN 136
64/200; 32%
None reported
de Giorgi 2012
Pathway – unclear
WPC
R‐CS
Italy
Secondary
Pigmented melanocytic skin lesions ≤ 6 mm diameter excised at dermatology department NR/103 VI (ABCD) 1. ≥ 2 characteristics present
 2. ≥ 3 characteristics present Dermatologists (n = 3)
High experience (“more than 5 years of practice in dermoscopy”); consensus of 3
Histology
MM 16; MiS 18
BN 69
34/103; 33%
None reported
a positions on the clinical pathway described in Figure 3.
b clear or unclear position on the clinical pathway.
AHM: amelanotic ⁄ hypomelanotic melanoma; AK: actinic keratosis; BCC: basal cell carcinoma; BD: Bowen’s disease; BLINCK: Benign Lonely irregular Nervous Change Known Clues; BN: benign naevi; BNM: benign non‐melanocytic; BPC: between‐person comparison (of tests); CCS: case‐control study; CS: case series; cSCC: cutaneous squamous cell carcinoma; DF: dermatofibroma; dx: diagnosis; ELM: epiluminescence microscopy; FU: follow‐up; GP: general practitioner; LS: lentigo simplex; MEL: invasive melanoma or atypical intraepidermal melanocytic variants; MiS: melanoma in situ (or lentigo maligna); MM: malignant (invasive) melanoma; MSDSLA: multispectral digital skin lesion analysis device; NC: non comparative; NR: not reported; P: prospective; PLC: pigmented lesion clinic; PSL: pigmented skin lesion; R: retrospective; RCT: randomised controlled trial; SCC: squamous cell carcinoma; SK: seborrhoeic keratosis; SN: Spitz naevi; VI: visual inspection; WPC: within person comparison (of tests); 7FFM: seven features for melanoma; 7PCL: seven‐point checklist
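
The "Prevalence (MEL)" entries in the table above appear to be the number of MEL lesions (MM plus MiS) divided by the number of lesions analysed, shown as a whole-number percentage. The short Python sketch below reproduces three of the reported values from counts copied out of the table. The function name `prevalence` and the rounding to the nearest whole percent are assumptions made for illustration and are not taken from the review's methods.

```python
def prevalence(cases: int, lesions: int) -> int:
    """Return disease prevalence as a whole-number percentage.

    Rounding to the nearest whole percent is an assumption that
    happens to match the rows used below.
    """
    if lesions <= 0:
        raise ValueError("lesions must be a positive count")
    return round(100 * cases / lesions)


# (MEL cases, lesions analysed), copied from the summary table above.
studies = {
    "Bourne 2012": (1 + 8, 45),       # MM 1 + MiS 8
    "Rosendahl 2011": (9 + 20, 463),  # MM 9 + MiS 20
    "Carli 2003a": (40 + 24, 200),    # MM 40 + MiS 24
}

for name, (cases, lesions) in studies.items():
    print(f"{name}: {cases}/{lesions}; {prevalence(cases, lesions)}%")
```

The denominator is the number of lesions analysed after exclusions, which is why Bourne 2012 uses 45 rather than the 50 lesions originally assembled.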

Appendix 12. Summary QUADAS: image‐based evaluations

  Studies clearly placed on clinical pathway Studies not clearly placed on clinical pathway
Pathway a Risk of bias Concerns about applicability Risk of bias Concerns about applicability
Limited prior testing (with selection on reference standard) (position 3 on clinical pathway)
Studies N = 1; Bourne 2012 N = 1; Rosendahl 2011
Participant selection Unclear (1/1)
Unclear exclusion criteria (Bourne 2012)
High (1/1)
Restriction to primarily excised lesions (1/1)
Low (1/1) High (1/1)
Includes excised lesions only; multiple lesions per participant
Index test Unclear (1/1)
Lack of clear pre‐specification of the threshold (Bourne 2012)
High (1/1)
Blinded image interpretation and average observer result presented (Bourne 2012); lack of threshold detail (Bourne 2012); unclear description of observer expertise
Unclear (1/1)
No clear pre‐specification of threshold
High (1/1)
Image‐based study; no threshold detail
Reference standard Low (1/1) High (1/1)
Use of expert diagnosis as reference (Bourne 2012); lack of description of histopathology expertise (Bourne 2012)
Low (1/1) Unclear (1/1)
Histopathology experience NR
Flow and timing High (1/1)
Use of different reference standards (Bourne 2012); participant exclusions (Bourne 2012)
High (1/1)
Exclusions on image quality; unclear interval between index and reference
Referred for further assessment (position 4 on clinical pathway)
Studies N = 1; Stanganelli 2005 N = 0
Participant selection Unclear (1/1)
Unclear participant sampling across all items (Stanganelli 2005)
High (1/1)
Sample restricted to melanocytic lesions (Stanganelli 2005). Patient numbers NR
Index test Unclear (1/1)
Lack of clear pre‐specification of the threshold (Stanganelli 2005)
High (1/1)
Average result presented (Stanganelli 2005); insufficient threshold detail (Stanganelli 2005)
Reference standard Low (1/1) Unclear (1/1). Unclear use of expert diagnosis as reference standard (Stanganelli 2005). Unclear histopathology expertise
Flow and timing High (1/1)
Use of different reference standards (Stanganelli 2005); unclear reference interval
Referred for further assessment (with selection on reference standard) (position 5 on clinical pathway)
Studies N = 0 N = 6; Benelli 2001; Carli 2002b; Dolianitis 2005; Pizzichetta 2004; Stanganelli 1998a; Winkelmann 2016
Participant selection High (3/6)
Unclear (3/6)
Case‐control type design used (3/6; Dolianitis 2005; Stanganelli 1998a; Winkelmann 2016) or unclear design (Benelli 2001; Pizzichetta 2004). Unclear participant sampling (5/6; Benelli 2001; Carli 2002b; Pizzichetta 2004; Stanganelli 1998a; Winkelmann 2016), design unclear (1/6), exclusion criteria not clearly reported (5/6; Benelli 2001; Carli 2002b; Dolianitis 2005; Stanganelli 1998a; Winkelmann 2016)
High (6/6)
Excised only included (6/6), amelanotic/hypomelanotic lesions only (1/6; Pizzichetta 2004). Number of participants NR (5/6; Benelli 2001; Carli 2002b; Dolianitis 2005; Stanganelli 1998a; Winkelmann 2016)
Index test Low (1/6)
Unclear (5/6)
No clear pre‐specification of threshold (5/6; Carli 2002b; Dolianitis 2005; Pizzichetta 2004; Stanganelli 1998a; Winkelmann 2016)
High (6/6)
Image‐based evaluations (6/6), blinded to all other information (5/6; Benelli 2001; Carli 2002b; Dolianitis 2005; Stanganelli 1998a; Winkelmann 2016), with consensus (1/6; Carli 2002b) or average result (4/6; Benelli 2001; Dolianitis 2005; Stanganelli 1998a; Winkelmann 2016) reported. Threshold not clearly specified (5/6; Carli 2002b; Dolianitis 2005; Pizzichetta 2004; Stanganelli 1998a; Winkelmann 2016). Observer expertise NR (4/6; Dolianitis 2005; Pizzichetta 2004; Stanganelli 1998a; Winkelmann 2016)
Reference standard Low (6/6) High (1/6)
Unclear (5/6)
Use of expert observer diagnosis (1/6; Dolianitis 2005); expertise of histopathologist not described (6/6)
Flow and timing Low (1/6)
High (2/6)
Unclear (3/6)
Lesions excluded from analysis (reason NR) (2/6; Dolianitis 2005; Pizzichetta 2004); different reference standards used (1/6; Dolianitis 2005). Index to reference interval NR (5/6; Benelli 2001, Dolianitis 2005, Pizzichetta 2004, Stanganelli 1998a, Winkelmann 2016).
Equivocal referred for further assessment (with selection on reference standard) (position 5* on clinical pathway)
Studies N = 0 N = 2; Carli 2003a; de Giorgi 2012
Participant selection High (2/2)
Exclusion of difficult to diagnose, including peculiar lesions (1/2; Carli 2003a), histology disagreement (1/2; de Giorgi 2012)
High (2/2)
Restriction to melanocytic only (2/2), excised only (2/2). Patient numbers NR (2/2)
Index test High (1/2)
Unclear (1/2)
Multiple thresholds tested (1/2; de Giorgi 2012); no clear threshold specification (1/2; Carli 2003a)
High (2/2)
Image‐based evaluations (2/2), blinded to all other information (1/2; Carli 2003a), with consensus (1/2; de Giorgi 2012) or average result (1/2; Carli 2003a) reported. Threshold not described (1/2; Carli 2003a)
Reference standard Low (2/2) Low (2/2)
Flow and timing Unclear (2/2)
Index to reference interval NR (2/2)
a positions on the clinical pathway described in Figure 3.
NR: not reported

Appendix 13. Summary study details: detection of invasive melanoma alone

Study author
 
 Outcomes reported Study type
Country
Setting
Inclusion criteria Number of participants/lesions Index tests (algorithm)
Diagnostic approach
Threshold Observer qualifications (number)
Experience
Reference standard
Final diagnoses
Prevalence (MEL)
Exclusions
In‐person
Bono 1996 WPC‐tests
Unclear
Italy
Specialist clinic
Pigmented skin lesions at the Instituto Nazionale Tumori of Milan 45/54 VI (no algorithm)
Single observer
Subjective impression Plastic surgeon Histology plus other (31% of benign had expert dx)
MM: 18
BN: 25
18/43; 42%
Only 43 lesions had complete clinical and histological information. 11 lesions not surgically removed had only clinical diagnosis (benign) and were not included in the final accuracy analysis
Green 1994 NC
NR‐CS
Australia
Secondary
Pigmented lesions for excision 129/164 VI (no algorithm)
Single observer
Subjective impression; clinical dx recorded NR Histology
MM 18; MiS 3
BN 128; misc pigmented lesions including SK, BCC, lentigines 15
18/164; 11%
Kopf 1975 NC
R‐CS
USA
Specialist clinic
All lesions subject to biopsy at the Oncology Section of the Skin and Cancer Unit NR/5538 VI (no algorithm)
Single observer
No details; "clinical diagnosis" Oncologist Histology
MM 99
other dx listed only for false‐positives
99/5538; 2%
None reported
Krahn 1998 WPC‐tests
P‐CS
Germany
Secondary
Excised pigmented skin lesions 80/80 VI (no algorithm)
Single observer
No details Dermatologist (assumed) Histology
MM 39
BN 40; SN 1
39/80; 49%
None reported
McGovern 1992 WPC‐algs
P‐CS
USA
Community
PSL (> 10 mm) excised to rule out dysplasia, MiS or MM 179/237 VI (7‐point; (A)BCD)
In‐person; single
7‐point: ≥ 2, ≥ 3, ≥ 4 characteristics present
 (A)BCD: ≥ 1, ≥ 2, ≥ 3 characteristics present NR (presume dermatologist)
Experience NR
Histology
MM 6; MiS 6
BCC 4; SK 32; BN 138; AK 6; other 45
6/211; 3%
32 lesions unaccounted for; 13 excluded due to lesion size of ≤ 8 mm. 192 evaluated for ABCD and 3‐point; 205 evaluated for 7‐point
Viglizzo 2004 WPC‐tests
NR‐CS
Italy
Specialist clinic
Pigmented skin lesions examined at the Dermoscopy Service and undergoing excisions; high and medium risk on dermoscopy were selected for excision and 2x2 can be estimated only for melanocytic subgroup NR/79 VI (no algorithm)
Single observer
No details Dermatologist (assumed) Histology
Melanoma (invasive): 11; MiS: 1
 Melanocytic lesion: 57
11/67; 16%
None reported
Walter 2012 BPC
RCT
UK
Primary
Any suspicious PSL that could not immediately be diagnosed as benign 654/792 (control arm only) VI (7‐point)
Siascope (iv arm)
In‐person (single)
NR GP (n = 28)
Nurse practitioner (n = 2)
Low (excluded if specialist dermatology training)
Histology/clinical FU (3‐6 months)/expert dx
Control group only:
MM 16; MiS 2
BCC 4; SK 20; DF 2; lentigo 5; "benign" 686; unknown 10
16/773; 2%
19 (5 due to violation of recruitment criteria or discontinued protocol; 1 died; 4 did not attend for dermatology assessment; 2 missing histology; 7 not clearly accounted for)
Image‐based
Lorentzen 1999 WPC‐tests
P‐CS
Denmark
Secondary
Patients with lesions suspicious for CMM referred to outpatients clinic 232/232 VI (no algorithm)
(Dermoscopy)
Single observer
Subjective impression; clinical diagnosis Dermatologist Histology
MM 49 "malignant melanoma"
BCC 16; SK 12; BN 137; other 18 (including SN, BD, and others)
49/232; 21%
10 cases excluded due to poor‐quality index test image
Rao 1997 WPC‐algs
(tests)
R‐CS
USA
Private
Patients with atypical melanocytic lesions or suspected early MM 63/72 VI (ABCD) (Dermoscopy)
Single observer
Diagnosis of melanoma Dermatology registrar Histology
MM 21
Atypical melanocytic naevus 51
21/72; 29%
None reported
Scope 2008 NC
R‐CS
New Zealand
Industry image database
Images of pigmented skin lesions selected from a database of standardised patient images provided by a New Zealand–based teledermatology company (MoleMap); images were selected on the basis that (1) ≥ 8 clinically atypical naevi were apparent on the back; (2) most of the lesions on the back and all of the atypical naevi had close‐up clinical digital images; (3) 1‐year FU images (close‐up clinical and dermoscopic images) were available to show that lesions considered to be benign were in fact biologically indolent by revealing no change; and (4) the image quality of both the overview and the close‐up images were acceptable 12/145 VI (ugly duckling)
Single observer
Lesion id as "completely different"
 or somewhat different from the other moles; (Bx) decision Dermatologist Histology or FU
MM 5 "malignant melanoma"
BN: 140
5/145; 3%
Unacceptable image quality
Troyanova 2003 BPC/WPC‐tests
R‐CCS
NR
Training images (source NR)
Images of pigmented skin lesions selected for a dermoscopy training study NR/50 VI (no algorithm)
(Dermoscopy)
Single observer
Subjective impression; dx of melanoma Dermatologist Histology
MM: 25
"Benign": 25
25/50; 50%
None reported
Westerhoff 2000 WPC‐tests
R‐CCS
Australia
Training images (Specialist unit)
Clinically atypical pigmented skin lesions; 50 invasive melanomas and 50 nonmelanomas randomly selected from the Sydney Melanoma Unit PSL image database NR/100 VI (no algorithm)
(Dermoscopy)
Single observer
Subjective impression; dx of melanoma GP Histology or FU
MM 50
"Benign": 50
50/100; 50%
None reported
AK: actinic keratosis; BCC: basal cell carcinoma; BD: Bowen’s disease; BN: benign naevi; BPC: between person comparison (of tests); Bx: biopsy; CCS: case control study; CMM: cutaneous malignant melanoma; CS: case series; DF: dermatofibroma; FU: follow‐up; MEL: invasive melanoma or atypical intraepidermal melanocytic variants; MiS: melanoma in situ (or lentigo maligna); MM: malignant melanoma; NC: non comparative; NR: not reported; P: prospective; PLC: pigmented lesion clinic; PSL: pigmented skin lesion; R: retrospective; RCT: randomised controlled trial; SK: seborrhoeic keratosis; SN: Spitz naevi; VI: visual inspection; WPC: within person comparison (of tests); WPC‐algs: within‐person comparison (of algorithms)

Appendix 14. Summary study details: detection of any skin lesion requiring excision

Study author
 
 Outcomes reported Study type
Country
Setting
Inclusion criteria Number of participants/lesions Index tests (algorithm)
Diagnostic approach
Threshold Observer qualifications (number)
Experience
Reference standard
Final diagnoses
Prevalence (MEL)
Exclusions
In‐person
Argenziano 2006 RCT
Italy, Spain
Primary
Patients asking for screening or exhibiting ≥ 1 skin tumours as seen during routine physical examination (patient‐finding screening).
Participating PCPs randomised to either VI alone or VI + dermoscopy; only excised lesions can be included for each arm.
NR/85 VI (ABCD)
Dermoscopy (3‐point checklist)
In person (single observer)
Subjective impression; dx of malignancy GPs (n = 37)
All trained in ABCD rule
Histology
MEL 6
BCC 37; SCC 10
benign 32
53/85; 62%
Only those participants who were considered to have lesions suggestive of skin cancer had histology and could be included; rest had expert diagnosis (making full dataset ineligible for this review)
Chang 2013 NC
R‐CS
Taiwan
Secondary
Potentially malignant biopsied or excised skin lesions (nontumour specimens excluded) 676/769 VI (no algorithm)
In‐person (single observer)
Subjective impression; definitely malignant Dermatologists; n = 25
Board‐certified
Histology
MM 4; MiS 4
BCC: 110; cSCC: 20
"Benign" diagnoses: 595
152/769; 20%
Mis‐registered or poor‐quality index test images (unfocused or containing a motion artifact)
Ek 2005 NC
P‐CS
Australia
Specialist clinic
Lesions excised for which malignancy could not be excluded 1223/2582 VI (no algorithm)
In person
Subjective impression Plastic surgeon (n = 4 or 5; mixed experience; 3 consultants, 1 plastic surgery trainee (usually 1st year, on 6‐month rotation) and a clinical assistant)
Unclear
Histology
MEL 23
BCC 1214; SCC 517; BD 188; SK 63; 577 other benign (including 330 solar keratosis)
1754/2582; 68%
Incomplete or incorrectly entered proformas were excluded – 79 participants with 96 lesions
McGovern 1992 WPC‐algs
P‐CS
USA
Community
PSL (> 10 mm) excised to rule out dysplasia, MiS or MM 179/237 VI (7‐point; (A)BCD)
In‐person; single
7‐point: ≥ 2, ≥ 3, ≥ 4 characteristics present
 (A)BCD: ≥ 1, ≥ 2, ≥ 3 characteristics present NR (presume dermatologist)
Experience NR
Histology
MM 6; MiS 6
BCC 4; SK 32; BN 138; AK 6; other 45
15/192; 8%
32 lesions unaccounted for; 13 excluded due to lesion size of ≤ 8 mm. 192 evaluated for ABCD and 3‐point; 205 evaluated for 7‐point
Stanganelli 2000 WPC
R‐CS
Italy
Specialist clinic
PSL referred by dermatologists and GPs either for pre‐surgical assessment or consultation NR/3372 VI (ABCD)
 Dermoscopy (no algorithm)
In person (single)
NR
 Subjective impression NR (assumed dermatologist ‐ described as one of the co‐authors; n = 1) Histology/registry FU
MEL 55
BCC 43; BN 3274
98/3372; 3%
None reported
Steiner 1987 P‐CS
Austria
Specialist clinic
Small (< 10 mm) diagnostically equivocal PSL; no absolute agreement on clinical diagnosis among investigating clinicians at a PLC NR/318 1. VI (no algorithm)
2. Dermoscopy (pattern)
In person
Subjective impression Dermatologists (n = 3; high experience ‐ "experienced dermatologists")
Consensus diagnosis of 3 observers
Histology
MM 49; MiS 24
BCC 20
BN 143; SK 20; lentigo simplex and nevoid lentigo 19; other 15
93/318; 29%
None reported
Walter 2012 BPC
RCT
UK
Primary
Any suspicious PSL that could not immediately be diagnosed as benign 654/792 (control arm only) VI (7‐point)
Siascope (iv arm)
In person (single)
NR GP (n = 28)
Nurse practitioner (n = 2)
Low (excluded if specialist dermatology training)
Histology/clinical FU (3‐6 months)/expert dx
Control group only:
MM 16; MiS 2
BCC 4; SK 20; DF 2; lentigo 5; "benign" 686; unknown 10
22/773; 3%
19 (5 due to violation of recruitment criteria or discontinued protocol; 1 died; 4 did not attend for dermatology assessment; 2 missing histology; 7 not clearly accounted for)
Image‐based
Carli 2002b WPC
R‐CS
Italy
Secondary
Clinically suspicious or equivocal PSL undergoing excision for diagnostic purposes; all ≤ 14 mm diameter NR/57 1. VI (NR)
 2. Dermoscopy (NR) NR Dermatologists (n = 2)
Image‐based; high experience ("with experience in the field of PSL"); consensus of 2
Histology
MM 6, MiS 5
BCC 10
BN 31, SK 1; other 4
20/54; 37%
4 'not evaluables' excluded (1 MM, 3 benign)
Rosendahl 2011 NC
R‐CS
Australia
Primary
PSL submitted for histology from the primary care skin cancer practice of one study author 389/463 1. VI (no algorithm)
2. Dermoscopy (pattern)
1. Subjective impression
2. Both characteristics present
Dermatologist (n = 1)
Image‐based; high experience (confirmed by study author); single observer
Histology
MM 9; MiS 20
BCC 72; SCC 5
BN 217; BD 18; AK 14*; BNM 140
*considered malignant by study authors
104/463; 22%
3 poor‐quality images excluded
Stanganelli 1998a WPC
R‐CS
Italy
Training images
PSL images selected from computerised files of the skin cancer clinic NR/30 1. VI (no algorithm)
 2. Dermoscopy (no algorithm) NR Dermatologists (n = 20)
Image‐based; experience NR (“experience in ELM but (with) no formal training”); average
Histology
MEL 10
BCC 4
BN 10, SK 3, other 3
14/30; 47%
None reported
AK: actinic keratosis; BN: benign naevi; BCC: basal cell carcinoma; BD: Bowen’s disease; BPC: between person comparison (of tests); CCS: case control study; CS: case series; cSCC: cutaneous squamous cell carcinoma; DF: dermatofibroma; FU: follow‐up; dx: diagnosis; ELM: epiluminescence microscopy; GP: general practitioner; MEL: invasive melanoma or atypical intraepidermal melanocytic variants; MiS: melanoma in situ (or lentigo maligna); MM: malignant (invasive) melanoma; NC: non comparative; NR: not reported; P: prospective; PCP: primary care practitioner; PLC: pigmented lesion clinic; PSL: pigmented skin lesion; R: retrospective; RCT: randomised controlled trial; SCC: squamous cell carcinoma; SK: seborrhoeic keratosis; SN: Spitz naevi; VI: visual inspection; WPC: within person comparison (of tests); WPC‐algs: within person comparison of algorithms
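
The number of disease-positive lesions for a study depends on the target condition being analysed. As a rough illustration, the Python sketch below counts positives for two rows of the table above under two groupings: MEL alone, and "any skin lesion requiring excision". The grouping used for "any" (melanoma plus BCC and SCC) is an assumption inferred from rows where the reported numerator matches that sum; it is not a definition quoted from the review, and the names `positives`, `MEL_DX` and `ANY_DX` are hypothetical.

```python
# Diagnosis labels treated as disease-positive under each target condition.
# ANY_DX is an assumed grouping, inferred from the rows used below.
MEL_DX = {"MM", "MiS", "MEL"}
ANY_DX = MEL_DX | {"BCC", "SCC", "cSCC"}


def positives(diagnoses: dict, target: set) -> int:
    """Sum the lesion counts whose diagnosis label is in the target set."""
    return sum(count for dx, count in diagnoses.items() if dx in target)


def percent(cases: int, lesions: int) -> int:
    """Whole-number percentage, rounded to the nearest percent."""
    return round(100 * cases / lesions)


# Final diagnoses and reported denominators copied from the table above.
stanganelli_1998a = {"MEL": 10, "BCC": 4, "BN": 10, "SK": 3, "other": 3}
steiner_1987 = {"MM": 49, "MiS": 24, "BCC": 20, "BN": 143, "SK": 20}

any_1998a = positives(stanganelli_1998a, ANY_DX)  # 10 + 4
any_1987 = positives(steiner_1987, ANY_DX)        # 49 + 24 + 20

print(f"Stanganelli 1998a (Any): {any_1998a}/30; {percent(any_1998a, 30)}%")
print(f"Steiner 1987 (Any): {any_1987}/318; {percent(any_1987, 318)}%")
print(f"Stanganelli 1998a (MEL): {positives(stanganelli_1998a, MEL_DX)}/30")
```

Under these assumptions the sketch gives 14/30 (47%) and 93/318 (29%), matching the values reported for those two rows. Other rows may follow a different grouping, so the sets would need checking study by study.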

Data

Presented below are all the data for all of the tests entered into the review.

Tests. Data tables by test.

Test No. of studies No. of participants
1 Visual inspection ‐ in‐person (MM) 7 6857
2 Visual inspection ‐ image‐based (MM) 5 599
3 Visual inspection ‐ in‐person (MEL) 28 25604
4 Visual inspection ‐ image‐based (MEL) 11 1243
5 Visual inspection ‐ in‐person (Any) 7 8091
6 Visual inspection ‐ image‐based (Any) 3 547
7 MEL‐ VI ‐ in‐person ‐ no algorithm 21 19330
8 MEL‐ VI ‐ in‐person ‐ no algorithm (alternative thresholds) 2 475
9 MEL‐ VI ‐ in‐person ‐ (A)BCD(E) at NR or standard threshold 6 5501
10 MEL‐VI ‐ in‐person ‐ ABCD at NR 2 3548
11 MEL‐VI ‐ in‐person ‐ ABCDE at ≥ 1 2 1541
12 MEL‐VI ‐ in‐person ‐ ABCDE at ≥ 2 3 1761
13 MEL‐VI ‐ in‐person ‐ ABCDE at ≥ 3 2 1541
14 MEL‐VI ‐ in‐person ‐ ABCDE at ≥ 4 2 1541
15 MEL‐VI ‐ in‐person ‐ ABCDE at ≥ 5 2 1541
16 MEL‐VI ‐ in‐person ‐ BCD at ≥ 1 1 192
17 MEL‐VI ‐ in‐person ‐ BCD at ≥ 2 1 192
18 MEL‐VI ‐ in‐person ‐ BCD at ≥ 3 1 192
19 MEL‐VI ‐ in‐person ‐ 7point at ≥ 2 1 205
20 MEL‐VI ‐ in‐person ‐ 7point at ≥ 3 1 205
21 MEL‐VI ‐ in‐person ‐ 7point at ≥ 4 1 205
22 MEL‐VI ‐ in‐person ‐ 7point(rev) at ≥ 3 1 773
23 MEL‐VI ‐ in‐person ‐ Collas at ≥ 1 1 353
24 MEL‐ VI ‐ image‐based ‐ no algorithm 9 1090
26 MEL‐VI ‐ image‐based ‐ ABCD(E) at standard 2 153
27 MEL‐VI ‐ image‐based ‐ ABCD at ≥ 2 1 103
28 MEL‐VI ‐ image‐based ‐ ABCD at ≥ 3 1 103
29 MEL‐VI ‐ image‐based ‐ ABCDE at ≥ 2 1 50
30 MEL‐VI ‐ image‐based ‐ ABCDE at ≥ 3 1 50
31 MEL‐ VI ‐ in‐person ‐ experience NR 12 16778
32 MEL‐ VI ‐ in‐person ‐ experience high 9 3547
33 MEL‐ VI ‐ in‐person ‐ experience moderate 1 567
34 MEL‐ VI ‐ in‐person ‐ experience low 4 2008
35 MEL‐ VI ‐ in‐person ‐ experience mixed 2 2704
36 MEL‐ VI ‐ image‐based ‐ experience NR 5 663
37 MEL‐ VI ‐ image‐based ‐ experience high 5 540
38 MEL‐ VI ‐ image‐based ‐ experience low 1 134
39 MEL‐ VI ‐ image‐based ‐ experience mixed 2 90
40 VI ‐ in‐person ‐ expert consultant (MEL) 9 3547
41 VI ‐ in‐person ‐ consultant (MEL) 12 16778
42 VI ‐ in‐person ‐ resident/registrar (MEL) 2 1236
43 VI ‐ in‐person ‐ mixed qualifications (secondary care) (MEL) 2 2704
44 VI ‐ in‐person ‐ GP (MEL) 3 1339
45 MEL‐ VI ‐ image‐based ‐ expert consultant 4 700
46 MEL‐ VI ‐ image‐based ‐ consultant 4 200
47 MEL‐ VI ‐ image‐based ‐ mixed qualifications (secondary care) 1 200
48 MEL‐ VI ‐ image‐based ‐ mixed qualifications (secondary/primary care) 1 40
49 MEL‐ VI ‐ image‐based ‐ mixed qualifications (primary care) 2 184
51 MEL ‐ Selected on quality ‐ pathway 2 or 3 5 5728
52 MEL ‐ Selected on quality ‐ pathway 5 9 3556

Characteristics of studies

Characteristics of included studies [ordered by study ID]

Argenziano 2006.

Study characteristics
Patient sampling Study design: RCT allocating primary care physicians to use either VI alone or VI plus dermoscopy (only excised lesions can be included for each arm).
Data collection: prospective
Period of data collection: May 2003‐September 2004
Country: Italy and Spain
Patient characteristics and setting Inclusion criteria: patients asking for screening or exhibiting ≥ 1 skin tumours as seen during routine physical examination (patient‐finding screening) were considered for inclusion; those undergoing excision were included in this review (i.e. those deemed sufficiently suspicious by the expert evaluation). PCPs were invited to participate in the trial; only those who attended the training sessions and who then screened patients and referred them to the PLCs were randomised.
Setting: primary
Prior testing: no prior testing
Setting for prior testing: N/A
Exclusion criteria: NR
Sample size (participants): number eligible: 3271 screened; 1325 participants allocated to 'naked eye' observation and 1197 participants allocated to dermoscopy observation; number included: 162 received histology after expert evaluation at the PLC
Sample size (lesions): 85 in VI arm and 77 in dermoscopy arm underwent excision
Participant characteristics: based on full sample: mean age 40, range 2‐90 (VI group)/ 41, range 3‐94 (dermoscopy group). Male 498 (38%): VI group/451 (38%) dermoscopy
Lesion characteristics: NR
Index tests VI: ABCD (control arm of RCT comparing naked eye examination to naked eye plus dermoscopy)
Method of diagnosis: in‐person diagnosis
Prior test data: N/A in‐person diagnosis
Diagnostic threshold: qualitative NR; described in intro as: simple morphologic features summarized by the asymmetry, border irregularity, colour variegation, and diameter 5 mm (ABCD)
Diagnosis based on: average (n = 37)
Observer qualifications: primary care physicians
Experience in practice: not described
Experience with index test: not described
Other detail: pre‐randomisation all participating PCPs underwent training in ABCD rule for clinical diagnosis and 3‐point checklist for dermoscopy.
Dermoscopy: evaluated in intervention arm of trial only
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Details: all lesions considered suggestive of skin cancer at the PLC were excised and subsequently diagnosed histopathologically. Equivocal lesions by histopathologic examination were reviewed by a second independent pathologist and a final diagnosis made.
 Disease positive: 92 malignant tumours; disease negative: 70 benign tumours
Target condition (final diagnoses)
Melanoma (in situ and invasive, or NR): 12; BCC: 66; cSCC: 14
SK: 13; MN 51; other: 6
Flow and timing Excluded participants: data can only be extracted for those with histology (i.e. participants considered to have lesions suggestive of skin cancer); remainder had expert diagnosis (not included in the final 2×2 data extracted)
Time interval to reference test: NR
Comparative RCT examining effect of making dermoscopy available to primary care practitioners
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Yes    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? Yes    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Low High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Yes    
    Low Low
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Barzegari 2005.

Study characteristics
Patient sampling Study design: CS
Data collection: NR
Period of data collection: NR
Country: Iran
Patient characteristics and setting Inclusion criteria: PSLs with a clinical diagnosis of melanocytic lesion ≤ 15 mm diameter referred to dermatology clinic for diagnostic evaluation or cosmetic reasons
Setting: secondary (general dermatology)
Prior testing: clinical suspicion of malignancy without dermatoscopic suspicion; patient request for evaluation/excision
Setting for prior testing: NR
Exclusion criteria: none reported
Sample size (participants): number included: 91
Sample size (lesions): number included: 122
Participant characteristics: mean age 32.3 (6‐94 years); male: 30; 33%
Lesion characteristics: NR
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: N/A in‐person diagnosis
Diagnostic threshold: qualitative melanoma likely (i.e. melanoma first in list of considered diagnoses)/ melanoma possible (melanoma one of a number of diagnoses)
Diagnosis based on: consensus (2 observers); n = 2
Observer qualifications: dermatology registrar (dermatology resident (3rd year)); dermatologist
Experience in practice: mixed experience (low and high experience combined)
Experience with index test: mixed (low and high experience combined)
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 6; disease negative: 116
Target condition (final diagnoses)
Melanoma (invasive): 3; melanoma (in situ): 3
SK: 2; benign naevus: 104; dysplastic naevus 7 DF, 1 AK
Flow and timing Excluded participants: none
Time interval between index and reference: unclear
Time interval between index test(s): consecutive
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? Unclear    
Did the study avoid including participants with multiple lesions? No    
    Low High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less? No    
    Unclear  

Benelli 1999.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: 1 September 1997‐30 September 1998
Country: Italy
Patient characteristics and setting Inclusion criteria: all PSLs observed and excised at the dermatologic surgery department
Setting: dermatologic surgery department
Prior testing: selected for excision (no further detail)
Setting for prior testing: dermatologic surgery department
Exclusion criteria: none reported
Sample size (participants): NR
Sample size (lesions): number included: 401
Participant characteristics: NR
Lesion characteristics: melanoma thickness: 6 in situ; 42 < 0.75 mm thick, 80 0.76‐1.5 mm thick, 4 1.5‐4 mm thick (mean 0.60 mm, median 0.55 mm, max 1.9 mm, min 0.10 mm, SD 0.45)
Index tests VI: ABCDE
Method of diagnosis: in‐person diagnosis
Prior test data: lesions assessed by both dermatologists clinically and dermoscopically
Diagnostic threshold: data given for accuracy of each potential score (1‐5); score estimation described in detail
Diagnosis based on: consensus (2 observers); n = 2
Observer qualifications: dermatologist
Experience in practice: not described
Experience with index test: not described
Dermoscopy 7FFM also assessed by same observers
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 60 (15%) lesions; disease negative: 340 (non melanoma) + 1 BCC
Target condition (final diagnoses)
Melanoma (invasive): 54 (13.5%); melanoma (in situ): 6 (1.5%); BCC: 1 (0.4%)
SK: 1 (0.4%); MN: 316; epithelioid and/or spindle cell naevi: 18 (4.5%); LS: 5 (1.2%)
Flow and timing Excluded participants: NR
Time interval to reference test: same day
Comparative Blinding between tests: Clinical and dermoscopic evaluations made in‐person by 2 dermatologists prior to excision.
Time interval between index test(s): same day
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? No    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? Yes    
Was the test interpretation carried out by an experienced examiner? Unclear    
    High High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Yes    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Low  

Benelli 2001.

Study characteristics
Patient sampling Study design: unclear
Data collection: retrospective image selection/prospective interpretation
Period of data collection: NR ‐ only dates of training course and agreement study given (April‐May 1999)
Country: Italy
Patient characteristics and setting Inclusion criteria: slides of pigmented skin tumours were selected for evaluation during a training course on dermoscopy. Lesions not located on head, palms or soles; histological slide available
Setting: training images; study authors' institution. Institute of Dermatologic Sciences, University of Milan
Prior testing: slides of pigmented skin tumours were selected for evaluation during a training course on dermoscopy
Setting for prior testing: unspecified
Exclusion criteria: none reported
Sample size (participants): NR
Sample size (lesions): number included: 49 (paper reports 50 but only 49 accounted for in text)
Participant characteristics: none reported
Lesion characteristics: none reported
Index tests VI: ABCDE
Method of diagnosis: clinical photographs
Prior test data: no further information used
Diagnostic threshold: ABCDE Score ≥ 2; presence of 2 criteria; ABCDE score ≥ 3; presence of 3 criteria. All criteria described in full
Diagnosis based on: single (n = 1); average (n = 65; attending 1/3 courses in dermoscopy held to inform dermatologists about a new dermatoscopic diagnostic method (7FFM))
Observer qualifications: dermatologists
Experience in practice: expert author; not described for participating dermatologists
Experience with dermoscopy: expert author; prior experience not described for participating dermatologists; all underwent dermoscopy training for study purposes
Dermoscopy: 7FFM; ABCDE also evaluated in study
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 12/49 melanomas (paper reports 50 but only 49 accounted for in text)
Target condition (final diagnoses)
Melanoma (invasive): 10; melanoma (in situ): 2; BCC: 2 pigmented BCC
3 seborrhoeic keratoses: 2; pigmented BCC: 1; blue nevus: 2; angiokeratoma: 5; Spitz nevus: 5; junctional naevi 9 compound naevi, 10 naevi undergoing regression
Flow and timing Excluded participants: none reported
Time interval to reference test: unclear
Comparative Blinding between tests: Clinical images interpreted in the morning and dermoscopic images in the afternoon
Time interval between index test(s): image capture NR
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Unclear    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Yes    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? Yes    
Was the test interpretation carried out by an experienced examiner? Yes    
    Low High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Bono 1996.

Study characteristics
Patient sampling Study design: unclear
Data collection: NR
Period of data collection: March 1993‐October 1994
Country: Italy
Patient characteristics and setting Inclusion criteria: PSLs at the Istituto Nazionale Tumori of Milan
Setting: specialist unit (skin cancer clinic/PLC) Istituto Nazionale Tumori of Milan
Prior testing: NR
Setting for prior testing: NR
Exclusion criteria: none reported
Sample size (participants): number eligible: 45
Sample size (lesions): number eligible: 54/ number included: 43
Participant characteristics: NR
Lesion characteristics: site ‐ face/ears: 3 (6%)/trunk: 39 (72%)/limbs: 12 (22%); 10 MM ≤ 1 mm depth; median size: 10 mm (4 mm‐40 mm)
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: N/A in‐person diagnosis
Diagnostic threshold: NR; 'clinical diagnosis'
Diagnosis based on: single observer; n = NR
Observer qualifications: treating surgeon
Experience in practice: not described
Experience with index test: not described
Target condition and reference standard(s) Reference standard: histological diagnosis
Disease positive: 18; disease negative: 25
Expert opinion: disease negative: 11
Target condition (final diagnoses)
Melanoma (in situ and invasive, or NR): 18
Mild/moderate dysplasia: 8 dysplastic naevi
Benign naevus: 17 common MN
Flow and timing Excluded participants: only 43 lesions had complete clinical and histological information. 11 lesions not surgically removed had only clinical diagnosis (benign) and were not included in the final accuracy analysis
Time interval to reference test: NR
Time interval between index test(s): NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Unclear    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? Unclear    
Did the study avoid including participants with multiple lesions? No    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? No    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard No    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    High High
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? No    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC? Yes    
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Bono 2002a.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: June 1998‐March 2000
Country: Italy
Test set derived: a training set was separately derived using data obtained from 237 previously studied lesions (Farina 2000)
Patient characteristics and setting Inclusion criteria: cutaneous pigmented lesions with clinical and/or dermatoscopic features that suggested a greater or lesser degree of suspicion for CM
Setting: specialist unit (skin cancer clinic/PLC)
Prior testing: clinical and/or dermatoscopic suspicion
Setting for prior testing: specialist unit (skin cancer clinic/PLC)
Exclusion criteria: location/site of lesion: awkwardly situated lesions, e.g. interdigital space, ears, nose or eyelids; lesions on scalp excluded due to hair interference with reflectance. Lesion size: obviously large, thick melanomas
Sample size (participants): number included: 298
Sample size (lesions): number included: 313
Participant characteristics: mean age: 40 years (10‐86 years); male: 122; 41%
Lesion characteristics: lesion site: head/neck: 3%; trunk: 61%; limbs: 36%; thickness ≤ 1 mm: 70% (46/66); for 55 invasive MM: median thickness 0.64 mm, range 0.17‐3.24 mm. Median diameter: 11 mm (3‐31 mm)
Index tests VI: no algorithm (training in the unit based on ABCD but subjective experience of the clinician used for diagnosis)
Method of diagnosis: in‐person diagnosis
Prior test data: same clinician undertook clinical diagnosis and diagnosis using dermoscopy
Diagnostic threshold: clinical diagnostic criteria based on subjective experience; emphasised lesion colour over dimensions. Diagnosis of suspect CM made when the level of suspicion was "roughly 50% or more". ABCD criteria have been the basis of training at the unit but are not implemented in diagnosis; emphasis was placed on colour rather than dimensional character
Diagnosis based on: single observer; (n = 1)
Observer qualifications: surgical oncologists
Experience in practice: high experience or 'Expert'; over 5 years
Dermoscopy: also evaluated in same study (no algorithm)
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Target condition (final diagnoses)
Melanoma (invasive): 55; Melanoma (in situ): 11; BCC: 6
'Benign' diagnoses: 241; 151 compound naevus, 24 junctional naevus, 12 dermal naevus, 12 LS, 10 dysplastic naevus, 8 spindle‐cell naevus, 8 SK, 5 blue naevus, 3 Spitz naevus, 8 other
Flow and timing Excluded participants: NR
Interval between index and reference: NR
Comparative Same clinician undertook both diagnoses (in‐person)
Time interval between index test(s): Appears consecutive but not fully clear
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? No    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Yes    
    High High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Yes    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Low High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Bono 2002b.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: December 2000‐August 2001
Country: Italy
Patient characteristics and setting Inclusion criteria: consecutive cutaneous pigmented lesions that were ≤ 6 mm in diameter and required surgical biopsy for diagnosis based on clinical or dermoscopic suspicion of CMM
Setting: specialist unit (skin cancer clinic/PLC)
Prior testing: clinical and/or dermatoscopic suspicion
Setting for prior testing: NR
Exclusion criteria: lesion size > 6 mm; non‐pigmented
Sample size (participants): number eligible: 349/number included: 157
Sample size (lesions): number eligible: 375/number included: 161
Participant characteristics: mean age 38 years (14‐82); male: 61 (39%)
Lesion characteristics: site: head/neck: 14 (9%); trunk: 88 (55%); limbs: 59 (36%)
Lesion size: median: 5 mm (1 mm‐6 mm)
Index tests VI: no algorithm (ABCD criteria have been the basis of training at the unit but are not implemented in diagnosis; emphasis was placed on colour rather than dimensional character)
Method of diagnosis: in‐person diagnosis
Prior test data: N/A, in‐person diagnosis
Other test data: dermoscopy evaluated in same study by same observer(s)
Diagnostic threshold: a diagnosis of suspect CM is made when the level of suspicion is roughly 50% or more; lesions at a lower index of suspicion were considered benign for the purposes of this study.
Diagnosis based on: single observer; diagnostic criteria based on the subjective experience of the single clinician examining the pigmented lesion (n = 2)
Observer qualifications: surgical oncologists
Experience in practice: high experience or 'Expert'; observers described as "expert in the recognition of pigmented lesions"
Other detail: diagnostic criteria were based on the subjective experience of the single clinician examining the pigmented lesion. Although the ABCD criteria have been the basis of training at the unit, the clinicians did not consider the ABCD mnemonic an essential formula for diagnosis of CM. They did not take the dimensional character into consideration and attributed great importance to the colour of a given lesion.
Dermoscopy: performed by the same 2 clinicians who first made and registered the clinical diagnosis
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 13 CM; disease negative: 148
Target condition (final diagnoses)
Melanoma (invasive): 10; melanoma (in situ): 3; BCC: 2 (1.2%)
Mild/moderate dysplasia: 26 (16.1%); SK: 4 (2.5%); benign naevus: compound nevus 57 (35.4%), junctional nevus 38 (23.6%), spindle‐cell nevus 6 (3.7%), Spitz nevus 5 (3.1%), blue nevus 2 (1.2%), other 6 (3.7%), LS 2 (1.2%)
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Comparative Dermoscopy performed by the same two clinicians who first made and registered the clinical diagnosis
Time interval between index test(s): appears consecutive
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Yes    
    Low High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Yes    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Low High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Bono 2006.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective
Period of data collection: January 2003‐December 2004
Country: Italy
Patient characteristics and setting Inclusion criteria: consecutive patients with PSLs with a maximum diameter of ≤ 3 mm undergoing excision. The decision for diagnostic excision was based on clinical and/or dermoscopic features suggesting a greater or lesser degree of suspicion for CM
Setting: specialist unit (skin cancer clinic/PLC) Istituto Nazionale Tumori of Milan
Prior testing: clinical and/or dermatoscopic suspicion
Setting for prior testing: specialist unit (skin cancer clinic/PLC)
Exclusion criteria: lesion size > 3 mm
Sample size (participants): number eligible: 204/number included: 204
Sample size (lesions): number eligible: 206/number included: 206
Participant characteristics: median age: 40 (6‐74); male: 71 (35%)
Lesion characteristics: head/neck: 8 (4%); trunk: 84 (41%); limbs: 114 (55%). Median size: 2 mm (1 mm‐3 mm)
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: N/A, in‐person diagnosis
Other test data: dermoscopy evaluated in same study by same observer(s)
Diagnostic threshold: a diagnosis of suspicious CM is made when the level of suspicion is roughly 50% or more; lesions at a lower index of suspicion were considered not CM
Diagnosis based on: single observer; n = 1
Observer qualifications: NR (assumed oncologist as per Bono 2002a and Bono 2002b); "single clinician examining the pigmented lesion"
Experience in practice: not described
Experience with dermoscopy: not described
Dermoscopy: evaluated in same study; Menzies criteria
Any other detail: ABCD criteria have been the basis of training at the unit, but are not applied in diagnosis; preferred emphasis on colour rather than dimensional character
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Details: the slides were evaluated according to widely accepted criteria for the histopathological diagnosis of the various pigmented lesions.
 Disease positive: 23; disease negative: 183
Target condition (final diagnoses)
Melanoma (invasive): 19 (9.2%); melanoma (in situ): 4 (2.0%)
Mild/moderate dysplasia: dysplastic naevus 10 (4.9%); junctional naevus 76 (36.9%); compound naevus 50 (24.3%); dermal naevus 12 (5.8%); blue naevus 11 (5.3%); Reed naevus 7 (3.4%); Spitz naevus 3 (1.5%); halo naevus 3 (1.5%); LS 7 (3.4%); other 4 (1.9%)
Flow and timing Excluded participants: none
 Time interval to reference test: NR
Comparative Single observer performed both tests
Time interval between index test(s): not reported
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Yes    
    Low High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Yes    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Low High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Bourne 2012.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective image selection/prospective interpretation
Period of data collection: 1 June‐6 July 2009
Country: Australia
Patient characteristics and setting Inclusion criteria: all skin lesions consecutively excised at a skin cancer practice to exclude skin cancer, plus common lesions assessed as clearly benign and not biopsied
Setting: primary
Prior testing: clinical and/or dermatoscopic suspicion. The test set was assembled in secondary care by an experienced skin cancer doctor; the images were then assessed by primary care professionals
Setting for prior testing: specialist unit (skin cancer clinic/PLC)
Exclusion criteria: clinically obvious BCCs that could be easily diagnosed without dermoscopy were not included in the collection set.
Sample size (participants): number eligible: 46/number included: 46
Sample size (lesions): number eligible: 50/number included: 50
Participant characteristics: mean age: 58 (30‐60); male: 22
Lesion characteristics: face = 8; neck = 1; chest = 3; back = 21; shoulder = 2; arm = 3; thigh = 4; leg = 7; foot plantar = 1
Index tests VI: no algorithm
Method of diagnosis: clinical photographs
Prior test data: no further information used; image assessments were done on 4 occasions, each time using a different diagnostic approach.
Diagnostic threshold: NR, clinicians provided with Excel answer sheets for each method listing the various criteria used in that algorithm but no algorithm was cited for VI
Diagnosis based on: average (n = 4)
Observer qualifications: 3 GPs and 1 clinical nurse
Experience in practice: mixed; described as varying levels of dermatoscopic experience
Dermoscopy: evaluated in same study; 3‐point rule; Menzies criteria
Target condition and reference standard(s) Reference standard: histological diagnosis plus other
Histopathological examination (n = 46); expert diagnosis as benign (n = 3); digital follow‐up (n = 1)
Target condition (final diagnoses)
Melanoma (invasive): 1; melanoma (in situ): 7; BCC: 6; lentigo maligna: 1
SK: 5. 'Benign' diagnoses: banal naevus 10, blue naevus 1, naevus and SK/solar lentigo collision 3, solar lentigo 4, LPLK 4, DF 1, psoriasis 1, solar keratosis 2, intraepidermal carcinoma 3, regressed keratoacanthoma 1
Flow and timing Excluded participants: as 2 of the methods (Menzies and 3‐point checklist) related to only pigmented lesions, we excluded the 5 non‐pigmented specimens in the set of 50 from the contingency tables for these methods.
Time interval to reference test, quote: "all skin lesions consecutively excised to exclude skin cancer were recorded"
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Yes    
    Unclear High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Yes    
Expert opinion (with no histological confirmation) was not used as a reference standard No    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low High
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Yes    
Did all patients receive the same reference standard? No    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC? Unclear    
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less? Yes    
    High  

Carli 2002a.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective for clinical examination and in vivo dermoscopy; retrospective image selection/prospective interpretation for ex vivo dermoscopic evaluation
Period of data collection: June 1997‐December 1998
Country: Italy
Patient characteristics and setting Inclusion criteria: clinically equivocal and suspicious PSLs subjected to excisional biopsy at the Institute of Dermatology
Setting: secondary (not further specified)
Prior testing: clinical and/or dermatoscopic suspicion
Setting for prior testing: secondary
Exclusion criteria: none reported
Sample size (participants): NR
Sample size (lesions): 256
Participant characteristics: none reported
Lesion characteristics: of the CMs, 14 (25.9%) were in situ melanoma (Clark level I), 18 (33.3%) were invasive with < 0.75 mm thickness, 19 (35.3%) were of intermediate thickness (0.76–1.50 mm) and 3 (5.5%) were > 1.5 mm. The median thickness of invasive melanomas was 0.94 mm ± 0.5 (SD) (range 0.2–2.6)
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: unclear
Other test data: clinical examination and in vivo dermoscopy were performed before excision by 2 trained dermatologists and diagnosis reached
Diagnostic threshold: NR
Diagnosis based on: consensus (2 observers); final clinical diagnosis was based on agreement between the 2 observers. In case of disagreement, a 3rd observer (B.G.) adjudicated the diagnosis
Observer qualifications: dermatologist
Experience in practice: high experience or ‘Expert’; described as “dermatologists with extensive experience in both clinical and dermoscopic diagnosis of pigmented skin lesions”
Dermoscopy: evaluated in same study; pattern analysis
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Target condition (final diagnoses) 
 Melanoma (invasive): 40; melanoma (in situ): 14
 BCC: 5
SK: 4; benign naevus: 90 common MN; 78 MN; 9 blue naevi; 16 Spitz/Reed naevi
Flow and timing Excluded participants: none reported
 Time interval to reference test: NR
Comparative In‐person clinical examination and dermoscopy
Time interval between index test(s): the interval between in vivo dermoscopy and re‐evaluation of dermoscopic images was reported as 1 year
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? Yes    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Carli 2002b.

Study characteristics
Patient sampling Study design: CS
Data collection: NR
Period of data collection: NR
Country: Italy
Patient characteristics and setting Inclusion criteria: clinically suspicious or equivocal PSLs undergoing excision for diagnostic purposes; only lesions with a diameter of ≤ 14 mm were included
Setting: secondary (general dermatology)
Prior testing: clinical suspicion of malignancy without dermatoscopic suspicion
Setting for prior testing: secondary (general dermatology)
Exclusion criteria: none reported
Sample size (participants): number included: NR
Sample size (lesions): number included: 57
Participant characteristics: none reported
Lesion characteristics: thickness ≤ 1 mm: 11 cases (5 in situ, 6 invasive); all ≤ 14 mm diameter
Index tests VI: no algorithm
Method of diagnosis: clinical photographs; fixed focus distance of 10 cm; images observed using a viewer in 2 separate diagnostic sessions
Prior test data: no further information used; contact (dermoscopic) images viewed first and then distant images (clinical), without knowing the classification of the contact image of the individual lesions.
Diagnostic threshold: NR
Diagnosis based on: consensus (2 observers); n = 2
Observer qualifications: dermatologist
Experience in practice: high experience or ‘Expert’; states "with experience in the field of PSL"
Other detail: used an AF micro Nikkor 60 lens objective mounted on a Nikon F50 camera, with a fixed focus distance of 10 cm
Dermoscopy: evaluated in same study; no algorithm
Target condition and reference standard(s) Reference standard: histology (not further described) 
 Disease positive: 21; disease negative: 36
Target condition (final diagnoses)
Melanoma (invasive): 6; melanoma (in situ): 5; BCC: 10
'Benign' diagnoses: 36
Flow and timing Excluded participants: no exclusions reported
Time interval to reference test: photographic procedures performed consecutively prior to surgery
Comparative Photographic procedures performed consecutively prior to surgery
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Yes    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Low  

Carli 2003a.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective image selection/prospective interpretation
Period of data collection: 1999‐2001
Country: Italy
Patient characteristics and setting Inclusion criteria: clinically difficult to diagnose or equivocal melanocytic lesions randomly selected from image database; all melanomas < 1 mm thickness
Setting: secondary (general dermatology)
Prior testing: clinical suspicion of malignancy without dermatoscopic suspicion
Setting for prior testing: secondary (general dermatology)
Exclusion criteria: ≥ 1 mm thick melanomas, dermoscopically peculiar lesions (e.g. blue naevi or Spitz naevi)
Sample size (participants): NR
Sample size (lesions): number included: 200
Participant characteristics: none reported
Lesion characteristics: diameter < 6 mm, 58; 6‐10 mm, 87; ≥ 10 mm, 55 (results reported per subgroup). Lesions ≤ 1 mm thickness: 64; median thickness 0.3 mm, 25th‐75th centile 0.00‐0.58 mm; mean diameter 7.4 (SD 2.79) mm; median: 7 mm (2‐16 mm)
Any other detail: same lesions appear to be reported in De Giorgi 2011 but with a different set of 8 observers (De Giorgi 2011 excluded from review on this basis)
Index tests VI: no algorithm
Method of diagnosis: clinical photographs
Prior test data: no further information used; dermoscopic images interpreted subsequent to clinical images
Diagnostic threshold: NR
Diagnosis based on: average; n = 8
Observer qualifications: dermatology registrar: 2 (final‐year residents); dermatologist: 6
Experience in practice: mixed ‐ 2 senior experts, 4 practicing dermatologists, 2 final‐year resident dermatologists. Classified as 'high' due to expertise/training in dermoscopy use
Other detail: clinical photos using Nikon F40 with macro lens at 15 cm
Dermoscopy: evaluated in same study; no algorithm (own choice)
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 64; disease negative: 136
Target condition (final diagnoses)
Melanoma (invasive): 40; melanoma (in situ): 24
Other: 136 MN
Flow and timing Excluded participants: no exclusions reported
Time interval to reference test: interval not described
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? No    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    High High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Yes    
    Low Low
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Chang 2013.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective
Period of data collection: January 2006‐July 2009
Country: Taiwan
Patient characteristics and setting Inclusion criteria: potentially malignant biopsied or excised skin lesions (non‐tumour specimens excluded)
Setting: secondary (general dermatology)
Prior testing: selected for excision (no further detail)
Setting for prior testing: secondary (general dermatology)
Exclusion criteria: prior surgery; misregistered or poor‐quality images (unfocused or containing a motion artefact) (considered under 'Flow and timing')
Sample size (participants): number eligible: 3964; number included: 676
Sample size (lesions): number eligible: 4192; number included: 769
Participant characteristics: mean age: 47.6 (SD 21.0); male: 296 (43.8%)
Lesion characteristics: none reported
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: N/A, in‐person diagnosis
Diagnostic threshold: NR; clinicians’ impressions prior to biopsy were classified as “benign”, “malignant”, or “indeterminate”. When the clinicians were not confident enough to make a definite benign or malignant diagnosis, the clinical impression was considered as “indeterminate”. Data extracted for malignant vs rest and malignant/indeterminate vs rest
Diagnosis based on: single observer; board‐certified staff dermatologists from institute; n = 25
Observer qualifications: dermatologist
Experience in practice: board certified; 'High'
Target condition and reference standard(s) Reference standard: histology (not further described) 
 Disease positive: 174; disease negative: 595
Target condition (final diagnoses)
Melanoma (invasive): 4; melanoma (in situ): 4; BCC: 110; cSCC: 20
'Benign' diagnoses: 595
Flow and timing Excluded participants: misregistered or poor‐quality images (unfocused or containing a motion artefact) were excluded under the study's exclusion criteria
Time interval to reference test: not described
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? No    
    Low High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Collas 1999.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: January 1996‐August 1997
Country: France
Patient characteristics and setting Inclusion criteria: PSLs undergoing excision by dermatologists in private practice, and by hospital dermatologists
Setting: secondary (general dermatology); private care
Prior testing: selected for excision (no further detail)
Setting for prior testing: secondary (general dermatology); private care
Exclusion criteria: none reported
Sample size (participants): number included: 353
Sample size (lesions): number included: 353
Participant characteristics: male: 162 (46%)
Lesion characteristics: none reported
Index tests VI: no algorithm. Own new algorithm
Diagnosis based on features from ABCD and 7‐point checklist but neither one specifically followed.
Study authors selected own combination of lesion characteristics based on observed data
Method of diagnosis: in‐person diagnosis
Prior test data: unclear
Diagnostic threshold: data can be extracted at a number of thresholds.
1. primary diagnosis of melanoma; 2. certainty of melanoma diagnosis; 3. various combinations of assessed features (based on logistic regression)
 Recorded: most likely clinical diagnosis; degree of melanoma suspicion and clinical sign(s) that led to the removal decision based on ABCD rule (McCarthy 1995) and the 7‐point checklist (Healsmith 1994)
Diagnosis based on: single observer; n = NR
Observer qualifications: dermatologist
Experience in practice: not described
Experience with index test: not described
Other detail: most predictive features derived by logistic regression from the following list: irregular contours; abnormal pigmentation; blurred; frank tumour appearance; erosion, ulceration or bleeding; regression signs; lesion recently changed; lesion appeared recently; pruritic lesion; other
Target condition and reference standard(s) Reference standard: histology (not further described) 
 Disease positive: 38; disease negative: 315
Target condition (final diagnoses)
Melanoma (in situ and invasive, or NR): 38
Other: 160
Flow and timing Excluded participants: no exclusions reported
Time interval to reference test: consecutive; quote: "When the dermatologist decided to resection a pigmented lesion, he fulfilled a pre‐printed sheet"
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Yes    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others? No    
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? Yes    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear Unclear
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Yes    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less? Yes    
    Low  

Cristofolini 1994.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: October 1990‐June 1991
Country: Italy
Patient characteristics and setting Inclusion criteria: patients with pigmented lesions presenting during a campaign for the early diagnosis of CM at the Dermatology Department in Trento
Setting: secondary (general dermatology)
Prior testing: NR
Setting for prior testing: NR
Exclusion criteria: lesions that were not taken into consideration included benign lesions, naevi of Unna and Miescher types and naevi that showed no inclusion criteria at the ABCDE clinical examination
Sample size (participants): number eligible: 700 people; number included: NR
Sample size (lesions): number eligible: 220; number included: 220
Participant characteristics: none reported
Lesion characteristics: none reported
Index tests VI: ABCDE
Method of diagnosis: in‐person diagnosis
Prior test data: N/A, in‐person diagnosis
Other test data: dermoscopy undertaken by same clinician(s) subsequent to clinical evaluation
Diagnostic threshold: lesions showing ≥ 2 of the ABCDE criteria, all of which were given the same diagnostic importance, were considered positive
Diagnosis based on: unclear; n = 4
Observer qualifications: dermatologist
Experience in practice: high experience or ‘Expert’; all trained in the recognition of pigmented lesions during a training course about the clinical diagnosis of naevi and melanomas; all working in a department where the early diagnosis of melanoma had been dealt with for over 10 years
Experience with dermoscopy: high experience/‘Expert’ users
Other detail: ABCDE criteria are: asymmetry in shape; border irregular and notched; colour mottled‐haphazard display; dimension > 6 mm; evolution (changes in pigmentation)
Dermoscopy: evaluated in same study; pattern analysis
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Target condition (final diagnoses)
Melanoma (in situ and invasive, or NR): 33
Mild/moderate dysplasia: 23 dysplastic naevi; SK: 4; benign naevus: 158 common naevi
 Other: 2 thrombosed angiomas
Flow and timing Excluded participants: no exclusions reported
Time interval to reference test: not described
Time interval between index tests: clinical evaluation directly followed by dermoscopy
Comparative Clinical evaluation directly followed by dermoscopy
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Yes    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Unclear    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? Yes    
Was the test interpretation carried out by an experienced examiner? Yes    
    Low Unclear
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  
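
As an illustration only, the count‐based threshold described for Cristofolini 1994 (≥ 2 of the ABCDE criteria, each given the same weight) can be sketched in Python. The class and function names are hypothetical and do not come from the study; the fields follow the criteria as listed in the study characteristics, and the default threshold of 2 is the one reported there.

```python
from dataclasses import dataclass


@dataclass
class AbcdeFindings:
    """Clinical findings for one lesion (hypothetical structure for illustration)."""
    asymmetry: bool          # A: asymmetry in shape
    border_irregular: bool   # B: border irregular and notched
    colour_mottled: bool     # C: colour mottled, haphazard display
    dimension_mm: float      # D: largest dimension in mm (criterion met if > 6 mm)
    evolution: bool          # E: changes in pigmentation


def count_criteria(findings: AbcdeFindings) -> int:
    """Count how many ABCDE criteria are met, each with equal weight."""
    return sum([
        findings.asymmetry,
        findings.border_irregular,
        findings.colour_mottled,
        findings.dimension_mm > 6,
        findings.evolution,
    ])


def is_test_positive(findings: AbcdeFindings, threshold: int = 2) -> bool:
    """Classify a lesion as test positive when at least `threshold` criteria are met."""
    return count_criteria(findings) >= threshold


# Example: an asymmetric 7 mm lesion with no other criteria meets 2 criteria.
example = AbcdeFindings(True, False, False, 7.0, False)
print(count_criteria(example), is_test_positive(example))
```

Expressing the rule as a count against an adjustable threshold makes it clear that a different cut‐off (for example ≥ 3 criteria) would trade sensitivity against specificity; the study entry above reports only the ≥ 2 threshold.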

Cristofolini 1997.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: November 1992‐September 1993
Country: Italy
Patient characteristics and setting Inclusion criteria: patients with small and flat common and atypical PSLs recruited during a health campaign for the early diagnosis of CM underwent clinical diagnosis, computerised analysis by SVS and subsequent skin biopsy
Setting: secondary (general dermatology)
Prior testing: no prior testing
Setting for prior testing: secondary (general dermatology)
Exclusion criteria: none reported
Sample size (participants): 176
Sample size (lesions): 176
Participant characteristics: none reported
Lesion characteristics: none reported
Index tests VI: ABCD
Method of diagnosis: in‐person diagnosis
Prior test data: clinical examination and/or case notes
Diagnostic threshold: NR; examined individual ABCD characteristics but no 'rule' as to when to diagnose melanoma; appears to be subjective diagnosis
Diagnosis based on: consensus (3 observers) (n = 3)
Observer qualifications: dermatologist
Experience in practice: not described in paper but judged as 'High'; states that, quote: “All lesions were examined by three dermatologists according to the ABCD system, if they disagreed a fourth dermatologist an expert in the diagnosis of pigmented lesions was consulted.” Cristofolini 1994 describes 4 dermatologists "trained in the recognition of pigmented lesions", 3/4 are in common with Cristofolini 1997.
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Target condition (final diagnoses)
Melanoma (in situ and invasive, or NR): 35
Other: 141 MN
Flow and timing Excluded participants: NR
 Time interval to reference test: quote: "subsequent skin biopsy"
 Time interval between index test(s): NR, appears to be simultaneous
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? Yes    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Yes    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Low  

de Giorgi 2012.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective image selection/prospective interpretation
Period of data collection: October 2006‐September 2010
Country: Italy
Patient characteristics and setting Inclusion criteria: pigmented melanocytic skin lesions with a maximum diameter of 6 mm excised at Department of Dermatology
Setting: secondary (general dermatology)
Prior testing: NR
Setting for prior testing: NR
Exclusion criteria: location/site of lesion ‐ palmar and plantar regions, mucosal lesions and pigmented melanocytic lesions of the nails excluded
Sample size (participants): NR
Sample size (lesions): number included: 103
Participant characteristics: mean age: melanoma group male (50.4 years) female (48.4 years); benign group male (36 years) female (36.8 years)
Lesion characteristics: head/neck: 3; trunk: 21; upper limbs/shoulder: 16; lower limbs/hip: 26; back: 34; dorsal acral: 3. Thickness: ≤ 1 mm: 15; > 1 mm: 1 MM
Index tests VI: ABCD
Method of diagnosis: clinical photographs
Prior test data: unclear
Other test data: dermoscopic images also presented separately to observer (only presence/absence of particular dermoscopic features recorded; not an overall diagnostic assessment)
Diagnostic threshold: ABCD criteria ≥ 2 criteria present
Diagnosis based on: consensus (3 observers); n = 3
Observer qualifications: dermatologist
Experience in practice: high experience or ‘Expert’; quote: “the four dermatologists had the same level of training and experience in dermatology, with more than 5 years of practice in dermoscopy”
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 34; disease negative: 69
Target condition (final diagnoses) 
 Melanoma (in situ and invasive, or NR): 34
'Benign' diagnoses: 69 benign melanocytic naevi
Flow and timing Excluded participants: none reported
 Time interval to reference test: NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? No    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    High High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? No    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? Yes    
Was the test interpretation carried out by an experienced examiner? Yes    
    High High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Yes    
    Low Low
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Dolianitis 2005.

Study characteristics
Patient sampling Study design: CCS
Data collection: retrospective image selection/prospective interpretation
Period of data collection: July 2001‐June 2002
Country: Australia
Patient characteristics and setting Inclusion criteria: dermoscopy training study using a CD with 5 test sets of images, each with 40 images of melanocytic skin lesions. Only good‐quality macroscopic and dermoscopic images were included.
Setting: training images; study author's institute, Department of Dermatology, University of Melbourne
Prior testing: unclear
Setting for prior testing: NR
Exclusion criteria: nonmelanocytic lesions; poor‐quality index test image, only good‐quality macroscopic and dermoscopic images were included, where the whole lesion was visible, including the entire periphery (considered under 'Flow and timing')
Sample size (participants): NR
Sample size (lesions): number eligible: 40; number included: 40
Participant characteristics: NR
Lesion characteristics: ≤ 1 mm thickness: 14 invasive melanomas; median 0.50 mm
Index tests VI: no algorithm
Method of diagnosis: clinical photographs alone
Prior test data: no further information used
Other test data: dermoscopic images presented to observer subsequent to diagnosis using clinical images alone
Diagnostic threshold: NR
Diagnosis based on: average; 61 participants (invited to participate in a study comparing dermoscopic algorithms; advertised at several medical meetings and on a Website for primary care physicians)
Observer qualifications: 10 dermatologists, 16 dermatology trainees, 35 GPs
Experience in practice: mixed. Participants (volunteers), quote: "had a range of experience levels with assessment of skin lesions [outlined in detail in the paper]... and a significant number were novices in dermoscopy”. Paper reports 82% of participants responded that they assessed at least 2‐4 PSLs per week. Participants were given explanatory written material and CDs containing educational material on dermoscopy and test images.
Dermoscopy: evaluated in same study based on dermoscopic images alone; pattern analysis; 7‐point checklist; ABCD; Menzies criteria
Target condition and reference standard(s) Reference standard: histological diagnosis plus other (1 lesion described as having no biopsy performed)
Histology (not further described). Disease positive: 20; disease negative: 19
Expert diagnosis: 1
Target condition (final diagnoses)
Melanoma (invasive): 18; lentigo maligna: 2
Benign naevus: 7 dysplastic naevi; 3 Spitz naevi; 3 junctional naevi; 2 compound naevi; 4 other (ink‐spot lentigo, blue naevus, solar lentigo, ephelis)
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
 Time interval between index test(s): NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? No    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    High High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard No    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low High
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? No    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC? Unclear    
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Dummer 1993.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: 12‐month period (year/dates NR)
Country: Germany
Patient characteristics and setting Inclusion criteria: patients with skin lesions difficult to diagnose clinically
Setting: specialist unit (skin cancer clinic/PLC)
Prior testing: clinical suspicion of malignancy without dermatoscopic suspicion
Setting for prior testing: specialist unit (skin cancer clinic/PLC). A type of specialist‐care‐dermatology‐based clinic
Exclusion criteria: patients who had excisions performed in individual practices or where there was no histology or cases that were so obvious they did not need to have further investigation (clearly benign)
Sample size (participants): NR
Sample size (lesions): number eligible: 824; number included: 771
Participant characteristics: NR
Lesion characteristics: NR
Index tests VI: no algorithm
Method of diagnosis: in person
Prior test data: in person
Other test data: dermoscopic images viewed separately
Diagnostic threshold: NR
Diagnosis based on: single observer; (n = 2 or 3)
 Observer qualifications: unclear; clinician based in dermatology clinic
Experience in practice: unclear
Experience with index test: unclear
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 23 MM; disease negative: 748 benign
Target condition (final diagnoses)
Invasive melanoma: 23
Benign naevus: 706; SK: 4; benign non‐melanocytic naevus: 32
Flow and timing Excluded participants: 53 NML not included in the final analysis (no melanomas present in this group)
Time interval to reference test: NR
 Time interval between index test(s): NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Ek 2005.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: January 2001‐December 2002
Country: Australia
Patient characteristics and setting Inclusion criteria: lesions excised at tertiary referral centre for the management of cancers; only those lesions in which malignancy could not be excluded were included
Setting: specialist unit (skin cancer clinic/PLC)
Prior testing: selected for excision (no further detail)
Setting for prior testing: specialist unit (skin cancer clinic/PLC)
Exclusion criteria: punch, shave or incisional biopsies and palliative excisions. Equivocal pathology report (n = 56)
Sample size (participants): number eligible: 1302; number included: 1223
Sample size (lesions): number eligible: 2678; number included: 2582
Participant characteristics: mean age: 73.6 years (16–102 years); male: 784 (64.1%); history of melanoma/skin cancer: 224 (8.7%) recurrent lesions
Lesion characteristics: head/neck: 61%; trunk: 14.4%; limbs: 24.6%
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: N/A, in‐person diagnosis
Diagnostic threshold: NR, pre‐operative diagnosis
Diagnosis based on: unclear; likely single (n = 5)
Observer qualifications: 3 consultants, a plastic surgery trainee and a clinical assistant
Experience in practice: mixed (low and high experience combined); plastic surgery trainee usually 1st year, on 6‐month rotation; clinical assistant described as having “many years of experience”
Other detail: some results are presented for consultant, senior registrar and registrar but underlying patient numbers are not provided per observer to allow separate 2x2 estimation. The discussion does describe the “six MM misdiagnosed as benign … as .. assessed by non‐consultants”.
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Target condition (final diagnoses)
Melanoma (in situ and invasive, or NR): 23
 BCC: 1214; cSCC: 517
'Benign' diagnoses: 188 (7.3%) SCC in situ (Bowen’s disease), 330 (12.8%) solar keratoses, 63 (2.4%) seborrhoeic keratoses, 247 (9.6%) other benign lesions
Flow and timing Excluded participants: lesions with incomplete or incorrectly entered proformas were excluded (n = 40).
Index to reference interval: consecutive; used pre‐operative clinical diagnosis of lesions undergoing biopsy
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? No    
Are the included patients and chosen study setting appropriate? Unclear    
Did the study avoid including participants with multiple lesions? No    
    High High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Yes    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Gachon 2005.

Study characteristics
Patient sampling Study design: CS (dermatologists were recruited and asked to use a standardised questionnaire form whenever they decided to remove a naevus or MM for any reason, e.g. suspicion of MM, aesthetics, comfort, prevention)
Data collection: prospective
Period of data collection: NR
Country: France
Patient characteristics and setting Inclusion criteria: melanocytic skin lesions removed for any reason (e.g. suspicion of melanoma, aesthetics, comfort, prevention) by volunteer dermatologists
Setting: secondary (general dermatology) and private care; mostly "community dermatologists working in a private setting, and only 2 were academic dermatologists"
Prior testing: clinical suspicion of malignancy without dermatoscopic suspicion/patient request for evaluation/excision; 1199 (29.7%) excised because they were considered suspicious by the dermatologist, and 869 (21.5%) because they were considered as precursors by the dermatologist; 1634 (40.7%) removed due to aesthetic or functional reasons, and 535 (13.3%) “only to reassure the patient"
Setting for prior testing: N/A
Exclusion criteria: none reported
Sample size (participants): NR
Sample size (lesions): number included: 4036
Participant characteristics: none reported
Lesion characteristics: 36 (24.1%) of 149 melanoma were in situ or other invasive lesions with a median Breslow thickness of 0.60 mm
Index tests VI: no algorithm. Accuracy presented only for clinician's first clinical impression of lesions; after recording likelihood of melanoma, assessments were made as to the contributions of pattern recognition, ABCD criteria and ugly duckling (differential recognition)
Method of diagnosis: in‐person diagnosis
Prior test data: N/A, in‐person diagnosis
Diagnostic threshold: 'considered suspicious' by dermatologist
Diagnosis based on: single observer; (n = 135 of 200 volunteers)
Observer qualifications: dermatologist
Experience in practice: not described; most were community dermatologists working in a private setting, and 2 were academic dermatologists
Experience with index test: not described
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 149; disease negative: 3887
Target condition (final diagnoses)
Melanoma (in situ and invasive, or NR): 149 (36 were in situ or other invasive lesions with a median Breslow thickness of 0.60 mm)
'Benign' diagnoses: 3629 naevi (89.9%); 4 uncertain MMs/naevi (0.1%); and 254 NML clinically considered to be naevi or MMs (6.3%)
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Green 1991.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: February 1989‐August 1990
Country: Australia
Patient characteristics and setting Inclusion criteria: pigmented lesions with complete clinical and histological data
Setting: secondary (referred from surgery, dermatology, casualty)
Prior testing: NR
Setting for prior testing: surgery, dermatology and casualty departments
Exclusion criteria: none reported
Sample size (participants): number eligible: 81; number included: unclear
Sample size (lesions): number eligible: 89; number included: 70
Participant characteristics: median age 32 years; male 36 (44%)
Lesion characteristics: site trunk: 80%; limbs: 10%; face and neck: 10%
Index tests VI: no algorithm
Method of diagnosis: in‐person
Prior test data: in‐person
Diagnostic threshold: NR, clinical diagnosis recorded plus assessment of diameter, colour, regularity of outline, diffuseness of edge and palpability
Diagnosis based on: single observer (n = NR)
Observer qualifications: mixed; "in the majority of cases a surgeon or a dermatologist"
Experience in practice: not described
Target condition and reference standard(s) Reference standard: histological diagnosis and expert diagnosis
Histology: 62/70 lesions
Expert diagnosis: 8/70 lesions; 8 lesions had clinical diagnoses assigned (all benign) in the absence of available histology reports
Target condition (final diagnoses)
Melanoma (in situ and invasive, or NR): 5
BCC: 2; SK: 7; benign naevus: 53; other: 2 skin tags, 1 'lentigo'
Flow and timing Excluded participants: 19/89 lesions excluded due to incomplete clinical and histology records.
Time interval to reference test: assumed consecutive; pathology referral form used to ascertain clinical diagnosis
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? No
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard No    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    High High
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Yes    
Did all patients receive the same reference standard? No    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Green 1994.

Study characteristics
Patient sampling Study design: CS
Data collection: NR; appears to use previously acquired images to develop a new CAD classifier (not included as derivation), and compare results to clinical diagnosis of clinicians as recorded in notes. Unclear whether set up prospectively or was retrospective assessment.
Period of data collection: August 1990‐April 1992
Country: Australia
Patient characteristics and setting Inclusion criteria: pigmented lesions for excision
Setting: secondary (Department of Surgery)
Prior testing: selected for excision (no further detail)
Setting for prior testing: NR
Exclusion criteria: none reported
Sample size (participants): number included: 129
Sample size (lesions): number eligible: 204; number included: 164
Participant characteristics: mean age 36 years, range 6‐87 years; male: 42.6%
Lesion characteristics: site face/neck: 10%, trunk: 66%, limbs: 24%
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: no further information used
Diagnostic threshold: NR; clinical diagnosis recorded plus assessment of diameter, colour, regularity of outline, diffuseness of edge and palpability (same as for Green 1991)
Diagnosis based on: single observer (n = NR)
Observer qualifications: NR
Experience in practice: not described
Target condition and reference standard(s) Reference standard: histology (not further described) 
 Disease positive: 18; disease negative: 146
Target condition (final diagnoses) 
 Melanoma (invasive): 18; melanoma (in situ): 3
128 MN; 15 miscellaneous pigmented lesions including seborrhoeic keratoses, BCCs, and lentigines
Flow and timing Excluded participants: 33 lesions excluded due to problems using the images with the CAD software, e.g. lesion "too big"; image "obscured by hairs or surgeons pen marks" or "software was unable to contend with the lesion characteristics, mainly because the lesion was too light or too fragmented" or "avoidable operator error"
Time interval to reference test: NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? No    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Grimaldi 2009.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: October 2005‐March 2006
Country: Italy
Patient characteristics and setting Inclusion criteria: cutaneous pigmented lesions with digital images forwarded by primary care physicians to a referral centre for confirmation of diagnosis
Setting: primary; lesions selected for referral by GPs; accuracy of GP diagnosis assessed
Prior testing: NR
Setting for prior testing: NR
Exclusion criteria: lesions whose removal had been explicitly demanded by the patients for aesthetic reasons, as well as those irritated or subjected to trauma
Sample size (participants): number included: 197
Sample size (lesions): number included: 235
Participant characteristics: none reported
Lesion characteristics: none reported
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: N/A, in‐person diagnosis
Other test data: "two‐step judgment (before and after dermoscopy) formulated by the sending physician, who labelled each lesion as ‘benign’ or ‘suspicious for malignancy’."
Diagnostic threshold: NR, quote, "Each physician was asked to formulate a written first judgment of every lesion before digital acquisition and to re‐evaluate it after dermoscopy"
Diagnosis based on: single observer; (n = 13)
Observer qualifications: GP; from approximately 250 primary care clinicians attending a conference, 13 volunteered to participate
Experience in practice: not clearly described; assumed to be low experience with pigmented lesions
Experience in dermoscopy: unclear; classified as 'trained', “simple protocols for diagnosis were made up and given to the participants via e‐learning courses, direct meetings, and involving self assessment procedures”
Dermoscopy: evaluated in same study; no algorithm (ABCD used for telediagnosis at reference centre)
Target condition and reference standard(s) Reference standard: histological diagnosis plus follow‐up (reference is expert diagnosis for teledermatology component of study)
Histology (not further described): n = 16; disease positive: 5; disease negative: 11
Clinical follow‐up (6 months) plus histology of suspicious lesions: n = 219; disease positive: 0; disease negative: 208
Target condition (final diagnoses)
Melanoma (in situ and invasive, or NR): 5
Other: 230 benign
Flow and timing Excluded participants: NR
 Time interval to reference test: NR
 Time interval between index test(s): NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? Yes    
Did the study avoid including participants with multiple lesions? No    
    Low High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? No    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? No
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    High Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? No    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC? Yes    
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Kopf 1975.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective
Period of data collection: 1955‐1967
Country: USA
Patient characteristics and setting Inclusion criteria: all lesions subject to biopsy at the Oncology Section of the Skin and Cancer Unit
Setting: specialist unit (skin cancer clinic/PLC)
Prior testing: NR
Setting for prior testing: NR
Exclusion criteria: none reported
Sample size (participants): number included: NR
Sample size (lesions): number included: 5538
Participant characteristics: none reported
Lesion characteristics: none reported
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: unclear
Diagnostic threshold: NR; clinical diagnosis
Diagnosis based on: single observer; in‐clinic diagnosis (n = NR)
Observer qualifications: oncologist
Experience in practice: not described
Experience with index test: not described
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 99; disease negative: 5439
Target condition (final diagnoses)
Melanoma (invasive): 99 (described as "malignant melanoma")
Diagnoses listed only for false‐positives; included: 3 pigmented BCC, 3 DFs, 2 junction naevi, 2 compound naevi, and 1 each of: Kaposi sarcoma, hemangioma, SK, leiomyoma, cellular blue nevus, sclerosing hemangioma, SCC, verrucous nevus, and intradermal nevus.
FNs included: 6 clinically diagnosed as pigmented BCC; 2 "other forms" of BCC; 3 junction naevi; 3 pyogenic granulomas; 2 compound naevi; 2 SCCs; 2 halo naevi; 1 Bowen disease; 1 SK; and 1 lentigo. 17 of these lesions were pigmented and 6 were not.
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Low High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Krahn 1998.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: NR
Country: Germany
Patient characteristics and setting Inclusion criteria: excised PSLs
Setting: secondary (general dermatology)
Prior testing: NR
Setting for prior testing: NR
Exclusion criteria: none reported
Sample size (participants): number included: 80
Sample size (lesions): number included: 80
Participant characteristics: none reported
Lesion characteristics: range in thickness (melanomas): 0.18‐1.9 mm; 29/39 < 0.76 mm; 7/39 0.76‐1.5 mm; 3/39 > 1.5 mm
Index tests VI: no algorithm reported
Method of diagnosis: in‐person diagnosis
Prior test data: unclear
Other test data: dermoscopy undertaken by same clinician(s) subsequent to clinical evaluation
Diagnostic threshold: NR; no details
Diagnosis based on: single observer (n = 1)
Observer qualifications: NR, likely dermatologist
Experience in practice: not described
Experience with index test: not described
Dermoscopy: evaluated in same study; no algorithm
Target condition and reference standard(s) Reference standard: histological diagnosis alone including histometrics
Disease positive: 39; disease negative: 41
Target condition (final diagnoses)
Melanoma (invasive): 39 (SSM, lentigo MM, nodular M)
Benign naevus: 37 common naevi; 3 dysplastic naevi; 1 Spitz naevus
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
 Time interval between index test(s): NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Yes    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Yes    
    Low Low
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Langley 2001.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: NR
Country: USA
Patient characteristics and setting Inclusion criteria: patients with lesions scheduled for excision at the PLC to either remove atypical naevi or to rule out melanoma or for cosmetic reasons
Setting: specialist unit (skin cancer clinic/PLC)
Prior testing: selected for excision; to remove atypical naevi or rule out melanoma or for cosmetic reasons
Setting for prior testing: NR
Exclusion criteria: none reported
Sample size (participants): number included: 29
Sample size (lesions): number eligible: 40; number included: 38
Participant characteristics: mean age 39 years, range 19‐95 years; male: 14 (48%)
Lesion characteristics: none reported
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: N/A, in‐person diagnosis
Diagnostic threshold: NR; clinical diagnosis
Diagnosis based on: unclear; likely in‐clinic diagnoses (n = NR)
Observer qualifications: NR, likely dermatologists
Experience in practice: not described
Experience with index test: not described
Target condition and reference standard(s) Reference standard: histological diagnosis plus other
Histology details (n = 38): "After excision, the samples were processed in paraffin and stained with H&E for routine light microscopy. Correlation was performed by examining the confocal images and the pathology sections to compare nuclear, cellular, and morphologic detail and to identify potential significance of the in vivo CSLM observations. For the histologic diagnosis of dysplastic naevi, we used the criteria that are defined in the World Health Organization consensus study."
Expert diagnosis (n = 2): 2 lesions did not undergo histology; expert diagnosis only (both benign)
Target condition (final diagnoses)
Melanoma (invasive): 3; melanoma (in situ): 1; lentigo maligna: 2
Dysplastic naevi: 17; benign naevus: 15
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Time interval between index test(s): NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? Unclear    
Did the study avoid including participants with multiple lesions? No    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Unclear    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard No    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low High
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Lorentzen 1999.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: 1994‐1997
Country: Denmark
Patient characteristics and setting Inclusion criteria: patients with lesions suspicious for CMM referred to outpatients clinic
Excluded participants: none reported
Time interval to reference test: NR
Setting: NR
Prior testing: clinical suspicion of malignancy without dermatoscopic suspicion
Setting for prior testing: NR
Exclusion criteria: poor‐quality index test image (considered under flow/timing)
Sample size (participants): number eligible: 242; number included: 232
Sample size (lesions): number eligible: 242; number included: 232*
Participant characteristics: none reported
Lesion characteristics: none reported
*NB: not all cases were assessed by all observers; 2x2 tables are based on presented sensitivity and specificity estimates for full dataset of lesions; "the dermatoscopy experts assessed almost all cases (98‐100%), whereas the non‐expert group completed fewer assessments, from 76 to 98%."
Index tests VI: no algorithm
Method of diagnosis: clinical photographs
Prior test data: no further information used; no option to change clinical diagnosis after viewing dermoscopic image
Other test data: dermoscopic images presented to observer subsequent to diagnosis using clinical images alone; clinical images presented before dermoscopic images
Diagnostic threshold: NR; clinical diagnosis
Diagnosis based on: average; n = 9
Observer qualifications: dermatologist
Experience in practice: high; moderate; mixed (average reported); 4 "experienced dermatologists" (4‐5 years daily experience) and 5 "non‐expert dermatology residents" (1‐2 years' interest and formal training in dermatoscopy)
Experience with index test: high; moderate; mixed
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 65; disease negative: 167
Target condition (final diagnoses)
 Melanoma (invasive): 49 "malignant melanoma"
 BCC: 16
SK: 12; benign naevus: 137 (pigmented naevi = 116; blue naevi = 16; atypical naevi = 5); Other: 18 (Spitz naevi, Bowen's disease, sarcoid, nevus spilus, hemangioma, and others)
Flow and timing Excluded participants: 10 cases were "considered unfit for evaluation" due to poor‐quality image
Reference interval: "biopsy specimens...were obtained after the clinical and dermatoscopic photographs had been performed"
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Yes    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

McGovern 1992.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: 1 November 1989‐31 October 1990
Country: USA
Patient characteristics and setting Inclusion criteria: pigmented lesions excised to rule out dysplasia, lentigo maligna or MM
Setting: secondary (general dermatology); army dermatology clinic ‐ appears to be open access
Prior testing: no prior testing. Multiple reasons given for seeking dermatological consultation, including (in descending order): increasing size, "mole check", inflammation, colour change, itch, follow‐up, variegation, cosmetic, referral, irregular border, seen for other lesion, unknown, large size
Setting for prior testing: N/A
Exclusion criteria: none reported
Sample size (participants): number eligible: 179; number included: NR
Sample size (lesions): number eligible: 237; number included: 13 lesions excluded and 32 lesions unaccounted for
Participant characteristics: mean age: 44 (SD 18); range: 3 months to 86 years; male: 89 (49%)
Lesion characteristics: lesion site: head/neck: 71 (30%); trunk: 52 (23%); upper limbs/shoulder: 22 (9%); lower limbs/hip: 33 (14%); back: 58 (24%); genitalia: 1 (0.4%)
Index tests VI: ABCD; assessed only 'BCD'; also referred to in paper as 3‐point checklist; Glasgow/MacKie original 7‐point checklist (Keefe 1990)
Method of diagnosis: in‐person diagnosis
Prior test data: unclear
Diagnostic threshold: described in detail; ABCD excluded asymmetry (one half does not match the other half)
Diagnosis based on: single observer; in‐clinic diagnoses used (n = NR)
Observer qualifications: NR, likely dermatologists
Experience in practice: not described
Any other details: border irregularity, edges are ragged, notched, or blurred; colour irregularity, pigmentation is not uniform, shades of tan, brown and black are present with dashes of red, white, or blue; diameter > 6 mm, the size of a pencil eraser
7‐point: increasing size, variegation, inflammation, irregular outline, > 1 cm diameter, itch, bleeding; 1 point awarded for each feature
Target condition and reference standard(s) Reference standard: histological diagnosis alone
 Details: shave excision = 109; punch biopsy = 64; excision = 47; snip biopsy = 17
 Disease positive: 16 lesions; disease negative: 221
Target condition (final diagnoses)
Melanoma (invasive): 6; lentigo maligna: 6; BCC: 4
Dysplastic naevus: 28; SK: 32; benign naevus: 110; lentigo: 12; blue naevus: 9; AK: 6; DF: 6; atypical naevus: 4; other: 14
Flow and timing Excluded participants: missing data for the different algorithms; approximately 32 lesions unaccounted for (13 excluded due to lesion size of ≤ 8 mm). ABCD evaluated = 192/224 lesions; 3‐point evaluated = 192/224 lesions; 7‐point evaluated = 205/224 lesions
Time interval to reference test: NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? Unclear    
Did the study avoid including participants with multiple lesions? No    
    Low High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? No    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others? Unclear    
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? Yes    
Was the test interpretation carried out by an experienced examiner? Unclear    
    High Unclear
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Yes    
    Low Low
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less? Unclear    
    High  

Menzies 2009.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: December 2005‐August 2006
Country: Australia
Patient characteristics and setting Inclusion criteria: pigmented lesions which, after routine naked eye examination by the GP, would have been biopsied or referred, i.e. a suspicious pigmented lesion. GPs were recruited from practices with at least 3 clinicians; excluded if they already used dermoscopy or SDDI in their routine practice.
Setting: primary
Prior testing: clinical suspicion of malignancy without dermatoscopic suspicion
Setting for prior testing: primary
Exclusion criteria: none reported
Sample size (participants): NR
Sample size (lesions): number included: 374
Participant characteristics: none reported
Lesion characteristics: none reported
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: N/A in‐person diagnosis
Other test data: clinical diagnosis was recorded and placed in a sealed envelope before proceeding to dermoscopy examination
Diagnostic threshold: NR; initial diagnosis recorded along with confidence of diagnosis (scale 1‐10; 1 not at all confident and 10 extremely confident), certainty of melanoma (scale 0%‐100%; 0 definitely not melanoma and 100 definitely melanoma) and management (biopsy, referral)
Diagnosis based on: single observer (n = 63; 102 GPs initially recruited; 74 (72.5%) completed the educational intervention and online assessment; 63 GPs from 19 practices finally participated)
Observer qualifications: GP
Experience in practice: not fully described; assumed to be low experience with pigmented lesions. GPs must have each excised or referred ≥ 10 PSL in previous 12‐month period; excluded if dermoscopy or SDDI already used in routine practice. During the pretrial period all GPs underwent a training programme in the use of dermoscopy.
Dermoscopy: evaluated in same study; no algorithm
Target condition and reference standard(s) Reference standard: histological diagnosis plus other
Histology (not further described): described as carried out according to standard practice and not necessarily blinded to the GP’s diagnosis; author confirmed that all melanomas had a histological diagnosis and > 50% of benign lesions had histology or follow‐up
Total excised or referred: 163. Immediate excision/referral: 110. Excision/referral after SDDI: 48. Excision/examination after patient self‐referral: 5
 Disease positive: 37; disease negative: total of 126 benign or unknown were 'excised OR referred' so some would have had specialist diagnosis only.
Clinical follow‐up plus histology of suspicious lesions: short‐term digital monitoring (SDDI) available as an option for lesions considered not to be melanoma but that were still considered suspicious; follow‐up imaging occurred initially at 3 months with any morphological changes to result in biopsy or referral; some lesions continued SDDI for a further 3 months; length of follow‐up: 3‐6 months
 Number of participants: initially recommended for SDDI: 192; SDDI continued for further 3 months: 6; Underwent SDDI only (no excision): 146
 Disease positive: 15 (SDDI then histologically confirmed); disease negative: 176 benign (including 1 missed in situ melanoma); 4 unknown
Expert opinion: GPs could refer for specialist opinion or lesions could undergo dermoscopy telemedicine (images reviewed by an expert in dermoscopy and SDDI). Dermoscopy telemedicine was blinded to the GP’s diagnosis.
 Observe for change group, i.e. discharged after dermoscopy: 72, plus a proportion of those in excise/refer group will have had expert diagnosis alone but details not given
 Disease positive: 0; disease negative: 71 benign; 1 unknown
Target condition (final diagnoses) 
 Melanoma (invasive): 33; melanoma (in situ): 1
 BCC: 6
Bowen's disease: 2; benign: 323; unknown: 9
Flow and timing Excluded participants: 9 lesions with unknown diagnoses, plus BCC and Bowen's excluded from some analyses
Time interval to reference test: NR; histopathological and specialist examination occurred according to standard practice
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? Yes    
Did the study avoid including participants with multiple lesions? Unclear    
    Low Unclear
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? No    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? No
Were the reference standard results interpreted without knowledge of the results of the index tests? No    
Expert opinion (with no histological confirmation) was not used as a reference standard No    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    High High
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? No    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC? Yes    
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  
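The lesion counts reported for Menzies 2009 can be cross‐checked with simple arithmetic. The sketch below is illustrative only and is not part of the review's methods: the variable names are hypothetical, and the figures are copied from the study characteristics listed above.

```python
# Cross-check of the lesion counts reported for Menzies 2009.
# All figures come from the table above; names are illustrative only.
final_diagnoses = {
    "melanoma_invasive": 33,
    "melanoma_in_situ": 1,
    "bcc": 6,
    "bowens_disease": 2,
    "benign": 323,
    "unknown": 9,
}
lesions_included = 374

excised_or_referred = {
    "immediate": 110,
    "after_sddi": 48,
    "after_patient_self_referral": 5,
}
total_excised_or_referred = 163

diagnoses_total = sum(final_diagnoses.values())
excision_total = sum(excised_or_referred.values())

# Both comparisons should hold if the reported counts are internally consistent.
print("Final diagnoses match lesions included:", diagnoses_total == lesions_included)
print("Excision/referral routes match total:", excision_total == total_excised_or_referred)
```

A check of this kind only confirms that the reported totals agree with each other; it says nothing about the accuracy of the underlying diagnoses.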

Morales Callaghan 2008.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: 1 January 2005‐31 December 2005
Country: Spain
Patient characteristics and setting Inclusion criteria: randomly selected melanocytic lesions; melanocytic on both clinical and dermoscopic criteria
Setting: secondary (general dermatology)
Prior testing: dermatoscopic suspicion in all cases
Setting for prior testing: NR
Exclusion criteria: location/site of lesion, palms, soles, mucous membranes of face, under nails; non‐melanocytic appearance
Sample size (participants): number included: 166
Sample size (lesions): number included: 200
Participant characteristics: mean age 33.7 years (SD 14.5), range 8‐84 years; male: 64 (38.6%); Fitzpatrick phototype II (44%); type III (41.5%)
Lesion characteristics: macular component = 181 (90.5%), papular component = 125 (62.5%), both = 106 (53%), either one or other = 94 (47%). Asymmetrical 144 (72%). Irregular borders 154 (77%). 4 colours in 40 (20%), 3 colours in 96 (48%), 2 colours in 57 (28.5%), 1 colour in 1 (0.5%). History of bleeding 7 (3.5%). Changes reported by participant 154 (77%). Lesion site: trunk 155 (77.5%), including the back in 106 (53%). Lesion size: mean long axis diameter 7.9 mm (SD 8.6 mm), mean short axis diameter 5.1 mm (SD 5 mm)
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: clinical examination and/or case notes
Other test data: appears that dermoscopy was undertaken by same clinician(s) subsequent to clinical evaluation; clinical history was constructed following a standardised protocol and a presumptive clinical diagnosis recorded. Each lesion was then photographed and immediately afterwards examined using a manual dermatoscope
Diagnostic threshold: NR; presumptive clinical diagnosis
Diagnosis based on: consensus (n = 2)
Observer qualifications: dermatologist
Experience in practice: not clearly described; assumed to be high ‐ “both dermatologists had experience in dermoscopy.”
Dermoscopy: evaluated in same study; pattern analysis
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Details: lesions described using terminology proposed by US National Institutes of Health
 Disease positive: 6/6 lesions; disease negative: 194/194 lesions (assuming the 9 'other' diagnosis lesions were not malignant), or 185/185 (removing the 9 'other' diagnosis lesions from dataset)
Target condition (final diagnoses)
Melanoma (in situ and invasive, or NR): 6 (3%)
Other: atypical mole = 104, common mole = 70, congenital naevus = 6, blue naevus = 3, Spitz/Reed naevus = 1, spilus naevus = 1, others (unclear whether benign or malignant) = 9
Flow and timing Exclusions: none reported
Time interval to reference test: "Samples for histologic analysis were taken immediately after clinical and dermoscopic examination"
Time interval between index test(s): images taken at same time
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? No    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? No    
    High High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Yes    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Low  

Morton 1998a.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective
Period of data collection: 1992‐1994
Country: Scotland
Patient characteristics and setting Inclusion criteria: all biopsies generated at PLC during time period
Setting: specialist unit (skin cancer clinic/PLC)
Prior testing: NR
Setting for prior testing: N/A
Exclusion criteria: none reported
Sample size (participants): number eligible: 1999
Sample size (lesions): 763 lesions examined by 1 of 2 consultants
Participant characteristics: none reported
Lesion characteristics: none reported
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: N/A, in‐person diagnosis referred to as "clinical diagnosis"; no dermoscopy used
Diagnostic threshold: NR; clinical diagnosis
Diagnosis based on: single observer, with average data presented (n = 10 in total)
Observer qualifications: 2 consultant dermatologists
Experience in practice: high (2 consultants each with > 10 years' experience in dermatology)
Any other detail: data from same study for senior registrars and registrars presented in Morton 1998b and Morton 1998c
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Target condition (final diagnoses; for full sample of 1999 biopsies)
Melanoma (invasive): 102 (82 SSM, 11 nodular melanoma, 4 partially regressed, 2 acral lentiginous, 2 metastatic CM deposits, 1 desmoplastic melanoma); melanoma (in situ): 24; lentigo maligna: 2
Benign: 1871 (breakdown by lesion type NR)
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Time interval between index test(s): N/A
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Yes    
    Low Low
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Morton 1998b.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective
Period of data collection: 1992‐1994
Country: Scotland
Patient characteristics and setting Inclusion criteria: all biopsies generated at PLC during time period
Setting: specialist unit (skin cancer clinic/PLC)
Prior testing: NR
Setting for prior testing: N/A
Exclusion criteria: none reported
Sample size (participants): number eligible: 1999
Sample size (lesions): 567 lesions examined by senior registrars
Participant characteristics: NR
Lesion characteristics: NR
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: N/A, in‐person diagnosis referred to as 'clinical diagnosis'; no dermoscopy used
Diagnostic threshold: NR; clinical diagnosis
Diagnosis based on: single observer, with average data presented (n = 10 in total)
Observer qualifications: 2 senior registrars
Experience in practice: moderate, 2 senior registrars each with 3‐5 years' experience
Any other detail: data from same study for consultants and for registrars presented in Morton 1998a and Morton 1998c
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Target condition (final diagnoses; for full sample of 1999 biopsies)
Melanoma (invasive): 102 (82 SSM, 11 nodular melanoma, 4 partially regressed, 2 acral lentiginous, 2 metastatic CM deposits, 1 desmoplastic melanoma); melanoma (in situ): 24; lentigo maligna: 2
Benign: 1871 (breakdown by lesion type NR)
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Time interval between index test(s): N/A
Comparative  
Notes The study by Morton et al is considered as a single study for quality assessment purposes (as per Morton 1998a) but as three studies (Morton 1998a; Morton 1998b; Morton 1998c) for the analyses due to the reporting of three separate cohorts of participants.
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate?      
Did the study avoid including participants with multiple lesions?      
       
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard?      
If a threshold was used, was it pre‐specified?      
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner?      
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication?      
Was the test interpretation carried out by an experienced examiner?      
       
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard?      
If a threshold was used, was it pre‐specified?      
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner?      
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication?      
Was the test interpretation carried out by an experienced examiner?      
       
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition?
Were the reference standard results interpreted without knowledge of the results of the index tests?      
Expert opinion (with no histological confirmation) was not used as a reference standard      
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist?      
       
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard?      
Did all patients receive the same reference standard?      
Were all patients included in the analysis?      
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
       

Morton 1998c.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective
Period of data collection: 1992‐1994
Country: Scotland
Patient characteristics and setting Inclusion criteria: all biopsies generated at PLC during time period
Setting: specialist unit (skin cancer clinic/PLC)
Prior testing: NR
Setting for prior testing: N/A
Exclusion criteria: NR
Sample size (participants): number eligible: 1999
Sample size (lesions): 669 lesions examined by registrars
Participant characteristics: NR
Lesion characteristics: NR
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: N/A, in‐person diagnosis referred to as 'clinical diagnosis'; no dermoscopy used
Diagnostic threshold: NR; clinical diagnosis
Diagnosis based on: single observer, with average data presented (n = 10 in total)
Observer qualifications: registrars
Experience in practice: low, 6 rotating registrars each with 1‐2 years' experience
Any other detail: data from same study for consultants and for senior registrars presented in Morton 1998a and Morton 1998b
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Target condition (final diagnoses; for full sample of 1999 biopsies)
Melanoma (invasive): 102 (82 SSM, 11 nodular melanoma, 4 partially regressed, 2 acral lentiginous, 2 metastatic CM deposits, 1 desmoplastic melanoma); melanoma (in situ): 24; lentigo maligna: 2
Benign: 1871 (breakdown by lesion type NR)
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Time interval between index test(s): N/A
Comparative  
Notes The study by Morton et al is considered as a single study for quality assessment purposes (as per Morton 1998a) but as three studies (Morton 1998a; Morton 1998b; Morton 1998c) for the analyses due to the reporting of three separate cohorts of participants.
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate?      
Did the study avoid including participants with multiple lesions?      
       
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard?      
If a threshold was used, was it pre‐specified?      
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner?      
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication?      
Was the test interpretation carried out by an experienced examiner?      
       
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard?      
If a threshold was used, was it pre‐specified?      
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner?      
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication?      
Was the test interpretation carried out by an experienced examiner?      
       
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition?
Were the reference standard results interpreted without knowledge of the results of the index tests?      
Expert opinion (with no histological confirmation) was not used as a reference standard      
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist?      
       
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard?      
Did all patients receive the same reference standard?      
Were all patients included in the analysis?      
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
       
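The three Morton entries (1998a, 1998b, 1998c) report separate cohorts drawn from the same set of 1999 biopsies. The sketch below is a hedged consistency check on the figures listed above, not part of the review's methods: the variable names are hypothetical, and it assumes that the "number eligible: 1999" figure counts biopsied lesions.

```python
# Consistency check across the three Morton 1998 cohorts.
# Figures are copied from the tables above; names are illustrative only.
lesions_by_cohort = {
    "Morton 1998a (consultants)": 763,
    "Morton 1998b (senior registrars)": 567,
    "Morton 1998c (registrars)": 669,
}
total_biopsies = 1999  # assumed to count biopsied lesions

final_diagnoses = {
    "melanoma_invasive": 102,
    "melanoma_in_situ": 24,
    "lentigo_maligna": 2,
    "benign": 1871,
}

invasive_subtypes = {
    "ssm": 82,
    "nodular": 11,
    "partially_regressed": 4,
    "acral_lentiginous": 2,
    "metastatic_cm_deposits": 2,
    "desmoplastic": 1,
}

cohort_total = sum(lesions_by_cohort.values())
diagnosis_total = sum(final_diagnoses.values())
subtype_total = sum(invasive_subtypes.values())

print("Cohorts sum to total biopsies:", cohort_total == total_biopsies)
print("Final diagnoses sum to total biopsies:", diagnosis_total == total_biopsies)
print("Subtypes sum to invasive melanomas:",
      subtype_total == final_diagnoses["melanoma_invasive"])
```

Because the final diagnoses are reported only for the full sample, this check cannot show how the melanomas were distributed across the three observer groups.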

Pizzichetta 2004.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective image selection/prospective interpretation
Period of data collection: January 1996‐December 2001
Country: participants recruited from 5 participating centres (4 in Italy and 1 in USA); study conducted in Italy
Patient characteristics and setting Inclusion criteria: clinical and/or dermoscopic hypomelanotic (extent of pigmentation ≤ 30%) and amelanotic skin lesions seen and excised at the 5 participating centres
Setting: secondary (general dermatology)
Prior testing: clinical and/or dermatoscopic suspicion
Setting for prior testing: NR
Exclusion criteria: poor‐quality or unavailable index test image (considered under 'Flow and timing')
Sample size (participants): number included: 151
Sample size (lesions): number eligible: 174; number included: 151
Participant characteristics: mean age 47 years (± 17.5 SD); male: 73 (48%)
Lesion characteristics: lesion site: head/neck (5.3%); trunk (20.5%); upper limbs/shoulder (11.9%); lower limbs/hip (25.2%); back (21.2%); abdomen (11.3%); hand (3.3%); foot (1.3%). Melanoma thickness: ≤ 1 mm 85.3% (n = 29); > 1 mm 14.7% (n = 5)
Index tests VI: no algorithm
Method of diagnosis: clinical photographs
Prior test data: only the gender, age at diagnosis and the site of the skin lesion were known to the observer
Other test data: file contained clinical and dermoscopic images; unclear whether both observed at the same time
Diagnostic threshold: investigated clinical features such as elevation, ulceration, shape, borders, colour
Diagnosis based on: single observer (n = 1)
Observer qualifications: NR, likely dermatologist
Experience in practice: not described
Experience with index test: not described
Dermoscopy: evaluated in same study; pattern analysis
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Target condition (final diagnoses)
Melanoma (invasive): 34 (39 in full sample); melanoma (in situ): 5
Other diagnoses reported only for full sample of 151 (only 108 with clinical images for VI evaluation):
55 (40 with clinical images) "amelanotic ⁄ hypomelanotic non melanocytic lesions" (25 BCC, 4 SCC, 10 DF, 8 Bowen’s disease, 8 SK)
52 (29 with clinical images) "amelanotic ⁄ hypomelanotic benign melanocytic lesions" (24 compound naevi, 17 dermal naevi, 5 Spitz naevi, 4 congenital naevi and 2 combined naevi)
Flow and timing Excluded participants: 23 lesions excluded due to image quality; further 43 lesions were not available for evaluation by clinical images ("mainly benign melanocytic lesions").
Time interval to reference test: NR
Time interval between index test(s): NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Unclear    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Yes    
    Unclear High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Rao 1997.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective image selection/prospective interpretation
Period of data collection: NR
Country: USA
Patient characteristics and setting Inclusion criteria: patients with atypical melanocytic lesions or suspected early MM
Setting: private care
Prior testing: selected for excision (no further detail)
Setting for prior testing: private care
Exclusion criteria: lesions > 13 mm in diameter were excluded as they could not fit entirely within the standardised photographs
Sample size (participants): number included: 63
Sample size (lesions): number included: 72
Participant characteristics: none reported
Lesion characteristics: melanoma thickness ≤ 1 mm: 100% of MM (n = 21)
Index tests VI: ABCD
Method of diagnosis: clinical photographs
Prior test data: unclear
Other test data: dermoscopic images also presented to observer but unclear whether both viewed at the same time or not; "Each color transparency was independently analyzed" by observers. The 1) clinical, 2) "overall" dermoscopic, and 3) ABCD "scored dermoscopic diagnoses" of either MM or AMN were recorded for each lesion by the same observers. No indication of blinding between images
Diagnostic threshold: clinical variables were defined as follows: asymmetry (A): both silhouette and colour distribution were considered. Border irregularity (B): this was judged by the unevenness of the perimeter. Colour (C): colour variegation and number of colours were evaluated. Diameter (D): the largest in situ diameter in mm of each lesion was recorded
Diagnosis based on: single observer (n = 4)
Observer qualifications: 2 experienced dermatologists, and 2 melanoma fellows
Experience in practice: mixed experience (low and high experience combined)
Experience with index test: NR
Dermoscopy: evaluated in same study; ABCD and no algorithm
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Details: each of the 72 melanocytic neoplasms was histopathologically diagnosed as an AMN or an early MM by a dermatopathologist with special expertise in melanocytic neoplasms. Each lesion was completely excised and step‐sectioned.
 Disease positive: 21 MMs; disease negative: 51 AMN
Target condition (final diagnoses) 
 Melanoma (invasive): 21
51 AMN
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Time interval between index test(s): NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? No    
    Unclear High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Yes    
    Low Low
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Yes    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Low  

Rosendahl 2011.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective image selection/prospective interpretation
Period of data collection: 30‐month period; dates NR
Country: Australia
Patient characteristics and setting Inclusion criteria: consecutive series of pigmented lesions submitted for histology from the primary care skin cancer practice of one study author.
Setting: primary care skin cancer practice
Prior testing: selected for excision (no further detail)
Setting for prior testing: primary
Exclusion criteria: poor image quality (considered under 'Flow and timing'); no other exclusion criteria reported
Sample size (participants): number included: 389
Sample size (lesions): number eligible: 466 pigmented lesions out of 1959 lesions excised or biopsied; number included: 463
Participant characteristics: mean age: 57 years (SD 17); male: 67.4%
Lesion characteristics: 53.1% melanocytic. Lesion site: head or face 17.7%; trunk 52.1%; extremities 27.6%; palms or soles 2.2%. Melanoma thickness ≤ 1 mm: 1/29 melanomas (3.4%)
Index tests VI: no algorithm
Method of diagnosis: clinical photographs; overview and close‐up images presented
Prior test data: no further information used
Other test data: dermoscopic images presented to observer subsequent to diagnosis using clinical images alone.
Diagnostic threshold: clinical diagnosis/subjective impression. Observers gave a diagnosis with level of confidence (from 0 for definitely benign to 100 for definitely malignant) after viewing the clinical images. (NB used study authors' threshold for detection of any skin cancer that includes lesions clinically considered to be MM, BCC pigmented epithelial carcinoma including SCC, keratoacanthoma, AK and Bowen's disease as test‐positive; review only considered histologically confirmed MM, BCC or invasive SCC to be disease‐positive)
Diagnosis based on: single observer (n = NR)
Observer qualifications: expert dermatologist (based on author communication)
Experience in practice: expert
Experience with dermoscopy: expert
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Details: excision or biopsy
 Disease positive: 138; disease negative: 325
Target condition (final diagnoses)
Melanoma (invasive): 9; melanoma (in situ): 20; BCC: 72; cSCC: 5 (including 2 keratoacanthoma)
'Benign' diagnoses*: 18 Bowen's disease and 14 AK, 217 benign melanocytic plus additional 140 benign non melanocytic
*authors considered Bowen's disease, AK and keratoacanthoma as malignant; all considered benign for review analysis
Flow and timing Excluded participants: lesions were excluded due to poor image quality (n = 3)
Time interval to reference test: unclear; lesions "routinely photographed" if scheduled for excision or biopsy but not further described
Time interval between index test(s): consecutive
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? Yes    
Did the study avoid including participants with multiple lesions? No    
    Low High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Scope 2008.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective image selection/prospective interpretation
Period of data collection: after January 2003
Country: NR
Patient characteristics and setting Inclusion criteria: images of PSLs selected from a database of standardised patient images provided by a New Zealand‐based teledermatology company (MoleMap). Images were selected on the basis that (1) at least 8 clinically atypical naevi were apparent on the back; (2) most of the lesions on the back and all of the atypical naevi had close‐up clinical digital images; (3) 1‐year follow‐up images (close‐up clinical and dermoscopic images) were available to show that lesions considered to be benign were in fact biologically indolent by revealing no change; and (4) the image quality of both the overview and the close‐up images was acceptable
Setting: New Zealand‐based teledermatology company; images were sent electronically to participants as a PowerPoint file.
Prior testing: NR
Setting for prior testing: unspecified
Exclusion criteria: poor‐quality index test image (considered under 'Flow and timing'); naevi on any body site except the back
Sample size (participants): number eligible: 12; number included: 12
Sample size (lesions): number eligible: 145; number included: 145
Participant characteristics: none reported
Lesion characteristics: none reported
Index tests VI: ugly duckling
Method of diagnosis: clinical photographs
Prior test data: no further information used
Diagnostic threshold: for each lesion that was deemed as different, the participants had to mark the lesion number on the form, identify it as either completely different or somewhat different from the other moles, give a short qualitative description of how the lesion differed, and report whether they would like to have a biopsy performed on the lesion
Diagnosis based on: average (n = 34)
Observer qualifications: 4 subgroups in terms of clinical expertise: group 1, pigmented lesion experts (n = 8); group 2, dermatologists who were considered non‐experts in pigmented lesion evaluation (n = 13); group 3, dermatology nurses (n = 5, including 1 dermatology medical photographer); and group 4, non‐clinical medical staff (n = 8)
Experience in practice: mixed experience (low and high experience combined)
Other detail: the study was sent electronically to participants as a PowerPoint file (Microsoft Corp, Redmond, Washington) that contained the clinical image interface and a Word document that contained questionnaire and response forms. The participants were not shown dermoscopic images. However, dermoscopic images of lesions (with a 1‐year follow‐up dermoscopic image) were available to the investigators to verify that lesions considered benign did not show dermoscopic features suggestive of malignancy, and the 1‐year follow‐up images confirmed that the lesions were in fact biologically indolent by revealing no change.
Target condition and reference standard(s) Reference standard: histological diagnosis plus follow‐up
Details: unclear; all MMs were excised with histological confirmation and all benign lesions had 1‐year follow‐up images (close‐up clinical and dermoscopic images) to show that lesions considered to be benign were in fact biologically indolent by revealing no change; it is not clear whether any of the benign group were excised
Target condition (final diagnoses)
Melanoma (invasive): 5 "malignant melanoma"
Benign naevus: 140
Flow and timing Excluded participants: excluded if unacceptable image quality of both the overview and the close‐up images
Time interval to reference test: NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? Unclear    
Did the study avoid including participants with multiple lesions? No    
    Unclear High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? No    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? Yes    
Was the test interpretation carried out by an experienced examiner? Yes    
    High High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Unclear    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Unclear    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Unclear Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? No    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC? Yes    
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Soyer 1995.

Study characteristics
Patient sampling Study design: CS
Data collection: unclear
Period of data collection: NR
Country: Austria
Patient characteristics and setting Inclusion criteria: PSL, difficult to diagnose on clinical grounds alone
Setting: specialist unit (skin cancer clinic/PLC)
Prior testing: clinical suspicion
Setting for prior testing: secondary (general dermatology); referred by dermatologists or general physicians
Exclusion criteria: none reported
Sample size (participants): NR
Sample size (lesions): number included: 159
Participant characteristics: none reported
Lesion characteristics: "23 melanomas with a Breslow index of ≤ 0.75 mm, 13 melanomas with a Breslow index ≥ 0.76 mm and ≤ 1.5 mm, 12 melanomas with a Breslow index ≥ 1.51 mm and ≤ 3.5 mm, 2 melanomas with a Breslow index of ≥ 3.5 mm."
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: N/A in‐person diagnosis
Other test data: dermoscopy undertaken by same clinician(s) subsequent to clinical evaluation
Diagnostic threshold: NR
Diagnosis based on: n = 2 (1 or 2 per lesion)
Observer qualifications: dermatologist
Experience in practice: not clearly described; assumed to be high; “Each lesion was examined clinically by .. one of the authors .. and a clinical diagnosis was recorded.” “After application of a drop of immersion oil, each lesion was examined dermoscopically …; the examination was performed by a dermatologist expert in dermoscopy and a dermoscopic diagnosis was recorded”
Experience with index test: not described
Other detail: "Photographic documentation was performed using an incident light stereomicroscope (Wild M 650) equipped with a Minolta XG‐M camera"
Dermoscopy: evaluated in same study; pattern analysis
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 65 (41%); disease negative: 94 (59%)
Target condition (final diagnoses)
Melanoma (invasive): 50; melanoma (in situ): 15
 BCC: pigmented BCC (3)
SK: 18; Clark's naevus or dysplastic naevus (61 cases); lentigo actinica lentigo (2), pigmented AK (4), angioma (3), angiokeratoma (2)
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? Unclear    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear Unclear
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Stanganelli 1998a.

Study characteristics
Patient sampling Study design: CCS
Data collection: retrospective image selection/prospective interpretation
Period of data collection: 1997 (no further detail reported)
Country: Italy
Patient characteristics and setting Inclusion criteria: images of PSLs selected from computerised files of the skin cancer clinic
Setting: training study; images selected from skin cancer clinic
Prior testing: NR
Setting for prior testing: unspecified
Exclusion criteria: none reported
Sample size (participants): NR
Sample size (lesions): number included: 30 PSLs
Participant characteristics: none reported
Lesion characteristics: none reported
Index tests VI: no algorithm
Method of diagnosis: clinical photographs
Prior test data: no further information used
Other test data: dermoscopic images presented to observer subsequent to diagnosis using clinical images alone (images were randomised)
Diagnostic threshold: NR
Diagnosis based on: average; n = 20
Observer qualifications: dermatologist
Experience in practice: not described; 30 dermatologists with “experience in ELM but [with] no formal training” attended a seminar on clinical and ELM diagnosis of PSL; 20 then participated in a test of their diagnostic accuracy. A second session on ELM was then held.
Other detail: the observers received a 2‐h seminar on the principles of clinical diagnosis of NMLs, BCC, MN and MM. The participants were then invited to undergo an anonymous test of their diagnostic accuracy.
Dermoscopy: evaluated in same study; no algorithm
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Target condition (final diagnoses)
Melanoma (in situ and invasive, or NR): 10
 BCC: 4
Mild/moderate dysplasia: 3; SK: 3; benign naevus (MN): 7
Other: 1 haemangioma, 1 subungual haemorrhage, 1 plantar intraepidermal haemorrhage
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? No    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    High High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Stanganelli 2000.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective
Period of data collection: 1994‐1996
Country: Italy
Patient characteristics and setting Inclusion criteria: patients with PSLs referred by dermatologists and general practitioners either for pre‐surgical assessment or consultation
Setting: specialist unit (skin cancer clinic/PLC)
Prior testing: patients referred for pre‐surgical assessment or consultation indicating they have had prior tests
Setting for prior testing: primary, some patients referred for consultation only; dermoscopy findings reported back and management decision remains with referring clinician; secondary (general dermatology)
Exclusion criteria: none reported
Sample size (participants): number eligible: 1556
Sample size (lesions): number eligible: 3372; number included: 3372
Participant characteristics: median age 30 years, range 10‐94; male: 522 (34%)
Lesion characteristics: none reported
Index tests VI: ABCD
Method of diagnosis: in‐person diagnosis
Prior test data: N/A, in‐person diagnosis
Other test data: dermoscopic and clinical images presented separately to observer subsequent to diagnosis using clinical images alone.
Diagnostic threshold: NR
Diagnosis based on: single observer; n = 1
Observer qualifications: NR; described as one of the co‐authors and study based in skin cancer clinic; likely dermatologist
Experience in practice: not described
Other detail: a crude clinical image (magnification ×6 and ×10) was recorded in the digital database
Dermoscopy: evaluated in same study (image‐based); pattern analysis
Target condition and reference standard(s) Reference standard: histological diagnosis plus follow‐up; histology report of known surgical excisions (n = 262) plus a cancer registry‐based follow‐up of benign cases (n = 3110)
Target condition (final diagnoses)
Melanoma (in situ and invasive, or NR): 55; BCC: 43
'Benign' diagnoses: 3274
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Time interval between index test(s): not clearly reported; only indicated that D‐ELM was performed soon after clinical examination
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? Yes    
Did the study avoid including participants with multiple lesions? No    
    Low High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Yes    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? Yes    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Low Unclear
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? No    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    High Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? No    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC? Yes    
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Stanganelli 2005.

Study characteristics
Patient sampling Study design: unclear (likely CS)
Data collection: retrospective image selection/prospective interpretation
Period of data collection: NR
Country: Italy
Test set derived: a training set of 22 melanomas and 218 MN was randomised from the dataset. The test set was formed by the complement (the remaining 20 melanomas and 217 naevi). A further subset of images from the original dataset, consisting of 31 melanomas and 103 naevi, was used for the comparison between observers and CAD; derivation of the subset NR.
Patient characteristics and setting Inclusion criteria: melanocytic lesions from patients referred to the Skin Cancer Unit and undergoing clinical and dermoscopic evaluation; images were 'selected' from a larger image database. Potential overlap with Stanganelli 2000 (not possible to determine).
Setting: specialist unit (skin cancer clinic/PLC)
Prior testing: clinical and/or dermatoscopic suspicion
Setting for prior testing: specialist unit (skin cancer clinic/PLC)
Exclusion criteria: none reported
Sample size (participants): number eligible: 1556. Referred/number included: NR
Sample size (lesions): number eligible: 3274. Number included: 477 melanocytic lesions; 237 in test set and 134 in comparison between CAD and human operators
Participant characteristics: none reported
Lesion characteristics: melanoma thickness 61.2% < 0.75 mm
Index tests VI: no algorithm
Method of diagnosis: clinical photographs
Prior test data: GPs evaluated only clinical images; unclear for dermatologists
Other test data: dermatologists examined both clinical and dermoscopic images but unclear whether clinical diagnosis was made prior to presentation of dermoscopic images
Diagnostic threshold: NR
Diagnosis based on: average (n = 6)
Observer qualifications: GP 3; dermatologist 3
Experience in practice: assumed low for GPs; high for dermatologists. Described as “dermatologists with experience in ELM (2 years)”
Other detail: digital images included melanocytic lesions evaluated in ELM with a fixed x16 magnification
Dermoscopy: evaluated in same study; no algorithm
Target condition and reference standard(s) Reference standard: histological diagnosis plus cancer registry
All included lesions underwent histology but some were identified using a cancer registry‐based follow‐up of benign diagnoses.
Target condition (final diagnoses)
Melanoma (in situ and invasive, or NR): 42 in full sample; 31 in CAD vs human observer comparison and 20 in test set
'Benign' diagnoses: 435 MN; 103 in CAD vs human observer comparison and 217 in test set
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Unclear    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Unclear    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? No    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Steiner 1987.

Study characteristics
Patient sampling Study design: CS
Data collection: prospective
Period of data collection: not specified
Country: Austria
Patient characteristics and setting Inclusion criteria: small (< 10 mm) PSLs considered diagnostically equivocal in that there was no absolute agreement on the clinical diagnosis among investigating clinicians at a PLC.
Setting: specialist unit (skin cancer clinic/PLC)
Prior testing: clinical suspicion of malignancy without dermatoscopic suspicion
Setting for prior testing: specialist unit (skin cancer clinic/PLC)
Exclusion criteria: none reported
Sample size (participants): NR
Sample size (lesions): 318
Participant characteristics: none reported
Lesion characteristics: none reported
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: N/A
Other test data: dermoscopy undertaken by same clinician(s) subsequent to clinical evaluation
Diagnostic threshold: NR
Diagnosis based on: consensus (3 observers) "All lesions were independently seen and diagnosed by the three investigators, and the diagnosis that appeared most probable to at least two of the three investigators was recorded as the clinical"; n = 3
Observer qualifications: dermatologist
Experience in practice: high experience or ‘Expert’ "experienced dermatologists"
Experience with index test: "experienced dermatologists"
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 73 melanomas, 20 BCCs; disease negative: 225
Target condition (final diagnoses)
Melanoma (invasive): 49; melanoma (in situ): 15; lentigo maligna: 9 (also includes lentigo maligna melanoma)
BCC: 20
SK: 20; junctional naevi 39; blue naevus 29; dysplastic naevus 75; LS and nevoid lentigo 19; angioma/angiokeratoma 15
Flow and timing Excluded participants: none reported
Time interval to reference test: assumed consecutive; following diagnosis, lesions subsequently excised
Time interval between index test(s): consecutive
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Yes    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Low  
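
The consensus rule described for Steiner 1987 (the diagnosis favoured by at least two of the three investigators was recorded) can be illustrated with a minimal Python sketch. The function name `consensus_diagnosis` and the example labels are hypothetical and are not taken from the study. The source does not say how a three‐way disagreement was handled, so returning `None` in that case is an assumption.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_diagnosis(diagnoses: Sequence[str], min_agree: int = 2) -> Optional[str]:
    """Return the diagnosis given by at least `min_agree` observers.

    Returns None when no diagnosis reaches the required agreement
    (an assumption; the study does not describe this case).
    """
    if not diagnoses:
        return None
    # most_common(1) yields the single most frequent label and its count
    label, count = Counter(diagnoses).most_common(1)[0]
    return label if count >= min_agree else None


# Hypothetical readings from three independent observers
print(consensus_diagnosis(["melanoma", "naevus", "melanoma"]))
print(consensus_diagnosis(["melanoma", "naevus", "BCC"]))
```

With three observers and a minimum agreement of two, at most one diagnosis can qualify, so the result is unambiguous whenever it is not `None`.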

Thomas 1998.

Study characteristics
Patient sampling Study design: CCS; separate recruitment
Data collection: retrospective
Period of data collection: NR; appears to be post‐1992
Country: France
Patient characteristics and setting Inclusion criteria: retrospective selection of all 460 cases of melanoma and a nonselected consecutive group of 680 nonmelanoma pigmented tumours
Setting: secondary (general dermatology)
Prior testing: selected for excision (no further detail). All excised
Setting for prior testing: NR
Exclusion criteria: NR
Sample size (participants): NR
Sample size (lesions): number included: 1140
Participant characteristics: NR
Lesion characteristics:
Other test data: dermoscopy undertaken by same clinician(s) subsequent to clinical evaluation
Index tests VI: ABCDE
Method of diagnosis: in‐person diagnosis; dermatologist making referral for excision made the diagnosis
Prior test data: N/A in‐person diagnosis
Diagnostic threshold: number of characteristics present (from ≥ 1 to all 5)
Diagnosis based on: single observer; n = NR
Observer qualifications: dermatologist
Experience in practice: assumed to be high; described as 'trained' dermatologists
Other detail: preliminary meeting held to precisely define each criterion, agree on the significance of each abnormality and define the appropriate way to fill in the study form. ABCDE: criterion A was defined as geometrical asymmetry in two axes of the tumour, criterion B as irregular (unsharp or ill‐defined or angular) borders, criterion C as presence of at least 2 different colours within the lesion (with the exception of the usual symmetrical darkening of the lesion in its centre), criterion D as diameter ≥ 6 mm. Criterion E, the only anamnestic (based on the patient’s description of the natural history of the lesion) criterion was defined as enlargement of the surface (and not in height) of the lesion.
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 460; disease negative: 680
Target condition (final diagnoses)
Melanoma (in situ and invasive, or NR): 460
 BCC: 8
SK: 19; 576 benign pigmented naevus; 55 dysplastic naevi; 4 blue naevi; 2 compound naevi with Sutton inflammatory infiltrate; 2 Spitz; 1 Reed's naevi; 3 haemangiomas; 9 DFs; 1 accessory nipple
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? No    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    High High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? No    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? Yes    
Was the test interpretation carried out by an experienced examiner? Yes    
    High Low
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? No    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Yes    
    Low Low
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  
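
The ABCDE thresholding described for Thomas 1998 (test positivity defined by the number of criteria present, from ≥ 1 to all 5) can be sketched as follows. This is an illustrative sketch only: the class and function names are hypothetical, the criterion definitions are simplified from the 'Other detail' entry above, and the exception for symmetrical central darkening under criterion C is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbcdeFindings:
    asymmetry: bool         # A: geometrical asymmetry in two axes
    irregular_border: bool  # B: unsharp, ill-defined or angular borders
    colours: int            # number of different colours within the lesion
    diameter_mm: float      # lesion diameter in mm
    enlargement: bool       # E: patient-reported enlargement of the surface


def abcde_score(f: AbcdeFindings) -> int:
    """Count how many of the five criteria are present (0 to 5)."""
    return sum([
        f.asymmetry,
        f.irregular_border,
        f.colours >= 2,        # C: at least 2 different colours
        f.diameter_mm >= 6.0,  # D: diameter >= 6 mm
        f.enlargement,
    ])


def is_test_positive(f: AbcdeFindings, threshold: int) -> bool:
    """Test positive when at least `threshold` criteria are present."""
    if not 1 <= threshold <= 5:
        raise ValueError("threshold must be between 1 and 5")
    return abcde_score(f) >= threshold


# Hypothetical lesion: asymmetric, three colours, 5 mm, reported enlargement
lesion = AbcdeFindings(True, False, 3, 5.0, True)
for t in range(1, 6):
    print(t, is_test_positive(lesion, t))
```

Each threshold yields its own 2x2 table, so raising the threshold trades sensitivity for specificity on the same set of lesions.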

Troyanova 2003.

Study characteristics
Patient sampling Study design: CCS
Data collection: retrospective image selection/prospective interpretation
Period of data collection: NR
Country: NR
Patient characteristics and setting Inclusion criteria: Images of PSLs ≤ 13 mm in diameter selected for a dermoscopy training study
Setting: training study
Prior testing: NR
Setting for prior testing: NR
Exclusion criteria: NR
Sample size (participants): NR
Sample size (lesions): number included: 50 lesions
Participant characteristics: NR
Lesion characteristics: melanoma thickness: ≤ 1 mm: 100%
Index tests VI: no algorithm
Method of diagnosis: clinical photographs and dermoscopic images
Other test data: dermoscopic images presented to observer subsequent to diagnosis using clinical images alone.
Prior test data: no further information used
Diagnostic threshold: NR
Diagnosis based on: average; n = 32
Observer qualifications: dermatologist
Experience in practice: high experience or ‘Expert’
Experience with index test: low experience/novice users; experienced in PSL field but not ELM
Dermoscopy: evaluated in same study; no algorithm
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 25; disease negative: 25
Target condition (final diagnoses) 
 Melanoma (in situ and invasive, or NR): 25
'Benign' diagnoses: 25 "not melanoma"
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Time interval between index test(s): NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? No    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    High High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Unlu 2014.

Study characteristics
Patient sampling Study design: CS
Data collection: retrospective image selection/prospective interpretation
Period of data collection: January 2008‐January 2010
Country: Turkey
Patient characteristics and setting Inclusion criteria: melanocytic lesions excised at Ankara University Department of Dermatology PLC
Setting: specialist unit (skin cancer clinic/PLC) Ankara University Department of Dermatology PLC
Prior testing: selected for excision (no further detail)
Setting for prior testing: specialist unit (skin cancer/PLC)
Exclusion criteria: location/site of lesion: facial, nail and volar acral lesions were excluded; non‐melanocytic appearance
Sample size (participants): number included: 115
Sample size (lesions): number included: 115
Participant characteristics: mean age: 38.72 years (+/‐ 18.46 years); male: n = 56 (49%)
Lesion characteristics: lesion site: 100% trunk and limbs. Melanoma thickness: 10 (41.7%) < 0.75 mm; 14 (58.3%) ≥ 0.75 mm
Index tests VI: no algorithm; appears to be original clinical diagnosis at time of lesion presentation
Method of diagnosis: in‐person diagnosis. Appears to be diagnosis on presentation
Prior test data: N/A, in‐person diagnosis
Other test data: dermoscopic images presented to different observers
Diagnostic threshold: NR
Diagnosis based on: unclear. For VI appears to be single examiner at time of clinic diagnosis (n = NR); dermoscopic images "scored by three other experienced dermatoscopists" (n = 3)
Observer qualifications: NR; assumed dermatologists. Described as experienced dermatoscopists
Experience in practice: unclear for clinic diagnosis; dermatoscopists described as "experienced"
Experience with index test: described as "experienced"
Dermoscopy: evaluated in same study by 3 experienced dermoscopists; 3‐point rule; 7‐point checklist; ABCD; CASH algorithm (image‐based)
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 24; disease negative: 91
Target condition (final diagnoses) 
 Melanoma (in situ and invasive, or NR): 24
'Benign' diagnoses: 91 melanocytic benign lesions; 37 (32.2%) dermal naevi; 15 (13%) Clark's naevi; 14 (12.2%) compound naevi; 13 (11.3%) blue naevi; 6 (5.2%) Spitz naevi; 4 (3.5%) congenital MN; 2 (1.7%) junctional naevi
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Time interval between index test(s): appear to be consecutively applied
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? No    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Yes    
    High High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Yes    
    Low Low
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  
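
The counts reported for Unlu 2014 can be cross‐checked arithmetically. The short Python sketch below uses only figures from the table above; the variable names are illustrative. It confirms that the benign subtypes sum to the 91 disease‐negative lesions, that the subtype percentages are expressed relative to all 115 lesions, and that the thickness percentages are relative to the 24 melanomas.

```python
total_lesions = 115
melanomas = 24

# Benign subtype counts as listed under 'Benign' diagnoses
benign_subtypes = {
    "dermal naevi": 37,
    "Clark's naevi": 15,
    "compound naevi": 14,
    "blue naevi": 13,
    "Spitz naevi": 6,
    "congenital MN": 4,
    "junctional naevi": 2,
}

benign_total = sum(benign_subtypes.values())

# Subtype percentages use all included lesions as the denominator
subtype_pct = {
    name: round(100 * n / total_lesions, 1)
    for name, n in benign_subtypes.items()
}

# Thickness percentages use the melanomas as the denominator
thin_pct = round(100 * 10 / melanomas, 1)   # < 0.75 mm
thick_pct = round(100 * 14 / melanomas, 1)  # >= 0.75 mm

# Proportion of melanomas among the included lesions
prevalence_pct = round(100 * melanomas / total_lesions, 1)

print(benign_total, melanomas + benign_total)
print(subtype_pct)
print(thin_pct, thick_pct, prevalence_pct)
```

The melanoma proportion computed here (about 21% of included lesions) is not stated in the table; it is derived from the reported counts and reflects the selected, excised sample.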

Viglizzo 2004.

Study characteristics
Patient sampling Study design: CS
Data collection: NR
Period of data collection: NR
Country: Italy
Patient characteristics and setting Inclusion criteria: PSLs examined at the Dermoscopy Service and undergoing excision; a modified version of Kenet's risk stratification approach for dermoscopy (Ascierto 2000) was used to select high‐ and very high‐risk lesions for excision; medium‐ and low‐risk lesions were excised based on cosmetic or functional reasons. (We extracted 2x2 data for melanocytic subgroup only).
Setting: specialist unit (skin cancer clinic/PLC) dermoscopy service at a university department (Department of Endocrinologic and Metabolic Disease)
Prior testing: clinical suspicion of malignancy without dermatoscopic suspicion
Setting for prior testing: specialist unit (skin cancer clinic/PLC)
Exclusion criteria: none reported
Sample size (participants): number eligible: 349 patients; number included: NR
Sample size (lesions): number eligible: 520 lesions; number included: 79 lesions excised included in the final analysis
Participant characteristics: none reported
Lesion characteristics: none reported
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: unclear
Other test data: dermoscopy undertaken by same clinician(s) subsequent to clinical evaluation
Diagnostic threshold: NR; correct diagnosis of melanoma
Diagnosis based on: single observer (n = NR; "All dermoscopic evaluations were performed by the same operators")
Observer qualifications: NR; "each lesion was ... diagnosed clinically and dermoscopically" at the dermoscopy service
Experience in practice: not described
Experience with dermoscopy: not described; assumed high as diagnosis at 'Dermoscopy service'
Dermoscopy: evaluated in same study; no algorithm
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Target condition (final diagnoses) 
 Melanoma (invasive): 11; melanoma (in situ): 1
 Melanocytic lesion: 67
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    Unclear High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Walter 2012.

Study characteristics
Patient sampling Study design: RCT (control group only included)
Data collection: prospective
Period of data collection: March 2008‐May 2010
Country: UK
Patient characteristics and setting Inclusion criteria: adults with any suspicious PSL, i.e. any lesion presented by a patient, or opportunistically seen by a family doctor or practice nurse, that could not immediately be diagnosed as benign and about which the patient could not be reassured.
Setting: primary. 15 general practices in eastern England
Prior testing: clinical suspicion of malignancy without dermatoscopic suspicion
Setting for prior testing: primary
Exclusion criteria: those unable to give informed consent or considered inappropriate to include by their family doctor
Sample size (participants): number eligible: 1297; number included: 1293
Sample size (lesions): number eligible: 1580; number included: 1583
Participant characteristics: mean age: 44.6 years (SD 16.8). Male: 465 (36%). Ethnicity: white 1214 (93.9%); mixed 45 (3.5%); missing: 34 (2.6%)
Lesion characteristics: lesion thickness ≤ 1 mm: in 'more than half' of MM
Index tests VI: Glasgow/MacKie revised 7‐point checklist (MacKie 1990)
Method of diagnosis: in‐person diagnosis
Prior test data: N/A
Diagnostic threshold: NR
Diagnosis based on: single observer (n = 30)
Observer qualifications: 28 GPs and 2 nurse practitioners recruited as 'lead clinicians' (2 per practice); appears as though they conducted all skin examinations. Excluded GPs with known dermatological expertise, e.g. current hospital practitioners, clinical assistants in dermatology, and GPs with a special interest in dermatology
Experience in practice: mixed GP experience, median of 15 years' experience (range 4‐27 years); assumed low experience with PSLs. 7 had undergone some training in dermatology: 3 had a short dermatology training post, 3 were on clinical attachment to an out‐patient clinic, and 1 was unspecified
Target condition and reference standard(s) Reference standard: histological diagnosis plus follow‐up and expert opinion
Histology (not further described) 215 (histology result missing in further 4)
 Disease positive: 35; disease negative: 180
Clinical follow‐up plus histology of suspicious lesions: 22 of the 411 referred patients were monitored (not further described); 566 of the 1162 not referred underwent expert review and were then re‐assessed at 3‐6 months
 Disease positive: 1; disease negative: 588
Expert opinion. Reviewed by 2 dermatology experts using the recorded clinical history and examination, a digital photograph, and MoleMate image where available
 Disease positive: 0; disease negative: 725
Target condition (final diagnoses) 
 Melanoma (invasive): 30; melanoma (in situ): 6; BCC: 10
'Benign' diagnoses: 1306
Flow and timing Excluded participants: 417 withdrew from control group after randomisation. 10 did not attend for dermatology assessment; 19 excluded; 1 died; 4 missing histology (in referred group; included as benign?); plus 12 with unknown outcome (in non‐referred group, assumed benign and included)
Time interval to reference test: suspicious lesions referred under 2‐week wait system
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? Yes    
Are the included patients and chosen study setting appropriate? Yes    
Did the study avoid including participants with multiple lesions? No    
    Low High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Yes    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? Yes    
Was the test interpretation carried out by an experienced examiner? No    
    Low High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? No    
Were the reference standard results interpreted without knowledge of the results of the index tests? No    
Expert opinion (with no histological confirmation) was not used as a reference standard No    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    High High
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Yes    
Did all patients receive the same reference standard? No    
Were all patients included in the analysis? No    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC? Yes    
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Westerhoff 2000.

Study characteristics
Patient sampling Study design: CCS (for lesion selection; study was an RCT of dermoscopy training for PCPs)
Data collection: retrospective
Period of data collection: NR
Country: Australia
Patient characteristics and setting Inclusion criteria: clinically atypical PSLs; 50 invasive melanomas and 50 nonmelanomas randomly selected from the Sydney Melanoma Unit PSL image database.
Setting: specialist unit (lesion selection)
Prior testing: selected for excision or followed up
Setting for prior testing: specialist unit (skin cancer clinic/PLC)
Exclusion criteria: none reported
Sample size (participants): number included: NR
Sample size (lesions): number included: 100
Participant characteristics: none reported
Lesion characteristics: median Breslow thickness 0.6 mm
Index tests VI: no algorithm
Method of diagnosis: clinical photographs
Prior test data: unclear; all participants "were instructed not to look at the surface microscopic image until they had scored the clinical image"
Diagnostic threshold: NR
Diagnosis based on: average (n = 37); 74 practising primary care practitioners randomised to dermoscopy education intervention or not. Diagnoses were recorded for both groups of GPs at baseline (pre‐test) and after the training intervention had been administered to the intervention group (post‐test), resulting in 8 sets of 2x2 data based on interpretation of the same set of 100 lesions; post‐test data for the intervention group of GPs was used for the VI analysis.
Observer qualifications: GP
Experience in practice: considered to be low; only practitioners who had had no formal training with surface microscopy and did not use a surface microscope in their clinical practice were included.
Experience with dermoscopy: low experience/novice users (non‐training arm); 'trained' for the intervention arm
Other detail: camera designed for close‐up clinical photography (Elicar Macrolens, Japan)
Dermoscopy: evaluated in same study; Menzies criteria (intervention arm underwent training in Menzies criteria)
Target condition and reference standard(s) Reference standard: histological diagnosis plus follow‐up
Histology: all the lesions except 2 had been excised after photography and subjected to histopathological examination.
Disease positive: 50; disease negative: 48
Clinical follow‐up plus histology of suspicious lesions: the 2 benign PSLs that had not been excised were monitored over a longer period of time and had shown no morphological change
 Length of follow‐up: NR; disease positive: 0; disease negative: 2
Target condition (final diagnoses)
Melanoma (invasive): 50; 'benign' diagnoses: 50
Flow and timing Excluded participants: none reported
Time interval to reference test: "All the lesions except two had been excised after photography"
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Yes    
Was a case‐control design avoided? No    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    High High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Yes    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Yes    
Did all patients receive the same reference standard? No    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC? Unclear    
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    High  

Winkelmann 2016.

Study characteristics
Patient sampling Study design: CCS
Data collection: retrospective image selection/prospective interpretation
Period of data collection: NR
Country: NR
Patient characteristics and setting Inclusion criteria: images of PSLs previously analysed by a digital classifier MSDSLA; method of selection of the 12 NR
Setting: dermoscopy conference
Prior testing: NR
Setting for prior testing: unspecified
Exclusion criteria: none reported
Sample size (participants): NR
Sample size (lesions): number included: 12
Participant characteristics: none reported
Lesion characteristics: none reported
Index tests VI: no algorithm
Method of diagnosis: clinical photographs
Prior test data: unclear
Other test data: dermoscopic images presented to observer subsequent to diagnosis using clinical images alone
Diagnostic threshold: NR, biopsy decision
Diagnosis based on: average (n = 70)
Observer qualifications: dermatologist
Experience in practice: not described; recruited “dermatologists at a dermoscopy conference”; no further details
Other detail: study authors report that practitioners with a particular interest in skin cancer or technology may have chosen to attend this conference and/or self‐selected to take part in the study.
Dermoscopy: evaluated in same study; no algorithm
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 5; disease negative: 7
Target condition (final diagnoses) 
 Melanoma (invasive): 3; melanoma (in situ): 2
Mild/moderate dysplasia: 7 low‐grade dysplastic naevi
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? No    
Did the study avoid inappropriate exclusions? Unclear    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    High High
DOMAIN 2: Index Test Visual inspection ‐ image‐based
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Unclear    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? No    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Unclear High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC? Yes    
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

Zaumseil 1983.

Study characteristics
Patient sampling Study design: CS
Data collection: NR
Period of data collection: 1976‐1981
Country: Germany
Patient characteristics and setting Inclusion criteria: skin lesions undergoing excision
Setting: secondary (not further specified)
Prior testing: selected for excision (no further detail)
Setting for prior testing: specialist unit (skin cancer clinic/PLC); described as 'skin clinic'
Exclusion criteria: disagreement between evaluators on tumour histological classification; lesions in which the histological diagnosis was 'unclear' were excluded, as were melanoma metastases
Sample size (participants): NR
Sample size (lesions): number included: 7063
Participant characteristics: none reported
Lesion characteristics: none reported
Index tests VI: no algorithm
Method of diagnosis: in‐person diagnosis
Prior test data: N/A, in‐person diagnosis
Diagnostic threshold: primary diagnosis of melanoma (method of Kopf 1975 was cited)
Diagnosis based on: single observer (n = NR)
Observer qualifications: NR
Experience in practice: not described
Experience with index test: not described
Target condition and reference standard(s) Reference standard: histological diagnosis alone
Disease positive: 337; disease negative: 6726
Target condition (final diagnoses)
Melanoma (invasive or in situ): 337
Other diagnoses only listed for the 89 false‐positives: 23 benign naevi; 13 BCC; 12 blue naevus; 11 angiomatosis; 10 SK; 6 histiocytoma; 4 Spitz naevus; 4 lentigo; 3 Bowen's disease; 1 acrospiroma; 1 keratinizing papilloma
Flow and timing Excluded participants: none reported
Time interval to reference test: NR
Comparative  
Notes
Methodological quality
Item Authors' judgement Risk of bias Applicability concerns
DOMAIN 1: Patient Selection
Was a consecutive or random sample of patients enrolled? Unclear    
Was a case‐control design avoided? Yes    
Did the study avoid inappropriate exclusions? No    
Are the included patients and chosen study setting appropriate? No    
Did the study avoid including participants with multiple lesions? Unclear    
    High High
DOMAIN 2: Index Test Visual Inspection ‐ in‐person
Were the index test results interpreted without knowledge of the results of the reference standard? Yes    
If a threshold was used, was it pre‐specified? Yes    
For studies reporting the accuracy of multiple diagnostic thresholds, was each threshold or algorithm interpreted without knowledge of the results of the others?      
Was the test applied and interpreted in a clinically applicable manner? Yes    
Were thresholds or criteria for diagnosis reported in sufficient detail to allow replication? No    
Was the test interpretation carried out by an experienced examiner? Unclear    
    Low High
DOMAIN 3: Reference Standard
Is the reference standard likely to correctly classify the target condition? Yes    
Were the reference standard results interpreted without knowledge of the results of the index tests? Unclear    
Expert opinion (with no histological confirmation) was not used as a reference standard Yes    
Was histology interpretation carried out by an experienced histopathologist or by a dermatopathologist? Unclear    
    Low Unclear
DOMAIN 4: Flow and Timing
Was there an appropriate interval between index test and reference standard? Unclear    
Did all patients receive the same reference standard? Yes    
Were all patients included in the analysis? Yes    
If the reference standard includes clinical follow‐up of borderline/benign appearing lesions, was there a minimum follow‐up following application of index test(s) of at least: 3 months for melanoma or cSCC or 6 months for BCC?      
If more than one algorithm was evaluated for the same test, was the interval between application of the different algorithms 1 month or less?      
    Unclear  

ABCD(E): asymmetry, border, colour, differential structures (enlargement); AK: actinic keratosis; AMN: atypical MN; BCC: basal cell carcinoma; CAD: computer‐assisted diagnosis; CCS: case‐controlled study; CD: compact disc; CM: cutaneous melanoma; CMM: cutaneous malignant melanoma; CS: case series; cSCC: cutaneous squamous cell carcinoma; DF: dermatofibroma; ELM: epiluminescence microscopy; FN: false‐negative; FP: false‐positive; GP: general practitioner; H&E: haematoxylin and eosin stain; LPLK: lichen planus‐like keratosis; LS: lentigo simplex; MM: malignant (invasive) melanoma; MN: melanocytic naevi; MSDSLA: multispectral digital skin lesion analysis device; N/A: not applicable; NMLs: non‐melanocytic lesions; NR: not reported; PCPs: primary care providers; PLC: pigmented lesion clinic; PSL: pigmented skin lesion; RCM: reflectance confocal microscopy; RCT: randomised controlled trial; SCC: squamous cell carcinoma; SD: standard deviation; SDDI: short‐term sequential digital dermoscopy imaging; SK: seborrhoeic keratosis; SSM: superficial spreading melanoma; SVS: support vector system; VI: visual inspection; 7FFM: seven features for melanoma

Characteristics of excluded studies [ordered by study ID]

Study Reason for exclusion
Abbasi 2004 Not a primary study; systematic review
Aldridge 2011a Ineligible test observer: medical students and lay people
Aldridge 2011b Ineligible test observer
Aldridge 2013 Unable to construct 2x2 table based on data presented. Not test accuracy study
Alendar 2009 Ineligible reference standard. Only 7 reported verified histologically
Argenziano 1999 Ineligible study population. Only included melanoma
Argenziano 2003 Unable to construct 2x2 table based on data presented. Table V gives se/sp data for 108 lesions but cannot derive the number of melanoma for this subset of the original 128
Contacted study authors 10 May 2016; 24 June 2016
Argenziano 2012 Ineligible reference standard. No follow‐up of test‐negatives
Argenziano 2014 Unable to construct 2x2 table based on data presented
Ascierto 2003 Not a primary study
Badertscher 2015 Unable to construct 2x2 table based on data presented
Bafounta 2001 Not a primary study, systematic review
Banky 2005 Ineligible target condition
Ineligible index test
Basarab 1996 Ineligible study population. Not all suspected of skin cancer
Unable to construct 2x2 table based on data presented
Bauer 2000 Ineligible index test. Does not provide 2x2 data for VI alone
Bauer 2005 Ineligible index test, follow‐up/monitoring study
Becker 1954 Not a primary study
Benelli 2000 Unable to construct 2x2 table based on data presented. Only inter‐rater reliability data given (n = 25); study authors have published much larger evaluations of 7FFM and ABCD
Blum 2004a Not a primary study, comment paper
Blum 2004b Not a primary study, letter. Only limited data presented. Evaluates '3‐colour' rule as developed by Mackie 2002 (excluded as assessment of individual lesion features only)
Blum 2004c Ineligible index test, evaluates dermoscopy only
Bolognia 1990 Ineligible reference standard, no reference standard diagnosis for index test‐negatives
Bono 2001 Unable to construct 2x2 table based on data presented. Aim of the study was to determine what features are present in amelanotic cutaneous melanoma
Borsari 2015 Individual lesion characteristics
Borve 2012 Ineligible study population, included participants without skin lesions
Ineligible sample size, < 5 BCC
Brown 2000 Not a primary study, systematic review
Brown 2009 Ineligible test observer, lay people
Buhl 2012 Ineligible index test, follow‐up/monitoring
Duplicate or related publication, same participants as Haenssle 2010
Burki 2015 Not a primary study
Burr 2015 Not a primary study
Burton 1998 Ineligible reference standard
Unable to construct 2x2 table based on data presented, can only get 2x2 data for referral accuracy
Carli 2003b Ineligible reference standard. Only 39/1042 with reference test
Carli 2003c Ineligible sample size
Carli 2004a Ineligible sample size, < 5 MM per arm
Unable to construct 2x2 table based on data presented
Carli 2004b Ineligible index test
Study author passed away; unable to make contact with co‐authors
Carli 2004c Ineligible index test, 'clinical diagnosis'. Dataset covers 1997‐2001, but dermoscopy routinely introduced 1998; study authors contacted but no response
Carli 2005 Unable to construct 2x2 table based on data presented. Only sensitivity data given (% with correct diagnosis); % of benign lesions incorrectly diagnosed was not reported
We will try to contact study authors.
Carlos‐Ortega 2007 Unable to construct 2x2 table based on data presented. Gives se/sp for VI and dermoscopy in the English abstract. 68 patients/70 lesions were included but only 36 seem to have had VI results and all underwent dermoscopy. 2 observers performed each test blinded to each other. Table I gives 22 with BCC and 11 with melanoma overall (number D+ not reported for those with VI results), but using either or both of these numbers with the se/sp provided does not give the same PPV and NPV as given by the study authors
Data not clearly presented for 2x2; translator suggested alternative but still does not work out to what is in paper; tried contacting authors twice, no reply
Chen 2001 Not a primary study, systematic review comparing PCP accuracy with dermatologist accuracy
Chen 2006 Unable to construct 2x2 table based on data presented, only given AUC
Chiaravalloti 2014 Ineligible study population, included melanoma only
Ciudad‐Blanco 2014 Ineligible study population, included melanoma only
Cooper 2002 Ineligible target condition, insufficient data for inclusion in melanoma review
Cornell 2015 Ineligible test observer
Cox 2008 Ineligible reference standard. Se and sp estimates for diagnosis of melanoma for both the 7‐point checklist and the revised (10‐point) checklist; reference standard not reported for any of the 381 TWR referrals for melanoma
Study author contacted 10 May 2016; co‐author contacted 24 June 2016
De Giorgi 2011 Duplicate publication. Study appears to use same lesions as Carli 2003a (included study). Both studies have the same numbers of melanomas and benign nevi and have common co‐authors (De Giorgi 2011 in particular). Although not explicit, the De Giorgi 2011 paper appears to have used the same lesions and study design but with different observers. The original Carli 2003b paper reported using 8 expert observers while the later paper recruited 8 dermatologists who had undergone a dermoscopy training course but who reported no experience in assessing pigmented skin lesions.
DeCoste 1993 Unable to construct 2x2 table based on data presented. Not given the total number of D+/D‐ or total number of lesions included. Just given the se/sp values
Di Carlo 2014 Ineligible index test. Videothermography not relevant for the review and there is no 2x2 data for dermoscopy if derivation study. Only included AK and BCC
Di Chiacchio 2010 Ineligible target condition, excluded nail bed melanoma
Unable to construct 2x2 table due to insufficient data to extract
Dreiseitl 2009 Ineligible index test. Did not evaluate VI alone
Duff 2001 Ineligible index test. Did not evaluate VI alone
Edmondson 1999 Ineligible reference standard. It seems that the reference standard here was expert diagnosis. This is not a teledermatology paper
Emmons 2011 Unable to construct 2x2 table based on data presented. Not test accuracy study; promoted primary prevention
Engelberg 1999 Ineligible sample size, only 1 confirmed melanoma and 3 BCC
English 2003 Unable to construct 2x2 table based on data presented. No accuracy data given
English 2004 Unable to construct 2x2 table based on data presented. No accuracy data
Fabbrocini 2008 Unable to construct 2x2 table because insufficient data provided for each index test to populate 2x2 table
Contacted study authors to request cross‐tabulation of each clinician's diagnosis (e.g. at threshold of ≥ 3 on 7‐point checklist) against the histological diagnosis or a cross‐tabulation of the remote diagnosis against the face‐to‐face diagnoses, or both. Study author responded 30 June 2016, cannot access data needed
Federman 1995 Unable to construct 2x2 table based on data presented. Not test accuracy study
Fikrle 2013 Ineligible reference standard. Follow‐up study; < 50% of study participants had their final diagnosis reached by histopathology
Freeman 1963 Unable to construct 2x2 table based on data presented. Only gives % correct for each lesion type
Tables 2 and 3 appear to give % correct diagnoses per lesion type, but do not give data on numbers misclassified as melanoma, or other malignancy, i.e. FPs.
Contacted study authors who responded; paper too old, cannot provide data
Friedman 1985 Not a primary study
Funt 1963 Ineligible index test. No 2x2 data to construct 2x2 table
Gerbert 1996 Ineligible target condition. No breakdown of final diagnoses for included lesions
Unable to construct 2x2 table based on data presented
Only gives % correct for each lesion type; not se/sp
Gerbert 1998 Unable to construct 2x2 table based on data presented
Giannotti 2004 Not a primary study, a review
Grana 2003 Ineligible index test. Individual lesion characteristics, only looking at lesion border
Grob 1998 Not a primary study
Guibert 2000 Ineligible reference standard. Not designed as an accuracy study; observational only. Cannot get 2x2 data; > 50% of study participants did not receive histology as reference standard.
Gunduz 2003 Ineligible sample size, case study
Gutierrez 2013 Ineligible index test, test to improve histopathology diagnosis
Hacioglu 2013 Ineligible target condition. Does not provide sufficient data for detection of melanoma
Haenssle 2010 Ineligible index test. Test used for monitoring and not initial diagnosis; no VI data
Haenssle 2010a Unable to construct 2x2 table based on data presented. Does not report specificity
Duplicate or related publication, same participants as Haenssle 2010
Hallock 1998 Ineligible index test. 'Clinical diagnosis'; dermoscopy used for 3 of the 4 years of study recruitment
Haniffa 2007 Ineligible reference standard; it appears that only approximately 20% of participants received a final diagnosis by histology. 179 biopsies were performed. Total sample was 881 lesions
Har‐Shai 2001 Ineligible index test, 'clinical diagnosis'
Heal 2008 Unable to construct 2x2 table based on data presented. Sensitivities and PPVs are given so theoretically a 2x2 could be worked out, but the numbers do not appear to work out
Author response: the 2x2 table the Cochrane researchers want to create is not possible for our results, because sensitivity and PPV are based on different sample sizes.
Healsmith 1994 Ineligible reference standard. Benign lesions described as 'clinically diagnosed' rather than histology/follow‐up
Higgins 1992 Ineligible study population, included only benign lesions
Ineligible sample size, no melanomas
Unable to construct 2x2 table based on data presented, no malignant cases
Hoorens 2016 Ineligible index test
Ineligible reference standard. No information on numbers undergoing histology; and no follow‐up reported for benign‐appearing lesions
Unable to construct 2x2 table based on data presented
Huang 1996 Individual lesion characteristics. Border irregularity not overall diagnosis
Unable to construct 2x2 table based on data presented
Jamora 2003 Ineligible reference standard. No reference standard for index test‐negatives
Janda 2014 Ineligible sample size, only 1 case of melanoma, 1 case of BCC and 1 of SCC
Jensen 2015 Not a primary study, comment paper
Jolliffe 2001 Ineligible index test. Provides data for clinical diagnosis (including dermoscopy for some cases)
Jonna 1998 Unable to construct 2x2 table based on data presented, only included index test‐positives to get PPV
Kaddu 1997 Ineligible sample size. Sample size < 5; not test accuracy
Keefe 1990 Ineligible reference standard. Only 28% (60/214) of non‐melanoma group had excision
Kelly 1986 Ineligible target condition. Cannot disaggregate the severely dysplastic/in situ MM
Ineligible sample size, unclear whether > 5 in situ melanoma
Koh 1990 Ineligible reference standard, screening study; no adequate reference standard
Kroemer 2011 Ineligible index test, provides data for clinical diagnosis (including dermoscopy for some cases)
Krol 1991 Ineligible reference standard. No follow‐up reported for those who were test‐negative
Kurvers 2015 Ineligible index test. Collective intelligence ‐ majority rule and quorum rule applied to large number of test interpreter decisions
Duplicate or related publication, re‐analyses data from 2 previously published studies to determine whether collective intelligence (i.e. majority rules or quorum rules across a large number of observers) improves test accuracy. We have excluded one of these studies as it did not provide the number of melanomas (Argenziano 2003) and included the other in our dermoscopy review (Zalaudek 2006).
Kvedar 1997 Ineligible study population. Not all suspected of skin cancer
Lechner 2015 Not a primary study, erratum
Lewis 1999 Unable to construct 2x2 table based on data presented. Study appears to meet all eligibility criteria but disease prevalence not given alongside se/sp
Contacted study authors 10 May 2016; email returned
Lindelöf 1994 Ineligible study population, only malignant melanoma
Unable to construct 2x2 table based on data presented. Not enough information given to derive a 2x2 table. Only given for a sample of 50 participants who had a strong suspicion of melanoma clinically. Do not know what happened to those with no suspicion clinically
Lorentzen 2000 Ineligible index test. Does not provide data for VI alone
Luttrell 2012 Ineligible test observer. Accuracy data only given for lay‐people, not interested in this population of test observers
Machet 2005 Ineligible study population. (Note: this is a staging study)
MacKenzie‐Wood 1998 Ineligible study population, only malignant diagnosis
MacKie 1990 Not a primary study
Mackie 1991 Not a primary study, letter
Mackie 2002 Individual lesion characteristics, presence of ≥ 3 colours on dermoscopy
Mahendran 2005 Ineligible index test. Face to face was 'clinical diagnosis', i.e. VI +/‐ use of dermoscopy
Mahon 1997 Not a primary study, a summary of a comparison of 2 screening checklists
Malvehy 2014 Ineligible index test. Does not report data for VI alone
Marghoob 1995 Not a primary study, letter
Marghoob 2007 Not a primary study
Markowitz 2015 Ineligible target condition. Does not report sufficient data for detection of melanoma
McCarthy 1995 Not a primary study, leaflet
McMullan 1956 Unable to construct 2x2 table based on data presented
Menzies 2008 Ineligible index test, evaluated dermoscopy alone
Menzies 2011 Ineligible index test, surveillance study; data used to identify factors predictive of lesion changes
Menzies 2013 Ineligible index test, evaluated dermoscopy only
Moffatt 2006 Ineligible index test, 'clinical diagnosis'
Mohammad 2015 Ineligible study population, only included BCC
Morrison 2001 Unable to construct 2x2 table based on data presented
Study gives % correct diagnosis within each histology group and then gives the % ‘correct’ diagnosis of skin cancer as 22% for FP and 87% for dermatologist. But these statistics appear to have been reached by taking the mean of the % correct diagnoses across the malignant groups and do not equate to sensitivity. That is, if you take the mean of the FP correct (%) for the 4 malignant groups you get:
(40 + 22 + 25 + 0) / 4 = 21.75%
and then the same for the dermatologist correct (%) column:
(95 + 77 + 75 + 100)/4 = 86.75%
Nachbar 1994 Ineligible index test. Data for VI alone influenced by use of dermoscopy in most cases
Nathansohn 2007 Unable to construct 2x2 table based on data presented. Not test accuracy; follow‐up study
Nilles 1994 Ineligible index test. Does not provide data for VI alone
Osborne 1998 Ineligible reference standard. Not clear what the ref standard is
Unable to construct 2x2 table based on data presented
Osborne 1999 Ineligible study population. Only patients with melanoma included
Parslew 1997 Ineligible study population. Not all suspected of skin cancer
Pazzini 1996 Unable to construct 2x2 table based on data presented
Perednia 1992 Unable to construct 2x2 table based on data presented. Not test accuracy
Perrinaud 2007 Ineligible index test. Does not provide data for VI alone
Piccolo 2000 Ineligible index test. No data can be extracted for VI alone
Piccolo 2002 Not a primary study
Not enough data to populate 2x2 table. No breakdown of index test results and ref standard.
Pizzichetta 2001 Unable to construct 2x2 table based on data presented. Observer agreement only
Provost 1998 Unable to construct 2x2 table based on data presented. Not test accuracy; only reports concordance
Quereux 2011 Ineligible index test, self‐administered questions to patients attending a GP surgery before their appointment to determine whether they were at high risk of melanoma, which is meant to highlight to the GP which patient to examine during their consultation
Rallan 2006 Ineligible index test. No data can be extracted for VI alone
Rampen 1988 Ineligible study population. Only melanoma included
Reeck 1999 Ineligible study population. Only included index test‐negatives; i.e. those considered benign by referring clinician
Ineligible target condition
Riddell 1961 Ineligible study population. All malignant
Rigel 1993 Not a primary study
Robati 2014 Ineligible reference standard. No follow‐up of participants not referred to dermatology clinics, who did not receive histopathology
Robinson 2010 Ineligible index test, self examination
Rosado 2003 Not a primary study, systematic review
Rossi 2000 Ineligible reference standard. Unclear reference standard in disease‐negative
Roush 1986 Ineligible target condition, only dysplastic naevus
Salvio 2011 Not a primary study
Ineligible sample size
Schindewolf 1994 Ineligible index test, evaluated CAD not VI
Schmoeckel 1987 Not a primary study
Schwartzberg 2005 Ineligible target condition, does not provide sufficient data for detection of melanoma
Seidenari 2006 Ineligible study population. Assessed best means of following up patients with previous melanoma ‐ total body exam versus only lesions > 2 cm. No melanoma identified
Seidenari 2006a Individual lesion characteristics. Looks like this study is only looking at asymmetry judgement
Shariff 2010 Ineligible reference standard
Sondak 2015 Not a primary study, comment paper
Soyer 2004 Ineligible index test. Does not provide data for VI alone
Stanganelli 1998b Unable to construct 2x2 table based on data presented. Cannot derive specificity; only gives exact diagnoses for MM and 2 benign categories and not number benign misdiagnosed as MM
Stanley 2003 Individual lesion characteristics. Fuzzy histogram based on the lesion's colour, which is an individual lesion characteristic
Stathopoulos 2015 Unable to construct 2x2 table based on data presented. Only included index test‐positive patients, i.e. no FN or TN results
Stratigos 2007 Ineligible reference standard
Unable to construct 2x2 table based on data presented
Tandjung 2015 Ineligible target condition. 'Malignant' included: AK, Bowen's, dysplastic nevus, lentigo maligna, SCC, BCC, MM, keratoacanthoma
Ineligible index test. GPs sent images for telederm opinion; then free to send for biopsy or not; results shown are only for those that were biopsied, according to TD advice
Terrill 2009 Ineligible index test. Whole body skin examination after participants referred on for further assessment by a specialist
Unable to construct 2x2 table based on data presented
Terushkin 2010a Unable to construct 2x2 table based on data presented. Not test accuracy, reports final diagnoses of those excised over a number of time periods and benign‐malignant ratio
Terushkin 2010b Unable to construct 2x2 table based on data presented. Not test accuracy ‐ reports final diagnoses of those excised over a number of time periods and benign‐malignant ratio
Thomson 2005 Not a primary study, letter
Torrey 1941 Ineligible target condition, included non‐cutaneous lesions
Ulrich 2015 Ineligible target condition. Does not provide sufficient data for evaluation of melanoma
Van der Rhee 2010 Ineligible reference standard. < 50% of disease‐negative have an adequate reference standard
Van der Rhee 2011 Ineligible sample size, < 5 cases
Vasili 2010 Conference abstract
Wagner 1985 Unable to construct 2x2 table based on data presented
Walter 2010 Not a primary study, clinical trial protocol
Walter 2013 Ineligible reference standard. Final diagnosis reached by histology or expert opinion; no follow‐up of non‐excised lesions reported in this paper. The Walter 2012 trial report does report follow‐up for enough benign lesions for control arm (weighted 7‐point checklist) data to be included. Study authors contacted and confirmed calculations (2 March 2016)
Warshaw 2009a Unable to construct 2x2 table based on data presented. Study presents diagnostic accuracy of teledermatology and clinic diagnosis in comparison to histopathology; in order to include in our review, data would need to be presented as a 2x2 contingency table, either per type of malignancy, e.g. tele‐diagnosis classification of melanoma vs not melanoma against histological diagnosis of melanoma/not melanoma, or with malignant diagnoses grouped together, i.e. tele‐diagnosis of malignancy vs not malignant against same histological breakdown
Study authors contacted: "the 2x2 table the Cochrane researchers want to create is not possible for our results, because sensitivity and PPV are based on different sample sizes. This can be seen in Table 2 of the paper which actually adds up to 11870 skin lesions across, as for each histological diagnosis of interest the first lesion with such a histological diagnosis was considered per patient. Hence, a patient might appear several times across the columns. Table 1 adds up to 8585 skin lesions – the first skin lesion in the data set per patient with a clinical diagnosis."
Warshaw 2009b Unable to construct 2x2 table based on data presented, as per Warshaw 2009a
Warshaw 2010 Unable to construct 2x2 table based on data presented, as per Warshaw 2009a; this 2010 paper presents combined data for pigmented and nonpigmented lesions
Westbrook 2006 Unable to construct 2x2 table based on data presented
Whitaker‐Worth 1998 Ineligible study population
Ineligible test observer, mixed medical student/clinicians
Unable to construct 2x2 table based on data presented, not test accuracy study
Whited 1998 Ineligible sample size
Williams 1991 Unable to construct 2x2 table based on data presented
Winkelmann 2015a Duplicate or related publication
Winkelmann 2015b Duplicate or related publication
Wolf 1998 Ineligible index test, clinical diagnosis study. Test clearly described, "concerning the clinical diagnosis, we were not able to ascertain from the clinical data sheet whether the referring physicians used additional diagnostics techniques such as dermoscopy"
Yoo 2015 Conference abstract
Youl 2007a Ineligible index test, 'clinical diagnosis' ‐ dermoscopy used in some but not all cases.
Response from study author, "One of the main issues is that we just don’t know to what extent dermoscopy was used in that study. We just asked where they used it in a general sense and not for each case. However for each case GPs and skin clinic doctors did indicate whether they conducted a whole‐ or part‐body skin examination (or just lesion specific)."
Youl 2007b Ineligible index test. Evaluates clinical diagnosis (some lesions had dermoscopy)
Zaballos 2013 Ineligible study population. They do not have enough benign cases to include as full report
Zou 2001 Not a primary study. Study uses results from Stolz 1994
Unable to construct 2x2 table based on data presented; only ROC curves are shown

AK: actinic keratosis; AUC: area under the curve; BCC: basal cell carcinoma; CAD: computer assisted diagnosis; D+/D‐: disease‐positive/disease‐negative; 7FFM: seven features for melanoma; FP: false‐positive; FN: false‐negative; GP: general practitioner; PCP: primary care provider; PPV: positive predictive value; MM: malignant (invasive) melanoma; NPV: negative predictive value; ref: reference; SCC: squamous cell carcinoma; se/sp: sensitivity/specificity; TD: teledermatology; TN: true negative; TWR: two week rule; VI: visual inspection
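
Many of the exclusions listed above rest on being unable to construct a 2x2 table. As a minimal sketch of why that table matters, the Python below derives sensitivity, specificity, PPV and NPV from the same four cells, which is what cannot be done when sensitivity and PPV are reported on different samples. The class name and the counts are hypothetical and are not taken from any study in this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Index test result cross-classified against the reference standard."""
    tp: int  # test positive, disease positive (D+)
    fp: int  # test positive, disease negative (D-)
    fn: int  # test negative, disease positive (D+)
    tn: int  # test negative, disease negative (D-)

    def sensitivity(self) -> float:
        # Proportion of disease-positive lesions that the test calls positive.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of disease-negative lesions that the test calls negative.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Proportion of test-positive lesions that are disease positive.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Proportion of test-negative lesions that are disease negative.
        return self.tn / (self.tn + self.fn)


# Hypothetical counts chosen only to illustrate the arithmetic.
table = TwoByTwo(tp=40, fp=30, fn=10, tn=120)
print(f"se={table.sensitivity():.2f} sp={table.specificity():.2f}")
print(f"PPV={table.ppv():.2f} NPV={table.npv():.2f}")
```

All four measures share one set of lesions here; if the denominators come from different samples, no single set of four cells reproduces the reported values.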

Differences between protocol and review

We set out to review visual inspection and dermoscopy for the detection of melanoma in a single review; however, due to the volume of evidence identified, we prepared two separate reviews: one for visual inspection alone and one for dermoscopy, the latter including direct comparisons with visual inspection where the same studies evaluated both tests.

We changed the primary objectives and primary target condition from detection of cutaneous invasive melanoma alone, to the detection of cutaneous invasive melanoma and atypical intraepidermal melanocytic variants, as the latter is more clinically relevant to the practising clinician. We included the detection of the target condition of invasive melanoma alone as a secondary objective instead.

We tailored secondary objectives to the individual test and added two objectives: to determine the diagnostic accuracy of individual algorithms for visual inspection, and to determine the effect of observer experience.

Sources of heterogeneity that could be investigated (as listed under Secondary objectives) were restricted due to lack of data.

We amended the text to clarify that studies available only as conference abstracts would be excluded from the review unless full papers could be identified; studies available only as conference abstracts do not allow a comprehensive assessment of study methods or methodological quality.

We excluded, rather than included, studies using cross‐validation, such as 'leave‐one‐out' cross‐validation, as these methods are not sufficiently robust and are likely to produce unrealistic estimates of test accuracy.
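
As a rough illustration of the approach this exclusion refers to, the sketch below shows how 'leave‐one‐out' cross‐validation re‐uses a single data set: each lesion is held out once while the algorithm would be derived from all the others, so no separate independent test set is involved. The function name and lesion identifiers are hypothetical and serve only to show the splitting scheme.

```python
from typing import Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def leave_one_out(items: Sequence[T]) -> Iterator[Tuple[List[T], T]]:
    """Yield (training items, held-out item) once for every item."""
    for i, held_out in enumerate(items):
        # Everything except the i-th item is used to derive the algorithm.
        training = [x for j, x in enumerate(items) if j != i]
        yield training, held_out


# Hypothetical lesion identifiers, for illustration only.
lesions = ["L1", "L2", "L3", "L4"]
splits = list(leave_one_out(lesions))
for training, held_out in splits:
    print(f"held out {held_out}; derived from {training}")
```

Every lesion appears in the training data for all splits but its own, which is the contrast with the independent 'test set' design that the review required of derivation studies.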

To improve clarity of methods, we replaced this text from the protocol, "we will include studies developing new algorithms or methods of diagnosis (i.e. derivation studies) if they use a separate independent ’test set’ of participants or images to evaluate the new approach. We will also include studies using other forms of cross validation, such as ’leave‐one‐out’ cross‐validation (Efron 1983). We will note for future reference (but not extract) any data on the accuracy of lesion characteristics individually, e.g. the presence or absence of a pigment network or detection of asymmetry" with, "studies developing new algorithms or methods of diagnosis (i.e. derivation studies) were included if they:

  • used a separate independent 'test set' of participants or images to evaluate the new approach, or

  • investigated lesion characteristics that had previously been suggested as associated with melanoma and the study reported accuracy based on the presence or absence of particular combinations of characteristics.

Studies were excluded if they:

  • used a statistical model to produce a data driven equation, or algorithm based on multiple diagnostic features, with no separate test set.

  • used cross‐validation approaches such as 'leave‐one‐out' cross‐validation (Efron 1983)

  • evaluated the accuracy of the presence or absence of individual lesion characteristics or morphological features, with no overall diagnosis of malignancy

  • reported accuracy data for ‘clinical diagnosis’ with no clear description as to whether the reported data related to visual inspection alone

  • were based on the experience of a particular skin cancer clinic, where dermoscopy may or may not have been used on an individual patient‐basis.”

Although we extracted any reporting of special interest or accreditation in skin cancer according to observer expertise, we were unable to analyse the effect on accuracy.

We proposed to supplement the database searches by searching the annual meetings of appropriate organisations (e.g. British Association of Dermatologists Annual Meeting, American Academy of Dermatology Annual Meeting, European Academy of Dermatology and Venereology Meeting, Society for Melanoma Research Congress, World Congress of Dermatology, European Association of Dermato Oncology); however, due to the volume of evidence retrieved from database searches and time restrictions, we were unable to do this.

For quality assessment, we further tailored the QUADAS‐2 tool according to the review topic. In terms of analysis, due to lack of data we did not restrict analyses to per‐participant data only, nor perform sensitivity analyses as planned.

Contributions of authors

JD was the contact person with the editorial base.
 JD co‐ordinated contributions from the co‐authors and wrote the final draft of the review.
 SB conducted the literature searches.
 JD, NC, LFR, DT, KYW, RBA, RA, and MF screened papers against eligibility criteria.
 JD and NC obtained data on ongoing and unpublished studies.
 JD, NC, LFR, DT, KYW, RBA, RA, and MF appraised the quality of papers.
 JD, NC, LFR, DT, KYW, RBA, RA, and MF extracted data for the review and sought additional information about papers.
 JD entered data into Review Manager 5.
 JD, MJG and JJD analysed and interpreted data.
 JD, JJD, NC, LFR, YT and CD worked on the methods sections.
 JD, FW, DT, KYW, RBA, RA, MF, RNM and HCW drafted the clinical sections of the background and responded to the clinical comments of the referees.
 JD, JJD, CD and YT responded to the methodology and statistics comments of the referees.
 KG was the consumer co‐author and checked the review for readability and clarity, as well as ensuring outcomes are relevant to consumers.
 JD is the guarantor of the update.

Disclaimer

This project presents independent research supported by the National Institute for Health Research, via Cochrane Infrastructure funding to the Cochrane Skin Group and Cochrane Programme Grant funding, and the NIHR Birmingham Biomedical Research Centre at the University Hospitals Birmingham NHS Foundation Trust and the University of Birmingham. The views and opinions expressed therein are those of the authors and do not necessarily reflect those of the Systematic Reviews Programme, NIHR, NHS or the Department of Health and Social Care.

Sources of support

Internal sources

  • No sources of support supplied

External sources

  • NIHR Systematic Review Programme, UK.

    This project was funded by an NIHR Cochrane Systematic Reviews Programme Grant (13/89/15)

  • The National Institute for Health Research (NIHR), UK.

    The NIHR, UK, is the largest single funder of Cochrane Skin

  • NIHR Birmingham Biomedical Research Centre, UK.

    JD, JJD and YT receive support from the NIHR Birmingham Biomedical Research Centre

Declarations of interest

Jacqueline Dinnes: nothing to declare.
 Jonathan J Deeks: nothing to declare.
 Matthew J Grainge: nothing to declare.
 Naomi Chuchu: nothing to declare.
 Lavinia Ferrante di Ruffano: nothing to declare.
 Rubeta N Matin: "my institution received a grant for a Barco NV commercially sponsored study to evaluate digital dermoscopy in the skin cancer clinic. My institution also received Oxfordshire Health Services Research Charitable Funds for carrying out a study of feasibility of using the Skin Cancer Quality of Life Impact Tool (SCQOLIT) in non melanoma skin cancer. I have received royalties for the Oxford Handbook of Medical Dermatology (Oxford University Press) and payment from the UK Photopheresis Society for a lecture on cutaneous graft versus host disease (October 2017). I have no conflicts of interest to declare that directly relate to the publication of this work."
 David R Thomson: nothing to declare.
 Kai Yuen Wong: nothing to declare.
 Roger Benjamin Aldridge: nothing to declare.
 Rachel Abbott: nothing to declare.
 Monica Fawzy: nothing to declare.
 Susan E Bayliss: nothing to declare.
 Yemisi Takwoingi: nothing to declare.
 Clare Davenport: nothing to declare.
 Kathie Godfrey: nothing to declare.
 Fiona M Walter: nothing to declare.
 Hywel C Williams: I am director of the NIHR HTA Programme. HTA is part of the NIHR, which also supports the NIHR systematic reviews programme from which this work is funded.

Edited (no change to conclusions)

References

References to studies included in this review

Argenziano 2006 {published data only}

  1. Argenziano G, Puig S, Zalaudek I, Sera F, Corona R, Alsina M, et al. Dermoscopy improves accuracy of primary care physicians to triage lesions suggestive of skin cancer. Journal of Clinical Oncology 2006;24(12):1877‐82. [ER4:17940973; PUBMED: 16622262] [DOI] [PubMed] [Google Scholar]

Barzegari 2005 {published data only}

  1. Barzegari M, Ghaninezhad H, Mansoori P, Taheri A, Naraghi ZS, Asgari M. Computer‐aided dermoscopy for diagnosis of melanoma. BMC Dermatology 2005;5:8. [ER4:15465860; PUBMED: 16000171] [DOI] [PMC free article] [PubMed] [Google Scholar]

Benelli 1999 {published data only}

  1. Benelli C, Roscetti E, Pozzo VD, Gasparini G, Cavicchini S. The dermoscopic versus the clinical diagnosis of melanoma. European Journal of Dermatology 1999;9(6):470‐6. [ER4:18375029; PUBMED: 10491506] [PubMed] [Google Scholar]

Benelli 2001 {published data only}

  1. Benelli C, Roscetti E, Dal Pozzo V. Reproducibility of the clinical criteria (ABCDE rule) and dermatoscopic features (7FFM) for the diagnosis of malignant melanoma. European Journal of Dermatology 2001;11(3):234‐9. [ER4:18375028; PUBMED: 11358731] [PubMed] [Google Scholar]

Bono 1996 {published data only}

  1. Bono A, Tomatis S, Bartoli C, Cascinelli N, Clemente C, Cupeta C, et al. The invisible colours of melanoma. A telespectrophotometric diagnostic approach on pigmented skin lesions. European Journal of Cancer 1996;32(4):727‐9. [ER4:20569437; PUBMED: 8695280] [DOI] [PubMed] [Google Scholar]

Bono 2002a {published data only}

  1. Bono A, Bartoli C, Cascinelli N, Lualdi M, Maurichi A, Moglia D, et al. Melanoma detection. A prospective study comparing diagnosis with the naked eye, dermatoscopy and telespectrophotometry. Dermatology 2002;205(4):362‐6. [ER4:15465870; PUBMED: 12444332] [DOI] [PubMed] [Google Scholar]

Bono 2002b {published data only}

  1. Bono A, Bartoli C, Baldi M, Tomatis S, Bifulco C, Santinami M. Clinical and dermatoscopic diagnosis of small pigmented skin lesions. European Journal of Dermatology 2002;12(6):573‐6. [ER4:18375034; PUBMED: 12459531] [PubMed] [Google Scholar]

Bono 2006 {published data only}

  1. Bono A, Tolomio E, Trincone S, Bartoli C, Tomatis S, Carbone A, et al. Micro‐melanoma detection: a clinical study on 206 consecutive cases of pigmented skin lesions with a diameter < or = 3 mm. British Journal of Dermatology 2006;155(3):570‐3. [ER4:15465872; PUBMED: 16911283] [DOI] [PubMed] [Google Scholar]

Bourne 2012 {published data only}

  1. Bourne P, Rosendahl C, Keir J, Cameron A. BLINCK‐A diagnostic algorithm for skin cancer diagnosis combining clinical features with dermatoscopy findings. Dermatology Practical & Conceptual 2012;2(2):202a12. [ER4:17941081; PUBMED: 23785600] [DOI] [PMC free article] [PubMed] [Google Scholar]

Carli 2002a {published data only}

  1. Carli P, Giorgi V, Argenziano G, Palli D, Giannotti B. Pre‐operative diagnosis of pigmented skin lesions: in vivo dermoscopy performs better than dermoscopy on photographic images. Journal of the European Academy of Dermatology & Venereology 2002;16(4):339‐46. [ER4:15465882; PUBMED: 12224689] [DOI] [PubMed] [Google Scholar]

Carli 2002b {published data only}

  1. Carli P, Giorgi V, Salvini C, Mannone F, Chiarugi A. The gold standard for photographing pigmented skin lesions for diagnostic purposes: contact versus distant imaging. Skin Research & Technology 2002;8(4):255‐9. [ER4:15465888; PUBMED: 12423545] [DOI] [PubMed] [Google Scholar]

Carli 2003a {published data only}

  1. Carli P, Giorgi V, Chiarugi A, Nardini P, Mannone F, Stante M, et al. Effect of lesion size on the diagnostic performance of dermoscopy in melanoma detection. Dermatology 2003;206(4):292‐6. [ER4:15465883; PUBMED: 12771468] [DOI] [PubMed] [Google Scholar]

Chang 2013 {published data only}

  1. Chang WY, Huang A, Yang CY, Lee CH, Chen YC, Wu TY, et al. Computer‐aided diagnosis of skin lesions using conventional digital photography: a reliability and feasibility study. PloS One 2013;8(11):e76212. [ER4:15465893; PUBMED: 24223698] [DOI] [PMC free article] [PubMed] [Google Scholar]

Collas 1999 {published data only}

  1. Collas H, Delbarre M, Preville P‐A, Courville P, Neveu C, Dompmartin A, et al. Evaluation of the diagnosis of pigmented tumors of the skin and factors leading to a decision to excise [Evaluation du diagnostic des tumeurs pigmentées de la peau et des éléments conduisant à une décision d'exérèse]. Annales de Dermatologie et de Vénéréologie 1999;126(6‐7):494‐500. [ER4:21450600; PUBMED: 10495858] [PubMed] [Google Scholar]

Cristofolini 1994 {published data only}

  1. Cristofolini M, Zumiani G, Bauer P, Cristofolini P, Boi S, Micciolo R. Dermatoscopy: usefulness in the differential diagnosis of cutaneous pigmentary lesions. Melanoma Research 1994;4(6):391‐4. [ER4:15465898; PUBMED: 7703719] [DOI] [PubMed] [Google Scholar]

Cristofolini 1997 {published data only}

  1. Cristofolini M, Bauer P, Boi S, Cristofolini P, Micciolo R, Sicher MC. Diagnosis of cutaneous melanoma: accuracy of a computerized image analysis system (Skin View). Skin Research and Technology 1997;3(1):23‐7. [ER4:17941039; PUBMED: 27333169] [DOI] [PubMed] [Google Scholar]

de Giorgi 2012 {published data only}

  1. de Giorgi V, Savarese I, Rossari S, Gori A, Grazzini M, Crocetti E, et al. Features of small melanocytic lesions: does small mean benign? A clinical‐dermoscopic study. Melanoma Research 2012;22(3):252‐6. [ER4:18375042; PUBMED: 22430838] [DOI] [PubMed] [Google Scholar]

Dolianitis 2005 {published data only}

  1. Dolianitis C, Kelly J, Wolfe R, Simpson P. Comparative performance of 4 dermoscopic algorithms by nonexperts for the diagnosis of melanocytic lesions. Archives of Dermatology 2005;141(8):1008‐14. [ER4:15465906; PUBMED: 16103330] [DOI] [PubMed] [Google Scholar]

Dummer 1993 {published data only}

  1. Dummer W, Doehnel KA, Remy W. Videomicroscopy in differential diagnosis of skin tumors and secondary prevention of malignant melanoma. Hautarzt 1993;44(12):772‐6. [ER4:18375044; PUBMED: 8113040] [PubMed] [Google Scholar]

Ek 2005 {published data only}

  1. Ek EW, Giorlando F, Su SY, Dieu T. Clinical diagnosis of skin tumours: how good are we? ANZ Journal of Surgery 2005;75(6):415‐20. [DOI: 10.1111/j.1445-2197.2005.03394.x; ER4:20569451; PUBMED: 15943729] [DOI] [PubMed] [Google Scholar]

Gachon 2005 {published data only}

  1. Gachon J, Beaulieu P, Sei JF, Gouvernet J, Claudel JP, Lemaitre M, et al. First prospective study of the recognition process of melanoma in dermatological practice. Archives of Dermatology 2005;141(4):434‐8. [ER4:15465924; PUBMED: 15837860] [DOI] [PubMed] [Google Scholar]

Green 1991 {published data only}

  1. Green A, Martin N, McKenzie G, Pfitzner J, Quintarelli F, Thomas BW, et al. Computer image analysis of pigmented skin lesions. Melanoma Research 1991;1(4):231‐6. [ER4:17941055; PUBMED: 1823631] [DOI] [PubMed] [Google Scholar]

Green 1994 {published data only}

  1. Green A, Martin N, Pfitzner J, O'Rourke M, Knight N. Computer image analysis in the diagnosis of melanoma. Journal of the American Academy of Dermatology 1994;31(6):958‐64. [ER4:15465938; PUBMED: 7962777] [DOI] [PubMed] [Google Scholar]

Grimaldi 2009 {published data only}

  1. Grimaldi L, Silvestri A, Brandi C, Nisi G, Brafa A, Calabro M, et al. Digital epiluminescence dermoscopy for pigmented cutaneous lesions, primary care physicians, and telediagnosis: a useful tool? Journal of Plastic, Reconstructive & Aesthetic Surgery 2009;62(8):1054‐8. [ER4:15465940; PUBMED: 18547883] [DOI] [PubMed] [Google Scholar]

Kopf 1975 {published data only}

  1. Kopf AW, Mintzis M, Bart RS. Diagnostic accuracy in malignant melanoma. Archives of Dermatology 1975;111(10):1291‐2. [DOI: 10.1001/archderm.1975.01630220055001; ER4:21450617; PUBMED: 1190800] [DOI] [PubMed] [Google Scholar]

Krahn 1998 {published data only}

  1. Krahn G, Gottlober P, Sander C, Peter RU. Dermatoscopy and high frequency sonography: two useful non‐invasive methods to increase preoperative diagnostic accuracy in pigmented skin lesions. Pigment Cell Research 1998;11(3):151‐4. [ER4:15465981; PUBMED: 9730322] [DOI] [PubMed] [Google Scholar]

Langley 2001 {published data only}

  1. Langley RG, Rajadhyaksha M, Dwyer PJ, Sober AJ, Flotte TJ, Anderson RR. Confocal scanning laser microscopy of benign and malignant melanocytic skin lesions in vivo. Journal of the American Academy of Dermatology 2001;45(3):365‐76. [DOI: 10.1067/mjd.2001.117395; ER4:20569473; PUBMED: 11511832] [DOI] [PubMed] [Google Scholar]

Lorentzen 1999 {published data only}

  1. Lorentzen H, Weismann K, Petersen CS, Larsen FG, Secher L, Skodt V. Clinical and dermatoscopic diagnosis of malignant melanoma. Assessed by expert and non‐expert groups. Acta Dermato‐Venereologica 1999;79(4):301‐4. [ER4:17941062; PUBMED: 10429989] [DOI] [PubMed] [Google Scholar]

McGovern 1992 {published data only}

  1. McGovern TW, Litaker MS. Clinical predictors of malignant pigmented lesions. A comparison of the Glasgow seven‐point checklist and the American Cancer Society's ABCDs of pigmented lesions. Journal of Dermatologic Surgery & Oncology 1992;18(1):22‐6. [ER4:18375119; PUBMED: 1740563] [DOI] [PubMed] [Google Scholar]

Menzies 2009 {published data only}

  1. Menzies SW, Emery J, Staples M, Davies S, McAvoy B, Fletcher J, et al. Impact of dermoscopy and short‐term sequential digital dermoscopy imaging for the management of pigmented lesions in primary care: a sequential intervention trial. British Journal of Dermatology 2009;161(6):1270‐7. [DOI: 10.1111/j.1365-2133.2009.09374.x; ER4:15466005; PUBMED: 19747359] [DOI] [PubMed] [Google Scholar]

Morales Callaghan 2008 {published data only}

  1. Morales Callaghan AM, Castrodeza‐Sanz J, Martinez‐Garcia G, Peral‐Martinez I, Miranda‐Romero A. Correlation between clinical, dermatoscopic, and histopathologic variables in atypical melanocytic nevi. Actas Dermo‐sifiliograficas 2008;99(5):380‐9. [ER4:17941068; PUBMED: 18501170] [PubMed] [Google Scholar]

Morton 1998a {published data only}

  1. Morton CA, MacKie RM. Clinical accuracy of the diagnosis of cutaneous malignant melanoma. British Journal of Dermatology 1998;138(2):283‐7. [ER4:20569481; PUBMED: 9602875] [DOI] [PubMed] [Google Scholar]

Morton 1998b {published data only}

  1. Morton CA, MacKie RM. Clinical accuracy of the diagnosis of cutaneous malignant melanoma. British Journal of Dermatology 1998;138(2):283‐7. [PUBMED: 9602875] [DOI] [PubMed] [Google Scholar]

Morton 1998c {published data only}

  1. Morton CA, MacKie RM. Clinical accuracy of the diagnosis of cutaneous malignant melanoma. British Journal of Dermatology 1998;138(2):283‐7. [ER4:20569481; PUBMED: 9602875] [DOI] [PubMed] [Google Scholar]

Pizzichetta 2004 {published data only}

  1. Pizzichetta MA, Talamini R, Stanganelli I, Puddu P, Bono R, Argenziano G, et al. Amelanotic/hypomelanotic melanoma: clinical and dermoscopic features. British Journal of Dermatology 2004;150(6):1117‐24. [ER4:15466066; PUBMED: 15214897] [DOI] [PubMed] [Google Scholar]

Rao 1997 {published data only}

  1. Rao BK, Marghoob AA, Stolz W, Kopf AW, Slade J, Wasti Q, et al. Can early malignant melanoma be differentiated from atypical melanocytic nevi by in vivo techniques? Part I. Clinical and dermoscopic characteristics. Skin Research and Technology 1997;3(1):8‐14. [ER4:17941048; PUBMED: 27333167] [DOI] [PubMed] [Google Scholar]

Rosendahl 2011 {published data only}

  1. Rosendahl C, Tschandl P, Cameron A, Kittler H. Diagnostic accuracy of dermatoscopy for melanocytic and nonmelanocytic pigmented lesions. Journal of the American Academy of Dermatology 2011;64(6):1068‐73. [ER4:15466083; PUBMED: 21440329] [DOI] [PubMed] [Google Scholar]

Scope 2008 {published data only}

  1. Scope A, Dusza SW, Halpern AC, Rabinovitz H, Braun RP, Zalaudek I, et al. The "ugly duckling" sign: agreement between observers. Archives of Dermatology 2008;144(1):58‐64. [ER4:15465911; PUBMED: 18209169] [DOI] [PubMed] [Google Scholar]

Soyer 1995 {published data only}

  1. Soyer HP, Smolle J, Leitinger G, Rieger E, Kerl H. Diagnostic reliability of dermoscopic criteria for detecting malignant melanoma. Dermatology 1995;190(1):25‐30. [ER4:18375054; PUBMED: 7894091] [DOI] [PubMed] [Google Scholar]

Stanganelli 1998a {published data only}

  1. Stanganelli I, Serafini M, Cainelli T, Cristofolini M, Baldassari L, Staffa M, et al. Accuracy of epiluminescence microscopy among practical dermatologists: a study from the Emilia‐Romagna region of Italy. Tumori 1998;84(6):701‐5. [ER4:18375055; PUBMED: 10080681] [DOI] [PubMed] [Google Scholar]

Stanganelli 2000 {published data only}

  1. Stanganelli I, Serafini M, Bucchi L. A cancer‐registry‐assisted evaluation of the accuracy of digital epiluminescence microscopy associated with clinical examination of pigmented skin lesions. Dermatology 2000;200(1):11‐6. [ER4:15466129; PUBMED: 10681607] [DOI] [PubMed] [Google Scholar]

Stanganelli 2005 {published data only}

  1. Stanganelli I, Brucale A, Calori L, Gori R, Lovato A, Magi S, et al. Computer‐aided diagnosis of melanocytic lesions. Anticancer Research 2005;25(6C):4577‐82. [ER4:15466126; PUBMED: 16334145] [PubMed] [Google Scholar]

Steiner 1987 {published data only}

  1. Steiner A, Pehamberger H, Wolff K. In vivo epiluminescence microscopy of pigmented skin lesions. II. Diagnosis of small pigmented skin lesions and early detection of malignant melanoma. Journal of the American Academy of Dermatology 1987;17(4):584‐91. [ER4:17940992; PUBMED: 3668003] [DOI] [PubMed] [Google Scholar]

Thomas 1998 {published data only}

  1. Thomas L, Tranchand P, Berard F, Secchi T, Colin C, Moulin G. Semiological value of ABCDE criteria in the diagnosis of cutaneous pigmented tumors. Dermatology 1998;197(1):11‐7. [ER4:15466141; PUBMED: 9693179] [DOI] [PubMed] [Google Scholar]

Troyanova 2003 {published data only}

  1. Troyanova P. A beneficial effect of a short‐term formal training course in epiluminescence microscopy on the diagnostic performance of dermatologists about cutaneous malignant melanoma. Skin Research & Technology 2003;9(3):269‐73. [ER4:17941004; PUBMED: 12877690] [DOI] [PubMed] [Google Scholar]

Unlu 2014 {published data only}

  1. Unlu E, Akay BN, Erdem C. Comparison of dermatoscopic diagnostic algorithms based on calculation: the ABCD rule of dermatoscopy, the seven‐point checklist, the three‐point checklist and the CASH algorithm in dermatoscopic evaluation of melanocytic lesions. Journal of Dermatology 2014;41(7):598‐603. [ER4:15466145; PUBMED: 24807635] [DOI] [PubMed] [Google Scholar]

Viglizzo 2004 {published data only}

  1. Viglizzo G, Rongioletti F. Clinical, dermoscopic and pathologic correlation of pigmentary lesions observed in a dermoscopy service in the year 2003 [Correlazione clinico‐dermoscopico‐patologica delle lesioni cutanee pigmentate osservate in un servizio di dermoscopia nell'anno 2003]. Giornale Italiano di Dermatologia e Venereologia 2004;139(4):339‐44. [ER4:18375099] [Google Scholar]

Walter 2012 {published data only}

  1. Walter FM, Morris HC, Humphrys E, Hall PN, Prevost AT, Burrows N, et al. Effect of adding a diagnostic aid to best practice to manage suspicious pigmented lesions in primary care: randomised controlled trial. BMJ 2012;345:e4110. [ER4:15466154; PUBMED: 22763392] [DOI] [PMC free article] [PubMed] [Google Scholar]

Westerhoff 2000 {published data only}

  1. Westerhoff K, McCarthy WH, Menzies SW. Increase in the sensitivity for melanoma diagnosis by primary care physicians using skin surface microscopy. British Journal of Dermatology 2000;143(5):1016‐20. [ER4:15466164; PUBMED: 11069512] [DOI] [PubMed] [Google Scholar]

Winkelmann 2016 {published data only}

  1. Winkelmann RR, Farberg AS, Tucker N, White R, Rigel DS. Enhancement of international dermatologists' pigmented skin lesion biopsy decisions following dermoscopy with subsequent integration of multispectral digital skin lesion analysis. Journal of Clinical and Aesthetic Dermatology 2016;9(7):53‐5. [ER4:25701735; PUBMED: 27672411] [PMC free article] [PubMed] [Google Scholar]

Zaumseil 1983 {published data only}

  1. Zaumseil RP, Fiedler H, Gstöttner R. Clinico‐diagnostic accuracy in malignant melanoma of the skin [Klinisch‐diagnostische Treffsicherheit beim malignen Melanom der Haut]. Dermatologische Monatsschrift 1983;169(2):101‐5. [ER4:21450660; PUBMED: 6840366] [PubMed] [Google Scholar]

References to studies excluded from this review

Abbasi 2004 {published data only}

  1. Abbasi NR, Shaw HM, Rigel DS, Friedman RJ, McCarthy WH, Osman I, et al. Early diagnosis of cutaneous melanoma: revisiting the ABCD criteria. JAMA 2004;292(22):2771‐6. [DOI] [PubMed] [Google Scholar]

Aldridge 2011a {published data only}

  1. Aldridge RB, Glodzik D, Ballerini L, Fisher RB, Rees JL. Utility of non‐rule‐based visual matching as a strategy to allow novices to achieve skin lesion diagnosis. Acta Dermato‐Venereologica 2011;91(3):279‐83. [DOI] [PMC free article] [PubMed] [Google Scholar]

Aldridge 2011b {published data only}

  1. Aldridge RB, Zanotto M, Ballerini L, Fisher RB, Rees JL. Novice identification of melanoma: not quite as straightforward as the ABCDs. Acta Dermato‐Venereologica 2011;91(2):125‐30. [DOI] [PMC free article] [PubMed] [Google Scholar]

Aldridge 2013 {published data only}

  1. Aldridge RB, Naysmith L, Ooi ET, Murray CS, Rees JL. The importance of a full clinical examination: assessment of index lesions referred to a skin cancer clinic without a total body skin examination would miss one in three melanomas. Acta Dermato‐Venereologica 2013;93(6):689‐92. [DOI] [PMC free article] [PubMed] [Google Scholar]

Alendar 2009 {published data only}

  1. Alendar F, Drljevic I, Drljevic K, Alendar T. Early detection of melanoma skin cancer. Bosnian Journal of Basic Medical Sciences 2009;9(1):77‐80. [DOI] [PMC free article] [PubMed] [Google Scholar]

Argenziano 1999 {published data only}

  1. Argenziano G, Fabbrocini G, Carli P, Giorgi V, Delfino M. Clinical and dermatoscopic criteria for the preoperative evaluation of cutaneous melanoma thickness. Journal of the American Academy of Dermatology 1999;40(1):61‐8. [DOI] [PubMed] [Google Scholar]

Argenziano 2003 {published data only}

  1. Argenziano G, Soyer HP, Chimenti S, Talamini R, Corona R, Sera F, et al. Dermoscopy of pigmented skin lesions: results of a consensus meeting via the Internet. Journal of the American Academy of Dermatology 2003;48(5):679‐93. [DOI] [PubMed] [Google Scholar]

Argenziano 2012 {published data only}

  1. Argenziano G, Zalaudek I, Hofmann‐Wellenhof R, Bakos RM, Bergman W, Blum A, et al. Total body skin examination for skin cancer screening in patients with focused symptoms. Journal of the American Academy of Dermatology 2012;66(2):212‐9. [DOI] [PubMed] [Google Scholar]

Argenziano 2014 {published data only}

  1. Argenziano G, Moscarella E, Annetta A, Battarra VC, Brunetti B, Buligan C, et al. Melanoma detection in Italian pigmented lesion clinics. Giornale Italiano di Dermatologia e Venereologia 2014;149(2):161‐6. [PubMed] [Google Scholar]

Ascierto 2003 {published data only}

  1. Ascierto PA, Palmieri G, Botti G, Satriano RA, Stanganelli I, Bono R, et al. Early diagnosis of malignant melanoma: proposal of a working formulation for the management of cutaneous pigmented lesions from the Melanoma Cooperative Group. International Journal of Oncology 2003;22(6):1209‐15. [PubMed] [Google Scholar]

Badertscher 2015 {published data only}

  1. Badertscher N, Tandjung R, Senn O, Kofmehl R, Held U, Rosemann T, et al. A multifaceted intervention: no increase in general practitioners' competence to diagnose skin cancer (minSKIN) ‐ randomized controlled trial. Journal of the European Academy of Dermatology & Venereology 2015;29(8):1493‐9. [DOI] [PubMed] [Google Scholar]

Bafounta 2001 {published data only}

  1. Bafounta ML, Beauchet A, Aegerter P, Saiag P. Is dermoscopy (epiluminescence microscopy) useful for the diagnosis of melanoma? Results of a meta‐analysis using techniques adapted to the evaluation of diagnostic tests. Archives of Dermatology 2001;137(10):1343‐50. [DOI] [PubMed] [Google Scholar]

Banky 2005 {published data only}

  1. Banky JP, Kelly JW, English DR, Yeatman JM, Dowling JP. Incidence of new and changed nevi and melanomas detected using baseline images and dermoscopy in patients at high risk for melanoma. Archives of Dermatology 2005;141(8):998‐1006. [DOI] [PubMed] [Google Scholar]

Basarab 1996 {published data only}

  1. Basarab T, Munn SE, Jones R. Diagnostic accuracy and appropriateness of general practitioner referrals to a dermatology out‐patient clinic. British Journal of Dermatology 1996;135(1):70‐3. [PubMed] [Google Scholar]

Bauer 2000 {published data only}

  1. Bauer P, Cristofolini P, Boi S, Burroni M, Dell'Eva G, Micciolo R, et al. Digital epiluminescence microscopy: usefulness in the differential diagnosis of cutaneous pigmentary lesions. A statistical comparison between visual and computer inspection. Melanoma Research 2000;10(4):345‐9. [DOI] [PubMed] [Google Scholar]

Bauer 2005 {published data only}

  1. Bauer J, Blum A, Strohhacker U, Garbe C. Surveillance of patients at high risk for cutaneous malignant melanoma using digital dermoscopy. British Journal of Dermatology 2005;152(1):87‐92. [DOI] [PubMed] [Google Scholar]

Becker 1954 {published data only}

  1. Becker S. Pitfalls in the diagnosis and treatment of melanoma. A.M.A. Archives of Dermatology and Syphilology 1954;69(1):11‐30. [DOI] [PubMed] [Google Scholar]

Benelli 2000 {published data only}

  1. Benelli C, Roscetti E, Dal Pozzo V. The dermoscopic (7FFM) versus the clinical (ABCDE) diagnosis of small diameter melanoma. European Journal of Dermatology 2000;10(4):282‐7. [PubMed] [Google Scholar]

Blum 2004a {published data only}

  1. Blum A. Pattern analysis, not simplified algorithms, is the most reliable method for teaching dermoscopy for melanoma diagnosis to residents in dermatology. British Journal of Dermatology 2004;151(2):511‐2. [DOI] [PubMed] [Google Scholar]

Blum 2004b {published data only}

  1. Blum A, Clemens J, Argenziano G. Three‐colour test in dermoscopy: a re‐evaluation. British Journal of Dermatology 2004;150(5):1040. [DOI] [PubMed] [Google Scholar]

Blum 2004c {published data only}

  1. Blum A, Hofmann‐Wellenhof R, Luedtke H, Ellwanger U, Steins A, Roehm S, et al. Value of the clinical history for different users of dermoscopy compared with results of digital image analysis. Journal of the European Academy of Dermatology & Venereology 2004;18(6):665‐9. [DOI] [PubMed] [Google Scholar]

Bolognia 1990 {published data only}

  1. Bolognia JL, Berwick M, Fine JA. Complete follow‐up and evaluation of a skin cancer screening in Connecticut. Journal of the American Academy of Dermatology 1990;23(6 Pt 1):1098‐106. [DOI] [PubMed] [Google Scholar]

Bono 2001 {published data only}

  1. Bono A, Maurichi A, Moglia D, Camerini T, Tragni G, Lualdi M, et al. Clinical and dermatoscopic diagnosis of early amelanotic melanoma. Melanoma Research 2001;11(5):491‐4. [DOI] [PubMed] [Google Scholar]

Borsari 2015 {published data only}

  1. Borsari S, Longo C, Piana S, Moscarella E, Lallas A, Alfano R, et al. When the 'ugly duckling' loses brothers, it becomes the 'only son of a widowed mother'. Dermatology 2015;231(3):222‐3. [DOI] [PubMed] [Google Scholar]

Borve 2012 {published data only}

  1. Borve A, Holst A, Gente‐Lidholm A, Molina‐Martinez R, Paoli J. Use of the mobile phone multimedia messaging service for teledermatology. Journal of Telemedicine & Telecare 2012;18(5):292‐6. [DOI] [PubMed] [Google Scholar]

Brown 2000 {published data only}

  1. Brown N. Exploration of diagnostic techniques for malignant melanoma: an integrative review. Clinical Excellence for Nurse Practitioners 2000;4(5):263‐71. [PubMed] [Google Scholar]

Brown 2009 {published data only}

  1. Brown NH, Robertson KM, Bisset YC, Rees JL. Using a structured image database, how well can novices assign skin lesion images to the correct diagnostic grouping? Journal of Investigative Dermatology 2009;129(10):2509‐12. [DOI] [PubMed] [Google Scholar]

Buhl 2012 {published data only}

  1. Buhl T, Hansen‐Hagge C, Korpas B, Kaune KM, Haas E, Rosenberger A, et al. Integrating static and dynamic features of melanoma: the DynaMel algorithm. Journal of the American Academy of Dermatology 2012;66(1):27‐36. [DOI] [PubMed] [Google Scholar]

Burki 2015 {published data only}

  1. Burki TK. Total body exam or lesion detection screening for skin cancer? Lancet Oncology 2015;16(16):e590. [DOI] [PubMed] [Google Scholar]

Burr 2015 {published data only}

  1. Burr S. The assessment, history taking and differential diagnosis of pigmented skin lesions. Dermatological Nursing 2015;14(4):(5p). [Google Scholar]

Burton 1998 {published data only}

  1. Burton RC, Howe C, Adamson L, Reid AL, Hersey P, Watson A, et al. General practitioner screening for melanoma: sensitivity, specificity, and effect of training. Journal of Medical Screening 1998;5(3):156‐61. [DOI] [PubMed] [Google Scholar]

Carli 2003b {published data only}

  1. Carli P, Giorgi V, Giannotti B, Seidenari S, Pellacani G, Peris K, et al. Skin cancer day in Italy: method of referral to open access clinics and tumor prevalence in the examined population. European Journal of Dermatology 2003;13(1):76‐9. [PubMed] [Google Scholar]

Carli 2003c {published data only}

  1. Carli P, Mannone F, Giorgi V, Nardini P, Chiarugi A, Giannotti B. The problem of false‐positive diagnosis in melanoma screening: the impact of dermoscopy. Melanoma Research 2003;13(2):179‐82. [DOI] [PubMed] [Google Scholar]

Carli 2004a {published data only}

  1. Carli P, Giorgi V, Chiarugi A, Nardini P, Weinstock MA, Crocetti E, et al. Addition of dermoscopy to conventional naked‐eye examination in melanoma screening: a randomized study. Journal of the American Academy of Dermatology 2004;50(5):683‐9. [DOI] [PubMed] [Google Scholar]

Carli 2004b {published data only}

  1. Carli P, Giorgi V, Crocetti E, Mannone F, Massi D, Chiarugi A, et al. Improvement of malignant/benign ratio in excised melanocytic lesions in the 'dermoscopy era': a retrospective study 1997‐2001. British Journal of Dermatology 2004;150(4):687‐92. [DOI] [PubMed] [Google Scholar]

Carli 2004c {published data only}

  1. Carli P, Nardini P, Crocetti E, Giorgi V, Giannotti B. Frequency and characteristics of melanomas missed at a pigmented lesion clinic: a registry‐based study. Melanoma Research 2004;14(5):403‐7. [DOI] [PubMed] [Google Scholar]

Carli 2005 {published data only}

  1. Carli P, Chiarugi A, Giorgi V. Examination of lesions (including dermoscopy) without contact with the patient is associated with improper management in about 30% of equivocal melanomas. Dermatologic Surgery 2005;31(2):169‐72. [DOI] [PubMed] [Google Scholar]

Carlos‐Ortega 2007 {published data only}

  1. Carlos‐Ortega B, Sanchez‐Alva ME, Ysita‐Morales A, Angeles‐Garay U. Correlation among simple observation and dermoscopy in the study of pigmented lesions of the skin. Revista Medica del Instituto Mexicano del Seguro Social 2007;45(6):541‐8. [PubMed] [Google Scholar]

Chen 2001 {published data only}

  1. Chen SC, Bravata DM, Weil E, Olkin I. A comparison of dermatologists' and primary care physicians' accuracy in diagnosing melanoma (structured abstract). Archives of Dermatology 2001;137(12):1627‐34. [DOI] [PubMed] [Google Scholar]

Chen 2006 {published data only}

  1. Chen SC, Pennie ML, Kolm P, Warshaw EM, Weisberg EL, Brown KM, et al. Diagnosing and managing cutaneous pigmented lesions: primary care physicians versus dermatologists. Journal of General Internal Medicine 2006;21(7):678‐82. [DOI] [PMC free article] [PubMed] [Google Scholar]

Chiaravalloti 2014 {published data only}

  1. Chiaravalloti AJ, Laduca JR. Melanoma screening by means of complete skin exams for all patients in a dermatology practice reduces the thickness of primary melanomas at diagnosis. Journal of Clinical & Aesthetic Dermatology 2014;7(8):18‐22. [PMC free article] [PubMed] [Google Scholar]

Ciudad‐Blanco 2014 {published data only}

  1. Ciudad‐Blanco C, Aviles‐Izquierdo JA, Lazaro‐Ochaita P, Suarez‐Fernandez R. Dermoscopic findings for the early detection of melanoma: an analysis of 200 cases. Actas Dermo‐Sifiliograficas 2014;105(7):683‐93. [DOI] [PubMed] [Google Scholar]

Cooper 2002 {published data only}

  1. Cooper SM, Wojnarowska F. The accuracy of clinical diagnosis of suspected premalignant and malignant skin lesions in renal transplant recipients. Clinical and Experimental Dermatology 2002;27(6):436‐8. [DOI] [PubMed] [Google Scholar]

Cornell 2015 {published data only}

  1. Cornell E, Robertson K, McIntosh RD, Rees JL. Viewing exemplars of melanomas and benign mimics of melanoma modestly improves diagnostic skills in comparison with the ABCD method and other image‐based methods for lay identification of melanoma. Acta Dermato‐Venereologica 2015;95(6):681‐5. [DOI] [PubMed] [Google Scholar]

Cox 2008 {published data only}

  1. Cox NH, Madan V, Sanders T. The U.K. skin cancer 'two‐week rule' proforma: assessment of potential modifications to improve referral accuracy. British Journal of Dermatology 2008;158(6):1293‐8. [DOI] [PubMed] [Google Scholar]

DeCoste 1993 {published data only}

  1. DeCoste SD, Stern RS. Diagnosis and treatment of nevomelanocytic lesions of the skin: a community‐based study. Archives of Dermatology 1993;129(1):57‐62. [PubMed] [Google Scholar]

De Giorgi 2011 {published data only}

  1. De Giorgi V, Grazzini M, Rossari S, Gori A, Alfaioli B, Papi F, et al. Adding dermatoscopy to naked eye examination of equivocal melanocytic skin lesions: effect on intention to excise by general dermatologists. Clinical & Experimental Dermatology 2011;36(3):255‐9. [DOI] [PubMed] [Google Scholar]

Di Carlo 2014 {published data only}

  1. Di Carlo A, Elia F, Desiderio F, Catricala C, Solivetti FM, Laino L. Can video thermography improve differential diagnosis and therapy between basal cell carcinoma and actinic keratosis? Dermatologic Therapy 2014;27(5):290‐7. [DOI] [PubMed] [Google Scholar]

Di Chiacchio 2010 {published data only}

  1. Di Chiacchio N, Hirata SH, Enokihara MY, Michalany NS, Fabbrocini G, Tosti A. Dermatologists' accuracy in early diagnosis of melanoma of the nail matrix. Archives of Dermatology 2010;146(4):382‐7. [DOI] [PubMed] [Google Scholar]

Dreiseitl 2009 {published data only}

  1. Dreiseitl S, Binder M, Hable K, Kittler H. Computer versus human diagnosis of melanoma: evaluation of the feasibility of an automated diagnostic system in a prospective clinical trial. Melanoma Research 2009;19(3):180‐4. [DOI] [PubMed] [Google Scholar]

Duff 2001 {published data only}

  1. Duff CG, Melsom D, Rigby HS, Kenealy JM, Townsend PL. A 6 year prospective analysis of the diagnosis of malignant melanoma in a pigmented‐lesion clinic: even the experts miss malignant melanomas, but not often. British Journal of Plastic Surgery 2001;54(4):317‐21. [DOI] [PubMed] [Google Scholar]

Edmondson 1999 {published data only}

  1. Edmondson PC, Curley RK, Marsden RA, Robinson D, Allaway SL, Willson CD. Screening for malignant melanoma using instant photography. Journal of Medical Screening 1999;6(1):42‐6. [DOI] [PubMed] [Google Scholar]

Emmons 2011 {published data only}

  1. Emmons KM, Geller AC, Puleo E, Savadatti SS, Hu SW, Gorham S, et al. Skin cancer education and early detection at the beach: a randomized trial of dermatologist examination and biometric feedback. Journal of the American Academy of Dermatology 2011;64(2):282‐9. [DOI] [PMC free article] [PubMed] [Google Scholar]

Engelberg 1999 {published data only}

  1. Engelberg D, Gallagher RP, Rivers JK. Follow‐up and evaluation of skin cancer screening in British Columbia. Journal of the American Academy of Dermatology 1999;41(1):37‐42. [DOI] [PubMed] [Google Scholar]

English 2003 {published data only}

  1. English DR, Burton RC, Mar CB, Donovan RJ, Ireland PD, Emery G. Evaluation of aid to diagnosis of pigmented skin lesions in general practice: controlled trial randomised by practice. BMJ 2003;327(7411):375. [DOI] [PMC free article] [PubMed] [Google Scholar]

English 2004 {published data only}

  1. English DR, Mar C, Burton RC. Factors influencing the number needed to excise: excision rates of pigmented lesions by general practitioners. Medical Journal of Australia 2004;180(1):16‐9. [DOI] [PubMed] [Google Scholar]

Fabbrocini 2008 {published data only}

  1. Fabbrocini G, Balato A, Rescigno O, Mariano M, Scalvenzi M, Brunetti B. Telediagnosis and face‐to‐face diagnosis reliability for melanocytic and non‐melanocytic 'pink' lesions. Journal of the European Academy of Dermatology & Venereology 2008;22(2):229‐34. [DOI] [PubMed] [Google Scholar]

Federman 1995 {published data only}

  1. Federman D, Hogan D, Taylor JR, Caralis P, Kirsner RS. A comparison of diagnosis, evaluation, and treatment of patients with dermatologic disorders. Journal of the American Academy of Dermatology 1995;32(5, Part 1):726‐9. [DOI] [PubMed] [Google Scholar]

Fikrle 2013 {published data only}

  1. Fikrle T, Pizinger K, Szakos H, Panznerova P, Divisova B, Pavel S. Digital dermatoscopic follow‐up of 1027 melanocytic lesions in 121 patients at risk of malignant melanoma. Journal of the European Academy of Dermatology & Venereology 2013;27(2):180‐6. [DOI] [PubMed] [Google Scholar]

Freeman 1963 {published data only}

  1. Freeman RG, Knox JM. Clinical accuracy in diagnosis of skin tumors. Geriatrics 1963;18:546‐51. [PubMed] [Google Scholar]

Friedman 1985 {published data only}

  1. Friedman RJ, Rigel DS. The clinical features of malignant melanoma. Dermatologic Clinics 1985;3(2):271‐83. [PubMed] [Google Scholar]

Funt 1963 {published data only}

  1. Funt TR. Early recognition of cutaneous malignant melanoma in adults. Journal of the Florida Medical Association 1963;50:280‐2. [PubMed] [Google Scholar]

Gerbert 1996 {published data only}

  1. Gerbert B, Maurer T, Berger T, Pantilat S, McPhee SJ, Wolff M, et al. Primary care physicians as gatekeepers in managed care: primary care physicians' and dermatologists' skills at secondary prevention of skin cancer. Archives of Dermatology 1996;132(9):1030‐8. [PubMed] [Google Scholar]

Gerbert 1998 {published data only}

  1. Gerbert B, Bronstone A, Wolff M, Maurer T, Berger T, Pantilat S, et al. Improving primary care residents’ proficiency in the diagnosis of skin cancer. Journal of General Internal Medicine 1998;13(2):91‐7. [DOI] [PMC free article] [PubMed] [Google Scholar]

Giannotti 2004 {published data only}

  1. Giannotti B, Carli P. Improvement of early diagnosis of melanoma in a Mediterranean population: the experience of the Florence melanoma clinic [Novita in tema di diagnosi precoce del melanoma cutaneo: l'esperienza del gruppo Fiorentino]. Giornale Italiano di Dermatologia e Venereologia 2004;139(2):89‐96. [Google Scholar]

Grana 2003 {published data only}

  1. Grana C, Pellacani G, Cucchiara R, Seidenari S. A new algorithm for border description of polarized light surface microscopic images of pigmented skin lesions. IEEE Transactions on Medical Imaging 2003;22(8):959‐64. [DOI] [PubMed] [Google Scholar]

Grob 1998 {published data only}

  1. Grob JJ, Bonerandi JJ. The 'ugly duckling' sign: identification of the common characteristics of nevi in an individual as a basis for melanoma screening. Archives of Dermatology 1998;134(1):103‐4. [DOI] [PubMed] [Google Scholar]

Guibert 2000 {published data only}

  1. Guibert P, Mollat F, Ligen M, Dreno B. Melanoma screening: report of a survey in occupational medicine. Archives of Dermatology 2000;136(2):199‐202. [DOI] [PubMed] [Google Scholar]

Gunduz 2003 {published data only}

  1. Gunduz K, Koltan S, Sahin MT, Filiz EE. Analysis of melanocytic naevi by dermoscopy during pregnancy. Journal of the European Academy of Dermatology & Venereology 2003;17(3):349‐51. [DOI] [PubMed] [Google Scholar]

Gutierrez 2013 {published data only}

  1. Gutierrez R, Rueda A, Romero E. Learning semantic histopathological representation for basal cell carcinoma classification. Proc. SPIE 8676, Medical Imaging 2013: Digital Pathology, 86760U (accessed 29 March 2013). [DOI: 10.1117/12.2007117] [DOI]

Hacioglu 2013 {published data only}

  1. Hacioglu S, Saricaoglu H, Baskan EB, Uner SI, Aydogan K, Tunali S. The value of spectrophotometric intracutaneous analysis in the noninvasive diagnosis of nonmelanoma skin cancers. Clinical & Experimental Dermatology 2013;38(5):464‐9. [DOI] [PubMed] [Google Scholar]

Haenssle 2010 {published data only}

  1. Haenssle HA, Korpas B, Hansen‐Hagge C, Buhl T, Kaune KM, Johnsen S, et al. Selection of patients for long‐term surveillance with digital dermoscopy by assessment of melanoma risk factors. Archives of Dermatology 2010;146(3):257‐64. [DOI] [PubMed] [Google Scholar]

Haenssle 2010a {published data only}

  1. Haenssle HA, Korpas B, Hansen‐Hagge C, Buhl T, Kaune KM, Rosenberger A, et al. Seven‐point checklist for dermatoscopy: performance during 10 years of prospective surveillance of patients at increased melanoma risk. Journal of the American Academy of Dermatology 2010;62(5):785‐93. [DOI] [PubMed] [Google Scholar]

Hallock 1998 {published data only}

  1. Hallock GG, Lutz DA. Prospective study of the accuracy of the surgeon's diagnosis in 2000 excised skin tumors. Plastic and Reconstructive Surgery 1998;101(5):1255‐61. [DOI] [PubMed] [Google Scholar]

Haniffa 2007 {published data only}

  1. Haniffa MA, Lloyd JJ, Lawrence CM. The use of a spectrophotometric intracutaneous analysis device in the real‐time diagnosis of melanoma in the setting of a melanoma screening clinic. British Journal of Dermatology 2007;156(6):1350‐2. [DOI] [PubMed] [Google Scholar]

Har‐Shai 2001 {published data only}

  1. Har‐Shai Y, Hai N, Taran A, Mayblum S, Barak A, Tzur E, et al. Sensitivity and positive predictive values of presurgical clinical diagnosis of excised benign and malignant skin tumors: a prospective study of 835 lesions in 778 patients. Plastic and Reconstructive Surgery 2001;108(7):1982‐9. [DOI] [PubMed] [Google Scholar]

Heal 2008 {published data only}

  1. Heal CF, Raasch BA, Buettner PG, Weedon D. Accuracy of clinical diagnosis of skin lesions. British Journal of Dermatology 2008;159(3):661‐8. [DOI] [PubMed] [Google Scholar]

Healsmith 1994 {published data only}

  1. Healsmith MF, Bourke JF, Osborne JE, Graham‐Brown RA. An evaluation of the revised seven‐point checklist for the early diagnosis of cutaneous malignant melanoma. British Journal of Dermatology 1994;130(1):48‐50. [DOI] [PubMed] [Google Scholar]

Higgins 1992 {published data only}

  1. Higgins EM, Hall P, Todd P, Murthi R, Du Vivier AW. The application of the seven‐point check‐list in the assessment of benign pigmented lesions. Clinical & Experimental Dermatology 1992;17(5):313‐5. [DOI] [PubMed] [Google Scholar]

Hoorens 2016 {published data only}

  1. Hoorens I, Vossaert K, Pil L, Boone B, Schepper S, Ongenae K, et al. Total‐body examination vs lesion‐directed skin cancer screening. JAMA Dermatology 2016;152(1):27‐34. [DOI] [PubMed] [Google Scholar]

Huang 1996 {published data only}

  1. Huang CL, Wasti Q, Marghoob AA, Kopf AW, David M, Rao BK, et al. Border irregularity: atypical moles versus melanoma. European Journal of Dermatology 1996;6(4):270‐3. [Google Scholar]

Jamora 2003 {published data only}

  1. Jamora MJ, Wainwright BD, Meehan SA, Bystryn J‐C. Improved identification of potentially dangerous pigmented skin lesions by computerized image analysis. Archives of Dermatology 2003;139(2):195‐8. [DOI] [PubMed] [Google Scholar]

Janda 2014 {published data only}

  1. Janda M, Loescher LJ, Banan P, Horsham C, Soyer HP. Lesion selection by melanoma high‐risk consumers during skin self‐examination using mobile teledermoscopy. JAMA Dermatology 2014;150(6):656‐8. [DOI] [PubMed] [Google Scholar]

Jensen 2015 {published data only}

  1. Jensen JD, Elewski BE. The ABCDEF rule: combining the 'ABCDE rule' and the 'ugly duckling sign' in an effort to improve patient self‐screening examinations. Journal of Clinical and Aesthetic Dermatology 2015;8(2):15. [PMC free article] [PubMed] [Google Scholar]

Jolliffe 2001 {published data only}

  1. Jolliffe VM, Harris DW, Whittaker SJ. Can we safely diagnose pigmented lesions from stored video images? A diagnostic comparison between clinical examination and stored video images of pigmented lesions removed for histology. Clinical & Experimental Dermatology 2001;26(1):84‐7. [DOI] [PubMed] [Google Scholar]

Jonna 1998 {published data only}

  1. Jonna BP, Delfino RJ, Newman WG, Tope WD. Positive predictive value for presumptive diagnoses of skin cancer and compliance with follow‐up among patients attending a community screening program. Preventive Medicine 1998;27(4):611‐6. [DOI] [PubMed] [Google Scholar]

Kaddu 1997 {published data only}

  1. Kaddu S, Soyer HP, Wolf IH, Rieger E, Kerl H. Reticular lentigo. Hautarzt 1997;48(3):181‐5. [DOI] [PubMed] [Google Scholar]

Keefe 1990 {published data only}

  1. Keefe M, Dick DC, Wakeel RA. A study of the value of the seven‐point checklist in distinguishing benign pigmented lesions from melanoma. Clinical & Experimental Dermatology 1990;15(3):167‐71. [DOI] [PubMed] [Google Scholar]

Kelly 1986 {published data only}

  1. Kelly JW, Crutcher WA, Sagebiel RW. Clinical diagnosis of dysplastic melanocytic nevi. A clinicopathologic correlation. Journal of the American Academy of Dermatology 1986;14(6):1044‐52. [DOI] [PubMed] [Google Scholar]

Koh 1990 {published data only}

  1. Koh HK, Caruso A, Gage I, Geller AC, Prout MN, White H, et al. Evaluation of melanoma/skin cancer screening in Massachusetts. Preliminary results. Cancer 1990;65(2):375‐9. [DOI] [PubMed] [Google Scholar]

Kroemer 2011 {published data only}

  1. Kroemer S, Fruhauf J, Campbell TM, Massone C, Schwantzer G, Soyer HP, et al. Mobile teledermatology for skin tumour screening: diagnostic accuracy of clinical and dermoscopic image tele‐evaluation using cellular phones. British Journal of Dermatology 2011;164(5):973‐9. [DOI] [PubMed] [Google Scholar]

Krol 1991 {published data only}

  1. Krol S, Keijser LM, Rhee HJ, Welvaart K. Screening for skin cancer in the Netherlands. Acta Dermato‐Venereologica 1991;71(4):317‐21. [PubMed] [Google Scholar]

Kurvers 2015 {published data only}

  1. Kurvers RH, Krause J, Argenziano G, Zalaudek I, Wolf M. Detection accuracy of collective intelligence assessments for skin cancer diagnosis. JAMA Dermatology 2015;151(12):1346‐53. [DOI] [PubMed] [Google Scholar]

Kvedar 1997 {published data only}

  1. Kvedar JC, Edwards RA, Menn ER, et al. The substitution of digital images for dermatologic physical examination. Archives of Dermatology 1997;133(2):161‐7. [PubMed] [Google Scholar]

Lechner 2015 {published data only}

  1. Lechner SC, Pereira LC, Reategui E, Gordon C, Byrne M, Hooper MW, et al. Erratum to: Acceptability of a rinse screening test for diagnosing head and neck squamous cell carcinoma among Black Americans. Journal of Racial and Ethnic Health Disparities 2015;2(1):68. [DOI] [PubMed] [Google Scholar]

Lewis 1999 {published data only}

  1. Lewis K, Gilmour E, Harrison PV, Patefield S, Dickinson Y, Manning D, et al. Digital teledermatology for skin tumours: a preliminary assessment using a receiver operating characteristics (ROC) analysis. Journal of Telemedicine and Telecare 1999;5(Suppl 1):S57‐8. [DOI] [PubMed] [Google Scholar]

Lindelöf 1994 {published data only}

  1. Lindelöf B, Hedblad MA. Accuracy in the clinical diagnosis and pattern of malignant melanoma at a dermatological clinic. Journal of Dermatology 1994;21(7):461‐4. [DOI] [PubMed] [Google Scholar]

Lorentzen 2000 {published data only}

  1. Lorentzen H, Weismann K, Kenet RO, Secher L, Larsen FG. Comparison of dermatoscopic ABCD rule and risk stratification in the diagnosis of malignant melanoma. Acta Dermato‐Venereologica 2000;80(2):122‐6. [PubMed] [Google Scholar]

Luttrell 2012 {published data only}

  1. Luttrell MJ, McClenahan P, Hofmann‐Wellenhof R, Fink‐Puches R, Soyer HP. Laypersons' sensitivity for melanoma identification is higher with dermoscopy images than clinical photographs. British Journal of Dermatology 2012;167(5):1037‐41. [DOI] [PubMed] [Google Scholar]

Machet 2005 {published data only}

  1. Machet L, Nemeth‐Normand F, Giraudeau B, Perrinaud A, Tiguemounine J, Ayoub J, et al. Is ultrasound lymph node examination superior to clinical examination in melanoma follow‐up? A monocentre cohort study of 373 patients. British Journal of Dermatology 2005;152(1):66‐70. [DOI] [PubMed] [Google Scholar]

MacKenzie‐Wood 1998 {published data only}

  1. MacKenzie‐Wood AR, Milton GW, Launey JW. Melanoma: accuracy of clinical diagnosis. Australasian Journal of Dermatology 1998;39(1):31‐3. [DOI] [PubMed] [Google Scholar]

MacKie 1990 {published data only}

  1. MacKie RM. Clinical recognition of early invasive malignant melanoma. BMJ 1990;301(6759):1005‐6. [DOI] [PMC free article] [PubMed] [Google Scholar]

Mackie 1991 {published data only}

  1. MacKie RM, Doherty VR. Seven‐point checklist for melanoma. Clinical & Experimental Dermatology 1991;16(2):151‐3. [DOI] [PubMed] [Google Scholar]

Mackie 2002 {published data only}

  1. MacKie RM, Fleming C, McMahon AD, Jarrett P. The use of the dermatoscope to identify early melanoma using the three‐colour test. British Journal of Dermatology 2002;146(3):481‐4. [DOI] [PubMed] [Google Scholar]

Mahendran 2005 {published data only}

  1. Mahendran R, Goodfield MJ, Sheehan‐Dare RA. An evaluation of the role of a store‐and‐forward teledermatology system in skin cancer diagnosis and management. Clinical & Experimental Dermatology 2005;30(3):209‐14. [DOI] [PubMed] [Google Scholar]

Mahon 1997 {published data only}

  1. Mahon SM. A comparison of findings from two checklists for the early detection of skin cancer. Missouri Nurse 1997;66(2):12. [PubMed] [Google Scholar]

Malvehy 2014 {published data only}

  1. Malvehy J, Hauschild A, Curiel‐Lewandrowski C, Mohr P, Hofmann‐Wellenhof R, Motley R, et al. Clinical performance of the Nevisense system in cutaneous melanoma detection: an international, multicentre, prospective and blinded clinical trial on efficacy and safety. British Journal of Dermatology 2014;171(5):1099‐107. [DOI] [PMC free article] [PubMed] [Google Scholar]

Marghoob 1995 {published data only}

  1. Marghoob AA, Slade J, Kopf AW, Rigel DS, Friedman RJ, Perelman RO. The ABCDs of melanoma: why change? Journal of the American Academy of Dermatology 1995;32(4):682‐4. [DOI] [PubMed] [Google Scholar]

Marghoob 2007 {published data only}

  1. Marghoob AA, Korzenko AJ, Changchien L, Scope A, Braun RP, Rabinovitz H. The beauty and the beast sign in dermoscopy. Dermatologic Surgery 2007;33(11):1388‐91. [DOI] [PubMed] [Google Scholar]

Markowitz 2015 {published data only}

  1. Markowitz O, Schwartz M, Feldman E, Bienenfeld A, Bieber AK, Ellis J, et al. Evaluation of optical coherence tomography as a means of identifying earlier stage basal cell carcinomas while reducing the use of diagnostic biopsy. Journal of Clinical & Aesthetic Dermatology 2015;8(10):14‐20. [PMC free article] [PubMed] [Google Scholar]

McCarthy 1995 {published data only}

  1. McCarthy JT. ABCDs of Melanoma. Cutis 1995;56(6):313. [PubMed] [Google Scholar]

McMullan 1956 {published data only}

  1. McMullan FH, Hubener LF. Malignant melanoma: a statistical review of clinical and histological diagnoses. A.M.A. Archives of Dermatology 1956;74(6):618‐9. [PubMed] [Google Scholar]

Menzies 2008 {published data only}

  1. Menzies SW, Kreusch J, Byth K, Pizzichetta MA, Marghoob A, Braun R, et al. Dermoscopic evaluation of amelanotic and hypomelanotic melanoma. Archives of Dermatology 2008;144(9):1120‐7. [DOI] [PubMed] [Google Scholar]

Menzies 2011 {published data only}

  1. Menzies SW, Stevenson ML, Altamura D, Byth K. Variables predicting change in benign melanocytic nevi undergoing short‐term dermoscopic imaging. Archives of Dermatology 2011;147(6):655‐9. [DOI] [PubMed] [Google Scholar]

Menzies 2013 {published data only}

  1. Menzies SW, Moloney FJ, Byth K, Avramidis M, Argenziano G, Zalaudek I, et al. Dermoscopic evaluation of nodular melanoma. JAMA Dermatology 2013;149(6):699‐709. [DOI] [PubMed] [Google Scholar]

Moffatt 2006 {published data only}

  1. Moffatt CR, Green AC, Whiteman DC. Diagnostic accuracy in skin cancer clinics: the Australian experience. International Journal of Dermatology 2006;45(6):656‐60. [DOI] [PubMed] [Google Scholar]

Mohammad 2015 {published data only}

  1. Mohammad E‐A, Mansour M, Parichehr K, Farideh D, Amirhossein R, Ahmad S‐A. Assessment of clinical diagnostic accuracy compared with pathological diagnosis of basal cell carcinoma. Indian Dermatology Online Journal 2015;6(4):258‐62. [DOI] [PMC free article] [PubMed] [Google Scholar]

Morrison 2001 {published data only}

  1. Morrison A, O'Loughlin S, Powell FC. Suspected skin malignancy: a comparison of diagnoses of family practitioners and dermatologists in 493 patients. International Journal of Dermatology 2001;40(2):104‐7. [DOI] [PubMed] [Google Scholar]

Nachbar 1994 {published data only}

  1. Nachbar F, Stolz W, Merkle T, Cognetta AB, Vogt T, Landthaler M, et al. The ABCD rule of dermatoscopy. High prospective value in the diagnosis of doubtful melanocytic skin lesions. Journal of the American Academy of Dermatology 1994;30(4):551‐9. [DOI] [PubMed] [Google Scholar]

Nathansohn 2007 {published data only}

  1. Nathansohn N, Orenstein A, Trau H, Liran A, Schachter J. Pigmented lesions clinic for early detection of melanoma: preliminary results. Israel Medical Association Journal 2007;9(10):708‐12. [PubMed] [Google Scholar]

Nilles 1994 {published data only}

  1. Nilles M, Boedeker RH, Schill WB. Surface microscopy of naevi and melanomas‐‐clues to melanoma. British Journal of Dermatology 1994;130(3):349‐55. [DOI] [PubMed] [Google Scholar]

Osborne 1998 {published data only}

  1. Osborne JE, Bourke JF, Holder J, Colloby P, Graham‐Brown RA. The effect of the introduction of a pigmented lesion clinic on the interval between referral by family practitioner and attendance at hospital. British Journal of Dermatology 1998;138(3):418‐21. [DOI] [PubMed] [Google Scholar]

Osborne 1999 {published data only}

  1. Osborne JE, Bourke JF, Graham‐Brown RA, Hutchinson PE. False negative clinical diagnoses of malignant melanoma. British Journal of Dermatology 1999;140(5):902‐8. [DOI] [PubMed] [Google Scholar]

Parslew 1997 {published data only}

  1. Parslew RA, Rhodes LE. Accuracy of diagnosis of benign skin lesions in hospital practice: a comparison of clinical and histological findings. Journal of the European Academy of Dermatology and Venereology 1997;9(2):137‐41. [Google Scholar]

Pazzini 1996 {published data only}

  1. Pazzini C, Pozzi M, Betti R, Vergani R, Crosti C. Improvement of diagnostic accuracy in the clinical diagnosis of pigmented skin lesions by epiluminescence microscopy. Skin Cancer 1996;11(2):159‐61. [Google Scholar]

Perednia 1992 {published data only}

  1. Perednia DA, Gaines JA, Rossum AC. Variability in physician assessment of lesions in cutaneous images and its implications for skin screening and computer‐assisted diagnosis. Archives of Dermatology 1992;128(3):357‐64. [PubMed] [Google Scholar]

Perrinaud 2007 {published data only}

  1. Perrinaud A, Gaide O, French LE, Saurat JH, Marghoob AA, Braun RP. Can automated dermoscopy image analysis instruments provide added benefit for the dermatologist? A study comparing the results of three systems. British Journal of Dermatology 2007;157(5):926‐33. [DOI] [PubMed] [Google Scholar]

Piccolo 2000 {published data only}

  1. Piccolo D, Smolle J, Argenziano G, Wolf IH, Braun R, Cerroni L, et al. Teledermoscopy‐‐results of a multicentre study on 43 pigmented skin lesions. Journal of Telemedicine & Telecare 2000;6(3):132‐7. [DOI] [PubMed] [Google Scholar]

Piccolo 2002 {published data only}

  1. Piccolo D, Peris K, Chimenti S, Argenziano G, Soyer HP. Jumping into the future using teledermoscopy. Skinmed 2002;1(1):20‐4. [DOI] [PubMed] [Google Scholar]

Pizzichetta 2001 {published data only}

  1. Pizzichetta MA, Talamini R, Piccolo D, Argenziano G, Pagnanelli G, Burgdorf T, et al. The ABCD rule of dermatoscopy does not apply to small melanocytic skin lesions. Archives of Dermatology 2001;137(10):1376‐8. [PubMed] [Google Scholar]

Provost 1998 {published data only}

  1. Provost N, Kopf AW, Rabinovitz HS, Stolz W, DeDavid M, Wasti Q, et al. Comparison of conventional photographs and telephonically transmitted compressed digitized images of melanomas and dysplastic nevi. Dermatology 1998;196(3):299‐304. [DOI] [PubMed] [Google Scholar]

Quereux 2011 {published data only}

  1. Quereux G, Lequeux Y, Cary M, Jumbou O, Nguyen JM, Dreno B. Feasibility and effectiveness of a melanoma targeted screening strategy. Melanoma Research 2011;21:e1‐2. [Google Scholar]

Rallan 2006 {published data only}

  1. Rallan D, Dickson M, Bush NL, Harland CC, Mortimer P, Bamber JC. High‐resolution ultrasound reflex transmission imaging and digital photography: potential tools for the quantitative assessment of pigmented lesions. Skin Research & Technology 2006;12(1):50‐9. [DOI] [PubMed] [Google Scholar]

Rampen 1988 {published data only}

  1. Rampen FH, Rumke P. Referral pattern and accuracy of clinical diagnosis of cutaneous melanoma. Acta Dermato‐Venereologica 1988;68(1):61‐4. [PubMed] [Google Scholar]

Reeck 1999 {published data only}

  1. Reeck MC, Chuang T‐Y, Eads TJ, Faust HB, Farmer ER, Hood AF. The diagnostic yield in submitting nevi for histologic examination. Journal of the American Academy of Dermatology 1999;40(4):567‐71. [DOI] [PubMed] [Google Scholar]

Riddell 1961 {published data only}

  1. Riddell Jr JM. A report of 300 patients with skin cancer. Texas State Journal of Medicine 1961;57:588‐92. [PubMed] [Google Scholar]

Rigel 1993 {published data only}

  1. Rigel DS, Friedman RJ. The rationale of the ABCDs of early melanoma. Journal of the American Academy of Dermatology 1993;29(6):1060‐1. [DOI] [PubMed] [Google Scholar]

Robati 2014 {published data only}

  1. Robati RM, Toossi P, Karimi M, Ayatollahi A, Esmaeli M. Screening for skin cancer: a pilot study in Tehran, Iran. Indian Journal of Dermatology 2014;59(1):1‐4. [DOI] [PMC free article] [PubMed] [Google Scholar]

Robinson 2010 {published data only}

  1. Robinson JK, Turrisi R, Mallett K, Stapleton J, Pion M. Comparing the efficacy of an in‐person intervention with a skin self‐examination workbook. Archives of Dermatology 2010;146(1):91‐4. [DOI] [PMC free article] [PubMed] [Google Scholar]

Rosado 2003 {published data only}

  1. Rosado B, Menzies S, Harbauer A, Pehamberger H, Wolff K, Binder M, et al. Accuracy of computer diagnosis of melanoma: a quantitative meta‐analysis. Archives of Dermatology 2003;139(3):361‐7; discussion 366. [DOI] [PubMed] [Google Scholar]

Rossi 2000 {published data only}

  1. Rossi CR, Vecchiato A, Bezze G, Mastrangelo G, Montesco MC, Mocellin S, et al. Early detection of melanoma: an educational campaign in Padova, Italy. Melanoma Research 2000;10(2):181‐7. [PubMed] [Google Scholar]

Roush 1986 {published data only}

  1. Roush GC, Kirkwood JM, Ernstoff M, Somma SJ, Duray PH, Klaus SN, et al. Reproducibility and validity in the clinical diagnosis of the nonfamilial dysplastic nevus: work in progress. Recent Results in Cancer Research 1986;102:154‐8. [DOI] [PubMed] [Google Scholar]

Salvio 2011 {published data only}

  1. Salvio AG, Assumpcao JA, Segalla JG, Panfilo BL, Nicolini HR, Didone R. One year experience of a model for melanoma continuous prevention in the city of Jau (Sao Paulo), Brazil. Anais Brasileiros de Dermatologia 2011;86(4):669‐74. [DOI] [PubMed] [Google Scholar]

Schindewolf 1994 {published data only}

  1. Schindewolf T, Schiffner R, Stolz W, Albert R, Abmayr W, Harms H. Evaluation of different image acquisition techniques for a computer vision system in the diagnosis of malignant melanoma. Journal of the American Academy of Dermatology 1994;31(1):33‐41. [DOI] [PubMed] [Google Scholar]

Schmoeckel 1987 {published data only}

  1. Schmoeckel C, Braun‐Falco O. Diagnosis of early malignant melanoma: sensitivity and specificity of clinical and histological criteria. Pigment Cell 1987;8:96‐106. [Google Scholar]

Schwartzberg 2005 {published data only}

  1. Schwartzberg JB, Elgart GW, Romanelli P, Ma F, Federman DG, Kirsner RS. Accuracy and predictors of basal cell carcinoma diagnosis. Dermatologic Surgery 2005;31(5):534‐7. [DOI] [PubMed] [Google Scholar]

Seidenari 2006 {published data only}

  1. Seidenari S, Longo C, Giusti F, Pellacani G. Clinical selection of melanocytic lesions for dermoscopy decreases the identification of suspicious lesions in comparison with dermoscopy without clinical preselection. British Journal of Dermatology 2006;154(5):873‐9. [DOI] [PubMed] [Google Scholar]

Seidenari 2006a {published data only}

  1. Seidenari S, Pellacani G, Grana C. Asymmetry in dermoscopic melanocytic lesion images: a computer description based on colour distribution. Acta Dermato‐Venereologica 2006;86(2):123‐8. [DOI] [PubMed] [Google Scholar]

Shariff 2010 {published data only}

  1. Shariff Z, Roshan A, Williams AM, Platt AJ. 2‐week wait referrals in suspected skin cancer: does an instructional module for general practitioners improve diagnostic accuracy?. Surgeon Journal of the Royal Colleges of Surgeons of Edinburgh & Ireland 2010;8(5):247‐51. [DOI] [PubMed] [Google Scholar]

Sondak 2015 {published data only}

  1. Sondak VK, Glass LF, Geller AC. Risk‐stratified screening for detection of melanoma. JAMA 2015;313(6):616‐7. [DOI] [PubMed] [Google Scholar]

Soyer 2004 {published data only}

  1. Soyer HP, Argenziano G, Zalaudek I, Corona R, Sera F, Talamini R, et al. Three‐point checklist of dermoscopy. A new screening method for early detection of melanoma. Dermatology 2004;208(1):27‐31. [DOI] [PubMed] [Google Scholar]

Stanganelli 1998b {published data only}

  1. Stanganelli I, Bucchi L. Epiluminescence microscopy versus clinical evaluation of pigmented skin lesions: effects of operator's training on reproducibility and accuracy. Dermatology and Venereology Society of the Canton of Ticino. Dermatology 1998;196(2):199‐203. [DOI] [PubMed] [Google Scholar]

Stanley 2003 {published data only}

  1. Stanley RJ, Moss RH, Stoecker W, Aggarwal C. A fuzzy‐based histogram analysis technique for skin lesion discrimination in dermatology clinical images. Computerized Medical Imaging & Graphics 2003;27(5):387‐96. [DOI] [PMC free article] [PubMed] [Google Scholar]

Stathopoulos 2015 {published data only}

  1. Stathopoulos P, Ghaly G, Sisodia B, Harrop C. Positive predictive value of clinical diagnosis of head and neck non‐melanoma skin malignancies. How accurate are we?. Oral and Maxillofacial Surgery 2015;19(4):387‐90. [DOI] [PubMed] [Google Scholar]

Stratigos 2007 {published data only}

  1. Stratigos A, Nikolaou V, Kedicoglou S, Antoniou C, Stefanaki I, Haidemenos G, et al. Melanoma/skin cancer screening in a Mediterranean country: results of the Euromelanoma Screening Day Campaign in Greece. Journal of the European Academy of Dermatology & Venereology 2007;21(1):56‐62. [DOI] [PubMed] [Google Scholar]

Tandjung 2015 {published data only}

  1. Tandjung R, Badertscher N, Kleiner N, Wensing M, Rosemann T, Braun RP, et al. Feasibility and diagnostic accuracy of teledermatology in Swiss primary care: process analysis of a randomized controlled trial. Journal of Evaluation in Clinical Practice 2015;21(2):326‐31. [DOI] [PubMed] [Google Scholar]

Terrill 2009 {published data only}

  1. Terrill PJ, Fairbanks S, Bailey M. Is there just one lesion? The need for whole body skin examination in patients presenting with non‐melanocytic skin cancer. ANZ Journal of Surgery 2009;79(10):707‐12. [DOI] [PubMed] [Google Scholar]

Terushkin 2010a {published data only}

  1. Terushkin V, Braga JC, Dusza SW, Scope A, Busam K, Marghoob AA, et al. Agreement on the clinical diagnosis and management of cutaneous squamous neoplasms. Dermatologic Surgery 2010;36(10):1514‐20. [DOI] [PubMed] [Google Scholar]

Terushkin 2010b {published data only}

  1. Terushkin V, Warycha M, Levy M, Kopf AW, Cohen DE, Polsky D. Analysis of the benign to malignant ratio of lesions biopsied by a general dermatologist before and after the adoption of dermoscopy. Archives of Dermatology 2010;146(3):343‐4. [DOI] [PubMed] [Google Scholar]

Thomson 2005 {published data only}

  1. Thomson MA, Loffeld A, Marsden JR. More skin cancer detected from nonurgent referrals. British Journal of Dermatology 2005;153(2):453‐4. [DOI] [PubMed] [Google Scholar]

Torrey 1941 {published data only}

  1. Torrey FA, Levin EA. Comparison of the clinical and the pathologic diagnoses of malignant conditions of the skin. Archives of Dermatology 1941;43(3):532. [Google Scholar]

Ulrich 2015 {published data only}

  1. Ulrich M, von Braunmuehl T, Kurzen H, Dirschka T, Kellner C, Sattler E, et al. The sensitivity and specificity of optical coherence tomography for the assisted diagnosis of nonpigmented basal cell carcinoma: an observational study. British Journal of Dermatology 2015;173(2):428‐35. [DOI] [PubMed] [Google Scholar]

Van der Rhee 2010 {published data only}

  1. van der Rhee JI, Bergman W, Kukutsch NA. The impact of dermoscopy on the management of pigmented lesions in everyday clinical practice of general dermatologists: a prospective study. British Journal of Dermatology 2010;162(3):563‐7. [DOI] [PubMed] [Google Scholar]

Van der Rhee 2011 {published data only}

  1. van der Rhee JI, Bergman W, Kukutsch NA. Impact of dermoscopy on the management of high‐risk patients from melanoma families: a prospective study. Acta Dermato‐Venereologica 2011;91(4):428‐31. [DOI] [PubMed] [Google Scholar]

Vasili 2010 {published data only}

  1. Vasili E, Shkodrani E, Harja D, Labinoti L, Zoto A. Retrospective study of 70 patients with NMSC. Melanoma Research 2010;20:e63. [Google Scholar]

Wagner 1985 {published data only}

  1. Wagner RF, Wagner D, Tomich JM, Wagner KD, Grande DJ. Residents' corner: diagnoses of skin disease: dermatologists vs. nondermatologists. Journal of Dermatologic Surgery and Oncology 1985;11(5):476‐9. [DOI] [PubMed] [Google Scholar]

Walter 2010 {published data only}

  1. Walter FM, Morris HC, Humphrys E, Hall PN, Kinmonth AL, Prevost AT, et al. Protocol for the MoleMate UK Trial: a randomised controlled trial of the MoleMate system in the management of pigmented skin lesions in primary care. BMC Family Practice 2010;11:36. [DOI] [PMC free article] [PubMed] [Google Scholar]

Walter 2013 {published data only}

  1. Walter FM, Prevost AT, Vasconcelos J, Hall PN, Burrows NP, Morris HC, et al. Using the 7‐point checklist as a diagnostic aid for pigmented skin lesions in general practice: a diagnostic validation study. British Journal of General Practice 2013;63(610):e345‐53. [DOI] [PMC free article] [PubMed] [Google Scholar]

Warshaw 2009a {published data only}

  1. Warshaw EM, Lederle FA, Grill JP, Gravely AA, Bangerter AK, Fortier LA, et al. Accuracy of teledermatology for pigmented neoplasms. Journal of the American Academy of Dermatology 2009;61(5):753‐65. [DOI] [PubMed] [Google Scholar]

Warshaw 2009b {published data only}

  1. Warshaw EM, Lederle FA, Grill JP, Gravely AA, Bangerter AK, Fortier LA, et al. Accuracy of teledermatology for nonpigmented neoplasms. Journal of the American Academy of Dermatology 2009;60(4):579‐88. [DOI: 10.1016/j.jaad.2008.11.892] [DOI] [PubMed] [Google Scholar]

Warshaw 2010 {published data only}

  1. Warshaw EM, Gravely AA, Nelson DB. Accuracy of teledermatology/teledermoscopy and clinic‐based dermatology for specific categories of skin neoplasms. Journal of the American Academy of Dermatology 2010;63(255):348‐52. [DOI] [PubMed] [Google Scholar]

Westbrook 2006 {published data only}

  1. Westbrook RH, Goyal N, Gawkrodger DJ. Diagnostic accuracy for skin cancer: comparison of general practitioner with dermatologist and dermatopathologist. Journal of Dermatological Treatment 2006;17(1):57‐8. [DOI] [PubMed] [Google Scholar]

Whitaker‐Worth 1998 {published data only}

  1. Whitaker‐Worth DL, Susser WS, Grant‐Kels JM. Clinical dermatologic education and the diagnostic acumen of medical students and primary care residents. International Journal of Dermatology 1998;37(11):855‐9. [DOI] [PubMed] [Google Scholar]

Whited 1998 {published data only}

  1. Whited JD, Mills BJ, Hall RP, Drugge RJ, Grichnik JM, Simel DL. A pilot trial of digital imaging in skin cancer. Journal of Telemedicine & Telecare 1998;4(2):108‐12. [DOI] [PubMed] [Google Scholar]

Williams 1991 {published data only}

  1. Williams HC, Smith D, du Vivier A. Melanoma: differences observed by general surgeons and dermatologists. International Journal of Dermatology 1991;30(4):257‐61. [DOI] [PubMed] [Google Scholar]

Winkelmann 2015a {published data only}

  1. Winkelmann RR, Hauschild A, Tucker N, White R, Rigel DS. The impact of multispectral digital skin lesion analysis on German dermatologist decisions to biopsy atypical pigmented lesions with clinical characteristics of melanoma. Journal of Clinical & Aesthetic Dermatology 2015;8(10):27‐9. [PMC free article] [PubMed] [Google Scholar]

Winkelmann 2015b {published data only}

  1. Winkelmann RR, Yoo J, Tucker N, White R, Rigel DS. Impact of guidance provided by a multispectral digital skin lesion analysis device following dermoscopy on decisions to biopsy atypical melanocytic lesions. Journal of Clinical & Aesthetic Dermatology 2015;8(9):21‐4. [PMC free article] [PubMed] [Google Scholar]

Wolf 1998 {published data only}

  1. Wolf IH, Smolle J, Soyer HP, Kerl H. Sensitivity in the clinical diagnosis of malignant melanoma. Melanoma Research 1998;8(5):425‐9. [DOI] [PubMed] [Google Scholar]

Yoo 2015 {published data only}

  1. Yoo J, Tucker N, White R, Rigel D. The impact of probability of melanoma information provided by a multispectral digital skin lesion analysis device (MSDSLA) on resident dermatologists' decisions to biopsy clinical atypical lesions. Journal of the American Academy of Dermatology 2015;1):AB177. [Google Scholar]

Youl 2007a {published data only}

  1. Youl PH, Baade PD, Janda M, Del Mar CB, Whiteman DC, Aitken JF. Diagnosing skin cancer in primary care: how do mainstream general practitioners compare with primary care skin cancer clinic doctors?. Medical Journal of Australia 2007;187(4):215‐20. [DOI] [PubMed] [Google Scholar]

Youl 2007b {published data only}

  1. Youl PH, Raasch BA, Janda M, Aitken JF. The effect of an educational programme to improve the skills of general practitioners in diagnosing melanocytic/pigmented lesions. Clinical and Experimental Dermatology 2007;32(4):365‐70. [DOI] [PubMed] [Google Scholar]

Zaballos 2013 {published data only}

  1. Zaballos P, Banuls J, Cabo H, Llambrich A, Salsench E, Puig S, et al. The usefulness of dermoscopy for the recognition of basal cell carcinoma‐‐seborrhoeic keratosis compound tumours. Australasian Journal of Dermatology 2013;54(3):208‐12. [DOI] [PubMed] [Google Scholar]

Zou 2001 {published data only}

  1. Zou KH. Comparison of correlated receiver operating characteristic curves derived from repeated diagnostic test data. Academic Radiology 2001;8(3):225‐33. [DOI] [PubMed] [Google Scholar]

Additional references

ACIM 2017

  1. Australian Cancer Database. Melanoma of the skin for Australia (ICD10 C43). Australian Institute of Health and Welfare (AIHW) 2017 Australian Cancer Incidence and Mortality (ACIM) books (www.aihw.gov.au/acim‐books/). Canberra: Australian Institute of Health and Welfare, 2017. [Google Scholar]

Altamura 2008

  1. Altamura D, Avramidis M, Menzies SW. Assessment of the optimal interval for and sensitivity of short‐term sequential digital dermoscopy monitoring for the diagnosis of melanoma. Archives of Dermatology 2008;144(4):502‐6. [PUBMED: 18427044] [DOI] [PubMed] [Google Scholar]

American Academy of Dermatology 2015

  1. American Academy of Dermatology Ad Hoc Task Force for the ABCDEs of Melanoma, Tsao H, Olazagasti JM, Cordoro KM, Brewer JD, Taylor SC, et al. Early detection of melanoma: reviewing the ABCDEs. Journal of the American Academy of Dermatology 2015;72(4):717‐23. [PUBMED: 25698455] [DOI] [PubMed] [Google Scholar]

Annessi 2007

  1. Annessi G, Bono R, Sampogna F, Faraggiana T, Abeni D. Sensitivity, specificity, and diagnostic accuracy of three dermoscopic algorithmic methods in the diagnosis of doubtful melanocytic lesions: the importance of light brown structureless areas in differentiating atypical melanocytic nevi from thin melanomas. Journal of the American Academy of Dermatology 2007;56(5):759‐67. [PUBMED: 17316894] [DOI] [PubMed] [Google Scholar]

Argenziano 1998

  1. Argenziano G, Fabbrocini G, Carli P, Giorgi V, Sammarco E, Delfino M. Epiluminescence microscopy for the diagnosis of doubtful melanocytic skin lesions. Comparison of the ABCD rule of dermatoscopy and a new 7‐point checklist based on pattern analysis. Archives of Dermatology 1998;134(12):1563‐70. [PUBMED: 9875194] [DOI] [PubMed] [Google Scholar]

Argenziano 2001

  1. Argenziano G, Soyer HP. Dermoscopy of pigmented skin lesions‐‐a valuable tool for early diagnosis of melanoma. Lancet Oncology 2001;2(7):443‐9. [PUBMED: 11905739] [DOI] [PubMed] [Google Scholar]

Armstrong 2017

  1. Armstrong BK, Cust AE. Sun exposure and skin cancer, and the puzzle of cutaneous melanoma: a perspective on Fears et al. Mathematical models of age and ultraviolet effects on the incidence of skin cancer among whites in the United States. American Journal of Epidemiology 1977; 105: 420‐427. Cancer Epidemiology 2017;48:147‐56. [PUBMED: 28478931] [DOI] [PubMed] [Google Scholar]

Arnold 2014

  1. Arnold M, Holterhues C, Hollestein LM, Coebergh JW, Nijsten T, Pukkala E, et al. Trends in incidence and predictions of cutaneous melanoma across Europe up to 2015. Journal of the European Academy of Dermatology & Venereology 2014;28(9):1170‐8. [PUBMED: 23962170] [DOI] [PubMed] [Google Scholar]

Ascierto 2000

  1. Ascierto PA, Palmieri G, Celentano E, Parasole R, Caraco C, Daponte A, et al. Sensitivity and specificity of epiluminescence microscopy: evaluation on a sample of 2731 excised cutaneous pigmented lesions. The Melanoma Cooperative Study. British Journal of Dermatology 2000;142(5):893‐8. [PUBMED: 10809845] [DOI] [PubMed] [Google Scholar]

Balch 2009

  1. Balch CM, Gershenwald JE, Soong SJ, Thompson JF, Atkins MB, Byrd DR, et al. Final version of 2009 AJCC melanoma staging and classification. Journal of Clinical Oncology 2009;27(36):6199‐206. [PUBMED: 19917835] [DOI] [PMC free article] [PubMed] [Google Scholar]

Belbasis 2016

  1. Belbasis L, Stefanaki I, Stratigos AJ, Evangelou E. Non‐genetic risk factors for cutaneous melanoma and keratinocyte skin cancers: an umbrella review of meta‐analyses. Journal of Dermatological Science 2016;84(3):330‐9. [PUBMED: 27663092] [DOI] [PubMed] [Google Scholar]

Binder 1997

  1. Binder M, Schwarz M, Steiner A, Kittler H, Muellner M, Wolff K, et al. Epiluminescence microscopy of small pigmented skin lesions: short‐term formal training improves the diagnostic performance of dermatologists. Journal of the American Academy of Dermatology 1997;36(2 Pt 1):197‐202. [PUBMED: 9039168] [DOI] [PubMed] [Google Scholar]

Boniol 2012

  1. Boniol M, Autier P, Boyle P, Gandini S. Cutaneous melanoma attributable to sunbed use: systematic review and meta‐analysis. BMJ 2012;345:e4757. [PUBMED: 22833605] [DOI] [PMC free article] [PubMed] [Google Scholar]

Boring 1994

  1. Boring CC, Squires TS, Tong T, Montgomery S. Cancer statistics, 1994. CA: a Cancer Journal for Clinicians 1994;44(1):7‐26. [PUBMED: 8281473] [DOI] [PubMed] [Google Scholar]

Bossuyt 2015

  1. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig L, et al. STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. BMJ 2015;351:h5527. [DOI: 10.1136/bmj.h5527; PUBMED: 26511519] [DOI] [PMC free article] [PubMed] [Google Scholar]

Cancer Research UK 2017

  1. Skin cancer incidence statistics. Available from www.cancerresearchuk.org/health‐professional/cancer‐statistics/statistics‐by‐cancer‐type/skin‐cancer/incidence (accessed prior to July 2017).

Carli 1994

  1. Carli P, Giorgi V, Donati E, Pestelli E, Gianotti B. Epiluminescence microscopy reduces the risk of removing clinically atypical, but histologically common, melanocytic lesions [La microscopia a epiluminescenza (Elm) riduce il rischio di asportare lesioni melanocitarie clinicamente sospette ma istologicamente comuni]. Giornale Italiano di Dermatologia e Venereologia 1994;129(12):599‐605. [Google Scholar]

Chao 2013

  1. Chao D, London Cancer North and East. London Cancer, guidelines for cutaneous malignant melanoma management August 2014. www.londoncancer.org/media/76373/london‐cancer‐melanoma‐guidelines‐2013‐v1.0.pdf. London: London Cancer North and East Alliance, (accessed 25 February 2015).

Cho 2014

  1. Cho H, Mariotto AB, Schwartz LM, Luo J, Woloshin S. When do changes in cancer survival mean progress? The insight from population incidence and mortality. Journal of the National Cancer Institute. Monographs 2014;2014(49):187‐97. [PUBMED: 25417232] [DOI] [PMC free article] [PubMed] [Google Scholar]

Chu 2006

  1. Chu H, Cole SR. Bivariate meta‐analysis for sensitivity and specificity with sparse data: a generalized linear mixed model approach (comment). Journal of Clinical Epidemiology 2006;59(12):1331‐2. [PUBMED: 17098577] [DOI] [PubMed] [Google Scholar]

Chuchu 2018

  1. Chuchu N, Dinnes J, Takwoingi Y, Matin RN, Bayliss SE, Davenport C, et al. Teledermatology for diagnosing skin cancer in adults. Cochrane Database of Systematic Reviews 2018, Issue 12. [DOI: 10.1002/14651858.CD013193] [DOI] [PMC free article] [PubMed] [Google Scholar]

Corbo 2012

  1. Corbo MD, Wismer J. Agreement between dermatologists and primary care practitioners in the diagnosis of malignant melanoma: review of the literature. Journal of Cutaneous Medicine & Surgery 2012;16(5):306‐10. [PUBMED: 22971304] [DOI] [PubMed] [Google Scholar]

Cordoro 2013

  1. Cordoro KM, Gupta D, Frieden IJ, McCalmont T, Kashani‐Sabet M. Pediatric melanoma: results of a large cohort study and proposal for modified ABCD detection criteria for children. Journal of the American Academy of Dermatology 2013;68(6):913‐25. [PUBMED: 23395590] [DOI] [PubMed] [Google Scholar]

Deeks 2005

  1. Deeks JJ, Macaskill P, Irwig L. The performance of tests of publication bias and other sample size effects in systematic reviews of diagnostic test accuracy was assessed. Journal of Clinical Epidemiology 2005;58(9):882‐93. [PUBMED: 16085191] [DOI] [PubMed] [Google Scholar]

DePry 2011

  1. DePry JL, Reed KB, Cook‐Norris RH, Brewer JD. Iatrogenic immunosuppression and cutaneous malignancy. Clinics in Dermatology 2011;29(6):602‐13. [PUBMED: 22014982] [DOI] [PubMed] [Google Scholar]

Dinnes 2018

  1. Dinnes J, Deeks JJ, Chuchu N, Ferrante di Ruffano L, Matin RN, Thomson DR, et al. Dermoscopy, with and without visual inspection, for diagnosing melanoma in adults. Cochrane Database of Systematic Reviews 2018, Issue 12. [DOI: 10.1002/14651858.CD011902.pub2] [DOI] [PMC free article] [PubMed] [Google Scholar]

Efron 1983

  1. Efron B. Estimating the error rate of a prediction rule: improvement on cross‐validation. Journal of the American Statistical Association 1983;78(382):316‐31. [DOI: 10.1080/01621459.1983.10477973] [DOI] [Google Scholar]

Elstein 2002

  1. Elstein AS, Schwartz A. Clinical problem solving and diagnostic decision making: selective review of the cognitive literature. BMJ 2002;324(7339):729‐32. [PUBMED: 11909793] [DOI] [PMC free article] [PubMed] [Google Scholar]

Erdmann 2013

  1. Erdmann F, Lortet‐Tieulent J, Schuz J, Zeeb H, Greinert R, Breitbart EW, et al. International trends in the incidence of malignant melanoma 1953‐2008‐‐are recent generations at higher or lower risk?. International Journal of Cancer 2013;132(2):385‐400. [PUBMED: 22532371] [DOI] [PubMed] [Google Scholar]

EUCAN 2012

  1. EUCAN, International Agency for Research on Cancer. Malignant melanoma of skin: estimated incidence, mortality & prevalence for both sexes, 2012. eco.iarc.fr/eucan/Cancer.aspx?Cancer=20. International Agency for Research on Cancer, (accessed 29 July 2015).

Farina 2000

  1. Farina B, Bartoli C, Bono A, Colombo A, Lualdi M, Tragni G, et al. Multispectral imaging approach in the diagnosis of cutaneous melanoma: potentiality and limits. Physics in Medicine and Biology 2000;45(5):1243‐54. [PUBMED: 10843103] [DOI] [PubMed] [Google Scholar]

Ferlay 2015

  1. Ferlay J, Soerjomataram I, Dikshit R, Eser S, Mathers C, Rebelo M, et al. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. International Journal of Cancer 2015;136(5):E359‐86. [PUBMED: 25220842] [DOI] [PubMed] [Google Scholar]

Gandini 2005

  1. Gandini S, Sera F, Cattaruzza MS, Pasquini P, Abeni D, Boyle P, et al. Meta‐analysis of risk factors for cutaneous melanoma: I. Common and atypical naevi. European Journal of Cancer 2005;41(1):28‐44. [PUBMED: 15617989] [DOI] [PubMed] [Google Scholar]

Garbe 2016

  1. Garbe C, Peris K, Hauschild A, Saiag P, Middleton M, Bastholt L, et al. Diagnosis and treatment of melanoma. European consensus‐based interdisciplinary guideline ‐ Update 2016. European Journal of Cancer 2016;63:201‐17. [PUBMED: 27367293] [DOI] [PubMed] [Google Scholar]

Geller 2002

  1. Geller AC, Miller DR, Annas GD, Demierre MF, Gilchrest BA, Koh HK. Melanoma incidence and mortality among US whites, 1969‐1999. JAMA 2002;288(14):1719‐20. [PUBMED: 12365954] [DOI] [PubMed] [Google Scholar]

Gereli 2010

  1. Gereli MC, Onsun N, Atilganoglu U, Demirkesen C. Comparison of two dermoscopic techniques in the diagnosis of clinically atypical pigmented skin lesions and melanoma: seven‐point and three‐point checklists. International Journal of Dermatology 2010;49(1):33‐8. [PUBMED: 20465608] [DOI] [PubMed] [Google Scholar]

Gershenwald 2017

  1. Gershenwald JE, Scolyer RA, Hess KR, Sondak VK, Long GV, Ross MI, et al. Melanoma staging: evidence‐based changes in the American Joint Committee on Cancer eighth edition cancer staging manual. CA: a Cancer Journal for Clinicians 2017;67(6):472‐92. [PUBMED: 29028110] [DOI] [PMC free article] [PubMed] [Google Scholar]

Ginsberg 1993

  1. Ginsberg JS, Brill‐Edwards PA, Demers C, Donovan D, Ranju A. D‐dimer in patients with clinically suspected pulmonary embolism. Chest 1993;104(6):1679‐84. [PUBMED: 8252941] [DOI] [PubMed] [Google Scholar]

Girardi 2006

  1. Girardi S, Gaudy C, Gouvernet J, Teston J, Richard MA, Grob JJ. Superiority of a cognitive education with photographs over ABCD criteria in the education of the general population to the early detection of melanoma: a randomized study. International Journal of Cancer 2006;118(9):2276‐80. [PUBMED: 16331608] [DOI] [PubMed] [Google Scholar]

Goldsmith 2014

  1. Goldsmith SM. A unifying approach to the clinical diagnosis of melanoma including "D" for "Dark" in the ABCDE criteria. Dermatology Practical and Conceptual 2014;4(4):75‐8. [PUBMED: 25396093] [DOI] [PMC free article] [PubMed] [Google Scholar]

Goodson 2010

  1. Goodson AG, Florell SR, Hyde M, Bowen GM, Grossman D. Comparative analysis of total body and dermatoscopic photographic monitoring of nevi in similar patient populations at risk for cutaneous melanoma. Dermatologic Surgery 2010;36(7):1087‐98. [PUBMED: 20653722] [DOI] [PMC free article] [PubMed] [Google Scholar]

Harrington 2017

  1. Harrington E, Clyne B, Wesseling N, Sandhu H, Armstrong L, Bennett H, et al. Diagnosing malignant melanoma in ambulatory care: a systematic review of clinical prediction rules. BMJ Open 2017;7(3):e014096. [PUBMED: 28264830] [DOI] [PMC free article] [PubMed] [Google Scholar]

Herschorn 2012

  1. Herschorn A. Dermoscopy for melanoma detection in family practice. Canadian Family Physician 2012;58(7):740‐5. [PUBMED: 22859635] [PMC free article] [PubMed] [Google Scholar]

HPA and MelNet NZ 2014

  1. Health Promotion Agency and the Melanoma Network of New Zealand (MelNet). New Zealand Skin Cancer Primary Prevention and Early Detection Strategy 2014 to 2017. www.sunsmart.org.nz//sites/default/files/documents/NZ%20Skin%20Cancer%20PrimaryPrevention%20and%20EarlyDetection%20Strategy%202014%20to%202017%20FINAL%20VERSION%20%23406761.pdf. Cancer Society of New Zealand, (accessed 29 May 2018).

Kasprzak 2015

  1. Kasprzak JM, Xu YG. Diagnosis and management of lentigo maligna: a review. Drugs in Context 2015;4:212281. [PUBMED: 26082796] [DOI] [PMC free article] [PubMed] [Google Scholar]

Kittler 1999

  1. Kittler H, Seltenheim M, Dawid M, Pehamberger H, Wolff K, Binder M. Morphologic changes of pigmented skin lesions: a useful extension of the ABCD rule for dermatoscopy. Journal of the American Academy of Dermatology 1999;40(4):558‐62. [PUBMED: 10188673] [DOI] [PubMed] [Google Scholar]

Kittler 2002

  1. Kittler H, Pehamberger H, Wolff K, Binder M. Diagnostic accuracy of dermoscopy. Lancet Oncology 2002;3(3):159‐65. [PUBMED: 11902502] [DOI] [PubMed] [Google Scholar]

Kittler 2011

  1. Kittler H, Rosendahl C, Cameron A, Tschandl P. Dermatoscopy. An algorithmic method based on pattern analysis. Austria: Facultas.WUV, 2011. [ISBN‐10: 3708907175] [Google Scholar]

Korn 2008

  1. Korn EL, Liu PY, Lee SJ, Chapman JA, Niedzwiecki D, Suman VJ, et al. Meta‐analysis of phase II cooperative group trials in metastatic stage IV melanoma to determine progression‐free and overall survival benchmarks for future phase II trials. Journal of Clinical Oncology 2008;26(4):527‐34. [PUBMED: 18235113] [DOI] [PubMed] [Google Scholar]

Lachs 1992

  1. Lachs MS, Nachamkin I, Edelstein PH, Goldman J, Feinstein AR, Schwartz JS. Spectrum bias in the evaluation of diagnostic tests: lessons from the rapid dipstick test for urinary tract infection. Annals of Internal Medicine 1992;117(2):135‐40. [PUBMED: 1605428] [DOI] [PubMed] [Google Scholar]

Leeflang 2013

  1. Leeflang MM, Rutjes AW, Reitsma JB, Hooft L, Bossuyt PM. Variation of a test's sensitivity and specificity with disease prevalence. CMAJ: Canadian Medical Association Journal 2013;185(11):E537‐44. [PUBMED: 23798453] [DOI] [PMC free article] [PubMed] [Google Scholar]

Leff 2008

  1. Leff B, Finucane TE. Gizmo idolatry. JAMA 2008;299(15):1830‐2. [PUBMED: 18413879] [DOI] [PubMed] [Google Scholar]

Lehmann 2011

  1. Lehmann AR, McGibbon D, Stefanini M. Xeroderma pigmentosum. Orphanet Journal of Rare Diseases 2011;6:70. [PUBMED: 22044607] [DOI] [PMC free article] [PubMed] [Google Scholar]

Linos 2009

  1. Linos E, Swetter SM, Cockburn MG, Colditz GA, Clarke CA. Increasing burden of melanoma in the United States. Journal of Investigative Dermatology 2009;129(7):1666‐74. [PUBMED: 19131946] [DOI] [PMC free article] [PubMed] [Google Scholar]

Liu 2005

  1. Liu W, Hill D, Gibbs AF, Tempany M, Howe C, Borland R, et al. What features do patients notice that help to distinguish between benign pigmented lesions and melanomas?: the ABCD(E) rule versus the seven‐point checklist. Melanoma Research 2005;15(6):549‐54. [PUBMED: 16314742] [DOI] [PubMed] [Google Scholar]

Loescher 2011

  1. Loescher LJ, Harris JM Jr, Curiel‐Lewandrowski C. A systematic review of advanced practice nurses' skin cancer assessment barriers, skin lesion recognition skills, and skin cancer training activities. Journal of the American Academy of Nurse Practitioners 2011;23(12):667‐73. [PUBMED: 22145657] [DOI] [PubMed] [Google Scholar]

MacKie 1985

  1. MacKie RM, English J, Aitchison TC, Fitzsimons CP, Wilson P. The number and distribution of benign pigmented moles (melanocytic naevi) in a healthy British population. British Journal of Dermatology 1985;113(2):167‐74. [PUBMED: 4027184] [DOI] [PubMed] [Google Scholar]

Maley 2014

  1. Maley A, Rhodes AR. Cutaneous melanoma: preoperative tumor diameter in a general dermatology outpatient setting. Dermatologic Surgery 2014;40(4):446‐54. [PUBMED: 24479783] [DOI] [PubMed] [Google Scholar]

Marsden 2010

  1. Marsden JR, Newton‐Bishop JA, Burrows L, Cook M, Corrie PG, Cox NH, et al. BAD Guidelines: revised UK guidelines for the management of cutaneous melanoma 2010. British Journal of Dermatology 2010;163(2):238‐56. [PUBMED: 20608932] [DOI] [PubMed] [Google Scholar]

Mascaro 1998

  1. Mascaro JM Jr, Mascaro JM. The dermatologist's position concerning nevi: a vision ranging from "the ugly duckling" to "little red riding hood". Archives of Dermatology 1998;134(11):1484‐5. [PUBMED: 9828892] [DOI] [PubMed] [Google Scholar]

Menzies 1996

  1. Menzies SW, Ingvar C, Crotty KA, McCarthy WH. Frequency and morphologic characteristics of invasive melanomas lacking specific surface microscopic features. Archives of Dermatology 1996;132(10):1178‐82. [PUBMED: 8859028] [PubMed] [Google Scholar]

Mistry 2011

  1. Mistry M, Parkin DM, Ahmad AS, Sasieni P. Cancer incidence in the United Kingdom: projections to the year 2030. British Journal of Cancer 2011;105(11):1795‐803. [PUBMED: 22033277] [DOI] [PMC free article] [PubMed] [Google Scholar]

Moons 1997

  1. Moons KG, van Es GA, Deckers JW, Habbema JD, Grobbee DE. Limitations of sensitivity, specificity, likelihood ratio, and Bayes' theorem in assessing diagnostic probabilities: a clinical example. Epidemiology 1997;8(1):12‐7. [PUBMED: 9116087] [DOI] [PubMed] [Google Scholar]

Moreau 2013

  1. Moreau JF, Weissfeld JL, Ferris LK. Characteristics and survival of patients with invasive amelanotic melanoma in the USA. Melanoma Research 2013;23(5):408‐13. [PUBMED: 23883947] [DOI] [PubMed] [Google Scholar]

Morton 1998

  1. Morton CA, MacKie RM. Clinical accuracy of the diagnosis of cutaneous malignant melanoma. British Journal of Dermatology 1998;138(2):283‐7. [PUBMED: 9602875] [DOI] [PubMed] [Google Scholar]

Moynihan 1994

  1. Moynihan GD. The 3 Cs of melanoma: time for a change?. Journal of the American Academy of Dermatology 1994;30(3):510‐1. [PUBMED: 8113477] [DOI] [PubMed] [Google Scholar]

NICE 2015a

  1. National Institute for Health and Care Excellence. Melanoma: assessment and management. www.nice.org.uk/guidance/ng14. London: National Institute for Health and Care Excellence, (accessed prior to 19 July 2017).

NICE 2015b

  1. National Institute for Health and Care Excellence. Suspected cancer: recognition and referral. www.nice.org.uk/guidance/ng12. London: National Institute for Health and Care Excellence, (accessed prior to 19 July 2017).

Norman 1989

  1. Norman GR, Rosenthal D, Brooks LR, Allen SW, Muzzin LJ. The development of expertise in dermatology. Archives of Dermatology 1989;125(8):1063‐8. [PUBMED: 2757402]

Norman 2009

  1. Norman G, Barraclough K, Dolovich L, Price D. Iterative diagnosis. BMJ 2009;339:b3490. [PUBMED: 19773326]

Pasquali 2018

  1. Pasquali S, Hadjinicolaou AV, Chiarion Sileni V, Rossi CR, Mocellin S. Systemic treatments for metastatic cutaneous melanoma. Cochrane Database of Systematic Reviews 2018, Issue 2. [DOI: 10.1002/14651858.CD011123.pub2]

Pehamberger 1993

  1. Pehamberger H, Binder M, Steiner A, Wolff K. In vivo epiluminescence microscopy: improvement of early diagnosis of melanoma. Journal of Investigative Dermatology 1993;100(3):356s‐62s. [PUBMED: 8440924]

Rademaker 2010

  1. Rademaker M, Oakley A. Digital monitoring by whole body photography and sequential digital dermoscopy detects thinner melanomas. Journal of Primary Health Care 2010;2(4):268‐72. [PUBMED: 21125066]

Reitsma 2005

  1. Reitsma JB, Glas AS, Rutjes AW, Scholten RJ, Bossuyt PM, Zwinderman AH. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. Journal of Clinical Epidemiology 2005;58(10):982‐90. [PUBMED: 16168343]

Reyes‐Ortiz 2006

  1. Reyes‐Ortiz CA, Goodwin JS, Freeman JL, Kuo YF. Socioeconomic status and survival in older patients with melanoma. Journal of the American Geriatrics Society 2006;54(11):1758‐64. [PUBMED: 17087705]

Rutjes 2005

  1. Rutjes AW, Reitsma JB, Vandenbroucke JP, Glas AS, Bossuyt PM. Case‐control and two‐gate designs in diagnostic accuracy studies. Clinical Chemistry 2005;51(8):1335‐41. [PUBMED: 15961549]

Rutter 2001

  1. Rutter CM, Gatsonis CA. A hierarchical regression approach to meta‐analysis of diagnostic test accuracy evaluations. Statistics in Medicine 2001;20(19):2865‐84. [PUBMED: 11568945]

Salerni 2012

  1. Salerni G, Carrera C, Lovatto L, Marti‐Laborda RM, Isern G, Palou J, et al. Characterization of 1152 lesions excised over 10 years using total‐body photography and digital dermatoscopy in the surveillance of patients at high risk for melanoma. Journal of the American Academy of Dermatology 2012;67(5):836‐45. [PUBMED: 22521205]

SAS 2012 [Computer program]

  1. SAS Institute Inc. SAS 2012. Version 9.3. Cary, NC, USA: SAS Institute Inc., 2012.

Shaikh 2012

  1. Shaikh WR, Xiong M, Weinstock MA. The contribution of nodular subtype to melanoma mortality in the United States, 1978 to 2007. Archives of Dermatology 2012;148(1):30‐6. [PUBMED: 21931016]

Siegel 2015

  1. Siegel R, Miller K, Jemal A. Cancer statistics, 2015. CA: a Cancer Journal for Clinicians 2015;65(1):5‐29. [PUBMED: 25559415]

SIGN 2017

  1. Scottish Intercollegiate Guidelines Network. Cutaneous Melanoma. www.sign.ac.uk/sign‐146‐melanoma.html. Scotland: SIGN, (accessed prior to 19 July 2017).

Sladden 2009

  1. Sladden MJ, Balch C, Barzilai DA, Berg D, Freiman A, Handiside T, et al. Surgical excision margins for primary cutaneous melanoma. Cochrane Database of Systematic Reviews 2009, Issue 10. [DOI: 10.1002/14651858.CD004835.pub2]

Slater 2014

  1. Slater D, Walsh M. Standards and datasets for reporting cancers: dataset for the histological reporting of primary cutaneous malignant melanoma and regional lymph nodes, May 2014. www.rcpath.org/Resources/RCPath/Migrated%20Resources/Documents/G/G125_DatasetMaligMelanoma_May14.pdf. London: Royal College of Pathologists, (accessed 29 July 2015).

Sober 1979

  1. Sober AJ, Fitzpatrick TB, Mihm MC, Wise TG, Pearson BJ, Clark WH, et al. Early recognition of cutaneous melanoma. JAMA 1979;242(25):2795‐9. [PUBMED: 501893]

Stolz 1994

  1. Stolz W, Riemann A, Cognetta AB, Pillet L, Abmayer W, Holzel D, et al. ABCD rule of dermatoscopy: a new practical method for early recognition of malignant melanoma. European Journal of Dermatology 1994;4(7):521‐7. [EMBASE: 24349113]

Swerdlow 1995

  1. Swerdlow AJ, English JS, Qiao Z. The risk of melanoma in patients with congenital nevi: a cohort study. Journal of the American Academy of Dermatology 1995;32(4):595‐9. [PUBMED: 7896948]

Takwoingi 2010

  1. Takwoingi Y, Deeks J. MetaDAS: a SAS macro for meta‐analysis of diagnostic accuracy studies. User Guide Version 1.3. 2010. www.methods.cochrane.org/sites/methods.cochrane.org.sdt/files/public/uploads/MetaDAS%20Readme%20v1.3%20May%202012.pdf (accessed prior to 17 July 2017).

Takwoingi 2015

  1. Takwoingi Y, Guo B, Riley RD, Deeks JJ. Performance of methods for meta‐analysis of diagnostic test accuracy with few studies or sparse data. Statistical Methods in Medical Research 2015;24:1‐19. [DOI: 10.1177/0962280215592269]

Tucker 1985

  1. Tucker MA, Boice JD Jr, Hoffman DA. Second cancer following cutaneous melanoma and cancers of the brain, thyroid, connective tissue, bone, and eye in Connecticut, 1935‐82. National Cancer Institute Monographs 1985;68:161‐89. [PUBMED: 4088297]

Usher‐Smith 2016

  1. Usher‐Smith JA, Sharp SJ, Griffin SJ. The spectrum effect in tests for risk prediction, screening, and diagnosis. BMJ 2016;353:i3139. [DOI: 10.1136/bmj.i3139]

Vestergaard 2008

  1. Vestergaard ME, Macaskill P, Holt PE, Menzies SW. Dermoscopy compared with naked eye examination for the diagnosis of primary melanoma: a meta‐analysis of studies performed in a clinical setting. British Journal of Dermatology 2008;159(3):669‐76. [PUBMED: 18616769]

Whiting 2011

  1. Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS‐2: a revised tool for the quality assessment of diagnostic accuracy studies. Annals of Internal Medicine 2011;155(8):529‐36. [PUBMED: 22007046]

Yagerman 2014

  1. Yagerman SE, Chen L, Jaimes N, Dusza SW, Halpern AC, Marghoob A. 'Do UC the melanoma?' Recognising the importance of different lesions displaying unevenness or having a history of change for early melanoma detection. Australasian Journal of Dermatology 2014;55(2):119‐24. [PUBMED: 24548383]

Zalaudek 2006

  1. Zalaudek I, Argenziano G, Soyer HP, Corona R, Sera F, Blum A, et al. Dermoscopy Working Group. Three‐point checklist of dermoscopy: an open Internet study. British Journal of Dermatology 2006;154(3):431‐7. [PUBMED: 16445771]

References to other published versions of this review

Dinnes 2015a

  1. Dinnes J, Matin RN, Moreau JF, Patel L, Chan SA, Wong KY, et al. Tests to assist in the diagnosis of cutaneous melanoma in adults: a generic protocol. Cochrane Database of Systematic Reviews 2015, Issue 10. [DOI: 10.1002/14651858.CD011902]

Dinnes 2015b

  1. Dinnes J, Wong KY, Gulati A, Chuchu N, Leonardi‐Bee J, Bayliss SE, et al. Tests to assist in the diagnosis of keratinocyte skin cancers in adults: a generic protocol. Cochrane Database of Systematic Reviews 2015, Issue 10. [DOI: 10.1002/14651858.CD011901]

Articles from The Cochrane Database of Systematic Reviews are provided here courtesy of Wiley
