Abstract
Background
The 1980 classification criteria for systemic sclerosis (SSc) lack sensitivity in early SSc and limited cutaneous SSc. A joint ACR-EULAR committee was established to develop new classification criteria for SSc.
Methods
Using consensus methods, 23 candidate items were arranged in a multi-criteria additive point system with a threshold to classify cases as SSc. The classification system was reduced by clustering items, and simplifying weights. The system was tested by: a) determining specificity and sensitivity in SSc cases and controls with scleroderma-like disorders; b) validating against the combined view of a group of experts on a set of cases with or without SSc.
Results
Skin thickening of the fingers extending proximal to the metacarpophalangeal joints (MCPs) is sufficient to classify a patient as having SSc; if that is not present, seven additive items apply, each with its own weight: skin thickening of the fingers, finger tip lesions, telangiectasia, abnormal nailfold capillaries, interstitial lung disease or pulmonary arterial hypertension, Raynaud's phenomenon, and SSc-related autoantibodies. Sensitivity and specificity in the validation sample were 0.91 and 0.92 for the new classification criteria and 0.75 and 0.72 for the 1980 ARA classification criteria. All selected cases were classified in accordance with consensus-based expert opinion. All cases classified as SSc by the 1980 ARA criteria were also classified as SSc by the new criteria, and several additional cases were now considered to be SSc.
Conclusion
The ACR-EULAR classification criteria for SSc performed better than the 1980 ARA Criteria for SSc and should allow for more patients to be classified correctly as SSc.
Keywords: Systemic Sclerosis, Scleroderma, Classification Criteria, Conjoint Analysis, Multi Criteria Additive Point System, Validation, ACR-EULAR
INTRODUCTION
Systemic sclerosis (SSc; scleroderma) is a heterogeneous disease with a pathogenesis characterized by three hallmarks: small vessel vasculopathy, production of autoantibodies, and fibroblast dysfunction leading to increased deposition of extracellular matrix [1]. The clinical manifestations and the prognosis of SSc are variable, with the majority of patients having skin thickening and variable involvement of internal organs. Subsets of SSc can be discerned: limited cutaneous SSc (lcSSc), diffuse cutaneous SSc (dcSSc), and SSc without skin involvement [1].
In the absence of a diagnostic test proving the presence or absence of SSc, several sets of classification criteria have been developed [2–6]. The purpose of classification criteria is to include patients with a similar clinical entity for research [7]. Classification criteria are not synonymous with diagnostic criteria but will almost always mirror the list of criteria that one uses for diagnosis [7]. However, classification criteria generally are more standardized and less inclusive than physician diagnosis.
The current standard classification criteria for SSc are the 1980 `Preliminary criteria for the classification of systemic sclerosis (scleroderma)' by the American Rheumatology Association (ARA) [2–4,8]. These classification criteria were developed in patients with longstanding SSc. As a consequence, patients with early SSc and about 20% of patients with limited cutaneous disease do not meet the criteria and are excluded from clinical studies [1,9,10]. Since the development of the 1980 criteria, knowledge regarding SSc-specific autoantibodies has improved [11–13]. In addition, characteristic nailfold capillary changes have been found to be associated with SSc and nailfold capillaroscopy is widely accepted as a diagnostic tool [10,14–17]. In 1988, LeRoy et al proposed new criteria that included clinical features, autoantibodies and capillaroscopy, underlining the differences between the two main SSc subsets [11]. In 2001, LeRoy and Medsger proposed to revise the classification criteria to include `early' cases of SSc, making use of nailfold capillary pattern and SSc-specific autoantibodies [6]. It also has been demonstrated that the addition of nailfold capillary abnormalities and telangiectasias to the ACR SSc criteria improves their sensitivity [9,18].
Because of the insufficient sensitivity of the 1980 criteria and advances in knowledge about SSc, the American College of Rheumatology (ACR) and the European League Against Rheumatism (EULAR) established a committee to provide a joint proposal for new classification criteria for SSc. The aims were to develop criteria which 1) include a broader spectrum of SSc including patients who are early as well as late in the disease process; 2) include vascular, immunologic, and fibrotic manifestations; 3) are feasible to use in daily clinical practice and 4) are in keeping with criteria used for diagnosis of SSc in clinical practice [7]. These criteria are intended to be used by rheumatologists, researchers, national and international drug agencies, pharmaceutical companies or any others involved in studies of SSc. Our objective was to develop a set of criteria that would enable identification of individuals with SSc for inclusion in clinical studies, being more sensitive and specific than previous criteria.
METHODS
Overview
The development and testing of the classification system for SSc was based on both data and expert clinical judgment. First, candidate items for the classification criteria were generated using consensus methods and evaluated using existing databases [19,20]. Second, multi-criteria decision analysis was used to reduce the number of candidate criteria and assign preliminary weights [21]. The classification system was repeatedly tested and adapted using prospectively collected SSc cases and non-SSc controls, and compared against expert clinical judgment. Third, the classification criteria were tested in a validation cohort and tested against pre-existing criteria sets (Figure 1).
Item generation and reduction
168 candidate criteria were identified through 2 Delphi exercises. A 3-round Delphi exercise and a face-to-face consensus meeting using nominal group technique facilitated reduction of the 168 items to 23 [19]. Using a random sample of existing databases (SSc (n = 783), control patients with diseases similar to SSc (n=1071), all based on physician diagnosis) the candidate criteria were found to have good discriminative validity [20].
Item reduction and weighting
Draft classification system
A face-to-face meeting of 4 European and 4 North American SSc experts was held to further reduce items and assign preliminary weights using multi-criteria decision analysis. The number of experts was limited in advance to eight and they were invited based on geographical representation, knowledge from a scientific and a practical diagnostic viewpoint, and availability. At the meeting, the experts determined by consensus to whom the criteria should be and should not be applied, and which items are sufficient to allow a patient to be classified as SSc (sufficient criteria). The experts then participated in a multi-criteria decision analysis to further reduce the 23 items and assign preliminary weights [21]. The experts were presented hypothetical pairs of cases with two of the 23 items at a time (e.g. Raynaud's phenomenon positive AND abnormal nailfold capillaries absent versus Raynaud's phenomenon negative AND abnormal nailfold capillaries present, all other manifestations considered being equal) and they were asked to individually vote electronically for which case of the pair of cases was more likely to have SSc. The result of the votes was immediately presented. If there was no complete agreement among the experts, considerations were discussed and a second round of voting was conducted. As a result of the repeated choices between two alternative cases, items were ranked and weights for the items were derived using 1000Minds decision-making software [21]. Additional details about the methods are available in [22].
Initial threshold identification
The committee prepared summaries of 45 SSc cases with a concentration of cases that were difficult to classify. These were presented to 22 SSc experts who classified these cases as definite SSc or not. The draft classification system derived from the multi-criteria decision analysis was applied to the 45 cases, resulting in a score for each case. The ranking of cases by the SSc experts and the ranking of cases based on the scores provided by the draft scoring system were compared. Higher scores were expected to correspond to a higher proportion of experts judging the case to be SSc. Using these results, an initial threshold score for SSc was identified.
Reduction and testing of iterative changes
In this step, the committee reduced the number of items, simplified the weights and modified the threshold score. First, data on the candidate items were prospectively collected in 13 SSc centers in North America and 10 in Europe using standardized case record forms. Data were collected from 368 consecutive patients with SSc (diagnosis based on physician opinion), of whom half were to have a disease duration of at most two years from the first non-Raynaud's symptom in order to include early SSc, and from 237 consecutive control patients having a scleroderma-like disorder: eosinophilic fasciitis (also called Shulman's syndrome or diffuse fasciitis with eosinophilia), scleromyxedema, systemic lupus erythematosus, dermatomyositis, polymyositis, primary Raynaud's phenomenon, mixed connective tissue disease, undifferentiated connective tissue disease, generalized morphea, nephrogenic sclerosing fibrosis and diabetic cheiropathy. From these 605 patients a random sample of 100 SSc cases and 100 controls, 50% from North America and 50% from Europe, was selected to form the derivation sample. The remaining 268 cases and 137 controls formed the validation sample. Institutional research ethics board approval was obtained for the collection of patient data.
Then the committee met and made iterative changes to the draft system which they continually applied in real-time to the derivation cohort derived as above. Using the derivation cohort, the scoring system was simplified by removing items that were low frequency or redundant, by aggregating similar items, and then transforming the weights to obtain single digits. The preliminary score threshold was adjusted to account for the weight simplification. The impact of all proposed changes was evaluated by assessing changes to sensitivity and specificity of the criteria in the derivation cohort. The reference standard to test the sensitivity and specificity was the diagnosis by the SSc expert who submitted the case(s) and control(s).
At the same time, the changes in the classification system were also tested in 38 difficult to classify cases. Consequently, weights of some items were adjusted to align the scoring system with the reference standard formed by the opinions of the SSc experts as to which cases were to be classified as having SSc.
Validation
The final classification system was independently tested using the validation sample of SSc cases and controls. The sensitivity and specificity were calculated for the ARA 1980 preliminary classification criteria for SSc, the classification criteria proposed by LeRoy and Medsger in 2001, and the newly developed classification criteria [3,6]. Exact binomial confidence limits were calculated for sensitivity and specificity. The ARA criteria and the LeRoy/Medsger criteria were each compared with the new criteria in 2×2 tables using McNemar's chi-squared test with continuity correction. The criteria sets were also tested in patients with 3 years or less of disease duration. Further, the classification system was validated against the expert consensus on the set of 38 selected cases.
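The two statistical procedures named above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the software used in the study; the function names (`exact_ci`, `mcnemar_cc`) and all counts in the usage lines are hypothetical and chosen only to show the arithmetic.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target, iters=60):
    """Bisection on [0, 1] for a function f that decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) confidence limits for a proportion k/n."""
    # Lower limit: the p at which P(X >= k) equals alpha/2.
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper limit: the p at which P(X <= k) equals alpha/2.
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


def mcnemar_cc(b, c):
    """McNemar's chi-squared test with continuity correction.

    b and c are the two discordant cell counts of the paired 2x2 table
    (cases classified correctly by one criteria set but not the other).
    """
    if b + c == 0:
        raise ValueError("no discordant pairs")
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-squared tail, 1 df
    return stat, p_value


# Hypothetical counts, for illustration only.
true_pos, n_cases = 90, 100
sensitivity = true_pos / n_cases
ci_low, ci_high = exact_ci(true_pos, n_cases)
stat, p_value = mcnemar_cc(b=10, c=25)
```

Specificity is handled the same way, with true negatives over the number of controls in place of true positives over the number of cases.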
RESULTS
Draft classification system
The experts concluded that `skin thickening of the fingers of both hands that extends proximal to the metacarpophalangeal joints' was sufficient to classify a subject as having SSc. Further, patients with `skin thickening sparing the fingers' are classified as not having SSc. It was felt that the criteria should be applied to any patient considered for inclusion in a SSc study, without further specifications. Items with relatively low weights were deleted and items considered to be from a similar domain were clustered (e.g. finger tip lesions encompasses ulcers and pitting scars, lung involvement encompasses interstitial lung disease and pulmonary hypertension). Using conjoint analysis, the number of items was reduced from 23 to 14 and all items were assigned weights. The 14 resulting items (with weights) were presence of: bilateral skin thickening of the fingers (14 if distal to PIP only, 22 if whole finger), puffy fingers (5), finger tip lesions (16 if pitting scars or 9 if digital ulcers), finger flexion contractures (16), telangiectasia (10), abnormal nailfold capillaries (10), calcinosis (12), Raynaud's phenomenon (13), tendon or bursal friction rubs (21), interstitial lung disease / pulmonary fibrosis (14), pulmonary arterial hypertension (12), scleroderma renal crisis (11), esophageal dilation (7), and SSc-related antibodies (15 if any of the following is present: anti-centromere antibody or centromere pattern on ANA, anti-topoisomerase I [also known as anti-Scl70], or anti-RNA polymerase III).
Initial threshold identification
Comparison of the case ranking by scoring system and by experts found that above a score of 55, except for one case, most experts (≥75%) considered the cases to be SSc. Similarly below a score of 40, most experts (≥88%) considered the cases not to be SSc. Between 40 and 55 there was more diversity of opinion. Thus it was concluded that the initial threshold would be a score of 56 or higher.
Reduction and testing of iterative changes
The 14 items in the scoring system were reduced to 9 while maintaining sensitivity and specificity in the derivation sample. Deleted items included: finger flexion contractures, calcinosis, tendon or bursal friction rubs, renal crisis, and esophageal dilation. Puffy fingers and sclerodactyly were combined in one item, and pulmonary arterial hypertension and interstitial lung disease were also combined into one item, resulting in 7 items for the scoring system. In the derivation sample, with reduction of the 14 items to 7 items the sensitivity and specificity were 93% and 94%. Weights were simplified by dividing each weight by 5 and rounding to the nearest integer. The threshold for this simplified scoring system was determined to be 9. The resulting sensitivity and specificity were 97% and 88%.
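The weight simplification step (divide by 5, round to the nearest integer) can be illustrated with the draft weights reported above for the retained items. This is a sketch of the arithmetic only; the dictionary keys are illustrative labels, and the weights finally published in Table 1 also reflect the merging of items and the later expert-driven adjustments, so they need not match these values.

```python
# Draft weights for the retained items, as reported for the 14-item draft system.
draft_weights = {
    "skin thickening, distal to PIP only": 14,
    "skin thickening, whole finger": 22,
    "puffy fingers": 5,
    "pitting scars": 16,
    "digital ulcers": 9,
    "telangiectasia": 10,
    "abnormal nailfold capillaries": 10,
    "Raynaud's phenomenon": 13,
    "interstitial lung disease": 14,
    "pulmonary arterial hypertension": 12,
    "SSc-related antibodies": 15,
}

# Divide each weight by 5 and round to the nearest integer.
# No integer weight divided by 5 lands on a .5 tie, so Python's
# round-half-to-even behaviour does not affect the result here.
simplified = {item: round(w / 5) for item, w in draft_weights.items()}

for item, weight in simplified.items():
    print(f"{item}: {draft_weights[item]} -> {weight}")
```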
Additional adjustments to the weights of the scoring system were made to align it with expert opinion on which patients had SSc, using the set of 38 cases that were difficult to classify. To improve the specificity of the classification criteria, an exclusionary criterion was added: patients with a diagnosis that better explains their manifestations than SSc should not be classified as SSc. These revisions resulted in the correct classification of all patient profiles judged to have SSc by the majority of experts.
The SSc classification criteria
The new classification criteria are shown in Table 1, showing one sufficient criterion, two exclusionary criteria, and 7 items, with a threshold above which cases are classified as SSc. The classification criteria may be applied to patients who may have SSc being considered for inclusion in a SSc study. The criteria are not to be applied to patients having a systemic sclerosis-like disorder better explaining their manifestations; and patients with `Skin thickening sparing the fingers' are not classified as having SSc. The classification criteria include one sufficient criterion for the classification of SSc: if a patient has skin thickening of the fingers of both hands that extends proximal to the metacarpophalangeal (MCP) joints, the classification system assigns 9 points for this one item alone, which is sufficient to be classified as having SSc and further application of the point system is not necessary. Otherwise, the point system is applied by adding the scores for manifestations that are `positive' with a maximum score in each category as the highest item in the category when a patient has more than one manifestation in any category. That is, for skin thickening of the fingers and for finger tip lesions only the item that scores highest is counted. The domains are: skin thickening of the fingers, finger tip lesions (digital tip ulcers and digital pitting scars), telangiectasia, abnormal nailfold capillaries, pulmonary arterial hypertension and/or interstitial lung disease, Raynaud's phenomenon, scleroderma related antibodies. The maximum possible score is 19 and patients who have 9 points or more are classified as having SSc. The definitions of the items used in the criteria are provided in Table 2.
Table 1.
1. These criteria are applicable to any patient considered for inclusion in a SSc study.
2. These criteria are not applicable to:
a) Patients having a SSc-like disorder better explaining their manifestations, such as: nephrogenic sclerosing fibrosis, generalized morphea, eosinophilic fasciitis, scleredema diabeticorum, scleromyxedema, erythromyalgia, porphyria, lichen sclerosis, graft versus host disease, and diabetic cheiropathy.
b) Patients with `Skin thickening sparing the fingers'.
Items | Sub-items | Weight / Score |
---|---|---|
Skin thickening of the fingers of both hands extending proximal to the metacarpophalangeal joints (sufficient criterion) | | 9 |
Skin thickening of the fingers^ (only count the highest score) | Puffy fingers | 2 |
 | Sclerodactyly of the fingers (distal to MCP but proximal to the PIPs) | 4 |
Finger tip lesions^ (only count the highest score) | Digital tip ulcers | 2 |
 | Finger tip pitting scars | 3 |
Telangiectasia | | 2 |
Abnormal nailfold capillaries | | 2 |
Pulmonary arterial hypertension and/or interstitial lung disease* (*maximum score is 2) | PAH; ILD | 2 |
Raynaud's phenomenon | | 3 |
Scleroderma related antibodies** (any of anti-centromere, anti-topoisomerase I [anti-Scl70], anti-RNA polymerase III) (**maximum score is 3) | Anti-centromere; Anti-topoisomerase I; Anti-RNA polymerase III | 3 |
TOTAL SCORE^ | | |
Patients having a total score of 9 or more are classified as having definite systemic sclerosis.
^ Add the maximum weight (score) in each category to calculate the total score.
PAH is pulmonary arterial hypertension. The definition is proven PAH by right heart catheterization. ILD is interstitial lung disease defined as pulmonary fibrosis on HRCT or chest radiograph, most pronounced in the basilar portions of the lungs, or presence of `velcro' crackles on auscultation not due to another cause such as congestive heart failure. See definition of terms for all variables (Table 2).
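The point system in Table 1 can be sketched as a small function. This is an illustrative reading of the table, not an official implementation: the `Findings` field names are hypothetical, each field is assumed to have been assessed according to the definitions in Table 2, and returning `False` for the two exclusions is a simplification (the criteria are stated as not applicable to such patients).

```python
from dataclasses import dataclass

THRESHOLD = 9  # a total score of 9 or more classifies a case as SSc


@dataclass
class Findings:
    """Presence of each manifestation; field names are illustrative."""
    skin_thickening_proximal_to_mcp: bool = False  # both hands; sufficient criterion
    skin_thickening_sparing_fingers: bool = False  # exclusion
    better_explained_by_other_disorder: bool = False  # exclusion
    puffy_fingers: bool = False
    sclerodactyly: bool = False
    digital_tip_ulcers: bool = False
    pitting_scars: bool = False
    telangiectasia: bool = False
    abnormal_nailfold_capillaries: bool = False
    pah: bool = False
    ild: bool = False
    raynaud: bool = False
    ssc_antibodies: bool = False


def total_score(f: Findings) -> int:
    """Sum the Table 1 weights, counting only the highest item per category."""
    if f.skin_thickening_proximal_to_mcp:
        return 9  # sufficient criterion; no further scoring is needed
    skin = max(4 if f.sclerodactyly else 0, 2 if f.puffy_fingers else 0)
    tips = max(3 if f.pitting_scars else 0, 2 if f.digital_tip_ulcers else 0)
    return (
        skin
        + tips
        + 2 * f.telangiectasia
        + 2 * f.abnormal_nailfold_capillaries
        + 2 * (f.pah or f.ild)  # PAH and/or ILD: maximum score is 2
        + 3 * f.raynaud
        + 3 * f.ssc_antibodies  # any SSc-related antibody: maximum score is 3
    )


def classify_as_ssc(f: Findings) -> bool:
    """Apply the exclusions first, then the threshold."""
    if f.better_explained_by_other_disorder or f.skin_thickening_sparing_fingers:
        return False
    return total_score(f) >= THRESHOLD


# Example: Raynaud's (3) + antibodies (3) + abnormal capillaries (2) = 8, below threshold.
early_case = Findings(raynaud=True, ssc_antibodies=True,
                      abnormal_nailfold_capillaries=True)
print(total_score(early_case), classify_as_ssc(early_case))
```

Taking the maximum within the skin and finger tip categories mirrors the "only count the highest score" rule, which is why the largest attainable total without the sufficient criterion is 19.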
Table 2.
Item | Definition |
---|---|
Skin thickening | Skin thickening or hardening not due to scarring after injury, trauma, etc. |
| |
Puffy fingers | Swollen digits - a diffuse, usually nonpitting increase in soft tissue mass of the digits extending beyond the normal confines of the joint capsule. Normal digits are tapered distally with the tissues following the contours of the digital bone and joint structures. Swelling of the digits obliterates these contours. Not due to other reasons such as inflammatory dactylitis. |
| |
Finger tip ulcers or pitting scars | Ulcers or scars distal to or at the PIP joint not thought to be due to trauma. Digital pitting scars are depressed areas at digital tips as a result of ischemia, rather than trauma or exogenous causes. |
| |
Telangiectasia | Telangiectasia(e) in a scleroderma like pattern are round and well demarcated and found on hands, lips, inside of the mouth, and/or large matt-like telangiectasia(e). Telangiectasiae are visible macular dilated superficial blood vessels; which collapse upon pressure and fill slowly when pressure is released; distinguishable from rapidly filling spider angiomas with central arteriole and from dilated superficial vessels. |
| |
Abnormal nailfold capillary pattern consistent with SSc | Enlarged capillaries and/or capillary loss with or without peri-capillary hemorrhages at the nailfold and may be seen on the cuticle. |
| |
Pulmonary arterial hypertension | Pulmonary arterial hypertension diagnosed by right heart catheterization according to standard definitions. |
| |
Interstitial lung disease | Pulmonary fibrosis on HRCT or chest radiograph, most pronounced in the basilar portions of the lungs, or presence of `Velcro' crackles on auscultation not due to another cause such as congestive heart failure. |
| |
Raynaud's phenomenon | Self report or reported by a physician with at least a two-phase color change in finger(s) and often toe(s) consisting of pallor, cyanosis and/or reactive hyperemia in response to cold exposure or emotion; usually one phase is pallor. |
| |
Scleroderma specific antibodies | Anti-centromere antibody or centromere pattern on antinuclear antibody (ANA) testing; anti-topoisomerase I antibody (also known as anti-Scl70 antibody); or anti-RNA polymerase III antibody. Positive according to local laboratory standards. |
Validation
The characteristics of the validation sample (SSc n=268, controls n=137) are presented in Table 3. The sensitivity and specificity of the new SSc classification criteria, the 1980 ARA classification criteria and the classification criteria proposed by LeRoy and Medsger are presented in Table 4. The sensitivity and specificity of the new SSc criteria were 0.91 and 0.92 in the validation sample. Sensitivity and specificity of the new criteria were better than those of the two previous classification schemes, and test results of the new criteria versus ARA (p=0.01) and versus LeRoy/Medsger criteria (p=0.004) were significantly different. The area under the ROC curve (95% CI) of the classification system tested against presence of SSc in the validation sample was 0.81 (0.77, 0.85). The performance of the new criteria in patients with ≤3 years disease duration is also presented in Table 4.
Table 3.
 | Derivation sample | | | Validation sample | | |
---|---|---|---|---|---|---|
Item | SSc | Scleroderma-like disorder | p-value | SSc | Scleroderma-like disorder | p-value |
N= | 100 | 100 | -- | 268 | 137 | -- |
Age | 55 (13) | 51 (15) | 0.05 | 54 (13) | 52 (15) | 0.17 |
Female | 86 (86%) | 79 (79%) | 0.25 | 221 (83%) | 101 (75%) | 0.08 |
Region | ||||||
North America | 50 | 50 | -- | 191 (68%) | 91 (32%) | |
Europe | 50 | 50 | | 77 (63%) | 46 (37%) | 0.32 |
Time since onset of Raynaud's (years) | 13 (7–18) | 12 (4–18) | 0.42 | 9 (5–18) | 10 (4–22) | 0.40 |
Time since first non-Raynaud's symptom (years) | 10 (4–13) | 9 (2–14) | 0.58 | 7 (3–12) | 7 (3–15) | 0.89 |
Time since diagnosis (years) | 8 (3–12) | 6 (1–9) | 0.10 | 5 (2–11) | 4 (1–7) | 0.016 |
Scleroderma-like disorders | ||||||
systemic lupus erythematosus | | 28 (28%) | | | 32 (23%) | |
polymyositis/dermatomyositis | | 23 (23%) | | | 21 (15%) | |
primary Raynaud's syndrome | | 19 (19%) | | | 7 (5%) | |
mixed connective tissue disease | | 9 (9%) | | | 14 (10%) | |
undifferentiated connective tissue disease | | 8 (8%) | | | 17 (12%) | |
eosinophilic fasciitis | | 6 (6%) | | | 16 (12%) | |
nephrogenic sclerosing fibrosis | | 3 (3%) | | | 3 (2%) | |
generalized morphea | | 5 (5%) | | | 8 (6%) | |
scleromyxedema | | 1 (1%) | | | 3 (2%) | |
graft versus host disease | | 3 (3%) | | | 3 (2%) | |
other diagnoses | | 8 (8%) | | | 13 (9%) | |
Manifestations | ||||||
Raynaud's phenomenon | 91 (91%) | 49 (49%) | <0.0001 | 257 (96%) | 63 (46%) | <0.0001 |
Autoantibodies | 68 (68%) | 7 (7%) | <0.0001 | 137 (51%) | 15 (11%) | <0.0001 |
Anticentromere Antibody | 33 (33%) | 5 (5%) | <0.0001 | 41 (15%) | 8 (6%) | 0.0057 |
Anti-topoisomerase-I | 34 (34%) | 1 (1%) | <0.0001 | 69 (26%) | 7 (5%) | <0.0001 |
Anti-RNApolymeraseIII | 2 (2%) | 1 (1%) | 1.0 | 27 (10%) | 0 | <0.0001 |
Puffy fingers | 61 (61%) | 17 (17%) | <0.0001 | 169 (63%) | 24 (18%) | <0.0001 |
Abnormal nailfold capillaries | 54 (54%) | 24 (24%) | <0.0001 | 146 (54%) | 51 (37%) | 0.0010 |
Dilated vessels | 37 (37%) | 28 (28%) | 0.08 | 124 (46%) | 34 (25%) | 0.0080 |
Avascular areas | 21 (21%) | 11 (11%) | 0.08 | 86 (32%) | 9 (7%) | <0.0001 |
Hemorrhages | 12 (12%) | 9 (9%) | 0.64 | 63 (24%) | 8 (6%) | <0.0001 |
Digital tip ulcers | 53 (53%) | 8 (8%) | <0.0001 | 108 (40%) | 12 (9%) | <0.0001 |
Pitting scars | 53 (53%) | 5 (5%) | <0.0001 | 105 (39%) | 5 (4%) | <0.0001 |
PAH or ILD | 48 (48%) | 14 (14%) | <0.0001 | 138 (52%) | 14 (10%) | <0.0001 |
PAH | 44 (44%) | 10 (10%) | <0.0001 | 20 (7%) | 2 (1%) | 0.012 |
ILD | 12 (12%) | 4 (4%) | 0.037 | 131 (49%) | 12 (9%) | <0.0001 |
Telangiectasia | 35 (35%) | 10 (10%) | <0.0001 | 68 (25%) | 13 (9%) | 0.0002 |
Skin thickening of fingers to proximal of MCPs | 26 (26%) | 1 (1%) | <0.0001 | 105 (39%) | 6 (4%) | <0.0001 |
Skin thickening of fingers distal to MCPs | 5 (5%) | 38 (38%) | <0.0001 | 178 (66%) | 24 (18%) | <0.0001 |
Manifestations appear in order of frequency of occurrence in the SSc derivation sample. MCPs: Metacarpophalangeal joints; PAH: Pulmonary Arterial Hypertension; ILD: Interstitial Lung Disease. The p-value is for the difference between SSc cases and controls with scleroderma-like disorder. Values are presented as mean (SD), median (P25 P75), or n (%), as appropriate. Data were prospectively collected from 605 consecutive patients with SSc or a scleroderma-like disorder (see methods) of whom half of the SSc sample at each site were to be early SSc. A random sample of 100 SSc cases and 100 controls, 50% from North America and 50% Europe, was selected to form the derivation sample and the remaining patients formed the validation sample.
Table 4.
 | Derivation sample (N=200) | | Validation sample (N=405) | | Validation sample ≤ 3 years disease duration (N=100) | |
---|---|---|---|---|---|---|
Criteria | Sensitivity (95% CI) | Specificity (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) |
1980 ARA SSc criteria | 0.80 (0.72, 0.87) | 0.77 (0.68, 0.84) | 0.75 (0.70, 0.80) | 0.72 (0.64, 0.79) | 0.75 (0.70, 0.80) | 0.72 (0.63, 0.79) |
2001 LeRoy and Medsger criteria | 0.76 (0.68, 0.84) | 0.69 (0.68, 0.84) | 0.75 (0.70, 0.80) | 0.78 (0.70, 0.85) | 0.80 (0.69, 0.88) | 0.76 (0.53, 0.92) |
2013 ACR-EULAR SSc Classification Criteria | 0.95 (0.90, 0.98) | 0.93 (0.86, 0.97) | 0.91 (0.87, 0.94) | 0.92 (0.86, 0.96) | 0.91 (0.83, 0.96) | 0.90 (0.70, 0.99) |
1980 ARA criteria for the classification of SSc [ARA 1980]; 2001 proposal for the classification of SSc, where lcSSc and dcSSc were regarded as `definite' SSc [Leroy 2001]; 2013 ACR-EULAR SSc classification criteria (from Table 1). Data were prospectively collected from 605 consecutive patients with SSc or a scleroderma-like disorder (see methods). A random sample of 100 SSc cases and 100 controls, 50% from North America and 50% from Europe, was selected to form the derivation sample and the remaining patients formed the validation sample (the characteristics are in Table 3).
The classification system was additionally tested against expert opinion, using the set of 38 selected cases (Table 5). All of the cases scoring 9 or above were considered SSc whereas cases scoring less than 9 were not regarded as SSc or were controversial. The proposed system classified all of these cases in accordance with consensus-based expert opinion. The new criteria classified as SSc all cases that were classified as SSc by the 1980 ARA criteria, and also included several cases not classified as SSc by the 1980 ARA criteria.
Table 5.
ID | Skin thickening of the fingers | Fingertips | Telangiectasia | Abnormal nailfold capillaries | Puffy fingers | Raynaud's phenomenon | ILD/Pulmonary Fibrosis | Pulmonary Arterial Hypertension | SSc related antibodies | Classified as SSc | Score | 1980 ARA criteria | Experts agreeing (N=16)* |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Whole finger | Pitting scars | Yes | Yes | Yes | Yes | Yes | Yes | 17 | Yes | 16 | ||
2 | Whole finger | Pitting scars | Yes | Yes | Yes | Yes | Yes | 15 | Yes | 16 | |||
3 | Whole finger | Pitting scars | Yes | Yes | Yes | Yes | 13 | Yes | 16 | ||||
4 | Whole finger | Pitting scars | Yes | Yes | Yes | Yes | 12 | Yes | 15 | ||||
5 | Whole finger | Yes | Yes | Yes | Yes | Yes | 12 | No | 15 | ||||
6 | Whole finger | Yes | Yes | Yes | 11 | No | 10 | ||||||
7 | Pitting scars | Yes | Yes | Yes | Yes | 10 | ? | ||||||
8 | Ulcers | Yes | Yes | Yes | Yes | 10 | No | 14 | |||||
9 | Yes | Yes | Yes | Yes | Yes | 10 | No | ||||||
10 | Yes | Yes | Yes | Yes | Yes | 10 | No | ||||||
11 | Ulcers | Yes | Yes | Yes | Yes | 10 | No | ||||||
12 | Yes | Yes | Yes | Yes | Yes | 10 | No | ||||||
13 | Whole finger | Yes | Yes | Yes | Yes | 10 | No | 15 | |||||
14 | Whole finger | Yes | Yes | Yes | 10 | No | |||||||
15 | Whole finger | Yes | Yes | Yes | 9 | No | |||||||
16 | Whole finger | Yes | Yes | Yes | Yes | 9 | No | ||||||
17 | Whole finger | Ulcers | Yes | Yes | Yes | 9 | No | ||||||
18 | Whole finger | Yes | Yes | Yes | 9 | No | |||||||
19 | Whole finger | Yes | Yes | Yes | Yes | 9 | No | 9 | |||||
20 | Whole finger | Yes | Yes | Yes | 9 | No | |||||||
21 | Pitting scars | Yes | Yes | No | 8 | ? | |||||||
22 | Yes | Yes | Yes | No | 8 | No | |||||||
23 | Pitting scars | Yes | Yes | No | 8 | No | 4 | ||||||
24 | Pitting scars | Yes | Yes | No | 8 | No | 2 | ||||||
25 | Yes | Yes | Yes | No | 8 | No | |||||||
26 | Yes | Yes | Yes | No | 7 | No | |||||||
27 | Yes | Yes | Yes | No | 7 | No | |||||||
28 | Distal to PIPs | Yes | Yes | Yes | No | 7 | No | 5 | |||||
29 | Yes | Yes | Yes | No | 7 | No | |||||||
30 | Whole finger | Yes | No | 7 | No | ||||||||
31 | Whole finger | Yes | Yes | No | 6 | No | |||||||
32 | Distal to PIPs | Yes | Yes | No | 6 | No | 8 | ||||||
33 | Yes | Yes | No | 6 | No | 0 | |||||||
34 | Yes | Yes | No | 5 | No | ||||||||
35 | Yes | Yes | No | 5 | No | 0 | |||||||
36 | Yes | Yes | No | 5 | No | 1 | |||||||
37 | Yes | Yes | No | 5 | No | 0 | |||||||
38 | Yes | No | 2 | No | 0 |
Profiles of 38 cases likely to have SSc or a SSc-like disorder, with presence or absence of the manifestations in the newly developed SSc classification system. Items are blank if not present. The cases are ranked in order of the points awarded by the classification system in the `Score' column and are classified as `SSc' if they have 9 points or more. The number of experts agreeing that each case had SSc is given (total N=16). Fulfillment of the 1980 ARA criteria is indicated in the `1980 ARA criteria' column, where `?' means inconclusive. None of the cases had `Skin thickening of the fingers of both hands extending proximal to the metacarpophalangeal joints', thus none of the cases was classified as SSc immediately. Whole finger is sclerodactyly of the entire finger but not proximal to the MCPs. PIPs are proximal interphalangeal joints.
* Experts were only asked about cases for which a value is given.
Discussion
A classification system for systemic sclerosis (SSc) is required to ensure that patients assigned the label `SSc' for inclusion in studies have specific defined characteristics. The major reason to revise the 1980 ARA criteria was that these criteria lacked adequate sensitivity, especially in patients with early SSc and limited cutaneous SSc (lcSSc) [9,10,18]. The proposed classification criteria demonstrate greater sensitivity and specificity than the 1980 criteria and the classification criteria proposed by LeRoy and Medsger. All possible profiles of patients who were considered to have SSc by a majority of experts are indeed classified as having SSc by the classification system. The new system is more inclusive and also performs well in patients with early disease, meaning a time since diagnosis of 0–3 years.
The newly developed classification system includes disease manifestations of the three hallmarks of SSc: fibrosis of the skin and/or internal organs, production of certain autoantibodies, and vasculopathy. The four items of the 1980 ARA classification criteria (scleroderma proximal to the metacarpophalangeal joints, sclerodactyly, digital pitting scars, not pulp loss, and bilateral basilar pulmonary fibrosis) are also included, as well as the items of the 2001 proposal for classification of SSc by Leroy and Medsger (Raynaud's phenomenon, autoantibodies, nailfold capillaroscopy, skin fibrosis) [3,6].
The classification criteria include one sufficient criterion: skin thickening of the fingers extending proximal to the metacarpophalangeal joints, which is similar to the 1980 criteria. If the sufficient criterion is not fulfilled, then the point system is applied and patients with 9 points or more are also classified as having SSc. All items in the classification criteria represent measurements that are performed in routine clinical practice. The criteria are meant for including SSc patients in studies, not for SSc diagnosis. Although the list of items in the classification criteria mimics the list of items one usually uses for diagnosis, in practice diagnosis of SSc may also be informed by items not in the classification criteria, such as tendon friction rubs, calcinosis, and dysphagia. Consequently, people classified as having SSc are a subset of people diagnosed with SSc, diagnosis being the more sensitive of the two. Ideally, there would be no difference between diagnosis and classification criteria.
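The two-step rule described above (sufficient criterion first, otherwise an additive score with a threshold of 9) can be sketched in code. This is a minimal illustration only, not part of the validated criteria: the names `Case`, `score` and `classify` are hypothetical, and the item weights are assumptions that should be checked against the published criteria table before any use.

```python
from dataclasses import dataclass

THRESHOLD = 9  # 9 points or more classifies a case as SSc

# Assumed weights (verify against the criteria table).
# Within each of these two domains only the highest-scoring finding counts,
# so a single value per domain is recorded.
FINGER_SKIN = {"none": 0, "puffy_fingers": 2, "sclerodactyly": 4}
FINGERTIP_LESIONS = {"none": 0, "digital_tip_ulcers": 2, "pitting_scars": 3}

# Assumed weights for items that are simply present or absent.
BOOLEAN_ITEMS = {
    "telangiectasia": 2,
    "abnormal_nailfold_capillaries": 2,
    "pah_or_ild": 2,
    "raynaud": 3,
    "ssc_autoantibodies": 3,
}


@dataclass
class Case:
    skin_proximal_to_mcps: bool = False  # the sufficient criterion
    finger_skin: str = "none"
    fingertip_lesions: str = "none"
    telangiectasia: bool = False
    abnormal_nailfold_capillaries: bool = False
    pah_or_ild: bool = False
    raynaud: bool = False
    ssc_autoantibodies: bool = False


def score(case: Case) -> int:
    """Sum the points of the additive items (sufficient criterion excluded)."""
    points = FINGER_SKIN[case.finger_skin] + FINGERTIP_LESIONS[case.fingertip_lesions]
    points += sum(w for name, w in BOOLEAN_ITEMS.items() if getattr(case, name))
    return points


def classify(case: Case) -> bool:
    """True if the sufficient criterion is met or the score reaches the threshold."""
    return case.skin_proximal_to_mcps or score(case) >= THRESHOLD
```

Under these assumed weights, a case with sclerodactyly of the whole finger, SSc-related autoantibodies and pulmonary arterial hypertension scores 4 + 3 + 2 = 9 and is classified as SSc, whereas a case with only Raynaud's phenomenon, autoantibodies and abnormal nailfold capillaries scores 3 + 3 + 2 = 8 and is not.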
As intended, the new classification incorporates the considerable advances made in the diagnosis of SSc. It includes the concept of specific serum autoantibodies such as anti-topoisomerase I, anti-centromere, and anti-RNA polymerase III [15,23]. Other SSc autoantibodies, such as anti-Th/To, anti-U3 RNP and others, may become more widely available. The criteria also acknowledge the value of nailfold magnification in the diagnosis of SSc [14,15]. Although capillaroscopy can be performed with highly specialized equipment such as video capillaroscopy cameras, simple in-office ophthalmoscopes or dermatoscopes suffice for distinguishing normal from abnormal nailfold capillaries [24,25]. Capillaroscopy is now widely used and, considering the value of nailfold magnification in the diagnosis and management of SSc, these new criteria may encourage the acquisition of this skill by physicians caring for patients with SSc. Likewise, criteria for PAH have changed over the years. The criteria allow for this: the diagnosis of PAH should be based on right heart catheterization according to the most recent accepted criteria.
Several items that are useful in clinical practice to recognize SSc, such as calcinosis, flexion contractures of the fingers, tendon or bursal friction rubs, renal crisis, esophageal dilatation and dysphagia, are not included in the criteria. These were considered but did not substantially improve sensitivity or specificity. For example, renal crisis is a strong indicator of SSc, but its low occurrence makes it less useful for the purpose of classification [20]. The committee considered a non-point-based additive system such as the SLE criteria or the 1980 ARA criteria [8]. However, we concluded that assigning weights yielded superior results for SSc classification. The weights were simplified to single-digit numbers to make the system easy to use even in the absence of a computing device. Similar weighted systems have been used for other rheumatic diseases [26]. The committee also decided not to include `probable' or `possible' SSc in the classification.
Examples of profiles not captured by the ARA criteria that fulfilled the new classification criteria are combinations of skin thickening of the whole finger, scleroderma-related antibodies, and pulmonary arterial hypertension and/or Raynaud's phenomenon. A patient with Raynaud's phenomenon, autoantibodies and abnormal nailfold capillaries is not classified as `SSc', though such a patient may develop SSc over the years [6,15].
Patients may have disease manifestations similar to SSc that are better explained by another well-defined disorder such as nephrogenic sclerosing fibrosis, generalized morphea, eosinophilic fasciitis, scleredema diabeticorum, scleromyxedema, porphyria, lichen sclerosus, graft versus host disease, and diabetic cheiropathy. We decided that it was not necessary to develop criteria that differentiated SSc specifically from these conditions. Some of these diseases were included in the validation cohort of SSc-like disorders and it is possible that specificity may have been slightly higher had they been excluded.
In developing the revised SSc classification criteria, we followed the recommendations and guidelines of ACR and EULAR, which included: 1) collaboration between clinical experts and clinical epidemiologists in criteria development, 2) evaluation of the psychometric properties of each candidate criterion, and 3) description of the test sample (origin of the patients and control subjects) [27,28]. Ideally, phases of criteria development should have a balance between expert opinion and data-driven methods. At the same time, circularity of reasoning (a bias that can occur when the same experts developing the criteria are the ones contributing cases and comparison patients) should be avoided [29]. We included different experts at different steps in the development of SSc criteria to avoid circularity.
Testing and validating a classification system for SSc is difficult because there is no gold standard for defining a particular case as SSc; that is, there is no incontrovertible test or criterion. We relied on expert opinion for our gold standard, which is similar to what has been used in the development of other criteria [8]. In the absence of a `gold standard' we developed and tested the proposed classification system against two standards of expert opinion: 1) the opinion of the clinician who selected cases for the North American and European derivation and validation cohorts; and 2) the combined opinion of a group of clinical experts in SSc. Both standards have strengths and weaknesses. Each individual clinician who selected cases had access to information that could have included aspects that were not captured by the forms, which were restricted to 23 particular manifestations. Data were obtained from several sites in Europe and North America, which should improve generalizability and reduce selection bias. On the other hand, it is possible that other expert clinicians may have had a different opinion about particular cases. The consensus opinion of a group of experts who had the opportunity to discuss controversial cases strengthens the combined expert opinion. However, the group may not have been aware of some relevant information not included in the available data. It is also difficult for a group of experts to consider hundreds of cases in depth; this was managed by having the group of experts consider in depth only those cases, or combinations of items, which appeared to be controversial. In this way, the expert group was able to form a consensus over the whole range of cases in the databases. A key strength of the present work is the use of both standards for testing and validation of the proposed system.
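Agreement between a classification system and a reference standard of expert opinion is summarized as sensitivity and specificity. The following is a minimal sketch of that calculation; the function name `sensitivity_specificity` is hypothetical and the labels in the usage note are toy values, not study data.

```python
def sensitivity_specificity(predicted, reference):
    """Return (sensitivity, specificity) of predicted labels against a reference.

    Both arguments are equal-length sequences of booleans, where True means
    'classified as SSc' (predicted) or 'SSc according to expert opinion'
    (reference).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fn = sum(1 for p, r in zip(predicted, reference) if not p and r)
    tn = sum(1 for p, r in zip(predicted, reference) if not p and not r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both cases and controls")

    sensitivity = tp / (tp + fn)  # proportion of reference cases classified as SSc
    specificity = tn / (tn + fp)  # proportion of reference controls not classified
    return sensitivity, specificity
```

For example, with toy labels `predicted = [True, True, False, False, True]` and `reference = [True, True, True, False, False]`, two of the three reference cases are classified as SSc and one of the two controls is correctly left unclassified.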
The approach we took has other strengths and limitations. The methodology was state of the art, with validation by data and by expert opinion at every step. The various methods used in the development process have been described previously [19,20]. The criteria have face validity, because the items are routinely assessed in daily clinical practice and have also appeared in other important classification criteria for SSc. The criteria are open to new developments, for example in autoantibodies or in the assessment of nailfold capillaries. Formal conjoint analysis to derive the weights associated with items improved the sensitivity and specificity of the items, as was also found in the development of the recent ACR-EULAR criteria for the classification of rheumatoid arthritis [30].
The criteria have not been validated in ethnic groups that are not common in North America and Europe. This will require further studies. Regarding clinical use, the number of items and weights may not be easy to remember, but the criteria can be made widely available and (electronic) applications can be developed. The SSc classification criteria steering committee and the expert consultants agreed that the criteria could allow patients with another rheumatic disease to also be classified as SSc (such as having both SLE and SSc, or RA and SSc, etc). Although this is a possible limitation, it permits individual researchers to decide whether or not to include subjects who fulfill criteria for more than one rheumatic disease in any particular study.
Conclusions
The ACR-EULAR classification criteria for SSc perform better than the 1980 Preliminary ARA Criteria for SSc in terms of both sensitivity and specificity. They are relatively simple to apply to individual subjects. These criteria may be endorsed as inclusion criteria for SSc studies. Validation in other populations is encouraged.
Acknowledgements
This research was supported by the American College of Rheumatology (ACR) and the European League Against Rheumatism (EULAR). Dr. Johnson is supported by a Canadian Institutes of Health Research Clinician Scientist Award and the Norton-Evans Fund for Scleroderma Research. Dr. Khanna was supported by the Scleroderma Foundation (New Investigator Award), and a National Institutes of Health Award (NIAMS K24 AR063120).
Footnotes
Key message: The ACR-EULAR criteria for the classification of systemic sclerosis performed better than the 1980 Preliminary ARA Criteria for SSc and should allow for more patients to be classified correctly as SSc. The use of these criteria is recommended for including patients in epidemiological, clinical or experimental studies.
REFERENCES
1. Wollheim FA. Classification of systemic sclerosis: visions and reality. Rheumatology (Oxford). 2005;44:1212–6. doi: 10.1093/rheumatology/keh671.
2. Masi AT, Medsger TA Jr, Rodnan GP, Fries JF, Altman RD, Brown BW, et al. Methods and preliminary results of the Scleroderma Criteria Cooperative Study of the American Rheumatism Association. Clin Rheum Dis. 1979;5:27–79.
3. Subcommittee for Scleroderma Criteria of the American Rheumatism Association Diagnostic and Therapeutic Criteria Committee. Preliminary criteria for the classification of systemic sclerosis (scleroderma). Arthritis Rheum. 1980;23:581–90. doi: 10.1002/art.1780230510.
4. Preliminary criteria for the classification of systemic sclerosis (scleroderma). Bull Rheum Dis. 1981;31:1–6. No authors listed.
5. Nadashkevich O, Davis P, Fritzler MJ. A proposal of criteria for the classification of systemic sclerosis. Med Sci Monit. 2004;10:CR615–21.
6. Leroy EC, Medsger TA Jr. Criteria for the classification of early systemic sclerosis. J Rheumatol. 2001;28(7):1573–6.
7. Classification and Response Criteria Subcommittee of the American College of Rheumatology Committee on Quality Measures. Development of classification and response criteria for rheumatic diseases [editorial]. Arthritis Rheum. 2006;55:348–52. doi: 10.1002/art.22003.
8. Johnson SR, Goek ON, Singh-Grewal D, Vlad SC, Feldman BM, Felson DT, et al. Classification criteria in rheumatic diseases: a review of methodologic properties. Arthritis Rheum. 2007;57(7):1119–33. doi: 10.1002/art.23018.
9. Lonzetti LS, Joyal F, Raynauld JP, Roussin A, Goulet JR, Rich E, et al. Updating the American College of Rheumatology preliminary classification criteria for systemic sclerosis: addition of severe nailfold capillaroscopy abnormalities markedly increases the sensitivity for limited scleroderma [letter]. Arthritis Rheum. 2001;44:735–6. doi: 10.1002/1529-0131(200103)44:3<735::AID-ANR125>3.0.CO;2-F.
10. Hachulla E, Launay D. Diagnosis and classification of systemic sclerosis. Clin Rev Allergy Immunol. 2011;40:78–83. doi: 10.1007/s12016-010-8198-y.
11. LeRoy EC, Black C, Fleischmajer R, Jablonska S, Krieg T, Medsger TA Jr, Rowell N, Wollheim F. Scleroderma (systemic sclerosis): classification, subsets and pathogenesis. J Rheumatol. 1988;15(2):202–5.
12. Avouac J, Fransen J, Walker UA, Riccieri V, Smith V, Muller C, et al. Preliminary criteria for the very early diagnosis of systemic sclerosis: results of a Delphi Consensus Study from EULAR Scleroderma Trials and Research Group. Ann Rheum Dis. 2011;70(3):476–81. doi: 10.1136/ard.2010.136929.
13. Walker JG, Pope J, Baron M, Leclercq S, Hudson M, Taillefer S, et al. The development of systemic sclerosis classification criteria. Clin Rheumatol. 2007;26(9):1401–9. doi: 10.1007/s10067-007-0537-x.
14. Harper FE, Maricq HR, Turner RE, Lidman RW, Leroy EC. A prospective study of Raynaud phenomenon and early connective tissue disease. A five-year report. Am J Med. 1982;72(6):883–8. doi: 10.1016/0002-9343(82)90846-4.
15. Koenig M, Joyal F, Fritzler MJ, Roussin A, Abrahamowicz M, Boire G, et al. Autoantibodies and microvascular damage are independent predictive factors for the progression of Raynaud's phenomenon to systemic sclerosis: a twenty-year prospective study of 586 patients, with validation of proposed criteria for early systemic sclerosis. Arthritis Rheum. 2008;58(12):3902–12. doi: 10.1002/art.24038.
16. Maricq HR, Weinberger AB, Leroy EC. Early detection of scleroderma-spectrum disorders by in vivo capillary microscopy: a prospective study of patients with Raynaud's phenomenon. J Rheumatol. 1982;9(2):289–91.
17. Cutolo M, Matucci-Cerinic M. Nailfold capillaroscopy and classification criteria for systemic sclerosis. Clin Exp Rheumatol. 2007;25:663–5.
18. Hudson M, Taillefer S, Steele R, Dunne J, Johnson SR, Jones N, et al. Improving the sensitivity of the American College of Rheumatology classification criteria for systemic sclerosis. Clin Exp Rheumatol. 2007;25:754–7.
19. Fransen J, Johnson SR, van den Hoogen F, Baron M, Allanore Y, Carreira PE, et al. Items for developing revised classification criteria in systemic sclerosis: results of a consensus exercise. Arthritis Care Res. 2012;64:351–7. doi: 10.1002/acr.20679.
20. Johnson SR, Fransen J, Khanna D, Medsger TA, Peschken C, Carreira P, et al. Validation of potential classification criteria for systemic sclerosis. Arthritis Care Res. 2012;64(3):358–67. doi: 10.1002/acr.20684.
21. Hansen P, Ombler F. A new method for scoring multi-attribute value models using pairwise rankings of alternatives. Journal of Multi-Criteria Decision Analysis. 2009;15:87–107.
22. Johnson SR, Naden RP, Fransen J, Van den Hoogen F, Pope JE, Baron M, et al. Systemic Sclerosis Classification Criteria: Developing methods for multi-criteria decision analysis. Submitted.
23. Steen VD. Autoantibodies in systemic sclerosis. Semin Arthritis Rheum. 2005;35(1):35–42. doi: 10.1016/j.semarthrit.2005.03.005.
24. Baron M, Bell M, Bookman A, et al. Office capillaroscopy in systemic sclerosis. Clin Rheumatol. 2007;26(8):1268–74. doi: 10.1007/s10067-006-0489-6.
25. Hudson M, Masetto A, Steele R, Arthurs E, Baron M. Canadian Scleroderma Research Group. Reliability of widefield capillary microscopy to measure nailfold capillary density in systemic sclerosis. Clin Exp Rheumatol. 2010 Sep-Oct;28(5 Suppl 62):S36–41.
26. Aletaha D, Neogi T, Silman AJ, Funovits J, Felson D, Bingham CO, et al. 2010 Rheumatoid arthritis classification criteria: an American College of Rheumatology/European League Against Rheumatism collaborative initiative. Ann Rheum Dis. 2010;69:1580–8. doi: 10.1136/ard.2010.138461.
27. Singh JA, Solomon DH, Dougados M, Felson D, Hawker G, Katz P, et al. Development of classification and response criteria for rheumatic diseases. Arthritis Rheum. 2006 Jun 15;55(3):348–52. doi: 10.1002/art.22003.
28. Dougados M, Gossec L. Classification criteria for rheumatic diseases: why and how? Arthritis Rheum. 2007;57(7):1112–5. doi: 10.1002/art.23015.
29. Felson DT, Anderson JJ. Methodological and statistical approaches to criteria development in rheumatic diseases. Baillieres Clin Rheumatol. 1995 May;9(2):253–66. doi: 10.1016/s0950-3579(05)80189-x.
30. Neogi T, Aletaha D, Silman AJ, Naden RL, Felson DT, Aggarwal R, et al. The 2010 American College of Rheumatology/European League Against Rheumatism classification criteria for rheumatoid arthritis: Phase 2 methodological report. Arthritis Rheum. 2010;62(9):2582–91. doi: 10.1002/art.27580.