JAMA Dermatol. 2023 Feb 8;159(3):299–307. doi: 10.1001/jamadermatol.2022.6365

Development and Validation of the Morphea Activity Measure in Patients With Pediatric Morphea

Maria Teresa García-Romero 1, Megha Tollefson 2,3, Elena Pope 4,5, Heather A Brandling-Bennett 6,7, Amy S Paller 8,9,10, Emily Keimig 11, Lisa Arkin 12, Karolyn A Wanat 13, Stephen R Humphrey 13, Victoria P Werth 14,15, Vikash Oza 16, Heidi Jacobe 17, Nicole Fett 18, Kelly M Cordoro 19, Isabel Medina-Vera 20, Yvonne E Chiu 13,21
PMCID: PMC9909574  PMID: 36753150

Key Points

Question

Can a reliable, valid, and viable tool usable for any type and severity of morphea be developed to evaluate morphea activity for clinical and research use?

Finding

In this pilot diagnostic study involving a panel of 14 experts in morphea and 14 pediatric patients with morphea, the Delphi consensus method was used to develop the Morphea Activity Measure. During an in-person meeting and use in patients, reliability (interrater and intrarater agreement), validity, and viability were evaluated and found to be strong.

Meaning

This study’s findings suggest the Morphea Activity Measure is reliable, valid, and viable for evaluating disease activity; although further testing is needed, the tool has potential for clinical use.

Abstract

Importance

Morphea is an insidious inflammatory disorder of the skin and deeper tissues. Determining disease activity is challenging yet important to medical decision-making and patient outcomes.

Objective

To develop and validate a scoring tool, the Morphea Activity Measure (MAM), to evaluate morphea disease activity of any type or severity that is easy to use in clinical and research settings.

Design, Setting, and Participants

This pilot diagnostic study was conducted from September 9, 2019, to March 6, 2020, in 2 phases: development and validation. During the development phase, 14 morphea experts (dermatologists and pediatric dermatologists) used a Delphi consensus method to determine items that would be included in the MAM. The validation phase included 8 investigators who evaluated the tool in collaboration with 14 patients with pediatric morphea (recruited from a referral center [Medical College of Wisconsin]) during a 1-day in-person meeting on March 6, 2020.

Main Outcomes and Measures

During the development phase, online survey items were evaluated by experts in morphea using a Likert scale (score range, 0-10, with 0 indicating not important and 10 indicating very important); agreement was defined as a median score of 7.0 or higher, disagreement as a median score of 3.9 or lower, and no consensus as a median score of 4.0 to 6.9. During the validation phase, reliability (interrater and intrarater agreement using intraclass correlation coefficients), validity (using the content validity index and κ statistics as well as correlations with the modified Localized Scleroderma Severity Index and the Physician Global Assessment of Activity using Spearman ρ coefficients), and viability (using qualitative interviews of investigators who used the MAM tool) were evaluated. Descriptive statistics were used for quantitative variables. Data on race and ethnicity categories were collected but not analyzed because skin color was more relevant for the purposes of this study.

Results

Among 14 survey respondents during the development phase, 9 (64.3%) were pediatric dermatologists and 5 (35.7%) were dermatologists. After 2 rounds, a final tool was developed comprising 10 items that experts agreed were indicative of morphea activity (new lesion in the past 3 months, enlarging lesion in the past 3 months, linear lesion developing progressive atrophy in the past 3 months, erythema, violaceous rim or color, warmth to the touch, induration, white-yellow or waxy appearance, shiny white wrinkling, and body surface area). The validation phase was conducted with 14 patients (median age, 14.5 years [range, 8.0-18.0 years]; 8 [57.1%] female), 2 dermatologists, and 6 pediatric dermatologists. Interrater and intrarater agreement for MAM total scores was good, with intraclass correlation coefficients of 0.844 (95% CI, 0.681-0.942) for interrater agreement and 0.856 (95% CI, 0.791-0.901) for intrarater agreement. Correlations between the MAM and the modified Localized Scleroderma Severity Index (Spearman ρ = 0.747; P < .001) and the MAM and the Physician Global Assessment of Activity (Spearman ρ = 0.729; P < .001) were moderately strong. In qualitative interviews, evaluators agreed that the tool was easy to use, measured morphea disease activity at a single time point, and should be responsive to changes in morphea disease activity over multiple time points.

Conclusions and Relevance

In this study, the MAM was found to be a reliable, valid, and viable tool to measure pediatric morphea activity. Further testing to assess validity in adults and responsiveness to change is needed.


This diagnostic study uses a Delphi consensus method to develop the Morphea Activity Measure, a scoring tool designed to evaluate morphea disease activity of any type and severity for use in clinical and research settings, and validates the tool among pediatric patients with morphea.

Introduction

Morphea is a rare inflammatory disorder of the skin and deeper tissues, with an active inflammatory phase that is amenable to treatment and an inactive phase during which changes occur that may be permanent.1 Morphea may have a relapsing-remitting course, with risk of disease reactivation over time.2,3

Determining morphea disease activity is challenging, and disease progression is often underrecognized, contributing to delayed diagnosis or undertreatment.4 Delaying adequate treatment for active disease may result in deleterious patient outcomes, such as functional disabilities, disfigurement, and higher relapse rates.2 Better ability to characterize and define clinical signs of disease activity is important for clinicians. Several disease activity measures have been proposed, including clinical scores,5 serological markers,6,7 and technological devices8,9,10 that are not readily available or easy to use in clinical practice. A systematic review11 found that clinical scoring methods seemed to provide the most reliable assessment of morphea activity and should be further investigated. The Localized Scleroderma Cutaneous Assessment Tool (LoSCAT) is one such method that has been validated. It is composed of the modified Localized Scleroderma Skin Severity Index (LoSSI), which is frequently used as a proxy to evaluate disease activity, and the Localized Scleroderma Skin Damage Index (LoSDI), which has been found to differentiate between activity and damage and is sensitive to change.12,13,14,15,16 However, the modified LoSSI assesses only 3 features of disease activity (new or enlarging lesions, erythema, and skin thickness) and does not capture lesion size. Based on anecdotal experience, the modified LoSSI may not be sensitive enough to detect differences in morphea of all types or degrees of severity (eAppendix 1 in Supplement 1).

There is a need for an objective validated measure of morphea activity that consists of more clinical features, includes assessment of body surface area (BSA), can be applied to morphea of any type or severity, and is easy to use in clinical and research settings. In this pilot diagnostic study, we aimed to develop a new morphea skin-specific instrument, the Morphea Activity Measure (MAM), and assess its reliability, validity, and viability.

Methods

This pilot diagnostic study was conducted from September 9, 2019, to March 6, 2020. The study consisted of 2 phases: a development phase to determine items that would be included in the tool and a validation phase. The study protocol was approved by the institutional review boards of Children’s Wisconsin (Milwaukee), the Medical College of Wisconsin (Milwaukee), and the National Institute of Pediatrics (Mexico City, Mexico). The parents of all patients provided written informed consent. Consent to use and publish patient data and the images shown in eAppendix 1 in Supplement 1 (patients 1 and 2) was provided by parents. All other images were obtained from the photographic records of Children’s Wisconsin, the Hospital for Sick Children (Toronto, Canada), and the National Institute of Pediatrics; permission for clinical care use of those images had been obtained previously. This study followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guideline for diagnostic studies.

Fourteen investigators with morphea expertise participated in the development phase. An initial version of the tool was developed by 5 coinvestigators (M.T.G.-R., M.T., E.P., H.A.B.-B., and Y.E.C.). These coinvestigators and 9 morphea experts participated in a Delphi consensus process via an online survey (Qualtrics) to refine the tool.17,18 In each survey round, statements were evaluated using a Likert scale (range, 0-10, with 0 indicating not important and 10 indicating very important). Agreement was defined as a median Likert score of 7.0 or higher, disagreement as a median score of 3.9 or lower, and no consensus as a median score of 4.0 to 6.9. Respondents could add comments or propose new items or modifications to the current items. After the first round, anonymous responses were presented in charts and graphs using descriptive statistics. Respondents were asked to reevaluate the tool’s items in the context of their peers’ responses, with the objective of reaching a consensus. Items were modified according to comments and suggestions until agreement on the final tool was reached (Figure 1). A final report was prepared, and the tool was user tested by 5 experts (M.T.G.-R., M.T., E.P., H.A.B.-B., and Y.E.C.) in collaboration with 2 to 3 local patients with morphea.
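The consensus rule described above can be expressed as a short sketch. The function name `classify_consensus` and the example ratings are hypothetical and are not study data; only the thresholds (median of 7.0 or higher, 3.9 or lower, and 4.0 to 6.9) come from the text.

```python
from statistics import median


def classify_consensus(scores):
    """Classify one survey item from its 0-10 Likert ratings.

    Thresholds follow the definitions in the text; with whole-number
    ratings the median is always a multiple of 0.5, so no median can
    fall between 3.9 and 4.0 or between 6.9 and 7.0.
    """
    m = median(scores)
    if m >= 7.0:
        return "agreement"
    if m <= 3.9:
        return "disagreement"
    return "no consensus"


# Hypothetical ratings from 14 respondents for a single item.
ratings = [9, 8, 7, 10, 6, 9, 8, 7, 9, 10, 5, 8, 9, 7]
print(median(ratings), classify_consensus(ratings))
```

In a Delphi process, an item classified as "no consensus" would typically be revised and returned to respondents in the next round.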

Figure 1. Morphea Activity Measure—Final Tool.

LLE indicates left lower extremity; LUE, left upper extremity; RLE, right lower extremity; and RUE, right upper extremity.

ᵃ If a lesion crosses body sites, score each domain for each of the body sites.

ᵇ For linear lesions only.

ᶜ If a patient has more than 1 lesion in a body site, the most active lesion should be chosen to score items; the body surface area (BSA) of all active lesions should be taken into account.

During a 1-day in-person meeting on March 6, 2020, at Children’s Wisconsin, 8 investigators (M.T.G.-R., M.T., A.S.P., E.K., L.A., K.A.W., S.R.H., and Y.E.C.) evaluated the tool in collaboration with 14 patients with pediatric morphea who were recruited from a referral center (Medical College of Wisconsin). Investigators were trained to score clinical features included in the tool using photographs and definitions shown in eAppendix 2 in Supplement 1. Demographic and clinical data were collected (data on race and ethnicity categories were collected but not analyzed because skin color [assessed with the NIS Skin Color Scale19] was more relevant for the purposes of this study). Evaluators entered each examination room individually, without access to their peers’ evaluations, and were given a maximum of 6 minutes per room. Two rounds of testing were conducted 3.5 hours apart; investigators did not have their first evaluations with them at the time the second evaluation was conducted. In addition to the MAM, investigators used the modified LoSSI and the Physician Global Assessment of Activity (PGA-A). The modified LoSSI evaluates 18 body areas for the presence of 3 clinical features: erythema (score of 0-3), thickness and induration in lesions (score of 0-3), and the presence of new or enlarged lesions within the last month (score of 0 or 3). The PGA-A is a 100-mm visual analog scale ranging from 0 to 100, with 0 indicating inactive and 100 indicating markedly active.
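As a rough illustration of the comparator score, the modified LoSSI described above can be tallied per body area and summed. This is a minimal sketch that assumes the total is a simple sum across areas; `SiteScore` and `mlossi_total` are hypothetical names, and the example values are not study data.

```python
from dataclasses import dataclass


@dataclass
class SiteScore:
    """Modified LoSSI features for one of up to 18 body areas."""
    erythema: int          # graded 0-3
    thickness: int         # thickness and induration, graded 0-3
    new_or_enlarged: bool  # new or enlarged lesion in the last month (0 or 3)

    def total(self) -> int:
        for value in (self.erythema, self.thickness):
            if not 0 <= value <= 3:
                raise ValueError("erythema and thickness must be 0-3")
        return self.erythema + self.thickness + (3 if self.new_or_enlarged else 0)


def mlossi_total(sites):
    """Sum site scores (assumption: the total is a plain sum over areas)."""
    if len(sites) > 18:
        raise ValueError("the modified LoSSI covers at most 18 body areas")
    return sum(site.total() for site in sites)


# Hypothetical patient with two affected areas.
print(mlossi_total([SiteScore(2, 1, True), SiteScore(1, 0, False)]))
```

The binary 0 or 3 weighting of new or enlarged lesions is visible in `total`: a single new lesion contributes as much as maximal erythema or maximal thickness.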

Statistical Analysis

Validity of the MAM was evaluated for both content and convergent construct validity. Content validity was assessed using both the content validity index (CVI) and a modified κ statistic20 to adjust for chance agreement with results obtained from expert consensus. The item-level CVI score was calculated by dividing the number of experts who ranked each variable as moderately to extremely important by the total number of experts, excluding the coinvestigators (M.T.G.-R., M.T., E.P., H.A.B.-B., and Y.E.C.) who developed the questions. The scale-level CVI score was obtained by summing item-level CVI scores and dividing those scores by the number of variables. An item-level CVI score of 0.75 or higher and a scale-level CVI score of 0.90 or higher were considered evidence of excellent content validity.21 With regard to κ statistics, values greater than 0.74 were considered excellent, values from 0.60 to 0.74 were considered good, and values from 0.40 to 0.59 were considered fair.21,22,23 Because there is no gold standard for morphea disease activity, convergent construct validity was investigated through correlation of the MAM score with scores from the modified LoSSI and the PGA-A.24 We used the Spearman ρ coefficient to measure correlation, with coefficients greater than 0.8 considered very strong correlation, coefficients from 0.6 to 0.8 considered moderately strong correlation, coefficients from 0.3 to less than 0.6 considered fair correlation, and coefficients less than 0.3 considered poor correlation.22
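The content validity calculations can be sketched as follows. The item-level and scale-level CVI follow the definitions in the text. The modified κ formula is an assumption: it uses a commonly cited formulation in which the probability of chance agreement is binomial with p = 0.5, which may differ in detail from the cited method. All function names are hypothetical.

```python
from math import comb


def item_cvi(n_relevant: int, n_experts: int) -> float:
    """Proportion of experts rating the item as relevant."""
    return n_relevant / n_experts


def modified_kappa(n_relevant: int, n_experts: int) -> float:
    """Item-level CVI adjusted for chance agreement.

    Assumption: chance agreement is C(N, A) * 0.5**N, where N is the number
    of experts and A the number who rated the item as relevant.
    """
    p_chance = comb(n_experts, n_relevant) * 0.5 ** n_experts
    return (item_cvi(n_relevant, n_experts) - p_chance) / (1 - p_chance)


def scale_cvi(item_cvis) -> float:
    """Scale-level CVI: the mean of the item-level CVI scores."""
    return sum(item_cvis) / len(item_cvis)


# Example with 9 raters (14 respondents minus the 5 coinvestigators).
for endorsed in (5, 6, 9):
    print(endorsed, round(item_cvi(endorsed, 9), 2), round(modified_kappa(endorsed, 9), 2))
```

Under these assumptions, 5 and 6 endorsements out of 9 give κ values of about 0.41 and 0.60, which appears consistent with the fair and good values reported in Table 2.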

Internal consistency or degree of homogeneity across items was measured using the Cronbach α coefficient, with values from 0.7 to 0.9 reflecting good internal consistency of the scale without substantial redundancy of items.25,26,27,28 We assessed the impact of deleting each item on the total internal consistency and analyzed the interitem correlation, aiming for an ideal range of 0.15 to 0.50.27,28 Reliability was measured by analyzing the external consistency (interrater agreement) and reproducibility (intrarater agreement) of the tool. Statistical analysis of interrater and intrarater agreement was performed by calculating the intraclass correlation coefficient (ICC). MAM scores from the first round of evaluations were used to calculate interrater agreement; ICC values of less than 0.50 were interpreted as poor agreement, values from 0.50 to 0.75 as moderate agreement, values from 0.76 to 0.90 as good agreement, and values greater than 0.90 as excellent agreement.29 Viability of the instrument was studied qualitatively by interviewing the investigators who used the MAM tool, centering on experience and impressions regarding the use of the instrument.
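For readers unfamiliar with the Cronbach α coefficient, the standard formula can be sketched in a few lines. The study used IBM SPSS Statistics; this pure-Python version is only an illustration, the function name is hypothetical, and the example scores are invented.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach alpha for k items scored on the same n subjects.

    `items` is a list of k lists; items[j][i] is subject i's score on item j.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least 2 items are required")
    n = len(items[0])
    if any(len(column) != n for column in items):
        raise ValueError("every item needs a score for every subject")
    totals = [sum(column[i] for column in items) for i in range(n)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Invented scores: 2 items rated on 4 subjects.
print(cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]]))
```

Values approaching 1 indicate that items move together closely, which is why the text treats values above 0.9 as a sign of possible item redundancy rather than as an unqualified strength.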

Descriptive statistics were expressed as means with SDs or medians with ranges for continuous variables and as frequencies with percentages for categorical variables. Two-sided P ≤ .05 was considered statistically significant. All analyses were conducted using IBM SPSS Statistics, version 25.0 (IBM Corporation).

Results

Among 14 survey respondents included during the development phase, 9 (64.3%) were pediatric dermatologists and 5 (35.7%) were dermatologists (Table 1). In the first survey round, experts reached consensus on 8 items that were indicative of activity (new lesion in the past 3 months, enlarging lesion in the past 3 months, linear lesion developing progressive atrophy in the past 3 months, erythema, violaceous rim or color, warmth to the touch, white-yellow or waxy appearance, and shiny white wrinkling) and suggested the size of the entire lesion, rather than just the active portion, should be measured. After the second round, there was consensus by the experts that induration was a sign of morphea activity, that certain clinical characteristics should have more weight than others (new lesion, enlarging lesion, and progressive atrophy score of 0 vs 3), that warmth should be scored as 0 vs 1, and that all other items (erythema, violaceous rim or color, white-yellow or waxy appearance, and shiny white wrinkling) should have tiered scoring (0, 1, 2, and 3). Also after 2 rounds, consensus was reached on the final tool, consisting of 10 items (new lesion, enlarging lesion, progressive atrophy, erythema, violaceous rim or color, warmth, induration, white-yellow or waxy appearance, shiny white wrinkling, and BSA) to evaluate on each of 7 body sites (head and neck; chest, abdomen, and back; groin and buttocks; right upper extremity; right lower extremity; left upper extremity; and left lower extremity) (Figure 1). Representative photographs and definitions of each feature are shown in eAppendix 2 in Supplement 1.

Table 1. Dermatologist and Patient Characteristics.

Characteristic Participants, No. (%)
Dermatologists
Total No. 14
Specialty
Pediatric dermatology 9 (64.3)
Dermatology 5 (35.7)
Year clinical training was completed, median (range) 2009 (1983-2014)
Total patients with morphea/y
<10 0
10-29 8 (57.1)
30-49 3 (21.4)
≥50 3 (21.4)
Adult patients with morphea/y
<10 9 (64.3)
10-29 2 (14.3)
30-49 1 (7.1)
≥50 2 (14.3)
Pediatric patients with morphea/y
<10 4 (28.6)
10-29 8 (57.1)
30-49 0
≥50 2 (14.3)
Patients
Total No. 14
Age, median (range), y 14.5 (8.0-18.0)
Sex
Female 8 (57.1)
Male 6 (42.9)
Morphea subtype
Linear 5 (35.7)
Plaque 2 (14.3)
Generalized 2 (14.3)
Mixed 4 (28.6)
Pansclerotic 1 (7.1)
Site involvement
Head and neck 3 (21.4)
Chest, abdomen, and back 8 (57.1)
Groin and buttocks 4 (28.6)
Right upper extremity 3 (21.4)
Left upper extremity 2 (14.3)
Right lower extremity 8 (57.1)
Left lower extremity 6 (42.9)
Skin colorᵃ
Color 1 (lightest) 0
Color 2 5 (35.7)
Color 3 5 (35.7)
Color 4 1 (7.1)
Color 5 2 (14.3)
Color 6 1 (7.1)
Colors 7-10 (darkest) 0

ᵃ According to the NIS Skin Color Scale.19

The 10 clinical characteristics included in the MAM through expert consensus were ranked by 62.5% of experts as being very or extremely relevant, with 6 of the characteristics achieving item-level CVI scores of 0.75 or greater (new lesion [0.77], enlarging lesion [0.88], progressive atrophy [0.77], erythema [1.00], violaceous rim or color [1.00], and induration [1.00]) (Table 2). The scale-level CVI was 0.76, suggesting the MAM had good content validity. The modified κ statistic was excellent (>0.74) for 6 items (new lesion [0.75], enlarging lesion [0.88], progressive atrophy [0.75], erythema [1.00], violaceous rim or color [1.00], and induration [1.00]), good (0.60-0.74) for 2 items (shiny white wrinkling [0.60] and BSA [0.60]), and fair (0.40-0.59) for 2 items (warmth [0.41] and white-yellow or waxy appearance [0.41]).

Table 2. Item-Level Content Validity Index and Modified κ Statistic of Items That Achieved Consensus and Were Included in the Morphea Activity Measure.

| Item | Median Likert scale scoreᵃ | Consensus | Item-level CVIᵇ | Modified κ statisticᶜ |
|---|---|---|---|---|
| New lesion in 3-mo period | 9.0 | Agreement | 0.77 | 0.75 |
| Enlarging lesion in 3-mo period | 8.5 | Agreement | 0.88 | 0.88 |
| Developing progressive atrophy in 3-mo periodᵈ | 8.5 | Agreement | 0.77 | 0.75 |
| Skin on the lesion is erythematous | 8.0 | Agreement | 1.00 | 1.00 |
| Lesion has a violaceous rim or color | 9.0 | Agreement | 1.00 | 1.00 |
| Lesion feels warm to the touch | 7.0 | Agreement | 0.55 | 0.41 |
| Induration in a morphea lesion | 9.0 | Agreement | 1.00 | 1.00 |
| Lesion has a white-yellow or waxy appearance | 7.0 | Agreement | 0.55 | 0.41 |
| Lesion has a shiny white appearance | 7.0 | Agreement | 0.66 | 0.60 |
| Size of the entire lesion should be measured | 8.0 | Agreement | 0.66 | 0.60 |

Abbreviation: CVI, content validity index.

ᵃ Likert scale range, 0 to 10, with 0 indicating not important and 10 indicating very important. Agreement was defined as a median Likert score of 7.0 or higher, disagreement as a median score of 3.9 or lower, and no consensus as a median score of 4.0 to 6.9.

ᵇ An item-level CVI score of 0.75 or higher indicates excellent content validity.

ᶜ An item-level modified κ statistic greater than 0.74 indicates excellent validity; 0.60 to 0.74, good validity; and 0.40 to 0.59, fair validity.

ᵈ For linear lesions only.

The validation phase was conducted with 14 patients (median age, 14.5 years [range, 8.0-18.0 years]; 8 [57.1%] female) (Table 1), 2 dermatologists, and 6 pediatric dermatologists. The median MAM score on the first round of evaluation was 19.5 (range, 0-619.0), the median modified LoSSI score was 5.0 (range, 0-46.0), and the median PGA-A score was 30.5 (range, 0-100) (Table 3). The correlations between the MAM and the modified LoSSI (Spearman ρ = 0.747; P < .001) and the MAM and the PGA-A (Spearman ρ = 0.729; P < .001) were found to be moderately strong (Figure 2). Floor and ceiling effects were not found. The MAM had good internal consistency (Cronbach α = 0.72).
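The convergent validity analysis relies on the Spearman ρ coefficient, which is the Pearson correlation of rank-transformed scores. The sketch below is illustrative only: the function names are hypothetical, the paired scores are invented, and applying the interpretation bands to the absolute value of ρ is an assumption because the text defines bands only for positive coefficients.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tie_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tie_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def interpret_rho(rho):
    """Bands from the Statistical Analysis section, applied to |rho| (assumption)."""
    r = abs(rho)
    if r > 0.8:
        return "very strong"
    if r >= 0.6:
        return "moderately strong"
    if r >= 0.3:
        return "fair"
    return "poor"


# Invented paired scores for 6 patients on two instruments.
tool_a = [0, 12, 19, 25, 40, 60]
tool_b = [0, 3, 5, 4, 10, 9]
rho = spearman_rho(tool_a, tool_b)
print(round(rho, 3), interpret_rho(rho))
```

Because ρ depends only on ranks, it is unaffected by the very different score ranges of the instruments being compared.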

Figure 2. Correlation of the Morphea Activity Measure (MAM) With the Modified Localized Scleroderma Severity Index (mLoSSI) and the Physician Global Assessment of Activity (PGA-A).

Each dot represents an individual patient’s score on the MAM and the mLoSSI or the MAM and the PGA-A. The lines in the plot show the mathematical best fit to the data and provide an additional signal regarding the strength of the correlation between the 2 variables.

Interrater agreement for MAM total scores was good, with an ICC of 0.844 (95% CI, 0.681-0.942; P < .001) (Table 3). Three items (new lesion [ICC, 0.912; 95% CI, 0.819-0.967; P < .001], enlarging lesion [ICC, 0.943; 95% CI, 0.883-0.979; P < .001], and shiny white wrinkling [ICC, 0.915; 95% CI, 0.825-0.968; P < .001]) had excellent interrater agreement, 3 items (erythema [ICC, 0.824; 95% CI, 0.634-0.934; P < .001], white-yellow or waxy appearance [ICC, 0.875; 95% CI, 0.745-0.953; P < .001], and BSA [ICC, 0.889; 95% CI, 0.773-0.959; P < .001]) had good agreement, 3 items (violaceous rim or color [ICC, 0.532; 95% CI, 0.039-0.825; P = .02], warmth [ICC, 0.548; 95% CI, 0.074-0.831; P = .02], and induration [ICC, 0.675; 95% CI, 0.334-0.879; P = .001]) had moderate agreement, and 1 item (progressive atrophy [ICC, −0.077; 95% CI, −1.234 to 0.600; P = .53]) had poor agreement. Excluding progressive atrophy resulted in a slight improvement in ICC of 0.849 (95% CI, 0.690-0.943; P < .001).
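The grouping of items by agreement level follows directly from the ICC bands defined in the Statistical Analysis section. The sketch below applies those bands to the interrater values reported above. The function name is hypothetical, and treating values between 0.75 and 0.76 as good is an assumption, because the stated bands leave that interval undefined.

```python
def interpret_icc(icc: float) -> str:
    """Map an ICC to the agreement bands used in the text.

    Assumption: values above 0.75 and up to 0.90 count as good, which
    closes the gap between the stated 0.50-0.75 and 0.76-0.90 bands.
    """
    if icc > 0.90:
        return "excellent"
    if icc > 0.75:
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


# Interrater ICC values as reported in the text.
interrater_icc = {
    "new lesion": 0.912,
    "enlarging lesion": 0.943,
    "shiny white wrinkling": 0.915,
    "erythema": 0.824,
    "white-yellow or waxy appearance": 0.875,
    "BSA": 0.889,
    "violaceous rim or color": 0.532,
    "warmth": 0.548,
    "induration": 0.675,
    "progressive atrophy": -0.077,
    "MAM total": 0.844,
}

for item, icc in interrater_icc.items():
    print(f"{item}: {icc:.3f} -> {interpret_icc(icc)}")
```

The point estimate alone determines the band; the wide confidence intervals for some items (for example, violaceous rim or color) are not reflected in this classification.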

Table 3. Scores of All Items on the Morphea Activity Measure and Total Scores of the 3 Evaluated Tools.

| Variable | Score, mean (SD)ᵃ | Score, median (range)ᵃ | Interrater ICC (95% CI)ᵃ,ᵇ | Interrater P value | Intrarater ICC (95% CI)ᵇ | Intrarater P value |
|---|---|---|---|---|---|---|
| MAM items | | | | | | |
| New lesion | 1.34 (2.72) | 0 (0 to 15.0) | 0.912 (0.819 to 0.967) | <.001 | 0.890 (0.840 to 0.924) | <.001 |
| Enlarging lesion | 1.52 (2.93) | 0 (0 to 15.0) | 0.943 (0.883 to 0.979) | <.001 | 0.860 (0.797 to 0.904) | <.001 |
| Progressive atrophy | 0.11 (0.68) | 0 (0 to 6.0) | −0.077 (−1.234 to 0.600) | .53 | −0.016 (−0.476 to 0.301) | .53 |
| Erythema | 1.92 (1.79) | 1.0 (0 to 9.0) | 0.824 (0.634 to 0.934) | <.001 | 0.815 (0.732 to 0.873) | <.001 |
| Violaceous rim or color | 0.86 (1.32) | 0 (0 to 9.0) | 0.531 (0.039 to 0.825) | .02 | 0.548 (0.342 to 0.689) | <.001 |
| Warmth | 0.33 (0.79) | 0 (0 to 5.0) | 0.548 (0.074 to 0.831) | .02 | 0.688 (0.547 to 0.785) | <.001 |
| Induration | 1.24 (1.66) | 1.0 (0 to 11.0) | 0.675 (0.334 to 0.879) | .001 | 0.752 (0.640 to 0.830) | <.001 |
| White-yellow or waxy appearance | 1.09 (1.63) | 1.0 (0 to 10.0) | 0.875 (0.745 to 0.953) | <.001 | 0.852 (0.785 to 0.898) | <.001 |
| Shiny white wrinkling | 1.07 (1.99) | 0 (0 to 11.0) | 0.915 (0.825 to 0.968) | <.001 | 0.940 (0.913 to 0.959) | <.001 |
| BSA | 7.07 (10.13) | 4.0 (0 to 54.0) | 0.889 (0.773 to 0.959) | <.001 | 0.910 (0.869 to 0.938) | <.001 |
| Total scores | | | | | | |
| MAM | 42.44 (80.70) | 19.5 (0 to 619.0) | 0.844 (0.681 to 0.942) | <.001 | 0.856 (0.791 to 0.901) | <.001 |
| Modified LoSSI | 6.55 (7.28) | 5.0 (0 to 46.0) | NA | NA | NA | NA |
| PGA-A | 34.05 (25.91) | 30.5 (0 to 100) | NA | NA | NA | NA |

Abbreviations: BSA, body surface area; ICC, intraclass correlation coefficient; MAM, Morphea Activity Measure; LoSSI, Localized Scleroderma Severity Index; NA, not applicable; PGA-A, Physician Global Assessment of Activity.

ᵃ Calculated from round 1 of the testing.

ᵇ An ICC greater than 0.90 indicates excellent correlation; 0.76 to 0.90, good correlation; 0.50 to 0.75, moderate correlation; and less than 0.50, poor correlation.

Intrarater agreement for MAM total scores measured on the first round of evaluation was good, with an ICC of 0.856 (95% CI, 0.791-0.901; P < .001) (Table 3). Two items (shiny white wrinkling [ICC, 0.940; 95% CI, 0.913-0.959; P < .001] and BSA [ICC, 0.910; 95% CI, 0.869-0.938; P < .001]) had excellent intrarater agreement, 5 items (new lesion [ICC, 0.890; 95% CI, 0.840-0.924; P < .001], enlarging lesion [ICC, 0.860; 95% CI, 0.797-0.904; P < .001], erythema [ICC, 0.815; 95% CI, 0.732-0.873; P < .001], induration [ICC, 0.752; 95% CI, 0.640-0.830; P < .001], and white-yellow or waxy appearance [ICC, 0.852; 95% CI, 0.785-0.898; P < .001]) had good agreement, 2 items (warmth [ICC, 0.688; 95% CI, 0.547-0.785; P < .001] and violaceous rim or color [ICC, 0.548; 95% CI, 0.342-0.689; P < .001]) had moderate agreement, and 1 item (progressive atrophy [ICC, −0.016; 95% CI, −0.476 to 0.301; P = .53]) had poor agreement. Excluding progressive atrophy resulted in slight improvement in the ICC of 0.860 (95% CI, 0.797-0.904; P < .001). In qualitative interviews, evaluators agreed that the tool was easy to use, convenient because it measured morphea disease activity at a single point in time, and likely responsive to changes in morphea disease activity over multiple time points.

Discussion

This pilot diagnostic study found the MAM was a practical and simple tool that was useful for assessing activity of morphea in children and should be explored in further studies. Analysis revealed that the MAM had an appropriate level of internal consistency and content validity, with moderately strong correlations between the MAM and the modified LoSSI and PGA-A. We found the MAM had good interrater and intrarater agreement, which is a strong measure of reliability. Users agreed that the MAM was a viable tool.

Clinical scoring tools for morphea disease activity and severity have been developed previously, including the visual analog score, the DIET (dyspigmentation, induration, erythema, and telangiectasia) score, the modified Rodnan skin score, and the LoSCAT.13,14,30,31,32,33 The LoSCAT was developed predominantly by pediatric rheumatologists and has emerged as the most commonly used outcome measure. As originally developed, the LoSCAT consists of the modified LoSSI, which measures disease severity and activity (new or enlarging lesion, erythema, and skin thickness), and the LoSDI, which measures extent of damage (dermal atrophy, subcutaneous atrophy, and dyspigmentation). Studies have found that the LoSCAT has good interobserver and intraobserver reliability and sensitivity to change.15,16,34 In the experience of some investigators, the LoSCAT did not capture all features of disease activity, and the need for a different tool was identified (eAppendix 1 in Supplement 1). For example, the modified LoSSI weights new or enlarging lesion heavily compared with the other 2 features of erythema and skin thickness and does not include BSA as a factor. In addition, the modified LoSSI requires that lesions be new or enlarging within the past month; however, because morphea typically progresses slowly, there may not be much change in lesion size over a 1-month period.

Of the 10 clinical characteristics in the MAM, new lesion, enlarging lesion, erythema, and induration are shared with the LoSCAT, while the 6 other features (progressive atrophy, violaceous rim or color, warmth, white-yellow or waxy appearance, shiny white wrinkling, and BSA) are distinct. New lesion development and enlarging lesion were included as separate items because both new and enlarging lesions may signal more active disease. Both items had excellent content validity, excellent interrater agreement, and good intrarater agreement.

Epidermal, dermal, or subcutaneous tissue atrophy is generally considered to be a marker of damage in morphea. In fact, both dermal and subcutaneous atrophy are part of the clinical characteristics evaluated in the LoSDI, reflecting chronic irreversible tissue damage.14 However, our panel of experts agreed that progressive atrophy may be a sign of activity in deep morphea, such as Parry-Romberg syndrome, for cases in which other features of disease activity may be lacking. We included progressive atrophy only for linear lesions in the MAM because plaque and generalized morphea usually have other features that are also indicative of activity. Despite this item (progressive atrophy) having excellent content validity, it had poor interrater and intrarater agreement. Of note, because the testing was performed on 1 day, physicians were required to ask the patients about progressive atrophy and rely on their responses, which may explain the lower interrater and intrarater agreement. In a clinical practice setting, clinical photographs and repeat examinations could provide a more objective measure of progressive atrophy. Excluding progressive atrophy from the MAM did not result in substantial changes in interrater and intrarater agreement. This variable deserves further study in a longitudinal manner to assess responsiveness to change before making a decision regarding its inclusion in the MAM, either applied only to linear lesions or to other types of morphea.

Erythema is a hallmark of skin inflammation, and there are many objective tools to measure erythema, including the chromameter,35 erythema scales,36 and computer-assisted algorithms.37 In other studies,36,38 erythema has been reported to have moderate reliability, especially in the setting of darker skin, for which erythema is harder to distinguish. To overcome this issue, we trained evaluators to use the 0 to 3 scoring system when evaluating erythema in patients with different skin phototypes (eAppendix 2 in Supplement 1). With proper training, erythema had good interrater and intrarater reliability in our study and had excellent content validity.

Induration has long been considered a cardinal sign of activity in morphea because it is a product of excessive collagen deposition and dermal thickening. However, it is difficult to standardize what different degrees of induration mean. As part of the consensus method used to design the tool, evaluators were asked about different ways to define induration. There was consensus agreement about the following definitions, which were then used in the validation phase: mild induration was defined as an increase in the thickness of the skin but normal mobility, moderate induration as a decrease in the mobility of the skin but not fixed or bound down, and severe induration as the inability to move or pinch the skin (ie, it is fixed). These categories and definitions are similar to those used in the modified Rodnan skin score for systemic sclerosis, in which an index finger and a thumb or 2 thumbs are used to measure thickness.39 In the second round, agreement was reached regarding whether induration is a sign of activity in morphea. This item had moderate interrater agreement and good intrarater agreement.

White-yellow or waxy appearance, shiny white wrinkling, and violaceous rim have not been used previously in validated scales to evaluate morphea. The white-yellow or waxy appearance that is often at the center of a morphea lesion has been called sclerosis. While some argue that this feature is a sign of disease damage, it is often reversible with treatment, and consensus on including white-yellow or waxy appearance in the MAM was achieved, albeit with a lower item-level CVI score and κ value. This feature had good interrater and intrarater reliability. Shiny white wrinkling is a marker of epidermal involvement, as observed in morphea lesions with lichen sclerosus-like features. This item had excellent interrater and intrarater reliability. A violaceous rim may be associated with the intense inflammation of active disease and may correspond histologically to a lymphocytic infiltrate. This item had a high item-level CVI score, indicating excellent content validity, but moderate interrater and intrarater reliability, likely because it can be difficult to distinguish a violaceous rim from erythema.

Warmth is one of the cardinal signs of inflammation present in active morphea and has been extensively studied as a marker of activity, with varying results.10,40,41 Older lesions with severe atrophy, even if not active, may have a higher temperature than surrounding skin because of loss of subcutaneous fat, and it is not easy to estimate warmth. These difficulties are also encountered with thermography used to measure and monitor disease activity, so the utility of this item should be studied further. We found moderate interrater and intrarater agreement for warmth, and removing this item did not improve the ICC.

Deciding how to estimate the extent or surface area involved was challenging because lesions in morphea may have diffuse margins that can complicate this assessment, and there may be interevaluator disagreement in calculation of the percentage of BSA based on palm-sized estimates. In the first round of consensus, we agreed that the size of a lesion affects disease activity and that the size of an entire lesion should be measured, even if only part of the lesion is active. Furthermore, we decided to take the cumulative BSA of all active lesions into account and agreed that the size of new and/or active lesions was a more practical and relevant feature than the number of new lesions. We found that BSA had excellent intrarater agreement and good interrater agreement.
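The rule described above (count the whole lesion when any part of it is active, then sum across active lesions) can be sketched as follows. This is a hypothetical illustration only: the `Lesion` type, the field names, and the conversion of 1 patient palm to roughly 1% of BSA are assumptions based on a common clinical rule of thumb, not definitions taken from the MAM.

```python
from dataclasses import dataclass

# Assumption: one patient palm (including fingers) is about 1% of BSA.
PALM_PERCENT_BSA = 1.0


@dataclass
class Lesion:
    name: str
    palms: float  # size of the ENTIRE lesion, in patient palm units
    active: bool  # True if any part of the lesion shows activity


def cumulative_active_bsa(lesions):
    """Sum the percentage of BSA of every lesion that has any active part.

    The full lesion size is counted even when only part of it is active;
    inactive lesions contribute nothing.
    """
    return sum(l.palms * PALM_PERCENT_BSA for l in lesions if l.active)


# Hypothetical patient with three lesions.
lesions = [
    Lesion("left thigh", 2.0, True),
    Lesion("trunk", 1.5, False),
    Lesion("forehead", 0.5, True),
]
print(cumulative_active_bsa(lesions))  # 2.5
```

Summing area rather than counting lesions reflects the stated preference for lesion size over the number of new lesions.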

Limitations

This study has limitations. There is no gold standard for morphea activity against which to measure the MAM, and the MAM was only tested among patients with pediatric morphea. Because of travel restrictions at the start of the COVID-19 pandemic and other conflicts, some of the investigators who were part of the development phase could not attend the in-person meeting. Expert and patient numbers were small, although our numbers are comparable with those of other validation studies of outcome measures in rare diseases. Because measurements were performed on a single day, progressive atrophy was difficult to evaluate. Further studies are necessary to determine whether progressive atrophy, induration, violaceous rim, and warmth should be included in the MAM and to test the MAM in a longitudinal manner to assess responsiveness to change. In addition, a disease damage counterpart to the MAM is planned.

Conclusions

This pilot diagnostic study found that the MAM developed by experts via consensus methods was a reliable, valid, and viable tool to measure disease activity in pediatric morphea at a single point in time. The MAM has potential as an outcome measure for use in both clinical and research settings. Further study, including a larger validation study and longitudinal testing to measure responsiveness to change, is warranted.

Supplement 1.

eAppendix 1. Examples of Variations in MAM and Modified LoSSI Performance

eAppendix 2. Examples and Definitions of Items in the MAM

Supplement 2.

Data Sharing Statement

References

  • 1. Christen-Zaech S, Hakim MD, Afsar FS, Paller AS. Pediatric morphea (localized scleroderma): review of 136 patients. J Am Acad Dermatol. 2008;59(3):385-396. doi: 10.1016/j.jaad.2008.05.005
  • 2. Martini G, Fadanelli G, Agazzi A, Vittadello F, Meneghel A, Zulian F. Disease course and long-term outcome of juvenile localized scleroderma: experience from a single pediatric rheumatology centre and literature review. Autoimmun Rev. 2018;17(7):727-734. doi: 10.1016/j.autrev.2018.02.004
  • 3. O’Brien JC, Nymeyer H, Green A, Jacobe HT. Changes in disease activity and damage over time in patients with morphea. JAMA Dermatol. 2020;156(5):513-520. doi: 10.1001/jamadermatol.2020.0034
  • 4. Cruz-Diaz CN, Haemel AK. At the leading edge in morphea—new insights into disease course and management options. JAMA Dermatol. 2020;156(5):495-496. doi: 10.1001/jamadermatol.2020.0033
  • 5. Lis-Święty A, Skrzypek-Salamon A, Ranosz-Janicka I, Brzezińska-Wcisło L. Associations between disease activity/severity and damage and health-related quality of life in adult patients with localized scleroderma—a comparison of LoSCAT and visual analogue scales. J Clin Med. 2020;9(3):756. doi: 10.3390/jcm9030756
  • 6. O’Brien JC, Rainwater YB, Malviya N, et al. Transcriptional and cytokine profiles identify CXCL9 as a biomarker of disease activity in morphea. J Invest Dermatol. 2017;137(8):1663-1670. doi: 10.1016/j.jid.2017.04.008
  • 7. Wolska-Gawron K, Bartosińska J, Rusek M, Kowal M, Raczkiewicz D, Krasowska D. Circulating miRNA-181b-5p, miRNA-223-3p, miRNA-210-3p, let 7i-5p, miRNA-21-5p and miRNA-29a-3p in patients with localized scleroderma as potential biomarkers. Sci Rep. 2020;10(1):20218. doi: 10.1038/s41598-020-76995-2
  • 8. Ranosz-Janicka I, Lis-Święty A, Skrzypek-Salamon A, Brzezińska-Wcisło L. Detecting and quantifying activity/inflammation in localized scleroderma with thermal imaging. Skin Res Technol. 2019;25(2):118-123. doi: 10.1111/srt.12619
  • 9. Saad Magalhães C, de Albuquerque Pedrosa Fernandes T, Dias Fernandes T, de Lima Resende LA. A cross-sectional electromyography assessment in linear scleroderma patients. Pediatr Rheumatol Online J. 2014;12:27. doi: 10.1186/1546-0096-12-27
  • 10. Garcia-Romero MT, Randhawa HK, Laxer R, Pope E. The role of local temperature and other clinical characteristics of localized scleroderma as markers of disease activity. Int J Dermatol. 2017;56(1):63-67. doi: 10.1111/ijd.13452
  • 11. Lis-Święty A, Janicka I, Skrzypek-Salamon A, Brzezińska-Wcisło L. A systematic review of tools for determining activity of localized scleroderma in paediatric and adult patients. J Eur Acad Dermatol Venereol. 2017;31(1):30-37. doi: 10.1111/jdv.13790
  • 12. Arkachaisri T, Pino S. Localized scleroderma severity index and global assessments: a pilot study of outcome instruments. J Rheumatol. 2008;35(4):650-657.
  • 13. Arkachaisri T, Vilaiyuk S, Li S, et al.; Localized Scleroderma Clinical and Ultrasound Study Group. The localized scleroderma skin severity index and physician global assessment of disease activity: a work in progress toward development of localized scleroderma outcome measures. J Rheumatol. 2009;36(12):2819-2829. doi: 10.3899/jrheum.081284
  • 14. Arkachaisri T, Vilaiyuk S, Torok KS, Medsger TA Jr. Development and initial validation of the localized scleroderma skin damage index and physician global assessment of disease damage: a proof-of-concept study. Rheumatology (Oxford). 2010;49(2):373-381. doi: 10.1093/rheumatology/kep361
  • 15. Agazzi A, Fadanelli G, Vittadello F, Zulian F, Martini G. Reliability of LoSCAT score for activity and tissue damage assessment in a large cohort of patients with juvenile localized scleroderma. Pediatr Rheumatol Online J. 2018;16(1):37. doi: 10.1186/s12969-018-0254-9
  • 16. Kelsey CE, Torok KS. The Localized Scleroderma Cutaneous Assessment Tool: responsiveness to change in a pediatric clinical population. J Am Acad Dermatol. 2013;69(2):214-220. doi: 10.1016/j.jaad.2013.02.007
  • 17. Diamond IR, Grant RC, Feldman BM, et al. Defining consensus: a systematic review recommends methodologic criteria for reporting of Delphi studies. J Clin Epidemiol. 2014;67(4):401-409. doi: 10.1016/j.jclinepi.2013.12.002
  • 18. Trevelyan EG, Robinson N. Delphi methodology in health research: how to do it? Eur J Integr Med. 2015;7(4):423-428. doi: 10.1016/j.eujim.2015.07.002
  • 19. Massey DS, Martin JA. The NIS Skin Color Scale. 2003. Accessed August 3, 2018. https://nis.princeton.edu/downloads/NIS-Skin-Color-Scale.pdf
  • 20. Zamanzadeh V, Ghahramanian A, Rassouli M, Abbaszadeh A, Alavi-Majd H, Nikanfar AR. Design and implementation content validity study: development of an instrument for measuring patient-centered communication. J Caring Sci. 2015;4(2):165-178. doi: 10.15171/jcs.2015.017
  • 21. Polit DF, Beck CT, Owen SV. Is the CVI an acceptable indicator of content validity? appraisal and recommendations. Res Nurs Health. 2007;30(4):459-467. doi: 10.1002/nur.20199
  • 22. Chan YH. Biostatistics 104: correlational analysis. Singapore Med J. 2003;44(12):614-619.
  • 23. Carmines EG, Zeller RA. Reliability and Validity Assessment. Sage Publications; 1979. Quantitative Applications in the Social Sciences.
  • 24. Cicchetti DV, Sparrow SA. Developing criteria for establishing interrater reliability of specific items: applications to assessment of adaptive behavior. Am J Ment Defic. 1981;86(2):127-137.
  • 25. Mokkink LB, Prinsen CAC, Bouter LM, de Vet HCW, Terwee CB. The Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN) and how to select an outcome measurement instrument. Braz J Phys Ther. 2016;20(2):105-113. doi: 10.1590/bjpt-rbf.2014.0143
  • 26. de Vet HCW, Terwee CB, Mokkink LB, Knol DL. Measurement in Medicine. Cambridge University Press; 2011. Practical Guides to Biostatistics and Epidemiology.
  • 27. Phelan C, Wren J. Exploring reliability in academic assessment. Office of Academic Assessment, University of Northern Iowa. 2005-2006. Accessed May 5, 2022. https://chfasoa.uni.edu/reliabilityandvalidity.htm
  • 28. Glen S. Average inter-item correlation: definition, example. StatisticsHowTo.com. Accessed April 20, 2021. https://www.statisticshowto.com/average-inter-item-correlation
  • 29. Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155-163. doi: 10.1016/j.jcm.2016.02.012
  • 30. Pope E, Doria AS, Theriault M, Mohanta A, Laxer RM. Topical imiquimod 5% cream for pediatric plaque morphea: a prospective, multiple-baseline, open-label pilot study. Dermatology. 2011;223(4):363-369. doi: 10.1159/000335560
  • 31. Cunningham BB, Landells ID, Langman C, Sailer DE, Paller AS. Topical calcipotriene for morphea/linear scleroderma. J Am Acad Dermatol. 1998;39(2 Pt 1):211-215. doi: 10.1016/S0190-9622(98)70077-5
  • 32. Kroft EBM, Groeneveld TJ, Seyger MMB, de Jong EMGJ. Efficacy of topical tacrolimus 0.1% in active plaque morphea: randomized, double-blind, emollient-controlled pilot study. Am J Clin Dermatol. 2009;10(3):181-187. doi: 10.2165/00128071-200910030-00004
  • 33. Garcia-Romero MT, Laxer R, Pope E. Correlation of clinical tools to determine activity of localized scleroderma in paediatric patients. Br J Dermatol. 2016;174(2):408-410. doi: 10.1111/bjd.14001
  • 34. Teske NM, Jacobe HT. Using the Localized Scleroderma Cutaneous Assessment Tool (LoSCAT) to classify morphoea by severity and identify clinically significant change. Br J Dermatol. 2020;182(2):398-404. doi: 10.1111/bjd.18097
  • 35. Ahmad Fadzil MH, Ihtatho D, Mohd Affandi A, Hussein SH. Objective assessment of psoriasis erythema for PASI scoring. J Med Eng Technol. 2009;33(7):516-524. doi: 10.1080/07434610902744074
  • 36. Tan J, Liu H, Leyden JJ, Leoni MJ. Reliability of Clinician Erythema Assessment grading scale. J Am Acad Dermatol. 2014;71(4):760-763. doi: 10.1016/j.jaad.2014.05.044
  • 37. Frew J, Penzi L, Suarez-Farinas M, et al. The erythema Q-score, an imaging biomarker for redness in skin inflammation. Exp Dermatol. 2021;30(3):377-383. doi: 10.1111/exd.14224
  • 38. Zhao CY, Wijayanti A, Doria MC, et al. The reliability and validity of outcome measures for atopic dermatitis in patients with pigmented skin: a grey area. Int J Womens Dermatol. 2015;1(3):150-154. doi: 10.1016/j.ijwd.2015.05.002
  • 39. Khanna D, Furst DE, Clements PJ, et al. Standardization of the modified Rodnan skin score for use in clinical trials of systemic sclerosis. J Scleroderma Relat Disord. 2017;2(1):11-18. doi: 10.5301/jsrd.5000231
  • 40. Birdi N, Shore A, Rush P, Laxer RM, Silverman ED, Krafchik B. Childhood linear scleroderma: a possible role of thermography for evaluation. J Rheumatol. 1992;19(6):968-973.
  • 41. Martini G, Murray KJ, Howell KJ, et al. Juvenile-onset localized scleroderma activity detection by infrared thermography. Rheumatology (Oxford). 2002;41(10):1178-1182. doi: 10.1093/rheumatology/41.10.1178

Articles from JAMA Dermatology are provided here courtesy of American Medical Association
