Blood Adv. 2025 Nov 7;10(4):1145–1152. doi: 10.1182/bloodadvances.2025016648

Reproducibility and repeatability of the Myoton to quantify sclerotic chronic graft-versus-host disease

Nosha Farhadfar 1, Najla El Jurdi 2, Kelsey K Baker 3, Shramana Ghosh 4,5, Mongoljin Bat-Erdene 3, Heidi Chen 4,6, Ruchi Sahu 4,5, Rachel Weiss 4,5, Jerry Mi 4,5, Gabriela Desatnik 3, Lacey R Williams 3, Eric R Tkaczyk 4,5, Stephanie J Lee 3
PMCID: PMC12917516  PMID: 41191519

Key Points

  • The Myoton had excellent interobserver and good intraobserver agreement.

  • A bonus site selected for severe involvement had interobserver and intraobserver agreement similar to that of the standard anatomic sites.

Abstract

There is an urgent need for validated tools to measure sclerotic cutaneous chronic graft-versus-host disease (scGVHD). We examined the interobserver reproducibility within a session and intraobserver repeatability between sessions of the Myoton device for quantifying skin sclerosis in 36 adults with scGVHD. The Myoton was used to measure oscillation frequency and relaxation time of soft tissues at 7 bilateral sites (14 anatomic sites) by 2 study personnel at 2 study sessions. Agreement was measured using mean pairwise absolute difference (MPD), and reliability was measured using intraclass correlation coefficient (ICC). For each of the 2 Myoton parameters, the overall interobserver MPD was <5% of the average overall values and the interobserver ICC was >0.90 between the 2 observers, indicating excellent agreement and reliability within a measurement session. The median time between sessions 1 and 2 was 47.5 days. The overall normalized intraobserver MPD was <7% of the average overall values for each of the 2 Myoton parameters, reflecting good agreement between sessions. The intraobserver ICCs for the frequency and relaxation time parameters were 0.87 and 0.85, respectively, indicating good reliability between sessions. The reproducibility and repeatability of a bonus site selected at each study visit were similar to those of the standard 14 anatomic sites. However, no individual site was nearly as reproducible or repeatable as the overall Myoton measurements averaged across the patient. Our findings emphasize the utility of the Myoton for assessing skin properties in scGVHD with patient-level measurements.

Introduction

Chronic graft-versus-host disease (cGVHD) is a heterogeneous syndrome with a median time to diagnosis of 4 to 6 months after allogeneic hematopoietic cell transplantation (HCT).1 Skin is one of the organs most frequently affected by cGVHD. Cutaneous cGVHD encompasses 2 distinct phenotypes: epidermal/erythematous and sclerotic. Epidermal disease manifestations, including erythematous and lichen planus-like lesions, are often responsive to therapy. Sclerotic cutaneous cGVHD (scGVHD) represents a distinctive phenotype observed in ∼7% of patients with newly diagnosed cGVHD, with the cumulative incidence increasing to 20% at 3 years.2 scGVHD is characterized by inflammation and progressive fibrosis of the dermis and subcutaneous tissues, resembling morphea, systemic sclerosis, or eosinophilic fasciitis. scGVHD is often resistant to systemic immunosuppression and has been associated with significant functional disability and morbidity after HCT.3, 4, 5, 6

Due to the lack of quantitative methods to measure the extent, depth, and severity of sclerotic involvement in general practice, sclerotic skin changes are currently described in qualitative terms related to thickening, pliability, color, adherence to underlying tissues, and joint range of motion in the 2014 National Institutes of Health (NIH) consensus criteria.7 The current qualitative skin grading system is insufficiently sensitive to detect responses in established sclerosis. Therefore, the validation of novel tools for a more sensitive measurement of change in skin sclerosis was identified as an area of urgent need by the 2020 NIH Consensus Conference.8

The MyotonPRO (Myoton),9 a handheld myotonometer that measures soft tissue biomechanical parameters through a mechanical impulse, has been proposed as a noninvasive device for the quantitative assessment of scGVHD (Figure 1). Prior studies have shown the interobserver and intraobserver agreement of the Myoton in healthy individuals as well as interobserver agreement in 7 patients with scGVHD.10, 11, 12 Moreover, a preliminary study demonstrated the ability of the Myoton to objectively distinguish between 10 patients with severe scGVHD and 14 post-HCT patients without cGVHD.13

Figure 1. MyotonPRO measuring soft tissue biomechanical parameters in a patient with sclerotic cGVHD.

In this study of the Myoton for measuring skin biomechanics and quantifying skin sclerosis in patients with scGVHD, we determined both the agreement and reliability of interobserver reproducibility within a session and intraobserver repeatability between sessions weeks apart.

Methods

Study population

This single-center study included 36 adults (aged ≥18 years at HCT) with scGVHD after allogeneic HCT. All graft sources, donor sources, and conditioning regimens were included. The institutional review board of the Fred Hutchinson Cancer Center approved the study, and all participants provided written informed consent.

Study design

The Myoton was used to measure 5 biomechanical and viscoelastic properties of soft tissues involved with cGVHD: (1) natural oscillation frequency refers to the natural frequency at which the tissue oscillates, (2) dynamic stiffness refers to resistance to deformation, (3) logarithmic decrement is the ability to dissipate mechanical energy as the skin recovers from deformation, (4) mechanical relaxation time refers to the time to recover to original shape after a deforming impulse, and (5) creep is defined as a ratio of the time to recover shape to the time to reach maximum deformation. Myoton parameters are calculated from the damped oscillations of the skin following a quick mechanical impulse.9 This study is focused on the Myoton parameters of frequency and relaxation time, which have shown promise toward accurately capturing skin biomechanics (highest discriminatory value) in this patient population.9 Study measurements were performed by 6 study personnel (5 research coordinators and 1 physician) without prior experience in using the Myoton. A training manual developed by E.R.T.’s team for research personnel was utilized to ensure a uniform measurement protocol that yields reliable measurements.14 No training was provided other than the written manual.

Each participant underwent 2 study visits, sessions 1 and 2, separated by at least 7 days. Two study personnel independently performed Myoton measurements at each visit for each participant, as previously described.12 After 1 observer completed all measurements, she/he exited the room and the next observer entered. At each visit, measurements were obtained at 7 bilateral sites (14 anatomic sites) in the following order: dorsal forearm (extensor digitorum), volar forearm (brachioradialis), upper arm (biceps brachii), shoulder (deltoideus), chest (pectoralis major), abdomen (rectus abdominis), and shin (tibialis anterior). Rather than measuring both the anterior and posterior body surfaces, we used the supine-only Myoton measurement protocol because it was previously demonstrated to maintain the diagnostic accuracy of total body (anterior + posterior) measurements with reduced measurement time.15 The standard 12-mm disk attachment and 7-millisecond tap time (10 milliseconds for abdomen) for superficial soft tissue were used, as described previously.12 In addition to the 7 prespecified sites (14 anatomic sites when counted bilaterally), up to 3 additional “bonus sites” (6 when counted bilaterally) involved by scGVHD but not prespecified in the protocol were also measured for exploratory analysis to determine whether there was value in assessing the most clinically involved areas directly. Each observer conducted the entire measurement protocol during a measurement session (ie, a total of 14 readings were obtained by a single observer in a single measurement session for each patient), followed by the second observer conducting their assessment. Each anatomic site measurement was the average reading from the device from 5 successive impulses (ie, a device reading represents 5 mechanical impulses to the anatomic site). In the first 11 participants’ first session only, the measured locations were marked by the first observer for the second observer. Once the participant was positioned supine for evaluation, the estimated time for 1 observer to perform 1 set of 14 Myoton measurements was 10 minutes.

Definitions of reproducibility, repeatability, agreement, and reliability

Each of our 2 validations (reproducibility and repeatability) can be considered in 2 different ways (agreement or reliability), as thoroughly expounded upon by Bartlett and Frost.16 To navigate potentially variable terminologies, we quote the following definitions from the literature:

  1. Reproducibility is “the ability of different observers to come up with a same measurement” and reflects errors attributable to between-observer differences.17(p318) Because some definitions of reproducibility are more general, we specify “interobserver reproducibility” throughout this article for clarity.

  2. Repeatability is “the ability of a same observer to come up with a same (similar) result on a second measurement performed on the same sample” and measures errors intrinsic to a single observer.17(p318) This is often referred to as test-retest when the same measurement tool is evaluated at different times,18 assuming these time points are sufficiently close to ensure constancy of the true underlying value.16 For clarity, we use the term “intraobserver repeatability.”

  3. “Agreement quantifies how close 2 measurements made on the same subject are and is measured on the same scale as the measurements themselves. Agreement between measurements is a characteristic of the measurement method(s) involved, which does not depend on the population in which measurements are made.”16(p467)

  4. Reliability is “the ability of a measurement to differentiate among subjects or objects. Reliability relates the magnitude of the measurement error in observed measurements to the inherent variability in the ‘error-free,’ ‘true,’ or underlying level of the quantity.”19(p663) Therefore, it is not an intrinsic property of the evaluated method.17

Statistical analysis

Descriptive statistics were used to summarize the study population characteristics. Categorical variables were summarized with counts and percentages, whereas median and range were used for continuous variables. For each participant, each observer’s measurements were averaged across all 14 anatomic sites to obtain an “overall” patient measurement. For interobserver reproducibility, this overall measurement was compared between observers for session 1. If either observer missed any of the 14 sites in session 1, their overall score could not be calculated, and the session 2 data were used instead for the analysis of interobserver agreement. Intraobserver repeatability was calculated between the session 1 and 2 measurements conducted by the same observer at least 7 days apart. Sclerotic cGVHD changes slowly, and exploratory analysis revealed little effect of longer times between the first and second assessments.
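The patient-level averaging and the session 2 fallback described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' analysis code (which was written in R): the function names and the use of None to mark a missed site are hypothetical.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple

N_SITES = 14  # 7 bilateral sites in the supine-only protocol

def overall_score(site_readings: Sequence[Optional[float]]) -> Optional[float]:
    """Average one observer's readings across all 14 sites.

    Returns None when any site is missing (assumed here to be encoded as None),
    because the overall score cannot be calculated in that case.
    """
    if len(site_readings) != N_SITES or any(r is None for r in site_readings):
        return None
    return mean(site_readings)

def interobserver_pair(session1, session2) -> Optional[Tuple[float, float]]:
    """Pick the (observer 1, observer 2) overall scores for interobserver analysis.

    Each session is a pair of site-reading lists, one per observer. Session 1 is
    used when both observers have complete data; otherwise session 2 is tried.
    """
    for session in (session1, session2):
        scores = [overall_score(readings) for readings in session]
        if all(score is not None for score in scores):
            return scores[0], scores[1]
    return None  # no session with complete data from both observers
```

For example, if the second observer missed one site in session 1, `interobserver_pair` falls back to the two overall scores from session 2.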

To assess agreement, we used the mean pairwise absolute difference (MPD) U-statistic of the overall value as well as for each measured site to calculate the mean disagreement between the 2 measurements. The 2 measurements are either by the 2 observers in the case of interobserver reproducibility or at the 2 time points in the case of intraobserver repeatability. The MPD for each site (or the “overall” value) was reported both in raw units as well as normalized to the site’s average measurement across 2 observers and all participants. We estimated the 95% confidence interval of the MPD via Liu’s open-source bootstrapping algorithm.20 A high MPD reflected inconsistency or variability within the data, whereas a low value indicated agreement between observations. We were unable to find standards for the levels of acceptable agreement in skin biomechanical measurements in the literature. Therefore, based on our group’s previous experience with the Myoton, we considered a normalized MPD of <5% to reflect excellent agreement, <10% to be good agreement, <20% to be moderate agreement, <30% as slight/fair agreement, and >30% to be poor agreement.
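As a rough illustration of the agreement metric, the sketch below computes the MPD for paired measurements (2 observers or 2 sessions, 1 pair per patient), normalizes it to the average of all measurements, and applies the thresholds listed above. It is a simplified assumption-based example: the function names and sample values are hypothetical, the bootstrap confidence interval is omitted, and a value of exactly 30% (which the stated thresholds leave unassigned) is treated as poor.

```python
from statistics import mean

def mpd(first, second):
    """Mean absolute difference between paired measurements (one pair per patient)."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("need equal-length, non-empty paired measurements")
    return mean(abs(a - b) for a, b in zip(first, second))

def normalized_mpd(first, second):
    """MPD expressed as a percentage of the average of all measurements."""
    grand_mean = mean(list(first) + list(second))
    return 100.0 * mpd(first, second) / grand_mean

def agreement_label(percent):
    """Map a normalized MPD (%) to the agreement categories used in this study."""
    if percent < 5:
        return "excellent"
    if percent < 10:
        return "good"
    if percent < 20:
        return "moderate"
    if percent < 30:
        return "slight/fair"
    return "poor"  # assumption: exactly 30% is grouped with >30%

# Hypothetical overall frequency values (Hz) for 4 patients
observer_1 = [20.0, 18.5, 22.0, 19.5]
observer_2 = [20.5, 18.0, 21.0, 20.5]
print(mpd(observer_1, observer_2))                              # 0.75
print(normalized_mpd(observer_1, observer_2))                   # 3.75
print(agreement_label(normalized_mpd(observer_1, observer_2)))  # excellent
```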

To assess reliability, we used a linear mixed model to calculate single-measure, absolute-agreement intraclass correlation coefficients (ICCs) together with 95% confidence intervals. ICC results were interpreted according to the Koo and Li criteria: poor reliability (<0.5), moderate reliability (0.5-0.75), good reliability (0.75-0.9), and excellent reliability (>0.90).21 Data analysis was performed in R version 4.4.1 using the nlme package and Liu and Harrell’s observerVariability functions.20
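To make the reliability metric concrete, the sketch below computes a single-measure, absolute-agreement ICC from two-way ANOVA mean squares. This is a textbook formulation swapped in for illustration; the study itself fitted a linear mixed model in R, and the two approaches need not give identical estimates. The function names are hypothetical, and the handling of values that fall exactly on a threshold is an assumption.

```python
def icc_absolute_single(ratings):
    """Single-measure, absolute-agreement ICC from two-way ANOVA mean squares.

    ratings: list of rows, one row per patient, one column per observer/session.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >=2 patients and >=2 columns")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between observers
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

def reliability_label(icc):
    """Map an ICC to the reliability categories listed above."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"
```

Because this is an absolute-agreement ICC, a constant offset between observers lowers the value even when the patients are ranked identically: for the hypothetical table `[[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]` the function returns 2/3, whereas identical columns give 1.0.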

Results

Demographics

A total of 36 patients with scGVHD were included in the analysis. Demographics and transplant characteristics of the study population are shown in Table 1. Most patients were male (55.6%), White (88.9%), and received matched unrelated donor (63.9%) HCT. The median age at transplant was 56.0 years (range, 22.0-72.0). The median time from HCT to enrollment was 6.7 years (range, 1.7-18.5), with a median time from scGVHD diagnosis to study enrollment of 3.1 years (range, 0.1-16.1). Patients were heavily pretreated with a median of 3 prior lines of therapy for scGVHD (range, 1-11).

Table 1.

Demographics and transplant characteristics of the study population (N = 36)

Characteristic n (%)
Age at transplant, median (range), y 56.0 (22.0-72.0)
Sex
 Male 20 (55.6)
 Female 16 (44.4)
Race
 Black 1 (2.8)
 American Indian/Alaska Native 1 (2.8)
 Native Hawaiian 1 (2.8)
 White 32 (88.9)
 Other 1 (2.8)
Ethnicity
 Hispanic 2 (5.6)
 Not Hispanic 34 (94.4)
Primary diagnosis
 AML 7 (19.4)
 ALL 8 (22.2)
 MDS 8 (22.2)
 MPN 9 (25.0)
 Others 4 (11.1)
Source, PBSC 36 (100)
Donor match
 HLA-identical sibling 9 (25.0)
 Haplo-identical 1 (2.8)
 MMUD 3 (8.3)
 MUD 23 (63.9)
Conditioning regimen intensity
 MAC 17 (47.2)
 RIC 19 (52.8)
Conditioning TBI dose
 No TBI 11 (30.5)
 ≤450 cGy 16 (44.4)
 >450 cGy 9 (25.0)
Transplant to enrollment, median (range), y 6.7 (1.7-18.5)
cGVHD to enrollment, median (range), y 4.4 (0.1-16.1)
NIH global score at baseline
 Moderate 7 (19.4)
 Severe 29 (80.6)
NIH skin score at baseline
 Score 2 2 (5.5)
 Score 3 34 (94.5)
Skin features, deep sclerotic features 34 (94.5)
Sclerosis to enrollment, median (range), y 3.1 (0.1-16.1)
No. of lines of prior therapy for sclerosis, median (range) 3.0 (1.0-11.0)

ALL, acute lymphoblastic leukemia; AML, acute myeloid leukemia; MAC, myeloablative conditioning; MDS, myelodysplastic syndrome; MMUD, mismatched unrelated donor; MPN, myeloproliferative neoplasm; MUD, matched unrelated donor; PBSC, peripheral blood stem cell; RIC, reduced intensity conditioning; TBI, total body irradiation.

Others include hemophagocytic lymphohistiocytosis, lymphoma, multiple myeloma, and mixed phenotypic acute leukemia.

Interobserver reproducibility of a single measurement session

Approximately 83% of the patients (30/36) completed the Myoton assessments with usable data in session 1 by the 2 observers. For the 6 remaining patients, session 2 data were used for interobserver analysis. At the patient level, the overall interobserver MPD (Table 2) was <5% of the average overall values for each of the 2 Myoton parameters, indicating excellent interobserver agreement. The interobserver ICCs for both the frequency and relaxation time parameters were >0.90, indicating excellent interobserver reliability. By comparing the first 11 measurement sessions (with sites marked between observers 1 and 2) to the remaining (unmarked) data, we found that marking the area of interest did not have a significant effect on interobserver reproducibility.

Table 2.

Interobserver reproducibility across 2 observers at the patient level

Study population (N = 36)
Value | Frequency, Hz | Relaxation time, ms
Observer 1 supine average (IQR) | 20.14 (18.37-21.21) | 19.04 (14.80-26.20)
Observer 2 supine average (IQR) | 20.11 (18.47-20.95) | 19.04 (15.65-27.70)
Observers 1 and 2 supine average (SD) | 20.12 (2.92) | 19.04 (2.88)
Overall absolute MPD between observers (95% CI) | 0.76 (0.55-0.98) | 0.58 (0.43-0.74)
Overall normalized MPD, % (95% CI) | 3.77 (2.74-4.88) | 3.02 (2.28-3.86)
ICC (95% CI) | 0.94 (0.90-0.98) | 0.97 (0.95-0.99)

CI, confidence interval; IQR, interquartile range; SD, standard deviation.

A total of 72 observations (36 for observer 1 and 36 for observer 2).

In a site-by-site analysis (supplemental Tables 1 and 2), the MPD for frequency was lowest for the abdomen (best agreement) and highest for the dorsal forearm (worst agreement). The MPD for relaxation time was lowest in the shin (best agreement) and highest in the shoulder (worst agreement). MPDs seldom exceeded 10% when the left-right average was normalized, even in the worst cases, indicating moderate to good site-level interobserver agreement. In terms of reliability, the site-level interobserver ICCs of the individual sites varied from fair to excellent.

The first bonus site measured was most often the thigh (12/30 patients with a bonus site) followed by the calf (4/30 patients). Because few patients had additional bonus sites selected, we restricted the analysis only to the first bonus site. The absolute and normalized MPDs for both parameters (Table 3) of the bonus site fell well within and even toward the lower end of the range of individual anatomic sites from the standard protocol (supplemental Tables 1 and 2). For bonus site 1, the normalized interobserver MPD for frequency was 7.4% and for relaxation time was 11.7%, indicating good to moderate agreement. Furthermore, the interobserver ICC for both frequency and relaxation time parameters were >0.90 for bonus site 1, indicating excellent interobserver reliability.

Table 3.

Interobserver reproducibility across 2 observers for bonus site 1

Study population (n = 26)
Value | Frequency, Hz | Relaxation time, ms
Observer 1 average (IQR) | 21.89 (14.65-25.88) | 18.80 (13.00-25.18)
Observer 2 average (IQR) | 22.26 (14.94-27.50) | 17.97 (13.30-21.94)
Observers 1 and 2 supine average (SD) | 22.08 (8.96) | 18.39 (7.68)
Absolute MPD between observers (95% CI) | 1.64 (1.12-2.16) | 2.15 (1.32-2.99)
Normalized MPD, % (95% CI) | 7.43 (4.96-10.23) | 11.69 (6.75-17.60)
ICC (95% CI) | 0.97 (0.95-0.99) | 0.91 (0.85-0.97)

CI, confidence interval; IQR, interquartile range; SD, standard deviation.

A total of 52 observations (26 for observer 1 and 26 for observer 2).

Intraobserver repeatability between sessions (test-retest)

Intraobserver (test-retest) repeatability between sessions, as shown in Table 4, could be calculated for 40 observations in 22 patients, excluding cases where there was only 1 complete session or where different observers obtained measurements at session 2. The median time between the 2 measurement sessions was 47.5 days (range, 14-203). At the patient level, the overall intraobserver (test-retest) MPD was <10% of the average overall values for each of the 2 Myoton parameters, indicating good agreement. The ICC for frequency and relaxation time parameters were 0.87 and 0.85, respectively, indicating good intraobserver (test-retest) reliability.

Table 4.

Intraobserver repeatability across 2 sessions (test-retest) at the patient level

Observations (n = 40); patients (n = 22)
Value | Frequency, Hz | Relaxation time, ms
Session 1 supine average (IQR) | 20.28 (18.22-22.15) | 18.58 (17.08-20.66)
Session 2 supine average (IQR) | 20.25 (18.12-21.02) | 18.58 (17.02-20.66)
Sessions 1 and 2 supine average (SD) | 20.27 (3.32) | 18.74 (3.10)
Overall absolute MPD between sessions (95% CI) | 1.18 (0.86-1.59) | 1.29 (0.99-1.64)
Overall normalized MPD, % (95% CI) | 5.83 (4.24-7.86) | 6.87 (5.31-8.73)
ICC (95% CI) | 0.87 (0.80-0.95) | 0.85 (0.77-0.94)

CI, confidence interval; IQR, interquartile range; SD, standard deviation.

A total of 80 observations (n = 40 for session 1 and n = 40 for session 2).

In the site-by-site analysis of the repeatability between sessions, the normalized MPD for frequency ranged from ∼10% (good to moderate) for the upper arm to >20% (fair) for the chest (supplemental Table 3). For bonus site 1 (most commonly on the thigh), the normalized MPD between sessions for frequency was 17%, reflecting moderate intraobserver (test-retest) repeatability. The ICC for frequency at bonus site 1 was 0.88 (Table 5). Relaxation time had highly similar findings to frequency in terms of both agreement and reliability (supplemental Table 4), with the upper arm as one of the most repeatable sites, the chest as the least repeatable, and bonus site 1 falling in between with moderate to fair intraobserver (test-retest) repeatability.

Table 5.

Intraobserver repeatability across 2 sessions (test-retest) for bonus site 1

Observations (n = 30); patients (n = 17)
Value | Frequency, Hz | Relaxation time, ms
Session 1 average (IQR) | 21.57 (15.00-23.04) | 18.12 (13.97-22.24)
Session 2 average (IQR) | 21.08 (14.78-21.71) | 18.78 (16.20-24.40)
Sessions 1 and 2 average (SD) | 21.32 (9.19) | 18.45 (7.07)
Absolute MPD between sessions (95% CI) | 3.51 (2.54-4.49) | 3.99 (2.97-5.02)
Normalized MPD, % (95% CI) | 16.46 (12.51-21.63) | 21.63 (16.51-26.63)
ICC (95% CI) | 0.88 (0.81-0.96) | 0.77 (0.62-0.91)

CI, confidence interval; IQR, interquartile range; SD, standard deviation.

A total of 60 observations (n = 30 for session 1 and n = 30 for session 2).

In summary, similar to that for interobserver reproducibility, the intraobserver repeatability for bonus site 1 fell within the range of the standard individual anatomic sites.

Discussion

This study is among the first to determine measurement error of the Myoton to evaluate biomechanical parameters of scGVHD. We observed excellent reproducibility between 2 observers shown by normalized overall MPD of <5% and ICC >0.9 for both frequency and relaxation time. Furthermore, the Myoton demonstrated good intraobserver repeatability across 2 sessions more than a week apart, with a normalized overall MPD of ∼7% and ICC level exceeding 0.8.

As an exploratory end point, 1 additional site (bonus site) involved by scGVHD but not prespecified in the protocol was analyzed to determine whether there was value in assessing the most clinically involved areas directly as a potential “sentinel site.” When assessed at an individual body site level, the interobserver normalized MPD for frequency and relaxation time at bonus sites fell within the range of MPD for the 14 standard anatomic sites. However, because of the averaging out of measurement error, overall patient-level measurements had far better reproducibility and repeatability than any individual standard anatomic or bonus site. The use of different bonus sites between patients limits intersubject comparisons, because normal soft tissue biomechanics vary between body sites. Because the bonus site was as reproducible and repeatable as typical validated sites, future clinical measurement protocols will likely retain a high level of agreement and reliability if they choose to prespecify a larger number of anatomic sites to consistently measure than those in previously published Myoton studies.

For future work, we recommend the use of a supine-only protocol including 14 prespecified anatomic sites to measure the 2 skin parameters, frequency and relaxation time. The MPDs reported here could be used to inform whether changes in a participant’s measurements fall within the expected range of interobserver and intraobserver agreement. This would enable investigators to determine if a change in measurements is due to measurement error or represents a true change in the biomechanical properties of the skin. Having achieved satisfactory reproducibility and repeatability, our next study will attempt to establish a minimal clinically meaningful difference by correlating changes in the Myoton readings with patient- and clinician-reported perceptions of change. Generally, a change of 0.5 standard deviations approximates a clinically meaningful difference. For example, per the observer 1 and 2 measurements, this half a standard deviation would be 1.46 Hz and 1.44 milliseconds for a patient’s overall frequency and relaxation measurements, respectively (Table 2), which are much larger changes than the corresponding typical measurement errors of 0.76 Hz and 0.58 milliseconds as estimated by MPD. A further prospective longitudinal study is underway to assess the Myoton’s ability to monitor disease progression and therapeutic response.
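The half-standard-deviation arithmetic in the preceding paragraph can be reproduced with a few lines of Python. The values come from Table 2; the dictionary layout and function names are hypothetical, and using the MPD as a simple cutoff for "beyond measurement error" is a simplification for illustration rather than a validated decision rule.

```python
# Patient-level values from Table 2 (interobserver analysis)
TABLE_2 = {
    "frequency_hz": {"sd": 2.92, "mpd": 0.76},
    "relaxation_ms": {"sd": 2.88, "mpd": 0.58},
}

def half_sd(sd):
    """Half a standard deviation, a common rule of thumb for a meaningful change."""
    return 0.5 * sd

def exceeds_typical_error(change, mpd):
    """True when an observed change is larger than the typical disagreement (MPD)."""
    return abs(change) > mpd

for name, values in TABLE_2.items():
    threshold = half_sd(values["sd"])
    ratio = threshold / values["mpd"]
    print(f"{name}: half SD = {threshold:.2f}, MPD = {values['mpd']:.2f}, "
          f"ratio = {ratio:.1f}")
```

Under these numbers, the half-SD change is roughly 2 to 2.5 times the interobserver MPD for both parameters.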

The NIH consensus scoring system, which is based on the subjective assessment of clinical features and estimated involved body surface area (BSA), currently remains the only formally validated metric for the assessment of skin cGVHD. The thresholds for defining a change in clinical features beyond measurement error are 18% to 26% BSA for movable sclerosis and 17% to 21% BSA for nonmovable sclerosis, reflecting the need for more sensitive measurement tools.17,18 The ICC score for measuring BSA involved by cGVHD using the NIH skin score is typically in the range of 0.2 to 0.8 depending on clinician expertise, suggesting poor to moderate interobserver agreement.22, 23, 24, 25 The assessment of sclerosis based on the NIH skin score is challenged by the need to integrate visual and tactile assessments and accompanying difficulty determining the geographical extent. In a study assessing the interobserver reliability of the NIH skin score between cGVHD experts and a panel of clinicians after a brief training session (2.5 hours), the ICC score was <0.40 for the evaluation of the percentage of BSA involved with movable sclerotic features (ICC score, 0.24) suggesting poor reliability.24 Given that the NIH skin score relies on variable clinical judgment, adding the Myoton measurements to the NIH skin score can enhance reliability by providing objective, reproducible, and repeatable scGVHD skin assessments.

The modified Rodnan skin score (mRSS) is a widely used semiquantitative outcome measure for skin thickness in systemic sclerosis,26 with a reported ICC in the range of 0.64 to 0.68.26,27 This scale is not validated for scGVHD, and its use may be limited by its inability to measure lichen sclerosus-like changes, subcutaneous involvement without overlying skin thickening, or fascial involvement because it relies on the "pinchability" of the skin. Based on the results of our study, the Myoton measures had higher ICCs (ICC > 0.8) than those found with both the NIH skin score and the mRSS, reflecting good to excellent reliability for measuring scGVHD. Furthermore, unlike the mRSS, which assesses dermal thickness, the Myoton is able to assess biomechanical parameters that reflect changes in both the skin and underlying fascia. Moreover, the Myoton is a relatively easy-to-use device that can be operated by users with minimal training. As shown in our study, inexperienced users can achieve high agreement and reliability between each other by simply following a standardized protocol in an illustrated manual, without further training.

Several attempts have been made to identify quantifiable and reproducible measurements or imaging methods to evaluate the severity of skin and soft tissue sclerosis.28 One of the devices previously proposed as a potential tool for the quantitative assessment of scGVHD is the durometer. The durometer measures only the amplitude of deformation force that the operator places on the skin, as opposed to the Myoton that assesses more detailed aspects of the soft tissue’s biomechanical properties including frequency, stiffness, decrement, relaxation time, and creep. In prior head-to-head comparisons,10,12 the Myoton exhibited better interobserver agreement and reliability than the durometer in both healthy participants and patients with scGVHD. The subjects with high but different Myoton stiffness readings shared a maximum durometer reading, suggesting the inability of the durometer to differentiate between varying degrees of elevated stiffness. Both Myoton stiffness and durometer hardness are amplitude measurements, which can be affected by variations in contact between the probe and the skin as well as the amount of force applied by the observer. This may explain why Myoton frequency and relaxation time were shown to be the most reliable and diagnostic among the directly compared biomechanical parameters.15

Several imaging methods including 20-MHz ultrasound,29,30 magnetic resonance imaging,31,32 and optical coherence tomography33,34 have been studied as potential tools for diagnosing and monitoring scGVHD. However, these imaging modalities are currently not widely available or used in clinical practice for routine GVHD assessment due to limited clinical validation, a lack of standardized protocols, and the need for specialized expertise for the interpretation of the images. In contrast, the Myoton is a user-friendly, noninvasive, portable device with a straightforward interface that allows quick acquisition and interpretation without expert input, facilitating its adoption in routine clinical practice.

The limitations of this study include a small sample size and heterogeneous population. Moreover, we determined the reliability of the Myoton by measuring ICC scores, which are highly dependent on the variance of the assessed population. Higher ICC values may be obtained when applied to a more heterogeneous population such as patients with scGVHD compared with a more homogeneous one despite similar levels of agreement. To overcome this problem, we measured MPD in addition to the ICC score to assess both agreement and reliability. Finally, this study included mostly patients with severe scGVHD; therefore, these results may not be generalizable to patients with superficial skin involvement.
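The dependence of the ICC on population variance can be illustrated with the simple variance-components form of the ICC (between-subject variance divided by between-subject plus error variance). The sketch below is a generic illustration with hypothetical numbers, not an analysis of the study data.

```python
def icc_from_sds(between_subject_sd, error_sd):
    """ICC under a simple variance-components model:
    between-subject variance / (between-subject variance + error variance)."""
    between_var = between_subject_sd ** 2
    error_var = error_sd ** 2
    return between_var / (between_var + error_var)

# Same measurement error (SD = 1.0) in two hypothetical populations
homogeneous = icc_from_sds(between_subject_sd=1.0, error_sd=1.0)    # 0.5
heterogeneous = icc_from_sds(between_subject_sd=3.0, error_sd=1.0)  # 0.9
print(homogeneous, heterogeneous)
```

With identical measurement error, the more heterogeneous population yields the higher ICC, which is why an agreement measure such as the MPD is reported alongside it.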

Conflict-of-interest disclosure: N.F. has been an advisory board member for Incyte; received speaker fees from Incyte; is on the data safety monitoring committee for the Chronic Graft-versus-Host Disease Consortium; and is the medical monitor for the Blood and Marrow Transplant Clinical Trials Network. S.J.L. reports consulting fees from Mallinckrodt, Equillium, Kadmon, Novartis, Sanofi, and Incyte; research funding from AstraZeneca, Pfizer, Sanofi, and Syndax; drug supply from Janssen; being on the clinical trial steering committees for Incyte and Sanofi; and serving on the board of directors of the National Marrow Donor Program (uncompensated). The remaining authors declare no competing financial interests.

Acknowledgments

This work was supported by the Career Development Award IK2 CX001785 (E.R.T.) and US Department of Veterans Affairs Merit award I01 CX002721 (E.R.T.) from the US Department of Veterans Affairs Clinical Science Research & Development Service; and by grants R01 HL169944 (primary investigator [PI]: E.R.T.), R01 CA118953 (PI: S.J.L.), and U01 CA236229 (PI: S.J.L.) from the Fred Hutchinson Cancer Center, National Institutes of Health (NIH). This research was also supported, in part, by the Center for Cancer Research, National Cancer Institute, Intramural Research Program of the NIH.

The contributions of the NIH author(s) were made as part of their official duties as NIH federal employees, are in compliance with agency policy requirements, and are considered Works of the US government. However, the findings and conclusions presented in this article are those of the author(s) and do not necessarily reflect the views of the NIH or the US Department of Health and Human Services.

Authorship

Contribution: S.J.L., E.R.T., N.F., and N.E.J. contributed to the conception and design; S.G., M.B.-E., G.D., L.R.W., R.S., R.W., and J.M. contributed to the data collection, extraction, and verification; K.K.B. and H.C. contributed to the data analysis; all authors contributed to the interpretation; N.F., E.R.T., and S.J.L. contributed to the manuscript writing and draft preparation; and all authors helped revise the manuscript and gave final approval.

Footnotes

E.R.T. and S.J.L. are joint senior authors.

Deidentified individual participant data that underlie the reported results will be made available 3 months after publication for a period of 5 years. Proposals for access should be sent to the author, Stephanie J. Lee (sjlee@fredhutch.org).

The full-text version of this article contains a data supplement.

Supplementary Material

Supplemental Tables

Articles from Blood Advances are provided here courtesy of The American Society of Hematology
