Key Points
Question
Is the itch numeric rating scale an appropriate measurement tool for quantifying the extent and intensity of pruritus associated with prurigo nodularis?
Findings
This secondary analysis of a randomized clinical trial of 123 participants found that the Worst Itch Numeric Rating Scale and Average Itch Numeric Rating Scale are acceptable measures of pruritus for patients with prurigo nodularis with good evidence of validity and reliability. This was demonstrated by analyzing test-retest reliability, construct validity, sensitivity to change, and thresholds for meaningful change.
Meaning
The Worst Itch Numeric Rating Scale and Average Itch Numeric Rating Scale should be considered acceptable tools for assessing pruritus in future clinical trials of prurigo nodularis.
This randomized, double-blind study evaluates the psychometric properties of the itch numeric rating scale for measuring severity of pruritus associated with prurigo nodularis.
Abstract
Importance
There is an unmet need for psychometrically sound instruments to measure pruritus associated with prurigo nodularis (PN).
Objective
To evaluate the psychometric properties of the itch numeric rating scale (itch NRS), both the Worst Itch Numeric Rating Scale (WI-NRS) and the Average Itch Numeric Rating Scale (AI-NRS).
Design, Setting, and Participants
This secondary analysis is based on a secondary end point of a phase 2 randomized clinical trial of serlopitant for treatment of pruritus associated with PN. This randomized, double-blind, placebo-controlled study was conducted at 15 sites in Germany. Eligible patients were aged 18 to 80 years and had generalized PN for more than 6 weeks that was refractory to previous antipruritic therapies. Patients were required to have a visual analog scale itch score of 7 or higher at screening. Data were collected from July 2014 to June 2016 and analyzed from June 2016 to January 2017.
Main Outcomes and Measures
The itch NRS (AI-NRS and WI-NRS) was correlated with the following measures: the electronic verbal rating scale (eVRS) for itch self-categorization, average itch visual analog scale (AI-VAS), worst itch visual analog scale (WI-VAS), the pruritus-specific quality-of-life rating instrument ItchyQoL, Dermatology Life Quality Index (DLQI), and Prurigo Activity and Severity Score (items 7b and 7a: percentage healed prurigo lesions and percentage of prurigo lesions with excoriations).
Results
There were 123 participants in this study; the mean (SD) age of participants was 57.3 (11.58) years, and 58 (47.2%) were male. Strong associations (r ≥ 0.5) were observed between itch NRS items (WI-NRS and AI-NRS) and AI-VAS (24 hours) at weeks 2, 4, and 8 (r = 0.72-0.90; P < .001). Similar strong associations were also observed between itch NRS items and WI-VAS (24 hours) and eVRS for itch severity across weeks 2, 4, and 8 (r = 0.65-0.92; all P < .001). Strong correlations were seen between change scores for WI-NRS and WI-VAS and AI-VAS (r = 0.76 and 0.70, respectively; both P < .001). Similar findings were seen for AI-NRS, where correlations between change scores for WI-VAS and AI-VAS were 0.71 and 0.72, respectively (both P < .001). Analyses for the itch NRS items also showed that test-retest reliability was acceptable and provided evidence of acceptable convergent validity based on the eVRS and visit verbal rating score for itch self-categorization, ItchyQoL, and DLQI.
Conclusions and Relevance
Results from this secondary analysis show that the itch NRS items WI-NRS and AI-NRS have good psychometric properties for pruritus associated with PN and should be considered acceptable tools for assessing pruritus in future clinical trials of PN.
Trial Registration
ClinicalTrials.gov Identifier: NCT02196324
Introduction
Prurigo nodularis (PN), also called chronic nodular prurigo, is a pruritic disease characterized by papulonodular lesions that arise as a consequence of long-term pruritus and scratching.1,2,3 Pruritus is the dominant symptom, with patients reporting a median intensity of 8 on an 11-point numeric rating scale (NRS).4
Because pruritus is a subjective measure, the assessment of any antipruritic treatment effects relies on patients’ reports.3 The NRS for itch (itch NRS) is an instrument intended to measure its intensity3; however, it has not specifically been evaluated for pruritus associated with PN.
The primary objective of this analysis was to evaluate the psychometric properties of the itch NRS (Average Itch Numeric Rating Scale [AI-NRS] and Worst Itch Numeric Rating Scale [WI-NRS]), which was included as an end point in a phase 2 study of serlopitant (NCT02196324), an oral neurokinin 1 receptor antagonist, for the treatment of pruritus associated with PN.5
Methods
Design and Patients
The methods and results of the phase 2 study have been previously published.5 The final version of the trial protocol and the version of the trial protocol used for ethics committee approval are presented in Supplement 1 and Supplement 2, respectively. The phase 2 study, which included this analysis, was approved by an institutional review board. For this analysis, the patient-reported outcome (PRO)-evaluable population was defined as patients who received at least 1 dose of the study drug and had electronic diary itch NRS data at week 1. Other outcome measures are shown in eTable 1 in Supplement 3.
Statistics
All analyses were conducted using SAS, version 9.2 (SAS Institute). After performing descriptive statistics, the quality criteria of PRO tools (test-retest reliability, construct validity [convergent and discriminant validity], sensitivity to change, and thresholds for meaningful change) were analyzed. For more detailed descriptions of the statistics, see eMethods in Supplement 3.
Results
Patient Population
Baseline demographics and characteristics of the PRO-evaluable population (N = 123) are summarized in Table 1.
Table 1. Patient Demographics and Clinical Characteristics at Baseline.
Characteristic | PRO-evaluable population (N = 123) |
---|---|
Age, mean (SD), y | 57.3 (11.58) |
Sex, No. (%) | |
Male | 58 (47.2) |
Female | 65 (52.8) |
Duration of prurigo nodularis, No. (%) | |
≤5 y | 60 (48.8) |
>5 y | 63 (51.2) |
AI-NRS, mean (SD)a,b | 6.5 (1.83) |
WI-NRS, mean (SD)a,b | 7.3 (1.67) |
AI-VAS, mean (SD)c | 7.9 (1.43) |
WI-VAS, mean (SD)c | 8.6 (1.27) |
eVRS for itch self-categorization, No. (%)a,d | |
No itching | 0 (0.0) |
Mild itching | 2 (1.6) |
Moderate itching | 34 (27.9) |
Severe itching | 50 (41.0) |
Very severe itching | 36 (29.5) |
DLQI, mean (SD)e | 14.2 (6.76) |
ItchyQoL, mean (SD)f | 3.6 (0.71) |
Abbreviations: AI-NRS, Average Itch Numeric Rating Scale; AI-VAS, average itch visual analog scale; DLQI, Dermatology Life Quality Index; eVRS, electronic verbal rating scale; PRO, patient-reported outcome; WI-NRS, Worst Itch Numeric Rating Scale; WI-VAS, worst itch visual analog scale.
a. The baseline score was defined as the average of the daily scores over the first 7 days of data collected at and after visit 2 (ie, days 1-7; week 1).
b. Numeric rating scale scores ranged from 0 to 10, with higher scores reflecting worse itch.
c. Visual analog scale scores ranged from 0 to 100, with higher scores reflecting worse itch.
d. Values are based on n = 122, as 1 eVRS score was missing.
e. DLQI scores ranged from 0 to 30, with higher scores reflecting worse influence on quality of life.
f. ItchyQoL scores ranged from 1 to 5, with higher scores reflecting worse pruritus-specific health-related quality of life impairment.
Convergent and Discriminant Validity
Convergent and discriminant validity test results for WI-NRS and AI-NRS, assessing their associations with other clinical outcome assessments, are summarized in Table 2. Strong associations (r ≥ 0.5) were observed between itch NRS items and average itch visual analog scale (AI-VAS; 24 hours) at weeks 2, 4, and 8 (r = 0.72-0.90; all P < .001). Similar strong associations were also observed between itch NRS items and worst itch visual analog scale (WI-VAS; 24 hours) and electronic verbal rating scale (eVRS) for itch severity items across weeks 2, 4, and 8 (r = 0.65-0.92; all P < .001). Moderate associations were seen between itch NRS items and the pruritus-specific quality of life (QoL) rating instrument ItchyQoL and the Dermatology Life Quality Index (DLQI) (r = 0.36-0.46; all P < .001). In general, weaker associations were observed with percentage of healed prurigo lesions (r = –0.25 to –0.32; P < .01 and P < .001, respectively) and percentage of prurigo lesions with excoriations (r = 0.21-0.28; P < .05 and P < .01, respectively) for itch NRS items at weeks 2 and 4.
Table 2. Construct Validity Results of WI-NRS and AI-NRS With Measures of Itch, Quality of Life, and Prurigo Nodularis Status.
Variable | WI-NRSa,b week 2 | WI-NRSa,b week 4 | WI-NRSa,b week 8 | AI-NRSa,b week 2 | AI-NRSa,b week 4 | AI-NRSa,b week 8 |
---|---|---|---|---|---|---|
AI-VAS (24 h) | 0.72c | 0.78c | 0.87c | 0.82c | 0.88c | 0.90c |
WI-VAS (24 h) | 0.79c | 0.86c | 0.92c | 0.74c | 0.81c | 0.85c |
eVRSb | 0.65c | 0.65c | 0.72c | 0.66c | 0.65c | 0.72c |
ItchyQoL | 0.36c | 0.36c | 0.40c | 0.38c | 0.42c | 0.39c |
DLQI | 0.40c | 0.42c | 0.46c | 0.41c | 0.45c | 0.45c |
Healed prurigo lesions, % | –0.32c | –0.27d | –0.41c | –0.25d | –0.28d | –0.43c |
Prurigo lesions with excoriations, % | 0.28d | 0.25d | 0.38c | 0.21e | 0.23e | 0.40c |
Abbreviations: AI-NRS, Average Itch Numeric Rating Scale; AI-VAS, average itch visual analog scale; DLQI, Dermatology Life Quality Index; eVRS, electronic verbal rating scale; WI-NRS, Worst Itch Numeric Rating Scale; WI-VAS, worst itch visual analog scale.
a. Spearman correlations.
b. Weekly mean WI-NRS and eVRS for itch severity scores from the previous week were used for each time point.
c. P < .001.
d. P < .01.
e. P < .05.
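The construct validity coefficients in Table 2 are Spearman correlations, that is, Pearson correlations computed on rank-transformed scores. The following is a minimal sketch of that calculation, not the SAS code used in the trial. The function names and the six paired scores are hypothetical and chosen only for illustration; the cutoff of 0.5 for a strong association is taken from the text, and applying it to the absolute value of r is an assumption.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranked data."""
    return pearson(average_ranks(x), average_ranks(y))


def is_strong(r, cutoff=0.5):
    """Strong association per the text (r >= 0.5); use of |r| is an assumption."""
    return abs(r) >= cutoff


# Hypothetical weekly mean scores for six patients (not trial data).
wi_nrs = [7.1, 5.0, 8.4, 3.2, 6.6, 9.0]
wi_vas = [7.5, 4.1, 7.2, 3.9, 7.0, 9.3]
rho = spearman(wi_nrs, wi_vas)
print(round(rho, 3), is_strong(rho))
```

Ranking before correlating makes the coefficient insensitive to the different ranges of the NRS and the visual analog scale, which is one reason a rank-based coefficient suits comparisons across instruments.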
Known-Groups Validity
Both WI-NRS and AI-NRS demonstrated significant differentiation among itch intensity levels, based on the visit verbal rating scale and eVRS at week 2 (Figure, A and B; P < .001 for all overall models). As measured by the ItchyQoL and DLQI at week 2, WI-NRS and AI-NRS demonstrated the ability to significantly discriminate among itch-specific QoL levels (Figure, C; P = .01 and P < .01 for overall models, respectively) and dermatology-specific QoL (Figure, D; P < .001 for both overall models). Based on the Prurigo Activity and Severity Score at week 2, WI-NRS showed significant differentiation among percentage of prurigo lesions with excoriations (P < .05); AI-NRS results were not significant (Figure, E).
Figure. Tests of Known-Groups Validity for Worst Itch Numeric Rating Scale (WI-NRS) and Average Itch Numeric Rating Scale (AI-NRS) With Measures of Itch, Quality of Life, and Prurigo Nodularis Status.
DLQI indicates Dermatology Life Quality Index; eVRS, electronic verbal rating scale; ItchyQoL, the pruritus-specific quality-of-life rating instrument; vVRS, visit verbal rating scale. Weekly mean itch numeric rating scale (NRS) and electronic verbal rating scale for itch severity scores from the previous week were used to determine WI-NRS and AI-NRS. Both WI-NRS and AI-NRS were assessed using analysis of variance models. A, Mean NRS with visit verbal rating scale (itching) from the visit Patient Global Assessment. The none and mild categories were combined due to small sample size for the none category (n = 1 or 2). B, Mean NRS with electronic verbal rating scale for itch severity. C, Mean NRS with ItchyQoL. D, Mean NRS with DLQI. E, Mean NRS with percentage of prurigo lesions with excoriations.
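The known-groups comparisons were assessed with analysis of variance models. As an illustration of the underlying idea, the sketch below computes a one-way analysis of variance F statistic for NRS scores grouped by itch category. It assumes a simple one-way model with no covariates, which may differ from the models actually fitted; the function name and the group scores are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of per-group score lists."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)

    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum(
        (score - sum(group) / len(group)) ** 2
        for group in groups
        for score in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical WI-NRS scores grouped by verbal rating category (not trial data).
mild = [1.0, 2.0, 3.0]
moderate = [2.0, 3.0, 4.0]
severe = [5.0, 6.0, 7.0]
f_stat, df_b, df_w = one_way_anova_f([mild, moderate, severe])
print(f_stat, df_b, df_w)
```

A large F statistic relative to its degrees of freedom indicates that mean NRS scores differ across the categories, which is the pattern expected if the scale separates patients with known differences in itch severity. Converting F to a P value requires the F distribution, which is omitted here.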
Test-Retest Reliability
For the itch NRS, stable patients were defined based on the eVRS. The mean (SD) difference from week 1 to week 2 was –0.5 (0.94) for WI-NRS (P = .002) and –0.4 (1.00) for AI-NRS (P = .01). Corresponding intraclass correlation coefficient values were 0.78 and 0.84, respectively, which demonstrated acceptable agreement between test and retest measures.
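The article does not state which form of the intraclass correlation coefficient was used. As one plausible reading, the sketch below computes a two-way, absolute-agreement, single-measurement coefficient, often written ICC(2,1), from the mean squares of a patients-by-occasions table. The choice of this form, the function name, and the example scores are all assumptions made for illustration.

```python
def icc_two_way_absolute(scores):
    """ICC(2,1): two-way model, absolute agreement, single measurement.

    scores is a list of rows, one row per patient, each row holding that
    patient's k repeated measurements (for test-retest, k = 2).
    """
    n = len(scores)
    k = len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)   # patients
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)   # occasions
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical week 1 and week 2 WI-NRS scores for five stable patients.
week1_week2 = [[7.0, 6.5], [5.0, 5.0], [8.5, 8.0], [6.0, 5.0], [9.0, 8.5]]
print(round(icc_two_way_absolute(week1_week2), 2))
```

Because this form penalizes a systematic shift between occasions, a small but consistent decrease from week 1 to week 2, such as the mean differences reported above, lowers the coefficient relative to a consistency-only form.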
Sensitivity to Change
Strong correlations were seen between change scores (week 2 to week 8) for WI-NRS and those for WI-VAS and AI-VAS (r = 0.76 and 0.70, respectively; both P < .001). Similar findings were seen for AI-NRS, where correlations with change scores for WI-VAS and AI-VAS were 0.71 and 0.72, respectively (both P < .001). In addition, moderate correlations were seen between change scores for WI-NRS and those for the DLQI and ItchyQoL (r = 0.40 and 0.45, respectively; both P < .001). Correlations between change scores for AI-NRS and the DLQI and ItchyQoL were 0.46 and 0.45, respectively (both P < .001).
Thresholds for Meaningful Change
For WI-NRS, the standard error of measurement was 1.08 when calculated from the intraclass correlation coefficient and the 8-week averaged SD and 1.01 when calculated from the intraclass correlation coefficient and the SD of the change score; the corresponding values for AI-NRS were 0.76 and 0.69, respectively. These findings suggest that an approximate 1-point change on WI-NRS and AI-NRS may be associated with minimal clinical improvement.
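The standard error values above are described as combining the intraclass correlation coefficient with an SD. Assuming they follow the conventional formula for the standard error of measurement, SD multiplied by the square root of (1 − ICC), the calculation can be sketched as below. The formula is an assumption, since the eMethods are not reproduced here, and the SD of 2.3 is a hypothetical input; only the intraclass correlation coefficient of 0.78 for WI-NRS comes from the text.

```python
from math import sqrt


def standard_error_of_measurement(sd, icc):
    """Conventional distribution-based threshold: SD * sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    if sd < 0:
        raise ValueError("sd must be non-negative")
    return sd * sqrt(1.0 - icc)


# ICC of 0.78 is the reported WI-NRS value; the SD of 2.3 is hypothetical.
sem_wi_nrs = standard_error_of_measurement(sd=2.3, icc=0.78)
print(round(sem_wi_nrs, 2))
```

Under this formula, a more reliable scale (higher intraclass correlation coefficient) or a less variable population (smaller SD) yields a smaller threshold, so a 1-point change on an 11-point scale exceeds measurement noise only when both conditions roughly hold.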
Responder Definitions
A reduction of 2.6 points (37%) in WI-NRS score was associated with at least a moderate improvement in itch based on the visit Patient Global Assessment question "If improved, to what extent?", while a reduction of 3.6 points (53%) was associated with at least a good improvement in itch (eTable 2 in Supplement 3). When eVRS findings were summarized based on any improvement from week 2 to week 8, a reduction of 2.3 points (32%) in WI-NRS score was associated with at least a 1-point improvement in itch category, a reduction of 3.0 points (41%) in WI-NRS score was associated with at least a 2-point improvement in itch category, and a reduction of 4.5 points (62%) in WI-NRS score was associated with at least a 3-point improvement in itch category. Similar findings based on visit Patient Global Assessment and eVRS improvement categories were seen for the AI-NRS.
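One way to read these anchor-based results is as the mean point reduction and mean percentage reduction in NRS score among patients whose anchor (for example, the eVRS category) improved by at least a given amount. The sketch below illustrates that reading. How the trial actually summarized the reductions is described in eTable 2 and the eMethods, so the averaging rule, the function name, and the patient records here are assumptions for illustration only.

```python
def anchor_based_threshold(patients, min_category_gain):
    """Summarize NRS reduction among patients meeting an anchor criterion.

    patients is a list of (baseline_score, final_score, category_gain)
    tuples, where category_gain is the number of anchor categories improved.
    Returns (mean_point_reduction, mean_percent_reduction), or None if no
    patient qualifies.
    """
    selected = [
        (baseline, final)
        for baseline, final, gain in patients
        if gain >= min_category_gain and baseline > 0
    ]
    if not selected:
        return None
    mean_points = sum(b - f for b, f in selected) / len(selected)
    # Percentage reduction is computed per patient and then averaged.
    mean_percent = 100 * sum((b - f) / b for b, f in selected) / len(selected)
    return mean_points, mean_percent


# Hypothetical records: (baseline NRS, final NRS, anchor categories improved).
patients = [
    (8.0, 6.0, 1),
    (7.0, 3.5, 2),
    (6.0, 6.0, 0),
    (9.0, 4.5, 3),
]
for gain in (1, 2, 3):
    print(gain, anchor_based_threshold(patients, gain))
```

Averaging per-patient percentages generally gives a different figure than dividing the mean reduction by the mean baseline, which may be why a reported percentage does not match a simple ratio of the summary values.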
Discussion
Results from this secondary data analysis of the serlopitant trial5 show that the itch NRS items (WI-NRS and AI-NRS) have good psychometric properties for pruritus associated with PN. Analyses for the itch NRS items showed that test-retest reliability was acceptable. Significant correlations were seen between itch NRS items and the other PROs and provided evidence of acceptable convergent validity based on the eVRS, ItchyQoL, and DLQI. It was anticipated that the DLQI would be used to assess discriminant validity as a more general dermatology QoL measure; however, correlations between the itch NRS items and the DLQI were all greater than or equal to 0.40, suggesting that the concepts are more closely related than had been previously hypothesized. In addition, the itch NRS items showed evidence of acceptable known-groups validity and evidence of acceptable sensitivity to change.
Findings from 2 methods of defining responders based on improvement in itch show that WI-NRS change scores of 2.3 to 4.5 are consistent with a 30% to 60% improvement in WI-NRS scores, which could serve as a responder definition in future studies. Previous studies in psoriasis and atopic dermatitis recommend a 4-point WI-NRS response.6,7,8 The findings of this analysis show that a 3-point or greater improvement in WI-NRS scores from baseline to end point is equivalent to the responder definition. Similar findings were seen for the AI-NRS, suggesting that a 3-point or greater improvement in scores from baseline to end point is equivalent to the responder definition. The similarity in responder definitions for the WI-NRS and AI-NRS is not unexpected, as findings from research in atopic dermatitis and chronic pruritus showed that these measures were highly correlated with each other.7,9 Additional studies may be needed to confirm the responder definition for these NRS measures in PN.
Limitations
A limitation of this report is that the population comprised only White patients, all of whom were from Germany. Other populations have been shown to interpret their pruritus differently.10,11
Conclusions
The results of this study fulfill the US Food and Drug Administration’s criteria for an adequate instrument to assess treatment benefit of a given intervention, including evidence supporting reliability, validity, and ability to detect change. Thus, WI-NRS and AI-NRS should be considered acceptable tools for assessing pruritus in future clinical trials of PN.
Trial Protocol Version 8
Trial Protocol Version 3
eMethods (with references)
eTable 1. Outcome Measures Used in the Trial
eTable 2. Meaningful Change Thresholds for WI-NRS and AI-NRS for Defining Responders Using the PGA of Chronic Pruritus and eVRS for Itch Self-Categorization
Data Sharing Statement
References
1. Pereira MP, Steinke S, Zeidler C, et al; EADV Task Force Pruritus group members. European Academy of Dermatology and Venereology European Prurigo Project: expert consensus on the definition, classification and terminology of chronic prurigo. J Eur Acad Dermatol Venereol. 2018;32(7):1059-1065. doi: 10.1111/jdv.14570
2. Zeidler C, Tsianakas A, Pereira M, Ständer H, Yosipovitch G, Ständer S. Chronic prurigo of nodular type: a review. Acta Derm Venereol. 2018;98(2):173-179. doi: 10.2340/00015555-2774
3. Ständer S, Augustin M, Reich A, et al; International Forum for the Study of Itch Special Interest Group Scoring Itch in Clinical Trials. Pruritus assessment in clinical trials: consensus recommendations from the International Forum for the Study of Itch (IFSI) Special Interest Group Scoring Itch in Clinical Trials. Acta Derm Venereol. 2013;93(5):509-514. doi: 10.2340/00015555-1620
4. Iking A, Grundmann S, Chatzigeorgakidis E, Phan NQ, Klein D, Ständer S. Prurigo as a symptom of atopic and non-atopic diseases: aetiological survey in a consecutive cohort of 108 patients. J Eur Acad Dermatol Venereol. 2013;27(5):550-557. doi: 10.1111/j.1468-3083.2012.04481.x
5. Ständer S, Kwon P, Hirman J, et al; TCP-102 Study Group. Serlopitant reduced pruritus in patients with prurigo nodularis in a phase 2, randomized, placebo-controlled trial. J Am Acad Dermatol. 2019;80(5):1395-1402. doi: 10.1016/j.jaad.2019.01.052
6. Kimball AB, Naegeli AN, Edson-Heredia E, et al. Psychometric properties of the Itch Numeric Rating Scale in patients with moderate-to-severe plaque psoriasis. Br J Dermatol. 2016;175(1):157-162. doi: 10.1111/bjd.14464
7. Yosipovitch G, Reaney M, Mastey V, et al. Peak Pruritus Numerical Rating Scale: psychometric validation and responder definition for assessing itch in moderate-to-severe atopic dermatitis. Br J Dermatol. 2019;181(4):761-769. doi: 10.1111/bjd.17744
8. Ständer S, Luger T, Cappelleri JC, et al. Validation of the Itch Severity Item as a measurement tool for pruritus in patients with psoriasis: results from a phase 3 tofacitinib program. Acta Derm Venereol. 2018;98(3):340-345. doi: 10.2340/00015555-2856
9. Verweyen E, Ständer S, Kreitz K, et al. Validation of a comprehensive set of pruritus assessment instruments: The Chronic Pruritus Tools Questionnaire PRURITOOLS. Acta Derm Venereol. 2019;99(7):657-663. doi: 10.2340/00015555-3158
10. Shaw FM, Luk KMH, Chen KH, Wrenn G, Chen SC. Racial disparities in the impact of chronic pruritus: a cross-sectional study on quality of life and resource utilization in United States veterans. J Am Acad Dermatol. 2017;77(1):63-69. doi: 10.1016/j.jaad.2017.01.016
11. Tey HL, Yosipovitch G. Itch in ethnic populations. Acta Derm Venereol. 2010;90(3):227-234. doi: 10.2340/00015555-0867