Abstract
Introduction
Lichen planus (LP) is an inflammatory skin disorder that can present in various forms across the body, including lesions on the skin (cutaneous LP [CLP]), scalp (lichen planopilaris [LPP]) and mucosal regions (mucosal LP [MLP]). Several existing patient-reported outcome measures (PROMs) were identified for potential use in LP clinical development programs. This study aimed to assess the content validity and psychometric measurement properties of the Dermatology Life Quality Index (DLQI), Epworth Sleepiness Scale (ESS), Scalpdex and Oral Lichen Planus Symptom Severity Measure (OLPSSM) in an LP population.
Methods
Patients completed the PROMs at various time points as part of an international Phase 2 clinical study in adults with MLP (n = 37), LPP (n = 37) and CLP (n = 37). Test-retest reliability, construct validity and sensitivity to change were assessed. In addition, qualitative cognitive debriefing interviews were conducted with adults with MLP (n = 20), LPP (n = 19) and CLP (n = 19) in the USA and Germany to examine the content validity of the PROMs.
Results
The DLQI demonstrated adequate reliability and validity, although its ability to detect change was modest and most items were considered not relevant in qualitative interviews. The ESS had good reliability but limited evidence of validity and ability to detect change. Conceptual relevance varied according to the qualitative interview data. The performance of the Scalpdex was mixed across domains, but the ‘Symptoms’ domain performed well overall. Scalpdex concepts were reported as relevant by most LPP patients interviewed. The OLPSSM demonstrated good psychometric properties and strong evidence of content validity.
Conclusions
The psychometric and qualitative findings support the use of the OLPSSM and Scalpdex within specific LP subtypes but only cautious use of the DLQI. Administration of the ESS is not recommended in LP because of its poor psychometric performance. Given these limitations, further validation of non-specific disease measures and/or the development of additional LP-specific PROMs is needed.
Trial Registration
Supplementary Information
The online version contains supplementary material available at 10.1007/s13555-023-00990-4.
Keywords: Dermatology Life Quality Index, Epworth Sleepiness Scale, Health-related quality of life, Oral Lichen Planus Symptom Severity Measure, Lichen planus, Patient-reported outcomes, Psychometric evaluation, Qualitative evidence, Scalpdex
Key Summary Points
Why carry out this study?
A review of existing patient-reported outcome measures used in dermatological conditions indicated that there were some existing measures that could be appropriate for use in lichen planus; however, further qualitative and psychometric testing was required to address evidence gaps.
This study aimed to assess the content validity and psychometric measurement properties of the Dermatology Life Quality Index, Epworth Sleepiness Scale, Scalpdex and Oral Lichen Planus Symptom Severity Measure across three lichen planus subpopulations: cutaneous lichen planus, lichen planopilaris and mucosal lichen planus.
What was learned from the study?
The findings recommend the use of the Scalpdex and the Oral Lichen Planus Symptom Severity Measure with lichen planopilaris and oral mucosal lichen planus patients, respectively, and the Dermatology Life Quality Index in general lichen planus populations, with caveats. The Epworth Sleepiness Scale demonstrated weak psychometric properties and content validity when utilised with lichen planus patients.
This study highlights the importance of assessing the appropriateness of non-specific disease patient-reported outcome measures in disease-specific populations.
Introduction
Lichen planus (LP) is an inflammatory skin disorder estimated to affect between 0.5 and 1% of the population worldwide [1, 2]. LP can present in various forms across the body [3]. Cutaneous LP (CLP) lesions are the most common type of LP and are characterized by polygonal purple papules on the skin, often associated with severe itch and typically affecting flexor surfaces including the wrists, ankles and lower back [4]. Lichen planopilaris (LPP) is a follicular variant of LP and is most common in females [2]. LPP can present as painful and itchy patches of hair loss, predominantly localized to the centre of the scalp, along the frontal hair line and/or in the eyebrows [5]. If untreated, LPP can lead to irreversible scarring and alopecia [4]. Mucosal LP (MLP) lesions typically present as asymptomatic bilateral white striations or painful plaques localized in mucosal areas including buccal mucosa, tongue and gingivae, genitalia and conjunctiva [2, 4, 6]. Individuals may be diagnosed with more than one LP subtype, based on the clinical presentation [4].
Given the range of LP signs and symptoms (including itch, pain and a burning sensation at the affected areas) [1, 8–11], LP can have a significant impact on patients’ health-related quality of life (HRQoL) [4]. While qualitative literature is limited, there is evidence that LP patients, particularly CLP and MLP patients, experience psychological impacts including anxiety and depression [12]. Patients with oral MLP also report experiencing significant impacts on daily activities, such as discomfort when having certain foods and drinks, which in some cases can result in depression and high levels of stress and anxiety [13, 14]. LPP patients have reported impacts on social interactions and daily activities as a result of scarring and hair loss, causing patients to have low self-esteem and feel self-conscious [15].
Patient-reported outcome measures (PROMs) are commonly used in routine medical practice and clinical studies to measure symptoms and HRQoL from the patient perspective. It is important that PROMs are appropriate and fit for purpose in terms of content validity and psychometric validity in the context of use [16]. A review of existing PROMs used in LP and other similar dermatological conditions identified several PROMs that could be appropriate for use in LP clinical development programs. Specifically, dermatological measures such as the Dermatology Life Quality Index (DLQI) [17] and Scalpdex [18], and non-specific disease measures such as the Epworth Sleepiness Scale (ESS) [19], have been used to assess HRQoL in LP patients [15, 20–23]. While there is some evidence of content validity and psychometric properties for these measures in some dermatological conditions [23, 24], there is limited evidence to support their use in an LP population [25]. In contrast, while existing LP-specific PROMs such as the recently developed Oral Lichen Planus Symptom Severity Measure (OLPSSM) have strong content validity [8, 26], there is no published additional evidence of psychometric validation in an LP (nor any other) population.
To address the gaps in evidence and align with regulatory standards [16, 27], the current study aimed to assess the content validity and psychometric measurement properties of the DLQI, ESS, Scalpdex and OLPSSM in an LP population through the conduct of qualitative patient interviews and psychometric analysis of data from an international Phase 2 LP clinical study. Aligned with the United States Food and Drug Administration (FDA) patient-focused drug development (PFDD) guidance documents, a mixed-method approach was used to ensure that the patient voice was represented in the evaluation of the select PROMs and in future clinical study design in LP [28–31].
Methods
Study Design
This study was conducted in two phases. In the quantitative phase, the psychometric properties of the DLQI, ESS, Scalpdex and OLPSSM were assessed in an LP population. In the qualitative phase, the content validity of the measures was evaluated via cognitive debriefing interviews.
Compliance with Ethics Guidelines
Ethical approval and oversight were obtained for the clinical study including exit interviews ([clinicaltrials.gov ID: NCT04300296, EUDRACT: 2019-003588-24]) and the independent qualitative interviews (Western Copernicus Group Independent Review Board [WCG IRB; reference: 20216826]). The studies were performed in accordance with the Helsinki Declaration of 1964 and its later amendments, and all participants provided informed consent indicating their data will be used for medical research purposes and the study results may be published.
Quantitative Phase
The quantitative phase used data collected from a global, randomized, double-blind, placebo-controlled, multi-centre, parallel-group Phase 2 clinical study involving 111 adults with biopsy-proven forms of moderate to severe LP (based on Investigator Global Assessment [IGA] rating of ≥ 3) who were eligible for systemic therapy and not adequately controlled with topical corticosteroids of high-ultrahigh potency in the opinion of the investigator. The study consisted of three cohorts (CLP, MLP and LPP) and two treatment periods (treatment period 1: baseline to Week 16; treatment period 2: Week 16 to Week 32) (Supplementary Material). For the psychometric analyses, treatment period 1 data were used. The PROMs selected were included as secondary or exploratory study endpoints.
Overview of PROMs
Table 1 provides a brief description of the PROMs included in the planned analyses and the cohorts they were administered to within the clinical study. Licenses to use the PROMs in the clinical study were obtained.
Table 1.
PROM Patient-reported outcome measure, DLQI Dermatology Life Quality Index, HRQoL Health-related quality of life, QoL Quality of life, ESS Epworth Sleepiness Scale, OLPSSM Oral Lichen Planus Symptom Severity Measure, OLP Oral lichen planus
Anchor Measures
Anchor measures were developed and administered in the LP clinical study to the full clinical sample to support psychometric evaluation of the PROMs [16]. This included a five-point patient global impression of severity (PGI-S) item, a five-point patient global impression of change (PGI-C) item, a five-point Investigator’s Global Assessment (IGA) scale and Item 1 of the DLQI (‘Over the last week, how itchy, sore, painful or stinging has your skin been?’). The PGI-S and the IGA were administered at baseline and at Week 2, 4, 8, 12 and 16; the PGI-C was administered at Week 2, 4, 8, 12 and 16.
Psychometric Analysis
Item- and scale-level psychometric analyses were conducted (Table 2). Unless noted otherwise, Week 4 data were used, as this time point was identified to provide a greater range of scores. As the PROMs were not appropriate for use in all LP types, analyses were conducted with different patient samples, e.g., DLQI and ESS with all LP types (n = 111), Scalpdex with LPP only (n = 37) and OLPSSM with MLP patients with oral LP (n = 33). The aim of this study was not to evaluate the structure of the questionnaires; therefore, factor analyses were not conducted.
Table 2.
LP Lichen planus, IGA Investigator Global Assessment, PGI-S Patient Global Impression of Severity, PGI-C Patient Global Impression of Change, DLQI Dermatology Life Quality Index, ICC Intraclass correlation coefficient, ESS Epworth Sleepiness Scale, OLPSSM Oral Lichen Planus Symptom Severity Measure, ANOVA Analysis of variance
Qualitative Phase
The qualitative phase assessed the content validity of the PROMs via cognitive debriefing interviews. Given that the DLQI, ESS, Scalpdex and OLPSSM are existing validated measures, only relevance is reported here, as evidence of understanding is already available from the original development studies and subsequent studies evaluating their use. An overview of the study procedure is provided in Supplementary Material, with further detail described in the subsequent sections.
Sample and Recruitment
A subset of patients (n = 13) enrolled in the Phase 2 LP clinical study in the US were invited to participate in an exit interview once they had completed all treatment visits to Week 32 but before their Week 40 follow-up visit. Participation was voluntary and patients could opt out of taking part in an interview; patients who withdrew from the clinical study early were not eligible to participate in an exit interview. To further enhance the sample size, an additional and independent sample of patients (n = 45) was recruited by third-party recruitment agencies via referring clinicians in the US and Germany to participate in a qualitative interview. Inclusion and exclusion criteria for the independent interviews were broadly reflective of the LP clinical study eligibility criteria. Based on previous research, the sample included was deemed sufficient for assessing the content validity of the PROMs [32].
Interview Procedure
Interviews lasted 60 min and were conducted via telephone by trained qualitative interviewers in the patient’s native language using a semi-structured interview guide to facilitate the discussions. The cognitive debriefing (CD) section of the interview, which aimed to explore the relevance of the concepts assessed in the PROMs, lasted approximately 30 min and consisted of direct and focused questions.
Qualitative Analysis
All interviews were audio-recorded and transcribed verbatim with identifiable information redacted; the German interviews were further translated to English. Interview transcripts were analysed using Atlas.ti (Version 22) [33] using a framework approach [34]. Dichotomous codes were assigned to each item, instruction, response option(s) and recall period to indicate whether it was understood, relevant and/or appropriate, and why. Further codes captured any suggested changes.
Results
Participant Demographic and Clinical Characteristics
Overviews of the demographic and clinical characteristics for the qualitative interviews (N = 58: exit interviews, n = 13; independent qualitative interviews, n = 45) are presented in Tables 3 and 4, respectively. Age was lower for MLP participants and there was a higher proportion of females, reflecting the female predominance of LP [35]. Most participants were enrolled in the US and were Black or African American. There was a higher proportion of participants with ‘moderate’ LP, as confirmed by IGA severity scores at recruitment. The clinical study sample (N = 111; n = 37 in each LP cohort) was comparable with the qualitative samples; these data will be presented elsewhere.
Table 3.
SD Standard deviation; CLP Cutaneous lichen planus; MLP Mucosal lichen planus; LPP Lichen planopilaris
aEducation-level data were not collected for the exit interview sample
bCollection of racial data as part of surveys or studies in Germany is not permitted
cElementary school defined as including Kindergarten to Grade 5
dMiddle and high school defined as including Grade 6 to Grade 12, high school diploma/General Education diploma or equivalent
Table 4.
aInformation about concomitant CLP or MLP in participants enrolled in the LPP cohort was not collected for this study
bPrevious treatment categories for the clinical study sample were aligned with the clinical study eligibility criteria and thus differ from those collected for the independent interviews
cSome participants reported multiple LP treatments
LP Lichen planus, CLP Cutaneous lichen planus, MLP Mucosal lichen planus, LPP Lichen planopilaris, SD Standard deviation, IGA Investigator Global Assessment
Quantitative Phase
Item-Level and Dimensionality Analyses
Inter-Item Correlations
As expected, items within the DLQI domains (Table 5) correlated well with each other, particularly ‘Leisure’ (r = 0.894) and ‘Personal relationships’ (r = 0.890). Items in the domains ‘Symptoms and feelings’ (r = 0.479) and ‘Daily activities’ (r = 0.579) correlated moderately; however, ‘Daily activities’ items correlated most strongly with Item 2 (‘Embarrassed or self-conscious’), which was part of the ‘Symptoms and feelings’ domain (range: r = 0.721–0.848). The ESS (Table 6) had a few weak correlations, with the weakest (r = 0.311) being observed between Item 2 (‘Watching TV’) and Item 6 (‘Sitting and talking to someone’). The majority of correlations were in the range of r = 0.60–0.70. No correlations in the ESS exceeded 0.80. For the Scalpdex (Tables 7 and 8), inter-item correlations ranged from −0.226 to 0.935. Items within Scalpdex domains overall correlated moderately, but this varied. Item 19 (‘I feel that my knowledge for caring for my scalp is adequate’), Item 20 (‘The cost of caring for my scalp condition bothers me’) and Item 8 (‘My scalp condition bleeds’) had the lowest correlations with the remainder of the items, suggesting they measure concepts dissimilar to other items in the Scalpdex. A number of strong correlations were observed, suggesting potential redundancies. As shown in Table 9, the OLPSSM had few weak correlations < 0.40, with the weakest correlation (r = 0.136) being observed between Item 1 (‘When you brushed your teeth’) and Item 6 (‘When you talked’). The majority of correlations were in the range of r = 0.50–0.60, with Item 2 (‘When you ate food’) and Item 7 (‘When it was touched’) having the strongest correlation (r = 0.889), indicating possible redundancy.
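As an illustration of how an inter-item screen of this kind can be run, the following minimal sketch computes pairwise correlations and flags weak and very strong pairs. It assumes Pearson correlations and uses cut-offs of 0.40 and 0.80 taken from the values mentioned in the text; the function names, item labels and scores are hypothetical and are not study data, and the coefficient used in the study analyses may differ.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length score lists (neither constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def flag_item_pairs(items, weak=0.40, strong=0.80):
    """Label each item pair as weak, acceptable or possibly redundant.

    items: dict mapping an item label to its scores, one per respondent,
    in the same respondent order for every item.
    The 0.40 and 0.80 cut-offs are illustrative assumptions.
    """
    flags = {}
    for a, b in combinations(items, 2):
        r = pearson(items[a], items[b])
        if r < weak:
            label = "weak"
        elif r > strong:
            label = "possible redundancy"
        else:
            label = "acceptable"
        flags[(a, b)] = (round(r, 3), label)
    return flags


# Hypothetical responses from six respondents on a 0-4 scale.
responses = {
    "item_1": [0, 1, 2, 3, 4, 2],
    "item_2": [0, 1, 2, 3, 4, 3],
    "item_3": [4, 0, 3, 1, 2, 2],
}
flags = flag_item_pairs(responses)
for pair, (r, label) in flags.items():
    print(pair, r, label)
```

A flagged pair is a prompt for review rather than a verdict: a very high correlation suggests two items may capture the same concept, while a weak one suggests an item may sit outside the construct measured by the others.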
Table 5.
DLQI Dermatology Life Quality Index
Table 6.
Refer to the legend of Table 5 for the significance of the color codes
ESS Epworth Sleepiness Scale, TV Television
Table 7.
Refer to the legend of Table 5 for the significance of the color codes
LPP Lichen planopilaris
Table 8.
Refer to the legend of Table 5 for the significance of the color codes
LPP Lichen planopilaris
Table 9.
Refer to the legend of Table 5 for the significance of the color codes
OLPSSM Oral Lichen Planus Symptom Severity Measure, MLP Mucosal lichen planus, OLP Oral lichen planus
Scale-Level Analyses
Internal Consistency Reliability
Internal consistency was examined using Cronbach's alpha to assess the homogeneity of items belonging to the total measure score or domain score (Table 10). As Cronbach’s alpha cannot be used for domains with fewer than three items, this was not assessed for DLQI domain scores. Alpha coefficients surpassed 0.70, indicating good internal consistency (DLQI total score = 0.920, ESS total score = 0.859, Scalpdex ‘Functioning’ domain score = 0.823, Scalpdex ‘Emotions’ domain score = 0.941, OLPSSM total score = 0.877), except for the Scalpdex ‘Symptoms’ domain score (0.655). However, this domain is only composed of three items, and therefore lower reliability was expected. Among the coefficients listed, the highest was for the Scalpdex ‘Emotions’ domain score (0.941), followed by the DLQI total score (0.920).
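For readers unfamiliar with the statistic, Cronbach's alpha can be computed from item-level responses as in the minimal sketch below. The three-item minimum mirrors the rule stated above, the 0.70 benchmark is the one used in the text, and the function name, parameter name and data are hypothetical rather than taken from the study.

```python
from statistics import variance


def cronbach_alpha(rows, min_items=3):
    """Cronbach's alpha for one scale or domain.

    rows: one list of item scores per respondent (complete cases only).
    min_items: illustrative guard reflecting the three-item rule in the text.
    """
    k = len(rows[0])
    if k < min_items:
        raise ValueError("too few items for a meaningful alpha")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: five respondents, three items.
demo_rows = [
    [2, 3, 3],
    [1, 1, 2],
    [3, 3, 2],
    [0, 1, 0],
    [2, 2, 3],
]
alpha = cronbach_alpha(demo_rows)
print(round(alpha, 3), "meets 0.70 benchmark:", alpha > 0.70)
```

Because alpha depends on the number of items as well as their intercorrelation, short domains tend to yield lower coefficients, which is consistent with the interpretation given for the three-item Scalpdex ‘Symptoms’ domain.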
Table 10.
DLQI Dermatology Life Quality Index, ESS Epworth Sleepiness Scale, OLPSSM Oral Lichen Planus Symptom Severity Measure, CLP Cutaneous lichen planus, MLP Mucosal lichen planus, LPP Lichen planopilaris, OLP Oral lichen planus
The alpha-if-deleted method was also conducted to assess whether the internal consistency of each total score or domain would improve with the removal of each item in turn (Supplementary Material). The overall internal consistency improved slightly with the removal of: DLQI Item 10 (‘Over the last week, how much of a problem has the treatment for your skin been, for example by making your home messy, or by taking up time?’) (0.921); ESS Item 6 (‘Sitting and talking to someone’) (0.864); Scalpdex ‘Symptoms’ domain Item 8 (‘My scalp condition bleeds’) (0.695); Scalpdex ‘Functioning’ domain Item 15 (‘My scalp condition affects the color of clothes I wear’) (0.844); Scalpdex ‘Emotions’ domain Item 19 (‘I feel that my knowledge for caring for my scalp is adequate’) (0.949) and Item 20 (‘The cost of caring for my scalp condition bothers me’) (0.949). However, given the marginal difference in the Cronbach's alpha coefficient, these results were not considered problematic.
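The alpha-if-deleted procedure described above can be sketched as follows: each item is dropped in turn and alpha is recomputed on the remaining items. This is an illustrative implementation with hypothetical names and data, not the code used for the study analyses.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha from one list of item scores per respondent."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_if_deleted(rows):
    """Return {item index: alpha recomputed without that item}.

    Requires at least three items so that two remain after deletion.
    """
    k = len(rows[0])
    result = {}
    for drop in range(k):
        reduced = [[v for i, v in enumerate(row) if i != drop] for row in rows]
        result[drop] = cronbach_alpha(reduced)
    return result


# Hypothetical responses: the fourth item (index 3) is unrelated to the rest.
demo_rows = [
    [2, 3, 3, 0],
    [1, 1, 2, 3],
    [3, 3, 2, 1],
    [0, 1, 0, 2],
    [2, 2, 3, 1],
]
full_alpha = cronbach_alpha(demo_rows)
deleted = alpha_if_deleted(demo_rows)
for index, value in deleted.items():
    print(f"without item index {index}: alpha = {value:.3f} (full scale: {full_alpha:.3f})")
```

In this toy example, removing the unrelated item raises alpha substantially; in the study the corresponding gains were marginal, which is why no items were judged problematic on this basis.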
Test-retest Reliability
Test-retest reliability was evaluated to examine the stability of scores either between Week 2 and Week 4, and between Week 4 and Week 8, for the scales assessed at those three time points (ESS total score and OLPSSM total score), or between Week 4 and Week 8 for the scales not assessed at Week 2 (DLQI total score and Scalpdex ‘Total’, ‘Symptoms’, ‘Emotions’ and ‘Functioning’ domain scores).
When stability was defined using the IGA, PGI-S, PGI-C or DLQI item 1, all ICCs surpassed 0.75, indicating good test-retest reliability [36] (Table 11). Pearson’s correlation coefficients were similar to the ICCs, providing further evidence of the reproducibility of measure scores in stable participants.
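A minimal sketch of a test-retest calculation is given below. It first keeps only participants whose anchor rating did not change between the two visits, then computes an intraclass correlation on their PROM scores. The ICC form is not stated in this section, so the sketch assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement); the record layout and all values are hypothetical.

```python
def icc_2_1(pairs):
    """ICC(2,1) for two measurement occasions.

    pairs: list of (score_at_test, score_at_retest), one per stable participant.
    Assumes a two-way random-effects, absolute-agreement, single-measure model.
    """
    n, k = len(pairs), 2
    grand = sum(a + b for a, b in pairs) / (n * k)
    row_means = [(a + b) / k for a, b in pairs]
    col_means = [sum(p[j] for p in pairs) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for p in pairs for v in p)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical records: (anchor at visit 1, anchor at visit 2, score 1, score 2).
records = [
    (2, 2, 10, 11),
    (3, 3, 15, 14),
    (1, 1, 5, 6),
    (2, 1, 12, 7),   # anchor changed, so this participant is not "stable"
    (4, 4, 20, 19),
    (3, 3, 13, 13),
]
# Stability defined as no change on the anchor between the two visits.
stable_pairs = [(s1, s2) for a1, a2, s1, s2 in records if a1 == a2]
icc = icc_2_1(stable_pairs)
print(round(icc, 3), "exceeds 0.75:", icc > 0.75)
```

Restricting the calculation to stable participants matters: including participants whose condition genuinely changed would lower the coefficient for reasons unrelated to measurement error.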
Table 11.
DLQI Dermatology Life Quality Index, ESS Epworth Sleepiness Scale, OLPSSM Oral Lichen Planus Symptom Severity Measure, LP Lichen planus, SD Standard deviation, ICC Intra-class correlation coefficient, IGA Investigator Global Assessment, PGI-S Patient Global Impression of Severity, PGI-C Patient Global Impression of Change, MLP Mucosal lichen planus, CLP Cutaneous lichen planus, LPP Lichen planopilaris, OLP Oral lichen planus
aStability defined as no change
Concurrent Validity
The ESS total score had weak correlations (≤ 0.250) with all convergent measures (Table 12). The Scalpdex total score correlated strongly with the DLQI total score (0.801) and moderately with the OLPSSM total score (0.353). Both the Scalpdex total score and the OLPSSM total score correlated moderately with the DLQI Item 1 (range: 0.473–0.504) and the PGI-S (range: 0.609–0.637), while both had weak correlations with the PGI-C (range: 0.173–0.290). The IGA correlated moderately with the OLPSSM total score (0.552) and weakly with the Scalpdex total score (0.030).
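The comparison of observed correlations against pre-specified hypotheses can be sketched as below. The example uses Spearman's rank correlation only (the polyserial coefficient also used in the study is not reproduced), and the 0.40 threshold is one of the two hypothesized values given in the Table 12 footnote. Function names and data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for position in range(i, j + 1):
            ranks[order[position]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)


def meets_hypothesis(rho, threshold):
    """True when the observed correlation exceeds the hypothesized minimum."""
    return rho > threshold


# Hypothetical PROM total scores and five-point anchor ratings.
prom_scores = [4, 10, 7, 15, 12, 3]
anchor_ratings = [1, 2, 2, 4, 3, 1]
rho = spearman(prom_scores, anchor_ratings)
print(round(rho, 3), meets_hypothesis(rho, 0.40))
```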
Table 12.
DLQI Dermatology Life Quality Index, ESS Epworth Sleepiness Scale, OLPSSM Oral Lichen Planus Symptom Severity Measure, LP Lichen planus, PGI-S Patient Global Impression of Severity, PGI-C Patient Global Impression of Change, IGA Investigator Global Assessment, MLP Mucosal lichen planus, CLP Cutaneous lichen planus, LPP Lichen planopilaris
aLight shaded cells indicate hypothesized correlations (> 0.30 or > 0.40) were met, indicating concurrent validity; dark shaded cells indicate hypothesized correlations were not met
bIndicates Spearman’s correlation, otherwise polyserial
Known-Group Validity
Known-group analyses compared DLQI total score, ESS total score, Scalpdex total and domain scores and OLPSSM total score, according to groups defined by IGA and PGI-S disease severity scores (Table 13). The DLQI total score, ESS total score, Scalpdex total score, Scalpdex ‘Symptoms’ domain score and Scalpdex ‘Emotions’ domain score differed significantly (p < 0.05) among groups defined by the PGI-S, with moderate to large between-group effect size estimates. In contrast, differences in mean PROM scores between IGA-defined groups were non-significant, with negative moderate to small between-group effect size estimates, suggesting that the PROMs could not discriminate between groups defined by the IGA. Of note, given the small sample size for the OLPSSM, more weight should be given to the between-group effect size values to interpret validity; as such, OLPSSM scores show evidence of being able to discriminate between groups for the PGI-S known groups and the IGA known groups.
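To illustrate a known-groups comparison, the sketch below splits participants into a reference group and a severe group on the PGI-S and computes a between-group effect size. The effect size statistic is not named in this section, so Cohen's d with a pooled standard deviation is assumed; the numeric coding of the PGI-S (0 = none to 4 = very severe) and all scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(severe, reference):
    """Between-group effect size using the pooled standard deviation."""
    n1, n2 = len(severe), len(reference)
    pooled_sd = sqrt(
        ((n1 - 1) * variance(severe) + (n2 - 1) * variance(reference))
        / (n1 + n2 - 2)
    )
    return (mean(severe) - mean(reference)) / pooled_sd


# Hypothetical (PGI-S level, PROM total score) pairs.
# Assumed coding: 0 = none, 1 = mild, 2 = moderate, 3 = severe, 4 = very severe.
records = [(0, 2), (1, 4), (1, 3), (2, 9), (3, 12), (4, 15), (2, 8)]

# Reference group: none or mild; severe group: moderate or worse.
reference_scores = [score for level, score in records if level <= 1]
severe_scores = [score for level, score in records if level >= 2]

d = cohens_d(severe_scores, reference_scores)
print(round(d, 3))
```

A positive value indicates higher (worse) PROM scores in the severe group, which is the expected direction; a negative value, as reported for the IGA-defined groups, indicates the opposite.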
Table 13.
DLQI Dermatology Life Quality Index, ESS Epworth Sleepiness Scale, OLPSSM Oral Lichen Planus Symptom Severity Measure, LP Lichen planus, SD Standard deviation, MLP Mucosal lichen planus, CLP Cutaneous lichen planus, LPP Lichen planopilaris, IGA Investigator Global Assessment, PGI-S Patient Global Impression of Severity, OLP Oral lichen planus
aIGA Reference: 0 = clear, 1 = minimal, 2 = mild group; IGA Severe: 3 = moderate, 4 = severe group. PGI-S Reference: PGI-S none or mild group; PGI-S Severe: moderate, severe or very severe
bIndicates that one of the pre-defined groups had a sample size < 10
Ability to Detect Change
Within-group effect sizes [37] and between-group one-way ANOVA F-tests were calculated to evaluate the magnitude and significance of the differences in change scores between each group (improved/worsened versus stable participants) (Table 14).
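The two statistics named above can be sketched as follows. The within-group effect size is assumed here to be the mean change divided by the baseline standard deviation, which is one common definition and may not be the one used in the study; the ANOVA function returns only the F statistic, without a p value. All names and data are hypothetical.

```python
from statistics import mean, stdev


def within_group_effect_size(baseline, followup):
    """Mean change divided by the baseline SD (assumed definition).

    Negative values indicate improvement when lower scores are better.
    """
    change = [f - b for b, f in zip(baseline, followup)]
    return mean(change) / stdev(baseline)


def one_way_anova_f(groups):
    """F statistic comparing mean change scores across anchor-defined groups."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical baseline and follow-up scores for an "improved" group.
improved_baseline = [10, 12, 14]
improved_followup = [4, 8, 9]
effect_size = within_group_effect_size(improved_baseline, improved_followup)

# Hypothetical change scores by anchor-defined group.
improved_change = [-6, -4, -5]
stable_change = [0, -1, 1]
worsened_change = [3, 5, 4]
f_statistic = one_way_anova_f([improved_change, stable_change, worsened_change])

print(round(effect_size, 2), round(f_statistic, 1))
```

A measure that detects change well should show a near-zero effect size in the stable group, larger effect sizes in the improved and worsened groups, and a large F statistic across the three groups.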
Table 14.
SD Standard deviation; Min Minimum, Max Maximum, DLQI Dermatology Life Quality Index, PGI-S Patient Global Impression of Severity, PGI-C Patient Global Impression of Change, ESS Epworth Sleepiness Scale, OLPSSM Oral Lichen Planus Symptom Severity Measure
aChange between baseline and Week 16
bNegative values indicate improvement
For the DLQI total score, change scores between groups were statistically significant for both the PGI-S and PGI-C. For the ESS, small effect sizes were observed for all groups in the PGI-S and PGI-C and in the improved group for the DLQI Item 1. However, effect sizes were either non-significant (DLQI Item 1), in an unexpected direction (PGI-S) or similar for the stable and improved/worsened groups (PGI-C), suggesting that the ESS has limited ability to detect change in these anchor measures. For the OLPSSM, both the PGI-S and PGI-C showed a statistically significant difference between groups; however, statistical significance was not achieved for the DLQI Item 1.
For the Scalpdex total score, small effect sizes were found across the three groups for the DLQI, PGI-S and PGI-C, except for a moderate effect in the stable group for the DLQI. The DLQI and the PGI-C demonstrated some evidence of ability to detect change. The Scalpdex ‘Symptoms’ score had a large effect size for improved groups in all measures (DLQI Item 1, PGI-S, PGI-C). Worsened groups demonstrated a large effect size in the DLQI Item 1, a moderate effect size in the PGI-C and a small effect size in the PGI-S. A small effect size was observed for the stable groups in all three anchor measures. Change scores between groups were statistically significant for the DLQI Item 1 and PGI-C but not for PGI-S; however, the PGI-S p value may have been impacted by the low sample size for the worsened group. For the Scalpdex ‘Functioning’ score, all groups (improved, worsened, stable) had small effect sizes in all three anchors (DLQI Item 1, PGI-S, PGI-C). The only statistically significant difference between groups was for the PGI-S. For the Scalpdex ‘Emotions’ score, the DLQI Item 1 had a statistically significant change between groups with a small effect size reported for the improved and stable groups and a moderate effect size for the worsened group. The PGI-S and PGI-C had small effect sizes for all groups, with the change scores between groups being statistically significant for the PGI-S and not statistically significant for the PGI-C.
Qualitative Phase
DLQI
The DLQI was cognitively debriefed with all exit interview participants (n = 13). Individual items did not perform well in terms of relevance; i.e., most items (n = 8/10, 80%) were considered relevant to fewer than half of participants. The least relevant items were Item 6 (‘Over the last week, how much has your skin made it difficult for you to do any sport?’) and Item 7 (‘Over the last week, has your skin prevented you from working or studying?’) (n = 1/13, 7.7% per item). The most relevant items were Item 1 (‘Over the last week, how itchy, sore, painful or stinging has your skin been?’; n = 11/11, 100.0%) and Item 2 (‘Over the last week, how embarrassed or self-conscious have you been because of your skin?’; n = 9/13, 69.2%), both of which are included in the DLQI ‘Symptoms and feelings’ domain.
ESS
The ESS was cognitively debriefed with a total of 49 participants (CLP participants during the exit interviews: n = 4, all participants during the independent interviews: n = 45). Relevance was mixed, with just over half of items (n = 5/8, 62.5%) being considered relevant to at least half of participants. The item that demonstrated the highest relevance was Item 5 (‘Lying down to rest in the afternoon when circumstances permit’; n = 43/49, 87.8%). Item 6 (‘Sitting and talking to someone’; n = 8/48, 16.7%) demonstrated the lowest relevance. Some participants were also asked additional probes about sleepiness with almost all participants reporting never feeling sleepy because of LP (n = 19/20, 95%) and most participants reporting never dozing off or falling asleep due to LP (n = 11/13, 84.6%).
Scalpdex
The Scalpdex was cognitively debriefed with a total of 19 LPP participants (exit interviews: n = 4, independent interviews: n = 15). Relevance was high, with almost all items (n = 21/23, 91.3%) being considered relevant to at least half of participants. The most relevant items were Item 3 (‘My scalp itches’), Item 6 (‘I am frustrated by my scalp condition’) and Item 9 (‘I am annoyed by my scalp condition’) (n = 18/19, 94.7% per item). The least relevant item was Item 15 (‘My scalp condition affects the color of clothes I wear’; n = 7/19, 36.8%).
OLPSSM
The OLPSSM was cognitively debriefed with MLP participants with oral involvement during the exit interviews (n = 5). Just over half of the items (n = 4/7, 57.1%) were considered relevant to at least half of participants. Almost all participants considered Item 4 (‘When you smiled?’; n = 4/5, 80.0%) and Item 6 (‘When you talked?’; n = 4/5, 80.0%) relevant to their experience of MLP, while Item 5 (‘When you breathed through your mouth?’; n = 2/5, 40.0%) was considered least relevant.
Of note, participant quotes to support the qualitative results are presented in Supplementary Material.
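The relevance figures reported in this phase are simple proportions, and the tallying can be sketched as below. The counts shown are the Scalpdex item counts quoted above; the function name and data layout are hypothetical, and the "at least half" criterion follows the wording used in the text.

```python
def relevance_summary(counts):
    """Summarise item relevance from cognitive debriefing counts.

    counts: dict mapping an item label to (n_relevant, n_asked).
    Returns the percentage relevant per item and the list of items
    considered relevant by at least half of the participants asked.
    """
    percentages = {
        item: round(100 * relevant / asked, 1)
        for item, (relevant, asked) in counts.items()
    }
    majority_items = [
        item for item, (relevant, asked) in counts.items()
        if relevant / asked >= 0.5
    ]
    return percentages, majority_items


# Counts for four Scalpdex items as reported in the text (n relevant / n asked).
scalpdex_counts = {
    "Item 3": (18, 19),
    "Item 6": (18, 19),
    "Item 9": (18, 19),
    "Item 15": (7, 19),
}
percentages, majority_items = relevance_summary(scalpdex_counts)
print(percentages)
print(majority_items)
```

Keeping the denominator per item is important because not every participant was asked about every item, as the differing denominators in the DLQI and ESS results show.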
Discussion
There are limited disease-specific PROMs that assess HRQoL in LP patients and a scarcity of psychometric evidence for the use of generic HRQoL PROMs in this population. The analyses described in this study evaluated the content validity and psychometric properties of the DLQI, ESS, Scalpdex and OLPSSM to assess appropriateness of use in clinical trials with LP patients. Importantly, the mixed-methods approach adopted allows for the patient voice to be represented not only in this study but also in future clinical study designs, as recommended by the PFDD guidance documents [28–31], and follows the FDA recommendation for an evidence-based rationale when proposing a clinical outcome assessment (COA) as fit for purpose [30]. Specifically, the approach adopted allowed for the assessment of whether the PROMs capture all important aspects of the concept of interest; whether the method of scoring is appropriate and sufficiently sensitive to reflect clinically meaningful change within the context of use; whether respondents understand the items as intended; whether differences in scores can be interpreted in terms of impact on patients’ experience; and whether scores correspond to specific health experiences of patients [30]. The study also included exit interviews, which the FDA has noted as a valuable tool to contribute cumulative evidence on aspects of the patient experience; inform development or refinement of COAs; add greater depth to data in diseases, such as LP, that do not have much qualitative patient input; and obtain patient input on meaningful outcomes [29].
While the DLQI is one of the most widely used PROMs in multiple dermatological indications and has also been commonly used with LP patients [17], content and psychometric evidence of its appropriateness in LP patients for usage in clinical studies is limited [21]. On the one hand, the current study supports the use of the DLQI in LP patients, as findings provide strong evidence of reliability and construct validity. The DLQI domain ‘Symptoms and feelings’ performed particularly well. On the other hand, the psychometric data do not confidently support that the DLQI can detect change over time in the specific context of use for adults with LP, and high inter-item correlations between some items suggest potential redundancies. The qualitative interview data further suggest that patients did not consider most items relevant to their disease experience of LP. Given the modular nature of the DLQI, the study data support the use of the ‘Symptoms and feelings’ domain as an independent module with LP patients, where necessary and appropriate.
Even though the ESS demonstrated evidence of reliability in other populations, convergent validity was poor in this study. Furthermore, known-group comparisons showed evidence of the ESS’ ability to discriminate between groups for the PGI-S but not the IGA; ability to detect change was limited or null. These findings suggest that the ESS may not be appropriate for use in clinical trials with LP patients. This is supported by the qualitative findings where most participants reported that they never felt sleepy or wanted to fall asleep because of their LP, although some patients did spontaneously report sleep-related impacts, such as sleep disturbance (i.e., sleep quality and/or sleep quantity). It is suggested that measures that assess sleep rather than daytime sleepiness should be used in clinical studies with LP patients. However, further research is needed to ascertain whether sleep is a meaningful and important concept of LP, as data are scarce [20].
The Scalpdex performed relatively well when psychometrically evaluated in the study’s LPP patient sample, demonstrating evidence of internal consistency, test-retest reliability and convergent validity (although with only weak correlations with PGI-C and IGA). Evidence of its ability to differentiate between known groups and to detect change was mixed. Not all items may be appropriate for use with LPP patients. For example, inter-item correlations for Item 19 and Item 20 were much weaker than those for the rest of the items, while Item 15 demonstrated weak correlations with the other ‘Function’ domain items. Item 8 had very weak correlations overall, including with other ‘Symptoms’ domain items, which is particularly concerning as the ‘Symptoms’ domain consisted of only three items. These findings are not surprising as the Scalpdex was originally developed with patients with seborrheic dermatitis and scalp psoriasis [18]. Clinical characteristics present in these patients, such as desquamation and bleeding [23], may not be relevant to LPP patients. This finding is supported by the qualitative CD interviews and the original Scalpdex development study, in which the impact of desquamation, as assessed via Item 15, was reported as not relevant by a high percentage of patients [18]. Based on the study findings, it is suggested that the Scalpdex may be used with caution with LPP patients and that further evidence is needed when it is used in clinical trials. A potential further limitation of the Scalpdex is its length: its 23 items might be viewed as burdensome by many patients, particularly if some items are deemed not relevant. Similar to the DLQI, the Scalpdex ‘Symptoms’ domain performed better than the measure as a whole, but caution should be taken if the acceptable performance of the total score is driven purely by the ‘Symptoms’ domain items.
Lastly, the OLPSSM, as psychometrically evaluated in MLP patients with oral involvement, had evidence of good reliability, construct validity and ability to detect change over time (PGI-S and PGI-C). It is not surprising that the OLPSSM performed well as it was designed specifically for patients with oral lichen planus and has been previously used within similar populations [8, 38]. However, despite the psychometric validity of this measure, it is worth noting that not all items may be relevant to all patients with oral involvement. For example, Item 4 and Item 5 have been noted in the literature and supported by the qualitative interviews in the current study as triggers least likely to cause soreness and are associated more with patients with severe OLP [8]. Furthermore, inter-item correlations between Item 1 and Item 6 were weak, suggesting that these two items might measure dissimilar concepts while correlations between Item 2 and Item 5 were very high, suggesting potential redundancy. Lastly, the OLPSSM is limited in its use to patients with oral involvement [8, 38], leaving a gap for other LP patients. Overall, the data suggest that the OLPSSM is a valid HRQoL PROM for use with patients with OLP.
Study Limitations
Given the relatively small sample size of some LP cohorts in the current study, particularly for the OLPSSM and Scalpdex, future research with a larger sample is recommended to strengthen the findings. Further research is also recommended to review other existing HRQoL measures that may be used in LP patients.
Conclusion
The results of our study contribute to the literature by providing novel insights into the appropriateness of existing PROMs commonly used with LP patients. Our study further highlights the need for additional psychometric evaluation and qualitative evidence to assess whether PROMs under consideration are “fit for purpose” for use in future LP clinical studies and to support the development of additional LP-specific HRQoL PROMs.
Acknowledgements
The authors thank the participants of this study.
Medical Writing, Editorial Assistance and Other Assistance
Charlotte Panter, Molly Clifford, Carl Cooper and Sophi Tatlock contributed to the study design and study conduct. Jan Stochl and Mike Greenwood contributed to the conduct and review of psychometric analyses. All contributors, except Carl Cooper, are employees of Adelphi Values Ltd. Carl Cooper was an employee of Adelphi Values Ltd. at the time the work was performed.
Author Contributions
Adelphi Values authors (Aoife Mahon-Smith, George Skingley, Lara Ayala-Nunes, Anjali Batish, Rosie Sharp) contributed to the study design, data collection, interpretation of data and preparation of the manuscript for publication. All sponsor authors (Christel Naujoks, Nicolò Compagno, Eva Schruf and Santiago G. Moreno) contributed to defining the scope of the research, including study design, and interpretation of study results in the manuscript.
Funding
Adelphi Values Ltd. were commissioned by Novartis Pharma AG to conduct this research and the sponsor contributed to the study design, data collection and preparation of the manuscript for publication. The sponsor is funding the journal’s rapid service fees.
Data Availability
The datasets generated and/or analyzed during the current study are not publicly available in order to protect participant confidentiality.
Declarations
Conflict of Interest
Christel Naujoks and Santiago G. Moreno are employees of Novartis Pharma AG at the time of performing the research and preparing the manuscript. Eva Schruf is an employee of Novartis Pharma GmbH at the time of performing the research and preparing the manuscript. Nicolò Compagno was an employee of Novartis Pharma AG at the time of performing the research and preparing the manuscript and is now an employee of F. Hoffmann-La Roche Ltd, Basel, Switzerland. Aoife Mahon-Smith, Lara Ayala-Nunes, George Skingley and Rosie Sharp are employees of Adelphi Values Ltd., a health outcomes agency commissioned by Novartis Pharma AG to conduct this research. Anjali Batish was an employee of Adelphi Values at the time of performing the research and is now an employee of the National Institute for Health and Care Research (NIHR). The authors declare that there are no competing interests.
Ethical Approval
All participants provided informed consent indicating that their data would be used for medical research purposes and that the study results may be published. The studies were performed in accordance with the Helsinki Declaration of 1964 and its later amendments. Ethical approval and oversight for the clinical study, including exit interviews, was obtained as part of clinical study procedures (clinicaltrials.gov ID: NCT04300296, EUDRACT: 2019-003588-24). The Western Copernicus Group Independent Review Board (WCG IRB), a centralized IRB, provided ethical approval and oversight to conduct the independent qualitative interviews in the US and Germany.
References
- 1. Le Cleach L, Chosidow O. Lichen planus. N Engl J Med. 2012;366(8):723–732. doi: 10.1056/NEJMcp1103641.
- 2. Weston G, Payette M. Update on lichen planus and its clinical variants. Int J Women's Dermatol. 2015;1(3):140–149. doi: 10.1016/j.ijwd.2015.04.001.
- 3. Thandar Y, Maharajh R, Haffejee F, Mosam A. Treatment of cutaneous lichen planus (Part 1): a review of topical therapies and phototherapy. Cogent Med. 2019;6(1):1582467. doi: 10.1080/2331205X.2019.1582467.
- 4. Gorouhi F, Davari P, Fazel N. Cutaneous and mucosal lichen planus: a comprehensive review of clinical subtypes, risk factors, diagnosis, and prognosis. Sci World J. 2014;2014:742826. doi: 10.1155/2014/742826.
- 5. Assouly P, Reygagne P. Lichen planopilaris: update on diagnosis and treatment. Semin Cutan Med Surg. 2009;28(1):3–10. doi: 10.1016/j.sder.2008.12.006.
- 6. Ingafou M, Leao JC, Porter SR, Scully C. Oral lichen planus: a retrospective study of 690 British patients. Oral Dis. 2006;12(5):463–468. doi: 10.1111/j.1601-0825.2005.01221.x.
- 7. Usatine RP, Tinitigan M. Diagnosis and treatment of lichen planus. Am Fam Physician. 2011;84(1):53–60.
- 8. Burke LB, Brennan MT, Ni Riordain R, Madsen LS. Novel oral lichen planus symptom severity measure for assessing patients’ daily symptom experience. Oral Dis. 2019;25(6):1564–1572. doi: 10.1111/odi.13109.
- 9. Cassol-Spanemberg J, Blanco-Carrión A, Rodríguez-de Rivera-Campillo ME, Estrugo-Devesa A, Jané-Salas E, López-López J. Cutaneous, genital and oral lichen planus: a descriptive study of 274 patients. Med Oral Patol Oral Cir Bucal. 2019;24(1):e1–e7. doi: 10.4317/medoral.22656.
- 10. Lepe K, Nassereddin A, Salazar FJ. Lichen Planopilaris. StatPearls [Internet]. 2021. https://www.ncbi.nlm.nih.gov/books/NBK470325/.
- 11. Tadakamadla J, Kumar S, Lalloo R, Johnson NW. Qualitative analysis of the impact of oral potentially malignant disorders on daily life activities. PLoS ONE. 2017;12(4):e0175531. doi: 10.1371/journal.pone.0175531.
- 12. Hsu D-Y, Chien W-C, Chung C-H, et al. Risk of anxiety and depression in patients with lichen planus: a nationwide population-based study. J Affect Disord. 2022;300:255–262. doi: 10.1016/j.jad.2021.12.127.
- 13. Nassab A, Navabi N, Pour M, Charrosta N, Hashemipour M. Quality of life in patients with chronic oral mucosal conditions: a qualitative research. Pesqui Bras Odontopediatria Clín Integrada. 2021;21:e0092. doi: 10.1590/pboci.2021.018.
- 14. López-Jornet P, Camacho-Alonso F. Quality of life in patients with oral lichen planus. J Eval Clin Pract. 2010;16(1):111–113. doi: 10.1111/j.1365-2753.2009.01124.x.
- 15. Nasimi M, Ahangari N, Lajevardi V, Mahmoudi H, Ghodsi SZ, Etesami I. Quality of life and mental health status in patients with lichen planopilaris based on dermatology life quality index and general health questionnaire-28 questionnaires. Int J Women's Dermatol. 2020;6(5):399–403. doi: 10.1016/j.ijwd.2020.09.005.
- 16. US Food and Drug Administration. Patient-Reported Outcome Measures: Use in medical product development to support labeling claims. Silver Spring, MD; 2009.
- 17. Finlay AY, Khan GK. Dermatology life quality index (DLQI)–a simple practical measure for routine clinical use. Clin Exp Dermatol. 1994;19(3):210–216. doi: 10.1111/j.1365-2230.1994.tb01167.x.
- 18. Chen SC, Yeung J, Chren MM. Scalpdex: a quality-of-life instrument for scalp dermatitis. Arch Dermatol. 2002;138(6):803–807. doi: 10.1001/archderm.138.6.803.
- 19. Johns MW. A new method for measuring daytime sleepiness: the Epworth sleepiness scale. Sleep. 1991;14(6):540–545. doi: 10.1093/sleep/14.6.540.
- 20. Adamo D, Ruoppo E, Leuci S, Aria M, Amato M, Mignogna MD. Sleep disturbances, anxiety and depression in patients with oral lichen planus: a case–control study. J Eur Acad Dermatol Venereol. 2015;29(2):291–297. doi: 10.1111/jdv.12525.
- 21. Balcı DD, İnandı T. Dermatology life quality index scores in lichen planus: comparison of psoriasis and healthy controls. Turkderm-Turk Arch Dermatol Venereol. 2008;42(4):127–130.
- 22. Fiocco Z, Kupf S, Patzak L, et al. Quality of life and psychopathology in lichen planus: a neglected disease burden. Acta Derm Venereol. 2021;101(12):adv00619. doi: 10.2340/actadv.v101.442.
- 23. Sampogna F, Linder D, Piaserico S, et al. Quality of life assessment of patients with scalp dermatitis using the Italian version of the Scalpdex. Acta Derm Venereol. 2014;94(4):411–414. doi: 10.2340/00015555-1731.
- 24. Chernyshov PV. The evolution of quality of life assessment and use in dermatology. Dermatology. 2019;235(3):167–174. doi: 10.1159/000496923.
- 25. Wiriyakijja P, Fedele S, Porter SR, Mercadante V, Ni Riordain R. Patient-reported outcome measures in oral lichen planus: a comprehensive review of the literature with focus on psychometric properties and interpretability. J Oral Pathol Med. 2018;47(3):228–239. doi: 10.1111/jop.12604.
- 26. Ni Riordain R, Hodgson T, Porter S, Fedele S. Validity and reliability of the chronic oral mucosal diseases questionnaire in a UK population. J Oral Pathol Med. 2016;45(8):613–616. doi: 10.1111/jop.12425.
- 27. US Food and Drug Administration. Patient-Focused Drug Development: Methods to identify what is important to patients guidance for industry, food and drug administration staff, and other stakeholders. Silver Spring, MD. 2022.
- 28. US Food and Drug Administration. Patient-Focused Drug Development: Collecting comprehensive and representative input - guidance for industry, food and drug administration staff, and other stakeholders. 2020.
- 29. US Food and Drug Administration. Patient-Focused Drug Development: Methods to identify what is important to patients, guidance for industry, food and drug administration staff, and other stakeholders (Guidance 2). 2022.
- 30. US Food and Drug Administration. Patient-Focused Drug Development: Selecting, developing, or modifying fit-for-purpose clinical outcome assessments (draft guidance). 2022.
- 31. US Food and Drug Administration. Guidance for industry: Patient-Reported Outcome Measures: Use in medical product development to support labeling claims. 2009.
- 32. Willis GB. Cognitive interviewing: a tool for improving questionnaire design. Sage Publications; 2004.
- 33. ATLAS.ti Scientific Software Development GmbH [ATLAS.ti 22 Windows]. Retrieved from https://atlasti.com [computer program]. 2022.
- 34. Hsieh H, Shannon S. Three approaches to qualitative content analysis. Qual Health Res. 2005;15(9):1277–1288. doi: 10.1177/1049732305276687.
- 35. Laga AC, Haefner HK, Granter SR. Diagnostic gynecologic and obstetric pathology. Elsevier; 2018. Noninfectious inflammatory disorders of the vulva; pp. 22–52.
- 36. Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155–163. doi: 10.1016/j.jcm.2016.02.012.
- 37. Kazis LE, Anderson JJ, Meenan RF. Effect sizes for interpreting changes in health status. Med Care. 1989;27(3 Suppl):S178–189. doi: 10.1097/00005650-198903001-00015.
- 38. Brennan MT, Madsen LS, Saunders DP, et al. Efficacy and safety of a novel mucoadhesive clobetasol patch for treatment of erosive oral lichen planus: a phase 2 randomized clinical trial. J Oral Pathol Med. 2022;51(1):86–97. doi: 10.1111/jop.13270.
- 39. Nunnally J, Bernstein I. Psychometric theory. 3rd ed. New York: McGraw-Hill Inc.; 1994.
- 40. Cohen J. Statistical power analysis for the behavioral sciences. Routledge; 2013.
- 41. Hedges LV. Distribution theory for Glass's estimator of effect size and related estimators. J Educ Stat. 1981;6(2):107–128. doi: 10.3102/10769986006002107.