Abstract
Objectives
Physical function is a core outcome in PsA. We examined the construct validity and responsiveness of three commonly used instruments to assess physical function in PsA: HAQ disability index (HAQ-DI), MultiDimensional HAQ (MDHAQ) and the Patient-Reported Outcomes Measurement Information System (PROMIS®) Global-10.
Methods
Between 2016 and 2019, patients with PsA were enrolled in the Psoriatic Arthritis Research Consortium longitudinal cohort study in the USA. Correlations were calculated at baseline and among change scores using Spearman’s correlation coefficient. Standardized response means were calculated. Agreement with the 20% improvement cut-off was used to determine the potential effect of using MDHAQ or the PROMIS Global-10 physical health (GPH) subscore in place of HAQ-DI when assessing the ACR20.
Results
A total of 274 patients were included in the analysis. The mean age of patients was 49 years and 51% were male. At baseline, the mean HAQ-DI was 0.6 (s.d. 0.6; range 0–3), the mean MDHAQ was 1.8 (s.d. 1.6; range 0–10) and the mean GPH T-score was 43.4 (s.d. 9.3; range 0–100). All three instruments were strongly correlated at baseline (rho 0.75–0.85). Change scores were moderately correlated (rho 0.42–0.71). Among therapy initiators, the mean change between two visits in HAQ-DI, MDHAQ and GPH was −0.1 (s.d. 0.4), −0.2 (s.d. 1.2) and 2.5 (s.d. 6.1), respectively. The standardized response means were 0.18, 0.16 and 0.41, respectively.
Conclusion
The three instruments tested are not directly interchangeable but have overall similar levels of responsiveness.
Keywords: patient-reported outcomes, PsA
Rheumatology key messages
Multiple patient-reported outcomes can assess physical function in PsA.
In this study, we assessed three physical function instruments in PsA and found that they have similar levels of responsiveness to changes in disease states.
While the HAQ – Disability Index is the most commonly used instrument for physical function in trials, simpler instruments (PROMIS10 global physical health and MDHAQ) may be more practical for routine clinical practice and longitudinal cohort studies.
Introduction
PsA is a heterogeneous, inflammatory musculoskeletal disease that affects 10–30% of patients with psoriasis [1]. While physicians often focus on the clinical manifestations of PsA (inflammatory arthritis, skin psoriasis, dactylitis, enthesitis, spondylitis and nail disease), PsA has a broad influence on how patients live their lives, in particular, how they carry out day-to-day activities [1]. PsA has a significant impact on patients’ physical function, fatigue, mood, social engagement and quality of life [2]. Therefore, understanding patient perceptions of health and function is necessary for guiding clinical decision-making. This underscores the need for patient-reported outcomes (PROs) in the assessment of disease impact in the clinic and in clinical trials.
Physical function is one of the core measures of patient-reported disease impact captured in randomized controlled trials (RCTs), longitudinal observational studies and in clinical practice [3]. Assessment of physical function is part of the Group for Research and Assessment of Psoriasis and PsA OMERACT Core Outcome Set for PsA [4, 5]. Although originally designed for RA, the ACR Response Criteria, 20% improvement threshold (ACR20) is the primary outcome of PsA RCTs [6]. To achieve an ACR20 response, both the tender joint count (TJC) and swollen joint count (SJC) must improve by ≥20%, and three of five additional criteria (patient global assessment, physician global assessment, physical function and disability, visual analogue pain scale and CRP) must also improve by ≥20% [7]. In ACR20, the HAQ disability index (HAQ-DI) is the instrument used to measure physical function and disability. Moreover, the HAQ-DI is the instrument that is most commonly used in PsA RCTs [8].
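The ACR20 rule described above can be sketched as a short function. This is a minimal illustration, assuming that every measure is scored so that lower values are better and that improvement is computed as (baseline − follow-up) / baseline. The `Visit` container, its field names and the handling of a zero baseline are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """Hypothetical container for the seven ACR core measures (lower = better)."""
    tjc: float               # tender joint count
    sjc: float               # swollen joint count
    patient_global: float    # patient global assessment
    physician_global: float  # physician global assessment
    function: float          # physical function and disability, e.g. HAQ-DI
    pain: float              # visual analogue pain scale
    crp: float               # C-reactive protein


def pct_improvement(baseline: float, follow_up: float) -> float:
    """Relative improvement for a lower-is-better measure."""
    if baseline <= 0:
        # Assumption: no improvement is credited when the baseline is zero.
        return 0.0
    return (baseline - follow_up) / baseline


def acr_response(base: Visit, follow: Visit, threshold: float = 0.20) -> bool:
    """True if the stem (TJC and SJC) and at least 3 of 5 other criteria improve."""
    stem_met = (
        pct_improvement(base.tjc, follow.tjc) >= threshold
        and pct_improvement(base.sjc, follow.sjc) >= threshold
    )
    if not stem_met:
        return False
    additional = ["patient_global", "physician_global", "function", "pain", "crp"]
    n_improved = sum(
        pct_improvement(getattr(base, name), getattr(follow, name)) >= threshold
        for name in additional
    )
    return n_improved >= 3
```

Because the function criterion is only one of five additional items, swapping the instrument behind `function` changes the overall response only when it tips the count across three.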
The HAQ-DI, while commonly used in PsA clinical trials, is less frequently used in clinical care because it is relatively long (16 items) and includes potentially outdated disability questions (i.e. use of assistive devices) that may not be as relevant to the majority of patients with PsA in 2020. Additionally, scoring of the HAQ-DI is complex; the highest score of each subset of questions is used for that category, and assistive devices are incorporated into the scoring. Conversely, the Routine Assessment of Patient Index Data is a widely recognized, validated and feasible PRO measure [9, 10]. It is the most commonly used tool in clinical practice in the USA across inflammatory diseases and measures physical function, pain and a global assessment of overall health [11–15]. The first 10 items of the Routine Assessment of Patient Index Data are the MultiDimensional HAQ (MDHAQ), which has been used in PsA [14, 16]. The MDHAQ is derived in part from the HAQ-DI and measures similar constructs, but has fewer questions and is more easily scored.
In addition to these two commonly used instruments in inflammatory arthritis, newer instruments of physical function and physical health are increasingly used in the USA. The Patient-Reported Outcomes Measurement Information System (PROMIS®) is a set of PROs developed by the National Institutes of Health. The PROMIS global 10-item short form (PROMIS10) is a static 10-item global assessment that assesses physical health, pain, fatigue, mental health, social health and general health. The score from the 10 items is broken into two subscores: a global mental health (GMH) subscore and a global physical health (GPH) subscore [17]. PROMIS10 is increasingly being used across diseases, departments and health systems in the USA [18]. While these instruments are free, comparable across patient populations, have excellent psychometric properties and are relatively easy to score, the construct validity and responsiveness of these instruments have not been examined in PsA [14, 19, 20].
In this study, we aimed to compare the HAQ-DI, MDHAQ and GPH instruments. Ultimately, understanding the performance and interchangeability of these instruments will help in selecting the appropriate outcomes measurement set for RCTs, pragmatic trials, longitudinal observational studies and clinical practice. We hypothesized that the three instruments would be strongly correlated at baseline, have similar responsiveness, and would be interchangeable within the ACR20 response criteria. The objectives of this study were: (i) to examine the correlation between the baseline MDHAQ and GPH with HAQ-DI (construct validity), (ii) to measure the correlation between the change scores of MDHAQ, GPH and HAQ-DI (longitudinal construct validity) and the responsiveness of all three instruments and (iii) to assess whether the MDHAQ or GPH can substitute for the HAQ-DI in the ACR20 response criteria.
Methods
Patient population
Between 2016 and 2019, patients with PsA were enrolled in the Psoriatic Arthritis Research Consortium (PARC). PARC is a longitudinal observational cohort study in PsA-dedicated clinics at four USA institutions: the University of Pennsylvania, Cleveland Clinic, New York University and the University of Utah [9]. For the outlined analyses, only patients meeting the Classification Criteria for Psoriatic Arthritis (CASPAR) who completed all three instruments were included. Use of HAQ-DI and PROMIS Global-10 began in 2016. Within this cohort, a subcohort of patients who were starting or switching therapy was identified.
Assessments
Patients enrolled in PARC completed a patient questionnaire with information about their demographics and health history (i.e. age, sex, disease duration, comorbid conditions, medication use, family history) as well as a series of PRO instruments [Routine Assessment of Patient Index Data, HAQ-DI, PROMIS Global-10, patient pain assessment, patient global assessment and PsA Impact of Disease Questionnaire (PSAID)]. Physicians likewise completed a form at each visit that included a comprehensive musculoskeletal examination (criteria for inflammatory back pain, 66/68 joint count, enthesitis count, dactylitis count, nail assessment, physician global assessments), a skin examination (psoriasis body surface area assessment, psoriasis physician global assessment) and therapy information, including treatment decisions made at the visit. The index date for this study was defined as the first time the patient completed the HAQ-DI, MDHAQ and GPH at the same visit. Follow-up visits occurred between week 12 and week 52. At the follow-up visit, patients completed a global assessment of response in which they rated their status as ‘improved’, ‘stayed the same’ or ‘worsened’ since the last visit and rated the level of improvement or worsening. The global assessment of response was used as an external anchor in these analyses.
Scoring of instruments
Scoring of the MDHAQ and HAQ-DI has been previously described [21, 22]. Baseline patient characteristics were descriptively reported [9]. Raw scores of the MDHAQ physical function scale range from 0 to 10. The total HAQ-DI score ranges from 0 to 3, and normal scores are 0.5 or lower. PROMIS10 is a global assessment measure that is broken into global mental health and GPH subscores using published algorithms that are then converted to T-scores [17, 23–25]. The physical health subscore is a sum of four items on the 10-item questionnaire; the sum is then directly converted to a T-score using a chart. A T-score of 50 represents the average person in the general population. The range of the T-scores is 0–100, where higher scores (i.e. >50) represent greater functional ability than the average person in the general population; likewise, scores <50 represent poorer physical function [25]. (Thus, the direction of improvement of GPH differs from that of HAQ-DI and MDHAQ).
Hypotheses
A priori, we created a list of hypotheses to be tested to support construct validity, examining relationships between the physical function instruments and other similar and dissimilar constructs. We hypothesized that the MDHAQ and HAQ-DI would have strong correlation at baseline (rho > 0.8), because the MDHAQ was derived from the HAQ-DI. We also hypothesized that there would be moderate-to-strong correlation between both MDHAQ and HAQ-DI and the MDHAQ and PROMIS10 GPH subscore (rho 0.6–1.0), because all these instruments are measuring the same outcome. We hypothesized similar moderate correlation among the change scores (rho 0.6–0.8). We hypothesized weak-to-moderate correlation (rho < 0.6) between the physical function scores and patient pain score, patient global score and PSAID Skin (a measure of the patient’s perception of their skin disease severity). We hypothesized a moderate correlation between the physical function scores and the PSAID function item. Hypotheses for construct validity (and whether the hypotheses were met) are summarized in Supplementary Table S1, available at Rheumatology online. Fulfilling at least 75% of the hypotheses is considered acceptable [26]. For responsiveness, we hypothesized the standardized response means (SRMs) would be similar for the three PROs. Based on previous studies of the HAQ-DI, we hypothesized that the responsiveness using this index would be low-to-moderate (SRM 0.3–0.5) because of the relatively low functional impairment of the patients in our study compared with that in prior studies that enrolled patients in RCTs, assuming that treatment effect sizes are lower in clinical practice than in clinical trials [27].
Statistical analysis
Correlations were calculated for the total scores using Spearman’s correlation coefficient at baseline and follow-up visits. Change scores were calculated as the score at visit 2 minus the score at visit 1. Spearman’s correlation coefficients were calculated to assess the relationship between the change scores. Spearman’s correlation is typically used for populations that do not have a normal distribution [28]. Standardized response means (SRMs) were calculated as the mean change score divided by the s.d. of the change scores [29]. Correlation coefficients for baseline and change scores were also calculated for subgroups of the population with polyarticular arthritis (i.e. TJC ≥ 4 and SJC ≥ 4, which is more similar to the scores for patients seen in clinical trials), moderate-to-high disease activity at baseline as measured by the Clinical Disease Activity Index for PsA (as instruments may have different psychometric properties in patients with higher disease activity at baseline), and among treatment initiators (those expected to improve). These analyses aimed to determine whether correlations were changed by the subgroup.
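The change-score correlation described above can be sketched as follows. This is a minimal illustration, assuming paired scores for the same patients at two visits; the function names are illustrative, and it computes Spearman’s rho as a Pearson correlation on average ranks rather than reproducing the software used in the study.

```python
from statistics import mean


def change_scores(visit1, visit2):
    """Change score per patient: score at visit 2 minus score at visit 1."""
    return [b - a for a, b in zip(visit1, visit2)]


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks of x and y.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Because GPH improves upward while HAQ-DI and MDHAQ improve downward, a negative rho between a GPH change score and either of the other two indicates agreement in the direction of change. In practice a library routine such as `scipy.stats.spearmanr` would usually be preferred, since it also returns a P-value.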
SRMs were calculated for three groups: (i) for all patients initiating therapy (as therapy initiators are expected to improve), (ii) for patients reporting improvement using the global assessment of response and (iii) for patients who met the stem of the ACR20 criteria (i.e. a 20% improvement in both SJC and TJC).
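The SRM calculation can be sketched as below. This is a minimal sketch under one stated assumption: the result is signed so that improvement is positive, which is inferred from the positive SRMs reported alongside negative HAQ-DI and MDHAQ changes and is not described explicitly in the text. The function and parameter names are illustrative.

```python
from statistics import mean, stdev


def standardized_response_mean(changes, higher_is_better=False):
    """SRM = mean(change) / s.d.(change), signed so improvement is positive.

    `changes` are visit 2 minus visit 1 scores. For HAQ-DI and MDHAQ a
    decrease is an improvement, so the sign is flipped (assumption).
    """
    if len(changes) < 2:
        raise ValueError("at least two change scores are needed for an s.d.")
    sd = stdev(changes)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("SRM is undefined when all change scores are equal")
    srm = mean(changes) / sd
    return srm if higher_is_better else -srm


def srm_from_summary(mean_change, sd_change):
    """Magnitude of the SRM from published summary statistics."""
    return abs(mean_change) / sd_change
```

As a check against Table 3, `srm_from_summary(2.48, 6.06)` is about 0.41, matching the SRM reported for GPH among therapy initiators. Values recomputed from rounded summaries can differ from the published SRMs in the second decimal place.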
Finally, we examined the agreement (percentage agreement and kappa statistics) among the three instruments after applying the 20% improvement cut-off to determine the potential effect of using MDHAQ or the GPH scores in place of HAQ-DI when assessing the ACR20. We examined the agreement for all patients initiating therapy and for the subset who met the stem of the ACR20 response criteria. Patients must first achieve a 20% improvement in both TJC and SJC (the stem of the ACR20 criteria) in order for a 20% improvement in the HAQ-DI to ‘count’ as a part of the criteria. Thus, analysing agreement in this setting would provide support for interchangeability.
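The agreement analysis can be sketched as follows: dichotomize each instrument at a 20% improvement, then compute percentage agreement and Cohen’s kappa for each pair. This is a minimal sketch, assuming that the 20% cut-off is applied as a relative change from baseline (including for the GPH T-score) and that a zero baseline cannot count as improved; neither detail is stated in the text, and the function names are illustrative.

```python
def improved_20pct(baseline, follow_up, higher_is_better=False):
    """True if the score improved by at least 20% relative to baseline."""
    if baseline == 0:
        return False  # assumption: relative improvement is undefined at zero
    relative_change = (follow_up - baseline) / baseline
    if higher_is_better:           # e.g. GPH T-score
        return relative_change >= 0.20
    return relative_change <= -0.20  # e.g. HAQ-DI, MDHAQ


def percent_agreement(a, b):
    """Proportion of patients classified the same way by both instruments."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa for two binary classifications of the same patients."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    p_a = sum(a) / n  # proportion improved on instrument A
    p_b = sum(b) / n  # proportion improved on instrument B
    # Agreement expected by chance: both improved or both not improved.
    p_expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)
```

Percentage agreement and kappa can diverge sharply when most patients fall into one category, because chance agreement is then high. This is one way a pair of instruments can show 80% agreement with a kappa near zero.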
This study was approved by the Institutional Review Boards at each institution and has been registered on ClinicalTrials.gov (NCT03378336).
Results
Baseline characteristics
A total of 412 patients were recruited for the PARC cohort. For patients who had completed all three instruments at the same visit (n = 274), the mean age was 49.3 (14.2) years old, and 51% were males. Most had relatively low disease activity, with mean SJC and TJC of 2.5 (out of 66) and 4.7 (out of 68), respectively (see Table 1 for additional baseline characteristics). At baseline, the mean (s.d.) for HAQ-DI was 0.6 (0.6) (range 0–3), the mean (s.d.) for MDHAQ was 1.8 (1.6) (range 0–10) and the mean (s.d.) GPH T-score was 43.4 (9.3) (range 0–100).
Table 1.
Baseline characteristics of patients (n = 274)
| Variable | N | Mean (s.d.) or N (%) |
|---|---|---|
| Age (years) | 274 | 49.3 (14.2) |
| BMI (kg/m2) | 270 | 30.5 (7.2) |
| Total tender joint count (0–68) | 273 | 4.7 (6.6) |
| Tender joint count ≥4 | 273 | 108 (39.6%) |
| Total swollen joint count (0–66) | 272 | 2.5 (4.7) |
| Swollen joint count ≥4 | 272 | 57 (21.0%) |
| Body surface area (%) | 264 | 2.4 (7.7) |
| Enthesitis | 269 | 70 (26.0%) |
| Dactylitis | 269 | 29 (10.8%) |
| Mean c-DAPSA (range 0–154) | 272 | 15.1 (14.0) |
| Minimal disease activity | 259 | 116 (44.8%) |
| c-DAPSA category | 272 | |
| Remission | | 64 (23.5%) |
| Low disease activity | | 92 (33.8%) |
| Moderate disease activity | | 83 (30.5%) |
| High disease activity | | 33 (12.1%) |
| HAQ-DI (range 0–3) | 274 | 0.6 (0.6) |
| Floor: HAQ-DI score 0 (minimum) | | 73 (26.6%) |
| Ceiling: HAQ-DI score 3 (maximum) | | 0 (0%) |
| MDHAQ (range 0–10) | 274 | 1.8 (1.6) |
| Floor: MDHAQ score 0 (minimum) | | 49 (17.9%) |
| Ceiling: MDHAQ score 10 (maximum) | | 0 (0%) |
| PROMIS10 global physical health (range 0–100) | 274 | 43.4 (9.3) |
| Floor: PROMIS10 global physical health score 0 (minimum) | | 0 (0%) |
| Ceiling: PROMIS10 global physical health score 100 (maximum) | | 0 (0%) |
Body surface area: body surface area affected by psoriasis; c-DAPSA: clinical disease activity index for PsA; HAQ-DI: HAQ disability index; MDHAQ: MultiDimensional HAQ physical function; PROMIS10: Patient-Reported Outcome Measure Information System.
Floor and ceiling effects at baseline
There was no ceiling effect for any of the three instruments (i.e. 0 patients had the maximum possible score on any instrument). There was no floor effect for PROMIS10 GPH. In contrast, there was a floor effect for the other two instruments: 26.6% had the minimum possible HAQ-DI score and 17.9% had the minimum possible MDHAQ score. Floor and ceiling effects are shown in Table 1, and score distributions are shown in Supplementary Fig. S1A–C, available at Rheumatology online.
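The floor and ceiling proportions reported above amount to counting patients at the extremes of each scale. A minimal sketch, with an illustrative function name:

```python
def floor_ceiling(scores, minimum, maximum):
    """Return the proportions of scores at the scale minimum and maximum."""
    n = len(scores)
    at_floor = sum(s == minimum for s in scores) / n
    at_ceiling = sum(s == maximum for s in scores) / n
    return at_floor, at_ceiling
```

With the counts from Table 1, 73 of 274 patients at the HAQ-DI minimum gives 26.6%, and 49 of 274 at the MDHAQ minimum gives 17.9%.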
Correlation among baseline scores
Fig. 1A–C shows the correlations among the instruments at baseline. The Spearman’s correlation coefficient comparing the HAQ-DI with the MDHAQ was 0.85 (P-value < 0.001) at baseline. Correlation between GPH and HAQ-DI and between GPH and MDHAQ was −0.75 and −0.79 (P-values < 0.001), respectively (Table 2). The PSAID functional component had similar correlation with HAQ-DI (0.64), MDHAQ (0.70) and GPH (−0.79, P-values < 0.001) (Supplementary Table S2, available at Rheumatology online).
Fig. 1.
Baseline correlation among the instruments
(A) Correlation between the HAQ-DI and the MDHAQ; (B) correlation between the HAQ-DI and the PROMIS10 GPH; and (C) correlation between the GPH and MDHAQ. HAQ-DI: HAQ disability index; MDHAQ: MultiDimensional HAQ; PROMIS10 GPH: Patient-Reported Outcome Measure Information System global physical health.
Table 2.
Correlations for physical health patient-reported outcomes among all and among subsets of patients with higher disease activity
| Comparison and subgroup | Cross-sectional (at index date): N | Cross-sectional: rho (P-value) | Longitudinal (change over two visits): N | Longitudinal: rho (P-value) |
|---|---|---|---|---|
| HAQ-DI and MDHAQ (all) | 274 | 0.85 (<0.001) | 60 | 0.71 (<0.001) |
| Tender joint counts ≥4 | 109 | 0.78 (<0.001) | 27 | 0.83 (<0.001) |
| Swollen joint count ≥4 | 59 | 0.72 (<0.001) | 17 | 0.81 (<0.001) |
| c-DAPSA ≥14 | 122 | 0.75 (<0.001) | 35 | 0.75 (<0.001) |
| Therapy initiators | – | – | 20 | 0.71 (<0.001) |
| PROMIS10 GPH and HAQ-DI (all) | 274 | −0.75 (<0.001) | 61 | −0.42 (<0.001) |
| Tender joint counts ≥4 | 109 | −0.64 (<0.001) | 27 | −0.47 (0.013) |
| Swollen joint count ≥4 | 59 | −0.62 (<0.001) | 18 | −0.62 (0.006) |
| c-DAPSA ≥14 | 122 | −0.63 (<0.001) | 36 | −0.50 (0.002) |
| Therapy initiators | – | – | 20 | −0.52 (0.019) |
| MDHAQ and PROMIS10 GPH (all) | 274 | −0.79 (<0.001) | 82 | −0.46 (<0.001) |
| Tender joint counts ≥4 | 109 | −0.74 (<0.001) | 35 | −0.58 (<0.001) |
| Swollen joint count ≥4 | 59 | −0.72 (<0.001) | 21 | −0.74 (<0.001) |
| c-DAPSA ≥14 | 122 | −0.69 (<0.001) | 44 | −0.49 (<0.001) |
| Therapy initiators | – | – | 23 | −0.56 (0.006) |
Index date was the first visit at which a patient completed all three instruments on the same visit. c-DAPSA: clinical disease activity index for PsA; HAQ-DI: HAQ disability index; MDHAQ: MultiDimensional HAQ physical function; PROMIS10 GPH: Patient-Reported Outcome Measure Information System global physical health; rho: Spearman’s correlation coefficient; therapy initiators: patients who started or switched to a new medication for PsA.
Longitudinal construct validity (change scores between two visits)
Of the 274 patients with all three instruments at baseline, 95, 182 and 178 patients had at least one subsequent visit with a completed HAQ-DI, MDHAQ and GPH, respectively, available for longitudinal analyses. Among these patients, the mean (s.d.) ΔHAQ-DI (Δ = change in), ΔMDHAQ and ΔGPH were −0.04 (0.34), −0.14 (1.08) and 0.84 (5.78), respectively. Distributions of the score changes are shown in Supplementary Fig. S2, available at Rheumatology online. Fig. 2A–C shows the correlations among the Δ instruments. The Spearman’s correlation coefficient for ΔHAQ-DI and ΔMDHAQ was 0.71 (P-value < 0.001). The Spearman’s correlation coefficient for ΔGPH and ΔHAQ-DI was −0.42, and for ΔGPH and ΔMDHAQ was −0.46 (P-values < 0.001) (Table 2). Correlations between the change in other instruments and ΔHAQ-DI, ΔMDHAQ and ΔGPH are shown in Supplementary Table S3, available at Rheumatology online. Of the 30 hypotheses determined a priori regarding the correlation between PROs, 24 (80.0%) were fulfilled (Supplementary Table S1, available at Rheumatology online).
Fig. 2.
Longitudinal construct validity: correlation among the change scores
(A) Correlation between the Δ HAQ-DI and the Δ MDHAQ; (B) correlation between the ΔHAQ-DI and the ΔPROMIS10 GPH; and (C) correlation between the ΔGPH and the ΔMDHAQ. HAQ-DI: HAQ disability index; MDHAQ: MultiDimensional HAQ; PROMIS10 GPH: Patient-Reported Outcome Measure Information System global physical health.
Correlations in subsets with higher disease activity (cross-sectional and longitudinal)
At baseline, 21.0–42.6% of participants had higher disease activity, as measured by TJC ≥ 4, SJC ≥ 4 and Clinical Disease Activity Index for PsA ≥ 14 (Table 1). Baseline correlations for HAQ-DI, MDHAQ and PROMIS10 GPH were slightly lower in participants with higher disease activity than in the entire population. For example, correlations between HAQ-DI and MDHAQ were 0.72–0.78 in higher disease activity subgroups, compared with 0.85 in the entire population. In contrast, comparisons between change scores in the longitudinal analyses mostly demonstrated higher correlations in the high disease activity subgroups than in the entire population. For example, correlations between ΔHAQ-DI and ΔMDHAQ were 0.71–0.83 in higher disease activity subgroups, compared with 0.71 in the entire population.
Responsiveness
Among patients initiating therapy, the mean (s.d.) ΔHAQ-DI between two visits was −0.07 (0.37), the mean (s.d.) ΔMDHAQ was −0.19 (1.22) and the mean (s.d.) ΔGPH was 2.48 (6.06). The SRMs for patients initiating therapy were 0.18, 0.16 and 0.41 for ΔHAQ-DI, ΔMDHAQ and ΔGPH, respectively. Responsiveness was higher in the subgroup of patients reporting ‘improvement’ on the global assessment of response at the follow-up visit and the subgroup with ≥20% improvement in both the TJC and SJC (Table 3). For comparison, we also calculated the SRMs for the Δ patient pain assessment and Δ patient global assessment (Supplementary Table S4, available at Rheumatology online). Among therapy initiators, the SRMs for patient pain assessment and patient global assessment were 0.38 and 0.16, respectively.
Table 3.
Responsiveness of physical function instruments
| Group (change between two visits) | HAQ-DI: N | HAQ-DI: mean change | HAQ-DI: s.d. | HAQ-DI: SRM | MDHAQ: N | MDHAQ: mean change | MDHAQ: s.d. | MDHAQ: SRM | PROMIS10 GPH: N | PROMIS10 GPH: mean change | PROMIS10 GPH: s.d. | PROMIS10 GPH: SRM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All patients | 95 | −0.04 | 0.34 | 0.13 | 182 | −0.14 | 1.08 | 0.13 | 178 | 0.84 | 5.78 | 0.14 |
| Therapy initiators | 65 | −0.07 | 0.37 | 0.18 | 118 | −0.19 | 1.22 | 0.16 | 84 | 2.48 | 6.06 | 0.41 |
| Patient-reported improvement (a) | 19 | −0.29 | 0.27 | 1.06 | 32 | −0.78 | 1.17 | 0.67 | 26 | 6.42 | 7.17 | 0.90 |
| ≥20% improvement in TJC/SJC | 21 | −0.12 | 0.30 | 0.39 | 31 | −0.71 | 1.22 | 0.58 | 23 | 5.49 | 7.15 | 0.77 |
(a) Proportion of patients reporting ‘improvement’ on the patient global assessment of response among therapy initiators. ≥20% improvement in TJC/SJC: patients who achieved a composite measure (ACR20) defined as both improvement of 20% in the number of tender joints (TJC) and number of swollen joints (SJC), and three of five additional criteria; HAQ-DI: HAQ disability index; MDHAQ: MultiDimensional HAQ physical function; patient-reported improvement: patient notes improvement on global assessment of response at follow-up visit; PROMIS10 GPH: Patient-Reported Outcome Measure Information System global physical health; SRM: standardized response mean; therapy initiators: patients who started or switched a new medication for PsA.
Impact of the physical function instrument in the ACR response criteria
Of the patients with complete data at subsequent time points, 26 (40.0%), 41 (34.8%) and 87 (55.8%) achieved a ≥20% improvement in the HAQ-DI, MDHAQ and GPH, respectively. Of those meeting the stem of the ACR20 (N = 41), 10 (47.6%), 17 (54.8%) and 25 (61.0%) also attained a 20% improvement in the HAQ-DI, MDHAQ and GPH, respectively. Of those meeting the ACR20 stem, the agreement between the instruments was 80% (kappa = 0.63) between HAQ-DI and MDHAQ, 67% (kappa = 0.37) between GPH and HAQ-DI, and 80% (kappa = 0.03) between GPH and MDHAQ (Table 4).
Table 4.
Agreement between function instruments in therapy initiators with improvement from baseline
| Group | HAQ-DI, # patients (%) | MDHAQ, # patients (%) | PROMIS10 GPH, # patients (%) | Agreement (kappa): HAQ-DI vs MDHAQ | Agreement (kappa): PROMIS10 GPH vs HAQ-DI | Agreement (kappa): MDHAQ vs PROMIS10 GPH |
|---|---|---|---|---|---|---|
| ≥20% improvement in scores | 26/65 (40.0%) | 41/118 (34.8%) | 87/156 (55.8%) | 82% (0.65) | 66% (0.41) | 80% (0.22) |
| ≥20% improvement in function score among those with ≥20% in tender and swollen joint counts | 10/21 (47.6%) | 17/31 (54.8%) | 25/41 (61.0%) | 80% (0.63) | 67% (0.37) | 80% (0.03) |
Among therapy initiators, we identified patients with a 20% improvement in each score and then examined agreement among 20% improvement for all therapy initiators and subsequently among patients who also met the stem of the ACR20 (i.e. a 20% improvement in both the tender and swollen joint counts). The 20% cut-off is based on the way HAQ-DI is used in the ACR20. As Cohen suggested, the Kappa statistic result should be interpreted as follows: values ≤0 as indicating no agreement and 0.01–0.20 as none to slight, 0.21–0.40 as fair, 0.41–0.60 as moderate, 0.61–0.80 as substantial and 0.81–1.00 as almost perfect agreement. ≥20% improvement in TJC/SJC: patients who achieved a composite measure (ACR20) defined as both an improvement of 20% in the number of tender joints (TJC) and the number of swollen joints (SJC), and three of five additional criteria; HAQ-DI: HAQ disability index; MDHAQ: MultiDimensional HAQ physical function; PROMIS10 GPH: Patient-Reported Outcome Measure Information System global physical health.
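The interpretation bands quoted in the footnote above can be written as a small lookup. This is a sketch only: the footnote leaves values between 0 and 0.01 unassigned, and placing them in the ‘none to slight’ band is an assumption, as is the function name.

```python
def interpret_kappa(kappa):
    """Map a kappa value to the agreement band quoted in the Table 4 footnote."""
    if kappa <= 0:
        return "no agreement"
    if kappa <= 0.20:
        return "none to slight"  # assumption: includes values between 0 and 0.01
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"
```

Applied to the second row of Table 4, kappa values of 0.63, 0.37 and 0.03 fall in the substantial, fair and none-to-slight bands, respectively.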
Discussion
Physical function is a core outcome in the measurement of therapeutic response in PsA [11, 30, 31]. While the HAQ-DI has long been the standard measure of physical function in PsA RCTs [8], it has relatively low feasibility for use in clinical practice and for pragmatic trials. Reassessment of physical function measures is helpful in selecting and interpreting appropriate measures for various research and patient care settings. The MDHAQ and PROMIS10 instruments are more widely used among clinical populations in the USA in particular, but little is known about how they compare with the HAQ-DI. In this study, we found that all three instruments were strongly correlated at baseline and that all PROs satisfied >75% of the hypotheses stated a priori in the cross-sectional setting [26]. The change scores between two visits were moderately correlated. Overall, these results suggest that these instruments have similar measurement properties. Additionally, when used in the ACR20 criteria, HAQ-DI and MDHAQ had high levels of agreement.
PROs that evaluate functional impairment were originally derived and validated in RA in the 1980s and then applied to the PsA population [21]. One of the first widely accepted physical function instruments, the HAQ-DI, is still considered the reference standard in PsA and RA and is used to assess functional ability in RCTs. A recent systematic review identified 23 PROs for measuring physical function in patients with PsA. Of these, three PROs (BASFI, short-form 36 physical function, and HAQ-DI) have at least some evidence assessing both reliability and validity [32]. The HAQ-DI has the most evidence for the physical function domain, and good internal consistency and structural validity [32–34]. The limitations of the HAQ-DI include marked floor effects, conflicting studies on construct validity (particularly across languages), lack of responsiveness to treatment effects in later disease stages, and underestimation of patient-reported physical impairment [11, 32]. Finally, HAQ-DI was developed for RA. Although RA and PsA share overlapping clinical features, PsA has patterns of musculoskeletal and skin involvement that are distinct from RA [35]. Moreover, the PsA patient population has changed over time, and the HAQ-DI may be outdated, particularly for use in real-world patient populations, in which the change over time is smaller than in clinical trials [36–38]. Previous literature has noted relatively little change in HAQ-DI over time in clinical practice, and our study identified the same issue [39, 40]. However, this was similar for all three instruments. These patients tend to have a blunted response (and therefore reduced change) compared with patients enrolled in RCTs [41].
The landscape of PROs in inflammatory diseases is also changing. PROs are much more commonly integrated into clinical practice in rheumatology. Despite being more widely used in practice, the MDHAQ and PROMIS10 GPH have few evidence-based studies guiding their use in PsA. In fact, to our knowledge, no published studies have examined GPH in PsA. On the other hand, PROMIS physical function, a related but different item bank, has been studied in RA. A recent study demonstrated that the MDHAQ and PROMIS physical function are essentially interchangeable in patients with RA [42]. Additionally, in a cohort of patients with RA, there was a high correlation between PROMIS physical function and HAQ-DI scores at baseline (rho = 0.76), a finding similar in magnitude to the baseline correlation between GPH and HAQ-DI in our study in patients with PsA (rho = −0.75) [42]. In a study examining the responsiveness of PROs in patients with RA, the responsiveness of PROMIS physical function was superior to the HAQ-DI [43]. Responsiveness of the MDHAQ has not been reported in RA or PsA [14, 16, 44]. Because of these studies and the more widespread use of PROMIS, a recent white paper by the ACR now recommends that the MDHAQ and PROMIS physical function be used to assess functional status in patients with RA at least annually [14].
To our knowledge, this is the first study to compare three physical function instruments in patients with PsA, evaluating construct validity, responsiveness and longitudinal construct validity. This is also the first study to examine GPH in patients with inflammatory arthritis; GPH is a measure developed by the National Institutes of Health in the USA with excellent psychometric properties across the general population. A strength of this study lies in its performance in a ‘real-world’ patient population with PsA, in particular the types of patients that may be enrolled in pragmatic trials in the future. Additionally, our patients had relatively mild disease activity overall, consistent with other ‘real-world’ cohorts, but considerably less disease activity than patients enrolled in clinical trials [37].
We also note limitations. In this prospective cohort study, embedded in clinical practice, there was a relatively small sample size of patients who completed two PROs and achieved a 20% improvement in SJC and TJC. Despite recruiting consecutive patients, there may be selection bias in which patients agreed to fill out all of the surveys, because the additional surveys may have increased responder burden. Finally, this study was performed at four academic medical centres in the USA with expertise in PsA, potentially impacting generalizability. However, the overall disease activity of these patients is likely comparable with that of non-academic and non-specialty clinical practices, at least more so than clinical trials.
Conclusion
Although the HAQ-DI consists of more questions than the MDHAQ or the GPH, the total scores were strongly correlated when analysed cross-sectionally. However, when examining longitudinal scores (and specifically changes in the scores between two visits), both the correlation and the agreement in the setting of a 20% cut-off were moderate, although slightly higher in the subset of patients with higher disease activity at baseline. Given these results, any one of the three instruments could be used to measure physical function, though GPH and MDHAQ may have more feasibility for clinical practice. Finally, agreement was good between MDHAQ, PROMIS10 GPH and HAQ-DI. However, because this was a relatively small sample, larger studies are needed to assess the suitability of substituting alternative physical function measures for HAQ-DI in the ACR20.
Funding: This work was supported by NIH/NIAMS R01 AR072363.
Disclosure statement: M.E.H. has received consulting fees and/or honoraria from AbbVie, Janssen, Sanofi Genzyme/Regeneron, UCB, Novartis, and Lilly (less than $10 000 each) and is a coinventor on a patent for a PsA questionnaire (PsA Screening Evaluation), for which she receives royalties. Y.-Y.L. has received speaking fees from AbbVie, Janssen, Eli Lilly and Novartis (less than $10 000 each). A.O. has received consulting fees, speaking fees, and/or honoraria from AbbVie, Amgen, Bristol-Myers Squibb, Celgene, Corrona, Janssen, Lilly, Novartis, Pfizer, and Takeda (less than $10 000 each); grants from Novartis and Pfizer to the trustees of University of Pennsylvania; and royalties to husband from Novartis (greater than $10 000). S.M.R. has received consulting fees, speaking fees, and/or honoraria from Novartis, AbbVie, Amgen, UCB, and Pfizer (less than $10 000 each). J.U.S. has received consulting fees from Janssen, UCB, and BMS (less than $10 000 each), consulting fees from Novartis (more than $10 000) and consulting fees or research grants from Novartis to NYU Langone Health. J.A.W. has received consulting fees or research grants from Novartis, AbbVie, Amgen, Lilly, and Pfizer. The other author has declared no conflicts of interest.
Data availability statement
If there is an interest in utilizing the PARC dataset, interested parties may contact the corresponding author.
Supplementary data
Supplementary data are available at Rheumatology online.
References
- 1. Ogdie A, Weiss PF. The epidemiology of psoriatic arthritis. Rheumatic Dis Clin North Am 2015;41:545–68.
- 2. Taylor WJ, Mease PJ, Adebajo A et al. Effect of psoriatic arthritis according to the affected categories of the international classification of functioning, disability and health. J Rheumatol 2010;37:1885–91.
- 3. Orbai A-M, de Wit M, Mease P et al. International patient and physician consensus on a psoriatic arthritis core outcome set for clinical trials. Ann Rheum Dis 2017;76:673–80.
- 4. Leung YY, Orbai A-M, de Wit M et al.; The ReFlap Study Group. Comparing the patient reported physical function outcome measures in a real-life international cohort of patients with psoriatic arthritis. Arthritis Care Res 2020; Advance Access published 21 January 2020.
- 5. Leung YY, Orbai AM, Ogdie A et al. Appraisal of candidate instruments for assessment of the physical function domain in patients with psoriatic arthritis. J Rheumatol 2020;47 Advance Access published 1 February 2020.
- 6. Felson DT, Anderson JJ, Boers M et al. American College of Rheumatology. Preliminary definition of improvement in rheumatoid arthritis. Arthritis Rheum 1995;38:727–35.
- 7. American College of Rheumatology Committee to Reevaluate Improvement Criteria. A proposed revision to the ACR20: the hybrid measure of American College of Rheumatology response. Arthritis Rheum 2007;57:193–202.
- 8. Mease P, Strand V, Gladman D. Functional impairment measurement in psoriatic arthritis: importance and challenges. Semin Arthritis Rheum 2018;48:436–48.
- 9. Walsh JA, Wan MT, Willinger C et al. Measuring outcomes in psoriatic arthritis: comparing Routine Assessment of Patient Index Data (RAPID3) and Psoriatic Arthritis Impact of Disease (PSAID). J Rheumatol 2020;47:1496–1505.
- 10. Coates LC, Moverley AR, McParland L et al. Effect of tight control of inflammation in early psoriatic arthritis (TICOPA): a UK multicentre, open-label, randomised controlled trial. Lancet 2015;386:2489–98.
- 11. Orbai A-M, Ogdie A. Patient-reported outcomes in psoriatic arthritis. Rheum Dis Clin North Am 2016;42:265–83.
- 12. Pincus T, Yazici Y, Castrejon I. Pragmatic and scientific advantages of MDHAQ/RAPID3 completion by all patients at all visits in routine clinical care. Bull NYU Hosp Jt Dis 2012;70(Suppl 1):30–6.
- 13. Pincus T, Swearingen C, Wolfe F. Toward a multidimensional Health Assessment Questionnaire (MDHAQ): assessment of advanced activities of daily living and psychological status in the patient-friendly health assessment questionnaire format. Arthritis Rheum 1999;42:2220–30.
- 14. Barber CEH, Zell J, Yazdany J et al. 2019 American College of Rheumatology recommended patient-reported functional status assessment measures in rheumatoid arthritis. Arthritis Care Res 2019;71:1531–9.
- 15. Yazdany J, Bansback N, Clowse M et al. Rheumatology informatics system for effectiveness: a national informatics-enabled registry for quality improvement. Arthritis Care Res 2016;68:1866–73.
- 16. Coates LC, Tillett W, Shaddick G et al. Value of the Routine Assessment of Patient Index Data 3 in patients with psoriatic arthritis: results from a tight-control clinical trial and an observational cohort. Arthritis Care Res 2018;70:1198–205.
- 17. Health Measures: Transforming How Health Is Measured [Internet]. http://www.healthmeasures.net/explore-measurement-systems/promis (31 August 2019, date last accessed).
- 18. Kasturi S, Szymonifka J, Burket JC et al. Feasibility, validity, and reliability of the 10-item patient reported outcomes measurement information system global health short form in outpatients with systemic lupus erythematosus. J Rheumatol 2018;45:397–404.
- 19. Annapureddy N, Giangreco D, Devilliers H, Block JA, Jolly M. Psychometric properties of MDHAQ/RAPID3 in patients with systemic lupus erythematosus. Lupus 2018;27:982–90.
- 20. Hung M, Hon SD, Franklin JD et al. Psychometric properties of the PROMIS physical function item bank in patients with spinal disorders. Spine 2014;39:158–63.
- 21. Bruce B, Fries JF. The Stanford Health Assessment Questionnaire: dimensions and practical applications. Health Qual Life Outcomes 2003;1:20.
- 22. Pincus T, Swearingen CJ, Bergman M, Yazici Y. RAPID3 (Routine Assessment of Patient Index Data 3), a rheumatoid arthritis index without formal joint counts for routine care: proposed severity categories compared to disease activity score and clinical disease activity index categories. J Rheumatol 2008;35:2136–47.
- 23. Cella D, Choi S, Garcia S et al. Setting standards for severity of common symptoms in oncology using the PROMIS item banks and expert judgment. Qual Life Res 2014;23:2651–61.
- 24. Schalet BD, Revicki DA, Cook KF et al. Establishing a common metric for physical function: linking the HAQ-DI and SF-36 PF subscale to PROMIS® physical function. J Gen Intern Med 2015;30:1517–23.
- 25. A Brief Guide to the PROMIS© Global Health Instruments [Internet]. 2017. http://www.healthmeasures.net/images/PROMIS/manuals/PROMIS_Global_Scoring_Manual.pdf (20 December 2019, date last accessed).
- 26. Mokkink LB, Terwee CB, Patrick DL et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol 2010;63:737–45.
- 27. Cohen J. Statistical power analysis for the behavioral sciences. 2nd edn. New York: Routledge, 2013.
- 28. Bishara AJ, Hittner JB. Testing the significance of a correlation with nonnormal data: comparison of Pearson, Spearman, transformation, and resampling approaches. Psychol Methods 2012;17:399–417.
- 29. Husted JA, Cook RJ, Farewell VT, Gladman DD. Methods for assessing responsiveness: a critical review and recommendations. J Clin Epidemiol 2000;53:459–68.
- 30. Orbai AM, de Wit M, Mease PJ et al. Updating the psoriatic arthritis (PsA) core domain set: a report from the PsA workshop at OMERACT 2016. J Rheumatol 2017;44:1522–8.
- 31. Husted JA, Gladman DD, Farewell VT, Cook RJ. Health-related quality of life of patients with psoriatic arthritis: a comparison with patients with rheumatoid arthritis. Arthritis Rheum 2001;45:151–8.
- 32. Højgaard P, Klokker L, Orbai AM et al. A systematic review of measurement properties of patient reported outcome measures in psoriatic arthritis: a GRAPPA-OMERACT initiative. Semin Arthritis Rheum 2018;47:654–65.
- 33. Taylor WJ, McPherson KM. Using Rasch analysis to compare the psychometric properties of the Short Form 36 physical function score and the Health Assessment Questionnaire disability index in patients with psoriatic arthritis and rheumatoid arthritis. Arthritis Rheum 2007;57:723–9.
- 34. Leung YY, Tam LS, Kun EW, Ho KW, Li EK. Comparison of 4 functional indexes in psoriatic arthritis with axial or peripheral disease subgroups using Rasch analyses. J Rheumatol 2008;35:1613–21.
- 35. Coates LC, FitzGerald O, Helliwell PS, Paul C. Psoriasis, psoriatic arthritis, and rheumatoid arthritis: is all inflammation the same? Semin Arthritis Rheum 2016;46:291–304.
- 36. Allard A, Antony A, Shaddick G et al. Trajectory of radiographic change over a decade: the effect of transition from conventional synthetic disease-modifying antirheumatic drugs to anti-tumour necrosis factor in patients with psoriatic arthritis. Rheumatology 2019;58:269–73.
- 37. Ogdie A, Coates L. The changing face of clinical trials in psoriatic arthritis. Curr Rheumatol Rep 2017;19:21.
- 38. Vandendorpe AS, de Vlam K, Lories R. Evolution of psoriatic arthritis study patient population characteristics in the era of biological treatments. RMD Open 2019;5:e000779.
- 39. Leung YY, Zhu TY, Tam LS, Kun EW, Li EK. Minimal important difference and responsiveness to change of the SF-36 in patients with psoriatic arthritis receiving tumor necrosis factor-alpha blockers. J Rheumatol 2011;38:2077–9.
- 40. Husted JA, Gladman DD, Cook RJ, Farewell VT. Responsiveness of health status instruments to changes in articular status and perceived health in patients with psoriatic arthritis. J Rheumatol 1998;25:2146–55.
- 41. Ward MM, Castrejon I, Bergman MJ et al. Minimal clinically important improvement of Routine Assessment of Patient Index Data 3 in rheumatoid arthritis. J Rheumatol 2019;46:27–30.
- 42. Yun H, Nowell WB, Curtis D et al. Assessing RA disease activity with PROMIS measures using digital technology. Arthritis Care Res 2020;72:553–60.
- 43. Hays RD, Spritzer KL, Fries JF, Krishnan E. Responsiveness and minimally important difference for the patient-reported outcomes measurement information system (PROMIS) 20-item physical functioning short form in a prospective observational study of rheumatoid arthritis. Ann Rheum Dis 2015;74:104–7.
- 44. Vakil-Gilani KM, Dinno A, Rich-Garg N, Deodhar A. Routine Assessment of Patient Index Data 3 score and psoriasis quality of life assess complementary yet different aspects of patient-reported outcomes in psoriasis and psoriatic arthritis. J Clin Rheumatol 2018;24:319–23.