SUMMARY
Objective
Total joint replacement has been proposed as an endpoint in disease modifying osteoarthritis drug (DMOAD) randomized clinical trials (RCTs); however, disparities have generated concerns regarding this outcome. A combined Osteoarthritis Research Society International (OARSI)/Outcome Measures in Rheumatology (OMERACT) initiative was launched in 2004 to develop a composite index [‘virtual total joint replacement’ (VJR)] as a surrogate outcome for osteoarthritis (OA) progression in DMOAD RCTs. Our objective was to evaluate the prevalence of patients fulfilling different thresholds of sustained pain, reduced function, and X-ray change in existing DMOAD RCTs.
Design
Post hoc analysis of summary data from the placebo arm of eight DMOAD RCTs.
Results
Eight OA RCTs representing 1379 patients were included. Pain was assessed by WOMAC and/or VAS and function by WOMAC and/or Lequesne. Among six knee and two hip studies, 248 (22%) and 132 (51%) patients, respectively, had X-ray progression [decrease in joint space width (JSW) ≥0.5 mm]. The prevalence of patients fulfilling clinical and radiographic criteria was highest (n = 163, 12%) in the least stringent scenario (pain + function ≥80 at ≥2 visits), with few patients (n = 129, 2%) in the most stringent scenario (pain + function ≥100 at ≥4 visits). Using these prevalence data, a sample size of 352–2144 per group would be needed to demonstrate a 50% difference between groups.
Conclusions
The prevalence of patients with sustained symptomatic OA of at least a moderate degree with X-ray progression is low. Even using lenient criteria to define VJR, large patient numbers would be required to detect differences between groups in DMOAD RCTs. Investigation of the optimal cutoff threshold and combination of symptoms and radiographic change should be pursued.
Keywords: Osteoarthritis, Outcomes, Randomized clinical trials
Introduction
The mission of drug development in disease modifying osteoarthritis drugs (DMOADs) is to alter the natural history of osteoarthritis (OA). This requires carefully designed and meticulously executed randomized clinical trials (RCTs) of potential DMOADs with clear, accurate, and measurable outcomes that correspond to OA progression. For most disease processes, the definitive endpoint (‘gold standard’) is death or organ failure. In OA, this would translate to ‘joint death’ or ‘joint failure’. However, determining exactly when such a state has been reached is challenging and imprecise. There is, in fact, no ‘gold standard’ at this time for OA-related ‘joint failure’, nor is there an accepted measurement of OA disease progression that would classify such a final state. Therefore, a surrogate outcome for OA joint failure, one that represents the natural evolution of OA, would be of great value in DMOAD RCTs. In a DMOAD RCT, a surrogate outcome for ‘OA joint death’ would represent failure of medical therapy to prevent progression of disease. Theoretically, in addition to being a clear, measurable endpoint, a surrogate outcome in OA could also increase the frequency of events in DMOAD RCTs, thereby decreasing the total sample size necessary to see a difference between treatment groups.
Total joint replacement (TJR) has been proposed as a primary outcome in DMOAD RCTs1–3 as this procedure typically improves pain, function, and the structural joint derangement caused by OA. TJR is easy to measure and dichotomous. However, overall the number of OA patients who reach this endpoint is small and important disparities in TJR by race, gender, socioeconomic status, access to care, surgeon preference, and health care systems have generated significant concerns that TJR may represent an inaccurate outcome4–8. Further, the decision to recommend TJR by orthopedic surgeons is complex as recently established in a large international study that demonstrated significant overlap in symptom severity between those who were and were not recommended for TJR, even after adjusting for radiographic severity9.
In 2004, the international organizations Outcome Measures in Rheumatology (OMERACT) and Osteoarthritis Research Society International (OARSI) embarked on a combined initiative to develop a composite index to be used in lieu of TJR as an endpoint in clinical trials evaluating potential DMOADs10,11. It was determined that such a composite index should incorporate key symptomatic (i.e., pain, functional impairment) and structural variables of OA11. Patients meeting criteria for this composite index could be considered as having a ‘virtual joint replacement’ (VJR) with sustained pain, reduced function, and evidence of X-ray progression. The conceptual goal of the VJR endpoint in DMOAD RCTs is to eliminate many of the biases associated with TJR as a study outcome while still employing the concept of a dichotomous, OA-specific outcome.
It is important to note that these OMERACT/OARSI VJR criteria are not intended for use in clinical practice, nor is the composite index (VJR) intended as a threshold at which to recommend surgery. Rather, this is a construct to be examined in the context of DMOAD RCTs to determine if a drug is having an effect on the progression of OA. For example, if the VJR composite index is used as the primary endpoint in a DMOAD RCT comparing drug A to drug B for knee OA, the results could be stated as: “a greater disease-modifying benefit was demonstrated for drug A compared to drug B among knee OA patients with 30% fewer patients receiving drug A reaching the VJR endpoint.”
The objective of the current phase of this OARSI/OMERACT initiative was to evaluate several different thresholds of symptomatic severity and structural change to determine the feasibility and sample size requirement for each scenario if it were to be used as a VJR outcome in a DMOAD RCT. There are no published reports in the literature about which scenario is optimal. We aimed to achieve this by establishing the prevalence of patients fulfilling different clinical (pain, functional impairment) and/or X-ray progression scenarios in a post hoc analysis of the placebo arm of existing DMOAD RCTs.
Methods
Selection of key domains to define OA progression
In 2004, a steering committee of OARSI/OMERACT members conducted a review of the literature and selected three domains to define OA severity in the context of clinical decision making when referring a patient for TJR. These three domains are: pain, functional status, and structural damage1. These domains, commonly captured in all clinical trials, were used to develop potential composite indices and binary outcomes (VJR) for DMOAD RCTs.
Determining thresholds for pain and functional disability
An international prospective observational cross-sectional study of patients with knee or hip OA was conducted, also under the auspices of this OARSI/OMERACT initiative, to determine if cutoff points could be established for pain and functional disability using TJR as the gold standard9. These data ultimately could not identify a specific cut point for pain or functional disability to discriminate between those who did and did not receive TJR, although those who did receive TJR were more symptomatic9.
Elaboration of clinical and radiographic criteria to generate VJR scenarios
The OARSI/OMERACT working group consensus was that all three domains (pain, functional disability, and structural derangement) should be components of the proposed outcome “VJR”. In addition to setting thresholds for each domain, it was determined that sustainability and persistence of symptoms were important components for VJR as well. Therefore, nine VJR clinical scenarios with varying thresholds of pain and function were generated by group consensus (Table I). These scenarios proposed three symptomatic cutoffs: (1) pain + function ≥80, (2) pain + function ≥100, (3) [(pain ≥50 and function ≥30) OR (function ≥50 and pain ≥30)]. These cutoffs assume that pain and function measures are normalized to a scale of 100 (zero being the best and 100 the worst condition). A composite score of ‘pain and function’ was ascertained by simply adding the two scores: pain plus function (each on a scale of 100 for a maximum potential combined score of 200). The symptomatic cutoffs for a non-acceptable symptomatic state (NASS) were combined with three thresholds of symptom sustainability: (1) NASS during at least two consecutive study visits, (2) NASS during at least three consecutive study visits, (3) NASS during at least four consecutive study visits. It should be noted that, in the definition proposed by this working group, the consecutive visits could occur at any time during the study but should be at least 3 months apart.
Table I.
Scenario | Symptom threshold* | Duration of symptoms† |
---|---|---|
A | Pain + function ≥80 | ≥2 consecutive visits |
B | Pain + function ≥80 | ≥3 consecutive visits |
C | Pain + function ≥80 | ≥4 consecutive visits |
D | Pain + function ≥100 | ≥2 consecutive visits |
E | Pain + function ≥100 | ≥3 consecutive visits |
F | Pain + function ≥100 | ≥4 consecutive visits |
G | [pain ≥50 and function ≥30] OR [function ≥50 and pain ≥30] | ≥2 consecutive visits |
H | [pain ≥50 and function ≥30] OR [function ≥50 and pain ≥30] | ≥3 consecutive visits |
I | [pain ≥50 and function ≥30] OR [function ≥50 and pain ≥30] | ≥4 consecutive visits |
* Each pain and function tool is normalized to 0–100 scale (0 = best, 100 = worst).
† Consecutive visits are 3–6 months apart.
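The nine clinical scenarios in Table I can be expressed as a rule applied to each visit plus a minimum run of consecutive visits in NASS. The following is a minimal sketch, not the working group's analysis code: the function names and the example scores are hypothetical, and pain and function are assumed to be already normalized to 0–100 and ordered by visit.

```python
from typing import Callable, Dict, Sequence, Tuple

Rule = Callable[[float, float], bool]


def sum_at_least(cutoff: float) -> Rule:
    """NASS rule: composite pain + function score meets the cutoff."""
    return lambda pain, function: pain + function >= cutoff


def mixed_threshold(pain: float, function: float) -> bool:
    """NASS rule for scenarios G-I in Table I."""
    return (pain >= 50 and function >= 30) or (function >= 50 and pain >= 30)


# Scenario letter -> (per-visit NASS rule, required consecutive visits)
SCENARIOS: Dict[str, Tuple[Rule, int]] = {
    "A": (sum_at_least(80), 2), "B": (sum_at_least(80), 3), "C": (sum_at_least(80), 4),
    "D": (sum_at_least(100), 2), "E": (sum_at_least(100), 3), "F": (sum_at_least(100), 4),
    "G": (mixed_threshold, 2), "H": (mixed_threshold, 3), "I": (mixed_threshold, 4),
}


def longest_run(flags: Sequence[bool]) -> int:
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def meets_scenario(scenario: str, pain: Sequence[float], function: Sequence[float]) -> bool:
    """True if NASS is sustained for the scenario's required number of visits."""
    rule, min_visits = SCENARIOS[scenario]
    flags = [rule(p, f) for p, f in zip(pain, function)]
    return longest_run(flags) >= min_visits


# Hypothetical patient with four visits (scores already on a 0-100 scale)
pain_scores = [30, 45, 50, 20]
function_scores = [30, 40, 35, 30]
```

For this hypothetical patient, the composite score is at or above 80 at the second and third visits only, so Scenario A is met while Scenarios B and D are not.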
Just as it was determined that OA symptoms of pain and function had to be sustained and persistent to indicate OA disease progression, for the domain of structural derangement the OARSI/OMERACT working group determined that progression of structural damage (i.e., evidence of significant structural change) was a requisite component for the outcome VJR12. X-ray progression was defined in two ways: decrease in joint space width (JSW) ≥0.5 mm and decrease beyond the smallest detectable difference (SDD)13–15. X-ray progression defined by SDD was only utilized in those studies that reported an SDD.
The nine clinical scenarios were then combined with evidence of structural damage to generate a binary outcome, ‘VJR’, which would represent fulfillment of symptomatic criteria (one of the nine scenarios) and evidence of progressive structural damage from OA (Δ JSW).
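The binary VJR outcome described above is the conjunction of a clinical scenario and radiographic progression. A minimal sketch, assuming JSW is measured in millimeters at baseline and study end; the function and parameter names are hypothetical and are not taken from the study CRF.

```python
from typing import Optional


def xray_progression(jsw_baseline_mm: float, jsw_final_mm: float,
                     sdd_mm: Optional[float] = None) -> bool:
    """Radiographic progression by either definition used in the text.

    With no SDD supplied, progression is a JSW decrease of at least 0.5 mm;
    with a study-specific SDD, it is a decrease greater than the SDD.
    """
    decrease = jsw_baseline_mm - jsw_final_mm
    if sdd_mm is None:
        return decrease >= 0.5
    return decrease > sdd_mm


def meets_vjr(meets_clinical_scenario: bool, jsw_baseline_mm: float,
              jsw_final_mm: float, sdd_mm: Optional[float] = None) -> bool:
    """VJR requires both sustained symptoms and radiographic progression."""
    return meets_clinical_scenario and xray_progression(
        jsw_baseline_mm, jsw_final_mm, sdd_mm)
```

A patient with sustained symptoms but a JSW loss of only 0.3 mm would not reach VJR under the 0.5 mm definition, yet would under an SDD of 0.20 mm, which illustrates why prevalence differs between the two definitions.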
Selection of DMOAD RCTs
Investigators with available databases from recent DMOAD RCTs of hip or knee OA were invited to participate in these post hoc analyses. To be included in this study, the DMOAD RCT had to have a clearly defined placebo group, as only placebo arm data were used in this phase of the VJR initiative to eliminate any potential treatment-related confounders. Further, the DMOAD RCT had to have repeat measures of pain and function throughout the trial (every 3–6 months) and evaluation of radiologic OA parameters at baseline and study end (duration ≥1 year).
Once selected for these post hoc analyses, investigators from each included DMOAD RCT were provided with an extensive case report form (CRF). The investigators from each study had access to individual patient-level data for the placebo arm of their particular study. Each investigator then completed analyses of these patient-level data according to the CRF, which required the calculation of summary statistics [mean ± standard deviation (SD)] for baseline, final (i.e., study end), and calculated change during the study for clinical symptoms (pain, function, pain + function) and JSW (in millimeters). From these individual patient-level data, the investigators were also asked to determine how many individuals met the nine VJR scenarios and criteria for X-ray progression. For DMOAD RCTs that did not report an SDD, only change in JSW was included. Since the nine clinical scenarios required sustained symptoms, missing data for a specific visit in a single study were resolved by averaging the previous available data point with the next available data point.
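The handling of missing visits can be illustrated with a short sketch. It assumes a missing score is represented as `None` and that only observed values (not previously imputed ones) are averaged; the text does not say how gaps at the very start or end of follow-up were treated, so this hypothetical helper leaves them missing.

```python
from typing import List, Optional, Sequence


def fill_missing(scores: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Replace each missing visit with the mean of its nearest observed neighbors."""
    filled: List[Optional[float]] = list(scores)
    for i, value in enumerate(scores):
        if value is not None:
            continue
        # Nearest observed value before and after the gap, if any
        previous = next((scores[j] for j in range(i - 1, -1, -1)
                         if scores[j] is not None), None)
        following = next((scores[j] for j in range(i + 1, len(scores))
                          if scores[j] is not None), None)
        if previous is not None and following is not None:
            filled[i] = (previous + following) / 2
        # Assumption: leading or trailing gaps stay missing
    return filled


# Hypothetical pain scores with three missing visits
example = fill_missing([40, None, 60, None, None, 80, None])
```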
These CRF analyses were then forwarded to the senior biostatistician of this OARSI/OMERACT project, who completed analyses of the aggregate summary data obtained from the CRF of each study. The data presented in this paper represent his combined analysis of the CRFs. Notably, he did not have access to individual patient-level data, only to the summary data from the CRFs.
Statistical analyses
Descriptive statistics [mean ± SD, median, 95% confidence interval (CI)] for baseline pain, function, pain + function, and JSW measurements were recorded for each DMOAD RCT, and pooled, weighted means for all baseline measures were generated to account for the wide variation in sample size.
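As an illustration of the pooled, weighted mean, the sketch below weights each study's baseline JSW by its placebo-arm size. It assumes the weights are the N values in Table II and uses the rounded means from Table III, so it only approximates the pooled value reported in the Results; the variable names are hypothetical.

```python
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Mean of values weighted by sample size."""
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and equal length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# (placebo-arm N from Table II, mean baseline JSW in mm from Table III)
studies = {
    "DOXY": (155, 3.69), "ECHODIAH": (136, 2.39), "ERADIAS": (127, 2.85),
    "GAIT": (50, 3.94), "KOSTAR": (625, 2.96), "PAVELKA": (54, 3.80),
    "REGINSTER": (69, 4.05), "STOPP": (163, 3.69),
}

sizes = [n for n, _ in studies.values()]
jsw_means = [jsw for _, jsw in studies.values()]
pooled_jsw = weighted_mean(jsw_means, sizes)
```

With these inputs the pooled baseline JSW comes out near 3.18 mm, pulled toward the KOSTAR value because that study contributes almost half of the patients.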
The total number of patients fulfilling criteria (i.e., prevalence) for radiographic progression (decrease JSW ≥0.5 mm or decrease JSW > SDD) and each of the clinical scenarios (Table I; A through I) was determined. Then the prevalence of patients fulfilling criteria for both radiographic progression and each clinical scenario was determined.
The discriminant capacity of the clinical scenarios (Table I, A–I) for X-ray progression (Δ JSW) was based on the aggregate data from all DMOAD RCTs. For each study and each clinical scenario, two analyses were conducted: (1) radiographic progression was reported as a continuous variable [Δ JSW as outcome, mean (SD)] according to each clinical scenario yes/no (dichotomous variable), allowing us to calculate an effect size through the standardized mean difference; (2) radiographic progression was reported as Δ JSW ≥0.5 mm (yes/no) according to each clinical scenario (yes/no), allowing us to calculate an odds ratio. Then, pooled odds ratios and effect sizes were generated using meta-analysis software (RevMan) with heterogeneity assessment and random-effects models. Covariates could not be included in these models as the data utilized were the aggregate data from each DMOAD RCT CRF (summary data) and not individual patient-level data.
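To make the pooling step concrete, here is a minimal sketch of an inverse-variance random-effects meta-analysis of log odds ratios using the DerSimonian-Laird estimator, together with the I² statistic. This is a generic textbook calculation and is not claimed to reproduce the exact RevMan settings used by the authors; the 2×2 counts are invented purely for illustration.

```python
import math
from typing import List, Sequence, Tuple


def log_odds_ratio(a: float, b: float, c: float, d: float) -> Tuple[float, float]:
    """Log OR and its variance for one study.

    a, b: progressors / non-progressors among patients meeting the scenario
    c, d: progressors / non-progressors among patients not meeting it
    """
    if 0 in (a, b, c, d):  # continuity correction for empty cells
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool_random_effects(effects: Sequence[float],
                        variances: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return pooled effect, its standard error, tau^2, and I^2 (percent)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, se, tau2, i2


# Hypothetical 2x2 tables (a, b, c, d), one per study
tables: List[Tuple[int, int, int, int]] = [(12, 30, 20, 90), (8, 25, 15, 40), (20, 60, 45, 200)]
stats = [log_odds_ratio(*t) for t in tables]
pooled, se, tau2, i2 = pool_random_effects([s[0] for s in stats], [s[1] for s in stats])
pooled_or = math.exp(pooled)
ci_95 = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
```

A pooled 95% CI that spans 1, as in every row of Table IV, corresponds to no statistically significant association between the scenario and radiographic progression.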
The sample size necessary for a future RCT to detect a 30% or 50% difference in proportions between treatment arms was estimated for each VJR scenario (each clinical scenario plus radiographic progression). These sample size estimates were calculated using the prevalence estimates above and based on α = 0.05 and β = 0.2 and did not account for a drop-out rate.
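The text does not state which formula or software produced these estimates. As a rough illustration, the sketch below uses the classic normal-approximation formula for comparing two independent proportions (two-sided α = 0.05, power = 1 − β = 0.80), treating the "difference" as a relative reduction in the proportion reaching VJR; both choices are assumptions. With the Scenario A prevalence of 12.14% quoted in the Discussion, it gives values close to those in Table V.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_placebo: float, relative_reduction: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients per arm needed to detect a relative reduction in event rate."""
    p_treated = p_placebo * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_placebo + p_treated) / 2
    # Variance under the null (pooled) and under the alternative (unpooled)
    null_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * math.sqrt(
        p_placebo * (1 - p_placebo) + p_treated * (1 - p_treated))
    return math.ceil((null_term + alt_term) ** 2 / (p_placebo - p_treated) ** 2)


n_50 = sample_size_per_arm(0.1214, 0.50)  # Scenario A, 50% difference
n_30 = sample_size_per_arm(0.1214, 0.30)  # Scenario A, 30% difference
```

Because the required sample size scales roughly with the inverse of the prevalence, the rarer scenarios (for example Scenario F) demand several thousand patients per arm.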
Statistical analyses of the pooled data were performed using SAS, version 9.1. All reported P-values are two sided with α = 0.05.
Results
DMOAD RCTs
Data from placebo arms of eight RCTs of putative DMOADs for OA were included in these post hoc analyses: ECHODIAH [Evaluation of the structure-modifying effects of diacerein in hip OA: ECHODIAH, a 3-year, placebo-controlled trial. Evaluation of the Chondromodulating Effect of Diacerein in OA of the Hip]16, PAVELKA [Glucosamine sulfate use and delay of progression of knee OA: a 3-year, randomized, placebo-controlled, double-blind study]17, DOXY [Effects of doxycycline on progression of OA: results of a randomized, placebo-controlled, double-blind trial]18, KOSTAR [Risedronate decreases biochemical markers of cartilage degradation but does not decrease symptoms or slow radiographic progression in patients with medial compartment OA of the knee: results of the 2-year multinational knee OA structural arthritis study]19, ERADIAS [Evaluation of the structure-modifying effect of Avocado-Soybean Unsaponifiables (ASU) in Hip OA: results of the ERADIAS study, a 3-year, prospective, randomized, double-blind, placebo-controlled trial]20, REGINSTER [Long-term effects of glucosamine sulfate on OA progression: a randomised, placebo-controlled clinical trial]21, GAIT [The effect of glucosamine and/or chondroitin sulfate on the progression of knee OA: a report from the glucosamine/chondroitin arthritis intervention trial]22, STOPP [Long-term effects of chondroitins 4 and 6 sulfate on knee OA: the study on OA progression prevention, a 2-year, randomized, double-blind, placebo-controlled trial]23.
The characteristics of these eight DMOAD RCTs are described in Table II and Table IIa. There were two studies of hip OA (ECHODIAH, ERADIAS). The remaining six RCTs were of knee OA. Pain was assessed by Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) and/or Visual Analog Scale (VAS)24, and function was assessed by WOMAC and/or Lequesne25 in all studies. Three studies had a minimum level of pain at baseline (on a normalized scale 0–100) required for entry into the study: ERADIAS (Lequesne ≥30), GAIT (WOMAC >25), STOPP (WOMAC/VAS >30). None of the studies had a minimum level of baseline function required for entry. Of note, both PAVELKA and REGINSTER required a minimum severity at enrollment of four points on the Lequesne Algofunctional Index. All eight studies included JSW measurements at the beginning and end of the studies. Radiographic progression was evaluated in these post hoc analyses using the cutoff of SDD (range 0.20–0.50 mm) in four of the DMOAD RCTs.
Table II.
Study | N* | Study duration (weeks) | # of visits during the trial† | Duration between study visits (weeks)‡ | OA site | Pain tool | Function tool | X-ray progression defined by SDD†† |
---|---|---|---|---|---|---|---|---|
DOXY18 | 155 | 120 | 13 | 8 | Knee | WOMAC, VAS | WOMAC | 0.50 mm |
ECHODIAH16 | 136 | 156 | 12 | 13 | Hip | VAS | Lequesne | N/A |
ERADIAS20 | 127 | 156 | 7 | 26 | Hip | WOMAC, VAS | WOMAC, Lequesne | N/A |
GAIT22 | 50 | 104 | 8 | 13** | Knee | WOMAC | WOMAC | 0.32 mm |
KOSTAR19 | 625 | 104 | 4 | 24 | Knee | WOMAC | WOMAC | 0.20 mm |
PAVELKA17 | 54 | 156 | 12 | 12 | Knee | WOMAC | WOMAC | N/A |
REGINSTER21 | 69 | 156 | 9 | 16 | Knee | WOMAC | WOMAC | N/A |
STOPP23 | 163 | 104 | 8 | 12 | Knee | WOMAC, VAS | WOMAC | 0.284 mm |
* Number of participants in the placebo arm with baseline radiographic measures.
† Starting at month 3 after baseline and including the visit at month 3.
‡ The number of weeks between visits starting at month 3.
** There were two visits from month 3 to month 6, at 8 and then 13 weeks apart.
†† These studies, in addition to defining X-ray progression as change in JSW > 0.5 mm, also analyzed their data by change in JSW > SDD.
Table IIa.
Study | Age range (years) | OA inclusion criteria | Exclusion criteria by arthritis | Exclusion criteria by treatment | Pain medication permitted? |
---|---|---|---|---|---|
DOXY18 | 45–64 | ♀; BMI*; unilateral radiographic knee OA† | Posttraumatic arthritis; inflammatory arthritis | Tetracycline allergy; intraarticular hyaluronan <6 months; intraarticular corticosteroid <3 months | Yes‡ |
ECHODIAH16 | N/A | Daily hip pain ≥1 month; Lequesne ≥3; JSW 1–3 mm | Secondary hip OA; medial femoral head migration | Intraarticular injection or surgery <3 months; contralateral THR <6 months | Yes** |
ERADIAS20 | ≥45 | Primary hip OA; JSW 1–4 mm; Lequesne ≥30 | Not available | Not available | Not available |
GAIT22 | ≥40 | Knee pain ≥6 months; Kellgren/Lawrence grade 2–3 knee OA | History of trauma; JSW <2 mm; arthroscopy <6 months; predominant lateral compartment OA | Oral/IM corticosteroids <1 month; intraarticular steroids to study joint <3 months; intraarticular steroids to any other joint <1 month; hyaluronic acid or congeners into the study joint <12 months; topical analgesics to the study joint or any oral analgesics <2 weeks or during the study; implementation of any other medical therapy for arthritis <1 month; glucosamine <3 months or chondroitin <6 months; initiation of physical therapy or muscle conditioning program <2 months; use of anticoagulants | Yes††† |
KOSTAR19 | 40–80 | Knee pain >1 month††; JSW 2–4 mm + ≥1 osteophyte in the medial tibiofemoral compartment‡‡ | Inflammatory arthritis; BMI >40 kg/m2; cancer within 10 years | Tetracycline use within 6 months; intraarticular corticosteroids or hyaluronan <3 months; calcitonin or fluoride <6 months; bisphosphonates <12 months or for >60 days ever | Yes*** |
PAVELKA17 | 45–70 | Knee OA of the medial femorotibial compartment†††; Lequesne ≥4 | Lequesne >12; inflammatory arthritis; trauma; BMI > 27 kg/m2, diabetes | Systemic or intraarticular corticosteroids <3 months | Yes‡‡‡ |
REGINSTER21 | >50 | Knee OA of the medial femorotibial compartment††† | Inflammatory arthritis; trauma; BMI > 30 kg/m2 | Systemic or intraarticular corticosteroids <3 months | Yes |
STOPP23 | 45–80 | Knee OA of the medial femorotibial compartment†††; JSW ≥ 1 mm; VAS ≥30 for >3 months | Kellgren/Lawrence grade 4; isolated lateral tibiofemoral or patellofemoral OA; surgery; trauma; secondary OA; hip OA; hip surgery | Intraarticular injection <3 months; treatment with symptomatic slow-acting or disease modifying OA drugs <3 months; corticosteroids <1 month | Yes**** |
* Upper tertile of age- and race-adjusted norms for body mass index (BMI) in women.
† According to American College of Rheumatology (ACR) criteria with Kellgren/Lawrence (K/L) grade 2 or 3 changes in the index knee and grade 0 or 1 changes in the contralateral knee.
‡ Except during the final 24 h washout period.
** 3–7 days washout period before study visits; no intraarticular or systemic corticosteroids were allowed.
†† Knee pain due to OA on most days during at least 1 month in a 3-month period prior to screening plus at least 1 of the following: age >50 years, morning knee stiffness <30 min, or knee crepitus according to ACR criteria for knee OA.
‡‡ Plus a medial compartment that was narrower than the lateral.
*** Acetaminophen/paracetamol and diclofenac were to be used as the only pain medications from day −5 to day −3 preceding the baseline, 6-, 12-, 18-, 24-month visits; all pain medications were discontinued on day −2 and day −1 prior to these visits and on the visit day.
††† Based on ACR criteria for OA.
‡‡‡ Acetaminophen only.
**** Appropriate washout periods before symptom assessment: 24 h for acetaminophen and 5 days for NSAIDs; i.e., at least five half-lives of the selected medication.
The eight OA RCTs were between 1 and 3 years in duration and represented 1379 patients with OA (of whom 1354 had pain and function measures). The baseline pain, function, and X-ray data for each study are summarized in Table III. The weighted baseline means among all eight studies were: JSW 3.18 mm (range: 2.39–4.05), pain score 37.8 (range: 31.0–45.8), function score 38.0 (range: 29.5–47.1), and pain + function 76.8. The number of patients that met symptomatic criteria at baseline is summarized in Table IIIa.
Table III.
Study | Mean pain* ± SD | Median pain* (95% CI) | Mean function* ± SD | Median function* (95% CI) | Mean JSW (mm)† ± SD | Median JSW (mm)† (95% CI) |
---|---|---|---|---|---|---|
DOXY18 | 27 ± 20 | 25 (24, 30) | 28 ± 18 | 28 (25, 31) | 3.69 ± 1.14 | 3.86 (3.51, 3.87) |
ECHODIAH16 | 42 ± 18 | 43 (39, 45) | 30 ± 9 | 29 (28, 32) | 2.39 ± 0.77 | 2.30 (2.26, 2.52) |
ERADIAS20 | 36 ± 24 | 34 (0, 84) | 29 ± 9 | 29 (11, 48) | 2.85 ± 0.90 | 2.80 (1.09, 4.61) |
GAIT22 | 46 ± 14 | 45 (42, 50) | 47 ± 18 | 44 (42, 52) | 3.94 ± 1.01 | 3.98 (3.66, 4.23) |
KOSTAR19 | 40 ± 22 | 39 (38, 42) | 44 ± 23 | 44 (42, 46) | 2.96 ± 0.59 | 2.99 (2.92, 3.01) |
PAVELKA17 | 31 ± 16 | 30 (27, 36) | 30 ± 16 | 29 (26, 34) | 3.80 ± 1.57 | 3.75 (3.37, 4.23) |
REGINSTER21 | 33 ± 21 | 31 (28, 38) | 38 ± 22 | 34 (32, 43) | 4.05 ± 1.25 | 4.10 (3.76, 4.35) |
STOPP23 | 38 ± 20 | 36 (35, 42) | 38 ± 21 | 35 (35, 41) | 3.69 ± 1.20 | 3.75 (3.51, 3.88) |
* At baseline entry into the individual study for all participants; normalized to scale of 0–100; 95% CI is for the mean.
† JSW in millimeters at baseline entry into the individual study for all participants.
Table IIIa.
Study | Pain + function ≥80 at baseline: all patients, N (%)† | Pain + function ≥80 at baseline: patients with radiographic progression* (change >0.5 mm), N | Pain + function ≥100 at baseline: all patients, N (%)† | Pain + function ≥100 at baseline: patients with radiographic progression* (change >0.5 mm), N |
---|---|---|---|---|
ECHODIAH16 | 52 (39%) | 30 | 16 (12%) | 7 |
ERADIAS20 | 46 (36%) | 24 | 18 (14%) | 9 |
PAVELKA17 | 14 (26%) | 3 | 5 (9%) | 1 |
REGINSTER21 | 26 (38%) | 12 | 16 (23%) | 7 |
DOXY18 | 37 (24%) | 16 | 18 (12%) | 10 |
GAIT22 | 30 (60%) | 4 | 19 (38%) | 2 |
KOSTAR19 | 316 (53%) | 55 | 229 (38%) | 42 |
STOPP23 | 68 (42%) | 19 | 41 (25%) | 9 |
* Number of patients from each DMOAD RCT who met symptomatic criteria at baseline (either pain + function ≥80 or ≥100) and radiographic criteria by the end of the study.
† Percentage of patients from corresponding DMOAD RCT who met symptomatic criteria at baseline among those with complete pain and function data at baseline.
Among six knee and two hip studies, 248 (22%) and 132 (51%) patients, respectively, had X-ray progression as defined by change in JSW greater than or equal to 0.5 mm by the study’s end. Among the four studies (DOXY, GAIT, KOSTAR, STOPP) that included radiographic progression measurements by decrease greater than SDD (of note, all were of knee OA), 366 patients (37%) met criteria for X-ray progression by decrease in JSW greater than SDD.
With all of the studies combined, the prevalence of patients fulfilling criteria for each of the nine clinical scenarios [Fig. 1(A), Table I] was highest (n = 486, 36%) in the least stringent scenario (Scenario A: pain + function ≥80 at ≥2 visits), and the fewest patients (n = 101, 7%) fulfilled criteria in the most stringent scenario (Scenario F: pain + function ≥100 at ≥4 visits). This trend was maintained when the studies were stratified by OA site (hip vs knee).
When radiographic progression (defined as change in JSW ≥0.5 mm) was added to the clinical scenarios, the prevalence of patients fulfilling a complete VJR scenario (both symptoms and radiographic progression) ranged from 2.2% to 12.1% [Fig. 1(B)]. Among the four studies analyzed also by SDD, the prevalence of patients fulfilling a complete VJR scenario when defining radiographic progression as change in JSW ≥ SDD was 3.4–16.7%.
Although individual studies demonstrated associations between a particular clinical scenario (A–I) and radiographic progression (Δ JSW ≥0.5 mm), in the pooled meta-analyses of all eight DMOAD RCTs (pooled odds ratio or effect size) there were no statistically significant relationships (Table IV). There was considerable heterogeneity, with I² results ranging from 17.8% to 62.8% (OR) and 48.5% to 69.4% (ES) across the scenarios.
Table IV.
Clinical scenario | ES* (95% CI) | OR† (95% CI) |
---|---|---|
A | −0.19 (−0.37, 0.00) | 1.40 (0.94, 2.09) |
B | −0.11 (−0.37, 0.14) | 1.29 (0.75, 2.22) |
C | −0.13 (−0.43, 0.17) | 1.36 (0.78, 2.35) |
D | −0.31 (−0.64, 0.03) | 1.35 (0.81, 2.26) |
E | −0.14 (−0.48, 0.21) | 1.21 (0.76, 1.91) |
F | −0.09 (−0.44, 0.26) | 0.97 (0.54, 1.74) |
G | −0.18 (−0.39, 0.03) | 1.30 (0.78, 2.17) |
H | −0.11 (−0.37, 0.14) | 1.21 (0.78, 1.86) |
I | −0.07 (−0.37, 0.22) | 1.19 (0.66, 2.13) |
* Effect size of clinical scenario on radiographic progression (outcome) as measured by change in JSW as a continuous variable.
† Pooled odds ratio of clinical scenario on radiographic progression (outcome) as measured by change in JSW as a dichotomous variable (≥0.5 mm).
The sample size required per study arm to demonstrate a 30% or 50% difference between treatment and placebo groups ranged from 352 to 6692 patients (per arm) (Table V).
Table V.
Clinical scenario | Δ 30% | Δ 50% |
---|---|---|
A | 1096 | 352 |
B | 1748 | 561 |
C | 2632 | 843 |
D | 1991 | 637 |
E | 3675 | 1172 |
F | 6692 | 2144 |
G | 1472 | 473 |
H | 2474 | 790 |
I | 3880 | 1237 |
Discussion
In this post hoc analysis of the placebo groups of eight large DMOAD RCTs with plain radiographic endpoints, representing over 1300 patients with OA, we found that the prevalence of patients with sustained symptomatic OA of at least moderate degree and evidence of radiographic progression is overall quite low. The scenario with the most lenient criteria to define VJR (Scenario A: pain + function ≥80 for ≥2 consecutive visits) had the highest prevalence, which was still only 12.14% once radiographic progression was required. To use these VJR criteria (Scenario A plus radiographic progression) as the primary outcome in a DMOAD RCT, 352 patients per study arm would be required to detect a 50% difference between groups.
These data and the overall impact of this OARSI/OMERACT initiative are best interpreted in the context of the OMERACT filter26. The OMERACT filter is composed of three key components: truth, discrimination, and feasibility. Each component criterion represents a question to be answered of an outcome measure in its intended settings.
Truth
The objective of the VJR criteria is to be a meaningful, attainable outcome in DMOAD RCTs that represents progression of OA (i.e., a successful DMOAD would have fewer patients reaching such an outcome). Cartilage degradation and structural derangement are inherent pathological processes seen radiographically in OA. However, providers agree that radiographic features of OA alone are not sufficient to necessitate therapy and symptoms and functional impairment are equally important aspects of the disease. Hence, agreement among the OARSI/OMERACT steering committee, experts in the field of DMOAD research and RCTs, regarding the use of both sustained symptoms and radiographic progression to fulfill VJR criteria demonstrates the face validity of the VJR as an outcome measure.
The application of the VJR criteria to data collected from eight DMOAD RCTs supports the content validity of this measure. However, there is no ‘gold standard’ for disease progression in the natural history of OA. Theoretically, this would be ‘joint death’ or ‘complete joint failure’ which is also difficult to define as evidenced by the variability in actual TJR rates9 and small number of patients reaching this endpoint in DMOAD RCTs. As there is no ‘gold standard’ for identifying significant progression in the natural course of OA, and VJR criteria could not be tested against such, we are unable to comment on criterion and construct validity at this time. However, it will be important for future investigations by this OARSI–OMERACT initiative to test the VJR scenarios in the context of a large, prospective longitudinal cohort, such as the OA Initiative (OAI), to establish such validity. The OAI consortium could also provide valuable data regarding the prevalence and incidence of VJR scenarios compared to TJR. This is an important concept that we could not address in these post hoc analyses because we did not consistently have TJR data from the placebo arm of these DMOAD RCTs.
Discrimination
We were not able to demonstrate discriminant capacity because, once again, there is no gold standard against which to measure. We were also not able to determine the measure’s sensitivity to change as this would require multiple measures of a population over time. None of the clinical scenarios (A–I) were able to discriminate between individuals with and without radiographic progression.
Feasibility
The VJR is easy to perform and requires only self-administered pain and function measures coupled with plain radiography. These measures are routinely obtained in DMOAD RCTs. However, we found that using these criteria as the primary outcome in a DMOAD RCT, 352 or 1096 individuals per study arm (based on the prevalence of VJR scenario A with radiographic progression) would be required to detect a 50% or 30% improvement over placebo, respectively. Although it is feasible to recruit >500 patients for a DMOAD study, as demonstrated by KOSTAR, one of the largest DMOAD RCTs included in these analyses with over 500 patients per arm, this will contribute considerable cost and time to the investigation. Further, a 50% difference from placebo, which allows for the ‘smallest’ sample size of 352 subjects per arm (VJR scenario A), may represent a high hurdle for a DMOAD. DOXY reported a 33% improvement over placebo in loss of JSW at 30 months, and ECHODIAH reported a 32% improvement over placebo in median annual joint space narrowing rate after 3 years. Changing the assumption to detect a 30% improvement over placebo with VJR Scenario A increases the sample size from a somewhat manageable 352 subjects per arm to a much larger study of 1096 per arm. It is also important to note that these sample size calculations do not account for drop-out rates, which are usually >20% in long-term DMOAD RCTs. For these reasons, the VJR is a feasible measure given its ease of administration, but due to the overall low prevalence and incidence of even the least stringent VJR scenario, large sample sizes will be required, making implementation costly and time-consuming.
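The effect of drop-out on enrollment can be illustrated with simple arithmetic. The sketch below assumes a flat 20% drop-out rate, which the text gives as a typical lower bound rather than a measured value, and inflates the Scenario A estimates from Table V so that the required number of patients completes the trial; the helper name is hypothetical.

```python
def inflate_for_dropout(n_per_arm: int, dropout_percent: int) -> int:
    """Patients to enroll per arm so that n_per_arm are expected to complete."""
    if not 0 <= dropout_percent < 100:
        raise ValueError("dropout_percent must be in [0, 100)")
    retained_percent = 100 - dropout_percent
    # Integer ceiling division avoids floating-point rounding surprises
    return -(-n_per_arm * 100 // retained_percent)


enroll_50 = inflate_for_dropout(352, 20)   # Scenario A, 50% difference
enroll_30 = inflate_for_dropout(1096, 20)  # Scenario A, 30% difference
```

Under this assumption, enrollment rises to 440 and 1370 patients per arm, respectively, and higher drop-out rates would raise it further.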
A primary strength of this study is the large number of international OA patients with serial pain, function, and radiographic measures. This is the first analysis of pooled data from across existing DMOAD clinical trials. It reflects the involvement of an international working group and the input of various investigators and organizations to address a critical question to facilitate DMOAD development.
The primary limitation of this study is that it is a post hoc analysis of summary data. For this reason we were unable to perform analyses related to individual patient-level data or control for confounders such as rescue medication use. Further, these studies differed significantly in terms of follow-up time, frequency of assessment (number of visits, time between visits), protocol for acquisition of radiographic data, and inclusion criteria. There was a low level of pain among all of the studies at entry. Because of the slow progression of OA and the episodic nature of symptom exacerbations, patients with low levels of symptoms would be less likely to have sustained severe symptoms over time than patients with high levels of baseline symptoms. There is, however, an inherent difficulty in enrolling patients with high levels of pain in a long-term DMOAD study: if pain is not adequately addressed, a high number of patients may not remain in the study over time. There were several differences between the hip and knee OA DMOAD RCTs. Most notably, there was a higher incidence of radiographic progression among the two hip OA studies (n = 132, 50.6%) than among the six knee OA studies (n = 248, 22%); pooled across all eight studies, 380 patients (27.6%) progressed. There was also considerable variability in the prevalence of radiographic progression among the knee OA studies (range: 15–45%). This heterogeneity may have contributed significantly to the pooled results, and potential VJR scenarios may need to be identified with studies limited to a single OA site. Finally, these data are based on X-ray radiography, with the method of acquisition and analysis of images varying from study to study. Magnetic resonance imaging (MRI) may be a more sensitive measure of OA progression; however, these methods are not yet standardized.
Future analyses of these pooled DMOAD RCT data should be considered with additional criteria. We only assessed JSW by change. By incorporating an absolute JSW cut-off we may be able to capture a larger group of patients who meet VJR criteria. It is likely that the inclusion of additional criteria would further increase the sample size requirements for studies. As there may be a group whose symptoms correlate more closely with change in JSW, further subgroup analyses, based on baseline JSW, may also be explored. This may be best addressed in an epidemiologic setting because many of the DMOAD RCTs had entry criteria which included a minimum JSW. Future studies should also consider the evaluation of known OA risk factors (e.g., age, history of trauma) with the VJR scenarios, and the sustainability of the VJR scenarios over time (i.e., the duration for which patients continue to meet VJR criteria).
The advancement of drug development for disease modifying medications for OA is of critical importance and requires robust studies with feasible sample sizes, reasonable duration of follow up time, and reliable, valid outcome measures. Ongoing investigation of the optimal cutoff threshold and combination of symptoms and radiographic change should be pursued.
Acknowledgment
None.
Role of the funding source
None.
Conflict of interests (Disclosures)
RL Manno: Rebecca Manno is supported by NIH T32 AR048522 (PI: Antony Rosen), the ACR/REF Rheumatology Scientist Development Award, and the Donald B. and Dorothy L. Stabler Foundation; CO Bingham III: Investigator/consultant/grant support P&G pharmaceuticals; S Paternotte: None; L Gossec: None; H Halhol: Employee of Expanscience Courbevoie; G Giacovelli: Employee of Rottapharm|Madaus; L Rovati: Employee of Rottapharm|Madaus; SA Mazzuca: None; DO Clegg: None; H Shi: None; E Tajana Messi: Employee of IBSA Institut Biochimique SA, Pambio-Noranco, Switzerland; A Lanzarotti: Employee of IBSA Institut Biochimique SA, Pambio-Noranco, Switzerland; M Dougados: None.
Footnotes
Author contributions
RL Manno: Analysis/interpretation of data, drafting/revising article, final approval of article.
CO Bingham III: Conception and design, drafting/revising article, final approval of article.
S Paternotte: Conception and design, statistical expertise, final approval of article.
L Gossec: Conception and design, drafting/revising article, final approval of article.
H Halhol: Data acquisition/analyses, drafting of the article, final approval of article.
G Giacovelli: Data acquisition/analyses, drafting of the article, final approval of article.
L Rovati: Data acquisition/analyses, drafting of the article, final approval of article.
SA Mazzuca: Conception and design, statistical expertise, final approval of article.
DO Clegg: Analysis/interpretation of data, drafting/revising article, final approval of article.
H Shi: Analysis/interpretation of data, drafting/revising article, final approval of article.
E Tajana Messi: Analysis/interpretation of data, drafting/revising article, final approval of article.
A Lanzarotti: Analysis/interpretation of data, drafting/revising article, final approval of article.
M Dougados: Conception and design, drafting/revising article, final approval of article.
References
- 1. Maillefert JF, Hawker GA, Gossec L, Mahomed NN, Lohmander S, Dieppe PA, et al. Concomitant therapy: an outcome variable for musculoskeletal disorders? Part 2: total joint replacement in osteoarthritis trials. J Rheumatol. 2005;32:2449–2451.
- 2. Dougados M, Gueguen A, Nguyen M, Berdah L, Lequesne M, Mazieres B, et al. Requirement for total hip arthroplasty: an outcome measure of hip osteoarthritis? J Rheumatol. 1999;26:855–861.
- 3. Abadie E, Ethgen D, Avouac B, Bouvenot G, Branco J, Bruyere O, et al. Recommendations for the use of new methods to assess the efficacy of disease-modifying drugs in the treatment of osteoarthritis. Osteoarthritis Cartilage. 2004;12:263–268. doi: 10.1016/j.joca.2004.01.006.
- 4. Hawker GA, Guan J, Croxford R, Coyte PC, Glazier RH, Harvey BJ, et al. A prospective population-based study of the predictors of undergoing total joint arthroplasty. Arthritis Rheum. 2006;54:3212–3220. doi: 10.1002/art.22146.
- 5. Hannan MT, Felson DT, Pincus T. Analysis of the discordance between radiographic changes and knee pain in osteoarthritis of the knee. J Rheumatol. 2000;27:1513–1517.
- 6. Merx H, Dreinhofer K, Schrader P, Sturmer T, Puhl W, Gunther KP, et al. International variation in hip replacement rates. Ann Rheum Dis. 2003;62:222–226. doi: 10.1136/ard.62.3.222.
- 7. Gossec L, Tubach F, Baron G, Ravaud P, Logeart I, Dougados M. Predictive factors of total hip replacement due to primary osteoarthritis: a prospective 2 year study of 505 patients. Ann Rheum Dis. 2005;64:1028–1032. doi: 10.1136/ard.2004.029546.
- 8. Boutron I, Rannou F, Jardinaud-Lopez M, Meric G, Revel M, Poiraudeau S. Disability and quality of life of patients with knee or hip osteoarthritis in the primary care setting and factors associated with general practitioners’ indication for prosthetic replacement within 1 year. Osteoarthritis Cartilage. 2008;16:1024–1031. doi: 10.1016/j.joca.2008.01.001.
- 9. Gossec L, Paternotte S, Maillefert JF, Combescure C, Conaghan PG, Davis AM, et al. The role of pain and functional impairment in the decision to recommend total joint replacement in hip and knee osteoarthritis: an international cross-sectional study of 1909 patients. Report of the OARSI–OMERACT Task Force on total joint replacement. Osteoarthritis Cartilage. 2011;19(2):147–154. doi: 10.1016/j.joca.2010.10.025.
- 10. Gossec L, Hawker G, Davis AM, Maillefert JF, Lohmander LS, Altman R, et al. OMERACT/OARSI initiative to define states of severity and indication for joint replacement in hip and knee osteoarthritis. J Rheumatol. 2007;34:1432–1435.
- 11. Dougados M, Hawker G, Lohmander S, Davis AM, Dieppe P, Maillefert JF, et al. OARSI/OMERACT criteria of being considered a candidate for total joint replacement in knee/hip osteoarthritis as an endpoint in clinical trials evaluating potential disease modifying osteoarthritic drugs. J Rheumatol. 2009;36:2097–2099. doi: 10.3899/jrheum.090365.
- 12. Ornetti P, Brandt K, Hellio-Le Graverand MP, Hochberg M, Hunter DJ, Kloppenburg M, et al. OARSI–OMERACT definition of relevant radiological progression in hip/knee osteoarthritis. Osteoarthritis Cartilage. 2009;17:856–863. doi: 10.1016/j.joca.2009.01.007.
- 13. Gossec L, Jordan JM, Lam MA, Fang F, Renner JB, Davis A, et al. Comparative evaluation of three semi-quantitative radiographic grading techniques for hip osteoarthritis in terms of validity and reproducibility in 1404 radiographs: report of the OARSI–OMERACT task force. Osteoarthritis Cartilage. 2009;17:182–187. doi: 10.1016/j.joca.2008.06.009.
- 14. Gossec L, Jordan JM, Mazzuca SA, Lam MA, Suarez-Almazor ME, Renner JB, et al. Comparative evaluation of three semi-quantitative radiographic grading techniques for knee osteoarthritis in terms of validity and reproducibility in 1759 X-rays: report of the OARSI–OMERACT task force. Osteoarthritis Cartilage. 2008;16:742–748. doi: 10.1016/j.joca.2008.02.021.
- 15. Bruynesteyn K, Boers M, Kostense P, van der Linden S, van der Heijde D. Deciding on progression of joint damage in paired films of individual patients: smallest detectable difference or change. Ann Rheum Dis. 2005;64:179–182. doi: 10.1136/ard.2003.018457.
- 16. Dougados M, Nguyen M, Berdah L, Mazieres B, Vignon E, Lequesne M, et al. Evaluation of the structure-modifying effects of diacerein in hip osteoarthritis: ECHODIAH, a three-year, placebo-controlled trial. Evaluation of the Chondromodulating Effect of Diacerein in OA of the Hip. Arthritis Rheum. 2001;44:2539–2547. doi: 10.1002/1529-0131(200111)44:11<2539::aid-art434>3.0.co;2-t.
- 17. Pavelka K, Gatterova J, Olejarova M, Machacek S, Giacovelli G, Rovati LC. Glucosamine sulfate use and delay of progression of knee osteoarthritis: a 3-year, randomized, placebo-controlled, double-blind study. Arch Intern Med. 2002;162:2113–2123. doi: 10.1001/archinte.162.18.2113.
- 18. Brandt KD, Mazzuca SA, Katz BP, Lane KA, Buckwalter KA, Yocum DE, et al. Effects of doxycycline on progression of osteoarthritis: results of a randomized, placebo-controlled, double-blind trial. Arthritis Rheum. 2005;52:2015–2025. doi: 10.1002/art.21122.
- 19. Bingham CO 3rd, Buckland-Wright JC, Garnero P, Cohen SB, Dougados M, Adami S, et al. Risedronate decreases biochemical markers of cartilage degradation but does not decrease symptoms or slow radiographic progression in patients with medial compartment osteoarthritis of the knee: results of the two-year multinational knee osteoarthritis structural arthritis study. Arthritis Rheum. 2006;54:3494–3507. doi: 10.1002/art.22160.
- 20. Maheu E, Cadet C, Marty M, Moyse D, Kerloch I, Coste P, et al. Evaluation of the structure-modifying effect of avocado-soybean unsaponifiables (ASU) in hip osteoarthritis (OA): results of the ERADIAS study, a 3-year, prospective, randomized, double-blind, placebo controlled trial [abstract]. Arthritis Rheum. 2009;60(Suppl 10):847.
- 21. Reginster JY, Deroisy R, Rovati LC, Lee RL, Lejeune E, Bruyere O, et al. Long-term effects of glucosamine sulphate on osteoarthritis progression: a randomised, placebo-controlled clinical trial. Lancet. 2001;357:251–256. doi: 10.1016/S0140-6736(00)03610-2.
- 22. Sawitzke AD, Shi H, Finco MF, Dunlop DD, Bingham CO 3rd, Harris CL, et al. The effect of glucosamine and/or chondroitin sulfate on the progression of knee osteoarthritis: a report from the glucosamine/chondroitin arthritis intervention trial. Arthritis Rheum. 2008;58:3183–3191. doi: 10.1002/art.23973.
- 23. Kahan A, Uebelhart D, De Vathaire F, Delmas PD, Reginster JY. Long-term effects of chondroitins 4 and 6 sulfate on knee osteoarthritis: the study on osteoarthritis progression prevention, a two-year, randomized, double-blind, placebo-controlled trial. Arthritis Rheum. 2009;60:524–533. doi: 10.1002/art.24255.
- 24. Bellamy N, Buchanan WW, Goldsmith CH, Campbell J, Stitt LW. Validation study of WOMAC: a health status instrument for measuring clinically important patient relevant outcomes to antirheumatic drug therapy in patients with osteoarthritis of the hip or knee. J Rheumatol. 1988;15:1833–1840.
- 25. Lequesne MG, Mery C, Samson M, Gerard P. Indexes of severity for osteoarthritis of the hip and knee. Validation–value in comparison with other assessment tests. Scand J Rheumatol Suppl. 1987;65:85–89. doi: 10.3109/03009748709102182.
- 26. Boers M, Brooks P, Strand CV, Tugwell P. The OMERACT filter for outcome measures in rheumatology. J Rheumatol. 1998;25:198–199.