Abstract
Background and purpose
The prevention of disability over the long term is the main treatment goal in multiple sclerosis (MS); however, randomized clinical trials evaluate only short‐term treatment effects on disability. This study aimed to define criteria for 6‐month confirmed disability progression events of MS with a high probability of resulting in sustained long‐term disability worsening.
Methods
In total, 14,802 6‐month confirmed disability progression events were identified in 8741 patients from the global MSBase registry. For each 6‐month confirmed progression event (13,321 in the development and 1481 in the validation cohort), a sustained progression score was calculated based on the demographic and clinical characteristics at the time of progression that were predictive of long‐term disability worsening. The score was externally validated in the Cladribine Tablets Treating Multiple Sclerosis Orally (CLARITY) trial.
Results
The score was based on age, sex, MS phenotype, relapse activity, disability score and its change from baseline, number of affected functional system domains and worsening in six of the domains. In the internal validation cohort, a 61% lower chance of improvement was estimated with each unit increase in the score (hazard ratio 0.39, 95% confidence interval 0.29–0.52; discriminatory index 0.89). The proportions of progression events sustained at 5 years stratified by the score were 1: 72%; 2: 88%; 3: 94%; 4: 100%. The results of the CLARITY trial were confirmed for reduction of disability progression that was >88% likely to be sustained (events with score ˃1.5).
Conclusions
Clinicodemographic characteristics of 6‐month confirmed disability progression events identify those at high risk of sustained long‐term disability. This knowledge will allow future trials to better assess the effect of therapy on long‐term disability accrual.
Keywords: CLARITY, clinical trial, functional system impairment, risk scoring, sustained disability progression
Using 13,321 confirmed disability progression events from the MSBase registry, a sustained progression score was developed based on patients' characteristics at the time of progression. The sustained progression score helps identify those confirmed progression events that will be sustained over at least 5 years. This score allows randomized trials to establish the effect of therapy not only on short‐term but also on long‐term disability accrual, as demonstrated in our reanalysis of the Cladribine Tablets Treating Multiple Sclerosis Orally (CLARITY) trial data.
INTRODUCTION
Multiple sclerosis (MS) is associated with the accumulation of disability over time, which affects multiple neurological domains. In trials of MS therapies, disability outcomes are mostly measured using the Expanded Disability Status Scale (EDSS), which ranges from 0 to 10 with higher scores indicating more severe disease. A ≥1 point (≥1.5 if the baseline score is 0) increase in EDSS score that is sustained over ≥3 or ≥6 months is an accepted clinically meaningful measure of sustained disability accrual [1, 2, 3]. However, 3‐ to 6‐month confirmed disability progression events can overestimate the accumulation of irreversible disability by up to 30% [4]. Assessment of long‐term treatment efficacy is therefore challenging in standard clinical trial settings.
Expanded Disability Status Scale scores, particularly at the lower levels, are predominantly based on signs and symptoms in seven neurological domains—pyramidal, cerebellar, brainstem, sensory, bowel and bladder, visual, and cerebral (cognitive)—assessed using functional system scores (FSSs). Changes in the pyramidal, cerebellar, bowel and bladder, and sensory domains contribute more to EDSS progression sustained for at least 3 to 6 months than other domains [5, 6, 7]. However, none of the studies has investigated the impact of lead worsening FSS identity/type on sustained disability progression in the long term.
The aim of this study was to develop a risk scoring system to identify disability persistent over the long term, using information about patients' demographic and clinical characteristics, in particular the change in specific neurological functions at the time of 6‐month confirmed progression events. The score will improve the ability of future clinical trials to evaluate the effect of MS therapies on long‐term disability.
METHODS
Ethics statement
The study was approved by the Melbourne Health Human Research Ethics Committee and by the MSBase site institutional review boards. Written informed consent was obtained from enrolled patients as required in accordance with the Declaration of Helsinki.
Study design
All 6‐month confirmed disability progression events recorded in the MSBase registry [8], a global observational cohort of MS patients, were identified. For each progression event, the probability of subsequent improvement in disability was estimated, depending on patient characteristics at the time of progression. Based on the identified association, a sustained progression score was developed that quantifies the likelihood that a progression event is sustained over the long term. This score was then internally validated in order to establish the accuracy with which the score identifies those events that will remain sustained. In an external validation step, the score was applied in a clinical trial dataset to demonstrate its use in estimating the effect of MS therapies on long‐term disability outcomes.
Study population
Longitudinal demographic and clinical data collected as part of routine clinical care from 129 mostly tertiary centres in 34 countries were extracted from MSBase in December 2016. The inclusion criteria consisted of the diagnosis of MS or clinically isolated syndrome [9, 10], at least four visits with EDSS score and FSSs recorded, and availability of the minimum dataset. The minimum dataset included date of birth, sex, date of first clinical presentation, clinical visits, disease course, relapses and treating centre. Centres contributing fewer than 10 patient records to MSBase were not included.
The MSBase data are entered in local data entry systems, either iMed or MDS, and are typically updated 6–12 monthly. Data quality was assessed prior to data extraction as per standard MSBase procedures [11].
Study outcomes
All 6‐month confirmed disability progression events and the time over which the progression events remained sustained (i.e., the time until the next 6‐month confirmed disability improvement event) were identified. A progression event was defined by an increase of ≥1.5 EDSS steps from a baseline score of 0, 1 step from baseline scores 1.0–5.5, or 0.5 step from a baseline score ≥6.0, sustained at two or more consecutive visits separated by ≥6 months. Improvement of EDSS was defined as a decrease of 1.5 EDSS steps from a baseline score of 1.5, ≥1 step from baseline scores 2–6, or 0.5 step from a baseline score ˃6, sustained at two or more consecutive visits separated by ≥6 months. To confirm a progression or improvement, only EDSS scores recorded more than 30 days from the onset of a preceding relapse were used [4]. The minimum EDSS score recorded within 6 months after an identified disability progression or improvement event was considered as the new baseline EDSS. ‘Time to improvement’ represents the time a confirmed progression event was sustained before the occurrence of a 6‐month confirmed improvement.
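The baseline‐dependent progression thresholds described above can be written compactly. The following Python sketch is illustrative only: the function names are hypothetical, the study itself was analysed in R, and the 6‐month confirmation step, the 30‐day relapse window and the re‐setting of the baseline are deliberately left out.

```python
def progression_threshold(baseline_edss: float) -> float:
    """Minimum EDSS increase that counts as progression for a given baseline."""
    if baseline_edss == 0:
        return 1.5
    if baseline_edss <= 5.5:
        return 1.0
    return 0.5  # baseline EDSS of 6.0 or higher


def meets_progression_threshold(baseline_edss: float, current_edss: float) -> bool:
    """True if the increase from baseline reaches the threshold (before confirmation)."""
    return current_edss - baseline_edss >= progression_threshold(baseline_edss)
```

For instance, under these rules an increase from 6.0 to 6.5 qualifies, whereas an increase from 0 to 1.0 does not.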
Primary model development
A multivariable Cox proportional hazards model was constructed using 90% of the confirmed progression events randomly assigned to the development cohort. This model assessed the associations of demographic and clinical characteristics recorded at the time of progression with the occurrence of a confirmed improvement after a recorded progression. The characteristics included were age, sex, disease course, disease duration, EDSS score ≥6 (categorized based on the Kaplan–Meier survival curves), EDSS change (difference in score between progression event and baseline), number of affected functional system domains (including ambulation), recency of the previous relapse (≥2, 1–2, <1 month) and worsening in any of the seven FSSs at the progression event relative to the corresponding baseline score. The model was adjusted for follow‐up visit density (visits per year) and two‐way interactions between worsening in FSSs and disease duration at the progression event (to account for the differential effect of FSS worsening on the likelihood of disability improvement at different disease durations). Within‐patient correlation due to the multiple progression events was modelled with a frailty random effect. The model was also adjusted for MSBase centres to account for the inter‐centre variability in EDSS scoring. The goodness‐of‐fit of the model was assessed by checking for influential observations using index plots of dfbeta residuals, a measure of the influence of an observation on the regression coefficient. Proportionality of hazards was assessed with Schoenfeld residuals. To evaluate the contribution of variables related to functional system domains in model fitting, a comparison was performed with a null model that consisted of demographic and clinical characteristics only, using the Akaike information criterion.
Sensitivity analyses
To evaluate the robustness of the primary model in different scenarios, two sensitivity analyses were conducted including (i) only relapsing–remitting multiple sclerosis (RRMS) patients with EDSS ˂6 and (ii) only the first progression event identified for each patient including all MS phenotypes. Another sensitivity analysis was performed by adjusting the primary model for treatment status (untreated/low‐efficacy disease‐modifying therapy (DMT)/high‐efficacy DMT) at the time of the confirmed progression events. Low‐efficacy DMTs included interferon beta, glatiramer acetate, dimethyl fumarate and teriflunomide. High‐efficacy DMTs included natalizumab, alemtuzumab, ocrelizumab, rituximab, cladribine, fingolimod, mitoxantrone and daclizumab.
Constructing the sustained progression score
The sustained progression score was derived for each progression event from the sum of the regression coefficients of all characteristics for which p ≤ 0.1 in the primary model.
Validation analyses
The internal validation cohort consisted of 10% of the 6‐month confirmed progression events, not overlapping with patients of the development cohort. Two models were used to assess whether the score predicts the persistence of the progression events. A univariate Cox model evaluated the association of the score with ‘time to improvement’. The discrimination ability of the Cox model was assessed using Harrell's c index [12]. A logistic regression model used a subset of the progression events with available follow‐up of ≥5 years to evaluate the association between the score and the likelihood of a progression event being sustained for ≥5 years. Another univariate Cox model was fitted to validate the association between the score and the time to disability improvement using progression events that occurred during the RRMS phase only.
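For readers unfamiliar with Harrell's c index, the pairwise comparison behind it can be sketched in a few lines of Python. This is a simplified illustration rather than the routine used in the study (which was run in R): the function name is hypothetical, pairs with tied times are skipped, and the data are synthetic.

```python
from typing import Sequence


def harrell_c(times: Sequence[float], events: Sequence[int], risk: Sequence[float]) -> float:
    """Harrell's concordance index for right-censored data.

    A pair (i, j) is comparable when subject i has the shorter time and an
    observed event. It is concordant when i also has the higher predicted risk.
    Pairs with tied times are skipped in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored time cannot anchor a comparable pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Synthetic toy data (not from the study): years until improvement, 1 = improved.
times = [2.0, 4.0, 6.0, 8.0]
improved = [1, 1, 0, 1]
scores = [0.5, 1.0, 2.0, 0.8]

# A higher score means a lower hazard of improvement, so the risk is the negated score.
c_index = harrell_c(times, improved, [-s for s in scores])
```

In this toy example four of the five comparable pairs are ordered correctly, giving a c index of 0.8; a value of 0.5 corresponds to chance and 1.0 to perfect ranking.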
Validation and application in a randomized trial setting
The sustained progression score was externally validated using data from the Cladribine Tablets Treating Multiple Sclerosis Orally (CLARITY) and CLARITY 2‐year extension studies [13]. Three‐ and 6‐month confirmed disability progression events and the time over which the progression events remained sustained in the combined dataset of the CLARITY and CLARITY extension studies were identified. A univariate Cox model was used to validate the association of the score calculated for each 6‐month confirmed disability progression event with ‘time to improvement’.
To demonstrate the application of the score in a trial setting, the results of the CLARITY study were first replicated by comparing the risk of 3‐month confirmed disability progression between the placebo arm and the combined cladribine 3.5‐mg and 5.25‐mg arms. A Cox model was then developed evaluating the effect of cladribine on 3‐month confirmed progression events (following the CLARITY study) with a score ≥1.51 (median score) to demonstrate the application of the score in estimating the effect of therapy on long‐term disability outcomes.
All analyses were done in R 3.5.1 [14].
RESULTS
A total of 14,802 6‐month confirmed disability progression events were identified in 8741 patients fulfilling the inclusion criteria (Figure 1). 92% of the progression events were corroborated by ≥1 unit worsening in any of the seven FSSs (other than gait). The development cohort included 13,321 progression events (7516 patients) and the validation cohort consisted of the remaining 1481 events (1226 patients) with similar characteristics (Table 1).
FIGURE 1.
Flow diagram of patients and 6‐month confirmed progression events included in the analyses
TABLE 1.
Baseline characteristics of patients included in the development and the internal validation cohort
Characteristic | Development cohort | Validation cohort | |
---|---|---|---|
Patients, no. (% female) | 7516 (69) | 1226 (68) | |
Confirmed progression events, no. (%) | 13,321 (90) | 1481 (10) | |
Age at symptom onset, years a | 31 (24–39) | 31 (24–40) | |
Age at inclusion, years a | 37 (30–46) | 38 (30–47) | |
Disease duration from symptom onset to inclusion, years a | 3.57 (0.87–9.44) | 3.68 (0.83–9.72) | |
Follow‐up duration, years a | 9.48 (6.02–13.32) | 8.54 (5.17–12.28) | |
Disease course, no. (%) | |||
At inclusion | |||
Clinically isolated syndrome | 1277 (16.99) | 227 (18.52) | |
Relapsing–remitting | 5008 (66.63) | 793 (64.68) | |
Secondary progressive | 561 (7.46) | 95 (7.57) | |
Primary progressive | 552 (7.34) | 97 (7.91) | |
Progressive‐relapsing | 118 (1.57) | 14 (1.14) | |
At censoring | |||
Clinically isolated syndrome | 88 (1.17) | 16 (1.31) | |
Relapsing–remitting | 5009 (66.64) | 894 (72.92) | |
Secondary progressive | 1749 (23.27) | 205 (16.72) | |
Primary progressive | 445 (5.92) | 77 (6.28) | |
Progressive‐relapsing | 225 (2.99) | 34 (2.77) | |
EDSS score a | |||
At inclusion | 2.00 (1.50–3.50) | 2.00 (1.50–3.50) | |
At censoring | 4.50 (2.50–6.50) | 4.00 (2.00–6.00) | |
Functional system score at inclusion a , % of non‐zero scores | |||
Pyramidal | 1 (0–2), 72 | 1 (0–3), 73 | |
Cerebellar | 0 (0–2), 42 | 0 (0–2), 41 | |
Brainstem | 0 (0–1), 33 | 0 (0–1), 34 | |
Sensory | 1 (0–2), 53 | 1 (0–2), 54 | |
Bowel, bladder | 0 (0–1), 33 | 0 (0–1), 35 | |
Visual | 0 (0–0), 24 | 0 (0–0), 24 | |
Cerebral | 0 (0–0), 19 | 0 (0–0), 18 | |
Annualized visit density a | 1.79 (1.23–2.58) | 1.74 (1.16–2.59) |
Abbreviation: EDSS, Expanded Disability Status Scale.
a Median (quartiles).
The fit of the primary model was superior compared to the null model that did not include any functional system domains (Akaike information criterion difference 70). The characteristics associated with the chance of improvement after progression included age, sex, RRMS and primary progressive MS (compared to clinically isolated syndrome), a relapse <1 month prior to the progression event (compared to relapse ≥2 months ago), EDSS score ≥6 at the progression event and greater EDSS change from baseline, number of affected functional system domains, worsening in pyramidal, cerebellar, brainstem, sensory, visual and cerebral FSSs, and interaction of disease duration with worsening in pyramidal, sensory and cerebral FSSs (Table 2). The index plots comparing the largest dfbeta residuals to the regression coefficient for each covariate of the primary model illustrated no influential observations (Figure S1).
TABLE 2.
Associations between patient characteristics at the time of 6‐month confirmed disability progression and the subsequent improvement
Covariate | Coefficient (95% CI) | Hazard ratio (95% CI) |
---|---|---|
Age, years | −0.02 (−0.02, −0.01)*** | 0.98 (0.98, 0.99)*** |
Male | −0.12 (−0.26, 0.02)* | 0.89 (0.77, 1.02)* |
Disease course | ||
Clinically isolated syndrome | Ref | Ref |
Relapsing–remitting | 0.36 (−0.12, 0.84)* | 1.43 (0.89, 2.31)* |
Secondary progressive | −0.23 (−0.75, 0.28) | 0.79 (0.47, 1.33) |
Primary progressive | −0.73 (−1.34, −0.13)** | 0.48 (0.26, 0.88)** |
Progressive‐relapsing | −0.31 (−0.98, 0.36) | 0.73 (0.38, 1.43) |
Disease duration, years | −0.001 (−0.02, 0.01) | 1.00 (0.98, 1.02) |
Recency of a previous relapse | ||
≥2 months | Ref | Ref |
1–<2 months | 0.14 (−0.08, 0.36) | 1.15 (0.92, 1.44) |
˂1 month | 0.24 (0.10, 0.39)** | 1.28 (1.10, 1.47)** |
EDSS score | ||
0–5.5 | Ref | Ref |
≥6 | −0.25 (−0.44, −0.06)** | 0.78 (0.65, 0.94)** |
Change in EDSS score from baseline | −0.37 (−0.47, −0.27)*** | 0.69 (0.63, 0.76)*** |
No. of affected functional system domains | −0.08 (−0.12, −0.04)*** | 0.92 (0.88, 0.96)*** |
Worsening in pyramidal FSS | −0.14 (−0.27, −0.02)** | 0.87 (0.76, 0.98)** |
Worsening in cerebellar FSS | −0.12 (−0.25, 0.02)* | 0.89 (0.78, 1.02)* |
Worsening in brainstem FSS | 0.17 (0.03, 0.30)** | 1.18 (1.03, 1.35)** |
Worsening in sensory FSS | 0.14 (0.03, 0.25)** | 1.15 (1.03, 1.29)** |
Worsening in bowel, bladder FSS | −0.03 (−0.17, 0.11) | 0.97 (0.85, 1.12) |
Worsening in visual FSS | 0.12 (0.001, 0.25)* | 1.13 (1.00, 1.28)* |
Worsening in cerebral FSS | 0.15 (−0.04, 0.33)* | 1.16 (0.96, 1.39)* |
Pyramidal × disease duration | 0.01 (−0.001, 0.02)* | 1.01 (1.00, 1.02)* |
Cerebellar × disease duration | 0.003 (−0.01, 0.01) | 1.00 (0.99, 1.01) |
Brainstem × disease duration | −0.01 (−0.02, 0.005) | 0.99 (0.98, 1.00) |
Sensory × disease duration | −0.01 (−0.02, 0.001)* | 0.99 (0.98, 1.00)* |
Bowel, bladder × disease duration | 0.002 (−0.01, 0.01) | 1.00 (0.99, 1.01) |
Visual × disease duration | −0.0003 (−0.01, 0.01) | 1.00 (0.99, 1.01) |
Cerebral × disease duration | −0.01 (−0.03, 0.002)* | 0.99 (0.97, 1.00)* |
Annualized visit density | 0.06 (−18.21, 22.49)** | 1.06 (1.01, 1.12)** |
Note: The associations were estimated in a multivariable Cox proportional hazards model. The Schoenfeld test suggested that increase in age violates the proportional hazards assumption; however, the plot of scaled Schoenfeld residuals over time did not reveal any non‐random pattern, thus not providing evidence against the proportionality of hazards.
Abbreviations: CI, confidence interval; EDSS, Expanded Disability Status Scale; FSS, functional system score.
*p ≤ 0.1.
**p < 0.05.
***p < 0.001.
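The two numeric columns of Table 2 are linked by the usual Cox model relation, hazard ratio = exp(coefficient). A brief Python sketch, using a handful of rows copied from Table 2 and hypothetical helper names, shows the conversion; small discrepancies are expected because both columns are reported to two decimal places.

```python
import math

# (coefficient, reported hazard ratio) pairs copied from Table 2
TABLE2_ROWS = {
    "Relapsing-remitting": (0.36, 1.43),
    "Primary progressive": (-0.73, 0.48),
    "EDSS score >= 6": (-0.25, 0.78),
    "Change in EDSS score from baseline": (-0.37, 0.69),
    "No. of affected functional system domains": (-0.08, 0.92),
    "Worsening in pyramidal FSS": (-0.14, 0.87),
}


def hazard_ratio(coefficient: float) -> float:
    """Convert a Cox regression coefficient (log hazard ratio) to a hazard ratio."""
    return math.exp(coefficient)


def percent_change_in_hazard(hr: float) -> float:
    """Express a hazard ratio as a percentage change in the hazard of improvement."""
    return (hr - 1.0) * 100.0


# Largest gap between exp(coefficient) and the reported hazard ratio.
max_gap = max(abs(hazard_ratio(b) - hr) for b, hr in TABLE2_ROWS.values())
```

Read this way, a hazard ratio of 0.69 per EDSS step of change corresponds to a roughly 31% lower hazard of subsequent improvement.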
The two sensitivity analyses replicated the outcomes of the primary model almost in full, with a small number of exceptions (Table 3): worsening in cerebral FSS did not achieve statistical significance in either model, male sex was no longer associated with a lower likelihood of improvement (analysis of only progression events during RRMS and EDSS < 6), secondary progressive and progressive‐relapsing MS but not RRMS were associated with a higher risk of sustained progression (analysis including only first progression event). The primary model after adjustment for treatment status at the time of confirmed progression events produced very similar estimates in terms of magnitude, direction and significance of associations (Table S3). Progression events recorded on low‐efficacy DMTs were marginally more likely to be sustained than events recorded on high‐efficacy DMTs. Importantly, the primary model was independent from treatment status at the time of the progression event. Therefore, the estimates of the associations between the baseline patient characteristics and the likelihood of sustained disability change due to progression events can be generalized to most of the treatment scenarios.
TABLE 3.
Sensitivity analyses of the primary model in two different scenarios
Covariate | Relapsing–remitting multiple sclerosis with EDSS score <6, coefficient (95% CI) | First progression event only, coefficient (95% CI) |
---|---|---|
Age, years | −0.018 (−0.026, −0.010)*** | −0.019 (−0.026, −0.012)*** |
Male | −0.076 (−0.225, 0.074) | −0.110 (−0.245, 0.025)* |
Disease course | NA | |
Clinically isolated syndrome | | Ref |
Relapsing–remitting | | 0.177 (−0.207, 0.560) |
Secondary progressive | | −0.415 (−0.876, 0.046)* |
Primary progressive | | −0.912 (−1.505, −0.318)** |
Progressive‐relapsing | | −0.604 (−1.312, 0.105)* |
Disease duration, years | −0.004 (−0.024, 0.016) | 0.006 (−0.011, 0.022) |
Recency of a previous relapse | ||
≥2 months | Ref | Ref |
1–˂2 months | 0.102 (−0.126, 0.329) | 0.137 (−0.088, 0.363) |
˂1 month | 0.253 (0.103, 0.404)** | 0.219 (0.073, 0.365)** |
EDSS score | NA | |
0–5.5 | | Ref |
≥6 | | −0.267 (−0.511, −0.024)** |
Change in EDSS score from baseline | −0.379 (−0.496, −0.263)*** | −0.289 (−0.389, −0.189)*** |
No. of affected functional system domains | −0.093 (−0.139, −0.047)*** | −0.097 (−0.141, −0.052)*** |
Worsening in pyramidal FSS | −0.144 (−0.281, −0.006)** | −0.096 (−0.220, 0.028)* |
Worsening in cerebellar FSS | −0.126 (−0.267, 0.014)* | −0.162 (−0.294, −0.030)** |
Worsening in brainstem FSS | 0.116 (−0.031, 0.262)* | 0.103 (−0.029, 0.235)* |
Worsening in sensory FSS | 0.166 (0.044, 0.289)** | 0.170 (0.061, 0.278)** |
Worsening in bowel, bladder FSS | −0.071 (−0.234, 0.093) | −0.093 (−0.239, 0.054) |
Worsening in visual FSS | 0.132 (0.001, 0.264)* | 0.109 (−0.004, 0.222)* |
Worsening in cerebral FSS | 0.133 (−0.069, 0.335) | 0.062 (−0.121, 0.246) |
Pyramidal × disease duration | 0.008 (−0.003, 0.020) | 0.004 (−0.007, 0.014) |
Cerebellar × disease duration | 0.009 (−0.003, 0.020)* | 0.009 (−0.002, 0.019)* |
Brainstem × disease duration | −0.003 (−0.016, 0.010) | −0.004 (−0.016, 0.007) |
Sensory × disease duration | −0.011 (−0.022, 0.000)* | −0.010 (−0.020, −0.001)** |
Bowel, bladder × disease duration | −0.003 (−0.010, 0.016) | 0.002 (−0.009, 0.013) |
Visual × disease duration | −0.001 (−0.012, 0.010) | 0.003 (−0.007, 0.012) |
Cerebral × disease duration | −0.010 (−0.027, 0.007) | −0.003 (−0.017, 0.011) |
Annualized visit density | 0.029 (−0.026, 0.084) | 0.037 (−0.012, 0.087)* |
Note: In the relapsing–remitting model, inclusion of EDSS score as a covariate did not contribute to the model fitting and therefore was excluded from the final model. The Schoenfeld test suggested that increase in age and cerebral FSS violate the proportional hazards assumption. However, the plots of corresponding scaled Schoenfeld residuals over time did not reveal any non‐random pattern, thus not providing evidence against the proportionality of hazards.
In the first progression event model, the Schoenfeld test suggested that increase in age, worsening in cerebral FSS and annualized visit density violate the proportional hazards assumption. However, the plots of corresponding scaled Schoenfeld residuals over time did not reveal any non‐random pattern, thus not providing evidence against the proportionality of hazards.
Abbreviations: CI, confidence interval; EDSS, Expanded Disability Status Scale; FSS, functional system score.
*p ≤ 0.1.
**p < 0.05.
***p < 0.001.
The sustained progression score
A sustained progression score was calculated in order to estimate the risk of individual 6‐month confirmed disability progression events being sustained over the long term, using the regression coefficients of the characteristics for which p ≤ 0.1 in the primary model (Table 2).
In the internal validation cohort, the score ranged from 0.02 to 4.21 with a median (quartiles) score of 1.40 (1.02–1.83) (Figure 2a). An example of a score calculated for an individual progression event is provided in the caption to Figure 2. A higher score was associated with a lower probability of recovery from confirmed disability progression events (hazard ratio [HR] 0.39; 95% confidence interval [CI] 0.29–0.52) (Figure 2b). This association and its magnitude were consistent in the RRMS cohort (Table 4). The c index for the two models was 0.89 and 0.93, respectively, demonstrating excellent discriminatory ability of the score to distinguish progression events with different risks of persistence.
FIGURE 2.
(a) Histogram of the sustained progression scores in the internal validation cohort. n represents the number of patients with each integer progression score. (b) The risk of 6‐month confirmed progression events being sustained over time stratified by the sustained progression score in the internal validation cohort. Example: A progression event confirmed over 6 months was recorded in a 40‐year‐old male diagnosed with RRMS, who presented with a two‐step increase in EDSS from step 5.5 to 7.5, in the absence of a relapse during the preceding month, with five neurological domains affected and worsening in pyramidal (1 unit), cerebellar (2 units) and sensory (2 units) functional system scores. The sustained progression score in this patient was estimated as 2.001.
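The arithmetic behind the example in the caption can be approximated as follows. This Python sketch rests on several assumptions that are not confirmed by the text: that the score is the negated linear predictor of the primary model, that worsening in a functional system enters as a yes/no indicator, and that the disease‐duration interaction terms can be ignored (the disease duration of the example patient is not given). It uses the rounded Table 2 coefficients, so it is expected to land near, but not exactly on, the reported value of 2.001.

```python
# Rounded Table 2 coefficients (log hazard of improvement), main effects with p <= 0.1.
# Key names are hypothetical labels chosen for this illustration.
COEF = {
    "age_per_year": -0.02,
    "male": -0.12,
    "relapsing_remitting": 0.36,
    "primary_progressive": -0.73,
    "relapse_within_1_month": 0.24,
    "edss_6_or_more": -0.25,
    "edss_change_per_step": -0.37,
    "affected_domains_each": -0.08,
    "worsening_pyramidal": -0.14,
    "worsening_cerebellar": -0.12,
    "worsening_brainstem": 0.17,
    "worsening_sensory": 0.14,
    "worsening_visual": 0.12,
    "worsening_cerebral": 0.15,
}


def approximate_score(features: dict) -> float:
    """Negated linear predictor: a higher value means a lower chance of improvement.

    Assumption: interaction terms with disease duration are omitted.
    """
    linear_predictor = sum(COEF[name] * value for name, value in features.items())
    return -linear_predictor


# The example event from the caption to Figure 2.
example_event = {
    "age_per_year": 40,
    "male": 1,
    "relapsing_remitting": 1,
    "relapse_within_1_month": 0,
    "edss_6_or_more": 1,        # EDSS 7.5 at the progression event
    "edss_change_per_step": 2,  # from 5.5 to 7.5
    "affected_domains_each": 5,
    "worsening_pyramidal": 1,
    "worsening_cerebellar": 1,
    "worsening_sensory": 1,
}

score = approximate_score(example_event)
```

Under these assumptions the approximation comes to about 2.07. The gap from the reported 2.001 may reflect coefficient rounding and the omitted interaction terms.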
TABLE 4.
Predictive performance of the sustained progression score in the internal validation cohort
Model | Hazard ratio (95% CI) | Harrell's c index |
---|---|---|
Cox proportional hazards model | 0.39 (0.29, 0.52) a | 0.89 |
Cox proportional hazards model in relapsing–remitting multiple sclerosis only | 0.48 (0.33, 0.69) | 0.93 |
Abbreviation: CI, confidence interval.
a The Schoenfeld test suggested that increase in sustained progression score violates the proportional hazards assumption; however, the plot of scaled Schoenfeld residuals over time did not reveal any non‐random pattern, thus not providing evidence against the proportionality of hazards.
In the subset of confirmed progression events with ≥5 years of subsequent recorded follow‐up, a higher score was associated with a greater proportion of events that remained sustained over ≥5 years (Table 5). This reflects the applicability of the score in identifying the small subgroup of patients in whom the 6‐month confirmed disability progression events will almost certainly lead to long‐term disability. With each unit increase in the score, the odds of a progression event being sustained for ≥5 years increased 2.8‐fold (95% bootstrapped CI 1.9–5.3). The score performed moderately well in discriminating sustained progression from possible improvement, with an area under the receiver operating characteristic curve of 0.66 (95% bootstrapped CI 0.60–0.71) in the cohort of progression events with ≥5 years' follow‐up. Using the optimal cut‐off score of 1.20, the positive predictive value was 0.86, indicating that 86% of progression events with a score >1.20 were sustained for at least 5 years (sensitivity 0.64; specificity 0.62; negative predictive value 0.32).
TABLE 5.
The proportion of 6‐month confirmed progression events sustained over ≥5 years from the internal validation cohort, stratified by the sustained progression score
Sustained progression score a | Progression sustained, n (%) | Progression not sustained, n (%) |
---|---|---|
1 | 235 (72) | 90 (28) |
2 | 139 (88) | 19 (12) |
3 | 31 (94) | 2 (6) |
4 | 1 (100) | 0 (0) |
a Sustained progression score 1 refers to scores between 0.02 and 1.50; 2 refers to scores between 1.51 and 2.50; 3 refers to scores between 2.51 and 3.50; 4 refers to scores between 3.51 and 4.21.
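The counts in Table 5 allow a quick consistency check of the reported predictive values. The Python sketch below assumes that the cut‐off analysis was run on the same events as Table 5 and uses the rounded sensitivity and specificity quoted in the text; the helper names are hypothetical.

```python
# (sustained, not sustained) counts per score band, copied from Table 5
TABLE5 = {1: (235, 90), 2: (139, 19), 3: (31, 2), 4: (1, 0)}


def proportion_sustained(band: int) -> float:
    """Share of progression events in a score band that stayed sustained for >= 5 years."""
    sustained, not_sustained = TABLE5[band]
    return sustained / (sustained + not_sustained)


def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Positive and negative predictive values from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


total_sustained = sum(s for s, _ in TABLE5.values())
total_events = sum(s + n for s, n in TABLE5.values())
prevalence = total_sustained / total_events  # overall share of sustained events

# Sensitivity and specificity at the cut-off of 1.20, as reported in the text.
ppv, npv = predictive_values(0.64, 0.62, prevalence)
```

With 406 of 517 events sustained (about 79%), the sketch returns values that round to the reported positive and negative predictive values of 0.86 and 0.32.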
Validation and application in a randomized trial setting
A total of 667 6‐month confirmed disability progression events were identified in the combined dataset of CLARITY studies. Of these events, 12 progression events were followed by a 6‐month confirmed disability improvement. The Cox model indicated a lower probability of improvement associated with a higher sustained progression score (HR 0.72; 95% CI 0.21–2.42), in keeping with the findings from our discovery study and the internal validation.
The replication analysis of the effect of cladribine on disability progression showed that cladribine was superior to placebo in reducing the risk of 3‐month confirmed disability progression events (HR 0.60; 95% CI 0.47–0.76), confirming the results reported by the CLARITY trial [13]. Of the 1326 trial patients, 162 had 3‐month confirmed progression events with a sustained progression score of ≥1.51. The Cox model showed that cladribine reduced the risk of disability progression events that were likely to be sustained over the long term (score ≥1.51) by 36% (HR 0.64; 95% CI 0.47–0.87).
DISCUSSION
Using 14,802 6‐month confirmed disability progression events from 8741 patients with MS from the global MSBase registry, a comprehensive risk scoring system was developed and validated that enables translation of the effect of MS therapies on short‐term disability outcomes observed in randomized clinical trials into an estimated effect on long‐term disability outcomes. Using data from the CLARITY and CLARITY 2‐year extension studies, the applicability of the risk scoring system was assessed in the setting of clinical trials. This analysis confirmed that the efficacy of cladribine in reducing the risk of disability progression as reported in the original study [13] translates into an effect in reducing progression events that are likely to persist over the long term. The sustained progression score can inform the design of future randomized trials of MS therapies and prognostication in individual patients.
The performance of any prognostic construct requires validation in an independent dataset. Our internal validation study in a cohort of non‐overlapping 10% of the progression events demonstrated that the sustained progression score was strongly associated with the hazard of non‐recovery from progression events. The high c index (0.93 in the RRMS cohort) confirmed excellent discriminatory power of the score in identifying disability progression events that were highly likely to be sustained over a longer time compared to those with reduced risk of persistence. In the internal validation cohort, 78% of 6‐month confirmed progression events were sustained for ≥5 years, consistent with our previous work [4] that reported 74% of 6‐month confirmed events persistent over 5 years. The odds of a 6‐month confirmed event being sustained at 5 years were 2.8 times higher for every unit increase in the score. Over 94% of the progression events with score ≥3 were sustained at 5 years, in comparison with only 72% of the events with score 1. This confirms that the score will enable identification of the small subgroup of patients in whom the 6‐month confirmed disability progression events will almost certainly lead to long‐term disability.
The external validation in the CLARITY datasets also demonstrated the positive association between the sustained progression score and the risk of non‐recovery from disability progression. However, the association did not reach formal statistical significance presumably due to the small number (12) of the 667 progression events that were followed by a 6‐month confirmed improvement during the combined follow‐up of 2–4 years.
The potential value of the sustained progression score in the design of future trials was demonstrated through the analysis of the hazard of progression events with a minimum score of 1.51 in the CLARITY dataset (which translates to a >88% risk of progression persistence over ≥5 years). The beneficial effect of the treatment, cladribine, was found to be robust, beyond the conclusions of the original trial, in reducing the risk of disability progression events that are likely to be sustained over the long term.
Amongst the characteristics of confirmed progression events that were used to construct the score, primary progressive MS, an EDSS score ≥6 and its greater change during the progression event were associated with long‐term change in disability. This is not surprising because progressive disease phenotypes are defined by relentless, relapse‐independent accumulation of disability [15]. Our previous work also reported that progression events are more common and more sustained in primary and secondary progressive MS than in relapsing disease forms. Also, recovery was less common in progression events associated with higher EDSS and greater increase in EDSS [4]. In keeping with this observation, recent relapse activity lowered the score, meaning a higher chance of recovery than for progression events that occurred independently from relapses. This is an expected consequence of the transient nature of the underlying inflammation and the capacity to recover, further facilitated by potent immunotherapies [16, 17]. The score was also weighted by the number of affected neurological domains at progression, with worsening in pyramidal and cerebellar domains increasing the score. Thus, the score accounts for the reduced compensatory and regenerative capacity in patients with more widespread damage to their central nervous system [18] as reflected by the greater number of affected neurological domains [19].
Two sensitivity analyses demonstrated that the associations identified by the primary model are robust to the definition of the studied cohort. The confirmation in the subgroup with RRMS and EDSS score ˂6 replicated the results in a population that is typically studied in phase 3 randomized controlled trials. The results are therefore reproducible in the relapsing–remitting phenotype, which is characterized by highly fluctuating disability scores [20]. The replication in the dataset consisting of only single progression events per patient ruled out confounding by within‐subject clustering of the events in the primary analysis.
This study did not aim to capture the influence of different DMTs on the likelihood of progression events being sustained or on the likelihood of future improvement. However, because treatment status could act as a confounder by influencing both the likelihood of progression and subsequent improvement, a sensitivity analysis was performed. When adjusted for treatment status at the time of confirmed progression events, the primary model yielded very similar estimates of associations, confirming its generalizability to different treatment situations.
The main limitation of this study is inherent in the method of disability assessment. EDSS is subject to inter‐rater variability and fluctuation at the lower end of the spectrum [21]. Moreover, the multicentre nature of the studied cohort introduces inter‐centre heterogeneity in EDSS, which was mitigated by requiring Neurostatus certification at each centre [20] and further accounted for by a random effect term for centres in the primary model. Furthermore, 6‐month confirmation of EDSS progression as well as of subsequent improvement events was used, with the confirmatory EDSS recorded outside a relapse in each case [4]. Baseline EDSS was reset after every progression or improvement event. Finally, not only were the EDSS values studied but also the more granular information contained in the FSSs, which further increases the robustness of the modelled disability outcomes. In this study, the potential role of improvement in FSS in the assessment of sustained EDSS progression events is discounted. This decision reflects our focus on the neurological domains that either drive or contribute to the worsening of disability. In a study that focuses on evaluating sustained improvement of disability, assessment of sustained decrease in FSS would, naturally, be relevant. To maximize the robustness and generalizability of the sustained progression score, a large number of patients with very long median follow‐up from an international MS registry was used, complemented by both internal and external validation and by a demonstration of the score's application in a randomized clinical trial of MS therapy.
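The confirmation and baseline‐reset rules described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the study's implementation: the EDSS step thresholds in `min_increase`, the 182‐day window, the choice of the confirmatory EDSS as the new baseline and the `(day, edss, in_relapse)` visit format are all assumptions introduced here, and only progression events (not improvement events) are handled.

```python
CONFIRMATION_DAYS = 182  # roughly 6 months; an assumed day count


def min_increase(baseline):
    """Assumed minimum EDSS increase that counts as progression.

    These step sizes are an assumption for illustration; they are not
    stated in this section.
    """
    if baseline == 0:
        return 1.5
    if baseline <= 5.5:
        return 1.0
    return 0.5


def confirmed_progressions(visits):
    """Find 6-month confirmed progression events.

    `visits` is a list of (day, edss, in_relapse) tuples sorted by day.
    Returns a list of (onset_day, confirmation_day, confirmed_edss).
    """
    events = []
    if not visits:
        return events
    baseline = visits[0][1]
    i = 1
    while i < len(visits):
        onset_day, onset_edss, _ = visits[i]
        threshold = baseline + min_increase(baseline)
        if onset_edss < threshold:
            i += 1
            continue
        # Candidate onset: every later visit must stay at or above the
        # threshold until a relapse-free visit >= 6 months after onset.
        confirmed_at = None
        for j in range(i + 1, len(visits)):
            day, edss, in_relapse = visits[j]
            if edss < threshold:
                break  # worsening was not sustained
            if day - onset_day >= CONFIRMATION_DAYS and not in_relapse:
                confirmed_at = j
                break
        if confirmed_at is None:
            i += 1
            continue
        day, edss, _ = visits[confirmed_at]
        events.append((onset_day, day, edss))
        baseline = edss  # assumed reset: confirmatory EDSS becomes baseline
        i = confirmed_at + 1
    return events


# Made-up visit history for one hypothetical patient.
visits = [
    (0, 2.0, False),
    (90, 3.0, True),    # worsening recorded during a relapse
    (180, 2.0, False),  # recovered, so the first worsening is not confirmed
    (270, 3.0, False),  # onset of a second worsening
    (360, 3.0, False),
    (460, 3.5, False),  # 190 days after onset, outside a relapse
    (550, 3.5, False),
]
print(confirmed_progressions(visits))
```

Resetting the baseline after each confirmed event means that a later worsening is measured against the new, higher level of disability rather than the original one.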
CONCLUSION
Randomized clinical trials of MS provide a short‐term perspective on the effect of therapies on confirmed disability progression. In this study, a weighting system was developed, validated and applied that grades disability progression events by their probability of being sustained over the long term. Prevention of irreversible disability is the ultimate goal of the presently used DMTs for MS [22]. It is therefore proposed to incorporate an estimate of the likelihood that disability progression events will outlast the duration of a trial, as a complementary measure that adds a long‐term perspective to the treatment effects on disability progression observed in randomized clinical trials. This approach can be applied using data routinely acquired in trials (thus enabling reanalysis of previously completed clinical trials). The additional information may help clinicians and researchers bridge the gap between the rigorous short‐term evaluation of treatment efficacy and the effect of treatments on long‐term disability outcomes.
AUTHOR CONTRIBUTIONS
Sifat Sharmin: Formal analysis (lead); methodology (lead); project administration (lead); validation (lead); visualization (lead); writing – original draft (lead); writing – review and editing (lead). Francesca Bovis: Validation (supporting); writing – review and editing (supporting). Charles B. Malpas: Methodology (supporting); writing – review and editing (supporting). Dana Horakova: Data curation (equal); writing – review and editing (supporting). Eva Havrdova: Data curation (equal); writing – review and editing (supporting). Guillermo Izquierdo: Data curation (equal); writing – review and editing (supporting). Sara Eichau: Data curation (equal); writing – review and editing (supporting). Maria Trojano: Data curation (equal); writing – review and editing (supporting). Alexandre Prat: Data curation (equal); writing – review and editing (supporting). Marc Girard: Data curation (equal); writing – review and editing (supporting). Pierre Duquette: Data curation (equal); writing – review and editing (supporting). Marco Onofrj: Data curation (equal); writing – review and editing (supporting). Alessandra Lugaresi: Data curation (equal); writing – review and editing (supporting). Francois Grand' Maison: Data curation (equal); writing – review and editing (supporting). Pierre Grammond: Data curation (equal); writing – review and editing (supporting). Patrizia Sola: Data curation (equal); writing – review and editing (supporting). Diana Ferraro: Data curation (equal); writing – review and editing (supporting). Murat Terzi: Data curation (equal); writing – review and editing (supporting). Oliver Gerlach: Data curation (equal); writing – review and editing (supporting). Raed Alroughani: Data curation (equal); writing – review and editing (supporting). Cavit Boz: Data curation (equal); writing – review and editing (supporting). Vahid Shaygannejad: Data curation (equal); writing – review and editing (supporting). 
Vincent Van Pesch: Data curation (equal); writing – review and editing (supporting). Elisabetta Cartechini: Data curation (equal); writing – review and editing (supporting). Ludwig Kappos: Data curation (equal); writing – review and editing (supporting). Jeannette Lechner‐Scott: Data curation (equal); writing – review and editing (supporting). Roberto Bergamaschi: Data curation (equal); writing – review and editing (supporting). Recai Turkoglu: Data curation (equal); writing – review and editing (supporting). Claudio Solaro: Data curation (equal); writing – review and editing (supporting). Gerardo Iuliano: Data curation (equal); writing – review and editing (supporting). Franco Granella: Data curation (equal); writing – review and editing (supporting). Bart Van Wijmeersch: Data curation (equal); writing – review and editing (supporting). Daniele Spitaleri: Data curation (equal); writing – review and editing (supporting). Mark Slee: Data curation (equal); writing – review and editing (supporting). Pamela A McCombe: Data curation (equal); writing – review and editing (supporting). Julie Prevost: Data curation (equal); writing – review and editing (supporting). Radek Ampapa: Data curation (equal); writing – review and editing (supporting). Serkan Ozakbas: Data curation (equal); writing – review and editing (supporting). Jose Luis Sanchez‐Menoyo: Data curation (equal); writing – review and editing (supporting). Aysun Soysal: Data curation (equal); writing – review and editing (supporting). Steve Vucic: Data curation (equal); writing – review and editing (supporting). Thor Petersen: Data curation (equal); writing – review and editing (supporting). Koen de Gans: Data curation (equal); writing – review and editing (supporting). Ernest Butler: Data curation (equal); writing – review and editing (supporting). Suzanne Hodgkinson: Data curation (equal); writing – review and editing (supporting). Youssef Sidhom: Data curation (equal); writing – review and editing (supporting). 
Riadh Gouider: Data curation (equal); writing – review and editing (supporting). Edgardo Cristiano: Data curation (equal); writing – review and editing (supporting). Tamara Castillo: Data curation (equal); writing – review and editing (supporting). Maria Laura Saladino: Data curation (equal); writing – review and editing (supporting). M. H. Barnett: Data curation (equal); writing – review and editing (supporting). Fraser Moore: Data curation (equal); writing – review and editing (supporting). Csilla Rózsa: Data curation (equal); writing – review and editing (supporting). Bassem I Yamout: Data curation (equal); writing – review and editing (supporting). Olga Skibina: Data curation (equal); writing – review and editing (supporting). Anneke van der Walt: Data curation (equal); writing – review and editing (supporting). Katherine Buzzard: Data curation (equal); writing – review and editing (supporting). Orla Gray: Data curation (equal); writing – review and editing (supporting). Stella E Hughes: Data curation (equal); writing – review and editing (supporting). Angel Pérez‐Sempere: Data curation (equal); writing – review and editing (supporting). Bhim Singhal: Data curation (equal); writing – review and editing (supporting). Yara Dadalti Fragoso: Data curation (equal); writing – review and editing (supporting). Cameron Shaw: Data curation (equal); writing – review and editing (supporting). Allan Kermode: Data curation (equal); writing – review and editing (supporting). Bruce V Taylor: Data curation (equal); writing – review and editing (supporting). Magdolna Simo: Data curation (equal); writing – review and editing (supporting). Neil Shuey: Data curation (equal); writing – review and editing (supporting). Talal Al‐Harbi: Data curation (equal); writing – review and editing (supporting). Richard Macdonell: Data curation (equal); writing – review and editing (supporting). Jose Andres Dominguez: Data curation (equal); writing – review and editing (supporting). 
Tunde Csepany: Data curation (equal); writing – review and editing (supporting). Carmen Adella Sirbu: Data curation (equal); writing – review and editing (supporting). Maria Pia Sormani: Data curation (equal); writing – review and editing (supporting). Helmut Butzkueven: Conceptualization (equal); data curation (equal); writing – review and editing (supporting). Tomas Kalincik: Conceptualization (equal); data curation (equal); funding acquisition (lead); methodology (supporting); writing – review and editing (supporting).
CONFLICT OF INTEREST
The authors report the following relationships: speaker honoraria, advisory board or steering committee fees, research support and/or conference travel support from Almirall (G.Iz., M.Tr., R.B., C.So., J.L.SM), Bayer (P.S., V.V.P., L.K., J.LS., R.B., G.Iu., MSl, RAl, RAm, J.L.SM, T.P., S.H., C.R., B.T., M.Si., N.S., P.M., T.C., S.E., M.Tr., M.Te., E.Cr.), BioCSL (T.K., K.B.), Biogen (A.L., D.H., P.G., P.S., D.F., H.B., J.LS., F.Gm., F.Gr., S.H., M.B., O.Gr., B.S., C.So., C.Sh., B.T., N.S., P.M., M.P.S., O.Ge., O.S., S.E., K.B., M.Tr., G.Iz., G.Iu., MSl, M.Si., RAl, R.Am., E.Cr.), Celgene (E.K.H.), Genzyme‐Sanofi (T.K., A.L., M.P.S., O.Gr., O.Ge., O.S., S.E., K.B., M.Tr., M.Te., C.So., G.Iz., G.Iu., F.Gm., F.Gr., MSl, RAl, E.Cr.), GSK (RAl), Innate Immunotherapeutics (A.K.), Merck/EMD (D.H., E.K.H., G.Iu., M.Tr., A.L., P.G., P.S., D.F., T.K., M.Te., R.Am., H.B., C.B., V.V.P., L.K., J.LS, R.B., C.So., G.Iz., F.Gr., B.V.W., D.S., MSl, RAl, J.L.SM., T.P., S.Ho., E.Cr., M.B., C.R., O.Gr., B.S., Y.F., A.K., M.Si., T.C., M.G., P.D., F.M., M.P.S., O.Ge., O.S., S.E., K.B.), Mitsubishi (F.Gm.), Mylan (A.L.), Novartis (D.H., E.K.H., G.Iz., M.Tr., M.G., P.D., A.L., F.Gm., P.G., P.S., D.F., T.K., M.Te., R.Am., H.B., CB, V.V.P., L.K., J.LS., R.B., C.So., G.Iu., F.Gr., B.V.W., D.S., MSl, J.P., RAl, J.L.SM, T.P., S.H., E.Cr., M.B., F.M., C.R., O.Gr., Y.F., C.Sh., A.K., BT, MSi, NV, NS, PM, TC, MPS, OS, FB, SE, K.B.), ONO Pharmaceuticals (FGm), Roche (DH, EKH, GIz, AL, TK, MTe, RAl, CB, VVP, LK, FGr, BVW, T.P., ECr, Y.F., B.T., T.C., M.P.S., S.Ho., S.E., K.B.), Teva (D.H., E.K.H., G.Iz., G.Iu., M.Tr., M.G., P.D., A.L., F.Gm., P.G., P.S., D.F., T.K., M.Te., R.T., C.B., V.V.P., L.K., J.LS., R.B., C.So., B.V.W., D.S., J.P., R.Am., J.L.SM., C.R., Y.F., A.K., M.Si., T.C., C.A.S., M.P.S., O.Ge., O.S., S.E., K.B.), WebMD (T.K.), UCB (L.K.), GeNeuro (M.P.S.), Medday (M.P.S.), Fondazione Italiana Sclerosi Multipla (A.L.), Grifols (K.B.), Actelion (R.Am.).
ACKNOWLEDGEMENTS
The MSBase study group contributors are listed in Table S1. Merck are thanked for allowing access to the CLARITY and CLARITY Extension datasets. Open access publishing facilitated by The University of Melbourne, as part of the Wiley ‐ The University of Melbourne agreement via the Council of Australian University Librarians.
Sharmin S, Bovis F, Malpas C, et al. Confirmed disability progression as a marker of permanent disability in multiple sclerosis. Eur J Neurol. 2022;29:2321‐2334. doi: 10.1111/ene.15406
Authors Helmut Butzkueven and Tomas Kalincik contributed equally.
Funding information
This study was funded by the NHMRC (grants 1129189 and 1157717 and fellowship 1140766). The MSBase Foundation is a not‐for‐profit organization that receives support from Merck, Biogen, Novartis, Roche, Bayer Schering, Sanofi Genzyme and Teva Pharmaceutical Industries. The study was conducted separately and apart from the guidance of the sponsors.
DATA AVAILABILITY STATEMENT
MSBase is a data processor, and warehouses data from individual principal investigators who agree to share their datasets on a project‐by‐project basis. Data access to external parties can be granted at the sole discretion of each MSBase principal investigator (the data controllers), who will need to be approached individually for permission.
REFERENCES
- 1. CAMMS223 Trial Investigators. Alemtuzumab vs. interferon beta‐1a in early multiple sclerosis. N Engl J Med. 2008;359(17):1786‐1801.
- 2. Polman CH, O'Connor PW, Havrdova E, et al. A randomized, placebo‐controlled trial of natalizumab for relapsing multiple sclerosis. N Engl J Med. 2006;354(9):899‐910.
- 3. Rudick RA, Stuart WH, Calabresi PA, et al. Natalizumab plus interferon beta‐1a for relapsing multiple sclerosis. N Engl J Med. 2006;354(9):911‐923.
- 4. Kalincik T, Cutter G, Spelman T, et al. Defining reliable disability outcomes in multiple sclerosis. Brain. 2015;138(11):3287‐3298.
- 5. Healy BC, Engler D, Glanz B, Musallam A, Chitnis T. Assessment of definitions of sustained disease progression in relapsing–remitting multiple sclerosis. Mult Scler Int. 2013;2013:1‐9.
- 6. Scott T, Wang P, You X, Mann M, Sperling B. Relationship between sustained disability progression and functional system scores in relapsing–remitting multiple sclerosis: analysis of placebo data from four randomized clinical trials. Neuroepidemiology. 2015;44(1):16‐23.
- 7. Scott T, You X, Foulds P. Functional system scores provide a window into disease activity occurring during a multiple sclerosis treatment trial. Neurol Res. 2011;33(5):549‐552.
- 8. Butzkueven H, Chapman J, Cristiano E, et al. MSBase: an international, online registry and platform for collaborative outcomes research in multiple sclerosis. Mult Scler J. 2006;12(6):769‐774.
- 9. Polman CH, Reingold SC, Banwell B, et al. Diagnostic criteria for multiple sclerosis: 2010 revisions to the McDonald criteria. Ann Neurol. 2011;69(2):292‐302.
- 10. Polman CH, Reingold SC, Edan G, et al. Diagnostic criteria for multiple sclerosis: 2005 revisions to the “McDonald criteria”. Ann Neurol. 2005;58(6):840‐846.
- 11. Kalincik T, Kuhle J, Pucci E, et al. Data quality evaluation for observational multiple sclerosis registries. Mult Scler J. 2017;23(5):647‐655.
- 12. Harrell FE Jr, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15(4):361‐387.
- 13. Giovannoni G, Comi G, Cook S, et al. A placebo‐controlled trial of oral cladribine for relapsing multiple sclerosis. N Engl J Med. 2010;362(5):416‐426.
- 14. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; 2018.
- 15. Lublin FD, Reingold SC, Cohen JA, et al. Defining the clinical course of multiple sclerosis: the 2013 revisions. Neurology. 2014;83(3):278‐286.
- 16. Belachew S, Phan‐Ba R, Bartholomé E, et al. Natalizumab induces a rapid improvement of disability status and ambulation after failure of previous therapy in relapsing–remitting multiple sclerosis. Eur J Neurol. 2011;18(2):240‐245.
- 17. Kalincik T, Brown JWL, Robertson N, et al. Treatment effectiveness of alemtuzumab compared with natalizumab, fingolimod, and interferon beta in relapsing–remitting multiple sclerosis: a cohort study. Lancet Neurol. 2017;16(4):271‐281.
- 18. Giovannoni G, Cutter G, Sormani MP, et al. Is multiple sclerosis a length‐dependent central axonopathy? The case for therapeutic lag and the asynchronous progressive MS hypotheses. Mult Scler Relat Disord. 2017;12:70‐78.
- 19. Hirst CL, Ingram G, Swingler R, Compston A, Pickersgill TP, Robertson NP. Change in disability in patients with multiple sclerosis: a 20 year prospective population based analysis. J Neurol Neurosurg Psychiatry. 2008;79:1137‐1143.
- 20. D'Souza M, Yaldizli Ö, John R, et al. Neurostatus e‐scoring improves consistency of Expanded Disability Status Scale assessments: a proof of concept study. Mult Scler J. 2017;23(4):597‐603.
- 21. Amato MP, Fratiglioni L, Groppi C, Siracusa G, Amaducci L. Interrater reliability in assessing functional systems and disability on the Kurtzke scale in multiple sclerosis. Arch Neurol. 1988;45(7):746‐748.
- 22. Brown JWL, Coles A, Horakova D, et al. Association of initial disease‐modifying therapy with later conversion to secondary progressive multiple sclerosis. JAMA. 2019;321(2):175‐187.