Table 2.
Measures | Pros | Cons |
---|---|---|
Hazard ratio (model-based) | A valid summary for the difference of two cumulative incidence distributions (when the PH assumption is correct) with statistically efficient inference procedures. | Lacks a clinically meaningful reference value for the hazard from the control arm to assess the difference between groups. Difficult to interpret when the PH model is far from correct because it estimates a population quantity that depends in part on the censoring distributions. May not have adequate power to detect a safety signal, especially when the two hazard functions cross during study follow-up. May require an impractically large study because the precision of the estimated hazard ratio depends on the number of observed events and not directly on the number of patients and their exposure times. May selectively study a higher-risk population than the indicated patient population for the new treatment because many observed events are needed. |
Relative time (model-based) | Provides a clinically meaningful summary of the differences between the groups if the model is correctly specified. For example, if the estimated ratio (treated vs. control) of two event times is 1.3, one can claim that, on average, a control patient treated with the new therapy would gain an extra 30% “survival time.” This, coupled with the survival distribution of the control arm, provides a clinically meaningful interpretation of the treatment benefit. | Difficult to interpret when the model is not correct because the empirical relative time estimates a population quantity that depends on the censoring distributions. |
Difference of percentiles (model-free) | Provides a clinically meaningful summary of the differences between groups and does not depend on a model assumption. Has a well-developed inference procedure for the difference (or ratio). | May not be estimable if follow-up time is short or the event rate is low because in such studies not all percentiles can be observed. May be an unstable estimate because the median (i.e., the 50th percentile) is heavily dependent on the local shape of the cumulative incidence curve. |
The t-year event rate difference (model-free) | Provides an easy to interpret and clinically meaningful summary of the differences between groups. Has a well-established and robust inference procedure. Probably the most relevant quantity for decision-making when one is interested in long-term survival. | Only reflects cumulative information at time t and does not reflect any differences in the profile of the cumulative incidence curves up to t. |
Restricted mean survival time (RMST) difference (model-free) | Provides a clinically meaningful summary of the differences between groups. Provides a more stable estimate than the median in survival time studies. Utilizes more information than its t-year event rate counterpart. May not need an impractically large study to assess noninferiority if the patient’s exposure time is sufficiently large for safety evaluation. | Needs prespecification of the time point of interest. May selectively study a relatively healthy population with low event rates rather than the indicated patient population in order to obtain a noninferiority claim. |
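
To make the contrast between the two model-free measures in the last rows concrete, the following is a minimal sketch of how a t-year event rate difference and an RMST difference could be computed from Kaplan-Meier curves. It is an illustration under stated assumptions, not the procedure used in any particular study: the function names (`kaplan_meier`, `survival_at`, `rmst`), the truncation time `tau`, and the ten-patient toy data set are all hypothetical and chosen only for this example. Variance estimation and confidence intervals are deliberately left out.

```python
from typing import List, Tuple

Curve = List[Tuple[float, float]]  # (event time, survival just after it)


def kaplan_meier(times: List[float], events: List[int]) -> Curve:
    """Kaplan-Meier step function; events are 1 = event, 0 = censored."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve: Curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same observed time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_leaving
    return curve


def survival_at(curve: Curve, t: float) -> float:
    """Estimated survival probability S(t) read off the step function."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


def rmst(curve: Curve, tau: float) -> float:
    """Area under the Kaplan-Meier curve from 0 to the truncation time tau."""
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for time, surv in curve:
        if time >= tau:
            break
        area += prev_s * (time - prev_t)
        prev_t, prev_s = time, surv
    return area + prev_s * (tau - prev_t)


# Hypothetical toy data (made up for illustration): time in years, event flag.
km_control = kaplan_meier([1, 2, 3, 4, 5], [1, 1, 0, 1, 0])
km_treated = kaplan_meier([2, 3, 4, 5, 6], [1, 0, 1, 0, 0])

tau = 5.0  # prespecified time point, an assumption of this example

# t-year event rate difference: uses the curves only at time tau.
event_rate_diff = (1 - survival_at(km_treated, tau)) - (1 - survival_at(km_control, tau))

# RMST difference: uses the whole curve profile from 0 to tau.
rmst_diff = rmst(km_treated, tau) - rmst(km_control, tau)
```

On this toy data the RMST up to `tau = 5` is about 3.3 years for control and about 4.13 years for treated, while the 5-year event rates are about 0.70 and 0.47. The event rate difference depends only on where the two curves sit at `tau`, whereas the RMST difference accumulates the gap between the curves over the whole interval, which is the trade-off the table describes.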