Clinician’s Approach to Advanced Statistical Methods: Win Ratios, Restricted Mean Survival Time, Responder Analyses, and Standardized Mean Differences

Melissa Lane; Tyson Miao; Ricky D Turgeon

doi:10.1007/s11606-023-08582-w

2024 Jan 3;39(7):1196–1203. doi: 10.1007/s11606-023-08582-w

PMCID: PMC11116328 PMID: 38172409

Abstract

Novel statistical methods have emerged in recent medical literature, which clinicians must understand to properly appraise and integrate evidence into their practice. Some of these key concepts include win ratios, restricted mean survival time, responder analyses, and standardized mean difference. This article offers guidance to busy clinicians on the comprehension and practical applicability of the results to patients. Win ratios provide an alternative method to analyze composite outcomes by prioritizing individual components of the composite; prioritization of the outcomes should be evidence-based, pre-specified, and patient-centered. Restricted mean survival time presents a method to analyze Kaplan–Meier curves when assumptions required for Cox proportional hazards analysis are not met. As it only considers outcomes that occur within a specific timeframe, the duration of follow-up must be appropriately defined and based on prior epidemiologic and mechanistic evidence. Researchers can analyze continuous outcomes with responder analyses, in which participants are dichotomized into “responders” or “non-responders.” While clinicians and patients may more easily grasp outcomes analyzed in this way, they should be aware of the loss of information and resulting imprecision, as well as potential to manipulate data presentation. When meta-analyzing continuous outcomes, point estimates can be converted to standardized mean differences to facilitate the combination of data utilizing various outcome measures. However, clinicians may find it challenging to grasp the clinical meaningfulness of a standardized mean difference, and may benefit from converting it to well-known outcomes. By providing the background knowledge of these statistical methods, along with practical applicability, benefits, and inevitable limitations, this article aims to provide clinicians with an approach to appraise the literature and apply the results in clinical practice.

KEY WORDS: win ratio, restricted mean survival, responder analysis, standardized mean difference

Introduction

In the world of medical research, investigators are often in search of novel techniques to effectively analyze data. This article provides clinicians with the knowledge and skillset to utilize the results from these sophisticated techniques, including win ratios, restricted mean survival time, responder analyses, and standardized mean differences, in their practice. Through clinical scenarios, this article explores the foundational principles, practical applicability, intrinsic advantages, and limitations of these statistical methods. Clinicians will be equipped to comprehend the literature, appraise the results, and apply the evidence to facilitate evidence-based clinical practice. Table 1 describes questions for clinicians to consider when appraising literature utilizing these statistical methods.

Table 1.

Appraisal Questions to Consider when Reviewing Literature Utilizing the Specified Statistical Methods

Win ratios

1. What is included in the composite outcome? Are there any components not clinically relevant to the clinical question?

2. Is the hierarchy of outcomes reasonable?

- Were priorities determined based on prior evidence of patient values, and/or patient engagement?

- Was the hierarchy pre-specified by the researchers?

3. What component is driving a difference in the win ratio, if any?

Restricted mean survival time

1. How was the follow-up time decided?

- Is this evidence-based?

- Does it make sense clinically? Biologically? Pharmacologically?

Responder analyses

1. Is continuous data (i.e., mean differences) also reported?

- Are the results between approaches consistent?

2. Is the definition of “response” reasonable?

- Is it based on a known minimally important difference?

- Is this consistent with the response definition used in other trials?

3. Is it appropriate to consider all “responders” as equivalent to one another and “non-responders” as equivalent to one another?

Standardized mean difference

1. Are the individual studies using different outcome measures?

2. Can we assume the differences in standard deviations among studies are due to different outcome measures, and not other differences such as clinical heterogeneity?

- Are the standard deviations similar between studies, accounting for differences in the scale (e.g., 0–10 versus 0–50)?

- Do trials that report the same scale show similar SDs?

3. Are the results clinically meaningful when converted to a well-known outcome measure?

Win ratios

Clinical Scenario

You are seeing one of your longstanding patients, a 72-year-old male, for follow-up after a recent hospitalization for a heart failure exacerbation (ejection fraction 60%). You are aware of new evidence for sacubitril/valsartan or empagliflozin in this context, and decide to delve into these recent trials.

Background Information

Large cardiovascular trials often utilize a composite endpoint analyzed with time-to-event statistical methods such as Cox proportional hazards regression. A limitation of this method is that all components of the composite outcome are treated as equivalent, and the only component that contributes is the one that occurs first. For instance, in a composite outcome of mortality, myocardial infarction, or stroke, a participant who suffers a stroke, and then subsequently dies during the trial, will be included in the primary outcome as “stroke,” as this event occurred first, disregarding mortality as being a higher clinical priority. Win ratios have been introduced as a method of addressing this limitation.

Win Ratios Explained

Win ratios offer a statistical approach to prioritize individual endpoints within a composite outcome. The process for the simplest version (unmatched win ratio) is as follows:^1,2

1. A hierarchical composite outcome is developed, with the highest-priority outcome ranked first and the remaining endpoints ordered by clinical priority.
2. Each patient in the intervention group (Ni) is compared to each patient in the control group (Nc), forming a total of Ni × Nc pairs.
3. The two participants within each pair are compared against one another along the pre-specified hierarchy of events:
- If there is a difference between the two participants in the first outcome in the hierarchy, a win is recorded and no further comparisons within the pair occur.
- If there is no difference (tie) in the first outcome, the pair is compared on the next outcome in the hierarchy.
- The comparisons continue down the hierarchy until one member of the pair has a “win”; if no wins occur, the pair is recorded as a “tie.”
- This process is repeated for every pair.
4. The total number of “wins” for the intervention group is divided by the total number of “wins” for the control group to calculate the win ratio.
- A win ratio of 1 indicates the same number of wins in both groups; a win ratio above 1 favors the intervention, and a win ratio below 1 favors the control.
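The pairwise procedure above can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the method used in any of the cited trials: every outcome is reduced to a single score where higher is better, censoring and unequal follow-up are ignored, and the patients, the three-level hierarchy, and the function names are all hypothetical.

```python
from typing import Sequence, Tuple

# One score per outcome, highest-priority outcome first; higher is better.
Patient = Tuple[float, ...]


def compare_pair(treated: Patient, control: Patient) -> int:
    """Return 1 if the treated patient wins, -1 if the control wins, 0 for a tie."""
    for treated_score, control_score in zip(treated, control):
        if treated_score > control_score:
            return 1
        if treated_score < control_score:
            return -1
        # Tie on this outcome: move to the next outcome in the hierarchy.
    return 0


def unmatched_win_ratio(intervention: Sequence[Patient], control: Sequence[Patient]):
    """Compare every intervention patient with every control patient (Ni x Nc pairs)."""
    wins = losses = ties = 0
    for treated in intervention:
        for comparator in control:
            result = compare_pair(treated, comparator)
            if result > 0:
                wins += 1
            elif result < 0:
                losses += 1
            else:
                ties += 1
    ratio = wins / losses if losses else float("inf")
    return wins, losses, ties, ratio


# Hypothetical patients: (alive at end of follow-up, minus the number of
# hospitalizations, change in quality-of-life score).
intervention_group = [(1, 0, 5.0), (1, -1, 2.0), (1, -2, 0.0)]
control_group = [(1, 0, 3.0), (1, -1, 2.0), (0, 0, 4.0)]

wins, losses, ties, ratio = unmatched_win_ratio(intervention_group, control_group)
print(wins, losses, ties, round(ratio, 2))  # 5 3 1 1.67
```

With three patients per group there are 3 × 3 = 9 pairs; ties are counted but do not enter the ratio, which is why a trial with many ties can report a win ratio based on a minority of its pairs.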

Example of Win Ratios in the Literature

The “PARAGLIDE-HF” trial randomized patients with heart failure and a mildly reduced or preserved ejection fraction, who recently experienced a worsening heart failure event, to receive either sacubitril/valsartan or valsartan.³ A key secondary analysis in this trial was the win ratio for the composite outcome consisting of the hierarchy of endpoints depicted in Fig. 1.³ Each patient in the sacubitril/valsartan group (n = 233) was compared to each patient in the valsartan group (n = 233) for a total of 54,289 pairs.³ The two participants in each pair were compared, with results depicted in Fig. 1.³

Figure 1. Sacubitril/valsartan versus valsartan wins from the PARAGLIDE-HF trial.

In general terms, if there is one patient in the sacubitril/valsartan group and one patient in the valsartan group, the patient in the sacubitril/valsartan group is 1.19 times as likely to “win” or have a “positive” outcome, although in this specific trial, the results were not statistically significant as the 95% CI (0.93 to 1.52) included a null effect (WR = 1.00).³

The “EMPULSE” trial compared empagliflozin to placebo using a composite outcome consisting of the endpoints in Fig. 2.⁴ Figure 2 illustrates the results of this trial, with an overall statistically significant win ratio of 1.38 (1.11 to 1.71).

Figure 2. Empagliflozin versus placebo wins in the EMPULSE trial.

In comparison to PARAGLIDE-HF, EMPULSE presents a more clinically meaningful composite outcome by including quality of life instead of a surrogate marker such as NT-proBNP.

Advantages of Win Ratios

The primary advantage of the win ratio is that it allows for prioritization of outcomes within a composite outcome based on a predefined hierarchy.^1,2 Returning to the illustration of a composite outcome including mortality, myocardial infarction, and stroke: if the same number of patients experienced a stroke at the same time-point with drug A versus drug B, but more patients taking drug A subsequently died following a stroke, this would be recognized as a “win” for drug B.

Win ratios also allow for different types of endpoints to be included within one composite, such as time-to-event, categorical, or continuous, as the analysis of each endpoint occurs independently.¹ In the PARAGLIDE-HF example, dichotomous and continuous endpoints were both included in the composite outcome without forcing an arbitrary dichotomization of NT-proBNP.³ This also allows for incorporation of patient-oriented outcomes, such as quality of life, into primary composite outcomes. The EMPULSE trial demonstrated this by including a minimum change in KCCQ score within their primary composite outcome.⁴

Limitations of Win Ratios

It is important to acknowledge a single outcome can still drive the direction and statistical significance of a win ratio. For example, in the EMPULSE trial, two-thirds of empagliflozin’s wins occurred with the fourth-most prioritized outcome, the proportion of patients with a clinically important improvement in KCCQ-TSS score.⁴ Although this outcome is clinically meaningful, it is important to recognize that this drove the results, and that there is considerably less certainty about the individual impact on mortality or hospitalization purely based on this trial. As with any composite outcome, it is essential to consider the impact of individual components when evaluating the impact of an intervention.

As with other uses of composite outcomes, debate may occur regarding the inclusion and order of the outcomes within the hierarchy. For example, in EMPULSE, some patients may value quality of life over mortality, and would prefer change in KCCQ score as the top of the hierarchy.⁴ In PARAGLIDE-HF, some may debate if change in NT-proBNP, a surrogate outcome, should be included in the composite at all.³ However, the win ratio prioritization allowed this less-important outcome, which was the primary outcome and individually statistically significant, to be recontextualized in a hierarchy following more clinically meaningful outcomes.

Win ratios can be susceptible to data manipulation by re-ordering the hierarchy to generate a preferable outcome. Researchers should pre-specify a clear statistical analysis plan including the hierarchy and outcome definitions.

Additionally, due to the novelty of win ratios, it may be more challenging to interpret the effect of an intervention.

Finally, it is worth noting that comparisons involving re-analyses using win ratios for trials initially utilizing hazard ratios demonstrated nearly identical results (win ratios were nearly identical to the reciprocal of the original hazard ratios). In other words, a clear benefit with win ratios has not been demonstrated in these analyses.^5,6

Back to the Clinical Scenario

Returning to our patient case, the win ratio from PARAGLIDE-HF does not independently support the use of sacubitril/valsartan, as it did not reach statistical significance. EMPULSE supported the use of empagliflozin, acknowledging that the benefit observed in this trial was driven by improved quality of life. These numbers, in combination with other considerations including safety and adherence, should be discussed with the patient to facilitate shared decision-making.

Restricted mean survival time

Clinical Scenario

Later in your clinic day, you are seeing a patient who was recently discharged from hospital after a bioprosthetic mitral valve replacement. He has resumed rivaroxaban, which he was taking pre-operatively. You are wondering what evidence supports this.

Example of Restricted Mean Survival Time in the Literature

The “RIVER” trial was a non-inferiority randomized controlled trial comparing rivaroxaban to warfarin in patients with atrial fibrillation and a bioprosthetic mitral valve for the composite outcome of death, stroke, transient ischemic attack, systemic embolism, valve thrombosis, hospitalization for heart failure, or major bleeding at 12 months, analyzed using restricted mean survival time (RMST).⁷ This composite met non-inferiority criteria at a mean time of 347.5 days with rivaroxaban versus 340.1 days with warfarin, with a mean difference of 7.4 days (1.4 days shorter to 16.3 days longer with rivaroxaban).⁷

RMST Explained

Longer-term medication trials often analyze dichotomous outcomes using survival analysis to account for patients lost-to-follow-up (censored) over time. Hazard ratios (HRs) produced by Cox regression are a common way to analyze and present these outcomes. However, this relies on the assumption that the HR is constant over time (proportional hazards assumption).^8,9 When this assumption is violated, which can be observed visually when Kaplan–Meier curves diverge and then cross over time, or tested more formally, alternate methods that do not rely on proportional hazards must be used.^8,9

Hazard rates are instantaneous rates of events; the HR is the ratio of the hazard with the intervention to the hazard with the comparator.¹⁰ For example, a hazard ratio of 0.50 means that the hazard of the outcome is 50% lower in the intervention group at any given time-point.

RMST is the area under the Kaplan–Meier curve from the start of the study to a pre-determined time-point.^8,11 Delta-RMST describes the differences in time without the outcome between the intervention and the comparator (i.e., how long it took the intervention group to meet the outcome versus the comparator group).^8,11 The difference in RMST is depicted by the differences in the area under the curves (Fig. 3).

Figure 3. RMST as differences between the areas under the curves.
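To make the "area under the Kaplan–Meier curve" definition concrete, the sketch below estimates a Kaplan–Meier curve from scratch and integrates it up to a chosen time-point. It is an illustration only: the follow-up times, the 365-day restriction, and the function names are hypothetical, and a real analysis would use a validated statistical package that also reports confidence intervals.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times:  follow-up time for each participant
    events: 1 if the outcome occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group everyone who leaves the risk set at the same time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function from time 0 to tau."""
    area = 0.0
    prev_time, prev_survival = 0.0, 1.0
    for t, survival in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_survival * (t - prev_time)
        prev_time, prev_survival = t, survival
    area += prev_survival * (tau - prev_time)
    return area


# Hypothetical follow-up data in days, restricted to tau = 365 days.
rmst_intervention = rmst([100, 200, 365, 365, 365], [1, 0, 0, 0, 0], tau=365)
rmst_comparator = rmst([50, 150, 300, 365, 365], [1, 1, 1, 0, 0], tau=365)
delta_rmst = rmst_intervention - rmst_comparator
print(round(rmst_intervention, 1), round(rmst_comparator, 1), round(delta_rmst, 1))
# 312.0 246.0 66.0
```

In this toy example the intervention group spends on average 66 more event-free days out of the first 365; changing `tau` changes the answer, which is why the choice of timeframe matters.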

Advantages of RMST

The proportional hazards assumption (Fig. 4) is not always met, and the hazard ratio may mislead clinicians by providing an inaccurate estimate of effect, particularly when the intervention and comparator are substantially different.^8,11 Examples of potential non-proportionality include surgical procedures, in which a short-term perioperative increase in mortality may occur, with an overall decrease in long-term mortality. In these cases, RMST provides a more informative summary of effect.

Figure 4. Kaplan–Meier curve demonstrating proportionality.

RMST provides insight into time before experiencing an outcome (e.g., death, or any component of the RIVER trial composite) with the intervention versus the comparator. These data express benefit in absolute terms, which may improve interpretability of data and appreciation of effects for patients. An example includes oncology trials, where data can be presented as “x” more months alive and progression-free with an intervention versus a comparator.

Limitations of RMST

Clinical meaningfulness of RMST relies on appropriate selection of a trial duration, as RMST by definition requires “restricting” looking at the data to a certain timeframe. However, pre-specifying a reliable timeframe poses difficulties if the appropriate timeframe is not explicitly clear, which may lead to inaccurate results.^8,11 The timeframe should consider clinical knowledge, previous evidence, and properties of the intervention. If the timeframe is inappropriate, valuable data of the intervention may be lost due to events occurring beyond the timeframe and inaccurate conclusions may be drawn.

For example, consider the RIVER trial, which was limited to 12 months of follow-up and demonstrated non-inferiority of rivaroxaban to warfarin in patients with atrial fibrillation and bioprosthetic mitral valve replacements.⁷ A similar trial (though in a different patient population), INVICTUS, compared rivaroxaban to warfarin over 54 months.¹² In INVICTUS, the Kaplan–Meier curves crossed after 12 months, and subsequently diverged, resulting in rivaroxaban being inferior.¹² This creates uncertainty about how the Kaplan–Meier curves in RIVER would have behaved had follow-up been extended beyond 12 months. Figure 5 demonstrates a similar pattern: if follow-up was stopped at 12 months (Fig. 5a), the intervention may appear superior. However, if follow-up continued for 48 months, the curves cross and diverge towards comparator superiority (Fig. 5b).

Figure 5. (a) Kaplan–Meier curve demonstrating non-proportionality with a short follow-up time (12 months) with intervention appearing superior to comparator. (b) Kaplan–Meier curve demonstrating non-proportionality with a longer follow-up time (48 months), demonstrating crossing and divergence of survival curves.

Back to Clinical Scenario

Based on the available evidence of rivaroxaban compared to warfarin, further discussions need to occur with the patient to determine a safe and effective long-term anticoagulation plan.

Interpreting continuous outcomes: responder analyses and standardized mean differences

Clinical Scenario

Your last clinic appointment of the day is with a patient with a history of poorly controlled major depressive disorder that has not responded to four different antidepressants. He is wondering if ketamine may be an effective treatment option.

Background Information

Continuous outcomes, such as depression scales or pain ratings, are often reported as mean differences between the intervention and control group. Due to the variability in baseline patient scores, as well as response to treatment, group averages of subjective outcomes can be challenging to translate to an individual patient.

Additionally, continuous outcomes often use a variety of scales, such as the Montgomery-Asberg Depression Rating Scale (MADRS) or Hamilton Depression Rating Scale (HDRS). To meta-analyze multiple studies using various outcome measures, the scales must be standardized, either by converting the data to responder rates or standardizing the mean differences.

Responder Analyses Explained

A responder analysis presents the proportion of patients who “respond”, or meet predefined criteria of a response, to an intervention versus a comparator at a given time-point. Definitions for response vary based on the outcome and disease state; it is sometimes defined as the minimal important difference (MID).^13,14 An example of an outcome where response does not match the MID is the MADRS, which has a MID of 1.6–1.9 points out of 60, but response is defined as a “decrease of 50% or greater from baseline”.¹⁵

Example of Responder Analysis in the Literature

A meta-analysis from a Cochrane review of ketamine for major depressive disorder included seven randomized controlled trials (n = 185) that compared ketamine to placebo for the management of major depressive disorder.¹⁶ Various depression rating scales, including MADRS and HDRS, were used, along with different definitions of response; responder analyses allowed researchers to combine the studies into one effect. In one example (Hu 2016), 5/13 (38%) patients achieved a response (MADRS score decreased by 50% or more from baseline) with ketamine compared to 0/14 (0%) with placebo.¹⁶ In another study (Berman 2000), the response definition was HDRS decreased by 50% or more from baseline, with response achieved in 25% with ketamine compared to 0% with placebo.¹⁶ The combined response rates at 24 h across all trials were 36% with ketamine versus 9% with placebo, with an odds ratio of 3.94 (95% confidence interval 1.54 to 10.10).¹⁶ This can subsequently be converted to a number needed to treat for one patient to respond of 4.
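The conversion from response rates to a number needed to treat can be checked with a short calculation. The sketch below assumes the number needed to treat is taken as the reciprocal of the absolute difference in the pooled response rates, rounded up to a whole patient; the function name is hypothetical.

```python
import math


def number_needed_to_treat(rate_intervention: float, rate_comparator: float) -> int:
    """Reciprocal of the absolute difference in response rates, rounded up."""
    absolute_difference = rate_intervention - rate_comparator
    if absolute_difference <= 0:
        raise ValueError("The intervention shows no higher response rate than the comparator.")
    return math.ceil(1 / absolute_difference)


# Pooled 24-hour response rates quoted above: 36% with ketamine, 9% with placebo.
nnt = number_needed_to_treat(0.36, 0.09)
print(nnt)  # 4
```

The absolute difference is 27 percentage points, so 1 / 0.27 is approximately 3.7, which rounds up to 4 patients treated for one additional response.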

Advantages of Responder Analysis

Responder analyses provide an estimate of how likely it is that a patient will have a response to an intervention. If the definition of response is clinically meaningful, this can provide an easy-to-interpret estimate of effect.

Additionally, the results of a responder analysis may be more intuitive to patients due to the simplicity and applicability to individuals (i.e., if I take this intervention, what is the likelihood that I will have a particular outcome?). This may improve engagement in shared decision-making.

Limitations of Responder Analysis

Since evidence-based definitions of response are not standardized across clinical trials, researchers often define an arbitrary threshold, which may or may not be clinically meaningful; this also leads to uncertainties when comparing interventions that may have been evaluated in studies that used different response definitions.¹⁴

Although responder analyses provide more details on the likelihood of a patient attaining a benefit, limitations to this exist. External variables may contribute to the response rate, such as natural disease course or placebo and confounding effects.¹³

Responder analyses, due to the dichotomization of continuous data, ultimately lead to the omission of clinically meaningful information. For example, assume a response to an antidepressant is defined as a 50% decrease from baseline MADRS. If patient “A” has a baseline MADRS of 37/60, which decreases to 18/60 with treatment (51% reduction), they would be considered a responder. Conversely, patient “B” who goes from a MADRS of 37/60 to 19/60 with treatment (49% reduction in score) would be considered a non-responder despite nearly the same improvement in symptoms as patient A.
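The information lost at the threshold can be seen by applying the 50% rule to the two patients described above. This is a minimal sketch; the function names and the default threshold argument are illustrative choices.

```python
def percent_reduction(baseline: float, follow_up: float) -> float:
    """Percentage decrease in score from baseline."""
    return 100 * (baseline - follow_up) / baseline


def is_responder(baseline: float, follow_up: float, threshold_pct: float = 50.0) -> bool:
    """Classify a patient as a responder if the score fell by at least threshold_pct."""
    return percent_reduction(baseline, follow_up) >= threshold_pct


patient_a = (37, 18)  # baseline MADRS, MADRS with treatment
patient_b = (37, 19)

for label, (baseline, follow_up) in (("A", patient_a), ("B", patient_b)):
    print(
        label,
        round(percent_reduction(baseline, follow_up), 1),
        is_responder(baseline, follow_up),
    )
# A 51.4 True
# B 48.6 False
```

A one-point difference on a 60-point scale places the two patients in opposite categories, even though the continuous scores are almost identical.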

Dichotomizing continuous outcomes into responder analyses is associated with a reduction in power.¹³ Researchers demonstrated that a continuous outcome that requires 100 participants to identify a significant difference would require 158 participants when dichotomized into a responder analysis.¹⁷ For readers, this can lead to perceived inconsistency if the mean difference is statistically significant, but the responder analysis is not.
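A sample-size inflation of this magnitude is consistent with a classical large-sample result: when a normally distributed outcome is split at its median, the dichotomized comparison retains a fraction 2/π (about 64%) of the statistical efficiency of the continuous comparison. The sketch below applies that factor; treating it as the basis for the figure above is an assumption, since the cited derivation is not reproduced here, and other thresholds or distributions would give different losses.

```python
import math

# Large-sample relative efficiency of a median split of a normal outcome
# compared with analyzing the outcome as continuous (assumed scenario).
RELATIVE_EFFICIENCY_MEDIAN_SPLIT = 2 / math.pi


def sample_size_after_dichotomizing(n_continuous: int) -> int:
    """Participants needed after a median split to match the continuous analysis."""
    return math.ceil(n_continuous / RELATIVE_EFFICIENCY_MEDIAN_SPLIT)


n_dichotomized = sample_size_after_dichotomizing(100)
print(round(RELATIVE_EFFICIENCY_MEDIAN_SPLIT, 3), n_dichotomized)  # 0.637 158
```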

Standardized Mean Difference Explained

The standardized mean difference (SMD, sometimes called effect size) helps us compare the impact of interventions by taking into account the variability within the data.

SMD = (intervention change - comparator change) / standard deviation

Once the effect is in units of standard deviations, multiple SMDs can be meta-analyzed to calculate an overall effect.¹⁸ Different versions of this exist, such as Cohen’s d and Hedges’ g; although they have varying formulas, the overall concept is similar.¹⁸ To simplify the interpretation of SMDs, Cohen initially proposed that SMDs of 0.2–0.5, 0.5–0.8, and > 0.8 could be considered small, moderate, and large effects, respectively.¹⁹ However, there is no evidence supporting these thresholds, and they should be interpreted with clinical judgment. As a visual, assume an antidepressant has an SMD of − 1.0 compared to placebo for a depression rating scale. Mathematically, this means that the difference between the mean of the antidepressant compared to the mean of placebo is equal to one standard deviation. Assuming a normal distribution, 84% of the depression scores with the intervention are better than the mean of the control group.
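The 84% figure follows from the standard normal distribution: it is the proportion of a normal curve lying below one standard deviation above its mean. A minimal check, assuming normally distributed scores with equal spread in both groups (the function names are illustrative):

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative probability of the standard normal distribution at z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def share_better_than_control_mean(smd_magnitude: float) -> float:
    """Proportion of intervention scores beyond the control group mean."""
    return normal_cdf(smd_magnitude)


share = share_better_than_control_mean(1.0)
print(round(100 * share))  # 84
```

An SMD of 0 gives 50%, meaning the two distributions overlap completely; larger SMDs push this proportion towards 100%.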

Example of SMDs in the Literature

Returning to the aforementioned Cochrane review of ketamine, eight randomized controlled trials (n = 231) compared ketamine to placebo by using the SMD to pool different depression rating scale scores.¹⁶ In Hu (2016), the mean MADRS score with ketamine was 23.6 versus 32.1 with placebo, with a pooled standard deviation (SDpooled = √((SD₁² + SD₂²) / 2)) of 9.77.

SMD = (23.6 − 32.1) / 9.77 = − 0.87

When all eight studies were meta-analyzed, the overall SMD was − 0.87 (− 1.26 to − 0.48).¹⁶
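The single-study calculation for Hu (2016) can be reproduced as follows. This is a sketch of the simple formula given in this section, without the small-sample correction used in some variants of the SMD. The group-level standard deviations for Hu (2016) are not given here, so the values passed to `pooled_sd` are hypothetical and only show how the pooling formula behaves.

```python
import math


def pooled_sd(sd_1: float, sd_2: float) -> float:
    """Pooled standard deviation for two groups, as defined in the text."""
    return math.sqrt((sd_1**2 + sd_2**2) / 2)


def standardized_mean_difference(mean_intervention: float, mean_comparator: float, sd: float) -> float:
    """Difference in means expressed in standard deviation units."""
    return (mean_intervention - mean_comparator) / sd


# Means and pooled SD reported for Hu (2016): 23.6 versus 32.1, SD 9.77.
smd_hu = standardized_mean_difference(23.6, 32.1, 9.77)
print(round(smd_hu, 2))  # -0.87

# Hypothetical group SDs, used only to illustrate the pooling step.
example_pooled_sd = pooled_sd(9.0, 10.0)
print(round(example_pooled_sd, 2))  # 9.51
```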

Advantages of SMD

SMDs are the analysis method of choice when meta-analyzing continuous data using different measurement scales. The only other option to combine this type of data would be to utilize “responder analyses” as previously discussed.

Limitations of SMD

A key assumption of SMDs is that differences in standard deviations between studies are only present due to differences in the outcome measurements.²⁰ If standard deviations are otherwise different between studies because of differences in trial design, patient population, or treatment effect, this may impact the SMD.²⁰ For example, if two studies utilize the same outcome measurement and result in the same mean difference between intervention and control, but have different standard deviations, the one with the larger SD will result in a smaller SMD.
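The effect of differing standard deviations can be shown with two hypothetical studies that use the same scale and observe the same mean difference. The numbers below are invented for illustration and do not come from any cited trial.

```python
def smd_from_mean_difference(mean_difference: float, sd: float) -> float:
    """Express a mean difference in standard deviation units."""
    return mean_difference / sd


# Hypothetical studies: identical mean difference, different variability.
same_mean_difference = -8.5
smd_narrow_population = smd_from_mean_difference(same_mean_difference, sd=8.0)
smd_broad_population = smd_from_mean_difference(same_mean_difference, sd=12.0)

print(round(smd_narrow_population, 2), round(smd_broad_population, 2))  # -1.06 -0.71
```

Although the treatment changed the score by the same number of points in both studies, the study with the more variable population reports the smaller SMD, so a pooled estimate would partly reflect differences in populations and not only differences in treatment effect.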

Although Cohen’s proposed thresholds can be used to qualify the magnitude of effect, translating SMDs clinically may pose challenges. For example, the SMD of depression rating scale scores with ketamine versus placebo for major depressive disorder was − 0.87 (− 1.26 to − 0.48), a “large” effect as per Cohen.¹⁹ To improve interpretability of an SMD, clinicians can “back-transform” it to a well-known scale by multiplying the SMD by the SD of the desired outcome. The SMD of − 0.87 can be converted to the change in MADRS score, by multiplying it by the SD (9.77), which results in a − 8.5-point (− 12.3 to − 4.7) difference.¹⁶ Since the MID for MADRS is 1.6–1.9,¹⁵ this difference appears clinically meaningful. Converting the final SMDs to familiar outcome measures may enhance comprehension for the reader and improve facilitation of shared decision-making with patients.
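The back-transformation is a single multiplication applied to the point estimate and to each confidence limit. The sketch below uses the figures quoted in this section; the function name and the choice to compare against the upper end of the MID range are illustrative.

```python
def back_transform(smd: float, reference_sd: float) -> float:
    """Convert an SMD to points on a familiar scale using that scale's SD."""
    return smd * reference_sd


MADRS_SD = 9.77        # SD used for the back-transformation in the text
MADRS_MID_UPPER = 1.9  # upper end of the reported MID range for MADRS

point, lower, upper = (back_transform(value, MADRS_SD) for value in (-0.87, -1.26, -0.48))
print(round(point, 1), round(lower, 1), round(upper, 1))  # -8.5 -12.3 -4.7

# Even the confidence limit closest to zero exceeds the MID in magnitude.
whole_interval_exceeds_mid = abs(upper) > MADRS_MID_UPPER
print(whole_interval_exceeds_mid)  # True
```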

Back to the Clinical Scenario

This meta-analysis demonstrates an observed benefit of ketamine versus placebo in depression, consistent in both responder analysis and SMD. You discuss the potential benefits of ketamine with the patient including the absolute benefits, responder rates, safety concerns, and administration procedures. The patient and healthcare team decide ketamine is a reasonable option.

Overall conclusions

This review provides guidance to clinicians on the interpretation and application of advanced and novel statistical methods. As with all statistical tests, these methods are tools to use to view the evidence, and must be contextualized with their advantages and limitations. Clinicians must interpret these results in the context of clinical relevance and patient-specific goals in order to facilitate evidence-based shared decision making in their practice.

Win ratios are a useful method to prioritize and analyze composite outcomes and allow for patient-centered endpoints to be included in primary outcomes. Clinicians should assess individual components of the composite outcome to ensure clinical and statistical significance. Restricted mean survival time provides an alternative method to survival analysis that is more accurate when proportionality assumptions are not met, and may be more meaningful to patients. Researchers and clinicians should ensure the timeframe set to analyze RMST is evidence-based. Continuous outcomes may be reported as responder analyses or standardized mean differences, both of which have inherent limitations that clinicians should consider when interpreting the data.

Declarations:

Conflict of Interest:

The authors declare that they do not have a conflict of interest.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Redfors B, Gregson J, Crowley A et al. The win ratio approach for composite endpoints: practical guidance based on previous experience. Eur Heart J. 2020;41(46):4391–9.
2. Pocock SJ, Ariti CA, Collier TJ, Wang D. The win ratio: a new approach to the analysis of composite endpoints in clinical trials based on clinical priorities. Eur Heart J. 2011;33(2):176–82.
3. Mentz RJ, Ward JH, Hernandez AF et al. Angiotensin-neprilysin inhibition in patients with mildly reduced or preserved ejection fraction and worsening heart failure. J Am Coll Cardiol. 2023;82(1):1–12.
4. Biegus J, Voors AA, Collins SP et al. Impact of empagliflozin on decongestion in acute heart failure: the EMPULSE trial. Eur Heart J. 2022;44(1):41–50.
5. Ferreira JP, Jhund PS, Duarte K et al. Use of the win ratio in cardiovascular trials. JACC Heart Fail. 2020;8(6):441–50.
6. Ajufo E, Nayak A, Mehra MR. Fallacies of using the win ratio in cardiovascular trials. JACC Basic Transl Sci. 2023;8(6):720–7.
7. Guimarães HP, Lopes RD, de Barros e Silva PGM et al. Rivaroxaban in patients with atrial fibrillation and a bioprosthetic mitral valve. N Engl J Med. 2020;383(22):2117–26.
8. Gregson J, Sharples L, Stone GW, Burman C-F, Öhrn F, Pocock S. Nonproportional hazards for time-to-event outcomes in clinical trials. J Am Coll Cardiol. 2019;74(16):2102–12.
9. Hasegawa T, Misawa S, Nakagawa S et al. Restricted mean survival time as a summary measure of time to event outcome. Pharm Stat. 2020;19(4):436–53.
10. Kim DH, Uno H, Wei L-J. Restricted mean survival time as a measure to interpret clinical trial results. JAMA Cardiol. 2017;2(11):1179–80.
11. Han K, Jung I. Restricted mean survival time for survival analysis: a quick guide for clinical researchers. Korean J Radiol. 2022;23(5):495–9. doi: 10.3348/kjr.2022.0061.
12. Connolly SJ, Karthikeyan G, Ntsekhe M et al. Rivaroxaban in rheumatic heart disease–associated atrial fibrillation. N Engl J Med. 2022;387(11):978–88.
13. Snapinn SM, Jiang Q. Responder analyses and the assessment of a clinically relevant treatment effect. Trials. 2007;8(1).
14. Cook CE, Bejarano G, Reneker J, Vigotsky AD, Riddle DL. Responder analyses: a methodological mess. J Orthop Sports Phys Ther. 2023;1–9.
15. Duru G, Fantino B. The clinical relevance of changes in the Montgomery–Asberg Depression Rating Scale using the minimum clinically important difference approach. Curr Med Res Opin. 2008;24(5):1329–35.
16. Dean RL, Hurducas C, Hawton K et al. Ketamine and other glutamate receptor modulators for depression in adults with unipolar major depressive disorder. Cochrane Database Syst Rev. 2021;2021(11).
17. Fedorov V, Mannino F, Zhang R. Consequences of dichotomization. Pharm Stat. 2009;8(1):50–61.
18. Murad MH, Wang Z, Chu H, Lin L. When continuous outcomes are measured using different scales: guide for meta-analysis and interpretation. BMJ. 2019;364:k4817.
19. Cohen J. Statistical power analysis for the behavioral sciences. Hillsdale, NJ: L. Erlbaum Associates; 1988.
20. Cochrane Handbook for Systematic Reviews of Interventions [Internet]. [cited 2023 Jul 15]. Available from: https://training.cochrane.org/handbook/current


