Abstract
TOPCAT was a multinational clinical trial of 3,445 heart failure with preserved ejection fraction (HFpEF) patients, conducted at 233 sites in six countries in North America, Eastern Europe and South America. Patients with a heart failure hospitalization in the last 12 months or an elevated B-type natriuretic peptide (BNP) were randomized to the mineralocorticoid receptor antagonist spironolactone vs. placebo. Sites in Russia and the Republic of Georgia provided the majority of early enrollment, primarily based on the hospitalization criterion since BNP levels were initially unavailable there. With the emergence of country-specific aggregate event rate data indicating lower rates in Eastern Europe and differences in patient characteristics there, the Data and Safety Monitoring Board (DSMB) recommended a relative increase in enrollment in North America plus other corrective measures. Although final enrollment reflected the increased contribution from North America, a plurality of the final cohort came from Russia and Georgia (49% vs. 43% in North America). BNP measurements from Russia and Georgia available later in the trial suggested no or a mild level of heart failure, consistent with low event rates. The primary results showed no significant spironolactone treatment effect overall (primary endpoint hazard ratio 0.89 (0.77, 1.04)), with a significant hazard ratio in North and South America (0.82 (0.69, 0.98), p = 0.026) but not in Russia and Georgia (1.10 (0.79, 1.51), interaction p = 0.12).
This report describes the DSMB’s detection and management recommendations for regional differences in patient characteristics in TOPCAT, and suggests methods of surveillance and corrective actions that may be useful for future trials.
CONDENSED ABSTRACT
TOPCAT was a multinational clinical trial of 3,445 HFpEF patients conducted in six countries in North America, Eastern Europe and South America, comparing spironolactone to placebo. Sites in Russia and the Republic of Georgia provided a plurality of the final cohort (49% vs. 43% in North America). Patient and biomarker data from Russia and Georgia suggested no or mild heart failure in patients enrolled in these regions, likely explaining a lack of treatment effect there. This report describes the DSMB’s detection and suggested management of these regional differences, and recommends policies that may be useful for future trials.
INTRODUCTION
Country- or region-specific differences in outcomes are frequently observed in multinational clinical trials (1–3) and may or may not be indicative of true differences in drug response (4). Geographies may vary with respect to genetics (4,5), non-genetic racial characteristics (5,6), medical practice, training or infrastructure patterns that may influence outcomes despite general adherence to a clinical trial protocol (4), and other factors. In the planning and conduct of multinational clinical trials the potential impact of geographic differences in outcomes is one of many important factors that needs to be considered, but often is not taken into account (4).
The Treatment of Preserved Cardiac Function Heart Failure with an Aldosterone Antagonist Trial (TOPCAT) (7,8) was a large scale, multinational National Heart Lung and Blood Institute (NHLBI)-sponsored clinical trial conducted in 233 sites in six countries located in three distinct geographic regions: North America (U.S. and Canada); Eastern Europe (Russia and the Republic of Georgia); and South America (Argentina and Brazil). The target disease indication investigated, heart failure with preserved left ventricular ejection fraction (HFpEF), has been a challenge to define and enroll in clinical trials and has thus far eluded the development of definitive therapy (9,10). The TOPCAT Data and Safety Monitoring Board (DSMB) worked closely with the NHLBI, Trial Executive/Steering Committee and the Data/Clinical Coordinating Center to deal with several major challenges during the trial. In this report we attempt to provide insight into regional heterogeneity issues in TOPCAT, including how they were detected and the recommendations made by the DSMB to resolve them. Based on this experience we offer suggestions for DSMB oversight of potential geographic disparities in future multinational trials.
METHODS
DSMB organization
The TOPCAT DSMB organization and responsibilities are given in the online Supplement.
Database
The database consisted of Clinical Trial Coordinating Center (CTCC) reports provided to the Open and Closed Sessions of the various DSMB meetings, the meeting minutes, monthly safety reports viewed by the DSMB Chair, correspondence of the Chair with the NHLBI and published data from the TOPCAT entire cohort (8,11). Interval enrollment data were the most recent figures available at the various reviews or from more extensive “data freeze” analyses performed within two months of the meetings.
Statistical methodology
Analyses were as described for the TOPCAT trial (7,8,11) using an unadjusted model. Additional analyses were conducted using chi-square contingency table tests with a Bonferroni correction.
Funding source and mechanism
The TOPCAT trial was supported by NHLBI contract HHSN268200425207C awarded to New England Research Institutes Clinical Trials Coordinating Center.
TOPCAT TRIAL
Overview
TOPCAT (ClinicalTrials.gov number, NCT00094302) enrolled 3445 patients with HFpEF between August 10, 2006, and January 31, 2012 with a mean follow-up of 3.3 years ending on June 30, 2013. Overall (8) and regional (8,11) differences in outcomes have been previously reported. The primary endpoint of TOPCAT was the composite of time to cardiovascular death, aborted cardiac arrest or heart failure hospitalization, with each component adjudicated by a clinical events committee (CEC). Inclusion of non-U.S. sites was an integral part of the original study design, in order to enhance generalizability of the results and manage trial costs.
In addition to monthly safety monitoring that was focused on the class adverse effects of MRAs or MRAs in combination with other renin-angiotensin system inhibitors (hyperkalemia, renal function, gynecomastia in each blinded treatment arm), at its scheduled meetings the DSMB monitored overall enrollment, the aggregate (both treatment groups combined) event rate and country-specific data. Throughout the trial the DSMB elected not to disclose the observed aggregate event rate to the Executive/Steering Committee, but to report to them any important departure from the pretrial assumption expected rate that might impact statistical power. The interim analysis plan included a DSMB review of unblinded primary outcome data by treatment arm at 33%, 50% and 75% of the expected number of primary endpoints, using events confirmed by the CEC and respective efficacy conditional power futility boundaries of ≤10%, ≤15% and ≤20%. The trial reached completion, with 671 subjects having a primary event (8). The hazard ratio (95% confidence intervals) for the primary endpoint was 0.89 (0.77, 1.04), p = 0.14 (8); and 0.83 (0.69, 0.99), p = 0.04 for time to heart failure hospitalization as a single endpoint (8).
The TOPCAT protocol defined HFpEF as a) having at least one heart failure symptom present during screening, b) one heart failure sign present during the last 12 months, and c) meeting criteria for one of two design strata: at least one hospitalization in the last 12 months “for which heart failure had to be a major component of the hospitalization,” or an elevated brain natriuretic peptide (BNP) or N-terminal pro-BNP (NT-proBNP) level sampled in the last 60 days. To exclude HFrEF, the LVEF measured during the previous 6 months had to be ≥45%. The dual hospitalization or natriuretic peptide (NP) qualification scheme for enrollment was necessary because at the beginning of the trial BNP or NT-proBNP assays were not universally available. For the final cohort 28% of patients were enrolled using NP criteria (8): 45% of patients in North and South America and 11% in Eastern Europe. The prespecified subgroup analysis of heart failure qualifying criteria yielded a significant interaction p value of 0.01, with an NP subgroup hazard ratio of 0.65 (0.49, 0.87), p = 0.003 and the heart failure hospitalization subgroup having no evidence of a treatment effect (hazard ratio 1.01 (0.84, 1.21), p = 0.92) (8).
Of the 981 patients enrolled by NP criteria, 81% were from North or South America. Additional evidence of geographic differences in outcomes by country and region was noted (8,11). Patients enrolled in Russia and Georgia had lower event rates in the placebo arm. In the spironolactone arm they showed no evidence of hyperkalemia or renal dysfunction despite an increased incidence of gynecomastia, and no evidence of a treatment effect (8,11). The hazard ratio of the primary endpoint for patients enrolled in the Americas was 0.82 (0.69, 0.98), p = 0.026, compared with 1.10 (0.79, 1.51), p = 0.58 in Russia and Georgia (p = 0.12 for interaction between treatment and region) (8,11).
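As an illustration of how the regional interaction can be examined from published summary data alone, the following sketch approximates the interaction p value from the two regional hazard ratios and their 95% confidence intervals quoted above. It assumes the two regional estimates are independent and that log hazard ratios are approximately normal; the function names are hypothetical, and this is not the Cox model interaction test used in the trial analysis (8,11).

```python
import math

def log_hr_se(lower, upper, z=1.96):
    """Standard error of log(HR), recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def interaction_p(hr_a, ci_a, hr_b, ci_b):
    """Two-sided p value for the difference between two independent log hazard ratios."""
    diff = math.log(hr_a) - math.log(hr_b)
    # Standard error of the difference: root sum of squares of the two SEs.
    se = math.hypot(log_hr_se(*ci_a), log_hr_se(*ci_b))
    z = diff / se
    # Two-sided normal tail probability, 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2))

# Americas vs. Russia/Georgia, values as reported in the text (assumed independent).
p_interaction = interaction_p(0.82, (0.69, 0.98), 1.10, (0.79, 1.51))
print(f"approximate interaction p = {p_interaction:.2f}")
```

Under these assumptions the approximation lands close to the reported interaction p value of 0.12, which suggests the regional difference in hazard ratios, although directionally clear, was estimated with limited precision.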
Enrollment
Pre-trial plans and projections
The DSMB also served as the protocol review committee, with the final version of the protocol approved in January, 2005. The protocol contained language on the logistical and clinical research organizational support for recruiting patients from North and South America, Western Europe and Eastern Europe. The DSMB became aware of the potential for non-North American sites to influence the overall outcome of the trial when the plan to enroll a large number of patients from Russia and the Republic of Georgia was reported by Trial leadership and the sponsor at the first DSMB meeting in December, 2006 (Figure 1). Support for an Eastern European strategy as a means of facilitating recruitment was offered by the TOPCAT Steering Committee leadership, who reported that the CHARM-Preserved Trial (9) “had not observed any interaction between patient prognosis and country,” including Russia. Nevertheless the DSMB expressed concern over enrolling a large number of patients with a diagnosis of HFpEF in non-North American sites. The DSMB-recommended strategy to deal with this issue included a) monitoring of source documents to check that the qualifying hospitalization met the heart failure criterion, for which translation of source records would be needed; b) requesting that at a minimum, 20% of all patients enrolled should come from North America, and the enrollment from any one country should not exceed 50%; and c) asking that enrollment rates and other trial data be monitored by country.
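The two enrollment limits in item b) lend themselves to a simple automated check at each data review. The sketch below is a hypothetical helper, not trial software; it applies the 20% North American minimum and the 50% single-country maximum to the final randomized counts reported in Table 1.

```python
# Final randomized counts by country, as reported in Table 1 of this report.
final_counts = {
    "U.S.": 1151, "Canada": 326, "Russia": 1066,
    "Georgia": 612, "Argentina": 123, "Brazil": 167,
}

def check_enrollment_mix(counts, north_america=("U.S.", "Canada"),
                         min_na_share=0.20, max_country_share=0.50):
    """Check the two DSMB enrollment limits (hypothetical helper)."""
    total = sum(counts.values())
    na_share = sum(counts[c] for c in north_america) / total
    top_country = max(counts, key=counts.get)
    top_share = counts[top_country] / total
    return {
        "north_america_share": na_share,
        "north_america_ok": na_share >= min_na_share,
        "largest_country": top_country,
        "largest_country_share": top_share,
        "largest_country_ok": top_share <= max_country_share,
    }

mix = check_enrollment_mix(final_counts)
print(mix)
```

Both limits are satisfied by the final cohort. Because the 50% ceiling was stated per country, a check of this kind would not by itself have flagged the combined Russia and Georgia share; a regional ceiling would require grouping countries before applying the same test.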
The projected enrollment by region, presented by Steering Committee leadership at the January, 2008 DSMB meeting (Figure 1), is given in Figure 2 juxtaposed against the actual final enrollment on January 31, 2012. In the pre-trial projections, 51.1% of patients were expected to be enrolled in Eastern Europe (42.2% in Russia and 8.9% in Georgia), 27.8% in North America (20.0% in the U.S. and 7.8% in Canada), and 21.1% in South America (11.1% in Argentina and 10.0% in Brazil). The final Russia/Georgia enrollment was close to projections, whereas North American enrollment was higher and South American was lower (Figure 2). The trend for enrollment to favor more North American participation over time was at the request of the DSMB, to counter the early dominance of Eastern European enrollment.
Recruitment dynamics
Enrollment milestones and important trial developments are summarized in Figure 1 and discussed in more detail in the online Supplement. The first patient was enrolled in August, 2006 approximately one year behind schedule due primarily to delays in manufacture and supply of study medication. In March 2007 the DSMB was provided with the first data available for review, for 165 patients randomized in Russia, Georgia and the U.S. Figure 3 shows the dominance of Russia and Georgia in the early phases of trial enrollment, followed by the steady ascendancy of North American randomization. At the completion of enrollment in January 2012 the Russia and Georgia contribution had fallen below 50% of the total, but still constituted a plurality: North America, n = 1477 (42.9%); Russia and Georgia, n = 1678 (48.7%); South America 290 (8.4%) (Figure 3).
Geographic discrepancies in recruitment and clinical course of patient populations
Details of the DSMB’s detection of and management recommendations for geographic discrepancies in the TOPCAT patient population are given in the online Supplement, with highlights summarized below and in Figure 1.
Emergence and detection; tactical responses
The first evidence suggesting that patients from Russia or Georgia were potentially different from other TOPCAT regions emerged from country-specific baseline characteristics for the first 794 enrolled patients, presented at the January 2008 DSMB scheduled review (Figure 1). The data indicated that a history of myocardial infarction or angina was more prevalent in patients enrolled in Russia, and that Russian and Georgian patients had less orthopnea by history compared to patients from other countries. The significance of these early findings was not clear at the time, but based on subsequent developments these data likely reflected a predominance of coronary artery disease pathophysiology in Russia, and a lesser degree or absence of heart failure in Russian and Georgian patients compared with other countries.
The first indication that the aggregate (combination of both treatment arms) event rate was lower in Georgia or Russia was revealed in the September, 2008 Closed Session review (Figure 1), where based on 1461 total randomized patients Georgia’s was 2.0% compared to 5.4% overall and 9.0% in the U.S. However, the number of events was low (only 6 in Georgia and 38 in the U.S.). These data were insufficient for drawing firm conclusions, but in response the DSMB recommended that the Trial leadership “encourage Georgian Investigators to enroll patients as expected per the study protocol”.
At the September, 2008 meeting the DSMB approved a plan to reduce the total target enrollment from 4,500 to 3,515 patients with 2 years of minimum follow-up, based on the aggregate event rate being as expected and the estimated length of median follow-up being longer, plus accepting a power calculation of >80% as opposed to 90%. In this event-driven trial the new sample size calculations were based on 80% and 85% power for 551 and 630 primary events respectively, assuming a 20% relative difference between the two treatment arms (8). In addition, the DSMB requested a substudy of BNP in patients entering the trial via a history of heart failure hospitalization to assess the severity of heart failure at baseline among the different countries, and that the trial obtain a BNP or NT-proBNP on every patient enrolled in Georgia and Russia. Finally, the DSMB requested that adverse events, serious adverse events and primary endpoint event rates be presented by country at all future meetings.
At the next DSMB review in April 2009 (Figure 1) the country-specific unadjudicated aggregate primary event rate patterns first noted at the September 2008 meeting persisted, and are given in Table 1. Georgia was an outlier, with an event rate 75% lower than the composite of all other countries (p <0.0001), and 83% lower than the U.S. (p <0.0001). The event rate in Russia was 35% lower than the average of other countries (p = 0.017), 56% lower than the U.S. (p <0.0001), and 2.6-fold higher than Georgia (p = 0.010). The statistical analysis of the event rates in Table 1 is post hoc, and was not performed at the time of data review. However, the lower event rate in Georgia was noted at the DSMB review, which prompted it to recommend an increase in enrollment of patients in the United States and Canada in an attempt to address the circumstances in Russia and Georgia. In March, 2010 the first unblinded interim analysis was performed, at 33% of the expected primary events (Figure 1). Based on the conditional power being above the futility threshold the recommendation was to continue the trial with the current design.
Table 1.
Country | Final number of patients randomized (% of total) | Number of patients randomized on 1/30/09 (%†){%‡} | Number of primary events (%†) | Event rate (%) | P value vs. Georgia | P value vs. Russia |
---|---|---|---|---|---|---|
U.S. | 1151 (33.4) | 501 (29.4){82} | 66 (50.0) | 13.2 | <0.0001 | <0.0001 |
Canada | 326 (9.5) | 137 (8.0){60} | 13 (9.8) | 9.5 | 0.0004 | 0.11 |
Russia | 1066 (30.9) | 657 (38.5){51} | 38 (28.8) | 5.8 | 0.010 | - |
Republic of Georgia | 612 (17.8) | 355 (20.8){119} | 8 (6.1) | 2.2* | - | 0.010 |
Argentina | 123 (3.6) | 57 (3.3){14} | 7 (5.3) | 12.3 | 0.0002 | 0.053 |
Brazil | 167 (4.8) | 0 (0){0} | 0 (0) | 0 | - | - |
Total | 3445 (100) | 1707 (100){56} | 132 (100) | 7.7 | - | - |
* Chi-sq, with p = 0.0125 the Bonferroni critical value;
† Of the total trial enrollment or events on 9/16/08;
‡ Of target allocation
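The post hoc comparisons summarized in Table 1 can be approximated from the counts in the table. The sketch below assumes an uncorrected Pearson chi-square test on 2x2 tables (whether a continuity correction was applied is not stated in this report) and a Bonferroni threshold of 0.05/4; the function names are illustrative only.

```python
import math

# (patients randomized, primary events) at the April 2009 review, from Table 1.
# Brazil had not yet enrolled and is omitted.
data = {
    "U.S.": (501, 66),
    "Canada": (137, 13),
    "Russia": (657, 38),
    "Georgia": (355, 8),
    "Argentina": (57, 7),
}

def chi2_p(n1, e1, n2, e2):
    """Pearson chi-square p value (1 df, no continuity correction) for a 2x2 table."""
    a, b, c, d = e1, n1 - e1, e2, n2 - e2
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def versus_rest(country):
    """Relative reduction in event rate vs. all other countries pooled, with p value."""
    n, e = data[country]
    rest_n = sum(v[0] for k, v in data.items() if k != country)
    rest_e = sum(v[1] for k, v in data.items() if k != country)
    relative_reduction = 1 - (e / n) / (rest_e / rest_n)
    return relative_reduction, chi2_p(n, e, rest_n, rest_e)

BONFERRONI_ALPHA = 0.05 / 4  # assumed: four pairwise comparisons per reference country

georgia_rr, georgia_p = versus_rest("Georgia")
russia_rr, russia_p = versus_rest("Russia")
georgia_vs_russia_p = chi2_p(*data["Georgia"], *data["Russia"])

print(f"Georgia vs. rest: {georgia_rr:.0%} lower, p = {georgia_p:.5f}")
print(f"Russia vs. rest: {russia_rr:.0%} lower, p = {russia_p:.3f}")
print(f"Georgia vs. Russia: p = {georgia_vs_russia_p:.3f}, "
      f"significant after Bonferroni: {georgia_vs_russia_p < BONFERRONI_ALPHA}")
```

Under these assumptions the pooled comparisons agree with the relative differences and p values quoted in the text for Georgia and Russia. A routine of this kind could be run at each review, although with so few events in some countries the p values should be read as descriptive rather than confirmatory.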
In early October 2010 NHLBI Program Divisional leadership managing the trial held an unscheduled meeting with the DSMB Chair (Figure 1) to review results of the requested BNP Pilot project in Russia, and to discuss a review of source records for the qualifying hospitalization that had been reviewed by an NHLBI Program staff member fluent in Russian. In the vast majority (19 or 20) of the 22 reviewed hospitalizations ischemic symptoms rather than heart failure appeared to predominate. It was noted by NHLBI Program staff that this was consistent with ischemic heart disease prevalence data from the I-PRESERVE (10) and EVEREST (12) trials. The BNP and NT-proBNP data from Russia were compared to data from other countries (Table S1 and associated discussion in online Supplement). The majority of patients who had been enrolled via the heart failure hospitalization criteria, most of whom had NP samples drawn after enrollment and not during the hospitalization, had values within the normal range. In contrast, very few (9%) U.S. or Canadian patients, all of whom had NP draws done during the index hospitalization, had values within the normal range. The NT-proBNP median value for all patients entering the trial via the hospitalization criterion in Russia or Georgia was within the normal range (respectively 233 pg/ml and 164 pg/ml), whereas it was markedly elevated (887 pg/ml) in the U.S. and Canadian patients. Concerns over these data were expressed in written communication to the NHLBI, where the DSMB Chair outlined a strategy recommending that the Steering Committee institute closer monitoring of enrollment criteria for patients in both Russia and Georgia (online Supplement).
These recommendations were accepted by the full DSMB at the scheduled 2nd interim analysis meeting later in October (Figure 1), where unblinded outcome data on 2,732 patients easily exceeded the conditional power futility boundary (online Supplement). During this meeting the Trial leadership and the CTCC pointed out that the BNP Pilot data were subject to selection bias because most samples in North America were obtained during a HF hospitalization, as opposed to in Russia and Georgia where NPs were drawn post randomization after the index hospitalization. Nevertheless, the low (208 pg/ml combined) median values of Russian and Georgian patients were known to be associated with low mortality or cardiovascular hospitalization event rates in HFrEF (13), and were subsequently shown to be associated with a low composite cardiovascular mortality and heart failure hospitalization rate in HFpEF (14). Thus although the NP data from the Pilot study did not allow a direct comparison of Eastern European enrolled patients to those enrolled in the Americas, they were consistent with the low primary event rates in Russia and Georgia. After extensive discussion of this issue, it was decided to terminate the NP Pilot study, and to again emphasize to Russian and Georgian investigators the need to ensure that patients hospitalized for apparent heart failure met the eligibility criteria for the trial.
By the June 2011 review (Figure 1) 3,080 patients had been enrolled, and for the first time U.S. enrollment (n = 1,024) exceeded that of any other country (Figure 3). The previously noted country-specific differences in baseline characteristics and event rates persisted at this and the subsequent review in December, 2011 (Figure 1) conducted in 3,317 patients. At the June 2012 DSMB review (Figure 1) full enrollment had been reached 5 months earlier, with the U.S. leading recruitment at 33.4% of the total (Table 1). Relative to the projected enrollment, the actual final enrollment by country was: U.S. 167% of target, Canada 122%, Russia 73%, Georgia 200%, Argentina 32% and Brazil 48% (Figure 2). However, despite the over-enrollment in the U.S. and Canada, the two Eastern European countries enrolled 48.7% of the total, compared to 42.9% in North America and 8.4% in South America.
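The percent-of-target figures can be reproduced by dividing each country's final enrollment by its projected share of the final cohort. The sketch below assumes that this is how the figures were derived, and it uses the projected shares as published (rounded to one decimal place), so individual results may differ from the text by about one percentage point, as happens for Canada.

```python
# Pre-trial projected share of enrollment by country (%), as given in the text.
projected_share = {
    "U.S.": 20.0, "Canada": 7.8, "Russia": 42.2,
    "Georgia": 8.9, "Argentina": 11.1, "Brazil": 10.0,
}
# Final randomized counts by country, from Table 1.
final_enrolled = {
    "U.S.": 1151, "Canada": 326, "Russia": 1066,
    "Georgia": 612, "Argentina": 123, "Brazil": 167,
}

total = sum(final_enrolled.values())

# Percent of target: actual enrollment divided by the number of patients the
# country would have contributed had it matched its projected share.
percent_of_target = {
    country: 100 * n / (total * projected_share[country] / 100)
    for country, n in final_enrolled.items()
}

for country, pct in percent_of_target.items():
    print(f"{country}: {pct:.0f}% of target")
```

Values above 100% indicate enrollment beyond the projected share. On this reading the U.S. and Georgia both overshot their projections, which is why the combined Eastern European share remained close to its projection even though Russia fell short.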
The final interim analysis, at 75% of the projected number of primary events, was conducted at the June, 2012 review (Figure 1). Conditional power again was well above the futility boundary (20%), and was 51% using the observed event rates in each arm modeled forward to the completion of follow-up, or 69% using the observed placebo event rate but the pre-trial/expected 20% difference in crude event rates in the remaining patients active in the trial who had not had a primary event. Based on 382 patients with confirmed primary events the overall hazard ratio was 0.792 (p = 0.020, efficacy boundary for stopping = 0.001). At this interim an additional 161 patients had unconfirmed events that were pending adjudication, and when they were included the hazard ratio p value was 0.118. Country-specific hazard ratios were available for the first time, and are shown with the number of confirmed events in Table 2. For the confirmed events there was no evidence of a treatment effect in Georgia, but the number of primary events (n = 14) was extremely small. There was also little evidence of a treatment effect in Russia (hazard ratio 0.95) despite 62 primary events being observed. These country-specific hazard ratios are consistent with the final trial outcome in North/South America combined vs. Russia/Georgia (8,11).
Table 2.
Country | Hazard ratio | Number of subjects with confirmed primary event |
---|---|---|
U.S. | 0.786 | 230 |
Canada | 0.642 | 49 |
Russia | 0.950 | 62 |
Republic of Georgia | 0.993 | 14 |
Argentina | 0.745 | 18 |
Brazil | 0.473 | 9 |
Overall | 0.792 | 382 |
TOPCAT trial follow-up was completed on June 30, 2013, and top line results were presented to the DSMB and NHLBI on September 18, 2013 (Figure 1). The total number of primary events was 671 (8), which conferred >85% power to detect a relative difference in event rates of 20% between the two treatment arms.
Potential causes of geographic discrepancies
In TOPCAT four of the six countries exhibited relatively similar patient characteristics and event rates, as well as favorable effects of spironolactone treatment. Data from the two Eastern European countries differed from the four countries in the Americas, but also from each other. The Russian cohort was dominated by clinical presentations of ischemic heart disease, a common cause of HFpEF, and patients may have been symptomatic (dyspneic) due to ischemia rather than heart failure. An increased prevalence of an ischemic etiology in Eastern European HFpEF patients that was known unofficially during the TOPCAT Trial was eventually published for the I-PRESERVE and CHARM-Preserved patient populations (15). It is therefore possible that ineffectiveness of spironolactone as an anti-ischemic agent, rather than lack of efficacy against HFpEF, might have contributed to the TOPCAT results in Russia.
A predominance of ischemic heart disease symptoms, however, was apparently not the issue in patients enrolled in the Republic of Georgia, who from the beginning of the trial had a very low event rate and little evidence for heart failure based on signs and symptoms or random NP measurements. The lack of any treatment effect in these patients may well have been due to the fact that heart failure was either absent or much milder compared to those enrolled in the Americas.
Although study drug compliance did not appear to account for the lack of treatment effect in Russia and Georgia as these countries reported using higher doses of both spironolactone and placebo (11), in the spironolactone arm there was a substantially higher incidence of hyperkalemia and elevations in serum creatinine in the Americas compared with Russia/Georgia (11). This was interpreted as a “lack of pharmacologic effect” in Russian and Georgian patients (11), but could also be due to the absence of actual study drug consumption. However, Russia and Georgia patients had an increase in gynecomastia in the spironolactone vs. the placebo arm (11), indicating they were likely taking study medication. Gynecomastia in the absence of marked hyperkalemia has been observed in another spironolactone vs. placebo HFpEF study (16), and could be characteristic of a subcohort of patients. These pharmacodynamic adverse event data were not reviewed by country during the trial by the DSMB, although they were being tracked monthly for the entire trial. Country-specific AEs and SAEs within their standard regulatory organ system groupings were tracked by the DSMB, and no obvious differences between countries or regions were noted in any review.
LESSONS LEARNED AND RECOMMENDATIONS FOR FUTURE TRIALS
Lessons Learned
While the country-specific and regional heterogeneity in TOPCAT could be viewed as expected statistical variation in a large multinational trial (1,3), the differences in patient characteristics, lower event rates, lack of certain drug class-related pharmacodynamic effects (11) and complete lack of treatment effect in Russia/Georgia compared with the other regions strongly suggest that more than the play of chance occurred.
Enrollment issues
Because of the difficulty in identifying the phenotype (17) and other issues, enrollment of patients into HFpEF clinical trials can be challenging. Consequently, in HFpEF trials the pressure to enroll at a projected rate is even more in conflict than usual with the imperative to confine enrollment to the target population. TOPCAT started behind schedule, and once begun there was brisk enrollment from two countries where data ultimately proved to be qualitatively problematic. By the time it was appreciated that there were serious issues with patient characteristics and event rates in patients from Russia and Georgia, both countries had enrolled substantial numbers of subjects. The lesson learned here is that during the early as well as the later phases of trial enrollment, recruitment from one or two regions or sites should not dominate, and patient characteristics should be monitored carefully in order to identify potential regional irregularities in study subpopulations.
What you see early may be what you get late
The discrepant patterns of patient characteristics and event rates in Russia or Georgia were present from the first opportunity to observe them, and persisted throughout the trial. Although it is well known that treatment effects can vary during a trial, fluctuations in patient characteristics and overall event rates may not exhibit such plasticity. It would seem prudent to place considerable weight on early observations that remain consistent, particularly if they could threaten trial integrity.
Country- and region-specific event rates need to be carefully followed and may demand dissemination beyond the DSMB
At no time during the trial did the aggregate event rate depart from expectations, and this was periodically communicated to the Steering Committee. Country-specific event rates were not specifically disclosed by the DSMB to the Steering Committee, and it is possible that such information would have triggered earlier or more definitive corrective actions. Continuing to enroll in a region where the event rate is inadequate to assess the tested treatment effect is questionable, and at a minimum should be brought to a steering committee’s attention. In TOPCAT this information was communicated only indirectly. Once the DSMB was confident there was an issue, direct reports of the actual aggregate event rates to the Trial’s leadership might have resulted in different management decisions.
In a large, multicenter clinical trial, change is difficult but not changing may be fatal
Both aspects of this popular management trope apply to large multicenter trials, especially multinational ones. For various reasons that include regulatory compliance and site burden, investigators are understandably reluctant to make substantive changes in clinical protocols during a trial. Yet there are often developments in trials that need major adjustments or changes in overall approach. In TOPCAT the first opportunity for corrective measures addressing geographic disparity occurred at the review in April 2009, with the confirmation that event rates were extremely low in Georgia, and low in Russia. The DSMB recommendation at the time was continued enhanced enrollment of patients in North America. Thereafter an ad hoc NP study strongly suggesting much less advanced heart failure in Russia or Georgia, plus persistence of the lower event rates and the Trial’s response of accelerated site monitoring, did not lead to any apparent change in the clinical characteristics of randomized patients, or to curtailment of enrollment. In fact Georgia finished at 200% of its enrollment target, and after April 2009 another 666 patients were enrolled in Russia and Georgia. If those patients had been enrolled in the Americas it could be speculated that the trial may have been positive.
Requirement of an elevated B-type natriuretic peptide measurement in future HFpEF trials?
As referenced in the Trial Overview section, the hazard ratio in the 981 patients enrolled with an elevated BNP or NT-proBNP was highly statistically significant at 0.65 compared to the nonsignificant hazard ratio in the 2,464 patients enrolled based on a history of HF hospitalization (8,11). In HFpEF an elevated NP level adds precision to the HF diagnosis and provides objective evidence for a certain degree of HF severity (13,14). NP assays are now available in virtually all regions where clinical trial capability exists. However, in I-PRESERVE a treatment effect of the angiotensin receptor blocker irbesartan was only observed in patients with randomization NT-proBNP levels below the median of 339 pg/ml (18), a value below the TOPCAT qualifying value of 360 pg/ml. In addition, in TOPCAT NP levels are not easily separated from the geographic disparity issues. Therefore, further analysis of the TOPCAT data will be required to add credence to any recommendation regarding baseline NP levels and eligibility criteria for future HFpEF trials.
Adaptive increases in trial sample size can be prospectively designed, and may increase the probability of success
Despite the country/region-specific issues in TOPCAT, it is possible that the trial could have been salvaged based on information available at the 75% of total events interim analysis. Specifically, when a late/pre-specified interim analysis has a hazard ratio within certain boundaries termed the “promising zone (PZ)” the sample size may be increased without inflating Type 1 error (19,20). In general the PZ band encompasses conditional powers between 50% and 80% (19,20), which is where the TOPCAT conditional power calculations were at the 75% interim. Operationally the option to increase sample size by a prespecified amount based on conditional power or Bayesian predictive probabilities needs to be prospectively defined, and of course the sponsor must be willing to support the increase in trial budget to account for the sample size increase. An increase in sample size, by for example 25%, could have been a recommendation by the DSMB if this “adaptive” strategy had been incorporated into the design. However, NHLBI budget limitations, overall trial fatigue and the downside of extended follow-up of already enrolled patients all would have argued against such an approach. Notably the 75% interim analysis overall hazard ratio was substantially lower (0.79, Table 2) than the final result of 0.89, and so some of these negative factors including an increasingly high study drug discontinuation rate that reached 34% in the spironolactone arm at trial completion (8, online Supplement) may have adversely affected outcomes between the 75% interim and completion of the trial. If in fact any additional enrolled patients would have had outcomes similar to those recorded between the 75% interim analysis and the end of the trial the additional increase in sample size would not have rendered the trial positive.
Recommendations for future trials
Based on the above issues and discussion, the TOPCAT DSMB has the following suggestions for multinational clinical trials:
Launch the trial in a variety of geographic jurisdictions, and do not allow one or two geographic areas to dominate early (or late) enrollment.
The DSMB should follow country- and region-specific patient characteristics and aggregate event rates carefully, beginning early in the trial; if a country or region exhibits event rates that are statistically significantly lower than the composite of other regions and especially if this is reinforced by differences in disease characteristics, bring this to the attention of Steering Committee leadership.
Establish detailed plans for trial surveillance in the DSMB charter, and at the initiation of the trial inform investigators and national leaders of proposed country- and region-specific analyses of patient data and requirements for characteristics of the study population, with the directive that the trial may be subject to geographic constraint if enrolled patients do not fulfill pre-trial assumptions and/or are substantially different from other regions.
Incorporate objective measures used to determine disease presence and severity to the greatest extent possible in enrollment criteria, particularly for conditions such as HFpEF where the diagnosis can be challenging.
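The regional surveillance suggested above (comparing a region's aggregate event rate with the composite of other regions) could be sketched as follows. This is a minimal Python example assuming a simple Wald test on the log rate ratio of events per patient-year. The function name, the event counts and the 1.96 threshold are hypothetical and are not taken from TOPCAT, and the sketch makes no adjustment for repeated looks or for multiple regions.

```python
from math import erfc, exp, log, sqrt


def rate_ratio_test(events_region: int, years_region: float,
                    events_other: int, years_other: float,
                    z_crit: float = 1.96) -> dict:
    """Wald test comparing one region's event rate with all other regions.

    Rates are events per patient-year of follow-up. The standard error of
    the log rate ratio uses the Poisson approximation
    sqrt(1/events_region + 1/events_other).
    """
    if events_region <= 0 or events_other <= 0:
        raise ValueError("both groups need at least one event")
    if years_region <= 0 or years_other <= 0:
        raise ValueError("follow-up time must be positive")
    rate_ratio = (events_region / years_region) / (events_other / years_other)
    se = sqrt(1.0 / events_region + 1.0 / events_other)
    z = log(rate_ratio) / se
    return {
        "rate_ratio": rate_ratio,
        "z": z,
        # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
        "p_value": erfc(abs(z) / sqrt(2.0)),
        "ci_low": exp(log(rate_ratio) - z_crit * se),
        "ci_high": exp(log(rate_ratio) + z_crit * se),
        # Flag only a rate that is significantly LOWER than elsewhere.
        "flag_low_rate": z < -z_crit,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    result = rate_ratio_test(events_region=30, years_region=1000.0,
                             events_other=90, years_other=1000.0)
    print(result)
```

A flag from such a check would be a prompt for the DSMB to examine disease characteristics in the region, not a decision rule in itself.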
CONCLUSIONS
In summary, to paraphrase a popular aphorism from American football (21), multicenter, multinational clinical trials are a rough game and often a cruel one. They require extreme cooperation from groups of individuals and institutions with experience and skill, a willingness to adjust to unanticipated circumstances and the ability to make difficult decisions. Unanticipated developments are to be expected, and provisions can and should be built into trial design to facilitate identifying and managing them.
Supplementary Material
Acknowledgments
The authors thank Bertram Pitt, MD (Chair of the Steering and Executive Committees), as well as Marc A. Pfeffer, MD, PhD and Sonja McKinlay, PhD (Trial Co-PIs), for their tireless and expert devotion to the design and management of the TOPCAT Trial. We thank Sonja McKinlay and Susan F. Assmann, PhD for their input on the manuscript. We are grateful to Susan F. Assmann and Brian J. Harty, MA of the CTCC (NERI) for supplying the DSMB monthly safety reports, interim trial data and statistical support. We also thank Rachel Rosenberg for manuscript editing and handling, and Ben Harnke for sourcing the NFL Films track.
This work is solely the responsibility of the authors and does not necessarily represent the official views of the National Heart, Lung, and Blood Institute or National Institutes of Health. TOPCAT was funded by the National Institutes of Health, National Heart, Lung, and Blood Institute (NHLBI), Bethesda, MD, contract N01 HC45207.
REFERENCES
- 1. Wedel H, Demets D, Deedwania P, Fagerberg B, Goldstein S, Gottlieb S, Hjalmarson A, Kjekshus J, Waagstein F, Wikstrand J; MERIT-HF Study Group. Challenges of subgroup analyses in multinational clinical trials: experiences from the MERIT-HF trial. Am Heart J. 2001;142:502–511. doi: 10.1067/mhj.2001.117600.
- 2. O’Connor CM, Fiuzat M, Caron MF, Davis G, Swedberg K, Carson PE, Koch B, Bristow MR. Influence of global region on outcomes in large heart failure β-Blocker trials. J Am Coll Cardiol. 2011;58:915–922. doi: 10.1016/j.jacc.2011.03.057.
- 3. Pocock S, Calvo G, Marrugat J, Prasad K, Tavazzi L, Wallentin L, Zannad F, Alonso Garcia A. International differences in treatment effect: do they really exist and why? Eur Heart J. 2013;34:1846–1852. doi: 10.1093/eurheartj/eht071.
- 4. Glickman SW, McHutchison JG, Peterson ED, Cairns CB, Harrington RA, Califf RM, Schulman KA. Ethical and scientific implications of the globalization of clinical research. N Engl J Med. 2009;360:816–823. doi: 10.1056/NEJMsb0803929.
- 5. Taylor MR, Sun AY, Davis G, Fiuzat M, Liggett SB, Bristow MR. Race, genetic variation, and therapeutic response disparities in heart failure. JACC Heart Fail. 2014;2:561–572. doi: 10.1016/j.jchf.2014.06.010.
- 6. Gravlee CC. How race becomes biology: embodiment of social inequality. Am J Phys Anthropol. 2009;139:47–57. doi: 10.1002/ajpa.20983.
- 7. Desai AS, Lewis EF, Li R, Solomon SD, Assmann SF, Boineau R, Clausell N, Diaz R, Fleg JL, Gordeev I, McKinlay S, O’Meara E, Shaburishvili T, Pitt B, Pfeffer MA. Rationale and design of the treatment of preserved cardiac function heart failure with an aldosterone antagonist trial: a randomized, controlled study of spironolactone in patients with symptomatic heart failure and preserved ejection fraction. Am Heart J. 2011;162:966–972.e10. doi: 10.1016/j.ahj.2011.09.007.
- 8. Pitt B, Pfeffer MA, Assmann SF, Boineau R, Anand IS, Claggett B, Clausell N, Desai AS, Diaz R, Fleg JL, Gordeev I, Harty B, Heitner JF, Kenwood CT, Lewis EF, O’Meara E, Probstfield JL, Shaburishvili T, Shah SJ, Solomon SD, Sweitzer NK, Yang S, McKinlay SM; TOPCAT Investigators. Spironolactone for heart failure with preserved ejection fraction. N Engl J Med. 2014;370:1383–1392. doi: 10.1056/NEJMoa1313731.
- 9. Yusuf S, Pfeffer MA, Swedberg K, Granger CB, Held P, McMurray JJ, Michelson EL, Olofsson B, Ostergren J; CHARM Investigators and Committees. Effects of candesartan in patients with chronic heart failure and preserved left-ventricular ejection fraction: the CHARM-Preserved Trial. Lancet. 2003;362:777–781. doi: 10.1016/S0140-6736(03)14285-7.
- 10. Massie BM, Carson PE, McMurray JJ, Komajda M, McKelvie R, Zile MR, Anderson S, Donovan M, Iverson E, Staiger C, Ptaszynska A; I-PRESERVE Investigators. Irbesartan in patients with heart failure and preserved ejection fraction. N Engl J Med. 2008;359:2456–2467. doi: 10.1056/NEJMoa0805450.
- 11. Pfeffer MA, Claggett B, Assmann SF, Boineau R, Anand IS, Clausell N, Desai AS, Diaz R, Fleg JL, Gordeev I, Heitner JF, Lewis EF, O’Meara E, Rouleau JL, Probstfield JL, Shaburishvili T, Shah SJ, Solomon SD, Sweitzer NK, McKinlay SM, Pitt B. Regional variation in patients and outcomes in the Treatment of Preserved Cardiac Function Heart Failure With an Aldosterone Antagonist (TOPCAT) trial. Circulation. 2015;131:34–42. doi: 10.1161/CIRCULATIONAHA.114.013255.
- 12. Blair JE, Zannad F, Konstam MA, Cook T, Traver B, Burnett JC Jr, Grinfeld L, Krasa H, Maggioni AP, Orlandi C, Swedberg K, Udelson JE, Zimmer C, Gheorghiade M; EVEREST Investigators. Continental differences in clinical characteristics, management, and outcomes in patients hospitalized with worsening heart failure: results from the EVEREST (Efficacy of Vasopressin Antagonism in Heart Failure: Outcome Study with Tolvaptan) program. J Am Coll Cardiol. 2008;52:1640–1648. doi: 10.1016/j.jacc.2008.07.056.
- 13. Kubánek M, Goode KM, Lánská V, Clark AL, Cleland JG. The prognostic value of repeated measurement of N-terminal pro-B-type natriuretic peptide in patients with chronic heart failure due to left ventricular systolic dysfunction. Eur J Heart Fail. 2009;11:367–377. doi: 10.1093/eurjhf/hfp003.
- 14. Kristensen SL, Jhund PS, Køber L, McKelvie RS, Zile MR, Anand IS, Komajda M, Cleland JG, Carson PE, McMurray JJ. Relative importance of history of heart failure hospitalization and N-terminal pro-B-type natriuretic peptide level as predictors of outcomes in patients with heart failure and preserved ejection fraction. doi: 10.1016/j.jchf.2015.01.014.
- 15. Kristensen SL, Køber L, Jhund PS, Solomon SD, Kjekshus J, McKelvie RS, Zile MR, Granger CB, Wikstrand J, Komajda M, Carson PE, Pfeffer MA, Swedberg K, Wedel H, Yusuf S, McMurray JJ. International geographic variation in event rates in trials of heart failure with preserved and reduced ejection fraction. Circulation. 2015;131:43–53. doi: 10.1161/CIRCULATIONAHA.114.012284.
- 16. Edelmann F, Wachter R, Schmidt AG, Kraigher-Krainer E, Colantonio C, Kamke W, Duvinage A, Stahrenberg R, Durstewitz K, Löffler M, Düngen HD, Tschöpe C, Herrmann-Lingen C, Halle M, Hasenfuss G, Gelbrich G, Pieske B; Aldo-DHF Investigators. Effect of spironolactone on diastolic function and exercise capacity in patients with heart failure with preserved ejection fraction: the Aldo-DHF randomized controlled trial. JAMA. 2013;309:781–791. doi: 10.1001/jama.2013.905.
- 17. Cleland JG, Pellicori P. Defining diastolic heart failure and identifying effective therapies. JAMA. 2013;309:825–826. doi: 10.1001/jama.2013.1569.
- 18. Anand IS, Rector TS, Cleland JG, Kuskowski M, McKelvie RS, Persson H, McMurray JJ, Zile MR, Komajda M, Massie BM, Carson PE. Prognostic value of baseline plasma amino-terminal pro-brain natriuretic peptide and its interactions with irbesartan treatment effects in patients with heart failure and preserved ejection fraction: findings from the I-PRESERVE trial. Circ Heart Fail. 2011;4:569–577. doi: 10.1161/CIRCHEARTFAILURE.111.962654.
- 19. Chen YHJ, DeMets DL, Lan KKG. Increasing the sample size when the unblinded interim result is promising. Stat Med. 2004;23:1023–1038. doi: 10.1002/sim.1688.
- 20. Mehta CR, Pocock SJ. Adaptive increase in sample size when interim results are promising: A practical guide with examples. Stat Med. 2011;30:3267–3284. doi: 10.1002/sim.4102.
- 21. Facenda J. NFL Films, The Power and the Glory. 1998. Track 25, Pain is Inevitable. http://www.cduniverse.com/search/xx/music/pid/1020022/a/power+and+the+glory%3A+music+%26+voices+of+nfl+films.htm. Accessed 1/7/16.