Abstract
Objectives
We evaluated the performance of administrative claims in ascertaining Clinical Events Committee-adjudicated outcomes in the US CoreValve Studies.
Background
Real-world data offers tremendous opportunity to improve outcome ascertainment in clinical trials. However, little is known about the validity of outcomes ascertained using real-world data to capture trial endpoints.
Methods
We linked patients enrolled in 3 pivotal trials and 2 pre-market continued access studies evaluating transcatheter aortic valve replacement to Medicare fee-for-service inpatient claims. We calculated the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and kappa agreement statistic of claims to detect clinical endpoints and procedural complications in trial patients.
Results
Claims accurately identified trial-adjudicated deaths (sensitivity, specificity, PPV, and NPV all > 99.6%; kappa 1.00). Claims had good performance in identifying trial-adjudicated permanent pacemaker implantation (sensitivity 92.2%, specificity 99.1%, PPV 96.1%, NPV 98.2%, kappa 0.93) and aortic valve reintervention (sensitivity 84.4%, specificity 99.6%, PPV 69.1%, NPV 99.8%, kappa 0.76). Claims had more modest performance in ascertaining trial-adjudicated myocardial infarction (sensitivity 63.6%, specificity 97.2%, PPV 29.9%, NPV 99.3%, kappa 0.39) and acute kidney injury (sensitivity 70.2%, specificity 85.4%, PPV 38.2%, NPV 95.7%, kappa 0.41) and the poorest performance for identifying trial-adjudicated bleeding events (sensitivity 86.4%, specificity 36.8%, PPV 35.0%, NPV 87.3%, kappa 0.16).
Conclusions
Compared with trial-adjudicated outcomes, claims data performed well in ascertaining death and outcomes with procedural billing codes and more modestly in identifying other outcomes. Claims may be cautiously and selectively used to augment data collection in future cardiovascular device trials.
Keywords: TAVR, clinical trials, administrative claims, validation
CONDENSED ABSTRACT
Little is known about the validity of outcomes ascertained using real-world data to capture trial endpoints. We evaluated the performance of administrative claims in ascertaining Clinical Events Committee-adjudicated outcomes in the US CoreValve Studies. We linked patients enrolled in 3 pivotal trials and 2 pre-market continued access studies evaluating transcatheter aortic valve replacement to Medicare fee-for-service inpatient claims. Compared with trial-adjudicated outcomes, claims data performed well in ascertaining death and outcomes with procedural billing codes and more modestly in identifying other outcomes. Claims may be cautiously and selectively used to augment data collection in future cardiovascular device trials.
INTRODUCTION
The growth of real-world health data offers significant opportunity to change the conduct of cardiovascular clinical trials. Foresighted trialists have already mobilized registries to accelerate patient recruitment and leveraged user-owned devices to obviate the need for clinical study sites (1,2). However, there remains a fundamental need to assess the validity of outcomes captured using real-world data in order to ensure the integrity of future clinical trial designs.
Administrative claims data are ubiquitous, standardized, and routinely collected data that can aid in the ascertainment of trial endpoints (3). However, relatively little is known about the validity of claims to capture endpoints in randomized controlled trials (RCTs). The few studies evaluating the accuracy of administrative data in ascertaining clinically adjudicated outcomes in large observational studies and RCTs have found discordant results (4–9). Trials of new medical devices may be particularly suitable for claims-based approaches for endpoint ascertainment given that the US Food and Drug Administration (FDA) is developing the National Evaluation System for health Technology (NEST) to explore the use of real-world data to evaluate medical devices (10). Recent RCTs have enabled disruptive innovation in the field of structural heart disease in particular (11), but there has been concern for under-reporting of deaths in post-marketing surveillance data in structural heart intervention trials (12). Only one structural heart intervention registry-based study has evaluated outcome ascertainment using claims (6), and there are no studies evaluating the use of claims to detect events in a structural heart disease RCT.
We sought to examine the performance of administrative claims to ascertain clinical events committee (CEC)-adjudicated outcomes in the US CoreValve Studies, a collection of studies evaluating the self-expanding CoreValve transcatheter aortic valve bioprosthesis for severe aortic valve stenosis (13–15). In particular, we evaluated the ability of Medicare fee-for-service claims to identify CEC-adjudicated major clinical endpoints and procedural complications.
METHODS
Study population
We included all patients ≥ 65 years old in the US CoreValve Studies who could be successfully linked to the Centers for Medicare and Medicaid Services Medicare Provider Analysis and Review (MedPAR) database, a 100% sample of inpatient Part A discharge claims for Medicare fee-for-service beneficiaries. The US CoreValve Studies include 3 large clinical trials evaluating the self-expanding CoreValve bioprosthesis in individuals with severe aortic stenosis: the US CoreValve High Risk Study (13), the US CoreValve Extreme Risk Study (14), and the Surgical or Transcatheter Aortic-Valve Replacement in Intermediate Risk Patients (SURTAVI) study (15). The High Risk and the SURTAVI studies were RCTs comparing transcatheter aortic valve replacement (TAVR) with surgical aortic valve replacement (SAVR), whereas the Extreme Risk Study was a nonrandomized comparison of the CoreValve with an objective performance measure. The US CoreValve Studies also include the CoreValve Continued Access Studies, a cohort of high and extreme risk patients who received a CoreValve after completion of these trials but before commercialization. Patients who underwent the initial valve deployment at a Veterans Affairs hospital or at a hospital outside the United States were excluded from this analysis.
We linked the CoreValve dataset (including both TAVR and SAVR patients) and MedPAR using indirect identifiers as a part of the broader EXTEND Study, an industry and academic partnership funded by the National Heart, Lung and Blood Institute (1R01HL136708) (16). Since patient identifiers were not available in the CoreValve dataset, records were linked to MedPAR via a deterministic matching algorithm based on age, procedure date and type, admission and discharge dates, physician identifier, and hospital identifier for SURTAVI, and patient date of birth, procedure date and type, physician identifier, and hospital identifier for the other trials. Using these linkage rules, 79.8% (4229/5302) of patients in the US CoreValve Pivotal Trials dataset were successfully linked to MedPAR (eFigure 1). Linked patients were generally similar to those who were not linked, though linked patients had numerically greater age, Society of Thoracic Surgeons Risk Score, and pre-existing congestive heart failure (eTable 1). The majority of the non-linked individuals were likely enrolled in Medicare Advantage, since Medicare Advantage represented 13–30% of overall Medicare enrollees during the time period evaluated (17). This study was approved by the institutional review board at Beth Israel Deaconess Medical Center.
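The deterministic linkage described above can be illustrated with a minimal Python sketch. The field names, the toy records, and the rule of keeping only trial records with exactly one exact match are assumptions made for illustration; the text does not specify how fields were encoded or how ambiguous matches were handled.

```python
from collections import defaultdict

# Hypothetical names for the indirect identifiers used for the non-SURTAVI trials.
KEY_FIELDS = ("dob", "proc_date", "proc_type", "physician_id", "hospital_id")


def link_records(trial_rows, claims_rows, key_fields=KEY_FIELDS):
    """Return {trial_id: claim_id} for trial records with exactly one exact match."""
    # Index claims by the tuple of indirect identifiers.
    index = defaultdict(list)
    for claim in claims_rows:
        index[tuple(claim[f] for f in key_fields)].append(claim["claim_id"])

    links = {}
    for row in trial_rows:
        candidates = index.get(tuple(row[f] for f in key_fields), [])
        if len(candidates) == 1:  # assumption: ambiguous matches stay unlinked
            links[row["trial_id"]] = candidates[0]
    return links


# Toy records (entirely made up) to show the call pattern.
trial = [
    {"trial_id": "T1", "dob": "1935-02-01", "proc_date": "2012-05-10",
     "proc_type": "TAVR", "physician_id": "P1", "hospital_id": "H1"},
    {"trial_id": "T2", "dob": "1940-07-15", "proc_date": "2012-06-02",
     "proc_type": "SAVR", "physician_id": "P2", "hospital_id": "H1"},
]
claims = [
    {"claim_id": "C1", "dob": "1935-02-01", "proc_date": "2012-05-10",
     "proc_type": "TAVR", "physician_id": "P1", "hospital_id": "H1"},
    {"claim_id": "C2", "dob": "1938-11-30", "proc_date": "2012-06-02",
     "proc_type": "SAVR", "physician_id": "P2", "hospital_id": "H1"},
]

links = link_records(trial, claims)
linked_fraction = len(links) / len(trial)
```

In this toy example only one of the two trial records finds a claim with identical identifiers, so the linked fraction is 0.5. The analogous quantity in the study is the reported 79.8% linkage rate.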
Variables
We evaluated major clinical endpoints including death, aortic valve reintervention, and myocardial infarction (MI) at 1 year. Stroke was not evaluated in this study as it was the focus of a separate analysis (18). We also evaluated procedural complications, including permanent pacemaker implantation, acute kidney injury (AKI), and bleeding events at 30 days. All trial outcomes were defined by the Valve Academic Research Consortium and were validated by an independent CEC according to prespecified, consensus-based criteria (19).
We identified trial endpoints in claims data as follows. Death and date of death were identified using vital status information in the Medicare Master Beneficiary Summary File (i.e. denominator file). Candidate International Classification of Diseases, 9th revision, Clinical Modification (ICD-9) and International Classification of Diseases, 10th revision, Clinical Modification (ICD-10) codes for other outcomes were identified by two coauthors, and a comprehensive list of codes for each endpoint was created by consensus based on face validity and prior literature (eTable 2, eTable 3) (5,20–22). The Extreme Risk, High Risk, and Continued Access Studies used ICD-9 codes, whereas SURTAVI used both ICD-9 and ICD-10 codes. ICD diagnosis codes in any code position were included for MI, AKI, and bleeding, whereas ICD procedure codes in any code position were included for reintervention and pacemaker implantation.
Statistical analysis
We evaluated the concordance of events ascertained in claims and in trials using an event-match approach and a cumulative incidence approach.
To evaluate major clinical endpoints in the event-match approach, claims submitted for an individual with an admission date within 14 days of a similar CEC-adjudicated event were considered a “match,” and claims for events outside the window were considered a “non-match.” For reintervention, the presence of multiple codes on the same day was counted as one event. For procedural complications, we considered all claims submitted for any admission within 30 days as a match for 30-day outcomes.
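The 14-day event-match rule can be sketched in Python for a single patient and a single outcome type. The function name, the inclusive treatment of the window boundary, and the greedy one-to-one pairing of events with claims are assumptions for illustration rather than details stated in the text.

```python
from datetime import date

MATCH_WINDOW_DAYS = 14  # window stated in the text


def match_events(cec_dates, claim_admit_dates, window=MATCH_WINDOW_DAYS):
    """Pair each CEC event with the closest unused claim admitted within the window.

    Returns (matched_pairs, unmatched_cec_events, unmatched_claims).
    """
    unused = sorted(claim_admit_dates)
    pairs, unmatched_cec = [], []
    for event in sorted(cec_dates):
        in_window = [c for c in unused if abs((c - event).days) <= window]
        if in_window:
            best = min(in_window, key=lambda c: abs((c - event).days))
            unused.remove(best)  # each claim can match at most one event
            pairs.append((event, best))
        else:
            unmatched_cec.append(event)
    return pairs, unmatched_cec, unused


# Toy dates (made up): one match, one missed trial event, one extra claim.
cec_events = [date(2013, 3, 1), date(2013, 9, 15)]
claim_admissions = [date(2013, 3, 10), date(2013, 6, 1)]
pairs, missed_by_claims, extra_claims = match_events(cec_events, claim_admissions)
```

Under this sketch, matched pairs would count as true positives, unmatched CEC events as false negatives, and unmatched claims as false positives.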
For each outcome, we calculated sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and an unweighted kappa statistic to evaluate the performance of ICD codes in ascertaining CEC-adjudicated outcomes. We calculated test characteristics using both a comprehensive set of codes and a parsimonious subset of codes. We included individual codes in the parsimonious subset if they unequivocally matched trial events (i.e. PPV was 100%) or if the positive likelihood ratio (LR+) was greater than 10, similar to other studies (23). Test characteristics for the parsimonious set of codes were adjusted for optimism through 10-fold cross validation, in which a new parsimonious set of codes was chosen for each fold.
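The inclusion rule for the parsimonious subset (PPV of 100% or LR+ greater than 10) can be sketched as follows. The code labels and per-code counts are hypothetical, and the 10-fold cross-validation step, in which the selection is repeated within each fold, is omitted here for brevity.

```python
def positive_likelihood_ratio(tp, fp, fn, tn):
    """LR+ = sensitivity / (1 - specificity); infinite when there are no false positives."""
    sensitivity = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    if false_positive_rate == 0:
        return float("inf")
    return sensitivity / false_positive_rate


def select_parsimonious(code_counts, lr_threshold=10.0):
    """Keep codes whose PPV is 100% or whose LR+ exceeds the threshold."""
    selected = []
    for code, (tp, fp, fn, tn) in code_counts.items():
        if tp == 0:
            continue  # a code that never matches a trial event cannot qualify
        perfect_ppv = fp == 0
        if perfect_ppv or positive_likelihood_ratio(tp, fp, fn, tn) > lr_threshold:
            selected.append(code)
    return sorted(selected)


# Hypothetical per-code 2x2 counts: (true pos, false pos, false neg, true neg).
toy_counts = {
    "CODE_A": (40, 0, 37, 4169),    # PPV 100%
    "CODE_B": (30, 60, 47, 4109),   # LR+ well above 10
    "CODE_C": (20, 900, 57, 3269),  # LR+ close to 1
}
parsimonious = select_parsimonious(toy_counts)
```

With these made-up counts, the first two codes are retained and the third is dropped, which illustrates the tradeoff reported later in the paper: stricter code lists raise PPV and specificity at the cost of sensitivity.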
In order to evaluate the timing of potential divergence between claims and trials-based ascertainment of endpoints, we also compared the cumulative incidence at 1 year of major clinical endpoints in claims versus trials without regard to the 14-day match window (6). We used the Gray test to compare differences between pairs of curves. The date of censoring for these curves was based on the trial censoring date. For non-death outcomes, all estimates were adjusted for the competing risk of death using a cause-specific hazard model.
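To make the idea of a cumulative incidence that accounts for the competing risk of death concrete, the sketch below implements the standard nonparametric Aalen-Johansen estimator in plain Python. This is a swapped-in illustration, not the authors' method: the analyses were run in SAS with a cause-specific hazard model and the Gray test, neither of which is reproduced here. The status coding and toy data are assumptions.

```python
from itertools import groupby


def cumulative_incidence(times, statuses, event_of_interest=1, horizon=365):
    """Aalen-Johansen cumulative incidence of one event type at `horizon` days.

    statuses: 0 = censored, 1 = event of interest, 2 = competing event (death).
    """
    records = sorted(zip(times, statuses))
    at_risk = len(records)
    event_free = 1.0  # probability of no event of any type just before time t
    cif = 0.0
    for t, group in groupby(records, key=lambda r: r[0]):
        if t > horizon:
            break
        group_statuses = [s for _, s in group]
        n_interest = sum(1 for s in group_statuses if s == event_of_interest)
        n_any = sum(1 for s in group_statuses if s != 0)
        cif += event_free * n_interest / at_risk  # add incidence occurring at t
        event_free *= 1 - n_any / at_risk         # update overall event-free probability
        at_risk -= len(group_statuses)            # events and censored leave the risk set
    return cif


# Toy follow-up data (made up): 10 patients, 2 events of interest, 2 deaths.
toy_times = [30, 60, 60, 90, 200, 365, 365, 365, 365, 365]
toy_statuses = [1, 2, 0, 1, 2, 0, 0, 0, 0, 0]
toy_cif = cumulative_incidence(toy_times, toy_statuses)
```

Applying the same function once to trial-adjudicated event dates and once to claims-derived dates, with the same censoring dates, would give the pair of curves being compared.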
As a sensitivity analysis, we evaluated test characteristics separately for ICD-9 and ICD-10 coding systems. All analyses were conducted in SAS v 9.4 (SAS Institute, Cary, NC), using a two-tailed p <0.05 to define significance.
RESULTS
Major clinical endpoints
A total of 4229 patients were in the study analytic dataset. The median duration of follow up was 365 days, and <5% (196 patients) were lost to follow up in trials data before 1 year. CEC-adjudicated deaths were accurately identified by claims, with a sensitivity of 99.9%, specificity of 99.9%, PPV of 99.7%, NPV of 100.0%, and kappa of 1.00 (Table 1).
Table 1.
Assessment of Outcomes Using Claims Compared to Adjudicated Trial Data in the EXTEND-CoreValve Study
Present in Claims* | Present in Trial: Yes (N) | Present in Trial: No (N) | Total (N) | Sensitivity (%, 95% CI) | Specificity (%, 95% CI) | PPV (%, 95% CI) | NPV (%, 95% CI) | Kappa Coefficient (95% CI) |
---|---|---|---|---|---|---|---|---|
Major clinical endpoints at 1 year | ||||||||
Death †, ‡ | ||||||||
Yes | | | | 99.9 (99.2, 100.0) | 99.9 (99.8, 100.0) | 99.7 (98.9, 99.9) | 100.0 (99.8, 100.0) | 1.00 (0.99, 1.00) |
No | ||||||||
Total | ||||||||
Aortic Valve Reintervention† | ||||||||
Yes | | | | 84.4 (70.5, 93.5) | 99.6 (99.4, 99.8) | 69.1 (57.8, 78.5) | 99.8 (99.7, 99.9) | 0.76 (0.66, 0.85) |
No | ||||||||
Total | ||||||||
Myocardial Infarction | ||||||||
Yes | 49 | 115 | 164 | 63.6 (51.9, 74.3) | 97.2 (96.7, 97.7) | 29.9 (25.0, 35.3) | 99.3 (99.1, 99.5) | 0.39 (0.31, 0.47) |
No | 28 | 4054 | 4082 | |||||
Total | 77 | 4169 | 4246 | |||||
Procedural complications at 30 days | ||||||||
Permanent Pacemaker Implantation | ||||||||
Yes | 733 | 30 | 763 | 92.2 (90.1, 94.0) | 99.1 (98.8, 99.4) | 96.1 (94.5, 97.2) | 98.2 (97.7, 98.6) | 0.93 (0.91, 0.94) |
No | 62 | 3405 | 3467 | |||||
Total | 795 | 3435 | 4230 | |||||
Acute Kidney Injury | ||||||||
Yes | 346 | 560 | 906 | 70.2 (65.9, 74.2) | 85.4 (84.2, 86.5) | 38.2 (36.0, 40.5) | 95.7 (95.1, 96.2) | 0.41 (0.37, 0.44) |
No | 147 | 3273 | 3420 | |||||
Total | 493 | 3833 | 4326 | |||||
Bleeding | ||||||||
Yes | 1184 | 2201 | 3385 | 86.4 (84.5, 88.2) | 36.8 (35.2, 38.4) | 35.0 (34.2, 35.7) | 87.3 (85.7, 88.8) | 0.16 (0.14, 0.18) |
No | 186 | 1281 | 1467 | |||||
Total | 1370 | 3482 | 4852 |
Sensitivity = (true positives)/(true positives + false negatives); specificity = (true negatives)/(true negatives + false positives); PPV = (true positives)/(true positives + false positives); NPV = (true negatives)/(true negatives + false negatives).
PPV=positive predictive value, NPV=negative predictive value, CI=confidence interval
* The comprehensive code set was used to ascertain outcomes in claims.
† Values for cell counts are not reported per the Centers for Medicare and Medicaid Services cell suppression policy.
‡ Discrepancies between ascertainment of death in the trial and in CMS data stem from missing event(s) in CMS data, missing date information in trial data, or difference(s) in dates between the 2 data sources of > 14 days.
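The footnote formulas, together with the unweighted kappa statistic, can be checked against the myocardial infarction cell counts reported in Table 1 (49 true positives, 115 false positives, 28 false negatives, 4054 true negatives). The following Python sketch uses a hypothetical function name and does not compute confidence intervals.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Test characteristics and unweighted Cohen's kappa from a 2x2 table."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # observed agreement between claims and trial
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }


# Myocardial infarction counts from Table 1 (claims vs. CEC adjudication).
mi = diagnostic_metrics(tp=49, fp=115, fn=28, tn=4054)
for name, value in mi.items():
    print(f"{name}: {value:.3f}")
```

Rounded, these values reproduce the sensitivity (63.6%), specificity (97.2%), PPV (29.9%), NPV (99.3%), and kappa (0.39) reported for myocardial infarction.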
Using the comprehensive set of codes to identify CEC-adjudicated reintervention events, there was a sensitivity of 84.4%, specificity of 99.6%, PPV of 69.1%, and NPV of 99.8%, with a kappa of 0.76 (Table 1). Using a parsimonious set of codes to identify events led to a slightly lower sensitivity (75.6%), but a higher PPV (79.1%; Table 2; eTable 4).
Table 2.
Assessment of Outcomes Using Claims Compared to Adjudicated Trial Data Using Different Coding Strategies in the EXTEND-CoreValve Study
Outcome | Sensitivity (%) (95% CI) | Specificity (%) (95% CI) | PPV (%) (95% CI) | NPV (%) (95% CI) | Kappa (95% CI) |
---|---|---|---|---|---|
Major clinical endpoints (at 1 year) | |||||
Aortic Valve Reintervention | |||||
Comprehensive code set | 84.4 (70.5, 93.5) | 99.6 (99.4, 99.8) | 69.1 (57.8, 78.5) | 99.8 (99.7, 99.9) | 0.76 (0.66, 0.85) |
Parsimonious code set | 75.6 (60.5, 87.1) | 99.8 (99.6, 99.9) | 79.1 (65.8, 88.1) | 99.7 (99.6, 99.8) | 0.77 (0.67, 0.87) |
Myocardial Infarction | |||||
Comprehensive code set | 63.6 (51.9, 74.3) | 97.2 (96.7, 97.7) | 29.9 (25.0, 35.3) | 99.3 (99.1, 99.5) | 0.39 (0.31, 0.47) |
Parsimonious code set | 62.3 (50.6, 73.1) | 97.8 (97.3, 98.2) | 34.0 (28.4, 40.2) | 99.3 (99.1, 99.5) | 0.43 (0.34, 0.51) |
Procedural outcomes (at 30 days) | |||||
Permanent Pacemaker Implantation | |||||
Comprehensive code set | 92.2 (90.1, 94.0) | 99.1 (98.8, 99.4) | 96.1 (94.5, 97.2) | 98.2 (97.7, 98.6) | 0.93 (0.91, 0.94) |
Parsimonious code set | 91.8 (89.7, 93.6) | 99.3 (99.0, 99.6) | 96.8 (95.3, 97.8) | 98.1 (97.7, 98.5) | 0.93 (0.92, 0.94) |
Acute Kidney Injury | |||||
Comprehensive code set | 70.2 (65.9, 74.2) | 85.4 (84.2, 86.5) | 38.2 (36.0, 40.5) | 95.7 (95.1, 96.2) | 0.41 (0.37, 0.44) |
Parsimonious code set | 25.6 (21.8, 29.7) | 98.1 (97.6, 98.5) | 63.0 (56.5, 69.1) | 91.1 (90.7, 91.5) | 0.32 (0.27, 0.37) |
Bleeding | |||||
Comprehensive code set | 86.4 (84.5, 88.2) | 36.8 (35.2, 38.4) | 35.0 (34.2, 35.7) | 87.3 (85.7, 88.8) | 0.16 (0.14, 0.18) |
Parsimonious code set | 3.1 (2.2, 4.1) | 99.3 (99.0, 99.6) | 63.6 (51.6, 74.2) | 72.2 (72.0, 72.4) | 0.03 (0.02, 0.05) |
The parsimonious code set represents hospitalizations with events selected using an algorithm incorporating individual codes with perfect matching or with a likelihood ratio positive >10 for identifying adjudicated events. Test characteristics for parsimonious code set are adjusted for optimism through ten-fold cross validation. Sensitivity = (true positives)/(true positives + false negatives); specificity = (true negatives)/(true negatives + false positives); PPV = (true positives)/(true positives + false positives); NPV = (true negatives)/(true negatives + false negatives).
PPV=positive predictive value, NPV=negative predictive value, CI=confidence interval
A total of 77 CEC-adjudicated MI events were identified (Table 1). Using the comprehensive set of codes to identify these events, there was a sensitivity of 63.6%, specificity of 97.2%, PPV of 29.9%, and NPV of 99.3%, with a kappa of 0.39. Using a parsimonious set of codes to identify events led to similar test characteristics (Table 2; eTable 4).
Procedural complications
A total of 795 CEC-adjudicated pacemaker implantation events were identified (Table 1). Using the comprehensive set of codes to identify these events, there was a sensitivity of 92.2%, specificity of 99.1%, PPV of 96.1%, and NPV of 98.2% with a kappa of 0.93. Using a parsimonious set of codes, there were very similar test characteristics (Table 2, eTable 5).
A total of 493 CEC-adjudicated AKI events were identified (Table 1). Using the comprehensive set of codes to identify these events, there was a sensitivity of 70.2%, specificity of 85.4%, PPV of 38.2%, and NPV of 95.7%, with a kappa of 0.41. Using a parsimonious set of codes, there was a lower sensitivity (25.6%), NPV (91.1%), and kappa (0.32), but a higher specificity (98.1%) and PPV (63.0%; Table 2, eTable 5).
A total of 1370 CEC-adjudicated bleeding events were identified (Table 1). Using the comprehensive set of codes to identify these events, there was a sensitivity of 86.4%, specificity of 36.8%, PPV of 35.0%, and NPV of 87.3% with a kappa of 0.16. Using a parsimonious set of codes, there was a higher specificity (99.3%) and PPV (63.6%), but a substantially lower sensitivity (3.1%) and kappa (0.03; Table 2, eTable 5).
Cumulative Incidence Analysis
At 1 year of follow up, cumulative incidence curves representing ascertainment of death via CEC adjudication and claims were nearly identical with a cumulative incidence of 16.8% in the trial and in claims (p=0.97; Figure 1A). The curves for reintervention in claims were similar to the CEC-adjudicated curve, with a cumulative incidence of 1.04% in the trial, 1.30% in the comprehensive code set (p=0.26 for difference with trials curve), and 1.10% using the parsimonious code set (p=0.75 for difference with trials curve; Figure 1B). The curves for MI in claims were similar early in follow-up but then separated over the course of the year, with a cumulative incidence of 1.77% in the trial, 3.29% in the comprehensive code set, and 2.91% using the parsimonious code set (p<0.01 for difference between each claims curve and trial curve; Figure 1C).
Figure 1.
Cumulative incidence of (A) death, (B) aortic valve reintervention, and (C) myocardial infarction, from 0 to 12 months following procedure date in the CoreValve trials. A. Cumulative incidence of 16.8% in both curves; Gray test p=0.97 for differences between curves. B. Cumulative incidence of 1.04% in the trial, 1.30% in the comprehensive code set, and 1.10% in the parsimonious code set; Gray test p=0.26 for differences between comprehensive code set curve and trial curve and Gray test p=0.75 for differences between the parsimonious code set curve and trial curve. C. Cumulative incidence of 1.77% in the trial, 3.29% in the comprehensive code set, and 2.91% in the parsimonious code set; Gray test p<0.01 for differences between all pairs of curves. For MI and aortic valve reintervention, the trial curve represents events adjudicated by the trial Clinical Events Committee; the comprehensive candidate claims curve represents hospitalizations with events determined by the comprehensive diagnosis code set; and the parsimonious claims curve represents hospitalizations with events selected using an algorithm incorporating individual codes with perfect matching or with a likelihood ratio positive >10 for identifying adjudicated events. CEC=Clinical Events Committee.
Sensitivity analysis
When examined separately, the sensitivity of ICD-10 codes for each outcome was generally similar to the sensitivity in ICD-9, while differences in other test characteristics were observed (eTable 6 and eTable 7). Test characteristics of ICD-10 codes for identifying trial events had wide confidence intervals due to their limited use (portion of a single trial).
DISCUSSION
Incorporation of real-world data into cardiovascular clinical trials has generated enthusiasm for its potential to increase efficiency and reduce costs (24), but rigorous assessment of the validity of real-world data in capturing trial outcomes is lacking. In this study, we found that Medicare claims accurately identified death and endpoints based on procedural billing codes, had modest performance in ascertaining MI and AKI, and had poor performance in ascertaining bleeding in the US CoreValve Studies. These results beget both optimism and caution for cardiovascular trialists who seek to utilize real-world data in future studies. Furthermore, as future cardiovascular clinical trial designs evolve, this study also highlights how clinicians as consumers of this new knowledge will need to take a nuanced approach to interpreting trial outcomes captured using real-world data.
Our findings contribute to existing literature on the performance of claims in accurately identifying medical procedures. Claims have been widely used to define cohorts of patients with a pacemaker and have reliably captured events when validated against surveys (25,26). One large cohort study compared ascertainment of pacemaker implantation in claims with that from chart review and found a sensitivity of 0.91 and PPV of 1.00 (27). Our study compared claims to CEC-adjudicated pacemaker implantation and found similar sensitivity and PPV (92.2% and 96.1%, respectively). Although claims have also been widely used to identify patients with aortic valve intervention and validate surgical registry data (28,29), ascertainment of aortic valve intervention with claims has not been previously validated against a gold standard. In our study, we found high sensitivity, specificity, PPV, and NPV of claims for identifying aortic valve re-intervention. These results further validate the use of claims to augment the ascertainment of these procedures in future clinical trials.
Our study also adds to the existing evidence on the performance of claims in identifying other important outcomes in cardiovascular trials. The date of death in claims has been validated as accurate in >99% of cases based on information provided by the Social Security Administration and other sources (30). Similarly, we found that death in claims had >99% accuracy in ascertaining CEC-adjudicated death. Observational studies comparing claims for MI with chart review have found high sensitivity and PPV (4,31,32), and several studies comparing ICD codes in administrative records to CEC-adjudicated MI found substantial agreement (kappa 0.71–0.76) (5,7,8). In contrast, although we found high specificity and NPV in comparing claims to CEC-adjudicated MI, we found only modest sensitivity, low PPV, and fair agreement (kappa 0.39). Although cumulative incidence of MI in claims was higher than in the trials, we considered trial-adjudication to be the gold standard so any additional claims-based events were considered false positives, thus leading to the lower PPV. The number of false positives was small relative to the number of true negatives so the specificity still remained high. Despite identifying a larger number of events than adjudication, there were still events identified by adjudication not matched by corresponding claims, leading to imperfect sensitivity. The low PPV for both AKI and bleeding events in our study are in concordance with low PPV for these outcomes using claims to identify adverse outcomes in a mitral valve repair registry (6), and may stem from the more stringent structural heart intervention trial endpoint definitions and broad inclusion criteria in our comprehensive code set. Taken together, these results suggest that caution should be exercised in relying solely on claims to ascertain non-death outcomes in future trials for cardiovascular devices.
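The coexistence of high specificity and low PPV for MI follows from the low prevalence of adjudicated MI. As a worked illustration using the Table 1 cell counts (the symbols Se, Sp, and p for sensitivity, specificity, and prevalence are notation introduced here):

```latex
\[
\mathrm{PPV} \;=\; \frac{\mathrm{Se}\cdot p}{\mathrm{Se}\cdot p + (1-\mathrm{Sp})(1-p)},
\qquad
p = \frac{77}{4246} \approx 0.018,\quad
\mathrm{Se} = \frac{49}{77} \approx 0.636,\quad
\mathrm{Sp} = \frac{4054}{4169} \approx 0.972
\]
\[
\mathrm{PPV} \;=\; \frac{49/4246}{49/4246 + 115/4246}
\;=\; \frac{49}{49 + 115} \;\approx\; 0.299
\]
```

Because only about 1.8% of observations had an adjudicated MI, the 115 false positives (2.8% of the 4169 observations without adjudicated MI) outnumber the 49 true positives, even though specificity remains above 97%.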
In our study, we found that a parsimonious code set performed similarly to a comprehensive code set for reintervention, MI, and pacemaker implantation, but worse than a comprehensive code set for bleeding and AKI. We also found tradeoffs in sensitivity and specificity in choosing between comprehensive or parsimonious coding strategies, which may dictate the selection of liberal or stringent lists of codes to identify events in future trials. Understanding whether parsimonious code sets may have greater utility in ascertaining outcomes in other contexts is a rich area for further inquiry.
Our findings suggest that claims may be a suitable mechanism to augment data collection of future cardiovascular device trials. Although not all devices will have unique procedure billing codes enabling the retrospective linkage strategy performed in this study, future trials could incorporate authorization for linkage to insurance data prospectively using direct identifiers. Prospective collection of both CEC-adjudicated events and claims-based events in parallel during an initial run-in period could allow for validation of claims-based endpoints for use in subsequent periods. For endpoints such as death and procedural billing outcomes, claims may initially serve as an adjunct to CEC adjudication in the initial period when most events occur, and can then potentially be a reasonable substitute for longer term CEC-adjudicated follow up if they prove to be valid. Claims can thus mitigate loss to follow-up in trials, which can otherwise lead to missing data and can confound trial result interpretation, as in the case of peripheral drug-coated balloons (33). For other endpoints with more nuanced definitions (such as bleeding), claims may still be an adjunct to trials, though likely cannot replace traditional CEC-mediated adjudication altogether due to the suboptimal agreement between the two methods of ascertainment for these endpoints (34). Although we used the CEC-adjudicated data as the gold standard, it is plausible that there were true events identified in claims that the CEC missed. Outcomes such as MI may be missed in trials if patients do not recall events or site investigators reporting outcomes are unaware that an event took place (7). Notably, in our study, cumulative incidence of MI in claims was initially similar to trials, but later diverged, suggesting that the utility of claims in ascertaining events may increase as duration of follow up lengthens. 
If claims are incorporated into adjudication processes in a prospective fashion, it may be possible to investigate whether ‘false positives’ in claims could represent true events that were not captured in the standard trials adjudication process.
Our findings also have implications for the use of claims to capture specific clinical trial endpoints in medical device evaluation and surveillance efforts outside of traditional clinical trials. Post-marketing data is becoming increasingly important as the FDA works to expedite device approval, but there has been some concern with misclassification of adverse events and underreporting of deaths in traditional post-marketing surveillance data (12). The FDA has begun to use claims data to detect drug safety signals in the Sentinel Initiative (35), and is collaborating with stakeholders to build the NEST system to use real-world data to support regulatory decision-making for medical devices (10). Our results suggest that claims data may adequately capture some TAVR endpoints but should be used with caution for others, which is in concordance with an analysis of claims to ascertain outcomes in a transcatheter mitral valve repair registry (6). Future research is necessary to evaluate the utility of claims to ascertain outcomes for other types of medical devices and determine how claims can be used in concert with other real-world evidence to augment outcome ascertainment more broadly.
This study should be interpreted in the context of its limitations. First, since we employed Medicare fee-for-service claims, results may not generalize to individuals under the age of 65 years or those with other insurance coverage. However, the ICD codes validated in this manuscript are universal across insurance types, and the validation approach presented in this study can be undertaken in other contexts. Second, 20% of patients in the US CoreValve Studies could not be linked to Medicare data. However, patients from trials linked to Medicare data were generally similar to those who were not linked, suggesting that this would be unlikely to change the results of the study. Third, the study was limited to patients in the US CoreValve Studies, and the accuracy of codes for certain non-procedural endpoints such as bleeding or AKI, which are uniquely defined in TAVR trials, may differ for non-TAVR populations, in which events may be defined differently. However, a similar approach to endpoint ascertainment through linkage with insurance claims may be used for other conditions, particularly if claims are used alongside traditional endpoint adjudication prospectively in clinical trials. Fourth, the specific parsimonious codes generated in this study may not be generalizable to other contexts. However, our dataset spans multiple trials and non-trial studies, which adds to the generalizability of these results, and we attempted to mitigate the threat to external validity through 10-fold cross validation. Finally, there were a limited number of events in our ICD-10 data. It will be important to conduct future claims validation studies using ICD-10 data for trial outcomes.
CONCLUSIONS
In conclusion, Medicare claims accurately identified CEC-adjudicated outcomes, including death, aortic valve reintervention, and permanent pacemaker implantation, in the US CoreValve Studies. More modest performance for ascertaining outcomes was observed with other billing codes. These results suggest that claims may be cautiously and selectively used to augment data collection in future cardiovascular device trials in a manner that could improve the efficiency and reliability of outcome ascertainment. Understanding the strengths and limitations of using claims to capture trial outcomes will be crucial for both trialists and clinicians as cardiovascular clinical trials make use of real-world data in the future.
Supplementary Material
Central Illustration.
Performance of claims in ascertaining trial-adjudicated outcomes in the EXTEND-CoreValve Study.
PERSPECTIVES.
WHAT IS KNOWN?
Real-world data offers tremendous opportunity to improve outcome ascertainment in clinical trials, but little is known about the validity of outcomes ascertained using real-world data to capture trial endpoints.
WHAT IS NEW?
Through linking Medicare claims data with the US CoreValve Studies evaluating transcatheter aortic valve replacement in patients with severe aortic stenosis, we found that claims data performed well in ascertaining death and outcomes with procedural billing codes but performed more modestly in identifying other outcomes. Thus, claims may be cautiously and selectively used to augment data collection in future cardiovascular device trials.
WHAT IS NEXT?
Future research is necessary to evaluate the utility of claims to ascertain outcomes for other types of medical devices and determine how claims can be used in concert with other real-world evidence to augment outcome ascertainment more broadly.
Acknowledgments
Funding:
The study was funded by NHLBI 1R01HL136708 (Yeh).
Disclosures:
Dr. Butala is funded by the John S. LaDue Memorial Fellowship at Harvard Medical School, Boston, MA, and reports consulting fees and ownership interest in HiLabs, outside the submitted work.
Dr. Strom is funded by a grant from the NIH/NHLBI (1K23HL144907). Dr. Brennan holds an Innovation in Regulatory Science Award from the Burroughs Wellcome Fund (1014158) and a Food and Drug Administration grant (1U01FD004591-01), and reports consulting for Edwards Lifesciences and Atricure.
Dr. Popma receives grants from Medtronic, Abbott Vascular, Cook, and Boston Scientific and personal fees from Boston Scientific.
Dr. Yeh reports additional grant support from Abiomed, AstraZeneca, and Boston Scientific, and consulting fees from Abbott, Boston Scientific, Medtronic, and Teleflex, outside the submitted work.
Dr. Strom, Dr. Faridi, Dr. Kazi, Ms. Zhao, and Dr. Chen have no relationships with industry.
All other authors report nothing to disclose.
ABBREVIATIONS AND ACRONYMS
- RCT: randomized controlled trial
- FDA: Food and Drug Administration
- NEST: National Evaluation System for health Technology
- CEC: clinical events committee
- MedPAR: Medicare Provider Analysis and Review
- SURTAVI: Surgical or Transcatheter Aortic-Valve Replacement in Intermediate Risk Patients
- TAVR: transcatheter aortic valve replacement
- SAVR: surgical aortic valve replacement
- MI: myocardial infarction
- AKI: acute kidney injury
- ICD-9: International Classification of Diseases, 9th revision, Clinical Modification
- ICD-10: International Classification of Diseases, 10th revision, Clinical Modification
- PPV: positive predictive value
- NPV: negative predictive value
- LR+: positive likelihood ratio
REFERENCES
- 1. Perez MV, Mahaffey KW, Hedlin H et al. Large-Scale Assessment of a Smartwatch to Identify Atrial Fibrillation. N Engl J Med 2019;381:1909–1917.
- 2. Fröbert O, Lagerqvist B, Olivecrona GK et al. Thrombus aspiration during ST-segment elevation myocardial infarction. New England Journal of Medicine 2013;369:1587–1597.
- 3. Choudhry NK. Randomized, Controlled Trials in Health Insurance Systems. N Engl J Med 2017;377:957–964.
- 4. Psaty BM, Delaney JA, Arnold AM et al. Study of cardiovascular health outcomes in the era of claims data: the Cardiovascular Health Study. Circulation 2016;133:156–164.
- 5. Guimarães PO, Krishnamoorthy A, Kaltenbach LA et al. Accuracy of medical claims for identifying cardiovascular and bleeding events after myocardial infarction: a secondary analysis of the TRANSLATE-ACS study. JAMA cardiology 2017;2:750–757.
- 6. Lowenstern A, Lippmann SJ, Brennan JM et al. Use of medicare claims to identify adverse clinical outcomes after mitral valve repair. Circulation: Cardiovascular Interventions 2019;12:e007451.
- 7. Hlatky MA, Ray RM, Burwen DR et al. Use of Medicare data to identify coronary heart disease outcomes in the Women’s Health Initiative. Circulation: Cardiovascular Quality and Outcomes 2014;7:157–162.
- 8. Kjoller E, Hilden J, Winkel P et al. Agreement between public register and adjudication committee outcome in a cardiovascular randomized clinical trial. Am Heart J 2014;168:197–204.e1–4.
- 9. Barry SJE, Dinnett E, Kean S, Gaw A, Ford I. Are Routinely Collected NHS Administrative Records Suitable for Endpoint Identification in Clinical Trials? Evidence from the West of Scotland Coronary Prevention Study. PloS one 2013;8:e75379.
- 10. Shuren J, Califf RM. Need for a National Evaluation System for Health Technology. Jama 2016;316:1153–4.
- 11. Tabata N, Sinning J-M, Kaikita K, Tsujita K, Nickenig G, Werner N. Current status and future perspective of structural heart disease intervention. Journal of cardiology 2019.
- 12. Meier L, Wang EY, Tomes M, Redberg RF. Miscategorization of Deaths in the US Food and Drug Administration Adverse Events Database. JAMA Intern Med 2019.
- 13. Adams DH, Popma JJ, Reardon MJ et al. Transcatheter aortic-valve replacement with a self-expanding prosthesis. N Engl J Med 2014;370:1790–8.
- 14. Popma JJ, Adams DH, Reardon MJ et al. Transcatheter aortic valve replacement using a self-expanding bioprosthesis in patients with severe aortic stenosis at extreme risk for surgery. J Am Coll Cardiol 2014;63:1972–81.
- 15. Reardon MJ, Van Mieghem NM, Popma JJ et al. Surgical or Transcatheter Aortic-Valve Replacement in Intermediate-Risk Patients. N Engl J Med 2017;376:1321–1331.
- 16. Strom JB, Tamez-Aguilar H, Zhao MY et al. Validating the Use of Registries and Claims Data to Support Randomized Trials: Rationale and Design of the Extending Trial-Based Evaluations of Medical Therapies Using Novel Sources of Data (EXTEND) Study. American Heart Journal 2019.
- 17. Anthony Jacobson GD; Neuman Tricia; Gold Marsha. Medicare Advantage 2017 Spotlight: Enrollment Market Update. 2017.
- 18. Strom JB, Zhao Y, Faridi KF et al. Comparison of Clinical Trials and Administrative Claims to Identify Stroke Among Patients Undergoing Aortic Valve Replacement: Findings From the EXTEND Study. Circulation Cardiovascular interventions 2019;12:e008231.
- 19. Kappetein AP, Head SJ, Genereux P et al. Updated standardized endpoint definitions for transcatheter aortic valve implantation: the Valve Academic Research Consortium-2 consensus document. J Am Coll Cardiol 2012;60:1438–54.
- 20. Hlatky MA, Ray RM, Burwen DR et al. Use of Medicare data to identify coronary heart disease outcomes in the Women’s Health Initiative. Circulation: Cardiovascular Quality and Outcomes 2014:CIRCOUTCOMES. 113.000373.
- 21. Andrade SE, Harrold LR, Tjia J et al. A systematic review of validated methods for identifying cerebrovascular accident or transient ischemic attack using administrative data. Pharmacoepidemiology and drug safety 2012;21 Suppl 1:100–28.
- 22. Ellis ER, Culler SD, Simon AW, Reynolds MR. Trends in utilization and complications of catheter ablation for atrial fibrillation in Medicare beneficiaries. Heart rhythm 2009;6:1267–73.
- 23. Birman-Deych E, Waterman AD, Yan Y, Nilasena DS, Radford MJ, Gage BF. Accuracy of ICD-9-CM codes for identifying cardiovascular and stroke risk factors. Medical care 2005:480–485.
- 24. Solomon SD, Pfeffer MA. The Future of Clinical Trials in Cardiovascular Medicine. Circulation 2016;133:2662–70.
- 25. Greenspon AJ, Patel JD, Lau E et al. Trends in permanent pacemaker implantation in the United States from 1993 to 2009: increasing complexity of patients and procedures. Journal of the American College of Cardiology 2012;60:1540–1545.
- 26. Zhan C, Baine WB, Sedrakyan A, Steiner C. Cardiac device implantation in the United States from 1997 through 2004: a population-based analysis. Journal of General Internal Medicine 2008;23:13–19.
- 27. Fisher ES, Whaley FS, Krushat WM et al. The accuracy of Medicare’s hospital claims data: progress has been made, but problems remain. American journal of public health 1992;82:243–248.
- 28. Barreto-Filho JA, Wang Y, Dodson JA et al. Trends in aortic valve replacement for elderly patients in the United States, 1999–2011. Jama 2013;310:2078–2084.
- 29. Welke KF, Peterson ED, Vaughan-Sarrazin MS et al. Comparison of cardiac surgery volumes and mortality rates between the Society of Thoracic Surgeons and Medicare databases from 1993 through 2001. The Annals of thoracic surgery 2007;84:1538–1546.
- 30. Jarosek S. Death Information in the Research Identifiable Medicare Data. 2018.
- 31. Kiyota Y, Schneeweiss S, Glynn RJ, Cannuscio CC, Avorn J, Solomon DH. Accuracy of Medicare claims-based diagnosis of acute myocardial infarction: estimating positive predictive value on the basis of review of hospital records. American heart journal 2004;148:99–104.
- 32. Rosamond WD, Chambless LE, Sorlie PD et al. Trends in the sensitivity, positive predictive value, false-positive rate, and comparability ratio of hospital discharge diagnosis codes for acute myocardial infarction in four US communities, 1987–2000. American journal of epidemiology 2004;160:1137–1146.
- 33. US Food & Drug Administration. FDA Executive Summary, Circulatory System Devices Panel Meeting Paclitaxel-Coated Drug Coated Balloon and Drug-Eluting Stent Late Mortality Panel. June 19–20, 2019.
- 34. Olivier CB, Bhatt DL, Leonardi S et al. Central Adjudication Identified Additional and Prognostically Important Myocardial Infarctions in Patients Undergoing Percutaneous Coronary Intervention. Circulation Cardiovascular interventions 2019;12:e007342.
- 35. Platt R, Brown JS, Robb M et al. The FDA Sentinel Initiative - An Evolving National Resource. N Engl J Med 2018;379:2091–2093.