Author manuscript; available in PMC: 2021 Feb 1.
Published in final edited form as: JACC Cardiovasc Interv. 2020 Jul 15;13(15):1786–1788. doi: 10.1016/j.jcin.2020.04.023

Evaluating Clinical Outcomes from Administrative Databases

William S Weintraub 1, Brandon K Bellows 2
PMCID: PMC7848782  NIHMSID: NIHMS1665118  PMID: 32682675

How often in practice do we hear patients ask, “what is going to happen to me?” When they do, how do we estimate their risk of serious events? Do we consider data from clinical trials, observational studies, or our daily clinical practice? Because of the nature of medical practice in the United States, there is no single reliable source of outcome information for patients, whether they are seen after a procedure or an event or are in stable condition in the clinic.

For many clinicians, the major source of information on risk is clinical trials, which may be sponsored by organizations requiring peer review (e.g., the National Institutes of Health), industry, or professional societies, or clinical registries. Accurate reporting of clinical events is critical to our understanding of patient risk.

What are the sources of outcome data? Outcomes in clinical trials are usually assessed through follow-up using direct patient contact and electronic health record information, supplemented by local hospital administrative sources and death certificates. Events are then adjudicated by an expert committee that reviews the available data.1 This process is often seen as the gold standard for evaluating events, although there have been criticisms.2, 3 A limitation of this process is that the period of follow-up in clinical trials is variable, from short term of a month or so, to intermediate term of one or two years, to longer term of up to five years. Clinical follow-up beyond this point within the structure of a clinical trial is unusual. This process is similar for clinical trials sponsored by peer review organizations and by industry.

Registries may also be used to assess outcomes. However, registries sponsored by professional societies may only collect short term data, often related to procedures (e.g., percutaneous coronary intervention) or events (e.g., myocardial infarction),4 and those sponsored by industry are variable in their approach to data collection. Furthermore, events during follow-up in registries are rarely adjudicated. Another source for outcomes data may be long-term epidemiological cohort studies in the United States, some with follow-up to as long as 30 years.5 Events in these cohorts are usually adjudicated and serve as a particularly rich source of information.

How then can events during follow-up be determined for patients in registries and clinical trials? One approach is to link the data to commercial or governmental administrative data sources, which at least in principle include these follow-up data.6–10 This process of linkage can be either deterministic or probabilistic.7, 11 In deterministic linkage, patients are matched directly using patient identifiers such as Social Security number. Quite often, because of concerns over privacy, Social Security numbers and other patient identifiers are not collected.12 In such cases, probabilistic matching is used, in which institution, patient age and sex, and date of admission are used. This approach will only match a fraction of the patients (typically about two thirds). There is some danger of false matches, but generally the specificity of the matches is over 0.90.13
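The two linkage strategies described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in any of the cited studies: the `Record` fields and all function names are hypothetical, and the fallback step is a simplified exact match on indirect identifiers rather than a weighted probabilistic model, which would also score partial agreement between fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    # All field names are hypothetical; real registry and claims files differ.
    ssn: Optional[str]
    hospital_id: str
    age: int
    sex: str
    admit_date: str  # ISO date string, e.g. "2019-03-14"


def deterministic_match(a: Record, b: Record) -> bool:
    """Match only when a direct identifier is present and equal on both sides."""
    return a.ssn is not None and a.ssn == b.ssn


def indirect_match(a: Record, b: Record) -> bool:
    """Match on institution, age, sex, and admission date taken together."""
    return (
        a.hospital_id == b.hospital_id
        and a.age == b.age
        and a.sex == b.sex
        and a.admit_date == b.admit_date
    )


def link(registry: list[Record], claims: list[Record]) -> dict[int, int]:
    """Return {registry index: claims index}, keeping unambiguous matches only."""
    links: dict[int, int] = {}
    for i, r in enumerate(registry):
        hits = [j for j, c in enumerate(claims) if deterministic_match(r, c)]
        if not hits:
            # No direct identifier available: fall back to indirect identifiers.
            hits = [j for j, c in enumerate(claims) if indirect_match(r, c)]
        if len(hits) == 1:  # records with zero or several candidates stay unlinked
            links[i] = hits[0]
    return links


registry = [
    Record("111", "H1", 70, "F", "2019-01-02"),
    Record(None, "H2", 81, "M", "2019-02-03"),
    Record(None, "H3", 77, "F", "2019-04-05"),
]
claims = [
    Record(None, "H2", 81, "M", "2019-02-03"),
    Record("111", "H9", 70, "F", "2019-01-02"),
    Record(None, "H3", 77, "F", "2019-04-05"),
    Record(None, "H3", 77, "F", "2019-04-05"),
]
linked = link(registry, claims)
```

In this toy data the third registry patient has two identical candidates in the claims file and is therefore left unlinked, which illustrates why matching on indirect identifiers links only a fraction of patients.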

How do administrative databases acquire the information about events? Generally, these data are gathered from insurance claims submitted when billing for care. When there is no claim, an administrative database may not have information on the event. This is particularly relevant for out-of-hospital mortality, which may not be captured in commercial claims databases. Furthermore, commercial databases may not be able to distinguish whether an event is fatal or not (e.g., fatal vs non-fatal myocardial infarction). This is a severe limitation of linking to commercial databases in the United States, and as such it is not commonly done. In the United States, the major claims database to which trials and registries are linked is Medicare.6, 14 Medicare has event data and obtains mortality data from the National Death Index, generally considered the gold standard for evaluating survival in the United States.1, 15 However, the National Death Index is of limited accuracy for cause of death.1 A further limitation of Medicare is that it generally includes only those over age 65, and will not include patients over 65 with commercial insurance (e.g., those still working) or patients with Medicare Advantage. Linking of trials and registries to the long-term epidemiologic cohorts in the United States is unusual, as linkage to specific patients is generally not possible and the cohorts have limited numbers of patients. Linkage to similar patients, however, is possible and has recently been accomplished for patients in SPRINT (Systolic Blood Pressure Intervention Trial).16

How reliable are the outcomes identified in administrative databases? Butala et al. have addressed this concern in this issue of JACC: Cardiovascular Interventions.17 These investigators linked data from three clinical trials and two registry studies of transcatheter aortic valve replacement to Medicare fee-for-service inpatient claims. Of 5,302 patients older than 65 years in the dataset, 4,229 (79.8%) were deterministically matched to Medicare claims. Non-linked patients were most likely enrolled in Medicare Advantage. Linked and non-linked patients were generally similar. The events considered were death, aortic valve reintervention, and myocardial infarction at one year and permanent pacemaker implantation, acute kidney injury, and bleeding at 30 days. Trial outcomes were all defined by the Valve Academic Research Consortium and adjudicated by an independent clinical events committee.18 Specific International Classification of Diseases (ICD)-9 or ICD-10 codes, as noted by the investigators, were used to find events in the Medicare database. All comparisons were of Medicare claims to the adjudicated events in the clinical database as the standard. For mortality, Medicare claims had a sensitivity of 99.9% and specificity of 99.9%. For reintervention, Medicare claims had a sensitivity of 84.4% and specificity of 99.6%. For myocardial infarction, Medicare claims had a sensitivity of 63.6% and specificity of 97.2%. For pacemaker implantation, Medicare claims had a sensitivity of 92.2% and specificity of 99.1%. For acute kidney injury, Medicare claims had a sensitivity of 70.2% and specificity of 85.4%. For bleeding, Medicare claims had a sensitivity of 86.4% and specificity of 36.8% (Table 1).

Table 1.

Sensitivity and Specificity of Medicare Claims Identification of Events Compared to Independent Clinical Events Committee Adjudication from Butala et al.

Event                     Sensitivity   Specificity
Mortality                 99.9%         99.9%
Reintervention            84.4%         99.6%
Myocardial infarction     63.6%         97.2%
Pacemaker implantation    92.2%         99.1%
Acute kidney injury       70.2%         85.4%
Bleeding                  86.4%         36.8%
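For readers less familiar with these measures, the sketch below shows how sensitivity and specificity of this kind can be computed when adjudicated events are treated as the reference standard. It is illustrative only: the function name and the ten-patient toy data are assumptions, not data or code from Butala et al.

```python
from typing import Iterable, Tuple


def sensitivity_specificity(pairs: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Each pair is (adjudicated_event, claims_event) for one linked patient.

    Sensitivity: share of adjudicated events that claims also identified.
    Specificity: share of patients without an adjudicated event whom
    claims also left unflagged.
    """
    tp = fn = fp = tn = 0
    for adjudicated, claimed in pairs:
        if adjudicated and claimed:
            tp += 1
        elif adjudicated:
            fn += 1
        elif claimed:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient with and one without the event")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 4 adjudicated events (3 found in claims, 1 missed)
# and 6 patients without an event (1 flagged in error by claims).
toy_pairs = (
    [(True, True)] * 3
    + [(True, False)] * 1
    + [(False, True)] * 1
    + [(False, False)] * 5
)
toy_sens, toy_spec = sensitivity_specificity(toy_pairs)
```

In the toy data, sensitivity is 3 of 4 and specificity is 5 of 6.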

The data reveal that Medicare claims are most useful for mortality and short-term events, such as pacemaker implantation. For longer term events, however, such as myocardial infarction or reintervention, identification of events from Medicare claims will be more limited. For complications such as acute kidney injury or bleeding, Medicare claims are likely to be unreliable. As the investigators note, events during follow-up may be missed even in clinical trial databases, and Medicare claims may prove useful in supplementing clinical trial databases.
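A short calculation suggests why low specificity is so damaging. Using the sensitivity and specificity values from Table 1 together with an assumed true event rate, one can estimate how often claims would flag an event and how many of those flags would be correct. The 10% rate below is purely hypothetical, chosen for illustration and not taken from the study, and the function names are likewise assumptions.

```python
def claims_observed_rate(true_rate: float, sens: float, spec: float) -> float:
    """Share of patients flagged by claims: true positives plus false positives."""
    return sens * true_rate + (1.0 - spec) * (1.0 - true_rate)


def positive_predictive_value(true_rate: float, sens: float, spec: float) -> float:
    """Share of claims-flagged patients who truly had the adjudicated event."""
    return sens * true_rate / claims_observed_rate(true_rate, sens, spec)


# (sensitivity, specificity) as reported in Table 1.
TABLE_1 = {
    "Mortality": (0.999, 0.999),
    "Myocardial infarction": (0.636, 0.972),
    "Bleeding": (0.864, 0.368),
}

ASSUMED_TRUE_RATE = 0.10  # hypothetical, for illustration only; not from the study

for event, (sens, spec) in TABLE_1.items():
    observed = claims_observed_rate(ASSUMED_TRUE_RATE, sens, spec)
    ppv = positive_predictive_value(ASSUMED_TRUE_RATE, sens, spec)
    print(f"{event}: claims-observed rate {observed:.1%}, PPV {ppv:.1%}")
```

Under that assumed rate, claims would flag bleeding in roughly 65% of patients, and only about 13% of those flags would correspond to an adjudicated event, whereas for mortality the claims-observed rate would stay close to the assumed 10%. The exact figures depend entirely on the assumed rate.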

This study is most important in understanding how we go about assessing events for research purposes and how the results can be interpreted in clinical practice. There simply is no perfect approach to this issue. Given the complicated nature of the health care system in the United States, there will not be a single source of outcomes data, except perhaps for mortality in the National Death Index, which has its own flaws.1 Thus, it is imperative for investigators to carefully consider their sources of data for events, be transparent in reporting, and carefully consider where data may be incomplete or erroneous and the impact this could have on their findings. Even clinical event committees are limited by imperfect data with which to evaluate outcomes.2

Can the results of this study be generalized to other conditions? The authors rightfully note that additional research is needed in other areas. However, it seems unlikely that this study can be replicated for all conditions where Medicare claims might be used. Indeed, there may not always be a gold standard for comparison. Nonetheless, further studies of this type are certainly justified. Furthermore, Butala et al. offer an excellent prototype for such studies. In any case, follow-up data are almost always going to have limitations. Readers of published clinical research and clinicians should be reasonably skeptical of long-term follow-up data obtained using Medicare claims, excluding mortality, which seems reliable.

How does the United States compare to other countries? Most of the world will have no reliable follow-up data. Some well-developed countries, such as Denmark and Sweden, have whole country hospitalization and mortality databases.19, 20 These databases will be useful for mortality and presumably events such as myocardial infarction but may also fall short for complications such as acute kidney injury or bleeding. The generalizability of results from Denmark and Sweden to the rest of the world is at best uncertain.

The problem of how to properly follow up on our patients is real and there will be no perfect solution. Yet decisions must be made for our patients in the absence of truly sound data. Skilled, knowledgeable, skeptical clinicians who are patient advocates will continue to be the basis of sound practice.

Footnotes

Conflicts of Interest: none

Contributor Information

William S. Weintraub, MedStar Heart & Vascular Institute, Georgetown University, Washington, DC.

Brandon K. Bellows, Columbia University, New York, NY.

References

  • 1. Olubowale OT, Safford MM, Brown TM, Durant RW, Howard VJ, Gamboa C, Glasser SP, Rhodes JD and Levitan EB. Comparison of Expert Adjudicated Coronary Heart Disease and Cardiovascular Disease Mortality With the National Death Index: Results From the REasons for Geographic And Racial Differences in Stroke (REGARDS) Study. Journal of the American Heart Association. 2017;6.
  • 2. Granger CB, Vogel V, Cummings SR, Held P, Fiedorek F, Lawrence M, Neal B, Reidies H, Santarelli L, Schroyer R, Stockbridge NL and Feng Z. Do we need to adjudicate major clinical events? Clinical trials. 2008;5:56–60.
  • 3. Mahaffey KW, Harrington RA, Akkerhuis M, Kleiman NS, Berdan LG, Crenshaw BS, Tardiff BE, Granger CB, DeJong I, Bhapkar M, Widimsky P, Corbalon R, Lee KL, Deckers JW, Simoons ML, Topol EJ, Califf RM and For the PI. Disagreements between central clinical events committee and site investigator assessments of myocardial infarction endpoints in an international clinical trial: review of the PURSUIT study. Curr Control Trials Cardiovasc Med. 2001;2:187–194.
  • 4. Brindis RG, Fitzgerald S, Anderson HV, Shaw RE, Weintraub WS and Williams JF. The American College of Cardiology-National Cardiovascular Data Registry (ACC-NCDR): building a national clinical data repository. J Am Coll Cardiol. 2001;37:2240–5.
  • 5. Lloyd-Jones DM, Leip EP, Larson MG, D’Agostino RB, Beiser A, Wilson PW, Wolf PA and Levy D. Prediction of lifetime risk for cardiovascular disease by risk factor burden at 50 years of age. Circulation. 2006;113:791–8.
  • 6. Hammill BG, Hernandez AF, Peterson ED, Fonarow GC, Schulman KA and Curtis LH. Linking inpatient clinical registry data to Medicare claims data using indirect identifiers. Am Heart J. 2009;157:995–1000.
  • 7. Meray N, Reitsma JB, Ravelli AC and Bonsel GJ. Probabilistic record linkage is a valid and transparent tool to combine databases without a patient identification number. J Clin Epidemiol. 2007;60:883–91.
  • 8. Shahian DM, O’Brien SM, Sheng S, Grover FL, Mayer JE, Jacobs JP, Weiss JM, Delong ER, Peterson ED, Weintraub WS, Grau-Sepulveda MV, Klein LW, Shaw RE, Garratt KN, Moussa ID, Shewan CM, Dangas GD and Edwards FH. Predictors of Long-Term Survival After Coronary Artery Bypass Grafting Surgery: Results From the Society of Thoracic Surgeons Adult Cardiac Surgery Database (The ASCERT Study). Circulation. 125:1491–500.
  • 9. Weintraub WS, Grau-Sepulveda MV, Weiss JM, Delong ER, Peterson ED, O’Brien SM, Kolm P, Klein LW, Shaw RE, McKay C, Ritzenthaler LL, Popma JJ, Messenger JC, Shahian DM, Grover FL, Mayer JE, Garratt KN, Moussa ID, Edwards FH and Dangas GD. Prediction of long-term mortality after percutaneous coronary intervention in older adults: results from the national cardiovascular data registry. Circulation. 125:1501–10.
  • 10. Weintraub WS, Grau-Sepulveda MV, Weiss JM, O’Brien SM, Peterson ED, Kolm P, Zhang Z, Klein LW, Shaw RE, McKay C, Ritzenthaler LL, Popma JJ, Messenger JC, Shahian DM, Grover FL, Mayer JE, Shewan CM, Garratt KN, Moussa ID, Dangas GD and Edwards FH. Comparative Effectiveness of Revascularization Strategies. N Engl J Med.
  • 11. Tromp M, Ravelli AC, Bonsel GJ, Hasman A and Reitsma JB. Results from simulated data sets: probabilistic record linkage outperforms deterministic record linkage. J Clin Epidemiol. 2011;64:565–72.
  • 12. Ness RB and Joint Policy Committee SoE. Influence of the HIPAA Privacy Rule on health research. JAMA. 2007;298:2164–70.
  • 13. Aldridge RW, Shaji K, Hayward AC and Abubakar I. Accuracy of Probabilistic Linkage Using the Enhanced Matching System for Public Health and Epidemiological Studies. PLoS One. 2015;10:e0136179.
  • 14. Mues KE, Liede A, Liu J, Wetmore JB, Zaha R, Bradbury BD, Collins AJ and Gilbertson DT. Use of the Medicare database in epidemiologic and health services research: a valuable source of real-world evidence on the older and disabled populations in the US. Clin Epidemiol. 2017;9:267–277.
  • 15. Cowper DC, Kubal JD, Maynard C and Hynes DM. A primer and comparative review of major US mortality databases. Ann Epidemiol. 2002;12:462–8.
  • 16. SPRINT Research Group, Wright JT Jr., Williamson JD, Whelton PK, Snyder JK, Sink KM, Rocco MV, Reboussin DM, Rahman M, Oparil S, Lewis CE, Kimmel PL, Johnson KC, Goff DC Jr., Fine LJ, Cutler JA, Cushman WC, Cheung AK and Ambrosius WT. A Randomized Trial of Intensive versus Standard Blood-Pressure Control. N Engl J Med. 2015;373:2103–16.
  • 17. Butala NM, Strom JB, Faridi KF, Kazi DS, Zhao Y, Brennan JM, Popma JJ, Shen C and Yeh RW. Validation of administrative claims to ascertain outcomes in pivotal trials of transcatheter aortic valve replacement. JACC Cardiovascular Interventions. 2020.
  • 18. Leon MB, Piazza N, Nikolsky E, Blackstone EH, Cutlip DE, Kappetein AP, Krucoff MW, Mack M, Mehran R, Miller C, Morel MA, Petersen J, Popma JJ, Takkenberg JJ, Vahanian A, van Es GA, Vranckx P, Webb JG, Windecker S and Serruys PW. Standardized endpoint definitions for Transcatheter Aortic Valve Implantation clinical trials: a consensus report from the Valve Academic Research Consortium. J Am Coll Cardiol. 2011;57:253–69.
  • 19. Schmidt M, Schmidt SAJ, Adelborg K, Sundboll J, Laugesen K, Ehrenstein V and Sorensen HT. The Danish health care system and epidemiological research: from health care contacts to database records. Clin Epidemiol. 2019;11:563–591.
  • 20. Webster PC. Sweden’s health data goldmine. CMAJ. 2014;186:E310.
