Key Points
Question
Are mortality rates calculated with data from the Social Security Administration Death Master File reliable for use in clinical research?
Findings
In this study of mortality data of 667 516 adults with atherosclerotic cardiovascular disease, the Social Security Administration Death Master File appeared to markedly underestimate mortality rates, with variable undercapture among states and over time.
Meaning
The Social Security Administration Death Master File data appeared unreliable and should not be used alone by researchers to estimate mortality rates.
Abstract
Importance
Despite its documented undercapture of mortality data, the US Social Security Administration Death Master File (SSDMF) is still often used to provide mortality end points in retrospective clinical studies. Changes in death data reporting to SSDMF in 2011 may have further affected the reliability of mortality end points, with varying consequences over time and by state.
Objective
To evaluate the reliability of mortality rates in the SSDMF in a cohort of patients with atherosclerotic cardiovascular disease (ASCVD).
Design, Setting, and Participants
This observational analysis used the IBM MarketScan Medicare and commercial insurance databases linked to mortality information from the SSDMF. Adults with ASCVD were included in the study if they had a clinical encounter between January 1, 2012, and December 31, 2013, had at least 2 years of follow-up, and were from states with 1000 or more eligible adults with ASCVD. Data analysis was conducted between April 18 and May 21, 2018.
Main Outcomes and Measures
Kaplan-Meier analyses were conducted to estimate state-level mortality rates for adults with ASCVD, stratified by database (commercial or Medicare). The assumption of a constant hazard of mortality was tested for each state, and individual state Kaplan-Meier curves were evaluated for temporal changes. For states in which the hazard of death was constant over time, mortality rates for adults with ASCVD were compared with state-level, age group–specific overall mortality rates in 2012, as reported by the National Center for Health Statistics (NCHS).
Results
This study of mortality data of 667 516 adults with ASCVD included 274 005 adults in the commercial insurance database cohort (171 959 male [62.8%] and median [interquartile range (IQR)] age of 58 [52-62] years) and 393 511 in the Medicare database cohort (245 366 male [62.4%] and median [IQR] age of 76 [70-83] years). Of the 41 states included, 11 states (26.8%) in the commercial cohort and 18 states (43.9%) in the Medicare cohort had a change in the hazard of death after 2012. Among states with constant hazard, state-level mortality rates using the SSDMF ranged widely, from 0.06 to 1.30 per 100 person-years (commercial cohort) and from 0.83 to 6.07 per 100 person-years (Medicare cohort). Variability between states in mortality estimates for adults with ASCVD using SSDMF data greatly exceeded variability in overall mortality from the NCHS. No correlation was found between NCHS mortality estimates and those from the SSDMF (ρ = 0.29 [P = .06] for age 55-64 years; ρ = 0.18 [P = .27] for age 65-74 years).
Conclusions and Relevance
The SSDMF appeared to markedly underestimate mortality rates, with variable undercapture among states and over time; this finding suggests that SSDMF data are not reliable and should not be used alone by researchers to estimate mortality rates.
This study evaluates the challenges of using the Social Security Administration Death Master File mortality data, particularly for analyzing mortality trends among patients with atherosclerotic cardiovascular disease.
Introduction
Mortality is one of the most important end points in clinical research, but assessing mortality in the United States is difficult. The National Death Index maintained by the Centers for Disease Control and Prevention is a comprehensive database for mortality investigations in the United States. Yet, owing to its associated costs and availability lag of up to 2 years, use of the National Death Index has not been widespread. Researchers often turn to the US Social Security Administration Death Master File (SSDMF) as an alternative. The SSDMF is maintained by the Social Security Administration, relies on death reports from a variety of sources, and offers a quick and inexpensive alternative to National Death Index data. The Social Security Administration has noted that its records are not meant to capture all deaths in the United States,1 and previous work has documented the incompleteness of the SSDMF for ascertaining mortality.2,3,4 The SSDMF became even more limited in 2011, when the Social Security Administration removed death data reported by state-based agencies.5 Nevertheless, published observational research continues to reference mortality estimates based on the SSDMF.6,7,8,9,10,11,12,13
Although previous work2,3,4,5 has discussed the relative completeness of the data set, the association of the policy change with observed temporal trends in mortality and geographic variation in SSDMF completeness has been less well described. In this analysis, we examine associations between changes in reporting in the SSDMF and observed trends in mortality for a large cohort of patients with atherosclerotic cardiovascular disease (ASCVD).
Methods
We evaluated time-to-death data in adults with ASCVD using the SSDMF linked to the MarketScan commercial insurance and Medicare databases (IBM).14 This analysis was approved by the Duke University Institutional Review Board, which waived informed consent because the study used deidentified data. Data analysis was conducted between April 18 and May 21, 2018.
Inclusion criteria were as follows: (1) adults 18 years or older as of January 1, 2009; (2) at least 1 claim for an inpatient, emergency, or outpatient visit with a health care system between January 1, 2012, and December 31, 2013; (3) continuous enrollment from January 1, 2010, to January 1, 2012; (4) ASCVD, determined according to International Classification of Diseases, Ninth Revision, diagnosis or procedure codes for coronary heart disease, cerebrovascular disease, or peripheral arterial disease and procedure codes for coronary, cerebral, or peripheral revascularization; and (5) availability of a Social Security number in the MarketScan database to enable a link to the SSDMF. States (n = 9) with fewer than 1000 individuals with ASCVD in the analysis cohort who met the inclusion criteria were excluded, owing to the lack of precision in mortality rate estimation secondary to sample size.
Mortality was evaluated for up to 24 months from the index date, set as the first visit claim date in 2012, and the cumulative incidence of mortality at 24 months was estimated through the Kaplan-Meier method. State-specific mortality rates were estimated as the number of deaths per 100 person-years of follow-up, and the median across states was reported. Analyses were stratified by the commercial and Medicare population. To test whether the hazard of death was constant during follow-up, we fit a survival model assuming a Weibull distribution for time to death and tested whether the shape parameter was equal to 1, which corresponds to an exponential distribution (constant hazard).
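The two summary measures described above can be illustrated with a minimal Python sketch. It is not the study's analysis code: the function names, the 365.25-day year, the 730-day horizon, and the toy follow-up data are all assumptions made for illustration.

```python
from collections import Counter


def rate_per_100_person_years(follow_up_days, died):
    """Deaths per 100 person-years of follow-up (assumes a 365.25-day year)."""
    person_years = sum(follow_up_days) / 365.25
    return 100.0 * sum(died) / person_years


def km_cumulative_incidence(follow_up_days, died, horizon_days=730):
    """Kaplan-Meier estimate of cumulative mortality, 1 - S(t), at horizon_days."""
    deaths = Counter(t for t, d in zip(follow_up_days, died) if d)
    exits = Counter(follow_up_days)  # deaths and censorings both leave the risk set
    at_risk = len(follow_up_days)
    survival = 1.0
    for t in sorted(exits):
        if t > horizon_days:
            break
        if deaths[t]:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1.0 - deaths[t] / at_risk
        at_risk -= exits[t]
    return 1.0 - survival


# Hypothetical follow-up for 5 patients: days observed and whether a death was recorded.
days = [200, 400, 600, 730, 730]
dead = [True, False, True, False, False]
print(rate_per_100_person_years(days, dead))
print(km_cumulative_incidence(days, dead))
```

A patient with no recorded death is treated as censored at the end of observed follow-up. Because of this, any death that is missing from the linked source lowers both estimates without raising an error.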
To compare state-level mortality rate variability from the SSDMF with the state-level mortality variability from the National Center for Health Statistics (NCHS), which is considered the criterion standard, we accessed the 2012 state-level mortality data from the Centers for Disease Control and Prevention WONDER databases (http://wonder.cdc.gov) for the following age groups: 55 to 64 years and 65 to 74 years. These age group–specific death rates were then compared with 1-year state-level mortality rates for adults in the commercial cohort (age 55-64 years) and Medicare cohort (age 65-74 years). The correlations were assessed using the Spearman correlation coefficient and its associated P value; P < .05 was used for statistical significance of the correlation coefficient.
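For readers unfamiliar with the Spearman coefficient, the sketch below shows how it can be computed as the Pearson correlation of ranks, with tied values given their average rank. The function names and the example values are hypothetical and are not study data. The P value is omitted here; in practice, a statistics package (for example, `scipy.stats.spearmanr`) would return both the coefficient and its P value.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical state-level rates per 100 person-years (illustrative values only).
linked_file_rates = [0.2, 0.9, 0.5, 1.1, 0.3]
reference_rates = [0.8, 0.9, 1.0, 0.7, 0.85]
print(spearman_rho(linked_file_rates, reference_rates))
```

Because the coefficient depends only on ranks, it asks whether states with higher reference mortality also tend to have higher linked-file mortality, regardless of the absolute difference between the two sources.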
Results
Among the 41 states with at least 1000 individuals with ASCVD in the MarketScan databases (N=667 516), 274 005 adults had commercial insurance (171 959 male [62.8%] and median [interquartile range (IQR)] age of 58 [52-62] years), and 393 511 adults had Medicare insurance (245 366 male [62.4%] and median [IQR] age of 76 [70-83] years). In the commercial population, the median (IQR) mortality rate per 100 person-years by state was 0.62 (0.22-0.97) and ranged from 0.06 to 1.42. In the Medicare population, the median (IQR) mortality rate per 100 person-years by state was 4.61 (1.64-6.88) and ranged from 0.80 to 9.64.
The hazard of death was not constant over the 2-year follow-up in 11 (26.8%) of the 41 states in the commercial cohort and 18 (43.9%) of the 41 states in the Medicare cohort. In several states in which hazard was not constant, we observed a discrete inflection point at which the mortality rate abruptly decreased (Figure 1). Across states, these inflection points did not occur at any fixed common time point, and the size of the inflection (ie, change in slope) varied. Among states with constant hazards of mortality, rates per 100 person-years varied widely for both the commercial and Medicare populations (Figure 2). The mortality rate per 100 person-years among states with a constant hazard ranged from 0.06 to 1.30 for the commercial cohort and from 0.83 to 6.07 for the Medicare cohort.
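The constant-hazard test described in the Methods can be sketched as follows. The article does not state which test statistic or software was used, so this example assumes a likelihood-ratio test of the Weibull shape parameter against 1, with the scale parameter profiled out and the shape found by bisection. The function names and search bounds are hypothetical.

```python
import math


def _profile_loglik(k, times, died):
    """Weibull log-likelihood at shape k with the scale fixed at its MLE for that k."""
    n_events = sum(died)
    sum_tk = sum(t ** k for t in times)
    sum_log_events = sum(math.log(t) for t, d in zip(times, died) if d)
    return (n_events * math.log(k) - n_events * math.log(sum_tk / n_events)
            + (k - 1.0) * sum_log_events - n_events)


def _score(k, times, died):
    """Derivative of the profile log-likelihood with respect to k (decreasing in k)."""
    n_events = sum(died)
    sum_tk = sum(t ** k for t in times)
    sum_tk_log = sum(t ** k * math.log(t) for t in times)
    sum_log_events = sum(math.log(t) for t, d in zip(times, died) if d)
    return n_events / k + sum_log_events - n_events * sum_tk_log / sum_tk


def weibull_shape_test(follow_up, died, lo=0.01, hi=50.0):
    """Return (shape MLE, P value) for the null hypothesis shape = 1.

    Assumes right-censored data, all follow-up times > 0, at least one death,
    and a shape MLE inside [lo, hi] (illustrative bounds).
    """
    t_max = max(follow_up)
    times = [t / t_max for t in follow_up]  # rescaling leaves the shape unchanged
    for _ in range(100):  # bisection on the score equation
        mid = (lo + hi) / 2.0
        if _score(mid, times, died) > 0:
            lo = mid
        else:
            hi = mid
    k_hat = (lo + hi) / 2.0
    lr = 2.0 * (_profile_loglik(k_hat, times, died) - _profile_loglik(1.0, times, died))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return k_hat, p_value
```

Under this formulation, a fitted shape below 1 means that the recorded hazard falls over follow-up. That is the pattern expected if deaths stop being reported partway through the observation window.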
Figure 1. Kaplan-Meier Mortality Curves Using the Social Security Administration Death Master File (SSDMF) for Patients in States With Nonconstant Hazards.
Figure 2. Kaplan-Meier Mortality Curves Using the Social Security Administration Death Master File (SSDMF) for Patients in States With Constant Hazards.
A and B, Mortality curves show time to death based on data from the SSDMF from patients with atherosclerotic cardiovascular disease in states in which the hazard of death was constant over time. There were 23 states in the Medicare cohort and 30 states in the commercial cohort with constant hazards of mortality. Curves shown are for the 19 states where hazards were constant in both cohorts.
State-level mortality rates per 100 person-years for adults with ASCVD aged 55 to 64 years in the commercial cohort (n = 183 476) and for adults aged 65 to 74 years in the Medicare cohort (n = 154 497) compared with the overall mortality for these age groups in each state, according to the NCHS, are displayed in Figure 3. No correlation was found between the SSDMF and NCHS state-level mortality rates for the ASCVD population in either cohort (ρ = 0.29 [P = .06] for age 55-64 years; ρ = 0.18 [P = .27] for age 65-74 years).
Figure 3. Comparison of Overall Mortality Rates From the National Center for Health Statistics (NCHS) With Atherosclerotic Cardiovascular Disease (ASCVD) Population Mortality From Social Security Administration Death Master File (SSDMF) in 41 US States.
A, Orange dots show state-specific mortality rates per 100 person-years from the SSDMF for patients with ASCVD aged 55 to 64 years in the MarketScan commercial database. Blue dots show age group–specific state-level overall mortality rates from the NCHS for comparison. B, Orange dots show state-specific mortality rates per 100 person-years from the SSDMF for patients with ASCVD aged 65 to 74 years in the MarketScan Medicare database. Blue dots show age group–specific state-level overall mortality rates from the NCHS for comparison. No correlation was found in mortality estimates between NCHS and SSDMF (ρ = 0.29 [P = .06] for age 55-64 years; ρ = 0.18 [P = .27] for age 65-74 years).
Discussion
Given its timeliness and ease of access, the SSDMF is commonly used for research purposes to evaluate mortality, and data vendors continue to offer SSDMF linkages to researchers as part of real-world evidence services. Previous work has demonstrated that the SSDMF may undercapture mortality,15,16 but this analysis further shows that undercapture varies substantially over time and between states. The state-by-state variability was out of proportion to the heterogeneity across states in overall state-level mortality rates using data maintained by the NCHS, suggesting differential undercapture of mortality using the SSDMF by state. Adults with ASCVD can be expected to have higher overall mortality than the general population, yet in many states the SSDMF-estimated mortality rate for adults with ASCVD was far lower than the NCHS-derived mortality rate for adults in the same age category.
Further evidence of the lack of reliability of SSDMF mortality estimates comes from examining the shape of the hazard function. Biologically, the hazard of mortality should be constant over relatively short time windows, such as the 24 months assessed in this study; however, death data from more than half of the states demonstrated a nonconstant hazard, indicating that, within states, variability in the completeness of death data reporting occurred over time. Patterns of apparent undercapture of death data from the SSDMF varied, making it nearly impossible to identify specific states in which death data from the SSDMF may be reliable.
In several states, discrete time points were identified after which the risk of death abruptly decreased, suggesting that SSDMF mortality reporting stopped or markedly dropped off; however, these time points differed across states. Other states exhibited nonconstant hazards of death but with mortality curves that appeared to taper off over time, preventing researchers from identifying a single point in time at which to censor data to avoid undercapture.
Limitations
This study has several limitations. First, we restricted our analysis of SSDMF data to adults with ASCVD, and we evaluated mortality only through 2015. However, given the lack of change in SSDMF data collection practices, the issues of data completeness identified in this study likely extend to other disease states and are unlikely to have improved since that time. Second, state-level mortality data stratified by ASCVD status are not available from NCHS. However, we believe the comparison of variability in mortality between the 2 data sets across states still highlights the biologic implausibility that these differences were the result of true variation in ASCVD survival rather than differential data capture.
Conclusions
The SSDMF appears to undercapture mortality, with variable undercapture by state and over time, which may lead to artifactual geographic and temporal differences in mortality rates. Researchers should avoid relying on mortality estimates based on the SSDMF alone and be aware of heterogeneity in SSDMF data completeness.
References
- 1. Social Security Administration. Requesting the Full Death Master File (DMF). https://www.ssa.gov/dataexchange/request_dmf.html. Accessed January 31, 2019.
- 2. Lash TL, Silliman RA. A comparison of the National Death Index and Social Security Administration databases to ascertain vital status. Epidemiology. 2001;12(2):259-261. doi: 10.1097/00001648-200103000-00021
- 3. Schisterman EF, Whitcomb BW. Use of the Social Security Administration Death Master File for ascertainment of mortality status. Popul Health Metr. 2004;2(1):2. doi: 10.1186/1478-7954-2-2
- 4. Buchanich JM, Dolan DG, Marsh GM, Madrigano J. Underascertainment of deaths using social security records: a recommended solution to a little-known problem. Am J Epidemiol. 2005;162(2):193-194. doi: 10.1093/aje/kwi178
- 5. Rothwell CJ. SSA Death Master File & NCHS National Death Index: How do they relate? https://www.cdc.gov/nchs/data/bsc/rothwell.pdf. Accessed November 20, 2018.
- 6. Agarwal S, Parashar A, Garg A, Ellis SG, Tuzcu EM, Kapadia SR. Length of stay and long-term mortality following ST elevation myocardial infarction. Catheter Cardiovasc Interv. 2015;86(suppl 1):S1-S7. doi: 10.1002/ccd.26071
- 7. Avgerinos ED, Go C, Ling J, Makaroun MS, Chaer RA. Survival and long-term cardiovascular outcomes after carotid endarterectomy in patients with chronic renal insufficiency. Ann Vasc Surg. 2015;29(1):15-21. doi: 10.1016/j.avsg.2014.07.029
- 8. Chung R, Houghtaling PL, Tchou M, et al. Left ventricular hypertrophy and antiarrhythmic drugs in atrial fibrillation: impact on mortality. Pacing Clin Electrophysiol. 2014;37(10):1338-1348. doi: 10.1111/pace.12426
- 9. Mogensen KM, Robinson MK, Casey JD, et al. Nutritional status and mortality in the critically ill. Crit Care Med. 2015;43(12):2605-2615. doi: 10.1097/CCM.0000000000001306
- 10. Oberweis BS, Smilowitz NR, Nukala S, et al. Relation of perioperative elevation of troponin to long-term mortality after orthopedic surgery. Am J Cardiol. 2015;115(12):1643-1648. doi: 10.1016/j.amjcard.2015.03.003
- 11. Patorno E, Glynn RJ, Levin R, Lee MP, Huybrechts KF. Benzodiazepines and risk of all cause mortality in adults: cohort study. BMJ. 2017;358:j2941.
- 12. Tarakji KG, Wazni OM, Harb S, Hsu A, Saliba W, Wilkoff BL. Risk factors for 1-year mortality among patients with cardiac implantable electronic device infection undergoing transvenous lead extraction: the impact of the infection type and the presence of vegetation on survival. Europace. 2014;16(10):1490-1495. doi: 10.1093/europace/euu147
- 13. Kosiborod M, Cavender MA, Fu AZ, et al; CVD-REAL Investigators and Study Group. Lower risk of heart failure and death in patients initiated on sodium-glucose cotransporter-2 inhibitors versus other glucose-lowering drugs: the CVD-REAL Study (Comparative Effectiveness of Cardiovascular Outcomes in New Users of Sodium-Glucose Cotransporter-2 Inhibitors). Circulation. 2017;136(3):249-259. doi: 10.1161/CIRCULATIONAHA.117.029190
- 14. The MarketScan Databases for Life Sciences Researchers. https://truvenhealth.com/Portals/0/Assets/2017-MarketScan-Databases-Life-Sciences-Researchers-WP.pdf. Accessed November 20, 2018.
- 15. Ashley T, Cheung L, Wokanovicz R. Accuracy of vital status ascertainment using the Social Security Death Master File in a deceased population. J Insur Med. 2012;43(3):135-144.
- 16. Cowper DC, Kubal JD, Maynard C, Hynes DM. A primer and comparative review of major US mortality databases. Ann Epidemiol. 2002;12(7):462-468. doi: 10.1016/S1047-2797(01)00285-X