Abstract
Objectives
This study aimed to evaluate the cause-of-death data in Minnesota (2011-2021) to understand the usage of "garbage codes" on death certificates.
Study design
We conducted a logistic regression analysis using death data from the Minnesota Vital Statistics System, which compiles statistical data on all births, deaths, infant deaths, and fetal deaths in Minnesota.
Methods
Death certificate data from the Minnesota Department of Health were analyzed, and garbage codes were classified using ANACONDA criteria. Logistic regression assessed associations with socioeconomic variables, considering demographic factors, county characteristics, and fixed effects.
Results
Garbage codes constituted 3-4% of deaths, with variations by location, demographics, and office affiliation. Logistic regression showed that the odds of being assigned a garbage code varied significantly with age, rural residence, education, marital status, and place of death.
Conclusions
The study unveiled variations in cause-of-death data reliability in Minnesota, emphasizing the prevalence of garbage codes. Enhancing cause-of-death data accuracy is pivotal for informed public health decisions and accurate death statistics to guide targeted public health interventions and mitigate health disparities.
Keywords: Cause-of-death data, Garbage codes, Mortality statistics, Public health data quality, Death certificates
1. Introduction
Beyond their administrative role, death certificates provide a crucial source of statistical data by recording information such as causes of death (CoD) [1,2]. Accurately measuring disease prevalence and burden by studying CoD allows for better-designed policy tools and more efficiently spent resources to mitigate disease [1,2]. Thus, it is important to have good-quality CoD data [1,2]. Improved accuracy in CoD information not only refines our understanding of disease burdens but also strengthens the foundation upon which public health strategies to combat the burden of diseases are formulated [3-5].
Several studies have demonstrated the poor quality of causes of death data [6,7]. However, most of these rely on data aggregated at the county level. No studies have examined factors associated with misclassification at an individual level. We use unique microdata from Minnesota, which has a relatively high usage of garbage codes on death certificates [7], to facilitate a granular analysis of trends at the individual decedent level, setting it apart from the aggregated data prevalent in studies conducted in most other states. In this paper, we examined the CoD data in Minnesota from 2011 to 2021 to assess its reliability. Our investigation focused on identifying instances where death certificates assigned inappropriate or unsuitable causes of death, often termed garbage codes. Since the usage of garbage codes is known to vary with age, sex, and other demographic characteristics globally, with particularly high incidence among the very young and very old [7], we also test these associations more formally using regression-based approaches to see whether similar relationships are observed in this US context. This study contributes to efforts to improve the utility of death statistics in informing evidence-based public health interventions.
2. Methods
We use a death dataset from the Minnesota Vital Statistics System compiled by the Minnesota Department of Health. These data are described elsewhere [8]. The dataset lists the causes of death used to calculate national statistics in descending order of importance. We use records from 2011 to 2021 to identify observations assigned unsuitable CoD using the criteria proposed by the Analysis of National Causes of Death for Action (ANACONDA) software [9]. ANACONDA categorizes all ICD-10 codes into four levels of garbage codes based on the severity of implications of their misclassification for policy decisions, with codes classified as level 1 having serious implications and codes classified as level 2 having substantial implications (definitions are detailed in Appendix SA1). Observations were flagged as garbage codes if the first cause of death in the database was classified as a level 1 or level 2 garbage code by ANACONDA. We only use the first cause of death code listed in the database because it is the code most frequently aggregated when vital statistics are calculated, making it the most important cause of death informing policy decisions, public comprehension of the CoD, and estimates of the burden imposed by each cause. Our initial analysis measured the percentage of observations allocated a garbage code and the frequency of distinct garbage codes, in total and for different demographic groups by place of death, gender, marital status, age, and education levels (Appendix SA2).
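The flagging rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the lookup table, its made-up `ZZ..` codes, and the record layout are hypothetical stand-ins, since the actual code-to-level classification comes from the ANACONDA tool [9] and Appendix SA1.

```python
# Hypothetical lookup from ICD-10 code to ANACONDA garbage-code level.
# The "ZZ.." codes are invented placeholders, not real ICD-10 codes;
# the real classification comes from ANACONDA [9].
ANACONDA_LEVEL = {
    "ZZ01": 1,  # placeholder for a level 1 (serious implications) code
    "ZZ02": 2,  # placeholder for a level 2 (substantial implications) code
    "ZZ03": 3,  # placeholder for a level 3 code (not flagged)
    "ZZ04": 4,  # placeholder for a level 4 code (not flagged)
}


def is_flagged(cause_codes, levels=ANACONDA_LEVEL):
    """Return True if the FIRST listed cause is a level 1 or level 2 code.

    cause_codes is assumed to be ordered as in the source dataset,
    i.e. in descending order of importance.
    """
    if not cause_codes:
        return False
    first = cause_codes[0].replace(".", "").upper()
    return levels.get(first) in (1, 2)


def share_flagged(records):
    """Share of records whose first listed cause is flagged."""
    if not records:
        return 0.0
    flagged = sum(is_flagged(r["causes"]) for r in records)
    return flagged / len(records)


# Toy records for illustration only (not real data).
toy_records = [
    {"causes": ["ZZ01", "ZZ03"]},  # flagged: first code is level 1
    {"causes": ["ZZ03", "ZZ01"]},  # not flagged: level 1 code is not first
    {"causes": ["ZZ02"]},          # flagged: first code is level 2
    {"causes": ["ZZ04"]},          # not flagged: level 4
]
print(share_flagged(toy_records))
```

Only the first listed code is checked, so a record with a level 1 code in a later position is not counted, mirroring the rule stated in the text.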
The second part of the analysis used logistic regression to estimate the relationship between socioeconomic variables and the odds of being assigned a garbage code. The independent variables were education level, marital status, gender, race and ethnicity, age, and place of death. We also include county-level characteristics including Social Vulnerability Index (SVI) and rural-urban classification.¹ Our model also includes year and coroner or medical examiner's (CME) office affiliation fixed effects. The independent variables are defined in detail in Appendix SA1. We use robust standard errors clustered by CME.
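One way to write down the specification described above is sketched here. The notation is an assumption introduced for illustration and does not appear in the source: $G_i$ indicates that decedent $i$ was assigned a garbage code, $\mathbf{x}_i$ collects the individual-level variables, $\mathbf{z}_{c(i)}$ the characteristics of the decedent's county, and $\delta$ and $\mu$ are the year and CME office fixed effects.

```latex
\[
\operatorname{logit}\,\Pr(G_i = 1)
  = \alpha
  + \mathbf{x}_i^{\top}\boldsymbol{\beta}
  + \mathbf{z}_{c(i)}^{\top}\boldsymbol{\gamma}
  + \delta_{t(i)}
  + \mu_{m(i)},
\qquad
\operatorname{logit}(p) = \ln\frac{p}{1-p}
\]
```

Under this form, exponentiating a coefficient gives the corresponding odds ratio, $\mathrm{OR}_k = e^{\beta_k}$, which is the scale on which values above or below 1 indicate higher or lower odds.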
3. Results
Garbage codes account for 3-4% of recorded deaths, with a marginal decline over time (Appendix SA3). The usage of garbage codes decreased from 3.52% (n = 1405) in 2011 to 2.83% (n = 1456) in 2021. This trend was mainly driven by the decreasing prevalence of garbage codes in the Twin Cities Metro area between 2011 and 2017. Despite the pandemic, the prevalence of garbage codes remained stable between 2020 and 2021 in both the Twin Cities Metro area and the rest of the state, commonly referred to as Greater Minnesota (Appendix SA3). The most frequently utilized garbage code in our sample was R99, i.e., other ill-defined and unspecified causes of mortality (n = 9389), followed by R54, i.e., senility-age-related physical debility, and F179, i.e., mental and behavioral disorders due to use of tobacco: unspecified mental and behavioral disorder (Appendix SA3). The usage of garbage codes varied by the supervising CME office, with the highest usage of garbage codes among urban examining agencies, such as Ramsey County (3.7%) and Midwest Medical Examiner's Office (3%), and some rural counties with independent CME offices, including Rice, Becker, Brown, Stevens, Pope, Pipestone, and Lyon counties (Appendix SA4). In comparison, lower rates were observed in counties serviced by the Mayo Clinic (1.1%) and UND Forensics (2.6%).
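The count of garbage codes rose slightly between 2011 and 2021 even though their share fell, which is possible only if total deaths grew. A rough back-calculation from the reported counts and percentages illustrates this; the totals are implied approximations (the percentages are rounded), not figures reported in the source.

```python
# Reported garbage-code counts and percentage shares from the text above.
reported = {
    2011: (1405, 3.52),
    2021: (1456, 2.83),
}


def implied_total(count, percent):
    """Approximate total deaths implied by a count and its percentage share."""
    return count / (percent / 100)


# Implied totals are approximate because the published shares are rounded.
implied = {
    year: round(implied_total(n, pct)) for year, (n, pct) in reported.items()
}

for year, total in implied.items():
    print(year, total)
```

The implied totals are roughly 39,900 deaths in 2011 and 51,400 in 2021, so a larger denominator accounts for the lower share despite a similar number of garbage codes.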
We also observed that usage of garbage codes was significantly higher for decedents with less than a high school education (3.4% compared to 2.9% for high school graduates or higher), decedents who were not married at the time of death (3.6% compared to 2% for married decedents), rural decedents (3.5% compared to 2.8% for urban decedents), and decedents whose place of death was not a healthcare setting, i.e., those whose place of death was home (4%), nursing home (3.7%), or another place (2.8%) compared to deaths in healthcare settings (1.4%). The usage of garbage codes also increased with the age of the decedent, with only 1.5% for decedents below the age of 50 compared to 3.5% for decedents over 75 years of age (Appendix SA2).
Our main results reveal significant variations in the odds of being assigned a garbage code based on demographic characteristics (Fig. 1). Estimates are reported as odds ratios. The odds are higher for older individuals (βage51-65 = 1.724, βage66-75 = 2.184, βage76+ = 2.387, p-value <0.05), those in rural counties (βrural = 1.242, p-value <0.05), locations with higher SVI (βSVI = 2.007, p-value <0.05), and non-healthcare settings (βhome = 3.112, βnursinghome = 2.318, βother = 2.168, p-value <0.05). Conversely, lower odds are associated with higher education (βHSgraduate = 0.907, βAnyCollegeEducation = 0.941, p-value <0.05), married decedents (βmarried = 0.569, p-value <0.05), and BIPOC decedents² (βBIPOC = 0.926, p-value <0.05). Gender and the affiliation of the coroner or medical examiner's (CME) office present inconclusive relationships with garbage code assignment (Fig. 1, Appendix SA5).
Fig. 1.
Factors associated with higher odds of decedents being assigned an unsuitable cause of death (garbage code) on their death certificate
In the above figures, the x-axis represents the odds ratio and the y-axis represents the factors examined. An odds ratio higher than 1 indicates higher odds of decedents being assigned a garbage code.
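To give a sense of scale for the odds ratios in Fig. 1, the sketch below converts one of them into an implied probability. It is illustrative only: it pairs the unadjusted 1.4% share for deaths in healthcare settings with the adjusted odds ratio for deaths at home (3.112), and it assumes healthcare settings are the reference category, which the source does not state explicitly.

```python
import math


def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability implied by scaling the baseline odds by an odds ratio."""
    baseline_odds = baseline_prob / (1 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


# Assumed baseline: unadjusted garbage-code share in healthcare settings.
baseline = 0.014
# Reported odds ratio for deaths at home.
or_home = 3.112

p_home = apply_odds_ratio(baseline, or_home)
# The same estimate on the log-odds (coefficient) scale.
log_odds_home = math.log(or_home)

print(f"implied probability: {p_home:.4f}")
print(f"log-odds coefficient: {log_odds_home:.3f}")
```

Because the baseline is small, the implied probability is close to the baseline multiplied by the odds ratio; for rare outcomes such as garbage codes, odds ratios and risk ratios are numerically similar.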
4. Discussion
There exists a significant opportunity to enhance the accuracy of CoD data in Minnesota, thereby advancing our comprehension of the burden imposed by various diseases, especially within underrepresented communities. Although BIPOC decedents were not significantly more likely to be assigned garbage codes, decedents from counties with a higher SVI score were, indicating that counties with more vulnerable populations are less likely to accurately report the cause of death of their residents. This finding, which might be in part due to the limited tax dollars and other resources available to the county, offers an opportunity to identify counties where we could improve our understanding of health disparities and allocate resources effectively to underserved communities.
Disseminating accurate mortality data is crucial for informed decision-making in public health. To achieve this goal, it is imperative to identify challenges and provide comprehensive training to medical professionals and other stakeholders involved in certifying deaths [10]. Raising awareness about the impact of precise mortality data not only bolsters the reliability of health statistics but also plays a pivotal role in guiding targeted interventions, particularly benefiting marginalized and underrepresented populations.
Ethical statement
The University of Minnesota institutional review board determined this study to not be human participants research (IRB no. STUDY00012527).
Contributors
H. Karnik wrote the article. H. Karnik, J. Lasway, and Z. Levin conducted the data analysis. E. Wrigley-Field and J. P. Leider secured data and funding. H. Karnik, E. Wrigley-Field, and J. P. Leider conceptualized the study. All authors revised the article critically for intellectual content and approved the final version submitted.
Funding
E. Wrigley-Field was supported by the Eunice Kennedy Shriver National Institute of Child Health and Human Development via the Minnesota Population Center (grant P2C HD041023). The views expressed in this article are those of the authors and do not necessarily reflect those of the Federal Trade Commission or any individual Commissioner. The authors would like to thank Celestine Simeh for research assistance, and David Stanton, Andrew Stokes, and Rebecca Wurtz for their thoughtful suggestions.
Declaration of competing interest
The authors have no conflicts of interest to disclose.
Biographies
Harshada Karnik is a Senior Research Manager at WRMA Inc.
Jonathon P. Leider is with the Health Policy and Management Division, University of Minnesota School of Public Health, Minneapolis.
Elizabeth Wrigley-Field is with the Department of Sociology and Minnesota Population Center, University of Minnesota.
Jovin Lasway is with the Department of Applied Economics, University of Minnesota.
Zachary Levin is with the Bureau of Economics, Federal Trade Commission.
Supplementary data to this article can be found online at https://doi.org/10.1016/j.puhip.2026.100746.
¹ Since the SVI includes the proportion of BIPOC residents in the jurisdiction, we did not include SVI and race in the same specification. All the models specified are detailed in Appendix SA5.
² Since BIPOC and SVI are correlated, we do not include them in the same model, so the point estimates for BIPOC are derived from Model 2 in Appendix SA5.
References
1. McGivern L., Shulman L., Carney J.K., Shapiro S., Bundock E. Death certification errors and the effect on mortality statistics. Public Health Rep. 2017;132(6):669–675. doi: 10.1177/0033354917736514.
2. Gill J.R., DeJoseph M.E. The importance of proper death certification during the COVID-19 pandemic. JAMA. 2020;324(1):27–28. doi: 10.1001/jama.2020.9536.
3. Dwyer-Lindgren L., Bertozzi-Villa A., Stubbs R.W., et al. US county-level trends in mortality rates for major causes of death, 1980-2014. JAMA. 2016;316(22):2385. doi: 10.1001/jama.2016.13645.
4. Foreman K.J., Naghavi M., Ezzati M. Improving the usefulness of US mortality data: new methods for reclassification of underlying cause of death. Popul. Health Metr. 2016;14(1):14. doi: 10.1186/s12963-016-0082-4.
5. Ng T.C., Lo W.C., Ku C.C., Lu T.H., Lin H.H. Improving the use of mortality data in public health: a comparison of garbage code redistribution models. Am. J. Publ. Health. 2020;110(2):222–229. doi: 10.2105/AJPH.2019.305439.
6. Johnson S.C., Cunningham M., Dippenaar I.N., et al. Public health utility of cause of death data: applying empirical algorithms to improve data quality. BMC Med. Inf. Decis. Making. 2021;21(1):175. doi: 10.1186/s12911-021-01501-1.
7. Flagg L.A., Anderson R.A. Unsuitable underlying causes of death for assessing the quality of cause-of-death reporting. Natl. Vital Stat. Rep. 2021;69(14).
8. Karnik H., Wrigley-Field E., Levin Z., et al. Examining excess mortality among critical workers in Minnesota during 2020–2021: an occupational analysis. Am. J. Publ. Health. 2023;113(11):1219–1222. doi: 10.2105/AJPH.2023.307395.
9. Mikkelsen L., Moesgaard K., Hegnauer M., Lopez A.D. ANACONDA: a new tool to improve mortality and cause of death data. BMC Med. 2020;18:1–13. doi: 10.1186/s12916-020-01521-0.
10. Morgan A., Andrew T., Guerra S.M., Luna V., Davies L., Rees J.R. Provider reported challenges with completing death certificates: a focus group study demonstrating potential sources of error. PLoS One. 2022;17(5). doi: 10.1371/journal.pone.0268566.