Improving Validity of Cause of Death on Death Certificates

Ryan A Hoffman; Janani Venugopalan; Li Qu; Hang Wu; May D Wang

doi:10.1145/3233547.3233581

Author manuscript; available in PMC: 2020 Jun 18.

Published in final edited form as: ACM BCB. 2018 Aug;2018:178–183. doi: 10.1145/3233547.3233581

PMCID: PMC7302107 NIHMSID: NIHMS1595289 PMID: 32558825

Abstract

Accurate reporting of causes of death on death certificates is essential for formulating appropriate disease control, prevention, and emergency response by national health-protection institutions such as the Centers for Disease Control and Prevention (CDC). In this study, we use publicly available expert-formulated rules for the cause of death to determine the extent to which death certificates in national mortality data are discordant with the expert knowledge base. We also report the most commonly occurring invalid causal pairs that physicians enter on death certificates. We use sequence rule mining to find the patterns that are most frequent on death certificates and compare them with the rules from the expert knowledge base. Based on our results, 20.1% of the common patterns derived from entries on death certificates were discordant. The most probable causes of these discordances, or invalid rules, are missing steps and non-specific ICD-10 codes on the death certificates.

1. INTRODUCTION

Approximately 2.6 million deaths occur each year in the United States (US) and 56 million deaths occur per year worldwide [1, 2]. Accurate death statistics are imperative to help national health protection institutions such as the National Center for Health Statistics prevent epidemics and disease outbreaks, formulate a response to communicable diseases, and evaluate statistics such as birth and death trends. Data from death certificates are also used to estimate trends in chronic conditions, such as the prevalence of diabetes and cardiovascular conditions. Reporting agencies often rely on the causes of death from death records to carry out these tasks. To ensure a timely and accurate response to disease threats, high-quality information on death certificates is essential.

The World Health Organization (WHO) classifies causes of death (COD) using the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10), which contains 22 chapters covering 2,046 categories of diseases [3, 4]. Despite the pressing need for high-quality cause-of-death information, challenges such as a lack of adequate knowledge and practice still hinder the accurate completion of death certificates, leading to death certificates of uncertain quality. Studies have found disagreements between the death certificate COD and the sequence of events reported in the medical record [5, 6]. Accurate COD reporting is difficult in some situations, for example when the certifying physician is not the primary care provider, when the primary care provider is part of a separate healthcare organization or located in a different area, when certificate completion is significantly delayed after the death, or when the certifying physician does not have ready access to relevant medical records. These situations are part of the reality of medical practice, where the clinicians providing care at the time of a patient’s death may not have comprehensive knowledge of the patient’s medical history. Errors in COD reports also result from poorly maintained data quality within the systems of both state and local hospitals. One study found that only 56.9% of attending physicians, 56.0% of resident physicians, and 55.7% of medical students matched experts for the correct cause of death in clinical case studies [7]. Another study found that 45% of resident respondents incorrectly identified a cardiovascular event as the primary cause of death [8]. The Framingham Heart Study and other studies have indicated that coronary artery disease is overestimated on death certificates as a cause of death in the general population by 24%, and two times more in older patients than in younger ones [9]. The extent to which such overestimation affects the quality of death certificate information and national mortality statistics is not known.

Information on COD constitutes the basis of health systems in hospitals and is essential for studying the relationships among diseases. Inaccurate reporting of these data could lead to inappropriate public health interventions [10]. Moreover, discrepancies in vital statistics documents pose a challenge to implementing effective public health interventions and to accurately reporting morbidity and mortality information [11, 12]. For example, Johns et al. have shown that inaccurate COD reporting has a large impact on the tracking of health disparities in New York City premature cardiovascular mortality [12]. Sound scientific results depend on improving the accuracy and validity of mortality statistics [13, 14]. Accurate COD reports are also important in clinical trials and studies. In addition, COD reports are the basis on which the National Center for Health Statistics (NCHS) supports disease surveillance, the proper allocation of funds for public health programs and research, and the prioritization of governmental decisions and actions regarding health care. Because health statistics, national mortality and morbidity statistics, and data on disease prevalence in society are largely derived from COD reports, the accuracy of COD reports is essential.

Although ICD provides detailed and specific rules, COD reports still have many issues. Some studies have focused on coding problems related to particular diseases in specific countries, such as hypertension [15] and myocardial infarction [16]. Some studies have reported incompleteness and inaccuracy in cause-of-death statements [17]. Several researchers have studied patient medical charts for errors in cause-of-death reporting, and several have demonstrated inaccurate COD reporting among residents [8]. A few studies have discussed the types of coding errors and the reasons for them [18]. No study has provided effective solutions to the errors in COD reports, and methods to remedy these errors are urgently needed. Physician training in death certificate completion has been found effective in increasing the accuracy of COD reports [17]. The NYC Office of Vital Statistics has been addressing this issue, especially for heart disease death reporting. An intervention was implemented within eight NYC hospitals from 2009 to 2010. After the intervention, the proportion of heart disease deaths reported at the intervention hospitals decreased by 50%, to the level of the nonintervention hospitals [19, 20]. Many low-cost or free methods are also used, including an e-learning module created by New York City DOHMH in cooperation with the National Association for Public Health Statistics and Information Systems, and cause-of-death documentation handbooks provided by the CDC [21, 22]. However, these methods can only reduce the likelihood of errors when COD reports are recorded; they cannot detect or correct errors after the reports have been recorded.

There is a gap in research on the efficient determination of the true match among several potential matches for COD reports [23]. This process is extremely complex and time intensive. Tanno et al. worked on identifying deaths accurately, focusing only on anaphylaxis-caused deaths in Brazil [24]. Perviz A. M. et al. revealed that only 10–15% of assigned underlying causes in data from England are not true pathophysiological causes of death. Although these studies showed that COD reports contain some inaccurate records, each focuses on only one disease in a specific country or area. Another study focused on accurate cause classification for stillbirths and neonatal deaths only [25]. There are also studies providing new methods for reclassifying the underlying cause of death using all the COD codes, including the errors [26].

To document acceptable causal relationships for use in automated and manual mortality coding, experts at NCHS have published a comprehensive list of acceptable sequence codes for the cause of death [27]. The list consists of valid causes of death and, for each COD code, the ICD-10 codes that can cause it. Accuracy on death certificates can be improved in multiple ways, one of which is to capture concordance with this expert pool of knowledge. However, studies have not actively used this resource to identify discordances in the codes entered on death certificates, to identify their sources, or to help improve current clinical practice in completing death certificates.

In this paper, we used this publicly available resource of expert knowledge to determine the discordance in death certificates in national mortality repositories (National Vital Statistics System (NVSS) death certificate data). To determine the patterns frequently entered in the NVSS data, we used sequential rule mining. We then compared the rules obtained from sequential rule mining with those obtained from the expert knowledge base. The goal is to develop processes for eliminating wrong and inaccurate COD records. In addition, our methods will help researchers better understand the relationships among diseases as well as the appropriate public health interventions.

We structure the rest of the paper as follows: we first describe the data sources and modeling in section 2, followed by the results and discussion in section 3. Finally, we conclude with the conclusions, limitations, and potential for future work in section 4.

2. EXPERIMENTAL AND COMPUTATIONAL DETAILS

As mentioned above, this study seeks to find the extent to which death certificates in the national mortality database are discordant with an expert pool of knowledge, the most common discordances, and their sources. We used the death certificate data made public by the NVSS.

2.1. Data Sources: Expert Knowledge Base

Medical experts working on the COD have published a comprehensive list of acceptable sequence codes for the cause of death, which integrates NCHS’s guidance as well as other international sources of data [28]. These data consist of valid cause-of-death relationships between ICD-10 codes. Each chain consists of an address (the COD, F3) followed by one or two sub-addresses (F1: F2). The relationships are such that each address is caused by the sub-address that follows it, i.e. F2 → F3. When there is more than one sub-address, the address is caused by the sub-addresses and all the codes that fall between the two (F1: F2 → F3).
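One way to picture these relationships is as a lookup from each address to the sub-address codes, or code ranges, accepted as its causes. The sketch below is illustrative only: the `Relationship` record, the `is_valid_link` helper, and the sample rows are hypothetical, and treating two sub-addresses as an inclusive range compared by string order is an assumption, not the documented format of the published list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Relationship:
    """One hypothetical row of the expert list."""
    address: str                     # the caused code (F3)
    sub_low: str                     # the only sub-address, or the start of a range
    sub_high: Optional[str] = None   # end of the range when two sub-addresses are given


def is_valid_link(cause: str, effect: str, table: List[Relationship]) -> bool:
    """Return True if `cause` falls in a sub-address range listed for `effect`."""
    for rel in table:
        if rel.address != effect:
            continue
        high = rel.sub_high if rel.sub_high is not None else rel.sub_low
        # Assumption: codes without the decimal point sort correctly as strings.
        if rel.sub_low <= cause <= high:
            return True
    return False


# Made-up rows for illustration; they are not copied from the published list.
table = [
    Relationship(address="I469", sub_low="I251"),
    Relationship(address="J969", sub_low="J440", sub_high="J449"),
]

print(is_valid_link("J441", "J969", table))
```

A flat list is enough for a sketch; for a table with thousands of rows, grouping the rows by address in a dictionary would avoid scanning every relationship on each lookup.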

In this dataset, some of the relationships are marked as ambivalent. Relationships marked as ambivalent indicate that further clarification may be needed in the future. In this analysis, we used all the relationships, including the ones marked as ambivalent.

2.2. Data Sources: Death Certificate Data

The National Vital Statistics System (NVSS), coordinated by the National Center for Health Statistics, aggregates the causes of death for all deaths occurring within the United States from 1959 to 2014 [2]. For this analysis, we used the mortality data from 2012, which contains 2,547,864 deaths. The death certificate format varies among the vital statistics offices of each state, the District of Columbia, and other special jurisdictions, but generally consists of the underlying cause of death as recorded by physicians and other details such as demographics, comorbid conditions, race, and ethnicity. The cause of death on the death certificates is recorded as entity-axis codes and record-axis codes (up to 20 conditions). The entity-axis codes refer to the raw data entered on death certificates, and the record-axis codes refer to the codes cleaned by the NVSS. The entity-axis codes in these data contain two parts: part one contains the ordered set of COD codes (sequences), and part two contains additional related COD codes, which are unordered. Since the goal of this study is to find the discrepancies in the sequence of COD on the actual death certificates (not cleaned or filtered codes), we used part one of the entity-axis codes for this analysis. Using the COD information, we extracted rules indicative of the most frequently used COD sequences using sequential rule mining.
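As a rough illustration of this selection step, the sketch below keeps only the ordered part-one codes of a single record. The `EntityCode` tuple and its fields are a hypothetical in-memory layout chosen for the example, not the NVSS file format, and the ordering by line and position is an assumption about how the sequence would be reconstructed.

```python
from typing import List, NamedTuple


class EntityCode(NamedTuple):
    """Hypothetical parsed entry from the cause-of-death section of one record."""
    part: int       # 1 = ordered causal sequence, 2 = additional unordered conditions
    line: int       # line within the part
    position: int   # position of the code on that line
    code: str       # ICD-10 code as recorded


def part_one_sequence(entries: List[EntityCode]) -> List[str]:
    """Return the part-one codes ordered by line, then by position on the line."""
    ordered = sorted(
        (e for e in entries if e.part == 1),
        key=lambda e: (e.line, e.position),
    )
    return [e.code for e in ordered]


# Made-up record for illustration.
record = [
    EntityCode(part=2, line=1, position=1, code="E119"),
    EntityCode(part=1, line=2, position=1, code="I251"),
    EntityCode(part=1, line=1, position=1, code="I469"),
]

print(part_one_sequence(record))
```

Whether the resulting list should be read from first line to last or reversed before mining depends on how the causal direction is encoded in the source data, which this sketch does not settle.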

2.3. Deriving Frequent COD Patterns from Death Certificates

In this analysis, we derive the most frequently used patterns from death certificates using sequence rule mining (SRM). Sequence rule mining, sequence pattern mining [29–33], and association rule mining [34, 35] are the most commonly used models in the literature for finding temporal relationships among sequences. SRM has diverse pattern mining applications in finance and market analysis [36, 37], travel analysis [38], mobile learning [39], and database projections (PrefixSpan [40], MEMISP [41]). In healthcare, SRM has applications in multi-dimensional EEG analysis [42], administrative data analysis [43], heart disease prediction [34, 35], healthcare auditing [44], and neurological diagnosis [45]. It was first introduced by Agrawal et al. to extract regularities between products in large-scale warehouse databases [46]. Ordonez et al. adopted SRM for medical data and proposed an improved algorithm that constrains rules to speed up the mining process [35]. In this analysis, we chose SRM over direct comparison of the sequences on the death records because SRM discovers the sequences that are most commonly used on death certificates. This helps clinicians and national health institutions find the top sequences where discordances occur with the expert knowledge base described above, which allows for targeted interventions in clinician training and clinical decision support systems. SRM is used to discover all sequences frequently found in the dataset. A rule is included in the analysis only if it meets a minimum support, where the support of a rule is defined as the proportion of sequences in the data that exhibit the pattern [47]. In COD data mining, a rule of the form “X => Y, with support B” can therefore be interpreted as follows: B% of the sequences in the data contain COD X followed by COD Y. Because the current NVSS dataset does not record temporal relationships, the pattern mining is performed on the order of the COD entries in the death certificates.
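The support of a pattern, defined above as the proportion of sequences that exhibit it, can be sketched in a few lines. The helper names and the toy sequences are hypothetical, and the sketch assumes that a pattern matches whenever its codes appear in order, with other codes allowed in between.

```python
def contains_in_order(sequence, pattern):
    """True if every item of `pattern` appears in `sequence` in the same order."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, so later items must come later.
    return all(item in remaining for item in pattern)


def support(sequences, pattern):
    """Return (count, proportion) of sequences that contain `pattern`."""
    count = sum(contains_in_order(s, pattern) for s in sequences)
    return count, count / len(sequences)


# Toy sequences with placeholder codes X, Y, Z.
sequences = [["X", "Y", "Z"], ["X", "Z"], ["Y", "X"], ["X", "Y"]]

count, proportion = support(sequences, ["X", "Y"])
print(count, proportion)
```

Here two of the four toy sequences contain X followed by Y, so the count is 2 and the proportion is 0.5; the third sequence does not count because Y precedes X.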

The training dataset from the death certificates consists of a list of COD sequences C = [C₁, C₂, …, C_K]. Using this training dataset, we discover a set of N rules, S = [R₁, R₂, R₃, …, R_N]. Each rule R in the set of rules S is given by R = <r₁, r₂, …, r_T>, where r₁, r₂, …, r_T is the sequence of COD in the rule R and T is the number of COD in the sequence. They are sequentially ordered to reflect the relationship r₁ → r₂ → r₃ … → r_T. The support of a rule R is measured as the number of sequences in C that contain this rule. In this analysis, the support value is used as the metric for selecting rules: only rules whose support is larger than a minimum support are retained. In our experiments, we use the BIDE algorithm, short for BI-Directional-Extension-based frequent closed sequence mining, proposed by Wang et al. [48]. The BIDE algorithm was selected because it is an implementation of frequent closed sequence mining which emphasizes scalability and real-world performance.
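To make the idea of frequent closed sequences concrete, the sketch below enumerates frequent patterns level by level and then keeps only the closed ones, meaning those with no longer frequent pattern of equal support. This is a naive enumeration for small toy inputs, not the BIDE algorithm; the function names, the `max_len` cap, the toy data, and the use of a threshold of at least `min_support` are all assumptions made for the example.

```python
def contains_in_order(sequence, pattern):
    """True if `pattern` is a subsequence of `sequence` (gaps allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def frequent_sequences(sequences, min_support, max_len=4):
    """Grow patterns one item at a time, keeping those seen in >= min_support sequences."""
    items = sorted({code for seq in sequences for code in seq})
    frequent = {}
    frontier = [()]
    for _ in range(max_len):
        next_frontier = []
        for prefix in frontier:
            for item in items:
                candidate = prefix + (item,)
                count = sum(contains_in_order(s, candidate) for s in sequences)
                if count >= min_support:
                    frequent[candidate] = count
                    # Only frequent patterns are extended further.
                    next_frontier.append(candidate)
        frontier = next_frontier
    return frequent


def closed_only(frequent):
    """Drop patterns that have a longer frequent super-sequence with equal support."""
    return {
        p: sup
        for p, sup in frequent.items()
        if not any(
            len(q) > len(p) and sup_q == sup and contains_in_order(q, p)
            for q, sup_q in frequent.items()
        )
    }


# Toy sequences with placeholder codes.
toy = [("A", "B", "C"), ("A", "B", "C"), ("A", "C"), ("B", "C")]
closed = closed_only(frequent_sequences(toy, min_support=2))
print(closed)
```

In the toy data, A alone occurs in three sequences but so does A followed by C, so A is not closed while the longer pattern is. Closedness is only checked against patterns up to `max_len`, which is one of several reasons a dedicated algorithm such as BIDE is needed at national scale.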

2.4. Deriving Rules from Expert Knowledge

In the expert-derived relationship data mentioned above, a total of 10,849 unique ICD-10 codes were found. We also have a list of ICD-10 codes that are marked as not valid in the US for mortality reporting (1,301 out of 10,849). In this analysis, after consultation with experts from the Centers for Disease Control and Prevention, we used only the codes that are valid in the US. Following that consultation, we also excluded from the analysis death records containing external cause (V-Y) or nature of injury (S-T) codes, as being outside the scope of physician mortality certifiers.
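The two filtering steps described here can be sketched as follows. The function names and sample records are hypothetical, and identifying external cause and injury codes by their first letter (S, T, V, W, X, or Y) is an assumption about how the exclusion could be implemented.

```python
# First letters of the ICD-10 ranges excluded from the analysis (S-T and V-Y).
EXCLUDED_LETTERS = frozenset("STVWXY")


def has_excluded_code(codes):
    """True if any code in the record is an injury or external cause code."""
    return any(code[:1].upper() in EXCLUDED_LETTERS for code in codes)


def filter_records(records):
    """Keep only the records with no S-T or V-Y codes."""
    return [codes for codes in records if not has_excluded_code(codes)]


def us_valid_codes(all_codes, not_valid_in_us):
    """Remove the codes flagged as not valid in the US from the expert code list."""
    return set(all_codes) - set(not_valid_in_us)


# Made-up records for illustration.
records = [["I251", "I469"], ["S720", "J189"], ["W19", "S065"]]
kept = filter_records(records)
print(len(records) - len(kept), "records omitted")
```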

3. RESULTS AND DISCUSSION

3.1. Results from SRM Analysis

We applied BIDE to part one of the entity-axis cause-of-death codes from the 2012 death certificates.

After performing the filtering procedure in section 2.4, 224,608 death records were omitted from the analysis. From the remaining records, using a minimum support of 50 occurrences, we extracted 11,815 sequential rules, together accounting for 4,010,150 causal relationships between codes. Of these rules, 61 had length 4, 3,386 had length 3, and the remaining 8,368 had length 2, as shown in Table 1.

Table 1:

Size of Frequent Patterns from SRM Analysis

Length	Count
2	8368
3	3386
4	61
5+	0

3.2. Results from Comparing Rules from SRM with Expert Knowledge

For each rule of length two or greater found using SRM, we checked its validity against the relationship dataset described above. A rule was marked as invalid if its address and sub-address pair from SRM was not found in the relationship dataset. For multi-part rules, if any one part of the SRM rule did not conform, the entire rule was marked invalid, as seen in Fig. 1.

Figure 1: Interpretation of a multi-part rule, where any invalid link identifies the rule as invalid.
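The check illustrated in Fig. 1 can be sketched as a test of every consecutive link in a rule. The helper names are hypothetical, the set of valid pairs contains only two relationships taken from Table 4, and checking adjacent pairs only is an assumption about how links are defined.

```python
def invalid_links(rule, valid_pairs):
    """Return the consecutive (cause, effect) pairs of `rule` missing from `valid_pairs`."""
    return [(a, b) for a, b in zip(rule, rule[1:]) if (a, b) not in valid_pairs]


def is_valid_rule(rule, valid_pairs):
    """A rule is valid only if every one of its links is valid."""
    return not invalid_links(rule, valid_pairs)


# Two relationships reported as valid in Table 4; the real list is far larger.
valid_pairs = {
    ("Essential (primary) hypertension", "Atherosclerotic heart disease"),
    ("Atherosclerotic heart disease", "Cardiac arrest, unspecified"),
}

# A length-3 rule from Table 3: its first link is not in the valid set.
rule = (
    "Congestive heart failure",
    "Atherosclerotic heart disease",
    "Cardiac arrest, unspecified",
)
print(is_valid_rule(rule, valid_pairs), invalid_links(rule, valid_pairs))
```

Returning the failing links, and not just a flag, makes it possible to report which step of a multi-part rule caused it to be marked invalid.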

Of the total 11,815 rules, 2,378 (20.1%) were marked as invalid. Based on the counts, the cumulative number of relationships from SRM was 4,010,150. Of these relationships, a cumulative count of 491,955 (12.3%) were marked invalid. A rule can be invalid if any individual step is invalid, or if an ICD-10 code is not allowable or is not specific enough. Table 2 and Table 3 show the top rules, of lengths two and three respectively, that were marked invalid, ranked by frequency of occurrence. Support is expressed as the number of records that contained the rule. The full table of invalid rules is available at miblab.bme.gatech.edu. For comparison, valid rules are shown in Table 4. The valid rules show relationships that are accepted by medical consensus as being plausibly causally linked.
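The reported percentages follow directly from the counts given in this section, as the short check below shows. The variable names are introduced here for illustration; the figures are the ones stated in the text and in Table 1.

```python
# Counts reported in sections 3.1 and 3.2.
rules_by_length = {2: 8_368, 3: 3_386, 4: 61}
total_rules = 11_815
invalid_rules = 2_378
total_relationships = 4_010_150
invalid_relationships = 491_955

# The per-length counts in Table 1 add up to the total number of rules.
rules_from_table = sum(rules_by_length.values())

# Share of invalid rules and of invalid relationships, in percent.
invalid_rule_pct = round(100 * invalid_rules / total_rules, 1)
invalid_relationship_pct = round(100 * invalid_relationships / total_relationships, 1)

print(rules_from_table, invalid_rule_pct, invalid_relationship_pct)
```

The gap between the two percentages indicates that the invalid rules tend to have lower support than the valid ones.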

Table 2:

Top 10 Invalid Rules of Length 2

Support	Rule (c1 → c2)
7278	Essential (primary) hypertension → Unspecified diabetes mellitus without complications
7175	Chronic obstructive pulmonary disease, unspecified → Malignant neoplasm of bronchus or lung, unspecified
6598	Congestive heart failure → Chronic obstructive pulmonary disease, unspecified
6248	Congestive heart failure → Atherosclerotic heart disease
5998	Essential (primary) hypertension → Chronic obstructive pulmonary disease, unspecified
5992	Chronic obstructive pulmonary disease, unspecified → Atherosclerotic heart disease
5670	Atherosclerotic heart disease → Chronic obstructive pulmonary disease, unspecified
5665	Essential (primary) hypertension → Unspecified dementia
3550	Atrial fibrillation and flutter → Atherosclerotic heart disease
3407	Essential (primary) hypertension → Non-insulin-dependent diabetes mellitus without complications

Table 3:

Top 10 Invalid Rules of Length 3

Support	Rule (c1 → c2 → c3)
1329	Essential (primary) hypertension → Unspecified diabetes mellitus without complications → Cardiac arrest, unspecified
1284	Congestive heart failure → Atherosclerotic heart disease → Cardiac arrest, unspecified
1186	Chronic obstructive pulmonary disease, unspecified → Atherosclerotic heart disease → Cardiac arrest, unspecified
1079	Essential (primary) hypertension → Unspecified diabetes mellitus without complications → Atherosclerotic heart disease
879	Essential (primary) hypertension → Unspecified diabetes mellitus without complications → Acute myocardial infarction, unspecified
861	Congestive heart failure → Chronic obstructive pulmonary disease, unspecified → Respiratory failure, unspecified
798	Essential (primary) hypertension → Chronic obstructive pulmonary disease, unspecified → Cardiac arrest, unspecified
733	Atherosclerotic heart disease → Chronic obstructive pulmonary disease, unspecified → Cardiac arrest, unspecified
607	Congestive heart failure → Chronic obstructive pulmonary disease, unspecified → Cardiac arrest, unspecified
600	Chronic obstructive pulmonary disease, unspecified → Malignant neoplasm of bronchus or lung, unspecified → Respiratory failure, unspecified

Table 4:

Top 10 Valid Rules of Length 2

Support	Rule (c1 → c2)
68064	Atherosclerotic heart disease → Cardiac arrest, unspecified
47089	Atherosclerotic heart disease → Acute myocardial infarction, unspecified
38386	Atherosclerotic heart disease → Congestive heart failure
30814	Congestive heart failure → Cardiac arrest, unspecified
30787	Essential (primary) hypertension → Cardiac arrest, unspecified
28401	Chronic obstructive pulmonary disease, unspecified → Respiratory failure, unspecified
27733	Pneumonia, unspecified → Septicemia, unspecified
26349	Pneumonia, unspecified → Respiratory failure, unspecified
24728	Essential (primary) hypertension → Atherosclerotic heart disease
24535	Acute myocardial infarction, unspecified → Cardiac arrest, unspecified

In summary, 20.1% of the frequently occurring patterns from the death certificate data showed discordance with the expert knowledge data. Major causes of this could include the use of non-specific codes and missing entities in the sequences entered. Finding the codes that are the top areas of discordance in the COD section of death certificates can help national health institutions such as the CDC, NCHS, and NVSS provide certifying clinicians with the training required to avoid potential inaccuracies. It could also help these institutions assist physicians in recording more detailed information. In addition, finding the missing links in the COD sequence has the potential to help in clinical decision support.

4. CONCLUSIONS

The lack of methods for evaluating the accuracy of current death records in national mortality databases is a challenge and can result in inappropriate public health interventions, loss of life, or increased expense. In this study, we developed a framework to showcase the discordances between the COD coding of death certificates in mortality databases and an expert knowledge base. We also identified the rules with the most frequent discrepancies. This provides the knowledge needed to improve the training of physicians in completing death certificates. It also gives insight into the common discordances and the systemic changes required for improving the accuracy of death certificates. These systematic differences may highlight potential opportunities for updating and revising either clinician training or the causal classification procedures themselves. In addition, this framework can be incorporated into intelligent analytics with future potential for improving the accuracy of death reporting.

There are limitations to the methods and results described in this work. Though potential root causes for the discordance are proposed and discussed throughout, the lack of ground truth hampers direct evaluation of the causes of the discordance. Additionally, only one year of historical data was used for the rule mining step. In the future, we will extend our analysis to data spanning multiple years. We will also use graph-based ontology analysis, which can potentially provide the missing links to clinicians at the time they complete death certificates. We believe that this work demonstrates the applicability and value of decision support systems to mortality reporting, and hope that future work in this area may definitively identify the sources of these errors. We will also investigate the combination of our methods with decision support systems.

ACKNOWLEDGMENTS

This work was carried out in collaboration with the Centers for Disease Control and Prevention. This work was supported in part by grants from the National Center for Advancing Translational Sciences of the National Institutes of Health (NIH) under Award UL1TR000454 to Dr. May D. Wang, National Science Foundation Award NSF1651360, the US Department of Health and Human Services (HHS) Centers for Disease Control and Prevention (CDC) HHSD2002015F62550B to Dr. May D. Wang, and Microsoft Research and Hewlett Packard. This article does not reflect the official policy or opinions of the CDC, NSF, or the US Department of HHS and does not constitute an endorsement of the individuals or their programs.

For this work, the authors thank Paula Braun (CDC) and Charles Sirc (CDC) for their invaluable assistance and support in shaping this project. We also thank Donna Hoyert and Robert Anderson at the National Center for Health Statistics for their invaluable feedback and support.

REFERENCES

[1].Top 10 Causes of Death - Factsheet. WHO, 2013. [Google Scholar]
[2].NCHS, United States, 2014: With special feature on adults aged 55–64 (2015). [PubMed] [Google Scholar]
[3].International statistical classification of diseases and related health problems. World Health Organization, 2004. [PubMed] [Google Scholar]
[4].Mony PK and Nagaraj C Health information management: An introduction to disease classification and coding. National Medical Journal of India, 20, 6 (2007), 307. [PubMed] [Google Scholar]
[5].Sington J and Cottrell B Analysis of the sensitivity of death certificates in 440 hospital deaths: a comparison with necropsy findings. Journal of clinical pathology, 55, 7 (2002), 499–502. [DOI] [PMC free article] [PubMed] [Google Scholar]
[6].Hoff C and Ratard R Louisiana death certificate accuracy: a concern for the public’s health. J La State Med Soc, 162, 6 (2010), 350–352. [PubMed] [Google Scholar]
[7].Messite J and Stellman SD Accuracy of death certificate completion: the need for formalized physician training. Jama, 275, 10 (1996), 794–796. [PubMed] [Google Scholar]
[8].Lakkireddy DR, Gowda MS, Murray CW, Basarakodu KR and Vacek JL Death certificate completion: how well are physicians trained and are cardiovascular causes overstated? The American journal of medicine, 117, 7 (2004), 492–498. [DOI] [PubMed] [Google Scholar]
[9].Agarwal R, Norton JM, Konty K, Zimmerman R, Glover M, Lekiachvili A, McGruder H, Malarcher A, Casper M and Mensah GA Peer Reviewed: Overreporting of Deaths From Coronary Heart Disease in New York City Hospitals, 2003. Preventing chronic disease, 7, 3 (2010). [PMC free article] [PubMed] [Google Scholar]
[10].Seske LM, Muglia LJ, Hall ES, Bove KE and Greenberg JM Infant mortality, cause of death, and vital records reporting in Ohio, United States. Maternal and child health journal, 21, 4 (2017), 727–733. [DOI] [PubMed] [Google Scholar]
[11].Johansson LA and Westerling R Comparing Swedish hospital discharge records with death certificates: implications for mortality statistics. International journal of epidemiology, 29, 3 (2000), 495–502. [PubMed] [Google Scholar]
[12].Johns LE, Madsen AM, Maduro G, Zimmerman R, Konty K and Begier E A case study of the impact of inaccurate cause-of-death reporting on health disparity tracking: New York City premature cardiovascular mortality. American journal of public health, 103, 4 (2013), 733–739. [DOI] [PMC free article] [PubMed] [Google Scholar]
[13].Madsen A, Thihalolipavan S, Maduro G, Zimmerman R, Koppaka R, Li W, Foster V and Begier E Peer Reviewed: An Intervention to Improve Cause-of-Death Reporting in New York City Hospitals, 2009–2010. Preventing chronic disease, 9 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
[14].Gissler M, Kauppila R, Merilainen J, Toukomaa H and Hemminki E Pregnancy-associated deaths in Finland 1987–1994-definition problems and benefits of record linkage. Acta obstetricia et gynecologica Scandinavica, 76, 7 (1997), 651–657. [DOI] [PubMed] [Google Scholar]
[15].Curb J, Babcock C, Pressel S, Tung B, Remington R and Hawkins C Nosological coding of cause of death. American journal of epidemiology, 118, 1 (1983), 122–128. [DOI] [PubMed] [Google Scholar]
[16].Guibert RL, Wigle DT and Williams JI Decline of acute myocardial infarction death rates not due to cause of death coding. Canadian journal of public health= Revue canadienne de sante publique, 80, 6 (1989), 418–422. [PubMed] [Google Scholar]
[17].Myers KA and Farquhar DR Improving the accuracy of death certification. Canadian Medical Association Journal, 158, 10 (1998), 1317–1323. [PMC free article] [PubMed] [Google Scholar]
[18].Percy C and Muir C The international comparability of cancer mortality data: results of an international death certificate study. American journal of epidemiology, 129, 5 (1989), 934–946. [DOI] [PubMed] [Google Scholar]
[19].Madsen A and Begier E Improving quality of cause-of-death reporting in New York City. Preventing chronic disease, 10 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
[20].Madsen A, Thihalolipavan S and Maduro M A successful intervention to improve the quality of cause of death reporting in New York City hospitals. (In Press). Preventing chronic disease (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
[21].NAPHSIS Training Resources. National Association for Public Health Statistics and Information Systems Web site. [Google Scholar]
[22].Physician’s Handbook on Death Registration and Fetal Death Reporting Hyattsville, MD: Centers for Disease Control and Prevention, 2003. [Google Scholar]
[23].Skopp NA, Smolenski DJ, Schwesinger DA, Johnson CJ, Metzger-Abamukong MJ and Reger MA Evaluation of a methodology to validate National Death Index retrieval results among a cohort of U.S. service members. Ann Epidemiol, 27, 6 (June 2017), 397–400. [DOI] [PubMed] [Google Scholar]
[24].Tanno LK, Bierrenbach AL, Calderon MA, Sheikh A, Estelle R Simons F and Demoly P Increasing the Accuracy of Notification of Anaphylaxis Deaths in Brazil through the International Classification of Diseases (ICD)-11 Revision. Journal of Allergy and Clinical Immunology, 139, 2 (AB226. [DOI] [PubMed] [Google Scholar]
[25].Flenady V, Wojcieszek AM, Ellwood D, Leisher SH, Erwich JJH, Draper ES, McClure EM, Reinebrant HE, Oats J and McCowan L Classification of causes and associated conditions for stillbirths and neonatal deaths. Elsevier, City, 2017. [DOI] [PubMed] [Google Scholar]
[26].Foreman KJ, Naghavi M and Ezzati M Improving the usefulness of US mortality data: new methods for reclassification of underlying cause of death. Population health metrics, 14, 1 (2016), 14. [DOI] [PMC free article] [PubMed] [Google Scholar]
[27].Lu TH Using ACME (Automatic Classification of Medical Entry) software to monitor and improve the quality of cause of death statistics. Journal of Epidemiology and Community Health, 57 (2003-06-01 00:00:00 2003), 470–471. [DOI] [PMC free article] [PubMed] [Google Scholar]
[28].DIMDI - About Iris. DIMDI: German Institute of Medical Documentation and Information, City. [Google Scholar]
[29].Tao C, Wongsuphasawat K, Clark K, Plaisant C, Shneiderman B and Chute CG Towards event sequence representation, reasoning and visualization for EHR data In Proceedings of the Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium (Miami, Florida, USA, 2012). ACM, [insert City of Publication],[insert 2012 of Publication]. [Google Scholar]
[30].Wang TD, Plaisant C, Quinn AJ, Stanchak R, Murphy S and Shneiderman B Aligning temporal data by sentinel events: discovering patterns in electronic health records. ACM, City, 2008. [Google Scholar]
[31].Syed H and Das AK. Identifying Chemotherapy Regimens in Electronic Health Record Data Using Interval-Encoded Sequence Alignment. Springer, 2015.
[32].Casanova IJ, Campos M, Juarez JM, Fernandez-Fernandez-Arroyo A and Lorente JA. Using Multivariate Sequential Patterns to Improve Survival Prediction in Intensive Care Burn Unit. Springer International Publishing, 2015.
[33].Batal I, Valizadegan H, Cooper GF and Hauskrecht M. A pattern mining approach for classifying multivariate temporal data. IEEE, 2011.
[34].Konias S, Giaglis GD, Gogou G, Bamidis PD and Maglaveras N. Uncertainty rule generation on a home care database of heart failure patients. 2003.
[35].Ordonez C, Omiecinski E, De Braal L, Santana CA, Ezquerra N, Taboada JA, Cooke D, Krawczynska E and Garcia EV. Mining constrained association rules to predict heart disease. 2001.
[36].Liu H and Du H. Stock Sequence Pattern Mining Method Based on SWI-GSP Algorithm. ACM, 2017.
[37].Zhang D and Zhou L. Discovering golden nuggets: data mining in financial application. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 34, 4 (2004), 513–522.
[38].Vu HQ, Li G, Law R and Zhang Y. Travel Diaries Analysis by Sequential Rule Mining. Journal of Travel Research (2017), 0047287517692446.
[39].Tiwari S and Tiwari LK. Sequential Rule Mining in M-Learning Domain. International Journal of Computer Applications, 134, 3 (2016), 23–29.
[40].Pei J, Han J, Mortazavi-Asl B, Wang J, Pinto H, Chen Q, Dayal U and Hsu M-C. Mining sequential patterns by pattern-growth: The prefixspan approach. Knowledge and Data Engineering, IEEE Transactions on, 16, 11 (2004), 1424–1440.
[41].Lin M-Y and Lee S-Y. Fast discovery of sequential patterns by memory indexing. Springer, 2002.
[42].Pradhan GN and Prabhakaran B. Association rule mining in multiple, multidimensional time series medical data. Journal of Healthcare Informatics Research, 1, 1 (2017), 92–118.
[43].Vandromme M, Jacques J, Taillard J, Hansske A, Jourdan L and Dhaenens C. Extraction and optimization of classification rules for temporal sequences: Application to hospital data. Knowledge-Based Systems, 122 (2017), 148–158.
[44].Concaro S, Sacchi L, Cerra C, Fratino P and Bellazzi R. Mining health care administrative data with temporal association rules on hybrid events. Methods of information in medicine, 50, 2 (2011), 166–179.
[45].Chaves R, Górriz JM, Ramírez J, Illán IA, Salas-Gonzalez D and Gómez-Río M. Efficient mining of association rules for the early diagnosis of Alzheimer’s disease. Physics in Medicine and Biology, 56, 18 (2011), 6047.
[46].Agrawal R, Imieliński T and Swami A. Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD international conference on Management of data (Washington, D.C., USA, 1993). ACM.
[47].Srikant R and Agrawal R. Mining sequential patterns: Generalizations and performance improvements. Springer, 1996.
[48].Wang J, Han J and Li C. Frequent closed sequence mining without candidate maintenance. Knowledge and Data Engineering, IEEE Transactions on, 19, 8 (2007), 1042–1056.

