AMIA Annual Symposium Proceedings. 2014 Nov 14;2014:1806–1814.

Syndromic surveillance in an ICD-10 world

Achala Jayatilleke 1, Jeffrey Kriseman 1, Lisa H Bastin 1, Umed Ajani 1, Peter Hicks 1
PMCID: PMC4419924  PMID: 25954453

Abstract

The Centers for Disease Control and Prevention’s BioSense program is an integrated national public health surveillance system that uses electronic medical record (EMR) data to provide situational awareness for all-hazard health-related events. Because the system leverages International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) coded data from EMRs for syndromic surveillance, the upcoming Health and Human Services-mandated transition from ICD-9-CM to ICD-10-CM will have a significant impact. To translate across the two encoding systems, we developed a Mapping Reference Table (MRT) for the ICD-9/10 transition. We extracted ICD-9-CM codes binned to predefined syndromes and mapped each to its corresponding ICD-10-CM code(s). Then, we translated the output ICD-10-CM codes back to ICD-9-CM through a reverse translation validation process. Throughout the translation process, we examined outputs manually and incorporated annotated results into the MRT. The resulting MRT can be used to refine and update each existing syndromic surveillance definition in BioSense to be compatible with ICD-10-CM and consistently classify or bin any given emergency department visit into the correct syndrome regardless of coding system.

Introduction

The Public Health Security and Bioterrorism Preparedness and Response Act of 2002 mandated the establishment of public health surveillance systems for early detection and rapid assessment of potential bioterrorism-related illness. As a response to that mandate, the Centers for Disease Control and Prevention (CDC) launched the BioSense Program in 2003 to establish an integrated national public health surveillance system.(1) BioSense began receiving data feeds in 2003 from outpatient clinics maintained by the Department of Veterans Affairs (VA) and Department of Defense (DoD).(2, 3) In 2005, BioSense began to receive data from non-federal civilian hospitals as well. Initially, BioSense defined 11 broad major syndromes (botulism-like, fever, gastrointestinal, hemorrhagic illness, localized cutaneous lesion, lymphadenitis, neurological, respiratory, rash, severe illness or death, and specific infection) and more-specific sub-syndromes based on chief complaint and final diagnosis data.(4) Although BioSense was initially launched as a surveillance system for early event detection and rapid assessment of potential bioterrorism-related illness, over time it has transformed into a public health surveillance system that provides situational awareness for all-hazard health-related events.(5)

CDC started redesigning BioSense in 2010, and the system was launched as BioSense 2.0 in 2012. The latest iteration of BioSense is a streamlined collaborative data-exchange system that enables its users to share health-related data, quickly track adverse or anomalous health issues, and share this information rapidly among other public health jurisdictions participating in the system.(5–7) As of July 2014, BioSense receives data feeds from 3,377 (1,920 non-federal and 1,457 federal) facilities and tracks approximately 130 pre-defined syndromes.(8)

Leveraging data from electronic medical records (EMRs) is a modern cornerstone for public health surveillance activities.(9) BioSense uses EMR data (such as chief complaint and final diagnosis codes) to provide national, regional, and local situational awareness for all-hazard health-related events and to inform a wide range of public health activities. Within the BioSense program, International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM)-coded data are extracted from EMR data provided by participating healthcare facilities.(4,8) These ICD-9-CM codes are binned and classified into syndromes and used to create temporal and geographic indicators and trend-lines for injury, disease, conditions, healthcare utilization, and adverse or anomalous event monitoring.(4,5)

In 2009, the Department of Health and Human Services mandated that a transition from ICD-9 to ICD-10 would occur on October 1, 2014 (now extended to October 1, 2015), for all entities covered under the Health Insurance Portability and Accountability Act (HIPAA) and for healthcare systems that submit reimbursements to the Centers for Medicare and Medicaid Services (CMS) in the United States.(10) Like many other secondary users of final diagnosis-coded data, public health surveillance systems will be significantly affected by this transition. To receive, analyze, interpret, and report upon ICD-9 and ICD-10 encoded data, public health surveillance systems that used ICD-9-CM codes must modify the existing database structure, modify data extraction rules, and create well-defined Mapping Reference Tables (MRT) to bridge the gap across the two encoding systems.

The most challenging aspect of the ICD-9/10 transition for public health surveillance systems will be to develop flexible and standardized solutions to accommodate analysis across two different code sets in 2015 and beyond. In addition, a meaningful baseline must be developed that can be used for benchmarking after the transition. Public health surveillance systems should expect to receive ICD-9-coded data for the first nine months of 2015 and ICD-10-coded data for the last three months of 2015. They should also plan for the possibility of receiving and analyzing a mixture of ICD-9- and ICD-10-coded data for some period after October 1, 2015, which will complicate the situation further. In this context, it is essential to have a method to identify the type of original code received by public health surveillance systems and the capability to map seamlessly across the two code sets.

We carried out this project to describe the process of developing an MRT for ICD-9/10 transition to be used to translate across the code sets and to propose a methodology for a meaningful baseline to be used in the BioSense system after October 1, 2015.

Methods

We extracted ICD-9-CM codes binned to all predefined BioSense syndromes from the list of syndromes developed by the BioSense community, which includes federal, state, and local public health partners.(8) We updated each of the extracted codes to 2014 ICD-9-CM codes by using the Code Translation Tool (CTT) developed by 3M.(11) By using CTT, we translated the resulting ICD-9-CM 2014 codes to ICD-10-CM 2014 codes. Then, we translated the output ICD-10-CM codes back to ICD-9-CM by using a reverse translation validation process to ensure that the appropriate codes were correctly identified at the onset of the translation process. Throughout the translation process, an epidemiologist and a clinician manually examined each output and incorporated annotated results into the MRT. We used ICD-9 and ICD-10 complete official code sets for the manual process.(12, 13)
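
The forward and reverse translation check described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the two lookup dictionaries are small excerpts assembled from codes shown in Tables 1 and 4, not output from the 3M CTT, and the function name reverse_validate is hypothetical.

```python
from collections import defaultdict

# Illustrative excerpts only (assumed from Tables 1 and 4), not CTT output.
FORWARD = {  # ICD-9-CM 2014 -> ICD-10-CM
    "493.00": ["J45.20", "J45.30", "J45.40", "J45.50"],
    "493.20": ["J44.9"],
}
REVERSE = {  # ICD-10-CM -> ICD-9-CM
    "J45.20": ["493.00"],
    "J45.30": ["493.00"],
    "J45.40": ["493.00"],
    "J45.50": ["493.00"],
    "J44.9": ["493.20", "491.20", "496"],
}


def reverse_validate(seed_icd9, forward, reverse):
    """Return ICD-9-CM codes that reverse translation surfaces but that
    were not in the seed set, keyed to the ICD-10-CM codes that led to them."""
    extras = defaultdict(set)
    for icd9 in seed_icd9:
        for icd10 in forward.get(icd9, []):
            for back in reverse.get(icd10, []):
                if back not in seed_icd9:
                    extras[back].add(icd10)
    return {code: sorted(sources) for code, sources in extras.items()}


# Codes flagged here would go to manual review before entering the MRT.
flagged = reverse_validate({"493.00", "493.20"}, FORWARD, REVERSE)
print(flagged)
```

In this sketch, any ICD-9-CM code that appears only on the way back is a candidate for the manual review step rather than an automatic addition to the syndrome.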

By using the resulting MRT, we computed ICD-9-CM codes for use cases of ICD-10-CM codes and vice versa. Computed ICD-9-CM codes were then binned to existing BioSense syndromes. Similarly, we attempted to bin computed ICD-10-CM codes to the existing BioSense syndromes. During this process, we also developed an algorithm that can identify whether the original code belongs to ICD-9-CM or ICD-10-CM. In addition, we used the resulting MRT to refine and update each of the existing syndromic surveillance definitions used in the BioSense program to be ICD-10-CM compatible.

Results

The output MRT comprises four columns: 1) the original ICD-9-CM codes, 2) the updated 2014 ICD-9-CM codes, 3) the corresponding ICD-10-CM codes, and 4) a value indicating the level of confidence for each individual mapping with regard to a specific syndrome. Table 1 shows a section of the MRT for the asthma syndrome as an example. We indicated the level of confidence by using four values (Table 2).

Table 1.

Mapping Reference Table (MRT) for selected codes of asthma syndrome

Original ICD-9 CM | 2014 ICD-9 CM | Corresponding ICD-10 CM | Level of Confidence
493 | 493.00 | J45.20 | A
493 | 493.00 | J45.30 | A
493 | 493.00 | J45.40 | A
493 | 493.00 | J45.50 | A
493 | 493.01 | J45.22 | A
493 | 493.01 | J45.32 | A
493 | 493.01 | J45.42 | A
493 | 493.01 | J45.52 | A
493 | 493.90 | J45.909 | A
493 | 493.90 | J45.998 | A
493 | 493.91 | J45.902 | A
493 | 493.92 | J45.901 | A
493 | 493.81 | J45.990 | A
493 | 493.82 | J45.991 | A
 | 491.20* | J44.9 | D
 | 491.21* | J44.1 | D
 | 491.22* | J44.0 | D
 | 496* | J44.9 | D

* Resulted from reverse translation process

Table 2.

Level of confidence

Level of Confidence | Description
A | Consists of codes that reflect general symptoms of the syndrome group and also include codes for the bioterrorism diseases of highest concern or those diseases highly approximating them.
B | Consists of codes that might normally be placed in the syndrome group, but daily volume could overwhelm or otherwise detract from the signal generated from the Category 1 code set alone.
C | Consists of codes that might normally be placed in the syndrome group, but daily volume could overwhelm or otherwise detract from the signal generated from the Category 1 code set alone.
D | Not a match

In some instances where a single ICD-9-CM code is insufficient to represent the relevant ICD-10-CM codes, we introduced an additional column in the MRT named Supplementary ICD-9-CM Code. Similarly, we introduced supplementary ICD-10-CM codes where a single ICD-10-CM code is insufficient to represent some concepts represented by ICD-9-CM.
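
One way to picture the MRT layout is as a list of records, one per mapping. The Python sketch below is a hypothetical representation, not the authors' implementation: the class name MrtRow, its field names, and the helper icd10_for are invented for illustration, the rows are a few entries copied from Table 1, and excluding level D ("Not a match") during lookup is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MrtRow:
    """One MRT mapping; field names are hypothetical."""
    original_icd9: Optional[str]   # None for rows found by reverse translation (assumed)
    icd9_2014: str
    icd10: str
    confidence: str                # "A", "B", "C", or "D" (Table 2)
    supplementary_icd9: Optional[str] = None
    supplementary_icd10: Optional[str] = None


# A few rows copied from Table 1.
MRT = [
    MrtRow("493", "493.00", "J45.20", "A"),
    MrtRow("493", "493.90", "J45.909", "A"),
    MrtRow("493", "493.90", "J45.998", "A"),
    MrtRow(None, "491.20", "J44.9", "D"),
]


def icd10_for(icd9_2014, rows, accepted=("A", "B", "C")):
    """Look up ICD-10-CM codes for a 2014 ICD-9-CM code, keeping only
    rows whose confidence level is in `accepted` (an assumed policy)."""
    return [r.icd10 for r in rows
            if r.icd9_2014 == icd9_2014 and r.confidence in accepted]


print(icd10_for("493.90", MRT))
print(icd10_for("491.20", MRT))
```

The two optional supplementary fields stay empty unless a single code on one side is not enough to represent the concept on the other.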

Table 3 shows the level of confidence of ICD-10-CM codes for five selected syndromes. Figure 1 illustrates the number of ICD-9-CM and ICD-10-CM codes binned to five selected syndromes.

Table 3.

Level of confidence of translated ICD-10 codes for selected syndromes

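Syndrome | Level of Confidence | Number of ICD-10 codes | %
Asthma | A | 33 | 89.2
Asthma | B | 0 | 0.0
Asthma | C | 0 | 0.0
Asthma | D | 4 | 10.8
Diphtheria | A | 10 | 62.5
Diphtheria | B | 6 | 37.5
Diphtheria | C | 0 | 0.0
Diphtheria | D | 0 | 0.0
Lymphadenitis | A | 11 | 36.7
Lymphadenitis | B | 0 | 0.0
Lymphadenitis | C | 3 | 10.0
Lymphadenitis | D | 16 | 53.3
Anthrax | A | 7 | 100.0
Anthrax | B | 0 | 0.0
Anthrax | C | 0 | 0.0
Anthrax | D | 0 | 0.0
Convulsions | A | 4 | 100.0
Convulsions | B | 0 | 0.0
Convulsions | C | 0 | 0.0
Convulsions | D | 0 | 0.0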

Figure 1.

Number of ICD-9-CM and ICD-10-CM codes per selected syndromes

Leveraging the MRT, we computed ICD-9-CM and ICD-10-CM codes for hypothetical use cases and binned them to the existing BioSense syndromes. Table 4 shows the computed codes and binning of codes to the syndromes asthma and chronic obstructive pulmonary disease (COPD).

Table 4.

Binning of computed codes to asthma and chronic obstructive pulmonary disease (COPD) syndromes

Visit ID | Original code [A] | Code type | Computed ICD-10 code [B] | Computed ICD-9 code [C] | ICD-9 [A+C] | ICD-10 [A+B] | Syndrome ICD-9 | Syndrome ICD-10
XXX1 | 493.00 | ICD-9 | J45.20 | | 493.00 | J45.20 | Asthma* | Asthma
XXX2 | 493.20 | ICD-9 | J44.9 | | 493.20 | J44.9 | Asthma | COPD
XXX3 | J45.909 | ICD-10 | | 493.90 | 493.90 | J45.909 | Asthma | Asthma
XXX4 | J44.9 | ICD-10 | | 493.20 | 493.20 | J44.9 | Asthma | COPD
XXX5 | J44.9 | ICD-10 | | 491.21 | 491.21 | J44.9 | COPD | COPD
XXX6 | 493.90 | ICD-9 | J45.998 | | 493.90 | J45.998 | Asthma | Asthma
XXX7 | J44.9 | ICD-10 | | 496 | 496 | J44.9 | COPD | COPD
XXX8 | J44.0 | ICD-10 | | 493.21 | 493.21 | J44.0 | Asthma | COPD
XXX9 | J44.0 | ICD-10 | | 491.20 | 491.20 | J44.0 | COPD | COPD

* ICD-9 CM codes included in Asthma syndrome definition - 493

ICD-9 CM codes included in COPD syndrome definition - 491, 492
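
The binning shown in Table 4 can be approximated with a simple prefix match, sketched below in Python. The ICD-9-CM prefixes come from the table footnotes; the ICD-10-CM prefixes (J45 for asthma, J44 for COPD) are an assumption inferred from the table rows, and the function names are hypothetical.

```python
# ICD-9-CM prefixes are taken from the Table 4 footnotes.
ICD9_SYNDROMES = {"Asthma": ("493",), "COPD": ("491", "492")}
# ICD-10-CM prefixes are assumed from the rows of Table 4.
ICD10_SYNDROMES = {"Asthma": ("J45",), "COPD": ("J44",)}


def bin_code(code, definitions):
    """Return the first syndrome whose prefixes match the code, else None."""
    for syndrome, prefixes in definitions.items():
        if code.startswith(prefixes):
            return syndrome
    return None


def bin_visit(icd9, icd10):
    """Bin one visit under both code sets: (ICD-9 syndrome, ICD-10 syndrome)."""
    return bin_code(icd9, ICD9_SYNDROMES), bin_code(icd10, ICD10_SYNDROMES)


print(bin_visit("493.00", "J45.20"))  # codes from visit XXX1
print(bin_visit("493.20", "J44.9"))   # codes from visit XXX2
```

With these assumed definitions, the codes for visit XXX2 land in asthma under ICD-9-CM but in COPD under ICD-10-CM, which is the kind of divergence the table illustrates.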

Figure 2 shows the algorithm we used to identify the type of original code. This algorithm describes how an ICD-9-CM diagnosis code can be differentiated from that of ICD-10-CM.

Figure 2.

Algorithm to differentiate ICD-9 and ICD-10 codes
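
Because the figure itself is not reproduced here, the following Python sketch is not the authors' algorithm. It shows one plausible way to separate the two code sets by format alone, assuming codes arrive with their decimal point. The regular expressions reflect the general published shapes of ICD-9-CM and ICD-10-CM codes, and the function name code_type is hypothetical.

```python
import re

# ICD-9-CM shapes: numeric (493.00), V codes (V10.0), E codes (E800.1).
ICD9_PATTERNS = [
    re.compile(r"^\d{3}(\.\d{1,2})?$"),
    re.compile(r"^V\d{2}(\.\d{1,2})?$"),
    re.compile(r"^E\d{3}(\.\d)?$"),
]
# ICD-10-CM shape: letter, digit, alphanumeric, then up to four more characters.
ICD10_PATTERN = re.compile(r"^[A-Z]\d[0-9A-Z](\.[0-9A-Z]{1,4})?$")


def code_type(raw):
    """Classify a code as 'ICD-9', 'ICD-10', 'ambiguous', or 'unknown'
    using its format only."""
    code = raw.strip().upper()
    is_icd9 = any(p.match(code) for p in ICD9_PATTERNS)
    is_icd10 = bool(ICD10_PATTERN.match(code))
    if is_icd9 and is_icd10:
        return "ambiguous"  # needs extra information, e.g. visit date
    if is_icd9:
        return "ICD-9"
    if is_icd10:
        return "ICD-10"
    return "unknown"


for sample in ["493.00", "J45.909", "E800.1", "E11.9", "V10.0"]:
    print(sample, code_type(sample))
```

Format alone cannot settle codes such as V10.0, which fit both shapes, so the sketch reports them as ambiguous instead of guessing.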

Discussion

The ICD-9-CM to ICD-10-CM transition will have a significant impact on the BioSense program. Due to the complexity and higher level of specificity in ICD-10-CM codes, ICD-9-CM codes pertinent to syndromes within BioSense cannot be automatically translated into ICD-10-CM codes. Existing translation tools are insufficient and frequently provide results that are either inaccurate or incompatible with the syndromic surveillance concept under review. Therefore, we developed an MRT to be used by the BioSense program with manual expert review and input based on General Equivalence Mappings (GEMs) files prepared by CMS.(14)

By leveraging the developed MRT, ICD-10-CM codes can be down-coded and translated into ICD-9-CM codes. The resulting ICD-9-CM codes can be binned to existing BioSense syndromes. Similarly, ICD-9-CM codes can be up-coded and translated to ICD-10-CM by using the MRT. The resulting ICD-10-CM codes can then be incorporated into new syndromic surveillance visit (case) definitions for the overall BioSense system or appended to a given record so that the event can be analyzed in either code set. The developed MRT can be used to refine and update each of the existing syndromic surveillance definitions used in the BioSense program to be ICD-10-CM compatible. However, due to the higher specificity of ICD-10-CM codes, existing BioSense 2.0 syndromes should be reviewed based on ICD-10-CM codes to achieve syndromic surveillance objectives more effectively.

Syndromic surveillance data are of limited use without a reliable and interpretable referential baseline. The BioSense program uses the past two years of data to define a referential baseline. However, after October 1, 2015, it will be impossible to compute a referential baseline unless there is a platform to accommodate both ICD-9-CM and ICD-10-CM codes. Referential baselines can be computed for both ICD-9 and ICD-10 code sets by using our developed MRT. These calculated referential baselines can be used until all healthcare facilities providing data to the BioSense program fully transition to the ICD-10 coding system or until a complete two-year referential baseline is computed (October 2017).
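
As a rough illustration of a baseline that spans the transition, the Python sketch below converts ICD-10-CM codes to ICD-9-CM through a small lookup and then counts daily visits for one syndrome. Everything here is an assumption made for illustration: the lookup is a three-row excerpt based on Tables 1 and 4, the visits are invented, the mean daily count is only a stand-in statistic (the source does not state how the baseline is calculated), and days with no matching visits are left out of the mean.

```python
from collections import Counter
from datetime import date
from statistics import mean

# Hypothetical MRT excerpt (ICD-10-CM -> ICD-9-CM), based on Tables 1 and 4.
ICD10_TO_ICD9 = {"J45.909": "493.90", "J45.20": "493.00", "J44.1": "491.21"}


def to_icd9(code):
    """Treat codes starting with a digit as ICD-9-CM already (this ignores
    E and V codes); otherwise look the code up in the MRT excerpt."""
    return code if code[0].isdigit() else ICD10_TO_ICD9.get(code)


def daily_counts(visits, icd9_prefix):
    """Count visits per day whose ICD-9-CM equivalent starts with the prefix."""
    counts = Counter()
    for day, code in visits:
        icd9 = to_icd9(code)
        if icd9 and icd9.startswith(icd9_prefix):
            counts[day] += 1
    return counts


# Invented visits straddling the October 1, 2015 transition date.
visits = [
    (date(2015, 9, 30), "493.00"),
    (date(2015, 9, 30), "493.90"),
    (date(2015, 10, 1), "J45.909"),
    (date(2015, 10, 1), "J44.1"),
]
asthma_counts = daily_counts(visits, "493")
baseline = mean(asthma_counts.values())
print(dict(asthma_counts), baseline)
```

The same approach could be run in the other direction, up-coding historical ICD-9-CM data to produce an ICD-10-CM-based series.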

After October 1, 2015, public health surveillance systems may receive a mixture of both ICD-9 and ICD-10 codes. Because ICD-10-CM diagnosis codes begin with an alpha character (e.g., J45.30, J45.20), a majority of ICD-10-CM codes can be differentiated from ICD-9-CM codes that begin with a numeric character (e.g., 493.0, 493.1). However, some ICD-9-CM codes also start with an alpha character (E and V) and resemble ICD-10-CM codes.(15) To address this issue, we developed an algorithm for this project that can be used to validate and differentiate ICD-10 codes from ICD-9 codes after the transition. However, on some occasions where ICD-10 codes are identical to ICD-9 codes, additional information will be required for the differentiation.

The lack of ICD-10-CM-based data to test the developed MRT is a limitation of this project. Currently, we are developing a synthetic ICD-10-CM data set by using probability theories based on historical ICD-9-CM data. This data set can be used to test the developed MRT several months before the real transition occurs.

In conclusion, the developed MRT can be used as a common language and reference to receive, integrate, and classify healthcare visits to syndromic surveillance definitions and develop computed referential baseline data regardless of ICD code set. However, to receive the benefit of the more specific ICD-10 codes, existing BioSense syndrome definitions should be revised based on ICD-10-CM codes to achieve syndromic surveillance objectives more effectively. The MRT and the code set identification algorithm can be leveraged beyond the BioSense program and will be made available to state and local partners for potential inclusion in their specific syndromic surveillance systems.

Footnotes

Disclaimer

The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention (CDC).
