Abstract
The transition from ICD-9-CM to ICD-10-CM diagnosis codes has potentially significant implications for quality measurement. Challenges include translating each measure by manual review, and reassessing the validity of each measure. Without this effort, all claims-based quality measures have unknown accuracy, affecting their use by providers, patients, and insurers.
Importance of Quality Measurement
Quality measures assess structures, processes of care, patient experiences, or outcomes that are associated with high quality health care. Use of quality measures is a cornerstone of the US healthcare system, as they are used to identify opportunities for improvement in care, allow consumers to make informed decisions, and influence payment.
The National Quality Forum (NQF) is a private, nonprofit organization tasked by the US Department of Health and Human Services with developing and implementing a national strategy for health care quality measurement. NQF endorses quality measures through a consensus process; endorsed measures must meet rigorous evaluation criteria. NQF endorsement signifies that the measure is important, feasible, valid, and reliable. Among the most rigorous evaluation criteria in the NQF endorsement process is that a measure must be proven to be scientifically acceptable, meaning it must produce credible (valid) and consistent (reliable) results.1 As a result, entities such as the Centers for Medicare and Medicaid Services (CMS) give strong preference to NQF-endorsed measures when considering which measures should be used for federal public reporting and performance-based payment programs.
Many NQF-endorsed quality measures utilize administrative billing claims, as these are readily available, inexpensive to collect, and can contain information on large populations. Currently, there are over 1,000 quality measures endorsed by NQF; over half of these measures are specified using administrative claims. Claims contain diagnosis codes, which are used to classify diseases, illnesses or injuries that result in patient encounters with the healthcare system. Diagnosis codes in claims are often used to identify the population targeted by a quality measure. Worldwide, diagnosis codes are reported using a classification system called the International Classification of Diseases (ICD).
A Major Change in Quality Measurement
On October 1, 2015, a major change with enormous and unrecognized consequences for quality measurement occurred in the US: we switched from reporting diagnosis codes using the 9th version of ICD (ICD-9-CM) to the 10th version of ICD (ICD-10-CM). This transition was implemented to address limitations of ICD-9-CM, such as the limited specificity of its codes (e.g., no distinction between left and right arm injuries). The transition increased the number of diagnosis codes from approximately 14,000 to over 70,000.
As a result of the transition to ICD-10-CM, all existing quality measures using ICD-9-CM codes had to be “translated” to the new ICD-10-CM codes to continue their use. Without translation, existing quality measures would become invalid as of October 2015. Many have erroneously perceived this “translation” to be a simple process. Although challenges in interpreting and using ICD-10-CM data have been discussed, little attention has been paid to the implications for quality measurement.2
A flawed translation process
CMS, the Centers for Disease Control and Prevention (CDC), and NQF have suggested methods to assist in translation. CMS and the CDC created a tool called General Equivalence Mappings (GEMs). According to CMS, GEMs are a “comprehensive translation dictionary that can be used to accurately and effectively translate any ICD-9-CM-based data,” including converting ICD-9-CM codes to ICD-10-CM for quality measures.3 However, quality assessments of the accuracy and completeness of GEMs were not comprehensive or thorough. NQF has suggested a best practice approach to translation, which includes convening clinical and coding experts, determining the intent of the measure, using appropriate conversion tools (which may include GEMs), assessing the measure for material change, and soliciting stakeholder comments.4
Implications of the transition from ICD-9-CM to ICD-10-CM on quality measurement
The transition of diagnosis codes to ICD-10-CM poses two major challenges for every single existing quality measure using diagnosis codes: 1) the effort required to translate the diagnosis codes used within the measure; and 2) the assessment of the scientific acceptability of the measure after translation.
Unfortunately, use of GEMs alone does not result in a comprehensive and accurate translation of the diagnosis codes between the two ICD systems. For example, only 50% of the ICD-9-CM diagnosis codes match directly (1:1) to ICD-10-CM diagnosis codes; 5% of codes are exact matches, while the other 45% are only “approximate” matches, which do not provide the scientific acceptability expected in quality measurement.5 In addition, numerous commercial entities have developed translation tools based on the GEMs, and inconsistencies in accuracy exist across these translation tools.5
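The distinction between exact, approximate, and unmatched codes can be illustrated with a small triage step that separates the codes a crosswalk translates cleanly from those that would still need manual review. This is a minimal sketch under stated assumptions: the `MapEntry` structure, the `triage` function, and the placeholder codes (`ICD9_A`, `ICD10_A`, and so on) are hypothetical and do not reproduce the actual GEM file layout or any real mapping.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MapEntry:
    """One row of a simplified, hypothetical ICD-9-CM to ICD-10-CM crosswalk."""
    icd9: str
    icd10: Optional[str]  # None means the crosswalk offers no target code
    approximate: bool     # True when the match is flagged as approximate


# Toy rows with placeholder codes; not taken from the published GEM files.
CROSSWALK = [
    MapEntry("ICD9_A", "ICD10_A", approximate=False),   # exact 1:1 match
    MapEntry("ICD9_B", "ICD10_B", approximate=True),    # approximate 1:1 match
    MapEntry("ICD9_C", "ICD10_C1", approximate=True),   # one-to-many match
    MapEntry("ICD9_C", "ICD10_C2", approximate=True),
    MapEntry("ICD9_D", None, approximate=False),        # no target code
]


def triage(measure_codes, crosswalk):
    """Split a measure's ICD-9-CM codes into exact matches and review cases."""
    by_source = defaultdict(list)
    for entry in crosswalk:
        by_source[entry.icd9].append(entry)

    result = {"exact": [], "needs_review": []}
    for code in measure_codes:
        entries = by_source.get(code, [])
        targets = [e.icd10 for e in entries if e.icd10 is not None]
        # Only a single, non-approximate target counts as a clean translation.
        is_exact = (
            len(entries) == 1
            and len(targets) == 1
            and not entries[0].approximate
        )
        result["exact" if is_exact else "needs_review"].append((code, targets))
    return result


# ICD9_E is absent from the toy crosswalk, so it also lands in the review list.
triaged = triage(["ICD9_A", "ICD9_B", "ICD9_C", "ICD9_D", "ICD9_E"], CROSSWALK)
```

In this toy example only one of the five codes is translated without human judgment; everything else, including approximate 1:1 matches, is routed to review, which mirrors the argument that a crosswalk alone cannot guarantee scientific acceptability.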
The only certain way to ensure that the ICD-10-CM diagnosis codes generated by any translation tool are comparable to the original ICD-9-CM diagnosis codes is to conduct a time-consuming and tedious manual review of each code, which is consistent with NQF’s best practice approach. For example, the National Committee for Quality Assurance (NCQA), a leading developer of quality measures, implemented a process for translating its measures from ICD-9-CM to ICD-10-CM. The process included GEMs, web searches for additional codes, expert panel review, and a public comment period.6 However, most public, private, or academic systems currently lack the financial resources and time needed to implement this best practice approach and accurately update all existing quality measures. Manual review of each code would take hundreds of hours from both a clinical expert and a measure development expert to make appropriate determinations about which codes should be included in and excluded from each quality measure.7
Given the hundreds of NQF measures that are specified using ICD-9-CM codes, many measures will either forgo translation and lose endorsement due to a lack of usability and relevance,1 or, in a worst-case scenario, will be translated simply using GEMs without the close attention to detail required to ensure continued scientific acceptability. Thus, health systems, health plans, providers, and consumers may be unclear on the accuracy and consistency of any data provided by such NQF-endorsed measures.
Even if all NQF-endorsed measures that use diagnosis codes are translated to ICD-10-CM with accuracy, the manner in which providers are utilizing the ICD-10-CM codes in clinical practice is unknown. Therefore, each measure that uses diagnosis codes must be re-tested for scientific acceptability, irrespective of the quality of translation between the ICD coding systems. Without this additional testing, measures may have questionable scientific acceptability. Even for measure developers such as NCQA, it may be years before enough data are available to untangle the effects of the translation from true changes in performance scores generated by quality measures in the ICD-10 era. Additional research is necessary to characterize the depth and impact of this problem, such as the number of quality measures reliant on conversion algorithms alone, the number validated using ICD-10-CM claims, and the potential costs associated with re-validation of claims-based quality measures.
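As one illustration of what a first re-testing check might look like, the sketch below compares which patients fall into a measure's target population under the original ICD-9-CM specification and under its translated ICD-10-CM specification. It assumes a sample of encounters coded under both systems is available, which is an assumption made here for illustration rather than something described above; the function name, patient identifiers, and summary statistics are all hypothetical, and a real assessment of validity and reliability would involve far more than this comparison.

```python
def population_agreement(in_icd9, in_icd10):
    """Compare target-population membership under two measure specifications.

    Both arguments are sets of hypothetical patient identifiers: patients
    captured by the ICD-9-CM specification and by the translated ICD-10-CM
    specification, respectively.
    """
    both = in_icd9 & in_icd10
    return {
        # Share of the ICD-9-CM population still captured after translation.
        "retained": len(both) / len(in_icd9) if in_icd9 else None,
        # Patients captured only by the translated specification.
        "newly_added": len(in_icd10 - in_icd9),
        # Patients lost by the translated specification.
        "dropped": len(in_icd9 - in_icd10),
    }


# Toy data: four patients per specification, three of them shared.
icd9_population = {"p1", "p2", "p3", "p4"}
icd10_population = {"p2", "p3", "p4", "p5"}
summary = population_agreement(icd9_population, icd10_population)
# summary == {"retained": 0.75, "newly_added": 1, "dropped": 1}
```

A check like this only shows whether the two specifications select the same patients; it cannot show whether a difference stems from the translation itself or from how providers actually use ICD-10-CM codes in practice.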
Moving Forward to Maintain Confidence in Quality Measurement
Major stakeholders in quality measurement, including NQF, CMS, CDC, and health plans, urgently need to identify a path forward. Without a standard mechanism for translating diagnosis codes and reassessing the scientific acceptability of quality measures using ICD-10-CM, the quality agenda in the US will be markedly compromised. The unintended consequences of implementing measures without appropriate testing must be assessed. Otherwise, the already limited confidence that many healthcare providers and payers have in quality measurement will be undermined.
References
- 1. National Quality Forum. Measure Evaluation Criteria. http://www.qualityforum.org/Measuring_Performance/Submitting_Standards/Measure_Evaluation_Criteria.aspx. Accessed 06/29, 2018.
- 2. Khera R, Dorsey KB, Krumholz HM. Transition to the ICD-10 in the United States: An Emerging Data Chasm. JAMA. 2018;320(2):133–134.
- 3. Centers for Medicare and Medicaid Services. General Equivalence Mapping FAQs. 2017; https://www.cms.gov/Medicare/Coding/ICD10/downloads/GEMs-CrosswalksBasicFAQ.pdf. Accessed 06/27, 2018.
- 4. National Quality Forum. ICD-10-CM/PCS Coding Maintenance Operational Guidance. 2010; https://www.qualityforum.org/Publications/2010/10/Coding_full.aspx. Accessed 06/28, 2018.
- 5. Jones L, Nachimson S. Use Caution When Entering the Crosswalk: A Warning About Relying on GEMs as Your ICD-10 Solution. 2014; http://www.cms.org/uploads/ICDLogicGEMSWhitePaper.pdf. Accessed 06/27, 2018.
- 6. National Committee for Quality Assurance. How ICD-10 Codes Affect HEDIS: What You Need To Know. 2015; https://blog.ncqa.org/how-icd-10-codes-affect-hedis-what-you-need-to-know/. Accessed 01/09, 2018.
- 7. Feudtner C, Feinstein JA, Zhong W, Hall M, Dai D. Pediatric complex chronic conditions classification system version 2: updated for ICD-10 and complex medical technology dependence and transplantation. BMC Pediatr. 2014;14(1):199.