AMIA Annual Symposium Proceedings. 2020 Mar 4;2019:804–811.

Have ICD-10 Coding Practices Changed Since 2015?

Srikanth Sivashankaran 1, John P Borsi 1, Amanda Yoho 1
PMCID: PMC7153097  PMID: 32308876

Abstract

Usage of ICD-10 codes in administrative data has continued to shift since mandatory adoption in 2015. Identifying changing patterns in coding behavior is imperative in producing reliable analyses and robust conclusions. We examined the granularity of ICD-10 coding over time in a cohort selected from the IBM Explorys Therapeutic Dataset, which contains the records of over 60 million patients. Our seasonality-aware trend model identified patterns of interest, such as increased use of laterality codes for pain and increased use of codes denoting concepts novel to ICD-10 for screening encounters. Those relying on these codes should adjust for these ‘learning curve’ effects. This work should be extended to additional modalities of terminology usage and represents a starting point for researchers working with dynamic clinical ontologies.

Introduction

The Tenth Revision of the International Classification of Diseases (ICD-10) represented a “thorough rethinking” of the World Health Organization (WHO) terminology1. Specifically, the clinical modification of ICD-10 (ICD-10-CM) in use in the United States included a four-fold increase in the number of available codes over the ICD-9 version2, intended to dramatically increase the applicability, specificity, and usability of the code system3,4. The migration to ICD-10 in the US proved controversial: on the one hand, the conversion to ICD-10 promised improved understanding of patient outcomes5, reduced fraud and abuse6, and increased extensibility7. On the other, the increase in complexity and structural changes threatened to incur implementation costs and cause headaches for practitioners8-10.

We should expect the disruption surrounding the 2015 transition to have an observable impact on the resulting ICD-coded data. A 2001 report from the CDC highlighted the statistical disruption produced by ICD-10 coding for mortality data11, and reviews from the 1998-99 Australian12 and 2002-07 Canadian13 ICD-10 transitions showed noticeable changes in coding volume and prevalence of some disease categories. Indeed, US studies in the wake of the cutover have shown under- and over-reporting of important clinical categories. Substance abuse14, self-harm15, HIV and Alzheimer’s16 coding were subject to significant alterations in perceived population prevalence across the ICD-10 boundary.

The impact of the conversion to ICD-10 is not limited to the immediate transition period, however. In the wake of Switzerland’s 1998 ICD cutover, for example, ICD usage continued to shift for years as coders acclimated to the new system17. The idea of a coding ‘learning curve’ is supported both by anecdotal and experimental evidence. A South African controlled study demonstrated that ICD coding behavior can noticeably shift as a result of training and familiarization18. Early reviews of US ICD coding post-cutover show that we might expect a significant learning curve19,20; as of 2016 there remained significant opportunities to improve the accuracy and relevance of coding.

Although learning curve effects may contribute substantially to coding variation immediately following a transition, there are other factors that might impact coding patterns over time. Coders may respond to risk adjusted payment incentives21 or different auditing regimes20. There are potentially any number of systemic factors that could have subtle influences on subsequent data analyses. Understanding and revalidating ICD-10 coding patterns is an important step towards having confidence in analyses using ICD-coded data moving forward22.

We provide a unified structure for assessing temporal coding patterns over the years 2015-2018 and present results from applying it to a large administrative data set. We examine both seasonal and long-run trends for related families of codes. For this work, we focus primarily on codes which were newly introduced in ICD-10, to best capture the relevant shifts in usage following the ICD-10 transition period.

Data and Methods

We employed the General Equivalence Mappings (GEMs)23 from the Centers for Medicare & Medicaid Services (CMS) to identify novel concepts in ICD-10-CM. These bidirectional mappings between ICD-9-CM and ICD-10-CM are accompanied by a series of flags, one of which indicates whether the mapping between concepts in the two terminologies is approximate. We identified novel ICD-10 codes by applying all of the following conditions in the ICD-10 to ICD-9 segment of the diagnosis terminology GEMs:

  1. The “approximate” flag had the value TRUE

  2. The ICD-10 term did not describe a “not elsewhere classified” or “other” concept

  3. No other ICD-9 term mapped to the ICD-10 term under consideration

  4. The ICD-9 term in the GEM entry described an “other” concept, that is, a catch-all for terms not explicitly enumerated in that terminology

These criteria identify, for example, the ICD-10 code for Salmonella pyelonephritis (A02.25) as describing a concept not listed in ICD-9:

ICD-10 Code | ICD-10 Description | Approximate mapping? | ICD-9 Code | ICD-9 Description
A02.25 | Salmonella pyelonephritis | TRUE | 003.29 | Other localized salmonella infections
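The four conditions above can be sketched as a simple filter. This is a minimal illustration, not the procedure actually used in the study: the row layout (GEM entries already joined with code descriptions), the keyword matching used to recognize “other” concepts, and the code X00.0 are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Assumption: "other"-type concepts are recognized by keyword matching on descriptions.
OTHER_PATTERN = re.compile(r"\bother\b|not elsewhere classified", re.IGNORECASE)

# Hypothetical, simplified ICD-10 -> ICD-9 GEM rows, pre-joined with descriptions.
# X00.0 is a made-up code used only to illustrate condition 3.
GEM_ROWS = [
    {"icd10": "A02.25", "icd10_desc": "Salmonella pyelonephritis", "approximate": True,
     "icd9": "003.29", "icd9_desc": "Other localized salmonella infections"},
    {"icd10": "A02.21", "icd10_desc": "Salmonella meningitis", "approximate": False,
     "icd9": "003.21", "icd9_desc": "Salmonella meningitis"},
    {"icd10": "A02.29", "icd10_desc": "Salmonella with other localized infection",
     "approximate": True, "icd9": "003.29",
     "icd9_desc": "Other localized salmonella infections"},
    {"icd10": "X00.0", "icd10_desc": "Hypothetical condition", "approximate": True,
     "icd9": "999.1", "icd9_desc": "Other hypothetical conditions"},
    {"icd10": "X00.0", "icd10_desc": "Hypothetical condition", "approximate": True,
     "icd9": "999.2", "icd9_desc": "Other hypothetical conditions, second kind"},
]


def is_other_concept(description):
    """Return True if the description looks like an 'other' or NEC catch-all."""
    return bool(OTHER_PATTERN.search(description))


def find_novel_codes(gem_rows):
    """Apply the four novelty conditions to ICD-10 -> ICD-9 GEM rows."""
    rows_by_icd10 = defaultdict(list)
    for row in gem_rows:
        rows_by_icd10[row["icd10"]].append(row)

    novel = []
    for code, rows in rows_by_icd10.items():
        if len(rows) != 1:  # condition 3: exactly one ICD-9 term is mapped
            continue
        row = rows[0]
        if (row["approximate"]                                  # condition 1
                and not is_other_concept(row["icd10_desc"])    # condition 2
                and is_other_concept(row["icd9_desc"])):       # condition 4
            novel.append(code)
    return sorted(novel)


print(find_novel_codes(GEM_ROWS))  # ['A02.25']
```

Keyword matching on descriptions is the most fragile part of this sketch; a production filter would likely need a curated list of catch-all terms.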

In order to provide a baseline against which to compare the use of novel codes, we organized the concept space of ICD-10 into groups consisting of codes sharing a header. Headers in ICD-10 are grouping artifacts defined by CMS as “not valid” but “included as a convenience for other uses”, to capture related sets of terms24. For this study, we considered every concept group meeting the following conditions:

  1. The group contained an ICD-10 term identified in the previous step

  2. The membership of the group remained constant over the study period (fiscal years 2016, 2017, and 2018 to align with the ICD code update cycle)

  3. The group contained at least one term describing an “other”, “unspecified”, or “not elsewhere classified” concept (one to be used where the medical record lacks adequate granularity or describes a condition not otherwise enumerated) against which to compare the novel ICD-10 term(s)

These criteria situate the novel ICD-10 concept of Salmonella pyelonephritis, for example, within the following concept group:

Header | Header Description | Code | Code Description | Flag
A02.2 | Localized salmonella infections | A02.20 | Localized salmonella infection, unspecified | unspecified
A02.2 | Localized salmonella infections | A02.21 | Salmonella meningitis | non-novel
A02.2 | Localized salmonella infections | A02.22 | Salmonella pneumonia | non-novel
A02.2 | Localized salmonella infections | A02.23 | Salmonella arthritis | non-novel
A02.2 | Localized salmonella infections | A02.24 | Salmonella osteomyelitis | non-novel
A02.2 | Localized salmonella infections | A02.25 | Salmonella pyelonephritis | novel
A02.2 | Localized salmonella infections | A02.29 | Salmonella with other localized infection | other
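The group-selection conditions can be sketched in the same spirit. The sketch below assumes that a code's header is obtained by dropping its final character, that “other”-type baseline terms can be found by keyword matching, and that the valid code set for each fiscal year is available as a Python set; the X00.1 group is invented to show a group whose membership changes.

```python
import re

# Assumption: baseline ("other" / "unspecified" / NEC) terms are found by keyword.
BASELINE_PATTERN = re.compile(
    r"\bother\b|unspecified|not elsewhere classified", re.IGNORECASE)


def header_of(code):
    """Assumed convention: the header is the code minus its last character."""
    return code[:-1].rstrip(".")


def group_by_header(codes):
    groups = {}
    for code in codes:
        groups.setdefault(header_of(code), set()).add(code)
    return groups


def select_groups(codes_by_year, descriptions, novel_codes):
    """Keep header groups that are stable across years, contain a novel code,
    and contain at least one baseline term."""
    years = sorted(codes_by_year)
    groups_by_year = {year: group_by_header(codes_by_year[year]) for year in years}
    selected = {}
    for header, members in groups_by_year[years[0]].items():
        stable = all(groups_by_year[y].get(header) == members for y in years[1:])
        has_novel = bool(members & novel_codes)
        has_baseline = any(BASELINE_PATTERN.search(descriptions[c]) for c in members)
        if stable and has_novel and has_baseline:
            selected[header] = sorted(members)
    return selected


SALMONELLA = {"A02.20", "A02.21", "A02.22", "A02.23", "A02.24", "A02.25", "A02.29"}
DESCRIPTIONS = {
    "A02.20": "Localized salmonella infection, unspecified",
    "A02.21": "Salmonella meningitis",
    "A02.22": "Salmonella pneumonia",
    "A02.23": "Salmonella arthritis",
    "A02.24": "Salmonella osteomyelitis",
    "A02.25": "Salmonella pyelonephritis",
    "A02.29": "Salmonella with other localized infection",
    "X00.10": "Hypothetical condition, unspecified",   # made-up group
    "X00.11": "Hypothetical condition, specific site",
}
CODES_BY_YEAR = {
    2016: SALMONELLA | {"X00.10", "X00.11"},
    2017: SALMONELLA | {"X00.10", "X00.11"},
    2018: SALMONELLA | {"X00.10", "X00.11", "X00.12"},  # membership changes
}

print(select_groups(CODES_BY_YEAR, DESCRIPTIONS, {"A02.25", "X00.11"}))
```

Only the A02.2 group survives here: the hypothetical X00.1 group contains a novel code and an unspecified term but gains a member in fiscal year 2018.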

We then obtained unique patient counts per month for each ICD-10 code in the selected groups over the fiscal years 2016, 2017, and 2018 from the IBM Explorys Therapeutic Dataset25. These data comprise the clinical and claims records of over 60 million unique patients from over 39 healthcare systems; see Table 1 for demographic information about the study cohort selected from those patients. In order to more precisely consider the behavior of professional medical coders, we limited the study to administrative data alone. To ensure sufficient sample size, we limited analyses to the 20 header groups with the greatest overall patient counts.

Table 1.

Cohort demographic summary. IQR stands for interquartile range.

Total Individuals [Count] | 7,893,000

Sex [Count (%)]
Female | 4,486,000 (56.8%)
Male | 3,406,000 (43.1%)
Other/unidentified | 2,000 (0.1%)

Ethnicity [Count (%)]
Non-Hispanic | 6,139,000 (77.8%)
Hispanic | 626,000 (7.9%)
Declined | 394,000 (5.0%)
Other/unidentified | 734,000 (9.3%)

Race [Count (%)]
White | 5,878,000 (74.5%)
Black | 803,000 (10.2%)
Asian | 214,000 (2.7%)
Other/unidentified | 729,000 (9.2%)
Declined | 269,000 (3.4%)

Other [Median (IQR)]
Age | 70 (58-79)
Years of follow-up | 7 (2-13)

In order to surface patterns in the assignment of ICD-10 codes over time, we measured the difference between novel codes and other codes. We decomposed the observed variation in these differences into level, trend, and seasonality components using the R forecast package, which selects for each component the exponential smoothing method that minimizes Akaike’s Information Criterion (AIC) for the model as a whole26. We assessed the explanatory significance of each AIC-minimizing trend or seasonality component by comparing the fit against that of a model with no exponential smoothing for that component, holding all other methods identical. We also calculated the seasonal and non-seasonal variants of the Mann-Kendall test for trend, whose tau statistic provides a guideline as to the direction of trend where applicable27. For consistency between groups, we measured patient counts relative to all patients with any diagnosis record in the filtered dataset.
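As an illustration of the non-seasonal Mann-Kendall test, the sketch below builds a monthly difference series (novel minus baseline patient counts, relative to all patients) and computes the S statistic, Kendall's tau, and a normal-approximation z score. It is written in Python rather than R, omits the correction for tied values, and uses invented counts; it does not reproduce the exponential smoothing model fitted with the forecast package.

```python
import math


def share_difference(novel_counts, baseline_counts, total_patients):
    """Monthly (novel - baseline) patient counts, relative to all patients."""
    return [(n - b) / t
            for n, b, t in zip(novel_counts, baseline_counts, total_patients)]


def mann_kendall(series):
    """Return (S, tau, z) for the non-seasonal Mann-Kendall trend test.

    Simplification: the variance of S is not corrected for tied values.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    tau = s / (n * (n - 1) / 2)
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, tau, z


# Invented monthly counts for one header group.
novel = [300, 320, 340, 370, 390, 420]
baseline = [500, 490, 470, 460, 440, 430]
total = [10000] * 6

series = share_difference(novel, baseline, total)
s, tau, z = mann_kendall(series)
print(s, tau)  # 15 1.0
```

A positive tau indicates that the novel codes gained ground on the baseline codes over the period, and a negative tau indicates the reverse.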

Results

We identified 11 statistically significant coding trends that were deemed to be of practical interest. These trends are summarized in Table 2. Each of the novel ICD-10 codes identified in the reported trends can be categorized in one of two ways. First, novel codes may relate specifically to body site laterality – left knee or right shoulder, for example. Alternatively, novel codes may represent an increase in granularity over similar ICD-9 codes.

Table 2.

Summary of coding trends, by novel ICD-10 code and corresponding header. Code type describes our categorization of what makes this code distinct from related ICD-9 codes. Percent change is the absolute change in the relative patient share of the novel code(s) over the entire study period.

ICD-10 Code | Code Description | Header | Header Description | Novel Code Type | Δ %
E11.65 | Type 2 DM with hyperglycemia | E11.6 | Type 2 DM with complication | Granular | -10.8%
Z79.891 | Long term use of opiate analgesic | Z79.81 | Long-term drug therapy | Granular | -2.8%
I25.5 | Ischemic cardiomyopathy | I25 | Chronic heart disease | Granular | +5.8%
Z12.31 | Mammogram for breast cancer | Z12.3 | Breast cancer screening | Granular | +9.9%
R09.81 | Nasal congestion | R09.8 | Respiratory symptoms | Granular | +7.2%
M25.51X | Pain in right/left shoulder | M25.51 | Pain in shoulder | Laterality | +8.4%
M25.55X | Pain in right/left hip | M25.55 | Pain in hip | Laterality | +8.9%
M25.56X | Pain in right/left knee | M25.56 | Pain in knee | Laterality | +7.0%
M79.60X | Pain in right/left leg | M79.60 | Pain in limb | Laterality | +9.7%
M17.1X | Osteoarthritis of right/left knee | M17.1 | Osteoarthritis | Laterality | -3.7%
D50.1 | Sideropenic dysphagia | D50 | Iron deficiency anemia | Other/unspec. | -
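One plausible reading of the percent change column is a difference in percentage points between the novel code's share of its header group at the start and at the end of the study period. The sketch below encodes that reading with invented counts; whether the published figures were computed from raw first and last months or from fitted trend values is not stated, so the endpoints used here are an assumption.

```python
def share_change_points(novel_counts, group_totals):
    """Change in the novel code's share of its group, in percentage points,
    between the first and last observations (an assumed definition)."""
    first_share = novel_counts[0] / group_totals[0]
    last_share = novel_counts[-1] / group_totals[-1]
    return 100.0 * (last_share - first_share)


# Invented example: the novel code's share rises from 40% to 50% of the group.
change = share_change_points([400, 450, 500], [1000, 1000, 1000])
print(round(change, 1))  # 10.0
```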

Although this work is primarily focused on changes in the usage of novel ICD-10 codes, our modelling approach also detected tradeoffs between non-novel codes closely related to codes of interest. In Table 2, we labelled this coding pattern ‘other/unspec.’ to indicate that the calculated statistics typically reflect differential usage of the other and unspecified code types, rather than changes in usage of the new, more specific codes. For example, within the header grouping D50, ‘Iron deficiency anemia,’ the new, more granular code of interest is D50.1, ‘Sideropenic dysphagia.’ Despite the relative consistency of D50.1 coding over the study period, there was a significant alteration in the comparison baseline of D50.8 and D50.9 (other anemia and unspecified anemia, respectively). We present this trend because decreased usage of ‘unspecified’ codes in favor of ‘other’ codes may indicate an area where the clinical content coverage of ICD-10 is lacking28.

One of the most common types of coding pattern identified through our analysis was the increased use of laterality codes over time. Figure 1 provides a clear example of reduced usage of unspecified laterality over time for the code header M25.51, pain in shoulder. This increased uptake of laterality codes is most pronounced for pain-related codes; we observed increases in laterality-specific coding within the code groups describing pain in the shoulder, hip, knee, and limbs. We also noted a decrease in usage for M17, osteoarthritis of the knee.

Figure 1.

Use of laterality codes for M25.51 header (pain in shoulder)

Generally, the novel codes for laterality behaved similarly to one another over the study period. There was more diversity among the trends identified for novel codes that were more specific than their ICD-9 counterparts. Of the five identified trends involving more granular ICD-10 codes, three were increases in usage of the novel codes and two were decreases.

We also identified certain trends that had a dramatic seasonality to the relative usage of novel ICD-10 codes. An example of a highly-seasonal trend is provided in Figure 2. For the most part, our analysis is not susceptible to the underlying seasonal trends of disease and healthcare utilization because we are primarily interested in the relationships among codes that share similar underlying temporal attributes. However, in cases where there are clearly different month-to-month behaviors for the different codes under a single header, our analysis will include artifacts from that seasonality. After correcting for this seasonal behavior, there remains a more subtle trend which we still present as a valid shift in coding patterns.
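The seasonal variant of the Mann-Kendall test addresses this situation by comparing each calendar month only with the same month in other years and then summing the per-month statistics. The following sketch shows that idea on an invented series with a strong winter peak and a modest upward drift; it reports only the combined S statistic and tau, and leaves out the variance and significance calculation.

```python
def kendall_s(values):
    """Sum of the signs of all pairwise later-minus-earlier differences."""
    s = 0
    for i in range(len(values) - 1):
        for j in range(i + 1, len(values)):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    return s


def seasonal_mann_kendall(series, period=12):
    """Return (S, tau), with S summed over seasons so that each month is
    compared only against the same month in other years."""
    total_s = 0
    total_pairs = 0
    for season in range(period):
        same_month = series[season::period]
        total_s += kendall_s(same_month)
        total_pairs += len(same_month) * (len(same_month) - 1) // 2
    tau = total_s / total_pairs if total_pairs else 0.0
    return total_s, tau


# Invented series: three years of monthly values with a large winter peak
# (months 0, 1, and 11) and a small year-over-year increase.
seasonal_peak = [10 if month in (0, 1, 11) else 0 for month in range(12)]
series = [seasonal_peak[month] + year for year in range(3) for month in range(12)]

print(seasonal_mann_kendall(series))  # (36, 1.0)
```

Because every month rises from one year to the next, the seasonal statistic reports a perfectly monotone trend even though the raw series swings widely within each year.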

Figure 2.

Seasonal coding patterns for R09.8 (respiratory symptoms)

Discussion

We expected to see an increase in novel ICD-10 code usage over time as a result of increased familiarization. The increased uptake of laterality information is an excellent example of such a positive learning curve. Initially, coders may have been unaware of the increased documentation possibilities or may have been using ICD-9 crosswalks that map to unspecified laterality codes. The adoption of these laterality codes represents a positive development: usage of these new ICD-10 codes is likely to improve care and reduce fraud6.

The discrepant laterality trend, M17 (osteoarthritis), was the single reported decline in usage of the laterality codes. We suspect that this trend is due to a data artifact that does not represent actual coder behavior. Over a 3-month period in early 2017, there was a substantial jump in the raw number of unspecified-laterality arthritis codes. This coincided with an overall increase of arthritis codes in our analysis and is likely due to a newly introduced data contributor to the overall dataset. Contributors are deliberately concealed in this deidentified dataset, and the only proxy for determining such effects is analysis of changes in the geographical distribution of patients in the data. Although such increases in raw numbers were rare, further work will be required to fully understand the impact of such distributional shifts.

Apart from the usage of laterality codes, the perceived trends for new ICD-10 codes were of mixed magnitude and direction. Some of the reported trends were straightforward increases in adoption of more specific codes: Z12 added details about mammograms, I25 added a code for cardiomyopathy, and R09 included increased symptom specificity. However, some of the trends moved in the opposite direction: the header groupings for drug therapy and complicated type 2 diabetes (T2DM) saw shifts in usage away from novel codes and towards unspecified or other codes. For example, in the T2DM header E11.6, the usage of the newer, more granular code E11.65 (hyperglycemia) decreased and the generic, unspecified code E11.69 became more popular. See Figure 3 for the patient shares of these codes over time. There are a number of possible explanations for this type of trend; for example, it may reflect an evolution of diagnostic practice that ICD-10 fails to cover.

Figure 3.

Decreased usage of granular E11 code for T2DM with complications

In addition to the temporal anomaly observed in the distribution of the osteoarthritis codes mentioned previously, there were some other data fluctuations on short timescales. For example, the coding for breast cancer screening experienced a noticeable discontinuity in August 2017, where the use of the novel mammogram code Z12.31 increased about ten percent among patients receiving any code for such a screening. There are several potential explanations for such a shift; to our knowledge no new screening guidelines were issued around that period, but hospital systems might have been adopting 2016 guidelines at that time29. Alternatively, coders may have been responding to reimbursement pressures around screening codes. No other trend experienced a short-term shift as large as that of the breast cancer screening header, and the others may be attributable to random noise.
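A short-term discontinuity of this kind can be located with a simple scan for the split point that maximizes the difference between the mean share before and after it. The sketch below is only an illustration with invented values; the study does not describe a formal change-point method, and the function name and minimum segment length are assumptions.

```python
def largest_level_shift(series, min_segment=3):
    """Return (index, shift): the split point with the largest absolute
    difference between the mean after and the mean before the split."""
    best_index, best_shift = None, 0.0
    for k in range(min_segment, len(series) - min_segment + 1):
        before, after = series[:k], series[k:]
        shift = sum(after) / len(after) - sum(before) / len(before)
        if abs(shift) > abs(best_shift):
            best_index, best_shift = k, shift
    return best_index, best_shift


# Invented monthly shares of a novel code within its header group:
# a stable level followed by a jump of about ten percentage points.
shares = [0.5] * 6 + [0.6] * 6
index, shift = largest_level_shift(shares)
print(index, round(shift, 2))  # 6 0.1
```

A scan like this flags candidate months for manual review; it cannot by itself distinguish a change in coder behavior from a change in the contributing data sources.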

Conclusion

Understanding the uptake of ICD-10 codes is critical to the accuracy of ICD-based analyses. Our work has identified several broad categories of coding patterns that should be corrected for or excluded from retrospective studies. The introduction of laterality codes in ICD-10 represents an opportunity for more detailed study of the specific body sites of given conditions. However, researchers need to be aware of the learning curve effects present in these codes after the ICD-10 transition. Similarly, the increased specificity for screening encounters available in ICD-10 enables enhanced study of utilization and health system effects but may be affected by the trends identified here. Finally, the cases of decreased uptake of novel codes may indicate areas where revisions or additions to ICD-10 are necessary and should be tracked in future ICD-10 implementation studies.

There are several opportunities for more in-depth analysis of key coding trends revealed by this work. Throughout our trend modelling, we restricted code groupings to the header codes provided by CMS. Although sufficient for our purposes, there are higher level groupings that might reveal broader shifts in coding practice. Looking at coding practices across higher-level ICD-10 groupings or other clinically significant clusters would present highly relevant patterns to coding and informatics professionals. Additionally, novel features of ICD-10 can be spread across many different code groups. Analyzing the usage of cause of injury details15 or overlapping site codes19 would provide useful learning curve information. Another factor to be considered in future work would be the “grace period” that CMS applied only to the first fiscal year (2016) of ICD-10 implementation, which allowed for a degree of imprecision in Medicare fee-for-service claims; the freedom from audit that it promised in that year could help to explain some of the variation in coding patterns observed between that year and fiscal years 2017 and 201830. Finally, this work focuses mainly on novel ICD-10 codes but, as we move further away from the ICD-10 transition, that novelty may prove less relevant to coding practice and our analysis should be expanded to other properties of ICD-10 codes and the relationships among them.

For future coding system updates, including both structural and incremental changes, this type of coder uptake analysis is vital to understanding the impact of various incentives on reported outcomes. This knowledge of coding practices is important not just to coders and researchers; it suggests important features of the inner workings of our health care systems.

References

  • 1. World Health Organization. International statistical classification of diseases and related health problems: 10th revision (ICD-10). Geneva: WHO; 2016. https://icd.who.int/browse10/Content/statichtml/ICD10Volume2_en_2016.pdf.
  • 2. Boyd AD, Li J, Burton MD, Jonen M, Gardeux V, Achour I, et al. The discriminatory cost of ICD-10-CM transition between clinical specialties: metrics, case study, and mitigating tools. J AMIA. 2013 Jul;20(4):708–17. doi: 10.1136/amiajnl-2012-001358.
  • 3. Office of the Secretary, HHS. HIPAA administrative simplification modifications to medical data code set standards to adopt ICD-10-CM and ICD-10-PCS. Final rule. Fed Regist. 2009 Jan 16;74(11):3328–62.
  • 4. National Center for Health Statistics. International Classification of Diseases (ICD-10-CM/PCS). [Internet]. 2015 [cited 2019 Feb 1]. Available from: https://www.cdc.gov/nchs/icd/icd10cm_pcs_background.htm.
  • 5. Clark JS. The facts about ICD-10-CM/PCS implementation. Implementation will improve the quality of patient care. J AHIMA. 2012 Mar;83(3):42–3.
  • 6. Fox B, Sheehan J. Openness and Exactness - Mitigating Fraud Vulnerabilities in the Age of EHRs and ICD-10. [Internet]. HIMSS; 2012.
  • 7. World Health Organization. Official ICD-10 Updates. 2016. who.int/classifications/icd/icd10updates/en/
  • 8. Hartley C, Nachimson S. The cost of implementing ICD-10 for physician practices. Report to the AMA. Nachimson Advisors; 2014.
  • 9. HIMSS. ICD-10 transformation: five critical risk-mitigation strategies. [Internet]. HIMSS; 2011.
  • 10. Topaz M, Shafran-Topaz L, Bowles KH. ICD-9 to ICD-10: evolution, revolution, and current debates in the United States. Perspect Health Inf Manag. 2013;10:1d.
  • 11. Anderson RN, Miniño AM, Hoyert DL, Rosenberg HM. Comparability of cause of death between ICD-9 and ICD-10: preliminary estimates. Natl Vital Stat Rep. 2001 May 18;49(2):1–32.
  • 12. Roberts R, Hirsch N, Innes K, Truran DL. Casemix classification issues and change from ICD-9-CM to ICD-10-AM coding. Casemix Quarterly. 1999 Jun;1(2).
  • 13. Walker RL, Hennessy DA, Johansen H, Sambell C, Lix L, Quan H. Implementation of ICD-10 in Canada: how has it impacted coded hospital discharge data. BMC Health Serv Res. 2012 Jun 10;12:149. doi: 10.1186/1472-6963-12-149.
  • 14. Heslin KC, Owens PL, Karaca Z, Barrett ML, Moore BJ, Elixhauser A. Trends in Opioid-related Inpatient Stays Shifted After the US Transitioned to ICD-10-CM Diagnosis Coding in 2015. Med Care. 2017;55(11):918–23. doi: 10.1097/MLR.0000000000000805.
  • 15. Stewart C, Crawford PM, Simon GE. Changes in Coding of Suicide Attempts or Self-Harm With Transition From ICD-9 to ICD-10. Psychiatric Services. 2017 Mar;68(3):215. doi: 10.1176/appi.ps.201600450.
  • 16. Yoon J, Chow A. Comparing chronic condition rates using ICD-9 and ICD-10 in VA patients FY2014-2016. BMC Health Serv Res. 2017 Aug 17;17(1):572. doi: 10.1186/s12913-017-2504-9.
  • 17. Januel J-M, Luthi J-C, Quan H, Borst F, Taffé P, Ghali WA, et al. Improved accuracy of co-morbidity coding over time after the introduction of ICD-10 administrative data. BMC Health Serv Res. 2011 Aug 18;11:194. doi: 10.1186/1472-6963-11-194.
  • 18. Dyers R, Ward G, Du Plooy S, Fourie S, Evans J, Mahomed H. Training and support to improve ICD coding quality. A controlled before-and-after impact evaluation. S Afr Med J. 2017 May 24;107(6):501–6. doi: 10.7196/SAMJ.2017.v107i6.12075.
  • 19. Romano T, Hovey B. Early ICD-10 audits indicate a learning curve for general surgeons. Bull Am Coll Surg. 2016 Jun;101(6):50–2.
  • 20. Butler M. Analyzing eight months of ICD-10. J AHIMA. 2016 Jun;87(6):16–22.
  • 21. Pope GC, Kautter J, Ellis RP, Ash AS, Ayanian JZ, Iezzoni LI, et al. Risk adjustment of Medicare capitation payments using the CMS-HCC model. Health Care Financ Rev. 2004;25(4):119–41.
  • 22. Khera R, Dorsey KB, Krumholz HM. Transition to the ICD-10 in the United States. An Emerging Data Chasm. JAMA. 2018 Jul 10;320(2):133. doi: 10.1001/jama.2018.6823.
  • 23. Centers for Medicare & Medicaid Services. ICD-10 GEMs. [Internet]. 2015 [cited 2019 Feb 1]. Available from: https://www.cms.gov/Medicare/Coding/ICD10/index.html.
  • 24. National Center for Health Statistics. ICD-10-CM/PCS Order Files. [Internet]. 2017 [cited 2019 Mar 1]. Available from: https://www.cdc.gov/nchs/data/icd/icd10_fy_2017_order_files.pdf.
  • 25. IBM Explorys Cohort Discovery, IBM Explorys Therapeutic Datasets, and IBM Explorys Virtual Workbench provide life sciences insights into real-world care delivery. IBM United States Software Announcement 216-401. 2016 Sep 13. https://www-01.ibm.com/common/ssi/rep_ca/1/897/ENUS216-401/ENUS216-401.PDF.
  • 26. Hyndman R, Koehler AB, Ord JK, Snyder RD. Forecasting with exponential smoothing: the state space approach. Springer Science & Business Media; 2008 Jun 19.
  • 27. Mann HB. Nonparametric tests against trend. Econometrica. 1945;13:245–259.
  • 28. Chute CG, Cohn SP, Campbell KE, Oliver DE, Campbell JR. The Content Coverage of Clinical Classifications. JAMIA. 1996 May 1;3(3):224–33. doi: 10.1136/jamia.1996.96310636.
  • 29. American Academy of Family Physicians. Summary of recommendations for clinical preventive services. 2016. Available from: http://www.aafp.org/dam/AAFP/documents/patient_care/clinical_recommendations/cps-recommendations.pdf.
  • 30. Centers for Medicare & Medicaid Services. Clarifying Questions and Answers Related to the July 6, 2015, CMS/AMA Joint Announcement and Guidance Regarding ICD-10 Flexibilities. [Internet]. 2016 [cited 2019 Mar 6]. Available from: https://www.cms.gov/Medicare/Coding/ICD10/Clarifying-Questions-and-Answers-Related-to-the-July-6-2015-CMS-AMA-Joint-Announcement.pdf.

Articles from AMIA Annual Symposium Proceedings are provided here courtesy of American Medical Informatics Association
