Clinical Kidney Journal. 2017 Aug 2;11(2):254–258. doi: 10.1093/ckj/sfx085

A scheme based on ICD-10 diagnoses and drug prescriptions to stage chronic kidney disease severity in healthcare administrative records

Leif Friberg 1, Alessandro Gasparini 2, Juan Jesus Carrero 3
PMCID: PMC5888226  PMID: 29644067

Abstract

Background

Information about renal function is important for drug safety studies using administrative health databases. However, serum creatinine values are seldom available in these registries. Our aim was to develop and test a simple scheme for stratification of renal function without access to laboratory test results.

Methods

Our scheme uses registry data about diagnoses, contacts, dialysis and drug use. We validated the scheme in the Stockholm CREAtinine Measurements (SCREAM) project using information on approximately 1.1 million individuals residing in the Stockholm County who underwent calibrated creatinine testing during 2006–11, linked with data about health care contacts and filled drug prescriptions. Estimated glomerular filtration rate (eGFR) was calculated with the CKD-EPI formula and used as the gold standard for validation of the scheme.

Results

When the scheme classified patients as having eGFR <30 mL/min/1.73 m2, it was correct in 93.5% of cases. The specificity of the scheme was close to 100% in all age groups. The sensitivity was poor, ranging from 68.2% in the youngest age quartile, down to 10.7% in the oldest age quartile. Age-related decline in renal function makes a large proportion of elderly patients fall into the chronic kidney disease (CKD) range without receiving CKD diagnoses, as this often is seen as part of normal ageing.

Conclusions

In the absence of renal function tests, our scheme may be of value for identifying patients with moderate and severe CKD on the basis of diagnostic and prescription data for use in studies of large healthcare databases.

Keywords: chronic kidney disease, classification, database, eGFR, validity

Introduction

Population-wide administrative health databases are frequently used for observational Post Authorization Safety Studies (PASS) of new drugs. Such studies have many advantages: they are quick and easy to perform; the large number of patients often makes it possible to study outcomes in small subgroups; few patients are lost to follow-up; and selection bias is seldom a problem when whole populations are included. It is even possible to study the consequences of off-label use and of drugs taken by patients with contraindications, which may happen in the real world even if it should not.

An important limitation of many of these registers is that they lack detailed information about renal function. Many drugs are excreted via the kidneys, and chronic kidney disease (CKD) may lead to drug accumulation and drug toxicity. A diagnostic code for CKD in one of these registers is often just a binary yes or no, which may cover almost any degree of renal impairment from a slightly elevated S-creatinine to end-stage renal failure.

Attempts have been made to stage CKD from International Classification of Diseases (ICD) codes and claims codes [1–5]. Validation studies of these schemes have been disappointing and have shown that administrative databases generally have insufficient sensitivity and positive predictive value (PPV) to allow for stratified analyses according to renal function.

However, registries hold more information than just diagnoses. Other information could serve as surrogate markers of the severity of renal disease: for example, the use of phosphate binders and other drugs prescribed in CKD, the duration of disease, the frequency of hospitalizations or contacts for CKD, and dialysis or surgery for vascular access.

The aim of this study was to develop and test a surrogate method to grade renal function for research purposes when laboratory test values are unavailable.

Materials and methods

We constructed a scheme for classification of renal function aiming at the stratification of patients according to the assumed estimated glomerular filtration rate (eGFR). The unit for eGFR is mL/min/1.73 m2. In the following, eGFR values are presented without unit for brevity.

The scheme aims to differentiate between the following eGFR strata: <30, 30–59, 60–89 and ≥90. The scheme is presented as a flow chart in Figure 1. The diagnostic and procedure codes used, with plain text translation, are listed in Table 1.

Fig. 1.

Scheme for classifying renal function without access to laboratory test results. *Renal-specific drugs: phosphate binders (ATC codes A12AA, V03AE02, V03AE03, V03AE04), active vitamin D (A11CC03, A11CC04), sodium bicarbonate (A02AH), erythropoiesis-stimulating agents (B03XA).

Table 1.

Codes used for identification of patients with chronic renal disease

Condition | ICD-10 code beginning with
CKD | N18
CKD Stage 5 (eGFR <15) | N185
CKD Stage 4 (eGFR 15–29) | N184
CKD Stage 3 (eGFR 30–59) | N183
CKD Stage 2 (eGFR 60–89) | N182
CKD Stage 1 (eGFR ≥90) | N181
Acute renal failure | N17
Unspecified renal failure | N19
Dependence on renal dialysis | Z992
Adjustment and management of vascular access device | Z492

Procedure | Procedure code beginning with
Creation of arterio-venous fistula from artery in the upper limb | PBL
Repair surgery of arterio-venous fistula in the upper limb | PBU
Haemodialysis, chronic | DR016
Peritoneal dialysis, chronic | DR024
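The "beginning with" matching in Table 1 can be sketched as a longest-prefix lookup. This is only an illustration of how the code lists might be applied to register entries; it is not the decision logic of Figure 1, and the helper name `lookup` and the stripping of dots from codes are assumptions introduced here.

```python
# Prefixes transcribed from Table 1. The lookup helper is hypothetical and
# covers code matching only, not the full classification flow chart.
ICD10_PREFIXES = {
    "N185": "CKD Stage 5 (eGFR <15)",
    "N184": "CKD Stage 4 (eGFR 15-29)",
    "N183": "CKD Stage 3 (eGFR 30-59)",
    "N182": "CKD Stage 2 (eGFR 60-89)",
    "N181": "CKD Stage 1 (eGFR >=90)",
    "N18": "CKD",
    "N17": "Acute renal failure",
    "N19": "Unspecified renal failure",
    "Z992": "Dependence on renal dialysis",
    "Z492": "Adjustment and management of vascular access device",
}

PROCEDURE_PREFIXES = {
    "PBL": "Creation of arterio-venous fistula from artery in the upper limb",
    "PBU": "Repair surgery of arterio-venous fistula in the upper limb",
    "DR016": "Haemodialysis, chronic",
    "DR024": "Peritoneal dialysis, chronic",
}


def lookup(code, prefixes):
    """Return the Table 1 condition whose prefix matches `code`, or None.

    Longer prefixes are tried first so that a staged code such as N184 is
    not reported as the generic N18 entry. Removing dots (N18.4 -> N184)
    is an assumption about how codes are stored in a given register.
    """
    normalized = code.replace(".", "").strip().upper()
    for prefix in sorted(prefixes, key=len, reverse=True):
        if normalized.startswith(prefix):
            return prefixes[prefix]
    return None


print(lookup("N18.4", ICD10_PREFIXES))
print(lookup("DR016", PROCEDURE_PREFIXES))
```

Trying the longest prefix first matters because every staged CKD code also begins with N18.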

The scheme was tested and validated in the Stockholm CREAtinine Measurements (SCREAM) database [6], a healthcare utilization cohort from the region of Stockholm, Sweden, where serum creatinine was measured in 1.3 million adults during 2006–11 in connection with a healthcare consultation in ambulatory or hospital care. Laboratory data were then linked, via each citizen’s personal identification number, to administrative records containing diagnostic codes (ICD-10 classification), therapeutic procedures (codes issued by the Nordic Medico-Statistical Committee, NOMESCO), validated renal endpoints (undergoing dialysis or renal transplantation) and pharmacy-filled claims [6].

The regional healthcare utilization register contains information on all ICD-10 diagnoses and therapeutic procedures issued in ambulatory or inpatient care in the region of Stockholm since the system was adopted in Sweden in 1997. The Swedish Dispensed Drug register stores records of all pharmacy-dispensed prescriptions in Sweden since 1 July 2005. All pharmacies in the country are required to participate by law, and information is transferred electronically whenever a prescribed drug is dispensed. The register does not contain information about prescriptions that were not dispensed, drugs used during hospital stays or over-the-counter drugs.

The study population considered for this analysis consisted of adult SCREAM individuals (≥18 years). We discarded all measurements taken in connection with a hospital stay (n = 2 415 743) because we assumed that a high proportion of these samples represented acute illness rather than a potential underlying chronic renal disease. We also discarded all measurements from non-residents of Stockholm County (n = 203 326), implausible serum creatinine concentrations (i.e. below 25 or above 1500 µmol/L; n = 1808), and measurements recorded after a renal transplantation (n = 50 843). In cases where there were concurrent serum creatinine measurements on the same day, we took their median value. Index date was defined as the date of the most recent measurement, and we obtained 1 126 954 individuals with 5 352 191 measurements eligible for the study.
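Two of the cleaning steps described above, the plausibility range for serum creatinine and the same-day median, can be sketched as follows. The record layout (patient identifier, date string, value in µmol/L) and the function name `daily_medians` are assumptions for illustration, as is applying the plausibility filter before taking the median. The other exclusions (hospital stays, non-residents, post-transplant measurements) are not modelled.

```python
from collections import defaultdict
from statistics import median

# Plausibility limits stated in the text: values below 25 or above
# 1500 umol/L were discarded.
MIN_UMOL_L = 25
MAX_UMOL_L = 1500


def daily_medians(measurements):
    """Collapse same-day creatinine values to one median per patient and day.

    `measurements` is an iterable of (patient_id, date, creatinine_umol_l)
    tuples; this layout is hypothetical, not the SCREAM data model.
    """
    by_day = defaultdict(list)
    for patient_id, day, value in measurements:
        if MIN_UMOL_L <= value <= MAX_UMOL_L:
            by_day[(patient_id, day)].append(value)
    return {key: median(values) for key, values in by_day.items()}


rows = [
    ("a", "2010-01-01", 80),
    ("a", "2010-01-01", 100),
    ("a", "2010-01-01", 20),    # implausibly low, dropped
    ("b", "2010-02-01", 1600),  # implausibly high, dropped
]
print(daily_medians(rows))
```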

The 2009 CKD-EPI creatinine-based equation [7] was used for calculation of eGFR. We estimated renal function at index date, averaging two eGFR values 3–12 months apart (if available). Race is not registered in Sweden by law, and therefore all patients were assumed to be Caucasian; nonetheless, the Swedish population is relatively homogenous and dominated by Caucasians (91.3% born in Europe, according to Statistics Sweden: http://www.statistikdatabasen.scb.se). CKD stages were categorized as eGFR <30, 30–59, 60–89 or ≥90. Patients undergoing dialysis (hemodialysis or peritoneal dialysis) have varying eGFR values related to the time since the previous dialysis. Such eGFR values are not representative of kidney function and were, therefore, replaced by a random value between 0 and 15 mL/min/1.73 m2.
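As a minimal sketch of the gold-standard calculation, the published 2009 CKD-EPI creatinine equation [7] can be written as below. The function names are introduced here for illustration. The race coefficient is omitted, in line with the assumption stated above that all patients were treated as Caucasian. Averaging two eGFR values and the substitution of a random value for dialysis patients are not modelled.

```python
def ckd_epi_2009(creatinine_umol_l, age_years, female):
    """eGFR in mL/min/1.73 m2 from the 2009 CKD-EPI creatinine equation.

    The race term of the original equation is left out on purpose, mirroring
    the study's assumption that every patient is Caucasian.
    """
    scr_mg_dl = creatinine_umol_l / 88.4  # convert umol/L to mg/dL
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = 141 * min(ratio, 1) ** alpha * max(ratio, 1) ** -1.209 * 0.993 ** age_years
    if female:
        egfr *= 1.018
    return egfr


def stratum(egfr):
    """Map an eGFR value to the four strata used in this study."""
    if egfr < 30:
        return "<30"
    if egfr < 60:
        return "30-59"
    if egfr < 90:
        return "60-89"
    return ">=90"


print(stratum(ckd_epi_2009(80, 50, female=False)))
print(stratum(ckd_epi_2009(300, 80, female=True)))
```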

For each patient, scheme-based classifications of renal function were made based on the presence or absence of diagnoses, contacts, drugs or procedures according to the schemes. This classification was then compared with the classification obtained through measured creatinine values, which was used as the gold standard.

We computed accuracy and Cohen’s Kappa statistics comparing the predicted CKD categories with the observed ones, and also performed McNemar’s test of agreement. Furthermore, we computed sensitivity, specificity, PPV and negative predictive value (NPV) in discriminating each CKD category against the remaining ones pooled together. All analyses were performed using R (R Foundation for Statistical Computing, Vienna, Austria).
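For reference, the one-versus-rest measures and Cohen's Kappa named above follow the standard definitions below. The notation is introduced here and is not taken from the paper: TP, FP, FN and TN are the true and false positives and negatives for one CKD category against the rest, n_ij is the count in row i and column j of the cross-tabulation, and N is the total number of patients.

```latex
\begin{align*}
\text{Sensitivity} &= \frac{TP}{TP + FN}, &
\text{Specificity} &= \frac{TN}{TN + FP},\\
\text{PPV} &= \frac{TP}{TP + FP}, &
\text{NPV} &= \frac{TN}{TN + FN},\\
p_o &= \frac{1}{N}\sum_i n_{ii}, &
p_e &= \frac{1}{N^2}\sum_i n_{i\cdot}\, n_{\cdot i},\\
\kappa &= \frac{p_o - p_e}{1 - p_e}.
\end{align*}
```

Here p_o is the overall accuracy and p_e is the agreement expected by chance from the row and column totals.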

The regional Ethical Review Board and the Swedish National Board of Health and Welfare approved the study for use of de-identified data. The study conforms to the Declaration of Helsinki.

Results

Renal function in the validation cohort

The study population consisted of 1 126 954 individuals, who contributed a total of more than 5.2 million creatinine measurements. The median age was 52.8 years [interquartile interval (IQI) 37.7–67.5] and 54.2% were female. Median eGFR was 94.4 (IQI 80.5–107.8). More than 92% had normal or near-normal eGFR ≥60 (n = 1 038 461, Figure 2), 6.9% had eGFR 30–59 (n = 77 286) and 1.0% had eGFR <30 (n = 11 207). Mean and median eGFR values declined with age, from ∼120 at the age of 30 years to ∼60 at the age of 90 years (Figure 3). The proportion of patients with a diagnosis of CKD increased with the severity of CKD but remained low: only 11.8% of patients with eGFR <60 and 52.5% of patients with eGFR <30 had a CKD diagnosis in the register. Among patients on dialysis, 99.1% also had a registry diagnosis of renal failure. CKD was widely underreported in the elderly population.

Fig. 2.

Distribution of eGFR values among 1.1 million patients in the SCREAM cohort, Stockholm County, Sweden.

Fig. 3.

Creatinine-based eGFR values in relation to age among 1.1 million inhabitants in Stockholm County, Sweden.

Scheme validation

The correspondence between the scheme-derived stratification and the eGFR-based stratification, used as reference, is presented in Table 2. The full four-graded scheme (Scheme A) created groups with mean eGFR values in the targeted intervals, but the distribution was wide in all groups (Figure 4, left panel). The overall accuracy was only 59.4% [95% confidence interval (CI) 59.3–59.4%]. Cohen’s Kappa statistic was low, 0.02, and McNemar’s test was significant, indicating lack of agreement between predicted and observed values. Nonetheless, the scheme identified patients with eGFR <30 with a PPV of 93.5% and NPV of 99.2% (Figure 5).

Table 2.

Scheme-predicted classification compared with classification according to the gold standard

Rows: scheme-based classification of eGFR, mL/min/1.73 m2. Columns: gold standard eGFR reference, mL/min/1.73 m2.

Scheme-based eGFR | Gold standard <30 | Gold standard 30–59 | Gold standard 60–89 | Gold standard ≥90
<30 | 1832 | 53 | 44 | 30
30–59 | 2552 | 2011 | 284 | 50
60–89 | 1411 | 2470 | 905 | 426
≥90 | 5412 | 72 752 | 372 586 | 664 136
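The summary statistics for the four-graded scheme can be recomputed from the counts in Table 2. The sketch below reads the table with scheme-based classes as rows and gold-standard classes as columns; the function names are introduced here for illustration, and the authors' own analysis was done in R.

```python
# Counts from Table 2. Rows: scheme-based class; columns: gold-standard class.
STRATA = ["<30", "30-59", "60-89", ">=90"]
TABLE_2 = [
    [1832, 53, 44, 30],
    [2552, 2011, 284, 50],
    [1411, 2470, 905, 426],
    [5412, 72752, 372586, 664136],
]


def accuracy(table):
    """Share of patients on the diagonal (scheme agrees with gold standard)."""
    total = sum(sum(row) for row in table)
    return sum(table[i][i] for i in range(len(table))) / total


def cohen_kappa(table):
    """Agreement beyond what the row and column totals predict by chance."""
    total = sum(sum(row) for row in table)
    expected = sum(
        sum(table[i]) * sum(row[i] for row in table) for i in range(len(table))
    ) / total ** 2
    observed = accuracy(table)
    return (observed - expected) / (1 - expected)


def one_vs_rest(table, k):
    """Sensitivity, specificity, PPV and NPV for class k against the rest."""
    total = sum(sum(row) for row in table)
    tp = table[k][k]
    fp = sum(table[k]) - tp                  # scheme says k, gold says other
    fn = sum(row[k] for row in table) - tp   # gold says k, scheme says other
    tn = total - tp - fp - fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


severe = one_vs_rest(TABLE_2, STRATA.index("<30"))
print(f"accuracy {accuracy(TABLE_2):.1%}, kappa {cohen_kappa(TABLE_2):.2f}")
print(f"eGFR <30: PPV {severe['ppv']:.1%}, NPV {severe['npv']:.1%}")
```

Under this reading of the table, the computed accuracy, Kappa, PPV and NPV agree with the values reported in the text (59.4%, 0.02, 93.5% and 99.2%).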

Fig. 4.

Correspondence between the SCREAM-scheme classification of CKD stages versus creatinine-based eGFR values.

Fig. 5.

Scheme performance in relation to age.

The four-graded scheme was clearly incapable of differentiating between eGFR 60–89 and eGFR ≥90, largely due to the infrequent use of CKD diagnoses among elderly patients, who typically have eGFR in the 60–89 range. The attempt to differentiate between these strata was abandoned, and all patients with eGFR ≥60 were combined into one group, thus making the scheme three-graded (Scheme B). This modification resulted in a marked improvement in the overall accuracy (92.5%, CI 92.5–92.6%), mostly driven by higher PPV (92.7%) and NPV (94.0%) in the normal/mildly reduced renal function group with assumed eGFR ≥60. The specificity and the NPV were close to 100% for scheme-classified eGFR <30 and eGFR 30–59, while the sensitivity was very poor. Cohen’s Kappa statistic improved with the simplification of the scheme, up to 0.12, but McNemar’s test still indicated lack of agreement between predicted and observed values.
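The effect of merging the two upper strata can be illustrated by pooling the corresponding rows and columns of Table 2. This sketch assumes that Scheme B is equivalent to such pooling, which the text does not state explicitly; the helper `collapse` is introduced here for illustration.

```python
# Counts from Table 2. Rows: scheme-based class; columns: gold-standard class.
# Order of classes: <30, 30-59, 60-89, >=90.
TABLE_2 = [
    [1832, 53, 44, 30],
    [2552, 2011, 284, 50],
    [1411, 2470, 905, 426],
    [5412, 72752, 372586, 664136],
]


def collapse(table, groups):
    """Pool rows and columns of a square table according to index groups."""
    return [
        [sum(table[i][j] for i in row_group for j in col_group) for col_group in groups]
        for row_group in groups
    ]


# Merge 60-89 and >=90 into a single >=60 class (assumed form of Scheme B).
SCHEME_B = collapse(TABLE_2, [[0], [1], [2, 3]])

total = sum(sum(row) for row in SCHEME_B)
accuracy_b = sum(SCHEME_B[i][i] for i in range(3)) / total

# PPV and NPV for the >=60 class against the two lower classes pooled.
predicted_ge60 = sum(SCHEME_B[2])
ppv_ge60 = SCHEME_B[2][2] / predicted_ge60
predicted_lt60 = total - predicted_ge60
truly_lt60 = sum(SCHEME_B[i][j] for i in (0, 1) for j in (0, 1))
npv_ge60 = truly_lt60 / predicted_lt60

print(f"accuracy {accuracy_b:.1%}, PPV {ppv_ge60:.1%}, NPV {npv_ge60:.1%}")
```

With this pooling, accuracy, PPV and NPV for the eGFR ≥60 group come out at the values quoted above (92.5%, 92.7% and 94.0%). Most of the gain comes from the large number of patients with gold-standard eGFR 60–89 whom the scheme had placed in the ≥90 class.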

Stratification by age

Given the age-related decline in renal function, and the infrequent use of codes for CKD among the elderly, we proceeded with stratification according to age quartiles. This showed that the sensitivity for advanced CKD was higher in younger age groups; 68.2% in the lowest quartile below 38 years compared with 10.7% in the highest quartile older than 68 years (Figure 5).

When the scheme classified patients as having eGFR <30, it was correct in 87.8% of cases in the lowest age quartile, rising to 95.4% in the highest age quartile. Conversely, when the scheme indicated normal or near-normal renal function, this was true for almost all patients below 68 years, and for about three-quarters of the patients older than 68 years.

Accordingly, the accuracy was well above 90% in all but the highest age quartile (99.9, 99.5, 97.3 and 73.3%, respectively). Cohen's Kappa was 0.539, 0.422, 0.262 and 0.074 in the respective age quartiles.

Discussion

We have shown that it is possible to achieve a crude grading of renal function without access to laboratory test values. Our scheme was, however, not able to discriminate between normal renal function and age-related decline of renal function or CKD in early stages. As shown here, and in other international registers [8–11], there is an important underutilization of ICD diagnostic codes for CKD in healthcare, overall emphasizing the importance of estimating CKD on the basis of laboratory values. However, most general administrative registers and claims databases lack information about renal function, which is important, for instance, for studies on pharmacovigilance or drug safety in real-life settings.

Because of this underutilization of diagnostic codes for CKD among the elderly, many individuals with advanced CKD are not identified by our scheme (low sensitivity). The age-related decline of renal function may be seen as a part of the aging process, in the same way as the reduction of pulmonary function, reduction of physical strength, atherosclerotic changes, etc.

The stratification into age quartiles showed that the sensitivity and overall performance of the scheme were better among younger than among older patients, which is consistent with the underutilization of CKD diagnoses among the elderly described above. The study population was relatively young (median age 52.8 years), and the cut-off age for the highest quartile was only 68 years. Many patients in pharmacovigilance studies are well past that age. A scheme that identifies two-thirds of patients with eGFR <30 in a population below 38 years may not be very useful in a retired population, where it can identify only 1 out of 10.

However, the high PPV of the scheme makes it possible to identify a subgroup with a 93.5% probability of having an eGFR <30. For studies where it is more important to identify a group with eGFR <30 with a high degree of certainty than to identify all patients with poor renal function, this scheme may be useful.

Access to actual eGFR values would of course be better, but if this information is not available and the advantages with Big Data make it desirable to go on without eGFR values, this scheme may be of value as it offers more information than a simple yes/no to a previous diagnosis of renal disease.

The under-reporting of CKD by diagnostic codes that we found has been observed previously by a number of study groups [1–5] who have tried to identify CKD patients by means of diagnostic or claims codes. All these studies reported much lower sensitivity than specificity, just as we do in our study.

What is new with our study is that it uses information other than diagnostic codes, and that it stratifies renal function, instead of simply categorizing patients in a binary way as having CKD or not. Moreover, this study, with over 1 million patients and over 5 million creatinine measurements, is by far the largest study in this field that has ever been published, as far as we know.

Despite the scale of the study, it is not certain that the scheme would perform well in another database or in another country where healthcare registration is organized in a different way. Although different versions of the ICD coding system have been in use in different parts of the world, it is generally not difficult to translate from one version to another. Translation of codes for diagnostic and surgical procedures may be a greater challenge since there is no universally accepted list for these. The exact meanings of the NOMESCO codes used in our scheme are listed in Table 1 in order to facilitate adaptation to other countries. We think it is desirable that the scheme is evaluated in other settings, with openness to modifications and adaptations to those contexts.

Conclusion

The likelihood that patients identified by our scheme as having poor renal function truly have poor renal function is high. The scheme may, therefore, be useful for pharmacovigilance studies using administrative registries lacking information about creatinine values. The sensitivity for detection of CKD, especially in the elderly, is poor.

Funding

We acknowledge grant support from Stockholm County Council and from the Swedish Heart and Lung Foundation.

Conflict of interest statement

None of the authors has any conflicts of interest related to the contents of the present study, which is of a purely methodological nature. Outside of the present work L.F. has conducted pharmacovigilance studies in registries as a consultant to Bayer, Bristol-Myers-Squibb, Pfizer and Sanofi.

References

  • 1. Grams ME, Plantinga LC, Hedgeman E. et al. Validation of CKD and related conditions in existing data sets: a systematic review. Am J Kidney Dis 2011; 57: 44–54
  • 2. Muntner P, Gutierrez OM, Zhao H. et al. Validation study of medicare claims to identify older US adults with CKD using the Reasons for Geographic and Racial Differences in Stroke (REGARDS) Study. Am J Kidney Dis 2015; 65: 249–258
  • 3. Grams ME, Rebholz CM, McMahon B. et al. Identification of incident CKD Stage 3 in research studies. Am J Kidney Dis 2014; 64: 214–221
  • 4. Ronksley PE, Tonelli M, Quan H. et al. Validating a case definition for chronic kidney disease using administrative data. Nephrol Dial Transplant 2012; 27: 1826–1831
  • 5. Fleet JL, Dixon SN, Shariff SZ. et al. Detecting chronic kidney disease in population-based administrative databases using an algorithm of hospital encounter and physician claim codes. BMC Nephrol 2013; 14: 81
  • 6. Runesson B, Gasparini A, Qureshi AR. et al. The Stockholm CREAtinine Measurements (SCREAM) project: protocol overview and regional representativeness. Clin Kidney J 2016; 9: 119–127
  • 7. Levey AS, Stevens LA, Schmid CH. et al. A new equation to estimate glomerular filtration rate. Ann Intern Med 2009; 150: 604–612
  • 8. Li C, Wen XJ, Pavkov ME. et al. Awareness of kidney disease among US adults: findings from the 2011 Behavioral Risk Factor Surveillance System. Am J Nephrol 2014; 39: 306–313
  • 9. Ravera M, Noberasco G, Weiss U. et al. CKD awareness and blood pressure control in the primary care hypertensive population. Am J Kidney Dis 2011; 57: 71–77
  • 10. Robertson LM, Denadai L, Black C. et al. Is routine hospital episode data sufficient for identifying individuals with chronic kidney disease? A comparison study with laboratory data. Health Inform J 2014; 22: 383–396
  • 11. Gasparini A, Evans M, Coresh J. et al. Prevalence and recognition of chronic kidney disease in Stockholm healthcare. Nephrol Dial Transplant 2016; 12: 2086–2094

Articles from Clinical Kidney Journal are provided here courtesy of Oxford University Press
