Author manuscript; available in PMC: 2015 Aug 1.
Published in final edited form as: Am J Kidney Dis. 2014 Apr 13;64(2):214–221. doi: 10.1053/j.ajkd.2014.02.021

Identification of Incident CKD Stage 3 in Research Studies

Morgan E Grams 1, Casey Rebholz 2, Blaithin MacMahon 1, Seamus Whelton 1, Shoshana H Ballew 2, Elizabeth Selvin 2, Lisa Wruck 4, Josef Coresh 2,3
PMCID: PMC4112019  NIHMSID: NIHMS578307  PMID: 24726628

Abstract

Background

In epidemiologic research, incident chronic kidney disease (CKD) is commonly determined by laboratory tests performed at planned study visits. Given the morbidity and mortality associated with CKD, persons with incident disease may be less likely to attend scheduled visits, affecting observed associations. The objective of this study was to quantify loss-to-follow-up by CKD status, and to determine whether supplementation with diagnostic code data improves capture of incident CKD.

Study Design

Prospective cohort study.

Setting & Participants

11,560 participants in the Atherosclerosis Risk in Communities (ARIC) Study underwent continuous surveillance for hospitalizations and death from baseline visit (1996-1999) to follow-up visit (2011-2013). A subset of hospitalizations in Washington County, MD, was used in diagnostic code validation (n=2,540).

Predictor

Baseline demographics and comorbid conditions.

Outcomes

Incident CKD stage 3 ascertained by follow-up visit (visit-based definition), or by hospitalization surveillance (hospitalization-based definition).

Measurements

Visit-based definition: ≥25% decline from baseline estimated glomerular filtration rate to <60 ml/min/1.73 m2 at follow-up visit; hospitalization-based definition: hospitalization CKD diagnostic code.

Results

Among 11,560 participants, 5,951 attended the follow-up visit, and 9,264 were hospitalized. Never-hospitalized participants were younger, more often female, and had fewer comorbid conditions; 73.5% attended the follow-up visit. Incident CKD stage 3 occurred in 1,172 participants by the visit-based definition (251 were never-hospitalized) and 1,078 participants by the hospitalization-based definition (237 attended the follow-up study visit). The sensitivity of the hospitalization-based CKD definition was 35.5% (95% CI, 31.6%-39.7%); specificity was 95.7% (95% CI, 94.2%-96.8%). Sensitivity was higher with later time period, older participant age, and baseline prevalent diabetes and CKD.

Limitations

A subset of hospitalizations was used for validation; there was a 15-year gap between study visits.

Conclusions

The sensitivity of diagnostic code–identified CKD is low and varies by certain factors; however, supplementing a visit-based definition with hospitalization information can increase disease identification during periods of follow-up without study visits.

Keywords: chronic kidney disease (CKD), incident, diagnostic codes, validation, identification, loss to follow-up, CKD surveillance, screening, administrative data


Chronic kidney disease (CKD) is strongly associated with cardiovascular disease, end-stage renal disease and mortality.1, 2 Early identification of high-risk individuals facilitates timely intervention for the prevention of CKD and its associated morbidities,3 and there is great interest in quantifying risk factors for incident CKD.4 However, estimation of both the rates of disease and the magnitude of risk factor associations may vary by method of disease identification. Validation studies of methods to detect incident CKD are needed.

Many prospective studies define incident CKD by biomarker measure at planned study visits.5, 6 Given the morbidity and mortality associated with CKD, persons developing disease may be less likely to attend study visits. The interrelatedness of study loss-to-follow-up and outcomes (i.e., informative censoring) is a well known source of study bias.7, 8 Methods to attenuate this bias include statistical modeling, such as mixed effects or joint models,9 and supplementation of study visit-derived outcomes with information from inter-visit events. For example, for the often-fatal outcome coronary heart disease, continuous surveillance of interim hospitalizations is commonly used to identify incident disease.10,11-14 A similar review of hospitalizations for CKD events may improve the accuracy of incident CKD capture when participants fail to attend subsequent study visits.

Using the Atherosclerosis Risk in Communities (ARIC) Study population and data from planned study visits, continuous hospitalization surveillance, and outpatient electronic medical records, we sought to validate methods for identifying incident CKD stage 3. We quantified loss-to-follow-up by incident CKD stage 3 status to determine whether CKD onset affected study visit attendance. We compared phenotypes of participants with incident CKD stage 3 identified via follow-up study visit (visit-based definition) with those identified via incident hospitalizations (hospitalization-based definition). Finally, we estimated the validity of the hospitalization-based CKD definition against a gold standard of outpatient estimated glomerular filtration rate (eGFR) less than 60 ml/min/1.73m2, and evaluated whether the sensitivity and specificity varied by patient and hospitalization characteristics.

METHODS

Study Population

The ARIC Study is an ongoing, community-based, prospective cohort designed for the study of risk factors for cardiovascular disease and atherosclerosis.10 Recruitment occurred from 1987-1989 and was based in four U.S. communities: Washington County, Maryland; suburban Minneapolis, Minnesota; Jackson, Mississippi; and Forsyth County, North Carolina. Only persons between the ages of 45 and 64 years were included (N=15,792). After initial enrollment, follow-up clinical examinations occurred at approximately 3-year intervals until visit 4 (1996-1999); visit 5 was conducted 15 years later (2011-2013). For the purposes of the present study, visit 4 was considered baseline, since the corresponding electronic medical record data were available only in recent years. Thus, only those ARIC participants with serum creatinine measured at visit 4 were included (n=11,560).

Data Source and CKD Definition

Planned Study Visits

Baseline creatinine was measured in plasma specimens by the modified kinetic Jaffé method. At the follow-up study visit (ARIC visit 5), serum creatinine was measured using the Roche enzymatic method (Roche-Hitachi Modular P chemistry analyzer with Roche Creatininase Plus assay, Hoffmann-La Roche Ltd). To account for variability between laboratories, assays, and methods, creatinine values were calibrated to the National Institute of Standards and Technology (NIST; www.nist.gov) standard.15, 16 Creatinine values were converted to eGFR using the 2009 CKD-EPI (CKD Epidemiology Collaboration) creatinine equation.17 Incident CKD stage 3 was defined as a drop in eGFR of ≥25% resulting in a final eGFR <60 ml/min/1.73 m2, a definition modeled after the KDIGO (Kidney Disease: Improving Global Outcomes) guideline pertaining to definition of CKD progression.18 Only those with eGFR ≥60 ml/min/1.73 m2 at baseline were considered at risk for incident CKD. In sensitivity analyses, we also report incident eGFR <60 ml/min/1.73 m2, without requiring a decrement in eGFR ≥25%.
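As an illustration of the visit-based definition, the following minimal Python sketch applies the published 2009 CKD-EPI creatinine equation and then the two-part incident CKD stage 3 rule described above. It assumes creatinine has already been calibrated and is expressed in mg/dl; the function and argument names are hypothetical and are not taken from the ARIC analysis code.

```python
def ckd_epi_2009(scr_mg_dl: float, age_years: float, female: bool, black: bool) -> float:
    """Estimate GFR (ml/min/1.73 m2) with the 2009 CKD-EPI creatinine equation.

    Assumes serum creatinine in mg/dl (an assumption of this sketch).
    """
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = (
        141.0
        * (min(ratio, 1.0) ** alpha)
        * (max(ratio, 1.0) ** (-1.209))
        * (0.993 ** age_years)
    )
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


def incident_ckd_stage3_visit(
    baseline_egfr: float, followup_egfr: float, require_decline: bool = True
) -> bool:
    """Visit-based definition: follow-up eGFR <60 with a >=25% decline from baseline.

    Participants with baseline eGFR <60 are not at risk and return False.
    Setting require_decline=False mirrors the sensitivity analysis that
    drops the >=25% decrement requirement.
    """
    if baseline_egfr < 60.0:
        return False
    decline = (baseline_egfr - followup_egfr) / baseline_egfr
    if require_decline and decline < 0.25:
        return False
    return followup_egfr < 60.0


if __name__ == "__main__":
    # Hypothetical participant: creatinine 0.9 mg/dl at baseline, 1.4 mg/dl at follow-up.
    base = ckd_epi_2009(0.9, 63, female=False, black=False)
    follow = ckd_epi_2009(1.4, 78, female=False, black=False)
    print(round(base, 1), round(follow, 1), incident_ckd_stage3_visit(base, follow))
```

Keeping the decrement requirement as a switch makes it straightforward to compare the primary definition with the less restrictive one reported in sensitivity analyses.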

Hospitalization Diagnostic Codes

Participants in the ARIC Study are contacted annually by telephone (92% response rate in follow-up year 20). Intervening hospitalizations are determined by self-report as well as ongoing active surveillance of community hospital discharge lists; deaths are determined by alternate contact-report and active surveillance of local newspaper obituaries, state death lists, and death certificates from the Department of Vital Statistics. For each hospitalization, 26 discrete International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) discharge diagnostic codes are abstracted per protocol. For the current study, diagnostic codes for all hospitalizations occurring between visit 4 (February 1, 1996 to January 30, 1999) and December 31, 2010, were evaluated. The diagnostic code algorithm used to define CKD is listed in Table S1 (provided as online supplementary material).

Administrative Creatinine

Unlike the three other study sites, Washington County, MD, participants primarily receive care in a single integrated local health system, where a single laboratory system supplies both inpatient and most outpatient serum creatinine measurements. Thus, all creatinine values for Washington County participants were abstracted from the electronic medical record; these data were available for the years 2002-2010. To evaluate the incidence of CKD among participants without interim hospitalizations (and thus in whom there was no possibility of an inpatient diagnostic code-based diagnosis of CKD), incident CKD stage 3 was determined from outpatient serum creatinine and categorized from stage 3 to 5 (stage 3, eGFR 30-59 ml/min/1.73 m2; stage 4, 15-29 ml/min/1.73 m2; stage 5, <15 ml/min/1.73 m2). To assess the validity of hospitalization diagnostic codes for identifying CKD, the sensitivity and specificity of these codes were evaluated against a gold standard of outpatient eGFR < 60 ml/min/1.73 m2, calculated using mean outpatient eGFR estimated using serum creatinine (median measurements, 3; interquartile range, 1-4) during the year prior to admission.
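The staging thresholds and the gold-standard comparator described above can be sketched as follows. This is an illustrative sketch only: the data layout (a list of date and eGFR pairs), the function names, and the use of a 365-day look-back window to represent "the year prior to admission" are assumptions, not details confirmed by the study.

```python
from datetime import date, timedelta
from statistics import mean
from typing import List, Optional, Tuple


def ckd_stage(egfr: float) -> Optional[int]:
    """Map eGFR (ml/min/1.73 m2) to CKD stage 3-5; None if eGFR >= 60."""
    if egfr < 15.0:
        return 5
    if egfr < 30.0:
        return 4
    if egfr < 60.0:
        return 3
    return None


def gold_standard_ckd(
    outpatient_egfr: List[Tuple[date, float]], admission: date
) -> Optional[bool]:
    """True if mean outpatient eGFR in the year before admission is <60.

    Returns None when no outpatient value falls in the window, in which
    case the hospitalization cannot be included in the validation.
    """
    window_start = admission - timedelta(days=365)  # assumed window length
    values = [e for d, e in outpatient_egfr if window_start <= d < admission]
    if not values:
        return None
    return mean(values) < 60.0


if __name__ == "__main__":
    # Hypothetical outpatient measurements for one participant.
    labs = [(date(2005, 1, 10), 62.0), (date(2005, 6, 1), 50.0), (date(2003, 1, 1), 20.0)]
    print(ckd_stage(45.0), gold_standard_ckd(labs, date(2005, 9, 1)))
```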

Measurement of Other Variables

Gender and race were self-reported at ARIC study enrollment. Baseline variables (including age) reflect measures at ARIC visit 4 (1996-1999). Diabetes was defined as self-reported disease, the use of hypoglycemic medication, a fasting glucose level ≥126 mg/dl, or a random glucose ≥200 mg/dl. Hypertension was defined as a mean systolic blood pressure ≥140 mmHg or a diastolic blood pressure ≥90 mmHg measured at the study visit, or the use of anti-hypertensive medications. Moderate-to-severely increased albuminuria was defined as a random urine albumin-creatinine ratio (ACR) > 30 mg/g.18 Urine albumin was measured by a nephelometric method either on the Dade Behring BN 100 or on the Beckman IMMAGE Nephelometer.
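For readers implementing these covariate definitions, a minimal sketch is given below. The function and argument names are hypothetical, and missing laboratory values are represented as None, which is an assumption about data handling rather than something stated in the study.

```python
from typing import Optional


def has_diabetes(
    self_reported: bool,
    on_hypoglycemic_medication: bool,
    fasting_glucose_mg_dl: Optional[float] = None,
    random_glucose_mg_dl: Optional[float] = None,
) -> bool:
    """Diabetes: self-report, medication use, fasting glucose >=126, or random glucose >=200 mg/dl."""
    if self_reported or on_hypoglycemic_medication:
        return True
    if fasting_glucose_mg_dl is not None and fasting_glucose_mg_dl >= 126.0:
        return True
    return random_glucose_mg_dl is not None and random_glucose_mg_dl >= 200.0


def has_hypertension(
    mean_systolic_mmhg: float, diastolic_mmhg: float, on_antihypertensive: bool
) -> bool:
    """Hypertension: systolic >=140 mmHg, diastolic >=90 mmHg, or medication use."""
    return mean_systolic_mmhg >= 140.0 or diastolic_mmhg >= 90.0 or on_antihypertensive


def has_increased_albuminuria(acr_mg_g: float) -> bool:
    """Moderate-to-severely increased albuminuria: urine ACR >30 mg/g."""
    return acr_mg_g > 30.0


if __name__ == "__main__":
    # Hypothetical participant values.
    print(has_diabetes(False, False, fasting_glucose_mg_dl=131.0))
    print(has_hypertension(128.0, 92.0, on_antihypertensive=False))
    print(has_increased_albuminuria(30.0))
```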

Statistical Analysis

Baseline characteristics of the study participants were compared using χ2 and t-tests for categorical and continuous variables, respectively. For variables with skewed distributions, medians and interquartile ranges were reported. The relative risk (RR) of study visit non-attendance was calculated in unadjusted and demographic-adjusted models using modified Poisson regression with robust variance as previously described.19 In sensitivity analysis, the RR of non-attendance from ARIC visit 1 to visit 2 was estimated (3-year interval, n=15,642) as well as from ARIC visit 2 to visit 4 (6-year interval, n=14,292). The entire population was used in comparisons by attendance at the follow-up study visit and interim hospitalization status; only those with eGFR ≥60 ml/min/1.73 m2 at baseline were used when comparing incident CKD stage 3 rates (Figure S1). Washington County participants alive in 2002 with eGFR ≥60 ml/min/1.73 m2 at baseline were included in analyses of electronic medical record data. In the validation analysis, standard errors were calculated using a clustered sandwich estimator to account for potential within-participant correlation due to the possibility of multiple hospitalizations per participant. To determine if validity varied by participant or hospitalization factors, sensitivity and specificity were evaluated using logistic regression with each of the following factors included as a covariate: time period (2002-2005, 2006-2010), participant age (<65, ≥65 years), gender, baseline diabetes (yes, no), and baseline eGFR <60 ml/min/1.73 m2 (yes, no).

Because the validation analyses were performed within a single study center, in sensitivity analysis we performed a validation of hospitalization diagnostic codes across the entire ARIC cohort using chart review. Hospitalization charts were selected using a random number generator (n=546), and two blinded independent reviewers adjudicated the presence of eGFR <60 ml/min/1.73 m2 not occurring in the presence of acute kidney injury. Estimates of validity were adjusted for sampling technique as previously described.20 All analyses were performed using Stata/SE 11.2 (College Station, TX: StataCorp LP).

RESULTS

Study Population

There were 11,560 ARIC participants with available eGFR at baseline (1996-1999). Nearly half (48.5%) did not attend the follow-up visit in 2011-2013 (Table 1). Persons who did not attend were older, more often male, and had slightly lower baseline eGFR. The RR of non-attendance associated with baseline eGFR < 60 ml/min/1.73 m2 was 1.59 (95% CI, 1.51-1.66) and higher than those associated with baseline hypertension or diabetes (RRs of 1.29 [95% CI, 1.24-1.33] and 1.42 [95% CI, 1.37-1.48], respectively). After adjustment for demographic factors, the associated risks were less pronounced (RRs for baseline eGFR <60 ml/min/1.73 m2, hypertension, and diabetes of 1.26 [95% CI, 1.20-1.32], 1.19 [95% CI, 1.15-1.23], and 1.35 [95% CI, 1.30-1.40], respectively). The cause of non-attendance was death in 45% of the participants. Persons who missed the follow-up visit had higher median number of hospitalizations compared with those who attended the follow-up visit (4 vs. 1 hospitalizations; p<0.001), and fewer were never hospitalized (10.8% vs. 28.4%; p<0.001). In sensitivity analysis, the increased risk of study visit non-attendance associated with baseline eGFR <60 ml/min/1.73 m2 persisted when evaluated at separate ARIC study visits with shorter inter-visit intervals (ARIC visit 2 to ARIC visit 4 demographic-adjusted RR, 1.90 [95% CI, 1.67-2.16]; ARIC visit 1 to ARIC visit 2 demographic-adjusted RR, 2.21 [95% CI, 1.74-2.79]).

TABLE 1.

Characteristics of study population at baseline (1996-1999) and incidence of hospitalization, CKD, and mortality, according to follow-up visit (2011-2013) attendance

| Characteristic | Not attending | Attending |
| --- | --- | --- |
| No. of participants (%) | 5609 (48.5) | 5951 (51.5) |
| Baseline variables | | |
| Age (y) | 64.7 ± 5.6 | 61.1 ± 5.1 |
| Female sex | 2966 (52.9) | 3486 (58.6) |
| African-American | 1316 (23.5) | 1267 (21.3) |
| Hypertension | 2477 (44.5) | 1937 (32.7) |
| Diabetes mellitus | 1245 (22.3) | 690 (11.6) |
| Prevalent cardiovascular disease | 673 (12.3) | 305 (5.2) |
| eGFR <60 ml/min/1.73 m2 | 548 (9.8) | 194 (3.3) |
| eGFR (ml/min/1.73 m2) | 83.9 ± 17.9 | 88.3 ± 14.4 |
| Risk of hospitalization, CKD, mortality | | |
| Never hospitalized | 608 (10.8) | 1688 (28.4) |
| No. of hospitalizations | 4 [1-7] | 1 [0-3] |
| Incident eGFR <60 ml/min/1.73 m2 *a | 0 (0.0) | 1572 (27.3) |
| Incident CKD stage 3 *b | | |
| Visit-based definition | 0 (0.0) | 1330 (23.1) |
| Hospitalization-based definition | 821 (16.2) | 302 (5.3) |
| Combined definition | 821 (16.2) | 1432 (24.9) |
| Death before follow-up visit | 2542 (45.3) | 0 (0.0) |

Note: N=11,560. Values for categorical variables are given as frequency (percentage); values for continuous variables are given as mean ± standard deviation or median [interquartile range].

P<0.001 for all, except African-American P=0.005.

Abbreviations: eGFR, estimated glomerular filtration rate; CKD, chronic kidney disease

* Among those with visit 4 eGFR ≥60 ml/min/1.73 m2 (n=10,818; 5,061 of whom did not attend the follow-up visit and 5,757 of whom did).

a) Incident eGFR <60 ml/min/1.73 m2 was determined at the follow-up study visit.

b) The visit-based definition requires both incident eGFR <60 ml/min/1.73 m2 and a decrement in eGFR ≥25%. The hospitalization-based definition defines CKD based on hospitalizations with CKD diagnostic codes. The combined definition combines the visit-based and hospitalization-based definitions.
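The unadjusted relative risk of non-attendance associated with baseline eGFR <60 ml/min/1.73 m2 can be approximated directly from the counts in Table 1. The sketch below computes a crude risk ratio with a log-scale Wald confidence interval; this is a simplification of the modified Poisson regression with robust variance used in the study, so small differences from the reported estimate are to be expected. With a single binary exposure the crude ratio coincides with the regression point estimate, and here it comes out at roughly 1.58, close to the reported 1.59 (95% CI, 1.51-1.66). The function name is hypothetical.

```python
import math
from typing import Tuple


def risk_ratio(
    events_exposed: int, n_exposed: int, events_unexposed: int, n_unexposed: int, z: float = 1.96
) -> Tuple[float, float, float]:
    """Crude risk ratio with a log-scale Wald confidence interval."""
    rr = (events_exposed / n_exposed) / (events_unexposed / n_unexposed)
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed + 1 / events_unexposed - 1 / n_unexposed
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


if __name__ == "__main__":
    # Table 1: 548 of 742 participants with baseline eGFR <60 did not attend,
    # versus 5,061 of 10,818 participants with baseline eGFR >=60.
    rr, lower, upper = risk_ratio(548, 548 + 194, 5061, 10818)
    print(f"RR {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```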

By interim hospitalization status, participants who were never hospitalized were younger, more often female, and had slightly higher baseline eGFR (Table S2). The RR of any hospitalization (vs. never) associated with baseline eGFR <60 ml/min/1.73 m2 was 1.15 (95% CI, 1.12-1.17) and similar to those of hypertension and diabetes (RRs of 1.13 [95% CI, 1.11-1.15] and 1.15 [95% CI, 1.13-1.17], respectively). More never-hospitalized participants attended the final study visit (73.5% vs. 46.0%; p<0.001) and fewer died during follow-up (0.2% vs. 27.4%; p<0.001) than participants who were hospitalized at least once during the study period.

Incident CKD Stage 3, by Definition

There were 10,818 participants at risk for incident CKD stage 3 (742 participants with eGFR <60 ml/min/1.73 m2 at baseline were excluded from this analysis); 2,253 (20.8%) developed CKD stage 3 by either a decrease in eGFR ≥25% with a resultant eGFR < 60 ml/min/ 1.73 m2 at the follow-up study visit (visit-based definition) or a hospitalization with a CKD-associated diagnostic code (hospitalization-based definition). By the visit-based definition, incident CKD stage 3 developed among 23.1% of the second study visit attendees and 0% (by definition) among the non-attendees. By the hospitalization-based definition, 5.2% of the second study visit attendees and 16.2% of the non-attendees developed incident CKD stage 3. By the combined definition, the cumulative probability of CKD stage 3 over the total study period was 24.9% among attendees and 16.2% among non-attendees (p<0.001). Overall, persons labeled as developing CKD stage 3 by the visit-based definition compared with the hospitalization-based definition were younger (baseline age, 62.7 vs. 64.5 years) and had a lower baseline prevalence of hypertension (42.2% vs. 54.1%), diabetes (18.6% vs. 35.7%), and moderately-severely increased albuminuria (7.9% vs. 19.5%).
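As a check on the arithmetic, the combined-definition proportions quoted above follow from the counts in Table 1. The short sketch below treats the combined definition as a logical OR of the two component definitions and recomputes the percentages; the variable names are illustrative only.

```python
def combined_definition(visit_based: bool, hospitalization_based: bool) -> bool:
    """A participant is a combined-definition case if either definition is met."""
    return visit_based or hospitalization_based


# Counts from Table 1, restricted to participants with baseline eGFR >=60.
at_risk = {"non-attendees": 5061, "attendees": 5757}
combined_cases = {"non-attendees": 821, "attendees": 1432}

percent_by_group = {
    group: 100.0 * combined_cases[group] / at_risk[group] for group in at_risk
}
overall_percent = 100.0 * sum(combined_cases.values()) / sum(at_risk.values())

for group, pct in percent_by_group.items():
    print(f"{group}: {pct:.1f}%")
print(f"overall: {overall_percent:.1f}%")
```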

Incident CKD Stage 3, by Baseline Age

The proportion developing CKD by the various definitions (irrespective of follow-up visit attendance) was assessed within 5-year categories of baseline age (Figure 1). By the visit-based definition, the proportion developing CKD stage 3 was highest among participants aged 61-65 years; rates were lower in both older and younger populations. By the hospitalization-based definition, the cumulative probability of CKD appeared linearly related to age category, whereby participants in the oldest age category (71-75 years) had the highest risk of incident CKD stage 3. Using a combination of both definitions resulted in a higher proportion with captured incident CKD stage 3 within each age group.

FIGURE 1.

Proportion developing chronic kidney disease stage 3 during follow-up, by categories of baseline age*

*Among those with estimated glomerular filtration rate > 60 ml/min/1.73 m2 at baseline study visit (1996-1999)

**Abbreviations: CI, confidence interval; CKD, chronic kidney disease. Error bars indicate 95% confidence intervals.

Missed Cases of Incident CKD, by Stage

The majority (84.4%) of the Washington County, Maryland, ARIC participant population had an outpatient creatinine value by which to assess stages of incident CKD (Table 2). Compared to those who remained with eGFR ≥60 ml/min/1.73 m2, participants with incident CKD stages 3-5 were older, with lower baseline eGFR and a higher prevalence of baseline hypertension and diabetes. Women were overrepresented in incident CKD stages 3 and 4 but underrepresented in incident CKD stage 5. Persons with incident CKD stages 3-5 were less likely to attend the subsequent study visit, particularly among those developing CKD stages 4 and 5. Mortality was slightly higher among those with incident CKD stage 3 (16.9% vs. 15.8% among those without incident CKD) and substantially higher among those with incident CKD stages 4 and 5 (38.5% and 54.5%, respectively).

TABLE 2.

Baseline (1996-1999) and follow-up characteristics of Washington County participants with baseline eGFR>60 ml/min/1.73m2

| Characteristic | Unable to assess CKD status** | Did not develop CKD | Developed CKD Stage 3 | Developed CKD Stage 4 | Developed CKD Stage 5 |
| --- | --- | --- | --- | --- | --- |
| No. of participants | 440 (15.6) | 939 (33.4) | 1240 (44.0) | 174 (6.2) | 22 (0.8) |
| Baseline variables | | | | | |
| Age (y) | 62.3 ± 5.2 | 61.8 ± 5.2 | 63.8 ± 5.5 | 64.7 ± 5.4 | 64.8 ± 5.5 |
| Female sex | 231 (52.5) | 483 (51.4) | 718 (57.9) | 99 (56.9) | 11 (50) |
| Hypertension | 129 (29.5) | 253 (27.1) | 496 (40.2) | 93 (53.8) | 14 (64) |
| Diabetes mellitus | 66 (15.1) | 102 (10.9) | 213 (17.2) | 53 (30.5) | 10 (46) |
| eGFR at baseline ARIC visit (ml/min/1.73 m2) | 88.1 ± 11.4 | 91.8 ± 9.6 | 83.3 ± 11.1 | 80.9 ± 14.0 | 79.5 ± 12.5 |
| Follow-up variables | | | | | |
| Participant in subsequent ARIC visit | 241 (54.8) | 582 (62.0) | 724 (58.4) | 55 (31.6) | 4 (18) |
| No hospitalizations | 140 (31.8) | 229 (24.4) | 217 (17.5) | 5 (2.9) | 1 (5) |
| Incident eGFR <60 ml/min/1.73 m2 at ARIC follow-up visit | 62 (14.1) | 51 (5.4) | 317 (25.6) | 44 (25.3) | 4 (18) |
| Incident CKD stage 3* | | | | | |
| Visit-based definition | 50 (11.4) | 50 (5.3) | 270 (21.8) | 42 (24.1) | 4 (18) |
| Hospitalization-based definition | 36 (8.0) | 32 (2.9) | 140 (11.0) | 96 (55.2) | 18 (82) |
| Combined definition | 80 (18.2) | 78 (8.3) | 387 (31.2) | 122 (70.1) | 18 (82) |
| Incident mortality | 73 (16.6) | 148 (15.8) | 209 (16.9) | 67 (38.5) | 12 (55) |

Note: n=2,815. Assessment of CKD status shown in column headings is based not on ARIC study visits but on outpatient laboratory results from 2002 through 2010 (most outpatient serum creatinine measurements are captured in the electronic medical record for Washington County participants). CKD stage 3: eGFR <60 ml/min/1.73 m2 and ≥30 ml/min/1.73 m2; CKD stage 4: eGFR <30 ml/min/1.73 m2 and ≥15 ml/min/1.73 m2; CKD stage 5: eGFR <15 ml/min/1.73 m2. Values for categorical variables are given as number (percentage); values for continuous variables are given as mean ± standard deviation.

** Did not have an outpatient creatinine measurement in the electronic medical record.

* The visit-based definition requires both incident eGFR <60 ml/min/1.73 m2 (at ARIC follow-up visit) and a decrement in eGFR ≥25%. The hospitalization-based definition defines CKD based on hospitalizations with CKD diagnostic codes. The combined definition combines the visit-based and hospitalization-based definitions.

Abbreviations: ARIC, Atherosclerosis Risk in Communities; eGFR, estimated glomerular filtration rate; CKD, chronic kidney disease

Accuracy of Diagnostic Codes

There were 2,540 hospitalizations in Washington County, Maryland, with available diagnostic codes and outpatient creatinine in the year pre-hospitalization (Table S3). Diagnostic code–identified CKD captured 35.5% (95% CI, 31.6%-39.7%) of the cases with pre-hospitalization eGFR <60 ml/min/1.73 m2 (Table 3). Specificity was 95.7% (95% CI, 94.2%-96.8%); positive and negative predictive values were 90.8% and 55.3%, respectively. Sensitivity was higher in the more recent era (2006-2010), and among older participants, those with diabetes, and those with long-standing CKD (Figure 2). Specificity was slightly higher among women than men and among participants with diabetes. In sensitivity analysis using adjudicated chart review from all four study sites as a gold standard, the validity of diagnostic code-identified CKD was similar (sensitivity, 36.2% [95% CI, 27.1%-45.4%]; specificity, 97.7% [95% CI, 96.3%-99.1%]).

TABLE 3.

Validation of administrative diagnostic codes for CKD, Washington County hospitalizations (2002-2010)

| Parameter | Value, % (95% CI) |
| --- | --- |
| Sensitivity | 35.5 (31.6-39.7) |
| Specificity | 95.7 (94.2-96.8) |
| Positive Predictive Value | 90.8 (87.3-93.3) |
| Negative Predictive Value | 55.3 (51.9-58.7) |

Note: n=2,540. Mean outpatient eGFR <60 ml/min/1.73 m2 is the gold standard comparator. Outpatient creatinine values were converted to eGFR using the CKD-Epidemiology Collaboration creatinine equation and averaged in the year prior to hospital admission. The prevalence of CKD in the validation population was 47.1%. The positive predictive value in a population with fewer cases of CKD would be lower.

Abbreviations: CKD, chronic kidney disease; eGFR, estimated glomerular filtration rate.
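The table note's point that predictive values depend on disease prevalence can be illustrated with the standard Bayes' rule identities. The sketch below holds sensitivity and specificity fixed at the reported 35.5% and 95.7% and varies prevalence; the prevalence grid is hypothetical. The values it returns are illustrative and are not expected to match the predictive values in Table 3 exactly, which were estimated directly from the hospitalization-level data.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical prevalence values, from a low-prevalence cohort to a hospitalized one.
    for prevalence in (0.05, 0.10, 0.25, 0.471):
        ppv, npv = predictive_values(0.355, 0.957, prevalence)
        print(f"prevalence {prevalence:.3f}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

As prevalence falls, the positive predictive value falls and the negative predictive value rises, which is why predictive values from a hospitalized validation sample may not transfer to a healthier population.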

FIGURE 2.

Variation in CKD diagnostic code validity by participant and hospitalization characteristic, Washington County hospitalizations 2002-2010 (n=2,540)

*Indicates a significant difference in sensitivity or specificity between categories (p<0.05).

DISCUSSION

This study confirms that eGFR <60 ml/min/1.73 m2 is common in an older, community-based population and that incident CKD stages 3-5 is associated with substantial morbidity and mortality. It also suggests that reliance on study visit attendance for the identification of incident disease may incompletely capture cases, particularly severe ones, possibly resulting in biased associations with risk factors. Cases of incident CKD stage 3 varied by method of identification: those who were younger and healthier were more likely to be captured in a planned study visit, and those who were older were more likely to be captured in a hospitalization. Although it is believed that the risk of CKD stage 3 increases with age,21 a visit-based definition resulted in an implausible decrease in disease risk among those older than 65 years, whereas a hospitalization-based approach may have exaggerated the age–CKD association. We propose that a combination of study visit–based screening and administrative data may improve incident CKD capture in cohort studies with variably spaced follow-up visits—in the ARIC Study, over 70% of those with advanced disease were identified—but that such data should be interpreted cautiously.

While we focused on incident CKD stage 3 in a prospective cohort, the themes of informative censoring and biases associated with methods of disease identification likely are applicable to other chronic diseases and study designs. For example, we found that not only eGFR <60 ml/min/1.73 m2 but also diabetes and hypertension were associated with higher risk of follow-up study visit non-attendance. For kidney disease, the nonlinear trends with age observed using a visit-based definition are easy to refute; there is strong evidence that GFR decline occurs over a lifetime with no discernible plateau in CKD incidence.21 For diabetes, on the other hand, a slowing of disease incidence with age has been reported.22, 23 Perhaps some (or all) of this plateau could be attributable to higher rates of subsequent study non-attendance among persons with incident disease.

Epidemiologic studies using administrative data are increasingly common. Such studies are relatively inexpensive to perform and benefit from large sample sizes.24 A significant disadvantage, however, is the possibility of spectrum bias, whereby the performance of a diagnostic method (e.g., diagnostic codes for CKD, or laboratory-based categorization of eGFR) may vary by setting.25 A method to identify CKD that relies on diagnostic codes (particularly in the case of hospitalization-based studies, but also in the outpatient clinic setting) may be more effective in an older and/or sicker population, and the case-mix may skew toward more advanced disease. In our analysis of incident CKD within the electronic medical record, for example, we found that those with CKD stage 4 or greater were much more likely both to be hospitalized during follow-up and to receive a CKD diagnostic code during hospitalization than those with CKD stage 3. Our results suggest that CKD surveillance methods can be an important supplement to study visit-based surveillance to “fill in gaps”; however, methods relying on study visit follow-up would be expected to be more effective in a younger and healthier population.

The present study extends the existing literature on the validity of CKD diagnostic codes in administrative data. Few U.S. studies have validated diagnostic codes against eGFR; those that have are variable in results, with sensitivities ranging from 8% to 59%.26-28 While some studies have reported differential sensitivity by patient characteristic—Stevens et al.28 reported no variation by CKD risk factors, Ferris et al.29 found higher sensitivity with severe CKD, hypertension, and diabetes, and Ronksley et al.30 again found higher sensitivity with diabetes and more severe CKD—none to our knowledge tested the variation in validity over time. Our finding that the sensitivity of ICD-9-CM codes was higher in the more recent era corresponds with the introduction of automatic eGFR reporting (in Washington County, MD, this occurred in December 2004) and suggests that studies using ICD-9-CM codes to determine trends over time may require adjustment for the differential sensitivity.

This study has certain limitations. The two planned study visits were approximately 15 years apart. This interim period is longer than most prospective cohort designs and almost certainly results in a more dramatic demonstration of differences between study visit– and hospitalization-based CKD stage 3 identification. Still, in sensitivity analysis, the association between baseline CKD stage 3 or greater and study visit non-attendance persisted, even with shorter inter-visit intervals. Hospitalization-based definitions require that persons are hospitalized; a younger, healthier cohort may not have had similar increases in disease capture by supplementing visit-based identification with hospitalization-based identification. Estimates of positive and negative predictive values are very much dependent on prevalence of disease in the population, and nearly 50% of our validation population had CKD, although this may not be atypical in a hospitalized population. Finally, the diagnostic codes were validated only in a subset of ARIC participants; to the extent that coding practices or severity of disease differs from one region to another, our measures of sensitivity and specificity may not reflect those of other populations. However, in sensitivity analyses using hospitalization chart review from participants across all four study centers and a wide range of hospitals, the validity of the diagnostic code algorithm was quite similar.

Strengths of this study include a large, community-based population with long-term follow-up and high retention rates (>90% contact rate). Incident CKD was determined through multiple sources, and, for a subset of participants, verified against an electronic medical record containing both inpatient and outpatient laboratory measures. These complementary data sources allow in-depth analysis of those who miss (whether due to non-adherence or death) follow-up study visits.

In conclusion, this study quantifies loss-to-follow-up associated with CKD stage 3 or greater, and it demonstrates that neither study-based methods nor administrative data-based methods captured the true spectrum of incident CKD in a community-based prospective cohort. Each method identified different cases of incident CKD that varied in systematic ways. A combination of study-visit and administrative data may be the most accurate method for identifying community-representative incident disease.


Acknowledgements

The authors thank the staff and participants of the ARIC Study for their important contributions.

Support: Dr Grams is supported by National Institutes of Health/National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) grant K08DK092287. The Atherosclerosis Risk in Communities Study is carried out as a collaborative study supported by National Heart, Lung and Blood Institute contracts (HHSN268201100005C, HHSN268201100006C, HHSN268201100007C, HHSN268201100008C, HHSN268201100009C, HHSN268201100010C, HHSN268201100011C, and HHSN268201100012C) as well as NIDDK grant R01 DK076770. Dr Grams had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Footnotes

Financial Disclosure: Dr Coresh has consulted for Amgen and Merck and has an investigator-initiated grant from Amgen. The other authors declare that they have no other relevant financial interests.

Supplementary Material Table S1: Diagnostic code algorithm for identifying CKD.

Table S2: Baseline and follow-up characteristics, by interim hospitalization status during follow-up.

Table S3: Inpatient CKD diagnostic codes compared to 1-year antecedent average outpatient eGFR < 60 ml/min/1.73 m2.

Figure S1: Flow diagram outlining study populations.

Note: The supplementary material accompanying this article (doi:_______) is available at www.ajkd.org

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
