PLOS ONE. 2022 Jul 29;17(7):e0264167. doi: 10.1371/journal.pone.0264167

A systematic review of statistical methodology used to evaluate progression of chronic kidney disease using electronic healthcare records

Faye Cleary 1,*, David Prieto-Merino 1, Dorothea Nitsch 1
Editor: Mabel Aoun 2
PMCID: PMC9337679  PMID: 35905096

Abstract

Background

Electronic healthcare records (EHRs) are a useful resource to study chronic kidney disease (CKD) progression prior to starting dialysis, but pose methodological challenges as kidney function tests are not done on everybody, nor are tests evenly spaced. We sought to review previous research of CKD progression using renal function tests in EHRs, investigating methodology used and investigators’ recognition of data quality issues.

Methods and findings

We searched for studies investigating CKD progression using EHRs in 4 databases (Medline, Embase, Global Health and Web of Science) available as of August 2021. Of 80 articles eligible for review, 59 (74%) were published in the last 5.5 years, mostly using EHRs from the UK, USA and East Asian countries. 33 articles (41%) studied rates of change in eGFR, 23 (29%) studied changes in eGFR from baseline and 15 (19%) studied progression to binary eGFR thresholds. Sample completeness data was available in 44 studies (55%) with analysis populations including less than 75% of the target population in 26 studies (33%). Losses to follow-up went unreported in 62 studies (78%) and 11 studies (14%) defined their cohort based on complete data during follow up. Methods capable of handling data quality issues and other methodological challenges were used in a minority of studies.

Conclusions

Studies based on renal function tests in EHRs may have overstated reliability of findings in the presence of informative missingness. Future renal research requires more explicit statements of data completeness and consideration of i) selection bias and representativeness of sample to the intended target population, ii) ascertainment bias where follow-up depends on risk, and iii) the impact of competing mortality. We recommend that renal progression studies should use statistical methods that take into account variability in renal function, informative censoring and population heterogeneity as appropriate to the study question.

Introduction

Chronic kidney disease (CKD) is a growing public health problem [1, 2]. Risks associated with CKD include cardiovascular morbidity, death, and in rare cases progression to end-stage renal disease (ESRD) requiring renal replacement therapy (RRT) [3]. Severity of disease, mechanism of renal damage and rate of progression of disease vary between patients, and the disease may change course over time in response to changing risk factors [4, 5]. Although only a minority of patients progress to ESRD, the cost of RRT presents a substantial economic burden to public health services and is likely to increase further over the coming years as prevalence of RRT rises alongside population growth and an ageing population [6, 7]. Increasing adoption of electronic healthcare records (EHRs) offers an opportunity to study progression of kidney disease in real-world care, which may enable improved decision-making in clinical practice. Whilst EHRs promise large sample sizes for analysis, constraints on the availability of renal function test results may complicate reliable evaluation. Frequency of monitoring of renal function is likely to vary in routine care according to differing individual patient risk profiles, local healthcare policy, physician-related factors, area of management within the healthcare system, social factors, or temporary illness. This may lead to some members of the target population being less likely to be followed up for renal function, potentially leading to selection and ascertainment biases in the study of CKD progression that may result in unreliable conclusions.

There are other methodological challenges in evaluation of CKD progression that are not specific to EHRs and that researchers should consider. Deterioration in renal function over time is most commonly detected through changes in the estimated glomerular filtration rate (eGFR), usually derived from serum creatinine, sex, age, and ethnicity. Such creatinine-based GFR-estimating equations are imprecise, particularly at high levels of eGFR [8, 9]. Major changes in renal function in the context of acute illness are a sign of acute kidney injury (AKI). Although AKI is at least partially reversible in surviving patients, a history of AKI may accelerate subsequent loss in renal function. However, when researchers study eGFR decline over time, they often use statistical models that ignore the impact of acute drops in renal function on the subsequent trajectory. Population heterogeneity (caused by variation in risk factors both at baseline and evolving over time) may complicate analyses that assume a common mean linear trajectory of renal function loss over time; if this assumption is violated, more sophisticated methods that take this variability into account may be needed. Unmeasured confounding may also present issues, particularly if important confounders are not considered in the analysis. Competing events such as initiation of RRT or death complicate evaluation of progression outcomes. A previous systematic review by Boucquemont et al. in 2014 [10] reviewed statistical methods used to identify risk factors for progression of CKD, covering research on cohort studies published between 2002 and 2012. They summarised the most commonly used outcome measures and statistical models, critiquing handling of bias due to informative censoring, competing risks, correlation due to repeated measures, and non-normality of response, and proposed recommendations for best practice statistical methods and software packages.

We performed a systematic review of all longitudinal analyses of renal function tests investigating the nature, burden or consequences of CKD progression using EHRs. We aimed to establish how data issues inherent to EHRs and methodological challenges were handled, how CKD progression was defined, what statistical methods were used and whether data issues were acknowledged in the context of reliability of study conclusions.

Materials and methods

Protocol and registration

There is no published protocol available for this systematic review. Prior to completion of data extraction, this review was registered in the PROSPERO international prospective register of systematic reviews (registration number CRD42020182587).

Eligibility criteria

This is a review of statistical methodology covering all research studying the nature, burden or consequences of CKD progression using EHRs. Our intention was to focus on how researchers used renal function tests to study CKD progression. Initiation of dialysis is already a well-established clinically important outcome and as this was not the subject of the review, we excluded dialysis endpoints (as a measure of CKD progression) from review. Populations that had already initiated RRT at baseline or that were sampled on the basis of RRT initiation were excluded from review, since such populations are not appropriate for studying progression of CKD. (This criterion does not exclude patients that initiated RRT during follow-up.) Measures of CKD progression may constitute either exposures or outcomes of analysis. PICOS criteria are listed in the table below. There are no restrictions on sample size, population location or date of publication. Only studies reported in English language are included.

Participants
 Include: Adults aged ≥18 with CKD stages 3–5; Studies that involve both CKD and non-CKD patients are also included, e.g. diabetes
 Exclude: Patients who have initiated RRT (dialysis or transplant), even if data is collected for renal function prior to RRT initiation; Patients with AKI (unless chronic changes are also studied); Non-human subjects; Children
Intervention/Exposure
 No restriction if CKD progression is measured as the outcome, rather than exposure.
 If CKD progression is analysed as an exposure, restrictions of this measure apply (see outcome definition).
Comparators/Control
 No restriction.
Outcome
 No restriction on outcomes if CKD progression is measured as an exposure, rather than outcome.
 If the outcome is a measure of CKD progression:
  Include: Measures of chronic change in renal function based on multiple measures of eGFR or any other measure that may be used to infer eGFR (e.g. serum creatinine, cystatin-C, iohexol clearance), e.g. rate of change, change from baseline, regression slope, time to change or threshold eGFR
  Exclude: All other measures of renal function, e.g. proteinuria; Studies of acute AKI or short term follow up (<6 months) of renal function following a procedure; Single time-point analyses; Time to RRT as single outcome.
Study design
 Include: Retrospective analysis of routinely collected electronic healthcare records which may include retrospective cohort studies, case-control studies and cross-sectional studies (if a measure of past progression is included)
 Exclude: Case reports, Clinical trials, prospective cohort studies or any other study design with pre-planned data collection strategy for research purposes.

Searches

We performed electronic searches of MEDLINE, EMBASE, Global Health and Web of Science databases through to 11th August 2021. A copy of the search strategy is provided in the supplementary materials S2 File.

Study selection

This study had one lead reviewer and two supporting reviewers. The lead reviewer was responsible for screening all articles for eligibility, which involved scrutiny of abstracts followed by full-text review. The two supporting reviewers independently screened a sample of 50 articles each for eligibility. Consistency of agreement and reasons for disagreement were discussed. Clarity of inclusion/exclusion criteria was updated following discussion and prior to completion of eligibility review by the lead reviewer.

Data collection process

The lead reviewer was responsible for data extraction for all eligible research articles. In addition, key items that were the subject of this review were validated by supporting reviewers who independently extracted the following items for all articles: (1) measure of change in renal function; (2) statistical methods used in analysis of changes in renal function; and (3) definitions of progression of CKD, if any. The lead reviewer developed a data extraction form in an Excel spreadsheet, which was reviewed and approved by supporting reviewers in the initial stages of data extraction.

Data items

Information extracted from eligible research articles included details of the study population, study methodology and how data quality issues and other methodological issues were handled. Extracted items are listed below.

Study population

Data collection timeframe; Country of residence; Mean age; Percent male; Primary morbidity under study / reason for inclusion; Data source / healthcare setting

Study methodology

Date of publication; Study design; Research aims; Sample size (before and after exclusions for reasons of data completeness [for details, see below explanation of data completeness inclusion criteria and calculations of percentage of target population analysed]); Measure of renal function; Measure of change in renal function over time; Definition of progression (if any); Whether change in renal function was exposure or outcome; Duration of follow up for changes in renal function; Data completeness inclusion criteria and the minimum number of renal function tests required for analysis; Statistical tools used; Statistical model used.

Some additional results were derived to quantify data completeness for analysis, including the percentage of the target population that were analysed after application of data completeness inclusion criteria and the percentage of patients that dropped out of analysis during the intended follow up period having met criteria for inclusion in analysis. Here, “data completeness inclusion criteria” refer to the study-specific inclusion criteria applied prior to main analyses being performed that aimed to retain only those patients with sufficient data completeness to be deemed suitable for analysis, with such criteria expected to vary between studies.

Percentage of target population analysed was defined as:

\[
\frac{\text{number of patients analysed meeting population criteria after exclusions due to data completeness}}{\text{number of patients meeting population criteria prior to exclusions due to data completeness}} \times 100
\]

This was computable in some but not all studies, as it requires data on the total number of patients included in analysis as well as the number of patients that met population criteria before data completeness exclusion criteria were applied. (In propensity score matched cohort studies, propensity score matching criteria are included in population criteria, and we only compute percentage of target population analysed in the propensity score matched cohort, where this is possible.)

Percentage of study population lost to follow up was defined as:

\[
\frac{\text{number of analysed patients lost to follow up during the intended follow up period}}{\text{number of patients analysed}} \times 100
\]

Again, this was computable in some but not all studies, as it requires data on the number of patients analysed and the number of those patients that dropped out during the intended follow up period, for example due to death, initiation of RRT or other lack of follow up in routine care which could be for many different reasons.
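As an illustrative sketch of the two derived measures defined above, the following Python computes both percentages from study-level counts. The function names and the example counts are hypothetical and are not taken from any reviewed study. Returning `None` when a count is unreported is an assumption made here to mirror the fact that these measures were not computable for every study.

```python
from typing import Optional


def pct_target_population_analysed(
    n_analysed: Optional[int], n_target: Optional[int]
) -> Optional[float]:
    """Percentage of the target population retained after data completeness exclusions."""
    if n_analysed is None or n_target is None or n_target <= 0:
        return None  # not computable from the reported data
    return 100.0 * n_analysed / n_target


def pct_lost_to_follow_up(
    n_lost: Optional[int], n_analysed: Optional[int]
) -> Optional[float]:
    """Percentage of analysed patients who dropped out during the intended follow up."""
    if n_lost is None or n_analysed is None or n_analysed <= 0:
        return None  # not computable from the reported data
    return 100.0 * n_lost / n_analysed


# Hypothetical study: 2,000 patients met population criteria, 1,200 had enough
# renal function tests to be analysed, and 300 of those later dropped out.
analysed = pct_target_population_analysed(1200, 2000)
lost = pct_lost_to_follow_up(300, 1200)
print(analysed, lost)
```

Note that the denominators differ: the first measure is relative to the target population before completeness exclusions, the second relative to the analysed population.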

Handling of data quality issues and other methodological challenges

Of the items below, details extracted included whether items were mentioned, whether information was provided on data completeness [if relevant], whether implications were acknowledged, whether challenges were tackled methodologically and any statistical methods used to attempt to overcome challenges:

Handling of sample completeness / representativeness of the target population; Handling of informative drop-outs/censoring; Handling of missing longitudinal data; Handling of missing covariate data; Distributional checks/issues; Handling of within-patient correlation and variability of kidney function over time; Handling of population heterogeneity; Handling of confounding.

Risk of bias in individual studies

Assessment of bias in individual studies was one of the main aims of this systematic review. Key measures of bias evaluated in individual studies were the percentage of the sample target population that were analysed and the percentage of the analysed study population that were lost to follow up. Study-specific measures were reported and bar charts were produced for these measures to demonstrate the potential for bias in individual studies due to informatively missing data.

Synthesis of results

This review was descriptive with simple aggregation of collected data items only and no statistical analysis was performed. 4 separate summaries are provided to describe study population characteristics, study methodology used, acknowledgment and handling of data quality issues and other methodological challenges, and definitions of CKD progression. For studies exploring multiple outcomes or conducting multiple analyses of changes in renal function, the outcomes and analyses considered the primary focus regarding renal progression in each paper are summarised in the review.

Risk of bias across studies

There was no single effect size of interest in this study and no meta-analysis was performed, as the review focussed on methodology used and investigators’ handling of data quality issues. Publication bias was therefore challenging to evaluate, as funnel plots and statistical tests could not be used. Efforts were made to maximise coverage of peer-reviewed literature in this field, including extraction of articles from 4 major databases. If research is missing from review due to publication in non-English languages, then data quality issues in such missing studies are likely to be similar to those in English language studies that were included. There will be clinical audit studies that are not peer-reviewed; these studies are likely to be of a similar or worse quality than reviewed studies because peer-reviewed literature is expected to go through certain research quality checks. In any case, as peer-reviewed literature is more likely to be used to inform policy than other research, this is arguably the optimal collection of research to assess the aims of this review.

Results

731 unique articles were identified from database searching, of which 80 met study eligibility criteria (Fig 1). Primary reasons for exclusion were not using EHRs, pre-planned data collection for research purposes such as a prospective cohort study, and studies with a single renal function test rather than longitudinal analysis of repeated measures of renal function. Other reasons for exclusion were ineligible populations, such as studies including children, restricted to RRT populations or studies that did not include CKD patients, such as studies of the incidence of CKD. All included studies retrospectively analysed routinely collected healthcare data. It was not always clear whether electronic or paper records were used, and while efforts were taken to differentiate this, it is possible that some included studies may have involved manual data extraction from paper records. 70 studies (88%) clearly stated the use of EHRs. In the 10 studies that did not state this, the time-frame for data collection and location of research suggested that electronic healthcare systems were likely to have been used, but we could not verify this. These studies have been summarised separately in the supplementary materials. A full list of reviewed studies is also included in the supplementary materials S3 File.

Fig 1. Flow chart of study selection.


Study population characteristics

Table 1 summarises characteristics of study populations analysed in reviewed articles. Research was most commonly conducted in the UK (25%) and USA (30%), followed by East Asian countries, including South Korea (8%), China (6%), Taiwan (9%) and Japan (8%). Research in non-English-speaking countries may be missing from review. Typically (based on median), studied populations had a mean age of 64 and were 52% male, although there was substantial variation between studies in these characteristics. Most commonly studied morbidities were CKD (26%) and diabetes (20%) although research covered a range of different populations, including (non-renal) transplant recipients and specific renal diseases. 10% studied the general population, with a further 3% studying patients with general risk factors for CKD. Clinical settings of retrieved databases varied widely, including primary care (23%), un-specified hospital settings (14%), outpatient clinics (21%), and 29% of studies used linked data across multiple care settings.

Table 1. Summary of study populations studied (N = 80).

Study population characteristics N (%)
Primary decade of follow up
 2010–2019 35 (43.8%)
 2000–2009 36 (45.0%)
 1990–1999 3 (3.8%)
 Not available 6 (7.5%)
Country
Europe  28 (35.0%)
  UK 20 (25.0%)
  Germany 2 (2.5%)
  Italy 2 (2.5%)
  Norway 2 (2.5%)
  Multiple European countries 2 (2.5%)
North America  25 (31.3%)
  USA 24 (30.0%)
  Canada 1 (1.3%)
Asia 25 (31.3%)
  South Korea 6 (7.5%)
  China 5 (6.3%)
  Taiwan 7 (8.8%)
  Japan 6 (7.5%)
  Thailand 1 (1.3%)
Oceania  1 (1.3%)
  Australia 1 (1.3%)
South America  1 (1.3%)
  Colombia 1 (1.3%)
Africa  0
Mean agea
 Median (IQR) 64 (56, 71)
 30–49 7 (8.8%)
 50–59 20 (25.0%)
 60–69 29 (36.3%)
 70–80 22 (27.5%)
 Not stated 2 (2.5%)
Percent male
 Median (IQR) 52% (44%, 63%)
 ≤ 34% 6 (7.5%)
 35–44% 15 (18.8%)
 45–54% 24 (30.0%)
 55–64% 16 (20.0%)
 ≥ 65% 19 (23.8%)
Main morbidity /reason for inclusion
 CKD 21 (26.3%)
 Diabetes 16 (20.0%)
 General population 8 (10.0%)
 Diabetic nephropathy / kidney disease 5 (6.3%)
 Atrial fibrillation 5 (6.3%)
 Multiple CKD risk factors 2 (2.5%)
 IgA nephropathy 2 (2.5%)
 Infections (Hepatitis C, HIV) 3 (3.8%)
 Transplant recipients (liver, heart) 3 (3.8%)
 Autoimmune diseases (lupus, IgG4 related, vasculitis) 3 (3.8%)
 Gout/hyperuricemia 2 (2.5%)
 Other* 10 (12.5%)
Data source / clinical setting
 Multiple care settings  23 (28.8%)
 Primary care  19 (23.8%)
 Outpatient  17 (21.3%)
  Diabetes clinic 6 (7.5%)
  Renal clinic 3 (3.8%)
  Diabetic-renal clinic 1 (1.3%)
 Not specified 7 (8.8%)
 Hospital  11 (13.8%)
 Tertiary care  6 (7.5%)
 Not stated  4 (5.0%)

*Other morbidities/reasons for inclusion were urinary system disorders, hyperkalemia, obesity, osteoporosis, primary aldosteronism, abdominal aortic aneurysm, acute renal embolism, light chain deposition disease, lung cancer and renal cancer.

Study methodology

Study methodology is summarised in Table 2 and a listing of key items by study is also provided in the supplementary materials S4 Table. Use of EHRs for observational research increased rapidly in recent years, with 74% of reviewed studies published in the last 5.5 years. The overwhelming majority of research was focussed on risk factor identification and causal inference (82%), with only a handful of studies attempting risk prediction (9%). Other aims included estimation of incidence or prevalence (4%) and descriptive characterisations of changes in renal function (4%). Sample size ranged widely, from 24 to 1,597,629, with a median sample size of 1,114.

Table 2. Study methodology (N = 80).

Study methodology features N (%)
Date of publication
 2015–2021 59 (73.8%)
 2010–2014 14 (17.5%)
 2005–2009 6 (7.5%)
 2000–2004 1 (1.3%)
Study design
 Retrospective cohort study 74 (92.5%)
 Cross-sectional study 4 (5.0%)
 Case-control study 2 (2.5%)
Research aims
 Risk factor identification / causal inference 65 (81.3%)
 Risk prediction 7 (8.8%)
 Estimation of incidence/prevalence 3 (3.8%)
 Descriptive characterisation of changes in renal function 3 (3.8%)
 Identification of sub-populations 1 (1.3%)
 Audit of care provision 1 (1.3%)
Sample size
 Median (IQR) 1114 (209, 9876)
 ≤ 99 10 (12.5%)
 100–499 18 (22.5%)
 500–999 11 (13.8%)
 1,000–9,999 22 (27.5%)
 ≥ 10,000 19 (23.8%)
Measure of renal function
eGFR  75 (93.8%)
  MDRD 33 (41.3%)
  CKD-EPI 28 (35.0%)
  MDRD, CKD-EPI combination 1 (1.3%)
  Taiwan CKD-EPI 1 (1.3%)
  Japanese formula 3 (3.8%)
  Not specified 9 (11.3%)
Estimated creatinine clearance  2 (2.5%)
  Cockcroft and Gault 2 (2.5%)
Serum creatinine  2 (2.5%)
Inverse serum creatinine  1 (1.3%)
Measure of change in renal function over timea
eGFR 75 (93.8%)
  Regression slope (absolute changes) 20 (25.0%)
   Individual linear regression 8 (10.0%)
   Linear mixed model 10 (12.5%)
   Growth model 1 (1.3%)
   Generalised estimating equations 1 (1.3%)
  Regression slope (absolute and percent changes) 1 (1.3%)
   Linear mixed model 1 (1.3%)
  Rate of change between measures 5 (6.3%)
  Rate of change, not clearly defined 4 (5.0%)
  Rate of percentage change, not clearly defined 3 (3.8%)
  Raw absolute change from baseline 10 (12.5%)
  Raw percent change from baseline 13 (16.3%)
  Raw percent change between measures 1 (1.3%)
  Binary progression to threshold eGFR 6 (7.5%)
  Binary progression (changes/threshold combination) 3 (3.8%)
  Transition between CKD stages 6 (7.5%)
  Trajectory shape class (mixed model) 1 (1.3%)
  Model predicted percent change per year 1 (1.3%)
  Model predicted eGFR at multiple time points 1 (1.3%)
Estimated creatinine clearance 2 (2.5%)
  Regression slope (absolute scale) 1 (1.3%)
  Raw percent change from baseline 1 (1.3%)
Serum creatinine 2 (2.5%)
  Raw absolute change from baseline 1 (1.3%)
  Binary progression to threshold serum creatinine 1 (1.3%)
Inverse serum creatinine 1 (1.3%)
  Regression slope (absolute changes) 1 (1.3%)
Change in renal function as outcome or exposure
 Outcome  74 (92.5%)
 Exposure (if exposure, outcome listed below)  6 (7.5%)
  Referral to renal care   1 (1.3%)
  CV events   1 (1.3%)
  Multiple outcomes (CV, hospitalisation, death)   1 (1.3%)
  Advanced CKD (stage 4)   1 (1.3%)
  Bleeding events   1 (1.3%)
Duration of follow up for renal function changes
 Median (IQR), years 3.0 (1.6, 4.4)
 < 1 year 7 (8.8%)
 1–4.9 years 48 (60.0%)
 5–9.9 years 14 (17.5%)
 ≥ 10 years 1 (1.3%)
 Not stated 10 (12.5%)
Minimum number of renal function measures for inclusion
 0 1 (1.3%)
 1 7 (8.8%)
 2 24 (30.0%)
 3 15 (18.8%)
 4 5 (6.3%)
 5 1 (1.3%)
 6 4 (5.0%)
 Not stated 23 (28.8%)
Percentage of target population used in analysis
 <50% 17 (21.3%)
 50% - 75% 9 (11.3%)
 75% - 90% 5 (6.3%)
 90% - 95% 5 (6.3%)
 >95% 8 (10.0%)
 Not available 36 (45.0%)
Percentage of study population lost to follow up
 < 25% 2 (2.5%)
 25% - 50% 3 (3.8%)
 > 50% 1 (1.3%)
 Not available 62 (77.5%)
 Complete case analysis (only including records of people with follow-up data) 11 (13.8%)
Statistical tools usedb
 Descriptive results only 5 (6.3%)
 Simple statistical tests 9 (11.3%)
 Linear regression models 8 (10.0%)
 ANOVA/ANCOVA 2 (2.5%)
 Kaplan-Meier estimation / life table analysis 3 (3.8%)
 Generalised linear models (GLMs) 11 (13.8%)
 Cox proportional hazards regression 18 (22.5%)
 Competing risks survival models 3 (3.8%)
 Mixed modelling methods 12 (15.0%)
 Other latent variable methods 2 (2.5%)
 Generalised estimating equations (GEEs) 2 (2.5%)
 Joint longitudinal survival modelling 2 (2.5%)
 Structural equation modelling 1 (1.3%)
 Multiple imputation 5 (6.3%)
 Machine learning methods 3 (3.8%)
Statistical model usedb
Risk factor identification / causal inference N = 65
  Difference in means t-test 2 (3.1%)
  Mean difference paired t-test 4 (6.2%)
  Simple non-parametric tests (Mann-Whitney U) 1 (1.5%)
  Difference in proportions chi-squared test 2 (3.1%)
  ANOVA 1 (1.5%)
  ANCOVA 1 (1.5%)
  Linear regression  7 (10.8%)
  Logistic regression 10 (15.4%)
  Kaplan Meier estimation /life table analysis 3 (4.6%)
  Cox proportional hazards regression 16 (24.6%)
  Competing risk survival models 3 (4.6%)
  Linear mixed model 10 (15.4%)
  Generalised estimating equations (GEEs) 2 (3.1%)
  Joint longitudinal survival model 2 (3.1%)
  Structural equation modelling 1 (1.5%)
Risk prediction N = 7
  Kalman filter (time series model) 1 (14.3%)
  Naïve Bayes classifier 1 (14.3%)
  Logistic regression 4 (57.1%)
  Cox proportional hazards regression 1 (14.3%)
  Random forest regression 2 (28.6%)
  Linear mixed model 1 (14.3%)
Estimation of incidence/prevalence N = 3
  Crude estimation 3 (100%)
Identification of sub-populations N = 1
  Trajectory clustering using latent variables 1 (100%)
Audit of care provision N = 1
  Linear mixed model 1 (100%)

aMore specific details of measures of changes in renal function in individual studies assessing CKD progression and corresponding statistical analysis methods are shown in Table 4, including where time-to-event models were used in the presence of unequal follow up or censoring.

bMultiple items possible for a single study but focus only on main analysis of CKD progression.

eGFR was the most commonly used measure of renal function (94%). Measures of change in renal function and methods of derivation were highly variable. Regression of absolute changes in eGFR was most common (26% of studies), although methods varied with many using mixed models but others using individual linear regression. Calculation of absolute changes and percent changes in eGFR were also common (14% and 17% respectively), but duration of follow up varied substantially between studies. Other less common measures were rates of change calculated between measures, regression slopes on the percent scale, and binary measures for progression to thresholds of eGFR or CKD stages. 7 studies (9%) analysed rates of change in eGFR that were not clearly defined as either regression slopes or rates of change between measures. Other renal function measures studied were Cockcroft and Gault estimated creatinine clearance (3%), serum creatinine (3%) and inverse serum creatinine (1%).
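To make the distinction between these measures concrete, the sketch below derives four of them for a single patient: an individual ordinary least squares regression slope, the raw absolute and percent changes from baseline, and the rate of change between the first and last measures. The function name, dictionary keys and eGFR values are hypothetical. The sketch covers only the per-patient calculations and does not represent the mixed model approach, which pools information across patients.

```python
from typing import Dict, Sequence


def egfr_change_measures(
    times_years: Sequence[float], egfr: Sequence[float]
) -> Dict[str, float]:
    """Per-patient measures of change in eGFR (illustrative only)."""
    if len(times_years) != len(egfr) or len(egfr) < 2:
        raise ValueError("need at least two paired (time, eGFR) measurements")
    pairs = sorted(zip(times_years, egfr))  # order tests by time
    t = [float(p[0]) for p in pairs]
    y = [float(p[1]) for p in pairs]
    if t[-1] == t[0]:
        raise ValueError("measurements must span a non-zero time interval")

    # Ordinary least squares slope of eGFR on time (individual linear regression).
    n = len(t)
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    sxx = sum((ti - mean_t) ** 2 for ti in t)
    sxy = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y))

    abs_change = y[-1] - y[0]  # last value minus baseline value
    return {
        "slope_per_year": sxy / sxx,
        "abs_change": abs_change,
        "pct_change": 100.0 * abs_change / y[0],
        "rate_first_to_last": abs_change / (t[-1] - t[0]),
    }


# Hypothetical patient with four unevenly spaced tests over three years.
measures = egfr_change_measures([0.0, 0.5, 1.5, 3.0], [60, 61, 52, 50])
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

With unevenly spaced and noisy tests, the regression slope and the first-to-last rate of change generally differ, which is one reason why results from studies using different measures are hard to compare.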

Most studies (93%) analysed changes in renal function as an outcome, with only 6 studying changes in renal function as an exposure. Typical (median) duration of follow up for renal function was 3 years, but ranged from 3 months to 14 years, and was not stated in 13% of studies. Duration of follow up also commonly varied substantially between patients within individual studies, mostly due to variation in data completeness with regard to availability and timing of serum creatinine test results on the health record. Inclusion criteria relating to availability of repeat eGFR measures varied and were commonly not stated (29%). The percentage of the target population analysed could not be calculated for 36 studies (45%) due to insufficient data (Fig 2A). The study population constituted less than 50% of patients in the target population for 17 studies (21%), and less than 75% of the target population in 26 studies (33%) (Fig 2B). Statistics on data completeness were rarely stated explicitly and were often difficult to ascertain. Rates of loss to follow up were even more difficult to ascertain, and many studies sampled patients on the basis of varying levels of completeness of follow up. In 11 studies (14%), quantifying the impact of loss to follow up was not possible due to sampling based on complete follow up, and in 62 studies (78%) no data was reported on losses to follow up. The supplementary listing of individual studies provides a more detailed breakdown of analysis criteria, percentage of target population analysed and rates of loss to follow up.

Fig 2. Risk of selection bias (A) and ascertainment bias (B) in individual studies.

Statistical methods for analysing CKD progression depended on whether the renal function measure was continuous (e.g. rate of change in eGFR) or binary (e.g. >30% change in eGFR from baseline at repeat measurement), which varied between studies. Most commonly used statistical methods were linear mixed models, linear regression, logistic regression, and Cox proportional hazards regression. Many studies used simple statistical tests, despite the inability of these methods to adjust for confounders commonly present in observational data. More sophisticated methods taking into account differential drop-outs due to death were rare. 2 studies used joint longitudinal survival models and 3 studies used competing risks survival models.
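As a minimal sketch of how a binary progression outcome of the kind mentioned above might be derived, the function below returns a (time, event) pair per patient, using the example threshold of a more than 30% decline in eGFR from baseline at a repeat measurement. The function name, the choice to censor at the last available test, and the example values are assumptions made for illustration and do not reproduce the definition used in any particular reviewed study.

```python
from typing import List, Tuple


def time_to_progression(
    times_years: List[float],
    egfr: List[float],
    pct_decline_threshold: float = 30.0,
) -> Tuple[float, bool]:
    """Return (follow-up time in years, progressed?) for one patient.

    Progression is the first repeat measurement at which eGFR has fallen by
    more than `pct_decline_threshold` percent relative to the baseline value.
    Patients who never meet the threshold are censored at their last test.
    """
    if len(times_years) != len(egfr) or len(egfr) < 2:
        raise ValueError("need a baseline and at least one repeat measurement")
    pairs = sorted(zip(times_years, egfr))  # order tests by time
    t0, baseline = pairs[0]
    for t, value in pairs[1:]:
        decline_pct = 100.0 * (baseline - value) / baseline
        if decline_pct > pct_decline_threshold:
            return (t - t0, True)
    return (pairs[-1][0] - t0, False)


# Hypothetical patients: the first crosses the threshold at year 2,
# the second never does and is censored at the last test.
progressor = time_to_progression([0, 1, 2, 3], [60, 50, 41, 38])
non_progressor = time_to_progression([0, 1, 2], [60, 58, 55])
print(progressor, non_progressor)
```

Pairs of this form are the usual input to Kaplan-Meier or Cox analyses. Censoring at the last test treats a patient who died or stopped being monitored as uninformatively censored, which is the assumption that joint longitudinal survival models and competing risks models are intended to relax.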

Handling of data quality issues and methodological challenges

Table 3 summarises how data quality issues and methodological challenges were dealt with in reviewed articles. EHR databases used for analysis rarely had good quality data on renal function, i.e. collected regularly over time and completely for all patients in the target population. A few studies attempted to improve sample completeness, for example by using imputation methods to avoid exclusions. Studies selected patients for analysis on the basis of varying levels of data completeness, relating to number of measures and duration of follow up, and many studies would have excluded patients from analysis completely on the basis of insufficient data over time. 64% of studies at least partially acknowledged this as introducing bias, 18% provided some data on sample completeness without acknowledging implications and 16% did not mention sample completeness or representativeness at all. Very few studies mentioned losses to follow up during the study period or potential reasons for loss to follow up and 61% of studies did not mention the issue of informative censoring at all. Only 6 studies (8%) tackled the issue methodologically, for example by accounting for the competing risk of death through joint longitudinal survival models and competing risks survival models.

Table 3. Critique of handling of data quality and methodological challenges (N = 80).

Handling of data quality and methodological challenges N (%)
Representativeness of sample to target population
 Not mentioned 13 (16.3%)
 Mentioned care pathway and inclusion criteria, but not sample completeness 2 (2.5%)
 Mentioned sample completeness, but not implications 14 (17.5%)
 Partially acknowledged implications of sample completeness 37 (46.3%)
 Fully acknowledged implications of sample completeness 10 (12.5%)
 Tackled methodologically 4 (5.0%)
Methods of handling^a
 None 68 (85.0%)
 Detailed/comprehensive database of EHRs used 5 (6.3%)
 Multiple imputation (to avoid exclusions) 4 (5.0%)
 Other imputation methods (to avoid exclusions) 3 (3.8%)
Handling of informative drop-outs/censoring
 Not mentioned 49 (61.3%)
 Mentioned care pathway follow up, but not losses to follow up (inc. death) 2 (2.5%)
 Mentioned losses to follow up, but not implications 7 (8.8%)
 Partially acknowledged implications of losses to follow up 13 (16.3%)
 Fully acknowledged implications of losses to follow up 3 (3.8%)
 Tackled methodologically 6 (7.5%)
Methods of handling^a
 None 71 (88.8%)
 Complete follow up 1 (1.3%)
 Joint modelling of longitudinal changes and time to drop out (including death) 2 (2.5%)
 Sensitivity analysis in drop-outs 1 (1.3%)
 Competing risks survival models 4 (5.0%)
 Sensitivity analysis adjusting for competing risks 1 (1.3%)
Handling of missing longitudinal data
 Not mentioned 47 (58.8%)
 Mentioned care pathway follow up, but not data completeness 4 (5.0%)
 Mentioned data completeness, but not implications 7 (8.8%)
 Partially acknowledged implications of data completeness 13 (16.3%)
 Fully acknowledged implications of data completeness 1 (1.3%)
 Tackled methodologically 8 (10.0%)
Methods of handling^a
 None 62 (77.5%)
 LOCF 1 (1.3%)
 Imputation with mean/median 2 (2.5%)
 Mixed modelling 13 (16.3%)
 Generalised estimating equations 1 (1.3%)
 Multiple imputation 1 (1.3%)
Handling of missing covariate data
 Not relevant (no covariate analysis) 16 (20.0%)
 Not mentioned (despite covariate analysis) 32 (40.0%)
 Mentioned data completeness, but not implications 2 (2.5%)
 Partially acknowledged implications of data completeness 17 (21.3%)
 Fully acknowledged implications of data completeness 3 (3.8%)
 Tackled methodologically 7 (8.8%)
Methods of handling^a
 None 64 (80.0%)
 LOCF 2 (2.5%)
 Imputation with mean 4 (5.0%)
 Multiple imputation 5 (6.3%)
 Complete data was available for all covariates 2 (2.5%)
 Data linkage to improve data completeness 1 (1.3%)
 Adjustment for missingness 2 (2.5%)
Distributional checks/issues
 Not mentioned 70 (87.5%)
 Mentioned or partially addressed 5 (6.3%)
 Fully Acknowledged 0
 Tackled 5 (6.3%)
Methods of handling^a
 None 75 (93.8%)
 Distributional checks 4 (5.0%)
 Consideration of alternative error distributions 1 (1.3%)
Handling of within-patient correlation / variability in kidney function over time
 Not mentioned 20 (25.0%)
 Mentioned or partially addressed 24 (30.0%)
 Fully Acknowledged 4 (5.0%)
 Tackled 32 (40.0%)
Methods of handling^a
 None 35 (43.8%)
 Random effects / latent variables 17 (21.3%)
 Generalised estimating equations 2 (2.5%)
 Modelling of stochastic process 1 (1.3%)
 Outcome likely to identify real change 22 (27.5%)
 Measures capturing AKI explicitly excluded 1 (1.3%)
 Paired t-test 3 (3.8%)
Handling of population heterogeneity
 Not mentioned 1 (1.3%)
 Mentioned or partially addressed 36 (45.0%)
 Fully Acknowledged 3 (3.8%)
 Tackled 40 (50.0%)
Methods of handling^a
 None 8 (10.0%)
 Adjustment for covariates 21 (26.3%)
 Interaction terms 9 (11.3%)
 Stratified or separate/subgroup analysis 34 (42.5%)
 Latent classes 1 (1.3%)
 Random effects 3 (3.8%)
 ANOVA/ANCOVA 2 (2.5%)
 Propensity score methods 1 (1.3%)
 Features in machine learning classification 1 (1.3%)
Handling of confounding (risk factor / causal inference analyses only) N = 65
 Not mentioned 7 (10.8%)
 Mentioned or partially addressed 17 (26.2%)
 Fully Acknowledged 3 (4.6%)
 Tackled 38 (58.5%)
Methods of handling^a
 None 12 (18.5%)
 Adjustment for baseline confounders 46 (70.8%)
 Propensity score methods 6 (9.2%)

^a Methods/approaches for handling issues are listed, regardless of whether the corresponding issues were fully tackled in analysis.

Most studies (59%) did not mention (or tackle) the issue of missing longitudinal data on renal function tests over time. However, about one in six studies (16%) used mixed modelling methods, which may partially deal with the issue. 4 studies (5%) attempted to deal with missing longitudinal data through imputation methods. 40% of studies failed to mention missing covariate data despite performing covariate analysis, while 20% did not perform covariate analysis. 25% at least partially acknowledged the issue and 16 studies (20%) made some attempt to handle missing covariate data through imputation methods, data linkage or other adjustment for missingness.

Distributional checks for renal function measures were rare, with only 5 studies (6%) mentioning distributional checks or considering alternative error distributions. Regarding the issue of variability in renal function over time and within-patient correlation, 25% did not mention (or tackle) such issues at all, 40% tackled the issue methodologically, 30% partially tackled or acknowledged the issue and a further 5% fully acknowledged such issues. 21% of studies used patient random effects to account for within-patient correlation, and 28% used outcomes which are likely to identify an important and real change.

Most studies acknowledged some aspects of population heterogeneity in analyses. At the most basic level, covariate adjusted analyses were used to account for baseline differences between patients (26%). Other methods included stratification or subgroup analyses to study distinct populations (43%), interaction terms allowing differing trajectories of renal function according to patient characteristics (11%) and random effects (4%). For studies performing causal analyses, 59% tackled the issue of confounding, mostly through baseline adjustment. A subset (11%) did not mention (or tackle) confounding at all, with some studies performing simple statistical tests such as t-tests and chi-squared tests despite the potential for confounding by indication.

Definitions of CKD progression

Table 4 lists the CKD progression measures used in individual studies, grouped by method of derivation. A listing is provided rather than an aggregate summary because of the substantial variation in the way researchers defined CKD progression across the literature. Terms used included progression, rapid progression, fast progression, rapid decline, progressive decline, progressive renal impairment, renal function deterioration and worsening renal function, while some studies provided no label and simply stated the outcome, for example as a threshold percent change in renal function. There was no consistency between studies in the way these terms were applied to different outcomes.

Table 4. Listing of CKD progression measures in reviewed articles (52 of 80 articles).

Methods Rule^a Term Author [ref]^b Year Avg follow up Sample size Other methods^a
Individual linear regression eGFR slope decline: > 3 ml/min/1.73m2/year Progressors Chase HS et al. [11] 2014 6 years 481 Naïve Bayes classifier; logistic regression
eGFR slope decline: > median (8.1) ml/min/1.73m2/year Relatively rapid eGFR decline Wang Y et al. [12] 2019 2 years 128 Logistic regression
eGFR slope decline: > mean (1.5) ml/min/1.73m2/year Faster decline Abdelhafiz AH et al. [13] 2012 14 years 100 Logistic regression
Linear mixed model eGFR slope decline: > 5 ml/min/1.73m2/year Rapid progression Eriksen BO et al. [14] 2006 3.7 years 3,047 Slope interactions
eGFR slope decline: > 4 ml/min/1.73m2/year Rapid progression Jalal K et al. [15] 2019 ≥ 3 years 10,927 N/A
eGFR slope decline: > 3 ml/min/1.73m2/year eGFR slope decline Cabrera CS et al. [16] 2020 4.3 years 30,222 Cox PH regression
eGFR slope decline: > 0 ml/min/1.73m2/year Progressors (vs non-progressors) Eriksen et al. [17] 2010 4 years 1,224 2-level model
eGFR slope decline: > 0 ml/min/1.73m2/year eGFR decline Annor FB et al. [18] 2015 4 years 575 Structural equation modelling
eGFR predicted percent rate of decline: > 5% per year Progression Diggle PJ et al. [19] 2015 4.5 years 22,910 Piecewise linear mixed model
Absolute change between measures eGFR drop at any time: > 10 ml/min/1.73m2 Progression Butt AA et al. [20] 2018 3 months 17,624 Difference in proportions chi-squared test
Percent change between measures eGFR percent drop: >10%; >20% Progression Singh A et al. [21] 2015 1 year 6,435 Logistic regression
eGFR percent drop: >15% Progressive renal impairment Evans RDR et al. [22] 2018 5 years 24 Descriptive results only
eGFR percent drop: >20% Transient or persistent renal function decline Jackevicius CA et al. [23] 2021 Approx. 1.4 years 49,458 Cox PH regression
eGFR percent drop: >25% Progression Lai YJ et al. [24] 2019 1 year 1,620 Cox PH regression
eGFR percent drop: >25% (AND increase in CKD stage) Progression Vejakama P et al. [25] 2015 4.5 years 32,106 Competing risks survival models
eGFR percent drop: >30% “30% decline in eGFR” Posch F et al. [26] 2019 1.4 years 14,432 Cox PH regression
eGFR percent drop: >30% Renal function decline Hsu TW et al. [27] 2019 5 years 5,046 Cox PH regression
eGFR percent drop: >30% Rapid eGFR decline Inaguma D et al. [28] 2020 2 years 9,911 Logistic regression; Random forest regression
eGFR percent drop: >30% eGFR decline Peng YL et al. [29] 2020 1.5 years 1,050 Cox PH regression
eGFR percent drop: >30% (no label) Yao X et al. [30] 2017 11 months 9,796 Cox PH regression
eGFR percent drop: >30% “Loss of eGFR >30%” Lamacchia O et al. [31] 2018 4 years 582 Logistic regression
eGFR percent drop: >30% eGFR loss Viazzi F et al. [32] 2018 4 years 535 Logistic regression
eGFR percent drop: >30% Clinically important decline Rej S et al. [33] 2020 3.1 years 6,226 Cox PH regression
eGFR percent drop: >30%; 30–50%; and 50% Progression Yoo H et al. [34] 2019 5.7 years 478 Kaplan-Meier with log-rank test
eGFR percent drop: >40% (or RRT initiation) RRT40 Tangri N et al. [35] 2021 3.9 years 32,007 Cox PH regression
eGFR percent drop: >50% Renal survival endpoint Lv L et al. [36] 2017 3.1 years 208 Cox PH regression
Serum creatinine percent increase: >50% Worsening renal function Li XM et al. [37] 2016 1.8 years 44 Descriptive results only
Estimated creatinine clearance percent drop: >0% Decline in creatinine clearance Gallant JE et al. [38] 2005 1 year 658 Descriptive results only
Rate of change between measures eGFR drop per time elapsed (assumed): > 2.5 ml/min/1.73m2/year Progressive GFR decline Herget-Rosenthal S et al. [39] 2013 3 years 803 Logistic regression
eGFR drop per time elapsed: > 3 ml/min/1.73m2/year Rapid progression Morales-Alvarez MC et al. [40] 2019 Not stated 594 Descriptive comparisons
eGFR drop per time elapsed: > 5 ml/min/1.73m2/year eGFR decline Nderitu P et al. [41] 2014 9 months 4,145 Logistic regression
eGFR drop per time elapsed: > 5 ml/min/1.73m2/year Fast progression Koraishy FM et al. [42] 2017 Not stated 2,170 Logistic regression
eGFR drop per time elapsed (assumed): > 5 ml/min/1.73m2/year Progressive CKD Johnson F et al. [43] 2015 Not stated 200 Difference in proportions chi-squared test
eGFR drop per time elapsed: > 5 ml/min/1.73m2/year Rapid decline Chakera A et al. [44] 2015 7 years 147 Logistic regression
eGFR percent drop per time elapsed (assumed): >5% per year Rapid kidney function decline Chen H et al. [45] 2014 3 years 365 Logistic regression
Change in CKD stage, based on measures Population: incident CKD stage 3 (2 x eGFR < 60 over > 3 months); Outcome: 2 x eGFR <30 over >3 months CKD progression from stage 3 to 4 Perotte A et al. [46] 2015 Not stated 2,908 Cox proportional hazards regression
Increase in CKD stage: By one or more stages Worsening in CKD stage Cummings DM et al. [47] 2011 7.6 years 791 Logistic regression
Increase in CKD stage: By one or more stages (eGFR values or diagnostic codes) Declining kidney function Horne L et al. [48] 2019 Not stated 195,178 Crude estimation of incidence rate
Increase in CKD stage: By one or more stages (eGFR values or coded RRT) CKD stage worsening Robinson DE et al. [49] 2021 Approx. 3.7 years 19,324 Competing risks survival models
Increase in CKD stage: By one stage Progression of kidney dysfunction to next CKD stage Nicolos GA et al. [50] 2020 5 years Approx 37,000 Life-table analysis
Increase in CKD stage / risk category: To very high risk category (eGFR <30 and proteinuria (-); eGFR <45 and proteinuria (±); eGFR < 60 and proteinuria (+)) Diabetic kidney disease progression Yanagawa T et al. [51] 2021 6.2 years 681 Cox PH regression
Change in CKD stage: From and to any stage, summarised by initial and final stage Transition between CKD stages Vesga JI et al. [52] 2021 6-month intervals 1,783 Crude estimation
Binary progression to threshold value Threshold eGFR: median eGFR < 30, for at least 3 consecutive months Nephrotoxicity Oetjens M et al. [53] 2014 8.8 years 115 Cox PH regression
Threshold eGFR: 2 x eGFR<30 over ≥90 days with no intermediate eGFR>30 Advanced CKD Neuen BL et al. [54] 2021 2.9 years 91,319 Cox PH regression
Threshold eGFR: 2 x eGFR<30 over ≥90 days with no intermediate eGFR>30 (or a stage 4–5 code) Incident CKD stages 4–5 Weldegiorgis M et al. [55] 2019 7.5 years 1,397,573 Cox PH regression
Threshold eGFR: < 45 ml/min/1.73m2 Progression to CKD stage 3b Niu SF et al. [56] 2021 3.0 years 3,114 Cox PH regression
Threshold eGFR: < 15 ml/min/1.73m2 Renal survival endpoint O’Riordan A et al. [57] 2009 3.2 years 54 Kaplan-Meier estimation; log-rank test
Threshold eGFR: ESRD (eGFR<15 or dialysis) Progression to ESRD Tsai CW et al. [58] 2017 4.2 years 739 Cox PH regression
Binary progression (changes/threshold combination) eGFR percent drop: >50% AND Threshold eGFR: 2 x eGFR <30 Renal event Leither MD et al. [59] 2019 5.3 years 196,209 Cox PH regression
eGFR percent drop: >50% OR Threshold eGFR: ESRD “ESRD or an irreversible reduction in eGFR” Liu D et al. [60] 2019 3.7 years 455 Cox PH regression
eGFR percent drop: >50% OR Threshold eGFR: ESRD CKD progression Rincon-Choles H et al. [61] 2017 2.8 years 1,676 Competing risks survival models
Latent class non-linear mixed models Prediction of latent eGFR trajectory class, 6 categories Trajectory category* VanWagner LB et al. [62] 2018 1 year 671 Logistic regression, conditional on class

^a In time-to-event analyses (e.g. Cox PH regression, competing risks survival models), the rule for progression can be met at any time during data collection, utilising repeated test results over time. In binary analyses (e.g. logistic regression), the rule is applied once per patient, likely at a specific time which may vary between studies.

^b For consistency, article reference numbers [ref] also match those provided in the supplementary S3 File listing of reviewed studies.
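As a hedged illustration of how much the choice of definition matters, the sketch below applies several of the rule types listed in Table 4 to a single hypothetical eGFR series. The patient data and function names are invented for this example. The percent drop is taken from first to last measurement and the stage rule uses the standard GFR categories (with 3a and 3b treated as separate stages); both are simplifying assumptions, since individual studies applied their rules at differing times and in differing ways.

```python
# Hypothetical eGFR series for one patient (ml/min/1.73m2), times in years.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
egfr = [60.0, 58.0, 56.0, 53.0, 51.0]

def ols_slope(t, y):
    """Least-squares slope of y on t (an individual linear regression)."""
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    return sxy / sxx

def gfr_category(value):
    """Ordinal GFR category: 0=G1, 1=G2, 2=G3a, 3=G3b, 4=G4, 5=G5."""
    return sum(value < cut for cut in (90, 60, 45, 30, 15))

def classify(t, y):
    """Apply several progression rules to the same series."""
    decline_per_year = -ols_slope(t, y)
    pct_drop = 100.0 * (y[0] - y[-1]) / y[0]
    return {
        "slope decline > 3 /year": decline_per_year > 3,
        "slope decline > 4 /year": decline_per_year > 4,
        "slope decline > 5 /year": decline_per_year > 5,
        "percent drop > 10%": pct_drop > 10,
        "percent drop > 30%": pct_drop > 30,
        "increase in CKD stage": gfr_category(y[-1]) > gfr_category(y[0]),
    }

results = classify(times, egfr)
for rule, met in results.items():
    print(f"{rule}: {'progressor' if met else 'non-progressor'}")
```

For this series the fitted decline is 4.6 ml/min/1.73m2/year and the drop from baseline is 15%, so the same patient counts as a progressor under the slope thresholds of 3 and 4, the 10% drop and the stage rule, but not under a slope threshold of 5 or a 30% drop.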

Discussion

We performed a systematic review of peer-reviewed literature studying progression of CKD using routinely collected EHR data. Handling of data quality issues was generally poor, with unclear reporting of analysis criteria and data completeness, and little discussion of the implications of missing data for the reliability of conclusions. Where studies reported sufficient data to judge, representativeness of samples to target populations was likely to be poor, with large numbers of patients excluded from analysis on the basis of poor data completeness at baseline and during follow-up, thereby likely introducing selection bias. Methods capable of handling missing longitudinal data and informative losses to follow up, such as joint longitudinal survival models, were used in only a minority of studies, and many studies are likely to have overstated the reliability of their findings and their applicability to populations of interest. Measures of change in renal function and definitions of progression varied substantially between studies, revealing a lack of consensus on clinically important and statistically robust measures in the study of CKD progression.

Unlike prospective cohort studies and clinical trials, which prospectively identify patients for research and make efforts to follow up patients regularly and completely over time, retrospective analysis of routine healthcare data relies on data collected for the purposes of clinical care. While monitoring guidelines may be in place in healthcare systems that aim to ensure regular follow up of patients at risk of CKD progression, such guidelines may be followed at the discretion of healthcare providers, and frequency of testing and time between tests are likely to be influenced by patient risk. If patients are sampled for analysis on the basis of threshold levels of data completeness over time, there is a risk of disproportionately including patients who are followed up more regularly as a result of their evolving risk profile (selection bias) and who remain both alive and free of RRT long enough to meet the follow up criteria (survival bias). In addition, if data is collected in a single care setting but patients are managed in different care settings based on their risk, data may be informatively missing where patients move between care settings (ascertainment bias). It is highly likely that studies using EHRs that exclude patients from analysis due to poor data completeness, or that fail to follow up patients equally among different risk groups, will have unreliable results, and results may reflect an unknown subgroup of the target population. The use of such studies to inform clinical decision-making may therefore fail to benefit the community as hoped.
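The selection mechanism described above can be made concrete with a small, deliberately artificial sketch. It assumes a hypothetical cohort in which patients with faster decline are tested more often, applies a data completeness criterion (a minimum number of tests, chosen arbitrarily here), and reports the share of the target population excluded alongside the mean decline in each group. All values and names are invented. In real data the true slope of excluded patients is not observable, so one would compare observed baseline characteristics of included and excluded patients instead.

```python
# Hypothetical target population: number of eGFR tests on record and the
# (normally unobservable) true annual change in eGFR, ml/min/1.73m2/year.
cohort = [
    {"id": 1, "n_tests": 1, "slope": -0.5},
    {"id": 2, "n_tests": 2, "slope": -1.0},
    {"id": 3, "n_tests": 2, "slope": -0.8},
    {"id": 4, "n_tests": 3, "slope": -2.5},
    {"id": 5, "n_tests": 5, "slope": -4.0},
    {"id": 6, "n_tests": 6, "slope": -5.5},
    {"id": 7, "n_tests": 1, "slope": -1.2},
    {"id": 8, "n_tests": 4, "slope": -3.0},
]

MIN_TESTS = 3  # assumed completeness criterion for entering the analysis

def mean_slope(patients):
    """Average true annual eGFR change in a group of patients."""
    return sum(p["slope"] for p in patients) / len(patients)

included = [p for p in cohort if p["n_tests"] >= MIN_TESTS]
excluded = [p for p in cohort if p["n_tests"] < MIN_TESTS]

pct_excluded = 100.0 * len(excluded) / len(cohort)
mean_all = mean_slope(cohort)
mean_included = mean_slope(included)
mean_excluded = mean_slope(excluded)

print(f"Excluded for insufficient data: {pct_excluded:.0f}% of target population")
print(f"Mean slope, target population:  {mean_all:.2f}")
print(f"Mean slope, analysed sample:    {mean_included:.2f}")
print(f"Mean slope, excluded patients:  {mean_excluded:.2f}")
```

Because inclusion depends on testing frequency, and testing frequency depends on risk in this toy cohort, the analysed sample shows a steeper average decline than the target population it is meant to represent.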

There are a number of methodological challenges in longitudinal analysis of renal function that are not necessarily specific to EHRs but that are important considerations for researchers, discussed in more detail in [10, 63] and introduced earlier. In the absence of acute kidney injury, mixed effects models with patient random effects may improve estimation of changes over time compared to individual linear regressions which may lead to more extreme slope estimations. Such models allow sharing of information between patients, assuming a common mean trajectory, and they allow patients to be included in analysis with variable levels of data completeness to avoid excluding patients from analysis unnecessarily. Other benefits are the ability to perform the entire analysis (comparing exposures and outcomes) in a single model, without the loss of information and under-estimation of standard errors that may result from a 2-step model that estimates individual changes prior to further modelling. CKD is a heterogeneous disease, with various possible contributing causes and pathways of progression. Linear mixed models typically assume a common mean trajectory but other methods are available if this assumption is too strong. While random slope models allow individual trajectories to vary around a common mean slope, more sophisticated models such as latent class mixed models allow modelling of trajectory groups which may be linear or non-linear and correspond to sub-populations of patients. Another challenge is competing risk of mortality and how to handle the initiation of RRT in the analyses of repeated renal function tests, where such events are likely to be associated with rate of decline. An analysis that does not account for informative censoring may lead to biased results. Joint longitudinal survival models and competing risks survival models can be used to account for competing risks if data is available (this may require data linkage to external databases to obtain information on competing event dates).
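The point about individual regressions producing more extreme slopes can be sketched numerically. The example below is not a fitted linear mixed model: it is a simplified two-stage, empirical Bayes style approximation, used here only to show the shrinkage that random effects induce. It fits a separate line per patient, then pulls each slope towards the unweighted cohort mean in proportion to how imprecisely it was estimated. The patient data, variable names and the method-of-moments variance estimate are all assumptions of this sketch. As noted above, a two-step approach of this kind is not recommended for inference, and a single mixed model fitted with dedicated statistical software would be preferred in practice.

```python
def fit_line(t, y):
    """Return the OLS slope, residual sum of squares and Sxx for one patient."""
    n = len(t)
    t_bar, y_bar = sum(t) / n, sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    slope = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y)) / sxx
    intercept = y_bar - slope * t_bar
    rss = sum((yi - intercept - slope * ti) ** 2 for ti, yi in zip(t, y))
    return slope, rss, sxx

# Hypothetical patients: (times in years, eGFR values). Patient C has only
# three tests over one year, so the individual slope is poorly determined.
patients = {
    "A": ([0, 1, 2, 3, 4], [60, 58, 57, 54, 53]),
    "B": ([0, 1, 2, 3, 4], [50, 49, 49, 48, 47]),
    "C": ([0, 0.5, 1.0], [55, 50, 48]),
    "D": ([0, 1, 2, 3], [70, 66, 61, 58]),
}

fits = {pid: fit_line(t, y) for pid, (t, y) in patients.items()}
raw = {pid: f[0] for pid, f in fits.items()}

# Pooled residual variance across patients (n - 2 degrees of freedom each).
df = sum(len(t) - 2 for t, _ in patients.values())
sigma2 = sum(f[1] for f in fits.values()) / df

# Sampling variance of each individual slope: sigma^2 / Sxx.
samp_var = {pid: sigma2 / f[2] for pid, f in fits.items()}

# Method-of-moments estimate of the between-patient slope variance.
k = len(raw)
mean_slope = sum(raw.values()) / k
between = sum((b - mean_slope) ** 2 for b in raw.values()) / (k - 1)
tau2 = max(0.0, between - sum(samp_var.values()) / k)

# Shrinkage weight: near 1 for well-estimated slopes, lower for noisy ones.
weight = {pid: tau2 / (tau2 + samp_var[pid]) for pid in raw}
shrunk = {pid: mean_slope + weight[pid] * (raw[pid] - mean_slope) for pid in raw}

for pid in patients:
    print(f"{pid}: raw slope {raw[pid]:.2f}, weight {weight[pid]:.3f}, "
          f"shrunken slope {shrunk[pid]:.2f}")
```

The patient with the fewest tests and the shortest follow-up receives the smallest weight, so that slope moves furthest towards the cohort mean. Patients with sparse data still contribute to the analysis rather than being excluded.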

A major finding of this review was the extreme variation in definitions of CKD progression used, and the clinical importance of each definition was unclear. More work has been done in the last decade to identify clinically important measures of progression of CKD. In 2012, the United States Food and Drug Administration (FDA) commissioned research to identify new endpoints of CKD progression for use in clinical trials [64, 65]. Definitions were developed using data from the Chronic Kidney Disease Prognosis Consortium (CKD-PC) that showed strong association with important clinical outcomes of progression to ESRD and all-cause mortality, including thresholds of reduction in eGFR between measures of 30% and 40% over approximately 2 years, stratified by baseline eGFR. Further research that aims to define new outcomes based on smaller clinically meaningful changes in renal function would be valuable, as this may enable earlier identification of CKD progression in clinical practice, and future EHR studies could adopt such outcomes for research.

Strengths of this review include the large number of databases utilised and studies reviewed, and the detailed data extraction, which allowed a comprehensive evaluation of how well data quality issues were handled and acknowledged. Limitations include the restriction to peer-reviewed articles written in English that made clear in their abstract that repeated renal function tests were used in analysis, the exclusion of grey literature, and difficulty ascertaining whether EHRs were used as opposed to extraction from paper records. Despite this, most of the data issues will be the same regardless of whether electronic or paper records were used: retrospective studies using traditional paper records suffer from the same problems as those using electronic health records, namely incomplete records, variation in logging practices, the need to address AKI when modelling CKD progression, loss to follow-up and competing risks.

Conclusions

Many studies using EHRs to study progression of CKD do not fully acknowledge the biases that result from the poor data quality inherent in EHRs, and reporting was poor. While some studies have defined CKD progression measures similar to those validated by the FDA in 2012 [64, 65], showing an understanding of the need to identify clinically important changes in renal function, recommendations following the systematic review by Boucquemont et al. in 2014 [10] have not been implemented on a broader scale. Observational studies using EHRs should follow the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) [66, 67] and REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) [68] guidelines, which aim to improve transparency and clarity in reporting of research. Research publications should clearly state the care pathway and intended follow up framework, data completeness eligibility criteria, the percentage of the target population excluded based on those criteria, whether there were differences in characteristics of those included vs. excluded and according to important risk factors, as well as rates of loss to follow up. Where possible, researchers should attempt to ascertain reasons for loss to follow up, which may involve linkage to external data. Researchers should consider using existing validated outcomes of CKD progression, and we hope that heterogeneity in definitions of CKD progression will reduce over time. Focussing research questions on populations for which regular data collection is performed as part of routine care may offer a route to better quality data on changes in renal function over time. Important changes in renal function will also be easier to identify accurately in patients with reduced renal function at baseline, such as those with established CKD, where GFR-estimating equations perform better.

Supporting information

S1 File. PRISMA checklist.

(DOC)

S2 File. MEDLINE database search strategy.

(DOCX)

S3 File. List of reviewed studies.

(DOCX)

S1 Table. Summary of study populations, where unclear if EHRs used.

(DOCX)

S2 Table. Study methodology, where unclear if EHRs used.

(DOCX)

S3 Table. Critique of handling of data quality and methodological challenges, where unclear if EHRs used.

(DOCX)

S4 Table. Listing of key features of all included studies, sorted by year of publication.

(DOCX)

S1 Data. Data extraction spreadsheet.

(XLSX)

Acknowledgments

Only the listed authors contributed to the work reported in this manuscript.

Data Availability

This is a systematic review of previously published research, available in the public domain. All relevant data extracted from reviewed articles are captured in the manuscript and its supporting Information files.

Funding Statement

This work was supported by the Medical Research Council (MR/N013638/1), grant awarded to FC. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Eckardt KU, Coresh J, Devuyst O, Johnson RJ, Kottgen A, Levey AS, et al. Evolving importance of kidney disease: from subspecialty to global health burden. Lancet. 2013;382(9887):158–69. doi: 10.1016/S0140-6736(13)60439-0 [DOI] [PubMed] [Google Scholar]
  • 2.Hill NR, Fatoba ST, Oke JL, Hirst JA, O’Callaghan CA, Lasserson DS, et al. Global Prevalence of Chronic Kidney Disease—A Systematic Review and Meta-Analysis. PLoS One. 2016;11(7):e0158765. doi: 10.1371/journal.pone.0158765 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Go AS, Chertow GM, Fan D, McCulloch CE, Hsu CY. Chronic kidney disease and the risks of death, cardiovascular events, and hospitalization. N Engl J Med. 2004;351(13):1296–305. doi: 10.1056/NEJMoa041031 [DOI] [PubMed] [Google Scholar]
  • 4.Li L, Astor BC, Lewis J, Hu B, Appel LJ, Lipkowitz MS, et al. Longitudinal progression trajectory of GFR among patients with CKD. Am J Kidney Dis. 2012;59(4):504–12. doi: 10.1053/j.ajkd.2011.12.009 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Caravaca-Fontán F, Azevedo L, Luna E, Caravaca F. Patterns of progression of chronic kidney disease at later stages. Clin Kidney J. 2018;11(2):246–53. doi: 10.1093/ckj/sfx083 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.UK Renal Registry. 20th Annual Report of the Renal Association. NEPHRON 2018; 139 (suppl1). Available: renal.org/audit-research/annual-report
  • 7.UK Renal Registry. UK Renal Registry 22nd Annual Report–data to 31/12/2018, Bristol, UK. Available: renal.org/audit-research/annual-report
  • 8.Levey AS, Stevens LA, Schmid CH, Zhang YL, Castro AF 3rd, Feldman HI, et al. A new equation to estimate glomerular filtration rate. Ann Intern Med. 2009;150(9):604–12. doi: 10.7326/0003-4819-150-9-200905050-00006 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.The Renal Association. About eGFR. [cited 8 January 2021] Available: https://renal.org/health-professionals/information-resources/uk-eckd-guide/about-egfr
  • 10.Boucquemont J, Heinze G, Jager KJ, Oberbauer R, Leffondre K. Regression methods for investigating risk factors of chronic kidney disease outcomes: the state of the art. BMC Nephrology. 2014;15(1):45. doi: 10.1186/1471-2369-15-45 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Chase HS, Hirsch JS, Mohan S, Rao MK, Radhakrishnan J. Presence of early CKD-related metabolic complications predict progression of stage 3 CKD: a case-controlled study. BMC Nephrology. 2014;15:187. doi: 10.1186/1471-2369-15-187 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Wang Y, Zhao L, Zhang J, Wu Y, Zhang R, Li H, et al. Implications of a Family History of Diabetes and Rapid eGFR Decline in Patients With Type 2 Diabetes and Biopsy-Proven Diabetic Kidney Disease. Frontiers in Endocrinology. 2019;10 (no pagination). doi: 10.3389/fendo.2019.00855 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Abdelhafiz AH, Tan E, Levett C, Minchin J, Nahas ME. Natural history and predictors of faster glomerular filtration rate decline in a referred population of older patients with type 2 diabetes mellitus. Hospital practice (1995) Hospital practice. 2012;40(4):49–55. [DOI] [PubMed] [Google Scholar]
  • 14.Eriksen BO, Ingebretsen OC. The progression of chronic kidney disease: a 10-year population-based study of the effects of gender and age. Kidney International. 2006;69(2):375–82. doi: 10.1038/sj.ki.5000058 [DOI] [PubMed] [Google Scholar]
  • 15.Jalal K, Anand EJ, Venuto R, Eberle J, Arora P. Can billing codes accurately identify rapidly progressing stage 3 and stage 4 chronic kidney disease patients: a diagnostic test study. Bmc Nephrology. 2019;20. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Cabrera CS, Lee AS, Olsson M, Schnecke V, Westman K, Lind M, et al. Impact of CKD Progression on Cardiovascular Disease Risk in a Contemporary UK Cohort of Individuals With Diabetes. Kidney International Reports. 2020;5(10):1651–60. doi: 10.1016/j.ekir.2020.07.029 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Eriksen BO, Tomtum J, Ingebretsen OC. Predictors of declining glomerular filtration rate in a population-based chronic kidney disease cohort. Nephron. 2010;115(1):c41–50. doi: 10.1159/000286349 [DOI] [PubMed] [Google Scholar]
  • 18.Annor FB, Masyn KE, Okosun IS, Roblin DW, Goodman M. Psychosocial stress and changes in estimated glomerular filtration rate among adults with diabetes mellitus. Kidney Research and Clinical Practice. 2015;34(3):146–53. doi: 10.1016/j.krcp.2015.07.002 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Diggle PJ, Sousa I, Asar O. Real-time monitoring of progression towards renal failure in primary care patients. Biostatistics. 2015;16(3):522–36. doi: 10.1093/biostatistics/kxu053 [DOI] [PubMed] [Google Scholar]
  • 20.Butt AA, Ren Y, Puenpatom A, Arduino JM, Kumar R, Abou-Samra AB. Effectiveness, treatment completion and safety of sofosbuvir/ledipasvir and paritaprevir/ritonavir/ombitasvir + dasabuvir in patients with chronic kidney disease: an ERCHIVES study. Alimentary Pharmacology & Therapeutics. 2018;48(1):35–43. doi: 10.1111/apt.14799 [DOI] [PubMed] [Google Scholar]
  • 21. Singh A, Nadkarni G, Gottesman O, Ellis SB, Bottinger EP, Guttag JV. Incorporating temporal EHR data in predictive models for risk stratification of renal function deterioration. Journal of Biomedical Informatics. 2015;53:220–8. doi: 10.1016/j.jbi.2014.11.005
  • 22. Evans RDR, Cargill T, Goodchild G, Oliveira B, Rodriguez-Justo M, Pepper R, et al. Clinical Manifestations and Long-term Outcomes of IgG4-Related Kidney and Retroperitoneal Involvement in a United Kingdom IgG4-Related Disease Cohort. Kidney International Reports. 2019;4(1):48–58. doi: 10.1016/j.ekir.2018.08.011
  • 23. Jackevicius CA, Lu LY, Ghaznavi Z, Warner AL. Bleeding Risk of Direct Oral Anticoagulants in Patients With Heart Failure And Atrial Fibrillation. Circulation: Cardiovascular Quality and Outcomes. 2021;14(2):155–68. doi: 10.1161/CIRCOUTCOMES.120.007230
  • 24. Lai YJ, Lin YC, Peng CC, Chen KC, Chuang MT, Wu MS, et al. Effect of weight loss on the estimated glomerular filtration rates of obese patients at risk of chronic kidney disease: the RIGOR-TMU study. Journal of Cachexia, Sarcopenia and Muscle. 2019;10(4):756–66. doi: 10.1002/jcsm.12423
  • 25. Vejakama P, Ingsathit A, Attia J, Thakkinstian A. Epidemiological Study of Chronic Kidney Disease Progression: A Large-Scale Population-Based Cohort Study. Medicine. 2015;94(4). doi: 10.1097/MD.0000000000000475
  • 26. Posch F, Ay C, Stoger H, Kreutz R, Beyer-Westendorf J. Exposure to vitamin K antagonists and kidney function decline in patients with atrial fibrillation and chronic kidney disease. Research and Practice in Thrombosis and Haemostasis. 2019;3(2):207–16. doi: 10.1002/rth2.12189
  • 27. Hsu TW, Hsu CN, Wang SW, Huang CC, Li LC. Comparison of the effects of denosumab and alendronate on cardiovascular and renal outcomes in osteoporotic patients. Journal of Clinical Medicine. 2019;8(7). doi: 10.3390/jcm8070932
  • 28. Inaguma D, Kitagawa A, Yanagiya R, Koseki A, Iwamori T, Kudo M, et al. Increasing tendency of urine protein is a risk factor for rapid eGFR decline in patients with CKD: A machine learning-based prediction model by using a big database. PLoS ONE. 2020;15(9):e0239262. doi: 10.1371/journal.pone.0239262
  • 29. Peng YL, Tain YL, Lee CT, Yang YH, Huang YB, Wen YH, et al. Comparison of uric acid reduction and renal outcomes of febuxostat vs allopurinol in patients with chronic kidney disease. Scientific Reports. 2020;10(1):10734. doi: 10.1038/s41598-020-67026-1
  • 30. Yao X, Tangri N, Gersh BJ, Sangaralingham LR, Shah ND, Nath KA, et al. Renal Outcomes in Anticoagulated Patients With Atrial Fibrillation. Journal of the American College of Cardiology. 2017;70(21):2621–32. doi: 10.1016/j.jacc.2017.09.1087
  • 31. Lamacchia O, Viazzi F, Fioretto P, Mirijello A, Giorda C, Ceriello A, et al. Normoalbuminuric kidney impairment in patients with T1DM: Insights from annals initiative. Diabetology and Metabolic Syndrome. 2018;10(1). doi: 10.1186/s13098-018-0361-2
  • 32. Viazzi F, Greco E, Ceriello A, Fioretto P, Giorda C, Guida P, et al. Apparent treatment resistant hypertension, blood pressure control and the progression of chronic kidney disease in patients with type 2 diabetes. Kidney and Blood Pressure Research. 2018;43(2):422–38. doi: 10.1159/000488255
  • 33. Rej S, Herrmann N, Gruneir A, McArthur E, Jeyakumar N, Muanda FT, et al. Association of Lithium Use and a Higher Serum Concentration of Lithium With the Risk of Declining Renal Function in Older Adults: A Population-Based Cohort Study. The Journal of Clinical Psychiatry. 2020;81(5). doi: 10.4088/JCP.19m13045
  • 34. Yoo H, Park I, Kim DJ, Lee S. Effects of sarpogrelate on microvascular complications with type 2 diabetes. International Journal of Clinical Pharmacy. 2019. doi: 10.1007/s11096-019-00794-7
  • 35. Tangri N, Reaven NL, Funk SE, Ferguson TW, Collister D, Mathur V. Metabolic acidosis is associated with increased risk of adverse kidney outcomes and mortality in patients with non-dialysis dependent chronic kidney disease: an observational cohort study. BMC Nephrology. 2021;22(1). doi: 10.1186/s12882-021-02385-z
  • 36. Lv L, Chang DY, Li ZY, Chen M, Hu Z, Zhao MH. Persistent hematuria in patients with antineutrophil cytoplasmic antibody-associated vasculitis during clinical remission: chronic glomerular lesion or low-grade active renal vasculitis? BMC Nephrology. 2017;18(1):354. doi: 10.1186/s12882-017-0763-7
  • 37. Li XM, Rui HC, Liang DD, Xu F, Liang SS, Zhu XD, et al. Clinicopathological characteristics and outcomes of light chain deposition disease: an analysis of 48 patients in a single Chinese center. Annals of Hematology. 2016;95(6):901–9. doi: 10.1007/s00277-016-2659-1
  • 38. Gallant JE, Parish MA, Keruly JC, Moore RD. Changes in renal function associated with tenofovir disoproxil fumarate treatment, compared with nucleoside reverse-transcriptase inhibitor treatment. Clinical Infectious Diseases. 2005;40(8):1194–8. doi: 10.1086/428840
  • 39. Herget-Rosenthal S, Dehnen D, Kribben A, Quellmann T. Progressive chronic kidney disease in primary care: modifiable risk factors and predictive model. Preventive Medicine. 2013;57(4):357–62. doi: 10.1016/j.ypmed.2013.06.010
  • 40. Morales-Alvarez MC, Garcia-Dolagaray G, Millan-Fierro A, Rosas SE. Renal Function Decline in Latinos With Type 2 Diabetes. Kidney International Reports. 2019;4(9):1230–4. doi: 10.1016/j.ekir.2019.05.012
  • 41. Nderitu P, Doos L, Strauss VY, Lambie M, Davies SJ, Kadam UT. Analgesia dose prescribing and estimated glomerular filtration rate decline: a general practice database linkage cohort study. BMJ Open. 2014;4(8):e005581. doi: 10.1136/bmjopen-2014-005581
  • 42. Koraishy FM, Hooks-Anderson D, Salas J, Scherrer JF. Rate of renal function decline, race and referral to nephrology in a large cohort of primary care patients. Family Practice. 2017;34(4):416–22. doi: 10.1093/fampra/cmx012
  • 43. Johnson F, Phillips D, Talabani B, Wonnacott A, Meran S, Phillips AO. The impact of acute kidney injury in diabetes mellitus. Nephrology. 2016;21(6):506–11. doi: 10.1111/nep.12649
  • 44. Chakera A, MacEwen C, Bellur SS, Chompuk LO, Lunn D, Roberts ISD. Prognostic value of endocapillary hypercellularity in IgA nephropathy patients with no immunosuppression. Journal of Nephrology. 2016;29(3):367–75. doi: 10.1007/s40620-015-0227-8
  • 45. Chen H, Liu C, Fu C, Zhang H, Yang H, Wang P, et al. Combined application of eGFR and albuminuria for the precise diagnosis of stage 2 and 3a CKD in the elderly. Journal of Nephrology. 2014;27(3):289–97. doi: 10.1007/s40620-013-0011-6
  • 46. Perotte A, Ranganath R, Hirsch JS, Blei D, Elhadad N. Risk prediction for chronic kidney disease progression using heterogeneous electronic health record data and time series analysis. Journal of the American Medical Informatics Association. 2015;22(4):872–80. doi: 10.1093/jamia/ocv024
  • 47. Cummings DM, Larsen LC, Doherty L, Lea CS, Holbert D. Glycemic Control Patterns and Kidney Disease Progression among Primary Care Patients with Diabetes Mellitus. Journal of the American Board of Family Medicine. 2011;24(4):391–8. doi: 10.3122/jabfm.2011.04.100186
  • 48. Horne L, Ashfaq A, MacLachlan S, Sinsakul M, Qin L, LoCasale R, et al. Epidemiology and health outcomes associated with hyperkalemia in a primary care setting in England. BMC Nephrology. 2019;20(1):85. doi: 10.1186/s12882-019-1250-0
  • 49. Robinson DE, Ali MS, Pallares N, Tebe C, Elhussein L, Abrahamsen B, et al. Safety of Oral Bisphosphonates in Moderate-to-Severe Chronic Kidney Disease: A Binational Cohort Analysis. Journal of Bone and Mineral Research. 2021;36(5):820–32. doi: 10.1002/jbmr.4235
  • 50. Nichols GA, Deruaz-Luyet A, Brodovicz KG, Kimes TM, Rosales AG, Hauske SJ. Kidney disease progression and all-cause mortality across estimated glomerular filtration rate and albuminuria categories among patients with vs. without type 2 diabetes. BMC Nephrology. 2020;21(1).
  • 51. Yanagawa T, Koyano K, Azuma K. Retrospective study of factors associated with progression and remission/regression of diabetic kidney disease-hypomagnesemia was associated with progression and elevated serum alanine aminotransferase levels were associated with remission or regression. Diabetology International. 2021;12(3):268–76. doi: 10.1007/s13340-020-00483-1
  • 52. Vesga JI, Cepeda E, Pardo CE, Paez S, Sanchez R, Sanabria RM. Chronic kidney disease progression and transition probabilities in a large preventive cohort in Colombia. International Journal of Nephrology. 2021;2021.
  • 53. Oetjens M, Bush WS, Birdwell KA, Dilks HH, Bowton EA, Denny JC, et al. Utilization of an EMR-biorepository to identify the genetic predictors of calcineurin-inhibitor toxicity in heart transplant recipients. Pacific Symposium on Biocomputing. 2014:253–64.
  • 54. Neuen BL, Weldegiorgis M, Herrington WG, Ohkuma T, Smith M, Woodward M. Changes in GFR and Albuminuria in Routine Clinical Practice and the Risk of Kidney Disease Progression. American Journal of Kidney Diseases. 2021. doi: 10.1053/j.ajkd.2021.02.335
  • 55. Weldegiorgis M, Smith M, Herrington WG, Bankhead C, Woodward M. Socioeconomic disadvantage and the risk of advanced chronic kidney disease: results from a cohort study with 1.4 million participants. Nephrology Dialysis Transplantation. 2020;35(9):1562–70. doi: 10.1093/ndt/gfz059
  • 56. Niu SF, Wu CK, Chuang NC, Yang YB, Chang TH. Early Chronic Kidney Disease Care Programme delays kidney function deterioration in patients with stage I-IIIa chronic kidney disease: an observational cohort study in Taiwan. BMJ Open. 2021;11(1). doi: 10.1136/bmjopen-2020-041210
  • 57. O’Riordan A, Dutt N, Cairns H, Rela M, O’Grady JG, Heaton N, et al. Renal biopsy in liver transplant recipients. Nephrology Dialysis Transplantation. 2009;24(7):2276–82. doi: 10.1093/ndt/gfp112
  • 58. Tsai CW, Lin SY, Kuo CC, Huang CC. Serum Uric Acid and Progression of Kidney Disease: A Longitudinal Analysis and Mini-Review. PLoS ONE. 2017;12(1):e0170393. doi: 10.1371/journal.pone.0170393
  • 59. Leither MD, Murphy DP, Bicknese L, Reule S, Vock DM, Ishani A, et al. The impact of outpatient acute kidney injury on mortality and chronic kidney disease: a retrospective cohort study. Nephrology Dialysis Transplantation. 2019;34(3):493–501. doi: 10.1093/ndt/gfy036
  • 60. Liu D, You J, Liu Y, Tang X, Tan X, Xia M, et al. Serum immunoglobulin G provides early risk prediction in immunoglobulin A nephropathy. International Immunopharmacology. 2019;66:13–8. doi: 10.1016/j.intimp.2018.10.044
  • 61. Rincon-Choles H, Jolly SE, Arrigain S, Konig V, Schold JD, Nakhoul G, et al. Impact of Uric Acid Levels on Kidney Disease Progression. American Journal of Nephrology. 2017;46(4):315–22. doi: 10.1159/000481460
  • 62. VanWagner LB, Montag S, Zhao L, Allen NB, Lloyd-Jones DM, Das A, et al. Cardiovascular Disease Outcomes Related to Early Stage Renal Impairment After Liver Transplantation. Transplantation. 2018;102(7):1096–107. doi: 10.1097/TP.0000000000002175
  • 63. Shou H, Hsu JY, Xie D, Yang W, Roy J, Anderson AH, et al. Analytic Considerations for Repeated Measures of eGFR in Cohort Studies of CKD. Clin J Am Soc Nephrol. 2017;12(8):1357–65. doi: 10.2215/CJN.11311116
  • 64. Levin A, Agarwal R, Herrington WG, Heerspink HL, Mann JFE, Shahinfar S, et al. International consensus definitions of clinical trial outcomes for kidney failure: 2020. Kidney International. 2020;98(4):849–59. doi: 10.1016/j.kint.2020.07.013
  • 65. Levey AS, Inker LA, Matsushita K, Greene T, Willis K, Lewis E, et al. GFR decline as an end point for clinical trials in CKD: a scientific workshop sponsored by the National Kidney Foundation and the US Food and Drug Administration. Am J Kidney Dis. 2014;64(6):821–35. doi: 10.1053/j.ajkd.2014.07.030
  • 66. Vandenbroucke JP, von Elm E, Altman DG, Gøtzsche PC, Mulrow CD, Pocock SJ, et al. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): explanation and elaboration. Int J Surg. 2014;12(12):1500–24. doi: 10.1016/j.ijsu.2014.07.014
  • 67. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. J Clin Epidemiol. 2008;61(4):344–9. doi: 10.1016/j.jclinepi.2007.11.008
  • 68. Benchimol EI, Smeeth L, Guttmann A, Harron K, Moher D, Petersen I, et al. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med. 2015;12(10):e1001885. doi: 10.1371/journal.pmed.1001885

Decision Letter 0

Stanislaw Stepkowski

Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

11 May 2021

PONE-D-21-03779

A systematic review of statistical methodology used to evaluate progression of chronic kidney disease using electronic healthcare records

PLOS ONE

Dear Dr. Cleary,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

The authors need to address the reviewers' comments.

Reviewer # 1:

The systematic analysis by Faye Cleary et al. is useful in terms of identifying the opportunities coming from increasing availability of electronic health records, as well as pointing to common mistakes and challenges associated with their intrinsic nature. In this regard, the authors did a good job summarizing the methodology across the 65 included studies. It would be nice to note that retrospective studies using traditional paper records will suffer from the same problems as those using electronic health records: incomplete records, variation in logging practices, addressing AKI when modeling CKD progression, loss to follow-up and competing risks. I don't have other concerns about this work.

Reviewer # 2:

A well written systematic review with proper design, presentation of results, and discussion. I have a few comments:

-The search needs to be updated beyond 7th May 2020. Multiple studies were published after this date.

-Clarify the study methodology: Sample size (before and after exclusions for reasons of data completeness);

-The authors stated that “It is likely that if research is missing from review, then data quality issues in missing studies are likely to be of a similar quality or worse quality than studies reviewed”. Only studies reported in English language were included in this review. Therefore, it is unfair to label studies in non-English language as inferior in quality.

-What is the outcome for Cox proportional hazards regression? I assume time to event!!

Please submit your revised manuscript by Jun 25 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Stanislaw Stepkowski

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. In your Data Availability statement, you have not specified where the minimal data set underlying the results described in your manuscript can be found. PLOS defines a study's minimal data set as the underlying data used to reach the conclusions drawn in the manuscript and any additional data required to replicate the reported study findings in their entirety. All PLOS journals require that the minimal data set be made fully available. For more information about our data policy, please see http://journals.plos.org/plosone/s/data-availability.

Upon re-submitting your revised manuscript, please upload your study’s minimal underlying data set as either Supporting Information files or to a stable, public repository and include the relevant URLs, DOIs, or accession numbers within your revised cover letter. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories. Any potentially identifying patient information must be fully anonymized.

Important: If there are ethical or legal restrictions to sharing your data publicly, please explain these restrictions in detail. Please see our guidelines for more information on what we consider unacceptable restrictions to publicly sharing data: http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

We will update your Data Availability statement to reflect the information you provide in your cover letter.

3. Please remove your figures from within your manuscript file, leaving only the individual TIFF/EPS image files, uploaded separately. These will be automatically included in the reviewers’ PDF.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********


6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Dulat Bekbolsynov

Reviewer #2: Yes: Sadik A. Khuder

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2022 Jul 29;17(7):e0264167. doi: 10.1371/journal.pone.0264167.r002

Author response to Decision Letter 0


26 Oct 2021

I respond to specific reviewer comments individually below. I break this down point by point for reviewer 2 using the word 'RESPONSE' for each individual point separately.

Comments reviewer #1:

The systematic analysis by Faye Cleary et al. is useful in terms of identifying the opportunities coming from increasing availability of electronic health records, as well as pointing to common mistakes and challenges associated with their intrinsic nature. In this regard, the authors did a good job summarizing the methodology across the 65 included studies. It would be nice to note that retrospective studies using traditional paper records will suffer from the same problems as those using electronic health records: incomplete records, variation in logging practices, addressing AKI when modeling CKD progression, loss to follow-up and competing risks. I don't have other concerns about this work.

RESPONSE:

We are pleased that the reviewer sees the value and quality of our work. In response to the reviewer's suggestion, we have updated the text in the discussion to note that retrospective studies using traditional paper records will suffer from the same problems as those using electronic healthcare records.

Comments Reviewer #2:

A well written systematic review with proper design, presentation of results, and discussion.

RESPONSE: We are pleased that the reviewer believes we have conducted a well-designed and presented review of the literature.

-The search needs to be updated beyond 7th May 2020. Multiple studies were published after this date.

RESPONSE: We have updated the search dates to include studies available in the 4 databases covered by the review as of August 2021, allowing us to capture more recently published studies.

-Clarify the study methodology: Sample size (before and after exclusions for reasons of data completeness);

RESPONSE: We have clarified in the methods text that data completeness inclusion criteria are the study-specific inclusion criteria applied before the main analyses to restrict them to patients whose data were complete enough to be deemed suitable for analysis; such criteria are expected to vary between studies. The explanation of the calculation for “percent of target population analysed” also shows readers how we used sample sizes before and after data completeness inclusion criteria were applied to quantify the extent to which patients were excluded from analysis purely because they failed to meet a study’s data completeness requirements.
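
As an illustration of the calculation described in this response, a minimal sketch follows. The function name and the example counts are hypothetical, and the sketch assumes that “percent of target population analysed” is the sample size after data completeness criteria divided by the sample size before them; the manuscript's methods text remains the authoritative definition.

```python
def percent_of_target_analysed(n_before: int, n_after: int) -> float:
    """Percent of the target population retained for analysis.

    n_before: sample size before data completeness inclusion criteria
    n_after:  sample size after those criteria were applied
    Both names are hypothetical and used only for this sketch.
    """
    if n_before <= 0:
        raise ValueError("n_before must be positive")
    if not 0 <= n_after <= n_before:
        raise ValueError("n_after must lie between 0 and n_before")
    return 100.0 * n_after / n_before


# Made-up counts for illustration only; not taken from any reviewed study.
example_percent = percent_of_target_analysed(n_before=2000, n_after=1300)
```

Under this assumed definition, a study whose result falls below 75 would count among those analysing less than 75% of the target population.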

-The authors stated that “It is likely that if research is missing from review, then data quality issues in missing studies are likely to be of a similar quality or worse quality than studies reviewed”. Only studies reported in English language were included in this review. Therefore, it is unfair to label studies in non-English language as inferior in quality.

RESPONSE: Our comment that “With peer-reviewed literature expected to go through certain research quality checks, it is likely that if research is missing from review, then data quality issues in missing studies are likely to be of a similar quality or worse quality than studies reviewed” was intended to convey that studies missing from the review because they were not peer-reviewed are likely to be of similar or worse quality than peer-reviewed studies, because of the quality checks that peer-reviewed studies go through. It was not intended to say anything about studies published in non-English languages. We have clarified in the methods text that we anticipate that studies published in English and in non-English languages are likely to be of similar quality.

-What is the outcome for Cox proportional hazards regression? I assume time to event !!

RESPONSE: I am not certain which part of the manuscript the reviewer is referring to in this comment, but I imagine it may be results Tables 2 and 4. I would like to clarify what is reported and what is not. Because researchers are expected to vary in how they define progression of kidney disease over time, and because this may make the findings of research studies harder to interpret clinically, a key aim of our review was to summarise how researchers measured changes in renal function over time. We also reported methods of analysis (which include, as the reviewer states, Cox proportional hazards regression models, whose outcome is the time to some event). In our reporting of study methodology (Table 2), we summarise “Measure of change in renal function over time”. As an example, an event of a 30% decline in eGFR between measures would be reported as “Raw percent change in eGFR between measures” (as this captures how changes over time were measured). We do not specifically state whether this was analysed as time to event or as a binary outcome, but we do report the method of analysis (“Statistical model used”), for example Cox proportional hazards regression. Table 4 further clarifies the precise measures of change in renal function over time for each individual study (e.g. percent loss in eGFR between measures >30%) alongside the methods used (e.g. Cox PH regression). Although we do not specifically state what the outcome is (e.g. time to percent loss in eGFR between measures >30%), it can be inferred. We have added a comment to Table 4 to clarify this.
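
To make the distinction between the measure of change and the analysed outcome concrete, the following minimal sketch derives a time-to-event outcome from one patient's repeated eGFR measures. The function name, the choice of the earliest measurement as baseline, and the example values are assumptions made for illustration and are not taken from any reviewed study. No Cox model is fitted here; the sketch only produces the (follow-up time, event indicator) pair that such a model would take as its outcome.

```python
from typing import List, Tuple


def time_to_percent_decline(
    measures: List[Tuple[float, float]],
    threshold_pct: float = 30.0,
) -> Tuple[float, bool]:
    """Return (follow-up time, event indicator) for one patient.

    measures: (time in years, eGFR) pairs. The earliest measurement is
    treated as baseline, which is an assumption of this sketch.
    The event is the first percent loss in eGFR from baseline that
    exceeds threshold_pct; otherwise the patient is censored at the
    last available measurement.
    """
    if not measures:
        raise ValueError("at least one measurement is required")
    ordered = sorted(measures)
    base_time, base_egfr = ordered[0]
    if base_egfr <= 0:
        raise ValueError("baseline eGFR must be positive")
    for t, egfr in ordered[1:]:
        pct_loss = 100.0 * (base_egfr - egfr) / base_egfr
        if pct_loss > threshold_pct:
            return (t - base_time, True)   # event observed at time t
    return (ordered[-1][0] - base_time, False)  # censored at last test


# Made-up trajectories for illustration only.
event_case = time_to_percent_decline(
    [(0.0, 60.0), (1.0, 50.0), (2.5, 40.0), (3.0, 38.0)]
)
censored_case = time_to_percent_decline(
    [(0.0, 60.0), (1.0, 55.0), (2.0, 50.0)]
)
```

The same measure of change (percent loss in eGFR >30%) could instead be analysed as a binary outcome by keeping only the event indicator, which is why the measure and the statistical model are reported separately.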

Attachment

Submitted filename: Response to Reviewers.docx

Decision Letter 1

Mabel Aoun

7 Feb 2022

A systematic review of statistical methodology used to evaluate progression of chronic kidney disease using electronic healthcare records

PONE-D-21-03779R1

Dear Dr. Cleary,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Mabel Aoun, MD, MPH

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Dulat Bekbolsynov

Acceptance letter

Mabel Aoun

9 Feb 2022

PONE-D-21-03779R1

A systematic review of statistical methodology used to evaluate progression of chronic kidney disease using electronic healthcare records

Dear Dr. Cleary:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Mabel Aoun

Academic Editor

PLOS ONE

Associated Data

    Supplementary Materials

    S1 File. PRISMA checklist.

    (DOC)

    S2 File. MEDLINE database search strategy.

    (DOCX)

    S3 File. List of reviewed studies.

    (DOCX)

    S1 Table. Summary of study populations, where unclear if EHRs used.

    (DOCX)

    S2 Table. Study methodology, where unclear if EHRs used.

    (DOCX)

    S3 Table. Critique of handling of data quality and methodological challenges, where unclear if EHRs used.

    (DOCX)

    S4 Table. Listing of key features of all included studies, sorted by year of publication.

    (DOCX)

    S1 Data. Data extraction spreadsheet.

    (XLSX)

    Data Availability Statement

    This is a systematic review of previously published research, available in the public domain. All relevant data extracted from reviewed articles are captured in the manuscript and its Supporting Information files.


    Articles from PLoS ONE are provided here courtesy of PLOS
