Abstract
Background
Several clinical prediction models that predict the risk of chronic kidney disease (CKD) in people with diabetes have been developed; however, these models lack external validation demonstrating accurate predictions in Canadian primary care. We externally validated existing clinical prediction models for CKD in Canadian primary care data, overall and across subgroups defined by sex/gender, age, comorbidities, and neighbourhood-level deprivation.
Methods
We conducted a retrospective cohort study using data from the Canadian Primary Care Sentinel Surveillance Network (CPCSSN) electronic medical record database (2014–2019). We identified models that use demographic, health behaviour, clinical and diabetes-related characteristics to predict incident CKD based on two recent systematic reviews and included models with sufficient predictors in CPCSSN (≤ 1 unavailable) and eGFR-based CKD definitions. We included adult patients (18+) with diabetes without an existing diagnosis of CKD. We identified incident cases of CKD within 5 years based on ≥ 2 laboratory values corresponding to eGFR < 60 mL/min/1.73 m² separated by ≥ 90 days and ≤ 1 year. For each model, we estimated the discrimination, precision, recall, and calibration within CPCSSN.
Results
Among 37,604 patients with diabetes, 14.6% met diagnostic criteria for CKD within 5 years. Overall performance of the 13 included CKD prediction models in CPCSSN was mixed: three models displayed moderate to strong discrimination (areas under the receiver-operating characteristic curves [AUROCs] > 0.70), whereas other AUROCs were as low as 0.508. After model updating, calibrations were heterogeneous, with most models displaying some miscalibration. Some subgroups displayed considerable differences in performance: discriminative performance (AUROC) declined with increasing age and number of comorbidities, whereas precision and recall improved with increasing age and number of comorbidities. We observed no difference in performance according to sex/gender or deprivation quintile.
Conclusions
Three models displayed moderate to strong performance predicting CKD among CPCSSN patients. Next, these models should be evaluated for their impact on practitioner and patient outcomes when implemented in clinical practice. If successful, these models hold promise in achieving widespread adoption to help identify those at highest risk of CKD and guide therapies that may prevent or delay CKD and related sequelae (e.g., end-stage renal disease) among people with diabetes.
Supplementary Information
The online version contains supplementary material available at 10.1186/s41512-025-00208-5.
Keywords: Clinical prediction models, External validation, Chronic kidney disease, Diabetes, Primary care, CPCSSN
Background
Diabetes is associated with the development of several micro- and macrovascular complications, including diabetic nephropathy and chronic kidney disease (CKD) [1]. In Canada, CKD remains common among people with diabetes despite established therapies and clinical guidelines aimed at preventing diabetes complications [2, 3]. Early identification of patients with diabetes who are at increased risk of CKD may enable targeted risk-reducing strategies to help prevent or delay the onset of CKD. Such strategies include intensification of standard therapy; addition of novel kidney protective agents; closer monitoring of patient adherence to treatments and therapeutic efficacy; and referral for specialized renal or diabetes services [1]. For example, traditional approaches to managing a patient with newly diagnosed type 2 diabetes may involve gradually introducing therapies to reduce blood glucose levels. However, this initial period following diabetes diagnosis is critical in pathophysiological processes that partly determine a patient’s progression towards microvascular complications, including nephropathy [4, 5]. Patients at increased risk of CKD may benefit from rapidly achieving target glycemic levels through intensive therapies that may be less impactful in low-risk patients.
Clinical prediction models can identify patients who are at increased risk of CKD based on patient factors, including demographic, health behaviour, clinical and diabetes-related characteristics. Many clinical prediction models that predict the risk of incident CKD (i.e., probability ranging from 0 to 1) in people with diabetes have been developed based on such patient factors [6, 7]. In Canada, opportunities exist to deploy a CKD clinical prediction model in primary care settings—where patients with diabetes are most commonly managed; however, these models lack external validation demonstrating accurate predictions in this setting [8]. Indeed, clinical prediction models are prone to inadequate performance among subgroups, which may introduce or exacerbate inequities in care [9, 10]. Only upon confirmation of robust performance in Canadian primary care should a CKD prediction model be implemented in clinical practice and subsequently evaluated for its ability to modify practitioner and patient behaviours and prevent CKD [11].
We sought to externally validate existing clinical prediction models for incident CKD in Canadian primary care data. We assessed the performance of CKD prediction models overall and among specific subgroups known to be associated with incident CKD [12] (i.e., sex/gender, age group, number of comorbidities, and social and material deprivation quintile).
Methods
Study setting and data source
We conducted a retrospective cohort study using data from the Canadian Primary Care Sentinel Surveillance Network (CPCSSN), a pan-Canadian collection of electronic medical record (EMR) data for more than 2 million primary care patients [13]. Primary care practices contribute patient data describing diagnoses, procedures, laboratory tests, medication prescriptions, and referrals, as recorded by participating practitioners in the EMR. However, important information on patient demographics (e.g., race and ethnicity) and health behaviours (e.g., smoking and exercise) is not reliably recorded in the EMR. CPCSSN was established in 2008 and regularly receives EMR data extracted by 14 contributing networks. Coverage among most provinces is strong, though Quebec contributes fewer patient records due to difficulties obtaining consent from patients [14]. Female and older patients are overrepresented in CPCSSN compared to the general Canadian population [15]; this is expected, as female and older patients are more likely to visit primary care practitioners [16]. Practitioners participating in CPCSSN practiced in similar geographic locations to those who responded to the 2013 National Physician Survey, though CPCSSN tended to overrepresent practitioners working in academic practices [15]. CPCSSN uses robust processing to clean and standardize patient records (e.g., assigning diagnostic codes to free-text diagnoses) to facilitate research use [14].
Models
Two recent systematic reviews identified and characterized clinical prediction models for CKD [6, 7]. Models were developed using longitudinal data from people with pre-diabetes or type 1 or 2 diabetes. We considered 47 models published in 33 research articles, largely developed in people with type 2 diabetes. To ensure each model could feasibly be implemented within CPCSSN, we considered the availability of predictors within CPCSSN and how CKD was defined. We restricted to models where most predictors were available within CPCSSN (≤ 1 unavailable) and CKD was defined based on an eGFR threshold (i.e., excluding CKD definitions based on albuminuria or documentation of kidney failure), consistent with our CPCSSN incident CKD definition (see §Measures). Selected model coefficients are presented in Supplementary Table S1, Additional File 1.
Participants
We used a validated case definition [17] (i.e., a combination of diagnostic codes, medication prescriptions, or laboratory test results; see Supplementary Table S2, Additional File 1) to identify adult patients (18+) with diabetes in CPCSSN. We excluded patients from Quebec and Manitoba because eGFR laboratory test results were not reliably available in CPCSSN. We identified a unique baseline visit for each patient as their first visit with a participating primary care practitioner in 2014. We included patients diagnosed with diabetes prior to or at baseline, but excluded patients diagnosed with probable type 1 diabetes (based on a validated case definition composed of free-text terms, insulin prescriptions, and age [18]; see Supplementary Table S3, Additional File 1) within 5 years before their baseline visit, as guidelines recommend screening for CKD should commence 5 years following diagnosis for patients with type 1 diabetes [3, 19]. While this definition had sufficient sensitivity (87.3%) to exclude patients diagnosed with potential type 1 diabetes in the past 5 years, it suffered insufficient positive predictive value (35.6%) to reliably distinguish patients with type 1 and 2 diabetes. As such, we did not assess performance by diabetes type. We included patients without an existing diagnosis of CKD (see §Measures) prior to baseline; with at least 3 visits in the 2 years prior to baseline; and with at least one visit within 5 years after baseline. We followed patients for up to 5 years (2014 through 2019) to examine patterns of incident CKD over time. We censored patients at their last visit during the study period to account for patients lacking further follow-up (e.g., due to moving, changing primary care providers, or death).
Measures
For each CKD model considered, we attempted to identify all predictors using CPCSSN data. We first assessed whether the predictor was directly measured (e.g., laboratory results such as HbA1c) or whether a CPCSSN-validated case definition was available (e.g., hypertension and chronic obstructive pulmonary disease [17]). Lacking these, we searched for other case definitions validated in primary care EMR data or previous research identifying the predictor in CPCSSN. If no case definition existed, we consulted clinical colleagues to develop a case definition via a set of criteria that identify the predictor in CPCSSN. We were unable to measure predictors without any information in CPCSSN, such as most health behaviours (e.g., physical activity or alcohol use) and several laboratory tests that are not extracted by CPCSSN processing (e.g., cystatin C or serum uric acid). Details on how we identified each predictor in CPCSSN are described in Supplementary Table S4, Additional File 1.
We identified diagnoses of CKD during the 5-year study period based on clinical guidelines: two or more laboratory values corresponding to an eGFR < 60 mL/min/1.73 m² separated by ≥ 90 days but < 1 year [20].
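This case definition can be applied directly to a patient's dated eGFR results. The authors' implementation is not published; the following is a minimal Python sketch of the chronicity check, with the function name and data layout chosen for illustration.

```python
from datetime import date

def meets_ckd_definition(egfr_results, threshold=60.0,
                         min_gap_days=90, max_gap_days=365):
    """Two or more eGFR values below `threshold` (mL/min/1.73 m²),
    separated by at least `min_gap_days` and at most `max_gap_days`.

    egfr_results: iterable of (date, value) pairs.
    """
    low_dates = sorted(d for d, value in egfr_results if value < threshold)
    for i in range(len(low_dates)):
        for j in range(i + 1, len(low_dates)):
            gap = (low_dates[j] - low_dates[i]).days
            if min_gap_days <= gap <= max_gap_days:
                return True
    return False

# Two sub-threshold results 142 days apart meet the definition; the
# third result (72) is above threshold and is ignored.
results = [(date(2015, 1, 10), 55.0), (date(2015, 6, 1), 58.0),
           (date(2016, 2, 1), 72.0)]
print(meets_ckd_definition(results))  # True
```

A single low value, or two low values fewer than 90 days or more than a year apart, does not meet the definition.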
Some subgroups are known to be at increased risk of CKD compared to other primary care patients with diabetes [21]. We considered subgroups defined by sex/gender (female/woman and male/man), age group (18 to 39, 40 to 64, and 65 and older), number of comorbidities (none, one, two, or three or more), and social and material deprivation quintile as these subgroups could be identified within CPCSSN; other important subgroups, such as Black patients, could not be identified in CPCSSN. We determined sex/gender as recorded in the EMR—as practitioners entered this information, we cannot be certain whether sex or gender was recorded. We calculated age based on patient birth dates. We identified patient comorbidities at baseline for diseases with case definitions validated in CPCSSN data: cardiovascular disease; liver cirrhosis; chronic obstructive pulmonary disease; dementia; depression; epilepsy; hypertension; osteoarthritis; Parkinson’s disease; obesity; and dyslipidemia [17, 22–25]. We estimated patients’ deprivation quintile by mapping their full postal code to neighbourhood-level Pampalon indices [26]. The Pampalon deprivation index estimates the neighbourhood social and material deprivation based on the level of education, employment, income, living situation, marital status, and single-parent status of individuals within that dissemination area according to the Census of Canada [26]. Higher Pampalon deprivation index scores indicate greater deprivation.
Statistical analysis
We described baseline continuous predictors using means with standard deviations [SD] or medians with first and third quartiles and categorical predictors using frequencies with percentages.
We characterized the amount of missing data for predictors at baseline. Where eGFR laboratory testing was absent over follow-up, we assumed normal kidney function (i.e., the patient had not developed CKD). To evaluate the impact of this assumption, we performed a sensitivity analysis in which patients without eGFR laboratory results over follow-up were excluded.
To compare performances between different models, we used single imputation with k-nearest neighbours to address missing data rather than a more complex, computationally intensive method such as multiple imputation. As a result, confidence intervals around measures of performance are artificially more precise; however, comparisons between point estimates of model performance remain valid. To ensure the imputed values were reasonable, we graphically compared imputed values against observed values using density plots (Supplementary Fig. S1, Additional File 1) [27].
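The analyses were conducted in R (see below), but the single-imputation step can be illustrated with scikit-learn's `KNNImputer` in Python; the small predictor matrix below is hypothetical.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Hypothetical baseline predictors (age, HbA1c, BMI) with missing entries.
X = np.array([
    [63.0, 7.1, 31.0],
    [58.0, np.nan, 29.5],
    [71.0, 6.8, np.nan],
    [66.0, 7.4, 33.2],
])

# Single imputation: each missing value is filled once with the mean of
# that feature among the k nearest neighbours (nan-aware Euclidean
# distance computed on the observed features).
imputer = KNNImputer(n_neighbors=2)
X_imputed = imputer.fit_transform(X)
assert not np.isnan(X_imputed).any()
```

Unlike multiple imputation, this produces a single completed dataset, which is why downstream confidence intervals are artificially narrow, as noted above.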
We updated models by re-estimating the model intercept and scaling the model coefficients (i.e., reducing or increasing the magnitude of all coefficients by some factor). This process yields risk estimates better suited to CPCSSN’s characteristics and accounts for the varying prediction horizons of different models (e.g., models predicting 3-year vs. 5-year CKD risks). We assume that 3-year and 5-year risk estimates should rank patients similarly, even if their absolute values differ. By re-estimating intercepts and scaling coefficients, we adjust for differences in the mean and distribution of risks while preserving their ranking.
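This updating procedure is logistic recalibration: regressing the observed outcome on each model's linear predictor yields a new intercept and a single scaling factor for the coefficients. The authors performed this with the predRupdate R package; the sketch below illustrates the idea in Python on simulated data (all quantities are hypothetical).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Linear predictors from a hypothetical "original" model whose risks are
# miscalibrated in the new population (true intercept -0.5, slope 0.8).
lp = rng.normal(-1.5, 1.0, size=5000)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.8 * lp - 0.5))))

# Logistic recalibration: the fitted slope rescales every original
# coefficient by one common factor; the fitted intercept re-centres the
# predicted risks to the new population's baseline risk.
recal = LogisticRegression(C=1e6)  # large C ≈ plain maximum likelihood
recal.fit(lp.reshape(-1, 1), y)
slope, intercept = recal.coef_[0, 0], recal.intercept_[0]
updated_risk = recal.predict_proba(lp.reshape(-1, 1))[:, 1]
```

Because all coefficients are scaled by one factor, patient rankings (and hence discrimination) are preserved, while the mean updated risk matches the observed event rate in the validation data.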
We evaluated the performance of existing models within CPCSSN data using well-established validation measures: discrimination, precision, recall, and calibration. Based on the updated regression equation for each CKD prediction model (Supplementary Table S5, Additional File 1), we calculated the predicted risk of CKD within 5 years for all patients in our cohort and compared these with their observed outcome (i.e., incident CKD within 5 years). We used the receiver-operating characteristic curve and corresponding AUROC to assess the model’s ability to correctly discern between high- and low-risk patients (discrimination). We used the precision-recall curve and corresponding area under the precision-recall curve (AUPRC) to understand the balance between precision and recall for each model. Finally, we used calibration curves to assess the agreement between the predicted and observed risks (calibration) for each updated model. To create calibration curves, we regressed observed CKD diagnoses against the linear predictor calculated from each model using natural cubic splines. From this model, we obtained the observed probability for each patient that we then plotted against their predicted risk. We conducted all analyses in R 4.3.2 [28], including model validation using the predRupdate package [29].
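As an illustration of these measures, the sketch below computes the AUROC and AUPRC with scikit-learn and a simple binned calibration comparison on simulated predictions. The paper fits natural cubic splines for its calibration curves; deciles are used here for brevity, and all data are hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(1)
risk = rng.uniform(0.0, 1.0, size=2000)  # hypothetical predicted 5-year risks
y = rng.binomial(1, risk)                # outcomes generated from those risks

auroc = roc_auc_score(y, risk)            # discrimination
auprc = average_precision_score(y, risk)  # precision-recall summary

# Calibration: compare mean predicted risk with the observed event rate
# within deciles of predicted risk.
edges = np.quantile(risk, np.linspace(0.0, 1.0, 11))
bin_idx = np.clip(np.digitize(risk, edges[1:-1]), 0, 9)
predicted = np.array([risk[bin_idx == k].mean() for k in range(10)])
observed = np.array([y[bin_idx == k].mean() for k in range(10)])
```

For well-calibrated predictions, the `observed` and `predicted` decile means track each other; a calibration plot shows `observed` against `predicted` with the 45° line as the ideal.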
We evaluated performance among key groups served by primary care defined by sex/gender, age group, number of comorbidities, and deprivation quintile.
Sample size
We computed the minimum sample size required for external validation to estimate a 95% confidence interval (CI) for the AUROC with a width of approximately 0.04, and 95% CIs for the calibration intercept and slope both with widths of approximately 0.2 [30]. We considered multiple scenarios, varying the anticipated AUROC and prevalence of CKD based on the previous models we identified and their development datasets. Based on these specifications, the required minimum sample size ranged from 689 to 21,925 (Supplementary Table S6, Additional File 1).
Ethics approval
Our study was approved by the University of Calgary Conjoint Health Research Ethics Board under study ID REB21-1741. All patients provided consent to contribute their records to CPCSSN.
Results
Of the 47 prediction models for CKD we considered, only 6 had all predictors available and 7 had all but one predictor available within CPCSSN (Fig. 1). Many excluded models required predictors such as demographic characteristics that are unavailable within CPCSSN. The 13 selected models were developed in various European, Asian, and North American countries using logistic, multinomial logistic, Cox proportional hazards, or Weibull regression based on a range of data sources, including registries, administrative databases, electronic medical records, prospective observational studies, and randomized controlled trials. The included models were published between 2010 and 2020 (Table 1); however, development data for these models spanned time periods from 1990 to 2019. Most models were developed exclusively among people with type 2 diabetes; however, the Vergouwe et al. model [31] was developed among people with type 1 diabetes. The included models displayed moderate to strong internal performance based on their development data (AUROCs ranged from 0.65 to 0.87).
Fig. 1.
CKD models included in analysis, including availability of predictors and CKD definition
Table 1.
Characteristics of CKD models included in analysis and their development cohorts
| First author | Publication year | Sample size (n) | Frequency of CKD diagnosis (%) | Model | Prediction horizon | Development AUROC |
|---|---|---|---|---|---|---|
| Afghahi [32] | 2011 | 3,667 | 11.1 | Logistic | 5 years | 0.67 to 0.87 |
| Basu [33] | 2017 | 9,635 | 61.3 | Cox | 4.7 years | 0.76 |
| Dagliati [34] | 2018 | 943 | 12.8 | Logistic | 3 years | 0.701 |
| Dorajoo [35] | 2017 | 716 | 35.2 | Logistic | 3 years | 0.76 |
| Dunkler [36] | 2015 | 6,766 | 15.9 | Multinomial logistic | 5.5 years | 0.68 |
| Hu [37] | 2020 | 3,489 | 20.1 | Logistic | 4 years | 0.74 |
| Jardine [38] | 2012 | 7,377 | 36.8 | Cox | 5 years | 0.65 |
| Low [39] | 2017 | 1,582 | 42.9 | Logistic | 6 years | 0.83 |
| Miao [40] | 2017 | 5,705 | N/R | Cox | 20 years | 0.80 |
| Nelson [41] | 2019 | 781,627 | 40.1 | Weibull | 5 years | 0.80 |
| Riphagen [42] | 2015 | 640 | 28.6 | Cox | 10 years | 0.69 |
| Tanaka [43] | 2013 | 1,748 | 4.1 | Cox | 5 years | 0.77 |
| Vergouwe [31] | 2010 | 1,115 | 13.0 | Logistic | 7 years | 0.69 |
Among 324,890 adult patients with baseline primary care visits in 2014 without CKD, we identified 37,604 patients with diabetes for external validation of CKD prediction models (Fig. 2); this CPCSSN diabetes cohort exceeded all minimum sample size estimates. Cohort characteristics are summarized in Table 2. The median follow-up duration was 4.8 (IQR: 0.4) years per patient, for a total of 166,558 person-years. Supplementary Table S7, Additional File 1 summarizes the number of days between the last recording and baseline visits for characteristics with multiple recordings. Supplementary Table S8, Additional File 1 compares characteristics of our validation cohort with characteristics of the development cohorts for the selected models. One model did not report any characteristics for their cohort (Dagliati et al. [34]). The mean age of the CPCSSN diabetes cohort (63.4 years) approximated those of the development cohorts, except Vergouwe et al. [31], which used a much younger cohort (mean age: 33 years). There was considerable heterogeneity in the sex/gender distribution of the development cohorts we considered; the CPCSSN diabetes cohort had slightly more males/men than females/women, similar to some development cohorts with approximately equal sex/gender distributions but dissimilar from those that were disproportionately split (e.g., nearly two-thirds of the Afghahi et al. cohort [32] and 87% of the Nelson [41] cohort were male/men). The CPCSSN diabetes cohort had a lower mean HbA1c value than the development cohorts. Similarly, systolic and diastolic blood pressures were lower among the CPCSSN diabetes cohort than the development cohorts. However, the CPCSSN diabetes cohort had a higher average body mass index (BMI) compared to the development cohorts.
Fig. 2.
CPCSSN patients included in analysis, including reason for exclusion
Table 2.
Characteristics of CPCSSN diabetes cohort for external validation
| Characteristic | N = 37,604 |
|---|---|
| Demographic variables | |
| Age (years), mean ± SD | 63.4 ± 12.2 |
| Age group, n (%) | |
| 18 to 39 | 1,073 (2.9) |
| 40 to 64 | 18,370 (48.9) |
| 65 and older | 18,161 (48.3) |
| Sex/gender, n (%) | |
| Male/men | 23,174 (53.0) |
| Female/women | 20,516 (47.0) |
| Social and material deprivation, n (%) | |
| 1st quintile (least deprived) | 6,564 (17.5) |
| 2nd quintile | 7,521 (20.0) |
| 3rd quintile | 7,032 (18.7) |
| 4th quintile | 7,123 (18.9) |
| 5th quintile (most deprived) | 7,039 (18.7) |
| Missing | 2,325 (6.2) |
| Rurality (urban), n (%) | 29,088 (77.4) |
| Health behaviour variables | |
| Smoking, n (%) | |
| Not current | 1,111 (3.0) |
| Non-smoker | 1,080 (2.9) |
| Ex-smoker | 1,344 (3.6) |
| Current smoker | 1,417 (3.8) |
| Missing | 32,652 (86.8) |
| Diabetes-related variables | |
| Fasting blood glucose (mg/dL), mean ± SD | 136.3 ± 45.5 |
| HbA1c (%), median (first–third quartiles) | 6.7 (6.3–7.4) |
| Diabetes duration (years), median (first–third quartiles) | 2.8 (1.3–5.1) |
| Diabetes duration ≥ 7 years, n (%) | 5,018 (13.3) |
| Oral diabetes drugs, n (%) | 21,460 (57.1) |
| Insulin, n (%) | 4,128 (11.0) |
| Oral diabetes drugs or insulin, n (%) | 22,477 (59.8) |
| Anthropometric variables | |
| Weight (kg), mean ± SD | 91.0 ± 24.9 |
| BMI (kg/m2), mean ± SD | 32.3 ± 7.4 |
| Physical examination variables | |
| Systolic blood pressure (mmHg), mean ± SD | 131 ± 16 |
| Diastolic blood pressure (mmHg), mean ± SD | 76 ± 10 |
| Pulse pressure (mmHg), mean ± SD | 55 ± 14 |
| Laboratory variables | |
| eGFR (mL/min/1.73 m2), mean ± SD | 84 ± 18 |
| Urine albumin (mg/dL), mean ± SD | 3.2 ± 5.9 |
| Urine creatinine (mg/dL), mean ± SD | 127.8 ± 70.7 |
| Urinary ACR (mg/mmol), median (first–third quartiles) | 1.3 (0.6–2.9) |
| Serum creatinine (μmol/L), mean ± SD | 73 ± 17 |
| Triglycerides (mmol/L), mean ± SD | 1.7 ± 1.0 |
| LDL cholesterol (mmol/L), mean ± SD | 2.3 ± 0.9 |
| HDL cholesterol (mmol/L), mean ± SD | 1.2 ± 0.3 |
| Total cholesterol (mmol/L), mean ± SD | 4.3 ± 1.1 |
| Cholesterol-HDL ratio, mean ± SD | 3.7 ± 1.2 |
| Medical history variables | |
| Antihypertensive medication, n (%) | 24,855 (66.1) |
| RAS-antagonist, n (%) | 21,198 (56.4) |
| Lipid-lowering drugs, n (%) | 21,745 (57.8) |
| Anticoagulants, n (%) | 7,840 (20.8) |
| Hypertension, n (%) | 19,121 (50.8) |
| Dyslipidemia, n (%) | 17,434 (46.4) |
| Hypertension or dyslipidemia, n (%) | 27,657 (73.5) |
| Diabetic retinopathy, n (%) | 643 (1.7) |
| Cerebrovascular disease, n (%) | 1,137 (3.0) |
| Previous atrial fibrillation, n (%) | 668 (1.8) |
| Macrovascular complications, n (%) | 3,238 (8.6) |
| Previous cardiovascular disease, n (%) | 5,631 (15.0) |
| Comorbidities, n (%) | |
| None | 3,692 (9.8) |
| One | 9,062 (24.1) |
| Two | 10,826 (28.8) |
| Three or more | 14,024 (37.3) |
| RAS-antagonist: renin–angiotensin system antagonist | |
Over the 5-year follow-up period, we found that 14.6% of patients with diabetes developed incident CKD with an incidence rate of 33.1 cases per 1000 person-years (Table 3). Though CKD incidence proportions and rates were similar between males/men and females/women, we observed increased proportions and rates among older patients, patients with more comorbidities, and patients living in areas with higher deprivation.
Table 3.
Incidence proportion and rate of incident CKD over follow-up* (n = 37,604)
| | Incidence proportion, % (95% CI) | Incidence rate, per 1000 person-years (95% CI) |
|---|---|---|
| Overall | 14.6 (14.3 to 15.0) | 33.1 (32.2 to 34.0) |
| Sex/gender | ||
| Female/Woman | 15.0 (14.5 to 15.6) | 33.9 (32.6 to 35.2) |
| Male/Man | 14.3 (13.8 to 14.8) | 32.3 (31.2 to 33.5) |
| Age at baseline | ||
| 18 to 39 | 1.2 (0.7 to 2.1) | 2.8 (1.6 to 4.7) |
| 40 to 64 | 6.2 (5.9 to 6.6) | 13.7 (12.9 to 14.5) |
| 65 and older | 24.0 (23.4 to 24.6) | 55.1 (53.5 to 56.7) |
| Comorbidities | ||
| None | 12.1 (11.0 to 13.2) | 27.0 (24.5 to 29.5) |
| One | 12.5 (11.8 to 13.2) | 28.0 (26.4 to 29.7) |
| Two | 14.3 (13.7 to 15.0) | 32.2 (30.7 to 33.9) |
| Three or more | 17.0 (16.3 to 17.6) | 38.7 (37.2 to 40.3) |
| Social and material deprivation | ||
| 1st quintile (least deprived) | 14.4 (13.5 to 15.2) | 32.2 (30.2 to 34.3) |
| 2nd quintile | 14.4 (13.6 to 15.2) | 32.3 (30.3 to 34.2) |
| 3rd quintile | 14.0 (13.2 to 14.8) | 31.5 (29.6 to 33.5) |
| 4th quintile | 14.9 (14.1 to 15.7) | 33.6 (31.6 to 35.6) |
| 5th quintile (most deprived) | 14.8 (14.0 to 15.6) | 33.4 (31.4 to 35.5) |
CKD chronic kidney disease, CI confidence interval
*Two or more laboratory values separated by at least 90 days but not more than 1 year apart reporting an eGFR less than 60 mL/min/1.73 m²
Overall performance of the CKD prediction models in CPCSSN was mixed (Table 4), with AUROCs (discrimination) ranging from 0.492 to 0.826. The models developed by Nelson et al. [41] and Afghahi et al. [32] had the best discrimination, whereas several models performed only slightly better or no better than chance (AUROC < 0.60), including models developed by Dagliati et al. [34], Dorajoo et al. [35], Dunkler et al. [36], Hu et al. [37], Miao et al. [40], Tanaka et al. [43], and Vergouwe et al. [31]. The model developed by Jardine et al. [38] also performed strongly, with an AUROC of 0.752 (95% CI: 0.745 to 0.759). We found similar patterns in performance considering the AUPRC: the Nelson et al. model had the highest AUPRC (0.467 [95% CI: 0.462 to 0.472]) while the Dagliati et al. and Dunkler et al. models had the lowest (0.149 [95% CI: 0.146 to 0.153]). All models displayed poor calibration before updating (Supplementary Fig. S2, Additional File 1). Even after model updating (via intercept re-estimation and coefficient scaling), calibrations were heterogeneous, with most models displaying some miscalibration (Fig. 3). We found predicted risks from the Afghahi et al. model closely approximated observed risks, except for a small proportion (less than 5%) of predicted risks that exceeded values of 50% and overestimated observed risks. The Jardine et al. model severely underestimated risk among a small group of patients with low predicted risk but was well calibrated thereafter. The Nelson et al. model overestimated risk for lower-risk patients and underestimated risk for higher-risk patients, though with a maximum error of approximately 10%. In some cases, coefficient scaling applied to models with poor predictive performance resulted in extremely poor calibration, with predicted values closely clustered together to minimize error in the model; however, these models do not provide meaningful predicted values. For example, the calibration plot for the Dagliati et al. model shows predicted risks clustered around one value, providing little to no valuable risk information, as all patients were assigned almost the same predicted risk.
Table 4.
Discrimination of selected prediction models for incident CKD among CPCSSN patients with diabetes
| First author | Publication year | Model | Reported development AUROC | CPCSSN AUROC | CPCSSN AUPRC |
|---|---|---|---|---|---|
| Afghahi [32] | 2011 | Logistic | 0.67 to 0.87 | 0.822 (0.817 to 0.828) | 0.445 (0.440 to 0.450) |
| Basu [33] | 2017 | Cox | 0.76 | 0.621 (0.613 to 0.628) | 0.211 (0.206 to 0.215) |
| Dagliati [34] | 2018 | Logistic | 0.701 | 0.515 (0.507 to 0.523) | 0.149 (0.146 to 0.153) |
| Dorajoo [35] | 2017 | Logistic | 0.76 | 0.572 (0.564 to 0.580) | 0.185 (0.181 to 0.189) |
| Dunkler [36] | 2015 | Multinomial logistic | 0.68 | 0.524 (0.516 to 0.532) | 0.149 (0.146 to 0.153) |
| Hu [37] | 2020 | Logistic | 0.74 | 0.492 (0.483 to 0.500) | 0.171 (0.168 to 0.175) |
| Jardine [38] | 2012 | Cox | 0.65 | 0.752 (0.745 to 0.759) | 0.359 (0.354 to 0.364) |
| Low [39] | 2017 | Logistic | 0.83 | 0.661 (0.654 to 0.669) | 0.235 (0.231 to 0.240) |
| Miao [40] | 2017 | Cox | 0.80 | 0.526 (0.518 to 0.534) | 0.152 (0.149 to 0.156) |
| Nelson [41] | 2019 | Weibull | 0.80 | 0.826 (0.820 to 0.832) | 0.467 (0.462 to 0.472) |
| Riphagen [42] | 2015 | Cox | 0.69 | 0.679 (0.672 to 0.686) | 0.248 (0.244 to 0.253) |
| Tanaka [43] | 2013 | Cox | 0.77 | 0.571 (0.562 to 0.579) | 0.180 (0.176 to 0.184) |
| Vergouwe [31] | 2010 | Logistic | 0.69 | 0.549 (0.540 to 0.557) | 0.178 (0.175 to 0.182) |
AUROC area under the receiver-operating characteristic curve, AUPRC area under the precision-recall curve
Fig. 3.
Calibration plots of selected prediction models for incident CKD after model updating among CPCSSN patients with diabetes. Each plot shows the predicted probability compared to the observed probability (solid blue line); the dashed line represents ideal calibration
Considering subgroups defined by sex/gender, age group at baseline, number of comorbidities, and deprivation quintile, we observed considerable differences in performance across some subgroups. For most models, discriminative performance (AUROC) declined with increasing age (Supplementary Table S9, Additional File 1); however, according to the AUPRC, performance improved with increasing age (Supplementary Table S10, Additional File 1). After model updating, patients aged 65 and older displayed the best calibration intercepts compared to younger patients, though calibration slopes were heterogeneous across age groups (Supplementary Table S11, Additional File 1). Like the patterns we observed by age group, we found decreasing discriminative performance (AUROC) with increasing numbers of comorbidities but increasing AUPRC with increasing numbers of comorbidities. However, we observed no patterns in calibration (intercept or slope) depending on the number of comorbidities. We observed no difference in performance according to sex/gender or deprivation quintile.
In our sensitivity analysis, we found models performed similarly after excluding 2,030 patients (5.4%) who had no eGFR testing over follow-up (Supplementary Tables S12 to S15, Additional File 1): overall AUROCs were slightly higher due to the increased incidence of CKD in this cohort.
Discussion
Principal findings
We externally validated several CKD clinical prediction models for use in Canadian primary care using EMR data, considering performance within specific subgroups. Of the CKD prediction models previously identified in the literature [6, 7], we found only 13 models where most predictors could be successfully characterized using CPCSSN data (≤ 1 unavailable). We found performance of these models within CPCSSN was mixed, though some models demonstrated strong performance, namely the models developed by Afghahi et al., Jardine et al. and Nelson et al. Comparing subgroups defined by sex/gender, age, number of comorbidities, and deprivation quintile, we found similar model performance across sex/genders and deprivation quintiles; however, greater discrimination (AUROC) was observed among younger patients with fewer comorbidities, whereas greater precision and recall (AUPRC) were observed among older patients with more comorbidities. This pattern may be explained by the optimism of the AUROC when the outcome is rare, to which the AUPRC is not subject [44]. Age- and comorbidity-specific patterns were similar across all the models we assessed.
Model performance in CPCSSN may have differed from performance in their development cohorts for various reasons. Differences in case-mix between the development cohort and the CPCSSN diabetes cohort may contribute to the performance differences we observed; Supplementary Table S8, Additional File 1 presents the baseline characteristics for the CPCSSN diabetes cohort and all development cohorts. For example, the mean age among the Vergouwe et al. development cohort was much younger than that of the CPCSSN diabetes cohort (33 years vs. 63.4 years). Further, differences in how predictors and/or incident CKD were measured may have also contributed to differences in model performance. For some models, performance in the CPCSSN diabetes cohort was markedly worse than the development cohort (i.e., models developed by Dagliati et al., Dorajoo et al., Dunkler et al., Hu et al., Miao et al., Tanaka et al., and Vergouwe et al.). Excluding Dunkler et al. and Miao et al., these models did not include age as a predictor, despite known increases in incident CKD risk associated with age [1]. In general, models that had fewer predictors available in CPCSSN tended to have poorer performance in the CPCSSN diabetes cohort compared to models with more predictors. For example, the model with the best performance in the CPCSSN diabetes cohort, developed by Nelson et al., included 14 available predictors, whereas the models developed by Dagliati et al., Dorajoo et al., Dunkler et al., Tanaka et al., and Vergouwe et al. used only 5 or fewer predictors.
Our findings are consistent with previous work externally validating CKD prediction models [7]. Similar to their performance in the CPCSSN diabetes cohort, the models developed by Afghahi [32] and Nelson [41] demonstrated strong discrimination when estimating 5-year CKD risk in the Hoorn DCS cohort, a prospective cohort of over 14,000 people in the Netherlands (AUROCs > 0.80). The models developed by Basu [33] and Dagliati [34] performed similarly poorly in the Hoorn DCS cohort. These findings support the use of models such as those developed by Afghahi [32] and Nelson [41]; however, targeted validation in Canadian primary care (such as our results in the CPCSSN diabetes cohort) provides the best evidence to advocate for their use in this setting.
Implications
The CKD models developed by Afghahi et al. and Nelson et al. demonstrated robust performance using Canadian primary care data. All predictors required for the Afghahi et al. model are routinely collected and stored in CPCSSN, facilitating straightforward integration into clinical practice, whereas the Nelson et al. model includes one predictor that was not available in CPCSSN data (i.e., race) but nonetheless demonstrated strong performance. Collecting race information could further improve the performance of the Nelson et al. model. Another strong-performing model was developed by Jardine et al.; however, it displayed greater miscalibration, especially among patients predicted to be at low risk. As a next step, these models should be implemented and evaluated in clinical practice to determine how best to support practitioner decision-making, improve patient health behaviours, and reduce the risk of incident CKD.
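Miscalibration of this kind is commonly addressed by logistic recalibration, i.e., re-estimating an intercept and slope on the log-odds scale in the validation data (consistent with the calibration intercepts and slopes reported in Supplementary Tables S11 and S15). The sketch below uses illustrative coefficient values only, not values estimated in this study:

```python
import math

def recalibrate(p, intercept, slope):
    """Logistic recalibration: shift and scale a predicted probability on
    the log-odds (logit) scale, then map back to a probability."""
    logit = math.log(p / (1 - p))
    return 1 / (1 + math.exp(-(intercept + slope * logit)))

# Illustrative update only: a slope < 1 shrinks overly extreme predictions,
# and a negative intercept corrects systematic overestimation of risk.
p_original = 0.40
p_updated = recalibrate(p_original, intercept=-0.2, slope=0.8)
```

With intercept 0 and slope 1, the prediction is returned unchanged; a well-calibrated model would yield estimates near those values in validation data.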
Clinical prediction models for incident CKD have not been used widely in Canadian primary care; however, similar tools have seen some adoption, such as the KidneyWise toolkit implemented in primary care practices by the Ontario Renal Network. This toolkit includes the Kidney Failure Risk Equation, which predicts the risk of end-stage kidney disease among patients with CKD based on their age, sex, eGFR, and urine ACR [45]. While KidneyWise did not improve the appropriateness of primary care referrals to nephrology, it was successful in modifying practitioner behaviours, such as including a urine ACR result in referrals [46]. Existing tools like the KidneyWise toolkit could be modified to include a clinical prediction model for incident CKD among patients with diabetes, such as the strongly performing models we identified.
Strengths and limitations
We analysed a large, Canada-wide cohort of representative patients from primary care practices using EMRs, allowing us to measure the performance of CKD prediction models and detect differences across subgroups. Indeed, our cohort size exceeded the largest estimate of required sample size by more than 15,000 patients. Nonetheless, the risk of selection bias remains, as CPCSSN does not employ random sampling when recruiting practitioners to participate in the network. For example, CPCSSN has been shown to overrepresent older patients compared to the general population [15], potentially limiting the generalizability of our results to younger patients. However, our external validation involved numerous primary care practices across Canada where a CKD clinical prediction model could be implemented, offering crucial performance data to support its use in these practices [8]. We only validated models for which information on most predictors was available in CPCSSN; thus, models that relied heavily on factors such as ethnicity, education, physical activity, and alcohol use were excluded. Although this limited the range of models we evaluated, the models we considered can be integrated into an EMR system to facilitate model uptake and generate risk predictions using only the data contained within the EMR [47].
We recognize that our study has some limitations. Missing data were common for some predictors; however, we used a robust imputation process and found that imputed predictor distributions closely approximated those observed. Further, some loss to follow-up resulted in censoring; however, retention was strong, with a median follow-up duration of 4.8 years. Similarly, eGFR values were not collected for some patients; in these cases, we assumed that missing eGFR values did not indicate incident CKD. Sensitivity analyses suggested the impact of this assumption on our results was minimal. Blinding to ensure unbiased outcome assessment was not possible; however, CKD diagnoses were based on laboratory measurements that were not influenced by practitioner bias. We could not distinguish between people with type 1 and type 2 diabetes because the case definition for type 1 diabetes lacked sufficient positive predictive value (35.6%) [18]; thus, we could not confirm whether model performance differed by diabetes type. Given that most of the models we evaluated were developed in patients with type 2 diabetes, the models should be applied with caution in patients with type 1 diabetes, as we cannot be certain that performance will be consistent between the two groups. Additionally, we had no information describing patients' race or ethnicity (despite known associations between these factors and risk of incident CKD [48]), precluding our ability to use these factors as predictors or to confirm model performance among groups defined by these characteristics. Finally, our validation was limited to patients managed by their primary care practitioner; model performance may differ for patients managed by endocrinologists or other specialists, where predictive relationships may differ.
Conclusions
Our study externally validated several models predicting incident CKD among patients with diabetes in Canadian primary care. Performance of these models was mixed, though the models developed by Afghahi et al., Jardine et al., and Nelson et al. displayed strong discrimination and calibration in the CPCSSN diabetes cohort overall and across a variety of subgroups. We could not examine performance among patients with type 1 diabetes; additional external validation in this group is required before these models are employed there. Despite strong performance, these models leave room for improvement. Moreover, in typical practice a single model must be selected to predict CKD in clinical settings, discarding the information carried by all other models. Ensemble methods (a class of machine learning techniques) could combine multiple CKD prediction models, potentially improving predictive performance [49, 50]. In future research, we will explore the use of ensemble methods to integrate risk information across multiple models and improve CKD prediction.
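The simplest ensemble of the kind referenced above is a (weighted) average of the probabilities produced by each model. The sketch below uses hypothetical per-model risks and equal weights; in practice, weights would be learned from validation data (e.g., by stacking):

```python
def ensemble_risk(predictions, weights=None):
    """Combine per-model predicted probabilities into one risk estimate
    via a (weighted) average -- the simplest form of ensembling."""
    if weights is None:
        weights = [1.0 / len(predictions)] * len(predictions)
    assert abs(sum(weights) - 1.0) < 1e-9, "weights should sum to 1"
    return sum(w * p for w, p in zip(weights, predictions))

# Hypothetical 5-year CKD risks for one patient from three models
# (values are illustrative only, not outputs of the validated models).
combined = ensemble_risk([0.12, 0.18, 0.15])
```

Because each input is a probability and the weights sum to one, the combined estimate is itself a valid probability lying within the range of the individual predictions.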
Prior to regular clinical use, these models should be implemented in clinical practice and evaluated for their impact on practitioner and patient outcomes, such as use of risk-lowering therapies or incident CKD. If successful, these models hold promise in achieving widespread adoption to help prevent or delay CKD and related sequelae (e.g., end-stage renal disease) among people with diabetes, significantly improving patient outcomes and quality of life.
Supplementary Information
Additional file 1. Supplementary Table S1: Regression equations of selected prediction models for incident CKD.
Supplementary Table S2: CPCSSN case definition for diabetes. Supplementary Table S3: CPCSSN case definition for type 1 diabetes. Supplementary Table S4: Measurement of predictors of CKD within CPCSSN. Supplementary Table S5: Regression equations of selected prediction models for incident CKD after updating among CPCSSN patients with diabetes. Supplementary Table S6: Sample size estimation parameters and corresponding required sample size. Supplementary Table S7: Recency of CPCSSN diabetes cohort characteristic measurements (time from most recent measurement to baseline visit). Supplementary Table S8: Baseline characteristics of adult patients with diabetes. Supplementary Table S9: Discrimination (AUROC) of selected prediction models for incident CKD among subgroups of patients with diabetes represented within CPCSSN. Supplementary Table S10: Area under the precision-recall curve (AUPRC) of selected prediction models for incident CKD among subgroups of patients with diabetes represented within CPCSSN. Supplementary Table S11: Calibration intercept and/or slope of selected prediction models for incident CKD among subgroups of patients with diabetes represented within CPCSSN. Supplementary Table S12: Discrimination of selected prediction models for incident CKD among CPCSSN patients with diabetes, excluding those who had no eGFR testing over follow up. Supplementary Table S13: Discrimination (AUROC) of selected prediction models for incident CKD among subgroups of patients with diabetes represented within CPCSSN, excluding those who had no eGFR testing over follow up. Supplementary Table S14: Area under the precision-recall curve (AUPRC) of selected prediction models for incident CKD among subgroups of patients with diabetes represented within CPCSSN, excluding those who had no eGFR testing over follow up. 
Supplementary Table S15: Calibration intercept and/or slope of selected prediction models for incident CKD among subgroups of patients with diabetes represented within CPCSSN, excluding those who had no eGFR testing over follow up. Supplementary Figure S1: Kernel density plots for non-missing and imputed values among all continuous variables with some amount of missingness. Supplementary Figure S2: Calibration plots of selected prediction models for incident CKD before model updating among CPCSSN patients with diabetes (models that did not report intercepts could not be plotted). Each plot shows the predicted probability compared to the observed probability (solid blue line); the dashed line represents ideal calibration.
Additional file 2. The TRIPOD+AI reporting checklist.
Acknowledgements
Not applicable.
Abbreviations
- AUPRC
Area under the precision-recall curve
- AUROC
Area under the receiver-operating characteristic curve
- BMI
Body mass index
- CI
Confidence interval
- CKD
Chronic kidney disease
- CPCSSN
Canadian Primary Care Sentinel Surveillance Network
- EMR
Electronic medical record
Authors’ contributions
JEB and TSW conceptualized and formalized the research and developed the methodology. All authors reviewed and approved the proposed methodology. JEB performed all statistical analyses and drafted the manuscript. All authors (JEB, DJTC, PER, KAM, and TSW) substantively reviewed and edited the manuscript and approved the submitted version.
Funding
This research partly comprises Jason E. Black’s doctoral work, which is supported by the Achievers in Medical Sciences, Alberta Innovates, and Artificial Intelligence for Public Health scholarships.
Data availability
The dataset supporting the conclusions of this article is available in the CPCSSN repository upon reasonable request, https://cpcssn.ca/dar/.
Declarations
Ethics approval and consent to participate
Our study was approved by the University of Calgary Conjoint Health Research Ethics Board under study ID REB21-1741. All patients provided consent to contribute their records to CPCSSN.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Gross JL, de Azevedo MJ, Silveiro SP, Canani LH, Caramori ML, Zelmanovitz T. Diabetic nephropathy: diagnosis, prevention, and treatment. Diabetes Care. 2005;28:164–76. [DOI] [PubMed] [Google Scholar]
- 2.Bello AK, Ronksley PE, Tangri N, et al. Prevalence and demographics of CKD in Canadian primary care practices: a cross-sectional study. Kidney Int Rep. 2019;4:561–70. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.McFarlane P, Cherney D, Gilbert R, Senior P. Diabetes Canada 2018 Clinical Practice Guidelines for the Prevention and Management of Diabetes in Canada: Chronic Kidney Disease in Diabetes. Can J Diabetes. 2018.
- 4.Drzewoski J, Kasznicki J, Trojanowski Z. The role of “metabolic memory” in the natural history of diabetes mellitus. Pol Arch Med Wewn. 2009;119:493–500. [PubMed] [Google Scholar]
- 5.Copur S, Rossing P, Afsar B, Sag AA, Siriopol D, Kuwabara M, et al. A primer on metabolic memory: why existing diabesity treatments fail. Clin Kidney J. 2021;14:756–67. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Ndjaboue R, Ngueta G, Rochefort-Brihay C, et al. Prediction models of diabetes complications: a scoping review. J Epidemiol Community Health. 2022. 10.1136/jech-2021-217793. [DOI] [PubMed] [Google Scholar]
- 7.Slieker RC, van der Heijden AAWA, Siddiqui MK, et al. Performance of prediction models for nephropathy in people with type 2 diabetes: systematic review and external validation study. BMJ. 2021;374:n2134. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Sperrin M, Riley RD, Collins GS, Martin GP. Targeted validation: validating clinical prediction models in their intended population and setting. Diagn Progn Res. 2022;6:24. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Rajkomar A, Hardt M, Howell MD, Corrado G, Chin MH. Ensuring fairness in machine learning to advance health equity. Ann Intern Med. 2018;169:866–72. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Chen IY, Pierson E, Rose S, Joshi S, Ferryman K, Ghassemi M. Ethical machine learning in healthcare. Annu Rev Biomed Data Sci. 2021;4:123–44. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Moons KGM, Altman DG, Vergouwe Y, Royston P. Prognosis and prognostic research: application and impact of prognostic models in clinical practice. BMJ Online. 2009;338:1487–90. [DOI] [PubMed] [Google Scholar]
- 12.Barda N, Yona G, Rothblum GN, Greenland P, Leibowitz M, Balicer R, et al. Addressing bias in prediction models by improving subpopulation calibration. J Am Med Inform Assoc. 2021;28:549–58. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 13.Garies S, Birtwhistle R, Drummond N, Queenan J, Williamson T. Data Resource Profile: National electronic medical record data from the Canadian Primary Care Sentinel Surveillance Network (CPCSSN). Int J Epidemiol. 2017;46:1091–1092f. [DOI] [PubMed] [Google Scholar]
- 14.Morkem R, Salman A, Herman C, Shah R, Wong S, Barber D. CPCSSN Data Quality: An Opportunity for Enhancing Canadian Primary Care Data. 2023.
- 15.Queenan JA, Williamson T, Khan S, Drummond N, Garies S, Morkem R, et al. Representativeness of patients and providers in the Canadian Primary Care Sentinel Surveillance Network: a cross-sectional study. CMAJ Open. 2016;4:E28–32. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 16.Nie JX, Wang L, Tracy CS, Moineddin R, Upshur RE. Health care service utilization among the elderly: findings from the study to understand the chronic condition experience of the elderly and the disabled (SUCCEED project). J Eval Clin Pract. 2008;14:1044–9. [DOI] [PubMed] [Google Scholar]
- 17.Williamson T, Green ME, Birtwhistle R, Khan S, Garies S, Wong ST, et al. Validating the 8 CPCSSN case definitions for chronic disease surveillance in a primary care database of electronic health records. Ann Fam Med. 2014;12:367–72. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 18.Lethebe BC, Williamson T, Garies S, McBrien K, Leduc C, Butalia S, et al. Developing a case definition for type 1 diabetes mellitus in a primary care electronic medical record database: an exploratory study. Can Med Assoc Open Access J. 2019;7:E246–51. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 19.de Boer IH, Caramori ML, Chan JCN, et al. KDIGO 2020 clinical practice guideline for diabetes management in chronic kidney disease. Kidney Int. 2020;98:S1–115. [DOI] [PubMed] [Google Scholar]
- 20.Committee; CDACPGE, Cheng AYY. Canadian Diabetes Association 2013 clinical practice guidelines for the prevention and management of diabetes in Canada. Can J Diabetes. 2013;2013(37):S1–3. [DOI] [PubMed] [Google Scholar]
- 21.Kazancioğlu R. Risk factors for chronic kidney disease: an update. Kidney Int Suppl. 2013;3:368–71. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Thomas RD, Kosowan L, Rabey M, Bell A, Connelly KA, Hawkins NM, et al. Validation of a case definition to identify patients diagnosed with cardiovascular disease in Canadian primary care practices. CJC Open. 2023. 10.1016/j.cjco.2023.04.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Faisal N, Kosowan L, Zafari H, Zulkernine F, Lix L, Mahar A, et al. Development and validation of a case definition to estimate the prevalence and incidence of cirrhosis in pan-Canadian primary care databases. Canadian Liver Journal. 2023. 10.3138/canlivj-2023-0002. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Rigobon AV, Birtwhistle R, Khan S, Barber D, Biro S, Morkem R, et al. Adult obesity prevalence in primary care users: an exploration using Canadian Primary Care Sentinel Surveillance Network (CPCSSN) data. Can J Public Health Rev Can Sante Publique. 2015;106:e283–289. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Spohn O, Morkem R, Singer AG, Barber D. Prevalence and management of dyslipidemia in primary care practices in Canada. Can Fam Physician. 2024;70:187–96. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Pampalon R, Hamel D, Gamache P, Raymond G. A deprivation index for health planning in Canada. Chron Dis Can. 2009;29:178–91. [PubMed] [Google Scholar]
- 27.Nguyen CD, Carlin JB, Lee KJ. Model checking in multiple imputation: an overview and case study. Emerg Themes Epidemiol. 2017;14:8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.R Core Team. R: A Language and Environment for Statistical Computing. 2023.
- 29.Martin GP, Jenkins D, Sperrin M. predRupdate: Prediction model validation and updating. 2023.
- 30.Pavlou M, Qu C, Omar RZ, Seaman SR, Steyerberg EW, White IR, et al. Estimation of required sample size for external validation of risk models for binary outcomes. Stat Methods Med Res. 2021;30:2187–206. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31.Vergouwe Y, Soedamah-Muthu SS, Zgibor J, et al. Progression to microalbuminuria in type 1 diabetes: development and validation of a prediction rule. Diabetologia. 2010;53:254–62. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32.Afghahi H, Cederholm J, Eliasson B, Zethelius B, Gudbjörnsdottir S, Hadimeri H, et al. Risk factors for the development of albuminuria and renal impairment in type 2 diabetes–the Swedish National Diabetes Register (NDR). Nephrol Dial Transplant. 2011;26:1236–43. [DOI] [PubMed] [Google Scholar]
- 33.Basu S, Sussman JB, Berkowitz SA, Hayward RA, Yudkin JS. Development and validation of risk equations for complications of type 2 diabetes (RECODe) using individual participant data from randomised trials. Lancet Diabetes Endocrinol. 2017;5:788–98. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Dagliati A, Marini S, Sacchi L, Cogni G, Teliti M, Tibollo V, et al. Machine learning methods to predict diabetes complications. J Diabetes Sci Technol. 2018;12:295–302. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Dorajoo SR, Ng JSL, Goh JHF, Lim SC, Yap CW, Chan A, et al. Hba1c variability in type 2 diabetes is associated with the occurrence of new-onset albuminuria within three years. Diabetes Res Clin Pract. 2017;128:32–9. [DOI] [PubMed] [Google Scholar]
- 36.Dunkler D, Gao P, Lee SF, et al. Risk prediction for early CKD in type 2 diabetes. Clin J Am Soc Nephrol. 2015;10:1371–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Hu Y, Shi R, Mo R, Hu F. Nomogram for the prediction of diabetic nephropathy risk among patients with type 2 diabetes mellitus based on a questionnaire and biochemical indicators: a retrospective study. Aging. 2020;12:10317–36. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38.Jardine MJ, Hata J, Woodward M, et al. Prediction of kidney-related outcomes in patients with type 2 diabetes. Am J Kidney Dis. 2012;60:770–8. [DOI] [PubMed] [Google Scholar]
- 39.Low S, Lim SC, Zhang X, Zhou S, Yeoh LY, Liu YL, et al. Development and validation of a predictive model for chronic kidney disease progression in type 2 diabetes mellitus based on a 13-year study in Singapore. Diabetes Res Clin Pract. 2017;123:49–54. [DOI] [PubMed] [Google Scholar]
- 40.Miao DD, Pan EC, Zhang Q, Sun ZM, Qin Y, Wu M. Development and validation of a model for predicting diabetic nephropathy in Chinese people. Biomed Environ Sci. 2017;30:106–12. [DOI] [PubMed] [Google Scholar]
- 41.Nelson RG, Grams ME, Ballew SH, et al. Development of risk prediction equations for incident chronic kidney disease. JAMA. 2019;322:2104–14. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Riphagen IJ, Kleefstra N, Drion I, et al. Comparison of methods for renal risk prediction in patients with type 2 diabetes (ZODIAC-36). PLoS One. 2015;10:e0120477. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Tanaka S, Tanaka S, Iimuro S, et al. Predicting macro- and microvascular complications in type 2 diabetes: the Japan Diabetes Complications Study/the Japanese Elderly Diabetes Intervention Trial risk engine. Diabetes Care. 2013;36:1193–9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Ozenne B, Subtil F, Maucort-Boulch D. The precision–recall curve overcame the optimism of the receiver operating characteristic curve in rare diseases. J Clin Epidemiol. 2015;68:855–9. [DOI] [PubMed] [Google Scholar]
- 45.Ontario Renal Network. KidneyWise Toolkit. 2018. https://www.ontariorenalnetwork.ca/en/kidney-care-resources/clinical-tools/primary-care-tools/kidneywise/toolkit.
- 46.Brimble KS, Boll P, Grill AK, Molnar A, Nash DM, Garg A, et al. Impact of the KidneyWise toolkit on chronic kidney disease referral practices in Ontario primary care: a prospective evaluation. BMJ Open. 2020;10:e032838. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Lee TC, Shah NU, Haack A, Baxter SL. Clinical implementation of predictive models embedded within electronic health record systems: a systematic review. Informatics. 2020;7:25. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 48.Patzer RE, McClellan WM. Influence of race, ethnicity and socioeconomic status on kidney disease. Nat Rev Nephrol. 2012;8:533–41. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 49.Polikar R. Ensemble learning. In: Zhang C, Ma Y, editors. Ensemble machine learning: methods and applications. Boston, MA: Springer US; 2012. p. 1–34.
- 50.Hu X, Madden LV, Edwards S, Xu X. Combining models is more likely to give better predictions than single models. Phytopathology. 2015;105:1174–82. [DOI] [PubMed] [Google Scholar]