Diabetes Care. 2022 Sep 27;45(11):2737–2745. doi: 10.2337/dc22-0894

Derivation and External Validation of a Clinical Model to Predict Heart Failure Onset in Patients With Incident Diabetes

Louise Y Sun 1,2,3, Salwa S Zghebi 4,5, Anan Bader Eddeen 2, Peter P Liu 6, Douglas S Lee 2,7,8, Karen Tu 2,7,8,9, Sheldon W Tobe 8,10,11, Evangelos Kontopantelis 4,5,12, Mamas A Mamas 13,14
PMCID: PMC9862443  PMID: 36107673

Abstract

OBJECTIVE

Heart failure (HF) often develops in patients with diabetes and is recognized for its role in increased cardiovascular morbidity and mortality in this population. Most existing models predict risk in patients with prevalent rather than incident diabetes and fail to account for sex differences in HF risk factors. We derived sex-specific models in Ontario, Canada to predict HF at diabetes onset and externally validated these models in the U.K.

RESEARCH DESIGN AND METHODS

Retrospective cohort study using international population-based data. Our derivation cohort comprised all Ontario residents aged ≥18 years who were diagnosed with diabetes between 2009 and 2018. Our validation cohort comprised U.K. patients aged ≥35 years who were diagnosed with diabetes between 2007 and 2017. Primary outcome was incident HF. Sex-stratified multivariable Fine and Gray subdistribution hazard models were constructed, with death as a competing event.

RESULTS

A total of 348,027 Ontarians (45% women) and 54,483 U.K. residents (45% women) were included. At 1, 5, and 9 years, respectively, in the external validation cohort, the C-statistics were 0.81 (95% CI 0.79–0.84), 0.79 (0.77–0.80), and 0.78 (0.76–0.79) for the female-specific model; and 0.78 (0.75–0.80), 0.77 (0.76–0.79), and 0.77 (0.75–0.79) for the male-specific model. The models were well-calibrated. Age, rurality, hypertension duration, hemoglobin, HbA1c, and cardiovascular diseases were common predictors in both sexes. Additionally, mood disorder and alcoholism (heavy drinker) were female-specific predictors, while income and liver disease were male-specific predictors.

CONCLUSIONS

Our findings highlight the importance of developing sex-specific models and represent an important step toward personalized lifestyle and pharmacologic prevention of future HF development.

Introduction

Over the last four decades, the number of adults with diabetes almost quadrupled worldwide (1). Having diabetes more than doubles an individual’s risk of developing cardiovascular disease (CVD) (2), which is a leading cause of mortality (3) accounting for half of all deaths in patients with diabetes (4). Heart failure (HF) is an important sequela of diabetes through accelerated atherosclerosis and other direct cellular mechanisms (5) and is increasingly being recognized for its role in the cardiovascular morbidity and mortality seen in this population (6). Data from clinical trials suggest that sodium–glucose cotransporter 2 inhibitors (SGLT-2i) may reduce the likelihood of incident HF in patients with diabetes (7). The ability to accurately identify individuals at risk for developing HF provides an opportunity for personalized preventative therapy, potentially reducing the CV disease burden in patients with diabetes.

Although several risk models have been developed to predict the onset of HF in patients with diabetes, the majority are based on patients with prevalent instead of new-onset diabetes (8), potentially missing the maximal window of opportunity for personalized prevention. Additionally, existing risk scores are based on clinical trial or aggregated cohort study data that lack real-world representation. They also fail to address the fundamental differences in HF risk, risk factors, and outcomes in women and men. Women with type 2 diabetes have been found to be at higher risk of developing HF than men. A meta-analysis of 14 studies encompassing >12 million individuals found that diabetes conferred 38% excess risk of HF, as well as a greater excess risk of all-cause and cardiovascular death in women (9). These data emphasize the need for sex-specific approaches to risk stratification and management of patients with diabetes (10).

Given this need, we used population-based administrative data in Ontario, Canada to derive clinical risk models to predict the onset of HF in adult women and men at the time of diabetes diagnosis and externally validated these models on a concurrent cohort of patients in the U.K.

Research Design and Methods

Design and Selection Criteria

Included in this retrospective cohort study were adult patients ≥18 years of age who were newly diagnosed with diabetes. Those who were ≥105 years of age, were long-term care residents, or were dialysis-dependent at the time of diabetes diagnosis were excluded. Patients with no HbA1c measurement in the window from 60 days before to 30 days after diabetes diagnosis and those already diagnosed with HF at the time of diabetes diagnosis were also excluded.

Data Sources and Patient Population

Ontario Cohort

The Ontario cohort consisted of all patients with incident diabetes diagnosed between 1 April 2009 and 31 March 2018. Ontario is the most populous province in Canada with 13 million residents and one of the most ethnically diverse jurisdictions in the world. We used population-level administrative health care databases that are held securely in coded form and analyzed at the Institute for Clinical Evaluative Sciences (ICES). ICES is an independent, nonprofit research institute whose legal status under Ontario’s health information privacy law allows it to collect and analyze health care and demographic data, without consent, for health system evaluation and improvement.

Incident cases of diabetes were identified using a validated algorithm, based on one inpatient or two outpatient physician service claims for diabetes within 2 years. This algorithm was shown to have 86% sensitivity and 97% specificity for identifying the onset of any nongestational diabetes when validated in primary care patient records (11). We linked these records with the Registered Persons Database (demographic and vital statistics), the Canadian Institute for Health Information (CIHI) Discharge Abstract Database (hospitalizations and comorbidities), and the Same Day Surgery database (comorbidities). Physician service claims data were obtained from the Ontario Health Insurance Plan database and laboratory values from the Ontario Laboratory Information System. These databases have been validated for many outcomes, exposures, and comorbidities (11,12).
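The claims-based case definition above (one inpatient claim, or two outpatient claims within 2 years) can be illustrated with a minimal Python sketch. The record layout, the function name, the 730-day reading of "2 years," and the choice to return the date on which the criterion is first satisfied are all assumptions made for illustration; the validated algorithm (11) may differ in these details.

```python
from datetime import date, timedelta

# Assumption: "within 2 years" is approximated as 730 days.
WINDOW = timedelta(days=730)

def incident_diabetes_date(claims):
    """Return the date the case definition is first met, or None.

    `claims` is a list of (service_date, setting) tuples, where setting is
    "inpatient" or "outpatient" (hypothetical record layout).
    """
    outpatient_dates = []
    for service_date, setting in sorted(claims):
        if setting == "inpatient":
            # A single inpatient claim is sufficient.
            return service_date
        outpatient_dates.append(service_date)
        # Two outpatient claims no more than WINDOW apart are sufficient.
        # Claims are sorted, so the nearest earlier one is the previous entry.
        if len(outpatient_dates) >= 2 and service_date - outpatient_dates[-2] <= WINDOW:
            return service_date
    return None
```

For example, two outpatient claims three years apart do not qualify on their own, but a third outpatient claim a few months after the second one does.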

We estimated socioeconomic status based on patients’ neighborhood median income in the Canadian census and determined rural versus urban residence using Statistics Canada definitions (13). We identified hypertension (12), asthma, and chronic obstructive pulmonary disease (COPD) (14) using validated algorithms and other comorbidities using the Discharge Abstract Database, Same Day Surgery, and Ontario Health Insurance Plan databases based on ICD-10 Canada codes on patient encounters within 5 years of diabetes diagnosis, using previously described methods (15,16).

U.K. Cohort

The external validation cohort consisted of all patients aged ≥35 years with a diagnostic code for incident type 2 diabetes who met the selection criteria between 1 March 2007 and 31 March 2017 and were registered with general practices in the U.K. eligible for linkage to external data sets. The cohort was derived from the U.K. Clinical Practice Research Datalink (CPRD) GOLD database, one of the world’s largest electronic medical record (EMR) databases, which provides anonymized patient-level data and is deemed representative of the U.K. population (17).

The CPRD GOLD primary care records contain sociodemographic, clinical, therapy, laboratory, and referral information from 1987 onward. We linked the CPRD records of the eligible type 2 diabetes cohort to Hospital Episode Statistics, which contains data on all hospital admissions, the Office for National Statistics mortality data, and the index of multiple deprivation (IMD) quintiles 2015 as a measure for socioeconomic status. IMD is recorded at the patient’s residential postcode level and represents a score calculated as the weighted sum of seven deprivation domains, of which income and employment are the highest contributing domains.

Alcoholism refers to patients deemed to be heavy drinkers by an algorithm that categorizes patients’ alcohol consumption status into six categories based on general practice codes (18). The mood disorder definition covered conditions including adjustment disorders, bipolar affective disorders, manic episodes, depersonalization-derealization syndrome, depressive episodes, dissociative amnesia, dissociative stupor, and anxiety disorders.

Outcomes

The primary outcome was incident HF. In the Ontario cohort, this was identified by a validated algorithm with 85% sensitivity and 97% specificity based on one inpatient or two outpatient billing claims for HF within 1 year (19). HF was identified from primary care and linked secondary care Hospital Episode Statistics records in the U.K. cohort using Read and ICD-10 codes, respectively. The validity of cardiovascular diagnoses in CPRD is recognized (20).

Statistical Analysis

Continuous variables were compared with a two-sample t test or Wilcoxon rank sum test where appropriate. Categorical variables were compared with a χ2 test. Outcomes were assessed through 31 March 2019. In Ontario, patients were censored when they died or were no longer eligible for Ontario health insurance. In the U.K., patients were censored when they died or left the CPRD-contributing practice.

Model Development

Model development was based on Ontario data and stratified by sex. We split the female and male cohorts by random selection such that 70% of each cohort was used for derivation and 30% for internal validation (21). The prediction of HF was accomplished using Fine and Gray subdistribution hazard models within a competing risk framework (22). Candidate variables from Table 1 were selected based on the Bayesian information criterion, using backward stepwise elimination with death as the competing event (23).
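For readers less familiar with the competing risk framework, the general form of a Fine and Gray model can be written as follows. This is the standard textbook formulation rather than a specification taken from the fitted models; the notation (event type D, with D = 1 for HF and D = 2 for death, covariate vector x, and baseline terms with subscript 0) is introduced here for illustration only.

```latex
% Subdistribution hazard for HF (D = 1), with death (D = 2) as the competing event
\lambda_1(t \mid \mathbf{x})
  = \lim_{\Delta t \to 0}
    \frac{P\bigl(t \le T < t + \Delta t,\; D = 1 \,\bigm|\, T \ge t \ \text{or}\ (T < t,\ D \ne 1),\ \mathbf{x}\bigr)}{\Delta t}

% Proportional subdistribution hazards model
\lambda_1(t \mid \mathbf{x}) = \lambda_{1,0}(t)\,\exp\bigl(\boldsymbol{\beta}^{\top}\mathbf{x}\bigr)

% Implied cumulative incidence of HF by time t
F_1(t \mid \mathbf{x}) = 1 - \bigl[\,1 - F_{1,0}(t)\,\bigr]^{\exp(\boldsymbol{\beta}^{\top}\mathbf{x})}
```

Under this formulation, the subhazard ratio for a one-unit increase in covariate j is exp(β_j), and patients who die remain in the risk set for HF, which is what links the coefficients directly to cumulative incidence.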

Table 1.

Baseline characteristics by sex in the Ontario and U.K. cohorts

Variable; Women, Ontario (N = 156,572); Women, U.K. (N = 24,664); Men, Ontario (N = 191,455); Men, U.K. (N = 29,819)
Age, mean ± SD, years 57.9 ± 13.6 63.9 ± 13.2 57.2 ± 12.7 61.5 ± 12.2
Age, median (IQR), years 58 (49–67) 64 (54–74) 57 (49–66) 61 (52–70)
Rural residence 14,451 (9.2) NA 19,810 (10.3) NA
Income quintile*
 1 (lowest) 36,711 (23.4) 4,938 (20.0) 39,398 (20.6) 5,387 (18.1)
 2 35,265 (22.5) 5,142 (20.9) 40,332 (21.1) 5,871 (19.7)
 3 32,419 (20.7) 5,168 (21.0) 40,118 (21.0) 6,306 (21.2)
 4 28,860 (18.4) 4,856 (19.7) 38,211 (20.0) 6,281 (21.1)
 5 (highest) 23,317 (14.9) 4,560 (18.5) 33,396 (17.4) 5,974 (20.0)
Formally rostered to FP 135,489 (86.5) NA 160,315 (83.7) NA
Comorbidities
 Hypertension 89,272 (57.0) 13,750 (55.8) 103,906 (54.3) 14,729 (49.4)
 Hypertension duration, mean ± SD, years 10.8 ± 6.7 10.4 ± 8.4 10.1 ± 6.7 9.0 ± 7.7
 Hypertension duration, median (IQR), years 10 (5–16) 9 (4–15) 9 (4–15) 7 (3–13)
 Hypertension duration, years
  No hypertension 67,300 (43.0) 10,914 (44.3) 87,549 (45.7) 15,090 (50.6)
  <10 43,620 (27.9) 7,740 (31.4) 55,566 (29.0) 9,368 (31.4)
  10–20 35,878 (22.9) 4,349 (17.6) 38,296 (20.0) 3,996 (13.4)
  ≥20 9,774 (6.2) 1,661 (6.7) 10,044 (5.2) 1,365 (4.6)
 Ischemic heart disease 4,393 (2.8) 4,379 (17.8) 12,004 (6.3) 6,755 (22.7)
 Recent MI 1,808 (1.2) 31 (0.1) 5,616 (2.9) 55 (0.2)
 Valvular heart disease 440 (0.3) 16 (0.1) 811 (0.4) 17 (0.1)
 AF 1,520 (1.0) 1,264 (5.1) 2,745 (1.4) 1,682 (5.6)
 Previous CABG 445 (0.3) 223 (0.9) 2,236 (1.2) 897 (3.0)
 Previous PCI 1,698 (1.1) 342 (1.4) 5,536 (2.9) 1,015 (3.4)
 History of cardiac testing** 35,576 (22.7) 2,211 (9.0) 51,875 (27.1) 3,692 (12.4)
 Cerebrovascular disease 1,681 (1.1) 1,404 (5.7) 2,602 (1.4) 1,845 (6.2)
 Peripheral arterial disease 700 (0.4) 483 (2.0) 1,479 (0.8) 926 (3.1)
 COPD/asthma 40,207 (25.7) 5,215 (21.1) 39,654 (20.7) 4,635 (15.5)
 Pulmonary circulation disorder 537 (0.3) 459 (1.9) 605 (0.3) 399 (1.3)
 GFR, mean ± SD, mL/min/1.73 m2 88.9 ± 20.2 73.8 ± 19.9 89.4 ± 18.9 79.6 ± 19.4
 GFR, median (IQR), mL/min/1.73 m2 91 (76–103) 73 (60–86) 92 (78–102) 78 (67–92)
 HbA1c, mean ± SD, % (mmol/mol) 7.5 ± 1.9 (58 ± 20.8) 7.7 ± 1.9 (61 ± 20.8) 8.0 ± 2.2 (64 ± 24.0) 8.0 ± 2.1 (64 ± 23.0)
 HbA1c, median (IQR), % (mmol/mol) 7 (6–8) 7 (7–8) 7 (7–9) 7 (7–9)
 Hemoglobin, mean ± SD, g/L 135.2 ± 12.1 136.3 ± 13.0 147.1 ± 12.7 149.2 ± 13.4
 Hemoglobin, median (IQR), g/L 136 (128–143) 137 (129–145) 148 (140–155) 150 (142–158)
 Venous thromboembolism 311 (0.2) 1,179 (4.8) 440 (0.2) 1,013 (3.4)
 Hypothyroidism 1,192 (0.8) 3,356 (13.6) 416 (0.2) 997 (3.3)
 Liver disease 730 (0.5) 289 (1.2) 1,387 (0.7) 421 (1.4)
 Alcohol abuse (heavy drinker) 365 (0.2) 1,051 (4.3) 1,485 (0.8) 3,491 (11.7)
 Dementia 1,619 (1.0) 292 (1.2) 1,550 (0.8) 198 (0.7)
 Depression 1,203 (0.8) 8,264 (33.5) 838 (0.4) 5,593 (18.8)
 Psychosis 280 (0.2) 447 (1.8) 276 (0.1) 451 (1.5)
 Primary cancer 5,516 (3.5) 3,107 (12.6) 5,924 (3.1) 3,198 (10.7)
 Metastatic cancer 1,110 (0.7) 70 (0.3) 778 (0.4) 81 (0.3)
 Paraplegia/hemiplegia 256 (0.2) 63 (0.3) 409 (0.2) 83 (0.3)

Data are N (%) unless otherwise indicated.

CABG, coronary artery bypass grafting; FP, family physician; MI, myocardial infarction; NA, not applicable; PCI, percutaneous coronary intervention.

*IMD quintiles are recorded for the U.K. cohort.

**Cardiac testing includes invasive or computed tomography angiography, nuclear medicine test, cardiac positron emission test, or cardiac stress test (persantine, dobutamine, electrocardiogram, or stress echocardiography).

Rurality and socioeconomic status were missing in <0.1% of patients, glomerular filtration rate (GFR) in 21,078 (6.1%), and hemoglobin in 28,744 (8.3%). We imputed missing values once within the SAS “proc MI” framework, where they were predicted drawing on all candidate covariates using predictive mean matching for continuous variables and logistic regression for categorical variables (24). We examined the association between each continuous variable and the outcome using cubic spline analyses with three knots at percentiles 10, 50, and 90. As the linearity assumption held for all variables, they were entered into the model as continuous values. We validated the model on the remaining 30% of the cohort. We reported subhazard ratios, 95% CIs, and P values for final covariates in each model.
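The spline-based linearity check can be illustrated with a small Python sketch. It assumes a restricted (natural) cubic spline in Harrell's parameterization, which the text does not specify, and the function names are hypothetical. With three knots, the basis has the original variable plus one nonlinear term; a coefficient near zero on that term is consistent with linearity.

```python
import statistics

def knots_10_50_90(values):
    """Knots at the 10th, 50th, and 90th percentiles of `values`."""
    deciles = statistics.quantiles(values, n=10)  # nine cut points: 10th..90th
    return deciles[0], deciles[4], deciles[8]

def rcs_nonlinear_term(x, knots):
    """Single nonlinear basis term of a three-knot restricted cubic spline.

    Assumption: Harrell's truncated-power form, left unscaled. The term is
    zero below the first knot and linear above the last knot.
    """
    t1, t2, t3 = knots

    def pos_cubed(u):
        return max(u, 0.0) ** 3

    return (pos_cubed(x - t1)
            - pos_cubed(x - t2) * (t3 - t1) / (t3 - t2)
            + pos_cubed(x - t3) * (t2 - t1) / (t3 - t2))
```

A model would then include both `x` and `rcs_nonlinear_term(x, knots)` as covariates for each continuous variable being checked.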

Model Evaluation

We evaluated model discrimination using the C-statistic and estimated 95% CIs using 200 bootstraps. We assessed calibration using Brier scores (25) and time-dependent plots of observed versus predicted HF incidence rates within deciles of predicted risk in the validation cohort.
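As a rough illustration of these metrics, the sketch below computes a Brier score, a C-statistic, and a percentile bootstrap interval from 200 resamples. It is deliberately simplified: outcomes are treated as fully observed 0/1 indicators at a fixed time horizon, whereas the time-dependent estimates reported here must also handle censoring and the competing risk of death. All function names are hypothetical.

```python
import random

def brier_score(predicted, observed):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(predicted)

def c_statistic(predicted, observed):
    """Share of (event, non-event) pairs ranked correctly; ties count as half."""
    events = [p for p, y in zip(predicted, observed) if y == 1]
    non_events = [p for p, y in zip(predicted, observed) if y == 0]
    score = 0.0
    for e in events:
        for n in non_events:
            score += 1.0 if e > n else 0.5 if e == n else 0.0
    return score / (len(events) * len(non_events))

def bootstrap_ci(predicted, observed, stat, n_boot=200, seed=0):
    """Percentile 95% interval for `stat` from `n_boot` resamples of patients.

    Note: with `c_statistic`, a resample lacking events or non-events would
    raise ZeroDivisionError; that is unlikely in large cohorts but possible.
    """
    rng = random.Random(seed)
    n = len(predicted)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        estimates.append(stat([predicted[i] for i in idx],
                              [observed[i] for i in idx]))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]
```

Lower Brier scores indicate better overall accuracy, but the score depends on event frequency, so values at 1 year and 9 years are not directly comparable.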

Sensitivity Analysis

We conducted two sensitivity analyses to assess the robustness of our models in clinically relevant settings. First, as SGLT-2i is the recommended first-line therapy for patients with diabetes and CVD, we evaluated the performance of our models in patients without CVD. Specifically, per European and American Diabetes Association guidelines (26,27), we excluded those with a history of CVD, including ischemic heart disease (IHD), cerebrovascular disease, and peripheral arterial disease. Second, we evaluated the performance of our models in predicting incident HF hospitalization.

Analyses were performed using SAS 9.4 (SAS Institute, Cary, NC) and R version 3.5.1 in the Ontario cohort and Stata 16.1 (StataCorp, College Station, TX) in the U.K. cohort, with statistical significance defined by a two-sided P value <0.05.

Ethical Approval

In Ontario, the use of data was authorized under section 45 of Ontario’s Personal Health Information Protection Act, which does not require review by a Research Ethics Board. In the U.K., the study was approved by the Independent Scientific Advisory Committee for the Medicines and Healthcare Products Regulatory Agency Database Research (protocol number 17_168). Generic ethical approval for observational research using CPRD with approval from the Independent Scientific Advisory Committee has been granted by a Health Research Authority Research Ethics Committee (East Midlands–Derby, REC reference number 05/MRE04/87).

Data and Resource Availability

For Ontario, the data set from this study is held securely in coded form at ICES. While legal data sharing agreements between ICES and data providers (e.g., health care organizations and government) prohibit ICES from making the data set publicly available, access may be granted to those who meet prespecified criteria for confidential access, available at https://www.ices.on.ca/DAS (e-mail: das@ices.on.ca). The full data set creation plan and underlying analytic code are available from the authors upon request, understanding that the computer programs may rely upon coding templates or macros that are unique to ICES and are therefore either inaccessible or may require modification. In the U.K., access to data can be requested via application to the CPRD.

Results

Derivation and Validation Cohorts

The Ontario cohort was used for derivation and internal validation. No patients were lost to follow-up. Median follow-up for the 348,027 Ontarians with incident diabetes (45.0% women) was 5 years (interquartile range [IQR] 3–7), and maximum follow-up was 9 years. In the derivation cohorts, 4,027 (3.7%) among 109,600 women and 5,803 (4.3%) among 134,018 men developed HF during the follow-up period. In the internal validation cohorts, 1,742 (3.7%) among 46,972 women and 2,469 (4.3%) among 57,437 men developed HF.

The external validation cohort comprised 54,483 U.K. residents with incident type 2 diabetes (45.3% women). No patients were lost to follow-up. Median follow-up duration was 5 years (IQR 3–7). Among 24,664 women, 1,107 (4.5%) developed HF. Among 29,819 men, 1,494 (5.0%) developed HF (Supplementary Table 1).

The baseline characteristics of the Ontario and U.K. cohorts were similar within each sex, except that the Ontario patients were younger and less likely to have atrial fibrillation (AF), IHD, cerebrovascular disease, peripheral arterial disease, hypothyroidism, alcoholism, chronic renal disease, or primary malignancy. Ontarians were, however, more likely to have long-standing hypertension, valvular heart disease, or COPD or to undergo cardiac testing, including invasive or computed tomography angiography, nuclear medicine test, cardiac positron emission test, or cardiac stress test (persantine, dobutamine, electrocardiogram, or stress echocardiography) (Table 1).

Female-Specific Model

The multivariable risk factors of HF in women were age, rurality, duration of hypertension, valvular heart disease, IHD, AF, history of cardiac testing, COPD, alcoholism, baseline hemoglobin, HbA1c, GFR, and mood disorder (Table 2).

Table 2.

Multivariable predictors of HF in women and men with diabetes

Variable Coefficient Adjusted SHR (95% CI) P value
Women
 Age, per year 0.05672 1.06 (1.05–1.06) <0.001
 Rural residence 0.18698 1.21 (1.10–1.33) <0.001
 Hypertension duration, years
  No hypertension Reference Reference Reference
  <10 0.23651 1.27 (1.15–1.39) <0.001
  10–20 0.38748 1.47 (1.34–1.62) <0.001
  ≥20 0.45119 1.57 (1.38–1.78) <0.001
 Ischemic heart disease 0.55744 1.75 (1.55–1.96) <0.001
 Valvular heart disease 0.6776 1.97 (1.50–2.58) <0.001
 AF 0.89362 2.44 (2.11–2.83) <0.001
 History of cardiac testing 0.10943 1.12 (1.04–1.20) 0.003
 COPD or asthma 0.46512 1.59 (1.49–1.70) <0.001
 HbA1c, per 1% 0.09776 1.10 (1.09–1.12) <0.001
 Hemoglobin, per 1 g/L −0.00867 0.991 (0.989–0.994) <0.001
 GFR, per 1 mL/min/1.73 m2 −0.01025 0.990 (0.988–0.992) <0.001
 Alcoholism 0.84142 2.32 (1.44–3.75) <0.001
 Mood disorder 0.5566 1.75 (1.31–2.32) <0.001
Men
 Age, per year 0.06 1.06 (1.05–1.06) <0.001
 Rural residence 0.14 1.15 (1.06–1.24) <0.001
 Income quintile
  1 (lowest) Reference Reference Reference
  2 −0.11 0.89 (0.83–0.97) <0.001
  3 −0.15 0.86 (0.80–0.93) <0.001
  4 −0.23 0.79 (0.73–0.86) <0.001
  5 (highest) −0.25 0.77 (0.71–0.84) <0.001
 Hypertension duration, years
  No hypertension Reference Reference Reference
  <10 0.22 1.24 (1.16–1.34) <0.001
  10–20 0.44 1.55 (1.44–1.67) <0.001
  ≥20 0.48 1.62 (1.45–1.81) <0.001
 Ischemic heart disease 0.48 1.62 (1.49–1.77) <0.001
 Valvular heart disease 0.69 1.99 (1.63–2.44) <0.001
 AF 0.64 1.90 (1.68–2.14) <0.001
 Previous CABG −0.31 0.73 (0.62–0.87) <0.001
 History of cardiac testing 0.12 1.13 (1.06–1.20) <0.001
 COPD or asthma 0.37 1.45 (1.37–1.54) <0.001
 HbA1c, per 1% 0.08 1.09 (1.07–1.10) <0.001
 Hemoglobin, per 1 g/L −0.01 0.992 (0.990–0.994) <0.001
 GFR, per 1 mL/min/1.73 m2 −0.01 0.994 (0.992–0.996) <0.001
 Liver disease 0.50 1.65 (1.31–2.09) <0.001

CABG, coronary artery bypass grafting; SHR, subhazard ratio.
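To show how the coefficients in Table 2 combine, the following Python sketch computes a linear predictor for the female-specific model and the relative subhazard between two patients. The dictionary keys and function names are hypothetical, and continuous predictors are assumed to be entered in the units listed in Table 2. Absolute risk cannot be computed from this table alone because it also requires the baseline cumulative incidence, which is not reported in this section.

```python
import math

# Coefficients for women, copied from Table 2 (keys are illustrative names).
FEMALE_COEF = {
    "age_years": 0.05672,
    "rural": 0.18698,
    "htn_under_10y": 0.23651,
    "htn_10_to_20y": 0.38748,
    "htn_20y_or_more": 0.45119,
    "ischemic_heart_disease": 0.55744,
    "valvular_heart_disease": 0.6776,
    "atrial_fibrillation": 0.89362,
    "cardiac_testing": 0.10943,
    "copd_or_asthma": 0.46512,
    "hba1c_percent": 0.09776,
    "hemoglobin": -0.00867,
    "gfr": -0.01025,
    "alcoholism": 0.84142,
    "mood_disorder": 0.5566,
}

def linear_predictor(patient, coef=FEMALE_COEF):
    """Sum of coefficient x value over the predictors supplied for a patient."""
    return sum(coef[name] * value for name, value in patient.items())

def relative_subhazard(patient, reference, coef=FEMALE_COEF):
    """Subhazard of `patient` relative to `reference` under the same model."""
    return math.exp(linear_predictor(patient, coef) - linear_predictor(reference, coef))

# Two otherwise identical 60-year-old women, one with AF and one without.
without_af = {"age_years": 60, "hba1c_percent": 7.5, "atrial_fibrillation": 0}
with_af = {"age_years": 60, "hba1c_percent": 7.5, "atrial_fibrillation": 1}
ratio = relative_subhazard(with_af, without_af)  # equals exp(0.89362)
```

Exponentiating a single coefficient recovers the adjusted SHR in the same row of the table, which is a quick way to check a transcription of the coefficients.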

The performance of the models on derivation and internal and external validation is summarized in Table 3. In the internal validation data set, the C-statistics at 1, 5, and 9 years were 0.79 (95% CI 0.77–0.82), 0.79 (0.78–0.81), and 0.77 (0.76–0.79); and Brier scores were 0.007 (95% CI 0.006–0.008), 0.034 (0.032–0.035), and 0.066 (0.062–0.069), respectively. In the external validation data set, the C-statistics at 1, 5, and 9 years were 0.81 (0.79–0.84), 0.79 (0.77–0.80), and 0.78 (0.76–0.79); and Brier scores were 0.010 (0.009–0.011), 0.038 (0.035–0.041), and 0.067 (0.062–0.071), respectively, indicating excellent calibration. Supplementary Figs. 1–3 show the area under the receiver operating characteristic curves, and Fig. 1 shows the calibration plots of observed versus expected rates of HF at 1, 5, and 9 years after diabetes diagnosis in women according to each decile of risk. The model calibrated well in all but the highest risk decile in the external validation cohort, in which the model tended to overpredict.

Table 3.

Model performance in the derivation and validation data sets

Time, years; Population; Women, C-statistic (95% CI); Women, Brier score (95% CI); Men, C-statistic (95% CI); Men, Brier score (95% CI)
1 Derivation 0.79 (0.77–0.80) 0.007 (0.006–0.007) 0.75 (0.74–0.77) 0.009 (0.008–0.009)
Internal validation 0.79 (0.77–0.82) 0.007 (0.006–0.008) 0.75 (0.73–0.78) 0.008 (0.007–0.009)
External validation 0.81 (0.79–0.84) 0.010 (0.009–0.011) 0.78 (0.75–0.80) 0.011 (0.010–0.012)
2 Derivation 0.80 (0.79–0.81) 0.012 (0.012–0.013) 0.77 (0.76–0.78) 0.016 (0.015–0.016)
Internal validation 0.80 (0.78–0.82) 0.013 (0.012–0.014) 0.77 (0.76–0.79) 0.015 (0.014–0.016)
External validation 0.82 (0.80–0.84) 0.016 (0.015–0.018) 0.78 (0.77–0.80) 0.019 (0.017–0.020)
3 Derivation 0.79 (0.78–0.80) 0.019 (0.018–0.020) 0.77 (0.76–0.78) 0.023 (0.022–0.024)
Internal validation 0.80 (0.78–0.82) 0.019 (0.018–0.021) 0.78 (0.76–0.79) 0.023 (0.021–0.024)
External validation 0.81 (0.79–0.82) 0.023 (0.021–0.025) 0.78 (0.76–0.79) 0.027 (0.025–0.029)
4 Derivation 0.79 (0.78–0.80) 0.026 (0.025–0.027) 0.77 (0.76–0.78) 0.031 (0.030–0.032)
Internal validation 0.80 (0.78–0.81) 0.026 (0.025–0.028) 0.77 (0.76–0.79) 0.030 (0.029–0.032)
External validation 0.80 (0.78–0.82) 0.030 (0.028–0.032) 0.78 (0.76–0.79) 0.036 (0.033–0.038)
5 Derivation 0.79 (0.78–0.80) 0.033 (0.032–0.034) 0.77 (0.76–0.78) 0.039 (0.038–0.040)
Internal validation 0.79 (0.78–0.81) 0.034 (0.032–0.035) 0.78 (0.77–0.79) 0.039 (0.037–0.041)
External validation 0.79 (0.77–0.80) 0.038 (0.035–0.041) 0.77 (0.76–0.79) 0.044 (0.042–0.047)
6 Derivation 0.79 (0.78–0.80) 0.041 (0.039–0.042) 0.77 (0.76–0.77) 0.047 (0.046–0.049)
Internal validation 0.79 (0.78–0.80) 0.040 (0.039–0.042) 0.78 (0.76–0.78) 0.048 (0.046–0.050)
External validation 0.78 (0.77–0.80) 0.045 (0.042–0.048) 0.77 (0.76–0.78) 0.050 (0.047–0.053)
7 Derivation 0.78 (0.77–0.79) 0.048 (0.046–0.050) 0.76 (0.76–0.77) 0.056 (0.055–0.058)
Internal validation 0.79 (0.78–0.80) 0.048 (0.046–0.050) 0.77 (0.76–0.79) 0.056 (0.054–0.059)
External validation 0.78 (0.76–0.79) 0.052 (0.048–0.055) 0.77 (0.76–0.78) 0.057 (0.053–0.059)
8 Derivation 0.78 (0.77–0.79) 0.056 (0.054–0.058) 0.76 (0.76–0.77) 0.066 (0.064–0.067)
Internal validation 0.78 (0.77–0.80) 0.057 (0.054–0.060) 0.76 (0.75–0.78) 0.065 (0.062–0.067)
External validation 0.78 (0.76–0.79) 0.060 (0.056–0.064) 0.77 (0.75–0.78) 0.065 (0.061–0.068)
9 Derivation 0.77 (0.76–0.78) 0.065 (0.063–0.068) 0.76 (0.75–0.77) 0.075 (0.073–0.078)
Internal validation 0.77 (0.76–0.79) 0.066 (0.062–0.069) 0.76 (0.75–0.77) 0.073 (0.070–0.076)
External validation 0.78 (0.76–0.79) 0.067 (0.062–0.071) 0.77 (0.75–0.79) 0.072 (0.068–0.076)

Figure 1.

Calibration plots of observed vs. expected rates of incident HF at 1, 5, and 9 years of follow-up, according to deciles of expected rates in women and men in the Ontario internal validation cohort (A) and U.K. external validation cohort (B).

Male-Specific Model

The multivariable risk factors of HF in men were age, rurality, income quintile, duration of hypertension, valvular heart disease, IHD, history of coronary artery bypass grafting, AF, history of cardiac testing, COPD, liver disease, hemoglobin, HbA1c, and GFR (Table 2).

In the internal validation data set, the C-statistics at 1, 5, and 9 years were 0.75 (0.73–0.78), 0.78 (0.77–0.79), and 0.76 (0.75–0.77); and Brier scores were 0.008 (0.007–0.009), 0.039 (0.037–0.041), and 0.073 (0.070–0.076), respectively. In the external validation data set, the C-statistics at 1, 5, and 9 years were 0.78 (0.75–0.80), 0.77 (0.76–0.79), and 0.77 (0.75–0.79); and Brier scores were 0.011 (0.010–0.012), 0.044 (0.042–0.047), and 0.072 (0.068–0.076), respectively, indicating excellent calibration (Table 3). Supplementary Figs. 1–3 show the area under the receiver operating characteristic curves, and Fig. 1 shows the calibration plot of observed versus expected rates of HF at 1, 5, and 9 years after diabetes diagnosis in men according to each decile of risk. The model calibrated well in all except the highest risk decile in the external validation cohort, in which the model tended to overpredict.

Sensitivity Analysis

Model Performance in Patients Without CVD

For the female-specific model, the C-statistics at 1, 5, and 9 years were 0.82, 0.81, and 0.80, and Brier scores were 0.0010, 0.0082, and 0.022, respectively, in the Ontario cohort. The C-statistics were 0.81, 0.78, and 0.77, and Brier scores were 0.0065, 0.028, and 0.049, respectively, in the U.K. cohort.

For the male-specific model, the C-statistics were 0.78, 0.78, and 0.77, and Brier scores were 0.0011, 0.0086, and 0.022, respectively, in the Ontario cohort. The C-statistics were 0.74, 0.76, and 0.75, and Brier scores were 0.0067, 0.029, and 0.051, respectively, in the U.K. cohort.

Predicting Incident HF Hospitalization

For the female-specific model, the C-statistics at 1, 5, and 9 years were 0.83, 0.82, and 0.81, and Brier scores were 0.0013, 0.0099, and 0.025, respectively, in the Ontario cohort. The C-statistics were 0.81, 0.79, and 0.77, and Brier scores were 0.0098, 0.038, and 0.067, respectively, in the U.K. cohort.

For the male-specific model, the C-statistics were 0.80, 0.80, and 0.78, and the Brier scores were 0.0014, 0.011, and 0.027, respectively, in the Ontario cohort. The C-statistics were 0.78, 0.77, and 0.76, and Brier scores were 0.011, 0.044, and 0.072, respectively, in the U.K. cohort.

The Sex-Specific PARFAIT Risk Calculators

Our sex-specific models are together termed the Predicting Heart Failure in Diabetes (PARFAIT) models. These models have been adapted into risk calculators, which are provided in the Supplementary Material.

Conclusions

Main Findings

We derived and validated sex-specific models to predict incident HF in adults with new-onset diabetes. We observed that 3.7–4.5% of women and 4.3–5% of men with diabetes developed HF over the study period and that age, duration of hypertension, GFR, hemoglobin, HbA1c, and prevalent CVD are risk factors of HF common to both sexes. Specific to women, mood disorder and alcoholism are additional HF risk factors, while income and liver disease are male-specific risk factors. The performance of our models was robust over a 9-year follow-up. We have enclosed automated risk calculators to make these models readily applicable in clinical settings.

Alleviating the Burden of HF in Patients With Diabetes

The prevalence of diabetes is growing globally at a rapid pace. A previous U.K. population-based study reported that patients with type 2 diabetes are more than twice as likely to have HF as their age-sex-practice–matched comparators without diabetes (men, odds ratio 2.12 [95% CI 1.76–2.54]; women, odds ratio 2.27 [1.81–2.85]) (28). Given HF’s role in the development of disability (29) and other adverse long-term outcomes (16,30), the ability to predict HF risk will inform timely and personalized preventative therapy. Lifestyle modification and other interventions, such as SGLT-2i, have cardioprotective benefits, and the latter is associated with ∼25% reduction in HF hospitalizations and cardiovascular death (7,31). Our work describes new high-risk features in the development of HF and highlights the importance of sex-specific models in predicting future risk, particularly when risk factors may vary by sex. Additionally, our models exhibited excellent performance in predicting severe HF requiring hospitalization, particularly among patients for whom prophylactic SGLT-2i is not routinely recommended by guidelines. These features may play an important role in further reducing population-level risk of HF.

Findings in Comparison With Other Studies

Many existing risk scores were derived and validated in smaller cohorts and lack robust external validation, which effectively limits their applicability in the real world (32–36). A number of these models were based on older clinical trial data and predicted a variety of HF-related outcomes instead of HF onset: the WATCH-DM risk score was derived using machine-learning methods based on data from 1999 to 2009 and predicted HF in patients with prevalent type 2 diabetes (C-statistic 0.77 on internal validation) (36); the Thrombolysis in Myocardial Infarction (TIMI) Risk Score for HF in Diabetes (TRS-HFDM) predicted HF hospitalization (C-statistic 0.78) (32); Pfister et al. (35) calculated the risk for HF in people with advanced type 2 diabetes complicated by macrovascular disease (C-statistic 0.75); and the UK Prospective Diabetes Study Outcomes Model (UKPDS-OM) estimated the absolute probability of first occurrence of seven major diabetes-related complications, HF being one of them (33). Pandey et al. (34) used data from three cohort studies to predict HF among patients with prediabetes or prevalent diabetes using a biomarker-based risk score (C-statistic 0.74). A recent HF hospitalization risk model was based on EMR data of 54,452 predominantly Caucasian patients with incident or prevalent type 2 diabetes from a U.S.-based single-payer system (C-statistic 0.782) (37). In contrast, our models were derived from an ethnically diverse, contemporary population of >250,000 patients within a universal health care system. Our models were validated externally across continents, demonstrating excellent performance at all time points. In addition to their demonstrated applicability around the world, these models uniquely apply at the onset of diabetes and thus afford a larger window of opportunity for HF prevention (8,35).

Notably, most published HF risk models in patients with diabetes are non–sex-specific, in spite of known sex differences in cardiovascular risk attributable to diabetes (9,10) and the differences in HF risk factors and outcomes in women and men (16,30,38). The only exceptions to this are models derived by Hippisley-Cox and Coupland (39), which were not specific to patients with incident diabetes, incorporated the exact same predictors in both sexes, and lacked external validation. These models had similar performance to ours (C-statistics 0.769 in men and 0.783 in women).

The availability of long-term follow-up data is essential to the clinical applicability of models that predict the onset of chronic disease. Our follow-up duration exceeds that of most similar studies. The derivation cohort by Pfister et al. (35) had a mean follow-up of 34.5 months. A study on the posttrial monitoring data for the UKPDS-OM risk score found that the HF prediction model performed well in the first 3 years but overpredicted at 10 years (40). Our models demonstrated consistent performance throughout 9-year follow-up and are based on routinely collected data, which demonstrates generalizability to jurisdictions with and without established EMR systems.
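Overprediction at a given horizon, as described above for the UKPDS-OM, can be checked by comparing the observed cumulative incidence of HF with the mean predicted risk at that horizon. The sketch below estimates the observed incidence with the Aalen–Johansen estimator, treating death as a competing event. It is an assumption-laden illustration, not the calibration analysis performed in this study; the function names and the status coding (0 = censored, 1 = HF, 2 = death) are hypothetical.

```python
def cumulative_incidence(times, status, horizon):
    """Aalen-Johansen estimate of P(event of interest by `horizon`).

    status: 0 = censored, 1 = event of interest, 2 = competing event.
    """
    data = sorted(zip(times, status))
    at_risk = len(data)
    surv = 1.0  # probability of being free of any event so far
    cif = 0.0   # cumulative incidence of the event of interest
    k = 0
    while k < len(data) and data[k][0] <= horizon:
        t = data[k][0]
        d1 = d2 = censored = 0
        # Count everything that happens at this distinct time point.
        while k < len(data) and data[k][0] == t:
            s = data[k][1]
            d1 += int(s == 1)
            d2 += int(s == 2)
            censored += int(s == 0)
            k += 1
        cif += surv * d1 / at_risk
        surv *= 1.0 - (d1 + d2) / at_risk
        at_risk -= d1 + d2 + censored
    return cif


def observed_expected_ratio(times, status, predicted, horizon):
    """Observed incidence divided by mean predicted risk at `horizon`."""
    observed = cumulative_incidence(times, status, horizon)
    expected = sum(predicted) / len(predicted)
    return observed / expected


# Toy data: five patients followed for up to 5 years.
toy_times = [1, 2, 3, 4, 5]
toy_status = [1, 2, 0, 1, 0]
toy_predicted = [0.25] * 5
print(observed_expected_ratio(toy_times, toy_status, toy_predicted, 5))
```

A ratio below 1 indicates that the model predicts more events than were observed (overprediction), and a ratio above 1 indicates underprediction. Repeating the check at several horizons shows whether calibration drifts with longer follow-up.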

A number of risk factors have been reported in association with the future risk of HF in patients with diabetes, including glycemic control, CVD, or cardiovascular risk factors (37), renal function (35), and sociodemographic factors such as age and income (39,41). Our models included these variables and also highlighted other high-risk features such as hemoglobin, COPD, and alcoholism. Additionally, our models are based on simple data elements that are readily available in the primary care setting, in which diabetes is most frequently diagnosed and managed. This makes our models widely applicable in the community. Our report of sex-specific risk factors is important and is, to our knowledge, a first step toward personalized preventative medicine.

Limitations

Our study has several limitations. First, our derivation cohort contained a mixed population with type 1 and type 2 diabetes. However, these models performed well in an external validation cohort composed exclusively of patients with type 2 diabetes. Second, our data sources do not capture diabetes subtype, nor do they routinely capture measures of physical activity and other lifestyle factors that may have important roles in the development of incident HF. Third, as drug coverage is only publicly funded for Ontarians >65 years of age, we were unable to incorporate medications into the development of our model.

Conclusions

We developed and externally validated sex-specific risk models to predict long-term HF risk in patients with new-onset diabetes to maximize the window of opportunity for preventative therapy. Our models demonstrated robust performance over a 9-year follow-up. Our identification of sex-specific risk factors, together with the ability of our models to additionally predict severe HF requiring hospitalization and to predict HF among patients for whom prophylactic SGLT-2i is not routinely prescribed under current guidelines, represents an important step toward personalized lifestyle and pharmacologic prevention that could potentially allow millions of patients with diabetes to live longer and better.

Article Information

Acknowledgments. The authors acknowledge the usage of data compiled and provided by the CIHI. These data sets were linked using unique encoded identifiers and analyzed at ICES.

The analyses, conclusions, opinions, and statements expressed in the manuscript are those of the authors and do not necessarily reflect those of the above agencies.

Funding. This study was supported by the Canadian Institutes of Health Research. L.Y.S. was named National New Investigator by the Heart and Stroke Foundation of Canada and is supported by a Tier 2 Clinical Research Chair in Big Data and Cardiovascular Outcomes at the University of Ottawa. D.S.L. is supported by a Mid-Career Investigator Award from the Heart and Stroke Foundation. K.T. received a research scholar award from the Department of Family and Community Medicine at the University of Toronto. This study was also supported by ICES, which is funded by an annual grant from the Ontario Ministry of Health (MOH) and the Ministry of Long-Term Care. Parts of this material are based on data and information compiled and provided by the CIHI. The authors acknowledge that the clinical registry data used in this analysis are from participating hospitals through CorHealth Ontario, which serves as an advisory body to the MOH, is funded by the MOH, and is dedicated to improving the quality, efficiency, access, and equity in the delivery of the continuum of adult cardiac and stroke care in Ontario, Canada. The U.K. cohort study was funded by the National Institute for Health Research School for Primary Care Research.

The analyses, conclusions, opinions and statements expressed in this study are solely those of the authors and do not reflect those of the funding or data sources; no endorsement is intended or should be inferred.

Duality of Interest. No potential conflicts of interest relevant to this article were reported.

Author Contributions. L.Y.S. was responsible for conception and design. L.Y.S., S.S.Z., and A.B.E. were responsible for data acquisition and analysis. L.Y.S., S.S.Z., A.B.E., P.P.L., D.S.L., K.T., S.W.T., E.K., and M.A.M. were responsible for interpretation of data. L.Y.S. and S.S.Z. drafted the manuscript. L.Y.S., S.S.Z., A.B.E., P.P.L., D.S.L., K.T., S.W.T., E.K., and M.A.M. were responsible for critical revision and final approval. L.Y.S. is the guarantor of this work and, as such, had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Footnotes

This article contains supplementary material online at https://doi.org/10.2337/figshare.20736862.

L.Y.S. and S.S.Z. are co-primary authors.

References

  • 1. NCD Risk Factor Collaboration (NCD-RisC). Worldwide trends in diabetes since 1980: a pooled analysis of 751 population-based studies with 4.4 million participants. Lancet 2016;387:1513–1530
  • 2. Dei Cas A, Khan SS, Butler J, et al. Impact of diabetes on epidemiology, treatment, and outcomes of patients with heart failure. JACC Heart Fail 2015;3:136–145
  • 3. Barnett KN, Ogston SA, McMurdo ME, Morris AD, Evans JM. A 12-year follow-up study of all-cause and cardiovascular mortality among 10,532 people newly diagnosed with type 2 diabetes in Tayside, Scotland. Diabet Med 2010;27:1124–1129
  • 4. Morrish NJ, Wang SL, Stevens LK, Fuller JH, Keen H. Mortality and causes of death in the WHO Multinational Study of Vascular Disease in Diabetes. Diabetologia 2001;44(Suppl. 2):S14–S21
  • 5. Dunlay SM, Givertz MM, Aguilar D, et al.; American Heart Association Heart Failure and Transplantation Committee of the Council on Clinical Cardiology; Council on Cardiovascular and Stroke Nursing; and the Heart Failure Society of America. Type 2 diabetes mellitus and heart failure: a scientific statement from the American Heart Association and the Heart Failure Society of America: this statement does not represent an update of the 2017 ACC/AHA/HFSA heart failure guideline update. Circulation 2019;140:e294–e324
  • 6. Cavender MA, Steg PG, Smith SC Jr, et al.; REACH Registry Investigators. Impact of diabetes mellitus on hospitalization for heart failure, cardiovascular events, and death: outcomes at 4 years from the Reduction of Atherothrombosis for Continued Health (REACH) Registry. Circulation 2015;132:923–931
  • 7. Wiviott SD, Raz I, Bonaca MP, et al.; DECLARE–TIMI 58 Investigators. Dapagliflozin and cardiovascular outcomes in type 2 diabetes. N Engl J Med 2019;380:347–357
  • 8. Wang Y, Negishi T, Negishi K, Marwick TH. Prediction of heart failure in patients with type 2 diabetes mellitus- a systematic review and meta-analysis. Diabetes Res Clin Pract 2015;108:55–66
  • 9. Xu G, You D, Wong L, et al. Risk of all-cause and CHD mortality in women versus men with type 2 diabetes: a systematic review and meta-analysis. Eur J Endocrinol 2019;180:243–255
  • 10. Ohkuma T, Komorita Y, Peters SAE, Woodward M. Diabetes as a risk factor for heart failure in women and men: a systematic review and meta-analysis of 47 cohorts including 12 million individuals. Diabetologia 2019;62:1550–1560
  • 11. Hux JE, Ivis F, Flintoft V, Bica A. Diabetes in Ontario: determination of prevalence and incidence using a validated administrative data algorithm. Diabetes Care 2002;25:512–516
  • 12. Tu K, Campbell NRC, Chen ZL, Cauch-Dudek KJ, McAlister FA. Accuracy of administrative databases in identifying patients with hypertension. Open Med 2007;1:e18–e26
  • 13. du Plessis V, Beshiri R, Bollman RD, Clemeson H. Definitions of Rural. Agriculture and Rural Working Paper Series. Vol. 3, No. 3. Statistics Canada, 2002. Accessed 5 September 2022. Available from https://www150.statcan.gc.ca/n1/en/pub/21-006-x/21-006-x2001003-eng.pdf?st=mt1kVOj6
  • 14. Gershon AS, Wang C, Guan J, Vasilevska-Ristovska J, Cicutto L, To T. Identifying individuals with physician diagnosed COPD in health administrative databases. COPD 2009;6:388–394
  • 15. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care 2005;43:1130–1139
  • 16. Sun LY, Tu JV, Sherrard H, et al. Sex-specific trends in incidence and mortality for urban and rural ambulatory patients with heart failure in Eastern Ontario from 1994 to 2013. J Card Fail 2018;24:568–574
  • 17. Herrett E, Gallagher AM, Bhaskaran K, et al. Data resource profile: Clinical Practice Research Datalink (CPRD). Int J Epidemiol 2015;44:827–836
  • 18. Parisi R, Webb RT, Carr MJ, et al. Alcohol-related mortality in patients with psoriasis: a population-based cohort study. JAMA Dermatol 2017;153:1256–1262
  • 19. Schultz SE, Rothwell DM, Chen Z, Tu K. Identifying cases of congestive heart failure from administrative data: a validation study using primary care patient records. Chronic Dis Inj Can 2013;33:160–166
  • 20. Khan NF, Harrison SE, Rose PW. Validity of diagnostic coding within the General Practice Research Database: a systematic review. Br J Gen Pract 2010;60:e128–e136
  • 21. Lee TH, Marcantonio ER, Mangione CM, et al. Derivation and prospective validation of a simple index for prediction of cardiac risk of major noncardiac surgery. Circulation 1999;100:1043–1049
  • 22. Gray RJ. A class of K-sample tests for comparing the cumulative incidence of a competing risk. Ann Stat 1988;16:1141–1154
  • 23. Harrell FE Jr, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 1996;15:361–387
  • 24. Rubin DB, Schenker N. Multiple imputation in health-care databases: an overview and some applications. Stat Med 1991;10:585–598
  • 25. Brier G. Verification of forecasts expressed in terms of probability. Mon Weather Rev 1950;78:1–3
  • 26. Cosentino F, Grant PJ, Aboyans V, et al.; ESC Scientific Document Group. 2019 ESC Guidelines on diabetes, pre-diabetes, and cardiovascular diseases developed in collaboration with the EASD. Eur Heart J 2020;41:255–323
  • 27. American Diabetes Association. 10. Cardiovascular disease and risk management: Standards of Medical Care in Diabetes—2021. Diabetes Care 2021;44(Suppl. 1):S125–S150
  • 28. Zghebi SS, Steinke DT, Rutter MK, Ashcroft DM. Eleven-year multimorbidity burden among 637,255 people with and without type 2 diabetes: a population-based study using primary care and linked hospitalisation data. BMJ Open 2020;10:e033866
  • 29. Sun LY, Eddeen AB, Mesana TG. Disability-free survival after major cardiac surgery: a population-based retrospective cohort study. CMAJ Open 2021;9:E384–E393
  • 30. Sun LY, Mielniczuk LM, Liu PP, et al. Sex-specific temporal trends in ambulatory heart failure incidence, mortality and hospitalisation in Ontario, Canada from 1994 to 2013: a population-based cohort study. BMJ Open 2020;10:e044126
  • 31. Shruti S, Joshi TS, Newby DE, Singh J. Sodium-glucose co-transporter 2 inhibitor therapy: mechanisms of action in heart failure [published correction appears in Heart 2021;107:e15]. Heart 2021;107:1032–1038
  • 32. Berg DD, Wiviott SD, Scirica BM, et al. Heart failure risk stratification and efficacy of sodium-glucose cotransporter-2 inhibitors in patients with type 2 diabetes mellitus. Circulation 2019;140:1569–1577
  • 33. Clarke PM, Gray AM, Briggs A, et al.; UK Prospective Diabetes Study (UKPDS) Group. A model to estimate the lifetime health outcomes of patients with type 2 diabetes: the United Kingdom Prospective Diabetes Study (UKPDS) Outcomes Model (UKPDS no. 68). Diabetologia 2004;47:1747–1759
  • 34. Pandey A, Vaduganathan M, Patel KV, et al. Biomarker-based risk prediction of incident heart failure in pre-diabetes and diabetes. JACC Heart Fail 2021;9:215–223
  • 35. Pfister R, Cairns R, Erdmann E, Schneider CA. A clinical risk score for heart failure in patients with type 2 diabetes and macrovascular disease: an analysis of the PROactive study. Int J Cardiol 2013;162:112–116
  • 36. Segar MW, Vaduganathan M, Patel KV, et al. Machine learning to predict the risk of incident heart failure hospitalization among patients with diabetes: the WATCH-DM risk score. Diabetes Care 2019;42:2298–2306
  • 37. Williams BA, Geba D, Cordova JM, Shetty SS. A risk prediction model for heart failure hospitalization in type 2 diabetes mellitus. Clin Cardiol 2020;43:275–283
  • 38. Sun LY, Tu JV, Coutinho T, et al. Sex differences in outcomes of heart failure in an ambulatory, population-based cohort from 2009 to 2013. CMAJ 2018;190:E848–E854
  • 39. Hippisley-Cox J, Coupland C. Development and validation of risk prediction equations to estimate future risk of heart failure in patients with diabetes: a prospective cohort study. BMJ Open 2015;5:e008503
  • 40. Leal J, Hayes AJ, Gray AM, Holman RR, Clarke PM. Temporal validation of the UKPDS outcomes model using 10-year posttrial monitoring data. Diabetes Care 2013;36:1541–1546
  • 41. Nichols GA, Hillier TA, Erbey JR, Brown JB. Congestive heart failure in type 2 diabetes: prevalence, incidence, and risk factors. Diabetes Care 2001;24:1614–1619

Articles from Diabetes Care are provided here courtesy of American Diabetes Association
