Author manuscript; available in PMC: 2018 Jan 1.
Published in final edited form as: J Diabetes Complications. 2016 Jul 27;31(1):86–93. doi: 10.1016/j.jdiacomp.2016.07.025

Utility of Existing Diabetes Risk Prediction Tools for Young Black and White Adults: Evidence from the Bogalusa Heart Study

Benjamin D Pollock 1, Tian Hu 1, Wei Chen 1, Emily W Harville 1, Shengxu Li 1, Larry S Webber 2, Vivian Fonseca 3, Lydia A Bazzano 1
PMCID: PMC5209262  NIHMSID: NIHMS806249  PMID: 27503406

Abstract

Aims

To evaluate several adult diabetes risk calculation tools for predicting the development of incident diabetes and pre-diabetes in a bi-racial, young adult population.

Methods

Surveys beginning in young adulthood (baseline age ≥18 years) and continuing across multiple decades for 2,122 participants of the Bogalusa Heart Study were used to test the associations of five well-known adult diabetes risk scores with incident diabetes and pre-diabetes using separate Cox models for each risk score. Racial differences were tested within each model. Predictive utility and discrimination were determined for each risk score using the Net Reclassification Improvement (NRI) and Harrell’s c-statistic.

Results

All risk scores were strongly associated (p<.0001) with incident diabetes and pre-diabetes. The Wilson model indicated greater risk of diabetes for blacks versus whites with equivalent risk scores (HR=1.59; 95%CI 1.11,2.28; p=0.01). C-statistics for the diabetes risk models ranged from 0.79–0.83. Non-event NRIs indicated high specificity (non-event NRIs: 76%–88%), but poor sensitivity (event NRIs: −23% to −3%).

Conclusions

Five diabetes risk scores established in middle-aged, racially homogenous adult populations are generally applicable to younger adults with good specificity but poor sensitivity. The addition of race to these models did not result in greater predictive capabilities. A more sensitive risk score to predict diabetes in younger adults is needed.

Keywords: diabetes, risk prediction, young adults, race, pre-diabetes

1. Introduction

Based on recent estimates developed using the National Health and Nutrition Examination Surveys (NHANES), the national prevalence of diagnosed diabetes mellitus continues to rise, currently encompassing about 9% of adults in the United States.(1) An additional 5% of U.S. adults are estimated to have undiagnosed diabetes and nearly 40% of U.S. adults are estimated to be living with pre-diabetes.(1) The diabetes epidemic costs more than $240 billion annually when considering both healthcare expenditures and lost productivity.(2) Several valid risk models have been put forward that reliably predict incident diabetes in specific adult populations.(3–7) These risk models, when properly applied in the correct clinical settings, can identify adults at high risk of developing diabetes who are likely to benefit from preventive measures.

Yet, while it is known that symptomatic Type 2 diabetes most often manifests and is diagnosed during mid to late adulthood, the long asymptomatic period before onset of symptoms provides a broad and clinically relevant time period for earlier identification of high risk individuals before they develop diabetes.(8) Because following participants through decades of young adulthood presents significant epidemiological challenges, few studies to date have attempted to apply or specify diabetes risk models in young adult populations, and no studies have thoroughly considered and modeled the dynamics of time-varying changes in risk scores throughout follow-up.(9–13) Here, we use several decades of detailed data collected from young adulthood through middle age in the Bogalusa Heart Study (BHS) to evaluate the utility of several adult diabetes risk prediction tools in predicting the development of incident diabetes and pre-diabetes.

2. Subjects

Because the diabetes risk scores tested were developed in adult populations, we excluded childhood surveys in which participants were less than 18 years old, resulting in a study population ranging from 18–41 years of age at baseline (mean ± SD; 21.9 ± 4.0). Additionally, participants with diabetes (defined as fasting blood glucose ≥ 126mg/dL or use of medication for diabetes) at baseline were excluded. Finally, for each diabetes risk score tested, patients missing any of the necessary measurements were excluded. After exclusions, the study population consisted of n=2,122 young adult Bogalusa Heart Study participants surveyed from 1978–2010.

3. Materials and Methods

The study design and survey methodology of the Bogalusa Heart Study, the oldest bi-racial cohort study to examine cardiovascular health from childhood through adulthood, have been explained in great detail.(14–16) Briefly, the study was founded in 1973 by Dr. Gerald Berenson and began enrolling a cohort of early school-aged children, with participants completing repeated, cross-sectional surveys every few years. Collection and measurement of clinical and laboratory characteristics such as blood pressure, triglyceride levels, and cholesterol have also been previously described.(17)

3.1 Outcome and exposure definitions

The primary outcome in this study was diagnosis of incident diabetes, again defined as fasting blood glucose ≥ 126mg/dL or use of medication for diabetes. Time-to-diabetes was measured from the midpoint of the year of the baseline survey through the midpoint of the year in which incident diabetes was reported (i.e. a participant completing a baseline survey on any date during 1978 and reporting diabetes in a follow-up survey at any date during 1993 would contribute 15 person-years to the study). For participants who did not develop diabetes, the midpoint of the year in which they last completed a follow-up survey was used in the same manner to calculate time of censorship. As a secondary analysis, we considered new development of either diabetes or pre-diabetes (any fasting blood glucose [FBG] ≥ 100 mg/dL) as an outcome in order to capture a greater number of events in this cohort. For example, if two participants (both without pre-diabetes or diabetes at baseline) in the cohort reported for follow-up surveys with respective FBG values of 105 mg/dL (pre-diabetic) and 150 mg/dL (diabetic), both would be recorded as experiencing the event of interest at that point in time and neither individual would contribute further person-time after this point.
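The outcome definitions and the mid-year person-time convention described above can be illustrated with a minimal Python sketch. The function names `classify_visit` and `person_years` are hypothetical and are not taken from the study's SAS code; only the glucose thresholds and the 1978 to 1993 example come from the text.

```python
def classify_visit(fbg_mg_dl, on_diabetes_medication=False):
    """Classify one survey visit using the thresholds stated in Section 3.1."""
    if fbg_mg_dl >= 126 or on_diabetes_medication:
        return "diabetes"
    if fbg_mg_dl >= 100:
        return "pre-diabetes"
    return "normal"


def person_years(baseline_year, end_year):
    """Time from the midpoint of the baseline year to the midpoint of the
    year of the event (or last survey); the two half-years cancel."""
    return (end_year + 0.5) - (baseline_year + 0.5)


print(person_years(1978, 1993))                    # 15.0
print(classify_visit(105), classify_visit(150))    # pre-diabetes diabetes
```

In the secondary analysis both visits in the last line would count as the event of interest, since either category ends diabetes-free and pre-diabetes-free follow-up.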

Five previously established adult diabetes risk prediction tests were evaluated separately. The risk probabilities established by Griffin et al (Wessex, UK) (3) and Balkau et al (DESIR cohort, France) (5), as well as the risk scores of Wilson et al (Framingham Offspring Study, US) (6), Schulze et al (EPIC-Potsdam, Germany) (4), and Kahn et al (ARIC study, US) (7) were approximated to determine the magnitude and predictive value of their associations with development of incident diabetes in our population. These risk scores were selected based on compatibility and availability of similar Bogalusa Heart Study risk factors. These five unique risk scores differ slightly in their composition, but they generally consist of 5–10 traditional risk factors, with the most common being hypertension, smoking, family history of diabetes, age, and waist circumference (Appendix Table 1).

3.2 Statistical analysis

Baseline characteristics were tabulated using means and standard deviations (continuous variables) or percentages (categorical variables) according to the specific variables used to calculate each of the five adult risk scores. Risk scores were approximated as similarly as possible to the original formulae presented in the five risk score manuscripts.(3–7) The resulting risk approximations used here are presented in Appendix Table 1.

To model the associations between each of the five risk scores and the development of incident diabetes, five individual Cox proportional hazards conditional counting process models were formed with the risk approximation as the independent variable, fitted using a five-knot restricted cubic spline function. Traditional analyses have used only a baseline risk assessment to predict diabetes, which may be appropriate for studies of short duration. However, our lengthy follow-up allows for a more thorough analysis over a longer period of time by considering changes in health status and re-calculation of diabetes risk at each point in follow-up. Thus, the counting process model was chosen in order to allow for repeated time-varying modeling of the risk approximations, which not only extends upon the traditional Cox model, but is also more rigorous than logistic regression which completely ignores any element of time.(18) This provides a more realistic and robust portrait of true risk at each stage of life for each participant. The proportional hazards assumptions were examined with plots of Schoenfeld’s residuals. Finally, race (white or black) was added as an independent variable to each Cox model and a race*risk interaction term was tested.
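To make the counting process idea concrete, the sketch below shows one plausible way to lay out a participant's repeated surveys as (start, stop] intervals, each carrying the risk approximation measured at the start of the interval. This layout is an assumption for illustration; the function name `to_counting_process` and the example scores are hypothetical, and the source does not describe the exact data structure used in SAS.

```python
def to_counting_process(surveys, event_year=None):
    """Convert one participant's surveys into counting-process rows.

    surveys: list of (survey_year, risk_score), sorted by year.
    event_year: year in which the outcome was first reported, if any.
    Returns rows of (start, stop, risk_score, event), with time in years
    since baseline. The score measured at the start of an interval is
    carried forward until the next survey (an illustrative assumption).
    """
    baseline_year = surveys[0][0]
    rows = []
    for (year, score), (next_year, _next_score) in zip(surveys, surveys[1:]):
        event = int(event_year is not None and next_year == event_year)
        rows.append((year - baseline_year, next_year - baseline_year, score, event))
        if event:
            break  # no further person-time after the event
    return rows


# Hypothetical participant: surveyed in 1978, 1985 and 1993, diabetes reported in 1993.
for row in to_counting_process([(1978, 2.1), (1985, 3.4), (1993, 6.0)], event_year=1993):
    print(row)
# (0, 7, 2.1, 0)
# (7, 15, 3.4, 1)
```

A baseline-only analysis would instead use a single row, (0, 15, 2.1, 1), ignoring the updated score at the 1985 survey.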

Identical statistical methods were used to generate four pre-diabetes models; the Wilson risk score was excluded from this secondary analysis because fasting glucose is used as a covariate in the calculation of the Wilson risk score.

Additionally, Harrell’s c-statistic for survival data was calculated to measure discrimination, and the Net Reclassification Improvement (NRI) was calculated to compare the predictive ability of each risk model versus the unadjusted diabetes-free survival.(19–21) The NRI was calculated versus a null model of unadjusted Kaplan-Meier survival estimates. The percentage of events (or non-events) correctly reclassified ranges from −100% to 100%, with a negative percentage indicating that the number of events incorrectly reclassified was greater than the number of events correctly reclassified – i.e. −50% indicates that for every 4 events, a risk model correctly predicts an increased probability of diabetes for 1 event, but incorrectly predicts a decreased probability of diabetes for 3 events. Finally, a sensitivity analysis was done using Markov Chain Monte Carlo (MCMC) multiple imputation to compute all missing risk approximations in the full study population (n=2,122).(22) This sensitivity analysis allows us to analyze complete data across all five models. All analyses were conducted using SAS 9.3 (SAS Institute, Cary, NC).
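The two performance measures can be sketched in a few lines of Python. This is a simplified illustration, not the authors' SAS implementation: `nri_components` assumes each participant is simply counted as moved up or down in predicted risk relative to the null model, and `harrell_c` uses a single fixed score per participant rather than time-varying scores. Both function names and the toy data are hypothetical.

```python
def nri_components(events_up, events_down, n_events,
                   nonevents_up, nonevents_down, n_nonevents):
    """Return (event NRI, non-event NRI) as proportions.

    For events, moving up in predicted risk is correct; for non-events,
    moving down is correct.
    """
    event_nri = (events_up - events_down) / n_events
    nonevent_nri = (nonevents_down - nonevents_up) / n_nonevents
    return event_nri, nonevent_nri


def harrell_c(times, events, scores):
    """Harrell's c for right-censored data with one score per participant.

    A pair (i, j) is comparable when i had the event before j's follow-up
    ended; it is concordant when i also had the higher score.
    """
    concordant = tied = comparable = 0
    for i in range(len(times)):
        for j in range(len(times)):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / comparable


# The text's example: of 4 events, 1 is moved up and 3 are moved down.
event_nri, nonevent_nri = nri_components(1, 3, 4, 1, 9, 10)
print(event_nri)      # -0.5
print(nonevent_nri)   # 0.8

# Toy data: four participants, two events.
print(harrell_c([5, 10, 15, 20], [1, 0, 1, 0], [0.9, 0.4, 0.1, 0.2]))  # 0.75
```

The overall NRI is the sum of the two components, so a model can reach a high overall NRI through the non-event term alone.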

4. Results

The baseline measurements of the study populations used to calculate each risk score are shown in Table 1. From our cohort of n=2,122 participants who completed a baseline survey, an additional 6,252 follow-up surveys were completed (mean=2.95 follow-up surveys per participant). There were 220 (10.4%) participants who became pre-diabetic and 125 (5.9%) participants who developed incident diabetes during follow-up. Median follow-up time was 14 years (mean±SD; 14.7±8.4 years), and total follow-up time was 31,261 person-years. Baseline risk approximations for each of the five risk scores are shown in Table 2. All five risk approximations were strongly associated (p<0.0001) with development of diabetes through 30 years of follow-up (Figure 1). The four pre-diabetes risk models were also significantly associated (p<0.0001) with 30-year development of pre-diabetes.
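The summary figures in this paragraph follow directly from the reported counts, as the short check below shows. The crude incidence rate in the last line is not reported in the text; it is derived here only as an illustration from the stated event count and person-years.

```python
n_participants = 2122
follow_up_surveys = 6252
incident_prediabetes = 220
incident_diabetes = 125
total_person_years = 31261

print(round(follow_up_surveys / n_participants, 2))            # 2.95 surveys per participant
print(round(100 * incident_prediabetes / n_participants, 1))   # 10.4 (%)
print(round(100 * incident_diabetes / n_participants, 1))      # 5.9 (%)
print(round(total_person_years / n_participants, 1))           # 14.7 mean years of follow-up

# Derived, not stated in the text: crude diabetes incidence per 1,000 person-years.
print(round(1000 * incident_diabetes / total_person_years, 1)) # 4.0
```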

Table 1.

Baseline diabetes risk factors from the Bogalusa Heart Study

| Characteristic | Incident diabetes (n=125, 5.9%) | No diabetes (n=1,997, 94.1%) | All participants (n=2,122) |
|---|---|---|---|
| Age, (years) | 22.9 ± 4.7 | 21.8 ± 4.0 | 21.9 ± 4.0 |
| Men | 56 (44.8%) | 878 (44.0%) | 934 (44.0%) |
| Race: White | 70 (56.0%) | 1,346 (67.4%) | 1,416 (66.7%) |
| Race: Black | 55 (44.0%) | 651 (32.6%) | 706 (33.3%) |
| Height, (cm) | 169.1 ± 9.1 | 169.0 ± 9.5 | 169.0 ± 9.4 |
| Weight, (kg) | 83.5 ± 22.2 | 68.6 ± 17.1 | 69.5 ± 17.8 |
| BMI, (kg/m2) | 29.3 ± 7.8* | 23.9 ± 5.2* | 24.2 ± 5.5 |
| Waist circumference, (cm) | 96.3 ± 17.0 | 81.9 ± 14.8 | 82.9 ± 15.4 |
| Hypertension | 9 (7.2%) | 70 (3.5%) | 79 (3.7%) |
| Family history of diabetes: At least one parent | 8 (6.4%) | 96 (4.6%) | 104 (4.9%) |
| Family history of diabetes: At least one sibling | 4 (3.2%) | 52 (2.6%) | 56 (2.6%) |
| Alcohol usage | 5 (4.0%) | 87 (4.4%) | 92 (4.3%) |
| Current smoker | 23 (18.4%) | 373 (18.7%) | 396 (18.7%) |
| HDL-C, (mg/dL) | 46.3 ± 21.3 | 50.6 ± 18.9 | 50.3 ± 19.0 |
| Glucose, (mg/dL) | 89.8 ± 26.6 | 82.1 ± 8.7 | 82.6 ± 10.7 |
| Triglycerides (mg/dL) | 94 (68,164)* | 77 (55,106)* | 77 (56,107)* |
| Blood pressure, (mm Hg) | 116±10 / 75±10 | 112±10 / 71±8 | 112±10 / 72±8 |
| Pulse, (beats per minute) | 56.6 ± 24.1 | 55.1 ± 20.4 | 55.2 ± 20.7 |

Abbreviations: BMI=Body Mass Index; HDL-C=High Density Lipoprotein-Cholesterol;

Continuous variables are shown as mean ± standard deviation, and categorical variables are shown by percentages unless otherwise indicated;

* Median (Q1,Q3);

Table 2.

Baseline diabetes risk factors from the Bogalusa Heart Study used to approximate five adult diabetes risk scores

Characteristic: Griffin analysis(3)
n = 2,122
Wilson analysis(6)
n = 1,998
Schulze analysis(4)
n = 1,821
Balkau analysis(5)
n = 1,573
Kahn analysis(7)
n = 508
Age, (years) 21.9 ± 4.0 22.3 ± 4.2 24.4 ± 4.9
Men 44.0% 32.9% 45.6%
Race
  White
  Black
68.3%
31.7%
Height, (cm) 170.2 ± 9.5
Weight, (kg) 89.2 ± 18.0
BMI, (kg/m2) 24.2 ± 5.5 24.2 ± 5.5
Waist circumference, (cm) 82.2 ± 14.7 81.3 ± 14.8 98.2 ± 11.9
Hypertension 3.7% 3.8% 3.4% 3.2% 6.7%
Family history of diabetes
  At least one parent
  At least one sibling
4.9%
2.6%
5.0%
---------
5.6%
2.9%
8.9%
Alcohol usage 4.9%
Current smoker 18.7% 19.8% 22.9% 20.1%
Physically active   35.8%*
HDL-C, (mg/dL) 50.4 ± 19.3
Glucose, (mg/dL) 82.6 ± 10.7
Triglycerides (mg/dL) 77 (56,106)
Blood pressure, (mm Hg) 112±10 / 72±8
Pulse, (beats per minute) 53.4 ± 21.7
Baseline risk score 2.6% ± 5.3% 4.0 ± 4.3 312 ± 112 5.2% ± 10.5% 33.7 ± 12.9

Abbreviations: BMI=Body Mass Index; HDL-C=High-Density Lipoprotein-Cholesterol;

Continuous variables are shown as mean ± standard deviation, and categorical variables are shown by percentages unless otherwise indicated, and shaded cells indicate factors not used for calculating the risk score in that column;

* Due to >50% missing, physical activity was excluded from the Schulze risk score

Median (Q1,Q3);

Figure 1. Predicted risks of incident diabetes at different follow-up intervals.

Figure 1 shows 5-, 10-, 15-, 25-, and 30-year risks for incident diabetes from the five separate Cox proportional hazards models representing each risk approximation: A.) Griffin (3); B.) Balkau (5); C.) Kahn (7); D.) Schulze (4); E.) Wilson (6).

4.1 Diabetes risk models

For participants at the 75th percentile risk approximation, the Griffin, Wilson, Schulze, and Balkau models predicted a ~10% risk of diabetes after 30 years (10%, 9%, 8%, and 10%, respectively). The Kahn model, which featured a smaller subset of participants (n=508/2,122, 23.9% of cohort) and a greater unadjusted incidence rate, predicted a 29% risk of diabetes after 30 years for participants at the 75th percentile risk approximation. Only the Wilson risk model indicated a significant difference in risk of diabetes by race (Hazard Ratio for Black vs. White =1.59; 95%CI 1.11,2.28; p=0.01), and there was no interaction between race and risk approximation in any of the diabetes risk models (Table 3).

Table 3.

Model fit and performance statistics for Cox proportional time-to-diabetes risk models using established adult diabetes risk score approximations with and without race

| Time to diabetes | Griffin analysis(3), n = 2,122 | Wilson analysis(6), n = 1,998 | Schulze analysis(4), n = 1,821 | Balkau analysis(5), n = 1,573 | Kahn analysis(7), n = 508 |
|---|---|---|---|---|---|
| Risk approximation model: Harrell’s c-statistic (95% CI) (19) | 0.83 (0.67,0.94) | 0.82 (0.66,0.94) | 0.80 (0.64,0.93) | 0.79 (0.63,0.92) | 0.79 (0.68,0.89) |
| Risk approximation model with race: Harrell’s c-statistic (95% CI) (19) | 0.82 (0.67,0.94) | 0.82 (0.66,0.94) | 0.80 (0.64,0.93) | 0.79 (0.63,0.92) | N/A* |
| Risk approximation model: Net Reclassification Improvement(21) | 0.77 (0.60,0.95) | 0.65 (0.48,0.83) | 0.81 (0.63,0.99) | 0.74 (0.56,0.92) | 0.57 (0.38,0.75) |
| % events correctly reclassified | −6% | −23% | −3% | −12% | −19% |
| % non-events correctly reclassified | 83% | 88% | 84% | 86% | 76% |
| p-value | p<.0001 | p<.0001 | p<.0001 | p<.0001 | p<.0001 |
| Risk approximation model with race: Net Reclassification Improvement(21) | 0.71 (0.53,0.88) | 0.62 (0.45,0.80) | 0.83 (0.65,1.01) | 0.76 (0.58,0.94) | N/A |
| % events correctly reclassified (with race) | −12% | −26% | −2% | −10% | |
| % non-events correctly reclassified (with race) | 83% | 89% | 85% | 86% | |
| p-value (with race) | p<.0001 | p<.0001 | p<.0001 | p<.0001 | |
| Hazard ratio Black vs. White (95% CI, p-value) | 1.37 (0.96,1.97 p=0.08) | 1.59 (1.11,2.28 p=0.01) | 1.21 (0.84,1.75 p=0.31) | 1.20 (0.82,1.75 p=0.35) | N/A |
| p-value of race*risk approximation interaction | p=0.15 | p=0.11 | p=0.16 | p=0.10 | |

| Time to glucose≥100mg/dL (pre-diabetes) | Griffin analysis(3), n = 2,122 | Wilson analysis(6), n = 1,998 | Schulze analysis(4), n = 1,821 | Balkau analysis(5), n = 1,573 | Kahn analysis(7), n = 508 |
|---|---|---|---|---|---|
| Risk approximation model: Harrell’s c-statistic (95% CI) (19) | 0.83 (0.72,0.92) | N/A | 0.81 (0.70,0.91) | 0.81 (0.69,0.91) | 0.82 (0.69,0.92) |
| Risk approximation model with race: Harrell’s c-statistic (95% CI) (19) | 0.83 (0.72,0.92) | | 0.80 (0.68,0.89) | 0.81 (0.69,0.90) | N/A |
| Risk approximation model: Net Reclassification Improvement(21) | 1.02 (0.89,1.14) | N/A | 1.03 (0.90,1.16) | 0.96 (0.83,1.09) | 0.69 (0.54,0.84) |
| % events correctly reclassified | 33% | | 34% | 26% | 20% |
| % non-events correctly reclassified | 69% | | 69% | 70% | 49% |
| p-value | p<.0001 | | p<.0001 | p<.0001 | p<.0001 |
| Risk approximation model with race: Net Reclassification Improvement(21) | 1.02 (0.90,1.15) | N/A | 1.02 (0.89,1.15) | 0.98 (0.85,1.11) | N/A |
| % events correctly reclassified (with race) | 34% | | 21% | 28% | |
| % non-events correctly reclassified (with race) | 69% | | 81% | 70% | |
| p-value (with race) | p<.0001 | | p<.0001 | p<.0001 | |
| Hazard Ratio Black vs. White (95% CI, p-value) | 1.18 (0.89,1.54 p=0.26) | N/A | 1.49 (1.12,1.96 p<0.01) | 1.18 (0.88,1.56 p=0.27) | N/A |
| p-value of race*risk approximation interaction | p=0.13 | | p=0.81 | p=0.04 | |

* Kahn risk score already includes race in the risk score approximation

Wilson risk score already includes the pre-diabetic glucose range in the risk score approximation

The c-statistics for the five risk models ranged from 0.79 (Kahn) to 0.83 (Griffin), indicating that all models were reasonably discriminatory in separating participants who developed diabetes from those who did not (Table 3).(23) Similarly, the NRIs for the five risk models ranged from 0.57 (Kahn) to 0.81 (Schulze), confirming the significantly improved discriminatory value of the risk approximations versus a null survival model. The discriminatory capabilities of the five risk models appeared to be overwhelmingly due to their specificity (ability to predict lower risk for those not developing diabetes) versus sensitivity (ability to predict a higher risk for those who developed diabetes), with none of the five models resulting in a positive percentage of improvement for re-classifying those who developed diabetes.
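The specificity-driven pattern can be read directly from Table 3: in each diabetes model the overall NRI equals the sum of the event and non-event percentages, and the event term is negative every time. The sketch below checks this using the values reported in Table 3 for the models without race; the dictionary name `table3_diabetes` is hypothetical.

```python
# (event NRI %, non-event NRI %, reported overall NRI) from Table 3, models without race.
table3_diabetes = {
    "Griffin": (-6, 83, 0.77),
    "Wilson": (-23, 88, 0.65),
    "Schulze": (-3, 84, 0.81),
    "Balkau": (-12, 86, 0.74),
    "Kahn": (-19, 76, 0.57),
}

for name, (event_pct, nonevent_pct, reported_nri) in table3_diabetes.items():
    summed = (event_pct + nonevent_pct) / 100
    # The overall NRI is reproduced entirely by the non-event term minus a penalty.
    print(f"{name}: {event_pct}% + {nonevent_pct}% = {summed:.2f} (reported {reported_nri:.2f})")
```

For example, the Wilson model's overall NRI of 0.65 is 88% of non-events moved in the correct direction offset by a net 23% of events moved in the wrong direction.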

4.2 Pre-diabetes risk models

The four pre-diabetes risk approximations had significant (p<0.0001) associations with development of pre-diabetes, and showed similar discriminatory capabilities to the diabetes models, with c-statistics near 0.8. The Schulze model was the only model to show a significant difference in risk of pre-diabetes by race (Hazard Ratio for Black vs. White =1.49; 95%CI 1.12,1.96; p<0.01), though the Balkau model suggested a small interaction between race and Balkau risk probability (p=0.04). Although less specific than the diabetes risk approximations, the pre-diabetes risk approximations were more sensitive to detecting individuals at increased risk for pre-diabetes, with the larger NRIs in the pre-diabetes models indicating better overall predictive capabilities than the diabetes models. Across all diabetes and pre-diabetes models, the addition of race to each model had negligible effects on model discrimination (as measured by the c-statistic) and predictive value (as measured by NRI) (Table 3).

5. Discussion

Our study is the first to use longitudinal data from young adulthood through mid-life to assess the applicability of multiple existing adult diabetes risk scores in a bi-racial, young adult population. Our results indicated that all five risk scores tested were strongly associated with onset of incident diabetes in our bi-racial population. Furthermore, these risk scores showed good discrimination and in general provided great specificity in predicting lower risks for patients who did not develop diabetes. However, these risk scores lacked sensitivity; therefore an opportunity exists to develop a more sensitive diabetes risk model for young adults.

All five diabetes models performed similarly with c-statistics around 0.8, slightly negative event NRIs, and strongly positive non-event NRIs. It is important to highlight that each risk score was approximated using slightly different subsets of our Bogalusa study population considering the availability and completeness of data needed to calculate each unique risk score. Thus, the model fit statistics are not informative for comparison across models to determine a “best” model, but rather we sought to explore each model separately to determine its potential application in our population. We conducted a sensitivity analysis to overcome this issue by imputing missing risk scores for the cohort. Our sensitivity analysis showed that when the full study population was assessed across all five models, risks were comparable and the exaggerated risks predicted by the Kahn score in the main analysis were no longer present (30-year risks=10%, 17%, 14%, 17%, and 10%, respectively, for the Griffin, Wilson, Schulze, Balkau, and Kahn models). With the caveat that a large amount of data was imputed in this sensitivity analysis, the NRIs from the sensitivity analysis indicated that in the full study population, the Wilson, Kahn, and Balkau risk scores were nearly identical in discriminative ability (NRIs=0.64, 0.66, and 0.64, respectively), while Griffin and Schulze were slightly better (NRIs = 0.77 and 0.80, respectively).

In contrast to the diabetes models tested, all pre-diabetes models showed better utility considering their greater NRIs (NRIs around 1 versus the diabetes NRIs around 0.7–0.8). This difference was notably driven by much greater pre-diabetes event NRIs (approximately 40% absolute increases versus the diabetes event NRIs) while sacrificing only ~15% absolute decreases in pre-diabetes non-event NRIs versus the diabetes non-event NRIs. The use of a pre-diabetes risk score is an important consideration for clinicians serving young adult populations; corrective health measures are better implemented prior to the development of clinical diabetes.

Other studies similar to the present study have varied in their methods and conclusions. In 2007, Mainous and colleagues used a young adult cohort to test an adult diabetes prediction model, developed in those aged 45–64 in the Atherosclerosis Risk in Communities (ARIC) cohort.(9, 10) The authors in that study used only baseline risk factors, and follow-up was limited to ten years. Similarly, the predictive capability of the Finnish Diabetes Risk Score (FINDRISC) was assessed recently for identifying undiagnosed diabetes and pre-diabetes in young American adults. While the FINDRISC score did prove useful in predicting undiagnosed diabetes, the analysis used only cross-sectional data and reported a relatively weak area under the receiver operating characteristic (ROC) curve of 0.67 for finding pre-diabetic participants.(11) Perhaps the most rigorous attempt at predicting diabetes in young adults came from the Coronary Artery Risk Development in Young Adults (CARDIA) study, which used nearly 24 years of follow-up per person in over 2,000 young adults (aged 18–30 at baseline) to form a diabetes prediction tool for young adults that used baseline risk factors and most importantly included genotype.(12) This approach is more sound than using baseline risk factors alone because genotype does not change over time. However, the CARDIA investigators found that the model including genotype did not improve type 2 diabetes risk prediction compared to a model with traditional clinical risk factors (including BMI, race, age, gender, parental history, glucose, triglycerides, and HDL-C), and suggested the need for studies which can model age-varying effects of individual risk factors.(12) In our study, we were able to build five models for incident diabetes using previously validated risk scores as the sole independent covariates, which afforded us the power to model the time-varying effects of these risk scores throughout the study follow-up.
Whereas other studies have not rigorously modeled risk scores to account for changes over time, the Bogalusa Heart Study population afforded us this novel approach.

Another important feature to consider is that four of the five risk scores tested in our study were developed in distinct, racially homogenous adult populations. The Griffin risk score included only white adults from Southern England aged 40–64, while the risk score formed by Schulze et al used German adults aged 35–60.(3, 4) Likewise, the risk score formed by Wilson et al used a similarly aged cohort (mean=54.0) of 99% white, non-Hispanic participants from the Framingham Offspring Study, and Balkau et al developed their risk score in adults aged 30–64 in western France.(5, 6) Only the Kahn risk score, using data from the ARIC study, was formed using a bi-racial population with approximately 22% black participants. Therefore, while complete data for approximating the Kahn score in our population was limited, our study population was most similar to the Kahn population, so we felt inclusion of this score was useful. Our analysis indicated that the predicted risk of diabetes calculated by Wilson et al in white Americans may not apply equally to black Americans – i.e. black participants in our study had nearly 60% higher risk of diabetes than white participants with an identical Wilson risk score. This result is not surprising, as the Wilson risk score is the only score tested that includes triglycerides, and it is well documented that black Americans typically have lower average triglyceride levels than white Americans.(24, 25) Thus, black Americans whose triglycerides are elevated to the same level as those of white Americans may represent a much higher percentile of risk than white Americans of average triglyceride levels. Regardless, the addition of race to the Wilson model still did not alter the model’s overall predictive or discriminatory capability due to a counteracting effect between sensitivity (3% less sensitive for events) and specificity (1% more specific for non-events).

Recently, CARDIA data was used to test diabetes risk models in biracial cohorts [one using the predicted probabilities from the Framingham Offspring Study whereas we used the abbreviated Wilson risk score from the same study (6), and one in Mexican-Americans from the San Antonio Heart Study (26)].(27) The authors found differences in discrimination of risk models when stratified by white versus black participants of CARDIA. However, the mean baseline age of participants was approximately 45 years, representing a much older population than we have used here. Our study has the advantage of testing four additional risk scores in a younger population, and is consistent with the conclusion drawn by Lacy and colleagues that further investigation in cohorts that include minorities is needed.(27)

The strength of our study lies in the use of longitudinal data from multiple surveys in a large, well-established, bi-racial, young adult cohort. Still, our study has certain limitations. First, we chose not to compare or adjust our risk estimates based on socioeconomic status, which is known to have an effect on the incidence of diabetes.(28) We made this choice because socioeconomic indicators have not traditionally been included as factors in diabetes risk scores, and are not used in any of the five common models selected for this study. Thus, while future diabetes prediction models may want to consider the inclusion of socioeconomic factors, most existing models, including those we tested here, do not. Second, our approximation of Schulze’s risk score did not include physical activity (due to missingness). Despite this, the small relative magnitude of physical activity (2 units if physically active) compared with other factors in Schulze’s calculation, such as hypertension (46 units), makes this difference in risk approximation negligible.(4) Likewise, alcohol use (defined in this study as ≥ 1 drink per year) was not assessed in several of the early and adolescent BHS surveys, and to maintain consistency in our definitions across the multiple decades of data analyzed here, we chose not to use the extensive three-survey series of alcohol-specific questionnaire data gathered by BHS investigators during the 1980s.(29) Thus, the low prevalence of alcohol usage reported in our baseline population is assuredly a substantial underestimate of the true prevalence in this population (mean age = 21.9 years). Again, however, this consideration only affects the Schulze risk approximation as it was the only diabetes model tested that considered alcohol usage.
Additionally, alcohol consumption is considered protective in the Schulze risk score, so our approximation conservatively overestimates the true diabetes and pre-diabetes risks. Likewise, our study design does not allow for exact dates of diabetes onset to be determined; however, by approximating all diagnoses to the midpoint of the year of the survey in which diabetes is first reported, the ascertainment is not differential between participants. With regard to attrition experienced in our study (median follow-up ~15 years during the 30-year study period), it is likely that any differential loss to follow-up has made our results more conservative considering the healthy-user bias. Basically, healthier participants are more likely to return for follow-up, so the model discrimination among a healthier population is not as great as if a wider distribution consisting of the study dropouts with greater risk factor counts were included. Lastly, our assessment of familial history of diabetes relies on self-reported familial data not tested for validity in the present study, although it has been previously reported that similar self-reports for family history of diabetes are considerably valid.(30, 31)

5.1 Conclusions

In conclusion, five well-documented diabetes risk scores were tested in our Bogalusa Heart Study population, and all showed significant associations with development of incident diabetes. All models were adequately discriminatory with great specificity but poor sensitivity, and the addition of black race, though associated with higher risk of diabetes or pre-diabetes in certain models, did not change the models’ predictive values. Current diabetes risk scores may be useful for identifying young adults with lower-than-normal risks of diabetes, and may selectively predict diabetes in extremely high risk young adults. Because of the low sensitivity of these established risk scores, there remains an opportunity to develop a new, more sensitive diabetes prediction tool for black and white young adults.

Acknowledgments

The authors would like to thank and acknowledge the Bogalusa Heart Study staff and participants for their contributions to this research.

Funding: This work was supported by grants R01 ES021724 from National Institute of Environmental Health Sciences and R01 AG041200 from the National Institute on Aging.

Appendix A – Approximation methods for adult diabetes risk scores

Risk score: Equation:
  Risk factor variables:   Risk factor variable values:

Griffin risk probability = 1 / [(1)+exp(−1*Griffin risk score)]
  Griffin risk score =(−6.322)+(female*−0.879)+(hypertension*1.22)+griffin_BMI_score+(age*0.063)+griffin_diabetes_score+(smoker*0.855);
  griffin_BMI_score =0 (BMI<25); =0.699 (25≤BMI<27.5); =1.970 (27.5≤BMI<30); =2.518 (BMI≥30);
  griffin_diabetes_score =0 (if all siblings and parents non-diabetic); =0.728 (if sibling or parent has diabetes); =0.753 (if sibling and parent have diabetes);
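
As a rough illustration, the Griffin approximation above can be sketched in Python. The function and argument names are hypothetical, and the family-history coding assumes that "sibling or parent" means exactly one of the two groups is affected; the coefficients are copied from the appendix rows above.

```python
import math


def griffin_bmi_score(bmi: float) -> float:
    """BMI points, using the cut-offs listed in the appendix."""
    if bmi < 25:
        return 0.0
    if bmi < 27.5:
        return 0.699
    if bmi < 30:
        return 1.970
    return 2.518


def griffin_diabetes_score(parent_diabetic: bool, sibling_diabetic: bool) -> float:
    """Family-history points (assumption: 'or' means only one group affected)."""
    if parent_diabetic and sibling_diabetic:
        return 0.753
    if parent_diabetic or sibling_diabetic:
        return 0.728
    return 0.0


def griffin_risk_score(age, female, hypertension, smoker, bmi,
                       parent_diabetic, sibling_diabetic) -> float:
    """Linear predictor; binary inputs are coerced to 0/1."""
    return (
        -6.322
        + int(female) * -0.879
        + int(hypertension) * 1.22
        + griffin_bmi_score(bmi)
        + age * 0.063
        + griffin_diabetes_score(parent_diabetic, sibling_diabetic)
        + int(smoker) * 0.855
    )


def griffin_risk_probability(score: float) -> float:
    """Logistic transform of the risk score."""
    return 1.0 / (1.0 + math.exp(-score))
```

For example, a 30-year-old normotensive, non-smoking man with a BMI of 22 and no family history has a score of −6.322 + 30 × 0.063 = −4.432.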

Wilson risk score =wilson_BMI_score+(parental_diabetes*3)+(hypertension*2)+(glucose*10)+(HDL_C*5)+(triglycerides*3)+blood_pressure;
  wilson_BMI_score =0 (BMI<25); =2 (25≤BMI<30); =5 (BMI≥30);
  parental_diabetes =0 (if no parents diabetic); =1 (if ≥ 1 parent diabetic);
  glucose =0 (glucose<100mg/dL); =1 (100mg/dL≤glucose<126mg/dL);
  HDL_C =0 (male and HDL-C ≥ 40mg/dL / female and HDL-C ≥ 50mg/dL); =1 (male and HDL-C < 40mg/dL / female and HDL-C < 50mg/dL);
  triglycerides =0 (triglycerides < 150mg/dL); =1 (triglycerides ≥ 150mg/dL);
  blood_pressure =0 (<130/85mmHg); =2 (≥130/85mmHg);
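
A minimal Python sketch of the Wilson point score as tabulated above follows. The function and argument names are hypothetical. Two points are assumptions rather than statements from the appendix: elevated blood pressure is read as systolic ≥130 or diastolic ≥85 mmHg, and fasting glucose ≥126 mg/dL, which the table does not cover, is rejected with an error.

```python
def wilson_risk_score(bmi, parental_diabetes, hypertension, glucose_mg_dl,
                      hdl_c_mg_dl, triglycerides_mg_dl,
                      systolic_mm_hg, diastolic_mm_hg, female) -> int:
    """Sum the Wilson points exactly as written in the appendix formula."""
    if glucose_mg_dl >= 126:
        # Assumption: the appendix tabulates glucose only below 126 mg/dL.
        raise ValueError("glucose >= 126 mg/dL is outside the tabulated range")

    bmi_points = 0 if bmi < 25 else (2 if bmi < 30 else 5)
    glucose = 1 if glucose_mg_dl >= 100 else 0
    hdl_cutoff = 50 if female else 40          # sex-specific HDL-C threshold
    hdl_c = 1 if hdl_c_mg_dl < hdl_cutoff else 0
    triglycerides = 1 if triglycerides_mg_dl >= 150 else 0
    # Assumption: ">=130/85" means either component at or above its cut-off.
    elevated_bp = systolic_mm_hg >= 130 or diastolic_mm_hg >= 85
    blood_pressure = 2 if elevated_bp else 0

    return (
        bmi_points
        + int(parental_diabetes) * 3
        + int(hypertension) * 2
        + glucose * 10
        + hdl_c * 5
        + triglycerides * 3
        + blood_pressure
    )
```

Under these assumptions, a man with BMI 27, one diabetic parent, no hypertension, glucose 105 mg/dL, HDL-C 38 mg/dL, triglycerides 160 mg/dL, and blood pressure 120/80 mmHg scores 2 + 3 + 10 + 5 + 3 = 23.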

Schulze risk score =(waist_circumference*7.4)−(height*2.4)+(age*4.3)+(hypertension*46)−(alcohol*20)+ (smoker*64);
  alcohol =0 (if no alcoholic beverages in past 12 months); =1(≥1 alcoholic beverage consumed within 12 months)

Balkau risk probability = 1 / [(1)+exp(−1*Balkau risk score)]
  Balkau risk score (Female) =(balkau_diabetes*1.09)+(waist_circumference*.095)+(hypertension*.64)−11.81;
  Balkau risk score (Male) =(smoker*0.72)+(waist_circumference*.081)+(hypertension*.50)−10.45;
  balkau_diabetes =0 (if all siblings and parents non-diabetic); =1 (if ≥ 1 sibling or parent diabetic);
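
The sex-specific Balkau equations above can be sketched in Python as follows. The function and argument names are hypothetical, and waist circumference is assumed to be in centimetres; the coefficients are those listed in the appendix.

```python
import math


def balkau_risk_score(female, waist_circumference_cm, hypertension,
                      smoker=False, family_diabetes=False) -> float:
    """Sex-specific linear predictor from the appendix.

    Smoking enters only the male equation and family history of diabetes
    (>= 1 sibling or parent) only the female equation.
    """
    if female:
        return (int(family_diabetes) * 1.09
                + waist_circumference_cm * 0.095
                + int(hypertension) * 0.64
                - 11.81)
    return (int(smoker) * 0.72
            + waist_circumference_cm * 0.081
            + int(hypertension) * 0.50
            - 10.45)


def balkau_risk_probability(score: float) -> float:
    """Logistic transform of the risk score."""
    return 1.0 / (1.0 + math.exp(-score))
```

For instance, a non-smoking, normotensive man with a 100 cm waist has a score of 8.1 − 10.45 = −2.35.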

Kahn risk score =(parental_diabetes*10.5)+(hypertension*11)+(African_American*6)+(kahn_age*5)+(smoker*4)+kahn_waist+kahn_height+kahn_pulse+kahn_weight;
  parental_diabetes =0 (if no parents diabetic); =1 (if ≥ 1 parent diabetic);
  kahn_age =0 (if age ≤ 55 years); =1 (if age>55 years);
  kahn_waist (Female) =0 (waist circumference<81cm); =10 (81cm≤waist<88cm); =20 (88cm≤waist<96cm); =26 (96cm≤waist<105cm); =35 (waist≥105cm);
  kahn_waist (Male) =0 (waist circumference<90cm); =10 (90cm≤waist<95cm); =20 (95cm≤waist<100cm); =26 (100cm≤waist<106cm); =35 (waist≥106cm);
  kahn_height (Female) =0 (height≥164cm); =8 (height<157cm); =6 (157cm≤height<161cm); =3 (161cm≤height<164cm);
  kahn_height (Male) =0 (height≥178cm); =8 (height<171cm); =6 (171cm≤height<175cm); =3 (175cm≤height<178cm);
  kahn_pulse (Female) =0 (if pulse<70 beats per minute); =5 (if pulse ≥ 70 beats per minute);
  kahn_pulse (Male) =0 (if pulse<68 beats per minute); =5 (if pulse ≥ 68 beats per minute);
  kahn_weight (Female) =0 (if weight < 72.7kg); =5 (if weight ≥ 72.7kg);
  kahn_weight (Male) =0 (if weight < 86.4kg); =5 (if weight ≥ 86.4kg);
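
The banded Kahn point score above lends itself to a small lookup-table sketch in Python. All names here are hypothetical; the cut-offs and points are copied from the appendix rows, with waist and height in centimetres, pulse in beats per minute, and weight in kilograms.

```python
def _banded(value, bands, top):
    """Return the points of the first band whose upper bound exceeds value."""
    for upper, points in bands:
        if value < upper:
            return points
    return top


# (upper bound, points) pairs in ascending order, then points above the last bound.
KAHN_WAIST = {
    "female": ([(81, 0), (88, 10), (96, 20), (105, 26)], 35),
    "male": ([(90, 0), (95, 10), (100, 20), (106, 26)], 35),
}
KAHN_HEIGHT = {
    "female": ([(157, 8), (161, 6), (164, 3)], 0),
    "male": ([(171, 8), (175, 6), (178, 3)], 0),
}
KAHN_PULSE_CUTOFF = {"female": 70, "male": 68}
KAHN_WEIGHT_CUTOFF = {"female": 72.7, "male": 86.4}


def kahn_risk_score(sex, age, parental_diabetes, hypertension,
                    african_american, smoker,
                    waist_cm, height_cm, pulse_bpm, weight_kg) -> float:
    """Sum the Kahn points; sex must be 'female' or 'male'."""
    if sex not in ("female", "male"):
        raise ValueError("sex must be 'female' or 'male'")
    kahn_age = 1 if age > 55 else 0
    kahn_waist = _banded(waist_cm, *KAHN_WAIST[sex])
    kahn_height = _banded(height_cm, *KAHN_HEIGHT[sex])
    kahn_pulse = 5 if pulse_bpm >= KAHN_PULSE_CUTOFF[sex] else 0
    kahn_weight = 5 if weight_kg >= KAHN_WEIGHT_CUTOFF[sex] else 0
    return (
        int(parental_diabetes) * 10.5
        + int(hypertension) * 11
        + int(african_american) * 6
        + kahn_age * 5
        + int(smoker) * 4
        + kahn_waist + kahn_height + kahn_pulse + kahn_weight
    )
```

Because the age term only scores participants older than 55 years, it contributes nothing for young adults, so their Kahn score is driven by the remaining terms.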

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

The authors report no conflicts of interest.

Author contributions: BDP contributed to the analysis plan, conducted data analyses, and drafted the manuscript. TH contributed to study design, manuscript revision, and interpretation. WC contributed to manuscript review. EWH contributed to manuscript revision and review. SL contributed to manuscript review. LSW contributed to manuscript review and revision, and presentation of results. VF contributed to manuscript review, revision, and interpretation of results. LAB contributed to research idea, study design, analysis plan, manuscript revision, and interpretation.

Guarantor: BDP is the guarantor of this manuscript and takes responsibility for accuracy of data and analyses.

References

  1. Menke A, Casagrande S, Geiss L, Cowie CC. Prevalence of and Trends in Diabetes Among Adults in the United States, 1988–2012. JAMA. 2015;314(10):1021–1029. doi: 10.1001/jama.2015.10029.
  2. American Diabetes Association. Economic costs of diabetes in the U.S. in 2012. Diabetes Care. 2013;36(4):1033–1046. doi: 10.2337/dc12-2625.
  3. Griffin SJ, Little PS, Hales CN, Kinmonth AL, Wareham NJ. Diabetes risk score: towards earlier detection of type 2 diabetes in general practice. Diabetes Metab Res Rev. 2000;16(3):164–171. doi: 10.1002/1520-7560(200005/06)16:3<164::aid-dmrr103>3.0.co;2-r.
  4. Schulze MB, Hoffmann K, Boeing H, Linseisen J, Rohrmann S, Mohlig M, et al. An accurate risk score based on anthropometric, dietary, and lifestyle factors to predict the development of type 2 diabetes. Diabetes Care. 2007;30(3):510–515. doi: 10.2337/dc06-2089.
  5. Balkau B, Lange C, Fezeu L, Tichet J, de Lauzon-Guillain B, Czernichow S, et al. Predicting diabetes: clinical, biological, and genetic approaches: data from the Epidemiological Study on the Insulin Resistance Syndrome (DESIR). Diabetes Care. 2008;31(10):2056–2061. doi: 10.2337/dc08-0368.
  6. Wilson PW, Meigs JB, Sullivan L, Fox CS, Nathan DM, D'Agostino RB Sr. Prediction of incident diabetes mellitus in middle-aged adults: the Framingham Offspring Study. Arch Intern Med. 2007;167(10):1068–1074. doi: 10.1001/archinte.167.10.1068.
  7. Kahn HS, Cheng YJ, Thompson TJ, Imperatore G, Gregg EW. Two risk-scoring systems for predicting incident diabetes mellitus in U.S. adults age 45 to 64 years. Ann Intern Med. 2009;150(11):741–751. doi: 10.7326/0003-4819-150-11-200906020-00002.
  8. American Diabetes Association. Diagnosis and Classification of Diabetes Mellitus. Diabetes Care. 2014;37:S81–S90. doi: 10.2337/dc14-S081.
  9. Schmidt MI, Duncan BB, Bang H, Pankow JS, Ballantyne CM, Golden SH, et al. Identifying individuals at high risk for diabetes - The Atherosclerosis Risk in Communities study. Diabetes Care. 2005;28(8):2013–2018. doi: 10.2337/diacare.28.8.2013.
  10. Mainous AG, Diaz VA, Everett CJ. Assessing risk for development of diabetes in young adults. Annals of Family Medicine. 2007;5(5):425–429. doi: 10.1370/afm.705.
  11. Zhang L, Zhang ZZ, Zhang YR, Hu G, Chen LW. Evaluation of Finnish Diabetes Risk Score in Screening Undiagnosed Diabetes and Prediabetes among US Adults by Gender and Race: NHANES 1999–2010. PLoS One. 2014;9(5). doi: 10.1371/journal.pone.0097865.
  12. Vassy JL, Durant NH, Kabagambe EK, Carnethon MR, Rasmussen-Torvik LJ, Fornage M, et al. A genotype risk score predicts type 2 diabetes from young adulthood: the CARDIA study. Diabetologia. 2012;55(10):2604–2612. doi: 10.1007/s00125-012-2637-7.
  13. Raynor LA, Pankow JS, Duncan BB, Schmidt MI, Hoogeveen RC, Pereira MA, et al. Novel risk factors and the prediction of type 2 diabetes in the Atherosclerosis Risk in Communities (ARIC) study. Diabetes Care. 2013;36(1):70–76. doi: 10.2337/dc12-0609.
  14. Berenson GS, McMahon CA, Voors AW, Webber LS, Srinivasan SR, Frank GC, et al. Cardiovascular risk factors in children-the early natural history of atherosclerosis and essential hypertension. New York: Oxford University Press; 1980.
  15. Berenson GS, BHS Co-Investigators. Bogalusa Heart Study: A long-term community study of a rural biracial (black/white) population. American Journal of the Medical Sciences. 2001;322(5):267–274.
  16. Pickoff AS, Berenson GS, Schlant RC. Introduction to the symposium celebrating the Bogalusa Heart Study. Am J Med Sci. 1995;310(Suppl. 1):S1–S2. doi: 10.1097/00000441-199512000-00001.
  17. Nguyen QM, Xu JH, Chen W, Srinivasan SR, Berenson GS. Correlates of Age Onset of Type 2 Diabetes Among Relatively Young Black and White Adults in a Community The Bogalusa Heart Study. Diabetes Care. 2012;35(6):1341–1346. doi: 10.2337/dc11-1818.
  18. Prentice RL, Williams BJ, Peterson AV. On the Regression-Analysis of Multivariate Failure Time Data. Biometrika. 1981;68(2):373–379.
  19. Pencina MJ, D'Agostino RB. Overall C as a measure of discrimination in survival analysis: model specific population value and confidence interval estimation. Statistics in Medicine. 2004;23(13):2109–2123. doi: 10.1002/sim.1802.
  20. Pencina MJ, D'Agostino RB Sr, D'Agostino RB Jr, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008;27(2):157–172; discussion 207–212. doi: 10.1002/sim.2929.
  21. Leening MJ, Vedder MM, Witteman JC, Pencina MJ, Steyerberg EW. Net reclassification improvement: computation, interpretation, and controversies: a literature review and clinician's guide. Ann Intern Med. 2014;160(2):122–131. doi: 10.7326/M13-1522.
  22. Yuan Y. Multiple Imputation Using SAS Software. Journal of Statistical Software. 2011;45(6):1–25. doi: 10.18637/jss.v045.i01.
  23. Hosmer DW, Lemeshow S. Applied Logistic Regression. 2nd ed. New York, NY: John Wiley & Sons; 2000.
  24. Aminov Z, Haase R, Olson JR, Pavuk M, Carpenter DO, Anniston Environmental Health Research C. Racial differences in levels of serum lipids and effects of exposure to persistent organic pollutants on lipid levels in residents of Anniston, Alabama. Environ Int. 2014;73:216–223. doi: 10.1016/j.envint.2014.07.022.
  25. Lin SX, Carnethon M, Szklo M, Bertoni A. Racial/ethnic differences in the association of triglycerides with other metabolic syndrome components: the Multi-Ethnic Study of Atherosclerosis. Metab Syndr Relat Disord. 2011;9(1):35–40. doi: 10.1089/met.2010.0050.
  26. Stern MP, Williams K, Haffner SM. Identification of persons at high risk for type 2 diabetes mellitus: do we need the oral glucose tolerance test? Ann Intern Med. 2002;136(8):575–581. doi: 10.7326/0003-4819-136-8-200204160-00006.
  27. Lacy ME, Wellenius GA, Carnethon MR, Loucks EB, Carson AP, Luo X, et al. Racial Differences in the Performance of Existing Risk Prediction Models for Incident Type 2 Diabetes: The CARDIA Study. Diabetes Care. 2015. doi: 10.2337/dc15-0509.
  28. Piccolo RS, Pearce N, Araujo AB, McKinlay JB. The contribution of biogeographical ancestry and socioeconomic status to racial/ethnic disparities in type 2 diabetes mellitus: results from the Boston Area Community Health Survey. Ann Epidemiol. 2014;24(9):648–654, 654.e1. doi: 10.1016/j.annepidem.2014.06.098.
  29. Johnson CC, Myers L, Webber LS, Hunter SM, Srinivasan SR, Berenson GS. Alcohol-Consumption among Adolescents and Young-Adults - the Bogalusa Heart-Study, 1981 to 1991. American Journal of Public Health. 1995;85(7):979–982. doi: 10.2105/ajph.85.7.979.
  30. Kahn LB, Marshall JA, Baxter J, Shetterly SM, Hamman RF. Accuracy of reported family history of diabetes mellitus. Results from San Luis Valley Diabetes Study. Diabetes Care. 1990;13(7):796–798. doi: 10.2337/diacare.13.7.796.
  31. McClain MR, Srinivasan SR, Chen W, Steinmann WC, Berenson GS. Risk of type 2 diabetes mellitus in young adults from a biracial community: The Bogalusa heart study. Preventive Medicine. 2000;31(1):1–7. doi: 10.1006/pmed.2000.0682.
