PLoS One. 2016 Oct 11;11(10):e0163942. doi: 10.1371/journal.pone.0163942

Table 4. Studies investigating prediction of diabetes using machine learning methods.

| Reference | Method | Predictors | Sample size | Type of prediction | Performance |
|---|---|---|---|---|---|
| Yu et al. 2010 | SVM | Family history, age, gender, race and ethnicity, weight, height, waist circumference, BMI, hypertension, physical activity, smoking, alcohol use, education, and household income (NHANES cohort) | 4915 | Cross-sectional | AUC = 0.73 |
| Mani et al. 2012 | RF | A1c, systolic BP, diastolic BP, GLU, BMI, creatinine, HDL, MDRD, triglycerides, race, gender, age (EHR data) | 2280 | 1 year ahead | AUC = 0.80 |
| Choi et al. 2014 | SVM, ANN | Age, body mass index, hypertension, gender, daily alcohol intake, and waist circumference (KNHANES cohort) | 4685 | Cross-sectional | AUC = 0.74 |
| Anderson et al. 2016 | | Age, gender, systolic/diastolic BP, height, weight, BMI, 150 ICD9 codes, 150 common medications (EHR data) | 9948 | Cross-sectional | AUC = 0.81 |
| Luo 2016 | BRT + RF | Demographics, diagnoses, allergies, immunizations, lab results, medications, smoking status, and vital signs | 9948 | 1 year ahead | Accuracy = 87.4% |
| Our study | RF15 | Hemoglobin A1c, fasting glucose, waist circumference, adiponectin, BMI, hs-CRP, triglycerides, age, leptin, body surface area, eGFR, 2D calculated left ventricular mass, HDL cholesterol, LDL cholesterol, aldosterone | 3633 | 8 years ahead | AUC = 0.82; Accuracy = 75% |

ANN: Artificial Neural Networks; BRT + RF: combination of Boosting Regression Trees and RF classifiers.