Author manuscript; available in PMC: 2020 Aug 28.
Published in final edited form as: Inform Med Unlocked. 2019 Oct 16;17:100254. doi: 10.1016/j.imu.2019.100254

Development of non-invasive diabetes risk prediction models as decision support tools designed for application in the dental clinical environment

Harshad Hegde 1, Neel Shimpi 1, Aloksagar Panny 1, Ingrid Glurich 1, Pamela Christie 1, Amit Acharya 1,*
PMCID: PMC7453822  NIHMSID: NIHMS1547986  PMID: 32864420

Abstract

The objective was to develop a predictive model using medical-dental data from an integrated electronic health record (iEHR) to identify individuals with undiagnosed diabetes mellitus (DM) in dental settings. Retrospective data retrieved from Marshfield Clinic Health System’s data warehouse were preprocessed prior to analysis. A subset was extracted from the preprocessed dataset for external evaluation (Nvalidation) of derived predictive models. Further, subsets with 30%–70%, 40%–60% and 50%–50% case-to-control ratios were created for training/testing. Feature selection was performed on all datasets. Four machine learning (ML) classifiers were evaluated: logistic regression (LR), multilayer perceptron (MLP), support vector machines (SVM) and random forests (RF). Model performance was evaluated on Nvalidation. We retrieved a total of 5319 cases and 36,224 controls. From the initial 116 medical and dental features, 107 were used after performing feature selection. RF applied to the 50%–50% case-control ratio outperformed other predictive models over Nvalidation, achieving a total accuracy of 94.14%, sensitivity of 0.941, specificity of 0.943, F-measure of 0.941, Matthews correlation coefficient of 0.885 and area under the receiver operating characteristic curve of 0.972. Future directions include incorporation of this predictive model into the iEHR as a clinical decision support tool to screen and detect patients at risk for DM, triggering follow-ups and referrals for integrated care delivery between dentists and physicians.

Keywords: Dental informatics, Decision-support systems, Electronic health records, Evidence-based practice, Machine learning, Modeling healthcare services

1. Introduction

Diabetes mellitus (DM) is a chronic metabolic disorder characterized by abnormal regulation of blood glucose levels [1]. The disease progresses gradually through a pre-diabetic phase associated with sub-clinical levels of dysglycemia and often goes undetected during the initial years due to its asymptomatic nature [1]. Disease progression is associated with incremental increases in the magnitude of micro- and macrovascular complications [2]. Hence, early detection of undiagnosed DM is crucial for preventing associated complications, delaying progression and better overall management of morbidity [3]. DM is a global pandemic that affected 415 million individuals in 2015, with prevalence projected to reach 642 million by 2040 [4]. In the United States, an estimated 30 million individuals were diagnosed with DM in 2017, while 7.2 million individuals remained undiagnosed [5].

A mounting evidence base has demonstrated a bidirectional association between periodontal disease (PD) and DM [6-8]. PD has been recognized as an early complication of DM [9-11]. Population-based screening to identify individuals at risk for dysglycemia may support prevention and intervention. However, biological screening is currently not supported as a standard of care in the dental setting following evaluation of the evidence base by the US Preventive Services Task Force and the United Kingdom’s National Institute for Health Research in 2008 and 2013, respectively [12]. In the absence of biological screening for dysglycemia in the dental setting, non-invasive screening of existing medical and dental data in the electronic health record (EHR) could help identify patients potentially at risk for DM [12]. These patients may require referral for further evaluation, which may contribute to early diagnosis and provide an opportunity to modify patient risk for progression and onset of complications [13]. The value of using EHR data to detect DM at the point of care (POC) for identification of high-risk individuals has been demonstrated in a study by Sohler et al. (2016) [14]. Moreover, a recent systematic review examined 10 field trials that evaluated biological screening for DM in the dental setting to project DM prevalence among patients [15]. The review reported detection of glycemic measures in the diabetic range at POC among 1.3% to 14% of patients across the studies, while rates of dysglycemia in the prediabetic range varied widely, from 19% to as high as 90% [15]. Screening for conditions like DM in a dental setting is an important component of disease identification/prevention that enables integrated care delivery across disciplines [16].

The percentage of adults aged 18 and above making dental visits increased substantially, from approximately 40% in 2000 to nearly 60% in 2015 [17,18]. The increase in the number of patients seeking dental care provides a unique opportunity to screen individuals for high risk of DM at the POC [19]. Recent studies have showcased the willingness of dental providers to screen and monitor patients for risk of systemic conditions, including DM, in the dental setting to contribute to holistic improvement of health outcomes [19-21]. However, lack of knowledge surrounding DM among dental providers, time constraints and provider perceptions surrounding care for patients with DM were some of the historically identified barriers to integrated care delivery [20,21]. Creating capacity to identify risk factors and their relative contribution to increased likelihood of developing DM can create opportunities for establishing cost-effective interventions. Applying clinical decision support tools that continuously monitor DM risk by screening available data in the health records creates opportunity for more timely and appropriate intervention in the context of integrated healthcare delivery. However, the complex relationships and interactions among multiple risk factors are challenging to compute. This requires computerized models built with effective approaches, including application of artificial intelligence. The objective of our study was to apply machine learning (ML) approaches to develop prediction tools that can be implemented at the POC in a dental setting to identify patients at risk for DM. Implementing such a clinical decision support tool at the POC will aid dental providers in identifying individuals at high risk of DM and directly inform care delivery. To our knowledge, this is one of the first studies utilizing medical and dental data to develop a predictive tool for identifying the risk of DM in dental settings.

2. Methods

2.1. Study setting and population

Marshfield Clinic Health System (MCHS) is one of the largest comprehensive medical-dental health systems in the United States, with a service area that spans central, northern and western Wisconsin [22]. Care delivery across its networked medical and dental clinics is supported by a robust integrated medical-dental electronic health record (iEHR) environment. This multi-specialty group practice employs nearly 700 physicians and 40 dentists and approximately 7600 employees who support care delivery. Family Health Center of Marshfield (FHC), a federally qualified health center, partners with MCHS to provide care to more than 59,000 unique dental patients annually through 10 regional dental clinics spanning an extensive service area across Wisconsin that closely aligns with the MCHS service area [23,24]. The majority of the population residing within the largely rural MCHS service area is White/Caucasian. The study was reviewed and approved through expedited review by the Institutional Review Board (IRB) of Marshfield Clinic Research Institute.

2.2. Data retrieval

Retrospective data spanning a 39-year period from 1979 to 2018 were mined from MCHS’s enterprise research data warehouse (EDW). A comprehensive list of potential candidate data features was first identified and cataloged by systematic review of previously published diabetes/dysglycemia risk assessment models using multivariate regression across other populations [13]. Features representing candidate risk factors retained in historic models and with available data in our iEHR were selected for further modeling to evaluate their relevance and validity relative to our patient population. Notably, if multiple measures of a feature were made, each measure was initially tallied as a separate feature (e.g., periodontal pocket depth (PPD) or bleeding on probing (BOP) measures made at 6 sites per tooth). Multiple measures of a feature were finally reduced to a single representation for that feature (e.g., we only considered the maximum PPD value out of the 6 sites). The final dataset included both medical and dental features. Predicting DM risk was treated as a classification problem and the outcome was represented by two categories (binary outcome): ‘high risk (cases)’ and ‘low risk (controls)’. Cases were defined as all patients diagnosed with DM, identified by ICD-9/10 codes assigned by the physician practices, while controls were defined as non-diabetic patients lacking such coding in the EHR.

The following inclusion/exclusion criteria were applied to select individuals for the study.

  • Only patients with both medical and dental visits were included in the study.

  • All data one year prior to the date of DM diagnosis was collected for cases. For controls all data was collected in a one year time frame prior to the last dental visit.

  • The dental data associated with the third molars were excluded from analysis.

  • All individuals between 21 and 89 years of age were included in the study.

2.3. Data preprocessing

2.3.1. Data deletion

The percentage of missing data was calculated for each feature and for each patient record. Any feature with more than fifty percent of the data missing and any patient record with more than thirty percent of the data missing were excluded from the analysis. Table 1 lists all data features included in the prediction model development along with baseline characteristics. Table 2 lists the data features deleted from the datasets due to a high proportion of missing data (see Table 3).
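The two exclusion thresholds can be illustrated with a minimal Python sketch. It is not the authors' SAS code: the record layout (a list of dictionaries with `None` for missing values), the function names and the choice to compute both missing fractions on the unfiltered data are assumptions made for illustration.

```python
def missing_fraction(values):
    """Fraction of entries that are None."""
    return sum(v is None for v in values) / len(values)


def drop_sparse(records, feature_max=0.5, record_max=0.3):
    """Drop features with more than feature_max missing and
    records with more than record_max missing.

    Both fractions are computed on the unfiltered data (an assumption).
    """
    features = list(records[0].keys())
    kept_features = [
        f for f in features
        if missing_fraction([r[f] for r in records]) <= feature_max
    ]
    kept_records = [
        {f: r[f] for f in kept_features}
        for r in records
        if missing_fraction([r[f] for f in features]) <= record_max
    ]
    return kept_features, kept_records


# Hypothetical toy records; crp is missing in 3 of 4 rows.
toy = [
    {"age": 45, "bmi": 31.2, "hdl": 38, "crp": None},
    {"age": 52, "bmi": None, "hdl": None, "crp": None},
    {"age": 38, "bmi": 24.0, "hdl": 61, "crp": 1.1},
    {"age": 60, "bmi": 27.5, "hdl": 45, "crp": None},
]
features, rows = drop_sparse(toy)
```

In this toy input the `crp` column exceeds the 50% feature threshold and the second record exceeds the 30% record threshold, so both are removed.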

Table 1.

Illustrates list of all data features included in the prediction model development along with baseline characteristics.

Feature number Feature Cases (High Risk) Controls (Low Risk) P-Value
1 Age
21–30 years 442 (08.36%) 7488 (21.25%) <0.0001
31–40 years 769 (14.54%) 8236 (23.37%)
41–50 years 1245 (23.55%) 5940 (16.85%)
51–60 years 1434 (27.12%) 6587 (18.69%)
61–70 years 983 (18.59%) 4217 (11.96%)
71–80 years 350 (06.62%) 1923 (05.45%)
80–89 years 63 (01.19%) 842 (02.38%)
2 Body Mass Index (BMI) [25]
Less than 18.5 = Underweight; 18.5–24.99 = Normal;
25.00–29.99 = Overweight; 30.0 and above = Obese
Missing values 49 (00.93%) 2513 (07.13%) <0.0001
Underweight 25 (00.47%) 466 (01.32%)
Normal 284 (05.37%) 7946 (22.55%)
Overweight 933 (17.65%) 9402 (26.69%)
Obese 3995 (75.58%) 14,906 (42.31%)
3–30 Bleeding on probing (BOP) **
Each tooth is probed at six sites.
Missing values (excludes extracted teeth) 117,488 (80.89%) 317,100 (35.40%) <0.0001
Total number of teeth with BOP present 8286 (05.71%) 172,834 (19.29%)
Total number of teeth with BOP absent 16,764 (11.54%) 318,382 (35.34%)
31. Corticosteroids medications (Retrieved from medication lists from the iEHR)
Missing values 0 477 (01.35%) <0.0001
Corticosteroid prescribed 1053 (19.92%) 7258 (20.32%)
Corticosteroid not prescribed 4233 (80.08%) 27,598 (78.33%)
32. Serum Creatinine Levels
Females: Less than 0.6 mg/dl = low, 0.6 mg/dl to 1.1 mg/dl = Normal,
More than 1.1 mg/dl = High
Males: Less than 0.7 mg/dl = low,0.7 mg/dl to 1.3 mg/dl = Normal,
More than 1.3 mg/dl = High
Missing values 1103 (20.87%) 25,124 (71.31%) <0.0001
Low 211 (03.99%) 292 (00.83%)
Normal 3670 (69.43%) 9271 (26.31%)
High 302 (05.71%) 546 (01.55%)
33. Use of Diabetic Medications
Missing 0 477 (01.35%) <0.0001
DM medication prescribed 1870 (35.38%) 1684 (04.78%)
DM medication not prescribed 3416 (64.62%) 33,072 (93.87%)
34. Ethnicity
Missing 92 (01.74%) 3013 (8.55%) <0.0001
Declined 49 (00.93%) 352 (01.00%)
Hispanic or Latino 162 (03.06%) 113 (03.21%)
Not Hispanic or Latino 4975 (94.12%) 30,674 (87.06%)
Patient Does Not Know 8 (00.15%) 64 (00.18%)
35. Family history of Diabetes
Family history included parents and siblings
Yes 459 (08.68%) 178 (00.51%) <0.0001
No 4827 (91.32%) 35,055 (99.49%)
36. Gender
Male 2443 (46.2%) 14,921 (42.3%) <0.0001
Female 2843 (53.7%) 20,312 (57.6%)
37. High Density Lipids (HDL) cholesterol
Less than 40 mg/dl = Poor, 40 mg/dl to 59 mg/dl = Better, 60 mg/dl and above = Best
Missing 1516 (28.68%) 28,790 (81.71%) <0.0001
Poor 1749 (33.09%) 1347 (03.82%)
Better 1711 (32.37%) 3192 (09.06%)
Best 310 (05.86%) 1904 (05.40%)
38. Hypertension
<120 mm Hg (SBP) and <80 mm Hg (DBP) = Normal; 120–129 mm Hg (SBP) and <80 mm Hg (DBP) = Prehypertension; 130–139 mm Hg (SBP) or 80–89 mm Hg (DBP) = Stage 1 hypertension; ≥140 mm Hg (SBP) or ≥90 mm Hg (DBP) = Stage 2 hypertension; ≥180 mm Hg (SBP) or ≥120 mm Hg (DBP) = Hypertensive crisis
Missing 1453 (27.49%) 1551 (04.40%) <0.0001
Normal 1084 (20.51%) 11,896 (33.76%)
Prehypertension 653 (12.35%) 6287 (17.84%)
Stage 1 hypertension 1250 (23.65%) 11,439 (32.47%)
Stage 2 hypertension 824 (15.59%) 3976 (11.28%)
Hypertensive crisis 22 (00.42%) 84 (00.24%)
39. Use of Hypertensive medications
Missing 0 477 (01.35%) <0.0001
Hypertensive medication prescribed 2001 (37.85%) 9026 (25.62%)
Hypertensive medication not prescribed 3285 (62.15%) 25,730 (73.03%)
40–45. Insurance
Types: Medicare, Medicaid, Commercial, Self-pay, Senior Care (Prescription only), No health insurance
Yes 5195 (98.28%) 33,483 (95.03%) <0.0001
No 91 (01.72%) 1750 (04.97%)
46. LDL cholesterol
Less than 100 mg/dl = Optimal, 100 mg/dl to 129 mg/dl = Near optimal, 130 mg/dl to 159 mg/dl = Borderline high, 160 mg/dl to 189 mg/dl = High, 190 mg/dl and above = Very High
Missing 1666 (31.52%) 28,855 (81.90%) <0.0001
Optimal 1786 (33.79%) 2766 (07.85%)
Near optimal 1082 (20.47%) 2111 (05.99%)
Borderline high 528 (09.99%) 1078 (03.06%)
High 165 (03.12%) 311 (00.88%)
Very High 59 (01.12%) 112 (00.32%)
47. Total number of unique dental visits in the given measurement year
Continuous variable 1.08 ± 2.64 3.49 ± 2.66 <0.0001
48. Periodontal Disease (PD) Types
Missing 2300 (43.51%) 8783 (24.93%) <0.0001
Healthy 78 (01.48%) 595 (01.69%)
Type 1 323 (06.11%) 5370 (15.24%)
Type 2 1882 (35.60%) 16,297 (46.25%)
Type 3 623 (11.79%) 3712 (10.54%)
Type 4 78 (01.48%) 445 (01.26%)
Type 5 2 (00.04%) 31 (00.09%)
49–76. Periodontal Pocket Depth (PPD)
Each tooth is probed at six sites and maximum PPD value is assigned as the PPD for each tooth.
Total number of teeth with missing PPD values (excludes extracted teeth) 120,517 (82.98%) 354,068 (39.53%) <0.0001
Total number of teeth with PPD > 5 mm 1270 (05.14%) 17,794 (03.28%)
77–84. Race
Non-White: American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander; White;
Unknown: Patient Does Not Know, Declined and Unknown
White 4952 (92.87%) 30,363 (85.57%) <0.0001
Non-White 206 (03.86%) 1328 (03.74%)
Unknown 174 (03.26%) 3791 (10.68%)
85. Use of Statins
Missing 0 477 (01.35%) <0.0001
Statin prescribed 1342 (25.39%) 4941 (14.02%)
Statin not prescribed 3944 (74.61%) 29,815 (84.62%)
86. Tobacco use status
Missing 2452 (46.39%) 2393 (06.79%) <0.0001
Current user 847 (16.02%) 13,041 (37.01%)
Former user 809 (15.30%) 7712 (21.89%)
Never 1178 (22.29%) 12,087 (34.31%)
87–114. Total number of missing teeth
Total number of missing teeth (includes extracted teeth) 2767 (01.87%) 90,756 (09.2%) <0.0001
115. Total Triglycerides
Less than 150 mg/dl = Normal, 150 mg/dl to 199 mg/dl = Borderline high, 200 mg/dl to 499 mg/dl = High, 500 mg/dl and above = Very High.
Missing 1532 (28.98%) 28,805 (81.76%) <0.0001
Less than 150 mg/dl (Normal) 1681 (31.80%) 4628 (13.14%)
150 mg/dl to 199 mg/dl (Borderline high) 810 (15.32%) 878 (02.49%)
200 mg/dl to 499 mg/dl (High) 1150 (21.76%) 875 (02.48%)
500 mg/dl and above (Very High) 113 (02.14%) 47 (00.13%)
116. WBC
Less than 4.0 × 10^9/L = Leukopenia, 4.0 × 10^9/L to 11.0 × 10^9/L = Normal, More than 11.0 × 10^9/L = Leukocytosis
Missing 1883 (34.68%) 25,958 (73.68%) <0.0001
Less than 4.0 × 10^9/L (Leukopenia) 33 (00.62%) 233 (00.66%)
4.0 × 10^9/L to 11.0 × 10^9/L (Normal) 2944 (55.69%) 7869 (22.33%)
More than 11.0 × 10^9/L (Leukocytosis) 476 (09.00%) 1173 (3.33%)
Table 2.

List of all data features deleted and corresponding percentage of missing values.

Feature Description Cases (% of missing values; N = 5319) Controls (% of missing values; N = 36,224)
CRP C-reactive protein (Continuous value) 99.4 99.9
Bone loss Presence/absence of periodontal bone loss (Boolean value) 98.8 96.9
Plaque Presence/absence of plaque (Boolean value) 98.7 96.7
Mobility Presence/absence of tooth mobility (Boolean value) 97.3 94.4
Uric Acid Uric acid levels(Continuous value) 90.6 98.9
Gingivitis Presence/absence of gingivitis(Boolean value) 83.1 0
Xerostomia Presence/absence of xerostomia (Boolean value) 83.1 0
Oral candidiasis Presence/absence of oral candidiasis (Boolean value) 83.1 0
Table 3.

Shows Case-control distribution of training/testing datasets.

Case-Control Cases (a) Controls (b)
30–70 4757 11,100
40–60 4757 7136
50–50 4757 4757

(a) Number of observations kept consistent.
(b) Number of observations sampled without replacement.

2.3.2. Redefining feature values

Several features were grouped into predefined categories [26-28]. For example, feature #2 (BMI in Table 1) was categorized according to the study described by Bhaskaran et al. [25] as underweight (<18.5), normal (18.5–24.99), overweight (25.0–29.99) or obese (≥30).
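As an illustration of this kind of recoding, the BMI cut-offs listed in Table 1 can be written as a small Python function. The function name and the "Missing" label for absent values are assumptions; the study performed this step in its own preprocessing pipeline.

```python
def bmi_category(bmi):
    """Map a BMI value to the categories listed in Table 1."""
    if bmi is None:
        return "Missing"      # assumed label for absent values
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25.0:
        return "Normal"       # 18.5 to 24.99
    if bmi < 30.0:
        return "Overweight"   # 25.00 to 29.99
    return "Obese"            # 30.0 and above


categories = [bmi_category(v) for v in (17.9, 22.4, 27.0, 33.1, None)]
```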

For the feature “periodontal pocket depth” (PPD), we obtained the maximum PPD among the six probing sites of each tooth to define the extent of periodontal tissue destruction. This maximum (worst) depth per tooth was recorded to classify the severity and extent of a patient’s periodontal disease along with other factors. The latest classification of periodontal disease was adopted at the 2017 World Workshop on the Classification of Periodontal and Peri-Implant Diseases [29]. The feature “bleeding on probing” (BOP) was recorded as a Boolean value indicating whether bleeding was present or absent on a tooth’s surface. This data point was aligned to the tooth surface that showed the deepest PPD. This exercise resulted in a single value for both “PPD” and “BOP” for each tooth.
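The per-tooth reduction described above can be sketched as follows. This is a hedged illustration only: the input layout (six `(ppd_mm, bop_present)` pairs per tooth), the function name and the tie-breaking rule (first deepest site wins) are assumptions not stated in the study.

```python
def summarize_tooth(sites):
    """Reduce six probing sites to one (PPD, BOP) pair for a tooth.

    sites: list of (ppd_mm, bop_present) tuples; ppd_mm is None when
    the site was not measured. Returns (None, None) if nothing was measured.
    """
    measured = [(ppd, bop) for ppd, bop in sites if ppd is not None]
    if not measured:
        return None, None
    # max() keeps the first site among ties (an assumed tie-breaking rule).
    ppd, bop = max(measured, key=lambda site: site[0])
    return ppd, bop


tooth = [(2, False), (3, False), (5, True), (4, False), (None, None), (3, True)]
max_ppd, bop_at_max = summarize_tooth(tooth)
```

Here the deepest pocket is 5 mm, so the tooth is represented by that depth and by the BOP status recorded at the same site.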

2.4. Label encoding and feature scaling [30,31]

All categorical features (string labels) in the preprocessed dataset were transformed to numerical values by importing the data as a Data Frame in R software (R version 3.4.3. R Foundation for Statistical Computing, Vienna, Austria.) [32]. Further, all the features were scaled to a range of [0, 1] in order to prevent any feature being weighted more than others due to a larger range. This normalization was performed using the following function [31]:

$$x_{norm} = \frac{x - \min(x)}{\max(x) - \min(x)} \qquad (1)$$

where x_norm is the normalized value, x is the original value, and max(x) and min(x) are the upper and lower bounds of the feature, respectively. All feature scaling was performed using the R programming language [32].
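A minimal Python sketch of Eq. (1) is given below for illustration; the study performed this step in R. The function name and the handling of constant features (mapped to 0.0 to avoid division by zero) are assumptions.

```python
def min_max_scale(values):
    """Scale a list of numbers to the range [0, 1] as in Eq. (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: Eq. (1) is undefined, so return zeros (assumption).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


scaled = min_max_scale([1, 3, 5])
```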

2.5. Dataset creation

A validation set (Nvalidation), consisting of 10% of the total number of cases and an equal number of controls, was first separated from the preprocessed dataset. The remainder, the training/testing set (Ntrain/testing) containing 90% of the cases, was further divided randomly into three subsets that combined cases and controls in ratios of 30:70, 40:60 and 50:50, respectively (Fig. 1). The number of cases was kept consistent (n = 4757) and the number of controls was adjusted accordingly. The controls were randomly sampled without replacement for creating these datasets. SAS® (Base SAS 9.4, SAS Institute Inc., Cary, NC) analytical software was used for data preprocessing and dataset creation [33].
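The subset construction can be sketched in Python as below. This is an illustration rather than the authors' SAS procedure: the function names, the seed and the choice to draw each subset's controls independently are assumptions. Rounding the required number of controls up reproduces the counts reported in Table 3.

```python
import random


def controls_needed(n_cases, case_pct, control_pct):
    """Controls required for a case_pct:control_pct mix, rounded up."""
    return -(-n_cases * control_pct // case_pct)  # ceiling division


def build_subsets(case_ids, control_ids,
                  ratios=((30, 70), (40, 60), (50, 50)), seed=0):
    """Keep all cases; sample controls without replacement for each ratio."""
    rng = random.Random(seed)
    subsets = {}
    for case_pct, control_pct in ratios:
        k = controls_needed(len(case_ids), case_pct, control_pct)
        sampled = rng.sample(control_ids, k)  # sampling without replacement
        subsets[f"{case_pct}-{control_pct}"] = (list(case_ids), sampled)
    return subsets


# Hypothetical identifiers sized like Ntrain/testing (4757 cases, 34,704 controls).
cases = list(range(4757))
controls = list(range(100000, 134704))
subsets = build_subsets(cases, controls)
```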

Fig. 1.

Shows the study flow.

2.6. Data imputation

We initially evaluated two data imputation methods, Multiple Imputation by Chained Equations (MICE) [34,35] and Probabilistic Principal Component Analysis (PPCA) [36], to address the missing data values in Ntrain/testing. To compare them, a subset of Ntrain/testing with no missing data points for any feature was selected. Thirty percent “missingness” was imposed on this subset using a ‘missing completely at random’ (MCAR) mechanism to mimic the missing data pattern in the preprocessed dataset. The percentage of values that were correctly imputed was higher using PPCA than MICE; hence the PPCA method was employed to impute missing values in Ntrain/testing. Data imputation was not carried out in Nvalidation to replicate the real-world scenario for evaluating the classifier performance.
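The evaluation protocol (mask known values, impute, count how many are recovered) can be sketched as follows. The column-mode imputer is only a placeholder standing in for MICE or PPCA, which are not implemented here; the function names, the seed and the exact-match scoring rule are likewise assumptions.

```python
import random
from collections import Counter


def impose_mcar(rows, fraction=0.3, seed=0):
    """Blank out `fraction` of all cells, chosen uniformly at random (MCAR)."""
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(rows) for j in range(len(row))]
    masked = set(rng.sample(cells, round(fraction * len(cells))))
    corrupted = [
        [None if (i, j) in masked else v for j, v in enumerate(row)]
        for i, row in enumerate(rows)
    ]
    return corrupted, masked


def mode_impute(rows):
    """Placeholder imputer: fill each column with its most common observed value."""
    modes = []
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        modes.append(Counter(observed).most_common(1)[0][0] if observed else None)
    return [[modes[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def fraction_correct(original, imputed, masked):
    """Share of masked cells whose imputed value equals the held-back value."""
    return sum(original[i][j] == imputed[i][j] for i, j in masked) / len(masked)


complete = [[i % 2, i % 3, 1] for i in range(20)]  # toy complete-case subset
corrupted, masked = impose_mcar(complete)
score = fraction_correct(complete, mode_impute(corrupted), masked)
```

Running the same masking against two candidate imputers and comparing their scores mirrors the comparison described above.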

2.7. Feature selection

Feature selection was conducted on all 3 datasets using WEKA® [37]. We performed the feature selection using information gain with the ranker search method. This method evaluates the importance of each feature by measuring the information gain with respect to the class, using the following formula:

$$\mathrm{InfoGain}(\mathrm{Class}, \mathrm{Feature}) = H(\mathrm{Class}) - H(\mathrm{Class} \mid \mathrm{Feature}) \qquad (2)$$

where H stands for entropy, which is defined as:

$$H = -\sum_{class} \mathrm{Probability}_{class} \, \log_2\left(\mathrm{Probability}_{class}\right) \qquad (3)$$
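Eqs. (2) and (3) can be illustrated for categorical features with the short Python sketch below. It is not WEKA's implementation; the function names and the toy labels are assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels, as in Eq. (3)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_gain(labels, feature_values):
    """H(Class) - H(Class | Feature) for a categorical feature, as in Eq. (2)."""
    n = len(labels)
    groups = {}
    for y, v in zip(labels, feature_values):
        groups.setdefault(v, []).append(y)
    # Conditional entropy: group entropies weighted by group size.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


labels = ["case", "case", "control", "control"]
gain_perfect = info_gain(labels, ["a", "a", "b", "b"])  # feature separates classes
gain_useless = info_gain(labels, ["a", "b", "a", "b"])  # feature carries no signal
```

A feature that splits the classes perfectly attains the full class entropy as its gain, while an uninformative feature has a gain of zero; ranking features by this value gives the ordering used for selection.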

2.8. Model training and validation

Based on previous studies [38], four supervised ML algorithms were used to create models for predicting DM risk: Multilayer Perceptron (MLP), Random Forests (RF), Support Vector Machine (SVM) and Logistic Regression (LR). Detailed descriptions of these classifiers with mathematical equations can be found in other papers [39,40]. These classifiers were trained and tested on the three subsets (30:70, 40:60 and 50:50) using 10-fold cross validation. Each fold derived a model whose performance was evaluated over Nvalidation as shown in Fig. 1, and the results of the best performing fold for each corresponding classifier were reported.
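The 10-fold partitioning step can be sketched in plain Python as below. This only illustrates how the indices are split; the study used WEKA for training, and the function name, the shuffling seed and the unstratified split are assumptions made here.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Return k (train_indices, test_indices) pairs covering range(n)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal, disjoint folds
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        splits.append((train, test))
    return splits


# The 50:50 subset holds 4757 cases and 4757 controls (9514 rows).
splits = k_fold_indices(9514)
```

Each of the ten models would be trained on one `train` list, tested on the matching `test` list, and then scored on the held-out validation set.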

WEKA®, an open-source software tool, was used for building these models [37]. All classifiers were implemented using default hyperparameters. Model performance was assessed using total accuracy, sensitivity, specificity, precision, F-measure, Matthews correlation coefficient (MCC), false positive rate (FPR), false negative rate (FNR), negative predictive value (NPV), positive predictive value (PPV) and area under the receiver operating characteristic (ROC) curve (AUC). A paired two-tailed t-test was performed over AUC using WEKA Experimenter. Statistical significance was set at α = 0.05. To indicate the strength of the difference between cases and controls, the observed p-value from a χ2-test was used to determine the association.

3. Results

Systematic review identified 69 articles published between 1/1980 and 5/2018 that undertook diabetes risk prediction modeling. These studies examined the contribution of a total of 201 candidate medical, dental, demographic, environmental and behavioral features. Of these, 95 features shown in Fig. 2 were variably retained in predictive models created across diverse population cohorts. Among 19 dental features, three variables (PPD, missing teeth and self-reported oral health status) were retained in some final models as predictors. Availability of these features was explored in our clinical databases for inclusion in model development and available features were retrieved (see Fig. 3).

Fig. 2.

Catalogs all retained variables identified across the 69 studies that met eligibility for data abstraction and were included for modeling.

Fig. 3.

Shows ROC (AUC) of all the four classifiers with varied case-control distribution.

Data were abstracted from a total of 41,543 subjects (5319 cases and 36,224 controls) and 124 features, which included demographic (n = 17), medical/environmental/behavioral (n = 15) and oral health (n = 92) features, where each measure of a variable was initially abstracted as a discrete feature (e.g., BOP and PPD measured at 6 sites per tooth). Preprocessing of data resulted in deletion of features based on a high proportion of missing data, redefinition via Boolean representation (teeth present or absent) or derivation of a representative value across a series of iterative measures of a feature (e.g., PPD). The processed dataset consisted of 40,519 patients (5286 cases and 35,233 controls) and 116 features (18 demographic, 12 medical and 86 dental features) (see Table 1). After performing feature selection on all 3 datasets, race and ethnicity were excluded due to low information gains of ~10^−3 and ~3 × 10^−4, respectively, thus bringing the feature count down from 116 to 107. The most significant feature in terms of information gain for all 3 datasets was ‘Number of dental visits’ (~0.6). Nvalidation consisted of 529 cases (10% of the total 5286 cases) and an equal number of controls, totaling 1058 observations. After separation of Nvalidation, the resulting Ntrain/testing consisted of 39,461 subjects with 4757 cases and 34,704 controls.

A total of 12 prediction models (3 datasets × 4 ML classification methods) were developed. The RF classifier trained on the dataset with a 50-50 (case-control) class distribution was the best performing model.

Table 4 shows the confusion matrix of RF with 50-50 (case: control) which was the best performing model. Number of dental visits and PPD were the top two features in terms of information gain. The correctly and incorrectly classified instances using RF were 996 and 62 respectively (see Table 5).

Table 4.

Confusion matrix of the best performing model (RF with 50-50(case: control)).

Total Population (Nvalidation), N = 1058 | Actual Positive | Actual Negative | ACC = 0.9414 ((ΣTP + ΣTN)/ΣTotal Population)
Predicted Positive | TP = 481 | FP = 48 | PPV = 0.943 (ΣTP/ΣPredicted Positive)
Predicted Negative | FN = 14 | TN = 515 | NPV = 0.974 (ΣTN/ΣPredicted Negative)
 | TPR = 0.972 (ΣTP/ΣActual Positive) | FPR = 0.085 (ΣFP/ΣActual Negative) |
 | FNR = 0.028 (ΣFN/ΣActual Positive) | TNR = 0.915 (ΣTN/ΣActual Negative) |

TP = True Positive; TN = True Negative; FP=False Positive; FN=False Negative; ACC = Accuracy; TPR = True positive rate; TNR = True negative rate; FPR=False positive rate; FNR=False negative rate; PPV=Positive predictive value; NPV=Negative predictive value.
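Several of the rates in Table 4 follow directly from the four cell counts, which the Python sketch below illustrates using the formulas given in the table together with the standard MCC definition. The function name and the selection of metrics are assumptions; the study obtained its metrics from WEKA.

```python
from math import sqrt


def confusion_metrics(tp, fp, fn, tn):
    """Rates derived from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    return {
        "ACC": (tp + tn) / total,
        "TPR": tp / (tp + fn),   # sensitivity
        "TNR": tn / (tn + fp),   # specificity
        "FPR": fp / (fp + tn),
        "FNR": fn / (fn + tp),
        "NPV": tn / (tn + fn),
        "MCC": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }


# Cell counts as printed in Table 4.
m = confusion_metrics(tp=481, fp=48, fn=14, tn=515)
```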

Table 5.

Performance metrics for all models.

Case-Control Classifier Total accuracy AUC MCC F-measure Sensitivity Specificity Precision
30–70 MLP 81.29% 0.894 0.628 0.813 0.813 0.789 0.815
LR 81.19% 0.935 0.654 0.808 0.812 0.74 0.843
RF 91.68% 0.983 0.842 0.916 0.917 0.864 0.926
SVM 80.91% 0.809 0.65 0.804 0.809 0.736 0.842
40–60 MLP 82.51% 0.900 0.65 0.825 0.825 0.826 0.825
LR 86.29% 0.935 0.734 0.862 0.863 0.816 0.871
RF 92.72% 0.978 0.860 0.927 0.927 0.883 0.933
SVM 85.35% 0.806 0.719 0.852 0.853 0.799 0.865
50–50 MLP 82.42% 0.898 0.652 0.824 0.824 0.792 0.828
LR 75.24% 0.866 0.519 0.749 0.752 0.704 0.767
RF (a) 94.14% 0.972 0.885 0.941 0.972 0.915 0.943
SVM 77.50% 0.775 0.581 0.769 0.775 0.708 0.807

(a) Best performing model.

We also assessed the best performing models for each of the remaining classifiers. MLP classifier performed better on the class distribution of 40–60 (case-control) and demonstrated a total accuracy of 82.51% and ROC (AUC) of 0.9. LR classifier was efficient with the class distribution of 40–60 (case-control) and demonstrated a total accuracy of 86.29% and ROC (AUC) of 0.935. Similarly, SVM classifier performed better over the class distribution of 40–60 (case-control), achieving a total accuracy of 85.35% and ROC (AUC) of 0.806.

4. Discussion

DM diagnosis depends on demonstration of two measures of elevated plasma glucose, based on either a) fasting plasma glucose (FPG) or b) 2-h plasma glucose (2-h PG) following a 75-g oral glucose tolerance test (OGTT), or on hemoglobin A1C (HbA1C) criteria [1], with diagnoses rendered by physicians in a clinical setting. Although dentists do not diagnose DM, recognition of undiagnosed DM is highly relevant to provision of appropriate oral and dental care to a patient with this condition. With mounting evidence supporting an association between DM and PD, predictive modeling tools such as the one defined in this study will help dental professionals identify undiagnosed patients potentially at high risk for DM [38,41]. Given the bidirectional relationship between DM and PD, managing PD with the knowledge that the individual may be at risk for DM may be crucial at the POC. Identification of potentially at-risk individuals in the dental setting, supported by triage to clinicians for further testing, allows confirmation of DM or prediabetic status in an earlier timeframe, which benefits patient health.

The objective of our study was to develop a prediction model to noninvasively screen EHR data of patients with no existing diagnosis for dysglycemia seen in the dental setting to identify presence of relative risk of DM. Predictive modeling using ML algorithms has been previously applied in a number of studies for risk determination of various health conditions [42,43]. A systematic review suggested that the ML algorithms applied in our study aligned with those applied in similar risk prediction modeling studies undertaken by other researchers [38]. We focused on increasing predictive accuracy by modeling a combination of medical and dental risk factor candidates to develop a model to predict DM risk for patients in our clinical population. Among the classifiers used in this study, RF yielded the best AUC which was statistically significant as compared to the rest (p < 0.05).

Acharya et al. previously developed a model using multivariate logistic regression as a tool for screening diabetes risk [13]. The study reported an AUC (ROC), sensitivity and specificity of 0.71, 0.70 and 0.62, respectively. Application of ML and expansion of the candidate predictive features in the current study resulted in improved model performance, with an AUC (ROC), sensitivity and specificity of 0.972, 0.941 and 0.915, respectively. The improved performance noted in the current study is likely attributable to the volume of data used to train the models, i.e., 107 features modeled in a data set of 39,461 individuals vs. 10 features modeled in 4560 individuals in the study by Acharya et al. [13].

Similar studies have developed nomograms to predict the risk of DM based on laboratory, semi-laboratory and non-laboratory data [44,45]. Wong et al. and Li et al. used logistic regression to develop their models. Wong et al. reported an AUC of 0.709 for the non-laboratory-based risk algorithm and 0.711 for the laboratory-based risk algorithm. Similarly, Li et al. reported an AUC of 0.868 and 0.763 for the semi-lab model and non-lab model, respectively. Both studies used 25% of their data for external evaluation. In this study, we used only 10% of cases and an equal number of controls (n = 529 each) as the external evaluation set in the interest of using most of the data for training and testing so that the feature space was adequately represented.

Lalla et al. built a model to predict undiagnosed pre-diabetes and DM using LR applied to dental features in a cohort of subjects whose HbA1C results at the POC classified them as dysglycemic or normoglycemic at a ratio of 36:64 [10]. The authors reported an AUC of 0.65 for their best performing model, which included only dental features. When optimal cut-offs were defined, including 26% of teeth with deep PPD or ≥4 missing teeth, the sensitivity of the model was 0.73.

Li et al. used data from the NHANES III survey to construct predictive models for DM risk using the classification and regression tree (CART) method [46]. They used a wider range of risk factors and also tested other dental parameters including ‘sum of decayed, missing and filled (DMFTs) tooth surfaces’, ‘time since last dental visit’ and ‘self-reported oral health status’, in addition to ‘presence or absence of periodontitis’. By contrast, the present study did not include DMFTs, ‘time since last dental visit’ or ‘self-reported oral health status’. However, we modeled missing teeth data for all 28 teeth as a boolean value, ‘tooth present or absent’. Furthermore, we limited consideration to the total number of dental visits within a one-year temporal window. Li et al. used 55 features (N = 15,090), and their model had a sensitivity, specificity and AUC of 0.824, 0.528 and 0.72, respectively.
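As an illustration of the two dental features described above, the sketch below encodes tooth presence as 28 boolean values and counts dental visits inside a one-year window. The tooth numbering (1 to 28), the 365-day window ending at a hypothetical index date, and both function names are assumptions made for this example, not details confirmed by the source.

```python
from datetime import date, timedelta

N_TEETH = 28  # the text reports 28 teeth; numbering them 1..28 is an assumption


def tooth_presence_flags(missing_teeth):
    """Return one boolean per tooth: True if present, False if absent."""
    return [tooth not in missing_teeth for tooth in range(1, N_TEETH + 1)]


def visits_within_one_year(visit_dates, index_date):
    """Count visits in the 365 days up to and including the index date."""
    window_start = index_date - timedelta(days=365)
    return sum(window_start <= d <= index_date for d in visit_dates)


# Hypothetical patient record
flags = tooth_presence_flags(missing_teeth={3, 14})
visits = [date(2014, 1, 10), date(2014, 9, 1), date(2015, 3, 15), date(2015, 7, 5)]
n_visits = visits_within_one_year(visits, index_date=date(2015, 6, 30))
print(sum(flags), "teeth present;", n_visits, "visits in window")
```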

Borrell et al. also used data from NHANES III to develop an LR model to identify features that increased the probability of identifying a dental patient with undiagnosed DM [47]. These investigators modeled the following features: presence or absence of periodontitis, self-reported family history of DM, hypertension and cholesterol levels. Family history of DM was one of the features used in our study as well. Periodontal assessments tested in their models included clinical attachment loss (CAL) and PPD [47]. Our model also incorporated PPD data. However, CAL measures were not available for our study.

5. Limitations

Some features were excluded because more than 50% of their values were missing (Table 2); imputation was therefore limited to the techniques shown in Table 1. Our dataset consisted of a predominantly White, non-Hispanic population and hence lacked the racial diversity found in the data used by Borrell et al., which consisted of a mixed African-American, Mexican-American (Hispanic) and White racial/ethnic representation. Feature selection on our dataset revealed that race and ethnicity had very low information gain, which could be attributable to this lack of diversity. The model developed in this study was evaluated with an internal dataset, and plans include evaluation of our model using data from external organizations. Although our models were restricted to data procured from a single health system, the service area of MCHS covers a vast geographic area including central, northern and western Wisconsin. The features used in the present study were captured in a clinical setting and not specifically for research purposes. This resulted in the exclusion of some features modeled in other studies because these data were not systematically available in our clinical setting [46]. For example, Li et al. incorporated waist circumference for testing in their models, which other studies have found to be a highly predictive feature for dysglycemia in some populations. Moreover, some features that were captured mainly in clinical notes could not be readily modeled due to the challenges involved in abstracting them. In future studies, abstraction could be facilitated by applying natural language processing (NLP) to convert clinical notes into structured features [48]. Examples of such features include tooth brushing or flossing frequency, which are recorded in free-text fields but not currently captured as structured data elements.

Another limitation was that some patients who carried a DM diagnosis had no dental visits within one year of the date of diagnosis. Such potentially informative patients did not meet the inclusion criteria due to insufficient data within the temporal frame specified for inclusion in the study.
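To illustrate why a feature with little variation, such as race or ethnicity in a largely homogeneous population, tends to show low information gain, the following sketch computes entropy-based information gain on synthetic data. The data and function names are hypothetical, and the exact feature-selection implementation used in the study is not assumed here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Synthetic data: 50 cases (1) followed by 50 controls (0)
labels = [1] * 50 + [0] * 50
# Nearly constant feature: 95 of 100 records share the value "A"
homogeneous = ["B"] * 4 + ["A"] * 46 + ["B"] * 1 + ["A"] * 49
# Feature that separates the classes perfectly
informative = ["X"] * 50 + ["Y"] * 50

ig_homogeneous = information_gain(homogeneous, labels)
ig_informative = information_gain(informative, labels)
print(f"nearly constant feature: {ig_homogeneous:.4f} bits")
print(f"perfectly separating feature: {ig_informative:.4f} bits")
```

Because almost every record falls into the same category, splitting on the nearly constant feature leaves the class distribution close to unchanged, so its measured gain stays small regardless of any true underlying association.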

6. Conclusions

We developed a model that identifies undiagnosed patients at risk for DM based on their medical and dental data. Future directions include embedding this algorithm in an application programming interface (API) for incorporation into the EHR to provide alerts to dental providers. Moreover, this process could be employed to develop additional decision support tools for other systemic diseases with oral health associations, such as cardiovascular and cognitive disorders. All features used in modeling performed to date represent phenotypic and demographic data. Additionally, as relevant genetic data become available in the iEHR, they could be modeled to determine whether they increase the predictive capacity of current clinical phenotypic models.

Acknowledgments

Funding sources

This work was supported by funds from Delta Dental of Wisconsin; Marshfield Clinic Research Institute; Family Health Center of Marshfield, Inc.; and partial funding by grant UL1TR000427 from the Clinical and Translational Science Award (CTSA) program of the National Center for Advancing Translational Sciences, National Institutes of Health (NIH). The content of this paper is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

Footnotes

Declaration of competing interest

The authors do not have any conflict of interest.

References
