Investigation of the risk factors associated with prediabetes in normal-weight Qatari adults: a cross-sectional study

Khadija A Elmagarmid; Mohamed Fadlalla; Johann Jose; Abdelilah Arredouani; Halima Bensmail

doi:10.1038/s41598-024-73476-8

2024 Oct 4;14:23116. doi: 10.1038/s41598-024-73476-8

PMCID: PMC11452400 PMID: 39367088

Abstract

Type 2 diabetes is one of the most prevalent chronic diseases in the world, and more people than ever before have impaired glucose tolerance, or prediabetes. Many patients with impaired glucose tolerance and undiagnosed diabetes do not know that their glucose metabolism is already disordered. Every year, about 5-10% of people with prediabetes develop diabetes. One important contributing factor may be an increase in blood lipids. However, it is not clear whether triglyceride is associated with impaired glucose tolerance and prediabetes in the Qatari population. Therefore, we investigated the relationship between several clinical variables and prediabetes status in normal-weight and overweight/obese populations. We conducted a cross-sectional study using data from the Qatar Biobank program. The study included 5,996 participants who were adults over the age of 20. We collected information about participants’ fasting blood glucose levels along with other clinical measurements and used logistic regression and several machine learning models to study the association between the clinical measurements and prediabetes in the normal-weight and overweight/obese groups. After adjusting for potential confounding factors such as age and sex, the models showed that triglyceride was positively associated with prediabetes and that the association depended on the population group: it was significant among nonobese participants (p < 0.05). The odds ratio (OR) of triglyceride for prediabetes was 2.79, with a 95% confidence interval of (e^0.78, e^1.28).

Keywords: Prediabetes, Diabetes, Triglyceride-glucose-related, Obesity, Normal-weight, Qatar

Subject terms: Computational biology and bioinformatics, Risk factors, Engineering

Introduction

According to the International Diabetes Federation (IDF), prediabetes (PreD) affected 7.5% of the global population in 2019, corresponding to approximately 374 million adults aged 18–99. This figure is expected to rise to 8.6% by 2045 if no prompt actions are taken¹. PreD is a condition in which a person’s blood sugar levels are higher than usual but not high enough to be categorized as type 2 diabetes (T2D). Based on the American Diabetes Association criteria, PreD is defined as having an HbA1c level between 5.7% and 6.4% (39 and 47 mmol/mol), a fasting glucose concentration between 100 and 125 mg/dl (5.6 and 6.9 mmol/L), or a 2 h oral glucose tolerance test result between 140 and 200 mg/dl (7.8–11.0 mmol/L). People with PreD are at an increased risk of developing T2D and other health complications, such as heart disease and stroke². Indeed, around 5-10% of individuals with PreD progress to T2D annually^2,3. Furthermore, people with PreD have a 33-65% 6-year probability of developing T2D, compared to 5% for those with normoglycemia⁴. Fortunately, recent research has shown that the progression from PreD to T2D can be prevented, or at least delayed, in a large fraction of individuals with PreD in response to intensive lifestyle intervention^5–8. These data suggest, therefore, that early detection and treatment of PreD is a highly cost-effective and fundamental strategy in T2D prevention. They also underscore the importance of understanding the pathophysiology and the risk factors associated with the onset of PreD.

PreD and T2D are usually associated with overweight and obesity⁹. Thus, current recommendations for PreD and T2D screening by the American Diabetes Association focus nearly exclusively on adults who are overweight or obese as defined by body mass index (BMI; kg/m²) until the patient meets the age-oriented screening at 45 years^10,11. This focus on obese or overweight individuals, however, may lead to missed opportunities for investigation of undetected PreD and T2D in normal-weight (NW) individuals¹¹.

A normal-weight (NW) person generally has a BMI between 18.5 and 24.9 kg/m². Although most NW adults appear healthy, a considerable percentage may be afflicted by undiagnosed metabolic conditions such as insulin resistance, PreD, T2D, or nonalcoholic fatty liver disease (NAFLD)¹². Because these individuals usually have a high body fat mass but a normal BMI, they are known as Normal-Weight Obese (NWO)¹³. The precise cause of NWO is unknown, but genetics, diet, and physical activity have all been linked to the disorder. There is evidence that, compared to normal-weight healthy (NWH) participants with a normal BMI and body fat percentage, NWO subjects exhibit changes in body composition, inflammation, and oxidative stress¹³. Interestingly, a recent study from the USA showed that the prevalence of both PreD and unhealthy waist circumference (abdominal obesity) among diabetes-free adults aged 20 years and older and within a healthy BMI range significantly increased between 1988 and 2012¹¹. However, abdominal obesity does not appear to be the primary cause of the observed increase in PreD rate¹¹. Given the above-mentioned annual conversion rate from PreD to T2D, T2D preventive efforts will benefit from future research aimed at establishing the root cause of this rise to help detect PreD in primary care among NW individuals.

Over the last three decades, the Middle East region has witnessed a significant increase in obesity and T2D rates, owing primarily to the adoption of a Western lifestyle characterized by sedentary behavior and the consumption of calorie-dense foods and beverages¹⁴. PreD is also highly prevalent in the region, with reported rates ranging between 20 and 40%^15–17. These alarming data about PreD raise great concern in the region given annual conversion rates, which indicate that the T2D epidemic sweeping the region is set to worsen if nothing is done to prevent the progression from PreD to T2D. Several previous studies have investigated the risk factors of PreD and T2D in overweight or obese people. However, the risk factors associated with PreD in NW subjects have rarely been investigated.

In the present study, we used clinical, demographic, and anthropometric data of NW (BMI of 18.5 to 24.99 kg/m²) or obese (BMI ≥ 30 kg/m²) adults (aged 18 years and older) who are normoglycemic or have PreD, and applied different machine learning techniques to identify the most significant risk factors associated with PreD in NW subjects.

Methods

Study Population

We obtained cross-sectional clinical, anthropometric, and demographic data of 5996 Qatari individuals aged between 18 and 86 years (3,229 females and 2,771 males) from Qatar Biobank (QBB), a national institute that has run a well-phenotyped cohort by collecting data from the general population in Qatar since 2012¹⁸. Inclusion criteria were age over 18 years and an HbA1c < 6.5%. People with type 2 diabetes (HbA1c ≥ 6.5%) and pregnant women were excluded. The flowchart of data processing is shown in Fig. 1.

The institutional review board approved the current project at the Qatar Biomedical Research Institute (IRB number: 2017–001) and QBB (IRB number: Ex-2018‐Res‐ACC‐0123‐0067). All participants gave written informed consent for their data and biospecimens to be used in medical research.

Anthropometric and clinical measures

The Qatar Biobank (QBB) provides 52 clinical measurements along with 9 additional measurements used to assess various aspects of health and physiology through medical tests and imaging, including grip strength, 12-lead ECG, ultrasound scan of carotid arteries, Vicorder artery stiffness, retinal eye test, DXA scan of the whole body, treadmill walking test, lung function, and MRI for eligible participants. For the purpose of this study we only utilized the 52 blood test measurements along with height, weight, body fat measurement, blood pressure, and hip and waist measurements. Consequently, data on 57 variables were requested. Due to space constraints, a comprehensive list of these variables is not provided here. For detailed descriptions of the variables, please refer to the QBB website: https://www.qatarbiobank.org.qa/participate/description-measurements.

The BMI (kg/m²) was calculated as weight in kilograms (kg) divided by the square of measured height in meters (m²).

For variable categorization, well-accepted clinical guidelines were used, when available. For BMI (in kg/m²), the Caucasian cut‐offs were used, categorizing BMI into four groups: underweight (BMI < 18.5 kg/m²), normal (BMI 18.5–24.9 kg/m²), overweight (BMI 25–29.9 kg/m²) and obese (BMI ≥ 30 kg/m²).

In the analysis, we use two groups: normal-weight (NW; BMI 18.5–24.9 kg/m²) and overweight/obese (OWO; BMI ≥ 25 kg/m²).
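The BMI calculation and grouping described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names are hypothetical, and treating values between 24.9 and 25 kg/m² as normal weight is an assumption, since the text only quotes the rounded cut-offs.

```python
from typing import Optional


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2: weight divided by height squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Four-group categorization using the cut-offs quoted in the text."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:  # assumption: 24.9 < BMI < 25 counts as normal
        return "normal"
    if value < 30.0:
        return "overweight"
    return "obese"


def study_group(value: float) -> Optional[str]:
    """Two-group label: NW or OWO; underweight falls in neither group."""
    category = bmi_category(value)
    if category == "normal":
        return "NW"
    if category in ("overweight", "obese"):
        return "OWO"
    return None
```

For example, a participant weighing 70 kg with a height of 1.75 m has a BMI of about 22.9 kg/m² and falls in the NW group.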

Plasma samples of patients fasting for at least 6 h were handled according to a standard protocol within 2 h of blood collection. Fasting plasma glucose (FPG), HbA1c, triglyceride (TG), total cholesterol (TC), low-density lipoprotein cholesterol (LDL-C), and high-density lipoprotein cholesterol (HDL-C) were analyzed with an automated biochemical analyzer at the central laboratories at the Hamad Medical Corporation in Doha.

PreD cases were defined as those individuals with HbA1c between 39 mmol/mol (5.7%) and 47 mmol/mol (6.4%), whereas controls were those with HbA1c < 39 mmol/mol (5.7%).

Two more variables were calculated: the homeostasis model assessment of insulin resistance (HOMA-IR) and the homeostasis model assessment of β-cell dysfunction (HOMA-B)¹⁹. With I_0 the fasting insulin (µIU/mL) and G_0 the fasting glucose (mmol/L), they were calculated as

$$\text{HOMA-IR} = \frac{I_0 \times G_0}{22}, \qquad \text{HOMA-B} = \frac{1}{100} \cdot \frac{20 \times I_0}{G_0 - 3.5}.$$
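A minimal Python sketch of the two indices follows. The function and parameter names are hypothetical. The divisor of 22 and the scaling of HOMA-B by 100 are taken from the text as printed; many references quote 22.5 for HOMA-IR and report HOMA-B as a percentage, so both constants are left as parameters.

```python
def homa_ir(insulin_uiu_ml: float, glucose_mmol_l: float,
            divisor: float = 22.0) -> float:
    """HOMA-IR = fasting insulin x fasting glucose / divisor.

    divisor=22.0 follows the text; 22.5 is the commonly quoted value.
    """
    return insulin_uiu_ml * glucose_mmol_l / divisor


def homa_b(insulin_uiu_ml: float, glucose_mmol_l: float,
           scale: float = 100.0) -> float:
    """HOMA-B = (20 x fasting insulin) / (fasting glucose - 3.5) / scale.

    scale=100.0 follows the text; scale=1.0 gives the value in percent.
    """
    if glucose_mmol_l <= 3.5:
        raise ValueError("HOMA-B is undefined for glucose <= 3.5 mmol/L")
    return (20.0 * insulin_uiu_ml) / (glucose_mmol_l - 3.5) / scale
```

With a fasting insulin of 10 µIU/mL and a fasting glucose of 5.5 mmol/L, these functions give HOMA-IR = 2.5 and HOMA-B = 1.0, on the same scale as the values in Tables 1 and 2.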

Training and validation populations

A 65/35 split was used on the 1,160 samples. For the training of the machine learning (ML) models, we used a case–control design that included 109 cases and 645 healthy controls. To validate the models developed in the training stage, we used data from 59 cases and 347 healthy controls (see Fig. 1).
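The split above can be illustrated with a short stratified sampler. This is a sketch under stated assumptions: the paper does not say whether the split was stratified by case status, although the matching 14% PreD prevalence in both sets suggests it, and the function name and seed are hypothetical.

```python
import random


def stratified_split(labels, train_fraction=0.65, seed=0):
    """Split sample indices so each class keeps roughly the same proportion."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train, valid = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Round the per-class training size to the nearest integer.
        n_train = round(train_fraction * len(indices))
        train.extend(indices[:n_train])
        valid.extend(indices[n_train:])
    return sorted(train), sorted(valid)


# 168 cases (109 + 59) and 992 controls (645 + 347), as reported in the text.
labels = [1] * 168 + [0] * 992
train_idx, valid_idx = stratified_split(labels)
```

Applied to the reported class totals, a 65% stratified split reproduces the sizes given in the text: 109 cases and 645 controls for training, 59 cases and 347 controls for validation.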

Statistical analysis

All statistical analyses were carried out in R (version 3.32.1.1), using the R package “h2o” (version 3.17.0.4195) to build the logistic regression and the other machine learning (ML) models. Variables with > 20% missing values were excluded, except for the unsaturated iron-binding capacity (UIBC), which was kept for its importance despite 21% missing values. The remaining variables, all with < 20% missing values, were imputed using the MICE package in R.

Descriptive statistics were used to describe the baseline characteristics of participants. Continuous variables were expressed as means ± standard deviation (SD). The independent Student’s t-test was used to compare means, whereas the χ²-test was used to compare proportions and to assess the dependence between the prevalence of PreD and the different factors. Statistical significance for all tests was set at p < 0.05 (Tables 1 and 2).

Table 1.

Baseline characteristics of participants (n = 5996).

Features	Mean ± SD or proportion (%)
Age (years)	40.31 ± 12.11
Sex (% of men)	46.21
Cholesterol (mmol/L)	5.02 ± 0.97
Glucose (mmol/L)	5.13 ± 0.83
PreD (%)	33
HbA1c (%)	5.43 ± 0.45
HDL (mmol/L)	1.38 ± 0.38
LDL (mmol/L)	3.06 ± 0.89
Triglyceride (mmol/L)	1.29 ± 0.78
Waist size (cm)	89.22 ± 13.75
Hip size (cm)	107.47 ± 11.44
Waist/hip ratio	0.83 ± 0.09
SBP (mmHg)	114.72 ± 14.94
DBP (mmHg)	68.74 ± 10.67
Pulse (b.p.m)	70.15 ± 9.95
BMI (kg/m²)	29.66 ± 5.96
HOMA-IR	3.22 ± 4.92
HOMA-B	1.71 ± 2.26

Data shown as the mean ± standard deviation or proportion

Table 2.

Baseline characteristics of normal-weight participants in the training and validation datasets.

	Training dataset (n = 754)			Validation dataset (n = 406)
	Controls (n = 645)	Cases (n = 109)	P	Controls (n = 347)	Cases (n = 59)	P
Age (years)	36.37 ± 11	48.17 ± 12	< 0.001**	37 ± 10	49 ± 12	< 0.001**
Sex (% men)	47.37	49.20	0.12	46.08	51.48	0.48
Cholesterol (mmol/L)	4.72 ± 1.10	4.96 ± 0.93	< 0.001**	4.92 ± 1.90	5.01 ± 0.91	< 0.001**
Glucose (mmol/L)	5.13 ± 1.29	6.5 ± 2.01	< 0.001**	6.8 ± 1.81	7.1 ± 1.03	< 0.001**
PreD (%)		14			14
HbA1c (%)	5.30 ± 0.31	6.21 ± 0.41	< 0.001**	5.43 ± 0.32	6.39 ± 0.45	< 0.001**
HDL (mmol/L)	1.34 ± 0.33	1.38 ± 0.36	< 0.001**	1.31 ± 0.31	1.35 ± 0.33	< 0.001**
LDL (mmol/L)	3.01 ± 0.96	2.99 ± 0.86	0.026*	2.90 ± 0.82	3.00 ± 0.93	0.022*
Triglyceride (mmol/L)	1.39 ± 0.96	1.79 ± 1.00	< 0.001**	1.31 ± 0.93	1.85 ± 1.01	< 0.001**
Waist (cm)	87 ± 13	97 ± 13	< 0.001**	87 ± 13	96 ± 12	< 0.001**
Hip (cm)	106 ± 11	111 ± 11	< 0.001**	107 ± 11	110 ± 11	< 0.001**
Waist/hip ratio	0.73 ± 0.09	0.84 ± 0.91	< 0.001**	0.78 ± 0.89	0.82 ± 0.95	< 0.001**
SBP (mmHg)	112 ± 13	122 ± 16	< 0.001**	112 ± 13	122 ± 16	< 0.001**
DBP (mmHg)	68 ± 10	73 ± 11	< 0.001**	68 ± 10	73 ± 11	< 0.001**
Pulse (b.p.m)	70.01 ± 10	71.35 ± 9.01	0.18	70.15 ± 10	71.21 ± 9.95	< 0.001**
BMI (kg/m²)	29 ± 6	32 ± 6	< 0.001**	29 ± 6	32 ± 6	< 0.001**
HOMA-IR	3.22 ± 1.01	4.82 ± 1.26	< 0.001**	2.56 ± 0.98	3.91 ± 1.36	< 0.001**
HOMA-B	1.71 ± 2.25	2.0 ± 2.61	< 0.001**	1.6 ± 1.98	1.99 ± 1.89	< 0.001**

Student’s t-test was used to compare continuous variables, and the χ²-test to compare proportions.

Machine learning models

In this section, we employ a variety of machine learning algorithms, including deep learning (DL), gradient boosting machine (GBM), random forest (RF), and generalized linear models (GLM). As a baseline, we also use a logistic regression model (LR): its simplicity and ease of implementation make it accessible to researchers with limited machine learning experience who want to build their own prediction systems. The other machine learning models, in turn, can capture complex, non-linear relationships in data, which makes them effective for nuanced pattern recognition.

The package “h2o” (version 3.32.1.1)²⁰ was used for building the machine learning (ML) models.

Random Forest

Random forest (RF) belongs to the class of ensemble-based supervised learning techniques. The random forest algorithm applies the general technique of bagging, or bootstrap aggregating, to decision tree learners. Bootstrapping improves model performance because it decreases the variance of the model without increasing its bias. This means that although each tree is a weak learner and sensitive to noise within its respective data, the average or majority vote of many trees is not, as long as the trees are not correlated. Bootstrap sampling is thus used to de-correlate the trees by showing them different parts of the dataset. Random forests automatically rank the importance of variables in a classification problem by considering the average information gain corresponding to each variable across all the trees. We used the R package caret to generate random forest models²¹.
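The bagging idea can be illustrated with a toy Python sketch. It is not the caret or h2o implementation: it bags one-feature decision stumps and combines them by majority vote, and it omits the random feature subsampling at each split that a full random forest adds. The data, function names, and seed are hypothetical.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Single-threshold classifier on one feature with minimum training error."""
    best = None
    for threshold in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= threshold else right) != y
                         for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    _, threshold, left, right = best
    return lambda x: left if x <= threshold else right


def bootstrap_sample(xs, ys, rng):
    """Draw len(xs) observations with replacement."""
    picks = [rng.randrange(len(xs)) for _ in xs]
    return [xs[i] for i in picks], [ys[i] for i in picks]


def bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Fit one stump per bootstrap sample and predict by majority vote."""
    rng = random.Random(seed)
    trees = [fit_stump(*bootstrap_sample(xs, ys, rng)) for _ in range(n_trees)]

    def predict(x):
        votes = Counter(tree(x) for tree in trees)
        return votes.most_common(1)[0][0]

    return predict


# Toy one-feature data (hypothetical values): larger x means class 1.
xs = [0.8, 1.0, 1.1, 1.3, 1.6, 1.9, 2.2, 2.5]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
predict = bagged_stumps(xs, ys)
```

Each stump sees a different resample of the data, so individual thresholds vary, while the vote across stumps is stable.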

Gradient boosting machine

We used gradient boosting machine (GBM), another ensemble technique, to build a predictive model. The principal idea behind this algorithm is to construct each new base learner to be maximally correlated with the negative gradient of the loss function associated with the whole ensemble. We used the R package caret to build the GBM predictive model²¹.
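The principle can be sketched with regression stumps in Python. This sketch assumes a squared-error loss, for which the negative gradient is simply the residual; a GBM used for classification would typically optimize a log-loss instead, and this is not the caret or h2o implementation. All names and data are hypothetical.

```python
def fit_regression_stump(xs, targets):
    """Least-squares stump: one threshold, one constant value per side."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keep the right side non-empty
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Fit each new stump to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    learners, losses = [], []
    for _ in range(n_rounds):
        # Squared loss 0.5 * (y - f)^2 has negative gradient y - f.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_regression_stump(xs, residuals)
        learners.append(stump)
        predictions = [p + learning_rate * stump(x)
                       for x, p in zip(xs, predictions)]
        losses.append(sum((y - p) ** 2
                          for y, p in zip(ys, predictions)) / len(ys))

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in learners)

    return predict, losses


# Toy data (hypothetical values) with some class overlap.
xs = [0.8, 1.0, 1.1, 1.3, 1.6, 1.9, 2.2, 2.5]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
boosted, losses = gradient_boost(xs, ys)
```

The recorded training loss never increases from one round to the next, because each stump is the least-squares fit to the current residuals.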

Deep learning

Deep learning (DL) is a more complex and less interpretable machine learning technique.

Deep learning is loosely inspired by information processing and communication patterns in biological nervous systems. Recently, deep learning models have been successfully applied in computer vision, natural language processing, bioinformatics, and other fields. The problem of PreD identification is a classification problem. In the case of deep learning, we learn a non-linear mapping function t that takes as input the feature set x_i of a given sample and outputs a score y_i ∈ [0, 1], i.e. t : x_i → y_i. In this work, t is a deep fully connected feed-forward neural network (DNN) that exploits the non-linear interactions between the input features to make its prediction. A feed-forward neural network with a single hidden layer containing a finite number of neurons can approximate any continuous function under certain mild assumptions on the activation function.
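The mapping t can be made concrete with a forward pass through a tiny fully connected network. This is only an illustration: the layer sizes, weights, activation choices (ReLU in the hidden layer, sigmoid at the output), and input values are hypothetical and are not the architecture used in the study.

```python
import math


def sigmoid(z):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b), one row per neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(features, layers):
    """Apply the layers in order and return the single output score."""
    hidden = features
    for weights, biases, activation in layers:
        hidden = dense(hidden, weights, biases, activation)
    return hidden[0]


# Hypothetical network: 3 inputs -> 2 hidden ReLU units -> 1 sigmoid output.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1], relu),
    ([[1.0, -1.5]], [0.2], sigmoid),
]
score = forward([1.2, -0.4, 0.7], layers)
```

Because the last layer is a sigmoid, the score always lies strictly between 0 and 1 and can be read as a predicted probability of PreD.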

Performance measure

Results are presented as Odds Ratios (OR) with associated 95% confidence intervals (CI) for 1-SD increase of the independent variables. The predictive value for preD of each index was determined by the area under the curve (AUC) in the Receiver Operating Characteristic curve (ROC) analyses. The cut-off point was selected according to the Youden index (sensitivity + specificity − 1). Statistical significance was set at p < 0.05.
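As a worked illustration of how an odds ratio and its confidence interval follow from a logistic regression coefficient, the sketch below exponentiates the estimate and the bounds of a Wald interval. The use of a Wald interval with z = 1.96 is an assumption, since the text does not state how the intervals were computed, and the function name is hypothetical.

```python
import math


def odds_ratio_with_ci(estimate, std_error, z=1.96):
    """Return (OR, lower, upper) from a log-odds estimate and its standard error."""
    odds_ratio = math.exp(estimate)
    lower = math.exp(estimate - z * std_error)
    upper = math.exp(estimate + z * std_error)
    return odds_ratio, lower, upper


# Triglyceride row of Table 6: estimate 1.03, standard error 0.13.
tg_or, tg_lower, tg_upper = odds_ratio_with_ci(1.03, 0.13)
```

For the triglyceride coefficient in Table 6, the bounds on the log-odds scale are about 0.78 and 1.28, which matches the interval (e^0.78, e^1.28) quoted in the abstract.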

To compare the performance of machine learning models and logistic regression, we focused exclusively on the ROC curve and AUC.

Sensitivity (true positive rate)

Sensitivity is the proportion of actual positive cases that are correctly identified by the classifier. It is calculated as

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

Specificity (true negative rate)

Specificity is the proportion of actual negative cases that are correctly identified by the classifier. It is calculated as

$$\text{Specificity} = \frac{TN}{TN + FP}$$

where TP is the number of true positives, TN the number of true negatives, FN the number of false negatives, and FP the number of false positives.
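These definitions, together with the Youden index used to select the cut-off point, can be sketched as follows. The function names and the example labels are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FN, FP, TN) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)


def youden_index(tp, fn, fp, tn):
    """Youden index: sensitivity + specificity - 1."""
    return sensitivity(tp, fn) + specificity(tn, fp) - 1.0


# Hypothetical example: 3 cases and 5 controls.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
tp, fn, fp, tn = confusion_counts(y_true, y_pred)
```

In this example, 2 of 3 cases and 4 of 5 controls are classified correctly, giving a sensitivity of 2/3, a specificity of 0.8, and a Youden index of about 0.47.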

ROC curve

To plot an ROC curve, we calculate the true positive rate (sensitivity) and the false positive rate (1-specificity) at various threshold settings. Then, we plot sensitivity on the y-axis against 1-specificity on the x-axis for each threshold setting. This gives us a curve that shows how sensitivity and specificity change with different threshold values.

The AUC is then calculated by measuring the area under the ROC curve. A perfect classifier has an AUC of 1, while a completely random classifier has an AUC of 0.5.

These metrics are particularly suitable for risk score prediction as they provide a comprehensive evaluation of the model’s ability to discriminate between different risk levels, independent of any specific threshold. The ROC curve illustrates the trade-off between sensitivity and specificity, while the AUC quantifies the overall performance across all possible thresholds, ensuring a robust assessment of the model’s predictive capabilities in identifying prediabetes risk.
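The construction of the ROC curve and the AUC can be sketched in plain Python. This is a minimal illustration using the trapezoidal rule, not the routine used in the study; the function names and example scores are hypothetical.

```python
def roc_points(y_true, scores):
    """(false positive rate, true positive rate) pairs, one per threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; a sample is positive if score >= threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical scores for two controls (0) and two cases (1).
example_auc = auc(roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In this example one control outranks one case, so the AUC is 0.75; scores that separate the two classes perfectly give 1, and identical scores for every sample give 0.5.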

Results

Demographic and clinical characteristics of participants

The baseline characteristics of the participants are presented in Table 1. The percentage of men was 46.21%. The prevalence of PreD was 33%: out of the 5996 individuals, 1996 had PreD (HbA1c between 5.7% and 6.5%). The observed distributions of key variables such as age, gender, and other clinical measurements are consistent with population norms, thereby reinforcing the representativeness of our sample.

Table 2 displays the baseline characteristics of the NW participants in the training and validation datasets. Among cases, the percentage of men was 49.20% in the training population and 51.48% in the validation population. The prevalence of PreD was 14% in the two sets. HbA1c levels are significantly different between cases and controls in the two sets (p < 0.001). In the two sets, the individuals with PreD are significantly older than the healthy controls (p < 0.001). Further, the triglyceride levels, the BMI, and insulin resistance, measured with the HOMA-IR, are all significantly higher in the individuals with PreD (p < 0.001).

PreD risk factors for NW group versus others

We fitted logistic regression models separately on the NW and OWO groups. The risk factors selected by each model and their corresponding estimates are shown in Tables 3 and 4, respectively.

Table 3.

Forward and backward stepwise logistic regression for PreD in NW group. Table summarizes significant variables.

Features	OR	Estimate	Std. Error	P-value
Age (Years)	1.09	0.09	0.01	< 2e-16 ***
Phosphorus (mmol/L)	6.31	1.84	0.63	0.00 **
DBP (mmHg)	1.03	0.03	0.01	0.00 **
Uric Acid (mmol/L)	1.00	0.00	0.00	0.01 *
HOMA-IR	1.01	0.05	0.03	0.00 **
Potassium (mmol/L)	1.31	0.43	0.13	0.00 **
Estradiol (pmol/L)	0.91	0.01	0.03	0.00 **
Magnesium (mmol/L)	0.09	− 1.01	0.52	0.00 **
Vitamin B12 (mcg)	0.93	1.02	0.03	0.00 **
Sodium (mmol/L)	2.01	1.91	0.01	0.00 **
Creatinine (µmol/L)	1.90	0.31	0.05	0.00 **
Folate (pmol/L)	0.84	0.21	0.02	0.00 **
TSH (mU/L)	1.82	0.33	0.11	0.00 **
TIBC (umol/L)	2.03	0.71	0.29	0.02 *
Free Triiodothyronine (T3)	1.41	0.34	0.15	0.02 *
Iron (umol/L)	0.51	− 0.67	0.30	0.02 *
Triglyceride (mmol/L)	1.44	0.36	0.17	0.03 *
Bilirubin (umol/L)	0.95	− 0.06	0.03	0.03 *
HDL (mmol/L)	0.51	− 0.68	0.32	0.03 *
Pulse (b.p.m)	0.98	− 0.02	0.01	0.05 *

Table 4.

Logistic regression for PreD in OWO group.

Features	OR	Estimate	Std. Error	P-value
Age (Years)	1.10	0.09	0.00	< 2e-16 ***
HOMA-IR	1.09	0.09	0.01	0.00 **
Potassium (mmol/L)	1.47	0.39	0.11	0.00 **
Free Triiodothyronine	1.35	0.30	0.06	0.00 **
DBP (mmHg)	1.02	0.02	0.00	0.00 **
Uric Acid (mmol/L)	1.00	0.00	0.00	0.00 **
Testosterone (nmol/L)	0.97	− 0.03	0.01	0.00 **
Albumin (umol/L)	0.94	− 0.06	0.01	0.00 **
Bilirubin (umol/L)	0.93	− 0.07	0.01	0.00 **
Homocysteine (umol/L)	0.93	− 0.08	0.01	0.00 **
Chloride (mmol/L)	0.87	− 0.13	0.02	0.00 **
HOMA-B	0.81	− 0.21	0.03	0.00 **
Estradiol (pmol/L)	1.00	0.00	0.00	0.00 **
Magnesium (mmol/L)	0.17	− 1.77	0.62	0.00 **
Total Protein (g/L)	1.03	0.03	0.01	0.01 *
HDL (mmol/L)	0.74	− 0.31	0.12	0.01 *
Ferritin (pmol/L)	1.00	0.00	0.00	0.02 *
Waist size (cm)	1.02	0.02	0.00	0.03 *
GGT2 (IU/L)	1.00	0.00	0.00	0.03 *
Vitamin B12 (mcg)	1.00	0.00	0.00	0.04 *
Waist hip ratio	1.47	0.39	2.29	0.26
TIBC	1.03	0.71	0.29	0.02 *

Comparison between NW vs. OWO groups using logistic regression model

To identify the risk factors that are only associated with NW, we compared the set of variables selected by the NW against OWO using stepwise logistic regression models. Figure 2 shows a Venn diagram of the number of intersecting variables between the two models highlighting the risk factors in NW in green (9 unique variables and 12 overlapping variables), OWO in pink (10 unique variables and 12 overlapping variables). Table 5 lists the variables of each model, highlighting the risk factors unique to NW in green (9 variables), and OWO in pink (10 variables).

Fig. 2 — Stepwise logistic regression model: NW versus OWO.

Table 5.

Stepwise logistic regression model: NW versus OWO.

OWO	NW
Waist hip ratio	Phosphorus
Total Protein	TIBC
Waist size	Triglyceride
GGT2	Folate
Ferritin	Creatinine
Testosterone	Sodium
Albumin	Pulse
Homocysteine	TSH
Chloride	Iron
HOMA-B

Triglyceride-based model

After eliminating the 12 overlapping variables, we fitted a forward and backward stepwise logistic regression model on the NW group using only the remaining 9 variables. The risk factors selected by the model and their corresponding estimates are shown in Table 6.

Table 6.

Forward and backward stepwise logistic regression for PreD in NW subjects after eliminating the overlapping risk factors.

Features	OR	Estimate	Std. Error	P-value
Intercept	0.09	− 2.38	0.83	0.00 **
Triglyceride (mmol/L)	2.79	1.03	0.13	<0.000 ***
Folate (pmol/L)	1.04	0.04	0.01	0.00 **
Pulse (b.p.m)	0.97	− 0.03	0.01	0.00 **
TIBC (umol/L)	1.02	0.02	0.01	0.02 *

We noticed that, of the remaining four risk factors, triglyceride is the most relevant, with an odds ratio of 2.79 and a significant p-value. We therefore built a model with triglyceride as the main risk factor, adjusted for age and gender. To measure the performance of the model, we used the Receiver Operating Characteristic (ROC) curve. Figure 3 shows the ROC curve of the triglyceride-based model, with an AUC equal to 86.27%.

Fig. 3 — ROC curve of the triglyceride-based model with an AUC equal to 86.27%.

Furthermore, we extended the model by adding the other 3 risk factors (Folate, Pulse, and TIBC) in order of their odds ratio, to investigate their effect on the AUC. Figure 4 shows the ROC curves of the 8 resulting nested models.

Fig. 4 — ROC curve of the different 8 nested models.

We noticed that the AUC performance slightly improved when we included the TIBC and Pulse risk factors but decreased when Folate was added to the model.

Comparison between NW vs. OWO models using ML models

We followed the same steps done with LR using other ML approaches. First, we used 4 ML algorithms to rank the risk factors. Table 7 shows the top 10 risk factors ranked by each ML algorithm for OWO individuals. Most importantly, Table 8 shows the top 10 risk factors ranked by each ML algorithm for NW. We noticed that in the 10 top ranks of the ML algorithms, Age was ranked first for most of the models. In GBM¹, GLM¹, DL¹ and RF¹, triglyceride is ranked 2nd, 6th, 3rd and 3rd respectively.

Table 7.

The top 10 risk factors ranked by each ML algorithm for the OWO individuals.

DL¹		GBM¹		GLM¹		RF¹
1	HOMA-IR	1	Age	1	Age	1	Age
2	Age	2	HOMA-IR	2	Waistsize	2	HOMA-IR
3	HOMA-B	3	HOMA-B	3	HOMA-IR	3	Waistsize
4	Homocysteine	4	Waistsize	4	Chloride	4	SBP
5	Bilirubin	5	Bilirubin	5	UIBC	5	HOMA-B
6	GGT2	6	UIBC	6	UricAcid	6	WaistHipRatio
7	DBP	7	GGT2	7	Sex	7	Bilirubin
8	UricAcid	8	DBP	8	HOMA-B	8	DBP
9	Potassium	9	FreeTriiodothyronine	9	Bilirubin	9	Bilirubin
10	Testosterone	10	WaistHipRatio	10	FreeTriiodothyronine	10	Folate

¹DL: Deep Learning. GBM: Gradient Boosting Machine. GLM: Generalized Linear Model. RF: Random Forest.

Table 8.

The top 10 risk factors ranked by each ML algorithm for the NW individuals.

DL¹		GBM¹		GLM¹		RF¹
1	Pulse	1	Age	1	Age	1	Age
2	Age	2	Triglyceride	2	WaistHipRatio	2	Pulse
3	Triglyceride	3	HOMA-B	3	Pulse	3	Triglyceride
4	TIBC	4	FreeTriiodothyronine	4	UricAcid	4	WaistHipRatio
5	Age	5	Phosphorus	5	TIBC	5	DBP
6	Phosphorus	6	UIBC	6	Triglyceride	6	UricAcid
7	DBP	7	Cholesterol	7	Phosphorus	7	Cholesterol
8	Iron	8	DBP	8	Iron	8	Phosphorus
9	Homocysteine	9	SBP	9	Potassium	9	HOMA-B
10	Hipsize	10	Potassium	10	FreeTriiodothyronine	10	FreeThyroxine

¹DL: Deep Learning. GBM: Gradient Boosting Machine. GLM: Generalized Linear Model. RF: Random Forest.

To identify the risk factors that are associated only with NW, we compared the variables selected by the machine learning models for NW (Table 8) against those selected for OWO (Table 7). Table 9 shows the remaining risk factors ranked by each ML algorithm for the NW individuals after eliminating the overlapping ones. We noticed that triglyceride was a common factor picked by all 4 ML models. DL¹ did not rank triglyceride at the top, probably because DL¹ tends to give good results on large datasets, which is not the case here. We also noticed that Phosphorus was picked by some but not all ML models, but investigating it is beyond the aim of this study.

Table 9.

The top risk factors ranked by each ML algorithm for NW individuals after eliminating the overlapping variables present in the OWO group.

GLM¹		GBM¹		DL¹		RF¹
1	Triglyceride	1	Triglyceride	1	WaistHipRatio	1	UricAcid
2	Potassium	2	Phosphorus	2	TIBC	2	Triglyceride
3	Phosphorus	3	CholesterolTotal	3	Hipsize	3	Cholesterol
4	Iron	4	SBP	4	Triglyceride	4	FreeThyroxine
5	Pulse	5	Potassium	5	Phosphorus	5	Pulse

Test of equality of ROC areas between machine learning and logistic regression

After running the four machine learning models and the logistic regression model using only triglyceride (adjusted for age and gender), we summarized the performance of the 4 machine learning models in Fig. 5. Next, we tested whether the area under the ROC curve for logistic regression differs significantly from that for DL¹, RF¹, GBM¹, and GLM¹. To perform this task, we used the roccomp function, which compares the ROC curves of multiple classifiers^21,22.

Fig. 5 — ROC curve of the different 4 machine learning models using only triglyceride.

For each curve, roccomp reports summary statistics and provides a test for the equality of the area under the curves, using an algorithm suggested by DeLong et al. (1988)²¹.

Table 10 summarizes the p-value of each χ²-test statistic obtained after applying “roccomp”²². We can see that the ROC area of logistic regression is significantly different from that of the Random Forest and Gradient Boosting Machine models.
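The step from a χ² statistic to its p-value can be illustrated in Python. The sketch assumes one degree of freedom, which is the natural choice when two ROC areas are compared but is not stated in the text; under that assumption the p-value has the closed form erfc(√(χ²/2)). The function name is hypothetical.

```python
import math


def chi2_pvalue_1df(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    # For 1 df, P(X > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


# Statistic reported for LR versus GBM in Table 10.
p_lr_vs_gbm = chi2_pvalue_1df(7.63)
```

Under this assumption, the statistics 1.19, 7.63, and 0.63 reported in Table 10 give p-values close to the tabulated 0.275, 0.0058, and 0.4275.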

Table 10.

Test of equality of ROC areas between logistic regression and all the other machine learning algorithms¹.

Model A versus Model B	n	ROC, A (B)	Std. Error, A (B)	95% CI, A [B]	Chisq	p-value
LR versus DL¹	406	0.8627 (0.8560)	0.0261 (0.0284)	[0.812; 0.914] [0.800; 0.912]	1.19	0.275
LR versus RF¹	406	0.8627 (0.7077)	0.0261 (0.0382)	[0.815; 0.914] [0.633; 0.783]	21.73	0.0 ***
LR versus GBM¹	406	0.8627 (0.7570)	0.0261 (0.0412)	[0.812; 0.914] [0.676; 0.838]	7.63	0.0058 **
LR versus GLM¹	406	0.8627 (0.8612)	0.0261 (0.0266)	[0.815; 0.914] [0.810; 0.913]	0.63	0.4275

Conclusion & Discussion

To the best of our knowledge, this study is the first to comprehensively investigate the risk factors associated with PreD in a cohort of NW adult individuals in Qatar using different multivariable machine learning (ML) techniques. Our approach allows for a nuanced understanding of the contributory factors, providing a foundation for targeted preventive measures. The different ML models we developed indicate a robust positive correlation between high triglyceride levels and the odds of having PreD in NW Qatari adults.

T2D is a significant global public health issue, with incidence and mortality rates consistently increasing in most countries²³. Prediabetes, recognized as a significant independent risk factor for T2D, is central to this public health challenge. The 5–10% annual conversion rate from PreD to T2D³ signals a warning that, without intervention, the T2D epidemic is likely to worsen in the future.

Obesity is a well-established risk factor for both T2D and prediabetes. Hence, current guidelines from national and international health organizations regarding the screening for PreD and T2D are generally limited to individuals who are overweight or obese¹⁰. This focus, however, may result in missed opportunities for the early detection of undiagnosed disease in individuals with a healthy weight, given that several studies have shown that prediabetes/T2D can strike hard even when weight is in the normal range (BMI between 18.5 and 24.9 kg/m²)^11,24–27.

Significant efforts have been dedicated to identifying and understanding the risk factors associated with PreD in obese individuals. Nevertheless, the factors linked to PreD in NW individuals remain a topic of ongoing debate. Despite the lower prevalence of PreD in NW individuals compared to their overweight and obese counterparts, identifying the specific metabolic and physiological drivers in this population is crucial. This knowledge is essential for developing effective prevention strategies and personalized clinical management approaches tailored to the unique needs of NW individuals. Understanding these factors is vital for the early detection of prediabetes, which, combined with timely and effective intervention strategies, is crucial in preventing its progression to T2D and potentially reducing the overall incidence and associated mortality rates of T2D.

The different ML models we developed indicate a robust positive correlation between triglyceride levels and PreD in NW Qatari adult individuals. The positive correlation between elevated triglyceride levels and T2D in obese individuals is well established. A recent study involving 1341 people aged 25–44 years reported that high triglyceride levels, obesity, and a low level of education are associated with the risk of developing T2D, regardless of other factors²⁸. It was also reported that a rise in triglyceride levels over time increases the risk of T2D in young men independently of traditional risk factors and associated changes in BMI and lifestyle parameters²⁹. Furthermore, fasting triglycerides in the upper normal range were shown to be independently associated with an increased risk of diabetes mortality in representative USA populations³⁰,³¹. Zheng and colleagues recently showed a graded positive association between elevated TG levels and inadequate glycemic control for patients with insulin-treated T2D in China³². A linear relation analysis also suggested that a triglyceride genotype score (involving 25 well-established single nucleotide polymorphisms) is linearly related to elevated T2D risk³³.

Elevated blood triglyceride levels may occur because of several factors, including (1) genetic factors, which increase the predisposition of some people to higher triglyceride levels, regardless of their weight; (2) dietary factors, mainly the consumption of excessive amounts of simple carbohydrates (sugars and refined grains) and fats, especially saturated and trans fats; (3) insulin resistance, which can develop even in lean individuals; (4) physical inactivity, which can contribute to higher triglyceride levels, irrespective of body weight; (5) medical conditions, including hypothyroidism, kidney disease, or liver disease; and (6) medications, such as corticosteroids, beta-blockers, and specific immunosuppressants.

We do not have data on our sample population’s genetic and dietary factors, physical activity, medical conditions, or medications. However, our data indicate that insulin resistance (measured using HOMA-IR; Table 2), a hallmark of PreD and type 2 diabetes (T2D) (PMID: 28697184), is elevated in the NW participants of this study. Therefore, while we cannot exclude the role of other factors, the elevated triglyceride levels observed in NW prediabetic individuals in this study may reflect a certain degree of insulin resistance. Symptoms of metabolic syndrome, typically linked to abdominal obesity, are also relatively common among individuals with a normal BMI and waist circumference, a condition referred to as “metabolically obese normal weight” (MONW) or the TOFI phenotype (thin outside, fat inside). This phenotype suggests that despite appearing healthy, NW individuals may have a threshold of fat that renders them insulin-resistant, leading to PreD and eventually T2D³⁴.
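For readers unfamiliar with the homeostasis model assessment of insulin resistance referred to above, the index is conventionally computed from fasting glucose and fasting insulin. The sketch below uses the standard formula (glucose in mmol/L times insulin in µU/mL, divided by 22.5); the function names and the input values are illustrative assumptions and are not taken from the study data or its analysis code.

```python
def homa_ir(fasting_glucose_mmol_l: float, fasting_insulin_uU_ml: float) -> float:
    """Conventional HOMA insulin-resistance index.

    Glucose in mmol/L, insulin in microunits per mL.
    """
    return fasting_glucose_mmol_l * fasting_insulin_uU_ml / 22.5


def homa_ir_from_mg_dl(fasting_glucose_mg_dl: float, fasting_insulin_uU_ml: float) -> float:
    """Same index when glucose is reported in mg/dL (405 = 22.5 * 18)."""
    return fasting_glucose_mg_dl * fasting_insulin_uU_ml / 405.0


# Hypothetical participant, for illustration only (not study data).
example = homa_ir(fasting_glucose_mmol_l=5.0, fasting_insulin_uU_ml=9.0)
print(f"HOMA-IR = {example:.2f}")
```

Higher values indicate greater insulin resistance; the cut-off used to label a participant as insulin-resistant varies between populations and is not assumed here.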

The high levels of triglycerides may also indicate hepatic fat accumulation. Recent epidemiological studies have indeed shown that about 20% of the total cases of non-alcoholic fatty liver disease (NAFLD) patients are lean³⁵. Insulin resistance is the primary driver of NAFLD³⁶, and lean NAFLD is associated with an approximately 1.6-fold increased mortality risk³⁷.

Consequently, the findings of this study suggest that NW individuals having PreD may have insulin resistance that leads to fatty liver, which ultimately increases blood triglyceride levels. These factors combined lead to dysregulated glucose homeostasis and prediabetes.

Another factor potentially contributing to the occurrence of PreD in NW individuals is age. Our data indicate that prediabetic NW individuals are significantly older than their healthy counterparts. It is well established that the risk for metabolic diseases, including metabolic syndrome, insulin resistance, and diabetes, increases with age³⁸-³⁹. Furthermore, age-related alterations in plasma triglyceride metabolism and fatty acid partitioning significantly contribute to these metabolic diseases. Specifically, age-induced changes in human triglyceride metabolism include increased plasma triglyceride levels, reduced postprandial plasma triglyceride clearance rates, and elevated ectopic fat deposition, all of which can contribute to age-associated metabolic conditions⁴⁰.

One of the strengths of our study is its large sample size. According to the Qatar Planning and Statistics Authority, the population of Qatar at the end of March 2024 was 3,080,804 people, with Qataris accounting for approximately 12% of the total, equating to around 369,696 individuals (https://www.psa.gov.qa/en/Pages/default.aspx; accessed on June 26, 2024). In 2015, individuals under 19 made up 47% of all Qatari nationals (https://gulfmigration.org/qatar-population-nationality-qatari-non-qatari-five-year-age-group-2015/). Assuming this percentage remained constant in 2024, approximately 195,939 Qatari adults would be eligible for our study. Thus, our sample of roughly 6,000 individuals covers about 3% of the eligible adult Qatari population (6,000/195,939 ≈ 3%).
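The estimate of the eligible population can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in the paragraph above; the function and variable names are illustrative assumptions, and the 47% share of minors is the 2015 value carried forward to 2024, as the text assumes.

```python
def eligible_adults(total_population: int, national_share: float, minor_share: float):
    """Return (number of nationals, number of adult nationals)."""
    nationals = total_population * national_share
    adults = nationals * (1.0 - minor_share)
    return nationals, adults


# Figures quoted in the text: Qatar population (March 2024), share of
# Qatari nationals, and share of nationals under 19 (2015 value).
nationals, adults = eligible_adults(3_080_804, 0.12, 0.47)

# Approximate sample size used in the text.
sampling_fraction = 6000 / adults

print(f"Qatari nationals:  {nationals:,.0f}")
print(f"Eligible adults:   {adults:,.0f}")
print(f"Sampling fraction: {sampling_fraction:.1%}")
```

Note that this ratio is a sampling fraction, that is, the share of the eligible population included in the study; it is not a formal statistical power calculation.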

Additionally, the data used in our research were obtained from a well-phenotyped cohort representative of the general population. Our study is also the first to demonstrate the utility of triglyceride levels in identifying PreD in NW individuals within a Middle Eastern population. Given the shared environmental factors and lifestyle habits, as well as genetic background and ethnicity among many Middle Eastern countries, particularly the Gulf Cooperation Council nations (Qatar, Bahrain, Saudi Arabia, United Arab Emirates, Kuwait, and Oman), our findings may apply similarly in many of these countries.

The main limitation of our study is the cross-sectional design, which does not allow the findings to be used to predict future prediabetes. However, the QBB has recently started to call back the participants for a 5-year follow-up, which will open new avenues for assessing the predictive ability of the different indices longitudinally. We also did not adjust for parameters such as smoking status, medication, or physical activity. Finally, the present study’s findings may not be generalizable to all populations due to the ethnic and geographic characteristics of the study population.

In conclusion, our study demonstrates a strong correlation between elevated blood triglyceride levels and PreD in NW individuals. This finding highlights the potential of triglyceride levels as a biomarker for the early detection of PreD in this population. By identifying individuals at risk through this biomarker, healthcare providers can implement timely and effective intervention strategies to prevent the progression of PreD to full-blown T2D. Given the rising prevalence of T2D globally, especially in populations not traditionally considered at risk, such as those with normal weight, these insights are critical for improving preventative healthcare measures and reducing the overall burden of diabetes. Further investigations are warranted to better understand the mechanisms underlying the elevated TG levels in NW individuals and to longitudinally examine the causal link between elevated TGs and PreD development.

Acknowledgements

We want to thank Qatar Biobank for facilitating access to the data and providing us with expert advice. We are also grateful to all the participants of the study.

Author contributions

HB and AA conceived and designed the study. KE and MA performed the statistical analysis. HB, AA, and KE interpreted the results and wrote the manuscript. NA and JJ revised the statistical analysis. All the authors reviewed the results, edited the manuscript, and approved the final version. HB and AA are the guarantors of this work and, as such, have full access to all the data in the study and take responsibility for the integrity and accuracy of the data.

Funding

The project was funded by intramural grants from the Qatar Biomedical Research Institute to AA and from the Qatar Computing Research Institute to HB.

Data availability

The datasets used in the current study are available from the QBB and can be obtained upon submission of an application; they are also available on request from the corresponding author.

Declarations

Conflict of interest

The authors declare that the research was conducted without any commercial or financial relationships construed as a potential conflict of interest.

Ethical approval

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Abdelilah Arredouani, Email: aarredouani@hbku.edu.qa.

Halima Bensmail, Email: hbensmail@hbku.edu.qa.

References

1. Saeedi, P. et al. Global and regional diabetes prevalence estimates for 2019 and projections for 2030 and 2045: Results from the International Diabetes Federation Diabetes Atlas, 9th edition. Diabetes Res Clin Pract. 157, 107843. 10.1016/j.diabres.2019.107843 (2019). Epub 2019 Sep 10. PMID: 31518657.
2. Tabák, A. G., Herder, C., Rathmann, W., Brunner, E. J. & Kivimäki, M. Prediabetes: a high-risk state for diabetes development. Lancet. 379(9833), 2279–2290. 10.1016/S0140-6736(12)60283-9 (2012). Epub 2012 Jun 9. PMID: 22683128; PMCID: PMC3891203.
3. Nathan, D. M. et al. Impaired fasting glucose and impaired glucose tolerance: implications for care. Diabetes Care. 30(3), 753–759. 10.2337/dc07-9920 (2007). PMID: 17327355.
4. de Vegt, F. et al. Relation of impaired fasting and postload glucose with incident type 2 diabetes in a Dutch population: The Hoorn Study. JAMA. 285(16), 2109–2113. 10.1001/jama.285.16.2109 (2001). PMID: 11311100.
5. Jiang, L. et al. Long-term outcomes of Lifestyle intervention to prevent diabetes in American Indian and Alaska native communities: the Special Diabetes Program for Indians Diabetes Prevention Program. Diabetes Care. 41(7), 1462–1470. 10.2337/dc17-2685 (2018). PMID: 29915128; PMCID: PMC6014547.
6. Knowler, W. C. et al. Reduction in the incidence of type 2 diabetes with lifestyle intervention or metformin. N Engl. J. Med. 346(6), 393–403. 10.1056/NEJMoa012512 (2002). PMID: 11832527; PMCID: PMC1370926.
7. Diabetes Prevention Program Research Group et al. 10-year follow-up of diabetes incidence and weight loss in the Diabetes Prevention Program Outcomes Study. Lancet. 374(9702), 1677–1686. 10.1016/S0140-6736(09)61457-4 (2009). Epub 2009 Oct 29. Erratum in: Lancet. 374(9707), 2054 (2009). PMID: 19878986; PMCID: PMC3135022.
8. Pan, X. R. et al. Effects of diet and exercise in preventing NIDDM in people with impaired glucose tolerance. The Da Qing IGT and Diabetes Study. Diabetes Care. 20(4), 537–544. 10.2337/diacare.20.4.537 (1997). PMID: 9096977.
9. Al-Goblan, A. S., Al-Alfi, M. A. & Khan, M. Z. Mechanism linking diabetes mellitus and obesity. Diabetes Metab Syndr Obes. 7, 587–591. 10.2147/DMSO.S67400 (2014). PMID: 25506234; PMCID: PMC4259868.
10. American Diabetes Association. Diagnosis and classification of diabetes mellitus. Diabetes Care. 35(Suppl 1), S64–S71. 10.2337/dc12-s064 (2012). PMID: 22187472; PMCID: PMC3632174.
11. Mainous, A. G. 3rd, Tanner, R. J., Jo, A. & Anton, S. D. Prevalence of prediabetes and abdominal obesity among healthy-weight adults: 18-Year Trend. Ann. Fam Med. 14(4), 304–310. 10.1370/afm.1946 (2016). PMID: 27401417; PMCID: PMC4940459.
12. Suliga, E., Kozieł, D. & Głuszek, S. Prevalence of metabolic syndrome in normal weight individuals. Ann Agric Environ Med. 23(4), 631–635. 10.5604/12321966.1226858 (2016). PMID: 28030935.
13. Wijayatunga, N. N. & Dhurandhar, E. J. Normal weight obesity and unaddressed cardiometabolic health risk-a narrative review. Int J Obes (Lond). 45(10), 2141–2155. 10.1038/s41366-021-00858-7 (2021). Epub 2021 May 18. Erratum in: Int J Obes (Lond). (2021). PMID: 34007010.
14. Sherif, S. & Sumpio, B. E. Economic development and diabetes prevalence in MENA countries: Egypt and Saudi Arabia comparison. World J. Diabetes. 6(2), 304–311. 10.4239/wjd.v6.i2.304 (2015). PMID: 25789111; PMCID: PMC4360423.
15. Aldossari, K. K. et al. Prevalence of Prediabetes, Diabetes, and its Associated Risk factors among males in Saudi Arabia: a Population-based survey. J. Diabetes Res. 2018, 2194604. 10.1155/2018/2194604 (2018). PMID: 29850599; PMCID: PMC5941814.
16. Alkandari, A. et al. The prevalence of pre-diabetes and diabetes in the Kuwaiti adult population in 2014. Diabetes Res Clin Pract. 144, 213–223. 10.1016/j.diabres.2018.08.016 (2018). Epub 2018 Sep 1. PMID: 30179683.
17. Assaad Khalil, S. H. et al. Prediabetes management in the Middle East, Africa and Russia: Current status and call for action. Diab Vasc Dis Res. 16(3), 213–226. 10.1177/1479164118819665 (2019). Epub 2019 Jan 3. PMID: 30606039.
18. Al Akl, N. S., Haoudi, E. N., Bensmail, H. & Arredouani, A. The triglyceride glucose-waist-to-height ratio outperforms obesity and other triglyceride-related parameters in detecting prediabetes in normal-weight Qatari adults: a cross-sectional study. Front. Public. Health. 11, 1086771. 10.3389/fpubh.2023.1086771 (2023). PMID: 37089491; PMCID: PMC10117653.
19. Salgado, A. L., Carvalho Ld, Oliveira, A. C., Santos, V. N., Vieira, J. G. & Parise, E. R. Insulin resistance index (HOMA-IR) in the differentiation of patients with non-alcoholic fatty liver disease and healthy individuals. Arq Gastroenterol. 47(2), 165–169. 10.1590/s0004-28032010000200009 (2010). PMID: 20721461.
20. Landry, M. Machine learning with R and H2o. (2018). http://h2o.ai/resources/, seventh edition.
21. DeLong, E. R., DeLong, D. M. & Clarke-Pearson, D. L. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 44(3), 837–845 (1988). PMID: 3203132.
22. Pepe, M. S. The Statistical Evaluation of Medical Tests for Classification and Prediction (Oxford University Press, 2003).
23. Sun, H. et al. IDF Diabetes Atlas: Global, regional and country-level diabetes prevalence estimates for 2021 and projections for 2045. Diabetes Res Clin Pract. 183, 109119 (2022).
24. Gujral, U. P., Weber, M. B., Staimez, L. R. & Narayan, K. M. V. Diabetes Among Non-Overweight Individuals: an Emerging Public Health Challenge. Curr Diab Rep. 18(8), 60. 10.1007/s11892-018-1017-1 (2018). PMID: 29974263.
25. Vicks, W. S. et al. Prevalence of prediabetes and diabetes vary by ethnicity among U.S. Asian adults at healthy weight, overweight, and obesity ranges: an electronic health record study. BMC Public. Health. 22(1), 1954. 10.1186/s12889-022-14362-8 (2022). PMID: 36273116; PMCID: PMC9587616.
26. Salvatore, T. et al. Current knowledge on the pathophysiology of Lean/Normal-Weight type 2 diabetes. Int. J. Mol. Sci. 24(1), 658. 10.3390/ijms24010658 (2022). PMID: 36614099; PMCID: PMC9820420.
27. Taylor, R. & Holman, R. R. Normal weight individuals who develop type 2 diabetes: the personal fat threshold. Clin Sci (Lond). 128(7), 405–410. 10.1042/CS20140553 (2015). PMID: 25515001.
28. Garbuzova, E. V. et al. Triglycerides, Obesity and Education Status Are Associated with the risk of developing type 2 diabetes in young adults, Cohort Study. J. Pers. Med. 13(9), 1403. 10.3390/jpm13091403 (2023). PMID: 37763170; PMCID: PMC10533043.
29. Tirosh, A. et al. Changes in triglyceride levels over time and risk of type 2 diabetes in young men. Diabetes Care. 31(10), 2032–2037. 10.2337/dc08-0825 (2008). Epub 2008 Jun 30. PMID: 18591400; PMCID: PMC2551650.
30. Wang, Y. Fasting triglycerides in the Upper Normal Range are independently Associated with an increased risk of diabetes mortality in a large representative US Population. J. Cardiovasc. Dev. Dis. 11(4), 128. 10.3390/jcdd11040128 (2024). PMID: 38667746; PMCID: PMC11050947.
31. Wang, Y. Higher fasting triglyceride predicts higher risks of diabetes mortality in US adults. Lipids Health Dis. 20(1), 181. 10.1186/s12944-021-01614-6 (2021). PMID: 34930280; PMCID: PMC8686260.
32. Zheng, D. et al. Association Between Triglyceride Level and Glycemic Control Among Insulin-Treated Patients With Type 2 Diabetes. J Clin Endocrinol Metab. 104(4), 1211–1220. 10.1210/jc.2018-01656 (2019). PMID: 30418583.
33. Qi, Q., Liang, L., Doria, A., Hu, F. B. & Qi, L. Genetic predisposition to dyslipidemia and type 2 diabetes risk in two prospective cohorts. Diabetes. 61(3), 745–752. 10.2337/db11-1254 (2012). Epub 2012 Feb 7. PMID: 22315312; PMCID: PMC3282815.
34. Taylor, R. et al. Aetiology of type 2 diabetes in people with a ‘normal’ body mass index: testing the personal fat threshold hypothesis. Clin. Sci. (Lond). 137(16), 1333–1346. 10.1042/CS20230586 (2023). PMID: 37593846; PMCID: PMC10472166.
35. Ye, Q. et al. Global prevalence, incidence, and outcomes of non-obese or lean non-alcoholic fatty liver disease: a systematic review and meta-analysis. Lancet Gastroenterol. Hepatol. 5(8), 739–752. 10.1016/S2468-1253(20)30077-7 (2020). Epub 2020 May 12. PMID: 32413340.
36. Nogueira, J. P. & Cusi, K. Role of insulin resistance in the development of nonalcoholic fatty liver disease in people with type 2 diabetes: from bench to Patient Care. Diabetes Spectr. 37(1), 20–28. 10.2337/dsi23-0013 (2024 Winter). Epub 2024 Feb 15. PMID: 38385099; PMCID: PMC10877218.
37. Wongtrakul, W., Charatcharoenwitthaya, N. & Charatcharoenwitthaya, P. Lean non-alcoholic fatty liver disease and the risk of all-cause mortality: an updated meta-analysis. Ann. Hepatol. 29(3), 101288. 10.1016/j.aohep.2024.101288 (2024 May-Jun). Epub 2024 Jan 24. PMID: 38278181.
38. Cao, Q. et al. Age-specific prevalence, subtypes and risk factors of metabolic diseases in Chinese adults and the different patterns from other racial/ethnic populations. BMC Public. Health. 22(1), 2078. 10.1186/s12889-022-14555-1 (2022). PMID: 36376828; PMCID: PMC9664823.
39. Stephens, C. R. et al. The impact of Education and Age on Metabolic disorders. Front. Public. Health. 8, 180. 10.3389/fpubh.2020.00180 (2020). PMID: 32671006; PMCID: PMC7326131.
40. Spitler, K. M. & Davies, B. S. J. Aging and plasma triglyceride metabolism. J. Lipid Res. 61(8), 1161–1167. 10.1194/jlr.R120000922 (2020). Epub 2020 Jun 25. PMID: 32586846; PMCID: PMC7397742.
