Journal of Diabetes Investigation. 2025 Mar 21;16(6):1055–1064. doi: 10.1111/jdi.70010

Machine learning‐based risk prediction model for neuropathic foot ulcers in patients with diabetic peripheral neuropathy

Ge Shi 1, Zhenxuan Gao 2, Ze Zhang 3, Quanyu Jin 2, Sitong Li 4, Jiaxin Liu 4, Lei Kou 3, Abudurezhake Aerman 3, Wenqiang Yang 5, Qi Wang 5, Furong Cai 6, Li Zhang 5
PMCID: PMC12131921  PMID: 40116696

ABSTRACT

Background

Diabetic peripheral neuropathy (DPN) is a common chronic complication of diabetes, marked by symptoms like hyperalgesia, numbness, and swelling that impair quality of life. Nerve conduction abnormalities in DPN significantly increase the risk of neuropathic foot ulcers (NFU), which can progress rapidly and lead to severe outcomes, including infection, gangrene, and amputation. Early prediction of NFU in DPN patients is crucial for timely intervention.

Methods

Clinical data from 400 DPN patients treated at the China–Japan Friendship Hospital (September 2022–2024) were retrospectively analyzed. Data included medical histories, physical exams, biochemical tests, and imaging. After feature selection and data balancing, the dataset was split into training and validation subsets (8:2 ratio). Six machine learning algorithms—random forest, decision tree, logistic regression, K‐nearest neighbor, extreme gradient boosting, and multilayer perceptron—were evaluated using k‐fold cross‐validation. Model performance was assessed via accuracy, precision, recall, F1 score, and AUC. The SHAP method was employed for interpretability.

Results

The multilayer perceptron model showed the best performance (accuracy: 0.875; AUC: 0.901). SHAP analysis highlighted triglycerides, high‐density lipoprotein cholesterol, diabetes duration, age, and fasting blood glucose as key predictors.

Conclusions

A machine learning‐based prediction model using a multilayer perceptron algorithm effectively identifies DPN patients at high NFU risk, offering clinicians an accurate tool for early intervention.

Keywords: Diabetic peripheral neuropathy, Machine learning, Prediction model


This study developed a machine learning‐based prediction model to identify diabetic peripheral neuropathy patients at high risk of neuropathic foot ulcers, using clinical and biochemical data. The multilayer perceptron model achieved the highest accuracy and interpretability, providing a valuable tool for early diagnosis and intervention.


INTRODUCTION

As living standards have risen, modes of transportation, dietary habits, and lifestyle pressures have changed markedly. According to data from the International Diabetes Federation (IDF) in 2013, the global prevalence of diabetes reached 382 million cases, with projections indicating a rise to 592 million by 2035 1 . Diabetic peripheral neuropathy (DPN), a prevalent complication of diabetes, affects approximately 30% of diabetic patients 2 . Early manifestations of DPN predominantly involve sensory disturbances, including hyperalgesia, numbness, pain, and swelling in the distal extremities 3 . As the condition progresses, patients may experience sensory deficits and motor nerve impairment, leading to a substantial decline in quality of life 4 . Similarly, diabetic foot ulcer (DFU) represents another frequent complication, impacting an estimated 26% of diabetic patients globally over the course of their lifetime 5 . DFU presents significant challenges in clinical management, imposes a substantial economic burden on healthcare systems globally, and profoundly diminishes patients' quality of life 6 .

DFU can be classified etiologically into neuropathic, ischemic, and mixed (neuro‐ischemic) foot ulcers 7 . Peripheral neuropathy has long been regarded as the primary factor contributing to the development of DFU 8 . Sensory nerve involvement in peripheral neuropathy leads to a loss of protective sensation in the feet, rendering patients unable to detect abnormal pressure 9 . This loss of sensation results in the development of insensitive feet, which, when subjected to repeated mechanical stress, undergo inflammatory and aseptic tissue autolysis, ultimately forming pressure ulcers 10 . These ulcers, commonly located on the plantar forefoot, are termed typical diabetic neuropathic ulcers. Research indicates that approximately half of DFU cases are neuropathic ulcers at the time of initial presentation, while the majority of the remaining cases are mixed ulcers, with purely ischemic ulcers being comparatively rare 11 . These findings underscore that peripheral neuropathy is the central cause of diabetic neuropathic foot ulcers, which represent an advanced stage of peripheral neuropathy. Consequently, preventing DFU necessitates targeted strategies aimed at mitigating the progression of DPN into neuropathic ulcers.

Machine learning, as a critical branch of artificial intelligence, is transforming the healthcare sector through rapid, efficient, and accurate computations 12 . Currently, machine learning plays a significant role in the prediction of many common diseases. Numerous studies on DFU have demonstrated the exceptional predictive performance of machine learning algorithms. For instance, Khandakar et al. 13 utilized convolutional neural networks to develop a predictive model capable of diagnosing diabetic foot at an early stage of diabetes. In prognostic prediction, Nanda et al. 14 applied machine learning methods to identify risk factors for DFU in diabetic patients and developed related predictive models. Additionally, Chien Wei Oei and collaborators used multiple machine learning algorithms to construct a model predicting the risk of amputation within 180 days of hospital admission in DFU patients 15 . In summary, compared with traditional predictive models, machine learning not only enhances diagnostic and therapeutic efficiency but also optimizes the allocation of healthcare resources, ultimately reducing overall healthcare costs.

METHODS

Study population

This study included 400 patients with DPN who received treatment at the Department of Neurosurgery, China–Japan Friendship Hospital, between September 2022 and September 2024. Participants were categorized into two groups: those without neuropathic foot ulcers (N = 343) and those with neuropathic foot ulcers (N = 57). Neuropathic foot ulcers were defined as ulcers, infections, or tissue destruction in deep tissues below the ankle joint resulting from neuropathy. Comprehensive patient data were collected, encompassing demographic information, blood test results, imaging findings, and physical examination records. These variables were subsequently classified as categorical, continuous, or ordinal (details provided in Table S1). The entire research process, including feature selection, data balancing, model building, model evaluation, and model interpretation, is illustrated in Figure 1.

Figure 1. Study design for building a machine learning model to predict neuropathic foot ulcers.

Data extraction

This study selected patients based on strict inclusion and exclusion criteria. The inclusion criteria required a clinical diagnosis of DPN and included the following: (1) a confirmed history of diabetes or evidence of glucose metabolism abnormalities; (2) neuropathy occurring at or after the diagnosis of diabetes; (3) for patients with typical DPN clinical symptoms (e.g., pain, numbness, or sensory abnormalities), at least one positive result from five tests (ankle reflex, pinprick sensation, vibration sense, pressure sense, and temperature sense) is required; and (4) for patients without typical DPN clinical symptoms, at least two positive results from the five tests are required. The exclusion criteria were (1) neuropathy caused by other etiologies and (2) severe arterial or venous vascular disease.
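The inclusion and exclusion rules above can be read as a simple decision function. The following is a minimal sketch, not the authors' screening procedure: the function name, argument names, and test keys are hypothetical and chosen only for illustration.

```python
# Hypothetical keys for the five bedside tests named in the inclusion criteria.
TESTS = ("ankle_reflex", "pinprick", "vibration", "pressure", "temperature")


def meets_dpn_criteria(has_diabetes, neuropathy_at_or_after_diagnosis,
                       has_typical_symptoms, test_results,
                       other_etiology=False, severe_vascular_disease=False):
    """Return True if a patient passes the inclusion and exclusion rules."""
    # Exclusion criteria (1) and (2).
    if other_etiology or severe_vascular_disease:
        return False
    # Inclusion criteria (1) and (2).
    if not (has_diabetes and neuropathy_at_or_after_diagnosis):
        return False
    # Count positive results among the five tests; missing tests count as negative.
    positives = sum(bool(test_results.get(name, False)) for name in TESTS)
    # Inclusion criteria (3) and (4): symptomatic patients need one
    # positive test, asymptomatic patients need two.
    required = 1 if has_typical_symptoms else 2
    return positives >= required


symptomatic = meets_dpn_criteria(True, True, True, {"vibration": True})
asymptomatic = meets_dpn_criteria(True, True, False, {"vibration": True})
```

Here a symptomatic patient with one positive test qualifies, while an asymptomatic patient with the same single positive test does not.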

Feature selection

Initially, we identified 62 features across all patients in the dataset that were statistically available. Subsequently, through a review of previous relevant studies and consultation with medical experts, the number of features was reduced from 62 to 26 7 , 16 . Only features deemed highly relevant to the diagnosis of DPN were retained. These selected features include (1) general information—gender, age, height, weight, BMI, diabetes duration, alcohol history, smoking history, hypertension history, diabetic nephropathy history, and thyroid disorders history; (2) blood test results: glycated hemoglobin, fasting blood glucose, postprandial blood glucose, glycated albumin, total cholesterol, triglycerides, high‐density lipoprotein cholesterol, low‐density lipoprotein cholesterol, neutrophil count, lymphocyte count, and the neutrophil‐to‐lymphocyte ratio; (3) imaging examinations: lower limb arterial ultrasound, venous ultrasound, and nerve ultrasound; (4) physical examination: presence of bilateral lower limb edema.

Data balancing

In machine learning and data mining, class imbalance is a common and significant challenge. When one class has a substantially smaller sample size than others, traditional machine learning algorithms often exhibit a bias toward the majority class, resulting in a reduced ability to accurately identify the minority class. To address this issue, this study employed the Synthetic Minority Oversampling Technique (SMOTE) to mitigate data imbalance. SMOTE is an oversampling method that balances the dataset by generating synthetic samples for the minority class 17 . Unlike simple random oversampling, which duplicates existing minority class samples, SMOTE creates new samples through interpolation. Specifically, for each minority class sample, SMOTE selects one of its K‐nearest neighbors and generates a new sample by interpolating along the line connecting the two points. This method not only increases the number of minority class samples but also diversifies their feature distribution, allowing the model to better learn minority class characteristics and enhancing overall classification performance.
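The interpolation step described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation used in the study: it draws the base sample at random rather than iterating over every minority sample, treats all features as continuous, and uses a hypothetical function name (`smote_sample`) and toy data.

```python
import math
import random


def smote_sample(minority, k=5, n_new=10, seed=0):
    """Generate n_new synthetic points by SMOTE-style interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy two-feature minority class (illustrative values only).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_sample(minority, k=2, n_new=3)
```

Because each synthetic point lies on a segment between two real minority samples, it always falls inside the range spanned by the existing minority data.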

Statistical analyses

Statistical analyses were performed using IBM SPSS Statistics for Windows, Version 27.0. The Shapiro–Wilk test was used to assess the normality of quantitative data. For normally distributed variables, comparisons were conducted using the independent samples t‐test, with results expressed as mean ± standard deviation. For non‐normally distributed variables, the Wilcoxon rank‐sum test was utilized, with results presented as median and interquartile range. For unordered categorical data, Pearson's chi‐squared test was employed, and results were reported as counts with percentages. For ordered categorical data, the rank‐sum test was applied, with results also expressed as counts and percentages. A P‐value of <0.05 was considered statistically significant.

Model building

The dataset was randomly divided into training and validation sets in an 8:2 ratio. To mitigate overfitting and develop a predictive model with strong generalization capabilities, fivefold cross‐validation was employed for model evaluation. Specifically, the dataset was partitioned into five equal subsets, with one subset serving as the test set and the remaining four used for model training during each iteration. This process was repeated five times, with a different subset designated as the test set in each iteration. The performance metrics from all five iterations were averaged to derive the final evaluation results. Six machine learning algorithms were utilized during the modeling process: random forest (RF), decision tree (DT), logistic regression (LR), K‐nearest neighbors (K‐NN), extreme gradient boosting (XGBoost), and multilayer perceptron (MLP). The algorithm with the best predictive performance was selected for subsequent modeling and validation.
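The fivefold procedure described above can be illustrated with a short sketch. It shows only the index bookkeeping; `score_fold` is a hypothetical callback standing in for "train on these rows, evaluate on those rows", and the shuffling seed is an assumption.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Split the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train_idx, test_idx


def cross_validate(n_samples, score_fold, k=5):
    """Average the per-fold scores returned by score_fold(train, test)."""
    scores = [score_fold(train, test)
              for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


splits = list(k_fold_indices(10, k=5))
```

Each sample appears in exactly one test fold, so every observation contributes once to the averaged evaluation.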

To comprehensively evaluate the model's performance, several metrics were calculated, including accuracy, precision, recall, F1 score, and area under the curve (AUC). The receiver operating characteristic (ROC) curve was also plotted to visually represent the model's discriminative capability. Further validation was performed using a confusion matrix (CM), which provided an intuitive summary of the relationship between predicted and actual values. In the CM, the number in the top‐left corner represents the count of true‐positives correctly predicted by the model, the top‐right corner indicates false negatives, the bottom‐left corner shows false‐positives, and the bottom‐right corner represents true negatives. As a critical evaluation tool in machine learning prediction analysis, the confusion matrix facilitated a detailed assessment of the model's performance. By integrating these evaluation metrics and visual analyses, we ensured the stability and reliability of the model for classification tasks.
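For reference, the metrics named above follow directly from the four confusion matrix cells, and AUC can be computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below uses made-up counts and scores for illustration; none of the numbers come from the study.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute accuracy, precision, recall, and F1 from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative counts: 8 true positives, 2 false negatives,
# 4 false positives, 86 true negatives.
m = classification_metrics(tp=8, fn=2, fp=4, tn=86)
example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

With these counts, accuracy is 0.94 while recall is 0.80 and precision is about 0.67, which shows why accuracy alone can look favorable on imbalanced data.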

Model interpretation

We utilized SHAP (SHapley Additive exPlanations) to interpret the optimal model developed through machine learning methods. SHAP, an interpretability tool grounded in Shapley values from game theory, quantifies the contribution of each feature to the prediction outcome in a clear and measurable manner 18 . Specifically, SHAP assigns each feature a Shapley value: the feature's average marginal contribution to an individual prediction across all possible feature combinations. For any one patient, these values sum to the difference between the model's prediction and its baseline (average) prediction. A key advantage of SHAP is its ability to generate intuitive visualizations and quantitative explanations, such as feature importance plots and individual prediction contribution graphs. These tools enhance the understanding of the model's decision‐making process and the influence of each feature on the predictions. In this study, we conducted model development, evaluation, and interpretation in a standard Python environment (Python 3.6.1), utilizing relevant machine learning libraries and the SHAP package. This approach ensured both the accuracy and interpretability of the results, providing robust insights into the model's predictive performance.
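To make the Shapley value idea concrete, the sketch below computes exact Shapley values for a three-feature toy function by enumerating every feature subset. It is an illustration under assumptions, not the SHAP package's algorithm (which typically relies on approximations for larger models): `toy_model`, the input `x`, and the all-zero `baseline` are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a subset take their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for subsets of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to subset s.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # One additive term plus one interaction term.
    return 0.5 * z[0] + 0.2 * z[1] * z[2]


x = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

In this example the contributions add up to the gap between the prediction for `x` and the prediction for the baseline, and the interaction term is shared equally between the two features involved.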

Ethical principle

All relevant examination results for this study were retrieved from the Laboratory Information System and the Electronic Medical Record System, eliminating the need for additional tests or measurements of related indicators. The data were used exclusively for this study and were rigorously protected to ensure patient privacy. This study was approved by the Clinical Research Ethics Committee of China–Japan Friendship Hospital (Approval No.: 2022‐KY‐200). Given the retrospective nature of the study, along with its non‐invasive and anonymous characteristics, the Ethics Committee waived the requirement for patient informed consent.

RESULTS

Patient characteristics

Based on the inclusion and exclusion criteria, a total of 400 participants were included in this study. The participants were categorized into two groups based on their history of neuropathic foot ulcers: 343 DPN patients without neuropathic foot ulcers (85.75%) and 57 DPN patients with neuropathic foot ulcers (14.25%).

The study analyzed 26 features, comprising 16 quantitative and 10 qualitative attributes. A description of the general characteristics of the patients is provided in Table 1. Statistically significant differences were observed in the distribution of the following variables between the two groups: diabetes duration (P = 0.048), glycated hemoglobin (P = 0.012), fasting blood glucose (P = 0.002), postprandial blood glucose (P = 0.036), triglycerides (P < 0.001), and high‐density lipoprotein cholesterol (P = 0.017).

Table 1. A description of the general characteristics of the patients

| Characteristic | DPN without NFU | DPN with NFU | P value |
|---|---|---|---|
| Age (years) | 63.00 (56.00–68.00) | 59.00 (55.00–66.00) | 0.185 |
| Height (m) | 1.70 (1.63–1.75) | 1.70 (1.65–1.76) | 0.089 |
| Weight (kg) | 69.00 (62.00–76.00) | 69.00 (63.00–81.00) | 0.393 |
| BMI (kg/m²) | 24.22 (22.22–26.35) | 24.57 (22.00–26.44) | 0.769 |
| Diabetes duration (years) | 13.00 (8.00–20.00) | 20.00 (8.00–23.50) | 0.048* |
| Glycated hemoglobin (%) | 6.80 (6.20–8.00) | 7.20 (6.80–8.25) | 0.012* |
| Fasting blood glucose (mmol/L) | 6.64 (5.63–8.34) | 7.69 (6.29–9.37) | 0.002* |
| Postprandial blood glucose (mmol/L) | 10.20 (8.30–12.50) | 11.60 (8.30–14.15) | 0.036* |
| Glycated albumin (%) | 16.90 (15.50–19.50) | 17.40 (15.40–19.65) | 0.491 |
| Total cholesterol (mmol/L) | 4.26 (3.60–5.08) | 4.56 (3.52–5.54) | 0.412 |
| Triglycerides (mmol/L) | 1.35 (0.94–1.91) | 1.71 (1.31–2.49) | <0.001* |
| High‐density lipoprotein cholesterol (mmol/L) | 1.29 (1.06–1.67) | 1.54 (1.13–1.99) | 0.017* |
| Low‐density lipoprotein cholesterol (mmol/L) | 2.29 (1.77–3.01) | 2.21 (1.66–3.19) | 0.574 |
| Neutrophil count (×10⁹/L) | 3.54 (2.84–4.74) | 3.56 (2.96–4.82) | 0.909 |
| Lymphocyte count (×10⁹/L) | 1.89 (1.50–2.34) | 1.89 (1.54–2.49) | 0.798 |
| Neutrophil‐to‐lymphocyte ratio | 1.93 (1.42–2.54) | 1.97 (1.55–2.38) | 0.973 |
| Lower limb arterial ultrasound (%) | | | |
| T1 | 54 (15.7) | 3 (5.3) | 0.173 |
| T2 | 226 (65.9) | 43 (75.4) | |
| T3 | 63 (18.4) | 11 (19.3) | |
| Lower limb venous ultrasound (%) | | | |
| T1 | 333 (97.1) | 54 (94.7) | 0.355 |
| T2 | 10 (2.9) | 3 (5.3) | |
| Lower limb nerve ultrasound (%) | | | |
| T1 | 153 (44.6) | 30 (52.6) | 0.884 |
| T2 | 173 (50.4) | 23 (40.4) | |
| T3 | 17 (5.0) | 4 (7.0) | |
| Gender (%) | | | |
| Male | 229 (66.8) | 42 (73.7) | 0.301 |
| Female | 114 (33.2) | 15 (26.3) | |
| Thyroid disorders history (%) | | | |
| No | 337 (98.3) | 57 (100) | 0.314 |
| Yes | 6 (1.7) | 0 (0) | |
| Hypertension history (%) | | | |
| No | 181 (52.8) | 23 (40.4) | 0.082 |
| Yes | 162 (47.2) | 34 (59.6) | |
| Diabetic nephropathy history (%) | | | |
| No | 315 (91.8) | 54 (94.7) | 0.448 |
| Yes | 28 (8.2) | 3 (5.3) | |
| Alcohol history (%) | | | |
| No | 275 (80.2) | 44 (77.2) | 0.604 |
| Yes | 68 (19.8) | 13 (22.8) | |
| Smoking history (%) | | | |
| No | 246 (71.7) | 40 (70.2) | 0.811 |
| Yes | 97 (28.3) | 17 (29.8) | |
| Presence of bilateral lower limb edema (%) | | | |
| No | 323 (94.2) | 56 (98.2) | 0.201 |
| Yes | 20 (5.8) | 1 (1.8) | |

* Statistically significant difference between the two groups (P < 0.05).

Data balancing

To address the issue of data imbalance, we employed the SMOTE technique. After applying SMOTE oversampling, the number of minority class samples increased from 57 to 113, yielding a more balanced dataset. Table 2 presents the data distribution before and after balancing.

Table 2. Comparison of positive and negative samples before and after data balancing

| | DPN with DFU | DPN without DFU |
|---|---|---|
| Before sampling | 57 | 343 |
| After sampling | 113 | 343 |

Modeling and evaluation

We employed six machine learning algorithms (random forest, decision tree, logistic regression, K‐nearest neighbor, extreme gradient boosting, and multilayer perceptron) to develop predictive models. The evaluation results of these six algorithms, based on fivefold cross‐validation, are presented in Table 3. The results indicate that MLP achieved the highest accuracy (0.8751) and F1 score (0.7681). RF exhibited the highest precision (0.9778), while K‐NN demonstrated the highest recall (0.8514); however, these models performed less favorably on other metrics. The confusion matrix, used to formally evaluate model performance, showed results consistent with those in Table 3. As shown in Figure 2, the sum of the numbers in the top‐left and bottom‐right corners of the confusion matrix for the MLP model is the largest among all models, indicating that it correctly classified the most samples and outperformed the other models. Furthermore, the AUC values of the different algorithms in predicting NFU are presented in Figure 3. MLP achieved the highest AUC (0.901), followed by random forest (0.886), with decision tree recording the lowest AUC (0.707). Considering all evaluation metrics, MLP demonstrated superior accuracy, stability, and overall performance compared to the other models and was selected as the optimal model for predicting neuropathic foot ulcers.

Table 3. Comparison of classification results of different models (mean)

| Algorithm | Accuracy | Precision | Recall | F1 score |
|---|---|---|---|---|
| RF | 0.8575 | 0.9778* | 0.4344 | 0.6008 |
| DT | 0.7435 | 0.5255 | 0.3905 | 0.4355 |
| LR | 0.7435 | 0.4598 | 0.2573 | 0.3270 |
| KNN | 0.8575 | 0.6659 | 0.8514* | 0.7461 |
| GB | 0.8531 | 0.8086 | 0.5328 | 0.6386 |
| MLP | 0.8751* | 0.7250 | 0.8245 | 0.7681* |

* Best‐performing result for each metric among all models.

Figure 2. Model classification confusion matrix of fivefold cross‐validation.

Figure 3. ROC curves for different classification models.

SHAP analysis was employed to elucidate the relationship between features and predictive outcomes in the MLP model. As shown in Figure 4, the top 10 features influencing predictive outcomes were triglycerides, high‐density lipoprotein cholesterol, duration of diabetes, age, fasting blood glucose, glycated albumin, history of hypertension, low‐density lipoprotein cholesterol, postprandial blood glucose, and history of diabetic nephropathy.

Figure 4. MLP model interpretation using SHAP.

DISCUSSION

In recent years, traditional regression models, such as logistic regression and Cox regression, have been widely utilized in clinical disease prediction and prognostic studies 19 , 20 . While these methods offer advantages in managing simple relationships and categorical variables, their limitations become increasingly evident when addressing complex pathological mechanisms and multidimensional data 21 . Specifically, their inability to capture nonlinear relationships can compromise predictive accuracy in real‐world clinical scenarios. In contrast, with the rapid advancement of artificial intelligence and big data technologies, machine learning methods have emerged as a focal point of medical research 22 . A growing body of evidence demonstrates that machine learning algorithms not only excel in managing high‐dimensional and complex datasets but also achieve superior accuracy and broader applicability in the diagnosis and prognostic prediction of diseases such as DFU 13 , 14 , 15 . Nevertheless, previous studies have predominantly focused on diabetic patients in general, with predictive outcomes centered on ulcer severity and the likelihood of amputation.

In comparison, our study specifically targets a distinct subgroup—patients with diabetic peripheral neuropathy—and further refines the predictive objective to identify individuals at high risk of developing neuropathic foot ulcers. This focused approach addresses a critical research gap while simultaneously offering opportunities for early intervention and personalized treatment strategies. By incorporating multiple machine learning algorithms and systematically comparing their performance, this study provides a comprehensive evaluation of the strengths and limitations of various algorithms in managing complex clinical datasets. Such insights contribute to advancing predictive analytics in precision medicine.

The multilayer perceptron is a feedforward artificial neural network and a foundational model in deep learning 23 . It consists of an input layer, one or more hidden layers, and an output layer, with each layer comprising multiple neurons. These neurons enable nonlinear transformations of data by leveraging inter‐layer weights and activation functions, allowing MLP to model complex relationships effectively. Compared to traditional machine learning algorithms, MLP offers significant advantages in nonlinear modeling, flexibility, and adaptability 24 . It is particularly adept at managing complex, high‐dimensional datasets and excels in feature extraction and pattern recognition. By optimizing its architecture—such as adjusting the number of layers, neurons, and activation functions—or integrating with advanced models like convolutional neural networks (CNNs) and recurrent neural networks (RNNs), MLP demonstrates substantial potential in applications such as medical image analysis, disease diagnosis, and treatment outcome prediction 25 . These capabilities highlight its vital role in advancing precision medicine and intelligent healthcare.
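The layer structure described above can be illustrated with a minimal forward pass. This sketch assumes one hidden layer with ReLU activation and a sigmoid output for a binary risk probability; the weights, biases, and input are arbitrary illustrative values, and none of these architectural choices are reported by the study.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: weights[j] holds the input weights of neuron j."""
    return [sum(w * a for w, a in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Nonlinear activation applied element-wise to the hidden layer."""
    return [max(0.0, v) for v in values]


def sigmoid(a):
    """Squash the output neuron's value into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-a))


def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> one hidden layer (ReLU) -> single sigmoid output."""
    hidden = relu(dense(x, hidden_w, hidden_b))
    logit = dense(hidden, [out_w], [out_b])[0]
    return sigmoid(logit)


# Two input features, two hidden neurons, one output neuron (toy values).
p = mlp_forward(
    x=[1.0, 2.0],
    hidden_w=[[0.5, 0.25], [-1.0, 1.0]],
    hidden_b=[0.0, 0.5],
    out_w=[2.0, -1.0],
    out_b=0.0,
)
```

Training adjusts the weights and biases so that the output probability matches the observed labels; the nonlinearity in the hidden layer is what lets the network represent relationships a linear model cannot.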

SHAP analysis identified triglycerides, high‐density lipoprotein cholesterol, diabetes duration, age, and fasting blood glucose as the most significant contributors to the model's predictive performance. These findings are consistent with clinical observations and previous studies on the risk factors for DFU 19 , 26 . The substantial influence of these features underscores their critical role in predicting neuropathic foot ulcers, offering a reliable foundation for clinical risk assessments and targeted interventions.

Triglycerides (TG) are a major lipid component in the bloodstream and play a critical role in energy storage and supply 27 . Abnormal TG levels are strongly linked to various metabolic disorders 28 . Elevated TG levels can trigger microvascular complications and inflammatory responses, leading to impaired blood supply to neural tissues and subsequent disruption of the nervous system structure and function 29 . Studies indicate that high TG levels not only accelerate the onset and progression of peripheral neuropathy but also impair local blood circulation, causing tissue hypoxia in the foot and exacerbating the development of DFUs 30 . Furthermore, elevated TG levels are associated with delayed ulcer healing, thereby increasing clinical risks for patients and complicating disease management. High‐density lipoprotein cholesterol (HDL‐C), carried by high‐density lipoproteins (HDL), possesses various protective functions, including anti‐inflammatory and antioxidant effects, as well as facilitating reverse cholesterol transport 31 . Studies have shown that low HDL‐C levels can indirectly contribute to the development of DFU by exacerbating microvascular complications and neuropathy. In a retrospective study, Chen et al. 32 found that lower HDL‐C levels were strongly associated with an increased risk of foot ulcers in diabetic patients. Additionally, reduced HDL‐C levels can delay ulcer wound healing, further increasing the clinical burden on patients.

Patients with a longer duration of diabetes face a significantly higher risk of developing DFU, primarily due to two key factors: First, prolonged hyperglycemia leads to sensory neuropathy, impairing the patient's ability to detect foot microtraumas and infections, which delays timely wound care 33 . Second, chronic diabetes is frequently accompanied by more severe vascular complications, including microvascular and macrovascular diseases, further contributing to the development and delayed healing of foot ulcers 34 . A multicenter cross‐sectional study revealed that with each additional year of diabetes, the incidence of DFU increases by approximately 5–10% 35 . Additionally, these patients experience significantly prolonged ulcer healing times, complicating treatment and heightening the clinical and economic burden. Age is also a significant risk factor for the development of DFU, with underlying mechanisms partially overlapping those associated with prolonged diabetes duration. Older patients are more likely to experience severe peripheral neuropathy and microvascular complications, which result in diminished protective sensation in the feet and impaired blood circulation 36 . These conditions reduce the ability to detect foot injuries or pressure and increase the risk of tissue hypoxia, thereby facilitating the development of DFU 37 . Additionally, aging significantly impairs tissue repair capacity, leading to prolonged ulcer healing times and complicating treatment efforts. Elevated fasting blood glucose contributes to the development of DFU through multiple pathways. These include chronic hyperglycemia‐induced accumulation of advanced glycation end products (AGEs) and activation of the polyol pathway, leading to nerve fiber damage 38 , 39 . Studies have confirmed that elevated fasting blood glucose is an independent risk factor for the occurrence of DFU in diabetic patients 40 .

In summary, triglycerides, high‐density lipoprotein cholesterol, diabetes duration, age, and fasting blood glucose were the most influential predictors of neuropathic foot ulcers in this study. Consequently, clinical practice should leverage the advantages of machine learning in diagnostic processes to emphasize the assessment of lipid profiles, blood glucose levels, diabetes duration, and age in patients with diabetic peripheral neuropathy. This targeted approach will facilitate early diagnosis and intervention, thereby reducing the risk of DFU and improving patient outcomes.

Despite the development of an efficient predictive model, this study has several limitations. First, the relatively small sample size may affect the statistical robustness and generalizability of the model. Second, this study is based on single‐center data and lacks an independent external validation dataset, which limits the generalizability of the findings. Third, the study population consisted solely of Chinese patients, necessitating further investigation to assess global generalizability. Finally, as this study is retrospective in nature, certain features previously proven to be associated with the occurrence of diabetic foot ulcers (e.g., eGFR and history of diabetic retinopathy) could not be collected in this study 41 , 42 . Future research will adopt a multicenter approach, include larger and more diverse sample sizes, and aim to develop a more robust and broadly applicable predictive model.

CONCLUSION

In this study, we applied six machine learning algorithms (random forest, decision tree, logistic regression, K‐nearest neighbor, extreme gradient boosting, and multilayer perceptron) to develop predictive models for neuropathic foot ulcers in patients with diabetic peripheral neuropathy. Among these, the multilayer perceptron model achieved the highest performance, with an area under the receiver operating characteristic curve of 0.901, and was identified as the optimal model. SHAP analysis indicated that triglycerides, high‐density lipoprotein cholesterol, diabetes duration, age, and fasting blood glucose were the most influential features contributing to the model's predictive accuracy. The findings demonstrated that, compared to traditional regression models, machine learning approaches offer superior accuracy and reliability in predictive tasks. Additionally, these models enable individualized analysis of patient risk factors, providing novel insights and methodologies for personalized risk assessment. These advantages underscore the significant potential of machine learning in advancing precision medicine.

DISCLOSURE

The authors declare no conflict of interest.

Approval of the research protocol: N/A.

Informed consent: N/A.

Registry and the registration no. of the study/trial: N/A.

Animal studies: N/A.

Supporting information

Table S1. Explanation of features.


ACKNOWLEDGMENTS

This work was supported by the Beijing Natural Science Foundation (Grant No. L244029).

[Correction statement added on 9 May 2025, after first online publication: Funding information has been updated.]

Articles from Journal of Diabetes Investigation are provided here courtesy of Wiley
