Abstract
Background
Human Immunodeficiency Virus (HIV) continues to be a major global public health challenge, affecting 39.9 million people globally by the end of 2023. Sub-Saharan Africa bears a significant burden, contributing to 67% of cases. Malnutrition is prevalent among people living with HIV (PLWHIV), exacerbating immunosuppression and accelerating disease progression. This study explored the application of machine learning (ML) to assess the nutritional status of PLWHIV and predict the risk of malnutrition.
Methods and materials
A quantitative cross-sectional study design was employed. Data were collected from the University of Gondar Comprehensive and Specialized Hospital in Ethiopia. The study population included PLWHIV who attended antiretroviral therapy (ART) clinics. The variables included demographic, clinical, hematological, immunological, and treatment-related factors of the patients. Data preprocessing involved imputation, encoding, and dimensionality reduction. The ML models were trained using an 80:20 train-test split and evaluated in terms of accuracy, precision, recall, F1 score, and AUC.
Results
The study included data from 4,152 respondents, with the majority aged 48–57 years (32.9%), female (59.5%), and living in urban areas (76.5%). Nutritional status assessment revealed that 62.8% of the participants had a normal Body Mass Index (BMI), 17.6% were overweight, 15.2% were underweight, and 4.4% were obese.
Machine learning (ML) models were evaluated for their ability to predict the risk of malnutrition in PLWHIV. The results showed that applying the Synthetic Minority Oversampling Technique (SMOTE) markedly enhanced model performance by improving minority class recall. A support vector machine (SVM) achieved the highest performance, with an accuracy of 80.1%, precision of 80.4%, recall of 80.1%, F1 score of 79.5%, and an AUC of 0.92. Key predictors of nutritional status included antiretroviral therapy duration, BMI, adherence to treatment, and World Health Organization (WHO) stage. The integration of the SVM model into electronic medical records (EMRs) could enable real-time malnutrition risk alerts during clinic visits, requiring minimal clinician training.
Conclusion
ML models offer a robust approach for predicting malnutrition risk in PLWHIV. The integration of these tools into routine care could enhance nutritional management, particularly in low-resource settings. Further studies are needed to confirm these findings and improve the deployment of the model in clinical settings.
Keywords: HIV, Machine learning, Nutritional status, PLWHIV, Malnutrition prediction, ART
Background
Human immunodeficiency virus (HIV) remains a significant global public health issue, with an estimated 39.9 million people living with HIV (PLWHIV) worldwide in 2023 [1]. Sub-Saharan Africa bears a disproportionate burden, accounting for nearly 67% of all cases worldwide [1]. Malnutrition, a common comorbidity among PLWHIV, worsens immunosuppression and accelerates the progression of the disease. The bidirectional relationship between HIV and malnutrition creates a vicious cycle: the infection impairs nutritional intake, absorption, and use, and malnutrition in turn further weakens the immune response, increasing susceptibility to opportunistic infections and mortality [2, 3].
Previous studies have reported high rates of malnutrition among PLWHIV [4]. Ethiopia’s malnutrition prevalence among PLWHIV (23–55%) aligns with regional estimates for Sub-Saharan Africa, where 20–50% of PLWHIV were found to be malnourished [2], although urban-rural disparities are particularly pronounced in Ethiopia [5]. Contributing factors include food insecurity, socioeconomic disparities, and limited access to healthcare services. The World Health Organization (WHO) emphasizes the integration of nutritional support into HIV care to improve health outcomes, particularly in resource-limited settings [6, 7].
The nutritional status of PLWHIV is often assessed using body mass index (BMI), mid-upper arm circumference (MUAC), and biochemical markers such as serum albumin. Studies have shown that underweight individuals (BMI < 18.5 kg/m²) are more likely to experience poor ART adherence, while better nutritional status is associated with improved adherence [8]. Evidence from research conducted in sub-Saharan Africa revealed that 27–39% of ART patients were underweight at the initiation of therapy, which was associated with a twofold increase in mortality risk compared with those with a normal BMI [9, 10]. Additionally, overweight and obesity (BMI ≥ 25 kg/m²) have emerged as concerns, particularly in urban settings, owing to lifestyle changes and prolonged ART use, which may lead to metabolic complications and noncommunicable diseases [11, 12].
In Ethiopia, HIV prevalence remains a public health concern despite considerable progress in reducing new infections and deaths [13]. According to the Ethiopian Public Health Institute (EPHI, 2022), approximately 613,000 people were living with HIV in 2020, with an adult prevalence rate of 0.9% based on national surveillance data [6]. Malnutrition among PLWHIV is widespread and is attributed to food insecurity, poverty, and inadequate access to health care. Studies conducted in Ethiopia have shown that 23–55% of PLWHIV attending antiretroviral therapy (ART) clinics experience malnutrition, underscoring the urgent need for targeted interventions [8, 11]. Specifically, a study in southern Ethiopia revealed that 31% of ART patients were underweight based on BMI criteria, highlighting the critical role of nutritional interventions in HIV care [11, 12].
The University of Gondar Comprehensive and Specialized Hospital, a leading healthcare institution in northwest Ethiopia, provides comprehensive HIV care services, including antiretroviral therapy (ART) and nutritional counseling. Despite these efforts, malnutrition remains a critical challenge for patients. Previous research has shown that malnutrition is associated with poor ART adherence, reduced treatment efficacy, and increased morbidity and mortality [3, 14]. Understanding the nutritional status and factors associated with PLWHIV in this setting is essential for designing effective interventions.
Advances in technology, particularly machine learning (ML), offer promising tools for addressing malnutrition in PLWHIV. ML techniques can be used to analyze complex datasets to identify patterns and predict outcomes, thereby enabling healthcare providers to proactively address nutritional issues. By leveraging machine learning, it is possible to develop predictive models that integrate demographic, clinical, and nutritional data to identify at-risk patients and tailor interventions. This approach not only enhances patient care but also improves the efficiency of resource allocation in resource-limited settings.
This study aimed to assess the nutritional status of PLWHIV attending the University of Gondar Comprehensive and Specialized Hospital and to develop a machine learning-based prediction model for malnutrition risk. By bridging global and local evidence and incorporating innovative technology, the findings will contribute to informed policymaking and the implementation of tailored data-driven interventions to improve the health and quality of life of PLWHIV in Ethiopia and similar settings in the future.
While previous studies have used logistic regression to identify malnutrition predictors [15], none have leveraged machine learning (ML) to integrate demographic, clinical, and treatment-related data for risk stratification in Ethiopian PLWHIV. Machine learning outperforms traditional logistic regression in handling nonlinear relationships (e.g., interactions between ART duration and BMI) and high-dimensional data (e.g., >20 predictors), as demonstrated in similar HIV studies [16]. This study addresses this gap by developing an interpretable ML model tailored for resource-limited settings.
Methods and materials
Study design and study setting
This institution-based cross-sectional study was conducted in 2024 at the University of Gondar Comprehensive and Specialized Hospital, a public health facility in the Gondar City Administration. The Gondar City Administration is located in the Amhara National Regional State of Ethiopia, approximately 748 km northwest of the capital, Addis Ababa. The city has a population of 457,938.
Gondar is home to the University of Gondar, one of Ethiopia’s oldest universities and a leading institution for medical and public health training in the country. The city plays a crucial role in supporting health programs aimed at reducing the HIV/AIDS burden in the region. These efforts include facilitating research, implementing treatment programs, and ensuring access to antiretroviral therapy (ART) for those in need. Public health facilities in Gondar provide essential healthcare services, including HIV care, and serve as key research sites for health-related studies in Ethiopia.
The hospital provides medical education, training, medical services to the community, and many other services to more than 8 million people in Gondar Province and the neighboring areas. One of the services provided is ART, through which both children and adults can receive free diagnosis, treatment, and monitoring. The hospital began providing free ART services in March 2005. At the University of Gondar Hospital, the ART clinic reported that 15,933 patients were enrolled in the ART program. Of these, 5481 patients were actively receiving treatment. While the University of Gondar ART clinic provides routine nutritional counseling and occasional food supplementation for severely malnourished patients, there is no standardized ongoing nutritional intervention for all patients.
Data source and study population
The study was a secondary analysis of data from the Electronic Medical Records for Antiretroviral Therapy (EMR-ART) system. The source population comprised all individuals living with HIV on antiretroviral therapy (ART) who attended ART clinics at the University of Gondar Comprehensive and Specialized Hospital.
Data collection
Data were extracted from the electronic medical records (EMR) of PLWHIV attending ART clinics. This process involved retrieving demographic, clinical, hematological, and treatment-related data from the hospital’s electronic system. Data extraction ensured that only de-identified information was collected to maintain patient confidentiality and adhere to the ethical guidelines.
Study variables
In this study, the dependent variable was nutritional status, measured by BMI and categorized according to the World Health Organization (WHO) standards as underweight (BMI < 18.5 kg/m²), normal (18.5–24.9 kg/m²), overweight (25–29.9 kg/m²), and obese (≥ 30 kg/m²). The independent variables fell into four groups. Sociodemographic factors comprised age, sex, marital status, occupation, religion, and place of residence; religion was included because of its potential influence on dietary practices. Clinical factors comprised duration of ART, WHO stage, duration of HIV infection, tuberculosis (TB) coinfection, and functional status; TB coinfection was prioritized because of its high prevalence and documented impact on nutritional status [17, 18], whereas other comorbidities were not included because they were not consistently recorded in the EMR-ART. Hematologic and immunological factors comprised baseline Cluster of Differentiation 4 (CD4) count, current CD4 count, and viral load and viral load status. Treatment-related factors comprised adherence to treatment, regimen line, Tuberculosis Preventive Therapy (TPT) initiation, TPT discontinuation, and Cotrimoxazole Preventive Therapy (CPT) use. HIV treatment adherence is commonly categorized into three levels: good adherence (≥ 95% of doses taken, resulting in viral suppression and minimal resistance risk), fair adherence (85–94% of doses taken, associated with low-level viremia and potential resistance), and poor adherence (< 85% of doses taken, leading to a high risk of virologic failure and drug resistance).
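The outcome and adherence categories described above can be written as two small lookup functions. This is a minimal sketch rather than the study's code: the function names are hypothetical, and assigning values that fall between the stated bands (for example, a BMI of 24.95) to the lower category is an assumption.

```python
def bmi_category(bmi: float) -> str:
    """Map a BMI value (kg/m^2) to the WHO category used as the outcome."""
    if bmi < 18.5:
        return "Underweight"
    if bmi < 25.0:   # assumption: 24.9 < BMI < 25 is treated as normal
        return "Normal"
    if bmi < 30.0:   # assumption: 29.9 < BMI < 30 is treated as overweight
        return "Overweight"
    return "Obesity"


def adherence_level(percent_doses_taken: float) -> str:
    """Map the percentage of doses taken to the three adherence levels."""
    if percent_doses_taken >= 95:
        return "Good"
    if percent_doses_taken >= 85:
        return "Fair"
    return "Poor"


print(bmi_category(17.2), adherence_level(96))
```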
Data management and analysis
Data entry was performed in a secure database using double-entry methods to minimize data entry errors. Automated scripts were used to identify duplicate entries and invalid records. Data preparation involved cleaning the dataset by imputing missing values using statistical methods and addressing outliers. Feature engineering included creating new variables, such as ART duration, and generating interaction terms between key variables. Patient data were extracted from the electronic database in Microsoft Excel format and converted to comma-separated values (CSV) format to facilitate preprocessing in Python version 3.12.
Data preprocessing
Machine learning requires high-quality datasets for prediction, so handling missing data during preprocessing is crucial. Missing values in the extracted CSV dataset were handled with simple imputation: median imputation for continuous variables and mode imputation for categorical variables. The median was selected for numerical data to maintain robustness against skewed distributions, whereas the mode preserved the most frequent and valid responses for categorical variables. The SimpleImputer class of the scikit-learn library was used to impute the missing values [19].
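The imputation rule can be illustrated without any third-party dependency. The sketch below uses only the Python standard library and invented toy values; the helper name `impute_column` is hypothetical. In scikit-learn, the same behavior corresponds to `SimpleImputer(strategy="median")` and `SimpleImputer(strategy="most_frequent")`.

```python
from statistics import median, mode


def impute_column(values, kind):
    """Fill None entries with the median (continuous) or mode (categorical)."""
    observed = [v for v in values if v is not None]
    fill = median(observed) if kind == "continuous" else mode(observed)
    return [fill if v is None else v for v in values]


# Toy columns (invented values, not study data)
cd4 = [350, None, 120, 800, 410]
residence = ["Urban", "Urban", None, "Rural", "Urban"]

cd4_imputed = impute_column(cd4, "continuous")
residence_imputed = impute_column(residence, "categorical")
print(cd4_imputed)
print(residence_imputed)
```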
Data preprocessing also included data encoding, which is a crucial step. Variables were treated as categorical if their values were discrete and/or discontinuous. In this study, categorical variables were encoded using one-hot encoding, a method that converts categorical data into numerical form by replacing each category with a binary indicator column that takes the value 0 or 1 [20].
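As an illustration of one-hot encoding, the sketch below turns a single categorical column into 0/1 indicator columns. The helper name and the column prefix are hypothetical, and the values are invented.

```python
def one_hot(values, prefix):
    """Return one dict per record with a 0/1 indicator for every category."""
    categories = sorted(set(values))
    return [{f"{prefix}_{c}": int(v == c) for c in categories} for v in values]


adherence = ["Good", "Poor", "Fair", "Good"]
encoded = one_hot(adherence, "adherence")
for row in encoded:
    print(row)
```

Each record receives exactly one indicator equal to 1, so no artificial ordering is imposed on the categories.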
Dimensionality reduction techniques, such as principal component analysis (PCA), were applied to reduce the feature space and eliminate multicollinearity. The dataset was divided into training (80%) and testing (20%) subsets. An 80:20 split was chosen to balance the computational efficiency and statistical power, with K-fold cross-validation applied to the training set to reduce variance. SMOTE was applied to address class imbalance by generating synthetic samples for minority BMI categories (k = 10 nearest neighbors). Synthetic samples were validated for clinical plausibility post-generation. The method is summarized in an overview flow chart that delineates the key steps, such as data preprocessing, model training, and evaluation (see Fig. 1).
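The core idea of SMOTE, interpolating between a minority-class sample and one of its nearest minority-class neighbors, can be sketched in a few lines. This is a simplified standard-library illustration, not the implementation used in the study; the function name, the toy points, and the random seed are assumptions. The study reports k = 10 neighbors; in this toy example k is clipped to the number of available neighbors.

```python
import math
import random


def smote(minority, n_new, k=10, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Other minority points sorted by Euclidean distance to the base
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class with two numeric features (invented values)
minority = [(1.0, 2.0), (2.0, 3.0), (3.0, 1.0), (2.5, 2.5)]
synthetic = smote(minority, n_new=6, k=10, seed=42)
print(synthetic)
```

Oversampling is conventionally applied to the training portion only, so that synthetic points do not leak into the test set; the text does not state which order was used here.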
Fig. 1.
Overview flow chart of methodologies
Machine learning models
The models evaluated were logistic regression, decision tree, support vector machine (SVM), k-nearest neighbors (KNN), random forest, gradient boosting, Naive Bayes, and XGBoost. We evaluated performance using accuracy, precision, recall, F1 score, and AUC.
Model evaluation
In this study, the predictive models were evaluated using a train-test split and cross-validation. The trained models were assessed on the test set using the accuracy score, ROC curve, precision (P), recall (R), and F-measure. These metrics were derived from the confusion matrix (performance table), an N × N matrix, where N is the number of classes, that displays the number of correct and incorrect predictions made by the classification model.
A confusion matrix is a table used to evaluate the performance of a classification model by comparing the predicted labels with the actual (true) labels. It is especially useful for assessing the model accuracy, precision, recall, and other metrics in binary or multiclass problems. The confusion matrix summarizes how often the model’s predictions match real-world outcomes. For example, it shows how many malnourished patients were correctly identified versus how many were overlooked or misclassified in the model.
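To make the N × N layout concrete, the sketch below builds a confusion matrix for the four BMI classes from toy labels. The function name and the example labels are illustrative assumptions; rows are true classes and columns are predicted classes.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count predictions: rows index the true class, columns the predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, predicted in zip(y_true, y_pred):
        matrix[index[truth]][index[predicted]] += 1
    return matrix


labels = ["Underweight", "Normal", "Overweight", "Obesity"]
# Invented labels for six patients
y_true = ["Normal", "Normal", "Underweight", "Overweight", "Obesity", "Underweight"]
y_pred = ["Normal", "Overweight", "Underweight", "Overweight", "Obesity", "Normal"]

cm = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, cm):
    print(f"{label:<12}", row)
```

In this toy case the first row shows one underweight patient correctly identified and one misclassified as normal, which is the kind of overlooked case described above.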
For a binary (2 × 2) confusion matrix with cells a (true positives), b (false positives), c (false negatives), and d (true negatives), accuracy is calculated as [(a + d)/(a + b + c + d)] × 100 and recall as [a/(a + c)] × 100.
For a given BMI class, true positives (TP) are patients in that class who are correctly classified into it. True negatives (TN) are patients outside the class who are correctly classified as not belonging to it. False negatives (FN) are patients in the class who are incorrectly assigned to another class. False positives (FP) are patients outside the class who are incorrectly assigned to it. Recall (sensitivity), precision (positive predictive value), and accuracy were calculated using the equations given below:
Accuracy: Accuracy is the percentage of correct predictions among the total number of cases. In this study, it was computed from the confusion matrix and used to determine model efficacy [21]. It measures the overall correctness of the model in classifying BMI.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100$$
Sensitivity/Recall: Sensitivity is the proportion of actual positive cases that the model classifies correctly, that is, the number of correctly predicted positives relative to the total number of actual positives in a BMI class. It is also referred to as recall and was computed using the following formula [22].
$$\text{Recall} = \frac{TP}{TP + FN} \times 100$$
Precision (Positive Predictive Value): Precision measures the proportion of patients predicted to belong to a class (e.g., at high risk of malnutrition) who truly belong to it [23]. This study used the following formula, derived from the confusion matrix, to verify the model output:
$$\text{Precision} = \frac{TP}{TP + FP} \times 100$$
F1 score: Also known as the F-measure, it is the harmonic mean of precision and recall and therefore balances the two. A higher F1 score indicates a better model [24].
$$F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
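The metric definitions in this section can be checked on a small multiclass confusion matrix. The sketch below is illustrative only: the matrix is invented, the function names are hypothetical, and support-weighted averaging across classes is an assumption, since the text does not state which averaging scheme was used. With that scheme, average recall equals accuracy, which would be consistent with the identical accuracy and recall values reported for each model.

```python
def per_class_metrics(cm):
    """Return overall accuracy and per-class precision, recall, and F1
    from a confusion matrix whose rows are true classes."""
    n = len(cm)
    total = sum(map(sum, cm))
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # class k, predicted elsewhere
        fp = sum(cm[r][k] for r in range(n)) - tp   # other classes, predicted k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append({"precision": precision, "recall": recall,
                        "f1": f1, "support": tp + fn})
    accuracy = sum(cm[k][k] for k in range(n)) / total
    return accuracy, results


def weighted_average(results, key):
    """Average a per-class metric, weighting each class by its support."""
    total = sum(r["support"] for r in results)
    return sum(r[key] * r["support"] for r in results) / total


# Invented 3-class confusion matrix (rows: true class, columns: predicted)
cm = [[5, 1, 0],
      [2, 6, 2],
      [0, 1, 3]]
accuracy, results = per_class_metrics(cm)
print(round(accuracy, 3), round(weighted_average(results, "recall"), 3))
```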
AUROC curve: The Receiver Operating Characteristic (ROC) curve is a probability curve that plots sensitivity (true positive rate) against 1 − specificity (false positive rate) across classification thresholds. It is one of the most popular metrics for binary classification results. The area under the ROC curve (AUC) indicates how well the model separates the positive class from the negative classes. An AUC close to 1 indicates better model prediction, whereas a value close to 0.5 indicates performance no better than chance. We used this metric to determine the model efficiency in this study [25].
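The AUC has an equivalent rank-based reading: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it this way for one class against the rest. The function name and the scores are invented for illustration, and this pairwise form is a simplification suited to small examples rather than the routine used in the study.

```python
def auc(scores, is_positive):
    """Fraction of positive/negative pairs in which the positive case has
    the higher score (ties count one half); equals the area under the ROC."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented model scores for "underweight" versus all other classes
scores = [0.9, 0.4, 0.35, 0.6, 0.5, 0.1]
is_underweight = [1, 1, 0, 1, 0, 0]
print(auc(scores, is_underweight))

# A model that gives every patient the same score cannot discriminate
print(auc([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))
```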
Results
Sociodemographic characteristics
A total of 4152 (weighted) respondents were included in this study. The majority of the study participants were aged 48–57 years, 1364 (32.9%), with a mean age of 46.7 years (SD: 11.8), indicating a predominantly middle-aged population. Females represented 2471 (59.5%) of the sample, highlighting a gender imbalance. Most respondents were married, 2700 (65.0%), and had attained secondary education, 2145 (51.7%), reflecting a moderate educational background. A significant proportion resided in urban areas, 3177 (76.5%), suggesting greater urban representation. In terms of religious affiliation, the vast majority were Orthodox Christians, 3910 (94.2%), indicating a highly homogeneous religious composition (see Table 1).
Table 1.
Sociodemographic characteristics of the study participants in Gondar City administration, 2024
| Features | Category | Frequency | Percent (%) |
|---|---|---|---|
| Age | 18–27 | 254 | 6.1 |
| | 28–37 | 531 | 12.8 |
| | 38–47 | 1363 | 32.8 |
| | 48–57 | 1364 | 32.9 |
| | 58 and above | 640 | 15.4 |
| Sex | Female | 2471 | 59.5 |
| | Male | 1681 | 40.5 |
| Marital Status | Married | 2700 | 65.0 |
| | Divorced | 698 | 16.8 |
| | Never Married | 317 | 7.6 |
| | Widowed | 437 | 10.5 |
| Education Level | Higher Education | 413 | 9.9 |
| | Secondary Education | 2145 | 51.7 |
| | Primary Education | 789 | 19.0 |
| | No Education | 805 | 19.4 |
| Residence | Urban | 3177 | 76.5 |
| | Rural | 975 | 23.5 |
| Religion | Orthodox | 3910 | 94.2 |
| | Muslim | 206 | 5.0 |
| | Protestant | 26 | 0.6 |
| | Catholic | 9 | 0.2 |
| | Other | 1 | 0.06 |
| Occupation | Skilled | 41 | 1.0 |
| | Unskilled | 3341 | 80.5 |
| | Others | 769 | 18.5 |
Clinical characteristics
Most participants (62.8%) had a normal BMI. Overweight individuals accounted for 17.6% of the sample, while only 15.2% were underweight, suggesting that malnutrition was relatively uncommon. Regarding tuberculosis (TB) status, 98.0% of the participants did not have TB, with only 2.0% reporting a positive TB status, reflecting the low prevalence of the disease. In terms of functional status, most participants (93.0%) were categorized as working, indicating a high level of physical functionality. A smaller proportion was ambulatory (4.0%), while only 3.0% were bedridden, highlighting that the majority of the population maintained an active and independent lifestyle (see Table 2).
Table 2.
Clinical characteristics of the study participants in Gondar City administration, 2024
| Features | Category | Frequency | Percent (%) |
|---|---|---|---|
| BMI | Normal | 2607 | 62.8 |
| | Overweight | 733 | 17.6 |
| | Underweight | 630 | 15.2 |
| | Obesity | 182 | 4.4 |
| TB Status | Yes | 82 | 2.0 |
| | No | 4070 | 98.0 |
| Functional Status | Working | 3861 | 93.0 |
| | Ambulatory | 163 | 4.0 |
| | Bedridden | 128 | 3.0 |
BMI categories followed the WHO guidelines: underweight (< 18.5 kg/m²), normal (18.5–24.9 kg/m²), overweight (25–29.9 kg/m²), and obesity (≥ 30 kg/m²).
Hematological and immunological characteristics
The clinical profile of the participants showed that the majority had a low baseline CD4 count (71.7%), indicating significant immune suppression at the start of care. Only 16.9% of the patients had a high CD4 count, whereas 11.4% fell within the normal range. Regarding recent CD4 levels, 49.6% still had a low count, although 34.1% achieved normal levels, suggesting some improvement in immune function. A smaller group (16.3%) had a high recent CD4 count, indicating successful immune recovery.
In terms of viral load status, the vast majority (94.6%) had suppressed viral loads, reflecting effective treatment and disease management. A minimal proportion (2.0%) had unsuppressed viral loads, while 3.4% achieved undetectable levels, demonstrating optimal treatment outcomes in some participants (see Table 3).
Table 3.
Hematological and immunological characteristics of the study participants in Gondar City administration, 2024
| Features | Category | Frequency | Percent (%) |
|---|---|---|---|
| Baseline CD4 | Low | 2977 | 71.7 |
| | High | 701 | 16.9 |
| | Normal | 474 | 11.4 |
| Recent CD4 | Low | 2058 | 49.6 |
| | Normal | 1418 | 34.1 |
| | High | 676 | 16.3 |
| Viral load status | Suppressed | 3929 | 94.6 |
| | Unsuppressed | 84 | 2.0 |
| | Undetectable | 139 | 3.4 |
Normal CD4 count: ≥500 cells/mm³; suppressed viral load: <200 copies/mL.
Treatment-related characteristics
The treatment-related characteristics of the participants indicated that the majority (87.0%) had good adherence to their prescribed regimens, reflecting a strong commitment to treatment. A smaller percentage of patients reported poor (8.6%) or fair (4.4%) adherence, which may pose challenges to treatment outcomes.
Most participants (91.8%) were on a first-line treatment regimen, while 7.7% had progressed to a second-line regimen, and only 0.5% were on third-line treatment, suggesting that the majority responded well to initial therapies. Additionally, 71.2% of the patients had started Tuberculosis Preventive Therapy (TPT), indicating a proactive approach to coinfection prevention, whereas 28.8% had not.
Regarding Cotrimoxazole Preventive Therapy (CPT) use, 79.8% of participants were on CPT, reflecting adherence to standard HIV care protocols, while 20.2% had not received the therapy. This distribution underscores the strong engagement in preventive and therapeutic interventions among the study population (see Table 4).
Table 4.
Treatment-related characteristics of the study participants in Gondar City administration, 2024
| Features | Category | Frequency | Percent (%) |
|---|---|---|---|
| Adherence | Good | 3611 | 87.0 |
| | Poor | 355 | 8.6 |
| | Fair | 186 | 4.4 |
| Regimen Line | First line | 3811 | 91.8 |
| | Second line | 319 | 7.7 |
| | Third line | 22 | 0.5 |
| TPT started | Yes | 2958 | 71.2 |
| | No | 1194 | 28.8 |
| CPT use | Yes | 3313 | 79.8 |
| | No | 839 | 20.2 |
Model performance comparison
Machine learning (ML) models were evaluated for their ability to predict the risk of malnutrition in PLWHIV. Models trained on unbalanced data tend to perform poorly because they are biased toward the majority class (e.g., normal BMI), leading to high accuracy but low sensitivity in minority classes (e.g., underweight). The recall of the SVM improved from 49.0% (unbalanced) to 80.1% (balanced), enabling better identification of at-risk patients. The key performance metrics included accuracy, precision, recall, and F1 score.
The results showed that balanced datasets significantly improved the model performance.
The support vector machine (SVM) achieved the highest performance, with an accuracy of 80.1%, a precision of 80.4%, a recall of 80.1%, and an F1 score of 79.5%.
Random forest performed robustly, achieving an accuracy of 75.0%, a precision of 76.0%, a recall of 75.0%, and an F1 score of 73.5% after balancing (see Table 5).
Table 5.
Performance of various machine learning models, both with and without the application of SMOTE for balancing the dataset
| Machine learning models | Dataset | Accuracy (%) | Precision (%) | Recall (Sensitivity) (%) | F1 (%) | AUC |
|---|---|---|---|---|---|---|
| Logistic Regression | Unbalanced | 49.0 | 24.1 | 49.0 | 32.3 | 0.56 |
| | Balanced | 44.9 | 46.6 | 44.9 | 40.3 | 0.71 |
| Decision Tree | Unbalanced | 41.0 | 42.2 | 41.0 | 40.0 | 0.53 |
| | Balanced | 63.8 | 65.8 | 63.8 | 63.4 | 0.79 |
| SVM | Unbalanced | 49.0 | 24.1 | 49.0 | 32.3 | 0.46 |
| | Balanced | 80.1 | 80.4 | 80.1 | 79.5 | 0.92 |
| KNN | Unbalanced | 55.0 | 49.0 | 55.0 | 49.3 | 0.62 |
| | Balanced | 74.0 | 76.8 | 74.0 | 72.1 | 0.89 |
| Random Forest | Unbalanced | 46.0 | 40.5 | 46.0 | 42.2 | 0.62 |
| | Balanced | 75.0 | 76.0 | 75.0 | 73.5 | 0.91 |
| Gradient Boosting | Unbalanced | 48.0 | 38.6 | 48.0 | 40.4 | 0.56 |
| | Balanced | 69.0 | 70.5 | 69.0 | 68.0 | 0.88 |
| Naive Bayes | Unbalanced | 46.0 | 33.8 | 46.0 | 37.5 | 0.54 |
| | Balanced | 48.5 | 48.0 | 48.5 | 44.7 | 0.76 |
| XGBoost | Unbalanced | 44.0 | 40.9 | 44.0 | 40.0 | 0.56 |
| | Balanced | 70.4 | 72.1 | 70.4 | 69.6 | 0.88 |
SVM: Support Vector Machine; KNN: K-Nearest Neighbors; XGBoost: eXtreme Gradient Boosting.
SMOTE: Synthetic Minority Oversampling Technique (used to balance dataset classes).
Balancing dataset
As seen in the descriptive statistics, the BMI classes were imbalanced: most observations (62.8%) were concentrated in the majority class (normal BMI), whereas the underweight, overweight, and obese classes together accounted for the remainder. SMOTE was therefore applied so that the BMI classes were equally represented, giving a symmetric class distribution for building reliable predictive models. A balanced dataset was used to train the chosen ML models, with an 80:20 train-test split and 10-fold cross-validation. The dataset balancing is depicted in the visualization of the nutritional status distribution among PLWHIV (see Fig. 2).
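The 10-fold cross-validation step can be sketched as an index-splitting routine. The version below also stratifies by class, so that every fold keeps the class proportions of the full set; stratification is an assumption, as the text specifies only the number of folds. The function name and the toy labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign record indices to n_folds folds, spreading each class evenly."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across folds
        for position, i in enumerate(indices):
            folds[position % n_folds].append(i)
    return folds


# Invented training labels: 60 normal, 20 underweight, 20 overweight
labels = ["Normal"] * 60 + ["Underweight"] * 20 + ["Overweight"] * 20
folds = stratified_folds(labels, n_folds=10)

# Each fold serves once as the validation set; the rest form the training set
for fold in folds[:2]:
    print(len(fold), sum(labels[i] == "Underweight" for i in fold))
```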
Fig. 2.
Class distribution of Nutritional status of PLWHIV University of Gondar Comprehensive and Specialized Hospital, 2024
Feature ranking
Interpreting the results of machine learning algorithms can be significantly more challenging than interpreting classical statistical analyses, which rely on well-defined concepts such as odds ratios, p-values, and confidence intervals. Machine learning models often operate as “black boxes” with highly complex, nonlinear structures [26], which makes it difficult to interpret how predictions were made. SHapley Additive exPlanations (SHAP), a unified framework proposed by Lundberg and Lee, interprets the outputs of a wide range of ML models by calculating SHAP values that quantify the contribution of each feature to the model’s predictions [27]. Such explainability is crucial because it allows us to leverage the power of ML while also understanding the reasoning behind the models’ decisions.
The relative importance of the predictors of nutritional status was analyzed using the random forest algorithm. In the resulting plot, features with a long bar located at the top are highly related to the nutritional status of PLWHIV. The feature importance ranking highlighted variables such as ART duration, BMI, and adherence to treatment as the most influential predictors (see Fig. 3). This insight provides actionable guidance for healthcare practitioners.
Unexpectedly, ‘residence’ (urban/rural) had low importance despite literature links to food insecurity, possibly because the cohort was urban-dominated (76.5% urban population). Several other variables that would typically be considered clinically significant, such as viral load status and adherence, also appeared to be of unexpectedly low importance. This discrepancy could stem from data quality problems, such as missing data or measurement errors in self-reported variables (e.g., adherence). Overall, ART duration and BMI were the primary predictors in the feature importance analysis, and clinicians should prioritize these factors during patient evaluation.
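The idea behind SHAP values can be illustrated with an exact Shapley computation on a toy model. The sketch below is a simplification and not the SHAP library: it enumerates every feature ordering, which is feasible only for a handful of features, and it represents an "absent" feature by a fixed baseline value, which is an assumption. The scoring function and feature names are hypothetical.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Average each feature's marginal contribution over all orderings in
    which features are switched from their baseline to their actual value."""
    names = list(instance)
    totals = dict.fromkeys(names, 0.0)
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = instance[name]
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_risk(features):
    """Hypothetical risk score with an interaction between two indicators."""
    return (2 * features["short_art"] + 3 * features["low_cd4"]
            + features["short_art"] * features["low_cd4"])


instance = {"short_art": 1, "low_cd4": 1}
baseline = {"short_art": 0, "low_cd4": 0}
contributions = shapley_values(toy_risk, instance, baseline)
print(contributions)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the additive property that gives the method its name.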
Fig. 3.
Feature Importance from Random Forest Model for Nutritional Status Prediction Among PLWHIV
Figure 3 shows the relative importance of the input features used in a Random Forest model to predict nutritional status among people living with HIV (PLWHIV). “Duration on ART in Months,” “Age,” and “Recent CD4 Count” were identified as the top three most influential predictors. Other significant factors included baseline CD4 count, educational level, and marital status. Feature importance was derived using the mean decrease in impurity, which indicates the contribution of each variable to the predictive performance of the model. ART: Antiretroviral Therapy; CD4 = Cluster of Differentiation 4; TPT = Tuberculosis Preventive Therapy; CPT = Cotrimoxazole Preventive Therapy.
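Mean decrease in impurity rests on one quantity: how much a split on a feature reduces class impurity at a tree node. The sketch below computes that reduction for a single split using Gini impurity, which is an assumption about the splitting criterion. The feature's importance is then obtained by accumulating such reductions over all nodes that split on it and averaging across trees. The function names and labels are illustrative.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    return (gini(parent)
            - len(left) / n * gini(left)
            - len(right) / n * gini(right))


parent = ["Underweight", "Underweight", "Normal", "Normal"]

# A split that separates the classes perfectly removes all impurity
perfect = impurity_decrease(parent, parent[:2], parent[2:])

# A split that leaves both children as mixed as the parent removes none
useless = impurity_decrease(parent,
                            ["Underweight", "Normal"],
                            ["Underweight", "Normal"])
print(perfect, useless)
```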
The Receiver Operating Characteristic (ROC) curve was used to evaluate the performance of the support vector machine (SVM) classifier applied to a balanced dataset. Each curve represents the true positive rate against the false positive rate for a specific class, along with its area under the curve (AUC) score. The model demonstrated consistently high performance across all nutritional status categories. The “Overweight” class achieved the highest AUC at 0.99, indicating excellent classification ability. This was followed closely by the “Obesity” class with an AUC of 0.98, also reflecting near-perfect discrimination. The “Underweight” class showed strong performance with an AUC of 0.96, while the “Normal” weight category had the lowest AUC at 0.91, still representing reliable and effective classification. Overall, the model performed exceptionally well in distinguishing between different nutritional statuses (see Fig. 4).
Fig. 4.
ROC Curve for SVM Model Across BMI Classes
This figure (Fig. 4) displays the Receiver Operating Characteristic (ROC) curves for the Support Vector Machine (SVM) classifier trained on balanced data, evaluating the model’s performance across four BMI classes: Normal, Obesity, Overweight, and Underweight. The Area Under the Curve (AUC) values indicate high discriminative ability, with the model performing best in the Overweight (AUC = 0.99), followed by the Obesity class (AUC = 0.98), Underweight (AUC = 0.96), and Normal (AUC = 0.91). The dashed diagonal line represents the random classifier (AUC = 0.5).
The bar chart illustrates the performance of various machine learning models after balancing the dataset. The evaluated models included logistic regression, decision tree, support vector machine (SVM), k-nearest neighbors (KNN), random forest, gradient boosting, Naïve Bayes, and XGBoost. Among these, the SVM demonstrates the best overall performance, achieving nearly identical scores across accuracy, precision, recall, and F1 score, all of which are approximately 0.79–0.80. Logistic regression and naive Bayes, on the other hand, show the poorest performance, with metrics falling below 0.5, likely due to their inability to handle the complexity of the dataset effectively. The decision tree shows moderate performance, with metrics ranging from 0.63 to 0.66. Models such as KNN and random forest perform similarly, with scores between 0.72 and 0.76, while gradient boosting and XGBoost also yield comparable and consistent results, scoring approximately 0.68–0.72. Ensemble methods, including random forest, gradient boosting, and XGBoost, display robust performance, further emphasizing their effectiveness in handling balanced datasets. Overall, SVM stands out as the top-performing model, making it a promising choice for this dataset.
A comprehensive evaluation of model performance metrics reveals that SVM consistently outperformed other algorithms, achieving high precision and recall scores when applied to the balanced dataset (see Fig. 5).
Fig. 5.
Comparisons of model performance metrics
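The metrics compared in Fig. 5 can be illustrated with a minimal sketch of support-weighted precision, recall, and F1 for a multiclass problem. The averaging scheme used in the study is not stated; weighted averaging is assumed here, and it would be consistent with accuracy and recall being nearly identical, because support-weighted recall is mathematically equal to accuracy. The function `weighted_metrics` and the toy labels are hypothetical.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1 (assumed scheme)."""
    n = len(y_true)
    support = Counter(y_true)
    pairs = list(zip(y_true, y_pred))
    totals = {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / count
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        weight = count / n  # each class contributes in proportion to its support
        totals["precision"] += weight * precision
        totals["recall"] += weight * recall
        totals["f1"] += weight * f1
    totals["accuracy"] = sum(t == p for t, p in pairs) / n
    return totals


# Hypothetical toy labels, not study data.
y_true = ["Normal", "Normal", "Normal", "Underweight", "Underweight", "Overweight"]
y_pred = ["Normal", "Normal", "Underweight", "Underweight", "Normal", "Overweight"]

metrics = weighted_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in metrics.items()})
```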
The hyperparameter tuning results compare the machine learning models (logistic regression, decision tree, SVM, KNN, random forest, gradient boosting, and XGBoost) trained on two datasets: the original unbalanced dataset and the balanced dataset. For each model, the best hyperparameters found during tuning are reported, and the resulting performance score (likely accuracy or another relevant metric) is given in the “Best Score” column.
A key observation is that balancing substantially improved performance for most models. The score for logistic regression increased from 0.660 to 0.766, that for the decision tree from 0.460 to 0.820, and that for the SVM from 0.682 to 0.967. The same trend was seen for the other models, particularly KNN, random forest, and XGBoost, all of which benefited from the balanced dataset. The best scores were achieved by the SVM (0.97) and random forest (0.95), suggesting that these models handle the balanced data best and underscoring the value of fine-tuning (see Fig. 6).
Fig. 6.
Hyperparameter tuning plot
The hyperparameters that led to the highest scores varied by model. For logistic regression, a large value of C (C = 100) worked best; in scikit-learn [19], C is the inverse of the regularization strength, so this corresponds to weak regularization. The decision tree benefited from limiting the depth of the tree and the minimum number of samples required to split a node. SVM performed best with a Radial Basis Function (RBF) kernel and a tuned C parameter, while random forest and XGBoost yielded better results with more estimators and specific learning rates.
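As an illustration of this kind of search, the sketch below tunes an RBF-kernel SVM with scikit-learn's `GridSearchCV`. It is a minimal example under stated assumptions, not the study's actual configuration: the synthetic four-class data from `make_classification`, the grid values, the scaling step, and the five-fold stratified split are all hypothetical choices. Note that `C` in scikit-learn is the inverse of the regularization strength, so larger values mean weaker regularization.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the clinical features (assumption: 4 classes, 10 features).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=6,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# Scaling lives inside the pipeline so it is refit on each training fold.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("svm", SVC(kernel="rbf")),
])

# Hypothetical grid; the values searched in the study are not reported here.
param_grid = {
    "svm__C": [0.1, 1, 10, 100],
    "svm__gamma": ["scale", 0.01, 0.1],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="accuracy",
)
search.fit(X, y)

best_params = search.best_params_
best_score = search.best_score_  # mean cross-validated accuracy of the best setting
print(best_params, round(best_score, 3))
```

The `best_score_` attribute plays the role of a "Best Score" column: it is the mean cross-validated score of the best parameter combination, not a held-out test score.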
In conclusion, SVM and random forest appear to be the most effective models in this study. To improve the results further, we recommend focusing on these top-performing models and fine-tuning their hyperparameters. Additionally, applying data balancing techniques is crucial when working with imbalanced datasets.
While the SVM model achieved a high AUC, which demonstrates excellent discrimination in predicting malnutrition risk among PLWHIV, such strong performance also raises concerns about potential overfitting. Overfitting occurs when a model learns patterns specific to the training data, including noise rather than generalizable trends, resulting in inflated performance metrics that may not translate to new or external datasets. To mitigate this risk, we employed strategies such as cross-validation, careful hyperparameter tuning, and the use of a separate test set for final evaluation. Additionally, we applied data balancing techniques to address class imbalance, which can sometimes inadvertently contribute to overfitting by generating synthetic samples that closely resemble the minority class.
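One common safeguard related to the concern above is to generate synthetic samples from the training split only, so that no information from the test set leaks into the oversampled data. The sketch below shows this ordering with a simplified SMOTE-style interpolation written in plain Python. It is an illustration under stated assumptions, not the implementation used in the study: the function `smote_like`, the two-feature Gaussian toy data, and the 80:20 per-class split are hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Simplified SMOTE-style oversampling: interpolate between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment between the two points
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical two-feature toy data: 40 majority and 10 minority samples.
rng = random.Random(42)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
minority = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(10)]

# Split first (80:20 within each class), then oversample the training part only.
train_majority, test_majority = majority[:32], majority[32:]
train_minority, test_minority = minority[:8], minority[8:]

synthetic = smote_like(train_minority, len(train_majority) - len(train_minority))

balanced_train = (
    [(x, 0) for x in train_majority]
    + [(x, 1) for x in train_minority + synthetic]
)
# The test set keeps its original class distribution and contains no synthetic points.
test_set = [(x, 0) for x in test_majority] + [(x, 1) for x in test_minority]

print(len(balanced_train), len(test_set))
```

Because every synthetic point lies on a segment between two real minority samples, oversampling before splitting would place near-copies of test samples in the training data, which is one way performance estimates can become inflated.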
The SHAP summary analysis revealed that ART duration, BMI-related metrics, adherence to treatment, age, and CD4 count were the most influential predictors of BMI status (see Table 6).
Table 6.
Mean absolute SHAP values representing the relative importance of each feature in predicting BMI categories among people living with HIV
| Feature | Mean \|SHAP value\| |
|---|---|
| ART_Duration | 0.143 |
| BMI | 0.127 |
| Adherence | 0.119 |
| Age | 0.102 |
| CD4_Count | 0.089 |
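The quantity reported in Table 6 can be sketched as follows: for each feature, the absolute SHAP values are averaged over all samples, and features are ranked by that average. This is a minimal illustration with a hypothetical helper `mean_abs_shap` and made-up SHAP values; how the study aggregated values across the four BMI classes is not stated and is not modeled here.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Rank features by the mean absolute SHAP value across samples."""
    n = len(shap_rows)
    totals = [0.0] * len(feature_names)
    for row in shap_rows:
        for j, value in enumerate(row):
            totals[j] += abs(value)  # sign is dropped: only magnitude matters
    means = {name: total / n for name, total in zip(feature_names, totals)}
    return dict(sorted(means.items(), key=lambda item: item[1], reverse=True))


# Hypothetical SHAP values: one row per patient, one column per feature.
feature_names = ["ART_Duration", "BMI", "Adherence"]
shap_rows = [
    [0.20, -0.10, 0.05],
    [-0.10, 0.15, -0.05],
    [0.15, -0.05, 0.10],
]

importance = mean_abs_shap(shap_rows, feature_names)
for name, value in importance.items():
    print(f"{name}: {value:.3f}")
```

Taking absolute values before averaging prevents positive and negative contributions from cancelling, so the result measures how strongly a feature moves predictions, not in which direction.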
Discussion
The findings of this study highlight the substantial potential of machine learning (ML) in predicting malnutrition risk among people living with HIV (PLWHIV). By integrating demographic, clinical, and treatment-related data, our models achieved higher predictive accuracy than conventional approaches, supporting the adoption of ML-based tools in HIV care, especially in resource-constrained settings where malnutrition remains a persistent challenge [28].
Previous research in HIV and nutrition has primarily relied on traditional statistical methods, such as logistic regression, to identify associations between demographic or clinical factors and malnutrition. For example, earlier studies emphasized age, TB coinfection, and ART adherence as key predictors [29]. While valuable, these methods often struggle with high-dimensional data and nonlinear relationships, limiting their predictive power.
Our SVM model (AUC = 0.92) outperformed logistic regression-based predictions in prior studies (AUC = 0.60–0.70) [5, 30] and handled imbalanced datasets and complex interactions more effectively. Similarly, the random forest model achieved an AUC of 0.75, further supporting the robustness of ML approaches in this context. These results exceed those of earlier studies using traditional methods, where AUC values typically remained below 0.75 [14]. Decision tree models, while commonly used in previous research, showed lower precision and AUC values, reinforcing the advantages of more sophisticated algorithms [5, 31].
Beyond predictive performance, our study used Shapley additive explanations (SHAP) to enhance model interpretability. SHAP analysis identified ART duration, BMI, and treatment adherence as the most influential predictors, providing actionable insights for healthcare providers. This contrasts with earlier studies that often lacked interpretability and struggled to offer clinically actionable guidance [27]. The identification of ART duration and adherence as key predictors underscores the importance of integrated nutritional counseling and adherence support programs. Regular BMI monitoring and targeted food supplementation for underweight patients could further improve outcomes.
It is noteworthy that viral load and adherence showed lower-than-expected feature importance in our models. This may be attributable to incomplete or inconsistently recorded data for these variables in the EMR, as well as to misclassification or reporting bias arising from reliance on self-reported or inconsistently updated records. Future studies should prioritize data quality assurance for these predictors to better capture their impact on nutritional outcomes.
While our results highlight the effectiveness of ML models, considerations for real-world applications are important. Machine learning models can be computationally intensive, posing challenges for healthcare systems in resource-limited settings [32]. However, advances such as cloud-based deployment, mobile health (mHealth) applications, and user-friendly interfaces can help overcome these barriers [33].
To facilitate adoption, the SVM model could be embedded within EMRs to flag high-risk patients during routine visits for further nutritional assessment or intervention. This would require minimal clinician training and could be implemented as a decision-support module, providing automated alerts (e.g., ‘High Risk: ART duration < 6 months + BMI < 18.5’) alongside recommended interventions such as nutritional counseling or food supplementation. A pilot implementation plan is proposed for future studies, emphasizing scalability in low-resource settings.
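A decision-support alert of this kind could be sketched as below. This is a hypothetical illustration only: the `Visit` record, the `malnutrition_alert` function, the `predicted_risk` field, and the 0.5 threshold are assumptions and do not describe an existing EMR interface or a validated clinical rule. The two explanatory conditions mirror the example alert text quoted above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Visit:
    art_duration_months: float
    bmi: float
    predicted_risk: float  # hypothetical model output in [0, 1]


def malnutrition_alert(visit: Visit, risk_threshold: float = 0.5) -> Optional[str]:
    """Return an alert string for high-risk visits, or None otherwise."""
    if visit.predicted_risk < risk_threshold:
        return None
    # Attach human-readable reasons so the alert is not a bare score.
    reasons = []
    if visit.art_duration_months < 6:
        reasons.append("ART duration < 6 months")
    if visit.bmi < 18.5:
        reasons.append("BMI < 18.5")
    detail = " + ".join(reasons) if reasons else "model score above threshold"
    return f"High Risk: {detail}"


alert = malnutrition_alert(Visit(art_duration_months=3, bmi=17.9, predicted_risk=0.8))
no_alert = malnutrition_alert(Visit(art_duration_months=24, bmi=22.0, predicted_risk=0.1))
print(alert)
print(no_alert)
```

Keeping the reasons alongside the score is a design choice aimed at clinicians with limited technical training, since the alert states why the patient was flagged.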
The generalizability of the model to populations with different demographic or clinical profiles is a key consideration. Validation using diverse cohorts, including those with varying socioeconomic backgrounds, comorbidities, and age distributions, would help ensure robustness. The dataset may contain urban-centric bias, which can limit the model’s applicability to rural populations. Future studies should incorporate more representative sampling to improve external validity.
Strengths and Limitations
This study leverages machine learning to address a critical public health issue in resource-limited settings. The data were collected retrospectively, and some records were incomplete or had missing values, which were handled using a simple imputation technique. The datasets were balanced, which improved model performance and reliability. Additionally, explainability techniques such as SHAP provided actionable insights, enhancing the clinical relevance of the findings.
The study is not without limitations. The dataset was derived from a single tertiary, urban-centered hospital, with 76.5% of participants residing in urban areas. This overrepresentation limits generalizability to rural settings, where nutritional challenges and healthcare access may differ substantially, and the data may not capture regional diversity in malnutrition determinants. Future multi-center studies, including rural clinics and different regions, are needed to validate the model’s generalizability and ensure equity in malnutrition prediction, and prospective data collection could address urban-rural disparities in nutritional determinants. Reliance on BMI without mid-upper arm circumference (MUAC) or dietary data may underestimate the complexity of malnutrition; future studies should enrich EMRs with these variables. Self-reported adherence and inconsistent viral load recording in EMRs may have led to underestimation of the true predictive value of these variables. Prospective studies with standardized biochemical and dietary data (e.g., MUAC, albumin) are recommended.
Conclusion
This study shows the effectiveness of ML models, particularly SVM and random forest, in predicting malnutrition risk among PLWHIV. The incorporation of data balancing techniques and interpretability tools enhances their utility and relevance in clinical settings. Based on our findings, we urge healthcare providers to consider integrating machine learning-based tools into routine HIV care for early identification of malnutrition. To improve model performance and interpretability, future studies should invest in better EMR documentation practices and include richer clinical features such as dietary intake, MUAC, and serum albumin. Policymakers should invest in digital health infrastructure and capacity building to facilitate such integration. Researchers are encouraged to build upon this model with additional data and explore its application across different regions and populations. Future plans include piloting this model in real-world clinical settings in Ethiopia. Anticipated challenges include the need for clinical validation, integration with existing health information systems, data privacy concerns, and ensuring usability by healthcare workers with limited technical expertise.
Acknowledgements
We would like to thank the University of Gondar Comprehensive and Specialized Hospital for providing valuable data for this study and the data clerks who work in the ART clinic for their willingness to support the extraction of data from the database. We also acknowledge the contributions of the external reviewers for their insightful feedback on the manuscript.
Abbreviations
- AIDS
Acquired immunodeficiency syndrome
- ART
Antiretroviral therapy
- AUC
Area under the curve
- BMI
Body mass index
- CD4
Cluster of differentiation 4
- CPT
Cotrimoxazole preventive therapy
- CSV
Comma-separated values
- EMR-ART
Electronic medical record for antiretroviral therapy
- EPHI
Ethiopian Public Health Institute
- F1 Score
Harmonic mean of precision and recall
- FN
False negative
- FP
False positive
- HIV
Human immunodeficiency virus
- IRB
Institutional review board
- KNN
K-nearest neighbors
- mHealth
Mobile health
- ML
Machine learning
- MUAC
Mid-upper arm circumference
- PCA
Principal component analysis
- PLWHIV
People living with HIV
- Precision
Positive predictive value
- RBF
Radial basis function
- ROC
Receiver operating characteristic
- SHAP
SHapley additive exPlanations
- SMOTE
Synthetic minority oversampling technique
- SVM
Support vector machine
- TB
Tuberculosis
- TN
True negative
- TP
True positive
- TPT
Tuberculosis preventive therapy
- WHO
World Health Organization
Authors’ contributions
A.E.G made a significant contribution to the conceptualization, study selection, data curation, formal analysis, investigation, methodology, and original draft preparation. A.K.M was the project administrator; provided resources, software, supervision, validation, and visualization; reviewed and analyzed the results; and drafted the manuscript. N.D.B reviewed and edited the manuscript for clarity and accuracy. T.C.M contributed significantly to data collection and statistical analysis. A.D.W, M.A.A, and T.Z.Y contributed significantly to data collection, statistical analysis, and result interpretation. All authors read and approved the final manuscript.
Funding
The authors declare that no funding was received for this research.
Data availability
The datasets analyzed during the current study are available from the corresponding author upon reasonable request.
Declarations
Ethics approval and consent to participate
The study participants gave their informed consent after the study protocol was evaluated and approved by Debre Markos University’s ethical review board. The Amhara Regional Public Health Institute provided a letter of permission (Reference No: APHI/D/M/306/007). Due to the retrospective nature of the study, informed consent was not obtained. However, the confidentiality of the data was maintained by keeping the extracted information private and by ensuring that it could not be used for any purpose other than study-related purposes. The retrieved data were used only for this study, and the data-gathering tool did not hold participants’ names or any other personal information about them. ML deployment in HIV care requires balancing utility with privacy. While this study used de-identified data, future implementations should integrate informed consent for predictive model use, ensure transparency in how predictions are generated via explainable AI (e.g., SHAP), and adhere to local data privacy laws (e.g., anonymization, opt-out options). We recommend developing ethical frameworks in collaboration with local IRBs and ethics boards to guide the responsible deployment and scaling of ML in HIV care, ensuring fairness, privacy, and patient autonomy. The study adhered to the Helsinki Declaration.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. UNAIDS. Global HIV & AIDS statistics: fact sheet. 2024.
- 2. Heikens GT, Manary M. Part 2: Wasting disease in African children: the challenges ahead. Malawi Med J. 2009;21:101–5.
- 3. Weiser SD, Tuller DM, Frongillo EA, Senkungu J, Mukiibi N, Bangsberg DR. Food insecurity as a barrier to sustained antiretroviral therapy adherence in Uganda. PLoS ONE. 2010;5(4):e10340.
- 4. Gizaw A, Gebremichael A, Kebede D. Malnutrition and associated factors among adult people living with HIV receiving antiretroviral therapy at the Organization for Social Service Health Development in Jimma Town, Oromia Region, South West. Gen Med (Los Angeles). 2018;6:4–11.
- 5. Daka DW, Ergiba MS. Prevalence of malnutrition and associated factors among adult patients on antiretroviral therapy follow-up care in Jimma medical center, Southwest Ethiopia. PLoS ONE. 2020;15(3):e0229883.
- 6. World Health Organization. Nutritional care and support for people living with HIV/AIDS: a training course. Geneva: WHO; 2009.
- 7. World Health Organization, United Nations Children’s Fund. Guideline: updates on HIV and infant feeding: the duration of breastfeeding, and support from health services to improve feeding practices among mothers living with HIV. Geneva: WHO; 2016.
- 8. Birhane M, Loha E, Alemayehu FR. Nutritional status and associated factors among adult HIV/AIDS patients receiving ART in Dilla university referral hospital, dilla, Southern Ethiopia. J Med Physiol Biophys. 2021;70:8–15.
- 9. Johannessen A, Naman E, Ngowi BJ, Sandvik L, Matee MI, Aglen HE, et al. Predictors of mortality in HIV-infected patients starting antiretroviral therapy in a rural hospital in Tanzania. BMC Infect Dis. 2008;8:1–10.
- 10. Argemi X, Dara S, You S, Mattei JF, Courpotin C, Simon B, et al. Impact of malnutrition and social determinants on survival of HIV-infected adults starting antiretroviral therapy in resource-limited settings. Aids. 2012;26(9):1161–6.
- 11. Adal M, Howe RH, Kassa D, Aseffa A, Petros B. Malnutrition and lipid abnormalities in antiretroviral-naïve HIV-infected adults in Addis Ababa: a cross-sectional study. PLoS ONE. 2018;13:e0195942.
- 12. Rezazadeh L, Ostadrahimi A, Tutunchi H, Naemi Kermanshahi M, Pourmoradian S. Nutrition interventions to address nutritional problems in HIV-positive patients: translating knowledge into practice. J Health Popul Nutr. 2023;42(1):94.
- 13. UNAIDS. Ethiopia country report 2023: HIV and AIDS estimates. Geneva: Joint United Nations Programme on HIV/AIDS; 2023.
- 14. Alebel A, Kibret GD, Petrucka P, Tesema C, Moges NA, Wagnew F, et al. Undernutrition among Ethiopian adults living with HIV: a meta-analysis. BMC Nutr. 2020;6:1–10.
- 15. Shifera N, Yosef T, Matiyas R, Kassie A, Assefa A, Molla A. Undernutrition and associated risk factors among adult HIV/AIDS patients attending antiretroviral therapy at public hospitals of bench Sheko zone, Southwest Ethiopia. J Int Association Providers AIDS Care (JIAPAC). 2022;21:23259582221079154.
- 16. Liu Z, Meng Z, Wei D, Qin Y, Lv Y, Xie L, et al. Predictive model and risk analysis for coronary heart disease in people living with HIV using machine learning. BMC Med Inf Decis Mak. 2024;24(1):110.
- 17. Van Lettow M, Fawzi WW, Semba P, Semba RD. Triple trouble: the role of malnutrition in tuberculosis and human immunodeficiency virus co-infection. Nutr Rev. 2003;61(3):81–90.
- 18. Gupta RK, Lucas SB, Fielding KL, Lawn SD. Prevalence of tuberculosis in post-mortem studies of HIV-infected adults and children in resource-limited settings: a systematic review and meta-analysis. Aids. 2015;29(15):1987–2002.
- 19. Pedregosa F, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825.
- 20. Bishop CM, Nasrabadi NM. Pattern recognition and machine learning. New York: Springer; 2006.
- 21. Šimundić A-M. Measures of diagnostic accuracy: basic definitions. Ejifcc. 2009;19(4):203.
- 22. Santini A, Man A, Voidăzan S. Accuracy of diagnostic tests. J Crit Care Med. 2021;7(3):241–8.
- 23. Flach PA, Kull M. Precision-recall-gain curves: PR analysis done right. In: Advances in Neural Information Processing Systems. Vol. 28. MIT Press; 2015. p. 838–46.
- 24. Chicco D, Jurman G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics. 2020;21:1–13.
- 25. Kumar R, Indrayan A. Receiver operating characteristic (ROC) curve for medical researchers. Indian Pediatr. 2011;48:277–87.
- 26. Wang K, Tian J, Zheng C, Yang H, Ren J, Liu Y, et al. Interpretable prediction of 3-year all-cause mortality in patients with heart failure caused by coronary heart disease based on machine learning and SHAP. Comput Biol Med. 2021;137:104813.
- 27. Lundberg SM, Lee SI. A unified approach to interpreting model predictions. arXiv preprint arXiv:1705.07874. 2017.
- 28. UNAIDS. AIDS statistics– 2022 fact sheet. Geneva: Joint United Nations Programme on HIV/AIDS; 2023.
- 29. Frigati LJ, Ameyan W, Cotton MF, Gregson CL, Hoare J, Jao J. Spectrum, progression, and predictors of morbidity in perinatally HIV-infected adolescents on antiretroviral therapy. AIDS Res Hum Retroviruses. 2021;37:443–51.
- 30. Alebel A, Sibbritt D, Petrucka P, Demant D. Undernutrition increased the risk of loss to follow-up among adults living with HIV on ART in Northwest ethiopia: a retrospective cohort study. Sci Rep. 2022;12(1):22556.
- 31. Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20:273–97.
- 32. Kuhn M, Johnson K. Applied predictive modeling. New York: Springer; 2013.
- 33. World Health Organization. Nutritional care and support for people living with HIV/AIDS: a training course. Geneva: World Health Organization; 2009.