Author manuscript; available in PMC: 2026 Mar 1.
Published in final edited form as: J Acquir Immune Defic Syndr. 2025 Mar 1;98(3):209–216. doi: 10.1097/QAI.0000000000003561

Using machine learning techniques to predict viral suppression among people with HIV

Xueying Yang 1,2,*, Ruilie Cai 1,3, Yunqing Ma 1,3, Hao H Zhang 4, XiaoWen Sun 1,3, Bankole Olatosi 1,5, Sharon Weissman 1,6, Xiaoming Li 1,2, Jiajia Zhang 1,3
PMCID: PMC11798697  NIHMSID: NIHMS2034769  PMID: 39561000

Abstract

Background

This study aims to develop machine learning (ML) algorithms and examine their performance in predicting viral suppression among people with HIV (PWH) statewide in South Carolina (SC).

Methods

Extracted through the electronic reporting system in SC, the study population comprised adult PWH diagnosed between 2005 and 2021. Viral suppression was defined as a viral load <200 copies/ml. The predictors, including socio-demographics, historical viral load indicators (e.g., viral rebound), comorbidities, healthcare utilization, and annual county-level factors (e.g., social vulnerability), were measured in each 4-month window. Using historical information in different lag time windows (1-, 3-, or 5-lagged time windows, with each 4-month window as a unit), both traditional and ML approaches (e.g., Long Short-Term Memory network [LSTM]) were applied to predict viral suppression. Prediction performance across models was compared using area under the curve (AUC), recall, precision, F1 score, and Youden index.

Results

Machine learning approaches slightly outperformed the generalized linear mixed model. In all three lagged analyses of a total of 15,580 PWH, the LSTM algorithm (Lag 1: AUC=0.858; Lag 3: AUC=0.877; Lag 5: AUC=0.881) outperformed all other methods in terms of AUC for predicting viral suppression. The top-ranking predictors common across models included historical information on viral suppression, viral rebound, and viral blips in the Lag-1 time window. Inclusion of county-level variables did not improve prediction accuracy.

Conclusion

Supervised machine learning algorithms may offer better performance for risk prediction of viral suppression than traditional statistical methods.

Keywords: HIV/AIDS, Viral suppression, Machine learning, HIV care cascade

Introduction

Viral suppression is the final stage of the HIV treatment cascade, which serves as the framework for UNAIDS' 90-90-90 goals.1 Sustained (or durable) viral suppression permits the restoration of immune function, reduces onward transmission, and indicates long-term treatment success and mortality reduction.2 In the United States, approximately 66% of all people with HIV (PWH) were virally suppressed in 2019, based on the Centers for Disease Control and Prevention national surveillance data,3 and in South Carolina (SC), 61.5% of PWH were virally suppressed.4 These rates are far below the 90-90-90 target set by UNAIDS.

Routine monitoring of viral load (VL) becomes increasingly important over the life course: as the life expectancy of PWH is prolonged, the longitudinal VL information collected over time can provide opportunities to improve early prediction of subsequent virologic failure (VF) or mortality. The longitudinal history of VL comprises the observed VL laboratory measurements at each clinical visit. Over the past few years, a small but increasing number of longitudinal studies have explored the dynamics of VL patterns using sustained viral suppression, viral rebound, viral blips, or low-level viremia (LLV).5-7 Aggregate VL features (e.g., median VL, nadir VL, peak VL)8 and temporal VL features (e.g., time to initial viral suppression,9 viral rebound, or viral blips) can be defined over a long-term observation window. The different VL measures, such as viral rebound, viral blips, or persistent LLV, are interrelated, affect each other, and also predict, to some extent, viral suppression.10

Attainment of viral suppression depends on multiple factors, from individual-level factors (e.g., socio-demographics, clinical characteristics, HIV care-seeking behaviors) to county-level social and environmental factors (e.g., economic environment).11 Leveraging a combination of all or some of these factors through rapid risk calculation to predict viral outcomes in individual patients would enhance clinical decision making and prevent adverse outcomes of treatment failure and the costs associated with switching to second-line ART.12 A more comprehensive prediction model for viral outcomes based on the dynamic patterns of VL, individual demographics, HIV care-seeking behavior, and social and environmental factors can inform "when" and "how" to help those with poor viral control achieve and sustain viral suppression. However, most existing studies have focused on limited indicators of viral suppression rather than providing a comprehensive picture of the dynamic process of viral suppression, partially due to either the unavailability of such longitudinal data in medical records or the lack of advanced analytic tools for modeling such complex data. Furthermore, most extant literature counted the presence of virologic outcomes within a limited timeframe and explored their correlates using traditional analytic approaches, such as generalized estimating equations13 and Cox regression, which might not be sufficiently flexible for learning complex relationships, such as nonlinearity and collinearity, from the data.14 The limited literature applying machine learning methods to viral suppression, which address high dimensionality and collinearity, is either heavily dependent on the virological resistance genotype15-17 or conducted in resource-limited settings (e.g., Uganda).18,19 Therefore, the relative roles of various factors related to HIV viral load responses in the local context of the US need further examination.

This study aims to develop and evaluate modern machine learning algorithms to predict viral suppression among statewide PWH, accounting for various factors at the individual level (e.g., patient demographics, historical viral load measures, and health care service utilization), the structural level (e.g., geographic region, availability of treatment facilities and specialties), and the socioenvironmental level. We hypothesize that machine learning methods could significantly improve prediction accuracy because of their flexibility and learning power.

Methods

Study setting and design

This is a population-based cohort study built by integrating several electronic health record (EHR) databases (e.g., e-HARS, the all-payer claims database [UB payer]) and publicly available data sources in SC. The integrated EHR database was linked by the SC Office of Revenue and Fiscal Affairs (RFA), a state agency that links individual-level longitudinal health utilization data across multiple state agencies. Specifically, electronic records of HIV diagnoses, risk factors, and laboratory tests were collected by the SC Department of Health and Environmental Control (DHEC). The SC DHEC's HIV/AIDS electronic reporting system (e-HARS) is a statewide confidential name-based HIV/AIDS reporting system that began in 1986,20,21 to which all statewide CD4 and viral load tests have been reported since January 1, 2004, as mandated.22,23 The all-payer claims database captures individual-level longitudinal health utilization data from various state agencies, including all clinical condition diagnoses (e.g., cardiovascular disease and diabetes, classified by International Classification of Diseases [ICD] codes) from emergency departments, hospital inpatient facilities, ambulatory-care facilities, and outpatient surgery facilities in the state. Details of the research protocol, including the process of data extraction and management, are described elsewhere.24 Structural and environmental factors were retrieved from publicly available resources, including the American Community Survey and the Area Health Resources File. The research protocol received approval from the institutional review board of the University of South Carolina and relevant SC state agencies.

Study population

The study population was PWH with a confirmed HIV diagnosis in SC. Specific inclusion criteria were: 1) diagnosed with HIV between 2005 and 2020, with follow-up records through June 2021; 2) aged ≥18 years; 3) had at least one suppressed viral load (VL<200 copies/ml) record after HIV diagnosis; and 4) had at least two separate consecutive 4-month windows of follow-up information, with at least one VL test in each time window. A total of 15,580 PWH met the inclusion criteria and were included in the data analysis. Each patient's follow-up period started from the date of the initial VL measurement and ended at the date of the last VL measurement.

Outcome and Time Alignment

In accordance with the literature,25 we defined the occurrence of viral suppression (VS; VL<200 copies/ml) (i.e., the outcome) and all time-dependent predictors in each 4-month window after the index date (i.e., the date of the first VL measurement). All VS measures were updated at the beginning of each 4-month interval from baseline. If multiple VL test results were recorded within one 4-month window, the average of the original VL values in that window was used to define the viral suppression outcome. For VL below the detection limit (200 copies/ml), we used the value of 200 for averaging, since the specific value under 200 copies/ml was not available in our dataset. Our time metric was 4-month units from the index date; thus, all times were 4-month multiples. To improve predictive power, we examined in parallel different lags of historical information (i.e., one, three, and five lagged 4-month windows) as predictors for the outcome in the subsequent 4-month window; that is, we used predictors collected in the previous 4, 12, and 20 months to predict viral suppression in the Lag 1, Lag 3, and Lag 5 analyses, respectively. For each lag, the inclusion criteria were further restricted to PWH who had records of at least four consecutive 4-month time windows for the Lag 3 analysis (n=11,240) and at least six consecutive 4-month time windows for the Lag 5 analysis (n=7,931). Both the outcome and predictors were defined separately in each 4-month time window. The ML models we used do not allow missing values during model building; therefore, if data on either the outcome or covariates were missing in a time window, that window was excluded from model fitting.
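The windowing and averaging rule described above can be sketched as follows. This is a minimal illustration with hypothetical column names and a 122-day approximation of a 4-month window, not the authors' actual code:

```python
import pandas as pd

# Hypothetical long-format lab data: one row per VL test per patient,
# with time measured in days from each patient's index date.
labs = pd.DataFrame({
    "patient_id":      [1,   1,   1,   2,     2],
    "days_from_index": [10,  50,  140, 5,     300],
    "viral_load":      [150, 250, 180, 50000, 150],
})

WINDOW_DAYS = 122  # ~4 months; an assumption for this sketch
labs["window"] = labs["days_from_index"] // WINDOW_DAYS

# Values below the 200 copies/ml detection limit are set to 200 before
# averaging, mirroring the handling of censored values described above.
labs["vl_for_avg"] = labs["viral_load"].clip(lower=200)

# Average VL per patient-window; a window whose mean sits at the 200 floor
# (all tests below the detection limit) is treated as suppressed.
windowed = (labs.groupby(["patient_id", "window"])["vl_for_avg"]
                .mean().reset_index(name="mean_vl"))
windowed["suppressed"] = (windowed["mean_vl"] <= 200).astype(int)
```

In this toy example, patient 1's first window mixes a censored test (floored to 200) with a detectable 250, so the window mean exceeds 200 and the window counts as unsuppressed.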

Predictor Candidates

We compiled 47 predictor candidates, including both individual-level and county-level variables. Individual-level predictors captured socio-demographics (e.g., age, gender), baseline HIV markers (e.g., baseline CD4 count), comorbidities (e.g., diabetes), and HIV care cascade factors (e.g., linkage to care, retention in care); county-level predictors included HIV-specific county-level information (e.g., new HIV diagnosis rate) and health care access information (e.g., primary care physicians per 100,000 population). Predictor candidates are detailed in Supplementary Tables S1 & S2.

Machine-Learning Approaches and Prediction Performance Evaluation

Predictive models for VS (i.e., the outcome) were built from the predictor candidates to solve this binary classification problem. We applied seven statistical and machine learning approaches: generalized linear mixed models (GLMM), Classification and Regression Trees (Tree), Random Forest (RF), Support Vector Machines (SVM), Naïve Bayes (NB), eXtreme Gradient Boosting (XGBoost), and Long Short-Term Memory (LSTM) networks (described briefly in the Supplement methods). For each algorithm, we built two sets of models, one using individual-level factors only and the other using both individual- and county-level predictors. In each prediction model, we randomly split the unique RFA IDs into training IDs (80%) and testing IDs (20%), yielding the training set and the testing set, respectively. For algorithms without hyperparameters (e.g., GLMM), we fitted the model on the training set and applied the final model to the testing set to evaluate prediction performance. For algorithms that involve hyperparameters (e.g., Tree, RF, SVM, NB, XGBoost, and LSTM), we further split the training set into a sub-training set (80%) and a sub-validation set (20%) using the same split strategy. We then fit the algorithms on the sub-training set, refined them by selecting the best hyperparameters on the sub-validation set, and applied the final algorithms to the testing set to evaluate prediction performance. We repeated this procedure 30 times with random splits so that the variation of each performance metric could be compared with statistical tests. For each prediction model, we conducted separate analyses for the different lag intervals of VS (i.e., Lag 1, Lag 3, and Lag 5) as defined in the time scale alignment (Figure S1). Given that adding county-level predictors did not improve model performance, we present only the results from models based on individual-level predictors.
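The ID-level nested split described above can be sketched as below. The ID values, seeds, and fractions are illustrative assumptions, not the authors' code; the key point is that splitting happens at the patient-ID level so that all of one patient's windows stay on the same side of each split:

```python
import random

def split_ids(ids, test_frac=0.2, seed=0):
    """Randomly split unique patient IDs so that all records from one
    patient fall on the same side of the split."""
    ids = sorted(ids)
    random.Random(seed).shuffle(ids)
    n_test = int(len(ids) * test_frac)
    return set(ids[n_test:]), set(ids[:n_test])  # (train, test)

all_ids = range(100)  # hypothetical stand-in for the unique RFA IDs

# Outer 80/20 split into training and testing IDs, then an inner 80/20
# split of the training IDs into sub-training and sub-validation IDs
# for hyperparameter tuning.
train_ids, test_ids = split_ids(all_ids, 0.2, seed=1)
subtrain_ids, val_ids = split_ids(train_ids, 0.2, seed=2)
```

Repeating this with 30 different seeds yields the 30 replicated runs used for the statistical comparisons.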

To assess discrimination performance, we compared the algorithms on a variety of metrics, including classification accuracy, area under the curve (AUC), recall, precision, F1 score, and Youden index on the testing set. Definitions and calculations of these metrics are detailed in Supplement Table S3. To statistically compare the ML models on each performance metric, we conducted paired t-tests for each pair of models using the metrics from the 30 replicated runs and applied the Benjamini-Hochberg false discovery rate (FDR) correction to account for multiple testing.26
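The pairwise comparison can be sketched as follows. The AUC values are made-up placeholders, and the BH adjustment is written out explicitly (a library routine such as statsmodels' `multipletests` would give the same result); treat this as an illustration of the procedure rather than the authors' code:

```python
from scipy.stats import ttest_rel

def bh_adjust(pvals):
    """Benjamini-Hochberg step-up adjusted p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adj = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):  # walk from the largest p downward
        i = order[rank]
        running_min = min(running_min, pvals[i] * m / (rank + 1))
        adj[i] = running_min
    return adj

# Placeholder AUCs from 30 replicated runs of two models on shared splits
# (illustrative numbers, not the study's results).
auc_lstm = [0.856, 0.858, 0.860] * 10
auc_glmm = [0.842, 0.841, 0.843] * 10
t_stat, p_value = ttest_rel(auc_lstm, auc_glmm)  # paired t-test per model pair
```

The raw p-values from all model pairs would then be passed through `bh_adjust` before comparing against the significance threshold.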

Feature Importance and Selection

For each algorithm, we calculated the importance of each predictor based on its contribution to the model and then selected the top-ranked features. For Tree and XGBoost models, accuracy-based importance was used; for Random Forest models, Gini-based importance was used. Since most predictors are categorical variables, these two importance calculations are roughly aligned. For each predictor, the accuracy-based importance is calculated as follows: 1) the prediction accuracy on the out-of-bag sample is measured; 2) the values of the variable in the out-of-bag sample are randomly shuffled, while all other variables are kept fixed; and 3) the decrease in prediction accuracy on the shuffled data is measured. Gini-based importance measures the average gain in purity from splits on a given variable. If the variable is useful for classification, it tends to split mixed-label nodes into purer nodes, such as single-class nodes, whereas splitting on a permuted variable tends to neither increase nor decrease node purity; permuting an important variable thus tends to cause a relatively large decrease in mean Gini gain. After repeating the importance calculation 30 times, we computed the mean importance for each predictor and plotted the top 20 important predictors in the model.
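Both importance measures can be sketched with scikit-learn on synthetic data. The dataset, model settings, and feature count here are assumptions for illustration, not the study's configuration, and note that scikit-learn's `permutation_importance` scores on the data supplied to it rather than on out-of-bag samples as described above:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Synthetic stand-in for the predictor matrix (6 hypothetical features).
X, y = make_classification(n_samples=500, n_features=6, n_informative=3,
                           random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Gini-based importance: mean decrease in node impurity per feature,
# rescaled to percentages (Gini_i / sum(Gini_i)) as in Figure 2's footnote.
gini_pct = 100 * rf.feature_importances_ / rf.feature_importances_.sum()

# Accuracy-based (permutation) importance: shuffle one feature at a time,
# keep the others fixed, and measure the drop in score.
perm = permutation_importance(rf, X, y, n_repeats=10, random_state=0)
ranking = np.argsort(perm.importances_mean)[::-1]  # most important first
```

Averaging these importances over 30 replicated fits, as in the paper, then gives a stable ranking from which the top 20 can be plotted.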

Results

Patients’ characteristics

Among the 15,580 eligible PWH, the mean age at HIV diagnosis was 37.03 years (SD=11.69). The majority of participants were male (11,124, 71.40%), Black (11,044, 70.89%), and living in urban areas (12,906, 82.84%). Nearly half of the patients' HIV risk exposure category was men who have sex with men (MSM; 7,013, 45.01%). The median baseline CD4 count was 315 cells/mm3 (Table 1).

Table 1.

Socio-demographics and baseline HIV markers among PWH

Characteristic Total N (%)

Number of Patients 15,580
Number of Visits 269,958
Age (years, mean, SD) 37.03 (11.69)
Sex
 Male 11,124 (71.40)
 Female 4,456 (28.60)
Race
 Black 11,044 (70.89)
 White 3,627 (23.28)
 Hispanic 563 (3.61)
 Others/unknown 346 (2.22)
HIV risk exposure
 MSM 7,013 (45.01)
 Heterosexual 3,989 (25.61)
 Others 3,205 (20.57)
 Injecting drug user 1,373 (8.81)
Region
 Urban 12,906 (82.84)
 Rural 2,674 (17.16)
Time to initial VS (months, mean, SD) 81.55 (81)
Baseline CD4 count (cells/mm3, median, IQR) 315 (396)
Baseline VL (copies/ml, median, IQR) 18730.5 (95556)

SD: standard deviation; IQR: interquartile range

Prediction performance of ML algorithms

Table 2 summarizes two prediction performance measures (accuracy and AUC) for each model in the Lag 1, Lag 3, and Lag 5 time windows. In all three lagged analyses, the LSTM algorithm (Lag 1: AUC=0.858, SD=0.003; Lag 3: AUC=0.877, SD=0.003; Lag 5: AUC=0.881, SD=0.004) outperformed all other methods in terms of AUC for predicting viral suppression. Prediction performance on all metrics (e.g., recall, F1 score), with and without county-level predictors, is listed in Supplementary Tables S4 & S5. Table 3 compares the between-model differences in prediction performance (classification accuracy and AUC) across all ML algorithms and shows that the differences were mostly significant. Specifically, in the Lag 1 analysis, Random Forest showed the highest classification accuracy (0.828), while LSTM showed the highest AUC (0.858). In the Lag 3 analysis, both RF (0.834) and LSTM (0.833) achieved the highest classification accuracy, while LSTM again showed the highest AUC (0.877). In the Lag 5 analysis, SVM (linear) and Tree shared the highest classification accuracy (0.841), while LSTM still had the best AUC performance (0.881) (Figure 1).

Table 2.

Accuracy and AUC for Different Machine Learning Models without County-level Factors for Different Lag Intervals

Lag 1 Lag 3 Lag 5

Accuracy (SD) AUC (SD) Accuracy (SD) AUC (SD) Accuracy (SD) AUC (SD)

GLMM 0.811 (0.001) 0.842 (0.001) 0.832 (0.001) 0.876 (0.002) 0.84 (0.002) 0.88 (0.003)
LSTM 0.826 (0.004) 0.858 (0.003) 0.833 (0.004) 0.877 (0.003) 0.839 (0.005) 0.881 (0.004)
Tree 0.826 (0.002) 0.812 (0.002) 0.831 (0.003) 0.812 (0.004) 0.841 (0.004) 0.818 (0.004)
Random Forest 0.828 (0.002) 0.815 (0.003) 0.834 (0.003) 0.814 (0.004) 0.84 (0.004) 0.817 (0.004)
XGboost 0.771 (0.003) 0.753 (0.003) 0.795 (0.003) 0.772 (0.004) 0.808 (0.005) 0.783 (0.004)
SVM Linear 0.812 (0.002) 0.803 (0.002) 0.831 (0.003) 0.812 (0.004) 0.841 (0.004) 0.818 (0.004)
SVM Radial 0.813 (0.002) 0.801 (0.003) 0.822 (0.003) 0.798 (0.004) 0.821 (0.004) 0.788 (0.005)
Naive Bayes 0.642 (0.098) 0.682 (0.073) 0.764 (0.008) 0.762 (0.005) 0.748 (0.009) 0.751 (0.007)

Note: SD: standard deviation

Lag 1 indicates the 4-month time window lagged from the outcome variable.

Lag 3 indicates the 12-month time windows lagged from the outcome variable.

Lag 5 indicates the 20-month time windows lagged from the outcome variable.

Table 3.

Paired T-test for Accuracy and AUC with FDR Correction for Different Lag Intervals*

GLMM LSTM Tree Random Forest XGboost SVM Linear SVM Radial Naive Bayes

Lag1
GLMM 1 <0.001 <0.001 <0.001 <0.001 <0.001 <0.001 <0.001
LSTM <0.001 1 <0.001 <0.001 <0.001 <0.001 <0.001 <0.001
Tree <0.001 0.239 1 0.003 <0.001 <0.001 <0.001 <0.001
Random Forest <0.001 0.05 <0.001 1 <0.001 <0.001 <0.001 <0.001
XGboost <0.001 <0.001 <0.001 <0.001 1 <0.001 <0.001 <0.001
SVM Linear 0.054 <0.001 <0.001 <0.001 <0.001 1 <0.001 <0.001
SVM Radial 0.015 <0.001 <0.001 <0.001 <0.001 0.06 1 <0.001
Naive Bayes <0.001 <0.001 <0.001 <0.001 <0.001 <0.001 <0.001 1

Lag3
GLMM 1 0.043 <0.001 <0.001 <0.001 <0.001 <0.001 <0.001
LSTM 0.258 1 <0.001 <0.001 <0.001 <0.001 <0.001 <0.001
Tree 0.258 0.004 1 <0.001 <0.001 <0.001 <0.001 <0.001
Random Forest 0.007 0.074 <0.001 1 <0.001 <0.001 <0.001 <0.001
XGboost <0.001 <0.001 <0.001 <0.001 1 0.952 <0.001 <0.001
SVM Linear 0.258 0.004 1 <0.001 <0.001 1 <0.001 <0.001
SVM Radial <0.001 <0.001 <0.001 <0.001 <0.001 <0.001 1 <0.001
Naive Bayes <0.001 <0.001 <0.001 <0.001 <0.001 <0.001 <0.001 1

Lag5
GLMM 1 0.466 <0.001 <0.001 <0.001 <0.001 <0.001 <0.001
LSTM 0.373 1 <0.001 <0.001 <0.001 <0.001 <0.001 <0.001
Tree 0.267 0.012 1 <0.001 <0.001 <0.001 <0.001 <0.001
Random Forest 0.383 0.017 0.553 1 <0.001 <0.001 <0.001 <0.001
XGboost <0.001 <0.001 <0.001 <0.001 1 0.331 0.438 <0.001
SVM Linear 0.267 0.012 1 0.553 <0.001 1 0.173 <0.001
SVM Radial <0.001 <0.001 <0.001 <0.001 <0.001 <0.001 1 <0.001
Naive Bayes <0.001 <0.001 <0.001 <0.001 <0.001 <0.001 <0.001 1
*

lower triangle includes p-values for accuracy; upper triangle includes p-values for AUC

Figure 1.


ROC curves for each ML algorithm in the different lag intervals. The first panel is for the Lag 1 analysis, the second for the Lag 3 analysis, and the third for the Lag 5 analysis.

Predictor importance

We used Random Forest (RF) to perform feature selection, reducing the original set of 46 input predictors for viral suppression to a small set of important ones. In particular, RF permutes the values of individual predictors and calculates the resulting change in the model's predictive power (measured by AUC). This process prioritizes predictors that improve prediction power and deprioritizes those contributing little or no improvement to AUC, allowing RF to rank the relative feature importance of the total input set for each model. Figure 2 shows the top 20 important predictors identified by the RF model in the Lag 3 analysis, which include predictors related to historical viral information, such as Lag 1 viral suppression, Lag 1 viral rebound, Lag 1 viral rebound size, Lag 2 viral suppression, and Lag 3 viral suppression. Feature selection results from the models incorporating the county-level predictors are presented in Supplementary Figure S2 and did not differ significantly from the results of the models without county-level predictors.

Figure 2.


Top 20 important predictors for viral suppression selected by random forest (RF)

Abbreviations: VS: viral suppression

a Rather than p-values or coefficients, RF reports the importance of the predictor variables included in a model using Gini split methods. Importance measures each variable's cumulative contribution toward reducing heterogeneity (impurity) within the subsets formed as the data set is sequentially split on that variable, and thus reflects the variable's impact on prediction. Absolute importance is then scaled to give relative importance, with a maximum of 100. This figure was generated from Random Forest using data without county-level predictors. Predictor importance was calculated as the mean decrease in Gini, rescaled into percentages (Gini_i / sum(Gini_i)), averaged over 30 replicated calculations, and sorted in descending order of average importance.

Discussion

Using statewide surveillance data, this is one of the largest cohort studies to use EHR data and machine learning techniques to develop viral suppression models, which demonstrated strong performance. The LSTM and RF models achieved high classification accuracy and AUC for predicting viral suppression in the 4 months after initial viral suppression and sometimes outperformed the traditional approach (GLMM) and other ML algorithms (e.g., Tree). ML-based algorithms are capable of accurately predicting early virological suppression using readily available baseline demographic and clinical variables and could be used to derive a risk score for use in resource-limited settings. Compared with previous viral suppression models in the US,27 this study has predictive advantages in that it incorporates routinely monitored viral load variables (e.g., VL dynamics, viral rebound), historical medical conditions (e.g., comorbidities), and aggregated county-level factors into model building. These models can therefore guide individualized clinical decisions, such as the choice of first-line ART and clinical (virological and immunological) monitoring. In addition, machine learning algorithms utilizing EHR data can accurately predict potential future events, such as the risk of virological failure, and this information would enable providers to intervene in real time to improve patient outcomes.

To illustrate application in clinical practice, we list the predictor distributions of individuals with very high or very low predicted probabilities of viral suppression in Supplementary Tables S6 & S7. We observed differences in many variables, such as higher Lag 1 viral load values among virally unsuppressed individuals and a higher proportion of time with viral suppression among virally suppressed individuals. This information could inform the clinical decision-making process. For example, we found that the most recent VL value within the past 3 months had the highest predictive power for future VS; such questions could therefore be included when developing a decision-making tool, with which physicians could identify individuals at high risk of virological failure based on historical VL values and other covariates.

Extant literature on machine learning models for viral suppression prediction is limited, and most existing studies were conducted in sub-Saharan African countries. For example, one study conducted in South Africa used ML to predict VS with a predictive metric of 0.76;28 it used two ML methods (Random Forest and AdaBoost) and identified important factors such as prior late visits, the number of prior VL tests, time since the last visit, number of visits on the current regimen, age, treatment duration, and the range of previous VL values. However, that study used yearly viral suppression as the outcome, whereas our study used 4-month time windows, which characterize longitudinal patterns of viral suppression at a higher resolution. Head-to-head comparison of model performance with other studies is difficult given the varied study populations and data sources. Despite this, our study yielded a relatively higher classification accuracy than some studies in South Africa (e.g., 0.83 vs 0.76). Another study, conducted in Uganda,18 used logistic regression-based machine learning techniques (e.g., multitask temporal logistic regression, patient-specific survival prediction). In the US, one study conducted in Boston aimed to predict one-year virologic failure using EHR data from 2005 and 2006, which may be outdated given the advancement of HIV treatment and routine monitoring of viral load.27 In addition, that prediction rule was based on data collected in two HIV clinics, with a smaller sample size and no validation in larger, more geographically diverse cohorts. Our study provides complementary evidence by incorporating more thorough VL dynamic measures with a larger sample size.

Results from feature selection indicate that prior viral load-related measures played a major role in predicting viral suppression. By contrast, traditional demographic predictors, comorbidities, healthcare utilization, and aggregated county-level factors were less important than the viral load-related indicators. Due to scant literature analyzing comparable data with similar approaches, a direct comparison of feature selection with other studies is challenging. Nevertheless, previous VL values were indeed included in earlier models.27,28 Based on our study, these more powerful predictors can be used to further stratify the population by risk and segment it more granularly for targeted interventions and differentiated care. Our results suggest important implications for applying this methodology to clinical practice: utilizing only routinely collected demographic, visit, and laboratory data, ML-based predictive models perform well in predicting viral suppression.

Our study had certain limitations that should be acknowledged. First, our target population was limited to PWH in SC, and the results may not generalize to other settings. Moreover, we did not assess some other clinical attributes (e.g., white and red blood cell parameters, liver enzyme abnormalities) and risk factors (e.g., ART adherence, quality of life) that may affect VL dynamics. To this end, our approach can be utilized and further refined by applying it to data sources that contain a richer set of clinical, social, and behavioral predictors. Second, because anonymized, routinely collected facility-level data were used to fit the models, it was not possible to trace missing data or correct erroneously linked visits and laboratory data. Third, a recognized limitation of black-box machine learning methods is that, while the identified predictors contribute to differences in risk, we cannot yet fully explain how or why they contribute in the context of an individual patient. For example, the LSTM model requires extensive computing time and its results are hard to interpret. Additional analysis and modeling are needed to provide interpretable descriptions of how the algorithm performs population segmentation. Fourth, we did not harness unstructured clinical notes for risk prediction. In the future, ML algorithms utilizing both structured fields and natural language processing (NLP) of unstructured clinical notes to predict risk of HIV viral suppression might be more accurate than algorithms using structured EHR data alone.29 For example, Semerdjian et al. found that a model using NLP of clinical notes had higher prediction performance than a model based on demographics.30 Finally, our model applies only to datasets with documented historical viral load records, which may limit the generalizability of the models.

Despite these limitations, the analysis has several strengths. First, accurately identifying those at risk of poor viral suppression will allow health care services to better triage patients, improving efficiency and resource utilization. By prioritizing those most at risk, clinics can achieve better health outcomes without additional investment in data collection and staff. In addition, the results of the algorithm can also be aggregated and used to risk-score population subgroups at the facility level to identify programs for specific interventions. Machine learning models are promising tools for improving HIV care continuum outcomes. Future research should consider combining EHR data with additional data sources (e.g., social media, geospatial data) to improve prediction model accuracy.

Conclusions

Predictive models and machine learning can identify and target HIV patients at risk of poor viral suppression. Our approach could enable anticipation of future outcomes before any visible signs or poor outcomes occur (e.g., an unsuppressed VL) and, most importantly, while the patient is still engaged in care. This creates the opportunity for a proactive approach to patient management: specific targeted interventions can be designed for identified subsets of treatment cohorts, allowing cost-effective differentiated models of care and treatment to be applied across the cascade. This approach could also be extended to other key HIV outcomes, enabling a cost-effective, precision programming approach.

Supplementary Material

Supplemental Digital Content

Acknowledgements

The research reported in this publication was supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under Award Number R01AI164947. Dr. Xueying Yang's effort is also supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under Award Number R21AI170159. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Funders had no role in the design of the study or the collection, analysis, and interpretation of the data. Data are not publicly available.

Conflicts of Interest and Sources of Funding:

The research reported in this publication was supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under Award Number R01AI164947 (BO and JZ). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The funders had no role in the design of the study or in the collection, analysis, and interpretation of the data. The authors declare no conflict of interest.

Footnotes
Conference presentations: Yang, X., Cai, R., Ma, Y., Zhang, H., Olatosi, B., Weissman, S., Zhang, J., Li, X., Zhang, J. Using machine learning techniques to predict viral suppression among people living with HIV. Poster presentation at the 2022 Annual Meeting of the American Public Health Association, Nov 6–9, 2022, Boston, USA.
