Abstract
Age-related diseases and syndromes result in poor quality of life and adverse outcomes, representing a challenge to healthcare systems worldwide. Several pharmacological interventions have been proposed to target the aging process to slow its adverse effects. The so-called geroprotectors have been proposed as novel molecules that could maintain the organism's homeostasis, targeting specific aspects linked to the hallmarks of aging and delaying the adverse outcomes associated with age. On the other hand, machine learning (ML) is revolutionising drug design by making the process faster, cheaper, and more efficient.
Supplementary Information
The online version contains supplementary material available at 10.1186/s13321-025-01058-5.
Keywords: Aging, Drug development, Age-related diseases, Geroprotectors, Machine learning, Natural products, Cheminformatics
Scientific contribution
This paper aims to identify new compounds with potential geroprotective properties using a structure-based analysis with machine learning tools. First, we obtained chemical descriptors (1-3D) for the molecules in the Geroprotectors database, which contains already published molecules with demonstrated geroprotective activity. Then, we built classifiers with three different machine learning models: Decision Tree Classifier (DT), Support Vector Machine (SVM) and K-Nearest Neighbours (KNN). Using our classifiers, we searched for new potential geroprotectors in the Collection of Open Natural Products (COCONUT) database, which contains 695,133 molecules with diverse structures. The AUC values (DT: 0.62, SVM: 0.73, KNN: 0.64) indicate that all models performed with modest accuracy. Considering only compounds positively predicted by all three classifiers and applying post-prediction filters, we predicted that 1,488 molecules have the potential to be experimentally tested as geroprotectors. Finally, we created a freely available online database: https://gcoixc-laboratorio0de0bioinform0tic-inger.shinyapps.io/Geroprotectors_ShinyApp/.
Graphical Abstract
Introduction
Life expectancy has increased significantly worldwide in recent decades due to improved living conditions and the effectiveness of medical care [1]. Additionally, the United Nations estimates that by 2050, approximately 20% of the world’s population will be over 60 years old [2]. Interestingly, aging is considered a risk factor for the development of multiple impairments and chronic diseases, such as cancer, diabetes, cardiovascular and neurodegenerative diseases [3]. Recently, aging has been considered a complex process resulting from the accumulation of multiple forms of damage in different tissues due to failures in cellular maintenance pathways [4]. Advances in molecular and cellular biology have led to the discovery of the Hallmarks of aging [5, 6], which are biological mechanisms that represent the major drivers of age-related decline.
Several authors have suggested targeting the hallmarks of aging to positively intervene in the aging process. In this sense, geroprotectors are pharmacological compounds that can maintain the organism's homeostasis by targeting specific aspects linked to the Hallmarks of Aging, thereby delaying the adverse outcomes associated with age [7]. For example, pharmacological targeting of the insulin/IGF-1 pathway (deregulated nutrient sensing) with rapamycin mimics caloric restriction, reducing inflammation and improving insulin sensitivity, and increases the lifespan of different model organisms [8, 9]. Moreover, senolytics, a broad group of compounds that target senescent cells by killing them or diminishing their inflammatory response, have demonstrated promising effects. For instance, the use of Dasatinib and Quercetin enhances cardiovascular function in old mice (24 months old) [10].
Another example is metformin, which extends the lifespan of the roundworm Caenorhabditis elegans by up to 50% by modulating the cellular energy sensor adenosine monophosphate-activated protein kinase (AMPK) and its upstream activating kinase, liver kinase B1 (Lkb1, also known as par-4 in the worm) [11]. In addition, a large clinical trial, the TAME trial (Targeting Aging with Metformin), is currently underway to investigate whether metformin can delay the onset of age-related diseases in humans [12]. Therefore, identifying and validating drugs that target the Hallmarks of Aging is a viable alternative for modulating this process [13]. To date, around 200 geroprotective compounds have been reported; they have been experimentally validated in different aging models and can be freely consulted in the Geroprotectors database (http://geroprotectors.org/) [13, 14].
On the other hand, machine learning methods have become increasingly important in the field of drug design and development. Because they can identify diverse and complex patterns in large chemical datasets, they reduce the experimental search and pinpoint potential compounds with desired properties, saving time and money [15, 16]. Moreover, machine learning is based on identifying patterns and correlations in large datasets, enabling systems to make informed decisions [17]. Classification algorithms assign categories to data points in order to predict their class [18]. For instance, using machine learning, Diogo Barardo and coworkers identified at least 20 chemical compounds in the DrugAge database that have a high probability of increasing the lifespan of C. elegans [19, 20]. Another interesting example is AgeXtend, an artificial intelligence (AI)-powered platform that aims to revolutionise anti-ageing research by efficiently identifying and understanding molecules that can combat the aging process. Nevertheless, AgeXtend still requires significant computational resources and specialised expertise, limiting its use [21].
Given the lack of accessible and easy-to-use web tools, in the present study we aim to develop three machine learning classification models based on DT, SVM, and KNN to identify new potential geroprotectors. To train our models, we collected a dataset from the Geroprotectors database (http://geroprotectors.org/), which contains information on compounds with proven experimental geroprotective activity [14, 20]. Then, we screened the COCONUT database (https://coconut.naturalproducts.net/), which contains > 600,000 compounds, to identify new potential geroprotective compounds using our classifiers. Our results suggest 1,488 interesting candidate molecules with both geroprotective and leadlikeness potential. Finally, we developed a freely available web tool to disclose our findings, which could be experimentally validated for confirmation as geroprotectors.
Results
We considered 206 reported geroprotectors and 199 compounds with no reported geroprotective activity. The two sets were concatenated and the classes binary encoded (0/1); 80% of the dataset was used for training the models and 20% for testing them.
Thirty-nine chemical descriptors, including basic molecular, physicochemical, and toxicoinformatic properties, were calculated with the DataWarrior software. After performing a PCA, we retained the following descriptors to avoid overfitting: total molecular weight, cLogP, H-acceptors, H-donors, total surface area, relative PSA, and rotatable bonds.
Our results (Table 1) indicate that all the models used have an accuracy of more than 60%; for instance, SVM has an accuracy of 67.9%. Moreover, DT has the highest specificity (0.61). On the other hand, KNN has a good recall (0.77), correctly identifying relevant instances. Altogether, our results suggest that our classifiers met the criteria (accuracy and specificity) to search for potential geroprotector candidates using the COCONUT database. Additionally, we show the results of a fivefold cross-validation for each model; however, the nature of the classification problem favours training with the full dataset, as small partitions in the cross-validation do not adequately preserve the distribution of minority classes.
Table 1.
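Results from the three ML classifier models. Accuracy (overall proportion of correct predictions), Specificity (true negative rate), Recall (sensitivity, true positive rate), F1 (harmonic mean of precision and recall) and MCC (Matthews correlation coefficient). All metrics range from 0 to 1 except MCC, which ranges from -1 to 1
| Model | Accuracy | Specificity | Recall | F1 | MCC |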
|---|---|---|---|---|---|
| DT | 0.61 | 0.60 | 0.62 | 0.58 | 0.23 |
| Cross-validation (fivefold-DT) | 0.56 (± 0.06) | 0.58 (± 0.04) | 0.54 (± 0.14) | 0.54 (± 0.11) | 0.13 (± 0.13) |
| SVM | 0.67 | 0.54 | 0.85 | 0.69 | 0.41 |
| Cross-validation (fivefold-SVM) | 0.61 (± 0.06) | 0.80 (± 0.04) | 0.61 (± 0.06) | 0.54 (± 0.07) | 0.23 (± 0.10) |
| KNN | 0.65 | 0.56 | 0.77 | 0.65 | 0.33 |
| Cross-validation (fivefold-KNN) | 0.56 (± 0.02) | 0.63 (± 0.08) | 0.50 (± 0.07) | 0.54 (± 0.03) | 0.13 (± 0.04) |
The results for the three classifiers are presented in Fig. 1, where the AUC-ROC values were 0.62 for DT (Fig. 1a), 0.73 for SVM (Fig. 1b), and 0.64 for KNN (Fig. 1c). Once we integrated these results, we obtained 51,564 compounds classified as probable geroprotectors by all three models (Fig. 1d).
Fig. 1.
Receiver Operating Characteristic (ROC) curves for the ML models. a DT-ROC curve; b SVM-ROC curve; c KNN-ROC curve; d Venn diagram of compounds predicted by each model; the intersection of all three contains 51,564 compounds, while the pairwise overlaps contain 78,964 (DT-SVM), 73,136 (DT-KNN) and 81,255 (KNN-SVM) compounds. Interestingly, DT identifies the most potential candidates (189,799) compared to SVM (130,490) and KNN (124,736). The total number of compounds screened was 695,133
We then applied the leadlikeness and toxicoinformatic criteria; as a result, only 1,488 compounds showed no toxicoinformatic alerts and had a leadlike structure. These data are available at: https://gcoixc-laboratorio0de0bioinform0tic-inger.shinyapps.io/Geroprotectors_ShinyApp/.
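The screening funnel can be summarised with simple arithmetic. The sketch below uses only counts reported in this article (695,133 screened, 51,564 consensus hits, 2,089 after the lead-likeness filter, 1,488 after toxicoinformatic exclusion); the variable names are illustrative.

```python
# Counts reported in the article; variable names are illustrative.
funnel = [
    ("screened (COCONUT)", 695_133),
    ("predicted by all three models", 51_564),
    ("after lead-likeness filter", 2_089),
    ("after toxicoinformatic filter", 1_488),
]

total = funnel[0][1]
previous = total
retention = {}
for stage, count in funnel:
    # Share of the full library and share of the previous stage that survives.
    retention[stage] = (100 * count / total, 100 * count / previous)
    print(f"{stage:32s} {count:>8,d}  "
          f"{retention[stage][0]:6.2f}% of library  "
          f"{retention[stage][1]:6.2f}% of previous stage")
    previous = count
```

Under these numbers, the consensus step keeps roughly 7% of the library, and the two post-prediction filters together keep roughly 3% of the consensus hits.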
We also compared our 1,488 compounds with those in the Geroprotectors database (http://geroprotectors.org/). Figure 2 displays the distributions of key descriptors for known versus predicted geroprotectors. The profiles are broadly similar, suggesting that the candidate compounds mostly fall within ranges seen for known geroprotectors.
Fig. 2.
Comparison of molecular descriptors between the Geroprotectors database and the new dataset of potential geroprotector candidates obtained by ML. a-d show overlapping distributions of Molecular Weight, cLogP, H-acceptors and H-donors. The orange distribution corresponds to the 1,488 geroprotectors candidates predicted by ML, while the blue distribution corresponds to the known geroprotectors (http://geroprotectors.org/). e shows a PCA plot; each vector represents a descriptor, and coloured dots indicate individual compounds
Moreover, we analysed the chemical space to compare our dataset with FDA-approved compounds. Figure 3 shows that our compounds fall within the chemical space of FDA-approved drugs, which is desirable from a pharmaceutical perspective for future development. The complete workflow is summarised in Fig. 4.
Fig. 3.
Comparison of chemical space between FDA-approved compounds, reported geroprotectors and potential geroprotector candidates. a PCA plot representing the chemical space comparison between FDA-approved compounds (dots in pink), reported geroprotectors (dots in purple) and candidate geroprotectors (dots in green). b t-SNE analysis of the chemical space comparison between FDA-approved compounds (dots in pink), reported geroprotectors (dots in purple) and candidate geroprotectors (dots in green)
Fig. 4.
Complete workflow performed on the dataset. This image illustrates the step-by-step process that we followed to identify potential geroprotector candidates
Discussion
An essential goal of ageing research is to identify and develop potential geroprotectors (compounds that protect against the deleterious effects of aging by targeting two or more hallmarks of ageing) to increase lifespan and delay age-related diseases. Moreover, the use of geroprotector drugs and exercise may improve the population's healthy longevity [22]. However, the development of geroprotectors has been slow and limited. Two hundred compounds have been reported as potential geroprotectors [7], and none are currently on the market. This limitation stems from the fact that discovering and developing new drugs is a costly, time-consuming, and highly complex process that presents numerous challenges. Currently, ML methods, along with QSAR algorithms, have improved decision-making and the hit-to-lead process. In this sense, enhanced in silico prediction of potential geroprotectors is now possible, offering a more straightforward path.
There are several examples of such applications. For instance, Liu et al. performed an artificial intelligence (AI)-driven target analysis and identified two unreported therapeutic targets for endometriosis treatment [23], while Smer-Barreto et al. discovered three new senolytics using ML trained on data from published articles [16]. Moreover, Arora et al. presented AgeXtend, an artificial intelligence (AI)-based multimodal geroprotector prediction platform, which is Python-based and leverages bioactivity data of known geroprotectors [21].
In this context, the prediction algorithms used in the present study are: Decision Trees (DT), a method commonly employed in data mining for classification, which presents a tree-like structure that facilitates decision-making through a series of attribute evaluations [24]; Support Vector Machine (SVM), which finds the optimal hyperplane that separates different classes of data in a high-dimensional space [24, 25]; and K-Nearest Neighbours (KNN), which classifies data based on the majority class of its K nearest neighbours in the feature space [26]. Implementing three distinct ML models (DT, SVM, and KNN) using molecular descriptors of already reported geroprotectors represents a significant advance in identifying novel compounds. The AUC values for each model (DT: 0.62, SVM: 0.73, KNN: 0.64) indicate that all models performed with modest accuracy, with SVM particularly promising in discriminating geroprotector compound candidates. Moreover, after performing the cross-validation, we observed that the results were not significantly improved compared to the single split initially used. The cross-validation resulted in slightly lower metrics due to the inherent variability of the data and the limited dataset size.
Interestingly, the DT algorithm identifies the most candidates (189,799), even though it showed the lowest recall of the three models on the test set (0.62; Table 1); the difference may reflect how its decision rules generalise to the much larger and more diverse COCONUT collection. Therefore, to minimise false positives and increase confidence, given the moderate accuracy of individual models, we only consider compounds positively predicted by the three classifiers as potential geroprotector candidates. Overall, our results suggest that a combined ML approach may be more beneficial for the correct prediction of novel potential geroprotector candidates.
Integrating our results from the three models led to the identification of 51,564 potential geroprotectors. A chemical rationale for this result is that the molecular descriptors identified as most important by our PCA are directly related to fundamental physicochemical properties that dictate how a drug interacts with the body, known as the “drug-likeness” rules. This number was further refined through leadlikeness criteria (molecular weight (MW) < 300 Da, hydrogen bond acceptors ≤ 3, cLogP ≤ 3) and toxicoinformatic criteria, resulting in 1,488 compounds without toxicoinformatic indicators, including genotoxicity, carcinogenicity, reproductive toxicity, and irritant properties. Also noteworthy is the chemical space analysis comparing our identified compounds with known FDA-approved drugs and geroprotectors. The observed alignment suggests that the identified compounds share structural similarities and possess drug-like properties similar to those of already approved drugs, which could facilitate their development as therapeutic agents.
Our results identified 1,488 potential geroprotectors from the COCONUT database. When we explored our dataset, we observed that three compounds have already been reported as geroprotectors (1-MNA, Trigonelline and Hypotaurine) [27–29]. Previous work by Aliper et al. used GeroScope to analyse gene expression data and predict geroprotectors, discovering natural compounds (Myricetin and Epigallocatechin gallate) with potent synergistic effects against the negative impacts of ageing [15]. Accordingly, our findings suggest that natural products could serve as a promising source for geroprotector-like molecules, inspiring the development of new synthetic lead compounds.
Thus, natural products continue to be a crucial resource for drug development, providing a unique array of chemical diversity and potential therapeutic agents. Therefore, we created a publicly accessible database (https://gcoixc-laboratorio0de0bioinform0tic-inger.shinyapps.io/Geroprotectors_ShinyApp/) that allows anyone to access the 1,488 potential geroprotector candidates through a web application, thereby accelerating the construction of compound libraries and the experimental validation of new geroprotective compounds, considering that most of the compounds comply with the leadlikeness rule. Finally, it is essential to note that our primary limitation was the small number of geroprotector compounds available in the database (http://geroprotectors.org/) used to train our models, which led to non-optimal performance of the algorithms. Further experimental work is required to validate our results.
Conclusion
In conclusion, our computational approach identified many potential geroprotector compounds that warrant further investigation. Combining machine learning modelling, chemical space analysis, and toxicoinformatic screening provides a robust framework for future efforts to discover geroprotectors from natural products. This work represents a significant step forward in the computational identification of compounds that can help address age-related diseases. Nevertheless, it is essential to note that our results reflect an exploratory analysis aimed at testing a novel algorithm to identify potential geroprotector candidates.
Methods
Collection of geroprotector compounds
We obtained a dataset of 206 compounds with geroprotective activity (positive set) from the Geroprotectors database (http://geroprotectors.org/). Additionally, using the ChEMBL database, a second dataset (negative set) was constructed with 199 randomly selected compounds without reported geroprotective activity and with random chemical diversity. Interestingly, 31.4% of these compounds are FDA-approved drugs. Compounds without canonical SMILES, ions, and duplicate compounds were excluded. Both datasets were concatenated, and the classes were binary encoded (0/1). The positive and negative tables can be accessed on GitHub (https://github.com/BioAgeLab/Geroprotectors-Project-INGER/), along with the supplementary materials (Table 1S).
Molecular descriptors
Molecular descriptors (1-4D) were calculated for each dataset using DataWarrior v.5.5.0 software (https://openmolecules.org/datawarrior). Following Sander's methodology [30], we calculated physicochemical, pharmacokinetic, medicinal chemical, and toxicoinformatic properties. We performed a principal component analysis (PCA) to reduce dimensionality and to choose the molecular descriptors with the highest variance and relevance (Fig. 2e). The molecular descriptors contributing most to the principal components were selected: (1) molecular weight, (2) calculated octanol–water partition coefficient (cLogP), (3) hydrogen bond acceptors, (4) hydrogen bond donors, (5) total surface area, (6) relative polar surface area (relative PSA), and (7) rotatable bonds.
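As an illustration of how descriptors might be ranked by their contribution to the principal components, the following is a minimal sketch using Scikit-learn. The descriptor names come from the text, but the data are random placeholders, and the ranking rule (absolute loadings weighted by explained variance over the first two components) is an assumption, not the procedure reported by the authors.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Descriptor names taken from the text; the values used below are synthetic.
DESCRIPTORS = ["MW", "cLogP", "H-acceptors", "H-donors",
               "total surface area", "relative PSA", "rotatable bonds"]


def rank_descriptors(X, names, n_components=2):
    """Rank descriptors by variance-weighted absolute PCA loadings.

    This scoring rule is an illustrative assumption.
    """
    X_scaled = StandardScaler().fit_transform(X)
    pca = PCA(n_components=n_components).fit(X_scaled)
    # components_ has shape (n_components, n_features).
    weights = pca.explained_variance_ratio_[:, None]
    scores = (np.abs(pca.components_) * weights).sum(axis=0)
    order = np.argsort(scores)[::-1]
    return [(names[i], float(scores[i])) for i in order]


rng = np.random.default_rng(42)
X_demo = rng.normal(size=(405, len(DESCRIPTORS)))  # placeholder for 405 compounds
ranking = rank_descriptors(X_demo, DESCRIPTORS)
for name, score in ranking:
    print(f"{name:20s} {score:.3f}")
```

With real descriptor tables, `X_demo` would be replaced by the DataWarrior output, and the number of components would be chosen from the explained-variance profile.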
Supervised machine learning (SVM, KNN and DT)
To generate the models, we utilised the Scikit-learn package (https://scikit-learn.org/stable/index.html) in Python 3.0, which implements various statistical and machine learning algorithms. We developed an ensemble classifier approach based on a diverse set of three randomly selected machine learning models that stand out for their simplicity and the easy interpretability of their results: DT (depth of 19) provides clear decision rules for identifying specific molecular features; KNN (K = 27) enables classification based on structural similarity; and SVC (linear kernel and C/gamma = 1.0) offers robust decision boundaries for separation. We randomly divided the data into 80% for model training and 20% for classifier validation as a test set (one-time split). To ensure the reproducibility of the results, a random seed (random_state = 42) was set during data partitioning for both the initial split and all subsequent procedures. We used scaled data for training the models. Due to the small size of the dataset (405 compounds), we used the default parameters of Scikit-learn for all other settings. The area under the Receiver Operating Characteristic curve (ROC AUC) was used to evaluate the predictive performance of our models; this metric summarises the overall performance of binary classification models with values between 0 and 1, where 1 indicates perfect prediction.
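The setup described above can be sketched as follows. This is a minimal illustration, not the authors' script (which is available in their GitHub repository): the data are a synthetic stand-in generated with `make_classification`, the choice of `StandardScaler` for scaling is an assumption, and the helper `positive_scores` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for 405 compounds x 7 descriptors (not the real data).
X, y = make_classification(n_samples=405, n_features=7, n_informative=4,
                           n_redundant=1, random_state=42)

# One-time 80/20 split with the seed stated in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42)

# Hyperparameters as stated in the text; everything else is left at defaults.
# Note that gamma has no effect with a linear kernel.
models = {
    "DT": make_pipeline(StandardScaler(),
                        DecisionTreeClassifier(max_depth=19, random_state=42)),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=27)),
    "SVM": make_pipeline(StandardScaler(),
                         SVC(kernel="linear", C=1.0, gamma=1.0)),
}


def positive_scores(model, X):
    """Continuous scores for the positive class (hypothetical helper)."""
    if hasattr(model, "decision_function"):
        return model.decision_function(X)
    return model.predict_proba(X)[:, 1]


auc = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    auc[name] = roc_auc_score(y_test, positive_scores(model, X_test))
    print(f"{name}: ROC AUC = {auc[name]:.2f}")
```

Wrapping the scaler and the classifier in one pipeline keeps the scaling parameters fitted on the training data only, which avoids leaking test-set information.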
Additionally, for each model, accuracy was calculated to assess its overall effectiveness, and specificity was calculated to determine the proportion of true negatives that were correctly classified. Recall measures model sensitivity, that is, the proportion of true positives detected. The F1 score combines precision and recall into a single value (their harmonic mean), providing a balanced assessment of a model's ability to classify instances correctly.
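For reference, the metrics named above can all be derived from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical values for an 81-compound test set and are not the authors' confusion matrix.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    def ratio(num, den):
        # Return 0.0 when a denominator is empty instead of raising.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)          # sensitivity, true positive rate
    specificity = ratio(tn, tn + fp)     # true negative rate
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f1 = ratio(2 * precision * recall, precision + recall)
    mcc = ratio(tp * tn - fp * fn,
                math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {"accuracy": accuracy, "specificity": specificity,
            "recall": recall, "precision": precision, "f1": f1, "mcc": mcc}


# Hypothetical counts (30 + 15 + 25 + 11 = 81 compounds).
metrics = classification_metrics(tp=30, fp=15, tn=25, fn=11)
for name, value in metrics.items():
    print(f"{name:12s} {value:.2f}")
```

Unlike the other metrics, MCC uses all four cells symmetrically and can be negative, which makes it a useful check when accuracy and recall look acceptable but specificity is low.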
Cross-validation
To provide a comprehensive evaluation of model performance, k-fold cross-validation (k = 5) was applied exclusively to the training set, utilising randomisation to reduce ordering biases. The following performance metrics were evaluated: accuracy, precision, recall (also known as sensitivity), F1-score, specificity, and Matthews Correlation Coefficient (MCC). For each metric, the mean and standard deviation were calculated across the five folds, providing a robust estimate of model performance and its variability. The independent test set was retained for the final evaluation, avoiding model selection bias and ensuring unbiased performance estimates.
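A minimal sketch of such a fivefold evaluation with Scikit-learn's `cross_validate` is shown below. The data are synthetic, only the SVM pipeline is evaluated for brevity, and the use of `KFold` with shuffling and of `make_scorer` for specificity and MCC are assumptions about how the procedure could be implemented.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import make_scorer, matthews_corrcoef, recall_score
from sklearn.model_selection import KFold, cross_validate, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 405-compound dataset (not the real data).
X, y = make_classification(n_samples=405, n_features=7, n_informative=4,
                           n_redundant=1, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42)

scoring = {
    "accuracy": "accuracy",
    "precision": "precision",
    "recall": "recall",
    "f1": "f1",
    # Specificity is the recall of the negative class (label 0).
    "specificity": make_scorer(recall_score, pos_label=0),
    "mcc": make_scorer(matthews_corrcoef),
}

model = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
cv = KFold(n_splits=5, shuffle=True, random_state=42)

# Cross-validation runs on the training portion only; the test set is untouched.
results = cross_validate(model, X_train, y_train, cv=cv, scoring=scoring)

summary = {name: (results[f"test_{name}"].mean(), results[f"test_{name}"].std())
           for name in scoring}
for name, (mean, std) in summary.items():
    print(f"{name:12s} {mean:.2f} (± {std:.2f})")
```

Because the scaler sits inside the pipeline, it is refitted within each fold, so the held-out fold never influences the scaling.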
Once each classifier was trained, we used it to predict geroprotective compounds in the COCONUT database (https://coconut.naturalproducts.net), which contains over 600,000 natural product compounds from various sources [30, 31]. We only considered compounds that were positively predicted by all three classifiers as potential geroprotector candidates, to minimise false positives and increase confidence given the moderate accuracy of the individual models.
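The consensus rule amounts to intersecting the sets of positively predicted compounds. The following sketch illustrates it on a toy example; the compound identifiers and labels are hypothetical, and the function name is illustrative.

```python
def consensus_positive(predictions):
    """Return the compound IDs labelled 1 by every model.

    `predictions` maps model name -> {compound_id: 0 or 1}.
    """
    per_model_hits = [
        {cid for cid, label in labels.items() if label == 1}
        for labels in predictions.values()
    ]
    return set.intersection(*per_model_hits) if per_model_hits else set()


# Hypothetical predictions for four compounds.
predictions = {
    "DT":  {"cmpd_1": 1, "cmpd_2": 1, "cmpd_3": 0, "cmpd_4": 1},
    "SVM": {"cmpd_1": 1, "cmpd_2": 0, "cmpd_3": 0, "cmpd_4": 1},
    "KNN": {"cmpd_1": 1, "cmpd_2": 1, "cmpd_3": 1, "cmpd_4": 1},
}

hits = consensus_positive(predictions)
print(sorted(hits))  # ['cmpd_1', 'cmpd_4']
```

Requiring unanimous agreement trades sensitivity for fewer false positives: a compound missed by any single model is dropped, even if the other two agree.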
Post-prediction filtering and chemical space
Compounds positively predicted by all three classifiers in the COCONUT database were integrated and filtered based on lead-likeness criteria (MW < 300, HBA ≤ 3, cLogP ≤ 3) [32, 33], which reduced the 51,564 consensus compounds to 2,089. Subsequently, compounds with undesirable toxicoinformatic properties were excluded, resulting in 1,488 compounds. The chemical descriptors of these compounds were then compared with those of the compounds in the Geroprotectors database. Additionally, we compared them with FDA-approved compounds to visualise their distribution in chemical space, using the methodology proposed by Saldívar-González and Medina-Franco [34].
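The lead-likeness criteria translate directly into a simple predicate. The sketch below applies the thresholds stated in the text (MW < 300, HBA ≤ 3, cLogP ≤ 3) to a few hypothetical compounds; the `Compound` class and the descriptor values are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str
    mw: float      # molecular weight (Da)
    hba: int       # hydrogen bond acceptors
    clogp: float   # calculated octanol-water partition coefficient


def is_leadlike(c):
    """Lead-likeness thresholds as stated in the text."""
    return c.mw < 300 and c.hba <= 3 and c.clogp <= 3


# Hypothetical descriptor values chosen to exercise each threshold.
candidates = [
    Compound("cmpd_A", mw=250.3, hba=3, clogp=2.1),   # passes all criteria
    Compound("cmpd_B", mw=310.0, hba=2, clogp=1.0),   # too heavy
    Compound("cmpd_C", mw=299.9, hba=4, clogp=0.5),   # too many acceptors
    Compound("cmpd_D", mw=180.2, hba=2, clogp=3.0),   # cLogP at the limit passes
    Compound("cmpd_E", mw=300.0, hba=1, clogp=1.0),   # MW limit is strict
]

leadlike = [c.name for c in candidates if is_leadlike(c)]
print(leadlike)  # ['cmpd_A', 'cmpd_D']
```

Note the asymmetry in the thresholds: the molecular weight bound is strict, while the acceptor and cLogP bounds are inclusive.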
Web database tool
The dataset of 1,488 potential geroprotector candidates obtained from the consensus of the three algorithms and the post-prediction filters is organised in a web tool built with the Shiny package for the R language, which is freely available. All code and data used for model training and screening are available at https://github.com/BioAgeLab/Geroprotectors-Project-INGER/.
Acknowledgements
This study was part of a registered project at the Instituto Nacional de Geriatría DI-PI-008-2021 (JCGV). We thank the Consejo Nacional de Humanidades, Ciencias y Tecnologías (CONAHCYT), which has become Secretaría de Ciencia, Humanidades, Tecnología e Innovación (SECIHTI), for funding and the support of this research through the project 319706 ¿Se puede revertir el envejecimiento con fármacos?
Author contributions
J.C.G.-V.: Conceptualization, Investigation, Writing-Original Draft and Editing. J.A.S.-C.: Supervision and Project administration, Investigation, Data Curation, Web application development, Writing-Original Draft and Editing and Visualization. N.A.R.-S: Validation; Formal analysis, Writing-Review and Editing and Visualization. All authors have read and agreed to the published version of the manuscript.
Funding
This study was supported by the Consejo Nacional de Humanidades, Ciencias y Tecnologías, México; Project 319706. The publication of this paper was conducted by the Instituto Nacional de Geriatría, México, and financially supported by the Consejo Nacional de Humanidades, Ciencias y Tecnologías, México; Project 319706 “¿Se puede revertir el envejecimiento con fármacos?”.
Data availability
The database is freely available at: https://gcoixc-laboratorio0de0bioinform0tic-inger.shinyapps.io/Geroprotectors_ShinyApp/
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1. Fuentealba M, Dönertaş HM, Williams R et al (2019) Using the drug-protein interactome to identify anti-ageing compounds for humans. PLoS Comput Biol 15:e1006639
- 2. World Population Prospects - Population Division - United Nations. https://population.un.org/wpp/Graphs/Probabilistic/PopPerc/60plus/900. Accessed 26 Jun 2024
- 3. Zhavoronkov A (2020) Geroprotective and senoremediative strategies to reduce the comorbidity, infection rates, severity, and lethality in gerophilic and gerolavic infections. Aging 12:6492–6510
- 4. Niccoli T, Partridge L (2012) Ageing as a risk factor for disease. Curr Biol 22:R741–R752
- 5. López-Otín C, Blasco MA, Partridge L et al (2013) The hallmarks of aging. Cell 153:1194–1217
- 6. López-Otín C, Blasco MA, Partridge L et al (2023) Hallmarks of aging: an expanding universe. Cell 186:243–278
- 7. Moskalev A, Chernyagina E, Kudryavtseva A, Shaposhnikov M (2017) Geroprotectors: a unified concept and screening approaches. Aging Dis 8:354–363
- 8. Rallis C, Codlin S, Bähler J (2013) TORC1 signaling inhibition by rapamycin and caffeine affect lifespan, global gene expression, and cell proliferation of fission yeast. Aging Cell 12:563–573
- 9. Johnson SC (2018) Nutrient sensing, signaling and ageing: the role of IGF-1 and mTOR in ageing and age-related disease. Subcell Biochem 90:49–97
- 10. Kirkland JL, Tchkonia T (2020) Senolytic drugs: from discovery to translation. J Intern Med 288:518
- 11. Soukas AA, Hao H, Wu L (2019) Metformin as anti-aging therapy: is it for everyone? Trends Endocrinol Metab 30:745–755
- 12. Kulkarni AS, Gubbi S, Barzilai N (2020) Benefits of metformin in attenuating the hallmarks of aging. Cell Metab 32:15–30
- 13. Statzer C, Jongsma E, Liu SX et al (2021) Youthful and age-related matreotypes predict drugs promoting longevity. Aging Cell 20:e13441
- 14. Moskalev A, Chernyagina E, de Magalhães JP et al (2015) Geroprotectors.org: a new, structured and curated database of current therapeutic interventions in aging and age-related disease. Aging 7:616–628
- 15. Aliper A, Belikov AV, Garazha A et al (2016) In search for geroprotectors: in silico screening and in vitro validation of signalome-level mimetics of young healthy state. Aging (Albany NY) 8:2127–2152
- 16. Smer-Barreto V, Quintanilla A, Elliott RJR et al (2023) Discovery of senolytics using machine learning. Nat Commun 14:3445
- 17. Batta VMB (2024) Machine learning. International Journal of Advanced Research in Science, Communication and Technology 583–591
- 18. Majumder A (2023) Classification models in machine learning techniques. Pennsylvania, IGI Global
- 19. Barardo D, Thornton D, Thoppil H et al (2017) The DrugAge database of aging-related drugs. Aging Cell 16:594–597
- 20. Barardo DG, Newby D, Thornton D et al (2017) Machine learning for predicting lifespan-extending chemical compounds. Aging 9:1721–1737
- 21. Arora S, Mittal A, Duari S et al (2025) Discovering geroprotectors through the explainable artificial intelligence-based platform AgeXtend. Nat Aging 5:144–161
- 22. Elliehausen CJ, Anderson RM, Diffee GM et al (2023) Geroprotector drugs and exercise: friends or foes on healthy longevity? BMC Biol 21:287
- 23. Liu BHM, Lin Y, Long X et al (2025) Utilizing AI for the Identification and Validation of Novel Therapeutic Targets and Repurposed Drugs for Endometriosis. Adv Sci (Weinh) 12:e2406565
- 24. Dai Q-Y, Zhang C-P, Wu H (2016) Research of decision tree classification algorithm in data mining. Int J Database Theory Appl 9:1–8
- 25. Wang Q (2022) Support vector machine algorithm in machine learning. IEEE, New Jersey
- 26. Srisuradetchai P, Suksrikran K (2024) Random kernel k-nearest neighbors regression. Front Big Data 7:1402384
- 27. Zeng WY, Tan L, Han C et al (2021) Trigonelline extends the lifespan of C. elegans and delays the progression of age-related diseases by activating AMPK, DAF-16, and HSF-1. Oxidat Med Cell Longevity. 10.1155/2021/7656834
- 28. Wan QL, Fu X, Meng X et al (2020) Hypotaurine promotes longevity and stress tolerance via the stress response factors DAF-16/FOXO and SKN-1/NRF2 in Caenorhabditis elegans. Food Funct. 10.1039/c9fo02000d
- 29. Schmeisser K, Mansfeld J, Kuhlow D et al (2013) Role of sirtuins in lifespan regulation is linked to methylation of nicotinamide. Nature Chem Biol. 10.1038/nchembio.1352
- 30. Sander T, Freyss J, von Korff M, Rufener C (2015) DataWarrior: an open-source program for chemistry aware data visualization and analysis. J Chem Inf Model 55:460–473
- 31. Sorokina M, Merseburger P, Rajan K et al (2021) COCONUT online: collection of open natural products database. J Cheminform 13:2
- 32. Kerns EH, Zhang X, Di L (2021) ADMET in vitro profiling – utility and applications in lead discovery. Burger’s Med Chem Drug Disc 15:1–31
- 33. Mendes Lampert L, Ruszczyk Machado B, Rocha Joaquim A, Fumagalli F (2024) Rings in “lead-like drugs.” Lett Drug Des Discov 21:3851–3857
- 34. Saldívar-González FI, Medina-Franco JL (2022) Approaches for enhancing the analysis of chemical space for drug discovery. Expert Opin Drug Discov 17:789–798





