TABLE 3.
Examples of relevant resources, cheminformatics software, and machine/deep learning tools utilized in the analyze phase of the DMTA cycle in agrochemical discovery. Abbreviations: Support vector regression (SVR), Liquid Chromatography – Mass Spectrometry (LC-MS), Graph Neural Network (GNN), Retention Time (RT), Deep Graph Learning (DGL), Natural Products (NP).
| Name | Description | References/Examples |
|---|---|---|
| Structural classification tools | ||
| LeadScope | SAR analysis and visualization tool, with a focus on toxicological data | Roberts et al. (2000) |
| DataWarrior | General purpose SAR tool | Sander et al. (2015) |
| Pipeline Pilot | Data pipeline tool; capabilities for various ad hoc analyses | Dassault Systèmes SE, (2023) |
| KNIME | Data pipeline tool; capabilities for various ad hoc analyses | Berthold et al. (2008) |
| OpenEye Toolkit | Molecular toolkit; low-level API tools for custom structure analyses | OpenEye Scientific (2023a) |
| RDKit | Molecular toolkit; low-level API tools for custom structure analyses | RDKit, (2023) |
| Structure-Activity-Relationship Visualizations | ||
| DataWarrior | General purpose SAR tool | Sander et al. (2015) |
| StarDrop™ | Includes multi-parameter optimization and SAR tools | Optibrium, (2023) |
| TIBCO Spotfire® | Lead Discovery collection adds extensive cheminformatics capabilities, including predictive analytics | TIBCO, (2023) |
| Cheminformatics and AI-enabled Metabolomics | ||
| Peakonly | DL-based model for LC-MS peak detection and integration | Melnikov et al. (2020) |
| ChromAlignNet | DL-based tool for peak-alignment of GC-MS data | Li and Wang (2019) |
| CFM-ID | Hybrid (AI-, rule-based) tool for LC-MS spectra prediction, peak annotation, and metabolite identification | Wang et al. (2021a), Djoumbou-Feunang et al. (2019b) |
| 3D-MolMS | Tandem MS spectra prediction | Hong et al. (2023) |
| MassFormer | Tandem MS spectra prediction | Young et al. (2023) |
| SIRIUS | Computational platform for tandem MS data-based analysis of metabolites; provides molecule search and class prediction capabilities | Dührkop et al. (2019) |
| MESSAR | Automated tool for metabolite substructure recommendation from tandem mass spectra | Liu et al. (2020) |
| ClassyFire | Structural classification of small and large molecules | Djoumbou-Feunang et al. (2016) |
| NP-Classifier | DNN-based structural classification of natural products | Kim et al. (2020) |
| BioTransformer | Hybrid, comprehensive tool for metabolite prediction and identification in humans, gut microbiota, and environmental microbiota | Djoumbou-Feunang et al. (2019a) |
| ADMET Predictor | Machine learning-based prediction of human metabolites | SimulationsPlus, (2023) |
| QSAR Toolbox | AI-based prediction of chemical products from abiotic transformations and metabolism (microbial, rat liver S9, skin) | QSAR Toolbox, (2023) |
| OASIS Times | AI-based prediction of chemical products from abiotic transformations as well as in vitro (gut, lung, rat liver S9) and in vivo (rat) metabolites | OASIS, (2021) |
| GLORYx | Machine learning-based prediction of human metabolites | de Bruyn Kops et al. (2021) |
| MetaTrans | Deep-learning-based, rule-free tool for prediction of small molecule metabolites in humans | Litsa et al. (2020) |
| Retip | ML-based retention time prediction | Bonini et al. (2020) |
| GNN-RT | GNN-based liquid chromatography retention time prediction | Yang Q. et al. (2021) |
| DeepCCS | Deep learning tool for the prediction of collision cross-section values | Plante et al. (2019) |
| Spectral Databases | Spectral databases commonly used for metabolite identification | NIST, (2023); Guijas et al. (2018); Wang et al. (2021b); Mehta, (2020); Wishart et al. (2018); Wang et al. (2016) |
| Programming libraries and cheminformatics tools for predictive modeling | ||
| Scikit-learn | General Python-based programming library for machine learning | Pedregosa et al. (2011) |
| PyTorch | General Python-based programming library for deep learning, including explainable DL | PyTorch, (2023) |
| TensorFlow | General Python-based programming library for deep learning, including explainable DL | Abadi et al. (2015) |
| DeepChem | Python-based programming library for deep chemistry | Ramsundar et al. (2019) |
| Chemprop | Python programming package implementing Message Passing Neural Networks (MPNN) for the prediction of molecular properties as well as chemical reactions; provides uncertainty quantification capabilities | Yang K. et al. (2019) |
| DGL-LifeSci | Python programming library for graph neural network-based learning for chemistry and biology | Li Y. et al. (2021) |
| MolPMoFiT | Transfer learning approach (and model) for molecular property (QSAR/QSPR) prediction | Li and Fourches (2020) |
| Chemformer | A Python library for molecular optimization, property prediction, reaction and retrosynthetic prediction | Irwin et al. (2022) |
| DESlib | A Python library for dynamic classifier and ensemble selection | Cruz et al. (2020) |
| SHAP | A Python programming library for SHapley Additive exPlanations | Lundberg and Lee, (2017); Rodríguez-Pérez and Bajorath, (2021) |
| Alibi Explain | Implements several algorithms for inspecting and explaining machine learning models | Klaise et al. (2021) |
| GNN-Explainer | A Python library for the explanation of GNN-based predictions | Ying et al. (2019) |
| CIME | A library for web-based exploratory analysis of chemical model explanations | Humer et al. (2022) |
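
To illustrate the idea behind the model-explanation entries in the last block of the table (e.g., SHAP), the following minimal sketch computes exact Shapley values for a toy model in pure Python. It does not use the SHAP library or its API; the function `shapley_values`, the model `toy_model`, and the three descriptor names and values are hypothetical and chosen only for illustration. Exact enumeration over all feature orderings is feasible only for a handful of features, which is why practical tools rely on approximations.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x`, relative to `baseline`.

    Each feature's value is its marginal contribution to the prediction,
    averaged over every order in which features can be switched from
    their baseline value to their actual value.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)      # start with all features at baseline
        previous = model(current)
        for i in order:
            current[i] = x[i]         # switch feature i to its actual value
            updated = model(current)
            phi[i] += updated - previous
            previous = updated
    return [total / len(orderings) for total in phi]


def toy_model(descriptors):
    """Hypothetical activity model over three made-up descriptors."""
    logp, hbd, mw = descriptors
    # Two additive terms plus one logp-mw interaction term.
    return 2.0 * logp - 1.0 * hbd + 0.5 * logp * mw


x = [3.0, 1.0, 2.0]           # hypothetical compound (logp, hbd, mw)
baseline = [0.0, 0.0, 0.0]    # hypothetical reference point
phi = shapley_values(toy_model, x, baseline)
print(phi)  # [7.5, -1.0, 1.5]
```

The attributions sum to the difference between the prediction for the compound and the prediction for the baseline (8.0 here), and the interaction term is split equally between the two descriptors involved.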