Table 3.
Comparative overview of tools for predicting sites of metabolism (SOMs), including core features, enzyme coverage, and methodological descriptions.
| Method | Core components | Coverage | Description | Refs. |
|---|---|---|---|---|
| MetaSite | Molecular interaction fields and reactivity estimator | Variety of CYPs | Predicts SOMs and supports the understanding of compound metabolism by integrating protein structure-derived MIFs with molecular orbital calculations. | [89] |
| CypScore | Surface electrostatics and semi-empirical method | Individual CYP reactions | A set of six MLR models has been compiled to encompass the primary reaction types catalyzed by CYPs. | [90] |
| MetaPrint2D | Atom mapping and statistical model | Phases I + II | Mines extensive biotransformation databases to determine the likelihood of metabolic transformation for atoms in specific atomic environments. | [91] |
| RD-Metabolizer | 2D fingerprint similarity calculation with model reaction SMARTS patterns | Phases I + II | RS-Predictor focuses on CYP3A4, CYP2C9, and CYP2D6, combining topological atomic fingerprinting of SMARTS patterns and quantum chemical descriptors with support vector machines to rank oxidative metabolic sites. | [92] |
| SMARTCyp | DFT-derived reaction energies | CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C19, CYP2E1, and CYP3A4 | Uses a combined framework to assess the likelihood of chemical reactions, incorporating precomputed activation energies and a topological accessibility descriptor. | [93] |
| RS-WebPredictor | MIRank (SVM) | CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, and CYP3A4 | Uses a range of pre-trained SVM models, incorporating topological descriptors and SMARTCyp reactivities, to predict SOMs. | [94] |
| IDSite | Glide docking and physics-based score | CYP2D6, CYP1A2, CYP2C9, and CYP3A4 | Samples the conformational space and evaluates the potential of atoms to react with the catalytic iron center. Also shows 3D structures of the protein-ligand complex. | [95] |
| CypReact | LMB algorithm | CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, and CYP3A4 | Predicts the likelihood that the query molecule reacts with one specified enzyme out of nine CYP isoforms. | [96] |
| XenoSite | Neural networks | CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, and CYP3A4 | Developed as an improvement on RS-Predictor, XenoSite uses neural networks with new descriptors to predict CYP-mediated SOMs for nine CYPs, based on a database of 680 substrates and metabolites. It integrates various descriptors and outperforms RS-Predictor in training speed and accuracy. | [97] |
| SOMP | Bayesian | CYP1A2, CYP2C9, CYP2C19, CYP2D6, CYP3A4, and UGT | Uses LMNA descriptors to capture the structures of more than 1000 metabolized xenobiotics. The PASS algorithm is used to analyze structure-SOM relationships for enzymes such as CYP1A2 and UGT. The Bayesian classifier uses both positive and negative data from the literature and databases. | [98] |
| FAME3 | RF | Phases I + II | FAME uses 79,238 reaction records and random forest algorithms to predict phase I and II metabolic reactions. It originally used seven atomic descriptors; FAME3 expands these to 15 descriptors and introduces FAMEscore for accuracy evaluation. FAME3 employs an ensemble of extra trees classifiers to predict metabolic sites in various small molecules, including drugs and natural products. | [99] |
| Xu et al. [100] | Decision tree model | AOX | Identifies multiple SOMs within a single compound and explains metabolic regioselectivity. | [100] |
CYP: cytochrome P450 family enzymes; MIFs: molecular interaction fields; MLR: multiple linear regression; RD: reaction database; 2D: two-dimensional; SMARTS: simplified molecular input line entry system (SMILES) arbitrary target specification; RS: regioselectivity; RS-Predictor: RegioSelectivity-Predictor; DFT: density functional theory; SVM: support vector machine; LMB: learning-based model; ML: machine learning; SOMP: site of metabolism predictor; UGT: uridine diphosphate glucuronosyltransferase; LMNA: labeled multilevel neighborhoods of atoms; PASS: prediction of activity spectra for substances; FAME3: FAst MEtabolizer 3; RF: random forest; AOX: aldehyde oxidase.
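
Several reactivity-based tools in Table 3 combine a precomputed activation energy with an accessibility descriptor to rank candidate atoms. The Python sketch below illustrates that general idea only. It is not the scoring function of any listed tool: the atom labels, the energies, the accessibility values, and the weight `accessibility_weight` are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class AtomSite:
    label: str                # hypothetical atom label
    activation_energy: float  # hypothetical precomputed energy (kJ/mol)
    accessibility: float      # hypothetical topological accessibility, 0..1


def rank_sites(sites, accessibility_weight=8.0):
    """Rank candidate sites, most likely SOM first.

    Lower score means more likely: a low activation energy and a high
    accessibility both reduce the score. The weight is an assumed,
    illustrative value.
    """
    def score(site):
        return site.activation_energy - accessibility_weight * site.accessibility

    return sorted(sites, key=score)


# Made-up example molecule with three candidate sites.
sites = [
    AtomSite("C1 (aromatic)", 86.0, 0.5),
    AtomSite("C7 (N-methyl)", 40.0, 1.0),
    AtomSite("C4 (benzylic)", 62.0, 0.75),
]

ranked = rank_sites(sites)
for position, site in enumerate(ranked, start=1):
    print(position, site.label)
```

With these invented inputs, the accessible, low-energy N-methyl carbon ranks first and the aromatic carbon last. In practice, such rankings are typically judged by whether an experimentally observed SOM appears among the top few ranked atoms.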