2025 Aug 18;15(8):101437. doi: 10.1016/j.jpha.2025.101437

Table 3.

Comparative overview of tools for predicting site of metabolism (SOM), including core features, enzyme coverage, and methodological descriptions.

| Method | Core components | Coverage | Description | Refs. |
|---|---|---|---|---|
| MetaSite | Molecular interaction fields and reactivity estimator | Variety of CYPs | Predicts SOMs and aids understanding of compound metabolism by integrating protein structure-derived MIFs with molecular orbital calculations. | [89] |
| CypScore | Surface electrostatics and semi-empirical method | Individual CYP reactions | A set of six MLR models covers the primary reaction types catalyzed by CYPs. | [90] |
| MetaPrint2D | Atom mapping and statistical model | Phases I + II | Mines extensive biotransformation databases to estimate the likelihood of metabolic transformation for atoms in specific atomic environments. | [91] |
| RD-Metabolizer | 2D fingerprint similarity calculation with model reaction SMARTS patterns | Phases I + II | RS-Predictor focuses on CYP3A4, CYP2C9, and CYP2D6, combining topological atomic fingerprints based on SMARTS patterns and quantum chemical descriptors with support vector machines to rank oxidative metabolic sites. | [92] |
| SMARTCyp | DFT-derived reaction energies | CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C19, CYP2E1, and CYP3A4 | Uses a combined framework to assess the likelihood of chemical reactions, incorporating precomputed activation energies and a topological accessibility descriptor. | [93] |
| RS-WebPredictor | MIRank (SVM) | CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, and CYP3A4 | A range of pre-trained SVM models incorporating topological descriptors and SMARTCyp reactivities is used to predict SOMs. | [94] |
| IDSite | Glide docking and physics-based score | CYP2D6, CYP1A2, CYP2C9, and CYP3A4 | Samples the conformational space and evaluates the potential of atoms to react with the catalytic iron center. Also shows 3D structures of the protein-ligand complex. | [95] |
| CypReact | LMB algorithm | CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, and CYP3A4 | Predicts the likelihood of a reaction between the query molecule and one specified enzyme among nine CYP isoforms. | [96] |
| XenoSite | Neural networks | CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, and CYP3A4 | Improves on the earlier RS-Predictor for CYP-mediated SOM prediction by using new descriptors and neural network-based ML. It predicts SOMs for nine CYPs based on a database of 680 substrates and metabolites, integrates various descriptors, and outperforms RS-Predictor in training speed and accuracy. | [97] |
| SOMP | Bayesian | CYP1A2, CYP2C9, CYP2C19, CYP2D6, CYP3A4, and UGT | Using LMNA descriptors, the method captures the structures of more than 1000 metabolized xenobiotics. The PASS algorithm is used to analyze structure-SOM relationships for enzymes such as CYP1A2 and UGT. The Bayesian classifier uses both positive and negative data from the literature and databases. | [98] |
| FAME3 | RF | Phases I + II | FAME uses a dataset of 79,238 reactions and random forest algorithms to predict phase I and II metabolic reactions. It originally used seven atomic descriptors; FAME3 expands this to 15 descriptors and introduces FAMEscore for accuracy evaluation. FAME3 employs an ensemble of extra trees classifiers to predict metabolic sites in various small molecules, including drugs and natural products. | [99] |
| Xu et al. [100] | Decision tree model | AOX | Identifies multiple SOMs within a single compound and explains metabolic regioselectivity. | [100] |

CYP: cytochrome P450 family enzymes; MIFs: molecular interaction fields; MLR: multiple linear regression; RD: reaction database; 2D: two-dimensional; SMARTS: simplified molecular input line entry system (SMILES) arbitrary target specification; RS-Predictor: RegioSelectivity-Predictor; DFT: density functional theory; RS: regioselectivity; SVM: support vector machine; LMB: learning-based model; ML: machine learning; SOMP: site of metabolism predictor; UGT: uridine diphosphate glucuronosyltransferase; LMNA: labeled multilevel neighborhoods of atoms; PASS: prediction of activity spectra for substances; FAME3: FAst Metabolizer 3; RF: random forest; AOX: aldehyde oxidase.
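
To make the reactivity-plus-accessibility idea in the SMARTCyp row more concrete, the following minimal Python sketch ranks candidate atoms by combining a precomputed activation energy with a topological accessibility term. It illustrates the general scheme only and is not the tool's actual implementation: the `AtomSite` class, the `ACCESSIBILITY_WEIGHT` value, the atom labels, and all numeric values are hypothetical assumptions chosen for demonstration.

```python
from dataclasses import dataclass


@dataclass
class AtomSite:
    """One candidate site of metabolism (all values here are hypothetical)."""
    label: str
    activation_energy: float  # precomputed energy for the matched fragment (assumed kJ/mol)
    accessibility: float      # topological accessibility in [0, 1]; 1 = most exposed


# Assumed weight for the accessibility term; an illustrative value, not a published parameter.
ACCESSIBILITY_WEIGHT = 8.0


def site_score(site: AtomSite, weight: float = ACCESSIBILITY_WEIGHT) -> float:
    """Lower score = more likely SOM: low activation energy and high accessibility."""
    return site.activation_energy - weight * site.accessibility


def rank_sites(sites: list[AtomSite], weight: float = ACCESSIBILITY_WEIGHT) -> list[AtomSite]:
    """Return the sites sorted from most to least likely SOM."""
    return sorted(sites, key=lambda s: site_score(s, weight))


# Toy molecule with three candidate atoms (made-up energies and accessibilities).
sites = [
    AtomSite("C1 (benzylic carbon)", activation_energy=60.0, accessibility=1.0),
    AtomSite("C5 (aromatic carbon)", activation_energy=85.0, accessibility=0.5),
    AtomSite("N3 (tertiary amine)", activation_energy=50.0, accessibility=0.25),
]

ranked = rank_sites(sites)
for rank, site in enumerate(ranked, start=1):
    print(f"{rank}. {site.label}: score = {site_score(site):.1f}")
```

In this toy input the amine nitrogen ranks first because its low assumed activation energy outweighs its poor accessibility. Changing the `weight` argument shows how strongly the ranking depends on the balance between the two terms.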