2025 Oct 20;22(11):7022–7035. doi: 10.1021/acs.molpharmaceut.5c01065

Machine Learning Modeling for ABC Transporter Efflux and Inhibition: Data Curation, Model Development, and New Compound Interaction Predictions

Nada J Daood †, Sean R Carey †, Elena Chung †, Tong Wang †, Anna Kreutz §, Mounika Girireddy, Suman Chakravarti, Nicole C Kleinstreuer, Jacqueline B Tiley #, Lauren M Aleksunes, Hao Zhu †,‡,*
PMCID: PMC12587445  PMID: 41115055

Abstract

In recent years, multiple computational studies have used machine learning models to predict substrate binding and inhibition of ATP-binding cassette (ABC) transporters. However, many of these studies relied on relatively small training sets with limited applicability. In this study, we manually curated over 24,000 bioactivity records (i.e., inhibition, binding affinity, permeability) for the ABC transporters P-gp, BCRP, MRP1, and MRP2 from more than 900 literature sources in ChEMBL, with additional data from PubChem and Metrabase. This effort yielded eight data sets, comprising around 8800 unique chemicals with one or more substrate binding or inhibition activities for these four efflux transporters. Quantitative structure–activity relationship (QSAR) models were developed for each of the eight data sets using combinations of four machine learning algorithms and three sets of chemical descriptors. The resulting models demonstrated excellent performance by 5-fold cross-validation, achieving an average correct classification rate (CCR) of 0.764 for the substrate binding models and 0.839 for the inhibition models. Models were validated with additional compounds from DrugBank that were known substrates or inhibitors. We further analyzed how model predictions for efflux transporter activity could estimate exposure of the brain to xenobiotics. Notably, compounds predicted as P-gp and BCRP substrates were twice or more likely to have low brain exposure compared to compounds with high brain exposure. This study provides a large and curated drug transporter binding and inhibition database for computational modeling. Applicable models based on this large database for predicting transporter substrate binding and inhibition can be used to evaluate more complex drug bioactivities, such as exposure of protected tissues to chemicals.

Keywords: ABC transporters, MDR1, P-gp, BCRP, machine learning, QSAR, brain exposure



Introduction

ATP-binding cassette (ABC) transporters are a family of membrane transporters that play a significant role in the active efflux of endogenous substances and xenobiotic compounds, such as drugs and pollutants, from the cell. For pharmaceuticals, this efflux mechanism can affect their pharmacokinetic (PK) profile and propensity to cause drug–drug interactions (DDIs). Essential ABC drug transporters include multidrug resistance 1 P-glycoprotein (P-gp; also known as MDR1), breast cancer resistance protein (BCRP), and the multidrug resistance-associated protein (MRP) subfamily, such as MRP1 and MRP2. These transporters are localized to tissues that play essential roles in the absorption, distribution, metabolism, and excretion (ADME) of drugs (Table 1). They also contain ligand-binding sites that can accommodate diverse molecular structures, rendering numerous drugs susceptible to efflux. This is exemplified by the multidrug resistance to anticancer drugs that ABC transporters confer on tumor cells. Conversely, some drugs can inhibit transporter functions, resulting in DDIs that cause the accumulation of normally effluxed chemicals, potentially leading to toxicity. For example, coadministration of P-gp inhibitors with cardiovascular drugs having a narrow therapeutic index can significantly increase the risk of toxicities. On the other hand, inhibiting P-gp can potentially improve the efficacy of anticancer drugs by increasing their concentration in target tissues. Thus, understanding the interplay between drugs and ABC transporters is crucial for drug development, as these interactions can significantly impact the efficacy and safety of medications.

Table 1. Enrichment of Major ABC Transporters across Human Tissues

  Brain Intestine Liver Placenta Kidney
P-gp
BCRP  
MRP1      
MRP2  

In recent years, public databases have been established to aggregate a wider array of experimental results for diverse sets of chemicals. For example, in 2025, ChEMBL contained bioactivity records for 2.4 million compounds tested across 1.6 million assays, with manual curation of experimental results from the literature. PubChem is the largest publicly available database for chemical information, providing bioassay data and results for over 100 million chemicals. Numerous chemical-transporter interaction records (i.e., inhibition, binding affinity, permeability) can be found in these two large databases. In addition, DrugBank offers extensive information for over 16,000 drugs, including ADME profiles and transporter interactions. Moreover, transporter-specific databases such as Metrabase, TP-Search, the UCSF-FDA TransPortal, the Transporter Classification Database (TCDB), and VARIDT also provide structural, functional, genomic, and/or expression information for membrane transporters. However, databases specifically recording chemical-transporter interactions, such as Metrabase and TP-Search, have not been updated in years. TransPortal, which includes chemical-transporter interactions, tissue expression data, and clinical DDIs, was updated in 2023 as TransPortal-TICBase. Even so, the number of annotated records for transporter interactions in these sources remains limited, with most data obtained from studies published before 2010. Additionally, TransPortal-TICBase focuses on collecting substrate and inhibitor data and excludes potential nonsubstrates and noninhibitors, which limits its applicability in developing machine learning (ML) models. This data gap highlights the urgent need for a large and comprehensive transporter database that includes newly available public transporter data and can be used for ML modeling.

ML models for drug properties/activities/toxicities, such as quantitative structure–activity relationship (QSAR) models, have emerged as valuable tools in drug discovery, offering a time- and cost-effective alternative to animal testing. As testing for interactions with ABC transporters is crucial, previous studies have focused on developing computational models to identify critical pharmacophores that are responsible for interactions, as summarized by a number of reviews. Among the four ABC transporters, P-gp is the most extensively studied, owing to its early discovery, ubiquitous expression across many tissues, broad substrate diversity, and significant influence on drug absorption and therapeutic outcomes. However, the computational studies for P-gp have relied on training data collected over a decade ago, utilized in-house experimental data that could not be shared publicly, or used small training sets (e.g., <50 compounds). Most data collection efforts for transporters were also conducted over ten years ago, highlighting the need for contemporary training data for ML modeling. Although the ML models developed in many of these studies had acceptable performance, their applications for evaluating new compounds and other PK properties were limited. Our group previously published a paper that integrated transporter binding potential into predictions of blood-brain barrier (BBB) permeability. However, the inclusion of transporter binding as descriptors in that study achieved only insignificant improvements in the BBB models.

In the current study, we sought to (1) curate contemporary data sets for the substrate binding and chemical inhibition of P-gp, BCRP, MRP1, and MRP2 using public data, (2) develop and analyze ML models for their ability to predict transporter interactions, and (3) integrate these models into predictions of complex drug bioactivities such as penetration into restricted tissues (e.g., brain). We used ChEMBL as the major data source for chemical interactions with human P-gp, BCRP, MRP1, and MRP2. The retrieved data were rigorously and manually curated, and experimental details from hundreds of assays were harmonized. Strict criteria and thresholds were applied to categorize chemicals as substrates, nonsubstrates, inhibitors, or noninhibitors. Additional data from PubChem and Metrabase were integrated to train QSAR models using our in-house automated QSAR pipeline. These models were validated through predicting interactions with ABC transporters for chemicals in DrugBank and employed for predicting P-gp and BCRP substrate binding for compounds with varying levels of brain exposure. This study provides a large, comprehensive, and user-friendly transporter database for building ML models. Furthermore, the applicability of transporter model predictions offers new insights into more complex biological endpoints and can assist along the drug development pipeline.

Methods

Data Collection and Curation

This study focused on compiling and curating bioactivity data from ChEMBL (https://www.ebi.ac.uk/chembl/, accessed April 2025) related to ABC transporter interactions. Using Python v3.11.9 and the open-source chembl-webresource-client library developed by the ChEMBL group (https://github.com/chembl/chembl_webresource_client, April 2025), over 24,000 experimental records were retrieved for human P-gp, BCRP, MRP1, and MRP2. ChEMBL annotated each record with its literature source and assay description. All records were then manually reviewed by examining each cited literature source, noting details such as cell lines, substrates, substrate concentrations, and positive controls where available. The annotation of experimental details and suggestions for substrate and inhibition thresholds followed approaches described by Montanari and Ecker, Sedykh et al., and TransPortal. Each bioactivity record was classified as 1 (substrate/inhibitor), 0 (nonsubstrate/noninhibitor), 0.5 (inconclusive), or N.A. (not applicable), based on conservative thresholds or thresholds informed by previous curation studies. For example, a bioactivity record that reported a chemical with an IC50 of 5 μM would be labeled as 1 (inhibitor) since chemicals with an IC50 ≤ 10 μM were considered inhibitors. Further classification details and activity endpoints are discussed in the Supporting Information. If a chemical had multiple records, a majority vote was taken to determine its final class. In the case of an equal number of conflicting values (e.g., one inhibitor record and one noninhibitor record), the chemical was assigned the value 0.5. Only chemicals with binary labels of 0 or 1 for substrate and/or inhibition activity were retained for model training.
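The record-to-chemical aggregation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the curation code used in the study: the function name `aggregate_labels` is hypothetical, and the handling of inconclusive (0.5) and N.A. records during the vote is an assumption, since the text specifies only the majority vote and the tie rule.

```python
from collections import Counter

def aggregate_labels(record_labels):
    """Collapse per-record labels into one label per chemical.

    record_labels: list of 1 (active), 0 (inactive), 0.5 (inconclusive),
    or None (N.A.). Assumption: only the binary records take part in the vote.
    """
    votes = Counter(label for label in record_labels if label in (0, 1))
    if not votes:
        # No binary evidence: stay inconclusive if any record was, else N.A.
        return 0.5 if 0.5 in record_labels else None
    if votes[1] > votes[0]:
        return 1
    if votes[0] > votes[1]:
        return 0
    return 0.5  # equal number of conflicting records

def keep_for_training(chemical_labels):
    """Retain only chemicals whose final label is binary (0 or 1)."""
    return {name: y for name, y in chemical_labels.items() if y in (0, 1)}
```

For instance, a chemical with two inhibitor records and one noninhibitor record resolves to 1, whereas one record of each kind resolves to 0.5 and is dropped before training.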

External Data Sets

External data sets were utilized to validate the model performance and evaluate the applicability of predictions. The DrugBank database provides a record of interactions between drugs and various biological targets, including enzymes, carriers, and transporters. For this study, interactions between chemicals and ABC transporters were retrieved from the target pages in DrugBank (e.g., P-gp - https://go.drugbank.com/bio_entities/BE0001032, accessed March 2025). Similar to the training sets, eight validation sets were created from DrugBank, ranging from 8 compounds (MRP1 substrates) to 197 compounds (P-gp substrates) after removing overlaps with the associated training sets (Table S1, Supplementary Excel File).

To show the utility of our substrate model predictions, the unbound brain-to-plasma concentration ratio (Kp,uu,brain), which quantifies the extent of brain exposure for test compounds, was chosen as the primary PK endpoint. The Kp,uu,brain data set was obtained from Fridén et al. and includes experimentally measured Kp,uu,brain values from rat studies. After removing duplicates, a total of 85 compounds, primarily drugs, remained as another external prediction set for the models generated in this study.

Chemical Curation

Chemical structures were standardized using the CASE Ultra v1.9.0.4 DataKurator tool (MultiCASE Inc., Mayfield Heights, OH). Initially, as the SMILES provided by ChEMBL and DrugBank were not canonical SMILES, we standardized them by generating the canonical SMILES using the PubChem Identifier Exchange (https://pubchem.ncbi.nlm.nih.gov/idexchange/, April 2025). Inorganic compounds were removed, and only the largest organic component was retained for mixtures. Duplicates, which consisted primarily of stereoisomers, were removed from each of the eight training sets, with only the chemical with the highest activity being retained. In this study, only 2D chemical structures were considered to reduce the complexity of modeling and the associated computation time. Chemicals in external sets were excluded if they overlapped with the training set chemicals. If chemicals between the training set and DrugBank had conflicting values (e.g., noninhibitor in the training set and inhibitor in the external set), then the chemicals were removed from both the training set and the external set.
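The duplicate and overlap rules above can be illustrated with a short Python sketch. It assumes that each chemical is already represented by a standardized 2D canonical SMILES string (the standardization itself was done with CASE Ultra and the PubChem Identifier Exchange and is not reproduced here), and it interprets "highest activity" as the larger binary label. The function names are hypothetical.

```python
def deduplicate(records):
    """records: iterable of (canonical_smiles, label) pairs.

    Duplicate structures (e.g., stereoisomers collapsed to one 2D SMILES)
    are merged, keeping the highest activity label for each structure.
    """
    best = {}
    for smiles, label in records:
        best[smiles] = max(label, best.get(smiles, label))
    return best

def reconcile(train, external):
    """Remove overlaps between a training set and an external set.

    Chemicals present in both sets are dropped from the external set.
    Chemicals whose labels conflict are dropped from both sets.
    """
    shared = train.keys() & external.keys()
    conflicts = {s for s in shared if train[s] != external[s]}
    train_out = {s: y for s, y in train.items() if s not in conflicts}
    external_out = {s: y for s, y in external.items() if s not in shared}
    return train_out, external_out
```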

Chemical Descriptors

Three sets of chemical descriptors were calculated to quantify the chemical structures and serve as variables for model training. (1) Extended-connectivity fingerprints (ECFP) are circular topological fingerprints that describe each atom’s neighborhood within 1024-bit vectors. The fingerprints were generated with a bond radius of 3 through the Morgan algorithm. (2) MACCS keys are a set of 166 binary fingerprints that represent a variety of substructures. (3) RDKit descriptors consist of 210 molecular descriptors that represent a chemical’s physicochemical properties. All the descriptors were generated with the open-source cheminformatics toolkit RDKit v2023.09.6 in Python.

QSAR Model Development

Four ML algorithms were used to develop the QSAR models: deep neural network (DNN), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB). These algorithms were selected because they represent a balance of classical and modern ML approaches in cheminformatics, as demonstrated in our previous work. Other algorithms, such as logistic regression (LR) and k-nearest neighbors (KNN), were used in the initial modeling process. However, preliminary modeling on the eight training sets showed that their predictive performance was considerably lower compared to the selected algorithms (DNN, RF, SVM, and XGB) (data not shown). Thus, we focused on the four algorithms that offered the strongest performance.

The DNN was built as a multilayer perceptron, a feed-forward neural network trained through backpropagation using a nonlinear activation function. The network architecture in this study consisted of an input layer with chemical descriptors representing the training set compounds, three hidden layers, and an output layer providing the predicted probabilities for transporter interactions. RF is an ensemble method that constructs multiple random decision trees, aggregating their results through a majority vote. SVM identifies the optimal hyperplane that best separates chemicals from the binary classes. XGB, a gradient boosting method, iteratively combines several weak decision trees to create a more accurate predictive model. The DNN, RF, and SVM algorithms were implemented in Python using the open-source scikit-learn v1.4.2 library, while XGB was implemented with the xgboost v2.0.3 library. Hyperparameters for all four algorithms were tuned using grid search in scikit-learn, where each model was optimized by fitting different hyperparameter combinations to the training set to select the best-performing models. The hyperparameters used for optimization in this study are outlined in detail in our previous studies.

Eight data sets were used for training, with data for each of the four ABC transporters divided into either substrate binding or inhibition categories. For each of the eight data sets, 12 individual models were developed using different combinations of chemical descriptors and ML algorithms. The modeling process was facilitated by an in-house automatic QSAR modeling pipeline, which is available on GitHub (https://github.com/zhu-research-group/auto_qsar). A consensus model was also generated for each data set by averaging the predictions from all the individual models. Model performance was evaluated through 5-fold cross-validation, in which the data set was randomly split into five subsets. In each iteration, four subsets were combined for training, while the remaining subset served as a test set for assessing model predictivity. This process was repeated five times, ensuring that each compound in the data set was used for testing one time.
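The modeling layout described above (4 algorithms × 3 descriptor sets = 12 individual models, a consensus obtained by averaging, and 5-fold cross-validation) can be sketched with the Python standard library alone. This is an illustration of the bookkeeping only, not the auto_qsar pipeline: the names `model_grid`, `consensus`, and `five_fold_indices` are hypothetical, and a plain random split is assumed because the text does not state whether the folds were stratified.

```python
import random
from itertools import product

ALGORITHMS = ["DNN", "RF", "SVM", "XGB"]
DESCRIPTORS = ["ECFP", "MACCS", "RDKit"]

# Every algorithm/descriptor pairing gives one individual model (12 in total).
model_grid = list(product(ALGORITHMS, DESCRIPTORS))

def consensus(probabilities):
    """Average the predicted probabilities of the individual models."""
    return sum(probabilities) / len(probabilities)

def five_fold_indices(n_compounds, n_folds=5, seed=0):
    """Yield (train, test) index lists; each compound is tested exactly once."""
    indices = list(range(n_compounds))
    random.Random(seed).shuffle(indices)
    folds = [indices[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test
```

With eight data sets, 12 individual models plus one consensus model per data set gives 8 × 13 = 104 models in total.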

Evaluation Metrics for Model Performance

The performance of the models was evaluated using five metrics: sensitivity, specificity, correct classification rate (CCR), positive predictive value (PPV), and the area under the curve (AUC). Sensitivity represents the proportion of correctly identified actives (true positives) out of the total number of active compounds in the data set (eq 1). Specificity, on the other hand, represents the proportion of correctly classified inactives (true negatives) out of the total number of inactive compounds (eq 2). CCR is calculated as the average of sensitivity and specificity, representing overall model performance (eq 3). PPV indicates the proportion of true positive predictions among all compounds predicted to be active by the model (eq 4). Lastly, the AUC was calculated by plotting the true positive rate (sensitivity) against the false positive rate (1 – specificity) across different classification thresholds.

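\[ \mathrm{Sensitivity} = \frac{\text{True positives}}{\text{True positives} + \text{False negatives}} \tag{1} \]
\[ \mathrm{Specificity} = \frac{\text{True negatives}}{\text{True negatives} + \text{False positives}} \tag{2} \]
\[ \mathrm{CCR} = \frac{\mathrm{Sensitivity} + \mathrm{Specificity}}{2} \tag{3} \]
\[ \mathrm{PPV} = \frac{\text{True positives}}{\text{True positives} + \text{False positives}} \tag{4} \]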
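Equations 1–4 translate directly into code. The sketch below is a minimal illustration with a hypothetical function name; it assumes binary labels (1 = active, 0 = inactive), that both classes are present, and that at least one compound is predicted active, so that no denominator is zero.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, CCR, and PPV from binary labels (eqs 1-4)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)                # eq 1
    specificity = tn / (tn + fp)                # eq 2
    ccr = (sensitivity + specificity) / 2       # eq 3
    ppv = tp / (tp + fp)                        # eq 4
    return {"sensitivity": sensitivity, "specificity": specificity,
            "ccr": ccr, "ppv": ppv}
```

Because CCR averages the per-class rates, it is not inflated by a majority class in the way that plain accuracy can be.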

Applicability Domain (AD)

The applicability domain (AD) defines the scope within which the model’s predictions are considered reliable. In this study, the generated QSAR models produced predictions on a probability scale from 0 to 1, where values of 0.5 or higher indicate substrates or inhibitors, while values below 0.5 indicate nonsubstrates or noninhibitors. To improve the predictive accuracy of the models, an AD was implemented based on the predicted probability of a compound’s activity, following an approach that has been successfully applied in previous studies. Compounds with a predicted probability of 0.6 or higher were classified as substrates or inhibitors, while those with a probability of 0.4 or lower were classified as nonsubstrates or noninhibitors. Predictions falling between 0.4 and 0.6 were defined as “out-of-domain” and excluded from further evaluation of model performance; these thresholds had proven successful in our previous studies.
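The probability-based AD above amounts to a three-way decision rule. The following sketch, with hypothetical function names, shows the rule and the resulting coverage (the fraction of compounds that remain in domain).

```python
def apply_applicability_domain(probability, lower=0.4, upper=0.6):
    """Return 1 (active), 0 (inactive), or None (out-of-domain)."""
    if probability >= upper:
        return 1
    if probability <= lower:
        return 0
    return None  # between the thresholds: excluded from evaluation

def coverage(probabilities, lower=0.4, upper=0.6):
    """Fraction of predictions that fall inside the applicability domain."""
    calls = [apply_applicability_domain(p, lower, upper) for p in probabilities]
    return sum(c is not None for c in calls) / len(calls)
```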

Variable Analysis

Key molecular substructures accounting for transporter binding/inhibition were identified using the Shapley Additive Explanations (SHAP) approach. SHAP, originating from cooperative game theory, provides a framework for interpreting model predictions by quantifying the contribution of each variable (i.e., chemical descriptor) to individual predictions. SHAP was implemented using the open-source SHAP library (v0.48.0) in Python to calculate SHAP values for the MACCS descriptors across the eight DrugBank validation sets. The MACCS descriptors were subsequently ranked according to their mean absolute SHAP values, and the substructures associated with the top-ranked descriptors were analyzed to identify molecular features that were essential for predicting transporter binding and inhibition.

Scoring System for Brain Exposure

To evaluate exposure of the brain to chemicals, we developed a scoring system incorporating P-gp and BCRP substrate model predictions and key physicochemical properties. Compounds were assigned a score of 1 (indicating favorability for high brain exposure) for each of the following properties: a topological polar surface area (TPSA) ≤ 90 Å², hydrogen bond donors (HBD) ≤ 5, and hydrogen bond acceptors (HBA) ≤ 10. Compounds exceeding these thresholds received a score of 0 for the respective property. Additionally, compounds were predicted for their likelihood of being substrates for P-gp or BCRP using their respective QSAR models. A compound with a predicted probability ≥ 0.5 was classified as a substrate (score = 1), while a compound with a probability <0.5 was considered a nonsubstrate (score = 0). The final brain exposure score was calculated as the sum of the TPSA, HBD, and HBA scores, minus the P-gp/BCRP substrate score, yielding a total score ranging from −1 to 3 (eq 5).

\[ \text{Brain exposure score} = S_{\mathrm{TPSA}} + S_{\mathrm{HBD}} + S_{\mathrm{HBA}} - S_{\text{P-gp or BCRP efflux}} \tag{5} \]
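The scoring rule can be written as a small function. This is a sketch under stated assumptions: the function name is hypothetical, and a single efflux probability (from either the P-gp or the BCRP substrate model) is passed in, since the text does not specify how the two model outputs would be combined into one score.

```python
def brain_exposure_score(tpsa, hbd, hba, efflux_probability):
    """Brain exposure score: three property scores minus the efflux score.

    tpsa: topological polar surface area in square angstroms.
    hbd, hba: counts of hydrogen bond donors and acceptors.
    efflux_probability: predicted probability of being a P-gp or BCRP substrate.
    """
    s_tpsa = 1 if tpsa <= 90 else 0
    s_hbd = 1 if hbd <= 5 else 0
    s_hba = 1 if hba <= 10 else 0
    s_efflux = 1 if efflux_probability >= 0.5 else 0
    return s_tpsa + s_hbd + s_hba - s_efflux
```

A small, weakly polar compound predicted as a nonsubstrate scores 3 (most favorable for brain exposure), while a large polar compound predicted as a substrate scores −1.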

Results and Discussion

Study Workflow

Figure 1 provides an overview of the study’s workflow. First, transporter substrate and inhibition data for P-gp, BCRP, MRP1, and MRP2 were retrieved from ChEMBL. The data underwent manual curation, followed by classifying compounds as substrates, nonsubstrates, inhibitors, or noninhibitors. The curated database was combined with compounds from PubChem and Metrabase, forming the eight data sets (Table S2, Supplementary Excel file) used to train QSAR models with combinations of four machine learning algorithms and three sets of chemical descriptors. The resulting models were subsequently employed to predict chemical-transporter interactions for compounds from DrugBank and the brain exposure data set. Finally, compounds with predicted P-gp- and BCRP-mediated efflux potential were analyzed to assess brain exposure to drugs.

Figure 1. Overview of the study workflow: (1) data collection and curation, (2) QSAR modeling, (3) predictions for external chemicals, and (4) exploring brain exposure using drug efflux predictions. Created with Biorender.com.

ChEMBL Collection and Curation

The following UniProt IDs were used to retrieve data from ChEMBL: P08183 (P-gp), Q9UNQ0 (BCRP), P33527 (MRP1), and Q92887 (MRP2). This search yielded 4394 assays from 951 sources (primarily from the literature), encompassing 24,267 in vitro bioactivity records related to these transporters. P-gp and BCRP are commonly tested for potential chemical interactions because their broad substrate specificity and large ligand binding sites provide ample opportunity for DDIs between various P-gp/BCRP substrates and inhibitors, which explains the higher number of records for P-gp and BCRP compared to MRP1 and MRP2. P-gp, as the most extensively studied transporter, accounted for the largest data set, with over 15,000 records covering more than 6300 compounds, followed by BCRP with over 5000 entries, MRP1 with 2604, and MRP2 with 1191 (Figure S1, Supporting Information). The disproportionately higher number of records for P-gp and BCRP reflects their broad substrate specificity, large and flexible ligand-binding pockets, and critical roles at key pharmacokinetic barriers (Table 1). In contrast, MRP1 and MRP2 exhibit more selective binding sites and are found in fewer organs, which accounts for the smaller amount of testing data for their potential substrates and inhibitors. Furthermore, regulatory guidelines, such as those from the FDA, require preclinical assessment of P-gp and BCRP substrate and inhibition potential for DDIs, whereas no such requirements currently exist for MRP1 and MRP2.

Of the more than 24,000 bioactivity records retrieved from ChEMBL, inhibition-related data comprised the majority (over 60%) and were associated with endpoints such as IC50, % inhibition, EC50, and Ki (Figure 2). IC50 refers to the concentration of a compound required to inhibit transporter activity by 50%, while % inhibition represents the extent of transporter inhibition relative to a control or baseline. EC50 is the concentration at which a compound elicits 50% of its maximal effect on transporter activity, and Ki describes the dissociation constant, indicating the binding affinity between the compound and the transporter. The “Activity” field in ChEMBL (Figure 2) often corresponds to % inhibition values reported in the original studies, and the endpoints “ratio” and “fold-change (FC)” were used variably to describe a compound’s inhibitory effect relative to a control. For substrate-related data, key endpoints included Km, the concentration of substrate at which the transport rate reaches half its maximum, and apparent permeability (Papp), the rate at which a substance crosses a cell membrane. Additionally, in some cases, the “ratio” endpoint in a record referred to the efflux ratio (ER), defined as the ratio of the rate at which a compound is transported out of the cell to the rate at which it is transported into the cell, rather than an inhibitory ratio. This variability highlights the necessity of reviewing the original literature to accurately determine whether a given record reflects inhibitory activity or substrate transport.

Figure 2. Distribution of available transporter binding data across ChEMBL for the four ABC transporters. FC, fold change; Imax, maximum inhibition; Papp, apparent permeability.

Experimental results for P-gp, BCRP, MRP1, and MRP2 were gathered from 733, 194, 181, and 83 studies in ChEMBL, respectively. We manually evaluated each study’s experimental protocol, interpreting results in relation to the controls used and other studies. Key assay information, such as cell lines, substrate concentrations, and positive controls, was annotated when available. In some studies, experiments were conducted to assess the potency of compounds for certain interactions with the ABC transporters without classifying whether a compound was a substrate or an inhibitor. In such cases, a classification guideline (outlined in the Supporting Information) was applied to determine the classifications for compounds. Additionally, some records in ChEMBL included comments on compounds’ classifications, which were also considered in this curation process. However, generalized comments like “active” were not considered, as they did not specify an activity type (e.g., “active” may reflect the cytotoxicity of a compound).

For the transporter substrate binding data, records were excluded if the original literature source failed to report essential experimental details, such as the type of cell line or the compound concentration. This issue was frequently encountered in studies evaluating compound efflux through P-gp and/or BCRP as part of broader in vitro ADME testing. For example, one study reported elcubragistat as a nonsubstrate of P-gp; however, no experimental protocol was provided to support this result. Thus, this record was excluded from our analysis of elcubragistat’s substrate binding activity. Records were also found in which the ChEMBL classification as “substrate” was not consistent with the classification guideline implemented in our study. For example, progesterone (CHEMBL103) was labeled as a “substrate [+]” for P-gp in ChEMBL. However, the permeability ratio for progesterone in the associated literature source was 0.9 in the P-gp overexpressing LLC-PK1 cell line. The authors of the study also concluded that progesterone did not act as a substrate of P-gp. As a result, progesterone was classified as 0 (nonsubstrate) for P-gp.

Some specific assays, such as ATPase assays, were excluded from further evaluations as they were unsuitable for accurately classifying compounds as substrates or inhibitors. ATPase assays, often performed in insect cell membranes with human transporters overexpressed, provide insights into transporter binding affinity but can yield false positives/negatives. For example, P-gp ligands such as cyclosporine A do not consistently alter ATPase activity. As a result, ATPase results were excluded from final classifications. Similarly, an assay reported by Morgan et al. assessing the inhibition of human MRP2 using [3H]-estradiol-17beta-d-glucuronide uptake in membrane vesicles was the largest contributor to the MRP2 ChEMBL data set, accounting for 637 out of 1192 entries. However, known MRP2 inhibitors, such as MK571 and probenecid, exhibited IC50 values above 133 μM in this assay, which would categorize them as noninhibitors. Because the data from this study were not consistent with other data sources, they were excluded from modeling.

Additionally, our curation of bioactivity records reduced the potential misclassification of tested compounds. Specifically, several ChEMBL records reporting compounds with substrate or inhibitory activity for P-gp and BCRP were found, upon review of the primary literature sources, to originate from assays targeting other ABC transporters. For example, in the original BCRP ChEMBL data set, the compound 3-[(2-phenylpyrido[2,3-d]pyrimidin-4-yl)amino]benzonitrile (CHEMBL4069654) was reported to exhibit less than 25% inhibition of BCRP, which should be classified as a noninhibitor of BCRP. However, the associated assay was found to evaluate P-gp inhibition rather than BCRP inhibition, indicating that the compound was a noninhibitor of P-gp and not BCRP. As a result, this record was excluded. On the other hand, multiple records for the same compound, derived from assays targeting BCRP, confirmed the compound as a BCRP inhibitor. Thus, this compound was ultimately labeled as a BCRP inhibitor in the BCRP training set.

Discrepancies were also observed between the inhibition classifications reported in ChEMBL and PubChem. An example is the confirmatory assay for BCRP inhibition (PubChem AID: 489003; ChEMBL assay ID: CHEMBL2114819), where ChEMBL listed the compound 1-[4-[2-(3-fluorophenyl)-5-methylpyrazolo[1,5-a]pyrimidin-7-yl]piperazin-1-yl]ethenone (CHEMBL2130718) as “inactive” for BCRP inhibition, whereas PubChem classified the same compound as “active”. To address this inconsistency, we adopted the PubChem classification in our data set, assigning the compound as a BCRP inhibitor (labeled as 1) based on the positive bioactivity outcome reported in PubChem, rather than the “inactive” designation assigned in ChEMBL. Taken together, these examples highlight the need for a detailed review of assay protocols and chemical classifications in public data sources, since the data (e.g., chemical classification) found in public databases can be misleading and can cause errors in the modeling process.

Supplementing ChEMBL Data from Metrabase and PubChem for Modeling

Due to the nature of these transporter-related testing studies, there is often a bias toward reporting positive results, leading to a higher proportion of substrates or inhibitors than nonsubstrates or noninhibitors in the ChEMBL data sets (Table 2). To address this imbalance, substrate data for the four ABC transporters were collected from Metrabase, a curated transporter interaction database, to supplement the ChEMBL data for the four transporter data sets (Table 2). However, the number of BCRP nonsubstrates provided by Metrabase was insufficient to balance the training set. Thus, BCRP nonsubstrates were added from data collected by Shaikh et al. to achieve a balanced training set for generating BCRP substrate models.

Table 2. Details of the Curated Substrate, Non-Substrate, Inhibitor, and Non-Inhibitor Data Retrieved from ChEMBL, PubChem, and Metrabase (a)

Substrate data
Source | P-gp substrates | P-gp nonsubstrates | BCRP substrates | BCRP nonsubstrates | MRP1 substrates | MRP1 nonsubstrates | MRP2 substrates | MRP2 nonsubstrates
ChEMBL | 303 | 155 | 66 | 15 | 55 | 8 | 41 | 6
Metrabase | - | 166 | 211 | 160 | 89 | 90 | 134 | 121
Shaikh et al. | - | - | - | 116 | - | - | - | -

Source | P-gp | BCRP | MRP1 | MRP2
Total | 624 | 568 | 242 | 286
Total (after curation) | 607 (297 S/310 NS) | 507 (253 S/254 NS) | 182 (94 S/88 NS) | 253 (134 S/119 NS)

Inhibitor data
Source | P-gp inhibitors | P-gp noninhibitors | BCRP inhibitors | BCRP noninhibitors | MRP1 inhibitors | MRP1 noninhibitors | MRP2 inhibitors | MRP2 noninhibitors
ChEMBL | 2747 | 822 | 1186 | 333 | 360 | 718 | 67 | 583
PubChem | - | 2000 | - | 900 | - | - | - | -

Source | P-gp | BCRP | MRP1 | MRP2
Total | 5569 | 2419 | 1078 | 650
Total (after curation and balancing) | 5282 (2493 I/2789 NI) | 2395 (1176 I/1219 NI) | 684 (342 I/342 NI) | 130 (65 I/65 NI)

(a) S, substrate; NS, nonsubstrate; I, inhibitor; NI, noninhibitor.

On the other hand, the MRP1 and MRP2 inhibition data sets have more noninhibitors than inhibitors, likely due to their selective binding sites and the use of their assays in studies focused on validating compound selectivity for other transporters. To balance these data sets for modeling, noninhibitors were randomly removed from the MRP1 and MRP2 inhibition data sets. For the P-gp and BCRP inhibition data sets, which had an inhibitor-to-noninhibitor ratio of roughly 3:1, noninhibitors were supplemented from high-throughput screening assays in PubChem (AIDs 1325 and 1326). From approximately 200,000 inactive compounds in each assay, 10,000 noninhibitors of P-gp and BCRP were randomly chosen and then retained only if they had high structural similarity (Tanimoto coefficient ≥ 0.8, computed on MACCS keys) to inactive compounds in the training set, as described in Jiang et al. As a result, the final P-gp and BCRP inhibition data sets achieved a ratio of nearly 1:1 (Table ). This data curation process generated the largest database for chemical binding/inhibition of ABC transporters thus far. Overall, compared to the existing databases reported in previous modeling studies, the training sets in this project were substantially larger and covered a more diverse chemical space for model development and predictions.
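The similarity-based selection of additional noninhibitors described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: fingerprints are represented as plain sets of "on" bit indices rather than real MACCS keys, and the compound names, fingerprints, and the helper names `tanimoto` and `select_similar` are hypothetical. Only the 0.8 threshold comes from the text.

```python
import random


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def select_similar(candidates: dict, reference: list, threshold: float = 0.8) -> list:
    """Keep candidates whose best match among the reference fingerprints reaches the threshold."""
    kept = []
    for name, fp in candidates.items():
        best = max((tanimoto(fp, ref) for ref in reference), default=0.0)
        if best >= threshold:
            kept.append(name)
    return kept


# Hypothetical toy fingerprints (NOT real MACCS keys).
training_inactives = [frozenset({1, 2, 3, 4, 5}), frozenset({10, 11, 12, 13})]
screening_inactives = {
    "cand_A": frozenset({1, 2, 3, 4, 5, 6}),  # 5 shared bits / 6 in union
    "cand_B": frozenset({1, 2, 20, 21}),      # 2 shared bits / 7 in union
    "cand_C": frozenset({10, 11, 12, 13}),    # identical to a training inactive
}

# Step 1: random draw from the screening inactives (here all 3; 10,000 in the text).
rng = random.Random(0)
pool = dict(rng.sample(sorted(screening_inactives.items()), k=3))

# Step 2: keep only the candidates that resemble known training-set inactives.
selected = sorted(select_similar(pool, training_inactives))
print(selected)
```

In practice the fingerprints would come from a cheminformatics toolkit; the set-based form is used here only to keep the sketch dependency-free.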

QSAR Model Development and Performance

The final eight curated data sets, which included P-gp, BCRP, MRP1, and MRP2 substrate and inhibition interactions (Table ), were used to train QSAR models, producing a total of 104 models, with 12 individual models and one consensus model per data set. Using CCR as the primary indicator of model performance, 5-fold cross-validation yielded CCR values ranging from 0.685 to 0.834 for the substrate models and from 0.631 to 0.935 for the inhibition models (Table ). A total of 99 of the 104 models achieved a CCR of 0.7 or higher, reflecting good overall model performance. Additionally, all the individual models achieved AUC scores above 0.7. Overall, the models trained on the eight data sets in this study performed well across multiple statistical metrics. Our models also demonstrated comparable or better performance than those generated in past representative modeling studies.

Table 3. Five-Fold Cross-Validation Performance of the ABC Transporter Models, Reported as Correct Classification Rate (CCR) (a)

| Algorithm | Descriptor | P-gp SUB | P-gp INH | BCRP SUB | BCRP INH | MRP1 SUB | MRP1 INH | MRP2 SUB | MRP2 INH |
|---|---|---|---|---|---|---|---|---|---|
| DNN | ECFP6 | 0.711 | 0.893 | 0.720 | 0.917 | 0.760 | 0.829 | 0.740 | 0.631 |
| DNN | MACCS | 0.710 | 0.872 | 0.734 | 0.913 | 0.729 | 0.804 | 0.781 | 0.692 |
| DNN | RDKit | 0.723 | 0.892 | 0.748 | 0.918 | 0.752 | 0.808 | 0.726 | 0.700 |
| RF | ECFP6 | 0.685 | 0.839 | 0.714 | 0.898 | 0.820 | 0.827 | 0.783 | 0.731 |
| RF | MACCS | 0.732 | 0.849 | 0.749 | 0.919 | 0.785 | 0.806 | 0.796 | 0.646 |
| RF | RDKit | 0.734 | 0.877 | 0.757 | 0.916 | 0.823 | 0.864 | 0.814 | **0.800** |
| SVM | ECFP6 | 0.743 | 0.916 | 0.734 | 0.923 | 0.817 | 0.852 | 0.753 | 0.708 |
| SVM | MACCS | 0.724 | 0.891 | 0.738 | 0.918 | 0.824 | 0.825 | 0.807 | 0.715 |
| SVM | RDKit | 0.745 | 0.913 | 0.767 | 0.924 | 0.825 | 0.829 | 0.825 | 0.715 |
| XGB | ECFP6 | 0.713 | 0.904 | 0.708 | 0.915 | 0.819 | 0.873 | 0.801 | 0.677 |
| XGB | MACCS | 0.710 | 0.884 | 0.744 | 0.921 | 0.785 | 0.845 | 0.785 | 0.700 |
| XGB | RDKit | 0.740 | 0.912 | 0.769 | 0.926 | **0.834** | 0.855 | 0.786 | 0.785 |
| Consensus | | **0.752** | **0.924** | **0.779** | **0.935** | 0.829 | **0.883** | **0.833** | 0.731 |

(a) CCR values in bold represent the best-performing model for the respective data set; SUB = substrate models; INH = inhibition models.

For the substrate models, all 52 models showed strong overall performance, with CCR values ranging from 0.685 to 0.834. Among the inhibition models, the BCRP models exhibited the best overall performance, with all models achieving CCR values close to or above 0.9. The P-gp and MRP1 inhibition models also yielded satisfactory CCR values, ranging from 0.804 to 0.924. Adding an extra feature selection step to the modeling process (such as recursive feature elimination or forward selection) produced no significant improvement (data not shown). In contrast, four MRP2 inhibition models had CCR values lower than 0.7 because of their low sensitivity; for example, the MRP2 DNN-MACCS model had a sensitivity as low as 0.385 (Table S3, Supplementary Excel File). This underperformance can be explained by the limited size of the training set (130 compounds) and reflects the need for more experimental MRP2 inhibition data to develop an improved model in the future.
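To make the link between low sensitivity and a CCR below 0.7 concrete, the short sketch below assumes the usual definition of CCR as the mean of sensitivity and specificity (the definition itself is not restated in this section). The confusion-matrix counts are hypothetical; only the rounded values 0.692 and 0.385 are taken from the text and Table 3.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true positives (e.g., inhibitors) that are predicted positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of true negatives (e.g., noninhibitors) that are predicted negative."""
    return tn / (tn + fp)


def ccr(tp: int, fn: int, tn: int, fp: int) -> float:
    """Correct classification rate, assumed here to be the mean of sensitivity and specificity."""
    return 0.5 * (sensitivity(tp, fn) + specificity(tn, fp))


def implied_specificity(ccr_value: float, sens: float) -> float:
    """Specificity implied by a CCR and a sensitivity under the same assumed definition."""
    return 2.0 * ccr_value - sens


# Hypothetical confusion matrix: the model misses most positives but rarely mislabels negatives.
example = ccr(tp=40, fn=60, tn=95, fp=5)
print(f"CCR for the hypothetical counts: {example:.3f}")

# Reported (rounded) values for the MRP2 DNN-MACCS model: CCR 0.692, sensitivity 0.385.
spec = implied_specificity(0.692, 0.385)
print(f"Implied specificity: {spec:.3f}")
```

Under this assumption, a model can combine near-perfect specificity with poor sensitivity and still land just below a CCR of 0.7, which is why sensitivity is worth inspecting separately for the MRP2 inhibition models.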

In general, the performance of the QSAR models varied depending on the algorithm and chemical descriptors employed, and no single algorithm or descriptor consistently excelled across all individual models. However, the consensus models were the best-performing models for six of the eight data sets and demonstrated comparable performance for the remaining two, making them suitable for external predictions. The advantage of the consensus models is that they integrate predictions from multiple individual models, thereby reducing the impact of isolated false predictions. The AD was also implemented to evaluate whether model performance could be improved. Table S4 (Supplementary Excel File) shows that incorporating an AD yielded results comparable to those of models without an AD (Table S3). We therefore proceeded with the assessment of external predictions without applying an AD.
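The way a consensus prediction dampens an isolated false prediction can be illustrated with a small sketch. It assumes that the consensus is the arithmetic mean of the individual models' predicted probabilities, followed by a 0.5 cutoff; both the averaging rule and the cutoff are assumptions made for illustration, and the twelve probabilities are hypothetical.

```python
from statistics import mean


def consensus_probability(probabilities: list) -> float:
    """Average the predicted probabilities of the individual models (assumed consensus rule)."""
    return mean(probabilities)


def consensus_label(probabilities: list, threshold: float = 0.5) -> int:
    """Return 1 (active) if the averaged probability reaches the threshold, else 0."""
    return int(consensus_probability(probabilities) >= threshold)


# Hypothetical outputs of 12 individual models (4 algorithms x 3 descriptor sets)
# for one compound; the fourth model disagrees strongly with the rest.
individual = [0.82, 0.77, 0.90, 0.15, 0.71, 0.68, 0.88, 0.79, 0.74, 0.81, 0.69, 0.85]

votes_against = sum(p < 0.5 for p in individual)
print(f"models predicting inactive: {votes_against}")
print(f"consensus probability: {consensus_probability(individual):.4f}")
print(f"consensus label: {consensus_label(individual)}")
```

A single dissenting model lowers the averaged probability only slightly, so the consensus call stays with the majority.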

The performance of the consensus models in external validation was evaluated using sensitivity as the primary metric, given that the DrugBank validation sets consisted only of substrates and inhibitors. Among the substrate models, the BCRP consensus model demonstrated the best performance, achieving a sensitivity of 0.723 (Figure A). The P-gp and MRP1 substrate models exhibited moderate sensitivity values for DrugBank compounds, ranging between 0.55 and 0.65. Among the inhibition models, the MRP1 consensus model achieved the highest sensitivity (0.956), whereas the P-gp and MRP2 consensus models exhibited the lowest performance (Figure A). Although the P-gp inhibition data set was the largest, the underperformance of the P-gp inhibition models may be due to the broad substrate specificity of P-gp, which binds to diverse chemical structures. Furthermore, we identified inconsistencies within the DrugBank classifications: several compounds, such as benzocaine and amodiaquine, were labeled as P-gp inhibitors even though the cited literature references classified them as noninhibitors. The inconsistency of this data source further underscores the importance of data curation for model development. In contrast, the poor performance of the MRP2 inhibition model is likely due to the limited size of the training set (only 130 compounds). Additionally, during data set curation, conflicting labels were identified for some compounds shared between the training and DrugBank validation sets (e.g., indomethacin was labeled as a P-gp inhibitor in DrugBank but as a noninhibitor in the training set). This suggests that certain compounds in the MRP2 inhibition set may also be mislabeled, either because of testing protocol sensitivity or because of differences in how transporter inhibition was classified (i.e., different thresholds being used in different sources).

The chemical space and diversity of the training set also appear to influence model performance, as illustrated by the PCA plots comparing the MRP1 and MRP2 inhibition data sets in Figure B. Specifically, MRP1 inhibitors from DrugBank were more similar to the MRP1 inhibitors than to the MRP1 noninhibitors in the training set, which likely contributed to the high predictive sensitivity of the MRP1 inhibition consensus model. We also evaluated the effect of applying the AD in the external predictions. However, no significant improvement in predictive accuracy was observed when compounds outside the AD were excluded. This result also points to potential inconsistencies between the sources used in DrugBank and the training data.

Figure 3. External validation of ABC transporter models using eight DrugBank data sets consisting of drugs classified as substrates/inhibitors of the four ABC transporters. (A) Model performance evaluated using sensitivity as a metric. (B) PCA plots illustrating the chemical space for the MRP1 and MRP2 inhibition training and DrugBank sets using the MACCS descriptors.

Key Structural Features Underlying Model Predictions

Using the SHAP approach, MACCS descriptors were ranked by their contributions to model predictions. For P-gp, substrate predictions were primarily driven by descriptors associated with polar functional groups, tertiary amines, and flexible chains and ring structures (Figure S2). Inhibitor predictions, on the other hand, were linked to hydrophobic substructures and multiple hydrogen bond donors and acceptors that promote strong binding within the P-gp ligand binding pocket. For BCRP, substrates were characterized by fluoride substituents, hydrogen bond donors, and planar aromatic heterocycles, while inhibitors were more strongly associated with nitrogen heterocycles, multiple aromatic rings, and extended aromatic scaffolds (Figure S3). In the case of MRP1, substrate predictions were driven by descriptors conferring increased hydrophilicity, whereas inhibitors were characterized by hydroxyl and carbonyl groups as well as tetrahedral carbons linked to three or more carbons (Figure S4). Finally, for MRP2, substrates were linked to descriptors involving multiple oxygen atoms, oxygen-containing heterocycles, and hydrogen bond donors and acceptors, while inhibitors were associated with five-membered rings, multiple heterocycles, and hydrophobic motifs (Figure S5). Overall, these descriptors highlight a number of structural features correlated with substrate recognition and inhibition, some of which are supported by previous studies. Further mechanistic analysis and experimental validation are still needed to uncover novel molecular determinants of transporter-ligand interactions.

The Use of Transporter Substrate Model Predictions in Brain Exposure Evaluation

The effects of drugs on the brain are determined not only by their inherent ability to cross the BBB but also by their affinity for membrane transporters. P-gp and BCRP, both primary efflux transporters, are highly expressed in the brain and can significantly influence the efficacy of central nervous system (CNS) drugs. To investigate whether predicted P-gp- and BCRP-mediated drug efflux can be used to evaluate brain exposure, we applied the P-gp and BCRP substrate consensus models to predict substrate binding for compounds from a data set with measured Kp,uu,brain values (Table S5, Supplementary Excel File). The results revealed that compounds with low brain exposure (Kp,uu,brain < 0.1) were at least twice as likely to be predicted as P-gp and/or BCRP substrates as compounds with high brain exposure (Figure A).
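The comparison behind a statement such as "at least twice as likely" can be written as a simple ratio of predicted-substrate fractions in the low- and high-exposure groups. The sketch below is illustrative only: the predictions are hypothetical toy values rather than the 85 compounds in Table S5, and the function names are introduced here for the example.

```python
def substrate_fraction(predictions: list) -> float:
    """Fraction of compounds in a group predicted as substrates (1 = substrate, 0 = nonsubstrate)."""
    if not predictions:
        raise ValueError("group is empty")
    return sum(predictions) / len(predictions)


def enrichment_ratio(low_exposure: list, high_exposure: list) -> float:
    """How many times more often low-exposure compounds are predicted as substrates."""
    high_fraction = substrate_fraction(high_exposure)
    if high_fraction == 0:
        raise ValueError("no predicted substrates in the high-exposure group")
    return substrate_fraction(low_exposure) / high_fraction


# Hypothetical predictions, grouped by measured Kp,uu,brain (< 0.1 vs >= 0.1).
low_group = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0]   # 8 of 16 predicted substrates
high_group = [0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]  # 4 of 16 predicted substrates

ratio = enrichment_ratio(low_group, high_group)
print(f"low-exposure fraction:  {substrate_fraction(low_group):.2f}")
print(f"high-exposure fraction: {substrate_fraction(high_group):.2f}")
print(f"enrichment ratio:       {ratio:.1f}")
```

Using fractions rather than raw counts keeps the comparison fair when the two exposure groups differ in size.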

Figure 4. Model predictions for P-gp- and BCRP-mediated efflux of 85 compounds with experimentally measured Kp,uu,brain. (A) Comparison of P-gp/BCRP substrate and nonsubstrate predictions across the 85 compounds with high (Kp,uu,brain ≥ 0.1) and low (Kp,uu,brain < 0.1) brain exposure. Scatterplots of predicted (B) P-gp and (C) BCRP substrate probabilities against Kp,uu,brain and applying different thresholds to assess brain exposure (Kp,uu,brain = 0.1 vs Kp,uu,brain = 1). Compounds in the shaded areas have a Kp,uu,brain ≥ 1, where the red area contains predicted substrates, and the green area contains predicted nonsubstrates.

The definition of high and low brain exposure varies across studies. In this study, we first assessed the relationships between the P-gp and BCRP substrate predictions and the measured Kp,uu,brain values for the 85 compounds. When a threshold of Kp,uu,brain ≥ 1 was applied to classify high brain exposure drugs, eight of the nine such compounds were predicted as nonsubstrates of P-gp (Figure B, green shaded area). Similarly, seven of these nine compounds were predicted as nonsubstrates of BCRP (Figure C, green shaded area). The three remaining predictions (donepezil for P-gp, and bupropion and tetramethylpyrazine for BCRP) were false positives of the respective substrate models. Nonetheless, the models correctly identified the nonsubstrate status of the majority of the high exposure drugs with Kp,uu,brain ≥ 1. Notably, although the overall data set of 85 compounds comprises a mixture of CNS and non-CNS drugs, all nine compounds with a Kp,uu,brain ≥ 1 were CNS drugs as classified in DrugBank, suggesting that such high Kp,uu,brain values are exclusive to drugs that penetrate the brain effectively. A review of the literature further confirmed that none of these nine drugs were reported as P-gp or BCRP substrates. Overall, these results support the prevailing hypothesis that compounds with high brain exposure are unlikely to be effluxed by P-gp or BCRP.

When applying P-gp and BCRP substrate model predictions to evaluate drug brain exposure, it is also important to consider the physicochemical properties of target drugs that influence brain exposure. Key factors influencing Kp,uu,brain include TPSA, HBD, and HBA. To evaluate brain exposure using both transporter binding and physicochemical properties, we incorporated P-gp/BCRP substrate predictions into a brain exposure scoring framework (see Methods, eq ). The analysis revealed that more than 60% of the high brain exposure compounds (Kp,uu,brain ≥ 0.1) were predicted to have high exposure scores, suggesting that a low predicted binding potential for P-gp or BCRP contributed to their favorable brain penetration (Figure ). In contrast, among the compounds with low brain exposure (Kp,uu,brain < 0.1), 20 out of 35 compounds exhibited low exposure scores (−1, 0, or 1), of which 16 were predicted to be substrates of P-gp and/or BCRP. Furthermore, seven of the ten low exposure compounds with moderate scores (2) were also predicted as P-gp/BCRP substrates. Although these compounds have favorable molecular properties, their active efflux can limit their degree of brain exposure. For example, loperamide, a synthetic μ-opioid receptor agonist used to treat diarrhea, shows minimal CNS penetration due to its efflux by P-gp (Kp,uu,brain = 0.007; predicted as P-gp substrate), which limits its potential analgesic effects. Similarly, methotrexate, a chemotherapeutic agent with poor brain penetration, is actively effluxed by P-gp, BCRP, and other ABC transporters (Kp,uu,brain = 0.006; predicted as P-gp and BCRP substrate). This efflux activity contributes to its poor CNS distribution and provides a rationale for the intrathecal administration of methotrexate in the treatment of brain tumors to bypass the BBB. These findings suggest a new strategy for applying transporter ML models to evaluate more complex drug-related properties, such as brain exposure for new drug candidates.

Figure 5. Distribution of exposure scores across the 85 compounds with high (Kp,uu,brain ≥ 0.1) and low (Kp,uu,brain < 0.1) brain exposure.

Conclusion

This study compiled a large database of around 8800 diverse compounds with annotated and categorized transporter interaction data, including substrate binding and inhibition, for four ABC transporters. The eight training sets generated from this collection were used to build various QSAR models, most of which demonstrated good predictive performance. External validation of the developed models with new compounds highlighted the critical role of training set size and diversity, as well as the importance of thorough annotation, interpretation, and critical analysis of public data in ensuring data set accuracy. Furthermore, applying P-gp and BCRP substrate model predictions showed promising results when assessing drug brain exposure through a simple and straightforward scoring system. The inhibition models can also be applied in drug discovery to identify compounds with inhibitory activity that may interfere with the disposition of coadministered drugs, thereby enabling the assessment of DDI risks and guiding the optimization of chemical structures. Additionally, while this study focused on four key ABC transporters, the modeling framework presented here can be extended to other transporters, such as solute carrier (SLC) transporters, which are also critical in drug uptake, distribution, and elimination. Overall, this study reveals the potential of integrating ML models of in vitro drug properties/activities into an applicable scoring system for evaluating complex drug properties/activities in vivo.

Supplementary Material

mp5c01065_si_001.pdf (1.1MB, pdf)
mp5c01065_si_002.xlsx (458.7KB, xlsx)

Acknowledgments

This work was partially supported by the National Institute of Child Health and Human Development (Grant UC2HD113039), the National Institute of Environmental Health Sciences (Grants R01ES031080, R35ES031709, R01ES029275, and P30ES005022), and the National Center for Advancing Translational Sciences (UM1TR004789).

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.molpharmaceut.5c01065.

  • Guidelines for classifying bioactivity records and compounds; bar plots illustrating the number of bioactivity records and unique compounds retrieved for P-gp, BCRP, MRP1, and MRP2; top 10 MACCS descriptors ranked by SHAP values for the P-gp random forest (A) substrate and (B) inhibition models; top 10 MACCS descriptors ranked by SHAP values for the BCRP random forest (A) substrate and (B) inhibition models; top 10 MACCS descriptors ranked by SHAP values for the MRP1 random forest (A) substrate and (B) inhibition models; top 10 MACCS descriptors ranked by SHAP values for the MRP2 random forest (A) substrate and (B) inhibition models (PDF)

  • Details and known ABC transporter interactions for DrugBank compounds; details and ABC transporter interactions for ∼8.8k compounds across eight training sets; 5-fold cross-validation evaluation metrics for the 104 generated QSAR models; 5-fold cross-validation evaluation metrics for the 104 generated QSAR models within the applicability domain; details of the brain exposure data set along with P-gp and BCRP substrate predictions (XLSX)

N.J.D.: Data CurationOriginal Annotation & Revision, Methodology, Investigation, Validation, Formal Analysis, Visualization, WritingOriginal Draft. S.R.C.: Data CurationRevision. E.C.: Data CurationRevision. T.W.: Data CurationRevision. A.K.: WritingReview & Editing. M.G.: WritingReview & Editing. S.C.: WritingReview & Editing. N.C.K.: WritingReview & Editing. J.B.T.: WritingReview & Editing. L.M.A.: Funding Acquisition, WritingReview & Editing. H.Z.: Conceptualization, Supervision, Funding Acquisition, WritingReview & Editing.

The authors declare no competing financial interest.

References

  1. Nigam S. K. What Do Drug Transporters Really Do? Nat. Rev. Drug Discovery. 2015;14(1):29–44. doi: 10.1038/nrd4461.
  2. Shugarts S., Benet L. Z. The Role of Transporters in the Pharmacokinetics of Orally Administered Drugs. Pharm. Res. 2009;26(9):2039–2054. doi: 10.1007/s11095-009-9924-0.
  3. Fletcher J. I., Williams R. T., Henderson M. J., Norris M. D., Haber M. ABC Transporters as Mediators of Drug Resistance and Contributors to Cancer Cell Biology. Drug Resist. Updates. 2016;26:1–9. doi: 10.1016/j.drup.2016.03.001.
  4. Taskar K. S., Pilla Reddy V., Burt H., Posada M. M., Varma M., Zheng M., Ullah M., Emami Riedmaier A., Umehara K., Snoeys J., Nakakariya M., Chu X., Beneton M., Chen Y., Huth F., Narayanan R., Mukherjee D., Dixit V., Sugiyama Y., Neuhoff S. Physiologically-Based Pharmacokinetic Models for Evaluating Membrane Transporter Mediated Drug–Drug Interactions: Current Capabilities, Case Studies, Future Opportunities, and Recommendations. Clin. Pharmacol. Ther. 2020;107(5):1082–1115. doi: 10.1002/cpt.1693.
  5. DeGorter M. K., Xia C. Q., Yang J. J., Kim R. B. Drug Transporters in Drug Efficacy and Toxicity. Annu. Rev. Pharmacol. Toxicol. 2012;52:249–273. doi: 10.1146/annurev-pharmtox-010611-134529.
  6. Robey R. W., Pluchino K. M., Hall M. D., Fojo A. T., Bates S. E., Gottesman M. M. Revisiting the Role of ABC Transporters in Multidrug-Resistant Cancer. Nat. Rev. Cancer. 2018;18(7):452–464. doi: 10.1038/s41568-018-0005-8.
  7. Leslie E. M., Deeley R. G., Cole S. P. C. Multidrug Resistance Proteins: Role of P-Glycoprotein, MRP1, MRP2, and BCRP (ABCG2) in Tissue Defense. Toxicol. Appl. Pharmacol. 2005;204(3):216–237. doi: 10.1016/j.taap.2004.10.012.
  8. Galetin A., Brouwer K. L. R., Tweedie D., Yoshida K., Sjöstedt N., Aleksunes L., Chu X., Evers R., Hafey M. J., Lai Y., Matsson P., Riselli A., Shen H., Sparreboom A., Varma M. V. S., Yang J., Yang X., Yee S. W., Zamek-Gliszczynski M. J., Zhang L., Giacomini K. M. Membrane Transporters in Drug Development and as Determinants of Precision Medicine. Nat. Rev. Drug Discovery. 2024;23:255–280. doi: 10.1038/s41573-023-00877-1.
  9. Wessler J. D., Grip L. T., Mendell J., Giugliano R. P. The P-Glycoprotein Transport System and Cardiovascular Drugs. J. Am. Coll. Cardiol. 2013;61(25):2495–2502. doi: 10.1016/j.jacc.2013.02.058.
  10. Lai J.-I., Tseng Y.-J., Chen M.-H., Huang C.-Y. F., Chang P. M.-H. Clinical Perspective of FDA Approved Drugs With P-Glycoprotein Inhibition Activities for Potential Cancer Therapeutics. Front. Oncol. 2020;10:561936. doi: 10.3389/fonc.2020.561936.
  11. Coley H. M. Overcoming Multidrug Resistance in Cancer: Clinical Studies of P-Glycoprotein Inhibitors. In Multi-Drug Resistance in Cancer; Zhou J., Ed.; Humana Press: Totowa, NJ, 2010; pp 341–358. doi: 10.1007/978-1-60761-416-6_15.
  12. Kiss M., Mbasu R., Nicolaï J., Barnouin K., Kotian A., Mooij M. G., Kist N., Wijnen R. M. H., Ungell A.-L., Cutler P., Russel F. G. M., de Wildt S. N. Ontogeny of Small Intestinal Drug Transporters and Metabolizing Enzymes Based on Targeted Quantitative Proteomics. Drug Metab. Dispos. 2021;49(12):1038–1046. doi: 10.1124/dmd.121.000559.
  13. Khatri R., Fallon J. K., Rementer R. J. B., Kulick N. T., Lee C. R., Smith P. C. Targeted Quantitative Proteomic Analysis of Drug Metabolizing Enzymes and Transporters by Nano LC-MS/MS in the Sandwich Cultured Human Hepatocyte Model. J. Pharmacol. Toxicol. Methods. 2019;98:106590. doi: 10.1016/j.vascn.2019.106590.
  14. Oswald S., Müller J., Neugebauer U., Schröter R., Herrmann E., Pavenstädt H., Ciarimboli G. Protein Abundance of Clinically Relevant Drug Transporters in The Human Kidneys. Int. J. Mol. Sci. 2019;20(21):5303. doi: 10.3390/ijms20215303.
  15. Zdrazil B., Felix E., Hunter F., Manners E. J., Blackshaw J., Corbett S., de Veij M., Ioannidis H., Lopez D. M., Mosquera J. F., Magarinos M. P., Bosc N., Arcila R., Kizilören T., Gaulton A., Bento A. P., Adasme M. F., Monecke P., Landrum G. A., Leach A. R. The ChEMBL Database in 2023: A Drug Discovery Platform Spanning Multiple Bioactivity Data Types and Time Periods. Nucleic Acids Res. 2024;52(D1):D1180–D1192. doi: 10.1093/nar/gkad1004.
  16. Kim S., Chen J., Cheng T., Gindulyte A., He J., He S., Li Q., Shoemaker B. A., Thiessen P. A., Yu B., Zaslavsky L., Zhang J., Bolton E. E. PubChem 2023 Update. Nucleic Acids Res. 2023;51(D1):D1373–D1380. doi: 10.1093/nar/gkac956.
  17. Knox C., Wilson M., Klinger C. M., Franklin M., Oler E., Wilson A., Pon A., Cox J., Chin N. E. Lucy, Strawbridge S. A., Garcia-Patino M., Kruger R., Sivakumaran A., Sanford S., Doshi R., Khetarpal N., Fatokun O., Doucet D., Zubkowski A., Rayat D. Y., Jackson H., Harford K., Anjum A., Zakir M., Wang F., Tian S., Lee B., Liigand J., Peters H., Wang R. Q. Rachel, Nguyen T., So D., Sharp M., da Silva R., Gabriel C., Scantlebury J., Jasinski M., Ackerman D., Jewison T., Sajed T., Gautam V., Wishart D. S. DrugBank 6.0: The DrugBank Knowledgebase for 2024. Nucleic Acids Res. 2024;52(D1):D1265–D1275. doi: 10.1093/nar/gkad976.
  18. Mak L., Marcus D., Howlett A., Yarova G., Duchateau G., Klaffke W., Bender A., Glen R. C. Metrabase: A Cheminformatics and Bioinformatics Database for Small Molecule Transporter Data Analysis and (Q)SAR Modeling. J. Cheminf. 2015;7(1):31. doi: 10.1186/s13321-015-0083-5.
  19. Ozawa N., Shimizu T., Morita R., Yokono Y., Ochiai T., Munesada K., Ohashi A., Aida Y., Hama Y., Taki K., Maeda K., Kusuhara H., Sugiyama Y. Transporter Database, TP-Search: A Web-Accessible Comprehensive Database for Research in Pharmacokinetics of Drugs. Pharm. Res. 2004;21(11):2133–2134. doi: 10.1023/B:PHAM.0000048207.11160.d0.
  20. Morrissey K. M., Wen C. C., Johns S. J., Zhang L., Huang S.-M., Giacomini K. M. The UCSF-FDA TransPortal: A Public Drug Transporter Database. Clin. Pharmacol. Ther. 2012;92(5):545–546. doi: 10.1038/clpt.2012.44.
  21. Saier M. H. Jr, Reddy V. S., Moreno-Hagelsieb G., Hendargo K. J., Zhang Y., Iddamsetty V., Lam K. J. K., Tian N., Russum S., Wang J., Medrano-Soto A. The Transporter Classification Database (TCDB): 2021 Update. Nucleic Acids Res. 2021;49(D1):D461–D467. doi: 10.1093/nar/gkaa1004.
  22. Yin J., Chen Z., You N., Li F., Zhang H., Xue J., Ma H., Zhao Q., Yu L., Zeng S., Zhu F. VARIDT 3.0: The Phenotypic and Regulatory Variability of Drug Transporter. Nucleic Acids Res. 2024;52(D1):D1490–D1502. doi: 10.1093/nar/gkad818.
  23. Michel M. E., Wen C. C., Yee S. W., Giacomini K. M., Hamdoun A., Nicklisch S. C. T. TICBase: Integrated Resource for Data on Drug and Environmental Chemical Interactions with Mammalian Drug Transporters. Clin. Pharmacol. Ther. 2023;114(6):1293–1303. doi: 10.1002/cpt.3036.
  24. Zhao L., Ciallella H. L., Aleksunes L. M., Zhu H. Advancing Computer-Aided Drug Discovery (CADD) by Big Data and Data-Driven Machine Learning Modeling. Drug Discovery Today. 2020;25(9):1624–1638. doi: 10.1016/j.drudis.2020.07.005.
  25. Zhu H. Big Data and Artificial Intelligence Modeling for Drug Discovery. Annu. Rev. Pharmacol. Toxicol. 2020;60:573–589. doi: 10.1146/annurev-pharmtox-010919-023324.
  26. Tropsha A., Isayev O., Varnek A., Schneider G., Cherkasov A. Integrating QSAR Modelling and Deep Learning in Drug Discovery: The Emergence of Deep QSAR. Nat. Rev. Drug Discovery. 2024;23(2):141–155. doi: 10.1038/s41573-023-00832-0.
  27. Ciallella H. L., Zhu H. Advancing Computational Toxicology in the Big Data Era by Artificial Intelligence: Data-Driven and Mechanism-Driven Modeling for Chemical Toxicity. Chem. Res. Toxicol. 2019;32(4):536–547. doi: 10.1021/acs.chemrestox.8b00393.
  28. Jia X., Wang T., Zhu H. Advancing Computational Toxicology by Interpretable Machine Learning. Environ. Sci. Technol. 2023;57:17690. doi: 10.1021/acs.est.3c00653.
  29. Dudas B., Miteva M. A. Computational and Artificial Intelligence-Based Approaches for Drug Metabolism and Transport Prediction. Trends Pharmacol. Sci. 2024;45(1):39–55. doi: 10.1016/j.tips.2023.11.001.
  30. Schlessinger A., Welch M. A., van Vlijmen H., Korzekwa K., Swaan P. W., Matsson P. Molecular Modeling of Drug–Transporter Interactions - An International Transporter Consortium Perspective. Clin. Pharmacol. Ther. 2018;104(5):818–835. doi: 10.1002/cpt.1174.
  31. Ecker G. F., Stockner T., Chiba P. Computational Models for Prediction of Interactions with ABC-Transporters. Drug Discovery Today. 2008;13(7):311–317. doi: 10.1016/j.drudis.2007.12.012.
  32. Pinto M., Digles D., Ecker G. F. Computational Models for Predicting the Interaction with ABC Transporters. Drug Discovery Today: Technol. 2014;12:e69–e77. doi: 10.1016/j.ddtec.2014.03.007.
  33. AbdulHameed M. D. M., Dey S., Xu Z., Clancy B., Desai V., Wallqvist A. MONSTROUS: A Web-Based Chemical-Transporter Interaction Profiler. Front. Pharmacol. 2025;16:1498945. doi: 10.3389/fphar.2025.1498945.
  34. Matheny C. J., Lamb M. W., Brouwer K. L. R., Pollack G. M. Pharmacokinetic and Pharmacodynamic Implications of P-Glycoprotein Modulation. Pharmacotherapy: J. Hum. Pharmacol. Drug Ther. 2001;21(7):778–796. doi: 10.1592/phco.21.9.778.34558.
  35. Wang R. B., Kuo C. L., Lien L. L., Lien E. J. Structure–Activity Relationship: Analyses of p-Glycoprotein Substrates and Inhibitors. J. Clin. Pharm. Ther. 2003;28(3):203–228. doi: 10.1046/j.1365-2710.2003.00487.x.
  36. Cerruela García G., García-Pedrajas N. Boosted Feature Selectors: A Case Study on Prediction P-Gp Inhibitors and Substrates. J. Comput.-Aided Mol. Des. 2018;32(11):1273–1294. doi: 10.1007/s10822-018-0171-5.
  37. Wang P.-H., Tu Y.-S., Tseng Y. J. PgpRules: A Decision Tree Based Prediction Server for P-Glycoprotein Substrates and Inhibitors. Bioinformatics. 2019;35(20):4193–4195. doi: 10.1093/bioinformatics/btz213.
  38. Kadioglu O., Efferth T. A Machine Learning-Based Prediction Platform for P-Glycoprotein Modulators and Its Validation by Molecular Docking. Cells. 2019;8(10):1286. doi: 10.3390/cells8101286.
  39. Hinge V. K., Roy D., Kovalenko A. Prediction of P-Glycoprotein Inhibitors with Machine Learning Classification Models and 3D-RISM-KH Theory Based Solvation Energy Descriptors. J. Comput.-Aided Mol. Des. 2019;33(11):965–971. doi: 10.1007/s10822-019-00253-5.
  40. Ohashi R., Watanabe R., Esaki T., Taniguchi T., Torimoto-Katori N., Watanabe T., Ogasawara Y., Takahashi T., Tsukimoto M., Mizuguchi K. Development of Simplified in Vitro P-Glycoprotein Substrate Assay and in Silico Prediction Models To Evaluate Transport Potential of P-Glycoprotein. Mol. Pharmaceutics. 2019;16(5):1851–1863. doi: 10.1021/acs.molpharmaceut.8b01143.
  41. Adachi A., Yamashita T., Kanaya S., Kosugi Y.. Ensemble Machine Learning Approaches Based on Molecular Descriptors and Graph Convolutional Networks for Predicting the Efflux Activities of MDR1 and BCRP Transporters. AAPS J. 2023;25(5):88. doi: 10.1208/s12248-023-00853-y. [DOI] [PubMed] [Google Scholar]
  42. Xia M., Fang Y., Cao W., Liang F., Pan S., Xu X.. Quantitative Structure–Activity Relationships for the Flavonoid-Mediated Inhibition of P-Glycoprotein in KB/MDR1 Cells. Molecules. 2019;24(9):1661. doi: 10.3390/molecules24091661. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Lahyaoui M., Diane A., El-Idrissi H., Saffaj T., Rodi Y. K., Ihssane B.. QSAR Modeling and Molecular Docking Studies of 2-Oxo-1,2-Dihydroquinoline-4-Carboxylic Acid Derivatives as p-Glycoprotein Inhibitors for Combating Cancer Multidrug Resistance. Heliyon. 2023;9(1):e13020. doi: 10.1016/j.heliyon.2023.e13020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Broccatelli F., Carosati E., Neri A., Frosini M., Goracci L., Oprea T. I., Cruciani G.. A Novel Approach for Predicting P-Glycoprotein (ABCB1) Inhibition Using Molecular Interaction Fields. J. Med. Chem. 2011;54(6):1740–1751. doi: 10.1021/jm101421d. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Chen L., Li Y., Zhao Q., Peng H., Hou T.. ADME Evaluation in Drug Discovery. 10. Predictions of P-Glycoprotein Inhibitors Using Recursive Partitioning and Naive Bayesian Classification Techniques. Mol. Pharmaceutics. 2011;8(3):889–900. doi: 10.1021/mp100465q. [DOI] [PubMed] [Google Scholar]
  46. Broccatelli F.. QSAR Models for P-Glycoprotein Transport Based on a Highly Consistent Data Set. J. Chem. Inf. Model. 2012;52(9):2462–2470. doi: 10.1021/ci3002809. [DOI] [PubMed] [Google Scholar]
  47. Poongavanam V., Haider N., Ecker G. F.. Fingerprint-Based in Silico Models for the Prediction of P-Glycoprotein Substrates and Inhibitors. Bioorg. Med. Chem. 2012;20(18):5388–5395. doi: 10.1016/j.bmc.2012.03.045. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Sedykh A., Fourches D., Duan J., Hucke O., Garneau M., Zhu H., Bonneau P., Tropsha A.. Human Intestinal Transporter Database: QSAR Modeling and Virtual Profiling of Drug Uptake, Efflux and Interactions. Pharm. Res. 2013;30(4):996–1007. doi: 10.1007/s11095-012-0935-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Montanari F., Ecker G. F.. BCRP Inhibition: From Data Collection to Ligand-Based Modeling. Mol. Inf. 2014;33(5):322–331. doi: 10.1002/minf.201400012. [DOI] [PubMed] [Google Scholar]
  50. Wang W., Kim M. T., Sedykh A., Zhu H.. Developing Enhanced Blood–Brain Barrier Permeability Models: Integrating External Bio-Assay Data in QSAR Modeling. Pharm. Res. 2015;32(9):3055–3065. doi: 10.1007/s11095-015-1687-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Ciallella, H. L.; Chung, E.; Russo, D. P.; Zhu, H. Automatic Quantitative Structure–Activity Relationship Modeling to Fill Data Gaps in High-Throughput Screening. In High-Throughput Screening Assays in Toxicology; Zhu, H.; Xia, M., Eds.; Methods in Molecular Biology; Humana: New York, NY, 2022; Vol. 2474, pp 169–187. doi: 10.1007/978-1-0716-2213-1_16. [DOI] [PubMed] [Google Scholar]
  52. Fridén M., Winiwarter S., Jerndal G., Bengtsson O., Wan H., Bredberg U., Hammarlund-Udenaes M., Antonsson M.. Structure–Brain Exposure Relationships in Rat and Human Using a Novel Data Set of Unbound Drug Concentrations in Brain Interstitial and Cerebrospinal Fluids. J. Med. Chem. 2009;52(20):6233–6243. doi: 10.1021/jm901036q. [DOI] [PubMed] [Google Scholar]
  53. Rogers D., Hahn M.. Extended-Connectivity Fingerprints. J. Chem. Inf. Model. 2010;50(5):742–754. doi: 10.1021/ci100050t. [DOI] [PubMed] [Google Scholar]
  54. Morgan H. L.. The Generation of a Unique Machine Description for Chemical Structures-A Technique Developed at Chemical Abstracts Service. J. Chem. Doc. 1965;5(2):107–113. doi: 10.1021/c160017a018. [DOI] [Google Scholar]
  55. Ciallella H. L., Russo D. P., Aleksunes L. M., Grimm F. A., Zhu H.. Predictive Modeling of Estrogen Receptor Agonism, Antagonism, and Binding Activities Using Machine- and Deep-Learning Approaches. Lab. Invest. 2021;101(4):490–502. doi: 10.1038/s41374-020-00477-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Chung E., Russo D. P., Ciallella H. L., Wang Y.-T., Wu M., Aleksunes L. M., Zhu H.. Data-Driven Quantitative Structure–Activity Relationship Modeling for Human Carcinogenicity by Chronic Oral Exposure. Environ. Sci. Technol. 2023;57(16):6573–6588. doi: 10.1021/acs.est.3c00648. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Daood N. J., Russo D. P., Chung E., Qin X., Zhu H.. Predicting Chemical Immunotoxicity through Data-Driven QSAR Modeling of Aryl Hydrocarbon Receptor Agonism and Related Toxicity Mechanisms. Environ. Health. 2024;2(7):474–485. doi: 10.1021/envhealth.4c00026. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Hornik K., Stinchcombe M., White H.. Multilayer Feedforward Networks Are Universal Approximators. Neural Netw. 1989;2(5):359–366. doi: 10.1016/0893-6080(89)90020-8. [DOI] [Google Scholar]
  59. Breiman L.. Random Forests. Mach. Learn. 2001;45:5–32. doi: 10.1023/A:1010933404324. [DOI] [Google Scholar]
  60. Cortes C., Vapnik V.. Support-Vector Networks. Mach. Learn. 1995;20:273–297. doi: 10.1007/BF00994018. [DOI] [Google Scholar]
  61. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; KDD ’16; Association for Computing Machinery: New York, NY, USA, 2016; pp 785–794. doi: 10.1145/2939672.2939785. [DOI] [Google Scholar]
  62. Korotcov A., Tkachenko V., Russo D. P., Ekins S.. Comparison of Deep Learning With Multiple Machine Learning Methods and Metrics Using Diverse Drug Discovery Data Sets. Mol. Pharmaceutics. 2017;14(12):4462–4475. doi: 10.1021/acs.molpharmaceut.7b00578. [DOI] [PMC free article] [PubMed] [Google Scholar]
  63. Russo D. P., Zorn K. M., Clark A. M., Zhu H., Ekins S.. Comparing Multiple Machine Learning Algorithms and Metrics for Estrogen Receptor Binding Prediction. Mol. Pharmaceutics. 2018;15(10):4361–4370. doi: 10.1021/acs.molpharmaceut.8b00546. [DOI] [PMC free article] [PubMed] [Google Scholar]
  64. Zhang L., Fourches D., Sedykh A., Zhu H., Golbraikh A., Ekins S., Clark J., Connelly M. C., Sigal M., Hodges D., Guiguemde A., Guy R. K., Tropsha A.. Discovery of Novel Antimalarial Compounds Enabled by QSAR-Based Virtual Screening. J. Chem. Inf. Model. 2013;53(2):475–492. doi: 10.1021/ci300421n. [DOI] [PMC free article] [PubMed] [Google Scholar]
  65. Kim M. T., Huang R., Sedykh A., Wang W., Xia M., Zhu H.. Mechanism Profiling of Hepatotoxicity Caused by Oxidative Stress Using Antioxidant Response Element Reporter Gene Assay Models and Big Data. Environ. Health Perspect. 2016;124(5):634–641. doi: 10.1289/ehp.1509763. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Kim M. T., Sedykh A., Chakravarti S. K., Saiakhov R. D., Zhu H.. Critical Evaluation of Human Oral Bioavailability for Pharmaceutical Drugs by Using Various Cheminformatics Approaches. Pharm. Res. 2014;31(4):1002–1014. doi: 10.1007/s11095-013-1222-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  67. Lundberg, S.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. arXiv, November 25, 2017. doi: 10.48550/arXiv.1705.07874. [DOI]
  68. Rankovic Z.. CNS Drug Design: Balancing Physicochemical Properties for Optimal Brain Exposure. J. Med. Chem. 2015;58(6):2584–2608. doi: 10.1021/jm501535r. [DOI] [PubMed] [Google Scholar]
  69. Gupta M., Lee H. J., Barden C. J., Weaver D. F.. The Blood–Brain Barrier (BBB) Score. J. Med. Chem. 2019;62(21):9824–9836. doi: 10.1021/acs.jmedchem.9b01220. [DOI] [PubMed] [Google Scholar]
  70. The UniProt Consortium; Bateman A., Martin M.-J., Orchard S.. et al. UniProt: The Universal Protein Knowledgebase in 2025. Nucleic Acids Res. 2024;53(D1):D609–D617. doi: 10.1093/nar/gkae1010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. M12 Drug Interaction Studies; Food and Drug Administration (FDA), 2024. https://www.fda.gov/media/161199/download. [Google Scholar]
  72. Cisar J. S., Weber O. D., Clapper J. R., Blankman J. L., Henry C. L., Simon G. M., Alexander J. P., Jones T. K., Ezekowitz R. A. B., O’Neill G. P., Grice C. A.. Identification of ABX-1431, a Selective Inhibitor of Monoacylglycerol Lipase and Clinical Candidate for Treatment of Neurological Disorders. J. Med. Chem. 2018;61(20):9062–9084. doi: 10.1021/acs.jmedchem.8b00951. [DOI] [PubMed] [Google Scholar]
  73. Yamazaki M., Neway W. E., Ohe T., Chen I.-W., Rowe J. F., Hochman J. H., Chiba M., Lin J. H.. In Vitro Substrate Identification Studies for P-Glycoprotein-Mediated Transport: Species Difference and Predictability of in Vivo Results. J. Pharmacol. Exp. Ther. 2001;296(3):723–735. doi: 10.1016/S0022-3565(24)38809-3. [DOI] [PubMed] [Google Scholar]
  74. Bahadduri, P. M.; Polli, J. E.; Swaan, P. W.; Ekins, S. Targeting Drug Transporters – Combining In Silico and In Vitro Approaches to Predict In Vivo. In Membrane Transporters in Drug Discovery and Development: Methods and Protocols; Yan, Q., Ed.; Methods in Molecular Biology; Humana Press: Totowa, NJ, 2010; pp 65–103. doi: 10.1007/978-1-60761-700-6_4. [DOI] [PubMed] [Google Scholar]
  75. Polli J. W., Wring S. A., Humphreys J. E., Huang L., Morgan J. B., Webster L. O., Serabjit-Singh C. S.. Rational Use of in Vitro P-Glycoprotein Assays in Drug Discovery. J. Pharmacol. Exp. Ther. 2001;299(2):620–628. doi: 10.1016/S0022-3565(24)29270-3. [DOI] [PubMed] [Google Scholar]
  76. Morgan R. E., van Staden C. J., Chen Y., Kalyanaraman N., Kalanzi J., Dunn R. T. II, Afshari C. A., Hamadeh H. K.. A Multifactorial Approach to Hepatobiliary Transporter Assessment Enables Improved Therapeutic Compound Development. Toxicol. Sci. 2013;136(1):216–241. doi: 10.1093/toxsci/kft176. [DOI] [PubMed] [Google Scholar]
  77. Han Y. H., Kato Y., Haramura M., Ohta M., Matsuoka H., Sugiyama Y.. Physicochemical Parameters Responsible for the Affinity of Methotrexate Analogs for Rat Canalicular Multispecific Organic Anion Transporter (cMOAT/MRP2). Pharm. Res. 2001;18(5):579–586. doi: 10.1023/A:1011064806507. [DOI] [PubMed] [Google Scholar]
  78. Bakos É., Evers R., Sinkó E., Váradi A., Borst P., Sarkadi B.. Interactions of the Human Multidrug Resistance Proteins MRP1 and MRP2 with Organic Anions. Mol. Pharmacol. 2000;57(4):760–768. doi: 10.1016/S0026-895X(24)26477-4. [DOI] [PubMed] [Google Scholar]
  79. Horikawa M., Kato Y., Tyson C. A., Sugiyama Y.. The Potential for an Interaction between MRP2 (ABCC2) and Various Therapeutic Agents: Probenecid as a Candidate Inhibitor of the Biliary Excretion of Irinotecan Metabolites. Drug Metab. Pharmacokinet. 2002;17(1):23–33. doi: 10.2133/dmpk.17.23. [DOI] [PubMed] [Google Scholar]
  80. Leier I., Jedlitschky G., Buchholz U., Cole S. P., Deeley R. G., Keppler D.. The MRP Gene Encodes an ATP-Dependent Export Pump for Leukotriene C4 and Structurally Related Conjugates. J. Biol. Chem. 1994;269(45):27807–27810. doi: 10.1016/S0021-9258(18)46856-1. [DOI] [PubMed] [Google Scholar]
  81. Krapf M. K., Gallus J., Vahdati S., Wiese M.. New Inhibitors of Breast Cancer Resistance Protein (ABCG2) Containing a 2,4-Disubstituted Pyridopyrimidine Scaffold. J. Med. Chem. 2018;61(8):3389–3408. doi: 10.1021/acs.jmedchem.7b01012. [DOI] [PubMed] [Google Scholar]
  82. Shaikh N., Sharma M., Garg P.. Selective Fusion of Heterogeneous Classifiers for Predicting Substrates of Membrane Transporters. J. Chem. Inf. Model. 2017;57(3):594–607. doi: 10.1021/acs.jcim.6b00508. [DOI] [PubMed] [Google Scholar]
  83. Jiang D., Lei T., Wang Z., Shen C., Cao D., Hou T.. ADMET Evaluation in Drug Discovery. 20. Prediction of Breast Cancer Resistance Protein Inhibition through Machine Learning. J. Cheminf. 2020;12(1):16. doi: 10.1186/s13321-020-00421-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  84. Pedersen J. M., Matsson P., Bergström C. A. S., Norinder U., Hoogstraate J., Artursson P.. Prediction and Identification of Drug Interactions with the Human ATP-Binding Cassette Transporter Multidrug-Resistance Associated Protein 2 (MRP2; ABCC2). J. Med. Chem. 2008;51(11):3275–3287. doi: 10.1021/jm7015683. [DOI] [PubMed] [Google Scholar]
  85. Golbraikh A., Muratov E., Fourches D., Tropsha A.. Data Set Modelability by QSAR. J. Chem. Inf. Model. 2014;54(1):1–4. doi: 10.1021/ci400572x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Hazai E., Hazai I., Ragueneau-Majlessi I., Chung S. P., Bikadi Z., Mao Q.. Predicting Substrates of the Human Breast Cancer Resistance Protein Using a Support Vector Machine Method. BMC Bioinf. 2013;14(1):130. doi: 10.1186/1471-2105-14-130. [DOI] [PMC free article] [PubMed] [Google Scholar]
  87. Fung K. L., Gottesman M. M.. A Synonymous Polymorphism in a Common MDR1 (ABCB1) Haplotype Shapes Protein Function. Biochim. Biophys. Acta, Proteins Proteomics. 2009;1794(5):860–871. doi: 10.1016/j.bbapap.2009.02.014. [DOI] [PMC free article] [PubMed] [Google Scholar]
  88. Nagy H., Goda K., Fenyvesi F., Bacsó Z., Szilasi M., Kappelmayer J., Lustyik G., Cianfriglia M., Szabó G.. Distinct Groups of Multidrug Resistance Modulating Agents Are Distinguished by Competition of P-Glycoprotein-Specific Antibodies. Biochem. Biophys. Res. Commun. 2004;315(4):942–949. doi: 10.1016/j.bbrc.2004.01.156. [DOI] [PubMed] [Google Scholar]
  89. Desai P. V., Raub T. J., Blanco M.-J.. How Hydrogen Bonds Impact P-Glycoprotein Transport and Permeability. Bioorg. Med. Chem. Lett. 2012;22(21):6540–6548. doi: 10.1016/j.bmcl.2012.08.059. [DOI] [PubMed] [Google Scholar]
  90. Boumendjel A., Baubichon-Cortay H., Trompier D., Perrotton T., Di Pietro A.. Anticancer Multidrug Resistance Mediated by MRP1: Recent Advances in the Discovery of Reversal Agents. Med. Res. Rev. 2005;25(4):453–472. doi: 10.1002/med.20032. [DOI] [PubMed] [Google Scholar]
  91. Xing L., Hu Y., Lai Y.. Advancement of Structure-Activity Relationship of Multidrug Resistance-Associated Protein 2 Interactions. AAPS J. 2009;11(3):406. doi: 10.1208/s12248-009-9117-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  92. Wager T. T., Hou X., Verhoest P. R., Villalobos A.. Moving beyond Rules: The Development of a Central Nervous System Multiparameter Optimization (CNS MPO) Approach To Enable Alignment of Druglike Properties. ACS Chem. Neurosci. 2010;1(6):435–449. doi: 10.1021/cn100008c. [DOI] [PMC free article] [PubMed] [Google Scholar]
  93. Lawrenz M., Svensson M., Kato M., Dingley K. H., Chief Elk J., Nie Z., Zou Y., Kaplan Z., Lagiakos H. R., Igawa H., Therrien E.. A Computational Physics-Based Approach to Predict Unbound Brain-to-Plasma Partition Coefficient, Kp,Uu. J. Chem. Inf. Model. 2023;63(12):3786–3798. doi: 10.1021/acs.jcim.3c00150. [DOI] [PubMed] [Google Scholar]
  94. Pardridge W. M.. CNS Drug Design Based on Principles of Blood-Brain Barrier Transport. J. Neurochem. 1998;70(5):1781–1792. doi: 10.1046/j.1471-4159.1998.70051781.x. [DOI] [PubMed] [Google Scholar]
  95. Schaefer C. P., Tome M. E., Davis T. P.. The Opioid Epidemic: A Central Role for the Blood Brain Barrier in Opioid Analgesia and Abuse. Fluids Barriers CNS. 2017;14(1):32. doi: 10.1186/s12987-017-0080-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  96. Perkins R. S., Davis A., Campagne O., Owens T. S., Stewart C. F.. CNS Penetration of Methotrexate and Its Metabolite 7-Hydroxymethotrexate in Mice Bearing Orthotopic Group 3 Medulloblastoma Tumors and Model-Based Simulations for Children. Drug Metab. Pharmacokinet. 2023;48:100471. doi: 10.1016/j.dmpk.2022.100471. [DOI] [PubMed] [Google Scholar]
  97. Kang X., Chen F., Yang S.-B., Wang Y.-L., Qian Z.-H., Li Y., Lin H., Li P., Peng Y.-C., Wang X.-M., Li W.-B.. Intrathecal Methotrexate in Combination with Systemic Chemotherapy in Glioblastoma Patients with Leptomeningeal Dissemination: A Retrospective Analysis. World J. Clin. Cases. 2022;10(17):5595–5605. doi: 10.12998/wjcc.v10.i17.5595. [DOI] [PMC free article] [PubMed] [Google Scholar]

Supplementary Materials

mp5c01065_si_001.pdf (1.1MB, pdf)
mp5c01065_si_002.xlsx (458.7KB, xlsx)

Articles from Molecular Pharmaceutics are provided here courtesy of American Chemical Society
