Author manuscript; available in PMC: 2025 Apr 22.
Published in final edited form as: J Chem Inf Model. 2024 Mar 26;64(8):3161–3172. doi: 10.1021/acs.jcim.4c00397

Sequential Contrastive and Deep Learning Models to Identify Selective Butyrylcholinesterase Inhibitors

Mustafa Kemal Ozalp 1,#, Patricia A Vignaux 1,#, Ana C Puhl 1, Thomas R Lane 1, Fabio Urbina 1, Sean Ekins 1,*
PMCID: PMC11331448  NIHMSID: NIHMS2014714  PMID: 38532612

Abstract

Butyrylcholinesterase (BChE) is a target of interest in late-stage Alzheimer’s Disease (AD), where selective BChE inhibitors (BIs) may offer symptomatic treatment without the harsh side effects of acetylcholinesterase (AChE) inhibitors. In this study, we explore multiple machine learning strategies to identify BIs in silico, optimizing for precision over all other metrics. We compare state-of-the-art supervised contrastive learning (CL) with deep learning (DL) and Random Forest (RF) machine learning, across single and sequential modeling configurations, to identify the best models for BChE selectivity. We used these models to virtually screen a vendor library of 5 million compounds for BIs and tested 20 of these compounds in vitro. Seven of the 20 compounds displayed selectivity for BChE over AChE, reflecting a hit rate of 35% for our model predictions and suggesting a highly efficient strategy for modeling selective inhibition.

Graphical Abstract


INTRODUCTION

Acetylcholinesterase (AChE) is the major enzyme that catalyzes the hydrolysis of acetylcholine (ACh) to terminate neuronal transmission and signaling between synapses of cholinergic neurons1. The closely related butyrylcholinesterase (BChE) can also hydrolyze ACh, albeit less efficiently, along with a wide variety of other choline and non-choline esters2–4. As a pharmaceutical target, BChE has been largely neglected in favor of AChE; however, accumulating scientific evidence suggests that BChE may have several neuronal and non-neuronal roles in the central nervous system due to the wide distribution of neuronal BChE in certain parts of the brain5, 6. BChE inhibition also results in an increase in ACh levels in the brain, indicating a regulatory function in the neuronal hydrolysis of ACh7. Consequently, BChE inhibitors have more recently been suggested as a possible treatment during the later stages of Alzheimer’s disease (AD), when the AChE-to-BChE ratio in the brain changes and BChE may have a greater effect on levels of neuronal ACh8, 9. BChE inhibitors also have the potential for fewer side effects than AChE inhibitors due to the important roles of AChE and ACh at the neuromuscular junction10, 11. The AChE inhibitors donepezil and galantamine, as well as the dual AChE and BChE inhibitor rivastigmine, are all approved drugs despite known gastrointestinal (GI) side effects12.

Identification of selective BChE inhibitors (BIs) has therefore become an attractive goal, but it is challenging due to the structural similarities between AChE and BChE. The two enzymes display nearly 65% amino acid sequence homology and a similar tertiary structure, with a peripheral anionic site flanking a deep gorge leading to the catalytic active site with a shared serine amino acid residue4, 5, 13–15. Recent studies have used molecular docking to find selective BChE inhibitors in silico16–19, taking advantage of six aromatic residues in the gorge of AChE that are important for AChE-ligand binding but missing in the BChE gorge20, 21. These structure-based virtual screens, while effective, require the 3D crystal structures of both AChE and BChE to define structure-activity relationships of lead compounds.

A different strategy for inhibitor screening in silico involves the use of machine learning (ML) models to classify compounds as active or inactive based on molecular descriptors of the compounds themselves, without any knowledge of the target enzyme structure. Our group and others have used this strategy to find inhibitors for both AChE22, 23 and BChE24, but only recently have ML models been used to find selective inhibitors of BChE25. Xu et al. used the classical ML algorithms Random Forest (RF), XGBoost (XGB), Naïve Bayes (NB), and Support Vector Machine (SVM), along with a feed-forward neural network (NNET) algorithm, to build individual models for AChE and BChE inhibition, as well as selective inhibition models for each25. The optimal ML models were used to virtually screen an in-house NCATS library of 360,000 compounds, and the RF model for selective BChE inhibition displayed a hit rate of 44% in follow-up in vitro assays.

We have now employed a state-of-the-art ML algorithm to find selective BChE inhibitors and compared its efficacy to more classical ML methods. Contrastive learning (CL) is a machine learning approach originally developed for unsupervised learning tasks, classifying objects as “positive” or “negative” based on similarities and differences that exist within the data, instead of known labels placed on the objects beforehand26, 27. Supervised CL uses the same learning schema but allows for labeling of the data in order to generate a more accurate representation of data points and provide better classification accuracy28. Variations of supervised CL have been applied to drug-disease29, drug-target30–32 and multi-type drug-drug33 interaction problems in the drug discovery field, as well as many other areas of cheminformatics34–42. In the current study, we created a supervised CL model to predict selective BChE inhibition. We compared this model’s performance to that of a deep learning (DL) model composed of a Long Short-term memory43, 44 (LSTM) module and a multilayer perceptron45 (MLP), and to an RF model. We also investigated two different model configurations, comparing the use of a single model versus sequential models to predict selective BChE inhibition. We then screened a virtual library of over 5 million compounds for BIs and performed follow-up testing of 20 compounds in vitro. Seven of these compounds inhibited BChE selectively, reflecting a hit rate of 35% across the three different models. The sequential CL and DL models displayed hit rates of 37.5% and 43%, respectively, while the RF model’s hit rate was 20%. While this sample size is likely too small to make generalizations about the superiority of any given ML algorithm, the study as a whole describes an efficient strategy for the identification of selective BChE inhibitors.

EXPERIMENTAL PROCEDURES

Data Sets

AChE IC50 curation methods were described previously22. Human BChE (CHEMBL1914) IC50 data were downloaded from ChEMBL22 and subjected to the same protocol to remove non-human and duplicate data, standardize compound SMILES, and binarize activity at a threshold of 1 μM22. The two datasets were then compared using InChIKeys for shared molecules, yielding a common dataset of 2027 compounds with IC50 data for both AChE and BChE. Using an IC50 threshold of ≤ 1 μM for active inhibitors, these compounds were then classified as selective AChE inhibitors (AIs), selective BChE inhibitors (BIs), dual inhibitors (DIs), or non-inhibitors (NIs). This common dataset was divided into training and validation sets (80:20 ratio) using the stratified split function from the scikit-learn library in Python.
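For readers who want to reproduce the classification and split logic, the following is a minimal sketch assuming a pandas DataFrame; the file and column names (ache_bche_common.csv, ache_ic50_nm, bche_ic50_nm) are hypothetical, not the curation code itself.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical file/column names; one row per compound with IC50 values (nM)
# for both enzymes after the ChEMBL curation steps described above.
df = pd.read_csv("ache_bche_common.csv")

THRESHOLD_NM = 1000  # 1 uM activity cutoff

df["ache_active"] = df["ache_ic50_nm"] <= THRESHOLD_NM
df["bche_active"] = df["bche_ic50_nm"] <= THRESHOLD_NM

def classify(row):
    if row["ache_active"] and row["bche_active"]:
        return "DI"   # dual inhibitor
    if row["ache_active"]:
        return "AI"   # selective AChE inhibitor
    if row["bche_active"]:
        return "BI"   # selective BChE inhibitor
    return "NI"       # non-inhibitor

df["category"] = df.apply(classify, axis=1)

# 80:20 split, stratified on the four categories
train, valid = train_test_split(
    df, test_size=0.2, stratify=df["category"], random_state=42
)
```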

Computational Resources and Environments

We trained the DL and CL models on a Lambda workstation (Lambda Inc., San Jose, CA) with an AMD Ryzen Threadripper 3960X 24-core, 48-thread CPU, 4×32 GB DDR4 3200 MHz RAM, and 3× NVIDIA GeForce RTX 3090 GPUs with CUDA 12.2, running the Ubuntu 20.04.6 LTS operating system (OS), Anaconda 23.7.4, and Python 3.10.12. The additional packages we installed in the conda environment were PyTorch 2.0.0, pytorch-cuda 12.1, pandas 2.0.3, hydra-core 1.3.2, scikit-learn 1.3.0, scipy 1.11.3, numpy 1.24.3, ipykernel 6.25.2, ipython 8.16.1, joblib 1.3.2, pyarrow 12.0.1, matplotlib 3.7.2, seaborn 0.13.0, and their dependencies. We performed model optimization and experimentation in Python using DVC 3.22.1. The Random Forest (RF) models were trained in our proprietary software Assay Central (AC).

Overview of Machine Learning Model

We used a DL model consisting of an LSTM module to create learned embeddings of molecules and an MLP module to categorize compounds (Figure 1). LSTM44 is a recurrent neural network (RNN) architecture capable of learning relationships between the elements of a sequence43. LSTM models are often preferred in drug discovery studies because molecules are transformed into SMILES strings or molecular fingerprints, both of which carry sequential information. MLPs45 are universal function approximators used for both classification and regression tasks. The LSTM network has one hidden layer with 256 hidden features. The network takes 1024-bit extended connectivity fingerprints (ECFP) generated with radius 3 (ECFP6). The LSTM output passes through a rectified linear unit (ReLU) activation function followed by a dropout layer with probability p = 0.5. The learned embeddings are sent to the MLP module, a 3-layer feed-forward neural network with ReLU activation functions. The network outputs logits of size 2, one for each category (active or inactive).
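As an illustration, the PyTorch sketch below wires up an LSTM embedder and a 3-layer MLP head matching the description above. The text does not spell out how the 1024-bit fingerprint is fed to the LSTM, so treating it as a length-1024 sequence of one-bit tokens, the MLP hidden width of 128, and the class names are all assumptions.

```python
import torch
import torch.nn as nn

class LSTMEmbedder(nn.Module):
    """f: maps an ECFP6 fingerprint to a learned embedding r.

    The 1024-bit fingerprint is treated here as a sequence of 1024
    one-dimensional tokens; this is one plausible reading of the text.
    """
    def __init__(self, hidden_size: int = 256):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden_size,
                            num_layers=1, batch_first=True)
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, fp: torch.Tensor) -> torch.Tensor:
        # fp: (batch, 1024) float tensor of 0/1 bits
        seq = fp.unsqueeze(-1)            # (batch, 1024, 1)
        _, (h_n, _) = self.lstm(seq)      # h_n: (1, batch, hidden)
        r = torch.relu(h_n.squeeze(0))    # ReLU on the final hidden state
        return self.dropout(r)            # dropout with p = 0.5

class MLPHead(nn.Module):
    """g: 3-layer feed-forward classifier producing 2 logits."""
    def __init__(self, in_dim: int = 256, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),     # logits for active / inactive
        )

    def forward(self, r: torch.Tensor) -> torch.Tensor:
        return self.net(r)

model = nn.Sequential(LSTMEmbedder(), MLPHead())
logits = model(torch.randint(0, 2, (8, 1024)).float())  # (8, 2)
```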

Figure 1.

Simplified schematics of the deep and contrastive learning model architectures. For the DL model, f is the LSTM network creating the learned embeddings, r is the learned embeddings, g is the MLP module, and ŷ and y are the predicted and true labels for each compound. For the CL model, f is the LSTM network and r is the learned embeddings. The LSTM network generating the learned embeddings was trained using the contrastive learning loss at step a. The training of the MLP module, represented by g, followed the training of the LSTM network. Negative log-likelihood loss was used to train the MLP module.

The original applications of contrastive learning were in self-supervised learning of image categories26, 27. These studies used combinations of simple neural network architectures, such as a convolutional neural network (CNN) and an MLP. The key elements of the CL architecture were the contrastive learning loss (CLL) and the preprocessing and feeding of the data to the embedding-creation module. The contrastive loss is, in essence, a negative log likelihood (NLL) loss. It uses the softmax function to convert the similarity values of image embeddings into probabilities, with similarity measured by cosine similarity (the normalized inner product). The numerator contains the similarity between positive class examples, while the denominator contains the similarity of the positive class examples to the negative class examples. Therefore, in each epoch, the model tries to maximize the similarity between the positive examples and minimize the similarity between the positive and negative examples.

$$\ell(i,j) = -\log \frac{\exp(s_{i,j}/\tau)}{\sum_{k=1}^{2N} \mathbb{1}_{[k \neq i]} \exp(s_{i,k}/\tau)} \quad (1)$$

$$\mathcal{L} = \frac{1}{2N} \sum_{k=1}^{N} \left[\ell(2k-1, 2k) + \ell(2k, 2k-1)\right] \quad (2)$$

Here $\ell(i,j)$ is the negative log likelihood loss for a positive image pair of examples with cosine similarity $s_{i,j}$, where $i, j \in \{1, \ldots, 2N\}$. $\mathbb{1}_{[k \neq i]} \in \{0, 1\}$ is an indicator function evaluating to 1 if and only if $k \neq i$, and $\tau$ denotes a temperature parameter. The final loss, $\mathcal{L}$, is computed across all positive pairs, both $(i,j)$ and $(j,i)$, in a mini-batch. Because the original applications of contrastive learning dealt with images in a self-supervised fashion, we had to modify the contrastive loss to use it for molecules in a supervised learning schema. To do that, we used the contrastive loss, similar to Li, Qiao and Wang30, as follows:

$$\mathcal{L}_{CL} = -\sum_{i=1}^{N_{PE}} \log \frac{\sum_{j \in P_i} \exp(s_{i,j}/\tau)}{\sum_{k=1}^{N} \mathbb{1}_{[k \neq i]} \exp(s_{i,k}/\tau)} \quad (3)$$

Here, $N_{PE}$ is the number of positive examples in a batch and $P_i$ is the set of positive examples paired with example $i$. The modified contrastive loss generalizes the problem to the entire positive class. Our contrastive learning model utilizes an LSTM module to create learned embeddings and an MLP for classification. The CLL is used specifically to train the embeddings module. We used NLL to train the MLP module.
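A minimal PyTorch sketch of the modified loss in Eq. (3) is given below; it is a plausible rendering rather than the exact training code, and batches are assumed to contain at least two positive examples so that each $P_i$ is non-empty.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.5) -> torch.Tensor:
    """Sketch of the modified contrastive loss in Eq. (3).

    embeddings: (N, D) learned embeddings for one mini-batch
    labels:     (N,) binary labels, where 1 marks the positive class
    """
    z = F.normalize(embeddings, dim=1)   # unit vectors, so z @ z.T is cosine similarity
    sim = z @ z.T / temperature          # (N, N) temperature-scaled similarities
    n = z.size(0)
    not_self = ~torch.eye(n, dtype=torch.bool, device=z.device)

    exp_sim = torch.exp(sim) * not_self  # zero out the s_{i,i} terms
    pos_mask = (labels.unsqueeze(0) == 1) & (labels.unsqueeze(1) == 1) & not_self

    # Eq. (3) sums -log(ratio) over the positive examples; averaging over
    # N_PE would be a common variant.
    loss = embeddings.new_zeros(())
    for i in torch.nonzero(labels == 1).flatten():
        numer = (exp_sim[i] * pos_mask[i]).sum()  # positives paired with i
        denom = exp_sim[i].sum()                  # all k != i
        loss = loss - torch.log(numer / denom)
    return loss
```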

Random Forest46 is an ensemble learning method that aggregates the predictions of an ensemble of decision trees. RF models are still widely used in the drug discovery field due to their high classification accuracy and interpretability, and RF is often used as a benchmark when developing ML models, such as one-shot learning47. We trained seven classical ML models, including RF, on AChE and BChE in AC and compared model scores (see Supporting Information for all model scores). The RF model yielded the best results, so we selected it as our baseline model.
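Since AC is proprietary, a scikit-learn random forest on the same ECFP6 features can serve as a stand-in for the baseline; the sketch below uses synthetic placeholder data, and the settings (n_estimators=500) are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, average_precision_score

rng = np.random.default_rng(0)
X_train = rng.integers(0, 2, (1600, 1024))   # placeholder ECFP6 bit matrix
y_train = rng.integers(0, 2, 1600)           # placeholder activity labels
X_valid = rng.integers(0, 2, (400, 1024))
y_valid = rng.integers(0, 2, 400)

# Stand-in for the Assay Central workflow: a plain random forest trained
# on fingerprint bits with binary activity labels.
rf = RandomForestClassifier(n_estimators=500, random_state=42)
rf.fit(X_train, y_train)

probs = rf.predict_proba(X_valid)[:, 1]
print("precision:", precision_score(y_valid, probs > 0.5))
print("average precision:", average_precision_score(y_valid, probs))
```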

Training Procedures for Machine Learning Models

Random Forest Model.

The RF models were trained and optimized using nested, 5-fold cross-validation in AC. The models were validated on the validation set using the label corresponding to the model type (AChE, BChE, sBChE, or esBChE). The details of the model training and optimization can be found in the supplements of Lane et al.48

Deep Learning Model.

The deep learning model was trained using the Adam optimization algorithm. The learning rate, lr, started at 0.001 and was reduced by a factor of 10 as the loss plateaued. To control the learning rate decrease, we used the ReduceLROnPlateau scheduler of PyTorch. The training process stopped when lr < 1e-6 to prevent overfitting. We used negative log-likelihood loss (NLLL) to calculate the loss. Hyperparameters were chosen using a grid search. The following parameters were optimized for the average precision (AP) and precision scores prior to final model training: LSTM network output size: 256, 128, 64 (which also corresponds to the input size of the MLP module); patience of the scheduler: 10, 25, 50, 100; training batch size: 32, 64, 128. The parameters of the optimized models are provided in Supporting Information.
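The sketch below illustrates this training schedule; model and train_loader are assumed from the architecture sketch above, the patience value shown is one of the grid-searched options, and log_softmax is applied because PyTorch's NLLLoss expects log-probabilities.

```python
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=0.1, patience=25)   # patience was grid-searched
criterion = torch.nn.NLLLoss()

# Train until the scheduler has driven the learning rate below 1e-6
while optimizer.param_groups[0]["lr"] >= 1e-6:
    epoch_loss = 0.0
    for fp, label in train_loader:
        optimizer.zero_grad()
        log_probs = torch.log_softmax(model(fp), dim=1)
        loss = criterion(log_probs, label)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    scheduler.step(epoch_loss)            # reduce lr by 10x when loss plateaus
```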

Contrastive Learning.

The CL model was trained using the Adam optimization algorithm. The learning rate, lr, started at 0.001 and was reduced by a factor of 10 as the loss plateaued. To control the learning rate decrease, we used the ReduceLROnPlateau scheduler of PyTorch. The training process stopped when lr < 1e-6 to prevent overfitting. The training of the CL model took place in two parts. First, we trained the LSTM network using the CLL and froze the network weights. Then, we loaded the LSTM network weights and trained the MLP module using the NLLL. Hyperparameters were chosen using a grid search. The following parameters were optimized for the average precision (AP) and precision scores prior to final model training: LSTM network output size: 256, 128, 64 (which also corresponds to the input size of the MLP module); patience of the scheduler: 10, 25, 50, 100; training batch size: 32, 64, 128; temperature: 0.1, 0.25, 0.5. The parameters of the optimized models are provided in Supporting Information.
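A condensed sketch of this two-stage procedure follows, reusing the LSTMEmbedder, MLPHead, and supervised_contrastive_loss sketches from earlier; the scheduler and stopping logic are omitted for brevity, and single-epoch loops stand in for full training.

```python
import torch

# Stage 1: train the LSTM embedder f with the contrastive loss, then freeze it.
embedder, head = LSTMEmbedder(), MLPHead()
opt_f = torch.optim.Adam(embedder.parameters(), lr=1e-3)
for fp, label in train_loader:
    opt_f.zero_grad()
    loss = supervised_contrastive_loss(embedder(fp), label, temperature=0.25)
    loss.backward()
    opt_f.step()

for p in embedder.parameters():
    p.requires_grad = False   # freeze the learned-embeddings module

# Stage 2: train the MLP head g on top of the frozen embeddings with NLL loss.
opt_g = torch.optim.Adam(head.parameters(), lr=1e-3)
nll = torch.nn.NLLLoss()
for fp, label in train_loader:
    opt_g.zero_grad()
    log_probs = torch.log_softmax(head(embedder(fp)), dim=1)
    nll(log_probs, label).backward()
    opt_g.step()
```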

t-SNE Visualizations.

We created t-SNE plots using the scikit-learn library in Python (as described by us previously49). The learning rate was set to 50; the remaining t-SNE parameters were left at their default settings.
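A minimal reproduction of these settings is shown below; the feature matrix, labels, and random seed are placeholder assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.integers(0, 2, (500, 1024)).astype(float)  # placeholder ECFP6 matrix
y = rng.integers(0, 2, 500)                        # placeholder activity labels

# learning_rate=50 as in the text; all other parameters left at defaults
coords = TSNE(n_components=2, learning_rate=50, random_state=0).fit_transform(X)
plt.scatter(coords[y == 1, 0], coords[y == 1, 1], c="red", s=8, label="active")
plt.scatter(coords[y == 0, 0], coords[y == 0, 1], c="green", s=8, label="inactive")
plt.legend()
plt.show()
```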

Selective BChE Models

In total, we trained four selective BChE model configurations with variations of the dataset: two single models and two sequential models. The models were optimized for either Precision Score or Average Precision (AP) Score, and the models with the highest precision of all the combinations were selected for virtual screening. The scores for all models, with all possible optimizations, are provided in the Supporting Information.

Virtual Libraries and ADME/Tox predictions

Our internal virtual screening library comprised 5,374,741 compounds (purchasable in at least 50 mg quantities) compiled from 11 companies (Maybridge, Timtec, Otava, ChemBridge, Asinex, Life Chemicals, IBS, Enamine, ChemDiv, Chemspace, and Vitas).

Once we had screened the compound library with the selective BChE models, we used our proprietary models to predict the absorption, distribution, metabolism, excretion/toxicity (ADME/Tox) properties of the predicted actives in our toxicology software, MegaTox. The ADME/Tox properties we predicted were aqueous solubility, blood-brain barrier (BBB) penetration, cytotoxicity, human and mouse microsomal stability, and human ether-a-go-go-related gene (hERG) toxicity. We used a weighted norm50 with the hyperparameter p = 0.5 to find the Pareto-optimal combination of the selective BChE, BChE, solubility, and ADME models. We ranked the predicted actives for each model based on their combined predicted solubility. Since each model predicted a different number of BIs, we normalized ranks by the total number of BI actives for each model. We selected the compounds that were predicted as soluble and had a normalized rank score > 0.75 for both selective BChE and BChE. We made our best effort to search the internet for any of these compounds with known BChE or AChE activity and removed any we found.
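The rank normalization step can be sketched in a few lines of pandas; the column names and toy probability scores below are purely illustrative.

```python
import pandas as pd

def rank_probability(scores: pd.Series) -> pd.Series:
    """Rank scores within one model's predicted actives, normalized to (0, 1]."""
    return scores.rank(method="average") / len(scores)

# Toy probability scores for four predicted actives from one model
actives = pd.DataFrame({
    "bche_prob":   [0.91, 0.85, 0.78, 0.66],
    "esbche_prob": [0.95, 0.79, 0.88, 0.70],
})
actives["bche_rp"] = rank_probability(actives["bche_prob"])
actives["esbche_rp"] = rank_probability(actives["esbche_prob"])

# Retain compounds scoring in the top quartile for both models (RP > 0.75)
selected = actives[(actives["bche_rp"] > 0.75) & (actives["esbche_rp"] > 0.75)]
print(selected)
```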

Compounds

CPI007001 (2-(3-piperidin-1-yl-propoxy)-benzoic acid, Cat. # 034296) was purchased from Matrix Scientific (Elgin, SC). CPI007003 (3-(1H-indol-3-yl)-N-[2-(2-oxoimidazolidin-1-yl)ethyl]propenamide, Cat. # 27284973) was purchased from ChemBridge (San Diego, CA). CPI007069 (1-benzyl-2-methyl-1H-indole-3-carboxylic acid, Cat. # EN300-704148), CPI007070 (N-[2-(1H-indol-3-yl)ethyl]-4-(methylamino)butanamide hydrochloride, Cat. # Z1457946962), CPI007071 ((2S)-2-[(3-ethoxyphenyl)methylamino]-3-(1H-indol-3-yl)propanoate, Cat. # Z2831686563), CPI007072 ((2R)-2-[(3-hydroxy-5-propan-2-ylphenyl)methylamino]-3-(1H-indol-3-yl)propanoic acid, Cat. # Z3838201083), and CPI007073 (2-[2-[(dimethylamino)methyl]morpholin-4-yl]-N-[2-(1H-indol-3-yl)ethyl]acetamide) were purchased from Enamine (Kyiv, Ukraine). CPI007066 (N-[2-(dimethylamino)ethyl]-7-methyl-1H,2H,3H-cyclopenta[b]quinolin-9-amine hydrochloride, MolPort-000-662-340), CPI007067 (N-[(4-benzyl-5-oxomorpholin-2-yl)methyl]-2-(1H-indol-3-yl)acetamide, MolPort-046-060-406), CPI007068 (9-[4-[(3-fluorophenyl)methoxy]-3-methoxyphenyl]-2,3,4,5,6,7,9,10-octahydroacridine-1,8-dione, MolPort-002-002-208), CPI007074 (2-{2-imino-3-[2-(piperidin-1-yl)ethyl]-2,3-dihydro-1H-benzimidazol-1-yl}-1-phenylethanol, MolPort-046-842-087), CPI007075 (1-(1H-indol-3-yl)-2-(piperidin-1-yl)ethanol, MolPort-002-623-472), CPI007076 (2-(2-chlorophenyl)-5-(hydroxymethyl)-N-[2-(1H-indol-3-yl)ethyl]-2H-1,2,3-triazole-4-carboxamide, MolPort-007-690-987), CPI007077 (9-(2-methoxyphenyl)-3,4,6,7,9,10-hexahydroacridine-1,8(2H,5H)-dione, MolPort-001-995-443), CPI007078 (2-[2-bromo-4-(propan-2-yl)phenoxy]-N-(pyridin-3-ylmethyl)acetamide, MolPort-002-091-593), CPI007079 (N-[2-(1H-imidazol-4-yl)ethyl]-4-(1H-indol-3-yl)butanamide, MolPort-028-855-076) and CPI007080 (3-(1H-indol-3-yl)-N-{2-[(1H-indol-3-ylacetyl)amino]ethyl}propenamide, MolPort-005-911-293) were purchased from MolPort (Riga, Latvia).

In vitro assays

BChE and AChE inhibition studies were conducted using the Butyrylcholinesterase Inhibitor Screening Kit (Colorimetric) (ab289837) and the Acetylcholinesterase Assay Kit (Colorimetric) (ab138871) from Abcam. Compounds were pre-incubated with enzyme at 25°C for 30 minutes, then kinetic assays were performed on a SpectraMax ID5 (Molecular Devices, San Jose, CA), reading absorbance at 412 nm over a 60-minute incubation. Percent inhibition was calculated from the change in absorbance over time in the compound well (slope of [C]) compared to a solvent control (slope of [SC]) of 1% DMSO (final concentration):

$$\%\ \text{Relative Inhibition} = \frac{\text{Slope of [SC]} - \text{Slope of [C]}}{\text{Slope of [SC]}} \times 100\%$$

Reported percent inhibition for the 10 μM assays is the average of two technical replicates ± standard deviation. Dose-response curves were performed in biological duplicate.
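The calculation reduces to a one-line function of the kinetic slopes; the slope values below are toy numbers for illustration.

```python
def relative_inhibition(slope_compound: float, slope_control: float) -> float:
    """Percent relative inhibition from kinetic slopes (change in A412 per minute)."""
    return (slope_control - slope_compound) / slope_control * 100.0

# Toy slopes: a compound that slows the control rate from 0.020 to 0.008
# absorbance units/min gives 60% relative inhibition.
print(relative_inhibition(slope_compound=0.008, slope_control=0.020))  # 60.0
```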

RESULTS

Datasets for BChE Selectivity Modeling

Using a dataset of AChE inhibitors we curated previously22, and BChE IC50 values downloaded from ChEMBL, we found an overlapping subset of 2027 compounds for which there was both AChE and BChE inhibition data. Defining an inhibitor as a compound with an IC50 value ≤ 1 μM, and a selective inhibitor as a compound that meets this threshold for either AChE or BChE, but not both, we subdivided this dataset into selective AChE inhibitors (AIs), selective BChE inhibitors (BIs), dual inhibitors (DIs), and non-inhibitors (NIs) (Figure 2A). From these we built four different classification datasets (Figure 2B). Standard classification datasets were built for AChE and BChE, while a selective BChE (sBChE) dataset was designed to classify only BIs as active, assigning DIs a null activity alongside the AIs and NIs. An exclusive, selective BChE (esBChE) dataset was also built that removed AIs and NIs from the dataset entirely, directly juxtaposing BIs with dual inhibitors. Each of these four datasets was further divided (80/20 split) to yield a training set and a stratified leave-out set. The sBChE leave-out set was used as the validation set to test the performance of all models.

Figure 2.

Schematic to define A) how many molecules belonged to each classification and B) how each classification was used to create each dataset for model training.

Single versus Sequential Models for BChE Selectivity.

We investigated two different model configurations for BChE selectivity: a direct approach consisting of a single model trained on selective BChE activity, and a sequential model approach which feeds the output of one activity model directly into a second activity model. We created CL, DL and RF models for each model configuration, and tested each model against our validation set. We prioritized precision in evaluating the performance of each model, to minimize the rate of false positives in our predictions when we later used them to choose compounds for in vitro testing. We therefore optimized each model for Precision Score or Average Precision, then selected the better model of the two. The scores for all models are provided (Tables S1–S9) along with the optimized parameters (Tables S10–S14). The scores of the best models are summarized in Table 1 and Table 2.

Table 1.

Performance of Single Models for BChE Selectivity on the Validation Set.

Training set Model AUC F1 Precision Recall AP Accuracy Specificity Cohen’s Kappa
sBChE CL 0.76 0.60 0.67 0.55 0.40 0.94 0.98 0.57
DL 0.59 0.30 0.86 0.18 0.22 0.93 1 0.28
RF 0.69 0.48 0.62 0.39 0.29 0.93 0.98 0.45
esBChE CL 0.70 0.41 0.35 0.49 0.21 0.88 0.92 0.34
DL 0.64 0.37 0.48 0.30 0.20 0.92 0.97 0.33
RF 0.77 0.42 0.31 0.67 0.23 0.85 0.87 0.35

The scores shown are from the models with the highest precision after optimization. The most precise model for each training set is indicated in bold. AUC, area under the receiver operating characteristic curve; F1, harmonic mean of precision and recall; Precision, ratio of true positives to predicted positives; Recall, ratio of true positives to all positives; AP (average precision), area under the precision-recall curve; Accuracy, ratio of true predictions to all predictions; Specificity, ratio of true negatives to all negatives; Cohen’s Kappa, a correlation coefficient which allows for the possibility that some agreements happen by chance. In all cases the best value is 1 and the worst is 0.

Table 2.

Performance of Sequential Models for BChE Selectivity on the Validation Set.

Sequence Model AUC F1 Precision Recall AP Accuracy Specificity Cohen’s Kappa
BChE > AChE CL 0.76 0.62 0.72 0.55 0.43 0.95 0.98 0.59
DL 0.69 0.49 0.65 0.39 0.31 0.93 0.98 0.46
RF 0.72 0.57 0.75 0.45 0.39 0.94 0.99 0.54
BChE > esBChE CL 0.64 0.42 0.90 0.27 0.30 0.94 1 0.40
DL 0.61 0.35 1 0.21 0.28 0.94 1 0.33
RF 0.71 0.55 0.78 0.42 0.38 0.94 0.99 0.52

The scores shown are from the models with the highest precision after optimization of all possible combinations. Please see the Supplementary Results section for all the model combinations tested. The most precise model for each sequence is indicated in bold. AUC, area under the receiver operating characteristic curve; F1, harmonic mean of precision and recall; Precision, ratio of true positives to predicted positives; Recall, ratio of true positives to all positives; AP (average precision), area under the precision-recall curve; Accuracy, ratio of true predictions to all predictions; Specificity, ratio of true negatives to all negatives; Cohen’s Kappa, a correlation coefficient which allows for the possibility that some agreements happen by chance. In all cases the best value is 1 and the worst is 0.

The single BChE selectivity models were trained using either the sBChE or esBChE datasets (Figure 2B) and validated on their respective leave-out sets (Tables S3 and S4). The performance of each model on the full validation set is shown in Table 1. The sBChE DL model outperformed the other models, with a precision score of 0.86 on the validation set. The esBChE DL model also showed the highest precision among the esBChE models (0.48), but overall, the esBChE model precision scores were nearly half those of the corresponding sBChE models on the validation set (Table 1, Table S5), and much lower than those seen for the esBChE leave-out set (Table S4). Considering that the validation set contains AIs and DIs, which are absent from the esBChE training set, this difference is unsurprising.

The sequential BChE selectivity modeling approach split the modeling problem into two tasks, using a different model for each. The first task was to identify all BChE inhibitors, for which we used a model built on the BChE dataset (Figure 2B). This model makes no distinction between BIs and DIs, which allows a wider pool of possible inhibitors to feed into the second model. The second task was to distinguish selective BChE inhibitors from non-selective inhibitors. We first used models built on the AChE dataset for this task, feeding compounds that scored active for BChE inhibition directly into our AChE inhibition model and selecting those compounds that scored inactive for AChE, thus predicting BChE selectivity. This sequential model performed better than the single models by some metrics (Table 2), but our RF model was the only one that saw an increase in the Precision score. We next built a sequential model consisting of a BChE model followed by an esBChE model. This sequence passed predicted BChE inhibitors through a second model specifically designed to predict against dual inhibitors. These BChE > esBChE models scored higher for precision, for all three algorithms, than any other model design (Table 1, Table 2). Because of this, we decided to use the BChE > esBChE sequential models for our BChE selectivity screening.
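Conceptually, the two-stage screen can be sketched as follows, assuming each trained model exposes a scikit-learn-style predict_proba and using an assumed 0.5 activity threshold.

```python
import numpy as np

def sequential_screen(X, bche_model, esbche_model, threshold=0.5):
    """Two-stage BChE > esBChE screen over a fingerprint matrix X."""
    # Stage 1: keep anything the general BChE model calls an inhibitor (BI or DI)
    bche_probs = bche_model.predict_proba(X)[:, 1]
    bche_hits = np.where(bche_probs >= threshold)[0]

    # Stage 2: among those, keep compounds the esBChE model calls selective (BI)
    esbche_probs = esbche_model.predict_proba(X[bche_hits])[:, 1]
    return bche_hits[esbche_probs >= threshold]  # indices of predicted BIs
```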

The best BChE selectivity model was the sequential BChE > esBChE DL model, with a Precision Score of 1 (Table 2). While the sequential models generally performed better than the single models (Table 1), the BChE > esBChE configuration outperformed the others for all three algorithms, with the CL and RF models showing Precision Scores of 0.90 and 0.78, respectively. The DL model performed better than the other algorithms except in the case of the BChE > AChE sequential model, in which the RF model had the greatest precision; this was the only case in which the RF model outperformed the other algorithms. It is worth noting that while the models in Tables 1 and 2 were optimized for precision, their recall and average precision are low, which may lead to us missing some positives.

t-SNE visualizations of the AChE and BChE training sets show considerable separation between the inhibitors and non-inhibitors, based on the clustering of data points and the relative distance between compounds of different categories (Figure 3a, b). Conversely, the actives and inactives in the sBChE training set used to train the sBChE single model do not show significant separation, implied by the lack of clustering of same-category compounds and the close positioning of active and inactive compounds (Figure 3c). However, removal of the NIs and AIs from the sBChE training set emphasized the distinction between the BIs and DIs, based on the increased relative distance between actives and inactives, especially in the center of the t-SNE plot (Figure 3d). In general, t-SNE visualizations may provide qualitative support for why the sequential BChE selectivity models worked better than the single models. We believe the esBChE model performed better than the AChE model as the second model of the sequential pair because it was trained to learn the unique features distinguishing BIs from DIs, whereas the AChE model was trained to distinguish AIs from everything else. Additionally, the exclusion of AIs and NIs from the esBChE training set seems to reduce overall dataset complexity, which may help the esBChE model perform better in a sequential setup.

Figure 3.

t-SNE visualization of the active and inactives in the four different training sets. A) The chemical space of the actives (red) versus the inactives (green) in the AChE dataset B) BChE dataset C) sBChE dataset D) esBChE dataset.

Virtual Screening for Selective BChE Inhibitors

We screened a virtual library of more than 5 million commercially available vendor compounds for selective BChE inhibitors using our sequential selectivity models for all three model architectures. Interestingly, out of more than 5 million compounds, each model predicted nearly the same number of compounds as active: 356 compounds for DL, 318 compounds for CL, and 317 compounds for RF (Figure 4A). Despite the similar number of actives predicted by each model, there was little overlap in the identities of the compounds predicted by the three models, with the DL and CL models sharing the most actives in common. Generally, the predictions of the RF model seemed to occupy a different chemical space than those of the other two models, based on a t-SNE plot of ECFP6 fingerprints, with minimal overlap (Figure 4B).

Figure 4.

Results of Virtual Screening for Selective BChE Inhibitors. A) Overlap of compounds predicted by the CL (light blue), DL (pink) and RF (green) BChE > esBChE models from the 5 million compound virtual library. B) t-SNE plot comparing the chemical space occupied by compounds predicted by each model, based on ECFP6 fingerprints.

Rather than test all 931 compounds in vitro for selective BChE inhibition, we pruned this list of predicted inhibitors in a stepwise fashion. The first step aimed to maximize our chances of finding hits by utilizing the probability scores generated by the CL, DL and RF models for every prediction. Since each model predicted a different number of active compounds, and the probability scores are dependent on the number of actives predicted, these scores are not directly comparable between individual models. We therefore normalized these scores by first ranking the probability scores for each compound and then dividing by the total number of active compounds in the prediction set, producing a rank probability (RP). Only those compounds with an RP > 0.75 for both the BChE and esBChE models were retained and subsequently fed into six different in-house ADME/Tox models in our MegaTox software (aqueous solubility, blood-brain barrier penetration, cytotoxicity, human microsomal stability, mouse microsomal stability, and hERG toxicity). We assigned a combined ADME/Tox score to each compound based on their performance in these models (Table 3, Figures S1–S6), but ultimately prioritized solubility when choosing compounds for in vitro testing. Of the remaining 30 compounds from the original predictions, seven were either unavailable for immediate purchase or prohibitively expensive to acquire, one was a controlled substance, one was classified by the Globally Harmonized System of Classification and Labelling of Chemicals as an acute oral and environmental hazard, and one was a published BChE inhibitor. This left 20 compounds across the three ML models for in vitro testing (Table 3, Figure 5). Interestingly, the compounds initially selected by each model occupy chemical spaces adjacent to the training set, with the RF predictions more isolated than those of DL or CL, suggesting some degree of overlap (Figure S7).

Table 3.

Compounds Selected for BChE and AChE in vitro Testing by Model.

Algorithm Compound BChE RP esBChE RP Solubility Score ADME/Tox Score
Contrastive Learning CPI6990 0.86 0.90 0.93 8.98
CPI6991 0.84 0.80 0.98 9.04
CPI7066 0.75 0.98 0.65 7.92
CPI7067 0.83 0.89 0.80 8.47
CPI7073 0.86 0.89 0.75 8.44
CPI7076 0.97 0.96 0.76 8.78
CPI7077 0.97 0.76 0.92 9.32
CPI7078 0.85 0.86 0.69 8.31

Deep Learning CPI6989 0.92 0.84 0.93 9.77
CPI7001 0.88 0.99 0.81 9.23
CPI7069 0.80 0.82 0.67 8.30
CPI7071 0.75 0.80 0.76 7.22
CPI7072 0.89 0.92 0.77 8.46
CPI7074 0.85 0.92 0.92 8.51
CPI7075 0.85 0.78 0.79 7.73

Random Forest CPI7003 0.90 0.87 0.88 9.17
CPI7068 0.95 1.00 0.69 8.88
CPI7070 0.76 0.86 0.71 7.27
CPI7079 0.89 0.76 0.90 8.49
CPI7080 0.88 0.95 0.87 9.05

Figure 5.

Structures of the 20 compounds tested for in vitro BChE and AChE inhibition testing. CL predictions are in light blue, DL predictions are in pink, and RF predictions are in green.

In Vitro Assays.

We tested each of the 20 compounds for inhibition of AChE and BChE at 10 μM using commercially available colorimetric kits. Seven of the 20 compounds displayed at least 30% relative inhibition of BChE while displaying little or no inhibition of AChE at the same concentration (Figure 6A). Three of these compounds were identified using the CL model, three were from the DL model, and one was from the RF model. This represents a hit rate of 37.5% for the CL model, 43% for the DL model, and 20% for the RF model. Follow-up dose-response curves showed only minor differences in potency between the compounds predicted by the CL (Figure 6B), DL (Figure 6C) or RF (Figure 6D) models, which is not unexpected for classification modeling. Though the sample size for each model is likely too small to compare the efficiency of the models directly, the identification of 7 BIs from a possible 20 reflects an overall hit rate of 35% for our modeling strategy. Our goal here was not to identify as many BChE inhibitors as possible, but to develop a method for identifying selective BChE inhibitors, which we have validated in vitro.

Figure 6.

Selectivity and Activity of BChE Inhibitors by Predictive Model. A) The relative inhibition of AChE and BChE for each compound predicted by CL (light blue), DL (pink) and RF (green). Error bars represent the SD of two technical replicates. B) IC50 values for 3 of the 8 compounds predicted by the CL model. C) IC50 values for 3 of the 7 compounds predicted by the DL model. D) IC50 values for 1 of the 5 compounds predicted by the RF model and the control inhibitor rivastigmine. Curves represent mean values of two biological replicates ± SD. Curves were fit using three-parameter non-linear regression, except for rivastigmine, for which a four-parameter fit was statistically better (P < 0.05).

DISCUSSION

BChE inhibitors have been suggested as a treatment for neurodegenerative diseases, such as AD, particularly when the ratio of AChE to BChE changes in the later stage of the disease8, 9. Inhibiting BChE can lead to an increase in ACh levels in the brain, potentially providing a therapeutic approach that counteracts the side effects of the AChE inhibitors used in Alzheimer’s treatment10–12. However, the discovery of novel BIs has been challenging due to the many structural similarities4, 5, 13–15 and only minor differences between AChE and BChE20, 21, 51. Previous research showed that machine learning models can effectively predict AChE22, 23 and BChE24, 25 activity. Recently, Xu et al.25 used various machine learning algorithms, together with multiple feature selection methods, to virtually screen for and find novel selective AChE and BChE inhibitors. The molecules identified in previous virtual screens included several methylenedioxyphenol-containing molecules, isoflavones, N-phenylacridin-9-amines, and 1,1’-([1,1’-biphenyl]-4,4’-diyl)bis(3-aminopropan-1-one), (1,3-diphenyl-1H-pyrazol-4-yl)methanamine, N-((1-(piperidin-1-yl)cyclohexyl)methyl)benzamide and 2-(1-phenylethoxy)ethan-1-amines24, 25. These classes appear to be different from what we have identified (mainly indoles) in our external testing. In this paper, we compared the efficacy of supervised contrastive learning with deep learning and Random Forest ML models for predicting selective BChE inhibition. We also compared the use of single versus sequential models to predict compounds active against BChE and inactive against AChE. Our models were optimized and primarily evaluated for precision and average precision scores, as they were intended to be used for hit identification. We then used the optimal models from each algorithm to virtually screen a library of 5M compounds and ultimately selected 20 compounds for in vitro testing.

Our DL models outperformed the other models for precision in every case but one (Table 1, Table 2). However, the CL models had consistently higher recall and average precision scores. Considering the model architectures and optimization processes for the CL and DL models, the difference in scores is likely due to the loss functions used to train each model. The contrastive learning loss (CLL) for the CL models seemed to generalize the learning task better than the negative log-likelihood loss (NLLL) for the DL models, the latter yielding more conservative but more precise models. This more generalized learning can be seen in the compounds predicted active in our virtual screen, where the compounds predicted by the CL model did not cluster like those predicted by either the DL or RF models (Figure 4B). The RF model only outperformed the others in one instance (Table 2).

The model configuration that worked best for all algorithms was a sequential model, consisting of a general BChE inhibition model followed by an exclusive, selective BChE inhibition model that assigned BIs as active and DIs as inactive. We used this sequential model configuration to perform our virtual screen, and tested eight compounds predicted by our CL model, seven compounds predicted by our DL model, and five compounds predicted by our RF model for BChE selectivity in vitro. Ultimately, we identified 7 selective BChE inhibitors, of which 3 were relatively small indole analogs (CPI6990, CPI6991, CPI7075). These may provide a starting point for further optimization for applications in diseases like AD. Limitations of this study include focusing on ECFP6 fingerprints, whereas we could have explored other 2D descriptors or additional molecular properties; in our hands and others’, ECFP descriptors appear generally useful across a range of targets, as described herein. Other limitations relate to the virtual screen covering only 5M molecules when there are potentially billions available or possible. However, the set of most interest to us comprised molecules likely to have properties suitable for crossing the BBB, and this 5M set has been filtered for that purpose. The approach we have taken, using multiple machine learning algorithms including CL to explore a relatively modest sample of commercially available chemicals, could likely also be applied to other targets where the identification of selective inhibitors is desired. Similarly, we may also apply this methodology to the search for AChE and/or BChE reactivators that are not inhibitors of these enzymes. While we have focused on potential applications to developing molecules with selective therapeutic utility, the ‘selectivity requirement’ could also be explored in cases where a target has an associated toxicity that must be avoided in favor of a more desirable activity at a second target. Finally, this study shows the potential value of carefully curated literature datasets (as described for AChE and BChE, Tables S15 and S16) for machine learning applications, enabling further drug discovery and toxicology related projects.

Supplementary Material

supporting information
Table S15
Table S16
Table S17

Funding Sources

S.E. kindly acknowledges NIH funding: R44GM122196-02A1 from NIH NIGMS, and 2R44ES031038-02A1 and 1R43ES033855-01 from the National Institute of Environmental Health Sciences (NIEHS). We kindly acknowledge a matching grant award under the FY 21-22 One North Carolina SBIR/STTR Matching Funds Program Solicitation.

ABBREVIATIONS

ADME/Tox: absorption, distribution, metabolism, excretion/toxicity
ACh: acetylcholine
AChE: acetylcholinesterase
AD: Alzheimer’s disease
AC: Assay Central
AP: average precision
BChE: butyrylcholinesterase
CL: contrastive learning
BBB: blood-brain barrier
CLL: contrastive learning loss
CNN: convolutional neural network
DL: deep learning
ECFP: extended connectivity fingerprints
GI: gastrointestinal
hERG: human ether-a-go-go-related gene
kNN: k-nearest neighbor
LSTM: Long Short-term memory
ML: machine learning
MCC: Matthew’s Correlation Coefficient
MLP: multilayer perceptron
NB: naïve Bayesian
NLL: negative log likelihood
NLLL: negative log-likelihood loss
RF: Random Forest
RNN: recurrent neural network
ReLU: rectified linear unit
RMSD: root mean-square deviation
AIs: selective AChE inhibitors
BIs: selective BChE inhibitors
SVM: support vector machine

Footnotes

Supporting Information.

Supplementary methods, ADME/Tox model scores, supplementary results for the single and sequential AChE and BChE models, deep and contrastive learning parameters, and t-SNE plots of training and validation sets (.docx). The model training, validation and test set SMILES (Tables S15–S17) are provided.

Competing interests:

SE is CEO of Collaborations Pharmaceuticals, Inc. MKO, PAV, ACP, TRL and FU are employees at Collaborations Pharmaceuticals, Inc.

Statement on dual use:

The AChE machine learning models described in this study have potential dual-use capabilities, and we therefore propose to implement restrictions to control who has access to these models. We believe such precautions are necessary and these will evolve over time as we integrate software features to control this dual use.

Data and Software Availability statement

The model training, validation and test sets are provided (Tables S15–S17).

REFERENCES

1. Soreq H; Seidman S, Acetylcholinesterase--new roles for an old actor. Nat Rev Neurosci 2001, 2, 294–302.
2. Alles GA; Hawes RC, Cholinesterases in the blood of man. Journal of Biological Chemistry 1940, 133, 375–390.
3. Mendel B; Rudney H, On the type of cholinesterase present in brain tissue. Science 1943, 98, 201–202.
4. Silver A, The biology of cholinesterases. Frontiers of Biology 1974, 36, 426–447.
5. Darvesh S; Hopkins DA; Geula C, Neurobiology of butyrylcholinesterase. Nat Rev Neurosci 2003, 4, 131–8.
6. Primo-Parmo SL; Bartels CF; Wiersema B; van der Spek AF; Innis JW; La Du BN, Characterization of 12 silent alleles of the human butyrylcholinesterase (BCHE) gene. Am J Hum Genet 1996, 58, 52–64.
7. Stepankova S; Komers K, Cholinesterases and cholinesterase inhibitors. Current Enzyme Inhibition 2008, 4, 160–171.
8. Arendt T; Bigl V; Walther F; Sonntag M, Decreased ratio of CSF acetylcholinesterase to butyrylcholinesterase activity in Alzheimer’s disease. Lancet 1984, 1, 173.
9. Perry EK; Perry RH; Blessed G; Tomlinson BE, Changes in brain cholinesterases in senile dementia of Alzheimer type. Neuropathol Appl Neurobiol 1978, 4, 273–7.
10. Maelicke A; Hoeffle-Maas A; Ludwig J; Maus A; Samochocki M; Jordis U; Koepke AK, Memogain is a galantamine pro-drug having dramatically reduced adverse effects and enhanced efficacy. Journal of Molecular Neuroscience 2010, 40, 135–137.
11. Watkins PB; Zimmerman HJ; Knapp MJ; Gracon SI; Lewis KW, Hepatotoxic effects of tacrine administration in patients with Alzheimer’s disease. JAMA 1994, 271, 992–8.
12. Tayeb HO; Yang HD; Price BH; Tarazi FI, Pharmacotherapies for Alzheimer’s disease: beyond cholinesterase inhibitors. Pharmacol Ther 2012, 134, 8–25.
13. Li Q; Yang H; Chen Y; Sun H, Recent progress in the identification of selective butyrylcholinesterase inhibitors for Alzheimer’s disease. Eur J Med Chem 2017, 132, 294–309.
14. Millard CB; Broomfield CA, A computer model of glycosylated human butyrylcholinesterase. Biochemical and Biophysical Research Communications 1992, 189, 1280–1286.
15. Nicolet Y; Lockridge O; Masson P; Fontecilla-Camps JC; Nachon F, Crystal structure of human butyrylcholinesterase and of its complexes with substrate and products. J Biol Chem 2003, 278, 41141–7.
16. Dighe SN; Deora GS; De la Mora E; Nachon F; Chan S; Parat M-O; Brazzolotto X; Ross BP, Discovery and Structure–Activity Relationships of a Highly Selective Butyrylcholinesterase Inhibitor by Structure-Based Virtual Screening. Journal of Medicinal Chemistry 2016, 59, 7683–7689.
17. Williams A; Zhou S; Zhan CG, Discovery of potent and selective butyrylcholinesterase inhibitors through the use of pharmacophore-based screening. Bioorg Med Chem Lett 2019, 29, 126754.
18. Jiang C-S; Ge Y-X; Cheng Z-Q; Wang Y-Y; Tao H-R; Zhu K; Zhang H, Discovery of New Selective Butyrylcholinesterase (BChE) Inhibitors with Anti-Aβ Aggregation Activity: Structure-Based Virtual Screening, Hit Optimization and Biological Evaluation. Molecules 2019, 24, 2568.
19. Miles JA; Kapure JS; Deora GS; Courageux C; Igert A; Dias J; McGeary RP; Brazzolotto X; Ross BP, Rapid discovery of a selective butyrylcholinesterase inhibitor using structure-based virtual screening. Bioorganic & Medicinal Chemistry Letters 2020, 30, 127609.
20. Harel M; Sussman JL; Krejci E; Bon S; Chanal P; Massoulie J; Silman I, Conversion of acetylcholinesterase to butyrylcholinesterase: modeling and mutagenesis. Proc Natl Acad Sci U S A 1992, 89, 10827–31.
21. Radic Z; Pickering NA; Vellom DC; Camp S; Taylor P, Three distinct domains in the cholinesterase molecule confer selectivity for acetyl- and butyrylcholinesterase inhibitors. Biochemistry 1993, 32, 12074–84.
22. Vignaux PA; Lane TR; Urbina F; Gerlach J; Puhl AC; Snyder SH; Ekins S, Validation of Acetylcholinesterase Inhibition Machine Learning Models for Multiple Species. Chem Res Toxicol 2023, 36, 188–201.
23. Lv W; Xue Y, Prediction of acetylcholinesterase inhibitors and characterization of correlative molecular descriptors by machine learning methods. Eur J Med Chem 2010, 45, 1167–72.
24. Fang J; Yang R; Gao L; Zhou D; Yang S; Liu A-l.; Du G-h., Predictions of BuChE Inhibitors Using Support Vector Machine and Naive Bayesian Classification Techniques in Drug Discovery. Journal of Chemical Information and Modeling 2013, 53, 3009–3020.
25. Xu T; Li S; Li AJ; Zhao J; Sakamuru S; Huang W; Xia M; Huang R, Identification of Potent and Selective Acetylcholinesterase/Butyrylcholinesterase Inhibitors by Virtual Screening. J Chem Inf Model 2023, 63, 2321–2330.
26. Chen T; Kornblith S; Norouzi M; Hinton G, A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning, PMLR: 2020; pp 1597–1607.
27. He K; Fan H; Wu Y; Xie S; Girshick R, Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020; pp 9729–9738.
28. Khosla P; Teterwak P; Wang C; Sarna A; Tian Y; Isola P; Maschinot A; Liu C; Krishnan D, Supervised contrastive learning. Advances in Neural Information Processing Systems 2020, 33, 18661–18673.
29. Gao Z; Ma H; Zhang X; Wang Y; Wu Z, Similarity measures-based graph co-contrastive learning for drug-disease association prediction. Bioinformatics 2023, 39.
30. Li Y; Qiao G; Gao X; Wang G, Supervised graph co-contrastive learning for drug-target interaction prediction. Bioinformatics 2022, 38, 2847–2854.
31. Yao K; Wang X; Li W; Zhu H; Jiang Y; Li Y; Tian T; Yang Z; Liu Q; Liu Q, Semi-supervised heterogeneous graph contrastive learning for drug-target interaction prediction. Comput Biol Med 2023, 163, 107199.
32. Singh R; Sledzieski S; Bryson B; Cowen L; Berger B, Contrastive learning in protein language space predicts interactions between drugs and protein targets. Proc Natl Acad Sci U S A 2023, 120, e2220778120.
33. Lin S; Chen W; Chen G; Zhou S; Wei DQ; Xiong Y, MDDI-SCL: predicting multi-type drug-drug interactions via supervised contrastive learning. J Cheminform 2022, 14, 81.
34. Wang Y; Magar R; Liang C; Barati Farimani A, Improving Molecular Contrastive Learning via Faulty Negative Mitigation and Decomposed Fragment Contrast. J Chem Inf Model 2022, 62, 2713–2725.
35. Shrivastava AD; Kell DB, FragNet, a Contrastive Learning-Based Transformer Model for Clustering, Interpreting, Visualizing, and Navigating Chemical Space. Molecules 2021, 26.
36. Sanchez-Fernandez A; Rumetshofer E; Hochreiter S; Klambauer G, CLOOME: contrastive learning unlocks bioimaging databases for queries with chemical structures. Nat Commun 2023, 14, 7339.
37. Tao W; Liu Y; Lin X; Song B; Zeng X, Prediction of multi-relational drug-gene interaction via Dynamic hyperGraph Contrastive Learning. Brief Bioinform 2023, 24.
38. Du BX; Long Y; Li X; Wu M; Shi JY, CMMS-GCL: cross-modality metabolic stability prediction with graph contrastive learning. Bioinformatics 2023, 39.
39. Zhang Z; Xie A; Guan J; Zhou S, Molecular property prediction by semantic-invariant contrastive learning. Bioinformatics 2023, 39.
40. Wu Y; Ni X; Wang Z; Feng W, Enhancing drug property prediction with dual-channel transfer learning based on molecular fragment. BMC Bioinformatics 2023, 24, 293.
41. Wang J; Guan J; Zhou S, Molecular property prediction by contrastive learning with attention-guided positive sample selection. Bioinformatics 2023, 39.
42. Zheng Z; Tan Y; Wang H; Yu S; Liu T; Liang C, CasANGCL: pre-training and fine-tuning model based on cascaded attention network and graph contrastive learning for molecular property prediction. Brief Bioinform 2023, 24.
43. Gers FA; Schmidhuber J; Cummins F, Learning to forget: continual prediction with LSTM. Neural Comput 2000, 12, 2451–71.
44. Hochreiter S; Schmidhuber J, Long short-term memory. Neural Computation 1997, 9, 1735–1780.
45. Hornik K; Stinchcombe M; White H, Multilayer feedforward networks are universal approximators. Neural Networks 1989, 2, 359–366.
46. Breiman L, Random Forests. Machine Learning 2001, 45, 5–32.
47. Altae-Tran H; Ramsundar B; Pappu AS; Pande V, Low Data Drug Discovery with One-Shot Learning. ACS Cent Sci 2017, 3, 283–293.
48. Lane TR; Urbina F; Rank L; Gerlach J; Riabova O; Lepioshkin A; Kazakova E; Vocat A; Tkachenko V; Cole S; Makarov V; Ekins S, Machine Learning Models for Mycobacterium tuberculosis In Vitro Activity: Prediction and Target Visualization. Molecular Pharmaceutics 2022, 19, 674–689.
49. Lane TR; Harris J; Urbina F; Ekins S, Comparing LD(50)/LC(50) Machine Learning Models for Multiple Species. J Chem Health Saf 2023, 30, 83–97.
50. Chugh T, Scalarizing Functions in Bayesian Multiobjective Optimization. arXiv:1904.05760, 2019; https://ui.adsabs.harvard.edu/abs/2019arXiv190405760C (accessed April 01, 2019).
51. Saxena A; Redman AM; Jiang X; Lockridge O; Doctor BP, Differences in active site gorge dimensions of cholinesterases revealed by binding of inhibitors to human butyrylcholinesterase. Biochemistry 1997, 36, 14642–51.
