Abstract
Background
The blood brain barrier and Alzheimer’s disease are interrelated. This interrelation has been detected by physicochemical methods and by pharmacological and electrophysiological analyses. The nature of the phenomenon is extremely complex, so describing the interrelation in mathematical terms is an important task.
Objective
This work systematizes facts described in the literature on the interaction between processes that influence Alzheimer's disease and the blood brain barrier. In addition, it establishes correlations between molecular features and endpoints related to the treatment of Alzheimer's disease and to the blood brain barrier, using the CORAL software.
Methods
A logically structured analysis of the information available in the literature, together with the building of quantitative structure-activity relationships (QSARs) by the Monte Carlo method, was used to systematize the facts related to the “treatment of Alzheimer's disease vs. blood brain barrier”.
Results
The results of this work are a comparison of agreements and disagreements among the available published papers, together with the statistical quality of the QSARs built.
Conclusion
The facts from published papers and the technical details of the QSARs built in this study make it possible to formulate the following rules: (i) there are molecular alerts that promote an increase in both blood brain barrier permeation and the therapeutic activity of anti-Alzheimer's disease agents; (ii) there are molecular alerts that contradict each other.
Keywords: Alzheimer's disease, blood brain barrier, QSAR, Monte Carlo method, molecular alerts, CORAL software
1. INTRODUCTION
Alzheimer’s disease is a disorder of the central nervous system accompanied by memory deterioration and progressive impairment of daily life activities. Aging of an organism is a biochemical process; therefore, the injection of chemicals can influence it. The blood-brain barrier is a major factor hindering the development of neurotherapeutics. Experimental determination of Blood Brain Barrier permeation, like that of many other biomedical endpoints, is cumbersome and expensive. Under such circumstances, computational approaches to predicting biomedical endpoints in general, and Blood Brain Barrier permeation in particular, are attractive alternatives to direct experiment. Currently, there is no cure for Alzheimer's disease [1].
Being the most common form of dementia, Alzheimer’s disease currently affects over 5.5 million people in the United States and more than 35 million worldwide [2, 3]. The hallmark of the disease is progressive cognitive decline that results in loss of language skills, difficulty in learning, loss of memory, and alterations in personality and mood [4-6].
Several observations indicate a possible interrelation between processes related to Alzheimer’s disease and the Blood Brain Barrier [7-9]. In particular, breakdown of the Blood Brain Barrier has been noted as an important development in Alzheimer’s disease progression [10-12].
Given these circumstances, an attractive paradigm for the search for agents against Alzheimer’s disease can be represented by the scheme illustrated in Fig. (1). It is important to note that there are logical implications and interrelations between all the components of the paradigm.
2. ONTOLOGY
The information about the interactions between the elements of the phenomena represented in Fig. (1) is very complex and unclear owing to dynamical and combinatorial aspects. Methods that represent this information in a format convenient for understanding are therefore of critical importance. One possible way to construct such a method is the analysis of molecular alerts (features) that are able to influence the blood brain barrier and that are likely to suggest a promising list of molecular features valuable for drug discovery aimed at defining a group of agents against Alzheimer’s disease.
2.1. Task Definition: Interrelation Between Blood Brain Barrier and Alzheimer’s Disease
Much of the underlying biology leading to Alzheimer’s disease is unknown. Popular etiologic hypotheses have largely ignored the blood brain barrier as an important factor contributing to the pathologic hallmarks of this most common form of dementia. However, evidence identifying blood brain barrier dysfunction in Alzheimer’s disease continues to escalate [13].
Normal ageing and Alzheimer's disease have many common features. In many ways, the two conditions differ only by quantitative criteria. A variety of genetic, medical and environmental factors modulate the ageing-related processes leading to Alzheimer’s disease. Thus, Alzheimer's disease is a metabolic disease [14]. The pathophysiological influence of microelements, including aluminum and iron, is highly controversial; at any rate, they may adversely affect the progress of Alzheimer's disease [14].
Gene transfer (i.e. macromolecular sequences of amino acids) can also be used to augment existing functions of cells or to provide new ones, in the hope that this will be of therapeutic benefit [15].
The Blood Brain Barrier is a dynamic and complex interface between the blood and the central nervous system that regulates brain homeostasis. Major functions of the Blood Brain Barrier include the transport of nutrients and protection of the brain from toxic compounds. The nutrition of the brain involves small molecules like sugars, amino acids, vitamins, and trace elements. Large biomolecules, lipoproteins, peptide and protein hormones cross the Blood Brain Barrier by receptor-mediated transport [16]. Dysfunction in the transport of nutrients at the Blood Brain Barrier is described in several neurological disorders and diseases. The Blood Brain Barrier penetration of neuroprotective nutrients, and especially the potential protective effect of polyphenols and alkaloids on brain endothelium, is well known [16, 17].
Thus, the search for molecular features (fragments, 3D-isomerism, intramolecular and intermolecular quantum mechanical conditions) with apparent influence on the blood brain barrier and on destroyed fragments of neurons can be a promising direction for drug discovery.
2.2. Molecular Features that Influence the Blood Brain Barrier
Mechanistic interpretation of QSARs related to the blood brain barrier is usually based on physicochemical properties such as the octanol/water partition coefficient, isolated atomic energy [18], H-bond donor surface area, H-bond acceptor surface area [19], rotatable bond count, and hydrogen bond acceptor count [20]. The presence of heavy atoms influences the blood brain barrier and central nervous system [17]. The binding energy predictions were highly correlated (r2=0.88, F=692.4, standard error of estimate=0.775) for selected blood brain barrier active/inactive compounds (n=93) [17].
Efflux pumps present at the blood brain barrier can be inhibited by a number of nutraceuticals and plant compounds, such as apigenin, berbamine, catechin, chrysin, rutin, etc. [16]. Rings are a common attribute of these biologically active compounds [16]. Thus, six-membered rings are a molecular feature with influence on the blood brain barrier and central nervous system [16]. The presence of nitrogen in rings and the size of the linear molecular fragment connecting a pair of rings are also molecular alerts related to the blood brain barrier [21].
2.3. Molecular Features that Influence Alzheimer's Disease
Mechanistic interpretation of QSARs related to Alzheimer’s disease is usually based on physicochemical and biochemical properties, such as molecular weight, total polar surface area, hydrophilicity, absorption rate constants, etc., without molecular alerts [22]. However, modifiers of pharmacokinetics effects include molecular images such as 2-propan-water, acetone-water and the number of carbon atoms [22]. Chlorine and oxygen connected to six-membered rings, triple covalent bonds, as well as 3D-conformations can also be examined as structural alerts related to endpoints interrelated to Alzheimer’s disease [23]. Finally, groups of five-membered and six-membered rings involving oxygen and nitrogen, respectively, are also associated with potential agents for treating Alzheimer’s disease [24].
3. QSAR MODELS
3.1. Data
The binding affinity data (IC50 in nM, converted into the negative decimal logarithm pIC50 = -log10 IC50) of 233 gamma-secretase inhibitors (potential agents for the treatment of Alzheimer’s disease) have been reported in the literature [25, 26]. Blood brain barrier permeation (logBB) values for 291 substances are available from the literature [27].
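As a minimal sketch of the conversion mentioned above, the following Python function turns an IC50 value given in nM into pIC50. The source states only pIC50 = -log10 IC50, so the sketch assumes the common convention that the logarithm is taken of the molar concentration. The function name `pic50_from_nm` and the sample value are hypothetical.

```python
import math

def pic50_from_nm(ic50_nm: float) -> float:
    """Convert IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L).

    Assumption: the logarithm is taken of the molar concentration,
    which gives pIC50 = 9 - log10(IC50 in nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)

# Illustrative value only (not taken from the data set): IC50 = 1000 nM
example = pic50_from_nm(1000.0)
print(example)
```

Under this convention a more potent inhibitor (smaller IC50) has a larger pIC50, which is why the models treat pIC50 as the activity to be increased.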
3.2. Optimal Descriptor
A model of the endpoint (biological activity) is built as a one-variable correlation:
$\mathrm{Endpoint} = C_0 + C_1 \times \mathrm{DCW}(T^*, N^*)$ (1)
C0 and C1 are regression coefficients (intercept and slope) calculated by the least squares method. T is the threshold that defines rare features extracted from SMILES. For instance, if T=3, all features with a prevalence of less than 3 in the training set are considered rare. Rare features are not used to build a model (their correlation weights are zero). N is the number of epochs of the Monte Carlo optimization of the correlation weights of the molecular features involved in the modelling process. T* and N* are the values of T and N that give the best statistical characteristics of the model calculated with Eq. 1 for the calibration set.
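The two ingredients described above, the rarity threshold T and the least squares fit of C0 and C1, can be sketched in Python as follows. This is an illustration under stated assumptions, not the CORAL implementation: prevalence is counted here as the number of training SMILES that contain a feature, and the function names and toy data are hypothetical.

```python
from collections import Counter

def active_features(training_features, threshold):
    """Return the features whose prevalence in the training set is >= threshold.

    training_features: one list of features per training SMILES.
    Assumption: prevalence = number of SMILES that contain the feature.
    """
    counts = Counter()
    for feats in training_features:
        counts.update(set(feats))
    return {f for f, n in counts.items() if n >= threshold}

def least_squares(x, y):
    """Intercept C0 and slope C1 of y = C0 + C1 * x (ordinary least squares)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    c1 = sxy / sxx
    return my - c1 * mx, c1

# Toy data (hypothetical): three molecules described by single-symbol features
toy = [["C", "O", "N"], ["C", "O"], ["C", "S"]]
print(active_features(toy, threshold=2))      # features kept when T = 2
print(least_squares([1.0, 2.0, 3.0], [3.0, 5.0, 7.0]))
```

Features outside the returned set would receive a correlation weight of zero and therefore not contribute to the descriptor.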
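The optimal descriptor of correlation weights (DCW) is calculated from molecular features extracted from the simplified molecular input-line entry system (SMILES) [28] and from the molecular graph:
$\mathrm{DCW}(T^*, N^*) = {}^{\mathrm{SMILES}}\mathrm{DCW}(T^*, N^*) + {}^{\mathrm{Graph}}\mathrm{DCW}(T^*, N^*)$ (2)
where
${}^{\mathrm{SMILES}}\mathrm{DCW}(T^*, N^*) = \sum_k \mathrm{CW}(S_k) + \sum_k \mathrm{CW}(SS_k) + \mathrm{CW}(\mathrm{HARD})$ (3)
${}^{\mathrm{Graph}}\mathrm{DCW}(T^*, N^*) = \mathrm{CW}(C3) + \mathrm{CW}(C4) + \mathrm{CW}(C5) + \mathrm{CW}(C6) + \mathrm{CW}(C7)$ (4)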
Each molecular feature extracted from SMILES is recorded with twelve symbols; the spare positions are reserved in the program for possible future modifications.
An example of the molecular features extracted from SMILES and represented by twelve symbols is shown in Table 1. C3 – C7 denote situations in a molecular system related to the presence (absence) of three-membered, four-membered, five-membered, six-membered and seven-membered rings. Table 2 gives the general scheme for representing the different ring situations by twelve symbols.
Table 1.
ID | Comment | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Representation of Sk | N | . | . | . | . | . | . | . | . | . | . | . |
| | | C | . | . | . | . | . | . | . | . | . | . | . |
| | | (* | . | . | . | . | . | . | . | . | . | . | . |
| | | S | . | . | . | . | . | . | . | . | . | . | . |
| | | C | . | . | . | . | . | . | . | . | . | . | . |
| | | C | . | . | . | . | . | . | . | . | . | . | . |
| | | F | . | . | . | . | . | . | . | . | . | . | . |
| | | ( | . | . | . | . | . | . | . | . | . | . | . |
| | | = | . | . | . | . | . | . | . | . | . | . | . |
| | | N | . | . | . | . | . | . | . | . | . | . | . |
2 | Representation of SSk | N | . | . | . | C | . | . | . | . | . | . | . |
| | | C | . | . | . | ( | . | . | . | . | . | . | . |
| | | S | . | . | . | ( | . | . | . | . | . | . | . |
| | | S | . | . | . | C | . | . | . | . | . | . | . |
| | | C | . | . | . | C | . | . | . | . | . | . | . |
| | | F | . | . | . | C | . | . | . | . | . | . | . |
| | | F | . | . | . | ( | . | . | . | . | . | . | . |
| | | = | . | . | . | ( | . | . | . | . | . | . | . |
| | | N | . | . | . | = | . | . | . | . | . | . | . |
| | | | = | # | @ | N | O | S | P | F | Cl | Br | I |
3 | Definition of HARD attribute | $ | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
*) Brackets represent molecular branching; only the opening bracket “(” is used, i.e. the closing bracket “)” is also recorded as “(”.
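The twelve-symbol records of Table 1 can be reproduced with a short Python sketch. It is not the CORAL code, and several conventions are assumptions inferred from the table: each symbol occupies a four-character slot padded with dots, the closing bracket is recorded as "(", the two symbols of an SSk pair are stored in one fixed order, and the HARD bits follow the order of symbols listed in Table 1. The input string "NC(SCCF)=N" is a hypothetical SMILES chosen so that its symbols match the Sk column of Table 1.

```python
import re

# Assumption: two-letter halogens and '@@' are kept as single symbols.
_TOKEN = re.compile(r"Cl|Br|@@|.")
# Order of the HARD positions after '$', as listed in Table 1.
_HARD_SYMBOLS = ["=", "#", "@", "N", "O", "S", "P", "F", "Cl", "Br", "I"]

def tokenize(smiles):
    """Split SMILES into symbols; ')' is recorded as '(' (assumption)."""
    return ["(" if t == ")" else t for t in _TOKEN.findall(smiles)]

def pad(symbols, width=12, slot=4):
    """Write each symbol into a 4-character slot and pad to 12 with dots."""
    return "".join(s.ljust(slot, ".") for s in symbols).ljust(width, ".")

def sk(smiles):
    """One-symbol features Sk."""
    return [pad([t]) for t in tokenize(smiles)]

def ssk(smiles):
    """Two-symbol features SSk built from adjacent symbols.

    Each pair is stored in one fixed order so that AB and BA coincide.
    """
    toks = tokenize(smiles)
    return [pad(sorted(pair, reverse=True)) for pair in zip(toks, toks[1:])]

def hard(smiles):
    """Presence/absence code of selected symbols, prefixed by '$'."""
    toks = set(tokenize(smiles))
    if "@@" in toks:
        toks.add("@")
    return "$" + "".join("1" if s in toks else "0" for s in _HARD_SYMBOLS)

smiles = "NC(SCCF)=N"  # hypothetical input matching the symbols in Table 1
print(sk(smiles))
print(ssk(smiles))
print(hard(smiles))
```

Other conventions of the real program (for example the treatment of square brackets or aromatic lower-case atoms) are not covered by this sketch.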
Table 2.
Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Ring status | C | x | . | . | . | a | h | . | y | . | . | . |
CW(x) is the correlation weight of a molecular feature x. The correlation weights are calculated by Monte Carlo optimization. The CORAL software is available for the calculations [29]. The optimal correlation weights give the maximal correlation coefficient between experimental and predicted activity for the training set. The predictive potential of the model should be checked with an external validation set [29]. A detailed description of the CORAL software is available on the Internet (http://www.insilico.eu/coral).
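To make the role of the correlation weights concrete, the sketch below computes a descriptor as the plain sum of CW(x) over the features of one molecule and inserts it into the one-variable model. It assumes that features without a stored weight (for example rare features) contribute zero. The weights, features and coefficients are hypothetical and are not taken from Table 3 or from Eqs. 5-10.

```python
def dcw(features, cw):
    """Sum of correlation weights over all features of one molecule.

    Features that are absent from `cw` contribute zero (assumption that
    mirrors the zero weight of rare features).
    """
    return sum(cw.get(f, 0.0) for f in features)

def predict(features, cw, c0, c1):
    """One-variable model: endpoint = C0 + C1 * DCW."""
    return c0 + c1 * dcw(features, cw)

# Hypothetical correlation weights and features, for illustration only
weights = {"C...........": -0.5, "O...........": 2.0}
molecule = ["C...........", "C...........", "O...........", "N..........."]

print(dcw(molecule, weights))
print(predict(molecule, weights, c0=1.0, c1=0.5))
```

Because the model has a single variable, the sign of a correlation weight directly indicates whether a feature pushes the predicted endpoint up or down when the slope is positive.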
3.4. Predictive Models Built up with the CORAL Software
Three different splits into the training and validation sets were studied for the binding affinity data on gamma-secretase inhibitors (pIC50) and for Blood brain barrier permeation (logBB). Note that the training set for the CORAL models is itself structured into training, invisible training and calibration sets [30, 31].
Computational experiments have shown that the efficacy of the “training” can be improved by means of a special set that continuously checks for the absence of overtraining. This set can be called the “passive training set” or “invisible training set”.
In other words, there are two ways to use a “total” training set to build up a “descriptor - endpoint” correlation:
Traditional scheme: all compounds of the total training set take part in the Monte Carlo optimization. The result is the maximal correlation coefficient between the optimal descriptor and the endpoint for the whole total training set.
Balance of correlations: the first half of the total training set is involved in the Monte Carlo optimization, whereas the second half is not. In this case, the result is the maximal correlation coefficient between the optimal descriptor and the endpoint for the first half of the compounds, while the second half indicates whether the correlation is objective or holds solely for the first (active) half of the compounds.
Thus, the balance of correlations builds up a QSAR model with the following participants:
The training set is “builder of the model”;
The invisible training set is the “inspector of the model”; the inspector must detect and stop the process of the overtraining;
The calibration set is an expert; the expert must declare, “Model is ready”;
The validation set is the appraiser of real predictive potential of the model.
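The division of roles listed above can be illustrated with a small Python sketch. It is a simplified stand-in and not the CORAL algorithm: the "builder" perturbs one weight at a time and keeps a change only if the correlation on the training set improves, while the "inspector" stops the run as soon as the correlation on the invisible training set no longer improves. The function names, step size, stopping rule and toy data are all assumptions made for illustration.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient (0.0 if either variance is zero)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def dcw_corr(data, cw):
    """Correlation between the descriptor (sum of weights) and the endpoint."""
    d = [sum(cw.get(f, 0.0) for f in feats) for feats, _ in data]
    return pearson(d, [y for _, y in data])

def optimize(train, invisible, n_epochs=20, step=0.5, seed=0):
    rng = random.Random(seed)
    cw = {f: 1.0 for feats, _ in train for f in feats}   # start weights
    best_cw, best_inv = dict(cw), dcw_corr(invisible, cw)
    for _ in range(n_epochs):
        for f in list(cw):                   # one epoch: try each weight once
            old, r_old = cw[f], dcw_corr(train, cw)
            cw[f] = old + rng.uniform(-step, step)
            if dcw_corr(train, cw) <= r_old: # builder keeps only improvements
                cw[f] = old
        r_inv = dcw_corr(invisible, cw)      # inspector checks the unseen half
        if r_inv <= best_inv:
            break                            # overtraining suspected: stop
        best_cw, best_inv = dict(cw), r_inv
    return best_cw, best_inv

# Toy data (hypothetical): (features, endpoint) pairs
train = [(["C", "C", "O"], 1.0), (["C", "N"], 0.2),
         (["O", "O"], 1.8), (["C", "N", "N"], -0.3)]
invisible = [(["C", "O"], 0.9), (["N", "N"], -0.8),
             (["O"], 1.0), (["C", "C"], 0.1)]

weights, r_invisible = optimize(train, invisible)
print(weights, r_invisible)
```

In this sketch the calibration and validation sets are left out; they would be used afterwards to choose the threshold and number of epochs and to assess the final model, as described in the list above.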
The advantage of this approach is the possibility of building up a model solely from 2D data on the molecular structure represented by SMILES, with an interpretation of the influence of the different molecular features extracted from SMILES. However, the approach has some disadvantages. In particular, the Monte Carlo optimization is not a fast calculation, especially for large datasets. In addition, some SMILES fragments do not have a transparent physical meaning (e.g. the symbols “[”, “@”, dots, etc.).
In Table 2, x is the size of the rings, i.e. x=3, 4, 5, 6, 7; if there are aromatic rings then a=‘A’, otherwise a=‘.’; if there are heteroatoms in the rings then h=‘H’, otherwise h=‘.’; y is the number of rings, i.e. y=0, 1, 2, …
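The ring codes of Table 2 can be assembled with a one-line template. The helper below is a hypothetical illustration of that layout and assumes a single-digit ring count; its outputs can be compared with the ring features that appear in Table 3, such as "C5......0..." and "C6...AH.4...".

```python
def ring_code(size, count, aromatic=False, hetero=False):
    """Build the 12-symbol ring feature laid out in Table 2.

    Positions: 'C', ring size x, three dots, aromaticity flag a,
    heteroatom flag h, one dot, ring count y, three dots.
    """
    if not 3 <= size <= 7:
        raise ValueError("ring size x must be between 3 and 7")
    if not 0 <= count <= 9:
        raise ValueError("this sketch assumes a single-digit ring count")
    a = "A" if aromatic else "."
    h = "H" if hetero else "."
    return f"C{size}...{a}{h}.{count}..."

print(ring_code(5, 0))                               # no five-membered rings
print(ring_code(6, 4, aromatic=True, hetero=True))   # four aromatic hetero rings
```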
The models, which were built up with the balance of correlations, are as follows:
Binding Affinity of Gamma-secretase Inhibitors (Potential Agents for Treatment of Alzheimer’s Disease)
Split 1
pIC50 = 1.2942501 (± 0.0382248) + 0.1606057 (± 0.0009709) * DCW(1,15) (5)
n=62, r2=0.8258, RMSE=0.623, F=284 (training set)
n=71, r2=0.6856, RMSE=0.727 (invisible training set)
n=51, r2=0.6810, RMSE=0.751 (calibration set)
n=49, r2=0.7752, RMSE=0.733 (validation set)
Split 2
pIC50 = 3.2737064 (± 0.0326601) + 0.1974723 (± 0.0013567) * DCW(1,15) (6)
n=66, r2=0.7711, RMSE=0.694, F=216 (training set)
n=67, r2=0.7702, RMSE=0.703 (invisible training set)
n=50, r2=0.7258, RMSE=0.718 (calibration set)
n=50, r2=0.7676, RMSE=0.645 (validation set)
Split 3
pIC50 = 2.1408654 (± 0.0416128) + 0.1757965 (± 0.0012683) * DCW(1,15) (7)
n=61, r2=0.7725, RMSE=0.665, F=200 (training set)
n=63, r2=0.7724, RMSE=0.756 (invisible training set)
n=55, r2=0.7610, RMSE=1.11 (calibration set)
n=54, r2=0.7753, RMSE=0.882 (validation set)
Blood Brain Barrier Permeation (logBB)
Split 1
Log(BB) = -0.8609358 (± 0.0066439) + 0.0537248 (± 0.0003448) * DCW(1,15) (8)
n=101, r2=0.7438, RMSE=0.286, F=287 (training set)
n=104, r2=0.7540, RMSE=0.331 (invisible training set)
n=43, r2=0.9141, RMSE=0.198 (calibration set)
n=43, r2=0.8592, RMSE=0.240 (validation set)
Split 2
Log(BB) = -0.9164493 (± 0.0072757) + 0.0385240 (± 0.0002497) * DCW(1,10) (9)
n=103, r2=0.6830, RMSE=0.350, F=218 (training set)
n=107, r2=0.6828, RMSE=0.330 (invisible training set)
n=41, r2=0.8350, RMSE=0.229 (calibration set)
n=40, r2=0.8310, RMSE=0.319 (validation set)
Split 3
Log(BB) = -0.5038388 (± 0.0053701) + 0.0231569 (± 0.0001622) * DCW(1,10) (10)
n=104, r2=0.6388, RMSE=0.359, F=180 (training set)
n=105, r2=0.6477, RMSE=0.389 (invisible training set)
n=41, r2=0.8344, RMSE=0.275 (calibration set)
n=41, r2=0.7273, RMSE=0.274 (validation set)
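The F values reported for the training sets can be cross-checked against n and r2. For a regression with one independent variable, the Fisher F-ratio equals r2(n-2)/(1-r2); the short Python check below applies this standard relation to the training-set statistics of Eqs. 5-10. The dictionary layout and names are hypothetical; the numbers are copied from the equations above.

```python
def f_statistic(r2, n):
    """Fisher F for a one-variable linear regression: r2 * (n - 2) / (1 - r2)."""
    return r2 * (n - 2) / (1.0 - r2)

# (n, r2, reported F) for the training sets of Eqs. 5-10
reported = {
    "pIC50, split 1": (62, 0.8258, 284),
    "pIC50, split 2": (66, 0.7711, 216),
    "pIC50, split 3": (61, 0.7725, 200),
    "logBB, split 1": (101, 0.7438, 287),
    "logBB, split 2": (103, 0.6830, 218),
    "logBB, split 3": (104, 0.6388, 180),
}

for name, (n, r2, f_reported) in reported.items():
    print(name, round(f_statistic(r2, n)), f_reported)
```

The recomputed values agree with the reported F after rounding, which supports the internal consistency of the training-set statistics.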
3.5. Molecular Features that Influence pIC50 and logBB, Extracted from CORAL Models
Table 3 contains correlation weights of different molecular features obtained in three runs of the Monte Carlo method
Table 3.
No. | Feature, F | CW(F) Run 1 | CW(F) Run 2 | CW(F) Run 3 | Training Set | Invisible Training Set | Calibration Set | ||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
pIC50, split 1 | |||||||||||||
1 | 1........... | 0.24936 | 0.81527 | 1.00426 | 62 | 71 | 51 | ||||||
2 | O...(....... | 1.81598 | 2.00425 | 2.94093 | 62 | 71 | 51 | ||||||
3 | O...=....... | 0.62907 | 0.75437 | 1.18718 | 62 | 71 | 51 | ||||||
4 | C3......0... | 1.74618 | 3.12552 | 2.49983 | 60 | 71 | 51 | ||||||
5 | C4......0... | 3.12573 | 4.43867 | 1.99580 | 60 | 71 | 51 | ||||||
6 | C...(....... | 0.68970 | 0.62551 | 0.25068 | 59 | 62 | 43 | ||||||
7 | C...1....... | 1.37112 | 1.12572 | 1.43837 | 59 | 61 | 43 | ||||||
8 | c...(....... | 1.25445 | 1.43762 | 1.37518 | 57 | 63 | 46 | ||||||
9 | c...1....... | 0.37510 | 0.68855 | 0.24748 | 55 | 65 | 47 | ||||||
10 | N...(....... | 0.43569 | 0.12723 | 0.12950 | 50 | 54 | 38 | ||||||
11 | 1...(....... | 0.62013 | 0.37156 | 0.50154 | 41 | 46 | 28 | ||||||
12 | N...C....... | 0.74564 | 0.93512 | 0.62564 | 41 | 43 | 29 | ||||||
13 | S........... | 1.87883 | 1.44122 | 2.56649 | 40 | 43 | 33 | ||||||
14 | [...C....... | 2.87716 | 1.68980 | 1.75250 | 38 | 34 | 25 | ||||||
15 | F........... | 0.68765 | 0.74914 | 0.37233 | 37 | 38 | 28 | ||||||
16 | C5......0... | 4.87431 | 4.87313 | 3.87512 | 36 | 40 | 31 | ||||||
1 | (........... | -0.50046 | -0.62885 | -0.05899 | 62 | 71 | 51 | ||||||
2 | =...(....... | -0.37242 | -0.24593 | -0.56678 | 62 | 69 | 51 | ||||||
3 | =........... | -2.24798 | -1.12583 | -2.05997 | 62 | 71 | 51 | ||||||
4 | C........... | -0.56673 | -0.56218 | -0.50032 | 62 | 71 | 51 | ||||||
5 | c........... | -0.06497 | -0.18722 | -0.31242 | 62 | 71 | 51 | ||||||
6 | c...c....... | -0.56687 | -0.49790 | -0.81516 | 62 | 71 | 51 | ||||||
7 | N........... | -0.68973 | -1.12750 | -0.68769 | 54 | 62 | 41 | ||||||
8 | (...(....... | -0.74772 | -1.12089 | -1.81062 | 39 | 44 | 34 | ||||||
9 | [...H....... | -1.56676 | -1.25190 | -0.31208 | 38 | 34 | 25 | ||||||
10 | Cl..(....... | -0.24951 | -0.56565 | -0.62721 | 35 | 27 | 26 | ||||||
11 | C...=....... | -2.37058 | -2.74693 | -3.44246 | 26 | 30 | 14 | ||||||
12 | H...@@...... | -1.06063 | -0.37158 | -1.43490 | 21 | 21 | 13 | ||||||
13 | [...@....... | -2.31200 | -2.81686 | -1.50485 | 19 | 11 | 9 | ||||||
14 | =...1....... | -1.31479 | -1.74616 | -1.00456 | 9 | 15 | 10 | ||||||
15 | [...N....... | -0.43407 | -2.19238 | -1.93745 | 9 | 12 | 6 | ||||||
16 | C6...AH.4... | -3.74966 | -2.99712 | -2.99987 | 8 | 6 | 5 | ||||||
pIC50, split 2 | |||||||||||||
1 | 1........... | 0.37791 | 0.75136 | 0.50087 | 66 | 67 | 50 | ||||||
2 | O........... | 1.93510 | 2.31473 | 1.06252 | 66 | 67 | 50 | ||||||
3 | C...(....... | 0.06720 | 0.43483 | 0.62375 | 61 | 61 | 45 | ||||||
4 | C...1....... | 1.56744 | 1.62065 | 1.75150 | 61 | 58 | 43 | ||||||
5 | c...(....... | 1.56080 | 1.18804 | 1.75301 | 60 | 60 | 46 | ||||||
6 | N...(....... | 0.50498 | 0.62805 | 0.43364 | 54 | 49 | 36 | ||||||
7 | C...C....... | 0.37205 | 0.62336 | 0.37277 | 51 | 58 | 42 | ||||||
8 | 2........... | 0.43523 | 0.56398 | 0.12820 | 45 | 51 | 32 | ||||||
9 | C5......0... | 6.00472 | 5.99545 | 6.25140 | 43 | 36 | 34 | ||||||
10 | N...C....... | 1.37968 | 1.68793 | 1.87623 | 39 | 41 | 27 | ||||||
11 | [...C....... | 0.12456 | 0.49768 | 1.80975 | 37 | 38 | 23 | ||||||
12 | [...H....... | 1.62581 | 0.69162 | 1.00379 | 37 | 38 | 23 | ||||||
13 | c...2....... | 0.99918 | 0.56194 | 1.12751 | 35 | 36 | 24 | ||||||
14 | F........... | 0.49877 | 0.44089 | 1.30846 | 34 | 37 | 28 | ||||||
15 | F...(....... | 0.62920 | 0.69236 | 0.37820 | 33 | 35 | 27 | ||||||
16 | S........... | 3.12476 | 2.87269 | 3.37296 | 33 | 43 | 31 | ||||||
1 | (........... | -0.55900 | -0.55795 | -1.06147 | 66 | 67 | 50 | ||||||
2 | =........... | -0.31255 | -1.99752 | -1.87560 | 66 | 67 | 50 | ||||||
3 | C........... | -0.24649 | -0.62194 | -0.37101 | 66 | 67 | 50 | ||||||
4 | C3......0... | -4.12984 | -4.74520 | -3.49783 | 66 | 66 | 49 | ||||||
5 | c........... | -0.43299 | -0.12822 | -0.37044 | 66 | 67 | 50 | ||||||
6 | c...c....... | -0.56748 | -0.50425 | -0.99606 | 66 | 67 | 50 | ||||||
7 | N........... | -1.25121 | -1.12588 | -1.00118 | 57 | 60 | 39 | ||||||
8 | H........... | -1.37596 | -0.25167 | -1.18672 | 37 | 38 | 23 | ||||||
9 | c...C....... | -0.37586 | -0.37213 | -0.49682 | 37 | 38 | 25 | ||||||
10 | [...(....... | -1.37164 | -1.62892 | -0.87345 | 35 | 35 | 22 | ||||||
11 | (...(....... | -1.00341 | -1.05836 | -0.99682 | 32 | 44 | 32 | ||||||
12 | C...=....... | -1.87617 | -0.49606 | -0.87942 | 26 | 28 | 23 | ||||||
13 | C...@@...... | -1.93993 | -0.24818 | -0.05935 | 23 | 21 | 15 | ||||||
14 | [...1....... | -0.25399 | -0.87790 | -0.06226 | 22 | 25 | 17 | ||||||
15 | $10011100100 | -1.24578 | -1.56069 | -1.81534 | 13 | 10 | 7 | ||||||
16 | C7...A..1... | -0.24768 | -1.18849 | -0.62114 | 11 | 21 | 10 | ||||||
pIC50, split 3 | |||||||||||||
1 | 1........... | 0.80782 | 0.12046 | 1.00472 | 61 | 63 | 55 | ||||||
2 | =...(....... | 0.80859 | 0.43861 | 1.24558 | 61 | 63 | 53 | ||||||
3 | O...(....... | 2.81151 | 2.62129 | 2.31681 | 61 | 63 | 55 | ||||||
4 | O...=....... | 0.93514 | 1.43913 | 0.68355 | 61 | 63 | 55 | ||||||
5 | c........... | 0.06327 | 0.00247 | 0.12191 | 61 | 63 | 55 | ||||||
6 | C...1....... | 0.81487 | 1.06109 | 1.12812 | 58 | 60 | 46 | ||||||
7 | c...(....... | 0.68932 | 0.75280 | 0.93721 | 57 | 59 | 47 | ||||||
8 | c...1....... | 0.06090 | 0.30887 | 0.37182 | 53 | 58 | 51 | ||||||
9 | N...(....... | 0.37516 | 0.87510 | 1.68538 | 50 | 48 | 43 | ||||||
10 | 2........... | 0.94011 | 1.18721 | 1.05999 | 48 | 43 | 36 | ||||||
11 | [...C....... | 1.18350 | 0.74643 | 1.12063 | 39 | 35 | 26 | ||||||
12 | N...C....... | 1.25462 | 1.31352 | 1.62911 | 37 | 42 | 33 | ||||||
13 | S........... | 1.24827 | 2.00081 | 0.93984 | 36 | 39 | 36 | ||||||
14 | C5......0... | 3.43701 | 5.50479 | 6.44186 | 35 | 37 | 34 | ||||||
15 | F...(....... | 1.12015 | 0.55846 | 1.12187 | 35 | 33 | 27 | ||||||
16 | S...(....... | 1.99622 | 1.55892 | 1.68773 | 33 | 37 | 34 | ||||||
1 | (........... | -0.37339 | -0.06365 | -0.62191 | 61 | 63 | 55 | ||||||
2 | =........... | -1.62730 | -2.31258 | -2.12289 | 61 | 63 | 55 | ||||||
3 | C........... | -0.37397 | -0.69070 | -0.56000 | 61 | 63 | 55 | ||||||
4 | c...c....... | -0.62350 | -0.50475 | -1.12573 | 61 | 63 | 55 | ||||||
5 | N........... | -1.12086 | -1.31212 | -2.06263 | 54 | 53 | 48 | ||||||
6 | C...C....... | -0.24581 | -0.06071 | -0.37304 | 48 | 47 | 46 | ||||||
7 | [...H....... | -0.44110 | -0.30958 | -1.24568 | 39 | 35 | 26 | ||||||
8 | @@.......... | -0.87191 | -0.19206 | -0.62280 | 30 | 20 | 14 | ||||||
9 | C...=....... | -1.75324 | -1.80866 | -1.87391 | 28 | 25 | 22 | ||||||
10 | [...1....... | -0.12014 | -0.56717 | -0.31449 | 24 | 22 | 18 | ||||||
11 | [...@....... | -2.05908 | -1.00253 | -0.12596 | 16 | 10 | 13 | ||||||
12 | C7...A..1... | -1.06160 | -1.43859 | -0.62483 | 15 | 16 | 14 | ||||||
13 | $10011100100 | -0.62031 | -0.25133 | -2.50400 | 9 | 12 | 7 | ||||||
14 | [...2....... | -1.31346 | -1.43451 | -0.75249 | 9 | 8 | 7 | ||||||
15 | C6...AH.4... | -0.94103 | -2.06365 | -1.55992 | 8 | 7 | 7 | ||||||
16 | S...C....... | -1.12210 | -1.62776 | -1.19212 | 8 | 1 | 1 | ||||||
LogBB, split 1 | |||||||||||||
1 | C........... | 0.69000 | 0.44177 | 0.44099 | 101 | 102 | 42 | ||||||
2 | C4......0... | 1.44233 | 1.93619 | 0.87009 | 100 | 104 | 43 | ||||||
3 | C3......0... | 9.24711 | 8.24931 | 6.37970 | 99 | 102 | 43 | ||||||
4 | C...C....... | 0.18958 | 0.24932 | 0.31713 | 90 | 88 | 41 | ||||||
5 | C...(....... | 1.06715 | 0.74746 | 1.24582 | 87 | 91 | 35 | ||||||
6 | C...1....... | 0.50407 | 0.68784 | 0.99825 | 80 | 76 | 26 | ||||||
7 | C...=....... | 1.06451 | 1.00467 | 0.93251 | 80 | 80 | 24 | ||||||
8 | C5......0... | 5.18616 | 4.87599 | 3.06013 | 66 | 70 | 32 | ||||||
9 | N...C....... | 1.06163 | 1.25062 | 1.05830 | 61 | 59 | 20 | ||||||
10 | N...(....... | 1.87440 | 1.80861 | 1.50002 | 50 | 50 | 16 | ||||||
11 | O...=....... | 3.74850 | 3.12144 | 3.50390 | 45 | 49 | 17 | ||||||
12 | O...C....... | 1.87390 | 1.62698 | 1.50087 | 42 | 35 | 10 | ||||||
13 | =...2....... | 1.31662 | 2.37411 | 1.93516 | 41 | 37 | 12 | ||||||
14 | C...3....... | 1.56423 | 0.24589 | 0.69105 | 36 | 43 | 10 | ||||||
15 | $10011000000 | 3.49649 | 2.87644 | 4.06526 | 32 | 23 | 7 | ||||||
16 | C5....H.1... | 0.93855 | 0.49718 | 0.24570 | 29 | 28 | 11 | ||||||
1 | =........... | -1.94237 | -2.00425 | -1.12592 | 89 | 86 | 30 | ||||||
2 | (........... | -1.94146 | -1.44173 | -1.93644 | 88 | 91 | 35 | ||||||
3 | N........... | -1.69081 | -1.80786 | -1.68985 | 74 | 69 | 22 | ||||||
4 | O........... | -3.93612 | -3.12302 | -3.87867 | 66 | 69 | 27 | ||||||
5 | =...(....... | -0.87699 | -0.37408 | -0.56436 | 62 | 58 | 23 | ||||||
6 | C...2....... | -1.87318 | -2.18807 | -1.87392 | 60 | 59 | 16 | ||||||
7 | O...(....... | -1.74592 | -1.74659 | -1.50041 | 43 | 49 | 19 | ||||||
8 | 2...(....... | -2.06074 | -0.87009 | -1.37681 | 36 | 35 | 10 | ||||||
9 | N...=....... | -1.62360 | -1.50313 | -2.12455 | 30 | 35 | 11 | ||||||
10 | =...3....... | -0.68502 | -0.93424 | -1.18858 | 26 | 29 | 8 | ||||||
11 | N...2....... | -2.81032 | -1.19035 | -2.12915 | 24 | 19 | 6 | ||||||
12 | [........... | -0.81319 | -1.12785 | -1.31036 | 10 | 8 | 3 | ||||||
13 | =...4....... | -1.31236 | -0.81692 | -0.68350 | 9 | 19 | 5 | ||||||
14 | N...H....... | -1.12172 | -0.87838 | -1.62984 | 7 | 6 | 3 | ||||||
15 | [...C....... | -2.87947 | -2.75490 | -2.06543 | 7 | 5 | 3 | ||||||
16 | Br.......... | -0.49505 | -0.49665 | -1.94093 | 6 | 2 | 2 | ||||||
LogBB, split 2 | |||||||||||||
1 | C3......0... | 10.87070 | 9.99674 | 11.00131 | 103 | 103 | 41 | ||||||
2 | C........... | 0.12071 | 0.12533 | 0.37494 | 101 | 106 | 41 | ||||||
3 | C...C....... | 0.93320 | 0.87535 | 0.50239 | 89 | 96 | 37 | ||||||
4 | C...(....... | 1.19210 | 1.25322 | 0.62690 | 83 | 93 | 36 | ||||||
5 | C...=....... | 0.37918 | 1.37448 | 0.44201 | 80 | 86 | 24 | ||||||
6 | 1........... | 1.49679 | 0.12476 | 1.06684 | 74 | 88 | 26 | ||||||
7 | C...1....... | 1.18368 | 1.49925 | 1.56002 | 74 | 88 | 26 | ||||||
8 | C5......0... | 4.25337 | 4.68880 | 4.12680 | 68 | 66 | 36 | ||||||
9 | N...C....... | 1.74585 | 1.50379 | 1.31711 | 61 | 68 | 19 | ||||||
10 | 2........... | 1.37069 | 0.93599 | 0.87107 | 56 | 71 | 15 | ||||||
11 | =...1....... | 1.00094 | 1.00452 | 1.18761 | 48 | 61 | 19 | ||||||
12 | N...(....... | 2.00333 | 2.24552 | 1.81625 | 47 | 50 | 18 | ||||||
13 | O...=....... | 3.62311 | 3.49517 | 3.37670 | 42 | 52 | 19 | ||||||
14 | =...2....... | 1.18866 | 0.62164 | 0.18527 | 40 | 45 | 11 | ||||||
15 | C...3....... | 1.31047 | 0.87676 | 1.31327 | 38 | 47 | 9 | ||||||
16 | O...C....... | 2.12437 | 2.25455 | 1.81321 | 36 | 40 | 12 | ||||||
1 | C4......0... | -0.49732 | -0.50309 | -0.50008 | 103 | 106 | 41 | ||||||
2 | C7......0... | -3.50133 | -3.24763 | -2.87060 | 90 | 88 | 34 | ||||||
3 | =........... | -1.12942 | -2.24834 | -1.31595 | 85 | 96 | 28 | ||||||
4 | (........... | -1.62048 | -1.81308 | -1.18864 | 84 | 94 | 36 | ||||||
5 | N........... | -2.49537 | -2.68449 | -2.25354 | 70 | 78 | 23 | ||||||
6 | O........... | -3.62711 | -4.25421 | -3.56712 | 64 | 75 | 28 | ||||||
7 | =...(....... | -2.18783 | -0.74894 | -1.93659 | 56 | 71 | 23 | ||||||
8 | C...2....... | -2.49624 | -2.30834 | -1.81363 | 56 | 70 | 15 | ||||||
9 | O...(....... | -2.12646 | -1.87628 | -2.00250 | 41 | 54 | 18 | ||||||
10 | N...=....... | -1.12893 | -0.99861 | -0.99789 | 37 | 34 | 5 | ||||||
11 | 2...(....... | -1.75032 | -0.75119 | -0.49969 | 35 | 43 | 8 | ||||||
12 | =...3....... | -1.00262 | -0.62848 | -0.25161 | 27 | 32 | 7 | ||||||
13 | N...2....... | -2.00309 | -0.24532 | -0.87223 | 22 | 29 | 5 | ||||||
14 | S........... | -0.80988 | -2.37982 | -1.12535 | 16 | 17 | 3 | ||||||
15 | =...4....... | -2.24740 | -0.99940 | -1.12895 | 12 | 17 | 3 | ||||||
16 | [...C....... | -3.81523 | -3.37811 | -2.81052 | 8 | 6 | 2 | ||||||
LogBB, split 3 | |||||||||||||
1 | C........... | 0.00132 | 0.19167 | 0.25322 | 103 | 104 | 40 | ||||||
2 | C3......0... | 9.74655 | 10.49769 | 9.50124 | 102 | 104 | 40 | ||||||
3 | C...C....... | 1.00004 | 1.25242 | 1.00283 | 92 | 97 | 33 | ||||||
4 | C...(....... | 0.50101 | 1.37191 | 1.00401 | 90 | 89 | 35 | ||||||
5 | C...1....... | 1.31706 | 1.00284 | 1.49553 | 77 | 83 | 27 | ||||||
6 | C...=....... | 0.55849 | 1.49910 | 0.06478 | 76 | 87 | 25 | ||||||
7 | C5......0... | 6.00291 | 4.74699 | 5.25173 | 71 | 66 | 32 | ||||||
8 | N...C....... | 1.24857 | 0.87346 | 1.62565 | 61 | 63 | 21 | ||||||
9 | N...(....... | 0.75396 | 1.37567 | 1.68298 | 58 | 43 | 17 | ||||||
10 | O...=....... | 1.50315 | 2.50127 | 1.49589 | 49 | 45 | 18 | ||||||
11 | =...2....... | 3.25269 | 1.68316 | 2.25396 | 38 | 45 | 12 | ||||||
12 | 3........... | 1.62996 | 0.75194 | 0.74545 | 36 | 45 | 10 | ||||||
13 | C...3....... | 0.00153 | 0.25042 | 0.74564 | 36 | 45 | 10 | ||||||
14 | 1...(....... | 1.87609 | 2.19099 | 3.00002 | 34 | 27 | 14 | ||||||
15 | C6......0... | 2.50113 | 4.75479 | 4.37791 | 33 | 27 | 16 | ||||||
16 | O...C....... | 0.62800 | 0.49543 | 0.74706 | 31 | 37 | 14 | ||||||
1 | (........... | -0.50197 | -1.74939 | -1.25398 | 92 | 89 | 35 | ||||||
2 | C7......0... | -3.00432 | -2.24765 | -2.50016 | 92 | 86 | 35 | ||||||
3 | =........... | -1.74993 | -2.19227 | -0.93260 | 84 | 93 | 30 | ||||||
4 | N........... | -1.50133 | -2.62558 | -3.30976 | 69 | 75 | 24 | ||||||
5 | =...(....... | -0.37305 | -0.00161 | -0.68413 | 66 | 63 | 18 | ||||||
6 | O........... | -3.49561 | -3.49608 | -3.05869 | 66 | 69 | 28 | ||||||
7 | C...2....... | -1.25479 | -1.19206 | -1.56163 | 55 | 66 | 17 | ||||||
8 | O...(....... | -1.62599 | -1.87859 | -2.25388 | 46 | 48 | 15 | ||||||
9 | 2...(....... | -1.25291 | -1.75055 | -2.00114 | 37 | 36 | 11 | ||||||
10 | N...=....... | -3.00207 | -2.49735 | -2.24940 | 31 | 34 | 10 | ||||||
11 | C5....H.1... | -0.31727 | -0.62372 | -0.06462 | 29 | 29 | 6 | ||||||
12 | =...3....... | -1.25013 | -2.25139 | -0.31558 | 27 | 29 | 6 | ||||||
13 | (...(....... | -1.49562 | -1.56378 | -2.00367 | 23 | 16 | 7 | ||||||
14 | N...2....... | -2.18457 | -2.87062 | -2.50423 | 23 | 24 | 8 | ||||||
15 | [........... | -0.44089 | -0.31661 | -0.31291 | 8 | 11 | 3 | ||||||
16 | [...H....... | -0.75060 | -0.74589 | -0.12414 | 8 | 6 | 2 |
optimization procedure. These features were selected according to two principles: (i) they are significantly prevalent in the training, invisible training, and calibration sets; and (ii) they have stable positive or stable negative correlation weights in all runs.
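The two selection principles can be illustrated with a minimal Python sketch. It is not part of the CORAL software: the function name `classify`, the threshold `min_count`, and the feature `X...........` are hypothetical, and the paper does not state a numerical prevalence threshold. The other correlation weights and counts are copied from Table 3 (logBB, split 3).

```python
from typing import Dict, List, Optional, Tuple

# feature -> (CW(F) for runs 1-3, counts in training / invisible training / calibration sets)
# The first four rows are copied from Table 3 (logBB, split 3); the last one is hypothetical.
FEATURES: Dict[str, Tuple[List[float], List[int]]] = {
    "C...(.......": ([0.50101, 1.37191, 1.00401], [90, 89, 35]),
    "C...=.......": ([0.55849, 1.49910, 0.06478], [76, 87, 25]),
    "(...........": ([-0.50197, -1.74939, -1.25398], [92, 89, 35]),
    "N...........": ([-1.50133, -2.62558, -3.30976], [69, 75, 24]),
    "X...........": ([0.40, -0.20, 0.10], [50, 50, 20]),  # hypothetical, unstable sign
}


def classify(weights: List[float], counts: List[int], min_count: int = 1) -> Optional[str]:
    """Return '+' or '-' for a stable, prevalent feature, otherwise None.

    min_count is an assumed prevalence threshold; the paper does not state one.
    """
    if any(c < min_count for c in counts):
        return None  # feature is too rare in at least one set
    if all(w > 0 for w in weights):
        return "+"   # promoter of increase in every run
    if all(w < 0 for w in weights):
        return "-"   # promoter of decrease in every run
    return None      # sign changes between runs


labels = {f: classify(w, c) for f, (w, c) in FEATURES.items()}
for feature, label in labels.items():
    print(feature, label)
```

Under these assumptions, the first two features are labelled as promoters of increase, the next two as promoters of decrease, and the hypothetical feature is rejected because its sign is not stable across runs.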
3.5. Molecular Features, which have Similar Effects for pIC50 and logBB
Table 4 lists the molecular features that promote an increase in both pIC50 and logBB, together with the features that promote a decrease in both. To a first approximation, oxygen and nitrogen connected in rings, and oxygen connected with carbon or nitrogen, are promoters of increase for both pIC50 and logBB. Branching, the presence of double bonds, and nitrogen itself are promoters of decrease for both endpoints.
Table 4.
Feature | 1 | 1 | 1 | 2 | 2 | 2 | TRN1* | iTRN1 | CLB1 | TRN2 | iTRN2 | CLB2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
pIC50-split1-logBB-split1 | ||||||||||||
O...=....... | + | + | + | + | + | + | 62 | 71 | 51 | 45 | 49 | 17 |
C3......0... | + | + | + | + | + | + | 60 | 71 | 51 | 99 | 102 | 43 |
C4......0... | + | + | + | + | + | + | 60 | 71 | 51 | 100 | 104 | 43 |
C...(....... | + | + | + | + | + | + | 59 | 62 | 43 | 87 | 91 | 35 |
C...1....... | + | + | + | + | + | + | 59 | 61 | 43 | 80 | 76 | 26 |
N...(....... | + | + | + | + | + | + | 50 | 54 | 38 | 50 | 50 | 16 |
1...(....... | + | + | + | + | + | + | 41 | 46 | 28 | 25 | 29 | 14 |
N...C....... | + | + | + | + | + | + | 41 | 43 | 29 | 61 | 59 | 20 |
C5......0... | + | + | + | + | + | + | 36 | 40 | 31 | 66 | 70 | 32 |
N...1....... | + | + | + | + | + | + | 36 | 33 | 22 | 23 | 23 | 6 |
O...C....... | + | + | + | + | + | + | 22 | 23 | 20 | 42 | 35 | 10 |
(........... | - | - | - | - | - | - | 62 | 71 | 51 | 88 | 91 | 35 |
=...(....... | - | - | - | - | - | - | 62 | 69 | 51 | 62 | 58 | 23 |
=........... | - | - | - | - | - | - | 62 | 71 | 51 | 89 | 86 | 30 |
N........... | - | - | - | - | - | - | 54 | 62 | 41 | 74 | 69 | 22 |
pIC50-split1-logBB-split2 | ||||||||||||
1........... | + | + | + | + | + | + | 62 | 71 | 51 | 74 | 88 | 26 |
O...=....... | + | + | + | + | + | + | 62 | 71 | 51 | 42 | 52 | 19 |
C3......0... | + | + | + | + | + | + | 60 | 71 | 51 | 103 | 103 | 41 |
C...(....... | + | + | + | + | + | + | 59 | 62 | 43 | 83 | 93 | 36 |
C...1....... | + | + | + | + | + | + | 59 | 61 | 43 | 74 | 88 | 26 |
N...(....... | + | + | + | + | + | + | 50 | 54 | 38 | 47 | 50 | 18 |
1...(....... | + | + | + | + | + | + | 41 | 46 | 28 | 33 | 28 | 14 |
N...C....... | + | + | + | + | + | + | 41 | 43 | 29 | 61 | 68 | 19 |
F........... | + | + | + | + | + | + | 37 | 38 | 28 | 21 | 11 | 5 |
C5......0... | + | + | + | + | + | + | 36 | 40 | 31 | 68 | 66 | 36 |
N...1....... | + | + | + | + | + | + | 36 | 33 | 22 | 26 | 27 | 5 |
O...C....... | + | + | + | + | + | + | 22 | 23 | 20 | 36 | 40 | 12 |
(........... | - | - | - | - | - | - | 62 | 71 | 51 | 84 | 94 | 36 |
=...(....... | - | - | - | - | - | - | 62 | 69 | 51 | 56 | 71 | 23 |
=........... | - | - | - | - | - | - | 62 | 71 | 51 | 85 | 96 | 28 |
N........... | - | - | - | - | - | - | 54 | 62 | 41 | 70 | 78 | 23 |
pIC50-split1-logBB-split3 | ||||||||||||
O...=....... | + | + | + | + | + | + | 62 | 71 | 51 | 49 | 45 | 18 |
C3......0... | + | + | + | + | + | + | 60 | 71 | 51 | 102 | 104 | 40 |
C...(....... | + | + | + | + | + | + | 59 | 62 | 43 | 90 | 89 | 35 |
C...1....... | + | + | + | + | + | + | 59 | 61 | 43 | 77 | 83 | 27 |
N...(....... | + | + | + | + | + | + | 50 | 54 | 38 | 58 | 43 | 17 |
1...(....... | + | + | + | + | + | + | 41 | 46 | 28 | 34 | 27 | 14 |
N...C....... | + | + | + | + | + | + | 41 | 43 | 29 | 61 | 63 | 21 |
C5......0... | + | + | + | + | + | + | 36 | 40 | 31 | 71 | 66 | 32 |
O...C....... | + | + | + | + | + | + | 22 | 23 | 20 | 31 | 37 | 14 |
(........... | - | - | - | - | - | - | 62 | 71 | 51 | 92 | 89 | 35 |
=...(....... | - | - | - | - | - | - | 62 | 69 | 51 | 66 | 63 | 18 |
=........... | - | - | - | - | - | - | 62 | 71 | 51 | 84 | 93 | 30 |
N........... | - | - | - | - | - | - | 54 | 62 | 41 | 69 | 75 | 24 |
(...(....... | - | - | - | - | - | - | 39 | 44 | 34 | 23 | 16 | 7 |
pIC50-split2-logBB-split1 | ||||||||||||
C...(....... | + | + | + | + | + | + | 61 | 61 | 45 | 87 | 91 | 35 |
C...1....... | + | + | + | + | + | + | 61 | 58 | 43 | 80 | 76 | 26 |
N...(....... | + | + | + | + | + | + | 54 | 49 | 36 | 50 | 50 | 16 |
C...C....... | + | + | + | + | + | + | 51 | 58 | 42 | 90 | 88 | 41 |
C5......0... | + | + | + | + | + | + | 43 | 36 | 34 | 66 | 70 | 32 |
N...C....... | + | + | + | + | + | + | 39 | 41 | 27 | 61 | 59 | 20 |
O...C....... | + | + | + | + | + | + | 22 | 18 | 19 | 42 | 35 | 10 |
(........... | - | - | - | - | - | - | 66 | 67 | 50 | 88 | 91 | 35 |
=........... | - | - | - | - | - | - | 66 | 67 | 50 | 89 | 86 | 30 |
N........... | - | - | - | - | - | - | 57 | 60 | 39 | 74 | 69 | 22 |
pIC50-split2-logBB-split2 | ||||||||||||
1........... | + | + | + | + | + | + | 66 | 67 | 50 | 74 | 88 | 26 |
C...(....... | + | + | + | + | + | + | 61 | 61 | 45 | 83 | 93 | 36 |
C...1....... | + | + | + | + | + | + | 61 | 58 | 43 | 74 | 88 | 26 |
N...(....... | + | + | + | + | + | + | 54 | 49 | 36 | 47 | 50 | 18 |
C...C....... | + | + | + | + | + | + | 51 | 58 | 42 | 89 | 96 | 37 |
2........... | + | + | + | + | + | + | 45 | 51 | 32 | 56 | 71 | 15 |
C5......0... | + | + | + | + | + | + | 43 | 36 | 34 | 68 | 66 | 36 |
N...C....... | + | + | + | + | + | + | 39 | 41 | 27 | 61 | 68 | 19 |
F........... | + | + | + | + | + | + | 34 | 37 | 28 | 21 | 11 | 5 |
O...C....... | + | + | + | + | + | + | 22 | 18 | 19 | 36 | 40 | 12 |
(........... | - | - | - | - | - | - | 66 | 67 | 50 | 84 | 94 | 36 |
=........... | - | - | - | - | - | - | 66 | 67 | 50 | 85 | 96 | 28 |
N........... | - | - | - | - | - | - | 57 | 60 | 39 | 70 | 78 | 23 |
pIC50-split2-logBB-split3 | ||||||||||||
C...(....... | + | + | + | + | + | + | 61 | 61 | 45 | 90 | 89 | 35 |
C...1....... | + | + | + | + | + | + | 61 | 58 | 43 | 77 | 83 | 27 |
N...(....... | + | + | + | + | + | + | 54 | 49 | 36 | 58 | 43 | 17 |
C...C....... | + | + | + | + | + | + | 51 | 58 | 42 | 92 | 97 | 33 |
C5......0... | + | + | + | + | + | + | 43 | 36 | 34 | 71 | 66 | 32 |
N...C....... | + | + | + | + | + | + | 39 | 41 | 27 | 61 | 63 | 21 |
O...C....... | + | + | + | + | + | + | 22 | 18 | 19 | 31 | 37 | 14 |
(........... | - | - | - | - | - | - | 66 | 67 | 50 | 92 | 89 | 35 |
=........... | - | - | - | - | - | - | 66 | 67 | 50 | 84 | 93 | 30 |
N........... | - | - | - | - | - | - | 57 | 60 | 39 | 69 | 75 | 24 |
(...(....... | - | - | - | - | - | - | 32 | 44 | 32 | 23 | 16 | 7 |
pIC50-split3-logBB-split1 | ||||||||||||
O...=....... | + | + | + | + | + | + | 61 | 63 | 55 | 45 | 49 | 17 |
C...1....... | + | + | + | + | + | + | 58 | 60 | 46 | 80 | 76 | 26 |
N...(....... | + | + | + | + | + | + | 50 | 48 | 43 | 50 | 50 | 16 |
N...C....... | + | + | + | + | + | + | 37 | 42 | 33 | 61 | 59 | 20 |
C5......0... | + | + | + | + | + | + | 35 | 37 | 34 | 66 | 70 | 32 |
N...1....... | + | + | + | + | + | + | 32 | 31 | 29 | 23 | 23 | 6 |
(........... | - | - | - | - | - | - | 61 | 63 | 55 | 88 | 91 | 35 |
=........... | - | - | - | - | - | - | 61 | 63 | 55 | 89 | 86 | 30 |
N........... | - | - | - | - | - | - | 54 | 53 | 48 | 74 | 69 | 22 |
pIC50-split3-logBB-split2 | ||||||||||||
1........... | + | + | + | + | + | + | 61 | 63 | 55 | 74 | 88 | 26 |
O...=....... | + | + | + | + | + | + | 61 | 63 | 55 | 42 | 52 | 19 |
C...1....... | + | + | + | + | + | + | 58 | 60 | 46 | 74 | 88 | 26 |
N...(....... | + | + | + | + | + | + | 50 | 48 | 43 | 47 | 50 | 18 |
2........... | + | + | + | + | + | + | 48 | 43 | 36 | 56 | 71 | 15 |
N...C....... | + | + | + | + | + | + | 37 | 42 | 33 | 61 | 68 | 19 |
C5......0... | + | + | + | + | + | + | 35 | 37 | 34 | 68 | 66 | 36 |
N...1....... | + | + | + | + | + | + | 32 | 31 | 29 | 26 | 27 | 5 |
(........... | - | - | - | - | - | - | 61 | 63 | 55 | 84 | 94 | 36 |
=........... | - | - | - | - | - | - | 61 | 63 | 55 | 85 | 96 | 28 |
N........... | - | - | - | - | - | - | 54 | 53 | 48 | 70 | 78 | 23 |
pIC50-split3-logBB-split3 | ||||||||||||
O...=....... | + | + | + | + | + | + | 61 | 63 | 55 | 49 | 45 | 18 |
C...1....... | + | + | + | + | + | + | 58 | 60 | 46 | 77 | 83 | 27 |
N...(....... | + | + | + | + | + | + | 50 | 48 | 43 | 58 | 43 | 17 |
N...C....... | + | + | + | + | + | + | 37 | 42 | 33 | 61 | 63 | 21 |
C5......0... | + | + | + | + | + | + | 35 | 37 | 34 | 71 | 66 | 32 |
(........... | - | - | - | - | - | - | 61 | 63 | 55 | 92 | 89 | 35 |
=........... | - | - | - | - | - | - | 61 | 63 | 55 | 84 | 93 | 30 |
N........... | - | - | - | - | - | - | 54 | 53 | 48 | 69 | 75 | 24 |
*)TRN1, iTRN1 and CLB1 are the numbers of a feature in the training, invisible training and calibration sets for endpoint 1; TRN2, iTRN2 and CLB2 mean the same for endpoint 2.
3.6. Molecular Features, which have Opposite Effects for pIC50 and logBB
Table 5 lists the molecular features that have opposite effects on pIC50 and logBB. To a first approximation, the presence of two rings and the presence of carbon with a double covalent bond have opposite effects on pIC50 and logBB.
Table 5.
Feature | 1 | 1 | 1 | 2 | 2 | 2 | TRN1* | iTRN1 | CLB1 | TRN2 | iTRN2 | CLB2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
pIC50-split1-logBB-split1 | ||||||||||||
O...(....... | + | + | + | - | - | - | 62 | 71 | 51 | 43 | 49 | 19 |
2...(....... | + | + | + | - | - | - | 31 | 33 | 18 | 36 | 35 | 10 |
C........... | - | - | - | + | + | + | 62 | 71 | 51 | 101 | 102 | 42 |
C...=....... | - | - | - | + | + | + | 26 | 30 | 14 | 80 | 80 | 24 |
pIC50-split1-logBB-split2 | ||||||||||||
O...(....... | + | + | + | - | - | - | 62 | 71 | 51 | 41 | 54 | 18 |
C4......0... | + | + | + | - | - | - | 60 | 71 | 51 | 103 | 106 | 41 |
2...(....... | + | + | + | - | - | - | 31 | 33 | 18 | 35 | 43 | 8 |
C7......0... | + | + | + | - | - | - | 28 | 21 | 22 | 90 | 88 | 34 |
C........... | - | - | - | + | + | + | 62 | 71 | 51 | 101 | 106 | 41 |
C...=....... | - | - | - | + | + | + | 26 | 30 | 14 | 80 | 86 | 24 |
pIC50-split1-logBB-split3 | ||||||||||||
O...(....... | + | + | + | - | - | - | 62 | 71 | 51 | 46 | 48 | 15 |
2...(....... | + | + | + | - | - | - | 31 | 33 | 18 | 37 | 36 | 11 |
C7......0... | + | + | + | - | - | - | 28 | 21 | 22 | 92 | 86 | 35 |
C........... | - | - | - | + | + | + | 62 | 71 | 51 | 103 | 104 | 40 |
C...=....... | - | - | - | + | + | + | 26 | 30 | 14 | 76 | 87 | 25 |
pIC50-split2-logBB-split1 | ||||||||||||
O........... | + | + | + | - | - | - | 66 | 67 | 50 | 66 | 69 | 27 |
2...(....... | + | + | + | - | - | - | 31 | 30 | 25 | 36 | 35 | 10 |
C........... | - | - | - | + | + | + | 66 | 67 | 50 | 101 | 102 | 42 |
C3......0... | - | - | - | + | + | + | 66 | 66 | 49 | 99 | 102 | 43 |
C...=....... | - | - | - | + | + | + | 26 | 28 | 23 | 80 | 80 | 24 |
pIC50-split2-logBB-split2 | ||||||||||||
O........... | + | + | + | - | - | - | 66 | 67 | 50 | 64 | 75 | 28 |
2...(....... | + | + | + | - | - | - | 31 | 30 | 25 | 35 | 43 | 8 |
C7......0... | + | + | + | - | - | - | 27 | 23 | 21 | 90 | 88 | 34 |
C........... | - | - | - | + | + | + | 66 | 67 | 50 | 101 | 106 | 41 |
C3......0... | - | - | - | + | + | + | 66 | 66 | 49 | 103 | 103 | 41 |
C...=....... | - | - | - | + | + | + | 26 | 28 | 23 | 80 | 86 | 24 |
pIC50-split2-logBB-split3 | ||||||||||||
O........... | + | + | + | - | - | - | 66 | 67 | 50 | 66 | 69 | 28 |
2...(....... | + | + | + | - | - | - | 31 | 30 | 25 | 37 | 36 | 11 |
C7......0... | + | + | + | - | - | - | 27 | 23 | 21 | 92 | 86 | 35 |
C........... | - | - | - | + | + | + | 66 | 67 | 50 | 103 | 104 | 40 |
C3......0... | - | - | - | + | + | + | 66 | 66 | 49 | 102 | 104 | 40 |
C...=....... | - | - | - | + | + | + | 26 | 28 | 23 | 76 | 87 | 25 |
pIC50-split3-logBB-split1 | ||||||||||||
=...(....... | + | + | + | - | - | - | 61 | 63 | 53 | 62 | 58 | 23 |
O...(....... | + | + | + | - | - | - | 61 | 63 | 55 | 43 | 49 | 19 |
C........... | - | - | - | + | + | + | 61 | 63 | 55 | 101 | 102 | 42 |
C...C....... | - | - | - | + | + | + | 48 | 47 | 46 | 90 | 88 | 41 |
C...=....... | - | - | - | + | + | + | 28 | 25 | 22 | 80 | 80 | 24 |
pIC50-split3-logBB-split2 | ||||||||||||
=...(....... | + | + | + | - | - | - | 61 | 63 | 53 | 56 | 71 | 23 |
O...(....... | + | + | + | - | - | - | 61 | 63 | 55 | 41 | 54 | 18 |
C7......0... | + | + | + | - | - | - | 22 | 23 | 24 | 90 | 88 | 34 |
C........... | - | - | - | + | + | + | 61 | 63 | 55 | 101 | 106 | 41 |
C...C....... | - | - | - | + | + | + | 48 | 47 | 46 | 89 | 96 | 37 |
C...=....... | - | - | - | + | + | + | 28 | 25 | 22 | 80 | 86 | 24 |
pIC50-split3-logBB-split3 | ||||||||||||
=...(....... | + | + | + | - | - | - | 61 | 63 | 53 | 66 | 63 | 18 |
O...(....... | + | + | + | - | - | - | 61 | 63 | 55 | 46 | 48 | 15 |
C7......0... | + | + | + | - | - | - | 22 | 23 | 24 | 92 | 86 | 35 |
C........... | - | - | - | + | + | + | 61 | 63 | 55 | 103 | 104 | 40 |
C...C....... | - | - | - | + | + | + | 48 | 47 | 46 | 92 | 97 | 33 |
C...=....... | - | - | - | + | + | + | 28 | 25 | 22 | 76 | 87 | 25 |
*)TRN1, iTRN1 and CLB1 are the numbers of a feature in the training, invisible training and calibration sets for endpoint 1; TRN2, iTRN2 and CLB2 mean the same for endpoint 2.
Note that the number of features that have the same effect on pIC50 and logBB is larger than the number of features that have opposite effects. Consequently, considering the interrelations between these endpoints (and possibly others) may be a promising direction for drug discovery.
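The comparison behind Tables 4 and 5 can be sketched in a few lines of Python. This is an illustration only, not the authors' procedure: the function name `compare` is hypothetical, and the dictionaries hold only a subset of the stable signs listed in the pIC50-split1-logBB-split1 blocks of Tables 4 and 5, so the counts below do not reproduce the full tables.

```python
from typing import Dict, List, Tuple

# Stable sign of the correlation weights for each feature and endpoint.
# Subset copied from the pIC50-split1-logBB-split1 blocks of Tables 4 and 5.
PIC50_SIGNS: Dict[str, str] = {
    "O...=.......": "+", "C...(.......": "+",
    "(...........": "-", "N...........": "-",
    "O...(.......": "+", "2...(.......": "+",
    "C...........": "-", "C...=.......": "-",
}
LOGBB_SIGNS: Dict[str, str] = {
    "O...=.......": "+", "C...(.......": "+",
    "(...........": "-", "N...........": "-",
    "O...(.......": "-", "2...(.......": "-",
    "C...........": "+", "C...=.......": "+",
}


def compare(a: Dict[str, str], b: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Split the features shared by both endpoints into same-sign and opposite-sign lists."""
    same: List[str] = []
    opposite: List[str] = []
    for feature in sorted(a.keys() & b.keys()):
        (same if a[feature] == b[feature] else opposite).append(feature)
    return same, opposite


same, opposite = compare(PIC50_SIGNS, LOGBB_SIGNS)
print("same effect:    ", same)
print("opposite effect:", opposite)
```

Applied to the full lists of stable features for a pair of splits, the lengths of the two returned lists give the counts compared in the paragraph above.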
The Supplementary Material section contains the SMILES and numerical data on the examined endpoints.
CONCLUSION
There are arguments for considering the interrelation between gamma-secretase inhibitor activity (pIC50) and blood-brain barrier permeation (logBB). The interrelation is described in the literature and confirmed in this work (Table 4). It can be detected and described in terms of molecular features extracted from SMILES and the molecular graph, which are involved in building QSAR models for pIC50 and logBB. Examining whether the presence of a molecular feature has an equivalent or opposite effect on a second endpoint can be useful for other pairs of endpoints. From a practical point of view, these can be (a) water solubility and octanol-water partition coefficient; (b) water solubility and toxicity; (c) carcinogenicity and mutagenicity, etc.
Acknowledgements
AAT and APT thank the project LIFE-COMBASE contract (LIFE15 ENV/ES/000416) for financial support.
LIST OF ABBREVIATIONS
- QSAR: Quantitative structure-activity relationships
- CWs: Correlation weights
- BBB: Blood brain barrier
- AD: Alzheimer's disease
- SMILES: Simplified molecular input-line entry system
SUPPLEMENTARY MATERIAL
Consent for Publication
Not applicable.
Conflict of Interest
The authors declare no conflict of interest, financial or otherwise.
REFERENCES
- 1. Trifunović J., Borčić V., Vukmirović S., Mikov M. Assessment of the pharmacokinetic profile of novel s-triazine derivatives and their potential use in treatment of Alzheimer’s disease. Life Sci. 2017;168:1–6. doi: 10.1016/j.lfs.2016.11.001. [http://dx.doi.org/10.1016/j.lfs.2016.11.001]. [PMID: 27818183].
- 2. Querfurth H.W., LaFerla F.M. Alzheimer’s disease. N. Engl. J. Med. 2010;362(4):329–344. doi: 10.1056/NEJMra0909142. [http://dx.doi.org/10.1056/NEJMra0909142]. [PMID: 20107219].
- 3. Alzheimer’s disease facts and figures. Alzheimers Dement. 2015;11(3):332–384. doi: 10.1016/j.jalz.2015.02.003. [http://dx.doi.org/10.1016/j.jalz.2015.02.003]. [PMID: 25984581].
- 4. Braak H., Braak E. Neuropathological stageing of Alzheimer-related changes. Acta Neuropathol. 1991;82(4):239–259. doi: 10.1007/BF00308809. [http://dx.doi.org/10.1007/BF00308809]. [PMID: 1759558].
- 5. Felician O., Sandson T.A. The neurobiology and pharmacotherapy of Alzheimer’s disease. J. Neuropsychiatry Clin. Neurosci. 1999;11(1):19–31. doi: 10.1176/jnp.11.1.19. [http://dx.doi.org/10.1176/jnp.11.1.19]. [PMID: 9990552].
- 6. Sanders S., Morano C. Alzheimer’s disease and related dementias. J. Gerontol. Soc. Work. 2008;50(Suppl. 1):191–214. doi: 10.1080/01634370802137900. [http://dx.doi.org/10.1080/01634370802137900]. [PMID: 18924393].
- 7. Abbott N.J., Patabendige A.A.K., Dolman D.E.M., Yusof S.R., Begley D.J. Structure and function of the blood-brain barrier. Neurobiol. Dis. 2010;37(1):13–25. doi: 10.1016/j.nbd.2009.07.030. [http://dx.doi.org/10.1016/j.nbd.2009.07.030]. [PMID: 19664713].
- 8. Pardridge W.M. The blood-brain barrier: bottleneck in brain drug development. NeuroRx. 2005;2(1):3–14. doi: 10.1602/neurorx.2.1.3. [http://dx.doi.org/10.1602/neurorx.2.1.3]. [PMID: 15717053].
- 9. Pan W., Banks W.A., Fasold M.B., Bluth J., Kastin A.J. Transport of brain-derived neurotrophic factor across the blood-brain barrier. Neuropharmacology. 1998;37(12):1553–1561. doi: 10.1016/s0028-3908(98)00141-5. [http://dx.doi.org/10.1016/S0028-3908(98)00141-5]. [PMID: 9886678].
- 10. Clifford P.M., Zarrabi S., Siu G., Kinsler K.J., Kosciuk M.C., Venkataraman V., D’Andrea M.R., Dinsmore S., Nagele R.G. Abeta peptides can enter the brain through a defective blood-brain barrier and bind selectively to neurons. Brain Res. 2007;1142(1):223–236. doi: 10.1016/j.brainres.2007.01.070. [http://dx.doi.org/10.1016/j.brainres.2007.01.070]. [PMID: 17306234].
- 11. Zlokovic B.V. New therapeutic targets in the neurovascular pathway in Alzheimer’s disease. Neurotherapeutics. 2008;5(3):409–414. doi: 10.1016/j.nurt.2008.05.011. [http://dx.doi.org/10.1016/j.nurt.2008.05.011]. [PMID: 18625452].
- 12. Nagele R.G., Clifford P.M., Siu G., Levin E.C., Acharya N.K., Han M., Kosciuk M.C., Venkataraman V., Zavareh S., Zarrabi S., Kinsler K., Thaker N.G., Nagele E.P., Dash J., Wang H.Y., Levitas A. Brain-reactive autoantibodies prevalent in human sera increase intraneuronal amyloid-β(1-42) deposition. J. Alzheimers Dis. 2011;25(4):605–622. doi: 10.3233/JAD-2011-110098. [PMID: 21483091].
- 13. Bowman G.L., Quinn J.F. Alzheimer’s disease and the blood-brain barrier: Past, present and future. Aging Health. 2008;4(1):47–55. doi: 10.2217/1745509X.4.1.47. [http://dx.doi.org/10.2217/1745509X.4.1.47]. [PMID: 19924258].
- 14. Heininger K. A unifying hypothesis of Alzheimer’s disease. III. Risk factors. Hum. Psychopharmacol. 2000;15(1):1–70. doi: 10.1002/(SICI)1099-1077(200001)15:1<1::AID-HUP153>3.0.CO;2-1. [http://dx.doi.org/10.1002/(SICI)1099-1077(200001)15:1<1::AID-HUP153>3.0.CO;2-1]. [PMID: 12404343].
- 15. Hermens W.T.J.M.C., Verhaagen J. Viral vectors, tools for gene transfer in the nervous system. Prog. Neurobiol. 1998;55(4):399–432. doi: 10.1016/s0301-0082(98)00007-0. [http://dx.doi.org/10.1016/S0301-0082(98)00007-0]. [PMID: 9654386].
- 16. Campos-Bedolla P., Walter F.R., Veszelka S., Deli M.A. Role of the blood-brain barrier in the nutrition of the central nervous system. Arch. Med. Res. 2014;45(8):610–638. doi: 10.1016/j.arcmed.2014.11.018. [http://dx.doi.org/10.1016/j.arcmed.2014.11.018]. [PMID: 25481827].
- 17. Shityakov S., Förster C. In silico predictive model to determine vector-mediated transport properties for the blood-brain barrier choline transporter. Adv. Appl. Bioinform. Chem. 2014;7:23–36. doi: 10.2147/AABC.S63749. [http://dx.doi.org/10.2147/AABC.S63749]. [PMID: 25214795].
- 18. Bujak R., Struck-Lewicka W., Kaliszan M., Kaliszan R., Markuszewski M.J. Blood-brain barrier permeability mechanisms in view of quantitative structure-activity relationships (QSAR). J. Pharm. Biomed. Anal. 2015;108:29–37. doi: 10.1016/j.jpba.2015.01.046. [http://dx.doi.org/10.1016/j.jpba.2015.01.046]. [PMID: 25703237].
- 19. Zhang D., Xiao J., Zhou N., Zheng M., Luo X., Jiang H., Chen K. A genetic algorithm based support vector machine model for blood-brain barrier penetration prediction. BioMed Res. Int. 2015;2015:292683. doi: 10.1155/2015/292683. [http://dx.doi.org/10.1155/2015/292683]. [PMID: 26504797].
- 20. Suenderhauf C., Hammann F., Huwyler J. Computational prediction of blood-brain barrier permeability using decision tree induction. Molecules. 2012;17(9):10429–10445. doi: 10.3390/molecules170910429. [http://dx.doi.org/10.3390/molecules170910429]. [PMID: 22941223].
- 21. Geldenhuys W.J., Manda V.K., Mittapalli R.K., Van der Schyf C.J., Crooks P.A., Dwoskin L.P., Allen D.D., Lockman P.R. Predictive screening model for potential vector-mediated transport of cationic substrates at the blood-brain barrier choline transporter. Bioorg. Med. Chem. Lett. 2010;20(3):870–877. doi: 10.1016/j.bmcl.2009.12.079. [http://dx.doi.org/10.1016/j.bmcl.2009.12.079]. [PMID: 20053562].
- 22. Trifunović J., Borčić V., Vukmirović S., Mikov M. Assessment of the pharmacokinetic profile of novel s-triazine derivatives and their potential use in treatment of Alzheimer’s disease. Life Sci. 2017;168:1–6. doi: 10.1016/j.lfs.2016.11.001. [http://dx.doi.org/10.1016/j.lfs.2016.11.001]. [PMID: 27818183].
- 23. Nikolic K., Mavridis L., Djikic T., Vucicevic J., Agbaba D., Yelekci K., Mitchell J.B.O. Drug design for CNS diseases: Polypharmacological profiling of compounds using cheminformatic, 3D-QSAR and virtual screening methodologies. Front. Neurosci. 2016;10(JUN):265. doi: 10.3389/fnins.2016.00265. [PMID: 27375423].
- 24. Tian Y.L., Lv M., Li J.J., Xu T., Zhai H.L., Zhang X.Y. Study on the active mechanism of β-secretase inhibitors by molecular simulations. Eur. J. Pharm. Sci. 2015;76:138–148. doi: 10.1016/j.ejps.2015.05.007. [http://dx.doi.org/10.1016/j.ejps.2015.05.007]. [PMID: 25965961].
- 25. Ajmani S., Janardhan S., Viswanadhan V.N. Toward a general predictive QSAR model for gamma-secretase inhibitors. Mol. Divers. 2013;17(3):421–434. doi: 10.1007/s11030-013-9441-2. [http://dx.doi.org/10.1007/s11030-013-9441-2]. [PMID: 23612850].
- 26. Toropova M.A., Toropov A.A., Raška I., Jr, Rašková M. Searching therapeutic agents for treatment of Alzheimer disease using the Monte Carlo method. Comput. Biol. Med. 2015;64:148–154. doi: 10.1016/j.compbiomed.2015.06.019. [http://dx.doi.org/10.1016/j.compbiomed.2015.06.019]. [PMID: 26164035].
- 27. Hou T., Xu X. ADME evaluation in drug discovery. 1. Applications of genetic algorithms to the prediction of blood-brain partitioning of a large set of drugs. J. Mol. Model. 2002;8(12):337–349. doi: 10.1007/s00894-002-0101-1. [http://dx.doi.org/10.1007/s00894-002-0101-1]. [PMID: 12541001].
- 28. Weininger D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 1988;28:31–36. [http://dx.doi.org/10.1021/ci00057a005].
- 29. Toropova A.P., Toropov A.A., Martyanov S.E., Benfenati E., Gini G., Leszczynska D., Leszczynski J. CORAL: Monte Carlo method as a tool for the prediction of the bioconcentration factor of industrial pollutants. Mol. Inform. 2013;32(2):145–154. doi: 10.1002/minf.201200069. [http://dx.doi.org/10.1002/minf.201200069]. [PMID: 27481276].
- 30. Begum S., Achary P.G.R. Simplified molecular input line entry system-based: QSAR modelling for MAP kinase-interacting protein kinase (MNK1). SAR QSAR Environ. Res. 2015;26(5):343–361. doi: 10.1080/1062936X.2015.1039577. [http://dx.doi.org/10.1080/1062936X.2015.1039577]. [PMID: 25967103].
- 31. Achary P.G.R. QSPR modelling of dielectric constants of π-conjugated organic compounds by means of the CORAL software. SAR QSAR Environ. Res. 2014;25(6):507–526. doi: 10.1080/1062936X.2014.899267. [http://dx.doi.org/10.1080/1062936X.2014.899267]. [PMID: 24716837].