Data in Brief. 2018 Dec 19;22:471–483. doi: 10.1016/j.dib.2018.12.047

FABP4 inhibitors 3D-QSAR model and isosteric replacement of BMS309403 datasets

Giuseppe Floresta a,b,c,, Agostino Cilibrizzi c,d, Vincenzo Abbate d, Ambra Spampinato a, Chiara Zagni a, Antonio Rescifina a,⁎⁎
PMCID: PMC6312796  PMID: 30619925

Abstract

The data were obtained from previously published FABP4 inhibitors. These 120 compounds were used to build a 3D-QSAR model. The QSAR model was developed with Forge software, using the PM3-optimized structure and the experimental IC50 of each compound. The model was also employed to predict the activity of 3000 new isosteric derivatives of BMS309403. The isosteric replacement was further validated by the synthesis and biological screening of three new compounds, reported in the related research article “3D-QSAR assisted identification of FABP4 inhibitors: An effective scaffold hopping analysis/QSAR evaluation” (Floresta et al., 2019).


Specifications table

Subject area: Computational Chemistry
More specific subject area: Three-Dimensional Quantitative Structure-Activity Relationship (3D-QSAR) modeling
Type of data: Tables, figures
How data was acquired: Statistical modeling and online databases
Data format: Raw and analyzed
Experimental factors: The whole dataset consists of 120 FABP4 ligands and 3000 isosteric derivatives of BMS309403
Experimental features: The 3D-QSAR model was developed using Forge. Chemical structure descriptors and pIC50 were used as variables. Spark was used for the isosteric replacement
Data source location: Department of Drug Sciences, University of Catania, Italy
Data accessibility: Data is with this article
Related research article: G. Floresta, A. Cilibrizzi, V. Abbate, A. Spampinato, C. Zagni, A. Rescifina, 3D-QSAR assisted identification of FABP4 inhibitors: An effective scaffold hopping analysis/QSAR evaluation, Bioorganic Chemistry, 84 (2019) 276–284 [1].

Value of the data

  • FABP4 has recently emerged as an interesting molecular target for the treatment of type 2 diabetes, other metabolic diseases, and some types of cancer.

  • QSAR modeling data was generated to provide a method useful in finding or repurposing novel FABP4 ligands.

  • The model has also been used to predict the activity of 3000 isosteric derivatives of BMS309403.

  • The data can be used by others to build their own model.

  • The data can be used to guide the synthesis of some of the suggested potent compounds.

1. Data

FABP4 has recently emerged as an interesting molecular target for the treatment of type 2 diabetes, other metabolic diseases, and some types of cancer [2], [3], [4], [5], [6], [7], [8], [9], [10]. A variety of effective FABP4 inhibitors have been developed [11], but unfortunately none of them is currently in the clinical research phases (Table 1). Computer-aided molecular design (CAMD) is a promising and effective tool for the identification of FABP4 inhibitors [12], [13], [14], [15]. In line with our recent interest in the development of QSAR models and related applications [16], [17], [18], [19], [20], [21], [22], [23], [24], and in order to identify novel hit compounds, we report herein the dataset and the parameters used to build a 3D-QSAR model for FABP4. The dataset is reported in Tables 2 and 3, which list the molecules used in the training set (96) and in the test set (24), respectively. Information on the building of the 3D-QSAR model is reported in Figs. 1–9. Moreover, the 3D-QSAR model was used to predict the biological activity of 3000 new isosteric derivatives of BMS309403 derived from a scaffold-hopping analysis; the analyzed areas of the selected compound and the Spark's parameters used for the isosteric replacement are reported in Figs. 8 and 9. The results of the isosteric replacement of different portions of BMS309403 are reported in Tables S4–S9.

Table 1.

PDB codes and molecules used as reference compounds for ligand-based alignment.

Table 2.

SMILES, experimental and predicted pIC50 values of the molecules in the training set.

No. SMILES pIC50 (Exp.) pIC50 (Pred.)
1 FC(F)(F)[C@H]1CCc2c(C1)c(c(c(n2)C3CCCC3)C=4[N-]N=NN4)-c5ccnc(c5)C 8.0 8.0
2 CC1(CCCC1)c2c(c(c3c(n2)CCCCC3)-c4ccnc(c4)C)C=5[N-]N=NN5 8.0 8.0
3 Clc1c(F)cc2c(c(c(c(N(CC)CC)n2)C=3[N-]N=NN3)-c4ccccc4)c1 7.9 7.9
4 Clc1c(F)cc2c(c(c(c(n2)C(CC)CC)C=3[N-]N=NN3)-c4ccccc4)c1 7.8 7.8
5 OCC1(CCCC1)c2c(c(c3c(n2)CCCCC3)-c4ccnc(c4)C)C=5[N-]N=NN5 7.7 7.7
6 CCCCC[C@H]1CCc2c(C1)c(c(c(n2)C3(CCCC3)COC)C=4[N-]N=NN4)-c5ccccc5 7.7 7.7
7 FC(F)(F)c1ccc2c(c(c(c(N3CCCCC3)n2)C=4[N-]N=NN4)-c5ccccc5)c1 7.5 7.5
8 Clc1ccc2c(c(c(c(n2)C3CC3)C([O-])=O)-c4ccccc4)c1 7.4 7.4
9 Clc1ccc2c(c(c(c(N(CC)C)n2)C=3[N-]N=NN3)-c4ccccc4)c1 7.3 7.4
10 Clc1cc(Cl)cc(NC(=O)NC2(CCCC2)C([O-])=O)c1-c3ccccc3 7.3 7.3
11 Clc1c(F)cc(c(NC(=O)NC2(CCCC2)C([O-])=O)c1)-c3ccccc3 7.0 7.0
12 O=C(N)c1ccccc1Cn2c3c(cccc3c4CCCCCc42)C([O-])=O 7.0 7.0
13 n1c2c(CCCCC2)c(c(c1C3CCCCC3)C=4[N-]N=NN4)-c5ccncc5 7.0 6.9
14 Clc1ccc(c(NC(=O)NC2(CCCC2)C([O-])=O)c1)-c3ccc(F)cc3 6.9 6.9
15 FC(F)(F)c1ccccc1Cn2c3c(cccc3c4CCCCc42)C([O-])=O 6.4 6.5
16 Fc1ccc(-c2c(c(n(n2)-c3ccccc3-c4cccc(OCC([O-])=O)c4)CC)-c5ccccc5)cc1 6.5 6.5
17 [O-]C(=O)c1cccc2c3CCCCCc3n(c12)Cc4ccccc4 6.2 6.3
18 Fc1ccccc1Cn2c3c(cccc3c4CCCCc42)C([O-])=O 6.4 6.3
19 Fc1cccc(Cn2c3c(cccc3c4CCCCc42)C([O-])=O)c1 6.4 6.3
20 FC(F)(F)c1ccccc1Cn2c3c(cccc3c4CCCCCc42)C([O-])=O 6.2 6.3
21 [O-]C(=O)CCCn1c2ccccc2c3ccccc31 6.2 6.3
22 FC(F)(F)c1ccc(c(NC(=O)NC2(CCCC2)C([O-])=O)c1)-c3ccccc3 6.3 6.2
23 [O-]C(=O)c1cccc2c3CCCCc3n(c12)Cc4cccc(OC)c4 6.3 6.2
24 Fc1cccc(Cn2c3c(cccc3c4CCCCCc42)C([O-])=O)c1 6.1 6.2
25 FC(F)(F)c1cc(O)nc(SCc2ccc(OC)cc2)n1 6.2 6.2
26 [O-]C(=O)c1ccc2c(n(c3CCCCc23)Cc4ccccc4)c1 6.1 6.1
27 [O-]C(=O)c1cccc2c3CCCc3n(c12)Cc4ccccc4 6.1 6.1
28 [O-]C(=O)c1cccc2c3CCCCc3n(c12)Cc4ccccc4OC 6.2 6.1
29 [O-]C(=O)c1cccc2c3CCCCc3n(c12)Cc4ccc(C)cc4 6.0 6.1
30 Fc1ccccc1Cn2c3c(cccc3c4CCCCCc42)C([O-])=O 6.2 6.1
31 Fc1ccc(Cn2c3c(cccc3c4CCCCCc42)C([O-])=O)cc1 6.1 6.1
32 [O-]C(=O)CCCCn1c2ccccc2c3ccccc31 6.1 6.1
33 FC(F)(F)c1cccc(Cn2c3c(cccc3c4CCCCCc42)C([O-])=O)c1 6.0 6.0
34 FC(F)(F)c1cc(O)nc(SCC(=O)N2CCCCC2)n1 6.0 6.0
35 O=S(=O)(n1ccc2ccc(cc21)C)c3ccsc3C([O-])=O 5.9 5.9
36 Brc1ccc2c(ccn2S(=O)(=O)c3ccsc3C([O-])=O)c1 5.9 5.9
37 FC(F)(F)c1cccc(Cn2c3c(cccc3c4CCCCc42)C([O-])=O)c1 5.8 5.7
38 FC(F)(F)c1ccc(Cn2c3c(cccc3c4CCCCc42)C([O-])=O)cc1 5.6 5.7
39 FC(F)(F)c1ccc(Cn2c3c(cccc3c4CCCCCc42)C([O-])=O)cc1 5.7 5.7
40 O=S(=O)(n1cc(c2ccccc21)C)c3ccsc3C([O-])=O 5.8 5.7
41 [O-]C(=O)c1cccc2c3CCCCc3n(c12)Cc4ccc(OC)cc4 5.6 5.6
42 [O-]C(=O)[C@H](Oc1cccc(-c2ccccc2-n3c(c(c(n3)-c4ccccc4)-c5ccccc5)CC)c1)C 5.6 5.6
43 O=S(=O)(n1ccc2cccc(OC)c21)c3ccsc3C([O-])=O 5.6 5.6
44 O/N=C/1CCCc2c1c3cccc(c3n2Cc4ccccc4)C([O-])=O 5.5 5.5
45 Clc1cccc(-n2c(-c3ccccc3)cc(n2)-c4ccccc4OCCCC([O-])=O)c1 5.6 5.5
46 [O-]C(=O)[C@H](Oc1cccc(-c2ccccc2-n3c(c(c(n3)-c4ccccc4)-c5ccccc5)CC)c1)CC 5.5 5.5
47 Fc1ccc2c(ccn2S(=O)(=O)c3ccsc3C([O-])=O)c1 5.5 5.5
48 [O-]C(=O)c1cccc2c(c(n(c12)Cc3ccccc3)C)C 5.4 5.4
49 Clc1ccc(-n2c(-c3ccccc3)cc(n2)-c4ccccc4OCCCC([O-])=O)cc1 5.4 5.4
50 Clc1ccccc1-n2c(-c3ccccc3)cc(n2)-c4ccccc4OCCCC([O-])=O 5.4 5.4
51 [O-]C(=O)c1c(C(C)C)cc(C(C)C)cc1C(C)C 5.4 5.4
52 O=S(=O)(n1c2ccccc2c3ccccc31)c4ccccc4C([O-])=O 5.4 5.4
53 Fc1ccc2ccn(S(=O)(=O)c3ccsc3C([O-])=O)c2c1 5.4 5.4
54 FC(F)(F)c1cc(O)nc(NCc2ccc(OC)cc2)n1 5.4 5.4
55 [O-]C(=O)CCCOc1ccccc1-c2cc(n(n2)-c3ccccc3)-c4ccc(cc4)C 5.3 5.3
56 Brc1ccc(-n2c(-c3ccccc3)cc(n2)-c4ccccc4OCCCC([O-])=O)cc1 5.3 5.3
57 Fc1ccc(-c2c(nn(c2CC)-c3ccccc3-c4cccc(OCC([O-])=O)c4)-c5ccccc5)cc1 5.3 5.3
58 [O-]C(=O)CCCCOc1ccccc1-c2cc(n(n2)-c3ccccc3)-c4ccccc4 5.2 5.2
59 O=S(=O)(n1ccc2cc(ccc21)C)c3ccsc3C([O-])=O 5.2 5.2
60 O=S(=O)(n1ccc2ccc(OC)cc21)c3ccccc3C([O-])=O 5.2 5.2
61 Brc1ccc(-c2cc(nn2-c3ccccc3)-c4ccccc4OCCCC([O-])=O)cc1 5.0 5.0
62 Fc1ccc(-n2c(-c3ccccc3)cc(n2)-c4ccccc4OCCCC([O-])=O)cc1 5.0 5.0
63 [O-]C(=O)CCCOc1ccccc1-c2cc(n(n2)-c3ccc(C(C)C)cc3)-c4ccccc4 5.0 5.0
64 [O-]C(=O)CCn1c2ccccc2c3ccccc31 5.0 5.0
65 O=S(=O)(n1ccc2c(cccc21)C)c3ccsc3C([O-])=O 5.1 5.0
66 O=S(=O)(n1ccc2cc(OC)ccc21)c3ccsc3C([O-])=O 5.1 5.0
67 O=S(=O)(n1cc(c2ccccc21)C)c3ccccc3C([O-])=O 5.1 5.0
68 O=S(=O)(n1ccc2c(cccc21)C)c3ccccc3C([O-])=O 4.9 4.9
69 Brc1ccc2c(ccn2S(=O)(=O)c3ccccc3C([O-])=O)c1 4.9 4.9
70 [O-]C(=O)CCCOc1ccccc1-c2cc(n(n2)-c3ccc(OC)cc3)-c4ccccc4 4.9 4.8
71 [O-]C(=O)CCCOc1ccccc1-c2cc(n(n2)C3CCCCCC3)-c4ccccc4 4.8 4.8
72 Brc1ccc2c(n(S(=O)(=O)c3c(C(C)C)cc(C(C)C)cc3C(C)C)cn2)c1 4.8 4.8
73 Clc1ccc2c(nc(n2S(=O)(=O)c3c(C(C)C)cc(C(C)C)cc3C(C)C)C)c1 4.8 4.8
74 O=S(=O)(n1cncc1)c2c(C(C)C)cc(C(C)C)cc2C(C)C 4.7 4.8
75 Clc1ccccc1CNc2nc(O)cc(n2)C(F)(F)F 4.6 4.7
76 FC(F)(F)c1cc(O)nc(n1)CCc2ccc(OC)cc2 4.6 4.7
77 O=C1CCCc2c1c3cccc(c3n2Cc4ccccc4)C([O-])=O 4.6 4.6
78 [O-]C(=O)CCCOc1ccccc1-c2cc(n(n2)C3CCCCC3)-c4ccccc4 4.6 4.6
79 O=S(=O)(n1ccc2cc(ccc21)C)c3ccccc3C([O-])=O 4.5 4.6
80 FC(F)(F)c1cc(O)nc(n1)N(Cc2ccccc2)C 4.6 4.6
81 Clc1ccc(-c2cc(nn2-c3ccccc3)-c4ccccc4OCCCCCCC([O-])=O)cc1 4.5 4.5
82 FC(F)(F)c1cc(O)nc(NCC(=O)N2CCCCC2)n1 4.4 4.4
83 Clc1cccc(CNc2nc(O)cc(n2)C(F)(F)F)c1 4.5 4.4
84 FC(F)(F)c1cc(O)nc(NCc2ccc(C)cc2)n1 4.5 4.4
85 Clc1ccc(-c2cc(nn2-c3ccccc3)-c4ccccc4OCCCCC([O-])=O)cc1 4.1 4.2
86 Brc1ccc(-c2cc(nn2-c3ccccc3)-c4ccccc4OCCCCC([O-])=O)cc1 4.1 4.1
87 O=S(=O)(n1ccc2c(OC)cccc21)c3ccccc3C([O-])=O 4.1 4.1
88 O=S(=O)(N)c1c(C(C)C)cc(C(C)C)cc1C(C)C 4.0 4.0
89 [O-]C(=O)Cn1c2ccccc2c3ccccc31 4.0 4.0
90 FC(F)(F)c1cc(O)nc(n1)NCc2ccc(-c3ccccc3)cc2 4.0 4.0
91 FC(F)(F)c1cc(O)nc(NCc2ccncc2)n1 4.0 4.0
92 FC(F)(F)c1cc(O)nc(n1)CCc2ccccc2 4.0 4.0
93 FC(F)(F)c1cc(O)nc(NCCc2ccccc2)n1 4.0 3.9
94 [O-]C(=O)CCCCOc1ccccc1-c2cc(n(n2)-c3ccccc3)-c4ccc(cc4)C 3.6 3.6
95 Clc1ccc(CNc2nc(O)cc(n2)C(F)(F)F)cc1 5.5 3.5
96 Clc1ccc(-c2cc(nn2-c3ccccc3)-c4ccccc4OCC([O-])=O)cc1 2.0 2.0

Table 3.

SMILES, experimental, and predicted pIC50 values of the molecules in the test set.

No. SMILES pIC50 (Exp.) pIC50 (Pred.)
1 FC(F)(F)c1ccc2c(c(c(c(N(CC)CC)n2)C=3[N-]N=NN3)-c4ccccc4)c1 7.6 7.8
2 Clc1c(F)cc2c(c(c(c(N3CCCCC3)n2)C=4[N-]N=NN4)-c5ccccc5)c1 7.9 7.3
3 Clc1ccc(c(NC(=O)NC2(CCCC2)C([O-])=O)c1)-c3ccccc3 6.8 6.5
4 O=C(N)c1cccc(Cn2c3c(cccc3c4CCCCCc42)C([O-])=O)c1 7.2 6.2
5 [O-]C(=O)c1ccc2c(c3CCCCc3n2Cc4ccccc4)c1 4.6 6.1
6 Fc1ccc(Cn2c3c(cccc3c4CCCCc42)C([O-])=O)cc1 6.1 6.1
7 [O-]C(=O)c1cccc2c3CCCCc3n(c12)Cc4ccccc4 6.2 5.9
8 Fc1cccc(c1Cn2c3c(cccc3c4CCCCc42)C([O-])=O)C(F)(F)F 5.7 5.9
9 O=S(=O)(n1c2ccccc2c3ccccc31)c4ccsc4C([O-])=O 6.0 5.9
10 [O-]C(=O)c1cccc2c3CCCCCc3n(CCC)c12 6.4 5.7
11 [O-]S(=O)(=O)c1c(C(C)C)cc(C(C)C)cc1C(C)C 5.1 5.7
12 O=S(=O)(n1ccc2ccc(OC)cc21)c3ccsc3C([O-])=O 5.6 5.7
13 [O-]C(=O)c1cccc2c3CCCCc3n(CCC)c12 6.1 5.6
14 Fc1cccc2ccn(S(=O)(=O)c3ccsc3C([O-])=O)c12 5.4 5.4
15 [O-]C(=O)CCCOc1ccccc1-c2cc(n(n2)-c3ccccc3)-c4ccccc4 5.5 5.3
16 Clc1ccc(-c2cc(nn2-c3ccccc3)-c4ccccc4OCCCC([O-])=O)cc1 5.2 5.2
17 Fc1cccc2c1ccn2S(=O)(=O)c3ccccc3C([O-])=O 5.0 5.2
18 Clc1ccc(CN(c2nc(O)cc(n2)C(F)(F)F)C)cc1 5.4 5.1
19 FC(F)(F)c1cc(O)nc(Nc2ccccc2)n1 4.0 4.8
20 Brc1ccc2c(n(S(=O)(=O)c3c(C(C)C)cc(C(C)C)cc3C(C)C)c(n2)C)c1 4.1 4.7
21 O=S(=O)(n1c(nc2ccccc21)C)c3c(C(C)C)cc(C(C)C)cc3C(C)C 4.0 4.6
22 [O-]C(=O)CCCOc1ccccc1-c2cc(n(n2)C3CCCC3)-c4ccccc4 4.8 4.5
23 O=S(=O)(n1ccc2c(OC)cccc21)c3ccsc3C([O-])=O 4.9 4.3
24 FC(F)(F)c1cc(O)nc(n1)NCc2ccccc2 4.5 4.2
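
The rows of Tables 2 and 3 can be read programmatically, because a SMILES string contains no spaces. The following is a minimal Python sketch, not part of the original dataset tooling: the names `Row`, `parse_row`, and `pic50_to_ic50_nm` are hypothetical, and the conversion assumes the usual definition pIC50 = −log10(IC50 in mol/L), which the article does not state explicitly.

```python
from typing import NamedTuple


class Row(NamedTuple):
    number: int
    smiles: str
    pic50_exp: float
    pic50_pred: float


def parse_row(line: str) -> Row:
    """Split one 'No. SMILES Exp. Pred.' row into typed fields."""
    number, smiles, exp, pred = line.split()
    return Row(int(number), smiles, float(exp), float(pred))


def pic50_to_ic50_nm(pic50: float) -> float:
    """Assumed definition: pIC50 = -log10(IC50 [mol/L]), so IC50 [nM] = 10**(9 - pIC50)."""
    return 10 ** (9 - pic50)


# Last row of Table 3 (test set), copied from the table above.
row = parse_row("24 FC(F)(F)c1cc(O)nc(n1)NCc2ccccc2 4.5 4.2")
residual = row.pic50_exp - row.pic50_pred  # experimental minus predicted pIC50
```

Under the assumed definition, a pIC50 of 8.0 corresponds to an IC50 of 10 nM.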

Fig. 1.

Comparison of alignment methods.

Fig. 2.

Schematic representation of the process adopted to obtain the template compounds for the ligand-based alignment.

Fig. 3.

A) Protein and inhibitors aligned. B) Aligned inhibitors imported to Forge for ligand-based alignment.

Fig. 4.

Forge's parameters used for conformation hunt.

Fig. 5.

Forge's parameters used for alignment.

Fig. 6.

Forge's parameters used to build the QSAR model.

Fig. 7.

Model statistics for FABP4 model.

Fig. 8.

The positions studied for the bioisosteric replacement of BMS309403 are highlighted in bold.

Fig. 9.

Spark's parameters used for bio-isosteric replacement.

2. Experimental design, materials and methods

2.1. Compounds alignments

To generate a plausible and consistent set of aligned molecules before running the regression analysis, we evaluated two different types of alignment (Fig. 1).

First, we evaluated a structure-based alignment, based on docking the different ligands into the active site of the protein. All 120 structures, optimized at the PM3 level of theory [25], [26], [27], were converted into pdbqt format using Babel [28] and subsequently docked into the active site of FABP4. Molecular docking was performed using the three-dimensional crystal structure of substrate-free fatty acid binding protein 4 in complex with BMS309403 (PDB ID: 2NNQ), obtained from the Protein Data Bank (PDB, http://www.rcsb.org/pdb). AutoDock Vina (version 1.1.2) [29] was used for all docking experiments. The default values of the docking parameters in AutoDock Vina were maintained, except for “exhaustiveness”, which was set to 15. A grid box of 18 Å × 18 Å × 18 Å was used, encompassing the inhibitor binding cavity of FABP4 and centered on the ligand. The binding modes were clustered by the root mean square deviation among the Cartesian coordinates of the ligand atoms, and the docking results were ranked by binding free energy. After the calculations with AutoDock Vina, all the generated structures were manually checked to ensure correct positioning within the binding site. The generated structures were then imported into Forge [30] to build the structure-based 3D-QSAR model.

The second type of alignment evaluated was a classic ligand-based alignment, carried out with the same software used to build the model. All the optimized structures, together with their respective IC50 values, were imported into Forge (10.4.2, Cresset, Litlington, Cambridgeshire, UK, http://www.cresset-group.com/forge) [30], [31], [32], [33], [34] to set up the field-based 3D-QSAR model. Eight different molecules were chosen as templates for the calculation of field points and for the alignment. These eight molecules were selected because crystal structures of their complexes with FABP4 are available (PDB IDs: 2NNQ, 3FR2, 3FR4, 3FR5, 4NNS, 4NNT, 1TOU and 1TOW, Table 1) [35], [36], [37], [38]. The structures (protein and inhibitors) were first downloaded from the Protein Data Bank (PDB); the amino acid sequences were then superposed and aligned with YASARA (version 17.8.15) so that the ligands in the binding site were also aligned and superposed, and the eight molecules were then imported into Forge (Figs. 2 and 3).
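
The docking settings reported for the structure-based alignment (an 18 Å cubic box and exhaustiveness 15, with all other AutoDock Vina parameters left at their defaults) could be captured in a Vina configuration file. The Python sketch below only generates the text of such a file. The helper name `vina_config`, the file names, and the box center are hypothetical placeholders: the article states that the box was centered on the ligand but does not give its coordinates.

```python
def vina_config(receptor, ligand, center, out, box=18.0, exhaustiveness=15):
    """Return the text of an AutoDock Vina configuration file."""
    cx, cy, cz = center
    lines = [
        f"receptor = {receptor}",
        f"ligand = {ligand}",
        f"center_x = {cx}",
        f"center_y = {cy}",
        f"center_z = {cz}",
        f"size_x = {box}",  # box edge lengths in angstrom
        f"size_y = {box}",
        f"size_z = {box}",
        f"exhaustiveness = {exhaustiveness}",
        f"out = {out}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical file names; (0.0, 0.0, 0.0) stands in for the ligand centroid.
config = vina_config(
    receptor="2NNQ_receptor.pdbqt",
    ligand="ligand_001.pdbqt",
    center=(0.0, 0.0, 0.0),
    out="ligand_001_docked.pdbqt",
)
```

Written to a file such as `conf.txt`, this text would be passed to Vina with `vina --config conf.txt`, once per ligand.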

The XED (eXtended Electron Distribution) force field was used to generate the field points. The compounds in the training set were aligned to the reference compound by maximum common substructure, using a customized set-up for the conformation hunt:

  • Max number of conformations: 500.

  • RMS cut-off for duplicate conformers: 0.5 Å.

  • Gradient cut-off for conformer minimization: 0.1 kcal/mol.

  • Energy window: 2.5 kcal/mol.

The RMS cut-off for duplicate conformers parameter controls the similarity threshold below which two conformers are assumed identical. Conformations that gave a minimized energy outside the energy window were discarded.
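
The effect of the energy window and of the RMS cut-off can be illustrated with a small filter. This is a simplified Python sketch and not Forge's implementation: it assumes that conformers are given as (energy, coordinates) pairs with identical atom ordering and that they are already superposed, so the RMSD is computed without any further alignment. The function names are hypothetical.

```python
import math


def rmsd(a, b):
    """RMSD between two coordinate lists with the same atom order (no superposition)."""
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(a, b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(a))


def filter_conformers(conformers, energy_window=2.5, rms_cutoff=0.5, max_conformers=500):
    """Keep low-energy, non-duplicate conformers from (energy_kcal_mol, coords) pairs."""
    if not conformers:
        return []
    e_min = min(energy for energy, _ in conformers)
    kept = []
    for energy, coords in sorted(conformers, key=lambda c: c[0]):
        if energy - e_min > energy_window:
            break  # sorted by energy: all remaining conformers are outside the window
        # A conformer closer than rms_cutoff to a kept one is treated as a duplicate.
        if all(rmsd(coords, other) >= rms_cutoff for _, other in kept):
            kept.append((energy, coords))
        if len(kept) == max_conformers:
            break
    return kept


# Toy two-atom "conformers" (hypothetical energies in kcal/mol, coordinates in angstrom).
toy = [
    (0.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (0.4, [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0)]),  # near-duplicate of the first
    (1.0, [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]),  # distinct geometry, inside the window
    (3.0, [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)]),  # outside the 2.5 kcal/mol window
]
kept = filter_conformers(toy)
```

With these toy values, only the conformers at 0.0 and 1.0 kcal/mol survive: the second is removed as a duplicate and the fourth falls outside the energy window.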

All the alignments were manually checked to ensure the best possible model. All the field points of the training set were used to derive a gauge invariant set of sampling points, which reduced the number of descriptors that needed to be taken into account, with a distance of 1 Å between the sample points. Sample values were calculated, ensuring that all areas around the molecule (and possibly contributing to the activity) are properly described.

2.2. Statistical analysis

The QSAR model was validated with the leave-one-out method. The maximum number of components to extract from the PLS regression was 20, the number of Y scrambles was 50, and the threshold of the sample point minimum distance was set to 1 Å. The regression method used in Forge was PLS (SIMPLS algorithm) [39], [40], [41], [42], [43]. All the parameters for the QSAR model are summarized in Figs. 4–6.
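
The idea behind the Y-scrambling check can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the procedure implemented in Forge: a one-descriptor least-squares fit stands in for the PLS model, the data are invented, and the function names are hypothetical. The activities are shuffled 50 times and the fit quality of each scrambled model is compared with that of the real one.

```python
import random


def r_squared(x, y):
    """r2 of a one-descriptor least-squares fit (the squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def y_scramble(x, y, n_scrambles=50, seed=0):
    """Refit the model on randomly permuted activities and collect the r2 values."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_scrambles):
        shuffled = list(y)
        rng.shuffle(shuffled)
        scores.append(r_squared(x, shuffled))
    return scores


# Invented descriptor values and activities with a perfect linear relationship.
x = [float(i) for i in range(10)]
y = [4.0 + 0.4 * xi for xi in x]
true_r2 = r_squared(x, y)
scrambled = y_scramble(x, y)
```

A model that captures a real structure-activity relationship should score clearly better on the true activities than on the scrambled ones.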

The predictive ability of the generated QSAR model was confirmed by several statistical tests. The cross-validation regression coefficient (q2) was calculated based on the PRESS (Prediction error sum of squares) and SSY (Sum of squares of deviation of the experimental values from their mean):

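$$q^2 = 1 - \frac{\mathrm{PRESS}}{\mathrm{SSY}} = 1 - \frac{\sum_{i=1}^{n} (Y_{exp} - Y_{pred})^2}{\sum_{i=1}^{n} (Y_{exp} - Y_{mean})^2}$$

  • $Y_{exp}$ = experimental activity of training set compound

  • $Y_{pred}$ = predicted activity of training set compound

  • $Y_{mean}$ = mean values of the activity of training set compound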
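
The leave-one-out calculation of q2 can be sketched as follows. This Python example is a minimal illustration and makes one loud assumption: a single-descriptor least-squares line replaces the multi-component PLS (SIMPLS) model used in Forge, so that the example stays self-contained. The function names and the data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return slope, mean_y - slope * mean_x


def loo_q2(x, y):
    """q2 = 1 - PRESS / SSY, with each compound predicted by a model fitted without it."""
    n = len(y)
    y_mean = sum(y) / n
    press = 0.0
    for i in range(n):
        # Refit on all compounds except compound i, then predict compound i.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    ssy = sum((value - y_mean) ** 2 for value in y)
    return 1.0 - press / ssy


# Invented descriptor values and activities.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_exact = [2.0, 4.0, 6.0, 8.0, 10.0]  # perfectly linear
y_noisy = [4.1, 4.9, 6.2, 6.8, 8.1]   # linear trend with scatter
q2_exact = loo_q2(x, y_exact)
q2_noisy = loo_q2(x, y_noisy)
```

For perfectly linear data q2 equals 1; any scatter lowers it, and q2 becomes negative when the left-out predictions are worse than simply using the mean activity.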

The ligand-based alignment proved more reliable (r2 = 0.92, q2 = 0.64) than the structure-based alignment (r2 = 0.90, q2 = 0.38). The ligand-based 3D-QSAR model was further validated with a set of external compounds (i.e., the test set). Out of 120 molecules, we randomly chose 96 molecules (covering the whole range of activities of the compounds) as a training set to build the model, while the remaining 24 compounds served as a test set to evaluate the model.

The statistical reliability of this model was also validated by the determination of the r2test, using the following equation:

$$r^2_{test} = 1 - \frac{\sum_{i=1}^{n} (Y_{predtest} - Y_{test})^2}{\sum_{i=1}^{n} (Y_{test} - Y_{mean})^2}$$

  • $Y_{predtest}$ = predicted activity of test set compound by QSAR equation

  • $Y_{test}$ = experimental activity of test set compound

  • $Y_{mean}$ = mean values of the activity of training set compound
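
The external validation statistic differs from q2 in that the predictions come from the final model and the reference mean is that of the training set. A minimal Python sketch of the equation above follows; the function name `r2_test` and the numbers are hypothetical and are not taken from Tables 2 and 3.

```python
def r2_test(y_test, y_pred_test, y_train_mean):
    """r2test = 1 - sum((Ypredtest - Ytest)^2) / sum((Ytest - Ymean_train)^2)."""
    ss_res = sum((pred - exp) ** 2 for pred, exp in zip(y_pred_test, y_test))
    ss_tot = sum((exp - y_train_mean) ** 2 for exp in y_test)
    return 1.0 - ss_res / ss_tot


# Invented pIC50 values for three test compounds and an invented training-set mean.
score = r2_test(
    y_test=[5.0, 6.0, 7.0],
    y_pred_test=[5.5, 6.0, 6.5],
    y_train_mean=6.0,
)
```

Here the residual sum of squares is 0.5 and the total sum of squares is 2.0, giving a score of 0.75.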

The 11-component model (Fig. 7) shows both good predictive and descriptive capability, as shown by the good r2 (0.99) and q2 (0.69) [44] values for the training and the cross-validated training sets. The plot of experimental vs. predicted activity for the compounds in both the training set and the cross-validated training set shows a reasonable distribution of the values. The plot of experimental vs. predicted activity for the compounds in the test set is still reasonably good, with only a few outliers and a good r2test of 0.73.

2.3. Isosteric replacement

The isosteric replacement was performed using Spark (10.4.0, Cresset, Litlington, Cambridgeshire, UK, http://www.cresset-group.com/forge) [30], [31], [32], [33], [34]. As reported in Fig. 8, different portions of BMS309403 were replaced. The newly designed molecules were then aligned with the 3D-QSAR model and evaluated. The replacement was performed with the same 178,558 fragments for each part, derived from the ChEMBL and ZINC databases (Fig. 9) [45], [46]. Five hundred compounds were generated for each substitution, producing 3000 hits (reported in Tables S4–S9). Three of the suggested molecules were synthesized and tested as reported in the related research article [1].

Acknowledgements

Free academic licenses from Cresset and ChemAxon for their suites of programs are gratefully acknowledged.

Footnotes

Transparency document

Transparency data associated with this article can be found in the online version at https://doi.org/10.1016/j.dib.2018.12.047.

Appendix A

Supplementary data associated with this article can be found in the online version at https://doi.org/10.1016/j.dib.2018.12.047.

Contributor Information

Giuseppe Floresta, Email: giuseppe.floresta@unict.it.

Antonio Rescifina, Email: arescifina@unict.it.

Transparency document. Supplementary material

Supplementary material.

mmc1.pdf (898.9KB, pdf)

Appendix A. Supplementary material

Supplementary Tables S4–S9.

mmc2.docx (186.6KB, docx)

References

  • 1. Floresta G., Cilibrizzi A., Abbate V., Spampinato A., Zagni C., Rescifina A. 3D-QSAR assisted identification of FABP4 inhibitors: an effective scaffold hopping analysis/QSAR evaluation. Bioorg. Chem. 2019;84:276–284. doi: 10.1016/j.bioorg.2018.11.045.
  • 2. Adachi Y., Hiramatsu S., Tokuda N., Sharifi K., Ebrahimi M., Islam A., Kagawa Y., Koshy Vaidyan L., Sawada T., Hamano K., Owada Y. Fatty acid-binding protein 4 (FABP4) and FABP5 modulate cytokine production in the mouse thymic epithelial cells. Histochem Cell Biol. 2012;138:397–406. doi: 10.1007/s00418-012-0963-y.
  • 3. Blecha I.M., Siqueira F., Ferreira A.B., Feijo G.L., Torres R.A., Junior, Medeiros S.R., Sousa I.I., Santiago G.G., Ferraz A.L. Identification and evaluation of polymorphisms in FABP3 and FABP4 in beef cattle. Genet Mol. Res. 2015;14:16353–16363. doi: 10.4238/2015.December.9.3.
  • 4. Bosquet A., Guaita-Esteruelas S., Saavedra P., Rodriguez-Calvo R., Heras M., Girona J., Masana L. Exogenous FABP4 induces endoplasmic reticulum stress in HepG2 liver cells. Atherosclerosis. 2016;249:191–199. doi: 10.1016/j.atherosclerosis.2016.04.012.
  • 5. Guaita-Esteruelas S., Bosquet A., Saavedra P., Guma J., Girona J., Lam E.W., Amillano K., Borras J., Masana L. Exogenous FABP4 increases breast cancer cell proliferation and activates the expression of fatty acid transport proteins. Mol. Carcinog. 2017;56:208–217. doi: 10.1002/mc.22485.
  • 6. Harjes U., Bridges E., Gharpure K.M., Roxanis I., Sheldon H., Miranda F., Mangala L.S., Pradeep S., Lopez-Berestein G., Ahmed A., Fielding B., Sood A.K., Harris A.L. Antiangiogenic and tumour inhibitory effects of downregulating tumour endothelial FABP4. Oncogene. 2016 doi: 10.1038/onc.2016.256.
  • 7. Luo Y., Yang Z., Li D., Liu Z., Yang L., Zou Q., Yuan Y. LDHB and FABP4 are associated with progression and poor prognosis of pancreatic ductal adenocarcinomas. Appl. Immunohistochem. Mol. Morphol. 2015 doi: 10.1097/PAI.0000000000000306.
  • 8. Meng D.M., Wang L., Xu J.R., Yan S.L., Zhou L., Mi Q.S. Fabp4-Cre-mediated deletion of the miRNA-processing enzyme Dicer causes mouse embryonic lethality. Acta Diabetol. 2013;50:823–824. doi: 10.1007/s00592-011-0335-4.
  • 9. Syamsunarno M.R., Iso T., Hanaoka H., Yamaguchi A., Obokata M., Koitabashi N., Goto K., Hishiki T., Nagahata Y., Matsui H., Sano M., Kobayashi M., Kikuchi O., Sasaki T., Maeda K., Murakami M., Kitamura T., Suematsu M., Tsushima Y., Endo K., Hotamisligil G.S., Kurabayashi M. A critical role of fatty acid binding protein 4 and 5 (FABP4/5) in the systemic response to fasting. PLoS One. 2013;8:e79386. doi: 10.1371/journal.pone.0079386.
  • 10. Tang Z., Shen Q., Xie H., Zhou X., Li J., Feng J., Liu H., Wang W., Zhang S., Ni S. Elevated expression of FABP3 and FABP4 cooperatively correlates with poor prognosis in non-small cell lung cancer (NSCLC) Oncotarget. 2016 doi: 10.18632/oncotarget.10086.
  • 11. Floresta G., Pistarà V., Amata E., Dichiara M., Marrazzo A., Prezzavento O., Rescifina A. Adipocyte fatty acid binding protein 4 (FABP4) inhibitors. A comprehensive systematic review. Eur. J Med Chem. 2017;138:854–873. doi: 10.1016/j.ejmech.2017.07.022.
  • 12. Tagami U., Takahashi K., Igarashi S., Ejima C., Yoshida T., Takeshita S., Miyanaga W., Sugiki M., Tokumasu M., Hatanaka T., Kashiwagi T., Ishikawa K., Miyano H., Mizukoshi T. Interaction analysis of FABP4 inhibitors by x-ray crystallography and fragment molecular orbital analysis. ACS Med. Chem. Lett. 2016;7:435–439. doi: 10.1021/acsmedchemlett.6b00040.
  • 13. Cai H., Wang T., Yang Z., Xu Z., Wang G., Wang H.Y., Zhu W., Chen K. Combined virtual screening and substructure search for discovery of novel FABP4 inhibitors. J. Chem. Inf. Model. 2017;57:2329–2335. doi: 10.1021/acs.jcim.7b00364.
  • 14. Cai H., Liu Q., Gao D., Wang T., Chen T., Yan G., Chen K., Xu Y., Wang H., Li Y., Zhu W. Novel fatty acid binding protein 4 (FABP4) inhibitors: virtual screening, synthesis and crystal structure determination. Eur. J. Med. Chem. 2015;90:241–250. doi: 10.1016/j.ejmech.2014.11.020.
  • 15. Cai H., Yan G., Zhang X., Gorbenko O., Wang H., Zhu W. Discovery of highly selective inhibitors of human fatty acid binding protein 4 (FABP4) by virtual screening. Bioorg. Med Chem. Lett. 2010;20:3675–3679. doi: 10.1016/j.bmcl.2010.04.095.
  • 16. Floresta G., Apirakkan O., Rescifina A., Abbate V. Discovery of high-affinity cannabinoid receptors ligands through a 3D-QSAR ushered by scaffold-hopping analysis. Molecules. 2018;23 doi: 10.3390/molecules23092183.
  • 17. Floresta G., Pittalà V., Sorrenti V., Romeo G., Salerno L., Rescifina A. Development of new HO-1 inhibitors by a thorough scaffold-hopping analysis. Bioorg. Chem. 2018;81:334–339. doi: 10.1016/j.bioorg.2018.08.023.
  • 18. Floresta G., Amata E., Dichiara M., Marrazzo A., Salerno L., Romeo G., Prezzavento O., Pittalà V., Rescifina A. Identification of potentially potent heme oxygenase 1 inhibitors through 3D-QSAR coupled to scaffold-hopping analysis. Chemmedchem. 2018;13:1336–1342. doi: 10.1002/cmdc.201800176.
  • 19. Salerno L., Amata E., Romeo G., Marrazzo A., Prezzavento O., Floresta G., Sorrenti V., Barbagallo I., Rescifina A., Pittalà V. Potholing of the hydrophobic heme oxygenase-1 western region for the search of potent and selective imidazole-based inhibitors. Eur. J. Med. Chem. 2018;148:54–62. doi: 10.1016/j.ejmech.2018.02.007.
  • 20. Floresta G., Rescifina A., Marrazzo A., Dichiara M., Pistarà V., Pittalà V., Prezzavento O., Amata E. Hyphenated 3D-QSAR statistical model-scaffold hopping analysis for the identification of potentially potent and selective sigma-2 receptor ligands. Eur. J. Med Chem. 2017;139:884–891. doi: 10.1016/j.ejmech.2017.08.053.
  • 21. Rescifina A., Floresta G., Marrazzo A., Parenti C., Prezzavento O., Nastasi G., Dichiara M., Amata E. Sigma-2 receptor ligands QSAR model dataset. Data Brief. 2017;13:514–535. doi: 10.1016/j.dib.2017.06.022.
  • 22. Rescifina A., Floresta G., Marrazzo A., Parenti C., Prezzavento O., Nastasi G., Dichiara M., Amata E. Development of a sigma-2 receptor affinity filter through a Monte Carlo based QSAR analysis. Eur. J Pharm. Sci. 2017;106:94–101. doi: 10.1016/j.ejps.2017.05.061.
  • 23. Greish K.F., Salerno L., Al Zahrani R., Amata E., Modica M.N., Romeo G., Marrazzo A., Prezzavento O., Sorrenti V., Rescifina A., Floresta G., Intagliata S., Pittalà V. Novel structural insight into inhibitors of heme oxygenase-1 (HO-1) by new imidazole-based compounds: biochemical and in vitro anticancer activity evaluation. Molecules. 2018;23 doi: 10.3390/molecules23051209.
  • 24. Floresta G., Pistarà V., Amata E., Dichiara M., Damigella A., Marrazzo A., Prezzavento O., Punzo F., Rescifina A. Molecular modeling studies of pseudouridine isoxazolidinyl nucleoside analogues as potential inhibitors of the pseudouridine 5′-monophosphate glycosidase. Chem. Biol. Drug Des. 2018;91:519–525. doi: 10.1111/cbdd.13113.
  • 25. La Manna P., Talotta C., Floresta G., De Rosa M., Soriente A., Rescifina A., Gaeta C., Neri P. Mild friedel-crafts reactions inside a hexameric resorcinarene capsule: C-Cl bond activation through hydrogen bonding to bridging water molecules. Angew. Chem. Int. Ed. Engl. 2018;57:5423–5428. doi: 10.1002/anie.201801642.
  • 26. La Manna P., De Rosa M., Talotta C., Gaeta C., Soriente A., Floresta G., Rescifina A., Neri P. The hexameric resorcinarene capsule as an artificial enzyme: ruling the regio and stereochemistry of a 1,3-dipolar cycloaddition between nitrones and unsaturated aldehydes. Org. Chem. Front. 2018;5:827–837.
  • 27. Floresta G., Rescifina A. Metyrapone-β-cyclodextrin supramolecular interactions inferred by complementary spectroscopic/spectrometric and computational studies. J Mol. Struct. 2019;1176:815–824.
  • 28. O'Boyle N.M., Banck M., James C.A., Morley C., Vandermeersch T., Hutchison G.R. Open Babel: an open chemical toolbox. J. Cheminform. 2011;3:33. doi: 10.1186/1758-2946-3-33.
  • 29. Trott O., Olson A.J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 2010;31:455–461. doi: 10.1002/jcc.21334.
  • 30. Cheeseright T., Mackey M., Rose S., Vinter A. Molecular field extrema as descriptors of biological activity: definition and validation. J. Chem. Inf. Model. 2006;46:665–676. doi: 10.1021/ci050357s.
  • 31. Cheeseright T.J., Mackey M.D., Scoffin R.A. High content pharmacophores from molecular fields: a biologically relevant method for comparing and understanding ligands. Curr. Comput. Aided Drug Des. 2011;7:190–205. doi: 10.2174/157340911796504314.
  • 32. Cheeseright T.J., Holm M., Lehmann F., Luik S., Gottert M., Melville J.L., Laufer S. Novel lead structures for p38 MAP kinase via FieldScreen virtual screening. J. Med. Chem. 2009;52:4200–4209. doi: 10.1021/jm801399r.
  • 33. Cheeseright T.J., Mackey M.D., Melville J.L., Vinter J.G. FieldScreen: virtual screening using molecular fields. Application to the DUD data set. J. Chem. Inf. Model. 2008;48:2108–2117. doi: 10.1021/ci800110p.
  • 34. Cheeseright T., Mackey M., Rose S., Vinter A. Molecular field technology applied to virtual screening and finding the bioactive conformation. Expert Opin. Drug Discov. 2007;2:131–144. doi: 10.1517/17460441.2.1.131.
  • 35. Sulsky R., Magnin D.R., Huang Y., Simpkins L., Taunk P., Patel M., Zhu Y., Stouch T.R., Bassolino-Klimas D., Parker R., Harrity T., Stoffel R., Taylor D.S., Lavoie T.B., Kish K., Jacobson B.L., Sheriff S., Adam L.P., Ewing W.R., Robl J.A. Potent and selective biphenyl azole inhibitors of adipocyte fatty acid binding protein (aFABP) Bioorg. Med. Chem. Lett. 2007;17:3511–3515. doi: 10.1016/j.bmcl.2006.12.044.
  • 36. Barf T., Lehmann F., Hammer K., Haile S., Axen E., Medina C., Uppenberg J., Svensson S., Rondahl L., Lundback T. N-Benzyl-indolo carboxylic acids: design and synthesis of potent and selective adipocyte fatty-acid binding protein (A-FABP) inhibitors. Bioorg. Med. Chem. Lett. 2009;19:1745–1748. doi: 10.1016/j.bmcl.2009.01.084.
  • 37. Lehmann F., Haile S., Axen E., Medina C., Uppenberg J., Svensson S., Lundback T., Rondahl L., Barf T. Discovery of inhibitors of human adipocyte fatty acid-binding protein, a potential type 2 diabetes target. Bioorg. Med. Chem. Lett. 2004;14:4445–4448. doi: 10.1016/j.bmcl.2004.06.057.
  • 38. Ringom R., Axen E., Uppenberg J., Lundback T., Rondahl L., Barf T. Substituted benzylamino-6-(trifluoromethyl)pyrimidin-4(1H)-ones: a novel class of selective human A-FABP inhibitors. Bioorg. Med. Chem. Lett. 2004;14:4449–4452. doi: 10.1016/j.bmcl.2004.06.058.
  • 39. Luco J.M., Ferretti F.H. QSAR based on multiple linear regression and PLS methods for the anti-HIV activity of a large group of HEPT derivatives. J Chem. Inf. Comput. Sci. 1997;37:392–401. doi: 10.1021/ci960487o.
  • 40. Surribas A., Amigo J.M., Coello J., Montesinos J.L., Valero F., Maspoch S. Parallel factor analysis combined with PLS regression applied to the on-line monitoring of Pichia pastoris cultures. Anal. Bioanal. Chem. 2006;385:1281–1288. doi: 10.1007/s00216-006-0355-z.
  • 41. Tuppurainen K., Korhonen S.P., Ruuskanen J. Performance of multicomponent self-organizing regression (MCSOR) in QSAR, QSPR, and multivariate calibration: comparison with partial least-squares (PLS) and validation with large external data sets. SAR QSAR Environ. Res. 2006;17:549–561. doi: 10.1080/10629360601033390.
  • 42. Palermo G., Piraino P., Zucht H.D. Performance of PLS regression coefficients in selecting variables for each response of a multivariate PLS for omics-type data. Adv. Appl. Bioinform. Chem. 2009;2:57–70. doi: 10.2147/aabc.s3619.
  • 43. Asadollahi T., Dadfarnia S., Shabani A.M., Ghasemi J.B., Sarkhosh M. QSAR models for CXCR2 receptor antagonists based on the genetic algorithm for data preprocessing prior to application of the PLS linear regression method and design of the new compounds using in silico virtual screening. Molecules. 2011;16:1928–1955. doi: 10.3390/molecules16031928.
  • 44. Golbraikh A., Tropsha A. Beware of q2! J. Mol. Graph Model. 2002;20:269–276. doi: 10.1016/s1093-3263(01)00123-1.
  • 45. Irwin J.J., Shoichet B.K. ZINC--a free database of commercially available compounds for virtual screening. J. Chem. Inf. Model. 2005;45:177–182. doi: 10.1021/ci049714.
  • 46. Bento A.P., Gaulton A., Hersey A., Bellis L.J., Chambers J., Davies M., Kruger F.A., Light Y., Mak L., McGlinchey S., Nowotka M., Papadatos G., Santos R., Overington J.P. The ChEMBL bioactivity database: an update. Nucleic Acids Res. 2014;42:D1083–1090. doi: 10.1093/nar/gkt1031.
