ACS Omega. 2020 Sep 10;5(37):24111–24117. doi: 10.1021/acsomega.0c03659

Quantitative Comparison of Three-Dimensional Activity Landscapes of Compound Data Sets Based upon Topological Features

Javed Iqbal 1, Martin Vogt 1, Jürgen Bajorath 1,*
PMCID: PMC7513547  PMID: 32984733

Abstract

Visualization of structure–activity relationships (SARs) in compound data sets substantially contributes to their systematic analysis. For SAR visualization, different types of activity landscape (AL) representations have been introduced. Three-dimensional (3D) AL models in which an activity hypersurface is constructed in chemical space are particularly intuitive because these 3D ALs are reminiscent of “true” (geographical) landscapes. Accordingly, the topologies of 3D AL representations can be immediately associated with different SAR characteristics of compound data sets. However, the comparison of 3D ALs has thus far been confined to visual inspection and qualitative analysis. We have focused on image analysis as a possible approach to facilitate a quantitative comparison of 3D ALs, which would further increase their utility for SAR exploration. Herein, we introduce a new computational methodology for quantifying topological relationships between 3D ALs. Images of color-coded 3D ALs were converted into top-down views of these ALs. From transformed images, different categories of shape features were systematically extracted, and multilevel shape correspondence was determined as a measure of AL similarity. This made it possible to differentiate between 3D ALs in quantitative terms.

1. Introduction

Activity landscapes (ALs) are graphical representations that are designed to integrate structural and potency relationships between compounds sharing the same specific activity.1,2 AL modeling enables the visualization and graphical analysis of structure–activity relationships (SARs) for different data sets.2

Over the years, a variety of AL representations of different designs and complexities have been introduced.1–8 These include two-dimensional (2D) representations such as plots of pairwise compound similarity versus potency relationships1,4 or annotated similarity-based compound networks.2,9 In addition, three-dimensional (3D) AL views such as pairwise compound activity–property–similarity distributions4 or maps reminiscent of geographical landscapes3,8,10 have been studied.

These maps, in the following referred to as 3D ALs, are particularly intuitive because they can be interpreted in the same way as geographical landscapes. In 3D ALs, topological features such as mountains, plains, or valleys are associated with different SAR characteristics. For example, plains and gently sloped valleys in a 3D AL result from a series of chemical modifications (“walks” in chemical space) that are accompanied by small to moderate changes in potency, a phenotype referred to as SAR continuity.2 By contrast, mountainous regions and peaks in a 3D AL are a consequence of small compound modifications (small steps in chemical space) that cause large potency alterations, which represent SAR discontinuity.2 The most prominent peaks in discontinuous regions are termed activity cliffs10,11 and are formed by pairs or groups of structural analogues with the largest potency differences in compound data sets.11

For SAR visualization, 3D AL models are obtained by adding compound potency values as a third dimension to a 2D projection of a chemical feature (descriptor) space, for which computational approaches and parameters have been established in previous studies.3,8 From experimental potency values, a coherent surface is interpolated and color-coded by potency, representing an activity surface.3,8 The geographical landscape-like topologies of different activity surfaces, as discussed above, can vary significantly, and different topological features reflect different SAR characteristics.

SAR visualization using 3D ALs represents a qualitative approach. Although 3D AL analysis can be complemented with the application of numerical SAR analysis functions to quantify SAR continuity and/or discontinuity for compound data sets,12,13 subjective graphical assessment has principal limitations when comparing different ALs. This is the case because most compound data sets exhibit SAR heterogeneity, resulting from different combinations of locally continuous and discontinuous SARs that originate from different compound subsets.12 As a consequence, in corresponding 3D ALs, smooth and rugged regions are interspersed. When comparing 3D AL models, it is generally difficult to judge differences in SAR information content on the basis of visual inspection. Therefore, the ability to compare 3D ALs in quantitative terms would be highly desirable to complement graphical analysis.

We have reasoned that a quantitative comparison of 3D ALs might become feasible by focusing on quantitative analysis of images of 3D ALs. The first evidence in support of this conjecture was provided by successful classification of 3D AL image variants based upon features learned using convolutional neural networks.14 For 3D ALs, reference representations with specifically altered topologies were generated, either significantly increasing or decreasing the proportion of valleys to peaks. From images of original and modified ALs, distinguishing features were learned and used for class label predictions using machine learning, leading to overall successful classification of these AL variants.14

In light of these initial findings, we have investigated 3D AL image analysis to quantitatively compare 3D ALs and the SAR information they capture. Herein, we present a new computational approach to quantitatively account for feature relationships between 3D ALs that indicate varying SAR information content and to quantify similarity relationships between 3D ALs.

2. Results and Discussion

2.1. Study Concept

Images of 3D ALs provide a data source for algorithmic extraction of features that account for topological characteristics representing SARs contained in compound data sets. Our analysis was based on the premise that similarity relationships between different 3D ALs might be quantifiable if such image features could be canonicalized and formally compared. Therefore, we have recorded 3D AL images and converted them into representations, in which landscape topologies were defined by varying color pixel intensities and from which features suitable for 3D AL comparison could be systematically extracted. On the basis of the resulting feature sets, 3D AL similarity analysis was performed, and different comparisons were carried out. In the following, our computational approach is presented and applied to establish proof of concept.

2.2. Activity Landscape Image Processing

The potency surface of 3D ALs is determined by three variables capturing structural and potency relationships between compounds: distance, elevation, and color gradients. While distance accounts for structural relationships (walks in chemical space), both elevation and color gradients account for potency relationships. This inherent redundancy in representing potency relationships makes it possible to replace elevation with color profiles that capture AL topology by varying color pixel intensities. This idea is central to 3D AL image processing and comparison. Accordingly, original 3D AL images are converted into color-coded heatmaps that represent top-down views of the landscapes preserving distance and potency relationships, as illustrated in Figure 1. Furthermore, in Figure 2, an original 3D AL and the corresponding heatmap are compared in greater detail by mapping corresponding positions of exemplary active compounds participating in the formation of different activity cliffs within the same region.

Figure 1. Activity landscape views and topological features. (a) At the top, an exemplary original 3D AL (left) and the corresponding heatmap (right) are shown. The heatmap conveys a top-down view of the AL. At the bottom, topological features extracted from the heatmap are depicted. For peaks and valleys, contours are drawn covering eight threshold levels (±0.25, ±0.5, ±0.75, and ±0.9). (b) The generation of a feature vector that records the cumulative area of shape features at each contour threshold level is illustrated. For clarity, only individual contours with an area greater than 3 are shown and labeled with their respective thresholds.

Figure 2. Compound mapping on activity landscape representations. Corresponding positions of highly and weakly potent orexin receptor type 2 antagonists belonging to different analogue series are mapped on the original 3D AL of the data set (top left) and the heatmap representation (top right), respectively. The pair of highly potent analogues and the pair of weakly potent analogues contribute to the formation of different activity cliffs within the same region of the AL. At the bottom, the structures of these compounds are shown with their logarithmic potency values. Chemical modifications distinguishing highly potent analogues (top) and weakly potent analogues (bottom) are highlighted (red).

2.3. Activity Landscape Features

The resulting heatmaps embed unique topological features as color profiles and color intensity-based textures. By well-defined contouring (see Section 3), shape features are defined, as shown in Figure 1a. Contours are derived based on the potency value distributions and thus capture different shape features on a relative scale, with corresponding threshold values distinguishing different contour levels. Characteristic features are then extracted and represented as AL-specific feature vectors for quantitative comparison. The generation of a feature vector accounting for the different shape features of an activity class is illustrated in Figure 1b. The feature extraction approach presented herein focuses on the detection of borders in heatmaps that encompass regions of different topologies and enclose valleys or peaks. Accordingly, different topological features yield different shapes. Feature extraction is facilitated using the marching squares algorithm (MSA)15 (see Section 3). Two main characteristics of shape features include the area that is covered and the color intensity range, for which thresholds are defined. Shapes representing peaks or valleys in 3D ALs are compared across different threshold levels and the AL similarity is quantified in different ways using a suitable similarity metric.16
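As a minimal sketch of how a feature vector of the kind illustrated in Figure 1b might be assembled, the following Python snippet sums contour areas per threshold level. The function name `build_feature_vector`, the ordering of the threshold levels, and the example contour records are illustrative assumptions and are not taken from the original implementation.

```python
from collections import defaultdict

# Threshold levels named in the text; this particular ordering is an assumption.
THRESHOLDS = (-0.9, -0.75, -0.5, -0.25, 0.25, 0.5, 0.75, 0.9)


def build_feature_vector(contours, thresholds=THRESHOLDS):
    """Sum contour areas per threshold level.

    `contours` is an iterable of (threshold, area) pairs, one per contour.
    Thresholds without any contour contribute an area of 0.0.
    """
    totals = defaultdict(float)
    for threshold, area in contours:
        if threshold not in thresholds:
            raise ValueError(f"unexpected threshold: {threshold}")
        totals[threshold] += area
    return [totals[t] for t in thresholds]


# Hypothetical contour records: (threshold, area in pixels).
example_contours = [
    (0.25, 120.0),
    (0.25, 30.0),
    (0.5, 40.0),
    (-0.25, 900.0),
    (-0.5, 310.0),
]
feature_vector = build_feature_vector(example_contours)
```

In this sketch, two contours found at the same threshold simply add up, so the vector carries no information about how many individual shapes contributed to each entry.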

2.4. Exemplary Activity Landscapes

For our proof-of-concept studies, 3D ALs were generated for eight compound data sets (different activity classes) taken from release 23 of ChEMBL,17 as reported in Table 1. The compounds were active against diverse targets and covered varying potency ranges. A general cutoff value of 10 μM potency was applied to exclude borderline active compounds from 3D AL modeling. From 3D AL images, heatmaps were derived for further analysis. Figure 3 shows the heatmaps of these activity classes. The comparison reveals topological similarities and differences. All heatmaps combine mountainous and smooth regions containing peaks and valleys, respectively. These topological features mirror SAR heterogeneity. Within these variable landscapes, differences are observed. For example, the heatmap of ChEMBL data set 204 displays most prominent peaks and cliffs (characterized by sharp borders between peaks and valleys). Based upon visual inspection, the heatmap of data set 4792 appears to resemble this topological phenotype most closely among the others. Other topological similarities or differences are more difficult to judge on the basis of visual inspection. For example, it would hardly be possible to confidently predict whether the AL of set 3155 might be more similar to that of set 219 or 4550. Despite apparent data set-dependent topological differences captured by these heatmaps, it would generally be very difficult to judge (dis)similarity relationships between them, illustrating limitations of our perception. Accordingly, any consistently applicable quantitative measure of AL similarity would provide a substantial advance for AL-assisted SAR exploration.

Table 1. Compound Data Sets(a)

| ChEMBL target ID | target name | compounds | min potency [pKi] | max potency [pKi] |
|---|---|---|---|---|
| 204 | coagulation factor II | 915 | 5 | 12.2 |
| 344 | melanin-concentrating hormone receptor 1 | 1175 | 5 | 9.8 |
| 3155 | 5-hydroxytryptamine receptor 7 | 1094 | 5 | 10 |
| 4792 | orexin receptor type 2 | 1443 | 5 | 10.1 |
| 255 | adenosine receptor A2b | 1237 | 5 | 9.8 |
| 219 | dopamine D4 receptor | 1082 | 5 | 10.5 |
| 4550 | arachidonate 5-lipoxygenase-activating protein | 1318 | 5.6 | 9.4 |
| 225 | serotonin receptor 2C | 1079 | 5 | 9.7 |

(a) The composition of the eight compound activity classes used for 3D AL analysis is summarized.

Figure 3. Heatmaps. For the eight activity classes, heatmaps derived from 3D AL images are shown. ChEMBL target IDs are reported according to Table 1.

2.5. Activity Landscape Comparison

On the basis of extracted image features, as discussed above, different types of comparisons were carried out, attempting to discern topological characteristics and relate topology to SAR information content.

First, we applied the weighted Jaccard coefficient (Jw) to compare feature vectors recording fractional heatmap areas contoured at different threshold levels (corresponding to different topological features). The formalism is presented in Section 3.4. Importantly, the comparison of feature vectors did not depend on establishing correspondences between individual shapes. To avoid “averaging” over distinct topological features accounting for different SARs, comparisons were separately carried out for valleys (negative threshold levels) and peaks (positive thresholds), corresponding to SAR continuity and discontinuity, respectively. Feature vectors of activity classes were compared in a pairwise manner, and the ALs were then ranked separately for valleys and peaks in the order of descending similarity to set 204, which served as a reference AL. The results are reported in Table 2. As can be seen, feature vector comparison yielded meaningful rankings of landscapes with a significant spread of pairwise similarity values (Jw ranges from 0 to 1). As expected, the relative ranking on the basis of valleys and peaks changed. Similarity values were generally higher for comparison of valleys than of peaks. This was expected, given the generally larger area covered by valleys in heatmaps. To the (limited) extent we were able to subjectively judge such similarity relationships, the rankings were reconcilable and intuitive. For example, in peak-based ranking, the AL of set 4792 (rank 2) was most similar to reference set 204, followed by set 3155 (rank 3), consistent with visual inspection. In valley-based ranking, the AL of set 4550 (rank 8) was most dissimilar to reference 204, which was also intuitive. On the other hand, on the basis of visual inspection, it was more difficult to understand why, in valley-based ranking, set 225 (rank 2) was more similar to 204 than set 219 (rank 3). Clearly, the similarity calculations numerically distinguished between pairwise relationships that were essentially impossible for us to judge, which was what we had aimed for. Successful ranking of different ALs according to topological features on the basis of calculated similarity values was considered an encouraging finding.

Table 2. Similarity-Based Ranking of Activity Landscapes(a)

| rank | data set (valley-based) | JwV | data set (peak-based) | JwP |
|---|---|---|---|---|
| 1 | 204 | 1 | 204 | 1 |
| 2 | 225 | 0.89 | 4792 | 0.58 |
| 3 | 219 | 0.87 | 3155 | 0.46 |
| 4 | 3155 | 0.85 | 219 | 0.31 |
| 5 | 255 | 0.78 | 255 | 0.28 |
| 6 | 344 | 0.69 | 225 | 0.15 |
| 7 | 4792 | 0.63 | 4550 | 0.14 |
| 8 | 4550 | 0.45 | 344 | 0.13 |

(a) 3D ALs of activity classes were ranked in the order of decreasing similarity using the AL of data set 204 as a reference. IDs are used according to Table 1. Separate rankings are reported on the basis of features accounting for valleys and peaks, respectively.

To assess the SAR information contained in shapes identified for specific thresholds, structure–activity similarity (SAS) maps1,4 were calculated for exemplary contours obtained at decreasing threshold levels, as shown in Figure 4 for a representative data set. SAS maps are scatter plots of compound pairs, in which the x-axis reports pairwise similarity and the y-axis pairwise potency differences. Here, the extended connectivity fingerprint with bond diameter 4 (ECFP4)18 was used as a molecular representation to calculate Tanimoto similarity.19 As can be seen, the contour of the highest threshold 0.75 (corresponding to a pKi value of 8.5) comprises only very few, but very similar, compounds, with pairs formed by these compounds displaying potency differences of up to around two orders of magnitude. For the lower threshold of 0.5 (corresponding to a pKi of 8.0), more diverse compounds are contained in the contoured region. The majority of compound pairs still share a Tanimoto similarity of 0.4 or higher, which is significant for ECFP4. Thus, given the wide spread of potency differences, this region of the 3D AL is rich in SAR information. For the threshold of 0.25 (corresponding to a pKi of 7.5), the contour comprises many compounds with pairwise similarities of, on average, less than 0.4, which, by definition, do not yield meaningful SARs. Taken together, these findings demonstrate that by comparing only the highest threshold peak contours and lowest threshold valley contours (see Section 3), AL similarity analysis focuses on the most relevant regions with respect to SAR information content. This is the case because comparison of these features leads to the identification of similar compounds with largest potency differences in data sets that determine SARs.
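The data behind an SAS map can be sketched in a few lines of Python: every compound pair yields one point consisting of a Tanimoto similarity and an absolute potency difference. In this illustration, fingerprints are represented as plain sets of on-bit indices standing in for ECFP4, and the compounds, pKi values, and function names are hypothetical; the original study does not specify its implementation at this level.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def sas_points(compounds):
    """Return one (similarity, |delta pKi|) point per compound pair.

    `compounds` is a list of (fingerprint_bits, pKi) tuples.
    """
    return [
        (tanimoto(fp1, fp2), abs(p1 - p2))
        for (fp1, p1), (fp2, p2) in combinations(compounds, 2)
    ]


# Hypothetical compounds; the bit sets are placeholders for ECFP4 fingerprints.
compounds = [
    ({1, 2, 3, 4}, 9.1),
    ({1, 2, 3, 5}, 7.0),
    ({7, 8, 9}, 8.2),
]
points = sas_points(compounds)
```

Points with high similarity on the x-axis and a large potency difference on the y-axis correspond to the activity cliff-forming pairs discussed above.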

Figure 4. Feature-based structure–activity relationship information. In the heatmap of activity class 204, an exemplary peak region is contoured at different threshold levels of 0.75 (yellow), 0.50 (blue), and 0.25 (green), and a close-up view is shown. For compounds associated with the shapes detected at each contour level, structure–activity similarity (SAS) maps are shown capturing associated SAR information.

2.6. Conclusions

Activity landscape representations are used for SAR visualization and aid in the exploration of SARs contained in compound data sets. Among these representations, 3D AL models are particularly intuitive. So far, comparisons of 3D ALs have been confined to the qualitative level. However, quantitative comparisons of 3D ALs would greatly help in assessing differences in SAR information content beyond what can be appreciated on the basis of visual inspection. In this work, we have presented a computational methodology to facilitate quantitative comparisons of 3D ALs on the basis of topological features extracted from AL image data. As we have shown, numerical analysis discerns similarity relationships between 3D ALs in a meaningful way and enables ranking of ALs according to relative differences in the SAR information they capture. In addition, we have demonstrated that compound subsets associated with different contoured areas representing defined topological features convey varying SAR information, as one would expect. Taken together, our findings suggest that the approach introduced herein complements SAR visualization and further increases the potential of 3D ALs for large-scale SAR analysis.

3. Materials and Methods

3.1. Three-Dimensional Activity Landscapes

Three-dimensional AL models of compound data sets were constructed as described.14 For 3D AL modeling, preferred approaches for dimensionality reduction of original chemical reference spaces, molecular representations, and similarity/distance calculations have been identified in earlier studies.3,8 Accordingly, chemical reference space was generated on the basis of ECFP418 as a molecular representation and calculation of pairwise compound Tanimoto distances.19 The 2D projection of chemical reference space was then computed using multidimensional scaling (MDS)20 applying a stress function based upon pairwise Tanimoto distances. MDS was previously found to be a preferred dimensionality reduction approach for retaining compound distances in original chemical reference (fingerprint) spaces. For the data sets used herein, the 2D projections preserved original ECFP4 Tanimoto distance relationships between compounds, with correlation coefficients between Tanimoto distances in fingerprint space and Euclidean distances in 2D projections of ∼0.7. The potency surface was interpolated via Gaussian process regression.21 This approach interpolates intermediate values by a Gaussian process based upon prior covariances of experimental potency values. The “Sum of Matern and White” kernel21 was used assuming a mean of zero to derive relationships between experimental data points (potency values), and Gaussian noise factors were applied to permit minor variations of z-values for points on the xy plane and optimize the global fit of the surface to experimental data points. Noise factors were regularized by optimizing the kernel’s α parameter between 10⁻¹ and 10⁻⁷ over 10 iterations. The potency gradient was applied to a limited pKi range from 5.0 (green) over 7.0 (yellow) to 9.0 (red). Potency values larger than 9.0 were assigned to red.
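The distance preservation check mentioned above, the correlation between Tanimoto distances in fingerprint space and Euclidean distances in the 2D projection, can be illustrated with a small Python sketch. It does not perform MDS or Gaussian process regression; it only assumes that 2D coordinates are already available. The fingerprints (sets of on-bits), the coordinates, the use of the Pearson coefficient, and all function names are hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def tanimoto_distance(fp_a, fp_b):
    """Tanimoto distance (1 - similarity) of two on-bit sets."""
    union = len(fp_a | fp_b)
    similarity = len(fp_a & fp_b) / union if union else 1.0
    return 1.0 - similarity


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def distance_preservation(fingerprints, coords_2d):
    """Correlate fingerprint-space distances with 2D Euclidean distances."""
    pairs = list(combinations(range(len(fingerprints)), 2))
    d_fp = [tanimoto_distance(fingerprints[i], fingerprints[j]) for i, j in pairs]
    d_2d = [math.dist(coords_2d[i], coords_2d[j]) for i, j in pairs]
    return pearson(d_fp, d_2d)


# Hypothetical fingerprints and an assumed 2D projection of the same compounds.
fingerprints = [{1, 2, 3}, {1, 2, 4}, {5, 6, 7}, {5, 6, 8}]
coords_2d = [(0.0, 0.0), (0.4, 0.0), (1.0, 1.0), (1.0, 1.5)]
r = distance_preservation(fingerprints, coords_2d)
```

A value of `r` close to 1 would indicate that the projection retains the pairwise distance relationships of the fingerprint space well.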

3.2. Image Preprocessing

For each 3D AL, a heatmap was initially computed using the red, green, and blue (RGB) color model of OpenCV version 3.0 with eight bits per channel.22 Because the original 3D AL models were created by interpolating potency values using a color gradient from red over yellow to green, without using the blue channel, only the red and green (RG) channel pixel values were extracted. They were combined into a single intensity value ranging from −255 to 255 by subtracting green channel intensity values from red channel intensity values. Hence, the least potent (brightest green), moderately potent (yellow), and most potent (brightest red) compounds/pixels corresponded to values of −255, 0, and +255, respectively. RG pixel values were then normalized to the range of −1 to +1. The RG color model preserved more than 95% of the RGB colors, except for white regions (i.e., interpolated surface area without experimental potency backup), which were represented as yellow in the RG model. However, these regions accounted for less than 5% of the surface.
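The channel arithmetic described above can be sketched in plain Python as follows. The function names are hypothetical, and `threshold_to_pki` additionally assumes that the color gradient is linear in pKi between 5.0, 7.0, and 9.0; this assumption is consistent with the threshold/pKi pairs quoted in Section 2.5 (0.25, 0.5, and 0.75 corresponding to 7.5, 8.0, and 8.5) but is not stated explicitly in the text.

```python
def rg_value(red, green):
    """Map 8-bit red and green intensities to a normalized value in [-1, 1]."""
    for channel in (red, green):
        if not 0 <= channel <= 255:
            raise ValueError("channel values must be in the range 0..255")
    return (red - green) / 255.0


def rg_heatmap(rgb_rows):
    """Convert rows of (R, G, B) pixels to rows of normalized RG values.

    The blue channel is ignored, as described in the text.
    """
    return [[rg_value(r, g) for r, g, _b in row] for row in rgb_rows]


def threshold_to_pki(value):
    """Translate a normalized RG value into a pKi value.

    Assumes a linear gradient: -1 -> 5.0, 0 -> 7.0, +1 -> 9.0.
    """
    return 7.0 + 2.0 * value


# Brightest green, yellow, brightest red, and a white (no-data) pixel.
row = [(0, 255, 0), (255, 255, 0), (255, 0, 0), (255, 255, 255)]
normalized = rg_heatmap([row])[0]
```

Note that the white pixel maps to the same value as yellow, which mirrors the remark above that white regions are treated as yellow in the RG model.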

3.3. Feature Extraction: Contours and Shapes

Feature extraction was performed on the basis of heatmaps of original size 543 × 543 pixels that were rescaled to 300 × 300 pixels. This representation corresponded to a top-down view of the original 3D AL color-coded as described above. Extraction of features proceeded in two steps. First, contour lines (i.e., lines of equal intensity in the image) were used to identify regions encompassing valleys and peaks in 3D ALs. The canonical heatmap representation was thus segmented into different regions using contour lines. The scikit-image implementation of MSA15,23 was applied to extract the contours. MSA represents the 2D version of the marching cubes algorithm,15 which creates a contour line segment by mapping an image onto a square grid.

For contour extraction, each heatmap was initially binarized using threshold values of 0.25, 0.5, 0.75, and 0.9, respectively, to delineate shapes representing peaks, while inversely binarized negative thresholds of −0.25, −0.5, −0.75, and −0.9 were used to identify smooth regions (valleys). The following MSA parameter settings were used: high connectivity, high positive orientation, and iso-line of level 7. The resulting contour lines represented nonintersecting closed curves. Shapes were subsequently characterized as groups of contour levels of increasing threshold magnitude, with higher-threshold contours being enclosed by lower-threshold contour lines. Contour areas were calculated on the basis of Green’s theorem using computed image moments.24 Threshold contours identified individual peaks of the AL for highest positive thresholds and valleys for lowest negative thresholds of individual shapes. Peak and valley contours were only considered if they were contained in at least one contour of a lower-magnitude threshold. Each peak and valley was then defined by its area and its threshold level.
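The two ingredients of this step, threshold-based binarization and area calculation for a closed contour, can be illustrated with a pure-Python sketch. It is a stand-in rather than the procedure used in the study: the contours themselves were obtained with the scikit-image implementation of MSA and areas from image moments, whereas the sketch uses the shoelace formula (the polygon form of Green's theorem) on a given list of contour vertices. Whether pixels exactly at the threshold are included is an assumption, as are the function names and the example data.

```python
def binarize(heatmap, threshold):
    """Binarize a heatmap of normalized values in [-1, 1].

    Positive thresholds mark pixels at or above the threshold (peaks);
    negative thresholds mark pixels at or below it (valleys).
    """
    if threshold > 0:
        return [[1 if v >= threshold else 0 for v in row] for row in heatmap]
    return [[1 if v <= threshold else 0 for v in row] for row in heatmap]


def polygon_area(vertices):
    """Area enclosed by a closed contour given as (x, y) vertices.

    Uses the shoelace formula; the last vertex is implicitly joined
    to the first one.
    """
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


# Hypothetical 3 x 3 heatmap of normalized RG values.
heatmap = [
    [-0.8, -0.3, 0.1],
    [-0.6, 0.3, 0.8],
    [0.0, 0.6, 0.95],
]
peak_mask = binarize(heatmap, 0.5)
valley_mask = binarize(heatmap, -0.5)

# Hypothetical rectangular contour with corner points in pixel coordinates.
area = polygon_area([(0, 0), (4, 0), (4, 3), (0, 3)])
```

Because the absolute value is taken, the computed area does not depend on whether the contour vertices are listed clockwise or counterclockwise.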

3.4. Activity Landscape Similarity Analysis

To quantify the similarity of two 3D AL images based upon their heatmaps, peaks and valleys were generated and compared. For each AL and threshold, the total areas of the peak and valley contours corresponding to the given threshold were determined, resulting in a feature vector comprising eight values:

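$$\mathbf{a} = \left(a_{-0.9},\ a_{-0.75},\ a_{-0.5},\ a_{-0.25},\ a_{0.25},\ a_{0.5},\ a_{0.75},\ a_{0.9}\right)$$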

This vector represents the total area of the peaks/valleys at each threshold level. For each pair of landscapes A, B (with corresponding feature vectors a, b), a similarity coefficient was established by calculating the weighted Jaccard index or Ružička similarity16

$$J_w(A,B) = \frac{\sum_{t} \min(a_t, b_t)}{\sum_{t} \max(a_t, b_t)}$$

The resulting coefficients for peaks and valleys were termed JwP (A,B) and JwV (A,B), respectively.
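A minimal Python sketch of the weighted Jaccard (Ružička) similarity and of a similarity-based ranking against a reference landscape is given below. The feature vectors are hypothetical and restricted to four entries, as they would be for a peak-only or valley-only comparison; the function names and the convention of returning 1.0 for two all-zero vectors are assumptions made for this illustration.

```python
def weighted_jaccard(a, b):
    """Weighted Jaccard (Ruzicka) similarity of two non-negative vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    if any(x < 0 for x in (*a, *b)):
        raise ValueError("feature vectors must be non-negative")
    denominator = sum(max(x, y) for x, y in zip(a, b))
    if denominator == 0:
        return 1.0  # assumption: two all-zero vectors count as identical
    return sum(min(x, y) for x, y in zip(a, b)) / denominator


def rank_by_similarity(reference, others):
    """Rank named feature vectors by descending similarity to a reference."""
    scores = {name: weighted_jaccard(reference, vec) for name, vec in others.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical peak areas at thresholds 0.25, 0.5, 0.75, and 0.9.
reference_peaks = [10, 5, 2, 1]
other_peaks = {
    "set_X": [8, 5, 1, 0],
    "set_Y": [2, 0, 0, 0],
}
ranking = rank_by_similarity(reference_peaks, other_peaks)
```

Applying the same function to the entries for negative thresholds would give the valley-based coefficient, so that separate rankings for peaks and valleys can be produced from one pair of eight-element vectors.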

Acknowledgments

J.I. is supported by a Ph.D. fellowship from the German Academic Exchange Service (DAAD) in collaboration with the Higher Education Commission (HEC) of Pakistan.

Author Contributions

The study was carried out and the manuscript written with contributions of all authors. All authors have approved the final version of the manuscript.

The authors declare no competing financial interest.

References

  1. Shanmugasundaram V.; Maggiora G. M. In Characterizing Property and Activity Landscapes Using an Information-Theoretic Approach, Proceedings of 222nd American Chemical Society National Meeting, Division of Chemical Information, Chicago, IL, Aug 26–30, 2001; American Chemical Society: Washington, D.C., 2001; Abstract no. 77.
  2. Wassermann A. M.; Wawer M.; Bajorath J. Activity Landscape Representations for Structure-Activity Relationship Analysis. J. Med. Chem. 2010, 53, 8209–8223. 10.1021/jm100933w.
  3. Peltason L.; Iyer P.; Bajorath J. Rationalizing Three-dimensional Activity Landscapes and the Influence of Molecular Representations on Landscape Topology and the Formation of Activity Cliffs. J. Chem. Inf. Model. 2010, 50, 1021–1033. 10.1021/ci100091e.
  4. Yongye A. B.; Byler K.; Santos R.; Martínez-Mayorga K.; Maggiora G. M.; Medina-Franco J. L. Consensus Models of Activity Landscapes with Multiple Chemical, Conformer, and Property Representations. J. Chem. Inf. Model. 2011, 51, 1259–1270. 10.1021/ci200081k.
  5. Iyer P.; Dimova D.; Vogt M.; Bajorath J. Navigating High-Dimensional Activity Landscapes: Design and Application of the Ligand-Target Differentiation Map. J. Chem. Inf. Model. 2012, 52, 1962–1969. 10.1021/ci3002765.
  6. Medina-Franco J. L.; Navarrete-Vázquez G.; Méndez-Lucio O. Activity and Property Landscape Modeling is at the Interface of Chemoinformatics and Medicinal Chemistry. Future Med. Chem. 2015, 7, 1197–1211. 10.4155/fmc.15.51.
  7. Vogt M. Progress with Modeling Activity Landscapes in Drug Design. Expert Opin. Drug Discovery 2018, 13, 605–615. 10.1080/17460441.2018.1465926.
  8. Miyao T.; Funatsu K.; Bajorath J. Three-dimensional Activity Landscape Models of Different Design and Their Application to Compound Mapping and Potency Prediction. J. Chem. Inf. Model. 2019, 59, 993–1004. 10.1021/acs.jcim.8b00661.
  9. Wawer M.; Peltason L.; Weskamp N.; Teckentrup A.; Bajorath J. Structure–Activity Relationship Anatomy by Network-like Similarity Graphs and Local Structure–Activity Relationship Indices. J. Med. Chem. 2008, 51, 6075–6084. 10.1021/jm800867g.
  10. Maggiora G. M. On Outliers and Activity Cliffs – Why QSAR often Disappoints. J. Chem. Inf. Model. 2006, 46, 1535. 10.1021/ci060117s.
  11. Stumpfe D.; Hu Y.; Dimova D.; Bajorath J. Recent Progress in Understanding Activity Cliffs and their Utility in Medicinal Chemistry. J. Med. Chem. 2014, 57, 18–28. 10.1021/jm401120g.
  12. Peltason L.; Bajorath J. SAR Index: Quantifying the Nature of Structure-Activity Relationships. J. Med. Chem. 2007, 50, 5571–5578. 10.1021/jm0705713.
  13. Guha R.; Van Drie J. H. Structure-Activity Landscape Index: Identifying and Quantifying Activity Cliffs. J. Chem. Inf. Model. 2008, 48, 646–658. 10.1021/ci7004093.
  14. Iqbal J.; Vogt M.; Bajorath J. Activity Landscape Image Analysis Using Convolutional Neural Networks. J. Cheminf. 2020, 12, e34. 10.1186/s13321-020-00436-5.
  15. Lorensen W. E.; Cline H. E. In Marching Cubes: A High Resolution 3D Surface Construction Algorithm, Proceedings of the 14th Annual Conference on Computer Graphics and Interactive Techniques; Association for Computing Machinery: New York, NY, 1987; pp 163–169.
  16. Ružička M. Anwendung mathematisch-statistischer Methoden in der Geobotanik (Synthetische Bearbeitung von Aufnahmen). Biológia 1958, 13, 647–661.
  17. Gaulton A.; Bellis L. J.; Bento A. P.; Chambers J.; Davies M.; Hersey A.; Light Y.; McGlinchey S.; Michalovich D.; Al-Lazikani B.; Overington J. P. ChEMBL: A Large-scale Bioactivity Database for Drug Discovery. Nucleic Acids Res. 2012, 40, D1100–D1107. 10.1093/nar/gkr777.
  18. Rogers D.; Hahn M. Extended-connectivity Fingerprints. J. Chem. Inf. Model. 2010, 50, 742–754. 10.1021/ci100050t.
  19. Rogers D. J.; Tanimoto T. T. A Computer Program for Classifying Plants. Science 1960, 132, 1115–1118. 10.1126/science.132.3434.1115.
  20. Borg I.; Groenen P. J. F. Modern Multidimensional Scaling: Theory and Applications; Springer: New York, 2005.
  21. Rasmussen C. E. Gaussian Processes in Machine Learning. In Summer School on Machine Learning; Springer: Berlin, Heidelberg, 2003; pp 63–71.
  22. Bradski G.; Kaehler A. Learning OpenCV: Computer Vision with the OpenCV Library; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2008.
  23. Van der Walt S.; Schönberger J.; Nunez-Iglesias J.; Boulogne F.; Warner J.; Yager N.; Gouillart E.; Yu T.; the Scikit-Image Contributors. Scikit-Image: Image Processing in Python. PeerJ 2014, 2, e453. 10.7717/peerj.453.
  24. Edwards C. H. Calculus and Analytic Geometry; Prentice-Hall College Division: Upper Saddle River, 1990.

Articles from ACS Omega are provided here courtesy of American Chemical Society
