2017 Feb 2;9:6. doi: 10.1186/s13321-017-0192-4

Fig. 3.

Representation of the similarity matrix of the lysine dipeptide dataset using the agglomerative clustering algorithm (top) and the sketchmap algorithm (bottom; projection parameters are given following the scheme σA_Ba_b). A few representative structures (see Eq. (7)) of interesting clusters are shown (right), and their positions on the sketchmaps and on the dendrogram are highlighted. The five sketchmaps are colored according to the conformational energy and the backbone dihedral angles ϕ, ψ, ω1 and ω2. The dendrogram shows the clustering hierarchy of the structures in the dataset. Each structure is vertically aligned with its properties, which are shown as color bars below the dendrogram. The dendrogram is cut at a linkage distance of 0.1 because structural properties are very similar below this threshold; the clusters merged at this level are shown as thick gray bars separated by light-gray lines. Clusters composed of only one structure are drawn as a black line reaching the bottom of the dendrogram. The main structural motifs of this set of structures are governed by the peptide bond dihedral angles ω1 and ω2. The two main clusters (a) and (b) show a global correlation with the angle ω2, while the angle ω1 splits them into two well-correlated sub-clusters, (d)–(g), respectively. Cluster (c) is highlighted as an example containing 'outlier' structures of low conformational energy.
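The dendrogram cut described in the caption can be illustrated with a minimal sketch. It is not the authors' implementation: the linkage criterion (average linkage here), the function names `average_linkage` and `cut_dendrogram`, and the toy 5×5 distance matrix are all assumptions chosen for illustration. Only the threshold of 0.1 comes from the caption. Clusters are merged bottom-up until the closest remaining pair is farther apart than the threshold, which is equivalent to cutting the dendrogram at that linkage distance.

```python
from itertools import combinations


def average_linkage(dist, a, b):
    """Mean pairwise distance between the members of clusters a and b.

    Average linkage is an assumption; the caption does not name the criterion.
    """
    return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))


def cut_dendrogram(dist, threshold):
    """Merge clusters bottom-up and stop once the closest pair exceeds threshold."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        (p, q), d = min(
            (
                ((p, q), average_linkage(dist, clusters[p], clusters[q]))
                for p, q in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if d > threshold:
            break  # every remaining merge lies above the cut
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # safe because p < q
    return [sorted(c) for c in clusters]


# Hypothetical symmetric distance matrix for five structures (not paper data).
toy_dist = [
    [0.00, 0.04, 0.30, 0.32, 0.50],
    [0.04, 0.00, 0.28, 0.31, 0.52],
    [0.30, 0.28, 0.00, 0.06, 0.45],
    [0.32, 0.31, 0.06, 0.00, 0.47],
    [0.50, 0.52, 0.45, 0.47, 0.00],
]

clusters = cut_dendrogram(toy_dist, threshold=0.1)
print(clusters)  # [[0, 1], [2, 3], [4]]
```

In this toy case, structure 4 is not within 0.1 of any other cluster and remains a singleton, the situation that the figure draws as a black line reaching the bottom of the dendrogram.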