Author manuscript; available in PMC: 2022 Jun 1.
Published in final edited form as: J Biomed Inform. 2021 Apr 20;118:103788. doi: 10.1016/j.jbi.2021.103788

Fig. 6. Comparison of T-distributed Stochastic Neighbor Embedding visualizations of clusters obtained from the same mixed-type data set with a single-distance and a multiple-distance solution.

Clinical data comprising 21 mixed-type features on 247 patients with chronic lymphocytic leukemia were clustered with two methods. First, the data were transformed to a binary matrix and clustered with the Hamming distance (left). Second, the untransformed, mixed data were clustered with the DAISY dissimilarity algorithm (right). The Hamming solution recovered amorphous groupings without clear separation. The DAISY solution recovered 4 distinct clusters and a small grouping of outliers.
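
To make the contrast between the two dissimilarities concrete, the following minimal sketch compares them on two toy records. It is an illustration under stated assumptions, not the authors' pipeline: the records, the age threshold used for binarization, and the age range are hypothetical, and `gower_dissimilarity` is a simplified Gower-style measure of the kind DAISY uses for mixed data (range-scaled differences for numeric features, simple mismatch for categorical ones).

```python
def hamming_distance(a, b):
    """Fraction of positions at which two equal-length binary vectors differ."""
    return sum(ai != bi for ai, bi in zip(a, b)) / len(a)


def gower_dissimilarity(x, y, is_numeric, ranges):
    """Simplified Gower-style dissimilarity between two mixed-type records.

    Numeric features contribute |x - y| / range; categorical features
    contribute 0 on a match and 1 on a mismatch. The result is the mean
    contribution over all features (equal weights, no missing values).
    """
    total = 0.0
    for xi, yi, numeric, r in zip(x, y, is_numeric, ranges):
        if numeric:
            total += abs(xi - yi) / r if r > 0 else 0.0
        else:
            total += 0.0 if xi == yi else 1.0
    return total / len(x)


# Hypothetical toy records: (age in years, sex, stage).
p1 = (50, "F", "A")
p2 = (70, "F", "B")
is_numeric = (True, False, False)
ranges = (40.0, None, None)  # assumed observed range of age; unused for categoricals

# Mixed-type comparison on the untransformed records.
d_gower = gower_dissimilarity(p1, p2, is_numeric, ranges)

# Binarized comparison: age > 60, sex == "F", stage == "B" (assumed thresholds).
b1 = (0, 1, 0)
b2 = (1, 1, 1)
d_hamming = hamming_distance(b1, b2)
```

In this toy case, binarization turns a moderate age difference into a full mismatch, so the Hamming distance discards the magnitude information that the mixed-type measure retains.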