2021 Jul 23;9(7):e19905. doi: 10.2196/19905

Table 2.

Clustering performance on internal evaluation indexes based on various patient representations.

                                Parameters for training                          Cluster evaluation indexes
Representation scheme           Corpus used  Corpus with shuffling  Window size  Hopkins statistic  Silhouette index  Davies-Bouldin index
Embedding-based representation  Full         Yes                    5            0.922              0.783             1.067
Embedding-based representation  Stroke       Yes                    5            0.913              0.862^a           0.551^b
Embedding-based representation  Full         No                     5            0.903              0.685             1.711
Embedding-based representation  Stroke       No                     5            0.925              0.672             1.382
Embedding-based representation  Full         No                     255          0.922              0.783             1.065
Embedding-based representation  Stroke       No                     224          0.931^c            0.790             0.772
Multi-hot representation^d      N/A^e        N/A                    N/A          0.813              0.233             3.236
Mixture representation^f        N/A          N/A                    N/A          0.918              0.141             4.157

^a Highest value of the Silhouette index.

^b Lowest value of the Davies-Bouldin index.

^c Highest value of the Hopkins statistic.

^d Multi-hot representation: representation method of the combinations of one-hot codes.

^e N/A: not applicable.

^f Mixture representation: representation method of the combination of multi-hot codes for discrete features and real numbers for continuous values of age and laboratory tests.
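
As a reading aid for Table 2, the following is a minimal sketch of how the Silhouette index and the Davies-Bouldin index can be computed from a set of points and their cluster labels. It uses the standard textbook definitions with Euclidean distance; the source does not state which implementation or distance metric the authors used, so this is an assumption. The function names and the toy data are hypothetical and are not taken from the study.

```python
import math  # math.dist requires Python 3.8+


def group_by_label(points, labels):
    """Collect the points of each cluster into a dict keyed by label."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def silhouette_index(points, labels):
    """Mean silhouette coefficient (range -1 to 1; higher is better)."""
    clusters = group_by_label(points, labels)
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, hence the len(own) - 1).
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)


def davies_bouldin_index(points, labels):
    """Davies-Bouldin index (non-negative; lower is better)."""
    clusters = group_by_label(points, labels)
    centroids = {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in clusters.items()
    }
    # Scatter: mean distance of a cluster's points to its centroid.
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }
    worst_ratios = []
    for i in clusters:
        # Worst-case similarity of cluster i to any other cluster j.
        worst_ratios.append(
            max(
                (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
                for j in clusters
                if j != i
            )
        )
    return sum(worst_ratios) / len(worst_ratios)


# Toy data (hypothetical): two well-separated squares of four points each.
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
separated_labels = [0, 0, 0, 0, 1, 1, 1, 1]  # labels follow the true groups
mixed_labels = [0, 1, 0, 1, 0, 1, 0, 1]      # labels cut across the groups

for name, labels in [("separated", separated_labels), ("mixed", mixed_labels)]:
    print(name, silhouette_index(toy_points, labels), davies_bouldin_index(toy_points, labels))
```

On the toy data, the labeling that follows the true groups yields a higher Silhouette index and a lower Davies-Bouldin index than the mixed labeling. This is the same direction of "better" used in footnotes a and b.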