Table 2.
Clustering performance on internal evaluation indexes based on various patient representations.
| Representation schemes | Parameters for training | | | Cluster evaluation indexes | | |
|---|---|---|---|---|---|---|
| | Corpus used | Corpus with shuffling | Window size | Hopkins statistic | Silhouette index | Davies-Bouldin index |
| Embedding-based representation | Full | Yes | 5 | 0.922 | 0.783 | 1.067 |
| Embedding-based representation | Stroke | Yes | 5 | 0.913 | 0.862a | 0.551b |
| Embedding-based representation | Full | No | 5 | 0.903 | 0.685 | 1.711 |
| Embedding-based representation | Stroke | No | 5 | 0.925 | 0.672 | 1.382 |
| Embedding-based representation | Full | No | 255 | 0.922 | 0.783 | 1.065 |
| Embedding-based representation | Stroke | No | 224 | 0.931c | 0.790 | 0.772 |
| Multi-hot representationd | N/Ae | N/A | N/A | 0.813 | 0.233 | 3.236 |
| Mixture representationf | N/A | N/A | N/A | 0.918 | 0.141 | 4.157 |
aHighest value of the Silhouette index.
bLowest value of the Davies-Bouldin index.
cHighest value of the Hopkins statistic.
dMulti-hot representation: representation method of the combinations of one-hot codes.
eN/A: not applicable.
fMixture representation: representation method of the combination of multi-hot codes for discrete features and real numbers for continuous values of age and laboratory tests.
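
For readers unfamiliar with the three indexes reported in Table 2, the following is a minimal sketch of how they are commonly defined. It is not the authors' implementation: the function names, the toy data, and the use of Euclidean distance are assumptions made for illustration. The Hopkins statistic is written in the convention where values near 1 suggest a strong clustering tendency, which matches the table's treatment of the highest value as best; other conventions exist.

```python
import math
import random


def _group(points, labels):
    """Collect points by cluster label."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    return clusters


def silhouette_index(points, labels):
    """Mean silhouette coefficient (higher is better); needs >= 2 clusters."""
    clusters = _group(points, labels)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def davies_bouldin_index(points, labels):
    """Davies-Bouldin index (lower is better); needs >= 2 clusters."""
    clusters = _group(points, labels)
    centroids = {
        lab: tuple(sum(c) / len(ps) for c in zip(*ps))
        for lab, ps in clusters.items()
    }
    # scatter: mean distance of a cluster's points to its centroid
    scatter = {
        lab: sum(math.dist(p, centroids[lab]) for p in ps) / len(ps)
        for lab, ps in clusters.items()
    }
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return sum(worst) / len(worst)


def hopkins_statistic(points, m, rng):
    """Hopkins statistic with m probes; values near 1 suggest clusters."""
    dims = list(zip(*points))
    lows = [min(d) for d in dims]
    highs = [max(d) for d in dims]
    # w: nearest-neighbour distances for m sampled real points
    sampled = rng.sample(range(len(points)), m)
    w = [
        min(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
        for i in sampled
    ]
    # u: distances from m uniform random points to the nearest real point
    u = []
    for _ in range(m):
        r = tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
        u.append(min(math.dist(r, q) for q in points))
    return sum(u) / (sum(u) + sum(w))


# Hypothetical toy data: two well-separated square clusters.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

print("Silhouette:", silhouette_index(points, labels))
print("Davies-Bouldin:", davies_bouldin_index(points, labels))
print("Hopkins:", hopkins_statistic(points, 4, random.Random(0)))
```

Read against Table 2, a good representation should push the Silhouette index toward 1 and the Davies-Bouldin index toward 0, while the Hopkins statistic only indicates whether the data are clusterable at all, not how good a particular partition is.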