Table 2.
Data type, dissimilarity, and scaling of continuous features | Categorical features encoded as binary? | Value, n (%) | ||
Continuous | ||||
Euclidean assumed | ||||
Not detailed | N/A^a | 1 (4) | ||
Mixed | ||||
Euclidean assumed | ||||
Scaled but method unspecified | Yes | 1 (4) | ||
Scaled but method unspecified | No | 1 (4) | ||
Scaled to lie in the interval of 0 to 1 | Yes | 1 (4) | ||
z-scores | Yes | 1 (4) | ||
z-scores | No | 1 (4) | ||
Not detailed | Yes | 3 (13) | ||
Not detailed | No | 6 (26) | ||
Euclidean stated | ||||
z-scores | Yes | 2 (9) | ||
z-scores | No | 1 (4) | ||
Gower^b | ||||
Gower standardisation | No | 3 (13) | ||
Scaled but method unspecified | No | 1 (4) | ||
treeClust | ||||
Not detailed | No | 1 (4) |
^a N/A: not applicable (irrelevant for continuous features).
^b Computing the Gower coefficient normalizes the distance between feature samples by dividing by the feature range. Therefore, it is not necessary to normalize continuous features prior to computing the Gower coefficient.
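
The range normalization described in the Gower footnote can be illustrated with a minimal sketch. It assumes the common unweighted form of the Gower coefficient: absolute difference divided by the feature range for continuous features, simple mismatch (0 or 1) for categorical features, averaged over all features. The function names (`feature_ranges`, `gower_distance`, `pairwise`) and the toy records are hypothetical and are not taken from any of the reviewed studies.

```python
import math

def feature_ranges(rows, is_categorical):
    """Range (max - min) of each continuous column; 0.0 placeholder for categorical ones."""
    ranges = []
    for j, cat in enumerate(is_categorical):
        if cat:
            ranges.append(0.0)
        else:
            col = [row[j] for row in rows]
            ranges.append(max(col) - min(col))
    return ranges

def gower_distance(x, y, ranges, is_categorical):
    """Unweighted Gower dissimilarity between two mixed-type records."""
    total = 0.0
    for a, b, r, cat in zip(x, y, ranges, is_categorical):
        if cat:
            total += 0.0 if a == b else 1.0          # simple mismatch
        else:
            total += abs(a - b) / r if r > 0 else 0.0  # divide by feature range
    return total / len(x)

def pairwise(rows, is_categorical):
    """Gower dissimilarity for every pair of records, keyed by (i, j) with i < j."""
    ranges = feature_ranges(rows, is_categorical)
    return {
        (i, j): gower_distance(rows[i], rows[j], ranges, is_categorical)
        for i in range(len(rows))
        for j in range(i + 1, len(rows))
    }

# Toy data (illustrative only): two continuous features and one categorical feature.
is_categorical = [False, False, True]
rows = [(20.0, 1.5, "a"), (35.0, 1.8, "b"), (50.0, 1.6, "a")]

# Min-max scale the continuous columns to [0, 1] before computing Gower.
mins = [min(r[j] for r in rows) for j in range(2)]
rngs = feature_ranges(rows, is_categorical)
scaled = [
    tuple((r[j] - mins[j]) / rngs[j] for j in range(2)) + (r[2],)
    for r in rows
]

d_raw = pairwise(rows, is_categorical)
d_scaled = pairwise(scaled, is_categorical)

# Prior scaling leaves the dissimilarities unchanged (up to rounding).
same = all(math.isclose(d_raw[k], d_scaled[k], abs_tol=1e-12) for k in d_raw)
print(same)
```

Because the division by the range cancels any prior positive rescaling or shift of a continuous feature, the raw and min-max-scaled inputs yield the same dissimilarities under these assumptions.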