Fig. 5.

Hierarchical clustering on principal components of neuropsychological (T) scores for subject group identification using Ward’s D2 criterion (Murtagh & Legendre, 2014). PCA is first applied to the subject cognitive matrix to remove redundancy among highly correlated continuous variables. A distance matrix is then computed from the principal components (PCs) using a dissimilarity measure such as the distance correlation (Székely, Rizzo, & Bakirov, 2007), and hierarchical clustering with Ward’s D2 method is applied to this matrix, with clusters selected based on the height of the hierarchical tree. The initial number of clusters is assessed using compactness metrics (Halkidi, Batistakis, & Vazirgiannis, 2002a, 2002b), and cluster stability is evaluated using the Jaccard similarity index (Jaccard, 1912) via a nonparametric bootstrap technique with a number of repetitions (see detailed protocol in Supplementary Methods Section 3.1). We select significant clusters based on the approximately unbiased (AU) p-values (Efron, Halloran, & Holmes, 1996), as shown in Fig. 6a. We provide the final clustering solution by applying the k-means algorithm to the hierarchical clustering output.
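The pipeline described in this caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the synthetic score matrix, the choice of three PCs and three clusters, and the 50 bootstrap repetitions are all hypothetical. Euclidean distance stands in for the distance correlation, and the AU p-value step is omitted.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.cluster.vq import kmeans2


def pca_scores(X, n_components):
    """Project standardized data onto its leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def ward_labels(scores, k):
    """Cut a Ward tree into k clusters, labelled 0..k-1."""
    # Assumption: Euclidean distance replaces the distance correlation.
    tree = linkage(scores, method="ward")
    return fcluster(tree, t=k, criterion="maxclust") - 1


def consolidate(scores, labels, k):
    """Refine hierarchical labels with k-means started at the cluster centroids."""
    centroids = np.vstack([scores[labels == c].mean(axis=0) for c in range(k)])
    _, refined = kmeans2(scores, centroids, minit="matrix")
    return refined


def jaccard_stability(scores, labels, k, n_boot=50, seed=0):
    """Mean best-match Jaccard index per cluster over bootstrap resamples."""
    rng = np.random.default_rng(seed)
    n = scores.shape[0]
    totals = np.zeros(k)
    for _ in range(n_boot):
        # Resample subjects with replacement, then drop duplicate draws.
        idx = np.unique(rng.integers(0, n, size=n))
        boot = ward_labels(scores[idx], k)
        for c in range(k):
            orig = set(np.flatnonzero(labels[idx] == c))
            best = 0.0
            for b in range(k):
                other = set(np.flatnonzero(boot == b))
                union = len(orig | other)
                if union:
                    best = max(best, len(orig & other) / union)
            totals[c] += best
    return totals / n_boot


# Synthetic stand-in for the subject-by-test score matrix (hypothetical data).
rng = np.random.default_rng(0)
centers = np.array([[0.0] * 6, [5.0] * 6, [-5.0] * 6])
X = np.vstack([rng.normal(c, 1.0, size=(30, 6)) for c in centers])

scores = pca_scores(X, n_components=3)
labels = consolidate(scores, ward_labels(scores, 3), 3)
stability = jaccard_stability(scores, labels, 3)
```

Starting k-means from the centroids of the hierarchical clusters is one common way to read "applying the k-means algorithm to the hierarchical clustering output"; the protocol in the Supplementary Methods may differ.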