To cross-validate the differences in clinical symptoms and atypical connectivity between subgroups shown in Figs. 3–4 and Extended Data Fig. 3, we first subsampled 95% of the data in each of 1,000 replicates. Second, we calculated canonical variates (a connectivity score and a clinical score for each brain–behavior dimension) in each replicate. Third, in each replicate, we hierarchically clustered subjects on their connectivity scores using cosine distance and average linkage and identified four subgroups. Fourth, we used the Hungarian method to match clusters between replicates, because the numerical label assigned to a subgroup can change even when the subjects in the cluster do not. Fifth, we calculated the distribution of clinical symptom z-scores for each subgroup across replicates. Sixth, in each replicate, we calculated atypical connectivity for each subgroup versus N = 907 neurotypical controls (two-sided Welch’s t-test). Seventh, we calculated the mean and standard deviation (σ) of atypical connectivity (t) on RSFC over the 1,000 subsampled replicates. (a–d) Subgroups differ with respect to clinical symptoms, similar to the subgroup differences in Fig. 3b–e, where subgroups were defined by the modal cluster assignment across 1,000 training sets (mode analysis). Plots include 284 subjects × 1,000 training sets to show the distribution of clinical behaviors across all 1,000 training-set cluster assignments. Box bounds: 25th and 75th percentiles; center: median; whiskers: 99.3% of data within ±2.7σ; outliers: circles. (e–h) Heatmaps show the mean atypical connectivity across replicates in each subgroup, by brain region (rows) and functional network (columns), thresholded for significant atypical connectivity (two-sided Welch’s t-test, mean FDR < 0.05). (i–l) Heatmaps show the standard deviation of atypical connectivity across replicates in each subgroup, by brain region (rows) and functional network (columns).
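The subsampling, clustering, and label-matching steps could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`cluster_scores`, `match_labels`, `replicate_labels`) are hypothetical, the connectivity scores are taken as a precomputed array rather than re-estimated by canonical correlation analysis in each replicate, and each replicate is matched to a single reference labeling rather than replicates being matched to one another.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.optimize import linear_sum_assignment


def cluster_scores(scores, n_clusters=4):
    """Average-linkage hierarchical clustering on cosine distance.

    Returns integer labels starting at 0 (at most n_clusters distinct values).
    """
    tree = linkage(scores, method="average", metric="cosine")
    return fcluster(tree, t=n_clusters, criterion="maxclust") - 1


def match_labels(reference, labels, n_clusters=4):
    """Relabel `labels` so they overlap maximally with `reference`.

    The Hungarian method solves the assignment on the label overlap table.
    """
    overlap = np.zeros((n_clusters, n_clusters), dtype=int)
    np.add.at(overlap, (labels, reference), 1)  # count co-occurrences
    rows, cols = linear_sum_assignment(-overlap)  # negate to maximize overlap
    mapping = np.empty(n_clusters, dtype=int)
    mapping[rows] = cols
    return mapping[labels]


def replicate_labels(scores, reference, n_replicates=1000, fraction=0.95, seed=0):
    """Cluster random subsamples and align each result to `reference`.

    Assumption: `reference` is one fixed labeling of all subjects.
    Subjects left out of a replicate are marked -1.
    """
    rng = np.random.default_rng(seed)
    n = scores.shape[0]
    size = int(round(fraction * n))
    out = np.full((n_replicates, n), -1)
    for i in range(n_replicates):
        idx = rng.choice(n, size=size, replace=False)  # subsample without replacement
        labels = cluster_scores(scores[idx])
        out[i, idx] = match_labels(reference[idx], labels)
    return out
```

With aligned labels in hand, the per-subgroup symptom distributions and the per-replicate Welch's t-statistics (for example via `scipy.stats.ttest_ind(..., equal_var=False)`) can be pooled across replicates without label switching distorting the summaries.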