a, b For sufficiently large sample sizes, statistical power to detect a non-zero between-set association strength converges to 1. Shaded areas show 95% confidence intervals across 25 covariance matrices representing distributions with the indicated r_true but different (true) weight vectors. c, d In-sample estimates of the between-set correlations (solid) approach their assumed true (population) value (dashed). e, f Weight errors (quantified as "1 − absolute cosine similarity" between the true weights of the generative model and the weights estimated by CCA/PLS from a collection of samples, computed separately for X and Y, taking the greater of the two), g, h score errors (measured as "1 − absolute Pearson correlation" between estimated and true scores, which are obtained by applying the estimated and true weights, respectively, to common test data), and i, j loading errors (measured as "1 − absolute Pearson correlation" between estimated and true loadings) all approach 0 for sufficiently large sample sizes. Original data features are generally different from principal component scores, but as the relation between these two data representations cannot be constrained, we calculate all loadings here with respect to principal component scores. Moreover, to compare loadings across repeated datasets, we calculate loadings for a common test set, as for the CCA/PLS scores. Left and right columns show results for CCA and PLS, respectively. For all metrics, convergence depends on the true (population) between-set correlation r_true and is slower when r_true is low. Note that the color code indicates the true (population) between-set correlation and corresponds to the dashed horizontal lines in c, d. Curves show the mean and 95% confidence intervals of CCA/PLS estimates across 100 draws of collections of observations with a given sample size from 25 different generative models with the indicated r_true but varying true (population) weight vectors (see Methods). The dimensionality of the X and Y feature spaces was 8.
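The three error metrics defined in panels e to j can be sketched in a few lines of NumPy. This is a minimal illustration, not the analysis code behind the figure: the function names, the random test data, and the perturbed "estimated" weights are all hypothetical stand-ins, and the test data are assumed to already be principal component scores. For simplicity the greater of the X and Y errors is taken for every metric, although the caption states this explicitly only for weight errors.

```python
import numpy as np


def weight_error(w_true, w_est):
    """1 - |cosine similarity| between true and estimated weight vectors."""
    cos = np.dot(w_true, w_est) / (np.linalg.norm(w_true) * np.linalg.norm(w_est))
    return 1.0 - abs(cos)


def pearson_error(a, b):
    """1 - |Pearson correlation| between two 1-D arrays."""
    return 1.0 - abs(np.corrcoef(a, b)[0, 1])


def loadings(data, scores):
    """Pearson correlation of each column of `data` with the score vector."""
    dc = data - data.mean(axis=0)
    sc = scores - scores.mean()
    return dc.T @ sc / (np.linalg.norm(dc, axis=0) * np.linalg.norm(sc))


# Hypothetical setup: 8 features per set (as in the figure), a common test
# set assumed to be in principal component coordinates, and "estimated"
# weights simulated as noisy copies of the true weights.
rng = np.random.default_rng(0)
n_test, n_features = 1000, 8
errors = {"weight": [], "score": [], "loading": []}

for _ in ("X", "Y"):
    test = rng.standard_normal((n_test, n_features))
    w_true = rng.standard_normal(n_features)
    w_est = w_true + 0.1 * rng.standard_normal(n_features)

    # Scores: true and estimated weights applied to the same test data.
    s_true, s_est = test @ w_true, test @ w_est

    errors["weight"].append(weight_error(w_true, w_est))
    errors["score"].append(pearson_error(s_true, s_est))
    errors["loading"].append(
        pearson_error(loadings(test, s_true), loadings(test, s_est))
    )

# Report the greater of the X and Y errors for each metric.
summary = {name: max(vals) for name, vals in errors.items()}
print(summary)
```

Taking the absolute value in each metric makes the errors insensitive to the sign ambiguity of CCA/PLS weights, so an estimate that equals the true weight vector up to a sign flip has an error of 0.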