Abstract
We present a method for multimodal brain data registration that aligns shapes of nodal network configurations in an invertible manner. We use ideas from shape analysis to represent an individual subject data configuration as an element on a hypersphere, where geodesics have closed form solutions. The method not only performs inter-subject data registration, but also allows for the construction of a population data template to which all subject data configurations can be registered. Results show compression of data measures and significant reduction in variance after registration. We also observe increased predictive power of regions of interest (ROI) node identification, significant increases in pairwise network connectivity measures, as well as significant increases in canonical correlations with age after registration.
1. Introduction
Structural brain association networks are derived from correlations of cortical parcellated regions of interest (ROI) measures such as gray matter thickness, volume, and surface area, following voxel-based or cortical-based morphometry. Such networks, also termed morphological networks, yield complementary information to direct structural connectivity measures from diffusion weighted imaging (DWI) or connectivity measures from functional magnetic resonance imaging (fMRI) [8, 9]. As opposed to explicit edge-based connections obtained from DWI, morphological networks are extracted by inferring covariance relationships between different nodal brain measures. Although the biological explanations of such inference-based networks are not yet fully developed, several studies have suggested that the structural covariance (i) encodes salient information about the large-scale brain organization, (ii) may be related to changes in brain development and may be perturbed by neurological and psychiatric diseases, (iii) may be modulated by biobehavioral factors, and (iv) may indicate maturational coupling (inter-subject correlations of longitudinal rates of change of cortical thickness) between brain regions [1]. Several methods have been proposed to analyze such inferred brain network associations. However, there is still a need for improved methodology for validating statistical network models and for conducting formal inference on network structural models. Such analyses are conducted in Euclidean spaces, where standard Pearson or partial correlation coefficients are computed on the brain measures. Although the brains are preregistered to an atlas to account for spatial differences between subjects before computing such measures, we hypothesize that there may still remain other confounding factors (including biological and demographic variables) that influence the underlying covariance structure.
Generally, network modeling methods perform multiple regression to factor out these nuisance variables. Such methods typically ignore the nonlinear geometry of the space of these measures. We thus suggest another level of data transformation or alignment to match the measures across ROIs and subjects. Our goal here is to perform statistical analysis of multivariate nodal network connectivity configurations in an invariant manner.
As an initial step, we propose representing multimodal brain data for each subject as a geometric configuration in an abstract multidimensional space and using statistical shape analysis methods for registering nodal data from brain measures across subjects [3, 10]. The data configuration can be identified and visualized in its native multidimensional variable space as well as transformed back to the population space using an invertible transform. The data configuration assumes a matrix form where rows denote observations and the columns denote features. For example, the rows can be ROIs and the columns can be vector valued measures for each ROI. In the variable space, these measures become axes of a coordinate system, and the ROIs become distinct nodes lying in this coordinate system, where a complete set of distinct ROIs in this coordinate system constitutes an individual subject. The subject space becomes a hypersphere, where shapes of data matrices, or nodal network configurations can be registered in a group-wise manner across a population. In this work, such registration is performed by Procrustes analysis, which includes scaling, translation, and exact rotations [3, 10]. A schematic overview of this representation is shown in Fig. 1. The novelty of this work lies in representing brain nodal data measures as configurations in a geometric space, and registering shapes of full data configurations formed from multiple ROIs and multidimensional measures for each ROI across subjects.
Fig. 1. Schematic for representation of subject data.
2. Multimodal Data Registration
Our data consists of averages of morphological features sampled from brain cortical surfaces for a population. However, we emphasize that the method is general, and can also be applied to functional measures from functional MRI or arterial spin labeling (ASL) measures. The morphological measures from surfaces are derived from atlas-based registrations of individual cortices from subjects using available methods such as Freesurfer [4], MSM [11], or BrainSuite [12]. Following registration, the cortex is parcellated into distinct regions of interest, and measures such as myelin (T1w/T2w fraction), cortical thickness, gray matter volume, or sulcal depths, are averaged over the ROIs. Thus for a single subject, one can have a vector of measures at each ROI.
For a set of k ROIs and m measures per ROI, this multimodal dataset for a single subject is represented by a matrix configuration as X = {(x1, …, xi, …, xk)}, where each xi = [w1, w2, …, wj, …, wm]T is a multidimensional vector. Thus $X \in \mathbb{R}^{k \times m}$ is a matrix configuration whose rows are the ROIs and whose columns are the measures. Since the values wj represent different measurements such as thickness, volume, and myelin (T1w/T2w fraction), their units differ, and they should therefore be standardized. Establishing the scale parameters in multimodal analyses generally remains a challenge, although several workarounds have been used previously. One could scale each of the measures as [α1w1, …, αjwj, …, αmwm]T, where the αj are empirically chosen to bring the values of each feature to comparable units, or one could standardize the features using z-scores. In this paper, we empirically choose the αj based on prior observations of the data. Figure 1 shows the schematic of the data representation as well as the configuration of the data in the variable (ROI or feature) space.
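The two scaling options described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `scale_features`, the toy values, and the example weights are hypothetical, and the z-score variant here standardizes each measure across the ROIs of one subject, which is one of several possible readings of "z-scores".

```python
import numpy as np

def scale_features(X, alphas=None):
    """Bring the m measures (columns) of a k x m subject matrix to comparable units.

    alphas: optional length-m weights (the empirical alpha_j). If None, each
    column is z-scored across the k ROIs instead (an assumption of this sketch).
    """
    X = np.asarray(X, dtype=float)
    if alphas is not None:
        # Multiply column j by alpha_j; broadcasting applies it to every row.
        return X * np.asarray(alphas, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Hypothetical subject with k = 4 ROIs and m = 3 measures
# (myelin ratio, thickness, volume); the numbers are made up.
X = np.array([[1.3, 2.5, 9000.0],
              [1.5, 2.9, 12000.0],
              [1.2, 2.2, 7000.0],
              [1.4, 2.7, 10000.0]])

X_alpha = scale_features(X, alphas=[1.0, 0.5, 1e-4])  # empirical weights
X_z = scale_features(X)                               # z-score alternative
```

Either output keeps the k × m layout, so it can be passed unchanged to the centering and scaling step of the registration.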
2.1. Pairwise Subject Data Registration
Given such multimodal configurations for two subjects X1 and X2, our goal is to register them so that all the features are brought into a common space of configurations. Although each of the measures at each ROI has been scaled independently, there still remains a question of the global scale of the matrix, owing to inter-subject differences. Additionally, the two configurations may be shifted in space by a global translation. Thus, before registering them, we perform global centering and scaling as follows. For a subject data matrix X1, we compute a centered and a scaled matrix given by
$$\bar{X}_1 = X_1 - \frac{1}{k}\mathbf{1}_k \mathbf{1}_k^{T} X_1, \tag{1}$$
$$\tilde{X}_1 = \frac{\bar{X}_1}{\|\bar{X}_1\|_F}, \tag{2}$$
where $\mathbf{1}_k$ is the k-vector of ones and || · ||F denotes the Frobenius norm. Without loss of generality, we will denote the scaled and centered subject matrix as X1 throughout this paper. We note that the scaling and centering operation maps the subject data matrix to a unit hypersphere. Finally, to register the two transformed subject configurations X1 and X2, we solve the problem of exact rotation by minimizing the sum squared error as
$$\hat{\Gamma} = \underset{\Gamma \in SO(m)}{\arg\min}\ \|X_1 - X_2 \Gamma\|_F^2, \tag{3}$$
where Γ ∈ SO(m) is an element of the special orthogonal group. The solution to this problem can be estimated, even in noisy data [7]. We follow this method by first computing the singular value decomposition (SVD) of the product of $X_2^{T}$ and $X_1$ as $X_2^{T} X_1 = U \Sigma V^{T}$, where U, V ∈ SO(m). Then the exact rotation matrix is given by $\hat{\Gamma} = U V^{T}$.
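A minimal NumPy sketch of the pairwise registration step follows. It assumes k × m subject matrices (ROIs in rows, measures in columns) and the convention that the second configuration is rotated onto the first, i.e. it minimizes ||X1 − X2 Γ||F over m × m rotations. The function names `preshape` and `optimal_rotation` are hypothetical, and the determinant correction is the standard safeguard against reflections rather than something stated in the text.

```python
import numpy as np

def preshape(X):
    """Center the columns of a k x m matrix and scale it to unit Frobenius norm."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0, keepdims=True)
    return Xc / np.linalg.norm(Xc)  # for 2-D input the default norm is Frobenius

def optimal_rotation(X1, X2):
    """Return the m x m rotation G that minimizes ||X1 - X2 @ G||_F."""
    U, _, Vt = np.linalg.svd(X2.T @ X1)
    # Flip the last singular direction if needed so that det(G) = +1,
    # which keeps G a proper rotation instead of a reflection.
    D = np.eye(U.shape[0])
    D[-1, -1] = np.sign(np.linalg.det(U @ Vt))
    return U @ D @ Vt

# Synthetic check: X2 is X1 rotated by a known rotation R about the third axis.
rng = np.random.default_rng(0)
X1 = preshape(rng.normal(size=(33, 3)))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
X2 = preshape(X1 @ R)

G = optimal_rotation(X1, X2)
X2_registered = X2 @ G  # should coincide with X1 in this noise-free example
```

In this noise-free example the recovered rotation is the inverse of R; with real data the residual ||X1 − X2 Γ||F is generally nonzero.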
2.2. Group-Wise Data Registration
While we can perform pairwise matching of subject data as shown above, the problem of groupwise registration can be solved in a variety of ways. We can assign a fixed subject as a template and register all data to it, but this introduces a bias due to the initial choice of the template. Alternatively, we can perform pairwise registrations for all N subjects and average the resulting data. This can be done very fast, but the notion of averaging transforms for pairwise registrations is not straightforward and can potentially introduce errors. In our work, we estimate an average data template from the population by relying on the spherical geometry of the subject space. The geodesic between two data matrices X1 and X2 on the sphere is given by
$$X(t) = \cos\left(t \cos^{-1}\langle X_1, X_2\rangle_F\right) X_1 + \sin\left(t \cos^{-1}\langle X_1, X_2\rangle_F\right) \frac{f}{\|f\|_F}, \tag{4}$$
where t ∈ [0, 1], 〈·, ·〉F is the Frobenius inner product and f = X2 – 〈X1, X2〉F X1. Then from Eq. 4, the geodesic distance is given by $d(X_1, X_2) = \cos^{-1}\langle X_1, X_2\rangle_F$. Next, we compute the average template using an iterative procedure that minimizes the sum of squared geodesic distances from the template to all subject matrices. Thus, given a set of N subject data matrices X1, X2, …, XN, the average template is given as the local minimizer $\hat{\mu} = \arg\min_{\mu} \sum_{i=1}^{N} d(\mu, X_i)^2$. This iterative procedure is seeded with a Euclidean mean, projected on the sphere as the initial condition. Figure 2 shows an example of a group-wise registration of N subjects with k = 33 left hemisphere ROI measures of myelin, cortical thickness, and gray matter volume (m = 3).
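The spherical geometry used for the template can be sketched as below. This is a hedged illustration, not the authors' implementation: the function names, the fixed step size, and the stopping tolerance are assumptions, inputs are assumed to already have unit Frobenius norm, and the rotational alignment of each subject to the current template is left out to keep the example short. The mean is found by a standard gradient iteration that averages tangent vectors at the current estimate and moves along the sphere.

```python
import numpy as np

def inner(A, B):
    """Frobenius inner product of two matrices of equal shape."""
    return float(np.sum(A * B))

def geodesic_distance(X1, X2):
    """Arc length between two unit-norm matrices on the hypersphere."""
    return float(np.arccos(np.clip(inner(X1, X2), -1.0, 1.0)))

def geodesic(X1, X2, t):
    """Point at parameter t in [0, 1] on the great-circle arc from X1 to X2."""
    theta = geodesic_distance(X1, X2)
    f = X2 - inner(X1, X2) * X1
    nf = np.linalg.norm(f)
    if nf < 1e-12:  # the two points coincide
        return X1.copy()
    return np.cos(t * theta) * X1 + np.sin(t * theta) * f / nf

def log_map(mu, X):
    """Tangent vector at mu pointing toward X, with length d(mu, X)."""
    f = X - inner(mu, X) * mu
    nf = np.linalg.norm(f)
    if nf < 1e-12:
        return np.zeros_like(mu)
    return geodesic_distance(mu, X) * f / nf

def exp_map(mu, v):
    """Move from mu along the tangent vector v, staying on the sphere."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return mu.copy()
    return np.cos(nv) * mu + np.sin(nv) * v / nv

def spherical_mean(Xs, n_iter=100, step=1.0, tol=1e-10):
    """Local minimizer of the sum of squared geodesic distances to Xs."""
    mu = np.mean(Xs, axis=0)
    mu = mu / np.linalg.norm(mu)  # Euclidean mean projected onto the sphere
    for _ in range(n_iter):
        v = np.mean([log_map(mu, X) for X in Xs], axis=0)
        if np.linalg.norm(v) < tol:
            break
        mu = exp_map(mu, step * v)
    return mu

# Synthetic population: noisy copies of one unit-norm 33 x 3 configuration.
rng = np.random.default_rng(1)
base = rng.normal(size=(33, 3))
base /= np.linalg.norm(base)
Xs = []
for _ in range(10):
    Y = base + 0.05 * rng.normal(size=base.shape)
    Xs.append(Y / np.linalg.norm(Y))

template = spherical_mean(Xs)
```

At convergence the average tangent vector at the template is close to zero, which is the first-order condition for the minimizer.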
Fig. 2. Scatter plots of data (top) and variance (bottom) before and after registration. * denotes significant decreases (p < 0.05).
2.3. Order of Data Representation
Thus far we represented the subject data matrix with ROIs as observations (rows), and measures such as thickness, myelin, and volume as features (columns), as shown in the top scatter plot of Fig. 1. Another way to think about this is that from a given subject's brain, we sample (observe) k ROIs at once, and each ROI is described in terms of m features. In this case the method registers ROIs across subjects. Thus we expect a reduction in variance in the measures across all ROIs as well as an increased inter-subject correlation between ROIs for a given measure at a time. While this configuration is easy to understand and will potentially find uses for subsequent group-wise statistical analyses, we could also flip the order of ROIs and measures. We note that this idea is similar to the R- and Q-analyses as introduced by Cattell [2]. That is, each subject data matrix is now represented by a configuration $X \in \mathbb{R}^{m \times k}$, where rows (m) represent measures and columns (k) denote ROIs. Now, given a subject's brain, we sample (observe) m measures (thickness, volume, myelin, etc.), and each measure is described in terms of k ROIs. While this data configuration is not an obvious choice to represent subjects, the method in this case will register measures across subjects. Thus we would expect an increase in pairwise ROI correlations for measures as well as potentially increased correlations of measures (thickness, etc.) with external variables. This configuration may be useful in harmonizing measures across subjects and in improving the sensitivity of group-wise network-based statistical analyses.
3. Results and Applications
3.1. Data
Our data consists of 93 subjects (44M/49F, ages 20–64 years, mean age 36 ± 12 years) collected using the Human Connectome Project (HCP) MRI acquisition protocol. The structural scans consisted of a T1-weighted (T1w) multi-echo MPRAGE (voxel size (VS) = 0.8 mm isotropic; TR = 2.5 s; TE = 1.81:1.79:7.18 ms; TI = 1000 ms; flip angle (FA) = 8.0 deg; acquisition time (TA) = 8:22 min) and a T2-weighted (T2w) acquisition (VS = 0.8 mm isotropic; TR = 3200 ms; TE = 564 ms; TA = 6:35 min). T1w and T2w imaging data were preprocessed using the HCP minimal processing pipeline [5]. After the automated Freesurfer (version 5.3) reconstruction implemented in the HCP pipeline, volume and thickness measures were extracted [4]. In addition, T1w/T2w ratio maps that indicate myelin contrast were also used as a measure for analysis [6]. We used 33 ROIs from the left hemisphere and all three measures were used for registration.
3.2. Data Registration Across ROI Nodes of Cortical Measures
Figure 2 shows results of data registration for a k = 33, m = 3 configuration. We display the results in two dimensions although all three measures were used for registration. After registration, the individual data samples within each ROI are highly compressed compared to the original data. The compression observed in Fig. 2 is quantified by comparing the variances of each cortical measure across ROIs before and after registration (bottom, Fig. 2). The variances of thickness and volume across all ROIs decreased significantly.
To assess whether the reduction in variance of ROIs translates to improvement in classifying ROI labels, we performed 10-fold cross-validation of k-nearest-neighbor classification of ROI labels before and after alignment. The 10-fold mean prediction accuracy was higher after registration (84%) than before (77%), with a p-value of 0.0036 (Fig. 3). The prediction accuracy was significantly higher after registration in the caudal middle frontal, fusiform, lateral orbitofrontal, lingual, pars opercularis, pars triangularis, rostral middle frontal, superior frontal, and superior temporal areas, and significantly lower in the caudal anterior cingulate (FDR q = 0.05).
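The cross-validated classification can be illustrated with a small NumPy-only sketch. Everything here is an assumption made for illustration: the number of neighbors, the random fold assignment, the function name `knn_cv_accuracy`, and the synthetic data, in which "loose" and "tight" clusters merely stand in for ROI measures with larger and smaller within-ROI variance. Labels are assumed to be non-negative integers.

```python
import numpy as np

def knn_cv_accuracy(features, labels, n_neighbors=5, n_folds=10, seed=0):
    """Mean accuracy of k-nearest-neighbor classification over random folds."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(labels)), n_folds)
    accuracies = []
    for test_idx in folds:
        train_mask = np.ones(len(labels), dtype=bool)
        train_mask[test_idx] = False
        X_train, y_train = features[train_mask], labels[train_mask]
        # Squared Euclidean distances, shape (n_test, n_train).
        diff = features[test_idx][:, None, :] - X_train[None, :, :]
        d2 = (diff ** 2).sum(axis=2)
        nearest = np.argsort(d2, axis=1)[:, :n_neighbors]
        votes = y_train[nearest]
        # Majority vote among the neighbors of each test sample.
        pred = np.array([np.bincount(v).argmax() for v in votes])
        accuracies.append(np.mean(pred == labels[test_idx]))
    return float(np.mean(accuracies))

# Synthetic stand-in: 20 "subjects", 5 "ROIs", 3 "measures" per ROI.
rng = np.random.default_rng(3)
n_subjects, n_rois = 20, 5
centers = np.arange(n_rois)[:, None] * np.ones((1, 3))  # one center per ROI
labels = np.tile(np.arange(n_rois), n_subjects)

def simulate(spread):
    """Samples scattered around the ROI centers with the given spread."""
    return centers[labels] + spread * rng.normal(size=(n_subjects * n_rois, 3))

acc_loose = knn_cv_accuracy(simulate(1.0), labels)  # high within-ROI variance
acc_tight = knn_cv_accuracy(simulate(0.1), labels)  # low within-ROI variance
```

The comparison only illustrates why tighter within-ROI clusters tend to be easier to label; it does not reproduce the reported accuracies.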
Fig. 3. Prediction accuracies of ROI labels.
3.3. Registration Across Cortical Measures with ROIs as Features and Canonical Multivariate Correlations with Demographic Measures
Figure 4 shows pairwise ROI correlations between cortical measures. The correlations significantly improved (FDR corrected q = 0.001) across almost all areas after registration, demonstrating that the transposed representation is effective for aligning measures (with ROIs as features) across subjects.
Fig. 4. ROI pairwise correlations before and after alignment. A white dot denotes significance (FDR corrected q = 0.001).
Finally, under this transposed configuration, we computed canonical correlations between all three measures together and the population age following registration (Fig. 5). The correlations significantly increased (FDR corrected q = 0.0001) following registration, demonstrating that registration of cortical measures across subjects may accentuate the relationship between the measures and a demographic variable such as age.
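Canonical correlations between a set of measures and age can be computed with a QR-plus-SVD approach, sketched below. The function name, the synthetic data, and the layout (subjects in rows, the three measures of one ROI in columns) are assumptions for illustration, and both centered inputs are assumed to have full column rank. With a single outcome such as age, the one canonical correlation equals the multiple correlation coefficient of a linear regression of age on the measures.

```python
import numpy as np

def canonical_correlations(X, Y):
    """Canonical correlations between the columns of X (n x p) and Y (n x q)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the two centered column spaces.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx^T Qy are the cosines of the principal angles.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic example: 93 "subjects", 3 "measures" for one ROI, and age.
rng = np.random.default_rng(2)
age = rng.uniform(20, 64, size=93)
measures = np.column_stack([
    2.5 - 0.005 * age + 0.1 * rng.normal(size=93),  # made-up age trend
    1.4 + 0.002 * age + 0.1 * rng.normal(size=93),  # made-up age trend
    rng.normal(size=93),                            # unrelated to age
])

rho = canonical_correlations(measures, age)  # array with a single value
```

Running the same function on measures before and after registration, one ROI at a time, would give the kind of paired comparison summarized in Fig. 5.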
Fig. 5. Canonical correlations of cortical measures with age before and after registration.
4. Discussion
We introduced an idea for shape matching of brain data configurations with applications to nodal network connectivity. The resulting data alignment has different applications depending upon the construction of the subject-wise data matrix. We observed that formatting the data as a tall matrix (ROIs × measures) causes a reduction in overall variance in the ROI measures as well as a decrease in the pairwise correlations between measures across ROIs (near diagonal covariance matrix). We conjecture that this property may be beneficial in partially reducing the influence of external confounding factors (not directly imaged in MRI). We propose to experimentally test this hypothesis in future work. On the other hand, formatting the data as a wide matrix (measures × ROIs) increases pairwise ROI correlations between measures as well as canonical correlations with demographic variables such as age. This characteristic is potentially useful in noisy observations where increased sensitivity in estimating the nodal correlations for network connectivity is desired. This may lead to increased detection power under stringent multiple testing thresholds. Finally, in this paper, we focused only on global scaling, translation, and exact rotations, although in the future, other transformations with more degrees of freedom can be considered. We also plan to investigate the potential caveats of alignment, namely whether it decreases useful information or increases noise.
Acknowledgments.
This research was supported by the NIH/NIAAA award K25AA024192.
References
- 1. Alexander-Bloch A, Giedd JN, Bullmore E: Imaging structural co-variance between human brain regions. Nat. Rev. Neurosci 14(5), 322 (2013)
- 2. Cattell RB: The data box. In: Nesselroade JR, Cattell RB (eds.) Handbook of Multivariate Experimental Psychology. Perspectives on Individual Differences, pp. 69–130. Springer, Boston (1988). 10.1007/978-1-4613-0893-5_3
- 3. Dryden IL, Mardia K: Statistical Shape Analysis. Wiley Series in Probability and Statistics: Probability and Statistics. Wiley, Hoboken (1998)
- 4. Fischl B, van der Kouwe A, Destrieux C, et al.: Automatically parcellating the human cerebral cortex. Cereb. Cortex 14(1), 11–22 (2004)
- 5. Glasser MF, Sotiropoulos SN, Wilson JA, Coalson TS, Fischl B, et al.: The minimal preprocessing pipelines for the human connectome project. Neuroimage 80, 105–124 (2013)
- 6. Glasser MF, Van Essen DC: Mapping human cortical areas in vivo based on myelin content as revealed by T1- and T2-weighted MRI. J. Neurosci 31(32), 11597–11616 (2011)
- 7. Goryn D, Hein S: On the estimation of rigid body rotation from noisy data. IEEE Trans. Pattern Anal. Mach. Intell 17(12), 1219–1220 (1995)
- 8. He Y, Chen ZJ, Evans AC: Small-world anatomical networks in the human brain revealed by cortical thickness from MRI. Cereb. Cortex 17(10), 2407–2419 (2007)
- 9. He Y, Evans A: Graph theoretical modeling of brain connectivity. Curr. Opin. Neurol 23(4), 341–350 (2010)
- 10. Kendall DG: Shape manifolds, procrustean metrics, and complex projective spaces. Bull. Lond. Math. Soc 16(2), 81–121 (1984)
- 11. Robinson EC, et al.: MSM: a new flexible framework for multimodal surface matching. Neuroimage 100, 414–426 (2014)
- 12. Shattuck DW, Leahy RM: BrainSuite: an automated cortical surface identification tool. Med. Image Anal 6(2), 129–142 (2002)
