Abstract
Functional connectivity, reflecting synchronized brain activity across distinct regions, is crucial for understanding cognitive processes. Despite recent interest in the relationship between functional connectivity and structural brain features, the precise link remains poorly understood. We propose a novel analysis method that integrates structural factors (anatomical morphology summaries, voxel intensity, diffusion-weighted information, and geographic distance) to explain variation in functional connectivity. Our method employs generalized additive models (GAMs), leveraging region-pair or vertex-pair information, while accommodating individual subject differences in both template and subject spaces. Furthermore, we assess repeatability via the so-called discriminability of subjects under our approach, which quantifies the probability that measurements for the same subject are more similar than measurements for different subjects. Utilizing data from the Human Connectome Project, we analyze brain connectivity in twin pairs and non-twin pairs to evaluate the repeatability of model-based connectivity patterns estimated via GAMs. Our findings suggest that direct structure/function regression models enhance our understanding of functional connectivity variation, providing insights into underlying mechanisms and the discriminability of brain connections.
Keywords: fMRI, Functional Connectivity, Generalized Additive Models, Discriminability, Structural Features, Explainability
1. Introduction
Functional connectivity of the brain, which refers to the synchronized activity between spatially distinct brain regions [5], is often measured using functional magnetic resonance imaging (fMRI) and is studied to understand the network dynamics underlying cognitive processes and neurological disorders. In recent years, there has been significant interest in the relationship between functional connectivity and structural brain features, in particular in how brain dynamics are regulated by structural organization and connectivity [2, 9, 7].
Structural brain features, such as cortical morphology and connectivity density, play pivotal roles in shaping functional connectivity patterns. Recent studies suggest that the degree of functional activation within a cortical area is intricately tied to the physical characteristics of the region, including cortical volume, thickness, surface area, and curvature [13]. Furthermore, the extent of locally exchanged cortical activity appears to be influenced by the density of local connections [3]. Moreover, diffusion imaging and anatomical MRI techniques have been instrumental in providing personalized predictions across a wide spectrum of neurological and psychiatric conditions [10]. Recent research has also explored novel approaches to understanding connectivity variation at the subject level. For instance, [12] characterized functional connectivity at the subject level using an edge distribution approach, employing the estimated density of connectivity between nodes of interest as a functional covariate, which enables non-geometrically localized connectivity investigations. Similarly, [11] developed regression models on functional connectivity matrices, incorporating structural and regional factors such as geographic distance, homotopic distance, network labels, and region indicators as covariates to explain variations in connections. While these methods explore the relationship between brain structure and function, the precise relationship between spatiotemporal patterns and local structural properties remains unexplored. Additionally, these studies are typically conducted in a template or parcellation space, which involves a complex registration process that does not adequately account for inter-subject variability. Further research is needed to identify the specific mechanisms linking local structural features to functional connectivity patterns and to develop methods that better accommodate individual differences in brain structure.
In this work, we introduce a novel and easily implemented analysis approach aimed at explaining the variation in functional connectivity by integrating important structural factors such as anatomical morphology summaries, voxel intensity, diffusion-weighted information, and geographic distance. Our methodology utilizes region-pair or vertex-pair information within additive models. We employ subject-level Fisher-transformed connectivity matrices as the outcome of interest and incorporate structural brain factors as covariates to explain the variations in connections. We show that our approach can be performed in template space as well as in subject (vertex) space, thereby accounting for inter-subject differences that are removed by template registration. Additionally, we investigate the discriminability (a measure of repeatability) of data under our proposed approach, which quantifies the probability that two measurements for the same subject will be more similar than measurements for two different subjects. We use this measure to analyze the repeatability of brain connectivity measurements in twin pairs and non-twin pairs from the Human Connectome Project (HCP). By comparing discriminability in twins, we present evidence that this analysis method captures underlying connectivity patterns that are repeatable while not removing registration-based inter-subject variation.
2. Methodology
We adopt a statistical approach as shown in Figure 1 using generalized additive models (GAMs) [8, 16] to capture the relationship between functional connectivity and various structural brain factors. GAMs are particularly advantageous for our analysis due to their non-parametric nature, allowing us to model complex, non-linear relationships between the dependent variable, which is the functional connectivity profile, and a set of independent structural covariates.
Fig. 1. Integration of multi-modal structural brain features with functional connectivity data for discriminability analysis. The multi-modal structural brain features include fractional anisotropy (FA), mean diffusivity (MD), cortical thickness (CT), gray matter volume (V), surface area (SA), and curvature (Curv), combined with region-of-interest (RoI) coordinates (x, y, z) and voxel intensity measures. The outcome of this analysis aims to distinguish between different familial relationships (monozygotic twins, MZ; dizygotic twins, DZ; and siblings, SIB).
2.1. Connectivity Additive Model
Each subject’s functional connectivity profile is described by a matrix of Fisher’s Z-transformed empirical Pearson correlations, $Z^{(s)}$, where each element $Z^{(s)}_{ij}$ denotes the correlation between a pair of parcellated brain regions or voxels $i$ and $j$. To account for the non-localized nature of brain region pairs, we vectorize the upper triangular portion of $Z^{(s)}$. This vector serves as the dependent variable in our additive models.
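As a minimal illustration of this step, the sketch below applies Fisher's Z-transform to the off-diagonal entries of a small correlation matrix and stacks the upper triangle into a vector. The function names and the toy 3-region matrix are hypothetical and serve only to show the shape of the outcome variable; they are not taken from the study's code or data.

```python
import math

def fisher_z(r):
    """Fisher's Z-transform of a Pearson correlation r in (-1, 1)."""
    return math.atanh(r)

def vectorize_upper(corr):
    """Return Fisher-transformed entries above the diagonal, row by row."""
    p = len(corr)
    return [fisher_z(corr[i][j]) for i in range(p) for j in range(i + 1, p)]

# Toy 3-region correlation matrix (illustrative values, not study data).
corr = [
    [1.0, 0.5, -0.2],
    [0.5, 1.0, 0.3],
    [-0.2, 0.3, 1.0],
]
z_vec = vectorize_upper(corr)  # one entry per region pair (i < j)
```

The diagonal is skipped because a correlation of exactly 1 has no finite Fisher transform, and the lower triangle is redundant by symmetry.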
The independent variables in our regression models include a variety of structural factors. Anatomical morphology summaries capture the shape and size differences between brain regions, voxel intensity metrics account for the signal characteristics within each region, and diffusion-weighted metrics assess the integrity of the fiber tracts connecting the regions. Geographic distance, or the spatial proximity between each pair of regions, is also included as a covariate. Our model is formalized as follows: for each subject $s$, the connectivity between regions $i$ and $j$ is modeled as:
$$Z^{(s)}_{ij} = \beta_0 + f_1\big(\mathrm{DWI}^{(s)}_{ij}\big) + f_2\big(\mathrm{Morph}^{(s)}_{ij}\big) + f_3\big(\mathrm{Dist}^{(s)}_{ij}\big) + f_4\big(\mathrm{Int}^{(s)}_{ij}\big) + \epsilon^{(s)}_{ij}$$
Here, $\beta_0$ represents the intercept, $f_1, \dots, f_4$ are smooth functions of the respective structural covariates (diffusion-weighted similarity, morphological similarity, geographic distance, and voxel intensity dissimilarity), and $\epsilon^{(s)}_{ij}$ is an error term. The smooth functions, $f_k$, are determined from the data, allowing for the flexibility to capture the unique shape of the relationship between each covariate and the connectivity measure. This is in contrast to traditional linear models that impose a fixed form (linear or polynomial) on the relationship. We fit the smooth terms using penalized regression B-splines and an additive second-derivative smoothing penalty for regularization.
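For readers less familiar with penalized additive models, one standard way to write the fitting criterion for a single subject is sketched below. The notation is assumed for illustration: $x_{k,ij}$ denotes the $k$-th structural covariate for the pair $(i, j)$, and $\lambda_k \ge 0$ is the smoothing parameter of the $k$-th term. This is the generic penalized least-squares form for a Gaussian additive model, not necessarily the exact objective optimized in our software.

```latex
\min_{\beta_0,\, f_1, \dots, f_4} \;
\sum_{i<j} \Big( Z_{ij} - \beta_0 - \sum_{k=1}^{4} f_k\big(x_{k,ij}\big) \Big)^{2}
\; + \; \sum_{k=1}^{4} \lambda_k \int \big( f_k''(x) \big)^{2} \, dx
```

Larger values of $\lambda_k$ penalize curvature more heavily and pull $f_k$ toward a straight line, whereas $\lambda_k \to 0$ leaves the spline fit unpenalized.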
The diffusion-weighted imaging component, $f_1$, includes mean diffusivity (MD) and fractional anisotropy (FA). MD offers insights into the average rate of water molecule diffusion within brain tissue, indicative of tissue density and cellular integrity. FA measures the directional coherence of water diffusion, reflecting the degree of anisotropy within the white matter tracts, which is crucial for understanding the integrity of neural pathways. For any two regions, we model this term using a cosine similarity metric defined as follows:
$$\cos(\mathbf{u}_i, \mathbf{u}_j) = \frac{\mathbf{u}_i^{\top} \mathbf{u}_j}{\lVert \mathbf{u}_i \rVert \, \lVert \mathbf{u}_j \rVert},$$
where $\mathbf{u}_i$ and $\mathbf{u}_j$ are vectors consisting of diffusion-weighted imaging descriptors for regions $i$ and $j$, respectively.
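The cosine similarity between two regional descriptor vectors can be computed as in the following sketch. The function name and the two-element (FA, MD) descriptors are hypothetical; the values are chosen only so that the result is easy to check by hand.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Hypothetical (FA, MD) descriptors for two regions; illustrative values only.
region_i = [0.45, 0.80]
region_j = [0.90, 1.60]  # same direction as region_i, twice the magnitude
sim = cosine_similarity(region_i, region_j)
```

Because cosine similarity depends only on direction, the two toy regions above are maximally similar even though their magnitudes differ; whether descriptors should be rescaled beforehand is a modeling choice that this sketch does not address.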
For the morphology term, $f_2$, we incorporate four structural features: cortical thickness, gray matter volume, surface area, and curvature. Cortical thickness reflects the depth of the cerebral cortex and is associated with cognitive functions. Gray matter volume represents the size of the cortical and subcortical regions and is directly linked to brain maturity and development. Surface area contributes to the overall cortical surface, indicating regional developmental patterns, while curvature reflects the folding patterns of the cortex, which may influence neural processing efficiency. As with the diffusion-weighted term, we model this term using a cosine similarity metric between the morphology descriptor vectors of the two regions.
The geographical distance term, $f_3$, quantifies the Euclidean distance between brain regions. This measure is fundamental, as spatial correlations could arise for biological reasons such as functional specialization. The distance between regions $i$ and $j$ is computed as the Euclidean distance between their centroid coordinates $(x_i, y_i, z_i)$ and $(x_j, y_j, z_j)$:
$$\mathrm{Dist}_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2}$$
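A short sketch of the centroid-distance computation follows. It assumes each region is given as a list of 3D coordinates; the helper names and the toy coordinates are hypothetical.

```python
import math

def centroid(coords):
    """Mean (x, y, z) position of a list of 3D coordinates."""
    n = len(coords)
    return tuple(sum(c[axis] for c in coords) / n for axis in range(3))

def centroid_distance(coords_i, coords_j):
    """Euclidean distance between the centroids of two regions."""
    return math.dist(centroid(coords_i), centroid(coords_j))

# Toy regions (illustrative coordinates, in arbitrary units).
roi_i = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]  # centroid (1, 0, 0)
roi_j = [(1.0, 3.0, 4.0), (1.0, 3.0, 4.0)]  # centroid (1, 3, 4)
dist_ij = centroid_distance(roi_i, roi_j)
```

In a vertex-level analysis each "region" is a single vertex, so the centroid is simply the vertex coordinate.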
Lastly, the voxel intensity term, $f_4$, is modelled through the subject’s corresponding T1-weighted image voxel values. This term helps to account for the variation in signal characteristics that might influence functional connectivity measurements. For each region, a histogram of voxel intensities is computed. The dissimilarity between the histograms for regions $i$ and $j$ is computed using the 2-Wasserstein distance:
$$W_2(F_i, F_j) = \left( \int_0^1 \big| F_i^{-1}(q) - F_j^{-1}(q) \big|^2 \, dq \right)^{1/2}.$$
Here, $F_i$ and $F_j$ are the cumulative distribution functions (CDFs) corresponding to the histograms of voxel intensities for regions $i$ and $j$, respectively, and $F^{-1}$ denotes the associated quantile function.
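For one-dimensional distributions, the 2-Wasserstein distance can be approximated directly from empirical quantile functions. The sketch below is one simple way to do this, assuming raw voxel intensity samples per region and a midpoint grid on $(0, 1)$; the function names, grid size, and toy intensities are illustrative assumptions, not the implementation used for the reported results.

```python
import math

def empirical_quantile(sorted_values, q):
    """Left-continuous inverse of the empirical CDF at level q in (0, 1)."""
    n = len(sorted_values)
    index = min(n - 1, max(0, math.ceil(q * n) - 1))
    return sorted_values[index]

def wasserstein2(sample_a, sample_b, n_grid=1000):
    """Approximate 2-Wasserstein distance between two 1D samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    total = 0.0
    for k in range(n_grid):
        q = (k + 0.5) / n_grid  # midpoint rule on (0, 1)
        diff = empirical_quantile(a, q) - empirical_quantile(b, q)
        total += diff * diff
    return math.sqrt(total / n_grid)

# Toy intensities: region j is region i shifted by 5 units.
intens_i = [100.0, 110.0, 120.0, 130.0]
intens_j = [v + 5.0 for v in intens_i]
w2 = wasserstein2(intens_i, intens_j)
```

A pure location shift moves every quantile by the same amount, so the distance in this toy case equals the size of the shift.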
2.2. Data Discriminability
To assess the discriminability between monozygotic (MZ), dizygotic (DZ), non-twin sibling (SIB) and unrelated (NR) pairs with respect to functional brain connectivity, we adopt a statistical framework that quantifies the degree to which connectivity patterns are more similar within pairs than across different pairs. This approach extends the concept of discriminability introduced by [1, 15] to the domain of data repeatability. The discriminability statistic, $\delta$, is defined as the probability that a randomly chosen within-pair distance is less than a between-pair distance, under a given distance metric $d$. Higher values of $\delta$ indicate greater repeatability. The within-pair and between-pair distances are calculated using the chosen metric between the measurement profiles of the subjects in the pairs. Mathematically, the unbiased estimator of discriminability for each group can be expressed as:
$$\hat{\delta} = \frac{1}{n \, s \, (s-1) \, (n-1) \, s} \sum_{i=1}^{n} \sum_{t=1}^{s} \sum_{t' \neq t} \sum_{i' \neq i} \sum_{t''=1}^{s} \mathbb{I}\big\{ d(x_{it}, x_{it'}) < d(x_{it}, x_{i't''}) \big\},$$
where $n$ is the number of pairs in the MZ, DZ, SIB or NR group, $s$ is the number of twins (or siblings) per pair ($s = 2$), $x_{it}$ represents the vector of measurement values for member $t$ of pair $i$, and $\mathbb{I}\{\cdot\}$ is the indicator function, which evaluates to 1 when the condition within is true and 0 otherwise. We chose 60 pairs from each of the MZ, DZ, SIB, and NR groups within the HCP dataset for the discriminability analysis because they included all the imaging modalities, and we employed Euclidean distance as the metric for measuring distances.
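The estimator can be sketched as a direct count over all within-pair versus between-pair comparisons. The code below follows the general form of the discriminability estimator in [1, 15], with Euclidean distance as the default metric; the function name, argument layout, and toy data are hypothetical.

```python
import math

def discriminability(groups, dist=math.dist):
    """Fraction of comparisons in which a within-group distance is smaller
    than a between-group distance.

    groups: list of n groups, each a list of s >= 2 measurement vectors.
    """
    hits = 0
    comparisons = 0
    for i, group in enumerate(groups):
        for t, x in enumerate(group):
            for t2, same in enumerate(group):
                if t2 == t:
                    continue
                within = dist(x, same)
                for i2, other_group in enumerate(groups):
                    if i2 == i:
                        continue
                    for other in other_group:
                        hits += within < dist(x, other)
                        comparisons += 1
    return hits / comparisons

# Toy example: three well-separated pairs (illustrative values only).
pairs = [
    [(0.0, 0.0), (0.1, 0.0)],
    [(5.0, 5.0), (5.0, 5.1)],
    [(10.0, 0.0), (10.2, 0.0)],
]
delta_hat = discriminability(pairs)
```

When every pair is tightly clustered and far from the others, as in this toy example, every comparison favors the within-pair distance and the estimate reaches its maximum of 1. The quadratic number of comparisons is unproblematic for a few dozen pairs per group.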
3. Experiments
3.1. Human Connectome Project Data
The dataset for our study comprises resting-state and task fMRI data obtained from the Human Connectome Project (HCP) [14], specifically from the HCP 900 subject release, for which imaging was conducted using a Siemens Skyra 3T scanner at Washington University in St. Louis. In this study, we exclusively utilized data from the left-to-right phase encoding sessions to ensure consistency and minimize potential directional biases in the analysis. For parcellation-based analysis, we employed the Destrieux atlas [4], which consists of 148 brain ROIs. The preprocessing protocol was adapted from the HCP “fMRIVolume” pipeline [6], which incorporates a series of steps such as gradient unwarping, motion correction, distortion correction with FSL’s topup tool, registration to the structural T1-weighted scan, non-linear registration into MNI152 space, grand-mean intensity normalization, and spatial smoothing using a Gaussian kernel with a full-width half-maximum of 4 mm. Crucially, while coregistration to structural T1-weighted scans was carried out, we intentionally omitted non-linear registration to standard space (MNI atlas) for the subject space analysis to preserve the unique anatomical features of each subject. Subsequently, we extracted time series data for each of the 148 parcels defined by the Destrieux atlas. For subject space (vertex level) analysis, we performed spatially stratified sampling defined by the atlas in subject space, sampling 5000 vertices in total. Fisher’s Z-transformation was applied to the correlation coefficients to normalize the distribution of connectivity measures, resulting in $\binom{148}{2} = 10{,}878$ correlations for the atlas space analysis and approximately 12 million pairs for the vertex-level analysis. This transformation attempts to ensure that the resulting scores approximate a normal distribution.
Approximate normality of the Z-scores is useful for both statistical inference and the general performance of the GAM, which in our case uses a squared error loss function; the Gaussian assumption therefore yields a more accurate error model for our data. It should be noted, however, that the GAM approach could also be applied to thresholded (binary) connectivity data by using a Bernoulli log-likelihood and a corresponding GAM link function.
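The number of unique region or vertex pairs quoted in this section follows from counting unordered pairs, as the short check below illustrates (variable names are illustrative):

```python
import math

n_regions = 148    # Destrieux atlas parcels
n_vertices = 5000  # sampled vertices in subject space

atlas_pairs = math.comb(n_regions, 2)    # unique region pairs
vertex_pairs = math.comb(n_vertices, 2)  # unique vertex pairs
```

These counts are the lengths of the vectorized upper-triangular connectivity profiles per subject in the two analyses.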
3.2. Partial Dependence Analysis
To understand the influence of each structural feature independently, we perform partial dependence analysis. This technique allows us to visualize the effect of a single structural covariate on the predicted outcome of functional connectivity while averaging out the effects of other covariates. For a given covariate $x_k$, the partial dependence function is defined as:
$$\mathrm{PD}_k(x_k) = \mathbb{E}_{x_{-k}}\big[ \hat{f}(x_k, x_{-k}) \big],$$
where $x_{-k}$ represents all features except $x_k$, and $\hat{f}$ is the predictive model.
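In practice the expectation is usually approximated by averaging model predictions over the observed values of the remaining covariates. The sketch below shows this empirical version for an arbitrary prediction function; `toy_model` is a hypothetical stand-in for a fitted GAM, and all names and values are illustrative.

```python
def partial_dependence(model, rows, feature, grid):
    """Average model prediction with `feature` fixed at each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature] = value  # overwrite only the covariate of interest
            total += model(modified)
        curve.append(total / len(rows))
    return curve

# Hypothetical fitted model: additive in two covariates (stand-in for a GAM).
def toy_model(x):
    return 0.5 - 0.2 * x[0] + x[1] ** 2

rows = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]
pd_curve = partial_dependence(toy_model, rows, feature=0, grid=[0.0, 1.0])
```

For an additive model such as a GAM, the resulting curve for covariate $k$ is the corresponding smooth term shifted by a constant, which is why partial dependence plots can be read as plots of the fitted smooths.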
3.3. Implementation Details
For our analysis, we employed the pyGAM framework. Given that the Fisher-transformed data are approximately normally distributed, and the normal distribution is part of the exponential family, we selected the identity link function for this model. To capture the nonlinear patterns in the data, we used third-order spline terms with 20 splines each. Further, to prevent overfitting and ensure smoothness, a second-derivative smoothing penalty was applied to the spline terms in the model. The smoothing parameter $\lambda$ was determined through a grid search and chosen to balance the trade-off between fit and smoothness.
4. Results and Discussion
We used explained deviance as our measure of fit, with our model explaining 15.7% of the variability in the response variable. Figure 2 shows the partial dependence plots of different structural features against functional connectivity values for resting state in template and vertex space. The diffusion-weighted metrics show a consistent pattern across both spaces, with functional connectivity values remaining relatively stable regardless of the variation in cosine similarity. This suggests that raw diffusion properties, captured by metrics like MD and FA, may not vary significantly with functional connectivity, or may be influenced by other unmeasured factors. Of note, these are not measures of diffusion-based connectivity between the pairs; instead, they measure only the similarity of the intra-regional diffusion properties.
Fig. 2. Partial dependence plots of different structural features against functional connectivity values for resting state in template space (top) and vertex (subject) space (bottom). Each figure contains four panels that correspond to diffusion-weighted metrics, anatomical morphology summaries, region of interest (RoI) distance, and voxel intensity.
The non-diffusion structural summaries reveal a more pronounced effect on functional connectivity. During the resting state, there is a noticeable trend where higher morphological alignment (higher cosine similarity) correlates with increased functional connectivity. Interestingly, this trend appears to be consistent across all tasks (provided in the supplementary material), indicating a robust morphological influence on connectivity. The ROI distance panels exhibit an expected inverse relationship; as the Euclidean distance between ROIs increases, functional connectivity decreases. This inverse relationship is consistent across both spaces, where it aligns with the intuitive notion that closer brain regions are more likely to exhibit stronger connectivity. Voxel intensity, represented by the Wasserstein distance of T1-weighted image voxel values, shows a distinct negative correlation with functional connectivity in all cognitive states. The strength of this correlation seems to be most pronounced in the resting state, with a steady decline in functional connectivity as voxel intensity dissimilarity increases. The consistency of these patterns across different cognitive states suggests that these structural features have a stable relationship with functional connectivity, independent of the cognitive demands placed on the individual.
Table 1 presents the discriminability ($\delta$) analysis outcomes for MZ, DZ, SIB, and NR individuals across various cognitive states in both template and vertex spaces using partial dependence functions from morphology, diffusion, and GAM predictions. We used the spline coefficients of each feature from the partial dependence analysis to compute the distance ($d$) in the discriminability analysis, so that the scores shown in Table 1 combine both analyses. This approach ensures that the influence of each structural feature on functional connectivity is reflected in the discriminability scores, providing an integrated evaluation of our framework. In template space, the $\delta$ values for MZ, DZ, and SIB are slightly higher than for NR, suggesting genetic influences, but the differences are minimal, with values around the 0.50 mark indicating limited discriminability. Vertex-level analysis, however, consistently shows higher discriminability and clearer distinctions between twin and non-twin pairs across all cognitive states, especially with diffusion-weighted information. We provide bootstrapped standard errors (SE) for $\delta$ to account for uncertainties due to structural measurement errors and environmental factors. For instance, at the vertex level (during rest), the SEs for the morphology function for MZ/DZ/SIB/NR are 0.029/0.036/0.034/0.047, and those for the GAM predictions are 0.033/0.036/0.044/0.049, respectively. Despite these uncertainties, the ranking (MZ > DZ > SIB > NR) remains consistent for diffusion-weighted factors at the vertex level, highlighting their utility as biomarkers for understanding familial relationships. Vertex-level analysis thus appears to be a more consistent method for assessing familial relationships between twin and non-twin pairs, highlighting its advantage over template space analysis.
Table 1. Discriminability analysis comparing MZ, DZ, SIB, and NR individuals across various cognitive states in template and vertex spaces. Vertex space results are visualized through a gradient of grey color from darker to lighter, moving from MZ to NR, where a darker color denotes higher degree of repeatability.
| Level | Cognitive state | f2 (Morphology), MZ | f2 (Morphology), DZ | f2 (Morphology), SIB | f2 (Morphology), NR | f1 (Diffusion-weighted), MZ | f1 (Diffusion-weighted), DZ | f1 (Diffusion-weighted), SIB | f1 (Diffusion-weighted), NR | GAM predictions, MZ | GAM predictions, DZ | GAM predictions, SIB | GAM predictions, NR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TEMPLATE | Rest | 0.58 | 0.58 | 0.51 | 0.47 | 0.71 | 0.59 | 0.53 | 0.50 | 0.58 | 0.55 | 0.49 | 0.49 |
| TEMPLATE | WM | 0.52 | 0.56 | 0.50 | 0.51 | 0.64 | 0.62 | 0.53 | 0.45 | 0.55 | 0.50 | 0.48 | 0.50 |
| TEMPLATE | Relational | 0.51 | 0.59 | 0.45 | 0.46 | 0.61 | 0.53 | 0.45 | 0.38 | 0.51 | 0.51 | 0.47 | 0.42 |
| TEMPLATE | Language | 0.50 | 0.52 | 0.50 | 0.48 | 0.60 | 0.63 | 0.47 | 0.44 | 0.60 | 0.50 | 0.46 | 0.47 |
| TEMPLATE | Emotion | 0.48 | 0.57 | 0.48 | 0.46 | 0.61 | 0.56 | 0.49 | 0.44 | 0.55 | 0.52 | 0.51 | 0.47 |
| TEMPLATE | Gambling | 0.50 | 0.45 | 0.49 | 0.47 | 0.68 | 0.53 | 0.55 | 0.44 | 0.55 | 0.48 | 0.49 | 0.45 |
| TEMPLATE | Motor | 0.53 | 0.49 | 0.51 | 0.51 | 0.66 | 0.55 | 0.49 | 0.45 | 0.56 | 0.53 | 0.51 | 0.49 |
| TEMPLATE | Social | 0.53 | 0.60 | 0.49 | 0.50 | 0.64 | 0.51 | 0.50 | 0.46 | 0.56 | 0.48 | 0.54 | 0.44 |
| VERTEX | Rest | 0.50 | 0.51 | 0.32 | 0.29 | 0.64 | 0.59 | 0.36 | 0.30 | 0.53 | 0.54 | 0.33 | 0.29 |
| VERTEX | WM | 0.49 | 0.56 | 0.34 | 0.37 | 0.66 | 0.61 | 0.34 | 0.32 | 0.54 | 0.50 | 0.34 | 0.34 |
| VERTEX | Relational | 0.49 | 0.52 | 0.35 | 0.28 | 0.63 | 0.52 | 0.35 | 0.30 | 0.50 | 0.51 | 0.35 | 0.30 |
| VERTEX | Language | 0.52 | 0.50 | 0.33 | 0.32 | 0.62 | 0.57 | 0.38 | 0.33 | 0.59 | 0.50 | 0.33 | 0.35 |
| VERTEX | Emotion | 0.44 | 0.49 | 0.34 | 0.32 | 0.60 | 0.52 | 0.35 | 0.30 | 0.54 | 0.52 | 0.36 | 0.32 |
| VERTEX | Gambling | 0.51 | 0.52 | 0.35 | 0.33 | 0.64 | 0.51 | 0.38 | 0.33 | 0.55 | 0.47 | 0.34 | 0.31 |
| VERTEX | Motor | 0.49 | 0.50 | 0.34 | 0.36 | 0.65 | 0.51 | 0.37 | 0.35 | 0.54 | 0.53 | 0.36 | 0.36 |
| VERTEX | Social | 0.49 | 0.58 | 0.37 | 0.32 | 0.64 | 0.52 | 0.38 | 0.35 | 0.55 | 0.48 | 0.38 | 0.29 |
5. Conclusion
We have developed a novel analytical framework that integrates an array of structural determinants, encompassing anatomical morphology summaries, voxel intensity metrics, diffusion-weighted measures, and geospatial distances, offering a multi-modal analysis of the impact of these structural elements on the dynamics of functional brain networks. Our approach utilizes subject-specific connectivity matrices and is adaptable to both template and individual-specific spatial domains, introducing a way of accommodating inter-individual variability, a crucial aspect frequently neglected in traditional analyses. The application of our framework to HCP data, particularly for the evaluation of connectivity heritability in twin cohorts, highlights its ability to capture the heritable components of brain connectivity patterns. The results of the discriminability assessments further support the robustness and reproducibility of our technique, highlighting its efficacy in discriminating biological variation within brain connectivity configurations.
Supplementary Material
Acknowledgements
This work was supported by the National Institutes of Health grant R01 EB029977 (PI Caffo) from the National Institute of Biomedical Imaging and Bioengineering and the National Institutes of Health grant R01 HD108790 (PI Venkataraman) from the National Institute of Child Health and Human Development.
Footnotes
Disclosure of Interests
The authors declare that they have no competing interests in the paper.
References
- 1. Bridgeford EW, Wang S, Wang Z, Xu T, Craddock C, Dey J, Kiar G, Gray-Roncal W, Colantuoni C, Douville C, et al.: Eliminating accidental deviations to minimize generalization error and maximize replicability: Applications in connectomics and genomics. PLoS computational biology 17(9), e1009279 (2021)
- 2. Bullmore E, Sporns O: Complex brain networks: graph theoretical analysis of structural and functional systems. Nature reviews neuroscience 10(3), 186–198 (2009)
- 3. Bullmore E, Sporns O: The economy of brain network organization. Nature reviews neuroscience 13(5), 336–349 (2012)
- 4. Destrieux C, Fischl B, Dale A, Halgren E: Automatic parcellation of human cortical gyri and sulci using standard anatomical nomenclature. Neuroimage 53(1), 1–15 (2010)
- 5. Friston KJ: Functional and effective connectivity: a review. Brain connectivity 1(1), 13–36 (2011)
- 6. Glasser MF, Sotiropoulos SN, Wilson JA, Coalson TS, Fischl B, Andersson JL, Xu J, Jbabdi S, Webster M, Polimeni JR, et al.: The minimal preprocessing pipelines for the human connectome project. Neuroimage 80, 105–124 (2013)
- 7. Guye M, Bettus G, Bartolomei F, Cozzone PJ: Graph theoretical analysis of structural and functional connectivity MRI in normal and pathological brain networks. Magnetic Resonance Materials in Physics, Biology and Medicine 23, 409–421 (2010)
- 8. Hastie TJ: Generalized additive models. In: Statistical models in S, pp. 249–307. Routledge (2017)
- 9. Honey CJ, Sporns O, Cammoun L, Gigandet X, Thiran JP, Meuli R, Hagmann P: Predicting human resting-state functional connectivity from structural connectivity. Proceedings of the National Academy of Sciences 106(6), 2035–2040 (2009)
- 10. Sabuncu MR, Konukoglu E, Initiative ADN: Clinical prediction from structural brain MRI scans: a large-scale empirical study. Neuroinformatics 13, 31–46 (2015)
- 11. Smith B, Zhao Y, Lindquist M, Caffo B: Regression models for partially localized fMRI connectivity analyses. Frontiers in Neuroimaging 2, 1178359 (2023). 10.3389/fnimg.2023.1178359
- 12. Tang B, Zhao Y, Venkataraman A, Tsapkini K, Lindquist MA, Pekar J, Caffo B: Differences in functional connectivity distribution after transcranial direct-current stimulation: A connectivity density point of view. Human Brain Mapping 44(1), 170–185 (2023)
- 13. Tillisch K, Mayer EA, Gupta A, Gill Z, Brazeilles R, Le Nevé B, van Hylckama Vlieg JE, Guyonnet D, Derrien M, Labus JS: Brain structure and response to emotional stimuli as related to gut microbial profiles in healthy women. Psychosomatic medicine 79(8), 905–913 (2017)
- 14. Van Essen DC, Smith SM, Barch DM, Behrens TE, Yacoub E, Ugurbil K, Consortium WMH, et al.: The WU-Minn human connectome project: an overview. Neuroimage 80, 62–79 (2013)
- 15. Wang Z, Bridgeford E, Wang S, Vogelstein JT, Caffo B: Statistical analysis of data repeatability measures. arXiv preprint arXiv:2005.11911 (2020)
- 16. Wood SN: Fast stable restricted maximum likelihood and marginal likelihood estimation of semiparametric generalized linear models. Journal of the Royal Statistical Society Series B: Statistical Methodology 73(1), 3–36 (2011)