Abstract
Multimodal neuroimages, such as diffusion tensor imaging (DTI) and resting-state functional MRI (fMRI), offer complementary perspectives on brain activity by capturing structural or functional interactions among brain regions. While existing studies suggest that fusing these multimodal data helps detect abnormal brain activity caused by neurocognitive decline, they are generally implemented in Euclidean space and cannot effectively capture the intrinsic hierarchical organization of structural/functional brain networks. This paper presents a hyperbolic kernel graph fusion (HKGF) framework for neurocognitive decline analysis with multimodal neuroimages. It consists of a multimodal graph construction module, a graph representation learning module that encodes brain graphs in hyperbolic space through a family of hyperbolic kernel graph neural networks (HKGNNs), a cross-modality coupling module that enables effective multimodal data fusion, and a hyperbolic neural network for downstream predictions. Notably, HKGNNs represent graphs in hyperbolic space to capture both local and global dependencies among brain regions while preserving the hierarchical structure of brain networks. Extensive experiments involving over 4,000 subjects with DTI and/or fMRI data demonstrate the superiority of HKGF over state-of-the-art methods in two neurocognitive decline prediction tasks. The proposed HKGF is a general framework for multimodal data analysis, facilitating objective quantification of brain structural or functional connectivity changes associated with neurocognitive decline.
Index Terms: Hyperbolic Kernel Graph Neural Network, Brain Connectivity, Multimodal Neuroimage, Neurocognitive Decline
1. Introduction
Multimodal neuroimaging data, such as diffusion tensor imaging (DTI), resting-state functional MRI (fMRI), and arterial spin labeling (ASL), provide complementary views of the brain [1]. While DTI characterizes the structural connectivity (SC) between brain regions-of-interest (ROIs) [2], fMRI and ASL capture the functional connectivity (FC) among ROIs by measuring correlated blood-oxygen-level-dependent (BOLD) signal fluctuations and regional cerebral blood flow, respectively [3–7]. Recent studies have shown that integrating these multimodal neuroimaging data improves the detection of abnormal brain connectivity patterns associated with neurocognitive decline [8], where subtle changes in structural, functional, or perfusion brain networks contribute to cognitive decline [9].
Extensive evidence has shown that the human brain is hierarchically organized in both structure and function, enabling efficient and flexible information processing at multiple levels [10–16]. For instance, Yeo et al. [10] demonstrated that the cerebral cortex can be divided into large-scale networks (e.g., default mode or frontoparietal) composed of anatomically distinct but functionally coherent regions. In addition, Margulies et al. [11] revealed a major gradient of cortical organization, extending from unimodal to cross-modal regions, reflecting a functional hierarchy. On the structural side, previous studies [13] have shown that the anatomical connectivity of the brain exhibits a modular and hierarchical topology, with smaller subnetworks embedded in larger integrated systems. Unfortunately, most current multimodal data fusion methods are formulated in Euclidean space, which makes them inadequate for capturing the inherently non-Euclidean hierarchical structure of structural or functional brain networks. As a result, their capacity to model complex interactions across different modalities is significantly limited. Hyperbolic space provides a promising solution to address this limitation because its negative curvature enables the volume to expand exponentially, which is suitable for modeling the hierarchical structure of brain networks [17].
Several previous studies have explored hyperbolic learning in medical data analysis. Zhang et al. [18] developed a hyperbolic space sparse coding method for predicting Alzheimer’s disease progression by mapping ventricular morphometry features into hyperbolic space. Yu et al. [19] introduced a hyperbolic prototype network to leverage class hierarchy for skin lesion recognition, achieving improved classification performance. Despite these advances, existing hyperbolic methods typically focus on single-modality data representation learning. Recently, Zhang et al. [20] proposed a multimodal fusion method utilizing the hyperbolic graph convolutional neural network (HGCN) [17] to integrate fMRI and DTI data, improving the identification of abnormal structural and functional disruptions associated with Alzheimer’s disease. However, due to the complex Riemannian operations in hyperbolic space, such as Möbius addition and multiplication, these methods often have high computational complexity. Furthermore, existing methods often fail to explicitly capture local-to-global dependencies among structurally and functionally connected brain ROIs, which is crucial for the analysis of neurocognitive decline.
To this end, we propose a novel Hyperbolic Kernel Graph Fusion (HKGF) framework tailored for neurocognitive decline analysis using multimodal neuroimaging data. As illustrated in Fig. 1, the proposed HKGF is composed of four key components: (1) a multimodal graph construction module, (2) a graph representation learning module that encodes structural and functional brain graphs in hyperbolic space through a general and flexible family of novel hyperbolic kernel graph neural networks (HKGNNs), (3) a cross-modality coupling module that integrates multimodal brain graphs by explicitly capturing ROI-level dependencies within and across imaging modalities, and (4) a new hyperbolic neural network (HNN) that facilitates downstream predictions. In particular, HKGNNs extend conventional GNNs by incorporating curvature-aware kernel functions in hyperbolic space, enabling the modeling of both local and global dependencies among brain regions while preserving the hierarchical structure of brain networks. Compared to conventional hyperbolic GNNs, our model avoids complex Riemannian operations by adopting an efficient kernel-based formulation, resulting in low computational cost. To the best of our knowledge, this is one of the first attempts to design and integrate hyperbolic kernels with graph neural networks for analyzing image-based neurocognitive decline. We also introduce a transfer learning strategy to reduce potential data scarcity by pretraining models on over 3,800 auxiliary fMRI scans. Extensive experiments on two target cohorts with 231 subjects show that HKGF outperforms state-of-the-art methods in multiple tasks. The source code and pretrained models can be accessed online.
Fig. 1. Illustration of the proposed hyperbolic kernel graph fusion (HKGF) framework for neurocognitive decline analysis with multimodal data. Using DTI and fMRI input data as an example, this framework comprises four major components: (1) multimodal graph construction, (2) graph representation learning through a family of hyperbolic kernel graph neural networks (HKGNNs), (3) cross-modality coupling for feature fusion by capturing local-to-global connectivity interactions among brain regions, and (4) a prediction module using a new hyperbolic neural network (HNN).
The major contributions of this work are listed below.
We propose a general Hyperbolic Kernel Graph Fusion (HKGF) framework for automated neurocognitive decline analysis by modeling the intrinsic hierarchical structure of brain networks and integrating multimodal neuroimaging data (e.g., DTI, fMRI, and ASL) into a unified end-to-end learning framework.
We design a family of hyperbolic kernel graph neural networks to represent structural/functional brain connectivity networks. Compared to conventional GNNs, our HKGNNs improve representation capacity by leveraging curvature-aware kernel functions to model complex hierarchical structures in brain networks. Compared to existing hyperbolic GNNs, HKGNNs significantly enhance computational efficiency by avoiding costly Riemannian operations.
We introduce a cross-modality coupling module that explicitly captures interactions between heterogeneous neuroimaging modalities to facilitate effective multimodal fusion in hyperbolic space.
We conduct extensive experiments on multiple cohorts with multimodal data, demonstrating the superiority of HKGF over state-of-the-art Euclidean and non-Euclidean methods in multiple tasks.
The remainder of this paper is structured as follows. Section 2 reviews the most relevant work. Section 3 describes the proposed method. Section 4 outlines the experimental setup and presents the results. Section 5 analyzes the impact of several key components. Section 6 concludes this paper.
2. Related Work
2.1. Image-based Neurocognitive Decline Analysis
Neurocognitive impairment involves a range of cognitive deficits and is linked to abnormalities in brain structural and functional connectivity, as identified through imaging techniques like DTI and fMRI [21]. In particular, DTI quantifies white matter integrity and structural connectivity by modeling water diffusion along axonal tracts [2], while resting-state fMRI captures functional connectivity among ROIs by measuring correlated BOLD signal fluctuations. Oishi et al. [21] applied DTI analysis to neurocognitive impairment and identified white matter degeneration in specific regions, particularly in limbic and association fibers, as potential biomarkers for early-stage diagnosis of Alzheimer’s disease. Chen et al. [22] used fMRI to examine neurocognitive impairment and observed FC alterations that reflect abnormalities in intrinsic brain networks. Liu et al. [23] employed dynamic FCs derived from resting-state fMRI and proposed a GNN framework to improve the classification of mild cognitive impairment and Alzheimer’s disease. Wang et al. [24] used resting-state fMRI data for HIV-related neurocognitive impairment detection by incorporating brain community structure information for brain network analysis.
Multimodal data fusion has emerged as a powerful strategy to leverage the complementary information from diverse modalities such as DTI, fMRI, and ASL. Early efforts in multimodal brain imaging fusion primarily focus on feature-level concatenation of DTI and fMRI features [8]. Beyond direct feature integration, Broser et al. [25] proposed an fMRI-guided probabilistic tractography framework to map cortico-cortical and cortico-subcortical language networks in children, integrating DTI-based connectivity with fMRI activation for individualized white matter analysis. Iyer et al. [26] proposed a DWI-guided Bayesian network structure learning approach that uses structural priors from DWI to infer directed functional networks from fMRI data. While this early work offered valuable insights into multimodal integration, it did not explicitly model the topological organization of brain connectivity. To address this limitation, recent studies explored graph-based representations to capture the topological structure of brain networks. Zhang et al. [27] developed a multimodal GNN framework that integrates brain networks from sMRI and PET via shared adjacency matrices and late fusion strategies, demonstrating its effectiveness in Alzheimer’s disease classification. Bagheri et al. [28] proposed a Bayesian graph-based framework that integrates DTI-derived structural priors into causal discovery from fMRI data, improving the estimation of effective brain connectivity. These graph-based works are formulated in Euclidean geometry, which limits their ability to capture the hierarchical structure inherent in brain networks. To address this, Zhang et al. [20] proposed a hyperbolic GCN framework that integrates DTI and fMRI data to enhance classification performance. However, its reliance on complex Riemannian operations in hyperbolic space results in high computational complexity.
2.2. Brain Network Representation Learning
Existing studies often represent the brain as a functional or structural connectivity network derived from fMRI or DTI data. These networks are typically modeled as graphs, where nodes denote regions-of-interest (ROIs) and edges encode functional or structural connections reflecting synchronized activity or anatomical linkages. To represent brain networks/graphs, researchers often extract node-level and graph-level features. Node-level features are computed for each ROI to assess the importance of individual brain regions, while graph-level features describe global network properties and inter-regional interactions, providing an overall view of the brain network.
Learning-based methods have been developed for brain disorder analysis by feeding these predefined brain network features to machine learning algorithms for classification or regression. Recent efforts have focused on developing data-driven approaches for brain network analysis. For example, Kawahara et al. [29] proposed BrainNetCNN, a CNN architecture tailored for structural brain network analysis using DTI-derived connectomes. It employs edge-to-edge, edge-to-node, and node-to-graph convolutions to capture topological features for neurodevelopmental outcome prediction. Li et al. [30] proposed BrainGNN to analyze resting-state fMRI data and identify disease-related neurological biomarkers. It takes brain networks as input and uses two node-level graph convolutional layers to learn representations that capture both topological and functional connectivity patterns. Recently, Chami et al. [17] proposed HGCN, which extends GCNs into hyperbolic space to better capture hierarchical structures in brain graphs. It embeds SC/FC networks using Möbius-based convolutions and generates graph-level embeddings via hyperbolic-to-Euclidean projection and pooling for classification. However, these methods often struggle to explicitly capture the local-to-global dependencies between structurally and functionally connected brain ROIs, which are essential for understanding neurocognitive decline. In this work, we will develop a family of hyperbolic kernel GNN models aimed at capturing both local and global dependencies among brain regions while maintaining the hierarchical structure of brain networks.
3. Methodology
Previous evidence indicates that the human brain has a hierarchical organization across both structural and functional connectivity networks [10–13]. To further explore this hierarchical structure in the context of neurocognitive decline, we visualize regional fMRI and DTI features from 137 subjects in an HIV-associated neurocognitive disorder (HAND) cohort [24] using t-SNE [31], as shown in Fig. 2. Each of the 116 brain regions-of-interest (ROIs), defined by the Automated Anatomical Labeling (AAL) atlas [32], is assigned to one of seven functional subnetworks identified by Yeo et al. [10] or to the cerebellum (CB) group, based on maximum voxel overlap with the MNI152 template. ROIs that could not be reliably assigned to any Yeo functional network or the cerebellum (14 in total) based on voxel overlap were excluded from the t-SNE visualization.
Fig. 2. t-SNE [31] visualizations of regional (a) fMRI and (b) DTI features for subjects from the HAND cohort [24]. For fMRI data, each brain ROI defined by the AAL atlas is represented by a 116-dimensional vector, where each element corresponds to the functional connectivity (measured using Pearson correlation coefficients) with all other ROIs. For DTI data, each ROI is represented by a 348-dimensional feature vector, capturing its structural connectivity with other ROIs in terms of white matter fiber number (FN), fractional anisotropy (FA), and fiber length (FL). Each point denotes an ROI for a specific subject, colored by its assigned group among the seven Yeo networks [10] or the cerebellum (CB). The seven Yeo networks include control (CON), default mode (DMN), dorsal attention (DAN), limbic (LIM), salience/ventral attention (SN), somatomotor (SMN), and visual (VIS) networks.
These subnetworks include the control (CON), default mode (DMN), dorsal attention (DAN), limbic (LIM), salience/ventral attention (SN), somatomotor (SMN), and visual (VIS) networks. Each point in the figure represents a specific ROI from an individual subject. As shown in Fig. 2 (a), brain regions within the same subnetwork display similar functional connectivity (FC) fingerprints, even when they are spatially separated across different cortical areas. A similar pattern is observed in Fig. 2 (b) for DTI features. These consistent trends across modalities suggest that spatially distributed regions within each subnetwork work in coordination. This supports the view that the human brain exhibits a hierarchical organization in both structure and function, enabling efficient and flexible information processing across multiple levels. Motivated by these findings, we propose a novel Hyperbolic Kernel Graph Fusion (HKGF) framework to explicitly capture the hierarchical organization of brain networks for neurocognitive decline analysis, with details introduced below.
3.1. Preliminaries
Notations.
We use $\mathbb{R}^n$ and $\mathbb{D}^n_c$ to denote the $n$-dimensional Euclidean space and the $n$-dimensional Poincaré model with curvature $-c$ ($c>0$), respectively. We omit the superscript $n$ or the subscript $c$ when $n=1$ or $c=1$ for simplicity. Matrices, vectors, and scalars are denoted by bold capital letters (e.g., $\mathbf{X}$), bold lower-case letters (e.g., $\mathbf{x}$), and thin letters (e.g., $x$), respectively.
Hyperbolic Space.
Hyperbolic space is a Riemannian manifold characterized by constant negative curvature [33]. Multiple isometric models have been proposed to represent hyperbolic space [34]. Following [35–37], we adopt the Poincaré model in this work. The Poincaré ball $\mathbb{D}^n_c$ is an $n$-dimensional open ball defined as:

$$\mathbb{D}^n_c = \left\{\mathbf{x} \in \mathbb{R}^n : c\,\|\mathbf{x}\|^2 < 1\right\} \qquad (1)$$
with negative curvature $-c$ and the Riemannian metric $g^c_{\mathbf{x}} = \left(\frac{2}{1-c\|\mathbf{x}\|^2}\right)^2 g^{E}$, where $g^{E}$ denotes the Euclidean metric. As an analytic framework, gyrovector space theory facilitates operations in hyperbolic geometry [38], with the Möbius gyrovector model well-aligned with the structure of the Poincaré ball [39]. For example, for any two points $\mathbf{x}, \mathbf{y} \in \mathbb{D}^n_c$, the Möbius addition is given by:

$$\mathbf{x} \oplus_c \mathbf{y} = \frac{\left(1 + 2c\langle\mathbf{x},\mathbf{y}\rangle + c\|\mathbf{y}\|^2\right)\mathbf{x} + \left(1 - c\|\mathbf{x}\|^2\right)\mathbf{y}}{1 + 2c\langle\mathbf{x},\mathbf{y}\rangle + c^2\|\mathbf{x}\|^2\|\mathbf{y}\|^2} \qquad (2)$$
where $\langle\cdot,\cdot\rangle$ denotes the Euclidean inner product. This non-Euclidean operation serves as the basis for defining hyperbolic geometric distances, which are computed as:

$$d_c(\mathbf{x},\mathbf{y}) = \frac{2}{\sqrt{c}}\,\operatorname{arctanh}\!\left(\sqrt{c}\,\left\|(-\mathbf{x}) \oplus_c \mathbf{y}\right\|\right) \qquad (3)$$
In addition to Möbius operations within the manifold, computations in hyperbolic space often require projections to the tangent space, which is Euclidean. This can be achieved by the logarithmic map at the origin [36]: for $\mathbf{x} \in \mathbb{D}^n_c \setminus \{\mathbf{0}\}$,

$$\log_{\mathbf{0}}^{c}(\mathbf{x}) = \operatorname{arctanh}\!\left(\sqrt{c}\,\|\mathbf{x}\|\right)\frac{\mathbf{x}}{\sqrt{c}\,\|\mathbf{x}\|} \qquad (4)$$
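The Poincaré-ball operations above can be illustrated with a minimal NumPy sketch (function names are ours for illustration, not from the released code; curvature $c>0$):

```python
import numpy as np

def mobius_add(x, y, c=1.0):
    """Möbius addition on the Poincaré ball with curvature -c (Eq. (2))."""
    xy = np.dot(x, y)
    x2, y2 = np.dot(x, x), np.dot(y, y)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + (c ** 2) * x2 * y2
    return num / den

def hyp_dist(x, y, c=1.0):
    """Geodesic distance on the Poincaré ball (Eq. (3))."""
    diff = mobius_add(-x, y, c)
    return (2.0 / np.sqrt(c)) * np.arctanh(np.sqrt(c) * np.linalg.norm(diff))

def log0(x, c=1.0):
    """Logarithmic map at the origin: Poincaré ball -> tangent space (Eq. (4))."""
    n = np.linalg.norm(x)
    if n == 0:
        return x
    return np.arctanh(np.sqrt(c) * n) * x / (np.sqrt(c) * n)
```

A useful sanity check: for any point $\mathbf{x}$, the distance to the origin equals twice the tangent-space norm of its log map, $d_c(\mathbf{0},\mathbf{x}) = 2\|\log_{\mathbf{0}}^c(\mathbf{x})\|$.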
3.2. Proposed Framework
As shown in Fig. 1, the HKGF consists of a multimodal graph construction module, a graph representation learning module that encodes both structural and functional brain graphs in hyperbolic space using a general and flexible family of novel hyperbolic kernel graph neural networks (HKGNNs), a cross-modality coupling module that explicitly models region-of-interest (ROI) dependencies within and across imaging modalities, and a hyperbolic neural network (HNN) for downstream predictive tasks. In particular, the proposed HKGNNs extend traditional convolutional graph neural networks by integrating curvature-aware kernel functions within hyperbolic space, allowing for effective modeling of both local and global relationships among brain regions while maintaining the hierarchical nature of brain networks. Unlike conventional hyperbolic GNNs, our approach avoids computationally intensive Riemannian operations by adopting an efficient kernel-based design, significantly reducing computational cost and improving scalability.
3.2.1. Multimodal Graph Construction
Three modalities are used in this work, including DTI, resting-state fMRI, and ASL. Based on these modalities, we construct SC or FC networks for each subject.
fMRI-based FC Graph Construction.
From resting-state fMRI, we extract the mean time series of $N$ ($N=116$ in this work) ROIs per subject, defined by the AAL atlas. With each ROI treated as a specific node, a fully-connected functional connectivity (FC) network is constructed by computing the Pearson correlation coefficients between the fMRI time series of all ROI pairs, resulting in a symmetric matrix $\mathbf{S} \in \mathbb{R}^{N\times N}$ ($116\times 116$ in this case). The original feature of node $i$ is given by the $i$-th row of $\mathbf{S}$, which is referred to as the node feature matrix $\mathbf{X}_{FC}$. Since fully-connected brain networks may contain noisy or redundant connections, following [40], we empirically retain only the top 50% of the strongest edges in each FC graph to construct a sparse binary adjacency matrix $\mathbf{A}$, where $A_{ij}=1$ denotes an existing edge between two nodes and $A_{ij}=0$ otherwise.
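The FC graph construction described above can be sketched as follows (a minimal sketch; thresholding by absolute correlation strength is our assumption, and helper names are illustrative):

```python
import numpy as np

def build_fc_graph(ts, keep_ratio=0.5):
    """Build a sparse FC graph from ROI time series (T x N).
    Returns the Pearson node feature matrix and a binary adjacency
    that keeps the strongest `keep_ratio` fraction of edges."""
    fc = np.corrcoef(ts.T)                  # N x N Pearson correlations
    np.fill_diagonal(fc, 0.0)               # ignore self-connections
    iu = np.triu_indices_from(fc, k=1)      # each undirected edge once
    thr = np.quantile(np.abs(fc[iu]), 1.0 - keep_ratio)
    adj = (np.abs(fc) >= thr).astype(float) # top-50% strongest edges
    np.fill_diagonal(adj, 0.0)
    return fc, adj
```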
DTI-based SC Graph Construction.
For each subject, three complementary structural connectivity (SC) metrics are calculated from DTI data: fiber number (FN), fractional anisotropy (FA), and fiber length (FL). These metrics are then used to construct a subject-specific SC graph. Each node in the graph corresponds to a brain ROI and is represented by the concatenated FN, FA, and FL features, covering both microstructural and macrostructural properties. Denote the DTI-based node feature matrix as $\mathbf{X}_{SC} \in \mathbb{R}^{N\times 3N}$ ($116\times 348$ in this case), where the original feature of the $i$-th node is given by the $i$-th row of $\mathbf{X}_{SC}$. Each edge in the SC graph is defined as the sum of the three metrics between the corresponding ROI pair, capturing the overall strength of structural connections.
3.2.2. Graph Representation Learning with HKGNNs
We aim to design a family of hyperbolic kernel graph neural networks (HKGNNs) to capture hierarchical organization from brain structural and/or functional connectivity networks. In the following, we first introduce new hyperbolic kernels and then develop a new hyperbolic kernel graph convolutional network (HKGCN) and a hyperbolic kernel graph attention network (HKGAT). Throughout this paper, we refer to HKGF with the HKGCN backbone as HKGF1, and that with the HKGAT backbone as HKGF2.
Hyperbolic Kernels.
To effectively capture both global and local geometric structures in hyperbolic space, we introduce a hyperbolic arc-cosine (HAC) kernel and a hyperbolic radial basis function (HRBF) kernel, inspired by their Euclidean counterparts (i.e., the arc-cosine kernel [41] and the RBF kernel [42]). The $n$-th arc-cosine kernel is formulated as:

$$k_n(\mathbf{x},\mathbf{y}) = 2\int \Theta(\mathbf{w}^{\top}\mathbf{x})\,\Theta(\mathbf{w}^{\top}\mathbf{y})\,(\mathbf{w}^{\top}\mathbf{x})^{n}(\mathbf{w}^{\top}\mathbf{y})^{n}\,p(\mathbf{w})\,\mathrm{d}\mathbf{w} \qquad (5)$$

where $p(\mathbf{w})$ is the standard Gaussian density and $\Theta(\cdot)$ denotes the Heaviside step function. With $n=1$, i.e., a Rectified Linear Unit (ReLU) as the mapping function, the arc-cosine kernel [41] can be written as:

$$k_1(\mathbf{x},\mathbf{y}) = \frac{1}{\pi}\,\|\mathbf{x}\|\,\|\mathbf{y}\|\left(\sin\theta + (\pi-\theta)\cos\theta\right) \qquad (6)$$

where $\theta = \arccos\!\left(\frac{\langle\mathbf{x},\mathbf{y}\rangle}{\|\mathbf{x}\|\,\|\mathbf{y}\|}\right)$.

In this way, $k_1(\mathbf{x},\mathbf{y}) = 2\,\mathbb{E}_{\mathbf{w}\sim p}\!\left[\operatorname{ReLU}(\mathbf{w}^{\top}\mathbf{x})\operatorname{ReLU}(\mathbf{w}^{\top}\mathbf{y})\right]$. If ReLU is used as the mapping function in a standard neural network, this arc-cosine kernel can be viewed as the inner product of the mapped features of two samples $\mathbf{x}$ and $\mathbf{y}$ in a specific layer of the network.
Inspired by this kernel, our HAC kernel is defined as:

$$k_{\mathrm{HAC}}(\mathbf{x},\mathbf{y}) = 2\int \sigma\!\left(\mathbf{w}^{\top}\log_{\mathbf{0}}^{c}(\mathbf{x})\right)\sigma\!\left(\mathbf{w}^{\top}\log_{\mathbf{0}}^{c}(\mathbf{y})\right)p(\mathbf{w})\,\mathrm{d}\mathbf{w} \qquad (7)$$

where $p(\mathbf{w})$ is the probability density of $\mathbf{w}$, $\sigma(\cdot)$ is a non-linear transformation, and $\log_{\mathbf{0}}^{c}(\cdot)$ denotes the logarithmic map from the Poincaré ball to its tangent space (see Eq. (4)). This formulation generalizes the arc-cosine kernel mapping to hyperbolic space, while allowing the use of arbitrary nonlinear functions $\sigma$. When $\sigma$ is distance-independent and operates on a shared tangent-space representation, the HAC kernel aggregates information beyond local neighborhoods, thus helping capture global similarity patterns in hyperbolic space.
The HAC kernel can be approximated using random-feature methods [43, 44], defined as:

$$\phi(\mathbf{x}) = \sqrt{\tfrac{2}{D}}\left[\sigma\!\left(\mathbf{w}_1^{\top}\log_{\mathbf{0}}^{c}(\mathbf{x})\right), \ldots, \sigma\!\left(\mathbf{w}_D^{\top}\log_{\mathbf{0}}^{c}(\mathbf{x})\right)\right]^{\top} \qquad (8)$$

where $\{\mathbf{w}_i\}_{i=1}^{D}$ denotes the collection of random vectors independently drawn from the distribution $p(\mathbf{w})$, and $D$ is the output feature dimension. The HAC kernel value is then approximated as an inner product of two data points in the mapped feature space: $k_{\mathrm{HAC}}(\mathbf{x},\mathbf{y}) \approx \langle\phi(\mathbf{x}),\phi(\mathbf{y})\rangle$.

Interpreting $\phi(\cdot)$ as a generic mapping function within a neural network layer, this approximation establishes a natural connection between the proposed hyperbolic kernel and deep learning frameworks. Accordingly, it can be seamlessly integrated into GNN models as a specialized mapping function, providing a principled way to incorporate hyperbolic geometry within graph-based learning frameworks.
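The random-feature approximation can be checked numerically: with ReLU as the nonlinearity and standard Gaussian random vectors, the inner product of mapped features should approach the closed-form degree-1 arc-cosine kernel of the tangent-space vectors. A minimal sketch (names are ours for illustration):

```python
import numpy as np

def hac_features(X_tan, W):
    """Random-feature map ReLU(w^T t) for tangent vectors X_tan (N x d);
    W is a D x d matrix of Gaussian random vectors."""
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.maximum(X_tan @ W.T, 0.0)

def arccos_kernel_closed(u, v):
    """Closed-form degree-1 arc-cosine kernel (Eq. (6))."""
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    cos_t = np.clip(np.dot(u, v) / (nu * nv), -1.0, 1.0)
    t = np.arccos(cos_t)
    return (nu * nv / np.pi) * (np.sin(t) + (np.pi - t) * np.cos(t))
```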
To capture local similarity among samples in hyperbolic space, we also introduce the HRBF kernel as follows:

$$k_{\mathrm{HRBF}}(\mathbf{x},\mathbf{y}) = \int e^{\,i\,\mathbf{w}^{\top}\left(\log_{\mathbf{0}}^{c}(\mathbf{x}) - \log_{\mathbf{0}}^{c}(\mathbf{y})\right)}\,p(\mathbf{w})\,\mathrm{d}\mathbf{w} \qquad (9)$$

where $i$ denotes the imaginary unit, $p(\mathbf{w})$ is a probability distribution over $\mathbb{R}^n$, and $\log_{\mathbf{0}}^{c}(\cdot)$ is the mapping function defined in Eq. (4). Specifically, with a Gaussian $p(\mathbf{w})$ of variance $\gamma$, this HRBF kernel measures the similarity between a pair of mapped points $\log_{\mathbf{0}}^{c}(\mathbf{x})$ and $\log_{\mathbf{0}}^{c}(\mathbf{y})$ based on their Euclidean distance in the tangent space [42]:

$$k_{\mathrm{HRBF}}(\mathbf{x},\mathbf{y}) = \exp\!\left(-\tfrac{\gamma}{2}\left\|\log_{\mathbf{0}}^{c}(\mathbf{x}) - \log_{\mathbf{0}}^{c}(\mathbf{y})\right\|^2\right) \qquad (10)$$
According to the “Curve Length Equivalence” Theorem [37], this distance in the tangent space actually approximates the hyperbolic geometric distance with a scaling factor. When this distance is small, the kernel value is high, indicating strong similarity between two samples. In contrast, as this distance increases, the kernel value tends to be 0, indicating low similarity. With Eq. (9), the HRBF kernel helps capture local structural information among samples in hyperbolic geometry.
Utilizing random Fourier features [42], the HRBF kernel can be approximated by the inner product of the mapped features $\phi(\mathbf{x})$ and $\phi(\mathbf{y})$, formulated as:

$$\phi(\mathbf{x}) = \sqrt{\tfrac{2}{D}}\left[\cos\!\left(\mathbf{w}_1^{\top}\log_{\mathbf{0}}^{c}(\mathbf{x}) + b_1\right), \ldots, \cos\!\left(\mathbf{w}_D^{\top}\log_{\mathbf{0}}^{c}(\mathbf{x}) + b_D\right)\right]^{\top} \qquad (11)$$

where $\mathbf{w}_i \sim p(\mathbf{w})$ and $b_i \sim \mathrm{Uniform}[0, 2\pi]$.
By leveraging random feature approximation, the HRBF kernel can be flexibly applied in deep neural networks for downstream tasks. Similar to HAC, it can also be integrated into GNN models as a specialized mapping function.
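The random Fourier feature construction can likewise be verified numerically: for tangent vectors, the inner product of the cosine features should approach the Gaussian kernel of their Euclidean distance (Eq. (10)). A minimal sketch under these assumptions:

```python
import numpy as np

def hrbf_features(X_tan, W, b):
    """Random Fourier features cos(w^T t + b) for tangent vectors X_tan
    (N x d). W is D x d with rows drawn from N(0, gamma*I); b ~ U[0, 2*pi]."""
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(X_tan @ W.T + b)
```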
Both feature maps in Eq. (8) and Eq. (11) can be rewritten as

$$\phi(\mathbf{x}) = \sqrt{\tfrac{2}{D}}\,\sigma\!\left(\mathbf{W}\log_{\mathbf{0}}^{c}(\mathbf{x}) + \mathbf{b}\right) \qquad (12)$$

where $\sigma(\cdot)$ is the nonlinear activation (ReLU for HAC and cosine for HRBF), $\mathbf{W} = [\mathbf{w}_1, \ldots, \mathbf{w}_D]^{\top}$, and $\mathbf{b}$ is a bias vector. In this way, Eq. (12) can be viewed as a hyperbolic extension of an artificial neural network layer. A formal proof of the positive definiteness of the HAC and HRBF kernels is provided in the Appendix of the Supplementary Materials.
Proposed HKGCN.
For a brain SC/FC graph, we denote its adjacency matrix as $\mathbf{A} \in \mathbb{R}^{N\times N}$ and its node feature matrix as $\mathbf{X} \in \mathbb{R}^{N\times d}$, where the $i$-th row $\mathbf{x}_i$ is the original feature of node $i$. Given a projection function $\psi(\cdot)$, we integrate our HAC and HRBF kernels into the GCN [45] framework for graph feature learning, yielding a single-layer HKGCN. Specifically, we first map the samples in Euclidean space to a Poincaré ball by:

$$\psi(\mathbf{x}) = \begin{cases}\mathbf{x}, & \text{if } \sqrt{c}\,\|\mathbf{x}\| < 1-\epsilon\\[2pt] \dfrac{(1-\epsilon)\,\mathbf{x}}{\sqrt{c}\,\|\mathbf{x}\|}, & \text{otherwise}\end{cases} \qquad (13)$$

where $\epsilon$ is a small positive constant, ensuring that all data points lie within the Poincaré ball. To capture both local and global information in the brain network, we define the single-layer HKGCN as:

$$\mathbf{Z} = \alpha\,\operatorname{ReLU}\!\left(\hat{\mathbf{A}}\log_{\mathbf{0}}^{c}(\psi(\mathbf{X}))\,\mathbf{W} + \mathbf{b}\right) + (1-\alpha)\cos\!\left(\hat{\mathbf{A}}\log_{\mathbf{0}}^{c}(\psi(\mathbf{X}))\,\mathbf{W} + \mathbf{b}\right) \qquad (14)$$

where $\mathbf{W} \in \mathbb{R}^{d\times d'}$ is a trainable weight matrix, $\mathbf{b} \in \mathbb{R}^{d'}$ is a trainable bias vector, $\hat{\mathbf{A}}$ is the normalized adjacency matrix, and $d'$ is the dimension of updated node features. The hyperparameter $\alpha \in [0,1]$ balances the contributions of the two kernels.
The first term in Eq. (14) serves as a topology-aware approximation of the HAC kernel for modeling global relationships, where the nonlinear activation (ReLU in this work) is applied to node feature aggregation based on the normalized adjacency matrix. The second term serves as a topology-aware approximation of the HRBF kernel for capturing local information, where the cosine function is applied for node feature aggregation. Instead of pointwise kernel activations in Eq. (8) and Eq. (11), HKGCN performs node feature aggregation in tangent space, followed by nonlinear kernel activations. This facilitates the seamless integration of hyperbolic kernels into GNN models for graph representation learning. We can construct a multi-layer HKGNN by stacking multiple HKGCN layers.
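A single HKGCN layer as described above can be sketched in a few lines (a minimal NumPy sketch under our reading of Eq. (14); shared weights across the two kernel terms are an assumption, and all names are illustrative):

```python
import numpy as np

def project(X, c=1e-3, eps=1e-3):
    """Clip rows into the Poincaré ball of curvature -c (Eq. (13))."""
    n = np.linalg.norm(X, axis=1, keepdims=True)
    max_n = (1.0 - eps) / np.sqrt(c)
    return X * np.minimum(1.0, max_n / np.maximum(n, 1e-12))

def log0_rows(X, c=1e-3):
    """Row-wise logarithmic map at the origin (Eq. (4))."""
    n = np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    return np.arctanh(np.sqrt(c) * n) * X / (np.sqrt(c) * n)

def hkgcn_layer(A_hat, X, W, b, alpha=0.5, c=1e-3):
    """Single HKGCN layer: ReLU term ~ HAC kernel (global structure),
    cosine term ~ HRBF kernel (local structure)."""
    H = A_hat @ log0_rows(project(X, c), c) @ W + b
    return alpha * np.maximum(H, 0.0) + (1 - alpha) * np.cos(H)
```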
Proposed HKGAT.
Similar to HKGCN, we integrate the two hyperbolic kernels into the graph attention network (GAT) [46] to define a single-layer HKGAT. In particular, HKGAT leverages an attention mechanism to learn node-specific aggregation weights, allowing for flexible modeling of neighborhood contributions. To this end, we first project the input feature matrix into a Poincaré ball using the projection function $\psi(\cdot)$ defined in Eq. (13). Let $\mathbf{x}_i$ and $\mathbf{x}_j$ denote the projected features of node $i$ and node $j$, respectively. We first map $\mathbf{x}_i$ and $\mathbf{x}_j$ from the Poincaré ball to the tangent space using the logarithmic map defined in Eq. (4). We then compute the attention score as:

$$e_{ij} = \operatorname{LeakyReLU}\!\left(\mathbf{a}^{\top}\left[\mathbf{W}\log_{\mathbf{0}}^{c}(\mathbf{x}_i)\,\big\Vert\,\mathbf{W}\log_{\mathbf{0}}^{c}(\mathbf{x}_j)\right]\right) \qquad (15)$$

where $\mathbf{W}$ is a learnable weight matrix that transforms each feature vector into a shared embedding space, and $\mathbf{a}$ is a learnable attention vector. The operator $\Vert$ concatenates the transformed features of nodes $i$ and $j$ before applying the attention mechanism. Besides, the LeakyReLU nonlinear function is defined as:

$$\operatorname{LeakyReLU}(x) = \begin{cases}x, & \text{if } x > 0\\ \lambda x, & \text{otherwise}\end{cases}$$

where $\lambda$ is a small positive constant that allows a small portion of negative input values to be preserved. To obtain normalized attention coefficients, we apply the softmax function across all neighbors $\mathcal{N}(i)$ of node $i$:

$$\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k\in\mathcal{N}(i)}\exp(e_{ik})} \qquad (16)$$

Then, the new representation of node $i$ is obtained by aggregating the features of its neighbors, formulated as:

$$\mathbf{h}_i = \sum_{j\in\mathcal{N}(i)} \alpha_{ij}\,\mathbf{W}\log_{\mathbf{0}}^{c}(\mathbf{x}_j) \qquad (17)$$

and the final output of the HKGAT layer is defined as:

$$\mathbf{z}_i = \sigma(\mathbf{h}_i) + \beta\,\cos(\mathbf{h}_i) \qquad (18)$$

where $\beta$ is a scalar balancing the contribution of the cosine activation. With Eq. (18), HKGAT can capture both local and global information across graph nodes. Here, we use the Exponential Linear Unit (ELU) activation as the nonlinear function $\sigma(\cdot)$, defined as:

$$\operatorname{ELU}(x) = \begin{cases}x, & \text{if } x > 0\\ \lambda\,(e^{x}-1), & \text{otherwise}\end{cases}$$

where $\lambda$ is a positive constant. Here, $\mathbf{z}_i$ in Eq. (18) denotes the output of a single-head single-layer HKGAT.
To capture complex relationships among nodes, we can use the multi-head attention technique (with $M$ heads) in each HKGAT layer, formulated as:

$$\mathbf{z}_i = \big\Vert_{m=1}^{M}\left(\sigma\!\left(\mathbf{h}_i^{(m)}\right) + \beta\,\cos\!\left(\mathbf{h}_i^{(m)}\right)\right) \qquad (19)$$

where $\mathbf{h}_i^{(m)}$ is computed as in Eq. (17) with head-specific parameters, and $\Vert$ denotes concatenation.
By stacking single-head and/or multi-head HKGAT layers, we can obtain a multi-layer HKGAT.
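The attention computation in Eqs. (15)–(18) can be sketched compactly by decomposing $\mathbf{a}^{\top}[\mathbf{h}_i \Vert \mathbf{h}_j]$ into a source term and a destination term (a minimal single-head sketch on tangent-space features; all names are illustrative):

```python
import numpy as np

def leaky_relu(x, lam=0.2):
    return np.where(x > 0, x, lam * x)

def elu(x, lam=1.0):
    return np.where(x > 0, x, lam * (np.exp(x) - 1.0))

def hkgat_layer(A, X_tan, W, a, beta=0.5):
    """Single-head HKGAT layer on tangent-space features X_tan (N x d).
    A: binary adjacency, W: d x d' weights, a: attention vector (2d')."""
    H = X_tan @ W                                    # shared linear transform
    d = H.shape[1]
    # a^T [h_i || h_j] = (a_src . h_i) + (a_dst . h_j)
    s_src, s_dst = H @ a[:d], H @ a[d:]
    e = leaky_relu(s_src[:, None] + s_dst[None, :])  # logits, Eq. (15)
    e = np.where(A > 0, e, -1e9)                     # mask non-neighbors
    e = e - e.max(axis=1, keepdims=True)             # numerical stability
    att = np.exp(e) * (A > 0)
    att = att / att.sum(axis=1, keepdims=True)       # softmax, Eq. (16)
    Hagg = att @ H                                   # aggregation, Eq. (17)
    return elu(Hagg) + beta * np.cos(Hagg)           # kernel output, Eq. (18)
```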
The HKGNN-encoded SC and FC graph embeddings in Fig. 1 can be represented as $\mathbf{Z}_{SC} = f(\mathbf{X}_{SC})$ and $\mathbf{Z}_{FC} = f(\mathbf{X}_{FC})$, where $f(\cdot)$ denotes the mapping function of the proposed multi-layer HKGCN or HKGAT. By integrating the two hyperbolic kernels, our multi-layer HKGCN and HKGAT perform kernel-based feature transformations to effectively aggregate both local and global information from neighboring nodes at each layer. Their multi-layer architecture progressively captures local-to-global interactions in brain SC/FC networks, while the hyperbolic kernels inherently model their underlying hierarchical structure.
3.2.3. Cross-Modality Coupling for Feature Fusion
To capture global interactions between SC and FC graphs, we construct a novel SC-FC coupling graph using an inner-product operation between the normalized embeddings of the SC and FC graphs:

$$\mathbf{A}_{CP} = \tilde{\mathbf{Z}}_{SC}\,\tilde{\mathbf{Z}}_{FC}^{\top} \qquad (20)$$

where $\tilde{\mathbf{Z}}_{SC}$ and $\tilde{\mathbf{Z}}_{FC}$ denote the normalized SC and FC embeddings, and $\mathbf{A}_{CP}$ is the adjacency matrix representing global SC-FC interactions across ROIs. To define node features on this coupling graph, we concatenate the learned SC and FC representations to form the node feature matrix $\mathbf{X}_{CP} = [\mathbf{Z}_{SC} \Vert \mathbf{Z}_{FC}]$. Another HKGCN or HKGAT model is then applied to this coupling graph to generate the fused features $\mathbf{Z}_{CP}$. This data-driven construction of the coupling graph enables explicit modeling of cross-modal interactions, in contrast to traditional methods (e.g., Pearson correlation [26]) that rely on predefined assumptions. Moreover, the integration of HKGNNs allows for effective representation of the brain’s hierarchical organization, capturing complex local-to-global dependencies.
3.2.4. Prediction with Hyperbolic Neural Network
Given the node feature matrix $\mathbf{Z}_{CP}$ obtained from the SC-FC coupling module, we apply average pooling to obtain a compact vector representation $\mathbf{h}$. To align with the hyperbolic nature of the representations extracted by HKGCN/HKGAT, we design a new Hyperbolic Neural Network (HNN) for classification. Unlike conventional predictors that apply fully connected layers in Euclidean space, our HNN operates in the tangent space to preserve the geometric consistency and hierarchical properties [37] learned by HKGCN. Each layer applies the transformation:

$$\mathbf{h}^{(l+1)} = \sigma\!\left(\mathbf{W}^{(l)}\log_{\mathbf{0}}^{c}\!\left(\psi(\mathbf{h}^{(l)})\right) + \mathbf{b}^{(l)}\right) \qquad (21)$$

where $\sigma(\cdot)$ is the activation function (i.e., ReLU) and $\psi(\cdot)$ (see Eq. (13)) ensures that the input lies within the Poincaré ball. We stack two such layers to construct the HNN. The final feature is passed through a softmax layer for classification with cross-entropy loss. Compared to existing hyperbolic neural networks [17, 36], our HNN reduces computational complexity by avoiding costly operations in hyperbolic space.
3.3. Implementation Details
In HKGF1, the HKGCN backbone consists of two hyperbolic kernel layers, both equipped with a hidden dimension of 64 and the rescaling factor . We use the first-stage HKGCN for SC/FC graph feature learning, and the second-stage HKGCN for SC-FC coupling graph representation learning. Similarly, the HKGF2 employs an HKGAT backbone, also comprising two hyperbolic kernel layers with a hidden dimension of 64 and a rescaling factor of . The first layer of HKGAT uses attention heads, and the second layer uses a single head. We also use two-stage HKGAT modules in HKGF2. Both HKGCN and HKGAT utilize the curvature parameter of 0.001. The HNN contains two layers (each with a hidden dimension of ). The Adam optimizer is used with a learning rate of 0.0001 and a weight decay of 1 × 10−4. The batch size is 128, and the number of training epochs is 50. To support reproducible research, we have made our source code publicly available via GitHub (see HKGCN and HKGAT).
4. Experiment
4.1. Materials and Data Preprocessing
4.1.1. Materials
Both the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset [47] and an HIV-associated neurocognitive disorder (HAND) dataset [24] are used as the target domains in this work. On ADNI, we aim to distinguish patients with significant memory concern (SMC) from cognitively normal (CN) subjects, using paired resting-state fMRI and DTI data acquired at baseline from 46 SMC subjects and 48 age- and sex-matched CN subjects. On HAND, we aim to distinguish subjects with asymptomatic neurocognitive impairment with HIV (ANI) from CN subjects. Both resting-state fMRI and DTI data in HAND are collected from 68 ANI subjects and 69 age-matched CN subjects at a local hospital. Given the limited sample sizes in the two target domains, we employ a transfer learning strategy by pretraining deep learning models on large-scale auxiliary source-domain data. These source-domain data (with 3,806 resting-state fMRI scans) come from three public datasets: ABIDE [48], the REST-meta-MDD Consortium [49], and ADHD-200 [50]. All source data are used for pretraining via a self-supervised contrastive learning strategy [51], without employing any category labels. Demographic information of the studied subjects from the source and target domains is reported in Table 1. Detailed subject IDs are reported in Supplementary Materials.
TABLE 1.
Demographic information of subjects from two target cohorts (target domains) and three auxiliary datasets (source domains).
| Dataset | Modality | Domain | Category | Subject # | Sex (M/F) | Age |
|---|---|---|---|---|---|---|
| ADNI | fMRI, DTI | Target | SMC | 46 | 16/30 | 75.63±5.43 |
| | | | CN | 48 | 18/30 | 75.02±4.04 |
| HAND | fMRI, DTI | Target | ANI | 68 | 68/0 | 33.07±6.18 |
| | | | CN | 69 | 69/0 | 33.33±5.37 |
| ABIDE | fMRI | Source | ASD | 351 | 308/43 | 16.90±8.00 |
| | | | CN | 370 | 295/75 | 16.60±6.80 |
| MDD | fMRI | Source | MDD | 1,163 | 415/748 | 36.90±14.90 |
| | | | CN | 1,104 | 425/579 | 37.00±16.00 |
| ADHD-200 | fMRI | Source | ADHD | 285 | 233/52 | 12.20±3.10 |
| | | | CN | 379 | 196/183 | 13.20±3.50 |
Age is presented as the mean±standard deviation. ANI: asymptomatic neurocognitive impairment with HIV; CN: cognitively normal; SMC: significant memory concern; ASD: autism spectrum disorder; MDD: major depressive disorder; ADHD: attention-deficit/hyperactivity disorder; M: Male; F: Female.
4.1.2. Data Preprocessing
All resting-state fMRI data were preprocessed using a popular pipeline, including magnetization stabilization, slice timing correction, head motion correction, regression of confounding covariates (e.g., white matter signal, ventricular signal, and head motion parameters), normalization to the Montreal Neurological Institute (MNI) space, spatial smoothing, and bandpass filtering. We extracted the mean time series (i.e., blood-oxygen-level-dependent signals) from 116 ROIs per subject, defined by the Automated Anatomical Labeling (AAL) atlas. Based on the regional BOLD signals, we construct an FC network for each subject. All DTI data were preprocessed using an established pipeline, including brain tissue extraction, bias field correction, eddy current and head motion correction, adjustment of diffusion gradient directions, registration of each subject’s anatomical images to the diffusion space, and brain ROI partition based on the AAL atlas. For each subject, three SC metrics were calculated: fiber number (FN), fractional anisotropy (FA), and fiber length (FL). These metrics are concatenated to represent each ROI, and the weight of each edge between a pair of ROIs is computed as the sum of the three metrics. More details can be found in Supplementary Materials.
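The graph construction described above can be sketched concretely; the array shapes (200 time points, 116 AAL ROIs) are illustrative assumptions:

```python
import numpy as np

def fc_network(bold):
    """FC graph from regional BOLD signals.

    bold : (T, N) array, T time points for N = 116 AAL ROIs.
    Edge weight = Pearson correlation between ROI time series.
    """
    fc = np.corrcoef(bold.T)    # (N, N) Pearson correlation matrix
    np.fill_diagonal(fc, 0.0)   # drop self-connections
    return fc

def sc_edge_weights(fn, fa, fl):
    """SC edge weights as the sum of the three DTI metrics (FN, FA, FL),
    each given as an (N, N) symmetric matrix, as described above."""
    return fn + fa + fl
```

For node features, each ROI carries the concatenated FN/FA/FL metrics on the SC side and its BOLD-derived connectivity profile on the FC side.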
4.2. Experimental Setup
4.2.1. Competing Methods
We compare our HKGF1 (with HKGCN as backbone) and HKGF2 (with HKGAT as backbone) against three classical machine learning approaches, i.e., support vector machine (SVM) [52], XGBoost [54], and random forest (RF) [53], with concatenated SC and FC graph features as input. We further compare our methods with eight state-of-the-art deep learning methods that automatically extract brain SC and FC features: graph convolutional network (GCN) [45], graph isomorphism network (GIN) [57], graph attention network (GAT) [46], Transformer [55], graph sample and aggregate (GraphSAGE) [56], BrainNetCNN [29], BrainGNN [30], and hyperbolic graph convolutional neural network (HGCN) [17]. Details of the competing methods are presented in Supplementary Materials.
The three machine learning models (SVM, RF, and XGBoost) use the same input features (dimension: 2,336) per subject, including node-level (e.g., degree centrality, clustering coefficient, betweenness, eigenvector centrality) and graph-level features from each subject’s SC and FC graphs. The deep learning methods use the same multimodal graph inputs as ours and a two-layer MLP for prediction, with 32 neurons in the first layer and 2 neurons in the second. We employ the default settings for these competing methods and align their training hyperparameters with ours to ensure a fair comparison.
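Two of the node-level graph metrics listed above can be sketched as follows; the full 2,336-dimensional feature set follows the paper's Supplementary Materials, and the binarization threshold `thr` here is an illustrative assumption:

```python
import numpy as np

def node_features(a, thr=0.2):
    """Illustrative node-level features of the kind fed to SVM/RF/XGBoost.

    a : (N, N) weighted adjacency; binarized at `thr` for topology metrics.
    Returns (N, 2): degree centrality and clustering coefficient per node.
    """
    b = (np.abs(a) > thr).astype(float)
    np.fill_diagonal(b, 0.0)
    k = b.sum(axis=1)                      # node degree
    deg_cent = k / (len(b) - 1)            # normalized degree centrality
    tri = np.diagonal(b @ b @ b)           # closed 3-walks = 2 x triangles
    denom = np.maximum(k * (k - 1), 1.0)   # guard isolated/degree-1 nodes
    clust = tri / denom                    # clustering coefficient
    return np.column_stack([deg_cent, clust])
```

Per-subject vectors are then formed by concatenating such features over the SC and FC graphs.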
The competing deep learning methods incorporate both early fusion (EF) and late fusion (LF) strategies for multimodal feature integration, as shown in Fig. S1 of Supplementary Materials. Taking the GCN as a representative example, early fusion (denoted as GCN-EF) begins by concatenating the node-level FC and SC features to form a fused node representation. Simultaneously, the adjacency matrices of the FC and SC graphs are summed to construct a single fused adjacency matrix. This composite graph (consisting of the fused node feature matrix and the fused adjacency matrix) is then fed into a GCN for feature extraction, followed by a two-layer MLP for final prediction. In contrast, the late fusion strategy (denoted as GCN-LF) processes the SC and FC graphs separately using two independent GCN models to learn modality-specific features. The resulting features are then concatenated and passed through the same two-layer MLP to generate the final output. More details are presented in Fig. S1 and Section 2 of Supplementary Materials.
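The two fusion strategies reduce to simple array operations before and after the GNN encoders, respectively (a minimal sketch; the GNN encoders themselves are omitted):

```python
import numpy as np

def early_fusion(x_fc, a_fc, x_sc, a_sc):
    """Early fusion: concatenate node features and sum adjacency matrices,
    producing one composite graph fed to a single GNN."""
    return np.concatenate([x_fc, x_sc], axis=1), a_fc + a_sc

def late_fusion(feat_fc, feat_sc):
    """Late fusion: concatenate modality-specific embeddings learned by two
    independent GNN branches before the shared two-layer MLP head."""
    return np.concatenate([feat_fc, feat_sc], axis=1)
```

Early fusion mixes modalities at the input level, while late fusion defers mixing until after modality-specific representation learning.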
4.2.2. Prediction Task and Evaluation Metric
Two prediction tasks are performed on the two target cohorts (i.e., ADNI and HAND): (1) Task 1: SMC vs. CN classification on ADNI with baseline fMRI and DTI data from 46 SMC subjects and 48 CN subjects, and (2) Task 2: ANI vs. CN classification on HAND with fMRI and DTI data from 68 ANI subjects and 69 CN subjects. To address the small-data issue, we employ a transfer learning strategy by first pretraining a deep learning model on large-scale source data and then fine-tuning it on the target data to perform the downstream target tasks. Using self-supervised contrastive learning [51], we pretrain the deep models (i.e., GCN, GIN, GAT, Transformer, GraphSAGE, BrainNetCNN, BrainGNN, HGCN, HKGCN, and HKGAT) on the auxiliary source data containing 3,806 fMRI scans from ABIDE [48], the REST-meta-MDD Consortium [49], and ADHD-200 [50]. We then fine-tune these pretrained models on the target data for prediction. There is no data overlap between the auxiliary source datasets and the two target datasets. All the pretrained models can be accessed online.
All methods are evaluated using a standard 5-fold cross-validation (CV) strategy on the two target datasets. The target HAND dataset was collected from a single site and includes male subjects only, while the target ADNI dataset involves multiple acquisition sites and includes both male and female subjects. Within each dataset, we ensured that the age distributions were statistically comparable across all cross-validation folds (two-sample t-test). For the ADNI dataset, sex distributions were also checked for balance across folds, minimizing potential confounding effects during evaluation. To reduce variability due to random data splits, the CV process is repeated five times with different random seeds, and the average results and standard deviations across the five runs are reported. Performance is evaluated using seven metrics: area under the ROC curve (AUC), accuracy (ACC), F1 score (F1), balanced accuracy (BAC), sensitivity (SEN), specificity (SPE), and precision (PRE).
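The threshold-based metrics among the seven follow directly from the binary confusion matrix (AUC requires continuous scores and is computed separately); a self-contained sketch:

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = positive class)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    sen = tp / (tp + fn)                  # sensitivity (recall)
    spe = tn / (tn + fp)                  # specificity
    pre = tp / (tp + fp)                  # precision
    acc = (tp + tn) / len(y_true)         # accuracy
    bac = (sen + spe) / 2                 # balanced accuracy
    f1 = 2 * pre * sen / (pre + sen)      # F1 score
    return dict(ACC=acc, F1=f1, BAC=bac, SEN=sen, SPE=spe, PRE=pre)
```

Each metric is averaged over the five CV folds and the five repeated runs.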
4.3. Results and Analysis
4.3.1. Task 1: SMC vs. CN Prediction on ADNI
We present the results of all methods for SMC vs. CN classification on the ADNI dataset using fMRI and DTI data in Table 2. We conduct pairwise t-tests on the results achieved by HKGF1 and each competing method to assess the significance of their differences, with significant results (p < 0.05) marked as ‘*’ in Table 2, while those for the comparison between HKGF2 and each competing method are marked as ‘†’.
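The pairwise test above compares matched fold-wise scores; a minimal sketch with the hypothetical helper `paired_t_statistic` (the p-value would then come from the t distribution with n-1 degrees of freedom, e.g., via SciPy):

```python
import numpy as np

def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic over matched CV runs of two methods."""
    d = np.asarray(scores_a, float) - np.asarray(scores_b, float)
    n = len(d)
    return d.mean() / (d.std(ddof=1) / np.sqrt(n))
```

The pairing is essential: both methods are evaluated on identical fold splits, so per-split differences are compared rather than the two unpaired score sets.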
TABLE 2.
Results (%) of different methods in SMC vs. CN classification on ADNI with fMRI and DTI data.
| Method | AUC | ACC | F1 | BAC | SEN | SPE | PRE | p-value* | p-value† |
|---|---|---|---|---|---|---|---|---|---|
| SVM [52] | 55.08 ± 5.90 | 57.63 ± 2.60 | 59.92 ± 2.05 | 58.51 ± 1.25 | 59.19 ± 3.97 | 57.83 ± 5.54 | 64.42 ± 2.09 | < 0.001* | < 0.001† |
| RF [53] | 67.75 ± 3.54 | 60.42 ± 2.99 | 67.29 ± 2.42 | 60.80 ± 2.58 | 76.52 ± 3.68 | 45.08 ± 7.38 | 63.28 ± 2.39 | < 0.001* | < 0.001† |
| XGBoost [54] | 70.54 ± 4.87 | 63.53 ± 3.87 | 67.20 ± 4.43 | 64.26 ± 4.74 | 70.42 ± 3.50 | 58.10 ± 7.11 | 67.49 ± 4.75 | < 0.001* | < 0.001† |
| GCN-EF [45] | 74.72 ± 4.31 | 73.65 ± 4.98 | 73.65 ± 6.22 | 75.41 ± 3.37 | 71.89 ± 10.27 | 78.93 ± 10.15 | 83.19 ± 5.95 | < 0.001* | < 0.001† |
| GCN-LF [45] | 70.02 ± 2.08 | 72.22 ± 1.92 | 74.92 ± 2.49 | 71.48 ± 1.78 | 78.33 ± 4.59 | 64.63 ± 5.20 | 74.73 ± 2.46 | < 0.001* | < 0.001† |
| GAT-EF [46] | 73.87 ± 1.47 | 75.84 ± 2.34 | 76.06 ± 3.16 | 75.62 ± 1.78 | 74.44 ± 7.55 | 76.81 ± 7.15 | 80.86 ± 2.61 | < 0.001* | < 0.001† |
| GAT-LF [46] | 77.67 ± 2.80 | 75.68 ± 2.98 | 77.48 ± 3.24 | 75.43 ± 2.60 | 78.53 ± 6.14 | 72.32 ± 5.15 | 79.10 ± 1.55 | 0.330 | < 0.001† |
| Transformer-EF [55] | 66.95 ± 2.26 | 68.64 ± 3.96 | 67.83 ± 2.53 | 69.36 ± 4.09 | 65.80 ± 6.51 | 72.92 ± 13.80 | 79.61 ± 9.27 | 0.001* | < 0.001† |
| Transformer-LF [55] | 70.89 ± 7.81 | 71.66 ± 8.10 | 75.26 ± 6.85 | 71.02 ± 7.85 | 81.76 ± 10.43 | 60.29 ± 17.73 | 74.99 ± 8.62 | < 0.001* | 0.048† |
| GraphSAGE-EF [56] | 73.82 ± 3.16 | 69.05 ± 1.62 | 69.56 ± 2.48 | 70.93 ± 3.10 | 67.67 ± 5.18 | 74.19 ± 8.02 | 77.38 ± 4.51 | < 0.001* | < 0.001† |
| GraphSAGE-LF [56] | 75.70 ± 3.77 | 72.55 ± 2.49 | 72.62 ± 4.00 | 74.34 ± 2.51 | 69.37 ± 6.21 | 79.32 ± 3.82 | 81.70 ± 3.04 | 0.012* | < 0.001† |
| GIN-EF [57] | 74.17 ± 4.10 | 70.96 ± 3.14 | 71.37 ± 3.43 | 72.86 ± 2.46 | 68.38 ± 4.31 | 77.33 ± 3.44 | 79.27 ± 3.22 | < 0.001* | < 0.001† |
| GIN-LF [57] | 70.32 ± 2.34 | 69.42 ± 1.41 | 67.10 ± 3.04 | 69.88 ± 1.66 | 63.39 ± 6.81 | 76.37 ± 8.36 | 79.55 ± 3.86 | < 0.001* | < 0.001† |
| BrainNetCNN-EF [29] | 58.72 ± 8.60 | 61.85 ± 3.43 | 57.85 ± 5.58 | 62.92 ± 4.22 | 57.15 ± 9.20 | 68.70 ± 15.94 | 66.55 ± 11.12 | < 0.001* | < 0.001† |
| BrainNetCNN-LF [29] | 71.95 ± 6.53 | 74.07 ± 4.55 | 75.25 ± 6.36 | 75.74 ± 5.01 | 75.66 ± 12.75 | 75.82 ± 8.53 | 80.80 ± 2.79 | < 0.001* | < 0.001† |
| BrainGNN-EF [30] | 62.61 ± 6.15 | 67.44 ± 6.59 | 66.24 ± 11.18 | 68.30 ± 3.22 | 65.22 ± 14.36 | 71.38 ± 8.04 | 77.19 ± 3.96 | 0.004* | 0.006† |
| BrainGNN-LF [30] | 72.21 ± 6.89 | 71.02 ± 5.44 | 73.75 ± 4.23 | 72.40 ± 4.86 | 76.59 ± 4.12 | 68.22 ± 8.23 | 75.22 ± 4.58 | 0.023* | 0.003† |
| HGCN-EF [17] | 71.10 ± 4.04 | 69.28 ± 2.52 | 68.58 ± 4.29 | 69.79 ± 2.39 | 65.00 ± 3.12 | 74.58 ± 3.28 | 77.70 ± 2.51 | < 0.001* | < 0.001† |
| HGCN-LF [17] | 69.81 ± 5.59 | 69.58 ± 3.90 | 70.37 ± 3.58 | 69.91 ± 4.43 | 67.33 ± 5.22 | 72.49 ± 11.04 | 78.24 ± 7.50 | < 0.001* | < 0.001† |
| HKGF1 (Ours) | 76.22 ± 2.31 | 76.30 ± 1.99 | 77.59 ± 2.54 | 78.10 ± 2.13 | 78.34 ± 4.77 | 77.87 ± 3.15 | 81.32 ± 1.99 | – | – |
| HKGF2 (Ours) | 80.42 ± 2.25 | 81.26 ± 2.62 | 82.65 ± 2.64 | 80.58 ± 2.11 | 84.81 ± 4.08 | 76.35 ± 3.97 | 81.75 ± 3.11 | – | – |
‘*’ denotes statistically significant differences between HKGF1 and a competing method (p < 0.05 via t-test), while ‘†’ denotes statistically significant differences between HKGF2 and a competing method.
Several important insights can be derived from Table 2. First, both versions of our method (HKGF1 and HKGF2) generally outperform competing approaches, with HKGF2 achieving the highest overall performance (AUC: 80.42%, ACC: 81.26%). The slight superiority of HKGF2 over HKGF1 may be attributed to the attention mechanism employed in the HKGAT backbone, which enables it to capture informative dependencies that the HKGCN backbone in HKGF1 lacks. Second, HKGF1 and HKGF2 yield statistically significant improvements over traditional methods such as SVM and XGBoost, and various GCN-based models across most evaluation metrics. Third, compared to transformer-based models (e.g., Transformer-EF) and GraphSAGE variants, our methods achieve higher sensitivity and specificity, indicating more balanced classification performance. Additionally, t-test results confirm that HKGF1 and HKGF2 significantly outperform most competing methods, further validating the effectiveness of our HKGF framework in SMC detection.
4.3.2. Task 2: ANI vs. CN Prediction on HAND
Table 3 reports the results of all methods in ANI vs. CN classification on HAND. We draw several key observations from this table. First, deep learning approaches generally outperform traditional machine learning methods (i.e., SVM, RF, and XGBoost), highlighting the advantages of data-driven graph feature extraction. Second, our HKGF1 and HKGF2 achieve the best performance (e.g., AUC of 69.74% and ACC of 71.85% by HKGF2), indicating their effectiveness in modeling cross-modality local-to-global interactions and hierarchical dependencies among ROIs. Third, our methods significantly outperform state-of-the-art models such as BrainNetCNN and BrainGNN. This improvement may be attributed to the SC-FC coupling module in HKGF, as opposed to simple SC and FC feature concatenation used in the two competing methods. Finally, the overall performance across all methods in this task is generally lower than that reported for SMC identification in Table 2. This difference may stem from the nature of the cohorts: while SMC subjects experience subjective cognitive complaints, ANI patients often do not exhibit noticeable cognitive symptoms, making ANI identification inherently more challenging.
TABLE 3.
Results (%) of different methods in ANI vs. CN classification on HAND with fMRI and DTI data.
| Method | AUC | ACC | F1 | BAC | SEN | SPE | PRE | p-value* | p-value† |
|---|---|---|---|---|---|---|---|---|---|
| SVM [52] | 51.96 ± 3.25 | 50.08 ± 3.01 | 48.70 ± 3.99 | 51.57 ± 3.55 | 47.50 ± 4.61 | 55.63 ± 4.05 | 52.58 ± 4.40 | < 0.001* | < 0.001† |
| RF [53] | 54.77 ± 3.05 | 49.25 ± 3.05 | 48.49 ± 3.49 | 50.63 ± 3.66 | 49.75 ± 5.80 | 51.49 ± 9.83 | 50.88 ± 3.82 | < 0.001* | < 0.001† |
| XGBoost [54] | 57.49 ± 6.34 | 54.61 ± 5.21 | 54.53 ± 5.76 | 54.97 ± 5.01 | 55.70 ± 7.13 | 54.23 ± 5.06 | 55.62 ± 5.54 | < 0.001* | < 0.001† |
| GCN-EF [45] | 65.24 ± 5.59 | 64.83 ± 3.36 | 61.97 ± 6.88 | 65.54 ± 4.08 | 64.74 ± 10.14 | 66.34 ± 10.34 | 69.09 ± 5.05 | 0.039* | 0.028† |
| GCN-LF [45] | 61.06 ± 2.71 | 63.16 ± 1.46 | 62.88 ± 2.66 | 63.35 ± 1.45 | 67.44 ± 6.54 | 59.27 ± 7.35 | 65.52 ± 4.04 | < 0.001* | 0.011† |
| GAT-EF [46] | 65.10 ± 5.17 | 64.79 ± 3.07 | 61.91 ± 7.13 | 64.38 ± 3.19 | 65.60 ± 9.61 | 63.15 ± 6.56 | 67.63 ± 3.08 | 0.019* | 0.005† |
| GAT-LF [46] | 62.61 ± 3.78 | 64.23 ± 3.49 | 66.48 ± 5.90 | 60.77 ± 5.75 | 64.21 ± 2.42 | 67.94 ± 11.26 | 67.58 ± 4.33 | 0.046* | < 0.001† |
| Transformer-EF [55] | 54.92 ± 4.36 | 59.13 ± 2.69 | 57.97 ± 5.95 | 59.46 ± 1.52 | 64.56 ± 9.28 | 54.36 ± 10.01 | 57.90 ± 8.06 | < 0.001* | < 0.001† |
| Transformer-LF [55] | 55.49 ± 3.88 | 60.56 ± 3.81 | 53.86 ± 2.31 | 59.54 ± 2.00 | 64.81 ± 9.42 | 64.26 ± 12.69 | 63.29 ± 5.59 | < 0.001* | < 0.001† |
| GraphSAGE-EF [56] | 63.11 ± 3.63 | 64.23 ± 3.98 | 64.24 ± 6.10 | 64.21 ± 4.23 | 71.11 ± 10.18 | 57.32 ± 4.89 | 64.04 ± 3.10 | 0.025* | 0.032† |
| GraphSAGE-LF [56] | 62.31 ± 2.30 | 64.23 ± 1.90 | 62.70 ± 2.58 | 63.60 ± 1.46 | 64.01 ± 5.54 | 63.18 ± 5.55 | 64.21 ± 1.85 | 0.048* | 0.024† |
| GIN-EF [57] | 62.90 ± 5.34 | 65.12 ± 4.34 | 62.51 ± 8.48 | 64.54 ± 4.62 | 66.07 ± 10.43 | 63.00 ± 9.74 | 68.73 ± 9.00 | 0.022* | 0.033† |
| GIN-LF [57] | 61.30 ± 2.42 | 63.63 ± 2.80 | 61.05 ± 3.60 | 63.74 ± 1.94 | 64.91 ± 7.98 | 62.56 ± 11.56 | 69.01 ± 8.68 | 0.038* | < 0.001† |
| BrainNetCNN-EF [29] | 54.73 ± 4.46 | 58.25 ± 1.83 | 57.97 ± 4.93 | 58.20 ± 1.96 | 63.12 ± 13.76 | 53.29 ± 16.81 | 59.33 ± 4.92 | < 0.001* | < 0.001† |
| BrainNetCNN-LF [29] | 54.98 ± 5.88 | 62.05 ± 4.76 | 57.79 ± 7.47 | 62.01 ± 4.94 | 61.08 ± 13.80 | 62.95 ± 13.16 | 63.76 ± 11.95 | < 0.001* | < 0.001† |
| BrainGNN-EF [30] | 56.11 ± 7.60 | 62.08 ± 5.03 | 60.83 ± 3.86 | 62.73 ± 4.21 | 65.15 ± 8.33 | 60.31 ± 12.85 | 65.39 ± 5.38 | 0.013* | 0.048† |
| BrainGNN-LF [30] | 58.83 ± 5.08 | 62.58 ± 3.64 | 54.31 ± 8.33 | 61.39 ± 4.10 | 55.16 ± 13.76 | 67.62 ± 12.85 | 60.16 ± 6.79 | 0.009* | < 0.001† |
| HGCN-EF [17] | 64.08 ± 5.42 | 63.66 ± 3.95 | 65.18 ± 7.17 | 63.81 ± 3.81 | 72.35 ± 10.23 | 55.27 ± 4.29 | 64.64 ± 2.18 | < 0.001* | < 0.001† |
| HGCN-LF [17] | 60.68 ± 1.58 | 65.12 ± 1.15 | 61.02 ± 4.13 | 65.04 ± 1.26 | 59.52 ± 8.43 | 70.57 ± 7.10 | 69.74 ± 5.21 | < 0.001* | < 0.001† |
| HKGF1 (Ours) | 68.83 ± 2.55 | 71.53 ± 2.59 | 69.72 ± 5.15 | 71.54 ± 2.53 | 71.30 ± 8.73 | 71.78 ± 4.18 | 73.62 ± 1.72 | – | – |
| HKGF2 (Ours) | 69.74 ± 1.77 | 71.86 ± 2.39 | 68.73 ± 2.54 | 72.36 ± 1.40 | 70.45 ± 7.57 | 74.27 ± 7.64 | 76.45 ± 4.99 | – | – |
‘*’ denotes statistically significant differences between HKGF1 and a competing method (p < 0.05 via t-test), while ‘†’ denotes statistically significant differences between HKGF2 and a competing method.
4.4. Visualization of Learned Graph Representation
For both prediction tasks, we visualize the graph features learned by the first-stage and second-stage HKGCN backbones of HKGF1 using t-SNE [31] in Fig. 3 (a), where the three columns correspond to FC graphs derived from fMRI, SC graphs from DTI, and the SC-FC coupling graphs produced by our cross-modality coupling module. Figure 3 (b) presents a comparison of the feature distributions learned by the first-stage and second-stage HKGAT backbones of the proposed HKGF2. These visualizations show that the feature representations of FC and SC graphs exhibit considerable overlap, while the fused SC-FC coupling graph features can more clearly distinguish between categories. It is important to recognize that the two classification tasks are inherently challenging due to the subtlety and complexity of asymptomatic neurocognitive disorders in HIV patients and the subjective nature of SMCs. This complexity leads to limited feature separability between categories.
Fig. 3.

The t-SNE [31] visualization of features output by (a) the HKGCN backbone and (b) the HKGAT backbone in the single-modality graph representation learning (1st stage) and cross-modality coupling (2nd stage) modules of the proposed HKGF1 and HKGF2, respectively. Shown for (top) Task 1: SMC vs. CN classification on the ADNI dataset and (bottom) Task 2: ANI vs. CN classification on the HAND dataset using fMRI and DTI data.
4.5. Identified Discriminative Brain Regions
We further study the discriminative brain regions identified by our methods in the two tasks based on SC-FC coupling graph embeddings through the BrainNet Viewer [58]. The visualization results are shown in Fig. 4. The SC-FC coupling features are derived from our SC-FC coupling modules in HKGF1 and HKGF2. Specifically, for HKGF1 with the HKGCN backbone (without any attention techniques), we compute Pearson’s correlation coefficients between paired ROIs of these embeddings and use the t-test (significance level: 0.05) to identify significant connectivity differences between the positive (e.g., SMC) and negative (i.e., CN) groups across five folds. We select the top 15 discriminative connectivities, involving 14 ROIs in SMC vs. CN classification and 16 ROIs in ANI vs. CN classification. For HKGF2 with the HKGAT backbone, we rely on the attention matrix learned by the second-stage HKGAT to obtain the discriminative ROIs.
Fig. 4.

Discriminative brain regions identified by HKGF1 (with HKGCN as backbone) and HKGF2 (with HKGAT as backbone) in the tasks of (a) SMC vs. CN classification and (b) ANI vs. CN classification with fMRI and DTI data. Different colors represent distinct regions.
Figure 4 (a) indicates that, despite some variability, both methods consistently identify six overlapping ROIs in the SMC vs. CN classification, including SFGdor.L, IFGtriang.L, IFGtriang.R, SFGmed.R, PoCG.L, and PCL.L. These regions are primarily located in the frontal and parietal lobes, which are known to be involved in executive function, attention, and early memory processing. Their involvement aligns with previous studies reporting frontal-parietal dysfunction in individuals with subjective memory concerns [59–61], suggesting these ROIs may serve as early neural markers of cognitive decline. Figure 4 (b) shows that, in the ANI vs. CN classification, both methods consistently identify four important ROIs (CAU.L, CAU.R, PUT.R, and CRBLCrus1.L). These findings align with previous studies [62–64], further supporting the reliability of our methods in distinguishing ANI from CN. Notably, connectivity between the left cerebellar Crus I (CRBLCrus1.L) and the left inferior temporal gyrus (ITG.L) suggests involvement of cerebellar–cortical circuits in semantic processing. Given the cerebellum’s role in motor-cognitive integration and the ITG’s function in language and semantics, this connection may reflect disruptions in integrative processing in HAND [63]. We also found strong coupling between the left hippocampus (HIP.L) and ITG.L, a circuit commonly linked to memory function and frequently implicated in HAND pathology. This supports prior evidence that memory-related circuits are among the most vulnerable in HAND [65]. Additionally, the observed interaction between the left superior orbital frontal gyrus (ORBsup.L) and Vermis 10 may relate to emotional and cognitive regulation, as both regions are involved in affective and executive processing. Finally, the connection between the left caudate nucleus (CAU.L) and Vermis 10 points to dysfunction in frontal–subcortical loops.
The caudate is involved in attention control, while the cerebellar vermis supports behavioral coordination. Disruption in this pathway may underlie deficits in attention and motor function commonly seen in HAND, emphasizing the involvement of cortico-subcortical networks in the disorder [63].
To enhance biological interpretability, we further examined how the identified discriminative brain regions relate to known neuropathological patterns in HAND and AD. The consistent involvement of the frontal and parietal regions (including SFGdor, IFGtriang, SFGmed, PoCG, and PCL) supports prior findings that executive and attentional control networks are among the earliest affected systems in both disorders [66, 67]. The detected coupling between the hippocampus and inferior temporal gyrus corresponds to memory-related circuits frequently reported as vulnerable in HAND and AD [68, 69]. Moreover, the identified cerebellar–cortical and frontal–subcortical interactions (e.g., CRBLCrus1–ITG and CAU–Vermis 10) align with recent evidence that cerebellar and striatal abnormalities contribute to attention and motor-cognitive deficits in HAND and AD [70]. These converging patterns highlight the involvement of distributed cortico-subcortical networks and underscore the biological plausibility of our model’s findings.
5. Discussion
5.1. Ablation Study
To evaluate the influence of key components, we compare our two HKGF implementations (i.e., HKGF1 and HKGF2) with their variants: (1) HKGF-G that uses GCN instead of HKGCN/HKGAT as backbone, (2) HKGF-A with GAT as backbone, (3) HKGF-K with HGCN as backbone, (4) HKGF1w/oC, which removes SC-FC coupling and concatenates HKGCN-extracted fMRI and DTI features, (5) HKGF2w/oC that removes SC-FC coupling and concatenates HKGAT-extracted fMRI and DTI features, and (6) HKGF1w/oH and HKGF2w/oH that use MLP (with 2 fully-connected layers) instead of HNN as a predictor.
As reported in Table 4, HKGF1 and HKGF2 achieve the overall best performance compared to their variants. First, the three methods (HKGF-G, HKGF-A, and HKGF-K) show moderate performance degradation compared to HKGF1 and HKGF2 across both tasks. This suggests our method’s ability to model hierarchical dependencies among ROIs, an aspect not captured by HKGF-G and HKGF-A, as well as its superior capacity to capture cross-modality local-to-global interactions compared to HKGF-K. Second, the performance drop of HKGF1w/oC and HKGF2w/oC highlights the importance of explicit SC-FC coupling in feature fusion. Additionally, the slight performance decline observed in HKGF1w/oH and HKGF2w/oH implies that the HNN used in HKGF1 and HKGF2 may better align with the hyperbolic geometry than a standard MLP predictor.
TABLE 4.
Performance (%) of the proposed methods (i.e., HKGF1 and HKGF2) and their ablated variants in two prediction tasks.
| Method | SMC vs. CN Classification on ADNI | | | | | | | ANI vs. CN Classification on HAND | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | AUC | ACC | F1 | BAC | SEN | SPE | PRE | AUC | ACC | F1 | BAC | SEN | SPE | PRE |
| HKGF-G | 74.27±2.23 | 74.69±3.93 | 75.99±2.87 | 75.50±3.05 | 77.30±4.08 | 73.70±6.46 | 79.85±3.43 | 63.23±3.98 | 67.72±3.98 | 65.10±8.62 | 67.84±3.12 | 64.31±13.86 | 71.38±8.19 | 72.88±3.02 |
| HKGF-A | 79.65±2.24 | 77.15±2.35 | 78.13±1.79 | 77.14±1.83 | 77.74±3.48 | 76.53±5.23 | 80.91±3.10 | 65.73±2.74 | 66.44±1.90 | 62.58±4.93 | 66.64±0.83 | 63.48±10.24 | 69.80±9.71 | 71.81±7.21 |
| HKGF-K | 72.15±5.68 | 74.25±4.74 | 74.98±6.02 | 74.41±4.16 | 73.00±9.46 | 75.82±4.30 | 81.53±2.24 | 65.39±2.59 | 64.66±2.69 | 63.90±3.59 | 64.69±2.52 | 66.11±3.39 | 63.27±1.95 | 65.98±1.54 |
| HKGF1w/oC | 68.15±12.23 | 67.62±5.35 | 68.75±6.30 | 68.06±5.40 | 65.00±10.87 | 71.11±12.86 | 74.72±8.69 | 60.64±4.90 | 64.38±2.60 | 64.22±5.78 | 64.47±2.62 | 70.33±9.09 | 58.61±5.71 | 64.37±2.73 |
| HKGF1w/oH | 73.24±2.87 | 75.50±3.69 | 76.22±2.50 | 75.25±3.31 | 77.20±3.00 | 73.31±7.16 | 78.97±3.21 | 63.54±4.62 | 64.80±3.23 | 61.80±7.07 | 64.87±3.15 | 63.04±11.80 | 66.70±8.77 | 68.68±5.22 |
| HKGF2w/oC | 78.00±2.01 | 77.71±2.14 | 79.86±2.09 | 78.17±2.50 | 82.28±4.05 | 74.07±4.55 | 80.76±3.38 | 67.41±1.48 | 67.02±2.58 | 66.97±6.14 | 66.61±2.12 | 73.01±11.68 | 60.20±10.59 | 66.68±5.65 |
| HKGF2w/oH | 79.72±0.90 | 80.35±0.14 | 80.06±1.06 | 79.70±0.67 | 77.62±3.71 | 81.78±4.26 | 85.56±3.03 | 69.39±1.70 | 67.92±1.24 | 65.43±5.85 | 68.17±1.06 | 67.51±12.70 | 68.83±10.76 | 71.67±8.95 |
| HKGF1 (Ours) | 76.22±2.31 | 76.30±1.99 | 77.59±2.54 | 78.10±2.13 | 78.34±4.77 | 77.87±3.15 | 81.32±1.99 | 68.83±2.55 | 71.53±2.59 | 69.72±5.15 | 71.54±2.53 | 71.30±8.73 | 71.78±4.18 | 73.62±1.72 |
| HKGF2 (Ours) | 80.42±2.25 | 81.26±2.62 | 82.65±2.64 | 80.58±2.11 | 84.81±4.08 | 76.35±3.97 | 81.75±3.11 | 69.74±1.77 | 71.86±2.39 | 68.73±2.54 | 72.36±1.40 | 70.45±7.57 | 74.27±7.64 | 76.45±4.99 |
5.2. Hyperparameter Analysis
We further investigate the influence of two key hyperparameters in HKGCN and HKGAT: (1) the curvature c, which determines the geometry of the hyperbolic space, and (2) the rescaling factor, which scales the cosine function. To assess their impact, we conduct experiments on HAND for ANI vs. CN classification. In Fig. 5, we report the AUC and ACC of HKGF1 and HKGF2 under different settings of these two hyperparameters. The results reveal that the curvature parameter c has a relatively minor effect on the two models’ performance in terms of ACC and AUC, suggesting that our methods are robust to variations in the underlying geometric space. In contrast, the rescaling factor has a more noticeable impact, although the performance remains stable within a fairly wide range in both cases, indicating a reasonable interval for tuning without degrading predictive performance.
Fig. 5.

Influence of the two hyperparameters ( and ) on the performance of (a) HKGF1 and (b) HKGF2 for ANI vs. CN classification on HAND.
5.3. Influence of Transfer Learning
To evaluate the effectiveness of our transfer learning strategy for model pretraining, we compare the proposed models (i.e., HKGF1 and HKGF2) with their respective variants (denoted HKGF1w/oP and HKGF2w/oP), which are trained from scratch on the target data without any pretraining. The results of these methods in ANI vs. CN classification on HAND and SMC vs. CN classification on ADNI are reported in Table 5. Both HKGF1 and HKGF2 outperform their corresponding variants across most evaluation metrics. For instance, HKGF1 improves the AUC by more than 5% compared to HKGF1w/oP on HAND, highlighting the effectiveness of our transfer learning approach. These results suggest that leveraging large-scale auxiliary data (as we do in this work) substantially enhances model generalization, especially when the amount of target data is limited.
TABLE 5.
Results(%) of the proposed methods (i.e., HKGF1 and HKGF2) and their variants without model pretraining in two prediction tasks.
| Method | SMC vs. CN Classification on ADNI | | | | | | | ANI vs. CN Classification on HAND | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | AUC | ACC | F1 | BAC | SEN | SPE | PRE | AUC | ACC | F1 | BAC | SEN | SPE | PRE |
| HKGF1w/oP | 73.75±2.68 | 76.29±2.14 | 76.69±5.15 | 75.62±1.87 | 77.77±12.56 | 73.48±11.59 | 79.08±4.87 | 63.62±3.48 | 68.03±1.91 | 68.42±4.13 | 68.18±1.85 | 72.81±8.47 | 63.54±6.49 | 68.51±1.54 |
| HKGF1 (Ours) | 76.22±2.31 | 76.30±1.99 | 77.59±2.54 | 78.10±2.13 | 78.34±4.77 | 77.87±3.15 | 81.32±1.99 | 68.83±2.55 | 71.53±2.59 | 69.72±5.15 | 71.54±2.53 | 71.30±8.73 | 71.78±4.18 | 73.62±1.72 |
| HKGF2w/oP | 76.22±1.29 | 77.94±1.71 | 78.53±0.67 | 77.45±1.11 | 78.30±1.39 | 76.60±2.36 | 82.59±3.30 | 66.42±2.52 | 65.70±2.79 | 66.69±2.76 | 66.12±2.76 | 72.46±3.54 | 59.79±5.06 | 65.57±2.15 |
| HKGF2 (Ours) | 80.42±2.25 | 81.26±2.62 | 82.65±2.64 | 80.58±2.11 | 84.81±4.08 | 76.35±3.97 | 81.75±3.11 | 69.74±1.77 | 71.86±2.39 | 68.73±2.54 | 72.36±1.40 | 70.45±7.57 | 74.27±7.64 | 76.45±4.99 |
5.4. Model Generalizability Evaluation
The proposed HKGF is a general framework that can be employed to fuse multiple modalities. To further validate its generalization ability on other modalities, we conduct a new experiment to perform SMC vs. CN classification using paired resting-state fMRI and arterial spin labeling (ASL) MRI data of 29 SMC and 15 CN subjects from ADNI. The pre-processing of the fMRI data is introduced in Section 4.1.2. We preprocess the ASL data using the ASLtbx toolbox [71], with details given in Supplementary Materials. From each preprocessed ASL MRI, we obtain a mean cerebral blood flow (CBF) map comprising 116 ROIs per subject, as defined by the AAL atlas. For each ROI, we extract a 107-dimensional radiomics feature vector from the CBF image using PyRadiomics [72]. Based on these regional features, we construct an ASL-based FC graph by treating each ROI as a node and computing the Pearson correlation between feature vectors of ROI pairs to define edge weights.
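The ASL graph construction described above can be sketched directly; the 116×107 shape reflects the 116 AAL ROIs and the 107-dimensional PyRadiomics vectors mentioned in the text:

```python
import numpy as np

def asl_graph(radiomics):
    """ASL-based graph: one node per ROI, edge weight = Pearson correlation
    between the radiomics feature vectors of the two ROIs.

    radiomics : (N, F) array, one F-dim radiomics vector per ROI.
    """
    a = np.corrcoef(radiomics)   # rows are ROI feature vectors -> (N, N)
    np.fill_diagonal(a, 0.0)     # drop self-connections
    return a
```

The resulting graph is then paired with the fMRI-based FC graph as input to HKGF, exactly as the DTI-based SC graph was in the main experiments.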
For a fair comparison, all the competing deep learning methods and our HKGF share the same fMRI and ASL graph inputs, while three conventional machine learning methods (i.e., SVM, RF, and XGBoost) employ each subject’s concatenated fMRI-based FC network feature and ASL-based radiomics features as input. The results achieved by different methods in SMC vs. CN classification on ADNI with fMRI and ASL data are reported in Table 6. This table suggests that our HKGF1 and HKGF2 outperform other methods across most metrics. HKGF2 achieves the highest AUC (76.85%) and ACC (79.22%), indicating robust performance in distinguishing SMC subjects and CNs. These results demonstrate that our HKGF framework generalizes effectively across diverse multimodal scenarios.
TABLE 6.
Generalization evaluation results (%, mean±standard deviation) of different methods in SMC vs. CN classification on ADNI with fMRI and ASL data.
| Method | AUC | ACC | F1 | BAC | SEN |
|---|---|---|---|---|---|
| SVM | 64.59±10.60 | 66.50±9.17 | 69.10±13.67 | 62.89±9.22 | 70.10±14.98 |
| RF | 74.96±4.14 | 75.61±4.90 | 79.68±4.32 | 75.05±6.15 | 77.77±5.24 |
| XGBoost | 64.61±5.19 | 64.67±3.08 | 65.48±6.48 | 66.69±2.77 | 60.70±7.38 |
| GCN-EF | 70.18±9.06 | 71.33±7.41 | 74.71±5.55 | 72.51±4.36 | 75.69±4.93 |
| GCN-LF | 62.11±4.17 | 65.56±4.02 | 70.33±4.45 | 63.80±5.22 | 68.93±7.55 |
| GAT-EF | 74.68±6.14 | 77.50±2.68 | 78.97±3.38 | 81.44±2.87 | 70.22±4.83 |
| GAT-LF | 75.14±4.76 | 78.83±4.04 | 80.72±2.97 | 80.40±2.98 | 73.46±6.41 |
| Transformer-EF | 73.10±7.98 | 76.17±4.04 | 78.14±3.95 | 78.20±4.16 | 72.39±4.34 |
| Transformer-LF | 72.78±7.41 | 74.39±8.84 | 75.79±10.45 | 76.81±9.27 | 69.63±14.35 |
| GraphSAGE-EF | 73.21±4.80 | 76.06±3.48 | 80.80±3.34 | 74.21±5.25 | 78.43±6.35 |
| GraphSAGE-LF | 66.82±8.01 | 70.78±7.42 | 72.41±12.10 | 70.34±5.91 | 73.69±17.75 |
| GIN-EF | 67.16±5.02 | 75.44±6.34 | 76.74±9.35 | 76.86±3.37 | 72.40±11.39 |
| GIN-LF | 68.43±8.79 | 74.28±7.98 | 72.35±12.35 | 78.27±5.48 | 66.67±15.41 |
| BrainNetCNN-EF | 61.82±12.12 | 66.61±5.71 | 68.17±8.40 | 66.70±3.33 | 65.08±10.91 |
| BrainNetCNN-LF | 73.35±6.85 | 69.56±3.48 | 70.96±4.52 | 72.07±3.61 | 62.47±4.81 |
| BrainGNN-EF | 64.22±11.03 | 64.39±10.30 | 63.44±16.00 | 67.82±8.38 | 62.30±16.02 |
| BrainGNN-LF | 72.05±5.09 | 74.11±6.67 | 77.52±7.22 | 73.72±6.96 | 73.77±7.76 |
| HGCN-EF | 65.29±9.27 | 63.61±4.20 | 63.63±8.03 | 67.27±3.14 | 55.87±8.29 |
| HGCN-LF | 67.69±4.38 | 66.39±4.12 | 68.82±6.56 | 68.93±2.90 | 60.53±9.72 |
| HKGF1 (Ours) | 75.96±2.37 | 76.78±3.44 | 78.70±4.82 | 80.50±1.07 | 74.66±10.01 |
| HKGF2 (Ours) | 76.85±3.78 | 79.22±4.27 | 81.12±5.32 | 79.69±4.36 | 75.04±8.77 |
5.5. Computation Complexity Analysis
We further compare the computational complexity of the proposed HKGCN and HKGAT with the three most relevant methods (i.e., GCN, GAT, and HGCN). We measure the number of trainable parameters and the total number of floating-point operations (FLOPs) in one forward pass when encoding each functional connectivity graph from resting-state fMRI and each structural connectivity graph from DTI for ANI vs. CN classification on the HAND cohort. The results, obtained on an NVIDIA RTX 4090 GPU and reported in Table 7, reveal several key observations. Our HKGCN is more efficient than the state-of-the-art HGCN, achieving significantly lower computational cost with the same number of trainable parameters. Recalling the classification results in Tables 2–3, HKGCN also consistently outperforms HGCN while requiring fewer resources, demonstrating its effectiveness and scalability. In addition, our HKGAT maintains nearly the same FLOPs and exactly the same number of trainable parameters as the standard GAT, and our HKGCN likewise matches the conventional GCN in both FLOPs and parameter count. This indicates that introducing a hyperbolic kernel into GAT and GCN does not increase computational complexity, yet significantly boosts their performance.
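To see why the kernel adds so little overhead, note that a graph convolution's forward cost is dominated by the sparse neighborhood aggregation and the dense feature transform. The following is a back-of-the-envelope sketch of the per-layer multiply-accumulate (MAC) count; all dimensions in the usage example are illustrative assumptions, not the paper's actual hidden sizes.

```python
def gcn_layer_macs(n_nodes, n_edges, d_in, d_out):
    """Rough multiply-accumulate count for one GCN layer.

    Sparse aggregation (A_hat @ X) costs about n_edges * d_in MACs, and
    the feature transform (X @ W) costs n_nodes * d_in * d_out MACs.
    The exact figures in Table 7 also depend on normalization, bias
    terms, and implementation details, so this is only an estimate.
    """
    return n_edges * d_in + n_nodes * d_in * d_out

# Illustrative numbers only: 116 AAL ROIs, a 50%-dense FC graph, and an
# assumed 116 -> 64 feature mapping (hidden sizes are not given here).
macs = gcn_layer_macs(n_nodes=116, n_edges=116 * 115 // 2, d_in=116, d_out=64)
```

Because the hyperbolic kernel only adds a cheap elementwise map on top of these two terms, the dominant costs are unchanged, which is consistent with the near-identical FLOPs of GCN vs. HKGCN and GAT vs. HKGAT in Table 7.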
TABLE 7.
Comparison of computational costs for five methods in encoding a single FC graph from resting-state fMRI and a single SC graph from DTI for ANI vs. CN classification on the HAND dataset.
| Method | fMRI Param (K) | fMRI FLOPs (MMac) | DTI Param (K) | DTI FLOPs (MMac) |
|---|---|---|---|---|
| GCN [45] | 11.65 | 6.16 | 26.50 | 9.61 |
| HGCN [17] | 11.65 | 8.95 | 26.50 | 15.84 |
| HKGCN (Ours) | 11.65 | 6.22 | 26.50 | 9.67 |
| GAT [46] | 46.72 | 198.22 | 106.11 | 273.14 |
| HKGAT (Ours) | 46.72 | 198.89 | 106.11 | 273.82 |
Param: parameter; FLOP: floating-point operation; MMac: million multiply-accumulate operations.
5.6. Influence of Adjacency Matrix Sparsity
In the main experiments, we empirically retained the top 50% of the strongest edges in each functional connectivity adjacency matrix, as this configuration yielded stable performance in preliminary evaluations. To further assess robustness, we conduct sensitivity analyses with sparsity ratios from 10% to 80% in increments of 10%. In this experiment, we report the results of HKGF1 and HKGF2 under different sparsity ratios in distinguishing SMC from CN subjects on the ADNI dataset. As shown in Fig. 6, for HKGF1, both ACC and AUC remain relatively stable across sparsity levels, with the best performance observed around 50%. In contrast, HKGF2 improves markedly as the sparsity ratio increases from 10% to 30%, after which the results stabilize at high ACC and AUC values. These findings suggest that HKGF2 benefits from moderate edge retention and is robust once sufficient edges are kept, while HKGF1 is less sensitive to sparsity variations overall. More discussions can be found in the Supplementary Materials.
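The edge-retention step above can be sketched as follows. Ranking edges by absolute weight over the upper triangle is our assumption about the exact thresholding rule; the paper only specifies the retained fraction.

```python
import numpy as np

def sparsify_fc(adj, keep_ratio=0.5):
    """Retain only the strongest edges of a symmetric FC adjacency matrix.

    keep_ratio=0.5 matches the main-experiment setting (top 50% of
    edges). Edges are ranked by |weight| over the upper triangle, which
    is an assumed detail of the thresholding rule.
    """
    iu = np.triu_indices_from(adj, k=1)
    weights = np.abs(adj[iu])
    k = int(round(keep_ratio * weights.size))
    if k == 0:
        return np.zeros_like(adj)
    cutoff = np.sort(weights)[-k]            # k-th largest |weight|
    mask = np.zeros_like(adj, dtype=bool)
    mask[iu] = weights >= cutoff
    mask |= mask.T                           # keep the matrix symmetric
    return np.where(mask, adj, 0.0)
```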
Fig. 6.

Results of the proposed HKGF1 (a) and HKGF2 (b) with different functional connectivity adjacency matrix sparsity ratios in the task of SMC vs. CN classification on the ADNI dataset with fMRI and DTI data.
5.7. Limitations and Future Work
Several limitations should be acknowledged. First, although we visualize the learned features and highlight discriminative structural and functional connectivity patterns in the experiments, a potential limitation of the proposed HKGF framework is the reduced interpretability of its representations due to the complexity of hyperbolic geometry. Future work could incorporate attention-based explanation mechanisms [55] or visualization tools in hyperbolic space to enhance model interpretability. Second, HKGF requires complete multimodal data (e.g., paired DTI and fMRI) and thus cannot handle cases with missing modalities. Future work will explore graph data imputation techniques [73] to handle missing data. Third, for multisite datasets such as ADNI, variability due to scanner and site effects may impact model generalization. Future work will explore harmonization strategies [74, 75] on functional and structural connectivity features to reduce site variability and enhance generalization.
6. Conclusion
This paper introduces HKGF, a novel framework for neurocognitive decline analysis using multimodal neuroimaging data. The HKGF embeds brain functional/structural connectivity graphs into hyperbolic space via a family of novel hyperbolic kernel graph neural networks, thereby capturing local and global dependencies while preserving the hierarchical structure of brain networks. A cross-modality coupling module further enhances the fusion of DTI and fMRI. Experimental results demonstrate that HKGF consistently outperforms state-of-the-art methods in predicting neurocognitive decline, underscoring its effectiveness and potential in objectively quantifying brain connectivity changes associated with neurocognitive decline.
Supplementary Material
Acknowledgment
This research was supported in part by NIH grants (Nos. R01AG073297, RF1AG082938, R01EB035160, and R01NS134849). The authors would like to thank Ms. Biying Xiu for her assistance with ASL data processing in this project, and acknowledge Drs. Wei Wang and Hong-Jun Li, Yuanyuan Wang, Yu Qi, Shuai Han, Xire Aili, and Yuxun Gao for their help with imaging data collection in the HIV-Associated Neurocognitive Disorder (HAND) study. A part of the data is from ADNI. The ADNI and HAND investigators provide data but are not involved in data processing, analysis, and writing. A list of ADNI investigators is accessible online.
Biographies

Meimei Yang received her PhD degree in Software Engineering at Southeast University, Nanjing, China, in 2024. She is currently a Postdoc fellow in the Department of Radiology and BRIC at the University of North Carolina at Chapel Hill (2024-present). Her current research primarily focuses on machine learning and neuroimaging analysis.

Yongheng Sun is currently a Research Assistant at the University of North Carolina at Chapel Hill (2025-present). His research focuses on pattern recognition and medical image analysis.

Qianqian Wang is currently pursuing her Ph.D. degree in Biomedical Engineering at the University of North Carolina at Chapel Hill (2025-present). Her research focuses on biomedical data analysis and machine learning.

Andrea Bozoki is an experienced academic clinician. Her research focuses on clinical trials and multimodal imaging biomarker development and validation. She has demonstrated expertise in obtaining and completing sponsored clinical trials, and in creating a revenue-positive multidisciplinary program for the care of cognitive disorders.

Maureen Kohi received her medical degree from New York Medical College and completed her Diagnostic Radiology residency at the University of California, San Francisco. She has extensive clinical experience in detecting imaging biomarkers and applying imaging techniques in clinical decision-making.

Mingxia Liu (Senior Member, IEEE) received the Ph.D. degree from Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 2015. Her current research interests include machine learning, pattern recognition, and biomedical data analysis.
Contributor Information
Meimei Yang, Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
Yongheng Sun, Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
Qianqian Wang, Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
Andrea Bozoki, Department of Neurology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
Maureen Kohi, Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
Mingxia Liu, Department of Radiology and Biomedical Research Imaging Center (BRIC), University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
References
- [1] Mulkern RV, Davis PE, Haker SJ, Estepar RSJ, Panych LP, Maier SE, and Rivkin MJ, "Complementary aspects of diffusion imaging and fMRI; I: Structure and function," Magnetic Resonance Imaging, vol. 24, no. 4, pp. 463–474, 2006.
- [2] Basser PJ, Mattiello J, and LeBihan D, "MR diffusion tensor spectroscopy and imaging," Biophysical Journal, vol. 66, no. 1, pp. 259–267, 1994.
- [3] Biswal B, Zerrin Yetkin F, Haughton VM, and Hyde JS, "Functional connectivity in the motor cortex of resting human brain using echo-planar MRI," Magnetic Resonance in Medicine, vol. 34, no. 4, pp. 537–541, 1995.
- [4] Fox MD and Raichle ME, "Spontaneous fluctuations in brain activity observed with functional magnetic resonance imaging," Nature Reviews Neuroscience, vol. 8, no. 9, pp. 700–711, 2007.
- [5] Detre JA and Alsop DC, "Perfusion magnetic resonance imaging with continuous arterial spin labeling: Methods and clinical applications in the central nervous system," European Journal of Radiology, vol. 30, no. 2, pp. 115–124, 1999.
- [6] Alsop DC, Detre JA, Golay X, Günther M, Hendrikse J, Hernandez-Garcia L, Lu H, MacIntosh BJ, Parkes LM, Smits M et al., "Recommended implementation of arterial spin-labeled perfusion MRI for clinical applications: A consensus of the ISMRM perfusion study group and the European consortium for ASL in dementia," Magnetic Resonance in Medicine, vol. 73, no. 1, pp. 102–116, 2015.
- [7] Zong Y, Zuo Q, Ng MK-P, Lei B, and Wang S, "A new brain network construction paradigm for brain disorder via diffusion-based graph contrastive learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, pp. 10389–10403, 2024.
- [8] Zhu D, Zhang T, Jiang X, Hu X, Chen H, Yang N, Lv J, Han J, Guo L, and Liu T, "Fusing DTI and fMRI data: A survey of methods and applications," NeuroImage, vol. 102, pp. 184–191, 2014.
- [9] Zhou J, Greicius MD, Gennatas ED, Growdon ME, Jang JY, Rabinovici GD, Kramer JH, Weiner M, Miller BL, and Seeley WW, "Divergent network connectivity changes in behavioural variant frontotemporal dementia and Alzheimer's disease," Brain, vol. 133, no. 5, pp. 1352–1367, 2010.
- [10] Yeo BTT, Krienen FM, Sepulcre J, Sabuncu MR, Lashkari D, Hollinshead M, Roffman JL, Smoller JW, Zöllei L, Polimeni JR, Fischl B, Liu H, and Buckner RL, "The organization of the human cerebral cortex estimated by intrinsic functional connectivity," Journal of Neurophysiology, vol. 106, no. 3, pp. 1125–1165, 2011.
- [11] Margulies DS, Ghosh S, Goulas A, Falkiewicz M, Huntenburg JM, Langs G, Bezgin G, Eickhoff SB, Castellanos FX, Petrides M, Jefferies E, and Smallwood J, "Situating the default-mode network along a principal gradient of macroscale cortical organization," Proceedings of the National Academy of Sciences, vol. 113, no. 44, pp. 12574–12579, 2016.
- [12] Meunier D, Lambiotte R, Fornito A, Ersche K, and Bullmore E, "Hierarchical modularity in human brain functional networks," Frontiers in Neuroinformatics, vol. 3, p. 37, 2009.
- [13] Betzel RF, Byrge L, He Y, Goni J, Zuo XN, and Sporns O, "Changes in structural and functional connectivity among resting-state networks across the human lifespan," NeuroImage, vol. 102, pp. 345–357, 2014.
- [14] Zhou C, Zemanova L, Zamorá G, Hilgetag CC, and Kurths J, "Hierarchical organization unveiled by functional connectivity in complex brain networks," Physical Review Letters, vol. 97, no. 23, p. 238103, 2006.
- [15] Chen ZJ, He Y, Rosa-Neto P, Germann J, and Evans AC, "Revealing modular architecture of human brain structural networks by using cortical thickness from MRI," Cerebral Cortex, vol. 18, no. 10, pp. 2374–2381, 2008.
- [16] Hilgetag CC and Goulas A, "'Hierarchy' in the organization of brain networks," Philosophical Transactions of the Royal Society B, vol. 375, no. 1796, p. 20190319, 2020.
- [17] Chami I, Ying R, Ré C, and Leskovec J, "Hyperbolic graph convolutional neural networks," in NeurIPS, vol. 32, 2019.
- [18] Zhang J, Shi J, Stonnington C, Li Q, Gutman BA, Chen K, Reiman EM, Caselli R, Thompson PM, Ye J et al., "Hyperbolic space sparse coding with its application on prediction of Alzheimer's disease in mild cognitive impairment," in MICCAI. Springer, 2016, pp. 326–334.
- [19] Yu Z, Nguyen T, Gal Y, Ju L, Chandra SS, Zhang L, Bonnington P, Mar V, Wang Z, and Ge Z, "Skin lesion recognition with class-hierarchy regularized hyperbolic embeddings," in MICCAI. Springer, 2022, pp. 594–603.
- [20] Zhang L, Na S, Liu T, Zhu D, and Huang J, "Multimodal deep fusion in hyperbolic space for mild cognitive impairment study," in MICCAI. Springer, 2023, pp. 674–684.
- [21] Oishi K, Mielke MM, Albert M, Lyketsos CG, and Mori S, "DTI analyses and clinical applications in Alzheimer's disease," Journal of Alzheimer's Disease, vol. 26, no. s3, pp. 287–296, 2011.
- [22] Chen P-P, Wei X-Y, Tao L, Xin X, Xiao S-T, and He N, "Cerebral abnormalities in HIV-infected individuals with neurocognitive impairment revealed by fMRI," Scientific Reports, vol. 13, no. 1, p. 10331, 2023.
- [23] Ereira S, Waters S, Razi A, and Marshall CR, "Early detection of dementia with default-mode network effective connectivity," Nature Mental Health, vol. 2, no. 7, pp. 787–800, 2024.
- [24] Wang Q, Wang W, Fang Y, Yap P-T, Zhu H, Li H-J, Qiao L, and Liu M, "Leveraging brain modularity prior for interpretable representation learning of fMRI," IEEE Transactions on Biomedical Engineering, vol. 71, no. 8, pp. 2391–2401, 2024.
- [25] Broser PJ, Groeschel S, Hauser T-K, Lidzba K, and Wilke M, "Functional MRI-guided probabilistic tractography of cortico-cortical and cortico-subcortical language networks in children," NeuroImage, vol. 63, no. 3, pp. 1561–1570, 2012.
- [26] Iyer SP, Shafran I, Grayson D, Gates K, Nigg JT, and Fair DA, "Inferring functional connectivity in MRI using Bayesian network structure learning with a modified PC algorithm," NeuroImage, vol. 75, pp. 165–175, 2013.
- [27] Zhang Y, He X, Chan YH, Teng Q, and Rajapakse JC, "Multimodal graph neural network for early diagnosis of Alzheimer's disease from sMRI and PET scans," Computers in Biology and Medicine, vol. 164, p. 107328, 2023.
- [28] Bagheri A, Dehshiri M, Bagheri Y, Akhondi-Asl A, and Nadjar Araabi B, "Brain effective connectome based on fMRI and DTI data: Bayesian causal learning and assessment," PLoS One, vol. 18, no. 8, p. e0289406, 2023.
- [29] Kawahara J, Brown CJ, Miller SP, Booth BG, Chau V, Grunau RE, Zwicker JG, and Hamarneh G, "BrainNetCNN: Convolutional neural networks for brain networks; towards predicting neurodevelopment," NeuroImage, vol. 146, pp. 1038–1049, 2017.
- [30] Li X, Zhou Y, Dvornek NC, Zhang M, Gao S, Zhuang J, Scheinost D, Staib LH, Ventola P, and Duncan JS, "BrainGNN: Interpretable brain graph neural network for fMRI analysis," in CVPR, 2021, pp. 9260–9270.
- [31] Van der Maaten L and Hinton G, "Visualizing data using t-SNE," Journal of Machine Learning Research, vol. 9, no. 11, 2008.
- [32] Rolls ET, Huang C-C, Lin C-P, Feng J, and Joliot M, "Automated anatomical labelling atlas 3," NeuroImage, vol. 206, p. 116189, 2020.
- [33] Ratcliffe JG, Foundations of Hyperbolic Manifolds. Springer, 1994.
- [34] Cannon JW, Floyd WJ, Kenyon R, Parry WR et al., "Hyperbolic geometry," Flavors of Geometry, vol. 31, pp. 59–115, 1997.
- [35] Nickel M and Kiela D, "Poincaré embeddings for learning hierarchical representations," in NeurIPS, 2017, pp. 6338–6347.
- [36] Ganea O, Bécigneul G, and Hofmann T, "Hyperbolic neural networks," in NeurIPS, vol. 31, 2018, pp. 5345–5355.
- [37] Fang P, Harandi M, Lan Z, and Petersson L, "Poincaré kernels for hyperbolic representations," International Journal of Computer Vision, pp. 1–23, 2023.
- [38] Ungar AA, Analytic Hyperbolic Geometry and Albert Einstein's Special Theory of Relativity. World Scientific, 2008.
- [39] Ungar AA, "From Pythagoras to Einstein: The hyperbolic Pythagorean theorem," Foundations of Physics, vol. 28, no. 8, pp. 1283–1321, 1998.
- [40] Kim B-H, Ye JC, and Kim J-J, "Learning dynamic graph representation of brain connectome with spatio-temporal attention," in NeurIPS, vol. 34, 2021, pp. 4314–4327.
- [41] Cho Y and Saul L, "Kernel methods for deep learning," in NeurIPS, vol. 22, 2009.
- [42] Rahimi A and Recht B, "Random features for large-scale kernel machines," in NeurIPS, vol. 20, 2007.
- [43] Rudi A and Rosasco L, "Generalization properties of learning with random features," in NeurIPS, vol. 30, 2017.
- [44] Bach F, Learning Theory from First Principles. MIT Press, 2024.
- [45] Kipf TN and Welling M, "Semi-supervised classification with graph convolutional networks," in ICLR, 2017.
- [46] Veličković P, Cucurull G, Casanova A, Romero A, Liò P, and Bengio Y, "Graph attention networks," in ICLR, 2018.
- [47] Jack CR Jr, Bernstein MA, Fox NC, Thompson P, Alexander G, Harvey D, Borowski B, Britson PJ, Whitwell JL, Ward C et al., "The Alzheimer's disease neuroimaging initiative (ADNI): MRI methods," Journal of Magnetic Resonance Imaging, vol. 27, no. 4, pp. 685–691, 2008.
- [48] Di Martino A, Yan C-G, Li Q, Denio E, Castellanos FX, Alaerts K, Anderson JS, Assaf M, Bookheimer SY, Dapretto M et al., "The autism brain imaging data exchange: Towards a large-scale evaluation of the intrinsic brain architecture in autism," Molecular Psychiatry, vol. 19, no. 6, pp. 659–667, 2014.
- [49] Yan C-G, Chen X, Li L, Castellanos FX, Bai T-J, Bo Q-J, Cao J, Chen G-M, Chen N-X, Chen W et al., "Reduced default mode network functional connectivity in patients with recurrent major depressive disorder," Proceedings of the National Academy of Sciences, vol. 116, no. 18, pp. 9078–9083, 2019.
- [50] ADHD-200 Consortium, "The ADHD-200 Consortium: A model to advance the translational potential of neuroimaging in clinical neuroscience," Frontiers in Systems Neuroscience, vol. 6, p. 62, 2012.
- [51] Fang Y, Zhang J, Wang L, Wang Q, and Liu M, "ACTION: Augmentation and computation toolbox for brain network analysis with functional MRI," NeuroImage, vol. 305, p. 120967, 2025.
- [52] Pisner DA and Schnyer DM, "Support vector machine," in Machine Learning. Elsevier, 2020, pp. 101–121.
- [53] Breiman L, "Random forests," Machine Learning, vol. 45, pp. 5–32, 2001.
- [54] Chen T and Guestrin C, "XGBoost: A scalable tree boosting system," in KDD, 2016, pp. 785–794.
- [55] Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, and Polosukhin I, "Attention is all you need," in NeurIPS, vol. 30, 2017.
- [56] Hamilton W, Ying Z, and Leskovec J, "Inductive representation learning on large graphs," in NeurIPS, vol. 30, 2017.
- [57] Xu K, Hu W, Leskovec J, and Jegelka S, "How powerful are graph neural networks?" in ICLR, 2019.
- [58] Xia M, Wang J, and He Y, "BrainNet Viewer: A network visualization tool for human brain connectomics," PLoS One, vol. 8, no. 7, p. e68910, 2013.
- [59] Rodda J, Dannhauser T, Cutinha D, Shergill S, and Walker Z, "Subjective cognitive impairment: Functional MRI during a divided attention task," European Psychiatry, vol. 26, no. 7, pp. 457–462, 2011.
- [60] Woolgar A, Parr A, Cusack R, Thompson R, Nimmo-Smith I, Torralva T, Roca M, Antoun N, Manes F, and Duncan J, "Fluid intelligence loss linked to restricted regions of damage within frontal and parietal cortex," Proceedings of the National Academy of Sciences, vol. 107, no. 33, pp. 14899–14902, 2010.
- [61] Jamali M, Grannan B, Haroush K, Moses ZB, Eskandar EN, Herrington T, Patel S, and Williams ZM, "Dorsolateral prefrontal neurons mediate subjective decisions and their variation in humans," Nature Neuroscience, vol. 22, no. 6, pp. 1010–1020, 2019.
- [62] Janssen MA, Hinne M, Janssen RJ, van Gerven MA, Steens SC, Góraj B, Koopmans PP, and Kessels RP, "Resting-state subcortical functional connectivity in HIV-infected patients on long-term cART," Brain Imaging and Behavior, vol. 11, pp. 1555–1560, 2017.
- [63] Wang H, Li R, Zhou Y, Wang Y, Cui J, Nguchu BA, Qiu B, Wang X, and Li H, "Altered cerebro-cerebellum resting-state functional connectivity in HIV-infected male patients," Journal of Neurovirology, vol. 24, pp. 587–596, 2018.
- [64] Wright PW, Pyakurel A, Vaida FF, Price RW, Lee E, Peterson J, Fuchs D, Zetterberg H, Robertson KR, Walter R et al., "Putamen volume and its clinical and neurological correlates in primary HIV infection," AIDS, vol. 30, no. 11, pp. 1789–1794, 2016.
- [65] López-Villegas D, Lenkinski RE, and Frank I, "Biochemical changes in the frontal lobe of HIV-infected individuals detected by magnetic resonance spectroscopy," Proceedings of the National Academy of Sciences, vol. 94, no. 18, pp. 9854–9859, 1997.
- [66] Heaton R, Clifford D, Franklin D Jr, Woods S, Ake C, Vaida F, Ellis R, Letendre S, Marcotte T, Atkinson J et al., "HIV-associated neurocognitive disorders persist in the era of potent antiretroviral therapy: CHARTER Study," Neurology, vol. 75, no. 23, pp. 2087–2096, 2010.
- [67] Ellis R, Langford D, and Masliah E, "HIV and antiretroviral therapy in the brain: Neuronal injury and repair," Nature Reviews Neuroscience, vol. 8, no. 1, pp. 33–44, 2007.
- [68] Milanini B and Valcour V, "Differentiating HIV-associated neurocognitive disorders from Alzheimer's disease: An emerging issue in geriatric NeuroHIV," Current HIV/AIDS Reports, vol. 14, no. 4, pp. 123–132, 2017.
- [69] Kim G-W, Park K, Kim Y-H, and Jeong G-W, "Increased hippocampal-inferior temporal gyrus white matter connectivity following donepezil treatment in patients with early Alzheimer's disease: A diffusion tensor probabilistic tractography study," Journal of Clinical Medicine, vol. 12, no. 3, p. 967, 2023.
- [70] Jacobs HI, Hopkins DA, Mayrhofer HC, Bruner E, Van Leeuwen FW, Raaijmakers W, and Schmahmann JD, "The cerebellum in Alzheimer's disease: Evaluating its role in cognitive decline," Brain, vol. 141, no. 1, pp. 37–47, 2018.
- [71] Wang Z, Aguirre GK, Rao H, Wang J, Fernández-Seara MA, Childress AR, and Detre JA, "Empirical optimization of ASL data analysis using an ASL data processing toolbox: ASLtbx," Magnetic Resonance Imaging, vol. 26, no. 2, pp. 261–269, 2008.
- [72] Van Griethuysen JJ, Fedorov A, Parmar C, Hosny A, Aucoin N, Narayan V, Beets-Tan RG, Fillion-Robin J-C, Pieper S, and Aerts HJ, "Computational radiomics system to decode the radiographic phenotype," Cancer Research, vol. 77, no. 21, pp. e104–e107, 2017.
- [73] Han X, Jiang Z, Liu N, and Hu X, "G-Mixup: Graph data augmentation for graph classification," in ICML. PMLR, 2022, pp. 8230–8248.
- [74] Yu M, Linn KA, Cook PA, Phillips ML, McInnis M, Fava M, Trivedi MH, Weissman MM, Shinohara RT, and Sheline YI, "Statistical harmonization corrects site effects in functional connectivity measurements from multi-site fMRI data," Human Brain Mapping, vol. 39, no. 11, pp. 4213–4227, 2018.
- [75] Pomponio R, Erus G, Habes M, Doshi J, Srinivasan D, Mamourian E, Bashyam V, Nasrallah IM, Satterthwaite TD, Fan Y et al., "Harmonization of large MRI datasets for the analysis of brain imaging patterns throughout the lifespan," NeuroImage, vol. 208, p. 116450, 2020.