Abstract
Acquiring resting-state fMRI of early developing children under a single condition is difficult, which has led to a dedicated protocol: younger infants are scanned during sleep, whereas older children are scanned while awake. However, the markedly different brain activity in the sleep and awake states raises a new challenge, awake-to-sleep connectome prediction (translation), which remains unexplored despite its importance for the longitudinally consistent delineation of brain functional development. Because of data scarcity and the large differences between natural images and geometric data (e.g., brain connectomes), existing methods tailored to image translation generally fail to predict the functional connectome from awake to sleep. To fill this gap, we propose a reference-relation guided autoencoder with deep CCA restriction (R2AE-dCCA) for awake-to-sleep connectome prediction. Specifically, 1) a reference-autoencoder (RAE) is proposed to realize guided generation from the source domain to the target domain. The limited paired data are greatly augmented by including the combinations of all the age-restricted neighboring subjects as references, while the target-specific pattern is fully learned; 2) a relation network is designed and embedded into the RAE, which uses the similarity in the source domain to determine the belief-strength of each reference during prediction; 3) to ensure that the relation learned in the source domain can effectively guide generation in the target domain, a deep CCA restriction is further employed to maintain the neighboring relation during translation; 4) new validation metrics dedicated to connectome prediction are also proposed. Experimental results show that the proposed R2AE-dCCA achieves higher prediction accuracy and better preserves the modular structure of the brain functional connectome than state-of-the-art methods.
Keywords: Functional Connectome Prediction, rs-fMRI, Autoencoder
1. Introduction
During the first few years of life, the human brain undergoes exceptionally dynamic development that could largely shape later behavioral and cognitive performance [1–3]. Delineating the functional developmental trajectories through this stage with resting-state fMRI (rs-fMRI) is of great importance in understanding the normal brain and diagnosing neurodevelopmental disorders [4, 5]. However, acquiring rs-fMRI of early developing children under a unified condition (sleep or awake) poses unique challenges: 1) younger infants cannot be asked to stay awake while remaining still during scanning; 2) older children (>24 months) are difficult to persuade to sleep during the daytime, and brain activity in deep sleep at night differs significantly from the normal resting state. This dilemma generally leads to a dedicated protocol: younger infants are scanned during sleep, while older children are scanned awake and kept still by watching movies [4, 6]. However, the hemodynamic responses in the sleep and awake states differ substantially, and these differences are reflected in fMRI and in the corresponding brain functional connectome. Thus, to realize meaningful and consistent cross-age studies under different scan conditions, it is critical to predict the functional connectome obtained during sleep from that obtained while awake; to the best of our knowledge, this challenging problem remains unexplored. Because of the difficulties in image acquisition and recruitment, training a model for awake-to-sleep brain connectome prediction usually confronts the problem of data scarcity. Moreover, although domain translation based on deep generative adversarial networks has been successfully developed for image translation [7–9], these methods generally fail in predicting functional connectomes because of the large difference between natural images and geometric data (e.g., brain connectomes).
In addition, multi-view brain graph synthesis methods [10, 11] may not perform well in awake-to-sleep prediction, as they were designed for connectomes obtained from cortical morphological measures, which usually have smaller distribution differences between the source and target domains.
To address these issues, we propose a Reference-Relation guided AutoEncoder with deep Canonical Correlation Analysis restriction (R2AE-dCCA) for awake-to-sleep connectome prediction. First, a reference-autoencoder (RAE) is proposed to realize guided generation from the source domain to the target domain. During the training stage, reference-couples from the target domain are used to guide the prediction; they are constructed from the combinations of all the age-restricted neighboring subjects of the subject to be predicted. Merged in the latent space with the individualized patterns learned from the source domain, the reference-couples not only provide sleep-specific patterns but also greatly augment the limited data through random coupling. Then, a relation network is designed and embedded into the RAE; it learns, in the source domain, the similarity of the reference-couples to the subject to be predicted and determines the belief-strength of each reference during target connectome generation. To guarantee that the similarity in the source domain is maintained in the target domain, a deep CCA (Canonical Correlation Analysis) restriction is further employed in the coupled latent space to preserve the neighboring relation from the source domain to the target domain. Finally, in the testing stage, all the samples in the target domain are used as references, and the corresponding reference-strengths are co-determined by the relation network and the age distance. Our R2AE-dCCA was implemented on a developing children rs-fMRI dataset and compared with state-of-the-art methods. Its superiority was validated not only on overall accuracy but also on three newly proposed validation metrics dedicated to connectome prediction.
2. Method
2.1. Model description
The framework of our proposed model, reference-relation guided autoencoder with deep CCA restriction (R2AE-dCCA), is depicted in Fig. 1 and detailed below.
Our goal is to learn the mapping from the awake domain $X$ to the sleep domain $Y$, $f: X \to Y$, given paired samples $\{(x_i, y_i, Age_i) \mid x_i \in X, y_i \in Y\}$, $i = 1, \cdots, N$, where $N$ is the number of subjects, $Age_i$ is the scan age of subject $i$, and $x_i$ and $y_i$ are brain functional connectomes. Each connectome is represented by a functional connectivity matrix and is usually vectorized as its off-diagonal upper triangular elements for computation.
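The vectorization mentioned above can be sketched as follows. This is a minimal NumPy illustration, not the authors' code; the function names `vectorize_connectome` and `devectorize_connectome` are hypothetical.

```python
import numpy as np


def vectorize_connectome(fc: np.ndarray) -> np.ndarray:
    """Return the off-diagonal upper-triangular entries of a square matrix."""
    if fc.ndim != 2 or fc.shape[0] != fc.shape[1]:
        raise ValueError("expected a square connectivity matrix")
    rows, cols = np.triu_indices(fc.shape[0], k=1)  # k=1 skips the diagonal
    return fc[rows, cols]


def devectorize_connectome(vec: np.ndarray, n_rois: int) -> np.ndarray:
    """Rebuild a symmetric matrix with a zero diagonal from its vectorized form."""
    fc = np.zeros((n_rois, n_rois), dtype=vec.dtype)
    rows, cols = np.triu_indices(n_rois, k=1)
    fc[rows, cols] = vec
    fc[cols, rows] = vec  # mirror into the lower triangle
    return fc
```

For a matrix over n regions the vector has n(n-1)/2 entries, so a 116-region parcellation gives 6670 connections per connectome.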
There are four main steps in the training stage: (1) reference-couple and relation-couple construction based on the age-restricted neighborhood; (2) reference guided target connectome generation; (3) deep-CCA based source domain to target domain correlation; (4) relation guided fusion.
Reference-couples and relation-couples construction.
For each connectome $x_i$ in the awake domain, connectomes in the sleep domain $Y$ other than the paired $y_i$ can also provide rich information about what brain connectivity patterns look like during sleep. Since the infant brain undergoes exceptionally dynamic development in both structure and function during early childhood [1–3, 28, 29], we only leverage the subjects within a neighboring age range to guide the learning of the prediction. For a subject $i$, the age-restricted neighborhood of $i$ is defined as the set of subjects whose scan age lies within a threshold $\theta$ of $Age_i$, where $\theta$ is user-defined and set as 30 days in our experiments. Then, the relation-couples and reference-couples are constructed from pairs of neighboring subjects $j$ and $k$, as $(x_j, x_k)$ in the source domain and $(y_j, y_k)$ in the target domain, respectively.
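The couple construction can be sketched in a few lines of Python. The details below are assumptions made for illustration rather than the authors' definition: the neighborhood excludes subject i itself, the threshold is inclusive, and couples are unordered pairs of distinct neighbors. The function names are hypothetical.

```python
from itertools import combinations


def age_neighborhood(ages, i, theta=30.0):
    """Indices j != i whose scan age (in days) is within theta of subject i."""
    return [j for j, age in enumerate(ages)
            if j != i and abs(age - ages[i]) <= theta]


def build_couples(ages, i, theta=30.0):
    """All unordered index pairs (j, k) of distinct neighbors of subject i.

    Each pair indexes one relation-couple (x_j, x_k) in the source domain
    and one reference-couple (y_j, y_k) in the target domain.
    """
    return list(combinations(age_neighborhood(ages, i, theta), 2))
```

With m neighbors a subject yields m(m-1)/2 couples, which is how coupling enlarges the effective training set beyond the N paired scans.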
Encoding.
The inputs $x_i$, $x_j$, and $x_k$ are encoded by a multilayer perceptron, denoted as $E_X$, which serves as the encoder for the source domain and learns the individualized information, while $y_j$ and $y_k$ are encoded by another multilayer perceptron, $E_Y$, which serves as the encoder for the target domain and learns the domain-specific pattern. The outputs of the encoders are the latent variables of the five inputs.
Reference guided target connectome generation.
In the latent space, the latent variables of $y_j$ and $y_k$ encoded from the target domain are leveraged as the reference information to guide the generation of the target connectome. A multilayer perceptron, denoted as $G_{XY}$, is employed for the generation; its inputs are the concatenations of the individualized information from the source domain and the sleep-specific information from the target domain. This yields two predicted sleep connectomes for $x_i$, one based on the reference $y_j$ and one based on the reference $y_k$. Together with the encoders, $G_{XY}$ forms the reference autoencoder (RAE).
Relation guided fusion.
Since more than one prediction is obtained from the multiple reference-couples and the RAE, a relation network is further designed to guide the fusion. Specifically, according to the relation-couples, a multilayer perceptron, denoted as $R_X$, is embedded into the RAE and employed to learn the reference-strength that $y_j$ and $y_k$ should contribute to the prediction of $y_i$. That is, based on the latent variables of $x_i$, $x_j$, and $x_k$,
(1) |
Deep-CCA based source domain to target domain correlation.
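From Equation (1), the fused prediction is estimated under the assumption that the similarity relationship in the source domain is maintained in the target domain. Therefore, the correlation between the learned embeddings of the source domain and the target domain should be maximized during training, which ensures the effectiveness of the fusion by preserving the neighboring relationship across domains. Suppose $Z_x, Z_y \in \mathbb{R}^{l \times s}$, where $l$ is the dimension of the latent space, $s$ is the training batch size, and $Z_x$ and $Z_y$ are the column-wise concatenations of the latent variables encoded from the source domain and the target domain, respectively. Let $\bar{Z}_x = Z_x - \frac{1}{s} Z_x \mathbf{1}$ and $\bar{Z}_y = Z_y - \frac{1}{s} Z_y \mathbf{1}$ be the centered matrices, where $\mathbf{1}$ is an all-1s matrix, and let $\Sigma_{xy} = \frac{1}{s-1} \bar{Z}_x \bar{Z}_y^{\top}$, $\Sigma_{xx} = \frac{1}{s-1} \bar{Z}_x \bar{Z}_x^{\top} + \delta_1 I$, and $\Sigma_{yy} = \frac{1}{s-1} \bar{Z}_y \bar{Z}_y^{\top} + \delta_2 I$, where $\delta_1, \delta_2 > 0$ are constants that ensure $\Sigma_{xx}$ and $\Sigma_{yy}$ are positive definite. As in classical CCA, the total correlation of the top $m$ components of $Z_x$ and $Z_y$ is the sum of the top $m$ singular values of the matrix $T = \Sigma_{xx}^{-1/2} \Sigma_{xy} \Sigma_{yy}^{-1/2}$. In our case, $m$ is set as $l$, so the correlation of $Z_x$ and $Z_y$ is the matrix trace norm of $T$, i.e.,

$$\mathrm{corr}(Z_x, Z_y) = \lVert T \rVert_{tr} = \mathrm{tr}\left( \left( T^{\top} T \right)^{1/2} \right) \tag{2}$$

Taking this correlation as one term of the loss function, the embeddings of the source domain and target domain are required to be maximally correlated.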
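The total correlation described above, the sum of the singular values of T, can be computed directly. The NumPy sketch below assumes the standard deep CCA definitions (regularized covariances and T equal to the whitened cross-covariance); the default values of `delta1` and `delta2` and the function names are illustrative assumptions, not the paper's settings.

```python
import numpy as np


def _inv_sqrt(sym: np.ndarray) -> np.ndarray:
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(sym)
    return (vecs / np.sqrt(vals)) @ vecs.T  # V diag(vals^-1/2) V^T


def cca_total_correlation(zx, zy, delta1=1e-4, delta2=1e-4):
    """Trace norm of T for latent matrices of shape (l, s), one column per sample."""
    s = zx.shape[1]
    zx_c = zx - zx.mean(axis=1, keepdims=True)  # center each latent dimension
    zy_c = zy - zy.mean(axis=1, keepdims=True)
    sxy = zx_c @ zy_c.T / (s - 1)
    sxx = zx_c @ zx_c.T / (s - 1) + delta1 * np.eye(zx.shape[0])
    syy = zy_c @ zy_c.T / (s - 1) + delta2 * np.eye(zy.shape[0])
    t = _inv_sqrt(sxx) @ sxy @ _inv_sqrt(syy)
    # The trace norm is the sum of all singular values of T.
    return float(np.linalg.svd(t, compute_uv=False).sum())
```

When one embedding is an invertible linear map of the other, the value approaches the latent dimension l; for independent embeddings it approaches 0. In training, the negative of this quantity would typically be minimized.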
Adversarial loss.
To stabilize training, a distribution regularization is imposed on the latent space and realized by a shared discriminator $D$ [12]. Let $p(z)$ be the prior distribution imposed on the latent variable $z$, and let $q(z|x)$ be the encoding distribution. Training the autoencoder with distribution regularization requires the aggregated posterior distribution $q(z) = \int_x q(z|x)\, p_d(x)\, dx$ to match the predefined prior $p(z)$, where $p_d(x)$ is the distribution of the input data. Here, this regularization is realized by an adversarial procedure, which leads to a minimax problem, where
(3) |
(4) |
(5) |
Target connectome prediction loss.
L2 norm and Pearson’s correlation are adopted as our generation loss for the target connectome prediction:
(6) |
(7) |
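As an illustration of the two ingredients named above, the sketch below computes an L2 term and a Pearson-correlation term for a predicted connectome vector. The exact forms and weights of Equations (6) and (7) are not reproduced here; the mean squared error and the `1 - r` form are assumptions, and the function names are hypothetical.

```python
import numpy as np


def l2_loss(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Squared L2 distance averaged over connections (assumed form)."""
    return float(np.mean((y_true - y_pred) ** 2))


def pearson_loss(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """1 - r, so a perfectly correlated prediction gives 0 (assumed form)."""
    r = np.corrcoef(y_true, y_pred)[0, 1]
    return float(1.0 - r)
```

The two terms are complementary: the correlation term is invariant to a positive rescaling or shift of the prediction, while the L2 term penalizes exactly those errors.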
Full objective.
The objective functions to optimize $E_X$, $E_Y$, $G_{XY}$, and $D$ are written as:
(8) |
(9) |
where , $\lambda_1$, $\lambda_2$, and $\lambda_3$ are trade-off parameters. The model alternately updates $E_X$, $E_Y$, $G_{XY}$, and $D$ with and .
Testing stage.
For each $x_{test}$, all the connectomes $y_{Tr}$ in the training set are used as references to avoid a lack of variability, and the age of each reference connectome is taken into account in the relation guided fusion. With the RAE producing the corresponding sleep connectomes and the relation network providing the reference-strength of each reference, the final prediction of $y_{test}$ is estimated as
(10) |
where , is the Softmax function, and $(x_{Tr}, y_{Tr})$ is the pair of connectomes of the same training subject obtained while awake and during sleep, respectively.
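A softmax-weighted fusion of per-reference predictions can be sketched as below. How the relation score and the age distance are combined is not recoverable from the text here, so the subtraction of a scaled age gap, the `age_scale` parameter, and the function names are all assumptions made for illustration.

```python
import numpy as np


def softmax(scores: np.ndarray) -> np.ndarray:
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    e = np.exp(shifted)
    return e / e.sum()


def fuse_predictions(preds, relation_scores, age_gaps, age_scale=30.0):
    """Fuse per-reference predictions of shape (n_refs, n_connections).

    Assumed scoring: relation score minus the age gap (in days) divided by
    age_scale, so references far in age receive less weight.
    """
    scores = (np.asarray(relation_scores, dtype=float)
              - np.asarray(age_gaps, dtype=float) / age_scale)
    weights = softmax(scores)
    fused = weights @ np.asarray(preds, dtype=float)
    return fused, weights
```

Because the weights sum to one, the fused connectome is a convex combination of the per-reference predictions.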
2.2. Validation of functional connectome prediction
Although Pearson's correlation coefficient (r) and mean absolute error (MAE) are commonly used as evaluation metrics, they are general-purpose measures that do not reflect the characteristics of any specific application. Taking the practical requirements of functional connectome prediction into consideration, we propose three new metrics dedicated to its validation, i.e., the correlation of top percentile connections (Corrpercl), and the normalized variation of information (VIn) and normalized mutual information (MIn) of the induced modular structure.
Correlation of top percentile connections (Corrpercl).
In a functional connectome, the connections in the top percentile of strength are usually the focus of functional graph or network construction [13, 14]. Corrpercl is therefore Pearson's correlation coefficient computed only over those connections, i.e.,

$$\mathrm{Corr}_{percl} = \mathrm{corr}\left( y_{percl}, \hat{y}_{percl} \right) \tag{11}$$

where $y_{percl}$ and $\hat{y}_{percl}$ consist of the connections of the expected connectome $y$ and of the predicted connectome $\hat{y}$, respectively, at the positions where $y$ has top-percentile strength. The percentile was set as 95% in our experiments.
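A minimal NumPy sketch of this metric follows. It assumes that "top percentile" means connections at or above the given percentile of the expected connectome's signed strength (the top 5% for a 95% setting); the function name is hypothetical.

```python
import numpy as np


def corr_top_percentile(y_true, y_pred, percentile=95.0):
    """Pearson's r restricted to the strongest connections of y_true."""
    threshold = np.percentile(y_true, percentile)
    mask = y_true >= threshold  # positions selected from the expected connectome
    return float(np.corrcoef(y_true[mask], y_pred[mask])[0, 1])
```

Note that the mask is derived from the expected connectome only, so the same connections are compared no matter which method produced the prediction.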
Normalized variation of information (VIn) and mutual information (MIn).
Modular structure analysis based on graph theory is one of the most important analyses of functional brain networks [15, 16]. Here we use the capability of maintaining the modular structure of the expected connectome $y$ as one metric to validate the predicted connectome. Let $A = \{a_1, a_2, \cdots, a_c\}$ and $B = \{b_1, b_2, \cdots, b_c\}$ be the modular partitions induced by $y$ and the predicted connectome $\hat{y}$, respectively. The VIn and MIn [17] between $A$ and $B$ are defined as follows:
(12) |
(13) |
where is the number of connections in $y_i$, and $|\cdot|$ in Equations (12) and (13) represents the number of connections in the module. In our experiments, $A$ and $B$ are obtained by selecting the modular structure with the maximal between-class to within-class ratio among 100 repetitions of k-means clustering [18]. The number of clusters is set as 10.
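For two partitions given as label lists, the variation of information and mutual information of [17] can be computed from label counts. The normalizations below are assumptions, since Equations (12) and (13) are not reproduced here: VI is divided by log n, its upper bound, and MI is divided by the mean of the two entropies. The function names are hypothetical.

```python
import math
from collections import Counter


def _entropy(counts: Counter, n: int) -> float:
    return -sum(c / n * math.log(c / n) for c in counts.values())


def partition_scores(labels_a, labels_b):
    """Return (normalized VI, normalized MI) between two partitions."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two label lists of equal length >= 2")
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    h_a, h_b = _entropy(count_a, n), _entropy(count_b, n)
    mi = sum(c / n * math.log(c * n / (count_a[a] * count_b[b]))
             for (a, b), c in count_ab.items())
    vi = h_a + h_b - 2.0 * mi           # variation of information
    vi_n = vi / math.log(n)             # assumed normalization by log n
    mi_n = 2.0 * mi / (h_a + h_b) if h_a + h_b > 0 else 1.0
    return vi_n, mi_n
```

Identical partitions give VIn = 0 and MIn = 1, which matches the direction of the results: lower VIn and higher MIn indicate better preservation of the modular structure.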
3. Experiments
3.1. Data description
We verified the effectiveness of the proposed R2AE-dCCA model on a high-resolution resting-state fMRI (rs-fMRI) dataset including 20 paired sleep and awake scans from the UNC/UMN Baby Connectome Project [6]. All paired rs-fMRI data were acquired during natural sleep and during video watching on a 3T Siemens Prisma MRI scanner using a 32-channel head coil. T1-weighted and T2-weighted MR images were obtained with a resolution of 0.8×0.8×0.8 mm³. The rs-fMRI scans were acquired with TR/TE = 800/37 ms, FA = 80°, FOV = 220 mm, resolution = 2×2×2 mm³, and total volumes = 420 (5 min 47 sec). All structural and functional MR images were preprocessed by a state-of-the-art infant-tailored in-house pipeline [23–27]. Each rs-fMRI scan was finally parcellated based on the automated anatomical labeling template [19], yielding 116 anatomical regions of interest (ROIs). The functional connectivity matrix was derived by calculating Pearson's correlation coefficient between the average time series of each pair of ROIs. Fisher's r-to-z transformation was then applied to improve the normality of the functional connectivity.
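The last two steps, pairwise Pearson correlation of ROI time series followed by Fisher's r-to-z transformation, can be sketched with NumPy as below. Zeroing the diagonal and clipping before `arctanh` are illustrative choices to keep the output finite, not part of the described pipeline; the function name is hypothetical.

```python
import numpy as np


def functional_connectivity(ts: np.ndarray) -> np.ndarray:
    """Map ROI time series of shape (n_timepoints, n_rois) to a Fisher-z matrix."""
    r = np.corrcoef(ts, rowvar=False)    # (n_rois, n_rois) Pearson correlations
    np.fill_diagonal(r, 0.0)             # self-correlation of 1 would map to inf
    r = np.clip(r, -0.999999, 0.999999)  # guard against |r| == 1 off the diagonal
    return np.arctanh(r)                 # Fisher's r-to-z transformation
```

For 420 volumes and 116 ROIs, the input has shape (420, 116) and the output is a symmetric 116×116 matrix. Since z = arctanh(r) is unbounded, connection strengths above 1 are possible after the transformation.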
3.2. Validation of R2AE-dCCA
Using MAE, r, Corrpercl, VIn, and MIn as metrics, we compared the proposed R2AE-dCCA model with the following five methods by leave-one-out cross-validation: (1) connectome prediction with a linear model (GLM) [20]; (2) multi-kernel manifold learning (MKML) [21]; (3) CCA-based MKML (CCA-TSW) [10]; (4) Pixel2Pixel GAN [22]; (5) R2AE-dCCA without the relation network (R2AE-dCCA no R-Net).
In R2AE-dCCA, the encoders $E_X$ and $E_Y$ each consist of 3 densely connected layers of dimension (50, 50, 100) with (LeakyReLU, Sigmoid, Linear) as the corresponding activation functions. $G_{XY}$ consists of 4 densely connected layers of dimension (30, 30, 30, 30) with LeakyReLU as the activation function. The discriminator $D$ consists of 4 densely connected layers of dimension (50, 50, 25, 1) with LeakyReLU as the activation function of the first 3 layers and Sigmoid as the activation function of the last layer. R2AE-dCCA was implemented in PyTorch and optimized with Adamax at a fixed learning rate of 0.001. The batch size was set as 400, with $\lambda_1 = 0.1$, $\lambda_2 = 0.8$, and $\lambda_3 = 0.1$. Methods (4) and (5) share a similar architecture with R2AE-dCCA for fairness of comparison.

The means and standard deviations of the leave-one-out cross-validation are reported in Table 1. Our method achieves the lowest MAE and VIn and the highest r, Corrpercl, and MIn among all compared methods, indicating its superior performance. Fig. 2 shows the scatter plots of the expected and predicted connection strengths for a representative subject. Our method achieves better prediction especially for the connections with strength greater than 1, i.e., the top percentile connections. Fig. 3 shows how well the predicted connectome maintains the modular structure induced from the expected functional connectome of a representative subject. In Fig. 3, the order of the brain regions is the same in all subfigures, while the values are the corresponding connection strengths predicted by the different methods. The results obtained by our method show higher similarity to the ground truth. In summary, our R2AE-dCCA model outperformed the five comparison methods not only in overall prediction accuracy but also in maintaining the modular structure.
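The leave-one-out protocol behind the reported means and standard deviations can be sketched generically. Here `fit_predict` is a hypothetical callable standing in for any of the compared methods; only MAE and r are computed, and nothing below reproduces the reported numbers.

```python
import numpy as np


def leave_one_out(x, y, fit_predict):
    """Hold out each subject once; return (mean, std) of MAE and of r.

    x, y: arrays of shape (n_subjects, n_connections).
    fit_predict(x_train, y_train, x_test) -> predicted y for the held-out subject.
    """
    n = len(x)
    mae, r = [], []
    for i in range(n):
        train = np.arange(n) != i  # boolean mask excluding subject i
        y_hat = fit_predict(x[train], y[train], x[i])
        mae.append(np.mean(np.abs(y[i] - y_hat)))
        r.append(np.corrcoef(y[i], y_hat)[0, 1])
    return ((float(np.mean(mae)), float(np.std(mae))),
            (float(np.mean(r)), float(np.std(r))))
```

A trivial baseline such as `lambda xt, yt, xq: yt.mean(axis=0)`, which predicts the mean training sleep connectome, is a useful sanity check before plugging in a learned model.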
Table 1.
| Method | MAE (conventional) | r (conventional) | Corrpercl (connectome-specific) | VIn (connectome-specific) | MIn (connectome-specific) |
|---|---|---|---|---|---|
| GLM [20] | .272 ± .040 | .304 ± .086 | .302 ± .090 | .553 ± .055 | .213 ± .072 |
| MKML [21] | .257 ± .046 | .534 ± .081 | .542 ± .079 | .460 ± .044 | .338 ± .066 |
| CCA-TSW [10] | .247 ± .048 | .549 ± .083 | .546 ± .056 | .451 ± .057 | .349 ± .085 |
| Pixel2Pixel GAN [22] | .316 ± .088 | .426 ± .076 | .212 ± .093 | .448 ± .051 | .330 ± .070 |
| R2AE-dCCA no R-Net | .239 ± .050 | .592 ± .089 | .568 ± .095 | .442 ± .056 | .365 ± .066 |
| R2AE-dCCA (proposed) | .227 ± .050 | .614 ± .084 | .583 ± .010 | .427 ± .051 | .379 ± .062 |
4. Conclusion
In this paper, to fill the gap of awake-to-sleep connectome prediction for longitudinal studies of early brain functional development, we proposed a reference-relation guided autoencoder with deep CCA restriction (R2AE-dCCA). With the framework of reference guided generation and relation guided fusion, R2AE-dCCA achieves superior prediction accuracy by effectively augmenting the severely limited data, utilizing the domain-specific pattern from the target domain, and maintaining the neighboring relationship from the source domain. As a generalized model for connectome translation, our model and the proposed connectome-dedicated validation metrics have high potential in other fields related to connectome prediction.
Acknowledgments.
This work was partially supported by NIH grants (MH116225, MH117943, MH109773, and MH123202). This work also utilizes approaches developed by an NIH grant (1U01MH110274) and the efforts of the UNC/UMN Baby Connectome Project Consortium.
References
1. Lyall AE, Shi F, Geng XJ, Woolson S, Li G, Wang L, Hamer RM, Shen DG, and Gilmore JH: Dynamic Development of Regional Cortical Thickness and Surface Area in Early Childhood. Cerebral Cortex, 25(8), pp. 2204–2212 (2015).
2. Gilmore JH, Shi F, Woolson SL, Knickmeyer RC, Short SJ, Lin WL, Zhu HT, Hamer RM, Styner M, and Shen DG: Longitudinal Development of Cortical and Subcortical Gray Matter from Birth to 2 Years. Cerebral Cortex, 22(11), pp. 2478–2485 (2012).
3. Li G, Wang L, Shi F, Lyall AE, Ahn M, Peng Z, Zhu H, Lin W, Gilmore JH, and Shen D: Cortical thickness and surface area in neonates at high risk for schizophrenia. Brain Structure and Function, 221(1), pp. 447–461 (2016).
4. Zhang H, Shen D, and Lin W: Resting-state functional MRI studies on infant brains: A decade of gap-filling efforts. Neuroimage, 185, pp. 664–684 (2019).
5. Cao M, Huang H, and He Y: Developmental Connectomics from Infancy through Early Childhood. Trends in Neurosciences, 40(8), pp. 494–506 (2017).
6. Howell BR, Styner MA, Gao W, Yap PT, Wang L, Baluyot K, Yacoub E, Chen G, Potts T, Salzwedel A, Li G, Gilmore JH, Piven J, Smith JK, Shen D, Ugurbil K, Zhu H, Lin W, and Elison JT: The UNC/UMN Baby Connectome Project (BCP): An overview of the study design and protocol development. Neuroimage, 185, pp. 891–905 (2019).
7. Alotaibi A: Deep Generative Adversarial Networks for Image-to-Image Translation: A Review. Symmetry, 12(10), pp. 1705 (2020).
8. Armanious K, Jiang C, Fischer M, Küstner T, Hepp T, Nikolaou K, Gatidis S, and Yang B: MedGAN: Medical image translation using GANs. Computerized Medical Imaging and Graphics, 79, pp. 101684 (2020).
9. Choi Y, Choi M, Kim M, Ha J-W, Kim S, and Choo J: StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 8789–8797 (2018).
10. Zhu M, and Rekik I: Multi-view brain network prediction from a source view using sample selection via CCA-based multi-kernel connectomic manifold learning. In: International Workshop on PRedictive Intelligence In MEdicine, Springer, Cham, pp. 94–102 (2018).
11. Bessadok A, Mahjoub MA, and Rekik I: Brain graph synthesis by dual adversarial domain alignment and target graph prediction from a source graph. Medical Image Analysis, 68, pp. 101902 (2021).
12. Makhzani A, Shlens J, Jaitly N, Goodfellow I, and Frey B: Adversarial autoencoders. In: International Conference on Learning Representations (2015).
13. Van den Heuvel MP, de Lange SC, Zalesky A, Seguin C, Yeo BT, and Schmidt R: Proportional thresholding in resting-state fMRI functional connectivity networks and consequences for patient-control connectome studies: Issues and recommendations. Neuroimage, 152, pp. 437–449 (2017).
14. Garrison KA, Scheinost D, Finn ES, Shen X, and Constable RT: The (in)stability of functional brain network measures across thresholds. Neuroimage, 118, pp. 651–661 (2015).
15. Wen X, Wang R, Yin W, Lin W, Zhang H, and Shen D: Development of dynamic functional architecture during early infancy. Cerebral Cortex, 30(11), pp. 5626–5638 (2020).
16. Meunier D, Achard S, Morcom A, and Bullmore E: Age-related changes in modular organization of human brain functional networks. Neuroimage, 44(3), pp. 715–723 (2009).
17. Meilă M: Comparing clusterings-an information based distance. Journal of Multivariate Analysis, 98(5), pp. 873–895 (2007).
18. Venkataraman A, Van Dijk KR, Buckner RL, and Golland P: Exploring functional connectivity in fMRI via clustering. In: 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 441–444 (2009).
19. Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F, Etard O, Delcroix N, Mazoyer B, and Joliot M: Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage, 15(1), pp. 273–289 (2002).
20. Tavor I, Jones OP, Mars RB, Smith S, Behrens T, and Jbabdi S: Task-free MRI predicts individual differences in brain activity during task performance. Science, 352(6282), pp. 216–220 (2016).
21. Wang B, Zhu J, Pierson E, Ramazzotti D, and Batzoglou S: Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning. Nature Methods, 14(4), pp. 414–416 (2017).
22. Isola P, Zhu J-Y, Zhou T, and Efros AA: Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1125–1134 (2017).
23. Li G, Wang L, Yap P-T, et al.: Computational neuroanatomy of baby brains: A review. Neuroimage, 185, pp. 906–925 (2019).
24. Wang L, Li G, Shi F, et al.: Volume-based analysis of 6-month-old infant brain MRI for autism biomarker identification and early diagnosis. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 411–419 (2018).
25. Li G, Wang L, Shi F, Gilmore JH, Lin W, and Shen D: Construction of 4D high-definition cortical surface atlases of infants: Methods and applications. Medical Image Analysis, 25(1), pp. 22–36 (2015).
26. Li G, Wang L, Shi F, Lin W, and Shen D: Simultaneous and consistent labeling of longitudinal dynamic developing cortical surfaces in infants. Medical Image Analysis, 18(8), pp. 1274–1289 (2014).
27. Yin W, Li T, Hung SC, Zhang H, Wang L, Shen D, and Lin W: The emergence of a functionally flexible brain during early infancy. Proceedings of the National Academy of Sciences, 117(38), pp. 23904–23913 (2020).
28. Hu D, Zhang H, Wu Z, Wang, et al.: Disentangled-Multimodal Adversarial Autoencoder: Application to Infant Age Prediction With Incomplete Multimodal Neuroimages. IEEE Transactions on Medical Imaging, 39(12), pp. 4137–4149 (2020).
29. Hu D, Wang F, Zhang H, et al.: Disentangled Intensive Triplet Autoencoder for Infant Functional Connectome Fingerprinting. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 72–82 (2020).