Cognitive Neurodynamics. 2021 Mar 12;15(5):915–919. doi: 10.1007/s11571-021-09670-5

The braingraph.org database with more than 1000 robust human connectomes in five resolutions

Bálint Varga, Vince Grolmusz
PMCID: PMC8448809  PMID: 34603551

Abstract

The human brain is the most complex object of study we encounter today. Mapping the neuronal-level connections between the more than 80 billion neurons in the brain is a hopeless task for science. With recent advances in magnetic resonance imaging (MRI), we are able to map the macroscopic connections between about 1000 brain areas. The MRI data acquisition and the subsequent algorithmic workflow contain several complex steps, where errors can occur. In the present contribution we describe and publish 1064 human connectomes, computed from the public release of the Human Connectome Project. Each connectome is available in 5 resolutions, with 83, 129, 234, 463 and 1015 anatomically labeled nodes. For error correction we follow an averaging and extreme-value-deleting strategy for each edge and for each connectome. The resulting 5320 braingraphs can be downloaded from the https://braingraph.org site. This dataset makes these graphs accessible to scientists unfamiliar with neuroimaging- and connectome-related tools: mathematicians, physicists and engineers can use their expertise and ideas in the analysis of the connections of the human brain. Brain scientists and computational neuroscientists also gain a robust and large, multi-resolution set for connectomical studies.

Supplementary Information

The online version contains supplementary material available at 10.1007/s11571-021-09670-5.

Keywords: Connectome, Braingraph

Introduction

Connectomes or braingraphs are compact and focused derivatives of the diffusion magnetic resonance images (MRIs) of the brain: their vertices are labeled by anatomical areas, and two such vertices are connected by a weighted graph edge if a tractography workflow (Besson et al. 2014) finds neural tracts between the areas corresponding to the vertices. By focusing on the connections between cerebral areas instead of analyzing the whole MR image, we can make use of the rich and refined resources of graph theory, born with the famous article of Leonhard Euler on the problem of the Königsberg bridges (Euler 1741).

Our research group has earlier prepared several undirected and directed braingraph sets (Kerepesi et al. 2016, 2017; Szalkai et al. 2015a, 2017a, 2019a) from the 500 Subjects Data Release (McNab et al. 2013) of the Human Connectome Project (HCP). The resulting graphs were made available at the site https://braingraph.org, and were applied in several structural studies of the human brain (Szalkai et al. 2015b; Kerepesi et al. 2018a; Szalkai et al. 2019a; Kerepesi et al. 2018b; Szalkai et al. 2019b, 2018; Szalkai et al. 2017b; Szalkai et al. 2016; Fellner et al. 2019, 2020a, 2020b).

In the present contribution we describe a new braingraph set, computed from the 1200 Subjects Data Release of the Human Connectome Project (McNab et al. 2013). The set contains 1064 connectomes, each in five resolutions, and each edge is weighted by three different weight functions. Our dataset may serve as a robust resource for the computational neuroscience community in the coming years.

Methods

The data source of the workflow is the 1200 Subjects Data Release of the Human Connectome Project (HCP) (McNab et al. 2013), documented at the site https://www.humanconnectome.org/study/hcp-young-adult/document/1200-subjects-data-release. For the present study the “re-preprocessed” 3T diffusion data were used, as detailed at the HCP site.

The Connectome Mapper Tool Kit (CMTK) workflow (Daducci et al. 2012) was used for the graph computation on the HCP data. For each subject, we applied the segmentation and the parcellation steps only once, but ran the probabilistic tractography part of the workflow 10 times. The parcellation scheme was the Lausanne2008 atlas; the labels applied are listed in https://github.com/LTS5/cmp_nipype/blob/master/cmtklib/data/parcellation/lausanne2008/ParcellationLausanne2008.xls.

The graph construction was performed in the following steps:

  1. For each subject the MRtrix 0.3 tractography algorithm (Tournier et al. 2012) was run with probabilistic seeding and probabilistic tractography. The number of streamlines was set to 1 million. To define the graph edges, let us consider two distinct, anatomically labeled cortical or sub-cortical gray matter areas of the brain, denoted by A and B. If the tractography algorithm found at least one streamline between areas A and B, then vertex a, representing area A, was connected to vertex b, representing area B, by a graph edge. The three weights of the edge {a,b} give the number of streamlines (fibers) found between areas A and B, the average length of these streamlines, and their mean fractional anisotropy.

  2. Step 1 was repeated 10 times for each subject. We accepted {a,b} as an edge of the connectome of the subject if it was present in all ten graphs computed in the repetitions. Next, for each edge we determined the maximum and the minimum of the fiber numbers defining that edge, and deleted those two extremal values. Consequently, 8 fiber numbers remained for each edge. The three weights of the edge are the mean value of those fiber numbers, the mean value of the lengths of the streamlines, and the mean value of the fractional anisotropies.

In other words, the probabilistic tractography was performed 10 times and a graph was constructed after each run (i.e., 10 graphs were constructed for each subject). Only the edges present in all 10 graphs were included in the resulting graph; for each such edge, the two extremal fiber number values were deleted and the remaining 8 values were averaged.
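The aggregation over the repeated runs can be illustrated with a minimal Python sketch. The data layout (one dictionary per run, keyed by a `frozenset` of two vertex ids) and the function name `aggregate_runs` are hypothetical and are not taken from the CMTK workflow. The text does not state whether the lengths and anisotropies are averaged over all runs or only over the runs that remain after trimming; the sketch assumes all runs.

```python
from statistics import mean


def aggregate_runs(runs):
    """Combine repeated tractography runs of one subject into one weighted graph.

    `runs` is a list of dicts, one per run, each mapping an undirected edge
    (a frozenset of two vertex ids) to (fiber_count, mean_length, mean_fa).
    This layout is hypothetical.
    """
    # Keep only the edges that were found in every run.
    common = set(runs[0])
    for run in runs[1:]:
        common &= set(run)

    result = {}
    for edge in common:
        counts = sorted(run[edge][0] for run in runs)
        trimmed = counts[1:-1]  # delete one minimum and one maximum value
        result[edge] = (
            mean(trimmed),                         # 8 values when there are 10 runs
            mean([run[edge][1] for run in runs]),  # assumption: averaged over all runs
            mean([run[edge][2] for run in runs]),  # assumption: averaged over all runs
        )
    return result
```

With ten runs, `trimmed` holds eight fiber numbers per edge, matching the description above; an edge missing from even one run does not appear in the result.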

Steps 1 and 2 were performed only in the highest (i.e., the finest) resolution with 1015 vertices. For lower resolutions, the graphs were computed from the 1015-vertex graph by contracting vertices, summing the fiber numbers of the multiple edges between the two contracted vertices and contracting the multiple edges.
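The contraction step can be sketched as follows. The mapping `coarse_of` from fine to coarse vertex ids is a hypothetical input (the actual correspondence comes from the Lausanne2008 parcellation), and dropping edges whose two endpoints fall into the same coarse region is an assumption, since the text describes only the summation of fiber numbers over the merged edges.

```python
from collections import defaultdict


def contract(fiber_counts, coarse_of):
    """Project a fine-resolution graph onto a coarser parcellation.

    fiber_counts: dict mapping frozenset({u, v}) -> fiber number (fine graph)
    coarse_of:    dict mapping each fine vertex id -> coarse vertex id
    Both layouts are hypothetical.
    """
    coarse = defaultdict(float)
    for edge, count in fiber_counts.items():
        u, v = tuple(edge)
        cu, cv = coarse_of[u], coarse_of[v]
        if cu == cv:
            continue  # assumption: edges inside one merged region are dropped
        # Multiple edges between two contracted vertices are merged by summing.
        coarse[frozenset({cu, cv})] += count
    return dict(coarse)
```

For example, if fine vertices 1 and 2 are merged into X and vertices 3 and 4 into Y, the fine edges {1,3} and {2,3} become a single edge {X,Y} whose fiber number is the sum of the two.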

For the choice of 10 as the number of repetitions of the probabilistic tractography, we refer to the detailed analysis in the “Discussion and results” section below.

From the dataset of the HCP website we were able to finish the graph computations for 1064 subjects.

The computation was done on our 24-member Intel i7 cluster (each member with 6 physical and 12 virtual CPU cores and 16 GB of RAM) in 3 weeks of running time.

Data records

The data source of this work was published at the Human Connectome Project’s website at http://www.humanconnectome.org/ (McNab et al. 2013) as the 1200 Subjects Public Release. The parcellation data, containing the anatomically labeled ROIs, is listed in the CMTK nipype GitHub repository https://github.com/LTS5/cmp_nipype/blob/master/cmtklib/data/parcellation/lausanne2008/ParcellationLausanne2008.xls.

The braingraphs, computed by us, can be accessed at the https://braingraph.org/cms/download-pit-group-connectomes/ site, by selecting one of the download options, denoted by “X nodes set, 1064 brains, 1 000 000 streamlines, 10x repeated”, where X=86,129,234,463,1015.

The graphs are given in GraphML format, as described at https://cmtk.org (Daducci et al. 2012). Each file begins with an attribute definition section; then the nodes are described with their coordinates and anatomical labels, corresponding to the parcellation at https://github.com/LTS5/cmp_nipype/blob/master/cmtklib/data/parcellation/lausanne2008/ParcellationLausanne2008.xls.

Next the (un-directed) edges are listed. The edges carry three weights:

  • The number of fibers;

  • The mean value of the fiber lengths in the edge;

  • The mean fractional anisotropy of the fibers.

Note that the edge weights are averages over eight of the ten tractography runs; therefore, even the fiber number is typically a non-integer.
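A downloaded file can be inspected with the Python standard library alone. The sketch below assumes that the files use the standard GraphML XML namespace; it does not assume any particular attribute names, but reads them from the attribute definition section at the top of the file. The function name `read_graphml` is illustrative.

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace (assumed to be the one used by the files).
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def read_graphml(path):
    """Return (nodes, edges) from a GraphML file as plain Python containers."""
    root = ET.parse(path).getroot()
    # The attribute definition section maps key ids to readable names.
    names = {k.get("id"): k.get("attr.name") for k in root.findall("g:key", NS)}

    def data_of(element):
        # Collect the <data> children of a node or an edge as name -> text.
        return {names.get(d.get("key"), d.get("key")): d.text
                for d in element.findall("g:data", NS)}

    graph = root.find("g:graph", NS)
    nodes = {n.get("id"): data_of(n) for n in graph.findall("g:node", NS)}
    edges = [(e.get("source"), e.get("target"), data_of(e))
             for e in graph.findall("g:edge", NS)]
    return nodes, edges
```

All attribute values are returned as strings; the three edge weights need to be converted with `float()` before numerical use, since even the fiber numbers are typically non-integers.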

Discussion and results

Here we describe the analysis that led to the choice of 10 repetitions of step 1 in the graph construction above. We note that the present section describes only the process that resulted in the specific choice of the repetition number 10, and not the actual graph construction (which was already described in the “Methods” section).

The implementations of the deterministic tractography algorithms also contain a probabilistic seeding step; i.e., two runs of these tractography computations almost always yield different results. When we use probabilistic tractography (Girard et al. 2014; Buchanan et al. 2014), it is evident that distinct runs yield different results.

For generating reproducible results in the graph construction with a probabilistic tractography phase, it is a natural idea to repeat the probabilistic tractography algorithm for the very same input several times, and to average the results of the tractography in a careful way.

Let us fix two vertices, and let the random variable X denote the number of fibers discovered between them. Then, clearly, for any X: E(X-E(X))=E(X)-E(X)=0, that is, the expectation of the difference of X from its expected value E(X) is 0. This fact implies that the deviations from the expected value cancel on average, so the repetitions and the averaging will increase the reliability of the tractography results.
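The gain from averaging can be quantified by a standard variance argument. As a sketch, assume (this is not stated in the text) that the k runs yield independent, identically distributed fiber numbers $X_1, \ldots, X_k$ with finite variance, and let $\bar{X}_k$ denote their mean. Then

$$
E(\bar{X}_k) = E(X), \qquad \operatorname{Var}(\bar{X}_k) = \frac{1}{k^2}\sum_{j=1}^{k}\operatorname{Var}(X_j) = \frac{\operatorname{Var}(X)}{k}.
$$

Under these assumptions the standard deviation of the averaged fiber number decreases like $1/\sqrt{k}$, so each additional repetition brings a smaller improvement than the previous one.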

For determining the number of repetitions k, as a trade-off between practical computability and robustness, we followed the strategy described below. In short, we determined the number of necessary repetitions by comparing the deviations of 10 average values, each computed from k repetitions, for k = 1, 2, …, 50.

More exactly, we have chosen 9 subjects: for each non-zero leading digit of the ID numbers, one subject was chosen randomly (the choices were: 136631, 200008, 300618, 401422, 500222, 601127, 700634, 800941, 901038). For a given subject, and a given positive integer value k, we have generated the following ten braingraphs:

$$G_k^1, G_k^2, \ldots, G_k^{10},$$

where $G_k^i$ was calculated by k repetitions of the tractography phase, averaging the numbers of fibers for each edge over the k runs.

That is, for each i = 1, 2, …, 10, we generated k independent instances and averaged these k fiber numbers for each edge. Next, we discarded those edges that were not present in all ten averaged graphs. For each remaining edge {u,v}, this gives one average fiber number value $w_i^{(k)}(u,v)$ for each graph $G_k^i$, i = 1, 2, …, 10. For readability, we omit (u,v) from $w_i^{(k)}(u,v)$ in what follows.

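For these ten $w_i^{(k)}$ values we computed the relative standard deviation (also called the coefficient of variation):

$$\mathrm{cv}(w^{(k)}) = \frac{\sigma(w^{(k)})}{\mu(w^{(k)})}, \qquad (1)$$

where

$$\mu(w^{(k)}) = \frac{1}{10}\sum_{i=1}^{10} w_i^{(k)}, \qquad \sigma(w^{(k)}) = \sqrt{\frac{1}{9}\sum_{i=1}^{10}\left(w_i^{(k)} - \mu(w^{(k)})\right)^2}. \qquad (2)$$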
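The relative standard deviation of the ten averaged fiber numbers of an edge can be computed with the Python standard library. This is a minimal sketch with an illustrative function name; `statistics.stdev` is the sample standard deviation, which divides by the number of values minus one (that is, by 9 for ten values).

```python
from statistics import mean, stdev


def coefficient_of_variation(weights):
    """Relative standard deviation of the averaged fiber numbers of one edge."""
    mu = mean(weights)
    sigma = stdev(weights)  # sample standard deviation: divides by len(weights) - 1
    return sigma / mu
```

For instance, the ten made-up values 8, 9, 10, 11, 12, 8, 9, 10, 11, 12 have mean 10 and sample standard deviation of about 1.49, giving a relative standard deviation of about 0.149.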

Figure 1 displays the change of the relative standard deviation of the fiber number of a given edge (the edge connecting vertex 19 and vertex 21 in the 463-vertex resolution in the case of subject No. 901038) for k = 1, 2, …, 50.

Fig. 1

The change of the relative standard deviation (on the y axis) of the fiber number of the edge connecting vertex 19 and vertex 21 in the 463-vertex resolution in the case of subject No. 901038, for k = 1, 2, …, 50 (on the x axis)

Figure 2 shows the change of the relative standard deviations, averaged for all edges as a function of k, in the case of a given braingraph, in 234-vertex resolution. Supporting Figures 1, 2, 3 and 4 show the same in graphs of different resolutions.

Fig. 2

The change of the relative standard deviations (on the y axis), averaged for all edges as a function of k = 1, 2, …, 50 (on the x axis), in the case of the connectome of subject No. 300618, in 234-vertex resolution. The medians of the relative standard deviations are shown by red horizontal lines, while the boxes show the middle half of the data points: the lower quarter of the data points lies under the box and the upper quarter above it. The solid lines show the whole spread of the data points

Based on the visual examination of Figure 2 (and the related figures for other resolutions and subjects, cf. Supporting Figs. 1, 2, 3 and 4), we have chosen the value k=10 for the repetitions as a good trade-off between deviation and practical computability: for k>10 repetitions the decrease of the red horizontal lines, showing the median relative standard deviations, is very small in Fig. 2 and Supporting Figs. 1 and 2, and still small in Supporting Figs. 3 and 4.


Acknowledgements

Data were provided in part by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University. VG and BV were partially supported by the VEKOP-2.3.2-16-2017-00014 program, supported by the European Union and the State of Hungary, co-financed by the European Regional Development Fund, and the NKFI-127909 grant of the National Research, Development and Innovation Office of Hungary. VG and BV were supported in part by the EFOP-3.6.3-VEKOP-16-2017-00002 grant, supported by the European Union, co-financed by the European Social Fund.

Author Contributions

BV constructed the image processing system, computed the braingraphs, and prepared the figure. VG secured funding, initiated the study, analyzed data and wrote the paper.

Funding

Open access funding provided by Eötvös Loránd University.

Declarations

Conflicts of interest

The authors declare no conflicts of interest.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Bálint Varga, Email: balorkany@pitgroup.org.

Vince Grolmusz, Email: grolmusz@pitgroup.org.

References

  1. Besson P, Dinkelacker V, Valabregue R, Thivard L, Leclerc X, Baulac M, Sammler D, Colliot O, Lehéricy S, Samson S, Dupont S. Structural connectivity differences in left and right temporal lobe epilepsy. Neuroimage. 2014;100C:135–144. doi: 10.1016/j.neuroimage.2014.04.071.
  2. Buchanan CR, Pernet CR, Gorgolewski KJ, Storkey AJ, Bastin ME. Test-retest reliability of structural brain networks from diffusion MRI. Neuroimage. 2014;86:231–243. doi: 10.1016/j.neuroimage.2013.09.054.
  3. Daducci A, Gerhard S, Griffa A, Lemkaddem A, Cammoun L, Gigandet X, Meuli R, Hagmann P, Thiran JP. The connectome mapper: an open-source processing pipeline to map connectomes with MRI. PLoS One. 2012;7(12):e48121. doi: 10.1371/journal.pone.0048121.
  4. Euler L. Solutio problematis ad geometriam situs pertinentis. Commentarii Academiae Scientiarum Imperialis Petropolitanae. 1741;8(1):128–140. http://eulerarchive.maa.org//docs/originals/E053.pdf
  5. Fellner M, Varga B, Grolmusz V. The frequent subgraphs of the connectome of the human brain. Cognit Neurodynam. 2019;13(5):453–460. doi: 10.1007/s11571-019-09535-y.
  6. Fellner M, Varga B, Grolmusz V. The frequent complete subgraphs in the human connectome. PLoS One. 2020;15(8):e0236883. doi: 10.1371/journal.pone.0236883.
  7. Fellner M, Varga B, Grolmusz V. The frequent network neighborhood mapping of the human hippocampus shows much more frequent neighbor sets in males than in females. PLoS One. 2020;15(1):e0227910. doi: 10.1371/journal.pone.0227910.
  8. Girard G, Whittingstall K, Deriche R, Descoteaux M. Towards quantitative connectivity analysis: reducing tractography biases. Neuroimage. 2014;98:266–278. doi: 10.1016/j.neuroimage.2014.04.074.
  9. Kerepesi C, Szalkai B, Varga B, Grolmusz V. How to direct the edges of the connectomes: dynamics of the consensus connectomes and the development of the connections in the human brain. PLoS One. 2016;11(6):e0158680. doi: 10.1371/journal.pone.0158680.
  10. Kerepesi C, Szalkai B, Varga B, Grolmusz V. The braingraph.org database of high resolution structural connectomes and the brain graph tools. Cognit Neurodynam. 2017;11(5):483–486. doi: 10.1007/s11571-017-9445-1.
  11. Kerepesi C, Szalkai B, Varga B, Grolmusz V. Comparative connectomics: mapping the inter-individual variability of connections within the regions of the human brain. Neurosci Lett. 2018;662(1):17–21. doi: 10.1016/j.neulet.2017.10.003.
  12. Kerepesi C, Varga B, Szalkai B, Grolmusz V. The dorsal striatum and the dynamics of the consensus connectomes in the frontal lobe of the human brain. Neurosci Lett. 2018;673:51–55. doi: 10.1016/j.neulet.2018.02.052.
  13. McNab JA, Edlow BL, Witzel T, Huang SY, Bhat H, Heberlein K, Feiweier T, Liu K, Keil B, Cohen-Adad J, Tisdall D, Folkerth RD, Kinney HC, Wald LL. The Human Connectome Project and beyond: initial applications of 300 mT/m gradients. Neuroimage. 2013;80:234–245. doi: 10.1016/j.neuroimage.2013.05.074.
  14. Szalkai B, Kerepesi C, Varga B, Grolmusz V. The Budapest reference connectome server v2.0. Neurosci Lett. 2015;595:60–62. doi: 10.1016/j.neulet.2015.03.071.
  15. Szalkai B, Varga B, Grolmusz V. Graph theoretical analysis reveals: Women’s brains are better connected than men’s. PLoS One. 2015;10(7):e0130045. doi: 10.1371/journal.pone.0130045.
  16. Szalkai B, Kerepesi C, Varga B, Grolmusz V. Parameterizable consensus connectomes from the human connectome project: the Budapest reference connectome server v3.0. Cognit Neurodynam. 2017;11(1):113–116. doi: 10.1007/s11571-016-9407-z.
  17. Szalkai B, Varga B, Grolmusz V. The robustness and the doubly-preferential attachment simulation of the consensus connectome dynamics of the human brain. Sci Rep. 2017. doi: 10.1038/s41598-017-16326-0.
  18. Szalkai B, Varga B, Grolmusz V. Comparing advanced graph-theoretical parameters of the connectomes of the lobes of the human brain. Cognit Neurodynam. 2018;12(6):549–559. doi: 10.1007/s11571-018-9508-y.
  19. Szalkai B, Kerepesi C, Varga B, Grolmusz V. High-resolution directed human connectomes and the consensus connectome dynamics. PLoS One. 2019;14(4):e0215473. doi: 10.1371/journal.pone.0215473.
  20. Szalkai B, Varga B, Grolmusz V. Mapping correlations of psychological and connectomical properties of the dataset of the human connectome project with the maximum spanning tree method. Brain Imag Behav. 2019;13(5):1185–1192. doi: 10.1007/s11682-018-9937-6.
  21. Szalkai B, Varga B, Grolmusz V. The graph of our mind. Brain Sci. 2021;11(3):342. doi: 10.3390/brainsci11030342.
  22. Tournier J, Calamante F, Connelly A, et al. MRtrix: diffusion tractography in crossing fiber regions. Int J Imag Syst Technol. 2012;22(1):53–66. doi: 10.1002/ima.22005.

Articles from Cognitive Neurodynamics are provided here courtesy of Springer Science+Business Media B.V.
