Abstract
For more than a decade now, diffusion magnetic resonance imaging (dMRI) techniques and the accompanying algorithmic workflow have allowed us to discover and study thousands of cerebral connections. While numerous connectomical results have been published, shedding light on the relation between the braingraph and certain biological, medical, and psychological properties, it is still a great challenge to identify a small number of brain connections closely related to those conditions. In the present contribution, by applying the 1200 Subjects Release of the Human Connectome Project (HCP) and Support Vector Machines, we identify just 102 connections, out of the total number of 1950 connections in the 83-vertex graphs of 1064 subjects, which determine the sex of the subject by a simple linear test, precisely and without any error. Next, we re-scaled the weights of the edges (corresponding to the discovered fibers) to be between 0 and 1, and, very surprisingly, we were able to identify two graph edges out of these 102, such that, if their weights are both 1, then the connectome always belongs to a female subject, independently of the other edges. Similarly, we have identified 3 edges from these 102, whose weights, if two of them are 1 and one is 0, imply that the graph belongs to a male subject, again independently of the other edges. We call the former 2 edges superfeminine and the first two of the 3 edges supermasculine edges of the human connectome. Even more interestingly, the edge connecting the right Pars Triangularis and the right Superior Parietal areas is one of the 2 superfeminine edges, and it is also the third edge, accompanying the two supermasculine connections if its weight is 0; therefore, it is also a “switching” edge. Identifying such edge-sets of distinction is the unprecedented result of this work.
Supplementary Information
The online version contains supplementary material available at 10.1007/s11571-021-09687-w.
Keywords: Connectome, Braingraph, SVM, Linear separation, Sex differences, Superfeminine edges, Supermasculine edges
Introduction
One of the most important challenges in brain science is establishing the cellular and anatomical causes of neurophysiological or psychological differences between human subjects. In the last decade, owing to the spectacular developments in magnetic resonance imaging (MRI) of the brain, together with the processing pipelines for the collected data, our knowledge of the cerebral connections has increased enormously (e.g., Sporns et al. 2005; Van Essen et al. 2012; Szalkai et al. 2019a).
Diffusion MRI (dMRI) is capable of discovering the spatial anisotropy of the movement of water molecules in the brain: since in the axonal fibers of the white matter the water molecules have a diffusion movement along the axons, the axonal fibers can be tracked and traced, without any contrast material, with refined tractography algorithms (Tournier et al. 2012). With the reliable identification of the cortical- and sub-cortical gray matter areas (Fischl 2012), we can construct the connectome, or the braingraph as follows: the nodes (or vertices) of this graph are the anatomically identified gray matter areas, and two nodes are connected by an (undirected) edge if the tractography algorithm finds axonal fibers between the brain areas, corresponding to these two nodes.
Numerous results were published in the last decade, analyzing the human braingraph (Hagmann et al. 2008; Szalkai et al. 2015a; Kerepesi and Grolmusz 2017; Hagmann et al. 2012; Szalkai et al. 2019b; Craddock et al. 2013; Kerepesi et al. 2018b; Szalkai et al. 2017b; Ortiz et al. 2014). Several works describe the connections of the healthy human brain (Ball et al. 2014; Kerepesi et al. 2016; Bargmann 2012; Kerepesi et al. 2017; Batalle et al. 2013; Szalkai et al. 2017a; Kerepesi et al. 2018b; Graham 2014), while others establish relations between psychiatric diseases or conditions and the connectome (Agosta et al. 2014; Alexander-Bloch et al. 2014; Baker et al. 2014; Szalkai et al. 2019a; Besson et al. 2014; Bonilha et al. 2014).
Sex differences
It has been known for several years that the female and the male connectomes have different properties as graphs. The work of Ingalhalikar et al. (2014) has shown, on a dataset that is not publicly available, that the ratio of inter-hemispheric connections vs. the intra-hemispheric connections differs in males and females.
Our group has shown on a publicly available dataset (Kerepesi et al. 2017) that several deep graph-theoretical properties, which are usually applied in the characterization of the quality of large computer interconnection networks (Leighton 1992), are better in the braingraphs of women than in those of men (Szalkai et al. 2015b, 2021). We have shown that women’s braingraphs are better expanders, and have greater minimal bisection width, more spanning trees, and larger minimum vertex cover than those of men. In the work of Szalkai et al. (2018) we have shown that the advantage in the graph-quality parameters of women is due to the sex differences, and not to the size differences: we have compared the graphs of 36 large-brain women and 36 small-brain men, such that the brain volumes of all men were smaller than the brain volume of the smallest-brain woman in the group. We have found that men did not have better parameters than women in this test, and, additionally, many of the advantages of the women remained valid.
The adjective “better” and the noun “advantage” refer to the quality parameters of the large computer interconnection networks (Leighton 1992); their beneficial effects on the human brain functions are not proven yet (Szalkai et al. 2015b, 2021).
Parameters, defined a priori versus a posteriori
In the studies of Ingalhalikar et al. (2014), Szalkai et al. (2015b), Szalkai et al. (2021), Szalkai et al. (2018), Fellner et al. (2020b), the authors compared parameters which were identified a priori, i.e., the examination of these parameters was decided before the braingraphs were analyzed. In the present work, we intend to identify a posteriori parameters, i.e., edge-structures found in the course of the analysis of the braingraphs, in which the male and female connectomes differ. Additionally, we intend to discover the smallest possible edge-sets of the braingraphs, which already determine the sex of the subject.
First, we constructed and trained a deep artificial neural network (ANN, see, e.g., Szalkai and Grolmusz 2017, 2018 for definitions and examples) for classifying the sex of the subject, using only his/her braingraph. While these efforts were moderately successful, we have found that not the deep networks, but, on the contrary, the one-level networks gave the best results for predicting the sex of the subject. In a certain sense, one-level neural networks are similar in their capabilities to linear tests or Support Vector Machines (SVMs). In the “Methods” section, we give a short introduction to SVMs.
It is important to note that we have not used artificial intelligence tools (ANNs and SVMs) for making predictors. We have used these tools for data analysis: we have found the “minimal SVM” which distinguished the sexes, and then applied this SVM as a mathematical model to identify distinguished edges of the male and female connectomes.
Few edges, which imply biological properties
It is a great challenge to identify one edge or a small set of edges in the human braingraph, which imply some important biological properties of the subject. In other words, the task is to find the most important brain connections, which relate to some biological conditions (biological property, or diseased status, or mental ability or disability). Up to now, more complex graph-theoretical properties—instead of just identifying a few graph edges—were published in this direction: for example, for the sex of the subjects, complex graph-theoretical differences were found in Szalkai et al. (2015b), Szalkai et al. (2021), Szalkai et al. (2018), Fellner et al. (2020a), Fellner et al. (2019). For intelligence-related psychological tests, some frequent neighbor sets of the hippocampus were identified, which are correlated with good and bad test results in Fellner et al. (2020b). Interrelations between graph-theoretical properties of the connectome and physiological properties were described in Szalkai et al. (2019a).
Finding one, two, or three edges whose strengths (measured in fiber numbers, cf. the “Methods” section) imply important biological properties is one of the results of the present work. These edges are the most important ones in relation to the property studied. This problem is analogous to finding the most important vertices in a graph, which was solved by Google Inc. with their famous PageRank algorithm (Page et al. 1999; Brin and Page 1998): the PageRank scoring put the Google web search engine ahead of its competitors.
Here, we identify superfeminine and supermasculine edges of the braingraph based on the largest cohort available today. These edges are described as follows.
Few edges, which simply determine the sex of the subject
Applying Support Vector Machines and integer programming algorithms, we were able to identify a small set of connectome edges, which precisely determine the female and male brains, without any error. Additionally, and perhaps more surprisingly, we have identified 2 and 3 particular edges with the following property: if the scaled weight of both edges is 1, then the connectome belongs to a female subject. If the scaled weight of the first two of the three edges is 1, and the scaled weight of the third is 0, then the connectome belongs to a male subject. The edge weights correspond to the fiber numbers, and the scaling means that the fiber number is multiplied by an edge-specific number such that the resulting value is between 0 and 1 (the exact definition of the edge weights is given in the “Methods” section). We call these edges superfeminine and supermasculine edges, respectively.
More exactly, we are considering graphs on 83 vertices. From these 83 vertices, one can form $\binom{83}{2} = \frac{83 \cdot 82}{2} = 3403$ vertex-pairs, i.e., this is the maximum number of edges on 83 vertices. Note that each of the 1064 braingraphs contains exactly 83 vertices, and all of these vertices correspond to the very same 83 gray-matter areas of the brain (sometimes called ROIs, Regions of Interest).
In our dataset of 1064 subjects, the union of all edges of the 1064 braingraphs contains 1950 edges. That means that out of the possible 3403 edges, only 1950 are present in the union of all the 1064 braingraphs. This is not a surprising observation since, for example, few areas of the left hemisphere are connected directly to areas of the right hemisphere (see Supporting Fig. 1 in the on-line supporting material). As our first result, we have succeeded in finding a hyperplane in the 1950-dimensional Euclidean space, which perfectly separates the 1064 points, corresponding to the male and the female subjects (see Fig. 1). In general, this is not a great surprise: if we take an $(n+1)$-vertex simplex in the n-dimensional space, then, for any +1 and −1 labeling of those vertices, there exists a hyperplane which separates the −1 and the +1 labeled points perfectly, without any error, as in Fig. 1. Finding a separating hyperplane in much lower dimensions is difficult and often impossible.
It is a great challenge to find the smallest possible set of edges, which still implies the sex of the subject. This small set of connections may carry the most important features, which differentiate the braingraphs of the sexes. If there existed a single graph edge e with weight w(e) and a constant c, such that $w(e) < c$ for all braingraphs of men and $w(e) > c$ for all braingraphs of women, then this single edge e would separate the sexes in a very simple way. If no such single edge exists, but there existed two edges, e and f, and three constants a, b, c, such that $a\,w(e) + b\,w(f) < c$ for all men and $a\,w(e) + b\,w(f) > c$ for all women, then these two edges, e and f, would separate the sexes by a linear test. Unfortunately, no one or two edges are known which separate the braingraphs of the sexes by such simple linear tests.
We were able to identify 102 edges, which already determine the sex of the subjects (Fig. 1). Moreover, these edges determine the sex in a very simple, linear way, described below (the method of the identification of these 102 edges is detailed in the “Methods” section).
For describing this phenomenon, let us assign to each graph a length-102 vector, with coordinates equal to the edge-weights on the chosen 102 edges. This way, we have 1064 vectors, each with 102 coordinates. In other words, we have a 102-dimensional Euclidean space with 1064 points (vectors) in it. In this space, we have determined a hyperplane, which separates the male and female graphs in the following way: all the 102-dimensional vectors made from the female graphs are on one side of the hyperplane, while all the 102-dimensional vectors made from the male braingraphs are on the other side of the hyperplane. Consequently, (1) 102 edges out of the 1950 edges already determine the sex of the subject, and (2) they do so in a very simple, exact, and linear way, by a separating hyperplane. Figure 1 gives a simple example of data separation in the plane (in 2 dimensions) with a line (i.e., in the plane, a hyperplane is a line).
Figure 2 depicts the 102 edges, which already determine the sex of the subject. The list of these 102 edges is given in Supporting Table 1. An Excel file performing the actual separation-computation with all data is available at http://uratim.com/agysvm/agy-svm.zip. An interactive chart visualizing the separation can be viewed at http://pitgroup.org/static/interactive_chart/abra.html
Superfeminine and supermasculine edges
Our second main result is the identification of very few connections, out of the 102 edges, in a way that if each of these edges has specific (either high or low) weights, then the sex of the subject is uniquely determined.
Let us recall that the weight of an edge is the number of the axonal fibers found running between its two endpoints in the tractography algorithm, scaled for individual edges to be between 0 and 1 (the details are given in the “Methods” section).
We have found that if the weights of both edges below are 1, then, independently from the weights of the remaining 100 edges out of the 102 sex-determining connections, the sex of the subject is female:
F1: (rh.superiorfrontal, Left-Putamen)
F2: (rh.parstriangularis, rh.superiorparietal)
We call the edges F1 and F2 “superfeminine” edges.
Similarly, we have found three edges, such that, if the weights of the first two are high and the weight of the third one is low, then, independently of the other edge-weights of the remaining 99 edges out of the 102 connections, the sex of the subject is male:
M1: (lh.rostralmiddlefrontal, Left-Thalamus-Proper)
M2: (Right-Hippocampus, lh.supramarginal)
F2: (rh.parstriangularis, rh.superiorparietal)
The superfeminine and supermasculine edges are depicted in Fig. 3.
We call edges M1 and M2 “supermasculine” edges. Note that edge F2 is present in both sets: if the weights of F1 and F2 are high, then the graph belongs to a female subject, and if the weight of F2 is low, and the weights of M1 and M2 are high, then the graph belongs to a male subject. We call the edge F2 a “switching” edge. The exact definition of the “switching” edge is given in the “Methods” section.
Methods
Graph construction
Our data source is the 1200 Subjects Release of the Human Connectome Project (HCP) (McNab et al. 2013), available at the https://www.humanconnectome.org site. The subjects were healthy adults between 22 and 35 years of age. The data acquisition methodology of the Human Connectome Project is detailed in the “WU-Minn HCP 1200 Subjects Data Release: Reference Manual” at the site https://www.humanconnectome.org/storage/app/media/documentation/s1200/HCP_S1200_Release_Reference_Manual.pdf.
We have applied the 3T MR diffusion imaging data and processed it with the Connectome Mapper Tool Kit (CMTK) (Daducci et al. 2012).
Our goal was the construction of graphs, or connectomes, which describe the connections between the distinct, anatomically identified cortical and sub-cortical, gray-matter areas of the brain of the subjects. The nodes (or vertices) of our graphs corresponded to the anatomically identified gray matter areas, and we connected two nodes by an edge if the workflow described below found axonal fibers running between the areas that corresponded to the nodes. We emphasize that the study of the connectome instead of the whole MR image deals with exclusively the connections between the gray matter areas and does not take into account the exact orbit of the axonal fibers running in the white matter of the brain. This way, we can work with graphs instead of very redundant spatial imagery gained from the processing of the diffusion MR images. We note that the (mathematical) graph theory, which was established in 1741 by a work of Euler (1741), has very rich structures and several of the most complex and deepest proofs and tools in mathematics (e.g., Szemeredi 1975; Chudnovsky et al. 2006; Erdos et al. 1946). Therefore, the transition from images to graphs facilitates the application of the well-developed techniques of the (mathematical) graph theory to one of the most complex organs on Earth, the human brain.
The axonal fibers are discovered from the diffusion MR images by tractography algorithms. Probabilistic tractography was applied, with 1 million streamlines, by using MRtrix 0.2 tractography software. For each subject, the tractography program was run 10 times. In each run, the number of fibers was determined for each edge. If in any of the ten runs an edge was non-existent, that is, it was not defined by any fiber in the tractography, then that edge was discarded. Next, from these 10 runs, for each edge, the maximum and minimum numbers of fibers were deleted, and the average of the remaining 8 fiber numbers was assigned to the edge; this number is used as the weight of the edge. This way, the false positive and false negative edges were dealt with, and large errors, leading to the maximum or minimum fiber numbers of an edge, were discarded: they did not influence the average value (Varga and Grolmusz 2021).
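The per-edge weight computation described above can be sketched in a few lines. This is only an illustration of the stated rule (discard the edge if any run finds no fiber, otherwise drop one maximum and one minimum count and average the rest); the function name `edge_weight` and the sample counts are hypothetical and are not taken from the authors' workflow.

```python
def edge_weight(fiber_counts):
    """Combine the fiber counts of one edge over repeated tractography runs.

    Returns None when the edge is discarded (no fiber found in at least one
    run); otherwise returns the mean of the counts that remain after dropping
    one maximum and one minimum value.
    """
    if len(fiber_counts) < 3:
        raise ValueError("need at least three runs to trim both extremes")
    if any(count == 0 for count in fiber_counts):
        return None  # edge absent in some run: discard it
    trimmed = sorted(fiber_counts)[1:-1]  # drop the smallest and the largest
    return sum(trimmed) / len(trimmed)


# Hypothetical fiber counts of one edge in 10 runs (illustration only).
runs = [12, 15, 14, 13, 40, 14, 15, 13, 12, 2]
print(edge_weight(runs))                             # mean of the 8 middle values
print(edge_weight([5, 0, 6, 7, 5, 6, 7, 5, 6, 7]))   # None: one run found no fiber
```

Because the result is a mean of 8 counts, it need not be an integer, which matches the non-integer fiber numbers reported later in the text.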
For each subject, 5 graphs, with resolutions of 83, 129, 234, 463, and 1015 nodes, were computed by applying the CMTK’s implementation of the FreeSurfer suite of programs for parcellation (Fischl 2012; Desikan et al. 2006; Tournier et al. 2012).
The HCP public release contains the data of 1206 subjects. From these, 1113 contained structural scans. Our workflow (Varga and Grolmusz 2021) was successfully completed for the data of 1064 subjects. Of these subjects, 575 were female and 489 were male. The resulting graphs, with 5 resolutions for each subject, can be downloaded from the site http://braingraph.org/download-pit-group-connectomes/. For the detailed description of the graph-constructing workflow and the resulting graph dataset, we refer to the publication (Varga and Grolmusz 2021).
In the present work, we apply only the coarsest 83-node resolution, i.e., we consider 1064 graphs of 1064 subjects, each on 83 vertices. We have found 1950 edges by taking the union of the edges of the 1064 braingraphs on 83 vertices.
In braingraph u the edge v is denoted by $e^u_v$, for $u = 1, 2, \ldots, 1064$, $v = 1, 2, \ldots, 1950$. The weight of the edge $e^u_v$, denoted by $w(e^u_v)$, is the average number of axonal fibers found running between its endpoints in the 8 tractography computations.
An edge-specific weight-scaling method
We would like to scale individually the weights of the edges such that all the resulting edge-weights are between 0 and 1, as follows:
$$x^u_v = \frac{w(e^u_v) - \min_{t} w(e^t_v)}{\max_{t} w(e^t_v) - \min_{t} w(e^t_v)} \tag{1}$$
if the denominator is not zero; otherwise, let $x^u_v$ be zero; $u = 1, 2, \ldots, 1064$, $v = 1, 2, \ldots, 1950$. This way, for each edge, the smallest weight over all braingraphs is transformed to 0, and the largest (if it differs from the smallest) to 1. From now on, we use these scaled weights $x^u_v$ instead of the original ones. Let $x^u = (x^u_1, x^u_2, \ldots, x^u_{1950})$.
In other words, for any $u$, $x^u$ describes a braingraph, with the new, scaled edge weights as its coordinates.
In what follows, we do not use the superscript if the meaning of x is clear from the context.
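A minimal sketch of the edge-specific scaling follows, assuming that the minimum and the maximum are taken for each edge over all braingraphs (this reading is suggested by the phrase “edge-specific”, but it is an assumption of the sketch). The function name `scale_edge_weights` and the small input matrix are hypothetical.

```python
def scale_edge_weights(weights):
    """Min-max scale each edge (column) across braingraphs (rows) into [0, 1].

    weights[u][v] is the raw weight of edge v in braingraph u.
    An edge whose minimum equals its maximum is mapped to 0 everywhere.
    """
    n_edges = len(weights[0])
    lows = [min(row[v] for row in weights) for v in range(n_edges)]
    highs = [max(row[v] for row in weights) for v in range(n_edges)]
    scaled = []
    for row in weights:
        scaled.append([
            (row[v] - lows[v]) / (highs[v] - lows[v]) if highs[v] > lows[v] else 0.0
            for v in range(n_edges)
        ])
    return scaled


# Three hypothetical braingraphs with three edges each (illustration only).
raw = [[0.0, 100.0, 7.0],
       [13.5, 300.0, 7.0],
       [6.75, 200.0, 7.0]]
scaled = scale_edge_weights(raw)
print(scaled)
```

Scaling each edge separately keeps edges with very different fiber counts comparable: a scaled weight of 1 always means the strongest occurrence of that particular edge in the cohort.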
An SVM-based technique with heuristic improvements
The support vector machines (SVMs) are frequently used tools in artificial intelligence to classify the elements of large data sets (Cortes and Vapnik 1995).
Suppose that we have k data points in the n-dimensional Euclidean space, $x^1, x^2, \ldots, x^k \in \mathbb{R}^n$, and a function $f: \{x^1, x^2, \ldots, x^k\} \to \{0, 1\}$. We intend to find a hyperplane in $\mathbb{R}^n$, such that
(1) one side of the hyperplane contains all $x^i$’s with $f(x^i) = 0$, and the other side of the hyperplane contains all $x^i$’s with $f(x^i) = 1$, and
(2) the hyperplane separates the data points with the largest margin, that is, the distance of the closest data point to the hyperplane is maximized.
If $k \le n + 1$ then the requirement (1) can always be met if the points are in a general position in the n-dimensional Euclidean space (one can see this simply by solving a linear system of equations with a non-zero determinant for finding the normal vector of the hyperplane). If $k > n + 1$, then (1) (i.e., the perfect separation with a hyperplane) is not always satisfiable. We refer to Cover’s theorem for probability estimations for the satisfiability of (1) when $k > n + 1$ (Cover 1965).
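For orientation, requirements (1) and (2) are usually written as the textbook hard-margin SVM problem. The formulation below is the standard one, not quoted from this paper; it assumes that the two classes are recoded as labels $y_i \in \{-1, +1\}$, and the symbols $w$ (normal vector) and $b$ (offset) are the usual names for the hyperplane parameters.

```latex
\min_{w \in \mathbb{R}^n,\; b \in \mathbb{R}} \; \tfrac{1}{2}\,\lVert w \rVert_2^2
\quad \text{subject to} \quad
y_i \,\bigl(w \cdot x^i + b\bigr) \ge 1, \qquad i = 1, 2, \ldots, k .
```

Any feasible pair $(w, b)$ satisfies requirement (1), and since the closest data point then lies at distance $1 / \lVert w \rVert_2$ from the hyperplane, minimizing $\lVert w \rVert_2$ is the same as maximizing the margin in requirement (2).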
In the present work, first, we solved (1) and (2) for the $n = 1950$ dimensional space, with $k = 1064$, by using the Python Scikit-Learn suite (Hao and Ho 2019). Next, we intended to reduce the number of coordinates (i.e., the number of edges) which are present in the separation. In other words, we needed to find as few coordinates as possible, such that the male and female connectomes can be separated by a hyperplane, using only the chosen coordinates.
This goal can be formalized as follows:
Let $\|w\|_0$ denote the number of the non-zero coordinates of vector w. Then we need to find
$$\min_{w \in \mathbb{R}^{1950},\; b \in \mathbb{R}} \|w\|_0 \tag{2}$$
satisfying
$$w \cdot x^u + b > 0 \tag{3}$$
for every $x^u$ corresponding to a female braingraph, and
$$w \cdot x^u + b < 0 \tag{4}$$
for every $x^u$ corresponding to a male braingraph.
To the best of our knowledge, no optimization method is known for solving this problem exactly in polynomial time. Here we have applied the combination of two simple heuristic solution methods, by which we were able to reduce $\|w\|_0$ from 1950 to 102. In other words, we can identify 102 coordinates of $x$ or, equivalently, 102 edges of the graph, such that the sex of the corresponding subject can be expressed by the sign of the linear expression $w \cdot x + b$. The value of b and the 102 non-zero coordinates of w are given in the Supporting material, in Supporting Table 2.
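The resulting linear test is simple enough to state in code. The sketch below assumes that a positive value of the linear expression is read as female and a negative value as male; the function name `predict_sex` and the three-coordinate weight vector are hypothetical and are not the values of Supporting Table 2.

```python
def predict_sex(x, w, b):
    """Linear test: the sign of w.x + b decides the class.

    Assumption of this sketch: a positive value is read as female,
    a non-positive value as male.
    """
    score = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return "female" if score > 0 else "male"


# Hypothetical weights and offset for three scaled edges (illustration only).
w = [0.8, -0.5, 0.3]
b = -0.2
print(predict_sex([1.0, 0.0, 1.0], w, b))
print(predict_sex([0.0, 1.0, 0.0], w, b))
```

Only the coordinates with a non-zero weight influence the score, which is why reducing the number of non-zero coordinates of w directly reduces the number of edges the test needs.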
The first heuristic algorithm is a Weight-Based Dimension-Reduction Algorithm (WBDRA): Here, we start with a w, which separates linearly, and next delete the smallest weight coordinates of w. A rate parameter r defines that the r fraction of the smallest coordinates needs to be deleted. If the new w does not separate, then we backtrack and decrease r. The code of the algorithm is given in the Supporting Material, as Program Code 1.
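The control flow of such a weight-based reduction can be sketched as follows. This is not the authors' Program Code 1: it is a minimal sketch that assumes the model is refitted on the kept coordinates after each deletion step, and it hides that refitting behind a hypothetical callback `fit`. The names `wbdra`, `fit`, `rate` and `min_rate`, and the toy data, are all assumptions of this illustration.

```python
def wbdra(weights, fit, rate=0.5, min_rate=0.01):
    """Sketch of a weight-based dimension-reduction loop.

    weights: dict {coordinate: weight} of a separating linear model.
    fit:     hypothetical callback; given a set of coordinates, it returns a
             new {coordinate: weight} dict that separates the data using only
             those coordinates, or None if it finds no separation.
    rate:    fraction (between 0 and 1) of the remaining coordinates to drop
             in one step.
    """
    current = dict(weights)
    while rate >= min_rate and len(current) > 1:
        n_drop = max(1, int(rate * len(current)))
        by_size = sorted(current, key=lambda c: abs(current[c]))
        kept = set(by_size[n_drop:])  # drop the n_drop smallest |weight| values
        refit = fit(kept) if kept else None
        if refit is None:
            if n_drop == 1:
                break  # even the single smallest coordinate cannot be dropped
            rate /= 2  # backtrack: keep `current`, retry with a smaller step
        else:
            current = refit
    return current


# Toy stand-in for the SVM: separation is possible iff coordinates 0 and 3 are kept.
TOY_WEIGHTS = {0: 2.0, 1: 0.1, 2: -0.3, 3: -1.5, 4: 0.05, 5: 0.2}

def toy_fit(coords):
    return {c: TOY_WEIGHTS[c] for c in coords} if {0, 3} <= coords else None

result = wbdra(TOY_WEIGHTS, toy_fit)
print(sorted(result))
```

In a real run, `fit` would retrain the linear SVM on the kept coordinates and check that all training points are classified correctly.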
The second procedure is a Single Dimension Deleting Algorithm (SDDA): Here, we start with a separating w, and take a random order of the non-zero coordinates of w, and attempt to delete one dimension if the separation property remains valid. If not, then we try to delete the next dimension. The code of SDDA is given as Program Code 2 in the Supporting Material.
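A single pass of this one-at-a-time deletion can be sketched as below. Again, this is not the authors' Program Code 2; the separability check is abstracted into a hypothetical callback `separates`, and the names `sdda` and `seed` are assumptions of this illustration.

```python
import random


def sdda(coords, separates, seed=0):
    """Sketch of a single-dimension deletion pass.

    coords:    iterable of the coordinates with non-zero weight.
    separates: hypothetical callback; returns True if the data can still be
               linearly separated using only the given set of coordinates.
    seed:      seed of the random order in which deletions are attempted.
    """
    kept = set(coords)
    order = sorted(kept)
    random.Random(seed).shuffle(order)  # random order of the coordinates
    for c in order:
        trial = kept - {c}
        if trial and separates(trial):
            kept = trial  # the separation survives the deletion: accept it
        # otherwise keep c and try the next coordinate
    return kept


# Toy stand-in: separation is possible iff coordinates 0 and 3 are both kept.
needed = {0, 3}
remaining = sdda(range(6), lambda chosen: needed <= chosen)
print(sorted(remaining))
```

Each attempted deletion costs one separability check, so this pass is slower per removed coordinate than a rate-based deletion of many coordinates at once.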
With the application of the two heuristic algorithms (WBDRA, SDDA), we have succeeded in reducing $\|w\|_0$ to 102.
We need to add that we cannot prove the optimality of the 102-dimensional solution: we think that even better results can be reached. However, by using Cover’s theorem (Cover 1965), the probability that randomly 0–1 labeled, randomly chosen points are separable by a hyperplane in 102 dimensions is much less than .
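The order of magnitude of such a probability can be checked with Cover's counting formula. The sketch below assumes points in general position and affine hyperplanes (so a hyperplane in n dimensions has n + 1 free parameters); the function names are hypothetical, and the printed value is not claimed to equal the bound quoted in the text.

```python
from fractions import Fraction
from math import comb, log10


def cover_probability(n_points, dim):
    """Probability that a uniformly random 0-1 labeling of `n_points` points in
    general position in `dim`-dimensional space is separable by an affine
    hyperplane, following Cover's counting formula."""
    d = dim + 1  # an affine hyperplane in R^dim has dim + 1 free parameters
    separable = 2 * sum(comb(n_points - 1, k) for k in range(d))
    return Fraction(separable, 2 ** n_points)


def log10_fraction(q):
    """log10 of a positive Fraction, computed from the exact integers so that
    very small probabilities do not underflow."""
    return log10(q.numerator) - log10(q.denominator)


p = cover_probability(1064, 102)
print(f"log10 of the probability: {log10_fraction(p):.1f}")
```

As a sanity check of the formula, four points in general position in the plane admit 14 linearly separable labelings out of 16, and `cover_probability(4, 2)` returns 7/8.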
Since our data points are not randomly distributed, we investigated how exceptional the existence of the 102-dimensional SVM is for our 1064 data points.
We have focused on our main tool, the WBDRA algorithm: using only this procedure, we were able to identify a 115-dimensional sex-separating weight vector; by using SDDA, this dimension was then reduced to 102. Since the WBDRA algorithm is much faster than the SDDA, we used WBDRA in the tests below.
We have performed the following tests 50 times for our specific data points:
We assigned randomly 575 1-labels, and 489 0-labels to the 1064 data points y, corresponding to weighted-edge brain graphs;
next, we have applied the WBDRA algorithm.
The smallest dimension we found was 223, the largest 293, and the average 256.6. Therefore, the 115-dimensional separation of the sexes, found by using the WBDRA algorithm exclusively, is a surprising result, even for the specific y points representing our 1064 braingraphs.
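The random-label test above can be organized as a small driver loop. In this sketch the dimension-reduction heuristic itself is passed in as a hypothetical callback `reduce_dimension`; the function name `label_permutation_test` and the constant stub are assumptions made only so that the example runs on its own.

```python
import random
from statistics import mean


def label_permutation_test(n_ones, n_zeros, reduce_dimension, trials=50, seed=0):
    """Repeat a dimension-reduction heuristic on randomly relabeled data.

    reduce_dimension: hypothetical callback; given a list of 0-1 labels (one
    per data point, in a fixed order), it returns the number of coordinates
    the heuristic needs to separate the two label classes.
    Returns the smallest, largest and average dimension over all trials.
    """
    rng = random.Random(seed)
    labels = [1] * n_ones + [0] * n_zeros
    dims = []
    for _ in range(trials):
        rng.shuffle(labels)  # random assignment, class sizes preserved
        dims.append(reduce_dimension(list(labels)))
    return min(dims), max(dims), mean(dims)


def constant_stub(labels):
    """Stand-in for the real heuristic, used only to make the sketch runnable."""
    return 250


print(label_permutation_test(575, 489, constant_stub))
```

Keeping the class sizes at 575 and 489 in every trial means that only the assignment of labels to braingraphs changes, not the class balance.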
Finding superfeminine and supermasculine edges
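Our goal is to identify edges, which have the greatest impact on decisions (3) and (4). These edges may have very important roles in the sex-specific development and functioning of the human brain. Simply stated, the most important edges would have the coordinates with the largest absolute values in vector w in (3) and (4). In what follows, we formally define 0-generator and 1-generator coordinate sets for a given function $f: [0,1]^N \to \{0,1\}$; in our application f maps weighted edge-sequences to the sex of the subject.
Let [N] denote the set $\{1, 2, \ldots, N\}$.
For $x, s \in [0,1]^N$ and $I \subseteq [N]$ let $x_{I,s}$ denote the vector obtained from x by overwriting its coordinates in I with those of s:
$$(x_{I,s})_i = \begin{cases} s_i & \text{if } i \in I, \\ x_i & \text{otherwise.} \end{cases}$$
Let $X$ denote the set of our 1064 braingraphs, each represented by an $x \in [0,1]^N$; originally, $N = 1950$, i.e., each braingraph was represented by 1950 weighted edges. In the previous section, we have seen that we can reduce $N$ to 102.
For an $s \in [0,1]^N$ let $X_{I,s} = \{x_{I,s} : x \in X\}$.
Definition 1
We say that $I \subseteq [N]$ is a 1-generator for f with a seed $s$, if $f(y) = 1$ for all $y \in X_{I,s}$. Similarly, we say that $I \subseteq [N]$ is a 0-generator for f with a seed $s$ if $f(y) = 0$ for all $y \in X_{I,s}$ is satisfied.
In other words, the seed values in the coordinates in the 0-generator or 1-generator I already determine the value of our f.
Our goal is finding the smallest 0- and 1-generators for f, where f gives the sex of the subject: $f(x) = 0$ for males, and $f(x) = 1$ for females:
$$f(x) = \begin{cases} 1 & \text{if } w \cdot x + b \ge 0, \\ 0 & \text{otherwise.} \end{cases}$$
For this f, finding the minimal 0- and 1-generators is essentially a version of a knapsack problem, solvable by integer programming methods. For the reduction, we need some definitions and simple statements:
Definition 2
For any fixed $w \in \mathbb{R}^N$, let $s^+ \in \{0,1\}^N$ be defined
$$s^+_i = \begin{cases} 1 & \text{if } w_i \ge 0, \\ 0 & \text{if } w_i < 0. \end{cases}$$
Let $s^- \in \{0,1\}^N$ be defined
$$s^-_i = \begin{cases} 0 & \text{if } w_i \ge 0, \\ 1 & \text{if } w_i < 0. \end{cases}$$
It is easy to see that $s^+$ maximizes and $s^-$ minimizes $w \cdot s$ over $s \in [0,1]^N$.
We show the reduction for 1-generators; for 0-generators a similar reduction works.
Lemma 1
If $I \subseteq [N]$ is a 1-generator for f with seed $s$ then it is also a 1-generator with seed $s^+$.
Proof
Let $x \in X$, then $w \cdot x_{I,s^+} + b \ge w \cdot x_{I,s} + b \ge 0$, since $w_i s^+_i \ge w_i s_i$ for every i.
The next Corollary is obvious:
Corollary 2
If I is the smallest 1-generator with any seed, then it is also the smallest 1-generator with seed $s^+$.
Lemma 3
Let $z_i$ denote the coordinates of the 0–1 characteristic vector of set I: $z_i = 1$ if and only if $i \in I$. Then I is a 1-generator for f with a seed $s^+$ if and only if $x \in X$ implies $\sum_{i=1}^{N} w_i \left(z_i s^+_i + (1 - z_i) x_i\right) + b \ge 0$.
Proof
By Definition 1, I is a 1-generator with seed $s^+$ exactly when, for every $x \in X$,
$$w \cdot x_{I,s^+} + b = \sum_{i=1}^{N} w_i \left(z_i s^+_i + (1 - z_i) x_i\right) + b \tag{5}$$
is non-negative.
Note that for any $x \in X$ with $f(x) = 1$, and for any choice of z, the expression (5) is also non-negative.
From Lemma 3, the optimization problem, which gives the minimum 1-generator, can be written: Minimize $\sum_{i=1}^{N} z_i$, with $z_i \in \{0, 1\}$, with the condition that for all $x \in X$
$$\sum_{i=1}^{N} w_i \left(z_i s^+_i + (1 - z_i) x_i\right) + b \ge 0.$$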
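On toy sizes, the minimization above can be illustrated by exhaustive search instead of integer programming. The sketch below assumes one particular reading of the definitions: a set I is a 1-generator if overwriting the coordinates in I with the seed values makes the linear test return 1 for every data point. The function names, the weights and the data points are hypothetical; the brute-force search is a stand-in chosen for clarity, not the integer-programming approach used in the paper.

```python
from itertools import combinations


def linear_test(x, w, b):
    """Return 1 if w.x + b is non-negative, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0


def is_one_generator(I, seed, data, w, b):
    """True if overwriting the coordinates in I with the seed values makes the
    linear test return 1 for every data point."""
    for x in data:
        y = [seed[i] if i in I else x[i] for i in range(len(x))]
        if linear_test(y, w, b) != 1:
            return False
    return True


def smallest_one_generator(data, w, b):
    """Exhaustive search for a minimum-size 1-generator (toy sizes only)."""
    n = len(w)
    seed = [1 if wi >= 0 else 0 for wi in w]  # seed maximizing w.x on [0, 1]^n
    for size in range(n + 1):
        for I in combinations(range(n), size):
            if is_one_generator(set(I), seed, data, w, b):
                return set(I), seed
    return None


# Hypothetical weights, offset and scaled data points (illustration only).
w = [2.0, -1.0, 0.5]
b = -1.2
data = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.2, 0.8, 0.1]]
print(smallest_one_generator(data, w, b))
```

The search visits subsets in order of increasing size, so the first hit is a minimum-size generator; this is feasible only for a handful of coordinates, which is why an integer-programming solver is the practical choice for 102 edges.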
Definition 3
The edges in 1-generators, where the corresponding seed coordinates are ones, are called superfeminine edges. The edges in 0-generators, where the corresponding seed coordinates are ones, are called supermasculine edges. An edge e is called a switching edge, if any of the following two properties holds for it:
e is a superfeminine edge and it is also in a 0-generator with the corresponding seed coordinate 0; or
e is a supermasculine edge, which is also in a 1-generator with the corresponding seed value 0.
The distinction by the seed coordinates equal to 1 is made because the weights correspond to fiber numbers, and only the “strong” graph edges, defined by many fibers, are called superfeminine or supermasculine edges; we do not intend to call “weak” edges, i.e., edges with the fewest fiber tracts, superfeminine or supermasculine, even if they are part of a 0- or 1-generator (e.g., we do not call F2 a supermasculine edge). The superfeminine and supermasculine edges we found are depicted in Fig. 3.
Software used
The braingraphs were computed by using the CMTK suite (Daducci et al. 2012), with the details given in the beginning of the section. The figures were created by using Python Matplotlib mplot3D and Networkx packages. The 1950-dimensional SVM was computed using the Python Scikit-Learn suite of programs (Hao and Ho 2019). The heuristic improvements, resulting in the 102-dimensional separation, were found by the programs given in the Supporting Material in the Program codes section. For IP optimization, we used the Python Pulp package.
Discussion and results
Most cerebral sex dimorphism studies to date were done on very small (up to 40–80 subjects) cohorts and applied mostly volumetric investigations (Frederikse et al. 1999; Koscik et al. 2009; Maleki et al. 2012; Butler et al. 2006). Our previous works (Szalkai et al. 2015b, 2021, 2018; Fellner et al. 2019, 2020a, 2020c) first demonstrated sex dimorphisms in a priori defined graph parameters; in most cases the better connectivity-related parameters were found in the female connectomes.
Here we first demonstrate relatively small edge-sets, which determine the sex of the subjects on a very large, 1064-member cohort.
The 102 edges, which already define the sex of the subjects without any error, are listed in the Supporting material as Supporting Table 1. Numerous edges connect subcortical nuclei with other parts of the brain. 13 of these 102 edges are inter-hemispheric.
The most frequently appearing nodes in these 102 edges, without considering lateralization, are the inferiorparietal (10 times), posteriorcingulate (9 times), precuneus (9 times), superiorparietal (8 times).
It is known that the inferior parietal lobule, which is a part of the heteromodal association cortex (HASC), shows sexual volumetric dimorphisms (Frederikse et al. 1999; Koscik et al. 2009).
Sex differences involving the precuneus were reported in the development of migraine (Maleki et al. 2012) and in mental rotation (Butler et al. 2006).
Counting with lateralization, the most frequent nodes are the rh.precuneus (7 times), rh.inferiorparietal (6 times), rh.posteriorcingulate (6 times) and the right-pallidum (6 times), all in the right hemisphere.
To the best of our knowledge, we are the first showing that not only these nodes of the braingraph, but rather their important connections, listed in Supporting Table 1, carry substantial sex dimorphisms.
Additionally, we are the first to show the existence of superfeminine and supermasculine edges.
The superfeminine edges we have found are
F1: (rh.superiorfrontal, Left-Putamen)
F2: (rh.parstriangularis, rh.superiorparietal).
The two supermasculine edges with the F2 “switching” edge are:
M1: (lh.rostralmiddlefrontal, Left-Thalamus-Proper)
M2: (Right-Hippocampus, lh.supramarginal)
F2: (rh.parstriangularis, rh.superiorparietal)
The weights in fiber numbers of these edges are between 0 and 13.5 for F1; 0 and 385.375 for F2; 0 and 2010.5 for M1 and 0 and 27 for M2. Note that the weights are computed for each edge as the average of 8 tractography runs; therefore, they are not always integers.
The most interesting edge is F2, which, with high weight, is a superfeminine edge, and with low weight, and with M1 and M2 with high weights, it implies the male sex of the subject. We note that we use the terms “high” and “low” here, instead of 1 and 0 here. This is because if we set the weight of F1 and F2 both to 1, then the test will decide that the subject is a female (see Corollary 2); but it may happen that no actual female braingraph has the weight of F1 and F2 equal to exactly 1.
The area of Pars Triangularis was related to hormonal (oxytocin and arginine vasopressin) effects in men, and the same hormones to the parietal cortex—instead of Pars Triangularis—in women (Rubin et al. 2017). It is striking that just this edge, connecting the Pars Triangularis and the Superior Parietal area in the right hemisphere, has this distinguished “switching” property. Other publications also report sex differences in Pars Triangularis and the parietal cortex in context with hormonal regulation (Striepens et al. 2014; Hecht et al. 2017; Skvortsova et al. 2020), speech-language production (Foundas et al. 1998; Frederikse et al. 1999; Yao et al. 2020), in mental rotation performance (Koscik et al. 2009).
There exist numerous other sets of edges with the superfeminine and supermasculine property; we presented these because they were the smallest sets we have found. We note that knowing only the weights of F1 and F2, or of M1, M2 and F2, will not imply the sex in general, except when their weights are extremal.
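The reasoning behind such edge sets can be illustrated with a worst-case bound on a linear decision function. The sketch below assumes a classifier of the form f(x) = w·x + b over weights rescaled to [0, 1], with a positive score read as female. The sign convention, the coefficients, the bias, and the extra edges "e5" and "e6" are hypothetical and are not the values fitted in the paper. Pinning a few edges determines the sex whenever even the least favorable values of all remaining edges cannot flip the sign of the score.

```python
def decision_bounds(coef, bias, fixed):
    """Return (min, max) of w.x + b over x in [0, 1]^n with the `fixed` edges pinned."""
    lo = hi = bias
    for edge, w in coef.items():
        if edge in fixed:
            lo += w * fixed[edge]
            hi += w * fixed[edge]
        elif w < 0:
            lo += w  # a free edge with negative coefficient lowers the score most at 1
        else:
            hi += w  # a free edge with positive coefficient raises the score most at 1
    return lo, hi

# Hypothetical coefficients of a toy 6-edge classifier (positive score = female).
coef = {"F1": 4.0, "F2": 5.0, "M1": -3.0, "M2": -3.5, "e5": 1.0, "e6": -1.5}
bias = -0.5

# F1 = F2 = 1: the score stays positive whatever the other edges are.
female_lo, female_hi = decision_bounds(coef, bias, {"F1": 1.0, "F2": 1.0})

# M1 = M2 = 1 and F2 = 0: the score stays negative whatever the other edges are.
male_lo, male_hi = decision_bounds(coef, bias, {"M1": 1.0, "M2": 1.0, "F2": 0.0})

# Pinning only M1 = M2 = 1 is not enough here: the sign can still go either way.
open_lo, open_hi = decision_bounds(coef, bias, {"M1": 1.0, "M2": 1.0})
```

In this toy setting F2 plays the role of the switching edge: with M1 and M2 at 1, the outcome is forced to male only once F2 is also set to 0. Intermediate weights leave the interval of possible scores straddling zero, which matches the remark that the sex is implied only for extremal weights.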
Conclusions
Instead of starting from “a priori” hypotheses, we have followed an “a posteriori” search for edges in the human connectome that determine the sex of the subjects. We have identified 102 edges that determine the sex in a very simple, linear way in a 1064-member cohort. Rather than all 1950 possible edges, these 102 edges alone suffice to imply the sex of the subject without any error.
For the first time in the literature, we have found sets of two and three edges, out of the 102, whose weights, when properly set, imply the sex of the subject independently of the other edges in the graph. The right Pars Triangularis area is present as an endpoint in these edges. This area is related to hormonal (oxytocin and arginine vasopressin) effects in men, and the same hormones are related to the parietal cortex (instead of Pars Triangularis) in women (Rubin et al. 2017). The parietal cortex is also present as an endpoint in these edges.
The novel edge-specific scaling of the edge weights, given by formula (1), contributed to the definition of the superfeminine and the supermasculine edges.
Acknowledgements
Data were provided in part by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University. VG and BV were partially supported by the VEKOP-2.3.2-16-2017-00014 program, supported by the European Union and the State of Hungary, co-financed by the European Regional Development Fund, VG by NKFI-127909 Grants of the National Research, Development and Innovation Office of Hungary. LK and ES were supported in part by the EFOP-3.6.3-VEKOP-16-2017-00002 grant, supported by the European Union, co-financed by the European Social Fund. The authors are indebted to Balázs Szalkai for consultations on this work. Funding was provided by Nemzeti Kutatási Fejlesztési és Innovációs Hivatal.
Author contributions
LK and ES suggested using SVM in finding characteristic edges, performed integer quadratic programming optimizations, and invented the methods of finding very few (i.e., 102) characteristic edges and the superfeminine and supermasculine edges. BV computed the braingraphs from the HCP public data and constructed the interactive chart at http://pitgroup.org/static/interactive_chart/abra.html. VG initiated the study, secured funding, analyzed the results, and wrote the paper.
Funding
Open access funding provided by Eötvös Loránd University.
Data availability
The data source of this work was published at the Human Connectome Project’s website at http://www.humanconnectome.org (McNab et al. 2013) as the 1200-subjects public release. The parcellation data, containing the anatomically labeled ROIs, is listed in the CMTK nipype GitHub repository https://github.com/LTS5/cmp_nipype/blob/master/cmtklib/data/parcellation/lausanne2008/ParcellationLausanne2008.xls. The large Excel table, which computes the linear separation of the 1064 male and female braingraphs using only 102 edges, can be accessed at http://uratim.com/agysvm/agy-svm.zip; note that the Excel file contains the non-scaled weights, to facilitate easy verification. The interactive chart showing the linear separation between the 1064 braingraphs of the sexes is available at http://pitgroup.org/static/interactive_chart/abra.html.
Declarations
Conflict of interest
The authors have no competing interests to disclose.
Footnotes
László Keresztes and Evelin Szögi are joint first authors.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Contributor Information
László Keresztes, Email: keresztes@pitgroup.org.
Evelin Szögi, Email: szogi@pitgroup.org.
Bálint Varga, Email: balorkany@pitgroup.org.
Vince Grolmusz, Email: grolmusz@pitgroup.org.
References
- Agosta F, Galantucci S, Valsasina P, Canu E, Meani A, Marcone A, Magnani G, Falini A, Comi G, Filippi M (2014) Disrupted brain connectome in semantic variant of primary progressive aphasia. Neurobiol Aging 35(11):2646–2655
- Alexander-Bloch AF, Reiss PT, Rapoport J, McAdams H, Giedd JN, Bullmore ET, Gogtay N (2014) Abnormal cortical growth in schizophrenia targets normative modules of synchronized development. Biol Psychiatry 76(6):438–446
- Baker JT, Holmes AJ, Masters GA, Thomas Yeo BT, Krienen F, Buckner RL, Öngür D. Disruption of cortical association networks in schizophrenia and psychotic bipolar disorder. JAMA Psychiatry. 2014;71(2):109–118. doi: 10.1001/jamapsychiatry.2013.3469.
- Ball G, Aljabar P, Zebari S, Tusor N, Arichi T, Merchant N, Robinson EC, Ogundipe E, Rueckert D, Edwards AD, Counsell SJ. Rich-club organization of the newborn human brain. Proc Natl Acad Sci USA. 2014;111(20):7456–7461. doi: 10.1073/pnas.1324118111.
- Bargmann CI. Beyond the connectome: how neuromodulators shape neural circuits. BioEssays. 2012;34(6):458–465. doi: 10.1002/bies.201100185.
- Batalle D, Muñoz-Moreno E, Figueras F, Bargallo N, Eixarch E, Gratacos E. Normalization of similarity-based individual brain networks from gray matter MRI and its association with neurodevelopment in infants with intrauterine growth restriction. Neuroimage. 2013;83:901–911. doi: 10.1016/j.neuroimage.2013.07.045.
- Besson P, Dinkelacker V, Valabregue R, Thivard L, Leclerc X, Baulac M, Sammler D, Colliot O, Lehéricy S, Samson S, Dupont S. Structural connectivity differences in left and right temporal lobe epilepsy. Neuroimage. 2014;100C:135–144. doi: 10.1016/j.neuroimage.2014.04.071.
- Bonilha L, Nesland T, Rorden C, Fillmore P, Ratnayake RP, Fridriksson J. Mapping remote subcortical ramifications of injury after ischemic strokes. Behav Neurol. 2014;2014:215380. doi: 10.1155/2014/215380.
- Brin S, Page L. The anatomy of a large-scale hypertextual web search engine. Comput Netw ISDN Syst. 1998;30:107–117.
- Butler T, Imperato-McGinley J, Pan H, Voyer D, Cordero J, Zhu Y-S, Stern E, Silbersweig D. Sex differences in mental rotation: top-down versus bottom-up processing. NeuroImage. 2006;32:445–456. doi: 10.1016/j.neuroimage.2006.03.030.
- Chudnovsky M, Robertson N, Seymour P, Thomas R. The strong perfect graph theorem. Ann Math. 2006;164(1):51–229.
- Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20(3):273–297.
- Cover TM (1965) Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Trans Electron Comput EC–14(3):326–334
- Craddock RC, Milham MP, LaConte SM. Predicting intrinsic brain activity. Neuroimage. 2013;82:127–136. doi: 10.1016/j.neuroimage.2013.05.072.
- Daducci A, Gerhard S, Griffa A, Lemkaddem A, Cammoun L, Gigandet X, Meuli R, Hagmann P, Thiran J-P. The connectome mapper: an open-source processing pipeline to map connectomes with MRI. PLoS ONE. 2012;7(12):e48121. doi: 10.1371/journal.pone.0048121.
- Desikan RS, Segonne F, Fischl B, Quinn BT, Dickerson BC, Blacker D, Buckner RL, Dale AM, Maguire RP, Hyman BT, Albert MS, Killiany RJ. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage. 2006;31(3):968–980. doi: 10.1016/j.neuroimage.2006.01.021.
- Erdos P, Stone AH, et al. On the structure of linear graphs. Bull Am Math Soc. 1946;52:1087–1091.
- Euler L (1741) Solutio problematis ad geometriam situs pertinentis. Commentarii Academiae Scientarum Imperialis Petropolitanae 8(1):128–140
- Fellner M, Varga B, Grolmusz V. The frequent subgraphs of the connectome of the human brain. Cognit Neurodyn. 2019;13(5):453–460. doi: 10.1007/s11571-019-09535-y.
- Fellner M, Varga B, Grolmusz V. The frequent network neighborhood mapping of the human hippocampus shows much more frequent neighbor sets in males than in females. PLoS ONE. 2020;15(1):e0227910. doi: 10.1371/journal.pone.0227910.
- Fellner M, Varga B, Grolmusz V (2020b) Good neighbors, bad neighbors: the frequent network neighborhood mapping of the hippocampus enlightens several structural factors of the human intelligence on a 414-subject cohort. Sci Rep 10(11967)
- Fellner M, Varga B, Grolmusz V. The frequent complete subgraphs in the human connectome. PLoS ONE. 2020;15(8):e0236883. doi: 10.1371/journal.pone.0236883.
- Fischl B. Freesurfer. Neuroimage. 2012;62(2):774–781. doi: 10.1016/j.neuroimage.2012.01.021.
- Foundas AL, Eure KF, Luevano LF, Weinberger DR (1998) MRI asymmetries of Broca's area: the Pars Triangularis and Pars Opercularis 64:282–296
- Frederikse ME, Lu A, Aylward E, Barta P, Pearlson G. Sex differences in the inferior parietal lobule. Cereb Cortex. 1999;9(8):896–901. doi: 10.1093/cercor/9.8.896.
- Graham DJ. Routing in the brain. Front Comput Neurosci. 2014;8:44. doi: 10.3389/fncom.2014.00044.
- Hagmann P, Cammoun L, Gigandet X, Meuli R, Honey CJ, Wedeen VJ, Sporns O. Mapping the structural core of human cerebral cortex. PLoS Biol. 2008;6(7):e159. doi: 10.1371/journal.pbio.0060159.
- Hagmann P, Grant PE, Fair DA. MR connectomics: a conceptual framework for studying the developing brain. Front Syst Neurosci. 2012;6:43. doi: 10.3389/fnsys.2012.00043.
- Hao J, Ho TK (2019) Machine learning made easy: a review of scikit-learn package in python programming language. J Educ Behav Stat 44:348–361
- Hecht EE, Robins DL, Gautam P, King TZ. Intranasal oxytocin reduces social perception in women: neural activation and individual variation. Neuroimage. 2017;147:314–329. doi: 10.1016/j.neuroimage.2016.12.046.
- Ingalhalikar M, Smith A, Parker D, Satterthwaite TD, Elliott MA, Ruparel K, Hakonarson H, Gur RE, Gur RC, Verma R. Sex differences in the structural connectome of the human brain. Proc Natl Acad Sci USA. 2014;111(2):823–828. doi: 10.1073/pnas.1316909110.
- Kerepesi C, Grolmusz V. The Giant Virus Finder discovers an abundance of giant viruses in the Antarctic dry valleys. Arch Virol. 2017;162(6):1671–1676. doi: 10.1007/s00705-017-3286-4.
- Kerepesi C, Szalkai B, Varga B, Grolmusz V. How to direct the edges of the connectomes: dynamics of the consensus connectomes and the development of the connections in the human brain. PLoS ONE. 2016;11(6):e0158680. doi: 10.1371/journal.pone.0158680.
- Kerepesi C, Szalkai B, Varga B, Grolmusz V. The braingraph.org database of high resolution structural connectomes and the brain graph tools. Cognit Neurodyn. 2017;11(5):483–486. doi: 10.1007/s11571-017-9445-1.
- Kerepesi C, Varga B, Szalkai B, Grolmusz V. The dorsal striatum and the dynamics of the consensus connectomes in the frontal lobe of the human brain. Neurosci Lett. 2018;673:51–55. doi: 10.1016/j.neulet.2018.02.052.
- Kerepesi C, Szalkai B, Varga B, Grolmusz V. Comparative connectomics: mapping the inter-individual variability of connections within the regions of the human brain. Neurosci Lett. 2018;662(1):17–21. doi: 10.1016/j.neulet.2017.10.003.
- Koscik T, O'Leary D, Moser DJ, Andreasen NC, Nopoulos P. Sex differences in parietal lobe morphology: relationship to mental rotation performance. Brain Cognit. 2009;69(3):451–459. doi: 10.1016/j.bandc.2008.09.004.
- Leighton FT (1992) Introduction to parallel algorithms and architectures: arrays, trees, hypercubes. Elsevier
- Maleki N, Linnman C, Brawn J, Burstein R, Becerra L, Borsook D. Her versus his migraine: multiple sex differences in brain function and structure. Brain. 2012;135:2546–2559. doi: 10.1093/brain/aws175.
- McNab JA, Edlow BL, Witzel T, Huang SY, Bhat H, Heberlein K, Feiweier T, Liu K, Keil B, Cohen-Adad J, Tisdall MD, Folkerth RD, Kinney HC, Wald LL. The Human Connectome Project and beyond: initial applications of 300 mT/m gradients. Neuroimage. 2013;80:234–245. doi: 10.1016/j.neuroimage.2013.05.074.
- Ortiz A, Gorriz JM, Ramirez J, Salas-Gonzalez D. Improving MR brain image segmentation using self-organising maps and entropy-gradient clustering. Inf Sci. 2014;262:117–136.
- Page L, Brin S, Motwani R, Winograd T (1999) The pagerank citation ranking: bringing order to the web. Technical Report 1999-66, Stanford InfoLab, November. Previous number = SIDL-WP-1999-0120
- Rubin LH, Li Y, Keedy SK, Reilly JL, Bishop JR, Carter CS, Pournajafi-Nazarloo H, Drogos LL, Tamminga CA, Pearlson GD, Keshavan MS, Clementz BA, Hill SK, Liao W, Ji G-J, Lui S, Sweeney JA. Sex differences in associations of arginine vasopressin and oxytocin with resting-state functional brain connectivity. J Neurosci Res. 2017;95:576–586. doi: 10.1002/jnr.23820.
- Skvortsova A, Veldhuijzen DS, de Rover M, Pacheco-Lopez G, Bakermans-Kranenburg M, van IJzendoorn M, Chavannes NH, van Middendorp H, Evers AWM (2020) Effects of oxytocin administration and conditioned oxytocin on brain activity: an fmri study. PLoS ONE 15:e0229692
- Sporns O, Tononi G, Kotter R. The human connectome: a structural description of the human brain. PLoS Comput Biol. 2005;1(4):e42. doi: 10.1371/journal.pcbi.0010042.
- Striepens N, Matusch A, Kendrick KM, Mihov Y, Elmenhorst D, Becker B, Lang M, Coenen HH, Maier W, Hurlemann R, Bauer A (2014) Oxytocin enhances attractiveness of unfamiliar female faces independent of the dopamine reward system, vol 39, pp 74–87
- Szalkai B, Grolmusz V (2017) Near perfect protein multi-label classification with deep neural networks. Methods (San Diego, Calif.) 132:50–56
- Szalkai B, Grolmusz V (2018) SECLAF: a webserver and deep neural network design tool for hierarchical biological sequence classification. Bioinformatics 34(14):2487–2489
- Szalkai B, Kerepesi C, Varga B, Grolmusz V. The Budapest reference connectome server v2.0. Neurosci Lett. 2015;595:60–62. doi: 10.1016/j.neulet.2015.03.071.
- Szalkai B, Kerepesi C, Varga B, Grolmusz V. Parameterizable consensus connectomes from the Human Connectome Project: the Budapest reference connectome server v3.0. Cognit Neurodyn. 2017;11(1):113–116. doi: 10.1007/s11571-016-9407-z.
- Szalkai B, Kerepesi C, Varga B, Grolmusz V (2019a) High-resolution directed human connectomes and the consensus connectome dynamics. PLoS ONE 14(4):e0215473
- Szalkai B, Varga B, Grolmusz V. Graph theoretical analysis reveals: Women’s brains are better connected than men’s. PLoS ONE. 2015;10(7):e0130045. doi: 10.1371/journal.pone.0130045.
- Szalkai B, Varga B, Grolmusz V (2017b) The robustness and the doubly-preferential attachment simulation of the consensus connectome dynamics of the human brain. Sci Rep 7(16118)
- Szalkai B, Varga B, Grolmusz V. Brain size bias-compensated graph-theoretical parameters are also better in women’s connectomes. Brain Imaging Behav. 2018;12(3):663–673. doi: 10.1007/s11682-017-9720-0.
- Szalkai B, Varga B, Grolmusz V. Mapping correlations of psychological and connectomical properties of the dataset of the human connectome project with the maximum spanning tree method. Brain Imaging Behav. 2019;13(5):1185–1192. doi: 10.1007/s11682-018-9937-6.
- Szalkai B, Varga B, Grolmusz V (2021) The graph of our mind. Brain Sci 11(3)
- Szemeredi E (1975) Regular partitions of graphs. In: Colloq. Internat. CNRS, Univ. Orsay, Orsay, 1976, volume 260. CNRS
- Tournier J, Calamante F, Connelly A, et al. Mrtrix: diffusion tractography in crossing fiber regions. Int J Imaging Syst Technol. 2012;22(1):53–66.
- Van Essen DC, Ugurbil K, Auerbach E, Barch D, Behrens TEJ, Bucholz R, Chang A, Chen L, Corbetta M, Curtiss SW, Della Penna S, Feinberg D, Glasser MF, Harel N, Heath AC, Larson-Prior L, Marcus D, Michalareas G, Moeller S, Oostenveld R, Petersen SE, Prior F, Schlaggar BL, Smith SM, Snyder AZ, Xu J, Yacoub E, WU-Minn HCP Consortium (2012) The human connectome project: a data acquisition perspective. Neuroimage 62(4):2222–2231
- Varga B, Grolmusz V (2021) The braingraph.org database with more than 1000 robust human structural connectomes in five resolutions. Cogn Neurodyn. 10.1007/s11571-021-09670-5
- Yao S, Liebenthal E, Juvekar P, Bunevicius A, Vera M, Rigolo L, Golby AJ, Tie Y. Sex effect on pre-surgical language mapping in patients with a brain tumor. Front Neurosci. 2020;14:4. doi: 10.3389/fnins.2020.00004.