Abstract
The mechanisms by which modularity emerges in complex networks are not well understood but recent reports have suggested that modularity may arise from evolutionary selection. We show that finding the modularity of a network is analogous to finding the ground-state energy of a spin system. Moreover, we demonstrate that, due to fluctuations, stochastic network models give rise to modular networks. Specifically, we show both numerically and analytically that random graphs and scale-free networks have modularity. We argue that this fact must be taken into consideration to define statistically significant modularity in complex networks.
Statistical, mathematical, and model-based analyses of complex networks have recently uncovered interesting unifying patterns in networks from seemingly unrelated disciplines [1-5]. In spite of these advances, many properties of complex networks remain elusive, a prominent one being modularity [6,7]. For example, it is a matter of common experience that social networks have communities of highly interconnected nodes that are poorly connected to nodes in other communities. Such modular structures have been reported not only in social networks [6-8], but also in biochemical networks [9], food webs [10], and the Internet [11]. It is widely believed that the modular structure of complex networks plays a critical role in their functionality [9]. There is therefore a clear need to develop algorithms to identify modules accurately [6,7,11-13].
More fundamentally, the mechanisms by which modularity emerges in complex networks are not well understood. In biological networks, both biochemical and ecological, researchers have suggested that modularity increases robustness, flexibility, and stability [9,10]. Similarly, in engineered networks, it has been suggested that modularity is an effective way to achieve adaptability in rapidly changing environments [14]. It may therefore seem that evolutionary pressures make networks modular, implying that any successful model of complex networks should take into account external factors that enhance modularity. Recently, however, Solé and Fernández have pointed out that models without any external pressure are able to give rise to modular networks [15].
In this paper, we show that Erdös-Rényi (ER) random graphs, in which any pair of nodes is connected with probability p [16], have a high modularity. We show numerically and analytically that this high modularity is due to fluctuations in the establishment of links, which are magnified by the large number of ways in which a network can be partitioned into modules. Furthermore, we show that one obtains similar results when considering scale-free networks [2]. We conclude by discussing how these results should be taken into consideration to define statistically significant modularity in complex networks.
Following the first quantitative definition of modularity [7,12], several groups have proposed heuristic algorithms to detect modules in complex networks. For a given partition of the nodes of a network into modules, the modularity ℳ of this partition is defined as [7]
$$\mathcal{M} = \sum_{s=1}^{r} \left[ \frac{l_s}{L} - \left( \frac{d_s}{2L} \right)^{2} \right] \tag{1}$$
where r is the number of modules, L is the number of links in the network, $l_s$ is the number of links between nodes in module s, and $d_s$ is the sum of the degrees of the nodes in module s. This definition of modularity implies that ℳ≤1 and that ℳ=0 for a random partition of the nodes [7]. We define the modularity M of a network as the largest modularity over all possible partitions of the network, M=max{ℳ}.
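As an illustration of the definition, the following minimal Python sketch evaluates Eq. (1) for one given partition. The function name `partition_modularity` and the edge-list input format are assumptions made for this example; the sketch only scores a partition and does not search for the maximum M.

```python
from collections import defaultdict

def partition_modularity(edges, module_of):
    """Modularity of one partition, following Eq. (1).

    edges: undirected links (i, j), each listed once.
    module_of: dict mapping every node to its module label.
    """
    edges = list(edges)
    L = len(edges)
    links_within = defaultdict(int)  # l_s: links with both ends in module s
    degree_sum = defaultdict(int)    # d_s: sum of degrees of nodes in module s
    for i, j in edges:
        degree_sum[module_of[i]] += 1
        degree_sum[module_of[j]] += 1
        if module_of[i] == module_of[j]:
            links_within[module_of[i]] += 1
    return sum(links_within[s] / L - (degree_sum[s] / (2 * L)) ** 2
               for s in degree_sum)

# Toy network: two triangles joined by a single link.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_modules = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_module = {node: "a" for node in range(6)}
print(partition_modularity(edges, two_modules))  # one triangle per module
print(partition_modularity(edges, one_module))   # everything in one module
```

For the two-triangle partition each module has $l_s=3$ and $d_s=7$ with L=7, so Eq. (1) gives 2(3/7 − 1/4) = 5/14, whereas placing all nodes in a single module gives zero.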
The problem of finding the modularity of a network with S nodes is therefore analogous to the standard statistical mechanics problem of finding the ground-state energy of the Hamiltonian ℋ=−Lℳ. Specifically, one can map the network into a spin system by defining the variables $s_i$ ∈{1,2,…,S} as the module to which node i belongs and the couplings $J_{ij}$ as being 1 if nodes i and j are connected in the network and 0 otherwise. Then, from Eq. (1), one can demonstrate that
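$$\mathcal{H} = -\sum_{i<j} J_{ij}\,\delta_{s_i s_j} + \frac{1}{4L} \sum_{i,j,k,l} J_{ik} J_{jl}\,\delta_{s_i s_j} \tag{2}$$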
This Hamiltonian corresponds to an S-state Potts model with both ferromagnetic and anti-ferromagnetic terms, and two-, three-, and four-spin interactions. Therefore, it seems difficult to apply methods used in problems that are similar but formally simpler, like the graph coloring problem [17]. Rather, we propose here a heuristic estimation of the modularity for a number of interesting graph models, namely low-dimensional regular lattices, ER random graphs [16] and scale-free networks [2].
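The mapping can be checked numerically on a small example. The sketch below assumes the Hamiltonian obtained by substituting $l_s$ and $d_s$ into ℋ=−Lℳ, namely a ferromagnetic term $-\sum_{i<j} J_{ij}\delta_{s_i s_j}$ plus an anti-ferromagnetic term $(1/4L)\sum_{i,j} k_i k_j \delta_{s_i s_j}$, where $k_i=\sum_l J_{il}$ is the degree of node i. The function name `potts_energy` is hypothetical.

```python
from itertools import combinations

def potts_energy(J, s):
    """Energy of the spin configuration s for the 0/1 coupling matrix J.

    Assumed form (derived from H = -L*M and Eq. (1)):
    H = -sum_{i<j} J_ij d(s_i, s_j) + (1/4L) sum_{i,j} k_i k_j d(s_i, s_j)
    """
    S = len(s)
    k = [sum(row) for row in J]   # node degrees
    L = sum(k) / 2                # number of links
    ferro = sum(J[i][j] for i, j in combinations(range(S), 2) if s[i] == s[j])
    antiferro = sum(k[i] * k[j]
                    for i in range(S) for j in range(S) if s[i] == s[j])
    return -ferro + antiferro / (4 * L)

# Two triangles joined by one link; spins label the two triangles.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
S = 6
J = [[0] * S for _ in range(S)]
for i, j in edges:
    J[i][j] = J[j][i] = 1
s = [0, 0, 0, 1, 1, 1]
print(potts_energy(J, s))
```

For this configuration Eq. (1) gives ℳ=5/14 with L=7, so the energy should equal −Lℳ=−2.5.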
Low-dimensional regular lattices
Consider a one-dimensional lattice with S nodes, each one connected to its two neighbors [20]. This case is particularly simple because the modules comprise only contiguous nodes and, therefore, the number of between-module links equals the number r of modules. Assuming that all modules have approximately the same size n=S/r, the modularity of a partition with r modules is
$$\mathcal{M}(r) = 1 - \frac{r}{S} - \frac{1}{r} \tag{3}$$
where we have used the fact that the number L of links is L≈S. Under these assumptions, the problem of finding the modularity of a regular one-dimensional lattice is reduced to finding the optimal number r* of modules, that is, the number of modules that yields the maximum modularity. One can show that $r^* = \sqrt{S}$, and the modularity is
$$M_{1D} = 1 - \frac{2}{\sqrt{S}} \tag{4}$$
Note that the only assumption in the calculation is that all modules have approximately the same number of nodes. Numerical results confirm that this is a sensible assumption.
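The optimum follows from a short calculation. As a sketch, assuming that r can be treated as a continuous variable and that each of the r modules contains n−1 internal links and has a degree sum of 2n:

```latex
\begin{aligned}
\mathcal{M}(r) &= r\left[\frac{n-1}{S} - \left(\frac{2n}{2S}\right)^{2}\right]
               = 1 - \frac{r}{S} - \frac{1}{r},
               \qquad n = \frac{S}{r}, \\
\frac{d\mathcal{M}}{dr} &= -\frac{1}{S} + \frac{1}{r^{2}} = 0
               \;\Longrightarrow\; r^{*} = \sqrt{S}, \\
\mathcal{M}(r^{*}) &= 1 - \frac{2}{\sqrt{S}} .
\end{aligned}
```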
One can generalize this result to one-dimensional lattices in which each node is connected to z nodes on the left and z on the right. In this case, the leading contributions to the modularity are
$$M_{1D}(S,z) \approx 1 - \sqrt{\frac{2(z+1)}{S}} \tag{5}$$
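A quick numerical check of this case is possible without any optimization heuristic. The sketch below builds a ring in which each node is linked to its z nearest neighbors on each side, splits it into r contiguous and nearly equal blocks, and evaluates Eq. (1) exactly for each r. The comparison value $1-\sqrt{2(z+1)/S}$ is an assumption of this example: it follows from counting z(z+1)/2 links across each block boundary and L=zS links in total. The function names are hypothetical.

```python
import math

def ring_lattice_edges(S, z):
    """Links of a ring where node i is joined to its z nearest neighbors per side."""
    return [(i, (i + k) % S) for i in range(S) for k in range(1, z + 1)]

def block_modularity(S, z, r):
    """Eq. (1) for a partition of the ring into r contiguous, nearly equal blocks."""
    edges = ring_lattice_edges(S, z)
    L = len(edges)
    module = [i * r // S for i in range(S)]  # block index of each node
    within = [0] * r                         # l_s
    degree = [0] * r                         # d_s
    for i, j in edges:
        degree[module[i]] += 1
        degree[module[j]] += 1
        if module[i] == module[j]:
            within[module[i]] += 1
    return sum(within[s] / L - (degree[s] / (2 * L)) ** 2 for s in range(r))

S, z = 400, 2
best_r = max(range(2, 60), key=lambda r: block_modularity(S, z, r))
best_M = block_modularity(S, z, best_r)
predicted = 1 - math.sqrt(2 * (z + 1) / S)
print(best_r, round(best_M, 4), round(predicted, 4))
```

The scan only considers contiguous blocks, in line with the equal-size assumption used in the text, so it illustrates the leading behavior rather than proving that no other partition does better.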
Similarly, one can calculate the modularity of d-dimensional cubic lattices in which each node is connected to 2z nodes in each one of the d directions, to obtain that [18]
$$M_{dD}(S,z) \approx 1 - (d+1)\left(\frac{z+1}{2d}\right)^{d/(d+1)} S^{-1/(d+1)} \tag{6}$$
Random graphs
In ER random graphs [16], each pair of nodes is connected with probability p. As for d-dimensional lattices, we assume that the partition of the network with highest modularity consists of r modules with approximately the same number of nodes n=S/r, the same number of within-module links $k_i$, and the same number of links $k_o$ to other modules. In the S≫1 limit, we can assume that the total number of links is $S^2 p/2$ and, therefore, $k_i$ and $k_o$ are related by
$$k_o(r,k_i) = \frac{S^{2} p}{r} - 2k_i \tag{7}$$
Hence, for S≫1, the modularity of such a partition is simply
$$\mathcal{M}(S,p;r,k_i) = \frac{2 r k_i}{S^{2} p} - \frac{1}{r} \tag{8}$$
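The step from the link balance to this expression can be made explicit. As a sketch, assuming that each between-module link is shared by two modules, so that the r modules together account for $r k_i + r k_o/2 = S^2 p/2$ links:

```latex
\begin{aligned}
L &= \frac{S^{2}p}{2}, \qquad
d_s = 2k_i + k_o = \frac{S^{2}p}{r} = \frac{2L}{r}, \\
\mathcal{M} &= \sum_{s=1}^{r}\left[\frac{k_i}{L} - \left(\frac{d_s}{2L}\right)^{2}\right]
            = \frac{r k_i}{L} - r\,\frac{1}{r^{2}}
            = \frac{2 r k_i}{S^{2} p} - \frac{1}{r} .
\end{aligned}
```

The second term is fixed by the number of modules alone, so for a given r maximizing the modularity amounts to maximizing $k_i$.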
Under these assumptions, the problem of finding the modularity of a random graph is reduced to finding a partition of the graph with the following properties: (i) The partition consists of r equal modules, each one with $k_i$ within-module links; (ii) the partition typically exists in a random graph; and (iii) the partition yields the maximum modularity relative to the other partitions that typically exist.
In a random graph with S nodes and linking probability p, the average number 𝒩 of different partitions with r identical modules, each with $k_i$ links, is 𝒩(S,p;r,$k_i$). A certain partition typically exists if 𝒩(S,p;r,$k_i$)≥1. Among all the partitions that typically exist, we are interested in the one whose modularity is maximum. In other words, given a certain number r of modules, we want a partition with as many within-module links as possible. Therefore, if one finds a very common partition, 𝒩(S,p;r,$k_i$)≫1, it must be possible to find another partition with the same r that has more within-module links and hence larger modularity. This new partition will be rarer than the former one, that is, it has a smaller 𝒩. By iterating this argument, one concludes that the partition we are interested in must satisfy
$$\mathcal{N}(S,p;r,k_i^{*}) = 1 \tag{9}$$
where $k_i^{*}$ is the maximum number of within-module links that one can typically find in a partition with r identical modules.
To calculate 𝒩(S,p;r,$k_i$), we use the following process. First, we calculate the number $\mathcal{N}_1$ of ways in which a module of size n=S/r, with $k_i$ within-module links and $k_o(r,k_i)$ external links, can be separated from the rest of the graph:
| (10) |
where
| (11) |
| (12) |
The next step is to separate the second module from the remaining set of S−n nodes. It is important to note that the second module only needs to establish $k_o(1-n/(S-n))$ external links, because the remaining $k_o n/(S-n)$ are already established with the first module. Therefore,
| (13) |
Repeating this separation process, one can see that the general term is of the form
| (14) |
Finally, 𝒩(S,p;r,ki) is the product of all the individual module separations
| (15) |
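so that Eq. (9) can be solved numerically to obtain the maximum typical number of within-module links, $k_i^{*}$, using Eqs. (11), (12), (14), and (15).
Once we find $k_i^{*}$ for a given value of r, we use Eq. (8) to obtain the modularity. Finally, we select the optimal number of modules r=r*(S,p), and the modularity $M_{ER}(S,p)$ of the ER random graph is
$$M_{ER}(S,p) = \mathcal{M}\left(S,p;r^{*},k_i^{*}(r^{*})\right) \tag{16}$$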
In Fig. 1(a), we compare the modularity of ER graphs obtained through optimization of Eq. (1) using simulated annealing [19], with the predictions of Eq. (16). We find good agreement in the relevant region of sparse but connected graphs, that is, 2/S<p≪1.
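The numerical side of this comparison can be sketched in a few lines. The code below is a generic simulated-annealing heuristic with single-node moves and Metropolis acceptance, not the implementation used for Fig. 1; the move set, the cooling schedule, and the parameter names `T0`, `cooling`, `steps`, and `moves_per_node` are all assumptions chosen for illustration.

```python
import math
import random

def er_graph(S, p, rng):
    """ER random graph: each pair of nodes is linked with probability p."""
    return [(i, j) for i in range(S) for j in range(i + 1, S) if rng.random() < p]

def anneal_modularity(S, edges, rng, T0=0.01, cooling=0.95, steps=150,
                      moves_per_node=10):
    """Approximately maximize Eq. (1) by moving one node at a time."""
    L = len(edges)
    neighbors = [[] for _ in range(S)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    module = list(range(S))                    # start with one module per node
    d = [len(neighbors[i]) for i in range(S)]  # d_s, indexed by module label
    M = -sum((k / (2 * L)) ** 2 for k in d)    # Eq. (1) when every l_s = 0
    T = T0
    for _ in range(steps):
        for _ in range(moves_per_node * S):
            i = rng.randrange(S)
            k = len(neighbors[i])
            if k == 0:
                continue                       # isolated nodes do not affect M
            a = module[i]
            # Usually propose a neighbor's module; occasionally any label,
            # which lets a node split off into an empty module.
            if rng.random() < 0.9:
                b = module[rng.choice(neighbors[i])]
            else:
                b = rng.randrange(S)
            if b == a:
                continue
            links_a = sum(1 for j in neighbors[i] if module[j] == a)
            links_b = sum(1 for j in neighbors[i] if module[j] == b)
            # Change in Eq. (1) when node i moves from module a to module b.
            dM = (links_b - links_a) / L - k * (d[b] - d[a] + k) / (2 * L * L)
            if dM >= 0 or rng.random() < math.exp(dM / T):
                module[i] = b
                d[a] -= k
                d[b] += k
                M += dM
        T *= cooling
    return M, module

rng = random.Random(1)
S, p = 60, 0.05
edges = er_graph(S, p, rng)
M, module = anneal_modularity(S, edges, rng)
print(round(M, 3), len(set(module)))
```

Because only single-node moves are used, the value returned is a lower bound on the modularity M of the graph; a production heuristic would typically add collective moves such as merging and splitting modules.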
FIG. 1.

Modularity in Erdös-Rényi random graphs. (a) Comparison of numerical results of the modularity as a function of the linking probability, and the predictions of Eqs. (16) and (19). The numerical results are obtained by maximizing the modularity, Eq. (1), using simulated annealing [19]. (b) Modularity as a function of pS for large networks, as predicted by Eq. (16). Both in (a) and (b), numerical problems in the solution of Eq. (9) prevent us from obtaining values of the modularity for larger values of p.
Equation (16) enables us to obtain the modularity of large random graphs, something that would not be possible using simulated annealing because of the computational cost. In Fig. 1(b) we show that for S→∞ the modularity only depends on pS
$$M_{ER}(S\to\infty,p) \propto (pS)^{-2/3} \tag{17}$$
To obtain a closed expression for MER for any value of S, we note that at the percolation point pS=2 the random graph contains essentially no loops, that is, the graph is a tree [16]. In this case, one can find partitions in which the number of between-module links equals the number of modules r as in the simple one-dimensional case, and the modularity is
$$M_{ER}(S,p=2/S) = 1 - \frac{2}{\sqrt{S}} \tag{18}$$
We propose the simplest ansatz that verifies Eqs. (17) and (18) simultaneously
$$M_{ER}(S,p) = \left(1 - \frac{2}{\sqrt{S}}\right)\left(\frac{2}{pS}\right)^{2/3} \tag{19}$$
In Fig. 1(a), we show that Eq. (19) is in good agreement with values obtained using simulated annealing.
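For quick estimates, the closed expression can be wrapped in a small helper. The sketch below assumes that the ansatz has the form $(1-2/\sqrt{S})\,(2/pS)^{2/3}$, which reduces to the tree result at pS=2 and depends only on pS for large S; the function name `modularity_er_ansatz` is hypothetical.

```python
import math

def modularity_er_ansatz(S, p):
    """Assumed closed form for the modularity of an ER graph (sparse regime)."""
    return (1 - 2 / math.sqrt(S)) * (2 / (p * S)) ** (2 / 3)

# Example: the network with S=200 and p=0.02 discussed in the text.
print(round(modularity_er_ansatz(200, 0.02), 3))
```

The expression is only meant for the region of sparse but connected graphs, 2/S<p≪1, where the comparison with simulated annealing is made.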
Our analytic treatment allows us to explain the origin of the modularity in random graphs. The typical partition of an ER graph into modules of size n is very unlikely to have a number of within-module links ki larger than the average pn(n−1)/2, expected for a random partition of the nodes. However, the number of possible partitions S!/(n!r) is so large that, typically, there exists a partition whose ki is much larger than the average. For example, for a network with S=200 and p=0.02 one typically finds a partition with r=7 modules and ki≈ instead of the value ki≈8 expected for a random partition.
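The numbers in this example can be reproduced with a few lines of arithmetic. The sketch below assumes that the count of ways to assign S nodes to r modules of n nodes each is $S!/(n!)^r$, and it uses the log-gamma function because n=S/r is not an integer for S=200 and r=7.

```python
import math

S, p, r = 200, 0.02, 7
n = S / r                                   # nodes per module

# Within-module links expected for a random partition of the nodes.
expected_k_i = p * n * (n - 1) / 2

# log10 of S!/(n!)^r, with factorials generalized through lgamma.
log10_partitions = (math.lgamma(S + 1) - r * math.lgamma(n + 1)) / math.log(10)

print(round(expected_k_i, 2), round(log10_partitions))
```

The expected value comes out close to 8, as quoted in the text, while the number of candidate partitions exceeds $10^{160}$, which is why partitions far out in the tail of the $k_i$ distribution typically exist.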
Remarkably, the modularity of a random graph can be as large as that of a graph with modular structure imposed at the onset [6]. In such a graph, nodes are divided into modules and each pair of nodes is connected with probability $p_i$ if they belong to the same module, and with probability $p_o<p_i$ otherwise. Using the same example as before, the modularity of an ER graph with S=200 and p=0.02 is the same as the modularity of a graph with r=7 modules, $p_i$≈0.09, and $p_o$≈0.004.
Scale-free networks
So far, we have considered d-dimensional regular lattices and ER random graphs, in which all nodes have essentially the same degree. However, many complex networks display scale-free degree distributions [4], meaning that some nodes have degrees that are orders of magnitude larger than the average. Since the results presented for ER graphs rely on the fact that there are many partitions of the network and implicitly on the fact that nodes are exchangeable, it is worth asking whether “random” scale-free networks also display modularity.
To answer this question, we use the scale-free model proposed in [2]. In the model, the network grows by the addition of new nodes. Each time a new node is added, it establishes m preferential connections to nodes already in the network. In Fig. 2, we show the modularity of scale-free networks as a function of the network size S for different values of m. As before, we find the modularity by optimizing Eq. (1) using simulated annealing. As for ER graphs, the modularity approaches a finite value for large S and decreases with the connectivity m.
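The growth rule described here can be sketched as follows. This is a minimal illustration of preferential attachment, not the code used for Fig. 2: the seed of m+1 fully connected nodes and the rule that a new node links to m distinct targets are assumptions, since the text does not specify them, and the function name `preferential_attachment` is hypothetical.

```python
import random

def preferential_attachment(S, m, rng):
    """Grow a network to S nodes; each new node links to m distinct existing
    nodes chosen with probability proportional to their degree."""
    # Assumed seed: m + 1 fully connected nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]
    # Every node appears in `stubs` once per link end, so drawing uniformly
    # from this list is the same as drawing nodes in proportion to degree.
    stubs = [node for edge in edges for node in edge]
    for new in range(m + 1, S):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

rng = random.Random(0)
tree = preferential_attachment(50, 1, rng)     # m = 1 yields a tree
network = preferential_attachment(100, 3, rng)
print(len(tree), len(network))
```

With m=1 the seed is a single link and every new node adds exactly one link, so the network has S−1 links and no loops.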
FIG. 2.

Modularity in scale-free networks. Numerical results of the modularity as a function of the network size S for different values of m. These results are obtained by maximizing the modularity, Eq. (1), with simulated annealing. The lines are the predictions of Eq. (21), with a=0.165±0.009 in all the cases.
We are unable to derive a general expression for the modularity of scale-free networks. However, for m=1 the scale-free network is a tree. Thus,
$$M_{SF}(S,m=1) = 1 - \frac{2}{\sqrt{S}} \tag{20}$$
For larger values of m, we find numerically that, at a fixed network size, the modularity is a linear function of 1/m. The simplest possible ansatz for the modularity that verifies this condition and Eq. (20) simultaneously is
$$M_{SF}(S,m) = \left(a + \frac{1-a}{m}\right)\left(1 - \frac{2}{\sqrt{S}}\right) \tag{21}$$
As we show in Fig. 2, this approximation works well for a=0.165±0.009.
Conclusions
We have shown that modularity in networks can arise due to a number of mechanisms. We have demonstrated that networks embedded in low-dimensional spaces have high modularity. We have also shown analytically and numerically that, surprisingly, random graphs and scale-free networks have high modularity due to fluctuations in the establishment of links.
Recently, several works have reported the existence of modules in complex networks and suggested that some evolutionary mechanism must enhance modularity. This statement is based, in the best cases, on the fact that the modularity is large enough, and relies implicitly on the assumption that random graphs have low modularity.
Our results enable one to define statistically significant modularity in networks. We argue that, just as is already done for the clustering coefficient and other quantities, the modularity of complex networks must always be compared to the null case of a random graph. The analytical expressions we have derived provide a convenient way to carry out such a comparison.
Acknowledgments
We thank Alex Arenas, André A. Moreira, Carla A. Ng, and Daniel B. Stouffer for numerous suggestions and discussions. R.G. and M.S. thank the Fulbright Program and the Spanish Ministry of Education, Culture & Sports. L.A.N.A. gratefully acknowledges the support of a Searle Leadership Fund Award and of a NIH/NIGMS K-25 award.
References
- 1. Watts DJ, Strogatz SH. Nature (London) 1998;393:440. doi: 10.1038/30918.
- 2. Barabási A-L, Albert R. Science. 1999;286:509. doi: 10.1126/science.286.5439.509.
- 3. Amaral LAN, Scala A, Barthélemy M, Stanley HE. Proc. Natl. Acad. Sci. U.S.A. 2000;97:11149. doi: 10.1073/pnas.200327197.
- 4. Albert R, Barabási A-L. Rev. Mod. Phys. 2002;74:47.
- 5. Dorogovtsev SN, Mendes JFF. Adv. Phys. 2002;51:1079.
- 6. Girvan M, Newman MEJ. Proc. Natl. Acad. Sci. U.S.A. 2002;99:7821. doi: 10.1073/pnas.122653799.
- 7. Newman MEJ, Girvan M. Phys. Rev. E. 2004;69:026113. doi: 10.1103/PhysRevE.69.026113.
- 8. Guimerà R, Danon L, Díaz-Guilera A, Giralt F, Arenas A. Phys. Rev. E. 2003;68:065103(R). doi: 10.1103/PhysRevE.68.065103; Arenas A, Danon L, Díaz-Guilera A, Gleiser PM, Guimerà R. Eur. Phys. J. B. 2004;38:373.
- 9. Hartwell LH, Hopfield JJ, Leibler S, Murray AW. Nature (London) 1999;402:C47. doi: 10.1038/35011540; Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabási A-L. Science. 2002;297:1551. doi: 10.1126/science.1073374; Holme P, Huss M. Bioinformatics. 2003;19:532. doi: 10.1093/bioinformatics/btg033.
- 10. Pimm SL. Theor. Popul. Biol. 1979;16:144. doi: 10.1016/0040-5809(79)90010-8; Krause AE, Frank KA, Mason DM, Ulanowicz RE, Taylor WW. Nature (London) 2003;426:282. doi: 10.1038/nature02115.
- 11. Eriksen KA, Simonsen I, Maslov S, Sneppen K. Phys. Rev. Lett. 2003;90:148701. doi: 10.1103/PhysRevLett.90.148701.
- 12. Newman MEJ. Phys. Rev. E. 2004;69:066133.
- 13. Radicchi F, Castellano C, Cecconi F, Loreto V, Parisi D. Proc. Natl. Acad. Sci. U.S.A. 2004;101:2658. doi: 10.1073/pnas.0400054101; Reichardt J, Bornholdt S. e-print cond-mat/0402349; Fortunato S, Latora V, Marchiori M. e-print cond-mat/0402522; Capocci A, Servedio VDP, Caldarelli G, Colaiori F. e-print cond-mat/0402499.
- 14. Alon U. Science. 2003;301:1866. doi: 10.1126/science.1089072.
- 15. Solé RV, Fernández P. e-print q-bio.GN/0312032.
- 16. Bollobás B. Random Graphs. 2nd ed. Cambridge University Press; New York: 2001.
- 17. Mulet R, Pagnani A, Weigt M, Zecchina R. Phys. Rev. Lett. 2002;89:268701. doi: 10.1103/PhysRevLett.89.268701; van Mourik J, Saad D. Phys. Rev. E. 2002;66:056120. doi: 10.1103/PhysRevE.66.056120.
- 18. Guimerà R, Sales-Pardo M, Amaral LAN. unpublished.
- 19. Kirkpatrick S, Gelatt CD, Vecchi MP. Science. 1983;220:671. doi: 10.1126/science.220.4598.671.
- 20. In this case, there are no fluctuations involved in the creation of the network. Rather, modularity arises because neighbors of a node in a low-dimensional lattice are also neighbors of each other, and neither the node nor its neighbors are linked to nodes that are far away in the lattice. As an example, consider the cities and towns in Europe and all the roads between them. This “road network” is two-dimensional and has modules that correspond roughly to the countries. In each of these “sub-modules” people have different customs, food preferences, languages or dialects, etc., that is, communities exist. It is also worth noting that the hypotheses used in the calculations are essentially the same as those we use for random graphs and scale-free networks, and that the modularity of one-dimensional regular lattices turns out to be useful in certain limits of ER random graphs and scale-free networks.
