A maximum entropy framework for nonexponential distributions

Jack Peterson; Purushottam D Dixit; Ken A Dill

doi:10.1073/pnas.1320578110

Proc Natl Acad Sci USA. 2013 Dec 2;110(51):20380–20385. doi: 10.1073/pnas.1320578110

PMCID: PMC3870711 PMID: 24297895

Significance

Many statistical distributions, particularly among social and biological systems, have “heavy tails,” in which rare events are not as improbable as would have been guessed from more traditional statistics. Heavy-tailed distributions are the basis for the phrase “the rich get richer.” Here, we propose a basic principle underlying systems with heavy-tailed distributions. We show that it is the same principle (maximum entropy) used in statistical physics and statistics to estimate probabilistic models from relatively few constraints. The heavy-tail principle can be expressed in terms of shared costs and economies of scale. The probability distribution we derive is an exponential function of a digamma function, and we show that it accurately fits 13 real-world data sets.

Keywords: heavy tail, fat tail, statistical mechanics, thermostatistics, social physics

Abstract

Probability distributions having power-law tails are observed in a broad range of social, economic, and biological systems. We describe here a potentially useful common framework. We derive distribution functions $\{p_k\}$ for situations in which a “joiner particle” $k$ pays some form of price to enter a community of size $k-1$, where costs are subject to economies of scale. Maximizing the Boltzmann–Gibbs–Shannon entropy subject to this energy-like constraint predicts a distribution having a power-law tail; it reduces to the Boltzmann distribution in the absence of economies of scale. We show that the predicted function gives excellent fits to 13 different distribution functions, ranging from friendship links in social networks, to protein–protein interactions, to the severity of terrorist attacks. This approach may give useful insights into when to expect power-law distributions in the natural and social sciences.

Probability distributions are often observed to have power-law tails, particularly in social, economic, and biological systems. Examples include distributions of fluctuations in financial markets (1), the populations of cities (2), the distribution of Web site links (3), and others (4, 5). Such distributions have generated much popular interest (6, 7) because of their association with rare but consequential events, such as stock market bubbles and crashes.

If sufficient data are available, finding the mathematical shape of a distribution function can be as simple as curve-fitting, with a follow-up determination of the significance of the mathematical form used to fit it. However, it is often interesting to know if the shape of a given distribution function can be explained by an underlying generative principle. Principles underlying power-law distributions have been sought in various types of models. For example, the power-law distributions of node connectivities in social networks have been derived from dynamical network evolution models (8–17). A large and popular class of such models is based on the preferential attachment rule (18–27), wherein it is assumed that new nodes attach preferentially to the largest of the existing nodes. Explanations for power laws are also given by Ising models in critical phenomena (28–34), network models with thresholded “fitness” values (35), and random-energy models of hydrophobic contacts in protein interaction networks (36).

However, such approaches are often based on particular mechanisms or processes; they often predict particular power-law exponents, for example. Our interest here is in finding a broader vantage point, as well as a common language, for describing a range of distributions, from power law to exponential. For deriving exponential distributions, a well-known general principle is the method of maximum entropy (Max Ent) in statistical physics (37, 38). In such problems, you want to choose the best possible distribution from all candidate distributions that are consistent with a certain set of constrained moments, such as the average energy. For this type of problem, which is highly underdetermined, a principle is needed for selecting a “best” mathematical function from among alternative model distribution functions. To find the mathematical form of the distribution function $p_k$ over states $k$, the Max Ent principle asserts that you should maximize the Boltzmann–Gibbs–Shannon (BGS) entropy functional $S[\{p_k\}] = -\sum_k p_k \ln p_k$ subject to constraints, such as the known value of the average energy $\langle E \rangle$. This procedure gives the exponential (Boltzmann) distribution, $p_k \propto e^{-\beta E_k}$, where β is the Lagrange multiplier that enforces the constraint. This variational principle has been the subject of various historical justifications. It is now commonly understood as the approach that chooses the least-biased model that is consistent with the known constraint(s) (39).
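The variational step that produces the Boltzmann form can be written out explicitly. What follows is a standard textbook sketch rather than a derivation taken from this paper; the normalization multiplier $\alpha$ is a symbol introduced here for illustration only.

```latex
\begin{aligned}
\mathcal{L} &= -\sum_k p_k \ln p_k
  - \alpha\Big(\sum_k p_k - 1\Big)
  - \beta\Big(\sum_k p_k E_k - \langle E\rangle\Big),\\
\frac{\partial \mathcal{L}}{\partial p_k} &= -\ln p_k - 1 - \alpha - \beta E_k = 0
  \;\;\Longrightarrow\;\; p_k = e^{-1-\alpha}\, e^{-\beta E_k},\\
\sum_k p_k = 1 &\;\;\Longrightarrow\;\;
  p_k = \frac{e^{-\beta E_k}}{\sum_i e^{-\beta E_i}}.
\end{aligned}
```

The exponential form follows only because the constraint is linear in $p_k$ and the entropy is of BGS form; changing either ingredient changes the resulting distribution.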

Is there an equally compelling principle that would select fat-tailed distributions, given limited information? There is a large literature that explores this. Inferring nonexponential distributions can be done by maximizing a different mathematical form of entropy, rather than the BGS form. Examples of these nontraditional entropies include those of Tsallis (40), Rényi (41), and others (42, 43). For example, the Tsallis entropy is defined as $S_q = \frac{K}{q-1}\left(1 - \sum_k p_k^q\right)$, where K is a constant and q is a parameter for the problem at hand. Such methods otherwise follow the same strategy as above: maximizing the chosen form of entropy subject to an extensive energy constraint gives nonexponential distributions. The Tsallis entropy has been applied widely (44–53).
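To make the relation between the two entropy forms concrete, here is a minimal Python sketch. It assumes the standard Tsallis form $K(1 - \sum_k p_k^q)/(q-1)$ with $K = 1$; the function names and the example distribution are hypothetical and chosen only for illustration. It checks numerically that the Tsallis entropy approaches the BGS entropy as $q \to 1$.

```python
import math

def bgs_entropy(p):
    """Boltzmann-Gibbs-Shannon entropy, -sum p ln p (zero-probability terms contribute 0)."""
    return -sum(x * math.log(x) for x in p if x > 0)

def tsallis_entropy(p, q, K=1.0):
    """Tsallis entropy K/(q-1) * (1 - sum p^q); this sketch requires q != 1."""
    if q == 1:
        raise ValueError("use bgs_entropy for q = 1")
    return K / (q - 1) * (1.0 - sum(x ** q for x in p))

p = [0.5, 0.3, 0.2]                             # arbitrary example distribution
s_bgs = bgs_entropy(p)
s_near_one = tsallis_entropy(p, q=1.0 + 1e-6)   # should be close to s_bgs
s_q2 = tsallis_entropy(p, q=2.0)                # equals 1 - sum p^2
```

For any other value of $q$ the two functionals differ, which is where the nonextensivity enters in that approach.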

However, we adopt an alternative way to infer nonexponential distributions. To contrast our approach, we first switch from probabilities to their logarithms. Logarithms of probabilities can be parsed into energy-like and entropy-like components, as is standard in statistical physics. Said differently, a nonexponential distribution that is derived from a Max Ent principle requires that there be nonextensivity in either an energy-like or entropy-like term; that is, it is nonadditive over independent subsystems, not scaling linearly with system size. Tsallis and others have chosen to assign the nonextensivity to an entropy term, and retain extensivity in an energy term. Here, instead, we keep the canonical BGS form of entropy, and invoke a nonextensive energy-like term. In our view, only the latter approach is consistent with the principles elucidated by Shore and Johnson (37) (reviewed in ref. 39). Shore and Johnson (37) showed that the BGS form of entropy is uniquely the mathematical function that ensures satisfaction of the addition and multiplication rules of probability. Shore and Johnson (37) assert that any form of entropy other than BGS will impart a bias that is unwarranted by the data it aims to fit. We regard the Shore and Johnson (37) argument as a compelling first-principles basis for defining a proper variational principle for modeling distribution functions. Here, we describe a variational approach based on the BGS entropy function, and we seek an explanation for power-law distributions in the form of an energy-like function instead.

Theory

Assembly of Simple Colloidal Particles.

We frame our discussion in terms of a joiner particle that enters a cluster or community of particles, as shown in Fig. 1. On the one hand, this is a natural way to describe the classical problem of the colloidal clustering of physical particles; it is readily shown (reviewed below) to give an exponential distribution of cluster sizes. On the other hand, this general description also pertains more broadly, such as when people populate cities, links are added to Web sites, or papers accumulate citations. We want to compute the distribution, $p_k$, of populations of communities having size $k$.

Fig. 1. — The joining cost for a particle to join a size-$k$ community is $\mu_k$. This diagram can describe particles forming colloidal clusters, or social processes such as people joining cities, citations added to papers, or link creation in a social network.

To begin, we express a cumulative cost of joining. For particles in colloids, this cost is expressed as a chemical potential, i.e., a free energy per particle. If $\mu_j$ represents the cost of adding a particle to a cluster of size $j$, the cumulative cost of assembling a whole cluster of $k$ particles, starting from a single particle, is the sum over the $k-1$ joining events

$$w_k = \sum_{j=1}^{k-1} \mu_j. \tag{1}$$

Max Ent asserts that we should choose the probability distribution that has the maximum entropy among all candidate distributions that are consistent with the mean value $\langle w \rangle = \sum_k p_k w_k$ of the total cost of assembly (54),

$$p_k = \frac{e^{-\lambda w_k}}{\sum_i e^{-\lambda w_i}}, \tag{2}$$

where λ is a Lagrange multiplier that enforces the constraint.

In situations where the cost of joining does not depend on the size of the community a particle joins, then $\mu_k = \mu^\circ$, where $\mu^\circ$ is a constant. The cumulative cost of assembling the cluster is then

$$w_k = (k-1)\,\mu^\circ. \tag{3}$$

Substituting into Eq. 2 and absorbing the Lagrange multiplier λ into $\mu^\circ$ yields the grand canonical exponential distribution, well known for problems such as this:

$$p_k = \frac{e^{-\mu^\circ k}}{\sum_i e^{-\mu^\circ i}}. \tag{4}$$

In short, when the joining cost of a particle entry is independent of the size of the community it enters, the community size distribution is exponential.
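The size-independent case can be checked with a few lines of Python. This is a minimal sketch assuming the distribution has the form $p_k \propto e^{-\mu^\circ k}$ on $k = 1, \ldots, N$, with the Lagrange multiplier already absorbed into $\mu^\circ$; the function name, the value $\mu^\circ = 0.5$, and the cutoff $N = 50$ are illustrative choices, not values from the paper.

```python
import math

def exponential_size_distribution(mu0, n_max):
    """p_k proportional to exp(-mu0 * k) for k = 1..n_max."""
    weights = [math.exp(-mu0 * k) for k in range(1, n_max + 1)]
    z = sum(weights)                      # normalization constant
    return [w / z for w in weights]

p = exponential_size_distribution(mu0=0.5, n_max=50)

# Successive probabilities differ by the constant factor exp(-mu0),
# which is the signature of an exponential (geometric) distribution.
ratios = [p[i + 1] / p[i] for i in range(len(p) - 1)]
```

Every entry of `ratios` equals $e^{-\mu^\circ}$ up to rounding, so each added particle lowers the probability by the same factor regardless of the community size.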

Communal Assemblies and Economies of Scale.

Now, we develop a general model of communal assembly based on economies of scale. Consider a situation where the joining cost for a particle depends on the size of the community it joins. In particular, consider situations in which the costs are lower for joining a larger community. Said differently, the cost-minus-benefit function $\mu_k$ is now allowed to be subject to economies of scale, which, as we note below, can also be interpreted instead as a form of discount in which the community pays down some of the joining costs for the joiner particle.

To illustrate the idea of an economy-of-scale cost function, imagine building a network of telephones. In this case, a community of size 1 is a single unconnected phone. A community of size 2 is two connected phones, etc. Consider the first phone: The cost of creating the first phone is high because it requires initial investment in the phone assembly plant. And the benefit is low, because there is no value in having a single phone. Now, for the second phone, the cost-minus-benefit is lower. The cost of producing the second phone is lower than the first because the production plant already exists, and the benefit is higher because two connected phones are more useful than one unconnected phone. For the third phone, the cost-minus-benefit is even lower than for the second because the production cost is even lower (economy of scale) and because the benefits increase with the number of phones in the network.

To illustrate, suppose the cost-minus-benefit for the first phone is 150, for the second phone is 80, and for the third phone is 50. To express these cost relationships, we define an intrinsic cost for the first phone (joiner particle), 150 in this example. We define the difference in cost-minus-benefit between the first and second phones as the discount provided by the first phone when the second phone joins the community of two phones. In this example, the first phone provides a discount of 70 when the second phone joins. Similarly, the total discount provided by the two-phone community is 100 when the third phone joins the community.
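The arithmetic of the phone example can be laid out in a short Python sketch. The numbers 150, 80, and 50 come from the text; the variable names are hypothetical.

```python
# Cost-minus-benefit for the first, second, and third phone (values from the text).
costs = [150, 80, 50]

intrinsic = costs[0]                          # intrinsic cost of a joiner

# Total discount supplied by the existing community when each phone joins.
discounts = [intrinsic - c for c in costs]    # community sizes 0, 1, 2

# Discount contributed per existing member (skip the empty community).
per_member = [d / size for size, d in enumerate(discounts) if size > 0]
```

Here `discounts` is `[0, 70, 100]` and `per_member` is `[70.0, 50.0]`: the total discount grows with community size, while each member's individual contribution shrinks.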

In this language, the existing community is paying down some fraction of the joining costs for the next particle. Mathematically, this communal cost-minus-benefit function can be expressed as

$$\mu_k = \mu^\circ - \frac{k\,\mu_k}{k_0}. \tag{5}$$

The quantity $\mu_k$ on the left side of Eq. 5 is the total cost-minus-benefit when a particle joins a k-mer community. The joining cost has two components, expressed on the right side: each joining event has an intrinsic cost $\mu^\circ$ that must be paid, and each joining event involves some discount that is provided by the community. Because there are k members of the existing community, the quantity $\mu_k/k_0$ is the discount given to a joiner by each existing community particle, where $k_0$ is a problem-specific parameter that characterizes how much of the joining cost burden is shouldered by each member of the community. In the phone example, we assumed . The value of $k_0 = 1$ represents fully equal cost-sharing between joiner and community member: each communal particle gives the joining particle a discount equal to what the joiner itself pays. The opposite extreme limit is represented by $k_0 \to \infty$; in this case, the community gives no discount at all to the joining particle.
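The two limits described above can be explored numerically. The sketch below assumes that the cost-sharing relation reads $\mu_k = \mu^\circ - k\mu_k/k_0$, which solves to $\mu_k = \mu^\circ k_0/(k + k_0)$; the function name and the parameter values ($\mu^\circ = 150$, $k_0 = 1$, and $k_0 = 10^9$ as a stand-in for "very large") are illustrative assumptions.

```python
def joining_cost(k, mu0, k0):
    """Cost-minus-benefit of joining a community of k members: mu0 * k0 / (k + k0)."""
    return mu0 * k0 / (k + k0)

mu0 = 150.0

# Equal cost-sharing (k0 = 1) for communities of 0, 1, and 2 existing members.
equal_sharing = [joining_cost(k, mu0, 1.0) for k in range(3)]

# Very large k0: the community contributes essentially no discount.
no_sharing = [joining_cost(k, mu0, 1e9) for k in range(3)]

# Self-consistency: the closed form satisfies mu_k = mu0 - k * mu_k / k0.
k, k0 = 7, 2.5
mu_k = joining_cost(k, mu0, k0)
residual = mu_k - (mu0 - k * mu_k / k0)
```

With $k_0 = 1$ the costs are 150, 75, and 50, close to but not identical with the 150, 80, and 50 used in the phone illustration; with very large $k_0$ every joiner pays essentially the full intrinsic cost.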

The idea of communal sharing of cost-minus-benefit is applicable to various domains; it can express that one person is more likely to join a well-populated group on a social networking site because the many existing links to it make it easier to find (i.e., lower cost) and because its bigger hub offers the newcomer more relationships to other people (i.e., greater benefit). Or, it can express that people prefer larger cities to smaller ones because of the greater benefits that accrue to the joiner in terms of jobs, services, and entertainment. (In our terminology, a larger community pays down more of the cost-minus-benefit for the next immigrant to join.) We use the terms “economy of scale” (EOS) or “communal” to refer to any system that can be described by a cost function, such as Eq. 5, in which the community can be regarded as sharing in the joining costs, although other functional forms might also be of value for expressing EOS.

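Rearranging Eq. 5 gives $\mu_k = \mu^\circ k_0/(k + k_0)$. The total cost-minus-benefit, $w_k$, of assembling a community of size k is

$$w_k = \sum_{j=1}^{k-1} \mu_j = \mu^\circ k_0 \sum_{j=1}^{k-1} \frac{1}{j + k_0} = \mu^\circ k_0\,\psi(k + k_0) - \mu^\circ k_0\,\psi(1 + k_0), \tag{6}$$

where $\psi$ is the digamma function, which for integer arguments reduces to $\psi(k) = -\gamma + \sum_{j=1}^{k-1} j^{-1}$ ($\gamma \approx 0.5772$ is Euler’s constant), and the constant term $-\mu^\circ k_0\,\psi(1 + k_0)$ will be absorbed into the normalization.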
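The step from the sum of joining costs to the digamma function can be verified numerically. The sketch below assumes joining costs of the form $\mu_j = \mu^\circ k_0/(j + k_0)$ and uses the identity $\psi(k + k_0) - \psi(1 + k_0) = \sum_{j=1}^{k-1} 1/(j + k_0)$. The pure-Python `digamma` helper (upward recurrence followed by the standard asymptotic series) is written here for illustration so that no external library is needed; the parameter values are arbitrary.

```python
import math

def digamma(x):
    """Digamma for x > 0: shift x upward with psi(x) = psi(x + 1) - 1/x, then use the asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = math.log(x) - 0.5 * inv - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return acc + series

def assembly_cost(k, mu0, k0):
    """w_k as the direct sum of joining costs mu0 * k0 / (j + k0) for j = 1..k-1."""
    return sum(mu0 * k0 / (j + k0) for j in range(1, k))

mu0, k0, k = 2.0, 1.5, 50                     # illustrative values
direct = assembly_cost(k, mu0, k0)
closed = mu0 * k0 * (digamma(k + k0) - digamma(1.0 + k0))
```

The two values agree to within the accuracy of the series, so the closed form can replace the explicit sum when $k$ is large.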

From this cost-minus-benefit expression (Eq. 6), for given values of $\mu^\circ$ and $k_0$, we can now uniquely determine the probability distribution by maximizing the entropy. Substituting Eq. 6 into Eq. 2, and again absorbing λ into $\mu^\circ$, yields

$$p_k = \frac{e^{-\mu^\circ k_0\,\psi(k + k_0)}}{\sum_i e^{-\mu^\circ k_0\,\psi(i + k_0)}}. \tag{7}$$

Eq. 7 describes a broad class of distributions. These distributions have a power-law tail for large k, with exponent $\mu^\circ k_0$, and a cross-over at $k \approx k_0$ from exponential to power law. To see this, expand $\psi(k + k_0)$ asymptotically and drop terms of order $1/k^2$; this yields $p_k \propto (k + k_0 - 1/2)^{-\mu^\circ k_0}$, so Eq. 7 obeys a power law for large k, and becomes a simple exponential in the limit of $k_0 \to \infty$ (zero cost-sharing). One quantitative measure of a distribution’s position along the continuum from exponential to power law is the value of its scaling exponent, $\mu^\circ k_0$. A small exponent indicates that the system has extensive social sharing, thus power-law behavior. As the exponent becomes large, the distribution approaches an exponential function. Eq. 7 has a power-law scaling only when the total communal discount in Eq. 5 has a linear dependence on the community size. The linear dependence arises because the joiner particle interacts identically with all other particles in the community.
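Both limiting behaviors can be seen numerically. The sketch below assumes $\ln p_k = -w_k + \text{const}$ with $w_k = \sum_{j<k} \mu^\circ k_0/(j + k_0)$, which is equivalent to the digamma form up to normalization. The function names and parameter values are illustrative; $k_0 = 10^6$ stands in for the no-sharing limit.

```python
import math

def log_weights(mu0, k0, n_max):
    """Unnormalized log p_k = -w_k for k = 1..n_max."""
    out, w = [], 0.0
    for k in range(1, n_max + 1):
        out.append(-w)
        w += mu0 * k0 / (k + k0)      # add the cost of the next joining event
    return out

def local_slope(logw, k1, k2):
    """Slope of log p_k versus log k between sizes k1 and k2."""
    return (logw[k2 - 1] - logw[k1 - 1]) / (math.log(k2) - math.log(k1))

# Heavy-tailed regime: the log-log slope at large k approaches -mu0 * k0.
mu0, k0 = 1.5, 1.3
tail_slope = local_slope(log_weights(mu0, k0, 20000), 10000, 20000)

# Near-zero sharing (very large k0): log p_k drops by about mu0 per added member.
flat = log_weights(0.5, 1e6, 101)
decay_rates = [flat[i] - flat[i + 1] for i in range(100)]
```

For these inputs `tail_slope` is close to $-\mu^\circ k_0 = -1.95$, while every entry of `decay_rates` is close to 0.5, the constant decrement expected for an exponential distribution.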

What is the role of detailed balance in our modeling? Fig. 1 shows no reverse arrows from $k$ to $k-1$. The principle of Max Ent can be regarded as a general way to infer distribution functions from limited information, irrespective of whether there is an underlying kinetic model. So, it poses no problem that some of our distributions, such as scientific citations, are not taken from reversible processes.

Results

Eq. 7 and Fig. 2 show the central results of this paper. Consider three types of plots. On the one hand, exponential functions can be seen in data by plotting $\log p_k$ vs. $k$. Or, power-law functions are seen by plotting $\log p_k$ vs. $\log k$. Here, we find that plotting $\log p_k$ vs. a digamma function provides a universal fit to several disparate experimental data sets over their full distributions (Fig. 3). Fig. 2 shows fits of Eq. 7 to 13 datasets, using $\mu^\circ$ and $k_0$ as fitting parameters that are determined by a maximum-likelihood procedure (see SI Text for dataset and goodness-of-fit test details). The parameters $\mu^\circ$ and $k_0$ characterize the intrinsic cost of joining any cluster, and the communal contribution to sharing that cost, respectively.
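A maximum-likelihood fit of the two parameters can be sketched as follows. This is not the authors' procedure, which is described in the SI Text; it is a deliberately simple grid search, assuming the data are given as a histogram `counts` with `counts[k-1]` observations of size $k$ and that the model is $p_k \propto e^{-w_k}$ truncated at the largest observed size. All names and grid values are hypothetical. As a sanity check, idealized counts generated from known parameters should return those parameters.

```python
import math

def log_probs(mu0, k0, n_max):
    """log p_k for k = 1..n_max, with w_k built by direct summation."""
    w, costs = 0.0, []
    for k in range(1, n_max + 1):
        costs.append(w)                       # w_k = sum_{j<k} mu0*k0/(j+k0)
        w += mu0 * k0 / (k + k0)
    log_z = math.log(sum(math.exp(-c) for c in costs))
    return [-c - log_z for c in costs]

def log_likelihood(counts, mu0, k0):
    """Sum over sizes of (number of observations) * log p_k."""
    lp = log_probs(mu0, k0, len(counts))
    return sum(n * l for n, l in zip(counts, lp))

def grid_fit(counts, mu0_grid, k0_grid):
    """Return the (mu0, k0) grid point with the highest log-likelihood."""
    return max(((m, k) for m in mu0_grid for k in k0_grid),
               key=lambda mk: log_likelihood(counts, *mk))

# Idealized (noise-free) counts from known parameters, for a self-check.
true_mu0, true_k0 = 2.0, 1.0
counts = [10000 * math.exp(l) for l in log_probs(true_mu0, true_k0, 200)]
fit = grid_fit(counts, [1.0, 1.5, 2.0, 2.5, 3.0], [0.5, 1.0, 2.0, 4.0])
```

A real analysis would replace the grid with a continuous optimizer and would also need uncertainty estimates and a goodness-of-fit test, neither of which is attempted here.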

Fig. 3. — Eq. 7 fitted to the 13 datasets in Table 1, plotted against the total cost to assemble a size k community, $w_k$. Values of $\mu^\circ$ and $k_0$ are shown in Table 1. The y axis has been rescaled by dividing by the maximum $p_k$, so that all curves begin at 1. All data sets are fit by the line. See Fig. 2 for fits to individual datasets.

Rare events are less rare under fat-tailed distributions than under exponential distributions. For dynamical systems, the risk of such events can be quantified by the coefficient of variation (CV), defined as the ratio of the SD $\sigma_k$ to the mean $\langle k \rangle$. For equilibrium/steady-state systems, the CV quantifies the spread of a probability distribution, and is determined by the power-law exponent, $\mu^\circ k_0$. Systems with small scaling exponents ($\mu^\circ k_0 \le 3$) experience an unbounded, power-law growth of their CV as the system size N becomes large. This growth is particularly rapid in systems with $\mu^\circ k_0$ close to 2, because the average community size $\langle k \rangle$ diverges at $\mu^\circ k_0 = 2$. Several of our datasets fall into this high-risk category, such as the number of deaths due to terrorist attacks (Table 1).
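The dependence of the CV on system size can be illustrated numerically. The sketch below assumes the size distribution $p_k \propto e^{-w_k}$ with $w_k = \sum_{j<k} \mu^\circ k_0/(j + k_0)$, truncated at a maximum size $N$. The heavy-tailed parameters are the rounded Table 1 values for the terrorist-attack dataset; the near-exponential parameters ($\mu^\circ = 0.8$, $k_0 = 10^6$) are arbitrary illustrative choices.

```python
import math

def coefficient_of_variation(mu0, k0, n_max):
    """CV = SD / mean of the size distribution truncated at n_max."""
    w, z, m1, m2 = 0.0, 0.0, 0.0, 0.0
    for k in range(1, n_max + 1):
        weight = math.exp(-w)             # unnormalized p_k
        z += weight
        m1 += k * weight
        m2 += k * k * weight
        w += mu0 * k0 / (k + k0)          # cost of the next joining event
    mean = m1 / z
    var = m2 / z - mean * mean
    return math.sqrt(var) / mean

sizes = (10**2, 10**3, 10**4)
heavy = [coefficient_of_variation(2.1, 1.0, n) for n in sizes]   # exponent about 2.1
light = [coefficient_of_variation(0.8, 1e6, n) for n in sizes]   # nearly exponential
```

In this sketch `heavy` keeps growing as the cutoff is raised, whereas `light` stops changing once the cutoff exceeds the range where the exponential has any appreciable weight.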

Table 1.

Fitting parameters and statistics

Data set	$\mu^\circ$	$k_0$	$\langle k \rangle$	N	$\mu^\circ k_0$	P
GitHub	9(1)	0.21(2)	3.642	120,866	2(2)	0.78
Wikipedia	1.5(1)	1.3(1)	25.418	21,607	1.9(1)	0.79
Pretty Good Privacy	1(1)	2.6(2)	4.558	10,680	2.6(3)	0.16
Word adjacency	3.6(4)	0.6(1)	5.243	11,018	2.1(3)	0.09
Terrorist attacks	2.1(2)	1(1)	4.346	9,101	2.2(3)	0.38
Facebook wall	1.6(1)	2.3(3)	2.128	10,082	3.6(5)	0.99
Proteins (fly)	0.9(2)	5(2)	2.527	878	5(2)	0.89
Proteins (yeast)	0.9(1)	4(1)	3.404	2,170	3(1)	0.48
Proteins (human)	0.8(1)	4(1)	3.391	3,165	4(1)	0.52
Digg	0.68(3)	4.2(3)	5.202	16,844	2.8(2)	0.05
Petster	0.21(3)	15(3)	13.492	1,858	3(1)	0.08
Word use	2.3(1)	0.8(1)	11.137	18,855	1.9(2)	0.56
Software	0.8(1)	2.1(3)	62.82	2,208	1.7(3)	0.69


Discussion

We have expressed a range of probability distributions in terms of a generalized energy-like cost function. In particular, we have considered types of costs that can be subject to economies of scale, which we have also called “community discounts.” We maximize the BGS entropy, subject to such cost-minus-benefit functions. This procedure predicts probability distributions that are exponential functions of a digamma function. Such a distribution function has a power-law tail, but reduces to a Boltzmann distribution in the absence of EOS. This function gives good fits to distributions ranging from scientific citations and patents, to protein-protein interactions, to friendship networks, and to Web links and terrorist networks—over their full distributions, not just in their tails.

Framed in this way, each new joiner particle must pay an intrinsic buy-in cost to join a community, but that cost may be reduced by a communal discount (an economy of scale). Here, we discuss a few points. First, both exponential and power-law distributions are ubiquitous. How can we rationalize this? One perspective is given by switching viewpoint from probabilities to their logarithms, which are commonly expressed in a language of dimensionless cost functions, such as energy $E/kT$. There are many forms of energy (e.g., gravitational, magnetic, electrostatic, springs, and interatomic interactions). The ubiquity of the exponential distribution can be seen in terms of the diversity and interchangeability of energies.

A broad swath of physics problems can be expressed in terms of the different types of energy and their ability to combine, add, or exchange with each other in various ways. Here, we indicate that nonexponential distributions, too, can be expressed in a language of costs, particularly those that are shared and are subject to economies of scale. Second, where do we expect exponentials vs. power laws? What sets Eq. 5 apart from typical energy functions in physical systems is that EOS costs are both independent of distance and long-ranged (the joiner particle interacts with all particles in a given community). Consequently, when the system size becomes large, due to the absence of a correlation length-scale, the energy of the system does not increase linearly with system size, giving rise to a nonextensive energy function. This view is consistent with the appearance of power laws in critical phenomena, where interactions are effectively long-ranged.

Third, interestingly, the concept of cost-minus-benefit in Eq. 5 can be further generalized, also leading to either Gaussian or stretched-exponential distributions. A Gaussian distribution results when the cost-minus-benefit function grows linearly with cluster size, $\mu_k \propto k$; this would arise if the joiner particle were to pay a tax to each member of a community, and this leads to a total cost of $w_k \propto k^2$ (Eq. 1). These would be “hostile” communities, leading to mostly very small communities and few large ones, because a Gaussian function drops off even faster with k than an exponential does. An example would be a Coulombic particle of charge q joining a community of k other such charged particles, as in the Born model of ion hydration (55). A stretched-exponential distribution can arise if the joiner particle instead pays a tax to only a subset of the community. For example, in a charged sphere with strong shielding, if only the particles at the sphere’s surface interact with the joiner particle, then $\mu_k \propto k^{2/3}$ and $w_k \propto k^{5/3}$, leading to a stretched-exponential distribution. In these situations, EOS can affect the community-size distribution not only through cost-sharing but also through the topology of interactions.
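The scaling of the total cost in these two generalizations can be checked directly. The sketch below assumes joining costs $\mu_j \propto j^{a}$ (with the proportionality constant set to 1) and a cumulative cost $w_k = \sum_{j=1}^{k-1} \mu_j$; the function names and the sizes used to measure the log-log slope are illustrative.

```python
import math

def cumulative_cost(k, exponent):
    """w_k = sum_{j=1}^{k-1} j**exponent, i.e. joining cost proportional to j**exponent."""
    return sum(j ** exponent for j in range(1, k))

def scaling_exponent(exponent, k1=1000, k2=2000):
    """Log-log slope of w_k between cluster sizes k1 and k2."""
    ratio = cumulative_cost(k2, exponent) / cumulative_cost(k1, exponent)
    return math.log(ratio) / math.log(k2 / k1)

tax_all = scaling_exponent(1.0)         # every member taxes the joiner
tax_surface = scaling_exponent(2 / 3)   # only surface members tax the joiner
```

The measured slopes are close to 2 and 5/3, so $e^{-w_k}$ falls off as a Gaussian in the first case and as a stretched exponential in the second.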

Finally, we reiterate a matter of principle. On the one hand, nonexponential distributions could be derived by using a nonextensive entropy-like quantity, such as those of Tsallis, combined with an extensive energy-like quantity. Here, instead, our derivation is based on using the BGS entropy combined with a nonextensive energy-like quantity. We favor the latter because it is consistent with the foundational premises of Shore and Johnson (37). In short, in the absence of energies or costs, the BGS entropy alone predicts a uniform distribution; any other alternative would introduce bias and structure into $p_k$ that is not warranted by the data. Models based on nonextensive entropies intrinsically prefer larger clusters, but without any basis to justify them. The present treatment invokes the same nature of randomness as when physical particles populate energy levels. The present work provides a cost-like language for expressing various different types of probability distribution functions.

Acknowledgments

We thank A. de Graff, H. Ge, D. Farrell, K. Ghosh, S. Maslov, and C. Shalizi for helpful discussions, and K. Sneppen, M. S. Shell, and H. Qian for comments on our manuscript. Support for this work was provided by a US Department of Defense National Defense Science and Engineering Graduate Fellowship (to J.P.), the National Science Foundation and Laufer Center (J.P. and K.A.D.), and Department of Energy Grant PM-031 from the Office of Biological Research (to P.D.D.).

Footnotes

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1320578110/-/DCSupplemental.

References

1.Mantegna R, Stanley H. Scaling behaviour in the dynamics of an economic index. Nature. 1995;376:46–49. [Google Scholar]
2.Zipf GK. Human Behavior and the Principle of Least Effort. Cambridge, MA: Addison-Wesley; 1949. [Google Scholar]
3.Broder A, et al. Graph structure in the web. Comput Netw. 2000;33(1-6):309–320. [Google Scholar]
4.Newman M. Power laws, Pareto distributions and Zipf’s law. Contemp Phys. 2005;46(5):323–351. [Google Scholar]
5.Clauset A, Shalizi C, Newman M. Power-law distributions in empirical data. SIAM Rev. 2009;51(4):661–703. [Google Scholar]
6.Taleb NN. The Black Swan: The Impact of the Highly Improbable. New York: Random House; 2007. [Google Scholar]
7.Bremmer I, Keats P. The Fat Tail: The Power of Political Knowledge for Strategic Investing. London: Oxford Univ Press; 2009. [Google Scholar]
8.Vázquez A, Flammini A, Maritan A, Vespignani A. Modeling of protein interaction networks. Complexus. 2003;1(1):38–44. [Google Scholar]
9.Berg J, Lässig M, Wagner A. Structure and evolution of protein interaction networks: A statistical model for link dynamics and gene duplications. BMC Evol Biol. 2004;4(1):51–63. doi: 10.1186/1471-2148-4-51. [DOI] [PMC free article] [PubMed] [Google Scholar]
10.Maslov S, Krishna S, Pang TY, Sneppen K. Toolbox model of evolution of prokaryotic metabolic networks and their regulation. Proc Natl Acad Sci USA. 2009;106(24):9743–9748. doi: 10.1073/pnas.0903206106. [DOI] [PMC free article] [PubMed] [Google Scholar]
11.Pang TY, Maslov S. A toolbox model of evolution of metabolic pathways on networks of arbitrary topology. PLOS Comput Biol. 2011;7(5):e1001137. doi: 10.1371/journal.pcbi.1001137. [DOI] [PMC free article] [PubMed] [Google Scholar]
12.Leskovec J, Chakrabarti D, Kleinberg J, Faloutsos C, Ghahramani Z. Kronecker graphs: An approach to modeling networks. J Mach Learn Res. 2010;11:985–1042. [Google Scholar]
13.Karagiannis T, Le Boudec J-Y, Vojnovic M. Power law and exponential decay of intercontact times between mobile devices. IEEE Trans Mobile Comput. 2010;9(10):1377–1390. [Google Scholar]
14.Shou C, et al. Measuring the evolutionary rewiring of biological networks. PLOS Comput Biol. 2011;7(1):e1001050. doi: 10.1371/journal.pcbi.1001050. [DOI] [PMC free article] [PubMed] [Google Scholar]
15.Fortuna MA, Bonachela JA, Levin SA. Evolution of a modular software network. Proc Natl Acad Sci USA. 2011;108(50):19985–19989. doi: 10.1073/pnas.1115960108. [DOI] [PMC free article] [PubMed] [Google Scholar]
16.Peterson GJ, Pressé S, Peterson KS, Dill KA. Simulated evolution of protein-protein interaction networks with realistic topology. PLoS ONE. 2012;7(6):e39052. doi: 10.1371/journal.pone.0039052. [DOI] [PMC free article] [PubMed] [Google Scholar]
17.Pang TY, Maslov S. Universal distribution of component frequencies in biological and technological systems. Proc Nat Acad Sci. 2013;110(15):6235–6239. doi: 10.1073/pnas.1217795110. [DOI] [PMC free article] [PubMed] [Google Scholar]
18.Simon H. On a class of skew distribution functions. Biometrika. 1955;42:425–440. [Google Scholar]
19.de Solla Price D. A general theory of bibliometric and other cumulative advantage processes. J Am Soc Inf Sci. 1976;27(5):292–306. [Google Scholar]
20.Barabási AL, Albert R. Emergence of scaling in random networks. Science. 1999;286(5439):509–512. doi: 10.1126/science.286.5439.509. [DOI] [PubMed] [Google Scholar]
21.Vázquez A. Growing network with local rules: Preferential attachment, clustering hierarchy, and degree correlations. Phys Rev E Stat Nonlin Soft Matter Phys. 2003;67(5 Pt 2):056104. doi: 10.1103/PhysRevE.67.056104. [DOI] [PubMed] [Google Scholar]
22.Yook S-H, Jeong H, Barabási A-L. Modeling the Internet’s large-scale topology. Proc Natl Acad Sci USA. 2002;99(21):13382–13386. doi: 10.1073/pnas.172501399. [DOI] [PMC free article] [PubMed] [Google Scholar]
23.Capocci A, et al. Preferential attachment in the growth of social networks: The internet encyclopedia Wikipedia. Phys Rev E Stat Nonlin Soft Matter Phys. 2006;74(3 Pt 2):036116. doi: 10.1103/PhysRevE.74.036116. [DOI] [PubMed] [Google Scholar]
24.Newman ME. Clustering and preferential attachment in growing networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2001;64(2 Pt 2):025102. doi: 10.1103/PhysRevE.64.025102. [DOI] [PubMed] [Google Scholar]
25.Jeong H, Néda Z, Barabási A-L. Measuring preferential attachment in evolving networks. Europhys Lett. 2003;61(4):567. [Google Scholar]
26.Poncela J, Gómez-Gardeñes J, Floría LM, Sánchez A, Moreno Y. Complex cooperative networks from evolutionary preferential attachment. PLoS ONE. 2008;3(6):e2449. doi: 10.1371/journal.pone.0002449. [DOI] [PMC free article] [PubMed] [Google Scholar]
27.Peterson GJ, Pressé S, Dill KA. Nonuniversal power law scaling in the probability distribution of scientific citations. Proc Natl Acad Sci USA. 2010;107(37):16023–16027. doi: 10.1073/pnas.1010757107. [DOI] [PMC free article] [PubMed] [Google Scholar]
28.Fisher M. The renormalization group in the theory of critical behavior. Rev Mod Phys. 1974;46(4):597–616. [Google Scholar]
29.Yeomans J. Statistical Mechanics of Phase Transitions. New York: Oxford Univ Press; 1992. [Google Scholar]
30.Stanley H. Scaling, universality, and renormalization: Three pillars of modern critical phenomena. Rev Mod Phys. 1999;71(2):S358–S366. [Google Scholar]
31.Gefen Y, Mandelbrot BB, Aharony A. Critical phenomena on fractal lattices. Phys Rev Lett. 1980;45(11):855–858. [Google Scholar]
32.Fisher DS. Scaling and critical slowing down in random-field Ising systems. Phys Rev Lett. 1986;56(5):416–419. doi: 10.1103/PhysRevLett.56.416. [DOI] [PubMed] [Google Scholar]
33.Suzuki M, Kubo R. Dynamics of the Ising model near the critical point. J Phys Soc Jpn. 1968;24:51–60. [Google Scholar]
34.Glauber RJ. Time-dependent statistics of the Ising model. J Math Phys. 1963;4(2):294–304. [Google Scholar]
35.Caldarelli G, Capocci A, De Los Rios P, Muñoz MA. Scale-free networks from varying vertex intrinsic fitness. Phys Rev Lett. 2002;89(25):258702–258705. doi: 10.1103/PhysRevLett.89.258702. [DOI] [PubMed] [Google Scholar]
36.Deeds EJ, Ashenberg O, Shakhnovich EI. A simple physical model for scaling in protein-protein interaction networks. Proc Natl Acad Sci USA. 2006;103(2):311–316. doi: 10.1073/pnas.0509715102. [DOI] [PMC free article] [PubMed] [Google Scholar]
37.Shore J, Johnson R. Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy. IEEE Trans Inf Theory. 1980;26(1):26–37. [Google Scholar]
38.Jaynes ET. Information theory and statistical mechanics. Phys Rev. 1957;106(4):620–630. [Google Scholar]
39.Pressé S, Ghosh K, Lee J, Dill KA. The principles of Maximum Entropy and Maximum Caliber in statistical physics. Rev Mod Phys. 2013;85(3):1115–1141. [Google Scholar]
40.Tsallis C. Possible generalization of Boltzmann-Gibbs statistics. J Stat Phys. 1988;52(1-2):479–487. [Google Scholar]
41.Rènyi A. On measures of entropy and information. In: Neyman J, editor. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability. Berkeley, CA: Univ of California Press; 1961. pp. 547–561. [Google Scholar]
42.Aczél J, Daróczy Z. On Measures of Information and Their Characterizations. Vol 40. New York: Academic; 1975. [Google Scholar]
43.Amari S-i. Differential-Geometrical Methods in Statistic. Berlin: Springer; 1985. [Google Scholar]
44.Lutz E. Anomalous diffusion and Tsallis statistics in an optical lattice. Phys Rev A. 2003;67(5):051402.
45.Douglas P, Bergamini S, Renzoni F. Tunable Tsallis distributions in dissipative optical lattices. Phys Rev Lett. 2006;96(11):110601. doi: 10.1103/PhysRevLett.96.110601.
46.Burlaga L, Vinas A. Triangle for the entropic index q of non-extensive statistical mechanics observed by Voyager 1 in the distant heliosphere. Phys A Stat Mech Appl. 2005;356(2):375–384.
47.Pickup RM, Cywinski R, Pappas C, Farago B, Fouquet P. Generalized spin-glass relaxation. Phys Rev Lett. 2009;102(9):097202. doi: 10.1103/PhysRevLett.102.097202.
48.DeVoe RG. Power-law distributions for a trapped ion interacting with a classical buffer gas. Phys Rev Lett. 2009;102(6):063001. doi: 10.1103/PhysRevLett.102.063001.
49.Plastino A, Plastino A. Non-extensive statistical mechanics and generalized Fokker-Planck equation. Phys A Stat Mech Appl. 1995;222(1):347–354.
50.Tsallis C, Bukman DJ. Anomalous diffusion in the presence of external forces: Exact time-dependent solutions and their thermostatistical basis. Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics. 1996;54(3):R2197–R2200. doi: 10.1103/physreve.54.r2197.
51.Caruso F, Tsallis C. Nonadditive entropy reconciles the area law in quantum systems with classical thermodynamics. Phys Rev E Stat Nonlin Soft Matter Phys. 2008;78(2 Pt 1):021102. doi: 10.1103/PhysRevE.78.021102.
52.Abe S. Axioms and uniqueness theorem for Tsallis entropy. Phys Lett A. 2000;271(1):74–79.
53.Gell-Mann M, Tsallis C. Nonextensive Entropy: Interdisciplinary Applications. New York: Oxford Univ Press; 2004.
54.Dill K, Bromberg S. Molecular Driving Forces: Statistical Thermodynamics in Biology, Chemistry, Physics, and Nanoscience. 2nd Ed. New York: Garland Science; 2010.
55.Born M. [Volumes and heats of hydration of ions]. Z Phys. 1920;1:45–48. German.

Associated Data

Supplementary Materials

Supporting Information

supp_110_51_20380__index.html (7 KB, HTML)

1320578110_pnas.201320578SI.pdf (454.7 KB, PDF)

