2020 Jan 29;11:574. doi: 10.1038/s41467-020-14418-6

Fig. 3. Theoretical relationship between the shortest compression length and network prediction limit.

Networks are generated from a static random matrix Q whose elements are generated from the degree distribution of empirical networks. a Comparison between the ranking distribution from TBPA and that from the random matrix Q, based on the Metabolic network (ref. 29). The left half shows the ranking distribution of the coarse-grained probabilities $\tilde{q}_{ij}$, each of which is the average of N consecutive values of $q_{ij}$ ranked in descending order. The right half shows the TBPA ranking distribution of existing links for this artificially generated network. The probability $p_j$ is an average over 100 simulations. b Coarse-grained entropy $\tilde{H}_Q$ of the probability distribution $\tilde{q}_{ij}$ vs. the TBPA performance entropy $H_{\mathrm{TBPA}}$ for all of the artificial networks generated from the empirical ones. Each dot gives the pair ($H_{\mathrm{TBPA}}$, $\tilde{H}_Q$) of one artificial network; the two values match well. c Theoretical linear relationship between $L^{*}$ and $H^{*}_{\mathrm{TBPA}}$, calculated from the degree distributions of the empirical networks. Points of the same shape and color correspond to an empirical network (bottom-left) and its shuffled versions, i.e., versions with links shuffled a different number of times, up to the total number of original links. Each point gives the pair ($L^{*}$, $H^{*}_{\mathrm{TBPA}}$) of an artificial network with the same degree distribution as the corresponding (shuffled) empirical network. $L^{*}$ is calculated from Eq. (1) and $H^{*}_{\mathrm{TBPA}}$ from Eq. (6). The solid blue line is the average of the analytical slopes of the 12 studied networks, obtained from Eq. (7), and the shaded region denotes the standard deviation. The gray dashed line is the analytical result of Eq. (9) for the limit of 〈k〉 → . d Competition between the two terms of the slope in Eq. (8). e Empirical and theoretical values of the slope in the linear relationship between $L^{*}$ and $H^{*}_{\mathrm{TBPA}}$. The purple plane marks the empirical value of 1.63, and each colored point on it represents a real network. The curved surface below it shows the theoretical values of the slope given by Eq. (7). Each colored point on this lower surface represents an artificial network constructed with the same degree distribution as the original empirical network, and the subsequent points of the same color are artificial networks with randomly added links that increase 〈k〉. The red curve (ln N = k) on the lower surface is an estimated boundary (see Supplementary Note 9); our theory is not valid far to the left of it.
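The coarse-graining described for panels a and b can be illustrated with a minimal Python sketch. It sorts a set of pair probabilities in descending order, averages them in consecutive blocks of N values, and computes an entropy of the result. The function names `coarse_grain` and `shannon_entropy` are hypothetical, the toy values are invented for illustration, and using the Shannon entropy of the normalized distribution is an assumption: the caption does not reproduce the paper's own entropy definition.

```python
import numpy as np

def coarse_grain(q, block):
    """Sort q in descending order and average each consecutive run of `block` values."""
    q_sorted = np.sort(np.asarray(q, dtype=float))[::-1]
    n_blocks = len(q_sorted) // block  # a trailing partial block, if any, is dropped
    return q_sorted[: n_blocks * block].reshape(n_blocks, block).mean(axis=1)

def shannon_entropy(p):
    """Shannon entropy in nats of p, after normalizing p to sum to 1 (assumed definition)."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # zero entries contribute nothing to the entropy
    return float(-(p * np.log(p)).sum())

# Toy example (invented values): 12 pair probabilities, blocks of N = 3.
q = np.array([0.30, 0.02, 0.10, 0.05, 0.01, 0.20,
              0.04, 0.03, 0.08, 0.07, 0.06, 0.04])
q_tilde = coarse_grain(q, block=3)   # 4 coarse-grained values, in descending order
h_tilde = shannon_entropy(q_tilde)   # entropy of the coarse-grained distribution
```

Because the values are sorted before averaging, the coarse-grained sequence is itself non-increasing, which is what allows it to be read as a ranking distribution.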