PLOS One. 2012 Oct 19;7(10):e47278. doi: 10.1371/journal.pone.0047278

A Network Analysis of Countries’ Export Flows: Firm Grounds for the Building Blocks of the Economy

Guido Caldarelli 1,2,3, Matthieu Cristelli 2,4,*, Andrea Gabrielli 2,3, Luciano Pietronero 2,4,3, Antonio Scala 2,3, Andrea Tacchella 4,2
Editor: Alessandro Flammini5
PMCID: PMC3477170  PMID: 23094044

Abstract

In this paper we analyze the bipartite network of countries and products from UN data on country production. We define the country-country and product-product projected networks and introduce a novel method of filtering information based on elements’ similarity. As a result we find that country clustering reveals unexpected socio-geographic links among the most competing countries. On the same footing, the clustering of products can be efficiently used for a bottom-up classification of produced goods. Furthermore, we mathematically reformulate the “reflections method” introduced by Hidalgo and Hausmann as a fixpoint problem; this formulation highlights some conceptual weaknesses of the approach. To overcome these issues, we introduce an alternative methodology (based on biased Markov chains) that allows countries to be ranked in a conceptually consistent way. Our analysis uncovers a strong non-linear interaction between the diversification of a country and the ubiquity of its products, thus suggesting the possible need to move towards more efficient and direct non-linear fixpoint algorithms to rank countries and products in the global market.

Introduction

Complex Networks

Networks have emerged in recent years as the main mathematical tool for the description of complex systems. In particular, the mathematical framework of graph theory has made it possible to extract relevant information from different biological and social systems [1]–[3]. In this paper we use some concepts of network theory to address the problem of economic complexity [4]–[7].

Our work follows a long-standing interaction between economics and the physical sciences [8]–[12]; it explains, extends and complements a recent analysis of the network of trades between nations [13], [14]. Hidalgo and Hausmann (HH) address the problem of competitiveness and robustness of different countries in the global economy by studying the differences in the Gross Domestic Product and assuming that the development of a country is related to different “capabilities”. While countries cannot directly trade capabilities, it is the specific combination of those capabilities that results in the different products traded. More capabilities are supposed to bring higher returns, and the accumulation of new capabilities provides an exponentially growing advantage. Therefore the origin of the differences in the wealth of countries can be inferred from the record of trading activities, analyzed as expressions of the capabilities of the countries.

Revealed Comparative Advantage and the country-product Matrix

We consider here the Standard Trade Classification data for the years in the interval Inline graphic. In the following we shall analyze the year Inline graphic, but similar results apply for the other snapshots. For the year Inline graphic the data provides information on Inline graphic different countries and Inline graphic different products.

To make a fair comparison between the trades, it is useful to employ Balassa’s Revealed Comparative Advantage (RCA) [15], i.e. the ratio between the export share of product $p$ in country $c$ and the share of product $p$ in the world market

$$\mathrm{RCA}_{cp} = \frac{q_{cp}}{\sum_{p'} q_{cp'}} \Bigg/ \frac{\sum_{c'} q_{c'p}}{\sum_{c',p'} q_{c'p'}} \tag{1}$$

where $q_{cp}$ represents the dollar exports of country $c$ in product $p$.

We consider country $c$ to be a competitive exporter of product $p$ if its RCA is larger than some threshold value, which we take as 1 as in the standard economics literature; previous studies have verified that small variations around this threshold do not qualitatively change the results.

The network structure of the country-product competition is given by the semipositive matrix $M$ defined as

$$M_{cp} = \begin{cases} 1 & \text{if } \mathrm{RCA}_{cp} > R^{*} \\ 0 & \text{otherwise} \end{cases} \tag{2}$$

where $R^{*}$ is the threshold ($R^{*} = 1$).
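The two definitions above (RCA and the thresholded binary matrix) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names and the toy export table are hypothetical, and the sketch assumes every country and every product has non-zero total exports.

```python
import numpy as np

def rca_matrix(q):
    """Balassa RCA for an export table q of shape (countries, products).

    Assumes no country (row) and no product (column) has zero total exports.
    """
    q = np.asarray(q, dtype=float)
    country_share = q / q.sum(axis=1, keepdims=True)  # share of p in c's exports
    world_share = q.sum(axis=0) / q.sum()             # share of p in world exports
    return country_share / world_share

def binary_matrix(q, threshold=1.0):
    """M[c, p] = 1 where the RCA is larger than the threshold, else 0."""
    return (rca_matrix(q) > threshold).astype(int)

# Hypothetical toy data: 3 countries (rows) exporting 3 products (columns).
q_toy = np.array([[10.0, 10.0, 10.0],
                  [ 0.0,  5.0,  5.0],
                  [ 0.0,  0.0, 20.0]])
M_toy = binary_matrix(q_toy)
print(M_toy)
```

Note that in this toy table the first country exports all three products but is a competitive exporter of only two of them, because its share of the third product is below that product's world share.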

To such a matrix $M$ we can associate a graph whose nodes are divided into two sets, $\{c\}$ of $N_c$ nodes (the countries) and $\{p\}$ of $N_p$ nodes (the products), where a link between a node $c$ and a node $p$ exists if and only if $M_{cp} = 1$, i.e. a bipartite graph. The matrix $M$ is strictly related to the adjacency matrix of the country-product bipartite network.

The fundamental structure of the matrix $M$ is revealed by ordering the rows of the matrix by the number of exported products and the columns by the number of exporting countries: doing so, $M$ assumes a substantially triangular structure. Such structure reflects the fact that some countries export a large fraction of all products (highly diversified countries), and some products appear to be exported by most countries (ubiquitous products). Moreover, the countries that export few products tend to export only ubiquitous products, while highly diversified countries are the only ones to export the products that only few other countries export.
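The reordering described above can be sketched as follows. The function name and the small binary matrix are hypothetical, and ties between equal degrees are broken by original index, which is an arbitrary choice of this sketch.

```python
import numpy as np

def sort_by_degree(M):
    """Order rows by number of exported products and columns by number of
    exporting countries, both in decreasing order."""
    M = np.asarray(M)
    rows = np.argsort(-M.sum(axis=1), kind="stable")  # most diversified first
    cols = np.argsort(-M.sum(axis=0), kind="stable")  # most ubiquitous first
    return M[np.ix_(rows, cols)], rows, cols

# Hypothetical unsorted country-product matrix.
M_raw = np.array([[1, 0, 0],
                  [1, 1, 1],
                  [1, 0, 1]])
M_sorted, rows, cols = sort_by_degree(M_raw)
print(M_sorted)  # the ones now fill the upper-left triangle
```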

This triangular structure therefore reveals a systematic relationship between the diversification of countries and the ubiquity of the products they make. Poorly diversified countries have a revealed comparative advantage (RCA) almost exclusively in ubiquitous products, whereas the most diversified countries appear to be the only ones with RCAs in the less ubiquitous products, which in general are of higher value on the market. It is therefore plausible that such structure reflects a ranking among the nations.

The fact that the matrix is triangular rather than block-diagonal suggests that some successful countries are more diversified than expected. Countries add new products to the export mix while keeping, at the same time, their traditional productions. The structure of $M$ therefore contradicts most classical macroeconomic models, which always predict a specialization of countries in particular sectors of production (i.e. countries should aggregate in communities producing similar goods) that would result in a more or less block-diagonal matrix $M$.

In the following, we analyze the economic consequences of the structure of the bipartite country-product graph described by $M$. In particular, we analyze the community structure induced by $M$ on the projected networks of countries and products. As a second step, we reformulate HH’s reflection method as a linear fixpoint algorithm to determine the respective rankings of countries and products induced by $M$. In this way we are able to clarify the critical aspects of this method and its mathematical weakness. Finally, to assign proper weights to the countries, we formulate a mathematically well-defined biased Markov chain process on the country-product network; to account for the bipartite structure of the network, we introduce a two-parameter bias in this method. To select the optimal bias, we compare the results of our algorithm with a standard economic indicator, the gross domestic product (GDP). The optimal values of the parameters suggest a highly non-linear interaction between the number of different products produced by each country (diversification) and the number of different countries producing each product (ubiquity) in determining the competitiveness of countries and products. This fact suggests that, to better capture the essential features of the economic competition of countries, we need a more direct and efficient non-linear approach.

Results

The Network of Countries

In order to obtain an immediate understanding of the economic relations between countries induced by their products, a possible approach is to define a projection graph obtained from the original set of bipartite relations represented by the matrix $M$ [16]. The idea is to connect the various countries with a link whose strength is given by the number of products they mutually produce. In this way the information stored in the matrix $M$ is projected into the network of countries, as shown in Fig. 1.

Figure 1. The network of countries and products and the two possible projections.


The country network can be characterized by the $N_c \times N_c$ country-country matrix $C = M M^{\top}$. The non-diagonal elements $C_{cc'}$ correspond to the number of products that countries $c$ and $c'$ have in common (i.e. that are produced by both countries). They are a measure of their mutual competition, allowing a quantitative comparison between economic and financial systems [17]; the diagonal elements $C_{cc}$ correspond to the number of products produced by country $c$ and are a measure of the diversification of country $c$.

To quantify the competition among two countries, we can define the similarity matrix among countries as

graphic file with name pone.0047278.e046.jpg (3)

Note that Inline graphic and that small (large) values indicate small (large) correlations between the products of the two countries Inline graphic and Inline graphic. Similar approaches to define a correlation between vertices or a distance [18] have often been employed in the field of complex networks, for example to detect protein correlations [19] or to characterize the interdependencies among clinical traits of the orofacial system [20], [21].
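As an illustration of the projection and of a similarity built from it, the sketch below computes the country-country matrix as the product of the binary country-product matrix with its transpose. The normalization used in `similarity` (twice the shared products divided by the sum of the two diversifications) is an assumption of this sketch, chosen only because it is bounded between 0 and 1; the similarity equation itself is not reproduced in the text above, so the exact form used in the paper may differ. Function names and the toy matrix are hypothetical.

```python
import numpy as np

def country_projection(M):
    """C[c, c'] = number of products that countries c and c' both export."""
    M = np.asarray(M)
    return M @ M.T

def similarity(C):
    """ASSUMED normalization (not taken from the text):
    S[c, c'] = 2 * C[c, c'] / (C[c, c] + C[c', c']), which lies in [0, 1]."""
    diag = np.diag(C).astype(float)
    return 2.0 * C / (diag[:, None] + diag[None, :])

# Hypothetical binary country-product matrix (3 countries, 4 products).
M_toy = np.array([[1, 1, 1, 1],
                  [1, 1, 0, 0],
                  [0, 0, 1, 0]])
C_toy = country_projection(M_toy)
S_toy = similarity(C_toy)
print(C_toy)
print(S_toy.round(3))
```

The diagonal of `C_toy` holds the diversification of each country, and the same two functions applied to the transposed matrix give the product-product projection.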

The first problem for large correlation networks is how to visualize the relevant structure. The simplest approach to visualize the most similar vertices is realized by building a Minimal Spanning Tree (MST) [22], [23]. In this method, starting from an empty graph, edges $(c, c')$ are added in order of decreasing similarity until all the nodes are connected; to obtain a tree, edges that would introduce a loop are discarded. A further problem is to split the graph into smaller sub-graphs (communities) that share important common features, i.e. have strong correlations. Similarity, like analogous correlation indicators, can be used to detect the inner structure of a network; while different methods for community detection vary in their detailed implementation [24], [25], they give reasonably similar qualitative results when the indicators contain the same information.

The MST method can thus be generalized in order to detect the presence of communities by adding the extra condition that no edge is allowed between two nodes that have both already been connected to some other node. In this way we obtain a set of disconnected sub-trees (i.e. a forest) embedded in the MST. This Minimal Spanning Forest (MSF) method naturally splits the network of countries into separate subsets. The method allows for the visualization of correlations in a large network and at the same time performs a form of community detection that, if not precise, is certainly very fast.
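The MSF rule described above can be sketched as a single pass over the edges sorted by decreasing similarity. This is a minimal reading of the verbal description, not the authors' implementation: the function name and the toy similarity matrix are hypothetical, and discarding zero-similarity pairs is an assumption of this sketch.

```python
import numpy as np

def minimal_spanning_forest(S):
    """Return the list of edges (i, j) kept by the MSF rule.

    Edges are scanned from the most to the least similar pair; an edge is
    skipped when both endpoints are already attached to some other node.
    Because every accepted edge brings in at least one new node, no loop
    can ever be created.
    """
    S = np.asarray(S, dtype=float)
    n = S.shape[0]
    # Candidate edges with positive similarity (assumption: zero-similarity
    # pairs are never linked).
    edges = [(S[i, j], i, j) for i in range(n) for j in range(i + 1, n) if S[i, j] > 0]
    edges.sort(key=lambda e: -e[0])  # stable sort: ties keep index order
    touched = [False] * n
    forest = []
    for _, i, j in edges:
        if touched[i] and touched[j]:
            continue
        forest.append((i, j))
        touched[i] = touched[j] = True
    return forest

# Hypothetical similarity matrix for 4 nodes.
S_toy = np.array([[1.0, 0.9, 0.1, 0.1],
                  [0.9, 1.0, 0.7, 0.1],
                  [0.1, 0.7, 1.0, 0.8],
                  [0.1, 0.1, 0.8, 1.0]])
forest = minimal_spanning_forest(S_toy)
print(forest)
```

In this toy case the edge between nodes 1 and 2 (similarity 0.7), which a spanning tree would include, is rejected because both nodes are already attached, so the graph splits into two sub-trees.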

By visual inspection of Fig. 2 we can spot a large subtree composed of developed countries and some other subtrees in which clear geographical correlations are present. Notice that each subtree contains countries with very similar products, i.e. countries that are competing on the same markets. In particular, developing countries seem to be mostly direct competitors of their geographical neighbors. This feature, despite its high frequency in most geographical areas, is unexpected, since it is not the most rational choice [26], [27]: as an example, both banks [28] and countries [29] trade preferentially with similar partners, thereby affecting the overall robustness of the system [30], [31]. This behavior can be reproduced by simple statistical models based on agents’ fitnesses [32], [33].

Figure 2. The Minimal Spanning Forest for the Countries.


The various subgraphs have a distinct geographical similarity. We show in green northern European countries and in red the “Baltic” republics. In general neighboring (also in a social and cultural sense) countries compete for the production of similar goods.

The Network of Products

Similarly to countries, we can project the bipartite graph into a product network by connecting two products if they are produced by one or more of the same countries, and giving this link a weight proportional to the number of countries producing both products. Such a network can be represented by the $N_p \times N_p$ product-product matrix $P = M^{\top} M$. The non-diagonal elements $P_{pp'}$ correspond to the number of countries producing both $p$ and $p'$, while the diagonal elements $P_{pp}$ correspond to the number of countries producing $p$.

In analogy with Eq. (3), the similarity matrix among products is defined as

graphic file with name pone.0047278.e058.jpg (4)

It indicates how much products are correlated on a market: a value Inline graphic indicates that whenever product Inline graphic is present on the market of a country, also product Inline graphic would be present. This could be for example the case of two products Inline graphic, Inline graphic that are both necessary for the same and only industrial process.

As in the case of countries, the MSF algorithm can be applied to visualize correlations and detect communities. In the case of the product network this analysis leads to an apparently contradictory result. Products are officially characterized by a hierarchical topology assigned by the UN. Within this classification, similar items such as “metalliferous ores and metal scraps” (groups 27.xx) are in a totally different section from “non ferrous metals” (groups 68.xx). By applying our new algorithm, based on the economic competition network $M$, one would naively expect that products belonging to the same UN hierarchy should belong to the same community and vice versa; therefore, if we assign different colors to different UN hierarchies, one would expect all the nodes belonging to a single community to be of the same color. In Fig. 3 we show that this is not the case. This paradox can be understood by analyzing in closer detail the communities detected with the MSF method. As an example, we show in Fig. 4 a large community where most of the vertices belong to the area of “vehicle part and constituents”. In this cluster we can spot the noticeable presence of a vertex belonging to the “food” hierarchy. This apparent contradiction is resolved by noticing that this vertex refers to colza seeds, a plant recently used mostly for bio-fuels rather than for food: our MSF method has correctly positioned this “food” product in the “vehicle” cluster. Therefore, methods based on community detection could be considered as a possible rational substitute for current top-down “human-made” taxonomies [32].

Figure 3. The Minimal Spanning Forest (MSF) for the Products.


Nodes are colored according to the first digit used in the COMTRADE classification. This analysis should reveal correlations between different but similar products.

Figure 4. The largest tree in the Products MSF.


When passing from classification colors to the real product names, we see that they are all strongly related. Note the presence of colza seeds in the lower left corner of the figure.

Ranking Countries and Products by Reflection Method

Hidalgo and Hausmann (HH) have introduced in [13], [14] the fundamental idea that the complex set of capabilities of countries (in general hardly comparable between different countries) can be inferred from the structure of the matrix $M$ (which we can observe). In this spirit, ubiquitous products require few capabilities and can be produced by most countries, while diversified countries possess many capabilities, allowing them to produce most products. Therefore, the most diversified countries are expected to be amongst the top ones in the global competition; on the same footing, ubiquitous products are likely to correspond to low-quality products.

In order to refine such intuitions into a quantitative ranking among countries and products, the authors of [13], [14] have introduced two quantities: the $n$-th level diversification $d_c^{(n)}$ (called $k_{c,n}$ in [13], [14]) of the country $c$ and the $n$-th level ubiquity $u_p^{(n)}$ (called $k_{p,n}$ in [13], [14]) of the product $p$. At the zeroth order the diversification of a country is simply defined as the number of its products, or

$$d_c^{(0)} = \sum_{p=1}^{N_p} M_{cp} \equiv k_c \tag{5}$$

where $k_c$ is the degree of the node $c$ in the bipartite country-product network; analogously, the zeroth order ubiquity of a product is defined as the number of different countries producing it

$$u_p^{(0)} = \sum_{c=1}^{N_c} M_{cp} \equiv k_p \tag{6}$$

where $k_p$ is the degree of the node $p$ in the bipartite country-product network. The diversification $k_c$ is intended to represent the zeroth order measure of the “quality” of the country $c$, with the idea that the more products a country exports, the stronger its position on the market. The ubiquity $k_p$ is intended to represent the zeroth order measure of the “dis-value” of the product $p$ in the global competition, with the idea that the more countries produce a product, the lower its value on the market.

In the original approach these two initial quantities are refined iteratively via the so-called “reflections method”, which defines the diversification of a country at the $n$-th iteration as the average ubiquity of its products at the $(n-1)$-th iteration, and the ubiquity of a product at the $n$-th iteration as the average diversification of its producing countries at the $(n-1)$-th iteration:

$$d_c^{(n)} = \frac{1}{k_c} \sum_{p=1}^{N_p} M_{cp}\, u_p^{(n-1)}, \qquad u_p^{(n)} = \frac{1}{k_p} \sum_{c=1}^{N_c} M_{cp}\, d_c^{(n-1)} \tag{7}$$

In vectorial form, this can be cast as

$$\mathbf{d}^{(n)} = J_c\, \mathbf{u}^{(n-1)}, \qquad \mathbf{u}^{(n)} = J_p\, \mathbf{d}^{(n-1)} \tag{8}$$

where $\mathbf{d}^{(n)}$ is the $N_c$-dimensional vector of components $d_c^{(n)}$, $\mathbf{u}^{(n)}$ is the $N_p$-dimensional vector of components $u_p^{(n)}$, and where we have called $J_c = K_c^{-1} M$ and $J_p = K_p^{-1} M^{\top}$ (the upper suffix $\top$ stands for “transpose”), with $K_c$ and $K_p$ respectively the $N_c \times N_c$ and $N_p \times N_p$ square diagonal matrices defined by $(K_c)_{cc'} = k_c\, \delta_{cc'}$ and $(K_p)_{pp'} = k_p\, \delta_{pp'}$.

Such an approach suffers from some problems. The first one is related to the fact that the process is defined on a bipartite network, and therefore even and odd iterations have different meanings. In fact, let us consider the diversification $d_c^{(1)}$ of the $c$-th country: as prescribed by the algorithm, $d_c^{(1)}$ is the average ubiquity of the products of the $c$-th country at the $0$-th iteration. Therefore countries with the most ubiquitous (less valuable) products would get the highest first order diversification. On the other hand, the approximately triangular structure of $M$ tells us that these countries are the same ones with a small degree and therefore with a low value of the zeroth order diversification $d_c^{(0)}$. As shown in [13], [14], this is the case also at higher orders; therefore the diversifications at even and odd iterations are substantially anti-correlated. Conversely, successive even iterations are positively correlated, so that $d_c^{(2)}$ looks like a refinement of $d_c^{(0)}$, $d_c^{(4)}$ a refinement of $d_c^{(2)}$, and so on. The same considerations apply to the iterations for the ubiquity of products.
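The iteration and the even/odd alternation discussed above can be reproduced on a small example. The sketch below assumes the averaging rule stated in words (each country takes the mean ubiquity of its products from the previous step, and each product the mean diversification of its exporters); the function name and the perfectly triangular toy matrix are hypothetical.

```python
import numpy as np

def reflections(M, n_iter):
    """Return [(d0, u0), (d1, u1), ...] for the reflections iteration.

    d holds the country diversifications and u the product ubiquities;
    both are updated simultaneously from the values of the previous step.
    """
    M = np.asarray(M, dtype=float)
    k_c = M.sum(axis=1)  # zeroth order diversification
    k_p = M.sum(axis=0)  # zeroth order ubiquity
    d, u = k_c.copy(), k_p.copy()
    history = [(d, u)]
    for _ in range(n_iter):
        d, u = (M @ u) / k_c, (M.T @ d) / k_p
        history.append((d, u))
    return history

# Hypothetical triangular country-product matrix.
M_toy = np.array([[1, 1, 1],
                  [1, 1, 0],
                  [1, 0, 0]])
history = reflections(M_toy, 2)
d0, d1, d2 = (step[0] for step in history)
print(d0, d1, d2)
```

On this matrix the first iterate ranks the countries in the opposite order to the zeroth one, while the second iterate restores the original order, which is the alternation described in the text.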

The major problem in the HH algorithm is that it is a case of consensus dynamics [34], i.e. the state of a node at iteration $n$ is just the average of the state of its neighbors at iteration $n-1$. It is well known that such iterations have the uniform state (all the nodes equal) as their natural fixpoint. It is therefore puzzling how such an “equalizing” procedure could lead to any form of ranking. To solve this puzzle, let us write the HH algorithm as a simple iterative linear system and analyze its behavior.

Focusing only on even iterations and on diversifications, we can write the HH procedure as:

$$\mathbf{d}^{(2n)} = H\, \mathbf{d}^{(2n-2)} \tag{9}$$

where $H$, the product of the two single-step matrices of Eq. 8, with elements $H_{cc'} = \frac{1}{k_c} \sum_{p=1}^{N_p} \frac{M_{cp} M_{c'p}}{k_p}$, is an $N_c \times N_c$ square matrix.

The matrix $H$ in Eq. 9 is a Markovian stochastic matrix when it acts from the right on positive vectors, in the sense that every element $H_{cc'} \geq 0$ and

$$\sum_{c'=1}^{N_c} H_{cc'} = 1$$

In particular, for the given adjacency matrix $M$ it is also ergodic. Therefore, its spectrum of eigenvalues is bounded in absolute value by its unique largest eigenvalue $\lambda_1 = 1$. Since $H$ acts on $\mathbf{d}^{(2n)}$ from the left, the right eigenvector $\mathbf{e}_1$ corresponding to the largest eigenvalue $\lambda_1$ is simply a uniform vector with identical components, i.e. in the $n \to \infty$ limit $\mathbf{d}^{(2n)}$ converges to a fixpoint proportional to $\mathbf{e}_1$, where all countries have the same asymptotic diversification.

It is therefore no accident that HH prescribe stopping their algorithm at a finite number of iterations, and that they introduce as a recipe taking as the ranking of a country the rescaled version of the $2n$-th level diversifications [14]

$$\tilde{d}_c^{(2n)} = \frac{d_c^{(2n)} - \overline{d^{(2n)}}}{\sigma_{d^{(2n)}}} \tag{10}$$

where $\overline{d^{(2n)}}$ is the arithmetic mean of all $d_c^{(2n)}$ and $\sigma_{d^{(2n)}}$ the standard deviation of the same set. With this prescription, the HH algorithm seems to converge to an approximately constant value after Inline graphic steps.

This observed behavior can easily be explained by noticing that, in contrast with the erroneous statement in [14], finding the fitness by the reflection method can be reformulated as a fixpoint problem (our Eq. 9) and solved using the spectral properties of a linear system. In fact, owing to the ergodic Markovian nature of $H$, we can order its eigenvalues and eigenvectors such that $1 = \lambda_1 > |\lambda_2| \geq \dots \geq |\lambda_{N_c}|$. Therefore, expanding the initial condition in terms of the right eigenvectors $\mathbf{e}_i$ of $H$,

$$\mathbf{d}^{(0)} = \sum_{i=1}^{N_c} a_i\, \mathbf{e}_i$$

we can write the $2n$-th iterate as

$$\mathbf{d}^{(2n)} = H^{n}\, \mathbf{d}^{(0)} = a_1 \mathbf{e}_1 + a_2 \lambda_2^{n}\, \mathbf{e}_2 + O(\lambda_3^{n}) \tag{11}$$

Therefore, at sufficiently large $n$ the ordering of the countries is completely determined by the components of $\mathbf{e}_2$; notice that such an asymptotic ordering is independent of the initial condition $\mathbf{d}^{(0)}$ and therefore should be considered as the appropriate fixpoint renormalized fitness for all countries.
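The spectral argument can be checked numerically on a small example. The sketch below builds the even-step matrix by composing two averaging steps (average over a country's products, then over each product's exporters), verifies that its rows sum to one, and compares the direction of the deviations of a late even iterate with the eigenvector of the second largest eigenvalue. The function name and the toy matrix are hypothetical, and the composition used here is this sketch's reading of the verbal definition of the method.

```python
import numpy as np

def even_step_matrix(M):
    """H[c, c'] = (1 / k_c) * sum_p M[c, p] * M[c', p] / k_p."""
    M = np.asarray(M, dtype=float)
    k_c = M.sum(axis=1)
    k_p = M.sum(axis=0)
    return (M / k_c[:, None]) @ (M / k_p[None, :]).T

# Hypothetical triangular country-product matrix.
M_toy = np.array([[1, 1, 1],
                  [1, 1, 0],
                  [1, 0, 0]], dtype=float)
H = even_step_matrix(M_toy)

# Eigen-decomposition, sorted by decreasing eigenvalue.
eigvals, eigvecs = np.linalg.eig(H)
order = np.argsort(-eigvals.real)
lambdas = eigvals.real[order]
e2 = eigvecs[:, order[1]].real  # eigenvector of the second largest eigenvalue

# Ten even steps of the iteration, starting from the degrees.
d = M_toy.sum(axis=1)
for _ in range(10):
    d = H @ d

# The sign of an eigenvector is arbitrary, so compare up to sign.
agreement = abs(np.corrcoef(d, e2)[0, 1])
print(lambdas, d, agreement)
```

After ten even steps the three diversifications already agree to about six digits, yet their small residual differences are aligned with the second eigenvector; the eigenvector itself can be read off directly without iterating at all.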

What happens to the HH scheme? At sufficiently large Inline graphic, Inline graphic and Inline graphic; therefore Inline graphic becomes proportional to Inline graphic (Eq. 10). The number of iterations Inline graphic needed to converge is given by the ratio between Inline graphic and Inline graphic (Inline graphic; therefore the Inline graphic iterations prescribed by HH are not a general prescription but depend on the spectrum of the network analyzed.

Notice also that when the numerical reflection method is used, the renormalized fitness represents a deviation Inline graphic from a constant and can be detected only if it is bigger than the numerical error; therefore only “not too big” Inline graphic can be employed. On the other hand, the spectral characterization we propose does not suffer from such a pitfall even when. Similar considerations can be developed for the even iterations of the reflection method for the products.

Biased Markov Chain Approach and Non-linear Interactions

Having assessed the problems of HH’s method, we investigate the possibility of defining alternative linear algorithms able to implement similar economical intuitions about the ranking of the countries while keeping a more robust mathematical foundation. In formulating such a new scheme we will keep the approximation of linearity for the iterations even though we shall find in the results hints of the non-linear nature of the problem.

Our approach is inspired by the well-known PageRank algorithm [35]. PageRank (originally introduced for the WWW, where vertices are the pages) is one of the best-known Bonacich centrality measures [36]. In the original PageRank method the ranking of a vertex is proportional to the time spent on it by an unbiased random walker (in different contexts [11] analogous measures assess the stability of a firm in a business firm network).

We define the weights of vertices to be proportional to the time that an appropriately biased random walker on the network spends on them in the large time limit [37]. As shown below, such weights, being the generalization of $k_c$ and $k_p$, give a measure respectively of the competitiveness of countries and of the “dis-quality” (or lack of competitiveness) of products. As the nodes of our bipartite network are entities that are logically and conceptually separated (countries and products), we assign to the random walker a different bias when jumping from countries to products than when jumping from products to countries.

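Let us call $w_c^{(n)}$ the weight of country $c$ at the $n$-th iteration and $w_p^{(n)}$ the fitness of product $p$ at the $n$-th iteration. We define the following Markov process on the country-product bipartite network

$$w_c^{(n+1)}(\alpha,\beta) = \sum_{p=1}^{N_p} G_{cp}(\beta)\, w_p^{(n)}(\alpha,\beta), \qquad w_p^{(n+1)}(\alpha,\beta) = \sum_{c=1}^{N_c} G_{pc}(\alpha)\, w_c^{(n)}(\alpha,\beta) \tag{12}$$

where the Markov transition matrix $G$ is given by

$$G_{cp}(\beta) = \frac{M_{cp}\, k_c^{-\beta}}{\sum_{c'=1}^{N_c} M_{c'p}\, k_{c'}^{-\beta}}, \qquad G_{pc}(\alpha) = \frac{M_{cp}\, k_p^{-\alpha}}{\sum_{p'=1}^{N_p} M_{cp'}\, k_{p'}^{-\alpha}} \tag{13}$$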

Here $G_{cp}(\beta)$ gives the probability to jump from product $p$ to country $c$ in a single step, and $G_{pc}(\alpha)$ the probability to jump from country $c$ to product $p$, also in a single step. Note that Eqs. (13) define an $(N_c + N_p)$-dimensional connected Markov chain of period two. Therefore, random walkers initially starting from countries will be found on products at odd steps and on countries at even ones; the reverse happens for random walkers starting from products. By considering separately the random walkers starting from countries and from products, we can reduce this Markov chain to two ergodic Markov chains of respective dimension $N_c$ and $N_p$. In particular, if the walker starts from a country, using a vectorial formalism, we can write for the weights of countries

$$\mathbf{w}_c^{(n+1)}(\alpha,\beta) = T\, \mathbf{w}_c^{(n-1)}(\alpha,\beta) \tag{14}$$

where the $N_c \times N_c$ ergodic stochastic matrix $T$ is defined by

$$T_{cc'}(\alpha,\beta) = \sum_{p=1}^{N_p} G_{cp}(\beta)\, G_{pc'}(\alpha) \tag{15}$$

At the same time for products we can write

$$\mathbf{w}_p^{(n+1)}(\alpha,\beta) = S\, \mathbf{w}_p^{(n-1)}(\alpha,\beta) \tag{16}$$

where the $N_p \times N_p$ ergodic stochastic matrix $S$ is given by

$$S_{pp'}(\alpha,\beta) = \sum_{c=1}^{N_c} G_{pc}(\alpha)\, G_{cp'}(\beta) \tag{17}$$

Given the structure of $T$ and $S$, it is simple to show that the two matrices share the same spectrum, which is upper bounded in modulus by the unique eigenvalue $\mu_1 = 1$. For both matrices, the eigenvectors corresponding to $\mu_1$ are the stationary and asymptotic weights $\{w_c^{*}(\alpha,\beta)\}$ and $\{w_p^{*}(\alpha,\beta)\}$ of the Markov chains. In order to find such asymptotic values analytically, we apply the detailed balance condition:

$$G_{pc}(\alpha)\, w_c^{*}(\alpha,\beta) = G_{cp}(\beta)\, w_p^{*}(\alpha,\beta) \quad \forall\, (c,p) \tag{18}$$

which gives

$$w_c^{*} = A \left( \sum_{p=1}^{N_p} M_{cp}\, k_p^{-\alpha} \right) k_c^{-\beta}, \qquad w_p^{*} = B \left( \sum_{c=1}^{N_c} M_{cp}\, k_c^{-\beta} \right) k_p^{-\alpha} \tag{19}$$

where $A$ and $B$ are normalization constants. Note that for $\alpha = \beta = 0$ Eq. (13) gives the completely unbiased random walk, for which $T = H^{\top}$, where $H$ is given in Eq. (9). Therefore, in this case Eqs. (19) become

$$w_c^{*} \propto k_c, \qquad w_p^{*} \propto k_p \tag{20}$$

as for the case of unbiased random walks on a simple connected network the asymptotic weight of a node is proportional to its connectivity. Thus, in the case of $\alpha = \beta = 0$ we recover the zeroth order iteration of HH’s reflection method. Note that, in the same spirit as HH, $k_c$ gives a rough measure of the competitiveness of country $c$, while $k_p$ gives an approximate measure of the dis-quality in the market of product $p$. By continuity, we associate the same meaning of competitiveness/dis-quality with the stationary states $w_c^{*}$/$w_p^{*}$ at different values of $\alpha$ and $\beta$.
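A minimal numerical sketch of the biased walk follows. It assumes single-step jump probabilities in which the walker leaving a product picks an exporting country with probability proportional to the country degree raised to the power minus beta, and the walker leaving a country picks one of its products with probability proportional to the product degree raised to the power minus alpha; this form is an assumption of the sketch, chosen to be consistent with the limits discussed in the text. The function names, the toy matrix and the bias values are hypothetical.

```python
import numpy as np

def stationary_weights(M, alpha, beta):
    """Closed-form stationary weights from detailed balance (normalized to 1)."""
    M = np.asarray(M, dtype=float)
    k_c = M.sum(axis=1)
    k_p = M.sum(axis=0)
    w_c = (M @ k_p ** (-alpha)) * k_c ** (-beta)
    w_p = (M.T @ k_c ** (-beta)) * k_p ** (-alpha)
    return w_c / w_c.sum(), w_p / w_p.sum()

def country_transition(M, alpha, beta):
    """Two-step country matrix T[c, c'] = sum_p G_cp(beta) * G_pc'(alpha)."""
    M = np.asarray(M, dtype=float)
    k_c = M.sum(axis=1)
    k_p = M.sum(axis=0)
    # Jump product -> country: normalized over countries for each product.
    G_cp = M * k_c[:, None] ** (-beta)
    G_cp = G_cp / G_cp.sum(axis=0, keepdims=True)
    # Jump country -> product: normalized over products for each country.
    G_pc = M * k_p[None, :] ** (-alpha)
    G_pc = G_pc / G_pc.sum(axis=1, keepdims=True)
    return G_cp @ G_pc.T

# Hypothetical triangular country-product matrix and arbitrary bias values.
M_toy = np.array([[1, 1, 1],
                  [1, 1, 0],
                  [1, 0, 0]])
w_c, w_p = stationary_weights(M_toy, alpha=1.0, beta=0.5)
T = country_transition(M_toy, alpha=1.0, beta=0.5)
print(w_c, T @ w_c)  # the closed form is left unchanged by T

w_c0, w_p0 = stationary_weights(M_toy, alpha=0.0, beta=0.0)
print(w_c0)  # unbiased walk: proportional to the country degrees
```

The closed form avoids iterating the chain altogether, which is the practical advantage of the detailed balance solution.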

To understand the behavior of our ranking with respect to the bias, we have analyzed the mean correlation (square of the Pearson coefficient) for the year 1998 (other years give analogous results) between the logarithm of the GDP of each country and its weight (Eqs. (19)) for different values of $\alpha$ and $\beta$ (see Fig. 5). We are aware that GDP is not an absolute measure of wealth [38], as it does not account directly for relevant quantities like the wealth due to natural resources [39]. Nevertheless, we expect GDP to increase monotonically with wealth. What network analysis shows is that the number of products is correlated with both quantities. We envisage this kind of analysis as a way to define suitable policies for underdeveloped countries [40].
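The parameter scan described above can be sketched as a double loop over the two bias parameters. The sketch computes a plain squared Pearson coefficient between the logarithm of GDP and the stationary country weights; how the "mean" correlation is averaged in the paper is not specified here, so that step is omitted. The weights use the same assumed closed form as a degree-biased walk (sum of product degrees to the power minus alpha, times country degree to the power minus beta). All names and the GDP values are hypothetical placeholders, not real data.

```python
import numpy as np

def r_squared(x, y):
    """Square of the Pearson correlation coefficient."""
    return np.corrcoef(x, y)[0, 1] ** 2

def scan_bias(M, gdp, alphas, betas):
    """R^2 between log(GDP) and country weights on a grid of (alpha, beta).

    Grid points where all weights coincide would give an undefined
    correlation and should be avoided.
    """
    M = np.asarray(M, dtype=float)
    k_c = M.sum(axis=1)
    k_p = M.sum(axis=0)
    log_gdp = np.log(np.asarray(gdp, dtype=float))
    R2 = np.empty((len(alphas), len(betas)))
    for i, a in enumerate(alphas):
        for j, b in enumerate(betas):
            w_c = (M @ k_p ** (-a)) * k_c ** (-b)
            R2[i, j] = r_squared(log_gdp, w_c)
    return R2

# Hypothetical toy inputs (the GDP figures are made up for illustration).
M_toy = np.array([[1, 1, 1],
                  [1, 1, 0],
                  [1, 0, 0]])
gdp_toy = np.exp([3.0, 2.0, 1.0])
R2 = scan_bias(M_toy, gdp_toy, alphas=[0.0, 1.0], betas=[0.0, 0.5])
best = np.unravel_index(np.argmax(R2), R2.shape)
print(R2, best)
```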

Figure 5. The plot of the mean correlation (square of the Pearson coefficient, $R^2$) between the logarithm of GDP and the fixpoint weights of countries in the biased (Markovian) random walk method as a function of the parameters $\alpha$ and $\beta$.


The contour plot for a level of Inline graphic is indicated as a green loop in the orange region (year Inline graphic 1998).

It is interesting to note that the region of large correlations (region inside the contour plot in the Fig. 5) is found in the positive quadrant for about Inline graphic and Inline graphic; in particular the maximal value is approximately at Inline graphic and Inline graphic. These results can be connected with the approximately “triangular” shape of the matrix Inline graphic. In fact, let us rewrite Eqs. (19) (apart from the normalization constant) as:

$$w_c^{*} \propto k_c^{1-\beta}\, \langle k_p^{-\alpha} \rangle_c, \qquad w_p^{*} \propto k_p^{1-\alpha}\, \langle k_c^{-\beta} \rangle_p$$

where Inline graphic is the arithmetic average of Inline graphic of the products exported by country Inline graphic and Inline graphic is the arithmetic average of Inline graphic for countries exporting product Inline graphic. Since Inline graphic is substantially positive and slightly smaller of Inline graphic and Inline graphic is definitely positive with optimal values around Inline graphic, the competitive countries will be characterized by a good balance between a high value of Inline graphic and a small typical value of Inline graphic of its products. Nevertheless, since the optimal values of Inline graphic are distributed up to the region of values much larger than 1 (i.e. Inline graphic is significantly smaller than Inline graphic), we see that the major role for the asymptotic weight of a country is played by the presence in its portfolio of un-ubiquitous products which alone give the dominant contribution to Inline graphic. A similar reasoning leads to the conclusion that the dis-value of a product is basically determined by the presence in the set of its producers of poorly diversified countries that are basically exporting only products characterized by a low level of complexity.

Our new approach based on biased Markov chain theory thus makes it possible to implement the interesting ideas developed by HH in [14] on a more solid mathematical basis, using the framework of linear iterated transformations and avoiding the indicated flaws of HH’s “reflection method”. Interestingly, our results reveal a strongly non-linear entanglement between the two basic pieces of information one can extract from the matrix $M$: diversification of countries and ubiquity of products. In particular, this non-linear relation makes explicit an almost extremal influence of the ubiquity of products on the competitiveness of a country in the global market: having “good” or complex products in the portfolio is more important than having many products of poor value. Furthermore, the information that a product has among its producers some poorly diversified countries is nearly sufficient to say that it is a non-complex (dis-valuable) product in the market. This strongly non-linear entanglement between diversifications of countries and ubiquities of products indicates the necessity of going beyond the linear approach, in order to introduce a sounder and more direct description of the competition of countries and products, possibly based on a suitable ab initio non-linear approach characterized by a smaller number of ad hoc assumptions [41].

Discussion

In this paper we applied methods of graph theory to the analysis of the economic production of countries. The information is available in the form of an Inline graphic rectangular matrix Inline graphic giving the production of each of the possible Inline graphic goods for each of the Inline graphic countries. The matrix Inline graphic corresponds to a bipartite graph, the country-product network, that can be projected into the country-country network Inline graphic and the product-product network Inline graphic. By using complex-network analysis, we can attain an effective filtering of the information contained in Inline graphic and Inline graphic. We introduce a new filtering algorithm that identifies communities of countries with similar production. As an unexpected result, this analysis shows that neighboring countries tend to compete over the same markets instead of diversifying. We also show that a classification of goods based on such filtering provides an alternative product taxonomy determined by the countries’ activity. We then study the ranking of the countries induced by the country-product bipartite network. We first show that HH’s ranking is the fixpoint of a linear process; in this way we can avoid some logical and numerical pitfalls and clarify some of its weak theoretical points. Finally, in analogy with the Google PageRank algorithm, we define a biased, two-parameter Markov chain algorithm to assign ranking weights to countries and products by taking into account the structure of the adjacency matrix of the country-product bipartite network. By correlating the fixpoint ranking (i.e. the competitiveness of countries and products) with the GDP of each country, we find that the optimal bias parameters of the algorithm indicate a strongly non-linear interaction between the diversification of the countries and the ubiquity of the products.
The remaining discrepancies between fitness and GDP arise because the two quantities measure related but different things. In particular, while GDP is a measure of the wealth of a country, fitness measures the ability of a country to sustain its growth or to recover from crises.

Materials and Methods

Graphs

A graph is a pair Inline graphic, where Inline graphic is the set of vertices and Inline graphic is the set of edges. A graph Inline graphic can be represented via its adjacency matrix Inline graphic.

graphic file with name pone.0047278.e284.jpg (21)

The degree Inline graphic of the node Inline graphic is the number Inline graphic of its neighbors.

An unbiased random walk on a graph Inline graphic is characterized by a probability Inline graphic of jumping from a vertex Inline graphic to one of its Inline graphic neighbors and is described by the jump matrix

graphic file with name pone.0047278.e292.jpg (22)

where Inline graphic is the diagonal matrix Inline graphic corresponding to the node degrees.
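The definitions above can be illustrated with a minimal sketch. It assumes an undirected, unweighted graph stored as a 0/1 adjacency matrix and adopts the row-stochastic convention, in which each row of the jump matrix is the corresponding row of the adjacency matrix divided by the node degree; the convention and the function names `degrees` and `jump_matrix` are assumptions made for illustration, not taken from the original equations.

```python
def degrees(adj):
    """Degree of each node: the number of its neighbors (row sums of A)."""
    return [sum(row) for row in adj]


def jump_matrix(adj):
    """Row-stochastic jump matrix of an unbiased walk: T[i][j] = A[i][j] / k_i.

    Isolated nodes (k_i = 0) get an all-zero row; this is an illustrative choice.
    """
    k = degrees(adj)
    return [
        [a / k[i] if k[i] > 0 else 0.0 for a in row]
        for i, row in enumerate(adj)
    ]


# Toy undirected graph (hypothetical): a path 0 - 1 - 2
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
T = jump_matrix(A)  # from node 1 the walker jumps to 0 or 2 with probability 1/2
```

With this convention every row of `T` belonging to a non-isolated node sums to one, which is the normalization expected of jump probabilities.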

Bipartite Graphs

A bipartite graph is a triple Inline graphic where Inline graphic and Inline graphic are two disjoint sets of vertices, and Inline graphic is the set of edges, i.e. edges exist only between vertices of the two different sets Inline graphic and Inline graphic.

The bipartite graph Inline graphic can be described by the matrix Inline graphic defined as

graphic file with name pone.0047278.e303.jpg (23)

In terms of Inline graphic, it is possible to define the adjacency matrix Inline graphic of Inline graphic as

graphic file with name pone.0047278.e307.jpg (24)

It is also useful to define the co-occurrence matrices Inline graphic Inline graphic and Inline graphic, which respectively count the number of common neighbors between two vertices of Inline graphic or of Inline graphic. Inline graphic is the weighted adjacency matrix of the co-occurrence graph Inline graphic with vertices on Inline graphic, where each non-zero element of Inline graphic corresponds to an edge between vertices Inline graphic and Inline graphic with weight Inline graphic. The same holds for the co-occurrence matrix Inline graphic and the co-occurrence graph Inline graphic.

Many projection schemes for a bipartite graph Inline graphic start from constructing the graphs Inline graphic or Inline graphic and eliminating the edges whose weights are less than a given threshold or whose statistical significance is low.
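A minimal sketch of such a projection is given below. It assumes the bipartite graph is stored as a 0/1 matrix with one row per vertex of the first set and one column per vertex of the second set, and it counts common neighbors as inner products of rows. The helper names (`cooccurrence`, `transpose`, `threshold_projection`) and the toy matrix are hypothetical; the sketch implements only the simple weight threshold, not a statistical-significance filter.

```python
def cooccurrence(M):
    """Co-occurrence counts between rows of M: P[i][j] = sum_k M[i][k] * M[j][k]."""
    n = len(M)
    return [
        [sum(a * b for a, b in zip(M[i], M[j])) for j in range(n)]
        for i in range(n)
    ]


def transpose(M):
    """Transpose of a rectangular matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def threshold_projection(P, theta):
    """Edges (i, j, weight), i < j, kept when the weight is not less than theta."""
    n = len(P)
    return [
        (i, j, P[i][j])
        for i in range(n)
        for j in range(i + 1, n)
        if P[i][j] >= theta
    ]


# Toy bipartite graph (hypothetical): 3 vertices in the first set, 4 in the second
M = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
]
P_rows = cooccurrence(M)             # 3 x 3: common neighbors within the first set
P_cols = cooccurrence(transpose(M))  # 4 x 4: common neighbors within the second set
edges = threshold_projection(P_rows, theta=2)
```

The diagonal of each co-occurrence matrix holds the degree of the corresponding vertex, so only the off-diagonal entries are used as edge weights.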

Matrix from RCA

To make a fair comparison between the exports, it is useful to employ Balassa’s Revealed Comparative Advantage (RCA) [15], i.e. the ratio between the export share of product Inline graphic in country Inline graphic and the share of product Inline graphic in the world market

graphic file with name pone.0047278.e328.jpg (25)

where Inline graphic represents the dollar exports of country Inline graphic in product Inline graphic.
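For reference, the verbal definition above corresponds to the following standard way of writing Balassa's index. The symbols are an assumption made here for illustration: q_{cp} denotes the dollar exports of country c in product p, and primed indices are summation variables.

```latex
\mathrm{RCA}_{cp}
  = \frac{q_{cp} \Big/ \sum_{p'} q_{cp'}}
         {\sum_{c'} q_{c'p} \Big/ \sum_{c'p'} q_{c'p'}}
```

The numerator is the share of product p in the exports of country c, and the denominator is the share of product p in total world exports.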

The network structure is given by the country-product adjacency matrix Inline graphic defined as

graphic file with name pone.0047278.e333.jpg (26)

where Inline graphic is the threshold. A positive entry, Inline graphic, tells us that country Inline graphic is a competitive exporter of the product Inline graphic.
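The construction of the country-product matrix from export data can be sketched as follows. The table `q`, the function names `rca` and `binarize`, the threshold value of 1.0 and the use of a non-strict comparison are all assumptions chosen for illustration; the text above only states that a threshold is applied to the RCA.

```python
def rca(q):
    """Balassa RCA for an export table q[c][p] (rows: countries, columns: products)."""
    country_tot = [sum(row) for row in q]        # total exports of each country
    product_tot = [sum(col) for col in zip(*q)]  # world exports of each product
    world_tot = sum(country_tot)                 # total world exports
    return [
        [
            (q[c][p] / country_tot[c]) / (product_tot[p] / world_tot)
            if country_tot[c] > 0 and product_tot[p] > 0 else 0.0
            for p in range(len(product_tot))
        ]
        for c in range(len(q))
    ]


def binarize(rca_table, threshold):
    """Entry is 1 where the RCA reaches the threshold (assumed non-strict), else 0."""
    return [[1 if x >= threshold else 0 for x in row] for row in rca_table]


# Toy export table in dollars (hypothetical): 2 countries, 2 products
q = [
    [80.0, 20.0],
    [20.0, 80.0],
]
R = rca(q)
M = binarize(R, threshold=1.0)  # threshold value is an assumption
```

In this toy table each country devotes 80% of its exports to a product that makes up 50% of world trade, so its RCA for that product is 1.6 and the corresponding entry of `M` is one.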

Minimal Spanning Forest

The spanning forest algorithm (SFA) is a computationally less demanding variant of the Spanning Tree Algorithm (STA), where single operations can take up to Inline graphic with respect to the STA case where all operations are Inline graphic. Here cluster is a synonym for connected component.

To analyze the performance of the SFA, we use as a benchmark a weighted network with well-defined communities. We consider the graph Inline graphic composed by joining Inline graphic communities, each consisting of a clique of Inline graphic nodes; the total number of nodes is Inline graphic. A function Inline graphic associates to each node Inline graphic its community Inline graphic; links between nodes Inline graphic and Inline graphic have weight Inline graphic. Thus, links inside a community have weight one, while links between separate communities have smaller weights. We also consider the extremely noisy case Inline graphic where weights between nodes Inline graphic and Inline graphic are random variables uniformly distributed in the interval Inline graphic.

Furthermore, for a weighted graph Inline graphic we shall also consider the associated threshold graph Inline graphic, where Inline graphic is the subset of edges in Inline graphic having weight higher than the threshold Inline graphic. The threshold graph Inline graphic corresponds to the separated Inline graphic communities for Inline graphic.

Finally, to compare the minimum spanning forest Inline graphic with a threshold graph Inline graphic, we consider the overlap Inline graphic to be the fraction of links in Inline graphic that belong to the same cluster of Inline graphic.
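The comparison between a forest and a threshold graph can be sketched as below. The sketch does not implement the SFA itself: the forest is passed in as a plain edge list. The overlap is computed under one reading of the definition above, namely the fraction of forest links whose two endpoints fall in the same cluster of the threshold graph. The inter-community weight of 0.2, the hand-written forest and all function names are hypothetical.

```python
def clusters(n, edges):
    """Connected-component label for each of n nodes, via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
    return [find(x) for x in range(n)]


def threshold_edges(weighted_edges, theta):
    """Edges of the threshold graph: those with weight higher than theta."""
    return [(i, j) for i, j, w in weighted_edges if w > theta]


def overlap(forest_edges, labels):
    """Fraction of forest links whose endpoints share a threshold-graph cluster."""
    if not forest_edges:
        return 0.0
    same = sum(1 for i, j in forest_edges if labels[i] == labels[j])
    return same / len(forest_edges)


# Toy benchmark (hypothetical weights): 2 cliques of 3 nodes each,
# weight 1.0 inside a community and 0.2 between communities
n_comm, size = 2, 3
n = n_comm * size
community = [v // size for v in range(n)]
weighted = [
    (i, j, 1.0 if community[i] == community[j] else 0.2)
    for i in range(n) for j in range(i + 1, n)
]
labels = clusters(n, threshold_edges(weighted, theta=0.5))
n_clusters = len(set(labels))              # clusters of the threshold graph
forest = [(0, 1), (1, 2), (2, 3), (3, 4)]  # hand-written forest, one cross link
o = overlap(forest, labels)
```

Here the threshold graph separates into the two cliques, and three of the four forest links stay inside a clique, so the overlap is 0.75.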

In the non-random case, the SFA correctly identifies the communities and Inline graphic equals the number of clusters Inline graphic of Inline graphic. Notice that the ratio Inline graphic between the number of clusters Inline graphic of Inline graphic, plotted versus the threshold Inline graphic, intersects the overlap Inline graphic when Inline graphic is the correct number of communities. The left panel of Fig. 6 shows such behavior for Inline graphic.

Figure 6. Graphs of the overlap Inline graphic between the spanning forest and threshold graph and the ratio Inline graphic versus the threshold Inline graphic.


Here Inline graphic is the number of clusters in the threshold graph and Inline graphic is the number of clusters in the spanning forest. (Left panel) Curves for Inline graphic. In this deterministic case, Inline graphic equals the number of communities and both curves intersect when Inline graphic. (Right panel) Curves for Inline graphic; curves are averaged over Inline graphic configurations of the noise.

In the noisy case, we find that Inline graphic overestimates Inline graphic; on the other hand, Inline graphic intersects Inline graphic at Inline graphic for values Inline graphic less than one, and Inline graphic gives a better estimate of Inline graphic. This effect is shown in Table 1, which reports, for several values of Inline graphic, Inline graphic, the proximity of Inline graphic to the expected number of communities Inline graphic. The right panel of Fig. 6 shows the intersection of the curves for Inline graphic in the noisy case.

Table 1. Example of estimates of the number of communities for the noisy case; notice that Inline graphic is close to the expected value Inline graphic.

Inline graphic Inline graphic Inline graphic Inline graphic
10 5 14 11.1
9 5 13 10.1
7 5 9 7.2
5 10 12 6.2
5 7 8 6.5
5 5 7 5.3

Intersection points between Inline graphic and Inline graphic are calculated by averaging curves over Inline graphic random samples.

Acknowledgments

We thank EU FET Open project FOC nr.255987 and CNR-PNR National Project “Crisis-Lab” for support.

Funding Statement

Supporting Grant: EU Future and Emerging Technologies (FET) Open project FOC nr.255987. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Caldarelli G (2007) Scale-Free Networks: Complex Webs in Nature and Technology. Oxford University Press.
  • 2. Battiston S, Delli Gatti D, Gallegati M, Greenwald B, Stiglitz JE (2007) Credit chains and bankruptcy propagation in production networks. Journal of Economic Dynamics and Control 31: 2061–2084.
  • 3. Gabrielli A, Caldarelli G (2007) Invasion percolation and critical transient in the Barabási model of human dynamics. Physical Review Letters 98: 208701.
  • 4. Galluccio S, Caldarelli G, Marsili M, Zhang YC (1997) Scaling in currency exchange. Physica A 245: 423.
  • 5. Jackson MO (2008) Social and Economic Networks. Princeton University Press.
  • 6. Borgatti SP, Mehra A, Brass DJ, Labianca G (2009) Network Analysis in the Social Sciences. Science 323: 892–895.
  • 7. Haldane AG, May RM (2011) Systemic risk in banking ecosystems. Nature 469: 351–355.
  • 8. Stanley HE, Amaral LAN, Buldyrev SV, Gopikrishnan P, Plerou V, et al. (2002) Self-organized complexity in economics and finance. Proceedings of the National Academy of Sciences of the United States of America 99: 2561–2565.
  • 9. Serrano MA, Boguñá M (2003) Topology of the world trade web. Phys Rev E 68: 15101.
  • 10. Schweitzer F, Fagiolo G, Sornette D, Vega-Redondo F, Vespignani A, et al. (2009) Economic Networks: The New Challenges. Science 325: 422–425.
  • 11. Fu D, Pammolli F, Buldyrev SV, Riccaboni M, Matia K, et al. (2005) The growth of business firms: Theoretical framework and empirical evidence. Proceedings of the National Academy of Sciences of the United States of America 102: 18801–18806.
  • 12. Majumder SR, Diermeier D, Rietz TA, Amaral LA (2009) Price dynamics in political prediction markets. Proceedings of the National Academy of Sciences 106: 679–684.
  • 13. Hidalgo CA, Klinger B, Barabási AL, Hausmann R (2007) The Product Space Conditions the Development of Nations. Science 317: 482–487.
  • 14. Hidalgo CA, Hausmann R (2009) The building blocks of economic complexity. Proceedings of the National Academy of Sciences 106: 10570–10575.
  • 15. Balassa B (1965) Trade liberalization and ‘revealed’ comparative advantage. Manchester School 33: 99–123.
  • 16. Bellman R (1997) Introduction to matrix analysis (2nd ed.). Philadelphia, PA, USA: Society for Industrial and Applied Mathematics.
  • 17. Johnson N, Lux T (2011) Financial systems: Ecology and economics. Nature 469: 302–303.
  • 18. Bonanno G, Caldarelli G, Lillo F, Mantegna RN (2003) Topology of correlation-based minimal spanning trees in real and model markets. Phys Rev E 68: 46130.
  • 19. Brun C, Chevenet F, Martin D, Wojcik J, Guénoche A, et al. (2003) Functional classification of proteins for the prediction of cellular function from a protein-protein interaction network. Genome Biology 5.
  • 20. Auconi P, Caldarelli G, Scala A, Ierardo G, Polimeni A (2011) A network approach to orthodontic diagnosis. Orthodontics & Craniofacial Research 14: 189–197.
  • 21. Scala A, Auconi P, Scazzocchio M, Caldarelli G, McNamara J, et al. (2012) Using networks to understand medical data: the case of class III malocclusions. PLoS ONE.
  • 22. Mantegna RN (1999) Hierarchical structure in financial markets. European Physical Journal B 11: 193–197.
  • 23. Mantegna RN, Stanley HE (2000) An Introduction to Econophysics: Correlations and Complexity in Finance. Cambridge Univ. Press, Cambridge UK.
  • 24. Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proceedings of the National Academy of Sciences 99: 7821–7826.
  • 25. Fortunato S (2010) Community detection in graphs. Physics Reports 486: 75–174.
  • 26. Farmer JD, Lo AW (1999) Frontiers of finance: Evolution and efficient markets. Proceedings of the National Academy of Sciences 96: 9991–9992.
  • 27. Chi Ho Yeung YCZ (2009) Minority Games, Springer. 5588–5604.
  • 28. De Masi G, Iori G, Caldarelli G (2006) Fitness model for the Italian interbank money market. Phys Rev E 74: 66112.
  • 29. Garlaschelli D, Loffredo MI (2004) Fitness-Dependent Topological Properties of the World Trade Web. Phys Rev Lett 93: 188701.
  • 30. Podobnik B, Horvatic D, Petersen AM, Urošević B, Stanley HE (2010) Bankruptcy risk model and empirical tests. Proceedings of the National Academy of Sciences.
  • 31. Buldyrev SV, Parshani R, Paul G, Stanley HE, Havlin S (2010) Catastrophic cascade of failures in interdependent networks. Nature 464: 1025–1028.
  • 32. Capocci A, Caldarelli G (2008) Taxonomy and clustering in collaborative systems: the case of the on-line encyclopedia Wikipedia. EPL 81: 28006.
  • 33. Garlaschelli D, Capocci A, Caldarelli G (2007) Self–organized network evolution coupled to extremal dynamics. Nature Physics 3: 813–817.
  • 34. Shamma JS (2008) Cooperative Control of Distributed Multi-Agent Systems. Wiley-Interscience. ISBN 978-0-470-06031-5.
  • 35. Page L, Brin S, Motwani R, Winograd T (1999) The PageRank citation ranking: bringing order to the web. Stanford InfoLab University website, Accessed 2012 Sep 19. URL http://dbpubs.stanford.edu:8090/pub/1999-66.
  • 36. Bonacich P (1987) Power and Centrality: A Family of Measures. American Journal of Sociology 92: 1170–1182.
  • 37. Zlatić V, Gabrielli A, Caldarelli G (2010) Topologically biased random walk and community finding in networks. Physical Review E 82: 066109+.
  • 38. Arrow KJ, Dasgupta P, Goulder LH, Mumford KJ, Oleson K (2010) Sustainability and the Measurement of Wealth. National Bureau of Economic Research Working Paper Series: 16599+.
  • 39. Dasgupta P (2009) The Place of Nature in Economic Development. Ideas Website, Accessed 2012 Sep 19. Technical report. URL http://ideas.repec.org/p/ess/wpaper/id2233.html.
  • 40. Dasgupta P (2010) Poverty traps: Exploring the complexity of causation. International Food Policy Research Institute (IFPRI) 2010 Vision briefs BB07 Special Edition.
  • 41. Tacchella A, Cristelli M, Caldarelli G, Gabrielli A, Pietronero L (2012) Economic complexity: a new metric for countries’ competitiveness and products’ complexity. Submitted to Journal of Economic Dynamics and Control.

Articles from PLoS ONE are provided here courtesy of PLOS
