Phys Rev E. 2018 Mar 9;97(3):032303. doi: 10.1103/PhysRevE.97.032303

Tuning the overlap and the cross-layer correlations in two-layer networks: Application to a susceptible-infectious-recovered model with awareness dissemination

David Juher 1,*, Joan Saldaña 2
PMCID: PMC7217526  PMID: 29776021

Abstract

We study the properties of the potential overlap between two networks $A,B$ sharing the same set of $N$ nodes (a two-layer network) whose respective degree distributions $p_A(k),p_B(k)$ are given. Defining the overlap coefficient $\alpha$ as the Jaccard index, we prove that $\alpha$ is very close to 0 when $A$ and $B$ are random and independently generated. We derive an upper bound $\alpha_M$ for the maximum overlap coefficient permitted in terms of $p_A(k)$, $p_B(k)$, and $N$. Then we present an algorithm based on cross rewiring of links to obtain a two-layer network with any prescribed $\alpha$ inside the range $(0,\alpha_M)$. A refined version of the algorithm allows us to minimize the cross-layer correlations that unavoidably appear for values of $\alpha$ beyond a critical overlap $\alpha_c<\alpha_M$. Finally, we present a very simple example of a susceptible-infectious-recovered epidemic model with information dissemination and use the algorithms to determine the impact of the overlap on the final outbreak size predicted by the model.

I. INTRODUCTION

Some contagious processes interact with each other during their propagation, which can occur either through the same route of transmission or through routes that share the same set of nodes but use different types of connections. In the second case, the description of the spread uses the concept of multilayer or multiplex network, namely, a set of nodes (individuals, computers, etc.) connected by qualitatively different types of links corresponding to possible relationships among them (acquaintanceship, friendship, physical contact, social networks, etc.), each layer defined by a type of connection. Competitive viruses spreading simultaneously through different routes of transmission over the same host population, or the spread of a pathogen and awareness during an epidemic episode are examples of processes that are better described by means of multilayer networks [1].

In recent years the mathematical formulation of multiplex networks has been developed, together with that of more general interconnected networks for which the set of nodes need not be the same at each layer [2–4]. Moreover, recent results show the importance of the interrelation between different layers in determining the fate of competitive epidemic processes [1,5]. In other cases, however, the importance of such an interrelation is not so evident from the analytical results on the epidemic threshold [6,7], or even seems not to be relevant at all [8].

Only a few papers dealing with competing epidemics over multilayer networks focus on the impact of layer overlap on the epidemic dynamics [5,9,10]. In [5], the authors consider a sequential propagation of two epidemics using distinct routes of transmission over a network consisting of two partly overlapped layers. Using bond percolation, they determine the success of a second epidemic through that part of its route of transmission whose nodes have not been infected by the first epidemic. In [10], the authors develop an analytical approach to deal with the simultaneous spread of two interacting viral agents on two-layer networks. That work also considers the respective effects on the epidemic dynamics of the overlap and of the correlation between the degrees of nodes in each layer.

Here the overlap $\alpha$ between two (labeled) networks $A$ and $B$ of $N$ nodes is defined as the fraction of links of the union network that are common links of $A$ and $B$ or, equivalently, the probability that a randomly chosen link of the network $A\cup B$ is simultaneously a link of both $A$ and $B$. In fact $\alpha$ is the Jaccard index, a statistic used for comparing the similarity of two sample sets, as defined in [11]. To illustrate that this simple statistical parameter can play a critical role in the qualitative response of a two-layer network model, in Sec. VIII we present a mean-field model for the spread of an infectious agent on one layer (contact layer). The model implicitly assumes that information about the infection status of the nodes is disseminated on a second layer (notification layer), which causes an increase in awareness and the adoption of preventive behaviors. As an interesting feature, the overlap coefficient $\alpha$ between the networks embedding the respective routes of transmission is a parameter of the model. This allows us to derive a simple relationship between $\alpha$ and the epidemic threshold. To validate this (or any) model by simulation, a systematic procedure to generate pairs of networks of given size and degree distributions with a prescribed value of $\alpha$ is a useful tool. We stress that this is the main focus of the paper, and that the model in Sec. VIII is just a simple example to illustrate the convenience of having such tools.

Our approach is based on the study of the potential overlap between two networks whose (finite, empirical) degree distributions are fixed in advance. More precisely, in Secs. III and IV we estimate the minimum and maximum values (call them $\alpha_m$ and $\alpha_M$) for the overlap coefficient between two networks of size $N$ and degree distributions $p_A(k)$ and $p_B(k)$. In particular, we show that $\alpha_m\approx 0$. The study of the maximum $\alpha_M$ is based on the computation of the potential overlap $O_{\mathrm{pot}}(D_A,D_B)$ between two fixed degree sequences $D_A,D_B$ following $p_A(k)$, $p_B(k)$. In Sec. V we present the CR algorithm, which takes as input any two degree sequences $D_A,D_B$ and a desired overlap $\alpha\in(0,O_{\mathrm{pot}}(D_A,D_B))$ and generates a pair of networks with degree sequences $D_A,D_B$ and overlap coefficient close to $\alpha$. When $D_A,D_B$ are randomly sampled from $p_A(k),p_B(k)$, the potential overlap $O_{\mathrm{pot}}(D_A,D_B)$ is called the critical overlap. In Sec. VI we show that the CR algorithm starting with two random sequences succeeds in constructing pairs of networks having any overlap below the critical one and exhibiting some desirable statistical properties, specifically the lack of in-layer and cross-layer degree-degree correlations. Of course the critical overlap belongs to the interval $(\alpha_m,\alpha_M)$, and is higher than intuition suggests. In Sec. VII we show that for values between the critical overlap and $\alpha_M$ there is an unavoidable direct relationship between overlap and cross-layer degree-degree correlation, and we propose a refined version of the cross-rewiring (CR) algorithm that tries to reach values beyond the critical overlap while keeping the cross-layer degree-degree correlation as small as possible.

With this collection of algorithms, we have a tool to test the analytical predictions relating overlap and epidemic thresholds. In the few previous works dealing with interacting epidemics on overlay networks [5,9,10], the two-layer network over which multiple pathogens spread was characterized by the probability $\rho(k_A,k_B,k_c)$ that a randomly selected node belongs to $k_A$ links unique to layer $A$, $k_B$ links unique to layer $B$, and $k_c$ common links [note that, using this language, what we are assuming here as a natural requirement is that the marginal distributions $p_A(k_A)$ and $p_B(k_B)$, which can be recovered as $p_A(k_A)=\sum_{k_B,k_c}\rho(k_A-k_c,k_B-k_c,k_c)$ and $p_B(k_B)=\sum_{k_A,k_c}\rho(k_A-k_c,k_B-k_c,k_c)$, are given]. Those papers put the main focus on the influence of the overlap and the degree correlations on the epidemic dynamics predicted by the model, rather than on the algorithms used to construct the two-layer network. So, the simulations to test the validity of the predictions were performed over particularly simple cases (rich enough, nevertheless, to extract valid conclusions). As an example, to test the model response to an arbitrary overlap $\alpha$, the authors perform the simulations in the simplest setting $p_A(k)=p_B(k)\equiv p(k)$, which obviously admits any overlap coefficient from 0 to 1, take

$$\rho(k_A,k_B,k_c)=p(k_A+k_c)\,\delta_{k_A,k_B}\binom{k_A+k_c}{k_c}\alpha^{k_c}(1-\alpha)^{k_A}$$

(attach independently $k_A+k_c=k_B+k_c$ links to a node, with a probability $\alpha$ for each link to belong to both networks), and execute a “configuration model”-like algorithm to connect pairs of stubs sampled from $\rho$ with the obvious restrictions. As another example, the respective effects of overlap and degree correlations are isolated by considering again simple (and extremal) cases: random overlap and no degree correlation; random overlap and full degree correlation; and full overlap.

In a more general setting ($p_A\neq p_B$), it is not straightforward to extend the configuration model algorithm to get prescribed intermediate overlaps and/or degree correlations. In contrast, the algorithm we present here can be used to generate any permitted value of both parameters in the range forced by the marginal degree distributions.

II. TERMINOLOGY AND STANDING NOTATION

Throughout this paper, the nodes of any network will be labeled with the natural numbers $\{1,2,\ldots,N\}$. The cardinality of a finite set $X$ will be denoted by $|X|$. Let $V=\{1,2,\ldots,N\}$ for some $N\in\mathbb{N}$. Let $E$ and $E'$ be two subsets of $\{\{i,j\}: i\neq j \text{ and } i,j\in V\}$. Let $G$ and $G'$ be the undirected networks having $V$ as the set of nodes and $E$ and $E'$ as the respective sets of links. The union network $G\cup G'$ is the undirected network whose sets of nodes and links are $V$ and $E\cup E'$ respectively. By definition, we will say that $G$ and $G'$ are different from each other if and only if $E\neq E'$. In particular, if we have a network $H$ and we simply permute the labels of the nodes of $H$, then we obtain a network that is in general different from (but isomorphic to) $H$. Observe that the union operation is not a topological invariant: the union of two networks does not depend only on their shapes but also on the way their nodes are labeled. The overlap between $G$ and $G'$ is defined as the fraction

$$O(G,G'):=\frac{|E\cap E'|}{|E\cup E'|}=\frac{|E\cap E'|}{|E|+|E'|-|E\cap E'|},$$

which can be thought of as the probability that a randomly chosen link of $G\cup G'$ is simultaneously a link of both $G$ and $G'$.
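The definition of the overlap translates directly into set operations. The following is a minimal sketch, assuming each layer is given as a list of node pairs; the function name `overlap` and the two small example layers are illustrative and not part of the paper.

```python
def overlap(links_a, links_b):
    """Jaccard overlap |E ∩ E'| / |E ∪ E'| of two undirected link lists."""
    # frozenset makes {i, j} and {j, i} the same undirected link.
    e_a = {frozenset(link) for link in links_a}
    e_b = {frozenset(link) for link in links_b}
    union = e_a | e_b
    if not union:
        raise ValueError("both networks are empty; the overlap is undefined")
    return len(e_a & e_b) / len(union)


# Two layers on the nodes {1, 2, 3, 4} sharing only the link {1, 2}.
layer_a = [(1, 2), (2, 3), (3, 4)]
layer_b = [(2, 1), (1, 3), (2, 4)]
print(overlap(layer_a, layer_b))  # 1 common link out of 5 distinct links
```

Because the links are compared by node labels, relabeling the nodes of one layer generally changes the result, which is the sense in which the overlap is not a topological invariant.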

A degree set of cardinality $N$ is a multiset (i.e., multiple instances of each element are allowed) of $N$ integers that is realizable as the set of degrees of a network. That is, there exist a labeling $\{k_1,k_2,\ldots,k_N\}$ of the elements of the set and a network $G$ of $N$ nodes such that $k_i$ is the degree of node $i$. Equivalently, $\sum_i k_i$ is even and the integers $k_i$ satisfy the well-known Havel-Hakimi condition [12,13] (a technical recursive condition that is irrelevant to our purposes). As usual, the ordered list $D=(k_1,k_2,\ldots,k_N)$ will be called the degree sequence of $G$. Note that rearranging the elements of $D$ by means of a permutation $\sigma$ corresponds to relabeling the nodes of $G$ to get another network $G'$ isomorphic to $G$ with degree sequence $\sigma(D)$ (and the same degree set as $G$).

A probability distribution $p(k)$ with bounded support will be called empirical (of $N$ nodes) if it is realizable as the degree distribution of a network of $N$ nodes. That is, there exists a network $G$ of $N$ nodes such that, if $\{k_1,k_2,\ldots,k_N\}$ is the degree set of $G$, then $N_k:=|\{i:k_i=k\}|=p(k)N$. Observe that giving an empirical distribution $p(k)$ of $N$ nodes is completely equivalent to specifying a degree set $K$ of cardinality $N$. For an ordered sequence $D$ we will write $D\sim p(k)$ to indicate that $D$ is a particular arrangement of the elements of $K$. For a pair of ordered sequences $(D,D')$ and two empirical distributions $p(k),p'(k)$, we will write $(D,D')\sim p(k)\times p'(k)$ to indicate that $D\sim p(k)$ and $D'\sim p'(k)$.

We use the term empirical for a degree distribution to distinguish it from a (theoretical, not necessarily with bounded support) probability distribution $p(k)$. In this case, for any $N\in\mathbb{N}$, one can use several standard algorithms (see Sec. III) to construct a network $G_N$ of $N$ nodes whose empirical degree distribution $p_N(k)$ is close to $p(k)$, in the sense that, as $N$ grows, $p_N(k)$ converges in probability to $p(k)$ ([14], Theorem 2.1).

Assume that we are given two empirical degree distributions $p(k),p'(k)$ of $N$ nodes, with corresponding degree sets $K$ and $K'$. Let $n$ and $n'$ be the total number of pairwise different networks having respectively $K$ and $K'$ as degree sets, each one numbered with an integer in the range $[1,n]$ (respectively, $[1,n']$). Then we can consider a function of two variables $O(x,y)$ on the grid of all pairs $(x,y)$ of integers in $[1,n]\times[1,n']$, that gives the value of the overlap of the networks numbered as $x$ and $y$. Since the grid is finite, the function $O(x,y)$ has a global minimum and maximum. These extremal values will be denoted by $\underline{O}(N,p,p')$ and $\overline{O}(N,p,p')$, or simply by $\underline{O}$ and $\overline{O}$ when no confusion seems possible.

III. EXPECTED OVERLAP BETWEEN TWO RANDOM INDEPENDENT LAYERS

Assume that we are given two empirical degree distributions $p(k),p'(k)$ of $N$ nodes. In this section we prove that the expected overlap between two random networks of $N$ nodes and degree distributions $p(k)$ and $p'(k)$ (generated, for instance, via the standard configuration model algorithm [15–17]) is very close to zero when $N$ is big enough, thus showing that $\underline{O}\approx 0$. Estimates for $\overline{O}$ are the subject of Sec. IV.

Let us recall the configuration model algorithm to generate a random network with a given degree sequence $(k_1,k_2,\ldots,k_N)$. Take a vector $X$ of length $2L:=\sum_i k_i$ containing $k_1$ times the integer 1 in the first $k_1$ entries, $k_2$ times the integer 2 in the following $k_2$ entries, etc. Each entry $v$ of $X$ represents a single stub (or semilink) attached at the node labeled as $v$. Then, take a random permutation of the entries of $X$ to get a new array $Y$. Finally, read the contents of $Y$ in order, interpreting each pair of consecutive entries $v,w$ as a link between the nodes $v$ and $w$. As an example, take $N=6$ and consider the degree distribution $p(k)$ defined by $p(1)=p(3)=1/6$, $p(2)=4/6$, and $p(k)=0$ for $k\neq 1,2,3$. The corresponding degree set is $\{1,2,2,2,2,3\}$. Take, for instance, $(1,2,2,2,2,3)$ as degree sequence. Then, $X=(1,2,2,3,3,4,4,5,5,6,6,6)$. Now we permute $X$ at random, obtaining $Y=(3,4,5,1,6,3,6,2,4,5,2,6)$. The links of the obtained network are $\{3,4\}$, $\{5,1\}$, $\{6,3\}$, $\{6,2\}$, $\{4,5\}$, $\{2,6\}$. Observe that the link $\{6,2\}$ appears twice. In general, the configuration model algorithm gives multigraphs rather than graphs. It is well known, however, that the fraction of self-loops and multilinks over the total number of links goes to 0 as $N\to\infty$ when the variance of the degree distribution is bounded [18]. See [14] for alternative implementations of the configuration model to get simple graphs.
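The stub-matching procedure just described can be sketched in a few lines. This is an illustrative implementation, not the authors' code; the names `pair_stubs` and `configuration_model` are assumptions, and self-loops and multilinks are deliberately left in the output, as in the text.

```python
import random


def pair_stubs(y):
    """Read the array Y in order; each pair of consecutive entries is one link."""
    return [(y[n], y[n + 1]) for n in range(0, len(y), 2)]


def configuration_model(degree_sequence, rng=None):
    """Return the links of a random multigraph with the given degree sequence."""
    rng = rng or random.Random()
    # X: the label v (nodes are labeled 1..N) repeated k_v times.
    x = [v for v, k in enumerate(degree_sequence, start=1) for _ in range(k)]
    if len(x) % 2:
        raise ValueError("the sum of the degrees must be even")
    rng.shuffle(x)  # Y: a random permutation of X
    return pair_stubs(x)


# The arrangement Y used in the example of the text.
print(pair_stubs([3, 4, 5, 1, 6, 3, 6, 2, 4, 5, 2, 6]))
# A fresh random realization of the same degree sequence.
print(configuration_model([1, 2, 2, 2, 2, 3], random.Random(0)))
```

The first call reproduces the six links listed above, including the repeated link between nodes 6 and 2.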

It seems natural to expect that the overlap between two networks of respective degree distributions $p(k),p'(k)$ and size $N$ generated via the configuration model algorithm is very small. When the respective mean degrees are small with respect to the total size $N$ this turns out to be true. To prove this fact, we need to estimate the probability that two given nodes are connected in a random network generated via the configuration model algorithm. So, let $G$ be a network of $N$ nodes, $L$ links, and degree distribution $p(k)$. Assume that $G$ has been obtained by means of the configuration model algorithm starting with a degree sequence $(k_1,k_2,\ldots,k_N)$. Take at random any pair $\{i,j\}$ of nodes with $k_i\leq k_j$. Next we estimate the probability $p_{ij}$ that the network $G$ contains the link $\{i,j\}$. This probability is given by the quotient $a/b$, where $b$ is the total number of rearrangements $Y$ of the vector $X$ (here we are using the notation introduced in the definition of the configuration model) and $a$ is the number of such rearrangements having the entries $i,j$ (or $j,i$) in at least one pair of consecutive places $Y_n,Y_{n+1}$ with $n=1,3,5,\ldots,2L-1$. We have that

$$b=\frac{(2L)!}{k_1!\,k_2!\cdots k_N!}.\tag{1}$$

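Let us compute $a$. For $l=1,2,\ldots,L$, let $\mathcal{Y}_l$ be the set of rearrangements $Y$ containing the entries $i,j$ (or $j,i$) in places $Y_{2l-1},Y_{2l}$. Then, $a=|\mathcal{Y}_1\cup\mathcal{Y}_2\cup\cdots\cup\mathcal{Y}_L|$. By the inclusion-exclusion principle, $a=a_1-a_2+\cdots+(-1)^{k_i-1}a_{k_i}$, where $a_l$ is the sum of the cardinalities of all intersections of $l$ sets in $\mathcal{Y}_1,\mathcal{Y}_2,\ldots,\mathcal{Y}_L$. A simple combinatorial argument yields that, for $l\leq k_i$,

$$a_l=\binom{L}{l}\,2^l\,\frac{(2L-2l)!}{k_1!\cdots k_{i-1}!\,(k_i-l)!\,k_{i+1}!\cdots k_{j-1}!\,(k_j-l)!\,k_{j+1}!\cdots k_N!},$$

while $a_l=0$ for $k_i<l\leq L$. Using the previous expression and the inclusion-exclusion principle we get that

$$a=\frac{\displaystyle\sum_{l=1}^{k_i}(-1)^{l-1}\binom{L}{l}\,2^l\,\frac{(2L-2l)!}{(k_i-l)!\,(k_j-l)!}}{k_1!\,k_2!\cdots k_{i-1}!\,k_{i+1}!\cdots k_{j-1}!\,k_{j+1}!\cdots k_N!}.$$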

Taking it all into account, we get that the probability that G contains the link {i,j} is

$$p_{ij}=\frac{L!\,k_i!\,k_j!}{(2L)!}\sum_{l=1}^{k_i}(-1)^{l-1}\frac{2^l\,(2L-2l)!}{l!\,(L-l)!\,(k_i-l)!\,(k_j-l)!}.\tag{2}$$

This exact expression is too complex to be used to estimate the expected overlap between two random networks. Instead, if in the previous proof we replace $a$ simply by $a_1$, then it easily follows that

$$p_{ij}\approx\frac{k_i k_j}{2L-1},\tag{3}$$

which is in fact a standard approximation used in the literature for the probability $p_{ij}$ [18,19]. The approximation (3) is good enough only when $k_i$ and $k_j$ are small with respect to $L$, in particular when we consider networks with bounded mean degree and large size $N$, which is the case for most modeling applications. However, in general (3) can significantly differ from the exact formula (2).
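To see how far (3) can drift from (2), both expressions can be evaluated with exact rational arithmetic. The sketch below is only an illustration; the function names `p_exact` and `p_approx` are assumptions, and the small cases were chosen because they can be checked by listing all rearrangements by hand.

```python
from fractions import Fraction
from math import factorial


def p_exact(ki, kj, L):
    """Exact probability (2) that nodes of degrees ki, kj are linked (L links in total)."""
    ki, kj = sorted((ki, kj))  # the formula assumes ki <= kj
    prefactor = Fraction(factorial(L) * factorial(ki) * factorial(kj), factorial(2 * L))
    total = Fraction(0)
    for l in range(1, ki + 1):
        term = Fraction(
            2**l * factorial(2 * L - 2 * l),
            factorial(l) * factorial(L - l) * factorial(ki - l) * factorial(kj - l),
        )
        total += (-1) ** (l - 1) * term
    return prefactor * total


def p_approx(ki, kj, L):
    """Standard approximation (3)."""
    return Fraction(ki * kj, 2 * L - 1)


# Two nodes of degree 2 and nothing else (L = 2): X = (1, 1, 2, 2).
# Four of its six rearrangements contain the link {1, 2}.
print(p_exact(2, 2, 2), p_approx(2, 2, 2))
# Small degrees in a large network: the two values nearly coincide.
print(float(p_exact(3, 4, 5000)), float(p_approx(3, 4, 5000)))
```

In the first case the approximation exceeds 1, so it is not even a probability, whereas in the second the degrees are small with respect to $L$ and the two values are close.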

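Now let $p(k),p'(k)$ be two empirical degree distributions with respective means $\langle k\rangle$ and $\langle k'\rangle$. Let $G,G'$ be two networks of $N$ nodes and degree distributions $p(k)$ and $p'(k)$ generated via the configuration model algorithm starting with degree sequences $(k_1,k_2,\ldots,k_N)$ and $(k'_1,k'_2,\ldots,k'_N)$. Assume that $N$ is big enough with respect to $\langle k\rangle$ and $\langle k'\rangle$ for the approximation (3) to hold. Let $L,L'$ be the number of links of $G$ and $G'$ respectively. Using (3) we can compute the probability $p$ that two different nodes chosen at random are neighbors in $G$:

$$p\approx\frac{1}{2L-1}\sum_{k_i,k_j}k_i\,p(k_i)\,k_j\,p(k_j)=\frac{\langle k\rangle^2}{2L-1}\approx\frac{\langle k\rangle}{N},\tag{4}$$

where in the last expression $\langle k\rangle$ denotes the expected degree of a node and we have used that $\langle k\rangle N=2L$. Now the expected overlap between $G$ and $G'$ can be computed as the probability that two different nodes are connected in both $G$ and $G'$ over the probability that they are connected in $G\cup G'$ which, by virtue of (4) and the independence of the two layers, is

$$\frac{\langle k\rangle\langle k'\rangle/N^2}{1-\left(1-\dfrac{\langle k\rangle}{N}\right)\left(1-\dfrac{\langle k'\rangle}{N}\right)}.$$

In consequence,

$$O(G,G')\approx\frac{\langle k\rangle\langle k'\rangle}{N\left(\langle k\rangle+\langle k'\rangle\right)-\langle k\rangle\langle k'\rangle},\tag{5}$$

telling us that, given $N$ and any two degree distributions $p(k),p'(k)$, the minimum overlap $\underline{O}(N,p,p')$ is very close to 0, at least when $N$ is big with respect to the expected values $\langle k\rangle$ and $\langle k'\rangle$. Of course, for small networks this is not true in general.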
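As a quick numerical illustration of (5), the expression can be evaluated for a network size and mean degrees of the order used later in the paper. The function name `expected_overlap` is an assumption made for this sketch.

```python
def expected_overlap(n, mean_k, mean_k_prime):
    """Approximation (5) for two independent configuration-model layers."""
    product = mean_k * mean_k_prime
    return product / (n * (mean_k + mean_k_prime) - product)


# N = 10000 with mean degrees 20 and 26.
print(expected_overlap(10_000, 20, 26))  # roughly 1.1e-3
```

For fixed mean degrees the value decays like $1/N$, which is why two independently generated layers share almost no links.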

IV. AN UPPER BOUND FOR THE MAXIMUM OVERLAP

We start this section by giving a computable upper bound for $\overline{O}(N,p,p')$ in terms of the size $N$ and the empirical distributions $p(k),p'(k)$. To do so, we first introduce the notion of potential overlap between two fixed degree sequences.

Let $G,G'$ be two networks of $N$ nodes and empirical degree distributions $p(k),p'(k)$, with means $\langle k\rangle$ and $\langle k'\rangle$ and corresponding degree sequences $D=(k_1,k_2,\ldots,k_N)$ and $D'=(k'_1,k'_2,\ldots,k'_N)$, with $\sum_i k_i=\langle k\rangle N=:2L$ and $\sum_i k'_i=\langle k'\rangle N=:2L'$. If $E$ and $E'$ are the sets of links of $G$ and $G'$, then by definition

$$O(G,G')=\frac{|E\cap E'|}{|E\cup E'|}=\frac{|E\cap E'|}{L+L'-|E\cap E'|}=\frac{x}{\left(\langle k\rangle+\langle k'\rangle\right)\frac{N}{2}-x}=:F(x),\tag{6}$$

where $x$ stands for $|E\cap E'|$. Now observe that $F(x)$ is increasing in $x$. In consequence, an upper bound for the overlap is obtained when replacing $x$ by the maximum possible number of links of the intersection network. It is clear that the intersection network cannot have more than $\min\{k_i,k'_i\}$ links attached at node $i$. In consequence, the total number of links of the intersection network is at most

$$\frac{1}{2}\sum_{i=1}^{N}\min\{k_i,k'_i\}.$$

So, we define the potential overlap $O_{\mathrm{pot}}(D,D')$ associated to a pair $(D,D')$ of degree sequences as

$$\frac{\frac{1}{2}\sum_i\min\{k_i,k'_i\}}{\frac{1}{2}\sum_i(k_i+k'_i)-\frac{1}{2}\sum_i\min\{k_i,k'_i\}},\tag{7}$$

which, since $k_i+k'_i=\max\{k_i,k'_i\}+\min\{k_i,k'_i\}$, can be rewritten as

$$O_{\mathrm{pot}}(D,D'):=\sum_{i=1}^{N}\min\{k_i,k'_i\}\Big/\sum_{i=1}^{N}\max\{k_i,k'_i\}.\tag{8}$$

Now observe that

$$\overline{O}\leq\max_{(D,D')\sim p\times p'}\{O_{\mathrm{pot}}(D,D')\}$$

and recall that the set of possible degree sequences associated to $p(k)$ coincides essentially with the set of all permutations of the numbers $k_1,k_2,\ldots,k_N$. Thus, if $(D,D')\sim p(k)\times p'(k)$ and $\sigma,\rho$ are two permutations of order $N$, then $(\sigma(D),\rho(D'))\sim p(k)\times p'(k)$. Moreover, $O_{\mathrm{pot}}(\sigma(D),\sigma(D'))=O_{\mathrm{pot}}(D,D')$. In consequence, without loss of generality we can assume that $D$ is increasingly ordered (that is, $k_i\leq k_j$ if $i<j$). In this case, it is easy to check that if there is a pair of entries $k'_i\geq k'_j$ of $D'$ with $i<j$, then swapping both entries gives a sequence $D''$ that satisfies $O_{\mathrm{pot}}(D,D'')\geq O_{\mathrm{pot}}(D,D')$. So, the maximum in the previous inequality is attained precisely when both $D$ and $D'$ are increasingly ordered. So, we have proved that

$$\overline{O}\leq\frac{\sum_{i=1}^{N}\min\{k_i,k'_i\}}{\sum_{i=1}^{N}\max\{k_i,k'_i\}},\quad\text{whenever } k_1\leq k_2\leq\cdots\leq k_N \text{ and } k'_1\leq k'_2\leq\cdots\leq k'_N.\tag{9}$$

Inequality (9) allows us to design an efficient algorithm to compute an upper bound for the maximum overlap. The algorithm takes as input the empirical distributions $p(k)$ and $p'(k)$, sorts increasingly the elements of the respective degree sets, and finally returns the right-hand side of the inequality in (9). Table I (second row in each box) shows the output of this algorithm for several pairs of empirical distributions, obtained by approximating the corresponding pairs of (theoretical) distributions. Here “SF” stands for a scale-free network with $p(k)=Ck^{-\gamma}$ with $\gamma=3$, minimum degree $m$, cutoff $k_c=mN^{1/2}$, and the normalization constant $C=(\gamma-1)m^{\gamma-1}N/(N-1)$, for which $\langle k\rangle\approx 2m$ [20]. “Exponential” corresponds to $p(k)=(1/m)e^{1-k/m}$ with minimum degree $m$, for which $\langle k\rangle=2m$. “Poisson” corresponds to $p(k)=\lambda^k e^{-\lambda}/k!$ with $\lambda=\langle k\rangle$, and “Regular” stands for a random network for which all nodes have the same degree. In all cases, $N=10000$.
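The algorithm derived from (9) amounts to one sort per degree set followed by the ratio in (8). A minimal sketch, with the hypothetical names `potential_overlap` and `max_overlap_bound` and made-up small degree sets, could read:

```python
def potential_overlap(d, d_prime):
    """Eq. (8): sum of node-wise minima over sum of node-wise maxima."""
    if len(d) != len(d_prime):
        raise ValueError("both degree sequences must have the same length")
    minima = sum(min(k, kp) for k, kp in zip(d, d_prime))
    maxima = sum(max(k, kp) for k, kp in zip(d, d_prime))
    return minima / maxima


def max_overlap_bound(degree_set, degree_set_prime):
    """Right-hand side of (9): potential overlap of the increasingly sorted sequences."""
    return potential_overlap(sorted(degree_set), sorted(degree_set_prime))


d = [1, 2, 2, 2, 2, 3]
d_prime = [3, 1, 2, 2, 2, 2]                 # same degree set, different labeling
print(potential_overlap(d, d_prime))         # 10/14 for this particular arrangement
print(max_overlap_bound(d, d_prime))         # 1.0 once both sequences are sorted
print(max_overlap_bound([20] * 5, [26] * 5)) # 20/26 for two regular layers
```

For two regular layers the arrangement is irrelevant, so the bound reduces to the ratio of the two degrees, about 0.769 for degrees 20 and 26.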

TABLE I.

Critical overlap as defined in Sec. VI (first row of each box) and the upper bound (9) for the maximum overlap permitted (second row of each box) between pairs of empirical distributions. In all cases $N=10000$. For the left column distributions, $\langle k\rangle=20$ while, for the upper ones, $\langle k\rangle=26$.

Distribution (quantity)          Regular   Poisson   SF        Exponential
Regular (critical overlap)       0.7693    0.7508    0.6301    0.6654
Regular (upper bound)            0.7693    0.7508    0.6301    0.6654
Poisson (critical overlap)       0.7552    0.7259    0.5969    0.6392
Poisson (upper bound)            0.7552    0.7709    0.7221    0.7739
SF (critical overlap)            0.5451    0.5365    0.4903    0.5117
SF (upper bound)                 0.5451    0.6000    0.7688    0.7023
Exponential (critical overlap)   0.6330    0.6174    0.5415    0.5683
Exponential (upper bound)        0.6330    0.7077    0.7715    0.7706

V. AN ALGORITHM TO SWEEP THE RANGE OF POTENTIAL OVERLAPS BETWEEN TWO DEGREE SEQUENCES

In this section we design an algorithm that takes any pair of degree sequences $D,D'$ and a value $\alpha$ between 0 and $O_{\mathrm{pot}}(D,D')$ and constructs a pair of networks $G,G'$ with degree sequences $D,D'$ whose overlap is as close as possible to $\alpha$ (values of $\alpha$ very close to $O_{\mathrm{pot}}(D,D')$ are not attainable since $O_{\mathrm{pot}}(D,D')$ is just an upper bound).

Assume that we have generated two random networks $G(0),G'(0)$ of $N$ nodes using the configuration model. In view of (5), $O(G(0),G'(0))\approx 0$. Thus, it seems natural to propose an algorithm that works as follows. At each time step $t\geq 0$, modify the networks $G(t),G'(t)$ slightly, without modifying the degree sequences, by performing a local operation (an operation involving few nodes and/or links) to obtain new networks $G(t+1),G'(t+1)$ in such a way that $O(G(t+1),G'(t+1))$ is slightly larger than $O(G(t),G'(t))$. Repeat until the overlap is close to $\alpha$.

The kind of local operation that we will use in the scheme above is a cross rewiring [21], according to the following definition. Let $G(t),G'(t)$ be two networks of $N$ nodes. A good pair in $G(t)$ with respect to $G'(t)$ is a pair of links $\{a,b\}$, $\{c,d\}$ in $G(t)$ satisfying the following conditions:

(1) $\{a,b\}$ and $\{c,d\}$ are not links in $G'(t)$.

(2) $\{a,c\}$ and $\{b,d\}$ are not links in $G(t)$.

(3) $\{a,c\}$ is a link in $G'(t)$.

Analogously we define a good pair in $G'(t)$ with respect to $G(t)$ by interchanging the roles of $G(t)$ and $G'(t)$ in the previous definition. Given a good pair $\{a,b\}$, $\{c,d\}$ in $G(t)$ with respect to $G'(t)$, the associated cross-rewiring operation consists of replacing the links $\{a,b\}$ and $\{c,d\}$ in $G(t)$ by $\{a,c\}$ and $\{b,d\}$ to get a new network $G(t+1)$. Observe that $G(t)$ and $G(t+1)$ are in general different as nonlabeled networks. However, the degrees of the involved nodes $a,b,c,d$ are not modified after performing the cross rewiring. In consequence, $G(t)$ and $G(t+1)$ have the same degree sequences. On the other hand, set $G'(t+1)=G'(t)$ and let $E(t)$, $E(t+1)$, $E'(t)$, $E'(t+1)$ be respectively the sets of links of $G(t)$, $G(t+1)$, $G'(t)$, $G'(t+1)$. Then, $|E'(t+1)|=|E'(t)|$ and, by the definition of the cross-rewiring operation over a good pair, $|E(t+1)|=|E(t)|$. Moreover, by the definition of a good pair, the removed links are not common links while the new link $\{a,c\}$ is, so either $|E(t+1)\cap E'(t+1)|=|E(t)\cap E'(t)|+1$ if $\{b,d\}$ is not a link in $G'(t)$, or $|E(t+1)\cap E'(t+1)|=|E(t)\cap E'(t)|+2$ otherwise. Then, if we denote $O(G(t),G'(t))$ and $O(G(t+1),G'(t+1))$ by $O(t)$ and $O(t+1)$ respectively, a trivial computation yields that

$$O(t+1)=O(t)+\frac{x\,O(t)^2+2x\,O(t)+x}{L-x-x\,O(t)},\tag{10}$$

where $x\in\{1,2\}$ and $L=|E(t)|+|E'(t)|$. As a consequence of (10), the overlap after performing a cross-rewiring operation in a good pair of links slightly (but strictly) increases.
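For completeness, the computation behind (10) can be spelled out as follows. This is a reconstruction from the definitions above, with $I:=|E(t)\cap E'(t)|$ and $L:=|E(t)|+|E'(t)|$; the symbol $I$ is introduced here only as shorthand and $x\in\{1,2\}$ is the number of common links gained by the rewiring.

```latex
\begin{align*}
O(t) &= \frac{I}{L-I}
  \quad\Longrightarrow\quad
  L-I=\frac{L}{1+O(t)},\\[4pt]
O(t+1)-O(t) &= \frac{I+x}{L-I-x}-\frac{I}{L-I}
  =\frac{xL}{(L-I)(L-I-x)}\\[4pt]
 &=\frac{x\,\bigl(1+O(t)\bigr)^{2}}{L-x\,\bigl(1+O(t)\bigr)}
  =\frac{x\,O(t)^{2}+2x\,O(t)+x}{L-x-x\,O(t)}.
\end{align*}
```

The increment is strictly positive as long as $L-I-x>0$, that is, as long as the two layers do not coincide after the rewiring.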

From now on, let $0\leq\alpha\leq O_{\mathrm{pot}}(D,D')$ be the desired overlap coefficient. In view of what has been said, let us consider the following CR algorithm (standing for “cross rewiring”):

CR algorithm [input: $D,D',\alpha$]

(1) Use the configuration model to get two random networks $G(0),G'(0)$ of size $N$ and degree sequences $D,D'$. The overlap between $G(0)$ and $G'(0)$ is close to 0.

At each time step $t\geq 0$:

(2) Choose at random (if it exists) a good pair of links in $G(t)$ with respect to $G'(t)$. Perform a cross-rewiring operation in $G(t)$ using such a pair, obtaining a new network $G(t+1)$. Set $G'(t+1):=G'(t)$. Then, by (10), $O(G(t+1),G'(t+1))>O(G(t),G'(t))$. If $O(G(t+1),G'(t+1))\geq\alpha$, set $G:=G(t+1)$, $G':=G'(t+1)$ and stop.

(3) Repeat the previous step interchanging the roles of $G(t)$ and $G'(t)$. Proceed to the next time step.
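The steps above can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes both layers are simple graphs stored as dictionaries of neighbor sets, it starts from two small circulant graphs instead of configuration-model networks, and its search for a good pair is exhaustive but not uniform over all good pairs. All function names are hypothetical.

```python
import random


def ring(n, step):
    """Adjacency sets of the circulant graph linking i and i+step (mod n), nodes 1..n."""
    adj = {v: set() for v in range(1, n + 1)}
    for v in range(1, n + 1):
        w = (v - 1 + step) % n + 1
        adj[v].add(w)
        adj[w].add(v)
    return adj


def link_set(adj):
    return {(u, v) for u in adj for v in adj[u] if u < v}


def overlap(adj, adj_p):
    e, e_p = link_set(adj), link_set(adj_p)
    return len(e & e_p) / len(e | e_p)


def find_good_pair(adj, adj_p, rng):
    """Return (a, b, c, d) such that {a,b}, {c,d} is a good pair in adj w.r.t. adj_p, or None."""
    # Candidate new links {a,c}: present in G' (condition 3) and absent from G (condition 2).
    targets = [(a, c) for a in adj_p for c in adj_p[a] if c not in adj[a]]
    rng.shuffle(targets)
    for a, c in targets:
        # Links {a,b} and {c,d} of G that are not links of G' (condition 1).
        bs = [b for b in adj[a] if b not in adj_p[a]]
        ds = [d for d in adj[c] if d not in adj_p[c]]
        rng.shuffle(bs)
        rng.shuffle(ds)
        for b in bs:
            for d in ds:
                if b != d and d not in adj[b]:  # {b,d} must not be a link of G (condition 2)
                    return a, b, c, d
    return None


def rewire(adj, a, b, c, d):
    """Replace the links {a,b}, {c,d} by {a,c}, {b,d}; all degrees are preserved."""
    adj[a].remove(b); adj[b].remove(a)
    adj[c].remove(d); adj[d].remove(c)
    adj[a].add(c); adj[c].add(a)
    adj[b].add(d); adj[d].add(b)


def cr_algorithm(adj, adj_p, alpha, rng=None):
    """Alternate cross rewirings in both layers (in place) until the overlap reaches alpha
    or no good pair is left in either layer."""
    rng = rng or random.Random()
    while overlap(adj, adj_p) < alpha:
        progressed = False
        for g, h in ((adj, adj_p), (adj_p, adj)):  # steps (2) and (3)
            pair = find_good_pair(g, h, rng)
            if pair is not None:
                rewire(g, *pair)
                progressed = True
                if overlap(adj, adj_p) >= alpha:
                    return adj, adj_p
        if not progressed:
            break
    return adj, adj_p


g, g_prime = ring(10, 1), ring(10, 3)  # two 2-regular layers with no common link
cr_algorithm(g, g_prime, alpha=0.5, rng=random.Random(1))
print(overlap(g, g_prime))
```

Recomputing the overlap from scratch after every rewiring keeps the sketch short; for networks of the size used in the paper one would instead update the number of common links incrementally, as (10) suggests.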

It is clear that after a finite number $t_0$ of steps the algorithm will stop, either because no good pairs are found or because the overlap between $G(t_0)$ and $G'(t_0)$ has reached the value $\alpha$. In any case, the output of the algorithm is the pair of networks $G(t_0),G'(t_0)$. A natural question is whether in general the algorithm may halt because no good pairs are found before having reached a value of the overlap close to $\alpha$, especially when $\alpha$ is close to $O_{\mathrm{pot}}(D,D')$ [we stress the fact that $O_{\mathrm{pot}}(D,D')$ is just an upper bound, far from being realizable in general]. So, it makes sense to remove the stop condition given by the overlap and let the algorithm run until no more good pairs are found. In Table II we show the maximum overlap obtained in this way for several pairs of distributions, together with the upper bound $O_{\mathrm{pot}}(D,D')$. In all cases, the input degree sequences $D,D'$ are random arrangements of the degree sets associated to the respective distributions. The obtained overlap is relatively close to the upper bound, suggesting that indeed the CR algorithm is able to sweep the entire range of permitted overlaps between 0 and $O_{\mathrm{pot}}(D,D')$.

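TABLE II.

Maximum overlap (first row of each box) generated by the CR algorithm starting with two random arrangements $D,D'$ of the corresponding degree sets vs the upper bound $O_{\mathrm{pot}}(D,D')$ (second row of each box). In all cases $N=10000$, $\langle k\rangle=10$. Only the upper triangle of the table is reported; a dash marks an entry not given.

Distribution (quantity)          Regular   Poisson   SF        Exponential
Regular (CR maximum)             1         0.73811   0.56201   0.63731
Regular (upper bound)            1         0.77612   0.61129   0.67817
Poisson (CR maximum)             -         0.63881   0.49242   0.56393
Poisson (upper bound)            -         0.69605   0.56536   0.62555
SF (CR maximum)                  -         -         0.44673   0.47769
SF (upper bound)                 -         -         0.51443   0.53822
Exponential (CR maximum)         -         -         -         0.53426
Exponential (upper bound)        -         -         -         0.58936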

VI. BELOW THE CRITICAL OVERLAP: TWO DESIRABLE STATISTICAL FEATURES OF THE NETWORKS GENERATED BY THE CR ALGORITHM

Let us introduce another relevant quantity that we will call critical overlap. It is defined as the potential overlap between two random sequences $(D_{\mathrm{rand}},D'_{\mathrm{rand}})\sim p(k)\times p'(k)$ where $p(k),p'(k)$ are empirical distributions of $N$ nodes:

$$O_{\mathrm{cr}}(N,p,p'):=O_{\mathrm{pot}}(D_{\mathrm{rand}},D'_{\mathrm{rand}}),$$

which for $N$ big enough and pairs of distributions with bounded variance can be considered as essentially independent of the particular sampled sequences. Contrary to initial intuition, the critical overlap is not close to 0 but lies relatively close to $\overline{O}$ (see Table I). By running the CR algorithm with sequences $D_{\mathrm{rand}}$, $D'_{\mathrm{rand}}$ one can get any overlap $\alpha$ between 0 and (values close to) $O_{\mathrm{cr}}(N,p,p')$. As we will see, proceeding in this way the obtained two-layer network exhibits some desirable statistical features (lack of in-layer and cross-layer correlations). For higher values of $\alpha$, it is unavoidable to introduce correlations and deviate from what happens in a “configuration model” context (Sec. VII).
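The critical overlap of a pair of degree sets can be estimated by shuffling both sets independently and evaluating (8). The sketch below uses hypothetical names and two small made-up degree sets; for such small sets the value still fluctuates noticeably from one shuffle to the next, in contrast with the large-$N$ regime discussed in the text.

```python
import random


def potential_overlap(d, d_prime):
    """Eq. (8): sum of node-wise minima over sum of node-wise maxima."""
    minima = sum(min(k, kp) for k, kp in zip(d, d_prime))
    maxima = sum(max(k, kp) for k, kp in zip(d, d_prime))
    return minima / maxima


def critical_overlap(degree_set, degree_set_prime, rng=None):
    """Potential overlap of two independent random arrangements of the degree sets."""
    rng = rng or random.Random()
    d, d_prime = list(degree_set), list(degree_set_prime)
    rng.shuffle(d)
    rng.shuffle(d_prime)
    return potential_overlap(d, d_prime)


k_set = [1, 1, 2, 2, 3, 3, 4, 6]        # illustrative degree sets, not from the paper
k_set_prime = [2, 2, 2, 3, 3, 3, 4, 5]
bound = potential_overlap(sorted(k_set), sorted(k_set_prime))  # right-hand side of (9)
print(critical_overlap(k_set, k_set_prime, random.Random(0)), bound)
```

Whatever the shuffle, the value returned by `critical_overlap` cannot exceed `bound`, since sorting both sequences maximizes the sum of node-wise minima.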

A. Lack of in-layer degree-degree correlations

The lack of degree-degree correlations inside each layer is often a crucial requirement in the derivation of the equations governing mean-field multilayer models. In particular, this will be a basic assumption in the derivation of system (16) and (17) for the susceptible-infectious-recovered (SIR) model proposed in Sec. VIII. It is reasonable to expect that each network in a pair created via the CR algorithm with random initial sequences is uncorrelated, since:

(1) The networks $G(0),G'(0)$ are randomly generated via the configuration model algorithm, which is known to produce uncorrelated networks.

(2) A cross rewiring performed over a good pair of links $\{a,b\}$, $\{c,d\}$ increases (decreases) the global degree-degree correlation if the new links connect the two nodes with the smallest degrees and the two nodes with the largest degrees (respectively, if one of the new links connects the node with the largest degree to the node with the lowest degree). But the rewiring criterion in the CR algorithm is intended to increase the overlap coefficient and has nothing to do with the degrees of the four involved nodes. So, some reconnections will increase the global degree-degree correlation and some will decrease it, and one expects these effects essentially to balance out.

To support this claim, we show in Table III the standard Pearson coefficient $r$ for each layer, computed from the two random variables defined by the degrees of the nodes at both ends of randomly chosen links [22]. Values of $r$ close to $-1$ (respectively 1) correspond to disassortative (respectively assortative) networks, while values close to 0 correspond to uncorrelated networks. As in Table II, the CR algorithm was executed taking as input two random arrangements of the corresponding degree sets.
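The Pearson coefficient over link ends can be computed directly from a list of links. The following sketch assumes a simple undirected network and counts every link in both orientations so that the two random variables have the same distribution; the function name `degree_correlation` is an assumption.

```python
def degree_correlation(links):
    """Pearson coefficient of the degrees found at the two ends of the links."""
    degree = {}
    for u, v in links:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Each undirected link contributes the ordered pairs (k_u, k_v) and (k_v, k_u).
    xs, ys = [], []
    for u, v in links:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    mean = sum(xs) / len(xs)  # xs and ys have the same mean and variance
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    if var == 0:
        raise ValueError("r is undefined when all link ends have the same degree")
    return cov / var


star = [(1, 2), (1, 3), (1, 4)]  # the hub is linked only to leaves
path = [(1, 2), (2, 3), (3, 4)]
print(degree_correlation(star))  # -1: perfectly disassortative
print(degree_correlation(path))  # -1/2
```

A star is the extreme disassortative case, since every link joins the highest-degree node to a lowest-degree node.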

TABLE III.

Pearson coefficient to measure the degree-degree correlations in each layer for several pairs of networks obtained from the CR algorithm with prescribed overlap $\alpha=0.15,0.3,0.45$. Each row corresponds to one pair of networks. In all cases, $N=10000$ and $\langle k\rangle=10$.

First layer    α=0.15    α=0.3     α=0.45    Second layer   α=0.15    α=0.3     α=0.45
Poisson        0.02288   0.02474   0.05429   Poisson        0.01404   0.03758   0.07338
SF             0.00673   0.04774   0.12942   Poisson        0.01382   0.04392   0.05624
SF             0.00419   0.03207   0.07498   SF             0.00830   0.03033   0.07882
Exponential    0.02711   0.07771   0.13175   SF             0.01586   0.04719   0.07909
Poisson        0.01888   0.03588   0.05401   Exponential    0.02210   0.07099   0.12841
Exponential    0.03290   0.07054   0.09741   Exponential    0.05203   0.07887   0.11722

B. Lack of cross-layer degree-degree correlations

The cross-layer degree-degree correlation $\tau$ is defined as the correlation of the respective degrees $k_i$ and $k'_i$ of the same node $i$ in the two layers. In Sec. VII we will show precisely how to measure it. We note that the epidemic model proposed in Sec. VIII will be simple enough to be independent of this sort of correlation, but this may not be the case for more sophisticated models, so the question of obtaining a given overlap while controlling $\tau$ makes sense. Observe that the cross-layer degree-degree correlation between two networks $G,G'$ depends only on the respective degree sequences, not on the particular links joining the nodes in $G$ and $G'$. On the other hand, $\tau\approx 0$ for two independent random sequences $D_{\mathrm{rand}},D'_{\mathrm{rand}}$. Since during the execution of the CR algorithm the respective degree sequences are not modified, the lack of cross-layer degree-degree correlations follows when using the CR algorithm starting with two independent random arrangements of the degree sets of $p(k),p'(k)$.

VII. ABOVE THE CRITICAL OVERLAP: ACCOUNTING FOR CROSS-LAYER DEGREE-DEGREE CORRELATIONS

In view of the previous sections, there is a natural algorithm that allows us to get any prescribed overlap $0\leq\alpha\leq\overline{O}(N,p,p')$: arrange the degree sets of $p(k)$ and $p'(k)$ to get degree sequences $D,D'$ increasingly ordered. According to (9), $\overline{O}(N,p,p')\leq O_{\mathrm{pot}}(D,D')$. Then, run the CR algorithm taking $D,D'$ and $\alpha$ as input. This algorithm generates a pair of networks with maximum cross-layer degree-degree correlation. Indeed, nodes 1 and $N$ have respectively the smallest and the largest degree in both layers, and the intermediate nodes have the same degree rank.

As we will see, there is an unavoidable relationship between high values of the overlap and the cross-layer degree-degree correlation, but the question arises whether it is possible to get a value of the overlap close to the maximum while controlling the cross-layer correlation to some extent.

Given two degree sequences $D=(k_1,k_2,\dots,k_N)$ and $D'=(k'_1,k'_2,\dots,k'_N)$, it is natural to measure the cross-layer degree-degree correlations by using Kendall's $\tau$-b coefficient [23]:

$$\tau(D,D') := \frac{N_c-N_d}{\sqrt{(N_0-N_1)(N_0-N_2)}}.$$

Here $N_c$ is the number of concordant pairs, $N_d$ is the number of discordant pairs, $N_0=N(N-1)/2$, $N_1=\sum_i t_i(t_i-1)/2$, and $N_2=\sum_j t'_j(t'_j-1)/2$, where $t_i$ is the number of tied values in the $i$th group of ties for $D$ (analogously for $t'_j$ and $D'$). A pair of indices $i\ne j$ is said to be concordant if $(k_i-k_j)(k'_i-k'_j)>0$, discordant if $(k_i-k_j)(k'_i-k'_j)<0$, or tied if $(k_i-k_j)(k'_i-k'_j)=0$.
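As an illustration of the definition, the following sketch counts concordant and discordant pairs directly. It is a quadratic-time reading of the formula, not an efficient implementation, and the function name `kendall_tau_b` is chosen here for convenience.

```python
from collections import Counter
from math import sqrt

def kendall_tau_b(d, d_prime):
    """Kendall's tau-b of two equally long degree sequences, by direct pair counting."""
    n = len(d)
    n_c = n_d = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (d[i] - d[j]) * (d_prime[i] - d_prime[j])
            if sign > 0:
                n_c += 1      # concordant pair
            elif sign < 0:
                n_d += 1      # discordant pair (tied pairs contribute to neither)
    n_0 = n * (n - 1) // 2
    n_1 = sum(t * (t - 1) // 2 for t in Counter(d).values())        # ties in D
    n_2 = sum(t * (t - 1) // 2 for t in Counter(d_prime).values())  # ties in D'
    return (n_c - n_d) / sqrt((n_0 - n_1) * (n_0 - n_2))

print(kendall_tau_b([1, 2, 2, 3], [1, 2, 3, 3]))
```

The coefficient is undefined (division by zero) when all entries of one sequence are equal, as happens for a regular layer.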

It is well known that if the agreement (respectively, disagreement) between the two rankings is perfect, then $\tau(D,D')=1$ [respectively, $\tau(D,D')=-1$], while if $D$ and $D'$ are independent (lack of cross-layer degree-degree correlation) then $\tau(D,D')$ is expected to be close to 0. Note also that if $\sigma$ is any permutation, $\tau(\sigma(D),\sigma(D'))=\tau(D,D')$. So, in what follows we will assume without loss of generality that $D$ is increasingly ordered:

$$k_1\le k_2\le\cdots\le k_N.$$

The cross-layer degree-degree correlation between two networks $G, G'$ depends only on the respective degree sequences $D, D'$, not on the particular links joining the nodes in $G$ and $G'$. Considering a permutation $\sigma$ of the elements of $D'$ corresponds to relabeling the nodes of $G'$ to get a network $G''$ isomorphic (so, equally distributed) to $G'$, and it makes sense to study how the potential overlap $O_{\mathrm{pot}}(D,\sigma(D'))$ and the correlation coefficient $\tau(D,\sigma(D'))$ vary in terms of $\sigma$ with respect to $O_{\mathrm{pot}}(D,D')$ and $\tau(D,D')$. Since any permutation decomposes into a sequence of transpositions (or swaps) of two elements, let us consider a pair of indices $i<j$ such that $k'_i>k'_j$ (a discordant pair). When we swap both entries in $D'$ to get a sequence $D''$ such that $k''_i=k'_j$, $k''_j=k'_i$, and $k''_l=k'_l$ for $l\ne i,j$, then $\tau(D,D'')>\tau(D,D')$. On the other hand, it is trivial to check that $\sum_n\min\{k_n,k''_n\}-\sum_n\min\{k_n,k'_n\}$ equals

(a) $0$ if $k'_j<k'_i<k_i<k_j$,

(b) $k_j-k_i>0$ if $k'_j<k_i<k_j<k'_i$,

(c) $k'_i-k_i>0$ if $k'_j<k_i<k'_i<k_j$,

(d) $k'_i-k'_j>0$ if $k_i<k'_j<k'_i<k_j$,

(e) $k_j-k'_j>0$ if $k_i<k'_j<k_j<k'_i$,

(f) $0$ if $k_i<k_j<k'_j<k'_i$.

Since $O_{\mathrm{pot}}((r_n)_1^N,(r'_n)_1^N)$ is increasing as a function of $\sum_n\min\{r_n,r'_n\}$ [see (7)], it follows that the potential overlap does not decrease when performing a swap that increases the $\tau$-b coefficient. Analogously, one can check that the $\tau$-b coefficient does not decrease after a swap that increases the potential overlap. This remark plainly shows that, as expected, there is a direct relationship between overlap and cross-layer degree-degree correlation.
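The six cases can be checked mechanically. The sketch below enumerates every strict ordering of four distinct values for which the pair $i<j$ is discordant and compares the tabulated gain with the directly computed change of the sum of minima. The primed degrees are written `kpi`, `kpj`, and the sample values 2, 5, 7, 11 are arbitrary; ties are not covered by this check.

```python
from itertools import permutations

def min_sum_gain(ki, kj, kpi, kpj):
    """Change of sum_n min{k_n, k'_n} when the entries k'_i and k'_j are exchanged."""
    before = min(ki, kpi) + min(kj, kpj)
    after = min(ki, kpj) + min(kj, kpi)
    return after - before

def case_gain(ki, kj, kpi, kpj):
    """Label and tabulated gain for a discordant pair (k_i < k_j, k'_i > k'_j)."""
    if kpj < kpi < ki < kj:
        return "a", 0
    if kpj < ki < kj < kpi:
        return "b", kj - ki
    if kpj < ki < kpi < kj:
        return "c", kpi - ki
    if ki < kpj < kpi < kj:
        return "d", kpi - kpj
    if ki < kpj < kj < kpi:
        return "e", kj - kpj
    if ki < kj < kpj < kpi:
        return "f", 0
    raise ValueError("not a strictly ordered discordant pair")

seen = {}
for ki, kj, kpi, kpj in permutations((2, 5, 7, 11)):
    if ki < kj and kpi > kpj:                      # discordant pair with i < j
        label, gain = case_gain(ki, kj, kpi, kpj)
        assert gain == min_sum_gain(ki, kj, kpi, kpj)
        seen[label] = gain
print(sorted(seen.items()))
```

Each of the six orderings occurs exactly once in the enumeration, so the loop exercises all cases (a) to (f).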

Keeping in mind that we want to find a sequence of swaps that increases the potential overlap while controlling in some sense the cross-layer degree-degree correlation, a crucial remark is that, together with the swaps of types (b)–(e) above, which increase both the potential overlap and the $\tau$-b coefficient, there are two cases for which the swap $k'_i\leftrightarrow k'_j$ does not modify the potential overlap while it decreases the $\tau$-b coefficient:

(A) $k'_i<k'_j<k_i<k_j$,

(B) $k_i<k_j<k'_i<k'_j$.

Before describing what we call the LS-CR algorithm (standing for label swap–cross rewiring), we give an example of how it works. Let $p(k), p'(k)$ be two empirical distributions approximating respectively a Poisson distribution with $\langle k\rangle=10$ and a scale-free distribution with $\langle k\rangle=12$. Set $N=10000$. Let $D, D'$ be two random arrangements of the degree sets. The cross-layer degree-degree correlation is expected to be close to 0. Indeed, in a particular simulation we get $\tau(D,D')=0.01470$, while $O_{\mathrm{pot}}(D,D')=0.57478=O_{\mathrm{cr}}(N,p,p')$. So, since the cross-rewiring operations do not modify the cross-layer correlation, if we want a prescribed overlap $\alpha$ smaller than 0.57478, the CR algorithm suffices to construct a two-layer network with overlap close to $\alpha$ and a small $\tau$-b coefficient.

But suppose that the desired overlap is significantly larger. To see how big it can be, rearrange the elements in $D, D'$ to get two increasingly ordered sequences $\sigma(D), \rho(D')$ and compute $O_{\mathrm{pot}}(\sigma(D),\rho(D'))=0.749283$, which according to (9) is an absolute upper bound for the largest permitted overlap. The corresponding $\tau$-b coefficient is of course very close to 1: $\tau(\sigma(D),\rho(D'))=0.949190$. Suppose now that the desired overlap is very close to $\bar{O}$, for instance $\alpha=0.73$. We proceed as follows. Rearrange both sequences using the permutation $\sigma$ [so $\sigma(D)$ is increasingly ordered while $\sigma(D')$ is not]. Neither the potential overlap nor the $\tau$-b coefficient between $\sigma(D)$ and $\sigma(D')$ changes. Now we perform a series of swaps in $\sigma(D')$ of any of types (b)–(e), which increase both the potential overlap and the $\tau$-b coefficient, until we reach the potential overlap $\alpha$. Then, we perform as many swaps of types (A) and (B) as possible in order to diminish the $\tau$-b coefficient without modifying the potential overlap. After running this algorithm in our particular simulation, we get a sequence $D''$ such that $\tau(\sigma(D),D'')=0.648549$. Of course the correlation is high, but significantly smaller than 1. Finally, we can use the CR algorithm with input $\sigma(D), D'', \alpha$ to effectively construct the two-layer network. If we repeat the previous scheme with a prescribed overlap $\alpha=0.65$, still close to the maximum, we get a sequence $D''$ such that $\tau(\sigma(D),D'')=0.209785$. It is instructive to visualize the evolution of both $O_{\mathrm{pot}}$ and $\tau$ during the complete sequence of swaps (see Fig. 1).

FIG. 1.

Evolution of the potential overlap (crosses) and the cross-layer degree-degree correlation (diamonds) when performing a sequence of swaps.

So, let $0\le\alpha\le\bar{O}(N,p,p')$ be the desired overlap. The following LS-CR algorithm is intended to construct two networks of $N$ nodes distributed according to $p(k), p'(k)$ with an overlap close to $\alpha$ and a cross-layer degree-degree correlation as small as possible.

LS-CR algorithm [input: $N$, $p(k)$, $p'(k)$, $\alpha$]

(1) Take degree sequences $D_{\mathrm{rand}}, D'_{\mathrm{rand}}$ by rearranging at random the degree sets of $p(k), p'(k)$. Then,

$$O_{\mathrm{pot}}(D_{\mathrm{rand}},D'_{\mathrm{rand}})=O_{\mathrm{cr}}(N,p,p').$$

(2) If $\alpha\le O_{\mathrm{cr}}(N,p,p')$, execute the CR algorithm with input $D_{\mathrm{rand}}, D'_{\mathrm{rand}}, \alpha$ and stop. Otherwise,

(3) Let $\sigma$ be the permutation that rearranges $D_{\mathrm{rand}}$ increasingly. Set $D_0=\sigma(D_{\mathrm{rand}})$, $D'_0=\sigma(D'_{\mathrm{rand}})$. Then,

$$O_{\mathrm{pot}}(D_0,D'_0)=O_{\mathrm{pot}}(D_{\mathrm{rand}},D'_{\mathrm{rand}}),\qquad \tau(D_0,D'_0)=\tau(D_{\mathrm{rand}},D'_{\mathrm{rand}})\approx 0.$$

(4) At each time step $t\ge 0$:

Choose at random (if it exists) a pair of indices $i<j$ such that the four corresponding entries in $D_0$ and $D'_t$ satisfy any of the conditions (b)–(e). Swap the entries $i$ and $j$ in $D'_t$ to get a new sequence $D'_{t+1}$. Then,

$$O_{\mathrm{pot}}(D_0,D'_{t+1})>O_{\mathrm{pot}}(D_0,D'_t).$$

If $O_{\mathrm{pot}}(D_0,D'_{t+1})\ge\alpha$, set $t_0:=t+1$ and go to step 5. Otherwise, proceed to the next time step.

(5) At each time step $t\ge t_0$:

Choose at random (if it exists) a pair of indices $i<j$ such that the four corresponding entries in $D_0$ and $D'_t$ satisfy either (A) or (B). Swap the entries $i$ and $j$ in $D'_t$ to get a new sequence $D'_{t+1}$. Then,

$$O_{\mathrm{pot}}(D_0,D'_{t+1})=O_{\mathrm{pot}}(D_0,D'_t),\qquad \tau(D_0,D'_{t+1})<\tau(D_0,D'_t).$$

If no pairs are found satisfying (A) or (B), set $t_1:=t$ and go to step 6. Otherwise, proceed to the next time step.

(6) Execute the CR algorithm with input $D_0, D'_{t_1}, \alpha$.
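The label-swap part of the procedure, steps (4) and (5), can be sketched as follows for small sequences. This is a brute-force illustration under stated assumptions, not the authors' implementation: the potential overlap is assumed to have the Jaccard-type form built from $\sum_n\min\{k_n,k'_n\}$, swap types are detected through the sign of the gain rather than through the explicit orderings (so boundary cases with ties are also accepted), and the CR step is omitted. All function names and the test degree sets are hypothetical.

```python
import random

def potential_overlap(d0, d):
    """Assumed form of O_pot based on sum_n min{k_n, k'_n}."""
    common = sum(min(a, b) for a, b in zip(d0, d))
    return common / (sum(d0) + sum(d) - common)

def swap_gain(d0, d, i, j):
    """Change of sum_n min{k_n, k'_n} if entries i and j of d are exchanged."""
    before = min(d0[i], d[i]) + min(d0[j], d[j])
    after = min(d0[i], d[j]) + min(d0[j], d[i])
    return after - before

def label_swap(d0, d, alpha, rng):
    """Steps (4) and (5) for an increasingly ordered d0; returns a rearranged copy of d."""
    d = list(d)
    pairs = [(i, j) for i in range(len(d)) for j in range(i + 1, len(d)) if d0[i] < d0[j]]
    # Step (4): swap discordant pairs with positive gain until alpha is reached.
    while potential_overlap(d0, d) < alpha:
        cand = [(i, j) for i, j in pairs if d[i] > d[j] and swap_gain(d0, d, i, j) > 0]
        if not cand:
            break
        i, j = rng.choice(cand)
        d[i], d[j] = d[j], d[i]
    # Step (5): swap concordant pairs with zero gain, which lowers tau only.
    while True:
        cand = [(i, j) for i, j in pairs if d[i] < d[j] and swap_gain(d0, d, i, j) == 0]
        if not cand:
            return d
        i, j = rng.choice(cand)
        d[i], d[j] = d[j], d[i]

rng = random.Random(1)
d0 = sorted(rng.randint(1, 12) for _ in range(40))   # hypothetical first layer, ordered
d_rand = [rng.randint(1, 15) for _ in range(40)]     # hypothetical second layer, random order
o_max = potential_overlap(d0, sorted(d_rand))
alpha = 0.95 * o_max
d_out = label_swap(d0, d_rand, alpha, rng)
print(potential_overlap(d0, d_rand), potential_overlap(d0, d_out), o_max)
```

Rescanning all pairs at every step costs quadratic time per swap, which is acceptable only for sequences of a few dozen nodes; a practical implementation would maintain the candidate sets incrementally.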

In Table IV we show some statistical features of the two-layer network obtained from the LS-CR algorithm for several pairs of distributions and different values of the prescribed overlap, all beyond the critical one. In each case we show the obtained overlap α, the Kendall's τ-b coefficient τ for the cross-layer correlation, and the Pearson coefficients ρ1, ρ2 for the degree-degree correlation inside each layer. The evolution of the statistics with the overlap depends of course on the particular distributions considered, but some clear general conclusions can be extracted. In all cases, the obtained overlaps are close to the prescribed one. The τ-b coefficient approaches 1 (even relatively) only for values of the overlap beyond about 80% of the theoretical maximum. The degree-degree correlations inside each layer remain in most cases close to 0.

TABLE IV.

Three examples of a series of executions of the LS-CR algorithm with prescribed overlaps 0.6, 0.65, 0.7, 0.75, 0.8, and 0.85. In all cases, $N=10000$, $\langle k\rangle=12$ for the first distribution, and $\langle k\rangle=14$ for the second one. For any pair of distributions we report both the critical overlap $O_{\mathrm{cr}}$ and the theoretical maximum overlap $\bar{O}$. For each two-layer network, we show the overlap $\alpha$, the Kendall's $\tau$-b coefficient $\tau$ for the cross-layer degree-degree correlation, and the Pearson coefficients $\rho_1$, $\rho_2$ for the in-layer degree-degree correlations.

Prescribed overlap   0.60     0.65     0.70     0.75     0.80     0.85

ER 12–Exp 14 ($O_{\mathrm{cr}}=0.6330$, $\bar{O}=0.8350$)
  α                  0.5844   0.5952   0.6389   0.6985   0.7614   0.8050
  τ                  0.0178   0.0005   0.0935   0.3537   0.6152   0.7899
  ρ1                 0.0753   0.0692   0.0522   0.0241   0.0087   0.0095
  ρ2                 0.1865   0.2055   0.3517   0.3543   0.3311   0.2727

Exp 12–Exp 14 ($O_{\mathrm{cr}}=0.5797$, $\bar{O}=0.8545$)
  α                  0.5387   0.5788   0.6386   0.7014   0.7688   0.8459
  τ                  0.0006   0.0323   0.2116   0.3656   0.5132   0.6743
  ρ1                 0.2010   0.1833   0.1451   0.0851   0.0366   0.0166
  ρ2                 0.1715   0.2377   0.2104   0.1798   0.3271   0.1102

SF 12–SF 14 ($O_{\mathrm{cr}}=0.5027$, $\bar{O}=0.8522$)
  α                  0.5102   0.5702   0.6236   0.6879   0.7611   0.8385
  τ                  0.1543   0.2900   0.4033   0.4989   0.5722   0.6246
  ρ1                 0.0982   0.0957   0.0761   0.0551   0.0269   0.0021
  ρ2                 0.1215   0.1106   0.0980   0.0696   0.0362   0.0760

As a final remark, it is clear that the LS-CR algorithm admits many variants depending on the type and order of swaps that one performs (in the “LS” part of the algorithm). For instance, one may be interested in inverting the roles and generating a two-layer network with a prescribed cross-layer degree-degree correlation, while getting an overlap as big as possible.

VIII. A SIMPLE EXAMPLE: A MEAN-FIELD SIR EPIDEMIC MODEL ON A TWO-LAYER NETWORK

This section aims at illustrating that, especially for mean-field models of processes that take place over a two-layer network, the qualitative response of a model may depend critically on the interlayer overlap. To this end, we present a simple example of an epidemic model with information dissemination and determine the impact of the overlap on the final outbreak size predicted by the model.

Epidemic models describe the spread of infectious diseases on populations whose individuals are classified into distinct classes according to their infection state as, for instance, susceptible (S), infectious (I), and recovered (R) individuals. A closer look at the physical transmission of an infection reveals that a suitable description of populations must take into account the network layer A of physical contacts among individuals, with nodes representing individuals and links corresponding to physical contacts along which disease can propagate. On the other hand, if one assumes that the probability of getting infected through an infectious contact S-I depends on the awareness state of the susceptible individual, then a second network layer B over which information about the infection status of individuals circulates can be considered. In the context of management and control of sexually transmitted diseases (STDs), an example of this second network layer is given by the partner notification program. This service helps to reach sexual contacts of patients of STDs and inform them that they may be at risk, and hence the need of seeking medical care [24,25]. So, in our approach, if a pair of individuals, one susceptible and the other infectious, are connected to each other in both network layers, we assume that the transmission rate βc (here c stands for common) will be smaller than the normal transmission rate β because the susceptible partner adopts preventive measures to diminish the risk of contagion.

According to this scenario, next we derive a mean-field SIR epidemic model which implicitly assumes spreading of information on the infection status of nodes in one layer, while explicitly modeling the transmission of an infectious agent in a second layer. Following the standard approach for STDs where the heterogeneity in the number of contacts (sexual partners) is a basic ingredient [26], individuals are classified according to their infection state and their number of physical contacts. So, the model will take into account the network layer A of physical contacts in terms of its degree distribution pA(k)=Nk/N where Nk is the number of individuals having degree k. Analogously, the information or notification network (network layer B) is described by its degree distribution pB(k). For the sake of brevity, a pair of nodes connected to each other in both networks is said to share a common link, although the natures of the connections are dissimilar. Moreover, the model does not assume that links in layer B are a subset of those in layer A, as could be the case in partner notification.

Within each layer, it is assumed that there is no degree-degree correlation, i.e., neighbors in each layer are randomly sampled from the population according to the so-called proportionate mixing of individuals. This means that, in each layer, the probability $P(k'|k)$ that a node of degree $k$ is connected to a node of degree $k'$ is independent of the degree $k$ and it is given by the fraction of links pointing to nodes of degree $k'$, i.e., $P(k'|k)=k'p(k')/\langle k\rangle$. Now, let $I_k(t)$ be the number of infectious nodes of degree $k$ at time $t$ in layer $A$. Although links are unordered pairs of connected nodes by definition, let us consider that every link $\{u,v\}$ gives rise to two oriented links $u\to v$ and $v\to u$. Then, the probability that a randomly chosen oriented link of $A$ leads to an infectious node is given by the fraction of oriented links in $A$ pointing to infectious nodes [26], that is,

$$\Theta_I(t)=\frac{1}{\langle k\rangle_A N}\sum_k k\,I_k(t)=\frac{1}{\langle k\rangle_A}\sum_k k\,i_k(t),$$

where $\langle k\rangle_A$ is the average degree in $A$, and $i_k(t):=I_k(t)/N$ is the fraction of nodes that are both infectious and of degree $k$ in $A$ at time $t$.

Finally, let $L_A$, $L_B$, and $L_{AB}$ denote the number of links of $A$, $B$, and common links, respectively. Let $p_{B|A}$ be the probability that a randomly chosen link of $A$ (an $A$ link) connects two nodes that are also connected in $B$, that is, $p_{B|A}=L_{AB}/L_A$. Similarly, $p_{A|B}=L_{AB}/L_B$ is the probability that a randomly chosen $B$ link is a common link to both networks.

We stress that a key assumption in the model derivation is the uniformity of the overlap between the links of each layer. More precisely: the overlap α is a global feature of the pair of networks {A,B} that depends on the respective whole sets of links, and the equations of the model, that will account for what happens around a typical node i, will be derived using α as a parameter. Implicitly, this corresponds to the mean-field approximation that the local overlap around the node i (fraction of links confluent to i in the union network that are common links of A and B) does not deviate significantly from α. This assumption is clearly unrealistic in general. For instance, a particular run of the LS-CR algorithm with prescribed overlap α=0.5 over two exponential networks of 5000 nodes and mean degrees 45 and 30 leads to a mean local overlap equal to 0.5439 and a standard deviation of 0.1790. So, it is relevant to test the goodness of this approximation by comparing the predictions of the model with simulation outputs.
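The local overlap described in parentheses above can be computed directly from the two link sets. The sketch below is only illustrative: it assumes simple undirected layers given as lists of node pairs, skips nodes that are isolated in both layers, and uses the population standard deviation, none of which is specified in the text. The function name `local_overlaps` and the toy link lists are hypothetical.

```python
from statistics import mean, pstdev

def local_overlaps(edges_a, edges_b):
    """Per node: fraction of its links in the union network that belong to both layers."""
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    union_deg, common_deg = {}, {}
    for link in a | b:
        for node in link:
            union_deg[node] = union_deg.get(node, 0) + 1
            if link in a and link in b:
                common_deg[node] = common_deg.get(node, 0) + 1
    return {node: common_deg.get(node, 0) / union_deg[node] for node in union_deg}

edges_a = [(0, 1), (1, 2), (2, 3)]   # toy layer A
edges_b = [(1, 2), (2, 3), (3, 0)]   # toy layer B
loc = local_overlaps(edges_a, edges_b)
print(loc, mean(loc.values()), pstdev(loc.values()))
```

In this toy example the global overlap is 2/4 = 0.5 while the local values range from 0 to 1, which is the kind of dispersion the mean-field assumption ignores.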

Taking all this into account, the epidemic spreading is described in terms of $I_k(t)$, and also of $S_k(t)$ and $R_k(t)$, the number of susceptible and recovered nodes of degree $k$ in layer $A$ at time $t$ respectively, which satisfy $S_k(t)+I_k(t)+R_k(t)=N_k$. In particular, the differential equations for $S_k$ and $I_k$ are

$$\frac{dS_k}{dt}=-k\,(1-p_{B|A})\,\beta\,S_k\Theta_I-k\,p_{B|A}\,\beta_c\,S_k\Theta_I, \tag{11}$$

$$\frac{dI_k}{dt}=k\,(1-p_{B|A})\,\beta\,S_k\Theta_I+k\,p_{B|A}\,\beta_c\,S_k\Theta_I-\mu I_k. \tag{12}$$

The first term on the right-hand side of (12) is the rate of creation of new infectious nodes of degree $k$ in $A$ due to transmissions of the infection through links that only belong to layer $A$, whereas the second one is the rate of creation of new infectious nodes from transmissions across common links. The last term accounts for the recoveries of infectious nodes, which occur at a recovery rate $\mu$. Here $\langle k\rangle_A\,p_{B|A}$ is the expected number of common oriented links per node. Therefore, since this number is the same regardless of the network we use to compute it, the following consistency relationship must hold:

$$\langle k\rangle_A\,p_{B|A}=\langle k\rangle_B\,p_{A|B}. \tag{13}$$

Now let us express $p_{B|A}$ and $p_{A|B}$ in terms of the overlap $\alpha:=O(A,B)$, which is defined as $\alpha=L_{AB}/L_{A\cup B}$, where $L_{A\cup B}$ is the number of links of the union network $A\cup B$. Using that $\langle k\rangle N=2L$ in each layer, $p_{B|A}$ can be expressed in terms of $\alpha$ as follows:

$$p_{B|A}=\frac{L_{AB}}{L_A}=\frac{L_{AB}}{L_{A\cup B}}\,\frac{L_{A\cup B}}{L_A}=\alpha\,\frac{L_A+L_B-L_{AB}}{L_A}=\alpha\left(1+\frac{\langle k\rangle_B}{\langle k\rangle_A}-p_{B|A}\right). \tag{14}$$

From this simple relationship it immediately follows that

$$p_{B|A}=\left(1+\frac{\langle k\rangle_B}{\langle k\rangle_A}\right)\frac{\alpha}{1+\alpha}. \tag{15}$$

Similarly,

$$p_{A|B}=\left(1+\frac{\langle k\rangle_A}{\langle k\rangle_B}\right)\frac{\alpha}{1+\alpha}.$$

As expected, $p_{B|A}$ and $p_{A|B}$ fulfill relationship (13).
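Relations (13) and (15) can be verified on any pair of link sets, because the ratio of mean degrees equals the ratio of link counts. The following sketch does so on a small hypothetical example; `overlap_stats` and `p_cond` are names introduced here for illustration.

```python
def overlap_stats(edges_a, edges_b):
    """Return alpha, p_{B|A}, p_{A|B} and the link counts L_A, L_B of two simple layers."""
    a = {frozenset(e) for e in edges_a}
    b = {frozenset(e) for e in edges_b}
    l_a, l_b, l_ab = len(a), len(b), len(a & b)
    alpha = l_ab / len(a | b)            # Jaccard index of the two link sets
    return alpha, l_ab / l_a, l_ab / l_b, l_a, l_b

def p_cond(alpha, k_own, k_other):
    """Eq. (15): probability that a link of the 'own' layer also belongs to the other one."""
    return (1 + k_other / k_own) * alpha / (1 + alpha)

edges_a = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # toy layer A, 5 links
edges_b = [(0, 1), (1, 2), (1, 3)]                   # toy layer B, 3 links
alpha, p_ba, p_ab, l_a, l_b = overlap_stats(edges_a, edges_b)

# <k>_B / <k>_A = L_B / L_A, so link counts can stand in for mean degrees.
print(p_ba, p_cond(alpha, l_a, l_b))
print(p_ab, p_cond(alpha, l_b, l_a))
```

Both printed pairs should agree up to rounding, which reflects that (15) is an identity and not an approximation at the level of global link counts.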

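Introducing (15) into system (11) and (12), the overlap appears as a new parameter of the model which now, in terms of the fractions $s_k=S_k/N$ and $i_k=I_k/N$ of susceptible and infectious nodes of degree $k$, reads

$$\frac{ds_k}{dt}=-k\,\beta_0(\alpha)\,s_k\Theta_I, \tag{16}$$

$$\frac{di_k}{dt}=k\,\beta_0(\alpha)\,s_k\Theta_I-\mu\,i_k, \tag{17}$$

where

$$\beta_0(\alpha):=\frac{1}{1+\alpha}\left[\beta\left(1-\frac{\langle k\rangle_B}{\langle k\rangle_A}\,\alpha\right)+\beta_c\left(1+\frac{\langle k\rangle_B}{\langle k\rangle_A}\right)\alpha\right],$$

and $s_k+i_k+r_k=p_A(k)$.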

These equations correspond to the standard SIR model for heterogeneous and closed populations with proportionate mixing [26,27], but with an averaged transmission rate β0(α) that takes into account the degree of overlap between the two layers. A similar mean-field approach for modeling epidemic spreading in single heterogeneous networks was adopted in [28] using, as a state variable, the fraction ρk of nodes of degree k that are infectious. The connection between both approaches is given by the relationship between the state variables. For instance, ik=Ik/N=Ik/Nk·Nk/N=:ρkp(k).
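For readers who want to see the dynamics of (16) and (17), the sketch below integrates the system with a classical fourth-order Runge-Kutta step in plain Python. The initial infectious fraction `i0`, the step size, and the time horizon are assumptions made for the illustration, and the function names are hypothetical. The parameter values are those listed in the captions of Figs. 2 and 3, applied to a regular layer $A$ of degree 45.

```python
def beta0(alpha, beta, beta_c, k_a, k_b):
    """Averaged transmission rate appearing in Eqs. (16) and (17)."""
    ratio = k_b / k_a
    return (beta * (1 - ratio * alpha) + beta_c * (1 + ratio) * alpha) / (1 + alpha)

def rhs(state, degrees, k_mean, b0, mu):
    """Right-hand sides of (16) and (17); state = (list of s_k, list of i_k)."""
    s, i = state
    theta = sum(k * ik for k, ik in zip(degrees, i)) / k_mean
    ds = [-k * b0 * sk * theta for k, sk in zip(degrees, s)]
    di = [-dsk - mu * ik for dsk, ik in zip(ds, i)]
    return ds, di

def axpy(state, slope, h):
    """Return state + h * slope, componentwise."""
    return tuple([x + h * dx for x, dx in zip(xs, dxs)] for xs, dxs in zip(state, slope))

def integrate(p_a, b0, mu, i0, t_end, dt=0.01):
    """Integrate from a state where a fraction i0 of every degree class is infectious."""
    degrees = sorted(p_a)
    k_mean = sum(k * p_a[k] for k in degrees)
    state = ([p_a[k] * (1 - i0) for k in degrees], [p_a[k] * i0 for k in degrees])
    for _ in range(round(t_end / dt)):
        k1 = rhs(state, degrees, k_mean, b0, mu)
        k2 = rhs(axpy(state, k1, dt / 2), degrees, k_mean, b0, mu)
        k3 = rhs(axpy(state, k2, dt / 2), degrees, k_mean, b0, mu)
        k4 = rhs(axpy(state, k3, dt), degrees, k_mean, b0, mu)
        for slope, weight in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            state = axpy(state, slope, weight * dt / 6)
    return state

b0 = beta0(0.3, beta=0.1, beta_c=0.05, k_a=45, k_b=30)
s, i = integrate({45: 1.0}, b0, mu=1.0, i0=1 / 5000, t_end=40.0)
r_final = 1.0 - sum(s) - sum(i)   # recovered fraction at the end of the run
print(r_final)
```

Because a small but nonzero fraction is infectious at time zero, the recovered fraction obtained this way is only approximately the one given by the final-size relation derived for the initial condition $(p_A(k),0)$.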

Simple facts about system (16) and (17) are as follows:

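(1) Since the factor $\alpha/(1+\alpha)$ in (15) is increasing in $\alpha$, and $\alpha\le\min\{\langle k\rangle_A,\langle k\rangle_B\}/\max\{\langle k\rangle_A,\langle k\rangle_B\}$ [see (8)], it follows that

$$p_{B|A}\le\frac{\min\{\langle k\rangle_A,\langle k\rangle_B\}}{\langle k\rangle_A}.$$

So, when $\langle k\rangle_A\le\langle k\rangle_B$ we get $p_{B|A}\le 1$, while for $\langle k\rangle_A>\langle k\rangle_B$ we get $p_{B|A}\le\langle k\rangle_B/\langle k\rangle_A<1$.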

(2) If βc=β or α=0, the system reduces to the classic SIR model, as expected, because information dissemination plays no role in the infection spread. If α=1, we actually have one network and again the system reduces to the SIR model but now with β replaced by βc.
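Both facts can be checked numerically. In the sketch below, `beta0` is written as the mixture $(1-p_{B|A})\beta+p_{B|A}\beta_c$ that results from inserting (15) into (11) and (12); the helper `alpha_max` encodes the bound quoted from (8). The function names and the numerical values are illustrative.

```python
def p_b_given_a(alpha, k_a, k_b):
    """Eq. (15)."""
    return (1 + k_b / k_a) * alpha / (1 + alpha)

def beta0(alpha, beta, beta_c, k_a, k_b):
    """Averaged transmission rate: A-only links transmit at beta, common links at beta_c."""
    p = p_b_given_a(alpha, k_a, k_b)
    return (1 - p) * beta + p * beta_c

def alpha_max(k_a, k_b):
    """Largest overlap compatible with the two mean degrees, as quoted from (8)."""
    return min(k_a, k_b) / max(k_a, k_b)

beta, beta_c = 0.1, 0.05
print(beta0(0.0, beta, beta_c, 45, 30))            # no overlap
print(beta0(1.0, beta, beta_c, 30, 30))            # identical layers
print(p_b_given_a(alpha_max(45, 30), 45, 30))      # bound of item (1)
```

The first two values reproduce the limits $\beta$ and $\beta_c$ of item (2), and the third one equals $\langle k\rangle_B/\langle k\rangle_A$ for these mean degrees, in line with item (1).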

To determine the impact of the network overlap on the initial epidemic growth, we linearize the system (16) and (17) about the disease-free equilibrium $(s_k^*,i_k^*)=(p_A(k),0)_k$ and obtain that the elements of the Jacobian matrix $J^*$ evaluated at this equilibrium are

$$J^*_{kk'}=\frac{\beta_0(\alpha)}{\langle k\rangle_A}\,k\,k'\,p_A(k)-\mu\,\delta_{kk'},$$

where $\delta_{kk'}$ is the Kronecker delta. Since the only non-zero eigenvalue of the matrix $(k\,k'\,p_A(k))$ is equal to $\langle k^2\rangle_A=\sum_k k^2 p_A(k)$ [with an associated eigenvector whose components $v_k$ are proportional to $k\,p_A(k)$], it follows that the largest eigenvalue of $J^*$ is

$$\Lambda_1(\alpha)=\frac{\langle k^2\rangle_A}{\langle k\rangle_A}\,\beta_0(\alpha)-\mu,$$

which corresponds to the initial growth rate of the epidemic. Clearly, $\Lambda_1$ decreases with $\alpha$ because $\beta_c<\beta$, and $\Lambda_1(\alpha)=0$ at $\beta_0(\alpha)/\mu=\langle k\rangle_A/\langle k^2\rangle_A$, which corresponds to the epidemic threshold according to this mean-field approximation. Notice that, under proportionate mixing, the expected degree of a node reached by following a randomly chosen link in network $A$ is $\langle k^2\rangle_A/\langle k\rangle_A$.
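A short numerical sketch of the growth rate follows. It evaluates $\Lambda_1(\alpha)$ for a regular layer of degree 45 with the parameter values given in the figure captions, and checks the eigenvector statement on a small heterogeneous distribution `p_het` that is purely hypothetical. Function names are introduced here for illustration.

```python
def beta0(alpha, beta, beta_c, k_a, k_b):
    """Averaged transmission rate of Eqs. (16) and (17)."""
    ratio = k_b / k_a
    return (beta * (1 - ratio * alpha) + beta_c * (1 + ratio) * alpha) / (1 + alpha)

def growth_rate(p_a, alpha, beta, beta_c, mu, k_b):
    """Lambda_1(alpha) = <k^2>_A / <k>_A * beta0(alpha) - mu."""
    k1 = sum(k * p for k, p in p_a.items())
    k2 = sum(k * k * p for k, p in p_a.items())
    return k2 / k1 * beta0(alpha, beta, beta_c, k1, k_b) - mu

def is_eigenvector(p_a):
    """Check that v_k = k p_A(k) satisfies M v = <k^2> v for M_{kk'} = k k' p_A(k)."""
    ks = sorted(p_a)
    k2 = sum(k * k * p_a[k] for k in ks)
    v = [k * p_a[k] for k in ks]
    mv = [sum(k * kp * p_a[k] * vkp for kp, vkp in zip(ks, v)) for k in ks]
    return all(abs(x - k2 * y) < 1e-9 for x, y in zip(mv, v))

regular = {45: 1.0}
p_het = {10: 0.5, 40: 0.3, 100: 0.2}   # hypothetical heterogeneous layer A
rates = [growth_rate(regular, a, 0.1, 0.05, 1.0, 30) for a in (0.3, 0.4, 0.5)]
print(rates, is_eigenvector(p_het))
```

The three rates decrease with the overlap, as stated in the text, and all remain positive, so these parameter choices lie above the mean-field epidemic threshold.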

We have checked the accuracy of the model (16) and (17) by collating the predicted epidemic final size, i.e., the number of individuals ever infected, with the histogram of final outbreak sizes of an ensemble of 1500 stochastic epidemic realizations on a network of 5000 nodes and using α as a tuning parameter. Each network layer is generated according to the configuration model, and the desired value of α is attained using the CR algorithm to guarantee the in-layer degree-degree correlation is as close to 0 as possible. To clearly separate the dichotomy “minor outbreak vs major outbreak” (initial extinctions are highly feasible because only one node is randomly infected at t=0), we chose the values of the parameters to be far enough from the epidemic threshold. This guarantees the existence of a marked distribution of final major-outbreak sizes, in addition to the one of minor-outbreak sizes around 1.

For an acceptable prediction of the model, the final epidemic size obtained from the mean-field approximation should be relatively close to the mean value around which major outbreaks are distributed. We stress that, in addition to the well-known limitations of the mean-field approach when modeling epidemic processes on one-layer networks [29], here the accuracy of predictions also depends on the fulfillment of the implicit hypotheses assumed in the derivation of expression (15) for $p_{B|A}$. Namely, (i) there are no in-layer degree-degree correlations, and (ii) the probability of occurrence of a common link is the same for any pair of nodes in the network. So, the value of $p_{B|A}$ does not depend on the degree of a node in layer A and, hence, the overlap between layers is uniformly distributed (i.e., there are no parts of the network more overlapped than others). Assumption (i) is guaranteed by the algorithms. However, assumption (ii) is not feasible when the architectures of both layers are very different from each other. Therefore, our simulations have been performed on networks with two different architectures reflecting two extreme cases. First, we have considered two-layer networks where each layer is in turn a regular random network. This guarantees that both hypotheses are satisfied and, moreover, that the mean-field approach is accurate for this type of network if the degree of each layer is high enough. Second, we have considered networks with both layers having exponential degree distributions, which have a high variance. In both cases, the mean degrees are 45 (layer A) and 30 (layer B), both high enough to minimize the impact of stochastic fluctuations around infected nodes.

To derive an analytical expression of the final epidemic size note that, for all $k$, $S_k(\infty)=N_k-R_k(\infty)$ since $\lim_{t\to\infty}I_k(t)=0$ [$S_k(\infty)$ and $R_k(\infty)$ are the limits of $S_k(t)$ and $R_k(t)$ as $t\to\infty$]. From this fact, the initial condition $(s_k(0),i_k(0))=(p_A(k),0)$, and integrating from 0 to $\infty$ the equation resulting from the sum of Eqs. (16) and (17), we have

$$\int_0^\infty I_k(t)\,dt=\frac{1}{\mu}\,R_k(\infty).$$

Now, integrating (16) from 0 to $\infty$, and using the previous expression, it follows that $R_k(\infty)=N_k\left(1-e^{-k\xi}\right)$, with $\xi:=\dfrac{\beta_0(\alpha)}{\mu\,\langle k\rangle_A N}\sum_k k\,R_k(\infty)$. Therefore, the final epidemic size is given by

$$\sum_k R_k(\infty)=\sum_k N_k\left(1-e^{-k\xi}\right) \tag{18}$$

with $\xi$ being the positive solution (if it exists) of the equation

$$\xi=\frac{\beta_0(\alpha)}{\langle k\rangle_A\,\mu}\,\big\langle k\left(1-e^{-k\xi}\right)\big\rangle=\frac{\beta_0(\alpha)}{\langle k\rangle_A\,\mu}\sum_k k\,p_A(k)\left(1-e^{-k\xi}\right). \tag{19}$$

It is interesting to observe that, since $\xi=0$ is always a solution, this equation will have a unique positive solution $\xi^*(\alpha)$ if the derivative with respect to $\xi$ of its right-hand side, evaluated at $\xi=0$, is larger than 1, which is equivalent to having $\Lambda_1(\alpha)>0$, i.e., a positive initial epidemic growth (the uniqueness follows from the concavity of the function defined by this right-hand side). This derivation of the final epidemic size is presented for the sake of completeness because, indeed, it follows from the one given in Ref. [26], Appendix E, in a more general setting.
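Equations (18) and (19) lend themselves to a simple fixed-point computation. The sketch below starts the iteration at $\beta_0/\mu$, which bounds any solution of (19) from above because $1-e^{-k\xi}<1$, so the iterates decrease toward the largest solution. This is one possible numerical scheme, not necessarily the one used by the authors; the function names, tolerance, and iteration cap are assumptions. The inputs are a regular layer of degree 45 with $N=5000$ and the parameter values of the figure captions.

```python
from math import exp

def beta0(alpha, beta, beta_c, k_a, k_b):
    """Averaged transmission rate of Eqs. (16) and (17)."""
    ratio = k_b / k_a
    return (beta * (1 - ratio * alpha) + beta_c * (1 + ratio) * alpha) / (1 + alpha)

def final_size(p_a, n_nodes, b0, mu, tol=1e-13, max_iter=100000):
    """Solve (19) by fixed-point iteration and return (xi, final size from (18))."""
    k_mean = sum(k * p for k, p in p_a.items())

    def g(xi):  # right-hand side of (19)
        return b0 / (k_mean * mu) * sum(k * p * (1 - exp(-k * xi)) for k, p in p_a.items())

    xi = b0 / mu                      # upper bound of any solution of (19)
    for _ in range(max_iter):
        new = g(xi)
        converged = abs(new - xi) < tol
        xi = new
        if converged:
            break
    size = n_nodes * sum(p * (1 - exp(-k * xi)) for k, p in p_a.items())
    return xi, size

regular = {45: 1.0}
sizes = {a: final_size(regular, 5000, beta0(a, 0.1, 0.05, 45, 30), 1.0)[1]
         for a in (0.3, 0.4, 0.5)}
print(sizes)
```

The resulting sizes can be compared with the rounded predictions quoted for regular random networks in the discussion of Fig. 2. Below the threshold the iteration converges to $\xi=0$ and the returned size is essentially zero.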

Figures 2 and 3 show the histograms of final outbreak sizes obtained from the stochastic simulations on regular random networks (RNNs) and on networks with exponential degree distributions, respectively, for three values of α. These figures also show the mean final epidemic size given by Eq. (18) after numerically solving Eq. (19) using the generated degree sequence in layer A. When the CR algorithm is applied and the distribution of major outbreaks is clearly distinguished from the one of minor outbreaks (otherwise it makes no sense to talk about the final size of an epidemic), the predicted final size on RNNs is almost the same as the mean final size of the major outbreaks (4853 vs 4850, 4816 vs 4818, and 4776 vs 4775 for α=0.3, 0.4, and 0.5, respectively; rounded values). For the exponential networks, the predicted final epidemic size differs by less than 5% from the mean final size of major outbreaks for most of the networks generated with the configuration model with different degree distributions (not shown here). In Fig. 3, this disagreement is indeed less than 2%. Therefore, in these cases the proposed mean-field model qualitatively captures the impact of the overlap on the expected final size of an epidemic.

FIG. 2.

Histograms of 1500 final outbreak sizes on a two-layer network of 5000 nodes. Each layer is generated as a regular random network of degree kA=45 and kB=30, respectively. The size distribution of small outbreaks ranges from 1 to 9 in the three panels but only the frequency of a final size equal to 1 (the initial infected node recovers before infecting any neighbor) can be distinguished. Vertical dotted line from bottom to top shows the predicted final epidemic size according to (18) and (19). Insets: magnified histograms of major-outbreak sizes. Parameters: β=0.1, βc=0.05, μ=1, and α=0.3 (a), 0.4 (b), and 0.5 (c).

FIG. 3.

Histograms of 1500 final outbreak sizes on a two-layer network of 5000 nodes and exponential degree distributions on each layer with expected degrees kA=45 and kB=30, respectively (see Sec. IV for details). The size distribution of small outbreaks ranges from 1 to 5 in top panel, and from 1 to 7 in middle and bottom panels. Note that only the frequency of a final size equal to 1 (the initial infected node recovers before infecting any neighbor) can be perceived. Vertical dotted line from bottom to top shows the predicted final epidemic size according to (18) and (19). Insets: magnified histograms of major-outbreak sizes. Parameters: β=0.1, βc=0.05, μ=1, and α=0.3 (a), 0.4 (b), and 0.5 (c).

IX. CONCLUSIONS

The aim of this paper is to provide a toolbox of algorithms to generate two networks $G, G'$ sharing the same set of $N$ nodes (a two-layer network) whose respective degree distributions $p(k), p'(k)$ are given, with a prescribed overlap coefficient $\alpha$ defined by the Jaccard index.

First of all, we study the possible range $(\alpha_m,\alpha_M)\subset[0,1]$ of permitted overlap coefficients in terms of $p(k)$, $p'(k)$, and $N$. We start by proving that $\alpha_m\approx 0$ for any $p(k), p'(k)$ and $N$ big enough. Given two fixed degree sequences $D=(k_i)_{i=1}^N$ and $D'=(k'_i)_{i=1}^N$, we derive an upper bound of $\alpha$ for any pair of networks sequenced as $D$ and $D'$, by assuming the condition (not realizable in general) that the intersection network has exactly $\min\{k_i,k'_i\}$ links attached at node $i$. We call this upper bound the potential overlap between $D$ and $D'$ and we denote it by $O_{\mathrm{pot}}(D,D')$. Then we prove that an estimate (more properly, an upper bound) for $\alpha_M$ is precisely $O_{\mathrm{pot}}(S,S')$, where $S, S'$ are degree sequences sampled from $p(k), p'(k)$ whose respective elements are increasingly arranged.

To construct the desired algorithm we proceed in three steps. First, we define a partial procedure, which we call the CR algorithm, that takes any pair of degree sequences $D, D'$ and a value $\alpha$ between 0 and $O_{\mathrm{pot}}(D,D')$ and constructs a pair of networks $G, G'$ sequenced as $D, D'$ whose overlap is as close as possible to $\alpha$. Second, we introduce what we call the critical overlap $\alpha_m<\alpha_c<\alpha_M$, defined as $O_{\mathrm{pot}}(D_{\mathrm{rand}},D'_{\mathrm{rand}})$, where $D_{\mathrm{rand}}, D'_{\mathrm{rand}}$ are random sequences sampled from $p(k), p'(k)$. Against an initial intuition, $\alpha_c$ is closer to $\alpha_M$ than expected. We show that the CR algorithm with $D_{\mathrm{rand}}, D'_{\mathrm{rand}}$ as input suffices to construct a pair of networks having any overlap below $\alpha_c$ and exhibiting some desirable statistical properties, specifically lack of in- and cross-layer degree-degree correlations. Finally, when the desired overlap is beyond $\alpha_c$, we propose what we call the LS-CR algorithm, which minimizes the cross-layer degree-degree correlations that unavoidably appear for high values of $\alpha$.

To illustrate the impact of the network overlap, we present a simple example of an SIR epidemic model over a two-layer network (physical contacts–information dissemination) and determine the impact of α on the initial epidemic growth and on the final epidemic size predicted by the model. The comparison of the epidemic final size with the average final size of major outbreaks obtained from stochastic simulations shows an excellent agreement on regular random networks of high degrees, and a qualitatively good agreement on exponential networks. Provided that one wants to perform simulations to validate this (or any) model where a multilayer network is involved, a systematic procedure to generate couples of networks of given size and degree distributions with a prescribed value of the overlap α as those presented here seems to be a useful tool.

ACKNOWLEDGMENTS

The authors have been partially supported by Research Grants No. MTM2014-52402-C3-3-P (J.S., D.J.), No. MTM2017-86795-C3-1-P (D.J.) of Ministerio de Economía y Competitividad (MINECO) of the Spanish government, No. 2017SGR1617 (D.J.), No. 2014SGR1083 (J.S.) of the Generalitat de Catalunya, and No. MPCUdG2016/047 of the Universitat de Girona.

REFERENCES

  • [1].F. D. Sahneh and C. Scoglio, Competitive epidemic spreading over arbitrary multilayer networks, Phys. Rev. E 89, 062817 (2014). 10.1103/PhysRevE.89.062817 [DOI] [PubMed] [Google Scholar]
  • [2].M. De Domenico, A. Solé-Ribalta, E. Cozzo, M. Kivelä, Y. Moreno, M. A. Porter, S. Gómez, and A. Arenas, Mathematical Formulation of Multilayer Networks, Phys. Rev. X 3, 041022 (2013). 10.1103/PhysRevX.3.041022 [DOI] [Google Scholar]
  • [3].G. D'Agostino and A. Scala, Networks of Networks: The Last Frontier of Complexity (Springer, New York, 2014). [Google Scholar]
  • [4].A. Saumell-Mendiola, M. A. Serrano, and M. Boguná, Epidemic spreading on interconnected networks, Phys. Rev. E 86, 026106 (2012). 10.1103/PhysRevE.86.026106 [DOI] [PubMed] [Google Scholar]
  • [5].S. Funk and V. A. A. Jansen, Interacting epidemics on overlay networks, Phys. Rev. E 81, 036118 (2010). 10.1103/PhysRevE.81.036118 [DOI] [PubMed] [Google Scholar]
  • [6].C. Granell, S. Gómez, and A. Arenas, Dynamical Interplay Between Awareness and Epidemic Spreading in Multiplex Networks, Phys. Rev. Lett. 111, 128701 (2013). 10.1103/PhysRevLett.111.128701 [DOI] [PubMed] [Google Scholar]
  • [7].C. Granell, S. Gómez, and A. Arenas, Competing spreading processes on multiplex networks: awareness and epidemics, Phys. Rev. E 90, 012808 (2014). 10.1103/PhysRevE.90.012808 [DOI] [PubMed] [Google Scholar]
  • [8].X. Wei, N. C. Valler, B. A. Prakash, I. Neamtiu, M. Faloutsos, and C. Faloutsos, Competing memes propagation on networks: A network science perspective, IEEE J. Sel. Areas Commun. 31, 1049 (2013). 10.1109/JSAC.2013.130607 [DOI] [Google Scholar]
  • [9].S. Funk, E. Gilad, C. Watkins, and V. A. A. Jansen, The spread of awareness and its impact on epidemic outbreaks, Proc. Natl. Acad. Sci. USA 106, 6872 (2009). 10.1073/pnas.0810762106 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [10].V. Marceau, P. A. Noël, L. Hébert-Dufresne, A. Allard, and L. J. Dubé, Modeling the dynamical interaction between epidemics on overlay networks, Phys. Rev. E 84, 026105 (2011). 10.1103/PhysRevE.84.026105 [DOI] [PubMed] [Google Scholar]
  • [11].P. Jaccard, Étude comparative de la distribution florale dans une portion des alpes et des Jura, Bull. Soc. Vaudoise Sci. Nat. 37, 547 (1901). [Google Scholar]
  • [12].V. Havel, A remark on the existence of finite graphs, Časopis Pěst. Mat. (in Czech) 80, 477 (1955). [Google Scholar]
  • [13].S. L. Hakimi, On realizability of a set of integers as degrees of the vertices of a linear graph. I, J. Soc. Ind. Appl. Math. 10, 496 (1962). 10.1137/0110037 [DOI] [Google Scholar]
  • [14].T. Britton, M. Deijfen, and A. Martin-Löf, Generating simple random graphs with prescribed degree distribution, J. Stat. Phys. 124, 1377 (2006). 10.1007/s10955-006-9168-x [DOI] [Google Scholar]
  • [15].E. A. Bender and E. R. Canfield, The asymptotic number of labeled graphs with given degree sequences, J. Comb. Theory, Ser. A 24, 296 (1978). 10.1016/0097-3165(78)90059-6 [DOI] [Google Scholar]
  • [16].B. Bollobás, A probabilistic proof of an asymptotic formula for the number of labeled regular graphs, Eur. J. Combin. 1, 311 (1980). 10.1016/S0195-6698(80)80030-8 [DOI] [Google Scholar]
  • [17].M. Molloy and B. Reed, A critical point for random graphs with a given degree sequence, Random Struct. Algorithms 6, 161 (1995). 10.1002/rsa.3240060204 [DOI] [Google Scholar]
  • [18].M. E. J. Newman, S. H. Strogatz, and D. J. Watts, Random graphs with arbitrary degree distributions and their applications, Phys. Rev. E 64, 026118 (2001). 10.1103/PhysRevE.64.026118 [DOI] [PubMed] [Google Scholar]
  • [19].S. N. Dorogovtsev, A. V. Goltsev, and J. F. F. Mendes, Critical phenomena in complex networks, Rev. Mod. Phys. 80, 1275 (2008). doi: 10.1103/RevModPhys.80.1275
  • [20].M. Catanzaro, M. Boguñá, and R. Pastor-Satorras, Generation of uncorrelated random scale-free networks, Phys. Rev. E 71, 027103 (2005). doi: 10.1103/PhysRevE.71.027103
  • [21].R. Xulvi-Brunet and I. M. Sokolov, Changing correlations in networks: Assortativity and dissortativity, Acta Phys. Pol. B 36, 1431 (2005).
  • [22].M. E. J. Newman, Assortative Mixing in Networks, Phys. Rev. Lett. 89, 208701 (2002). doi: 10.1103/PhysRevLett.89.208701
  • [23].M. G. Kendall, A new measure of rank correlation, Biometrika 30, 81 (1938). doi: 10.1093/biomet/30.1-2.81
  • [24].G. Bell and J. Potterat, Partner notification for sexually transmitted infections in the modern world: A practitioner perspective on challenges and opportunities, Sex. Transm. Infect. 87, ii34 (2011). doi: 10.1136/sextrans-2011-050229
  • [25].D. Juher, J. Saldaña, R. Kohn, K. Bernstein, and C. Scoglio, Network-centric interventions to contain the syphilis epidemic in San Francisco, Sci. Rep. 7, 6464 (2017). doi: 10.1038/s41598-017-06619-9
  • [26].R. M. Anderson and R. M. May, Infectious Diseases of Humans: Dynamics and Control (Wiley Online Library, 1992), Vol. 28.
  • [27].R. M. May and A. L. Lloyd, Infection dynamics on scale-free networks, Phys. Rev. E 64, 066112 (2001). doi: 10.1103/PhysRevE.64.066112
  • [28].R. Pastor-Satorras and A. Vespignani, Epidemic Spreading in Scale-Free Networks, Phys. Rev. Lett. 86, 3200 (2001). doi: 10.1103/PhysRevLett.86.3200
  • [29].J. C. Miller, A. C. Slim, and E. M. Volz, Edge-based compartmental modeling for infectious disease spread, J. R. Soc. Interface 9, 890 (2012). doi: 10.1098/rsif.2011.0403
