PLOS Computational Biology. 2020 Jul 16;16(7):e1008035. doi: 10.1371/journal.pcbi.1008035

Data-driven contact structures: From homogeneous mixing to multilayer networks

Alberto Aleta 1,*, Guilherme Ferraz de Arruda 1, Yamir Moreno 1,2,3
Editor: Benjamin Althouse 4
PMCID: PMC7386617  PMID: 32673307

Abstract

The modeling of the spreading of communicable diseases has experienced significant advances in the last two decades or so. This has been possible due to the proliferation of data and the development of new methods to gather, mine and analyze it. A key role has also been played by the latest advances in new disciplines like network science. Nonetheless, current models still lack a faithful representation of all possible heterogeneities and features that can be extracted from data. Here, we bridge a current gap in the mathematical modeling of infectious diseases and develop a framework that accounts simultaneously for both the connectivity of individuals and the age-structure of the population. We compare different scenarios, namely, i) the homogeneous mixing setting, ii) one in which only the social mixing is taken into account, iii) a setting that considers the connectivity of individuals alone, and finally, iv) a multilayer representation in which both the social mixing and the number of contacts are included in the model. We analytically show that the thresholds obtained for these four scenarios are different. In addition, we conduct extensive numerical simulations and conclude that heterogeneities in the contact network are important for a proper determination of the epidemic threshold, whereas the age-structure plays a bigger role beyond the onset of the outbreak. Altogether, when it comes to evaluating interventions such as vaccination, both sources of individual heterogeneity are important and should be concurrently considered. Our results also provide an indication of the errors incurred in situations in which one cannot access all needed information in terms of connectivity and age of the population.

Author summary

Disease modeling has advanced substantially in the last decades. However, state-of-the-art models still lack a full representation of all possible levels of heterogeneity. Here, we compare several frameworks that either use the connectivity, the demography, or both features. Specifically, we analyze four scenarios: (i) two homogeneous mixings, considering either social or demographic data and (ii) two network models, one accounting only for the connectivity distribution and another that includes both connectivity and demography. Our analyses highlight the differences between each approach and the role of demographic and connectivity distributions; while the contact pattern is crucial for the determination of the epidemic threshold, the age-structure is fundamental after the outbreak. Notably, regarding vaccination, both types of heterogeneity play a significant role, suggesting that neither of them should be neglected for this purpose. Finally, our results provide estimates of possible errors when data about sources of heterogeneity is not available.

Introduction

One of the most fundamental concepts in epidemic dynamics is the heterogeneity in the ability of hosts to transmit the disease. This heterogeneity can be described as a function of three components: an individual’s infectiousness, the rate at which she contacts susceptible individuals, and the duration of the infection [1]. Of these three components, the second one is probably the hardest to correctly estimate since it depends on several factors not related to the pathogen itself, such as the demographic structure of the population or its contact patterns. Hence, the heterogeneity in the mixing patterns between individuals is a key element for the correct assessment of the impact of epidemic outbreaks [2, 3].

The heterogeneity of the population can be characterized by different degrees of resolution [4]. The most basic approach, known as homogeneous mixing, considers that a contact between any two individuals in a population occurs randomly with equal probability [5]. In the 1980s, due to the interest in studying the spreading of sexually transmitted diseases, this assumption had to be modified [6]. The population was then divided into groups according to some characteristics, such as gender or sexual activity levels, and the interaction between those groups was encoded in a contact matrix [7]. Even though a homogeneous mixing component was still present inside each group, because all individuals within a group were indistinguishable, this approach demonstrated that a core group of 20% of the individuals in the host population could lead to 80% of the transmissions, which called for a complete redefinition of disease control programs [8].

The disproportionate role that highly active individuals had in the spreading dynamics was mathematically encoded in the fact that transmission did not depend on the average number of new partners but on the mean square divided by the mean [9]. This was one of the earliest signs of the crucial role that heterogeneities play in the spreading of diseases. This approach was also applied to other types of diseases in which groups of hosts could be easily identified, including vector-borne diseases [10] or age-dependent diseases [11]. The importance of heterogeneous mixing patterns was thus acknowledged, and several empirical studies measured them for sexually transmitted diseases [12, 13]. Yet, data on the mixing patterns that determine the spread of airborne infectious diseases and, in particular, on their relationship with the age of individuals were not collected at large scale until 2008, i.e., 20 years later [14].

A further step to include more heterogeneities in the system is to consider the complete contact network of the population, which contains explicitly who can contact whom [15, 16]. This approach is of particular importance for airborne infectious diseases since it is not possible to define a priori groups of highly infectious individuals, such as the core group for sexually transmitted diseases. Moreover, this approach gives a simple explanation for super-spreading events, which attracted a lot of attention after the 2003 SARS pandemic [17, 18] and are currently being scrutinized again in the context of the COVID-19 pandemic [19]. It was observed that it was common to find hosts who transmitted the disease to many more individuals than the average. Within a network perspective, this is just a consequence of the higher number of contacts, or degree, that some individuals have in the network [17, 18, 20]. This individual heterogeneity also signaled that outbreaks could be very large if key individuals become infected and, at the same time, gave a new target for efficient control strategies such as vaccinating highly connected individuals [21, 22]. However, despite the many advantages of this approach, determining the complete contact network of a large population is almost infeasible, especially for infections transmitted by respiratory droplets or close contacts. Hence, it is common to use idealized networks built using some empirical data of the population, such as the degree distribution [23].

Lastly, there are high-resolution approaches that rely on large amounts of statistical data to build agent-based models in which the behavior of every single individual is taken into account [24-29]. Note, however, that in agent-based models, individuals are usually assigned to certain mixing groups (e.g., their household, school, or workplace), and that inside those groups homogeneous mixing is used, due to the lack of data for all these settings at a country scale [30]. An important step to create more realistic models in this direction is to collect high-resolution data on individual contacts using wearable sensors [21], which can be used to build time-varying networks that contain not only the information about who contacts whom but also the duration and frequency of contacts [31]. Several settings have been monitored, such as schools and workplaces [32, 33], or even conferences and museums [34, 35]. Although the data is still too rare to be used in large scale simulations, it has already been shown that the heterogeneity induced by the time-varying networks inside each mixing group produces a different outcome than the one obtained assuming homogeneous mixing within each group [30].

Our goal in this paper is to analyze the role of one particular type of heterogeneity in disease dynamics, namely, the age structure of the population. Originally, age was introduced into the models to study childhood diseases [5]. The classical approach consists of dividing the population into different groups, one for each age bracket under consideration, and establishing an age-dependent transmission rate. This transmission rate can be arranged in a matrix in which each element encodes the transmission probability between groups i and j (this matrix is also known as the Who Acquired Infection from Whom matrix [36, 37]). It is also possible to separate the effect of the transmission itself in a common parameter and encode the number of contacts between each group in the matrix [38]. Note that this procedure falls into the second category described previously. That is, it takes into account the heterogeneity induced by having different classes of individuals but hides the individual variability under a homogeneous mixing approach within each group, as in models of sexually transmitted diseases with groups with different activity levels. Nevertheless, this approach is widely used today and has yielded outstanding results for many diseases such as chickenpox [39], herpes zoster [40], measles [41-43], pertussis [44] and tuberculosis [45]. In fact, even though the theoretical basis of this method is relatively old, data on the contact patterns of the general population as a function of their age have become available only recently.

The first large-scale study on the contact patterns between and within groups in the context of infections spread by respiratory droplets or close contact took place in 2008 and was focused on Europe [14]. Since then, a number of studies covering different countries have appeared, although data on Africa and Asia are still scarce [46]. Various methods have been developed to infer the contact patterns in the absence of direct data [47-49], and to project them into the future [50]. And yet, most studies that use this data disregard the whole distribution of contacts and use only the average number of contacts between groups, completely neglecting the individual heterogeneity (with few exceptions [51]). As a consequence, in these studies, super-spreading events cannot occur naturally unless the model is modified, in contrast to network models, in which the large connectivity of some individuals can result in the appearance of such events. Similarly, the virtual absence of an epidemic threshold for certain types of contact networks cannot be observed with these simplified contact patterns [52]. To bridge this gap, in this paper, we focus on analyzing the role that disease-independent heterogeneity in host contact rates plays in the spreading of epidemics in large populations under several scenarios, both numerically and analytically. Furthermore, in contrast to previous approaches to this problem [53-56], we use a data-driven approach not only to highlight the role of those heterogeneities but also to explore the validity of the conclusions that one can derive when only limited information about the population is available.

Results

Modeling the contact patterns of the population

There are multiple ways of modeling the contact patterns of the population, depending on the availability of data and the characteristics of the disease. In this work, we consider that diseases have the same outcome on all individuals regardless of their condition and that individuals do not change their behavior as a consequence of the disease. This way, we can focus on the effect of adding different characteristics to the population contact patterns.

To be more specific, we use the information from the survey that was carried out in Italy for the POLYMOD project [14]. In this project, over 7,000 participants from eight European countries were asked to record the characteristics of their contacts with different individuals during one day, including age, sex, location, etc. Since that pioneering work, the number of countries where this type of study has been conducted has been increasing steadily, but data on Africa and Asia are still scarce. Besides, the resolution and amount of information vary from study to study [46]. As such, we build four different models of interaction, assuming that only partial information about the population is available, see Fig 1.

Fig 1. Modeling the contact patterns of the population.


Panel A: Schematic view of the different models considered. If the only information available is the average number of contacts per individual, homogeneous mixing can be assumed (H). If there is information about the average number of contacts between individuals with age a and a′, then a classical group-interaction model can be implemented (M). On the other hand, if the full contact distribution of the population is known, regardless of their age, it is possible to build the contact network of the population (C). Lastly, when both the contact distribution and the interaction patterns between different age groups are known, the individual heterogeneity and the global mixing patterns can be combined to create a multilayer network in which each layer represents a different age group (C+M). Panel B: Demographic structure of Italy in 2005 [57]. Panel C: Age-contact patterns in Italy obtained in the POLYMOD study [14]. Panel D: Contact distribution in Italy obtained in the POLYMOD study [14]. The x axis represents the number of daily contacts and the y axis the fraction of individuals that reported that number of contacts. The distribution is fitted to a right-censored negative binomial distribution since the maximum number of contacts that could be reported was 45.

The simplest formulation is the homogeneous mixing approach (model H), suitable when very limited information about the population is available. In this model, all individuals are able to contact each other with equal probability. The number of such interactions, $\langle k \rangle$, can be extracted from contact surveys simply by calculating the average number of contacts per individual. Note, however, that this formulation is very simplistic since all individuals are completely equivalent. A slightly better approximation is to divide the population into age-groups, given the demographic structure of the population, Fig 1B, and establish a different number of contacts between and within them (model M), which is the common approach currently used in the epidemic literature to model age-mixing patterns. In this case, the necessary information includes knowing the age of both individuals participating in each contact, although this information can be easily summarized in an age-contact matrix, M, where each entry $M_{\alpha\beta}$ represents the average number of contacts from an individual in age group α to individuals in age group β. Note that in both models only the average number of contacts is used, in one case the average over the whole population and in the other over each age-group.

Another possibility is to use the whole contact distribution, Fig 1D, to build the contact network of the population. This formulation is commonly found in the network science literature since it highlights the role that the disproportionate number of contacts of some individuals has in the dynamics of the disease. A simple way of creating these networks is to represent each individual i as a node and extract its degree (number of contacts) from the distribution. Then, the expected number of edges between nodes i and j is $A_{ij} = k_i k_j / \sum_l k_l$ (model C). To obtain this expression, we can consider that each node i has $k_i$ stubs associated. Next, if these stubs are matched together randomly, the probability that each stub from node i ends up at one of the $k_j$ stubs of node j is $k_j$ over the total number of stubs, $\sum_l k_l$. This method is known as the configuration model.
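As an illustration of the expression above, the following minimal sketch computes the expected adjacency matrix of model C for a small degree sequence. The function name and the toy degrees are hypothetical and are not taken from the POLYMOD data; the sketch works with expected edge counts (self-pairs included) rather than sampling an actual stub matching.

```python
import numpy as np

def expected_adjacency(degrees):
    """Expected number of edges A_ij = k_i * k_j / sum_l k_l (configuration model)."""
    k = np.asarray(degrees, dtype=float)
    return np.outer(k, k) / k.sum()

# Hypothetical toy degree sequence (illustrative only).
degrees = [1, 2, 3, 4]
A = expected_adjacency(degrees)

# Summing the expected edges of node i over all j recovers its degree k_i.
row_sums = A.sum(axis=1)
```

In this expected-value form, each row of the matrix sums to the degree of the corresponding node, which is a quick consistency check on the construction.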

Lastly, we can combine both ingredients, the mixing patterns and the contact distribution of the population, in a network representation. To do so, we propose to arrange nodes in a multilayer network, in which each layer represents an age-group. As such, the first step to create this network is to extract the age associated with each node from the demographic structure of the population, Fig 1B, and assign each node to its corresponding layer (since we are working with 15 age-groups, our system is composed of that same number of layers). Then, the degree of each node should be extracted from the desired distribution. To incorporate the mixing patterns into the configuration model, we propose the following scheme:

  1. Given a node i located in layer α (where the layer represents the age-group associated with i), the probability that any of its stubs ends up at a node in any layer β (including the same layer) is $p_{\alpha\beta}$. This probability can be extracted from the mixing matrix as $p_{\alpha\beta} = M_{\alpha\beta} / \sum_\beta M_{\alpha\beta}$.

  2. The stub from node i will match the stub of node j, situated in layer β, with probability $k_j / \sum_{l \in \beta} k_l$, where the denominator indicates the sum over the degrees of all nodes present in layer β.

Hence, the expected number of edges between nodes i and j will be given by

$$A_{ij} = k_i \, p_{\alpha(i),\beta(j)} \, \frac{k_j}{\sum_{l \in \beta(j)} k_l}. \tag{1}$$

Yet, note that incorporating the mixing patterns introduces a restriction in the degree distribution. Indeed, one of the important properties of the mixing patterns matrix is that it has to verify reciprocity, i.e.,

$$M_{\alpha\beta} N_\alpha = M_{\beta\alpha} N_\beta. \tag{2}$$

That is, the number of contacts going from group α to group β has to be the same as the ones from β to α (if the populations of each group were equal, this would lead to a symmetric matrix). It is easy to see that Eq (1) only fulfills this property if

$$\sum_{l \in \alpha} k_l = \sum_\beta M_{\alpha\beta} N_\alpha. \tag{3}$$

And, thus,

$$\langle k_\alpha \rangle = \sum_\beta M_{\alpha\beta}, \tag{4}$$

where $\langle k_\alpha \rangle$ represents the average degree in layer α. Hence, even though the shape of the distribution can be chosen freely, the mixing matrix fixes the average degree of each layer. Eqs (1) and (4) completely define our last model, the CM model, see Fig 1A.
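The two-step scheme and the constraint of Eq (4) can be checked on a small example. The sketch below is only illustrative: the function name, the two-layer mixing matrix, the layer sizes and the degrees are hypothetical values chosen so that reciprocity, Eq (2), and the average-degree condition, Eq (4), both hold.

```python
import numpy as np

def multilayer_expected_adjacency(degrees, layers, M):
    """Expected edges A_ij = k_i * p[a(i), b(j)] * k_j / sum_{l in b(j)} k_l, as in Eq (1)."""
    k = np.asarray(degrees, dtype=float)
    layers = np.asarray(layers)
    M = np.asarray(M, dtype=float)
    # Step 1: probability that a stub from layer a lands in layer b.
    p = M / M.sum(axis=1, keepdims=True)
    # Step 2: total degree of each layer, used to pick a node inside the layer.
    layer_degree = np.bincount(layers, weights=k, minlength=M.shape[0])
    pick_node = k / layer_degree[layers]
    return k[:, None] * p[layers[:, None], layers[None, :]] * pick_node[None, :]

# Hypothetical toy system: 2 nodes in layer 0 and 4 nodes in layer 1.
M = [[4.0, 2.0], [1.0, 2.0]]   # M[0][1] * 2 == M[1][0] * 4, so Eq (2) holds
layers = [0, 0, 1, 1, 1, 1]
degrees = [4, 8, 1, 2, 3, 6]   # layer averages are 6 and 3, the row sums of M
A = multilayer_expected_adjacency(degrees, layers, M)

# Expected number of edges going from layer 0 to layer 1.
edges_0_to_1 = A[:2, 2:].sum()
```

Because the toy degrees satisfy Eq (4), the resulting matrix is symmetric, each row sums to the node degree, and the expected number of edges from layer 0 to layer 1 equals $M_{01} N_0$.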

Susceptible-infected-susceptible dynamics

To determine the consequences of each of the previous assumptions, we first consider a general susceptible-infected-susceptible (SIS) Markovian model [58, 59]. In this model, the recovery of each infected individual is modeled as a Poisson process with rate δ. In turn, each successful contact emanating from an infected individual (i.e., a contact that transmits the disease) is modeled as a Poisson process with rate λ. We denote by $Y_i$ the Bernoulli random variable that is equal to one if individual i is infected and zero otherwise. Complementarily, $X_i$ indicates whether individual i is susceptible, so that $Y_i + X_i = 1$ and $\sum_i (X_i + Y_i) = N$. The only ingredient left to be defined is how the contact process between individuals actually takes place. In general, in its exact formulation, we can do so by introducing the matrix A, which denotes whether two individuals can contact each other or not [58, 59]:

$$\frac{dY_i}{dt} = -\delta Y_i + \lambda X_i \sum_j A_{ij} Y_j. \tag{5}$$

With this formulation we can already study the spreading of an epidemic on any network, models C and CM. Indeed, assuming that the states are independent, i.e., $\langle Y_i Y_j \rangle = \langle Y_i \rangle \langle Y_j \rangle \equiv y_i y_j$, we get

$$\frac{dy_i}{dt} = -\delta y_i + \lambda x_i \sum_j A_{ij} y_j. \tag{6}$$

Considering that the nodes with the same degree are statistically equivalent, we can obtain the epidemic threshold using the heterogeneous mean field approximation [60],

$$\tau \equiv \frac{\lambda}{\delta} = \frac{\langle k \rangle}{\langle k^2 \rangle}. \tag{7}$$

This well-known result from network science clearly shows the importance of the heterogeneity of the contacts, since it depends on the second moment of the distribution. In the case of Italy, using this expression we obtain a theoretical threshold of $\tau_{CM} = 0.033$ and $\tau_C = 0.035$ for the CM and C models, respectively.
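To make the role of the threshold concrete, Eq (6) can be integrated numerically on a small network. The following is a minimal sketch using a forward-Euler scheme, with $x_i = 1 - y_i$; the function name, step size, number of steps and the four-node toy network are assumptions made for illustration and do not correspond to the simulations reported in the paper.

```python
import numpy as np

def integrate_sis_mean_field(A, lam, delta, y0, dt=0.01, steps=20000):
    """Forward-Euler integration of dy_i/dt = -delta*y_i + lam*(1 - y_i)*sum_j A_ij*y_j."""
    A = np.asarray(A, dtype=float)
    y = np.array(y0, dtype=float)
    for _ in range(steps):
        dy = -delta * y + lam * (1.0 - y) * (A @ y)
        # Keep the probabilities inside [0, 1] despite discretization error.
        y = np.clip(y + dt * dy, 0.0, 1.0)
    return y

# Hypothetical toy network: 4 equivalent nodes, each with expected degree 2,
# so that <k>/<k^2> = 2/4 = 0.5.
A = np.full((4, 4), 0.5)
y0 = np.full(4, 0.1)
y_above = integrate_sis_mean_field(A, lam=1.0, delta=1.0, y0=y0)   # lam/delta = 1.0
y_below = integrate_sis_mean_field(A, lam=0.25, delta=1.0, y0=y0)  # lam/delta = 0.25
```

For this homogeneous toy network, Eq (6) reduces to a single logistic-type equation whose nonzero stationary state is $y = 1 - \delta/(2\lambda)$, so the run above the threshold settles at 0.5 while the run below it decays to zero.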

For the M model, since individuals are indistinguishable, Eq (6) is rewritten as

$$\frac{dy_\alpha}{dt} = -\delta y_\alpha + \lambda x_\alpha \sum_\beta M_{\alpha\beta} y_\beta, \tag{8}$$

where $M_{\alpha\beta}$ is the matrix depicted in Fig 1C, and $y_\alpha$ the fraction of infected individuals in layer α. In this case, using the next generation approach [61, 62], the epidemic threshold is

$$\tau_M \equiv \frac{\lambda}{\delta} = \frac{1}{\rho(M)}. \tag{9}$$

Regarding Italy, the spectral radius of M is $\rho(M) = 22.51$, resulting in an epidemic threshold of $\tau_M = 0.044$.

Lastly, the equation governing the H model is

$$\frac{dy}{dt} = -\delta y + \lambda \langle k \rangle x y, \tag{10}$$

where the epidemic threshold is

$$\tau_H \equiv \frac{\lambda}{\delta} = \frac{1}{\langle k \rangle}. \tag{11}$$

According to Fig 1D, the epidemic threshold in our system is thus $\tau_H = 0.052$.

Thus, in this case, the following relation holds:

$$\tau_H > \tau_M > \tau_C > \tau_{CM}, \qquad 0.052 > 0.044 > 0.035 > 0.033. \tag{12}$$

Some observations are in order. First, even though the average number of contacts is the same in all models, the epidemic threshold is different in each of them. Besides, progressively adding heterogeneity to the model lowers the epidemic threshold. This is especially relevant when going from classical mixing models to network models. Indeed, when we introduce the whole contact distribution, we are indirectly adding the possibility of having super-spreading events, which, as noted before, is missing in the classical approaches. On the other hand, as expected, the difference between both network models is relatively small ($\tau_C/\tau_{CM} \approx 1.06$) since the main driver of the epidemic threshold is the contact distribution. Nonetheless, as we shall see next, for other scenarios, the multilayer framework will yield quite different results from model C.
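The closed-form thresholds of Eqs (7), (9) and (11) are straightforward to evaluate once the contact data are available. The sketch below applies them to a hypothetical two-group toy population (the matrix, degrees and function names are illustrative assumptions, not the Italian data), and also rechecks the arithmetic of the reported value $\tau_M = 1/22.51$.

```python
import numpy as np

def threshold_homogeneous(mean_degree):
    """Eq (11): tau_H = 1 / <k>."""
    return 1.0 / mean_degree

def threshold_age_mixing(M):
    """Eq (9): tau_M = 1 / rho(M), with rho the spectral radius."""
    eigenvalues = np.linalg.eigvals(np.asarray(M, dtype=float))
    return 1.0 / np.max(np.abs(eigenvalues))

def threshold_network(degrees):
    """Eq (7): tau = <k> / <k^2>."""
    k = np.asarray(degrees, dtype=float)
    return k.mean() / np.mean(k ** 2)

# Hypothetical toy population: 2 individuals in group 0, 4 in group 1.
M = [[4.0, 2.0], [1.0, 2.0]]   # satisfies reciprocity for sizes (2, 4)
degrees = [4, 8, 1, 2, 3, 6]   # overall mean degree is 4

tau_H = threshold_homogeneous(np.mean(degrees))
tau_M = threshold_age_mixing(M)
tau_C = threshold_network(degrees)

# Arithmetic check of the value quoted in the text for Italy.
tau_M_italy = 1.0 / 22.51
```

In this toy case the three thresholds follow the same ordering as in Eq (12), with the homogeneous value the largest and the network value the smallest; no claim is made here about how the thresholds are ordered for arbitrary data.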

To assess the quality of our theoretical analysis, our first step is to obtain the epidemic threshold for each configuration numerically. To do so, we create an artificial population of $10^6$ individuals and assign them an age according to the demographic structure of the Italian population [57]. Then, we simulate a stochastic SIS Markov model, with δ = 1 and multiple values of λ for each of the four contact models under consideration (see Materials and methods). In Fig 2A, we show the attack rate (total number of cases over the whole population) as a function of λ. The overall behavior of the four scenarios is qualitatively similar, although large differences are observed in the value of the epidemic threshold (see inset), as predicted.

Fig 2. Dynamics of a SIS model using different contact models.


A) The fraction of infected individuals as a function of the infection rate. In the inset, the area near the epidemic threshold for each configuration is shown enlarged. B) Susceptibility as a function of the infection rate for the four configurations with populations of size $10^4$, $10^5$ and $10^6$. The larger the size of the population the closer the peak of susceptibility is to the theoretical epidemic threshold (dashed line). C) Relative difference in the number of infected individuals between the results obtained using the M (purple circles), C (red squares) or CM (blue triangles) models and the homogeneous mixing setting. Positive values indicate that the number of infected individuals is larger than in the homogeneous mixing scenario, while negative values represent a lower number of infected individuals.

To properly characterize the value of the epidemic threshold and compare it with the theoretical expectations, we use the quasistationary state (QS) method [59, 63]. This technique allows computing the susceptibility of the system, which presents a peak at the epidemic threshold (see Materials and methods). The caveat is that it is highly dependent on the system size since the epidemic threshold is only properly defined for infinite systems. Nevertheless, in Fig 2B we compute the susceptibility, χ, for the four configurations with system sizes ranging from $10^4$ to $10^6$ individuals, and we can see that for the latter the peak of the susceptibility is already quite close to the predicted value of the epidemic threshold, validating our theoretical approach.

Next, we focus on studying the impact that the disease has on each age group under the different configurations, Fig 2C. We set the value of λ in each case so that the attack rate is equal to 0.4, since the four scenarios converge to that value for similar values of λ (see Fig 2A). Using the homogeneous mixing approximation, we obtain a distribution of infected individuals across ages proportional to the demographic structure of the population (Fig 1B), as one would expect given that all individuals are virtually indistinguishable for the dynamics. The same result is obtained for the C model, in which the age of the nodes is completely independent of the network structure. At variance with these results, if we incorporate the heterogeneous mixing patterns of the population either in the age-mixing (M) model or in the multilayer network (CM) setting, the incidence in each age group is quite different, see Fig 2C. Note that we have again set λ so that the overall incidence is 0.40 in all cases. This ensures that the total number of infected individuals is the same; only its distribution across age classes differs. Results show that in both scenarios the prevalence is much higher for teenagers and smaller for the older cohorts than in the homogeneous mixing model.

Susceptible-infected-removed dynamics

Although the SIS model facilitates the theoretical and numerical analysis of the system, especially near the epidemic threshold, it is too simplistic to model real diseases such as influenza-like illness (ILI). Thus, to highlight the impact of these observations on a more realistic scenario, we slightly modify the model by incorporating the removed compartment so that the dynamics are governed by a susceptible-infected-removed (SIR) model, which is better suited for studying ILI [64].

It has been recently shown that using a constant and group-independent basic reproduction number, $R_0$, might not describe well key features of the disease dynamics in realistic scenarios [28]. For this reason, we first explore the dependence of this parameter on the age of the individual in the two networked scenarios. To do so, we simply count the total number of newly infected individuals that a single seeded infectious subject would produce in a fully susceptible population over 108 simulations, with the value of λ set so that the average value of $R_0$ is 1.3, in line with typical values for influenza [65]. Fig 3A shows the value of $R_0$ as a function of the age of the seed node in the network in which all nodes have the same degree distribution. Clearly, the same $R_0$ value is obtained regardless of the age of the nodes, as it should be given that both their degree and their connections are independent of their age. Conversely, in the multilayer network where the mixing patterns of the population are incorporated, Fig 3B, the situation changes completely. The value of $R_0$ is above the average for teenagers and adults but below the average for the elderly, highlighting the importance of the underlying structure in the value of $R_0$.
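The counting procedure described above can be sketched for a single seed as follows. This is a hedged illustration rather than the simulation code of the paper: it assumes a continuous-time Markovian SIR process in which the seed stays infectious for an exponentially distributed time with rate δ and transmits along each of its edges at rate λ, and the function name, degree and parameter values are hypothetical. Under these assumptions the expected number of direct infections of a seed with degree k is $k\lambda/(\lambda+\delta)$, which the Monte Carlo estimate should approach.

```python
import numpy as np

def estimate_r0(degree, lam, delta, n_runs, rng):
    """Monte Carlo average of the neighbors directly infected by a single seed."""
    # Infectious period of the seed in each run (exponential with rate delta).
    infectious_period = rng.exponential(scale=1.0 / delta, size=n_runs)
    # Probability of transmitting along one edge before the seed recovers.
    p_transmit = 1.0 - np.exp(-lam * infectious_period)
    # All neighbors are susceptible, so infections along edges are independent
    # given the infectious period.
    secondary = rng.binomial(degree, p_transmit)
    return secondary.mean()

rng = np.random.default_rng(0)
degree, lam, delta = 10, 0.15, 1.0   # hypothetical values
r0_mc = estimate_r0(degree, lam, delta, n_runs=200_000, rng=rng)
r0_exact = degree * lam / (lam + delta)
```

In the C model the degree of the seed is independent of its age, so averaging this quantity by age gives a flat profile; in the CM model the average degree differs across layers, which is one way to read the age dependence shown in Fig 3B.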

Fig 3. Basic reproduction number in single layer and multilayer networks.


A) Measured value of R0 in the C model, where both the degree distribution and the connections are completely independent of the age assigned to the nodes. B) The measured value of R0 in the CM model, where the connection patterns follow the age mixing patterns of the population. In both cases the average R0 of the total population has been set to 1.3.

Lastly, we study the effect of vaccinating a fraction of the nodes before the epidemic begins. This sort of containment measure is among those that can benefit the most from knowledge about the structure of the population, as such knowledge allows devising more efficient vaccination strategies. First, we set the baseline scenario to values compatible with the 2018-2019 ILI epidemic in Italy. According to the World Health Organization, the total attack rate was 13.3%. Besides, an important fraction of the population was vaccinated preemptively. In Italy, vaccination is recommended for several groups of people, such as those with chronic medical conditions, firefighters, health care workers, or the elderly [66]. Of these groups, the only one that we can distinguish in our model is the elderly, but it is also the one with the largest vaccination rates. Unfortunately, the uptake of the vaccine has been decreasing for the past few years, and is now close to 50% [67]. Moreover, the effectiveness of the vaccine is estimated to be around 60%, yielding an effective vaccination rate of 30% in the elderly [68]. Hence, to obtain the baseline values in our model, we set 30% of the elderly in the recovered state initially and set the value of λ so that the attack rate is 13.3%, Fig 4A.

Fig 4. Effects of different vaccination strategies.


A) Attack rate under the standard vaccination adjusted to the values of the 2018-2019 ILI epidemic. B) Reduction of the attack rate when the vaccinated fraction is increased by 1% of the total population N, with the new vaccines applied randomly. C) Reduction of the attack rate when the vaccinated fraction is increased by 1% of the total population N, with the new vaccines targeted to highly infective individuals: the 15-19 age group in the M model and the nodes with larger degree in the two networked scenarios.

Our first observation is that in the C scheme, we trivially obtain a reduction in the attack rate among the elderly due to their vaccination, but otherwise, the incidence is the same in all age groups. On the other hand, both in the M and CM models, the attack rate depends highly on the age of the individual. To gauge the effect of increasing vaccination rates, we vaccinate 1% of the total population (assuming that the effectiveness is 60% for all age-groups). Note that since the elderly group represents 19% of the population, the initial vaccination rate was roughly 10% of the total population. If these new vaccines are administered randomly, we can see that the effect is just a homogeneous reduction of 5-6% in all age groups, independently of the model, Fig 4B.
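The baseline and incremental coverage figures quoted above follow from simple products of the stated percentages (50% uptake, 60% effectiveness, elderly share of 19%). The worked arithmetic below only restates those numbers; the last line, the effectively immunized share added by the extra 1%, is a derived quantity not quoted in the text.

```latex
\begin{align*}
\text{effective rate in the elderly} &= 0.50 \times 0.60 = 0.30,\\
\text{vaccinated share of the total population} &= 0.50 \times 0.19 = 0.095 \approx 10\%,\\
\text{effectively immunized share at baseline} &= 0.30 \times 0.19 = 0.057,\\
\text{effectively immunized share added by } 1\%\,N &= 0.01 \times 0.60 = 0.006.
\end{align*}
```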

Conversely, if that same amount of new vaccinations is targeted, the situation changes completely. In the M model, we vaccinate individuals belonging to the 15-19 age group since it is the one with the largest number of contacts and the highest attack rates. We can see that the overall reduction is much larger than in the previous case, and especially so in this particular group, see Fig 4C. In the C and CM models, instead, we apply the vaccines to individuals with the largest degrees. We can see that the reduction is larger in the C setting than in the CM one. This result might seem counter-intuitive since the same measure is applied to both systems. However, note that while in the C model the largest degrees are homogeneously distributed across the population, in the CM model they are concentrated in specific age groups, or layers. Furthermore, since nodes in the same layer tend to be connected to each other, the previous observation implies that the effect of removing hubs will be lower. To verify this, we have rewired the connections of the CM model while preserving the age, degree and vaccination status of each node. As we can see, in such a case we recover the same value as in the C model. In other words, the correlations induced by the age mixing patterns lower the effectiveness of this vaccination strategy. Note also that in both the random and the targeted vaccination schemes, the number of new vaccines introduced in the system is exactly the same; only who is vaccinated changes.
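The difference between the two allocation schemes in the networked models comes down to which nodes receive the same number of doses. A minimal sketch of that selection step is given below; the function name, the toy degree sequence and the 20% budget are hypothetical, and the sketch stops at choosing nodes without simulating the epidemic or the 60% effectiveness.

```python
import numpy as np

def choose_vaccinated(degrees, fraction, targeted, rng):
    """Return the indices of the nodes that receive a vaccine dose."""
    k = np.asarray(degrees)
    budget = int(round(fraction * len(k)))
    if targeted:
        # Highest-degree nodes first (stable sort keeps ties in index order).
        order = np.argsort(-k, kind="stable")
        return np.sort(order[:budget])
    # Random allocation: same number of doses, nodes drawn uniformly.
    return np.sort(rng.choice(len(k), size=budget, replace=False))

rng = np.random.default_rng(1)
degrees = [1, 5, 2, 9, 3, 7, 4, 8, 6, 10]   # hypothetical toy degrees
targeted_nodes = choose_vaccinated(degrees, 0.2, targeted=True, rng=rng)
random_nodes = choose_vaccinated(degrees, 0.2, targeted=False, rng=rng)
```

Both calls spend the same budget of two doses; in the targeted case they go to the nodes with degrees 9 and 10, while in the random case any pair of nodes may be selected.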

Discussion

Models can range from simple homogeneous mixing models to high-resolution approaches. The latter, even though it might provide better insights, is also much more data demanding. As a compromise between the two, network models can capture the heterogeneity of the population while keeping the amount of data necessary low. Nevertheless, most network approaches focus only on determining the role that the difference in the number of contacts of the population has on the impact of disease dynamics but ignore other types of heterogeneities such as the age mixing patterns.

We have shown that, to determine the epidemic threshold of the population properly, the heterogeneity in the number of contacts cannot be neglected, which makes the simple homogeneous approach and the homogeneous approach with age mixing patterns ill-suited for this purpose. In fact, a description that accounts for the contact heterogeneity but ignores the age mixing patterns of the population (the C model) captures the value of the epidemic threshold much better. Furthermore, we observe two different regimes in the attack rate as a function of the spreading rate. For low values of the spreading rate, individual heterogeneity plays a more important role, yielding larger attack rates than the homogeneous counterparts. However, above a certain value, the phenomenology reverses, i.e., larger attack rates are obtained for the homogeneous approaches than for the networked versions. The reason is that, in homogeneous models, an infected agent can contact everyone in the population, and thus it can keep infecting individuals even if the attack rate is high. When the network is taken into consideration, nodes can run out of susceptible individuals within their vicinity, which virtually prevents them from spreading the disease any further.

On the other hand, if we study the distribution of infected individuals across age cohorts, we can see that the C scheme is no longer valid, since it yields the same results as the simple homogeneous mixing approach. If the age mixing patterns are added to the model, either in the M or the CM scheme, a larger fraction of young individuals will be infected, while the incidence in older cohorts is reduced. Hence, even though the C approach can predict the value of the epidemic threshold fairly well, it cannot be used to study the spreading of diseases for which the age of the individuals matters beyond the epidemic threshold. Conversely, the multilayer network of the CM model can correctly describe both the epidemic threshold and the distribution of the disease across age groups. In other words, it combines the importance of individual heterogeneity with the inherent assortativity present in human interactions.

Individual heterogeneity also introduces important variations in the measured value of R0. This observation is quite important since it shows that, for the proper evaluation of R0 during emerging diseases, the sampling of the population has to be done carefully. Biases in the sampled individuals, such as having too many young individuals, could lead to estimates of R0 much larger than its actual value. Moreover, this is not limited to the age of the individuals, since we have also seen the importance of individual heterogeneity in the dynamics. Most importantly, if the sampled individuals have, on average, more contacts than the general population, the estimates of R0 will also be higher.

Lastly, we have also observed the crucial role that heterogeneity plays if we want to devise efficient vaccination strategies. The role of networks in this regard is known to be important, not only because there are tools that allow identifying the most important individuals, but also because they provide a clear way to study herd immunity. Yet, if we do not take into account the contact distribution of the population, the effectiveness of vaccination campaigns will be lower. Conversely, if we rely simply on the contact distribution of the population and disregard the mixing patterns, we would overestimate the effect of vaccination.

As the current COVID-19 pandemic has shown, accounting for both the age and the contact heterogeneity of individuals is crucial to control the epidemic. The exact role that age plays in this disease is still unknown, although preliminary results show that children are less susceptible and that the case fatality rate for older individuals is much higher. Similarly, large super-spreading events are possible, such as the ones detected in South Korea, Boston or Spain [19, 69]. The latter country is also among those most affected by the current epidemic, but empirical information about the age mixing patterns of its population is not available [46, 69]. Thus, to the inherent problems of forecasting the evolution of an emerging disease [70, 71] we have to add our ignorance about these factors which, as we have shown in this article, can substantially modify the predictions. This highlights once again the importance of obtaining precise information about the behavior of the population to enhance our preparedness for this type of event.

To sum up, we have shown the importance that individual heterogeneities have for the spreading of infectious diseases. Yet, although in general the more details in the model the better, it is also important to take into account the inherent limitations of the data that currently exist. Therefore, it is crucial to correctly gauge what can and cannot be done, given the information available to us. In particular, we have shown that to predict the epidemic threshold, it is indispensable to know the degree distribution of the population. Nonetheless, this is not strictly needed to evaluate the impact of a disease away from the threshold. Yet, adding this information, even though it does not dramatically change the predicted outcomes of the epidemic under normal conditions, could be pivotal to devise efficient vaccination strategies. Furthermore, we have seen that the underlying information of the system also has an impact on quantities that are commonly measured and used in real settings, such as R0, implying that care must be taken when extrapolating the results from one study to another.

Materials and methods

Model

In all cases, we consider populations of $10^6$ individuals. In the H model, since individuals are indistinguishable, the impact of the disease over the age groups is computed by randomly extracting values from the demographic distribution of Italy in 2005 [57]. In the M model, the size of each age group is computed using the same procedure. In addition, the age-mixing matrix was corrected so that reciprocity is fulfilled and the average connectivity is exactly 19.40 [50]. In the C model, we randomly extract the degree of each node from a right-censored negative binomial distribution fitted to the survey data from POLYMOD [14]. Then, links are sampled by performing a Bernoulli trial over each pair of nodes, respecting that $A_{ij} = k_i k_j / \sum_l k_l$. A similar procedure is followed to create the multilayer network with age mixing patterns, but in this case each layer has its own parameters for the negative binomial distribution, according to the data (see Fig 1D), and the probability of establishing a link respects $A_{ij} = p_{\alpha(i),\beta(j)} k_i k_j / \sum_{l \in \alpha(j)} k_l$, where $p_{\alpha(i),\beta(j)}$ is the probability that a link from a node with the same age as node $i$ ends up at a node with the same age as node $j$, and $\alpha(j)$ is the layer to which $j$ belongs. We remark that the network is simplified by removing multiple edges.
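
The link-sampling step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the generator used in the study: the negative binomial parameters (`mean=19.4`, `dispersion=2.0`), the small population size, and the function names are hypothetical, the right-censoring of the fitted distribution is not modeled, and the mixing matrix `p` is assumed to satisfy reciprocity so that one Bernoulli trial per unordered pair is meaningful. With a single layer and `p = [[1.0]]` the rule reduces to $k_i k_j / \sum_l k_l$.

```python
import math
import random


def sample_degree(rng, mean, dispersion):
    """Draw a negative binomial degree as a gamma-Poisson mixture."""
    lam = rng.gammavariate(dispersion, mean / dispersion)
    threshold, k, prod = math.exp(-lam), 0, 1.0
    while True:  # Knuth's Poisson sampler, adequate for moderate lam
        prod *= rng.random()
        if prod <= threshold:
            return k
        k += 1


def sample_links(degrees, layer, p, rng):
    """One Bernoulli trial per pair (i, j) with i < j.

    The success probability is p[a_i][a_j] * k_i * k_j / K[a_j], where a_i is
    the layer (age group) of node i and K[a] is the total degree of layer a.
    Probabilities above 1 are clipped, and each pair is tried only once, so
    the result has no multiple edges.
    """
    layer_total = [0] * len(p)
    for k, a in zip(degrees, layer):
        layer_total[a] += k
    edges = []
    for i in range(len(degrees)):
        for j in range(i + 1, len(degrees)):
            total = layer_total[layer[j]]
            if total == 0:
                continue  # nobody in that layer has any contacts
            prob = p[layer[i]][layer[j]] * degrees[i] * degrees[j] / total
            if rng.random() < min(1.0, prob):
                edges.append((i, j))
    return edges


# Single-layer (C-like) toy example; all numbers are illustrative.
rng = random.Random(42)
n_nodes = 300
degrees = [sample_degree(rng, mean=19.4, dispersion=2.0) for _ in range(n_nodes)]
edges = sample_links(degrees, layer=[0] * n_nodes, p=[[1.0]], rng=rng)
mean_target = sum(degrees) / n_nodes
mean_realized = 2 * len(edges) / n_nodes
```

The loop over all pairs costs $O(N^2)$ trials, which is fine for a toy network but would need a faster sampling scheme at the population sizes quoted above. For a multilayer example, `layer` would hold the age group of each node and `p` the row-normalized age-mixing probabilities.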

Epidemic threshold

Close to the critical point, the fluctuations of the system are often large, driving the system to the absorbing state [59, 63]. To avoid this problem, the quasistationary state (QS) method stores $M$ active configurations previously visited by the dynamics. At each step, with probability $p_r$, the current configuration (as long as it is active) replaces one of the $M$ stored ones. Then, if the system tries to visit an absorbing state, the whole configuration is substituted by one of the stored ones. The system evolves for a relaxation time, $t_r$, and then the distribution of the number of infected individuals, $p_n$, is obtained during a sampling time $t_a$. Lastly, the threshold is estimated by locating the peak of the modified susceptibility $\chi = N(\langle\rho^2\rangle - \langle\rho\rangle^2)/\langle\rho\rangle$, where $\langle\rho^k\rangle$ is the $k$-th moment of the distribution of the number of infected individuals, $p_n$ (note that $\langle\rho^k\rangle = \sum_n n^k p_n$). In our analysis, the number of stored configurations and the probability of replacing one of them are fixed to $M = 100$ and $p_r = 0.01$, while the relaxation and sampling times vary in a range depending on the size of the system, $t_r = 10^4$ to $10^6$ and $t_a = 10^5$ to $10^7$.
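
As a rough illustration of the two pieces described above, the following Python sketch implements a store of active configurations and the modified susceptibility. It is a sketch under assumptions rather than the authors' implementation: the class and function names are hypothetical, the way the store is initially filled is not specified in the text and is chosen here for simplicity, and the density is taken as $\rho = n/N$, which differs from moments taken directly over $n$ only by a constant factor at fixed $N$ and therefore does not move the peak.

```python
import random


def modified_susceptibility(p_n, n_nodes):
    """chi = N (<rho^2> - <rho>^2) / <rho>, assuming rho = n / N.

    p_n[n] is the (normalized) probability of observing n infected nodes
    during the sampling time.
    """
    m1 = sum((n / n_nodes) * p for n, p in enumerate(p_n))
    m2 = sum((n / n_nodes) ** 2 * p for n, p in enumerate(p_n))
    return n_nodes * (m2 - m1 ** 2) / m1


class QSReservoir:
    """Store of up to `size` active configurations for the QS method."""

    def __init__(self, size=100, p_replace=0.01, seed=0):
        self.size = size
        self.p_replace = p_replace
        self.rng = random.Random(seed)
        self.configs = []

    def update(self, infected):
        """Call once per step with the current set of infected nodes.

        Returns the configuration the dynamics should continue from.
        """
        if not infected:
            # Absorbing state reached: restart from a stored configuration.
            if not self.configs:
                raise RuntimeError("no active configuration stored yet")
            return set(self.rng.choice(self.configs))
        if len(self.configs) < self.size:
            # Assumed initialization: fill the store with the first visits.
            self.configs.append(frozenset(infected))
        elif self.rng.random() < self.p_replace:
            slot = self.rng.randrange(self.size)
            self.configs[slot] = frozenset(infected)
        return infected


# Toy check with a made-up distribution over n = 0, 1, 2 infected nodes.
chi = modified_susceptibility([0.0, 0.5, 0.5], n_nodes=2)
reservoir = QSReservoir(size=2, p_replace=1.0)
kept = reservoir.update({1, 2})
restored = reservoir.update(set())
```

In a full simulation, `update` would be called after every step of the epidemic dynamics, $p_n$ would be accumulated only after the relaxation time, and `modified_susceptibility` would be evaluated for a range of spreading rates to locate its peak.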

Data Availability

All data used in this paper come from the POLYMOD study, which was published with the title "Social Contacts and Mixing Patterns Relevant to the Spread of Infectious Diseases" and is available at doi.org/10.1371/journal.pmed.0050074.

Funding Statement

YM acknowledges partial support from the Government of Aragon, Spain, through grant E36-17R (FENOL), and by MINECO and FEDER funds (FIS2017-87519-P). AA, GFdA, and YM acknowledge support from Intesa Sanpaolo Innovation Center. The funders had no role in study design, data collection, and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. VanderWaal KL, Ezenwa VO. Heterogeneity in pathogen transmission: mechanisms and methodology. Functional Ecology. 2016;30(10):1606–1622. doi:10.1111/1365-2435.12645
  • 2. Merler S, Ajelli M. The role of population heterogeneity and human mobility in the spread of pandemic influenza. Proceedings of the Royal Society B: Biological Sciences. 2009;277(1681):557–565. doi:10.1098/rspb.2009.1605
  • 3. Machens A, Gesualdo F, Rizzo C, Tozzi AE, Barrat A, Cattuto C. An infectious disease model on empirical networks of human contact: bridging the gap between dynamic network data and contact matrices. BMC Infectious Diseases. 2013;13(1). doi:10.1186/1471-2334-13-185
  • 4. Dorjee S, Poljak Z, Revie CW, Bridgland J, McNab B, Leger E, et al. A Review of Simulation Modelling Approaches Used for the Spread of Zoonotic Influenza Viruses in Animal and Human Populations. Zoonoses and Public Health. 2012;60(6):383–411. doi:10.1111/zph.12010
  • 5. Keeling M, Rohani P. Modeling infectious diseases in humans and animals. Princeton: Princeton University Press; 2008.
  • 6. Winkelstein W. Sexual Practices and Risk of Infection by the Human Immunodeficiency Virus. JAMA. 1987;257(3):321.
  • 7. Hethcote HW, Yorke JA. Gonorrhea Transmission Dynamics and Control. Springer; Berlin Heidelberg; 1984. Available from: 10.1007/978-3-662-07544-9.
  • 8. Woolhouse MEJ, Dye C, Etard JF, Smith T, Charlwood JD, Garnett GP, et al. Heterogeneities in the transmission of infectious agents: Implications for the design of control programs. Proceedings of the National Academy of Sciences. 1997;94(1):338–342. doi:10.1073/pnas.94.1.338
  • 9. May RM, Anderson RM. Transmission dynamics of HIV infection. Nature. 1987;326(6109):137–142. doi:10.1038/326137a0
  • 10. Dietz K. Models for Vector-Borne Parasitic Diseases. In: Vito Volterra Symposium on Mathematical Models in Biology. Springer; Berlin Heidelberg; 1980. p. 264–277. Available from: 10.1007/978-3-642-93161-1_15.
  • 11. Anderson RM, May RM. Age-related changes in the rate of disease transmission: implications for the design of vaccination programmes. Journal of Hygiene. 1985;94(3):365–436. doi:10.1017/s002217240006160x
  • 12. Johnson AM, Wadsworth J, Wellings K, Bradshaw S, Field J. Sexual lifestyles and HIV risk. Nature. 1992;360(6403):410–412. doi:10.1038/360410a0
  • 13. ACSF. AIDS and sexual behaviour in France. Nature. 1992;360(6403):407–409. doi:10.1038/360407a0
  • 14. Mossong J, Hens N, Jit M, Beutels P, Auranen K, Mikolajczyk R, et al. Social Contacts and Mixing Patterns Relevant to the Spread of Infectious Diseases. PLOS Medicine. 2008;5(3):1–1. doi:10.1371/journal.pmed.0050074
  • 15. Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang D. Complex networks: Structure and dynamics. Physics Reports. 2006;424(4-5):175–308. doi:10.1016/j.physrep.2005.10.009
  • 16. Newman MEJ. Networks. Oxford, United Kingdom New York, NY, United States of America: Oxford University Press; 2018.
  • 17. Lloyd-Smith JO, Schreiber SJ, Kopp PE, Getz WM. Superspreading and the effect of individual variation on disease emergence. Nature. 2005;438(7066):355–359. doi:10.1038/nature04153
  • 18. Meyers LA, Pourbohloul B, Newman MEJ, Skowronski DM, Brunham RC. Network theory and SARS: predicting outbreak diversity. Journal of Theoretical Biology. 2005;232(1):71–81. doi:10.1016/j.jtbi.2004.07.026
  • 19. Althouse BM, Wenger EA, Miller JC, Scarpino SV, Allard A, Hébert-Dufresne L, et al. Stochasticity and heterogeneity in the transmission dynamics of SARS-CoV-2. https://covid.idmod.org/data/Stochasticity_heterogeneity_transmission_dynamics_SARS-CoV-2.pdf.
  • 20. Craft ME. Infectious disease transmission and contact networks in wildlife and livestock. Philosophical Transactions of the Royal Society B: Biological Sciences. 2015;370(1669):20140107. doi:10.1098/rstb.2014.0107
  • 21. Salathé M, Kazandjieva M, Lee JW, Levis P, Feldman MW, Jones JH. A high-resolution human contact network for infectious disease transmission. Proc Natl Acad Sci USA. 2010;107(51):22020–22025. doi:10.1073/pnas.1009094108
  • 22. Wang Z, Bauch CT, Bhattacharyya S, d’Onofrio A, Manfredi P, Perc M, et al. Statistical physics of vaccination. Physics Reports. 2016;664:1–113.
  • 23. Keeling MJ, Eames KTD. Networks and epidemic models. Journal of The Royal Society Interface. 2005;2(4):295–307. doi:10.1098/rsif.2005.0051
  • 24. Ferguson NM, Cummings DAT, Cauchemez S, Fraser C, Riley S, Meeyai A, et al. Strategies for containing an emerging influenza pandemic in Southeast Asia. Nature. 2005;437(7056):209–214. doi:10.1038/nature04017
  • 25. Ferguson NM, Cummings DAT, Fraser C, Cajka JC, Cooley PC, Burke DS. Strategies for mitigating an influenza pandemic. Nature. 2006;442(7101):448–452. doi:10.1038/nature04795
  • 26. Germann TC, Kadau K, Longini IM, Macken CA. Mitigation strategies for pandemic influenza in the United States. Proceedings of the National Academy of Sciences. 2006;103(15):5935–5940. doi:10.1073/pnas.0601266103
  • 27. Zhang Q, Sun K, Chinazzi M, y Piontti AP, Dean NE, Rojas DP, et al. Spread of Zika virus in the Americas. Proceedings of the National Academy of Sciences. 2017;114(22):E4334–E4343. doi:10.1073/pnas.1620161114
  • 28. Liu QH, Ajelli M, Aleta A, Merler S, Moreno Y, Vespignani A. Measurability of the epidemic reproduction number in data-driven contact networks. Proceedings of the National Academy of Sciences. 2018;115(50):12680–12685. doi:10.1073/pnas.1811115115
  • 29. Litvinova M, Liu QH, Kulikov ES, Ajelli M. Reactive school closure weakens the network of social interactions and reduces the spread of influenza. Proceedings of the National Academy of Sciences. 2019;116(27):13174–13181. doi:10.1073/pnas.1821298116
  • 30. Bioglio L, Génois M, Vestergaard CL, Poletto C, Barrat A, Colizza V. Recalibrating disease parameters for increasing realism in modeling epidemics in closed settings. BMC Infectious Diseases. 2016;16(1). doi:10.1186/s12879-016-2003-3
  • 31. Starnini M, Lepri B, Baronchelli A, Barrat A, Cattuto C, Pastor-Satorras R. Robust Modeling of Human Contact Networks Across Different Scales and Proximity-Sensing Techniques. In: Lecture Notes in Computer Science. Springer International Publishing; 2017. p. 536–551. Available from: 10.1007/978-3-319-67217-5_32.
  • 32. Mastrandrea R, Fournet J, Barrat A. Contact Patterns in a High School: A Comparison between Data Collected Using Wearable Sensors, Contact Diaries and Friendship Surveys. PLOS ONE. 2015;10(9):e0136497. doi:10.1371/journal.pone.0136497
  • 33. Génois M, Vestergaard CL, Fournet J, Panisson A, Bonmarin I, Barrat A. Data on face-to-face contacts in an office building suggest a low-cost vaccination strategy based on community linkers. Network Science. 2015;3(3):326–347. doi:10.1017/nws.2015.10
  • 34. Isella L, Stehlé J, Barrat A, Cattuto C, Pinton JF, den Broeck WV. What’s in a crowd? Analysis of face-to-face behavioral networks. Journal of Theoretical Biology. 2011;271(1):166–180. doi:10.1016/j.jtbi.2010.11.033
  • 35. Barrat A, Cattuto C, Colizza V, Gesualdo F, Isella L, Pandolfi E, et al. Empirical temporal networks of face-to-face human interactions. The European Physical Journal Special Topics. 2013;222(6):1295–1309. doi:10.1140/epjst/e2013-01927-7
  • 36. Valle SYD, Hyman JM, Hethcote HW, Eubank SG. Mixing patterns between age groups in social networks. Social Networks. 2007;29(4):539–554. doi:10.1016/j.socnet.2007.04.005
  • 37. Kouokam E, Zucker JD, Fondjo F, Choisy M. Disease Control in Age Structure Population. ISRN Epidemiology. 2013;2013:1–8. doi:10.5402/2013/703230
  • 38. Anderson RM, May RM. Infectious Diseases of Humans: Dynamics and Control. Oxford: Oxford University Press; 1991.
  • 39. Ogunjimi B, Hens N, Goeyvaerts N, Aerts M, Damme PV, Beutels P. Using empirical social contact data to model person to person infectious disease transmission: An illustration for varicella. Mathematical Biosciences. 2009;218(2):80–87. doi:10.1016/j.mbs.2008.12.009
  • 40. Marziano V, Poletti P, Guzzetta G, Ajelli M, Manfredi P, Merler S. The impact of demographic changes on the epidemiology of herpes zoster: Spain as a case study. Proceedings of the Royal Society B: Biological Sciences. 2015;282(1804):20142509. doi:10.1098/rspb.2014.2509
  • 41. Magpantay FMG, King AA, Rohani P. Age-structure and transient dynamics in epidemiological systems. Journal of The Royal Society Interface. 2019;16(156):20190151. doi:10.1098/rsif.2019.0151
  • 42. Zhou L, Wang Y, Xiao Y, Li MY. Global dynamics of a discrete age-structured SIR epidemic model with applications to measles vaccination strategies. Mathematical Biosciences. 2019;308:27–37. doi:10.1016/j.mbs.2018.12.003
  • 43. Marziano V, Poletti P, Trentini F, Melegaro A, Ajelli M, Merler S. Parental vaccination to reduce measles immunity gaps in Italy. eLife. 2019;8. doi:10.7554/eLife.44942
  • 44. Rohani P, Zhong X, King AA. Contact Network Structure Explains the Changing Epidemiology of Pertussis. Science. 2010;330(6006):982–985. doi:10.1126/science.1194134
  • 45. Arregui S, Iglesias MJ, Samper S, Marinova D, Martin C, Sanz J, et al. Data-driven model for the assessment of Mycobacterium tuberculosis transmission in evolving demographic structures. Proceedings of the National Academy of Sciences. 2018;115(14):E3238–E3245. doi:10.1073/pnas.1720606115
  • 46. Hoang T, Coletti P, Melegaro A, Wallinga J, Grijalva CG, Edmunds JW, et al. A Systematic Review of Social Contact Surveys to Inform Transmission Models of Close-contact Infections. Epidemiology. 2019;30(5):723–736. doi:10.1097/EDE.0000000000001047
  • 47. Fumanelli L, Ajelli M, Manfredi P, Vespignani A, Merler S. Inferring the Structure of Social Contacts from Demographic Data in the Analysis of Infectious Diseases Spread. PLoS Computational Biology. 2012;8(9):e1002673. doi:10.1371/journal.pcbi.1002673
  • 48. Prem K, Cook AR, Jit M. Projecting social contact matrices in 152 countries using contact surveys and demographic data. PLOS Computational Biology. 2017;13(9):e1005697. doi:10.1371/journal.pcbi.1005697
  • 49. Mistry D, Litvinova M, Piontti APy, Chinazzi M, Fumanelli L, Gomes MFC, et al. Inferring high-resolution human mixing patterns for disease modeling. arXiv. 2020.
  • 50. Arregui S, Aleta A, Sanz J, Moreno Y. Projecting social contact matrices to different demographic structures. PLOS Computational Biology. 2018;14(12):e1006638. doi:10.1371/journal.pcbi.1006638
  • 51. Nguyen VK, Mikolajczyk R, Hernandez-Vargas EA. High-resolution epidemic simulation using within-host infection and contact data. BMC Public Health. 2018;18(1). doi:10.1186/s12889-018-5709-x
  • 52. Pastor-Satorras R, Vespignani A. Epidemic Spreading in Scale-Free Networks. Physical Review Letters. 2001;86(14):3200–3203. doi:10.1103/PhysRevLett.86.3200
  • 53. Chen S, Small M, Tao Y, Fu X. Transmission Dynamics of an SIS Model with Age Structure on Heterogeneous Networks. Bulletin of Mathematical Biology. 2018;80(8):2049–2087. doi:10.1007/s11538-018-0445-z
  • 54. Alexander ME, Kobes R. Effects of vaccination and population structure on influenza epidemic spread in the presence of two circulating strains. BMC Public Health. 2011;11(S1). doi:10.1186/1471-2458-11-S1-S8
  • 55. Miller JC, Volz EM. Incorporating Disease and Population Structure into Models of SIR Disease in Contact Networks. PLoS ONE. 2013;8(8):e69162. doi:10.1371/journal.pone.0069162
  • 56. Liccardo A, Fierro A. Multiple Lattice Model for Influenza Spreading. PLOS ONE. 2015;10(10):e0141065. doi:10.1371/journal.pone.0141065
  • 57. United Nations, Department of Economic and Social Affairs, Population Division (2019). World Population Prospects 2019, custom data acquired via website. https://population.un.org/wpp/DataQuery/.
  • 58. Van Mieghem P, Omic J, Kooij R. Virus Spread in Networks. IEEE/ACM Trans Netw. 2009;17(1):1–14. doi:10.1109/TNET.2008.925623
  • 59. de Arruda GF, Rodrigues FA, Moreno Y. Fundamentals of spreading processes in single and multilayer complex networks. Physics Reports. 2018;756:1–59. doi:10.1016/j.physrep.2018.06.007
  • 60. Barrat A, Barthélemy M, Vespignani A. Dynamical Processes on Complex Networks. 1st ed. New York, NY, USA: Cambridge University Press; 2008.
  • 61. Diekmann O, Heesterbeek JAP, Roberts MG. The construction of next-generation matrices for compartmental epidemic models. Journal of The Royal Society Interface. 2009;7(47):873–885. doi:10.1098/rsif.2009.0386
  • 62. Luca GD, Kerckhove KV, Coletti P, Poletto C, Bossuyt N, Hens N, et al. The impact of regular school closure on seasonal influenza epidemics: a data-driven spatial transmission model for Belgium. BMC Infectious Diseases. 2018;18(1):29. doi:10.1186/s12879-017-2934-3
  • 63. Ferreira SC, Castellano C, Pastor-Satorras R. Epidemic thresholds of the susceptible-infected-susceptible model on networks: A comparison of numerical and theoretical results. Physical Review E. 2012;86(4). doi:10.1103/PhysRevE.86.041125
  • 64. Apolloni A, Poletto C, Colizza V. Age-specific contacts and travel patterns in the spatial spread of 2009 H1N1 influenza pandemic. BMC Infectious Diseases. 2013;13(1). doi:10.1186/1471-2334-13-176
  • 65. Biggerstaff M, Cauchemez S, Reed C, Gambhir M, Finelli L. Estimates of the reproduction number for seasonal, pandemic, and zoonotic influenza: a systematic review of the literature. BMC Infectious Diseases. 2014;14(1). doi:10.1186/1471-2334-14-480
  • 66. Seasonal influenza vaccination and antiviral use in EU/EEA Member States. European Centre for Disease Prevention and Control; 2018.
  • 67. Manzoli L, Gabutti G, Siliquini R, Flacco ME, Villari P, Ricciardi W. Association between vaccination coverage decline and influenza incidence rise among Italian elderly. European Journal of Public Health. 2018;28(4):740–742. doi:10.1093/eurpub/cky053
  • 68. de Waure C, Boccalini S, Bonanni P, Amicizia D, Poscia A, Bechini A, et al. Adjuvanted influenza vaccine for the Italian elderly in the 2018/19 season: an updated health technology assessment. European Journal of Public Health. 2019;29(5):900–905. doi:10.1093/eurpub/ckz041
  • 69. Aleta A, Moreno Y. Evaluation of the potential incidence of COVID-19 and effectiveness of containment measures in Spain: a data-driven approach. BMC Med. 2020;18(1):1–12. doi:10.1186/s12916-020-01619-5
  • 70. Perc M, Gorišek Miksić N, Slavinec M, Stožer A. Forecasting COVID-19. Front Phys. 2020;8. doi:10.3389/fphy.2020.00127
  • 71. Castro M, Ares S, Cuesta JA, Manrubia S. Predictability: Can the turning point and end of an expanding epidemic be precisely forecast? arXiv. 2020.
PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008035.r001

Decision Letter 0

Rob J De Boer, Benjamin Althouse

6 May 2020

Dear Dr. Aleta,

Thank you very much for submitting your manuscript "Data-driven contact structures: from homogeneous mixing to multilayer networks" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, providing that you modify the manuscript according to the review recommendations.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. 

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Benjamin Althouse

Associate Editor

PLOS Computational Biology

Rob De Boer

Deputy Editor

PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: In "Data-driven contact structures: from homogeneous mixing to multilayer networks" authors bridge for the first time an important gap that has existed until now between data analysis and contact structures derived from this data, namely the simultaneous determination of both, the connectivity of individuals and the age-structure of the population. This advance has important implications for the mathematical modeling of infectious diseases, as beautifully shown in the paper. In particular, authors compare four different scenarios that gradually leads to taking into account a multilayer representation in which both the social mixing and the number of contacts are included in the model. It is shown analytically that the thresholds obtained for these four scenarios are different, and that indeed only the most comprehensive framework, as presented in the paper, allows for the correct determination of the epidemic threshold. This is further supported by systematic simulations, confirming further that heterogeneities in the contact network are vastly important and must not be overlooked if we wish for a proper determination of the epidemic threshold. It is also shown that the age-structure, which is likewise determined by the new approach, plays a bigger role beyond the onset of the outbreak.

An accurate determination of epidemic thresholds in contact networks is of huge importance, both for mitigating and predicting epidemic spreading, as well as for devising effective vaccination strategies. This research points out clearly for the first time that, when it comes to the evaluation of interventions such as vaccination, both sources of individual heterogeneity are important and should be considered jointly. This was an important open problem in the realm of an intensely investigated subject with obvious practical ramifications. By introducing a clever new approach based on empirical data and network science, this study thus fills an incredibly important gap that bridges the divide, and it reveals just how wrong one could be by neglecting or not having access to all the needed information in terms of connectivity and age of the population.

The paper is well-written, comprehensive, and clear. I find it is among the finest papers that I have had the pleasure of reading in the recent past. The motivation behind the approach and the insights it affords towards improving spreading of communicable diseases is genius, and as such it will surely not fail to impress the diverse readership of PLOS Computational Biology. For these reasons, I warmly recommend publication.

It is quite a challenge to suggest improvements for such an excellent contribution. Perhaps a reference to the current COVID-19 pandemic and how the approach could improve forecasting, as studied in "Forecasting COVID-19", Front. Phys. 8, 127 (2020), would be worthwhile. Apart from this, I can only reiterate my overall very positive impressions and congratulate the authors on a fine contribution.

Reviewer #2: This manuscript proposes a model that considers two aspects: (i) the number of contacts of individuals and (ii) age-dependent contact patterns. Four different versions of the model are introduced, depending on whether the heterogeneity of each aspect is considered. The performance of the models is examined with SIS and SIR epidemic models based on human contact data from Italy. I think the proposed model is interesting. However, this work seems only halfway done and demands a more systematic evaluation over a wide parameter space. The main results presented in the work are based on a specific data set, and the conclusions may not be general. Thus, I could not recommend its publication in PLOS Computational Biology.

Following are some points for the notice of the authors:

1. The most results shown in this manuscript are based on the parameters induced from the data of Italy. A systematic investigation of the model is recommended from which general properties pertain to the model need to be abstracted and the results of empirical networks could serve as a potent examples.

2. Many concepts and legends in the figures are not well defined. For example, what is the "Frequency" in Fig. 1D, and should the "number of contacts" be "age"? What is the 'X' in Fig. 2B? What is the "Relative Difference (%)" in Fig. 2C?

3. Some results in the figures do not seem sufficiently reliable. For example, human contacts are generally symmetric and reciprocal. However, the gray plot in Fig. 1C looks asymmetric.

4. In addition, in Fig. 3B, what is the definition of R0 as a function of age and, more importantly, how is this age-dependent R0 obtained? Are the results in Fig. 3B theoretical or numerical? If numerical, what is the number of realizations in the simulation and what are the sizes of the error bars?

5. The results in Fig. 2C seem to be obtained from the contact pattern in Fig. 1C. However, this is not sufficiently clarified in the manuscript. Since this result is presented together with panels A and B, confusion can easily arise.

6. Why is vaccination studied only in the SIR model? Since the SIS model is also addressed in this work, the study of vaccination should also be applied to the SIS model for completeness.

7. Some arguments are difficult to understand, e.g. "To gauge the effect ..." in line 300.

8. As mentioned, the results of this manuscript are largely based on a specific dataset. Therefore, the observations from the data may not support a general conclusion. Thus, I do not think that arguments stated in general terms without mentioning specific conditions, for example "This result might seem ..." in line 312, can explain the specific observations above them, since under other conditions the observations could be essentially different.

9. "Attack rate" does not appear to be a common term in complex network epidemiology.

There are also a few other points that need to be addressed, but those listed here are enough for me to conclude that this manuscript does not meet the standard of PLOS Computational Biology.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Computational Biology data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: None

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Figure Files:

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, log in and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:

Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:

To enhance the reproducibility of your results, PLOS recommends that you deposit laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see http://journals.plos.org/ploscompbiol/s/submission-guidelines#loc-materials-and-methods

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008035.r003

Decision Letter 1

Rob J De Boer, Benjamin Althouse

9 Jun 2020

Dear Dr. Aleta,

We are pleased to inform you that your manuscript 'Data-driven contact structures: from homogeneous mixing to multilayer networks' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.

Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology. 

Best regards,

Benjamin Althouse

Associate Editor

PLOS Computational Biology

Rob De Boer

Deputy Editor

PLOS Computational Biology

***********************************************************

PLoS Comput Biol. doi: 10.1371/journal.pcbi.1008035.r004

Acceptance letter

Rob J De Boer, Benjamin Althouse

9 Jul 2020

PCOMPBIOL-D-20-00448R1

Data-driven contact structures: from homogeneous mixing to multilayer networks

Dear Dr Aleta,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Sarah Hammond

PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    Attachment

    Submitted filename: ResponseReviewers.pdf

    Data Availability Statement

    All data used in this paper comes from the POLYMOD study, which was published with the title "Social Contacts and Mixing Patterns Relevant to the Spread of Infectious Diseases" and is available at doi.org/10.1371/journal.pmed.0050074.


    Articles from PLoS Computational Biology are provided here courtesy of PLOS
