2019 Apr 10;78(7):2369–2398. doi: 10.1007/s00285-019-01347-2

SIR epidemics and vaccination on random graphs with clustering

Carolina Fransson, Pieter Trapman
PMCID: PMC6534529  PMID: 30972440

Abstract

In this paper we consider Susceptible Infectious Recovered (SIR) epidemics on random graphs with clustering. To incorporate group structure of the underlying social network, we use a generalized version of the configuration model in which each node is a member of a specified number of triangles. SIR epidemics on this type of graph have earlier been investigated under the assumption of homogeneous infectivity and also under the assumption of Poisson transmission and recovery rates. We extend known results from the literature by relaxing the assumption of homogeneous infectivity, both in individual infectivity and between different kinds of neighbours. An important special case of the epidemic model analysed in this paper is epidemics in continuous time with arbitrary infectious period distribution. We use branching process approximations of the spread of the disease to provide expressions for the basic reproduction number $R_0$, the probability of a major outbreak and the expected final size. In addition, the impact of random vaccination with a perfect vaccine on the final outcome of the epidemic is investigated. We find that, for this particular model, $R_0$ equals the perfect vaccine-associated reproduction number. Generalizations to groups larger than three are discussed briefly.

Keywords: SIR epidemics, Configuration model, Clustering, Branching processes, Vaccination

Introduction

One of the most important factors that determine the fate of an outbreak of an infectious disease is the contact pattern of individuals in the population. The frequency and duration of the contacts between individuals typically depend on the nature of their relationship. For this reason, recent interest has focused on the impact of the underlying social network on the spread of the disease. The social network is typically represented by a random graph (Newman et al. 2002), in which the nodes or vertices represent individuals and the edges represent social contacts between the individuals. Two nodes that share an edge are called “neighbours”.

A popular choice when generating random graphs with a specified degree distribution is the configuration model (CM). It was introduced by Bollobás (1980) for the special case where the degree distribution is degenerate (i.e. every node of the graph has the same degree) and extended to more general degree distributions by Molloy and Reed (1995, 1998). There is a vast literature on epidemics on configuration model graphs [see e.g. Andersson (1999), Britton et al. (2007), Janson et al. (2014), Barbour and Reinert (2013), Bhamidi et al. (2014)].

An important feature of the configuration model is that, under mild regularity conditions on the degrees, this type of graph is asymptotically unclustered. That is to say, it contains virtually no groups and short circuits. Real world networks do, however, typically exhibit clustering (Newman 2003), and there are a number of graph models that do allow for group structure (Bollobás et al. 2011; Karoński et al. 1999; Newman 2002). Epidemics on graphs with group structure were studied by Trapman (2007); Ball et al. (2009, 2010, 2014); Coupechoux and Lelarge (2015); Britton et al. (2008).

In this paper, we use a generalized version of the configuration model to incorporate clustering of the social network in the analysis of the spread of an infectious disease. The configuration model with clustering (CMC) was independently introduced by Miller (2009) and Newman (2009). It is an extension of the CM in the sense that, for each node $u$, in addition to the degree of $u$ one also specifies the number of pairs of neighbours of $u$ that are in turn neighbours of each other. In other words, one specifies the number of triangles (with non-overlapping edges) of which $u$ is a member [see Sect. 2.1 for a precise definition of the graph model]. This allows for graphs with non-negligible clustering and a specified degree distribution. That is to say, the CMC deviates from the classical Erdős–Rényi graph model (Erdős and Rényi 1959) in two fundamental ways: it allows for non-Poissonian degree distributions and it is asymptotically clustered. Epidemics on this type of graph have previously been studied by Miller (2009) and Volz (2011). Miller (2009) investigated the impact of clustering on the epidemic threshold, formulated as a bond percolation problem. This means that the infectivity of infected individuals is assumed to be homogeneous; an infected individual transmits the disease to each of its neighbours independently with some fixed probability $T$. Volz (2011) investigated the time evolution and final size of epidemics on CMC graphs under the assumption of exponentially distributed infectious periods during which individuals contact neighbours at a constant rate.

The main contribution of our research is that we extend the results of Miller (2009) and Volz (2011) by allowing for heterogeneous infectivity, i.e. by allowing for some infected individuals to be more contagious than others or for individuals to exhibit different contact behaviors for different types of neighbours. Such heterogeneity may, for instance, reflect variability in the infectious period or contact preferences on the part of individuals. We provide expressions for the probability of a major outbreak and the final size of a major outbreak. A key tool in our analysis is the approximation of the epidemic seen from a “generation of infection” or “rank” perspective by a multitype Galton Watson branching process. This approximation, which is interesting in its own right, gives rise to the rank-based reproduction number R0 [see e.g. Pellis et al. (2008, 2012)]. We note that especially allowing for heterogeneity in the infectivity of individuals requires a more intricate branching process approximation than a model with homogeneous infectivity [as analysed by Miller (2009)]. To see this, consider an individual v which is infected through a triangle Δ. The “local epidemics” in Δ and in other triangles v is part of all depend on the infectivity of v and are therefore in general not independent.

The second contribution of this paper concerns vaccination. We investigate the impact of uniform vaccination (i.e. vaccinated individuals are selected uniformly at random) with a perfect vaccine (i.e. a vaccine that provides full and permanent immunity to the disease). We find that it is necessary to vaccinate a fraction 1-1/R0 of the population in order to prevent a major outbreak of the disease, as in the case of homogeneous mixing. We illustrate our findings with numerical examples.

This paper is structured as follows. In Sect. 2 we provide the preliminaries for the model. In Sect. 2.1 we give a more detailed description of how graphs are generated in the CMC and investigate the asymptotic clustering of such graphs, and in Sect. 2.2 we specify the epidemic model. Sections 2.3 and 2.4 contain an overview of the concept of reproduction numbers and the necessary branching process background. In Sect. 3, we derive expressions for the probability of a major outbreak and the expected final size under the assumption of an unvaccinated and fully susceptible population, and in Sect. 4 the analysis is repeated under the assumption of uniform vaccination with a perfect vaccine. We illustrate our findings with numerical examples presented in Sect. 5 and discuss possible extensions in Sect. 6.

Preliminaries

The configuration model with clustering

A CMC graph is constructed as follows. Let $\{p(k_s,k_\Delta)\}_{k_s,k_\Delta\in\mathbb{N}_0}$ be a prescribed joint degree distribution, where $k_s$ denotes the number of single edges attached to a node, and $k_\Delta$ denotes the number of pairs of triangle edges. Throughout, $(S,\Delta)$ is assumed to be a generic random vector distributed according to $p$. Let $\{(S_i,\Delta_i)\}_{i=1}^{N}$ be a sequence of independent copies of $(S,\Delta)$. Analogously to the CM, a graph $G_N=G_N(p)$ of size $N$ is constructed by first assigning the single degree $S_i$ and the triangle degree $\Delta_i$ to the node $v_i$, $i=1,2,\ldots,N$. One may think of this step in terms of half-edges; to each node $v_i$, we attach $S_i$ single half-edges and $\Delta_i$ pairs of triangle half-edges. The single half-edges are then matched in pairs and the triangle half-edge pairs in threes by choosing a matching uniformly at random among all possible such matchings. The process of joining half-edges is illustrated in Fig. 1. As described in Miller (2009), the matching may be carried out as follows. Two lists of nodes, one single degree list and one triangle degree list, are created. A node with joint degree $(k_s,k_\Delta)$ appears $k_s$ times in the single list and $k_\Delta$ times in the triangle list. The lists are then shuffled uniformly, and the nodes on positions $2m+1$ and $2m+2$ in the single degree list and positions $3m+1$, $3m+2$ and $3m+3$ in the triangle degree list are matched, $m\in\mathbb{N}_0$.

Fig. 1

Schematic illustration of the construction of a CMC graph. Triangle half-edges (marked with a triangle) and single half-edges (marked with a perpendicular line) are assigned to the nodes of the graph (left). The half-edges are then matched uniformly at random (right). Note that two of the half-edges attached to v3 are paired with each other and so form a self-loop

We define the total single degree as

$$D_S^{(N)} := \sum_{i=1}^{N} S_i$$

and the total triangle degree as

$$D_\Delta^{(N)} := \sum_{i=1}^{N} \Delta_i.$$

If the total single degree (that is, the length of the single degree list) is not even, or if the total triangle degree (the length of the triangle degree list) is not a multiple of three, we erase a single half-edge and/or one or two triangle half-edge pairs chosen uniformly at random. Similarly, we erase self-loops and merge multiple edges, so that the resulting graph is simple. Under assumption A1 (stated below) on $p$, the numbers of single self-loops and single double edges converge in distribution to independent Poisson random variables with finite means [cf. Van der Hofstad (2016, prop. 7.13)].

For this reason, self-loops and multiple edges are negligible in the limit as $N\to\infty$. In the remainder of this paper, we ignore the small differences in the topology of the graph that arise from erasing multiple edges or self-loops. In addition, we ignore the small differences in effective degree distribution that arise from erasing half-edges so that the numbers of single and triangle half-edges are multiples of two and three, respectively.
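The list-based matching and the erasure step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `build_cmc_graph` and the input format (a list of `(single_degree, triangle_degree)` pairs, one per node) are hypothetical. Truncating the uniformly shuffled lists stands in for erasing uniformly chosen surplus half-edges.

```python
import random


def build_cmc_graph(degrees, seed=None):
    """Sketch of the CMC construction (names and input format are assumptions).

    degrees: list of (s_i, delta_i), the single and triangle degree of node i.
    Returns a set of undirected edges (u, v) with u < v.
    """
    rng = random.Random(seed)
    # A node with joint degree (k_s, k_delta) appears k_s times in the
    # single list and k_delta times in the triangle list.
    single = [v for v, (s, _) in enumerate(degrees) for _ in range(s)]
    triangle = [v for v, (_, d) in enumerate(degrees) for _ in range(d)]
    rng.shuffle(single)
    rng.shuffle(triangle)
    # Erase surplus entries so the lengths are multiples of 2 and 3.
    single = single[: len(single) - len(single) % 2]
    triangle = triangle[: len(triangle) - len(triangle) % 3]

    edges = set()

    def add_edge(u, v):
        if u != v:  # erase self-loops
            edges.add((min(u, v), max(u, v)))  # the set merges multiple edges

    # Consecutive pairs in the single list form single edges.
    for m in range(0, len(single), 2):
        add_edge(single[m], single[m + 1])
    # Consecutive triples in the triangle list form triangles.
    for m in range(0, len(triangle), 3):
        a, b, c = triangle[m : m + 3]
        add_edge(a, b)
        add_edge(b, c)
        add_edge(a, c)
    return edges
```

For example, `build_cmc_graph([(0, 1), (0, 1), (0, 1)])` always returns the single triangle on nodes 0, 1 and 2, since the three triangle entries can only be matched with each other.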

We make the following assumptions on p.

  1. (A1) $E(\Delta^2)<\infty$ and $E(S^2)<\infty$.

  2. (A2) $P(\max(\Delta,S)\ge 2)>0$ and $E(\Delta S)>0$.

Note that assumption A1 implies $E(\Delta S)<\infty$. Assumption A2 ensures that the mean matrices of the approximating branching processes (presented below) are positively regular (we say that an $r\times r$ matrix $M$ is positively regular if it has finite non-negative entries and for some $n\in\mathbb{N}$ all entries of $M^n$ are strictly positive).

Clustering coefficient of GN

For any undirected graph we can measure the amount of clustering in the network using the so-called clustering coefficient, which is defined as follows. Let G=(V,E) be an undirected graph with node set V and edge set E. Define

$$W_G=\{(u,v,w)\in V^3:(u,v),(v,w)\in E\}$$

the set of all ordered wedges (i.e. directed paths consisting of precisely two edges) of $G$ and

$$W_G^{\Delta}=\{(u,v,w)\in V^3:(u,v),(v,w),(w,u)\in E\}\subset W_G$$

the set of all ordered triangles of $G$. The clustering coefficient $C(G)$ of $G$ is a measure of the degree of clustering of $G$ and is defined as the fraction of the ordered wedges of $G$ that are also triangles:

$$C(G)=\frac{|W_G^{\Delta}|}{|W_G|}.$$

Here $|\cdot|$ denotes the cardinality of a set.

As stated in the following proposition, CMC graphs have asymptotically non-zero clustering as $N\to\infty$. An analogous result for fixed degree sequences was presented in Newman (2009). Let $\xrightarrow{P}$ denote convergence in probability.

Proposition 1

Let $\{G_N\}_N$ be a sequence of CMC graphs with independent degrees drawn from $p$. If $p$ satisfies assumption A1 then

$$C(G_N)\xrightarrow{P}\frac{E(2\Delta)}{E((2\Delta+S)^2)-E(2\Delta+S)}. \tag{1}$$

The proof is presented in the Appendix.
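To make the limit in (1) concrete, the following sketch evaluates its right-hand side for a joint degree distribution with finite support. The function name `limiting_clustering` and the dictionary representation of $p$ are assumptions made for illustration only.

```python
def limiting_clustering(pmf):
    """Right-hand side of (1) for a finite-support joint degree pmf.

    pmf: dict mapping (k_s, k_delta) -> probability (hypothetical format).
    """
    # Total degree of a node with joint degree (k_s, k_delta) is 2*k_delta + k_s.
    e_two_delta = sum(2 * kd * p for (ks, kd), p in pmf.items())
    e_deg = sum((2 * kd + ks) * p for (ks, kd), p in pmf.items())
    e_deg_sq = sum((2 * kd + ks) ** 2 * p for (ks, kd), p in pmf.items())
    return e_two_delta / (e_deg_sq - e_deg)
```

As a sanity check, if every node belongs to exactly one triangle and has no single edges, the expression equals 1, as expected for a graph made up of disjoint triangles. Adding one single edge per node lowers the value to 1/3.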

Downshifted size-biased degrees

The graph $G_N$ may be constructed by joining the half-edges in a random order. In particular, $G_N$ may be constructed as the epidemic progresses; starting with the initial infected case we sequentially match the half-edges along which the disease is transmitted. Since half-edges are chosen uniformly at random in the matching procedure, the probability to choose a specific node is proportional to the number of free half-edges attached to the node in question. That is, if we pair a single half-edge, the probability of choosing a specific node with $k_s$ unpaired single half-edges is proportional to $k_s$. For this reason, the degree distribution of a node explored by joining a single half-edge in the early phase of the epidemic can be approximated by the single size biased degree distribution $\tilde p^{(s)}$ [cf. the concept of excess degree in Meyers (2007)]

$$\tilde p^{(s)}(k_s,k_\Delta)=\frac{k_s\,p(k_s,k_\Delta)}{E(S)}. \tag{2}$$

Similarly, the degree distribution of the nodes explored by joining three triangle half-edge pairs in the early phase of the epidemic can be approximated by the triangle size biased degree distribution $\tilde p^{(\Delta)}$

$$\tilde p^{(\Delta)}(k_s,k_\Delta)=\frac{k_\Delta\,p(k_s,k_\Delta)}{E(\Delta)}. \tag{3}$$

In the epidemic process, we need to account for the fact that an infected individual has at least one non-susceptible neighbour (namely the direct source of its infection). For this reason, we introduce the downshifted size biased degree distributions $\bar p^{(s)}$ and $\bar p^{(\Delta)}$, given by

$$\bar p^{(s)}(k_s,k_\Delta)=\tilde p^{(s)}(k_s+1,k_\Delta), \qquad \bar p^{(\Delta)}(k_s,k_\Delta)=\tilde p^{(\Delta)}(k_s,k_\Delta+1). \tag{4}$$

Throughout, we will make frequent reference to the following random vectors

$$(S^{(s)},\Delta^{(s)})\sim\bar p^{(s)}, \qquad (S^{(\Delta)},\Delta^{(\Delta)})\sim\bar p^{(\Delta)} \tag{5}$$

and the expected values

$$E(S^{(s)})=\frac{E(S^2)}{E(S)}-1, \qquad E(\Delta^{(s)})=\frac{E(S\Delta)}{E(S)}, \qquad E(S^{(\Delta)})=\frac{E(S\Delta)}{E(\Delta)}, \qquad E(\Delta^{(\Delta)})=\frac{E(\Delta^2)}{E(\Delta)}-1. \tag{6}$$
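The relation between the size biasing in (2) and (3), the downshift in (4) and the means in (6) can be checked numerically. The sketch below builds the two downshifted distributions directly from a finite-support joint degree distribution; the helper names and the dictionary format are hypothetical, and $E(S)>0$ and $E(\Delta)>0$ are assumed.

```python
def mean(pmf, g):
    """Expectation of g(k_s, k_delta) under a pmf given as a dict."""
    return sum(g(ks, kd) * p for (ks, kd), p in pmf.items())


def downshifted_single(pmf):
    """Size-bias by single degree, then remove the edge used for infection."""
    e_s = mean(pmf, lambda ks, kd: ks)  # assumed > 0
    return {(ks - 1, kd): ks * p / e_s for (ks, kd), p in pmf.items() if ks > 0}


def downshifted_triangle(pmf):
    """Size-bias by triangle degree, then remove the triangle used for infection."""
    e_d = mean(pmf, lambda ks, kd: kd)  # assumed > 0
    return {(ks, kd - 1): kd * p / e_d for (ks, kd), p in pmf.items() if kd > 0}
```

Taking the mean of either coordinate under these distributions reproduces the closed-form expressions in (6), which only require the first and second moments of $(S,\Delta)$.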

The epidemic model

We use an SIR model to investigate the dynamics of the spread of the disease. At any given time point, the population is divided into three groups, depending on health status. The groups are susceptible (S), infectious (I) and recovered (R) [see e.g. Britton (2010), Diekmann et al. (2013)]. Individuals of the population make contact with other individuals at (possibly random) points in time. If, at some time point, an infectious individual contacts a susceptible individual then the susceptible individual instantaneously becomes infectious. An infectious individual will cease to be contagious after a period of time, which we call the infectious period of the individual in question, and is then transferred to the recovered group. Recovered individuals are those that are immune to the disease. Individuals belonging to this group play no further role in the spread of the disease. Because of this last observation, we can treat individuals that die because of the disease as “recovered”. In summary, we allow only the transitions $S\to I$ and $I\to R$. Note that the population is assumed to be closed; we ignore births, deaths and migration.

More specifically, we consider an SIR epidemic in a generation framework on the clustered static graph $G_N$ and assume possible heterogeneity in infectivity, both between different individuals (individual heterogeneity) and between different kinds of edges (edge heterogeneity). Individual heterogeneity means that some infected individuals are more contagious than others. Such heterogeneity may, for instance, arise from variability in the infectious period. Edge heterogeneity reflects that individuals may exhibit different contact behaviors for different types of neighbours; an individual may for instance prefer to spend more time with its triangle neighbours at the expense of spending less time with its single neighbours.

To construct a model that captures such heterogeneities, let $T=(T_s,T_\Delta)$ be a random variable with support in $[0,1]^2$, and let $\{T_i\}_{i=1}^{N}=\{(T_s^{(i)},T_\Delta^{(i)})\}_i$ be a sequence of independent copies of $T$. We allow for any dependence structure between $T_s$ and $T_\Delta$. Each node $v_i$ of $G_N$ is equipped with a two-dimensional transmission weight $T_i$. If $v_i$ gets infected, then each susceptible single neighbour (neighbour by virtue of a single edge) of $v_i$ gets infected by $v_i$ independently in the next generation with probability $T_s^{(i)}$, and each susceptible triangle neighbour (neighbour by virtue of a triangle edge) of $v_i$ gets infected by $v_i$ independently in the next generation with probability $T_\Delta^{(i)}$ (conditioned on $\{T_i\}_i$). Node $v_i$ thereafter becomes recovered, playing no further role in the epidemic. An infected node transmits the disease independently of the transmissions from other infected nodes. An infected node does not, however, transmit the disease to its neighbours independently, unless the distribution of $T$ is degenerate. Conditioned on the transmission weights $\{T_i\}_i$ and the structure of $G_N$, the number of single and triangle neighbours that an infected node $v_i$ makes (infectious) contact with while infectious has a binomial distribution with parameters $(S_i,T_s^{(i)})$ and $(\Delta_i,T_\Delta^{(i)})$, respectively.

The spread of this epidemic can be fully captured by a directed graph [see e.g. Pellis et al. (2012), Kenah and Miller (2011)]. To construct such a directed graph from an undirected CMC graph $G_N$, we replace each undirected edge of $G_N$ by two parallel directed edges, pointing in opposite directions. The weight of an edge $(v_i,v_j)$, which represents the (potential) transmission time from $v_i$ to $v_j$, is taken to be 1 if $v_i$ would make infectious contact with $v_j$ if infected, and $\infty$ otherwise. The individuals ultimately infected are then the individuals that can be reached from an initial case by following a path consisting of directed edges with finite edge weights.
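The directed-graph representation lends itself to a short simulation sketch: first decide, for every node, which neighbours it would infect if it were infected (the edges with finite weight), then collect everything reachable from the initial case. The function names, the adjacency-dictionary format and the way transmission weights are passed in are all assumptions made for this illustration.

```python
import random
from collections import deque


def infectious_contacts(single_nbrs, triangle_nbrs, weights, rng):
    """Sample the directed edges with finite weight.

    single_nbrs, triangle_nbrs: dict node -> iterable of neighbours.
    weights: dict node -> (t_s, t_delta), the node's transmission weight.
    Returns dict node -> set of neighbours it would infect if infected.
    """
    contacts = {}
    for u, (t_s, t_d) in weights.items():
        # One transmission weight per node, reused for all of its neighbours.
        out = {v for v in single_nbrs.get(u, ()) if rng.random() < t_s}
        out |= {v for v in triangle_nbrs.get(u, ()) if rng.random() < t_d}
        contacts[u] = out
    return contacts


def ultimately_infected(contacts, initial):
    """Nodes reachable from the initial case along finite-weight edges."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        u = queue.popleft()
        for v in contacts.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen
```

Because the contacts of each node are drawn once, up front, and with the same weight for all its neighbours, the sketch reflects the dependence between transmissions from the same infector that arises when $T$ is not degenerate.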

Reproduction numbers

A key quantity in the study of epidemics is the basic reproduction number, often denoted by R0. It is usually defined as the expected number of infected cases caused by a “typical” infected individual in an otherwise susceptible population (Diekmann et al. 1990). For most stochastic epidemic models [including SIR epidemics in homogeneous mixing populations (Britton 2010), populations with households (Ball et al. 2016) and epidemics on networks (Britton et al. 2007)] it has the threshold property that a major outbreak is possible if and only if R0>1.

For models where a suitable generation based branching process approximation is available, $R_0$ is usually defined as the Perron root (the dominant eigenvalue, which exists and is real-valued by assumptions A1 and A2, see for instance Varga (2009, Chapter 2)) of the mean matrix of the approximating Galton Watson branching process. This is the definition used in this article. By standard branching process theory, the interpretation of $R_0$ as the expected number of cases caused by the typical individual in the early phase of the epidemic and its threshold properties are retained by this definition. The threshold property of $R_0$ is made precise in Theorem 1 below.

In Sect. 4, we investigate the spread of an epidemic in a population with vaccination. To this end, in addition to the basic reproduction number $R_0$, we consider the perfect vaccine-associated reproduction number $R_V$ (Goldstein et al. 2009). A vaccine is perfect if it provides full and permanent immunity. That is, an individual vaccinated with a perfect vaccine cannot contract the disease. The perfect vaccine-associated reproduction number $R_V$ is defined as (Ball et al. 2016)

$$R_V=\frac{1}{1-f_v^{(c)}}, \tag{7}$$

where the critical vaccination coverage $f_v^{(c)}$ is the fraction of the population that has to be vaccinated with a perfect vaccine in order to reduce $R_0$ to unity, if the vaccinated individuals are chosen uniformly at random. That is to say, $f_v^{(c)}=1-1/R_V$ is the fraction necessary to vaccinate in order to be guaranteed to prevent a major outbreak (Britton 2010). Note that if $R_0\le 1$ then $f_v^{(c)}=0$.

For many models, including epidemics on graphs generated by the CM (Britton et al. 2007) and the standard stochastic SIR epidemic model [i.e. individuals mix homogeneously, see for instance Britton (2010)], $R_V=R_0$. That is, vaccinating a fraction $1-1/R_0$ of the population with a perfect vaccine is sufficient to surely prevent a major outbreak. On the other hand, for the households and households-workplaces model with uniform vaccination, $R_V\ge R_0$ (Ball et al. 2016) with strict inequality possible. In Sect. 4.1 we show that for the model analysed in this report, $R_V=R_0$.
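As a small worked illustration of (7), the sketch below converts between the critical vaccination coverage and the perfect vaccine-associated reproduction number. The function names are hypothetical; the convention that the coverage is zero when the reproduction number is at most one follows the remark after (7).

```python
def critical_coverage(r_v):
    """Critical coverage f = 1 - 1/R_V, set to 0 when no vaccination is needed."""
    return 0.0 if r_v <= 1.0 else 1.0 - 1.0 / r_v


def vaccine_reproduction_number(coverage):
    """Invert (7): R_V = 1 / (1 - f) for a coverage f in [0, 1)."""
    return 1.0 / (1.0 - coverage)
```

For instance, a reproduction number of 2 corresponds to a critical coverage of one half, and a value of 4 to a coverage of three quarters.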

Epidemics in continuous time: the rank-based approach

As mentioned above, heterogeneity in infectivity might arise from heterogeneity in the infectious period; an important special case of the above described model is epidemics in continuous time with random infectious periods where contacts between individuals take place according to point processes on $\mathbb{R}_{\ge 0}$. Ignoring the real time-dynamics of an epidemic does not impact results that concern the final outcome of the epidemic. This result was first presented by Ludwig (1975), see also Pellis et al. (2008) or Kiss et al. (2017, section 6.2.3) for a more recent discussion. This leads us to the often more tractable rank-based approach.

In order to define the rank of a vertex, denote the initial case by $v_*$. The rank of a node $v$ in $G_N$ is the distance from $v_*$ to $v$, if every edge along which the disease would be transmitted is assigned the edge weight 1, and every other edge is assigned the edge weight $\infty$. That is, the rank of $v$ is the smallest number of directed edges that have to be traversed in order to follow a path of (potential) transmission from $v_*$ to $v$. We may then analyse the spread of the disease by letting generation $n$ of the epidemic process consist of the individuals of rank $n$. If, for instance, $v_1$ is the first node in a triangle consisting of the nodes $v_1,v_2,v_3$ to be infected, and $v_1$ infects $v_2$ and thereafter attempts to infect $v_3$, then $v_3$ is attributed to $v_1$ regardless of whether $v_1$ or $v_2$ infected $v_3$. This is illustrated in Fig. 2.

Fig. 2

The difference between rank-based generations and true generations. Left: The length of the path $v_1\to v_3$ (i.e. the transmission time from $v_1$ to $v_3$) is 4.89 and exceeds the length $2.22+1.64$ of the path $v_1\to v_2\to v_3$. Therefore, the true path of transmission is $u\to w\to v_1\to v_2\to v_3$. In the rank-based approach, however, $v_3$ is attributed to $v_1$. Right: The resulting rank generation tree

Consider a continuous time epidemic formulated as follows. Suppose that each infected individual remains infectious for a (random) period of time. The infectious periods are distributed as the random variable $\tau$, $\tau\sim F$, and independent (but identically distributed) for different nodes. Suppose further that a node makes contact with each neighbour independently at a Poisson rate while infected, and that susceptible individuals are fully susceptible, so that each infectious-susceptible contact results in transmission. If the contact rate is given by $\beta_s$ for single edge neighbours and $\beta_\Delta$ for triangle edge neighbours then the transmission weights $T_s$ and $T_\Delta$ are distributed as $1-e^{-\beta_s\tau}$ and $1-e^{-\beta_\Delta\tau}$, respectively. We then have $E(T_s)=1-L(\beta_s)$, $E(T_\Delta)=1-L(\beta_\Delta)$ and $E(T_\Delta(1-T_\Delta))=L(\beta_\Delta)-L(2\beta_\Delta)$, where $L(z)=\int_{\mathbb{R}_+}e^{-zx}\,dF(x)$ is the Laplace transform of the infectious period.
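The moments above are easy to evaluate once the Laplace transform of the infectious period is available in closed form. The sketch below is an illustration under assumptions: the function names are hypothetical, and the two example transforms (an exponential infectious period with rate `gamma` and a fixed infectious period of length `tau`) are standard cases chosen here, not taken from the text. The second moment follows from expanding $(1-e^{-\beta_\Delta\tau})^2$, which gives $E(T_\Delta^2)=1-2L(\beta_\Delta)+L(2\beta_\Delta)$.

```python
import math


def exponential_laplace(gamma):
    """Laplace transform of an Exp(gamma) infectious period (example choice)."""
    return lambda z: gamma / (gamma + z)


def fixed_laplace(tau):
    """Laplace transform of a constant infectious period tau (example choice)."""
    return lambda z: math.exp(-z * tau)


def transmission_moments(laplace, beta_s, beta_delta):
    """Moments of T_s and T_delta expressed through the Laplace transform L."""
    l1 = laplace(beta_delta)
    l2 = laplace(2.0 * beta_delta)
    return {
        "E_Ts": 1.0 - laplace(beta_s),
        "E_Td": 1.0 - l1,
        "E_Td_sq": 1.0 - 2.0 * l1 + l2,  # E(T_delta^2), derived by expansion
        "E_Td_mixed": l1 - l2,           # E(T_delta * (1 - T_delta))
    }
```

With a fixed infectious period the transmission weight is degenerate, so $E(T_\Delta^2)=E(T_\Delta)^2$; with an exponential period the second moment is strictly larger, reflecting individual heterogeneity.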

Branching process approximations

To analyse the spread of the disease in the early stages of the epidemic, we employ a multi-type branching process approximation. The graph GN may be constructed by joining the half-edges in any (possibly random) order, provided that the uniform matching is not violated. In particular, the graph GN may be constructed (or explored) as the epidemic progresses; starting with the initial infected case u we sequentially match the half-edges along which the disease is transmitted. In the early phase of the epidemic, short cycles (except for the triangles formed by triangle edges) are unlikely to appear. For these reasons, the early spread of the disease is well approximated by a suitably chosen branching process.

Similarly, a branching process approximation can be used to approximate the expected final size of the epidemic (Ball et al. 2009, 2010, 2014). In the graph representation of an epidemic, an individual contracts the disease if and only if there is a path of directed edges with finite edge weights from the initial case to the node representing the individual in question.

Define the susceptibility set S(v)=SN(v) of a node v as the collection of nodes of GN that can be reached from v by tracing a path of finite length backwards. That is, the individuals that contract the disease are precisely the individuals with susceptibility sets that contain an initial case. Hence, if the initial case is chosen uniformly at random then the probability that a node v contracts the disease is proportional to the size of its susceptibility set S(v) and this probability can be approximated by exploring S(v). Figure 3 shows a schematic illustration of a susceptibility set.

Fig. 3

Graph representation of an epidemic in a small ($N=9$) population. The gray dashed and black solid edges have infinite and finite edge weights (transmission times), respectively. The nodes in the susceptibility set of $v_5$, $S(v_5)=\{v_1,v_2,v_3,v_5,v_7\}$, are enclosed by the blue dotted line. The nodes that $v_5$ would infect if infected, directly or through other nodes, are enclosed by the orange dashed line (colour figure online)
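The definition of a susceptibility set translates into a backward traversal of the directed graph of infectious contacts. The following sketch assumes the contacts are given as a dictionary mapping each node to the set of nodes it would infect if infected; the function name and this input format are hypothetical, and the small example in the usage note is invented for illustration rather than read off Fig. 3.

```python
from collections import deque


def susceptibility_set(contacts, v):
    """Nodes from which v can be reached along finite-weight directed edges."""
    # Reverse the directed edges: sources[w] holds every u that would infect w.
    sources = {}
    for u, targets in contacts.items():
        for w in targets:
            sources.setdefault(w, set()).add(u)
    seen = {v}
    queue = deque([v])
    while queue:
        w = queue.popleft()
        for u in sources.get(w, ()):
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return seen
```

With the hypothetical contacts `{1: {2}, 2: {5}, 3: {5}, 5: {6}, 7: {3}}`, the susceptibility set of node 5 is `{1, 2, 3, 5, 7}`: node 5 contracts the disease exactly when the initial case is one of these nodes.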

By reversing the direction of the edges of the graph representation of an epidemic, but keeping the weights, the expected final fraction of the population infected in a major outbreak and the probability of a major outbreak are interchanged (Miller 2008), provided that the initial case is chosen uniformly at random. The process so obtained is called the backward epidemic process of the node v. If the underlying epidemic model is such that the backward epidemic process can be well approximated by a branching process, then we can use this branching process to compute the asymptotic distribution of the proportion of the population that ultimately escapes infection. This is made precise in the following theorem, due to Ball et al. (2014, Theorem 3.5), who proved the theorem for the related model of random intersection graphs. The statement of Theorem 1 carries over to the forward and backward branching processes considered in this paper. We omit the proof, which is analogous to the proof presented in Ball et al. (2014), see also Ball et al. (2009).

Theorem 1

Let $q$ and $q_b$ be the extinction probabilities of the forward and backward approximating branching processes respectively, and let $S_N$ be the proportion of the population that ultimately escapes an epidemic in a population of size $N$. Then

$$S_N\to S \tag{8}$$

as $N\to\infty$, where the convergence in (8) is in distribution and $P(S=1)=1-P(S=q_b)=q$.

In other words, in the limit of large population sizes, the epidemic “takes off” with probability $1-q$, and if this happens a fraction $1-q_b$ of the population is ultimately infected (with probability converging to 1 as $N\to\infty$). Note that since $R_0$ is defined as the Perron root of the mean matrix of the forward branching process, $q<1$ if and only if $R_0>1$.

An epidemic in a fully susceptible population

We now have the tools to analyse the spread of an infectious disease on a graph generated by the CMC. In the present section, the population is assumed to be fully susceptible to the disease, apart from the initially infectious individual.

Forward process

Before analysing the forward process, we need to set some terminology. For a given triangle uvw, where u is the first individual to be infected in the triangle uvw, we refer to v and w as twins. We approximate the spread of the disease during the early phase by a multi-type branching process consisting of the following three types (except for the initial case):

  • Type 1:

    A node infected along a triangle edge whose twin (in the same triangle) is infected at the same time step or earlier

  • Type 2:

    A node infected along a triangle edge that is not of type 1

  • Type 3:

    A node infected along a single edge

Figure 4 shows three examples of possible paths of transmission within a triangle giving rise to type 1 and 2 individuals in the approximating branching process.

Fig. 4

Three examples of possible paths of transmission in a triangle v1,v2,v3, where v1 is the first node to be infected. Left: v1 infects both v2 and v3. Both v2 and v3 are represented by type 1 individuals in the approximating branching process. Center: v1 infects v2 and v2 infects v3. Then v3 and v2 are represented by type 1 and type 2 individuals, respectively. Right: v1 infects v2. Then v2 is represented by a type 2 individual

Denote by

$$M_f=(m_{i,j})_{i,j=1}^{3}=\begin{pmatrix} m_{1,1} & m_{1,2} & m_{1,3}\\ m_{2,1} & m_{2,2} & m_{2,3}\\ m_{3,1} & m_{3,2} & m_{3,3}\end{pmatrix}$$

the mean matrix of the above described branching process. Suppose that $v_1$ is the first individual to be infected in the triangle $v_1$, $v_2$, $v_3$. The probability that $v_1$ transmits the disease both to $v_2$ and $v_3$ is $E(T_\Delta^2)$. Similarly, the probability that $v_1$ transmits the disease to either $v_2$ or $v_3$, but not to both, is $2E(T_\Delta(1-T_\Delta))$.

Thus, by linearity of expectation and because the distribution of the susceptible neighbours of infected nodes in the early phase of the epidemic is given by the downshifted degree distributions in (4), we obtain

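$$M_f=\begin{pmatrix} 2E(T_\Delta^2)E(\Delta^{(\Delta)}) & 2E(T_\Delta(1-T_\Delta))E(\Delta^{(\Delta)}) & E(T_s)E(S^{(\Delta)})\\ 2E(T_\Delta^2)E(\Delta^{(\Delta)})+E(T_\Delta) & 2E(T_\Delta(1-T_\Delta))E(\Delta^{(\Delta)}) & E(T_s)E(S^{(\Delta)})\\ 2E(T_\Delta^2)E(\Delta^{(s)}) & 2E(T_\Delta(1-T_\Delta))E(\Delta^{(s)}) & E(T_s)E(S^{(s)})\end{pmatrix}. \tag{9}$$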

(Recall that the random variables Δ(Δ), Δ(s), S(Δ) and S(s) defined in (5) have the downshifted size biased distributions). Note that all entries of Mf are finite and that S and Δ both have finite second moments by assumption A1.

If Mf is positively regular (see the last paragraph before Sect. 2.1.1) then R0 is given by the Perron root of Mf. With little effort, one can use the expected values provided in (6) to show that necessary and sufficient conditions for Mf to be positively regular are that assumptions A1-A2 hold and that 0<E(Ts)<1 and 0<E(TΔ)<1. If some of these conditions are not satisfied, we may analyse the spread of the disease by reducing the number of types of the approximating forward branching process. It is worth pointing out that R0 only depends on the marginal distributions of Ts and TΔ (via their moments), not on the dependence structure between them.
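To illustrate how $R_0$ can be obtained from (9) in practice, the sketch below assembles the forward mean matrix from the moments of the transmission weights and the downshifted degree means in (6), and approximates its Perron root by power iteration. The function names and argument names are hypothetical, and power iteration is a generic numerical method substituted here for whatever the authors used; it is reliable when the matrix is positively regular.

```python
def forward_mean_matrix(e_ts, e_td, e_td_sq, e_s_s, e_d_s, e_s_d, e_d_d):
    """Mean matrix (9). Arguments (hypothetical names):

    e_ts, e_td, e_td_sq: E(T_s), E(T_delta), E(T_delta^2)
    e_s_s, e_d_s: means of S and Delta under the single downshifted law
    e_s_d, e_d_d: means of S and Delta under the triangle downshifted law
    """
    e_td_mixed = e_td - e_td_sq  # E(T_delta * (1 - T_delta))
    row1 = [2 * e_td_sq * e_d_d, 2 * e_td_mixed * e_d_d, e_ts * e_s_d]
    # A type 2 individual may additionally infect its twin.
    row2 = [row1[0] + e_td, row1[1], row1[2]]
    row3 = [2 * e_td_sq * e_d_s, 2 * e_td_mixed * e_d_s, e_ts * e_s_s]
    return [row1, row2, row3]


def perron_root(matrix, iterations=1000):
    """Approximate the dominant eigenvalue by power iteration (max-norm)."""
    n = len(matrix)
    x = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        y = [sum(matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = max(y)
        if lam == 0.0:
            return 0.0
        x = [value / lam for value in y]
    return lam
```

When there are no triangles, the only non-zero contribution to the root comes from the single-edge entry, and the value reduces to $E(T_s)(E(S^2)/E(S)-1)$, the familiar expression for the configuration model.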

Probability of a major outbreak

For two $s$-dimensional vectors $\bar a=(a_1,\ldots,a_s)^T$ and $\bar b=(b_1,\ldots,b_s)^T$, we define

$$\bar a^{\bar b}:=a_1^{b_1}\cdots a_s^{b_s}.$$

Let $f:[0,1]^3\to\mathbb{R}^3$ be the probability generating function of the offspring distribution of the three types in the approximating branching process. That is, for $\bar z=(z_1,z_2,z_3)^T\in[0,1]^3$ the $i$th component of $f(\bar z)$ is given by

$$f(\bar z)_i=E\big(\bar z^{\bar\xi_i}\big) \tag{10}$$

where $\bar\xi_i=(\xi_{i,1},\xi_{i,2},\xi_{i,3})$ is distributed as the offspring of a type $i$ individual, $i=1,2,3$.

Similarly, let $f_*:[0,1]^3\to\mathbb{R}$ be the probability generating function of the offspring distribution of the initial case. If $\bar\xi_*=(\xi_{*,1},\xi_{*,2},\xi_{*,3})^T$ is distributed as the offspring of the initial case, then $f_*$ is given by

$$f_*(\bar z)=E\big(\bar z^{\bar\xi_*}\big).$$

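For $i=1,2,3$, let $(S^{(i)},\Delta^{(i)})$ be the joint degree of a type $i$ case with offspring $(\xi_{i,1},\xi_{i,2},\xi_{i,3})$ and transmission weight $T=(T_s,T_\Delta)$. That is,

$$(S^{(1)},\Delta^{(1)})\overset{d}{=}(S^{(2)},\Delta^{(2)})\overset{d}{=}(S^{(\Delta)},\Delta^{(\Delta)})$$

and

$$(S^{(3)},\Delta^{(3)})\overset{d}{=}(S^{(s)},\Delta^{(s)}).$$

Here $\overset{d}{=}$ denotes equality in distribution. By conditional independence we have

$$E\big(z_1^{\xi_{i,1}}z_2^{\xi_{i,2}}z_3^{\xi_{i,3}}\big)=E\Big(E\big(z_3^{\xi_{i,3}}\mid T,S^{(i)},\Delta^{(i)}\big)\,E\big(z_1^{\xi_{i,1}}z_2^{\xi_{i,2}}\mid T,S^{(i)},\Delta^{(i)}\big)\Big).$$

Conditioned on the transmission weight $T$ and the single degree $S^{(1)}$, $\xi_{1,3}$ has a binomial distribution with parameters $S^{(1)}$ and $T_s$. Thus

$$E\big(z_3^{\xi_{1,3}}\mid T,S^{(1)},\Delta^{(1)}\big)=\sum_{k_0+k_1=S^{(1)}}\binom{S^{(1)}}{k_1}(T_sz_3)^{k_1}(1-T_s)^{k_0}=(T_sz_3+1-T_s)^{S^{(1)}}.$$

Similarly, since each of the $\Delta^{(1)}$ remaining triangles yields no infection, one type 2 individual or two type 1 individuals,

$$E\big(z_1^{\xi_{1,1}}z_2^{\xi_{1,2}}\mid T,S^{(1)},\Delta^{(1)}\big)=\sum_{k_0+k_1+k_2=\Delta^{(1)}}\binom{\Delta^{(1)}}{k_0,k_1,k_2}(1-T_\Delta)^{2k_0}\big(2(1-T_\Delta)T_\Delta z_2\big)^{k_1}(T_\Delta z_1)^{2k_2}=\big((1-T_\Delta)^2+2T_\Delta(1-T_\Delta)z_2+T_\Delta^2z_1^2\big)^{\Delta^{(1)}}.$$

Thus

$$E\big(z_1^{\xi_{1,1}}z_2^{\xi_{1,2}}z_3^{\xi_{1,3}}\big)=E\Big((T_sz_3+1-T_s)^{S^{(\Delta)}}\big((1-T_\Delta)^2+2T_\Delta(1-T_\Delta)z_2+T_\Delta^2z_1^2\big)^{\Delta^{(\Delta)}}\Big) \tag{11}$$

where $(\Delta^{(\Delta)},S^{(\Delta)})$ is independent of $T$.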

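Since the conditional offspring distribution of a type 2 individual is identical to the offspring distribution of a type 1 individual except that a type 2 individual may give birth to one additional type 1 individual with probability $T_\Delta$, we have

$$E\big(z_1^{\xi_{2,1}}z_2^{\xi_{2,2}}z_3^{\xi_{2,3}}\big)=E\Big((T_sz_3+1-T_s)^{S^{(\Delta)}}\big((1-T_\Delta)^2+2T_\Delta(1-T_\Delta)z_2+T_\Delta^2z_1^2\big)^{\Delta^{(\Delta)}}(T_\Delta z_1+1-T_\Delta)\Big). \tag{12}$$

Similarly,

$$E\big(z_1^{\xi_{3,1}}z_2^{\xi_{3,2}}z_3^{\xi_{3,3}}\big)=E\Big((T_sz_3+1-T_s)^{S^{(s)}}\big((1-T_\Delta)^2+2T_\Delta(1-T_\Delta)z_2+T_\Delta^2z_1^2\big)^{\Delta^{(s)}}\Big). \tag{13}$$

Substituting (11)–(13) into (10) gives an expression for $f$.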

By standard branching process theory, if $R_0>1$ the extinction probability of a process descending from a type $i$ individual, $i=1,2,3$, is given by $q_i$, where $\bar q=(q_1,q_2,q_3)^T$ is the unique solution of $\bar q=f(\bar q)$ in $[0,1)^3$. We also have

$$\bar q=\lim_{n\to\infty}f^n(\bar 0), \tag{14}$$

where $f^n$ is the composition of $f$ with itself $n$ times.

Since the approximating branching process dies out if and only if each of the processes started by the children of the initial case dies out, the probability of extinction is given by $f_*(\bar q)$, where $f_*$ denotes the probability generating function of the offspring distribution of the initial case. After some calculations, analogous to the calculations that led to (11)–(13), we find that the probability of extinction is given by

$$f_*(\bar q)=E\left((T_sq_3+1-T_s)^{S}\left((1-T_\Delta)^2+2T_\Delta(1-T_\Delta)q_2+T_\Delta^2q_1^2\right)^{\Delta}\right)$$

where $(S,\Delta)$ is independent of $T$. We conclude that, by Theorem 1, the probability of a major outbreak is given by $1-f_*(\bar q)$, where $\bar q$ is the limit in (14) and also the fixed point of $f$ in $[0,1)^3$.
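The limit in (14) can be approximated by iterating the generating function numerically. The following is a minimal sketch, not the authors' code, under assumptions that are ours: every node has single degree 2 and triangle degree 1, the transmission weights are deterministic with $T_s=T_\Delta=p$, and the exponents in (11)–(13) are read as the numbers of single edges and triangles still available to the infected node (2 and 0 for types 1 and 2, 1 and 1 for type 3). All function names are hypothetical.

```python
def offspring_pgf(z, p):
    """Components of f in (11)-(13) for the assumed special case (T_s = T_Delta = p)."""
    z1, z2, z3 = z
    single = p * z3 + 1 - p                                   # factor per single edge
    triangle = (1 - p) ** 2 + 2 * p * (1 - p) * z2 + (p * z1) ** 2  # factor per triangle
    f1 = single ** 2                # type 1: two single edges, no unaffected triangle
    f2 = f1 * (p * z1 + 1 - p)      # type 2: as type 1, plus a possible infection of the twin
    f3 = single * triangle          # type 3: one single edge, one unaffected triangle
    return (f1, f2, f3)


def initial_case_pgf(z, p):
    """Generating function of the initial case: two single edges and one triangle."""
    z1, z2, z3 = z
    single = p * z3 + 1 - p
    triangle = (1 - p) ** 2 + 2 * p * (1 - p) * z2 + (p * z1) ** 2
    return single ** 2 * triangle


def major_outbreak_probability(p, tol=1e-12, max_iter=100_000):
    """Iterate q <- f(q) from the origin, as in (14), then evaluate the initial-case pgf."""
    q = (0.0, 0.0, 0.0)
    for _ in range(max_iter):
        new_q = offspring_pgf(q, p)
        done = max(abs(a - b) for a, b in zip(new_q, q)) < tol
        q = new_q
        if done:
            break
    return 1.0 - initial_case_pgf(q, p)
```

Starting the iteration at the origin matters: the sequence increases to the smallest fixed point, whereas starting at $(1,1,1)$ would stay at the trivial fixed point.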

Backward process

Let w be a given node of GN, chosen uniformly at random. We use a backward branching process to approximate the probability that w contracts the disease, which by an exchangeability argument equals the expected final size of a major outbreak. The offspring of an individual v in the backward process are the individuals that would potentially have infected v, if they were infected themselves.

The members of the susceptibility set are divided into the following two groups (which give rise to a two-type approximating backward branching process).

  • Type 1:

    The vertex is included in the susceptibility set by virtue of potential transmission along a single edge

  • Type 2:

    The vertex is included in the susceptibility set by virtue of potential transmission along a triangle edge

We assign kinship as follows. The children of type 1 of an individual v1 are the individuals included in the susceptibility set due to potential transmission along a single edge. The children of type 2 of v1 are the individuals included in the susceptibility set due to potential transmission of the disease to v1, within a triangle of which v1 is a member. We note that, given a triangle v1,v2,v3 where v1 is the primary case, both v2 and v3 will be members of the susceptibility set of v1 by virtue of transmissions within the triangle if at least one of the following events happens:

(E1)

v2 and v3 both “infect” v1

(E2)

v2 infects v1 and v3 “infects” v2

(E3)

v3 infects v1 and v2 “infects” v3

Here “infects” is conditional on the “infector” being infected during the epidemic.

The events E1-E3 are illustrated in Fig. 5.

Fig. 5

The individuals v2 and v3 are both in the susceptibility set S(v1) of v1 by virtue of transmission within the triangle v1,v2,v3 if and only if at least one of the events E1 (left), E2 (center) or E3 (right) happens

Standard calculations give that the probability of the union of the events E1-E3 is given by $p_2=3E(T_\Delta)^2-2E(T_\Delta)E(T_\Delta^2)$. Similarly, the probability that neither v2 nor v3 will be a member of the susceptibility set of v1 by transmissions within the triangle is given by $p_0=(1-E(T_\Delta))^2$. For later use, denote $1-p_0-p_2$ by $p_1$.
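The expression for the probability of the union of E1-E3 can be checked by brute force. The sketch below is an illustration under our own assumptions: v2 and v3 carry independent copies of the triangle transmission weight, each node uses its own weight independently on both of its outgoing triangle edges, and the weight takes finitely many values (the two-point distribution is hypothetical). It enumerates the 16 configurations of the four directed edges pointing away from v2 and v3 and compares the result with the moment formula, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def union_probability(a, b):
    """P(E1 or E2 or E3) given that v2 has weight a and v3 has weight b."""
    total = Fraction(0)
    # Edge indicators: v2->v1 and v2->v3 (each with prob. a), v3->v1 and v3->v2 (each with prob. b).
    for e21, e23, e31, e32 in product((0, 1), repeat=4):
        prob = Fraction(1)
        for present, weight in ((e21, a), (e23, a), (e31, b), (e32, b)):
            prob *= weight if present else 1 - weight
        e1 = e21 and e31   # both "infect" v1
        e2 = e21 and e32   # v2 infects v1 and v3 "infects" v2
        e3 = e31 and e23   # v3 infects v1 and v2 "infects" v3
        if e1 or e2 or e3:
            total += prob
    return total


def p2_by_enumeration(dist):
    """dist is a list of (value, probability) pairs for the triangle transmission weight."""
    return sum(pa * pb * union_probability(a, b) for a, pa in dist for b, pb in dist)


def p2_by_formula(dist):
    m1 = sum(p * t for t, p in dist)       # first moment of the weight
    m2 = sum(p * t * t for t, p in dist)   # second moment of the weight
    return 3 * m1 ** 2 - 2 * m1 * m2


# Hypothetical two-point distribution: weight 1/5 or 9/10, each with probability 1/2.
example_dist = [(Fraction(1, 5), Fraction(1, 2)), (Fraction(9, 10), Fraction(1, 2))]
```

For this example both functions return the same rational number, which agrees with the inclusion-exclusion argument behind the formula.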

Expected final size of a major outbreak

Let $b$ be the probability generating function of the offspring distribution of the two types of the approximating backward branching process. Furthermore, let $b_*$ be the probability generating function of the offspring distribution of the ancestor $w$. Analogously to the forward branching process, the probability that a branching population whose ancestor is of type $i$, $i=1,2$, will go extinct is given by $q_i^b$, where $\bar q^b=(q_1^b,q_2^b)^T$ is the unique solution of $\bar q^b=b(\bar q^b)$ in $[0,1)^2$ (recall $R_0>1$). The probability of extinction is given by $b_*(\bar q^b)$.

Proceeding in the same manner as in Sect. 3.1.1 yields

$$b(z_1,z_2)_1=E\left((E(T_s)z_1+1-E(T_s))^{S^{(s)}}(p_0+p_1z_2+p_2z_2^2)^{\Delta^{(s)}}\right)$$

where $p_0$, $p_1$ and $p_2$ are as in Sect. 3.2. Similarly

$$b(z_1,z_2)_2=E\left((E(T_s)z_1+1-E(T_s))^{S^{(\Delta)}}(p_0+p_1z_2+p_2z_2^2)^{\Delta^{(\Delta)}}\right),$$

and the probability of ultimate extinction of the backward process is given by

$$b_*(\bar q^b)=E\left((E(T_s)q_1^b+1-E(T_s))^{S}\left(p_0+p_1q_2^b+p_2(q_2^b)^2\right)^{\Delta}\right).$$

We conclude that the expected final size of a major outbreak is given by $1-b_*(\bar q^b)$.
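As with the forward process, the backward fixed point can be found by iteration. The sketch below is illustrative only and rests on our own assumptions: every node has single degree 2 and triangle degree 1, so that a type 1 individual has one further single edge and one triangle while a type 2 individual has two single edges and no further triangle. Only the moments $E(T_s)$, $E(T_\Delta)$ and $E(T_\Delta^2)$ enter, and the function name is hypothetical.

```python
def expected_final_size(mean_ts, mean_td, second_moment_td, tol=1e-12, max_iter=100_000):
    """Expected final size of a major outbreak in the assumed special case."""
    p0 = (1 - mean_td) ** 2
    p2 = 3 * mean_td ** 2 - 2 * mean_td * second_moment_td
    p1 = 1 - p0 - p2

    def single(z1):
        return mean_ts * z1 + 1 - mean_ts       # factor per single edge

    def triangle(z2):
        return p0 + p1 * z2 + p2 * z2 ** 2      # factor per triangle

    q1 = q2 = 0.0
    for _ in range(max_iter):
        new_q1 = single(q1) * triangle(q2)      # type 1: one single edge, one triangle
        new_q2 = single(q1) ** 2                # type 2: two single edges, no triangle
        done = max(abs(new_q1 - q1), abs(new_q2 - q2)) < tol
        q1, q2 = new_q1, new_q2
        if done:
            break
    # Ancestor: two single edges and one triangle.
    return 1.0 - single(q1) ** 2 * triangle(q2)
```

With both means fixed at 1/2, `expected_final_size(0.5, 0.5, 0.25)` (constant weight) exceeds `expected_final_size(0.5, 0.5, 0.5)` (weight 0 or 1), in line with the effect of heterogeneity discussed in the numerical example.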

Vaccination

Random vaccination with a perfect vaccine

Assume that a fraction $f_v<1$ of the population is vaccinated, and that the vaccinated individuals are chosen uniformly at random (without replacement) from the population. The vaccine is perfect, in the sense that a vaccinated individual gains full and lasting immunity to the disease. If the population size $N$ is large, we may use a slightly different model, where each individual is vaccinated with probability $f_v$, independently of the vaccination status of other individuals. By the law of large numbers, for our purposes the models are equivalent in the limit as the population size $N\to\infty$.

As before, we may approximate the early phase of the epidemic by a multi-type branching process. The individuals of the approximating branching process are now of the following three types.

  • Type 1:

    Infected along a triangle edge and has a twin that is known not to be susceptible

  • Type 2:

    Infected along a triangle edge and has a twin that might be susceptible

  • Type 3:

    Infected along a single edge

To clarify the types, assume that in the early phase of the epidemic v1 is the primary case in the triangle v1,v2,v3. If v1 attempts to transmit the disease both to v2 and v3 and succeeds (that is, neither v2 nor v3 is vaccinated) then both v2 and v3 are represented by type 1 individuals in the approximating branching process. This happens with probability

$$E(T_\Delta^2)(1-f_v)^2. \tag{15}$$

If v1 attempts to transmit the disease both to v2 and v3, but succeeds in transmitting it only to v3 (that is, v2 is vaccinated and v3 is not vaccinated), then in the approximating branching process the individual representing v1 gives birth to one type 1 individual (representing v3) within the triangle v1,v2,v3. This happens with probability

$$E(T_\Delta^2)f_v(1-f_v). \tag{16}$$

If v1 attempts to transmit the disease only to v2 and succeeds (that is, v2 is not vaccinated) then in the approximating branching process, the individual representing v1 gives birth to one type 2 individual (representing v2) within the triangle v1,v2,v3. This happens with probability

$$E(T_\Delta(1-T_\Delta))(1-f_v). \tag{17}$$

The above described events are illustrated in Fig. 6.

Fig. 6

Three examples of transmission dynamics within a triangle v1,v2,v3. An attempted transmission of the disease is represented by an arrow, an attempted transmission to a vaccinated individual is represented by an arrow and a blue bar. Left: v1 attempts to transmit the disease both to v2 and v3, and succeeds. Both v2 and v3 are represented by type 1 individuals in the approximating branching process. Center: v1 attempts to transmit the disease both to v2 and v3, the transmission to v2 is blocked since v2 is vaccinated. Then v3 is represented by a type 1 individual. Right: v1 succeeds to transmit the disease to v2, but does not attempt to infect v3. Then v2 is represented by a type 2 individual (colour figure online)

Denote the mean matrix of the approximating branching process by $M_f^{(v)}=(m_{i,j}^{(v)})_{i,j=1}^3$. Using the expressions in (15) and (16) gives the expected number of type 1 individuals produced by a type 1 individual

$$m_{1,1}^{(v)}=\left(2(1-f_v)^2E(T_\Delta^2)+2(1-f_v)f_vE(T_\Delta^2)\right)E(\Delta^{(\Delta)})=(1-f_v)\,2E(T_\Delta^2)E(\Delta^{(\Delta)})=(1-f_v)m_{1,1}$$

where $m_{1,1}$ is an element of the mean matrix $M_f$ of the forward branching process presented in (9).

Proceeding in the same fashion, we obtain the remaining elements of the mean matrix $M_f^{(v)}$ of the branching process with random vaccination. It turns out that

$$M_f^{(v)}=(1-f_v)M_f.$$

It is readily verified that the Perron root of $M_f^{(v)}$ is

$$r_f^{(v)}=(1-f_v)r_f, \tag{18}$$

where $r_f$ is the Perron root of $M_f$. Setting $r_f^{(v)}$ to 1 in (18) and solving for $f_v$ yields the critical vaccination coverage $f_v^{(c)}=1-1/r_f$.

We conclude that, for this particular graph model, equality holds between the basic reproduction number $R_0$ and the perfect vaccine-associated reproduction number $R_V$ as defined in (7). Note that $R_0$ is based on a rank-based perspective of infection and not on “who-infected-whom”.
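The scaling of the Perron root under random vaccination, and the resulting critical coverage, can be illustrated numerically. The sketch below uses plain power iteration on a small matrix; the matrix `EXAMPLE_MEAN_MATRIX` is a hypothetical primitive mean matrix chosen for illustration and is not the matrix in (9), and all function names are ours.

```python
def perron_root(matrix, iterations=2000):
    """Dominant eigenvalue of a primitive non-negative matrix via power iteration."""
    n = len(matrix)
    v = [1.0] * n
    root = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        root = max(w)                 # v has maximum entry 1, so max(w) estimates the root
        v = [x / root for x in w]
    return root


def vaccinated_mean_matrix(matrix, fv):
    """Mean matrix under random vaccination with a perfect vaccine: (1 - fv) * M."""
    return [[(1.0 - fv) * entry for entry in row] for row in matrix]


def critical_coverage(matrix):
    """Coverage at which the Perron root of the vaccinated process equals 1."""
    return 1.0 - 1.0 / perron_root(matrix)


# Hypothetical supercritical mean matrix (rows: parent type, columns: offspring type).
EXAMPLE_MEAN_MATRIX = [
    [0.0, 0.0, 1.0],
    [0.5, 0.0, 1.0],
    [0.5, 0.5, 0.5],
]
```

Calling `perron_root(vaccinated_mean_matrix(EXAMPLE_MEAN_MATRIX, critical_coverage(EXAMPLE_MEAN_MATRIX)))` returns a value numerically equal to 1, which is the content of (18) at the critical coverage.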

Probability of a major outbreak

Let $h$ be the probability generating function of the offspring distribution of the three types in our model including vaccination. As in Sect. 3.1.1, we use the probability generating function to approximate the probability of extinction of the epidemic. To this end, let $(\zeta_{i,1},\zeta_{i,2},\zeta_{i,3})$ be distributed as the offspring of a type $i$ individual with transmission weight $T$, $i=1,2,3$, and let $(S^{(i)},\Delta^{(i)})$ be distributed as the joint degree of this individual. That is,

$$(S^{(1)},\Delta^{(1)})\overset{d}{=}(S^{(2)},\Delta^{(2)})\overset{d}{=}(S^{(\Delta)},\Delta^{(\Delta)})$$

and

$$(S^{(3)},\Delta^{(3)})\overset{d}{=}(S^{(s)},\Delta^{(s)}).$$

Note that $(S^{(i)},\Delta^{(i)})$ and $T$ are independent.

By conditional independence

$$E\left(z_1^{\zeta_{1,1}}z_2^{\zeta_{1,2}}z_3^{\zeta_{1,3}}\right)=E\left[E\left(z_3^{\zeta_{1,3}}\mid S^{(1)},\Delta^{(1)},T\right)E\left(z_1^{\zeta_{1,1}}z_2^{\zeta_{1,2}}\mid S^{(1)},\Delta^{(1)},T\right)\right]$$

for $\bar z=(z_1,z_2,z_3)^T\in[0,1]^3$.

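Conditioned on the transmission weight $T$ and the joint degree $(S^{(1)},\Delta^{(1)})$, the number of attempted transmissions from a type 1 individual along single edges has a binomial distribution with parameters $S^{(1)}$ and $T_s$, and each attempted transmission succeeds with probability $(1-f_v)$. Thus,

$$E\left(z_3^{\zeta_{1,3}}\mid S^{(1)},\Delta^{(1)},T\right)=\sum_{k_0+k_1=S^{(1)}}\binom{S^{(1)}}{k_1}z_3^{k_1}(T_s(1-f_v))^{k_1}((1-T_s)+T_sf_v)^{k_0}=(T_s(1-f_v)z_3+1-T_s+T_sf_v)^{S^{(1)}}. \tag{19}$$

Similarly, for a type 1 individual $w$ with triangle degree $\Delta^{(1)}$, by conditioning on the number of attempted transmissions (in $k_i$ of the $\Delta^{(1)}-1$ triangles that are not yet affected by the disease, $w$ attempts to transmit the disease to $i$ individuals, $i=0,1,2$) and the vaccination status of the individuals contacted by $w$ we obtain

$$\begin{aligned}
E\left(z_1^{\zeta_{1,1}}z_2^{\zeta_{1,2}}\mid S^{(1)},\Delta^{(1)},T\right)
&=\sum_{k_0+k_1+k_2=\Delta^{(1)}-1}\binom{\Delta^{(1)}-1}{k_0,k_1,k_2}(1-T_\Delta)^{2k_0}(2T_\Delta(1-T_\Delta))^{k_1}T_\Delta^{2k_2}\\
&\qquad\times\sum_{\tilde k_0+\tilde k_1+\tilde k_2=k_2}\binom{k_2}{\tilde k_0,\tilde k_1,\tilde k_2}((1-f_v)z_1)^{2\tilde k_2}(2f_v(1-f_v)z_1)^{\tilde k_1}f_v^{2\tilde k_0}\\
&\qquad\times\sum_{\hat k_0+\hat k_1=k_1}\binom{k_1}{\hat k_0,\hat k_1}(1-f_v)^{\hat k_1}z_2^{\hat k_1}f_v^{\hat k_0}\\
&=\sum_{k_0+k_1+k_2=\Delta^{(1)}-1}\binom{\Delta^{(1)}-1}{k_0,k_1,k_2}(1-T_\Delta)^{2k_0}(2T_\Delta(1-T_\Delta))^{k_1}T_\Delta^{2k_2}\left(((1-f_v)z_1)^2+2f_v(1-f_v)z_1+f_v^2\right)^{k_2}((1-f_v)z_2+f_v)^{k_1}\\
&=\left[(1-T_\Delta)^2+2T_\Delta(1-T_\Delta)[(1-f_v)z_2+f_v]+T_\Delta^2\left[((1-f_v)z_1)^2+2f_v(1-f_v)z_1+f_v^2\right]\right]^{\Delta^{(1)}-1}.
\end{aligned} \tag{20}$$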

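Combining (19) and (20) yields

$$E\left(z_1^{\zeta_{1,1}}z_2^{\zeta_{1,2}}z_3^{\zeta_{1,3}}\right)=E\left[(T_s(1-f_v)z_3+1-T_s+T_sf_v)^{S^{(\Delta)}}\left((1-T_\Delta)^2+2T_\Delta(1-T_\Delta)((1-f_v)z_2+f_v)+T_\Delta^2\left(((1-f_v)z_1)^2+2f_v(1-f_v)z_1+f_v^2\right)\right)^{\Delta^{(\Delta)}}\right]. \tag{21}$$

By noting that the offspring distribution of a type 2 individual is identical to the offspring distribution of a type 1 individual, except that a type 2 individual may give birth to one additional type 1 individual with probability $T_\Delta(1-f_v)$, we obtain

$$E\left(z_1^{\zeta_{2,1}}z_2^{\zeta_{2,2}}z_3^{\zeta_{2,3}}\right)=E\left[(T_s(1-f_v)z_3+1-T_s+T_sf_v)^{S^{(\Delta)}}\left((1-T_\Delta)^2+2T_\Delta(1-T_\Delta)((1-f_v)z_2+f_v)+T_\Delta^2\left(((1-f_v)z_1)^2+2f_v(1-f_v)z_1+f_v^2\right)\right)^{\Delta^{(\Delta)}}\left(z_1T_\Delta(1-f_v)+1-T_\Delta(1-f_v)\right)\right]. \tag{22}$$

Similarly,

$$E\left(z_1^{\zeta_{3,1}}z_2^{\zeta_{3,2}}z_3^{\zeta_{3,3}}\right)=E\left[(T_s(1-f_v)z_3+1-T_s+T_sf_v)^{S^{(s)}}\left((1-T_\Delta)^2+2T_\Delta(1-T_\Delta)((1-f_v)z_2+f_v)+T_\Delta^2\left(((1-f_v)z_1)^2+2f_v(1-f_v)z_1+f_v^2\right)\right)^{\Delta^{(s)}}\right]. \tag{23}$$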

Combining these results yields the probability generating function $h$ of the offspring distribution of a type 1, 2, 3 individual respectively. That is, $h(\bar z)_1$ is given by (21), $h(\bar z)_2$ is given by (22) and $h(\bar z)_3$ is given by (23).

The probability generating function $h_*$ of the initial case is given by

$$h_*(\bar z)=E\left(z_1^{\zeta_{*,1}}z_2^{\zeta_{*,2}}z_3^{\zeta_{*,3}}\right)=E\left[(T_s(1-f_v)z_3+1-T_s+T_sf_v)^{S}\left((1-T_\Delta)^2+2T_\Delta(1-T_\Delta)((1-f_v)z_2+f_v)+T_\Delta^2\left(((1-f_v)z_1)^2+2f_v(1-f_v)z_1+f_v^2\right)\right)^{\Delta}\right] \tag{24}$$

for $\bar z=(z_1,z_2,z_3)^T\in[0,1]^3$, where $(S,\Delta)$ is distributed as the joint degree of the initial case and independent of $T$. The probability of extinction of the approximating branching process is given by $h_*(\bar q^{(v)})$, where $\bar q^{(v)}$ is given by the point in $[0,1]^3$ closest to the origin that satisfies $\bar q^{(v)}=h(\bar q^{(v)})$. Thus, by Theorem 1 the probability of a major outbreak is $1-h_*(\bar q^{(v)})$.
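A remark that may help when checking (21)–(23) against (11)–(13), offered as a reading of the formulas rather than as part of the original derivation: random vaccination acts as an independent thinning of the offspring, so the generating functions with vaccination are the ones without vaccination evaluated at a shifted argument. The map $\theta$ below is notation introduced only for this remark.

```latex
% Thinning map (our notation): each potential offspring is kept with probability 1 - f_v.
\theta(z) := (1-f_v)\,z + f_v .

% Single-edge factor in (21)-(23):
T_s(1-f_v)z_3 + 1 - T_s + T_s f_v \;=\; T_s\,\theta(z_3) + 1 - T_s .

% Triangle factor in (21)-(23):
(1-T_\Delta)^2 + 2T_\Delta(1-T_\Delta)\bigl((1-f_v)z_2+f_v\bigr)
  + T_\Delta^2\bigl(((1-f_v)z_1)^2 + 2f_v(1-f_v)z_1 + f_v^2\bigr)
\;=\; (1-T_\Delta)^2 + 2T_\Delta(1-T_\Delta)\,\theta(z_2) + T_\Delta^2\,\theta(z_1)^2 .

% Additional factor for type 2 in (22):
z_1 T_\Delta(1-f_v) + 1 - T_\Delta(1-f_v) \;=\; T_\Delta\,\theta(z_1) + 1 - T_\Delta .

% Hence, comparing with (11)-(13), for i = 1, 2, 3:
h(\bar z)_i \;=\; f\bigl(\theta(z_1),\theta(z_2),\theta(z_3)\bigr)_i .
```

Each identity follows by expanding $\theta$; for instance $\theta(z_1)^2=((1-f_v)z_1)^2+2f_v(1-f_v)z_1+f_v^2$.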

The backward process

We now turn our attention to the backward process and final size of an epidemic in a population where a fraction fv is vaccinated with a perfect vaccine. To this end, we introduce the following three types, where individuals are classified by their vaccination status and the type of the edge along which they would transmit the disease if infected.

  • Type 1:

    Transmits along triangle edge, no information on vaccination status is available

  • Type 2:

    Transmits along triangle edge and is known not to be vaccinated since it is successfully infected by its twin

  • Type 3:

    Transmits along single edge, no information on vaccination status is available

To clarify the types a bit more, let v1,v2,v3 be a given triangle. At least one of v2 and v3 belongs to the susceptibility set of v1 by virtue of potential transmissions within the triangle if one of the following events, illustrated in Fig. 7, happens. Note that all cases infected by virtue of transmission within the triangle v1,v2,v3 are attributed to v1.

(E1)
v2 attempts to infect v1 and v3 attempts to infect v2, both succeed, and v3 does not attempt to infect v1. Or the same thing might happen, with v2 and v3 interchanged. This results in one type 1 and one type 2 individual in the approximating branching process. If v1 is represented by a type 1 or 3 individual this happens with probability
$2(1-f_v)^2E(T_\Delta)E(T_\Delta(1-T_\Delta))$,
if v1 is represented by an individual of type 2 this happens with probability
$2(1-f_v)E(T_\Delta)E(T_\Delta(1-T_\Delta))$.
(E2)
Only one of v2 and v3 attempts to infect v1, and succeeds. The other node does not attempt to infect any node within the triangle. This results in one type 1 offspring. If v1 is represented by an individual of type 1 or 3 this happens with probability
$2(1-f_v)E(T_\Delta)E((1-T_\Delta)^2)$,
if v1 is represented by an individual of type 2 this happens with probability
$2E(T_\Delta)E((1-T_\Delta)^2)$.
(E3)
v2 and v3 both attempt to infect v1 and succeed. This results in two type 1 individuals born in the approximating branching process. If v1 is represented by an individual of type 1 or 3 this happens with probability
$(1-f_v)E(T_\Delta)^2$,
if v1 is represented by an individual of type 2 this happens with probability
$E(T_\Delta)^2$.
(E4)
v2 attempts to infect v1 and succeeds. The other node, v3, attempts to infect v2, but fails due to v2 being vaccinated. The individual v3 does not attempt to infect v1. In this scenario, v2 belongs to the susceptibility set of v1. However, we do not include v2 in the approximating branching process. This does not have any impact on the result of our analysis, since we are only interested in the probability of extinction of the backward process and v2 does not produce any offspring in this process.

Fig. 7

At least one of v2 and v3 will belong to the susceptibility set of v1 by virtue of potential transmissions within the triangle if some of the following types of scenarios (left to right in the picture) occur: E1, E2, E3, E4. An attempted transmission of the disease is represented by an arrow, an attempted transmission to a vaccinated individual is represented by an arrow and a blue bar (colour figure online)

Expected final size

Let $b^{(v)}$ and $b_*^{(v)}$ be the probability generating functions of the offspring distribution of the three types of the approximating backward branching process and of the ancestor, respectively. Furthermore, let $\bar\zeta_i=(\zeta_{i,1}^b,\zeta_{i,2}^b,\zeta_{i,3}^b)$ be distributed as the offspring of a type $i$, $i=1,2,3$, individual and denote by $E_s$ the conditional expectation given that the parent of $\zeta_{i,1}^b,\zeta_{i,2}^b,\zeta_{i,3}^b$ is susceptible. Let further $\bar\zeta_*=(\zeta_{*,1}^b,\zeta_{*,2}^b,\zeta_{*,3}^b)$ be distributed as the offspring of the ancestor. Denote the extinction probability of a process descending from a type $i$ individual by $q_i^b$, $i=1,2,3$, and let $\bar q^b=(q_1^b,q_2^b,q_3^b)^T$.

To find an expression for $b^{(v)}$, we first note that for $\bar z=(z_1,z_2,z_3)^T$

$$E\left(\bar z^{\bar\zeta_3}\right)=f_v+(1-f_v)E_s\left[E_s\left(z_3^{\zeta_{3,3}^b}\mid S^{(3)},\Delta^{(3)}\right)E_s\left(z_1^{\zeta_{3,1}^b}z_2^{\zeta_{3,2}^b}\mid S^{(3)},\Delta^{(3)}\right)\right] \tag{25}$$

where, as before, $(S^{(i)},\Delta^{(i)})$ is distributed as the joint degree of a type $i$ individual, $i=1,2,3$.

Now

$$E_s\left(z_3^{\zeta_{3,3}}\mid S^{(3)},\Delta^{(3)}\right)=\sum_{k_0+k_1=S^{(3)}-1}\binom{S^{(3)}-1}{k_0,k_1}z_3^{k_1}E(T_s)^{k_1}E(1-T_s)^{k_0}=(E(T_s)z_3+1-E(T_s))^{S^{(3)}-1}. \tag{26}$$

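By conditioning on the number of triangles $k_2$ in which an event of type E3 occurs, the number of triangles $k_{1a}$ in which an event of type E1 occurs, the number of triangles $k_{1b}$ in which an event of type E4 occurs and the number of triangles $k_{1c}$ in which an event of type E2 occurs we obtain

$$\begin{aligned}
E_s\left(z_1^{\zeta_{3,1}}z_2^{\zeta_{3,2}}\mid S^{(3)},\Delta^{(3)}\right)
&=\sum_{k_0+k_{1a}+k_{1b}+k_{1c}+k_2=\Delta^{(3)}}\binom{\Delta^{(3)}}{k_0,k_{1a},k_{1b},k_{1c},k_2}E(1-T_\Delta)^{2k_0}\left(2E(T_\Delta)E(T_\Delta(1-T_\Delta))(1-f_v)\right)^{k_{1a}}\\
&\qquad\times\left(2E(T_\Delta)E(T_\Delta(1-T_\Delta))f_v\right)^{k_{1b}}\left(2E(T_\Delta)E((1-T_\Delta)^2)\right)^{k_{1c}}E(T_\Delta)^{2k_2}z_2^{k_{1a}}z_1^{k_{1a}+k_{1c}+2k_2}\\
&=\left((E(1-T_\Delta))^2+2E(T_\Delta)E(T_\Delta(1-T_\Delta))(1-f_v)z_2z_1+2E(T_\Delta)E(T_\Delta(1-T_\Delta))f_v+2E(T_\Delta)E((1-T_\Delta)^2)z_1+E(T_\Delta)^2z_1^2\right)^{\Delta^{(3)}}.
\end{aligned} \tag{27}$$

Inserting the right hand sides of (26) and (27) in (25) gives

$$E\left(z_1^{\zeta_{3,1}}z_2^{\zeta_{3,2}}z_3^{\zeta_{3,3}}\right)=f_v+(1-f_v)E\left[(E(T_s)z_3+1-E(T_s))^{S^{(s)}}\left((E(1-T_\Delta))^2+2E(T_\Delta)E(T_\Delta(1-T_\Delta))(1-f_v)z_1z_2+2E(T_\Delta)E(T_\Delta(1-T_\Delta))f_v+2E(T_\Delta)E((1-T_\Delta)^2)z_1+E(T_\Delta)^2z_1^2\right)^{\Delta^{(s)}}\right]. \tag{28}$$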

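Similarly

$$E\left(z_1^{\zeta_{2,1}}z_2^{\zeta_{2,2}}z_3^{\zeta_{2,3}}\right)=E\left[(E(T_s)z_3+1-E(T_s))^{S^{(\Delta)}}\left((E(1-T_\Delta))^2+2E(T_\Delta)E(T_\Delta(1-T_\Delta))(1-f_v)z_1z_2+2E(T_\Delta)E(T_\Delta(1-T_\Delta))f_v+2E(T_\Delta)E((1-T_\Delta)^2)z_1+E(T_\Delta)^2z_1^2\right)^{\Delta^{(\Delta)}}\right] \tag{29}$$

and

$$E\left(z_1^{\zeta_{1,1}}z_2^{\zeta_{1,2}}z_3^{\zeta_{1,3}}\right)=f_v+(1-f_v)E\left(z_1^{\zeta_{2,1}}z_2^{\zeta_{2,2}}z_3^{\zeta_{2,3}}\right). \tag{30}$$

Combining these results yields the probability generating function of the offspring distribution of the three types; $b^{(v)}(\bar z)_3$ is given by (28) and $b^{(v)}(\bar z)_2$ is given by (29). By replacing $(S^{(s)},\Delta^{(s)})$ in the right hand side of (28) by $(S^{(\Delta)},\Delta^{(\Delta)})$ we obtain $b^{(v)}(\bar z)_1$.

Also by replacing $(S^{(s)},\Delta^{(s)})$ in the right hand side of (28), but now by $(S,\Delta)$, we obtain the probability generating function $b_*^{(v)}(\bar z)$ of the offspring of the initial case. The expected final size of the epidemic, conditional on a major outbreak occurring, is given by

$$1-b_*^{(v)}(\bar q^b).$$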

Numerical example

Consider for now the special case where Ts=TΔ. With some abuse of notation we denote Ts=TΔ by T, i.e. T=Ts=TΔ is one-dimensional here. Under very general assumptions, increasing the heterogeneity in infectiousness leads to a decrease in the probability of a major outbreak, the expected final size and R0 (Kuulasmaa 1982; Meester and Trapman 2011; Miller 2008), see also Ball (1985); Kenah and Robins (2007); Miller (2007). In particular, for a fixed (marginal) transmission probability E(T), the probability of a major outbreak and the expected final size are maximized if T=E(T) with probability 1 and minimized if P(T=1)=E(T)=1-P(T=0). Similarly, for given E(T), R0 is maximized if T=E(T) with probability 1 and minimized if P(T=1)=E(T)=1-P(T=0).

We illustrate this with the following two examples. In the first example we assume that T=Ts=TΔ. Consider the three degree distributions

  1. p(2,1)=1

  2. p(4,0)=0.95=1-p(2,1)

  3. p(0,2)=0.95=1-p(2,1).

That is, in all three degree distributions the total degree is 4 with probability 1. In addition, distribution 1 corresponds to a network where every node is member of exactly one triangle. Distribution 2 corresponds to a network where a node is not a member of any triangle with probability 0.95, while with probability 0.05 a node is member of one triangle. Finally, distribution 3 corresponds to a network where a node is a member of two triangles with probability 0.95, while with probability 0.05 a node is member of one triangle.

Furthermore, let $T$ have distribution $\mathrm{Beta}(\alpha,\alpha)$ for some $\alpha>0$. That is, $T$ has density $C_\alpha x^{\alpha-1}(1-x)^{\alpha-1}$ on the interval (0, 1), where $C_\alpha$ is a normalizing constant. Then $E(T)=1/2$ and we can tune the heterogeneity of the infectivity of infected individuals by varying $\alpha$. In particular

$$E(T^2)=\frac12\left(1-\frac{1}{2+\alpha^{-1}}\right).$$

Note that $\alpha=1$ corresponds to $T\sim U(0,1)$, while $\alpha\to\infty$ corresponds to $T$ becoming a point mass at 1/2. Here U(0, 1) denotes the uniform distribution on (0, 1). Figure 8 shows the probability that a major outbreak does not occur, the expected final size, $R_0$ and the critical vaccination coverage $f_v^{(c)}$ as functions of $\alpha$ or $E(T^2)$.
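The closed form for the second moment can be cross-checked against the standard variance of a Beta distribution, $\mathrm{Var}=ab/((a+b)^2(a+b+1))$ with $a=b=\alpha$, together with $E(T)=1/2$. The short sketch below does this in exact rational arithmetic; the function names are ours.

```python
from fractions import Fraction


def second_moment_closed_form(alpha):
    """E(T^2) = (1/2) * (1 - 1 / (2 + 1/alpha)) for T ~ Beta(alpha, alpha)."""
    alpha = Fraction(alpha)
    return Fraction(1, 2) * (1 - 1 / (2 + 1 / alpha))


def second_moment_from_variance(alpha):
    """E(T^2) = Var(T) + E(T)^2, using the standard Beta(a, b) variance with a = b = alpha."""
    alpha = Fraction(alpha)
    variance = alpha * alpha / ((2 * alpha) ** 2 * (2 * alpha + 1))
    return variance + Fraction(1, 4)
```

For $\alpha=1$ both functions return 1/3, the second moment of the uniform distribution on (0, 1), and as $\alpha$ grows the value decreases towards 1/4, the second moment of a point mass at 1/2.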

Fig. 8

The impact of heterogeneity in infectivity for the three degree distributions. With some abuse of notation we write T=Ts=TΔ. a The probability that a major outbreak does not occur as a function of α. b The probability that a major outbreak does not occur as a function of E(T2). c The expected final size of a major outbreak as a function of α. d The expected final size of a major outbreak as a function of E(T2). e The basic reproduction number R0 as a function of E(T2). f The critical vaccination coverage fv(c) as a function of E(T2)

As can be seen in Fig. 8, ignoring actual heterogeneity of infectivity in this case leads to an overestimation of the probability of a major outbreak (Fig. 8a, b). This effect is particularly evident in the presence of high clustering; the steeper slope of the curve corresponding to distribution 3 (Fig. 8b) and the relatively low probability of a major outbreak when α is small can be explained by the fact that the approximating forward branching process is close to being critical when α is small. Figure 8c, d shows that heterogeneity of infectivity has virtually no impact on the expected final size of a major outbreak and R0 in the near absence of clustering, which is in line with known results for unclustered networks [see, for instance, section 4 in Miller (2008)]. In the presence of clustering, on the other hand, ignoring heterogeneity of infectivity leads to an underestimation of the expected final size and a substantial overestimation of the critical vaccination coverage fv(c). Note that R0 and fv(c) depend on the distribution of T only through the first and second moment of T.

Next, we relax the assumption $T_s=T_\Delta$ and investigate the impact of the correlation $\rho$ between $T_s$ and $T_\Delta$ on the spread of the disease. To this end, we consider a model where as before $T_s$ and $T_\Delta$ both have distribution $\mathrm{Beta}(\alpha,\alpha)$ and where the correlation $\rho=\rho(t)$ may be tuned by varying $t\in[-1,1]$. Here $\rho(t)$ is increasing in $t$ with $\rho(-1)=-1$ and $\rho(1)=1$. The degree distribution of the underlying graph is given by distribution 1 above.

To construct such a model, let the joint distribution of $T_s$ and $T_\Delta$ be as follows. Let $N_1,N_2,N_3$ be three independent standard normal random variables, and assume that the joint distribution of $T_s$ and $T_\Delta$ is given by

$$T\overset{d}{=}\left(F_\alpha^{-1}(\Phi(N_s)),F_\alpha^{-1}(\Phi(N_\Delta))\right),$$

where $F_\alpha$ and $\Phi$ are the CDFs of the distributions $\mathrm{Beta}(\alpha,\alpha)$ and N(0, 1), and $N_s$ and $N_\Delta$ are the standard normal random variables

$$N_s=\sqrt{|t|}\,N_1+\sqrt{1-|t|}\,N_2,\qquad N_\Delta=\mathrm{sign}(t)\sqrt{|t|}\,N_1+\sqrt{1-|t|}\,N_3.$$

With little effort one can show that $\rho(t)$ is indeed increasing in $t$, that (by the symmetry of the distribution $\mathrm{Beta}(\alpha,\alpha)$) $\rho(-1)=-1$ and $\rho(1)=1$, and that $T_s$ and $T_\Delta$ are independent for $t=0$. As can be seen in Fig. 9 the probability that a major outbreak occurs increases as the correlation $\rho(t)$ decreases. This effect is substantial when heterogeneity in individual infectivity is high (i.e. $\alpha$ is small) but wanes as the heterogeneity decreases (i.e. $\alpha$ increases). This can be explained by the fact that for a fixed value of $\alpha$ the probability that an infectious individual will not transmit the disease to any of its susceptible neighbours increases as $t$ increases, and that this probability is more sensitive to changes in $t$ if the heterogeneity in individual infectivity is higher [cf. Kuulasmaa (1982)]. It should be noted that $R_0$ and the critical vaccination coverage $f_v^{(c)}$ do not depend on the correlation $\rho(t)$, which can be seen from the expression for the mean matrix in (9) and (18).
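The construction of correlated transmission weights can be sketched in a few lines. The code below is an illustration under assumptions of ours: the coefficients are taken to be $\sqrt{|t|}$ and $\sqrt{1-|t|}$ so that both normal variables have unit variance, and only the special case $\alpha=1$ is treated, where the Beta(1, 1) quantile function is the identity and no inverse Beta CDF is needed. For general $\alpha$ one would need a Beta quantile function from a numerical library. The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def correlated_normals(t, rng):
    """Return (N_s, N_Delta): standard normal variables sharing the common component N_1."""
    n1, n2, n3 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    shared = math.sqrt(abs(t))            # assumed weight of the common component
    private = math.sqrt(1.0 - abs(t))     # assumed weight of the independent components
    sign = math.copysign(1.0, t)
    return shared * n1 + private * n2, sign * shared * n1 + private * n3


def sample_weights_uniform(t, rng):
    """Sample (T_s, T_Delta) for alpha = 1, where the Beta(1, 1) quantile map is the identity."""
    n_s, n_delta = correlated_normals(t, rng)
    return STANDARD_NORMAL.cdf(n_s), STANDARD_NORMAL.cdf(n_delta)
```

At the extremes the dependence is deterministic: for `t = 1` the two weights coincide, and for `t = -1` they sum to 1, which is how the symmetry of the distribution gives correlation -1.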

Fig. 9

The probability that a major outbreak does not occur as a function of α and t

Discussion

In this paper, we have incorporated clustering in the spread of an infectious disease by allowing for groups of size three with non-overlapping edges. It is, in principle, straightforward to extend the methods used in this paper to larger group sizes. The CMC may, for instance, be generalized to larger group sizes as follows. Let $K=\{k_1,\ldots,k_r\}\subset\mathbb{N}_{\ge2}$ be the set of possible group sizes. In the matching procedure, each node is equipped with an $r$-dimensional degree in $\mathbb{N}_0^r$. The $i$th component (the $k_i$-degree) of a degree specifies the number of groups of size $k_i$ to which the node in question belongs. Analogously to the construction of a CMC graph, groups are then formed by creating one list for each group size; a node with $k_i$-degree $d_i$ appears precisely $d_i$ times in the list corresponding to groups of size $k_i$. The lists are then shuffled and half-edges of nodes in positions $k+1,\ldots,k+k_i$ in the $k_i$-list are joined. The structure of a graph so obtained would be characterized by fully connected cliques, and similar to that of a random intersection graph (Ball et al. 2014). One possible approach to investigate epidemics on such graphs would be to approximate the spread of the disease by a multitype Galton–Watson process where groups (or cliques) are represented by the particles of the branching process. The types of the approximating branching process would then be vectors in $\mathbb{N}^2$ of the form $(m,n)$, where $m$ represents the size of the clique and $n$ represents the number of members of the clique that the primary case of the clique attempts to infect. Another possible approach would be to use an infinite-type branching process in the spirit of Ball et al. (2014). We believe that the result would be analogous to the results obtained in Ball et al. (2014).
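The matching procedure for larger group sizes described above can be sketched as follows. This is a minimal illustration, not an implementation from the paper: the function name and the data layout are ours, consecutive blocks of the shuffled list are taken as the groups, and a group in which the same node appears twice (the analogue of a self-loop) is left in place rather than erased.

```python
import random


def form_groups(degrees, group_sizes, rng):
    """Form cliques from per-node group degrees.

    degrees[v][i] is the number of groups of size group_sizes[i] that node v belongs to.
    Returns a list of tuples of node labels, one tuple per group.
    """
    groups = []
    for i, size in enumerate(group_sizes):
        # One list per group size: node v appears degrees[v][i] times.
        stubs = [v for v, degree in enumerate(degrees) for _ in range(degree[i])]
        if len(stubs) % size != 0:
            raise ValueError(f"list for group size {size} cannot be split evenly")
        rng.shuffle(stubs)
        # Consecutive blocks of length `size` in the shuffled list form the groups.
        groups.extend(tuple(stubs[k:k + size]) for k in range(0, len(stubs), size))
    return groups
```

For example, `form_groups([(1, 1)] * 6, (2, 3), random.Random(0))` places six nodes into three groups of size 2 and two groups of size 3. With `group_sizes = (2, 3)` the construction reduces to single edges and triangles.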

Acknowledgements

This research is supported by the Swedish Research Council (Vetenskapsrådet) Grant 2016-04566. We wish to thank the members of the journal club on infectious diseases at Stockholm University and Daniel Ahlberg for suggestions that led to substantial improvements of the paper.

Appendix: Proof of proposition 1

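Let $\bar d=\{(S_i,\Delta_i)\}_{i\in\mathbb{N}}$ be a given (i.e. non-random) degree sequence that satisfies the following regularity assumptions.

  1. $\frac{1}{N}\sum_{i=1}^N\mathbb{1}(S_i=k_1,\Delta_i=k_2)\to p(k_1,k_2)$ for any $k_1,k_2\in\mathbb{Z}_{\ge0}$.

  2. $\frac{1}{N}\sum_{i=1}^N\Delta_i^2\to E(\Delta^2)$ and $\frac{1}{N}\sum_{i=1}^NS_i^2\to E(S^2)$

where $(S,\Delta)$ has distribution $p$, which is assumed to satisfy A1–A2 in Sect. 2.1. Let further $\bar G=\{G_N\}_{N\in\mathbb{N}}$ be a sequence of graphs generated by the CMC, where the degree sequence of $G_N$ is given by $\bar d_N=\{(S_i,\Delta_i)\}_{i=1}^N$ and denote $D_S^{(N)}=\sum_{i=1}^NS_i$.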

Under the assumptions A1–A2 the expected number of self-loops and the expected number of multiple edges are both of order O(1) [cf. Van der Hofstad (2016, prop. 7.11)]. Denote by $A_N$ the number of wedges of $G_N$ that are “deleted” when merging multiple edges and erasing self-loops, that is

$$A_N=\sum_{i=1}^N2\binom{S_i+2\Delta_i}{2}-|W^{G_N}|=\sum_{i=1}^N(S_i+2\Delta_i)(S_i+2\Delta_i-1)-|W^{G_N}|,$$

then $E(A_N)=O(1)$.

From the definition of $A_N$, we deduce that the total number of ordered triangles of $G_N$ is bounded from below by $|W_\Delta^{G_N}|\ge\sum_{i=1}^N2\Delta_i-A_N$ and the total number of ordered wedges is bounded from above by

$$|W^{G_N}|\le\sum_{i=1}^N2\binom{S_i+2\Delta_i}{2}=\sum_{i=1}^N(S_i+2\Delta_i)(S_i+2\Delta_i-1).$$

Therefore, by the definition of $C(G_N)$ and the assumptions above

$$C(G_N)\ge\frac{\left(\sum_{i=1}^N2\Delta_i-A_N\right)/N}{\sum_{i=1}^N(S_i+2\Delta_i)(S_i+2\Delta_i-1)/N}\xrightarrow{P}\frac{E(2\Delta)}{E((2\Delta+S)^2)-E(2\Delta+S)} \tag{31}$$

as $N\to\infty$.

This lower bound is tight in the limit as the number of nodes $N\to\infty$. Indeed, denote by $W_s^{G_N}$ the set of ordered triangles of $G_N$ that consist solely of single edges, i.e.

$$W_s^{G_N}=\{(u,v,w)\in V_N^3:(u,v),(u,w)\text{ and }(v,w)\text{ are single edges}\},$$

where $V_N$ is the node set of $G_N$. Now, whenever $D_S^{(N)}\ge6$

$$E|W_s^{G_N}|\le\sum_iS_i^2\sum_j\frac{S_j}{D_S^{(N)}-2}\sum_l\frac{S_l}{D_S^{(N)}-3}\cdot\frac{(S_j-1)(S_l-1)}{D_S^{(N)}-5} \tag{32}$$

where the sums run over the integers $1,\ldots,N$.

Dividing by $N$ in (32) and letting $N$ approach infinity gives $E(|W_s^{G_N}|)/N\to0$ as $N\to\infty$. Thus $|W_s^{G_N}|/N\to0$ in probability. Repeating this procedure for triangles formed by a combination of triangle and single edges gives

$$C(G_N)\xrightarrow{P}\frac{E(2\Delta)}{E((2\Delta+S)^2)-E(2\Delta+S)}. \tag{33}$$

The assertion now follows by bounded convergence and the law of large numbers.
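The limit in (33) is easy to evaluate for a finite degree distribution. The sketch below computes it in exact arithmetic for the three degree distributions of the numerical example, where `p` maps a pair (single degree, triangle degree) to its probability. The function and variable names are ours.

```python
from fractions import Fraction


def limiting_clustering(p):
    """Evaluate E(2*Delta) / (E((2*Delta + S)^2) - E(2*Delta + S)) for a finite distribution p."""
    mean_triangle_edges = sum(prob * 2 * d for (s, d), prob in p.items())
    mean_degree = sum(prob * (s + 2 * d) for (s, d), prob in p.items())
    mean_degree_squared = sum(prob * (s + 2 * d) ** 2 for (s, d), prob in p.items())
    return mean_triangle_edges / (mean_degree_squared - mean_degree)


# The three degree distributions of the numerical example; keys are (s, delta).
distribution_1 = {(2, 1): Fraction(1)}
distribution_2 = {(4, 0): Fraction(19, 20), (2, 1): Fraction(1, 20)}
distribution_3 = {(0, 2): Fraction(19, 20), (2, 1): Fraction(1, 20)}
```

Since the total degree is 4 in all three cases, the denominator is always 12, and the limits are 1/6, 1/120 and 13/40 respectively, which quantifies the "near absence" and "presence" of clustering referred to in the numerical example.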

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Carolina Fransson, Phone: +468-16 45 82, Email: carolina.fransson@math.su.se.

Pieter Trapman, Email: ptrapman@math.su.se.

References

  1. Andersson H. Epidemic models and social networks. Math Sci. 1999;24(2):128–147. [Google Scholar]
  2. Ball F. Deterministic and stochastic epidemics with several kinds of susceptibles. Adv Appl Probab. 1985;17(1):1–22. doi: 10.2307/1427049. [DOI] [Google Scholar]
  3. Ball F, Sirl D, Trapman P. Threshold behaviour and final outcome of an epidemic on a random network with household structure. Adv Appl Probab. 2009;41(3):765–796. doi: 10.1239/aap/1253281063. [DOI] [Google Scholar]
  4. Ball F, Sirl D, Trapman P. Analysis of a stochastic SIR epidemic on a random network incorporating household structure. Math Biosci. 2010;224(2):53–73. doi: 10.1016/j.mbs.2009.12.003. [DOI] [PubMed] [Google Scholar]
  5. Ball F, Sirl D, Trapman P. Epidemics on random intersection graphs. Ann Appl Probab. 2014;24(3):1081–1128. doi: 10.1214/13-AAP942. [DOI] [Google Scholar]
  6. Ball F, Pellis L, Trapman P. Reproduction numbers for epidemic models with households and other social structures II: comparisons and implications for vaccination. Math Biosci. 2016;274:108–139. doi: 10.1016/j.mbs.2016.01.006. [DOI] [PubMed] [Google Scholar]
  7. Barbour A, Reinert G. Approximating the epidemic curve. Electron J Probab. 2013;18(54):1–30. [Google Scholar]
  8. Bhamidi S, Van der Hofstad R, Komjáthy J. The front of the epidemic spread and first passage percolation. J Appl Probab. 2014;51(A):101–121. doi: 10.1239/jap/1417528470. [DOI] [Google Scholar]
  9. Bollobás B. A probabilistic proof of an asymptotic formula for the number of labelled regular graphs. Eur J Comb. 1980;1(4):311–316. doi: 10.1016/S0195-6698(80)80030-8. [DOI] [Google Scholar]
  10. Bollobás B, Janson S, Riordan O. Sparse random graphs with clustering. Random Struct Algorithms. 2011;38(3):269–323. doi: 10.1002/rsa.20322. [DOI] [Google Scholar]
  11. Britton T. Stochastic epidemic models: a survey. Math Biosci. 2010;225(1):24–35. doi: 10.1016/j.mbs.2010.01.006. [DOI] [PubMed] [Google Scholar]
  12. Britton T, Janson S, Martin-Löf A. Graphs with specified degree distributions, simple epidemics, and local vaccination strategies. Adv Appl Probab. 2007;39(4):922–948. doi: 10.1239/aap/1198177233. [DOI] [Google Scholar]
  13. Britton T, Deijfen M, Lagerås AN, Lindholm M. Epidemics on random graphs with tunable clustering. J Appl Probab. 2008;45(3):743–756. doi: 10.1239/jap/1222441827. [DOI] [Google Scholar]
  14. Coupechoux E, Lelarge M. Contagions in random networks with overlapping communities. Adv Appl Probab. 2015;47(4):973–988. doi: 10.1239/aap/1449859796. [DOI] [Google Scholar]
  15. Diekmann O, Heesterbeek J, Metz JAJ. On the definition and the computation of the basic reproduction ratio R0 in models for infectious diseases in heterogeneous populations. J Math Biol. 1990;28(4):365–382. doi: 10.1007/BF00178324. [DOI] [PubMed] [Google Scholar]
  16. Diekmann O, Heesterbeek H, Britton T (2013) Mathematical tools for understanding infectious disease dynamics. Princeton University Press, Princeton. http://www.jstor.org/stable/j.cttq9530. Accessed 22 Jan 2018
  17. Erdős P, Rényi A. On random graphs I. Publ Math Debrecen. 1959;6:290–297. [Google Scholar]
  18. Goldstein E, Paur K, Fraser C, Kenah E, Wallinga J, Lipsitch M. Reproductive numbers, epidemic spread and control in a community of households. Math Biosci. 2009;221(1):11–25. doi: 10.1016/j.mbs.2009.06.002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Janson S, Luczak M, Windridge P. Law of large numbers for the SIR epidemic on a random graph with given degrees. Random Struct Algorithms. 2014;45(4):726–763. doi: 10.1002/rsa.20575. [DOI] [Google Scholar]
  20. Karoński M, Scheinerman E, Singer-cohen K. On random intersection graphs: the subgraph problem. Comb Probab Comput. 1999;8(1–2):131–159. doi: 10.1017/S0963548398003459. [DOI] [Google Scholar]
  21. Kenah E, Miller J (2011) Epidemic percolation networks, epidemic outcomes, and interventions. Interdiscip Perspect Infect Dis. 10.1155/2011/543520 [DOI] [PMC free article] [PubMed]
  22. Kenah E, Robins JM. Second look at the spread of epidemics on networks. Phys Rev E. 2007;76(036):113. doi: 10.1103/PhysRevE.76.036113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Kiss I, Miller J, Simon P (2017) Mathematics of epidemics on networks: from exact to approximate models. Springer, Berlin
  24. Kuulasmaa K. The spatial general epidemic and locally dependent random graphs. J Appl Probab. 1982;19(4):745–758. doi: 10.2307/3213827. [DOI] [Google Scholar]
  25. Ludwig D. Final size distribution for epidemics. Math Biosci. 1975;23(1):33–46. doi: 10.1016/0025-5564(75)90119-4. [DOI] [Google Scholar]
  26. Meester R, Trapman P. Bounding the size and probability of epidemics on networks. Adv Appl Probab. 2011;43(2):335–347. doi: 10.1239/aap/1308662482. [DOI] [Google Scholar]
  27. Meyers L. Contact network epidemiology: bond percolation applied to infectious disease prediction and control. Bull Am Math Soc. 2007;44(1):63–86. doi: 10.1090/S0273-0979-06-01148-7.
  28. Miller J. Epidemic size and probability in populations with heterogeneous infectivity and susceptibility. Phys Rev E. 2007;76:010101(R). doi: 10.1103/PhysRevE.76.010101.
  29. Miller J. Bounding the size and probability of epidemics on networks. J Appl Probab. 2008;45(2):498–512. doi: 10.1239/jap/1214950363.
  30. Miller J. Percolation and epidemics in random clustered networks. Phys Rev E. 2009;80:020901. doi: 10.1103/PhysRevE.80.020901.
  31. Molloy M, Reed B. A critical point for random graphs with a given degree sequence. Random Struct Algorithms. 1995;6(2–3):161–180. doi: 10.1002/rsa.3240060204.
  32. Molloy M, Reed B. The size of the giant component of a random graph with a given degree sequence. Comb Probab Comput. 1998;7(3):295–305. doi: 10.1017/S0963548398003526.
  33. Newman M. Spread of epidemic disease on networks. Phys Rev E. 2002;66:016128. doi: 10.1103/PhysRevE.66.016128.
  34. Newman M. Properties of highly clustered networks. Phys Rev E. 2003;68:026121. doi: 10.1103/PhysRevE.68.026121.
  35. Newman M. Random graphs with clustering. Phys Rev Lett. 2009;103:058701. doi: 10.1103/PhysRevLett.103.058701.
  36. Newman M, Watts D, Strogatz S. Random graph models of social networks. Proc Natl Acad Sci USA. 2002;99(suppl 1):2566–2572. doi: 10.1073/pnas.012582999.
  37. Pellis L, Ferguson NM, Fraser C. The relationship between real-time and discrete-generation models of epidemic spread. Math Biosci. 2008;216(1):63–70. doi: 10.1016/j.mbs.2008.08.009.
  38. Pellis L, Ball F, Trapman P. Reproduction numbers for epidemic models with households and other social structures. I. Definition and calculation of R0. Math Biosci. 2012;235(1):85–97. doi: 10.1016/j.mbs.2011.10.009.
  39. Trapman P. On analytical approaches to epidemics on networks. Theor Popul Biol. 2007;71(2):160–173. doi: 10.1016/j.tpb.2006.11.002.
  40. Van der Hofstad R. Random graphs and complex networks. Cambridge: Cambridge University Press; 2016.
  41. Varga RS. Matrix iterative analysis. Berlin: Springer; 2009. doi: 10.1007/978-3-642-05156-2.
  42. Volz E, Miller J, Galvani A, Meyers L. Effects of heterogeneous and clustered contact patterns on infectious disease dynamics. PLoS Comput Biol. 2011;7:e1002042. doi: 10.1371/journal.pcbi.1002042.

Articles from Journal of Mathematical Biology are provided here courtesy of Springer
