2013 Dec 18;77(4):614–645. doi: 10.1007/s11538-013-9923-5

Exact Equations for SIR Epidemics on Tree Graphs

K J Sharkey 1, I Z Kiss 2, R R Wilkinson 1, P L Simon 3
PMCID: PMC4541714  PMID: 24347252

Abstract

We consider Markovian susceptible-infectious-removed (SIR) dynamics on time-invariant weighted contact networks where the infection and removal processes are Poisson and where network links may be directed or undirected. We prove that a particular pair-based moment closure representation generates the expected infectious time series for networks with no cycles in the underlying graph. Moreover, this “deterministic” representation of the expected behaviour of a complex heterogeneous and finite Markovian system is straightforward to evaluate numerically.

Keywords: Kolmogorov equation, Dimensional reduction

Introduction

Background

The majority of epidemic models fall either into the category of stochastic models (Bailey 1975; Bartlett 1956) or into the category of deterministic differential equation-based models (Anderson and May 1991; Kermack and McKendrick 1927). These two strands developed largely independently for much of the twentieth century. Thus, an interesting question arises as to the precise mathematical connection between stochastic and deterministic models. Frequently, deterministic descriptions apply to large populations where the stochastic effects can be treated as negligible. For small populations we shall assume that it is the average or expected behaviour of the epidemic that we are hoping to replicate with “deterministic” descriptions. This average behaviour is a system characteristic that is fully specified by the system and its initial conditions.

The first epidemic models were based on the assumption that populations are evenly mixed, with each individual equally likely to interact with any other individual at any time (Heathcote 2000). A classic example of this type of model is the Susceptible-Infectious-Removed (SIR) compartmental model whereby individuals are classified according to being in one of these three states. It has been shown that for this type of mean-field model, the average of many stochastic simulations (the expected outcome of the stochastic model) converges to the solution of the “equivalent” mean-field deterministic model in the limit of an infinite population size and subject to strict conditions regarding the initialisation of the epidemic (Kurtz 1970, 1971; Simon and Kiss 2011).

More recently, a higher degree of realism has been introduced by considering stochastic models on contact networks where individuals are only able to contact a limited subset of the population. This enables significant heterogeneity to be incorporated, treating individuals as distinct entities with fixed connectivity to pre-allocated neighbours. While stochastic models are readily extended to incorporate such systems, deterministic descriptions have been more problematic. Several methodologies have been developed including pair-approximation models (Keeling 1999; Keeling and Eames 2005; Rand 1999), degree-based models (Pastor-Satorras and Vespignani 2001), and models based on the probability generating function (PGF) formalism which are applicable to configuration networks (Volz 2008) as well as the related edge-based compartmental modelling (Miller et al. 2012; Miller and Volz 2012). It has been observed (House and Keeling 2011) that these models are, at some level, equivalent and are all derived from similar principles of independence. Although comparison with simulation of stochastic models can sometimes be good, the basic link remains obscure.

Typically there are two idealised scenarios in which exact correspondence between stochastic models and solvable deterministic descriptions has been shown. Firstly, correspondence has been shown to sometimes occur in the limit of infinite populations for particular idealised graphs (Ball and Neal 2008; Decreusefond et al. 2012) which cannot be exactly realised in practice. It can also occur with some very simplified systems whose symmetry properties can be exploited to achieve reductions in the stochastic description (Keeling and Ross 2008; Simon et al. 2011).

Here we consider a recently introduced class of model, related to the pair-approximation models, which gives an exact correspondence between a deterministic description and the stochastic model for SIR epidemics on finite, time-invariant networks. Pair-approximation models were introduced into network-based epidemic and ecological theory in the 1990s to describe large populations of interacting individuals (Matsuda et al. 1992; Sato et al. 1994; Harada and Iwasa 1994; Rand 1999; Keeling 1999). They are an example of a hierarchy of equations which are truncated at the second order by an approximation (truncation at the first order corresponds to mean-field). This type of hierarchy was first considered in statistical physics and is sometimes known as the Bogoliubov–Born–Green–Kirkwood–Yvon (BBGKY) hierarchy (Kirkwood 1946, 1947; Born and Green 1946). Recently, related models have been considered at the level of individuals, variously called subsystem equations, moment dynamics equations, or pair-based equations (Sharkey 2008, 2011; Baker and Simpson 2010; Markham et al. 2013). This method generates a solvable class of models which can encompass a significant amount of heterogeneity and enables a fundamental link with finite stochastic models (Sharkey 2008, 2011).

We consider a pair-based representation of Markovian SIR dynamics. We show that by considering subsystems at the level of pairs, a closure can be found that determines the expected infectious time series exactly for arbitrary network structures where the underlying graph is a tree and, in some special circumstances, for particular networks with cycles. We note that the recent, related message passing formulation of epidemics on contact networks developed by Karrer and Newman (2010) also enables an exact description of epidemic dynamics on finite tree graphs.

Statement of the Main Result

We consider an SIR compartmental model composed of P individuals whose states are described at any given point in time by vectors I and S with respective components Ii and Si, i∈{1,2,…,P}, such that Ii=1 if individual i is infectious (Ii=0 otherwise) and Si=1 if individual i is susceptible (Si=0 otherwise). Transmission and recovery occur by Poisson processes with rate parameters $\lambda_i = S_i \sum_{j=1}^{P} T_{ij} I_j$ and μi=γiIi, respectively, where T is a “transmission” matrix with (time-independent) elements Tij denoting the rate parameter for an infectious node j infecting a susceptible node i (Tii=0 for all i) and where γi denotes the rate parameter for an infectious individual i to recover, enabling individual-specific removal rates.

As shown by Sharkey (2011), for any transmission matrix T and any nodes i,j, the following differential equations are provably exact (consistent with the stochastic model):

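\[
\begin{aligned}
\dot{\langle S_i\rangle} &= -\sum_{j=1}^{P} T_{ij}\langle S_i I_j\rangle,\\
\dot{\langle I_i\rangle} &= \sum_{j=1}^{P} T_{ij}\langle S_i I_j\rangle - \gamma_i\langle I_i\rangle,\\
\dot{\langle S_i I_j\rangle} &= \sum_{k\neq i} T_{jk}\langle S_i S_j I_k\rangle - \sum_{k\neq j} T_{ik}\langle I_k S_i I_j\rangle - (T_{ij}+\gamma_j)\langle S_i I_j\rangle,\\
\dot{\langle S_i S_j\rangle} &= -\sum_{k\neq j} T_{ik}\langle I_k S_i S_j\rangle - \sum_{k\neq i} T_{jk}\langle S_i S_j I_k\rangle,
\end{aligned}
\tag{1}
\]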

where 〈Si〉 and 〈Ii〉 denote the time-dependent probabilities (or equivalently the expected values of the indicator functions) for individual i to be susceptible and infectious, respectively, and expressions of the form 〈AiBj〉 denote the time-dependent probability that individual i is in state A and individual j is in state B with a similar interpretation of terms of the form 〈AiBjCk〉. Here and throughout, we adopt the dot notation to denote time derivatives. It follows that the expected population-level susceptible and infectious time series are given by $\sum_{i=1}^{P}\langle S_i\rangle$ and $\sum_{i=1}^{P}\langle I_i\rangle$, respectively.

Note that it is a short step (see Sharkey 2008) from (1) to the familiar population-level pair equations (Keeling 1999; Keeling and Eames 2005; Rand 1999), also proved independently by Taylor et al. (2012) for the susceptible-infectious-susceptible variant.

This system can be completed by formulating differential equations for the triples, quadruples, and so forth until we reach the full system size. This yields a self-contained system of differential equations that exactly determines the probabilities of each quantity given initial conditions. However, cascading these equations up to the full system size will usually result in a system that is impractical to solve due to its sheer size. This is why this system is typically closed at some level by introducing a functional relation approximating higher-order probabilities in terms of lower-order ones. One of the most frequently used closure relations can be written as

\[
\langle A_i B_j C_k\rangle \approx \frac{\langle A_i B_j\rangle\langle B_j C_k\rangle}{\langle B_j\rangle}
\tag{2}
\]

for the current context. Applying this closure relation to our system at the level of pairs we arrive at the following system:

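\[
\begin{aligned}
\dot{\langle X_i\rangle} &= -\sum_{j=1}^{P} T_{ij}\langle X_i Y_j\rangle,\\
\dot{\langle Y_i\rangle} &= \sum_{j=1}^{P} T_{ij}\langle X_i Y_j\rangle - \gamma_i\langle Y_i\rangle,\\
\dot{\langle X_i Y_j\rangle} &= \sum_{k\neq i} T_{jk}\frac{\langle X_i X_j\rangle\langle X_j Y_k\rangle}{\langle X_j\rangle} - \sum_{k\neq j} T_{ik}\frac{\langle Y_k X_i\rangle\langle X_i Y_j\rangle}{\langle X_i\rangle} - (T_{ij}+\gamma_j)\langle X_i Y_j\rangle,\\
\dot{\langle X_i X_j\rangle} &= -\sum_{k\neq j} T_{ik}\frac{\langle Y_k X_i\rangle\langle X_i X_j\rangle}{\langle X_i\rangle} - \sum_{k\neq i} T_{jk}\frac{\langle X_i X_j\rangle\langle X_j Y_k\rangle}{\langle X_j\rangle},
\end{aligned}
\tag{3}
\]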

where we use X for susceptible and Y for infectious to emphasise that these are approximating differential equations based on the closure. When 〈Xi〉 in the denominator is zero, we assume that the approximation takes the value zero.
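As an illustration, the closed pair-level system can be integrated with a few lines of code. The following is a minimal sketch, not the Matlab code of Sharkey (2011): it assumes the closed equations have the standard pair-based form implied by (1) and the closure (2), tracks pair variables only for linked nodes, and uses a fixed-step fourth-order Runge-Kutta scheme. The function names (`pair_closure_rhs`, `rk4_step`, `solve_pair_closure`) and the dictionary layout are hypothetical choices made for this example.

```python
import math

def pair_closure_rhs(state, T, gamma):
    """Right-hand side of the closed pair-level system (assumed form of (3))."""
    P = len(T)

    def closed(pair_a, pair_b, single):
        # Closure (2); by the convention in the text the term is zero if <X> = 0.
        return pair_a * pair_b / single if single > 0.0 else 0.0

    d = {}
    for i in range(P):
        force = sum(T[i][j] * state["XY", i, j] for j in range(P) if T[i][j])
        d["X", i] = -force
        d["Y", i] = force - gamma[i] * state["Y", i]
    for key, value in state.items():
        if key[0] not in ("XY", "XX"):
            continue
        _, i, j = key
        # Infection of i by a neighbour k != j (closed triple with Y_k, X_i).
        hit_i = sum(T[i][k] * closed(state["XY", i, k], value, state["X", i])
                    for k in range(P) if T[i][k] and k != j)
        # Infection of j by a neighbour k != i (closed triple X_i, X_j, Y_k).
        hit_j = sum(T[j][k] * closed(state["XX", i, j], state["XY", j, k], state["X", j])
                    for k in range(P) if T[j][k] and k != i)
        if key[0] == "XY":
            d[key] = hit_j - hit_i - (T[i][j] + gamma[j]) * value
        else:
            d[key] = -hit_i - hit_j
    return d

def rk4_step(f, state, h):
    """One classical Runge-Kutta step for a dict-valued state."""
    def shifted(slope, scale):
        return {k: state[k] + scale * slope[k] for k in state}
    k1 = f(state)
    k2 = f(shifted(k1, h / 2))
    k3 = f(shifted(k2, h / 2))
    k4 = f(shifted(k3, h))
    return {k: state[k] + h / 6 * (k1[k] + 2 * k2[k] + 2 * k3[k] + k4[k]) for k in state}

def solve_pair_closure(T, gamma, init, t_end, h=0.01):
    """Integrate from a pure initial state given as a list of 'S', 'I', 'R'."""
    P = len(T)
    state = {}
    for i in range(P):
        state["X", i] = 1.0 if init[i] == "S" else 0.0
        state["Y", i] = 1.0 if init[i] == "I" else 0.0
    for i in range(P):
        for j in range(P):
            if i != j and (T[i][j] or T[j][i]):  # pairs are tracked for linked nodes only
                state["XY", i, j] = state["X", i] * state["Y", j]
                state["XX", i, j] = state["X", i] * state["X", j]
    for _ in range(round(t_end / h)):
        state = rk4_step(lambda s: pair_closure_rhs(s, T, gamma), state, h)
    return state

# Usage: two linked nodes (indexed from 0), node 1 infectious at t = 0.
# For this case <S_0>(t) has a closed form, printed alongside for comparison.
tau, gam, t_end = 0.1, 0.05, 10.0
final = solve_pair_closure([[0, tau], [tau, 0]], [gam, gam], ["S", "I"], t_end)
exact_s0 = 1 - tau / (tau + gam) * (1 - math.exp(-(tau + gam) * t_end))
print(final["X", 0], exact_s0)
```

The expected infectious time series is obtained by summing `state["Y", i]` over all nodes at each step. The step size `h` is fixed for simplicity; an adaptive solver would be a natural replacement.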

In general, we consider networks (graphs) with directed and undirected edges. In what follows, we use the terminology “tree graph” to include graphs with directed edges where the underlying (equivalent undirected) graph is a tree. Our main aim is to show that when matrix T represents a tree and the system is initiated in a pure system state (that is, one of the $3^P$ possible configurations has probability 1 at time t=0), then the system can be closed at the level of pairs such that the closure holds exactly. Specifically, solving the closed system above, we obtain the same values for all marginal and pairwise-joint probabilities present in the unclosed system: 〈Xi〉=〈Si〉, 〈Yi〉=〈Ii〉, with similar equalities holding for the pairs.

In fact, we will prove the following theorem.

Theorem 1.1

Let us assume the following:

  • The graph (transmission network) is a tree (the underlying graph has no cycles).

  • The initial condition is a pure state, i.e. the system is initially in one of its $3^P$ possible configurations with probability 1.

Then the following relations hold:

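\[
\langle S_i S_j I_k\rangle = \frac{\langle S_i S_j\rangle\langle S_j I_k\rangle}{\langle S_j\rangle}
\]

for all i∈{1,2,…,P} and for all j with links towards i and all k with links towards j: i≠k;

\[
\langle I_j S_i I_k\rangle = \frac{\langle I_j S_i\rangle\langle S_i I_k\rangle}{\langle S_i\rangle}
\]

for all i∈{1,2,…,P} and for all j and k with links towards i: j≠k.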

Remark

This theorem also holds for mixed (probabilistic) initial system states provided that the initial probabilities of the states of individuals in the system are uncorrelated. However, in general, mixed initial states cannot be represented exactly.

The theorem will be formulated in a more general context stating that even higher-order closure relations are also exact.

Figure 1 shows the numerical solution of (3) for a small network of 9 nodes; the solution agrees with the mean of the stochastic simulations to within the precision visible on the graph. Matlab code for solving the system of Eqs. (3) is provided in Sharkey (2011). This code also works on networks which are not trees, but it is no longer exact in these cases. For cycles of order three in the underlying graph, the code uses the alternative closure 〈AiBjCk〉=〈AiBj〉〈BjCk〉〈AiCk〉/(〈Ai〉〈Bj〉〈Ck〉), which is believed to be more accurate in most circumstances; such cycles do not occur in the tree graphs considered in the present work.

Fig. 1. (a) An undirected tree indicating two nodes which we infect to initiate epidemics, with all other nodes initially susceptible. (b) The mean (dots) of 100,000 stochastic simulations on the network with transmission rate τ=0.1 across each link and removal rate γ=0.05 for each node, with error bars denoting the 5th and 95th percentiles plotted together with the solution of (3) (solid line) using the Matlab code published with Sharkey (2011).
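The stochastic side of such a comparison can be sketched with a standard Gillespie (stochastic simulation) algorithm. This is an assumed implementation for illustration only; the source does not state how its simulations were performed, and the function names `gillespie_sir` and `mean_counts` are hypothetical.

```python
import random

def gillespie_sir(T, gamma, init, t_end, rng):
    """One realisation of the Markovian SIR process; returns node states at t_end."""
    P = len(T)
    state = list(init)
    t = 0.0
    while True:
        # Collect every possible event together with its Poisson rate.
        events = []
        for i in range(P):
            if state[i] == "I" and gamma[i] > 0.0:
                events.append((gamma[i], i, "R"))  # removal of i
            elif state[i] == "S":
                rate = sum(T[i][j] for j in range(P) if state[j] == "I")
                if rate > 0.0:
                    events.append((rate, i, "I"))  # infection of i
        total = sum(event[0] for event in events)
        if total == 0.0:
            return state  # nothing can happen any more
        t += rng.expovariate(total)  # exponential waiting time to the next event
        if t > t_end:
            return state
        # Pick one event with probability proportional to its rate.
        u = rng.random() * total
        chosen = events[-1]  # fallback guards against floating-point round-off
        for event in events:
            u -= event[0]
            if u < 0.0:
                chosen = event
                break
        state[chosen[1]] = chosen[2]

def mean_counts(T, gamma, init, t_end, runs, seed=0):
    """Monte Carlo estimate of the expected numbers of S and I nodes at t_end."""
    rng = random.Random(seed)
    s_total = i_total = 0
    for _ in range(runs):
        final = gillespie_sir(T, gamma, init, t_end, rng)
        s_total += final.count("S")
        i_total += final.count("I")
    return s_total / runs, i_total / runs

# Usage: open triple with both end nodes initially infectious (nodes indexed from 0).
tau, gam = 0.1, 0.05
triple = [[0, tau, 0], [tau, 0, tau], [0, tau, 0]]
print(mean_counts(triple, [gam] * 3, ["I", "S", "I"], t_end=10.0, runs=2000))
```

Averaging over many runs at a grid of times gives the empirical mean time series against which a deterministic solution can be compared.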

The structure of the paper is as follows. Section 2 introduces some notation which is needed to prove the result. This also contains an important theorem (Theorem 2.1) which specifies equations describing the probabilities of the states of arbitrary subsystems (the proof of this result is given in the Appendix). The relevant state space for our domain of a tree graph is then developed. Section 3 proves the main result, initially focusing on some special cases to help motivate and facilitate understanding of the main ideas and steps of the general proof in Sect. 3.5. The main ingredient for the general proof is Lemma 3.1 which is proved via Theorem 2.1. Theorem 3.1 then follows easily by induction from Lemma 3.1. The theorem as stated above is a simple corollary of Theorem 3.1. In Sect. 4 we discuss an application of the pair-based model to some special cases of graphs with cycles where it is also exact.

Formulating the Full System

In this section we introduce a new notation which will assist in formulating the set of differential equations for the full system. In (1) we formulated the differential equations up to the level of pairs and we said that this could be continued up to the full system level. This will be done formally here. In order to make the method clearer, using our existing notation let us first evaluate the full set of equations for the undirected line graph with three nodes which we refer to as the open triple, depicted in Fig. 2. Here we shall assume that the transmission rate parameter is τ across both links and that the removal rate is γ for all three nodes. Firstly we write all of the single node equations for this network. From (1):

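\[
\begin{aligned}
\dot{\langle I_1\rangle} &= \tau\langle S_1 I_2\rangle - \gamma\langle I_1\rangle,\\
\dot{\langle I_2\rangle} &= \tau\langle I_1 S_2\rangle + \tau\langle S_2 I_3\rangle - \gamma\langle I_2\rangle,\\
\dot{\langle I_3\rangle} &= \tau\langle I_2 S_3\rangle - \gamma\langle I_3\rangle,
\end{aligned}
\tag{4}
\]

and

\[
\begin{aligned}
\dot{\langle S_1\rangle} &= -\tau\langle S_1 I_2\rangle,\\
\dot{\langle S_2\rangle} &= -\tau\langle I_1 S_2\rangle - \tau\langle S_2 I_3\rangle,\\
\dot{\langle S_3\rangle} &= -\tau\langle I_2 S_3\rangle.
\end{aligned}
\tag{5}
\]

We also need to specify the following equations for pairs:

\[
\begin{aligned}
\dot{\langle S_1 I_2\rangle} &= \tau\langle S_1 S_2 I_3\rangle - (\tau+\gamma)\langle S_1 I_2\rangle,\\
\dot{\langle I_1 S_2\rangle} &= -\tau\langle I_1 S_2 I_3\rangle - (\tau+\gamma)\langle I_1 S_2\rangle,\\
\dot{\langle S_2 I_3\rangle} &= -\tau\langle I_1 S_2 I_3\rangle - (\tau+\gamma)\langle S_2 I_3\rangle,\\
\dot{\langle I_2 S_3\rangle} &= \tau\langle I_1 S_2 S_3\rangle - (\tau+\gamma)\langle I_2 S_3\rangle.
\end{aligned}
\tag{6}
\]

Finally, at the triple level we have from the master equation (since the system has only three nodes):

\[
\begin{aligned}
\dot{\langle S_1 S_2 I_3\rangle} &= -(\tau+\gamma)\langle S_1 S_2 I_3\rangle,\\
\dot{\langle I_1 S_2 I_3\rangle} &= -2(\tau+\gamma)\langle I_1 S_2 I_3\rangle,\\
\dot{\langle I_1 S_2 S_3\rangle} &= -(\tau+\gamma)\langle I_1 S_2 S_3\rangle.
\end{aligned}
\tag{7}
\]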

Fig. 2. Open triple graph

In order to formulate the full system for an arbitrary graph, we introduce notation for the subsystem states.

Notation for System and Subsystem States

In general, our stochastic system (which we denote by Γ) comprises P individuals, each of which may be in any of the S, I or R states at any given time. In total, this corresponds to $3^P$ possible states. Denoting these system states by Γα, $\alpha\in\{1,2,\dots,3^P\}$, the probabilities for each state are given by the master equation (or Kolmogorov equations):

\[
\dot{\langle \Gamma^{\alpha}\rangle} = \sum_{\beta=1}^{3^P} \sigma^{\alpha\beta}\langle \Gamma^{\beta}\rangle
\tag{8}
\]

where σ denotes a constant matrix of Poisson rate parameters. The master equation completely describes our stochastic system using a set of $3^P$ ordinary differential equations. Our overall objective is to show that (3) is implied by the master equation when T represents a tree graph.
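For small P the master equation can be built and integrated directly, which gives a reference solution against which any closure can be checked. The sketch below is illustrative only: it enumerates all configurations, stores the non-zero rates as a transition list rather than a dense matrix σ, and integrates with a fixed-step Runge-Kutta scheme. The names `build_transitions`, `solve_master` and `prob` are hypothetical.

```python
from itertools import product

def build_transitions(T, gamma):
    """Enumerate the 3^P configurations and all transitions (source, target, rate)."""
    P = len(T)
    states = list(product("SIR", repeat=P))
    index = {s: a for a, s in enumerate(states)}
    transitions = []
    for s in states:
        for i in range(P):
            if s[i] == "I" and gamma[i] > 0:
                target = s[:i] + ("R",) + s[i + 1:]
                transitions.append((index[s], index[target], gamma[i]))
            elif s[i] == "S":
                rate = sum(T[i][j] for j in range(P) if s[j] == "I")
                if rate > 0:
                    target = s[:i] + ("I",) + s[i + 1:]
                    transitions.append((index[s], index[target], rate))
    return states, index, transitions

def solve_master(T, gamma, init, t_end, h=0.01):
    """Integrate the master equation from a pure initial configuration."""
    states, index, transitions = build_transitions(T, gamma)
    p = [0.0] * len(states)
    p[index[tuple(init)]] = 1.0

    def rhs(q):
        dq = [0.0] * len(q)
        for src, dst, rate in transitions:
            flow = rate * q[src]  # probability flux from src to dst
            dq[src] -= flow
            dq[dst] += flow
        return dq

    for _ in range(round(t_end / h)):
        k1 = rhs(p)
        k2 = rhs([x + h / 2 * k for x, k in zip(p, k1)])
        k3 = rhs([x + h / 2 * k for x, k in zip(p, k2)])
        k4 = rhs([x + h * k for x, k in zip(p, k3)])
        p = [x + h / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return states, p

def prob(states, p, spec):
    """Probability that the nodes in spec (node -> symbol) are in the given states."""
    return sum(q for s, q in zip(states, p)
               if all(s[n] == a for n, a in spec.items()))

# Usage: open triple (nodes indexed from 0), end nodes infectious at t = 0.
tau, gam = 0.1, 0.05
triple = [[0, tau, 0], [tau, 0, tau], [0, tau, 0]]
states, p = solve_master(triple, [gam] * 3, ["I", "S", "I"], t_end=5.0)
alpha = (prob(states, p, {0: "I", 1: "S", 2: "I"}) * prob(states, p, {1: "S"})
         - prob(states, p, {0: "I", 1: "S"}) * prob(states, p, {1: "S", 2: "I"}))
print(alpha)  # closure residual for the triple I-S-I
```

The cost grows as $3^P$, which is why this route is only practical for very small networks.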

It is useful for us to define a general subsystem ψW comprising r nodes in Γ indexed by vector W of length r: W=(W1,W2,…,Wi,…,Wr), Wi∈{1,2,…,P}, where we can assume W1<W2<⋯<Wr. We assume that the network connections of the nodes of ψW are a subset of the connections of Γ.

Let $\psi_W^A$ denote the state of subsystem ψW where A=(A1,A2,…,Ar) and Ai∈{S,I,R} ∀i∈{1,2,…,r} is a sequence of S,I and R symbols of length r such that the state of node Wi is Ai. In terms of the notation of the previous section for subsystems of single nodes and pairs of nodes, we have $\psi_i^{S}=S_i$, $\psi_i^{I}=I_i$, $\psi_{i,j}^{S,I}=S_iI_j$, $\psi_{i,j}^{S,S}=S_iS_j$ etc. We shall use these two notations interchangeably. We shall also sometimes treat indexing vectors such as W as sets such that n∈W means that the node n is in the subsystem ψW.

In general, although we can specify the states of each node with this type of notation, an important ambiguity remains because information about the network structure is not included. To remove this ambiguity, the notation should normally be used in the context of a sketch of the relevant network structure or where the network structure is clear from the context of its use (as in (1)).

Let us now show how the differential equations of the different subsystem states can be formulated in general.

Differential Equations for Subsystems

Here we obtain differential equations describing the rate of change of the state of any subsystem. First we make some definitions.

Definition 2.1

A neighbour of node i is a node with a network link directed towards i.

Definition 2.2

Ni denotes the set of neighbours of node i. That is: Tij≠0 ∀j∈Ni.

Definition 2.3

For the subsystem Inline graphic, if node Wk is infectious then:

graphic file with name M20.gif

Otherwise, Inline graphic

Remark

This operator changes the state of node Wk in subsystem ψW to S if it is infectious. If node Wk is susceptible or removed then it leaves the state unchanged.

Definition 2.4

For the subsystem state $\psi_W^A$ of r nodes, a subsystem of r+1 nodes can be generated as follows: Take k∈{1,2,…,r} and take a neighbour n of Wk outside of the subsystem with a network link towards Wk, i.e. let $n\in N_{W_k}$, $n\notin W$. If Ak=S, then the generated subsystem state of (r+1) nodes is given by the generating rule:

graphic file with name M24.gif

i.e. the subsystem is extended by an infected at node n which is connected towards Wk. If Ak=I, then the generated subsystem state is given by:

graphic file with name M25.gif

i.e. the subsystem is extended by an infected at node n which is connected towards Wk and the state of node Wk is changed from I to S.

To complete the definition, if Ak=R then the operator Inline graphic leaves the subsystem unchanged. We also assume that for any state Ak where there is no link from node n to node Wk in the transmission matrix T, then the subsystem is also left unchanged.

Remark

The generated order r+1 subsystem is obtained by replacing a susceptible or infectious node Wk in the original subsystem by an SI arc such that the S node of the arc is put in the place of the node Wk and where the I node of the arc is external to the subsystem.

Definition 2.5

For the subsystem Inline graphic, if node Wk is removed then:

graphic file with name M28.gif

Otherwise, Inline graphic.

Definition 2.6

For any subsystem ψW of r nodes in state Inline graphic we define Inline graphic where k∈{1,2,…,r} and a∈{S,I,R} to have value 1 if Ak=a and to have value zero otherwise:

graphic file with name M32.gif

Theorem 2.1

The rate of change of the probability of a subsystem state $\psi_W^A$ is:

graphic file with name 11538_2013_9923_Equ9_HTML.gif 9

The proof of this theorem is a rather long diversion and can be found in the Appendix.

As an example of applying the theorem, we can use it to obtain the set of subsystem equations (1) by considering each equation in turn:

  • If the subsystem is a single susceptible individual Inline graphic, then r=1 so k can only take the value k=1 where W1=i and A1=S, reducing (9) to:
    graphic file with name M35.gif
    The first term on the second line of (9) is zero because Tii=0, and the other terms are zero because Inline graphic and Inline graphic.
  • For an infectious individual Inline graphic we obtain:
    graphic file with name M39.gif
    where the first term on the second line of (9) is zero because Tii=0 and the last term is zero because Inline graphic.
  • If the subsystem is the pair Inline graphic then the sum over k is over k=1 and k=2 and W1=i, W2=j, A1=S, A2=I so:
    graphic file with name M42.gif
    where the first line corresponds to k=1 and the second to k=2.
  • If the subsystem is the pair Inline graphic then the sum is over k=1 and k=2 where W1=i, W2=j, A1=S and A2=S so:
    graphic file with name M44.gif
    where both terms come from the first line of (9).

We have therefore obtained (1) in a slightly different notation (recall that Tii=0 ∀i∈{1,2,…,P}).

The State Space for a Tree Graph

Here we build up a state space which is sufficient to describe a tree graph. We first make some definitions.

Definition 2.7

An r-motif is a subsystem of Γ comprising r nodes and network links such that it forms a weakly connected network.

Definition 2.8

An r-state is the state of an r-motif.

The state space that we need to consider is built up inductively from the states of single nodes by considering the infection process. Starting with the infected states of the single nodes $I_i$, i∈{1,2,…,P}, (9) shows that they depend on the 2-states $S_iI_j$, j∈Ni, as described by the generating rule (Definition 2.4).

The differential equations for $S_iI_j$ in turn contain the 3-states $S_iS_jI_k$, k∈Nj and $I_kS_iI_j$, k∈Ni. The differential equations for the 3-states contain 4-states and typically, the differential equations for r-states contain (r+1)-states for r∈{1,2,…,(P−1)}. This state generation process can continue until we reach P-states which can only depend on other P-states.

Note that this process always forms subsystems which are motifs and that the motif states can never include removed nodes.

Definition 2.9

An out-neighbour of node i is a node with a network link from i towards it.
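In terms of the transmission matrix, the neighbour and out-neighbour sets and the tree condition are straightforward to compute. The helper names below are hypothetical, and the acyclicity test is a standard union-find check on the underlying undirected graph rather than anything prescribed by the text.

```python
def in_neighbours(T, i):
    """N_i: nodes j with a link directed towards i, i.e. T[i][j] != 0."""
    return {j for j in range(len(T)) if T[i][j] != 0}

def out_neighbours(T, i):
    """Nodes j that receive a link from i, i.e. T[j][i] != 0."""
    return {j for j in range(len(T)) if T[j][i] != 0}

def underlying_graph_is_acyclic(T):
    """True if the equivalent undirected graph has no cycles (union-find check)."""
    P = len(T)
    parent = list(range(P))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(P):
        for j in range(i + 1, P):
            if T[i][j] != 0 or T[j][i] != 0:
                root_i, root_j = find(i), find(j)
                if root_i == root_j:
                    return False  # this edge would close a cycle
                parent[root_i] = root_j
    return True

# Usage: a directed chain 0 -> 1 -> 2 (nodes indexed from 0).
chain = [[0, 0, 0], [0.3, 0, 0], [0, 0.2, 0]]
print(in_neighbours(chain, 1), out_neighbours(chain, 1), underlying_graph_is_acyclic(chain))
```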

Proposition 2.1

For a tree graph, if the out-neighbours of the I nodes are all S in an r-motif, then this is true for all (r+1)-motifs generated from this r-motif.

Proof

This follows easily from the definition of the generating rule (Definition 2.4). □

Definition 2.10

Consider a tree graph and take the 1-motifs with I nodes: $I_i$, i∈{1,2,…,P}. The “basic state space” M is formed by these 1-states together with the set of motif states that can be iteratively generated from them using the generating rule (Definition 2.4).

Remark

Due to the method of its construction, the state space M gives a self-contained system of differential equations, i.e. the time derivatives of the probabilities of each motif state can be expressed in terms of the probabilities of other motif states in the state space. An example in the case of the open triple is given by the motifs in (4), (6) and (7).
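The iterative construction of M can be sketched as a simple closure computation. In this illustration a motif state is a frozenset of (node, symbol) pairs, nodes are indexed from 0 internally and printed from 1, and the names `generate`, `basic_state_space` and `label` are hypothetical.

```python
def generate(state, T):
    """Apply the generating rule (Definition 2.4) once, in every possible way."""
    nodes = {n for n, _ in state}
    grown_states = set()
    for w, symbol in state:
        if symbol == "R":
            continue  # removed nodes leave the subsystem unchanged
        for n in range(len(T)):
            # n must lie outside the subsystem and have a link towards w.
            if T[w][n] and n not in nodes:
                rest = {(v, a) for v, a in state if v != w}
                # w becomes (or stays) S, and the external node n enters as I.
                grown_states.add(frozenset(rest | {(w, "S"), (n, "I")}))
    return grown_states

def basic_state_space(T):
    """All motif states reachable from the single-node infectious states."""
    space = {frozenset({(i, "I")}) for i in range(len(T))}
    frontier = set(space)
    while frontier:
        new = set()
        for s in frontier:
            new |= generate(s, T)
        frontier = new - space
        space |= frontier
    return space

def label(state):
    """Readable name such as 'S1S2I3' (nodes numbered from 1)."""
    return "".join(f"{a}{n + 1}" for n, a in sorted(state))

# Usage: the open triple of Fig. 2.
triple = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(sorted(label(s) for s in basic_state_space(triple)))
```

For the open triple this yields three 1-states, four 2-states and three 3-states. Seeding the same loop with the susceptible 1-states and the SS 2-states as well would give a sketch of the extended state space.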

Definition 2.11

Consider a tree graph and the 1-states: Inline graphic and the 2-states with SS, i.e. Inline graphic. The “extended state space” Inline graphic comprises these motif states together with the set of motif states that can be generated from them by repeated iteration of the generating rule.

Remark

The extended state space is required to form the relevant closure relations. Due to the method of its construction, it is also self-contained.

Lemma 2.1

Let Inline graphic. Then the out-neighbour of an I node is an S node in Inline graphic.

Proof

Follows from Proposition 2.1. □

Lemma 2.2

For a tree graph, the equation for the time derivative of the probability of an r-state Inline graphic is given by:

graphic file with name 11538_2013_9923_Equ10_HTML.gif 10

Proof

For these states we have Inline graphic. Additionally, when Inline graphic, the first term on the second line of (9) never arises because Inline graphic implies that an I is connected to an I node in r-state Inline graphic which contradicts Lemma 2.1. Therefore (9) reduces to (10). □

Let us now formulate the exact closure relations and prove our main result.

Closure Relation and Proof of the Main Result

The exactness of (3) is straightforward to see provided that outbreaks of epidemics are always initiated with a single infected individual. We prove this first before considering the general case.

Proof for Single Initial Infected

When infection is initiated on a tree graph at a single individual, infection must always proceed in linear chains. Consequently there is no possibility of the state IkSjIi illustrated in Fig. 3 arising because an infection initiated at either k or i must pass through j to get to the other node. Furthermore,

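\[
\langle S_i S_j I_k\rangle + \langle I_i S_j I_k\rangle + \langle R_i S_j I_k\rangle = \langle S_j I_k\rangle,
\]

but since 〈IiSjIk〉=0 and consequently 〈RiSjIk〉=0, we have:

\[
\langle S_i S_j I_k\rangle = \langle S_j I_k\rangle,
\]

reducing (1) to the following closed system:

\[
\begin{aligned}
\dot{\langle S_i\rangle} &= -\sum_{j=1}^{P} T_{ij}\langle S_i I_j\rangle,\\
\dot{\langle I_i\rangle} &= \sum_{j=1}^{P} T_{ij}\langle S_i I_j\rangle - \gamma_i\langle I_i\rangle,\\
\dot{\langle S_i I_j\rangle} &= \sum_{k\neq i} T_{jk}\langle S_j I_k\rangle - (T_{ij}+\gamma_j)\langle S_i I_j\rangle.
\end{aligned}
\]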

Similar arguments show that this can be written in the form of (3).

Fig. 3. A state which cannot arise on a tree graph where there is only one initially infectious node

More generally, this argument also applies to any tree graph where there is at most one network path by which any susceptible individual in the network can become infectious from the initial configuration of infected individuals.

Before discussing the general proof for any tree graph with multiple initially infected individuals, we consider two very simple example networks which will serve to motivate and illustrate the method of proof.

Proof for an Open Triple

Here we consider the case for the open triple depicted in Fig. 2. The equations for the probabilities of the basic state space M are given in (4), (6) and (7). To form the relevant closure relations, we require the equations for the extended state space Inline graphic formed by the equations for M together with (5),

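\[
\begin{aligned}
\dot{\langle S_1 S_2\rangle} &= -\tau\langle S_1 S_2 I_3\rangle,\\
\dot{\langle S_2 S_3\rangle} &= -\tau\langle I_1 S_2 S_3\rangle.
\end{aligned}
\tag{11}
\]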

Our objective is to close the system at the level of pairs using the closure relation (2), eliminating the need for differential equations describing triples (7), and show that the system remains exact. We note that the exactness of (3) can be proved in this case along the lines of the previous argument by considering each possible initial condition separately; however, the approach discussed here will be more useful for understanding the general case.
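Before working through the argument, the claim can be checked numerically. The sketch below integrates the exact open-triple equations, written here in the form implied by (1) for rate τ on both links and removal rate γ (an assumption of this example), and monitors the three closure residuals of the form 〈ABC〉〈B〉−〈AB〉〈BC〉 for every pure initial state. Variable and function names are hypothetical.

```python
from itertools import product

def open_triple_rhs(v, tau, gamma):
    """Exact equations for the open triple 1 - 2 - 3, as implied by (1)."""
    a = tau + gamma
    return {
        "S1": -tau * v["S1I2"],
        "S2": -tau * (v["I1S2"] + v["S2I3"]),
        "S3": -tau * v["I2S3"],
        "I1": tau * v["S1I2"] - gamma * v["I1"],
        "I2": tau * (v["I1S2"] + v["S2I3"]) - gamma * v["I2"],
        "I3": tau * v["I2S3"] - gamma * v["I3"],
        "S1I2": tau * v["SSI"] - a * v["S1I2"],
        "I1S2": -tau * v["ISI"] - a * v["I1S2"],
        "S2I3": -tau * v["ISI"] - a * v["S2I3"],
        "I2S3": tau * v["ISS"] - a * v["I2S3"],
        "S1S2": -tau * v["SSI"],
        "S2S3": -tau * v["ISS"],
        "SSI": -a * v["SSI"],       # <S1 S2 I3>
        "ISI": -2 * a * v["ISI"],   # <I1 S2 I3>
        "ISS": -a * v["ISS"],       # <I1 S2 S3>
    }

def pure_state(init):
    """Initial probabilities for a pure configuration such as ('I', 'S', 'I')."""
    s = [float(x == "S") for x in init]
    i = [float(x == "I") for x in init]
    return {"S1": s[0], "S2": s[1], "S3": s[2], "I1": i[0], "I2": i[1], "I3": i[2],
            "S1I2": s[0] * i[1], "I1S2": i[0] * s[1],
            "S2I3": s[1] * i[2], "I2S3": i[1] * s[2],
            "S1S2": s[0] * s[1], "S2S3": s[1] * s[2],
            "SSI": s[0] * s[1] * i[2], "ISI": i[0] * s[1] * i[2],
            "ISS": i[0] * s[1] * s[2]}

def residuals(v):
    """Closure residuals for the triples ISI, SSI and ISS."""
    return (v["ISI"] * v["S2"] - v["I1S2"] * v["S2I3"],
            v["SSI"] * v["S2"] - v["S1S2"] * v["S2I3"],
            v["ISS"] * v["S2"] - v["I1S2"] * v["S2S3"])

def max_residual(init, tau=0.1, gamma=0.05, t_end=20.0, h=0.05):
    """Largest absolute closure residual along a Runge-Kutta trajectory."""
    v = pure_state(init)
    worst = 0.0
    for _ in range(round(t_end / h)):
        k1 = open_triple_rhs(v, tau, gamma)
        k2 = open_triple_rhs({n: v[n] + h / 2 * k1[n] for n in v}, tau, gamma)
        k3 = open_triple_rhs({n: v[n] + h / 2 * k2[n] for n in v}, tau, gamma)
        k4 = open_triple_rhs({n: v[n] + h * k3[n] for n in v}, tau, gamma)
        v = {n: v[n] + h / 6 * (k1[n] + 2 * k2[n] + 2 * k3[n] + k4[n]) for n in v}
        worst = max(worst, *(abs(r) for r in residuals(v)))
    return worst

# Usage: worst residual over all 27 pure initial configurations.
worst = max(max_residual(init) for init in product("SIR", repeat=3))
print(worst)
```

Any residual reported here reflects numerical integration error only, so it should be at the level of the solver tolerance.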

We need to consider closures for the triples 〈I1S2I3〉, 〈I1S2S3〉 and 〈S1S2I3〉. Let us consider the closure:

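\[
\langle I_1 S_2 I_3\rangle = \frac{\langle I_1 S_2\rangle\langle S_2 I_3\rangle}{\langle S_2\rangle}.
\]

This is exact if α(t)=0 where

\[
\alpha = \langle I_1 S_2 I_3\rangle\langle S_2\rangle - \langle I_1 S_2\rangle\langle S_2 I_3\rangle
\]

and 〈S2〉≠0. Taking the derivative of α with respect to time gives

\[
\dot{\alpha} = \dot{\langle I_1 S_2 I_3\rangle}\langle S_2\rangle + \langle I_1 S_2 I_3\rangle\dot{\langle S_2\rangle} - \dot{\langle I_1 S_2\rangle}\langle S_2 I_3\rangle - \langle I_1 S_2\rangle\dot{\langle S_2 I_3\rangle}.
\]

Substituting the relevant derivatives in from (5)–(7) and cancelling terms reduces this to

\[
\dot{\alpha} = -2(\tau+\gamma)\alpha,
\]

so:

\[
\alpha(t) = \alpha(0)\,e^{-2(\tau+\gamma)t}.
\]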

Now it is easily verified that provided the system is initiated in a specific system state then α(0)=0. Consequently α(t)=0 for all t≥0 and the closure is exact.
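The cancellation can be written out explicitly. The following worked step is a sketch that assumes the open-triple equations take the form implied by (1) with rate τ on both links and removal rate γ, and that α is the residual 〈I1S2I3〉〈S2〉−〈I1S2〉〈S2I3〉:

```latex
\begin{aligned}
\dot{\alpha}
 &= -2(\tau+\gamma)\langle I_1S_2I_3\rangle\langle S_2\rangle
    - \tau\langle I_1S_2I_3\rangle\bigl(\langle I_1S_2\rangle + \langle S_2I_3\rangle\bigr) \\
 &\quad + \bigl[\tau\langle I_1S_2I_3\rangle + (\tau+\gamma)\langle I_1S_2\rangle\bigr]\langle S_2I_3\rangle
    + \langle I_1S_2\rangle\bigl[\tau\langle I_1S_2I_3\rangle + (\tau+\gamma)\langle S_2I_3\rangle\bigr] \\
 &= -2(\tau+\gamma)\bigl(\langle I_1S_2I_3\rangle\langle S_2\rangle
    - \langle I_1S_2\rangle\langle S_2I_3\rangle\bigr)
  = -2(\tau+\gamma)\,\alpha .
\end{aligned}
```

The four terms proportional to τ〈I1S2I3〉 cancel in pairs, leaving a linear homogeneous equation for α. For a pure initial state each probability is 0 or 1 and joint probabilities factorise, so the two products in α coincide at t=0.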

By symmetry, it will suffice to consider one of the remaining two triples in (6). We wish to show that α(t)=0 where

graphic file with name M71.gif

Here it is necessary to also use (11) for pairs of type SS in the extended state space. This closure is not established immediately, but there is a two-step process to establishing that α(t)=0 which the reader can verify by analogy with the example of the star graph in the next section.
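One way to see the two-step structure, taking 〈S1S2I3〉 as the representative triple (a choice made for this sketch; the other triple follows by symmetry) and assuming the open-triple equations in the form implied by (1), is the following:

```latex
\begin{aligned}
\alpha &= \langle S_1S_2I_3\rangle\langle S_2\rangle - \langle S_1S_2\rangle\langle S_2I_3\rangle,
&\qquad
\beta &= \langle S_1S_2\rangle\langle I_1S_2I_3\rangle - \langle S_1S_2I_3\rangle\langle I_1S_2\rangle, \\
\dot{\alpha} &= -(\tau+\gamma)\,\alpha + \tau\,\beta,
&\qquad
\dot{\beta} &= -2(\tau+\gamma)\,\beta .
\end{aligned}
```

Here β is an auxiliary residual introduced for illustration. For a pure initial state β(0)=0, because each product contains both S1 and I1, so β(t)=0 for all t≥0. The equation for α then reduces to a homogeneous linear equation with α(0)=0, giving α(t)=0.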

Proof for a Star Graph

We now consider the case of the undirected star graph with P=4 shown in Fig. 4, where again we assume that the strength is the same across each network link and is denoted by τ and the removal rate for each node is γ. Writing down the equations of the extended state space, there are two types of closure which need to be proved: one for the SSI triples and one for the ISI triples (see (1)). The graph has three triples ((1,4,3),(2,4,3),(1,4,2)), but it is sufficient to prove exactness for one of them. Hence we want to prove the following two relations:

graphic file with name M72.gif 12

For brevity, we adopt the alternative notation:

graphic file with name M73.gif

We introduce

graphic file with name M74.gif

By differentiating this, substituting in from the process equations and grouping terms, we obtain

graphic file with name M75.gif 13

where:

graphic file with name M76.gif

Differentiating α2 we get

graphic file with name M77.gif 14

where:

graphic file with name M78.gif

The derivatives of α3 and α4 can be obtained similarly.

Fig. 4. Star graph with P=4 nodes

Differentiating α5 we get

graphic file with name M79.gif 15

where:

graphic file with name M80.gif

The derivative for α6 can also be obtained. Finally, differentiating α7 and α8 we obtain:

graphic file with name M81.gif 16

To conclude the proof of the exactness of the closure, we first assume that the initial state is not mixed; that is, one of the $3^4=81$ possible configurations has probability 1 at t=0. Then it is easy to see that αj(0)=0 for all j∈{1,2,…,8} (see Lemma 3.2 in Sect. 3.5 for a proof in a more general context). Hence the differential equations for α7 and α8 show that α7(t)=0 and α8(t)=0 for all t≥0. The differential equation for α5 then implies that α5=0 ∀t≥0 (and similarly for α6). This implies that α2=0 (and similarly α3=α4=0). The differential equation for α1 shows that α1=0 which is what we wanted to show. The other triple closure in (12) can be proved similarly.

Remark

In fact, we have proved several closure relations αj=0 ∀j∈{1,2,…,8}.

The closure relations each consist of two pairs which are visualised in Fig. 5. For reference, we refer to these as the left pair and the right pair referring to their position in this figure. Looking at these closure relations, we can form two observations:

  1. For a given node i, the number of times it appears as Si is the same in the left and the right pair, and similarly with the number of times it appears as Ii. For example, with α5, node 1 (the left node) has one I and one S for both pairs and node 4 (central node) has two S’s in both pairs. For α6, the number of I’s at nodes 1, 2, 3 and 4 is (1,1,1,0) in both pairs and the number of S’s is given by (1,0,0,2) in both pairs.

  2. Any SI pairing on the left appears exactly the same number of times on the right. For example in α7, I1S4 appears once on the left and once on the right and I3S4 appears twice on the left and twice on the right. Observing that only SS pairs and IS pairs appear in the closure relations, a consequence is that SS pairs also have this property.

These observations will be of key importance for developing the general proof in the following two sections.
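The two observations are easy to test mechanically for any proposed closure. The sketch below encodes a motif state as a dictionary from node to symbol and a "pair" as a tuple of two motif states. It checks only the two informal observations above, not the formal compatibility conditions introduced later, and all names are hypothetical.

```python
from collections import Counter

def node_state_counts(pair):
    """Observation 1: how often each (node, symbol) occurs across a pair."""
    return Counter((n, a) for motif in pair for n, a in motif.items())

def si_link_counts(pair, T):
    """Observation 2: how often each linked I-S pairing occurs across a pair."""
    counts = Counter()
    for motif in pair:
        for w, a in motif.items():
            if a != "S":
                continue
            for n, b in motif.items():
                if b == "I" and T[w][n]:  # infectious n with a link towards susceptible w
                    counts[n, w] += 1
    return counts

def satisfies_observations(left, right, T):
    """True if both observations hold for the left and right pairs."""
    return (node_state_counts(left) == node_state_counts(right)
            and si_link_counts(left, T) == si_link_counts(right, T))

# Usage: the closure <I1 S2 I3><S2> = <I1 S2><S2 I3> on the open triple
# (nodes indexed from 0 in the code).
triple = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
left = ({0: "I", 1: "S", 2: "I"}, {1: "S"})
right = ({0: "I", 1: "S"}, {1: "S", 2: "I"})
print(satisfies_observations(left, right, triple))
```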

Fig. 5. Each box illustrates the relevant node states for the four parts of the closure relation in the equation above it. The node states on the left and the right correspond to the two terms in the closure relation. The node numbers correspond to the same positions as in Fig. 4

General Closure Relations

In general, to show that the closure relationship (2) is exact for the tree graph, we need to show that α=0 where

\[
\alpha = \langle A_i B_j C_k\rangle\langle B_j\rangle - \langle A_i B_j\rangle\langle B_j C_k\rangle,
\]

B=S and A,C∈{S,I}. Our proof of this is via induction using a sequence of closures analogous to the proof in the case of the star graph in Sect. 3.3.

We shall consider many closure relations. In general we specify that they are composed of two pairs of motif states Inline graphic and Inline graphic and that the closure is exact if α=0 where

graphic file with name M85.gif

We formalise the observations we made about the closure relations for the star graph at the end of Sect. 3.3 by defining what we term “compatible pairs”.

Definition 3.1

For all Inline graphic s.t. Wj=i and Aj=a.

For all Inline graphic.

Remark

In general, the notation Inline graphic denotes that the state of subsystem Inline graphic is implied by the state of subsystem Inline graphic because it is contained within it.

Definition 3.2

Two pairs of motif states Inline graphic and Inline graphic are called compatible pairs if the following conditions are met:

  • CP(i) Inline graphic

  • CP(ii) Inline graphic

  • CP(iii) Inline graphic

  • CP(iv) Inline graphic

  • CP(v) Same as CP(iii) and CP(iv) but with SS pairs

where a∈{S,I,R}.

Definition 3.3

Let Inline graphic be an r-state and Inline graphic be a q-state. Then the order of the pair Inline graphic is defined as r+q.

Proposition 3.1

If Inline graphic and Inline graphic are compatible pairs, then their order is equal.

Proof

Follows from CP(i) and CP(ii). □

Proposition 3.2

For a tree graph, applying the transformation $h_i$ to each of the four motif states in compatible pairs that contain node i generates compatible pairs.

Proof

The transformation satisfies CP(i) and CP(ii) because it replaces Inline graphic with Inline graphic which does not alter the form of the conditions. The transformation satisfies CP(iii) and CP(iv) because all IS pairs where i is the infected individual are removed by this transformation. New IS pairs cannot be created by the transformation since this would require II pairs which are prohibited for tree graphs by Lemma 2.1. CP(v) is satisfied because the transformation leaves existing SS pairs unchanged and created SS pairs result from existing IS pairs so are balanced on each side. □

Proof of the Main Result

Lemma 3.1

Let Inline graphic and Inline graphic be compatible pairs of order R and Inline graphic. Let

graphic file with name M107.gif

Then

dα0/dt = c0α0 + Σ_{p=1}^{m} cpαp    (17)

where each αp can be expressed as

graphic file with name M109.gif

with Inline graphic and Inline graphic being compatible pairs of order R+1, and c0, cp being constants and m being an integer denoting the number of terms in the summation.

Remark

This is a general statement of the forms of (13)–(16) in the star graph example.

Proof

Take the derivative of α0:

graphic file with name M112.gif (18)

We consider the terms associated with removal, transmission terms of order R and transmission terms of order R+1 separately. Firstly, from (10), this derivative contains the following terms associated with the removal process:

graphic file with name 11538_2013_9923_Equad_HTML.gif

where the sums over k1,k2,k3,k4 are over all nodes in the motifs ψW,ψX,ψY,ψZ respectively and where

graphic file with name 11538_2013_9923_Equae_HTML.gif

is easily seen to follow from CP(i) and CP(ii).

The right-hand side of (18) also contains the following transmission terms with motifs of order R:

graphic file with name M113.gif

where

graphic file with name M114.gif

and where the sums over k1,k2,k3,k4,l1,l2,l3,l4 are over all nodes in each of the relevant motifs. This follows from CP(iii) and CP(iv). Hence the removal terms and transmission terms of order R contribute c0α0 to the derivative of α0 where c0=−vw.

For transmission terms with motifs of order R+1, consider the term Inline graphic in (18). This gives rise to the following terms in the derivative of α0:

graphic file with name M116.gif

To prove the lemma, it is sufficient to show that each term in this sum can be paired uniquely with a term in Inline graphic or in Inline graphic, such that the difference of these terms forms αp. By symmetry this is a one-to-one pairing establishing that each term of order R+1 on the right-hand side of (18) is accounted for exactly once in the sum of αp.

Let us take an element from the sum by choosing a node Ww, w∈{1,2,…,r}, and an outside neighbour node Inline graphic. This neighbour can either be in ψX or outside.

Case 1: Aw=S.

Consider first the case where Aw is a susceptible node, resulting in the following term in the sum:

graphic file with name M120.gif

where we can identify Inline graphic.

We have Aw=S, Inline graphic and n∉W. Let us now identify a term in Inline graphic to form compatible pairs. According to CP(i) and CP(ii) we can assume without loss of generality that Ww∈Y, i.e. ∃y:Yy=Ww and Cy=S (see Fig. 6). There are two subcases:

Fig. 6

Each circle refers to one of the motif states Inline graphic, Inline graphic, Inline graphic, Inline graphic specified to the top left. The position of the relevant node states with respect to the motif states are then illustrated. (a) Subcase 1.1 (n∉Y). (b) Subcase 1.2 (n∈Y)

Subcase 1.1: n∉Y.

When n∉Y, either n∈Z or n∉Z and these are shown by solid and dashed lines respectively in Fig. 6(a). By CP(i), n∈Z⇔n∈X since n∉W so the solid lines match on the left and right pairs as do the dashed lines. The corresponding term must therefore be Inline graphic irrespective of whether n∈Z or not. Hence:

graphic file with name M129.gif

where Inline graphic and Inline graphic are easily seen to satisfy the definition of compatible pairs since the extra node is n which is I in both pairs.

Subcase 1.2: n∈Y.

If n∈Y, then the edge nYy is an SS or IS edge in C. By CP(iii), CP(iv) and CP(v), it is also the same edge in B because n∉W. Hence ∃x:Xx=Ww and Bx=S. By CP(ii), Ww∈Z is also true where ∃z:Zz=Ww and Dz=S. This is illustrated in Fig. 6(b). We therefore have the corresponding term Inline graphic and:

graphic file with name M133.gif

where again, the relevant pairs are seen to satisfy compatibility.

Case 2: Aw=I.

So far we have proved the existence of αp when Aw=S, Inline graphic, n∉W. Now we have to show αp can be defined when Aw=I, Inline graphic, n∉W. In this case, the motif generating rule firstly changes Aw=I to Aw=S and then applies the same generating rule as if Aw=S initially. From Proposition 3.2, applying the transformation to compatible pairs of order R produces compatible pairs of order R in the case of the tree graph. After this transformation, the argument runs identically to Case 1. This completes the proof of Lemma 3.1. □

Lemma 3.2

Assume that the initial condition is not mixed, i.e. ∃A∈{I,S}^P such that Inline graphic. If Inline graphic and Inline graphic are compatible pairs, then for any graph, Inline graphic at t=0.

Proof

Assume that Inline graphic. Then by CP(i) and CP(ii), for all Inline graphic, we must have Inline graphic and/or Inline graphic. This is also true for all Inline graphic and, by the symmetry between the compatible pairs, it follows that Inline graphic. Similarly, it follows that Inline graphic implies that Inline graphic. □

Theorem 3.1

Let us assume the following:

  • The graph is a tree.

  • The initial condition is not mixed.

  • Inline graphic and Inline graphic are compatible pairs.

Then Inline graphic for all time t≥0.

Proof

We prove the theorem by induction according to the order of the closure. This is analogous to the proof for the star graph in Sect. 3.3.

Step 1.

If the closure is of order 2P, then it is exact. More precisely, if Inline graphic, Inline graphic, Inline graphic and Inline graphic are P-states, then (17) does not contain the summation terms and becomes:

dα0/dt = c0α0

Since we start from an initial condition that is not mixed, we have (by Lemma 3.2) α0(0)=0⇒α0(t)=0 ∀t≥0.
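
The implication α0(0)=0⇒α0(t)=0 is the usual uniqueness argument for a linear ordinary differential equation. As a short worked step, assuming (as stated in Lemma 3.1) that c0 is a constant, the equation can be solved explicitly:

```latex
\frac{d\alpha_0}{dt} = c_0\,\alpha_0(t), \qquad \alpha_0(0) = 0
\quad\Longrightarrow\quad
\alpha_0(t) = \alpha_0(0)\,e^{c_0 t} = 0 \qquad \forall\, t \ge 0 .
```

The same argument is what closes the induction in Step 2, once the higher-order terms have been shown to vanish.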

Step 2.

Assume that the theorem is proved for compatible pairs of order R+1. We prove that it is true for compatible pairs of order R. Applying Lemma 3.1, we have:

dα0/dt = c0α0 + Σ_{p=1}^{m} cpαp

According to the induction hypothesis, αp=0 ∀p because these are compatible pairs of order R+1. Therefore dα0/dt=c0α0. From Lemma 3.2, α0(0)=0 so α0(t)=0 ∀t≥0. Since we have proved the result for compatible pairs of order 2P, this completes the proof of the theorem. □

The lowest-order compatible pairs are of order four. The closure relation corresponding to these pairs is formulated in the following important corollary.

Corollary 3.1

Under the assumptions on the graph and the initial conditions in Theorem 3.1, we have the special cases:

graphic file with name M158.gif

for all i∈{1,2,…,P} and for all j∈Ni, k∈Nj: i≠k;

graphic file with name M159.gif

for all i∈{1,2,…,P} and for all j,k∈Ni: j≠k.

This corollary is Theorem 1.1 expressed in a different notation.
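
A numerical check can make the corollary concrete. The sketch below is illustrative only and is not the authors' code. It assumes that the first identity of the corollary has the usual pair-closure form ⟨S_i S_j I_k⟩⟨S_j⟩ = ⟨S_i S_j⟩⟨S_j I_k⟩, that all links share one transmission rate `tau` and all nodes one removal rate `gamma`, and it integrates the full master equation on a five-node path with a hand-written RK4 scheme. The function name `closure_residual` and all parameter values are hypothetical.

```python
from itertools import product

def closure_residual(n, links, infected, triple, tau=1.0, gamma=0.5,
                     t_end=1.0, dt=0.01):
    """Integrate the full SIR master equation on a small undirected graph.

    Returns (alpha, p_triple) at time t_end, where
    alpha = <S_i S_j I_k><S_j> - <S_i S_j><S_j I_k> and p_triple = <S_i S_j I_k>.
    """
    i, j, k = triple
    nbrs = {v: set() for v in range(n)}
    for a, b in links:                      # undirected links
        nbrs[a].add(b)
        nbrs[b].add(a)
    states = list(product("SIR", repeat=n))  # all 3^n system states
    index = {s: m for m, s in enumerate(states)}

    # All single-node transitions: (source index, target index, rate).
    moves = []
    for s in states:
        for v in range(n):
            if s[v] == "I":
                rate, new = gamma, "R"
            elif s[v] == "S":
                rate, new = tau * sum(s[u] == "I" for u in nbrs[v]), "I"
            else:
                continue
            if rate > 0:
                target = s[:v] + (new,) + s[v + 1:]
                moves.append((index[s], index[target], rate))

    def deriv(p):
        d = [0.0] * len(p)
        for src, dst, rate in moves:
            flow = rate * p[src]
            d[src] -= flow
            d[dst] += flow
        return d

    # Unmixed initial condition: one system state has probability 1.
    p = [0.0] * len(states)
    p[index[tuple("I" if v in infected else "S" for v in range(n))]] = 1.0
    for _ in range(round(t_end / dt)):      # classical RK4 steps
        k1 = deriv(p)
        k2 = deriv([x + 0.5 * dt * y for x, y in zip(p, k1)])
        k3 = deriv([x + 0.5 * dt * y for x, y in zip(p, k2)])
        k4 = deriv([x + dt * y for x, y in zip(p, k3)])
        p = [x + dt * (a + 2 * b + 2 * c + d) / 6
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]

    def prob(fixed):
        """Marginal probability that the listed nodes are in the listed states."""
        return sum(w for s, w in zip(states, p)
                   if all(s[v] == a for v, a in fixed.items()))

    p_triple = prob({i: "S", j: "S", k: "I"})
    alpha = (p_triple * prob({j: "S"})
             - prob({i: "S", j: "S"}) * prob({j: "S", k: "I"}))
    return alpha, p_triple

# Path 0-1-2-3-4 with both end nodes initially infectious.
path = [(0, 1), (1, 2), (2, 3), (3, 4)]
alpha, p_triple = closure_residual(5, path, {0, 4}, (1, 2, 3))
print(alpha, p_triple)
```

On this tree the residual `alpha` should vanish up to integration error while the triple probability itself is clearly non-zero, which is the content of the corollary for the triple (1, 2, 3).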

Remark

From CP(i) and CP(ii), it is clear that Lemma 3.2 can be extended to the mixed initial condition where the probabilities of the initial states of each individual in the system are statistically independent, leading to Inline graphic at t=0. However, for general mixed initial conditions where correlations between individuals can occur, Lemma 3.2 does not hold and the pair-based model is not exact.

Application to Some Graphs which Are not Trees

To complete this work, we make a final observation which shows that the pair-based model can sometimes provide an exact representation of infectious dynamics on graphs which are not strictly trees. We first make two definitions which can be understood with reference to the examples in Fig. 7.

Fig. 7

The graphs on the left are the initial transmission networks where the initially infected nodes are indicated by the symbol I. The graphs on the right are the reduced representation graphs where the cuts for independent segments which occur for cases (b) and (d) are indicated with dashed lines. The tree structure of the graphs on the right shows that applying the pair-based model to these graphs generates an exact representation of the infection dynamics on the original system

Definition 4.1

A reduced representation is a graph which is constructed from the initial transmission network and the given initial conditions by removing transmission routes which cannot carry infection dynamics.

Definition 4.2

An independent segment is a region of a graph that is only connected to other regions via nodes in the segment which are initially infectious.

Theorem 4.1

Given SIR dynamics on a transmission network with infection and removal governed by Poisson processes and given an unmixed initial state of the system, if every independent segment of the reduced representation is a tree, then applying (3) to this representation exactly generates the expected infection dynamics on the original transmission network.

Proof

By definition, the infection dynamics of the system remain unchanged after the removal of edges which cannot support infection dynamics. Additionally, the infection dynamics of any independent segment are independent of the dynamics on the rest of the graph because there is no process that allows influence across the initially infectious nodes. If the resulting representation graph is a set of trees, then since (3) is an exact representation of the dynamics on each independent segment, solving (3) on the reduced representation graph is equivalent to the infection dynamics on the original transmission network. □

Figure 7 shows some graphs and the associated representation graphs where the dashed lines indicate the boundaries that separate independent segments. For each of these examples, the solution of (3) on the representation graph exactly reproduces the expected infection dynamics of the original system.

This suggests that the accuracy of the pair-based model could be increased by first generating the representation graph for the particular network and initial conditions prior to numerically solving the pair-based model.
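
The pre-processing step described here can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure. It assumes that a directed edge "cannot carry infection" if it points into an initially infectious node, or if it leaves a node that no initially infectious node can reach. It also assumes that independent segments are found by cutting the reduced graph at the initially infectious nodes, and that tree-ness is judged on the underlying undirected graph. All function names are hypothetical.

```python
from collections import deque

def undirected(links):
    """Expand undirected links into both directed edges."""
    return [(u, v) for a, b in links for u, v in ((a, b), (b, a))]

def reduced_representation(edges, infected):
    """Drop directed edges that, under this sketch's assumptions, cannot transmit."""
    infected = set(infected)
    # Assumption 1: an initially infectious node cannot be infected again.
    live = {(u, v) for u, v in edges if v not in infected}
    out = {}
    for u, v in live:
        out.setdefault(u, set()).add(v)
    # Assumption 2: nodes unreachable from every initial infective never transmit.
    reached, queue = set(infected), deque(infected)
    while queue:
        u = queue.popleft()
        for v in out.get(u, ()):
            if v not in reached:
                reached.add(v)
                queue.append(v)
    return {(u, v) for u, v in live if u in reached}

def independent_segments(nodes, reduced_edges, infected):
    """Split the reduced graph at initially infectious nodes.

    Returns a list of (segment_nodes, segment_undirected_edges).
    """
    infected = set(infected)
    adj = {n: set() for n in nodes}
    for u, v in reduced_edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, segments = set(), []
    for start in nodes:
        if start in infected or start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:                         # BFS that never crosses an infective
            u = queue.popleft()
            for v in adj[u]:
                if v not in infected and v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        seg_edges = {frozenset((u, v)) for u, v in reduced_edges
                     if u in comp or v in comp}
        seg_nodes = comp | set().union(*seg_edges)
        segments.append((seg_nodes, seg_edges))
    return segments

def segments_are_trees(nodes, edges, infected):
    """True if every independent segment of the reduced graph is a tree."""
    reduced = reduced_representation(edges, infected)
    # Each segment is connected by construction, so it is a tree
    # exactly when it has one fewer edge than nodes.
    return all(len(e) == len(n) - 1
               for n, e in independent_segments(nodes, reduced, infected))

# A 4-cycle: two opposite initial infectives cut it into two tree segments,
# whereas a single initial infective leaves the cycle intact.
square = undirected([(0, 1), (1, 2), (2, 3), (3, 0)])
print(segments_are_trees(range(4), square, {0, 2}))
print(segments_are_trees(range(4), square, {0}))
```

Under these assumptions, a `True` result corresponds to the sufficient condition of Theorem 4.1; a `False` result only means that the theorem does not apply, not that the pair-based model is necessarily inaccurate.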

Discussion

We considered the pair-based variant of the subsystem approach to constructing epidemic models on networks (Sharkey 2008, 2011). We proved that for SIR dynamics on fixed tree graphs with exponentially distributed transmission and removal processes, the pair-based model provides an exact determination of the infection probability time course for each individual in the network. We also showed that the dynamics of some networks with cycles can be represented exactly by the pair-based model under specific initial conditions.

This represents the first provably exact deterministic model of epidemic dynamics on finite heterogeneous systems which has been numerically evaluated. Here we use the qualifying term “heterogeneous” to exclude systems with significant symmetry which may be employed to obtain exact representations in very specialised circumstances (Keeling and Ross 2008; Simon et al. 2011). In principle, the message-passing approach of Karrer and Newman (2010) will also yield an exact description of finite heterogeneous systems in a way that is numerically feasible, but to our knowledge this has not yet been implemented in this context. Interestingly, the message-passing method also applies more generally beyond the usual assumptions of Markovian dynamics to arbitrary distributions for transmission and removal processes, although there may be implementation issues for more general distributions.

We note that effective degree models can generate very good agreement with stochastic simulation (Ball and Neal 2008; Lindquist et al. 2011) as do the PGF or edge-based compartmental modelling methods (Miller et al. 2012; Miller and Volz 2012; Volz 2008), although exact correspondence has not been proven here. For some idealised networks, including fully connected networks and some configuration networks (Volz 2008), convergence to the expected value can be shown in the infinite population limit (Ball and Neal 2008; Decreusefond et al. 2012; Karrer and Newman 2010). However, these models have a large measure of homogeneity, and convergence only occurs for infinite populations.

It is intuitively understood that clustering is at the root of problems with models based around closures at the level of pairs (Keeling and Eames 2005). Previous analysis (Sharkey 2011) attributed the failure to anomalous terms which emerge in subsystem equations when differentiating closure approximations based around the statistical independence of individuals. Here, repeating similar analysis for a closure at the order of pairs in the context of tree graphs, these anomalies do not arise and we are able to prove that the closure is exact via induction.

In principle, models based around subsystems at the order of three nodes or higher could be constructed. The next higher-order model would require obtaining a closure which is able to preserve correlations between triples, and similarly for higher orders. This leads to an interesting theoretical question for future analysis: does the hierarchy of exact order-by-order models suggested in Sharkey (2011) exist, and if so, what form should the closure approximations take at each level? We conjecture that exact closures of a similar nature to those considered here are possible for networks with more structure, given that the order at which the closure is performed is guided by the network structure; future work will focus on this question.

Acknowledgements

This research was facilitated in part by the Research Centre for Mathematics and Modelling at The University of Liverpool. We thank two anonymous reviewers for helpful comments which improved the manuscript.

Appendix

The proof of Theorem 2.1 is analogous to the proof of the single and pair equations by Sharkey (2011). In what follows, summations over Greek indices α,β are assumed to be over all 3^P possible system states. First we make some definitions.

Definition A.1

For a system Γ in state α and a single node i of Γ in state a we define:

graphic file with name M161.gif

denoting whether or not the specified single node state matches the system state. Note that this is just Definition 2.6 applied to the full system.

Definition A.2

graphic file with name M162.gif

Proposition A.1

For all α,i:

graphic file with name M163.gif

where the summation is over all possible states available to node i.

Proof

This is the statement that, for a given system state Γα or subsystem state Inline graphic, each node must be in a unique state. □

Proposition A.2

For all β,i,a:

graphic file with name M165.gif

Proof

This is the statement that there is only one system state which is identical to Γβ except that node i is in state Inline graphic. □

Proposition A.3

For any subsystem Inline graphic and k∈{1,2,…,r}:

graphic file with name M168.gif

for all α,a.

Proof

The proposition is true when Inline graphic. When Inline graphic we have:

graphic file with name M171.gif

 □

Proposition A.4

For any subsystem Inline graphic and k∈{1,2,…,r}:

graphic file with name M173.gif

for all β,a.

Proof

The proposition is true when:

graphic file with name M174.gif

From Proposition A.2 there must be a single state Γα for which Inline graphic, otherwise it is zero. When

graphic file with name M176.gif

we must also have (for the state when Inline graphic):

graphic file with name M178.gif

because only site Inline graphic can change state during this transition, establishing the proposition. □

We can now use these propositions to prove Theorem 2.1:

Proof

We have that:

graphic file with name M180.gif

Taking the derivative of this with respect to time and substituting in the system master equation (8) gives

graphic file with name M181.gif (19)

From Proposition A.1:

graphic file with name M182.gif

Multiplying the right of (19) by this gives:

graphic file with name M183.gif

This can be simplified using the fact that σαβ=0 whenever the state of the subsystem ψW differs by more than a single individual Inline graphic, k∈{1,…,r}, between states Γα and Γβ which means that aj=bj=Aj for j≠k:

graphic file with name M185.gif

where the last equality follows from Inline graphic.

For SIR dynamics, we can do the summations over ak and bk:

graphic file with name M187.gif (20)

Now we introduce the relevant terms in the transition matrix at the level of the system:

graphic file with name 11538_2013_9923_Equbd_HTML.gif

where these equations are designed so that they are satisfied for any combination of α,β,k. Substituting these into (20) gives:

graphic file with name 11538_2013_9923_Eqube_HTML.gif

We can rearrange the summation order:

graphic file with name 11538_2013_9923_Equbf_HTML.gif

and apply Proposition A.3:

graphic file with name 11538_2013_9923_Equbg_HTML.gif

Applying Proposition A.4 gives

graphic file with name 11538_2013_9923_Equbh_HTML.gif

Breaking up the sums over n on the first and third lines depending on whether the node n is internal or external to the motif ψW gives:

graphic file with name 11538_2013_9923_Equbi_HTML.gif

Lines 1 and 4 can be immediately recognised as the generating rule (Definition 2.4). For n∉W and Inline graphic:

graphic file with name M189.gif

and if Inline graphic then Inline graphic.

Line 2 requires that n∈W. Let l∈{1,2,…,r} and Wl=n. Then:

graphic file with name M192.gif

where the last equality follows from Proposition A.3. Using the definition of Inline graphic, this becomes:

graphic file with name M194.gif

and similarly for line 5.

We obtain:

graphic file with name 11538_2013_9923_Equbm_HTML.gif

where the Inline graphic operator on the 5th line is superfluous but allows us to write the equation in the form of (9). □

References

  1. Anderson R. M., May R. M. Infectious diseases of humans. London: Oxford University Press; 1991.
  2. Bailey N. T. J. The mathematical theory of infectious diseases. London: Griffin; 1975.
  3. Baker R. E., Simpson M. J. Correcting mean-field approximations for birth–death-movement processes. Phys. Rev. E. 2010;82 doi: 10.1103/PhysRevE.82.041905.
  4. Ball F., Neal P. Network epidemic models with two levels of mixing. Math. Biosci. 2008;212:69–87. doi: 10.1016/j.mbs.2008.01.001.
  5. Bartlett M. S. Deterministic and stochastic models for recurrent epidemics. Proc. third Berkeley symp. math. statist. prob. 1956; pp. 81–108.
  6. Born M., Green H. S. A general kinetic theory of liquids. I. The molecular distribution functions. Proc. R. Soc. Lond. A. 1946;188:10–18. doi: 10.1098/rspa.1946.0093.
  7. Decreusefond L., Dhersin J., Moyal P., Tran V. C. Large graph limit for an SIR process in random network with heterogeneous connectivity. Ann. Appl. Probab. 2012;22:541–575. doi: 10.1214/11-AAP773.
  8. Harada Y., Iwasa Y. Lattice population dynamics for plants with dispersing seeds and vegetative propagation. Res. Popul. Ecol. 1994;36:237–249. doi: 10.1007/BF02514940.
  9. Heathcote H. W. The mathematics of infectious diseases. SIAM Rev. 2000;42:599–653. doi: 10.1137/S0036144500371907.
  10. House T., Keeling M. J. Insights from unifying modern approximations to infections on networks. J. R. Soc. Interface. 2011;8:67–73. doi: 10.1098/rsif.2010.0179.
  11. Karrer B., Newman M. E. J. A message passing approach for general epidemic models. Phys. Rev. E. 2010;82 doi: 10.1103/PhysRevE.82.016101.
  12. Keeling M. J. The effects of local spatial structure on epidemiological invasions. Proc. Biol. Sci. 1999;266:859–867. doi: 10.1098/rspb.1999.0716.
  13. Keeling M. J., Eames K. T. D. Networks and epidemic models. J. R. Soc. Interface. 2005;2:295–307. doi: 10.1098/rsif.2005.0051.
  14. Keeling M. J., Ross J. On methods for studying stochastic disease dynamics. J. R. Soc. Interface. 2008;5:171–181. doi: 10.1098/rsif.2007.1106.
  15. Kermack W. O., McKendrick A. G. Contributions to the mathematical theory of epidemics. Proc. R. Soc. Lond. A. 1927;115:700–721. doi: 10.1098/rspa.1927.0118.
  16. Kirkwood J. G. The statistical mechanical theory of transport processes I. General theory. J. Chem. Phys. 1946;14:180–201. doi: 10.1063/1.1724117.
  17. Kirkwood J. G. The statistical mechanical theory of transport processes II. Transport in gases. J. Chem. Phys. 1947;15:72–76. doi: 10.1063/1.1746292.
  18. Kurtz T. G. Solutions of ordinary differential equations as limits of pure jump Markov processes. J. Appl. Probab. 1970;7:49–58. doi: 10.2307/3212147.
  19. Kurtz T. G. Limit theorems for sequences of jump Markov processes approximating ordinary differential processes. J. Appl. Probab. 1971;8:344–356. doi: 10.2307/3211904.
  20. Lindquist J., Ma J., van den Driessche P., Willeboordse F. H. Effective degree network disease models. J. Math. Biol. 2011;62:143–164. doi: 10.1007/s00285-010-0331-2.
  21. Markham D. C., Simpson M. J., Baker R. E. Simplified method for including spatial correlations in mean-field approximations. Phys. Rev. E. 2013;87 doi: 10.1103/PhysRevE.87.062702.
  22. Matsuda H., Ogita N., Sasaki A., Sato K. Statistical mechanics of populations: the lattice Lotka–Volterra model. Prog. Theor. Phys. 1992;88:1035–1049. doi: 10.1143/ptp/88.6.1035.
  23. Miller J. C., Slim A. C., Volz E. M. Edge-based compartmental modelling for infectious disease spread. J. R. Soc. Interface. 2012;9:890–906. doi: 10.1098/rsif.2011.0403.
  24. Miller J. C., Volz E. M. Model hierarchies in edge-based compartmental modeling for infectious disease spread. J. Math. Biol. 2012 doi: 10.1007/s00285-012-0572-3.
  25. Pastor-Satorras R., Vespignani A. Epidemic spreading in scale-free networks. Phys. Rev. Lett. 2001;86:3200–3203. doi: 10.1103/PhysRevLett.86.3200.
  26. Rand D. A. Correlation equations and pair approximations for spatial ecologies. In: McGlade J., editor. Advanced ecological theory: principles and applications. Oxford: Blackwell; 1999. pp. 100–142.
  27. Sato K., Matsuda H., Sasaki A. Pathogen invasion and host extinction in lattice structured populations. J. Math. Biol. 1994;32:251–268. doi: 10.1007/BF00163881.
  28. Sharkey K. J. Deterministic epidemiological models at the individual level. J. Math. Biol. 2008;57:311–331. doi: 10.1007/s00285-008-0161-7.
  29. Sharkey K. J. Deterministic epidemic models on contact networks: correlations and unbiological terms. Theor. Popul. Biol. 2011;79:115–129. doi: 10.1016/j.tpb.2011.01.004.
  30. Simon P. L., Taylor M., Kiss I. Z. Exact epidemic models on graphs using graph-automorphism driven lumping. J. Math. Biol. 2011;62:479–508. doi: 10.1007/s00285-010-0344-x.
  31. Simon P. L., Kiss I. Z. From exact stochastic to mean-field ODE models: a new approach to prove convergence results. IMA J. Appl. Math. 2011.
  32. Taylor M., Simon P. L., Green D. M., House T., Kiss I. Z. From Markovian to pairwise epidemic models and the performance of moment closure approximations. J. Math. Biol. 2012;64:1021–1042. doi: 10.1007/s00285-011-0443-3.
  33. Volz E. SIR dynamics in random networks with heterogeneous connectivity. J. Math. Biol. 2008;56:293–310. doi: 10.1007/s00285-007-0116-4.

Articles from Bulletin of Mathematical Biology are provided here courtesy of Springer
