Abstract
We consider Markovian susceptible-infectious-removed (SIR) dynamics on time-invariant weighted contact networks where the infection and removal processes are Poisson and where network links may be directed or undirected. We prove that a particular pair-based moment closure representation generates the expected infectious time series for networks with no cycles in the underlying graph. Moreover, this “deterministic” representation of the expected behaviour of a complex heterogeneous and finite Markovian system is straightforward to evaluate numerically.
Keywords: Kolmogorov equation, Dimensional reduction
Introduction
Background
The majority of epidemic models fall either into the category of stochastic models (Bailey 1975; Bartlett 1956) or into the category of deterministic differential equation-based models (Anderson and May 1991; Kermack and McKendrick 1927). These two strands developed largely independently for much of the twentieth century. Thus, an interesting question arises as to the precise mathematical connection between stochastic and deterministic models. Frequently, deterministic descriptions apply to large populations where the stochastic effects can be treated as negligible. For small populations we shall assume that it is the average or expected behaviour of the epidemic that we are hoping to replicate with “deterministic” descriptions. This average behaviour is a system characteristic that is fully specified by the system and its initial conditions.
The first epidemic models were based on the assumption that populations are evenly mixed, with each individual equally likely to interact with any other individual at any time (Heathcote 2000). A classic example of this type of model is the Susceptible-Infectious-Removed (SIR) compartmental model whereby individuals are classified according to being in one of these three states. It has been shown that for this type of mean-field model, the average of many stochastic simulations (the expected outcome of the stochastic model) converges to the solution of the “equivalent” mean-field deterministic model in the limit of an infinite population size and subject to strict conditions regarding the initialisation of the epidemic (Kurtz 1970, 1971; Simon and Kiss 2011).
More recently, a higher degree of realism has been introduced by considering stochastic models on contact networks where individuals are only able to contact a limited subset of the population. This enables significant heterogeneity to be incorporated, treating individuals as distinct entities with fixed connectivity to pre-allocated neighbours. While stochastic models are readily extended to incorporate such systems, deterministic descriptions have been more problematic. Several methodologies have been developed including pair-approximation models (Keeling 1999; Keeling and Eames 2005; Rand 1999), degree-based models (Pastor-Satorras and Vespignani 2001), and models based on the probability generating function (PGF) formalism which are applicable to configuration networks (Volz 2008) as well as the related edge-based compartmental modelling (Miller et al. 2012; Miller and Volz 2012). It has been observed (House and Keeling 2011) that these models are, at some level, equivalent and are all derived from similar principles of independence. Although comparison with simulation of stochastic models can sometimes be good, the basic link remains obscure.
Typically there are two idealised scenarios in which exact correspondence between stochastic models and solvable deterministic descriptions has been shown. Firstly, correspondence has been shown to sometimes occur in the limit of infinite populations for particular idealised graphs (Ball and Neal 2008; Decreusefond et al. 2012) which cannot be exactly realised in practice. It can also occur with some very simplified systems whose symmetry properties can be exploited to achieve reductions in the stochastic description (Keeling and Ross 2008; Simon et al. 2011).
Here we consider a recently introduced class of models, related to the pair-approximation models, which gives an exact correspondence between a deterministic description and the stochastic model for SIR epidemics on finite, time-invariant networks. Pair-approximation models were introduced into network-based epidemic and ecological theory in the 1990s to describe large populations of interacting individuals (Matsuda et al. 1992; Sato et al. 1994; Harada and Iwasa 1994; Rand 1999; Keeling 1999). They are an example of a hierarchy of equations which is truncated at the second order by an approximation (truncation at the first order corresponds to mean-field). This type of hierarchy was first considered in statistical physics and is sometimes known as the Bogoliubov–Born–Green–Kirkwood–Yvon (BBGKY) hierarchy (Kirkwood 1946, 1947; Born and Green 1946). Recently, related models have been considered at the level of individuals, variously called subsystem equations, moment dynamics equations, or pair-based equations (Sharkey 2008, 2011; Baker and Simpson 2010; Markham et al. 2013). This method generates a solvable class of models which can encompass a significant amount of heterogeneity and enables a fundamental link with finite stochastic models (Sharkey 2008, 2011).
We consider a pair-based representation of Markovian SIR dynamics. We show that by considering subsystems at the level of pairs, a closure can be found that determines the expected infectious time series exactly for arbitrary network structures where the underlying graph is a tree and, in some special circumstances, for particular networks with cycles. We note that the recent, related message passing formulation of epidemics on contact networks developed by Karrer and Newman (2010) also enables an exact description of epidemic dynamics on finite tree graphs.
Statement of the Main Result
We consider an SIR compartmental model composed of P individuals whose states are described at any given point in time by vectors I and S with respective components Ii and Si, i∈{1,2,…,P}, such that Ii=1 if individual i is infectious (Ii=0 otherwise) and Si=1 if individual i is susceptible (Si=0 otherwise). Transmission and recovery occur by Poisson processes with rate parameters λi=Si∑jTijIj and μi=γiIi, respectively, where T is a “transmission” matrix with (time-independent) elements Tij denoting the rate parameter for an infectious node j infecting a susceptible node i (Tii=0 for all i) and where γi denotes the rate parameter for an infectious individual i to recover, enabling individual-specific removal rates.
As shown by Sharkey (2011), for any transmission matrix T and any nodes i,j, the following differential equations are provably exact (consistent with the stochastic model):
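$$
\begin{aligned}
\dot{\langle S_i \rangle} &= -\sum_{j=1}^{P} T_{ij} \langle S_i I_j \rangle, \\
\dot{\langle I_i \rangle} &= \sum_{j=1}^{P} T_{ij} \langle S_i I_j \rangle - \gamma_i \langle I_i \rangle, \\
\dot{\langle S_i I_j \rangle} &= \sum_{k \neq i} T_{jk} \langle S_i S_j I_k \rangle - \sum_{k \neq j} T_{ik} \langle I_k S_i I_j \rangle - T_{ij} \langle S_i I_j \rangle - \gamma_j \langle S_i I_j \rangle, \\
\dot{\langle S_i S_j \rangle} &= -\sum_{k \neq j} T_{ik} \langle I_k S_i S_j \rangle - \sum_{k \neq i} T_{jk} \langle S_i S_j I_k \rangle,
\end{aligned}
\tag{1}
$$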
where 〈Si〉 and 〈Ii〉 denote the time-dependent probabilities (or equivalently the expected values of the indicator functions) for individual i to be susceptible and infectious, respectively, and expressions of the form 〈AiBj〉 denote the time-dependent probability that individual i is in state A and individual j is in state B with a similar interpretation of terms of the form 〈AiBjCk〉. Here and throughout, we adopt the dot notation to denote time derivatives. It follows that the expected population-level susceptible and infectious time series are given by ∑i〈Si〉 and ∑i〈Ii〉, respectively.
Note that it is a short step (see Sharkey 2008) from (1) to the familiar population-level pair equations (Keeling 1999; Keeling and Eames 2005; Rand 1999), also proved independently by Taylor et al. (2012) for the susceptible-infectious-susceptible variant.
This system can be completed by formulating differential equations for the triples, quadruples, and so forth until we reach the full system size. This yields a self-contained system of differential equations that exactly determines the probabilities of each quantity given initial conditions. However, cascading these equations up to the full system size will usually result in a system that is impractical to solve due to its sheer size. This is why this system is typically closed at some level by introducing a functional relation approximating higher-order probabilities in terms of lower-order ones. One of the most frequently used closure relations can be written as
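$$
\langle A_i B_j C_k \rangle \approx \frac{\langle A_i B_j \rangle \langle B_j C_k \rangle}{\langle B_j \rangle}
\tag{2}
$$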
for the current context. Applying this closure relation to our system at the level of pairs we arrive at the following system:
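$$
\begin{aligned}
\dot{\langle X_i \rangle} &= -\sum_{j=1}^{P} T_{ij} \langle X_i Y_j \rangle, \\
\dot{\langle Y_i \rangle} &= \sum_{j=1}^{P} T_{ij} \langle X_i Y_j \rangle - \gamma_i \langle Y_i \rangle, \\
\dot{\langle X_i Y_j \rangle} &= \sum_{k \neq i} T_{jk} \frac{\langle X_i X_j \rangle \langle X_j Y_k \rangle}{\langle X_j \rangle} - \sum_{k \neq j} T_{ik} \frac{\langle Y_k X_i \rangle \langle X_i Y_j \rangle}{\langle X_i \rangle} - (T_{ij} + \gamma_j) \langle X_i Y_j \rangle, \\
\dot{\langle X_i X_j \rangle} &= -\sum_{k \neq j} T_{ik} \frac{\langle Y_k X_i \rangle \langle X_i X_j \rangle}{\langle X_i \rangle} - \sum_{k \neq i} T_{jk} \frac{\langle X_i X_j \rangle \langle X_j Y_k \rangle}{\langle X_j \rangle},
\end{aligned}
\tag{3}
$$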
where we use X for susceptible and Y for infectious to emphasise that these are approximating differential equations based on the closure. When 〈Xi〉 in the denominator is zero, we assume that the approximation takes the value zero.
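As a rough illustration of how a closed pair-based system of this kind can be evaluated numerically, the Python sketch below integrates equations for the singles 〈Xi〉, 〈Yi〉 and the linked pairs 〈XiYj〉, 〈XiXj〉, using the closure 〈AiBjCk〉≈〈AiBj〉〈BjCk〉/〈Bj〉 and the zero-denominator convention stated above. It is not the Matlab code of Sharkey (2011): the function names (`ratio`, `pair_rhs`, `combine`, `solve_pair_model`), the fixed-step Runge–Kutta integrator, the step size and the restriction to linked pairs are all assumptions made for this sketch.

```python
def ratio(num, den):
    # Convention from the text: a closure term with zero denominator is taken as zero.
    return num / den if den > 0.0 else 0.0


def pair_rhs(T, gamma, state):
    """Time derivatives of [X, Y, XY, XX] under the pair closure.

    T[i][j] is the rate at which an infectious j infects a susceptible i.
    XY[i][j] approximates P(i susceptible, j infectious); XX[i][j] both susceptible.
    """
    X, Y, XY, XX = state
    P = len(T)
    dX, dY = [0.0] * P, [0.0] * P
    dXY = [[0.0] * P for _ in range(P)]
    dXX = [[0.0] * P for _ in range(P)]
    for i in range(P):
        force = sum(T[i][j] * XY[i][j] for j in range(P))
        dX[i] = -force
        dY[i] = force - gamma[i] * Y[i]
        for j in range(P):
            if i == j or (T[i][j] == 0 and T[j][i] == 0):
                continue  # only pairs joined by a network link are tracked
            # j infected by k != i while i stays susceptible (closed S_i S_j I_k term)
            gain = sum(T[j][k] * ratio(XX[i][j] * XY[j][k], X[j])
                       for k in range(P) if k != i)
            # i infected by k != j (closed I_k S_i I_j and I_k S_i S_j terms)
            loss_xy = sum(T[i][k] * ratio(XY[i][k] * XY[i][j], X[i])
                          for k in range(P) if k != j)
            loss_xx = sum(T[i][k] * ratio(XY[i][k] * XX[i][j], X[i])
                          for k in range(P) if k != j)
            dXY[i][j] = gain - loss_xy - (T[i][j] + gamma[j]) * XY[i][j]
            dXX[i][j] = -loss_xx - gain
    return [dX, dY, dXY, dXX]


def combine(a, b, h):
    """Return a + h*b elementwise for (nested) lists of floats."""
    if isinstance(a, list):
        return [combine(x, y, h) for x, y in zip(a, b)]
    return a + h * b


def solve_pair_model(T, gamma, initial, t_end, h=0.01):
    """Integrate with classical RK4 from a pure initial state such as 'ISI'."""
    P = len(T)
    X = [1.0 if s == "S" else 0.0 for s in initial]
    Y = [1.0 if s == "I" else 0.0 for s in initial]
    XY = [[X[i] * Y[j] for j in range(P)] for i in range(P)]
    XX = [[X[i] * X[j] for j in range(P)] for i in range(P)]
    state = [X, Y, XY, XX]
    for _ in range(round(t_end / h)):
        k1 = pair_rhs(T, gamma, state)
        k2 = pair_rhs(T, gamma, combine(state, k1, h / 2))
        k3 = pair_rhs(T, gamma, combine(state, k2, h / 2))
        k4 = pair_rhs(T, gamma, combine(state, k3, h))
        slope = combine(combine(k1, k2, 2.0), combine(k4, k3, 2.0), 1.0)
        state = combine(state, slope, h / 6)
    return state


# Example: undirected three-node line graph, both end nodes initially infectious.
tau, g = 1.0, 0.5
T = [[0.0, tau, 0.0], [tau, 0.0, tau], [0.0, tau, 0.0]]
X, Y, XY, XX = solve_pair_model(T, [g] * 3, "ISI", 2.0)
print(X[1], Y[1])
```

In this small example the two end nodes act on the middle node independently, so the computed 〈X2〉 can be compared with the product of the two escape probabilities, (γ/(τ+γ) + (τ/(τ+γ))e^{−(τ+γ)t})^2.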
In general, we consider networks (graphs) with directed and undirected edges. In what follows, we use the terminology “tree graph” to include graphs with directed edges where the underlying (equivalent undirected) graph is a tree. Our main aim is to show that when matrix T represents a tree and the system is initiated in a pure system state (that is, one of the 3^P possible configurations has probability 1 at time t=0), then the system can be closed at the level of pairs such that the closure holds exactly. Specifically, solving the closed system above, we obtain the same values for all marginal and pairwise-joint probabilities present in the unclosed system: 〈Xi〉=〈Si〉, 〈Yi〉=〈Ii〉, with similar equalities holding for the pairs.
In fact, we will prove the following theorem.
Theorem 1.1
Let us assume the following:
The graph (transmission network) is a tree (the underlying graph has no cycles).
The initial condition is a pure state, i.e. the system is initially in one of its 3^P possible configurations with probability 1.
Then the following relations hold:
〈SiSjIk〉〈Sj〉=〈SiSj〉〈SjIk〉 for all i∈{1,2,…,P} and for all j with links towards i and all k with links towards j: i≠k;
〈IkSiIj〉〈Si〉=〈IkSi〉〈SiIj〉 for all i∈{1,2,…,P} and for all j and k with links towards i: j≠k.
Remark
This theorem also holds for mixed (probabilistic) initial system states provided that the initial probabilities of the states of individuals in the system are uncorrelated. However, in general, mixed initial states cannot be represented exactly.
The theorem will be formulated in a more general context stating that even higher-order closure relations are also exact.
Figure 1 shows the numerical solution of (3) for a small network of 9 nodes, where it is clearly accurate to within the precision visible on the graph. Matlab code for solving the system of Eqs. (3) is provided in Sharkey (2011). This code also works on networks which are not trees but is no longer exact in these cases. For cycles of order three in the underlying graph, the alternative closure 〈AiBjCk〉=〈AiBj〉〈BjCk〉〈AiCk〉/(〈Ai〉〈Bj〉〈Ck〉) is used, which is believed to give increased accuracy in most circumstances, but such cycles do not occur in the tree graphs considered in the present work.
The structure of the paper is as follows. Section 2 introduces some notation which is needed to prove the result. This also contains an important theorem (Theorem 2.1) which specifies equations describing the probabilities of the states of arbitrary subsystems (the proof of this result is given in the Appendix). The relevant state space for our domain of a tree graph is then developed. Section 3 proves the main result, initially focusing on some special cases to help motivate and facilitate understanding of the main ideas and steps of the general proof in Sect. 3.5. The main ingredient for the general proof is Lemma 3.1 which is proved via Theorem 2.1. Theorem 3.1 then follows easily by induction from Lemma 3.1. The theorem as stated above is a simple corollary of Theorem 3.1. In Sect. 4 we discuss an application of the pair-based model to some special cases of graphs with cycles where it is also exact.
Formulating the Full System
In this section we introduce a new notation which will assist in formulating the set of differential equations for the full system. In (1) we formulated the differential equations up to the level of pairs and we said that this could be continued up to the full system level. This will be done formally here. In order to make the method clearer, using our existing notation let us first evaluate the full set of equations for the undirected line graph with three nodes which we refer to as the open triple, depicted in Fig. 2. Here we shall assume that the transmission rate parameter is τ across both links and that the removal rate is γ for all three nodes. Firstly we write all of the single node equations for this network. From (1):
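$$
\begin{aligned}
\dot{\langle I_1 \rangle} &= \tau \langle S_1 I_2 \rangle - \gamma \langle I_1 \rangle, \\
\dot{\langle I_2 \rangle} &= \tau \langle I_1 S_2 \rangle + \tau \langle S_2 I_3 \rangle - \gamma \langle I_2 \rangle, \\
\dot{\langle I_3 \rangle} &= \tau \langle I_2 S_3 \rangle - \gamma \langle I_3 \rangle,
\end{aligned}
\tag{4}
$$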
and
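$$
\begin{aligned}
\dot{\langle S_1 \rangle} &= -\tau \langle S_1 I_2 \rangle, \\
\dot{\langle S_2 \rangle} &= -\tau \langle I_1 S_2 \rangle - \tau \langle S_2 I_3 \rangle, \\
\dot{\langle S_3 \rangle} &= -\tau \langle I_2 S_3 \rangle.
\end{aligned}
\tag{5}
$$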
We also need to specify the following equations for pairs:
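$$
\begin{aligned}
\dot{\langle S_1 I_2 \rangle} &= \tau \langle S_1 S_2 I_3 \rangle - (\tau + \gamma) \langle S_1 I_2 \rangle, \\
\dot{\langle I_1 S_2 \rangle} &= -\tau \langle I_1 S_2 I_3 \rangle - (\tau + \gamma) \langle I_1 S_2 \rangle, \\
\dot{\langle S_2 I_3 \rangle} &= -\tau \langle I_1 S_2 I_3 \rangle - (\tau + \gamma) \langle S_2 I_3 \rangle, \\
\dot{\langle I_2 S_3 \rangle} &= \tau \langle I_1 S_2 S_3 \rangle - (\tau + \gamma) \langle I_2 S_3 \rangle.
\end{aligned}
\tag{6}
$$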
Finally, at the triple level we have from the master equation (since the system has only three nodes):
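$$
\begin{aligned}
\dot{\langle S_1 S_2 I_3 \rangle} &= -(\tau + \gamma) \langle S_1 S_2 I_3 \rangle, \\
\dot{\langle I_1 S_2 S_3 \rangle} &= -(\tau + \gamma) \langle I_1 S_2 S_3 \rangle, \\
\dot{\langle I_1 S_2 I_3 \rangle} &= -2(\tau + \gamma) \langle I_1 S_2 I_3 \rangle.
\end{aligned}
\tag{7}
$$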
In order to formulate the full system for an arbitrary graph, we introduce notation for the subsystem states.
Notation for System and Subsystem States
In general, our stochastic system (which we denote by Γ) comprises P individuals, each of which may be in any of the S, I or R states at any given time. In total, this corresponds to 3^P possible states. Denoting these system states by Γα, α∈{1,2,…,3^P}, the probabilities for each state are given by the master equation (or Kolmogorov equations):
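$$
\dot{\langle \Gamma^{\alpha} \rangle} = \sum_{\beta=1}^{3^P} \sigma^{\alpha\beta} \langle \Gamma^{\beta} \rangle
\tag{8}
$$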
where σ denotes a constant matrix of Poisson rate parameters. The master equation completely describes our stochastic system using a set of 3^P ordinary differential equations. Our overall objective is to show that (3) is implied by the master equation when T represents a tree graph.
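To make the size and structure of the master equation concrete, the following Python sketch enumerates all 3^P configurations of a small network and integrates their probabilities directly. It is only an illustration under stated assumptions: the names `master_rhs`, `axpy`, `solve_master` and `marginal` are hypothetical, the rate matrix σ is never stored explicitly (its action is applied transition by transition), and a fixed-step fourth-order Runge–Kutta scheme stands in for whatever solver one would use in practice.

```python
import itertools


def master_rhs(T, gamma, p):
    """Apply the rate matrix to p, where p maps configurations (tuples over 'SIR')
    to probabilities. T[i][j] is the rate at which infectious j infects susceptible i."""
    P = len(T)
    dp = dict.fromkeys(p, 0.0)
    for s, prob in p.items():
        if prob == 0.0:
            continue
        for i in range(P):
            if s[i] == "S":    # infection of i by its infectious neighbours
                rate, new = sum(T[i][j] for j in range(P) if s[j] == "I"), "I"
            elif s[i] == "I":  # removal of i
                rate, new = gamma[i], "R"
            else:
                continue
            if rate > 0.0:
                target = s[:i] + (new,) + s[i + 1:]
                dp[s] -= rate * prob
                dp[target] += rate * prob
    return dp


def axpy(a, b, h):
    """Return a + h*b for two dictionaries with identical keys."""
    return {s: a[s] + h * b[s] for s in a}


def solve_master(T, gamma, initial, t_end, h=0.01):
    """Integrate the master equation with classical RK4 from a pure state such as 'ISI'."""
    states = itertools.product("SIR", repeat=len(T))
    p = {s: 1.0 if s == tuple(initial) else 0.0 for s in states}
    for _ in range(round(t_end / h)):
        k1 = master_rhs(T, gamma, p)
        k2 = master_rhs(T, gamma, axpy(p, k1, h / 2))
        k3 = master_rhs(T, gamma, axpy(p, k2, h / 2))
        k4 = master_rhs(T, gamma, axpy(p, k3, h))
        p = {s: p[s] + h / 6 * (k1[s] + 2 * k2[s] + 2 * k3[s] + k4[s]) for s in p}
    return p


def marginal(p, pattern):
    """Probability that node i is in state a for every (i, a) in pattern."""
    return sum(prob for s, prob in p.items()
               if all(s[i] == a for i, a in pattern.items()))
```

For example, `marginal(solve_master(T, gamma, "ISI", 2.0), {1: "S"})` gives the probability that the middle node of a three-node line graph is still susceptible at t=2. The number of configurations grows as 3^P, so this direct approach is only practical for very small networks.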
It is useful for us to define a general subsystem ψW comprising r nodes in Γ indexed by vector W of length r: W=(W1,W2,…,Wi,…,Wr), Wi∈{1,2,…,P}, where we can assume W1<W2<⋯<Wr. We assume that the network connections of the nodes of ψW are a subset of the connections of Γ.
Let $\psi_W^A$ denote the state of subsystem ψW where A=(A1,A2,…,Ar), with Ai∈{S,I,R} ∀i∈{1,2,…,r}, is a sequence of S, I and R symbols of length r such that the state of node Wi is Ai. In terms of the notation of the previous section for subsystems of single nodes and pairs of nodes, we have $\psi_{i}^{S}=S_i$, $\psi_{i}^{I}=I_i$, $\psi_{i,j}^{S,I}=S_iI_j$, etc. We shall use these two notations interchangeably. We shall also sometimes treat indexing vectors such as W as sets such that n∈W means that the node n is in the subsystem ψW.
In general, although we can specify the states of each node with this type of notation, an important ambiguity remains because information about the network structure is not included. To remove this ambiguity, the notation should normally be used in the context of a sketch of the relevant network structure or where the network structure is clear from the context of its use (as in (1)).
Let us now show how the differential equations of the different subsystem states can be formulated in general.
Differential Equations for Subsystems
Here we obtain differential equations describing the rate of change of the state of any subsystem. First we make some definitions.
Definition 2.1
A neighbour of node i is a node with a network link directed towards i.
Definition 2.2
Ni denotes the set of neighbours of node i. That is: Tij≠0 ∀j∈Ni.
Definition 2.3
For the subsystem , if node Wk is infectious then:
Otherwise,
Remark
This operator changes the state of node Wk in subsystem ψW to S if it is infectious. If node Wk is susceptible or removed then it leaves the state unchanged.
Definition 2.4
For the subsystem state of r nodes, a subsystem of r+1 nodes can be generated as follows: Take k∈{1,2,…,r} and take a neighbour n of Wk outside of the subsystem with a network link towards Wk, i.e. let , n∉W. If Ak=S, then the generated subsystem state of (r+1) nodes is given by the generating rule:
i.e. the subsystem is extended by an infected at node n which is connected towards Wk. If Ak=I, then the generated subsystem state is given by:
i.e. the subsystem is extended by an infected at node n which is connected towards Wk and the state of node Wk is changed from I to S.
To complete the definition, if Ak=R then the operator leaves the subsystem unchanged. We also assume that for any state Ak where there is no link from node n to node Wk in the transmission matrix T, then the subsystem is also left unchanged.
Remark
The generated order r+1 subsystem is obtained by replacing a susceptible or infectious node Wk in the original subsystem by an SI arc such that the S node of the arc is put in the place of the node Wk and where the I node of the arc is external to the subsystem.
Definition 2.5
For the subsystem , if node Wk is removed then:
Otherwise, .
Definition 2.6
For any subsystem ψW of r nodes in state we define where k∈{1,2,…,r} and a∈{S,I,R} to have value 1 if Ak=a and to have value zero otherwise:
Theorem 2.1
The rate of change of the probability of a subsystem state is:
9 |
The proof of this theorem is a rather long diversion and can be found in the Appendix.
As an example of applying the theorem, we can use it to obtain the set of subsystem equations (1) by considering each equation in turn:
- For an infectious individual we obtain:
where the first term on the second line of (9) is zero because Tii=0 and the last term is zero because .
- If the subsystem is the pair SiIj then the sum over k is over k=1 and k=2 and W1=i, W2=j, A1=S, A2=I so:
where the first line corresponds to k=1 and the second to k=2.
- If the subsystem is the pair SiSj then the sum is over k=1 and k=2 where W1=i, W2=j, A1=S and A2=S so:
where both terms come from the first line of (9).
We have therefore obtained (1) in a slightly different notation (recall that Tii=0 ∀i∈{1,2,…,P}).
The State Space for a Tree Graph
Here we build up a state space which is sufficient to describe a tree graph. We first make some definitions.
Definition 2.7
An r-motif is a subsystem of Γ comprising r nodes and the network links between them such that it forms a weakly connected network.
Definition 2.8
An r-state is the state of an r-motif.
The state space that we need to consider is built up inductively from the states of single nodes by considering the infection process. Starting with the infected states of the single nodes , i∈{1,2,…,P}, (9) shows that they depend on the 2-states , j∈Ni as described by the generating rule (Definition 2.4).
The differential equations for in turn contain the 3-states , k∈Nj and , k∈Ni. The differential equations for the 3-states contain 4-states and typically, the differential equations for r-states contain (r+1)-states for r∈{1,2,…,(P−1)}. This state generation process can continue until we reach P-states which can only depend on other P-states.
Note that this process always forms subsystems which are motifs and that the motif states can never include removed nodes.
Definition 2.9
An out-neighbour of node i is a node with a network link from i towards it.
Proposition 2.1
For a tree graph, if the out-neighbours of the I nodes are all S in an r-motif, then this is true for all (r+1)-motifs generated from this r-motif.
Proof
This follows easily from the definition of the generating rule (Definition 2.4). □
Definition 2.10
Consider a tree graph and take the 1-motifs with I nodes: . The “basic state space” M is formed by these 1-states together with the set of motif states that can be iteratively generated from them using the generating rule (Definition 2.4).
Remark
Due to the method of its construction, the state space M gives a self-contained system of differential equations, i.e. the time derivatives of the probabilities of each motif state can be expressed in terms of the probabilities of other motif states in the state space. An example in the case of the open triple is given by the motifs in (4), (6) and (7).
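The construction of the basic state space can be mimicked in a few lines of Python. The sketch below is an illustration under assumptions rather than part of the formal development: motif states are represented as frozensets of (node, state) pairs, nodes are indexed from 0, and the names `neighbours`, `generate` and `basic_state_space` are hypothetical. It applies the generating rule of Definition 2.4 repeatedly, starting from the single-node I states.

```python
def neighbours(T, i):
    """Nodes with a network link directed towards i, i.e. n with T[i][n] != 0."""
    return [n for n in range(len(T)) if T[i][n] != 0]


def generate(T, motif):
    """Apply the generating rule once, in every possible way.

    motif is a frozenset of (node, state) pairs with state in {'S', 'I', 'R'}.
    """
    nodes = {node for node, _ in motif}
    out = set()
    for node, state in motif:
        if state == "R":
            continue  # removed nodes generate nothing
        for n in neighbours(T, node):
            if n in nodes:
                continue  # the neighbour must lie outside the subsystem
            # The node becomes (or stays) S and the outside neighbour n is added as I.
            new = (motif - {(node, state)}) | {(node, "S"), (n, "I")}
            out.add(frozenset(new))
    return out


def basic_state_space(T):
    """Close the single-node I states under the generating rule."""
    space = {frozenset({(i, "I")}) for i in range(len(T))}
    frontier = set(space)
    while frontier:
        new = set().union(*(generate(T, m) for m in frontier)) - space
        space |= new
        frontier = new
    return space


# Open triple (three-node undirected line graph); only the link pattern matters here.
T = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
M = basic_state_space(T)
print(len(M))  # 10
```

For the open triple this yields ten motif states: three single-node states, four pair states and three triple states, none of which contains a removed node.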
Definition 2.11
Consider a tree graph and the 1-states: and the 2-states with SS, i.e. . The “extended state space” comprises of these motifs states together with the set of motif states that can be generated from them by repeated iteration of the generating rule.
Remark
The extended state space is required to form the relevant closure relations. Due to the method of its construction, it is also self-contained.
Lemma 2.1
Let. Then the out-neighbour of anInode is anSnode in.
Proof
Follows from Proposition 2.1. □
Lemma 2.2
For a tree graph, the equation for the time derivative of the probability of an r-state is given by:
10 |
Proof
For these states we have . Additionally, when , the first term on the second line of (9) never arises because implies that an I is connected to an I node in r-state which contradicts Lemma 2.1. Therefore (9) reduces to (10). □
Let us now formulate the exact closure relations and prove our main result.
Closure Relation and Proof of the Main Result
The exactness of (3) is straightforward to see provided that outbreaks of epidemics are always initiated with a single infected individual. We prove this first before considering the general case.
Proof for Single Initial Infected
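When infection is initiated on a tree graph at a single individual, infection must always proceed in linear chains. Consequently there is no possibility of the state IkSjIi illustrated in Fig. 3 arising because an infection initiated at either k or i must pass through j to get to the other node. Furthermore, by the law of total probability,
$$
\langle S_j I_k \rangle = \langle S_i S_j I_k \rangle + \langle I_i S_j I_k \rangle + \langle R_i S_j I_k \rangle,
$$
but since 〈IiSjIk〉=0 and consequently 〈RiSjIk〉=0, we have:
$$
\langle S_i S_j I_k \rangle = \langle S_j I_k \rangle,
$$
reducing (1) to the following closed system:
$$
\begin{aligned}
\dot{\langle S_i \rangle} &= -\sum_{j=1}^{P} T_{ij} \langle S_i I_j \rangle, \\
\dot{\langle I_i \rangle} &= \sum_{j=1}^{P} T_{ij} \langle S_i I_j \rangle - \gamma_i \langle I_i \rangle, \\
\dot{\langle S_i I_j \rangle} &= \sum_{k \neq i} T_{jk} \langle S_j I_k \rangle - (T_{ij} + \gamma_j) \langle S_i I_j \rangle.
\end{aligned}
$$
Similar arguments show that this can be written in the form of (3).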
More generally, this argument also applies to any tree graph where there is at most one network path by which any susceptible individual in the network can become infectious from the initial configuration of infected individuals.
Before discussing the general proof for any tree graph with multiple initially infected individuals, we consider two very simple example networks which will serve to motivate and illustrate the method of proof.
Proof for an Open Triple
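Here we consider the case for the open triple depicted in Fig. 2. The equations for the probabilities of the basic state space M are given in (4), (6) and (7). To form the relevant closure relations, we require the equations for the extended state space formed by the equations for M together with (5) and the equations for the SS pairs:
$$
\begin{aligned}
\dot{\langle S_1 S_2 \rangle} &= -\tau \langle S_1 S_2 I_3 \rangle, \\
\dot{\langle S_2 S_3 \rangle} &= -\tau \langle I_1 S_2 S_3 \rangle.
\end{aligned}
\tag{11}
$$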
Our objective is to close the system at the level of pairs using the closure relation (2), eliminating the need for differential equations describing triples (7), and show that the system remains exact. We note that the exactness of (3) can be proved in this case along the lines of the previous argument by considering each possible initial condition separately; however, the approach discussed here will be more useful for understanding the general case.
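We need to consider closures for the triples 〈I1S2I3〉, 〈I1S2S3〉 and 〈S1S2I3〉. Let us consider the closure:
$$
\langle I_1 S_2 I_3 \rangle = \frac{\langle I_1 S_2 \rangle \langle S_2 I_3 \rangle}{\langle S_2 \rangle}.
$$
This is exact if α(t)=0 where
$$
\alpha = \langle I_1 S_2 I_3 \rangle \langle S_2 \rangle - \langle I_1 S_2 \rangle \langle S_2 I_3 \rangle
$$
and 〈S2〉≠0. Taking the derivative of α with respect to time gives
$$
\dot{\alpha} = \dot{\langle I_1 S_2 I_3 \rangle} \langle S_2 \rangle + \langle I_1 S_2 I_3 \rangle \dot{\langle S_2 \rangle} - \dot{\langle I_1 S_2 \rangle} \langle S_2 I_3 \rangle - \langle I_1 S_2 \rangle \dot{\langle S_2 I_3 \rangle}.
$$
Substituting the relevant derivatives in from (5)–(7) and cancelling terms reduces this to
$$
\dot{\alpha} = -2(\tau + \gamma) \left( \langle I_1 S_2 I_3 \rangle \langle S_2 \rangle - \langle I_1 S_2 \rangle \langle S_2 I_3 \rangle \right),
$$
so:
$$
\dot{\alpha} = -2(\tau + \gamma) \alpha.
$$
Now it is easily verified that provided the system is initiated in a specific system state then α(0)=0. Consequently α(t)=0 for all t≥0 and the closure is exact.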
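The role of the initial condition can be checked by brute force. Taking α for this closure to be 〈I1S2I3〉〈S2〉−〈I1S2〉〈S2I3〉, and noting that a linear homogeneous differential equation for α keeps α at zero only if α(0)=0, the short Python sketch below evaluates α(0) for every pure configuration of the open triple and for one correlated mixture. The function names `prob` and `alpha0`, the 0-based node indices and the particular mixture are assumptions chosen for illustration.

```python
import itertools


def prob(dist, pattern):
    """Probability that node i is in state a for every (i, a) in pattern.

    dist maps configurations such as 'ISI' to probabilities.
    """
    return sum(p for config, p in dist.items()
               if all(config[i] == a for i, a in pattern.items()))


def alpha0(dist):
    """<I1 S2 I3><S2> - <I1 S2><S2 I3> for the open triple (nodes indexed 0, 1, 2)."""
    return (prob(dist, {0: "I", 1: "S", 2: "I"}) * prob(dist, {1: "S"})
            - prob(dist, {0: "I", 1: "S"}) * prob(dist, {1: "S", 2: "I"}))


# Every pure initial state: one of the 27 configurations has probability 1.
pure_values = [alpha0({"".join(c): 1.0})
               for c in itertools.product("SIR", repeat=3)]

# A correlated mixture: either both end nodes are infectious or neither is.
mixed_value = alpha0({"ISI": 0.5, "SSS": 0.5})

print(set(pure_values), mixed_value)
```

All 27 pure states give α(0)=0, whereas the correlated mixture gives α(0)=0.25, so the closure cannot be exact from that starting point.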
By symmetry, it will suffice to consider one of the remaining two triples in (6). We wish to show that α(t)=0 where
Here it is necessary to also use (11) for pairs of type SS in the extended state space. This closure is not established immediately, but there is a two-step process to establishing that α(t)=0 which the reader can verify by analogy with the example of the star graph in the next section.
Proof for a Star Graph
We now consider the case of the undirected star graph with P=4 shown in Fig. 4, where again we assume that the strength is the same across each network link and is denoted by τ and the removal rate for each node is γ. Writing down the equations of the extended state space, there are two types of closure which need to be proved: one for the S−S−I triples and one for the I−S−I triples (see (1)). The graph has three triples ((1,4,3),(2,4,3),(1,4,2)), but it is sufficient to prove exactness for one of them. Hence we want to prove the following two relations:
12 |
For brevity, we adopt the alternative notation:
We introduce
By differentiating this, substituting in from the process equations and grouping terms, we obtain
13 |
where:
Differentiating α2 we get
14 |
where:
The derivatives of α3 and α4 can be obtained similarly.
Differentiating α5 we get
15 |
where:
The derivative for α6 can also be obtained. Finally, differentiating α7 and α8 we obtain:
16 |
To conclude the proof of the exactness of the closure, we first assume that the initial state is not mixed; that is, one of the 3^4=81 possible configurations has probability 1 at t=0. Then it is easy to see that αj(0)=0 for all j∈{1,2,…,8} (see Lemma 3.2 in Sect. 3.5 for a proof in a more general context). Hence the differential equations for α7 and α8 show that α7(t)=0 and α8(t)=0 for all t≥0. The differential equation for α5 then implies that α5=0 ∀t≥0 (and similarly for α6). This implies that α2=0 (and similarly α3=α4=0). The differential equation for α1 shows that α1=0 which is what we wanted to show. The other triple closure in (12) can be proved similarly.
Remark
In fact, we have proved several closure relations αj=0 ∀j∈{1,2,…,8}.
The closure relations each consist of two pairs which are visualised in Fig. 5. For reference, we refer to these as the left pair and the right pair referring to their position in this figure. Looking at these closure relations, we can form two observations:
For a given node i, the number of times it appears as Si is the same in the left and the right pair, and similarly with the number of times it appears as Ii. For example, with α5, node 1 (the left node) has one I and one S for both pairs and node 4 (central node) has two S’s in both pairs. For α6, the number of I’s at nodes 1, 2, 3 and 4 is (1,1,1,0) in both pairs and the number of S’s is given by (1,0,0,2) in both pairs.
Any SI pairing on the left appears exactly the same number of times on the right. For example in α7, I1S4 appears once on the left and once on the right and I3S4 appears twice on the left and twice on the right. Observing that only SS pairs and IS pairs appear in the closure relations, a consequence is that SS pairs also have this property.
These observations will be of key importance for developing the general proof in the following two sections.
General Closure Relations
In general, to show that the closure relationship (2) is exact for the tree graph, we need to show that α=0 where
$$
\alpha = \langle A_i B_j C_k \rangle \langle B_j \rangle - \langle A_i B_j \rangle \langle B_j C_k \rangle,
$$
with B=S and A,C∈{S,I}. Our proof of this is via induction using a sequence of closures analogous to the proof in the case of the star graph in Sect. 3.3.
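We shall consider many closure relations. In general we specify that they are composed of two pairs of motif states ($\psi_W^A$, $\psi_X^B$) and ($\psi_Y^C$, $\psi_Z^D$) and that the closure is exact if α=0 where
$$
\alpha = \langle \psi_W^A \rangle \langle \psi_X^B \rangle - \langle \psi_Y^C \rangle \langle \psi_Z^D \rangle.
$$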
We formalise the observations we made about the closure relations for the star graph at the end of Sect. 3.3 by defining what we term “compatible pairs”.
Definition 3.1
For all s.t. Wj=i and Aj=a.
For all .
Remark
In general, the notation denotes that the state of subsystem is implied by the state of subsystem because it is contained within it.
Definition 3.2
Two pairs of motif states and are called compatible pairs if the following conditions are met:
CP(i)
CP(ii)
CP(iii)
CP(iv)
CP(v) Same as CP(iii) and CP(iv) but with SS pairs
where a∈{S,I,R}.
Definition 3.3
Let be an r-state and be a q-state. Then the order of the pair is defined as r+q.
Proposition 3.1
If ($\psi_W^A$, $\psi_X^B$) and ($\psi_Y^C$, $\psi_Z^D$) are compatible pairs, then their orders are equal.
Proof
Follows from CP(i) and CP(ii). □
Proposition 3.2
For a tree graph, applying the transformation hi to each of the four motif states in compatible pairs that contain node i generates compatible pairs.
Proof
The transformation satisfies CP(i) and CP(ii) because it replaces with which does not alter the form of the conditions. The transformation satisfies CP(iii) and CP(iv) because all IS pairs where i is the infected individual are removed by this transformation. New IS pairs cannot be created by the transformation since this would require II pairs which are prohibited for tree graphs by Lemma 2.1. CP(v) is satisfied because the transformation leaves existing SS pairs unchanged and created SS pairs result from existing IS pairs so are balanced on each side. □
Proof of the Main Result
Lemma 3.1
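Let ($\psi_W^A$, $\psi_X^B$) and ($\psi_Y^C$, $\psi_Z^D$) be compatible pairs of order R and . Let
$$
\alpha_0 = \langle \psi_W^A \rangle \langle \psi_X^B \rangle - \langle \psi_Y^C \rangle \langle \psi_Z^D \rangle.
$$
Then
$$
\dot{\alpha}_0 = c_0 \alpha_0 + \sum_{p=1}^{m} c_p \alpha_p
\tag{17}
$$
where each αp has the same form as α0 but is built from compatible pairs of order R+1, c0 and cp are constants, and m is an integer denoting the number of terms in the summation.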
Remark
This is a general statement of the forms of (13)–(16) in the star graph example.
Proof
Take the derivative of α0:
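$$
\dot{\alpha}_0 = \dot{\langle \psi_W^A \rangle} \langle \psi_X^B \rangle + \langle \psi_W^A \rangle \dot{\langle \psi_X^B \rangle} - \dot{\langle \psi_Y^C \rangle} \langle \psi_Z^D \rangle - \langle \psi_Y^C \rangle \dot{\langle \psi_Z^D \rangle}
\tag{18}
$$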
We consider the terms associated with removal, transmission terms of order R and transmission terms of order R+1 separately. Firstly, from (10), this derivative contains the following terms associated with the removal process:
where the sums over k1,k2,k3,k4 are over all nodes in the motifs ψW,ψX,ψY,ψZ respectively and where
is easily seen to follow from CP(i) and CP(ii).
The right-hand side of (18) also contains the following transmission terms with motifs of order R:
where
and where the sums over k1,k2,k3,k4,l1,l2,l3,l4 are over all nodes in each of the relevant motifs. This follows from CP(iii) and CP(iv). Hence the removal terms and transmission terms of order R contribute c0α0 to the derivative of α0 where c0=−v−w.
For transmission terms with motifs of order R+1, consider the term in (18). This gives rise to the following terms in the derivative of α0:
To prove the lemma, it is sufficient to show that each term in this sum can be paired uniquely with a term in or in , such that the difference of these terms forms αp. By symmetry this is a one-to-one pairing establishing that each term of order R+1 on the right-hand side of (18) is accounted for exactly once in the sum of αp.
Let us take an element from the sum by choosing a node Ww, w∈{1,2,…,r}, and an outside neighbour node . This neighbour can either be in ψX or outside.
Case 1: Aw=S.
Consider first the case where Aw is a susceptible node, resulting in the following term in the sum:
where we can identify .
We have Aw=S, and n∉W. Let us now identify a term in to form compatible pairs. According to CP(i) and CP(ii) we can assume without loss of generality that Ww∈Y, i.e. ∃y:Yy=Ww and Cy=S (see Fig. 6). There are two subcases:
Subcase 1.1: n∉Y.
When n∉Y, either n∈Z or n∉Z and these are shown by solid and dashed lines respectively in Fig. 6(a). By CP(i), n∈Z⇔n∈X since n∉W so the solid lines match on the left and right pairs as do the dashed lines. The corresponding term must therefore be irrespective of whether n∈Z or not. Hence:
where and are easily seen to satisfy the definition of compatible pairs since the extra node is n which is I in both pairs.
Subcase 1.2: n∈Y.
If n∈Y, then the edge n→Yy is an SS or IS edge in C. By CP(iii), CP(iv) and CP(v), it is also the same edge in B because n∉W. Hence ∃x:Xx=Ww and Bx=S. By CP(ii), Ww∈Z is also true where ∃z:Zz=Ww and Dz=S. This is illustrated in Fig. 6(b). We therefore have the corresponding term and:
where again, the relevant pairs are seen to satisfy compatibility.
Case 2: Aw=I.
So far we have proved the existence of αp when Aw=S, , n∉W. Now we have to show αp can be defined when Aw=I, , n∉W. In this case, the motif generating rule firstly changes Aw=I to Aw=S and then applies the same generating rule as if Aw=S initially. From Proposition 3.2, applying the transformation to compatible pairs of order R produces compatible pairs of order R in the case of the tree graph. After this transformation, the argument runs identically to case 1. This completes the proof of Lemma 3.1. □
Lemma 3.2
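Assume that the initial condition is not mixed, i.e. the system is in a single configuration in {I,S}^P with probability 1 at t=0. If ($\psi_W^A$, $\psi_X^B$) and ($\psi_Y^C$, $\psi_Z^D$) are compatible pairs, then for any graph, $\langle \psi_W^A \rangle \langle \psi_X^B \rangle = \langle \psi_Y^C \rangle \langle \psi_Z^D \rangle$ at t=0.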
Proof
Assume that . Then by CP(i) and CP(ii), for all , we must have and/or . This is also true for all and, by the symmetry between the compatible pairs, it follows that . Similarly, it follows that implies that . □
Theorem 3.1
Let us assume the following:
The graph is a tree.
The initial condition is not mixed.
and are compatible pairs.
Then for all time t≥0.
Proof
We prove the theorem by induction according to the order of the closure. This is analogous to the proof for the star graph in Sect. 3.3.
Step 1.
If the closure is of order 2P, then it is exact. More precisely, if , , and are P-states, then (17) does not contain the summation terms and becomes:
Since we start from an initial condition that is not mixed, we have (by Lemma 3.2) α0(0)=0⇒α0(t)=0 ∀t≥0.
Step 2.
Assume that the theorem is proved for compatible pairs of order R+1. We prove that it is true for compatible pairs of order R. Applying Lemma 3.1, we have:
According to the induction condition, αp=0 ∀p because these are compatible pairs of order R+1. Therefore . From Lemma 3.2, α0(0)=0 so α0(t)=0∀t≥0. Since we have proved the result for compatible pairs of order 2P then we have completed the proof of the theorem. □
The lowest-order compatible pairs are of order four. The closure relation corresponding to these pairs is formulated in the following important corollary.
Corollary 3.1
Under the assumptions on the graph and the initial conditions in Theorem 3.1, we have the special cases:
for all i∈{1,2,…,P} and for all j∈Ni, k∈Nj: i≠k;
for all i∈{1,2,…,P} and for all j,k∈Ni: j≠k.
This corollary is Theorem 1.1 expressed in a different notation.
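The order-four closure can be checked numerically on a small tree. The sketch below is illustrative only and is not taken from the original work: it assumes the closure takes the usual pair-approximation form ⟨I_j S_i I_k⟩⟨S_i⟩ = ⟨I_j S_i⟩⟨S_i I_k⟩ for j,k∈N_i, integrates the full master equation on a three-node path with a fixed-step Runge-Kutta scheme, and compares the two sides. The function names, rate values and step size are all hypothetical choices.

```python
import math
from itertools import product

S, I, R = 0, 1, 2  # node states

def master_rhs(p, states, index, T, gamma):
    """Right-hand side of the master equation for Markovian SIR on a network.
    T[i][j] is the rate at which an infectious node j infects node i."""
    dp = [0.0] * len(states)
    n = len(T)
    for a, state in enumerate(states):
        for i in range(n):
            if state[i] == S:
                rate = sum(T[i][j] for j in range(n) if state[j] == I)
                new = I
            elif state[i] == I:
                rate, new = gamma[i], R
            else:
                continue
            if rate > 0.0:
                b = index[state[:i] + (new,) + state[i + 1:]]
                flow = rate * p[a]
                dp[a] -= flow
                dp[b] += flow
    return dp

def solve_master(T, gamma, initial, t_end, dt=0.01):
    """Integrate the master equation from a pure (unmixed) initial state with RK4."""
    states = list(product((S, I, R), repeat=len(T)))
    index = {s: a for a, s in enumerate(states)}
    p = [0.0] * len(states)
    p[index[tuple(initial)]] = 1.0
    f = lambda q: master_rhs(q, states, index, T, gamma)
    for _ in range(round(t_end / dt)):
        k1 = f(p)
        k2 = f([x + 0.5 * dt * k for x, k in zip(p, k1)])
        k3 = f([x + 0.5 * dt * k for x, k in zip(p, k2)])
        k4 = f([x + dt * k for x, k in zip(p, k3)])
        p = [x + dt * (a + 2 * b + 2 * c + d) / 6
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return states, p

def prob(states, p, pattern):
    """Probability that each node in `pattern` ({node: state}) is in the given state."""
    return sum(q for s, q in zip(states, p)
               if all(s[i] == v for i, v in pattern.items()))

# Path 0 - 1 - 2 (a tree) with heterogeneous, directed weights (hypothetical values).
T = [[0.0, 0.5, 0.0],
     [0.7, 0.0, 1.3],
     [0.0, 0.9, 0.0]]
gamma = [1.0, 0.8, 0.6]
states, p = solve_master(T, gamma, initial=(I, S, I), t_end=1.0)

triple = prob(states, p, {0: I, 1: S, 2: I})
lhs = triple * prob(states, p, {1: S})
rhs = prob(states, p, {0: I, 1: S}) * prob(states, p, {1: S, 2: I})
# For this particular initial state, the state (I, S, I) can only be left, so its
# probability decays exponentially at the total exit rate.
closed_form = math.exp(-(gamma[0] + T[1][0] + gamma[2] + T[1][2]) * 1.0)
print(lhs, rhs, triple, closed_form)
```

The two sides should agree up to the integration error. Replacing `T` with the weights of a triangle gives a simple way to explore how the identity behaves once the underlying graph contains a cycle.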
Remark
From CP(i) and CP(ii), it is clear that Lemma 3.2 can be extended to the mixed initial condition where the probabilities of the initial states of each individual in the system are statistically independent, leading to at t=0. However, for general mixed initial conditions where correlations between individuals can occur, Lemma 3.2 does not hold and the pair-based model is not exact.
Application to Some Graphs which Are not Trees
To complete this work, we make a final observation which shows that the pair-based model can sometimes provide an exact representation of infectious dynamics on graphs which are not strictly trees. We first make two definitions which can be understood with reference to the examples in Fig. 7.
Definition 4.1
A reduced representation is a graph which is constructed from the initial transmission network and the given initial conditions by removing transmission routes which cannot carry infection dynamics.
Definition 4.2
An independent segment is a region of a graph that is only connected to other regions via nodes in the segment which are initially infectious.
Theorem 4.1
Given SIR dynamics on a transmission network with infection and removal governed by Poisson processes and given an unmixed initial state of the system, if every independent segment of the reduced representation is a tree, then applying (3) to this representation exactly generates the expected infection dynamics on the original transmission network.
Proof
By definition, the infection dynamics of the system remain unchanged after the removal of edges which cannot support infection dynamics. Additionally, the infection dynamics of any independent segment are independent of the dynamics on the rest of the graph because there is no process that allows influence across the initially infectious nodes. If every independent segment of the resulting reduced representation is a tree, then (3) is an exact representation of the dynamics on each independent segment, so solving (3) on the reduced representation yields the expected infection dynamics on the original transmission network. □
Figure 7 shows some graphs and the associated representation graphs where the dashed lines indicate the boundaries that separate independent segments. For each of these examples, the solution of (3) on the representation graph exactly reproduces the expected infection dynamics of the original system.
This suggests that the accuracy of the pair-based model could be increased by first generating the representation graph for the particular network and initial conditions prior to numerically solving the pair-based model.
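One possible way to automate this preprocessing step is sketched below. It is a minimal illustration under stated assumptions rather than the authors' procedure: an edge is taken to be unable to carry infection if it points into a node that is not initially susceptible, leaves an initially removed node, or leaves a node that can never be reached by infection, and independent segments are obtained by cutting the reduced graph at the initially infectious nodes. The graph is assumed to have no self-loops, and all function names are hypothetical.

```python
from collections import defaultdict

def reduced_representation(edges, infectious, removed=frozenset()):
    """edges: set of (source, target) pairs, meaning source can infect target.
    Returns the subset of edges that could ever carry infection (assumed criteria)."""
    blocked = set(infectious) | set(removed)
    # Edges into non-susceptible nodes, or out of removed nodes, never transmit.
    live = {(u, v) for (u, v) in edges if v not in blocked and u not in removed}
    out = defaultdict(set)
    for u, v in live:
        out[u].add(v)
    # Nodes that can ever be infectious: reachable from an initially infectious node.
    reachable, stack = set(infectious), list(infectious)
    while stack:
        u = stack.pop()
        for v in out[u]:
            if v not in reachable:
                reachable.add(v)
                stack.append(v)
    return {(u, v) for (u, v) in live if u in reachable}

def independent_segments(reduced_edges, infectious):
    """Cut the underlying undirected graph at initially infectious nodes.
    Returns a list of (nodes, undirected_edges) pairs, one per segment."""
    infectious = set(infectious)
    und = {frozenset(e) for e in reduced_edges}
    nbrs = defaultdict(set)
    for e in und:
        u, v = tuple(e)
        nbrs[u].add(v)
        nbrs[v].add(u)
    segments, seen = [], set()
    for start in list(nbrs):
        if start in infectious or start in seen:
            continue
        nodes, stack = {start}, [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            for v in nbrs[u]:
                if v in nodes:
                    continue
                nodes.add(v)
                if v not in infectious:  # infectious nodes bound the segment
                    seen.add(v)
                    stack.append(v)
        segments.append((nodes, {e for e in und if e <= nodes}))
    return segments

def all_segments_are_trees(edges, infectious, removed=frozenset()):
    """True if every independent segment of the reduced representation is a tree."""
    red = reduced_representation(edges, infectious, removed)
    # Each segment is connected by construction, so it is a tree iff |E| = |V| - 1.
    return all(len(es) == len(ns) - 1
               for ns, es in independent_segments(red, infectious))

def bidirectional(pairs):
    """Helper: turn undirected pairs into directed edges in both directions."""
    return {(u, v) for u, v in pairs} | {(v, u) for u, v in pairs}

cycle4 = bidirectional([(0, 1), (1, 2), (2, 3), (3, 0)])
two_seeds = all_segments_are_trees(cycle4, infectious={0, 2})
one_seed = all_segments_are_trees(cycle4, infectious={0})
directed_cycle = all_segments_are_trees({(0, 1), (1, 2), (2, 0)}, infectious={0})
print(two_seeds, one_seed, directed_cycle)
```

In the four-node cycle with two opposite seeds, each susceptible node sits in its own segment bounded by the two infectious nodes, so every segment is a path. With a single seed the cycle survives the reduction. In the directed three-cycle, the edge pointing back into the seed is dropped, leaving a path.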
Discussion
We considered the pair-based variant of the subsystem approach to constructing epidemic models on networks (Sharkey 2008, 2011). We proved that for SIR dynamics on fixed tree graphs with exponentially distributed transmission and removal processes, the pair-based model provides an exact determination of the infection probability time course for each individual in the network. We also showed that the dynamics of some networks with cycles can be represented exactly by the pair-based model under specific initial conditions.
This represents the first provably exact deterministic model of epidemic dynamics on finite heterogeneous systems which has been numerically evaluated. Here we use the qualifying term “heterogeneous” to exclude systems with significant symmetry which may be employed to obtain exact representations in very specialised circumstances (Keeling and Ross 2008; Simon et al. 2011). In principle, the message-passing approach of Karrer and Newman (2010) will also yield an exact description of finite heterogeneous systems in a way that is numerically feasible, but to our knowledge this has not yet been implemented in this context. Interestingly, the message-passing method also applies more generally beyond the usual assumptions of Markovian dynamics to arbitrary distributions for transmission and removal processes, although there may be implementation issues for more general distributions.
We note that effective degree models can generate very good agreement with stochastic simulation (Ball and Neal 2008; Lindquist et al. 2011) as do the PGF or edge-based compartmental modelling methods (Miller et al. 2012; Miller and Volz 2012; Volz 2008), although exact correspondence has not been proven for these models. For some idealised networks, including fully connected networks and some configuration networks (Volz 2008), convergence to the expected value can be shown in the infinite population limit (Ball and Neal 2008; Decreusefond et al. 2012; Karrer and Newman 2010). However, these models have a large measure of homogeneity, and convergence only occurs for infinite populations.
It is intuitively understood that clustering is at the root of problems with models based around closures at the level of pairs (Keeling and Eames 2005). Previous analysis (Sharkey 2011) attributed the failure to anomalous terms which emerge in subsystem equations when differentiating closure approximations based around the statistical independence of individuals. Here, repeating similar analysis for a closure at the order of pairs in the context of tree graphs, these anomalies do not arise and we are able to prove that the closure is exact via induction.
In principle, models based around subsystems at the order of three nodes or higher could be constructed. The next higher-order model would require obtaining a closure which is able to preserve correlations between triples, and similarly for higher orders. This leads to an interesting theoretical question for future analysis: does the hierarchy of exact order-by-order models suggested in Sharkey (2011) exist, and if so, what form should the closure approximations take at each level? We conjecture that exact closures of a similar nature to those considered here are possible for networks with more structure, given that the order at which the closure is performed is guided by the network structure; future work will focus on this question.
Acknowledgements
This research was facilitated in part by the Research Centre for Mathematics and Modelling at The University of Liverpool. We thank two anonymous reviewers for helpful comments which improved the manuscript.
Appendix
The proof of Theorem 2.1 is analogous to the proof of the single and pair equations by Sharkey (2011). In what follows, summations over Greek indices α,β are assumed to be over all 3^P possible system states. First we make some definitions.
Definition A.1
For a system Γ in state α and a single node i of Γ in state a we define:
denoting whether or not the specified single node state matches the system state. Note that this is just Definition 2.6 applied to the full system.
Definition A.2
Proposition A.1
For all α,i:
where the summation is over all possible states available to node i.
Proof
This is the statement that for a given system state Γα, or subsystem state , each node must be in a unique state. □
Proposition A.2
For all β,i,a:
Proof
This is the statement that there is only one system state which is identical to Γβ except that node i is in state . □
Proposition A.3
For any subsystem and ∀k∈{1,2,…,r}:
for all α,a.
Proof
Proposition is true when . When we have:
□
Proposition A.4
For any subsystem and ∀k∈{1,2,…,r}:
for all β,a.
Proof
Proposition is true when:
From Proposition A.2 there must be a single state Γα for which , otherwise it is zero. When
we must also have (for the state when ):
because only site can change state during this transition, establishing the proposition. □
We can now use these propositions to prove Theorem 2.1:
Proof
We have that:
Taking the derivative of this with respect to time and substituting in the system master equation (8) gives
(19)
From Proposition A.1:
Multiplying the right of (19) by this gives:
This can be simplified using the fact that σαβ=0 whenever the state of the subsystem ψW differs by more than a single individual , k∈{1,2,…,r}, between states Γα and Γβ, which means that aj=bj=Aj for j≠k:
where the last equality follows from .
For SIR dynamics, we can do the summations over ak and bk:
(20)
Now we introduce the relevant terms in the transition matrix at the level of the system:
where these equations are designed so that they are satisfied for any combination of α,β,k. Substituting these into (20) gives:
We can rearrange the summation order:
and apply Proposition A.3:
Applying Proposition A.4 gives
Breaking up the sums over n on the first and third lines depending on whether the node n is internal or external to the motif ψW gives:
Lines 1 and 4 can be immediately recognised as the generating rule (Definition 2.4). For n∉W and :
and if then .
Line 2 requires that n∈W. Let l∈{1,2,…,r} and Wl=n. Then:
where the last equality follows from Proposition A.3. Using the definition of , this becomes:
and similarly for line 5.
We obtain:
where the operator on the 5th line is superfluous but allows us to write the equation in the form of (9). □
References
- Anderson R. M., May R. M. Infectious diseases of humans. London: Oxford University Press; 1991.
- Bailey N. T. J. The mathematical theory of infectious diseases. London: Griffin; 1975.
- Baker R. E., Simpson M. J. Correcting mean-field approximations for birth–death-movement processes. Phys. Rev. E. 2010;82. doi: 10.1103/PhysRevE.82.041905.
- Ball F., Neal P. Network epidemic models with two levels of mixing. Math. Biosci. 2008;212:69–87. doi: 10.1016/j.mbs.2008.01.001.
- Bartlett M. S. Deterministic and stochastic models for recurrent epidemics. In: Proc. third Berkeley symp. math. statist. prob. 1956. pp. 81–108.
- Born M., Green H. S. A general kinetic theory of liquids. I. The molecular distribution functions. Proc. R. Soc. Lond. A. 1946;188:10–18. doi: 10.1098/rspa.1946.0093.
- Decreusefond L., Dhersin J., Moyal P., Tran V. C. Large graph limit for an SIR process in random network with heterogeneous connectivity. Ann. Appl. Probab. 2012;22:541–575. doi: 10.1214/11-AAP773.
- Harada Y., Iwasa Y. Lattice population dynamics for plants with dispersing seeds and vegetative propagation. Res. Popul. Ecol. 1994;36:237–249. doi: 10.1007/BF02514940.
- Heathcote H. W. The mathematics of infectious diseases. SIAM Rev. 2000;42:599–653. doi: 10.1137/S0036144500371907.
- House T., Keeling M. J. Insights from unifying modern approximations to infections on networks. J. R. Soc. Interface. 2011;8:67–73. doi: 10.1098/rsif.2010.0179.
- Karrer B., Newman M. E. J. A message passing approach for general epidemic models. Phys. Rev. E. 2010;82. doi: 10.1103/PhysRevE.82.016101.
- Keeling M. J. The effects of local spatial structure on epidemiological invasions. Proc. Biol. Sci. 1999;266:859–867. doi: 10.1098/rspb.1999.0716.
- Keeling M. J., Eames K. T. D. Networks and epidemic models. J. R. Soc. Interface. 2005;2:295–307. doi: 10.1098/rsif.2005.0051.
- Keeling M. J., Ross J. On methods for studying stochastic disease dynamics. J. R. Soc. Interface. 2008;5:171–181. doi: 10.1098/rsif.2007.1106.
- Kermack W. O., McKendrick A. G. Contributions to the mathematical theory of epidemics. Proc. R. Soc. Lond. A. 1927;115:700–721. doi: 10.1098/rspa.1927.0118.
- Kirkwood J. G. The statistical mechanical theory of transport processes I. General theory. J. Chem. Phys. 1946;14:180–201. doi: 10.1063/1.1724117.
- Kirkwood J. G. The statistical mechanical theory of transport processes II. Transport in gases. J. Chem. Phys. 1947;15:72–76. doi: 10.1063/1.1746292.
- Kurtz T. G. Solutions of ordinary differential equations as limits of pure jump Markov processes. J. Appl. Probab. 1970;7:49–58. doi: 10.2307/3212147.
- Kurtz T. G. Limit theorems for sequences of jump Markov processes approximating ordinary differential processes. J. Appl. Probab. 1971;8:344–356. doi: 10.2307/3211904.
- Lindquist J., Ma J., van den Driessche P., Willeboordse F. H. Effective degree network disease models. J. Math. Biol. 2011;62:143–164. doi: 10.1007/s00285-010-0331-2.
- Markham D. C., Simpson M. J., Baker R. E. Simplified method for including spatial correlations in mean-field approximations. Phys. Rev. E. 2013;87. doi: 10.1103/PhysRevE.87.062702.
- Matsuda H., Ogita N., Sasaki A., Sato K. Statistical mechanics of populations: the lattice Lotka–Volterra model. Prog. Theor. Phys. 1992;88:1035–1049. doi: 10.1143/ptp/88.6.1035.
- Miller J. C., Slim A. C., Volz E. M. Edge-based compartmental modelling for infectious disease spread. J. R. Soc. Interface. 2012;9:890–906. doi: 10.1098/rsif.2011.0403.
- Miller J. C., Volz E. M. Model hierarchies in edge-based compartmental modeling for infectious disease spread. J. Math. Biol. 2012. doi: 10.1007/s00285-012-0572-3.
- Pastor-Satorras R., Vespignani A. Epidemic spreading in scale-free networks. Phys. Rev. Lett. 2001;86:3200–3203. doi: 10.1103/PhysRevLett.86.3200.
- Rand D. A. Correlation equations and pair approximations for spatial ecologies. In: McGlade J., editor. Advanced ecological theory: principles and applications. Oxford: Blackwell; 1999. pp. 100–142.
- Sato K., Matsuda H., Sasaki A. Pathogen invasion and host extinction in lattice structured populations. J. Math. Biol. 1994;32:251–268. doi: 10.1007/BF00163881.
- Sharkey K. J. Deterministic epidemiological models at the individual level. J. Math. Biol. 2008;57:311–331. doi: 10.1007/s00285-008-0161-7.
- Sharkey K. J. Deterministic epidemic models on contact networks: correlations and unbiological terms. Theor. Popul. Biol. 2011;79:115–129. doi: 10.1016/j.tpb.2011.01.004.
- Simon P. L., Taylor M., Kiss I. Z. Exact epidemic models on graphs using graph-automorphism driven lumping. J. Math. Biol. 2011;62:479–508. doi: 10.1007/s00285-010-0344-x.
- Simon P. L., Kiss I. Z. From exact stochastic to mean-field ODE models: a new approach to prove convergence results. IMA J. Appl. Math. 2011.
- Taylor M., Simon P. L., Green D. M., House T., Kiss I. Z. From Markovian to pairwise epidemic models and the performance of moment closure approximations. J. Math. Biol. 2012;64:1021–1042. doi: 10.1007/s00285-011-0443-3.
- Volz E. SIR dynamics in random networks with heterogeneous connectivity. J. Math. Biol. 2008;56:293–310. doi: 10.1007/s00285-007-0116-4.