2010 Jul 16;389(21):4755–4768. doi: 10.1016/j.physa.2010.07.014

Discovering network behind infectious disease outbreak

Yoshiharu Maeno
PMCID: PMC7125928  PMID: 32288081

Abstract

Stochasticity and spatial heterogeneity have recently attracted great interest in the study of the spread of infectious diseases. The method presented here solves an inverse problem: from a dataset on an infectious disease outbreak in its early growth phase, it discovers the effectively decisive topology of a heterogeneous network and reveals the transmission parameters that govern the stochastic spread over the network. Populations in a combination of epidemiological compartment models and a meta-population network model are described by stochastic differential equations. Probability density functions are derived from the equations and used for the maximal likelihood estimation of the topology and parameters. The method is tested with computationally synthesized datasets and the WHO dataset on the SARS outbreak.

Keywords: Epidemiological compartment model, Meta-population network model, Maximal likelihood estimation, Severe Acute Respiratory Syndrome, Stochastic differential equation

1. Introduction

When epidemiologists at a public health agency detect a signal of an infectious disease outbreak, they rely heavily on mathematical models of disease transmission to estimate the rate of transmission, predict the direction and speed of the spread, and work out an effective measure to contain the outbreak. Many of the models formulate stochasticity and spatial heterogeneity, which have recently attracted great interest. The spatial heterogeneity ranges from the uneven probabilities of contacts between the individuals in communities [1], [2], through the dependence of the strength of the demographical interactions between cities on the distance [3], to nation-wide or world-wide inhomogeneous geographical structures [4], [5].

A Monte-Carlo stochastic simulation is widely used to understand the influence of the spatial heterogeneity on stochastic spreading. In such a simulation, the accuracy and reproducibility of the input demographical knowledge, such as the amount of traffic between cities, have a great impact on the reliability of the output pattern of the movement of pathogens and their hosts. In studying world-wide epidemics, however, a mere collection of regular airline routes and aircraft capacities does not always represent the transportation network that results in the real chain of transmission. Some routes are decisively influential, but others are not. Examples are found in the spread of Severe Acute Respiratory Syndrome (SARS) from Asia to the world in 2003. There were no cases in Japan in spite of the heavy traffic there from Asian countries. Many patients appeared in Canada earlier than in the United States, to which airlines connect Asian countries much more densely. Here an interesting question arises. Inversely, is it possible to learn the effectively decisive transportation network by observing how the disease spreads, reinforce the demographical knowledge on the network, and import the acquired knowledge into the mathematical model? This is an inverse problem similar to network tomography [6], [7].

In this study, a statistical method is presented to discover the effectively decisive topology of a heterogeneous network and reveal the parameters which govern stochastic transmission from a dataset on the early growth phase of the outbreak. The dataset consists of either the number of infectious persons or the number of new cases per observation interval. The method is founded on a mathematical model for a stochastic reaction–diffusion process [8]. The population in the model is described by a set of Langevin equations, which are stochastic differential equations that include rapidly fluctuating and highly irregular functions of time. Probability density functions and likelihood functions are derived from the equations analytically, and used for the maximal likelihood estimation of the topology and parameters. The method is tested with a number of computationally synthesized datasets and the World Health Organization (WHO) dataset on the SARS outbreak in March through April 2003.

2. Problem

2.1. Stochastic model

The model in this study is a special case of a stochastic reaction–diffusion process. The model is a combination of standard epidemiological SIR or SIS compartment models and a meta-population network model [9]. The meta-population network model [10] sub-divides the entire population into distinct sub-populations in N geographical regions. Movement of persons occurs between the sub-populations while the epidemiological state transitions (infection and recovery) occur in a sub-population. A sub-population is randomly well-mixed. Heterogeneity is present between sub-populations.

The geographical regions are represented by nodes $n_i$ $(i=0,1,\ldots,N-1)$. The movement is parameterized by a matrix $\gamma$ whose element $\gamma_{ij}$ in the $i$-th row and $j$-th column is the probability that a person moves from $n_i$ to $n_j$ per unit time. A person remains at the same node with probability $1-\sum_{j=0}^{N-1}\gamma_{ij}$. Generally, $\gamma_{ij}=\gamma_{ji}$ does not hold. By definition, $\gamma_{ii}=0$. It is often confirmed empirically that a simple law relates a network topology to the movement [11]. The topology is specified by a neighbor matrix $l$. The transportation between two regions is represented by a pair of unidirectional links. If a pair of links is present between $n_i$ and $n_j$, $l_{ij}=l_{ji}=1$. If absent, $l_{ij}=l_{ji}=0$. By definition, $l_{ii}=0$. In the experiments in Section 4, an empirically confirmed law $\gamma_{ij}=\Gamma_{ij}(l)$ is postulated, and the topology and the probability of movement are treated interchangeably.

The SIR compartment model [12] is a behavioral extreme where immunity is life-long. The state of a person changes from a susceptible state (S), through an infectious state (I), to a recovered state (R). In contrast, immunity does not develop in the SIS compartment model: the state of a recovered patient goes back to S. The parameter $\alpha$ represents the probability that an infectious person contacts a person and infects that person per unit time. If the contacted person is susceptible, the number of infectious persons increases by 1. The effective rate of infection by a single infectious person is the product of $\alpha$ and the proportion of susceptible persons within the population. The parameter $\beta$ represents the probability that an infectious person recovers per unit time. These parameters are constant over sub-populations and time. The basic reproductive ratio $r$ is defined by $r=\alpha/\beta$ [13].

Movement, infection and recovery are Markovian stochastic processes governed by $\gamma_{ij}$, $\alpha$, and $\beta$.

2.2. Time evolution of spread

In a stochastic process, even if the initial condition is known, there are many possible trajectories which the process might go along. A set of these possible trajectories is a statistical ensemble. The change in the population is described by a set of Langevin equations [14]. A Langevin equation is a stochastic differential equation [15]. The microscopic continuous time evolution of a system is obtained by adding a fluctuation (a stochastic term) to the known macroscopic time evolution of the system.

The quantity $S_i(t)$ is the number of susceptible persons at a node $n_i$ at time $t$, $I_i(t)$ is the number of infectious persons, and $R_i(t)$ is the number of recovered persons. The change in $I_i(t)$ $(i=0,1,\ldots,N-1)$ is given by Eq. (1) [16], which is a set of $N$ stochastic differential equations.

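$$\begin{aligned}
\frac{dI_i(t)}{dt} ={}& \frac{\alpha S_i(t)I_i(t)}{S_i(t)+I_i(t)+R_i(t)} - \beta I_i(t) + \sum_{j=0}^{N-1}\gamma_{ji}I_j(t) - \sum_{j=0}^{N-1}\gamma_{ij}I_i(t) \\
&+ \sqrt{\frac{\alpha S_i(t)I_i(t)}{S_i(t)+I_i(t)+R_i(t)}}\,\xi_i^{[\alpha]}(t) - \sqrt{\beta I_i(t)}\,\xi_i^{[\beta]}(t) \\
&+ \sum_{j=0}^{N-1}\sqrt{\gamma_{ji}I_j(t)}\,\xi_{ji}^{[\gamma]}(t) - \sum_{j=0}^{N-1}\sqrt{\gamma_{ij}I_i(t)}\,\xi_{ij}^{[\gamma]}(t).
\end{aligned}\tag{1}$$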

Stochastic terms $\boldsymbol{\xi}(t)=(\xi_i^{[\alpha]}(t),\xi_i^{[\beta]}(t),\xi_{ij}^{[\gamma]}(t))$ are rapidly fluctuating and highly irregular functions of time. The number of terms is $M=N^2+N$ ($N$ terms for infection, $N$ terms for recovery, and $N(N-1)$ terms for movement). The functional forms of the individual elements $\xi_a(t)$ $(a=0,1,\ldots,M-1)$ are not known. Statistically, they are Gaussian white noise satisfying Eqs. (2) through (4).

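$$\langle \xi_a(t)\rangle_{\mathrm{ensemble}} = 0 \tag{2}$$
$$\langle \xi_a(t)\,\xi_b(u)\rangle_{\mathrm{ensemble}} = \delta_{ab}\,\delta(u-t) \tag{3}$$
$$\langle \xi_a(t)\,\xi_b(u)\,\xi_c(v)\cdots\rangle_{\mathrm{ensemble}} = 0. \tag{4}$$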

In these equations, $\delta(t)$ is the Dirac delta function and $\delta_{ab}$ is the Kronecker delta. The ensemble average of a variable $x$ is denoted by $\langle x\rangle_{\mathrm{ensemble}}$. Eq. (3) means that there is no correlation at different times or between different terms. Eq. (4) means that the third and higher order moments vanish.

In most cases, the outbreak is contained before the spread reaches equilibrium. In the early growth phase of the outbreak, $I_i\ll S_i$ and $R_i\ll S_i$ hold true. The first term on the right-hand side of Eq. (1) is then independent of $S_i$ and $R_i$ because $S_i/(S_i+I_i+R_i)\approx 1$. The resulting equation is Eq. (5). Eq. (5) can also be applied to the SIS model.

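$$\begin{aligned}
\frac{dI_i(t)}{dt} ={}& \alpha I_i(t) - \beta I_i(t) + \sum_{j=0}^{N-1}\gamma_{ji}I_j(t) - \sum_{j=0}^{N-1}\gamma_{ij}I_i(t) \\
&+ \sqrt{\alpha I_i(t)}\,\xi_i^{[\alpha]}(t) - \sqrt{\beta I_i(t)}\,\xi_i^{[\beta]}(t) + \sum_{j=0}^{N-1}\sqrt{\gamma_{ji}I_j(t)}\,\xi_{ji}^{[\gamma]}(t) - \sum_{j=0}^{N-1}\sqrt{\gamma_{ij}I_i(t)}\,\xi_{ij}^{[\gamma]}(t).
\end{aligned}\tag{5}$$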
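As an illustration of how Eq. (5) can be integrated numerically, the following is a minimal Euler–Maruyama sketch in Python with NumPy. It is not the implementation used in this study: the function name `simulate_early_phase`, the step size, and the clipping of negative values to zero are assumptions made for the example. The noise amplitudes are taken as the square roots of the corresponding rates, which is the form consistent with the variances in Eqs. (12) and (15).

```python
import numpy as np


def simulate_early_phase(gamma, alpha, beta, I0, dt=0.01, steps=1000, seed=0):
    """Euler-Maruyama integration of Eq. (5) (illustrative sketch).

    gamma[i, j] is the movement probability from node i to node j per unit
    time, with gamma[i, i] = 0. Returns an array of shape (steps + 1, N).
    """
    rng = np.random.default_rng(seed)
    gamma = np.asarray(gamma, dtype=float)
    n = gamma.shape[0]
    I = np.asarray(I0, dtype=float).copy()
    path = [I.copy()]
    for _ in range(steps):
        flow = gamma * I[:, None]  # flow[i, j] = gamma_ij * I_i
        drift = (alpha - beta) * I + flow.sum(axis=0) - flow.sum(axis=1)
        # One independent Wiener increment per noise term.
        dw_alpha = rng.normal(0.0, np.sqrt(dt), n)
        dw_beta = rng.normal(0.0, np.sqrt(dt), n)
        dw_gamma = rng.normal(0.0, np.sqrt(dt), (n, n))
        move = np.sqrt(flow) * dw_gamma  # fluctuation of the i -> j movement
        noise = (np.sqrt(alpha * I) * dw_alpha
                 - np.sqrt(beta * I) * dw_beta
                 + move.sum(axis=0) - move.sum(axis=1))
        # Assumption of this sketch: clip at zero so square roots stay defined.
        I = np.maximum(I + drift * dt + noise, 0.0)
        path.append(I.copy())
    return np.array(path)
```

Because each movement fluctuation is added at the destination node and subtracted at the origin node, the movement terms leave the total number of infectious persons unchanged in this sketch, apart from the clipping.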

The cumulative number of new cases until time $t$ is represented by $J_i(t)$ $(i=0,1,\ldots,N-1)$. The rate of increase in $J_i(t)$ equals the first term of Eq. (5), that is, $\alpha I_i(t)$. The time evolution of $J_i(t)$ is given by Eq. (6). The right-hand side does not depend on $J_i(t)$ itself.

$$\frac{dJ_i(t)}{dt} = \alpha I_i(t) + \sqrt{\alpha I_i(t)}\,\xi_i^{[\alpha]}(t). \tag{6}$$

The total number of infectious persons at time $t$ is given by $I(t)=\sum_{i=0}^{N-1}I_i(t)$. Its time evolution is given by Eq. (7). It does not depend on the values of $\gamma_{ij}$.

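$$\frac{dI(t)}{dt} = \alpha I(t) - \beta I(t) + \sum_{i=0}^{N-1}\sqrt{\alpha I_i(t)}\,\xi_i^{[\alpha]}(t) - \sum_{i=0}^{N-1}\sqrt{\beta I_i(t)}\,\xi_i^{[\beta]}(t). \tag{7}$$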
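The independence of Eq. (7) from $\gamma_{ij}$ can be checked by summing Eq. (5) over $i$. The short derivation below is a sketch, written with movement-noise amplitudes of the form $\sqrt{\gamma_{ij}I_i}$ (the form consistent with the covariance in Eq. (12)); both sums vanish after exchanging the labels $i$ and $j$ in one of the two terms.

```latex
\begin{align*}
\sum_{i=0}^{N-1}\Bigl(\sum_{j=0}^{N-1}\gamma_{ji}I_j-\sum_{j=0}^{N-1}\gamma_{ij}I_i\Bigr)
  &= \sum_{i,j}\gamma_{ji}I_j-\sum_{i,j}\gamma_{ij}I_i = 0,\\
\sum_{i=0}^{N-1}\Bigl(\sum_{j=0}^{N-1}\sqrt{\gamma_{ji}I_j}\,\xi^{[\gamma]}_{ji}
  -\sum_{j=0}^{N-1}\sqrt{\gamma_{ij}I_i}\,\xi^{[\gamma]}_{ij}\Bigr)
  &= \sum_{i,j}\sqrt{\gamma_{ji}I_j}\,\xi^{[\gamma]}_{ji}
   -\sum_{i,j}\sqrt{\gamma_{ij}I_i}\,\xi^{[\gamma]}_{ij} = 0.
\end{align*}
```

Only the infection and recovery terms survive the summation, which is why the total $I(t)$ carries information on $\alpha$ and $\beta$ but not on the topology.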

The total cumulative number of new cases until time $t$ is given by $J(t)=\sum_{i=0}^{N-1}J_i(t)$. Its time evolution is given by Eq. (8).

$$\frac{dJ(t)}{dt} = \alpha I(t) + \sum_{i=0}^{N-1}\sqrt{\alpha I_i(t)}\,\xi_i^{[\alpha]}(t). \tag{8}$$

2.3. Definition of problem

The problem is to discover the network topology $l$ (or $\gamma$) and reveal the transmission parameter $r$ (or $\alpha$ and $\beta$) from a given dataset $I_i(t_d)$ or $\Delta J_i(t_d)$ $(i=0,1,\ldots,N-1;\ d=0,1,\ldots,D-1)$. The dataset $I_i(t_d)$ is the time sequence of the number of infectious persons. The dataset $\Delta J_i(t_d)=J_i(t_{d+1})-J_i(t_d)$ is the time sequence of the number of new cases between observations. An observation is made at every node $n_i$ $(i=0,1,\ldots,N-1)$ at times $t_d$ $(d=0,1,\ldots,D-1)$. The time interval between observations is $\Delta t=t_{d+1}-t_d$. For example, a bundle of daily reports on cases from hospitals is a dataset $\Delta J_i(t_d)$ where $\Delta t=1$ day. No other information is known. That is, nothing is known about $S_i(t_d)$, $R_i(t_d)$, or the initial condition which could identify the index case (the first patient from whom the infectious disease has spread).

3. Method

3.1. Likelihood function

Various techniques of statistical inference can be applied once the likelihood function is obtained analytically. The likelihood function is the conditional probability of the obtained dataset as a function of the unknown parameters of a parameterized statistical model. The conditional probability becomes noticeably large if the values of the parameters are close to the true values. For example, maximal a posteriori estimation finds the parameters which maximize the posterior distribution. In this study, the problem is solved by maximal likelihood estimation. The Langevin equations (5) through (8) are solved by obtaining the moments of the probability variables at time $t$, so that the probability density functions and logarithmic likelihood functions can be derived, rather than by calculating the trajectories of time-dependent variables for a given functional form of $\xi_a(t)$ [4]. Four logarithmic likelihood functions $L^{[I1]}(\theta)$, $L^{[I2]}(\theta)$, $L^{[J1]}(\theta)$, and $L^{[J2]}(\theta)$ are derived for the given datasets $I_i(t_d)$, $I(t_d)$, $\Delta J_i(t_d)$, and $\Delta J(t_d)$, respectively, under the unknown parameters $\theta=\{\gamma,\alpha,\beta\}$.

Appendix A presents the procedure to solve the Langevin equations through a Fokker–Planck equation [17]. Appendix B summarizes the formulas for the time evolution of $\boldsymbol{m}^{[I]}(t|\theta)$ and $\boldsymbol{m}^{[J]}(t|\theta)$ (row vectors whose $i$-th elements are the means of $I_i$ and $J_i$) and of $\boldsymbol{v}^{[II]}(t|\theta)$, $\boldsymbol{v}^{[IJ]}(t|\theta)$, and $\boldsymbol{v}^{[JJ]}(t|\theta)$ (matrices whose elements in the $i$-th row and $j$-th column are the covariances between $I_i$ and $I_j$, $I_i$ and $J_j$, and $J_i$ and $J_j$). Appendix C summarizes the formulas for the time evolution of $m^{[I]}(t|\theta)$ and $m^{[J]}(t|\theta)$ (the means of $I$ and $J$) and of $v^{[II]}(t|\theta)$, $v^{[IJ]}(t|\theta)$, and $v^{[JJ]}(t|\theta)$ (the variance of $I$, the covariance between $I$ and $J$, and the variance of $J$).

3.1.1. Case 1: for given $I_i(t_d)$

The logarithmic likelihood function $L^{[I1]}(\theta)$ is determined by $I_i(t_d)$. If the third and higher order moments are ignored, the probability density function $p^{[I1]}(\boldsymbol{I},t_{d+1}|\theta)$ is a multi-variate Gaussian distribution with the mean $\boldsymbol{m}^{[I]}(t_{d+1}|\theta)$ and covariance $\boldsymbol{v}^{[II]}(t_{d+1}|\theta)$, as in Eq. (9). This is the probability density of $\boldsymbol{I}=(I_0,\ldots,I_{N-1})$ at $t=t_{d+1}$ given $\theta$. $X^{\mathrm{T}}$ is the transpose of a matrix $X$.

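$$p^{[I1]}(\boldsymbol{I},t_{d+1}|\theta)=\frac{\exp\Bigl(-\frac{1}{2}\bigl(\boldsymbol{I}-\boldsymbol{m}^{[I]}(t_{d+1}|\theta)\bigr)\,\boldsymbol{v}^{[II]}(t_{d+1}|\theta)^{-1}\bigl(\boldsymbol{I}-\boldsymbol{m}^{[I]}(t_{d+1}|\theta)\bigr)^{\mathrm{T}}\Bigr)}{\sqrt{(2\pi)^{N}\det \boldsymbol{v}^{[II]}(t_{d+1}|\theta)}}. \tag{9}$$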

$L^{[I1]}(\theta)$ is the logarithm of the product of the probabilities of the individual observations $\boldsymbol{I}(t_{d+1})$ at $t=t_{d+1}$. It is given by Eq. (10).

$$L^{[I1]}(\theta)=\sum_{d=0}^{D-2}\log p^{[I1]}(\boldsymbol{I}(t_{d+1}),t_{d+1}|\theta). \tag{10}$$

If $\Delta t$ is small, the formulas for the moments become simpler. The exact formulas for $\boldsymbol{m}^{[I]}(t|\theta)$ and $\boldsymbol{v}^{[II]}(t|\theta)$ are expanded in terms of $\Delta t$, and the second and higher order terms are ignored. If the data $I_i(t_d)$ at $t=t_d$ are completely reliable, $m_i^{[I]}(t_d|\theta)=I_i(t_d)$ and $v_{ij}^{[II]}(t_d|\theta)=0$. The moments after $\Delta t$ are given by Eqs. (11) and (12). The moments at $t=t_{d+1}$ depend only on the data at $t=t_d$.

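$$m_i^{[I]}(t_{d+1}|\theta)\approx I_i(t_d)+\Bigl\{\Bigl(\alpha-\beta-\sum_{k=0}^{N-1}\gamma_{ik}\Bigr)I_i(t_d)+\sum_{j=0}^{N-1}\gamma_{ji}I_j(t_d)\Bigr\}\Delta t \tag{11}$$
$$v_{ij}^{[II]}(t_{d+1}|\theta)\approx\Bigl[\Bigl\{\Bigl(\alpha+\beta+\sum_{k=0}^{N-1}\gamma_{ik}\Bigr)I_i(t_d)+\sum_{k=0}^{N-1}\gamma_{ki}I_k(t_d)\Bigr\}\delta_{ij}-\gamma_{ij}I_i(t_d)-\gamma_{ji}I_j(t_d)\Bigr]\Delta t. \tag{12}$$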
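The small-$\Delta t$ moments and the Gaussian log-likelihood of Eqs. (9) through (12) translate directly into array operations. The sketch below is one possible NumPy rendering, not the code used in this study; the function names `moments_small_dt` and `log_likelihood_I1` are hypothetical, and the data are assumed to be an array of shape $(D, N)$ with a non-singular covariance at every step.

```python
import numpy as np


def moments_small_dt(I_d, gamma, alpha, beta, dt):
    """Mean (Eq. 11) and covariance (Eq. 12) one interval after observing I_d."""
    I_d = np.asarray(I_d, dtype=float)
    gamma = np.asarray(gamma, dtype=float)   # gamma[i, j]: i -> j, zero diagonal
    out_rate = gamma.sum(axis=1)             # sum_k gamma_ik
    inflow = gamma.T @ I_d                   # sum_j gamma_ji I_j
    mean = I_d + ((alpha - beta - out_rate) * I_d + inflow) * dt
    flow = gamma * I_d[:, None]              # flow[i, j] = gamma_ij I_i
    diag = (alpha + beta + out_rate) * I_d + inflow
    cov = (np.diag(diag) - flow - flow.T) * dt
    return mean, cov


def log_likelihood_I1(I, gamma, alpha, beta, dt):
    """Sum over d of the log of the Gaussian density in Eq. (9), as in Eq. (10)."""
    I = np.asarray(I, dtype=float)           # shape (D, N)
    n = I.shape[1]
    total = 0.0
    for d in range(len(I) - 1):
        mean, cov = moments_small_dt(I[d], gamma, alpha, beta, dt)
        resid = I[d + 1] - mean
        # Assumption: cov is positive definite (it can be singular if a node
        # and all of its neighbours have no infectious persons).
        _, logdet = np.linalg.slogdet(cov)
        quad = resid @ np.linalg.solve(cov, resid)
        total += -0.5 * (quad + logdet + n * np.log(2.0 * np.pi))
    return total
```

Summing the entries of the mean and the covariance over all nodes reproduces the scalar moments $(1+(\alpha-\beta)\Delta t)\,I(t_d)$ and $(\alpha+\beta)\,I(t_d)\,\Delta t$, which is a convenient consistency check on the movement terms.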

Similarly, the logarithmic likelihood function $L^{[I2]}(\theta)$ is determined by $I(t_d)$, which can be calculated from the given dataset $I_i(t_d)$. The probability density function $p^{[I2]}(I,t_{d+1}|\theta)=p^{[I2]}(I,t_{d+1}|\alpha,\beta)$ is a Gaussian distribution with the mean $m^{[I]}(t_{d+1}|\theta)$ and variance $v^{[II]}(t_{d+1}|\theta)$. It does not depend on $\gamma$. $L^{[I2]}(\theta)=L^{[I2]}(\alpha,\beta)$ is the logarithm of the product of the probabilities of the individual observations $I(t_{d+1})$ at $t=t_{d+1}$. It is given by Eq. (13).

$$L^{[I2]}(\alpha,\beta)=\sum_{d=0}^{D-2}\log p^{[I2]}(I(t_{d+1}),t_{d+1}|\alpha,\beta). \tag{13}$$

Again, if $\Delta t$ is small, the formulas for the moments become simpler. They are given by Eqs. (14) and (15).

$$m^{[I]}(t_{d+1}|\theta)\approx I(t_d)+(\alpha-\beta)\,I(t_d)\,\Delta t \tag{14}$$
$$v^{[II]}(t_{d+1}|\theta)\approx(\alpha+\beta)\,I(t_d)\,\Delta t. \tag{15}$$

3.1.2. Case 2: for given $\Delta J_i(t_d)$

The logarithmic likelihood function $L^{[J1]}(\theta)$ is determined by $\Delta J_i(t_d)$. The probability density function $p^{[J1]}(\boldsymbol{J},t_{d+1}|\theta)$ is a multi-variate Gaussian distribution of the same functional form as Eq. (9), with the mean $\boldsymbol{m}^{[J]}(t_{d+1}|\theta)$ and covariance $\boldsymbol{v}^{[JJ]}(t_{d+1}|\theta)$. The functional form of $L^{[J1]}$ is also the same as Eq. (10). $J_i(t_{d+1})$ can be calculated from the dataset by $J_i(t_{d+1})=J_i(t_0)+\sum_{d'=0}^{d}\Delta J_i(t_{d'})$. The first observation $\Delta J_i(t_0)$ may include all the known cases at that time. Then, $J_i(t_0)$ can be deleted from the above formula. There is, however, a big difference from $L^{[I1]}(\theta)$. The probability at $t=t_{d+1}$ is not determined by the data $J_i(t_d)$ because the moments of $J_i$ at $t=t_{d+1}$ depend on $I_i$ at $t=t_d$, whose value is not known. An approximation such as Eq. (14) or (15) is not correct. Thus, the exact formula for $\boldsymbol{v}^{[JJ]}(t|\theta)$ as a function of $t$ must be evaluated to calculate the value of $L^{[J1]}(\theta)$.

Similarly, the logarithmic likelihood function $L^{[J2]}(\theta)$ is determined by $J(t_d)$, which can be calculated from the given dataset $\Delta J_i(t_d)$. The probability density function $p^{[J2]}(J,t_{d+1}|\theta)=p^{[J2]}(J,t_{d+1}|\alpha,\beta)$ is a Gaussian distribution with the mean $m^{[J]}(t_{d+1}|\theta)$ and variance $v^{[JJ]}(t_{d+1}|\theta)$. The functional form of $L^{[J2]}(\theta)=L^{[J2]}(\alpha,\beta)$ is the same as Eq. (13). The approximation in Eqs. (14) and (15) is not correct either. Thus, the exact formula for $v^{[JJ]}(t|\theta)$ as a function of $t$ must be evaluated to calculate the value of $L^{[J2]}(\alpha,\beta)$.

3.2. Estimation procedure

Theoretically, every formula in the following can be applied to estimating $\gamma_{ij}$ directly, as well as to estimating $l$ and obtaining $\gamma$ by the law $\gamma_{ij}=\Gamma_{ij}(l)$. However, the estimation of the $N(N-1)/2$ binary parameters $l_{ij}$ $(i<j)$ tends to be more robust than that of the $N(N-1)$ continuous parameters $\gamma_{ij}$ $(i\neq j)$. The binary parameters are suitable for reliable combinatorial optimization by means of well-established numerical algorithms and computational implementations. Thus, the estimation of $l$ is detailed here and demonstrated in Section 4.

3.2.1. Case 1: for given $I_i(t_d)$

The procedure for the estimation from $I_i(t_d)$ is presented. The problem is solved by dividing it into two sub-problems and solving them sequentially, rather than by searching for the maximal likelihood estimators $\hat{\alpha}$, $\hat{\beta}$, and $\hat{l}$ simultaneously. The first sub-problem is to obtain $\hat{\alpha}$ and $\hat{\beta}$ by solving Eq. (16).

$$\hat{\alpha},\hat{\beta}=\arg\max_{\alpha,\beta}L^{[I2]}(\alpha,\beta). \tag{16}$$

The estimators are given by Eqs. (17) and (18), where $\Delta I(t_d)=I(t_{d+1})-I(t_d)$.

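$$\hat{\alpha}=\frac{1}{2\Delta t}\left\{\frac{1}{D}\sum_{d=0}^{D-1}\frac{\Delta I(t_d)^2}{I(t_d)}-\frac{1}{D}\frac{\bigl(\sum_{d=0}^{D-1}\Delta I(t_d)\bigr)^2}{\sum_{d=0}^{D-1}I(t_d)}+\frac{\sum_{d=0}^{D-1}\Delta I(t_d)}{\sum_{d=0}^{D-1}I(t_d)}\right\} \tag{17}$$
$$\hat{\beta}=\frac{1}{2\Delta t}\left\{\frac{1}{D}\sum_{d=0}^{D-1}\frac{\Delta I(t_d)^2}{I(t_d)}-\frac{1}{D}\frac{\bigl(\sum_{d=0}^{D-1}\Delta I(t_d)\bigr)^2}{\sum_{d=0}^{D-1}I(t_d)}-\frac{\sum_{d=0}^{D-1}\Delta I(t_d)}{\sum_{d=0}^{D-1}I(t_d)}\right\}. \tag{18}$$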
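The closed-form estimators follow from setting the derivatives of $L^{[I2]}$ with respect to $\alpha-\beta$ and $\alpha+\beta$ to zero under the Gaussian moments of Eqs. (14) and (15). The sketch below is an illustration in plain Python, not the code of this study. The function names are hypothetical, and the sums run over the increments that can actually be formed from the observations (an assumption of this sketch about the summation range).

```python
import math


def log_likelihood_I2(I_total, alpha, beta, dt):
    """Eq. (13) with the small-dt mean and variance of Eqs. (14) and (15)."""
    total = 0.0
    for cur, nxt in zip(I_total, I_total[1:]):
        mean = cur + (alpha - beta) * cur * dt
        var = (alpha + beta) * cur * dt
        total += -0.5 * (math.log(2.0 * math.pi * var) + (nxt - mean) ** 2 / var)
    return total


def estimate_alpha_beta(I_total, dt):
    """Closed-form maximizers of log_likelihood_I2 (cf. Eqs. 17 and 18)."""
    dI = [nxt - cur for cur, nxt in zip(I_total, I_total[1:])]
    I = I_total[:-1]
    n = len(dI)                      # number of available increments
    sum_dI, sum_I = sum(dI), sum(I)
    growth = sum_dI / sum_I          # estimates (alpha - beta) * dt
    spread = (sum(x * x / y for x, y in zip(dI, I)) - sum_dI ** 2 / sum_I) / n
    # spread estimates (alpha + beta) * dt
    alpha = (spread + growth) / (2.0 * dt)
    beta = (spread - growth) / (2.0 * dt)
    return alpha, beta
```

For example, `estimate_alpha_beta([200, 210, 215, 230, 236, 250], 1.0)` returns a pair at which `log_likelihood_I2` is larger than at nearby parameter values, which is a quick way to check the formulas numerically.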

The second sub-problem is to obtain the maximal likelihood estimator $\hat{l}$ using the obtained values of $\hat{\alpha}$ and $\hat{\beta}$. It is obtained by solving Eq. (19).

$$\hat{l}=\arg\max_{l}L^{[I1]}(\hat{\alpha},\hat{\beta},\Gamma_{ij}(l)). \tag{19}$$

Eq. (19) cannot be solved analytically. There are about $10^{14}$ possible topologies for $N=10$, and about $10^{57}$ for $N=20$. Simulated annealing [18] is a powerful meta-heuristic algorithm for solving such a combinatorial global optimization problem. A candidate $l'$ is generated randomly near the present value of $l$. In the $s$-th step $(s=0,1,\ldots)$ of the iterations, the parameters are updated from $\gamma_{ij}=\Gamma_{ij}(l)$ to $\Gamma_{ij}(l')$ with the probability $p(s)$ in Eq. (20).

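$$p(s)=\min\left(\exp\left(\frac{L^{[I1]}(\hat{\alpha},\hat{\beta},\Gamma_{ij}(l'))-L^{[I1]}(\hat{\alpha},\hat{\beta},\Gamma_{ij}(l))}{kT(s)}\right),1\right). \tag{20}$$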

$T(s)$ is the annealing temperature in the $s$-th step. A typical cooling schedule is $T(s)=1/\log(s+1)$. Since $T(s)$ is of order 1, the scaling constant $k$ is chosen to be of the same order as $L^{[I1]}$.
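The annealing loop described above can be sketched as follows. This is a generic illustration in plain Python, not the implementation used in this study: the function name `anneal_topology`, the single-link flip as the way of generating a nearby candidate, the empty starting topology, and starting the step counter at 1 (so that the temperature is finite) are all assumptions of the sketch. The argument `log_likelihood` stands for any function of the neighbor matrix, such as the one maximized in Eq. (19).

```python
import math
import random


def anneal_topology(log_likelihood, n_nodes, steps=2000, k=1.0, seed=0):
    """Maximize log_likelihood(l) over symmetric 0/1 matrices l (sketch)."""
    rng = random.Random(seed)
    l = [[0] * n_nodes for _ in range(n_nodes)]
    current = log_likelihood(l)
    best, best_value = [row[:] for row in l], current
    for s in range(1, steps + 1):
        # Nearby candidate: flip the pair of links between two random nodes.
        i, j = rng.sample(range(n_nodes), 2)
        l[i][j] ^= 1
        l[j][i] = l[i][j]
        candidate = log_likelihood(l)
        temperature = 1.0 / math.log(s + 1)      # cooling schedule T(s)
        delta = candidate - current
        # Acceptance probability min(exp(delta / (k T)), 1), as in Eq. (20).
        if delta >= 0 or rng.random() < math.exp(delta / (k * temperature)):
            current = candidate
            if current > best_value:
                best, best_value = [row[:] for row in l], current
        else:
            l[i][j] ^= 1                         # reject: undo the flip
            l[j][i] = l[i][j]
    return best, best_value
```

Keeping the best topology seen so far, rather than only the final state, is a design choice of this sketch; it costs one copy per improvement and guards against a late uphill move being followed by an accepted downhill one.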

3.2.2. Case 2: for given $\Delta J_i(t_d)$

The procedure for the estimation from $\Delta J_i(t_d)$ is presented. Again, the problem is divided into two sub-problems. The first sub-problem is to solve Eq. (21). The quantity $I(0)$ is the initial value of the number of infectious persons, which appears in the formulas for the mean and variance of $J$ in Appendix C. It is not the same as the known $J(t_0)$, but an unknown parameter.

$$\hat{\alpha},\hat{\beta},\hat{I}(0)=\arg\max_{\alpha,\beta,I(0)}L^{[J2]}(\alpha,\beta,I(0)). \tag{21}$$

Simulated annealing uses the probability $p(s)$ in Eq. (22) for the update to a candidate $(\alpha',\beta',I'(0))$. An alternative means of solving Eq. (21) is a function maximization algorithm such as the BFGS quasi-Newton method [18].

$$p(s)=\min\left(\exp\left(\frac{L^{[J2]}(\alpha',\beta',I'(0))-L^{[J2]}(\alpha,\beta,I(0))}{kT(s)}\right),1\right). \tag{22}$$

A great difficulty in maximizing $L^{[J1]}(\theta)$ is encountered in solving the second sub-problem. The very complex formula for $\boldsymbol{v}^{[JJ]}(t|\theta)$ needed to obtain the value of $L^{[J1]}(\theta)$ is not tractable even numerically unless $N$ is very small. An approximation is introduced to convert this problem to the computationally tractable second sub-problem in Section 3.2.1. The value of $I_i(t_d)$ is approximately obtained from the value of $\Delta J_i(t_d)$ by Eq. (23), which uses the already obtained value of $\hat{\alpha}$. Eq. (19) is solved with the converted values of $I_i(t_d)$ instead of maximizing $L^{[J1]}(\theta)$ directly.

$$I_i(t_d)\approx\frac{\Delta J_i(t_d)}{\hat{\alpha}\,\Delta t}. \tag{23}$$

Eq. (23) is a discrete time approximation of Eq. (6) for small $\Delta t$. This relationship holds true for the mean values of $I_i(t_d)$ and $\Delta J_i(t_d)$. But the variance of $I_i(t_d)$ is overestimated by neglecting the stochastic term $\sqrt{\alpha I_i}\,\xi_i^{[\alpha]}$. Because of the approximation, the estimation from $\Delta J_i(t_d)$ would be more erroneous than that from $I_i(t_d)$. The estimation errors are demonstrated in Section 4.
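The conversion in Eq. (23) is a single division per data point. As a small illustration (the function name `infectious_from_new_cases` and the nested-list data layout are assumptions of this sketch), the new-case counts can be turned into approximate numbers of infectious persons as follows:

```python
def infectious_from_new_cases(delta_J, alpha_hat, dt=1.0):
    """Approximate I_i(t_d) from Delta J_i(t_d) by Eq. (23).

    delta_J is a list of rows, one per observation time, each holding the
    number of new cases at every node during the following interval.
    """
    scale = alpha_hat * dt
    return [[new_cases / scale for new_cases in row] for row in delta_J]
```

With `alpha_hat = 0.5` and `dt = 1.0`, a row of new cases `[2, 4]` is converted to `[4.0, 8.0]`. The converted values can then be passed to whatever routine evaluates the likelihood of Case 1.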

4. Experiment

4.1. Computationally synthesized dataset

A number of test datasets are synthesized by numerical integration [15] of the Langevin equation (1) for random network topologies and transmission parameters. The network is an Erdős–Rényi model specified by a combination of $N$ and the average nodal degree $\langle k_i\rangle$. The nodal degree of a node $n_i$ is given by $k_i=\sum_{j=0}^{N-1}l_{ij}$. The probability that $l_{ij}=1$ is $\langle k_i\rangle/(N-1)$.

It is postulated that the total number of persons who move from $n_i$ to $n_j$ per unit time is proportional to $\sqrt{k_ik_j}$ if a link is present. This law is known to be generally valid for the world-wide airline transportation network [11]. It is also postulated that the initial population $P_i(0)=S_i(0)+I_i(0)+R_i(0)$ of a node $n_i$ is proportional to the total number of persons who leave the node per unit time. Consequently, $\gamma_{ij}$ is determined as a function of $l$ by Eq. (24). The fraction of persons who leave a node per unit time is a constant $\gamma$ over the network. This is an additional unknown parameter in solving Eq. (19). The law in Eq. (24) is used in discovering the network topology by Eq. (19) as well as in synthesizing the datasets computationally.

$$\gamma_{ij}=\Gamma_{ij}(l)=\frac{l_{ij}\sqrt{k_ik_j}}{\sum_{j=0}^{N-1}l_{ij}\sqrt{k_ik_j}}\,\gamma. \tag{24}$$

$P_i(0)$ is given by Eq. (25). The total population is set to $P=10^6N$ in the experiment.

$$P_i(0)=\frac{\sum_{j=0}^{N-1}l_{ij}\sqrt{k_ik_j}}{\sum_{i=0}^{N-1}\sum_{j=0}^{N-1}l_{ij}\sqrt{k_ik_j}}\,P. \tag{25}$$
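The mapping from a topology to movement probabilities and initial populations can be sketched in a few lines of NumPy. The function names are hypothetical, and the link weight is written as $(k_ik_j)^{\theta}$ with a default exponent $\theta=0.5$; this exponent is an assumption of the sketch, exposed as a parameter so that other laws can be tried.

```python
import numpy as np


def link_weights(l, theta=0.5):
    """Weight l_ij * (k_i k_j)**theta of every link; theta=0.5 is assumed."""
    l = np.asarray(l, dtype=float)
    k = l.sum(axis=1)                       # nodal degrees k_i
    return l * np.outer(k, k) ** theta


def movement_matrix(l, gamma_total, theta=0.5):
    """Movement probabilities gamma_ij from the topology (cf. Eq. 24)."""
    w = link_weights(l, theta)
    row = w.sum(axis=1, keepdims=True)
    # Isolated nodes (row sum 0) keep gamma_ij = 0 instead of dividing by 0.
    share = np.divide(w, row, out=np.zeros_like(w), where=row > 0)
    return share * gamma_total


def initial_population(l, total_population, theta=0.5):
    """Initial populations P_i(0) proportional to the outgoing weight (cf. Eq. 25)."""
    w = link_weights(l, theta)
    return w.sum(axis=1) / w.sum() * total_population
```

For a three-node chain with `gamma_total = 0.1`, the middle node sends a fraction 0.05 to each neighbour and holds half of the population. With these two definitions the number of persons moving from $n_i$ to $n_j$, $P_i(0)\gamma_{ij}$, equals the number moving back, so the populations stay balanced on average.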

The estimation error of the basic reproductive ratio is defined by Eq. (26). It is a relative absolute deviation from the true value.

$$E_r=\frac{|\hat{r}-r|}{r}=\frac{|\hat{\alpha}/\hat{\beta}-\alpha/\beta|}{\alpha/\beta}. \tag{26}$$

The estimation error of the topology is defined by Eq. (27). It is the fraction of links whose presence or absence is estimated wrongly.

$$E_l=\frac{\sum_{i<j}|\hat{l}_{ij}-l_{ij}|}{N(N-1)/2}. \tag{27}$$
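Both error measures are straightforward to compute once the estimates are available. The following plain-Python sketch (function names are hypothetical) mirrors Eqs. (26) and (27), counting each unordered pair of nodes once:

```python
def reproductive_ratio_error(alpha_hat, beta_hat, alpha, beta):
    """Relative absolute deviation of r = alpha / beta, as in Eq. (26)."""
    r_hat = alpha_hat / beta_hat
    r = alpha / beta
    return abs(r_hat - r) / r


def topology_error(l_hat, l):
    """Fraction of node pairs (i < j) whose link status is wrong, as in Eq. (27)."""
    n = len(l)
    wrong = sum(abs(l_hat[i][j] - l[i][j])
                for i in range(n) for j in range(i + 1, n))
    return wrong / (n * (n - 1) / 2)
```

For instance, a three-node estimate that misjudges one of the three possible pairs has a topology error of 1/3.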

Fig. 1 illustrates an example of the network topology estimated from a computationally synthesized dataset. The graph [A] shows the dataset $I_i(t_d)$ with $\Delta t=1$ and $D=100$ when $r=2$. The drawing [B] shows the topology with $N=10$ and $\langle k_i\rangle=3$ used to synthesize the dataset. The index cases appear at $n_0$. The network includes a core sub-structure consisting of $n_0$, $n_2$, $n_4$, $n_5$, and $n_9$. It is nearly a clique, in which every node is connected to every other node: all links are present except for the one between $n_5$ and $n_9$. The drawing [C] shows the topology estimated from the dataset. The error is $E_l=0.18$. The core is discovered correctly. The method is remarkably able to distinguish the only pair of nodes in the core where the link is absent. The links from the core to $n_1$ and $n_7$ are discovered. Although the method identifies that $n_3$, $n_6$, and $n_8$ do not belong to the core but form stubs (dead ends) from the core, it fails to estimate how they are connected to each other and to the core. The number of cases is the smallest at these nodes. The movements of infectious persons to and from them are so infrequent that the analysis of them is not very reliable.

Fig. 1.

Example of the network topology estimated from a computationally synthesized dataset. [A]: dataset $I_i(t_d)$ with $\Delta t=1$ and $D=100$ when the basic reproductive ratio is $r=2$ ($\alpha=0.067$, $\beta=0.033$, $\gamma=0.1$). Individual curves represent the nodes. [B]: random network topology with $N=10$ and $\langle k_i\rangle=3$ used to synthesize the dataset in [A]. The index cases appear at $n_0$. At $t=t_{99}$, $I_2>I_4>I_0>I_5>I_9>I_7>I_1>I_6>I_8>I_3$. [C]: network topology estimated from the dataset in [A].

The estimation error $E_l$ of the method in this study is compared with those of a naive estimation and a mere random guess. The naive estimation relies on the correlation between nodes. When an infectious person moves from $n_i$ to $n_j$, $I_i$ decreases and $I_j$ increases by one simultaneously. Thus, intuitively, a negative correlation between the changes in $I_i$ and $I_j$ is the signal of the presence of a link. The correlation $\rho_{ij}$ between $n_i$ and $n_j$ is calculated by Eq. (28), where $\Delta I_i(t_d)=I_i(t_{d+1})-I_i(t_d)$.

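$$\rho_{ij}=\sum_{d=0}^{D-2}\left(\Delta I_i(t_d)-\frac{1}{N}\sum_{k=0}^{N-1}\Delta I_k(t_d)\right)\left(\Delta I_j(t_d)-\frac{1}{N}\sum_{k=0}^{N-1}\Delta I_k(t_d)\right). \tag{28}$$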

Note that $\sum_{k=0}^{N-1}\Delta I_k(t_d)/N$ is not the average over the time sequence for $n_k$, but the average over the nodes at $t=t_d$. This formula is intended to exclude the positive correlation caused by the common growing trends in $I_i$ (dependent on $(\alpha-\beta)I_i\Delta t$ in Eq. (11)). The naive estimation predicts $l_{ij}=1$ if $\rho_{ij}<0$. The random guess is the worst bound of estimation. The number of links whose presence or absence is predicted wrongly obeys a binomial distribution over the $N(N-1)/2$ pairs of nodes. Theoretically, the mean and standard deviation of $E_l$ are 0.5 and 0.0745 for $N=10$, and 0.5 and 0.0363 for $N=20$.
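The naive estimator and the random-guess baseline can be written compactly with NumPy. This is an illustrative sketch with hypothetical function names; the data are assumed to be an array of shape $(D, N)$, and the baseline assumes that each of the $N(N-1)/2$ pairs is guessed independently with probability 0.5.

```python
import numpy as np


def naive_topology(I):
    """Correlation of Eq. (28) and the rule l_ij = 1 if rho_ij < 0."""
    I = np.asarray(I, dtype=float)           # shape (D, N)
    dI = np.diff(I, axis=0)                  # Delta I_i(t_d), shape (D - 1, N)
    # Subtract the average over the nodes at each observation time.
    centered = dI - dI.mean(axis=1, keepdims=True)
    rho = centered.T @ centered              # rho[i, j], summed over d
    l = (rho < 0).astype(int)
    np.fill_diagonal(l, 0)                   # l_ii = 0 by definition
    return rho, l


def random_guess_std(n_nodes):
    """Standard deviation of E_l for independent fair guesses on every pair."""
    pairs = n_nodes * (n_nodes - 1) / 2
    return 0.5 / np.sqrt(pairs)
```

For two nodes that exchange one infectious person back and forth, `naive_topology([[10, 10], [9, 11], [10, 10]])` gives a negative off-diagonal correlation and therefore predicts a link, and `random_guess_std(10)` evaluates to about 0.0745.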

Fig. 2 shows $E_l$ for various values of the normalized average degree $\langle k_i\rangle/(N-1)$ ($\langle k_i\rangle=2,3,4$ for the number of nodes $N=10$) when $r=2$, and $I_i(t_d)$ with $\Delta t=1$ and $D=100$ is given as a dataset. For small $\langle k_i\rangle$, the naive estimation does not work at all. As $\langle k_i\rangle$ increases, $E_l$ of the method increases and that of the naive estimation decreases. For large $\langle k_i\rangle$, the naive estimation becomes less erroneous than the random guess, but it never surpasses the method in this study. The initially heterogeneous node-to-node distribution of infectious persons relaxes more quickly in networks having more links. For example, the standard deviation of $I_i(t_{99})$ is about 400 for $\langle k_i\rangle=2$ and about 300 for $\langle k_i\rangle=4$, while the mean for both values of $\langle k_i\rangle$ is about 550 ($\approx I_0(0)\exp((\alpha-\beta)D\Delta t)/N$). The growing trends also become more homogeneous. Under such homogeneity, the negative correlation directly implies the movements between nodes. The naive estimation may be a substitute if the assumption of a homogeneous distribution during the observation is well-grounded.

Fig. 2.

Estimation error $E_l$ for various values of the normalized average degree $\langle k_i\rangle/(N-1)$ ($\langle k_i\rangle=2,3,4$ for the number of nodes $N=10$) when $r=2$ ($\alpha=0.067$, $\beta=0.033$, $\gamma=0.1$), and $I_i(t_d)$ with $\Delta t=1$ and $D=100$ is given as a dataset. The initial condition is $I_0(0)=200$ and $I_i(0)=0$ for all $i\neq 0$. The individual plots show the mean and standard deviation over trials for 100 different random networks. [a]: maximal likelihood estimation (the method presented in this study). [b]: naive negative correlation estimation. [c]: mere random guess (theoretically 0.5 ± 0.0745).

Fig. 3 shows the estimation errors $E_l$ and $E_r$ for various values of the normalized average degree $\langle k_i\rangle/(N-1)$ ($\langle k_i\rangle=2,3,4$ for $N=10$ and $\langle k_i\rangle=3,6,9$ for $N=20$), $N$, and $r$ when $I_i(t_d)$ with $\Delta t=1$ and $D=100$ is given as a dataset. The findings are as follows.

Fig. 3.

Estimation errors $E_l$ and $E_r$ for various values of the normalized average degree $\langle k_i\rangle/(N-1)$ ($\langle k_i\rangle=2,3,4$ for $N=10$ and $\langle k_i\rangle=3,6,9$ for $N=20$), the number of nodes $N$, and the basic reproductive ratio $r$ when $I_i(t_d)$ with $\Delta t=1$ and $D=100$ is given as a dataset. The initial condition is $I_0(0)=200$ and $I_i(0)=0$ for all $i\neq 0$. The individual plots show the mean and standard deviation over trials for 100 different random networks. [A]: $E_l$ for $r=2$ ($\alpha=0.067$, $\beta=0.033$, $\gamma=0.1$). [B]: $E_l$ for $r=4$ ($\alpha=0.08$, $\beta=0.02$, $\gamma=0.1$). [C]: $E_l$ for $r=6$ ($\alpha=0.086$, $\beta=0.014$, $\gamma=0.1$). [D]: $E_r$ for $r=2$ (the same conditions as [A]). [E]: $E_r$ for $r=4$ (the same conditions as [B]). [F]: $E_r$ for $r=6$ (the same conditions as [C]).

  • As $\langle k_i\rangle$ increases, $E_l$ increases from around 0.2 to 0.4. Although $E_l$ (the average ± standard deviation) remains less than 0.5 within the range of the experimental conditions here, the estimation comes close to a mere random guess ($E_l\approx 0.5$) in the dense network limit (a complete graph). Discerning the presence or absence of links becomes more difficult as the spread proceeds over more links in parallel and reaches more nodes along more possible routes. On the other hand, $E_r$ does not change greatly as $\langle k_i\rangle$ changes.

  • As the network becomes larger, $E_l$ increases and $E_r$ decreases. As the model becomes more complex (the number of links $O(N^2)$ becomes larger compared to the amount of data $O(ND)$), more optimal or sub-optimal topologies may appear. Choosing the unique right answer from such similar candidates becomes more difficult. On the other hand, the central limit theorem guarantees that the fluctuation decreases as the network becomes larger, because $\alpha$ and $\beta$ are estimated from the sum of $N$ probability variables ($I(t_d)=\sum_{i=0}^{N-1}I_i(t_d)$).

  • As $r$ increases, $E_l$ increases (but the difference between $r=4$ and $r=6$ is very small) and $E_r$ increases from around 0.1 to 0.35. The observations cannot track the rapid reproduction of patients when $r\Delta t$ is large.

The dependence of the errors on $\gamma$ in Eq. (24) is investigated. The errors increase from $E_l=0.2$ and $E_r=0.092$ for $\gamma=0.1$ in [a] of Fig. 3[A] to $E_l=0.26$ and $E_r=0.10$ for $\gamma=0.2$, and to $E_l=0.32$ and $E_r=0.099$ for $\gamma=0.4$. The accuracy of estimation is limited when many persons move between nodes in both directions because of large $\gamma$. The dependence of the errors on $\Delta t$ is investigated. The errors increase from $E_l=0.2$ and $E_r=0.092$ in [a] of Fig. 3[A] to $E_l=0.31$ and $E_r=0.18$ if the observations are made four times less frequently ($\Delta t=4$, $D=25$). But they improve only slightly, to $E_l=0.18$ and $E_r=0.074$, if the observations are made four times more frequently ($\Delta t=0.25$, $D=400$). A small time interval between observations is relevant to accurate estimation. The errors are investigated for various initial population distributions $P_i(0)$. If the population is a thousandth ($P=10^3N$), $E_l=0.23$ and $E_r=0.091$ for $N=10$, $\langle k_i\rangle=2$, and $r=2$ when $I_0(0)=20$. If $P_i(0)\propto\sum_{j=0}^{N-1}l_{ij}(k_ik_j)^4$ rather than $\sum_{j=0}^{N-1}l_{ij}\sqrt{k_ik_j}$ as in Eq. (25), $E_l=0.25$ and $E_r=0.09$ when the population is $P=10^6N$. In this case, the populations range over vastly diverse scales. The ratio of the population of the most populated node to that of the least populated node is $P_{\max}(0)/P_{\min}(0)\approx 2000$, while $P_{\max}(0)/P_{\min}(0)\approx 7$ in the case of Eq. (25). $E_r$ is not affected by the distribution. $E_l$ increases when much less populated nodes are present, but it still remains small.

Fig. 4 shows the estimation errors $E_l$ and $E_r$ for various values of $\langle k_i\rangle/(N-1)$, $N$, and $r$ when $\Delta J_i(t_d)$ with $\Delta t=1$ and $D=100$ is given as a dataset. The experimental conditions are the same as those for Fig. 3. The findings are as follows.

Fig. 4.

Estimation errors $E_l$ and $E_r$ for various values of $\langle k_i\rangle/(N-1)$, $N$, and $r$ when $\Delta J_i(t_d)$ with $\Delta t=1$ and $D=100$ is given as a dataset. The experimental conditions are the same as those for Fig. 3.

  • The dependence of $E_l$ and $E_r$ on $\langle k_i\rangle$, $N$, and $r$ in Fig. 4 is similar to that in Fig. 3.

  • The absolute values of the errors tend to increase. For example, $E_l=0.31$ in Fig. 4 is much larger than $E_l=0.2$ in Fig. 3 under the same experimental conditions $\langle k_i\rangle=2$, $N=10$, and $r=2$. In contrast, the increase in $E_r$ is relatively small. $E_r$ is nearly the same when $r=6$. The deterioration in estimating the topology, therefore, seems to result from the approximation in Eq. (23).

As a summary of the experiments, the estimation is particularly reliable ($E_l\approx 0.2$ and $E_r\approx 0.1$) when $I_i(t_d)$ for a slow reproduction over a sparse network is used as an input. Such performance cannot be achieved by the naive estimation. The topology estimated from $\Delta J_i(t_d)$ is more erroneous than that from $I_i(t_d)$ by as much as 50%.

4.2. SARS dataset

SARS is a respiratory disease in humans caused by the SARS coronavirus. The epidemic of SARS appears to have started in the Guangdong Province of south China in November 2002. SARS spread from the Guangdong Province to Hong Kong in early 2003, and eventually to nearly 40 countries around the world by July. WHO archives the cumulative number of reported probable cases of SARS. The dataset in the archive was updated nearly every day from March 17. It is a time sequence dataset $J_i(t_d)$ with $\Delta t=1$ day. In this study, the target geographical regions are those where five or more cases were reported within a month from March 17. They include Canada (CAN), France (FRA), United Kingdom (GBR), Germany (GER), Hong Kong (HKG), Malaysia (MAS), Taiwan (ROC), Singapore (SIN), Thailand (THI), United States (USA), and Vietnam (VIE). Mainland China is not included because no data are available in some periods and no data outside of Guangdong Province are reported in other periods.

Fig. 5 shows the date when the first patient appeared and the propagating wavefront of the spread. It is almost certain that neither FRA nor MAS is the origin of the outbreak. But nobody can reliably tell the chain of transmission among CAN, HKG, SIN, GBR, ROC, and USA in just the two days from March 17 to 19. The wavefront is not as informative as anticipated. Such a naive gleaning verifies the obvious series of events at best.

Fig. 5.

Date when the first cases appeared and the propagating wavefront of the spread in the WHO dataset on the cases of SARS.

The estimated transmission parameters are $\hat{\alpha}=0.18$ and $\hat{\beta}=0.13$. The basic reproductive ratio is $\hat{r}=1.4$. According to field-based medical case studies, the basic reproductive ratio (except for super-spreading events [19]) was $r=2.7$ in February and went down to $r=1$ in late March in Hong Kong [20], and was $r=7$ in February and $r=1$ in early March in Singapore [13]. The decrease of $r$ is due to quarantine, hospitalization, and public awareness starting to take effect after WHO issued a world-wide alert on March 12. But the spread of SARS was still going on world-wide. A value slightly greater than 1 seems reasonable at that stage.

It is postulated that the law in Eq. (24) holds true in analyzing the SARS dataset. Among many trials, the topology which achieves the largest value of the logarithmic likelihood function is chosen. This is efficient in rejecting the local maxima to which simulated annealing may converge. The trials use different random number sequences to generate nearby candidates $l$ in Eq. (20). Fig. 6 shows the estimated topologies $\hat{l}$. The topology [A] is the most likely (the largest value of the logarithmic likelihood, $L=-9985$). The best 30 trials out of 300 converge to [A]. It includes 11 links ($\langle k_i\rangle/(N-1)=0.2$). The topology [B] is the second most likely ($L=-9998$). The next 5 trials converge to [B]. It includes 13 links ($\langle k_i\rangle/(N-1)=0.24$). The topology [C] is the third most likely ($L=-10012$). The next 29 trials converge to [C]. It includes 14 links ($\langle k_i\rangle/(N-1)=0.25$). About 20% of the trials converge to either [A] or [C].
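The multi-trial selection described above can be sketched as follows. This is a minimal illustration, not the procedure of the paper: the score function `toy_score` is a hypothetical stand-in for the logarithmic likelihood, the hidden topology is arbitrary, and the cooling schedule and move set (toggling one link) are assumptions. Only the outer pattern matters: run independent trials with different random sequences and keep the topology with the largest score.

```python
import math
import random
from itertools import combinations


def anneal(score, n_nodes, seed, n_steps=500, t0=1.0, cooling=0.99):
    """One simulated-annealing trial over undirected topologies (sets of links)."""
    rng = random.Random(seed)  # each trial uses its own random number sequence
    pairs = list(combinations(range(n_nodes), 2))
    current = frozenset(p for p in pairs if rng.random() < 0.3)
    current_score = score(current)
    best, best_score = current, current_score
    temperature = t0
    for _ in range(n_steps):
        # Nearby candidate: toggle the presence of one randomly chosen link.
        candidate = current ^ {rng.choice(pairs)}
        delta = score(candidate) - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, current_score + delta
            if current_score > best_score:
                best, best_score = current, current_score
        temperature *= cooling
    return best_score, best


# Hypothetical stand-in for the logarithmic likelihood: minus the number of
# links that differ from an arbitrary hidden topology.
HIDDEN = frozenset({(0, 1), (0, 2), (0, 3), (3, 4), (3, 5)})


def toy_score(topology):
    return -len(topology ^ HIDDEN)


# Independent trials with different seeds; keep the one with the largest score.
trials = [anneal(toy_score, n_nodes=6, seed=s) for s in range(10)]
best_score, best_topology = max(trials, key=lambda trial: trial[0])
print(best_score, sorted(best_topology))
```

With a real likelihood in place of `toy_score`, the fraction of trials that end at each topology gives a rough picture of how wide the corresponding peak is, which is how the counts for [A], [B], and [C] are read above.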

Fig. 6. Estimated topologies from the WHO dataset on the cases of SARS in Canada (CAN), France (FRA), United Kingdom (GBR), Germany (GER), Hong Kong (HKG), Malaysia (MAS), Taiwan (ROC), Singapore (SIN), Thailand (THI), United States (USA), and Vietnam (VIE) from March 17 through April 17. [A]: the most likely topology. [B]: the second most likely topology. [C]: the third most likely topology.

The sub-structures common to all of [A], [B], and [C] are a star from HKG to CAN, ROC, and SIN, another star from USA to GBR, MAS, and VIE, and a link between the centers of these stars (HKG and USA). A triangle between USA, VIE, and THI appears in [A] and [C]. Given these common sub-structures as a core of the network, the likelihood seems sensitive to the topological whereabouts of GER and FRA. This may make [B] a tall but narrow peak in the landscape of the likelihood, which simulated annealing sometimes fails to discover. In addition to these sub-structures, a few remarkable points are seen in these topologies. The nodes SIN and ROC are stubs where $k_i=1$. The role of SIN is not so relevant in spreading the disease despite the fact that the number of cases there was more than 100 in the middle of April. The nodes CAN and USA have links to distant geographical regions, and USA is a hub ($k_i$ is the largest). They are relevant intermediate spreader nodes. The links around GER are not stable among the three topologies. The number of cases in some European countries is too small to draw a reliable conclusion.
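The degree statements and the link densities quoted earlier can be reproduced from the sub-structures named in the text. The sketch below uses only the links that the text spells out (the two stars, the HKG-USA link, and the USA-VIE-THI triangle). The links around GER and FRA are not listed in the text and are therefore omitted, so this is a partial edge list and the degrees of the remaining nodes are lower bounds. The function names are chosen here for illustration.

```python
from collections import Counter

# Links common to [A], [B], and [C] as described in the text.
core_links = [
    ("HKG", "CAN"), ("HKG", "ROC"), ("HKG", "SIN"),  # star around HKG
    ("USA", "GBR"), ("USA", "MAS"), ("USA", "VIE"),  # star around USA
    ("HKG", "USA"),                                  # link between the centers
]
# Extra links that close the USA-VIE-THI triangle seen in [A] and [C].
triangle_links = [("USA", "THI"), ("VIE", "THI")]


def degrees(links):
    """Nodal degree k_i of every node that appears in an undirected link list."""
    counts = Counter()
    for u, v in links:
        counts[u] += 1
        counts[v] += 1
    return counts


def link_density(n_links, n_nodes):
    """Mean nodal degree divided by (N - 1) for an undirected network."""
    mean_degree = 2 * n_links / n_nodes
    return mean_degree / (n_nodes - 1)


k = degrees(core_links + triangle_links)
hub = max(k, key=k.get)
print(dict(k), "hub:", hub)

# Densities for the link counts of [A], [B], and [C] with N = 11 regions.
for name, n_links in (("A", 11), ("B", 13), ("C", 14)):
    print(name, round(link_density(n_links, 11), 2))
```

In this partial list SIN and ROC have degree 1 and USA has the largest degree, in line with the description above. The densities for 11, 13, and 14 links round to 0.2, 0.24, and 0.25.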

The estimated topologies are not meant to reproduce the trajectories of individual patients’ movements, but rather to demonstrate some demographical interactions within the macroscopic world-wide transportation behind the SARS outbreak. Nevertheless, the sub-structures mentioned above seem to be consistent with the following publicly known series of events on some individual patients’ microscopic movements.

  • Two of the index patients in Toronto in Canada, three of the index patients in Singapore, and another three of the index patients in the United States stayed in a hotel in Hong Kong where a Chinese nephrologist, who had treated many patients in Guangzhou and became infected, was staying in late February (footnote 2). This event implies that the links from HKG to CAN, SIN, and USA form a chain of transmission in the early growth phase of the SARS outbreak.

  • A garment manufacturer from the United States became infected during his stay in Hong Kong on the way to Hanoi in Vietnam, showed symptoms there, and was evacuated to a hospital in Hong Kong [21]. An Italian physician, who treated him at a hospital in Hanoi, showed symptoms in Bangkok in Thailand, where he was to attend a conference in early March. These events imply that interactions among HKG, USA, VIE, and THI are potentially present, which could result in another chain of transmission.

The WHO dataset is not of perfectly reliable quality. Particularly, the data on mainland China is of poor quality, and cannot be used in this study. Even the individual number of cases which was reported from the other local governments may not be accurate. The number of cases is highly fluctuating and seems noisy. Data in a city-level resolution, rather than nation-level, would be necessary for accurate estimation when large countries like USA play an important role as a spreader. It is surprising that, in spite of these limitations, the method reproduces some characteristics of the network over which SARS spread from Hong Kong to Southeast Asia and North America.

5. Conclusion

The method presented in this study solves an inverse problem to discover the effectively decisive network and reveal the transmission parameters from the observation $I_i(t_d)$ or $\Delta J_i(t_d)$ on the spread of an infectious disease. The findings with test datasets are that the estimation is particularly successful when the topology is sparse and reproduction is slow, and that the estimation from $\Delta J_i(t_d)$ is more erroneous than that from $I_i(t_d)$. The network topology discovered from a seemingly noisy dataset on the SARS outbreak reproduces some characteristic patterns of the spread from Hong Kong to Southeast Asia and North America. So far, there has been a great deal of effort in obtaining a complete picture of how an infectious disease did and will spread from the found pieces of an epidemiological jigsaw puzzle. The method presents new pieces from a viewpoint of macroscopic transportation. These pieces can be put together with the pieces found in the conventional field-based medical case studies on the individual patients’ microscopic movements.

The method can be extended to apply to more practical situations. The experimental condition in this study is an extreme case where nothing but $I_i(t_d)$ or $\Delta J_i(t_d)$ is known and no informative prior knowledge is available. If some demographical statistics on the traffic between cities or findings on the past contacts between individual patients are available, the consequent posterior distribution enables more comprehensive Bayesian inference. Another extension is to employ more complicated but realistic epidemiological compartment models. The latent period (infected but not infectious) and hospitalization are relevant for some diseases. The dependence of recovery on time ($\beta\neq$ constant) is realistic for other diseases. Strictly speaking, the time interval from infection to recovery and that from one movement to another obey appropriate probability density functions. Analytical treatment of the stochastic process with these effects tends to be considerably difficult. The estimation may rely on a numerical method such as Markov-chain Monte-Carlo sampling.

In addition to the extensions to the method, it is sometimes critical to gather a larger dataset, such as a collection of multiple independent time sequences starting from different index cases. Less erroneous estimation may then be possible even for the regions whose population is small or where the number of cases is small. Such a dataset is not available for SARS, but may be available for influenza, which spreads around the world in seasonal epidemics. Understanding the landscape of the likelihood functions is essential in identifying the requisites for a dataset, a network topology, and a mathematical model of disease transmission to make the inverse problem well-posed and stabilize the solution. This remains a challenge for the future.

Footnote 1. World Health Organization, cumulative number of reported probable cases of SARS, http://www.who.int/csr/sars/country/en/index.html (2003).

Footnote 2. SARS Expert Committee (Hong Kong), SARS in Hong Kong: from experience to action, http://www.sars-expertcom.gov.hk/english/reports/reports/reports_fullrpt.html (2003).

Appendix A. Probability density function

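A generic form of a Langevin equation for multiple time-dependent variables $x_i(t)$ is given by Eq. (29). The fluctuations $\xi_a(t)$ are stochastic terms.

$$\frac{\mathrm{d}x_i(t)}{\mathrm{d}t}=\mu_i(x_0(t),\ldots,x_{N-1}(t))+\sum_{a=0}^{M-1}\sigma_{ia}(x_0(t),\ldots,x_{N-1}(t))\,\xi_a(t). \tag{29}$$

Eq. (29) can be solved by deriving the probability density function $p(\boldsymbol{x},t)$ for probability variables $\boldsymbol{x}=(x_0,\ldots,x_{N-1})$ at time $t$. The time evolution of $p(\boldsymbol{x},t)$ is given by the Fokker–Planck equation in Eq. (30).

$$\frac{\partial p(\boldsymbol{x},t)}{\partial t}=-\sum_{i=0}^{N-1}\frac{\partial}{\partial x_i}\bigl[A_i(\boldsymbol{x})\,p(\boldsymbol{x},t)\bigr]+\frac{1}{2}\sum_{i,j=0}^{N-1}\frac{\partial^2}{\partial x_i\,\partial x_j}\bigl[B_{ij}(\boldsymbol{x})\,p(\boldsymbol{x},t)\bigr]. \tag{30}$$

The coefficients $A_i$ and $B_{ij}$ are given by Eqs. (31), (32).

$$A_i(\boldsymbol{x})=\mu_i(\boldsymbol{x}) \tag{31}$$

$$B_{ij}(\boldsymbol{x})=\sum_{a=0}^{M-1}\sigma_{ia}(\boldsymbol{x})\,\sigma_{ja}(\boldsymbol{x}). \tag{32}$$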

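The mean (the first order moment) of $x_i$ at $t$ is given by $m_i(t)=\langle x_i\rangle_t=\int x_i\,p(\boldsymbol{x},t)\,\mathrm{d}\boldsymbol{x}$. The time evolution of $m_i(t)$ is given by Eq. (33). It is derived by multiplying Eq. (30) by $x_i$ and integrating by parts, under the condition that $p$ and $\partial p/\partial\boldsymbol{x}$ decay rapidly enough relative to $A_i$ and $B_{ij}$ near the boundary of the domain of $\boldsymbol{x}$ for the boundary terms to vanish.

$$\frac{\mathrm{d}m_i(t)}{\mathrm{d}t}=\langle A_i(\boldsymbol{x})\rangle_t. \tag{33}$$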
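The integration by parts behind Eq. (33) can be written out explicitly. The steps below are a sketch that assumes all boundary terms vanish, as stated in the condition above; $\delta_{ik}$ denotes the Kronecker delta.

```latex
\begin{align*}
\frac{\mathrm{d}m_i(t)}{\mathrm{d}t}
 &= \int x_i\,\frac{\partial p(\boldsymbol{x},t)}{\partial t}\,\mathrm{d}\boldsymbol{x} \\
 &= -\sum_{k=0}^{N-1}\int x_i\,\frac{\partial}{\partial x_k}\bigl[A_k(\boldsymbol{x})\,p\bigr]\,\mathrm{d}\boldsymbol{x}
    +\frac{1}{2}\sum_{k,l=0}^{N-1}\int x_i\,\frac{\partial^2}{\partial x_k\,\partial x_l}\bigl[B_{kl}(\boldsymbol{x})\,p\bigr]\,\mathrm{d}\boldsymbol{x} \\
 &= \sum_{k=0}^{N-1}\int \delta_{ik}\,A_k(\boldsymbol{x})\,p\,\mathrm{d}\boldsymbol{x}
    -\frac{1}{2}\sum_{k,l=0}^{N-1}\int \delta_{ik}\,\frac{\partial}{\partial x_l}\bigl[B_{kl}(\boldsymbol{x})\,p\bigr]\,\mathrm{d}\boldsymbol{x} \\
 &= \langle A_i(\boldsymbol{x})\rangle_t .
\end{align*}
```

The drift term survives one integration by parts because $\partial x_i/\partial x_k=\delta_{ik}$. The diffusion term reduces to the integral of a total derivative, which is a pure boundary term and vanishes.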

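The covariance (the second order moment) between $x_i$ and $x_j$ at $t$ is given by $v_{ij}(t)=\langle x_i x_j\rangle_t-m_i(t)\,m_j(t)$. The time evolution of $v_{ij}(t)$ is given by Eq. (34). Derivation is similar to that for Eq. (33).

$$\frac{\mathrm{d}v_{ij}(t)}{\mathrm{d}t}=\langle B_{ij}(\boldsymbol{x})\rangle_t+\langle x_i A_j(\boldsymbol{x})\rangle_t+\langle A_i(\boldsymbol{x})\,x_j\rangle_t. \tag{34}$$

In the last two terms, $x_i$ and $x_j$ are to be read as deviations from the means, $x_i-m_i(t)$ and $x_j-m_j(t)$. This reading is the one that reproduces Eqs. (37) and (50).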

Higher order moments can be obtained recursively as a solution of the differential equations which include the calculated lower order moments.
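The relation between the Langevin equation and the moment equations can be checked numerically on a one-variable example. The sketch below is an illustration under stated assumptions rather than part of the method: the drift $\mu(x)=(\alpha-\beta)x$ and the noise amplitude $\sigma(x)=\sqrt{(\alpha+\beta)x}$ are chosen here for illustration, the equation is read in the Itô sense (consistent with $A_i=\mu_i$ in Eq. (31)), the Euler–Maruyama scheme is used for integration, and the state is clamped at zero. For this choice, Eqs. (33) and (34) give $\mathrm{d}m/\mathrm{d}t=(\alpha-\beta)m$ and $\mathrm{d}v/\mathrm{d}t=2(\alpha-\beta)v+(\alpha+\beta)m$, which have the same form as Eqs. (48) and (50).

```python
import math
import random

ALPHA, BETA = 0.18, 0.13   # illustrative parameter values
X0, T_END = 50.0, 10.0     # initial value and time horizon (assumptions)


def euler_maruyama(x0, drift, noise, t_end, n_steps, rng):
    """Integrate dx = drift(x) dt + noise(x) dW with the Euler-Maruyama scheme."""
    dt = t_end / n_steps
    x = x0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment over one step
        # Clamp at zero so that the square root in the noise term stays real.
        x = max(x + drift(x) * dt + noise(x) * dw, 0.0)
    return x


def sample_moments(n_paths=2000, n_steps=100, seed=1):
    """Sample mean and variance of x(T_END) over independent paths."""
    rng = random.Random(seed)
    finals = [
        euler_maruyama(
            X0,
            lambda x: (ALPHA - BETA) * x,
            lambda x: math.sqrt((ALPHA + BETA) * x),
            T_END, n_steps, rng,
        )
        for _ in range(n_paths)
    ]
    mean = sum(finals) / n_paths
    var = sum((x - mean) ** 2 for x in finals) / (n_paths - 1)
    return mean, var


def exact_moments():
    """Solutions of the moment equations for the linear example."""
    r, s = ALPHA - BETA, ALPHA + BETA
    mean = X0 * math.exp(r * T_END)
    var = X0 * s / r * (math.exp(2 * r * T_END) - math.exp(r * T_END))
    return mean, var


simulated = sample_moments()
expected = exact_moments()
print("simulated mean, variance:", simulated)
print("moment equations:        ", expected)
```

The sample moments agree with the moment equations only up to sampling error and discretization error, so the comparison is meaningful within a tolerance that shrinks as the number of paths grows.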

Appendix B. Moments of Ii and Ji

Eqs. (35) through (39) are the differential equations for the time evolution of the first and second order moments of $I_i$ and $J_i$. The symbols $m^{[I]}(t|\theta)$ and $m^{[J]}(t|\theta)$ are the row vectors whose $i$-th elements are the means of $I_i$ and $J_i$, and $v^{[II]}(t|\theta)$, $v^{[IJ]}(t|\theta)$, and $v^{[JJ]}(t|\theta)$ are the $N\times N$ matrices whose $i$-th row and $j$-th column elements are the covariances between $I_i$ and $I_j$, $I_i$ and $J_j$, and $J_i$ and $J_j$. The unknown network topology and transmission parameters are represented by a symbol $\theta=\{\gamma,\alpha,\beta\}$.

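$$\frac{\mathrm{d}m^{[I]}(t|\theta)}{\mathrm{d}t}=m^{[I]}(t|\theta)\,a^{\mathrm{T}} \tag{35}$$

$$\frac{\mathrm{d}m^{[J]}(t|\theta)}{\mathrm{d}t}=\alpha\,m^{[I]}(t|\theta) \tag{36}$$

$$\frac{\mathrm{d}v^{[II]}(t|\theta)}{\mathrm{d}t}=a\,v^{[II]}(t|\theta)+v^{[II]}(t|\theta)\,a^{\mathrm{T}}+\langle B\rangle_t \tag{37}$$

$$\frac{\mathrm{d}v^{[IJ]}(t|\theta)}{\mathrm{d}t}=a\,v^{[IJ]}(t|\theta)+\alpha\bigl(v^{[II]}(t|\theta)+c(t)\bigr) \tag{38}$$

$$\frac{\mathrm{d}v^{[JJ]}(t|\theta)}{\mathrm{d}t}=\alpha\bigl(v^{[IJ]}(t|\theta)+v^{[IJ]}(t|\theta)^{\mathrm{T}}+c(t)\bigr). \tag{39}$$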

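Definitions of the $N\times N$ matrices $a$, $B$, and $c$ which appear in Eqs. (35) through (39) are given by Eqs. (40) through (42).

$$a_{ij}=\Bigl(\alpha-\beta-\sum_{k=0}^{N-1}\gamma_{ik}\Bigr)\delta_{ij}+\gamma_{ji} \tag{40}$$

$$B_{ij}=\Bigl\{\Bigl(\alpha+\beta+\sum_{k=0}^{N-1}\gamma_{ik}\Bigr)I_i+\sum_{k=0}^{N-1}\gamma_{ki}I_k\Bigr\}\delta_{ij}-\gamma_{ij}I_i-\gamma_{ji}I_j \tag{41}$$

$$c_{ij}(t)=\delta_{ij}\,m_i^{[I]}(t|\theta). \tag{42}$$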
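The matrices in Eqs. (40) and (41) can be assembled directly from the parameters. The sketch below is a plain-Python illustration with a hypothetical three-node example; the values of $\gamma$, $\alpha$, $\beta$, and the infected populations are made up, and the diagonal of $\gamma$ is assumed to be zero. Two consistency checks follow from the definitions: every column of $a$ sums to $\alpha-\beta$, and all elements of $B$ sum to $(\alpha+\beta)\sum_i I_i$, because movement between nodes conserves the total number of infected persons.

```python
def build_a(gamma, alpha, beta):
    """Eq. (40): a_ij = (alpha - beta - sum_k gamma_ik) delta_ij + gamma_ji."""
    n = len(gamma)
    a = [[gamma[j][i] for j in range(n)] for i in range(n)]
    for i in range(n):
        a[i][i] += alpha - beta - sum(gamma[i])
    return a


def build_b(gamma, alpha, beta, infected):
    """Eq. (41), evaluated at a given vector of infected populations I."""
    n = len(gamma)
    b = [[-gamma[i][j] * infected[i] - gamma[j][i] * infected[j] for j in range(n)]
         for i in range(n)]
    for i in range(n):
        inflow = sum(gamma[k][i] * infected[k] for k in range(n))
        b[i][i] += (alpha + beta + sum(gamma[i])) * infected[i] + inflow
    return b


# Hypothetical example: gamma[i][j] is the parameter of movement from node i to j.
gamma = [[0.00, 0.02, 0.00],
         [0.01, 0.00, 0.03],
         [0.00, 0.02, 0.00]]
alpha, beta = 0.18, 0.13
infected = [10.0, 4.0, 1.0]

a = build_a(gamma, alpha, beta)
b = build_b(gamma, alpha, beta, infected)

column_sums = [sum(a[i][j] for i in range(3)) for j in range(3)]
total_b = sum(sum(row) for row in b)
print("column sums of a:", column_sums)
print("sum of B:", total_b, "expected:", (alpha + beta) * sum(infected))
```

The column-sum property means that the total mean number of infected persons grows at the rate $\alpha-\beta$ regardless of the topology, which is the single-population result of Appendix C.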

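Eqs. (43) through (47) are the solutions. $E$ is a unit matrix.

$$m^{[I]}(t|\theta)=I(0)\exp(a^{\mathrm{T}}t) \tag{43}$$

$$m^{[J]}(t|\theta)=I(0)\bigl\{\alpha(a^{\mathrm{T}})^{-1}\exp(a^{\mathrm{T}}t)-\alpha(a^{\mathrm{T}})^{-1}+E\bigr\} \tag{44}$$

$$v^{[II]}(t|\theta)=\int_0^t\exp\bigl(a(t-t')\bigr)\,\langle B\rangle_{t'}\,\exp\bigl(a^{\mathrm{T}}(t-t')\bigr)\,\mathrm{d}t' \tag{45}$$

$$v^{[IJ]}(t|\theta)=\int_0^t\alpha\exp\bigl(a(t-t')\bigr)\bigl(v^{[II]}(t'|\theta)+c(t')\bigr)\,\mathrm{d}t' \tag{46}$$

$$\begin{aligned}
v^{[JJ]}(t|\theta)&=\int_0^t\alpha\bigl(v^{[IJ]}(t'|\theta)+v^{[IJ]}(t'|\theta)^{\mathrm{T}}+c(t')\bigr)\,\mathrm{d}t'\\
&=\int_0^t\Bigl[\alpha^2\int_0^{t'}\Bigl\{\exp\bigl(a(t'-t'')\bigr)\bigl(v^{[II]}(t''|\theta)+c(t'')\bigr)+\bigl(v^{[II]}(t''|\theta)+c(t'')\bigr)\exp\bigl(a^{\mathrm{T}}(t'-t'')\bigr)\Bigr\}\,\mathrm{d}t''+\alpha\,c(t')\Bigr]\,\mathrm{d}t'.
\end{aligned} \tag{47}$$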
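Eq. (43) can be evaluated numerically with a matrix exponential. The sketch below is a plain-Python illustration on a hypothetical three-node network; the parameter values are made up, the diagonal of $\gamma$ is assumed to be zero, and the matrix exponential is computed by scaling and squaring with a truncated Taylor series, which is adequate for small matrices but is not a substitute for a numerical library. Since every column of $a$ sums to $\alpha-\beta$, the elements of $m^{[I]}(t|\theta)$ must sum to $\sum_i I_i(0)\exp((\alpha-\beta)t)$, which serves as a check.

```python
import math


def build_a(gamma, alpha, beta):
    """Eq. (40): a_ij = (alpha - beta - sum_k gamma_ik) delta_ij + gamma_ji."""
    n = len(gamma)
    a = [[gamma[j][i] for j in range(n)] for i in range(n)]
    for i in range(n):
        a[i][i] += alpha - beta - sum(gamma[i])
    return a


def mat_mul(x, y):
    """Product of two square matrices stored as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, terms=30, squarings=10):
    """exp(m) by scaling and squaring with a truncated Taylor series."""
    n = len(m)
    scaled = [[v / 2 ** squarings for v in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def mean_infected(i0, a, t):
    """Eq. (43): row vector I(0) multiplied by exp(a^T t)."""
    n = len(a)
    a_transposed_t = [[a[j][i] * t for j in range(n)] for i in range(n)]
    e = mat_exp(a_transposed_t)
    return [sum(i0[j] * e[j][i] for j in range(n)) for i in range(n)]


# Hypothetical example values.
gamma = [[0.00, 0.02, 0.00],
         [0.01, 0.00, 0.03],
         [0.00, 0.02, 0.00]]
alpha, beta = 0.18, 0.13
i0 = [10.0, 0.0, 0.0]  # the outbreak starts in node 0

a = build_a(gamma, alpha, beta)
m_30 = mean_infected(i0, a, 30.0)
print("mean infected at t = 30:", m_30)
print("total:", sum(m_30), "expected:", sum(i0) * math.exp((alpha - beta) * 30.0))
```

The same helper can be reused for Eq. (44), given a routine for the inverse of $a^{\mathrm{T}}$, and the integrals in Eqs. (45) through (47) can be approximated by numerical quadrature over $t'$.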

Appendix C. Moments of I and J

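Eqs. (48) through (52) are the differential equations for the time evolution of the first and second order moments of $I$ and $J$. The symbols $m^{[I]}(t|\theta)$ and $m^{[J]}(t|\theta)$ are the means of $I$ and $J$, and $v^{[II]}(t|\theta)$, $v^{[IJ]}(t|\theta)$, and $v^{[JJ]}(t|\theta)$ are the variance of $I$, the covariance between $I$ and $J$, and the variance of $J$.

$$\frac{\mathrm{d}m^{[I]}(t|\theta)}{\mathrm{d}t}=(\alpha-\beta)\,m^{[I]}(t|\theta) \tag{48}$$

$$\frac{\mathrm{d}m^{[J]}(t|\theta)}{\mathrm{d}t}=\alpha\,m^{[I]}(t|\theta) \tag{49}$$

$$\frac{\mathrm{d}v^{[II]}(t|\theta)}{\mathrm{d}t}=2(\alpha-\beta)\,v^{[II]}(t|\theta)+(\alpha+\beta)\,m^{[I]}(t|\theta) \tag{50}$$

$$\frac{\mathrm{d}v^{[IJ]}(t|\theta)}{\mathrm{d}t}=(\alpha-\beta)\,v^{[IJ]}(t|\theta)+\alpha\bigl(v^{[II]}(t|\theta)+m^{[I]}(t|\theta)\bigr) \tag{51}$$

$$\frac{\mathrm{d}v^{[JJ]}(t|\theta)}{\mathrm{d}t}=\alpha\bigl(2v^{[IJ]}(t|\theta)+m^{[I]}(t|\theta)\bigr). \tag{52}$$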

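Eqs. (53) through (57) are the solutions.

$$m^{[I]}(t|\theta)=I(0)\exp\bigl((\alpha-\beta)t\bigr) \tag{53}$$

$$m^{[J]}(t|\theta)=I(0)\Bigl(\frac{\alpha}{\alpha-\beta}\exp\bigl((\alpha-\beta)t\bigr)-\frac{\beta}{\alpha-\beta}\Bigr) \tag{54}$$

$$v^{[II]}(t|\theta)=I(0)\,\frac{\alpha+\beta}{\alpha-\beta}\Bigl(\exp\bigl(2(\alpha-\beta)t\bigr)-\exp\bigl((\alpha-\beta)t\bigr)\Bigr) \tag{55}$$

$$v^{[IJ]}(t|\theta)=I(0)\Bigl\{\frac{\alpha(\alpha+\beta)}{(\alpha-\beta)^2}\exp\bigl(2(\alpha-\beta)t\bigr)-\Bigl(\frac{\alpha(\alpha+\beta)}{(\alpha-\beta)^2}+\frac{2\alpha\beta}{\alpha-\beta}\,t\Bigr)\exp\bigl((\alpha-\beta)t\bigr)\Bigr\} \tag{56}$$

$$v^{[JJ]}(t|\theta)=I(0)\Bigl[\frac{\alpha^2(\alpha+\beta)}{(\alpha-\beta)^3}\exp\bigl(2(\alpha-\beta)t\bigr)-\Bigl\{\frac{\alpha(\alpha+\beta)}{(\alpha-\beta)^2}+\frac{4\alpha^2\beta}{(\alpha-\beta)^2}\,t\Bigr\}\exp\bigl((\alpha-\beta)t\bigr)-\frac{\alpha\beta(\alpha+\beta)}{(\alpha-\beta)^3}\Bigr]. \tag{57}$$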
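The closed forms in Eqs. (53) through (57) can be checked against a direct numerical integration of Eqs. (48) through (52). The sketch below uses a classical fourth-order Runge–Kutta integrator written in plain Python. The initial conditions $m^{[I]}(0)=m^{[J]}(0)=I(0)$ with all second order moments equal to zero are the ones implied by the closed forms at $t=0$; the parameter values and function names are illustrative assumptions.

```python
import math


def closed_form(t, i0, alpha, beta):
    """Eqs. (53)-(57): [m_I, m_J, v_II, v_IJ, v_JJ] at time t."""
    r, s = alpha - beta, alpha + beta
    e1, e2 = math.exp(r * t), math.exp(2 * r * t)
    m_i = i0 * e1
    m_j = i0 * (alpha / r * e1 - beta / r)
    v_ii = i0 * s / r * (e2 - e1)
    v_ij = i0 * (alpha * s / r ** 2 * e2
                 - (alpha * s / r ** 2 + 2 * alpha * beta / r * t) * e1)
    v_jj = i0 * (alpha ** 2 * s / r ** 3 * e2
                 - (alpha * s / r ** 2 + 4 * alpha ** 2 * beta / r ** 2 * t) * e1
                 - alpha * beta * s / r ** 3)
    return [m_i, m_j, v_ii, v_ij, v_jj]


def rhs(y, alpha, beta):
    """Right-hand sides of Eqs. (48)-(52)."""
    m_i, m_j, v_ii, v_ij, v_jj = y
    r = alpha - beta
    return [
        r * m_i,
        alpha * m_i,
        2 * r * v_ii + (alpha + beta) * m_i,
        r * v_ij + alpha * (v_ii + m_i),
        alpha * (2 * v_ij + m_i),
    ]


def rk4(y0, t_end, n_steps, alpha, beta):
    """Classical fourth-order Runge-Kutta integration from 0 to t_end."""
    h = t_end / n_steps
    y = list(y0)
    for _ in range(n_steps):
        k1 = rhs(y, alpha, beta)
        k2 = rhs([a + 0.5 * h * b for a, b in zip(y, k1)], alpha, beta)
        k3 = rhs([a + 0.5 * h * b for a, b in zip(y, k2)], alpha, beta)
        k4 = rhs([a + h * b for a, b in zip(y, k3)], alpha, beta)
        y = [a + h / 6.0 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
    return y


alpha, beta, i0, t_end = 0.18, 0.13, 5.0, 30.0
numerical = rk4([i0, i0, 0.0, 0.0, 0.0], t_end, 1000, alpha, beta)
analytical = closed_form(t_end, i0, alpha, beta)
for name, x, y in zip(["m_I", "m_J", "v_II", "v_IJ", "v_JJ"], numerical, analytical):
    print(f"{name}: numerical {x:.6f}, closed form {y:.6f}")
```

Agreement between the two columns for several values of $t$, $\alpha$, and $\beta$ is a convenient safeguard against sign errors when the closed forms are transcribed into code.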

References

  • 1. Walker D.M., Allingham D., Lee H.W.J., Small M. Parameter inference in small world network disease models with approximate Bayesian computational methods. Physica A. 2010;389:540–548. doi: 10.1016/j.physa.2009.09.053.
  • 2. Small M., Tse C.K., Walker D.M. Super-spreaders and the rate of transmission of the SARS virus. Physica D. 2006;215:146–158. doi: 10.1016/j.physd.2006.01.021.
  • 3. Keeling M.J., Brooks S.P., Gilligan C.A. Using conservation of pattern to estimate spatial parameters from a single snapshot. Proceedings of the National Academy of Sciences USA. 2004;101:9155–9160. doi: 10.1073/pnas.0400335101.
  • 4. Dangerfield C.E., Ross J.V., Keeling M.J. Integrating stochasticity and network structure into an epidemic model. Journal of the Royal Society Interface. 2009. doi: 10.1098/rsif.2008.0410.
  • 5. Riley S. Large-scale spatial-transmission models of infectious disease. Science. 2007;316:1298–1301. doi: 10.1126/science.1134695.
  • 6. Maeno Y. Node discovery problem for a social network. Connections. 2009;29:62–76.
  • 7. Rabbat M.G., Figueiredo M.A.T., Nowak R.D. Network inference from co-occurrences. IEEE Transactions on Information Theory. 2008;54:4053–4068.
  • 8. Baronchelli A., Catanzaro M., Pastor-Satorras R. Bosonic reaction–diffusion processes on scale-free networks. Physical Review E. 2008;78:016111. doi: 10.1103/PhysRevE.78.016111.
  • 9. Simões M., da Gama M.M.T., Nunes A. Stochastic fluctuations in epidemics on networks. Journal of the Royal Society Interface. 2008;5:555–566. doi: 10.1098/rsif.2007.1206.
  • 10. Colizza V., Vespignani A. Invasion threshold in heterogeneous meta-population networks. Physical Review Letters. 2007;99:148701. doi: 10.1103/PhysRevLett.99.148701.
  • 11. Barrat A., Barthélemy M., Pastor-Satorras R., Vespignani A. The architecture of complex weighted networks. Proceedings of the National Academy of Sciences USA. 2004;101:3747–3752. doi: 10.1073/pnas.0400087101.
  • 12. Keeling M.J., Ross J.V. On methods for studying stochastic disease dynamics. Journal of the Royal Society Interface. 2008;5:171–181. doi: 10.1098/rsif.2007.1106.
  • 13. Lipsitch M. Transmission dynamics and control of severe acute respiratory syndrome. Science. 2003;300:1966–1970. doi: 10.1126/science.1086616.
  • 14. Hufnagel L., Brockmann D., Geisel T. Forecast and control of epidemics in a globalized world. Proceedings of the National Academy of Sciences USA. 2004;101:15124–15129. doi: 10.1073/pnas.0308344101.
  • 15. Kloeden P.E., Platen E. Numerical Solution of Stochastic Differential Equations. Springer; 1992.
  • 16. Colizza V., Barrat A., Barthélemy M., Vespignani A. The role of the airline transportation network in the prediction and predictability of global epidemics. Proceedings of the National Academy of Sciences USA. 2006;103:2015–2020. doi: 10.1073/pnas.0510525103.
  • 17. van Kampen N.G. Stochastic Processes in Physics and Chemistry. Elsevier; 2007.
  • 18. Press W.H., Teukolsky S.A., Vetterling W.T., Flannery B.P. Numerical Recipes: The Art of Scientific Computing. Cambridge University Press; 2007.
  • 19. Fujie R., Odagaki T. Effects of superspreaders in spread of epidemic. Physica A. 2007;374:843–852. doi: 10.1016/j.physa.2006.08.050.
  • 20. Riley S. Transmission dynamics of the etiological agent of SARS in Hong Kong: impact of public health interventions. Science. 2003;300:1961–1966. doi: 10.1126/science.1086478.
  • 21. Greenfeld K.T. China Syndrome. HarperCollins Publishers; 2006.

Articles from Physica A are provided here courtesy of Elsevier
