Infectious disease transmission as a forensic problem: who infected whom?

Peter Teunis; Janneke C M Heijne; Faizel Sukhrie; Jan van Eijkeren; Marion Koopmans; Mirjam Kretzschmar

doi:10.1098/rsif.2012.0955

. 2013 Apr 6;10(81):20120955. doi: 10.1098/rsif.2012.0955

Infectious disease transmission as a forensic problem: who infected whom?

Peter Teunis ^1,^2,^✉, Janneke C M Heijne ³, Faizel Sukhrie ^4,⁵, Jan van Eijkeren ⁶, Marion Koopmans ^4,⁵, Mirjam Kretzschmar ^1,⁷

PMCID: PMC3627102 PMID: 23389896

Abstract

Observations on infectious diseases often consist of a sample of cases, distinguished by symptoms, and other characteristics, such as onset dates, spatial locations, genetic sequence of the pathogen and/or physiological and clinical data. Cases are often clustered, in space and time, suggesting that they are connected. By defining kernel functions for pairwise analysis of cases, a matrix of transmission probabilities can be estimated. We set up a Bayesian framework to integrate various sources of information to estimate the transmission network. The method is illustrated by analysing data from a multi-year study (2002–2007) of nosocomial outbreaks of norovirus in a large university hospital in the Netherlands. The study included 264 cases, the norovirus genotype was known in approximately 60 per cent of the patients. Combining all the available data allowed likely identification of individual transmission links between most of the cases (72%). This illustrates that the proposed method can be used to accurately reconstruct transmission networks, enhancing our understanding of outbreak dynamics and possibly leading to new insights into how to prevent outbreaks.

Keywords: infectious disease, transmission, networks, biostatistics

1. Introduction

Studies of infectious disease transmission are often based on simulations, for instance of the possible outcomes when an infectious subject is released into a susceptible population. Careful consideration of the probabilities of any possible outcome then may be used for statistical inference on important parameters, for transmissibility and mixing behaviour [1–4]. When there is knowledge about contacts between individuals, simulations may incorporate potential paths of the infection in a given network [5–8]. A complication is that transmission of infection depends not only on who is connected to whom, but also on when (in which order) these contacts occurred [9–11]. For any but the smallest populations with simple structure, a forward projecting approach is problematic, because of the many different outcomes that are possible.

However, in a given population of cases as may be observed in an outbreak of infectious disease, reversing time makes the problem a lot simpler: any infected subject can have been caused by only one infectious ancestor [12–15]. Exploiting this notion, our goal is to track possible links between cases of infectious disease, identifying who may have been infected by whom. In order to do so, we want to use all relevant information between each pair of cases. Information about a possible causal link between a pair of cases may be found in the times of their occurrence, their spatial locations, in their serological status or possible differences in genetic sequences of the infecting pathogen. Information from each of these sources can be interpreted as a measure of the disparity between pairs of cases, similar properties indicating proximity. The greater the dissimilarity, the greater the distance between cases, in terms of time, location, serum antibody titre or genetic sequence of the pathogen.

When an outbreak is studied, various categories of data are collected, but usually some information is missing. Records from any individual patient may miss one or more items, and few records may be complete. The approach presented here combines various incomplete data sources to estimate who infected whom as accurately as possible.

To outline the approach used here, we will start by defining a transmission matrix of the probabilities that transmission occurred between any pair of cases in the affected population. Then we will show how to formulate a likelihood function, allowing estimation of the elements of this matrix. The likelihood contains a pairwise kernel function dependent on the available data mentioned above (time, location, … ) and we will show how these various data sources may be incorporated. Finally, use of the estimated transmission matrix to estimate reproduction numbers, possible transmission networks, and a new identifiability metric to evaluate identification of ancestors will be given.

A practical application is demonstrated by analysis of a dataset on nosocomial transmission of several genotypes of norovirus, illustrating how to deal with missing data in a novel manner.

2. Analysis of the transmission network

Because the probabilities of transmission between any two persons change in time during an outbreak, depending on the infection status of the population, any given contact matrix (identifying contact frequencies between subjects) cannot be directly translated into a transmission network. Observed numbers of cases during an outbreak do inform us about transmission that must have occurred, given a certain (unobserved) susceptible status of the population. Conditional on that unobserved status, inference can be made on the transmission path of the infection, as described below.

2.1. Transmission probability matrix

Given a number n of observed infected cases, a matrix V of transmission probabilities can be defined

2.1

with element v_i,j the probability that subject i acquired its infection from subject j and v_i the vector of transmission probabilities linking case i to any other case.

The following properties of the transmission probability matrix V can be identified immediately;

— any diagonal element v_i,j = 1, if and only if subject i is a founder. All other
2.2
because subjects cannot infect themselves; and
— if all cases have been found (the complete outbreak has been observed), rows must add to 1
2.3
any subject must have acquired their infection from exactly one other subject in the observed infected population. In case of an incompletely observed outbreak, e.g. asymptomatic cases contributing to transmission, this latter condition may have to be relaxed.

By characterizing links through their parent nodes, rows can be estimated independently: the probability that subject i obtained its infection from subject j is independent of the probability that subject k obtained its infection from subject j.

Note that the elements v_i,j are probabilities of transmission that do not directly identify links between subjects. A transmission link for subject i may be thought of as a random sample from a categorical distribution with probability vector v_i.

2.2. Kernel likelihood

Suppose observations of a distance measure X_i,j can be made on pairs of subjects (i,j). These distances may be serial intervals for symptom onset, or spatial distance, or any other quantifiable measure that may indicate a causal link between two subjects.

Suppose κ_i,j(x_i,j|i ← j,θ) is the conditional probability (density) of x_i,j given that subject i was infected by j (designated as i ← j). If κ_i,j() is a parametric function, it may depend on parameters θ, these will be omitted unless they are needed. The contribution of the pair (i,j) to the likelihood then is

2.4

Note that if subject i is known not to have had contact with subject j the probability of transmission between them is v_i,j = 0. Omitting constant (normalizing) terms, the dot product

2.5

can be interpreted as a likelihood function, with probabilities of transmission v_i from any possible ancestors to case i, given the observed data X_i.

Over all observed subjects, the likelihood becomes

2.6

2.3. Estimating V

Given a set of observed distances X between pairs of cases, the transmission probability matrix V can be estimated, using the likelihood function in equation (2.6).

As any row i of V represents the vector of probabilities that case i received their infection from any other case, these row vectors may be estimated independently of one another.

Finding the maximum likelihood link to case i is trivial: this is a vector with v_i,k = 1 at the position (k) of the largest element of κ_i and 0 elsewhere (see the electronic supplementary material, appendix D). If all elements in κ_i are close to 0 and only κ_i,k ≈ 1 this is sensible; however, if all elements in κ_i would be approximately equal with only one (at position k) slightly larger than the rest, the maximum likelihood vector would also have v_i,k = 1 and 0 elsewhere (see the electronic supplementary material, appendix D), but we would be less certain of who was the ancestor of case i, than in the former case.

To obtain information about the uncertainty in the transmission probabilities v_i,j they can be estimated in a Markov chain Monte Carlo (MCMC) procedure. Using a Metropolis–Hastings algorithm [16] we start with an initial matrix V with arbitrary elements (for instance all elements in any row not known to be zero set to the same value). The likelihood function (equation (2.6)) is then used to calculate the likelihood of that initial position in the Markov chain. In case prior distributions are specified, a posterior probability (density) may be calculated.

In order to allow use of a normal candidate distribution, the v_i,j are first logit transformed,

2.7

and candidates z′_i,j = z_i,j + ε are calculated (ε ∼ N(0,σ_c)), and the new v′_i,j are obtained by reverse logit transform and scaling so that rows add to 1

2.8

The likelihood for the candidate matrix is then calculated (or its posterior), and the candidate is accepted, if ratio of the likelihoods (or posteriors) for V^′ and V is greater than a uniform (0,1) random variate [16].

Note that here the kernel function κ_i,j(X_i,j|i ← j,θ) has been assumed known, that is: V is estimated, given the parameter set θ. Ultimately, we need to infer both V and θ (see the electronic supplementary material, appendix E).

3. Linking kernel function

Information to identify pairs of cases as links in a transmission chain may be related to the infection, such as the time between symptom onsets or the (genetic sequence of the) pathogen involved. Other information may relate to contact patterns that exist independent of the outbreak, such as spatial location of the subjects or contact frequencies between subjects (though these latter may be modified by disease).

Observations may take the form of a set of numbers representing distance in some trait space, where distance is associated with a probability of transmission. The matrix of κ_i,j(X_i,j|θ) then may have elements consisting of parametric probability densities with some common set of parameters θ.

Often, such distance measurements are not available, but there may be knowledge of the probabilities of links between subjects that can be quantified. For instance, two subjects may or may not share the same peer group, implying frequent or less frequent contacts, or two subjects may be infected with genetically distinct isolates of the same pathogen sufficiently different to exclude mutation. Such data can be incorporated by building a matrix of fixed elements κ_i,j. In the following, we specify different types of distance measures.

3.1. Serial interval

In the previous studies, the likelihood of a transmission link between two cases has been calculated by way of the serial interval distribution [12]. The serial interval is defined as the time between successive clinical cases [17].

When the hazard function for onset of illness in a secondary case, given the interval τ between onsets of illness symptoms is h(τ|θ) with model parameter vector θ, and we would know the onsets T_i, any two cases (i,j) separated by a serial interval T_i−T_j would be linked [18] by a probability

3.1

When recorded times are interval censored, for instance in days, the kernel function must be slightly modified (electronic supplementary material, appendix A).

In practice there are two additional constraints that may save computational effort;

— negative serial intervals are excluded. When infected subjects become infectious long before they start having symptoms, the incubation period in one subject can be shorter than the interval between onset of infectiousness and onset of symptoms in its ancestor [17], resulting in a negative serial interval (figure 1). For most diseases this is not likely and for convenience it is assumed that any element with T_j > T_i must have a kernel likelihood zero. If needed, this assumption may be easily dropped; and
— when the serial interval between two cases exceeds the maximum serial interval (say, some high percentile of its distribution), a link is highly unlikely and the corresponding kernel likelihood may also be set to zero.

Figure 1. — The serial interval between successive cases in a transmission chain: the generation interval T_g(*i,j*); the serial interval for onset of infectiousness ΔT_inf(*i,j*) and the serial interval for symptom onset ΔT_symp(*i,j*). When the incubation period (from infection to symptom onset) is long, negative serial intervals for symptom onset may be found.

As a consequence, when the observed symptom onsets {T₁, T₂, … ,T_n} are an ordered list of increasing times, the resulting likelihood matrix is only non-zero in a region below and close to the diagonal depending on the maximum length of the serial interval (see the electronic supplementary material, appendix figure A2).

3.2. Proximity of cases

The distances between cases can be expressed as geometrical distances, calculated from known geographical locations, but alternatively, some more or less precise proximity measure may be given. When the locations of cases (e.g. home or work) are given such continuous distance measures may be evaluated in a contact kernel, similar to the generation interval kernel above [19,20].

However, proximity of any pair of cases may often be more appropriately characterized by a measure of social distance, reflecting the probability of contact between two cases. For instance, membership of a peer group increases the probability of contact and not being part of that group decreases the probability of contact. The probabilities of any pair of cases sharing contacts then define a (social) proximity kernel

3.2

where the two parameters a₀ and a₁ may be probabilities or rates of contacts, the ratio of their magnitudes representing the ratio of contact probabilities between subjects.

3.3. Genetic similarity

When pathogen samples have been taken from patients, genetic sequence information may be available. Such sequences are commonly collected for constructing the genetic lineage of the pathogen. However, the genetic similarity between any pair of isolates from the same cluster of infectious disease cases may also be used as a measure of the distance between the cases the samples were obtained from. Genetic similarity between pathogens from a pair of cases i and j can be expressed as a genetic kernel coefficient G_i,j weighting links between observed cases [21]. Some methods for characterizing genetic distance discriminate between G_i,j and G_j,i, others not.

Often simpler, categorical information is available. Suppose a limited number of different genotypes of a pathogen may be found, that are sufficiently different so that one type cannot transform into another in a few generations, within an outbreak. Then a cluster of cases with mixed genotypes must consist of a mixture of transmission networks, one for each genotype. Any case may shed pathogens of any one of k = (1, … ,K) different genotypes (assuming mixed infections do not occur). Pathogen type can be represented by a vector c_i of length K. When case i sheds pathogen type k then all elements of c_i are zero, except the element at position k which is 1. The genetic kernel function

3.3

now leads to a matrix with elements 0 or 1, for cases with either the same or different genotypes.

When the genotype of the pathogen has not been observed in all cases, the abundance of observed genotypes may be used to assign a probability that a case has any one of the set of possible types. The distribution of the abundance of each genotype may be assumed multi-nomial with probability vector

3.4

with element p_k as the probability that any unknown specimen from the outbreak has genotype k and the sum of all p_k adding up to 1. Assuming equal detectability of genotypes, these abundances (the probability that any detected virus is of a particular genotype) may be obtained from the observed frequencies in the same outbreak.

Genotypes may be assigned to the cases with missing genotype by drawing a random sample of the abundance distribution. For any single case, this is a sample from a categorical distribution (multi-nomial with n = 1). For any specific case i, the different possible genotypes produce different optimum (maximum likelihood) estimates of its links to other cases. Assigning a different genotype to a case where that information is missing is equivalent to assigning that case to a different transmission sub-network.

3.4. Combining different linking kernel functions

Analogously, other types of information on distances between cases may be formulated in terms of kernel functions. The information from onset times, locations and/or genetic similarity may be combined to produce a likelihood contribution for each possible pair of observed cases. In the simplest case of independent contributions, the elements Inline graphic of the likelihood matrix then are found by element-wise multiplication

3.5

assuming absence of correlation between types of distances (time, location and genetic sequence). The superindexes denote kernels based on time (T), spatial or social (S) and genetic (G) data, respectively.

4. Analysis of the estimated transmission probability matrix

The transmission probability matrix provides information about the transmission network of an outbreak on several levels: distribution of reproduction numbers and other network metrics, distribution of possible transmission trees, and identifiability of specific links between cases. The set of V matrices obtained by MCMC sampling may be analysed for studying these properties of the transmission network.

4.1. Reproduction number

Row i of the transmission matrix shows a vector of the probabilities of case i to have acquired their infection from any other case. Conversely, column j of this same matrix is a vector of the probabilities of case j to have transmitted their infection to any other case. Their sum

4.1

can be interpreted as the expectation of the number of cases produced by infectious case j, or its reproduction number. Thus, we can use the transmission matrix to calculate reproduction numbers for any individual case in the outbreak. In network terminology, R_j is an estimate of the outdegree of node j [22].

The posterior sample of transmission probability matrices (obtained from the Markov chain) can be considered a sample from the joint posterior distribution of transmission probabilities for all cases. If an element from the iteration with rank number m = 1, … ,M is v_i,j^(m), then the sample is

4.2

Quantiles from such a sample of R_j may be used to calculate marginal posterior intervals of the reproduction numbers. Using the transmission probability matrix, other network metrics can also be calculated.

4.2. Transmission tree

As mentioned above (§2.1), links between case i and any other case can be assumed realizations of a random sample from a categorial distribution with probability vector v_i,j. Such sampling reduces each row of V to a binary vector with exactly one element 1 and the rest zero. This reduced matrix may be interpreted as an adjacency matrix, equivalent to a directed graph representing a transmission tree of the outbreak.

Note that in order for V to be translated into a proper transmission network, cycles are not admissible. If any element v_i,j ≠ 0, then v_j,i = 0: subjects cannot mutually infect each other. Any larger cycles (i.e. loops involving more than two subjects) also cannot occur. By sorting all cases in order of their onsets of symptoms and preventing negative serial intervals (and treating degenerate cases as in the electronic supplementary material, appendix B), cycles are prevented from occurring.

4.3. Identifiability of links

Any row i of the transmission probability matrix V represents a vector of probabilities that a subject i got their infection from any of the other cases in the outbreak. If that vector has a single non-zero element (that then must be 1, if this is a closed outbreak), then the person who infected subject i can be identified uniquely. In reality, this will rarely happen: several elements of the vector at row i can be non-zero, pointing to several other subjects as possible infectors of subject i.

To study the ambiguity regarding who infected subject i, a crude measure might be the number of non-zero elements in row i of the transmission matrix. However, it is not likely that all of these possible candidates were assigned the same probability of having infected subject i.

The probability that i was infected by another infectious subject j and not by anyone else is v_i,j. Interpreting infection of i by j as a Bernoulli trial with variance

4.3

may be used as a measure of how accurately j may be identified as the person who infected i. The sum

4.4

then is a measure of how accurately any of the subjects in the observed population may be identified as having infected subject i.

The quantity 1−q(i) may be used as a score of the degree of identification of the infection parent of subject i. When there is only one non-zero element in row i of the transmission matrix, the parent node is perfectly known, and 1−q(i) = 1 indicating perfect identification. When row i has n_i non-zero terms and they all have the same value 1/n_i then

4.5

is at a minimum. This would represent complete ignorance about who infected subject i, among those for whom a link should be possible (i.e. those with non-zero v_i,j).

5. An application: nosocomial transmission of norovirus

Norovirus is an enteric pathogen causing a considerable proportion of outbreaks of enteric illness worldwide [23], in particular in hospitals and nursing homes [24]. Norovirus can be easily transmitted by infectious cases [13,14]. Various different genotypes of the infectious pathogen may be isolated from patients in norovirus outbreaks. Such diversity may result from exposure to a common source containing a mixture of virus genotypes [25]. However, in outbreaks with person-to-person transmission [26], such a cluster must be considered a mixture of as many outbreaks as there are identified genotypes (§3.3).

This analysis uses previously published data from norovirus outbreaks that occurred during a 5 year period (19 December 2002–31 December 2007) in a university hospital in the Netherlands [27]. In short, from 264 cases in total, 160 provided samples that were analysed (39.4% missing). Data included only cases with symptoms of acute gastroenteritis (vomiting, diarrhea), as date of symptom onset. The hospital department where any of these cases occurred was also known. Assuming that cases from different departments had a decreased probability of transmission owing to the spatial separation, we set a fixed weight factor for transmission between cases in the same department (0.9) and between cases from different departments (0.1), similar to previously reported analyses [13,14].

It is a rare occasion that all cases in an outbreak are sampled for shedded pathogens. For that reason, when multiple genotypes have been identified in part of the identified cases but a fraction (with similar symptoms) lacks samples and remains untyped, these untyped cases may be assigned to any of the networks describing transmission for each genotype (§3.3).

As the elements of the transmission probability matrix are updated in the MCMC procedure, randomly rotating all available genotypes over all cases with missing genotype results in large steps in posterior probability space. Therefore, a nested procedure was used where genotypes of random untyped cases are updated one at a time, followed by updating the transmission probability matrix for several thousands of iterations. By this procedure, each case with missing genotype is inserted into the transmission network, where it causes the strongest increase in posterior probability.

Norovirus may be transmitted through direct contact between an infectious and a susceptible person, but often the virus remains infectious for extended periods when deposited on inanimate surfaces. Anyone touching such contaminated surfaces may then transfer some infectious virus to the skin of their hands (for instance) and subsequently inoculate themselves by touching exposed mucosal areas. Such an environmental transmission route may link cases over extended intervals. The serial interval may, therefore, be heterogeneous: in addition to the short-term (relatively) high probability of transmission by direct contact, environmental transmission causes a tail effect with a low frequency of long serial intervals (figure 2).

Figure 2. — Serial interval distribution as a binary mixture showing two modes of transmission: a short interval corresponding to direct contact [13,28], and a more disperse, longer interval corresponding to environmental transmission.

In this analysis, a distribution for the serial interval was determined as a mixture of the distribution obtained from day-care centre data [13,28] and a component with similar shape (gamma distribution) but longer intervals, representing possible environmental transmission. Note that there is still only one transmission probability matrix, but its elements now represent a mixture of two modes of transmission. Initial parameters for this distribution (table 1) were determined using the unweighted kernel function described in the electronic supplementary material, appendix C. As soon as convergence was established, the parameters of this serial interval kernel were also estimated in the MCMC procedure (see the electronic supplementary material, appendix E). Figure 2 shows the shape of the temporal kernel function used in this analysis.

Table 1.

Parameters and quantiles of the binary mixture used for the temporal kernel distribution (figure 2). Shown are the weight factor for each component (proportion), the shape and scale parameters of the gamma distributed components, and their arithmetic means and a 95% range.

component	proportion	shape	scale	mean	Q_0.025	Q_0.975
1	0.306	6.53	0.31	2.00	0.77	3.80
2	0.694	0.88	18.07	8.73	2.23	19.56

Open in a new tab

When the cases with ‘missing’ genotype were included into the transmission analysis, using their symptom onset and hospital department data, a likely genotype could be inferred. Figure 3 shows the distribution of norovirus genotypes in observed in patient samples, and the estimated (posterior) frequencies of these genotypes.

Figure 3. — Frequencies of the different norovirus genotypes found in a multi-year hospital outbreak study (obs) compared with frequencies assigned by transmission analysis, sampled from a Markov chain of their posterior distribution: median and 95% intervals.

Figure 4 shows examples of transmission trees, based on the transmission matrix with the highest posterior probability (‘plausible estimate’), for the most frequent genotypes GII.3 and GII.4.

Now we ask whether the analysis led to conclusive results on who was infected by whom? Using the score metric defined above parent nodes could be identified with varying degrees of certainty. Some nodes remained isolated ( Inline graphic ) meaning that no parent nodes were found, because there were no eligible cases within the time window defined by the temporal kernel distribution (figure 2). These may have been founders infected by a source outside the observed cluster, but they could also have been infected by an undetected (asymptomatic) case. For the remaining cases, it was frequently possible to identify a parent node from whom they received their infection, summarized in table 2. For about 18 per cent of all cases Inline graphic a unique parent node could be identified (1−q = 1), and for another 12 per cent of cases, a parent node could be found with only minor uncertainty . For 24 per cent of cases parent nodes could not be identified . Figure 5a shows the distribution of scores for all nodes. Compared with nodes representing subjects with known virus genotype, nodes with unknown genotype show greater variability in their identifiability scores (figure 5b). Figure 5c shows how nodes with a greater possible number of parents tend to have lower scores, as may be expected. Typical results for the four categories of outcomes (complete certainty, minor uncertainty, substantial uncertainty and complete ignorance) are shown in the electronic supplementary material, appendix figure A4.

Table 2.

Identifiability of node parents (who infected whom) by range of the score: numbers of typed or untyped subjects with perfectly identified parents (1−q = 1), decreasing degrees of identification and isolated nodes (1−q = 0).

	typed		untyped
score (1−q)	N	%	N	%
1	27	17	20	19
0.9 ≤1−q < 1.0	17	11	14	13
0.7 ≤ 1−q < 0.9	53	33	26	25
0.0 < 1−q < 0.7	40	25	23	22
0	22	14	21	20
0 ≤ 1−q ≤ 1	159		104

Open in a new tab

Figure 5. — (a) Distribution of the most likely (plausible estimate) identifiability score 1−q (equation 4.4) among all subjects in the multi-year study, including all genotypes, locations and clusters. (b) Distribution of the mean identifiability score 1−q (mean, quartiles and 95% range, averaged over all posterior MC samples) among all subjects in the multi-year study for subjects with typed and untyped norovirus infections. (c) Mean identifiability score 1−q versus mean numbers of potential parent nodes. Dark symbols for typed nodes and white symbols for nodes representing subjects with unknown virus genotype. (Online version in colour.)

Of course the identification of parent nodes still rests on the assumption that infectious cases in these outbreaks were all observed. While it is likely that cases identified as isolated have been infected outside the hospital, it remains uncertain whether any cases in the observed outbreaks may have been infected by unobserved contacts from outside the hospital, or indeed by asymptomatic subjects who remained invisible to the outbreak detection protocol [15]. The missing data approach, used here to deal with genotype data, can be extended to study whether addition of a missed infectious subject may substantially improve the posterior probability of the whole transmission network.

6. Conclusion

The present paper has shown that by combining observed data from different modalities in a pairwise analysis of all cases in an outbreak, the question of ‘who has infected whom’ may be answered, for a majority of the patients involved in an outbreak. Such detailed information on the transmission of an infectious disease is valuable, to identify factors that enhance or impair transmission, or to identify superspreaders.

The transmission probability matrix that is obtained also defines a transmission network (including uncertainty in the links connecting nodes), providing information about the contact network, the underlying infrastructure on which the disease is transmitted. The methods used here may support the analyses of transmission networks, based on observed infectious disease data.

Acknowledgements

We thank much to our colleagues Simon Heisterkamp, Nico Nagelkerke, Hans van Houwelingen and Jacco Wallinga, whose critical advice in many discussions has greatly helped to improve the paper.

References

1.Breban R, Vardavas R, Blower S. 2007. Theory versus data: how to calculate R₀? PLoS ONE 2, e282. 10.1371/journal.pone.0000282 (doi:10.1371/journal.pone.0000282) [DOI] [PMC free article] [PubMed] [Google Scholar]
2.Britton T, O'Neill P. 2002. Bayesian inference for stochastic epidemics in populations with random social structure. Scand. J. Stat. 29, 375–390 10.1111/1467-9469.00296 (doi:10.1111/1467-9469.00296) [DOI] [Google Scholar]
3.Ball F, Lyne O. 2001. Stochastic multitype SIR epidemics among a population partitioned into households. Adv. Appl. Probab. 33, 99–123 10.1239/aap/999187899 (doi:10.1239/aap/999187899) [DOI] [Google Scholar]
4.Bailey NTJ. 1955. Some problems in the statistical analysis of epidemic data. J. R. Stat. Soc. B 17, 35–38 [Google Scholar]
5.Britton T, Deijfen M, Lageras AN, Lindholm M. 2008. Epidemics on random graphs with tunable clustering. J. Appl. Probab. 45, 743–756 10.1239/jap/1222441827 (doi:10.1239/jap/1222441827) [DOI] [Google Scholar]
6.Meyers LA, Newman MEJ, Pourbohloul B. 2006. Predicting epidemics on directed contact networks. J. Theor. Biol. 240, 400–418 10.1016/j.jtbi.2005.10.004 (doi:10.1016/j.jtbi.2005.10.004) [DOI] [PubMed] [Google Scholar]
7.Meyers LA, Pourbohloul B, Newman MEJ, Skowronski DM, Brunham RC. 2005. Network theory and SARS: predicting outbreak diversity. J. Theor. Biol. 232, 71–81 10.1016/j.jtbi.2004.07.026 (doi:10.1016/j.jtbi.2004.07.026) [DOI] [PMC free article] [PubMed] [Google Scholar]
8.Newman MEJ. 2002. Spread of epidemic disease on networks. Phys. Rev. E 66, 016128. 10.1103/PhysRevE.66.016128 (doi:10.1103/PhysRevE.66.016128) [DOI] [PubMed] [Google Scholar]
9.Holme P. 2011. Temporal networks. See http://arxiv.org/abs/1108.1780v1 [Google Scholar]
10.Rocha LEC, Liljeros F, Holme P. 2011. Simulated epidemics in an empirical spatiotemporal network of 50,185 sexual contacts. PLoS Comput. Biol. 7, e1001109. 10.1371/journal.pcbi.1001109 (doi:10.1371/journal.pcbi.1001109) [DOI] [PMC free article] [PubMed] [Google Scholar]
11.Rocha LEC, Liljeros F, Holme P. 2010. Information dynamics shape the sexual networks of internet-mediated prostitution. Proc. Natl Acad. Sci. USA 107, 5706–5711 10.1073/pnas.0914080107 (doi:10.1073/pnas.0914080107) [DOI] [PMC free article] [PubMed] [Google Scholar]
12.Wallinga J, Teunis P. 2004. Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. Am. J. Epidemiol. 160, 509–516 10.1093/aje/kwh255 (doi:10.1093/aje/kwh255) [DOI] [PMC free article] [PubMed] [Google Scholar]
13.Heijne JCM, Teunis PFM, Morroy G, Wijkmans C, Oostveen S, Duizer E, Kretzschmar M, Wallinga J. 2009. Enhanced hygiene measures and norovirus transmission during an outbreak. Emerg. Infect. Dis. 15, 24–30 10.3201/1501.080299 (doi:10.3201/1501.080299) [DOI] [PMC free article] [PubMed] [Google Scholar]
14.Heijne JCM, Rondy M, Verhoef L, Wallinga J, Kretzschmar M, Low N, Koopmans M, Teunis PFM. 2012. Quantifying transmission of norovirus during an outbreak. Epidemiology 23, 277–284 10.1097/EDE.0b013e3182456ee6 (doi:10.1097/EDE.0b013e3182456ee6) [DOI] [PubMed] [Google Scholar]
15.Sukhrie FHA, Teunis PFM, Beersma MFC, Copra C, Vennema H, Bogerman J, Koopmans M. 2012. Nosocomial transmission of norovirus is mainly caused by symptomatic cases. Clin. Infect. Dis. 54, 931–937 10.1093/cid/cir971 (doi:10.1093/cid/cir971). [DOI] [PubMed] [Google Scholar]
16.Gilks WR, Richardson S, Spiegelhalter DJ. (eds). 1996. Markov chain Monte Carlo in practice. London, UK: Chapman and Hall [Google Scholar]
17.Fine PEM. 2003. The interval between successive cases of an infectious disease. Am. J. Epidemiol. 158, 1039–1047 10.1093/aje/kwg251 (doi:10.1093/aje/kwg251) [DOI] [PubMed] [Google Scholar]
18.Kenah E, Lipsitch M, Robins JM. 2008. Generation interval contraction and epidemic data analysis. Math. Biosci. 213, 71–79 10.1016/j.mbs.2008.02.007 (doi:10.1016/j.mbs.2008.02.007) [DOI] [PMC free article] [PubMed] [Google Scholar]
19.Parham PE, Ferguson NM. 2006. Space and contact networks: capturing the locality of disease transmission. J. R. Soc. Interface 3, 483–493 10.1098/rsif.2005.0105 (doi:10.1098/rsif.2005.0105) [DOI] [PMC free article] [PubMed] [Google Scholar]
20.Boender GJ, Meester R, Gies E, de Jong MCM. 2007. The local threshold for geographical spread of infectious diseases between farms. Prev. Vet. Med. 82, 90–101 10.1016/j.prevetmed.2007.05.016 (doi:10.1016/j.prevetmed.2007.05.016) [DOI] [PubMed] [Google Scholar]
21.Ypma RJF, Bataille AMA, Stegeman A, Koch G, Wallinga J, van Ballegooijen WM. 2012. Unravelling transmission trees of infectious diseases by combining genetic and epidemiological data. Proc. R. Soc. B 279, 444–450 10.1098/rspb.2011.0913 (doi:10.1098/rspb.2011.0913) [DOI] [PMC free article] [PubMed] [Google Scholar]
22.Newman MEJ. 2010. Networks: an introduction. Oxford, UK: Oxford University Press [Google Scholar]
23.Koopmans M. 2008. Progress in understanding norovirus epidemiology. Curr. Opin. Infect. Dis. 21, 544–552 10.1097/QCO.0b013e3283108965 (doi:10.1097/QCO.0b013e3283108965) [DOI] [PubMed] [Google Scholar]
24.Lopman BA, Hall AJ, Curns AT, Parashar UD. 2011. Increasing rates of gastroenteritis hospital discharges in US adults and the contribution of norovirus, 1996–2007 Clin. Infect. Dis. 52, 466–474 10.1093/cid/ciq163 (doi:10.1093/cid/ciq163) [DOI] [PubMed] [Google Scholar]
25.Le Guyader FS, et al. 2006. Detection of multiple noroviruses associated with an international gastroenteritis outbreak linked to oyster consumption. J. Clin. Microbiol. 44, 3878–3882 10.1128/JCM.01327-06 (doi:10.1128/JCM.01327-06) [DOI] [PMC free article] [PubMed] [Google Scholar]
26.Sukhrie FHA, Siebenga JJ, Beersma MFC, Koopmans M. 2010. Chronic shedders as reservoir for nosocomial transmission of norovirus. J. Clin. Microbiol. 48, 4303–4305 10.1128/JCM.01308-10 (doi:10.1128/JCM.01308-10) [DOI] [PMC free article] [PubMed] [Google Scholar]
27.Beersma MFC, Schutten M, Vennema H, Hartwig NG, Mes THM, Osterhaus ADME, van Doornum GJJ, Koopmans M. 2009. Norovirus in a Dutch tertiary care hospital (2002–2007): frequent nosocomial transmission and dominance of GIIb strains in young children. J. Hosp. Infect. 71, 199–205 10.1016/j.jhin.2008.11.018 (doi:10.1016/j.jhin.2008.11.018) [DOI] [PubMed] [Google Scholar]
28.Götz H, Ekdahl K, Lindbäck J, de Jong B, Hedlund KO, Giesecke J. 2001. Clinical spectrum and transmission characteristics of infection with Norwalk-like virus: findings from a large community outbreak in Sweden. Clin. Infect. Dis. 33, 622–628 10.1086/322608 (doi:10.1086/322608) [DOI] [PubMed] [Google Scholar]

[RSIF20120955C1] 1.Breban R, Vardavas R, Blower S. 2007. Theory versus data: how to calculate R₀? PLoS ONE 2, e282. 10.1371/journal.pone.0000282 (doi:10.1371/journal.pone.0000282) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20120955C2] 2.Britton T, O'Neill P. 2002. Bayesian inference for stochastic epidemics in populations with random social structure. Scand. J. Stat. 29, 375–390 10.1111/1467-9469.00296 (doi:10.1111/1467-9469.00296) [DOI] [Google Scholar]

[RSIF20120955C3] 3.Ball F, Lyne O. 2001. Stochastic multitype SIR epidemics among a population partitioned into households. Adv. Appl. Probab. 33, 99–123 10.1239/aap/999187899 (doi:10.1239/aap/999187899) [DOI] [Google Scholar]

[RSIF20120955C4] 4.Bailey NTJ. 1955. Some problems in the statistical analysis of epidemic data. J. R. Stat. Soc. B 17, 35–38 [Google Scholar]

[RSIF20120955C5] 5.Britton T, Deijfen M, Lageras AN, Lindholm M. 2008. Epidemics on random graphs with tunable clustering. J. Appl. Probab. 45, 743–756 10.1239/jap/1222441827 (doi:10.1239/jap/1222441827) [DOI] [Google Scholar]

[RSIF20120955C6] 6.Meyers LA, Newman MEJ, Pourbohloul B. 2006. Predicting epidemics on directed contact networks. J. Theor. Biol. 240, 400–418 10.1016/j.jtbi.2005.10.004 (doi:10.1016/j.jtbi.2005.10.004) [DOI] [PubMed] [Google Scholar]

[RSIF20120955C7] 7.Meyers LA, Pourbohloul B, Newman MEJ, Skowronski DM, Brunham RC. 2005. Network theory and SARS: predicting outbreak diversity. J. Theor. Biol. 232, 71–81 10.1016/j.jtbi.2004.07.026 (doi:10.1016/j.jtbi.2004.07.026) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20120955C8] 8.Newman MEJ. 2002. Spread of epidemic disease on networks. Phys. Rev. E 66, 016128. 10.1103/PhysRevE.66.016128 (doi:10.1103/PhysRevE.66.016128) [DOI] [PubMed] [Google Scholar]

[RSIF20120955C9] 9.Holme P. 2011. Temporal networks. See http://arxiv.org/abs/1108.1780v1 [Google Scholar]

[RSIF20120955C10] 10.Rocha LEC, Liljeros F, Holme P. 2011. Simulated epidemics in an empirical spatiotemporal network of 50,185 sexual contacts. PLoS Comput. Biol. 7, e1001109. 10.1371/journal.pcbi.1001109 (doi:10.1371/journal.pcbi.1001109) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20120955C11] 11.Rocha LEC, Liljeros F, Holme P. 2010. Information dynamics shape the sexual networks of internet-mediated prostitution. Proc. Natl Acad. Sci. USA 107, 5706–5711 10.1073/pnas.0914080107 (doi:10.1073/pnas.0914080107) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20120955C12] 12.Wallinga J, Teunis P. 2004. Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. Am. J. Epidemiol. 160, 509–516 10.1093/aje/kwh255 (doi:10.1093/aje/kwh255) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20120955C13] 13.Heijne JCM, Teunis PFM, Morroy G, Wijkmans C, Oostveen S, Duizer E, Kretzschmar M, Wallinga J. 2009. Enhanced hygiene measures and norovirus transmission during an outbreak. Emerg. Infect. Dis. 15, 24–30 10.3201/1501.080299 (doi:10.3201/1501.080299) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20120955C14] 14.Heijne JCM, Rondy M, Verhoef L, Wallinga J, Kretzschmar M, Low N, Koopmans M, Teunis PFM. 2012. Quantifying transmission of norovirus during an outbreak. Epidemiology 23, 277–284 10.1097/EDE.0b013e3182456ee6 (doi:10.1097/EDE.0b013e3182456ee6) [DOI] [PubMed] [Google Scholar]

[RSIF20120955C15] 15.Sukhrie FHA, Teunis PFM, Beersma MFC, Copra C, Vennema H, Bogerman J, Koopmans M. 2012. Nosocomial transmission of norovirus is mainly caused by symptomatic cases. Clin. Infect. Dis. 54, 931–937 10.1093/cid/cir971 (doi:10.1093/cid/cir971). [DOI] [PubMed] [Google Scholar]

[RSIF20120955C16] 16.Gilks WR, Richardson S, Spiegelhalter DJ. (eds). 1996. Markov chain Monte Carlo in practice. London, UK: Chapman and Hall [Google Scholar]

[RSIF20120955C17] 17.Fine PEM. 2003. The interval between successive cases of an infectious disease. Am. J. Epidemiol. 158, 1039–1047 10.1093/aje/kwg251 (doi:10.1093/aje/kwg251) [DOI] [PubMed] [Google Scholar]

[RSIF20120955C18] 18.Kenah E, Lipsitch M, Robins JM. 2008. Generation interval contraction and epidemic data analysis. Math. Biosci. 213, 71–79 10.1016/j.mbs.2008.02.007 (doi:10.1016/j.mbs.2008.02.007) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20120955C19] 19.Parham PE, Ferguson NM. 2006. Space and contact networks: capturing the locality of disease transmission. J. R. Soc. Interface 3, 483–493 10.1098/rsif.2005.0105 (doi:10.1098/rsif.2005.0105) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20120955C20] 20.Boender GJ, Meester R, Gies E, de Jong MCM. 2007. The local threshold for geographical spread of infectious diseases between farms. Prev. Vet. Med. 82, 90–101 10.1016/j.prevetmed.2007.05.016 (doi:10.1016/j.prevetmed.2007.05.016) [DOI] [PubMed] [Google Scholar]

[RSIF20120955C21] 21.Ypma RJF, Bataille AMA, Stegeman A, Koch G, Wallinga J, van Ballegooijen WM. 2012. Unravelling transmission trees of infectious diseases by combining genetic and epidemiological data. Proc. R. Soc. B 279, 444–450 10.1098/rspb.2011.0913 (doi:10.1098/rspb.2011.0913) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20120955C22] 22.Newman MEJ. 2010. Networks: an introduction. Oxford, UK: Oxford University Press [Google Scholar]

[RSIF20120955C23] 23.Koopmans M. 2008. Progress in understanding norovirus epidemiology. Curr. Opin. Infect. Dis. 21, 544–552 10.1097/QCO.0b013e3283108965 (doi:10.1097/QCO.0b013e3283108965) [DOI] [PubMed] [Google Scholar]

[RSIF20120955C24] 24.Lopman BA, Hall AJ, Curns AT, Parashar UD. 2011. Increasing rates of gastroenteritis hospital discharges in US adults and the contribution of norovirus, 1996–2007 Clin. Infect. Dis. 52, 466–474 10.1093/cid/ciq163 (doi:10.1093/cid/ciq163) [DOI] [PubMed] [Google Scholar]

[RSIF20120955C25] 25.Le Guyader FS, et al. 2006. Detection of multiple noroviruses associated with an international gastroenteritis outbreak linked to oyster consumption. J. Clin. Microbiol. 44, 3878–3882 10.1128/JCM.01327-06 (doi:10.1128/JCM.01327-06) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20120955C26] 26.Sukhrie FHA, Siebenga JJ, Beersma MFC, Koopmans M. 2010. Chronic shedders as reservoir for nosocomial transmission of norovirus. J. Clin. Microbiol. 48, 4303–4305 10.1128/JCM.01308-10 (doi:10.1128/JCM.01308-10) [DOI] [PMC free article] [PubMed] [Google Scholar]

[RSIF20120955C27] 27.Beersma MFC, Schutten M, Vennema H, Hartwig NG, Mes THM, Osterhaus ADME, van Doornum GJJ, Koopmans M. 2009. Norovirus in a Dutch tertiary care hospital (2002–2007): frequent nosocomial transmission and dominance of GIIb strains in young children. J. Hosp. Infect. 71, 199–205 10.1016/j.jhin.2008.11.018 (doi:10.1016/j.jhin.2008.11.018) [DOI] [PubMed] [Google Scholar]

[RSIF20120955C28] 28.Götz H, Ekdahl K, Lindbäck J, de Jong B, Hedlund KO, Giesecke J. 2001. Clinical spectrum and transmission characteristics of infection with Norwalk-like virus: findings from a large community outbreak in Sweden. Clin. Infect. Dis. 33, 622–628 10.1086/322608 (doi:10.1086/322608) [DOI] [PubMed] [Google Scholar]

PERMALINK

Infectious disease transmission as a forensic problem: who infected whom?

Peter Teunis

Janneke C M Heijne

Faizel Sukhrie

Jan van Eijkeren

Marion Koopmans

Mirjam Kretzschmar

Abstract

1. Introduction