Author manuscript; available in PMC: 2016 Dec 1.
Published in final edited form as: Biometrics. 2015 Jul 6;71(4):1009–1021. doi: 10.1111/biom.12352

Likelihood-Based Inference for Discretely Observed Birth-Death-Shift Processes, with Applications to Evolution of Mobile Genetic Elements

Jason Xu 1, Peter Guttorp 1, Midori Kato-Maeda 2, Vladimir N Minin 1,3,*
PMCID: PMC4834989  NIHMSID: NIHMS776259  PMID: 26148963

Summary

Continuous-time birth-death-shift (BDS) processes are frequently used in stochastic modeling, with many applications in ecology and epidemiology. In particular, such processes can model evolutionary dynamics of transposable elements, which are important genetic markers in molecular epidemiology. Estimation of the effects of individual covariates on the birth, death, and shift rates of the process can be accomplished by analyzing patient data, but inferring these rates in a discretely and unevenly observed setting presents computational challenges. We propose a multi-type branching process approximation to BDS processes and develop a corresponding expectation maximization algorithm, where we use spectral techniques to reduce calculation of expected sufficient statistics to low dimensional integration. These techniques yield an efficient and robust optimization routine for inferring the rates of the BDS process, and apply broadly to multi-type branching processes whose rates can depend on many covariates. After rigorously testing our methodology in simulation studies, we apply our method to study intrapatient time evolution of the IS6110 transposable element, a genetic marker frequently used during estimation of epidemiological clusters of Mycobacterium tuberculosis infections.

Keywords: Branching processes, Expectation Maximization, Molecular Epidemiology, Missing Data

1. Introduction

Continuous-time branching processes are widely used in stochastic modeling of population dynamics, with applications in biology, genetics, epidemiology, quantum optics, and nuclear fission (Renshaw, 2011). One of the most widely used classes of branching processes is the class of birth-death (BD) processes, a simple yet flexible model for single-species population dynamics. The popularity of BD processes is in part attributable to their well-understood mathematical properties. To accurately model behavior in many applications, however, it is often necessary to consider systems with more than one species: bivariate or other multi-type processes are commonly used to model phenomena such as competition, predation, or infection (Renshaw, 2011). Multi-type branching processes form one class of models that can accommodate populations with multiple types, but these models pose considerable computational challenges for statistical inference. Our work introduces new methods to overcome these challenges, enabling likelihood-based inference in partially observed, multi-type branching processes.

Many statistically relevant quantities are available in closed form for birth-death processes and several of their variants, including transition probabilities, stationary distributions, and moments (Bailey, 1964; Keiding, 1975; Crawford and Suchard, 2012). The ability to compute finite-time transition probabilities enables likelihood-based inference for discretely observed or partially observed BD processes, since the observed likelihood is a function of these transition probabilities. Evaluating this likelihood is necessary in maximum likelihood estimation as well as in many Bayesian inferential procedures. Recent work by Doss et al. (2013) and Crawford et al. (2014) introduces techniques to additionally compute conditional moments of BD sufficient statistics for linear and general birth-death-immigration processes, enabling calculation of the expected complete-data likelihood necessary in an expectation-maximization (EM) algorithm (Dempster et al., 1977).

Unfortunately, methods to evaluate finite-time transition probabilities and conditional moments are not known in the multi-type setting, and generalizing the techniques available in the single-species case is nontrivial. Without these quantities, likelihood-based estimation is limited to simulation-based inference via Monte Carlo EM or MCMC (Golinelli et al., 2006) and asymptotic approximations, such as moment-based estimating equations (Catlin et al., 2001). However, these approaches have shortcomings. MCMC approaches require augmenting the state space by high-dimensional latent variables and become computationally prohibitive when the state space is large. Moment-based methods are statistically less efficient than likelihood-based approaches and thus often inappropriate for smaller datasets, requiring a large number of observations to produce meaningful standard errors and confidence intervals.

In this paper, we extend the analysis of Doss et al. (2013), deriving previously unavailable numerical solutions to transition probabilities and conditional moments for discretely observed, multi-type branching processes. This enables us to evaluate the observed likelihood, as well as to reduce the challenging computation of expected complete-data log-likelihood necessary in an EM algorithm to efficient evaluation of expected sufficient statistics by low-dimensional integration. Our EM algorithm can be applied in settings where the data are assumed to be generated from independent, continuous-time multi-type branching processes, observed at discrete and possibly irregularly spaced time points, whose rates can be a function of many process-specific covariates.

Though our methodology applies broadly, we focus on estimating the rates of a birth-death-shift (BDS) process, which allows a simultaneous birth and death, or shift event, to occur. The BDS process adds the possibility of shift events to the standard BD framework, and is useful for modeling systems that allow for elements to switch locations or types. For example, in epidemiological applications, interaction between infected and susceptible populations can be captured as a shift event, involving a simultaneous increase and decrease in the respective populations. Spatial BDS processes have also been studied to improve Metropolis-Hastings algorithms for perfect sampling (Huber, 2012) relevant to a range of spatial statistical applications; see Illian et al. (2008) for an overview. Our motivation stems from the BDS process proposed by Rosenberg et al. (2003) to model evolution of transposons, mobile genetic elements that can replicate, die, or shift locations along the genome. While previous methods have inferred rates of this process from data under restrictive model assumptions, our method is the first to enable inference without compromising the model. In particular, previous analyses either did not allow multiple events to occur per observation interval or could not model shift events, while our work successfully addresses both issues. We derive an EM algorithm for discretely observed multi-type branching processes, and assess its performance in several simulation studies. Finally, we apply our algorithm to estimate rates of the IS6110 transposon in the Mycobacterium tuberculosis genome as a function of relevant covariates.

2. Methodology

Our motivation stems from a birth-death-shift process proposed by Rosenberg et al. (2003) to model evolutionary dynamics of transposable elements or transposons — genomic mobile sequence elements. Each transposon can (1) duplicate, with the new copy moving to a new genomic location; (2) shift to a different genomic position; or (3) be removed and lost from the genome, independently of all other transposons. These events occur at instantaneous rates proportional to the total transposon copy number at that time. Thus, transposons evolve according to a linear birth-death-shift (BDS) process in continuous time.

The process of transposon evolution within a host is observable by serially genotyping the organism of interest, e.g., Mycobacterium tuberculosis as in Rosenberg et al. (2003). The number and chromosomal position of the IS6110 element in the M. tuberculosis genome can be visualized using restriction fragment length polymorphism (RFLP). This technique entails restriction endonuclease digestion of the M. tuberculosis DNA which is run in an agarose gel, southern blotting and probing with a peroxidase labeled IS6110 probe. Birth, death, and shift events are thus detectable via changes in the number and size of the bands where the IS6110 elements are located.

Estimating the rates based on observed changes at genotyping times in this experimental setup corresponds to inference in a discretely observed linear BDS process. That is, we assume each element behaves independently, and that overall rates of each event are proportional to total copy number k. Together with the time-homogeneity assumption, waiting times until occurrence of an event are distributed exponentially with rate kη, where η = λ + μ + ν. When an event occurs, the probability that it is a birth, death, or shift is given by λ/η, μ/η, and ν/η respectively. The BDS process is therefore a continuous-time Markov chain (CTMC).
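The event mechanism described above can be illustrated with a short Gillespie-style simulation. This is only a sketch under simplifying assumptions: it tracks the copy number and event counts rather than site locations, and the function name `simulate_bds_counts` is hypothetical (it is not taken from the authors' software).

```python
import random

def simulate_bds_counts(k0, lam, mu, nu, t_end, rng):
    """Simulate copy-number dynamics of a linear BDS process up to time t_end.

    Returns (final copy number, births, deaths, shifts).
    Assumes lam + mu + nu > 0.
    """
    eta = lam + mu + nu
    k, t = k0, 0.0
    births = deaths = shifts = 0
    while k > 0:
        # Waiting time to the next event is exponential with rate k * eta.
        t += rng.expovariate(k * eta)
        if t >= t_end:
            break
        # The event is a birth, death, or shift w.p. lam/eta, mu/eta, nu/eta.
        u = rng.random() * eta
        if u < lam:
            births += 1
            k += 1
        elif u < lam + mu:
            deaths += 1
            k -= 1
        else:
            shifts += 1  # a shift leaves the copy number unchanged
    return k, births, deaths, shifts

rng = random.Random(1)
print(simulate_bds_counts(10, 0.0188, 0.0147, 0.00268, 10.0, rng))
```

Because shifts do not change the copy number, the final count always equals the initial count plus births minus deaths, which is why copy-number data alone cannot identify the shift rate.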

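The states of our process, $\tilde{x} \in \{0,1\}^S \equiv \tilde{\Omega}$, can be represented as binary vectors, where $S$ is the number of possible locations transposons may occupy along the genome, 0's denote unoccupied sites, and 1's correspond to sites occupied by a transposon. Denote the $2^S \times 2^S$ infinitesimal generator of this CTMC by $Q = \{q_{\tilde{x}_1,\tilde{x}_2}\}$, where $q_{\tilde{x}_1,\tilde{x}_2}$ is the instantaneous rate of jumping to $\tilde{x}_2$ beginning from $\tilde{x}_1$, with $\tilde{x}_1, \tilde{x}_2 \in \tilde{\Omega}$. To write the entries of $Q$, first define $C^{+}(\tilde{x})$ as the set of all configurations with one additional site occupied relative to $\tilde{x}$. Thus, $C^{+}(\tilde{x})$ contains the states reached by one birth event beginning with $\tilde{x}$. Similarly, $C^{\rightarrow}(\tilde{x})$ contains states where one additional site is occupied and one originally occupied site is no longer occupied, and $C^{-}(\tilde{x})$ contains states where one originally occupied site in $\tilde{x}$ is no longer occupied. If $\tilde{x}_1$ has $k$ occupied sites, then $|C^{+}(\tilde{x}_1)| = S - k$, $|C^{-}(\tilde{x}_1)| = k$, and $|C^{\rightarrow}(\tilde{x}_1)| = |C^{+}(\tilde{x}_1)| \times |C^{-}(\tilde{x}_1)|$. The off-diagonal entries of the generator $Q$ are then given by

$$q_{\tilde{x}_1,\tilde{x}_2} = \frac{k\lambda}{|C^{+}(\tilde{x}_1)|}\,1\{\tilde{x}_2 \in C^{+}(\tilde{x}_1)\} + \frac{k\nu}{|C^{\rightarrow}(\tilde{x}_1)|}\,1\{\tilde{x}_2 \in C^{\rightarrow}(\tilde{x}_1)\} + \frac{k\mu}{|C^{-}(\tilde{x}_1)|}\,1\{\tilde{x}_2 \in C^{-}(\tilde{x}_1)\}, \tag{1}$$

so that the total rate of leaving $\tilde{x}_1$ is $k(\lambda + \nu + \mu)$, in agreement with the exponential waiting times with rate $k\eta$ described above.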

2.1 BDS process with covariates

We are interested in inference when the data consist of $m$ independent processes $\{\tilde{X}_p(t)\}$, $p = 1, \ldots, m$, each discretely observed at times $0 = t_{p,0} < t_{p,1} < \cdots < t_{p,n(p)}$. We assume each process $\{\tilde{X}_p(t)\}$ evolves according to a linear BDS model with per-particle instantaneous birth rate $\lambda_p \ge 0$, shift rate $\nu_p \ge 0$, and death rate $\mu_p \ge 0$. The data, observations from each process, are points in the previously defined state space, with $\tilde{X}_p(t) \in \tilde{\Omega}$ for any fixed $p$ and $t$. For example, in transposon evolution, each patient $p$ is genotyped at $n(p)+1$ observation times, and at each given time, the 1's present in the data vector correspond to locations in the gel currently occupied by transposons. The observed data corresponding to a given process $\{\tilde{X}_p(t)\}$ can thus be collected in an $S \times \{n(p)+1\}$ matrix with columns corresponding to observation times, and the full observed dataset can be collected into an $S \times \sum_{p=1}^{m}\{n(p)+1\}$ matrix: $Y = \{\tilde{X}_1(t_{1,0}), \ldots, \tilde{X}_1(t_{1,n(1)}), \ldots, \tilde{X}_m(t_{m,0}), \ldots, \tilde{X}_m(t_{m,n(m)})\}$.

The rates of each process are determined by a vector of $c$ covariates $z_p = (z_{p,1}, z_{p,2}, \ldots, z_{p,c}) \in \mathbb{R}^c$ through a log-linear model $\log(\lambda_p) = \beta^{\lambda} \cdot z_p$, $\log(\nu_p) = \beta^{\nu} \cdot z_p$, $\log(\mu_p) = \beta^{\mu} \cdot z_p$, where $\beta := (\beta^{\lambda}, \beta^{\nu}, \beta^{\mu})$ are the regression coefficients and $\cdot$ represents the inner (dot) product of two vectors. For instance, in an epidemiological study, these covariates may contain patient-specific disease process and demographic information. The observed data log-likelihood is obtained by summing over transitions in each process and summing over all processes:

$$\tilde{\ell}_o(Y;\beta) = \sum_{p=1}^{m}\sum_{j=0}^{n(p)-1} \log \tilde{p}_{\tilde{X}_p(t_{p,j}),\,\tilde{X}_p(t_{p,j+1})}(t_{p,j+1} - t_{p,j};\theta_p), \tag{2}$$

where $\theta_p = (\lambda_p, \nu_p, \mu_p)$ and $\tilde{p}_{\tilde{x}_1,\tilde{x}_2}(t;\theta) = \Pr_{\theta}\{\tilde{X}(t) = \tilde{x}_2 \mid \tilde{X}(0) = \tilde{x}_1\}$ denotes a transition probability of the BDS process. We are interested in computing the maximum likelihood estimates (MLEs) of the parameters $\beta$ of the BDS process. Notice that if the transition probabilities were available for given $\lambda$, $\nu$, $\mu$, and $t$ values, one could maximize the likelihood in (2) using standard off-the-shelf optimization procedures. However, due to the large state space of all possible configurations of occupied sites, analysis of these transition probabilities is intractable. To approximate the BDS model likelihood above, we introduce a two-type branching process with computationally tractable transition probabilities that are numerically close to the transition probabilities of the BDS model over any time interval. The following sections detail the correspondence between the BDS model and the two-type branching process.
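The structure of the observed log-likelihood, with log-linear rates and a sum over intervals and processes, can be sketched as follows. The data layout (dictionaries with keys `"z"`, `"times"`, `"states"`) and the callback `trans_prob` are illustrative assumptions; any routine that returns transition probabilities could be plugged in.

```python
import math

def rates_from_covariates(beta, z):
    """Log-linear model: log(rate) = beta_rate . z for birth, shift, and death."""
    return tuple(
        math.exp(sum(b * x for b, x in zip(beta[name], z)))
        for name in ("lam", "nu", "mu")
    )

def observed_loglik(beta, processes, trans_prob):
    """Sum log transition probabilities over intervals and over processes.

    processes: list of dicts with keys "z" (covariates), "times", "states".
    trans_prob(x1, x2, dt, theta): user-supplied transition probability.
    """
    total = 0.0
    for proc in processes:
        theta = rates_from_covariates(beta, proc["z"])
        times, states = proc["times"], proc["states"]
        for j in range(len(times) - 1):
            dt = times[j + 1] - times[j]  # intervals may be unevenly spaced
            total += math.log(trans_prob(states[j], states[j + 1], dt, theta))
    return total
```

Keeping the transition probability as a callback separates the bookkeeping of the likelihood from the harder problem of computing the probabilities themselves.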

2.2 State space reduction

The size of the original state space, $|\tilde{\Omega}| = 2^S$, quickly becomes unmanageable as $S$ grows, so that analysis using the rate matrix defined in (1) becomes unwieldy for all but small values of $S$. Previous work by Doss et al. (2013) addresses this issue by collapsing the state space to one dimension, distilling the data to copy number counts at each observation time. In this simplified setting, they develop tools for inference in a discretely observed birth-death-immigration framework. However, this approximate model ignores particle shifts which do not affect the total copy number, rendering the shift rate unidentifiable. Further, collapsing the state space in this way violates the Markov assumption in the BDS model. In particular, waiting times between birth and death events are exponentially distributed under the model in Doss et al. (2013), but under the BDS model with shift events, the waiting time between a birth and death no longer follows an exponential distribution.

Instead of ignoring shifts, we propose a reduction of the state space into a two-dimensional representation $\Omega \subseteq \mathbb{N} \times \mathbb{N}$. Elements of this reduced space are pairs $X(t) = (x_{\text{old}}, x_{\text{new}}) \in \Omega$ tracking the number of originally occupied and newly occupied sites at the end of each observation interval. As an example, assume six particles are present initially at time $t_0$, a shift and a birth occur before the first observation $t_1$, and a death occurs before a second observation at $t_2$. When considering the first interval $[t_0, t_1)$, we have $\{X(t_0) = (6, 0), X(t_1) = (5, 2)\}$, but over $[t_1, t_2)$, we now have $\{X(t_1) = (7, 0), X(t_2) = (6, 0)\}$, since all seven particles at $t_1$ comprise the initial population in the second interval. This seemingly inconsistent definition of the state at $X(t_1)$ is not a problem: we will see that all necessary computations occur across disjoint intervals, so that our reduced representation of the original process needs only to be defined consistently for any given pair of consecutive observations.

Formally, this state space transformation is a mapping $\psi: \tilde{\Omega} \times \tilde{\Omega} \to \Omega \times \Omega$ on consecutive pairs of observations in $\tilde{\Omega}$ to the reduced state space, computed as $\psi: \{\tilde{X}(t_1), \tilde{X}(t_2)\} \mapsto \{(a, 0), (b, c)\} = \{X(t_1), X(t_2)\}$, where $a = \sum_{j=1}^{S} \tilde{X}_j(t_1)$ is the total number of initially occupied sites in $\tilde{X}(t_1)$, $b = \sum_{j=1}^{S} 1\{\tilde{X}_j(t_2) = \tilde{X}_j(t_1)\}\,\tilde{X}_j(t_1)$ is the number of initially occupied sites that remain occupied, and $c = \sum_{j=1}^{S} 1\{\tilde{X}_j(t_2) - \tilde{X}_j(t_1) = 1\}$ is the number of newly occupied sites in $\tilde{X}(t_2)$ not present in $\tilde{X}(t_1)$. Note that while $\psi$ significantly reduces the size of the state space, the mapping discards information about specific particle locations, which is uninformative for inferring birth, death, and shift rates due to symmetry induced by particle independence. The number of changes in locations between observations, which is the data relevant to our estimation task, is preserved in the image of $\psi$.
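As a small illustration of the mapping $\psi$, the sketch below converts two consecutive binary site vectors into the reduced pair of states. The function name `psi` and the list-based representation are assumptions made for this example.

```python
def psi(x_t1, x_t2):
    """Map two consecutive binary site vectors to reduced states (a, 0), (b, c)."""
    if len(x_t1) != len(x_t2):
        raise ValueError("both observations must cover the same S sites")
    pairs = list(zip(x_t1, x_t2))
    a = sum(x_t1)                                        # sites occupied at t1
    b = sum(1 for u, v in pairs if u == 1 and v == 1)    # occupied at t1 and t2
    c = sum(1 for u, v in pairs if v - u == 1)           # newly occupied at t2
    return (a, 0), (b, c)

# Six occupied sites; one element shifts from site 0 to site 6,
# and one birth places a new copy on site 7.
x0 = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
x1 = [0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
print(psi(x0, x1))  # ((6, 0), (5, 2))
```

This reproduces the six-particle example from the text: one shift and one birth over the first interval give the reduced states (6, 0) and (5, 2).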

2.3 A two-type branching process model

Working now in the space Ω, we can treat xold and xnew as particle types in a two-type branching process. Let aj(k, l) be the rate of producing k type 1 particles and l type 2 particles, beginning with one type j particle, j = 1, 2. Then the nonzero rates defining the two-type branching process corresponding to the birth-death-shift model are given by a1(1, 1) = λ, a1(0, 1) = ν, a1(0, 0) = μ, a1(1, 0) = −(λ + ν + μ), a2(0, 2) = λ, a2(0, 1) = −(λ + μ), a2(0, 0) = μ. This characterization enables us to apply a generating function approach to obtain transition probabilities of the process. Defining Xj(t) as the number of type j particles at time t, we consider the generating function

$$\phi_{jk}(t, s_1, s_2) = E\{s_1^{X_1(t)} s_2^{X_2(t)} \mid X_1(0) = j, X_2(0) = k\} = \sum_{l=0}^{\infty}\sum_{m=0}^{\infty} p_{(j,k),(l,m)}(t)\, s_1^{l} s_2^{m}. \tag{3}$$

Using the Kolmogorov backward equations, we derive equations and a closed form solution for $\phi_{jk}$ (see Appendix A). With $\phi_{jk}$ available, we see from (3) that the transition probabilities $p_{(j,k),(l,m)}(t)$ can then be obtained by differentiating and setting $s_1 = s_2 = 0$, but without an analytical expression for these derivatives, repeated numerical differentiation is inefficient and numerically unstable. Instead, we map our domain $[0, 1] \times [0, 1]$ to the complex unit circle by setting $s_1 = e^{2\pi i w_1}$, $s_2 = e^{2\pi i w_2}$, so that the generating function becomes a Fourier series $\phi_{jk}(t, e^{2\pi i w_1}, e^{2\pi i w_2}) = \sum_{l,m=0}^{\infty} p_{(j,k),(l,m)}(t)\, e^{2\pi i l w_1} e^{2\pi i m w_2}$. Applying a Riemann sum approximation to the integral corresponding to coefficients given by the Fourier inversion formula, we can compute the transition probabilities using

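$$p_{(j,k),(l,m)}(t) = \int_0^1\!\!\int_0^1 \phi_{jk}(t, e^{2\pi i w_1}, e^{2\pi i w_2})\, e^{-2\pi i l w_1} e^{-2\pi i m w_2}\, dw_1\, dw_2 \approx \frac{1}{N^2}\sum_{u=0}^{N-1}\sum_{v=0}^{N-1} \phi_{jk}(t, e^{2\pi i u/N}, e^{2\pi i v/N})\, e^{-2\pi i l u/N} e^{-2\pi i m v/N}. \tag{4}$$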

Choice of a larger $N$ leads to a finer and thus more accurate Riemann sum approximation of the integral, and also allows us to compute transition probabilities to and from a larger total particle population of either type. The Fast Fourier transform (FFT) enables efficient computation of these coefficients (Henrici, 1979), and in our application and simulation studies, we find that a grid size as small as $N = 16$ yields accurate results. With transition probabilities available, we may closely approximate $\tilde{\ell}_o(Y;\beta)$ by the branching process likelihood
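The Riemann-sum inversion in (4) can be checked on a generating function whose coefficients are known. The sketch below evaluates the double sum directly for a single coefficient; an FFT would return all $N^2$ coefficients at once from the same grid of function values. The generating function `toy_phi` is a stand-in chosen for this illustration (a binomial count times a Poisson count) and is not the BDS solution from Appendix A.

```python
import cmath
import math

def invert_pgf(phi, l, m, N=16):
    """Riemann-sum approximation (4) of the (l, m) coefficient of a bivariate PGF."""
    total = 0j
    for u in range(N):
        s1 = cmath.exp(2j * math.pi * u / N)
        for v in range(N):
            s2 = cmath.exp(2j * math.pi * v / N)
            phase = cmath.exp(-2j * math.pi * (l * u + m * v) / N)
            total += phi(s1, s2) * phase
    return (total / N**2).real

# Stand-in PGF (NOT the BDS solution): Binomial(5, 0.7) times Poisson(0.5).
def toy_phi(s1, s2):
    return (0.3 + 0.7 * s1) ** 5 * cmath.exp(0.5 * (s2 - 1))

exact = math.comb(5, 3) * 0.7**3 * 0.3**2 * math.exp(-0.5) * 0.5**2 / 2
print(invert_pgf(toy_phi, 3, 2), exact)
```

With only $N$ grid points per dimension, coefficients whose indices differ by a multiple of $N$ are aliased together, which is one way to see why a larger $N$ is needed for larger populations.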

$$\ell_o(Y;\beta) = \sum_{p=1}^{m}\sum_{j=0}^{n(p)-1} \log p_{X_p(t_{p,j}),\,X_p(t_{p,j+1})}(t_{p,j+1} - t_{p,j};\theta_p), \tag{5}$$

so that maximizing (5) approximately maximizes the observed likelihood in (2).

3. EM algorithm for the BDS process

With transition probabilities of the process available, it is already possible to produce MLEs of the covariate effects associated with birth, death, and shift rates by numerical maximization of the observed likelihood. However, an EM algorithm approach often outperforms off-the-shelf optimization procedures in missing data problems, offering a significantly faster and more robust solution. Let $\ell_c(X, \beta)$ denote the complete data log-likelihood, $X$ the complete data, and $Y$ the available observations. The EM algorithm begins with an initial parameter estimate $\beta_0$, and then at the $j$th iteration updates the estimate by setting $\beta_j = \arg\max_{\beta} E_{\beta_{j-1}}\{\ell_c(X, \beta) \mid Y\}$. Each iteration involves a computation of the expectation term, called the E-step, followed by a maximization of the expectation, called the M-step.

3.1 E-step

The fully observed BDS process is a continuous-time Markov chain, so its complete-data log-likelihood can be written as

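$$\ell_c(X;\beta) = \sum_{p=1}^{m}\left\{ b_p \log \lambda_p + f_p \log \nu_p + d_p \log \mu_p - (\lambda_p + \mu_p + \nu_p)\sum_{k=0}^{\infty} k\,\tau_p(k) + \sum_{k=0}^{\infty} \log \tau_p(k) \right\}, \tag{6}$$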

where $\tau_p(k)$ is the total time process $X_p(t)$ spends with total copy number $X_{p,1}(t) + X_{p,2}(t) = k$, $b_p$ is the total number of births, $f_p$ the number of shifts, and $d_p$ the number of deaths for each patient $p = 1, \ldots, m$. These quantities are the complete data sufficient statistics (Guttorp, 1995). Notice the final term in (6) is constant with respect to the parameters. We see that in order to obtain the expected complete-data log-likelihood, we need to calculate only the expected births $E_{\beta}\{b_p \mid Y\}$, shifts $E_{\beta}\{f_p \mid Y\}$, deaths $E_{\beta}\{d_p \mid Y\}$, and particle time $E_{\beta}\{R_p \mid Y\}$, where the last quantity is defined as

$$E_{\beta}\{R_p \mid Y\} := E_{\beta}\left\{ \int_{t_{p,0}}^{t_{p,n(p)}} X_1(s) + X_2(s)\, ds \,\Big|\, Y \right\} = E_{\beta}\left\{ \sum_{k=0}^{\infty} k\,\tau_p(k) \,\Big|\, Y \right\}.$$

By independence of the $m$ processes and linearity of expectations, each expectation breaks into sums of expectations over the observation intervals. Further, by homogeneity it suffices that for all non-negative integers $j, k, l, m$, we can calculate the quantities

$$e^{+}_{jk,lm}(t) = E\{b_{p,t} \mid X_p(0) = (j,k), X_p(t) = (l,m)\},$$
$$e^{\rightarrow}_{jk,lm}(t) = E\{f_{p,t} \mid X_p(0) = (j,k), X_p(t) = (l,m)\},$$
$$e^{-}_{jk,lm}(t) = E\{d_{p,t} \mid X_p(0) = (j,k), X_p(t) = (l,m)\},$$
$$e^{*}_{jk,lm}(t) = E\{R_{p,t} \mid X_p(0) = (j,k), X_p(t) = (l,m)\}.$$

Dependence of these quantities on rates λp, νp, μp is suppressed for simplicity. As noticed by Minin and Suchard (2008) and Doss et al. (2013), it is easier to work via restricted moments

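$$m^{+}_{jk,lm}(t) = E\{b_{p,t}\, 1\{X_p(t) = (l,m)\} \mid X_p(0) = (j,k)\} = \sum_{n=0}^{\infty} n\, q^{+}_{jk,lm}(n,t),$$
$$m^{\rightarrow}_{jk,lm}(t) = E\{f_{p,t}\, 1\{X_p(t) = (l,m)\} \mid X_p(0) = (j,k)\} = \sum_{n=0}^{\infty} n\, q^{\rightarrow}_{jk,lm}(n,t),$$
$$m^{-}_{jk,lm}(t) = E\{d_{p,t}\, 1\{X_p(t) = (l,m)\} \mid X_p(0) = (j,k)\} = \sum_{n=0}^{\infty} n\, q^{-}_{jk,lm}(n,t),$$
$$m^{*}_{jk,lm}(t) = E\{R_{p,t}\, 1\{X_p(t) = (l,m)\} \mid X_p(0) = (j,k)\} = \int_{0}^{\infty} x\, dq^{*}_{jk,lm}(x,t),$$

where

$$q^{*}_{jk,lm}(x,t) = \Pr\{R_{p,t} \le x, X_p(t) = (l,m) \mid X_p(0) = (j,k)\},$$
$$q^{+}_{jk,lm}(n,t) = \Pr\{b_{p,t} = n, X_p(t) = (l,m) \mid X_p(0) = (j,k)\},$$

and $q^{\rightarrow}_{jk,lm}$, $q^{-}_{jk,lm}$ are defined analogously. The conditional expectations can then be recovered after dividing by transition probabilities, i.e. $e^{+}_{jk,lm}(t) = m^{+}_{jk,lm}(t) / p_{jk,lm}(t)$.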

These restricted moments can be computed with an approach similar to the one used to obtain transition probabilities. We begin by defining the pseudo-generating functions: for expected births, let $g^{+}_{jk,lm}(r,t) = \sum_{n=0}^{\infty} q^{+}_{jk,lm}(n,t)\, r^{n}$. Ignoring notational dependence on individual patients for simplicity, we define the joint generating function

$$H^{+}_{jk}(r, s_1, s_2, t) = E\{r^{b_t} s_1^{X_1(t)} s_2^{X_2(t)} \mid X(0) = (j,k)\} = \sum_{l}\sum_{m}\sum_{n} \Pr\{b_t = n, X(t) = (l,m) \mid X(0) = (j,k)\}\, r^{n} s_1^{l} s_2^{m} = \sum_{l}\sum_{m} g^{+}_{jk,lm}(r,t)\, s_1^{l} s_2^{m}.$$

Pseudo-generating functions for shifts and deaths are defined analogously, and the pseudo-generating function for particle time is defined as

$$H^{*}_{jk}(r, s_1, s_2, t) = \sum_{l}\sum_{m} \int_{0}^{\infty} e^{-rx}\, dq^{*}_{jk,lm}(x,t)\, s_1^{l} s_2^{m} := \sum_{l}\sum_{m} V_{jk,lm}(r,t)\, s_1^{l} s_2^{m},$$

where $V_{jk,lm}(r,t) = \int_{0}^{\infty} e^{-rx}\, dq^{*}_{jk,lm}(x,t)$ is the Laplace-Stieltjes transform of $q^{*}_{jk,lm}(x,t)$. In each case we can define series whose coefficients are our quantities of interest by partial differentiation:

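$$G^{+}_{jk}(s_1, s_2, t) = \frac{\partial}{\partial r} H^{+}_{jk}(r, s_1, s_2, t)\Big|_{r=1} = \sum_{l}\sum_{m}\left\{ \sum_{n} n\, q^{+}_{jk,lm}(n,t) \right\} s_1^{l} s_2^{m} = \sum_{l}\sum_{m} m^{+}_{jk,lm}(t)\, s_1^{l} s_2^{m}. \tag{7}$$

$G^{\rightarrow}_{jk}$ and $G^{-}_{jk}$ are defined analogously, and the expression for particle time is instead differentiated at $r = 0$:

$$G^{*}_{jk}(s_1, s_2, t) = -\frac{\partial}{\partial r} H^{*}_{jk}(r, s_1, s_2, t)\Big|_{r=0} = \sum_{l}\sum_{m}\left\{ \int_{0}^{\infty} x\, dq^{*}_{jk,lm}(x,t) \right\} s_1^{l} s_2^{m} = \sum_{l}\sum_{m} m^{*}_{jk,lm}(t)\, s_1^{l} s_2^{m}. \tag{8}$$

We see that given expressions for $H^{+}_{jk}$, $H^{\rightarrow}_{jk}$, $H^{-}_{jk}$, and $H^{*}_{jk}$, the coefficients corresponding to moments $m^{+}_{jk,lm}$, $m^{\rightarrow}_{jk,lm}$, $m^{-}_{jk,lm}$, $m^{*}_{jk,lm}$ can then be numerically computed using FFT analogously to (4). For notational simplicity, we use $G_{jk}$ when referring collectively to $G^{+}_{jk}$, $G^{\rightarrow}_{jk}$, $G^{-}_{jk}$, and $G^{*}_{jk}$, and similarly define $H_{jk}$.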

Having reduced our task to computing $H_{jk}$, we define $H_1 := H_{10}(r, s_1, s_2, t)$ and $H_2 := H_{01}(r, s_1, s_2, t)$. Particle independence then yields $H_{jk} = H_1^{j} H_2^{k}$. In all cases, $H_2$ is analytically available, and we derive an ordinary differential equation for $H_1$, summarized in the theorem below. We present the result for a branching process with rates corresponding to the birth-death-shift model, but such systems of equations are available for an arbitrary time-homogeneous multi-type branching process.

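Theorem 1: Let $\{X_t\}$ be a two-type branching process defined by the rates in Section 2.3. Denote particle time and the number of births, shifts, and deaths over the interval $[0, t)$ by $R_t$, $b_t$, $f_t$, and $d_t$ respectively. Define the generating functions corresponding to births as

$$H_1^{+}(r, s_1, s_2, t) = E\{r^{b_t} s_1^{X_1(t)} s_2^{X_2(t)} \mid X(0) = (1, 0)\} \quad \text{and} \quad H_2^{+}(r, s_1, s_2, t) = E\{r^{b_t} s_1^{X_1(t)} s_2^{X_2(t)} \mid X(0) = (0, 1)\}.$$

Then

$$H_2^{+} = y_b + \left\{ -\frac{\lambda r}{2\lambda r y_b - \lambda - \mu} + \left( \frac{1}{s_2 - y_b} + \frac{\lambda r}{2\lambda r y_b - \lambda - \mu} \right) e^{-(2 y_b \lambda r - \lambda - \mu)t} \right\}^{-1},$$

where $y_b = \dfrac{\lambda + \mu + \sqrt{\lambda^2 + 2\lambda\mu + \mu^2 - 4\lambda\mu r}}{2\lambda r}$, and $H_1^{+}$ satisfies

$$\frac{d}{dt} H_1^{+}(r, s_1, s_2, t) = \lambda r H_1^{+} H_2^{+} + \nu H_2^{+} + \mu - (\lambda + \mu + \nu) H_1^{+}, \tag{9}$$

subject to initial condition $H_1^{+}(r, s_1, s_2, 0) = s_1$.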

Analogous equations for shifts, deaths, and particle time along with detailed derivations are included in Appendix B.

This theorem shows that for each of the sufficient statistics, necessary computations for Hjk reduce to solving a single ordinary differential equation, and because we can evaluate Hjk, we can also easily differentiate Hjk numerically, yielding solutions Gjk.
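The following sketch illustrates how the birth generating function could be evaluated numerically: the type-2 function is computed from its closed form, and the ODE for the type-1 function, $dH_1^{+}/dt = \lambda r H_1^{+} H_2^{+} + \nu H_2^{+} + \mu - (\lambda+\mu+\nu) H_1^{+}$ with $H_1^{+}(0) = s_1$, is integrated with a classical fourth-order Runge-Kutta scheme. The function names, the choice of Runge-Kutta, and the step count are assumptions for illustration and are not claimed to match the authors' implementation. The closed form is singular when $s_2 = y_b$ or when the discriminant vanishes, which the sketch does not handle.

```python
import cmath

def h2_births(r, s2, t, lam, mu):
    """Closed-form H2+: y_b + 1/u(t), where u solves a linear ODE (Riccati trick)."""
    disc = cmath.sqrt((lam + mu) ** 2 - 4 * lam * mu * r)
    y = (lam + mu + disc) / (2 * lam * r)
    c = 2 * y * lam * r - lam - mu
    u = -lam * r / c + (1 / (s2 - y) + lam * r / c) * cmath.exp(-c * t)
    return y + 1 / u

def h1_births(r, s1, s2, t, lam, mu, nu, steps=200):
    """Integrate the ODE for H1+ from 0 to t with classical RK4 (assumed solver)."""
    def rhs(time, h1):
        h2 = h2_births(r, s2, time, lam, mu)
        return lam * r * h1 * h2 + nu * h2 + mu - (lam + mu + nu) * h1

    h1, dt = complex(s1), t / steps
    for i in range(steps):
        t0 = i * dt
        k1 = rhs(t0, h1)
        k2 = rhs(t0 + dt / 2, h1 + dt * k1 / 2)
        k3 = rhs(t0 + dt / 2, h1 + dt * k2 / 2)
        k4 = rhs(t0 + dt, h1 + dt * k3)
        h1 += dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return h1

# Numerical derivative in r at r = 1 (with s1 = s2 = 1) gives expected births
# from a single type-1 particle; compare with lam * (e^{(lam-mu)t} - 1) / (lam - mu).
lam, mu, nu, t, h = 0.07, 0.12, 0.02, 2.0, 1e-4
up = h1_births(1 + h, 1.0, 1.0, t, lam, mu, nu)
down = h1_births(1 - h, 1.0, 1.0, t, lam, mu, nu)
print(((up - down) / (2 * h)).real)
```

The comparison value in the final comment uses the fact that the total population is a linear birth-death process with mean $e^{(\lambda-\mu)t}$, so expected births equal $\lambda \int_0^t e^{(\lambda-\mu)s}\,ds$.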

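To summarize, with $H^{+}_{jk}$, $H^{\rightarrow}_{jk}$, $H^{-}_{jk}$, $H^{*}_{jk}$ now available, we may obtain the restricted moments by computing the coefficients in the power series $G^{+}_{jk}$, $G^{\rightarrow}_{jk}$, $G^{-}_{jk}$, $G^{*}_{jk}$. These coefficients are recovered using a Riemann approximation to the Fourier inversion formula analogous to formula (4). We are thus able to compute all necessary quantities appearing in the expected complete-data log-likelihood $E_{\tilde{\beta}}\{\ell_c(X, \beta) \mid Y\}$. Recall that sufficient statistics for each patient $b_p$, $f_p$, $d_p$, and $R_p$ break up over intervals: i.e. the total number of births $b_p$ is equal to the sum of the number of births over each disjoint interval $[t_{p,j-1}, t_{p,j})$, with $j = 1, \ldots, n(p)$. Further, by the Markov property, the conditional expectation of the number of births over an interval $[t_1, t_2)$ given $Y$ depends only on the states of the process at the endpoints of the interval:

$$E(b_{p,t_2 - t_1} \mid Y) = E\{b_{p,t_2 - t_1} \mid X_p(t_1), X_p(t_2)\} = e^{+}_{X_p(t_1),X_p(t_2)}(t_2 - t_1) = \frac{m^{+}_{X_p(t_1),X_p(t_2)}(t_2 - t_1)}{p_{X_p(t_1),X_p(t_2)}(t_2 - t_1)}, \tag{10}$$

and the same is true for the other sufficient statistics. Therefore, for each process $p$,

$$E_{\tilde{\beta}}(b_p \mid Y) = \sum_{i=1}^{n(p)} e^{+}_{X_p(t_{p,i-1}),X_p(t_{p,i})}(t_{p,i} - t_{p,i-1}; \tilde{\lambda}_p, \tilde{\nu}_p, \tilde{\mu}_p), \tag{11}$$

with $\log(\tilde{\lambda}_p) = \tilde{\beta}^{\lambda} \cdot z_p$, $\log(\tilde{\nu}_p) = \tilde{\beta}^{\nu} \cdot z_p$, $\log(\tilde{\mu}_p) = \tilde{\beta}^{\mu} \cdot z_p$, and we obtain $E_{\tilde{\beta}}(f_p \mid Y)$, $E_{\tilde{\beta}}(d_p \mid Y)$ analogously. Finally, combining (11), (10), (6) and denoting $\tilde{\theta}_p = (\tilde{\lambda}_p, \tilde{\nu}_p, \tilde{\mu}_p)$, the expected complete-data log likelihood up to a constant is equal to

$$E_{\tilde{\beta}}\{\ell_c(X, \beta) \mid Y\} \tag{12}$$

$$= \sum_{p=1}^{m}\Bigg[\sum_{j=1}^{n(p)}\Bigg\{ \frac{m^{+}_{X_p(t_{p,j-1}),X_p(t_{p,j})}(t_{p,j} - t_{p,j-1}; \tilde{\theta}_p)}{p_{X_p(t_{p,j-1}),X_p(t_{p,j})}(t_{p,j} - t_{p,j-1}; \tilde{\theta}_p)} \log \lambda_p + \frac{m^{\rightarrow}_{X_p(t_{p,j-1}),X_p(t_{p,j})}(t_{p,j} - t_{p,j-1}; \tilde{\theta}_p)}{p_{X_p(t_{p,j-1}),X_p(t_{p,j})}(t_{p,j} - t_{p,j-1}; \tilde{\theta}_p)} \log \nu_p + \frac{m^{-}_{X_p(t_{p,j-1}),X_p(t_{p,j})}(t_{p,j} - t_{p,j-1}; \tilde{\theta}_p)}{p_{X_p(t_{p,j-1}),X_p(t_{p,j})}(t_{p,j} - t_{p,j-1}; \tilde{\theta}_p)} \log \mu_p - \frac{m^{*}_{X_p(t_{p,j-1}),X_p(t_{p,j})}(t_{p,j} - t_{p,j-1}; \tilde{\theta}_p)}{p_{X_p(t_{p,j-1}),X_p(t_{p,j})}(t_{p,j} - t_{p,j-1}; \tilde{\theta}_p)} (\lambda_p + \mu_p + \nu_p) \Bigg\}\Bigg]. \tag{13}$$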

3.2 M-step

To complete an M-step, we use an efficient Newton-Raphson (N-R) algorithm to maximize the expectation $g(\beta) = E_{\tilde{\beta}}\{\ell_c(X, \beta) \mid Y\}$. Each N-R step recursively updates parameters using the following equation:

$$\beta_{\text{new}} = \beta_{\text{cur}} - \{H_g(\beta_{\text{cur}})\}^{-1} \nabla g(\beta_{\text{cur}}), \tag{14}$$

where $\nabla g$ denotes the gradient vector and $H_g$ denotes the Hessian matrix of $g(\beta)$. Fortunately, compact analytical forms for these quantities are available. First, we collect complete data sufficient statistics across processes into the following vectors:

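$$U^{T} = \left(E_{\tilde{\beta}}(b_{1,t_{1,n(1)}} \mid Y), \ldots, E_{\tilde{\beta}}(b_{m,t_{m,n(m)}} \mid Y)\right),$$
$$V^{T} = \left(E_{\tilde{\beta}}(f_{1,t_{1,n(1)}} \mid Y), \ldots, E_{\tilde{\beta}}(f_{m,t_{m,n(m)}} \mid Y)\right),$$
$$D^{T} = \left(E_{\tilde{\beta}}(d_{1,t_{1,n(1)}} \mid Y), \ldots, E_{\tilde{\beta}}(d_{m,t_{m,n(m)}} \mid Y)\right),$$
$$P^{T} = \left(E_{\tilde{\beta}}(R_{1,t_{1,n(1)}} \mid Y), \ldots, E_{\tilde{\beta}}(R_{m,t_{m,n(m)}} \mid Y)\right).$$

If we aggregate covariate vectors for each process in an $m \times c$ matrix $Z = (z_1, \ldots, z_m)^{T}$ and process-specific rates into vectors $\lambda = (\lambda_1, \ldots, \lambda_m)$, $\nu = (\nu_1, \ldots, \nu_m)$, $\mu = (\mu_1, \ldots, \mu_m)$, with $\circ$ denoting the elementwise product, then the gradient and Hessian can be expressed as

$$\nabla g(\beta) = \left( Z^{T}\{-\operatorname{diag}(P)\lambda + U\},\; Z^{T}\{-\operatorname{diag}(P)\nu + V\},\; Z^{T}\{-\operatorname{diag}(P)\mu + D\} \right), \tag{15}$$

$$H_g(\beta) = -\begin{pmatrix} Z^{T}\operatorname{diag}(P \circ \lambda) Z & 0 & 0 \\ 0 & Z^{T}\operatorname{diag}(P \circ \nu) Z & 0 \\ 0 & 0 & Z^{T}\operatorname{diag}(P \circ \mu) Z \end{pmatrix}. \tag{16}$$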

In our experience the M-step generally converges in fewer than ten N-R steps. Availability of closed form solutions (15) and (16) yields very fast execution of each N-R step — the computational cost of the M-step is negligible compared to the E-step.
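Because the Hessian is block diagonal, the three sets of coefficients can be updated separately. The sketch below runs Newton-Raphson for a single block (for instance the birth coefficients), maximizing $\sum_p \{U_p\, \beta \cdot z_p - P_p \exp(\beta \cdot z_p)\}$. The function names and the small Gaussian-elimination helper are assumptions for this illustration; a linear algebra library would normally replace the helper.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small systems)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def newton_raphson_block(Z, U, P, beta, iters=50, tol=1e-10):
    """Newton-Raphson for one rate block.

    Z: list of m covariate vectors (rows of the m x c matrix Z);
    U: expected event counts; P: expected particle times.
    """
    c = len(beta)
    for _ in range(iters):
        rate = [math.exp(sum(bk * zk for bk, zk in zip(beta, zp))) for zp in Z]
        # Gradient: Z^T (U - P * rate); Hessian: -Z^T diag(P * rate) Z.
        grad = [sum(zp[a] * (U[p] - P[p] * rate[p]) for p, zp in enumerate(Z))
                for a in range(c)]
        hess = [[-sum(zp[a] * zp[d] * P[p] * rate[p] for p, zp in enumerate(Z))
                 for d in range(c)] for a in range(c)]
        step = solve_linear(hess, grad)
        beta = [bk - sk for bk, sk in zip(beta, step)]
        if max(abs(sk) for sk in step) < tol:
            break
    return beta

# Intercept-only example: the maximizer is log(sum(U) / sum(P)) = log(0.1).
print(newton_raphson_block([[1.0]] * 3, [2.0, 1.0, 3.0], [10.0, 20.0, 30.0], [0.0]))
```

In the intercept-only case the update has the closed-form limit $\log(\sum_p U_p / \sum_p P_p)$, the familiar "events divided by exposure time" estimate, which gives a convenient check of the iteration.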

3.3 Accelerating E-step calculations for intervals with no change

In our birth-death-shift application, we may avoid the relatively costly E-step calculations for some intervals by approximating the probability of observing no changes with the probability that no event occurs in the underlying complete process. This leads to computational efficiency gains in settings such as our application where many intervals feature no observed changes.

It is very unlikely that events occur in a time interval $[t_1, t_2)$ yet no change is observed, so that $X(t_1) = X(t_2)$. For instance, if 12 elements are present initially and a death followed by a birth occur, then we almost always observe $X(t_1) = (12, 0)$, $X(t_2) = (11, 1)$ unless the element added by the birth occupies the exact location that was previously occupied by the element that dies. This scenario would leave the observed state unchanged, $X(t_1) = X(t_2) = (12, 0)$, but has exceedingly low probability: the already small but non-negligible probability that more than one event occurs is then multiplied by $1/(S - 11)$, the probability of the birth occurring in a specific location (recall $S$ is very large). Therefore, it is numerically accurate to treat intervals with no observed changes as if no changes in the latent continuous-time process occur. In this case, the transition probability is easily calculated, given by the tail of an exponential distribution, $p_{(12,0),(12,0)}(t_2 - t_1) = e^{-12(\lambda + \mu + \nu)(t_2 - t_1)}$. In addition to efficient closed-form transition probability calculation, the expected sufficient statistics necessary for the E-step are known in this setting. If no events occur, we know that $e^{+}_{(k,0),(k,0)}(t) = e^{\rightarrow}_{(k,0),(k,0)}(t) = e^{-}_{(k,0),(k,0)}(t) = 0$, and that the expected particle time is $e^{*}_{(k,0),(k,0)}(t) = kt$. This is not only faster computationally but also more numerically stable, avoiding the division of numerically calculated restricted moments by numerically calculated transition probabilities. We verify this efficient implementation in our simulation studies as illustrated in Figure 4.

Figure 4. The left plot shows converged log-likelihood values using EM, accelerated EM, and Nelder-Mead optimization. The right plot shows parameter estimates produced by the EM, accelerated EM, and Nelder-Mead algorithms, with true parameter values shown as crosses. This figure appears in color in the electronic version of this article.

3.4 Implementation

Our algorithms are implemented in R package bdsem, available at https://github.com/jasonxu90/bdsem. The EM algorithm implementation accommodates panel data settings with unevenly spaced discrete observations and includes functions for MLE inference using other methods, as well as code for simulating from the BDS process. The software is accompanied by a vignette that steps through simplified versions of all simulation studies included in this paper.

4. Results

We begin with a simulation study to check that transition probabilities for the two-type branching process described in Section 2.3 coincide with those of the BDS model. We compare our generating function computations to Monte Carlo estimates of transition probabilities obtained by simulating from the BDS model, and also include a comparison to the frequent monitoring (FM) method presented in Rosenberg et al. (2003).

4.1 Comparison with frequent monitoring

The FM model allows at most one event to occur per interval. Thus, over an observation interval $[t_i, t_{i+1})$ beginning with $k$ particles, the probabilities of a birth, death, and shift have closed forms $(\lambda/\eta)\{1 - e^{-k\eta(t_{i+1} - t_i)}\}$, $(\mu/\eta)\{1 - e^{-k\eta(t_{i+1} - t_i)}\}$, and $(\nu/\eta)\{1 - e^{-k\eta(t_{i+1} - t_i)}\}$ respectively, where $\eta = \lambda + \nu + \mu$. The probability of no event occurring is given by $e^{-k\eta(t_{i+1} - t_i)}$, so the four probabilities sum to one, and all other transition probabilities are zero under the FM assumption.
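The FM probabilities are simple enough to tabulate directly. The sketch below assumes that, under the at-most-one-event restriction, an event occurs with probability $1 - e^{-k\eta \Delta t}$ and is then a birth, death, or shift in proportion to the rates, so that the four probabilities sum to one; the function name `fm_probabilities` is hypothetical.

```python
import math

def fm_probabilities(k, lam, mu, nu, dt):
    """Frequent-monitoring probabilities over an interval of length dt,
    starting from k particles and allowing at most one event."""
    eta = lam + mu + nu
    p_none = math.exp(-k * eta * dt)   # no event in the interval
    p_event = 1.0 - p_none             # exactly one event (FM assumption)
    return {
        "none": p_none,
        "birth": (lam / eta) * p_event,
        "death": (mu / eta) * p_event,
        "shift": (nu / eta) * p_event,
    }

# Rates from the simulation study, initial population size 10.
for dt in (0.5, 10.0):
    print(dt, fm_probabilities(10, 0.0188, 0.0147, 0.00268, dt))
```

For short intervals almost all mass sits on "no event", while for long intervals the single-event outcomes absorb probability that the full model spreads over multi-event transitions, which is the source of the FM bias at larger interval lengths.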

We compute Monte Carlo approximations of transition probabilities from 2000 realizations of a BDS process without covariates, with rates λ = 0.0188, μ = 0.0147, ν = 0.00268 equal to estimates of transposable element birth, death, and shift rates obtained by Rosenberg et al. (2003) using FM. We begin each simulation with an initial population size of 10, and record the state of the process after simulating for dt units of time, varying dt from 0.5 to 10.

In Figure 2, we see that as the length of an observation interval increases, FM approximations become inaccurate, while those obtained using our method remain within the narrow Monte Carlo confidence intervals. However, notice that the probability that no event occurs remains accurate even under the FM approximation, supporting the efficient implementation of our EM algorithm described in Section 3.3. Figure C-1 in Appendix C demonstrates that our method also reliably calculates other transition probabilities that are set to 0 by the FM method, and these computations remain accurate when we vary the rates of the process.

Figure 2. Transition probability approximations. BDS transition probabilities are approximated with two methods: the FM method, shown with crosses, and the generating function method, depicted with triangles. We depict Monte Carlo estimates of the BDS transition probabilities with circles; vertical segments indicate their corresponding Monte Carlo confidence intervals. This figure appears in color in the electronic version of this article.

Further, these discrepancies in numerical transition probabilities between methods indeed translate to differences in estimated rates. To see this, we generate a partially observed dataset and infer rates using both methods. We simulate from the BDS process with λ = 0.07, μ = 0.12, ν = 0.02 to resemble the dynamics of the real dataset we will analyze in the next section, and record 200 discretely observed states of the process evenly spaced dt time units apart. Each simulated interval begins with an initial population size drawn uniformly between 1 and 15, and this data generating process is repeated to produce three datasets corresponding to inter-observation intervals of lengths dt = (0.2, 0.4, 0.6). We infer the MLE rates for each of the three discretely observed datasets using the generating function method and under the frequent monitoring assumption, and repeat the entire procedure for 200 trials.

In the top row of Figure 3, we see that our generating function approach yields MLEs that successfully recover the true rates, and coverage of 95% confidence intervals remains close to 0.95 as we increase the length of time intervals between observations. The FM method performs reasonably for shorter observation intervals, but the bias in these approximate MLEs becomes stark as dt increases, with 95% confidence interval coverage probability dropping as low as 0.24.

Figure 3.


MLE parameter estimates on simulated data. The top row displays estimates of global birth, death, and shift rates in the simple BDS for three datasets, with observation interval lengths dt = (0.2, 0.4, 0.6). True parameter values used to initialize simulations are marked by a horizontal dashed line, and results using the FM method are included in gray. Monte Carlo coverage probabilities for 95% confidence intervals are displayed above box plots. The bottom row displays estimated coefficients using EM in the BDS process with covariates, shifted by true values.

We can also check the accuracy of restricted moment computations via simulation by verifying the equality $E(N_t^+ \mid \mathbf{X}_0 = (i,j)) = \sum_{k,l} E(N_t^+, 1_{\mathbf{x}_t = (k,l)} \mid \mathbf{x}_0 = (i,j))$ for expected births, and analogous expressions for other expected sufficient statistics. The left hand side is empirically approximated by a Monte Carlo average of the number of births over many realizations of the process, while the restricted moments on the right hand side are computed via our generating function approach (see Appendix C, Figure C-2).

4.2 Estimation of parameters in BDS model with covariates

With accurate transition probabilities and restricted moments available, we are ready to infer coefficients in the BDS model with covariate-dependent rates using the EM algorithm. We begin by generating a simulated dataset resembling the real data, consisting of observations corresponding to 100 “patients,” each with three covariates $z_{p,1}, z_{p,2}, z_{p,3} \sim \mathrm{Unif}\{(0, 2) \times (6, 10) \times (4, 6)\}$. An illustration of the format of these data, which reflects the format of the real dataset we later analyze, is provided in Table 1. We then simulate patient-specific BDS processes, beginning with rates $\lambda_p, \nu_p, \mu_p$ log-linearly related to a true vector of coefficients $\boldsymbol{\beta}$. We collect between 2 and 7 observations per patient, each spaced dt = 0.4 apart. Each simulated observation interval begins with an initial population uniformly drawn between 2 and 14. Finally, we choose true values of the effect sizes so that, averaging over patients, the overall birth, shift, and death rates of the process are similar to previous studies (Rosenberg et al., 2003; Doss et al., 2013).

Table 1.

Visualization of data format with covariates $z_i$, i = 1, 2, 3.

Patient  Time  # Bands  Shift  z1   z2   z3
1        0     9        no     1.3  6.3  4.2
         0.4   9        no     1.3  6.3  4.2
         0.8   10       no     1.3  6.3  4.2
         1.2   10       no     1.3  6.3  4.2
         1.6   10       yes    1.3  6.3  4.2
         2.0   10       no     1.3  6.3  4.2

2        0     14       no     0.7  9.1  5.5
         0.4   14       no     0.7  9.1  5.5
         0.8   13       no     0.7  9.1  5.5

3
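The simulation design described above can be sketched as follows. The covariate ranges, the number of patients, the 2 to 7 observations per patient, and the spacing dt = 0.4 are taken from the text. The coefficient values in `BETA_DEMO`, the absence of an intercept, and all names are assumptions made for illustration, and are not the values used in the paper.

```python
import math
import random

# Covariate ranges quoted in the text: Unif{(0, 2) x (6, 10) x (4, 6)}
COVARIATE_BOUNDS = [(0.0, 2.0), (6.0, 10.0), (4.0, 6.0)]

# Hypothetical coefficients (NOT the paper's values): one per covariate,
# no intercept, for each of the three rates.
BETA_DEMO = {
    "birth": [0.10, -0.30, -0.20],
    "death": [0.20, -0.25, -0.10],
    "shift": [0.05, -0.40, -0.30],
}

def simulate_design(n_patients=100, dt=0.4, beta=BETA_DEMO, seed=0):
    """Draw covariates, patient-specific rates, and observation times."""
    rng = random.Random(seed)
    patients = []
    for p in range(1, n_patients + 1):
        z = [rng.uniform(lo, hi) for lo, hi in COVARIATE_BOUNDS]
        # log-linear link: log(rate) = sum_j beta_j * z_j
        rates = {
            name: math.exp(sum(b * zj for b, zj in zip(coefs, z)))
            for name, coefs in beta.items()
        }
        n_obs = rng.randint(2, 7)                 # 2 to 7 observations
        times = [j * dt for j in range(n_obs)]    # evenly spaced by dt
        patients.append({"patient": p, "z": z, "rates": rates, "times": times})
    return patients

patients = simulate_design()
```

Each record holds what a patient-specific BDS simulation would need. The band counts and shift indicators of Table 1 would then be produced by simulating the process between consecutive observation times.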

The algorithm is initialized with $\boldsymbol{\beta}_0 \sim N\{\boldsymbol{\beta}, \mathrm{diag}(0.5\boldsymbol{\beta})\}$, and the entire procedure of generating the dataset and inferring rates via EM is repeated 150 times. In the bottom row of Figure 3, we see that the MLEs are again unbiased estimates of the true values, with corresponding confidence interval coverage staying close to 95%. Our EM algorithm not only successfully recovers the true parameters, but also outperforms generic optimization. We choose to compare against the Nelder-Mead (NM) algorithm, as it proved to be the most robust among the methods available via the optim function in R; a similar choice of NM for comparison to EM implementations is motivated in Lange and Minin (2013). In this experiment, we generate one fixed dataset as described in the procedure above from the BDS model with covariates and initialize each method with identical initial parameter values and convergence criteria, using a relative tolerance of $\epsilon = 1 \times 10^{-6}$. We repeat this procedure over 100 sets of initial parameters.

Figure 4 displays the log-likelihood values achieved by each algorithm at convergence, as well as values at which Nelder-Mead terminated at an iteration limit set at 2000 steps. We see that in every case, the EM algorithm is significantly faster and finds a better optimum than NM. Further, the wide range of converged log-likelihood values suggests that NM is sensitive to initial conditions, an undesirable feature in this fixed data setting. We also verify that EM and accelerated EM arrive at the same parameter estimates and log-likelihood values at convergence up to specified relative tolerance. The increase in efficiency is not seen here: in these simulated examples, generating function computations are always performed and cached at each iteration, rather than bypassed for candidate intervals described in Section 3.3. In our application in the following section, we find that accelerated EM runs approximately six times as fast as its nonaccelerated counterpart.

Our EM approach is not only more stable in terms of the maximized log-likelihood, but also in terms of parameter estimates. The right panel of Figure 4 shows that estimates for each coefficient differ by no more than 0.01 across disparate initial conditions under both EM implementations, while a range of estimates are produced by the Nelder-Mead algorithm.

Note that for some coefficients, estimates produced by NM appear to lie closer to the “true” parameters used to generate the data. We believe this to be an artifact of centering initial parameter values for both algorithms around the true parameters. Indeed, MLEs corresponding to the likelihood surface of a given synthetic dataset generally do not coincide exactly with the inputs used to simulate the data. The fact that EM consistently finds a better optimum in terms of log-likelihood suggests that this is the case.

4.3 Mycobacterium tuberculosis transposable element evolution

We apply our EM algorithm to infer covariate-dependent birth, death, and shift rates of the M. tuberculosis transposon IS6110, a frequently used marker to track M. tuberculosis in the community (McEvoy et al., 2007). The marker serves as a DNA fingerprint, and in community-based studies patients that share the same or similar M. tuberculosis genotypes are considered as part of the same transmission chain (Van Embden et al., 1993; Kato-Maeda et al., 2011). However, such inference relies on a fairly precise understanding of within-host evolutionary dynamics: for instance, if a DNA marker changes very rapidly, isolates from the same source will be strongly differentiated, and the severity of outbreaks would be underestimated without accounting for the high change rate. Understanding the rates of change of IS6110-based genotypes is thus critical toward the interpretation and design of such studies (Tanaka and Rosenberg, 2001), which in turn provide important information toward designing policy decisions such as control and intervention programs.

We analyze data from an ongoing study of the transmission and pathogenesis of M. tuberculosis patients in a community study in San Francisco (Cattamanchi et al., 2006). The database includes all culture positive tuberculosis cases reported to the San Francisco Department of Public Health. We included patients with more than one M. tuberculosis isolate from specimens sampled more than 10 days apart, genotyped with IS6110 restriction fragment length polymorphism (RFLP) analysis. Our dataset contains 252 observation intervals corresponding to 196 unique patients observed at 452 time points. Average time between sampling times is 0.35 years, with the longest interval being 2.35 years. Of the 252 intervals, 29 feature endpoints with distinct genotypes. Additional summary statistics are included in Appendix D.

This dataset was analyzed by Rosenberg et al. (2003) under the FM assumption, but these authors necessarily discarded all intervals with more than one change in RFLP bands, as these intervals with “complex changes” are not possible under their restricted model. A later investigation by Doss et al. (2013) relaxes this assumption, allowing for multiple births or deaths to occur, but ignores RFLP band locations entirely, working instead only with total copy numbers evolving under a linear birth-death process. Under this birth-death model, the shift rate becomes unidentifiable, and the study instead infers covariate effects of birth and death rates. Our new method allows for a more principled and complete analysis, utilizing the full dataset without compromising any original modeling assumptions.

We begin by applying our EM algorithm to the simple BDS model with a single birth, death, and shift rate of IS6110 for all patients. We estimate the MLE rates $\hat{\lambda} = 0.0156$, $\hat{\nu} = 0.00426$, $\hat{\mu} = 0.0187$, with associated 95% confidence intervals (0.00929, 0.0251), (0.00145, 0.0125), and (0.0177, 0.0301) respectively; results were not sensitive to parameter initializations. These estimates are interpretable as the change rates of IS6110 per copy, per year, and our results are consistent with previous estimates in the literature: for all rates, confidence intervals overlap those obtained in the frequent monitoring approach of Rosenberg et al. (2003) as well as those obtained in the BD model (Doss et al., 2013). As in Doss et al. (2013), who estimate μ = 0.0207, our estimate $\hat{\mu}$ of the death rate, which allows for multiple events between observations, is higher than the value μ = 0.0147 obtained under the FM assumption. This is to be expected, as there are multiple intervals in which the IS6110 count drops by more than 1 in the dataset. Although confidence intervals overlap, our estimate $\hat{\nu}$ is noticeably higher than the previous result ν = 0.00268, with the upper end of our confidence interval almost twice as large as the upper end of the 95% FM confidence interval [0, 0.00654). Again, our analysis allows consideration of intervals that can be explained by at least two genotype changes that were either omitted in earlier studies or interpreted as a single birth event.

In addition to estimating the BDS rates globally, Doss et al. (2013) investigated rates as functions of several covariates in a panel data setting, and their findings in the birth-death framework suggest that M. tuberculosis lineage (Gagneux et al., 2006) may have a statistically significant effect on the rates of the process. We reexamine the effect of lineage on the rates in the full BDS model, considering 109 patients infected with Euro-American (EU) lineage strains, 54 patients with East-Asian (EA) strains, and 25 patients with Indo-Oceanic (IO) strains. We combine EU and IO lineages, because Doss et al. (2013) found that the number of IO samples was not sufficient to recover rates for this lineage. Following Doss et al. (2013), we also include HIV infection status of each patient (HIV) and drug resistance status of the M. tuberculosis strain (DR). These attributes are coded as binary covariates: $\mathrm{EI}_p = 1$ if patient p is infected with the EU or IO strain and 0 otherwise, so that intercept terms $\beta_0^{\lambda}, \beta_0^{\mu}, \beta_0^{\nu}$ correspond to the EA strain. The variable $\mathrm{HIV}_p = 1$ if patient p is infected with HIV and 0 otherwise, and $\mathrm{DR}_p = 1$ if patient p is infected with a drug-resistant strain, and 0 otherwise. Covariates are log-linearly related to birth, death, and shift rates: $\log\lambda_p = \beta_0^{\lambda} + \beta_1^{\lambda}\mathrm{EI}_p + \beta_2^{\lambda}\mathrm{HIV}_p + \beta_3^{\lambda}\mathrm{DR}_p$, $\log\mu_p = \beta_0^{\mu} + \beta_1^{\mu}\mathrm{EI}_p + \beta_2^{\mu}\mathrm{HIV}_p + \beta_3^{\mu}\mathrm{DR}_p$, $\log\nu_p = \beta_0^{\nu} + \beta_1^{\nu}\mathrm{EI}_p + \beta_2^{\nu}\mathrm{HIV}_p + \beta_3^{\nu}\mathrm{DR}_p$.

We estimate coefficients in the full log-linear model described above, as well as in several simpler models, using the EM algorithm. The simpler models differ from the full model by either excluding the HIV and DR covariates, or excluding all covariates for specified global or “simple” rates. For instance, the model labeled “Lineage only, simple ν” in Table 2 has five parameters $\boldsymbol{\beta} = (\beta_0^{\lambda}, \beta_1^{\lambda}, \beta_0^{\mu}, \beta_1^{\mu}, \beta^{\nu})$, and rates defined as $\log\lambda_p = \beta_0^{\lambda} + \beta_1^{\lambda}\mathrm{EI}_p$, $\log\nu_p = \log\nu = \beta^{\nu}$, $\log\mu_p = \beta_0^{\mu} + \beta_1^{\mu}\mathrm{EI}_p$. Estimates in all cases are not sensitive to starting values. A summary and model comparison via the Bayesian Information Criterion (BIC) (Schwarz, 1978) is included in Table 2, which selects the model including only the lineage covariate for modeling death rate μ. Coefficient estimates are displayed graphically for the full model as well as the best model selected by BIC in Figure 5. While we choose not to report coefficient estimates from each model for brevity, in all models, the confidence interval for $\beta_{EI}^{\mu}$ does not contain zero, indicating that strain lineage has a statistically significant effect on the death rate. The estimate $\hat{\beta}_{EI}^{\mu} = 2.028$ under the best model indicates that in Euro-American and Indo-Oceanic lineages loss of IS6110 element occurs exp(2.028) = 7.599 times faster than in their East-Asian counterpart. Our analysis affirms the result suggested by Doss et al. (2013) in the simpler BD framework: M. tuberculosis lineage needs to be taken into consideration when studying disease transmission using IS6110 genotypes.
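The multiplicative reading of the lineage coefficient follows directly from the log-linear form of the death rate. As a short worked step, written for a model in which lineage is the only covariate on μ (in the full model the same ratio holds with HIV and DR held fixed), and with the subscript labels on μ introduced here only for illustration:

```latex
\frac{\hat{\mu}_{\mathrm{EI}=1}}{\hat{\mu}_{\mathrm{EI}=0}}
  = \frac{\exp\bigl(\hat{\beta}_0^{\mu} + \hat{\beta}_{EI}^{\mu}\bigr)}
         {\exp\bigl(\hat{\beta}_0^{\mu}\bigr)}
  = \exp\bigl(\hat{\beta}_{EI}^{\mu}\bigr)
  = \exp(2.028) \approx 7.599 .
```

The intercept cancels, so the ratio does not depend on the baseline East-Asian death rate.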

Table 2.

Model comparison via BIC $= -2\hat{L} + k \ln n$. We also fit the log-linear model of Doss et al. (2013), which includes separate indicator variables for Euro-American and Indo-Oceanic lineages. Models described as “lineage only” do not include HIV, DR covariates, and rates described as “simple” are global to all patients, not influenced by covariates in the model.

Model                    # Par  Log-likelihood  BIC
Full, EU, IO lineages    15     −119.845        330.01
Full                     12     −120.498        313.25
Full, simple ν           9      −122.455        299.10
Lineage covariate only   6      −123.649        293.42
Lin. only, simple ν      5      −123.717        277.54
Lin. only, simple λ, ν   4      −124.472        273.02
Simple λ, ν, μ           3      −127.914        273.90

Figure 5.


Coefficient estimates and 95% confidence intervals in full model and best model according to BIC. Notice that the intervals corresponding to $\beta_{EI}^{\mu}$ do not contain 0.

5. Discussion

In this paper, we have developed an EM algorithm for inference in a discretely observed, multi-type branching process framework. We focus our attention on fitting BDS processes to panel data, driven by the problem of estimating evolutionary dynamics of IS6110 — a genetic marker that plays an important role in DNA fingerprinting of M. tuberculosis. Our method allows for log-rates to be linear combinations of many patient-specific covariates, and is flexible enough to capture the full range of dynamics between observation times by approximating the BDS process with a two-type branching process. To our knowledge, there is no other method of comparable accuracy for fitting BDS processes in this setting.

The generating functions and numerical techniques we derive to calculate previously unknown transition probabilities and restricted moments are helpful tools toward probabilistic characterization of such processes more generally. We demonstrate how our generating function approach leads to maximum likelihood estimation and evaluation of expected complete-data log-likelihood within an EM algorithm, but note that these calculations also arise in a variety of other statistical prediction and estimation techniques. For example, tractability of the likelihood via our methods allows for their use in Bayesian inference.

Several problems associated with our numerical methods remain open. First, although we empirically show that our branching process approximation to the discretely observed BDS likelihood is highly accurate, rigorous characterization of this approximation is lacking. Filling this theoretical gap is an interesting avenue for future research. Second, our method has potential numerical limitations in settings with very large populations. Computing transition probabilities to population sizes up to N of any particle type typically requires $N^p$ differential equations to be solved, where p is the number of particle types. Although efficient numerical solvers evaluate each ODE in fractions of a second, requiring millions of evaluations becomes prohibitive within an iterative algorithm. However, because the support of transition probabilities is often concentrated unless observation intervals are very long, future work may harness this sparsity to accelerate computations.

Our covariate-specific rate analysis reaffirms previous indication in the simplified BD framework that strain lineage has a significant effect on the death rate (Doss et al., 2013), although the large confidence intervals suggest that this lineage effect is somewhat marginal. Indeed, more data would be required to be certain of the result, but our principled analysis is reassuring in that any spurious findings can now be attributed to limited, noisy data rather than to model misspecification. The possibility of differences in rates of genetic marker evolution across lineages is important in epidemiological studies. For example, similar IS6110 genotypes across multiple individuals infected with EA lineage of M. tuberculosis do not provide strong evidence of these individuals belonging to the same transmission chain, because of the slow change rate of IS6110 in the EA lineage. Failing to account for this may lead to inferring false relationships among genotypically similar clusters of patients.

The BDS model we consider is general enough so that our methods can be applied to studying evolution of any transposable element. Such studies are not limited to infectious disease surveillance, as studying evolution of transposable elements in eukaryotes is also of great interest (Biémont, 2010). Beyond the BDS framework, the tools we develop for fitting branching processes are transferable to many settings. For example, our methodology is applicable to compartmental models, a class of well-known multi-type branching processes that finds applications in modeling cancerous growth, bacterial evolution, and cellular differentiation in systems such as hematopoiesis (Golinelli et al., 2006).


Figure 1.


Illustration of the three types of transposition—birth, death, shift—along a genome, represented by circles. Transposons, depicted by filled rectangles along the circles/genomes, correspond to observable gel bands, denoted by horizontal lines in the rectangles next to each circle diagram. Numbers within each circle represent each configuration X(t) in the notation introduced in Section 2.2. More specifically, we call the gel band on the left our initial configuration and set the number of particles of type 1 to the number of bands, 5, and the number of particles of type 2 to 0. On the right set of diagrams, a birth event keeps the number of type 1 particles intact and increments the number of type 2 particles by one, a death event changes the number of type 1 particles from five to four and keeps the number of type 2 particles at zero, and finally a shift event decreases the number of type 1 particles by one and increases the number of type 2 particles by one.

Acknowledgements

VNM was supported by the NIH grants R01-AI107034 and U54-GM111274. JX was supported by an NDSEG fellowship.

Footnotes

6. Supplementary Materials: Web Appendices and Figures referenced in Sections 2.3, 3, and 4.1 are available with this paper at the Biometrics website on Wiley Online Library.

References

  1. Bailey NTJ. The Elements of Stochastic Processes; with Applications to the Natural Sciences. Wiley; New York: 1964.
  2. Biémont C. A brief history of the status of transposable elements: From junk DNA to major players in evolution. Genetics. 2010;186(4):1085–1093. doi: 10.1534/genetics.110.124180.
  3. Catlin SN, Abkowitz JL, Guttorp P. Statistical inference in a two-compartment model for hematopoiesis. Biometrics. 2001;57(2):546–553. doi: 10.1111/j.0006-341x.2001.00546.x.
  4. Cattamanchi A, Hopewell PC, Gonzalez LC, Osmond DH, Masae Kawamura L, Daley CL, Jasmer RM. A 13-year molecular epidemiological analysis of tuberculosis in San Francisco. The International Journal of Tuberculosis and Lung Disease. 2006;10(3):297–304.
  5. Crawford F, Suchard MA. Transition probabilities for general birth-death processes with applications in ecology, genetics, and evolution. Journal of Mathematical Biology. 2012;65(3):553–580. doi: 10.1007/s00285-011-0471-z.
  6. Crawford FW, Minin VN, Suchard MA. Estimation for general birth-death processes. Journal of the American Statistical Association. 2014;109(506):730–747. doi: 10.1080/01621459.2013.866565.
  7. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological) 1977;39(1):1–38.
  8. Doss CR, Suchard MA, Holmes I, Kato-Maeda MM, Minin VN. Fitting birth–death processes to panel data with applications to bacterial DNA fingerprinting. The Annals of Applied Statistics. 2013;7(4):2315–2335. doi: 10.1214/13-AOAS673.
  9. Gagneux S, DeRiemer K, Van T, Kato-Maeda M, de Jong BC, Narayanan S, Nicol M, Niemann S, Kremer K, Gutierrez MC, et al. Variable host–pathogen compatibility in Mycobacterium tuberculosis. Proceedings of the National Academy of Sciences of the United States of America. 2006;103(8):2869–2873. doi: 10.1073/pnas.0511240103.
  10. Golinelli D, Guttorp P, Abkowitz JA. Bayesian inference in a hidden stochastic two-compartment model for feline hematopoiesis. Mathematical Medicine and Biology. 2006;23(3):153–172. doi: 10.1093/imammb/dql008.
  11. Guttorp P. Stochastic modeling of scientific data. CRC Press; 1995.
  12. Henrici P. Fast Fourier methods in computational complex analysis. SIAM Review. 1979;21(4):481–527.
  13. Huber M. Spatial birth–death swap chains. Bernoulli. 2012;18(3):1031–1041.
  14. Illian J, Penttinen A, Stoyan H, Stoyan D. Statistical analysis and modelling of spatial point patterns. Volume 70. John Wiley & Sons; 2008.
  15. Kato-Maeda M, Metcalfe JZ, Flores L. Genotyping of Mycobacterium tuberculosis: application in epidemiologic studies. Future Microbiology. 2011;6(2):203–216. doi: 10.2217/fmb.10.165.
  16. Keiding N. Maximum likelihood estimation in the birth-and-death process. The Annals of Statistics. 1975;3(2):363–372.
  17. Lange JM, Minin VN. Fitting and interpreting continuous-time latent Markov models for panel data. Statistics in Medicine. 2013;32(26):4581–4595. doi: 10.1002/sim.5861.
  18. McEvoy CRE, van Pittius NCG, Victor TC, van Helden PD, Warren RM. The role of IS6110 in the evolution of Mycobacterium tuberculosis. Tuberculosis. 2007;87(5):393–404. doi: 10.1016/j.tube.2007.05.010.
  19. Minin VN, Suchard MA. Counting labeled transitions in continuous-time Markov models of evolution. Journal of Mathematical Biology. 2008;56(3):391–412. doi: 10.1007/s00285-007-0120-8.
  20. Renshaw E. Stochastic Population Processes: Analysis, Approximations, Simulations. Oxford University Press; Oxford, UK: 2011.
  21. Rosenberg NA, Tsolaki AG, Tanaka MM. Estimating change rates of genetic markers using serial samples: applications to the transposon IS6110 in Mycobacterium tuberculosis. Theoretical Population Biology. 2003;63(4):347–363. doi: 10.1016/s0040-5809(03)00010-8.
  22. Schwarz G. Estimating the dimension of a model. The Annals of Statistics. 1978;6(2):461–464.
  23. Tanaka MM, Rosenberg NA. Optimal estimation of transposition rates of insertion sequences for molecular epidemiology. Statistics in Medicine. 2001;20(16):2409–2420. doi: 10.1002/sim.910.
  24. Van Embden JD, Cave MD, Crawford JT, Dale JW, Eisenach KD, Gicquel B, Hermans P, Martin C, McAdam R, Shinnick TM. Strain identification of Mycobacterium tuberculosis by DNA fingerprinting: recommendations for a standardized methodology. Journal of Clinical Microbiology. 1993;31(2):406–409. doi: 10.1128/jcm.31.2.406-409.1993.
