Journal of Applied Statistics. 2023 Feb 20;51(6):1131–1150. doi: 10.1080/02664763.2023.2179567

Epidemic change-point detection in general integer-valued time series

Mamadou Lamine Diop, William Kengne
PMCID: PMC11018010  PMID: 38628444

ABSTRACT

In this paper, we consider the structural change in a class of discrete valued time series, where the true conditional distribution of the observations is assumed to be unknown. The conditional mean of the process depends on a parameter θ which may change over time. We provide sufficient conditions for the consistency and the asymptotic normality of the Poisson quasi-maximum likelihood estimator (QMLE) of the model. We consider an epidemic change-point detection and propose a test statistic based on the QMLE of the parameter. Under the null hypothesis of a constant parameter (no change), the test statistic converges to a distribution obtained from increments of a Brownian bridge. The test statistic diverges to infinity under the epidemic alternative, which establishes that the proposed procedure is consistent in power. The effectiveness of the proposed procedure is illustrated by simulated and real data examples.

KEYWORDS: Discrete valued time series, change-point detection, epidemic alternative, semi-parametric statistic, Poisson QMLE

MATHEMATICS SUBJECT CLASSIFICATIONS: 62M10, 62F03, 62F12

1. Introduction

Change-point detection is a vast and active field of research, with applications in several areas such as epidemiology, finance, ecology and biology. This paper focuses on the epidemic change-point problem (see, for instance, [18,26]) in a large class of integer-valued time series. The epidemic change-point problem involves testing the null hypothesis of no change against the alternative that two changes occur in the data-generating process, where the first and the third segments share the same structure, which differs from that of the second segment.

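Assume that $\{Y_t, t\in\mathbb{Z}\}$ is a time series of counts and denote by $\mathcal{F}_{t-1}=\sigma\{Y_{t-1},\ldots;X_{t-1},\ldots\}$ the σ-field generated by the whole past at time $t-1$. Let $\Theta$ be a fixed compact subset of $\mathbb{R}^d$ ($d\in\mathbb{N}$) and $T\subseteq\mathbb{Z}$. For any $\theta\in\Theta$, define the class of integer-valued time series given by

Class $\mathcal{OD}_T(f_\theta)$: The process $Y=\{Y_t, t\in\mathbb{Z}\}$ belongs to $\mathcal{OD}_T(f_\theta)$ if it satisfies:

\[ E(Y_t\mid\mathcal{F}_{t-1})=f_\theta(Y_{t-1},Y_{t-2},\ldots)\quad\forall t\in T, \tag{1} \]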

where $f_\theta(\cdot)$ is a measurable non-negative function, assumed to be known up to the parameter $\theta$. Ahmad and Francq [1] addressed inference in the semiparametric setting for the class $\mathcal{OD}_{\mathbb{Z}}(f_\theta)$, whereas [8,9] focused on model selection and multiple change-point problems in this class. Note that numerous classical integer-valued time series models belong to the class $\mathcal{OD}_{\mathbb{Z}}(f_\theta)$: for instance, the Poisson INGARCH models (see for instance [12]), the negative binomial INGARCH models (proposed by [27]), the binomial INGARCH models (see [25]), the Poisson exponential autoregressive models (see [13]) and the INAR models (see [23,24]).

Firstly, we consider the class $\mathcal{OD}_{\mathbb{Z}}(f_{\theta^*})$ for some $\theta^*\in\Theta$ and carry out inference on the parameter $\theta^*$. The consistency and the asymptotic normality of the Poisson quasi-maximum likelihood estimator (QMLE) are addressed. These results are the same as those obtained by [1], but the conditions set here seem to be more straightforward than those needed by these authors (see Section 2 below).

Secondly, we focus on the test with an epidemic alternative, for detecting changes in the parameter of the class $\mathcal{OD}_{\mathbb{Z}}(f_\theta)$. The principle is that the parameter changes at time $t_1$ and then returns to its original value after a time $t_2$; that is, an 'epidemic' occurs between $t_1$ and $t_2$. This question has been addressed in several works; see, among others, [2,3,5,6,14–16,18–21,26]. Most of these procedures are developed for epidemic change-point detection in the mean of random variables. Moreover, the case of count time series has not received much attention in the literature, although these models are very useful in many fields (see Section 5 for an application to the number of hospital admissions). For the general class $\mathcal{OD}_{\mathbb{Z}}(f_\theta)$, we propose a test procedure based on the Poisson QMLE for detecting an epidemic change in the parameter $\theta$. Under the null hypothesis (no change), the test statistic converges to a distribution obtained from increments of a Brownian bridge; this test statistic diverges to infinity under the epidemic alternative.

The paper is organized as follows. In Section 2, we set some assumptions, define the Poisson QMLE and establish its asymptotic properties. Section 3 is devoted to the construction of the test statistic and the asymptotic studies under the null and the epidemic alternative. Some simulation results are displayed in Section 4. Section 5 focuses on a real data example and Section 6 provides the proofs of the main results.

2. Assumptions and Poisson QMLE

Throughout the sequel, the following notations will be used:

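  • $\|x\|:=\sqrt{\sum_{i=1}^{p}|x_i|^2}$, for any $x\in\mathbb{R}^p$;

  • $\|x\|:=\sqrt{\sum_{i=1}^{p}\sum_{j=1}^{q}|x_{i,j}|^2}$, for any matrix $x=(x_{i,j})\in M_{p,q}(\mathbb{R})$, where $M_{p,q}(\mathbb{R})$ denotes the set of matrices of dimension $p\times q$ with coefficients in $\mathbb{R}$;

  • $\|f\|_\Theta:=\sup_{\theta\in\Theta}(\|f(\theta)\|)$ for any function $f:\Theta\to M_{p,q}(\mathbb{R})$;

  • $\|Y\|_r:=E(\|Y\|^r)^{1/r}$, where $Y$ is a random vector with finite $r$-order moments;

  • $T_{\ell,\ell'}=\{\ell,\ell+1,\ldots,\ell'\}$ for any $(\ell,\ell')\in\mathbb{N}^2$ such that $\ell\le\ell'$;

  • $\mathbb{N}_0=\mathbb{N}\cup\{0\}$.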

In the sequel, we will denote by 0 the null vector of any vector space. Consider the following classical contraction condition on the function $f_\theta$.

Assumption 1

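$A_i(\Theta)$ (i = 0, 1, 2): For any $y\in\mathbb{N}_0^{\mathbb{N}}$, the function $\theta\mapsto f_\theta(y)$ is $i$ times continuously differentiable on $\Theta$ with $\|\partial^i f_\theta(0)/\partial\theta^i\|_\Theta<\infty$; and there exists a sequence of non-negative real numbers $(\alpha_k^{(i)})_{k\ge1}$ satisfying $\sum_{k=1}^{\infty}\alpha_k^{(0)}<1$ (or $\sum_{k=1}^{\infty}\alpha_k^{(i)}<\infty$ for i = 1, 2), such that for any $y,y'\in\mathbb{N}_0^{\mathbb{N}}$,

\[ \sup_{\theta\in\Theta}\Big\|\frac{\partial^i f_\theta(y)}{\partial\theta^i}-\frac{\partial^i f_\theta(y')}{\partial\theta^i}\Big\|\le\sum_{k=1}^{\infty}\alpha_k^{(i)}|y_k-y'_k|. \]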
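
As an illustration of the case i = 0 (a worked example added here under stated assumptions, not taken from the original text), consider the INGARCH(1,1) recursion $\lambda_t=\omega+\alpha Y_{t-1}+\beta\lambda_{t-1}$ with $\theta=(\omega,\alpha,\beta)$, $\alpha\ge0$ and $0\le\beta<1$ on $\Theta$. Iterating the recursion expresses the conditional mean as a function of the past observations:

\[ f_\theta(y)=\frac{\omega}{1-\beta}+\alpha\sum_{k\ge1}\beta^{k-1}y_k,\qquad |f_\theta(y)-f_\theta(y')|\le\sum_{k\ge1}\alpha\beta^{k-1}|y_k-y'_k|, \]

so one may take $\alpha_k^{(0)}=\sup_{\theta\in\Theta}\alpha\beta^{k-1}$. Writing $\bar\alpha:=\sup_\Theta\alpha$ and $\bar\beta:=\sup_\Theta\beta$ (notation introduced only for this example), a sufficient condition for the contraction requirement is $\bar\alpha+\bar\beta<1$, since then

\[ \sum_{k\ge1}\alpha_k^{(0)}\le\frac{\bar\alpha}{1-\bar\beta}<1. \]

These coefficients decay geometrically, hence faster than any power of $k$.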

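In the whole paper, it is assumed that any $\{Y_t,t\in\mathbb{Z}\}$ belonging to $\mathcal{OD}_T(f_\theta)$ is a stationary and ergodic process satisfying:

\[ \exists C>0,\ \epsilon>0\ \text{such that}\ \forall t\in\mathbb{Z},\ E Y_t^{1+\epsilon}<C. \tag{2} \]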

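Let $(Y_1,\ldots,Y_k)\in\mathcal{OD}_{\{1,\ldots,k\}}(f_{\theta^*})$ be a trajectory with $k\ge1$ and $\theta^*\in\Theta$. Then, for any subset $T\subset\{1,\ldots,k\}$, the conditional Poisson (quasi)log-likelihood computed on $T$ is given (up to a constant) by

\[ L(T,\theta):=\sum_{t\in T}\ell_t(\theta)\quad\text{with}\quad\ell_t(\theta)=Y_t\log\lambda_t(\theta)-\lambda_t(\theta), \]

where $\lambda_t(\theta)=f_\theta(Y_{t-1},Y_{t-2},\ldots)$. This conditional (quasi)log-likelihood is approximated (see also [1]) by

\[ \hat L(T,\theta):=\sum_{t\in T}\hat\ell_t(\theta)\quad\text{with}\quad\hat\ell_t(\theta)=Y_t\log\hat\lambda_t(\theta)-\hat\lambda_t(\theta), \tag{3} \]

where $\hat\lambda_t(\theta)=f_\theta(Y_{t-1},\ldots,Y_1,0,\ldots)$. According to (3), the Poisson QMLE of $\theta^*$ computed on $T$ is defined by

\[ \hat\theta(T):=\operatorname{argmax}_{\theta\in\Theta}\big(\hat L(T,\theta)\big). \tag{4} \]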
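
The definitions (3) and (4) can be made concrete with a small sketch. It assumes the INARCH(1) conditional mean $\lambda_t=\omega+\alpha Y_{t-1}$ (the representation used later for the real data), replaces the unobserved $Y_0$ by 0 as in $\hat\lambda_t$, and maximizes the quasi log-likelihood with a damped Newton iteration. The function names, the starting point and the optimizer are illustrative assumptions; the authors' R implementation is not described at this level of detail.

```python
import math
import random


def quasi_loglik(theta, y, start, end):
    """Poisson quasi log-likelihood (3) on T_{start,end} for lambda_t = omega + alpha * Y_{t-1}."""
    omega, alpha = theta
    total = 0.0
    for t in range(start, end + 1):              # t is 1-based, as in the text
        prev = y[t - 2] if t >= 2 else 0.0       # the unobserved Y_0 is replaced by 0
        lam = omega + alpha * prev
        total += y[t - 1] * math.log(lam) - lam
    return total


def poisson_qmle_inarch1(y, start, end, max_iter=100, tol=1e-10):
    """Damped Newton ascent for the QMLE (4), restricted to omega > 0 and alpha >= 0."""
    seg_mean = sum(y[start - 1:end]) / (end - start + 1)
    theta = (max(0.5 * seg_mean, 0.1), 0.5)      # crude starting point (assumption)
    value = quasi_loglik(theta, y, start, end)
    for _ in range(max_iter):
        g0 = g1 = a00 = a01 = a11 = 0.0
        for t in range(start, end + 1):
            prev = y[t - 2] if t >= 2 else 0.0
            lam = theta[0] + theta[1] * prev
            r = y[t - 1] / lam - 1.0             # score weight
            w = y[t - 1] / (lam * lam)           # weight of minus the Hessian
            g0 += r
            g1 += r * prev
            a00 += w
            a01 += w * prev
            a11 += w * prev * prev
        det = a00 * a11 - a01 * a01
        if det <= 0.0:
            break                                # degenerate segment: keep current value
        d0 = (a11 * g0 - a01 * g1) / det         # Newton direction (-Hessian)^{-1} * score
        d1 = (a00 * g1 - a01 * g0) / det
        step, accepted = 1.0, None
        while step > 1e-8 and accepted is None:  # halve the step until it is admissible
            cand = (theta[0] + step * d0, theta[1] + step * d1)
            if cand[0] > 1e-8 and cand[1] >= 0.0:
                cand_value = quasi_loglik(cand, y, start, end)
                if cand_value >= value:
                    accepted = (cand, cand_value)
            step /= 2.0
        if accepted is None:
            break
        gain = accepted[1] - value
        theta, value = accepted
        if gain < tol:
            break
    return theta


def rpois(lam, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small means)."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_inarch1(n, omega, alpha, rng, burn_in=100):
    """Hypothetical helper: Poisson INARCH(1) path used only to exercise the estimator."""
    y, prev = [], 0
    for t in range(n + burn_in):
        prev = rpois(omega + alpha * prev, rng)
        if t >= burn_in:
            y.append(prev)
    return y
```

Because $\lambda_t$ is linear in $(\omega,\alpha)$, the quasi log-likelihood is concave in this model, which is why a plain Newton scheme with step halving is a reasonable choice for the sketch; a general $f_\theta$ would call for a generic constrained optimizer.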

When $(Y_1,\ldots,Y_n)$ is a trajectory of a process $\{Y_t,t\in\mathbb{Z}\}$ belonging to $\mathcal{OD}_{\mathbb{Z}}(f_{\theta^*})$, we impose the following assumptions to study the asymptotic behavior of the Poisson QMLE.

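  • (A0): for all $(\theta,\theta')\in\Theta^2$, $\big(f_\theta(Y_{t-1},Y_{t-2},\ldots)\overset{a.s.}{=}f_{\theta'}(Y_{t-1},Y_{t-2},\ldots)\ \text{for some}\ t\in\mathbb{N}\big)\Rightarrow\theta=\theta'$; moreover, $\exists\,\underline{c}>0$ such that $\inf_{\theta\in\Theta}f_\theta(y)\ge\underline{c}$, for all $y\in\mathbb{N}_0^{\mathbb{N}}$;

  • (A1): $\theta^*$ is an interior point of $\Theta\subset\mathbb{R}^d$;

  • (A2): for all $c\in\mathbb{R}^d$, $c'\,\frac{\partial\lambda_t(\theta^*)}{\partial\theta}\overset{a.s.}{=}0\Rightarrow c=0$.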

The above assumptions allow us to establish the consistency and the asymptotic normality of the QMLE. Such results have already been obtained by [1]. However, the conditions needed in the following proposition seem to be more straightforward. See also Remark 2 in [10], which points out that many of the assumptions in [1] (for instance, (4), (11), (12), (14), (15)) can be obtained from the Lipschitz-type conditions above.

Proposition 2.1

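Assume that $(Y_1,\ldots,Y_n)$ is a trajectory of a process $\{Y_t,t\in\mathbb{Z}\}$ belonging to $\mathcal{OD}_{\mathbb{Z}}(f_{\theta^*})$. Let $(j_n)_{n\ge1}$ and $(k_n)_{n\ge1}$ be two integer valued sequences such that $j_n<k_n$, $k_n\to\infty$ and $k_n-j_n\to\infty$ as $n\to\infty$.

  • (i)
    If (A0), $A_0(\Theta)$ and (2) (with $\epsilon\ge1$) hold with
    \[ \alpha_k^{(0)}=O(k^{-\gamma})\ \text{for some}\ \gamma>3/2, \tag{5} \]
    then
    \[ \hat\theta(T_{j_n,k_n})\xrightarrow[n\to\infty]{a.s.}\theta^*. \]
  • (ii)
    If (A0)–(A2), $A_i(\Theta)$ (i = 0, 1, 2) and (2) (with $\epsilon\ge3$) hold with
    \[ \alpha_k^{(0)}+\alpha_k^{(1)}=O(k^{-\gamma})\ \text{for some}\ \gamma>3/2, \tag{6} \]
    then
    \[ \sqrt{k_n-j_n}\,\big(\hat\theta(T_{j_n,k_n})-\theta^*\big)\xrightarrow[n\to\infty]{\mathcal{D}}\mathcal{N}(0,\Sigma^{-1})\quad\text{with}\quad\Sigma:=J_{\theta^*}I_{\theta^*}^{-1}J_{\theta^*}, \]
    where
    \[ J_{\theta^*}=E\Big[\frac{1}{\lambda_0(\theta^*)}\frac{\partial\lambda_0(\theta^*)}{\partial\theta}\frac{\partial\lambda_0(\theta^*)}{\partial\theta'}\Big]\quad\text{and}\quad I_{\theta^*}=E\Big[\Big(\frac{Y_0}{\lambda_0(\theta^*)}-1\Big)^2\frac{\partial\lambda_0(\theta^*)}{\partial\theta}\frac{\partial\lambda_0(\theta^*)}{\partial\theta'}\Big]. \]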

This proposition will be proved by relying on some results which have already been established in [11] without using the assumption of ‘conditional Poisson distribution’.

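For any $\ell,\ell'\in\mathbb{N}$ with $\ell\le\ell'$, define the following matrices:

\[ \hat J(T_{\ell,\ell'})=\frac{1}{\ell'-\ell+1}\sum_{t\in T_{\ell,\ell'}}\frac{1}{\hat\lambda_t(\hat\theta(T_{\ell,\ell'}))}\frac{\partial\hat\lambda_t(\hat\theta(T_{\ell,\ell'}))}{\partial\theta}\frac{\partial\hat\lambda_t(\hat\theta(T_{\ell,\ell'}))}{\partial\theta'}, \]
\[ \hat I(T_{\ell,\ell'})=\frac{1}{\ell'-\ell+1}\sum_{t\in T_{\ell,\ell'}}\Big(\frac{Y_t}{\hat\lambda_t(\hat\theta(T_{\ell,\ell'}))}-1\Big)^2\frac{\partial\hat\lambda_t(\hat\theta(T_{\ell,\ell'}))}{\partial\theta}\frac{\partial\hat\lambda_t(\hat\theta(T_{\ell,\ell'}))}{\partial\theta'}. \]

According to (A2), one can easily show that the matrices $I_{\theta^*}$ and $J_{\theta^*}$ are symmetric and positive definite. Further, part (i) of Proposition 2.1 implies the almost sure convergence of $\hat J(T_{1,n})$ and $\hat I(T_{1,n})$ to $J_{\theta^*}$ and $I_{\theta^*}$, respectively. Therefore, $\hat\Sigma_n=\hat J(T_{1,n})\hat I(T_{1,n})^{-1}\hat J(T_{1,n})$ is a consistent estimator of the covariance matrix $\Sigma$.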
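
A minimal sketch of these plug-in matrices, again assuming the INARCH(1) mean $\hat\lambda_t(\theta)=\omega+\alpha Y_{t-1}$ (with $Y_0$ replaced by 0), for which $\partial\hat\lambda_t/\partial\theta=(1,Y_{t-1})'$. The function name is hypothetical, and the estimate `theta_hat` is taken as given rather than recomputed.

```python
def inarch1_sandwich(y, start, end, theta_hat):
    """Return (J_hat, I_hat, Sigma_hat) on T_{start,end}, with Sigma_hat = J_hat I_hat^{-1} J_hat."""
    omega, alpha = theta_hat
    m = end - start + 1
    J = [[0.0, 0.0], [0.0, 0.0]]
    I = [[0.0, 0.0], [0.0, 0.0]]
    for t in range(start, end + 1):              # t is 1-based, as in the text
        prev = y[t - 2] if t >= 2 else 0.0       # the unobserved Y_0 is replaced by 0
        lam = omega + alpha * prev
        grad = (1.0, prev)                       # gradient of lambda_t in (omega, alpha)
        w_j = 1.0 / lam
        w_i = (y[t - 1] / lam - 1.0) ** 2
        for a in range(2):
            for b in range(2):
                J[a][b] += w_j * grad[a] * grad[b] / m
                I[a][b] += w_i * grad[a] * grad[b] / m

    det = I[0][0] * I[1][1] - I[0][1] * I[1][0]  # assumes I_hat is invertible
    I_inv = [[I[1][1] / det, -I[0][1] / det],
             [-I[1][0] / det, I[0][0] / det]]

    def matmul(A, B):
        return [[sum(A[a][k] * B[k][b] for k in range(2)) for b in range(2)]
                for a in range(2)]

    sigma = matmul(matmul(J, I_inv), J)
    return J, I, sigma
```

For instance, `inarch1_sandwich([1, 2, 3], 2, 3, (1.0, 0.5))` uses the two terms t = 2 (with $\hat\lambda_2=1.5$) and t = 3 (with $\hat\lambda_3=2$), which makes the averages easy to check by hand.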

3. Change-point test and asymptotic results

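Assume that $(Y_1,\ldots,Y_n)$ is an observed trajectory of the process $\{Y_t,t\in\mathbb{Z}\}$ and that we would like to test the null hypothesis of constant parameter

  • H0:

    $(Y_1,\ldots,Y_n)$ is a trajectory of the stationary process $\{Y_t,t\in\mathbb{Z}\}\in\mathcal{OD}_{\mathbb{Z}}(f_{\theta_1})$ with $\theta_1\in\Theta$,

    against the epidemic alternative

  • H1:

    there exists $(\theta_1,\theta_2,t_1,t_2)\in\Theta^2\times\{2,3,\ldots,n-1\}^2$ (with $\theta_1\ne\theta_2$ and $t_1<t_2$) such that $(Y_1,\ldots,Y_{t_1})$ and $(Y_{t_2+1},\ldots,Y_n)$ are trajectories of a process $\{Y_t^{(1)},t\in\mathbb{Z}\}\in\mathcal{OD}_{\mathbb{Z}}(f_{\theta_1})$, and $(Y_{t_1+1},\ldots,Y_{t_2})$ is a trajectory of a process $\{Y_t^{(2)},t\in\mathbb{Z}\}\in\mathcal{OD}_{\mathbb{Z}}(f_{\theta_2})$.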

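We derive a retrospective test procedure in a semi-parametric setting, with a statistic based on the Poisson QMLE. Suppose that $(u_n)_{n\ge1}$ and $(v_n)_{n\ge1}$ are two integer valued sequences such that: $u_n=o(n)$, $v_n=o(n)$ and $u_n,v_n\to+\infty$ as $n\to\infty$. For all $n\ge1$, define the matrix

\[ \hat\Sigma(u_n)=\frac13\Big[\hat J(T_{1,u_n})\hat I(T_{1,u_n})^{-1}\hat J(T_{1,u_n})+\hat J(T_{u_n+1,n-u_n})\hat I(T_{u_n+1,n-u_n})^{-1}\hat J(T_{u_n+1,n-u_n})+\hat J(T_{n-u_n+1,n})\hat I(T_{n-u_n+1,n})^{-1}\hat J(T_{n-u_n+1,n})\Big] \tag{7} \]

and the subset

\[ \mathcal{T}_n=\big\{(k_1,k_2)\in([v_n,n-v_n]\cap\mathbb{N})^2\ \text{with}\ k_2-k_1\ge v_n\big\}. \tag{8} \]

For all $(k_1,k_2)\in\mathcal{T}_n$, we introduce

\[ C_{n,k_1,k_2}=\frac{k_2-k_1}{n^{3/2}}\Big[(n-(k_2-k_1))\,\hat\theta(T_{k_1+1,k_2})-k_1\hat\theta(T_{1,k_1})-(n-k_2)\hat\theta(T_{k_2+1,n})\Big]. \tag{9} \]

Consider thus the test statistic given by

\[ \hat Q_n=\max_{(k_1,k_2)\in\mathcal{T}_n}\hat Q_{n,k_1,k_2}\quad\text{with}\quad\hat Q_{n,k_1,k_2}=C'_{n,k_1,k_2}\,\hat\Sigma(u_n)\,C_{n,k_1,k_2}. \tag{10} \]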

Note that the sequences $\{u_n\}$ and $\{v_n\}$ play a very important role in the construction of the proposed test statistic $\hat Q_n$. In the theoretical study, they make it possible to apply the convergence results of Proposition 2.1 when establishing the consistency of the proposed procedure. For example, $\{v_n\}$ ensures that the lengths of $T_{k_1+1,k_2}$, $T_{k_2+1,n}$ and $T_{1,k_1}$ are sufficiently large (see (8)), whereas $\{u_n\}$ ensures that the length of $T_{1,u_n}$ (see (7)) is large enough as $n\to\infty$. In practical applications, they ensure that the lengths of $T_{k_1+1,k_2}$, $T_{k_2+1,n}$, $T_{1,k_1}$ and $T_{1,u_n}$ are not too small, so that the numerical algorithm used to compute the estimators on these segments converges. See also [11,17] for further comments on such sequences. The matrix $\hat\Sigma(u_n)$ is also essential for proving the asymptotic properties of $\hat Q_n$, because: (i) under H0, each of the three matrices in the formula of $\hat\Sigma(u_n)$ converges almost surely to the covariance matrix $\Sigma$ and (ii) under the epidemic alternative, the first and third matrices converge to the covariance matrix of the stationary model of the first regime (which is also that of the third regime), which is positive definite. The consistency of the second matrix is not ensured under the alternative, but it is positive semi-definite.

Note that a weight function can be used to increase the power of the test procedure based on the statistic $\hat Q_n$; see, for instance, [7,10,11] for some examples. The statistic $\hat Q_n$ can be seen as an extension to any parameter of the test statistic proposed by [19] (statistic UI(n,ρ)), [16] (statistic T12), [5] (statistic Tn(α,β)) or [2] (statistic TnB2) in the context of mean change analysis. Indeed, in the particular case of change-point detection in the mean, with $\hat\theta(T)$ the empirical mean computed on the segment $T$, the statistic $\hat Q_n$ is equivalent to those proposed by these authors.

The test statistic evaluates the distance between $\hat\theta(T_{1,k_1})$ and $\hat\theta(T_{k_1+1,k_2})$, and between $\hat\theta(T_{k_2+1,n})$ and $\hat\theta(T_{k_1+1,k_2})$, for all $(k_1,k_2)\in\mathcal{T}_n$. These distances are not too large in the absence of a change-point (i.e. under H0). Thus, the procedure rejects the null hypothesis if there exist two instants $k_1$ and $k_2$ such that the distances exceed a suitably chosen constant.
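
The construction (7)–(10) can be sketched in the simplest member of the class, the constant-mean model $f_\theta\equiv\theta$ with $d=1$ (an assumption made here so that everything is available in closed form). In that case the Poisson QMLE on a segment is the sample mean, $\hat J(T)=1/\hat\theta(T)$ and $\hat I(T)$ is the average of $(Y_t/\hat\theta(T)-1)^2$. The function name is hypothetical, and the code assumes positive segment means and non-constant data on the three segments used in (7).

```python
def epidemic_statistic_constant_mean(y, u, v):
    """Return (Q_hat, (k1, k2)) from (10) for the model lambda_t = theta, theta scalar."""
    n = len(y)
    prefix = [0.0]
    for value in y:
        prefix.append(prefix[-1] + value)

    def seg_mean(l, r):
        """QMLE on T_{l,r}: the sample mean of Y_l, ..., Y_r (1-based, inclusive)."""
        return (prefix[r] - prefix[l - 1]) / (r - l + 1)

    def seg_sigma(l, r):
        """J_hat * I_hat^{-1} * J_hat on T_{l,r} (scalars because d = 1)."""
        theta = seg_mean(l, r)
        j_hat = 1.0 / theta
        i_hat = sum((y[t - 1] / theta - 1.0) ** 2 for t in range(l, r + 1)) / (r - l + 1)
        return j_hat * j_hat / i_hat

    # Matrix (7): average over the first u, the middle and the last u observations.
    sigma_hat = (seg_sigma(1, u) + seg_sigma(u + 1, n - u) + seg_sigma(n - u + 1, n)) / 3.0

    best_q, best_pair = -1.0, None
    for k1 in range(v, n - v + 1):               # the set (8)
        for k2 in range(k1 + v, n - v + 1):
            c = (k2 - k1) / n ** 1.5 * (         # the contrast (9)
                (n - (k2 - k1)) * seg_mean(k1 + 1, k2)
                - k1 * seg_mean(1, k1)
                - (n - k2) * seg_mean(k2 + 1, n)
            )
            q = c * sigma_hat * c
            if q > best_q:
                best_q, best_pair = q, (k1, k2)
    return best_q, best_pair
```

As a usage example, take n = 120 observations alternating between 4 and 2 outside the window (40, 80] and between 8 and 6 inside it, with u = v = 10. The maximizing pair is then (40, 80) and the statistic is far above the d = 1, α = 0.05 critical value 2.973 reported in Table 1. For a general $f_\theta$ each segment estimate requires a numerical optimization, so the double loop becomes much more expensive.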

The following theorem establishes the asymptotic behavior of the test statistic under the null hypothesis.

Theorem 3.1

Under H0 with $\theta_1\in\Theta$, assume that (A0)–(A2), $A_i(\Theta)$ (i = 0, 1, 2), (2) (with $\epsilon>3$) and (6) hold. Then,

\[ \hat Q_n\xrightarrow[n\to\infty]{\mathcal{D}}\sup_{0\le\tau_1<\tau_2\le1}\|W_d(\tau_1)-W_d(\tau_2)\|^2, \tag{11} \]

where $W_d$ is a $d$-dimensional Brownian bridge.

For a significance level $\alpha\in(0,1)$, the critical region of the test is then $(\hat Q_n>c_{d,\alpha})$, where $c_{d,\alpha}$ is the $(1-\alpha)$-quantile of the distribution of $\sup_{0\le\tau_1<\tau_2\le1}\|W_d(\tau_1)-W_d(\tau_2)\|^2$. This ensures that the test procedure has the correct size asymptotically. Table 1 below shows the values of $c_{d,\alpha}$ for $\alpha=0.01,0.05,0.10$ and $d=1,\ldots,5$, which are obtained by computing the empirical quantiles through Monte-Carlo simulations based on 5000 replications. The distribution was evaluated on a grid of size 1000.
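
The Monte-Carlo approximation of $c_{d,\alpha}$ can be sketched as follows. This is an illustrative reconstruction, not the authors' code: each coordinate of the Brownian bridge is built from a Gaussian random walk on a regular grid, the supremum is replaced by a maximum over grid pairs, and the grid size, number of replications and seed are free parameters of the sketch.

```python
import math
import random


def sup_bridge_increment(d, grid, rng):
    """One draw of max_{i<j} ||W_d(i/grid) - W_d(j/grid)||^2 for a d-dimensional Brownian bridge."""
    scale = 1.0 / math.sqrt(grid)
    paths = []
    for _ in range(d):
        level, walk = 0.0, [0.0]
        for _ in range(grid):
            level += rng.gauss(0.0, scale)       # Brownian motion increments
            walk.append(level)
        # Bridge: W(tau) = B(tau) - tau * B(1), evaluated on the grid
        paths.append([walk[i] - (i / grid) * walk[grid] for i in range(grid + 1)])
    best = 0.0
    for i in range(grid):
        for j in range(i + 1, grid + 1):
            dist = sum((p[i] - p[j]) ** 2 for p in paths)
            if dist > best:
                best = dist
    return best


def simulate_draws(d, reps, grid, seed=0):
    """Sorted Monte-Carlo sample of the discretized limit distribution in (11)."""
    rng = random.Random(seed)
    return sorted(sup_bridge_increment(d, grid, rng) for _ in range(reps))


def empirical_quantile(draws, level):
    """Empirical quantile of a sorted sample, e.g. level = 0.95 for alpha = 0.05."""
    index = min(len(draws) - 1, max(0, math.ceil(level * len(draws)) - 1))
    return draws[index]
```

For example, `empirical_quantile(simulate_draws(1, 300, 50), 0.95)` gives a rough value for $c_{1,0.05}$. A coarse grid tends to underestimate the supremum, so reproducing the precision of Table 1 would need the grid size and replication count quoted in the text, at a much higher cost for this brute-force pair search.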

Table 1.

Some empirical $(1-\alpha)$-quantiles of the distribution of $\sup_{0\le\tau_1<\tau_2\le1}\|W_d(\tau_1)-W_d(\tau_2)\|^2$.

α \ d    d = 1    d = 2    d = 3    d = 4    d = 5
0.01     3.907    7.320    12.384   16.004   19.039
0.05     2.973    5.690    8.948    11.708   14.471
0.10     2.503    4.988    7.650    9.954    12.410

Under the epidemic alternative, we set the following additional condition.

Assumption 2

(B): There exists $(\tau_1,\tau_2)\in(0,1)^2$ such that $(t_1,t_2)=([n\tau_1],[n\tau_2])$ (where $[\cdot]$ denotes the integer part).

Combining all the regularity assumptions given above, we obtain the following result.

Theorem 3.2

Under H1 with $\theta_1$ and $\theta_2$ belonging to $\Theta$, assume that (B), (A0)–(A2), $A_i(\Theta)$ (i = 0, 1, 2), (2) (with $\epsilon\ge3$) and (6) hold. Then,

\[ \hat Q_n\xrightarrow[n\to\infty]{P}+\infty. \tag{12} \]

This theorem establishes the consistency in power of the proposed procedure. Under H1, an estimator of the vector of breakpoints $\underline{t}=(t_1,t_2)$ is given by

\[ \hat{\underline{t}}_n=\operatorname{argmax}_{(k_1,k_2)\in\mathcal{T}_n}C'_{n,k_1,k_2}\,\hat\Sigma(u_n)\,C_{n,k_1,k_2}. \]

The properties of this estimator, for example the asymptotic behavior of $\hat{\underline{t}}_n-\underline{t}$, are left for future research.

4. Simulation study

We present some simulation results in order to assess the empirical size and power of the proposed test procedure. To do so, we consider the following processes:

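  • Poisson-INGARCH processes:
    \[ Y_t\mid\mathcal{F}_{t-1}\sim\text{Poisson}(\lambda_t)\quad\text{with}\quad\lambda_t=\omega+\alpha Y_{t-1}+\beta\lambda_{t-1},\ \text{for all}\ t\in\mathbb{Z}; \tag{13} \]
  • NB-INGARCH processes:
    \[ Y_t\mid\mathcal{F}_{t-1}\sim NB(r,p_t)\quad\text{with}\quad\frac{r(1-p_t)}{p_t}=\lambda_t=\omega+\alpha Y_{t-1}+\beta\lambda_{t-1},\ \text{for all}\ t\in\mathbb{Z}, \tag{14} \]
    where $NB(r,p)$ denotes the negative binomial distribution with parameters $r$ (assumed to be known) and $p$, and the parameter vector associated with the models is denoted by $\theta=(\omega,\alpha,\beta)$, which becomes $\theta=(\omega,\alpha)$ when $\beta=0$ (i.e. for an INARCH(1) representation). The NB-INGARCH processes are generated with r = 1 and r = 5. Here, we use the probability mass function of $NB(r,p)$ given by
    \[ P(Y=y)=\binom{y+r-1}{r-1}p^r(1-p)^y,\quad\text{for all}\ y=0,1,2,\ldots \]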
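
Trajectories from (13) and (14) can be generated along the following lines. This is a sketch under stated assumptions rather than the authors' R code: the Poisson sampler is Knuth's method, the negative binomial draw uses the standard Gamma–Poisson mixture (a Gamma variable with shape $r$ and mean $\lambda_t$, then a Poisson draw), and under the epidemic alternative the parameters are simply switched inside one recursion instead of gluing two separately simulated stationary processes. All function and argument names are hypothetical.

```python
import math
import random


def rpois(lam, rng):
    """Poisson draw by Knuth's method, splitting large means into chunks of at most 30."""
    total = 0
    while lam > 0.0:
        chunk = min(lam, 30.0)                   # a sum of independent Poissons is Poisson
        limit, p = math.exp(-chunk), rng.random()
        while p > limit:
            total += 1
            p *= rng.random()
        lam -= chunk
    return total


def simulate_ingarch(n, theta0, rng, theta1=None, t1=None, t2=None, r=None, burn_in=200):
    """Simulate Y_1..Y_n from (13) (r is None) or (14) (NB with known r).

    theta0 = (omega, alpha, beta) holds outside (t1, t2]; theta1, if given, holds for
    t1 < t <= t2. Use beta = 0 for an INARCH(1) representation.
    """
    omega, alpha, beta = theta0
    lam, prev, y = omega / (1.0 - alpha - beta), 0, []   # start at the stationary mean
    for t in range(1 - burn_in, n + 1):
        epidemic = theta1 is not None and t1 < t <= t2
        omega, alpha, beta = theta1 if epidemic else theta0
        lam = omega + alpha * prev + beta * lam
        if r is None:
            prev = rpois(lam, rng)
        else:
            # Gamma(shape r, scale lam / r) mixed with Poisson gives NB(r, p_t)
            # with r (1 - p_t) / p_t = lam, as in (14).
            prev = rpois(rng.gammavariate(r, lam / r), rng)
        if t >= 1:
            y.append(prev)
    return y
```

For instance, `simulate_ingarch(500, (0.5, 0.2, 0.35), random.Random(1), theta1=(1.0, 0.2, 0.35), t1=150, t2=350, r=5)` mimics the epidemic scenario used for the illustrative trajectory discussed next.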

Firstly, we generate two trajectories $(Y_1,\ldots,Y_{500})$ from (14): a trajectory under H0 with $\theta=(0.5,0.2,0.35)$ and a trajectory under H1 with breaks at $t_1=150$, when $\theta$ changes to $(1,0.2,0.35)$, and at $t_2=350$, when $\theta$ reverts back to $(0.5,0.2,0.35)$. We have implemented the procedure in the R software. Figure 1 shows the realizations of the statistic $\hat Q_{n,k_1,k_2}$ computed with $u_n=v_n=[(\log(n))^{5/2}]$. As can be seen from this figure, in the scenario without change, the statistic $\hat Q_{n,k_1,k_2}$ is less than the limit of the critical region that is represented by the horizontal triangle (see Figure 1(c)). Under the alternative (of epidemic change), $\hat Q_{n,k_1,k_2}$ is greater than the critical value of the test and it reaches its maximum around the point where the changes occur (see the dotted lines in Figure 1(d)).

Figure 1.


Typical realization of 500 observations of two NB-INGARCH(1,1) processes with r = 5 and the corresponding statistics $\hat Q_{n,k_1,k_2}$ for the epidemic change-point detection. (a) is a trajectory without change, where the true parameter $\theta=(0.5,0.2,0.35)$ is constant. (b) is a trajectory generated under the epidemic alternative, where the parameter $\theta$ changes to $(1,0.2,0.35)$ at $t_1=150$ and reverts back to $(0.5,0.2,0.35)$ at $t_2=350$. The horizontal triangles in (c) and (d) represent the limit of the critical region of the test, whereas the dotted lines show the point where the maximum of $\hat Q_{n,k_1,k_2}$ is reached.

Now, for each of the two models (13) and (14), we generate independent replications with sample sizes n = 500, 1000 in the following situations: a scenario where the parameter $\theta=\theta_0$ is constant (no change) and a scenario where the parameter $\theta$ changes from $\theta_0$ to $\theta_1$ at time $t_1=0.3n$ and reverts back to $\theta_0$ at time $t_2=0.7n$. Table 2 contains the empirical sizes and powers computed (under H0 and H1, respectively) as the proportion of rejections of the null hypothesis based on 500 replications. These results are obtained with a significance level $\alpha=5\%$. The scenario '$\theta_0=(22.75,0.18)$; $\theta_1=(14.5,0.05)$' considered here is close to the fitted representation obtained from the real data example (see below). As expected, the performance is better for the Poisson-INGARCH processes than for the NB-INGARCH processes, but the test procedure works well in both cases (see Table 2). It produces reasonable empirical levels, which are close to the nominal one when n = 1000. Also, the empirical powers increase with the sample size and are close to 1 when n = 1000, even in situations where the difference between $\theta_0$ and $\theta_1$ is relatively small (see, for example, the last scenario of the Poisson-INGARCH in Table 2). This is consistent with the results of Theorem 3.2.

Table 2.

Empirical sizes and powers at the nominal level 0.05 for the epidemic change-point detection in the Poisson-INGARCH (13) and NB-INGARCH (14) processes.

Model               Scenario                                   n = 500   n = 1000
Poisson-INGARCH     Empirical levels:
                      θ0=(22.75,0.18)                          0.036     0.044
                      θ0=(0.15,0.3,0.2)                        0.064     0.058
                      θ0=(1,0.4,0.2)                           0.040     0.048
                    Empirical powers:
                      θ0=(22.75,0.18); θ1=(14.5,0.05)          0.996     1.000
                      θ0=(0.15,0.3,0.2); θ1=(0.15,0.3,0.6)     0.990     0.998
                      θ0=(1,0.4,0.2); θ1=(1,0.15,0.2)          0.826     0.964
NB-INGARCH, r = 1   Empirical levels:
                      θ0=(22.75,0.18)                          0.032     0.042
                      θ0=(0.5,0.2,0.35)                        0.076     0.060
                    Empirical powers:
                      θ0=(22.75,0.18); θ1=(14.5,0.05)          0.880     0.992
                      θ0=(0.5,0.2,0.35); θ1=(1,0.2,0.35)       0.896     0.948
NB-INGARCH, r = 5   Empirical levels:
                      θ0=(22.75,0.18)                          0.030     0.046
                      θ0=(0.5,0.2,0.35)                        0.074     0.056
                    Empirical powers:
                      θ0=(22.75,0.18); θ1=(14.5,0.05)          0.984     1.000
                      θ0=(0.5,0.2,0.35); θ1=(1,0.2,0.35)       0.966     0.992

5. Real data example

We investigate the number of daily hospital admissions for respiratory diseases in children under 6 years old in the Vitória metropolitan area, Brazil. The data are obtained from the Hospital Infantil Nossa Senhora da Gloria. The time series is plotted in Figure 2(a). There are 413 available observations, which represent the admissions from 13 June 2008 through 30 July 2009. This time series is part of a large dataset (available at https://rss.onlinelibrary.wiley.com/pb-assets/hub-assets/rss/Datasets/RSSC%2067.2/C1239deSouza-1531120585220.zip) which has been studied by [22]. In their work, they used a hybrid generalized additive model with Poisson marginal distribution to analyze the effects of some atmospheric pollutants on the number of hospital admissions due to cause-specific respiratory diseases.

Figure 2.


Plot of $\hat Q_{n,k_1,k_2}$ for the epidemic change-point detection applied to the number of treatments for respiratory diseases in the Vitória metropolitan area, Brazil, between 13 June 2008 and 30 July 2009 with an INARCH(1) representation. The vertical lines in (a) are the estimated breakpoints. The horizontal triangle in (b) represents the limit of the critical region of the test, whereas the dotted lines show the point where the maximum of $\hat Q_{n,k_1,k_2}$ is reached.

The time series plot appears to show an epidemic change in the sequence. To test this, we apply our detection procedure with an INARCH(1) representation given by $E(Y_t\mid\mathcal{F}_{t-1})=\lambda_t=\omega+\alpha Y_{t-1}$. In each segment, to compute the Poisson QMLE, the initial values $\lambda_1$ and $\partial\lambda_1/\partial\theta$ are set to the empirical mean of the data and the null vector, respectively. For $u_n=[(\log(n))^{5/2}]$ and $v_n=[(\log(n))^2]$, Figure 2(b) shows the values of the statistic $\hat Q_{n,k_1,k_2}$ corresponding to all the possible combinations $(k_1,k_2)\in\mathcal{T}_n$. The critical value at the nominal level $\alpha=5\%$ is $c_{d,\alpha}=5.69$ and the resulting test statistic is $\hat Q_n=14.72$, which implies the rejection of the null hypothesis (i.e. change-points are detected). The peak in the graph is reached at the point $(k_1,k_2)=(198,285)$, which gives the estimated locations of the break-points. In the simulation with the scenario close to this real data example, we have found that the estimator $\hat{\underline{t}}_n$ of the change-point is on average very close to the true change-point $\underline{t}$. More precisely, among the 500 trajectories simulated for n = 500, we considered those for which a change-point was detected (498 for the Poisson INGARCH and 492 for the NB-INGARCH process), and we found that on average, $\|\hat{\underline{t}}_n-\underline{t}\|<2$ in both of these cases. The locations of the changes detected here correspond to the dates 27 December 2008 and 24 March 2009. The second regime detected coincides with a large part of the austral summer, which runs from December to March; this partly explains the slight decrease in the number of hospital admissions observed in this period. The estimated model on each regime yields:

\[ \hat\lambda_t=\begin{cases}\underset{(2.226)}{21.757}+\underset{(0.076)}{0.188}\,Y_{t-1}&\text{for}\ t<198,\\[4pt]\underset{(2.146)}{13.210}+\underset{(0.138)}{0.154}\,Y_{t-1}&\text{for}\ 198\le t<285,\\[4pt]\underset{(2.402)}{23.445}+\underset{(0.082)}{0.187}\,Y_{t-1}&\text{for}\ t\ge285,\end{cases} \tag{15} \]

where the standard errors of the estimators are given in parentheses. In (15), one can see that the parameter of the first regime is very close to that of the third regime. This is in accordance with the alternative H1 and lends substantial support to the existence of an epidemic change-point in this series.
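
Some of the numbers quoted in this section can be re-derived with a few lines of Python. The sketch below recomputes the trimming sequences for n = 413, maps the estimated break-point indices to calendar dates, and evaluates $\omega/(1-\alpha)$ for each regime of (15). The last quantity is not reported in the text: it is the stationary mean of an INARCH(1) process, obtained from $E(Y_t)=\omega+\alpha E(Y_{t-1})$, and is added here only as an aid to interpretation.

```python
import math
from datetime import date, timedelta

n = 413
u_n = int(math.log(n) ** 2.5)        # [(log n)^{5/2}]
v_n = int(math.log(n) ** 2)          # [(log n)^2]

first_day = date(2008, 6, 13)        # date of observation 1


def observation_date(k):
    """Calendar date of the k-th daily observation (1-based)."""
    return first_day + timedelta(days=k - 1)


def stationary_mean(omega, alpha):
    """Mean of a stationary INARCH(1) process with lambda_t = omega + alpha * Y_{t-1}."""
    return omega / (1.0 - alpha)


# (omega, alpha) estimates of the three regimes in (15)
regimes = [(21.757, 0.188), (13.210, 0.154), (23.445, 0.187)]
implied_means = [round(stationary_mean(w, a), 2) for w, a in regimes]

print(u_n, v_n)
print(observation_date(198), observation_date(285), observation_date(n))
print(implied_means)
```

The implied means of the first and third regimes are close to each other and clearly above that of the second regime, in line with the reading of (15) given above.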

6. Proofs of the main results

Let $(\psi_n)_{n\in\mathbb{N}}$ and $(r_n)_{n\in\mathbb{N}}$ be sequences of random variables or vectors. Throughout this section, we use the notation $\psi_n=o_P(r_n)$ to mean: for all $\epsilon>0$, $P(\|\psi_n\|\ge\epsilon\|r_n\|)\to0$ as $n\to\infty$. Write $\psi_n=O_P(r_n)$ to mean: for all $\epsilon>0$, there exists $C>0$ such that $P(\|\psi_n\|\ge C\|r_n\|)\le\epsilon$ for $n$ large enough. In the sequel, $C$ denotes a positive constant whose value may differ from one inequality to another.

6.1. Proof of Proposition 2.1

Without loss of generality, we show the results for $j_n=1$ and $k_n=n$ (i.e. the consistency and the asymptotic normality of $\hat\theta(T_{1,n})$).

To simplify the expressions in this paragraph, we set: $\hat\theta_n:=\hat\theta(T_{1,n})$, $L_n(\theta):=L(T_{1,n},\theta)$ and $\hat L_n(\theta):=\hat L(T_{1,n},\theta)$ for any $\theta\in\Theta$.

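  1. To prove the first part of the proposition (i.e. the consistency), it suffices to show that the condition (4) of [1] is satisfied. This condition is established by [10] in their Remark 2.1 by using $A_0(\Theta)$ and (5).

  2. Applying the mean value theorem to the function $\theta\mapsto\frac{\partial}{\partial\theta_i}L_n(\theta)$ for all $i\in\{1,\ldots,d\}$, there exists $\bar\theta_{n,i}$ between $\hat\theta_n$ and $\theta^*$ such that
    \[ \frac{\partial}{\partial\theta_i}L_n(\hat\theta_n)=\frac{\partial}{\partial\theta_i}L_n(\theta^*)+\frac{\partial^2}{\partial\theta\partial\theta_i}L_n(\bar\theta_{n,i})(\hat\theta_n-\theta^*), \]
    which is equivalent to
    \[ \sqrt{n}\,J(\hat\theta_n)(\hat\theta_n-\theta^*)=\frac{1}{\sqrt{n}}\Big(\frac{\partial}{\partial\theta}L_n(\theta^*)-\frac{\partial}{\partial\theta}\hat L_n(\hat\theta_n)\Big)+\frac{1}{\sqrt{n}}\Big(\frac{\partial}{\partial\theta}\hat L_n(\hat\theta_n)-\frac{\partial}{\partial\theta}L_n(\hat\theta_n)\Big) \tag{16} \]
    with
    \[ J(\hat\theta_n)=-\Big(\frac1n\frac{\partial^2}{\partial\theta\partial\theta_i}L_n(\bar\theta_{n,i})\Big)_{1\le i\le d}. \tag{17} \]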

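Moreover, by proceeding as in Lemma 7.1 of [11], we can show that

\[ E\Big[\frac{1}{\sqrt{n}}\Big\|\frac{\partial}{\partial\theta}\hat L_n(\theta)-\frac{\partial}{\partial\theta}L_n(\theta)\Big\|_\Theta\Big]\xrightarrow[n\to\infty]{}0. \]

In addition, $\frac{\partial}{\partial\theta}\hat L_n(\hat\theta_n)=0$ for $n$ large enough, since $\hat\theta_n$ is a local maximum of the function $\theta\mapsto\hat L_n(\theta)$ (from the assumption (A1) and the consistency of $\hat\theta_n$). Thus, (16) gives

\[ \sqrt{n}\,J(\hat\theta_n)(\hat\theta_n-\theta^*)=\frac{1}{\sqrt{n}}\frac{\partial}{\partial\theta}L_n(\theta^*)+o_P(1). \tag{18} \]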

The following lemma will be useful in the sequel.

Lemma 6.1

Assume that all the assumptions of Proposition 2.1 hold. Then,

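  • (a) $J_{\theta^*}=E\Big[\frac{1}{\lambda_0(\theta^*)}\frac{\partial\lambda_0(\theta^*)}{\partial\theta}\frac{\partial\lambda_0(\theta^*)}{\partial\theta'}\Big]<\infty$ and $I_{\theta^*}=E\Big[\Big(\frac{Y_0}{\lambda_0(\theta^*)}-1\Big)^2\frac{\partial\lambda_0(\theta^*)}{\partial\theta}\frac{\partial\lambda_0(\theta^*)}{\partial\theta'}\Big]<\infty$;

  • (b) $\big(\frac{\partial}{\partial\theta}\ell_t(\theta^*),\mathcal{F}_t\big)_{t\in\mathbb{Z}}$ is a stationary ergodic, square integrable martingale difference sequence with covariance matrix $I_{\theta^*}$;

  • (c) $J(\hat\theta_n)\xrightarrow[n\to\infty]{a.s.}J_{\theta^*}$ and the matrix $J_{\theta^*}$ is invertible.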

Proof.

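  (a) It suffices to show that
    \[ (a1):\ E\Big[\Big\|\frac{1}{\lambda_0(\theta)}\frac{\partial\lambda_0(\theta)}{\partial\theta}\frac{\partial\lambda_0(\theta)}{\partial\theta'}\Big\|_\Theta\Big]<\infty\quad\text{and}\quad(a2):\ E\Big[\Big\|\Big(\frac{Y_0}{\lambda_0(\theta)}-1\Big)^2\frac{\partial\lambda_0(\theta)}{\partial\theta}\frac{\partial\lambda_0(\theta)}{\partial\theta'}\Big\|_\Theta\Big]<\infty. \]
    (a1): From (A0), we have
    \[ E\Big[\Big\|\frac{1}{\lambda_0(\theta)}\frac{\partial\lambda_0(\theta)}{\partial\theta}\frac{\partial\lambda_0(\theta)}{\partial\theta'}\Big\|_\Theta\Big]\le E\Big[\Big\|\frac{1}{\lambda_0(\theta)}\Big\|_\Theta\Big\|\frac{\partial\lambda_0(\theta)}{\partial\theta}\Big\|_\Theta^2\Big]\le\frac{1}{\underline{c}}E\Big[\Big\|\frac{\partial\lambda_0(\theta)}{\partial\theta}\Big\|_\Theta^2\Big]. \tag{19} \]
    Moreover, from Assumption $A_1(\Theta)$, for all $t\in\mathbb{Z}$, we have
    \[ \Big\|\frac{\partial\lambda_t(\theta)}{\partial\theta}\Big\|_\Theta\le\Big\|\frac{\partial}{\partial\theta}f_\theta(Y_{t-1},\ldots)-\frac{\partial}{\partial\theta}f_\theta(0,\ldots)\Big\|_\Theta+\Big\|\frac{\partial}{\partial\theta}f_\theta(0,\ldots)\Big\|_\Theta\le C+\sum_{\ell\ge1}\alpha_\ell^{(1)}Y_{t-\ell}. \tag{20} \]
    Then, applying Hölder's inequality, we obtain
    \[ E\Big[\Big\|\frac{\partial\lambda_t(\theta)}{\partial\theta}\Big\|_\Theta^2\Big]\le\Big(\Big\|\Big\|\frac{\partial\lambda_t(\theta)}{\partial\theta}\Big\|_\Theta\Big\|_4\Big)^2\le\Big(\Big\|C+\sum_{\ell\ge1}\alpha_\ell^{(1)}Y_{t-\ell}\Big\|_4\Big)^2\le\Big(C+\|Y_0\|_4\sum_{\ell\ge1}\alpha_\ell^{(1)}\Big)^2<\infty\quad(\text{from (2) with}\ \epsilon\ge3). \]
    We conclude the proof of (a1) from (19).
    (a2): Again, using (A0), Hölder's inequality and (20), we get
    \[ E\Big[\Big\|\Big(\frac{Y_0}{\lambda_0(\theta)}-1\Big)^2\frac{\partial\lambda_0(\theta)}{\partial\theta}\frac{\partial\lambda_0(\theta)}{\partial\theta'}\Big\|_\Theta\Big]\le\frac{1}{\underline{c}}\big(\|Y_0\|_4+1\big)^2\Big(\Big\|\Big\|\frac{\partial\lambda_0(\theta)}{\partial\theta}\Big\|_\Theta\Big\|_4\Big)^2\le C\big(\|Y_0\|_4+1\big)^2\Big(C+\|Y_0\|_4\sum_{\ell\ge1}\alpha_\ell^{(1)}\Big)^2<\infty, \]
    which completes the proof of (a).
  (b) See the proof of Theorem 2.2 of [1] or Lemma 7.2 of [10].

  (c) This part is established in the proof of Theorem 3.2 of [11] by using $A_i(\Theta)$ (i = 0, 1, 2), (2) (with $\epsilon\ge3$) and (6).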

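Let us use Lemma 6.1 to complete the proof of part (ii) of Proposition 2.1. From Lemma 6.1(c), for $n$ large enough, $J(\hat\theta_n)$ (defined in (17)) is an invertible matrix. Then, the relation (18) is equivalent to

\[ \sqrt{n}(\hat\theta_n-\theta^*)=J(\hat\theta_n)^{-1}\Big[\frac{1}{\sqrt{n}}\frac{\partial}{\partial\theta}L_n(\theta^*)\Big]+o_P(1). \]

Furthermore, applying the central limit theorem to the stationary ergodic martingale difference sequence $\big(\frac{\partial}{\partial\theta}\ell_t(\theta^*),\mathcal{F}_t\big)_{t\in\mathbb{Z}}$ (see Lemma 6.1(b)), we have

\[ \frac{1}{\sqrt{n}}\frac{\partial}{\partial\theta}L_n(\theta^*)=\frac{1}{\sqrt{n}}\sum_{t=1}^{n}\frac{\partial}{\partial\theta}\ell_t(\theta^*)\xrightarrow[n\to\infty]{\mathcal{D}}\mathcal{N}_d(0,I_{\theta^*}). \]

Therefore, for $n$ large enough, it holds that

\[ \sqrt{n}(\hat\theta_n-\theta^*)=J_{\theta^*}^{-1}\Big[\frac{1}{\sqrt{n}}\frac{\partial}{\partial\theta}L_n(\theta^*)\Big]+o_P(1)\xrightarrow[n\to\infty]{\mathcal{D}}\mathcal{N}_d\big(0,J_{\theta^*}^{-1}I_{\theta^*}J_{\theta^*}^{-1}\big). \]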

6.2. Proof of Theorem 3.1

The following lemma is obtained from Lemmas A.1 and A.4 of [9]; the proof is therefore omitted.

Lemma 6.2

Assume that the assumptions of Theorem 3.1 hold. Then,

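\[ (i.)\ \frac1n\big\|\hat L(T_{1,n},\theta)-L(T_{1,n},\theta)\big\|_\Theta\xrightarrow[n\to\infty]{a.s.}0\quad\text{and}\quad(ii.)\ \frac{1}{\sqrt{n}}\Big\|\frac{\partial\hat L(T_{1,n},\theta)}{\partial\theta}-\frac{\partial L(T_{1,n},\theta)}{\partial\theta}\Big\|_\Theta\xrightarrow[n\to\infty]{P}0. \]

Define the statistic

\[ Q_n=\max_{(k_1,k_2)\in\mathcal{T}_n}Q_{n,k_1,k_2}\quad\text{with}\quad Q_{n,k_1,k_2}=C'_{n,k_1,k_2}\,\Sigma\,C_{n,k_1,k_2}, \]

where $\Sigma$ is defined in Proposition 2.1 and computed at $\theta_1$, under H0. Consider the following lemma.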

Lemma 6.3

Assume that the assumptions of Theorem 3.1 hold. Then,

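\[ \max_{(k_1,k_2)\in\mathcal{T}_n}\big|\hat Q_{n,k_1,k_2}-Q_{n,k_1,k_2}\big|=o_P(1). \]

Proof.

Let $(k_1,k_2)\in\mathcal{T}_n$. According to the asymptotic normality of the QMLE from Proposition 2.1 and the consistency of $\hat\Sigma(u_n)$, when $n\to\infty$, we have

\[ \begin{cases}\|\hat\Sigma(u_n)-\Sigma\|=o(1),\\ \sqrt{k_2-k_1}\,\big(\hat\theta(T_{k_1+1,k_2})-\theta_1\big)=O_P(1),\\ \sqrt{k_1}\,\big(\hat\theta(T_{1,k_1})-\theta_1\big)=O_P(1),\\ \sqrt{n-k_2}\,\big(\hat\theta(T_{k_2+1,n})-\theta_1\big)=O_P(1).\end{cases} \tag{21} \]

Then, in addition to the consistency of the QMLE from Proposition 2.1, we obtain

\[ \begin{aligned}\big|\hat Q_{n,k_1,k_2}-Q_{n,k_1,k_2}\big|&=\frac{(k_2-k_1)^2}{n^3}\Big|\Big[(n-(k_2-k_1))\hat\theta(T_{k_1+1,k_2})-\big(k_1\hat\theta(T_{1,k_1})+(n-k_2)\hat\theta(T_{k_2+1,n})\big)\Big]'\\&\qquad\times\big(\hat\Sigma(u_n)-\Sigma\big)\Big[(n-(k_2-k_1))\hat\theta(T_{k_1+1,k_2})-\big(k_1\hat\theta(T_{1,k_1})+(n-k_2)\hat\theta(T_{k_2+1,n})\big)\Big]\Big|\\&\le C\,\|\hat\Sigma(u_n)-\Sigma\|\Big[\frac{(k_2-k_1)(n-(k_2-k_1))^2}{n^3}\big\|\sqrt{k_2-k_1}\,(\hat\theta(T_{k_1+1,k_2})-\theta_1)\big\|^2\\&\qquad+\frac{k_1(k_2-k_1)^2}{n^3}\big\|\sqrt{k_1}\,(\hat\theta(T_{1,k_1})-\theta_1)\big\|^2+\frac{(n-k_2)(k_2-k_1)^2}{n^3}\big\|\sqrt{n-k_2}\,(\hat\theta(T_{k_2+1,n})-\theta_1)\big\|^2\Big]\\&\le o(1)\,O_P(1)=o_P(1).\end{aligned} \]

This concludes the proof of the lemma.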

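Let $k,k'\in[1,n]$ with $k\le k'$, $\tilde\theta\in\Theta$ and $i\in\{1,2,\ldots,d\}$. The mean value theorem applied to the function $\theta\mapsto\frac{\partial}{\partial\theta_i}L(T_{k,k'},\theta)$ implies that there exists $\theta_{n,i}$ between $\tilde\theta$ and $\theta_1$ such that

\[ \frac{\partial}{\partial\theta_i}L(T_{k,k'},\tilde\theta)=\frac{\partial}{\partial\theta_i}L(T_{k,k'},\theta_1)+\frac{\partial^2}{\partial\theta\partial\theta_i}L(T_{k,k'},\theta_{n,i})(\tilde\theta-\theta_1). \]

This is equivalent to

\[ (k'-k+1)\,J_n(T_{k,k'},\tilde\theta)(\tilde\theta-\theta_1)=\frac{\partial}{\partial\theta}L(T_{k,k'},\theta_1)-\frac{\partial}{\partial\theta}L(T_{k,k'},\tilde\theta) \tag{22} \]

with

\[ J_n(T_{k,k'},\tilde\theta)=-\frac{1}{k'-k+1}\Big(\frac{\partial^2}{\partial\theta\partial\theta_i}L(T_{k,k'},\theta_{n,i})\Big)_{1\le i\le d}. \]

We first use Lemma 6.1 to show that

\[ Q_n\xrightarrow[n\to\infty]{\mathcal{D}}\sup_{0\le\tau_1<\tau_2\le1}\|W_d(\tau_1)-W_d(\tau_2)\|^2. \tag{23} \]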

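Remark that

\[ C_{n,k_1,k_2}=\frac{k_2-k_1}{n^{3/2}}\Big[(n-k_2)\big(\hat\theta(T_{k_1+1,k_2})-\hat\theta(T_{k_2+1,n})\big)-k_1\big(\hat\theta(T_{1,k_1})-\hat\theta(T_{k_1+1,k_2})\big)\Big] \]

and

\[ Q_{n,k_1,k_2}=\big\|I_{\theta_1}^{-1/2}J_{\theta_1}C_{n,k_1,k_2}\big\|^2. \]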
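
The second identity can be checked directly (a short verification added here, using the symmetry and positive definiteness of $I_{\theta_1}$ and $J_{\theta_1}$ noted after Proposition 2.1, and writing $C$ for $C_{n,k_1,k_2}$):

\[ C'\,\Sigma\,C=C'J_{\theta_1}I_{\theta_1}^{-1}J_{\theta_1}C=\big(I_{\theta_1}^{-1/2}J_{\theta_1}C\big)'\big(I_{\theta_1}^{-1/2}J_{\theta_1}C\big)=\big\|I_{\theta_1}^{-1/2}J_{\theta_1}C\big\|^2. \]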

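Let $(k_1,k_2)\in\mathcal{T}_n$. Applying (22) with $\tilde\theta=\hat\theta(T_{k_1+1,k_2})$ and $T_{k,k'}=T_{k_1+1,k_2}$, we have

\[ J_n\big(T_{k_1+1,k_2},\hat\theta(T_{k_1+1,k_2})\big)\big(\hat\theta(T_{k_1+1,k_2})-\theta_1\big)=\frac{1}{k_2-k_1}\Big(\frac{\partial}{\partial\theta}L(T_{k_1+1,k_2},\theta_1)-\frac{\partial}{\partial\theta}L\big(T_{k_1+1,k_2},\hat\theta(T_{k_1+1,k_2})\big)\Big). \tag{24} \]

With $\tilde\theta=\hat\theta(T_{k_2+1,n})$ and $T_{k,k'}=T_{k_2+1,n}$, (22) gives

\[ J_n\big(T_{k_2+1,n},\hat\theta(T_{k_2+1,n})\big)\big(\hat\theta(T_{k_2+1,n})-\theta_1\big)=\frac{1}{n-k_2}\Big(\frac{\partial}{\partial\theta}L(T_{k_2+1,n},\theta_1)-\frac{\partial}{\partial\theta}L\big(T_{k_2+1,n},\hat\theta(T_{k_2+1,n})\big)\Big). \tag{25} \]

Moreover, as $n\to+\infty$, Proposition 2.1 and Lemma 6.1(c) (applied to $\theta_1$) imply

\[ \big\|J_n\big(T_{k_1+1,k_2},\hat\theta(T_{k_1+1,k_2})\big)-J_{\theta_1}\big\|=o(1)\quad\text{and}\quad\big\|J_n\big(T_{k_2+1,n},\hat\theta(T_{k_2+1,n})\big)-J_{\theta_1}\big\|=o(1). \]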

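Then, according to (21), for $n$ large enough, it holds from (24) that

\[ \begin{aligned}\sqrt{k_2-k_1}\,J_{\theta_1}\big(\hat\theta(T_{k_1+1,k_2})-\theta_1\big)&=\frac{1}{\sqrt{k_2-k_1}}\Big(\frac{\partial}{\partial\theta}L(T_{k_1+1,k_2},\theta_1)-\frac{\partial}{\partial\theta}L\big(T_{k_1+1,k_2},\hat\theta(T_{k_1+1,k_2})\big)\Big)\\&\quad-\sqrt{k_2-k_1}\Big(\big(J_n(T_{k_1+1,k_2},\hat\theta(T_{k_1+1,k_2}))-J_{\theta_1}\big)\big(\hat\theta(T_{k_1+1,k_2})-\theta_1\big)\Big)\\&=\frac{1}{\sqrt{k_2-k_1}}\Big(\frac{\partial}{\partial\theta}L(T_{k_1+1,k_2},\theta_1)-\frac{\partial}{\partial\theta}L\big(T_{k_1+1,k_2},\hat\theta(T_{k_1+1,k_2})\big)\Big)+o_P(1)\\&=\frac{1}{\sqrt{k_2-k_1}}\Big(\frac{\partial}{\partial\theta}L(T_{k_1+1,k_2},\theta_1)-\frac{\partial}{\partial\theta}\hat L\big(T_{k_1+1,k_2},\hat\theta(T_{k_1+1,k_2})\big)\Big)+o_P(1)\\&\quad+\frac{1}{\sqrt{k_2-k_1}}\Big(\frac{\partial}{\partial\theta}\hat L\big(T_{k_1+1,k_2},\hat\theta(T_{k_1+1,k_2})\big)-\frac{\partial}{\partial\theta}L\big(T_{k_1+1,k_2},\hat\theta(T_{k_1+1,k_2})\big)\Big)\\&=\frac{1}{\sqrt{k_2-k_1}}\Big(\frac{\partial}{\partial\theta}L(T_{k_1+1,k_2},\theta_1)-\frac{\partial}{\partial\theta}\hat L\big(T_{k_1+1,k_2},\hat\theta(T_{k_1+1,k_2})\big)\Big)+o_P(1),\end{aligned} \]

where the last equality is obtained from Lemma 6.2 (ii.). It is equivalent to

\[ J_{\theta_1}\big(\hat\theta(T_{k_1+1,k_2})-\theta_1\big)=\frac{1}{k_2-k_1}\Big(\frac{\partial}{\partial\theta}L(T_{k_1+1,k_2},\theta_1)-\frac{\partial}{\partial\theta}\hat L\big(T_{k_1+1,k_2},\hat\theta(T_{k_1+1,k_2})\big)\Big)+o_P\Big(\frac{1}{\sqrt{k_2-k_1}}\Big). \tag{26} \]

For $n$ large enough, $\hat\theta(T_{k_1+1,k_2})$ is an interior point of $\Theta$ and we have $\frac{\partial}{\partial\theta}\hat L\big(T_{k_1+1,k_2},\hat\theta(T_{k_1+1,k_2})\big)=0$. Thus, from (26), we obtain

\[ J_{\theta_1}\big(\hat\theta(T_{k_1+1,k_2})-\theta_1\big)=\frac{1}{k_2-k_1}\frac{\partial}{\partial\theta}L(T_{k_1+1,k_2},\theta_1)+o_P\Big(\frac{1}{\sqrt{k_2-k_1}}\Big). \tag{27} \]

Similarly, using (25), we also obtain

\[ J_{\theta_1}\big(\hat\theta(T_{k_2+1,n})-\theta_1\big)=\frac{1}{n-k_2}\frac{\partial}{\partial\theta}L(T_{k_2+1,n},\theta_1)+o_P\Big(\frac{1}{\sqrt{n-k_2}}\Big). \tag{28} \]

Subtracting (28) from (27) gives

\[ J_{\theta_1}\big(\hat\theta(T_{k_1+1,k_2})-\hat\theta(T_{k_2+1,n})\big)=\frac{1}{k_2-k_1}\frac{\partial}{\partial\theta}L(T_{k_1+1,k_2},\theta_1)-\frac{1}{n-k_2}\frac{\partial}{\partial\theta}L(T_{k_2+1,n},\theta_1)+o_P\Big(\frac{1}{\sqrt{k_2-k_1}}+\frac{1}{\sqrt{n-k_2}}\Big); \]

i.e.

\[ \frac{(k_2-k_1)(n-k_2)}{n^{3/2}}J_{\theta_1}\big(\hat\theta(T_{k_1+1,k_2})-\hat\theta(T_{k_2+1,n})\big)=\frac{1}{n^{3/2}}\Big[(n-k_2)\frac{\partial}{\partial\theta}L(T_{k_1+1,k_2},\theta_1)-(k_2-k_1)\frac{\partial}{\partial\theta}L(T_{k_2+1,n},\theta_1)\Big]+o_P(1). \tag{29} \]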

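Along similar lines, we have

\[ \frac{k_1(k_2-k_1)}{n^{3/2}}J_{\theta_1}\big(\hat\theta(T_{1,k_1})-\hat\theta(T_{k_1+1,k_2})\big)=\frac{1}{n^{3/2}}\Big[(k_2-k_1)\frac{\partial}{\partial\theta}L(T_{1,k_1},\theta_1)-k_1\frac{\partial}{\partial\theta}L(T_{k_1+1,k_2},\theta_1)\Big]+o_P(1). \tag{30} \]

Combining (29) and (30), we get

\[ \begin{aligned}J_{\theta_1}C_{n,k_1,k_2}&=\frac{1}{n^{3/2}}\Big[(n-(k_2-k_1))\frac{\partial}{\partial\theta}L(T_{k_1+1,k_2},\theta_1)-(k_2-k_1)\Big(\frac{\partial}{\partial\theta}L(T_{k_2+1,n},\theta_1)+\frac{\partial}{\partial\theta}L(T_{1,k_1},\theta_1)\Big)\Big]+o_P(1)\\&=\frac{1}{\sqrt{n}}\Big[\frac{\partial}{\partial\theta}L(T_{k_1+1,k_2},\theta_1)-\frac{k_2-k_1}{n}\frac{\partial}{\partial\theta}L(T_{1,n},\theta_1)\Big]+o_P(1)\\&=\frac{1}{\sqrt{n}}\Big[\frac{\partial}{\partial\theta}L(T_{1,k_2},\theta_1)-\frac{\partial}{\partial\theta}L(T_{1,k_1},\theta_1)-\frac{k_2-k_1}{n}\frac{\partial}{\partial\theta}L(T_{1,n},\theta_1)\Big]+o_P(1)\\&=\frac{1}{\sqrt{n}}\Big[\Big(\frac{\partial}{\partial\theta}L(T_{1,k_2},\theta_1)-\frac{k_2}{n}\frac{\partial}{\partial\theta}L(T_{1,n},\theta_1)\Big)-\Big(\frac{\partial}{\partial\theta}L(T_{1,k_1},\theta_1)-\frac{k_1}{n}\frac{\partial}{\partial\theta}L(T_{1,n},\theta_1)\Big)\Big]+o_P(1);\end{aligned} \]

i.e.

\[ I_{\theta_1}^{-1/2}J_{\theta_1}C_{n,k_1,k_2}=\frac{I_{\theta_1}^{-1/2}}{\sqrt{n}}\Big[\Big(\frac{\partial}{\partial\theta}L(T_{1,k_2},\theta_1)-\frac{k_2}{n}\frac{\partial}{\partial\theta}L(T_{1,n},\theta_1)\Big)-\Big(\frac{\partial}{\partial\theta}L(T_{1,k_1},\theta_1)-\frac{k_1}{n}\frac{\partial}{\partial\theta}L(T_{1,n},\theta_1)\Big)\Big]+o_P(1). \tag{31} \]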

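Recall that, for any $\tau\in(0,1]$,

\[ \frac{\partial}{\partial\theta}L(T_{1,[n\tau]},\theta_1)=\sum_{t=1}^{[n\tau]}\frac{\partial}{\partial\theta}\ell_t(\theta_1),\quad\text{where}\ [n\tau]\ \text{is the integer part of}\ n\tau. \]

From Lemma 6.1(b) (applied to $\theta_1$), applying the central limit theorem for the martingale difference sequence $\big(\frac{\partial}{\partial\theta}\ell_t(\theta_1),\mathcal{F}_t\big)_{t\in\mathbb{Z}}$ (see [4]), we have

\[ \frac{1}{\sqrt{n}}\Big(\frac{\partial}{\partial\theta}L(T_{1,[n\tau_1]},\theta_1)-\frac{[n\tau_1]}{n}\frac{\partial}{\partial\theta}L(T_{1,n},\theta_1)\Big)=\frac{1}{\sqrt{n}}\Big(\sum_{t=1}^{[n\tau_1]}\frac{\partial}{\partial\theta}\ell_t(\theta_1)-\frac{[n\tau_1]}{n}\sum_{t=1}^{n}\frac{\partial}{\partial\theta}\ell_t(\theta_1)\Big)\xrightarrow[n\to\infty]{\mathcal{D}}B_{I_{\theta_1}}(\tau_1)-\tau_1B_{I_{\theta_1}}(1), \]

where $B_{I_{\theta_1}}$ is a Gaussian process with covariance matrix $\min(s,t)\,I_{\theta_1}$. Hence,

\[ \frac{I_{\theta_1}^{-1/2}}{\sqrt{n}}\Big(\frac{\partial}{\partial\theta}L(T_{1,[n\tau_1]},\theta_1)-\frac{[n\tau_1]}{n}\frac{\partial}{\partial\theta}L(T_{1,n},\theta_1)\Big)\xrightarrow[n\to\infty]{\mathcal{D}}B_d(\tau_1)-\tau_1B_d(1)=W_d(\tau_1) \]

in $D([0,1])$, where $B_d$ is a $d$-dimensional standard Brownian motion and $W_d$ is a $d$-dimensional Brownian bridge.

Similarly, we get

\[ \frac{I_{\theta_1}^{-1/2}}{\sqrt{n}}\Big(\frac{\partial}{\partial\theta}L(T_{1,[n\tau_2]},\theta_1)-\frac{[n\tau_2]}{n}\frac{\partial}{\partial\theta}L(T_{1,n},\theta_1)\Big)\xrightarrow[n\to\infty]{\mathcal{D}}B_d(\tau_2)-\tau_2B_d(1)=W_d(\tau_2). \]

Thus, as $n\to\infty$, it follows from (31) that

\[ Q_{n,[n\tau_1],[n\tau_2]}=\big\|I_{\theta_1}^{-1/2}J_{\theta_1}C_{n,[n\tau_1],[n\tau_2]}\big\|^2\xrightarrow[n\to\infty]{\mathcal{D}}\|W_d(\tau_1)-W_d(\tau_2)\|^2\quad\text{in}\ D([0,1]). \]

Hence, for $n$ large enough, we have

\[ Q_n=\max_{\substack{v_n\le k_1<k_2\le n-v_n\\ k_2-k_1\ge v_n}}Q_{n,k_1,k_2}=\sup_{\frac{v_n}{n}\le\tau_1<\tau_2\le1-\frac{v_n}{n}}Q_{n,[n\tau_1],[n\tau_2]}\xrightarrow[n\to\infty]{\mathcal{D}}\sup_{0\le\tau_1<\tau_2\le1}\|W_d(\tau_1)-W_d(\tau_2)\|^2 \]

in $D([0,1])$. We conclude the proof of the theorem from Lemma 6.3. □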

6.2.1. Proof of Theorem 3.2

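Recall that, under the alternative, $(Y_1,\ldots,Y_n)$ is a trajectory of a process $Y=\{Y_t,t\in\mathbb{Z}\}$ satisfying

\[ Y_t=\begin{cases}Y_t^{(1)}&\text{for}\ t\in\{\ldots,-1,0,1,\ldots,t_1\}\cup\{t_2+1,t_2+2,\ldots\},\\ Y_t^{(2)}&\text{for}\ t\in\{t_1+1,t_1+2,\ldots,t_2\},\end{cases} \tag{32} \]

where $(t_1,t_2)=([\tau_1n],[\tau_2n])$ (with $0<\tau_1<\tau_2<1$) and $\{Y_t^{(j)},t\in\mathbb{Z}\}$ (j = 1, 2) is a stationary and ergodic solution of the model (1) depending on $\theta_j$, with $\theta_1\ne\theta_2$.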

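We have $\hat Q_n=\max_{(k_1,k_2)\in\mathcal{T}_n}\hat Q_{n,k_1,k_2}\ge\hat Q_{n,t_1,t_2}$. Then, it suffices to show that $\hat Q_{n,t_1,t_2}\xrightarrow[n\to\infty]{P}+\infty$ to establish the theorem. For any $n\in\mathbb{N}$,

\[ \hat Q_{n,t_1,t_2}=C'_{n,t_1,t_2}\,\hat\Sigma(u_n)\,C_{n,t_1,t_2} \]

with

\[ C_{n,t_1,t_2}=\frac{t_2-t_1}{n^{3/2}}\Big[(n-(t_2-t_1))\hat\theta(T_{t_1+1,t_2})-\big(t_1\hat\theta(T_{1,t_1})+(n-t_2)\hat\theta(T_{t_2+1,n})\big)\Big] \]

and

\[ \hat\Sigma(u_n)=\frac13\Big[\hat J(T_{1,u_n})\hat I(T_{1,u_n})^{-1}\hat J(T_{1,u_n})+\hat J(T_{u_n+1,n-u_n})\hat I(T_{u_n+1,n-u_n})^{-1}\hat J(T_{u_n+1,n-u_n})+\hat J(T_{n-u_n+1,n})\hat I(T_{n-u_n+1,n})^{-1}\hat J(T_{n-u_n+1,n})\Big]. \]

Moreover, for $n$ large enough, $\hat\theta(T_{1,t_1})\overset{a.s.}{=}\hat\theta(T_{t_2+1,n})$ (from the consistency of the Poisson QMLE). Consequently, $C_{n,t_1,t_2}$ becomes

\[ C_{n,t_1,t_2}=-\frac{(t_2-t_1)(n-(t_2-t_1))}{n^{3/2}}\big(\hat\theta(T_{1,t_1})-\hat\theta(T_{t_1+1,t_2})\big). \tag{33} \]

Furthermore, by definition, the three matrices in the formula of $\hat\Sigma(u_n)$ are positive semi-definite, and the first and the last ones converge a.s. to the same matrix, which is positive definite.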

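Then, for $n$ large enough, we can write

\[ \begin{aligned}\hat Q_{n,t_1,t_2}&\ge\frac13\,\frac{(t_2-t_1)^2(n-(t_2-t_1))^2}{n^3}\big(\hat\theta(T_{1,t_1})-\hat\theta(T_{t_1+1,t_2})\big)'\\&\qquad\times\Big[\hat J(T_{1,u_n})\hat I(T_{1,u_n})^{-1}\hat J(T_{1,u_n})+\hat J(T_{n-u_n+1,n})\hat I(T_{n-u_n+1,n})^{-1}\hat J(T_{n-u_n+1,n})\Big]\big(\hat\theta(T_{1,t_1})-\hat\theta(T_{t_1+1,t_2})\big)\\&\ge\frac13\,R_n(\tau_1,\tau_2)\big(\hat\theta(T_{1,t_1})-\hat\theta(T_{t_1+1,t_2})\big)'\\&\qquad\times\Big[\hat J(T_{1,u_n})\hat I(T_{1,u_n})^{-1}\hat J(T_{1,u_n})+\hat J(T_{n-u_n+1,n})\hat I(T_{n-u_n+1,n})^{-1}\hat J(T_{n-u_n+1,n})\Big]\big(\hat\theta(T_{1,t_1})-\hat\theta(T_{t_1+1,t_2})\big)\end{aligned} \]

with

\[ R_n(\tau_1,\tau_2)=\frac{1}{n^3}\Big[\big(n(\tau_2-\tau_1)-1\big)\big(n(1-(\tau_2-\tau_1))-1\big)\Big]^2. \]

From the asymptotic properties of the Poisson QMLE, we have

  • $\hat\theta(T_{1,t_1})-\hat\theta(T_{t_1+1,t_2})\xrightarrow[n\to\infty]{a.s.}\theta_1-\theta_2\ne0$, $\hat\theta(T_{1,u_n})\xrightarrow[n\to\infty]{a.s.}\theta_1$, $\hat\theta(T_{n-u_n+1,n})\xrightarrow[n\to\infty]{a.s.}\theta_1$;

  • $\hat J(T_{1,u_n})\hat I(T_{1,u_n})^{-1}\hat J(T_{1,u_n})\xrightarrow[n\to\infty]{a.s.}\Sigma^{(1)}$, $\hat J(T_{n-u_n+1,n})\hat I(T_{n-u_n+1,n})^{-1}\hat J(T_{n-u_n+1,n})\xrightarrow[n\to\infty]{a.s.}\Sigma^{(1)}$,

where

\[ \Sigma^{(1)}=J_1I_1^{-1}J_1\quad\text{with}\quad J_1=E\Big[\frac{1}{\lambda_0(\theta_1)}\frac{\partial\lambda_0(\theta_1)}{\partial\theta}\frac{\partial\lambda_0(\theta_1)}{\partial\theta'}\Big]\quad\text{and}\quad I_1=E\Big[\Big(\frac{Y_0}{\lambda_0(\theta_1)}-1\Big)^2\frac{\partial\lambda_0(\theta_1)}{\partial\theta}\frac{\partial\lambda_0(\theta_1)}{\partial\theta'}\Big]. \]

Therefore, since $\Sigma^{(1)}$ is positive definite and $R_n(\tau_1,\tau_2)\xrightarrow[n\to\infty]{}+\infty$, we deduce that $\hat Q_{n,t_1,t_2}\xrightarrow[n\to\infty]{a.s.}+\infty$. This completes the proof of the theorem.  □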

Funding Statement

The first author was supported by the CY Advanced Studies (CY Cergy Paris Université, France), and the MME-DII Center of Excellence [grant number ANR-11-LABEX-0023-01]. The work of the second author was developed within ANR BREAKRISK [grant number ANR-17-CE26-0001-01] and the CY Initiative of Excellence [grant 'Investissements d'Avenir' ANR-16-IDEX-0008, Project 'EcoDep' PSI-AAP2020-0000000013].

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • 1. Ahmad A. and Francq C., Poisson QMLE of count time series models, J. Time Ser. Anal. 37 (2016), pp. 291–314.
  • 2. Aston J.A. and Kirch C., Detecting and estimating changes in dependent functional data, J. Multivar. Anal. 109 (2012), pp. 204–220.
  • 3. Aston J.A. and Kirch C., Evaluating stationarity via change-point alternatives with applications to fMRI data, Ann. Appl. Stat. 6 (2012), pp. 1906–1948.
  • 4. Billingsley P., Convergence of Probability Measures, Wiley, London, 1968.
  • 5. Bucchia B., Testing for epidemic changes in the mean of a multiparameter stochastic process, J. Stat. Plan. Inference 150 (2014), pp. 124–141.
  • 6. Csörgő M. and Horváth L., Limit Theorems in Change-Point Analysis, Wiley, New York, 1997.
  • 7. Diop M.L. and Kengne W., Testing parameter change in general integer-valued time series, J. Time Ser. Anal. 38 (2017), pp. 880–894.
  • 8. Diop M.L. and Kengne W., Consistent model selection procedure for general integer-valued time series, Statistics 55 (2021), pp. 1207–1230.
  • 9. Diop M.L. and Kengne W., Piecewise autoregression for general integer-valued time series, J. Stat. Plan. Inference 211 (2021), pp. 271–286.
  • 10. Diop M.L. and Kengne W., Poisson QMLE for change-point detection in general integer-valued time series models, Metrika 85 (2022), pp. 373–403.
  • 11. Doukhan P. and Kengne W., Inference and testing for structural change in general Poisson autoregressive models, Electron. J. Stat. 9 (2015), pp. 1267–1314.
  • 12. Ferland R., Latour A., and Oraichi D., Integer-valued GARCH process, J. Time Ser. Anal. 27 (2006), pp. 923–942.
  • 13. Fokianos K., Rahbek A., and Tjøstheim D., Poisson autoregression, J. Am. Stat. Assoc. 104 (2009), pp. 1430–1439.
  • 14. Graiche F., Merabet D., and Hamadouche D., Testing change in the variance with epidemic alternatives, Commun. Stat. Theor. M. 45 (2016), pp. 3822–3837.
  • 15. Guan Z., Semiparametric tests for change-points with epidemic alternatives, J. Stat. Plan. Inference 137 (2007), pp. 1748–1764.
  • 16. Jarušková D. and Piterbarg V.I., Log-likelihood ratio test for detecting transient change, Stat. Probab. Lett. 81 (2011), pp. 552–559.
  • 17. Kengne W.C., Testing for parameter constancy in general causal time-series models, J. Time Ser. Anal. 33 (2012), pp. 503–518.
  • 18. Levin B. and Kline J., The CUSUM test of homogeneity with an application in spontaneous abortion epidemiology, Stat. Med. 4 (1985), pp. 469–488.
  • 19. Račkauskas A. and Suquet C., Hölder norm test statistics for epidemic change, J. Stat. Plan. Inference 126 (2004), pp. 495–520.
  • 20. Račkauskas A. and Suquet C., Testing epidemic changes of infinite dimensional parameters, Stat. Inference Stoch. Process. 9 (2006), pp. 111–134.
  • 21. Ramanayake A. and Gupta A.K., Tests for an epidemic change in a sequence of exponentially distributed random variables, Biom. J.: J. Math. Methods Biosci. 45 (2003), pp. 946–958.
  • 22. Souza J.B., Reisen V.A., Franco G.C., Ispány M., Bondon P., and Santos J.M., Generalized additive models with principal component analysis: An application to time series of respiratory disease and air pollution data, J. R. Stat. Soc. Ser. C. Appl. Stat. 67 (2018), pp. 453–480.
  • 23. Weiß C.H., Thinning operations for modeling time series of counts: a survey, AStA Adv. Stat. Anal. 92 (2008), pp. 319–341.
  • 24. Weiß C.H., Feld M.H.-J., Mamode Khan N., and Sunecher Y., INARMA modeling of count time series, Stats 2 (2019), pp. 284–320.
  • 25. Weiß C.H. and Pollett P.K., Binomial autoregressive processes with density-dependent thinning, J. Time Ser. Anal. 35 (2014), pp. 115–132.
  • 26. Yao Q., Tests for change-points with epidemic alternatives, Biometrika 80 (1993), pp. 179–191.
  • 27. Zhu F., A negative binomial integer-valued GARCH model, J. Time Ser. Anal. 32 (2011), pp. 54–67.

Articles from Journal of Applied Statistics are provided here courtesy of Taylor & Francis
