Author manuscript; available in PMC: 2022 Oct 7.
Published in final edited form as: Scand Stat Theory Appl. 2020 Aug 25;48(4):1277–1313. doi: 10.1111/sjos.12489

Estimation of change-point for a class of count time series models

Yunwei Cui 1, Rongning Wu 2, Qi Zheng 3
PMCID: PMC9540642  NIHMSID: NIHMS1654954  PMID: 36213620

Abstract

We apply a three-step sequential procedure to estimate the change-point of count time series. Under certain regularity conditions, the estimator of the change-point converges in distribution to the location of the maxima of a two-sided random walk. We derive a closed-form approximating distribution for the maxima of the two-sided random walk based on the invariance principle for strong mixing processes, so that statistical inference for the true change-point can be carried out. This is the first time that such properties have been provided for integer-valued time series models. Moreover, we show that the proposed procedure is applicable to the integer-valued autoregressive conditional heteroskedastic (INARCH) models with Poisson or negative binomial conditional distribution. In simulation studies, the proposed procedure is shown to perform well in locating the change-point of INARCH models. The procedure is further illustrated with empirical data of weekly robbery counts in two neighborhoods of Baltimore City.

Keywords: change-point estimation, integer-valued time series, negative binomial INARCH model, Poisson INARCH model

1 |. INTRODUCTION

The detection of structural changes is an important topic in time series modeling because, in the real world, the data-generating mechanism often changes at certain time points. Techniques for change-point detection fall into two schemes: the on-line scheme, where the data are collected sequentially and analyzed on a real-time basis, and the off-line scheme, where the data are collected in advance and then analyzed retrospectively. In the literature on integer-valued time series models, both schemes have been explored under various specific models by many researchers, including Weiß and Testik (2009, 2011, 2012) and Hudecová et al. (2015) for the on-line scheme, and Fokianos and Fried (2010), Franke et al. (2012), Kang and Lee (2014), Doukhan and Kengne (2015), and Diop and Kengne (2017) for the off-line scheme, among others.

In addition to detecting a structural change, one often wants to pinpoint the change-point, that is, the location on the timeline at which the change occurs. Compared to the amount of integer-valued time series research devoted to testing whether or not a structural change exists, the problem of estimating the change-point, when such a change exists, has been far less studied. The common practice is to employ rule-of-thumb methods to locate the change-point; for example, the change-point is estimated as the time point at which a change-point existence test statistic achieves its maximum value (Kang and Lee, 2014). Nevertheless, in order to gauge the quality of an estimation method and perform meaningful statistical inference, one also needs to learn the distributional properties of the change-point estimator. Numerous studies have been carried out to this end for continuous-valued time series models; for example, Yao (1987), Bai (1994), Ling (2016), and Yau and Zhao (2016), to mention a few. Also, see Jandhyala et al. (2013) for a review on inference for single and multiple change-points in time series. However, for integer-valued time series models, to the best of our knowledge, no study has examined the distributional properties of the change-point estimator.

To fill this gap in the literature, in this paper we study estimation of the change-point in terms of model parameters, when it is detected to exist, for the one-parameter exponential family AR model (Davis and Liu, 2016; Cui and Zheng, 2017) in an off-line scheme. Our study is motivated by procedures that have been used for continuous-valued time series models. For example, Bai (1994) estimated the unknown shift point in the mean by the method of least squares for linear time series models, and Ling (2016) estimated the change-point of model parameters for linear and nonlinear time series models. Because count time series are discrete and have features different from those of continuous-valued time series, these methods, while useful, do not extend to the discrete setting in a straightforward manner. That said, we adopt the procedures of Ling (2016) in our study and establish asymptotic properties of the procedure under strong mixing conditions. As in Yao (1987) and Ling (2016), an approximate distribution for the change-point estimator in non-rescaled time is also provided in closed form, which makes statistical inference for the true change-point feasible. This is the first time that such properties have been examined for integer-valued time series models. For further illustration, we consider the widely discussed INARCH models with Poisson or negative binomial conditional distribution, which are special cases of our setting. While change-point existence tests for INARCH models have been previously studied, no theoretical results for change-point estimation and the relevant statistical inference are available in the literature. We apply the proposed technique to the INARCH models and show that the asymptotic results hold under mild conditions.

The rest of the paper is organized as follows. In Section 2, we introduce the one-parameter exponential family AR model and propose a three-step sequential procedure to estimate the change-point for model parameters. In Section 3, we give the asymptotic distribution of the resulting change-point estimator. In Section 4, we develop the approximating distribution of the change-point estimator, based on the maxima of a two-sided random walk, and derive a closed-form expression. In Section 5, we establish the applicability of the proposed method and its properties for the INARCH models with Poisson or negative binomial conditional distribution. In Section 6, we perform simulation studies to evaluate the finite-sample performance of the procedure. In Section 7, we apply the techniques to analyzing the number of robbery cases in two neighborhoods of Baltimore City. We show that a change-point exists, which corresponds to the Freddie Gray incident, an event having ignited widespread discussions of policing practices. In Section 7, we also perform change-point estimation for the polio incidence data and the campylobacteriosis infection data. A discussion of future research is given in Section 8. The technical proofs are given in the Appendix.

2 |. MODEL FORMULATION AND ESTIMATION OF CHANGE-POINT

This paper concerns change-point estimation for the one-parameter exponential family AR model (e.g., Cui and Zheng, 2017). The one-parameter exponential family consists of distributions whose probability density (mass) function is given by $p(y \mid \eta) = \exp\{\eta y - A(\eta)\}\, h(y)$, $y \in \mathbb{N}_0 = \{0, 1, 2, \ldots\}$, where $A(\cdot)$ and $h(\cdot)$ are known functions and $\eta$ is called the natural parameter. Let $\lambda = E(Y)$; then by standard results $\lambda = B(\eta)$ and $\mathrm{Var}(Y) = B'(\eta)$, where $B(\cdot) = A'(\cdot)$ is the first derivative of $A(\cdot)$. So, the natural parameter can be obtained via $\eta = B^{-1}(\lambda)$. The family includes the Poisson distribution, binomial distribution (with known total number of trials), and negative binomial distribution (with known number of successes). See Davis and Liu (2016) and Cui and Zheng (2017) for details.
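As a small numerical illustration of the identities $\lambda = B(\eta)$ and $\mathrm{Var}(Y) = B'(\eta)$, the sketch below uses the Poisson member of the family, for which $\eta = \ln\lambda$ and $A(\eta) = e^{\eta}$. The choice $h(y) = 1/y!$, the function name `poisson_pmf_expfam`, and the truncation of the support at 200 are assumptions made only for this example.

```python
import math

def poisson_pmf_expfam(y, eta):
    # Exponential-family form p(y | eta) = exp{eta*y - A(eta)} h(y),
    # with A(eta) = exp(eta) and h(y) = 1/y! (Poisson case).
    return math.exp(eta * y - math.exp(eta) - math.lgamma(y + 1))

eta = math.log(2.5)     # natural parameter corresponding to mean lambda = 2.5
support = range(200)    # truncated support; the omitted tail mass is negligible here

mean = sum(y * poisson_pmf_expfam(y, eta) for y in support)
var = sum((y - mean) ** 2 * poisson_pmf_expfam(y, eta) for y in support)

# For the Poisson case B(eta) = B'(eta) = exp(eta), so both should be close to 2.5.
print(mean, var)
```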

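For $t \in \mathbb{Z}$, let $\{Y_t\}$ be a time series of counts, and $\mathcal{F}_t = \sigma\{(Y_t, \lambda_t), (Y_{t-1}, \lambda_{t-1}), \ldots\}$ be the σ-algebra generated by the information up to time $t$, where $\lambda_t = E(Y_t \mid \mathcal{F}_{t-1})$ is the conditional mean process. The one-parameter exponential family AR model is specified as follows:

$$\begin{cases} Y_t \mid \mathcal{F}_{t-1} \sim p(y \mid \eta_t), \\ \eta_t = B^{-1}(\lambda_t), \\ \lambda_t = f_\theta(Y_{t-1}, \ldots, \lambda_{t-1}, \ldots), \end{cases} \tag{1}$$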

where $f_\theta(\cdot)$ is a measurable non-negative function, dependent on a parameter vector $\theta = (\theta_1, \ldots, \theta_d) \in \Theta$, with $\Theta \subset \mathbb{R}^d$ being a compact set. To estimate the true parameter of (1), denoted by $\theta_0$, Davis and Liu (2016) and Cui and Zheng (2017) implemented the conditional maximum likelihood estimation (CMLE) method, and Ahmad and Francq (2016) applied the Poisson quasi-maximum likelihood estimation (PQMLE) method. These techniques rely on the property that, under some regularity conditions, there exists a unique stationary solution of (1), denoted as $\{(Y_t, \lambda_t)\}$, and a set of functions of past values of $Y_t$, denoted by $f_\infty^{\theta}(\cdot)$, is properly defined for all $\theta \in \Theta$. At any time $t$, the true parameter value $\theta_0$ renders the unique stationary conditional mean process, $\lambda_t = f_\infty^{\theta_0}(Y_{t-1}, Y_{t-2}, \ldots)$, almost surely. Hereafter, the notation $\lambda_t(\theta)$ is used to represent the state process defined by $f_\infty^{\theta}(Y_{t-1}, Y_{t-2}, \ldots)$. For more information, please refer to the Lipschitz-type conditions in Davis and Liu (2016) and Cui and Zheng (2017).

Suppose that $T_n = \{Y_1, \ldots, Y_n\}$ is the observed count series. Let $k_0 = \lfloor n\tau_0 \rfloor$, for some $\tau_0 \in (0, 1)$, be the true change-point in (1), the underlying autoregressive mechanism that generates $T_n$. That is, $T_n$ comprises two segments, $T_{1n} = \{Y_1, \ldots, Y_{k_0}\}$ and $T_{2n} = \{Y_{k_0+1}, \ldots, Y_n\}$, which are realizations of strictly stationary and ergodic processes from (1) with true parameters $\theta_{10}$ and $\theta_{20}$, respectively, and $\theta_{10} \neq \theta_{20}$. The goal is to estimate $k_0$ as well as $\theta_{i0}$, $i = 1, 2$. For change-point estimation of the one-parameter exponential family AR model, the following stability property is required for the underlying model (1) of each segment, when the true parameter is given by $\theta_{10}$ or $\theta_{20}$:

(S) There exists a unique solution of (1), which is a strictly stationary and ergodic process.

To estimate $k_0$, the basic idea is to divide $T_n$ into two parts, $\{Y_1, \ldots, Y_k\}$ and $\{Y_{k+1}, \ldots, Y_n\}$, at a certain position $k$, and then calculate the maximum conditional log-likelihood of each part. The overall conditional log-likelihood for $T_n$ at $k$ is computed as the sum of the two maximum conditional log-likelihood values. Then, the estimator of $k_0$ is the dividing point, denoted as $\hat{k}_n$, that yields the maximum overall conditional log-likelihood among all choices of $k$.

In the literature, the computation of conditional log-likelihood is based on the CMLE or the PQMLE methods, which both depend on the state process λt(θ). Thus, it is crucial to compute the state process λt(θ) for any given θ ∈ Θ. In the ideal situation, when values of λt(θ), t = 1, …, n, are available for all θ ∈ Θ, the conditional log-likelihood function of the data, {Y1, …, Yk}, at any θ1 ∈ Θ, up to a constant term, is given by

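$$L_{1n}(k, \theta_1) = \sum_{t=1}^{k} \ell_t(\theta_1) = \sum_{t=1}^{k} \{\eta_t(\theta_1) Y_t - A(\eta_t(\theta_1))\}, \tag{2}$$

with $\eta_t(\theta_1) = B^{-1}(\lambda_t(\theta_1))$. Similarly, the conditional log-likelihood function of the data, $\{Y_{k+1}, \ldots, Y_n\}$, at any $\theta_2 \in \Theta$, up to a constant term, is given by

$$L_{2n}(k, \theta_2) = \sum_{t=k+1}^{n} \ell_t(\theta_2) = \sum_{t=k+1}^{n} \{\eta_t(\theta_2) Y_t - A(\eta_t(\theta_2))\}. \tag{3}$$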

It follows that the overall conditional log-likelihood for Tn at k is computed as Ln(k, θ1, θ2) = L1n(k, θ1) + L2n(k, θ2). Borrowing the idea of Ling (2016), we propose a three-step sequential procedure to estimate k0, as well as θ10 and θ20:

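(1) For each $k = n^*, \ldots, n - n^*$, calculate

$$\hat{\theta}_{1n}(k) = \arg\max_{\theta_1 \in \Theta} L_{1n}(k, \theta_1)$$

and

$$\hat{\theta}_{2n}(k) = \arg\max_{\theta_2 \in \Theta} L_{2n}(k, \theta_2);$$

(2) The change-point estimator is given by

$$\hat{k}_n = \arg\max_{k \in \{n^*, \ldots, n - n^*\}} L_n(k, \hat{\theta}_{1n}(k), \hat{\theta}_{2n}(k));$$

(3) The parameter estimators, $\hat{\theta}_{10}$ of $\theta_{10}$ and $\hat{\theta}_{20}$ of $\theta_{20}$, are given by $\hat{\theta}_{1n}(\hat{k}_n)$ and $\hat{\theta}_{2n}(\hat{k}_n)$, respectively.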

Note that in the above three-step procedure, a small integer $n^*$ is introduced such that $k$ is selected from the set $\{n^*, \ldots, n - n^*\}$, in order to allow for enough observations in the two parts separated at each $k$, and make parameter estimation feasible.
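The three-step procedure can be sketched in a few lines for the Poisson INARCH(1) special case, $\lambda_t = \delta + \beta Y_{t-1}$, with $\ell_t(\theta) = Y_t \ln\lambda_t(\theta) - \lambda_t(\theta)$. This is only an illustrative sketch under stated assumptions: the maximization over $\Theta$ is replaced by a search over a coarse, hypothetical parameter grid, the term for $t = 1$ is dropped because it has no observed lag, and the helper names (`poisson_draw`, `simulate_inarch1`, `estimate_change_point`) and all numerical settings are choices made for the example, not the authors' implementation. Because $\lambda_t(\theta)$ depends only on $Y_{t-1}$ here, prefix sums of $\ell_t(\theta)$ give $L_{1n}(k, \theta)$ and $L_{2n}(k, \theta)$ for every $k$ at once.

```python
import math
import random

def poisson_draw(rng, lam):
    # Knuth's multiplication method; adequate for the small means used here.
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate_inarch1(rng, n, k0, theta1, theta2):
    # Poisson INARCH(1): lambda_t = delta + beta * Y_{t-1}; parameters switch after k0.
    y, prev = [], 0
    for t in range(1, n + 1):
        delta, beta = theta1 if t <= k0 else theta2
        prev = poisson_draw(rng, delta + beta * prev)
        y.append(prev)
    return y

def estimate_change_point(y, grid, n_star):
    n = len(y)
    # prefix[g][k] = sum_{t=2}^{k} l_t(theta_g), l_t = Y_t log(lambda_t) - lambda_t.
    prefix = []
    for delta, beta in grid:
        acc, row = 0.0, [0.0, 0.0]          # entries for k = 0 and k = 1
        for t in range(2, n + 1):
            lam = delta + beta * y[t - 2]   # y[t - 2] holds Y_{t-1}
            acc += y[t - 1] * math.log(lam) - lam
            row.append(acc)
        prefix.append(row)
    best = None
    for k in range(n_star, n - n_star + 1):
        # Step 1: segment-wise maximizers (over the grid) for this split.
        g1 = max(range(len(grid)), key=lambda g: prefix[g][k])
        g2 = max(range(len(grid)), key=lambda g: prefix[g][n] - prefix[g][k])
        total = prefix[g1][k] + prefix[g2][n] - prefix[g2][k]
        # Step 2: keep the split with the largest overall log-likelihood.
        if best is None or total > best[0]:
            best = (total, k, grid[g1], grid[g2])
    # Step 3: report the segment estimates at the selected split.
    return best[1], best[2], best[3]

rng = random.Random(0)
y = simulate_inarch1(rng, n=200, k0=100, theta1=(1.0, 0.3), theta2=(6.0, 0.3))
grid = [(0.25 * i, 0.1 * j) for i in range(1, 41) for j in range(0, 10)]
k_hat, theta1_hat, theta2_hat = estimate_change_point(y, grid, n_star=20)
print(k_hat, theta1_hat, theta2_hat)
```

In practice the grid search would be replaced by a numerical optimizer for the CMLE or PQMLE; the grid is used here only to keep the sketch free of external dependencies.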

3 |. PROPERTIES OF THE CHANGE-POINT ESTIMATOR

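In this paper, the norm $\|\cdot\|$ is defined as follows, unless otherwise specified: $\|a\| = \sqrt{a'a}$ for a (column) vector $a$ and $\|B\| = \sqrt{\sum_{i,j} b_{i,j}^2}$ for a matrix $B$, where $b_{i,j}$ denotes the element in position $(i, j)$. We define the ball with center $\theta \in \Theta$ and radius $\phi > 0$ as $V(\theta, \phi) = \{a \in \Theta : \|a - \theta\| < \phi\}$, and let $D_t(\theta) = \partial \ell_t(\theta)/\partial\theta$, $P_t(\theta) = -\partial^2 \ell_t(\theta)/\partial\theta\,\partial\theta'$, $\Sigma_\theta = E(P_t(\theta))$, and $\Omega_\theta = E(D_t(\theta) D_t(\theta)')$. To explore the asymptotic properties of $\hat{k}_n$ and $\hat{\theta}_{i0}$, $i = 1, 2$, the following assumptions are needed: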

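(C1) $E(\sup_{\theta \in \Theta} |\ell_t(\theta)|) < \infty$, and $E(\ell_t(\theta))$ is maximized at $\theta_0$.

(C2) It holds true that $\frac{1}{n} \sup_{\theta \in \Theta} \left| \sum_{t=-n}^{-1} [\ell_t(\theta) - E(\ell_t(\theta))] \right| \xrightarrow{a.s.} 0$, as $n \to \infty$.

(C3) $\theta_0$ is an interior point of $\Theta$, $\ell_t(\theta)$ is twice continuously differentiable in $\theta$ and the matrix $\Sigma_{\theta_0}$ is invertible, and in addition, $E \sup_{\theta \in V(\theta_0, \phi)} \|P_t(\theta)\| < \infty$ for some $\phi > 0$.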

(C4) Dt(θ0) is a martingale difference sequence with covariance matrix given by Ωθ0, and Ωθ0 is invertible.

By Ling (2016), it is straightforward to show that the conditions (S) and (C1) − (C4) guarantee the convergence of the parameter estimators and the change-point estimator, and induce the related asymptotic distributions.

Theorem 1 Assume that model (1) satisfies the condition (S), assumptions (C1) and (C2), for θ = θ10 and θ = θ20. Then,

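(a) $\hat{\tau}_n = \tau_0 + o_p(1)$, where $\hat{\tau}_n = \hat{k}_n / n$ is the change-point estimator in rescaled time;

(b) $\hat{\theta}_{1n} = \theta_{10} + o_p(1)$ and $\hat{\theta}_{2n} = \theta_{20} + o_p(1)$;

(c) $\hat{k}_n = k_0 + O_p(1)$.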

Before discussing the asymptotic distribution of the change-point estimator in non-rescaled time, $\hat{k}_n$, we first define a two-sided random walk by

$$W(k, \theta_{10}, \theta_{20}) = \begin{cases} \sum_{t=1}^{k} \left( \ell_{2t}(\theta_{10}) - \ell_{2t}(\theta_{20}) \right), & \text{if } k > 0, \\ 0, & \text{if } k = 0, \\ \sum_{t=k}^{-1} \left( \ell_{1t}(\theta_{20}) - \ell_{1t}(\theta_{10}) \right), & \text{if } k < 0. \end{cases} \tag{4}$$

When $k < 0$, the count series is the stationary and ergodic solution to (1) with the true parameter $\theta_{10}$. To avoid ambiguity, we write $Y_t$ as $Y_{1t}$ in this case. That is, $Y_{1t}$ is generated from a stationary ergodic regime $M(\theta_{10})$ determined by $\theta_{10}$, which is denoted as $Y_{1t} \in M(\theta_{10})$. And, we write $\ell_t(\theta)$ as $\ell_{1t}(\theta)$. Then, the corresponding score vector and Hessian matrix of the log-likelihood function are represented by $D_{1t}(\theta)$ and $-P_{1t}(\theta)$, respectively. Likewise, when $k > 0$, we have the notations $Y_{2t} \in M(\theta_{20})$, $\ell_{2t}(\theta)$, $D_{2t}(\theta)$, and $-P_{2t}(\theta)$. The next theorem establishes the asymptotic distributions of the estimators.

Theorem 2 Assume that model (1) satisfies the condition (S) and assumptions (C1)–(C4), for θ = θ10 and θ = θ20. Then, as n → ∞,

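(a) $\sqrt{n}(\hat{\theta}_{1n} - \theta_{10}) \xrightarrow{D} N(0, \Sigma_{\theta_{10}}^{-1} \Omega_{\theta_{10}} \Sigma_{\theta_{10}}^{-1})$, and $\sqrt{n}(\hat{\theta}_{2n} - \theta_{20}) \xrightarrow{D} N(0, \Sigma_{\theta_{20}}^{-1} \Omega_{\theta_{20}} \Sigma_{\theta_{20}}^{-1})$;

(b) $\hat{k}_n - k_0 \xrightarrow{D} \arg\max_{k \in \mathbb{Z}} W(k, \theta_{10}, \theta_{20})$, where $W(k, \theta_{10}, \theta_{20})$ is defined by (4).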

4 |. APPROXIMATING DISTRIBUTION FOR THE CHANGE-POINT ESTIMATOR

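This section develops a result for approximating the limiting distribution of $\hat{k}_n - k_0$. Let $d = \theta_{10} - \theta_{20}$. We adopt the idea of Ling (2016) and assume that $\theta_{20}$ is fixed and $\|d\| \to 0$. In light of (4), let $W_d(k) = W(k, \theta_{10}, \theta_{20}) = W(k, \theta_{20} + d, \theta_{20})$ and $m = (d'\Sigma_{\theta_{20}} d)^{-2} (d'\Omega_{\theta_{20}} d)$. By the assumption (C3) and the ergodic theorem, it can be shown that, for any $\epsilon > 0$, there exists a ball around $\theta_{20}$, denoted as $V(\theta_{20}, \phi)$, such that

$$\lim_{n \to \infty} P\left( \frac{1}{n} \sup_{\theta \in V(\theta_{20}, \phi)} \left\| \sum_{t=1}^{n} [P_{2t}(\theta) - \Sigma_{\theta_{20}}] \right\| > \epsilon \right) = 0.$$

As a result, if $\theta_n = \theta_{20} + o_p(1)$, then $\frac{1}{n} \sum_{t=1}^{n} P_{2t}(\theta_n) = \Sigma_{\theta_{20}} + o_p(1)$. In addition, given that $\Sigma_{\theta_{20}}$ and $\Omega_{\theta_{20}}$ are invertible, it holds that $m \to \infty$ as $\|d\| \to 0$. Therefore, we have, for any $z > 0$,

$$\sum_{t=1}^{\lfloor mz \rfloor} [\ell_{2t}(\theta_{10}) - \ell_{2t}(\theta_{20})] = \sum_{t=1}^{\lfloor mz \rfloor} d' D_{2t}(\theta_{20}) - \frac{\lfloor mz \rfloor}{2}\, d' [\Sigma_{\theta_{20}} + o_p(1)]\, d \tag{5}$$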

as ∥d∥ → 0. Furthermore, for dealing with the backward sum, the following assumption is needed:

(A1) For any z > 0, as ∥d∥ → 0, it holds true that

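$$\sum_{t=-\lfloor mz \rfloor}^{-1} [\ell_{1t}(\theta_{20}) - \ell_{1t}(\theta_{10})] = -\sum_{t=-\lfloor mz \rfloor}^{-1} d' D_{2t}(\theta_{20}) - \frac{\lfloor mz \rfloor}{2}\, d' [\Sigma_{\theta_{20}} + o_p(1)]\, d,$$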

where op(1) denotes convergence to zero in probability as ∥d∥ → 0.

Ling (2016) applied the invariance principle for a near-epoch dependent martingale difference sequence to establish an approximating distribution for the change-point estimator. To achieve the same for integer-valued time series, we resort to Theorem 4 of Kuelbs and Philipp (1980) on the invariance principle for strong mixing processes, which has also been used by Davis et al. (1995) for parameter change-point estimation of time series.

Let $\{\upsilon_t\}$ be a sequence of random variables (vectors), and $\mathcal{F}_{-\infty}^{j}$ and $\mathcal{F}_{k}^{\infty}$ be the σ-fields generated by $\{\upsilon_t, t \le j\}$ and $\{\upsilon_t, t \ge k\}$. Then $\{\upsilon_t\}$ is said to be strongly mixing or α-mixing, if as $k \to \infty$

$$\alpha(k) = \sup_{j} \sup_{A \in \mathcal{F}_{-\infty}^{j},\, B \in \mathcal{F}_{j+k}^{\infty}} |P(A \cap B) - P(A)P(B)| \to 0.$$

If the process is strictly stationary, then the strong mixing coefficients are simply given by

$$\alpha(k) = \sup_{A \in \mathcal{F}_{-\infty}^{0},\, B \in \mathcal{F}_{k}^{\infty}} |P(A \cap B) - P(A)P(B)|.$$

The process is said to be geometrically strongly mixing, or strong mixing with geometric rate, if $\alpha(k)$ converges to zero at an exponential rate, $\alpha(k) = O(\psi^k)$ for some $\psi \in (0, 1)$. Define $\{\nu_t\}$ as the reversed process of $\{\upsilon_t\}$, i.e., $\nu_t = \upsilon_{-t}$ for $t \in \mathbb{Z}$. By the definition of $\alpha(k)$, if $\{\upsilon_t\}$ is strictly stationary, then $\{\nu_t\}$ possesses the same strong mixing property as $\{\upsilon_t\}$.

To derive functional limit results for the partial sum $\sum_{t=1}^{\ell} d' D_{2t}(\theta_{20})$ when $\ell > 0$ and the backward sum $\sum_{t=\ell}^{-1} d' D_{2t}(\theta_{20})$ when $\ell < 0$, one needs a polynomial decay of the mixing coefficients. Because the definition of strong mixing for strictly stationary time series is symmetric, the reversed process inherits the strong mixing property of the unreversed process, and only the mixing property of the unreversed process needs to be specified. In particular, we impose the following assumption:

(A2) The process $D_{2t}(\theta_{20})$ is a strictly stationary process with finite $(2 + \iota)$th moment; namely, $E\|D_{2t}(\theta_{20})\|^{2+\iota} < \infty$ for some $\iota \in (0, 1]$, and it is a strong mixing process with $\alpha(k)$ satisfying $\alpha(k) = o(k^{-(1+\varepsilon)(1+2/\iota)})$ for some $\varepsilon > 0$.

Note that a sufficient condition guaranteeing the polynomial decay rate is that the strong mixing coefficients $\alpha(k)$ decay at a geometric rate. We establish the following result for the two-sided random walk (4), by combining the assumptions (A1)–(A2) and the invariance principle for strong mixing processes.

Theorem 3 Suppose that the assumptions (C3), (C4), (A1), and (A2) hold true for θ10 and θ20. Then, given any M > 0, we have

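$$\arg\max_{z \in [-M, M]} \gamma_d W_d(\lfloor mz \rfloor) \xrightarrow{D} \arg\max_{z \in [-M, M]} \left\{ B(z) - \tfrac{1}{2}|z| \right\},$$

as $\|d\| \to 0$, where $\gamma_d = (d'\Sigma_{\theta_{20}} d)(d'\Omega_{\theta_{20}} d)^{-1}$ and $B(z)$ is the two-sided standard Brownian motion in $\mathbb{R}$ with $B(0) = 0$.

Define $V = \arg\max_{z \in \mathbb{R}} \left\{ B(z) - \tfrac{1}{2}|z| \right\}$. The distribution of $V$ has been studied by Yao (1987) and other researchers; it is shown that $V$ has a symmetric distribution and for $a > 0$ the distribution function is given by

$$P(V \le a) = 1 + \sqrt{\frac{a}{2\pi}}\, e^{-a/8} + \frac{3}{2}\, e^{a}\, \Phi\!\left(-\frac{3}{2}\sqrt{a}\right) - \frac{1}{2}(a + 5)\, \Phi\!\left(-\frac{1}{2}\sqrt{a}\right), \tag{6}$$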

where Φ(·) represents the standard normal distribution function. Let the quantiles υα be defined by P(V > υα) = α. Through (6), we obtain υ.05 = 7.6873 and υ.025 = 11.0333. The distribution of V can be viewed as a special case of a more general class of distributions for the maximum of asymmetric two-sided Brownian motion with triangular drift (Stryhn, 1996).
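The quantiles quoted above can be checked numerically from (6). The following sketch evaluates the distribution function with the standard library and inverts it by bisection; the function names `cdf_V` and `upper_quantile`, the bisection bracket, and the tolerance are assumptions made for this illustration.

```python
import math

def std_normal_cdf(x):
    # Standard normal distribution function via the complementary error function.
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def cdf_V(a):
    # Distribution function (6) for a >= 0; symmetry gives P(V <= -a) = 1 - P(V <= a).
    if a < 0:
        return 1.0 - cdf_V(-a)
    s = math.sqrt(a)
    return (1.0
            + math.sqrt(a / (2.0 * math.pi)) * math.exp(-a / 8.0)
            + 1.5 * math.exp(a) * std_normal_cdf(-1.5 * s)
            - 0.5 * (a + 5.0) * std_normal_cdf(-0.5 * s))

def upper_quantile(alpha, hi=60.0, tol=1e-10):
    # Solve P(V > v) = alpha for v > 0 by bisection on [0, hi].
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if 1.0 - cdf_V(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# These should agree with the values 7.6873 and 11.0333 reported in the text.
print(upper_quantile(0.05), upper_quantile(0.025))
```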

Let Fd(x) denote the distribution function of argmaxk Wd(k). Following the arguments of Ling (2016), we can show that, given any ϵ > 0, there exists an M > 0 such that

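$$\left| F_d(x) - P\left( m \arg\max_{z \in [-M, M]} \gamma_d W_d(\lfloor mz \rfloor) \le x \right) \right| \le \epsilon,$$

from which it follows that, if $M$ is large enough, the distribution of $m^{-1} \arg\max_k W_d(k)$ can be well approximated by the distribution of $\arg\max_{z \in [-M, M]} \gamma_d W_d(\lfloor mz \rfloor)$, i.e.,

$$\left| P\left( m^{-1} \arg\max_{k} W_d(k) \le x \right) - P\left( \arg\max_{z \in [-M, M]} \gamma_d W_d(\lfloor mz \rfloor) \le x \right) \right| \le \epsilon.$$

Finally, invoking Theorems 2 and 3, we conclude that an adequate approximating distribution to the limiting distribution of $\hat{k}_n$, when $d$ is small, is given by

$$P\left( (d'\Sigma_{\theta_{20}} d)^2 (d'\Omega_{\theta_{20}} d)^{-1} (\hat{k}_n - k_0) \le x \right) \approx P(V \le x).$$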
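One way to read this approximation is as a recipe for an approximate confidence interval for $k_0$: since $V$ is symmetric, $P(|V| \le \upsilon_{.025}) = 0.95$, so inverting the display gives $\hat{k}_n \pm \upsilon_{.025} / \{(d'\Sigma_{\theta_{20}} d)^2 (d'\Omega_{\theta_{20}} d)^{-1}\}$. The sketch below carries out this arithmetic. It assumes that $d$, $\Sigma_{\theta_{20}}$, and $\Omega_{\theta_{20}}$ are supplied by the user (in practice they would be replaced by estimates, which is a further assumption not covered by the display); the function names and the numerical inputs are hypothetical.

```python
def quad_form(d, M):
    # d' M d for a vector d (list) and a square matrix M (list of lists).
    return sum(d[i] * M[i][j] * d[j] for i in range(len(d)) for j in range(len(d)))

def change_point_interval(k_hat, d, Sigma, Omega, upsilon=11.0333):
    # upsilon defaults to the .025 upper quantile of V quoted in the text,
    # which gives an approximate 95% interval.
    scale = quad_form(d, Sigma) ** 2 / quad_form(d, Omega)
    half_width = upsilon / scale
    return k_hat - half_width, k_hat + half_width

# Hypothetical one-parameter inputs: d = 0.5, Sigma = Omega = 2.
lower, upper = change_point_interval(100, [0.5], [[2.0]], [[2.0]])
print(lower, upper)
```

With these inputs the scale factor is $0.5^2 / 0.5 = 0.5$, so the half-width is $11.0333 / 0.5 = 22.0666$. Smaller changes $d$ lead to wider intervals, as expected.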

5 |. INARCH MODELS

In this section, we focus our discussion on the fundamental properties of the integer-valued generalized autoregressive conditional heteroskedastic (INGARCH) models. We show that INARCH models, which are special cases of the INGARCH models, satisfy all the conditions of Theorems 1–3, and therefore, the proposed change-point technique and particularly its asymptotic properties are applicable.

5.1 |. Asymptotic properties of change-point estimator for INARCH models

The INGARCH model of Ferland et al. (2006) is capable of modeling count time series with overdispersion. The model, originally specified to have Poisson conditional distribution, can be extended to incorporate the one-parameter exponential family via the formulation (1)

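$$\begin{cases} Y_t \mid \mathcal{F}_{t-1} \sim p(y \mid \eta_t), \quad t \in \mathbb{Z}, \\ \eta_t = B^{-1}(\lambda_t), \\ \lambda_t = \delta + \sum_{i=1}^{p} \alpha_i \lambda_{t-i} + \sum_{j=1}^{q} \beta_j Y_{t-j}, \end{cases} \tag{7}$$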

where δ > 0, αi ≥ 0 for i = 1, …, p, and βj ≥ 0 for j = 1, …, q.

The specified marginal distribution in (7) can be generated by Poisson processes. A Poisson process with rate μ, denoted as N(·; μ), is a stochastic process, which is often used as a model to keep track of the number of occurrences of a certain event along the timeline. For the existence of Poisson processes on some measurable space, please refer to Ash and Doléans-Dade (2000). A Poisson process N(·; μ) satisfies three properties:

1. N(0; μ) = 0 a.s.

2. For given positive values $\gamma$ and $\varphi$ with $\gamma < \varphi$, the increment on $(\gamma, \varphi]$, which is equal to $N(\varphi; \mu) - N(\gamma; \mu)$, has a Poisson distribution with mean equal to $\mu(\varphi - \gamma)$, where $\mu$ is called the mean rate of the process.

3. The increments on pairwise disjoint intervals are independent.

Hereafter, to ease the notation, $\tilde{N}(\cdot)$ is adopted to denote a Poisson process with unit rate.

In the sequel, we assume that the observed count series of (7) with Poisson conditional distribution is constructed in the spirit of a mixed Poisson process

$$\begin{cases} Y_t = \tilde{N}_t(\lambda_t), \\ \lambda_t = \delta + \sum_{i=1}^{p} \alpha_i \lambda_{t-i} + \sum_{j=1}^{q} \beta_j Y_{t-j}, \end{cases} \tag{8}$$

where $\{\tilde{N}_t(\cdot), t \in \mathbb{Z}\}$ is a sequence of i.i.d. copies of $\tilde{N}(\cdot)$. It is easy to see that the above specification in (8) defines a Poisson INGARCH model. Compared with (7), the definition of (8) is more constructive, in the sense that it provides a concrete method to realize the conditional Poisson distribution as described by the first line of (7).
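The constructive nature of (8) can be made concrete with a short simulation. The sketch below, restricted to the INGARCH(1, 1) case for brevity, draws $Y_t = \tilde{N}_t(\lambda_t)$ by counting the arrivals of a fresh unit-rate Poisson process on $(0, \lambda_t]$, with arrival times built from i.i.d. standard exponential gaps. The function names, the burn-in length, and the parameter values are assumptions made for this illustration.

```python
import random

def unit_rate_count(rng, horizon):
    # Number of arrivals of a unit-rate Poisson process in (0, horizon]:
    # arrival times are cumulative sums of i.i.d. Exp(1) gaps.
    count, clock = 0, rng.expovariate(1.0)
    while clock <= horizon:
        count += 1
        clock += rng.expovariate(1.0)
    return count

def simulate_poisson_ingarch11(rng, n, delta, alpha, beta, burn_in=200):
    # lambda_t = delta + alpha * lambda_{t-1} + beta * Y_{t-1}, Y_t = N~_t(lambda_t).
    lam = delta / (1.0 - alpha - beta)   # start near the stationary mean
    y_prev, path = 0, []
    for t in range(n + burn_in):
        lam = delta + alpha * lam + beta * y_prev
        y_prev = unit_rate_count(rng, lam)   # a fresh, independent process for each t
        if t >= burn_in:
            path.append((y_prev, lam))
    return path

path = simulate_poisson_ingarch11(random.Random(1), n=50, delta=1.0, alpha=0.3, beta=0.4)
print(path[:5])
```

Given $\lambda_t$, the count on $(0, \lambda_t]$ is Poisson with mean $\lambda_t$ by the second property of the Poisson process listed above, which is exactly the first line of (7) in the Poisson case.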

For negative binomial INGARCH models, please refer to Zhu (2011). In the same vein, we adopt the method of Christou and Fokianos (2014) to construct negative binomial INGARCH processes. That is, the observed count series {Yt} evolves based on a mixed Poisson process with

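$$\begin{cases} Y_t = \tilde{N}_t(Z_t \lambda_t), \\ \lambda_t = \delta + \sum_{i=1}^{p} \alpha_i \lambda_{t-i} + \sum_{j=1}^{q} \beta_j Y_{t-j}, \end{cases} \tag{9}$$

where $\{\tilde{N}_t(\cdot), t \in \mathbb{Z}\}$ is a sequence of i.i.d. copies of $\tilde{N}(\cdot)$ and $\{Z_t\}$ consists of independent gamma variables with mean 1 and variance $1/r$. From the well-known results on the gamma-Poisson mixture, we can see that

$$P(Y_t = y \mid \lambda_t) = \frac{\Gamma(r + y)}{\Gamma(y + 1)\Gamma(r)} \left( \frac{r}{r + \lambda_t} \right)^{r} \left( \frac{\lambda_t}{r + \lambda_t} \right)^{y}, \quad y \in \mathbb{N}_0,$$

which shows that the count series $\{Y_t\}$ has a conditional negative binomial distribution at each $t$ and (9) defines a negative binomial INGARCH process.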
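As a quick sanity check of the mixture result, the sketch below evaluates the negative binomial probability mass function on a truncated support and confirms numerically that it sums to one with mean $\lambda_t$ and variance $\lambda_t + \lambda_t^2 / r$, the latter following from $Z_t$ having mean 1 and variance $1/r$. The function name `negbin_pmf`, the values $\lambda_t = 3$ and $r = 2$, and the truncation point are assumptions made for this example.

```python
import math

def negbin_pmf(y, lam, r):
    # Gamma-Poisson mixture pmf, computed on the log scale for stability.
    log_p = (math.lgamma(r + y) - math.lgamma(y + 1) - math.lgamma(r)
             + r * math.log(r / (r + lam)) + y * math.log(lam / (r + lam)))
    return math.exp(log_p)

lam, r = 3.0, 2.0
support = range(1000)   # truncated support; the omitted tail is negligible here

total = sum(negbin_pmf(y, lam, r) for y in support)
mean = sum(y * negbin_pmf(y, lam, r) for y in support)
var = sum((y - mean) ** 2 * negbin_pmf(y, lam, r) for y in support)

# Expected: total close to 1, mean close to lam, var close to lam + lam**2 / r.
print(total, mean, var)
```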

Before discussing the statistical inference for change-point estimation, we present the stochastic stability, especially the strong mixing property, for the Poisson INGARCH and negative binomial INGARCH models.

Theorem 4 Assume model (8) or model (9) satisfies the conditions that $\delta > 0$, $\sum_{i=1}^{p} \alpha_i + \sum_{j=1}^{q} \beta_j < 1$, $\alpha_i \ge 0$ for $i = 1, \ldots, p$, and $\beta_j \ge 0$ for $j = 1, \ldots, q$. Then, each of them has a strictly stationary solution, denoted by $\{(Y_t, \lambda_t)\}$, which is unique and ergodic and has a finite first moment $E(Y_t) < \infty$; additionally, $\{Y_t\}$ is strong mixing with geometric rate. Furthermore, the strictly stationary solution to the Poisson INGARCH model has finite moments of any order.

For statistical inference of the INGARCH models, the exact form of $\ell_t(\theta_1)$ in (2) and (3) is determined by the specific conditional distributions. It is well-known that for the Poisson distribution, $\eta = \ln\lambda$ and $A(\eta) = \lambda$, and for the negative binomial case $\eta = \ln\frac{\lambda}{r + \lambda}$ and $A(\eta) = -r \ln\frac{r}{r + \lambda}$. Thus, for the log-likelihood function of the Poisson distribution in Fokianos et al. (2009) and the log Poisson quasi-likelihood function in Christou and Fokianos (2014) or Ahmad and Francq (2016), $\ell_t(\theta) = Y_t \ln\lambda_t(\theta) - \lambda_t(\theta)$; for the log-likelihood function of the negative binomial distribution in Cui and Zheng (2017), $\ell_t(\theta) = Y_t \ln\frac{\lambda_t(\theta)}{r + \lambda_t(\theta)} + r \ln\frac{r}{r + \lambda_t(\theta)}$.

The INARCH models, as special cases of the above INGARCH models, have $p = 0$, and the serial dependence of the count series is introduced through the past values of $Y_t$, i.e., $\lambda_t = \delta + \sum_{j=1}^{q} \beta_j Y_{t-j}$. In this setting, the parameter vector is $\theta = (\theta_1, \ldots, \theta_{q+1}) = (\delta, \beta_1, \ldots, \beta_q)$, and the true parameter vector is $\theta_0 = (\theta_{01}, \ldots, \theta_{0,q+1}) = (\delta_0, \beta_{01}, \ldots, \beta_{0q})$. We examine three scenarios of the INARCH models for change-point estimation:

Case 1 CMLE for the Poisson INARCH models,

Case 2 PQMLE for the negative binomial INARCH models,

Case 3 CMLE for the negative binomial INARCH models.

To estimate the true parameter and achieve the asymptotic properties for the change-point estimator developed in previous sections, we require that INARCH models satisfy

(R1) $\theta_0$ is an interior point of $\Theta$, a compact subset of $[\delta, \infty) \times [0, \infty)^q$, with $\delta$ being a positive constant and $\sum_{j=1}^{q} \beta_{0j} < 1$.

In light of Theorem 4, the stability condition (S) for the Poisson or the negative binomial INARCH models is satisfied under $\sum_{j=1}^{q} \beta_{0j} < 1$. The condition (C1) has been established in the literature; see Davis and Liu (2016) and Cui and Wang (2019). The condition (C3) holds true under a certain finite moment condition and (R1). In particular, for the negative binomial case, when $E(Y_t^2) < \infty$, Ahmad and Francq (2016) and Cui and Wang (2019) established (C3) for Case 2 and Case 3, respectively. On the other hand, since the Poisson INGARCH models have finite moments of any order, it is easy to show that there exists some $V(\theta_0, \phi)$ such that $E \sup_{\theta \in V(\theta_0, \phi)} \|P_t(\theta)\| < \infty$ is true for Case 1. The condition (C4) has been proved for all three cases by Ahmad and Francq (2016) and Cui and Wu (2020).

The remaining condition, (C2), can be established under strong mixing properties. We present the following strong law of large numbers for both forward and backward sums of a strong mixing process.

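Lemma 1 For a strong mixing process $\{\upsilon_t\}_{t=-\infty}^{\infty}$, assume that $E(\upsilon_t) = 0$, $\sup_t E(|\upsilon_t|^{2+\iota}) < \infty$ and the strong mixing coefficients satisfy $\sum_{k=0}^{\infty} \alpha(k)^{\frac{1}{2+\iota}} < \infty$, for some $\iota > 0$. Then, there exists some $\delta > 0$ such that

$$\frac{1}{n} \sum_{t=1}^{n} \upsilon_t = o(n^{-\delta}) \ \text{a.s.} \quad \text{and} \quad \frac{1}{n} \sum_{t=-n}^{-1} \upsilon_t = o(n^{-\delta}) \ \text{a.s.}$$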

By virtue of Theorem 4, the condition (R1) entails that the conditional mean process {λt(θ)} is geometrically strongly mixing, since it depends on finitely many past values of {Yt} (Davis and Mikosch, 2009). In turn, using Lemma 1, we can show that (C2) is satisfied for INARCH models.

Lemma 2 Suppose that the INARCH model with Poisson or negative binomial conditional distribution satisfies (R1), and the strictly stationary solution has a finite third moment, $E(Y_t^3) < \infty$. Then, (C2) holds true for Case 1, Case 2, and Case 3.

In summary, we have verified that all the conditions imposed in Section 3 are satisfied for the three scenarios considered in this section. Hence, the following result is in order.

Theorem 5 Suppose that the INARCH model with Poisson or negative binomial conditional distribution satisfies the assumption (R1) and, in addition, for the negative binomial model, the third moment is finite, i.e., $E(Y_t^3) < \infty$. Then, Theorems 1 and 2 hold true for

(a) the CMLE-based sequential procedure to estimate the change-point of the Poisson or the negative binomial INARCH model, and

(b) the PQMLE-based sequential procedure to estimate the change-point of the negative binomial INARCH model.

5.2 |. Approximating distribution for the change-point estimator of INARCH models

Next, we move on to the approximating distribution of $\hat{k}_n$. The models with Poisson conditional distributions are considered first. Some additional assumptions regarding the underlying stochastic models of the count series are needed to ensure the assumptions (A1)–(A2). Two general results for INGARCH models are provided, and then it is shown that Theorem 3 holds true for INARCH models in particular.

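With regard to (4), we assume that the count series $\{Y_{1t}\}$ and $\{Y_{2t}\}$ with Poisson conditional distribution are generated based on the same sequence of i.i.d. unit rate Poisson processes, denoted by $\{\tilde{N}_t(\cdot), t \in \mathbb{Z}\}$, with true parameter vectors $\theta_{10} = (\delta_{10}, \alpha_{11}, \ldots, \alpha_{1p}, \beta_{11}, \ldots, \beta_{1q})$ and $\theta_{20} = (\delta_{20}, \alpha_{21}, \ldots, \alpha_{2p}, \beta_{21}, \ldots, \beta_{2q})$, according to

$$\begin{cases} Y_{1t} = \tilde{N}_t(0, \lambda_{1t}), \\ \lambda_{1t} = \delta_{10} + \sum_{i=1}^{p} \alpha_{1i} \lambda_{1,t-i} + \sum_{j=1}^{q} \beta_{1j} Y_{1,t-j}, \end{cases} \tag{10}$$

and

$$\begin{cases} Y_{2t} = \tilde{N}_t(0, \lambda_{2t}), \\ \lambda_{2t} = \delta_{20} + \sum_{i=1}^{p} \alpha_{2i} \lambda_{2,t-i} + \sum_{j=1}^{q} \beta_{2j} Y_{2,t-j}. \end{cases} \tag{11}$$

As in (7), the true parameters are required to satisfy $\delta_{k0} > 0$, $\alpha_{ki} \ge 0$, and $\beta_{kj} \ge 0$, for $k \in \{1, 2\}$, $i = 1, \ldots, p$, and $j = 1, \ldots, q$. It turns out that, as $\|d\|$ goes to zero, the moment $E|Y_{1t} - Y_{2t}|^s$ also goes to zero for any $s > 0$.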

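Lemma 3 Suppose that $\sum_{i=1}^{p} \alpha_{ki} + \sum_{j=1}^{q} \beta_{kj} < 1$, $k \in \{1, 2\}$, for (10) and (11). Then, the strictly stationary count series of the models satisfy, as $\|d\| \to 0$,

$$E|Y_{2t} - Y_{1t}|^{s} \to 0, \quad \text{a.s.},$$
$$E|\lambda_{1t}(\theta_{10}) - \lambda_{2t}(\theta_{20})|^{s} \to 0, \quad \text{a.s.},$$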

for any integer s > 0.

Continuing with the above setting of (10) and (11) for the count processes, we can show that, as a consequence of Lemma 3, the score and Hessian matrix of $\ell_{1t}(\theta_{10})$ converge to those of $\ell_{2t}(\theta_{20})$, in the senses given below.

Lemma 4 Suppose that $\sum_{i=1}^{p} \alpha_{ki} + \sum_{j=1}^{q} \beta_{kj} < 1$, $k \in \{1, 2\}$, for (10) and (11). Then, the strictly stationary count series of the models satisfy, as $\|d\| \to 0$,

$$E\|D_{2t}(\theta_{20}) - D_{1t}(\theta_{10})\|^{2} \to 0,$$

and

$$E\|P_{2t}(\theta_{20}) - P_{1t}(\theta_{10})\| \to 0.$$

The assumption that the two segments of the time series, before and after the change-point, share the same sequence of i.i.d. unit rate Poisson processes is inspired by Ling (2016). Without such an assumption on the generating mechanism, the approximating distribution for $\hat{k}_n$ could not be obtained. Under our setting, it turns out that Theorem 3 holds true for the Poisson INARCH models.
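The effect of sharing the unit rate Poisson processes can be seen in a small simulation. The sketch below couples two Poisson INARCH(1) series in the manner of (10) and (11) with $p = 0$ and $q = 1$: at each $t$, one unit-rate process is read off at the two horizons $\lambda_{1t}$ and $\lambda_{2t}$. The function names, the starting values, and the parameter values are assumptions made for this illustration; the printed averages are an informal look at how $|Y_{2t} - Y_{1t}|$ shrinks as the parameter difference shrinks, not a verification of Lemma 3.

```python
import random

def coupled_counts(rng, lam1, lam2):
    # One unit-rate Poisson process read off at two horizons lam1 and lam2.
    top = max(lam1, lam2)
    y1 = y2 = 0
    clock = rng.expovariate(1.0)
    while clock <= top:
        if clock <= lam1:
            y1 += 1
        if clock <= lam2:
            y2 += 1
        clock += rng.expovariate(1.0)
    return y1, y2

def simulate_coupled_inarch1(rng, n, theta1, theta2):
    # Each series follows lambda_t = delta + beta * Y_{t-1} with its own parameters.
    (d1, b1), (d2, b2) = theta1, theta2
    y1 = y2 = 0
    pairs = []
    for _ in range(n):
        y1, y2 = coupled_counts(rng, d1 + b1 * y1, d2 + b2 * y2)
        pairs.append((y1, y2))
    return pairs

for shift in (1.0, 0.1, 0.01):
    pairs = simulate_coupled_inarch1(random.Random(7), 5000, (2.0, 0.4), (2.0 + shift, 0.4))
    print(shift, sum(abs(b - a) for a, b in pairs) / len(pairs))
```

A useful by-product of the coupling is monotonicity: when every component of the second parameter vector is at least as large as the first, $\lambda_{1t} \le \lambda_{2t}$ and hence $Y_{1t} \le Y_{2t}$ for all $t$, and identical parameter vectors give identical paths.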

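Theorem 6 Let the two-sided random walk (4) be built upon two Poisson INARCH models. Assume that $\theta_{10}$ and $\theta_{20}$ satisfy (R1) and the strictly stationary count series $\{Y_{1t}\}$ and $\{Y_{2t}\}$ are generated according to (10) and (11). Then, given any $M > 0$, as $\|d\| \to 0$,

$$\arg\max_{z \in [-M, M]} \gamma_d W_d(\lfloor mz \rfloor) \xrightarrow{D} \arg\max_{z \in [-M, M]} \left\{ B(z) - \tfrac{1}{2}|z| \right\},$$

where $\gamma_d = (d'\Sigma_{\theta_{20}} d)(d'\Omega_{\theta_{20}} d)^{-1}$ and $B(z)$ is the two-sided standard Brownian motion in $\mathbb{R}$ with $B(0) = 0$.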

The results of Theorem 5 and Theorem 6 for Poisson INARCH models can be readily extended to the Poisson INGARCH models, once the initial value issue and the strong mixing property issue are addressed.

Remark 1. The initial value issue arises from the fact that, for INGARCH models (with $p > 0$ and $q > 0$, and $\alpha_i > 0$ and $\beta_j > 0$ for certain $i$ and $j$), $\ell_t(\theta)$ in (2) and (3) depends on infinitely many past values of the count series, for which only finitely many observations are available in practice. As discussed in Ling and McAleer (2010) for non-linear time series in general, some initial values, denoted as $\hat{H}_0 = \{(\hat{Y}_t, \hat{\lambda}_t(\theta_1))\}_{t=-\infty}^{0}$, need to be selected, so that an approximation of $\lambda_t(\theta_1)$ can be computed iteratively from the conditional mean function given in (1). Let $\hat{\lambda}_t(\theta_1)$ denote such an approximation for $\{Y_1, \ldots, Y_k\}$. Then,

$$\hat{L}_{1n}(k, \theta_1) = \sum_{t=1}^{k} \hat{\ell}_t(\theta_1) = \sum_{t=1}^{k} \{\hat{\eta}_t(\theta_1) Y_t - A(\hat{\eta}_t(\theta_1))\}$$

works as a proxy for $L_{1n}(k, \theta_1)$, where $\hat{\eta}_t(\theta_1) = \eta(\hat{\lambda}_t(\theta_1)) = B^{-1}(\hat{\lambda}_t(\theta_1))$. Likewise, based on certain initial values, we approximate the conditional log-likelihood function of the data $\{Y_{k+1}, \ldots, Y_n\}$ at any $\theta_2 \in \Theta$, by $\hat{L}_{2n}(k, \theta_2) = \sum_{t=k+1}^{n} \hat{\ell}_t(\theta_2)$. Consequently, the overall log-likelihood function of the count series $T_n$, separated at $k$ with parameters $\theta_1, \theta_2 \in \Theta$, is given by

$$\hat{L}_n(k, \theta_1, \theta_2) = \hat{L}_{1n}(k, \theta_1) + \hat{L}_{2n}(k, \theta_2).$$

The proposed change-point estimation procedure will be carried out for $\hat{L}_n(k, \theta_1, \theta_2)$ instead of $L_n(k, \theta_1, \theta_2)$. To derive asymptotic results as in Theorem 1 and Theorem 2, initial value conditions should be imposed to regulate the differences between the approximation process and the true process in terms of $|\hat{\lambda}_t(\theta) - \lambda_t(\theta)|$, $\|\hat{D}_t(\theta) - D_t(\theta)\|$, and $\|\hat{P}_t(\theta) - P_t(\theta)\|$. In the literature, the regularity conditions are formulated in various ways under different situations. For a set of initial value conditions that work for most time series models with initial value issues, as well as INGARCH models, please refer to Assumption 2 of Ling and McAleer (2010). Also note that, although the regularity conditions may be formulated differently in various GARCH type models, the exponential boundedness properties are widely used in dealing with the initial value issues, such as inequality (19) in Cui and Wu (2020), Remark 2.3 in Ahmad and Francq (2016), inequality (7.58) in Francq and Zakoïan (2010), Lemma A.4 in Ling (2007a), and Lemma 5.8 in Berkes et al. (2003).
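To illustrate why initial values matter less and less as $t$ grows, the sketch below runs the INGARCH(1, 1) recursion $\hat{\lambda}_t = \delta + \alpha \hat{\lambda}_{t-1} + \beta Y_{t-1}$ from two different starting values on the same data. In this simple case the gap between the two approximations is exactly $\alpha^t$ times the initial gap, which is one concrete instance of the exponential boundedness mentioned above. The function name `approx_state_process`, the data, and the parameter values are hypothetical.

```python
def approx_state_process(y, delta, alpha, beta, lam0, y0=0):
    # Iterates lambda_hat_t = delta + alpha * lambda_hat_{t-1} + beta * Y_{t-1},
    # started from the chosen initial values (y0, lam0).
    lam_prev, y_prev, out = lam0, y0, []
    for y_t in y:
        lam_t = delta + alpha * lam_prev + beta * y_prev
        out.append(lam_t)
        lam_prev, y_prev = lam_t, y_t
    return out

y_obs = [3, 0, 2, 5, 1, 4, 2, 0, 1, 3]   # hypothetical counts
delta, alpha, beta = 1.0, 0.5, 0.3
run_a = approx_state_process(y_obs, delta, alpha, beta, lam0=1.0)
run_b = approx_state_process(y_obs, delta, alpha, beta, lam0=10.0)

for t, (a, b) in enumerate(zip(run_a, run_b), start=1):
    # The difference equals alpha**t * (1.0 - 10.0), shrinking geometrically in t.
    print(t, a - b)
```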

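Remark 2. The geometrically strongly mixing property is needed to satisfy (C2) and (A2). For the strictly stationary count series of an INGARCH model, there exist $\rho \in (0, 1)$, $K > 0$ and coefficients $\omega_\ell^{*}(\theta)$, $\omega_\ell^{i}(\theta)$, $\omega_\ell^{i,j}(\theta)$ for $\ell \ge 0$, determined by $\theta$, such that

$$\lambda_t(\theta) = \omega_0^{*}(\theta) + \sum_{\ell=1}^{\infty} \omega_\ell^{*}(\theta) Y_{t-\ell}, \quad \frac{\partial \lambda_t(\theta)}{\partial \theta_i} = \omega_0^{i}(\theta) + \sum_{\ell=1}^{\infty} \omega_\ell^{i}(\theta) Y_{t-\ell}, \quad \frac{\partial^2 \lambda_t(\theta)}{\partial \theta_i\, \partial \theta_j} = \omega_0^{i,j}(\theta) + \sum_{\ell=1}^{\infty} \omega_\ell^{i,j}(\theta) Y_{t-\ell}, \tag{12}$$

where $\sup_{\theta \in \Theta} |\omega_\ell^{*}(\theta)| \le K\rho^{\ell}$, $\sup_{\theta \in \Theta} |\omega_\ell^{i}(\theta)| \le K\rho^{\ell}$, and $\sup_{\theta \in \Theta} |\omega_\ell^{i,j}(\theta)| \le K\rho^{\ell}$, for all $i, j \in \{0, 1, \ldots, p + q\}$, from which it follows that $\ell_t(\theta)$ and its score and Hessian matrix are functions of the whole past history of the count series. However, a measurable function of infinitely many past values of the underlying process does not necessarily induce a strong mixing process, even though the underlying process is strong mixing. The proof of the strong mixing property for $\ell_t(\theta)$ and its score and Hessian matrix for INGARCH models remains a challenging problem. In this paper, we focus on the theoretical results for the INARCH processes. Nevertheless, in the simulation studies to be presented in the next section, the proposed method and the approximating distribution work well for INGARCH(1, 1) processes, and we are optimistic that (C2) and (A2) hold for INGARCH models.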

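Finally, we are in a position to discuss the negative binomial INARCH models. Let $\{Y_{1t}\}$ and $\{Y_{2t}\}$ be generated by (9) with the same sequence of i.i.d. Poisson processes $\{\tilde{N}_t(\cdot),\ t\in\mathbb{Z}\}$ and the same sequence of i.i.d. Gamma variables $\{Z_t\}$ through

$$\begin{cases} Y_{1t}=\tilde{N}_t(0,Z_t\lambda_{1t}),\\ \lambda_{1t}=\delta_{10}+\sum_{i=1}^{p}\alpha_{1i}\lambda_{1,t-i}+\sum_{j=1}^{q}\beta_{1j}Y_{1,t-j}, \end{cases} \tag{13}$$

and

$$\begin{cases} Y_{2t}=\tilde{N}_t(0,Z_t\lambda_{2t}),\\ \lambda_{2t}=\delta_{20}+\sum_{i=1}^{p}\alpha_{2i}\lambda_{2,t-i}+\sum_{j=1}^{q}\beta_{2j}Y_{2,t-j}. \end{cases} \tag{14}$$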

Given $\lambda_{1t}$ and $\lambda_{2t}$, since

$$|Y_{2t}-Y_{1t}|=|\tilde{N}_t(0,Z_t\lambda_{2t})-\tilde{N}_t(0,Z_t\lambda_{1t})|\sim\tilde{N}_t(0,Z_t|\lambda_{2t}-\lambda_{1t}|),$$

it can be derived from the properties of the Poisson process and the Gamma-Poisson mixture that the random variable $|Y_{2t}-Y_{1t}|$ from (13) and (14) follows a conditional negative binomial distribution with mean equal to $|\lambda_{2t}-\lambda_{1t}|$. Moreover, it can be shown that Lemmas 3 and 4 hold true under the condition that, for some $\iota>0$,

$$\lim_{\|d\|\to 0}E|\lambda_{2t}-\lambda_{1t}|^{4+\iota}\to 0,\quad E(Y_{1t}^{4+\iota})<\infty,\quad E(Y_{2t}^{4+\iota})<\infty, \tag{15}$$

and hence, the proof of Theorem 6 carries over to the negative binomial INARCH case.
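The coupling behind (13) and (14) can be illustrated numerically for a single time point. The sketch below is not the authors' code: it assumes that $Z_t$ is Gamma distributed with shape r and mean 1, represents the shared Poisson process by the arrival times of a unit-rate process, and uses illustrative values for the two intensities and for r. The function names are hypothetical.

```python
import random


def unit_poisson_arrivals(rng, horizon):
    """Arrival times of a unit-rate Poisson process on [0, horizon]."""
    times = []
    t = rng.expovariate(1.0)
    while t <= horizon:
        times.append(t)
        t += rng.expovariate(1.0)
    return times


def coupled_nb_counts(lam1, lam2, r, rng):
    """Draw (Y1, Y2) from one shared Poisson process and one shared Gamma variable."""
    # Assumption: Z ~ Gamma(shape=r, scale=1/r), so that E(Z) = 1.
    z = rng.gammavariate(r, 1.0 / r)
    arrivals = unit_poisson_arrivals(rng, z * max(lam1, lam2))
    y1 = sum(1 for a in arrivals if a <= z * lam1)   # N(0, Z * lam1)
    y2 = sum(1 for a in arrivals if a <= z * lam2)   # N(0, Z * lam2)
    return y1, y2


rng = random.Random(12345)
lam1, lam2, r = 3.0, 4.5, 4.0          # illustrative values only
draws = [coupled_nb_counts(lam1, lam2, r, rng) for _ in range(20000)]
mean_abs_diff = sum(abs(y2 - y1) for y1, y2 in draws) / len(draws)
print(round(mean_abs_diff, 2), "vs", abs(lam2 - lam1))
```

Because both counts read off the same arrival times, the count with the larger intensity is never smaller than the other, and the Monte Carlo mean of the absolute difference should be close to $|\lambda_{2t}-\lambda_{1t}|$.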

Theorem 7 Assume that, for the negative binomial INARCH model, $\theta_{10}$ and $\theta_{20}$ satisfy (R1) and the strictly stationary count series $\{Y_{1t}\}$ and $\{Y_{2t}\}$ are generated according to (13) and (14). If the condition (15) is satisfied, then, given any M > 0, the two-sided random walk (4) satisfies

$$\arg\max_{z\in[-M,M]}\gamma_d W_d(mz)\xrightarrow{D}\arg\max_{z\in[-M,M]}B(z)-\frac{1}{2}|z|,$$

as $\|d\|\to 0$, where $\gamma_d=(d^{\top}\Sigma_{\theta_{20}}d)(d^{\top}\Omega_{\theta_{20}}d)^{-1}$ and $B(z)$ is the two-sided standard Brownian motion in $\mathbb{R}$ with $B(0)=0$.
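To get a feel for the limiting variable in the theorem, the location of the maximum of $B(z)-\frac{1}{2}|z|$ over [−M, M] can be approximated by Monte Carlo on a grid. This is only an illustrative sketch, not the closed-form approximating distribution derived in the paper; the grid step, the value of M, the number of replications, and the function name are all assumptions made here.

```python
import random


def argmax_two_sided_bm(M, step, rng):
    """Grid approximation of the argmax over [-M, M] of B(z) - |z| / 2."""
    n = int(round(M / step))
    best_z, best_val = 0.0, 0.0              # B(0) = 0, so the value at z = 0 is 0
    for sign in (1.0, -1.0):                 # independent branches for z > 0 and z < 0
        b = 0.0
        for i in range(1, n + 1):
            b += rng.gauss(0.0, step ** 0.5)  # Brownian increment over one grid step
            val = b - 0.5 * i * step          # subtract the drift |z| / 2
            if val > best_val:
                best_z, best_val = sign * i * step, val
    return best_z


rng = random.Random(2020)
draws = sorted(argmax_two_sided_bm(M=30.0, step=0.1, rng=rng) for _ in range(400))
sample_median = draws[len(draws) // 2]
upper_quantile = draws[int(0.975 * len(draws))]   # crude empirical 97.5% quantile
print(sample_median, upper_quantile)
```

The empirical quantiles from such a simulation are noisy and depend on the grid, so they are useful only as a rough sanity check on quantiles obtained from the closed-form distribution.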

In light of the proof of Lemma 3, we show that the following conditions are sufficient for (15) to be true:

$$\frac{(r+1)(r+2)(r+3)(r+4)}{r^{4}}\sum_{j=1}^{q}\beta_{j}<1,\quad E(Y_{1t}^{5})<\infty,\quad E(Y_{2t}^{5})<\infty;$$

to save space, please see the supplementary materials for the proof. The conditions ensuring the existence of higher order moments of the negative binomial INGARCH model are complicated and tedious to derive (Ahmad and Francq, 2016). We use the technique of Ferland et al. (2006) to derive that the following conditions are necessary and sufficient for the fifth moment of a negative binomial INARCH(1) to be finite:

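$$\begin{cases} \beta<1,\\ \beta^{2}\left(1+\frac{1}{r}\right)<1,\\ \beta^{3}\left(1+\frac{3}{r}+\frac{2}{r^{2}}\right)<1,\\ \beta^{4}\left(1+\frac{6}{r}+\frac{11}{r^{2}}+\frac{6}{r^{3}}\right)<1,\\ \beta^{5}\left(1+\frac{10}{r}+\frac{35}{r^{2}}+\frac{50}{r^{3}}+\frac{24}{r^{4}}\right)<1. \end{cases}$$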

The proof is provided in the supplementary materials of the paper.
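The five inequalities for the negative binomial INARCH(1) model are easy to check numerically. The sketch below uses the observation that each polynomial in 1/r factors as a product, for example $1+\frac{3}{r}+\frac{2}{r^{2}}=(1+\frac{1}{r})(1+\frac{2}{r})$, so that the m-th left-hand side equals $\beta^{m}\prod_{i=1}^{m-1}(1+i/r)$. The function names are hypothetical and the parameter values are for illustration only.

```python
def nb_inarch1_moment_conditions(beta, r, max_order=5):
    """Left-hand sides beta^m * prod_{i=1}^{m-1} (1 + i / r) for m = 1, ..., max_order."""
    lhs = []
    factor = 1.0
    for m in range(1, max_order + 1):
        if m > 1:
            factor *= 1.0 + (m - 1) / r   # multiply in the next (1 + i / r) term
        lhs.append(beta ** m * factor)
    return lhs


def fifth_moment_finite(beta, r):
    """True when all five displayed inequalities hold."""
    return all(value < 1.0 for value in nb_inarch1_moment_conditions(beta, r))


print(nb_inarch1_moment_conditions(0.5, 4.0))
print(fifth_moment_finite(0.5, 4.0), fifth_moment_finite(0.9, 1.0))
```

For a fixed β the binding inequality is typically the last one when r is small, since the factor multiplying $\beta^{5}$ grows quickly as r decreases.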

After some tedious algebra, we can also derive the necessary and sufficient condition for the third moment of a negative binomial INARCH(2) model to be finite:

$$\begin{cases} \beta_1+\beta_2<1,\\ \beta_1^{2}+\beta_2^{2}+\dfrac{\beta_1^{2}+\beta_2^{2}}{r}+2\,\dfrac{r+1}{r}\,\dfrac{\beta_1^{2}\beta_2}{1-\beta_2}<1,\\ \dfrac{(r+1)(r+2)}{r^{2}}\left(\beta_1^{3}+\beta_2^{3}+3\beta_1^{2}\beta_2B+3\beta_1\beta_2^{3}B+3\beta_1^{2}\beta_2^{2}\right)<1, \end{cases} \tag{16}$$

where $B=\dfrac{\frac{r+1}{r}\beta_1^{2}+\beta_1\beta_2^{2}}{1-\frac{r+1}{r}\beta_2^{3}-2\frac{r+1}{r}\beta_1\beta_2}$. The same technique for deriving (16) can be extended to even higher order moments of a negative binomial INARCH(2) model. As the order increases, the algebra becomes progressively more complicated, since the multinomial expansions of $(\delta+\beta_1Y_{t-1}+\beta_2Y_{t-2})^{m}$, m ≥ 4, are involved, and an increasing number of expectations of cross products of $Y_{t-1}$ and $Y_{t-2}$, of the form $E(Y_{t-1}^{j}Y_{t-2}^{k})$, need to be processed. Again, for (16), please see the supplementary materials for the proof.

6 |. SIMULATION STUDIES

To investigate the finite sample performance of the proposed procedure for estimating the change-point, simulation studies were conducted. In each simulation run, a time series $T_n$ was generated based on the specific underlying time series model. Various values satisfying (S) were selected for $\theta_{10}$ and $\theta_{20}$. The generated series is composed of two segments, $T_{1n}$ and $T_{2n}$, with $\theta_{10}$ and $\theta_{20}$ as the true parameters. Two sample sizes, n = 400 and n = 1000, were investigated. For the simulation studies, $\tau_0=0.4$ was used, that is, the parameter vector of the data-generating model changed from $\theta_{10}$ to $\theta_{20}$ after $k_0=0.4\times n$. For each case, the simulation was repeated 1000 times. We report the empirical mean and standard deviation of $\hat{\theta}_{10}$, $\hat{\theta}_{20}$, and $\hat{k}_n$, as well as the mean of the empirical 95% confidence intervals for $k_0$, obtained by averaging the results from the 1000 simulation runs. The (1 − 2α) × 100% confidence interval for $k_0$ was computed by

$$[\hat{k}_n-\Delta v_{\alpha}-1,\ \hat{k}_n+\Delta v_{\alpha}+1],$$

where $\Delta=(\hat{d}^{\top}\hat{\Omega}_{\theta_{20}}\hat{d})(\hat{d}^{\top}\hat{\Sigma}_{\theta_{20}}\hat{d})^{-2}$ with $\hat{d}=\hat{\theta}_{10}-\hat{\theta}_{20}$, and $\hat{\Sigma}_{\theta_{20}}$ and $\hat{\Omega}_{\theta_{20}}$ are the estimates of $\Sigma_{\theta_{20}}$ and $\Omega_{\theta_{20}}$, respectively.
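The interval above is a direct plug-in computation once the estimates are available. A minimal sketch follows, assuming that the parameter estimates, the two estimated matrices, and the quantile $v_{\alpha}$ of the approximating distribution are supplied by the caller. The function names and the toy inputs are hypothetical, and the endpoints are returned as real numbers without any rounding convention.

```python
def quad_form(d, A):
    """Quadratic form d' A d for a list d and a nested-list matrix A."""
    size = len(d)
    return sum(d[i] * A[i][j] * d[j] for i in range(size) for j in range(size))


def change_point_ci(k_hat, theta1_hat, theta2_hat, sigma_hat, omega_hat, v_alpha):
    """Plug-in interval [k_hat - Delta * v - 1, k_hat + Delta * v + 1]."""
    d = [a - b for a, b in zip(theta1_hat, theta2_hat)]      # d_hat = theta1_hat - theta2_hat
    delta = quad_form(d, omega_hat) / quad_form(d, sigma_hat) ** 2
    half_width = delta * v_alpha
    return k_hat - half_width - 1, k_hat + half_width + 1


# Toy inputs (not estimates from the paper): d = (1, 0), d'Sigma d = 2, d'Omega d = 4.
lower, upper = change_point_ci(
    k_hat=100,
    theta1_hat=[1.0, 0.0],
    theta2_hat=[0.0, 0.0],
    sigma_hat=[[2.0, 0.0], [0.0, 1.0]],
    omega_hat=[[4.0, 0.0], [0.0, 1.0]],
    v_alpha=3.0,
)
print(lower, upper)
```

Since Δ scales like the inverse squared magnitude of the estimated change, small parameter changes lead to wide intervals, which is consistent with the widths reported in the tables.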

The results for the Poisson INARCH(2) models are summarized in Table 1 and Table 3. The sample means of $\hat{k}_n$ were close to the true values of $k_0$, indicating that the proposed procedure was effective for locating the change-point. Moreover, the sample means of the parameter estimates were also close to their corresponding true model parameter values. In the parentheses, we give the asymptotic standard deviations of $\hat{\theta}_{10}$ and $\hat{\theta}_{20}$ computed based on Theorem 2. They matched the related sample standard deviations well, especially for n = 1000. The approximating distribution for $\hat{k}_n$ worked satisfactorily, as all means of the confidence intervals included $k_0$. Comparing the two tables, one can see that, as the sample size changed from 400 to 1000, the estimates became more accurate and precise, as demonstrated by the decreasing differences between the means and their corresponding true parameter values and the decreasing standard deviations of the estimates.
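The data-generating design used for these tables can be sketched as follows for the Poisson INARCH(2) case. This is an assumed implementation rather than the authors' simulation code: the burn-in length, the zero starting values, and the function names are choices made here, and Poisson variates are drawn by counting unit-rate exponential inter-arrival times so that only the standard library is needed.

```python
import random


def poisson_draw(lam, rng):
    """Poisson(lam) variate: number of unit-rate arrivals in [0, lam]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def simulate_inarch2_with_change(n, tau0, theta1, theta2, rng, burn_in=200):
    """Poisson INARCH(2) series of length n whose parameters switch after k0 = tau0 * n."""
    k0 = int(round(tau0 * n))
    y = [0, 0]                                   # assumed starting values
    for t in range(-burn_in, n):                 # burn-in runs under theta1
        delta, b1, b2 = theta1 if t < k0 else theta2
        lam = delta + b1 * y[-1] + b2 * y[-2]    # lambda_t = delta + b1*Y_{t-1} + b2*Y_{t-2}
        y.append(poisson_draw(lam, rng))
    return y[-n:], k0


series, k0 = simulate_inarch2_with_change(
    n=400, tau0=0.4, theta1=(1.0, 0.3, 0.1), theta2=(1.1, 0.4, 0.2), rng=random.Random(1)
)
print(k0, sum(series[:k0]) / k0, sum(series[k0:]) / (len(series) - k0))
```

The first $k_0$ observations follow the first parameter vector and the remaining ones the second, matching the first setting of Table 1; repeating the call with different seeds gives the replications over which the summary statistics are averaged.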

TABLE 1.

Estimation of the location of change-point for the Poisson INARCH(2) models, n = 400, τ0 = 0.4.

| $\theta_{10}$ | $\theta_{20}$ | $k_0$ | | $\hat{\delta}_{10}$ | $\hat{\beta}_{11}$ | $\hat{\beta}_{12}$ | $\hat{\delta}_{20}$ | $\hat{\beta}_{21}$ | $\hat{\beta}_{22}$ | $\hat{k}_n$ | 95% C.I. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| (1, 0.3, 0.1) | (1.1, 0.4, 0.2) | 160 | mean | 1.030 | 0.281 | 0.095 | 1.255 | 0.378 | 0.182 | 173.4 | (111.0, 235.8) |
| | | | s.d. | 0.209 | 0.104 | 0.080 | 0.415 | 0.096 | 0.083 | 64.1 | |
| | | | asymptotic s.d. | (0.166) | (0.083) | (0.080) | (0.176) | (0.066) | (0.065) | | |
| (1, 0.4, 0.2) | (1.1, 0.5, 0.3) | 160 | mean | 1.091 | 0.386 | 0.182 | 1.319 | 0.485 | 0.279 | 165.1 | (119.8, 210.2) |
| | | | s.d. | 0.298 | 0.101 | 0.094 | 0.439 | 0.079 | 0.076 | 50.8 | |
| | | | asymptotic s.d. | (0.196) | (0.081) | (0.079) | (0.237) | (0.063) | (0.062) | | |
| (1, 0.1, 0.3) | (1.1, 0.2, 0.4) | 160 | mean | 1.037 | 0.099 | 0.280 | 1.262 | 0.187 | 0.369 | 174.8 | (107.7, 241.9) |
| | | | s.d. | 0.213 | 0.081 | 0.107 | 0.397 | 0.080 | 0.092 | 64.8 | |
| | | | asymptotic s.d. | (0.176) | (0.077) | (0.080) | (0.187) | (0.060) | (0.061) | | |
| (1, 0.2, 0.1) | (1.1, 0.3, 0.2) | 160 | mean | 1.026 | 0.185 | 0.098 | 1.228 | 0.284 | 0.178 | 175.7 | (107.3, 244.1) |
| | | | s.d. | 0.201 | 0.099 | 0.085 | 0.332 | 0.092 | 0.086 | 70.8 | |
| | | | asymptotic s.d. | (0.158) | (0.083) | (0.081) | (0.163) | (0.066) | (0.065) | | |
| (1, 0.5, 0.1) | (1.1, 0.6, 0.2) | 160 | mean | 1.067 | 0.482 | 0.094 | 1.314 | 0.578 | 0.187 | 165.3 | (119.3, 211.5) |
| | | | s.d. | 0.287 | 0.103 | 0.075 | 0.472 | 0.088 | 0.075 | 51.6 | |
| | | | asymptotic s.d. | (0.188) | (0.082) | (0.079) | (0.226) | (0.064) | (0.063) | | |
| (3, 0.2, 0.1) | (3.1, 0.3, 0.2) | 160 | mean | 3.103 | 0.183 | 0.094 | 3.234 | 0.289 | 0.189 | 153.5 | (109.2, 197.8) |
| | | | s.d. | 0.477 | 0.089 | 0.073 | 0.507 | 0.070 | 0.066 | 40.4 | |
| | | | asymptotic s.d. | (0.439) | (0.080) | (0.079) | (0.445) | (0.064) | (0.064) | | |

TABLE 3.

Estimation of the location of change-point for the Poisson INARCH(2) models, n = 1000, τ0 = 0.4.

| $\theta_{10}$ | $\theta_{20}$ | $k_0$ | | $\hat{\delta}_{10}$ | $\hat{\beta}_{11}$ | $\hat{\beta}_{12}$ | $\hat{\delta}_{20}$ | $\hat{\beta}_{21}$ | $\hat{\beta}_{22}$ | $\hat{k}_n$ | 95% C.I. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| (1, 0.3, 0.1) | (1.1, 0.4, 0.2) | 400 | mean | 1.014 | 0.294 | 0.096 | 1.140 | 0.396 | 0.191 | 406.5 | (334.0, 479.1) |
| | | | s.d. | 0.110 | 0.060 | 0.052 | 0.117 | 0.040 | 0.043 | 58.1 | |
| | | | asymptotic s.d. | (0.105) | (0.053) | (0.051) | (0.111) | (0.042) | (0.041) | | |
| (1, 0.4, 0.2) | (1.1, 0.5, 0.3) | 400 | mean | 1.028 | 0.394 | 0.193 | 1.162 | 0.496 | 0.293 | 397.5 | (353.6, 441.5) |
| | | | s.d. | 0.133 | 0.056 | 0.053 | 0.168 | 0.040 | 0.041 | 36.8 | |
| | | | asymptotic s.d. | (0.125) | (0.051) | (0.050) | (0.150) | (0.040) | (0.039) | | |
| (1, 0.1, 0.3) | (1.1, 0.2, 0.4) | 400 | mean | 1.023 | 0.097 | 0.290 | 1.141 | 0.197 | 0.392 | 405.7 | (334.1, 477.4) |
| | | | s.d. | 0.123 | 0.052 | 0.057 | 0.134 | 0.039 | 0.041 | 61.7 | |
| | | | asymptotic s.d. | (0.112) | (0.048) | (0.051) | (0.118) | (0.038) | (0.039) | | |
| (1, 0.2, 0.1) | (1.1, 0.3, 0.2) | 400 | mean | 1.015 | 0.194 | 0.093 | 1.136 | 0.295 | 0.195 | 407.1 | (328.9, 485.3) |
| | | | s.d. | 0.108 | 0.059 | 0.051 | 0.123 | 0.045 | 0.042 | 65.2 | |
| | | | asymptotic s.d. | (0.099) | (0.052) | (0.051) | (0.102) | (0.042) | (0.041) | | |
| (1, 0.5, 0.1) | (1.1, 0.6, 0.2) | 400 | mean | 1.021 | 0.497 | 0.094 | 1.159 | 0.596 | 0.193 | 398.9 | (355.1, 442.7) |
| | | | s.d. | 0.128 | 0.054 | 0.050 | 0.163 | 0.041 | 0.040 | 38.3 | |
| | | | asymptotic s.d. | (0.118) | (0.051) | (0.049) | (0.143) | (0.040) | (0.040) | | |
| (3, 0.2, 0.1) | (3.1, 0.3, 0.2) | 400 | mean | 3.051 | 0.195 | 0.094 | 3.132 | 0.298 | 0.197 | 394.4 | (355.6, 433.2) |
| | | | s.d. | 0.295 | 0.054 | 0.049 | 0.293 | 0.041 | 0.041 | 31.9 | |
| | | | asymptotic s.d. | (0.278) | (0.051) | (0.050) | (0.281) | (0.041) | (0.040) | | |

Moreover, the same setting was adopted to investigate the Poisson INGARCH(1, 1) models. Table 2 and Table 4 summarize the performance of the proposed procedure for Poisson INGARCH(1, 1) models with n = 400 and n = 1000. Compared to the results in Table 1 and Table 3, the change-point estimator tended to yield more accurate and precise estimates for k0, since the sample averages are closer to the true values and the sample standard deviations are smaller.

TABLE 2.

Estimation of the location of change-point for the Poisson INGARCH(1, 1) models, n = 400, τ0 = 0.4.

| $\theta_{10}$ | $\theta_{20}$ | $k_0$ | | $\hat{\delta}_{10}$ | $\hat{\alpha}_{11}$ | $\hat{\beta}_{11}$ | $\hat{\delta}_{20}$ | $\hat{\alpha}_{21}$ | $\hat{\beta}_{21}$ | $\hat{k}_n$ | 95% C.I. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| (1, 0.2, 0.3) | (1.1, 0.3, 0.4) | 160 | mean | 1.022 | 0.194 | 0.285 | 1.215 | 0.282 | 0.387 | 164.8 | (126.4, 203.2) |
| | | | s.d. | 0.340 | 0.189 | 0.090 | 0.369 | 0.132 | 0.074 | 32.4 | |
| | | | asymptotic s.d. | (0.416) | (0.233) | (0.081) | (0.336) | (0.123) | (0.064) | | |
| (1, 0.1, 0.3) | (1.1, 0.2, 0.4) | 160 | mean | 0.933 | 0.149 | 0.279 | 1.174 | 0.189 | 0.390 | 169.0 | (111.6, 226.5) |
| | | | s.d. | 0.298 | 0.191 | 0.096 | 0.333 | 0.142 | 0.0781 | 45.7 | |
| | | | asymptotic s.d. | (0.393) | (0.252) | (0.083) | (0.317) | (0.140) | (0.066) | | |
| (1, 0.1, 0.5) | (1.1, 0.2, 0.6) | 160 | mean | 1.026 | 0.107 | 0.475 | 1.215 | 0.189 | 0.591 | 165.1 | (133.3, 196.9) |
| | | | s.d. | 0.268 | 0.117 | 0.089 | 0.327 | 0.097 | 0.076 | 32.8 | |
| | | | asymptotic s.d. | (0.298) | (0.146) | (0.082) | (0.288) | (0.088) | (0.064) | | |
| (1, 0.3, 0.1) | (1.1, 0.4, 0.2) | 160 | mean | 0.996 | 0.298 | 0.097 | 1.263 | 0.349 | 0.194 | 165.0 | (129.5, 200.5) |
| | | | s.d. | 0.491 | 0.306 | 0.067 | 0.502 | 0.206 | 0.067 | 25.5 | |
| | | | asymptotic s.d. | (1.074) | (0.671) | (0.081) | (0.546) | (0.227) | (0.064) | | |
| (1, 0.3, 0.2) | (1.1, 0.4, 0.3) | 160 | mean | 1.063 | 0.276 | 0.185 | 1.232 | 0.372 | 0.293 | 163.4 | (125.5, 201.2) |
| | | | s.d. | 0.457 | 0.251 | 0.084 | 0.421 | 0.145 | 0.069 | 23.1 | |
| | | | asymptotic s.d. | (0.589) | (0.324) | (0.081) | (0.409) | (0.144) | (0.064) | | |

TABLE 4.

Estimation of the location of change-point for the Poisson INGARCH(1,1) models, n = 1000, τ0 = 0.4.

| $\theta_{10}$ | $\theta_{20}$ | $k_0$ | | $\hat{\delta}_{10}$ | $\hat{\alpha}_{11}$ | $\hat{\beta}_{11}$ | $\hat{\delta}_{20}$ | $\hat{\alpha}_{21}$ | $\hat{\beta}_{21}$ | $\hat{k}_n$ | 95% C.I. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| (1, 0.2, 0.3) | (1.1, 0.3, 0.4) | 400 | mean | 1.037 | 0.185 | 0.294 | 1.151 | 0.289 | 0.397 | 401.9 | (367.7, 436.1) |
| | | | s.d. | 0.255 | 0.140 | 0.054 | 0.225 | 0.079 | 0.042 | 26.3 | |
| | | | asymptotic s.d. | (0.264) | (0.149) | (0.052) | (0.212) | (0.077) | (0.041) | | |
| (1, 0.1, 0.3) | (1.1, 0.2, 0.4) | 400 | mean | 0.985 | 0.112 | 0.294 | 1.136 | 0.191 | 0.399 | 406.6 | (352.4, 460.8) |
| | | | s.d. | 0.199 | 0.126 | 0.055 | 0.206 | 0.090 | 0.044 | 39.9 | |
| | | | asymptotic s.d. | (0.238) | (0.152) | (0.053) | (0.199) | (0.088) | (0.042) | | |
| (1, 0.1, 0.5) | (1.1, 0.2, 0.6) | 400 | mean | 1.008 | 0.102 | 0.493 | 1.138 | 0.197 | 0.596 | 402.7 | (371.9, 433.5) |
| | | | s.d. | 0.180 | 0.084 | 0.054 | 0.184 | 0.057 | 0.041 | 22.3 | |
| | | | asymptotic s.d. | (0.189) | (0.093) | (0.051) | (0.185) | (0.056) | (0.041) | | |
| (1, 0.3, 0.1) | (1.1, 0.4, 0.2) | 400 | mean | 1.065 | 0.279 | 0.099 | 1.118 | 0.372 | 0.196 | 403.2 | (350.5, 455.9) |
| | | | s.d. | 0.429 | 0.262 | 0.049 | 0.383 | 0.156 | 0.041 | 21.1 | |
| | | | asymptotic s.d. | (0.676) | (0.422) | (0.050) | (0.345) | (0.143) | (0.041) | | |
| (1, 0.3, 0.2) | (1.1, 0.4, 0.3) | 400 | mean | 1.061 | 0.271 | 0.196 | 1.155 | 0.388 | 0.297 | 402.3 | (372.5, 432.1) |
| | | | s.d. | 0.354 | 0.193 | 0.052 | 0.276 | 0.094 | 0.041 | 17.5 | |
| | | | asymptotic s.d. | (0.372) | (0.205) | (0.051) | (0.258) | (0.091) | (0.040) | | |

The CMLE-based procedure and the PQMLE-based procedure were also investigated for negative binomial INARCH(2) models and negative binomial INGARCH(1, 1) models with r = 4 and r = 8 (results are not included in this paper but are available upon request). In general, as the value of r increased, the estimation of $k_0$ became more accurate and precise, as indicated by a smaller difference between the mean of $\hat{k}_n$ and $k_0$ and a smaller sample standard deviation of $\hat{k}_n$. The approximating distribution for $\hat{k}_n$ again worked well for constructing confidence intervals for $k_0$. The PQMLE-based procedure tended to overestimate $k_0$ and involved larger variability than the CMLE-based procedure. In regard to model parameter estimation, the sample means of the estimates, $\hat{\theta}_{10}$ and $\hat{\theta}_{20}$, were close to their true values. However, there was a relatively large discrepancy between the sample standard deviations of $\hat{\theta}_{10}$ and $\hat{\theta}_{20}$ and their asymptotic counterparts. We looked further into this issue for some cases and found that, when the sample size was increased to n = 2000, the sample means of the model parameter estimates became closer to their true values and the sample standard deviations were in good agreement with the asymptotic values. The observed discrepancy between the sample standard deviations and the asymptotic standard deviations is therefore likely attributable to the slow convergence of the estimators for the specific parameter values.

Besides $\tau_0=0.4$, we also investigated the case $\tau_0=0.1$ to study boundary problems, in which the change-points are near the beginning or the end of the data. It is well known in the literature on change-point analysis (Robbins et al., 2011; Aue and Horváth, 2013) that change-point detection tests work best if the structural breaks occur close to the center of the sample, and that the power naturally decays for the boundary problem. The simulation studies showed that the proposed three-step sequential procedure also suffered from the boundary problem. The problem is complicated by the interplay among n, $\tau_0$, and the specific values of the true parameters of the underlying models. The challenge partially resides in how to make accurate statistical estimation based on small sample sizes. For fixed $\tau_0=0.1$, we experimented with three cases, n = 1000, n = 2000, and n = 4000, for INARCH(2) models. The proposed change-point estimator regained the level of effectiveness seen in Table 3 and Table 4 only when the sample size reached n = 4000. To see the progressive recovery of the proposed change-point estimator $\hat{k}_n$, please refer to Table 5. Simulation studies were also carried out for INGARCH(1, 1) models with $\tau_0=0.1$, and similar results were obtained.

TABLE 5.

Estimation of the location of change-point for the Poisson INARCH(2) models with change-point near the beginning of the data, τ0 = 0.1.

| n | $\theta_{10}$ | $\theta_{20}$ | $k_0$ | | $\hat{\delta}_{10}$ | $\hat{\beta}_{11}$ | $\hat{\beta}_{12}$ | $\hat{\delta}_{20}$ | $\hat{\beta}_{21}$ | $\hat{\beta}_{22}$ | $\hat{k}_n$ | 95% C.I. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 4000 | (1, 0.3, 0.1) | (1.1, 0.4, 0.2) | 400 | mean | 1.018 | 0.294 | 0.095 | 1.104 | 0.399 | 0.199 | 408.7 | (337.4, 480.0) |
| | | | | s.d. | 0.107 | 0.057 | 0.051 | 0.047 | 0.017 | 0.016 | 77.1 | |
| | | | | asymptotic s.d. | (0.105) | (0.053) | (0.051) | (0.045) | (0.016) | (0.016) | | |
| 2000 | (1, 0.3, 0.1) | (1.1, 0.4, 0.2) | 200 | mean | 1.041 | 0.287 | 0.097 | 1.118 | 0.396 | 0.198 | 266.2 | (178.9, 353.4) |
| | | | | s.d. | 0.167 | 0.089 | 0.072 | 0.103 | 0.037 | 0.031 | 259.4 | |
| | | | | asymptotic s.d. | (0.148) | (0.074) | (0.071) | (0.064) | (0.024) | (0.023) | | |
| 1000 | (1, 0.3, 0.1) | (1.1, 0.4, 0.2) | 100 | mean | 1.045 | 0.300 | 0.117 | 1.181 | 0.385 | 0.192 | 253.3 | (123.1, 383.5) |
| | | | | s.d. | 0.238 | 0.131 | 0.098 | 0.361 | 0.078 | 0.064 | 268.2 | |
| | | | | asymptotic s.d. | (0.209) | (0.105) | (0.101) | (0.091) | (0.034) | (0.033) | | |
| 4000 | (1, 0.1, 0.2) | (1.1, 0.2, 0.3) | 400 | mean | 1.016 | 0.097 | 0.190 | 1.106 | 0.198 | 0.299 | 408.1 | (325.3, 490.9) |
| | | | | s.d. | 0.109 | 0.053 | 0.055 | 0.044 | 0.017 | 0.017 | 62.0 | |
| | | | | asymptotic s.d. | (0.103) | (0.050) | (0.051) | (0.043) | (0.016) | (0.017) | | |
| 2000 | (1, 0.1, 0.2) | (1.1, 0.2, 0.3) | 200 | mean | 1.025 | 0.100 | 0.186 | 1.116 | 0.199 | 0.295 | 271.3 | (175.5, 367.1) |
| | | | | s.d. | 0.162 | 0.077 | 0.084 | 0.117 | 0.035 | 0.033 | 266.7 | |
| | | | | asymptotic s.d. | (0.145) | (0.071) | (0.072) | (0.061) | (0.023) | (0.023) | | |
| 1000 | (1, 0.1, 0.2) | (1.1, 0.2, 0.3) | 100 | mean | 1.026 | 0.122 | 0.201 | 1.161 | 0.192 | 0.288 | 260.7 | (141.1, 380.4) |
| | | | | s.d. | 0.227 | 0.098 | 0.123 | 0.312 | 0.068 | 0.077 | 272.5 | |
| | | | | asymptotic s.d. | (0.206) | (0.100) | (0.102) | (0.086) | (0.032) | (0.033) | | |

In summary, the simulation studies supported the theoretical properties established in Sections 3, 4 and 5. The proposed procedure overall performed well for all the investigated cases for Poisson INARCH(2) and INGARCH(1, 1) models, when the change-points were relatively close to the center.

7 |. EMPIRICAL EXAMPLES

Located on the Chesapeake Bay in the Mid-Atlantic coast region, Baltimore City is the largest city in the State of Maryland and the 30th largest city in the U.S., according to the 2017 data by the U.S. Census Bureau. In recent years, Baltimore City has witnessed an elevated level of violence. On April 19, 2015, Freddie Gray, a 25-year-old Black man, died of a spinal cord injury while in police custody. From late April to early May 2015, riots broke out in Baltimore City, and thousands of police officers and Maryland National Guard members were deployed. The incident drew nationwide attention and placed a spotlight on many issues of police practices. In 2017, Baltimore City recorded 342 murders and surpassed New York City in the absolute number of homicides, although the population of Baltimore City is only about one fourteenth of that of New York City. In terms of the per capita rate, 2017 was officially the deadliest year ever for Baltimore City, with a rate far higher than that of any other of the 30 largest cities. The sustained high crime rates have caused widespread public concern. The data under investigation here are the weekly counts of robberies in Downtown and Patterson Park Neighborhood of Baltimore, from January 1, 2012 to January 5, 2019, for a total of 366 weeks. The data were obtained from the data published on the City of Baltimore website, https://data.baltimorecity.gov/. Downtown and Patterson Park Neighborhood are two neighborhoods where robberies frequently happen. They account for 5.87% of all robbery cases that occurred in the 278 neighborhoods of Baltimore City during the same time period.

The analysis of the robbery data starts with the descriptive statistics. The mean of the counts is 5.05 and the sample variance is 8.77, so the data are overdispersed. The sample ACF and PACF graphs in Figure 1 demonstrate that the data are positively serially correlated. The INGARCH-type models were selected because they are capable of modeling overdispersed count data with positive autocorrelations. The ACF and PACF in Figure 1 exhibit the typical features of the autocorrelation functions of classic AR models for continuous-valued time series, and we first fit one of their integer-valued counterparts, the Poisson INARCH(2) model, to the data. The ACF and PACF also indicate the possibility that models with ARMA-type autocorrelation structures are adequate for the data, and we explore one of their counterparts in integer-valued time series, the Poisson INGARCH(1, 1) model.

FIGURE 1.

Top: Weekly counts of robberies in Downtown and Patterson Park Neighborhood of Baltimore City from January 1, 2012 to January 5, 2019; Bottom left: the ACF of the robbery data; Bottom right: the PACF of the robbery data.

Let {Yt}, t = 1, …, 366, denote the sequence of robbery counts. Fitting a Poisson INARCH(2) model to the time series yielded

$$M_0:\ \hat{\lambda}_t=2.8460+0.2374\,Y_{t-1}+0.2007\,Y_{t-2}.$$

The adequacy of the model was evaluated by the Pearson residuals. The Pearson residuals of M0 have a mean equal to 1.2327 and a variance equal to 1.5171, and the ACF and PACF of the Pearson residuals appear to be uncorrelated. Furthermore, the randomized probability integral transform (PIT) was also applied to the fitted model for goodness-of-fit testing (Brockwell, 2007). If the randomized PIT appears to be uniformly distributed, then the model has a good fit. While the ACF and PACF of the Pearson residuals indicated that M0 was adequate for the data, the PIT (Figure 2) suggested some deviation from a well-fitted model. Kolmogorov-Smirnov tests were performed and yielded a p-value equal to 0.0552, which rejected a uniform distribution for PIT at significance level of 0.10. Note that the computation of randomized PIT involves randomness introduced from a uniform random variable in its formula. In this paper, when we worked with the randomized PIT for a specific model, the randomized PIT was generated 10000 times and relevant tests (procedures) were implemented 10000 times. The reported results of randomized PIT were the averages of 10000 repetitions.
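The two diagnostics used here can be sketched in a few lines for a fitted Poisson INARCH(2) model. The code below is an illustration under stated assumptions, not the authors' implementation: the Pearson residual is taken to be $(Y_t-\hat{\lambda}_t)/\sqrt{\hat{\lambda}_t}$, the randomized PIT is computed as $F(Y_t-1)+V\{F(Y_t)-F(Y_t-1)\}$ with V standard uniform and F the fitted Poisson distribution function, the counts are made-up toy values rather than the robbery data, and all function names are hypothetical.

```python
import math
import random


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam); returns 0 for k < 0."""
    if k < 0:
        return 0.0
    term = total = math.exp(-lam)
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return min(total, 1.0)


def inarch2_intensity(y, delta, b1, b2):
    """Fitted intensities lambda_t = delta + b1*Y_{t-1} + b2*Y_{t-2} for t = 2, ..., n-1."""
    return [delta + b1 * y[t - 1] + b2 * y[t - 2] for t in range(2, len(y))]


def pearson_residuals(y, lam):
    """(Y_t - lambda_t) / sqrt(lambda_t), using the Poisson variance lambda_t."""
    return [(yt - lt) / math.sqrt(lt) for yt, lt in zip(y, lam)]


def randomized_pit(y, lam, rng):
    """One realization of the randomized PIT; it should look uniform under a good fit."""
    pit = []
    for yt, lt in zip(y, lam):
        lower, upper = poisson_cdf(yt - 1, lt), poisson_cdf(yt, lt)
        pit.append(lower + rng.random() * (upper - lower))
    return pit


toy_counts = [4, 6, 3, 5, 7, 2, 5, 8, 4, 6, 5, 3]           # made-up data
lam_hat = inarch2_intensity(toy_counts, 2.8460, 0.2374, 0.2007)
residuals = pearson_residuals(toy_counts[2:], lam_hat)
pit_values = randomized_pit(toy_counts[2:], lam_hat, random.Random(0))
print(len(residuals), min(pit_values), max(pit_values))
```

Because each call to `randomized_pit` uses fresh uniform draws, any test based on it varies from call to call, which is why results are averaged over many repetitions.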

FIGURE 2.

Left: histograms for the relative frequencies of randomized PIT for INARCH(2) models fitted to the robbery data; Right: QQ-plots of randomized PIT against standard uniform distribution, where the straight line is the 45° line going through the origin.

In the ensuing investigation, we applied the residual-based parameter change test and the estimate-based parameter change test of Kang and Lee (2014) for the Poisson INARCH(2) model to $M_0$. Let $\hat{\epsilon}_t$ be the residual of the fitted model at time t, and $\hat{\sigma}_n=\left(\frac{1}{n}\sum_{t=1}^{n}\hat{\epsilon}_t^{2}\right)^{1/2}$. For the Poisson INARCH(2) model satisfying (R1), if there does not exist a parameter change in the underlying model, then

$$T_n^{r}=\max_{1\le k\le n}\frac{1}{\sqrt{n}\,\hat{\sigma}_n}\left|\sum_{t=1}^{k}\hat{\epsilon}_t-\frac{k}{n}\sum_{t=1}^{n}\hat{\epsilon}_t\right|\xrightarrow{D}\sup_{0\le s\le 1}|W_1^{\circ}(s)|,$$

as n → ∞ (Cui and Wu, 2016), where $W_1^{\circ}(s)$ is a standard Brownian bridge. The residual-based test for $M_0$ achieved a value of $T_{366}^{r}=1.929$, which is greater than 1.353, the critical value at the significance level of 0.05. For the estimate-based test, define $\hat{\theta}_m=\arg\max_{\theta\in\Theta}\sum_{t=1}^{m}\ell_t(\theta)$. The estimate-based CUSUM test statistic is given by

$$T_n^{e}=\max_{1\le m\le n}\frac{m^{2}}{n}(\hat{\theta}_m-\hat{\theta}_n)^{\top}\hat{\Omega}_n(\hat{\theta}_m-\hat{\theta}_n),$$

where $\hat{\Omega}_n=-\frac{1}{n}\sum_{t=1}^{n}\frac{\partial^{2}\ell_t(\hat{\theta}_n)}{\partial\theta\,\partial\theta^{\top}}$. If there does not exist a parameter change in the underlying model, then

$$T_n^{e}\xrightarrow{D}\sup_{0<s<1}\|W_3^{\circ}(s)\|^{2},$$

as n → ∞, where $W_3^{\circ}(s)$ is a 3-dimensional standard Brownian bridge. The estimate-based test yielded a large value of $T_{366}^{e}=13.2027$, whereas the critical value is 3.004 at the significance level of 0.05. Therefore, both tests suggested that the data-generating mechanism for the robbery count series experienced some structural change.
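The residual-based statistic is a standard CUSUM computation and can be sketched directly from its definition. The code below takes the residuals as given (how they are obtained from the fitted model is not specified here), normalizes by $\sqrt{n}\,\hat{\sigma}_n$, and also returns the index k at which the maximum is attained. The function name and the toy residuals are hypothetical.

```python
import math


def residual_cusum(residuals):
    """Return (T_n^r, k) where k is the index attaining the maximum of the CUSUM."""
    n = len(residuals)
    sigma = math.sqrt(sum(e * e for e in residuals) / n)   # sigma_hat_n
    total = sum(residuals)
    best_stat, best_k, partial = 0.0, 1, 0.0
    for k, e in enumerate(residuals, start=1):
        partial += e                                       # sum of the first k residuals
        stat = abs(partial - k / n * total) / (math.sqrt(n) * sigma)
        if stat > best_stat:
            best_stat, best_k = stat, k
    return best_stat, best_k


# Toy residuals with a level shift after the 50th observation.
toy_residuals = [-1.0] * 50 + [1.0] * 50
statistic, location = residual_cusum(toy_residuals)
print(statistic, location)   # → 5.0 50
```

A value of the statistic above the critical value points to a parameter change, and the maximizing index gives a natural first guess at where the change occurred.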

In the next step, we used the proposed three-step sequential procedure to locate the change-point. The change-point was estimated to be $\hat{k}_n=170$, which corresponds to three weeks before the Freddie Gray incident. The 95% confidence interval of the change-point was calculated as [142, 198], which covers the whole timeline of the Freddie Gray incident, including the arrest of Freddie Gray and the riots in Baltimore City at the end of April 2015.

Using the estimated change-point of 170, we fitted two Poisson INARCH(2) models to the data before and after the estimated change-point, and obtained

$$M_{01}:\ \hat{\lambda}_t=3.1923+0.0953\,Y_{t-1}+0.1153\,Y_{t-2},$$
$$M_{02}:\ \hat{\lambda}_t=3.8087+0.2102\,Y_{t-1}+0.1549\,Y_{t-2}.$$

The Pearson residuals were uncorrelated for both models, with mean and variance values equal to (0.9997, 1.2442) and (1.2327, 1.5171). We also performed the parameter change tests for both models, and both test statistics were insignificant. Specifically, $T_{170}^{r}=0.4876$ and $T_{170}^{e}=0.8126$ were obtained for $M_{01}$, and $T_{196}^{r}=1.0639$ and $T_{196}^{e}=2.6456$ were obtained for $M_{02}$. Next, we calculated the randomized PIT for both models and plotted the histograms and QQ-plots of the PIT in Figure 2. Compared with $M_0$, the goodness-of-fit of the fitted models was improved, since the PIT of $M_{01}$ or $M_{02}$ was closer to a uniform distribution than that of $M_0$. Kolmogorov-Smirnov tests resulted in p-values of 0.5823 for the PIT of $M_{01}$ and 0.1497 for the PIT of $M_{02}$, which provided further evidence supporting not only the goodness-of-fit of $M_{01}$ and $M_{02}$ but also the existence and plausibility of a change-point in the original data.

To further validate the two fitted models, we fitted the two segments of counts by two Poisson distributions with means equal to the means of the two segments, which are 4.0235 and 5.9336, respectively:

M01:Poisson(4.0235),
M02:Poisson(5.9336).

For the first segment of counts, the INARCH(2) model M01 achieved an AIC of −537.5831, whereas the corresponding i.i.d. Poisson model achieved an AIC of −534.4741; for the second segment of counts, the INARCH(2) model M02 achieved an AIC of −1852.0388, whereas the corresponding i.i.d. Poisson model achieved an AIC of −1813.7767. In terms of AIC, a Poisson INARCH(2) model is more suitable than an i.i.d. Poisson model for each of the segments of the robbery counts.

Additionally, we fitted a Poisson INGARCH(1, 1) model to the whole robbery series and obtained

$$M_0^{*}:\ \hat{\lambda}_t=0.5383+0.7049\,\hat{\lambda}_{t-1}+0.1890\,Y_{t-1}.$$

Again, the Pearson residuals supported the adequacy of the model, since their ACF and PACF indicate that they are uncorrelated. The Pearson residuals had a mean equal to 1.3169 and a variance equal to 1.5961. However, the randomized PIT suggested that the fit of the model to the data was less than satisfactory, in that the relative frequency histogram and QQ-plot of $M_0^{*}$ in Figure 3 exhibited some departure from a uniform distribution. Further investigation by the Kolmogorov-Smirnov test generated a p-value of 0.0968, which rejected a uniform distribution at the significance level of 0.10. Just as in the INARCH(2) case, a single INGARCH(1, 1) model is incapable of capturing the dynamics behind the data well. Subsequently, the residual-based test for $M_0^{*}$ gave a value of $T_{366}^{r}=1.3744$, and the estimate-based test yielded an extremely large value of $T_{366}^{e}=61.5692$. Based on the INGARCH(1, 1) model, at the significance level of 0.05, both parameter change tests flagged the existence of a structural change in the robbery data.

FIGURE 3.

Left: histograms for the relative frequencies of randomized PIT for INGARCH(1, 1) models fitted to the robbery data; Right: QQ-plots of randomized PIT against standard uniform distribution, where the straight line is the 45° line going through the origin.

Proceeding along the lines set out for the proposed three-step sequential procedure, we found that, based on Poisson INGARCH(1, 1) models, the change-point was estimated to be $\hat{k}_n=174$, which corresponds to the funeral of Freddie Gray, one week after the incident. In light of the simulation study for Poisson INGARCH(1, 1) models, for which the theoretical result is yet to be established, the 95% confidence interval of the change-point was calculated as [156, 192], covering the whole timeline of the Freddie Gray incident. The evidence obtained from the INGARCH(1, 1) models corroborated that from the INARCH(2) models: a structural change occurred in the data around the time of the Freddie Gray incident.

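To further investigate the data, we split the series into two segments at $\hat{k}_n=174$ and fitted two Poisson INGARCH(1, 1) models:

$$M_{01}^{*}:\ \hat{\lambda}_t=1.4574+0.5188\,\hat{\lambda}_{t-1}+0.1167\,Y_{t-1},$$
$$M_{02}^{*}:\ \hat{\lambda}_t=0.9836+0.6462\,\hat{\lambda}_{t-1}+0.1891\,Y_{t-1}.$$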

An array of methods was applied to validate the two models above. The mean and variance of the Pearson residuals were (0.9934, 1.2544) and (1.4400, 1.5165), respectively. The parameter change tests were insignificant, with $T_{174}^{r}=0.6126$ and $T_{174}^{e}=1.0172$ for $M_{01}^{*}$, and $T_{187}^{r}=0.9348$ and $T_{187}^{e}=2.5364$ for $M_{02}^{*}$. The relative frequency histograms and QQ-plots of the randomized PITs are displayed in Figure 3. In contrast to that of $M_0^{*}$, the PITs of $M_{01}^{*}$ and $M_{02}^{*}$ were in close agreement with a uniform distribution, which was further supported by p-values of 0.5456 and 0.3437 from Kolmogorov-Smirnov tests. Evidently, two INGARCH(1, 1) models represent the data better than one.

In summary, the results show that fitting two Poisson INARCH(2) models or two Poisson INGARCH(1, 1) models, rather than a single model, is appropriate for the number of robberies in Downtown and Patterson Park Neighborhood from January 1, 2012 to January 5, 2019. There is significant statistical evidence that the total number of robbery cases in the two neighborhoods of Baltimore City experienced a structural change around the time of the Freddie Gray incident. The plot of the number of robberies in Figure 4 exhibits an elevated level since April 2015, which can also be inferred from the fitted INARCH(2) or INGARCH(1, 1) models. The second model $M_{02}$ represents a higher level of weekly robbery cases starting around the time of the Freddie Gray incident: the theoretical mean of $M_{02}$, corresponding to the second segment of the data, is equal to 3.8087∕(1 − 0.2102 − 0.1549) = 5.9989, whereas the mean of $M_{01}$, corresponding to the first segment, is equal to 3.1923∕(1 − 0.0953 − 0.1153) = 4.0440. The same conclusion can be reached from $M_{01}^{*}$ and $M_{02}^{*}$. In Figure 4, two horizontal dashed lines denoting the two theoretical means were added to the time series plot of the number of robberies; they represent the average levels of the two segments very well.
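The theoretical means quoted above follow from the stationary mean of a linear INGARCH-type model, the intercept divided by one minus the sum of the remaining coefficients. A short sketch reproduces the arithmetic for the four fitted segment models; the function name is hypothetical, and the formula assumes that the coefficients are non-negative and sum to less than one.

```python
def stationary_mean(delta, coefficients):
    """Stationary mean delta / (1 - sum of coefficients) of a linear INGARCH-type model."""
    total = sum(coefficients)
    if not 0.0 <= total < 1.0:
        raise ValueError("coefficients must sum to a value in [0, 1)")
    return delta / (1.0 - total)


segment_means = {
    "M01": stationary_mean(3.1923, [0.0953, 0.1153]),    # INARCH(2), first segment
    "M02": stationary_mean(3.8087, [0.2102, 0.1549]),    # INARCH(2), second segment
    "M01*": stationary_mean(1.4574, [0.5188, 0.1167]),   # INGARCH(1, 1), first segment
    "M02*": stationary_mean(0.9836, [0.6462, 0.1891]),   # INGARCH(1, 1), second segment
}
for name, value in segment_means.items():
    print(name, round(value, 4))
```

Both model families give a second-segment mean that is roughly two robberies per week higher than the first-segment mean.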

FIGURE 4.

Weekly counts of robberies in Downtown and Patterson Park Neighborhood of Baltimore City. The right vertical dashed line marks the week of the Freddie Gray incident, and the left vertical dashed line marks the change-point found by the proposed method based on INARCH(2) models.

Kang and Lee (2014) suggested using the point where $T_n^{r}$ is achieved, hereafter denoted by $\tilde{k}_n$, as a change-point estimator, and the point where $T_n^{e}$ is achieved, hereafter denoted by $\check{k}_n$, as another change-point estimator. We implemented the two change-point estimators of Kang and Lee (2014) for the robbery data as well. For the INARCH(2) model, we obtained $\tilde{k}_n=172$ and $\check{k}_n=174$; for the INGARCH(1, 1) model, we obtained $\tilde{k}_n=172$ and $\check{k}_n=165$.

The polio incidence data, consisting of monthly counts of poliomyelitis cases in the U.S. from 1970 to 1983, were reported by the Centers for Disease Control. The data have been studied by many authors, including Davis et al. (2000), Davis and Wu (2009), and Kang and Lee (2014). As noted in Davis and Wu (2009), one of the main objectives in analyzing the polio data is to investigate whether or not the incidence of polio has been decreasing since 1970. Kang and Lee (2014) applied a Poisson INGARCH(1, 1) model to the polio data and performed change-point analysis. Only the estimate-based test was significant, and the change-point was estimated to be $\check{k}_n=35$. Along the same lines, the proposed three-step sequential procedure was applied to the polio data, which gave $\hat{k}_n=35$ and a 95% confidence interval of [33, 37].

Another published example of change-point estimation for integer-valued time series concerns the monthly number of cases of campylobacteriosis infections from January 1990 to October 2000 in the north of the province of Québec, Canada. Kang and Lee (2014) fitted a Poisson INGARCH(1, 1) model and found a change-point at $\check{k}_n=81$. We applied the proposed three-step sequential procedure and obtained $\hat{k}_n=83$ and a 95% confidence interval of [74, 92].

Remark 3. The three estimators, $\hat{k}_n$, $\tilde{k}_n$, and $\check{k}_n$, produced highly consistent results for the empirical data examined in this paper. Additional simulation studies were performed to examine $\tilde{k}_n$ and $\check{k}_n$ for INARCH(2) models with n = 400 and n = 1000. The parameter settings for the simulation models were inherited from Table 1 and Table 3. Generally, the estimator $\tilde{k}_n$ was easy to implement, while the estimators $\hat{k}_n$ and $\check{k}_n$ were computationally expensive. To put this into perspective, for each model setting in Table 3 with n = 1000, it took about 18.5 minutes to finish 1000 simulation rounds of change-point estimation for $\tilde{k}_n$, while it took 5003.3 minutes for $\check{k}_n$ and 7110.5 minutes for $\hat{k}_n$ to perform the same task, on a desktop computer with an Intel® Core i5-4590 CPU @3.30 GHz, using Python 3.6.4. On the other hand, $\tilde{k}_n$ seemed to be less accurate than either of the other two change-point estimators, since, for all the models examined, the discrepancy between the average of the change-point estimates and the true value of $k_0$ was always considerably larger for $\tilde{k}_n$ than for $\hat{k}_n$ or $\check{k}_n$. Otherwise, the performance of the two methods of Kang and Lee (2014), in terms of the average values and standard deviations of the estimates of the underlying model parameters, was comparable to that of $\hat{k}_n$. In particular, as n changed from 400 to 1000, the estimates became more accurate and precise about the true parameter values, which suggests that $\tilde{k}_n$ and $\check{k}_n$ also possess certain consistency properties and limiting distributions (yet to be discovered). A second set of simulations focused on cases where $\tau_0=0.1$. Both $\tilde{k}_n$ and $\check{k}_n$ suffered seriously from the boundary problem. For details of the simulation results of $\tilde{k}_n$ and $\check{k}_n$, please refer to Table 1 through Table 6 of the supplementary materials.

8 |. DISCUSSION

In this paper, we studied off-line change-point estimation for a class of models for count data. Under regularity conditions, we proved that the change-point estimator in the non-rescaled time converges in distribution to the location of the maxima of a two-sided random walk, for which a closed-form approximating distribution was derived for strong mixing processes when the magnitude of the change is small. The proposed method and its asymptotic properties were shown to be applicable to the INARCH process with Poisson or negative binomial conditional distributions. The finite sample performance of the proposed estimation procedure was verified in simulation studies. We analyzed the robbery data of two neighborhoods of Baltimore City, and a change-point marking the start of an elevated crime rate in recent years was shown to occur around the Freddie Gray incident.
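As a rough illustration of the limiting object mentioned above, the location of the maximum of a two-sided random walk can be simulated directly. The sketch below is not the paper's procedure: it assumes Gaussian increments with a common negative drift on both sides, whereas in the paper the increments are log-likelihood differences. The function names, `drift`, `sigma`, and `horizon` are hypothetical choices made for illustration only.

```python
import random


def argmax_two_sided_walk(rng, horizon, drift, sigma):
    """Return the location of the maximum of a two-sided random walk.

    The walk W is defined on {-horizon, ..., horizon} with W(0) = 0.
    Each step away from 0 adds an N(-drift, sigma^2) increment
    (an assumed increment law, chosen only for illustration).
    """
    best_k, best_val = 0, 0.0
    for sign in (1, -1):          # right side first, then left side
        w = 0.0
        for step in range(1, horizon + 1):
            w += rng.gauss(-drift, sigma)
            if w > best_val:      # strict: ties keep the earlier location
                best_k, best_val = sign * step, w
    return best_k


def simulate_argmax_distribution(n_rep=2000, horizon=200, drift=0.5,
                                 sigma=1.0, seed=0):
    """Empirical frequency table of the argmax over n_rep replications."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_rep):
        k = argmax_two_sided_walk(rng, horizon, drift, sigma)
        counts[k] = counts.get(k, 0) + 1
    return counts


if __name__ == "__main__":
    counts = simulate_argmax_distribution()
    for k in sorted(counts):
        if abs(k) <= 5:
            print(k, counts[k])
```

Because the drift is negative on both sides, the simulated argmax concentrates near 0; a larger `drift` relative to `sigma` corresponds loosely to a larger change magnitude and a tighter distribution.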

There remain several open problems. It is of interest to study the strong mixing property of $\ell_t(\theta)$, $D_t(\theta)$, and $P_t(\theta)$ for INGARCH models, and to extend the asymptotic results and approximating distribution for INARCH models to INGARCH models. As pointed out by one referee, Doukhan and Neumann (2019) recently proved existence and uniqueness of a stationary distribution and absolute regularity for nonlinear GARCH and INGARCH models of order (p, q), with the coefficient of absolute regularity satisfying $\beta(n)=O(\rho^n)$ for some $0 < \rho < 1$. This convergence rate can be shown to suffice for Lemma 1 of this paper and Theorem 4 of Kuelbs and Philipp (1980) to hold true. The upshot is that the framework employed in our paper is versatile and could be applied to other integer-valued time series models not yet explored, including INAR processes (Du and Li, 1991). Nevertheless, the major remaining challenge is to develop a closed-form approximating distribution. The approximating distribution in our paper was developed based on a delicate design in which the two count processes were generated by the same sequence of Poisson processes. To develop the approximating distribution for another integer-valued time series model, the underlying data generation mechanism needs to be carefully designed so that assumptions (A1) and (A2) hold true.

Boundary problems impose substantial difficulties for change-point analysis. Weighted versions of the change-point existence tests have been deployed to alleviate the severity of the problem (Robbins et al., 2011; Aue and Horváth, 2013). At this point, it is not clear how to adapt the three-step sequential procedure to make it more effective at locating a change-point in the presence of boundary problems. More generally, robust change-point estimation methods for integer-valued time series are desirable and deserve a comprehensive treatment. An interesting topic for further research is the study of the asymptotic properties of $\tilde{k}_n$ and $\check{k}_n$. Another line of future research is to develop alternative weak dependence conditions, like the near-epoch dependence technique in Ling (2007b) and the associated invariance principles for continuous-valued time series, to replace the strong mixing condition in establishing Theorems 1–3.

Acknowledgements

We thank the two anonymous referees and the Associate Editor for their invaluable comments and suggestions that have significantly improved the quality of the paper. We would also like to thank Kyle Cummings, Sam Houston, and Gregory Nikol of Towson University for their IT support. The research of Rongning Wu was supported in part by a PSC-CUNY award, jointly funded by The Professional Staff Congress and The City University of New York. The research of Qi Zheng was partially supported by NSF DMS-1952486 and NIH 1R03AG067611–01A1.

Funding information

The Professional Staff Congress and The City University of New York, PSC-CUNY; The National Science Foundation, Grant/Award Number: DMS-1952486; The National Institutes of Health, Grant/Award Number: 1R03AG067611–01A1.

Appendix

Proof of Theorem 1:

The results are due to Ling (2016). The consistency of the rescaled time, $\hat{\tau}_n$, is established by Lemma A.1 of Ling (2016). By Theorem 2.1 of Ling (2016), $\hat{\theta}_{1n}$ and $\hat{\theta}_{2n}$ are consistent for $\theta_{10}$ and $\theta_{20}$, respectively. In addition, $\hat{k}_n - k_0$ is bounded in probability. □

Proof of Theorem 2:

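By (C3), given any $\epsilon > 0$, there exists some $\phi > 0$ such that

$$\lim_{n\to\infty} P\left(\sup_{\theta\in V(\theta_0,\phi)}\left\|\frac{1}{n}\sum_{t=1}^{n}\left[P_t(\theta)-\Sigma_{\theta_0}\right]\right\|>\epsilon\right)=0, \tag{17}$$

for which we refer to the proof of Lemma A2 of Ling and McAleer (2010). Then, for any sequence $\theta_n = \theta_0 + o_p(1)$, as $n \to \infty$,

$$P\left(\left\|\frac{1}{n}\sum_{t=1}^{n}\left[P_t(\theta_n)-\Sigma_{\theta_0}\right]\right\|>\epsilon\right)\le P\left(\|\theta_n-\theta_0\|\ge\phi\right)+P\left(\sup_{\theta\in V(\theta_0,\phi)}\left\|\frac{1}{n}\sum_{t=1}^{n}\left[P_t(\theta)-\Sigma_{\theta_0}\right]\right\|>\epsilon\right)\to 0.$$

We can verify that Assumption 2.3 of Ling (2016) is satisfied. In particular,

$$\sum_{t=1}^{n}\left(\ell_t(\theta_n)-\ell_t(\theta_0)\right)=(\theta_n-\theta_0)'\sum_{t=1}^{n}\frac{\partial \ell_t(\theta_0)}{\partial\theta}-n(\theta_n-\theta_0)'\times\left(\frac{1}{2}\Sigma_{\theta_0}+o_p(1)\right)(\theta_n-\theta_0).$$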

Following the same lines as Theorem 2.2 of Ling (2016), we establish the result. □

Proof of Theorem 3:

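By definition, $\sqrt{m}\,\gamma_d=(d'\Omega_{\theta_{20}}d)^{-1/2}$. Then, it follows that $\sqrt{m}\,\gamma_d\,d'D_{2t}(\theta_{20})$ defines a martingale difference sequence with variance equal to 1. By virtue of (C3) and the invertibility of $\Omega_{\theta_{20}}$, we see that $m \to \infty$ as $\|d\| \to 0$. In light of (A2), we invoke Theorem 4 in Kuelbs and Philipp (1980) and obtain that, for any positive $M$, on $D[-M, M]$ as $\|d\| \to 0$,

$$\frac{1}{\sqrt{m}}\sum_{t=1}^{\lfloor mz\rfloor}\sqrt{m}\,\gamma_d\,d'D_{2t}(\theta_{20})\xrightarrow{D}B(z),\quad\text{for } z>0,$$
$$\frac{1}{\sqrt{m}}\sum_{t=\lfloor mz\rfloor}^{-1}\sqrt{m}\,\gamma_d\,d'D_{2t}(\theta_{20})\xrightarrow{D}B(z),\quad\text{for } z<0,$$

where $D[-M, M]$ is the space of right continuous functions with left limits on the interval $[-M, M]$. Based on (5) and (A1), as in Theorem 3.1 of Ling (2016), a continuous mapping argument yields the result. □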

Proof of Theorem 4:

The proof follows along the same lines as Theorem 3.1 of Neumann (2011) and Proposition 5 of Davis and Liu (2016). Since the result is not established explicitly for the general INGARCH(p, q) models with Poisson or negative binomial marginal distribution, the proof is given in the supplementary materials. It employs the geometric moment contraction technique of Wu and Shao (2004). Let $W_t = (Y_{t+1-q}, \ldots, Y_{t-1}, \lambda_{t+1-p}, \ldots, \lambda_t)$, and let $w$ be some specific realization of the random vector. The key step is to find a metric, denoted by $\rho(\cdot)$, such that, starting with any non-negative $w^*=(n^*_{1-q},\ldots,n^*_{-1},\lambda^*_{1-p},\ldots,\lambda^*_0)$ and $w = (n_{1-q}, \ldots, n_{-1}, \lambda_{1-p}, \ldots, \lambda_0)$, there exists some constant $\upsilon \in (0, 1)$ satisfying

$$E\left[\rho\left(W_1(w),W_1(w^*)\right)\right]\le\upsilon\,\rho(w,w^*).$$

Let $\mathcal{B}_{k,\ell}=\sigma\{Y_k,\ldots,Y_\ell\}$. The coefficient of absolute regularity of the count process $\{Y_t\}$ is defined by

$$\beta(n)=E\left[\sup_{A\in\mathcal{B}_{n,\infty}}\left|P(A\mid\mathcal{B}_{-\infty,0})-P(A)\right|\right].$$

It is shown that $\beta(n)$ has a geometric rate of convergence, i.e., $\beta(n) = O(\upsilon^n)$. □
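The contraction idea behind this proof can be illustrated numerically in a simplified setting. The sketch below is an assumption-laden illustration, not the proof's construction: it takes a Poisson INGARCH(1,1) recursion $\lambda_t=\delta+\alpha\lambda_{t-1}+\beta Y_{t-1}$, drives two chains started at different intensities with the same unit-rate Poisson process at each step, and tracks only the intensity gap rather than the full vector $W_t$ or the metric $\rho$. All function names and parameter values are hypothetical.

```python
import random


def poisson_process_counts(rng, levels):
    """Counts N(l) of ONE unit-rate Poisson process on [0, l], for each l.

    Sharing the arrival times across levels couples the counts, so that
    N(l1) <= N(l2) whenever l1 <= l2.
    """
    top = max(levels)
    arrivals = []
    t = rng.expovariate(1.0)
    while t <= top:
        arrivals.append(t)
        t += rng.expovariate(1.0)
    return [sum(1 for a in arrivals if a <= level) for level in levels]


def coupled_gap(lam_a, lam_b, delta, alpha, beta, steps, seed):
    """Track |lam_a - lam_b| for two coupled Poisson INGARCH(1,1) chains."""
    rng = random.Random(seed)
    gaps = [abs(lam_a - lam_b)]
    for _ in range(steps):
        y_a, y_b = poisson_process_counts(rng, [lam_a, lam_b])
        lam_a = delta + alpha * lam_a + beta * y_a
        lam_b = delta + alpha * lam_b + beta * y_b
        gaps.append(abs(lam_a - lam_b))
    return gaps


if __name__ == "__main__":
    n_rep, steps = 2000, 10
    mean_gaps = [0.0] * (steps + 1)
    for seed in range(n_rep):
        for i, g in enumerate(coupled_gap(5.0, 1.0, 1.0, 0.3, 0.4, steps, seed)):
            mean_gaps[i] += g / n_rep
    # Under this coupling the expected gap shrinks by alpha + beta per step.
    for i, g in enumerate(mean_gaps):
        print(i, round(g, 4), round(4.0 * 0.7 ** i, 4))
```

Under this coupling the two chains keep their ordering, so the expected gap after one step equals $(\alpha+\beta)$ times the current gap; with $\alpha+\beta<1$ the gap decays geometrically, which is the flavor of the rate $O(\upsilon^n)$.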

Proof of Lemma 1:

For a random variable $Z$, let $\|Z\|_2 = (E(|Z|^2))^{1/2}$. Then, by a standard result about strong mixing (cf. Francq and Zakoïan, 2010, Corollary A.2), we have $\left\|\sum_{i=1}^{n}v_{t+i}\right\|_2=O(n^{1/2})$ uniformly in $t$. Invoking Proposition 1 of Wu (2007), we can show that there exists some $\rho\in(\frac{1}{2},1)$ such that, for every integer $k$,

$$\left\|\max_{1\le t\le 2^k}|S_t|\right\|_2\le\sum_{r=0}^{k}\left[\sum_{j=1}^{2^{k-r}}\left(\left\|S_{2^rj}-S_{2^r(j-1)}\right\|_2\right)^2\right]^{1/2}=\sum_{r=0}^{k}\left[\sum_{j=1}^{2^{k-r}}O(2^r)\right]^{1/2}=O(k2^{k/2})=O(2^{k\rho}),$$

where $S_t=\sum_{i=1}^{t}v_i$. The desired result follows from the Borel-Cantelli lemma, as in the proof of Theorem 2.1 in Ling (2007b). □

Proof of Lemma 2:

We can show that $E(\sup_{\theta\in\Theta}|\ell_t(\theta)|^{2+\iota}) < \infty$ with some $\iota > 0$ for Case 1, Case 2, and Case 3, under the condition $E(Y_t^3)<\infty$ and the following result: if a non-negative random variable is bounded below, $X > \omega > 0$, and $E(X^{\epsilon}) < \infty$ for some $\epsilon > 0$, then $E(|\ln(X)|^s) < \infty$ for every $s > 0$. Therefore, the two strong laws of large numbers induced by Lemma 1 are readily applicable.

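We will show that, given any $\epsilon > 0$,

$$\lim_{\ell\to\infty}P\left(\bigcup_{n\ge\ell}\left\{\frac{1}{n}\sup_{\theta\in\Theta}\left|\sum_{t=-n}^{-1}\left[\ell_t(\theta)-E(\ell_t(\theta))\right]\right|>\epsilon\right\}\right)=0.$$

Invoking (C2), we see that Lemma 1 is applicable for each $\theta \in \Theta$ and implies that

$$\frac{1}{n}\left(\sum_{t=-n}^{-1}\left[\ell_t(\theta)-E(\ell_t(\theta))\right]\right)=o(1),\quad\text{a.s.} \tag{18}$$

In addition, by Lemma 1 in Chow and Teicher (1978, pg. 66), given any $\kappa > 0$ and integer $M$, there exists an $L_1$ such that when $\ell > L_1$,

$$P\left(\bigcup_{n\ge\ell}\left\{\frac{1}{n}\left|\sum_{t=-n}^{-1}\left[\ell_t(\theta)-E(\ell_t(\theta))\right]\right|>\epsilon/3\right\}\right)<\kappa/(3M). \tag{19}$$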

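Let $X_t(\delta)=\sup_{\theta\in\Theta}\sup_{\theta^*\in V(\theta,\delta)}|\ell_t(\theta^*)-\ell_t(\theta)|$ for $\delta > 0$. Then, $E(X_t(\delta)) \to 0$ as $\delta \to 0$, due to $E \sup_{\theta\in\Theta} |\ell_t(\theta)| < \infty$. It follows that there exists $\bar{\delta} > 0$ such that $E(X_t(\delta)) < \epsilon/6$ if $\delta \le \bar{\delta}$. Using the fact that $X_t(\delta)$ satisfies the conditions of Lemma 1 and by the same argument as for (18),

$$\lim_{\ell\to\infty}P\left(\bigcup_{n\ge\ell}\left\{\frac{1}{n}\left|\sum_{t=-n}^{-1}\left[X_t(\delta)-E(X_t(\delta))\right]\right|>\epsilon/6\right\}\right)=0.$$

Thus, for any $\kappa > 0$ there exists an integer $L_2$ such that for all $\ell > L_2$,

$$P\left(\bigcup_{n\ge\ell}\left\{\frac{1}{n}\sum_{t=-n}^{-1}X_t(\delta)>\epsilon/3\right\}\right)\le P\left(\bigcup_{n\ge\ell}\left\{\frac{1}{n}\left|\sum_{t=-n}^{-1}\left[X_t(\delta)-E(X_t(\delta))\right]\right|>\epsilon/6\right\}\right)<\kappa/3. \tag{20}$$

Also, since $E(\ell_t(\theta))$ is a continuous function defined on a compact set $\Theta$, there exists $\tilde{\delta} > 0$ such that for all $\delta \le \tilde{\delta}$,

$$P\left(\sup_{\theta\in\Theta}\sup_{\theta^*\in V(\theta,\delta)}\left|E\left[\ell_t(\theta)-\ell_t(\theta^*)\right]\right|>\epsilon/3\right)=0. \tag{21}$$

Now cover $\Theta$ with a finite number of balls $V(\theta_i, \delta_i)$, $i \in \{1, \ldots, M\}$, with

$$\delta_1=\cdots=\delta_M\le\min\{\bar{\delta},\tilde{\delta}\}.$$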

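Therefore, in view of (19), (20) and (21), given any $\epsilon > 0$ and $\kappa > 0$, there exists $L > 0$ such that for all $\ell > L$,

$$\begin{aligned}
P\left(\bigcup_{n\ge\ell}\left\{\frac{1}{n}\sup_{\theta\in\Theta}\left|\sum_{t=-n}^{-1}\left[\ell_t(\theta)-E(\ell_t(\theta))\right]\right|>\epsilon\right\}\right)
&\le P\left(\bigcup_{n\ge\ell}\left\{\max_{1\le i\le M}\sup_{\theta\in V(\theta_i,\delta_i)}\left|\frac{1}{n}\sum_{t=-n}^{-1}\left[\ell_t(\theta)-\ell_t(\theta_i)\right]\right|>\epsilon/3\right\}\right)\\
&\quad+P\left(\max_{1\le i\le M}\sup_{\theta\in V(\theta_i,\delta_i)}\left|E\left[\ell_t(\theta)-\ell_t(\theta_i)\right]\right|>\epsilon/3\right)\\
&\quad+\sum_{i=1}^{M}P\left(\bigcup_{n\ge\ell}\left\{\frac{1}{n}\left|\sum_{t=-n}^{-1}\left[\ell_t(\theta_i)-E(\ell_t(\theta_i))\right]\right|>\epsilon/3\right\}\right)<\kappa,
\end{aligned}$$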

which finishes the proof. □

Proof of Lemma 3:

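The direct application of Theorem 4 leads to finite moments of any order for both count series, i.e., $E(Y_{1t}^s)<\infty$ and $E(Y_{2t}^s)<\infty$ for all $s > 0$. Also, the stationary solution to the INGARCH process assumes the form $\lambda_t=\pi_0+\sum_{k=1}^{\infty}\pi_kY_{t-k}$, where the $\pi_k$ are determined by $\sum_{k=1}^{\infty}\pi_kz^k=A_\theta^{-1}(z)B_\theta(z)$ and $\pi_0=A_\theta^{-1}(1)\delta$, with $A_\theta(z)=1-\sum_{i=1}^{p}\alpha_iz^i$ and $B_\theta(z)=\sum_{j=1}^{q}\beta_jz^j$. Since $\sum_{i=1}^{p}\alpha_i<1$ and $1-\sum_{i=1}^{p}\alpha_i-\sum_{j=1}^{q}\beta_j>0$, it follows from Theorem 3.1.1 of Brockwell and Davis (1991) that

$$\sum_{k=1}^{\infty}\pi_k=\sum_{j=1}^{q}\beta_j\Big/\left(1-\sum_{i=1}^{p}\alpha_i\right)<1. \tag{22}$$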

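From the above, we have

$$\lambda_{1t}=\pi_{1,0}+\sum_{\ell=1}^{\infty}\pi_{1,\ell}Y_{1,t-\ell},\qquad\lambda_{2t}=\pi_{2,0}+\sum_{\ell=1}^{\infty}\pi_{2,\ell}Y_{2,t-\ell}.$$

Due to (22), the sum of the coefficients of either of the two expansions is less than 1, i.e., $0<\sum_{\ell=1}^{\infty}\pi_{i,\ell}<1$, $i \in \{1, 2\}$.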
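The expansion coefficients and the bound in (22) can be checked numerically. The following is a minimal sketch, assuming the recursion obtained by matching powers of $z$ in $\pi(z)A_\theta(z)=B_\theta(z)$, namely $\pi_k=\beta_k+\sum_{i=1}^{p}\alpha_i\pi_{k-i}$ with $\beta_k=0$ for $k>q$ and $\pi_j=0$ for $j\le 0$. The function name and the parameter values are hypothetical; `alphas` are the coefficients on lagged intensities and `betas` the coefficients on lagged counts.

```python
def arch_infinity_weights(alphas, betas, n_terms):
    """First n_terms coefficients pi_1, ..., pi_n of B(z) / A(z).

    A(z) = 1 - sum_i alphas[i-1] * z**i and B(z) = sum_j betas[j-1] * z**j.
    """
    pis = []
    for k in range(1, n_terms + 1):
        # beta_k contributes only while k <= q
        value = betas[k - 1] if k <= len(betas) else 0.0
        # alpha_i * pi_{k-i}, using only indices k - i >= 1
        for i, a in enumerate(alphas, start=1):
            if k - i >= 1:
                value += a * pis[k - i - 1]
        pis.append(value)
    return pis


if __name__ == "__main__":
    alphas, betas = [0.2, 0.1], [0.3, 0.2]
    pis = arch_infinity_weights(alphas, betas, 400)
    closed_form = sum(betas) / (1.0 - sum(alphas))
    print(sum(pis), closed_form)   # both should be close to 5/7
```

For the (1,1) case the recursion reduces to $\pi_k=\beta\alpha^{k-1}$, so the partial sums approach $\beta/(1-\alpha)$ geometrically.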

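In the following, we take advantage of the property that, given $\lambda_{1t}$ and $\lambda_{2t}$, $|Y_{2t}-Y_{1t}|=|\tilde{N}_t(\lambda_{2t})-\tilde{N}_t(\lambda_{1t})|$ follows a Poisson distribution with mean $\mu_t = |\lambda_{2t} - \lambda_{1t}|$. For $s = 1$, we have

$$E|Y_{2t}-Y_{1t}|=E\left(E\left(|Y_{2t}-Y_{1t}|\mid\lambda_{1t},\lambda_{2t}\right)\right)=E(\mu_t).$$

Then it follows that

$$\begin{aligned}
E|Y_{2t}-Y_{1t}|&\le|\pi_{2,0}-\pi_{1,0}|+E\left|\sum_{\ell=1}^{\infty}\left(\pi_{2,\ell}Y_{2,t-\ell}-\pi_{1,\ell}Y_{1,t-\ell}\right)\right|\\
&\le|\pi_{2,0}-\pi_{1,0}|+\sum_{\ell=1}^{\infty}\pi_{2,\ell}E|Y_{2,t-\ell}-Y_{1,t-\ell}|+\sum_{\ell=1}^{\infty}|\pi_{2,\ell}-\pi_{1,\ell}|E(Y_{1,t-\ell}).
\end{aligned}$$

Using stationarity and $1-\sum_{\ell=1}^{\infty}\pi_{2,\ell}>0$, we see that as $\|d\| \to 0$,

$$E|Y_{2t}-Y_{1t}|\le\frac{|\pi_{2,0}-\pi_{1,0}|+\sum_{\ell=1}^{\infty}|\pi_{1,\ell}-\pi_{2,\ell}|E(Y_{1,t-\ell})}{1-\sum_{\ell=1}^{\infty}\pi_{2,\ell}}\to 0,$$

due to the dominated convergence theorem. As a byproduct of the above, we also obtain $E(\mu_t) \to 0$ as $\|d\| \to 0$.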

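By mathematical induction, we assume that the result is true for all $s \le n$, for a certain integer $n > 0$. Recall that, for a Poisson random variable $Y$ with mean $\lambda$, the moments of $Y$ satisfy

$$E(Y^n)=\sum_{j=1}^{n}S(n,j)\lambda^j, \tag{23}$$

where $n$ is any positive integer, and $S(n, j)$ is the Stirling number of the second kind, with $S(n, 1) = S(n, n) = 1$ for any integer $n > 0$. Using (23), we have

$$\begin{aligned}
\left(E|Y_{2t}-Y_{1t}|^{n+1}\right)^{\frac{1}{n+1}}&=\left(E\left(\sum_{j=1}^{n+1}S(n+1,j)\mu_t^j\right)\right)^{\frac{1}{n+1}}=\left(E\left(\mu_t^{n+1}+\sum_{j=1}^{n}S(n+1,j)\mu_t^j\right)\right)^{\frac{1}{n+1}}\\
&\le\left(E(\mu_t^{n+1})\right)^{\frac{1}{n+1}}+\left(E\left(\sum_{j=1}^{n}S(n+1,j)\mu_t^j\right)\right)^{\frac{1}{n+1}}.
\end{aligned} \tag{24}$$

As in the case $s = 1$, invoking Minkowski's inequality, we have

$$\begin{aligned}
\left(E(\mu_t^{n+1})\right)^{\frac{1}{n+1}}&\le|\pi_{2,0}-\pi_{1,0}|+\left(E\left|\sum_{\ell=1}^{\infty}\left(\pi_{2,\ell}Y_{2,t-\ell}-\pi_{1,\ell}Y_{1,t-\ell}\right)\right|^{n+1}\right)^{\frac{1}{n+1}}\\
&\le|\pi_{2,0}-\pi_{1,0}|+\sum_{\ell=1}^{\infty}\pi_{2,\ell}\left(E|Y_{2,t-\ell}-Y_{1,t-\ell}|^{n+1}\right)^{\frac{1}{n+1}}+\sum_{\ell=1}^{\infty}|\pi_{2,\ell}-\pi_{1,\ell}|\left(E(Y_{1,t-\ell}^{n+1})\right)^{\frac{1}{n+1}}.
\end{aligned} \tag{25}$$

Combining (24) and (25), we have, as $\|d\|$ tends to zero,

$$\left(E|Y_{2t}-Y_{1t}|^{n+1}\right)^{\frac{1}{n+1}}\le\frac{\left(E\left(\sum_{j=1}^{n}S(n+1,j)\mu_t^j\right)\right)^{\frac{1}{n+1}}+|\pi_{2,0}-\pi_{1,0}|+\sum_{\ell=1}^{\infty}|\pi_{2,\ell}-\pi_{1,\ell}|\left(E(Y_{1,t-\ell}^{n+1})\right)^{\frac{1}{n+1}}}{1-\sum_{\ell=1}^{\infty}\pi_{2,\ell}}\to 0.$$

In addition, due to (25), it follows that $\lim_{\|d\|\to 0}\left(E(\mu_t^{n+1})\right)^{\frac{1}{n+1}}=0$. □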
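The Poisson moment identity (23) used in the induction step can be verified numerically. The sketch below computes Stirling numbers of the second kind with the standard recurrence $S(n,j)=jS(n-1,j)+S(n-1,j-1)$ and compares the resulting moment with a direct summation against the Poisson probability mass function. The function names and the truncation point `cutoff` are hypothetical choices for illustration.

```python
import math


def stirling2(n, j):
    """Stirling number of the second kind S(n, j) via the usual recurrence."""
    table = [[0] * (j + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for a in range(1, n + 1):
        for b in range(1, min(a, j) + 1):
            table[a][b] = b * table[a - 1][b] + table[a - 1][b - 1]
    return table[n][j]


def poisson_moment_stirling(n, lam):
    """E(Y^n) for Y ~ Poisson(lam), computed from identity (23)."""
    return sum(stirling2(n, j) * lam ** j for j in range(1, n + 1))


def poisson_moment_direct(n, lam, cutoff=200):
    """E(Y^n) by summing y^n * P(Y = y) for y = 0, ..., cutoff."""
    total, pmf = 0.0, math.exp(-lam)
    for y in range(cutoff + 1):
        total += (y ** n) * pmf
        pmf *= lam / (y + 1)      # P(Y = y + 1) from P(Y = y)
    return total


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, poisson_moment_stirling(n, 2.0), poisson_moment_direct(n, 2.0))
```

For example, with $\lambda=2$ the third moment is $\lambda+3\lambda^2+\lambda^3=22$, and the leading term $\lambda^{n}$ is the one isolated in (24).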

Proof of Lemma 4:

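For the Poisson INGARCH(1, 1) model, we have

$$\ell_{1t}(\theta)=Y_{1t}\log(\lambda_{1t}(\theta))-\lambda_{1t}(\theta),\qquad D_{1t}(\theta)=\left(\frac{Y_{1t}}{\lambda_{1t}(\theta)}-1\right)\frac{\partial\lambda_{1t}(\theta)}{\partial\theta},$$
$$P_{1t}(\theta)=-\left(\frac{Y_{1t}}{\lambda_{1t}(\theta)}-1\right)\frac{\partial^2\lambda_{1t}(\theta)}{\partial\theta\,\partial\theta'}+\frac{Y_{1t}}{\lambda_{1t}^2(\theta)}\frac{\partial\lambda_{1t}(\theta)}{\partial\theta}\frac{\partial\lambda_{1t}(\theta)}{\partial\theta'}.$$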

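The result follows from the fact that, for $i \in \{1, 2, 3\}$, as $\|d\| \to 0$,

$$E\left|\left(\frac{Y_{2t}}{\lambda_{2t}(\theta_{20})}-1\right)\frac{\partial\lambda_{2t}(\theta_{20})}{\partial\theta_i}-\left(\frac{Y_{1t}}{\lambda_{1t}(\theta_{10})}-1\right)\frac{\partial\lambda_{1t}(\theta_{10})}{\partial\theta_i}\right|^2\to 0.$$

First,

$$E\left|\frac{Y_{2t}}{\lambda_{2t}(\theta_{20})}\frac{\partial\lambda_{2t}(\theta_{20})}{\partial\theta_i}-\frac{Y_{1t}}{\lambda_{1t}(\theta_{10})}\frac{\partial\lambda_{1t}(\theta_{10})}{\partial\theta_i}\right|^2\le 3(A_1+A_2+A_3),$$

where

$$A_1=E\left|\frac{Y_{2t}}{\lambda_{2t}(\theta_{20})}\frac{\partial\lambda_{2t}(\theta_{20})}{\partial\theta_i}-\frac{Y_{1t}}{\lambda_{2t}(\theta_{20})}\frac{\partial\lambda_{2t}(\theta_{20})}{\partial\theta_i}\right|^2,\quad A_2=E\left|\frac{Y_{1t}}{\lambda_{2t}(\theta_{20})}\frac{\partial\lambda_{2t}(\theta_{20})}{\partial\theta_i}-\frac{Y_{1t}}{\lambda_{1t}(\theta_{10})}\frac{\partial\lambda_{2t}(\theta_{20})}{\partial\theta_i}\right|^2,$$
$$A_3=E\left|\frac{Y_{1t}}{\lambda_{1t}(\theta_{10})}\frac{\partial\lambda_{2t}(\theta_{20})}{\partial\theta_i}-\frac{Y_{1t}}{\lambda_{1t}(\theta_{10})}\frac{\partial\lambda_{1t}(\theta_{10})}{\partial\theta_i}\right|^2.$$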

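Note that $\frac{1}{\lambda_{1t}(\theta_{10})}\frac{\partial\lambda_{1t}(\theta_{10})}{\partial\theta_i}$ and $\frac{1}{\lambda_{2t}(\theta_{20})}\frac{\partial\lambda_{2t}(\theta_{20})}{\partial\theta_i}$ have finite moments of any order by Remark 2.3 of Ahmad and Francq (2016). For $A_1$, in light of this property together with $E|Y_{2t}-Y_{1t}|^3<\infty$ and $\lim_{\|d\|\to 0}E|Y_{2t}-Y_{1t}|^3=0$ from Lemma 3, it is straightforward to derive $\lim_{\|d\|\to 0} A_1 = 0$ via Hölder's inequality. For $A_2$, since $\lambda_{1t}(\theta)$ is bounded below by $\delta$, we have

$$\begin{aligned}
A_2&\le\frac{1}{\underline{\delta}^2}E\left|Y_{1t}\left(\lambda_{1t}(\theta_{10})-\lambda_{2t}(\theta_{20})\right)\frac{1}{\lambda_{2t}(\theta_{20})}\frac{\partial\lambda_{2t}(\theta_{20})}{\partial\theta_i}\right|^2\\
&\le\frac{1}{\underline{\delta}^2}\left(E|Y_{1t}|^5\right)^{2/5}\left(E|\lambda_{2t}(\theta_{20})-\lambda_{1t}(\theta_{10})|^5\right)^{2/5}\left(E\left|\frac{1}{\lambda_{2t}(\theta_{20})}\frac{\partial\lambda_{2t}(\theta_{20})}{\partial\theta_i}\right|^{10}\right)^{1/5}.
\end{aligned}$$

Invoking Lemma 3, we have $\lim_{\|d\|\to 0} A_2 = 0$. For the last term $A_3$, it is straightforward to show that $\frac{Y_{1t}}{\lambda_{1t}(\theta_{10})}$ has finite moments of any order. Then, it follows from (12) and Lemma 3 that

$$\left(E\left|\frac{\partial\lambda_{2t}(\theta_{20})}{\partial\theta_i}-\frac{\partial\lambda_{1t}(\theta_{10})}{\partial\theta_i}\right|^3\right)^{1/3}\le|\omega_{20i}-\omega_{10i}|+\sum_{\ell=1}^{\infty}\omega_{2\ell i}\left(E|Y_{2,t-\ell}-Y_{1,t-\ell}|^3\right)^{1/3}+\sum_{\ell=1}^{\infty}|\omega_{2\ell i}-\omega_{1\ell i}|\left(E(Y_{1,t-\ell})^3\right)^{1/3}\to 0,\quad\text{as }\|d\|\to 0,$$

from which we can show $\lim_{\|d\|\to 0}A_3 = 0$ by Hölder's inequality. Collecting all the results, we prove the first part of the lemma. Note that the above arguments apply to the negative binomial case, for which finiteness of the fifth moment is the least stringent (integer) moment condition needed for $A_2 \to 0$ as $\|d\| \to 0$.

Using a similar technique to decompose the terms involved, and also by virtue of Lemma 3, (12) and Hölder's inequality, we prove the convergence of the Hessian matrix. Note that $E(Y_{1t}^4)<\infty$ and $E(Y_{2t}^4)<\infty$ are invoked in the proof to show that, as $\|d\| \to 0$,

$$E\left|\frac{Y_{1t}}{\lambda_{1t}^2(\theta_{10})}\frac{\partial\lambda_{1t}(\theta_{10})}{\partial\theta_i}\frac{\partial\lambda_{1t}(\theta_{10})}{\partial\theta_j}-\frac{Y_{2t}}{\lambda_{2t}^2(\theta_{20})}\frac{\partial\lambda_{2t}(\theta_{20})}{\partial\theta_i}\frac{\partial\lambda_{2t}(\theta_{20})}{\partial\theta_j}\right|\to 0.$$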
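The Poisson INGARCH(1,1) log-likelihood and score that this proof works with can be sketched in a few lines and checked against a numerical gradient. The sketch assumes the recursion $\lambda_t=\delta+\alpha\lambda_{t-1}+\beta Y_{t-1}$ with $\theta=(\delta,\alpha,\beta)$ and a fixed starting intensity `lam0` that does not depend on $\theta$; the function names, the toy data, and the initialization are hypothetical and are not taken from the paper's implementation.

```python
import math


def intensity_and_gradient(y, theta, lam0=1.0):
    """Intensities lambda_t and their gradients for t = 1, ..., len(y) - 1.

    Assumed recursion: lambda_t = delta + alpha * lambda_{t-1} + beta * y_{t-1},
    started from a fixed lam0 whose gradient is zero.
    """
    delta, alpha, beta = theta
    lam, grad = lam0, (0.0, 0.0, 0.0)
    lams, grads = [], []
    for t in range(1, len(y)):
        # Differentiate the recursion; `lam` still holds lambda_{t-1} here.
        grad = (1.0 + alpha * grad[0],
                lam + alpha * grad[1],
                y[t - 1] + alpha * grad[2])
        lam = delta + alpha * lam + beta * y[t - 1]
        lams.append(lam)
        grads.append(grad)
    return lams, grads


def log_likelihood(y, theta, lam0=1.0):
    """Sum of l_t = y_t * log(lambda_t) - lambda_t (constant term dropped)."""
    lams, _ = intensity_and_gradient(y, theta, lam0)
    return sum(y[t + 1] * math.log(lam) - lam for t, lam in enumerate(lams))


def score(y, theta, lam0=1.0):
    """Sum of D_t = (y_t / lambda_t - 1) * d(lambda_t)/d(theta)."""
    lams, grads = intensity_and_gradient(y, theta, lam0)
    total = [0.0, 0.0, 0.0]
    for t, (lam, grad) in enumerate(zip(lams, grads)):
        weight = y[t + 1] / lam - 1.0
        for i in range(3):
            total[i] += weight * grad[i]
    return total


if __name__ == "__main__":
    y = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]      # toy counts, not real data
    print(score(y, (1.0, 0.3, 0.4)))
```

Comparing `score` with a central finite difference of `log_likelihood` is a quick way to confirm that the derivative recursion matches the stated form of $D_{1t}(\theta)$.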

Proof of Theorem 6:

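Under the moment condition on $Y_{1t}$, and following the same lines as the proof of Lemma 2, we can show that the uniform strong law of large numbers for the backward sums holds for $P_{1t}(\theta)$, namely,

$$\frac{1}{u}\sup_{\theta\in\Theta}\left\|\sum_{t=-u}^{-1}\left[P_{1t}(\theta)-E(P_{1t}(\theta))\right]\right\|=o(1),\quad\text{a.s., as }u\to\infty.$$

In light of (12), it is straightforward to show $E(\sup_{\theta\in\Theta}\|P_{1t}(\theta)\|)<\infty$, from which we can see that $E(P_{1t}(\theta))$ is a continuous function of $\theta$, since $\ell_{1t}(\theta)$ satisfies (C3), i.e., $\ell_{1t}(\theta)$ is twice continuously differentiable. Let $\theta^* \in \Theta$ be a sequence that lies between $\theta_{10}$ and $\theta_{20}$. Then, it follows from the dominated convergence theorem that, as $\|d\| \to 0$,

$$\left\|E(P_{1t}(\theta^*))-E(P_{1t}(\theta_{10}))\right\|\to 0.$$

Note that the definition of $m$ renders $\|d\|^2 = O(1/m)$ and $\theta^*=\theta_{10}+O(1/\sqrt{m})$. Then, we have

$$\begin{aligned}
\left\|\frac{1}{\lfloor mz\rfloor}\sum_{t=-\lfloor mz\rfloor}^{-1}P_{1t}(\theta^*)-\frac{1}{\lfloor mz\rfloor}\sum_{t=-\lfloor mz\rfloor}^{-1}P_{1t}(\theta_{10})\right\|
&\le\left\|\frac{1}{\lfloor mz\rfloor}\sum_{t=-\lfloor mz\rfloor}^{-1}\left[P_{1t}(\theta^*)-E(P_{1t}(\theta^*))\right]\right\|+\left\|\frac{1}{\lfloor mz\rfloor}\sum_{t=-\lfloor mz\rfloor}^{-1}\left[P_{1t}(\theta_{10})-E(P_{1t}(\theta_{10}))\right]\right\|\\
&\quad+\left\|E(P_{1t}(\theta^*))-E(P_{1t}(\theta_{10}))\right\|\\
&\le 2\,\frac{1}{\lfloor mz\rfloor}\sup_{\theta\in\Theta}\left\|\sum_{t=-\lfloor mz\rfloor}^{-1}\left[P_{1t}(\theta)-E(P_{1t}(\theta))\right]\right\|+\left\|E(P_{1t}(\theta^*))-E(P_{1t}(\theta_{10}))\right\|\\
&=o(1),\quad\text{a.s., as }m\to\infty.
\end{aligned} \tag{26}$$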

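By Taylor series expansion, for any $\lfloor mz\rfloor > 0$, it is straightforward to show

$$\sum_{t=-\lfloor mz\rfloor}^{-1}\left[\ell_{1t}(\theta_{20})-\ell_{1t}(\theta_{10})\right]=\sum_{t=-\lfloor mz\rfloor}^{-1}d'D_{1t}(\theta_{10})-\frac{1}{2}\sum_{t=-\lfloor mz\rfloor}^{-1}d'P_{1t}(\theta^*)d,$$

where $\theta^*$ lies between $\theta_{10}$ and $\theta_{20}$. From (26), we have

$$\sum_{t=-\lfloor mz\rfloor}^{-1}d'P_{1t}(\theta^*)d=d'\lfloor mz\rfloor\left(\frac{1}{\lfloor mz\rfloor}\sum_{t=-\lfloor mz\rfloor}^{-1}P_{1t}(\theta_{10})+o(1)\right)d=d'\sum_{t=-\lfloor mz\rfloor}^{-1}P_{1t}(\theta_{10})d+o(1)\,d'\lfloor mz\rfloor d=d'\sum_{t=-\lfloor mz\rfloor}^{-1}P_{1t}(\theta_{10})d+o(1),$$

where the last equality is due to $m\,d'd = O(1)$. As a result, we have

$$\sum_{t=-\lfloor mz\rfloor}^{-1}\left[\ell_{1t}(\theta_{20})-\ell_{1t}(\theta_{10})\right]=\sum_{t=-\lfloor mz\rfloor}^{-1}d'D_{1t}(\theta_{10})-\frac{1}{2}\sum_{t=-\lfloor mz\rfloor}^{-1}d'P_{1t}(\theta_{10})d+o(1).$$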

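Due to Lemma 4 and Chebyshev's inequality, we have, as $\|d\| \to 0$,

$$\sum_{t=-\lfloor mz\rfloor}^{-1}d'D_{1t}(\theta_{10})=\sum_{t=-\lfloor mz\rfloor}^{-1}d'D_{2t}(\theta_{20})+o_p(1). \tag{27}$$

Using Lemma 4 and Markov's inequality,

$$d'\sum_{t=-\lfloor mz\rfloor}^{-1}P_{1t}(\theta_{10})d-d'\sum_{t=-\lfloor mz\rfloor}^{-1}P_{2t}(\theta_{20})d=o_p(1). \tag{28}$$

Consequently, (27) and (28) lead to

$$\sum_{t=-\lfloor mz\rfloor}^{-1}\left[\ell_{1t}(\theta_{20})-\ell_{1t}(\theta_{10})\right]=\sum_{t=-\lfloor mz\rfloor}^{-1}d'D_{2t}(\theta_{20})-\frac{1}{2}\sum_{t=-\lfloor mz\rfloor}^{-1}d'P_{2t}(\theta_{20})d+o_p(1).$$

Invoking $E(Y_{1t}^3)<\infty$ and the same arguments as for (26), we can show that the uniform strong law of large numbers for the backward sums holds for $P_{2t}(\theta)$, so that

$$\frac{1}{\lfloor mz\rfloor}\sum_{t=-\lfloor mz\rfloor}^{-1}P_{2t}(\theta_{20})=\Sigma_{\theta_{20}}+o_p(1).$$

Therefore, assumption (A1) is satisfied. In addition, we can show that (A2) also holds true: its moment condition is satisfied due to $E(Y_{2t}^3)<\infty$, and the strong mixing properties are inherited from $\{Y_{2t}\}$. In view of Theorem 3, we finally establish the result. □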

Footnotes

Supporting Information

Additional information for this article is available online including the proof for (16), the proof for Theorem 4, the simulation results for Remark 3, and the ACF and PACF graphs for M0 and M0*.

REFERENCES

  1. Ahmad A and Francq C (2016) Poisson qmle of count time series models. Journal of Time Series Analysis, 37, 291–314.
  2. Ash RB and Doléans-Dade CA (2000) Probability & measure theory. San Diego, California: Academic Press, second edn.
  3. Aue A and Horváth L (2013) Structural breaks in time series. Journal of Time Series Analysis, 34, 1–16.
  4. Bai J (1994) Least squares estimation of a shift in linear processes. Journal of Time Series Analysis, 15, 453–472.
  5. Berkes I, Horváth L and Kokoszka P (2003) Garch processes: structure and estimation. Bernoulli, 9, 201–207.
  6. Brockwell AE (2007) Universal residuals: a multivariate transformation. Statistics and Probability Letters, 77, 1473–1478.
  7. Chow YS and Teicher H (1978) Probability theory: independence, interchangeability, martingales. New York: Springer-Verlag.
  8. Christou V and Fokianos K (2014) Quasi-likelihood inference for negative binomial time series models. Journal of Time Series Analysis, 35, 55–78.
  9. Cui Y and Wang X (2019) Conditional maximum likelihood estimation in negative binomial ingarch processes with known number of successes when the true parameter is at the boundary of the parameter space. Communications in Statistics – Theory and Methods. To appear.
  10. Cui Y and Wu R (2016) On conditional maximum likelihood estimation for ingarch(p, q) models. Statistics and Probability Letters, 118, 1–7.
  11. Cui Y and Wu R (2020) Test of parameter changes in a class of observation-driven models for count time series. Communications in Statistics – Theory and Methods, 49, 1933–1959.
  12. Cui Y and Zheng Q (2017) Conditional maximum likelihood estimation for a class of observation-driven time series models for count data. Statistics and Probability Letters, 123, 193–201.
  13. Davis RA, Dunsmuir WTM and Wang Y (2000) On autocorrelation in a poisson regression model. Biometrika, 87, 491–505.
  14. Davis RA, Huang D and Yao YC (1995) Testing for a change in the parameter values and order of an autoregressive model. The Annals of Statistics, 23, 282–304.
  15. Davis RA and Liu H (2016) Theory and inference for a class of nonlinear models with application to time series of counts. Statistica Sinica, 26, 1673–1707.
  16. Davis RA and Mikosch T (2009) Probabilistic properties of stochastic volatility models. In Handbook of Financial Time Series (eds. Andersen TG, Davis RA, Kreiß JP and Mikosch T), 255–267. Berlin Heidelberg: Springer-Verlag.
  17. Davis RA and Wu R (2009) A negative binomial model for time series of counts. Biometrika, 96, 735–749.
  18. Diop ML and Kengne W (2017) Testing parameter change in general integer-valued time series. Journal of Time Series Analysis, 38, 880–894.
  19. Doukhan P and Kengne W (2015) Inference and testing for structural change in general poisson autoregressive models. Electronic Journal of Statistics, 9, 1267–1314.
  20. Doukhan P and Neumann MH (2019) Absolute regularity of semi-contractive garch-type processes. Journal of Applied Probability, 56, 91–115.
  21. Du J-G and Li Y (1991) The integer-valued autoregressive (inar(p)) model. Journal of Time Series Analysis, 12, 129–142.
  22. Ferland R, Latour A and Oraichi D (2006) Integer-valued garch process. Journal of Time Series Analysis, 27, 923–942.
  23. Fokianos K and Fried R (2010) Interventions in ingarch processes. Journal of Time Series Analysis, 31, 210–225.
  24. Fokianos K, Rahbek A and Tjøstheim D (2009) Poisson autoregression. Journal of the American Statistical Association, 104, 1430–1439.
  25. Francq C and Zakoïan J-M (2010) GARCH models: structure, statistical inference and financial applications. Chichester, UK: Wiley.
  26. Franke J, Kirch C and Kamgaing JT (2012) Changepoints in times series of counts. Journal of Time Series Analysis, 33, 757–770.
  27. Hudecová Š, Hušková M and Meintanis S (2015) Detection of changes in inar models. In Stochastic Models, Statistics and Their Applications, Springer Proceedings in Mathematics & Statistics (PROMS, volume 122) (eds. Steland A, Rafajłowicz E and Szajowski K), chap. 2, 11–18. Wrocław Poland: Springer-Verlag.
  28. Jandhyala V, Fotopoulos S, MacNeill I and Liu P (2013) Inference for single and multiple change-points in time series. Journal of Time Series Analysis, 34, 423–446.
  29. Kang J and Lee S (2014) Parameter change test for poisson autoregressive models. Scandinavian Journal of Statistics, 41, 1136–1152.
  30. Kuelbs J and Philipp W (1980) Almost sure invariance principles for partial sums of mixing b-valued random variables. The Annals of Probability, 8, 1003–1036.
  31. Ling S (2007a) Self-weighted and local quasi-maximum likelihood estimators for arma-garch/igarch models. Journal of Econometrics, 140, 849–873.
  32. Ling S (2007b) Testing for change-points in time series models and limiting theorems for ned sequences. Annals of Statistics, 35, 1213–1237.
  33. Ling S (2016) Estimation of change-points in linear and nonlinear time series models. Econometric Theory, 32, 402–430.
  34. Ling S and McAleer M (2010) A general asymptotic theory for time-series models. Statistica Neerlandica, 64, 97–111.
  35. Neumann MH (2011) Absolute regularity and ergodicity of poisson count processes. Bernoulli, 17, 1268–1284.
  36. Robbins M, Gallagher C, Lund R and Aue A (2011) Mean shift testing in correlated data. Journal of Time Series Analysis, 32, 498–511.
  37. Stryhn H (1996) The location of the maximum of asymmetric two-sided brownian motion with triangular drift. Statistics and Probability Letters, 29, 279–284.
  38. Weiß CH and Testik MC (2009) Cusum monitoring of first-order integer-valued autoregressive processes of poisson counts. Journal of Quality Technology, 41, 389–400.
  39. Weiß CH and Testik MC (2011) The poisson inar(1) cusum chart under overdispersion and estimation error. IIE Transactions, 43, 805–818.
  40. Weiß CH and Testik MC (2012) Detection of abrupt changes in count data time series: cumulative sum derivations for inarch(1) models. Journal of Quality Technology, 44, 249–264.
  41. Wu W and Shao X (2004) Limit theorems for iterated random functions. Journal of Applied Probability, 41, 425–436.
  42. Wu WB (2007) Strong invariance principles for dependent random variables. The Annals of Probability, 35, 2294–2320.
  43. Yao Y-C (1987) Approximating the distribution of the maximum likelihood estimate of the change-point in a sequence of independent random variables. The Annals of Statistics, 15, 1321–1328.
  44. Yau CY and Zhao Z (2016) Inference for multiple change points in time series via likelihood ratio scan statistics. Journal of the Royal Statistical Society: Series B, 78, 895–916.
  45. Zhu F (2011) A negative binomial integer-valued garch model. Journal of Time Series Analysis, 32, 54–67.
