Abstract
Motivated by the extended Poisson INAR(1), which allows innovations to be serially dependent, we develop a new family of binomial-mixed Poisson INAR(1) (BMP INAR(1)) processes by adding a mixed Poisson component to the innovations of the classical Poisson INAR(1) process. Due to the flexibility of the mixed Poisson component, the model includes a large class of INAR(1) processes with different transition probabilities. Moreover, it can capture some overdispersion features coming from the data while keeping the innovations serially dependent. We discuss its statistical properties, stationarity conditions and transition probabilities for different mixing densities (Exponential, Lindley). Then, we derive the maximum likelihood estimation method and its asymptotic properties for this model. Finally, we demonstrate our approach using a real data example of iceberg count data from a financial system.
Keywords: Count data time series, binomial-mixed Poisson INAR(1) models, mixed Poisson distribution, overdispersion, maximum likelihood estimation
1. Introduction
Modelling integer-valued count time series has attracted considerable attention over the last few years in a plethora of scientific fields, such as the social sciences, healthcare, insurance, economics and the financial industry. The standard ARMA model inevitably produces real-valued results and is therefore not appropriate for modelling this type of data. As a result, many alternative classes of integer-valued time series models have been introduced and explored in the applied statistical literature. The integer-valued autoregressive process of order one, abbreviated as INAR(1), was proposed by McKenzie [8] and Al-Osh and Alzaid [1] as a counterpart to the Gaussian AR(1) model for Poisson counts. This model was derived by manipulating the operation between coefficients and variables, as well as the innovation term, in such a way that the values are always integers. The relationship between coefficient and variable is defined as \( \alpha \circ X = \sum_{i=1}^{X} B_i \), where the \( B_i \) are i.i.d. Bernoulli random variables with parameter α and ∘ denotes the binomial thinning operator. Binomial thinning is very easy to interpret, and the binomial INAR(1) has the same autocorrelation structure as the standard AR(1) model and hence can be applied to fit count data. For a general review, see [11,12].
Later on, in order to accommodate different features exhibited by count data, for example underdispersion, overdispersion, the probability of observing zero and different dependence structures, many studies introduced alternative thinning operators or varied the distribution of the counting variables for different needs. The case where the counting variables are i.i.d. geometric random variables is analysed by Ristić et al. [10] and is called NGINAR(1). Kirchner [7] introduced reproduction operators, in which the counting variables are i.i.d. Poisson random variables, to explore the relationship between the Hawkes process and integer-valued time series. For further variation, random-coefficient thinning was introduced, in which the counting variables are i.i.d. Bernoulli with the parameter α itself being a random variable. This type of thinning operator was proposed by McKenzie [8,9] and Zheng et al. [14], who applied it to a generalized INAR(1) model. In particular, to accommodate the overdispersion feature, one way is to change the thinning operator from binomial to another type, as discussed above. Another way is to replace the innovation distribution with some other overdispersed distribution; see, for example, [2]. A third approach is to keep the structure of the binomial INAR(1) but allow the innovation terms to be serially dependent; see [13].
In this study, motivated by Weiß [13], we develop a new family of binomial-mixed Poisson INAR(1) (BMP INAR(1)) processes by adding a mixed Poisson component to the innovations term of the classical Poisson INAR(1) process. The proposed class of BMP INAR(1) processes is ideally suited for modelling heterogeneity in count time series data since, due to the mixed Poisson component which we introduce herein, it includes many members with different transition probabilities that can adequately capture different levels of overdispersion in the data while keeping the innovation as independent Poisson.
The rest of the paper is organized as follows. Section 2 defines the binomial-mixed Poisson INAR(1) model by adding a mixed Poisson component to the Poisson INAR(1) model. Statistical properties and the stationarity condition are derived in Section 3. Section 4 derives the distribution of the mixed Poisson component for two different mixing densities from the exponential family, namely the Exponential and Lindley distributions. In Section 5, maximum likelihood estimation is discussed together with the asymptotic properties of the estimators. In Section 6, the model is fitted to financial data (iceberg counts) and numerical results are discussed. Finally, concluding remarks are provided in Section 7.
2. Construction of binomial mixed Poisson INAR(1)
In [13], the classical Poisson INAR(1) was extended by allowing the innovations \( \epsilon_t \) to depend on the current state of the model such that \( \epsilon_t \mid X_{t-1} \sim \mathrm{Poi}(b + aX_{t-1}) \), where a and b are some positive constants. The innovation with this definition is separable in the sense that \( \epsilon_t = \epsilon_t' + \zeta_t \), where \( \epsilon_t' \sim \mathrm{Poi}(b) \) is independent of the past, while \( \zeta_t \mid X_{t-1} \sim \mathrm{Poi}(aX_{t-1}) \) can be represented as a sum of \( X_{t-1} \) i.i.d. \( \mathrm{Poi}(a) \) random variables. To introduce further heterogeneity while maintaining the serially dependent innovation structure of this model, we extend it by allowing each of these summands to be a mixed Poisson random variable.
Starting from a Poisson random variable U with parameter θ, we may obtain a large class of random variables by allowing θ to be another random variable following some density function \( g(\theta; \varphi) \), where φ can be a scalar or a vector; see Karlis [6]. The random variable U then follows a mixed Poisson distribution with g as its mixing density. The probability mass function of U is given by

\[ P(U = u) = \int_0^\infty \frac{e^{-\theta}\,\theta^u}{u!}\, g(\theta; \varphi)\, d\theta, \qquad u = 0, 1, 2, \ldots \tag{1} \]
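As a numerical sanity check of (1), the following Python sketch integrates the Poisson kernel against an Exponential mixing density and compares the result with the Geometric pmf it is known to produce (the grid size and truncation point are arbitrary choices):

```python
import math

def mixed_poisson_pmf(u, g, upper=50.0, steps=50000):
    """Midpoint-rule approximation of P(U = u) = int_0^inf e^{-t} t^u / u! * g(t) dt."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        total += math.exp(-theta) * theta ** u / math.factorial(u) * g(theta) * h
    return total

# Exponential mixing density with mean phi: g(t) = (1/phi) e^{-t/phi}
phi = 1.5
g_exp = lambda t: math.exp(-t / phi) / phi

# For this g, U is Geometric: P(U = u) = phi^u / (1 + phi)^{u+1}
for u in range(5):
    closed = phi ** u / (1 + phi) ** (u + 1)
    assert abs(mixed_poisson_pmf(u, g_exp) - closed) < 1e-4
```

The same routine accepts any mixing density g, so it can be used to tabulate the pmf of U when no closed form is available.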
We now construct our model.
Definition 2.1
The binomial-mixed Poisson integer-valued autoregressive model (BMP INAR(1)) is defined by the following equations:

\[ X_t = \alpha \circ X_{t-1} + \sum_{i=1}^{X_{t-1}} U_{t,i} + \epsilon_t, \tag{2} \]

where

- ∘ is the binomial thinning operator, \( \alpha \circ X_{t-1} = \sum_{i=1}^{X_{t-1}} B_i \), with \( B_i \) i.i.d. Bernoulli random variables with parameter \( \alpha \in (0,1) \);
- the \( \epsilon_t \) are i.i.d. Poisson random variables with rate λ;
- the middle term is a reproduction operator applied to \( X_{t-1} \), with the \( U_{t,i} \) independent mixed Poisson distributed with mixing density function \( g(\theta; \varphi) \);
- the reproduction operator and ∘ are independent of each other, so that the thinning and reproduction components are conditionally independent given \( X_{t-1} \).
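To make the data-generating mechanism concrete, here is a minimal Python simulation sketch; it assumes the BMP INAR(1) recursion X_t = α∘X_{t-1} + Σ_{i=1}^{X_{t-1}} U_i + ε_t with an Exponential mixing density, and all function names and parameter values are illustrative:

```python
import math, random

def simulate_bmp_inar1(n, alpha, lam, draw_theta, x0=0, seed=1):
    """Simulate X_t = alpha o X_{t-1} + sum_i U_i + eps_t, with U_i mixed Poisson
    (mixing draws supplied by `draw_theta`) and eps_t ~ Poisson(lam)."""
    rng = random.Random(seed)

    def poisson(mu):                      # Knuth's Poisson sampler
        limit, k, p = math.exp(-mu), 0, 1.0
        while True:
            p *= rng.random()
            if p <= limit:
                return k
            k += 1

    xs, x = [], x0
    for _ in range(n):
        survivors = sum(rng.random() < alpha for _ in range(x))       # binomial thinning
        offspring = sum(poisson(draw_theta(rng)) for _ in range(x))   # mixed Poisson reproduction
        x = survivors + offspring + poisson(lam)                      # + Poisson immigration
        xs.append(x)
    return xs

# Exponential mixing with mean 0.2, so alpha + E[theta] = 0.6 < 1
path = simulate_bmp_inar1(20000, alpha=0.4, lam=1.0,
                          draw_theta=lambda r: r.expovariate(1 / 0.2))
```

With α + E[θ] = 0.6, the sample mean of a long path should settle near λ/(1 − α − E[θ]) = 2.5.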
As we will see shortly, the stationarity condition for this model is simply \( \alpha + \mu_g < 1 \), where \( \mu_g \) is the first moment of the mixing density g. When it comes to interpretation, this model can be seen as the evolution of a population in which the binomial part counts the survivors from the previous period, the mixed Poisson part is the total offspring and the innovation part counts immigrants. Obviously, this model is a Markov chain and its transition probabilities can be found easily once we know the mixing density g. The probability mass function of the reproduction component, conditional on \( X_{t-1} = x \), is given by

\[ P\left( \sum_{i=1}^{x} U_i = z \right) = E\left[ e^{-(\theta_1 + \cdots + \theta_x)} \frac{(\theta_1 + \cdots + \theta_x)^z}{z!} \right], \tag{3} \]
where the expectation is taken over the i.i.d. mixing variables \( \theta_1, \ldots, \theta_x \sim g \). In order to evaluate the expectation explicitly, it is desirable that the random variables have an 'additivity' property, meaning that the density (or probability mass function) of the sum is either of the same form with different parameters or can be written in closed form. Many members of the exponential family have this property. In general, we let g be of an exponential family form such that

\[ g(\theta; \varphi) = h(\theta) \exp\{\varphi\theta - A(\varphi)\}. \tag{4} \]

Denote the density of the sum \( \theta_1 + \cdots + \theta_x \) by \( g_x^*(\cdot\,; \varphi) \), where the \( \theta_i \) are i.i.d. random variables with density g. The expectation above can then be expressed as

\[ P\left( \sum_{i=1}^{x} U_i = z \right) = \int_0^\infty \frac{e^{-s} s^z}{z!}\, g_x^*(s; \varphi)\, ds. \tag{5} \]
The density \( g_x^* \) is explicitly known in many cases; for example, g can be an Inverse Gaussian, Exponential, Gamma, Geometric, Bernoulli or Lindley density. For the sake of parsimony, we use distributions with a single parameter; in other words, we assume that φ is a scalar. Note that if we let g be a Dirac delta function concentrated at φ, the model reduces to the extended Poisson INAR(1) of [13].
3. Statistical properties of BMP INAR(1)
3.1. Moments and correlation structure
We first need to derive the moments of the mixed Poisson variable U.

Lemma 3.1

The first moment and second central moment of U with probability mass function (1) are given by

\[ E[U] = \mu_g, \qquad \mathrm{Var}(U) = \mu_g + \sigma_g^2, \tag{6} \]

where \( \mu_g = E[\theta] \) and \( \sigma_g^2 = \mathrm{Var}(\theta) \).
Proof.
By the conditional expectation argument, \( E[U] = E[E[U \mid \theta]] = E[\theta] = \mu_g \) and \( \mathrm{Var}(U) = E[\mathrm{Var}(U \mid \theta)] + \mathrm{Var}(E[U \mid \theta]) = E[\theta] + \mathrm{Var}(\theta) = \mu_g + \sigma_g^2 \).
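For the Exponential mixing density, the lemma can be confirmed directly from the closed-form Geometric pmf of U (a small numeric check; the truncation point 400 is an arbitrary choice):

```python
# Closed-form check of the moment identities for Exponential mixing with mean phi:
# U is then Geometric with P(U = u) = phi^u / (1 + phi)^{u+1},
# and the mixing moments are mu_g = phi, sigma_g^2 = phi^2.
phi = 0.7
pmf = [phi ** u / (1 + phi) ** (u + 1) for u in range(400)]
mean = sum(u * p for u, p in enumerate(pmf))
second = sum(u * u * p for u, p in enumerate(pmf))
var = second - mean ** 2
assert abs(sum(pmf) - 1.0) < 1e-9          # truncated pmf still sums to ~1
assert abs(mean - phi) < 1e-9              # E[U] = mu_g
assert abs(var - (phi + phi ** 2)) < 1e-9  # Var(U) = mu_g + sigma_g^2
```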
Proposition 3.1
Assume \( \alpha + \mu_g < 1 \). The stationary moments of \( X_t \) are given by

\[ E[X_t] = \frac{\lambda}{1 - \alpha - \mu_g}, \qquad \mathrm{Var}(X_t) = \frac{1 - \alpha^2 + \sigma_g^2}{1 - (\alpha + \mu_g)^2}\, E[X_t], \tag{7} \]

with \( \mu_g \) and \( \sigma_g^2 \) as in Lemma 3.1.
Proof.
For the first moment, we have \( E[X_t] = E[E[X_t \mid X_{t-1}]] = (\alpha + \mu_g)E[X_{t-1}] + \lambda \), which at stationarity gives the stated expression.

Since the thinning operator ∘ and the reproduction operator are independent of each other, for the second central moment we have \( \mathrm{Var}(X_t) = (\alpha + \mu_g)^2 \mathrm{Var}(X_{t-1}) + E[X_{t-1}]\big(\alpha(1-\alpha) + \mu_g + \sigma_g^2\big) + \lambda \), and solving the stationary equation yields (7).

Let \( \mathcal{F}_t \) be the σ-algebra generated by the model up to time t; the covariance of the model is given by \( \mathrm{Cov}(X_t, X_{t-1}) = \mathrm{Cov}\big(E[X_t \mid \mathcal{F}_{t-1}], X_{t-1}\big) = (\alpha + \mu_g)\,\mathrm{Var}(X_{t-1}) \).

Again, by using conditional expectations, we have \( \mathrm{Cov}(X_t, X_{t-k}) = (\alpha + \mu_g)^k\, \mathrm{Var}(X_{t-k}) \) for \( k \geq 1 \).

Obviously, the autocorrelation function is \( \rho(k) = (\alpha + \mu_g)^k \).
From the results above, it is clear that this model has the same correlation structure as the standard AR(1) model. Furthermore, unlike the equidispersed Poisson INAR(1), the BMP INAR(1) is in general an overdispersed model with Fisher index of dispersion

\[ FI = \frac{\mathrm{Var}(X_t)}{E[X_t]} = \frac{1 - \alpha^2 + \sigma_g^2}{1 - (\alpha + \mu_g)^2} \geq 1. \tag{8} \]
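The algebra behind the stationary mean and Fisher index can be verified numerically by iterating the one-step moment recursions to their fixed point. This sketch assumes the conditional moments E[X_t | X_{t-1}] = (α + μ_g)X_{t-1} + λ and Var(X_t | X_{t-1}) = (α(1−α) + μ_g + σ_g²)X_{t-1} + λ implied by the construction; parameter values are illustrative:

```python
alpha, lam, mu_g, sig2_g = 0.3, 1.0, 0.2, 0.05

mean, var = 0.0, 0.0
for _ in range(1000):                 # iterate the E and Var recursions to the fixed point
    mean, var = ((alpha + mu_g) * mean + lam,
                 (alpha + mu_g) ** 2 * var
                 + mean * (alpha * (1 - alpha) + mu_g + sig2_g) + lam)

mean_star = lam / (1 - alpha - mu_g)                              # stationary mean
fi_star = (1 - alpha ** 2 + sig2_g) / (1 - (alpha + mu_g) ** 2)   # Fisher index
assert abs(mean - mean_star) < 1e-10
assert abs(var / mean - fi_star) < 1e-10
assert fi_star > 1                    # overdispersed relative to the Poisson INAR(1)
```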
3.2. Existence of stationary solution
Proposition 3.2
Given that and the following infinite sequence:
(9) where is the probability generating function (p.g.f) of has a limit
Proof.
Define the increment of the sequence
By the definition of p.g.f, , the monotonicity of this function is shown by its first and second derivatives
By the definition of p.g.f, and . So , which implies that it is a non-decreasing function. Then we have

Notice that . Hence we can conclude that Q is a monotonically decreasing function ranging from 0 to . In other words, for any and , the sequence is increasing with respect to i. Finally, .
Proposition 3.3
Let be the BMP INAR(1) model defined in Definition 2.1. If the conditions and hold, then the process has a proper stationary distribution and is an ergodic Markov chain. The stationary distribution is .
Proof.
Denote the p.g.f. of \( X_t \) and that of the innovation by and , respectively; then the former can be expressed as the following product:

To show the existence of the limiting distribution, it is equivalent to show that the limit of this product as n goes to infinity is nonzero, which means that we have to show that the series

is convergent as . The convergence of the infinite series can be established by the ratio test

(10) Hence , from which we can infer that exists and that the limiting distribution of \( X_t \) exists. Furthermore, by construction, the chain is defined on a countable state space. The positivity of the transition probabilities implies that the chain is irreducible and aperiodic. Hence the limiting distribution is the unique stationary distribution.

In general, as long as , so we just need to ensure the existence of the first moment to achieve stationarity of the process. The infinite product is the p.g.f. of the stationary distribution, which also satisfies

\[ P_X(z) = e^{\lambda(z-1)}\, P_X\big((1 - \alpha + \alpha z)\, g_U(z)\big), \tag{11} \]

where \( g_U \) denotes the p.g.f. of U.
4. Distribution function of the mixed Poisson component
In order to apply maximum likelihood estimation for the statistical inference of this model, we need to derive the distribution of the mixed Poisson sum component for different mixing densities g. As mentioned before, we focus on densities g from the exponential family. For expository purposes, we derive this distribution for the Exponential and Lindley densities.
4.1. Mixed by exponential density
If θ is Exponentially distributed with mean φ, then the distribution of U is given by

\[ P(U = u) = \frac{\varphi^u}{(1 + \varphi)^{u+1}}, \qquad u = 0, 1, 2, \ldots, \tag{12} \]

which is a geometric distribution with parameter \( 1/(1 + \varphi) \). Then the distribution of the sum \( \sum_{i=1}^{x} U_i \), a negative binomial, together with its first and second derivatives with respect to φ, is given by

\[ P\left( \sum_{i=1}^{x} U_i = z \right) = \binom{x + z - 1}{z} \frac{\varphi^z}{(1 + \varphi)^{x+z}}, \]
\[ \frac{\partial P}{\partial \varphi} = P \left( \frac{z}{\varphi} - \frac{x + z}{1 + \varphi} \right), \qquad \frac{\partial^2 P}{\partial \varphi^2} = P \left[ \left( \frac{z}{\varphi} - \frac{x + z}{1 + \varphi} \right)^2 - \frac{z}{\varphi^2} + \frac{x + z}{(1 + \varphi)^2} \right]. \tag{13} \]
Note that the model reduces to the NGINAR(1) of [10] if we further let . In general, the stationarity condition becomes \( \alpha + \varphi < 1 \), and the probability generating function of \( X_t \) satisfies the equation

\[ P_X(z) = e^{\lambda(z-1)}\, P_X\!\left( \frac{1 - \alpha + \alpha z}{1 + \varphi - \varphi z} \right). \tag{14} \]
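The closed negative binomial form of the sum can be verified against a brute-force x-fold convolution of the Geometric pmf (a small numeric sketch with illustrative values of φ and x):

```python
from math import comb

phi, x = 0.8, 3   # Exponential mixing with mean phi; x geometric summands

def negbin_pmf(z):
    # closed form for P(sum of x U_i = z): negative binomial
    return comb(x + z - 1, z) * phi ** z / (1 + phi) ** (x + z)

# cross-check against the x-fold convolution of the Geometric pmf phi^u/(1+phi)^{u+1}
N = 60
geo = [phi ** u / (1 + phi) ** (u + 1) for u in range(N)]
conv = [1.0] + [0.0] * (N - 1)            # pmf of the empty sum: point mass at 0
for _ in range(x):
    conv = [sum(conv[k] * geo[z - k] for k in range(z + 1)) for z in range(N)]

for z in range(20):
    assert abs(conv[z] - negbin_pmf(z)) < 1e-12
```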
We now relax the assumption that the innovation term is Poisson and let the marginal distribution of X be a geometric random variable with parameter . Using the p.g.f. relationship, we can infer the required distribution of Z.
Proposition 4.1
If or and the distribution of follows a mixed geometric distribution such that
(15) then the marginal distribution of X follows a distribution.
Proof.
By utilizing equation (14), we assume that X has a geometric distribution such that . Then the probability generating function of Z has the following form:
(16)
4.2. Mixed by Lindley density
Suppose now that the mixing density g is a Lindley density function. The distribution of U is then the so-called Poisson–Lindley distribution (see [6]), which has the following probability mass function:

\[ P(U = u) = \frac{\varphi^2 (u + \varphi + 2)}{(\varphi + 1)^{u+3}}, \qquad u = 0, 1, 2, \ldots \tag{17} \]
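The Poisson–Lindley pmf can be confirmed by numerically integrating the Poisson kernel against the Lindley density; the integration grid below is an arbitrary choice:

```python
import math

phi = 1.2
# Lindley density: g(t) = phi^2/(phi+1) * (1+t) * exp(-phi*t)
lindley = lambda t: phi ** 2 / (phi + 1) * (1 + t) * math.exp(-phi * t)

def poisson_lindley_numeric(u, upper=60.0, steps=60000):
    """Midpoint-rule evaluation of int_0^inf e^{-t} t^u / u! * g(t) dt."""
    h = upper / steps
    return h * sum(math.exp(-t) * t ** u / math.factorial(u) * lindley(t)
                   for t in ((i + 0.5) * h for i in range(steps)))

for u in range(6):
    closed = phi ** 2 * (u + phi + 2) / (phi + 1) ** (u + 3)
    assert abs(poisson_lindley_numeric(u) - closed) < 1e-4
```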
Under this parameter setting, \( \mu_g = E[\theta] = (\varphi + 2)/(\varphi(\varphi + 1)) \), which makes the parameter φ less interpretable. We therefore adopt the following parameter setting for the mixing density:
| (18) |
Then, . On the other hand, the additivity of the Lindley distribution is not as clear. In order to evaluate the expectation (3), we need to find the distribution of the sum \( \theta_1 + \cdots + \theta_x \).
Proposition 4.2
Suppose \( \theta_1, \ldots, \theta_x \) are i.i.d. Lindley distributed. The density of their sum is given by
(19)
Proof.
We can prove this by inverting the Laplace transform. The Laplace transform of the Lindley density is

Then the Laplace transform of the sum is simply the product of the individual transforms, which is
Using a binomial expansion, we have
Obviously, the density function of is the integrand except .
Then, the distribution of is given by the following proposition.
Proposition 4.3
The probability mass function of the sum component, together with its first and second derivatives, is given by
(20) where
Proof.
5. Maximum likelihood estimation and its asymptotic property
In general, the transition probability can be written explicitly as

\[ P(X_t = j \mid X_{t-1} = i) = \sum_{m=0}^{\min(i,j)} \binom{i}{m} \alpha^m (1-\alpha)^{i-m} \sum_{z=0}^{j-m} P\left( \sum_{k=1}^{i} U_k = z \right) \frac{e^{-\lambda}\lambda^{j-m-z}}{(j-m-z)!}. \tag{21} \]

The log-likelihood function is simply \( \ell(\eta) = \sum_{t=2}^{n} \log P(X_t \mid X_{t-1}; \eta) \).
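For the Exponential mixing case, this transition probability is a finite convolution of Binomial survivors, negative binomial offspring and Poisson immigrants. The following sketch evaluates it and the resulting log-likelihood; parameter values are illustrative and the function names are our own:

```python
import math
from math import comb, exp

alpha, phi, lam = 0.4, 0.3, 1.0   # illustrative parameter values

def binom_pmf(m, n, p):
    return comb(n, m) * p ** m * (1 - p) ** (n - m)

def offspring_pmf(z, x):
    # pmf of the sum of x U_i under Exponential mixing: negative binomial (empty sum = 0)
    if x == 0:
        return 1.0 if z == 0 else 0.0
    return comb(x + z - 1, z) * phi ** z / (1 + phi) ** (x + z)

def pois_pmf(k, mu):
    return exp(-mu) * mu ** k / math.factorial(k)

def trans_prob(j, i):
    # convolution over survivors (m), offspring (z) and immigrants (j - m - z)
    return sum(binom_pmf(m, i, alpha) * offspring_pmf(z, i) * pois_pmf(j - m - z, lam)
               for m in range(min(i, j) + 1)
               for z in range(j - m + 1))

def loglik(xs):
    # conditional log-likelihood given the first observation
    return sum(math.log(trans_prob(xs[t], xs[t - 1])) for t in range(1, len(xs)))

# each row of the transition matrix must sum to one
assert abs(sum(trans_prob(j, 2) for j in range(80)) - 1.0) < 1e-8
```

In practice, `loglik` would be handed to a numerical optimizer over (α, φ, λ) to obtain the MLE, as done with `optim` in the paper's R analysis.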
Proposition 5.1
Suppose we have a random sample \( X_1, \ldots, X_n \). Let \( \eta = (\alpha, \varphi, \lambda)' \) denote the parameter vector of the stationary BMP INAR(1) model. The maximum likelihood estimator \( \hat{\eta}_n \) has the following asymptotic distribution:

\[ \sqrt{n}\,(\hat{\eta}_n - \eta) \xrightarrow{d} N\big(0,\, I^{-1}(\eta)\big), \tag{22} \]

where
(23)
(24) where . The first and second derivatives of each distribution function are given by
Proof.
From Proposition 3.3, we know that the process is stationary and ergodic and that its stationary distribution is characterized by the p.g.f. in (11). Hence the score functions and the information matrix are also stationary and ergodic, and the proof of asymptotic normality is similar to the proof of Theorem 4 in Appendix A of [3].
The expectation of the information matrix can be calculated numerically by finding the unconditional distribution and the joint distribution of consecutive observations. However, this would be computationally intensive when the sample size n is large. In practice, since the process is stationary and ergodic, the expected information can be approximated by the observed information when n is large.
To verify the asymptotic normality of the maximum likelihood estimators, we conduct a Monte Carlo experiment based on 2000 replications. For each replication, a BMP INAR(1) time series with the chosen mixing density, either Exponential or Lindley, of size \( n \in \{100, 200, 300, 400, 500\} \) is generated. The parameters are set to the same values for both mixing densities and are estimated via maximum likelihood. The biases and standard errors of the estimated parameters are shown in Tables 1 and 2. We observe that the biases of the estimators are either reasonably small or decreasing with the sample size n, and it is clear that the standard errors also decrease with n. Finally, in order to inspect the distribution of the estimators graphically, normal quantile-quantile plots are provided in Figure 1.
Table 1.
The bias of Maximum likelihood estimators of BMP-INAR(1) model with respect to different sample size n.
| Mixing density | Parameter | n = 100 | n = 200 | n = 300 | n = 400 | n = 500 |
|---|---|---|---|---|---|---|
| Exponential | α | 0.0022 | −0.0021 | 0.0019 | −0.0003 | −0.0003 |
| | ϕ | −0.0284 | −0.0104 | −0.0110 | −0.0072 | −0.0059 |
| | λ | 0.1089 | 0.0526 | 0.0384 | 0.0366 | 0.0279 |
| Lindley | α | −0.0008 | 0.0004 | −0.0015 | −0.0020 | −0.0011 |
| | ϕ | −0.0209 | −0.0143 | −0.0085 | −0.0050 | −0.0039 |
| | λ | 0.0387 | 0.0227 | 0.0141 | 0.0144 | 0.0101 |
Table 2.
The standard error of Maximum likelihood estimators of BMP-INAR(1) model with respect to different sample size n.
| Mixing density | Parameter | n = 100 | n = 200 | n = 300 | n = 400 | n = 500 |
|---|---|---|---|---|---|---|
| Exponential | α | 0.1303 | 0.0965 | 0.0752 | 0.0663 | 0.0576 |
| | ϕ | 0.1384 | 0.0970 | 0.0783 | 0.0670 | 0.0581 |
| | λ | 0.3982 | 0.2858 | 0.2276 | 0.2012 | 0.1764 |
| Lindley | α | 0.1319 | 0.0991 | 0.0854 | 0.0711 | 0.0630 |
| | ϕ | 0.1432 | 0.1054 | 0.0880 | 0.0729 | 0.0661 |
| | λ | 0.2050 | 0.1515 | 0.1166 | 0.0999 | 0.0911 |
Figure 1.
Quantile–Quantile plots for maximum likelihood estimators of BMP-INAR(1) model. The left panel shows plots for the Exponential mixing density, while the right panel depicts the plots for the Lindley mixing density.
6. Real data example: iceberg order data
The iceberg order counts concern the Deutsche Telekom shares traded in the XETRA system of Deutsche Börse; the concrete time series gives the number of iceberg orders (for the ask side) per 20 minutes over 32 consecutive trading days in the first quarter of 2004. The special feature of iceberg orders is that only a small part of the order (the tip of the iceberg) is visible in the order book, while the main part of the order is hidden. For a detailed description, see [4,5]. This dataset is also analysed in [13], where the extended Poisson INAR(1) is applied to fit the data.
A table of descriptive statistics, a time series plot, and the ACF and PACF plots are shown in Table 3 and Figure 2. The variance of the iceberg counts is higher than their mean, which indicates that the data are overdispersed; the level of dispersion is described by the Fisher index of dispersion FI = 1.552. Evidence for the applicability of a first-order autoregressive model is provided by the empirical ACF and PACF, which show a clear geometric decay of the ACF and a cut-off after lag 1 for the PACF.
Table 3.
Descriptive statistics of iceberg count.
| Minimum | Maximum | Median | Mean | Variance | FI |
|---|---|---|---|---|---|
| 0 | 9 | 1 | 1.407 | 2.184 | 1.552 |
Figure 2.
Time series plot of the iceberg data and its empirical ACF and PACF plots. The dashed lines are 95% confidence bands obtained by assuming the series is a white noise process.
The likelihood function is constructed as in (21) for the different mixing densities (Exponential or Lindley). It is maximized using 'optim' in R with 'method = BFGS' (a quasi-Newton method), while the standard deviations of the MLEs are calculated by inverting the negative observed information matrix of Proposition 5.1 evaluated at the MLEs. To assess the goodness of fit, we use the information criteria AIC and BIC as well as the (standardized) Pearson residuals. If the model is correctly specified, the Pearson residuals for the BMP INAR(1) are expected to have mean and variance close to 0 and 1, respectively, with no significant autocorrelation. The Pearson residuals are calculated by the following formula:
\[ e_t = \frac{x_t - E[X_t \mid X_{t-1} = x_{t-1}]}{\sqrt{\mathrm{Var}(X_t \mid X_{t-1} = x_{t-1})}} = \frac{x_t - (\alpha + \mu_g)\,x_{t-1} - \lambda}{\sqrt{\big(\alpha(1-\alpha) + \mu_g + \sigma_g^2\big)\,x_{t-1} + \lambda}}, \tag{25} \]

where \( x_t \) denotes the observed value at time t and \( \mu_g, \sigma_g^2 \) are the mean and variance of the mixing density.
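As a quick sanity check of (25), the sketch below simulates a long path of the recursion X_t = α∘X_{t-1} + Σ U_i + ε_t with Exponential mixing and standardizes it with the conditional mean (α + μ_g)x + λ and variance (α(1−α) + μ_g + σ_g²)x + λ; parameter values and names are illustrative:

```python
import math, random

alpha, lam, phi = 0.4, 1.0, 0.2      # Exponential mixing: mu_g = phi, sigma_g^2 = phi^2
mu_g, s2_g = phi, phi ** 2
rng = random.Random(7)

def poisson(mu):                      # Knuth's Poisson sampler
    limit, k, p = math.exp(-mu), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

xs, x = [], 2
for _ in range(30000):                # simulate the BMP INAR(1) recursion
    x = (sum(rng.random() < alpha for _ in range(x))
         + sum(poisson(rng.expovariate(1 / phi)) for _ in range(x))
         + poisson(lam))
    xs.append(x)

# standardized Pearson residuals from the conditional moments
res = [(xs[t] - (alpha + mu_g) * xs[t - 1] - lam)
       / math.sqrt((alpha * (1 - alpha) + mu_g + s2_g) * xs[t - 1] + lam)
       for t in range(1, len(xs))]
rm = sum(res) / len(res)
rv = sum(r * r for r in res) / len(res) - rm ** 2
assert abs(rm) < 0.05 and abs(rv - 1.0) < 0.1   # mean ~ 0, variance ~ 1
```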
The ACF plots of the Pearson residuals in Figure 3 indicate that the BMP INAR(1) models are appropriate for fitting the iceberg data. The estimated parameters shown in Table 4 are significantly different from 0, as indicated by their estimated standard deviations. Compared to the Dirac delta case, which is in fact the extended Poisson INAR(1) of [13], the other two cases do show some improvement, with smaller AIC and BIC values and a larger fitted Fisher index of dispersion, which is, however, still slightly smaller than the empirical FI = 1.552. On the other hand, there seems to be little difference between the Exponential and Lindley cases, as they have very similar AIC and BIC values; this is due to the fact that the fitted value of \( \mu_g \) is identical for both densities. Finally, it should be noted that the variance of the Pearson residuals is visibly larger than 1. As mentioned previously, the Exponential and Lindley mixing densities were considered for expository purposes; since the proposed family of BMP INAR(1) models is quite general, another mixing distribution could potentially capture the observed dispersion structure of this data more efficiently.
Figure 3.
Autocorrelation of Standardized Pearson residuals for three different mixing densities.
Table 4.
The results for the BMP INAR(1) model mixed by different density functions.
| Mixing density | \( \hat{\alpha} \) | \( \hat{\varphi} \) | \( \hat{\lambda} \) | AIC | BIC | Residual mean | Residual variance | FI |
|---|---|---|---|---|---|---|---|---|
| Dirac delta | 0.410 (0.058) | 0.188 (0.059) | 0.567 (0.040) | 2212 | 2226 | −0.001 | 1.159 | 1.295 |
| Exponential | 0.434 (0.044) | 0.167 (0.044) | 0.563 (0.040) | 2208 | 2222 | −0.002 | 1.154 | 1.315 |
| Lindley | 0.434 (0.043) | 0.167 (0.043) | 0.563 (0.040) | 2208 | 2222 | −0.002 | 1.154 | 1.314 |

Note: The results of the Dirac delta case are from Table 2 of [13]. Estimated standard deviations for all models are given in brackets; "Residual mean" and "Residual variance" refer to the Pearson residuals.
Overall, the mixed Poisson component in the BMP INAR(1) model efficiently captures the overdispersion in this type of financial data.
7. Concluding remarks
The BMP INAR(1) is an extension of the classical Poisson INAR(1) model obtained by adding a mixed Poisson component, and hence it can capture the level of overdispersion present in the data. The exponential family is a desirable choice for the mixing density due to its 'additivity' property. The choice of the mixing density can control the dispersion level to some extent, although the BMP INAR(1) is in general always overdispersed. Furthermore, due to its simple structure, the process is a Markov chain and maximum likelihood estimation can be applied easily. The real data analysis shows that the BMP INAR(1) is a potential choice for modelling financial count data that exhibit a standard AR(1) structure and overdispersion.
Acknowledgments
The authors would like to thank the anonymous referee for their very helpful comments and suggestions, which have significantly improved this article, and Prof. Christian Weiß for kindly sharing the financial count data.
Disclosure statement
No potential conflict of interest was reported by the author(s).
References
- 1. Al-Osh M. and Alzaid A.A., First-order integer-valued autoregressive (INAR(1)) process, J. Time Ser. Anal. 8 (1987), pp. 261–275.
- 2. Bourguignon M., Rodrigues J., and Santos-Neto M., Extended Poisson INAR(1) processes with equidispersion, underdispersion and overdispersion, J. Appl. Stat. 46 (2019), pp. 101–118.
- 3. Bu R., McCabe B., and Hadri K., Maximum likelihood estimation of higher-order integer-valued autoregressive processes, J. Time Ser. Anal. 29 (2008), pp. 973–994.
- 4. Frey S. and Sandås P., The impact of iceberg orders in limit order books, AFA 2009 San Francisco Meetings Paper, 2009.
- 5. Jung R.C. and Tremayne A., Useful models for time series of counts or simply wrong ones?, Adv. Stat. Anal. 95 (2011), pp. 59–91.
- 6. Karlis D., EM algorithm for mixed Poisson and other discrete distributions, ASTIN Bull. 35 (2005), pp. 3–24.
- 7. Kirchner M., Hawkes and INAR(∞) processes, Stoch. Process. Appl. 126 (2016), pp. 2494–2525.
- 8. McKenzie E., Some simple models for discrete variate time series, J. Am. Water Resour. Assoc. 21 (1985), pp. 645–650.
- 9. McKenzie E., Autoregressive moving-average processes with negative-binomial and geometric marginal distributions, Adv. Appl. Probab. 18 (1986), pp. 679–705.
- 10. Ristić M.M., Bakouch H.S., and Nastić A.S., A new geometric first-order integer-valued autoregressive (NGINAR(1)) process, J. Stat. Plan. Inference 139 (2009), pp. 2218–2226.
- 11. Scotto M.G., Weiß C.H., and Gouveia S., Thinning-based models in the analysis of integer-valued time series: A review, Stat. Model. 15 (2015), pp. 590–618.
- 12. Weiß C.H., Thinning operations for modeling time series of counts – a survey, Adv. Stat. Anal. 92 (2008), pp. 319–341.
- 13. Weiß C.H., A Poisson INAR(1) model with serially dependent innovations, Metrika 78 (2015), pp. 829–851.
- 14. Zheng H., Basawa I.V., and Datta S., First-order random coefficient integer-valued autoregressive processes, J. Stat. Plan. Inference 137 (2007), pp. 212–229.