Journal of Applied Statistics. 2023 Dec 22;51(12):2402–2419. doi: 10.1080/02664763.2023.2297155

Diagnostic checks in time series models based on a new correlation coefficient of residuals

Jian Pei, Fukang Zhu (corresponding author), Qi Li
PMCID: PMC11389639  PMID: 39267714

Abstract

The Ljung–Box, Li–Mak and Zhu–Wang statistics play an important role in checking time series models. All of them use Pearson's correlation coefficient to implement (squared) residual (partial) autocorrelation tests. In this paper, we replace Pearson's correlation coefficient with a new rank correlation coefficient and propose a new test statistic for diagnostic checks of residuals in autoregressive moving average models, autoregressive conditional heteroscedasticity models and integer-valued time series models. We conduct simulations to assess the performance of the new test statistic and compare it with existing ones; the results show the superiority of the proposed statistic. Three real examples illustrate its usefulness.

Keywords: ARCH, ARMA, bootstrap, INARCH, Pearson's correlation coefficient, rank correlation coefficient

1. Introduction

Given time series observations, a traditional modeling approach is to fit an appropriate autoregressive moving average (ARMA) model, whose conditional variance is homogeneous over time. Box and Pierce [3] proposed a statistic for diagnostic checks in ARMA models and showed that the standard errors of the usual residual autocorrelations could be much less than $1/\sqrt{n}$ for ARMA models. Ljung and Box [14] made a simple modification to this statistic that improved the approximation, and the resulting Ljung–Box statistic plays an important role in checking ARMA models.

Time series models with a changing conditional variance constitute an important class of non-linear time series models. The autoregressive conditional heteroscedasticity (ARCH) model proposed in [8] is one of the earliest models of this kind. Weiss [16] proposed several tests for the adequacy of an ARCH specification, and similar tests for the presence of ARCH in the error term were suggested in [9,15]. Li and Mak [13] proposed a portmanteau statistic based on the large-sample distribution of the squared standardized residual autocorrelations; its importance in modeling nonlinear time series with conditional heteroscedasticity is similar to that of the Ljung–Box statistic in ARMA models.

In recent years, there has been considerable interest in integer-valued time series models because they are widely used in epidemiology, finance, disease modeling and environmental science, and many models for time series of counts have been developed in the literature. Ferland et al. [10] proposed Poisson integer-valued generalized ARCH (INGARCH) models to study the number of cases of campylobacterosis infections. Weiß [18] discussed the integer-valued ARCH (INARCH) models and illustrated that they are closely related to the standard autoregressive models. Zhu and Wang [22] investigated the maximum likelihood estimator (MLE) and the weighted least squares estimator for the parameters of the INARCH model. Furthermore, five portmanteau test statistics were proposed in [21] to check the adequacy of a fitted INARCH specification.

All of the above (squared) residual (partial) autocorrelation tests are built on Pearson's correlation coefficient. It is well known that Pearson's correlation coefficient, Spearman's ρ and Kendall's τ are the three most popular classical measures of statistical association. These coefficients are powerful for testing linear or monotone associations, and they have well-established asymptotic theories for calculating p-values. However, they are not effective for detecting associations that are not monotonic, even in the complete absence of noise. There have been many proposals to address this deficiency of the classical coefficients (see [12]), such as the maximal correlation coefficient (see [4]) and numerous coefficients based on joint cumulative distribution functions and ranks (see [1,2,11,19]). However, most of these coefficients do not possess simple asymptotic theories under the hypothesis of independence.

Chatterjee [5] proposed a new rank correlation coefficient that has attracted much attention because of several advantages. Its formula is as simple as those of the classical coefficients (Pearson's correlation coefficient, Spearman's ρ and Kendall's τ), so it is easy to understand conceptually and fast to compute, both in theory and in practice, whereas many competing correlation coefficients become hundreds of times slower to compute as the sample size increases. Moreover, because it is a function of ranks, it is robust to outliers and invariant under monotone transformations of the data. Its asymptotic theory under the hypothesis of independence is also very simple and roughly valid even when the sample size is only 20. Furthermore, the test of independence based on it is consistent compared with other alternatives.

Our idea is to conduct diagnostic checks in time series models based on a new correlation coefficient of residuals: we replace Pearson's correlation coefficient with Chatterjee's new correlation coefficient and propose a new test statistic for diagnostic checks. The rest of this paper is organized as follows. Section 2 introduces a new test statistic based on Chatterjee's rank correlation coefficient. Section 3 illustrates the application of the new test statistic to three kinds of time series models. Section 4 reports several simulation studies that illustrate the superiority of the new statistic. Section 5 uses three real examples to exhibit the application. Section 6 concludes.

2. A new test statistic based on Chatterjee's rank correlation coefficient

Pearson's correlation coefficient is frequently used for diagnostic checks in time series models. A natural question is what happens if it is replaced by another correlation coefficient.

We consider Chatterjee's new correlation coefficient $\xi_n$, which can be used to test the independence of two real-valued random variables $(X,Y)$. Let $(X_1,Y_1),\ldots,(X_n,Y_n)$ be a sample composed of $n$ independent copies of $(X,Y)$, and rearrange the data as $(X_{[1]},Y_{[1]}),\ldots,(X_{[n]},Y_{[n]})$ such that $X_{[1]}\le\cdots\le X_{[n]}$. If there are ties among the $X_i$'s, choose an increasing rearrangement by breaking ties uniformly at random. Note that this rearrangement orders the $X_{[i]}$'s only; the $Y_{[i]}$'s are in general not ordered. Let $r_i$ be the rank of $Y_{[i]}$, i.e. the number of $j$ such that $Y_{[j]}\le Y_{[i]}$, and define $l_i$ to be the number of $j$ such that $Y_{[j]}\ge Y_{[i]}$.

Definition 2.1

Chatterjee's correlation coefficient is defined as

$$\xi_n(X,Y)=1-\frac{n\sum_{i=1}^{n-1}|r_{i+1}-r_i|}{2\sum_{i=1}^{n}l_i(n-l_i)}.$$

If there are no ties among the $Y_i$'s, it holds that

$$\xi_n(X,Y)=1-\frac{3\sum_{i=1}^{n-1}|r_{i+1}-r_i|}{n^2-1}.$$

For large samples, based on Theorem 2.1 in [5], the independence test can be conducted by the asymptotic null distribution of the correlation coefficient, which is as follows.

Lemma 2.1

Suppose that $X$ and $Y$ are independent and $Y$ is continuous. Then $\sqrt{n}\,\xi_n(X,Y)\to N(0,2/5)$ in distribution as $n\to\infty$.
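The definition and the lemma can be illustrated with a minimal Python sketch. It uses only the standard library and a quadratic-time rank count for readability. The function names `chatterjee_xi` and `xi_independence_pvalue` are hypothetical, and the one-sided (upper-tail) rejection rule is an assumption made here for illustration.

```python
import math
import random
from statistics import NormalDist


def chatterjee_xi(x, y, rng=None):
    """Chatterjee's xi_n(X, Y), general (ties allowed) form of Definition 2.1."""
    n = len(x)
    rng = rng or random.Random(0)
    # Rearrange the pairs so that x is increasing; ties in x are broken at random.
    order = sorted(range(n), key=lambda i: (x[i], rng.random()))
    y_sorted = [y[i] for i in order]
    # r_i = #{j : Y_[j] <= Y_[i]} and l_i = #{j : Y_[j] >= Y_[i]} (O(n^2) for clarity).
    r = [sum(1 for v in y_sorted if v <= yi) for yi in y_sorted]
    l = [sum(1 for v in y_sorted if v >= yi) for yi in y_sorted]
    numerator = n * sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    denominator = 2 * sum(li * (n - li) for li in l)
    return 1.0 - numerator / denominator


def xi_independence_pvalue(x, y):
    """Upper-tail p-value from the N(0, 2/5) limit of sqrt(n) * xi_n (assumed one-sided)."""
    z = math.sqrt(len(x)) * chatterjee_xi(x, y) / math.sqrt(2 / 5)
    return 1.0 - NormalDist().cdf(z)


x = [1, 2, 3, 4, 5]
print(chatterjee_xi(x, [v * v for v in x]))  # 0.5, i.e. (n - 2) / (n + 1) for n = 5
```

The denominator is zero when all $Y_i$ are equal, so a production version would need to guard against constant input.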

Note that Chatterjee's correlation coefficient $\xi_n$ is used to test the independence between two random variables. In the context of diagnostic checks in time series models, well-fitted models should ideally yield residuals approximating 0 and exhibiting no correlation. Thus, we extend Chatterjee's correlation coefficient to time series models for the purpose of conducting correlation tests on residuals. More precisely, let $[x]$ denote the largest integer that does not exceed $x$, and we have:

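(1) Correlation tests on residuals $\{\hat\varepsilon_{t-1}\}$ and $\{\hat\varepsilon_t\}$, $t=2,3,\ldots,n$. We let the random variables $(X,Y)=(\hat\varepsilon_{t-1},\hat\varepsilon_t)$, which gives $[n/2]$ samples: $(\hat\varepsilon_1,\hat\varepsilon_2),(\hat\varepsilon_3,\hat\varepsilon_4),\ldots,(\hat\varepsilon_{2[n/2]-1},\hat\varepsilon_{2[n/2]})$. We denote the sample size $[n/2]$ as $N_1$ and the rank correlation coefficient of $\{\hat\varepsilon_{t-1}\}$ and $\{\hat\varepsilon_t\}$ as $\xi_{N_1}$.

(2) Correlation tests on residuals $\{\hat\varepsilon_{t-2}\}$ and $\{\hat\varepsilon_t\}$, $t=3,4,\ldots,n$. We let the random variables $(X,Y)=(\hat\varepsilon_{t-2},\hat\varepsilon_t)$, which gives $2[n/4]$ samples: $(\hat\varepsilon_1,\hat\varepsilon_3),(\hat\varepsilon_2,\hat\varepsilon_4),\ldots,(\hat\varepsilon_{4[n/4]-3},\hat\varepsilon_{4[n/4]-1}),(\hat\varepsilon_{4[n/4]-2},\hat\varepsilon_{4[n/4]})$. We denote the sample size $2[n/4]$ as $N_2$ and the rank correlation coefficient of $\{\hat\varepsilon_{t-2}\}$ and $\{\hat\varepsilon_t\}$ as $\xi_{N_2}$.

 ……

(m) Correlation tests on residuals $\{\hat\varepsilon_{t-m}\}$ and $\{\hat\varepsilon_t\}$, $t=m+1,m+2,\ldots,n$. We let the random variables $(X,Y)=(\hat\varepsilon_{t-m},\hat\varepsilon_t)$, which gives $m[n/(2m)]$ samples: $(\hat\varepsilon_1,\hat\varepsilon_{m+1}),\ldots,(\hat\varepsilon_m,\hat\varepsilon_{2m}),\ldots,(\hat\varepsilon_{2m[n/(2m)]-2m+1},\hat\varepsilon_{2m[n/(2m)]-m+1}),\ldots,(\hat\varepsilon_{2m[n/(2m)]-m},\hat\varepsilon_{2m[n/(2m)]})$. We denote the sample size $m[n/(2m)]$ as $N_m$ and the rank correlation coefficient of $\{\hat\varepsilon_{t-m}\}$ and $\{\hat\varepsilon_t\}$ as $\xi_{N_m}$.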

These steps are carried out for each lag $k=1,\ldots,m$, with $m$ being a positive integer. The portmanteau test statistic is given as

$$Q(\xi_n)=\frac{1}{m}\sum_{k=1}^{m}\sqrt{N_k}\,\xi_{N_k}. \tag{1}$$
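The pairing scheme in steps (1) to (m) and the averaging in Equation (1) can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code. The names `lag_pairs` and `q_statistic` are hypothetical, ties among the $X$ values are not randomised (residuals are assumed continuous), and the weight $\sqrt{N_k}$ on each $\xi_{N_k}$ is an assumption made by analogy with the $\sqrt{n}$ scaling in Lemma 2.1.

```python
import math
import random


def chatterjee_xi(x, y):
    """Chatterjee's xi_n; ties in x keep input order (continuous data assumed)."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    ys = [y[i] for i in order]
    r = [sum(v <= yi for v in ys) for yi in ys]
    l = [sum(v >= yi for v in ys) for yi in ys]
    num = n * sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    return 1.0 - num / (2 * sum(li * (n - li) for li in l))


def lag_pairs(resid, k):
    """Non-overlapping pairs (e_{t-k}, e_t): each block of 2k residuals gives k pairs."""
    xs, ys = [], []
    for b in range(len(resid) // (2 * k)):
        start = 2 * k * b
        for j in range(k):
            xs.append(resid[start + j])
            ys.append(resid[start + j + k])
    return xs, ys  # both have length N_k = k * [n / (2k)]


def q_statistic(resid, m):
    """Q(xi_n) = (1/m) * sum_k sqrt(N_k) * xi_{N_k}; the sqrt weight is assumed."""
    total = 0.0
    for k in range(1, m + 1):
        xs, ys = lag_pairs(resid, k)
        total += math.sqrt(len(xs)) * chatterjee_xi(xs, ys)
    return total / m


rng = random.Random(1)
residuals = [rng.gauss(0.0, 1.0) for _ in range(60)]  # stand-in for fitted residuals
print(lag_pairs(list(range(1, 10)), 2))  # ([1, 2, 5, 6], [3, 4, 7, 8])
print(q_statistic(residuals, m=3))
```

The sketch requires $2m\le n$ so that every lag yields at least one block of pairs.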

It is challenging to derive the asymptotic distribution of the test statistic $Q(\xi_n)$. Our work focuses on the computation of $Q(\xi_n)$, while the problem of obtaining its asymptotic distribution will be studied in the future. We use the parametric bootstrap to calculate the p-value of the test statistic as a way to conduct diagnostic checks in time series models. The steps of the parametric bootstrap are similar to those in [6]. More specifically, given the data, we estimate the model parameters and obtain the residuals to calculate the test statistic $Q(\xi_n)$. Then, we generate $B$ time series of length $n$ using the estimated model. For each time series $b=1,\ldots,B$, we re-estimate the model parameters and obtain the residuals to compute the value of the test statistic, denoted by $Q_b(\xi_n)$, and then compute the p-value by the formula

$$p\text{-value}=\frac{\#\{b:Q_b(\xi_n)\ge Q(\xi_n)\}+1}{B+1}. \tag{2}$$

By comparing this p-value with the predetermined significance level $\alpha$, we implement diagnostic checks in the time series model: if the p-value is less than $\alpha$, we consider the fit to be inadequate; otherwise, the fit is considered adequate.
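The bootstrap procedure can be sketched generically in Python. The callables `fit`, `simulate` and `statistic` are hypothetical placeholders for a model-specific estimator, simulator and test statistic. The toy Gaussian-mean model at the bottom is included only to make the sketch runnable and is not one of the models studied in this paper.

```python
import random


def bootstrap_pvalue(q_obs, q_boot):
    """p-value = (#{b : Q_b >= Q} + 1) / (B + 1)."""
    exceed = sum(1 for qb in q_boot if qb >= q_obs)
    return (exceed + 1) / (len(q_boot) + 1)


def parametric_bootstrap(data, fit, simulate, statistic, B=499, seed=0):
    """fit(data) -> (params, residuals); simulate(params, n, rng) -> new series."""
    rng = random.Random(seed)
    params, resid = fit(data)
    q_obs = statistic(resid)
    q_boot = []
    for _ in range(B):
        series_b = simulate(params, len(data), rng)  # generate from the fitted model
        _, resid_b = fit(series_b)                   # re-estimate on the simulated series
        q_boot.append(statistic(resid_b))
    return bootstrap_pvalue(q_obs, q_boot)


# Toy illustration (hypothetical): iid N(mu, 1) model, lag-1 autocorrelation statistic.
def toy_fit(d):
    mu = sum(d) / len(d)
    return mu, [v - mu for v in d]


def toy_simulate(mu, n, rng):
    return [rng.gauss(mu, 1.0) for _ in range(n)]


def toy_statistic(res):
    return sum(res[t] * res[t - 1] for t in range(1, len(res))) / sum(e * e for e in res)


data_rng = random.Random(1)
data = [data_rng.gauss(0.0, 1.0) for _ in range(100)]
print(parametric_bootstrap(data, toy_fit, toy_simulate, toy_statistic, B=99))
```

With $B=499$, as used in the simulations of this paper, the smallest attainable p-value is $1/500=0.002$.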

3. Application of the new test statistic to time series models

In this section, we apply our proposed new test statistic to ARMA, ARMA-ARCH and INARCH models.

3.1. ARMA model

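Let us now consider a time series $\{y_t\}$ generated by a stationary ARMA model

$$\phi(B)y_t=\theta(B)\varepsilon_t,$$

where $\phi(B)=1-\phi_1B-\cdots-\phi_pB^p$, $\theta(B)=1+\theta_1B+\cdots+\theta_qB^q$, $B^kw_t=w_{t-k}$, and $\{\varepsilon_t\}$ is a sequence of independent and identically distributed (iid) $N(0,1)$ random variables.

Let $\theta=(\phi_1,\ldots,\phi_p,\theta_1,\ldots,\theta_q)^T$ and $\hat\theta$ be the MLE of $\theta$. When $\theta$ is replaced by $\hat\theta$, let $\hat\varepsilon_t$ be the corresponding residual. After a model of this form has been fitted, it is important to study the adequacy of the fit by examining the residuals $\hat\varepsilon_1,\ldots,\hat\varepsilon_n$ and their autocorrelations

$$\hat r_k=\frac{\sum_{t=k+1}^{n}\hat\varepsilon_t\hat\varepsilon_{t-k}}{\sum_{t=1}^{n}\hat\varepsilon_t^2},\quad k=1,2,\ldots$$

Ljung and Box [14] revised the test statistic proposed by Box and Pierce [3], which is given as

$$Q(M)=n(n+2)\sum_{k=1}^{M}(n-k)^{-1}\hat r_k^2,$$

and it is distributed as $\chi^2_{M-p-q}$ for large $n$, yielding an approximate test for lack of fit.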
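For reference, the residual autocorrelations $\hat r_k$ and the Ljung–Box statistic $Q(M)=n(n+2)\sum_{k=1}^{M}(n-k)^{-1}\hat r_k^2$ can be computed with a few lines of standard-library Python. The function names are hypothetical. The chi-square critical value with $M-p-q$ degrees of freedom is not computed here because the standard library has no chi-square quantile function.

```python
def residual_acf(resid, k):
    """r_k = sum_{t=k+1}^n e_t e_{t-k} / sum_{t=1}^n e_t^2 (no mean correction)."""
    n = len(resid)
    num = sum(resid[t] * resid[t - k] for t in range(k, n))
    den = sum(e * e for e in resid)
    return num / den


def ljung_box(resid, M):
    """Q(M) = n (n + 2) * sum_{k=1}^M r_k^2 / (n - k)."""
    n = len(resid)
    return n * (n + 2) * sum(residual_acf(resid, k) ** 2 / (n - k)
                             for k in range(1, M + 1))


# Small hand-checkable example: r_1 = -3/4, so Q(1) = 4 * 6 * (9/16) / 3 = 4.5.
print(ljung_box([1.0, -1.0, 1.0, -1.0], M=1))  # 4.5
```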

Our proposed test statistic $Q(\xi_n)$ is applicable to the ARMA model. More precisely, from the residuals $\hat\varepsilon_1,\ldots,\hat\varepsilon_n$ we calculate the rank correlation coefficients $\xi_{N_1},\ldots,\xi_{N_m}$. Then, utilizing Equations (1) and (2), we compute the new test statistic $Q(\xi_n)$ and the corresponding p-value. By comparing the obtained p-value with the significance level $\alpha$, we conduct diagnostic checks in the ARMA model.

3.2. ARMA-ARCH model

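Suppose that the time series $\{y_t\}$ satisfies the following conditional heteroscedastic model

$$\phi(B)y_t=\theta(B)\eta_t,\quad \eta_t=\sqrt{h_t}\,\varepsilon_t,\quad h_t=\alpha_0+\sum_{i=1}^{r}\alpha_i\eta_{t-i}^2,$$

where $\phi(B)=1-\phi_1B-\cdots-\phi_pB^p$, $\theta(B)=1+\theta_1B+\cdots+\theta_qB^q$, $B^kw_t=w_{t-k}$, and $\{\varepsilon_t\}$ is a sequence of iid $N(0,1)$ random variables, i.e. the ARMA-ARCH model.

Let $\theta=(\phi_1,\ldots,\phi_p,\theta_1,\ldots,\theta_q,\alpha_0,\alpha_1,\ldots,\alpha_r)^T$ and $\hat\theta$ be the MLE of $\theta$. When $\theta$ is replaced by $\hat\theta$, let $\hat\varepsilon_t=\hat\eta_t/\sqrt{\hat h_t}$ be the corresponding residual. Li and Mak [13] proposed a portmanteau statistic based on the large-sample distribution of the squared residual autocorrelations. The transformed lag-$k$ squared residual autocorrelation is given by

$$\hat r_k=\frac{\sum_{t=k+1}^{n}(\hat\varepsilon_t^2-1)(\hat\varepsilon_{t-k}^2-1)}{\sum_{t=1}^{n}(\hat\varepsilon_t^2-1)^2},\quad k=1,2,\ldots$$

The Li–Mak statistic

$$Q(r,M)=n\sum_{i=r+1}^{M}\hat r_i^2$$

is asymptotically distributed as $\chi^2_{M-r}$.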
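The squared residual autocorrelations and the Li–Mak statistic $Q(r,M)=n\sum_{i=r+1}^{M}\hat r_i^2$ translate directly into code. The sketch below uses hypothetical function names and assumes the standardized residuals $\hat\varepsilon_t$ have already been obtained from a fitted ARMA-ARCH model.

```python
import math


def squared_residual_acf(resid, k):
    """Lag-k autocorrelation of (e_t^2 - 1), as in the Li-Mak construction."""
    n = len(resid)
    d = [e * e - 1.0 for e in resid]
    num = sum(d[t] * d[t - k] for t in range(k, n))
    den = sum(v * v for v in d)
    return num / den


def li_mak(resid, r, M):
    """Q(r, M) = n * sum_{i=r+1}^M r_i^2, with r the ARCH order of the fitted model."""
    n = len(resid)
    return n * sum(squared_residual_acf(resid, i) ** 2 for i in range(r + 1, M + 1))


# Hand-checkable example: e_t^2 - 1 alternates -1, 1, -1, 1, so r_2 = 1/2 and Q(1, 2) = 1.
e = [0.0, math.sqrt(2.0), 0.0, math.sqrt(2.0)]
print(li_mak(e, r=1, M=2))
```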

Our proposed test statistic $Q(\xi_n)$ is also applicable to the ARMA-ARCH model. As in Section 3.1, from the residuals $\hat\varepsilon_1,\ldots,\hat\varepsilon_n$ we compute the new test statistic $Q(\xi_n)$ and the corresponding p-value. By comparing the obtained p-value with the significance level $\alpha$, we conduct diagnostic checks in the ARMA-ARCH model.

3.3. INARCH model

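In this subsection, we consider an INARCH($p$) sequence $\{y_t\}$ generated by the following model

$$y_t\mid\mathcal{F}_{t-1}:P(\lambda_t),\quad \lambda_t=\alpha_0+\sum_{i=1}^{p}\alpha_iy_{t-i},$$

where $\mathcal{F}_{t-1}$ denotes the σ-field generated by $\{y_{t-1},y_{t-2},\ldots\}$ and $P(\lambda_t)$ denotes the Poisson distribution with mean $\lambda_t$. Zhu and Wang [21] proposed five portmanteau test statistics to check the adequacy of a fitted INARCH specification. Before giving these five test statistics, we define the following notation.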

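Let $\alpha=(\alpha_0,\alpha_1,\ldots,\alpha_p)^T$ and $\hat\alpha$ be the MLE of $\alpha$. Let $\varepsilon_t=y_t/\lambda_t$, $1\le t\le n$, and let $\hat\varepsilon_t=y_t/\hat\lambda_t$ be the conditional residual, where $\hat\lambda_t=\hat\alpha_0+\sum_{i=1}^{p}\hat\alpha_iy_{t-i}$. Define

$$\hat r(k)=\frac{1}{n}\sum_{t=k+1}^{n}(\hat\varepsilon_t-1)(\hat\varepsilon_{t-k}-1),\quad k=1,2,\ldots$$

Let $\hat r_M=(\hat r(1),\ldots,\hat r(M))^T$. Define the matrix $\Sigma=(\sigma_{kl})$ with $\sigma_{kl}=E[(\varepsilon_t-1)^2(\varepsilon_{t-k}-1)(\varepsilon_{t-l}-1)]$, $k,l=1,\ldots,M$; $\Sigma$ can be easily estimated from the data. Let

$$\hat\sigma_{kl}=\frac{1}{n}\sum_{k\vee l<t\le n}(\hat\varepsilon_t-1)^2(\hat\varepsilon_{t-k}-1)(\hat\varepsilon_{t-l}-1),$$

where $k\vee l=\max\{k,l\}$. We obtain $\hat\Sigma_n^{-1/2}$, the estimator of $\Sigma^{-1/2}$, by replacing $\sigma_{kl}$ with $\hat\sigma_{kl}$. The residual autocorrelation is defined as $\tilde r_M=\hat\Sigma_n^{-1/2}\hat r_M=(\tilde r(1),\ldots,\tilde r(M))^T$. Denote the $k$th partial autocorrelation of the $\varepsilon_t$'s as $\psi_k$ and let $\Psi_M=(\psi_1,\ldots,\psi_M)^T$. Define the vector of the first $M$ conditional residual partial autocorrelations as $\tilde\Psi_M=(\tilde\psi_1,\ldots,\tilde\psi_M)^T$, where $\tilde\psi_k$ can be obtained, analogously to $\psi_k$, by replacing $\varepsilon_t$ with $\hat\varepsilon_t$.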
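The building blocks just defined, namely the conditional residuals $\hat\varepsilon_t=y_t/\hat\lambda_t$, the quantities $\hat r(k)$ and the entries $\hat\sigma_{kl}$, can be sketched as follows. The function names are hypothetical. Treating the first $p$ observations purely as initial values, so that residuals start at $t=p+1$, is an assumption of this sketch, since the handling of pre-sample values is not specified above.

```python
def conditional_residuals(y, alpha):
    """eps_t = y_t / lambda_t with lambda_t = alpha_0 + sum_i alpha_i * y_{t-i}."""
    p = len(alpha) - 1
    eps = []
    for t in range(p, len(y)):  # the first p observations serve as initial values
        lam = alpha[0] + sum(alpha[i] * y[t - i] for i in range(1, p + 1))
        eps.append(y[t] / lam)
    return eps


def r_hat(eps, k):
    """r(k) = (1/n) * sum_{t=k+1}^n (eps_t - 1)(eps_{t-k} - 1)."""
    n = len(eps)
    return sum((eps[t] - 1) * (eps[t - k] - 1) for t in range(k, n)) / n


def sigma_hat(eps, k, l):
    """sigma_kl = (1/n) * sum_{t > max(k,l)} (eps_t - 1)^2 (eps_{t-k} - 1)(eps_{t-l} - 1)."""
    n = len(eps)
    return sum((eps[t] - 1) ** 2 * (eps[t - k] - 1) * (eps[t - l] - 1)
               for t in range(max(k, l), n)) / n


# Tiny illustrative series with made-up parameter values alpha = (1.0, 0.5).
eps = conditional_residuals([2, 1, 3, 0], [1.0, 0.5])
print(eps)            # [0.5, 2.0, 0.0]
print(r_hat(eps, 1))  # -0.5
```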

The first two statistics proposed by [21] are

$$\tilde Q(M)=n\tilde r_M^T\tilde r_M,\qquad Q(M)=n\tilde\Psi_M^T\tilde\Psi_M,$$

which are asymptotically distributed as $\chi^2_M$.

The third and fourth statistics proposed by [21] are

$$D(M)=n\left[1-|\tilde R_M|^{1/M}\right],\qquad D(M)=-\frac{n}{M+1}\log|\tilde R_M|,$$

where the autocorrelation matrix $\tilde R_M=(\tilde r(|i-j|))$, $i,j=1,\ldots,M+1$. The distribution of these two test statistics can be approximated by a Gamma distribution with density function $\beta^{\alpha}x^{\alpha-1}e^{-\beta x}/\Gamma(\alpha)$, where the parameters are defined by

$$\alpha=\frac{a^2}{2b}=\frac{3M(M+1)}{4(2M+1)},\qquad \beta=\frac{a}{2b}=\frac{3M}{2(2M+1)}.$$
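As a quick numerical check of the Gamma parameters $\alpha=3M(M+1)/(4(2M+1))$ and $\beta=3M/(2(2M+1))$, the following arithmetic (our own illustration, evaluated at the two values of $M$ used in the simulations of Section 4) may be helpful:

$$M=10:\quad \alpha=\frac{3\cdot 10\cdot 11}{4\cdot 21}=\frac{330}{84}\approx 3.929,\qquad \beta=\frac{3\cdot 10}{2\cdot 21}=\frac{30}{42}\approx 0.714,$$

$$M=20:\quad \alpha=\frac{3\cdot 20\cdot 21}{4\cdot 41}=\frac{1260}{164}\approx 7.683,\qquad \beta=\frac{3\cdot 20}{2\cdot 41}=\frac{60}{82}\approx 0.732.$$

In both cases the mean of the approximating Gamma distribution is $\alpha/\beta=(M+1)/2$, i.e. 5.5 and 10.5, respectively.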

The fifth test statistic proposed by [21] is

$$Q_s(M)=n\sum_{k=1}^{M-s}\left[\sum_{l=0}^{s}\tilde r(k+l)\right]^2,$$

the distribution of which can be approximated by a Gamma distribution with parameters $\alpha=a^2/(2b)$, $\beta=a/(2b)$, where $a=(s+1)(M-s)$ and $b=(s+1)^2(M-s)+2\sum_{i=1}^{s+1}(s+1-i)^2(M-s-i)$. In the subsequent analysis, we choose $s=1$.

Our proposed test statistic $Q(\xi_n)$ is also applicable to the INARCH model. As in Section 3.1, from the residuals $\hat\varepsilon_1,\ldots,\hat\varepsilon_n$ we compute the new test statistic $Q(\xi_n)$ and the corresponding p-value. Then, by comparing the obtained p-value with the significance level $\alpha$, we conduct diagnostic checks in the INARCH model.
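The parametric bootstrap needs a way to generate series from the fitted INARCH($p$) model. A minimal simulator is sketched below, assuming Poisson conditional distributions. The zero starting values, the burn-in length and the use of Knuth's multiplication method for Poisson draws are illustrative choices of this sketch, and the function names are hypothetical.

```python
import math
import random


def poisson_draw(lam, rng):
    """Poisson(lam) variate via Knuth's multiplication method (fine for moderate lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_inarch(alpha, n, rng, burn_in=200):
    """Simulate y_t | past ~ Poisson(alpha_0 + sum_i alpha_i * y_{t-i})."""
    p = len(alpha) - 1
    y = [0] * p  # arbitrary starting values, washed out by the burn-in
    for _ in range(n + burn_in):
        lam = alpha[0] + sum(alpha[i] * y[-i] for i in range(1, p + 1))
        y.append(poisson_draw(lam, rng))
    return y[-n:]


series = simulate_inarch([0.5, 0.3], n=250, rng=random.Random(0))
print(len(series), sum(series) / len(series))
```

For a stationary INARCH(1) model the mean is $\alpha_0/(1-\alpha_1)$, which is about 0.714 for the parameter values used in this illustration.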

3.4. Randomized quantile residuals

In this subsection, we consider the alternative randomized quantile residuals proposed by [7]. Randomized quantile residuals are obtained by inverting the fitted distribution function at each response value and finding the equivalent standard normal quantile; under a correctly specified model they follow a standard normal distribution. For ARMA and ARMA-ARCH models, the residuals already follow a standard normal distribution. Hence, we investigate randomized quantile residuals only for the INARCH model in the subsequent analysis. Zhu [20] previously examined quantile residuals for the negative binomial INGARCH model.

The normalized randomized quantile residual is defined by $\hat\varepsilon_t=\Phi^{-1}(u_t)$, where $\Phi^{-1}$ is the inverse cumulative distribution function of a standard normal variable and $u_t$ is a random value from the uniform distribution on the interval $[F(y_t-1,\hat\alpha),F(y_t,\hat\alpha)]$, with $F(y_t,\hat\alpha)$ being the cumulative distribution function of the fitted INARCH model. From the obtained quantile residuals $\hat\varepsilon_1,\ldots,\hat\varepsilon_n$, we calculate the rank correlation coefficients $\xi_{N_1},\ldots,\xi_{N_m}$. Then, utilizing Equations (1) and (2), we compute the new test statistic $Q(\xi_n)$ and the corresponding p-value. By comparing the obtained p-value with the significance level $\alpha$, we conduct diagnostic checks.
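A minimal sketch of this construction for a Poisson conditional distribution is given below. The function names are hypothetical, `lam` stands for the fitted conditional means $\hat\lambda_t$ (assumed to be available already), and the clamping of $u_t$ away from 0 and 1 is a numerical safeguard added here so that the normal quantile stays finite.

```python
import math
import random
from statistics import NormalDist


def poisson_cdf(k, lam):
    """P(Y <= k) for Y ~ Poisson(lam); equals 0 for k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for j in range(1, k + 1):
        term *= lam / j
        total += term
    return min(total, 1.0)


def randomized_quantile_residuals(y, lam, rng=None):
    """eps_t = Phi^{-1}(u_t), with u_t uniform on [F(y_t - 1), F(y_t)]."""
    rng = rng or random.Random(0)
    tiny = 1e-12
    out = []
    for yt, lt in zip(y, lam):
        u = rng.uniform(poisson_cdf(yt - 1, lt), poisson_cdf(yt, lt))
        u = min(max(u, tiny), 1.0 - tiny)  # keep the quantile finite
        out.append(NormalDist().inv_cdf(u))
    return out


print(randomized_quantile_residuals([0, 2, 1], [1.0, 1.5, 0.8]))
```

Because of the random draw of $u_t$, repeated calls with different random states give different residuals for the same data.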

4. Simulation

To illustrate the superiority of the proposed test statistic, in this section we calculate its empirical sizes and powers and compare them with those of the Ljung–Box, Li–Mak and Zhu–Wang statistics. The significance level $\alpha$ is 5%, the sample sizes are $n=250, 500$, and $m, M=10, 20$. For each combination of $n$, $m$ and $M$, we use $B=499$ bootstrap replicates and 500 simulations.

4.1. ARMA model

Let us apply the test statistic $Q(\xi_n)$ to the ARMA model and compare it with the Ljung–Box statistic.

The first simulation study was conducted by generating 500 sets of observations $\{y_1,\ldots,y_n\}$ from the ARMA(1,1) model. For convenience, we denote the simulation as Simulation A1. Table 1 lists empirical sizes of overall tests. The results show the proportion of p-values less than $\alpha$ for the test statistic $Q(\xi_n)$ with a few combinations of $\phi_1$ and $\theta_1$. Moreover, the proportion of Ljung–Box statistic $Q(M)$ values exceeding the upper 5 percentage point of the asymptotic distribution is also shown in Table 1 as a comparison. It can be seen that both statistics have reliable empirical sizes and that the proposed new statistic already performs well with the smaller sample size ($n=250$). There is no significant difference in the performance of the new test statistic for different $m$.

Table 1. Empirical sizes of the overall tests for Simulation A1: ARMA(1,1) model.

n     M(m)   Q(ξn)   Q(M)
ϕ1=0.8, θ1=0.6
250   10     0.056   0.052
250   20     0.044   0.058
500   10     0.060   0.056
500   20     0.040   0.052
ϕ1=0.6, θ1=0.4
250   10     0.040   0.058
250   20     0.052   0.044
500   10     0.048   0.040
500   20     0.042   0.046
ϕ1=0.4, θ1=0.7
250   10     0.058   0.046
250   20     0.040   0.056
500   10     0.054   0.056
500   20     0.050   0.046
ϕ1=0.3, θ1=0.5
250   10     0.046   0.042
250   20     0.048   0.046
500   10     0.054   0.040
500   20     0.048   0.040

The second simulation study was conducted to compute empirical powers of overall tests in ARMA models, shown in Table 2. For convenience, we denote the simulation as Simulation A2. The results are based on data generated from an AR(2) model, with an AR(1) model being fitted to obtain $Q(M)$ and $Q(\xi_n)$. We find that the powers of the new test statistic $Q(\xi_n)$ are high, although they are smaller than those of the Ljung–Box statistic in most cases. Moreover, the new test statistic has higher powers with the larger sample size ($n=500$) and the smaller $m$ ($m=10$).

Table 2. Empirical powers of the overall tests for Simulation A2: data were generated from an AR(2) model, with an AR(1) model being fitted.

n     M(m)   Q(ξn)   Q(M)
ϕ1=0.2, ϕ2=0.7
250   10     0.822   1
250   20     0.724   1
500   10     0.948   1
500   20     0.896   1
ϕ1=0.1, ϕ2=0.8
250   10     0.972   1
250   20     0.952   1
500   10     1       1
500   20     1       1
ϕ1=0.1, ϕ2=0.7
250   10     0.850   1
250   20     0.768   1
500   10     0.976   1
500   20     0.904   1

4.2. ARMA-ARCH model

In this subsection, we apply the test statistic $Q(\xi_n)$ to the ARMA-ARCH model and compare it with the Li–Mak statistic.

The third simulation study was conducted by generating 500 sets of observations $\{y_1,\ldots,y_n\}$ from an AR(1)-ARCH(2) model. For convenience, we denote the simulation as Simulation B1. Table 3 lists empirical sizes of overall tests. It can be seen that the proposed new test statistic $Q(\xi_n)$ has more reliable empirical sizes than the Li–Mak statistic $Q(r,M)$. Moreover, the proposed new statistic already performs well with the smaller sample size ($n=250$), and there is no significant difference in the performance of the new test statistic for different $m$.

Table 3. Empirical sizes of the overall tests for Simulation B1: AR(1)-ARCH(2) model.

n     M(m)   Q(ξn)   Q(r,M)
ϕ1=0.8, α0=0.6, α1=0.3, α2=0.2
250   10     0.040   0.032
250   20     0.052   0.026
500   10     0.040   0.038
500   20     0.054   0.034
ϕ1=0.7, α0=0.2, α1=0.2, α2=0.2
250   10     0.054   0.034
250   20     0.050   0.028
500   10     0.050   0.036
500   20     0.056   0.030
ϕ1=0.4, α0=0.3, α1=0.3, α2=0.2
250   10     0.058   0.030
250   20     0.040   0.026
500   10     0.046   0.040
500   20     0.052   0.034
ϕ1=0.5, α0=0.5, α1=0.2, α2=0.2
250   10     0.050   0.034
250   20     0.044   0.024
500   10     0.042   0.036
500   20     0.044   0.038

The fourth simulation study was conducted to compute empirical powers of overall tests in ARMA-ARCH models, shown in Table 4. For convenience, we denote the simulation as Simulation B2. The results are based on data generated from an AR(2)-ARCH(2) model, with an AR(1)-ARCH(2) model and an AR(1)-ARCH(1) model being fitted to obtain $Q(r,M)$ and $Q(\xi_n)$. It can be seen that the Li–Mak statistic $Q(r,M)$ does not perform well here. When an AR(1)-ARCH(2) model is fitted, the powers of $Q(r,M)$ are very small; they increase when an AR(1)-ARCH(1) model is fitted, but are still smaller than the powers of $Q(\xi_n)$. In contrast, our proposed test statistic $Q(\xi_n)$ performs well and its powers are high, especially with the larger sample size ($n=500$) and the smaller $m$ ($m=10$).

Table 4. Empirical powers of the overall tests for Simulation B2: data were generated from an AR(2)-ARCH(2) model, with an AR(1)-ARCH(2) model and an AR(1)-ARCH(1) model being fitted. Model 1: AR(1)-ARCH(2) model; Model 2: AR(1)-ARCH(1) model.

Model   n     M(m)   Q(ξn)   Q(r,M)
ϕ1=0.2, ϕ2=0.6, α0=0.4, α1=0.2, α2=0.2
1       250   10     0.464   0.056
1       250   20     0.378   0.022
1       500   10     0.734   0.052
1       500   20     0.554   0.048
2       250   10     0.612   0.484
2       250   20     0.516   0.326
2       500   10     0.832   0.744
2       500   20     0.720   0.618
ϕ1=0.3, ϕ2=0.6, α0=0.6, α1=0.4, α2=0.2
1       250   10     0.508   0.062
1       250   20     0.422   0.022
1       500   10     0.738   0.072
1       500   20     0.610   0.044
2       250   10     0.592   0.330
2       250   20     0.494   0.202
2       500   10     0.768   0.548
2       500   20     0.632   0.406
ϕ1=0.2, ϕ2=0.7, α0=0.5, α1=0.4, α2=0.2
1       250   10     0.836   0.050
1       250   20     0.738   0.042
1       500   10     0.986   0.068
1       500   20     0.946   0.070
2       250   10     0.902   0.318
2       250   20     0.794   0.228
2       500   10     0.992   0.604
2       500   20     0.954   0.446

4.3. INARCH model

Finally, we employ the new test statistic in the INARCH model and compare it with the five Zhu–Wang statistics. We analyze both the residuals $\hat\varepsilon_t$ of Section 3.3 and the randomized quantile residuals $\hat\varepsilon_t$ of Section 3.4. The two resulting versions of the test statistic are referred to below as $Q(\xi_n)$ based on residuals and $Q(\xi_n)$ based on randomized quantile residuals, respectively.

The fifth simulation study was conducted by generating 500 sets of observations $\{y_1,\ldots,y_n\}$ from the INARCH($p$) model, where $p=1,2,3$. For convenience, we denote the simulation as Simulation C1. Table 5 lists empirical sizes of overall tests. It can be seen that both versions of the proposed test statistic $Q(\xi_n)$ have more reliable empirical sizes than the five test statistics proposed by [21]. Furthermore, both new test statistics perform well even with the smaller sample size ($n=250$) and show no significant difference in their performance for different $m$.

Table 5. Empirical sizes of the overall tests for Simulation C1: INARCH(p) model and p=1,2,3. The first Q(ξn) column is based on the residuals of Section 3.3 and the second on the randomized quantile residuals of Section 3.4; the remaining five columns are the five Zhu–Wang statistics in the order in which they are introduced in Section 3.3.

n     M(m)   Q(ξn)   Q(ξn)   Q~(M)   Q(M)    D(M)    D(M)    Q1(M)
p=1, α0=0.5, α1=0.3
250   10     0.042   0.034   0.026   0.054   0.022   0.024   0.042
250   20     0.054   0.046   0.014   0.142   0.068   0.080   0.036
500   10     0.050   0.038   0.020   0.038   0.024   0.026   0.050
500   20     0.046   0.062   0.024   0.066   0.038   0.042   0.044
p=1, α0=0.6, α1=0.4
250   10     0.048   0.052   0.026   0.054   0.034   0.038   0.050
250   20     0.044   0.052   0.044   0.172   0.084   0.106   0.042
500   10     0.042   0.060   0.034   0.044   0.028   0.030   0.042
500   20     0.044   0.042   0.028   0.082   0.048   0.054   0.036
p=2, α0=1.0, α1=0.4, α2=0.3
250   10     0.046   0.058   0.022   0.038   0.008   0.014   0.008
250   20     0.042   0.052   0.016   0.128   0.042   0.048   0.024
500   10     0.040   0.046   0.014   0.028   0.008   0.010   0.024
500   20     0.044   0.042   0.014   0.070   0.024   0.026   0.022
p=3, α0=0.5, α1=0.3, α2=0.3, α3=0.2
250   10     0.044   0.058   0.010   0.030   0.010   0.010   0.014
250   20     0.048   0.058   0.020   0.138   0.052   0.056   0.024
500   10     0.044   0.044   0.018   0.032   0.014   0.014   0.036
500   20     0.044   0.044   0.018   0.064   0.030   0.034   0.032

The sixth simulation study was conducted to compute empirical powers of overall tests in INARCH($p$) models, shown in Table 6. For convenience, we denote the simulation as Simulation C2. The results are based on data generated from the INARCH(2) and INARCH(3) models, with an INARCH(1) model used to obtain the test statistics. We find that the test statistic $Q(\xi_n)$ based on residuals $\hat\varepsilon_t$ outperforms $Q(\xi_n)$ based on randomized quantile residuals $\hat\varepsilon_t$. Although the powers of $Q(\xi_n)$ are smaller than those of the Zhu–Wang statistics in most cases, they are still high. Both new test statistics have higher powers with the larger sample size ($n=500$) and the smaller $m$ ($m=10$).

Table 6. Empirical powers of the overall tests for Simulation C2: data were generated from the INARCH(2) and INARCH(3) models, with an INARCH(1) model being fitted. The first Q(ξn) column is based on the residuals of Section 3.3 and the second on the randomized quantile residuals of Section 3.4; the remaining five columns are the five Zhu–Wang statistics in the order in which they are introduced in Section 3.3.

n     M(m)   Q(ξn)   Q(ξn)   Q~(M)   Q(M)    D(M)    D(M)    Q1(M)
p=2, α0=1.0, α1=0.1, α2=0.7
250   10     0.898   0.770   1       1       1       1       0.998
250   20     0.820   0.680   1       1       1       1       0.962
500   10     0.998   0.954   1       1       1       1       1
500   20     0.988   0.854   1       1       1       1       0.996
p=2, α0=1.0, α1=0.1, α2=0.8
250   10     0.990   0.972   1       1       1       1       0.980
250   20     0.982   0.944   1       0.998   1       1       0.930
500   10     1       1       1       1       1       1       1
500   20     1       0.998   1       1       1       1       0.998
p=3, α0=0.5, α1=0.1, α2=0.6, α3=0.2
250   10     0.828   0.472   1       1       1       1       0.998
250   20     0.728   0.400   1       1       1       1       0.990
500   10     0.992   0.656   1       1       1       1       1
500   20     0.964   0.524   1       1       1       1       1
p=3, α0=0.5, α1=0.1, α2=0.7, α3=0.1
250   10     0.932   0.752   1       1       1       1       0.998
250   20     0.878   0.630   1       1       1       1       0.950
500   10     1       0.928   1       1       1       1       1
500   20     0.996   0.836   1       1       1       1       1

In summary, the proposed new test statistic provides a valuable option for conducting diagnostic checks in time series models. In ARMA models, the size of the new test statistic is comparable to that of the Ljung–Box statistic, while its power is high but somewhat lower. In ARMA-ARCH models, the new test statistic outperforms the Li–Mak statistic in both size and power. For INARCH models, we analyze the test statistic $Q(\xi_n)$ based on residuals $\hat\varepsilon_t$ and the test statistic $Q(\xi_n)$ based on randomized quantile residuals $\hat\varepsilon_t$. Both outperform the Zhu–Wang statistics in terms of size, and the version derived from residuals has higher powers than the version based on randomized quantile residuals.

5. Real examples

In this section, we apply the new test statistic to financial data and to the daily download counts of a program. Although the financial data are only available on trading days, they are treated as coming from an equally spaced sampling process in the analysis. The significance level is 5%, the number of bootstrap replicates is $B=499$, and $m, M=10$. A fitted model is judged adequate by an existing test statistic (Ljung–Box, Li–Mak or Zhu–Wang) if the ratio between the value of the statistic and the corresponding critical value is less than 1, and by our proposed test statistic if its p-value exceeds 0.05.

Example 5.1

We consider the daily log-returns of the S&P 500 index from October 21, 2010 to August 12, 2011. The dataset has 204 observations and the data can be downloaded from the website https://cn.investing.com/indices/us-spx-500-historical-data. We denote the daily closing price of the S&P 500 index at time $t$ as $p_t$. Then the daily log-return at time $t$ is defined as $y_t=\log(p_t)-\log(p_{t-1})$. Figure 1 shows the time series plot, the autocorrelation function (ACF) plot and the partial autocorrelation function (PACF) plot of the daily log-returns of the S&P 500 index. We first fit an AR(2) model to the data; the estimated parameters are ϕ1=0.1741, ϕ2=0.3151. Table 7 shows that the ratio between the value of the Ljung–Box statistic $Q(M)$ and the corresponding critical value is 0.5795 (<1), and the p-value of the new test statistic $Q(\xi_n)$ is 0.3660 (>0.05). Both the Ljung–Box statistic $Q(M)$ and the new test statistic $Q(\xi_n)$ suggest an adequate fit of the data. The top plot in Figure 2 shows the ACF plot of the residuals of the AR(2) fit, and there is no significant autocorrelation. We then fit an AR(1) model to the data; the estimated parameter is ϕ1=0.2595. Table 7 shows that the ratio between the value of the Ljung–Box statistic $Q(M)$ and the corresponding critical value is 1.3713 (>1), and the p-value of the new test statistic $Q(\xi_n)$ is 0.0040 (<0.05). Both the Ljung–Box statistic $Q(M)$ and the new test statistic $Q(\xi_n)$ clearly reject the fit. The bottom plot in Figure 2 shows the ACF plot of the residuals of the AR(1) fit, where we find significant autocorrelation at lag 2. The results illustrate that the performance of the new test statistic is comparable to that of the Ljung–Box statistic in ARMA models.
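The log-return transformation used in this example, $y_t=\log(p_t)-\log(p_{t-1})$, is a one-liner in code. The sketch below uses made-up closing prices purely for illustration; the function name `log_returns` is hypothetical.

```python
import math


def log_returns(prices):
    """y_t = log(p_t) - log(p_{t-1}) for consecutive closing prices."""
    return [math.log(p1) - math.log(p0) for p0, p1 in zip(prices, prices[1:])]


# Hypothetical closing prices, not the S&P 500 data analysed in the text.
print(log_returns([100.0, 110.0, 99.0]))
```

A series of $k$ prices yields $k-1$ log-returns.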

Figure 1. The daily log-returns of the S&P 500 index: (a) time series plot; (b) ACF plot; (c) PACF plot.

Table 7. The ratio between the value of the Ljung–Box statistic Q(M) and the corresponding critical value as well as the p-value of the new test statistic Q(ξn) under the fit of two models for Example 5.1. Model 1: AR(2) model; Model 2: AR(1) model.

                   Model 1   Model 2
Q(M) (ratio)       0.5795    1.3713
Q(ξn) (p-value)    0.3660    0.0040

Figure 2. ACF plots of the residuals for Example 5.1: (a) ACF plot when data are fitted by an AR(2) model; (b) ACF plot when data are fitted by an AR(1) model.

Example 5.2

We consider the Shanghai interbank overnight rate from August 13, 2013 to August 29, 2014. The dataset has 263 observations and the data can be downloaded from the website https://www.shibor.org/shibor/dataservices/. We take the logarithm of the data and then center them. Figure 3 shows the time series plot, ACF plot and PACF plot of the processed data. We first fit an AR(2)-ARCH(2) model to the data; the estimated parameters are ϕ1=1.3549, ϕ2=0.3857, α0=0.0024, α1=0.2185, α2=0.0419. Table 8 shows that the ratio between the value of the Li–Mak statistic $Q(r,M)$ and the corresponding critical value is 0.6753 (<1), and the p-value of the new test statistic $Q(\xi_n)$ is 0.4500 (>0.05). Both the Li–Mak statistic $Q(r,M)$ and the new test statistic $Q(\xi_n)$ suggest an adequate fit of the data. The top plot in Figure 4 shows the ACF plot of the residuals of the AR(2)-ARCH(2) fit, and there is no significant autocorrelation. We then fit an AR(1)-ARCH(1) model to the data; the estimated parameters are ϕ1=0.9738, α0=0.0026, α1=0.3192. Table 8 shows that the ratio between the value of the Li–Mak statistic $Q(r,M)$ and the corresponding critical value is 0.5293 (<1), and the p-value of the new test statistic $Q(\xi_n)$ is 0.0020 (<0.05). The statistic $Q(\xi_n)$ clearly rejects the fit, while the statistic $Q(r,M)$ suggests the fit is adequate. The middle plot in Figure 4 shows the ACF plot of the residuals of the AR(1)-ARCH(1) fit, where we find significant autocorrelation at lag 1. Finally, we fit an AR(1)-ARCH(2) model to the data; the estimated parameters are ϕ1=0.9700, α0=0.0025, α1=0.2572, α2=0.0594. Table 8 shows that the ratio between the value of the Li–Mak statistic $Q(r,M)$ and the corresponding critical value is 0.6009 (<1), and the p-value of the new test statistic $Q(\xi_n)$ is 0.0380 (<0.05). The statistic $Q(\xi_n)$ rejects the fit at the 5% level, while the statistic $Q(r,M)$ suggests the fit is adequate. The bottom plot in Figure 4 shows the ACF plot of the residuals of the AR(1)-ARCH(2) fit, where we also find significant autocorrelation at lag 1. These results are consistent with the simulation finding that the Li–Mak statistic is less powerful than the new test statistic.

Figure 3. Shanghai interbank overnight rate: (a) time series plot; (b) ACF plot; (c) PACF plot.

Table 8. The ratio between the value of the Li–Mak statistic Q(r,M) and the corresponding critical value as well as the p-value of the new test statistic Q(ξn) under the fit of three models for Example 5.2. Model 1: AR(2)-ARCH(2) model; Model 2: AR(1)-ARCH(1) model; Model 3: AR(1)-ARCH(2) model.

                   Model 1   Model 2   Model 3
Q(r,M) (ratio)     0.6753    0.5293    0.6009
Q(ξn) (p-value)    0.4500    0.0020    0.0380

Figure 4. ACF plots of the residuals for Example 5.2: (a) ACF plot when data are fitted by an AR(2)-ARCH(2) model; (b) ACF plot when data are fitted by an AR(1)-ARCH(1) model; (c) ACF plot when data are fitted by an AR(1)-ARCH(2) model.

Example 5.3

We consider the daily download counts of the program CWß TeXpert from June 1, 2006 to February 28, 2007. The dataset has 267 observations and has been analyzed by [17,21], among others. Figure 5 shows the time series plot, ACF plot and PACF plot of the series. The sample mean is 2.4007 and the sample variance is 7.5343, indicating that models with over-dispersion may be needed. We fit an INARCH(1) model to the data; the estimated parameters are α0=1.6815, α1=0.2882. Table 9 shows that the ratio between the value of the statistic and the corresponding critical value is less than 1 for all five Zhu–Wang statistics. The p-values of the new test statistic $Q(\xi_n)$ based on residuals and based on randomized quantile residuals are 0.5300 (>0.05) and 0.1760 (>0.05), respectively. Thus the Zhu–Wang statistics and both new test statistics all suggest an adequate fit of the data. Figure 6 shows the ACF plots of the residuals and of the randomized quantile residuals of the INARCH(1) fit, and there is no significant autocorrelation. The results illustrate that the performance of both new test statistics is comparable to that of the Zhu–Wang statistics in INARCH models.

Figure 5.

The daily download counts of the program: (a) time series plot; (b) ACF plot; (c) PACF plot.

Table 9.

The ratio between the value of each of the five Zhu–Wang statistics and the corresponding critical value, as well as the p-values of both new test statistics Q(ξn) and Q(ξn), under the fit of the INARCH(1) model for Example 5.3.

Q(ξn)   Q(ξn)   Q~(M)   Q(M)    D(M)    D(M)    Q1(M)
0.5300  0.1760  0.4183  0.3627  0.0468  0.0428  0.4418

Figure 6.

The daily download counts of the program: (a) ACF plot of the residuals ε^t under the fit of the INARCH(1) model; (b) ACF plot of the randomized quantile residuals ε^t under the fit of the INARCH(1) model.

6. Conclusion

We replace Pearson's correlation coefficient with Chatterjee's new rank correlation coefficient and propose a new test statistic for diagnostic checks in ARMA, ARMA-ARCH and INARCH models. We conduct simulations to assess the performance of the new test statistic and compare it with existing ones. The simulation results show the superiority of the new test statistic, which provides a new approach to diagnostic checks in time series models. Three real data examples illustrate its application. Future work will consider using other correlation coefficients for diagnostic checks in time series models.

Acknowledgments

The authors thank two referees for their useful and constructive comments on an earlier draft of this article.

Funding Statement

Zhu's work is supported by National Natural Science Foundation of China (No. 12271206), Natural Science Foundation of Jilin Province (No. 20210101143JC), and Science and Technology Research Planning Project of Jilin Provincial Department of Education (No. JJKH20231122KJ). Li's work is supported by National Natural Science Foundation of China (No. 12201069), Natural Science Foundation of Jilin Province (No. 20210101160JC), Science and Technology Research Project of Jilin Provincial Education Department (No. JJKH20220820KJ) and Natural Science Foundation Projects of CCNU (CSJJ2022006ZK).

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

  • 1. Bergsma W. and Dassios A., A consistent test of independence based on a sign covariance related to Kendall's tau, Bernoulli 20 (2014), pp. 1006–1028.
  • 2. Blum J.R., Kiefer J., and Rosenblatt M., Distribution free tests of independence based on the sample distribution function, Ann. Math. Stat. 32 (1961), pp. 485–498.
  • 3. Box G.E.P. and Pierce D.A., Distribution of residual autocorrelations in autoregressive integrated moving average time series models, J. Am. Stat. Assoc. 65 (1970), pp. 1509–1526.
  • 4. Breiman L. and Friedman J.H., Estimating optimal transformations for multiple regression and correlation, J. Am. Stat. Assoc. 80 (1985), pp. 580–598.
  • 5. Chatterjee S., A new coefficient of correlation, J. Am. Stat. Assoc. 116 (2021), pp. 2009–2022.
  • 6. Christou V. and Fokianos K., Estimation and testing linearity for non-linear mixed Poisson autoregressions, Electron. J. Stat. 9 (2015), pp. 1357–1377.
  • 7. Dunn P.K. and Smyth G.K., Randomized quantile residuals, J. Comput. Graph. Stat. 5 (1996), pp. 236–244.
  • 8. Engle R.F., Autoregressive conditional heteroscedasticity with estimates of the variance of U.K. inflation, Econometrica 50 (1982), pp. 987–1007.
  • 9. Engle R.F., Hendry D.F., and Trumble D., Small sample properties of ARCH estimators and tests, Can. J. Econ. 18 (1985), pp. 66–93.
  • 10. Ferland R., Latour A., and Oraichi D., Integer-valued GARCH process, J. Time Ser. Anal. 27 (2006), pp. 923–942.
  • 11. Hoeffding W., A non-parametric test of independence, Ann. Math. Stat. 19 (1948), pp. 546–557.
  • 12. Josse J. and Holmes S., Measuring multivariate association and beyond, Stat. Surv. 10 (2016), pp. 132–167.
  • 13. Li W.K. and Mak T.K., On the squared residual autocorrelations in non-linear time series with conditional heteroskedasticity, J. Time Ser. Anal. 15 (1994), pp. 627–636.
  • 14. Ljung G.M. and Box G.E.P., On a measure of lack of fit in time series models, Biometrika 65 (1978), pp. 297–303.
  • 15. Pantula S.G., Estimation of autoregressive models with ARCH errors, Sankhya B 50 (1988), pp. 119–138.
  • 16. Weiss A.A., ARMA models with ARCH errors, J. Time Ser. Anal. 5 (1984), pp. 129–143.
  • 17. Weiß C.H., Thinning operations for modeling time series of counts—a survey, AStA Adv. Stat. Anal. 92 (2008), pp. 319–341.
  • 18. Weiß C.H., Modelling time series of counts with overdispersion, Stat. Methods Appl. 18 (2009), pp. 507–519.
  • 19. Yanagimoto T., On measures of association and a related problem, Ann. Inst. Stat. Math. 22 (1970), pp. 57–63.
  • 20. Zhu F., A negative binomial integer-valued GARCH model, J. Time Ser. Anal. 32 (2011), pp. 54–67.
  • 21. Zhu F. and Wang D., Diagnostic checking integer-valued ARCH(p) models using conditional residual autocorrelations, Comput. Stat. Data Anal. 54 (2010), pp. 496–508.
  • 22. Zhu F. and Wang D., Estimation and testing for a Poisson autoregressive model, Metrika 73 (2011), pp. 211–230.

Articles from Journal of Applied Statistics are provided here courtesy of Taylor & Francis
