Author manuscript; available in PMC: 2014 Feb 1.
Published in final edited form as: Bernoulli (Andover). 2013 Feb 1;19(1):205–227. doi: 10.3150/11-BEJ399

Inference for modulated stationary processes

Zhibiao Zhao 1,*, Xiaoye Li 1,**
PMCID: PMC3607552  NIHMSID: NIHMS383095  PMID: 23539557

Abstract

We study statistical inference for a class of modulated stationary processes with time-dependent variances. Due to non-stationarity and the large number of unknown parameters, existing methods for stationary or locally stationary time series are not applicable. Based on a self-normalization technique, we address several inference problems, including a self-normalized central limit theorem, a self-normalized cumulative sum test for the change-point problem, long-run variance estimation through blockwise self-normalization, and a self-normalization-based wild bootstrap. Monte Carlo simulation studies show that the proposed self-normalization-based methods outperform stationarity-based alternatives. We demonstrate the proposed methodology using two real data sets: annual mean precipitation rates in Seoul during 1771–2000 and quarterly U.S. Gross National Product growth rates during 1947–2002.

Keywords: Change-point analysis, Confidence interval, Long-run variance, Modulated stationary process, Self-normalization, Strong invariance principle, Wild bootstrap

1. Introduction

In time series analysis, stationarity requires that the dependence structure be sustained over time, so that we can borrow information from one time period to study the model dynamics over another period; see Fan and Yao (2003) for nonparametric treatments and Lahiri (2003) for various resampling and block bootstrap methods. In practice, however, many climatic, economic, and financial time series are non-stationary and therefore challenging to analyze. First, since the dependence structure varies over time, information is more localized. Second, non-stationary processes often require extra parameters to account for the time-varying structure. One way to overcome these issues is to impose a certain local stationarity; see, for example, Dahlhaus (1997) and Adak (1998) for spectral representation frameworks and Dahlhaus and Polonik (2009) for a time domain approach.

In this article we study a class of modulated stationary processes [see Adak (1998)]:

X_i = \mu + \sigma_i e_i, \quad i = 1, \ldots, n, (1.1)

where {e_i} is a stationary time series with zero mean, and σ_i > 0 are unknown constants adjusting for time-dependent variances. Then X_i oscillates around the constant mean μ, whereas its variance changes over time in an unknown manner. In the special case σ_i ≡ 1, (1.1) reduces to the stationary case. If σ_i = s(i/n) for a Lipschitz continuous function s(t) on [0, 1], then (1.1) is locally stationary. For the general non-stationary case (1.1), the number of unknown parameters exceeds the number of observations, and it is infeasible to estimate the σ_i. Due to the non-stationarity and the large number of unknown parameters, existing methods developed for (locally) stationary processes are not applicable, and our main purpose is to develop new statistical inference techniques.

First, we establish a uniform strong approximation result that can be used to derive a self-normalized central limit theorem (CLT) for the sample mean \bar{X}_n = n^{-1}\sum_{i=1}^{n} X_i of (1.1). For the stationary case σ_i ≡ 1, by Fan and Yao (2003), under mild mixing conditions,

\sqrt{n}(\bar{X}_n - \mu) \Rightarrow N(0, \tau^2), \quad \text{where} \quad \tau^2 = \gamma_0 + 2\sum_{k=1}^{\infty}\gamma_k \quad \text{and} \quad \gamma_k = \mathrm{Cov}(e_i, e_{i+k}). (1.2)

For the modulated stationary case (1.1), it is non-trivial whether \sqrt{n}(\bar{X}_n - \mu) has a CLT without imposing further assumptions on σ_i and the dependence structure of e_i. Moreover, even when the latter CLT exists, it is difficult to estimate the limiting variance due to the large number of unknown parameters; see De Jong and Davidson (2000) for related work assuming a near-epoch dependent mixing framework. Zhao (2011) studied confidence interval construction for μ in (1.1) under a blockwise asymptotically-equal-cumulative-variance assumption. The latter assumption is rather restrictive and essentially requires that block averages be asymptotically independent and identically distributed (IID). In this article, we deal with the more general setting (1.1). Under a strong invariance principle assumption, we establish a self-normalized CLT with the self-normalizing constant adjusting for the time-dependent non-stationarity. The obtained CLT extends the classical CLT for IID data or stationary time series to modulated stationary processes. Furthermore, we extend the idea to linear combinations of means over different time periods, which allows us to address inference regarding mean levels over multiple time periods.

Second, we study wild bootstrap for modulated stationary processes. Since the seminal work of Efron (1979), a great deal of research has been done on bootstrap under various settings, ranging from bootstrap for IID data in Efron (1979), wild bootstrap for independent observations with possibly non-constant variances in Wu (1986) and Liu (1988), to various block bootstrap and resampling methods for stationary time series in Künsch (1989), Politis and Romano (1994), Bühlmann (2002), and the monograph Lahiri (2003). With the established self-normalized CLT, we propose a wild bootstrap procedure that is tailored to deal with modulated stationary processes: the dependence is removed through a scaling factor, and the non-constant variance structure of the original data is preserved in the wild bootstrap data-generating mechanism. Our simulation study shows that the wild bootstrap method outperforms the widely used stationarity-based block bootstrap.

Third, we address change-point analysis. The change-point problem has been an active area of research; see Pettitt (1980) for proportion changes in binary data, Horváth (1993) for mean and variance changes in Gaussian observations, Bai and Perron (1998) for coefficient changes in linear models, Aue et al. (2008a) for coefficient changes in polynomial regression with uncorrelated errors, Aue et al. (2008b) for mean change in time series with stationary errors, Shao and Zhang (2010) for change-points for stationary time series, and the monograph by Csörgő and Horváth (1997) for more discussion. Most of these works deal with stationary and/or independent data. Hansen (2000) studied tests for constancy of parameters in linear regression models with non-stationary regressors and conditionally homoscedastic martingale difference errors. Here we consider

H_0: X_i = \mu_i + \sigma_i e_i, \quad \mu_1 = \cdots = \mu_n, \qquad H_a: \mu_1 = \cdots = \mu_J \ne \mu_{J+1} = \cdots = \mu_n, (1.3)

where J is an unknown change-point. The aforementioned works mainly focused on detecting changes in the mean while the error variance remains constant. On the other hand, researchers have also realized the importance of the variance/covariance structure in change-point analysis. For example, Inclán and Tiao (1994) studied changes in variance for independent data, and Aue et al. (2009) and Berkes, Gombay and Horváth (2009) considered changes in covariance for time series data. To our knowledge, there has been almost no attempt to advance change-point analysis under the non-constant variances framework in (1.3). Andrews (1993) studied the change-point problem under a near-epoch dependence structure that allows for non-stationary processes, but his Assumption 1(c) on page 830 therein essentially implies that the process has constant variance. The popular cumulative sum (CUSUM) test is developed for stationary time series and does not take into account the time-dependent variances. Using the self-normalization idea, we propose a self-normalized CUSUM test and a wild bootstrap method to obtain its critical value. Our empirical studies show that the usual CUSUM tests tend to over-reject the null hypothesis in the presence of non-constant variances. By contrast, the self-normalized CUSUM test yields size close to the nominal level.

Fourth, we estimate the long-run variance τ² in (1.2). The long-run variance plays an essential role in statistical inference involving time series. Most works in the literature deal with stationary processes through various block bootstrap and subsampling approaches; see Carlstein (1986), Künsch (1989), Politis and Romano (1994), Götze and Künsch (1996), and the monograph Lahiri (2003). De Jong and Davidson (2000) established the consistency of kernel estimators of covariance matrices under a near-epoch dependent mixing condition. Recently, Müller (2007) studied robust long-run variance estimation for locally stationary processes. For model (1.1), the error process {e_i} is contaminated with the unknown standard deviations {σ_i}, and we apply blockwise self-normalization to remove the non-stationarity, resulting in asymptotically stationary blocks.

Fifth, the proposed methods can be extended to deal with the linear regression model

X_i = U_i\beta + \sigma_i e_i, (1.4)

where U_i = (u_{i,1}, …, u_{i,p}) are deterministic covariates and β = (β_1, …, β_p)′ is the unknown column vector of parameters. For p = 2, Hansen (1995) established the asymptotic normality of the least-squares estimate of the slope parameter under a fairly general framework of non-stationary errors. While Hansen (1995) assumed that the errors form a martingale difference array so that they are uncorrelated, the framework in (1.4) is more general in that it allows for correlations. On the other hand, Hansen (1995) allowed the conditional volatilities to follow an autoregressive model, hence introducing stochastic volatilities. Phillips, Sun and Jin (2007) considered (1.4) for stationary errors, and their approach is not applicable here due to the unknown non-constant variances σ_i². In Section 2.6 we consider a self-normalized CLT for the least-squares estimator of β in (1.4). In the polynomial regression case u_{i,r} = (i/n)^{r−1}, Aue et al. (2008a) studied a likelihood-based test for the constancy of β in (1.4) for uncorrelated errors with constant variance. Due to the presence of correlation and time-varying variances, it is more challenging to study the change-point problem for (1.4), and this is beyond the scope of this article.

The rest of this article is organized as follows. We present theoretical results in Section 2. Sections 3–4 contain Monte Carlo studies and applications to two real data sets.

2. Main results

For sequences {a_n} and {b_n}, write a_n = O(b_n), a_n = o(b_n), and a_n ≍ b_n, respectively, if |a_n/b_n| < c_1, a_n/b_n → 0, and c_2 < |a_n/b_n| < c_3, for some constants 0 < c_1, c_2, c_3 < ∞. For q > 0 and a random variable e, write e ∈ ℒ^q if ‖e‖_q := {E(|e|^q)}^{1/q} < ∞.

2.1. Uniform approximations for modulated stationary processes

In (1.1), assume without loss of generality that E(e_i) = 0 and E(e_i²) = 1 so that {e_i} and {e_i² − 1} are centered stationary processes. With the convention S_0 = S_0^* = 0, define

S_i = \sum_{j=1}^{i} e_j \quad \text{and} \quad S_i^* = \sum_{j=1}^{i}(e_j^2 - 1), \quad i = 1, 2, \ldots. (2.1)

Assumption 2.1. There exist standard Brownian motions {B_t} and {B_t^*} such that

\max_{1 \le i \le n}|S_i - \tau B_i| = o_{a.s.}(\Delta_n) \quad \text{and} \quad \max_{1 \le i \le n}|S_i^* - \tau^* B_i^*| = o_{a.s.}(\Delta_n), (2.2)

where Δ_n is the approximation error, and τ² and τ^{*2} are the long-run variances of {e_i} and {e_i² − 1}, respectively. Further assume τ² > 0 to avoid the degenerate case τ² = 0.

The uniform approximations in (2.2) are generally called strong invariance principles. The two Brownian motions {B_t} and {B_t^*} may be defined on different probability spaces and hence are not jointly distributed, but this is not an issue because our argument does not depend on their joint distribution. To see how to use (2.2), under H_0 in (1.3), consider

F_j = j(\bar{X}_j - \mu) \quad \text{and} \quad \bar{V}_j^2 = \sum_{i=1}^{j}(X_i - \bar{X}_j)^2, \quad \text{where} \quad \bar{X}_j = j^{-1}\sum_{i=1}^{j}X_i. (2.3)

Theorem 2.1 below presents uniform approximations for F_j and \bar{V}_j². Define

r_n = |\sigma_n| + \sum_{i=2}^{n}|\sigma_i - \sigma_{i-1}| \quad \text{and} \quad r_n^* = |\sigma_n^2| + \sum_{i=2}^{n}|\sigma_i^2 - \sigma_{i-1}^2|, (2.4)
\Sigma_j^2 = \sum_{i=1}^{j}\sigma_i^2 \quad \text{and} \quad \Sigma_j^{*2} = \Big(\sum_{i=1}^{j}\sigma_i^4\Big)^{1/2}. (2.5)

Theorem 2.1. Let (2.2) hold. For any c ∈ (0, 1], the following uniform approximations hold:

\max_{cn \le j \le n}\Big|F_j - \tau\sum_{i=1}^{j}\sigma_i(B_i - B_{i-1})\Big| = O_{a.s.}(r_n\Delta_n), (2.6)
\max_{cn \le j \le n}\big|\bar{V}_j^2 - \Sigma_j^2\big| = O_p\{(r_n^2\Delta_n^2 + \Sigma_n^2)/n + \Sigma_n^{*2} + r_n^*\Delta_n\}. (2.7)

Theorem 2.1 provides quite general results under (2.2). We now discuss sufficient conditions for (2.2). Shao (1993) obtained sufficient mixing conditions for (2.2). In this article, we briefly introduce the framework in Wu (2007). Assume that e_i has the causal representation e_i = G(…, ε_{i−1}, ε_i), where ε_i are IID innovations and G is a measurable function such that e_i is well-defined. Let {ε_i′}_{i∈ℤ} be an independent copy of {ε_i}_{i∈ℤ}. Assume

\sum_{i=1}^{\infty} i\,\|e_i - e_i'\|_8 < \infty, \quad \text{where} \quad e_i' = G(\ldots, \varepsilon_{-1}, \varepsilon_0', \varepsilon_1, \ldots, \varepsilon_{i-1}, \varepsilon_i). (2.8)

Proposition 2.1 below follows from Corollary 4 in Wu (2007).

Proposition 2.1. Assume that (2.8) holds. Then (2.2) holds with Δ_n = n^{1/4} log(n), the optimal rate up to a logarithmic factor.

For the linear process e_i = \sum_{j=0}^{\infty} a_j\varepsilon_{i-j} with ε_i ∈ ℒ⁸ and E(ε_i) = 0, ‖e_i − e_i′‖_8 = ‖ε_0 − ε_0′‖_8 |a_i|. If \sum_{i=1}^{\infty} i|a_i| < ∞, then (2.2) holds with Δ_n = n^{1/4} log(n). For many nonlinear time series, ‖e_i − e_i′‖_8 decays exponentially fast and hence (2.8) holds; see Section 3.1 of Wu (2007). From now on we assume that (2.2) holds with Δ_n = n^{1/4} log(n).

Remark 2.1. If e_i are IID with E(e_i) = 0 and e_i ∈ ℒ^q for some 2 < q ≤ 4, the celebrated “Hungarian embedding” asserts that \sum_{j=1}^{i} e_j satisfies a strong invariance principle with the optimal rate o_{a.s.}(n^{1/q}). Thus, the moment assumption e_i ∈ ℒ⁸ in Proposition 2.1 is necessary in order to ensure strong invariance principles for both S_i and S_i^* in (2.1) with the approximation rate n^{1/4} log(n). On the other hand, one can relax the moment assumption by loosening the approximation rate. For example, by Corollary 4 in Wu (2007), if e_i ∈ ℒ^{2q} for some q > 2 and \sum_{i=1}^{\infty} i‖e_i − e_i′‖_{2q} < ∞, then (2.2) holds with Δ_n = n^{1/\min(q,4)} log(n).

As shown in Examples 2.1–2.3 below, r_n and r_n^* in (2.4) often have tractable bounds.

Example 2.1. If σ_i is non-decreasing in i, then σ_n ≤ r_n ≤ 2σ_n and σ_n² ≤ r_n^* ≤ 2σ_n². If σ_i is non-increasing in i, then r_n = σ_1 and r_n^* = σ_1². If the σ_i are piecewise constant with finitely many pieces, then r_n, r_n^* = O(1).

Example 2.2. Let σ_i = s(i/n^γ) for γ ∈ [0, 1] and a Lipschitz continuous function s(t), t ∈ [0, ∞), with \sup_{t∈[0,∞)} s(t) < ∞. Then r_n, r_n^* = O(n^{1−γ}). If γ = 1, we obtain the locally stationary case with time window i/n ∈ [0, 1]; if γ ∈ [0, 1), we have the infinite time window [0, ∞) since n/n^γ → ∞, which may be more reasonable for data with a long time horizon.

Example 2.3. Let σ_i = i^β L(i) for a slowly varying function L(·) such that L(cx)/L(x) → 1 as x → ∞ for all c > 0. Then we can show r_n = O{n^β L(n)} or O(1), and r_n^* = O{n^{2β} L²(n)} or O(1), depending on whether β > 0 or β < 0. For the boundary case β = 0, assume L(i+1)/L(i) = 1 + O(1/i) uniformly; then r_n = L(n) + O(1)\sum_{i=2}^{n} L(i)/i = O{log(n) \max_{1≤i≤n} L(i)}. Similarly, r_n^* = O{log(n) \max_{1≤i≤n} L²(i)}.

2.2. Self-normalized central limit theorem

In this section we establish a self-normalized CLT for the sample average X̄_n. To understand how non-stationarity makes this problem difficult, an elementary calculation shows

\mathrm{Var}\{\sqrt{n}(\bar{X}_n - \mu)\} = \frac{\gamma_0}{n}\sum_{i=1}^{n}\sigma_i^2 + \frac{2}{n}\sum_{1\le i<j\le n}\sigma_i\sigma_j\gamma_{j-i} =: \tau_n^2, (2.9)

where γ_k = Cov(e_0, e_k). In the stationary case σ_i ≡ 1, under the condition \sum_{k=0}^{\infty}|γ_k| < ∞, τ_n² → τ², the long-run variance in (1.2). For non-constant variances, it is difficult to deal with τ_n² directly, due to the large number of unknown parameters and the complicated structure. See De Jong and Davidson (2000) for a kernel estimator of τ_n² under a near-epoch dependent mixing framework.

To attenuate the aforementioned issue, we apply the uniform approximations in Theorem 2.1. Assume that (2.10) below holds. Note that the increments B_i − B_{i−1} of a standard Brownian motion are IID standard normal random variables. By (2.6), n(X̄_n − μ) is equivalent in distribution to N(0, τ²Σ_n²). By (2.7), V̄_n/Σ_n → 1 in probability. By Slutsky’s theorem, we have the following Proposition 2.2.

Proposition 2.2. Let (2.2) hold with Δ_n = n^{1/4} log(n). For r_n, r_n^*, Σ_n², Σ_n^{*2} in (2.4)–(2.5), assume

\delta_n = r_n\Delta_n/\Sigma_n + (r_n^*\Delta_n + \Sigma_n^{*2})/\Sigma_n^2 \to 0. (2.10)

Recall V̄_n² in (2.3). Then as n → ∞, n(X̄_n − μ)/V̄_n ⇒ N(0, τ²). Consequently, a (1 − α) asymptotic confidence interval for μ is X̄_n ± z_{α/2} τ̂ V̄_n/n, where τ̂ is a consistent estimate of τ (Section 2.5 below), and z_{α/2} is the (1 − α/2) standard normal quantile.

Proposition 2.2 is an extension of the classical CLT for IID data or stationary processes to modulated stationary processes. If the X_i are IID, then n(X̄_n − μ)/V̄_n ⇒ N(0, 1). In Proposition 2.2, τ² can be viewed as the variance inflation factor due to the dependence of {e_i}. For stationary data, the sample variance V̄_n²/n is a consistent estimate of the population variance. Here, for the non-constant variances case (1.1), by (2.7) in Theorem 2.1, V̄_n²/n can be viewed as an estimate of the time-averaged “population variance” Σ_n²/n. So, we can interpret the CLT in Proposition 2.2 as a self-normalized CLT for modulated stationary processes, with the self-normalizing term V̄_n adjusting for the non-stationarity due to σ_1, …, σ_n and τ² accounting for the dependence of {e_i}. Clearly, the parameters σ_1, …, σ_n are canceled out through self-normalization. Finally, condition (2.10) is satisfied for Example 2.2 with γ > 3/4 and Example 2.3 with β > −1/4.

In classical statistics, the width of confidence intervals usually shrinks as the sample size increases. By Proposition 2.2 and Theorem 2.1, the width of the constructed confidence interval for μ is proportional to V̄_n/n or, equivalently, Σ_n/n. Thus, a necessary and sufficient condition for a shrinking confidence interval is \sum_{i=1}^{n}σ_i²/n² → 0, which is satisfied if σ_i = o(√i). An intuitive explanation is as follows. For IID data, the sample mean converges at the rate O(n^{−1/2}). In (1.1), if σ_i grows faster than √i, the contribution of a new observation is negligible relative to its noise level.

Example 2.4. If σ_i ≍ i^β with β ∈ [0, 1/2), the length of the confidence interval is proportional to Σ_n/n ≍ n^{β−1/2}. In particular, if c_1 < σ_i < c_2 for some positive constants c_1 and c_2, then Σ_n/n achieves the optimal rate O(n^{−1/2}). If σ_i ≍ log(i), then Σ_n/n ≍ log(n)/√n.
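The confidence interval in Proposition 2.2 is simple to compute. Below is a minimal Python sketch (our illustration, not code from the paper): it assumes a consistent long-run variance estimate tau_hat, for instance from the blockwise self-normalization method of Section 2.5, and returns X̄_n ± z_{α/2} τ̂ V̄_n/n.

```python
import numpy as np
from statistics import NormalDist

def self_normalized_ci(x, tau_hat, alpha=0.05):
    """(1 - alpha) CI for mu via Proposition 2.2:
    n*(xbar - mu)/Vbar_n => N(0, tau^2)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xbar = x.mean()
    vbar = np.sqrt(np.sum((x - xbar) ** 2))     # self-normalizer Vbar_n
    z = NormalDist().inv_cdf(1 - alpha / 2)     # (1 - alpha/2) normal quantile
    half = z * tau_hat * vbar / n
    return xbar - half, xbar + half
```

Note that the time-dependent variances σ_1, …, σ_n never enter the computation; they are absorbed by the self-normalizer V̄_n.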

The same idea can be extended to linear combinations of means over multiple time periods. Suppose we have observations from k consecutive time periods 𝒯1, …, 𝒯k, each of the form (1.1) with different means, denoted by μ1, …, μk, and each having time-dependent variances. Let ν = β1μ1 + ⋯ + βkμk for given coefficients β1, …, βk. For example, if we are interested in mean change from 𝒯1 to 𝒯2, we can take ν = μ2 − μ1; if we are interested in whether the increase from 𝒯3 to 𝒯4 is larger than that from 𝒯1 to 𝒯2, we can let ν = (μ4 − μ3) − (μ2 − μ1). Proposition 2.3 below extends Proposition 2.2 to multiple means.

Proposition 2.3. Let ν = β_1μ_1 + ⋯ + β_kμ_k. For 𝒯_j, denote by n_j its sample size and by X̄^{(j)} its sample average. Assume that (2.10) holds for each individual time period 𝒯_j and, for simplicity, that n_1, …, n_k are of the same order. Then

\frac{\sum_{j=1}^{k}\beta_j\bar{X}^{(j)} - \nu}{\Lambda_n} \Rightarrow N(0, \tau^2), \quad \text{where} \quad \Lambda_n^2 = \sum_{j=1}^{k}\Big\{\frac{\beta_j^2}{n_j^2}\sum_{i\in\mathcal{T}_j}\big[X_i - \bar{X}^{(j)}\big]^2\Big\}.

2.3. Wild bootstrap for self-normalized statistic

Recall σiei in (1.1). Suppose we are interested in the self-normalized statistic

H_n = \frac{\sum_{i=1}^{n}\sigma_i e_i}{\sqrt{\sum_{i=1}^{n}\sigma_i^2 e_i^2}}.

For problems with small sample sizes, it is natural to use the bootstrap distribution instead of the convergence H_n ⇒ N(0, τ²) in Proposition 2.2. Wu (1986) and Liu (1988) pioneered the work on the wild bootstrap for independent data with non-identical distributions. We shall extend their wild bootstrap procedure to the modulated stationary process (1.1).

Let {α_i} be IID random variables, independent of {e_i}, satisfying α_i ∈ ℒ³, E(α_i) = 0, and E(α_i²) = 1. Define the self-normalized statistic based on the new data:

H_n^* = \frac{\sum_{i=1}^{n}\xi_i}{\sqrt{\sum_{i=1}^{n}(\xi_i - \bar{\xi})^2}}, \quad \text{where} \quad \xi_i = \sigma_i e_i \alpha_i \quad \text{and} \quad \bar{\xi} = \frac{\xi_1 + \cdots + \xi_n}{n}.

Clearly, ξ_i inherits the non-stationarity structure of σ_i e_i by writing ξ_i = σ_i e_i^* with e_i^* = e_i α_i. On the other hand, for the new error process {e_i^*}, E(e_i^{*2}) = E(e_i²) = 1 and Cov(e_i^*, e_j^*) = 0 for i ≠ j. Thus, {e_i^*} is a white noise sequence with long-run variance one. By Proposition 2.2, the scaled version H_n/τ ⇒ N(0, 1) is robust against the dependence structure of {e_i}, so we expect that H_n^* should be close to H_n/τ in distribution.

Theorem 2.2. Let the conditions in Proposition 2.2 hold. Further assume

\Big(\sum_{i=1}^{n}\sigma_i^3\Big)^2 \Big/ \Big(\sum_{i=1}^{n}\sigma_i^2\Big)^3 \to 0. (2.11)

Let τ̂ be a consistent estimate of τ. Denote by ℙ* the conditional law given {e_i}. Then

\sup_x \big|\mathbb{P}^*(H_n^* \le x) - \mathbb{P}(H_n/\hat{\tau} \le x)\big| \to 0, \quad \text{in probability.} (2.12)

Theorem 2.2 asserts that H_n^* behaves like the scaled version H_n/τ̂, with the scaling factor τ̂ coming from the dependence of {e_i}. Here we use the sample mean in (1.1) to illustrate a wild bootstrap procedure to obtain the distribution of n(X̄_n − μ)/(τ V̄_n) in Proposition 2.2.

  • (i) Apply the method in Section 2.5 to X_1, …, X_n to obtain a consistent estimate τ̂ of τ.

  • (ii) Subtract the sample mean from the data to obtain ε_i = X_i − X̄_n, i = 1, …, n.

  • (iii) Generate IID random variables α_1, …, α_n satisfying E(α_i) = 0 and E(α_i²) = 1.

  • (iv) Based on ε_i in (ii) and α_i in (iii), generate bootstrap data ξ_i^b = ε_i α_i, and compute
    H_n^b = \frac{\sum_{i=1}^{n} \xi_i^b}{\hat{\tau}^b \sqrt{\sum_{i=1}^{n} (\xi_i^b - \bar{\xi}^b)^2}},
    where τ̂^b is a long-run variance estimate (see Section 2.5) for the bootstrap data ξ_i^b.

  • (v) Repeat (ii)–(iv) many times and use the empirical distribution of those realizations of H_n^b as the distribution of n(X̄_n − μ)/(τ V̄_n).

The proposed wild bootstrap is an extension of that in Liu (1988) for independent data to the modulated stationary case, and it has two appealing features. First, the scaling factor τ̂ makes the statistic independent of the dependence structure. Second, the bootstrap data-generating mechanism is adaptive to the unknown time-dependent variances {σ_i²}. For the distribution of α_i in step (iii), we use ℙ(α_i = −1) = ℙ(α_i = 1) = 1/2, which has some desirable properties. For example, it preserves the magnitude and range of the data. As shown by Davidson and Flachaire (2008), for certain hypothesis testing problems in linear regression models with symmetrically distributed errors, the resulting bootstrap distribution is exactly equal to that of the test statistic; see Theorem 1 therein.
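The steps above translate into only a few lines of code. The following is a hedged Python sketch (ours; wild_bootstrap_dist and tau_hat_fn are hypothetical names, with tau_hat_fn standing for any consistent long-run variance estimator such as the blockwise one in Section 2.5), using the Rademacher weights ℙ(α_i = ±1) = 1/2 discussed above.

```python
import numpy as np

def wild_bootstrap_dist(x, tau_hat_fn, n_boot=1000, seed=None):
    """Wild bootstrap for the self-normalized mean statistic, steps (i)-(v):
    center the data, multiply by IID Rademacher weights, self-normalize."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    eps = x - x.mean()                              # step (ii): centered data
    out = np.empty(n_boot)
    for b in range(n_boot):
        alpha = rng.choice([-1.0, 1.0], size=n)     # step (iii): Rademacher weights
        xi = eps * alpha                            # step (iv): bootstrap data
        tau_b = tau_hat_fn(xi)                      # long-run variance of replicate
        out[b] = xi.sum() / (tau_b * np.sqrt(np.sum((xi - xi.mean()) ** 2)))
    return out   # empirical distribution of n*(xbar - mu)/(tau * Vbar_n)
```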

For the purpose of comparison, we briefly introduce the widely used block bootstrap for a stationary time series {X_i} with mean μ. By (1.2), √n(X̄_n − μ) ⇒ N(0, τ²). Suppose that we want to bootstrap the distribution of √n(X̄_n − μ). Let k_n, ℓ_n, ℐ_1, …, ℐ_{ℓ_n} be defined as in Section 2.5 below. The non-overlapping block bootstrap works as follows.

  1. Take a simple random sample of size ℓ_n with replacement from the blocks ℐ_1, …, ℐ_{ℓ_n}, and form the bootstrap data X_1^b, …, X_{n′}^b, n′ = k_n ℓ_n, by pooling together the X_i’s for which the index i is within those selected blocks.

  2. Let X̄^b be the sample average of X_1^b, …, X_{n′}^b. Compute Ξ_n = √n′ {X̄^b − E*(X̄^b)}, where E*(X̄^b) = \sum_{i=1}^{n} X_i/n is the conditional expectation of X̄^b given {X_i}.

  3. Repeat steps 1–2 many times and use the empirical distribution of the Ξ_n’s as the distribution of √n(X̄_n − μ).

In step 2, another choice is the studentized version Ξ̃_n = √n′ {X̄^b − E*(X̄^b)}/τ̂^b, where τ̂^b is a consistent estimate of τ based on the bootstrap data. Assuming stationarity and k_n → ∞, the blocks are asymptotically independent and share the same model dynamics as the whole data, which validates the above block bootstrap. We refer the reader to Lahiri (2003) for detailed discussions. For a non-stationary process, the block bootstrap is no longer valid, because individual blocks are not representative of the whole data. By contrast, the proposed wild bootstrap is adaptive to the unknown dependence and non-constant variance structure.
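For comparison, here is an analogous sketch of the non-overlapping block bootstrap in steps 1–3 (again our illustration; boundary observations beyond k_nℓ_n are dropped, as in the text).

```python
import numpy as np

def block_bootstrap_dist(x, kn, n_boot=1000, seed=None):
    """Non-overlapping block bootstrap for sqrt(n)(xbar - mu):
    resample whole blocks with replacement, recenter at E*(xbar_b)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    ln = len(x) // kn                       # number of complete blocks
    blocks = x[: ln * kn].reshape(ln, kn)
    cond_mean = blocks.mean()               # conditional expectation E*(xbar_b)
    out = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, ln, size=ln)  # step 1: sample ln blocks
        xb = blocks[idx].ravel()
        out[b] = np.sqrt(xb.size) * (xb.mean() - cond_mean)  # step 2
    return out
```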

2.4. Change-point analysis: self-normalized CUSUM test

To test for a change-point in the mean of a process {X_i}, two popular CUSUM-type tests [see Section 3 of Robbins et al. (2011) for a review and related references] are

T_{n1} = \max_{cn \le j \le (1-c)n} \hat{\tau}^{-1}\frac{|S_X(j)|}{\sqrt{j(1 - j/n)}} \quad \text{and} \quad T_{n2} = \max_{cn \le j \le (1-c)n} \hat{\tau}^{-1}|S_X(j)|, (2.13)

where τ̂² is a consistent estimate of the long-run variance τ² of {X_i}, and

S_X(j) = \Big(1 - \frac{j}{n}\Big)\sum_{i=1}^{j}X_i - \frac{j}{n}\sum_{i=j+1}^{n}X_i. (2.14)

Here c > 0 (c = 0.1 in our simulation studies) is a small number to avoid the boundary issue. For IID data, j(1 − j/n) is proportional to the variance of S_X(j), so T_{n1} is a studentized version of T_{n2}. For IID Gaussian data, T_{n1} is equivalent to the likelihood ratio test; see Csörgő and Horváth (1997). Assume that, under the null hypothesis,

\Big\{n^{-1/2}\sum_{i=1}^{\lfloor nt\rfloor}[X_i - E(X_i)]\Big\}_{0\le t\le 1} \Rightarrow \tau\{B_t\}_{0\le t\le 1}, \quad \text{in the Skorohod space,} (2.15)

for a standard Brownian motion {B_t}_{t≥0}. The above convergence requires finite-dimensional convergence and tightness; see Billingsley (1968). By the continuous mapping theorem, T_{n1} ⇒ \max_{c\le t\le 1-c}|B_t − tB_1|/\sqrt{t(1-t)} and T_{n2}/\sqrt{n} ⇒ \max_{c\le t\le 1-c}|B_t − tB_1|.

For the modulated stationary case (1.3), (2.15) is no longer valid. Moreover, since T_{n1} and T_{n2} do not take into account the time-dependent variances σ_i², an abrupt change in the variances may lead to a false rejection of H_0 even when the mean remains constant. For example, our simulation study in Section 3.3 shows that the empirical false rejection probability for T_{n1} and T_{n2} is about 10% at the nominal level 5%. To alleviate the issue of non-constant variances, we adopt the self-normalization approach as in previous sections. Recall F_j and V̄_j in (2.3). For each fixed cn ≤ j ≤ (1 − c)n, by Theorem 2.1 and Slutsky’s theorem, F_j/V̄_j ⇒ N(0, τ²) in distribution, assuming the negligibility of the approximation errors. Therefore, the self-normalization term V̄_j can remove the time-dependent variances. In light of this, we can simultaneously self-normalize the two terms \sum_{i=1}^{j}X_i and \sum_{i=j+1}^{n}X_i in (2.14) and propose the self-normalized test statistic

T_n^{SN} = \max_{cn \le j \le (1-c)n} \hat{\tau}^{-1}|T_n(j)|, \quad \text{where} \quad T_n(j) = \frac{S_X(j)}{\sqrt{(1-j/n)^2\bar{V}_j^2 + (j/n)^2\tilde{V}_j^2}}. (2.16)

Here, V̄_j² is defined as in (2.3), and \tilde{V}_j^2 = \sum_{i=j+1}^{n}(X_i - \tilde{X}_j)^2 with \tilde{X}_j = (n-j)^{-1}\sum_{i=j+1}^{n}X_i.

Theorem 2.3. Assume (2.2) holds. Let δ_n → 0 be as in (2.10). Under H_0, we have \max_{cn\le j\le(1-c)n}|T_n(j) − τ\tilde{T}_n(j)| = O_p(δ_n), where
\tilde{T}_n(j) = \frac{(1-j/n)\sum_{i=1}^{j}\sigma_i(B_i - B_{i-1}) - (j/n)\sum_{i=j+1}^{n}\sigma_i(B_i - B_{i-1})}{\sqrt{(1-j/n)^2\sum_{i=1}^{j}\sigma_i^2 + (j/n)^2\sum_{i=j+1}^{n}\sigma_i^2}}.

By Theorem 2.3, under H_0, T_n^{SN} is asymptotically equivalent to \max_{cn\le j\le(1-c)n}|\tilde{T}_n(j)|. Due to the self-normalization, for each j, the time-dependent variances are removed and \tilde{T}_n(j) ~ N(0, 1) has a standard normal distribution. However, \tilde{T}_n(j) and \tilde{T}_n(j′) are correlated for j ≠ j′. Therefore, {\tilde{T}_n(j)} is a non-stationary Gaussian process with standard normal marginal distributions. Due to the large number of unknown parameters σ_i, it is infeasible to obtain the null distribution directly. On the other hand, Theorem 2.3 establishes the fact that, asymptotically, the distribution of T_n^{SN} in (2.16) depends only on σ_1, …, σ_n and is robust against the dependence structure of {e_i}, which motivates us to use the wild bootstrap method in Section 2.3 to find the critical value of T_n^{SN}.

  1. Compute T_n(j) for cn ≤ j ≤ (1 − c)n and find Ĵ = argmax_{cn≤j≤(1−c)n} |T_n(j)|.

  2. Divide the data into two blocks X_1, …, X_Ĵ and X_{Ĵ+1}, …, X_n. Within each block, subtract the sample mean from the observations therein to obtain centered data. Pool all centered data together and denote them by ε_1, …, ε_n.

  3. Based on ε_1, …, ε_n, obtain an estimate τ̂ of τ; see Section 2.5 below.

  4. Compute the test statistic T_n^{SN} in (2.16).

  5. Based on ε_i in step 2, use the wild bootstrap method in Section 2.3 to generate synthetic data ξ_1, …, ξ_n, and apply steps 1–4 to the bootstrap data ξ_1, …, ξ_n to compute the bootstrap test statistic T_n^b.

  6. Repeat step 5 many times and use the (1 − α) quantile of those T_n^b’s as the critical value.

As argued in Section 2.3, the synthetic data-generating scheme in step 5 inherits the time-varying non-stationarity structure of the original data. Also, the statistic T_n^{SN} is robust against the dependence structure, which justifies the proposed bootstrap method. If H_0 is rejected, the change-point is then estimated by Ĵ = argmax_{cn≤j≤(1−c)n} |T_n(j)| from step 1.
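A minimal sketch of T_n(j) in (2.16) and the change-point estimate Ĵ from step 1 (our code, assuming a consistent estimate tau_hat from Section 2.5; the critical value would come from repeating this on wild bootstrap replicates as in steps 5–6):

```python
import numpy as np

def sn_cusum(x, tau_hat, c=0.1):
    """Self-normalized CUSUM statistic (2.16): each of the two partial sums
    in S_X(j) is normalized by its own within-segment sum of squares."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    lo, hi = int(np.ceil(c * n)), int((1 - c) * n)
    best, j_hat = -np.inf, None
    for j in range(lo, hi + 1):
        left, right = x[:j], x[j:]
        sx = (1 - j / n) * left.sum() - (j / n) * right.sum()   # S_X(j), (2.14)
        v1 = np.sum((left - left.mean()) ** 2)                  # Vbar_j^2
        v2 = np.sum((right - right.mean()) ** 2)                # Vtilde_j^2
        t = abs(sx) / np.sqrt((1 - j / n) ** 2 * v1 + (j / n) ** 2 * v2)
        if t > best:
            best, j_hat = t, j
    return best / tau_hat, j_hat   # (T_n^SN, change-point estimate J-hat)
```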

If there is no evidence to reject H_0, we briefly discuss how to apply the same methodology to test whether there is a change-point in the variances σ_i², that is, the alternative σ_1 = ⋯ = σ_J ≠ σ_{J+1} = ⋯ = σ_n for some unknown change-point J. By (1.1), we have (X_i − μ)² = σ_i² + σ_i²ζ_i, where ζ_i = e_i² − 1 has mean zero. Therefore, testing for a change-point in the variances σ_i² of X_i is equivalent to testing for a change-point in the mean of the new data X̃_i = (X_i − X̄_n)².

2.5. Long-run variance estimation

To apply the results in Sections 2.2–2.4, we need a consistent estimate of the long-run variance τ². Most existing works deal with stationary time series through various block bootstrap and subsampling approaches; see Lahiri (2003) and references therein. Assuming a near-epoch dependent mixing condition, De Jong and Davidson (2000) established the consistency of a kernel estimator of Var(\sum_{i=1}^{n}X_i), and their result can be used to estimate τ_n² in (2.9) for the CLT of √n(X̄_n − μ). However, for the change-point problem in Section 2.4, we need an estimate of the long-run variance τ² of the unobservable process {e_i}, so the method in De Jong and Davidson (2000) is not directly applicable.

To attenuate the non-stationarity issue, we extend the idea in Section 2.2 to blockwise self-normalization. Let kn be the block length. Denote by ℓn = ⌊n/kn⌋ the largest integer not exceeding n/kn. Ignore the boundary and divide 1, …, n into ℓn blocks

\mathcal{I}_j = \{(j-1)k_n + 1, \ldots, jk_n\}, \quad j = 1, \ldots, \ell_n. (2.17)

Recall the overall sample mean X̄_n. For each block ℐ_j, define the self-normalized statistic

D_j = \frac{k_n[\bar{X}(j) - \bar{X}_n]}{V(j)}, \quad \text{where} \quad \bar{X}(j) = \frac{1}{k_n}\sum_{i\in\mathcal{I}_j}X_i, \quad V^2(j) = \sum_{i\in\mathcal{I}_j}[X_i - \bar{X}(j)]^2. (2.18)

By Proposition 2.2, the self-normalized statistics D_1, …, D_{ℓ_n} ~ N(0, τ²) are asymptotically IID. Thus, we propose estimating τ² by

\hat{\tau}^2 = \frac{1}{\ell_n}\sum_{j=1}^{\ell_n}D_j^2. (2.19)

As in (2.4)–(2.5), we define the corresponding quantities on block ℐ_j:

r(j) = |\sigma_{jk_n}| + \sum_{i\in\mathcal{I}_j}|\sigma_i - \sigma_{i-1}| \quad \text{and} \quad r^*(j) = |\sigma_{jk_n}^2| + \sum_{i\in\mathcal{I}_j}|\sigma_i^2 - \sigma_{i-1}^2|, (2.20)
\Sigma^2(j) = \sum_{i\in\mathcal{I}_j}\sigma_i^2 \quad \text{and} \quad \Sigma^{*2}(j) = \Big(\sum_{i\in\mathcal{I}_j}\sigma_i^4\Big)^{1/2}. (2.21)

Theorem 2.4. Let (2.2) hold with Δ_n = n^{1/4} log(n). Recall r_n, Σ_n in (2.4)–(2.5). Define

M_n = \frac{1}{k_n} + \max_{1\le j\le \ell_n}\frac{\Sigma^{*2}(j) + r^*(j)\Delta_n}{\Sigma^2(j)} + \max_{1\le j\le \ell_n}\frac{r(j)\Delta_n}{\Sigma(j)}. (2.22)

Assume that r_nΔ_n/Σ_n → 0 and

\chi_n = n^{-1/2} + \log(n)M_n + \frac{\log(n)\Sigma_n}{n^2}\sum_{j=1}^{\ell_n}\frac{1}{\Sigma(j)} + \frac{\Sigma_n^2}{n^3}\sum_{j=1}^{\ell_n}\frac{1}{\Sigma^2(j)} \to 0. (2.23)

Then τ̂² − τ² = O_p(χ_n). Consequently, τ̂ is a consistent estimate of τ.

Consider Example 2.2 with γ ∈ [0, 1). Then χ_n ≍ log(n)/√n + log²(n)(n^{1/4}/√k_n + n^{5/4−γ}/k_n + √k_n n^{1/4−γ}). For γ ∈ (3/4, 1), it can be shown that the optimal rate is χ_n ≍ n^{−1/8} log^{5/4}(n) when k_n ≍ n^{3/4} log^{3/2}(n). In Example 2.3 with σ_i = i^β for some β ∈ [0, 1), elementary but tedious calculations show that the optimal rate is

\chi_n \asymp \begin{cases} n^{-1/8}\log^{5/4}(n), & k_n \asymp n^{3/4}\log^{3/2}(n), & \beta\in[0,3/4],\\ n^{\frac{\beta-1}{5-4\beta}}\{\log(n)\}^{\frac{8(1-\beta)}{5-4\beta}}, & k_n \asymp n^{\frac{4.5-4\beta}{5-4\beta}}\{\log(n)\}^{\frac{4}{5-4\beta}}, & \beta\in(3/4,1). \end{cases}
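The estimator (2.17)–(2.19) is straightforward to implement. A minimal sketch (ours; boundary observations beyond ℓ_n k_n are ignored, as in the text):

```python
import numpy as np

def tau_hat_blockwise(x, kn):
    """Blockwise self-normalized long-run variance estimate (2.17)-(2.19):
    each block's centered mean is studentized by its own within-block
    sum of squares, and tau^2 is estimated by the average of the D_j^2."""
    x = np.asarray(x, dtype=float)
    ln = len(x) // kn
    xbar = x.mean()                          # overall sample mean
    d = np.empty(ln)
    for j in range(ln):
        block = x[j * kn : (j + 1) * kn]     # block I_j in (2.17)
        v = np.sqrt(np.sum((block - block.mean()) ** 2))   # V(j) in (2.18)
        d[j] = kn * (block.mean() - xbar) / v              # D_j in (2.18)
    return np.sqrt(np.mean(d ** 2))          # tau-hat from (2.19)
```

This function can also serve as the tau_hat_fn argument in the bootstrap sketches of Section 2.3, e.g. by fixing the block length via lambda z: tau_hat_blockwise(z, 10).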

2.6. Some possible extensions

The self-normalization approaches in Sections 2.2–2.5 can be extended to the linear regression model (1.4) with modulated stationary time series errors. The approach in Phillips, Sun and Jin (2007) is not applicable here due to the non-stationarity. For simplicity, we consider the simple case p = 2, U_i = (1, i/n), and β = (β_0, β_1)′. Hansen (1995) studied a similar setting for martingale difference errors. Denote by β̂_0 and β̂_1 the simple linear regression estimates of β_0 and β_1 given by

\hat{\beta}_1 = \frac{n\sum_{i=1}^{n}iX_i - \sum_{i=1}^{n}i\sum_{i=1}^{n}X_i}{\sum_{i=1}^{n}i^2 - \big(\sum_{i=1}^{n}i\big)^2/n} \quad \text{and} \quad \hat{\beta}_0 = \bar{X}_n - \hat{\beta}_1(n+1)/(2n). (2.24)

Then simple algebra shows that

\hat{\beta}_0 - \beta_0 = \frac{2}{n^2 - n}\sum_{i=1}^{n}(2n - 3i + 1)\sigma_i e_i, \qquad \hat{\beta}_1 - \beta_1 = \frac{6}{n^2 - 1}\sum_{i=1}^{n}(2i - n - 1)\sigma_i e_i.

The latter expressions are linear combinations of {ei}. Thus, by the same argument in Proposition 2.2 and Theorem 2.1, we have self-normalized CLTs for β̂0 and β̂1.

Theorem 2.5. Let s_{i,0} = (2n − 3i + 1)σ_i and s_{i,1} = (2i − n − 1)σ_i. Assume that {s_{i,0}}_{1≤i≤n} and {s_{i,1}}_{1≤i≤n} satisfy condition (2.10). Then as n → ∞,

\frac{n^2(\hat{\beta}_0 - \beta_0)}{2V_{n,0}} \Rightarrow N(0, \tau^2), \quad \text{where} \quad V_{n,0}^2 = \sum_{i=1}^{n}(2n - 3i + 1)^2(X_i - \hat{\beta}_0 - \hat{\beta}_1 i/n)^2,
\frac{n^2(\hat{\beta}_1 - \beta_1)}{6V_{n,1}} \Rightarrow N(0, \tau^2), \quad \text{where} \quad V_{n,1}^2 = \sum_{i=1}^{n}(2i - n - 1)^2(X_i - \hat{\beta}_0 - \hat{\beta}_1 i/n)^2.

The long-run variance τ2 can be estimated using the idea of blockwise self-normalization in Section 2.5. Let kn, ℓn and ℐj be defined as in Section 2.5. Then we propose

\hat{\tau}^2 = \frac{1}{\ell_n}\sum_{j=1}^{\ell_n}D_j^2, \quad \text{where} \quad D_j = \frac{\sum_{i\in\mathcal{I}_j}(X_i - \hat{\beta}_0 - \hat{\beta}_1 i/n)}{\sqrt{\sum_{i\in\mathcal{I}_j}(X_i - \hat{\beta}_0 - \hat{\beta}_1 i/n)^2}}. (2.25)

Here, D_1, …, D_{ℓ_n} are asymptotically IID normal random variables with mean zero and variance τ². Consistency can be established under conditions similar to those in Theorem 2.4.
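A sketch of (2.25) for the simple case p = 2 (our illustration; numpy's polyfit performs the least-squares fit, which agrees with (2.24)):

```python
import numpy as np

def tau_hat_regression(x, kn):
    """Blockwise self-normalization (2.25): fit the trend beta0 + beta1*(i/n)
    by least squares, then self-normalize the residuals block by block."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    t = np.arange(1, n + 1) / n
    b1, b0 = np.polyfit(t, x, 1)             # slope beta1-hat, intercept beta0-hat
    resid = x - b0 - b1 * t                  # X_i - beta0-hat - beta1-hat * i/n
    ln = n // kn
    d = np.empty(ln)
    for j in range(ln):
        r = resid[j * kn : (j + 1) * kn]
        d[j] = r.sum() / np.sqrt(np.sum(r ** 2))   # D_j in (2.25)
    return np.sqrt(np.mean(d ** 2))
```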

For the general linear regression model (1.4), the linearly weighted average structure of linear regression estimates allows us to obtain self-normalized CLTs as in Theorem 2.5 under more complicated conditions. Also, it is possible to extend the proposed method to the nonparametric regression model with time-varying variances

Xi=f(i/n)+σiei, (2.26)

where f(·) is a nonparametric time trend of interest. Nonparametric estimates, for example the Nadaraya-Watson estimate, are usually based on locally weighted observations. The latter feature allows us to derive a similar self-normalized CLT. However, the change-point problem for (1.4) and (2.26) will be more challenging; Aue et al. (2008a) studied (1.4) for uncorrelated errors with constant variance. Also, it is more difficult to address the bandwidth selection issue; see Altman (1990) for a related contribution when σ_i ≡ 1. It remains a direction of future research to investigate (1.4) and (2.26).

3. Simulation study

3.1. Selection of block length kn for τ̂

Recall that D_1, …, D_{ℓ_n} in (2.25) are asymptotically IID normal random variables. To get a sensible choice of the block length parameter k_n, we propose a simulation-based method that minimizes the empirical mean squared error (MSE).

  1. Simulate n IID standard normal random variables Z1, …, Zn.

  2. Based on Z1, …, Zn, obtain τ̂ with block length k.

  3. Repeat steps 1–2 many times, compute the empirical MSE(k) as the average of the realizations of (τ̂ − 1)², and find the optimal k by minimizing MSE(k).

We find that the optimal block length k is about 12 for n = 120, about 15 for n = 240, about 20 for n = 360, 600, and about 25 for n = 1200.
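A sketch of this selection rule (ours; tau_hat_fn(data, k) stands for the blockwise estimator, e.g. the tau_hat_blockwise sketch in Section 2.5, and the grid and replication count are illustrative):

```python
import numpy as np

def select_block_length(n, k_grid, tau_hat_fn, n_rep=500, seed=None):
    """Simulation-based choice of k_n: for IID N(0,1) data the true long-run
    variance is 1, so pick the k minimizing the empirical MSE of tau-hat."""
    rng = np.random.default_rng(seed)
    mse = []
    for k in k_grid:
        errs = [(tau_hat_fn(rng.standard_normal(n), k) - 1.0) ** 2
                for _ in range(n_rep)]       # steps 1-2, repeated
        mse.append(np.mean(errs))            # empirical MSE(k)
    return k_grid[int(np.argmin(mse))]       # step 3: the minimizer
```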

3.2. Empirical coverage probabilities

Let the sample size be n = 120. Recall e_i and σ_i in (1.1). For σ_i, consider four choices:

  • A1 : σ_i = 0.2·1_{i≤n/2} + 0.6·1_{i>n/2},

  • A2 : σ_i = 0.2{1 + cos²(i/n^{4/5})},

  • A3 : σ_i = 0.2 + 0.1 log(1 + |i − n/2|),

  • A4 : σ_i = 0.3 + ϕ(i/60),

where ϕ is the standard normal density and 1 is the indicator function. The sequences A1–A4 exhibit different patterns: piecewise constant for A1, a cosine shape for A2, a sharp change around time n/2 for A3, and a gradual downtrend for A4. Let ε_i be IID N(0,1). For e_i, we consider both linear and nonlinear processes:

  • B1 : e_i = {η_i − E(η_i)}/√Var(η_i), where η_i = θ|η_{i−1}| + √(1 − θ²) ε_i, |θ| < 1.

  • B2 : e_i = \sum_{j=0}^{∞} a_j ε_{i−j}, where a_j = (j+1)^{−β}/\{\sum_{j=0}^{∞}(j+1)^{−2β}\}^{1/2}, β > 1/2.

For B1, by Wu (2007), (2.8) holds. By Andel, Netuka and Svara (1984), E(η_i) = θ√(2/π) and Var(η_i) = 1 − 2θ²/π. To examine how the strength of dependence affects the performance, we consider θ = 0, 0.4, 0.8, representing independence, intermediate dependence, and strong dependence, respectively. For B2 with β > 2, (2.2) holds with Δ_n = n^{1/4} log(n), and we consider the three cases β = 2.1, 3, 4. To assess the effect of the block length k_n, three choices k_n = 8, 10, 12 are used. Thus, we consider all 72 combinations of {A1, A2, A3, A4} × {B1, θ = 0, 0.4, 0.8; B2, β = 2.1, 3, 4} × {k_n = 8, 10, 12}.
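For concreteness, the following hedged sketch generates one series under the design A1 × B1 (our code; the burn-in length is an arbitrary device to approximate stationarity of the B1 recursion):

```python
import numpy as np

def simulate_a1_b1(n=120, theta=0.4, seed=None):
    """One path of X_i = sigma_i * e_i with A1 variances and B1 errors."""
    rng = np.random.default_rng(seed)
    burn = 200                                   # burn-in (our choice)
    eps = rng.standard_normal(n + burn)
    eta = np.zeros(n + burn)
    for i in range(1, n + burn):                 # B1: eta_i = theta|eta_{i-1}| + sqrt(1-theta^2) eps_i
        eta[i] = theta * abs(eta[i - 1]) + np.sqrt(1 - theta ** 2) * eps[i]
    mean_eta = theta * np.sqrt(2 / np.pi)        # E(eta_i), Andel et al. (1984)
    var_eta = 1 - 2 * theta ** 2 / np.pi         # Var(eta_i)
    e = (eta[burn:] - mean_eta) / np.sqrt(var_eta)
    sigma = np.where(np.arange(1, n + 1) <= n / 2, 0.2, 0.6)   # A1 variances
    return sigma * e                             # X_i with mu = 0
```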

Without loss of generality, we examine coverage probabilities based on 10³ realized confidence intervals for μ = 0 in (1.1). We compare our self-normalization-based confidence intervals to some stationarity-based methods. For (1.1), if we pretend that the error process {ẽ_i = σ_i e_i} is stationary, then we can use (1.2) to construct an asymptotic confidence interval for μ. Under stationarity, the long-run variance τ² of {ẽ_i} can be similarly estimated through the block method in Section 2.5 by using the non-normalized version D_j = √k_n [X̄(j) − X̄_n] in (2.25); see Lahiri (2003). Thus, we compare two self-normalization-based methods and three stationarity-based alternatives: self-normalization-based confidence intervals through the asymptotic theory in Proposition 2.2 (SN) and the wild bootstrap (WB) in Section 2.3; and stationarity-based confidence intervals through the asymptotic theory (1.2) (ST), the non-overlapping block bootstrap (BB), and the studentized non-overlapping block bootstrap (SBB) in Section 2.3. From the results in Table 1, we see that the coverage probabilities of the proposed self-normalization-based methods (columns SN and WB) are close to the nominal level 95% for almost all cases considered. By contrast, the stationarity-based methods (columns ST, BB and SBB) suffer from substantial undercoverage, especially when the dependence is strong (θ = 0.8 in Table 1 (a) and β = 2.1 in Table 1 (b)). Of the two self-normalization-based methods, WB slightly outperforms SN.

Table 1.

Coverage probabilities (in percentage) for μ in (1.1) with ei from B1 [Table 1 (a)] and B2 [Table 1 (b)]. Nominal level is 95%. SN and WB denote self-normalization-based confidence intervals using asymptotic theory in Proposition 2.2 and the wild bootstrap procedure, respectively; ST, BB, SBB denote stationarity-based confidence intervals using asymptotic theory in (1.2), non-overlapping block bootstrap, and studentized non-overlapping block bootstrap, respectively.

(a): Model B1

θ kn σi SN WB ST BB SBB σi SN WB ST BB SBB
0.0 8 98.0 94.7 93.1 92.2 92.8 96.6 95.2 92.3 92.5 92.5
10 A1 98.2 95.0 92.6 92.4 92.2 A2 94.6 94.6 90.0 89.5 89.4
12 98.1 95.6 91.7 91.4 91.1 92.1 93.7 89.7 89.5 89.6

8 96.4 95.0 92.5 92.3 92.0 96.6 95.6 93.1 92.6 93.0
10 A3 94.7 94.7 90.8 90.6 90.6 A4 95.1 95.1 91.4 91.3 91.3
12 93.7 94.8 90.8 90.4 90.5 92.9 93.7 89.8 89.7 89.5

0.4 8 98.7 95.9 92.7 92.6 92.9 96.6 95.3 92.5 92.4 92.0
10 A1 98.5 95.7 92.8 92.7 92.3 A2 95.4 95.4 91.6 91.1 91.6
12 98.0 95.0 90.8 90.8 90.2 92.5 94.0 89.4 89.1 89.4

8 96.6 95.2 91.7 91.7 91.6 95.4 94.1 90.8 90.9 90.6
10 A3 95.3 95.5 91.5 91.3 91.5 A4 95.0 94.8 91.2 90.7 90.8
12 93.1 94.6 90.2 89.9 89.9 94.1 95.1 90.3 89.8 90.1

0.8 8 97.9 94.6 87.8 86.8 87.3 96.1 94.7 87.2 87.3 87.0
10 A1 97.6 95.5 87.3 87.0 86.7 A2 93.3 92.9 86.4 86.8 86.1
12 97.3 94.0 85.8 85.5 85.1 92.6 93.4 86.5 86.4 86.4

8 94.8 93.5 85.7 85.7 86.0 95.5 94.7 86.3 86.1 86.1
10 A3 93.5 93.8 85.7 85.5 85.2 A4 95.3 95.1 88.5 88.3 88.5
12 92.4 93.3 87.2 86.7 86.9 92.6 94.2 86.3 85.8 85.7
(b): Model B2

β kn σi SN WB ST BB SBB σi SN WB ST BB SBB
4.0 8 97.6 94.9 91.8 91.4 91.9 95.9 94.2 91.9 92.0 91.1
10 A1 97.7 93.2 88.9 88.1 88.3 A2 95.7 95.7 92.1 91.8 92.1
12 97.9 95.5 90.7 90.2 90.0 93.3 94.6 90.0 89.9 89.7

8 94.6 93.3 89.8 89.5 89.5 95.6 94.7 91.3 91.7 91.0
10 A3 95.1 95.2 91.6 91.4 91.5 A4 95.4 95.9 92.8 92.2 93.0
12 93.8 95.4 90.8 90.6 90.2 93.9 94.9 88.9 88.5 88.6

3.0 8 99.1 95.7 91.1 91.0 91.2 95.8 94.6 90.4 89.8 90.1
10 A1 98.5 96.4 91.6 90.9 91.1 A2 95.6 95.2 92.1 91.9 91.5
12 97.9 94.6 89.6 89.3 89.0 94.1 95.0 90.5 90.2 90.4

8 95.9 94.6 92.0 91.9 91.7 96.0 94.5 90.6 90.4 90.3
10 A3 94.3 94.4 90.0 89.9 89.8 A4 94.3 94.4 89.2 89.3 88.9
12 93.2 94.5 88.9 88.6 88.7 93.1 94.1 89.6 88.9 88.8

2.1 8 97.1 92.5 86.2 86.2 85.5 95.7 93.8 88.9 89.0 88.7
10 A1 97.6 94.7 89.2 88.9 88.6 A2 93.5 93.6 88.8 88.8 88.4
12 97.2 95.1 87.9 87.5 87.7 92.6 93.9 88.0 87.6 87.7

8 94.0 93.7 88.5 88.4 88.3 95.0 93.1 88.8 88.7 88.6
10 A3 93.3 93.8 88.1 87.9 87.8 A4 94.1 94.2 89.1 88.8 89.1
12 92.9 94.7 89.1 88.4 88.4 91.5 92.6 87.7 87.5 87.5

3.3. Size and power study

In (1.3), we use the same settings for σ_i and e_i as in Section 3.2. For the mean μ_i, we consider μ_i = λ·1_{i>40}, λ ≥ 0, and compare the test statistics T_{n1}, T_{n2} in (2.13) and T_n^{SN} in (2.16). First, we compare their sizes under the null hypothesis λ = 0. The critical value of T_n^{SN} is obtained using the wild bootstrap in Section 2.4; for T_{n1} and T_{n2}, the critical values are based on the block bootstrap in Section 2.3. In each case, we use 10³ bootstrap samples, nominal level 5%, and block length k_n = 10, and summarize the empirical sizes (under the null λ = 0) in Table 2 based on 10³ realizations. While T_n^{SN} has size close to 5%, T_{n1} and T_{n2} tend to over-reject the null, and their false rejection probabilities can be three times the nominal level of 5%. Next, we compare the size-adjusted power. Instead of using the bootstrap methods to obtain critical values, we use the 95% quantiles of 10⁴ realizations of the test statistics when data are simulated directly from the null model, so that the empirical size is exactly 5%. Figure 1 presents the power curves for the combinations {A1–A4} × {B1 with θ = 0.4; B2 with β = 3.0}, with 10³ realizations each. For A1, T_n^{SN} outperforms T_{n1} and T_{n2}; for A2–A4, there is a moderate loss of power for T_n^{SN}. Overall, T_n^{SN} has power comparable to the other two tests. In practice, however, the null model is unknown, and when one turns to the bootstrap method to obtain the critical values, the usual CUSUM tests T_{n1} and T_{n2} will likely over-reject the null, as shown in Table 2. In summary, with such a small sample size and a complicated time-varying variance structure, T_n^{SN} along with the wild bootstrap method delivers reasonably good power with size close to the nominal level.

Table 2.

Size (in percentage) comparison of Tn1 and Tn2 in (2.13) and TnSN in (2.16), with sample size n = 120, nominal level 5%, and block length kn = 10.

          Model B1                      Model B2
σi    θ     TnSN    Tn1    Tn2      β     TnSN    Tn1    Tn2
0.0 4.9 9.1 8.4 2.1 7.3 12.2 13.4
A1 0.4 4.7 9.4 9.6 3.0 4.7 8.6 9.2
0.8 6.0 15.1 14.7 4.0 5.6 9.9 7.7

0.0 5.7 8.2 6.1 2.1 5.8 9.5 8.6
A2 0.4 6.1 8.9 6.8 3.0 5.3 9.6 6.8
0.8 7.3 12.6 9.3 4.0 4.2 7.5 4.2

0.0 5.0 5.7 4.8 2.1 5.5 7.7 6.7
A3 0.4 5.3 6.9 5.4 3.0 5.8 6.1 4.9
0.8 7.0 9.8 10.0 4.0 5.0 6.5 4.2

0.0 5.4 8.4 6.0 2.1 6.9 8.8 7.1
A4 0.4 5.7 7.9 5.2 3.0 4.8 6.6 6.3
0.8 7.2 11.1 9.2 4.0 5.3 6.2 5.8

Figure 1.


Size-adjusted power curves for Tn1 (dashed curve) and Tn2 (dotdash curve) in (2.13) and TnSN (solid curve) in (2.16) as functions of change size λ (horizontal axis) with sample size n = 120 and block length kn = 10. For (A1,B1)–(A4,B1), the error process {ei} is from B1 with θ = 0.4; for (A1,B2)–(A4,B2), the error process {ei} is from B2 with β = 3.0.

Finally, we point out that the proposed self-normalization-based methods are not robust to models with time-varying correlation structures. For example, consider the model e_i = 0.3e_{i−1} + ε_i for 1 ≤ i ≤ 60 and e_i = 0.8e_{i−1} + ε_i for 61 ≤ i ≤ n, where ε_i are IID N(0,1). With k_n = 10, the sizes (nominal level 5%) of the three tests T_n^{SN}, T_{n1}, T_{n2} are 0.154, 0.196, 0.223 for A1. Future research directions include (i) developing tests for changes in the variance or covariance structure for (1.1) [see Inclán and Tiao (1994), Aue et al. (2009) and Berkes, Gombay and Horváth (2009) for related contributions]; and (ii) developing methods that are robust to changes in correlations.

4. Applications to two real data sets

4.1. Annual mean precipitation in Seoul during 1771–2000

The data set consists of annual mean precipitation rates in Seoul during 1771–2000; see Figure 2 for a plot. The mean levels seem to be different for the two time periods 1771–1880 and 1881–2000. Ha and Ha (2006) assumed that the observations are IID under the null hypothesis. As shown in Figure 2, the variations change over time. Also, the autocorrelation function plot (not reported here) indicates strong dependence up to lag 18. Therefore, it is more reasonable to apply our self-normalization-based test, which is tailored to deal with modulated stationary processes. With sample size n = 230, by the method in Section 3.1, the optimal block length is about 15. Based on 10⁵ bootstrap samples as described in Section 2.4, we obtain the corresponding p-values 0.016, 0.005, 0.045, 0.007, with block length k_n = 12, 14, 16, 18, respectively. For all choices of k_n, there is compelling evidence that a change-point occurred at year 1880. While our result is consistent with that of Ha and Ha (2006), our modulated stationary time series framework seems to be more reasonable. Denote by μ_1 and μ_2 the mean levels over the pre-change and post-change time periods 1771–1880 and 1881–2000. For the two sub-periods with sample sizes 110 and 120, the optimal block length is about 12. With k_n = 12, applying the wild bootstrap in Section 2.3 with 10⁵ bootstrap samples, we obtain the 95% confidence intervals [121.7, 161.3] for μ_1 and [100.9, 114.3] for μ_2. For the difference μ_1 − μ_2, with optimal block length k_n = 15, the 95% wild bootstrap confidence interval is [19.6, 48.2]. Note that the latter confidence interval for μ_1 − μ_2 does not cover zero, which provides further evidence for μ_1 ≠ μ_2 and the existence of a change-point at year 1880.

Figure 2.


Annual mean precipitation in Seoul during 1771–2000.

4.2. Quarterly U.S. GNP growth rates during 1947–2002

The data set consists of quarterly U.S. Gross National Product (GNP) growth rates from the first quarter of 1947 to the third quarter of 2002; see Section 3.8 in Shumway and Stoffer (2006) for a stationary autoregressive model approach. However, the plot in Figure 3 suggests a non-stationary pattern: the variation becomes smaller after year 1985 whereas the mean level remains constant. Moreover, the stationarity test in Kwiatkowski et al. (1992) provides fairly strong evidence for non-stationarity with a p-value of 0.088. With block length k_n = 12, 14, 16, 18, we obtain the corresponding p-values 0.853, 0.922, 0.903, 0.782, and hence there is no evidence to reject the null hypothesis of a constant mean μ. Based on k_n = 15, the 95% wild bootstrap confidence interval for μ is [0.66%, 1.00%]. To test whether there is a change-point in the variance, by the discussion in the last paragraph of Section 2.4, we consider X̃_i = (X_i − X̄_n)². With k_n = 12, 14, 16, 18, the corresponding p-values are 0.001, 0.006, 0.001, 0.010, indicating strong evidence for a change-point in the variance at year 1984. In summary, we conclude that there is no change-point in the mean level, but there is a change-point in the variance at year 1984.

Figure 3.


Quarterly U.S. GNP growth rates during 1947–2002.

Acknowledgements

We are grateful to the associate editor and three anonymous referees for their insightful comments that have significantly improved this paper. We also thank Amanda Applegate for help on improving the presentation and Kyung-Ja Ha for providing us the Seoul precipitation data. Zhao’s research was partially supported by NIDA grant P50-DA10075. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIDA or the NIH.

Appendix: Proofs

Proof of Theorem 2.1. Let r_j = |σ_j| + \sum_{i=2}^{j}|σ_i − σ_{i−1}|. By the triangle inequality, we have r_j ≤ r_n. Recall S_i in (2.1). By the summation by parts formula, (2.6) follows via

F_j = \sum_{i=1}^{j}\sigma_i(S_i - S_{i-1}) = \sigma_j S_j + \sum_{i=1}^{j-1}(\sigma_i - \sigma_{i+1})S_i = \sigma_j\tau B_j + \sum_{i=1}^{j-1}(\sigma_i - \sigma_{i+1})\tau B_i + O_{a.s.}(r_n\Delta_n) = \tau\sum_{i=1}^{j}\sigma_i(B_i - B_{i-1}) + O_{a.s.}(r_n\Delta_n). (5.1)

By Kolmogorov’s maximal inequality for independent random variables, for δ > 0,

\mathbb{P}\Big\{\max_{1\le j\le n}\Big|\sum_{i=1}^{j}\sigma_i(B_i - B_{i-1})\Big| \ge \delta\Sigma_n\Big\} \le (\delta\Sigma_n)^{-2}E\Big[\Big\{\sum_{i=1}^{n}\sigma_i(B_i - B_{i-1})\Big\}^2\Big] = \delta^{-2}. (5.2)

Thus, by (5.1), \max_{1\le j\le n}|F_j| = O_p(\Sigma_n + r_n\Delta_n). Observe that

\bar{V}_j^2 - \Sigma_j^2 = W_j - F_j^2/j, \quad \text{where} \quad W_j = \sum_{i=1}^{j}\sigma_i^2(e_i^2 - 1). (5.3)

By (2.2), the same argument as in (5.1) and (5.2) shows W_j = O_p(Σ_n^{*2} + r_n^*Δ_n), uniformly. The desired result then follows via (5.3).

Proof of Theorem 2.2. Denote by Φ(x) the standard normal distribution function. By Proposition 2.2 and Slutsky’s theorem, ℙ(H_n/τ̂ ≤ x) → Φ(x) for each fixed x ∈ ℝ. Since Φ(x) is a continuous distribution function, \sup_{x∈ℝ} |ℙ(H_n/τ̂ ≤ x) − Φ(x)| → 0. It remains to prove \sup_x |ℙ*(H_n^* ≤ x) − Φ(x)| → 0, in probability. Notice that, conditional on {e_i}, the {ξ_i} are independent random variables with zero mean. By the Berry-Esséen bound in Bentkus, Bloznelis and Götze (1996), there exists a finite constant c such that

\sup_x\big|\mathbb{P}^*(H_n^* \le x) - \Phi(x)\big| \le c\,\frac{\sum_{i=1}^{n}E^*(|\xi_i|^3)}{\{\sum_{i=1}^{n}E^*(|\xi_i|^2)\}^{3/2}}, (5.4)

where E^* denotes the conditional expectation given {e_i}. Clearly, E^*(|ξ_i|²) = σ_i²e_i²E(α_1²) and E^*(|ξ_i|³) = σ_i³|e_i|³E(|α_1|³). Thus, under the assumption e_i ∈ ℒ³, we have \sum_{i=1}^{n}E^*(|ξ_i|³) = O_p(\sum_{i=1}^{n}σ_i³). Meanwhile, by the proof of Theorem 2.1, \sum_{i=1}^{n}E^*(|ξ_i|²) = \sum_{i=1}^{n}σ_i²e_i² = {1 + o_p(1)}\sum_{i=1}^{n}σ_i². Therefore, the desired result follows from (5.4) in view of (2.11).

Proof of Theorem 2.3. For cn ≤ j ≤ (1 − c)n, c ≤ 1 − j/n and j/n ≤ 1 − c. For S_X(j) in (2.14), by (2.6), we have \max_{cn≤j≤(1−c)n} |S_X(j) − τ\tilde{S}_X(j)| = O_{a.s.}(r_nΔ_n), where

\tilde{S}_X(j) = \Big(1 - \frac{j}{n}\Big)\sum_{i=1}^{j}\sigma_i(B_i - B_{i-1}) - \frac{j}{n}\sum_{i=j+1}^{n}\sigma_i(B_i - B_{i-1}).

By (2.7), \max_{cn\le j\le(1-c)n}\big|(1-j/n)^2\bar{V}_j^2 + (j/n)^2\tilde{V}_j^2 - V_j^2\big| = O_p(\varpi_n), where

V_j^2 = (1-j/n)^2\sum_{i=1}^{j}\sigma_i^2 + (j/n)^2\sum_{i=j+1}^{n}\sigma_i^2 \quad \text{and} \quad \varpi_n = (r_n^2\Delta_n^2 + \Sigma_n^2)/n + \Sigma_n^{*2} + r_n^*\Delta_n.

For cn ≤ j ≤ (1 − c)n, V_j² ≥ c²Σ_n². Thus, condition (2.10) implies ϖ_n = o(V_j²) and {V_j² + O_p(ϖ_n)}^{1/2} = V_j + O_p(ϖ_n/V_j). Therefore, uniformly over cn ≤ j ≤ (1 − c)n,

T_n(j) - \tau\tilde{T}_n(j) = \frac{\tau\tilde{S}_X(j) + O_{a.s.}(r_n\Delta_n)}{V_j + O_p(\varpi_n/V_j)} - \frac{\tau\tilde{S}_X(j)}{V_j} = O_p\Big\{\frac{r_n\Delta_n}{V_j} + \frac{\varpi_n\tilde{S}_X(j)}{V_j^3}\Big\}.

By (5.2), \max_j |\tilde{S}_X(j)| = O_p(Σ_n). Thus, the result follows in view of V_j ≥ cΣ_n.

Proof of Theorem 2.4. The condition M_n → 0 implies \max_{1\le j\le \ell_n} r(j)Δ_n/Σ(j) → 0. By (2.7),

\omega_j := \frac{V^2(j)}{\Sigma^2(j)} - 1 = O_p\Big\{\frac{\Sigma^{*2}(j) + r^*(j)\Delta_n}{\Sigma^2(j)} + \frac{1}{k_n}\Big\} = O_p(M_n) \to 0. (5.5)

Define U_j = Σ^{−1}(j)\sum_{i∈ℐ_j} σ_i(B_i − B_{i−1}). Clearly, U_1, …, U_{ℓ_n} are independent standard normal random variables. Thus, \max_{1\le j\le \ell_n}|U_j| = O_p\{\sqrt{\log(\ell_n)}\} = O_p\{\sqrt{\log(n)}\}. By (2.6), X̄_n − μ = O_p{(Σ_n + r_nΔ_n)/n} = O_p(Σ_n/n). Recall the definition of D_j in (2.18). By the same argument as in (2.6), using \sqrt{1+x} = 1 + O(x) as x → 0, we have

D_j = \frac{k_n\{\bar{X}(j) - \mu\}}{\Sigma(j)\sqrt{1+\omega_j}} + \frac{k_n(\mu - \bar{X}_n)}{\Sigma(j)\sqrt{1+\omega_j}} = \Big[\tau U_j + O_{a.s.}\Big\{\frac{r(j)\Delta_n}{\Sigma(j)}\Big\}\Big]\{1 + O(\omega_j)\} + O_p\Big\{\frac{k_n\Sigma_n}{n\Sigma(j)}\Big\} = \tau U_j + O_p\Big\{\sqrt{\log(n)}\,M_n + \frac{k_n\Sigma_n}{n\Sigma(j)}\Big\}.

By the latter expression and log(n)M_n → 0, we can easily verify τ̂² − τ² = O_p(χ_n).

References

  1. Adak S. Time-dependent spectral analysis of nonstationary time series. J. Amer. Statist. Assoc. 1998;93:1488–1501.
  2. Altman NS. Kernel smoothing of data with correlated errors. J. Amer. Statist. Assoc. 1990;85:749–759.
  3. Andel J, Netuka I, Svara K. On threshold autoregressive processes. Kybernetika. 1984;20:89–106.
  4. Andrews DWK. Tests for parameter instability and structural change with unknown change point. Econometrica. 1993;61:821–856.
  5. Aue A, Hörmann S, Horváth L, Reimherr M. Break detection in the covariance structure of multivariate time series models. Ann. Statist. 2009;37:4046–4087.
  6. Aue A, Horváth L, Hušková M, Kokoszka P. Testing for changes in polynomial regression. Bernoulli. 2008a;14:637–660.
  7. Aue A, Horváth L, Kokoszka P, Steinebach J. Monitoring shifts in mean: asymptotic normality of stopping times. Test. 2008b;17:515–530.
  8. Bai J, Perron P. Estimating and testing linear models with multiple structural changes. Econometrica. 1998;66:47–78.
  9. Bentkus V, Bloznelis M, Götze F. A Berry-Esséen bound for Student’s statistic in the non i.i.d. case. J. Theoret. Probab. 1996;9:765–796.
  10. Berkes I, Gombay E, Horváth L. Testing for changes in the covariance structure of linear processes. J. Statist. Plann. Inference. 2009;139:2044–2063.
  11. Billingsley P. Convergence of Probability Measures. New York: Wiley; 1968.
  12. Bühlmann P. Bootstraps for time series. Stat. Sci. 2002;17:52–72.
  13. Carlstein E. The use of subseries values for estimating the variance of a general statistic from a stationary sequence. Ann. Statist. 1986;14:1171–1179.
  14. Csörgő M, Horváth L. Limit Theorems in Change-point Analysis. New York: Wiley; 1997.
  15. Dahlhaus R. Fitting time series models to nonstationary processes. Ann. Statist. 1997;25:1–37.
  16. Dahlhaus R, Polonik W. Empirical spectral processes for locally stationary time series. Bernoulli. 2009;15:1–39.
  17. Davidson R, Flachaire E. The wild bootstrap, tamed at last. J. Econometrics. 2008;146:162–169.
  18. De Jong RM, Davidson J. Consistency of kernel estimators of heteroscedastic and autocorrelated covariance matrices. Econometrica. 2000;68:407–423.
  19. Efron B. Bootstrap methods: Another look at the jackknife. Ann. Statist. 1979;7:1–26.
  20. Fan J, Yao Q. Nonlinear Time Series: Nonparametric and Parametric Methods. New York: Springer-Verlag; 2003.
  21. Götze F, Künsch HR. Second order correctness of the blockwise bootstrap for stationary observations. Ann. Statist. 1996;24:1914–1933.
  22. Ha K-J, Ha E. Climatic change and interannual fluctuations in the long-term record of monthly precipitation for Seoul. Int. J. Climatol. 2006;26:607–618.
  23. Hansen B. Regression with non-stationary volatility. Econometrica. 1995;63:1113–1132.
  24. Hansen B. Testing for structural change in conditional models. J. Econometrics. 2000;97:93–115.
  25. Horváth L. The maximum likelihood method for testing changes in the parameters of normal observations. Ann. Statist. 1993;21:671–680.
  26. Inclán C, Tiao GC. Use of cumulative sums of squares for retrospective detection of changes of variance. J. Amer. Statist. Assoc. 1994;89:913–923.
  27. Künsch HR. The jackknife and the bootstrap for general stationary observations. Ann. Statist. 1989;17:1217–1241.
  28. Kwiatkowski D, Phillips PCB, Schmidt P, Shin Y. Testing the null hypothesis of stationarity against the alternative of a unit root. J. Econometrics. 1992;54:159–178.
  29. Lahiri SN. Resampling Methods for Dependent Data. New York: Springer-Verlag; 2003.
  30. Liu RY. Bootstrap procedures under some non-I.I.D. models. Ann. Statist. 1988;16:1696–1708.
  31. Müller UK. A theory of robust long-run variance estimation. J. Econometrics. 2007;141:1331–1352.
  32. Pettitt A. A simple cumulative sum type statistic for the change-point problem with zero-one observations. Biometrika. 1980;67:79–84.
  33. Phillips PCB, Sun YX, Jin SN. Long run variance estimation and robust regression testing using sharp origin kernels with no truncation. J. Statist. Plann. Inference. 2007;137:985–1023.
  34. Politis D, Romano J. Large sample confidence regions based on subsamples under minimal assumptions. Ann. Statist. 1994;22:2031–2050.
  35. Robbins MW, Lund RB, Gallagher CM, Lu Q. Changepoints in the North Atlantic tropical cyclone record. J. Amer. Statist. Assoc. 2011;106:89–99.
  36. Shao QM. Almost sure invariance principles for mixing sequences of random variables. Stoch. Proc. Appl. 1993;48:319–334.
  37. Shao X, Zhang X. Testing for change points in time series. J. Amer. Statist. Assoc. 2010;105:1228–1240.
  38. Shumway RH, Stoffer DS. Time Series Analysis and its Applications with R Examples. 2nd edn. New York: Springer; 2006.
  39. Wu CFJ. Jackknife, bootstrap and other resampling methods in regression analysis. Ann. Statist. 1986;14:1261–1295.
  40. Wu WB. Strong invariance principles for dependent random variables. Ann. Probab. 2007;35:2294–2320.
  41. Zhao Z. A self-normalized confidence interval for the mean of a class of non-stationary processes. Biometrika. 2011;98:81–90. doi: 10.1093/biomet/asq076.
