. 2024 Mar 1;26(3):226. doi: 10.3390/e26030226

A Blockwise Bootstrap-Based Two-Sample Test for High-Dimensional Time Series

Lin Yang 1
Editor: Boris Ryabko
PMCID: PMC10969679  PMID: 38539738

Abstract

We propose a two-sample testing procedure for high-dimensional time series. To obtain the asymptotic distribution of our ℓ∞-type test statistic under the null hypothesis, we establish high-dimensional central limit theorems (HCLTs) for an α-mixing sequence. Specifically, we derive two HCLTs for the maximum of a sum of high-dimensional α-mixing random vectors under the assumptions of bounded finite moments and exponential tails, respectively. The proposed HCLT for α-mixing sequences under the bounded finite moments assumption is novel, and, in comparison with existing results, we improve the convergence rate of the HCLT under the exponential tails assumption. To compute the critical value, we employ the blockwise bootstrap method. Importantly, our approach does not require the independence of the two samples, making it applicable to detecting change points in high-dimensional time series. Numerical results demonstrate the effectiveness and advantages of our method.

Keywords: two-sample testing, high-dimensional time series, α-mixing, Gaussian approximation, blockwise bootstrap

1. Introduction

A fundamental testing problem in multivariate analysis involves assessing the equality of two mean vectors, denoted μX and μY. Since its introduction by [1], the Hotelling T² test has proven to be a valuable tool in multivariate analysis. Subsequently, numerous studies have addressed the testing of μX = μY within various contexts and under distinct assumptions. See refs. [2,3], along with their respective references.

Consider two sets of observations, {Xt}t=1n1 and {Yt}t=1n2, where Xt = (Xt,1,…,Xt,p)T and Yt = (Yt,1,…,Yt,p)T. These observations are drawn from two populations with means μX and μY, respectively. The classical problem is to test the hypotheses:

H0: μX = μY versus H1: μX ≠ μY. (1)

When {Xt}t=1n1 and {Yt}t=1n2 are two independent sequences that are also independent of each other, a considerable body of literature focuses on testing Hypothesis (1). The ℓ2-type test statistic corresponding to (1) is of the form (X̄ − Ȳ)T S^{-1} (X̄ − Ȳ), where X̄ = n1^{-1} ∑_{t=1}^{n1} Xt, Ȳ = n2^{-1} ∑_{t=1}^{n2} Yt and S^{-1} is the weight matrix. A straightforward choice for S^{-1} is the identity matrix Ip [4,5], implying equal weighting for each dimension. Several classical asymptotic theories have been developed based on this selection of S^{-1}. However, this choice disregards the variability in each dimension and the correlations between dimensions, resulting in suboptimal performance, particularly in the presence of heterogeneity or of correlations between dimensions. In recent decades, numerous researchers have investigated various choices for S^{-1} along with the corresponding asymptotic theories. See refs. [6,7]. In addition, some researchers have developed a framework centered on ℓ∞-type test statistics, represented as max_{j∈[p]} |(S^{-1/2}(X̄ − Ȳ))j| [8,9,10]. Extreme value theory plays a pivotal role in deriving the asymptotic behaviors of these test statistics.

However, when {Xt}t=1n1 and {Yt}t=1n2 are two weakly dependent sequences and are not independent of each other, the above methods may not work well. In this paper, we introduce an ℓ∞-type test statistic Tn := (n1n2)^{1/2}(n1+n2)^{-1/2} |X̄ − Ȳ|_∞ for testing H0 with two dependent sequences. Based on Σ, which represents the variance of (n1n2)^{1/2}(n1+n2)^{-1/2}(X̄ − Ȳ), we construct a Gaussian maximum, denoted TnG, to approximate Tn under the null hypothesis. When n1 = n2 = n, Tn can be written as |Sn|_∞, the maximum of a sum of high-dimensional weakly dependent random vectors, where Sn = n^{-1/2} ∑_{t=1}^{n} (Xt − Yt). Let TnG = |G|_∞ with G = (G1,…,Gp)T ∼ N{0, var(Sn)} and let A be a class of Borel subsets of Rp. Define

ρn(A) = sup_{A∈A} |P(Sn ∈ A) − P(G ∈ A)|.

In particular, let Amax consist of all sets of the form Amax = {(a1,…,ap)T ∈ Rp : max_{j∈[p]} |aj| ≤ x} for some x ∈ R. Then we have

ρn(Amax) = sup_{x∈R} |P(Tn ≤ x) − P(TnG ≤ x)|.

Note that ρn(Amax) is the Kolmogorov distance between Tn and TnG.

When the dimension p diverges exponentially with respect to the sample size n, several studies have focused on deriving ρn(Amax) = o(1) under a weak dependence assumption. Based on the coupling method for β-mixing sequences, ref. [11] obtained ρn(Amax) = o(1) under the β-mixing condition, contributing to the understanding of such phenomena. Ref. [12] extended the scope of the investigation to the physical dependence framework introduced by [13]. Considering three distinct types of dependence (α-mixing, m-dependence, and physical dependence measures), ref. [14] made significant strides. They established nonasymptotic error bounds for Gaussian approximations of sums of high-dimensional dependent random vectors. Their analysis encompassed various choices of A, including hyper-rectangles, simple convex sets, and sparsely convex sets. Let Are be the class of all hyper-rectangles in Rp. Under the α-mixing scenario and some mild regularity conditions, ref. [14] showed

ρn(Are) ≲ {log(pn)}^{7/6} n^{-1/9},

hence the Gaussian approximation holds if log(pn) = o(n^{2/21}). In this paper, under conditions similar to or even weaker than those in [14], we obtain

ρn(Amax) ≲ {log(pn)}^{3/2} n^{-1/6},

which implies that the Gaussian approximation holds if log(pn) = o(n^{1/9}). Refer to Remark 1 for more details on the comparison of the convergence rates. By using the Gaussian-to-Gaussian comparison and Nazarov's inequality for p-dimensional random vectors, we can easily extend our result to ρn(Are) ≲ {log(pn)}^{3/2} n^{-1/6}. Given that our framework and numerous testing procedures rely on ℓ∞-type test statistics, we state our results for Amax. When p diverges polynomially with respect to n, to the best of our knowledge, no existing literature provides the convergence rate of ρn(Amax) for α-mixing sequences under bounded finite moments.
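The threshold log(pn) = o(n^{1/9}) follows from elementary rate algebra applied to the bound above:

```latex
\{\log(pn)\}^{3/2}\, n^{-1/6} = o(1)
\;\Longleftrightarrow\;
\log(pn) = o\bigl(n^{(1/6)\cdot(2/3)}\bigr) = o\bigl(n^{1/9}\bigr).
```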

Based on the Gaussian approximation for high-dimensional independent random vectors [15,16], we employ the coupling method for α-mixing sequences [17] and the "big-and-small" block technique to specify the convergence rate of ρn(Amax) under various divergence rates of p. For more details, refer to Theorem 1 in Section 3.1 and its proof in Appendix A. Given that Σ is typically unknown in practice, we develop a data-driven procedure based on the blockwise wild bootstrap [18] to determine the critical value for a given significance level α. The blockwise wild bootstrap method is widely used in time series analysis. See [19,20] and the references therein.

The independence between {Xt}t=1n1 and {Yt}t=1n2 is not a necessary assumption in our method. We only require that the pair sequence {(Xt,Yt)} be weakly dependent. Therefore, our method can be applied effectively to detect change points in high-dimensional time series. Further details on this application can be found in Section 4.

The rest of this paper is organized as follows. Section 2 introduces the test statistic and the blockwise bootstrap method. The convergence rates of Gaussian approximations for high-dimensional α-mixing sequences and the theoretical properties of the proposed test can be found in Section 3. In Section 4, an application to change point detection for high-dimensional time series is presented. The selection method for the tuning parameter and a simulation study investigating the numerical performance of the test are presented in Section 5. We apply the proposed method to opening price data from multiple stocks in Section 6. Section 7 discusses the results and outlines our future work. The proofs of the main results in Section 3 are detailed in Appendices A, B, C and D.

Notation: 

For any positive integer p ≥ 1, we write [p] = {1,…,p}. We use |a|_∞ = max_{j∈[p]} |aj| to denote the ℓ∞-norm of the p-dimensional vector a. Let ⌊x⌋ and ⌈x⌉ represent the greatest integer less than or equal to x and the smallest integer greater than or equal to x, respectively. For two sequences of positive numbers {an} and {bn}, we write an ≲ bn or bn ≳ an if lim sup_{n→∞} an/bn ≤ c0 for some positive constant c0. Let an ≍ bn if an ≲ bn and bn ≲ an hold simultaneously. Denote 0p = (0,…,0)T ∈ Rp. For any m×m matrix A = (aij)m×m, let |A|_∞ = max_{i,j∈[m]} |aij| and ‖A‖2 be the spectral norm of A. Additionally, denote λmin(A) as the smallest eigenvalue of A. Let 1(·) be the indicator function. For any x, y ∈ R, denote x∨y = max{x,y} and x∧y = min{x,y}. Given γ > 0, we define the function ψγ(x) := exp(x^γ) − 1 for any x > 0. For a real-valued random variable ξ, we define ‖ξ‖_{ψγ} := inf[λ > 0 : E{ψγ(|ξ|/λ)} ≤ 1]. Throughout the paper, we use c, C ∈ (0,∞) to denote two generic finite constants that do not depend on (n1, n2, p) and may differ from use to use.
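As a numerical illustration of the Orlicz norm ‖ξ‖_{ψγ} just defined, the following sketch (illustrative code, not from the paper) approximates the infimum over λ by bisection, replacing the expectation with a Monte Carlo average over a sample of ξ:

```python
import numpy as np

def orlicz_psi_norm(xs, gamma=2.0, tol=1e-6):
    """Approximate ||xi||_{psi_gamma} = inf{lam > 0 : E[exp((|xi|/lam)^gamma) - 1] <= 1}
    from a sample xs of xi (a finite-sample sketch, not the exact norm)."""
    xs = np.abs(np.asarray(xs, dtype=float))

    def feasible(lam):
        # Monte Carlo check of E{psi_gamma(|xi|/lam)} <= 1; overflow -> inf -> infeasible
        with np.errstate(over="ignore"):
            return np.mean(np.expm1((xs / lam) ** gamma)) <= 1.0

    lo, hi = 1e-8, 1.0
    while not feasible(hi):          # grow hi until the constraint holds
        hi *= 2.0
    while hi - lo > tol * hi:        # bisect down to the infimum
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if feasible(mid) else (mid, hi)
    return hi
```

For a degenerate ξ ≡ 1 and γ = 2, the constraint exp(λ^{-2}) − 1 ≤ 1 gives the exact value 1/√(log 2), which the bisection recovers.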

2. Methodology

2.1. Test Statistic and Its Gaussian Analog

Consider two weakly stationary time series {Xt, t∈Z} and {Yt, t∈Z} with Xt = (Xt,1,…,Xt,p)T and Yt = (Yt,1,…,Yt,p)T. Let μX = E(Xt) and μY = E(Yt). The primary focus is on testing the equality of the mean vectors of the two populations:

H0: μX = μY versus H1: μX ≠ μY.

Given the observations {Xt}t=1n1 and {Yt}t=1n2, the estimators of μX and μY are, respectively, μ̂X = n1^{-1} ∑_{t=1}^{n1} Xt and μ̂Y = n2^{-1} ∑_{t=1}^{n2} Yt. In this paper, we assume n1 ≍ n2 ≍ n. It is natural to consider the ℓ∞-type test statistic Tn = (n1n2)^{1/2}(n1+n2)^{-1/2} |μ̂X − μ̂Y|_∞. Write ñ = max{n1, n2}. Define two new sequences {X̃t}_{t=1}^{ñ} and {Ỹt}_{t=1}^{ñ} with

X̃t = Xt·1(1 ≤ t ≤ n1) and Ỹt = Yt·1(1 ≤ t ≤ n2).

For each t ∈ [ñ], let

Zt = {n2ñ/(n1(n1+n2))}^{1/2} X̃t − {n1ñ/(n2(n1+n2))}^{1/2} Ỹt.

Then, Tn can be rewritten as

Tn = |ñ^{-1/2} ∑_{t=1}^{ñ} Zt|_∞. (2)

We reject the null hypothesis H0 if Tn > cvα, where cvα represents the critical value at the significance level α ∈ (0,1). Determining cvα involves deriving the distribution of Tn under H0. However, due to the divergence of p in the high-dimensional scenario, obtaining this distribution is challenging. To address this challenge, we employ the Gaussian approximation theorem [15,16]. We seek a Gaussian analog, denoted TnG, satisfying the property that the Kolmogorov distance between Tn and TnG converges to zero under H0. Then, we can replace cvα by cvαG := inf{x > 0 : P(TnG > x) ≤ α}. Define a p-dimensional Gaussian vector

G ∼ N(0p, Ξñ) with Ξñ = var(ñ^{-1/2} ∑_{t=1}^{ñ} Zt). (3)

We then define the Gaussian analogue of Tn as

TnG = |G|_∞.

Proposition 1 below demonstrates that the null distribution of Tn can be effectively approximated by the distribution of TnG.
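To make the construction above concrete, here is a minimal numerical sketch (illustrative code, not the authors' implementation) computing Tn both directly from the two sample means and through the zero-padded sequence Zt, and verifying that the two expressions coincide:

```python
import numpy as np

def two_sample_stat(X, Y):
    """T_n = {n1*n2/(n1+n2)}^{1/2} * |mean(X) - mean(Y)|_inf for X: (n1,p), Y: (n2,p)."""
    n1, n2 = X.shape[0], Y.shape[0]
    diff = X.mean(axis=0) - Y.mean(axis=0)
    return np.sqrt(n1 * n2 / (n1 + n2)) * np.max(np.abs(diff))

def build_Z(X, Y):
    """Return the (n_tilde, p) array Z_t such that T_n = |Z.sum(0)/sqrt(n_tilde)|_inf."""
    n1, n2 = X.shape[0], Y.shape[0]
    nt, p = max(n1, n2), X.shape[1]
    Xt = np.zeros((nt, p)); Xt[:n1] = X      # zero-padded X-tilde
    Yt = np.zeros((nt, p)); Yt[:n2] = Y      # zero-padded Y-tilde
    return (np.sqrt(n2 * nt / (n1 * (n1 + n2))) * Xt
            - np.sqrt(n1 * nt / (n2 * (n1 + n2))) * Yt)
```

Summing Z over t and dividing by √ñ reproduces √(n1n2/(n1+n2))(μ̂X − μ̂Y) exactly, which is the identity behind (2).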

2.2. Blockwise Bootstrap

Note that the long-run covariance matrix Ξn˜ specified in (3) is typically unknown. As a result, determining cvαG through the distribution of TnG becomes challenging. To address this challenge, we introduce a parametric bootstrap estimator for Tn using the blockwise bootstrap method [18].

For some positive constant ϑ ∈ [1/2, 1), let S ≍ ñ^{1−ϑ} and B = ⌊ñ/S⌋ be the size of each block and the number of blocks, respectively. Denote Ib = {(b−1)S+1,…,bS} for b ∈ [B−1] and IB = {(B−1)S+1,…,ñ}. Let {ϱb}b=1B be a sequence of i.i.d. standard normal random variables and ϱ = (ϱ1,…,ϱñ), where ϱt = ϱb if t ∈ Ib. Define the bootstrap estimator of Tn as

T̂nG = |ñ^{-1/2} ∑_{t=1}^{ñ} (Zt − Z̄)ϱt|_∞,

where Z̄ = ñ^{-1} ∑_{t=1}^{ñ} Zt. Based on this estimator, we define the estimated critical value cv̂α as

cv̂α := inf{x > 0 : P(T̂nG > x | E) ≤ α}, (4)

where E = {X1,…,Xn1, Y1,…,Yn2}. Then, we reject the null hypothesis H0 if Tn > cv̂α. The procedure for selecting the parameter ϑ (or the block size S) is detailed in Section 5.1. In practice, we obtain cv̂α through the following bootstrap procedure: Generate K independent sequences {ϱ(1),t}t=1ñ,…,{ϱ(K),t}t=1ñ, with each {ϱ(k),t}t=1ñ generated in the same way as {ϱt}t=1ñ. For each k ∈ [K], calculate T̂(k),nG with {ϱ(k),t}t=1ñ. Then, cv̂α is the empirical (1−α)-quantile of {T̂(1),nG,…,T̂(K),nG}, i.e., the ⌈αK⌉-th largest value among them. Here, K is the number of bootstrap replications.
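The bootstrap procedure just described can be sketched as follows (our illustrative code, not the authors'): observations within each block of size S share one standard normal multiplier, and the critical value is the empirical (1−α)-quantile of the K bootstrap statistics.

```python
import numpy as np

def blockwise_bootstrap_cv(Z, S, alpha=0.05, K=1000, rng=None):
    """Blockwise (wild) bootstrap critical value for T_n.

    Z: (n_tilde, p) array of the transformed observations Z_t.
    S: block size; alpha: significance level; K: bootstrap replications.
    """
    rng = np.random.default_rng(rng)
    nt, p = Z.shape
    Zc = Z - Z.mean(axis=0)                  # center: Z_t - Z-bar
    B = nt // S                              # number of blocks; last block absorbs remainder
    block_id = np.minimum(np.arange(nt) // S, B - 1)
    stats = np.empty(K)
    for k in range(K):
        rho_b = rng.standard_normal(B)       # one N(0,1) multiplier per block
        rho_t = rho_b[block_id]              # constant within each block
        stats[k] = np.max(np.abs(Zc.T @ rho_t)) / np.sqrt(nt)
    return np.quantile(stats, 1 - alpha)
```

The test then rejects H0 when the observed Tn exceeds the returned critical value.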

3. Theoretical Results

We employ the concept of ‘α-mixing’ to characterize the serial dependence of {(Xt,Yt)}, with the α-mixing coefficient at lag κ defined as

α(κ) := sup_r sup_{A∈F_{−∞}^{r}, B∈F_{r+κ}^{∞}} |P(A∩B) − P(A)P(B)|, (5)

where F_{−∞}^{r} and F_{r+κ}^{∞} are the σ-fields generated by {(Xt,Yt): t ≤ r} and {(Xt,Yt): t ≥ r+κ}, respectively. The sequence {(Xt,Yt)} is called α-mixing if α(κ) → 0 as κ → ∞.

3.1. Gaussian Approximation for High-Dimensional α-Mixing Sequence

To show that the Kolmogorov distance between Tn and TnG converges to zero under various divergence rates of p, we need the following central limit theorems for high-dimensional α-mixing sequences.

Theorem 1.

Let {ξt}t=1n be an α-mixing sequence of p-dimensional centered random vectors, and let {α(κ)}κ≥1 denote the α-mixing coefficients of {ξt}, defined in the same manner as (5). Write Sn = (Sn,1,…,Sn,p)T = n^{-1/2} ∑_{t=1}^{n} ξt and W = (W1,…,Wp)T ∼ N(0p, Σn) with Σn = E(SnSnT). Define

ρn = sup_{x∈R} |P(|Sn|_∞ ≤ x) − P(|W|_∞ ≤ x)|.
  • (i) 
    If max_{t∈[n]} max_{j∈[p]} E(|ξt,j|^m) ≤ C1*, α(κ) ≤ C2* κ^{−τ} and λmin(Σn) ≥ C3* for some m > 3, τ > max{2m/(m−3), 3} and constants C1*, C2*, C3* > 0, we have
    ρn ≲ p^{1/2}(log p)^{1/4} n^{−τ̃}
    provided that p = o(n^{2τ̃}), where τ̃ = τ/(11τ+12).
  • (ii) 
    If max_{t∈[n]} max_{j∈[p]} ‖ξt,j‖_{ψγ1} ≤ Mn, α(κ) ≤ C1** exp(−C2** κ^{γ2}) and min_{j∈[p]} (Σn)j,j ≥ C3** for some Mn ≥ 1, γ1 ∈ (0,2], γ2 > 0 and constants C1**, C2**, C3** > 0, we have
    ρn ≲ Mn {log(pn)}^{max{(2γ2+1)/(2γ2), 3/2}} n^{−1/6}
    provided that {log(pn)}^3 = o{n^{γ1γ2/(2γ1+2γ2−γ1γ2)}} and Mn^2 {log(pn)}^{1/γ2} = o(n^{1/3}).

Remark 1.

In scenarios where the dimension p diverges polynomially with respect to n, Theorem 1(i) represents a novel contribution to the existing literature. Moreover, if τ → ∞ (i.e., α(κ) ≲ exp(−Cκ) for some constant C > 0), we have τ̃ → 1/11, and thus ρn = o(1) if p(log p)^{1/2} = o(n^{2/11}). Compared with Theorem 1 in [14], which provides a Gaussian approximation result when p diverges exponentially with respect to n, Theorem 1(ii) offers three improvements. Firstly, all conditions of Theorem 1(ii) are equivalent to those in Theorem 1 of [14], except that we permit γ1 ∈ (0,1), thereby offering a weaker assumption that is more broadly applicable. Secondly, the convergence rate in n of n^{−1/6} in Theorem 1(ii) outperforms the rate of n^{−1/9} demonstrated in Theorem 1 of [14]. Note that the convergence rate in Theorem 1 of [14] can be rewritten as

Mn{log(pn)}^{(2γ2+1)/(2γ2)} n^{−1/6} + Mn{log(pn)}^{7/6} n^{−1/9}.

To ensure ρn = o(1), in our result it is necessary to require Mn^6 {log(pn)}^{(6γ2+3)/γ2} = o(n) when γ2 ≤ 2/3 and Mn^6 {log(pn)}^{max{(6γ2+3)/γ2, 9}} = o(n) when γ2 > 2/3, respectively. Comparatively, the basic requirements under Theorem 1 of [14] are Mn^6 {log(pn)}^{(6γ2+3)/γ2} = o(n) when γ2 ≤ 2/3 and Mn^9 {log(pn)}^{21/2} = o(n) when γ2 > 2/3, respectively. Since (6γ2+3)/γ2 < 21/2 when γ2 > 2/3, our result permits a divergence rate of p that is at least as large as that of Theorem 1 in [14].

3.2. Theoretical Properties

In order to derive the theoretical properties of Tn, the following regular assumptions are needed.

Assumption 1.

  • (i) 

For some m > 4, there exists a constant C1 > 0 s.t. max_{t∈[ñ]} max_{j∈[p]} E(|Zt,j|^m) ≤ C1.

  • (ii) 

There exists a constant C2 > 0 s.t. α(κ) ≤ C2 κ^{−τ} for some τ > 3m/(m−4).

  • (iii) 

There exists a constant C3 > 0 s.t. λmin(Ξñ) ≥ C3.

Assumption 2.

  • (i) 

There exists a constant C1 > 0 s.t. max_{t∈[ñ]} max_{j∈[p]} ‖Zt,j‖_{ψ2} ≤ C1.

  • (ii) 

There exist two constants C2, C3 > 0 s.t. α(κ) ≤ C2 exp(−C3κ).

  • (iii) 

There exists a constant C4 > 0 s.t. min_{j∈[p]} (Ξñ)j,j ≥ C4.

Remark 2.

The two mild Assumptions, 1 and 2, delineate the conditions on {(Xt,Yt)} needed to develop Gaussian approximation theories when the dimension p diverges at a polynomial and an exponential rate relative to the sample size n, respectively. Assumptions 1(i) and 1(ii) are common assumptions in multivariate time series analysis. Since n1 ≍ n2 ≍ n, if max_{t∈[n1], j∈[p]} E(|Xt,j|^m) ≤ C and max_{t∈[n2], j∈[p]} E(|Yt,j|^m) ≤ C, then Assumption 1(i) holds, as verified by the triangle inequality. Additionally, Assumption 1(iii) necessitates the strong nondegeneracy of Ξñ, a requirement commonly assumed in Gaussian approximation theories (see refs. [21,22], among others). Note that Assumption 2(iii) is implied by Assumption 1(iii); the former only necessitates the nondegeneracy of min_{j∈[p]} var(ñ^{-1/2} ∑_{t=1}^{ñ} Zt,j). We can relax Assumption 2(i) to max_{t∈[ñ]} max_{j∈[p]} ‖Zt,j‖_{ψγ} ≤ C for any γ ∈ (0,2], a standard assumption in the literature on ultra-high-dimensional data analysis. This assumption ensures subexponential upper bounds for the tail probabilities of the statistics in question when p ≫ n, as discussed in [23,24]. The sub-Gaussian requirement in Assumption 2(i) is made for the sake of simplicity. If {Xt} and {Yt} share the same tail behavior, Assumption 2(i) is satisfied automatically. Assumption 2(ii) necessitates that the α-mixing coefficients decay at an exponential rate.

Write Δn := max{n1,n2} − min{n1,n2}. Define two cases with respect to the distinct divergence rates of p as

  • Case1: {Xt}t=1n1 and {Yt}t=1n2 satisfy Assumption 1, and the dimension p satisfies p^2 log p = o{n^{4τ/(11τ+12)}} and Δn^2 log p = o(n);

  • Case2: {Xt}t=1n1 and {Yt}t=1n2 satisfy Assumption 2, and the dimension p satisfies log(pn) = o(n^{1/9}) and Δn^2 log p = o(n).

Note that Δn^2 log p = o(n) bounds the maximum difference between the two sample sizes. Proposition 1 below demonstrates that, under the aforementioned cases and H0, the Kolmogorov distance between Tn and TnG converges to zero as the sample size approaches infinity. Proposition 1 can be directly derived from Theorem 1. Note that, in the scenario where the dimension p diverges at a polynomial rate with respect to n, obtaining Proposition 1 requires only m > 3 and τ > max{2m/(m−3), 3}, an assumption weaker than Assumption 1. The more stringent restrictions m > 4 and τ > 3m/(m−4) in Assumption 1 are imposed to establish the results presented in Theorems 2 and 3.

Proposition 1.

In either Case1 or Case2, it holds under the null hypothesis H0 that

sup_{x∈R} |P(Tn ≤ x) − P(TnG ≤ x)| = o(1).

According to Proposition 1, the critical value cvα can be substituted with cvαG. However, in practical scenarios, the long-run covariance Ξn˜ defined in (3) is typically unknown. This implies that obtaining cvαG directly from the distribution of TnG is not feasible. We introduce a bootstrap method for obtaining the estimator cv^α defined in (4). In situations where the dimension p diverges at a polynomial rate relative to the sample size n, we require an additional Assumption 3 to ensure that cv^α serves as a reliable estimator for cvα. Assumption 3 places restrictions on the cumulant function, a commonly assumed criterion in time series analysis. Refer to [25,26] for examples of such assumptions in the literature.

Assumption 3.

For each i, j ∈ [p], define cum_{i,j}(h,t,s) = cov(Z̊0,i Z̊h,j, Z̊t,i Z̊s,j) − γ_{t,i,i} γ_{s−h,j,j} − γ_{s,i,j} γ_{t−h,j,i}, where γ_{h,i,j} = cov(Z0,i, Zh,j) and Z̊t,j = Zt,j − E(Zt,j). There exists a constant C4 > 0 s.t.

max_{i,j∈[p]} ∑_{h=−∞}^{∞} ∑_{t=−∞}^{∞} ∑_{s=−∞}^{∞} |cum_{i,j}(h,t,s)| < C4.

Similar to Case1 and Case2, we consider two cases corresponding to different divergence rates of the dimension p, as outlined below:

  • Case3: {Xt}t=1n1 and {Yt}t=1n2 satisfy Assumptions 1 and 3.

  • Case4: {Xt}t=1n1 and {Yt}t=1n2 satisfy Assumption 2.

Theorem 2.

In either Case3 with p log p = o[n^{min{(1−ϑ)/4, 2τ/(11τ+12)}}] and Δn^2 log p = o(n), or Case4 with log(pn) = o[n^{min{(1−ϑ)/2, ϑ/7, 1/9}}] and Δn^2 log p = o(n), it holds under H0 that sup_{x∈R} |P(Tn ≤ x) − P(T̂nG ≤ x | E)| = op(1). Moreover, it holds under H0 that

P(Tn > cv̂α) → α as n → ∞.

Theorem 3.

In either Case3 with p = o{n^{(1−ϑ)/4}} or Case4 with log(pn) = o[n^{min{ϑ/3, (1−ϑ)/2}}], if max_{j∈[p]} |μX,j − μY,j| ≳ n^{-1/2}(log p)^{1/2}, it holds that

P(Tn > cv̂α) → 1 as n → ∞.

Remark 3.

The different requirements for the divergence rates of p follow from the fact that we do not rely on the Gaussian approximation and comparison results under certain alternative hypotheses. By Theorems 2 and 3, the optimal selections of ϑ are 1/2 and 7/9 in Case3 and Case4, respectively. This implies that lim_{n→∞} P_{H0}(Tn > cv̂α) = α holds with p log p = o(n^{1/8}) in Case3 and log(pn) = o(n^{1/9}) in Case4. Under certain alternative hypotheses, lim_{n→∞} P_{H1}(Tn > cv̂α) = 1 holds with p = o(n^{1/8}) in Case3 and log(pn) = o(n^{1/9}) in Case4.

4. Application: Change Point Detection

In this section, we show that our two-sample testing procedure can be regarded as a novel method for detecting change points in high-dimensional time series. For clarity, we present the notation for the detection of a single change point, with the understanding that the procedure extends readily to the multiple change points case.

Consider a p-dimensional time series {Xt}t=1n and let μt = E(Xt). We consider the following hypothesis testing problem:

H0: μ1 = ⋯ = μn versus H1: μ1 = ⋯ = μτ0−1 ≠ μτ0 = ⋯ = μn.

Here, τ0 is the unknown change point. Let w be a positive integer such that w < min{τ0, n−τ0}. We define μ̄t = w^{-1} ∑_{l=t−w/2+1}^{t+w/2} μl, μ̄(1) = w^{-1} ∑_{l=1}^{w} μl and μ̄(2) = w^{-1} ∑_{l=n−w+1}^{n} μl. Then, for each t ∈ [3w/2, n−3w/2], define Δt,(1) = μ̄t − μ̄(1) and Δt,(2) = μ̄t − μ̄(2). Thus,

Δt,(1) = 0p if 3w/2 ≤ t ≤ τ0 − w/2; = (μ̄(2) − μ̄(1))(t + w/2 − τ0)/w if τ0 − w/2 < t ≤ τ0 + w/2; = μ̄(2) − μ̄(1) if τ0 + w/2 < t ≤ n − 3w/2;
Δt,(2) = μ̄(1) − μ̄(2) if 3w/2 ≤ t ≤ τ0 − w/2; = (μ̄(1) − μ̄(2))(τ0 + w/2 − t)/w if τ0 − w/2 < t ≤ τ0 + w/2; = 0p if τ0 + w/2 < t ≤ n − 3w/2.

Assume |μ̄(1) − μ̄(2)|_∞ = O(1), which represents the sparse signals case. Define t1(εt,(1)) = min{t ∈ [3w/2, n−3w/2] : |Δt,(1)|_∞ > εt,(1)} and t2(εt,(2)) = max{t ∈ [3w/2, n−3w/2] : |Δt,(2)|_∞ > εt,(2)} with two well-defined thresholds εt,(1), εt,(2) ≥ 0. Due to the symmetry of |Δt,(1)|_∞ and |Δt,(2)|_∞, it holds under H1 that

τ0 = {t1(εt,(1)) + t2(εt,(2))}/2.

The sample estimators of μ̄t, μ̄(1) and μ̄(2) are, respectively, μ̄̂t = w^{-1} ∑_{l=t−w/2+1}^{t+w/2} Xl, μ̄̂(1) = w^{-1} ∑_{l=1}^{w} Xl and μ̄̂(2) = w^{-1} ∑_{l=n−w+1}^{n} Xl. Based on the method proposed in Section 2, with n1 = n2 = w, we define the following two test statistics:

Twt,(1) = w^{1/2} |μ̄̂t − μ̄̂(1)|_∞ and Twt,(2) = w^{1/2} |μ̄̂t − μ̄̂(2)|_∞.

Given a significance level α > 0, we choose εt,(1) = cv1αt and εt,(2) = cv2αt, where cv1αt and cv2αt are, respectively, the (1−α)-quantiles of the distributions of Twt,(1) and Twt,(2). The estimated critical values cv̂1αt and cv̂2αt can be obtained by (4). Thus, t̂1 = min{t ∈ [3w/2, n−3w/2] : Twt,(1) > cv̂1αt} and t̂2 = max{t ∈ [3w/2, n−3w/2] : Twt,(2) > cv̂2αt}. Hence, the estimator of τ0 is given by

τ̂0 = (t̂1 + t̂2)/2.

We utilize Twt,(1) as an illustrative example to elucidate the applicability of our proposed method. Let w be an even integer. For any t ∈ [5w/2, n−3w/2], we have Twt,(1) = |w^{-1/2} ∑_{l=1}^{w} (X_{t−w/2+l} − Xl)|_∞, where the sequence {X_{t−w/2+l} − Xl}_{l=1}^{w} possesses the same weak dependence properties and similar moment/tail conditions as {Xl}_{l=1}^{n}. For t ∈ [3w/2, 5w/2−1], let {X̃l}_{l=1}^{t−w/2} be defined as X̃l = Xl when l ∈ [1, w] and X̃l = 0p when l ∈ [w+1, t−w/2]. Additionally, define {Ỹl}_{l=t−w/2+1}^{2t−w} as Ỹl = Xl when l ∈ [t−w/2+1, t+w/2] and Ỹl = 0p when l ∈ [t+w/2+1, 2t−w]. Then, Twt,(1) can be expressed as |w^{-1/2} ∑_{l=1}^{t/2−w/4} {(Ỹ_{t−w/2+l} − X̃l) + (Ỹ_{2t−w+1−l} − X̃_{t−w/2+1−l})}|_∞, and {(Ỹ_{t−w/2+l} − X̃l) + (Ỹ_{2t−w+1−l} − X̃_{t−w/2+1−l})}_{l=1}^{t/2−w/4} shares the same weak dependence properties and similar moment/tail conditions as {Xl}_{l=1}^{n}. Hence, our method can be applied to change point detection.
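The scanning estimator τ̂0 = (t̂1 + t̂2)/2 can be sketched as follows. This is a simplified illustration, not the authors' implementation: the critical value at each t is obtained by an S = 1 multiplier bootstrap on the centered local window, and the local statistics use the two-sample scaling √(w/2) of Tn with n1 = n2 = w so that they are comparable to that bootstrap distribution.

```python
import numpy as np

def detect_change_point(X, w, alpha=0.05, K=300, rng=0):
    """Sketch of the single change point estimator tau_hat = (t1_hat + t2_hat)/2.

    Scans t in [3w/2, n - 3w/2], comparing the local window mean with the means of
    the first and last w observations via sqrt(w/2)-scaled l-infinity statistics.
    """
    rng = np.random.default_rng(rng)
    n, p = X.shape
    mu1 = X[:w].mean(axis=0)                 # mean of the first w observations
    mu2 = X[n - w:].mean(axis=0)             # mean of the last w observations
    t1_hat = t2_hat = None
    for t in range(3 * w // 2, n - 3 * w // 2 + 1):
        win = X[t - w // 2: t + w // 2]      # window of width w around t
        mu_t = win.mean(axis=0)
        Zc = win - mu_t                      # centered window for the bootstrap
        mult = rng.standard_normal((w, K))   # i.i.d. N(0,1) multipliers (S = 1)
        boot = np.max(np.abs(Zc.T @ mult), axis=0) / np.sqrt(w)
        cv = np.quantile(boot, 1 - alpha)    # bootstrap critical value at time t
        if t1_hat is None and np.sqrt(w / 2) * np.max(np.abs(mu_t - mu1)) > cv:
            t1_hat = t                       # first exceedance against the left mean
        if np.sqrt(w / 2) * np.max(np.abs(mu_t - mu2)) > cv:
            t2_hat = t                       # last exceedance against the right mean
    if t1_hat is None or t2_hat is None:
        return None                         # no change point detected
    return (t1_hat + t2_hat) // 2
```

With a pronounced mean shift, the first and last exceedance times bracket τ0 roughly symmetrically, so their average localizes the change point.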

The selections of w and α are crucial in this method. We will elaborate on specific choices for them in future work.

5. Simulation Study

5.1. Tuning Parameter Selection

Given the observations {Xt}t=1n1 and {Yt}t=1n2, we use the minimum volatility (MV) method proposed in [27] to select the block size S.

When the data are independent, by the multiplier bootstrap method described in [28], we set B = ñ (thus S = 1). In this case,

Ξ̂ñ = var{ñ^{-1/2} ∑_{t=1}^{ñ} (Zt − Z̄)ϱt | Z1,…,Zñ} = ñ^{-1} ∑_{b=1}^{B} {∑_{t∈Ib} (Zt − Z̄)}{∑_{t∈Ib} (Zt − Z̄)}T = ñ^{-1} ∑_{t=1}^{ñ} (Zt − Z̄)(Zt − Z̄)T

proves to be a reliable estimator of Ξñ introduced in Section 3. When the data are weakly dependent (and thus nearly independent), we expect a small value for S and a large value for B. Therefore, we recommend exploring a narrow range of S, such as S ∈ {1,…,m}, where m is a moderate integer. In our theoretical proof, the quality of the bootstrap approximation depends on how well Ξ̂ñ approximates the covariance Ξñ. The idea behind the MV method is that the conditional covariance Ξ̂ñ should exhibit stable behavior as a function of S within an appropriate range. For more comprehensive discussions of the MV method and its applications in time series analysis, we refer readers to [27,29]. For a moderately sized integer m, let S1 < S2 < ⋯ < Sm be a sequence of equally spaced candidate block sizes, and set S0 = 2S1 − S2 and Sm+1 = 2Sm − Sm−1. For each i ∈ {0,…,m+1}, let

Yj^i = ∑_{b=1}^{B(Si)} {∑_{t∈Ib} (Zt,j − Z̄j)}^2,

where j ∈ [p] and B(S) = ⌊ñ/S⌋. Then, for each i ∈ {1,…,m}, we compute

Yi = ∑_{j=1}^{p} sd({Yj^l}_{l=i−1}^{i+1}),

where sd(·) denotes the standard deviation. We then select the block size Si* with i* = argmin_{i∈{1,…,m}} Yi.
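The MV selection rule above can be sketched as follows (an illustrative implementation under our reading of the criterion, not the authors' code): for each candidate block size we accumulate the squared centered block sums per coordinate, then pick the candidate whose values are most stable across its two neighbors on the grid.

```python
import numpy as np

def mv_block_size(Z, candidates):
    """Minimum volatility (MV) selection of the block size S from equally spaced candidates."""
    Z = np.asarray(Z, dtype=float)
    nt, p = Z.shape
    Zc = Z - Z.mean(axis=0)
    S = sorted(candidates)
    # pad the grid at both ends: S_0 = 2S_1 - S_2, S_{m+1} = 2S_m - S_{m-1}
    S = [max(1, 2 * S[0] - S[1])] + S + [2 * S[-1] - S[-2]]
    Ys = []
    for s in S:
        B = nt // s
        block_id = np.minimum(np.arange(nt) // s, B - 1)
        block_sums = np.zeros((B, p))
        np.add.at(block_sums, block_id, Zc)       # sum of Z_t - Z-bar within each block
        Ys.append((block_sums ** 2).sum(axis=0))  # Y_j^i for j = 1..p
    Ys = np.array(Ys)                             # shape (m + 2, p)
    # volatility criterion: sum over j of sd of {Y_j^l : l = i-1, i, i+1}
    crit = [Ys[i - 1:i + 2].std(axis=0, ddof=1).sum() for i in range(1, len(S) - 1)]
    return S[1 + int(np.argmin(crit))]
```

The returned value is always one of the original candidates; the padded endpoints only supply neighbors for the volatility computation.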

5.2. Simulation Settings

We present the results of a simulation study aimed at evaluating the performance of tests based on Tn, as defined in (2), in finite samples. To assess the finite-sample properties of the proposed test, we employ the following fundamental generating process: W = HA + f(a) ∈ R^{n×p}, where A ∈ R^{p×p} is the loading matrix, f(·): R → R^{n×p} is a mean-shift function that is constant in t, and the parameter a ∈ {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6} represents the distance between the null and alternative hypotheses. Additionally, H = (H1,…,Hn)T ∈ R^{n×p} with Ht = ρH_{t−1} + εt ∈ R^{p×1}, where εt are i.i.d. N(0p, Ip) and ρ ∈ {0, 0.1, 0.2}. Construct fi(a) = (m1^{(i)},…,mn^{(i)})T ∈ R^{n×p} with mt^{(i)} = (mt,1^{(i)},…,mt,p^{(i)})T for i ∈ {1,2}, where mt,j^{(1)} = aj and mt,j^{(2)} = a(1 − j/p) for each t ∈ [n] and j ∈ [p]. Then f1(·) and f2(·) represent the sparse and dense signal cases, respectively. We consider three different loading matrices A as follows:

  • (M1).

Let V = (vk,l)_{1≤k,l≤p} s.t. vk,l = 0.995^{|k−l|}, then let A = V^{1/2}.

  • (M2).

Let A = (ak,l)_{1≤k,l≤p} s.t. ak,k = 1, ak,l = 0.7 for |k−l| = 1 and ak,l = 0 otherwise.

  • (M3).

Let r = p/2.5 and V = (vk,l)_{1≤k,l≤p}, where vk,k = 1, vk,l = 0.9 for r(q−1)+1 ≤ k ≠ l ≤ rq with q = 1,…,p/r, and vk,l = 0 otherwise. Let A = V^{1/2}.

We assess the finite-sample performance of our proposed test (denoted by Yang) in comparison with the tests introduced by [5] (denoted by Dempster), [4] (denoted by BS), [6] (denoted by SD), and [8] (denoted by CLX). All tests in our simulations are conducted at the 5% significance level with 1000 Monte Carlo replications, and the number of bootstrap replications is set to 1000. We consider dimensions p ∈ {50, 200, 400, 800} and sample size pairs (n1, n2) ∈ {(200, 220), (400, 420)}.
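A sketch of the data-generating process above is given below (illustrative code; the sparse mean form did not survive extraction cleanly, so `a / j` is used as a hypothetical stand-in for m_{t,j}^{(1)}, and only the (M2) loading matrix is shown):

```python
import numpy as np

def generate_sample(n, p, rho, A, a=0.0, signal="sparse", rng=None):
    """Generate one sample W = H A + f(a): AR(1) factors H times loading A plus a mean shift."""
    rng = np.random.default_rng(rng)
    H = np.zeros((n, p))
    h = np.zeros(p)
    for t in range(n):
        h = rho * h + rng.standard_normal(p)   # H_t = rho * H_{t-1} + eps_t
        H[t] = h
    j = np.arange(1, p + 1)
    if signal == "sparse":
        m = a / j                              # placeholder sparse decay (assumption)
    else:
        m = a * (1 - j / p)                    # dense signal: a(1 - j/p)
    return H @ A + m                           # every row shifted by the same mean vector

def loading_M2(p):
    """Loading matrix (M2): unit diagonal with 0.7 on the first off-diagonals."""
    A = np.eye(p)
    idx = np.arange(p - 1)
    A[idx, idx + 1] = A[idx + 1, idx] = 0.7
    return A
```

Under H0, both samples are drawn with a = 0; under H1, one sample receives the shift f(a).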

5.3. Simulation Results

For testing the null hypothesis, {Xt} and {Yt} are generated independently, following the same process as W, with identical values of ρ and with f(a) = 0. The choice f(a) = 0 here is made for the sake of simplicity. We present only the simulation results for (M1) in the main body of the paper. The results obtained for (M2) and (M3) are analogous to those for (M1) and are detailed in Appendix E.

Table 1 presents the performance of the various methods in controlling the Type I error based on (M1). As the dimension p or the sample sizes (n1, n2) increase, the results of all methods change little, except for BS's. When ρ equals 0, so that the samples are generated from independent Gaussian distributions, both Yang's method and BS's method effectively control the Type I error at around 5%, while the control achieved by the other three methods is less satisfactory. It is noteworthy that, as ρ increases, the data generated by the AR(1) model significantly affect the other methods. In contrast, Yang's method delivers superior and more stable results as ρ grows. These comparative effects are also observable in the results based on (M2) and (M3) in Appendix E. For this reason, we compare the empirical power of the different methods only at ρ = 0.

Table 1.

The Type I error rates, expressed as percentages, were calculated by independently generated sequences {Xt}t=1n1 and {Yt}t=1n2 based on (M1). The simulations were replicated 1000 times.

(n1,n2) ρ p Yang Dempster BS SD CLX
(200,220) 0 50 5 18.5 5.8 0.9 0.3
200 5.9 16.5 6.6 0.4 0.4
400 5.4 17.4 6.2 0.2 0.3
800 4.2 13.5 6.7 0.3 0.2
0.1 50 6.5 22.8 9.3 2 1
200 6.6 22.6 9.6 1.2 0.8
400 7.4 22.9 10.4 1 0.8
800 5.8 22.5 12.4 1 1.2
0.2 50 6.8 30.2 13.8 3.1 2.5
200 7.7 29.9 14.3 2.2 2.7
400 9.3 30.5 18.2 2.2 2.4
800 7.9 33.3 21.3 3 3.2
(400,420) 0 50 5.2 17.6 6.8 1 0.5
200 5.3 17.2 6.8 0.5 0.1
400 4.6 15.1 5.7 0.3 0
800 5.2 14.2 6.3 0.3 0.4
0.1 50 5.6 22.4 9.6 1.4 1
200 6.3 22.5 9.6 1.3 0.8
400 6.1 21.4 9.7 0.8 0.8
800 6.5 23.6 12.1 0.7 1.2
0.2 50 6.7 26.9 12.8 2.5 1.9
200 7.6 29.2 14.9 2.3 2.4
400 7.6 29.4 15.1 1.5 2.9
800 8.3 36.3 21.9 2.5 3.8

Figure 1 and Figure 2 depict the empirical power of the various methods for sparse and dense signals based on (M1). Similarly, as the dimension p increases, the results of all methods show little variation, except Dempster's. However, as the sample sizes (n1, n2) increase, most methods improve. In Figure 1, it is evident that Yang's method outperforms the others significantly when the signal is sparse. Methods such as SD, BS, and Dempster rely on the ℓ2-norm of the data, aggregating signals across all dimensions for testing. This makes them less effective when the signal is sparse, i.e., when anomalies appear in only a few dimensions. CLX's approach, akin to Yang's, tests whether the largest signal is abnormal. Consequently, CLX performs better than the other three methods in scenarios with sparse signals but still falls short of Yang's method. Conversely, when the signal is dense, Figure 2 shows that all methods yield favorable results, with Dempster's method proving the most effective. Yang's method performs at a relatively high level among these methods. In contrast, CLX's method, which performs well for sparse signals, exhibits relatively lower performance for dense signals. In conclusion, the proposed method exhibits the most stable performance across all settings and performs exceptionally well on sparse signals.

Figure 1.

Figure 1

The empirical powers with sparse signals were evaluated by independently generated sequences {Xt}t=1n1 based on (M1), f(·)=0 and ρ=0, and {Yt}t=1n2 based on (M1), f(·)=f1(·) and ρ=0. The parameter a represents the distance between the null and alternative hypotheses. The simulations were replicated 1000 times.

Figure 2.

Figure 2

The empirical powers with dense signals were evaluated by independently generated sequences {Xt}t=1n1 based on (M1), f(·)=0 and ρ=0, and {Yt}t=1n2 based on (M1), f(·)=f2(·) and ρ=0. The parameter a represents the distance between the null and alternative hypotheses. The simulations were replicated 1000 times.

6. Real Data Analysis

In this section, we apply the proposed method to a dataset of stock data obtained from Bloomberg's public database. This dataset includes daily opening prices from 1 January 2018 to 31 December 2021 for 30 companies in the Consumer Discretionary Sector (CDS) and 31 companies in the Information Technology Sector (ITS), all listed in the S&P 500. The sample sizes for the years 2018, 2019, 2020, and 2021 are 251, 250, 253, and 252, respectively. The findings are presented in Table 2. For both the Consumer Discretionary and Information Technology sectors, all p-values from the tests between two consecutive years are 0. This suggests a significant variation in the average annual opening prices across different years in both sectors.

Table 2.

The p-values for testing the equality of average annual opening prices across two consecutive years in the Consumer Discretionary Sector and Information Technology Sector, respectively.

Sector of S&P 500 2018–2019 2019–2020 2020–2021
Consumer Discretionary 0 0 0
Information Technology 0 0 0

For data visualization, Figure 3 displays the average annual opening prices of the 30 companies in the CDS (left subgraph) and the 31 companies in the ITS (right subgraph) in 2018, 2019, 2020, and 2021. Both subgraphs exhibit a pattern of annual growth in the opening prices of nearly every stock. These results are well in line with the conclusions of Table 2.

Figure 3.

Figure 3

The average annual opening prices of 30 Consumer Discretionary corporations and 31 Information Technology corporations in 2018, 2019, 2020, and 2021.

7. Discussion

In this paper, we propose a two-sample test for high-dimensional time series based on the blockwise bootstrap. Our ℓ∞-type test statistic is designed to detect the largest abnormal signal across dimensions. Unlike some frameworks, we do not require independence within each observation sequence or between the two sets of observations. Instead, we rely on the weak dependence of the pair sequence {(Xt,Yt)} to ensure the asymptotic properties of our proposed method. We derive two Gaussian approximation results for the cases in which the dimension p diverges at a polynomial rate and at an exponential rate relative to the sample size n, respectively. In the bootstrap procedure, the block size serves as the tuning parameter, and we employ the minimum volatility method, as proposed by [27], for block size selection.

Our test statistic targets the maximum value across dimensions, facilitating the detection of significant differences in certain dimensions. In cases where the differences in each dimension are small, it is more appropriate to consider an ℓ2-type test statistic rather than the ℓ∞-type one. Consequently, in the absence of prior information, test statistics that combine both types prove advantageous. However, deriving theoretical results for such a combined approach remains a significant challenge. As discussed in Section 4, our two-sample testing procedure can be applied to change point detection in high-dimensional time series. The choices of w, the size of each subsample mean, and the significance level α play crucial roles in this change point detection procedure. We leave these considerations for future research.
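To make the change point application concrete, here is a hedged sketch (the scanning scheme, names, and synthetic data are illustrative assumptions, not the paper’s procedure): form non-overlapping subsample means of size w, scan candidate split points with the max-type two-sample statistic, and report the split that maximises it.

```python
import numpy as np

def scan_change_point(X, w=10):
    """Return (estimated change location, max statistic) for an (n, p)
    series X, comparing the non-overlapping subsample means of size w
    before and after each candidate split with the max-type two-sample
    statistic."""
    n = X.shape[0]
    m = n // w
    means = X[:m * w].reshape(m, w, -1).mean(axis=1)   # (m, p) block means
    best_stat, best_k = -np.inf, None
    for k in range(2, m - 1):                          # >= 2 blocks per side
        n1, n2 = k, m - k
        diff = means[:k].mean(axis=0) - means[k:].mean(axis=0)
        stat = np.sqrt(n1 * n2 / (n1 + n2)) * np.max(np.abs(diff))
        if stat > best_stat:
            best_stat, best_k = stat, k
    return best_k * w, best_stat
```

The choice of w trades off the number of candidate splits against the noise level of each subsample mean, which is exactly the tuning issue raised above.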

Appendix A. Proof of Theorem 1

Appendix A.1. Proof of Theorem 1(i)

Proof. 

We first show that, for any $\tau > (q-1)m/(m-q)$ with some $q \in [2, m]$,

$$\max_{j\in[p]}\mathbb{E}\Big|\sum_{t=1}^{n}\xi_{t,j}\Big|^{q} \lesssim n^{q/2}. \tag{A1}$$

If $q=2$, since $\sum_{\kappa=1}^{\infty}\alpha^{(m-2)/m}(\kappa) \lesssim \sum_{\kappa=1}^{\infty}\kappa^{-(m-2)\tau/m} < \infty$, Equation (1.12b) (Davydov’s inequality) of [30] yields

$$\mathbb{E}\Big(\sum_{t=1}^{n}\xi_{t,j}\Big)^{2} = \sum_{t=1}^{n}\mathbb{E}(|\xi_{t,j}|^{2}) + \sum_{t_1\neq t_2}\operatorname{cov}(\xi_{t_1,j},\xi_{t_2,j}) \lesssim n + \sum_{t_1\neq t_2}\{\mathbb{E}(|\xi_{t_1,j}|^{m})\}^{1/m}\{\mathbb{E}(|\xi_{t_2,j}|^{m})\}^{1/m}\alpha^{(m-2)/m}(|t_1-t_2|) \lesssim n + n\sum_{\kappa=1}^{n}\alpha^{(m-2)/m}(\kappa) \lesssim n \tag{A2}$$

for any $j\in[p]$. For $q>2$ and $j\in[p]$, Theorem 6.3 of [30] yields

$$\mathbb{E}\Big|\sum_{t=1}^{n}\xi_{t,j}\Big|^{q} \lesssim a_q s_{n,j}^{q} + n b_q \int_0^1 [\alpha^{-1}(u)\wedge n]^{q-1}\sup_{t\in[n]}Q_{t,j}(u)^{q}\,du,$$

where $a_q, b_q>0$ are two constants depending only on $q$, $s_{n,j}^{2}=\sum_{t_1,t_2=1}^{n}|\operatorname{Cov}(\xi_{t_1,j},\xi_{t_2,j})|$, $\alpha^{-1}(u)=\sum_{\kappa\ge 0}\mathbf{1}\{u\le\alpha(\kappa)\}$ and $Q_{t,j}(u)=\inf\{x:\mathbb{P}(|\xi_{t,j}|>x)\le u\}$. By (A2), it holds that $s_{n,j}^{q}=(s_{n,j}^{2})^{q/2}\lesssim n^{q/2}$. Since $\max_{t\in[n]}\max_{j\in[p]}\mathbb{E}(|\xi_{t,j}|^{m})\le C$, we have $\max_{t\in[n]}\max_{j\in[p]}Q_{t,j}(u)\lesssim u^{-1/m}$. By the definition of $\alpha^{-1}(\cdot)$, we know that $\alpha^{-1}(u)\lesssim u^{-1/\tau}$. Thus

$$\int_0^1[\alpha^{-1}(u)\wedge n]^{q-1}\sup_{t\in[n]}Q_{t,j}(u)^{q}\,du \lesssim \int_0^1 u^{-(q-1)/\tau - q/m}\,du \le C,$$

where the last inequality follows from $\tau>(q-1)m/(m-q)$. Hence, we have

$$\mathbb{E}\Big|\sum_{t=1}^{n}\xi_{t,j}\Big|^{q}\lesssim n^{q/2}$$

for any $j\in[p]$. Combining the above results completes the proof of (A1).

Now, we begin to prove Theorem 1(i). Define

ωn=supx>0|Pmaxj[p]Sn,jxPmaxj[p]Wjx|.

Let Sˇn=(Sˇn,1,,Sˇn,2p)T=(Sn,1,Sn,1,,Sn,p,Sn,p)T and Wˇ=(Wˇ1,,Wˇ2p)T=(W1,W1,,Wp,Wp)T. Then, we have maxj[p]|Sn,j|=maxj[2p]Sˇn,j and maxj[p]|Wj|=maxj[2p]Wˇj. Then, to obtain Theorem 1(i), without loss of generality, it suffices to specify the convergence rate of ωn.

For some constant ς(0,1), let Bn=nς and Kn=n/Bn be the number of blocks and the size of each block, respectively. For simplicity, we assume Bnnς and Kn=n/Bnn1ς. We first decompose the sequence {1,,n} into Bn blocks: Gb={(b1)Kn+1,,bKn} for b[Bn]. Let gnkn be two non-negative integers such that Kn=gn+kn. We then decompose each Gb(b[Bn]) to a “large” block Ib with length gn and a “small” block Jb with length kn: Ib={(b1)Kn+1,,bKnkn} and Jb={bKnkn+1,,bKn}. Let Hb=(Hb,1,,Hb,p)T=Kn1/2tIbξt. For each b[Bn] and some Dn, define Hb+=(Hb,1+,,Hb,p+)T with Hb,j+=Hb,j1(|Hb,j|Dn)E{Hb,j1(|Hb,j|Dn)} and Hb=(Hb,1,,Hb,p)T with Hb,j=Hb,j1(|Hb,j|>Dn)E{Hb,j1(|Hb,j|>Dn)}. For each j[p], by Theorem 2 of [17], there exists an independent sequence {H˜b,j}b=1Bn such that H˜b,j has the same distribution as Hb,j+ and

E(|H˜b,jHb,j+|)0α(kn)inf{xR:P(|Hb,j+|>x)u}du.

Due to |Hb,j+|2Dn, we have inf{xR:P(|Hb,j+|>x)u}Dn for any u0, which implies

E(|H˜b,jHb,j+|)Dnα(kn). (A3)

Define S˜n=(S˜n,1,,S˜n,p)T=Bn1/2b=1BnH˜b with S˜n,j=Bn1/2b=1BnH˜b,j and

ω˜n=supx>0|Pmaxj[p]S˜n,jxPmaxj[p]Wjx|. (A4)

For any ϵ1>0, triangle inequality implies

Pmaxj[p]Sn,jxPmaxj[p]S˜n,jx+ϵ1+P|maxj[p]Sn,jmaxj[p]S˜n,j|>ϵ1Pmaxj[p]Wjx+ϵ1+ω˜n+P|maxj[p]Sn,jmaxj[p]S˜n,j|>ϵ1Pmaxj[p]Wjx+Pxϵ1<maxj[p]Wjx+ϵ1+ω˜n+P|maxj[p]Sn,jmaxj[p]S˜n,j|>ϵ1

for any x>0, then P(maxj[p]Sn,jx)P(maxj[p]Wjx)P(xϵ1<maxj[p]Wjx+ϵ1)+ω˜n+P(|maxj[p]Sn,jmaxj[p]S˜n,j|>ϵ1). Likewise, P(maxj[p]Sn,jx)P(maxj[p]Wjx)P(xϵ1<maxj[p]Wjx+ϵ1)ω˜nP(|maxj[p]Sn,jmaxj[p]S˜n,j|>ϵ1). Due to minj[p](Σn)j,jλmin(Σn)c, Lemma A.1 of [31] yields

supxRPxϵ1<maxj[p]Wjx+ϵ1ϵ1(logp)1/2

for any ϵ1>0. Thus, we can conclude that

ωnω˜n+P|maxj[p]Sn,jmaxj[p]S˜n,j|>ϵ1+ϵ1(logp)1/2. (A5)

Define Sn+=(Sn,1+,,Sn,p+)T=Bn1/2b=1BnHb+. By triangle inequality,

|maxj[p]Sn,jmaxj[p]Sn,j+|maxj[p]|Sn,jSn,j+|maxj[p]|1n1/2b=1BntJbξt,j|+maxj[p]|1Bn1/2b=1BnHb,j|.

By (A1), we have E(|Hb,j|3)C. Thus E(|Hb,j|3)E(|Hb,j|3)C, and

E(|Hb,j|2)E{|Hb,j|21(|Hb,j|>Dn)}E(|Hb,j|3)Dn1Dn1. (A6)

Similar to (A2), we have E(|b=1BntJbξt,j|)Bn1/2kn1/2 for any j[p], and

Eb=1BnHb,j2=b=1BnE(|Hb,j|2)+b1b2cov(Hb1,j,Hb2,j)BnDn1+b1b2α13kn1(|b1b2|=1)+|b2b11|Kn1(|b1b2|>1)BnDn1+|b1b2|=1α13(kn)+|b1b2|>1α13(|b1b21|Kn)BnDn1+Bnknτ3, (A7)

where the last inequality follows from τ>3. Thus, E(|b=1BnHb,j|)Bn1/2(Dn1/2+knτ/6) and

E|maxj[p]Sn,jmaxj[p]Sn,j+|pkn1/2Kn1/2+pDn1/2+pknτ/6. (A8)

Due to H˜b,j having the same distribution as Hb,j+ and |Hb,j+|2Dn, by (A3), we have E(|H˜b,jHb,j+|s)Dnsknτ for s{2,3}. Thus, following the same arguments as in the proof of (A7), it holds that

Eb=1Bn(H˜b,jHb,j+)2b=1BnDn2knτ+Dn2kn2τ/3|b1b2|=1α13(kn)+|b1b2|>1α13(|b1b21|Kn)BnDn2knτ.

Thus, E(|b=1Bn(H˜b,jHb,j+)|)Bn1/2Dnknτ/2 and

E|maxj[p]S˜n,jmaxj[p]Sn,j+|Emaxj[p]|S˜n,jSn,j+|pDnknτ/2.

Together with (A8), we have

E|maxj[p]Sn,jmaxj[p]S˜n,j|pkn1/2Kn1/2+pDn1/2+pknτ/6+pDnknτ/2.

Let ϵ1=p1/2(logp)1/4(kn1/2Kn1/2+Dn1/2+knτ/6+Dnknτ/2)1/2. It holds by (A5) and Markov inequality that

ωnω˜n+p1/2(logp)1/4kn1/2Kn1/2+1Dn1/2+1knτ/6+Dnknτ/21/2. (A9)

Define Σ˜G=Bn1b=1Bnvar(H˜b) and Δ=|ΣnΣ˜G|, where Σn=E(SnSnT). Note that

Δ=|1Bnb=1Bnvar(H˜b)var(Hb+)+1Bnb=1Bnvar(Hb+)var(Hb)+1Bnb=1Bnvar(Hb)Σn|1Bnb=1Bn|var(H˜b)var(Hb+)|Δ1+1Bnb=1Bn|var(Hb+)var(Hb)|Δ2+|1Bnb=1Bnvar(Hb)Σn|Δ3. (A10)

In what follows, we specify the convergence rates of |Δ1|, |Δ2|, and |Δ3|, respectively. Note that the (i,j)-th element of var(H˜b)var(Hb+) is E(H˜b,iH˜b,jHb,i+Hb,j+). Since H˜b,j has the same distribution as Hb,j+ and |Hb,j+|Dn for any b[Bn] and j[p], it holds by (A3) that

|E(H˜b,iH˜b,jHb,i+Hb,j+)||E{(H˜b,iHb,i+)H˜b,j}|+|E{(H˜b,jHb,j+)H˜b,i+}|Dn2knτ

for any b[Bn] and i,j[p]. Thus, we can conclude that |Δ1|Dn2knτ. The (i,j)-th element of var(Hb+)var(Hb) is E(Hb,i+Hb,j+Hb,iHb,j). Note that E(|Hb,j|)E{|Hb,j|1(|Hb,j|>Dn)}E(|Hb,j|3)Dn2Dn2. Due to Hb,j=Hb,j++Hb,j, it holds by (A6) that

|E(Hb,i+Hb,j+Hb,iHb,j)|=|E{Hb,i+Hb,j+(Hb,i++Hb,i)(Hb,j++Hb,j)}||E(Hb,i+Hb,j)|+|E(Hb,j+Hb,i)|+|E(Hb,iHb,j)|Dn1

for any b[Bn] and i,j[p]. Thus, we can conclude that |Δ2|Dn1. The (i,j)-th element of ΣnBn1b=1Bnvar(Hb) is n1t1,t2=1nE(ξt1,iξt2,j)n1b=1Bnt1,t2IbE(ξt1,iξt2,j), and

|1nt1,t2=1nE(ξt1,iξt2,j)1nb=1Bnt1,t2IbE(ξt1,iξt2,j)|=1n|b1b2EtGb1ξt,itGb2ξt,j+b=1BnEtIbξt,itJbξt,j+tJbξt,itGbξt,j|. (A11)

Similar to the proof of (A2), we have

|EtJbξt,itGbξt,j|=|tJbcov(ξt,i,ξt,j)+t1t2:t1,t2Jbcov(ξt1,i,ξt2,j)+t1Jbt2Ibcov(ξt1,i,ξt2,j)|kn+t1t2:t1,t2Jb{E(|ξt1,i|3)}13{E(|ξt2,j|3)}13α13(|t1t2|)+t1Jbt2Ib{E(|ξt1,i|3)}13{E(|ξt2,j|3)}13α13(|t1t2|)kn.

Similarly, we can also obtain

|EtIbξt,itJbξt,j|kn.

Thus,

|b=1BnEtIbξt,itJbξt,j+tJbξt,itGbξt,j|knBn.

Analogously to the proof of (A2), if b1<b2, due to τ>2m/(m3),

|EtGb1ξt,itGb2ξt,j|t1Gb1t2Gb2{E(|ξt1,i|m)}1m{E(|ξt2,i|m)}1mαm2m(|t1t2|)δ=1Knδαm2m{(b2b11)Kn+δ}1(b2b1=1)+Kn2αm2m{(b2b11)Kn}1(b2b1>1).

Then,

b1<b2|EtGb1ξt,itGb2ξt,j|Bn+BnKn2m(m2)τmδ=1Bnδ(m2)τmBn.

The same result still holds for b1>b2. Thus, we can conclude that

b1b2|EtGb1ξt,itGb2ξt,j|Bn.

Then, by (A11), it holds that

|1nt1,t2=1nE(ξt1,iξt2,j)1nb=1Bnt1,t2IbE(ξt1,iξt2,j)|knKn

for any i,j[p]. Thus, |Δ3|knKn1. By (A10), we can conclude that

|Δ||Δ1|+|Δ2|+|Δ3|Dn2knτ+1Dn+knKn.

Let {H˜bG}b=1Bn be a sequence of independent Gaussian vectors such that H˜bG=(H˜b,1G,,H˜b,pG)TN{0p,var(H˜b)} for each b[Bn], where H˜b=(H˜b,1,,H˜b,p)T. By Theorem 1.1 of [15], the Cauchy–Schwarz inequality and Jensen’s inequality,

supx>0|Pmaxj[p]1Bn1/2b=1BnH˜b,jxPmaxj[p]1Bn1/2b=1BnH˜b,jGx|p1/4·b=1BnE(|Σ˜G1/2Bn1/2H˜b|23)p1/4Bn3/2Σ˜G1/223·b=1BnEj=1pH˜b,j23/2p7/4Bn3/2Σ˜G1/223·b=1Bnmaxj[p]E(|H˜b,j|3),

where Σ˜G=Bn1b=1Bnvar(H˜b). Note that

|λmin(Σ˜G)λmin(Σn)|Δ2p|Δ|.

Due to λmin(Σn)c, we have λmin(Σ˜G)c as long as p|Δ|=o(1). Thus, if p|Δ|=o(1), we have Σ˜G1/22C. Since Hb,j=Kn1/2tIbξt,j, (A1) yields E(|H˜b,j|3)=E(|Hb,j+|3)E(|Hb,j|3)C for any b[Bn] and j[p], which implies

supx>0|Pmaxj[p]1Bn1/2b=1BnH˜b,jxPmaxj[p]1Bn1/2b=1BnH˜b,jGx|p7/4Bn1/2 (A12)

provided that p|Δ|=o(1). By Proposition 2.1 of [16], we have

supx>0|Pmaxj[p]1Bn1/2b=1BnH˜b,jGxPmaxj[p]Wjx||Δ|1/2logp. (A13)

Then, by (A4), (A12), and (A13), we have

ω˜np7/4Bn1/2+|Δ|1/2logp

provided that p|Δ|=o(1). Together with (A9),

ωnp7/4Bn1/2+|Δ|1/2logp+p1/2(logp)1/4kn1/2Kn1/2+1Dn1/2+1knτ/6+Dnknτ/21/2

provided that p|Δ|=o(1). Select Dnn4τ/(11τ+12), knn12/(11τ+12), and ς=7τ/(11τ+12). Then, if p=o{n2τ/(11τ+12)}, we have

ωnp7/4n7τ/(22τ+24)+logpn2τ/(11τ+12)+p1/2(logp)1/4nτ/(11τ+12)p1/2(logp)1/4nτ/(11τ+12).

Hence, we complete the proof of Theorem 1(i). □

Appendix A.2. Proof of Theorem 1(ii)

Proof. 

Define {(Gb,Ib,Jb)}b=1Bn, {Hb+}b=1Bn, and {Hb}b=1Bn in the same manner as in the proof of Theorem 1(i) with Bnnς, Knn1ς, knn1ς and Dn, where ς(0,1). Let

ωn=supx>0|Pmaxj[p]Sn,jxPmaxj[p]Wjx|.

Analogously to (A5), due to minj[p](Σn)j,j>c, we have

ωnω˜n+P|maxj[p]Sn,jmaxj[p]S˜n,j|>ϵ2+ϵ2(logp)1/2 (A14)

for some ϵ2>0, where S˜n,j=Bn1/2b=1BnH˜b,j with {H˜b,j} specified in the same manner as in the proof of Theorem 1(i), and

ω˜n=supx>0|Pmaxj[p]S˜n,jxPmaxj[p]Wjx|.

Define Sn+=(Sn,1+,,Sn,p+)T=Bn1/2b=1BnHb+. By triangle inequality,

|maxj[p]Sn,jmaxj[p]Sn,j+|maxj[p]|Sn,jSn,j+|maxj[p]|1n1/2b=1BntJbξt,j|+maxj[p]|1Bn1/2b=1BnHb,j|.

Note that P(|ξt,jMn1|>x)exp(Cxγ1) for any x>0. Let γ˜=(1/γ1+1/γ2)1. By Theorem 1 of [32] and Bonferroni inequality, we have

Pmaxj[p]|1n1/2b=1BntJbξt,j|>xpBnknexpCnγ˜/2xγ˜Mnγ˜+pexpCnx2Mn2Bnkn (A15)

for any xMnn1/2. Similarly, by Theorem 1 of [32] again, for any xMnKn1/2,

P(|Hb,j|>x)=P|1Kn1/2tIbξt,j|>xKnexpCKnγ˜/2xγ˜Mnγ˜+expCx2Mn2.

Then, if Dn>Mn,

E{Hb,j21(|Hb,j|>Dn)}=20DnxP(|Hb,j|>Dn)dx+2DnxP(|Hb,j|>x)dxDn2KnexpCKnγ˜/2Dnγ˜Mnγ˜+expCDn2Mn2+KnDnxexpCKnγ˜/2xγ˜Mnγ˜dx+DnxexpCx2Mn2dxDn2KnCKnγ˜/2Dnγ˜Mnγ˜+expCDn2Mn2.

Thus, for any b[Bn] and j[p],

E(|Hb,j|2)E{Hb,j21(|Hb,j|>Dn)}Dn2KnCKnγ˜/2Dnγ˜Mnγ˜+expCDn2Mn2. (A16)

Select Dn=C*Mn{log(pn)}1/2 for some sufficiently large constant C*>0. Thus, for any x0,

Pmaxj[p]|1Bn1/2b=1BnHb,j|>xpBn1/2maxj[p]maxb[Bn]E(|Hb,j|)x(pn)1x

provided that log(pn)=o{Knγ˜/(2γ˜)}. Then, by (A15), we can conclude that for any xMnn1/2,

P|maxj[p]Sn,jmaxj[p]Sn,j+|>xpBnknexpCnγ˜/2xγ˜Mnγ˜+pexpCnx2Mn2Bnkn+(pn)1x (A17)

provided that log(pn)=o{Knγ˜/(2γ˜)}. Similar to (A3), we have

E(|H˜b,jHb,j+|)Dnα(kn)Dnexp(Cknγ2). (A18)

Select kn=C**{log(pn)}1/γ2 for some sufficiently large constant C**>0. By (A18) and triangle inequality,

P|maxj[p]S˜n,jmaxj[p]Sn,j+|>xpBn1/2maxb[Bn]maxj[p]E(|H˜b,jHb,j+|)x(pn)1x

for any x0. Thus, by (A17), for any xMnn1/2,

P|maxj[p]Sn,jmaxj[p]S˜n,j|>xpBnknexpCnγ˜/2xγ˜Mnγ˜+pexpCnx2Mn2Bnkn+(pn)1x (A19)

provided that log(pn)=o{Knγ˜/(2γ˜)}. Let ϵ2=C***Mnkn1/2Kn1/2{log(pn)}1/2 for some sufficient large constant C***>0. It holds by (A14) that

ωnω˜n+Mn{log(pn)}(2γ2+1)/2γ2Kn1/2 (A20)

provided that log(pn)=o{knγ˜/(2γ˜)Bnγ˜/(2γ˜)Knγ˜/(2γ˜)}. Define Σ˜G=Bn1b=1Bnvar(H˜b) and Δ=|ΣnΣ˜G|, where Σn=E(SnSnT). Note that

Δ=|1Bnb=1Bnvar(H˜b)var(Hb+)+1Bnb=1Bnvar(Hb+)var(Hb)+1Bnb=1Bnvar(Hb)Σn|1Bnb=1Bn|var(H˜b)var(Hb+)|Δ1+1Bnb=1Bn|var(Hb+)var(Hb)|Δ2+|1Bnb=1Bnvar(Hb)Σn|Δ3. (A21)

In what follows, we will specify the convergence rates of |Δ1|, |Δ2| and |Δ3|, respectively. Note that the (i,j)-th element of var(H˜b)var(Hb+) is E(H˜b,iH˜b,jHb,i+Hb,j+). Since H˜b,j has the same distribution as Hb,j+ and |Hb,j+|Dn for any b[Bn] and j[p], it holds by (A18) that

|E(H˜b,iH˜b,jHb,i+Hb,j+)||E{(H˜b,iHb,i+)H˜b,j}|+|E{(H˜b,jHb,j+)H˜b,i+}|(pn)1

for any b[Bn] and i,j[p]. Thus, we can conclude that |Δ1|(pn)1. The (i,j)-th element of var(Hb+)var(Hb) is E(Hb,i+Hb,j+Hb,iHb,j). Due to Hb,j=Hb,j++Hb,j, then it holds by (A16) that

|E(Hb,i+Hb,j+Hb,iHb,j)|=|E{Hb,i+Hb,j+(Hb,i++Hb,i)(Hb,j++Hb,j)}||E(Hb,i+Hb,j)|+|E(Hb,j+Hb,i)|+|E(Hb,iHb,j)|(pn)1

for any b[Bn] and i,j[p] provided that log(pn)=o{Knγ˜/(2γ˜)}. Thus, we can conclude that |Δ2|(pn)1 provided that log(pn)=o{Knγ˜/(2γ˜)}. The (i,j)-th element of ΣnBn1b=1Bnvar(Hb) is n1t1,t2=1nE(ξt1,iξt2,j)n1b=1Bnt1,t2IbE(ξt1,iξt2,j), and

|1nt1,t2=1nE(ξt1,iξt2,j)1nb=1Bnt1,t2IbE(ξt1,iξt2,j)|=1n|b1b2EtGb1ξt,itGb2ξt,j+b=1BnEtIbξt,itJbξt,j+tJbξt,itGbξt,j|. (A22)

Note that E(|ξt,j|r)Mnr for any constant integer r>0. Equation (1.12b) of [30] yields

|EtJbξt,itGbξt,j|=|tJbcov(ξt,i,ξt,j)+t1t2:t1,t2Jbcov(ξt1,i,ξt2,j)+t1Jbt2Ibcov(ξt1,i,ξt2,j)|Mn2kn+t1t2:t1,t2Jb{E(|ξt1,i|3)}13{E(|ξt2,j|3)}13α13(|t1t2|)+t1Jbt2Ib{E(|ξt1,i|3)}13{E(|ξt2,j|3)}13α13(|t1t2|)Mn2kn. (A23)

Similarly, we can also obtain

|EtIbξt,itJbξt,j|Mn2kn.

Thus,

|b=1BnEtIbξt,itJbξt,j+tJbξt,itGbξt,j|Mn2knBn.

By Equation (1.12b) of [30], if b1<b2,

|EtGb1ξt,itGb2ξt,j|t1Gb1t2Gb2{E(|ξt1,i|3)}13{E(|ξt2,j|3)}13α13(|t1t2|)Mn2δ=1Knδexp[C{(b2b11)Kn+δ}γ2]Mn21(b2b1=1)+Mn2Kn2exp{C(b2b11)γ2Knγ2}1(b2b1>1).

Thus,

b1<b2|EtGb1ξt,itGb2ξt,j|Mn2Bn+Mn2Kn2b2b1=2Bn1exp{C(b2b11)γ2Knγ2}Mn2Bn.

The same result holds for b1>b2. Thus, we can conclude that

b1b2|EtGb1ξt,itGb2ξt,j|Mn2Bn.

Note that the above upper bounds do not depend on (i,j). Then by (A22), it holds that |Δ3|Mn2knKn1. By (A21), we can conclude that

|Δ|Mn2knKn (A24)

provided that log(pn)=o{Knγ˜/(2γ˜)}. Let {H˜bG}b=1Bn be a sequence of independent Gaussian vector such that H˜bG=(H˜b,1G,,H˜b,pG)TN{0p,var(H˜b)} for any b[Bn], where H˜b=(H˜b,1,,H˜b,p)T. Due to kn{log(pn)}1/γ2, we know that minj[p](Σ˜G)j,j>c provided that Mn2{log(pn)}1/γ2=o(Kn) and log(pn)=o{Knγ˜/(2γ˜)}. Due to H˜b,j2DnMn{log(pn)}1/2, it holds that E(H˜b,j4)Dn2E(H˜b,j2)Mn4log(pn) for any b[Bn] and j[p], where the last inequality follows from E(H˜b,j2)=E(|Hb,j+|2)E(Hb,j2) and the similar arguments as in the proof of (A23). By Theorem 2.1 of [16], we have

supx>0|Pmaxj[p]S˜n,jxPmaxj[p]1Bn1/2b=1BnH˜b,jGx|Mn{log(pn)}3/2Bn1/4. (A25)

provided that Mn2{log(pn)}1/γ2=o(Kn) and log(pn)=o{Knγ˜/(2γ˜)}. By Proposition 2.1 of [16] and (A24), we have

supx>0|Pmaxj[p]1Bn1/2b=1BnH˜b,jGxPmaxj[p]Wjx||Δ|1/2logpMn{log(pn)}(2γ2+1)/2γ2Kn1/2. (A26)

By (A20), (A25) and (A26), due to γ˜=(1/γ1+1/γ2)1, we have

ωnMn{log(pn)}3/2Bn1/4+Mn{log(pn)}(2γ2+1)/2γ2Kn1/2

provided that log(pn)=o{Bnγ1γ2/(γ1+2γ2γ1γ2)Knγ1γ2/(2γ1+2γ2γ1γ2)} and Mn2{log(pn)}1/γ2=o(Kn). Select ς=2/3. Then Bnn2/3, Knn1/3 and

ωnMn{log(pn)}max{(2γ2+1)/2γ2,3/2}n1/6

provided that Mn2{log(pn)}1/γ2=o(n1/3) and {log(pn)}3=o{nγ1γ2/(2γ1+2γ2γ1γ2)}. Thus we complete the proof of Theorem 1(ii). □

Appendix B. Proof of Proposition 1

Proof. 

Define

T˚n=|1n˜t=1n˜Z˚t|,

where Z˚t=ZtE(Zt). Under H0, we know that μX=μY=:μ. Recall n1n2n and Δn=n1n2n1n2. Without loss of generality, we assume n1n2. By triangle inequality, for any j[p],

|t=1n˜Z˚t,jt=1n˜Zt,j|t=1n1|n22n1(n1+n2)n1(n1+n2)|+t=n1+1n2|n1(n1+n2)|=O(Δn).

Thus |TnT˚n|=O(Δnn1/2). Write δn=Δnn1/2πn, where πn>0 diverges at a sufficiently slow rate. Thus, we have

P(Tnx)P(T˚nx+δn)+P(|TnT˚n|>δn)P(TnGx+δn)+supxR|P(T˚nx)P(TnGx)|+o(1)P(TnGx)+supxRP(xδnTnGx+δn)+supxR|P(T˚nx)P(TnGx)|+o(1).

Analogously, we can also obtain that P(Tnx)P(TnGx)supxRP(xδnTnGx+δn)supxR|P(T˚nx)P(TnGx)|o(1). Thus,

supxR|P(Tnx)P(TnGx)|supxRP(xδnTnGx+δn)+supxR|P(T˚nx)P(TnGx)|+o(1).

In Case1, by Assumption 1(iii), we have minj[p](Ξn˜)j,j>c. Then by Lemma A.1 of [31], due to Δn2logp=o(n), we have supxRP(xδnTnGx+δn)Δnn1/2πn(logp)1/2=o(1). By Assumption 1(i), we have maxt[n˜]maxj[p]E(|Z˚t,j|m)C. Note that Ξn˜=E(n˜1/2t=1n˜Z˚t,n˜1/2t=1n˜Z˚tT). Then by Assumption 1 and Theorem 1(i), due to 3m/(m4)>max{2m/(m3),3}, we have supxR|P(T˚nx)P(TnGx)|=o(1) provided that p2logp=o{n4τ/(11τ+12)}. Thus, if p2logp=o{n4τ/(11τ+12)},

supxR|P(Tnx)P(TnGx)|=o(1).

Similarly, in Case2, by Assumption 2 and Theorem 1(ii) with (Mn,γ1,γ2)=(C,2,1), we have supxRP(xδnTnGx+δn)Δnn1/2πn(logp)1/2=o(1) and supxR|P(T˚nx)P(TnGx)|=o(1) provided that log(pn)=o(n1/9). Thus, if log(pn)=o(n1/9),

supxR|P(Tnx)P(TnGx)|=o(1).

We complete the proof of Proposition 1. □

Appendix C. Proof of Theorem 2

Appendix C.1. Proof of Theorem 2 under Case3

Proof. 

By Proposition 1 under Case1, it suffices to show

supxR|P(T^nGx|E)P(TnGx)|=op(1).

Recall TnG=|G| with GN(0,Ξn˜) and T^nG=|n˜1/2t=1n˜(ZtZ¯)ϱt|, where Ξn˜=var(n˜1/2t=1n˜Zt). Let

Ξ^n˜=1n˜b=1BtIb(ZtZ¯)tIb(ZtZ¯)T. (A27)

By Proposition 2.1 of [16], we have

supxR|P(T^nGx|E)P(TnGx)|Γ1/2logp, (A28)

where

Γ=|Ξn˜Ξ^n˜|=1n˜|b=1BtIb(ZtZ¯)tIb(ZtZ¯)Tvart=1n˜Zt|.

Let Z˚t=(Z˚t,1,,Z˚t,p)T=ZtE(Zt). Then, for any i,j[p], triangle inequality yields

|b=1BtIb(Zt,iZ¯i)tIb(Zt,jZ¯j)Et=1n˜Z˚t,it=1n˜Z˚t,j|=|b=1BtIb(Z˚t,iZ˚¯i)tIb(Z˚t,jZ˚¯j)Et=1n˜Z˚t,it=1n˜Z˚t,j||b=1BtIb(Z˚t,iZ˚¯i)tIb(Z˚t,jZ˚¯j)b=1BEtIbZ˚t,itIbZ˚t,j|+|b=1BEtIbZ˚t,itIbZ˚t,jEt=1n˜Z˚t,it=1n˜Z˚t,j||b=1Bt1,t2Ib{Z˚t1,iZ˚t2,jE(Z˚t1,iZ˚t2,j)}|I1,i,j+Sn˜|t=1n˜Z˚t,it=1n˜Z˚t,j|I2,i,j+|b=1BEtIbZ˚t,itIbZ˚t,jEt=1n˜Z˚t,it=1n˜Z˚t,j|I3,i,j.

In what follows, we will specify the upper bounds of I1,i,j, I2,i,j and I3,i,j, respectively. Without loss of generality, we assume n˜=BS with Bnϑ and Sn1ϑ. By Assumption 1(i), it holds that maxt[n˜]maxj[p]E(|Z˚t,j|m)C for some m>4. Then, due to τ>3m/(m4), (A1) yields

Et1,t2Ib{Z˚t1,iZ˚t2,jE(Z˚t1,iZ˚t2,j)}2EtIbZ˚t,i2tIbZ˚t,j2S2.

By triangle inequality,

Eb=1Bt1,t2Ib{Z˚t1,iZ˚t2,jE(Z˚t1,iZ˚t2,j)}2BS2+b=1B1s=1Bb|t1,t2Ibt3,t4Ib+scov(Z˚t1,iZ˚t2,j,Z˚t3,iZ˚t4,j)|BS2+b=1B1s=1Bb|t1,t2Ibt3,t4Ib+scumi,j(t2t1,t3t1,t4t1)|+b=1B1s=1Bb|t1,t2Ibt3,t4Ib+sE(Z˚t1,iZ˚t3,i)E(Z˚t2,jZ˚t4,j)|+b=1B1s=1Bb|t1,t2Ibt3,t4Ib+sE(Z˚t1,iZ˚t4,j)E(Z˚t3,iZ˚t2,j)}|. (A29)

By Assumption 3, t1,t2Ibt3,t4Ib+scumi,j(t2t1,t3t1,t4t1)S, which implies

b=1B1s=1Bb|t1,t2Ibt3,t4Ib+scumi,j(t2t1,t3t1,t4t1)|B2S. (A30)

For any b[B1] and s[Bb], due to τ>3m/(m4), Equation (1.12b) of [30] yields

|t1Ibt3Ib+sE(Z˚t1,iZ˚t3,i)|t1Ibt3Ib+s|E(Z˚t1,iZ˚t3,i)|t1Ibt3Ib+s{E(|Zt1,i|m)}1m{E(|Zt3,i|m)}1mαm2m(t3t1)t1Ibt3Ib+sαm2m(t3t1)h=1Shαm2m{h+(s1)S}1(s=1)+S2m(m2)τm(s1)(m2)τm1(s>1).

Similarly, we also have

|t2Ibt4Ib+sE(Z˚t2,jZ˚t4,j)|1(s=1)+S2m(m2)τm(s1)(m2)τm1(s>1).

Thus,

b=1B1s=1Bb|t1,t2Ibt3,t4Ib+sE(Z˚t1,iZ˚t3,i)E(Z˚t2,jZ˚t4,j)|b=1B1s=1Bb1(s=1)+S4m2(m2)τm(s1)2(m2)τm1(s>1)B. (A31)

Analogously, we also have b=1B1s=1Bb|t1,t2Ibt3,t4Ib+sE(Z˚t1,iZ˚t4,j)E(Z˚t3,iZ˚t2,j)}|B. Combining this with (A29)–(A31), due to BS,

Eb=1Bt1,t2Ib{Z˚t1,iZ˚t2,jE(Z˚t1,iZ˚t2,j)}2B2S.

Then it holds that

I1,i,j=Op(BS1/2). (A32)

Similar to (A1), we have |(t=1n˜Z˚t,i)(t=1n˜Z˚t,j)|=Op(n). Thus, we know that

I2,i,j=Op(S). (A33)

Note that

I3,i,jb1b2|EtIb1Z˚t,itIb2Z˚t,j|.

For b1<b2, due to τ>3m/(m4), Equation (1.12b) of [30] yields

|EtIb1Z˚t,itIb2Z˚t,j|s=1Ss{E(|Zt,i|m)}1m{E(|Zt+s,j|m)}1mαm2m{s+(b2b11)S}1(b2b1=1)+S2m(m2)τm(b2b11)(m2)τm1(b2b1>1).

Thus,

b1<b2|EtIb1Z˚t,itIb2Z˚t,j|B+S2m(m2)τmb2b1>1(b2b11)(m2)τmB.

Similarly, we can also obtain b1>b2|E{(tIb1Z˚t,i)(tIb2Z˚t,j)}|B, which implies I3,i,jB. Then by (A32) and (A33), it holds that

1n˜|b=1BtIb(Zt,iZ¯i)tIb(Zt,jZ¯j)Et=1n˜Z˚t,it=1n˜Z˚t,j|=Op(S1/2).

Then, by Markov’s inequality,

Γ=|Ξn˜Ξ^n˜|=Op(p2S1/2). (A34)

By (A28), due to Sn1ϑ, it holds that

supxR|P(T^nGx|E)P(TnGx)|=op(1) (A35)

provided that plogp=o{n(1ϑ)/4}.

Recall cv^α=inf{x>0:P(T^nG>x|E)α}. For any ϵ>0, let cvα(ϵ) and cvα(ϵ) be two constants which satisfy P{TnG>cvα(ϵ)}=α+ϵ and P{TnG>cvα(ϵ)}=αϵ, respectively. We claim that for any ϵ>0, it holds that P{cvα(ϵ)<cv^α<cvα(ϵ)}1 as n. Otherwise, if cv^αcvα(ϵ), by (A35), we have

α=P(T^nG>cv^α|E)P{T^nG>cvα(ϵ)|E}=P{TnG>cvα(ϵ)}+op(1)=α+ϵ+op(1),

which is a contradiction with probability approaching one as n. Analogously, if cv^αcvα(ϵ), by (A35), we have

α=P(T^nG>cv^α|E)P{T^nG>cvα(ϵ)|E}=P{TnG>cvα(ϵ)}+op(1)=αϵ+op(1),

which is also a contradiction with probability approaching one as n.

For any ϵ>0, define the event E1,ϵ={cvα(ϵ)<cv^α<cvα(ϵ)}. Then P(E1,ϵ)1 as n. On the one hand, by Proposition 1,

P(Tn>cv^α)P(Tn>cv^α|E1,ϵ)+P(E1,ϵc)P{Tn>cvα(ϵ)}+o(1)=P{TnG>cvα(ϵ)}+o(1)=α+ϵ+o(1),

which implies that lim¯nP(Tn>cv^α)α+ϵ. On the other hand, by Proposition 1,

P(Tn>cv^α)P(Tn>cv^α|E1,ϵ)P{Tn>cvα(ϵ)}P(E1,ϵc)P{TnG>cvα(ϵ)}o(1)=αϵo(1),

which implies that lim_nP(Tn>cv^α)αϵ. Since P(Tn>cv^α) does not depend on ϵ, by letting ϵ0+, we have limnP(Tn>cv^α)=α. Thus we complete the proof of Theorem 2 under Case3. □

Appendix C.2. Proof of Theorem 2 under Case4

Proof. 

By Proposition 1 under Case2 and the arguments in Appendix C.1, it suffices to show

supxR|P(T^nGx|E)P(TnGx)|Γ1/2logp=op(1),

where

Γmaxi,j[p]n˜1|I1,i,j|+maxi,j[p]n˜1|I2,i,j|+maxi,j[p]n˜1|I3,i,j|

with I1,i,j, I2,i,j and I3,i,j specified in Appendix C.1. In what follows, we will specify the upper bounds of maxi,j[p]|I1,i,j|, maxi,j[p]|I2,i,j| and maxi,j[p]|I3,i,j|, respectively.

Without loss of generality, we assume n˜=BS with Bn˜ϑ and Sn˜1ϑ for some ϑ[1/2,1). Let Wb,i,j=t1,t2IbZ˚t1,iZ˚t2,jE(t1,t2IbZ˚t1,iZ˚t2,j). For Rn>C*S with some sufficiently large constant C*>0, denote Wb,i,j+=Wb,i,j1(|Wb,i,j|Rn)E{Wb,i,j1(|Wb,i,j|Rn)} and Wb,i,j=Wb,i,j1(|Wb,i,j|>Rn)E{Wb,i,j1(|Wb,i,j|>Rn)}. Then for some Cn>0, it holds by Bonferroni inequality that

Pmaxi,j[p]|b=1BWb,i,j|>n˜xp2maxi,j[p]P|b=1BWb,i,j+|+|b=1BWb,i,j|>n˜xp2maxi,j[p]P|b=1BWb,i,j+|>n˜xCn+P|b=1BWb,i,j|>Cn

for all x>Cnn˜1. Note that

E{Wb,i,j21(|Wb,i,j|>Rn)}=20RnxP(|Wb,i,j|>Rn)dx+2RnxP(|Wb,i,j|>x)dx

By Assumptions 2(i)–(ii) and Cauchy–Schwarz inequality, E(t1,t2IbZ˚t1,iZ˚t2,j)S. By Assumptions 2(i)–(ii) again and Theorem 1 of [32], we know that

P|tIbZ˚t,i|>xSexp(Cx2/3)+exp(CS1x2)

for any x. Thus, for any x>CS, we have

P(|Wb,i,j|>x)P|tIbZ˚t,i||tIbZ˚t,j|>CxP|tIbZ˚t,i|>Cx1/2+P|tIbZ˚t,j|>Cx1/2Sexp(Cx1/3)+exp(CS1x).

Due to Rn>CS, we can show that

E(|Wb,i,j|2)E{Wb,i,j21(|Wb,i,j|>Rn)}Rn2Sexp(CRn1/3)+Rn2exp(CS1Rn).

Selecting Rn=C**Slog(pn) for some sufficiently large constant C**>0, and CnB1/2, it holds by Markov’s inequality that

p2maxi,j[p]P|b=1BWb,i,j|>Cnp2B1/2maxi,j[p]maxb[B]E(|Wb,i,j|)=o(1)

provided that log(pn)=o(S1/2). Due to |Wb,i,j+|2Rn, by Theorem 1 of [33],

P|b=1BWb,i,j+|>n˜xCnexpCn˜2x2BRn2+Rnn˜x(logB)(loglogB)

for any x>CB1/2S1. Thus, we can conclude that

maxi,j[p]n˜1|I1,i,j|=maxi,j[p]|1n˜b=1BWb,i,j|=Op{log(pn)}3/2B1/2

provided that log(pn)=o[min{S1/2,B(lognloglogn)2}]. By Bonferroni inequality and Theorem 1 of [32], we know that

Pmaxi,j[p]n˜1|I2,i,j|>xp2maxi[p]P|t=1n˜Z˚t,i|>CS1/2nx1/2p2nexp(CS1/3n2/3x1/3)+p2exp(CS1nx)

for any xSn2. Then, we can conclude that maxi,j[p]n˜1|I2,i,j|=Op{B1log(pn)} provided that log(pn)=o(n1/2). Finally, Equation (1.12b) of [30] yields

|b=1BEtIbZ˚t,itIbZ˚t,jEt=1n˜Z˚t,it=1n˜Z˚t,j|b=1B1κ=1Bb|t1Ibt2Ib+κE(Z˚t1,iZ˚t2,j)|b=1B1κ=1Bbδ=1Bδexp[C{δ+(κ1)S}]b=1B1δ=1Bδexp(Cδ)+b=1B1κ=2BbB2exp{C(κ1)S}B+B3exp(CS)B,

which implies maxi,j[p]n˜1|I3,i,j|=O(S1). Thus,

Γ=Op{log(pn)}3/2B1/2+1S (A36)

provided that log(pn)=o[min{S1/2,B(lognloglogn)2}]. It holds that

supxR|P(T^nGx|E)P(TnGx)|Γ1/2logp=op(1)

provided that log(pn)=o[nmin{(1ϑ)/2,ϑ/7}]. The proof of the second result of Theorem 2 under Case4 is the same as in the proof of the second result of Theorem 2 under Case3. Thus, we complete the proof of Theorem 2. □

Appendix D. Proof of Theorem 3

Proof. 

Let s=Cπnp2S1/2 in Case3 and s=Cπn[B1/2{log(pn)}3/2+CS1] in Case4, where πn>0 diverges at a sufficiently slow rate. Then s=o(1) provided that p=o(S1/4) in Case3 and log(pn)=o(B1/3) in Case4. Define an event

Φ(s)=maxj[p]|(Ξ^n˜)j,j(Ξn˜)j,j1|s,

where Ξ^n˜ and Ξn˜ are specified in (A27) and (3), respectively. By (A34) and (A36) in Appendix C, we have

maxj[p]|(Ξ^n˜)j,j(Ξn˜)j,j||Ξ^n˜Ξn˜|=op(s)

holds under Case3 and Case4 with log(pn)=o(S1/2). By Assumption 1(iii) and Assumption 2(iii), we know that minj[p](Ξn˜)j,j>c holds under Case3 and Case4. Therefore,

maxj[p]|(Ξ^n˜)j,j(Ξn˜)j,j1|maxj[p]|(Ξ^n˜)j,j(Ξn˜)j,j|minj[p](Ξn˜)j,j=op(s).

Then it holds that P{Φc(s)|E}=op(1) under Case3 and Case4. Let ϱ=maxj[p](Ξn˜)j,j. Restricted on Φ(s), there exists a constant C0>0 such that

E(T^nG|E)C0(logp)1/2maxj[p](Ξ^n˜)j,j1/2(1+s)1/2C0(logp)1/2ϱ1/2.

By Borell inequality for Gaussian process,

P{T^nG>E(T^nG|E)+x|E}2expx22maxj[p](Ξ^n˜)j,j

for any x>0. Let x0=ϱ1/2(1+s)1/2[C0(logp)1/2+{2log(4/α)}1/2]. Restricted on Φ(s), we have

x0E(T^nG|E)+(2ϱ)1/2(1+s)1/2log1/24α,

which implies

P{T^nG>x0,Φ(s)|E}2exp2ϱ(1+s)log(4/α)2ϱ(1+s)=α2.

Since P{Φc(s)|E}=op(1), then P{Φc(s)|E}α/4 with probability approaching one. Hence, P(T^nG>x0|E)α with probability approaching one. Similar to the proof of (A1), we know that ϱC under Case3 and Case4. By the definition of cv^α, it holds with probability approaching one that

cv^αϱ1/2(1+s)1/2[C0(logp)1/2+{2log(4/α)}1/2](logp)1/2

under Case3 with p=o(S1/4) and Case4 with log(pn)=o(B1/3S1/2). Let μX=(μX,1,,μX,p)T=E(Xt), μY=(μY,1,,μY,p)T=E(Yt) and j0=argmaxj[p]|μX,jμY,j|, then

Tn=n1n2n1+n2|μ^Xμ^Y|n1n2n1+n2|1n1t=1n1Xt,j01n2t=1n2Yt,j0|=n1n2n1+n2|1n1t=1n1(Xt,j0μX,j0)1n2t=1n2(Yt,j0μY,j0)+μX,j0μY,j0|n1n2n1+n2|μX,j0μY,j0|n1n2n1+n2|1n1t=1n1(Xt,j0μX,j0)1n2t=1n2(Yt,j0μY,j0)|.

Similar to the proof of (A1), we know that

|1n1t=1n1(Xt,j0μX,j0)1n2t=1n2(Yt,j0μY,j0)|=Op(n1/2)

under Case3 and Case4. If |μX,j0μY,j0|n1/2, we can conclude that P(TnC*n1/2|μX,j0μY,j0|)1 as n for some constant C*>0. Due to cv^α(logp)1/2=o(n1/2|μX,j0μY,j0|) under Case3 and Case4, we have that Theorem 3 holds under Case3 and Case4. □

Appendix E. Additional Simulation Results

Table A1.

The Type I error rates, expressed as percentages, were computed from independently generated sequences {Xt}t=1n1 and {Yt}t=1n2 based on (M2). The simulations were replicated 1000 times.

(n1,n2) ρ p Yang Dempster BS SD CLX
(200,220) 0 50 4.6 2.3 6.4 4.5 3
200 3.3 0 6.7 5.8 3.4
400 3.7 0 5 4.4 4.1
800 3.3 0 6.4 5.2 4.2
0.1 50 6.2 12.4 23.4 18.4 9.8
200 5.2 1.8 43.2 39.5 12.9
400 5.8 0.1 63 59.7 14.7
800 5.3 0 88.7 87.3 19.4
0.2 50 8.1 35.6 51.3 44.3 21.9
200 7.7 23.5 87.2 85.5 37.9
400 9.2 16.9 98.5 98.3 43
800 9.6 9.8 100 100 54.5
(400,420) 0 50 4.9 1.8 5.4 3.6 3.5
200 3.5 0 6.4 5.3 3.2
400 5 0 5.7 4.8 4.4
800 4.3 0 6 5 4.5
0.1 50 6.9 12.2 21.7 17 9.3
200 4.9 1.9 41.7 39 11.1
400 7.8 0.1 63.6 61.4 17.7
800 7.3 0 87.9 86.7 18.3
0.2 50 8.6 33.7 46.9 40.7 20.6
200 7.7 23.7 86.3 84.7 31
400 9.4 17.1 99.2 99 43.6
800 9 9.5 100 100 53.2

Figure A1.

Figure A1

The empirical powers with sparse signals were evaluated by independently generated sequences {Xt}t=1n1 based on (M2), f(·)=0 and ρ=0, and {Yt}t=1n2 based on (M2), f(·)=f1(·) and ρ=0. The parameter a represents the distance between the null and alternative hypotheses. The simulations were replicated 1000 times.

Figure A2.

Figure A2

The empirical powers with dense signals were evaluated by independently generated sequences {Xt}t=1n1 based on (M2), f(·)=0 and ρ=0, and {Yt}t=1n2 based on (M2), f(·)=f2(·) and ρ=0. The parameter a represents the distance between the null and alternative hypotheses. The simulations were replicated 1000 times.

Table A2.

The Type I error rates, expressed as percentages, were computed from independently generated sequences {Xt}t=1n1 and {Yt}t=1n2 based on (M3). The simulations were replicated 1000 times.

(n1,n2) ρ p Yang Dempster BS SD CLX
(200,220) 0 50 5.7 16.8 7.7 3 1.6
200 4.3 14.9 6.9 0.9 1.6
400 3.5 14.7 7.7 0.2 1.2
800 4.2 15.4 6.9 0.2 1.7
0.1 50 7.9 25.2 13.7 5.5 5.4
200 6.2 23 12 2.7 5
400 5.5 23.3 12.5 1.2 4.2
800 6.9 24 12.9 0.7 5.5
0.2 50 8.6 33.8 21 10.7 12.8
200 7.5 32.5 19.7 5.8 13.9
400 6.9 30.4 20.2 4.4 15
800 9.3 32.3 20.7 1.7 18.4
(400,420) 0 50 5.4 13.9 6.7 1.7 1.6
200 5.1 15.5 6.4 1 1.1
400 5.3 14.1 7.1 0.8 1.3
800 4 16.2 6.3 0.1 1.3
0.1 50 6.9 21.3 10.7 4.7 5.4
200 6.6 23.1 12.5 2.7 4.9
400 7.3 22 11.4 1.7 5.9
800 6.2 23.8 12.7 0.6 5.4
0.2 50 8.2 31.8 18.2 8.6 11
200 8.2 31 19.6 5.2 13.5
400 8.7 32.6 18.9 3.9 14.7
800 7.4 35.2 21.3 1.8 17.3

Figure A3.

Figure A3

The empirical powers with sparse signals were evaluated by independently generated sequences {Xt}t=1n1 based on (M3), f(·)=0 and ρ=0, and {Yt}t=1n2 based on (M3), f(·)=f1(·) and ρ=0. The parameter a represents the distance between the null and alternative hypotheses. The simulations were replicated 1000 times.

Figure A4.

Figure A4

The empirical powers with dense signals were evaluated by independently generated sequences {Xt}t=1n1 based on (M3), f(·)=0 and ρ=0, and {Yt}t=1n2 based on (M3), f(·)=f2(·) and ρ=0. The parameter a represents the distance between the null and alternative hypotheses. The simulations were replicated 1000 times.

Data Availability Statement

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The author declares no conflict of interest.

Funding Statement

This research received no external funding.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

References

1. Hotelling H. The generalization of Student’s ratio. Ann. Math. Stat. 1931;2:360–378. doi: 10.1214/aoms/1177732979.
2. Hu J., Bai Z. A review of 20 years of naive tests of significance for high-dimensional mean vectors and covariance matrices. Sci. China Math. 2016;59:2281–2300. doi: 10.1007/s11425-016-0131-0.
3. Harrar S.W., Kong X. Recent developments in high-dimensional inference for multivariate data: Parametric, semiparametric and nonparametric approaches. J. Multivar. Anal. 2022;188:104855. doi: 10.1016/j.jmva.2021.104855.
4. Bai Z., Saranadasa H. Effect of high dimension: By an example of a two sample problem. Stat. Sin. 1996;6:311–329.
5. Dempster A.P. A high dimensional two sample significance test. Ann. Math. Stat. 1958;29:995–1010. doi: 10.1214/aoms/1177706437.
6. Srivastava M.S., Du M. A test for the mean vector with fewer observations than the dimension. J. Multivar. Anal. 2008;99:386–402. doi: 10.1016/j.jmva.2006.11.002.
7. Gregory K.B., Carroll R.J., Baladandayuthapani V., Lahiri S.N. A two-sample test for equality of means in high dimension. J. Am. Stat. Assoc. 2015;110:837–849. doi: 10.1080/01621459.2014.934826.
8. Cai T.T., Liu W., Xia Y. Two-sample test of high dimensional means under dependence. J. R. Stat. Soc. Ser. B Stat. Methodol. 2014;76:349–372.
9. Chang J., Zheng C., Zhou W.X., Zhou W. Simulation-based hypothesis testing of high dimensional means under covariance heterogeneity. Biometrics. 2017;73:1300–1310. doi: 10.1111/biom.12695.
10. Xu G., Lin L., Wei P., Pan W. An adaptive two-sample test for high-dimensional means. Biometrika. 2017;103:609–624. doi: 10.1093/biomet/asw029.
11. Chernozhukov V., Chetverikov D., Kato K. Testing Many Moment Inequalities. Cemmap Working Paper No. CWP42/16. Centre for Microdata Methods and Practice (cemmap); London, UK: 2016.
12. Zhang D., Wu W.B. Gaussian approximation for high dimensional time series. Ann. Stat. 2017;45:1895–1919. doi: 10.1214/16-AOS1512.
13. Wu W.B. Nonlinear system theory: Another look at dependence. Proc. Natl. Acad. Sci. USA. 2005;102:14150–14154. doi: 10.1073/pnas.0506715102.
14. Chang J., Chen X., Wu M. Central limit theorems for high dimensional dependent data. Bernoulli. 2024;30:712–742. doi: 10.3150/23-BEJ1614.
15. Raič M. A multivariate Berry–Esseen theorem with explicit constants. Bernoulli. 2019;25:2824–2853.
16. Chernozhukov V., Chetverikov D., Kato K., Koike Y. Improved central limit theorem and bootstrap approximations in high dimensions. Ann. Stat. 2022;50:2562–2586. doi: 10.1214/22-AOS2193.
17. Peligrad M. Some remarks on coupling of dependent random variables. Stat. Probab. Lett. 2002;60:201–209. doi: 10.1016/S0167-7152(02)00318-8.
18. Künsch H.R. The jackknife and the bootstrap for general stationary observations. Ann. Stat. 1989;17:1217–1241. doi: 10.1214/aos/1176347265.
19. Liu R.Y. Bootstrap procedures under some non-i.i.d. models. Ann. Stat. 1988;16:1696–1708. doi: 10.1214/aos/1176351062.
20. Hill J.B., Li T. A global wavelet based bootstrapped test of covariance stationarity. arXiv. 2022. arXiv:2210.14086.
21. Fang X., Koike Y. High-dimensional central limit theorems by Stein’s method. Ann. Appl. Probab. 2021;31:1660–1686. doi: 10.1214/20-AAP1629.
22. Chernozhukov V., Chetverikov D., Koike Y. Nearly optimal central limit theorem and bootstrap approximations in high dimensions. Ann. Appl. Probab. 2023;33:2374–2425. doi: 10.1214/22-AAP1870.
23. Chang J., He J., Yang L., Yao Q. Modelling matrix time series via a tensor CP-decomposition. J. R. Stat. Soc. Ser. B Stat. Methodol. 2023;85:127–148. doi: 10.1093/jrsssb/qkac011.
24. Koike Y. High-dimensional central limit theorems for homogeneous sums. J. Theor. Probab. 2023;36:1–45. doi: 10.1007/s10959-022-01156-2.
25. Hörmann S., Kokoszka P. Weakly dependent functional data. Ann. Stat. 2010;38:1845–1884. doi: 10.1214/09-AOS768.
26. Zhang X. White noise testing and model diagnostic checking for functional time series. J. Econom. 2016;194:76–95. doi: 10.1016/j.jeconom.2016.04.004.
27. Politis D.N., Romano J.P., Wolf M. Subsampling. Springer Series in Statistics. Springer; Berlin/Heidelberg, Germany: 1999.
28. Chernozhukov V., Chetverikov D., Kato K. Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors. Ann. Stat. 2013;41:2786–2819. doi: 10.1214/13-AOS1161.
29. Zhou Z. Heteroscedasticity and autocorrelation robust structural change detection. J. Am. Stat. Assoc. 2013;108:726–740. doi: 10.1080/01621459.2013.787184.
30. Rio E. Inequalities and Limit Theorems for Weakly Dependent Sequences. Lecture Notes, 3rd Cycle; France, 2013; p. 170. Available online: https://cel.hal.science/cel-00867106v2 (accessed on 8 December 2023).
  • 31.Chernozhukov V., Chetverikov D., Kato K. Central limit theorems and bootstrap in high dimensions. Ann. Probab. 2017;45:2309–2352. doi: 10.1214/16-AOP1113. [DOI] [Google Scholar]
  • 32.Merlevède F., Peligrad M., Rio E. A Bernstein type inequality and moderate deviations for weakly dependent sequences. Probab. Theory Relat. Fields. 2011;151:435–474. doi: 10.1007/s00440-010-0304-9. [DOI] [Google Scholar]
  • 33.Merlevède F., Peligrad M., Rio E. High Dimensional Probability V: The Luminy Volume. Volume 5. Institute of Mathematical Statistics; Waite Hill, OH, USA: 2009. Bernstein inequality and moderate deviations under strong mixing conditions; pp. 273–292. [Google Scholar]

Data Availability Statement

The data used to support the findings of this study are included within the article.


Articles from Entropy are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)