Homogeneity tests for one-way models with dependent errors under correlated groups

Yuichi Goto; Koichi Arakaki; Yan Liu; Masanobu Taniguchi

doi:10.1007/s11749-022-00828-9

2022 Sep 2;32(1):163–183. doi: 10.1007/s11749-022-00828-9


PMCID: PMC9438895 PMID: 36091581

Abstract

We consider the problem of testing for the existence of fixed effects and random effects in one-way models, where the groups are correlated and the disturbances are dependent. The classical F-statistic in the analysis of variance is not asymptotically distribution-free in this setting. To overcome this problem, we propose a new test statistic for this problem without any distributional assumptions, so that the test statistic is asymptotically distribution-free. The proposed test statistic takes the form of a natural extension of the classical F-statistic in the sense of distribution-freeness. The new tests are shown to be asymptotically size $α$ and consistent. The nontrivial power under local alternatives is also elucidated. The theoretical results are justified by numerical simulations for the model with disturbances from linear time series with innovations of symmetric random variables, heavy-tailed variables, and skewed variables, and furthermore from GARCH models. The proposed test is applied to log-returns for stock prices and uncovers random effects in sectors.

Supplementary Information

The online version contains supplementary material available at 10.1007/s11749-022-00828-9.

Keywords: Dynamic panel data, Fixed effect, Homogeneity test, Longitudinal data, One-way model, Random effect

Introduction

Longitudinal data and panel data are omnipresent in the real world. Statistical methods to analyze such data have been studied for several decades (Diggle et al. 2002). The methods have a wide range of applications, e.g., analysis of stress in mothers (Zeger et al. 1985), the weight of infants (Hoover et al. 1998), and COVID-19 data (Bernardes et al. 2020; Lucas et al. 2020).

The analysis of variance (ANOVA) is a common method to test for equality among groups. An F-statistic, defined as the ratio of variance between groups to variance within groups, is designed to test for the homogeneity of groups for independent and identically distributed (i.i.d.) data. Numerous papers are devoted to ANOVA and related topics for i.i.d. data (see, e.g., Searle et al. 1992; Rashid 1995; Clarke 2008; Liu and Xu 2016, and references therein). By contrast, the F-statistic is not valid for dependent data. To resolve the issue, Nagahata and Taniguchi (2018) studied a test for the equality of means among groups based on the Whittle likelihood for multivariate one-way fixed effect models. Their statistic can be rephrased as the classical F-statistic rescaled by the spectral density of disturbances. They showed that their statistic is asymptotically Chi-square distributed, although they did not derive the consistency of the test and assumed independence of groups.

One-way models for time series are closely related to the analysis of longitudinal data and dynamic panel data. For dynamic panel data, Baltagi and Li (1991) constructed a consistent estimator of the variance of random effects for dynamic panel data models with errors from the autoregressive process of order 1 (AR(1)), provided that the number of groups and the sample size tend to infinity. Galbraith and Zinde-Walsh (1995) dealt with error components models for panel data models with errors from the autoregressive moving-average process of orders p and q. You and Zhou (2013) advocated semiparametric panel data partially linear additive models with errors from the AR(1). The statistical methods for longitudinal data have also been intensively investigated. For example, Tang and Leng (2011) estimated regression coefficients by the empirical likelihood. Li (2011) constructed an efficient estimator for semiparametric regression models. A panel data model with common shocks was proposed by Bai and Li (2014), and Ergemen and Velasco (2017) extended the model to a fractionally integrated panel data model with common shocks. Under high-dimensional settings, Zhong et al. (2019) considered a test for homogeneity of covariance matrices and constructed a change test for covariance matrices. Fang et al. (2020) proposed a test for regression parameters. However, the principal objective in these fields is the regression coefficients rather than the fixed and random effects.

The importance of fixed effects and random effects has been recognized, whereas, to the best of our knowledge, there are few references on diagnostic tests for fixed effects and random effects. On a related topic, Akharif et al. (2020) and Fihri et al. (2020) established optimal tests for the existence of random coefficients for i.i.d. data based on the local asymptotic normality for random coefficient regression models. An optimal test based on multivariate ranks for the existence of fixed effects for i.i.d. data was proposed by Hallin et al. (2021). Recently, González et al. (2021) discussed tests for the existence of fixed effects and interactions for two-way models for spatial point processes. Ditzhaus et al. (2021) proposed robust tests based on quantiles for fixed effects and interactions for i.i.d. random variables.

We propose a test for the existence of fixed or random effects in one-way models for correlated groups and derive the asymptotic null distribution. In addition, the consistency of the proposed test and the nontrivial power under the local alternatives are elucidated. The numerical study illustrates the finite sample performance of the proposed test and compares it with the classical test. In particular, we also include skewness and heteroscedasticity in the disturbance process, both of which are important in practical applications (Cook and Weisberg 1983). The classical statistic, defined in Sect. 2, assumes independence between groups, which is a major drawback in its application. The new statistic, defined in Sect. 3, relaxes the strong assumption of independence between groups. We emphasize that our setting allows us to deal with correlated groups, and thus, our proposed method has a wide range of applications.

A motivating real-data example with correlated groups is the analysis of stock prices. Stock prices can be categorized by industry. Equity-focused investors believe that the stock prices are linked by factors related to earnings. For example, stock prices of automobile companies are linked to exchange rates. In other words, equity-focused investors believe that there are random effects related to industries. Our test, which takes into account correlations between groups, can be applied to verify this hypothesis.

This paper is organized as follows: We briefly review spectra and the classical settings and test statistic in Sect. 2. In Sect. 3, we introduce the fixed effects model and propose a new test for the existence of fixed effects. In Sect. 4, we deal with the random effects model and derive the asymptotic results for the proposed test. Section 5 presents the simulation study. In Sect. 6, we apply our test for the existence of effects to the log-returns in stock prices. The discussion is provided in Sect. 7. Supplementary material includes all proofs of theorems and additional simulation results.

Preliminary

Spectral density

In the frequency-domain approach, the $L^{2}$-based spectral density is a pivotal index to describe time-dependent structures of data. To recall the definition, let $X_{t}$ be a strictly stationary process with the autocovariance function $γ_{X} (h) = E X_{t} X_{t + h}$ satisfying $\sum_{h = - \infty}^{\infty} |γ_{X} (h)| < \infty$. Then, the spectral density function is defined, for $λ \in [- π, π]$, as

\begin{matrix} f_{X} (λ) = \frac{1}{2 π} \sum_{h = - \infty}^{\infty} γ_{X} (h) e^{- i h λ} . \qquad (1) \end{matrix}

Since $γ_{X} (h) = \int_{- π}^{π} f_{X} (λ) e^{i h λ} d λ$, the information of the spectrum $f_{X} (λ)$ is equivalent to that of the autocovariance functions for all lags $\{γ_{X} (h)\}_{h \in Z}$. A multivariate spectral density function can be defined by replacing $γ_{X} (h)$ in (1) with $Γ_{X} (h) = E X_{t} X_{t + h}^{⊤}$ for a p-dimensional strictly stationary process $X_{t}$. Typical examples of spectra are the spectrum for ARMA models of orders (p,q) and the exponential type of the spectrum proposed by Bloomfield (1973), taking the forms of

\begin{matrix} f_{ARMA} (λ) & = \frac{σ^{2}}{2 π} \frac{{|1 + θ_{1} e^{- i λ} + \dots + θ_{q} e^{- i q λ}|}^{2}}{{|1 - ϕ_{1} e^{- i λ} - \dots - ϕ_{p} e^{- i p λ}|}^{2}} \\ \text{and} \quad f_{EXP} (λ) & = \frac{σ^{2}}{2 π} \exp \left(2 \sum_{r = 1}^{d} ς_{r} \cos (r λ)\right), \end{matrix}

where $σ, θ_{1}, \dots, θ_{q}, ϕ_{1}, \dots, ϕ_{p}, ς_{1}, \dots, ς_{d}$ are parameters. Other examples can be found in, e.g., Chiu (1988). We refer readers to von Sachs (2020) for a review.
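As a rough illustration of the two spectra above, the following Python sketch evaluates $f_{ARMA}$ and $f_{EXP}$ at a single frequency. The function names, argument order, and the convention of passing coefficients as lists are our own choices for this example and are not taken from any package.

```python
import cmath
import math


def arma_spectral_density(lam, phi, theta, sigma2=1.0):
    """Spectral density of an ARMA(p, q) process at frequency lam.

    phi   : AR coefficients (phi_1, ..., phi_p)
    theta : MA coefficients (theta_1, ..., theta_q)
    sigma2: innovation variance sigma^2
    """
    # Numerator polynomial 1 + theta_1 e^{-i lam} + ... + theta_q e^{-i q lam}
    ma = 1 + sum(t * cmath.exp(-1j * (k + 1) * lam) for k, t in enumerate(theta))
    # Denominator polynomial 1 - phi_1 e^{-i lam} - ... - phi_p e^{-i p lam}
    ar = 1 - sum(p * cmath.exp(-1j * (k + 1) * lam) for k, p in enumerate(phi))
    return sigma2 / (2.0 * math.pi) * abs(ma) ** 2 / abs(ar) ** 2


def exp_spectral_density(lam, varsigma, sigma2=1.0):
    """Exponential-type spectrum at frequency lam with coefficients varsigma_1..varsigma_d."""
    s = sum(c * math.cos((r + 1) * lam) for r, c in enumerate(varsigma))
    return sigma2 / (2.0 * math.pi) * math.exp(2.0 * s)


# White noise (no AR or MA terms) has the flat spectrum sigma^2 / (2 pi).
flat = arma_spectral_density(0.3, phi=[], theta=[])
# MA(1) with theta_1 = 0.5 evaluated at frequency zero.
ma1_at_zero = arma_spectral_density(0.0, phi=[], theta=[0.5])
```

With empty coefficient lists both functions return $σ^{2} / (2 π)$ at every frequency, which is a quick sanity check on an implementation.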

Classical setting and statistic

Nagahata and Taniguchi (2018) discussed one-way models with independent groups: for a fixed number of groups a, a growing sample size $n_{i}$ of the ith group $(i = 1, \dots, a)$, and a fixed dimension p of the time series in each group,

\begin{matrix} y_{it} = μ + τ_{i} + e_{it}, \quad i = 1, \dots, a ; \ t = 1, \dots, n_{i} . \end{matrix}

The classical test statistic is defined as

\begin{matrix} S_{n} = n \sum_{i = 1}^{a} {(\bar{y_{i .}} - \bar{y_{. .}})}^{T} {(2 π {\tilde{f}}_{n} (0))}^{- 1} (\bar{y_{i .}} - \bar{y_{. .}}), \end{matrix}

where $\bar{y_{i .}} = \sum_{t = 1}^{n_{i}} y_{it} / n_{i}$, $\bar{y_{. .}} = \sum_{i = 1}^{a} \sum_{t = 1}^{n_{i}} y_{it} / (a n_{i})$, and ${\tilde{f}}_{n} (0)$ is defined as

\begin{matrix} {\tilde{f}}_{n} (0) = \frac{1}{a} \sum_{i = 1}^{a} {\hat{f}}_{ii} (0) / ρ_{i}, \end{matrix}

where ${\hat{f}}_{ii} (λ)$ is given in (6), and $ρ_{i} = n_{i} / n$ with $n = \sum_{i = 1}^{a} n_{i}$. This statistic is standardized within groups, and thus, the test based on $S_{n}$ is asymptotically distribution-free in the case of independent groups (see Sect. 7). However, this property fails when groups are correlated. This paper focuses on data with correlated groups, such as stock prices, where sectors correspond to groups. We propose a test statistic, defined in (7), that is standardized not only within groups but also between groups, so that it is asymptotically distribution-free. In this sense, our statistic is a natural extension of $S_{n}$.

Test for existence of fixed effects

In this section, we scrutinize the one-way fixed effects model with dependent disturbance processes when the number of groups is fixed and the number of observations for each group diverges. Let us consider the model

\begin{matrix} y_{it} = μ + τ_{i} + e_{it}, \quad i = 1, \dots, a ; \ t = 1, \dots, n_{i}, \qquad (4) \end{matrix}

where $y_{it} = {(y_{i t 1}, \dots, y_{itp})}^{T}$ is the tth p-dimensional observation of the ith group, $μ = {(μ_{1}, \dots, μ_{p})}^{T}$ is a general mean, $τ_{i} = {(τ_{i 1}, \dots, τ_{ip})}^{T}$ is a fixed effect such that $\sum_{i = 1}^{a} τ_{i} = 0$, and $e_{it} = {(e_{i t 1}, \dots, e_{itp})}^{T}$ is a centered strictly stationary sequence. Suppose that an observed stretch $\{y_{it} ; i = 1, \dots, a, t = 1, \dots, n_{i}\}$ is available, and ${(e_{1 t}^{T}, \dots, e_{at}^{T})}^{T}$ has an ap-by-ap spectral density matrix $f (λ) = {(f_{ij} (λ))}_{i, j = 1, \dots, a}$ for $λ \in [- π, π]$. In addition, there exists $ρ_{i} \in (0, 1)$ such that $n_{i} = ρ_{i} n$ with $n = \sum_{i = 1}^{a} n_{i}$. The number of groups, the length of the time series from the ith group, and the dimension of the time series from each group at each time are denoted by a, $n_{i}$, and p, respectively. The role of p is to include the multivariate analysis of variance (MANOVA) case; $p = 1$ corresponds to the univariate ANOVA.

Remark 1

The one-way model defined in (4) appears to accommodate only one time series per group. However, the case of more than one time series per group can be handled by reconfiguring the setting as follows: taking p as pq for $q \in N$, we set

$y_{it} = {(y_{i t 11}, \dots, y_{i t 1 q}, y_{i t 21}, \dots, y_{i t p 1}, \dots y_{itpq})}^{T}$, $μ = {(μ_{1} 1_{q}^{T}, \dots, μ_{p} 1_{q}^{T})}^{T}$, $τ_{i} = {(τ_{i 1} 1_{q}^{T}, \dots, τ_{ip} 1_{q}^{T})}^{T}$, and $e_{it} = {(e_{i t 11}, \dots, e_{i t 1 q}, e_{i t 21}, \dots, e_{i t p 1}, \dots, e_{itpq})}^{T}$, where $1_{q}$ is a q-dimensional vector with all elements equal to one. Moreover, p and q can depend on i. In this case, p and q represent the dimension of the time series from each group at each time and the number of time series in each group, respectively.

Remark 2

The condition $\sum_{i = 1}^{a} τ_{i} = 0$ is not essential. When $\sum_{i = 1}^{a} τ_{i} \neq 0$, we can redefine $μ$ as $μ + \sum_{s = 1}^{a} τ_{s} / a$ and $τ_{i}$ as $τ_{i} - \sum_{s = 1}^{a} τ_{s} / a$, which leaves $μ + τ_{i}$ unchanged and makes the redefined effects sum to zero.

Let the null hypothesis $H_{0}$ and the alternative $K_{0}$ be

\begin{matrix} H_{0} : τ_{1} = \dots = τ_{a} \quad \text{vs} \quad K_{0} : τ_{i} \neq 0 \ \text{for some } i . \qquad (5) \end{matrix}

Under the assumption $\sum_{i = 1}^{a} τ_{i} = 0$, the null hypothesis is equivalent to $τ_{i} = 0$ for all $i \in \{1, \dots, a\}$.

Let ${\hat{f}}_{n} (λ) = {({\hat{f}}_{ij} (λ))}_{i, j = 1, \dots, a}$ be the nonparametric spectral density estimator defined as

\begin{matrix} {\hat{f}}_{ij} (λ) = \frac{1}{2 π} \sum_{h \in Z ; \, |h| \leq \min \{n_{i}, n_{j}\} - 1} ω \left(\frac{h}{M_{n}}\right) {\hat{Γ}}_{ij} (h) e^{- i h λ}, \quad λ \in [- π, π], \qquad (6) \end{matrix}

where $ω (x) = \int_{- \infty}^{\infty} W (t) e^{i x t} d t$ and the function $W (\cdot)$ satisfies Assumption 3.2. Here, $M_{n}$ is a positive sequence such that $M_{n} \to \infty$ and $M_{n} / \min_{i = 1, \dots, a} n_{i} \to 0$ as $\min_{i = 1, \dots, a} n_{i} \to \infty$. The sample autocovariance ${\hat{Γ}}_{ij} (h)$ is defined, for $h \in \{0, \dots, \min \{n_{i}, n_{j}\} - 1\}$, as

\begin{matrix} {\hat{Γ}}_{ij} (h) = \frac{1}{\min \{n_{i}, n_{j}\} - |h|} \sum_{t = 1}^{\min \{n_{i}, n_{j}\} - |h|} (y_{i (t + h)} - \bar{y_{i .}}) {(y_{jt} - \bar{y_{j .}})}^{T}, \end{matrix}

and, for $h \in \{- \min \{n_{i}, n_{j}\} + 1, \dots, 0\}$, as

\begin{matrix} {\hat{Γ}}_{ij} (h) = \frac{1}{\min \{n_{i}, n_{j}\} - |h|} \sum_{t = - h + 1}^{\min \{n_{i}, n_{j}\}} (y_{i (t + h)} - \bar{y_{i .}}) {(y_{jt} - \bar{y_{j .}})}^{T}, \end{matrix}

where $\bar{y_{i .}} = \sum_{t = 1}^{n_{i}} y_{it} / n_{i}$ and $\bar{y_{. .}} = \sum_{i = 1}^{a} \sum_{t = 1}^{n_{i}} y_{it} / (a n_{i})$. Let ${\hat{V}}_{n} = {({\hat{V}}_{ij})}_{i, j = 1, \dots, a}$ be

\begin{matrix} {\hat{V}}_{ij} = & \frac{2 π \min \{ρ_{i}, ρ_{j}\}}{ρ_{i} ρ_{j}} {\hat{f}}_{ij} (0) - \frac{2 π}{a} \sum_{s = 1}^{a} \left\{\frac{\min \{ρ_{s}, ρ_{j}\}}{ρ_{s} ρ_{j}} {\hat{f}}_{sj} (0) + \frac{\min \{ρ_{i}, ρ_{s}\}}{ρ_{i} ρ_{s}} {\hat{f}}_{is} (0)\right\} \\ & + \frac{2 π}{a^{2}} \sum_{s, k = 1}^{a} \frac{\min \{ρ_{s}, ρ_{k}\}}{ρ_{s} ρ_{k}} {\hat{f}}_{sk} (0) . \end{matrix}

The test statistic for $H_{0}$ is proposed as

\begin{matrix} T_{n} = n ({\bar{y_{1 .}}}^{T} - {\bar{y_{. .}}}^{T}, \dots, {\bar{y_{a .}}}^{T} - {\bar{y_{. .}}}^{T}) {\hat{V}}_{n}^{-} {({\bar{y_{1 .}}}^{T} - {\bar{y_{. .}}}^{T}, \dots, {\bar{y_{a .}}}^{T} - {\bar{y_{. .}}}^{T})}^{T}, \qquad (7) \end{matrix}

where ${\hat{V}}_{n}^{-}$ denotes the Moore–Penrose inverse of ${\hat{V}}_{n}$. Using the Moore–Penrose inverse ${\hat{V}}_{n}^{-}$ in $T_{n}$ is essential since ${\hat{V}}_{n}$ is a singular matrix. Indeed, $\sum_{i = 1}^{a} {\hat{V}}_{ij} = O_{p}$ for any j, where $O_{p}$ is a p-by-p zero matrix; thus, 0 is an eigenvalue of ${\hat{V}}_{n}$. It is worth mentioning that our proposed test statistic $T_{n}$ is scale-invariant. The spectral density matrix $f (λ)$ appears in $T_{n}$ because $\sqrt{n} ({\bar{y_{1 .}}}^{T} - {\bar{y_{. .}}}^{T}, \dots, {\bar{y_{a .}}}^{T} - {\bar{y_{. .}}}^{T})$ converges in distribution to a centered normal distribution with variance $V$, defined in Theorem 1, and $V$ is a function of $f (λ)$ (see Lemma 1 in Section A in the supplementary material).
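The singularity of ${\hat{V}}_{n}$ can be checked numerically. The sketch below is a minimal illustration for $p = 1$, assuming the Bartlett lag window $ω (x) = \max (0, 1 - |x|)$ as one possible choice of $ω$; the paper only requires Assumption 3.2, so this window, the bandwidth value, the simulated data, and all function names are assumptions made for the example.

```python
import math
import random


def bartlett_window(x):
    """Lag window w(x) = max(0, 1 - |x|); an assumed choice for this sketch."""
    return max(0.0, 1.0 - abs(x))


def sample_cross_cov(y_i, y_j, h):
    """Sample cross-covariance at lag h for two scalar series (p = 1)."""
    m = min(len(y_i), len(y_j))
    mean_i = sum(y_i) / len(y_i)
    mean_j = sum(y_j) / len(y_j)
    # 0-based versions of the summation ranges for h >= 0 and h < 0.
    t_range = range(0, m - h) if h >= 0 else range(-h, m)
    total = sum((y_i[t + h] - mean_i) * (y_j[t] - mean_j) for t in t_range)
    return total / (m - abs(h))


def f_hat_at_zero(y_i, y_j, bandwidth):
    """Lag-window estimate of the cross-spectral density f_ij at frequency 0."""
    m = min(len(y_i), len(y_j))
    max_lag = min(m - 1, int(bandwidth))  # Bartlett weights vanish beyond the bandwidth
    s = sum(bartlett_window(h / bandwidth) * sample_cross_cov(y_i, y_j, h)
            for h in range(-max_lag, max_lag + 1))
    return s / (2.0 * math.pi)


def v_hat_matrix(groups, bandwidth):
    """Doubly centered matrix V_hat for scalar groups of possibly unequal length."""
    a = len(groups)
    n = sum(len(g) for g in groups)
    rho = [len(g) / n for g in groups]
    # g_mat[i][j] = 2 pi * min(rho_i, rho_j) / (rho_i rho_j) * f_hat_ij(0)
    g_mat = [[2.0 * math.pi * min(rho[i], rho[j]) / (rho[i] * rho[j])
              * f_hat_at_zero(groups[i], groups[j], bandwidth)
              for j in range(a)] for i in range(a)]
    col_mean = [sum(g_mat[s][j] for s in range(a)) / a for j in range(a)]
    row_mean = [sum(g_mat[i][s] for s in range(a)) / a for i in range(a)]
    grand_mean = sum(sum(row) for row in g_mat) / a ** 2
    return [[g_mat[i][j] - col_mean[j] - row_mean[i] + grand_mean
             for j in range(a)] for i in range(a)]


random.seed(0)
# Hypothetical unbalanced design with three groups of Gaussian noise.
groups = [[random.gauss(0.0, 1.0) for _ in range(n_i)] for n_i in (300, 200, 300)]
V = v_hat_matrix(groups, bandwidth=5)
column_sums = [sum(V[i][j] for i in range(3)) for j in range(3)]
print(column_sums)  # each entry is zero up to floating-point rounding
```

Because every column of the matrix sums to zero, an ordinary inverse does not exist, which is why a generalized inverse is needed when forming $T_{n}$.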

To state the assumptions, we define, for random variables $\{X_{t}\}$, the cumulant of order $ℓ$ of $(X_{1}, \dots, X_{ℓ})$ as

\begin{matrix} \mathrm{cum} (X_{1}, \dots, X_{ℓ}) = \sum_{(ν_{1}, \dots, ν_{p})} {(- 1)}^{p - 1} (p - 1)! \left(E \prod_{j \in ν_{1}} X_{j}\right) \dots \left(E \prod_{j \in ν_{p}} X_{j}\right), \end{matrix}

where the summation $\sum_{(ν_{1}, \dots, ν_{p})}$ extends over all partitions $(ν_{1}, \dots, ν_{p})$ of $\{1, 2, \dots, ℓ\}$ (see Brillinger 1981, p. 19); in this formula only, p denotes the number of blocks of the partition rather than the dimension of the time series. The following assumptions are made throughout the paper.

Assumption 3.1

For all $ℓ \in N$, $(k_{1}, \dots, k_{ℓ}) \in \{1, \dots, a\}^{ℓ}$, and $(r_{1}, \dots, r_{ℓ}) \in \{1, \dots, p\}^{ℓ}$,

\begin{matrix} \sum_{s_{2}, \dots, s_{ℓ} = - \infty}^{\infty} \left(1 + \sum_{j = 1}^{ℓ} |s_{j}|\right) \left|κ_{r_{1} \dots r_{ℓ}}^{k_{1} \dots k_{ℓ}} (s_{2}, \dots, s_{ℓ})\right| < \infty, \end{matrix}

where $κ_{r_{1} \dots r_{ℓ}}^{k_{1} \dots k_{ℓ}} (s_{2}, \dots, s_{ℓ}) = \mathrm{cum} \{e_{k_{1} 0 r_{1}}, e_{k_{2} s_{2} r_{2}}, \dots, e_{k_{ℓ} s_{ℓ} r_{ℓ}}\}$.

Assumption 3.2

$W (\cdot)$ is a real, bounded, nonnegative, even function such that $\int_{- \infty}^{\infty} W (t) d t = 1$ and $\int_{- \infty}^{\infty} W^{2} (t) d t < \infty$ with a bounded derivative.

Assumption 3.3

$rank ({\hat{V}}_{n})$ converges in probability to $rank (V)$ , where $V$ is defined in Theorem 1, as ${min}_{i = 1, \dots, a} n_{i} \to \infty$ .

We briefly explain the assumptions. Assumption 3.1 is often imposed for dependent observations (see Brillinger 1981, p. 26). It implies the asymptotic normality of $\sqrt{n} ({\bar{y_{1 .}}}^{T} - {\bar{y_{. .}}}^{T}, \dots, {\bar{y_{a .}}}^{T} - {\bar{y_{. .}}}^{T})$. This assumption can be relaxed, as discussed in Remark 1 in Section A in the supplementary material. Assumption 3.2 is a natural assumption for the nonparametric spectral density estimator. In conjunction with Assumption 3.1, it ensures that ${\hat{f}}_{n} (λ)$ is a consistent estimator (see Brillinger 1981, Corollaries 5.6.1 and 5.6.2 and Theorem 5.9.1). Other conditions which ensure the consistency of the nonparametric spectral density estimator can be found in Robinson (1991). Assumption 3.3 is a technical assumption to ensure that ${\hat{V}}_{n}^{-}$ converges in probability to $V^{-}$ as $\min_{i = 1, \dots, a} n_{i} \to \infty$ (see Rakocevic 1997; Stewart 1969).

Remark 3

When we assume independence of groups and $f_{11} (0) = \dots = f_{aa} (0)$ , ${\hat{V}}_{n}$ fulfills Assumption 3.3. As an illustration, we set $p = 1$ and $a = 3$ . Then,

\begin{matrix} V = (\begin{matrix} 1 - 1 / a & - 1 / a & - 1 / a \\ - 1 / a & 1 - 1 / a & - 1 / a \\ - 1 / a & - 1 / a & 1 - 1 / a \end{matrix}) 2 π f (0), \end{matrix}

and for matrices

\begin{matrix} P = (\begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \end{matrix}) \quad \text{and} \quad B = (\begin{matrix} 1 - 1 / a & - 1 / a & - 1 / a \\ - 1 / a & 1 - 1 / a & - 1 / a \end{matrix}) 2 π f (0), \end{matrix}

it holds that

\begin{matrix} P V = (\begin{matrix} B \\ 0_{a}^{T} \end{matrix}), \end{matrix}

because the last row of $P V$ consists of the column sums of $V$, which vanish; here $0_{a}$ denotes the a-dimensional zero vector. Also, the matrix $P {\hat{V}}_{n}$ takes the form of

\begin{matrix} P {\hat{V}}_{n} = (\begin{matrix} \hat{B_{n}} \\ 0_{a}^{T} \end{matrix}), \end{matrix}

where $\hat{B_{n}}$ is an appropriate $(a - 1)$-by-a matrix. Since $B$ is a full rank matrix and the set of all full rank $(a - 1)$-by-a matrices is open, $\hat{B_{n}}$ is a full rank matrix for large n. Hence, the condition is confirmed.
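To connect this remark with the classical statistic, the following worked step may help. It is a sketch under the assumptions of Remark 3 with $p = 1$, and the shorthand $J_{a}$ for the a-by-a matrix of ones and $d$ for the vector of centered group means is ours. Since $I_{a} - J_{a} / a$ is symmetric and idempotent, it is its own Moore–Penrose inverse, so

\begin{matrix} V = 2 π f (0) (I_{a} - J_{a} / a), \quad V^{-} = \frac{1}{2 π f (0)} (I_{a} - J_{a} / a) . \end{matrix}

Writing $d = {(\bar{y_{1 .}} - \bar{y_{. .}}, \dots, \bar{y_{a .}} - \bar{y_{. .}})}^{T}$, the definition of $\bar{y_{. .}}$ gives $\sum_{i = 1}^{a} d_{i} = 0$, hence $(I_{a} - J_{a} / a) d = d$ and

\begin{matrix} n d^{T} V^{-} d = \frac{n}{2 π f (0)} \sum_{i = 1}^{a} {(\bar{y_{i .}} - \bar{y_{. .}})}^{2}, \end{matrix}

which has the same form as $S_{n}$ with the estimator ${\tilde{f}}_{n} (0)$ replaced by $f (0)$.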

Then, we obtain the following asymptotic null distribution based on Rao and Mitra (1971, Theorem 9.2.3, p. 173).

Theorem 1

Suppose Assumptions 3.1–3.3 hold. Under $H_{0}$, $T_{n}$ converges in distribution to the Chi-square distribution with r degrees of freedom as $\min_{i = 1, \dots, a} n_{i} \to \infty$, where $r = rank (V)$ and $V = {(V_{ij})}_{i, j = 1, \dots, a}$ with

\begin{matrix} V_{ij} = & \frac{2 π \min \{ρ_{i}, ρ_{j}\}}{ρ_{i} ρ_{j}} f_{ij} (0) - \frac{2 π}{a} \sum_{s = 1}^{a} \left\{\frac{\min \{ρ_{s}, ρ_{j}\}}{ρ_{s} ρ_{j}} f_{sj} (0) + \frac{\min \{ρ_{i}, ρ_{s}\}}{ρ_{i} ρ_{s}} f_{is} (0)\right\} \\ & + \frac{2 π}{a^{2}} \sum_{s, k = 1}^{a} \frac{\min \{ρ_{s}, ρ_{k}\}}{ρ_{s} ρ_{k}} f_{sk} (0) . \end{matrix}

From Theorem 1, we obtain an asymptotically size-$α$ test that rejects $H_{0}$ when $T_{n} \geq χ_{{\hat{r}}_{n}}^{2} [1 - α]$, where ${\hat{r}}_{n} = rank ({\hat{V}}_{n}^{-})$ and $χ_{{\hat{r}}_{n}}^{2} [1 - α]$ denotes the upper $α$-quantile of the Chi-square distribution with ${\hat{r}}_{n}$ degrees of freedom.

We elucidate the theoretical power of the test in the next theorem.

Theorem 2

Suppose Assumptions 3.1–3.3 hold. Under the alternative $K_{0}$ , the power of the above test based on $T_{n}$ converges to 1, as ${min}_{i = 1, \dots, a} n_{i} \to \infty$ . In other words, the test is consistent.

To see the nontrivial power of the proposed test, let us consider local alternative hypotheses. Given perturbations $h_{1}, \dots, h_{a}$ satisfying $\sum_{i = 1}^{a} h_{i} = 0$, the local alternative is defined as

\begin{matrix} K_{0}^{(n)} : τ_{i} = \frac{h_{i}}{\sqrt{n}} (i = 1, \dots, a) . \end{matrix}

Theorem 3

Suppose Assumptions 3.1–3.3 hold. Under the local alternatives $K_{0}^{(n)}$ , $T_{n}$ converges in distribution to the noncentral Chi-square distribution with r degrees of freedom and the noncentrality parameter $δ = (h_{1}^{T}, \dots, h_{a}^{T}) V^{-} {(h_{1}^{T}, \dots, h_{a}^{T})}^{T}$ , as ${min}_{i = 1, \dots, a} n_{i} \to \infty$ .

In view of this theorem, the nontrivial asymptotic power of the test under the local alternatives can be expressed as

\begin{matrix} 1 - Ψ_{r, δ} (χ_{r}^{2} [1 - α]), \end{matrix}

where $Ψ_{r, δ}$ is the cumulative distribution function of the noncentral Chi-square with r degrees of freedom and the noncentrality parameter $δ$ .
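The asymptotic local power above can be evaluated numerically. The following sketch is restricted, as a simplifying assumption of ours, to an even number of degrees of freedom r (for instance $r = 2$), because the central Chi-square distribution function then has a closed form; the noncentral distribution function is computed through its standard representation as a Poisson mixture of central Chi-square distributions. All function names and the truncation length are hypothetical choices for this example.

```python
import math


def chi2_cdf_even(x, df):
    """CDF of the central chi-square distribution for even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only supports positive even df")
    if x <= 0.0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 0.0
    for m in range(df // 2):
        if m > 0:
            term *= half / m  # term equals (x/2)^m / m!
        total += term
    return 1.0 - math.exp(-half) * total


def chi2_upper_quantile_even(alpha, df):
    """Upper alpha-quantile of the central chi-square with even df, by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, df) < 1.0 - alpha:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, df) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def local_power(delta, r, alpha=0.05, terms=200):
    """Approximate 1 - Psi_{r, delta}(chi2_r[1 - alpha]) for even r.

    The noncentral CDF is a Poisson(delta / 2) mixture of central chi-square
    CDFs with r + 2j degrees of freedom; the series is truncated after `terms`.
    """
    crit = chi2_upper_quantile_even(alpha, r)
    weight = math.exp(-delta / 2.0)  # Poisson probability of j = 0
    cdf = 0.0
    for j in range(terms):
        if j > 0:
            weight *= (delta / 2.0) / j
        cdf += weight * chi2_cdf_even(crit, r + 2 * j)
    return 1.0 - cdf


# With delta = 0 the power equals the nominal level; it grows with delta.
powers = [local_power(delta, r=2) for delta in (0.0, 1.0, 5.0, 10.0)]
```

The truncation makes the computed power very slightly too large for large $δ$; for moderate noncentrality the omitted Poisson weights are negligible.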

Remark 4

In case the number of time series in each group is greater than one ($q \geq 2$, see Remark 1), a multiple comparison problem occurs since our test provides different p-values for different orderings of the time series. For example, for $p = 1$, we obtain ${(q!)}^{a - 1}$ different p-values in total. To avoid the multiple comparison problem, we propose to use $y_{it} = {(y_{i t 1 .}, y_{i t 2 .}, \dots, y_{i t p .})}^{T}$, where $y_{i t k .} = \sum_{j = 1}^{q} y_{itkj} / q$ for $k = 1, \dots, p$, instead of ${(y_{i t 11}, \dots, y_{i t 1 q}, y_{i t 21}, \dots, y_{i t p 1}, \dots y_{itpq})}^{T}$.

Test for existence of random effects

In this section, we consider the one-way random effects model with a series of strictly stationary residuals when the number of groups is fixed and the number of observations for each group diverges. The only difference from the fixed effects model (4) is that $τ_{i}$ is a random effect of the ith group. For simplicity, we assume that ${(τ_{1}^{T}, \dots, τ_{a}^{T})}^{T}$ follows the ap-dimensional centered normal distribution with variance $Σ^{τ} = {(Σ_{ij}^{τ})}_{i, j = 1, \dots, a}$. Here, $\{τ_{j}\}$ are supposed to be independent of any disturbance process $\{e_{it} ; t = 1, \dots, n_{i}\}$. In this random effects model, the spectral density of $y_{it}$ does not exist due to the random effects.

Let the null hypothesis $H_{1}$ and the alternative $K_{1}$ for the existence of random effects be

\begin{matrix} H_{1} : Σ^{τ} = O_{ap} \quad \text{vs} \quad K_{1} : Σ^{τ} \neq O_{ap}, \qquad (8) \end{matrix}

where $O_{ap}$ is an ap-by-ap zero matrix. The test statistic $T_{n}$ , defined in (7), is still available in this situation. The following theorem shows that the asymptotic null distribution is exactly the same as that for the fixed effects model.

Theorem 4

Suppose Assumptions 3.1–3.3 hold. Under the null $H_{1}$ , $T_{n}$ converges in distribution to the Chi-square distribution with r degrees of freedom as ${min}_{i = 1, \dots, a} n_{i} \to \infty$ .

In consequence, we reject $H_{1}$ in favor of $K_{1}$ if $T_{n} \geq χ_{{\hat{r}}_{n}}^{2} [1 - α]$ . The consistency of the test is shown as follows.

Theorem 5

Suppose Assumptions 3.1–3.3 hold. Under the alternative $K_{1}$ , the proposed test is consistent. More precisely, under the alternative $K_{1}$ , $pr (T_{n} \geq χ_{{\hat{r}}_{n}}^{2} [1 - α]) \to 1$ , as ${min}_{i = 1, \dots, a} n_{i} \to \infty$ .

Now we consider the local alternative hypothesis to study the nontrivial power of the test based on $T_{n}$. Let $H = {(H_{ij})}_{i, j = 1, \dots, a}$ be an ap-by-ap symmetric, positive definite matrix, and let the local alternatives $K_{1}^{(n)}$ be defined as

\begin{matrix} K_{1}^{(n)} : Σ^{τ} = \frac{H}{n} . \end{matrix}

The nontrivial power of the proposed test is elucidated in the next result.

Theorem 6

Suppose Assumptions 3.1–3.3 hold. Under the alternatives $K_{1}^{(n)}$ , we have

\begin{matrix} \lim_{\min_{i = 1, \dots, a} n_{i} \to \infty} pr (T_{n} \geq χ_{{\hat{r}}_{n}}^{2} [1 - α]) = pr (Z^{T} V^{-} Z \geq χ_{r}^{2} [1 - α]), \end{matrix}

where $Z$ follows an ap-dimensional centered normal distribution with variance $\tilde{H} + V$. Here, $\tilde{H} = {({\tilde{H}}_{ij})}_{i, j = 1, \dots, a}$ is determined in terms of the matrix $H$ as

\begin{matrix} {\tilde{H}}_{ij} = H_{ij} - \frac{1}{a} \sum_{s = 1}^{a} (H_{sj} + H_{is}) + \frac{1}{a^{2}} \sum_{s, k = 1}^{a} H_{sk} . \end{matrix}
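The map from $H$ to $\tilde{H}$ is a double centering of the matrix. A small sketch for $p = 1$ follows; the function name and the example matrix are hypothetical, chosen only to show that the rows and columns of $\tilde{H}$ sum to zero, so that any part of $H$ that is constant along rows or columns does not contribute to the local power.

```python
def center_effect_covariance(H):
    """Return H_tilde with entries H_ij - col_mean_j - row_mean_i + grand_mean."""
    a = len(H)
    col_mean = [sum(H[s][j] for s in range(a)) / a for j in range(a)]
    row_mean = [sum(H[i][s] for s in range(a)) / a for i in range(a)]
    grand_mean = sum(sum(row) for row in H) / a ** 2
    return [[H[i][j] - col_mean[j] - row_mean[i] + grand_mean
             for j in range(a)] for i in range(a)]


# Hypothetical symmetric, positive definite H for a = 3 groups.
H = [[3.0, 1.0, 0.5],
     [1.0, 2.0, -0.5],
     [0.5, -0.5, 1.0]]
H_tilde = center_effect_covariance(H)
row_sums = [sum(row) for row in H_tilde]
col_sums = [sum(H_tilde[i][j] for i in range(3)) for j in range(3)]
```

For an identity matrix $H = I_{a}$ the result is $I_{a} - J_{a} / a$ with $J_{a}$ the matrix of ones, the same centering matrix that appears in Remark 3.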

Remark 5

We can generalize the random effects ${(τ_{1}^{⊤}, \dots, τ_{a}^{⊤})}^{⊤}$ to an ap-dimensional random vector and show corresponding theorems to Theorems 4–6.

Numerical study

The finite sample performance of the proposed test based on $T_{n}$ and comparison with the classical test based on $S_{n}$ are illustrated in this section. To be specific, we let the dimension of time series from each group at each time p, the number of time series in each group q, and the number of groups a be $p = 1$ , $q = 1$ , and $a = 3, 9$ . The sample sizes are set as (I) $n_{1} = \dots = n_{a} = 1000$ , (II) $n_{3 k - 1} = n_{3 k - 2} = 2000$ and $n_{3 k} = 1000$ for $k \leq a / 3$ , (III) $n_{1} = \dots = n_{a} = 2000$ . (I) and (III) are cases of the sample size of each group being equal (balanced design). (II) is the case of the sample size of each group being unequal (unbalanced design). For each $1 \leq t \leq {max}_{i = 1, \dots, a} n_{i}$ , denote ${(e_{1 t}, \dots, e_{at})}^{T}$ by $(e_{t}) = {(e_{it})}_{i = 1, \dots, a}$ .

We consider two scenarios, independent groups (Case 1) and correlated groups (Case 2). The disturbance process $\{e_{it}\}$ is supposed to follow a multivariate moving-average model or a generalized autoregressive conditional heteroscedasticity model. Let $\{ε_{t}\}$ be an i.i.d. sequence in the following.

As for Processes 1–3, we suppose $e_{t} = ε_{t} + Φ ε_{t - 1}$ , with the coefficient matrix $Φ = (Φ_{ij})$ , where, in Case 1, $Φ = 0.5 I_{a}$ and, in Case 2, $Φ_{3 k - 2, 3 k - 2} = 0.7$ , $Φ_{3 k - 1, 3 k - 1} = - 0.5$ , $Φ_{3 k, 3 k} = 0.3$ , $Φ_{3 k, 3 k - 2} = 0.3$ , $Φ_{3 k, 3 k - 1} = - 0.1$ for positive integer $k \leq a / 3$ ; and otherwise $Φ_{ij} = 0$ .

Process 1: In Case 1, each component of $ε_{t}$ follows a centered normal distribution with unit variance, independently of the other components of $ε_{t}$. In Case 2, $ε_{t}$ is distributed as a zero mean multivariate normal distribution with covariance matrix $Σ = (Σ_{ij})$, where $Σ_{ii} = 1$ and $Σ_{j, j + 1} = Σ_{j + 1, j} = 0.5$ for $1 \leq i \leq a$, $1 \leq j \leq a - 1$.

Process 2: In Case 1, each component of $ε_{t}$ follows a centered t-distribution with 5 degrees of freedom, independently of the other components of $ε_{t}$. In Case 2, $ε_{t}$ is distributed as a zero mean multivariate t-distribution with 5 degrees of freedom, with the scale matrix $Σ$ defined in Process 1.

Process 3: In Case 1, each component of $ε_{t}$ follows a centered skew normal distribution with location parameter 0, scale 1 and shape parameter 50, independently of the other components of $ε_{t}$. The noncentered skew normal distribution has a nonzero mean $50 \sqrt{2} / \sqrt{π (1 + 50^{2})}$. In Case 2, $ε_{t}$ is distributed as a centered multivariate skew normal distribution with location parameter $0_{a}$, correlation matrix $Σ$ defined in Process 1, and shape parameter $ζ = 50 1_{a}$, where $0_{a}$ and $1_{a}$ are a-dimensional vectors with every component being zero and one, respectively. The skewed process is found in Chan and Tong (1986). The joint density function of the multivariate skew normal distribution is given, for $x \in R^{a}$, by

\begin{matrix} f_{SN} (x ; Σ, ζ) = 2 υ_{a} (x ; Σ) Υ (ζ^{T} x), \end{matrix}

where $υ_{a} (\cdot ; Σ)$ is the probability density function of the a-dimensional centered multivariate normal distribution with correlation matrix $Σ$ and $Υ (\cdot)$ is the cumulative distribution function of the standard normal distribution. Note that the noncentered process has a nonzero mean $\sqrt{2 / (π (1 + ζ^{T} Σ ζ))} Σ ζ$ unless $ζ = 0$, so we need to subtract the mean. More details on the multivariate skew normal distribution can be found in Azzalini and Valle (1996) and Azzalini and Capitanio (1999).
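The centering constant for Process 3 follows directly from the mean formula above. The sketch below evaluates it for the univariate Case 1 and for Case 2 with $a = 3$; the function name is hypothetical, and the code only computes the mean vector to be subtracted, not the random draws themselves.

```python
import math


def skew_normal_mean(Sigma, zeta):
    """Mean sqrt(2 / (pi (1 + zeta' Sigma zeta))) * Sigma zeta of the noncentered law."""
    a = len(zeta)
    sigma_zeta = [sum(Sigma[i][j] * zeta[j] for j in range(a)) for i in range(a)]
    quad = sum(zeta[i] * sigma_zeta[i] for i in range(a))
    scale = math.sqrt(2.0 / (math.pi * (1.0 + quad)))
    return [scale * v for v in sigma_zeta]


# Case 1: univariate skew normal with shape parameter 50.
mean_case1 = skew_normal_mean([[1.0]], [50.0])

# Case 2 with a = 3: unit diagonal and 0.5 on the first off-diagonals.
Sigma = [[1.0, 0.5, 0.0],
         [0.5, 1.0, 0.5],
         [0.0, 0.5, 1.0]]
mean_case2 = skew_normal_mean(Sigma, [50.0, 50.0, 50.0])
```

In the univariate case the function reproduces the value $50 \sqrt{2} / \sqrt{π (1 + 50^{2})}$ quoted for Case 1, which gives a consistency check between the two formulas.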

Process 4: $\{e_{t}\}$ follows the generalized autoregressive conditional heteroscedasticity (GARCH) model

\begin{matrix} e_{it} = h_{it}^{1 / 2} ε_{it}, i = 1, \dots, a, (\begin{matrix} h_{1 t} \\ ⋮ \\ h_{at} \end{matrix}) = (\begin{matrix} 1 \\ ⋮ \\ 1 \end{matrix}) + 0.1 Φ (\begin{matrix} e_{1 t}^{2} \\ ⋮ \\ e_{at}^{2} \end{matrix}) + (\begin{matrix} 0.1 h_{1, t - 1} \\ ⋮ \\ 0.1 h_{a, t - 1} \end{matrix}), \end{matrix}

where $ε_{t}$ is distributed as a zero mean multivariate normal distribution with covariance $I_{a}$ in Case 1 and $Σ$ in Case 2.
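To see why Case 2 produces correlated groups, it can help to compute the matrix $2 π f (0)$ for the moving-average design of Process 1. For $e_{t} = ε_{t} + Φ ε_{t - 1}$ with innovation covariance $Σ$, summing the autocovariances over all lags gives $2 π f (0) = (I_{a} + Φ) Σ {(I_{a} + Φ)}^{T}$, a standard identity for MA(1) processes. The sketch below evaluates it for $a = 3$; the helper names are ours.

```python
def mat_mul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def ma1_long_run_covariance(Phi, Sigma):
    """Return (I + Phi) Sigma (I + Phi)^T, i.e. 2 pi f(0) for e_t = eps_t + Phi eps_{t-1}."""
    a = len(Phi)
    A = [[(1.0 if i == j else 0.0) + Phi[i][j] for j in range(a)] for i in range(a)]
    return mat_mul(mat_mul(A, Sigma), transpose(A))


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Case 1: Phi = 0.5 I_a with independent unit-variance innovations.
phi_case1 = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]
lrv_case1 = ma1_long_run_covariance(phi_case1, identity)

# Case 2 (a = 3): coefficients and innovation covariance as specified in the text.
phi_case2 = [[0.7, 0.0, 0.0], [0.0, -0.5, 0.0], [0.3, -0.1, 0.3]]
sigma_case2 = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.5], [0.0, 0.5, 1.0]]
lrv_case2 = ma1_long_run_covariance(phi_case2, sigma_case2)
```

In Case 1 the result is diagonal, whereas in Case 2 the off-diagonal entries are nonzero, which is the situation in which a statistic standardized only within groups is no longer asymptotically distribution-free.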

The R package mvtnorm (Genz et al. 2021) can be used to produce the innovation processes for Processes 1 and 2. Process 4 can be produced by the R package ccgarch (Nakatani 2014). The skew normal distribution can be generated by the R package sn (Azzalini 2022).

The features of Processes 1–4 are as follows: Process 1 is the most standard setting. The fifth and higher moments of Process 2 do not exist. Processes 3 and 4 have nonzero skewness and conditional heteroskedasticity, respectively.

We report the rejection probabilities of our proposed test based on $T_{n}$ and the classical test based on $S_{n}$ in Figs. 1, 2 and 3 over 1000 simulations for the following situations: (i) $τ = 0_{a}$; (ii) $τ = {(τ_{1}, \dots, τ_{a})}^{⊤}$, where $τ_{3 k - 2} = - 0.03$, $τ_{3 k - 1} = 0$, and $τ_{3 k} = 0.03$ for $k \leq a / 3$; and (iii) $τ$ is distributed as a zero mean multivariate normal with covariance matrix $Σ^{τ}$. We let $Σ^{τ}$ be a block diagonal matrix whose off-diagonal blocks are all $3 \times 3$ zero matrices and whose main-diagonal blocks are all the same $3 \times 3$ matrix ${\tilde{Σ}}^{τ} = ({\tilde{Σ}}_{ij}^{τ}) / 5000$, where ${\tilde{Σ}}_{11}^{τ} = 3$, ${\tilde{Σ}}_{22}^{τ} = 2$, ${\tilde{Σ}}_{33}^{τ} = {\tilde{Σ}}_{12}^{τ} = {\tilde{Σ}}_{21}^{τ} = 1$, ${\tilde{Σ}}_{23}^{τ} = {\tilde{Σ}}_{32}^{τ} = - 0.5$, and ${\tilde{Σ}}_{13}^{τ} = {\tilde{Σ}}_{31}^{τ} = 0.008$. The significance level is set to be 0.05.

Fig. 1 — Empirical size of tests for the existence of fixed and random effects based on $T_{n}$ and $S_{n}$ . The upper and lower plots correspond to $a = 3$ and $a = 9$ , respectively. The left and right plots correspond to the cases 1 (independent groups) and 2 (correlated groups), respectively. The tick marks of the x-label (I), (II), and (III) correspond to the sample size $n_{1} = \dots = n_{a} = 1000, n_{3 k - 1} = n_{3 k - 2} = 2000$ and $n_{3 k} = 1000$ for $k \leq a / 3$ , and $n_{1} = \dots = n_{a} = 2000$ , respectively

Fig. 2 — Empirical power of tests for the existence of fixed effects based on $T_{n}$ and $S_{n}$ for fixed effects $τ = (τ_{1}, \dots, τ_{a})$ , where $τ_{3 k - 2} = - 0.03$ , $τ_{3 k - 1} = 0$ , and $τ_{3 k} = 0.03$ for $k \leq a / 3$ . The upper and lower plots correspond to $a = 3$ and $a = 9$ , respectively. The left and right plots correspond to the cases 1 (independent groups) and 2 (correlated groups), respectively. The tick marks of the x-label (I), (II), and (III) correspond to the sample size $n_{1} = \dots = n_{a} = 1000$ , $n_{3 k - 1} = n_{3 k - 2} = 2000$ and $n_{3 k} = 1000$ for $k \leq a / 3$ , and $n_{1} = \dots = n_{a} = 2000$ , respectively

Fig. 3 — Empirical power of tests for the existence of random effects based on $T_{n}$ and $S_{n}$ for random effects $τ$ distributed as a zero mean multivariate normal with covariance matrix $Σ^{τ}$ , where $Σ^{τ}$ is a block diagonal matrix whose main-diagonal blocks all equal the same $3 \times 3$ matrix ${\tilde{Σ}}^{τ} = ({\tilde{Σ}}_{ij}^{τ}) / 5000$ with ${\tilde{Σ}}_{11}^{τ} = 3$ , ${\tilde{Σ}}_{22}^{τ} = 2$ , ${\tilde{Σ}}_{33}^{τ} = {\tilde{Σ}}_{12}^{τ} = {\tilde{Σ}}_{21}^{τ} = 1$ , ${\tilde{Σ}}_{23}^{τ} = {\tilde{Σ}}_{32}^{τ} = - 0.5$ , and ${\tilde{Σ}}_{13}^{τ} = {\tilde{Σ}}_{31}^{τ} = 0.008$ . The upper and lower plots correspond to $a = 3$ and $a = 9$ , respectively. The left and right plots correspond to the cases 1 (independent groups) and 2 (correlated groups), respectively. The tick marks of the x-label (I), (II), and (III) correspond to the sample size $n_{1} = \dots = n_{a} = 1000$ , $n_{3 k - 1} = n_{3 k - 2} = 2000$ and $n_{3 k} = 1000$ for $k \leq a / 3$ , and $n_{1} = \dots = n_{a} = 2000$ , respectively

Situation (i) corresponds to both null hypotheses $H_{0}$ and $H_{1}$ defined in (5) and (8), respectively, and situations (ii) and (iii) correspond to the alternatives $K_{0}$ and $K_{1}$ , respectively. Note that the fixed and random effects are deliberately chosen to be tiny so that the power stays below one, which allows the performance of the tests to be compared across Processes 1–4. The consistency of the tests is confirmed by the results in the supplementary material (see Tables 1–6 in Section B.1).

Figure 1 shows the empirical size of the tests. Both tests work well for $a = 3$ and case 1 (the top left plot) for all processes. Our proposed test based on $T_{n}$ also has good size for $a = 3$ and case 2 (the top right plot). On the other hand, our test shows a small size distortion for $a = 9$ in both cases 1 and 2 (the lower plots). This distortion is caused by the accumulation of estimation errors in the large matrix $V^{-}$ (see Figures 1 and 2 in Section B.2 in the supplementary material). As expected, the classical test based on $S_{n}$ has size distortion in case 2 for both $a = 3$ and $a = 9$ (the right plots), since the groups are correlated in that case.
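When reading empirical sizes computed from 1000 replications, it helps to keep the Monte Carlo error in mind. The sketch below is an illustration and not part of the original study: it computes a rejection rate from a list of p-values and the binomial standard error of an empirical size at nominal level 0.05. The function names and the normal-approximation band with $z = 1.96$ are assumptions chosen here as a rough rule of thumb.

```python
import math


def rejection_rate(p_values, level=0.05):
    """Fraction of simulated p-values that fall below the significance level."""
    rejections = sum(1 for p in p_values if p < level)
    return rejections / len(p_values)


def monte_carlo_se(level=0.05, n_sim=1000):
    """Binomial standard error of an empirical size when the true size is `level`."""
    return math.sqrt(level * (1.0 - level) / n_sim)


def within_noise(rate, level=0.05, n_sim=1000, z=1.96):
    """True if `rate` lies within z standard errors of the nominal level."""
    return abs(rate - level) <= z * monte_carlo_se(level, n_sim)


se = monte_carlo_se()            # about 0.0069 for 1000 simulations
band = (0.05 - 1.96 * se, 0.05 + 1.96 * se)
```

Under these assumptions, empirical sizes between roughly 0.036 and 0.064 are compatible with a true size of 0.05, so only departures beyond that band point to a genuine size distortion.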

Figures 2 and 3 show the empirical power of the tests. For both $a = 3$ and $a = 9$ in case 1, the empirical powers of the two tests are nearly equal for each model.

In most cases, the size and power for the unbalanced design (II) $n_{3 k - 1} = n_{3 k - 2} = 2000$ and $n_{3 k} = 1000$ for $k \leq a / 3$ fall between the results for the balanced designs (I) $n_{1} = \dots = n_{a} = 1000$ and (III) $n_{1} = \dots = n_{a} = 2000$ . There are cases in which the empirical power for the unbalanced design (II) is worse than that for the balanced design (I) even though the total sample size of (II) is larger than that of (I), e.g., the power for Processes 1 and 2 with $a = 9$ and case 2 in Fig. 3. Further, we carried out additional experiments in the supplementary material and confirmed the consistency of our test, i.e., that the empirical power tends to one (see Tables 1–6 in Section B.1). Overall, our proposed test works well in detecting the existence of fixed or random effects. In summary, our test outperforms the classical test when the groups are correlated and $a$ is moderate.

Application to real data

Data analysis on stock prices often does not take random effects into account. However, for some portfolios of stocks, random effects cannot always be ignored. In fact, equity-focused investors take into account the sensitivities to currency, oil prices, the market, etc. in determining their equity portfolios. In other words, equity-focused investors believe that the factors related to earnings and stock prices are linked. For example, stock prices of trading companies are linked to oil prices. Put differently, equity-focused investors believe that random effects with respect to industries exist. In this empirical study, we pursue the question of whether random effects really exist for a portfolio that combines automobile, telecom, and trading companies. We analyze the log-returns of stock prices from January 4, 2016, to December 30, 2019. The companies we investigate are Itochu Corp., Mitsubishi Corp., Mitsui & Co., Ltd., and Marubeni Corp. from trading companies, Honda Motor Co. Ltd., Nissan Motor Co., Ltd., Suzuki Motor Corp., and Subaru Corp. from car companies, and KDDI Corp., Hikari Tsushin Inc. and NTT Data Corp. from telecom companies. The length of each time series is 978. These data can be downloaded from the website https://www.investing.com.

For this dataset, the number of groups $a$ is three (trading, car, and telecom sectors), the dimension $p$ of the time series from each firm is one, which corresponds to univariate ANOVA, the number of firms $q$ is three for the telecom sector and four for the car and trading sectors, and the number of observations is $n_{1} = n_{2} = n_{3} = 978$ .
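For readers who wish to reproduce the preprocessing, the log-return at time $t$ is $\log P_{t} - \log P_{t - 1}$ for a price series $P_{t}$ . The sketch below is illustrative only: the price values are hypothetical, the function names are not from the paper, and the choice of the $n - 1$ denominator for the sample variance is an assumption, since the text does not state which convention was used.

```python
import math


def log_returns(prices):
    """Log-returns log(P_t / P_{t-1}) of a price series; output is one shorter."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def sample_mean(xs):
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Unbiased sample variance (n - 1 denominator; an assumption here)."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


# Hypothetical closing prices, for illustration only.
prices = [100.0, 101.0, 99.5, 100.2, 100.9]
returns = log_returns(prices)
mean_x1e4 = sample_mean(returns) * 1e4       # same scaling as the Mean column
var_x1e4 = sample_variance(returns) * 1e4    # same scaling as the Variance column
```

Because log-returns telescope, their sum equals $\log (P_{\text{last}} / P_{\text{first}})$ , which gives a quick sanity check on the computed series.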

The plots of the log-returns are shown in Fig. 4. The series appear stationary, and no difference between sectors is visible from the plots. Table 1 gives the sample means and variances of the log-returns. The sample means of Suzuki and Hikari appear to be large compared to the sample means of the other car and telecom companies, respectively. As for the sample variances, those of Suzuki and Subaru are a little larger than those of the other companies. Figure 5 shows the heatmap of sample correlations. These data have correlations between and within groups. This implies that the classical F-test statistic should not be applied in this situation, since it is designed for independent groups. Interesting observations for the data are as follows. Within-group correlations are low for telecom companies and rather high for trading companies. This may be because of a similar product mix for trading companies and a different product mix for telecom companies. Between-group correlations for car and trading companies are higher than those for telecom and car companies and those for telecom and trading companies. This may be ascribed to the fact that car and trading companies’ stocks are cyclical, whereas telecom companies’ stocks are defensive.

Table 1.

Sample means and sample variances of log-returns

Sector	Company	Mean $\times 10^{- 4}$	Variance $\times 10^{- 4}$
Trading company	Itochu	5.89	2.15
	Mitsubishi	3.74	2.57
	Mitsui	3.15	2.16
	Marubeni	2.80	2.81
Car company	Honda	-1.90	2.75
	Nissan	-6.80	2.35
	Suzuki	2.51	3.91
	Subaru	-5.91	3.58
Telecom company	KDDI	0.71	2.45
	Hikari	12.61	2.86
	NTT	2.53	2.71


Fig. 5 — Heatmap of sample correlations between companies

We apply our test and the classical test as a comparison to this dataset (see Remark 4) and obtain the values 5.517 and 2.401 and the corresponding p-values 0.0634 and 0.301, respectively.
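The reported p-values can be checked by hand if both statistics are referred to a chi-square distribution with $(a - 1) p = 2$ degrees of freedom, which is an assumption made here based on the degrees of freedom stated for $S_{n}$ in the remarks below. For even degrees of freedom the chi-square survival function has a closed form, so the sketch needs only the standard library; the function name is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Survival function P(X > x) of a chi-square variable with even df.

    For df = 2m: exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


a, p = 3, 1                  # three sectors, univariate series
df = (a - 1) * p             # 2 degrees of freedom (assumed reference law)
p_proposed = chi2_sf_even_df(5.517, df)   # about 0.0634
p_classical = chi2_sf_even_df(2.401, df)  # about 0.301
```

With two degrees of freedom the survival function reduces to $\exp (- x / 2)$ , and the two values above agree with the p-values 0.0634 and 0.301 reported in the text.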

Therefore, for both tests, the null hypothesis $H_{1}$ in the test for the existence of random effects is not rejected at the significance level 0.05. However, the p-value of our test is close to 0.05, and at the significance level 0.1, our test rejects the hypothesis, whereas the classical test does not. From the observations that (i) our dataset has between-group correlations, and thus the classical test is not appropriate, (ii) the sample means show a tendency: car companies tend to have negative sample means, whereas telecom and trading companies tend to have positive sample means, and (iii) the p-value of our test is close to 0.05, we conclude that random effects should be taken into consideration when modeling log-returns of stock prices. This result supports the view of equity-focused investors that different industries have different factors that affect corporate profits, and that corporate profits influence stock prices; for example, the profits of trading companies are linked to the price of crude oil.

Our result is consistent with portfolio theory. In that field, it is well known that portfolios of stocks have systematic risks related to the whole market and unsystematic risks related to sectors and companies. Many studies taking unsystematic risk into account have been conducted and have emphasized its importance (see Aber 1976; Hsu and Jang 2008, and references therein). Industry effects correspond to unsystematic risks in our case.

Additional thoughts/remarks

Nagahata and Taniguchi (2018) derived the asymptotic null distribution of $S_{n}$ under the independence of groups. The following lines show that the independence of groups can be relaxed to uncorrelated groups. Simple algebra gives

\begin{matrix} S_{n} & = n \sum_{i = 1}^{a} {({\bar{y}}_{i \cdot} - {\bar{y}}_{\cdot \cdot})}^{⊤} {\{2 π {\tilde{f}}_{n} (0)\}}^{- 1} ({\bar{y}}_{i \cdot} - {\bar{y}}_{\cdot \cdot}) \\ & = n {\begin{pmatrix} {\{2 π {\tilde{f}}_{n} (0)\}}^{- 1 / 2} {\bar{e}}_{1 \cdot} \\ ⋮ \\ {\{2 π {\tilde{f}}_{n} (0)\}}^{- 1 / 2} {\bar{e}}_{a \cdot} \end{pmatrix}}^{⊤} \{(I_{a} - J_{a} / a) \otimes I_{p}\} \begin{pmatrix} {\{2 π {\tilde{f}}_{n} (0)\}}^{- 1 / 2} {\bar{e}}_{1 \cdot} \\ ⋮ \\ {\{2 π {\tilde{f}}_{n} (0)\}}^{- 1 / 2} {\bar{e}}_{a \cdot} \end{pmatrix} . \end{matrix}

Under Assumption 3.1 and the balanced design ( $n_{1} = \dots = n_{a}$ ), it holds that $\sqrt{n} ({\{2 π {\tilde{f}}_{n} (0)\}}^{- 1 / 2} {\bar{e}}_{1 \cdot}, \dots, {\{2 π {\tilde{f}}_{n} (0)\}}^{- 1 / 2} {\bar{e}}_{a \cdot})$ converges in distribution to $N (0, I_{ap})$ as $n \to \infty$ .

The idempotence of $(I_{a} - J_{a} / a) \otimes I_{p}$ , $rank \{(I_{a} - J_{a} / a) \otimes I_{p}\} = (a - 1) p$ , the positive definiteness of the spectral density matrix, and the continuous mapping theorem yield that $S_{n}$ converges in distribution to the Chi-square distribution with $(a - 1) p$ degrees of freedom under the independence of groups. The consistency of the test under the alternative and the power of the test under the local alternative can also be derived along the same line as our proof.
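The two algebraic facts used above, idempotence of $(I_{a} - J_{a} / a) \otimes I_{p}$ and its rank $(a - 1) p$ , are easy to verify numerically for small $a$ and $p$ . The sketch below is an illustration with hypothetical helper names; it uses the standard fact that the rank of an idempotent matrix equals its trace, and it takes $J_{a}$ to be the $a \times a$ matrix of ones.

```python
def centering_matrix(a):
    """I_a - J_a / a, with J_a the a x a matrix of ones."""
    return [[(1.0 if i == j else 0.0) - 1.0 / a for j in range(a)]
            for i in range(a)]


def kron_identity(m, p):
    """Kronecker product m (x) I_p for a square matrix m."""
    n = len(m)
    out = [[0.0] * (n * p) for _ in range(n * p)]
    for i in range(n):
        for j in range(n):
            for k in range(p):
                out[i * p + k][j * p + k] = m[i][j]
    return out


def matmul(x, y):
    n, inner, cols = len(x), len(y), len(y[0])
    return [[sum(x[i][k] * y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(n)]


def max_abs_diff(x, y):
    return max(abs(x[i][j] - y[i][j])
               for i in range(len(x)) for j in range(len(x[0])))


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


a, p = 3, 2
proj = kron_identity(centering_matrix(a), p)
idempotence_gap = max_abs_diff(matmul(proj, proj), proj)  # numerically zero
rank_via_trace = trace(proj)                              # equals (a - 1) * p
```

A quadratic form in a standard normal vector with an idempotent matrix of rank $r$ follows a chi-square distribution with $r$ degrees of freedom, which is why the limit of $S_{n}$ has $(a - 1) p$ degrees of freedom.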

The independence or uncorrelatedness of groups is quite restrictive and impractical. When the groups are correlated, the asymptotic null distribution of $S_{n}$ depends on the process, since the off-diagonal elements of the asymptotic variance of the vector $\sqrt{n} ({\{2 π {\tilde{f}}_{n} (0)\}}^{- 1 / 2} {\bar{e}}_{1 \cdot}, \dots, {\{2 π {\tilde{f}}_{n} (0)\}}^{- 1 / 2} {\bar{e}}_{a \cdot})$ are not equal to zero. Thus, the p-value of the test based on $S_{n}$ is not easy to compute. On the other hand, our proposed test statistic $T_{n}$ is asymptotically distribution-free under the null. In the numerical studies, we found that the proposed test statistic has some size distortion under the null for large $a$ . One direction for solving this problem is to use $S_{n}$ and apply a bootstrap method to obtain critical values. Homogeneity tests specialized for this type of model will be investigated in our future work.

Discussion

In this paper, tests for the existence of fixed and random effects in one-way models with correlated groups were considered. A new test statistic was proposed, and our tests were shown to be asymptotically of size $α$ under the null and consistent. The nontrivial power of the tests was derived under the local alternative. In the numerical study, we confirmed that our test performs well in several settings. In particular, our test is superior to the classical test when the groups are correlated and $a$ is moderate. The empirical study suggests that random effects should be taken into account in the analysis of stock prices.

Supplementary information

Supplementary material includes all proofs of theorems and additional simulation results.


Acknowledgements

The authors are grateful to the editor and two referees for their instructive comments. The authors gratefully acknowledge Mr. Takeshi Tamaoka, the chief executive officer of Ananas Japan Co. Ltd, and Mr. Yuki Nakayasu, the chief executive officer of Minsetsu Inc., for their comments from practical points of view on the real data analysis. This work was supported by JSPS Grant-in-Aid for Research Activity Start-up under Grant Number JP21K20338 (Y.G.); JSPS Grant-in-Aid for Scientific Research (C) under Grant Number JP20K11719 (Y.L.); JSPS Grant-in-Aid for Scientific Research (S) under Grant Number JP18H05290 (M.T.); and the Research Institute for Science & Engineering of Waseda University (M.T.). This work was mainly carried out when the first author was affiliated with Waseda University.

Declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Yuichi Goto, Email: yuichi.goto@math.kyushu-u.ac.jp.

Koichi Arakaki, Email: arakaki74@akane.waseda.jp.

Yan Liu, Email: liu@ims.sci.waseda.ac.jp.

Masanobu Taniguchi, Email: taniguchi@waseda.jp.

References

Aber JW. Industry effects and multivariate stock price behavior. J Financ Quant Anal. 1976;11(4):617–624. doi: 10.2307/2330216.
Akharif A, Fihri M, Hallin M, Mellouk A. Optimal pseudo-Gaussian and rank-based random coefficient detection in multiple regression. Electron J Stat. 2020;14(2):4207–4243. doi: 10.1214/20-EJS1770.
Azzalini A (2022) The R package sn: the skew-normal and related distributions such as the skew- $t$ and the SUN (version 2.0.2). Università degli Studi di Padova, Italia
Azzalini A, Capitanio A. Statistical applications of the multivariate skew normal distribution. J R Stat Soc Ser B. 1999;61(3):579–602. doi: 10.1111/1467-9868.00194.
Azzalini A, Valle AD. The multivariate skew-normal distribution. Biometrika. 1996;83(4):715–726. doi: 10.1093/biomet/83.4.715.
Bai J, Li K. Theory and methods of panel data models with interactive effects. Ann Stat. 2014;42(1):142–170. doi: 10.1214/13-AOS1183.
Baltagi BH, Li Q. A transformation that will circumvent the problem of autocorrelation in an error-component model. J Econom. 1991;48(3):385–393. doi: 10.1016/0304-4076(91)90070-T.
Bernardes J, Mishra N, Tran F, Bahmer T, Best L, Blase J, Bordoni D, Franzenburg J, Geisen U, Josephs-Spaulding J, Köhler P, Künstner A, Rosati E, Aschenbrenner A, Bacher P, Baran N, Boysen T, Brandt B, Bruse N, Dörr J, Dräger A, Elke G, Ellinghaus D, Fischer J, Forster M, Franke A, Franzenburg S, Frey N, Friedrichs A, Fuß J, Glück A, Hamm J, Hinrichsen F, Hoeppner M, Imm S, Junker R, Kaiser S, Kan Y, Knoll R, Lange C, Laue G, Lier C, Lindner M, Marinos G, Markewitz R, Nattermann J, Noth R, Pickkers P, Rabe K, Renz A, Röcken C, Rupp J, Schaffarzyk A, Scheffold A, Schulte-Schrepping J, Schunk D, Skowasch D, Ulas T, Wandinger K, Wittig M, Zimmermann J, Busch H, Hoyer B, Kaleta C, Heyckendorf J, Kox M, Rybniker J, Schreiber S, Schultze J, Rosenstiel P, DeCOI (2020) Longitudinal multi-omics analyses identify responses of megakaryocytes, erythroid cells, and plasmablasts as hallmarks of severe COVID-19. Immunity 53(6):1296–1314
Bloomfield P. An exponential model for the spectrum of a scalar time series. Biometrika. 1973;60(2):217–226. doi: 10.1093/biomet/60.2.217.
Brillinger DR. Time series: data analysis and theory. San Francisco: Holden-Day; 1981.
Chan K, Tong H. A note on certain integral equations associated with non-linear time series analysis. Probab Theory Relat Fields. 1986;73(1):153–158. doi: 10.1007/BF01845999.
Chiu ST. Weighted least squares estimators on the frequency domain for the parameters of a time series. Ann Stat. 1988;16(3):1315–1326. doi: 10.1214/aos/1176350963.
Clarke BR (2008) Linear models: the theory and application of analysis of variance. Wiley
Cook RD, Weisberg S. Diagnostics for heteroscedasticity in regression. Biometrika. 1983;70(1):1–10. doi: 10.1093/biomet/70.1.1.
Diggle PJ, Heagerty P, Liang KY, Zeger SL. Analysis of longitudinal data. Oxford: Oxford University Press; 2002.
Ditzhaus M, Fried R, Pauly M. QANOVA: quantile-based permutation methods for general factorial designs. TEST. 2021;30:960–979. doi: 10.1007/s11749-021-00758-y.
Ergemen YE, Velasco C. Estimation of fractionally integrated panels with fixed effects and cross-section dependence. J Econom. 2017;196(2):248–258. doi: 10.1016/j.jeconom.2016.05.020.
Fang EX, Ning Y, Li R. Test of significance for high-dimensional longitudinal data. Ann Stat. 2020;48(5):2622–2645. doi: 10.1214/19-AOS1900.
Fihri M, Akharif A, Mellouk A, Hallin M. Efficient pseudo-Gaussian and rank-based detection of random regression coefficients. J Nonparam Stat. 2020;32(2):367–402. doi: 10.1080/10485252.2020.1748625.
Galbraith JW, Zinde-Walsh V. Transforming the error-components model for estimation with general ARMA disturbances. J Econom. 1995;66(1–2):349–355. doi: 10.1016/0304-4076(94)01621-6.
Genz A, Bretz F, Miwa T, Mi X, Leisch F, Scheipl F, Hothorn T (2021) mvtnorm: multivariate normal and t distributions. R package version 1.1-3
González JA, Lagos-Álvarez BM, Mateu J. Two-way layout factorial experiments of spatial point pattern responses in mineral flotation. TEST. 2021;30:1046–1075. doi: 10.1007/s11749-021-00768-w.
Hallin M, Hlubinká D, Hudecova S. Efficient fully distribution-free center-outward rank tests for multiple-output regression and MANOVA. J Am Stat Assoc. 2021;66:1–43.
Hoover DR, Rice JA, Wu CO, Yang LP. Nonparametric smoothing estimates of time-varying coefficient models with longitudinal data. Biometrika. 1998;85(4):809–822. doi: 10.1093/biomet/85.4.809.
Hsu LT, Jang S. The determinant of the hospitality industry’s unsystematic risk: a comparison between hotel and restaurant firms. Int J Hosp Tour Admin. 2008;9(2):105–127.
Li Y. Efficient semiparametric regression for longitudinal data with nonparametric covariance estimation. Biometrika. 2011;98(2):355–370. doi: 10.1093/biomet/asq080.
Liu X, Xu X. Confidence distribution inferences in one-way random effects model. TEST. 2016;25(1):59–74. doi: 10.1007/s11749-015-0440-8.
Lucas, C., P. Wong, J. Klein, T.B. Castro, J. Silva, M. Sundaram, M.K. Ellingson, T. Mao, J.E. Oh, B. Israelow, T. Takahashi, M. Tokuyama, P. Lu, A. Venkataraman, A. Park, S. Mohanty, H. Wang, A.L. Wyllie, C.B.F. Vogels, R. Earnest, S. Lapidus, I.M. Ott, A.J. Moore, M.C. Muenker, J.B. Fournier, M. Campbell, C.D. Odio, A. Casanovas-Massana, Y.I. Team, R. Herbst, A.C. Shaw, R. Medzhitov, W.L. Schulz, N.D. Grubaugh, C.D. Cruz, S. Farhadian, A.I. Ko, S.B. Omer, and A. Iwasaki. Longitudinal analyses reveal immunological misfiring in severe COVID-19. Nature. 2020;584(7821):463–469. doi: 10.1038/s41586-020-2588-y.
Nagahata H, Taniguchi M. Analysis of variance for multivariate time series. Metron. 2018;76:69–82. doi: 10.1007/s40300-017-0122-2.
Nakatani T (2014) ccgarch: an R package for modelling multivariate GARCH models with conditional correlations
Rakocevic V. On continuity of the Moore–Penrose and Drazin inverses. Mater Vesn. 1997;49(3–4):163–172.
Rao CR, Mitra SK. Generalized inverse of matrices and its applications. New York: Wiley; 1971.
Rashid MM. Robust analysis of two-way models with repeated measures on both factors. TEST. 1995;4(1):39–62. doi: 10.1007/BF02563102.
Robinson PM. Automatic frequency domain inference on semiparametric and nonparametric models. Econometrica. 1991;59(5):1329–1363. doi: 10.2307/2938370.
Searle SR, Casella G, McCulloch CE. Variance Components. New York: Wiley; 1992.
Stewart G. On the continuity of the generalized inverse. SIAM J Appl Math. 1969;17(1):33–45. doi: 10.1137/0117004.
Tang CY, Leng C. Empirical likelihood and quantile regression in longitudinal data analysis. Biometrika. 2011;98(4):1001–1006. doi: 10.1093/biomet/asr050.
von Sachs R. Nonparametric spectral analysis of multivariate time series. Annu Rev Stat Appl. 2020;7:361–386. doi: 10.1146/annurev-statistics-031219-041138.
You J, Zhou X. Efficient estimation in panel data partially additive linear model with serially correlated errors. Stat Sin. 2013;23:271–303.
Zeger SL, Liang KY, Self SG. The analysis of binary longitudinal data with time independent covariates. Biometrika. 1985;72(1):31–38.
Zhong PS, Li R, Santo S. Homogeneity tests of covariance matrices with high-dimensional longitudinal data. Biometrika. 2019;106(3):619–634. doi: 10.1093/biomet/asz011.
