Author manuscript; available in PMC: 2025 Dec 23.
Published before final editing as: J Am Stat Assoc. 2024 Dec 23:10.1080/01621459.2024.2422129. doi: 10.1080/01621459.2024.2422129

Partial Quantile Tensor Regression

Dayu Sun a, Limin Peng a,#, Zhiping Qiu b, Ying Guo a, Amita Manatunga a
PMCID: PMC12448065  NIHMSID: NIHMS2033840  PMID: 40980192

Abstract

Tensors, characterized as multidimensional arrays, are frequently encountered in modern scientific studies. Quantile regression has the unique capacity to explore how a tensor covariate influences different segments of the response distribution. In this work, we propose a partial quantile tensor regression (PQTR) framework, which novelly adapts the core principle of the partial least squares technique to achieve effective dimension reduction for quantile regression with a tensor covariate. The proposed PQTR algorithm is computationally efficient and scalable to a large tensor covariate. Moreover, we uncover an appealing latent variable model representation for the PQTR algorithm, justifying a simple population interpretation of the resulting estimator. We further investigate the connection of the PQTR procedure with an envelope quantile tensor regression (EQTR) model, which defines a general set of sparsity conditions tailored to quantile tensor regression. We prove the root-n consistency of the PQTR estimator under the EQTR model, and demonstrate its superior finite-sample performance compared to benchmark methods through simulation studies. We demonstrate the practical utility of the proposed method via an application to a neuroimaging study of post-traumatic stress disorder (PTSD). Results derived from the proposed method are more neurobiologically meaningful and interpretable compared to those from existing methods.

Keywords: Envelope method, Partial least squares, Quantile regression, Tensor covariate

1. Introduction

Quantile regression (Koenker and Bassett, 1978) offers a valuable perspective for characterizing the associations between a response and covariates of interest. By formulating covariate effects on one or more quantiles of the response while allowing such effects to change across different quantile levels, quantile regression confers a flexible and robust view of how covariates influence different segments of the response distribution (e.g., outcomes from the middle range, or from the lower or upper tail). With these natural appeals, quantile regression has received increasing attention in real data analyses. Theory, inference, and computation for quantile regression in the typical data settings, where covariates can sensibly enter a quantile regression model in a vector form, have been well studied; see the monograph by Koenker (2005) for a comprehensive summary.

There are emerging scientific applications (e.g., neuroimaging studies) where the covariates of interest take the complex form of multidimensional arrays, formally termed tensors. Quantile regression has the unique capacity to explore the effects of a tensor covariate on average outcomes as well as on unusual outcomes that often attract scientific interest. For example, neuroimaging features indicative of severe symptoms of a mental disorder are of more diagnostic or therapeutic value than those explaining average symptoms. However, addressing quantile regression with tensor covariates (or predictors) faces nontrivial challenges, not only from the large sizes of tensors (typically coupled with small sample sizes) but also from the substantive need to properly account for tensors' inherent spatial structures. For example, a typical brain functional connectivity matrix generated from functional magnetic resonance imaging (fMRI) can be of size 300 × 300, including 90,000 elements. The connectivity matrix entries corresponding to connections between brain regions in various functional networks are naturally correlated, implying an intrinsic spatial structure that carries important scientific implications.

To handle a tensor covariate under quantile regression, intuitive approaches include regressing over a few features extracted from the tensor covariate, or vectorizing the whole tensor covariate and then applying existing high-dimensional quantile regression methods (Zou and Yuan, 2008; Wang et al., 2012; Zheng et al., 2013; Fan et al., 2014, among others). Similar strategies were applied in the early work on linear regression with tensor covariates (Worsley et al., 2004; Caffo et al., 2010, for example). However, these types of approaches have several limitations. First, substituting the tensor covariate with its low-dimensional features may produce dubious results if the selected features are irrelevant to the response. Second, vectorizing the tensor covariate ignores the inherent spatial structure of tensors and thus may fail to provide interpretable results (see more discussion in Zhou and Li (2013)). Third, analyses with the vectorized tensor covariate, even after applying penalization, may still suffer from a heavy computational burden and/or unstable results due to the large number of tensor elements coupled with a typically limited sample size.

In this work, we consider the quantile regression problem with the tensor covariate retained in its original tensor form. To address this problem, one possible direction is to approximate the tensor covariate coefficient by a low-rank tensor decomposition, such as the CANDECOMP/PARAFAC (CP) decomposition or the Tucker decomposition. Such a strategy has been investigated under generalized linear regression (Zhou et al., 2013; Li et al., 2018; Guhaniyogi et al., 2017, for example) and also under quantile regression (Lu et al., 2020; Li and Zhang, 2021, for example). While utilizing a tensor decomposition can effectively retain the spatial structure of tensors and reduce the number of free parameters, current approaches are prone to various computational issues and challenges. Specifically, most existing approaches utilize the alternating block updating (ABU) algorithm (Zhou et al., 2013; Li et al., 2018; Lu et al., 2020; Li and Zhang, 2021; Zhou et al., 2021, for example), which optimizes the objective function with respect to each component of the decomposition cyclically. However, the number of free parameters in each decomposition component may still be greater than the sample size (Li et al., 2018; Lu et al., 2020; Li and Zhang, 2021, for example); thus additional penalty methods are required, incurring increased computational complexity and intensity (Lu et al., 2020; Li and Zhang, 2021, for example). In addition, an ABU algorithm is generally not flexible enough to account for special structures of tensor predictors, such as symmetry. For example, when the tensor covariate is symmetric (e.g., a functional connectivity matrix), it is not straightforward to adapt the ABU algorithm to generate an exactly symmetric tensor coefficient estimate, as pointed out by Zhang et al. (2022), and this may complicate the interpretation of results.

To address the issues and limitations of current tensor-decomposition based methods, we propose to employ the partial least squares (PLS) technique (Wold, 1982; de Jong, 1993), which has emerged as a promising strategy for handling tensor covariates under linear regression. Under linear regression, the classical PLS approach, which implements supervised dimension reduction by projecting a vector covariate in a direction guided by the correlation between the response and the covariate, is viable even when the covariate dimension is much larger than the sample size. A substantial body of literature has elucidated justifications and interpretations for the classical PLS based on latent variable models (Helland, 1988, 1990, 2001) or nascent envelope models (Cook et al., 2013; Zhang and Li, 2017; Cook, 2018). To accommodate a tensor predictor, several authors (Bro, 1996; Eliseyev and Aksenova, 2013; Zhao et al., 2013) pioneered extensions of the classical PLS algorithm but did not provide the corresponding population interpretations. Zhang and Li (2017) formally developed a PLS algorithm under a linear tensor regression model and established a rigorous population interpretation for their tensor PLS estimator using the notion of envelope (Cook et al., 2010). Zhang and Li (2017)'s PLS algorithm serves to find a reduced tensor predictor via Tucker decomposition, while avoiding optimization settings involving an excessive number of free parameters and the large number of iterations often required by existing approaches that use ABU algorithms. It can also seamlessly render a symmetric tensor coefficient estimate given a symmetric tensor predictor.

However, the utility of adapting the PLS strategy to the quantile regression setting with a tensor covariate remains unexplored. A main obstacle is that the PLS strategy relies on a convenient linear relationship between the regression coefficients and the covariance between response and covariates, which exists under linear regression but no longer holds under quantile regression. Dodge and Whittaker (2009) proposed a hybrid of PLS and quantile regression with a low-dimensional vector predictor by adopting an intuitive substitute for the covariance used by the classical PLS; however, the interpretation and theoretical properties of the resulting estimators were not established.

Motivated by the important gaps discussed above, we propose a new regression framework, referred to as partial quantile tensor regression (PQTR), where the core principle of PLS is adapted to achieve dimension reduction and theoretically justified estimation for a quantile regression model including both a tensor covariate and a traditional low-dimensional vector covariate. Targeting the general tensor covariate, we propose to employ the quantile partial tensor covariance, generalized from the quantile partial covariance studied by Li et al. (2015) for vector predictors, to guide the dimension reduction in the presence of a tensor covariate. As shown in Sections 2 and 3, the choice of quantile partial tensor covariance facilitates algorithm development while enabling theoretical justification. Other types of quantile covariance, such as the one adopted by Dodge and Whittaker (2009), do not have similarly desirable capacities. The idea of integrating the quantile partial tensor covariance into a PLS procedure, though intuitive, cannot be straightforwardly justified by extending the arguments for its linear regression counterpart. This is mainly due to the lack of a clear relationship between the quantile partial tensor covariance and the coefficients under quantile tensor regression. We tackle this difficulty by establishing a "pseudo" linear structure of the quantile partial tensor covariance in terms of the quantile tensor regression coefficients, which can sufficiently serve the need to justify the use of the quantile partial tensor covariance. Moreover, for the first time, we reveal a population latent variable model for a PLS-type procedure in the quantile regression setting. Our finding confers a meaningful interpretation for the estimator resulting from the proposed PQTR procedure.

To gain a deeper insight into the utility of the proposed PQTR procedure, we further propose and study an envelope quantile tensor regression (EQTR) model closely tied to the concept of envelope (Cook et al., 2010) and a modified framework of sufficient dimension reduction (Cook, 1998). The formulation of the new EQTR model reflects a general set of sparsity conditions tailored to quantile tensor regression, and represents a novel adaptation of the envelope concept to quantile regression with tensor covariates. Ding et al. (2020) did precursor work on an envelope quantile regression model with a traditional vector covariate, proposing estimation based on the generalized method of moments (GMM). This approach, however, is not suitable for handling a tensor covariate due to potential scalability concerns and/or interpretability issues arising from vectorizing the tensor covariate. In this work, we delineate the connection between the proposed PQTR procedure and the new EQTR model via a population latent variable model representation. A useful finding is that PQTR can serve as a legitimate approach to providing scalable estimation of the EQTR model. Assuming the proposed EQTR model holds, we establish the root-$n$ consistency of the estimator obtained from the proposed PQTR procedure. The proof involves addressing multiple challenging problems, including using complex random matrix theory to characterize the factor matrices of the estimated reduced tensor and dealing with a non-standard quantile regression problem where the tensor covariate is latent and unobserved. Such a problem is largely unexplored even in the special case with regular non-tensor covariates. Our asymptotic proof, which employs sophisticated empirical process arguments, may provide a general template for dealing with similar estimation settings.

Our extensive simulation studies suggest that the PQTR estimator, as compared to some benchmark methods, can achieve better performance with much lower computational cost.

2. Partial Quantile Tensor Regression

2.1. Data and quantile tensor regression model

Let $Y$ denote a continuous response of interest, and let $Z\in\mathbb{R}^{p_z}$ and $\mathcal{X}\in\mathbb{R}^{p_1\times\cdots\times p_m}$ denote a low-dimensional vector covariate and an $m$th-order tensor covariate, respectively. Throughout the paper, following Qi and Luo (2017), we will use capital bold letters $\mathbf{A},\mathbf{B},\mathbf{X},\ldots$ for matrices and calligraphic letters $\mathcal{A},\mathcal{B},\mathcal{X},\ldots$ for general tensors, and adopt standard notations for common tensor operations, which are reviewed in Section S0 of the Supplementary Materials.

Let $Q_Y(\tau\mid\mathcal{X},Z)=\inf\{t:\Pr(Y\le t\mid\mathcal{X},Z)\ge\tau\}$ be the $\tau$th conditional quantile of $Y$ given $\mathcal{X}$ and $Z$. Given $\mathcal{X}$ and $Z$, a quantile tensor regression model may take the form,

$$Q_Y(\tau\mid\mathcal{X},Z)=\alpha_0(\tau)+\gamma_0^T(\tau)Z+\langle\mathcal{B}_0(\tau),\mathcal{X}\rangle,\quad\tau\in\Delta,\qquad(1)$$

where $\Delta\subset(0,1)$ includes the quantile levels of interest. In model (1), $\alpha_0(\tau)\in\mathbb{R}$ is the intercept, $\gamma_0(\tau)\in\mathbb{R}^{p_z}$ is an unknown vector coefficient for $Z$, and $\mathcal{B}_0(\tau)\in\mathbb{R}^{p_1\times\cdots\times p_m}$ is an unknown $m$th-order tensor coefficient representing the effect of $\mathcal{X}$ on the $\tau$th quantile of $Y$. Without loss of generality, we assume $E[Z]=0$ and $E[\mathcal{X}]=0$. For convenience, denote $\tilde{Z}=(1,Z^T)^T$ and $\tilde{\gamma}_0(\tau)=(\alpha_0(\tau),\gamma_0(\tau)^T)^T$.

Given a sample consisting of i.i.d. replicates $(Y_i,Z_i,\mathcal{X}_i)$ $(i=1,\ldots,n)$, and setting aside the dimensionality issue, one may estimate model (1) by minimizing the standard sample quantile regression loss function (Koenker, 2005),

$$\frac{1}{n}\sum_{i=1}^n\rho_\tau\big(Y_i-\alpha-\gamma^TZ_i-\langle\mathcal{B},\mathcal{X}_i\rangle\big),\qquad(2)$$

with respect to $\alpha$, $\gamma$, and $\mathcal{B}$, where $\rho_\tau(w)=w\{\tau-I(w<0)\}$ and $I(\cdot)$ is the indicator function. Note that the minimizer of (2) can attain the typical root-$n$ convergence rate as $n$ goes to $\infty$. However, this optimization problem becomes infeasible when the number of unknown parameters, $1+p_z+\prod_{k=1}^m p_k$, exceeds the sample size $n$. This can occur even when each individual $p_k$ $(k=1,\ldots,m)$ is not large. This serves as the main motivation of this work.
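For intuition, the check function $\rho_\tau$ and the sample objective (2) are straightforward to evaluate; below is a minimal NumPy sketch (function and variable names are ours, not from the paper):

```python
import numpy as np

def check_loss(w, tau):
    """Quantile check function rho_tau(w) = w * {tau - I(w < 0)}."""
    return w * (tau - (w < 0).astype(float))

# Toy illustration: residuals from a candidate fit at the median (tau = 0.5).
resid = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
loss = check_loss(resid, tau=0.5).mean()  # sample objective (2) without covariate terms
# loss = 0.65
```

The asymmetric weighting by $\tau$ and $1-\tau$ is what makes the minimizer target the $\tau$th quantile rather than the mean.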

2.2. The proposed partial quantile tensor regression procedure

To conquer the dimensionality challenge for quantile tensor regression, our proposal is to adapt the core principle of PLS, which has demonstrated superb utility in achieving effective dimension reduction and accurate prediction in linear regression settings. Pertaining to a tensor covariate, Zhang and Li (2017) developed the Tensor Envelope PLS (TEPLS) algorithm for a linear tensor regression model of the form, $Y=\langle\mathcal{B},\mathcal{X}\rangle+\varepsilon$, where $\varepsilon$ stands for the error term. TEPLS assumes a separable Kronecker covariance structure for $\mathcal{X}$, i.e., $\Sigma_{\mathcal{X}}=\mathrm{var}\{\mathrm{vec}(\mathcal{X})\}=\Sigma_m\otimes\cdots\otimes\Sigma_1$ with $\Sigma_k\in\mathbb{R}^{p_k\times p_k}$. The main idea of the TEPLS algorithm is to first reduce $\mathcal{X}$ to $\mathcal{T}\equiv\llbracket\mathcal{X};W_1^T,\ldots,W_m^T\rrbracket\in\mathbb{R}^{d_1\times\cdots\times d_m}$, where $W_1,\ldots,W_m$ are factor matrices, and then regress $Y$ over $\mathcal{T}$. In the TEPLS algorithm, the initial cross covariance, $\mathrm{cov}(Y,\mathcal{X})$, plays an instrumental role in deriving the factor matrices $W_k\in\mathbb{R}^{p_k\times d_k}$ $(k=1,\ldots,m)$, which are computed by sequentially maximizing a function of the deflated cross covariance term, independently of the other factor matrices and without direct optimization. The total dimension of $\mathcal{T}$, $\prod_{k=1}^m d_k$, can be made sufficiently small if the $d_k$'s are properly chosen. For example, for a $300\times300$ second-order tensor covariate, setting $d_1=d_2=3$ allows the TEPLS algorithm to significantly reduce the total dimension of $\mathcal{X}$, 90,000, to 9, which is manageable with a small-to-moderate sample size. However, the TEPLS algorithm is not an appropriate tool to address the quantile tensor regression model (1) because the classical cross covariance, $\mathrm{cov}(Y,\mathcal{X})$, does not contain sufficient information regarding the effects of $\mathcal{X}$ on quantiles of $Y$. In addition, TEPLS does not accommodate the low-dimensional vector covariate $Z$.
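For a second-order (matrix) covariate, the reduction $\llbracket\mathcal{X};W_1^T,W_2^T\rrbracket$ is simply a two-sided projection. A toy NumPy sketch of the dimension count in the $300\times300$ example above (random matrices for illustration only; names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 300))   # second-order tensor (matrix) covariate
W1 = rng.standard_normal((300, 3))    # factor matrices with d1 = d2 = 3
W2 = rng.standard_normal((300, 3))
T = W1.T @ X @ W2                     # Tucker product [[X; W1^T, W2^T]]
assert X.size == 90_000 and T.size == 9   # 90,000 free regressors reduced to 9
```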

To address these obstacles, we propose to generalize the quantile partial covariance (Li et al., 2015) to guide the dimension reduction of $\mathcal{X}$ for the purpose of estimating $\mathcal{B}_0(\tau)$ while properly accounting for the effect of $Z$. More specifically, we define the quantile partial tensor covariance between the response $Y$ and the tensor covariate $\mathcal{X}$ given $Z$ as $\mathrm{qpcov}_\tau(Y,\mathcal{X}\mid Z)=E[R(\tau)\mathcal{X}]$, where $R(\tau)=\psi_\tau(Y-\tilde{\gamma}_{Y\mid Z}(\tau)^T\tilde{Z})$ with $\psi_\tau(w)=\tau-I(w<0)$, $\tilde{\gamma}_{Y\mid Z}(\tau)=(\alpha_{Y\mid Z}(\tau),\gamma_{Y\mid Z}(\tau)^T)^T$, and $(\alpha_{Y\mid Z}(\tau),\gamma_{Y\mid Z}(\tau))=\arg\min_{\alpha,\gamma}E[\rho_\tau(Y-\alpha-\gamma^TZ)]$. Here $R(\tau)$ may be viewed as the $\tau$th quantile score of $Y$ after adjusting for the marginal quantile effect of $Z$. In Lemma 1, we derive an important result on the connection between the quantile partial tensor covariance $\mathrm{qpcov}_\tau(Y,\mathcal{X}\mid Z)$ and the tensor covariate coefficient $\mathcal{B}_0(\tau)$.

Lemma 1

We assume that if $a^T\mathrm{vec}(\mathcal{X})=b^TZ$ with probability one for two deterministic vectors $a$ and $b$, then $a=0$ and $b=0$. Suppose that the conditional density function of $Y$ given $(\mathcal{X},Z)$, denoted by $f_{Y\mid\mathcal{X},Z}$, is uniformly bounded away from $0$ and $\infty$. Then under model (1), $\mathrm{vec}(\mathrm{qpcov}_\tau(Y,\mathcal{X}\mid Z))=V(\tau)\,\mathrm{vec}(\mathcal{B}_0(\tau))$ for $\tau\in\Delta$, where

$$V(\tau)=E\Big[v_\tau(Z,\mathcal{X})\big(\mathrm{vec}(\mathcal{X})-E[v_\tau(Z,\mathcal{X})\mathrm{vec}(\mathcal{X})\tilde{Z}^T]\,E[v_\tau(Z,\mathcal{X})\tilde{Z}\tilde{Z}^T]^{-1}\tilde{Z}\big)^{\otimes2}\Big],$$

$a^{\otimes2}$ denotes $aa^T$, and $v_\tau(Z,\mathcal{X})=\int_0^1 f_{Y\mid\mathcal{X},Z}\big((1-u)\{\tilde{\gamma}_0(\tau)^T\tilde{Z}+\langle\mathcal{B}_0(\tau),\mathcal{X}\rangle\}+u\,\tilde{\gamma}_{Y\mid Z}(\tau)^T\tilde{Z}\big)\,du$. Furthermore, $V(\tau)$ is positive-definite.

Lemma 1 demonstrates that, after vectorization, $\mathrm{qpcov}_\tau(Y,\mathcal{X}\mid Z)$ can be expressed as a "pseudo" linear transformation of $\mathcal{B}_0(\tau)$, which differs considerably from the simple linear relationship between the classical covariance and the coefficients under linear regression (Zhang and Li, 2017, Lemmas 3.1–3.2). An immediate implication of Lemma 1 is that the quantile partial tensor covariance $\mathrm{qpcov}_\tau(Y,\mathcal{X}\mid Z)$ vanishes if and only if all elements of $\mathcal{B}_0(\tau)$ are zero. A parallel result in Li et al. (2015) (see their Lemma 1) only concerns the equivalence between a zero quantile partial covariance and a zero quantile slope (or coefficient) under a simple linear quantile regression model with a single scalar covariate. In Lemma 1, we also derive the explicit form of $V(\tau)$, which is new in the literature. The result suggests that $V(\tau)$ depends on the whole function $\mathcal{B}_0(\cdot)$, not just $\mathcal{B}_0(\tau)$, given the involvement of $f_{Y\mid\mathcal{X},Z}(\cdot)$. Such a subtle global dependence entails a mechanistic distinction between using the quantile partial tensor covariance to guide dimension reduction and using the traditional covariance, which warrants different lines of arguments for justification. This is reflected by some key technical results in Section 3 (see Proposition 5).

Motivated by Lemma 1, and assuming a separable Kronecker covariance structure for $\mathcal{X}$, we propose a partial quantile tensor regression (PQTR) procedure, which mimics TEPLS while employing $\mathcal{C}(\tau)\equiv\mathrm{qpcov}_\tau(Y,\mathcal{X}\mid Z)$ as the substitute for $\mathrm{cov}(Y,\mathcal{X})$ to reduce $\mathcal{X}$ via a Tucker decomposition. The detailed procedure is described in Algorithm 1.

Algorithm 1 Partial quantile tensor regression (PQTR) procedure

Step 1. Given $\tau\in(0,1)$, calculate the quantile partial tensor covariance $\mathcal{C}(\tau)=\mathrm{qpcov}_\tau(Y,\mathcal{X}\mid Z)$.

for $k=1$ to $m$ do

Step 2. Standardize the mode-$k$ cross covariance matrix $\mathcal{C}_{(k)}(\tau)$:

$$\mathcal{C}_{0k}(\tau)=\mathcal{C}_{(k)}(\tau)\big(\Sigma_m^{-1/2}\otimes\cdots\otimes\Sigma_{k+1}^{-1/2}\otimes\Sigma_{k-1}^{-1/2}\otimes\cdots\otimes\Sigma_1^{-1/2}\big)\in\mathbb{R}^{p_k\times(\prod_{j\neq k}p_j)}.$$

for $s=1$ to $d_k$ do

Step 3. Find the $w_{sk}(\tau)\in\mathbb{R}^{p_k}$ that maximizes $w^T\mathcal{C}_{(s-1)k}(\tau)\mathcal{C}_{(s-1)k}^T(\tau)w$ subject to $w^Tw=1$.

Step 4. Deflate the cross covariance: $\mathcal{C}_{sk}(\tau)=Q_{sk}(\tau)\mathcal{C}_{0k}(\tau)$, where

$$Q_{sk}(\tau)=I_{p_k\times p_k}-\Sigma_kW_{sk}(\tau)\{W_{sk}^T(\tau)\Sigma_kW_{sk}(\tau)\}^{-1}W_{sk}^T(\tau)\in\mathbb{R}^{p_k\times p_k}$$

is the projection onto the orthogonal complement of $\mathrm{span}(\Sigma_kW_{sk}(\tau))$, with $W_{sk}(\tau)=(w_{1k}(\tau),\ldots,w_{sk}(\tau))\in\mathbb{R}^{p_k\times s}$.

end for

end for

Step 5. Reduce $\mathcal{X}\in\mathbb{R}^{p_1\times\cdots\times p_m}$ to the tensor $\mathcal{T}(\tau)=\llbracket\mathcal{X};W_1^T(\tau),\ldots,W_m^T(\tau)\rrbracket\in\mathbb{R}^{d_1\times\cdots\times d_m}$, where $W_k(\tau)=W_{d_kk}(\tau)\in\mathbb{R}^{p_k\times d_k}$.

Step 6. Perform the quantile regression of $Y$ on $Z$ and $\mathcal{T}(\tau)$, and obtain estimates of the coefficients $\alpha$, $\gamma$, and $\mathcal{D}(\tau)\in\mathbb{R}^{d_1\times d_2\times\cdots\times d_m}$.

Step 7. Obtain $\mathcal{B}(\tau)=\llbracket\mathcal{D}(\tau);W_1(\tau),\ldots,W_m(\tau)\rrbracket\in\mathbb{R}^{p_1\times\cdots\times p_m}$.
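For concreteness, Steps 2–4 above amount to repeated leading-eigenvector computations with deflation. Below is a schematic NumPy sketch for a single mode with $m=2$ and sample estimates plugged in; the function name and the $m=2$ simplification are ours, and this is only a sketch of the procedure, not the authors' implementation:

```python
import numpy as np

def pqtr_factor(C_k, Sigma_k, Sigma_other, d_k):
    """Steps 2-4 of Algorithm 1 for one mode (m = 2 sketch): standardize
    the mode-k cross covariance, then extract d_k directions by
    leading-eigenvector computation with deflation."""
    # Step 2: standardize. For m = 2 the Kronecker factor is a single
    # inverse square root of the other mode's covariance.
    vals, vecs = np.linalg.eigh(Sigma_other)
    S_inv_half = (vecs * vals ** -0.5) @ vecs.T
    C0 = C_k @ S_inv_half
    p_k = C_k.shape[0]
    W = np.zeros((p_k, 0))
    C_s = C0
    for _ in range(d_k):
        # Step 3: leading eigenvector of C_{(s-1)k} C_{(s-1)k}^T.
        _, evecs = np.linalg.eigh(C_s @ C_s.T)   # eigenvalues ascending
        W = np.column_stack([W, evecs[:, -1]])
        # Step 4: deflate against span(Sigma_k W_{sk}).
        SW = Sigma_k @ W
        Q = np.eye(p_k) - SW @ np.linalg.solve(W.T @ SW, W.T)
        C_s = Q @ C0
    return W  # p_k x d_k factor matrix W_k(tau)
```

With identity covariances, the deflation reduces to an exact orthogonal projection, so the extracted directions are orthonormal; in general they are only $\Sigma_k$-orthogonalized, as in the algorithm.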

The following are several remarks on the proposed PQTR procedure. First, since $\mathcal{C}(\tau)\equiv\mathrm{qpcov}_\tau(Y,\mathcal{X}\mid Z)=\mathrm{cov}(R(\tau),\mathcal{X})$ given $E\{R(\tau)\}=0$, analogous to $\mathcal{C}=\mathrm{cov}(Y,\mathcal{X})$ in TEPLS, we may regard Steps 1–5 as a minor variant of the corresponding steps in TEPLS with the response being $R(\tau)$. The separable Kronecker assumption has been widely used in the tensor regression and high-dimensional covariance estimation literature (Hoff, 2011; Tsiligkaridis and Hero, 2013; Li and Zhang, 2017; Zhang and Li, 2017; Leng and Pan, 2018, among others). Of note, Step 3 in PQTR is equivalent to obtaining the eigenvector associated with the largest eigenvalue of $\mathcal{C}_{(s-1)k}(\tau)\mathcal{C}_{(s-1)k}^T(\tau)$, which can be easily implemented by standard software, for instance, MATLAB or R. For a given $k$, Steps 3–4 of PQTR repeat only $d_k$ times to sequentially generate each column of $W_k(\tau)$. By this design, PQTR is more computationally efficient than existing work that adopts an ABU algorithm, which may involve hundreds of iterations. With properly chosen small $d_k$ $(k=1,\ldots,m)$, Step 6 performs a trivial low-dimensional quantile regression. Moreover, Steps 2–4 in PQTR have an invariance property in the sense that they produce the same $W_{sk}(\tau)$ when $\mathcal{X}$ is multiplied by a constant, because eigenvectors are invariant to matrix scaling. Finally, PQTR has an inherent capacity to accommodate a symmetric $\mathcal{X}$ with minor modifications. This is mainly because $W_k(\tau)$ generated from the resulting symmetric $\mathcal{C}(\tau)$ would be the same for all $k=1,\ldots,m$, leading to a symmetric $\mathcal{T}(\tau)$ at Step 5. Then, at Step 6 of PQTR, we regress $Y$ only over the non-redundant entries of the symmetric $\mathcal{T}(\tau)$ (for example, the lower triangular elements of a two-way $\mathcal{T}(\tau)$) and use the regression results to fill in all entries of $\mathcal{D}(\tau)$ under the symmetry constraint. The symmetric $\mathcal{D}(\tau)$ is then converted to a symmetric $\mathcal{B}(\tau)$ at Step 7.

2.3. Implementation of the PQTR procedure

Note that the PQTR procedure only requires the inputs $\mathcal{C}(\tau)$, $\Sigma_k$, and $d_k$ $(k=1,\ldots,m)$. Therefore, to implement PQTR on a dataset, we only need to replace $\mathcal{C}(\tau)$ and $\Sigma_k$ in Algorithm 1 by their estimates and select proper $d_k$'s. More specifically, to obtain an estimate of $\mathcal{C}(\tau)$, we can first estimate $\alpha_{Y\mid Z}(\tau)$ and $\gamma_{Y\mid Z}(\tau)$ by $(\hat{\alpha}_{Y\mid Z}(\tau),\hat{\gamma}_{Y\mid Z}(\tau))=\arg\min_{\alpha,\gamma}\frac{1}{n}\sum_{i=1}^n\rho_\tau(Y_i-\alpha-\gamma^TZ_i)$. This optimization problem can be solved as a regular quantile regression problem since $Z_i$ is a low-dimensional vector. Following the lines of Li et al. (2015), we can estimate $\mathcal{C}(\tau)$ by $\widehat{\mathcal{C}}(\tau)=\frac{1}{n}\sum_{i=1}^n\psi_\tau(Y_i-\hat{\alpha}_{Y\mid Z}(\tau)-\hat{\gamma}_{Y\mid Z}^T(\tau)Z_i)\mathcal{X}_i$. In addition, $\Sigma_k$ can be estimated, up to a multiplicative constant, by $\widehat{\Sigma}_k=n^{-1}\sum_{i=1}^n\mathcal{X}_{i(k)}\mathcal{X}_{i(k)}^T$, where $\mathcal{X}_{i(k)}$ is the mode-$k$ matricization of $\mathcal{X}_i$ $(i=1,\ldots,n;\ k=1,\ldots,m)$ (Zhang and Li, 2017; Li and Zhang, 2017). We require $n\prod_{j\neq k}p_j>p_k$ so that $\widehat{\Sigma}_k$ is invertible. By implementing Algorithm 1 with $\mathcal{C}(\tau)$ and $\Sigma_k$ replaced by $\widehat{\mathcal{C}}(\tau)$ and $\widehat{\Sigma}_k$, respectively, we then obtain the sample counterparts of $W_k(\tau)$ and $\mathcal{T}(\tau)$, denoted by $\widehat{W}_k(\tau)$ and $\widehat{\mathcal{T}}(\tau)\equiv\llbracket\mathcal{X};\widehat{W}_1^T(\tau),\ldots,\widehat{W}_m^T(\tau)\rrbracket$, respectively. Next, we obtain the estimators $\hat{\alpha}(\tau)$, $\hat{\gamma}(\tau)$, and $\widehat{\mathcal{D}}(\tau)$ from Step 6 and finally estimate $\mathcal{B}_0(\tau)$ by $\widehat{\mathcal{B}}(\tau)=\llbracket\widehat{\mathcal{D}}(\tau);\widehat{W}_1(\tau),\ldots,\widehat{W}_m(\tau)\rrbracket$ in Step 7.
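The sample quantities above are simple moments once the low-dimensional quantile regression of $Y$ on $Z$ is fitted. A sketch for a matrix ($m=2$) covariate, with the quantile fit done via the classical linear-programming formulation (helper names are ours; this is an illustration, not the authors' code):

```python
import numpy as np
from scipy.optimize import linprog

def fit_qr(y, Z, tau):
    """Quantile regression of y on (1, Z) via the classical LP:
    min tau*1'u + (1-tau)*1'v  s.t.  Ztil @ beta + u - v = y, u, v >= 0."""
    n = len(y)
    Zt = np.column_stack([np.ones(n), Z])
    p = Zt.shape[1]
    c = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([Zt, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]  # (alpha_hat, gamma_hat)

def qpcov_hat(y, Z, X, tau):
    """Sample quantile partial tensor covariance:
    C_hat(tau) = n^{-1} sum_i psi_tau(Y_i - a - g'Z_i) X_i."""
    beta = fit_qr(y, Z, tau)
    resid = y - np.column_stack([np.ones(len(y)), Z]) @ beta
    psi = tau - (resid < 0).astype(float)
    return np.einsum("i,ijk->jk", psi, X) / len(y)

def sigma_k_hat(X, k):
    """Sigma_k up to a constant: average of the mode-k matricized X_i
    times its transpose (X has shape (n, p1, p2))."""
    Xk = np.moveaxis(X, k + 1, 1).reshape(X.shape[0], X.shape[k + 1], -1)
    return np.einsum("iab,icb->ac", Xk, Xk) / X.shape[0]
```

The LP formulation is the standard one for quantile regression; any quantile regression solver (e.g., `quantreg` in R) could be substituted for `fit_qr`.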

Selecting proper $d_k$'s is a crucial part of the PQTR procedure. A natural way is to conduct cross-validation (CV) using the quantile regression loss function (2) as the criterion function. However, a CV method is typically computationally intensive. Therefore, we consider an eigenvalue-ratio (ER) approach, which is expected to be more computationally efficient. The key rationale is that Steps 3–4 of PQTR are expected to generate $\widehat{W}_k(\tau)$ associated with the eigenvalues of $\widehat{\mathcal{C}}_{0k}(\tau)\widehat{\mathcal{C}}_{0k}^T(\tau)$ that are nonzero or much larger than the remaining eigenvalues. Consequently, we propose to choose $d_k$ by maximizing the ratio of two adjacent eigenvalues of $\widehat{\mathcal{C}}_{0k}(\tau)\widehat{\mathcal{C}}_{0k}^T(\tau)$, following existing strategies in the literature, for example, Lam and Yao (2012), Ahn and Horenstein (2013), and Fan et al. (2016). Specifically, for each $k$, we first generate $\lambda_{k1}\ge\cdots\ge\lambda_{kl_k^{\max}}$, the largest $l_k^{\max}$ eigenvalues of $\widehat{\mathcal{C}}_{0k}(\tau)\widehat{\mathcal{C}}_{0k}^T(\tau)$, where $l_k^{\max}=\lfloor(n-p_z-1)^{1/m}\rfloor$ and $\lfloor x\rfloor$ denotes the largest integer not exceeding $x$. We then choose $d_k$ as $\hat{d}_k=\arg\max_{1\le l\le l_k^{\max}-1}\{\lambda_{kl}/\lambda_{k(l+1)}\}$. Based on our simulation studies, both the CV and ER approaches perform well with realistic sample sizes, and the CV approach generally takes longer to implement. Thus, in practice, one may use either the ER or the CV approach to select the $d_k$'s when applying the proposed PQTR procedure, with the ER method preferred when saving computational cost matters.
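Given the standardized cross covariance, the ER selection itself is a few lines; a sketch (assuming the retained eigenvalues are positive; names are ours):

```python
import numpy as np

def select_dk(C0k, l_max):
    """Eigenvalue-ratio criterion: pick d_k maximizing
    lambda_l / lambda_{l+1} over the top l_max eigenvalues of C0k C0k^T."""
    lam = np.linalg.eigvalsh(C0k @ C0k.T)[::-1][:l_max]  # decreasing order
    ratios = lam[:-1] / lam[1:]   # adjacent-eigenvalue ratios
    return int(np.argmax(ratios)) + 1
```

For instance, a standardized cross covariance with two dominant singular values yields a large gap after the second eigenvalue, so the ratio criterion returns 2.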

3. Population Interpretation of Partial Quantile Tensor Regression

3.1. Population latent variable model

It is critically important to understand how the proposed PQTR procedure can serve to address the quantile tensor regression model (1). To this end, in Propositions 1 and 2, we first show that PQTR induces a latent “orthogonal” decomposition of 𝒳, and then identify a general mode-wise condition under which PQTR has an appealing population latent variable model representation. To better understand the general mode-wise condition and the population model, in Section 3.2, we propose a novel envelope quantile tensor regression (EQTR) model. In Section 3.3, we study the connection between the EQTR model and the general mode-wise condition, which entails an elegant population interpretation of the estimates obtained from the PQTR procedure.

Proposition 1

The PQTR procedure decomposes $\mathcal{X}$ as

$$\mathcal{X}=\llbracket\mathcal{T}(\tau);P_1^T(\tau),\ldots,P_m^T(\tau)\rrbracket+\mathcal{E}(\tau),\qquad(3)$$

where $P_k(\tau)=\{W_k^T(\tau)\Sigma_kW_k(\tau)\}^{-1}W_k^T(\tau)\Sigma_k\in\mathbb{R}^{d_k\times p_k}$ and

$$\mathcal{E}(\tau)=\sum_{j=0}^{m-1}\ \sum_{\sum_{l=1}^m u_l=j}\llbracket\mathcal{X};Q_1(\tau,u_1),\ldots,Q_m(\tau,u_m)\rrbracket\in\mathbb{R}^{p_1\times\cdots\times p_m}.$$

Here, $u_l\in\{0,1\}$ for $l=1,\ldots,m$, and $Q_k(\tau,u_k)=(1-2u_k)Q_k(\tau)+u_kI_{p_k\times p_k}$ with $Q_k(\tau)=Q_{d_kk}(\tau)$ defined in PQTR. Furthermore, $\mathcal{E}(\tau)$ and $\mathcal{T}(\tau)$ are uncorrelated in the sense that $\mathrm{vcov}(\mathcal{E}(\tau),\mathcal{T}(\tau))=0$.

Proposition 2

Under model (1), if $\mathrm{span}(\mathcal{B}_0(\tau)_{(k)})\subseteq\mathrm{span}(W_k(\tau))$ for $k=1,\ldots,m$, the quantile tensor regression model (1) reduces to the latent variable models (3) and (4):

$$Q_Y(\tau\mid\mathcal{X},Z)=Q_Y(\tau\mid\mathcal{T}(\tau),Z)=\alpha_0(\tau)+\gamma_0^T(\tau)Z+\langle\mathcal{D}_0(\tau),\mathcal{T}(\tau)\rangle,\qquad(4)$$

where $\mathcal{D}_0(\tau)=\llbracket\mathcal{B}_0(\tau);P_1(\tau),\ldots,P_m(\tau)\rrbracket\in\mathbb{R}^{d_1\times\cdots\times d_m}$.

Propositions 1 and 2 imply that $\mathcal{X}$ can be divided into two uncorrelated parts, only one of which is related to the $\tau$th conditional quantile of $Y$, through the latent variable $\mathcal{T}(\tau)$, provided that $\mathrm{span}(\mathcal{B}_0(\tau)_{(k)})\subseteq\mathrm{span}(W_k(\tau))$. Implementing the proposed PQTR procedure first renders an approximation of $\mathcal{T}(\tau)$ (as indicated by Proposition 1), and subsequently provides estimates of $\mathcal{D}_0(\tau)$ under model (4) and then of $\mathcal{B}_0(\tau)$ through Steps 6 and 7. The population interpretation of PQTR implied by the latent variable models (3) and (4) shares a similar spirit with the classical latent variable interpretation of PLS under linear regression (Helland, 1990).

3.2. Envelope quantile tensor regression model

To better understand the scenarios implied by the mode-wise condition $\mathrm{span}(\mathcal{B}_0(\tau)_{(k)})\subseteq\mathrm{span}(W_k(\tau))$, which leads to the population latent variable models (3) and (4), we further propose and study an envelope quantile tensor regression (EQTR) model, which adapts the nascent envelope concept (Cook et al., 2010; Cook, 2018) to the quantile tensor regression model (1). Intuitively speaking, we formulate the new EQTR model along the lines of sufficient dimension reduction (Cook, 1998), dividing the tensor covariate $\mathcal{X}$ into a material part and an immaterial part. As shown in Section 3.3, $\mathrm{span}(\mathcal{B}_0(\tau)_{(k)})\subseteq\mathrm{span}(W_k(\tau))$ $(k=1,\ldots,m)$ holds under the EQTR model. This allows us to bridge the new EQTR model and the proposed PQTR procedure via the latent variable models (3) and (4).

To formulate the EQTR model, we propose to address the division into the material and immaterial parts in a mode-wise manner, as a special accommodation for the tensor covariate. Specifically, under model (1), we assume that there exists a series of $\tau$-specific subspaces, $S_{k,\tau}\subseteq\mathbb{R}^{p_k}$ $(k=1,\ldots,m)$ with dimensions $\tilde{r}_k\le p_k$, such that for $\tau\in\Delta$,

$$\mathrm{cov}(\mathcal{X}\times_kQ_{S_{k,\tau}},\ \mathcal{X}\times_kP_{S_{k,\tau}})=0,\qquad(\mathrm{E1})$$
$$Q_Y(\tau\mid\mathcal{X},Z)=Q_Y(\tau\mid\mathcal{X}\times_kP_{S_{k,\tau}},Z).\qquad(\mathrm{E2})$$

Here, $P_{S_{k,\tau}}\in\mathbb{R}^{p_k\times p_k}$ is the orthogonal projection matrix onto $S_{k,\tau}$, and $Q_{S_{k,\tau}}$ is the orthogonal projection matrix onto the orthogonal complement of $S_{k,\tau}$. When $m=1$ (i.e., a vector covariate), these assumptions are the same as those adopted by Ding et al. (2020). Condition (E1) means that $\mathcal{X}$ can be decomposed into two uncorrelated mode-$k$ linear combinations, $\mathcal{X}\times_kQ_{S_{k,\tau}}$ and $\mathcal{X}\times_kP_{S_{k,\tau}}$. Condition (E2) implies that $\mathcal{X}$ influences the $\tau$th conditional quantile of $Y$ only through $\mathcal{X}\times_kP_{S_{k,\tau}}$. These conditions share the same spirit with the interpretations of models (3) and (4).

To understand how conditions (E1) and (E2) are connected to the nascent envelope concept, in the following, we introduce the definitions of reducing subspace and envelope (Cook et al., 2010; Zhang and Li, 2017; Cook, 2018), and present Proposition 3, which reveals the envelope structure implied by (E1) and (E2).

Definition 1

A subspace $\mathcal{S}\subseteq\mathbb{R}^r$ is said to be a reducing subspace of $M\in\mathbb{R}^{r\times r}$ if $\mathcal{S}$ decomposes $M$ as $M=P_{\mathcal{S}}MP_{\mathcal{S}}+Q_{\mathcal{S}}MQ_{\mathcal{S}}$, where $P_{\mathcal{S}}$ is the projection matrix onto $\mathcal{S}$ and $Q_{\mathcal{S}}$ is the projection matrix onto the orthogonal complement of $\mathcal{S}$. If $\mathcal{S}$ is a reducing subspace of $M$, we say that $\mathcal{S}$ reduces $M$.
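Definition 1 is easy to verify numerically: the span of any set of eigenvectors of a symmetric $M$ reduces $M$. A small NumPy check (names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
M = A @ A.T                        # symmetric matrix M
_, vecs = np.linalg.eigh(M)
G = vecs[:, :2]                    # basis of S: span of two eigenvectors of M
P = G @ G.T                        # projection onto S
Q = np.eye(5) - P                  # projection onto the orthogonal complement
assert np.allclose(M, P @ M @ P + Q @ M @ Q)   # S reduces M
```

The cross terms $P_{\mathcal{S}}MQ_{\mathcal{S}}$ vanish because eigenvectors of a symmetric matrix are mutually orthogonal.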

Definition 2

Let $M\in\mathbb{R}^{r\times r}$ and let $\mathcal{S}\subseteq\mathrm{span}(M)$. Then the $M$-envelope of $\mathcal{S}$, denoted by $\mathcal{E}_M(\mathcal{S})$, is the intersection of all reducing subspaces of $M$ that contain $\mathcal{S}$. For a matrix $A$ such that $\mathrm{span}(A)\subseteq\mathrm{span}(M)$, define $\mathcal{E}_M(A)=\mathcal{E}_M(\mathrm{span}(A))$, where $\mathrm{span}(A)$ is the space spanned by the columns of $A$.

Definition 3

Let $\mathcal{A}\in\mathbb{R}^{p_1\times\cdots\times p_m}$ with $\mathrm{vec}(\mathcal{A})\in\mathrm{span}(M)$ for $M\in\mathbb{R}^{(\prod_{k=1}^mp_k)\times(\prod_{k=1}^mp_k)}$. The $M$-tensor envelope of $\mathcal{A}$, denoted by $\mathcal{E}_M(\mathcal{A})$, is the intersection of all reducing subspaces of $M$ that contain $\mathrm{span}(\mathrm{vec}(\mathcal{A}))$, that is, $\mathcal{E}_M(\mathcal{A})=\mathcal{E}_M(\mathrm{vec}(\mathcal{A}))$.

Proposition 3

Under model (1), conditions (E1) and (E2) imply that for any $\tau\in\Delta$, $\mathcal{B}_0(\tau)=\llbracket\mathcal{K}(\tau);\Gamma_1(\tau),\ldots,\Gamma_m(\tau)\rrbracket$ and

$$\Sigma_k=\Gamma_k(\tau)\Omega_k(\tau)\Gamma_k^T(\tau)+\Gamma_{0k}(\tau)\Omega_{0k}(\tau)\Gamma_{0k}^T(\tau),\quad k=1,\ldots,m,\qquad(5)$$

where $\mathcal{K}(\tau)\in\mathbb{R}^{\tilde{r}_1\times\cdots\times\tilde{r}_m}$, $\Gamma_k(\tau)\in\mathbb{R}^{p_k\times\tilde{r}_k}$ denotes a semi-orthogonal basis of $S_{k,\tau}$ such that $P_{S_{k,\tau}}=\Gamma_k(\tau)\Gamma_k^T(\tau)$, $\Gamma_{0k}(\tau)\in\mathbb{R}^{p_k\times(p_k-\tilde{r}_k)}$ denotes a basis of the orthogonal complement of $S_{k,\tau}$ such that $Q_{S_{k,\tau}}=\Gamma_{0k}(\tau)\Gamma_{0k}^T(\tau)$, and $\Omega_k(\tau)=\Gamma_k^T(\tau)\Sigma_k\Gamma_k(\tau)\in\mathbb{R}^{\tilde{r}_k\times\tilde{r}_k}$ and $\Omega_{0k}(\tau)=\Gamma_{0k}^T(\tau)\Sigma_k\Gamma_{0k}(\tau)\in\mathbb{R}^{(p_k-\tilde{r}_k)\times(p_k-\tilde{r}_k)}$ are two symmetric positive-definite matrices.

Proposition 3 shows that $\Sigma_k$ can be decomposed into two orthogonal parts by projecting onto $S_{k,\tau}$ and its orthogonal complement, and is connected with $\mathcal{B}_0(\tau)$ through $\Gamma_k(\tau)$, a semi-orthogonal basis of $S_{k,\tau}$. From Proposition 3, we also see that $\mathrm{span}(\mathcal{B}_0(\tau)_{(k)})\subseteq S_{k,\tau}$ and $\Sigma_k=Q_{S_{k,\tau}}\Sigma_kQ_{S_{k,\tau}}+P_{S_{k,\tau}}\Sigma_kP_{S_{k,\tau}}$; hence $S_{k,\tau}$ is a reducing subspace of $\Sigma_k$ that contains $\mathrm{span}(\mathcal{B}_0(\tau)_{(k)})$. Therefore, we further set $S_{k,\tau}=\mathcal{E}_{\Sigma_k}(\mathcal{B}_0(\tau)_{(k)})$. According to Definition 2, the so-defined $S_{k,\tau}$'s uniquely exist. Model (1), combined with conditions (E1) and (E2), confers the proposed EQTR model.

3.3. Connection between EQTR model and the PQTR procedure

In this subsection, we show that under the proposed EQTR model, the mode-wise sufficient condition in Proposition 2, $\mathrm{span}(\mathcal{B}_0(\tau)_{(k)})\subseteq\mathrm{span}(W_k(\tau))$, is satisfied, and thus PQTR can serve as a legitimate approach to providing scalable estimation of the EQTR model.

First, we note that, provided $\mathrm{span}(\mathcal{B}_0(\tau)_{(k)})\subseteq S_{k,\tau}$ and $S_{k,\tau}=\mathcal{E}_{\Sigma_k}(\mathcal{B}_0(\tau)_{(k)})$, if we can show $\mathcal{E}_{\Sigma_k}(\mathcal{B}_0(\tau)_{(k)})\subseteq\mathrm{span}(W_k(\tau))$, then $\mathrm{span}(\mathcal{B}_0(\tau)_{(k)})\subseteq\mathrm{span}(W_k(\tau))$ follows. Following this line, we first establish the equivalence between $\mathrm{span}(W_k(\tau))$ and $\mathcal{E}_{\Sigma_k}(\mathcal{C}(\tau)_{(k)})$ in Proposition 4:

Proposition 4

Let $r_k$ be the dimension of $\mathcal{E}_{\Sigma_k}(\mathcal{C}(\tau)_{(k)})$ for $k=1,\ldots,m$. Under model (1), for $k=1,\ldots,m$ and $\tau\in\Delta$, $\mathrm{span}(W_k(\tau))=\mathcal{E}_{\Sigma_k}(\mathcal{C}(\tau)_{(k)})$ if $d_k\ge r_k$.

The result in Proposition 4 suggests that the PQTR procedure finds a semi-orthogonal basis of $\mathcal{E}_{\Sigma_k}(\mathcal{C}(\tau)_{(k)})$.

The next step is to link $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}_0(\tau))$ and $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{C}(\tau))$. To this end, we utilize the result from Lemma 1, $V^{-1}(\tau)\mathrm{vec}(\mathcal{C}(\tau))=\mathrm{vec}(\mathcal{B}_0(\tau))$. However, $V^{-1}(\tau)$, the matrix that bridges the tensor coefficient and the quantile partial tensor covariance, does not have a Kronecker structure, unlike its counterpart in the linear regression setting, $\Sigma_{\mathcal{X}}^{-1}$. This prevents us from attaining the mode-wise sufficient condition that ensures the nice population latent variable interpretation by straightforwardly adapting the method of Zhang and Li (2017). To tackle this difficulty, we propose a novel set of regularity structural constraints tailored to quantile tensor regression.

Let $U(\tau)$ denote the semi-orthogonal basis of $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}_0(\tau))$ whose columns are eigenvectors of $\Sigma_{\mathcal{X}}$, and let $Q_{U(\tau)}$ denote the projection matrix onto the orthogonal complement of $\operatorname{span}(U(\tau))$. We impose the following conditions:

(E3) (i) $E[(\operatorname{vec}(\mathcal{X})^T,Z^T)^T\mid U(\tau)^T\operatorname{vec}(\mathcal{X}),Z]$ is a linear function of $(\operatorname{vec}(\mathcal{X})^T U(\tau),Z^T)^T$; and (ii) $Q_{U(\tau)}E[\operatorname{vec}(\mathcal{X})Z^T]=0$.

(E4) $U(\tau)^T\operatorname{vec}(\mathcal{C}(\tau))$ contains no zero elements.

Condition (E3)(i) mimics the condition in Proposition 2 of Cook and Zhang (2015) and plays a role in connecting the $V(\tau)$-envelope with the $\Sigma_{\mathcal{X}}$-envelope. In particular, condition (E3)(i), known as the linearity condition, has been widely considered in the literature on sufficient dimension reduction (Cook, 1998; Li and Wang, 2007) and envelope models for generalized linear models (Cook and Zhang, 2015). A sufficient condition for (E3)(i) is that $(\operatorname{vec}(\mathcal{X})^T,Z^T)^T$ follows an elliptically contoured distribution, for example, the multivariate normal distribution. Diaconis and Freedman (1984) and Hall and Li (1993) demonstrated that almost all projections of high-dimensional data are approximately normal, and thus this assumption is usually considered mild. Condition (E3)(ii) means that the covariance between $\mathcal{X}$ and $Z$ lies in $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}_0(\tau))$. A stronger version of condition (E3)(ii) is $E[\operatorname{vec}(\mathcal{X})Z^T]=0$, implying that $\mathcal{X}$ and $Z$ are uncorrelated. This occurs in practice, for example, when $Z$ represents a randomized treatment indicator.

Condition (E4) is a crucial technical condition that leads to $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}_0(\tau))\subseteq\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{C}(\tau))$. In view of model (6) or (E2) and the definition of $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}_0(\tau))$, $U(\tau)^T\operatorname{vec}(\mathcal{X})$ may be regarded as latent variables that capture all the information regarding the association between the $\tau$th conditional quantile of $Y$ and $\mathcal{X}$ and contain no redundant information. Since $U(\tau)^T\operatorname{vec}(\mathcal{C}(\tau))=E[R(\tau)U(\tau)^T\operatorname{vec}(\mathcal{X})]$, which reflects the association between $U(\tau)^T\operatorname{vec}(\mathcal{X})$ and the $\tau$th conditional quantile of $Y$ after adjusting for the marginal effect of $Z$, it is reasonable to assume that all elements of $E[R(\tau)U(\tau)^T\operatorname{vec}(\mathcal{X})]$ are nonzero; otherwise, the elements of $U(\tau)^T\operatorname{vec}(\mathcal{X})$ corresponding to the zero elements of $E[R(\tau)U(\tau)^T\operatorname{vec}(\mathcal{X})]$ would be superfluous. Under conditions (E3) and (E4), we establish the relationship between $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}_0(\tau))$ and $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{C}(\tau))$ in Proposition 5, which subsequently leads to the conclusion in Corollary 1 that justifies the latent variable model in Proposition 2.

Proposition 5

Suppose model (1) and conditions (E3) and (E4) hold, the assumptions of Lemma 1 are satisfied, and $d_k\ge r_k$ $(k=1,\ldots,m)$. We have $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}_0(\tau))\subseteq\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{C}(\tau))$. Furthermore, let $f_{Y\mid U(\tau)^T\operatorname{vec}(\mathcal{X}),Z}(y)$ denote the conditional density function of $Y$ given $(U(\tau)^T\operatorname{vec}(\mathcal{X}),Z)$. If $f_{Y\mid U(\tau)^T\operatorname{vec}(\mathcal{X}),Z}(y)=f_{Y\mid\mathcal{X},Z}(y)$ for all $y\in\mathbb{R}$, then $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}_0(\tau))=\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{C}(\tau))$.

Corollary 1

Under the assumptions of Proposition 5, $\mathcal{B}_0(\tau)_{(k)}\subseteq\operatorname{span}(W_k(\tau))$.

Unlike the result for TEPLS under linear regression (Zhang and Li, 2017), the finding from Proposition 5 suggests that $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}_0(\tau))$ is usually not equal to but rather a subspace of $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{C}(\tau))$. A possible explanation for this unique phenomenon of PQTR is tied to the fact that $V^{-1}(\tau)\operatorname{vec}(\mathcal{C}(\tau))=\operatorname{vec}(\mathcal{B}_0(\tau))$ while $V(\tau)$ involves $f_{Y\mid\mathcal{X},Z}(\cdot)$, which contains information on $\mathcal{B}_0(\tau)$ across all $\tau\in(0,1)$; thus it is intuitive that $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{C}(\tau))$ can be larger than $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}_0(\tau))$. A sufficient condition for $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}_0(\tau))=\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{C}(\tau))$ is $f_{Y\mid U(\tau)^T\operatorname{vec}(\mathcal{X}),Z}(y)=f_{Y\mid\mathcal{X},Z}(y)$. A special case where this condition holds is when model (1) reduces to the standard linear tensor model with i.i.d. errors. Another special case is the scale-shift model $Y=(\langle\mathcal{B},\mathcal{X}\rangle+\gamma^T Z)\epsilon$, where $\epsilon$ is independent of $(\mathcal{X},Z)$.

The result in Corollary 1 implies that the latent variable model representation of PQTR, given by (3)-(4), holds under the proposed EQTR model. In sum, the PQTR procedure finds $W_k(\tau)$, a semi-orthogonal basis of $\mathcal{E}_{\Sigma_k}(\mathcal{C}(\tau)_{(k)})$ that contains $\mathcal{B}_0(\tau)_{(k)}$, and subsequently yields estimates for $\alpha_0(\tau)$, $\gamma_0(\tau)$, and $\mathcal{B}_0(\tau)$ in model (1) through the latent variable model (4).

4. Asymptotic Properties of the Proposed Estimator

We study the asymptotic properties of the estimator obtained from the proposed PQTR procedure, denoted by $\hat\theta(\tau)=(\hat\alpha(\tau),\hat\gamma(\tau),\hat{\mathcal{B}}(\tau))$. We first introduce the necessary notation. Write $\theta(\tau)=(\alpha(\tau),\gamma(\tau),\mathcal{B}(\tau))$ and $\theta_0(\tau)=(\alpha_0(\tau),\gamma_0(\tau),\mathcal{B}_0(\tau))$. We also define the $L_2$-metric $d(\theta_1,\theta_2)$ on the parameter space $\Theta(\tau)\subseteq\mathbb{R}\times\mathbb{R}^{p_z}\times\mathbb{R}^{\prod_{k=1}^m p_k}$ as $d(\theta_1,\theta_2)=\{(\alpha_1-\alpha_2)^2+\|\gamma_1-\gamma_2\|_2^2+\|\mathcal{B}_1-\mathcal{B}_2\|_F^2\}^{1/2}$, where $\|\mathcal{B}\|_F=\{\operatorname{vec}(\mathcal{B})^T\operatorname{vec}(\mathcal{B})\}^{1/2}$ stands for the Frobenius norm.
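For concreteness, this metric can be computed directly; the following is a minimal numpy sketch, where the function name and the tuple representation of $\theta$ are ours for illustration:

```python
import numpy as np

def param_distance(theta1, theta2):
    """L2-type metric on theta = (alpha, gamma, B): the square root of
    the squared scalar difference plus the squared Euclidean norm of the
    vector difference plus the squared Frobenius norm of the tensor
    difference."""
    a1, g1, B1 = theta1
    a2, g2, B2 = theta2
    return float(np.sqrt((a1 - a2) ** 2
                         + np.sum((np.asarray(g1) - np.asarray(g2)) ** 2)
                         + np.sum((np.asarray(B1) - np.asarray(B2)) ** 2)))
```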

In Theorem 1, we establish the consistency and convergence rate of $\hat\theta(\tau)$.

Theorem 1

Suppose that regularity conditions (C1)-(C6) in Section S2 of the Supplementary Materials and model (4) hold. If $d_k=r_k$, then $d(\hat\theta(\tau),\theta_0(\tau))=o_p(1)$ and $\sqrt{n}\,d(\hat\theta(\tau),\theta_0(\tau))=O_p(1)$ as $n\to\infty$.

Note that model (4) holds under the assumptions of Proposition 5. Therefore, by Theorem 1, we establish the $\sqrt{n}$-consistency of the estimator obtained from the PQTR procedure under the proposed EQTR model.

We also establish the consistency of the eigenvalue ratio method in Theorem 2.

Theorem 2

Suppose the regularity conditions (C1)-(C6) in Section S2 and condition (D1) in Section S7 of the Supplementary Materials hold. Under model (4), $\hat d_k$ obtained from the eigenvalue ratio method (described in Section 2.3) equals $r_k$ with probability approaching 1, i.e., $\Pr(\hat d_k=r_k)\to1$ as $n\to\infty$.
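To illustrate the eigenvalue ratio criterion underlying $\hat d_k$, the following generic sketch selects the index maximizing the ratio of consecutive eigenvalues of a symmetric positive semi-definite matrix. It is a simplified stand-in for the criterion, not the paper's exact estimator, and the matrix `M` represents any estimated covariance-type matrix:

```python
import numpy as np

def eigenvalue_ratio_rank(M, max_rank=None):
    """Select a rank by the eigenvalue ratio criterion: take the index
    j maximizing lambda_j / lambda_{j+1} over the leading eigenvalues
    of the symmetric psd matrix M."""
    lam = np.linalg.eigvalsh(M)[::-1]      # eigenvalues in descending order
    lam = np.clip(lam, 1e-12, None)        # guard against division by zero
    if max_rank is None:
        max_rank = len(lam) - 1
    ratios = lam[:max_rank] / lam[1:max_rank + 1]
    return int(np.argmax(ratios)) + 1      # 1-based rank
```

When the leading eigenvalues are well separated from the rest, the largest consecutive ratio occurs at the boundary, which is what the consistency result in Theorem 2 formalizes for the PQTR setting.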

Proofs of Lemma 1, Propositions 1-5, and Theorems 1-2 are provided in Sections S4-S7 of the Supplementary Materials.

5. Simulation Studies

We evaluate the finite sample performance of the proposed method and compare it with some alternative methods via Monte Carlo simulations. We first consider scenarios with a second-order tensor covariate, $\mathcal{X}\in\mathbb{R}^{50\times50}$, where the tensor coefficient $\mathcal{B}(\tau)\in\mathbb{R}^{50\times50}$ is either the same for all $\tau$ (i.e., homogeneous scenarios) or varies across different $\tau$'s (i.e., heterogeneous scenarios).

For the homogeneous scenarios, we generate responses as $Y_i=\gamma^T Z_i+\langle\mathcal{B},\mathcal{X}_i\rangle+\Phi_t^{-1}(\xi_i)$, where the $\xi_i$'s are i.i.d. standard uniform random variables and $\Phi_t$ is the distribution function of Student's $t$ distribution with 1 degree of freedom. For $\mathcal{B}$, as shown in Figures S1 and S2 in the Supplementary Materials, we consider frame and bi-circle shapes, which are realistic for mental health neuroimaging datasets. The frame shape represents a low-rank case of $\mathcal{B}$, while the bi-circle shape is a challenging high-rank case with rank 12, where $\mathcal{D}_0(\tau)$ includes 144 parameters. In this case, $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}(\tau))=\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{C}(\tau))$ and $r_k$, the dimension of $\mathcal{E}_{\Sigma_k}(\mathcal{C}(\tau)_{(k)})$, equals $\tilde r_k$, the rank of $\mathcal{B}$, for $k=1,2$.
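For concreteness, the homogeneous generation scheme can be sketched in Python. The specific frame pattern, the i.i.d. standard normal tensor entries, and the value of `gamma` below are illustrative simplifications, not the paper's exact design (which uses a Kronecker-structured covariance for $\mathcal{X}_i$); note that $\Phi_t^{-1}$ with 1 degree of freedom is the Cauchy quantile function $\tan\{\pi(u-1/2)\}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 50

# Frame-shaped low-rank coefficient: an illustrative stand-in for the
# paper's frame shape (nonzero border of a sub-square, zero interior).
B = np.zeros((p, p))
B[10:40, 10:40] = 1.0
B[12:38, 12:38] = 0.0

gamma = np.array([0.5, -0.5])                 # illustrative vector effect
Z = rng.standard_normal((n, 2))
# Simplification: i.i.d. N(0,1) tensor entries in place of the paper's
# Kronecker-structured covariance.
X = rng.standard_normal((n, p, p))
xi = rng.uniform(size=n)
eps = np.tan(np.pi * (xi - 0.5))              # t(1) (Cauchy) quantile transform
Y = Z @ gamma + np.einsum('ijk,jk->i', X, B) + eps
```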

For the heterogeneous scenarios, we generate responses as $Y_i=\gamma^T Z_i+\langle\mathcal{B}(\xi_i),\mathcal{X}_i\rangle+\Phi_t^{-1}(\xi_i)$, where the $\xi_i$'s are i.i.d. standard uniform random variables and each element of $\mathcal{B}(\xi_i)$ is increasing with respect to $\xi_i$. We consider two settings for $\mathcal{B}(\tau)$. In the first setting, $\mathcal{B}(\tau)=\Phi^{-1}(\tau)B$, as illustrated in Figure S5 in the Supplementary Materials. In this case, $\mathcal{B}(\tau)$ is a matrix with the same rank and bases across different $\tau$'s, but the values of its elements change with $\tau$. The first setting yields $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}(\tau))=\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{C}(\tau))$ and $r_k=\tilde r_k$. For the second setting, we set $\mathcal{B}(\tau)=B_1 I(0<\tau<0.35)+B_2 I(0.35\le\tau<0.65)+B_3 I(0.65\le\tau<1)$ and plot the resulting $\mathcal{B}(\tau)$ in Figure 1 and Figure S6 in the Supplementary Materials. The $\mathcal{B}(\tau)$ shown in Figure S6 in the Supplementary Materials is defined based on $\{B_l\}_{l=1}^3$ of the same rank 1 but different bases. The $\mathcal{B}(\tau)$ shown in Figure 1 corresponds to $\{B_l\}_{l=1}^3$ with different ranks and bases, representing the most challenging setting. In this setting, $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}(\tau))$ is not always equal to $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{C}(\tau))$ because the rank of $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{C}(\tau))$ is 3 while the rank of $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{B}(\tau))$ equals the rank of $\mathcal{B}(\tau)$, which is sometimes smaller than 3.
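The step-function coefficient of the second setting can be encoded directly; in this sketch the `B1`, `B2`, `B3` arguments are placeholders for the matrices described above:

```python
def B_of_tau(tau, B1, B2, B3):
    """Step-function coefficient over quantile levels:
    B(tau) = B1*I(0 < tau < 0.35) + B2*I(0.35 <= tau < 0.65)
             + B3*I(0.65 <= tau < 1)."""
    if 0.0 < tau < 0.35:
        return B1
    if 0.35 <= tau < 0.65:
        return B2
    return B3
```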

Figure 1:

Empirical averages of estimates for the heterogeneous tri-square-shaped tensor coefficients that are step functions of τ with τ=0.1,0.25,0.5,0.75, and 0.9 when the covariance of the tensor covariate follows an envelope structure (Envelope) or a diagonal constant structure (Diagonal).

For both homogeneous and heterogeneous scenarios, we generate $\mathcal{X}_i$ with covariance $\Sigma_2\otimes\Sigma_1$, where $\Sigma_k$ $(k=1,2)$ either follows the envelope structure in (5) or is a diagonal-constant matrix that depicts a reasonable spatial correlation pattern but does not follow the envelope structure in (5). The generation schemes for $\mathcal{X}_i$ and $Z_i$ are detailed in Sections S1.1 and S1.2 of the Supplementary Materials.

In addition, we consider a scenario with a 3-way tensor covariate of dimension $16\times16\times16$, where the tensor coefficient is heterogeneous over $\tau$, as shown in Figure S8 in the Supplementary Materials. The tensor covariate $\mathcal{X}_i$, along with $Z_i$ and $Y_i$, is generated in the same manner as in the heterogeneous setting with a 2-way tensor covariate.

We compare the performance of the following five methods:

Fix: the proposed PQTR procedure with $d_k$ fixed as the true rank of $\mathcal{B}$ in the homogeneous scenario, and as the true rank of $B$ or the dimension of $\sum_{l=1}^{3}\operatorname{span}(B_l)$ (i.e., the largest possible rank of $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{C}(\tau))$) in the heterogeneous scenario;

ER: the proposed PQTR procedure with dk’s estimated by the eigenvalue ratio method;

CV: the proposed PQTR procedure with dk’s estimated by 5-fold CV;

CP: Li and Zhang (2021)'s CP-decomposition-based method for quantile tensor regression, which uses an ABU algorithm and adopts the fused penalty;

TK: Lu et al. (2020)'s Tucker-decomposition-based method for quantile tensor regression, which uses an ABU algorithm with a lasso-type penalty and does not accommodate a low-dimensional vector predictor;

PCA: quantile regression with a reduced covariate vector obtained by applying principal component analysis (PCA) to the vectorized tensor covariate.
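For reference, the reduction step of this PCA benchmark can be sketched as follows; the subsequent quantile regression fit on the returned scores is omitted, and the function name is illustrative:

```python
import numpy as np

def pca_reduce_tensor(X, n_components):
    """Unsupervised reduction for the PCA benchmark: vectorize each
    tensor covariate, center, and project onto the leading principal
    components. The reduced scores then replace vec(X) as covariates
    in an ordinary quantile regression."""
    n = X.shape[0]
    V = X.reshape(n, -1)
    V = V - V.mean(axis=0)
    # principal directions from the SVD of the centered data matrix
    _, _, Vt = np.linalg.svd(V, full_matrices=False)
    return V @ Vt[:n_components].T
```

The key contrast with PQTR is that this projection uses only the covariance of the covariates, not the response.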

Since the rank of $\mathcal{E}_{\Sigma_{\mathcal{X}}}(\mathcal{C}(\tau))$ is unknown with real datasets, the method Fix serves as an oracle or benchmark version of the proposed method, while the methods ER and CV represent practical versions of the proposed PQTR. The fused penalty used in CP was recommended and used in Li and Zhang (2021)'s numerical studies. Both the CP and TK methods select the ranks from 1 to 3 and determine the tuning parameters by minimizing BIC, as suggested by Li and Zhang (2021). To save the computational cost of TK, we assume the same rank for all modes when the tensor coefficient is 3-way. The initial values of CP and TK are generated from the $L_1$-norm quantile regression (Li and Zhu, 2008) with the vectorized tensor covariates and the tuning parameter 0.1. All five methods are implemented in MATLAB. The MATLAB code for CP is obtained from the original implementation provided by Li and Zhang (2021), and the MATLAB code for TK is translated from the R code used by Lu et al. (2020). We evaluate these methods with sample sizes 50, 100, and 200, which are realistic in neuroimaging studies, based on 100 replicates. The set of quantile levels of interest is specified as $\Delta=\{0.1,0.25,0.5,0.75,0.9\}$.

In Figure 1 and Figures S5, S6, and S10 in the Supplementary Materials, which correspond to the heterogeneous settings of $\mathcal{B}(\tau)$ with either a two-way or a three-way tensor covariate, we present the empirical averages of the estimated tensor coefficients based on the five methods described above. These figures help assess the estimation biases of the different methods by visualization. Figure S7 in the Supplementary Materials includes box plots, on the logarithmic scale, of the Frobenius norm of the difference between the true and estimated tensor coefficients, which provide a meaningful quantification of estimation variability. Figure S8 in the Supplementary Materials presents box plots of the computation time (in seconds) of the different methods.

Based on the results for the heterogeneous scenarios shown in Figures 1, S5, S6, and S10, when the EQTR model holds, the proposed methods Fix, ER, and CV can distinguish and capture the shapes of the true tensor coefficient across different $\tau$'s quite well for either 2-way or 3-way tensor coefficients. The empirical biases of Fix, ER, and CV decrease as the sample size increases. ER seems to perform slightly better than CV in terms of estimation bias, while CV generally shows better estimation accuracy than ER (see Figure S7). In contrast, CP and TK yield noisy estimates across all $\tau$'s. As suggested by Figure S7, this may be caused by the presence of outlying estimates, which may occur when CP and TK converge to local optima or fail to converge. Since the rank of the true tensor coefficient in the heterogeneous scenarios is always smaller than or equal to 3, the noisy results from CP and TK are not caused by assuming a rank lower than needed but likely arise because the ABU optimization algorithm of CP or TK frequently converges to local optima given the larger number of parameters to estimate. When the EQTR model does not hold, the average estimates based on the proposed methods Fix, ER, and CV may still capture the shapes of the true tensor coefficients given the relatively large sample size of 200, and CV may outperform ER, particularly in the three-way tensor coefficient case. In contrast, both CP and TK produce noisy estimates suffering from larger empirical biases and estimation variability. In all cases, the PCA approach tends to underestimate the magnitude of the nonzero components of the tensor coefficient; it may completely miss the shape of the true tensor coefficient when the sample size is small (e.g., $n=50$).

Based on the comparison of computation times in Figure S8, the implementations of the proposed methods Fix, ER, and CV have speeds comparable to that of PCA and can be much faster than CP and TK. As expected, CV generally requires longer computation time than ER.

Regarding the comparisons of the proposed methods with CP, TK, and PCA, we have very similar findings from the simulations in the homogeneous settings. We also consider TEPLS as a benchmark method in the homogeneous cases. Figures S1-S2 show that TEPLS has performance comparable to PQTR in terms of recovering the shape of $\mathcal{B}(\tau)$. Figure S3 shows that TEPLS may yield smaller estimation errors than PQTR. This is expected, as TEPLS directly utilizes the true linear relationship between the response and the tensor covariate while the proposed method assumes a more flexible relationship. However, TEPLS may sometimes generate unstable estimates, as suggested by the many outlying points in the box plots corresponding to TEPLS (see Figure S3). Figure S4 shows that the proposed PQTR procedure demands a similar amount of computing time to TEPLS. More details are provided in Section S1.1 of the Supplementary Materials.

In summary, our simulation studies support both statistical and computational advantages of the proposed PQTR methods over the existing benchmark approaches.

6. Application

We apply the PQTR procedure to a neuroimaging dataset collected from a cohort of 98 female PTSD patients recruited by the Grady Trauma Project (Ressler et al., 2011). The goal of our analyses is to investigate the association between brain functional connectivity and the severity of PTSD symptoms, measured by the PTSD symptom scale (PSS) total score, while adjusting for the effect of age. In our analyses, the response $Y$ corresponds to the PSS total score, the low-dimensional vector covariate $Z$ stands for age, and the tensor covariate $\mathcal{X}$ represents the functional connectivity matrix. Connectivity measures were derived from the rs-fMRI based on the Seitzman atlas, which includes 300 nodes across the brain. After removing 21 nodes with missing data for one or more participants, we analyze the connectivity matrix based on the remaining 279 nodes, which are grouped into functional modules. Before fitting the quantile tensor regression model (1), we apply Fisher's z-transformation to each entry of $\mathcal{X}$, which is bounded between $-1$ and $1$ as it is defined as a Pearson correlation coefficient. We also set the diagonal elements of $\mathcal{X}$ to zero to avoid infinite values after the Fisher's z-transformation.
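This preprocessing step amounts to the following sketch (the function name is ours; Fisher's z-transformation is the inverse hyperbolic tangent applied entrywise):

```python
import numpy as np

def fisher_z_connectivity(C):
    """Apply Fisher's z-transformation entrywise to a correlation-valued
    connectivity matrix, zeroing the diagonal first so that arctanh(1)
    never produces infinite values."""
    C = np.array(C, dtype=float)
    np.fill_diagonal(C, 0.0)
    return np.arctanh(C)
```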

We implement the proposed PQTR procedure with $\Delta=\{0.25,0.4,0.5,0.6,0.75\}$. This allows us to examine how brain functional connectivity is associated with the low ($\tau=0.25$), middle ($\tau=0.4,0.5,0.6$), and high ($\tau=0.75$) ranges of the PSS total score. As discussed in Section 2.2, to accommodate the symmetric nature of the functional connectivity matrix (and thus $\mathcal{B}(\tau)$), we slightly modify the proposed PQTR procedure. Specifically, we use the elements in the lower triangle of $\mathcal{T}(\tau)$ for the quantile regression conducted at Step 6, obtain the lower-triangular entries of $\mathcal{D}(\tau)$, and then symmetrically reflect the lower-triangular entries to impute the corresponding upper-triangular entries of $\mathcal{D}(\tau)$. We also compare the quantile regression results to mean-based linear regression by TEPLS, with modifications similar to those applied to PQTR to include the low-dimensional vector covariate, for a fair comparison. We let $d_1=d_2$ for a two-way tensor (matrix) coefficient and employ the eigenvalue ratio method to save computational cost. By the eigenvalue ratio method, $d_k$ $(k=1,2)$ is selected as 1, 3, 3, 2, and 2 for PQTR with $\tau=0.25,0.4,0.5,0.6$, and $0.75$, respectively, and as 2 for TEPLS.
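The symmetric-reflection step just described can be sketched as follows (numpy-based; the function name is ours):

```python
import numpy as np

def reflect_lower_triangle(D):
    """Keep the lower-triangular entries (including the diagonal) of a
    square coefficient matrix and mirror the strictly lower part into
    the upper triangle, yielding a symmetric matrix."""
    D = np.asarray(D, dtype=float)
    return np.tril(D) + np.tril(D, -1).T
```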

In Figure 2, we present the standardized estimated tensor coefficients thresholded at 5 (i.e., set to 0 if the magnitude is smaller than 5). The threshold 5 is chosen to yield a reasonably sparse visualization of the estimated tensor coefficients. The dots in red (or blue) indicate edges that may have significant positive (or negative) effects on the quantiles or mean of the PSS total score. These edges are also visualized in brain maps in Figure S10 in the Supplementary Materials. A counterpart of Figure 2 that adopts 4 as the thresholding cutoff is Figure S13 in the Supplementary Materials. From Figure 2, we observe a heterogeneous pattern in the effects of brain functional connectivity on different ranges of the PSS total score. For example, the estimated coefficients for some edges within the cingulo opercular and default mode networks are negative for effects on lower quantiles (e.g., $\tau=0.25$) but not for effects on upper quantiles (e.g., $\tau=0.75$). This suggests that the negative associations of these edges with PTSD symptom severity may mainly be present among patients with mild PTSD symptoms. For some edges within the somatomotor dorsal network or between the somatomotor dorsal area and the somatomotor lateral area, the coefficients are positive only at upper quantiles (e.g., $\tau=0.75$). An implication of this result is that stronger connectivity of these edges may be reflective of worse symptoms in patients with severe PTSD symptoms. These findings are generally consistent with those from some recent neuroimaging studies. For example, Zhang et al. (2020) reported that connectivities within the somatomotor network are positively correlated with the severity of PTSD symptoms, and Zandvakili et al. (2020) found that connectivities in the default mode network have a negative relationship with PTSD symptoms.
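The thresholding rule used for this visualization is a simple entrywise hard threshold; a minimal sketch (function name illustrative):

```python
import numpy as np

def hard_threshold(B_std, cutoff=5.0):
    """Set entries of a standardized coefficient matrix to zero when
    their magnitude falls below the cutoff, for sparse visualization."""
    out = np.array(B_std, dtype=float)
    out[np.abs(out) < cutoff] = 0.0
    return out
```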

Figure 2:

The heat map visualization of the standardized $\hat{\mathcal{B}}(\tau)$ by ER thresholded at the cutoff 5, and the standardized estimated tensor coefficient based on mean tensor regression by TEPLS thresholded at the cutoff 5. The gray lines separate brain regions belonging to different brain functional networks, which include: Auditory (Aud), Somatomotor Dorsal (SMd), Cingulo Opercular (CO), Default Mode (DMN), Fronto Parietal (FP), Somatomotor Lateral (SMl), Visual (Vis), and Salience (Sal).

In contrast, TEPLS, which adopts mean-based linear tensor regression, discovers little effect of connections within the somatomotor dorsal network and the default mode network on the PSS total score based on our dataset. In fact, no effect estimates by TEPLS pass the threshold 5, as shown in Figure 2. This may reflect diluted effects resulting from assuming constant effects across different ranges of the PSS total score, an assumption inherent to linear regression modeling.

In addition, we apply the CP and TK methods to the PTSD dataset. As commented in Section 1, CP and TK, by using ABU algorithms, are not flexible in accounting for the symmetric structure of $\mathcal{X}$. In our analyses, we heuristically make the final coefficient estimate symmetric by specifying it as the average of the estimate obtained from CP (or TK) and its transpose. A similar approach was used by Zhang et al. (2022). Despite this adjustment, we find the results from CP or TK difficult to interpret, as discussed in detail in Section S1.4 of the Supplementary Materials.

As illustrated by this example, quantile tensor regression, as addressed by the proposed PQTR procedure, provides a flexible and robust tool to gain a detailed view regarding the association between brain functional connectivity and symptom severity of PTSD, which may not be attained by existing approaches.

7. Concluding remarks

This paper represents the first work that thoroughly studies the integration of the PLS strategy with quantile regression for a tensor covariate. We not only deliver an efficient algorithm (i.e., PQTR) that provides estimates scalable to a large tensor covariate but also uncover a meaningful way to interpret the resulting estimates. Moreover, we establish the connection between the latent variable model underlying the PQTR procedure and a new envelope model uniquely formulated for quantile tensor regression. These results contribute useful theoretical insights.

The proposed PQTR procedure bears some similarity to principal component analysis/regression (PCA) in the sense that both are two-step procedures that carry out dimension reduction in the first step and regression in the second step. At the same time, there are several main distinctions. First, applying an existing PCA procedure to quantile tensor regression requires first vectorizing the tensor covariate $\mathcal{X}$. This can lead to interpretation issues, as the inherent spatial structure of $\mathcal{X}$ is ignored. In contrast, PQTR circumvents such issues by retaining the original tensor form of $\mathcal{X}$. Second, PCA conducts unsupervised dimension reduction, which utilizes only the covariance of the covariates and ignores the information from the response (Cook, 2018), whereas the PQTR procedure implements supervised dimension reduction guided by the correlation between the response and the tensor covariate $\mathcal{X}$. As demonstrated by our simulation studies, the supervised dimension reduction adopted by PQTR can lead to considerably improved empirical performance.

As enlightened by one referee, we note that the envelope estimation for $\mathcal{E}_{\Sigma_k}(\mathcal{C}(\tau)_{(k)})$ involved in PQTR may be tackled by the Non-Iterative Enveloped Component Estimation (NIECE) procedure of Zhang et al. (2023), leading to a useful variant of PQTR. While NIECE has a principal component regression formulation, the resulting variant differs from the regular application of PCA to tensor regression. This is because the quantile partial tensor covariance $\mathcal{C}(\tau)$ still plays a role in determining the envelopes in the former but is not involved in the latter. Delineating the connection between the proposed PQTR and this variant merits future research.

For the tensor covariate, we assume the $p_k$'s $(k=1,\ldots,m)$ are fixed and do not increase with the sample size $n$. This covers realistic tensor covariates such as those encountered in neuroimaging studies. When the $p_k$'s diverge with the sample size $n$, one may consider imposing regularization to further promote sparsity, similar to the development of sparse PLS methods for linear regression (e.g., Chung and Keles, 2010). As pointed out by one referee, one may also consider adapting the sieve estimation technique to allow for diverging $d_k$'s $(k=1,\ldots,m)$, thereby relaxing the low-rank tensor constraint imposed by the proposed method. These suggest promising directions for future research. The work presented in this paper sets a critical foundation for such investigations.

Supplementary Material

Supp 1

Acknowledgments

*The authors gratefully acknowledge that this work was supported by the National Institutes of Health under Award Numbers R01MH079448, R01HL113548, R01MH105561, and R01DK136023. Qiu's work was partly supported by the National Natural Science Foundation of China (NSFC) (12071164). The authors are grateful to Dr. Joshua Lukemire for his help with the preparation of the rs-fMRI dataset of the PTSD study.

Footnotes

Supplementary Materials

The Supplementary Materials provide additional simulation and data analysis results as well as detailed proofs of the presented theorems, propositions, corollary, and lemma.

References

  1. Ahn SC and Horenstein AR (2013), ‘Eigenvalue ratio test for the number of factors’, Econometrica 81(3), 1203–1227. [Google Scholar]
  2. Bro R. (1996), ‘Multiway calibration. Multilinear PLS’, Journal of Chemometrics 10(1), 47–61. [Google Scholar]
  3. Caffo BS, Crainiceanu CM, Verduzco G, Joel S, Mostofsky SH, Bassett SS, and Pekar JJ , (2010), ‘Two-stage decompositions for the analysis of functional connectivity for fMRI with application to Alzheimer’s disease risk’, NeuroImage 51(3), 1140–1149. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Chung D and Keles S (2010), ‘Sparse partial least squares classification for high dimensional data’, Statistical Applications in Genetics and Molecular Biology 9(1). [Google Scholar]
  5. Cook RD (1998), Regression Graphics: Ideas for Studying Regressions through Graphics, Wiley, New York. [Google Scholar]
  6. Cook RD (2018), An Introduction to Envelopes: Dimension Reduction for Efficient Estimation in Multivariate Statistics, John Wiley & Sons, Hoboken, NJ. [Google Scholar]
  7. Cook RD, Helland IS and Su Z (2013), ‘Envelopes and partial least squares regression’, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 75(5), 851–877. [Google Scholar]
  8. Cook RD, Li BD and Chiaromonte F (2010), ‘Envelope models for parsimonious and efficient multivariate linear regression’, Statistica Sinica 20(3), 927–960. [Google Scholar]
  9. Cook RD and Zhang X (2015), ‘Foundations for envelope models and methods’, Journal of the American Statistical Association 110(510), 599–611. [Google Scholar]
  10. de Jong S. (1993), ‘SIMPLS: An alternative approach to partial least squares regression’, Chemometrics and Intelligent Laboratory Systems 18(3), 251–263. [Google Scholar]
  11. Diaconis P and Freedman D (1984), ‘Asymptotics of graphical projection pursuit’, The Annals of Statistics 12(3), 793–815. [Google Scholar]
  12. Ding S, Su Z, Zhu G and Wang L (2020), ‘Envelope quantile regression’, Statistica Sinica 31(1), 79–106. [Google Scholar]
  13. Dodge Y and Whittaker J (2009), ‘Partial quantile regression’, Metrika 70(1), 35–57. [Google Scholar]
  14. Eliseyev A and Aksenova T (2013), ‘Recursive N-way partial least squares for brain-computer interface’, PLOS ONE 8(7), e69962. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Fan J, Fan Y and Barut E (2014), ‘Adaptive robust variable selection’, Ann. Statist. 42, 324–351. [Google Scholar]
  16. Fan J, Liao Y and Wang W (2016), ‘Projected principal component analysis in factor models’, Annals of statistics 44(1), 219. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Guhaniyogi R, Qamar S and Dunson DB (2017), ‘Bayesian tensor regression’, Journal of Machine Learning Research 18(79), 1–31. [Google Scholar]
  18. Hall P and Li K-C (1993), ‘On almost linearity of low dimensional projections from high dimensional data’, The Annals of Statistics 21(2), 867–889. [Google Scholar]
  19. Helland IS (1988), ‘On the structure of partial least squares regression’, Communications in Statistics - Simulation and Computation 17(2), 581–607. [Google Scholar]
  20. Helland IS (1990), ‘Partial least squares regression and statistical models’, Scandinavian Journal of Statistics 17(2), 97–114. [Google Scholar]
  21. Helland IS (2001), ‘Some theoretical aspects of partial least squares regression’, Chemometrics and Intelligent Laboratory Systems 58(2), 97–107. [Google Scholar]
  22. Hoff PD (2011), ‘Separable covariance arrays via the Tucker product, with applications to multivariate relational data’, Bayesian Analysis 6(2), 179–196. [Google Scholar]
  23. Koenker R (2005), Quantile Regression, Cambridge University Press. [Google Scholar]
  24. Koenker R and Bassett G (1978), ‘Regression quantiles’, Econometrica: journal of the Econometric Society pp. 33–50. [Google Scholar]
  25. Lam C and Yao Q (2012), ‘Factor modeling for high-dimensional time series: inference for the number of factors’, The Annals of Statistics 40(2), 694–726. [Google Scholar]
  26. Leng C and Pan G (2018), ‘Covariance estimation via sparse kronecker structures’, Bernoulli 24(4B), 3833–3863. [Google Scholar]
  27. Li B and Wang S (2007), ‘On directional regression for dimension reduction’, Journal of the American Statistical Association 102(479), 997–1008. [Google Scholar]
  28. Li C and Zhang H (2021), ‘Tensor quantile regression with application to association between neuroimages and human intelligence’, The Annals of Applied Statistics 15(3), 1455–1477. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Li G, Li Y and Tsai C-L (2015), ‘Quantile correlations and quantile autoregressive modeling’, Journal of the American Statistical Association 110(509), 246–261. [Google Scholar]
  30. Li L and Zhang X (2017), ‘Parsimonious tensor response regression’, Journal of the American Statistical Association 112(519), 1131–1146. [Google Scholar]
  31. Li X, Xu D, Zhou H and Li L (2018), ‘Tucker tensor regression and neuroimaging analysis’, Statistics in Biosciences 10(3), 520–545. [DOI] [PMC free article] [PubMed] [Google Scholar]
  32. Li Y and Zhu J (2008), ‘L1-norm quantile regression’, Journal of Computational and Graphical Statistics 17(1), 163–185. [Google Scholar]
  33. Lu W, Zhu Z and Lian H (2020), ‘High-dimensional quantile tensor regression’, Journal of Machine Learning Research 21(250), 1–31. [PMC free article] [PubMed] [Google Scholar]
  34. Qi L and Luo Z (2017), Tensor Analysis: Spectral Theory and Special Tensors, SIAM. [Google Scholar]
  35. Ressler KJ, Mercer KB, Bradley B, Jovanovic T, Mahan A, Kerley K, Norrholm SD, Kilaru V, Smith AK, Myers AJ et al. (2011), ‘Post-traumatic stress disorder is associated with pacap and the pac1 receptor’, Nature 470(7335), 492–497. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Tsiligkaridis T and Hero AO (2013), ‘Covariance estimation in high dimensions via kronecker product expansions’, IEEE Transactions on Signal Processing 61(21), 5347–5360. [Google Scholar]
  37. Wang L, Wu Y and Li R (2012), ‘Quantile regression for analyzing heterogeneity in ultrahigh dimension’, J. Amer. Stat. Assoc 101, 1418–1429. [Google Scholar]
  38. Wold H. (1982), Soft modeling: the basic design and some extensions, in Joreskog KG and Wold H, eds, ‘Systems Under Indirect Observation: Causality, Structure, Prediction’, Vol. 2, North-Holland, pp. 1–54. [Google Scholar]
  39. Worsley KJ, Taylor JE, Tomaiuolo F and Lerch J (2004), ‘Unified univariate and multivariate random field theory’, NeuroImage 23, S189–S195. [DOI] [PubMed] [Google Scholar]
  40. Zandvakili A, Barredo J, Swearingen HR, Aiken EM, Berlow YA, Greenberg BD, Carpenter LL and Philip NS (2020), ‘Mapping PTSD symptoms to brain networks: a machine learning study’, Translational Psychiatry 10(1), 195. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Zhang J, Sun WW and Li L (2022), ‘Generalized Connectivity Matrix Response Regression with Applications in Brain Connectivity Studies’, Journal of Computational and Graphical Statistics 0, to appear. [Google Scholar]
  42. Zhang L, Li D and Yin H (2020), ‘How is psychological stress linked to sleep quality? The mediating role of functional connectivity between the sensory/somatomotor network and the cingulo-opercular control network’, Brain and Cognition 146, 105641. [DOI] [PubMed] [Google Scholar]
  43. Zhang X, Deng K and Mai Q (2023), ‘Envelopes and principal component regression’, Electronic Journal of Statistics 17(2), 2447–2484. [Google Scholar]
  44. Zhang X and Li L (2017), ‘Tensor envelope partial least-squares regression’, Technometrics 59(4), 426–436. [Google Scholar]
  45. Zhao Q, Caiafa CF, Mandic DP, Chao ZC, Nagasaka Y, Fujii N, Zhang L and Cichocki A (2013), ‘Higher order partial least squares (HOPLS): a generalized multilinear regression method’, IEEE Transactions on Pattern Analysis and Machine Intelligence 35(7), 1660–1673. [DOI] [PubMed] [Google Scholar]
  46. Zheng Q, Gallagher C and Kulasekera KB (2013), ‘Adaptive penalized quantile regression for high dimensional data’, J. Statist. Plann. Inference 143, 1029–1038. [Google Scholar]
  47. Zhou H and Li L (2013), ‘Regularized matrix regression’, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 76(2), 463–483. [Google Scholar]
  48. Zhou H, Li L and Zhu H (2013), ‘Tensor regression with applications in neuroimaging data analysis’, Journal of the American Statistical Association 108(502), 540–552. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Zhou J, Sun WW, Zhang J and Li L (2021), ‘Partially observed dynamic tensor response regression’, Journal of the American Statistical Association pp. 1–16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Zou H and Yuan M (2008), ‘Composite quantile regression and the oracle model selection theory’, Annals of Statistics 36, 1108–1126. [Google Scholar]
