Sensors (Basel, Switzerland). 2019 Dec 3;19(23):5335. doi: 10.3390/s19235335

Stable Tensor Principal Component Pursuit: Error Bounds and Efficient Algorithms

Wei Fang 1,*, Dongxu Wei 2, Ran Zhang 3
PMCID: PMC6928658  PMID: 31817050

Abstract

The rapid development of sensor technology has given rise to huge amounts of tensor (i.e., multi-dimensional array) data. For various reasons such as sensor failures and communication loss, the tensor data may be corrupted not only by small noises but also by gross corruptions. This paper studies Stable Tensor Principal Component Pursuit (STPCP), which aims to recover a tensor from its corrupted observations. Specifically, we propose an STPCP model based on the recently proposed tubal nuclear norm (TNN), which has shown superior performance in comparison with other tensor nuclear norms. Theoretically, we rigorously prove that under tensor incoherence conditions, the underlying tensor and the sparse corruption tensor can be stably recovered. Algorithmically, we first develop an ADMM algorithm and then accelerate it by designing a new algorithm based on orthogonal tensor factorization. The superiority and efficiency of the proposed algorithms are demonstrated through experiments on both synthetic and real data sets.

Keywords: tensor principal component pursuit, stable recovery, tensor SVD, ADMM

1. Introduction

In recent years, different types of tensor data have emerged with the significant progress of modern sensor technology, such as color images [1], videos [2], functional MRI data [3], hyper-spectral images [4], point cloud data [5], traffic stream data [6], etc. Thanks to their multi-way nature, tensor-based methods have a natural advantage over vector- and matrix-based methods in analyzing and processing ubiquitous modern multi-way data, and have found extensive applications in computer vision [1,7], data mining [5], machine learning [2], and signal processing [8], to name a few. In real applications, the acquired tensor data may often suffer from noises and gross corruptions for many different reasons, such as sensor failure, lens pollution, communication interference, occlusion in videos, or abnormalities in a sensor network [9]. At the same time, many real-world tensor data, such as face images or videos, have been shown to have some low-dimensional structure and can be well approximated by a small number of “principal components” [8]. Then, a question naturally arises: how can we pursue the principal components of an observed data tensor in the presence of both noises and gross corruptions? We answer this question in this paper and refer to the proposed methodology as Stable Tensor Principal Component Pursuit (STPCP).

Tensor low-rankness is an ideal model of the property that a data tensor can be well approximated by a small number of principal components [8]. In the last decade, low-rank tensor models have attracted much attention in many fields [10]. There are multiple low-rank tensor models since there exist different definitions of tensor rank. Among these models, the low CP rank model [11] and the low Tucker rank model [1] are probably the two best known. The low CP rank model approximates the underlying tensor by the sum of a small number of rank-1 tensors, whereas the low Tucker rank model assumes that the unfolding matrices along each mode are low rank. To estimate an unknown low-rank tensor from corrupted observations, it is natural to consider the rank minimization problem, which chooses the tensor of lowest rank from a certain feasible set as the solution. However, rank minimization, even in its 2-way (matrix) case, is generally NP-hard [12], and it is even harder in higher-way cases [13]. For tractable solutions, researchers turn to a variety of convex surrogates for tensor rank [1,14,15,16,17,18] to replace the tensor rank in the rank minimization problem. Methods based on surrogates for the tensor CP rank and Tucker rank have been extensively explored on both the theoretical side and the application side [14,17,19,20,21,22,23,24].

Recently, the low-tubal-rank model [16,25] has shown better performance than traditional tensor low-rank models in many tensor recovery tasks such as image/video inpainting/denoising/sensing [2,25,26], moving object detection [27], multi-view learning [28], seismic data completion [29], WiFi fingerprinting [30], MRI imaging [16], point cloud data inpainting [31], and so on. The tubal rank is a new complexity measure of tensors defined through the framework of tensor singular value decomposition (t-SVD) [32,33]. At the core of existing low-tubal-rank models is the tubal nuclear norm (TNN), which is a convex surrogate for the tubal rank. In contrast to CP-based or Tucker-based tensor nuclear norms, which model low-rankness in the original domain, TNN models low-rankness in the Fourier domain. It is pointed out in [25,34,35] that TNN is superior to traditional tensor nuclear norms in exploiting the ubiquitous “spatial-shifting” property in real-world tensor data.

Inspired by the superior performance of TNN, this paper adopts TNN as a low-rank regularizer in the proposed STPCP model. Specifically, the proposed STPCP aims to estimate the underlying tensor data $\underline{L}_0\in\mathbb{R}^{n_1\times n_2\times n_3}$ from an observation tensor $\underline{M}$ polluted by both small dense noises and sparse gross corruptions as follows:

$$\underline{M}=\underline{L}_0+\underline{S}_0+\underline{E}_0, \tag{1}$$

where $\underline{S}_0$ is a tensor denoting the sparse corruptions and $\underline{E}_0$ is a tensor representing small dense noises. Model (1) is also known as robust tensor decomposition in [36,37].

Our STPCP model is first formulated as a TNN-based convex problem. Then, our theoretical analysis gives upper bounds on the estimation errors of $\underline{L}_0$ and $\underline{S}_0$. In contrast to the analysis in [37], our analysis guarantees that the proposed STPCP exactly recovers the underlying tensor $\underline{L}_0$ and the sparse corruption tensor $\underline{S}_0$ when the noise term $\underline{E}_0$ vanishes. For efficient solution of the proposed STPCP model, we develop two algorithms with extensions to a more challenging scenario where missing observations are also considered. The first algorithm is an ADMM algorithm and the second algorithm accelerates it using tensor factorization. Experiments show the effectiveness and the efficiency of the designed algorithms.

We organize the rest of this paper as follows. In Section 2, we briefly introduce basic preliminaries for t-SVD and some related works. The proposed STPCP model is formulated and analyzed theoretically in Section 3. We design two algorithms in Section 4 and report experimental results in Section 5. This work is concluded in Section 6. The proofs of theorems, propositions, and lemmas are given in the appendix.

2. Preliminaries and Related Works

In this section, some preliminaries of t-SVD are first introduced. Then, the related works are presented.

Notations. We denote vectors by bold lower-case letters, e.g., $\mathbf{a}\in\mathbb{R}^{n}$, matrices by bold upper-case letters, e.g., $\mathbf{A}\in\mathbb{R}^{n_1\times n_2}$, and tensors by underlined upper-case letters, e.g., $\underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$. For a given 3-way tensor, we define its fiber as a vector obtained by fixing all indices but one, and its slice as a matrix obtained by fixing all indices but two. For a given 3-way tensor $\underline{A}$, we use $\underline{A}_{ijk}$ to denote its $(i,j,k)$-th element; $\mathbf{A}^{(k)}:=\underline{A}(:,:,k)$ is used to denote its $k$-th frontal slice. $\widetilde{\underline{A}}$ is used to denote the tensor obtained by performing the 1D Discrete Fourier Transformation (DFT) on all tube fibers $\underline{A}(i,j,:)$ of $\underline{A}$, $i=1,2,\ldots,n_1$, $j=1,2,\ldots,n_2$, which can be efficiently computed by the Matlab command fft(A,[],3). We use $\operatorname{dft}_3(\cdot)$ and $\operatorname{idft}_3(\cdot)$ to represent the 1D DFT and inverse DFT along the tube fibers of 3-way tensors, i.e., $\operatorname{dft}_3(\underline{A})$ := fft(A,[],3) and $\operatorname{idft}_3(\underline{A})$ := ifft(A,[],3).
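
As a minimal sketch of the two transforms just defined, the following Python snippet mirrors dft3 and idft3. The use of NumPy is an assumption made for illustration (the text itself only names the Matlab commands), and the function names simply copy the operators above.

```python
import numpy as np

def dft3(A):
    """1D DFT along the tube fibers (third axis), like fft(A,[],3) in Matlab."""
    return np.fft.fft(A, axis=2)

def idft3(A_hat):
    """Inverse 1D DFT along the tube fibers (third axis)."""
    return np.fft.ifft(A_hat, axis=2)

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 5))      # a small 3-way tensor, n1=4, n2=3, n3=5
A_hat = dft3(A)                         # Fourier version of A (complex-valued)
A_back = idft3(A_hat).real              # transforming back recovers A
roundtrip_error = np.max(np.abs(A_back - A))
# The first Fourier frontal slice is the sum of the original frontal slices.
first_slice_gap = np.max(np.abs(A_hat[:, :, 0] - A.sum(axis=2)))
```

Note that NumPy indexes frontal slices from 0, so the first frontal slice $\mathbf{A}^{(1)}$ of the text corresponds to `A[:, :, 0]`.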

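For a given matrix $\mathbf{M}\in\mathbb{R}^{n_1\times n_2}$, define the nuclear norm and spectral norm of $\mathbf{M}$ respectively as:

$$\|\mathbf{M}\|_{*}:=\sum_{i=1}^{p}\sigma_i(\mathbf{M}),\quad\text{and}\quad\|\mathbf{M}\|_{sp}:=\max_{i}\{\sigma_i(\mathbf{M})\},$$

where $p=\min\{n_1,n_2\}$, and $\sigma_1(\mathbf{M})\ge\cdots\ge\sigma_p(\mathbf{M})$ are the singular values of $\mathbf{M}$ in non-ascending order. The $l_0$-norm, $l_1$-norm, Frobenius norm, and $l_\infty$-norm of a tensor $\underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$ are defined as:

$$\|\underline{A}\|_0:=\sum_{ijk}\mathbf{1}(\underline{A}_{ijk}\neq 0),\quad\|\underline{A}\|_1:=\sum_{ijk}|\underline{A}_{ijk}|,\quad\|\underline{A}\|_F:=\sqrt{\sum_{ijk}\underline{A}_{ijk}^2},\quad\|\underline{A}\|_\infty:=\max_{ijk}|\underline{A}_{ijk}|,$$

where $\mathbf{1}(C)$ is an indicator function whose value is 1 if the condition $C$ is true, and 0 otherwise.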

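Given two matrices $\mathbf{A}=(a_{ij})\in\mathbb{C}^{n_1\times n_2}$ and $\mathbf{B}=(b_{ij})\in\mathbb{C}^{n_1\times n_2}$, we define their inner product as follows:

$$\langle\mathbf{A},\mathbf{B}\rangle=\operatorname{tr}(\mathbf{A}^{H}\mathbf{B})=\sum_{ij}\bar{a}_{ij}b_{ij},$$

where $\mathbf{A}^{H}$ denotes the conjugate transpose of matrix $\mathbf{A}$ and $\bar{a}_{ij}$ denotes the complex conjugate of $a_{ij}$. Given two 3-way tensors $\underline{A},\underline{B}\in\mathbb{R}^{n_1\times n_2\times n_3}$, we define their inner product as follows:

$$\langle\underline{A},\underline{B}\rangle:=\sum_{ijk}\underline{A}_{ijk}\underline{B}_{ijk}.$$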

2.1. Tensor Singular Value Decomposition

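We first define three operators based on block matrices, which are introduced in [33]. For a given 3-way tensor $\underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$, we define its block vectorization $\operatorname{bvec}(\cdot)$ and the inverse operation $\operatorname{bvfold}(\cdot)$ in the following equation:

$$\operatorname{bvec}(\underline{A}):=\begin{bmatrix}\mathbf{A}^{(1)}\\ \mathbf{A}^{(2)}\\ \vdots\\ \mathbf{A}^{(n_3)}\end{bmatrix}\in\mathbb{R}^{n_1n_3\times n_2},\qquad\operatorname{bvfold}(\operatorname{bvec}(\underline{A}))=\underline{A}.$$

We further define the block circulant matrix $\operatorname{bcirc}(\cdot)$ of any 3-way tensor $\underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$ as follows:

$$\operatorname{bcirc}(\underline{A}):=\begin{bmatrix}\mathbf{A}^{(1)}&\mathbf{A}^{(n_3)}&\cdots&\mathbf{A}^{(2)}\\ \mathbf{A}^{(2)}&\mathbf{A}^{(1)}&\cdots&\mathbf{A}^{(3)}\\ \vdots&\vdots&\ddots&\vdots\\ \mathbf{A}^{(n_3)}&\mathbf{A}^{(n_3-1)}&\cdots&\mathbf{A}^{(1)}\end{bmatrix}\in\mathbb{C}^{n_1n_3\times n_2n_3}.$$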

Equipped with above defined operators, we are now in a position to define the t-product of 3-way tensors.

Definition 1

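(t-product [33]). Given two tensors $\underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$ and $\underline{B}\in\mathbb{R}^{n_2\times n_4\times n_3}$, the t-product of $\underline{A}$ and $\underline{B}$ is a new 3-way tensor $\underline{C}$ of size $n_1\times n_4\times n_3$:

$$\underline{C}=\underline{A}*\underline{B}:=\operatorname{bvfold}\big(\operatorname{bcirc}(\underline{A})\operatorname{bvec}(\underline{B})\big). \tag{2}$$

A more intuitive interpretation of the t-product is as follows [33]. If we treat a 3-way tensor $\underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$ as a matrix of size $n_1\times n_2$ whose entries are the tube fibers, then the t-product can be understood as a “matrix multiplication” in which the standard scalar product is replaced with the circular convolution between tubes (i.e., vectors):

$$\underline{C}=\underline{A}*\underline{B}\iff\underline{C}(i,j,:)=\sum_{k=1}^{n_2}\underline{A}(i,k,:)\star\underline{B}(k,j,:),\quad i=1,2,\ldots,n_1,\ j=1,2,\ldots,n_4, \tag{3}$$

where ⋆ represents the circular convolution [33] of two vectors $\mathbf{a},\mathbf{b}\in\mathbb{R}^{n_3}$, defined as $(\mathbf{a}\star\mathbf{b})_j=\sum_{k=1}^{n_3}a_k b_{1+(j-k)\bmod n_3}$.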

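We also define the block diagonal matrix $\operatorname{bdiag}(\cdot)$ of any 3-way tensor $\underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$ and its inverse $\operatorname{bdfold}(\cdot)$ as follows:

$$\operatorname{bdiag}(\underline{A}):=\begin{bmatrix}\mathbf{A}^{(1)}&&\\&\ddots&\\&&\mathbf{A}^{(n_3)}\end{bmatrix}\in\mathbb{R}^{n_1n_3\times n_2n_3},\qquad\operatorname{bdfold}(\operatorname{bdiag}(\underline{A}))=\underline{A}.$$

We also use $\overline{\mathbf{A}}$ (or $\overline{\underline{A}}$) to denote the block diagonal matrix of the tensor $\widetilde{\underline{A}}=\operatorname{dft}_3(\underline{A})$ (i.e., the Fourier version of $\underline{A}$), i.e.,

$$\overline{\mathbf{A}}=\operatorname{bdiag}(\widetilde{\underline{A}}):=\begin{bmatrix}\widetilde{\mathbf{A}}^{(1)}&&\\&\ddots&\\&&\widetilde{\mathbf{A}}^{(n_3)}\end{bmatrix}\in\mathbb{C}^{n_1n_3\times n_2n_3}.$$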

The relationship between the DFT and circular convolution then indicates that computing the t-product in the original domain is equivalent to performing the standard matrix product on the Fourier block diagonal matrices [33]. Since the matrix product of the Fourier block diagonal matrices can be written, slice by slice, as the matrix products of all the frontal slices in the Fourier domain, we have the following relationships:

$$\underline{C}=\underline{A}*\underline{B}\iff\overline{\mathbf{C}}=\overline{\mathbf{A}}\,\overline{\mathbf{B}}\iff\widetilde{\mathbf{C}}^{(k)}=\widetilde{\mathbf{A}}^{(k)}\widetilde{\mathbf{B}}^{(k)},\quad k=1,2,\ldots,n_3. \tag{4}$$
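
The equivalence between the block-circulant definition of the t-product and the slice-wise product in the Fourier domain can be checked numerically. The sketch below is an illustration only, assuming NumPy; the helper names (`bvec`, `bvfold`, `bcirc`, `t_product_bcirc`, `t_product_fft`) are chosen here to mirror the operators in the text and are not taken from the authors' code.

```python
import numpy as np

def bvec(A):
    """Stack the frontal slices of A vertically: shape (n1*n3, n2)."""
    n1, n2, n3 = A.shape
    return A.transpose(2, 0, 1).reshape(n1 * n3, n2)

def bvfold(M, n3):
    """Inverse of bvec: rebuild an (n1, n2, n3) tensor from stacked slices."""
    n1 = M.shape[0] // n3
    return M.reshape(n3, n1, M.shape[1]).transpose(1, 2, 0)

def bcirc(A):
    """Block circulant matrix: block (i, j) is frontal slice (i - j) mod n3."""
    n1, n2, n3 = A.shape
    C = np.zeros((n1 * n3, n2 * n3))
    for i in range(n3):
        for j in range(n3):
            C[i * n1:(i + 1) * n1, j * n2:(j + 1) * n2] = A[:, :, (i - j) % n3]
    return C

def t_product_bcirc(A, B):
    """t-product via its block-circulant definition."""
    return bvfold(bcirc(A) @ bvec(B), A.shape[2])

def t_product_fft(A, B):
    """t-product via slice-wise matrix products in the Fourier domain."""
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    C_hat = np.einsum('ikt,kjt->ijt', A_hat, B_hat)  # one matrix product per slice
    return np.fft.ifft(C_hat, axis=2).real

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4, 5))
B = rng.standard_normal((4, 2, 5))
C1 = t_product_bcirc(A, B)
C2 = t_product_fft(A, B)
gap = np.max(np.abs(C1 - C2))   # the two routes agree up to rounding error
```

The FFT route avoids forming the $n_1n_3\times n_2n_3$ block circulant matrix, which is why it is the one usually preferred in practice.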

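The relationship between the t-product and the FFT also indicates that the inner product of two 3-way tensors $\underline{A},\underline{B}\in\mathbb{R}^{n_1\times n_2\times n_3}$ and the inner product of their corresponding Fourier block diagonal matrices $\overline{\mathbf{A}},\overline{\mathbf{B}}\in\mathbb{C}^{n_1n_3\times n_2n_3}$ satisfy the following relationship:

$$\langle\underline{A},\underline{B}\rangle=\frac{1}{n_3}\langle\widetilde{\underline{A}},\widetilde{\underline{B}}\rangle=\frac{1}{n_3}\langle\overline{\mathbf{A}},\overline{\mathbf{B}}\rangle.$$

When $\underline{A}=\underline{B}=\underline{X}$, one has:

$$\|\underline{X}\|_F=\frac{1}{\sqrt{n_3}}\|\overline{\mathbf{X}}\|_F.$$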
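
These two identities are instances of Parseval's relation for the DFT. A short numerical check, written as a sketch that assumes NumPy and uses illustrative variable names, is:

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, n3 = 3, 4, 5
A = rng.standard_normal((n1, n2, n3))
B = rng.standard_normal((n1, n2, n3))
A_hat = np.fft.fft(A, axis=2)
B_hat = np.fft.fft(B, axis=2)

# Inner product in the original domain versus the Fourier domain.
# np.vdot conjugates its first argument, matching <A, B> = sum conj(a) * b.
inner_original = np.sum(A * B)
inner_fourier = np.vdot(A_hat, B_hat).real / n3

# Frobenius norm in the original domain versus the Fourier domain.
fro_original = np.linalg.norm(A)
fro_fourier = np.linalg.norm(A_hat) / np.sqrt(n3)
```

The Frobenius norm of the block diagonal matrix equals that of the Fourier tensor, since the off-diagonal blocks are zero, so `np.linalg.norm(A_hat)` stands in for $\|\overline{\mathbf{A}}\|_F$ here.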

We further define the concepts of tensor transpose, identity tensor, f-diagonal tensor and orthogonal tensor as follows.

Definition 2

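(tensor transpose [33]). Given a tensor $\underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$, its transpose $\underline{A}^{\top}$ is the tensor of size $n_2\times n_1\times n_3$ formed by first transposing all the frontal slices of $\underline{A}$ and then reversing the order of the transposed frontal slices 2 through $n_3$, so that the $k$-th frontal slice of $\underline{A}^{\top}$ is the transpose of the $(n_3+2-k)$-th frontal slice of $\underline{A}$ for all $k=2,3,\ldots,n_3$.

For example, consider a 3-way tensor $\underline{A}=[\mathbf{A}^{(1)}|\mathbf{A}^{(2)}|\mathbf{A}^{(3)}|\mathbf{A}^{(4)}]\in\mathbb{R}^{n_1\times n_2\times 4}$ with 4 frontal slices. The tensor transpose $\underline{A}^{\top}$ of $\underline{A}$ is:

$$\underline{A}^{\top}=[(\mathbf{A}^{(1)})^{\top}|(\mathbf{A}^{(4)})^{\top}|(\mathbf{A}^{(3)})^{\top}|(\mathbf{A}^{(2)})^{\top}]\in\mathbb{R}^{n_2\times n_1\times 4}.$$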
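
The four-slice example can be reproduced with a few lines of code. The following sketch assumes NumPy; `t_transpose` is a name introduced here for illustration. It also checks a standard consequence for real tensors: each Fourier-domain frontal slice of the transpose is the conjugate transpose of the corresponding Fourier slice of the original.

```python
import numpy as np

def t_transpose(A):
    """Tensor transpose: transpose every frontal slice, then reverse slices 2..n3."""
    At = A.transpose(1, 0, 2)
    # Keep the first slice in place and append the remaining ones in reverse order.
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 2, 4))      # four frontal slices, as in the example
At = t_transpose(A)

# Slice order of the transpose: [A1^T | A4^T | A3^T | A2^T] (0-based: 0, 3, 2, 1).
order_ok = all(np.array_equal(At[:, :, k], A[:, :, j].T)
               for k, j in enumerate([0, 3, 2, 1]))

# Transposing twice returns the original tensor.
involution_ok = np.array_equal(t_transpose(At), A)

# In the Fourier domain the slices are conjugate transposes of each other.
A_hat = np.fft.fft(A, axis=2)
At_hat = np.fft.fft(At, axis=2)
fourier_gap = max(np.max(np.abs(At_hat[:, :, k] - A_hat[:, :, k].conj().T))
                  for k in range(4))
```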

Definition 3

(identity tensor [33]). The identity tensor $\underline{I}\in\mathbb{R}^{n\times n\times n_3}$ is the tensor whose first frontal slice is the $n$-by-$n$ identity matrix and whose other frontal slices are all zero matrices.

Definition 4

(f-diagonal tensor [33]). We call a 3-way tensor f-diagonal if all the frontal slices of it are diagonal matrices.

Definition 5

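(orthogonal tensor [33]). We call a tensor $\underline{Q}\in\mathbb{R}^{n\times n\times n_3}$ an orthogonal tensor if the following equations hold:

$$\underline{Q}^{\top}*\underline{Q}=\underline{Q}*\underline{Q}^{\top}=\underline{I}.$$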

Then, the tensor singular value decomposition (t-SVD) can be given as follows.

Definition 6

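(Tensor singular value decomposition [38]). Any 3-way tensor $\underline{X}\in\mathbb{R}^{n_1\times n_2\times n_3}$ has the following factorization, called the tensor singular value decomposition (t-SVD):

$$\underline{X}=\underline{U}*\underline{\Lambda}*\underline{V}^{\top}, \tag{5}$$

where the left and right factor tensors $\underline{U}\in\mathbb{R}^{n_1\times n_1\times n_3}$ and $\underline{V}\in\mathbb{R}^{n_2\times n_2\times n_3}$ are orthogonal, and the middle tensor $\underline{\Lambda}\in\mathbb{R}^{n_1\times n_2\times n_3}$ is a rectangular f-diagonal tensor.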

A visual illustration for the t-SVD is shown in Figure 1. It can be computed efficiently by FFT and IFFT in the Fourier domain according to Equation (4). For more details, see [2].

Figure 1. A visual illustration of t-SVD.
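
The Fourier-domain computation mentioned above can be sketched as follows. This is a minimal illustration assuming NumPy, not the authors' implementation; the names `t_svd`, `t_product` and `t_transpose` are introduced here. One matrix SVD is computed per Fourier frontal slice, and conjugate symmetry is enforced explicitly so that the factors come out real.

```python
import numpy as np

def t_product(A, B):
    """t-product via slice-wise products in the Fourier domain."""
    C_hat = np.einsum('ikt,kjt->ijt', np.fft.fft(A, axis=2), np.fft.fft(B, axis=2))
    return np.fft.ifft(C_hat, axis=2).real

def t_transpose(A):
    """Transpose each frontal slice and reverse the order of slices 2..n3."""
    At = A.transpose(1, 0, 2)
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

def t_svd(X):
    """Full t-SVD X = U * Lam * V^T of a real 3-way tensor."""
    n1, n2, n3 = X.shape
    idx = np.arange(min(n1, n2))
    X_hat = np.fft.fft(X, axis=2)
    U_hat = np.zeros((n1, n1, n3), dtype=complex)
    L_hat = np.zeros((n1, n2, n3), dtype=complex)
    V_hat = np.zeros((n2, n2, n3), dtype=complex)
    for k in range(n3 // 2 + 1):
        Xk = X_hat[:, :, k]
        if k == 0 or 2 * k == n3:
            Xk = Xk.real                 # these slices are real for real X
        U, s, Vh = np.linalg.svd(Xk, full_matrices=True)
        U_hat[:, :, k] = U
        L_hat[idx, idx, k] = s
        V_hat[:, :, k] = Vh.conj().T
    for k in range(n3 // 2 + 1, n3):     # remaining slices by conjugate symmetry
        U_hat[:, :, k] = U_hat[:, :, n3 - k].conj()
        L_hat[:, :, k] = L_hat[:, :, n3 - k]
        V_hat[:, :, k] = V_hat[:, :, n3 - k].conj()
    back = lambda T: np.fft.ifft(T, axis=2).real
    return back(U_hat), back(L_hat), back(V_hat)

rng = np.random.default_rng(3)
X = rng.standard_normal((4, 3, 5))
U, Lam, V = t_svd(X)
X_rec = t_product(t_product(U, Lam), t_transpose(V))
recon_error = np.max(np.abs(X_rec - X))

I = np.zeros((4, 4, 5))
I[:, :, 0] = np.eye(4)                   # identity tensor of Definition 3
orth_error = np.max(np.abs(t_product(t_transpose(U), U) - I))
```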

Definition 7

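(Tensor tubal rank [38]). The tensor tubal rank of any 3-way tensor $\underline{X}\in\mathbb{R}^{n_1\times n_2\times n_3}$ is defined as the number of non-zero tubes of $\underline{\Lambda}$ in its t-SVD shown in Equation (5), i.e.,

$$r_{tubal}(\underline{X}):=\sum_{i}\mathbf{1}(\underline{\Lambda}(i,i,:)\neq\mathbf{0}). \tag{6}$$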

Definition 8

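(Tubal average rank [38]). The tubal average rank $r_a(\underline{A})$ of any 3-way tensor $\underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$ is defined as the average of the ranks of all frontal slices of $\widetilde{\underline{A}}$, as follows:

$$r_a(\underline{A}):=\frac{1}{n_3}\sum_{k=1}^{n_3}\operatorname{rank}(\widetilde{\mathbf{A}}^{(k)}). \tag{7}$$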

Definition 9

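(Tensor operator norm [2,38]). The tensor operator norm $\|\underline{F}\|_{op}$ of any 3-way tensor $\underline{F}\in\mathbb{R}^{n_1\times n_2\times n_3}$ is defined as follows:

$$\|\underline{F}\|_{op}:=\sup_{\|\underline{A}\|_F\le 1}\|\underline{F}*\underline{A}\|_F. \tag{8}$$

The relationship between the t-product and the FFT indicates that:

$$\|\underline{F}\|_{op}=\sup_{\|\underline{A}\|_F\le 1}\|\underline{F}*\underline{A}\|_F=\sup_{\|\overline{\mathbf{A}}\|_F\le\sqrt{n_3}}\frac{1}{\sqrt{n_3}}\|\overline{\mathbf{F}}\cdot\overline{\mathbf{A}}\|_F=\|\overline{\mathbf{F}}\|_{sp}. \tag{9}$$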

Definition 10

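(Tensor spectral norm [38]). The tensor spectral norm $\|\underline{A}\|_{sp}$ of any 3-way tensor $\underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$ is defined as the matrix spectral norm of $\overline{\mathbf{A}}$, i.e.,

$$\|\underline{A}\|_{sp}:=\|\overline{\mathbf{A}}\|_{sp}. \tag{10}$$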

We further define the tubal nuclear norm.

Definition 11

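(Tubal nuclear norm [2]). For any tensor $\underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$ with t-SVD $\underline{A}=\underline{U}*\underline{\Lambda}*\underline{V}^{\top}$, the tubal nuclear norm (TNN) of $\underline{A}$ is defined as:

$$\|\underline{A}\|_{TNN}:=\langle\underline{\Lambda},\underline{I}\rangle=\sum_{i=1}^{r}\underline{\Lambda}(i,i,1), \tag{11}$$

where $r=r_{tubal}(\underline{A})$.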

To understand the tubal nuclear norm, first note that:

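$$r_{tubal}(\underline{A})=\sum_{i}\mathbf{1}(\underline{\Lambda}(i,i,:)\neq\mathbf{0})\overset{(i)}{=}\sum_{i}\mathbf{1}(\widetilde{\underline{\Lambda}}(i,i,:)\neq\mathbf{0})\overset{(ii)}{=}\sum_{i}\mathbf{1}(\|\widetilde{\underline{\Lambda}}(i,i,:)\|_1\neq 0)\overset{(iii)}{=}\sum_{i}\mathbf{1}(\underline{\Lambda}(i,i,1)\neq 0), \tag{12}$$

where (i) holds because of the definition of the DFT [2], (ii) holds by the property of the $l_1$-norm, and (iii) is a result of the DFT [2]. Thus, the tubal rank of $\underline{A}$ is also the number of non-zero diagonal elements $\underline{\Lambda}(i,i,1)$ of the first frontal slice of the tensor $\underline{\Lambda}$ in the t-SVD of $\underline{A}$. Similar to the matrix singular values, the values $\underline{\Lambda}(i,i,1)$, $i=1,2,\ldots,\min\{n_1,n_2\}$, are also called the singular values of the tensor $\underline{A}$. As the matrix nuclear norm is the sum of matrix singular values, the tubal nuclear norm can be similarly understood as the sum of tensor singular values.

One can also verify by the property of the DFT [2] that:

$$\|\underline{A}\|_{TNN}=\sum_{i=1}^{r}\underline{\Lambda}(i,i,1)=\frac{1}{n_3}\sum_{k=1}^{n_3}\sum_{i=1}^{r}\widetilde{\underline{\Lambda}}(i,i,k)=\frac{1}{n_3}\sum_{k=1}^{n_3}\|\widetilde{\mathbf{A}}^{(k)}\|_{*}=\frac{1}{n_3}\|\overline{\mathbf{A}}\|_{*}, \tag{13}$$

which indicates that the TNN of $\underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$ is also the averaged nuclear norm of all frontal slices of $\widetilde{\underline{A}}$. Thus, TNN indeed models low-rankness in the Fourier domain.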
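
Equation (13) suggests a direct way to compute the tensor singular values, the tubal rank and the TNN from the singular values of the Fourier-domain frontal slices. The sketch below assumes NumPy; the function names and the tolerance `tol` are choices made here for illustration.

```python
import numpy as np

def fourier_singular_values(A):
    """Singular values of each Fourier frontal slice, shape (min(n1, n2), n3)."""
    A_hat = np.fft.fft(A, axis=2)
    return np.stack([np.linalg.svd(A_hat[:, :, k], compute_uv=False)
                     for k in range(A.shape[2])], axis=1)

def tensor_singular_values(A):
    """Lam(i, i, 1): the average over k of the i-th Fourier singular values."""
    return fourier_singular_values(A).mean(axis=1)

def tnn(A):
    """Tubal nuclear norm: averaged nuclear norm of the Fourier frontal slices."""
    return fourier_singular_values(A).sum() / A.shape[2]

def tubal_rank(A, tol=1e-8):
    """Number of tensor singular values above a small tolerance."""
    return int(np.sum(tensor_singular_values(A) > tol))

# Build a tensor of tubal rank 2 as the t-product of two thin random tensors.
rng = np.random.default_rng(4)
P = rng.standard_normal((6, 2, 4))
Q = rng.standard_normal((2, 5, 4))
L_hat = np.einsum('ikt,kjt->ijt', np.fft.fft(P, axis=2), np.fft.fft(Q, axis=2))
L = np.fft.ifft(L_hat, axis=2).real

rank_L = tubal_rank(L)
tnn_gap = abs(tnn(L) - tensor_singular_values(L).sum())

# With n3 = 1 the TNN reduces to the ordinary matrix nuclear norm.
M = rng.standard_normal((6, 5))
matrix_gap = abs(tnn(M[:, :, None]) - np.linalg.norm(M, 'nuc'))
```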

Now, we will show that the low-tubal-rank model is well suited to some real-world tensor data, such as color images and videos.

First, we consider a natural image of size 256×256×3, shown in Figure 2a. In Figure 2b, we plot the distribution of its singular values, i.e., the values of $\underline{\Lambda}(i,i,1)$ against the index $i$. As can be seen from Figure 2b, there are only a small number of singular values with large magnitude, and most of the singular values are close to 0. Thus, we can say that some natural color images are approximately low tubal rank.

Figure 2. The distribution of tensor singular values $\underline{\Lambda}(i,i,1)$ in a natural color image. (a) the sample image, (b) the distribution of $\underline{\Lambda}(i,i,1)$.

Then, consider a commonly used YUV sequence, Mother-daughter_qcif (these data can be downloaded from the following link: https://sites.google.com/site/subudhibadri/fewhelpfuldownloads), whose first frame is shown in Figure 3a. We use the Y components of the first 30 frames to get a tensor of size 144×176×30, and show the distribution of its tensor singular values in Figure 3b. We can see from Figure 3b that, similar to Figure 2b, there are only a small number of singular values with large magnitude, and most of the singular values are close to 0. Thus, we can say that some videos are approximately low tubal rank.

Figure 3. The distribution of tensor singular values $\underline{\Lambda}(i,i,1)$ in a video sequence. (a) the first frame of the video, (b) the distribution of $\underline{\Lambda}(i,i,1)$.

For TNN and tensor spectral norm, we highlight the following two lemmas.

Lemma 1.

[2] TNN is the convex envelope of the tubal average rank in the unit ball of the tensor spectral norm $\{\underline{T}\in\mathbb{R}^{n_1\times n_2\times n_3}\,|\,\|\underline{T}\|_{sp}\le 1\}$.

Lemma 2.

[2] The TNN and the tensor spectral norm are dual norms to each other.

2.2. Related Works

In this subsection, we briefly introduce some related works. The proposed STPCP is closely related to Tensor Robust Principal Component Analysis (TRPCA), which aims to recover a low-rank tensor $\underline{L}_0$ and a sparse tensor $\underline{S}_0$ from their sum $\underline{M}=\underline{L}_0+\underline{S}_0$. This is a special case of our measurement Model (1) where the noise tensor $\underline{E}_0$ is a zero tensor.

In [39], the SNN-based TRPCA model is proposed by modeling the underlying tensor as a low Tucker rank one:

$$\min_{\underline{L},\underline{S}}\ \|\underline{L}\|_{SNN}+\|\underline{S}\|_1\quad\text{s.t.}\quad\underline{L}+\underline{S}=\underline{M}, \tag{14}$$

where SNN (Sum of Nuclear Norms) is defined as $\|\underline{L}\|_{SNN}:=\sum_{k=1}^{K}\alpha_k\|\mathbf{L}_{(k)}\|_{*}$, where $\alpha_k>0$ and $\mathbf{L}_{(k)}$ is the mode-$k$ matricization of $\underline{L}$ [40].

Model (14) assumes the underlying tensor to be low Tucker rank, which can be too strong an assumption for some real tensor data. The TNN-based TRPCA model instead uses TNN to impose low-rankness on the final solution $\underline{L}$ as follows:

$$\min_{\underline{L},\underline{S}}\ \|\underline{L}\|_{TNN}+\lambda\|\underline{S}\|_1\quad\text{s.t.}\quad\underline{L}+\underline{S}=\underline{M}. \tag{15}$$

As shown in [2], when the underlying tensor $\underline{L}_0$ satisfies the tensor incoherence conditions, by solving Problem (15) one can exactly recover $\underline{L}_0$ and $\underline{S}_0$ with high probability with parameter $\lambda=1/\sqrt{\max\{n_1,n_2\}n_3}$.

When the noise tensor $\underline{E}_0$ is not zero, the robust tensor decomposition based on SNN is proposed in [36] as follows:

$$\min_{\underline{L},\underline{S}}\ \frac{1}{2}\|\underline{M}-\underline{L}-\underline{S}\|_F^2+\lambda_1\|\underline{L}\|_{SNN}+\lambda_2\|\underline{S}\|_1, \tag{16}$$

where $\lambda_1$ and $\lambda_2$ are positive regularization parameters. An upper bound on the estimation error of $\underline{L}$ and $\underline{S}$ is established in [36].

In [37], the TNN-based robust tensor decomposition (RTD) model is proposed as follows:

$$\min_{\underline{L},\underline{S}}\ \frac{1}{2}\|\underline{M}-\underline{L}-\underline{S}\|_F^2+\lambda_1\|\underline{L}\|_{TNN}+\lambda_2\|\underline{S}\|_1,\quad\text{s.t.}\quad\|\underline{L}\|_\infty\le\alpha, \tag{17}$$

where $\alpha$ is an upper estimate of the $l_\infty$-norm of the underlying tensor $\underline{L}_0$. An upper bound on the estimation error is also established. However, in the analysis of Model (17), the error bound does not vanish as the noise tensor $\underline{E}_0$ vanishes, which means that the analysis cannot guarantee exact recovery in the noiseless setting (a guarantee that is provided by the analysis of the TNN-based TRPCA Model (15) by Lu et al. [2]).

The Bayesian approach is also used for robust tensor recovery. CP decomposition under sparse corruption and small dense noise is considered in [41], where tensor rank estimation is achieved using a Bayesian approach. In [42], CP decomposition under missing values and small dense noise is considered, with rank estimation similar to [41]. A sparse Bayesian CP model is proposed in [43] to recover a tensor with missing values, outliers and noises. In [44], a fully Bayesian treatment is proposed to recover a low-tubal-rank tensor corrupted by both noises and outliers.

3. Theoretical Guarantee for Stable Tensor Principal Component Pursuit

In this section, we formulate the proposed STPCP model and give the main theoretical result which upper bounds the estimation error and guarantees exact recovery in the noiseless setting.

3.1. The Proposed STPCP

As for the measurement Model (1), we further assume that the noise tensor $\underline{E}_0$ has bounded energy measured in the F-norm, i.e., $\|\underline{E}_0\|_F\le\delta$. Please note that the limited energy assumption is very mild, since most signals are of limited energy.

To recover the low-rank tensor $\underline{L}_0$ and the sparse tensor $\underline{S}_0$, we propose the following optimization problem:

$$(\hat{\underline{L}},\hat{\underline{S}})=\operatorname*{argmin}_{\underline{L},\underline{S}}\ \|\underline{L}\|_{TNN}+\lambda\|\underline{S}\|_1,\quad\text{s.t.}\quad\|\underline{M}-\underline{L}-\underline{S}\|_F\le\delta, \tag{18}$$

where $\lambda$ is a positive parameter balancing the two regularizers. The motivation is to use TNN as a low-rank regularization term to exploit the low-dimensional structure in the signal tensor, whereas the tensor $l_1$-norm is used to impose sparsity on the corruption tensor (since we assume it to be sparse).

The relationship between Model (18) and existing models is discussed in Remark 1 and Remark 2.

Remark 1.

The following models can be seen as special cases of the proposed STPCP Model (18):

  • (I).

    When δ=0, i.e., in the noiseless case, the proposed model degenerates to the TRPCA Model (15) [2].

  • (II).
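    When $n_3=1$, the stable tensor PCP Model (18) degenerates to Stable Principal Component Pursuit (SPCP) [45], which aims to pursue the principal components modeled by a low-rank matrix $\mathbf{L}_0$ from its observation $\mathbf{M}$ corrupted by both noises $\mathbf{E}_0$ and sparse corruptions $\mathbf{S}_0$. The SPCP is formulated as follows:
    $$\min_{\mathbf{L},\mathbf{S}}\ \|\mathbf{L}\|_{*}+\lambda\|\mathbf{S}\|_1,\quad\text{s.t.}\quad\|\mathbf{M}-\mathbf{L}-\mathbf{S}\|_F\le\delta. \tag{19}$$
  • (III).
    When $n_3=1$ and $\delta=0$, the proposed STPCP further degenerates to Robust Principal Component Analysis (RPCA) [46], given as follows:
    $$\min_{\mathbf{L},\mathbf{S}}\ \|\mathbf{L}\|_{*}+\lambda\|\mathbf{S}\|_1,\quad\text{s.t.}\quad\mathbf{L}+\mathbf{S}=\mathbf{M}. \tag{20}$$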

Remark 2.

The differences between the proposed Model (18) and the TNN-based RTD Model (17) [37] are as follows. First, our model does not need an upper estimate of the $l_\infty$-norm of the underlying tensor. Second, our model constrains the data-fidelity term explicitly through the noise level $\delta$, whereas Model (17) includes it in the objective as a penalty term.

3.2. A Theorem for Stable Recovery

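To analyze the statistical performance of Model (18), we need to assume that the underlying low-rank tensor $\underline{L}_0$ is not sparse. Only under this assumption can $\underline{L}_0$ be identified from its mixture with the sparse $\underline{S}_0$. Such an assumption can be described by the tensor incoherence condition [2,47], which is used to provide identifiability for the low-rank $\underline{L}_0$.

Definition 12

(Tensor incoherence condition [2,47]). Given a 3-way tensor $\underline{T}\in\mathbb{R}^{n_1\times n_2\times n_3}$ with tubal rank $r$, suppose it has the skinny t-SVD $\underline{T}=\underline{U}*\underline{\Lambda}*\underline{V}^{\top}$, where $\underline{U}\in\mathbb{R}^{n_1\times r\times n_3}$ and $\underline{V}\in\mathbb{R}^{n_2\times r\times n_3}$ are orthogonal tensors, and $\underline{\Lambda}\in\mathbb{R}^{r\times r\times n_3}$ is an f-diagonal tensor. Then, $\underline{T}$ is said to satisfy the tensor incoherence condition (TIC) with parameter $\mu(\underline{T})$ if the following inequalities hold:

$$\max_{i\in[n_1]}\|\underline{U}^{\top}*\mathring{e}_i\|_F\le\sqrt{\frac{r\mu(\underline{T})}{n_1n_3}}, \tag{21}$$
$$\max_{j\in[n_2]}\|\underline{V}^{\top}*\mathring{e}_j\|_F\le\sqrt{\frac{r\mu(\underline{T})}{n_2n_3}}, \tag{22}$$
$$\|\underline{U}*\underline{V}^{\top}\|_\infty\le\sqrt{\frac{r\mu(\underline{T})}{n_1n_2n_3}}, \tag{23}$$

where $\mathring{e}_i\in\mathbb{R}^{n_1\times 1\times n_3}$ is a tensor column basis with only the $(i,1,1)$-th element being 1 and all the others being 0, and $\mathring{e}_j\in\mathbb{R}^{n_2\times 1\times n_3}$ is a tensor column basis with only the $(j,1,1)$-th element being 1 and all the others being 0.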

Assumption 1.

Suppose the true tensor $\underline{L}_0$ in the measurement Model (1) satisfies the tensor incoherence condition with parameter $\mu$.

Assumption 1 intrinsically ensures that the row bases and column bases of $\underline{L}_0$ do not align well with the canonical row and column bases. Thus, the low-rank $\underline{L}_0$ is not sparse, which avoids the ambiguity that the low-rank component in the measurement Model (1) could also be sparse.

We should also ensure that the sparse component in Model (1) is not low rank.

Assumption 2.

Assume the support $\Omega$ of $\underline{S}_0$ is drawn uniformly at random.

Now we can establish an upper bound on the estimation error of $\hat{\underline{L}}$ and $\hat{\underline{S}}$ in Problem (18).

Theorem 1

(An Upper Bound on the Estimation Error). Suppose $\underline{L}_0$ and $\underline{S}_0$ satisfy Assumption 1 and Assumption 2, respectively. If the tubal rank $r$ of $\underline{L}_0$ and the sparsity (i.e., the $l_0$-norm) $s$ of $\underline{S}_0$ are respectively upper bounded as follows:

$$r\le\frac{c_r\min\{n_1,n_2\}}{\mu\log^2(n_3\max\{n_1,n_2\})},\quad\text{and}\quad s\le c_s n_1n_2n_3, \tag{24}$$

where $c_r$ and $c_s$ are two sufficiently small numerical constants independent of the dimensions $n_1$, $n_2$ and $n_3$, then the estimator defined in Model (18) satisfies the following inequalities:

L_^L_0F1+1max{n1,n2}+8(1+22)min{n1,n2}n3δS_^S_0F1+max{n1,n2}+8(1+22)n1n2n3δ, (25)

with probability at least $1-c_1(n_3\max\{n_1,n_2\})^{-c_2}$ (over the choice of the support of $\underline{S}_0$), where $c_1$ and $c_2$ are positive constants independent of the dimensions $n_1$, $n_2$ and $n_3$.

The proof of Theorem 1 is given in the appendix. In Theorem 1, the estimation errors on $\underline{L}_0$ and $\underline{S}_0$ are bounded separately. The bounds indicate that the estimation error scales linearly with the noise level $\delta$, which is consistent with the result in [37].

Remark 3.

A significant improvement over [37] is that in the noiseless setting where $\underline{E}_0$ vanishes, our analysis provides an exact recovery guarantee for $\underline{L}_0$ and $\underline{S}_0$. This is because the tensor incoherence condition adopted in our analysis intrinsically ensures that the low-rank tensor $\underline{L}_0$ is not sparse and thus can be separated from the sparse corruption tensor, whereas the non-spiky condition adopted in [37] fails to provide identifiability in the measurement Model (1).

For Theorem 1, we also give the following remark.

Remark 4.

The error bounds established in Theorem 1 are consistent with the theoretical analysis for the special cases shown in Remark 1.

  • (I).

    When $\delta=0$, i.e., in the noiseless case, the error bounds in Theorem 1 vanish, which means that exact recovery of $\underline{L}_0$ and $\underline{S}_0$ is guaranteed. This result is consistent with the analysis in [2] for the TNN-based TRPCA Model (15).

  • (II).

    When n3=1, the error bound on the sparse component in Theorem 1 is consistent with the error bound shown in Equation (8) of [45]. The upper bound on error of the low-rank component in Theorem 1 is sharper than that given in Equation (8) of [45].

  • (III).

    When n3=1 and δ=0, the proposed STPCP has consistent theoretical guarantee with the analysis of RPCA [46].

4. Algorithms

In this section, we design two algorithms. The first algorithm is based on the framework of ADMM [48], which has been extensively used in convex optimization with good convergence behavior. However, ADMM requires full SVDs of large matrices in each iteration, which imposes a high computational burden in high-dimensional settings. Thus, the second algorithm is proposed to address this issue by using a factorization trick, which instead requires SVDs of much smaller matrices.

4.1. An ADMM Algorithm

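The proposed estimator (18) is equivalent to the following unconstrained problem:

$$\min_{\underline{L},\underline{S}}\ \frac{1}{2}\|\underline{L}+\underline{S}-\underline{M}\|_F^2+\gamma(\|\underline{L}\|_{TNN}+\lambda\|\underline{S}\|_1), \tag{26}$$

where $\gamma$ is a positive parameter balancing the data fidelity term and the regularization term.

Besides being corrupted by noises and outliers, the observed tensor $\underline{M}$ may also suffer from missing entries, which can be treated as outliers with known positions in many applications. Thus, it is more practical to consider the recovery of $\underline{L}_0$ against outliers $\underline{S}_0$, noises $\underline{E}_0$ and missing entries, as in the following measurement model:

$$\underline{M}=\underline{B}\odot(\underline{L}_0+\underline{S}_0+\underline{E}_0), \tag{27}$$

where the tensor $\underline{B}\in\mathbb{R}^{n_1\times n_2\times n_3}$ denotes the missing mask, with $\underline{B}_{ijk}=1$ if the $(i,j,k)$-th entry is observed and $\underline{B}_{ijk}=0$ otherwise, and ⊙ denotes element-wise multiplication. Taking missing entries into consideration, Model (26) can be further modified as:

$$\min_{\underline{L},\underline{S}}\ \frac{1}{2}\|\underline{B}\odot(\underline{L}+\underline{S}-\underline{M})\|_F^2+\gamma(\|\underline{L}\|_{TNN}+\lambda\|\underline{S}\|_1). \tag{28}$$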

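By adding auxiliary variables to Problem (28), we obtain:

$$\min_{\underline{K},\underline{L},\underline{R},\underline{S}}\ \frac{1}{2}\|\underline{B}\odot(\underline{L}+\underline{S}-\underline{M})\|_F^2+\gamma\|\underline{K}\|_{TNN}+\gamma\lambda\|\underline{R}\|_1\quad\text{s.t.}\quad\underline{K}=\underline{L},\ \underline{R}=\underline{S}. \tag{29}$$

The Augmented Lagrangian (AL) of Problem (29) is given as follows:

$$\mathcal{L}_\rho(\underline{L},\underline{S},\underline{K},\underline{R},\underline{Y}_1,\underline{Y}_2)=\frac{1}{2}\|\underline{B}\odot(\underline{L}+\underline{S}-\underline{M})\|_F^2+\gamma\|\underline{K}\|_{TNN}+\gamma\lambda\|\underline{R}\|_1+\langle\underline{Y}_1,\underline{K}-\underline{L}\rangle+\frac{\rho}{2}\|\underline{K}-\underline{L}\|_F^2+\langle\underline{Y}_2,\underline{R}-\underline{S}\rangle+\frac{\rho}{2}\|\underline{R}-\underline{S}\|_F^2, \tag{30}$$

where $\underline{Y}_1,\underline{Y}_2\in\mathbb{R}^{n_1\times n_2\times n_3}$ are Lagrangian multipliers and $\rho$ is a penalty parameter.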

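According to the strategy of ADMM, we update the primal variables $(\underline{L},\underline{S})$ and $(\underline{K},\underline{R})$ by alternating minimization of the AL in Equation (30) as follows:

  • Update $(\underline{L},\underline{S})$. We update $(\underline{L},\underline{S})$ by minimizing $\mathcal{L}_\rho$ with the other variables fixed as follows:
    $$(\underline{L}^{t+1},\underline{S}^{t+1})=\operatorname*{argmin}_{(\underline{L},\underline{S})}\mathcal{L}_\rho(\underline{L},\underline{S},\underline{K}^t,\underline{R}^t,\underline{Y}_1^t,\underline{Y}_2^t)=\operatorname*{argmin}_{(\underline{L},\underline{S})}\frac{1}{2}\|\underline{B}\odot(\underline{L}+\underline{S}-\underline{M})\|_F^2+\langle\underline{Y}_1^t,\underline{K}^t-\underline{L}\rangle+\frac{\rho}{2}\|\underline{K}^t-\underline{L}\|_F^2+\langle\underline{Y}_2^t,\underline{R}^t-\underline{S}\rangle+\frac{\rho}{2}\|\underline{R}^t-\underline{S}\|_F^2. \tag{31}$$
    Taking derivatives of the right-hand side of Equation (31) with respect to $\underline{L}$ and $\underline{S}$ respectively, and setting the results to zero, we obtain:
    $$\begin{aligned}\underline{B}\odot(\underline{L}^{t+1}+\underline{S}^{t+1})-\underline{B}\odot\underline{M}-\underline{Y}_1^t+\rho(\underline{L}^{t+1}-\underline{K}^t)&=\underline{0},\\ \underline{B}\odot(\underline{L}^{t+1}+\underline{S}^{t+1})-\underline{B}\odot\underline{M}-\underline{Y}_2^t+\rho(\underline{S}^{t+1}-\underline{R}^t)&=\underline{0}.\end{aligned} \tag{32}$$
    Solving this system of equations yields:
    $$\begin{aligned}\underline{L}^{t+1}&=\big(\rho(\underline{B}+\rho\underline{1})\odot\underline{K}^t+\rho\underline{B}\odot\underline{M}+(\underline{B}+\rho\underline{1})\odot\underline{Y}_1^t-\underline{B}\odot\underline{Y}_2^t-\rho\underline{B}\odot\underline{R}^t\big)\oslash\big(\rho(2\underline{B}+\rho\underline{1})\big),\\ \underline{S}^{t+1}&=\big(\rho(\underline{B}+\rho\underline{1})\odot\underline{R}^t+\rho\underline{B}\odot\underline{M}+(\underline{B}+\rho\underline{1})\odot\underline{Y}_2^t-\underline{B}\odot\underline{Y}_1^t-\rho\underline{B}\odot\underline{K}^t\big)\oslash\big(\rho(2\underline{B}+\rho\underline{1})\big),\end{aligned} \tag{33}$$
    where ⊘ denotes entry-wise division and $\underline{1}$ denotes the tensor whose entries are all 1.
  • Update $(\underline{K},\underline{R})$. We update $(\underline{K},\underline{R})$ by minimizing $\mathcal{L}_\rho$ with the other variables fixed as follows:
    $$(\underline{K}^{t+1},\underline{R}^{t+1})=\operatorname*{argmin}_{(\underline{K},\underline{R})}\mathcal{L}_\rho(\underline{L}^{t+1},\underline{S}^{t+1},\underline{K},\underline{R},\underline{Y}_1^t,\underline{Y}_2^t)=\operatorname*{argmin}_{(\underline{K},\underline{R})}\gamma\|\underline{K}\|_{TNN}+\gamma\lambda\|\underline{R}\|_1+\langle\underline{Y}_1^t,\underline{K}-\underline{L}^{t+1}\rangle+\frac{\rho}{2}\|\underline{K}-\underline{L}^{t+1}\|_F^2+\langle\underline{Y}_2^t,\underline{R}-\underline{S}^{t+1}\rangle+\frac{\rho}{2}\|\underline{R}-\underline{S}^{t+1}\|_F^2. \tag{34}$$
    Please note that Problem (34) is separable in $\underline{K}$ and $\underline{R}$ and can be solved as follows:
    $$\underline{K}^{t+1}=\operatorname*{argmin}_{\underline{K}}\gamma\|\underline{K}\|_{TNN}+\langle\underline{Y}_1^t,\underline{K}-\underline{L}^{t+1}\rangle+\frac{\rho}{2}\|\underline{K}-\underline{L}^{t+1}\|_F^2=\mathcal{S}_{\gamma\rho^{-1}}^{\|\cdot\|_{TNN}}\big(\underline{L}^{t+1}-\rho^{-1}\underline{Y}_1^t\big), \tag{35}$$
    and
    $$\underline{R}^{t+1}=\operatorname*{argmin}_{\underline{R}}\gamma\lambda\|\underline{R}\|_1+\langle\underline{Y}_2^t,\underline{R}-\underline{S}^{t+1}\rangle+\frac{\rho}{2}\|\underline{R}-\underline{S}^{t+1}\|_F^2=\mathcal{S}_{\gamma\lambda\rho^{-1}}^{\|\cdot\|_1}\big(\underline{S}^{t+1}-\rho^{-1}\underline{Y}_2^t\big), \tag{36}$$
    where $\mathcal{S}_{\tau}^{\|\cdot\|_{TNN}}(\cdot)$ is the proximity operator of TNN [5], and $\mathcal{S}_{\tau}^{\|\cdot\|_1}(\cdot)$ is the proximity operator of the tensor $l_1$-norm, given as follows [49]:
    $$\mathcal{S}_{\tau}^{\|\cdot\|_1}(\underline{A}):=\operatorname*{argmin}_{\underline{X}}\tau\|\underline{X}\|_1+\frac{1}{2}\|\underline{X}-\underline{A}\|_F^2=\operatorname{sign}(\underline{A})\odot\max\{|\underline{A}|-\tau,0\}.$$
    In [5], a closed-form expression of $\mathcal{S}_{\tau}^{\|\cdot\|_{TNN}}(\cdot)$ is given as follows:
    Lemma 3.
    (Proximity operator of TNN [5]) For any 3D tensor $\underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$ with reduced t-SVD $\underline{A}=\underline{U}*\underline{\Lambda}*\underline{V}^{\top}$, where $\underline{U}\in\mathbb{R}^{n_1\times r\times n_3}$ and $\underline{V}\in\mathbb{R}^{n_2\times r\times n_3}$ are orthogonal tensors and $\underline{\Lambda}\in\mathbb{R}^{r\times r\times n_3}$ is the f-diagonal tensor of singular tubes, the proximity operator $\mathcal{S}_{\tau}^{\|\cdot\|_{TNN}}(\underline{A})$ at $\underline{A}$ can be computed by:
    $$\mathcal{S}_{\tau}^{\|\cdot\|_{TNN}}(\underline{A}):=\operatorname*{argmin}_{\underline{X}}\tau\|\underline{X}\|_{TNN}+\frac{1}{2}\|\underline{X}-\underline{A}\|_F^2=\underline{U}*\operatorname{idft}_3\big(\max(\operatorname{dft}_3(\underline{\Lambda})-\tau,0)\big)*\underline{V}^{\top}.$$
  • Update $(\underline{Y}_1,\underline{Y}_2)$. The Lagrangian multipliers are updated by gradient ascent as follows:
    $$\underline{Y}_1^{t+1}=\underline{Y}_1^t+\rho(\underline{K}^{t+1}-\underline{L}^{t+1}),\qquad\underline{Y}_2^{t+1}=\underline{Y}_2^t+\rho(\underline{R}^{t+1}-\underline{S}^{t+1}). \tag{37}$$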
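
The two proximity operators used in the updates above can be sketched in a few lines. The code below is an illustration assuming NumPy, with function names chosen here; the TNN operator is implemented by soft-thresholding the singular values of every Fourier-domain frontal slice, which is one way to realize the closed form of Lemma 3.

```python
import numpy as np

def soft_threshold(A, tau):
    """Proximity operator of tau * l1-norm: entry-wise soft-thresholding."""
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def prox_tnn(A, tau):
    """Proximity operator of tau * TNN via slice-wise singular value thresholding."""
    A_hat = np.fft.fft(A, axis=2)
    X_hat = np.zeros_like(A_hat)
    for k in range(A.shape[2]):
        U, s, Vh = np.linalg.svd(A_hat[:, :, k], full_matrices=False)
        X_hat[:, :, k] = (U * np.maximum(s - tau, 0.0)) @ Vh
    return np.fft.ifft(X_hat, axis=2).real

def tnn(A):
    """Tubal nuclear norm: averaged nuclear norm of the Fourier frontal slices."""
    A_hat = np.fft.fft(A, axis=2)
    return sum(np.linalg.svd(A_hat[:, :, k], compute_uv=False).sum()
               for k in range(A.shape[2])) / A.shape[2]

def prox_objective(X, A, tau):
    """Objective minimized by the TNN proximity operator."""
    return tau * tnn(X) + 0.5 * np.sum((X - A) ** 2)

rng = np.random.default_rng(5)
A = rng.standard_normal((5, 4, 3))
tau = 0.7
X = prox_tnn(A, tau)

shrunk = soft_threshold(np.array([3.0, -0.5, -2.0]), 1.0)
identity_gap = np.max(np.abs(prox_tnn(A, 0.0) - A))     # tau = 0 changes nothing
f_prox = prox_objective(X, A, tau)
f_input = prox_objective(A, A, tau)
f_zero = prox_objective(np.zeros_like(A), A, tau)
```

Since the proximity operator returns the global minimizer of its objective, `f_prox` should not exceed the objective value at any other point, including the input `A` and the zero tensor.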

The algorithm is summarized in Algorithm 1. The convergence analysis of Algorithm 1 is established in Theorem 2.

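Algorithm 1 Solving Problem (29) using ADMM.
  • Input: The observed tensor $\underline{M}$, the parameters $\gamma,\lambda,\rho,\delta$.
  • 1: Initialize $t=0$, $\underline{L}^0=\underline{S}^0=\underline{K}^0=\underline{R}^0=\underline{Y}_1^0=\underline{Y}_2^0=\underline{0}\in\mathbb{R}^{n_1\times n_2\times n_3}$
  • 2: for $t=0,\ldots,T_{\max}$ do
  • 3: Update $(\underline{L}^{t+1},\underline{S}^{t+1})$ by Equation (33);
  • 4: Update $(\underline{K}^{t+1},\underline{R}^{t+1})$ by Equations (35)–(36);
  • 5: Update $(\underline{Y}_1^{t+1},\underline{Y}_2^{t+1})$ by Equation (37);
  • 6: Check the convergence criteria:
    • (i) convergence of variables: $\|\underline{A}^{t+1}-\underline{A}^t\|_\infty\le\delta,\ \forall\underline{A}\in\{\underline{L},\underline{S},\underline{K},\underline{R}\}$,
    • (ii) convergence of constraints: $\max\{\|\underline{K}^{t+1}-\underline{L}^{t+1}\|_\infty,\|\underline{R}^{t+1}-\underline{S}^{t+1}\|_\infty\}\le\delta$.
  • 7: end for
  • Output: $(\hat{\underline{L}},\hat{\underline{S}})=(\underline{L}^{t+1},\underline{S}^{t+1})$.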
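
A compact sketch of Algorithm 1 is given below, assuming NumPy. It follows Equations (33) and (35)–(37) step by step; the function names, the synthetic test data, the stopping tolerance `tol`, and the parameter values ($\gamma=1$, $\rho=1$, $\lambda=1/\sqrt{\max\{n_1,n_2\}n_3}$) are assumptions made here for illustration rather than settings reported by the authors.

```python
import numpy as np

def soft_threshold(A, tau):
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

def prox_tnn(A, tau):
    A_hat = np.fft.fft(A, axis=2)
    X_hat = np.zeros_like(A_hat)
    for k in range(A.shape[2]):
        U, s, Vh = np.linalg.svd(A_hat[:, :, k], full_matrices=False)
        X_hat[:, :, k] = (U * np.maximum(s - tau, 0.0)) @ Vh
    return np.fft.ifft(X_hat, axis=2).real

def tnn(A):
    A_hat = np.fft.fft(A, axis=2)
    return sum(np.linalg.svd(A_hat[:, :, k], compute_uv=False).sum()
               for k in range(A.shape[2])) / A.shape[2]

def objective(L, S, M, B, gamma, lam):
    """Objective of Problem (28)."""
    return 0.5 * np.sum((B * (L + S - M)) ** 2) + gamma * (tnn(L) + lam * np.abs(S).sum())

def stpcp_admm(M, B, gamma, lam, rho=1.0, tol=1e-6, max_iter=500):
    L = np.zeros_like(M)
    S, K, R, Y1, Y2 = (np.zeros_like(M) for _ in range(5))
    BM = B * M
    denom = rho * (2 * B + rho)
    for _ in range(max_iter):
        L_old, S_old, K_old, R_old = L, S, K, R
        # Equation (33): closed-form, entry-wise update of (L, S).
        L = (rho * (B + rho) * K + rho * BM + (B + rho) * Y1 - B * Y2 - rho * B * R) / denom
        S = (rho * (B + rho) * R + rho * BM + (B + rho) * Y2 - B * Y1 - rho * B * K) / denom
        # Equations (35)-(36): proximity operators of TNN and the l1-norm.
        K = prox_tnn(L - Y1 / rho, gamma / rho)
        R = soft_threshold(S - Y2 / rho, gamma * lam / rho)
        # Equation (37): multiplier update.
        Y1 = Y1 + rho * (K - L)
        Y2 = Y2 + rho * (R - S)
        change = max(np.abs(X - X_old).max() for X, X_old in
                     [(L, L_old), (S, S_old), (K, K_old), (R, R_old)])
        residual = max(np.abs(K - L).max(), np.abs(R - S).max())
        if change <= tol and residual <= tol:
            break
    return L, S

# Synthetic data: low-tubal-rank tensor + sparse outliers + small dense noise.
rng = np.random.default_rng(6)
n1, n2, n3, r = 20, 20, 5, 2
P_hat = np.fft.fft(rng.standard_normal((n1, r, n3)), axis=2)
Q_hat = np.fft.fft(rng.standard_normal((r, n2, n3)), axis=2)
L0 = np.fft.ifft(np.einsum('ikt,kjt->ijt', P_hat, Q_hat), axis=2).real
S0 = 10.0 * rng.choice([-1.0, 1.0], size=L0.shape) * (rng.random(L0.shape) < 0.05)
M = L0 + S0 + 0.01 * rng.standard_normal(L0.shape)
B = np.ones_like(M)                        # fully observed in this example
gamma, lam = 1.0, 1.0 / np.sqrt(max(n1, n2) * n3)

L_hat, S_hat = stpcp_admm(M, B, gamma, lam)
obj_final = objective(L_hat, S_hat, M, B, gamma, lam)
obj_zero = objective(np.zeros_like(M), np.zeros_like(M), M, B, gamma, lam)
rel_err = np.linalg.norm(L_hat - L0) / np.linalg.norm(L0)
```

Setting some entries of `B` to zero turns the same routine into a solver for the setting with missing observations in Model (27).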

Theorem 2

(Convergence of Algorithm 1). For any $\rho>0$, if the unaugmented Lagrangian $\mathcal{L}(\underline{L},\underline{S},\underline{K},\underline{R},\underline{Y}_1,\underline{Y}_2)$ has a saddle point, then the iterates $(\underline{L}^t,\underline{S}^t,\underline{K}^t,\underline{R}^t,\underline{Y}_1^t,\underline{Y}_2^t)$ in Algorithm 1 satisfy the residual convergence, objective convergence and dual variable convergence of Problem (29) as $t\to\infty$.

The proof of Theorem 2 is given in Appendix A.

In a single iteration of Algorithm 1, the main cost comes from updating $\underline{K}^{t+1}$, which involves computing the FFT, the IFFT and $n_3$ SVDs of $n_1\times n_2$ matrices [47]. Hence Algorithm 1 has per-iteration complexity of order $O\big(n_1n_2n_3(\min\{n_1,n_2\}+\log n_3)\big)$. Thus, if the total iteration number is $T$, then the total computational complexity is:

$$O\big(Tn_1n_2n_3(\min\{n_1,n_2\}+\log n_3)\big). \tag{38}$$

4.2. A Faster Algorithm

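To reduce the cost of computing TNN, which is the main cost of Algorithm 1, we give the following lemma, which indicates that TNN is orthogonally invariant.

Lemma 4.

Given a tensor $\underline{X}\in\mathbb{R}^{r\times r\times n_3}$, let $\underline{Q}\in\mathbb{R}^{n_1\times r\times n_3}$ be a semi-orthogonal tensor, i.e., $\underline{Q}^{\top}*\underline{Q}=\underline{I}\in\mathbb{R}^{r\times r\times n_3}$, with $r\le\min\{n_1,n_2\}$. Then, we have the following relationship:

$$\|\underline{Q}*\underline{X}\|_{TNN}=\|\underline{X}\|_{TNN}.$$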
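
Lemma 4 can be checked numerically. The sketch below assumes NumPy; the construction of a random semi-orthogonal tensor (a QR factorization of each Fourier frontal slice, with conjugate symmetry enforced so that the result is real) is an illustrative choice made here and is not taken from the paper.

```python
import numpy as np

def t_product(A, B):
    C_hat = np.einsum('ikt,kjt->ijt', np.fft.fft(A, axis=2), np.fft.fft(B, axis=2))
    return np.fft.ifft(C_hat, axis=2).real

def tnn(A):
    A_hat = np.fft.fft(A, axis=2)
    return sum(np.linalg.svd(A_hat[:, :, k], compute_uv=False).sum()
               for k in range(A.shape[2])) / A.shape[2]

def random_semi_orthogonal(n1, r, n3, rng):
    """Random real tensor Q of size n1 x r x n3 with Q^T * Q = I."""
    G_hat = np.fft.fft(rng.standard_normal((n1, r, n3)), axis=2)
    Q_hat = np.zeros_like(G_hat)
    for k in range(n3 // 2 + 1):
        Gk = G_hat[:, :, k]
        if k == 0 or 2 * k == n3:
            Gk = Gk.real                      # these slices are real-valued
        Q_hat[:, :, k], _ = np.linalg.qr(Gk)  # orthonormal columns per slice
    for k in range(n3 // 2 + 1, n3):          # enforce conjugate symmetry
        Q_hat[:, :, k] = Q_hat[:, :, n3 - k].conj()
    return np.fft.ifft(Q_hat, axis=2).real

rng = np.random.default_rng(7)
n1, r, n3 = 8, 3, 5
Q = random_semi_orthogonal(n1, r, n3, rng)
X = rng.standard_normal((r, r, n3))

# Q^T * Q = I holds slice-wise in the Fourier domain: Q_hat_k^H Q_hat_k = I_r.
Q_hat = np.fft.fft(Q, axis=2)
orth_gap = max(np.max(np.abs(Q_hat[:, :, k].conj().T @ Q_hat[:, :, k] - np.eye(r)))
               for k in range(n3))

tnn_gap = abs(tnn(t_product(Q, X)) - tnn(X))
```

The invariance follows because multiplying each Fourier slice of $\underline{X}$ by a matrix with orthonormal columns leaves its singular values unchanged.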

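The proof of Lemma 4 can be found in the appendix. Equipped with Lemma 4, we decompose the low-rank component in Problem (28) as follows:

$$\underline{L}=\underline{Q}*\underline{X},\quad\text{s.t.}\quad\underline{Q}^{\top}*\underline{Q}=\underline{I}_r,$$

where $\underline{I}_r\in\mathbb{R}^{r\times r\times n_3}$ is an identity tensor. A similar strategy has been used in low-rank matrix recovery from gross corruptions by [50]. Furthermore, we propose the following model for Problem (28):

$$\min_{\underline{Q},\underline{X},\underline{S}}\ \frac{1}{2}\|\underline{B}\odot(\underline{Q}*\underline{X}+\underline{S}-\underline{M})\|_F^2+\gamma(\|\underline{X}\|_{TNN}+\lambda\|\underline{S}\|_1)\quad\text{s.t.}\quad\underline{Q}^{\top}*\underline{Q}=\underline{I}_r, \tag{39}$$

where $r$ is an upper estimate of the tubal rank $r_{tubal}(\underline{L}_0)$ of the underlying tensor.

In contrast to Model (28), the proposed Model (39) is a non-convex optimization problem, which means that Model (39) may have many local minima. We establish a connection between the proposed Model (39) and Model (28) in the following theorem.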

Theorem 3

(Connection between Model (39) and Model (28)). Let $(\underline{Q}_\star,\underline{X}_\star,\underline{S}_\star)$ be a global optimal solution to Problem (39). Furthermore, let $(\underline{L}^\star,\underline{S}^\star)$ be the solution to Problem (28), with $r_{tubal}(\underline{L}^\star)\le r$, where $r$ is the initialized tubal rank. Then $(\underline{Q}_\star*\underline{X}_\star,\underline{S}_\star)$ is also an optimal solution to Problem (28).

The proof of Theorem 3 can be found in the appendix. Theorem 3 states that the global optimal point of the (non-convex) Model (39) coincides with a solution of the (convex) Model (28). It further indicates that the accuracy of Model (39) cannot exceed that of Model (28), which can be validated numerically in the experiment section.

To solve Model (39), we also use the ADMM framework.

First, by adding auxiliary variables, we have the following problem:

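$$\min_{\underline{L},\underline{S},\underline{R},\underline{Q},\underline{X}}\ \frac12\big\|\underline{B}\odot(\underline{L}+\underline{S}-\underline{M})\big\|_F^2+\gamma\big(\|\underline{X}\|_{\mathrm{TNN}}+\lambda\|\underline{R}\|_1\big)\quad \text{s.t.}\ \underline{Q}*\underline{X}=\underline{L};\ \underline{R}=\underline{S};\ \underline{Q}^{\top}*\underline{Q}=\underline{I}_r. \tag{40}$$

The augmented Lagrangian of Problem (40) is:

$$\mathcal{L}_\rho(\underline{L},\underline{S},\underline{R},\underline{Q},\underline{X})=\frac12\big\|\underline{B}\odot(\underline{L}+\underline{S}-\underline{M})\big\|_F^2+\gamma\big(\|\underline{X}\|_{\mathrm{TNN}}+\lambda\|\underline{R}\|_1\big)+\langle\underline{Y}_1,\underline{Q}*\underline{X}-\underline{L}\rangle+\frac{\rho}{2}\|\underline{Q}*\underline{X}-\underline{L}\|_F^2+\langle\underline{Y}_2,\underline{R}-\underline{S}\rangle+\frac{\rho}{2}\|\underline{R}-\underline{S}\|_F^2\quad \text{s.t.}\ \underline{Q}^{\top}*\underline{Q}=\underline{I}_r. \tag{41}$$

Following the strategy of ADMM, we update the primal variables $(\underline{L},\underline{S})$ and $(\underline{Q},\underline{X},\underline{R})$ by alternating minimization of the augmented Lagrangian in Problem (41) as follows: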

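  • Update $(\underline{L},\underline{S})$: We update $(\underline{L},\underline{S})$ by minimizing $\mathcal{L}_\rho$ with the other variables fixed as follows:
    $$(\underline{L}^{t+1},\underline{S}^{t+1})=\operatorname*{argmin}_{(\underline{L},\underline{S})}\mathcal{L}_\rho(\underline{L},\underline{S},\underline{Q}^{t},\underline{X}^{t},\underline{R}^{t},\underline{Y}_1^{t},\underline{Y}_2^{t})=\operatorname*{argmin}_{(\underline{L},\underline{S})}\frac12\big\|\underline{B}\odot(\underline{L}+\underline{S}-\underline{M})\big\|_F^2+\langle\underline{Y}_1^{t},\underline{Q}^{t}*\underline{X}^{t}-\underline{L}\rangle+\frac{\rho}{2}\|\underline{Q}^{t}*\underline{X}^{t}-\underline{L}\|_F^2+\langle\underline{Y}_2^{t},\underline{R}^{t}-\underline{S}\rangle+\frac{\rho}{2}\|\underline{R}^{t}-\underline{S}\|_F^2. \tag{42}$$
    Taking derivatives of the right-hand side with respect to $\underline{L}$ and $\underline{S}$ respectively, and setting the results to zero, we obtain:
    $$\begin{aligned}\underline{B}\odot(\underline{L}^{t+1}+\underline{S}^{t+1})-\underline{B}\odot\underline{M}-\underline{Y}_1^{t}+\rho(\underline{L}^{t+1}-\underline{Q}^{t}*\underline{X}^{t})&=\underline{0},\\ \underline{B}\odot(\underline{L}^{t+1}+\underline{S}^{t+1})-\underline{B}\odot\underline{M}-\underline{Y}_2^{t}+\rho(\underline{S}^{t+1}-\underline{R}^{t})&=\underline{0}.\end{aligned} \tag{43}$$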
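    Solving this pair of equations entry by entry (a $2\times2$ linear system per entry, with $\underline{1}$ the all-ones tensor and $\oslash$ denoting entry-wise division) yields:
    $$\begin{aligned}\underline{L}^{t+1}&=\big[(\underline{B}+\rho\underline{1})\odot(\rho\,\underline{Q}^{t}*\underline{X}^{t}+\underline{Y}_1^{t})-\underline{B}\odot(\rho\,\underline{R}^{t}+\underline{Y}_2^{t})+\rho\,\underline{B}\odot\underline{M}\big]\oslash\big[\rho(2\underline{B}+\rho\underline{1})\big],\\ \underline{S}^{t+1}&=\big[(\underline{B}+\rho\underline{1})\odot(\rho\,\underline{R}^{t}+\underline{Y}_2^{t})-\underline{B}\odot(\rho\,\underline{Q}^{t}*\underline{X}^{t}+\underline{Y}_1^{t})+\rho\,\underline{B}\odot\underline{M}\big]\oslash\big[\rho(2\underline{B}+\rho\underline{1})\big].\end{aligned} \tag{44}$$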
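  • Update $\underline{Q}$. We update $\underline{Q}$ by minimizing $\mathcal{L}_\rho$ with the other variables fixed as follows:
    $$\begin{aligned}\underline{Q}^{t+1}&=\operatorname*{argmin}_{\underline{Q}^{\top}*\underline{Q}=\underline{I}_r}\mathcal{L}_\rho(\underline{L}^{t+1},\underline{S}^{t+1},\underline{Q},\underline{X}^{t},\underline{R}^{t},\underline{Y}_1^{t},\underline{Y}_2^{t})=\operatorname*{argmin}_{\underline{Q}^{\top}*\underline{Q}=\underline{I}_r}\langle\underline{Y}_1^{t},\underline{Q}*\underline{X}^{t}-\underline{L}^{t+1}\rangle+\frac{\rho}{2}\|\underline{Q}*\underline{X}^{t}-\underline{L}^{t+1}\|_F^2\\&=\operatorname*{argmin}_{\underline{Q}^{\top}*\underline{Q}=\underline{I}_r}\frac{\rho}{2}\big\|\underline{Q}*\underline{X}^{t}-(\underline{L}^{t+1}-\rho^{-1}\underline{Y}_1^{t})\big\|_F^2=\mathfrak{P}\big((\underline{L}^{t+1}-\rho^{-1}\underline{Y}_1^{t})*(\underline{X}^{t})^{\top}\big),\end{aligned} \tag{45}$$
    where the operator $\mathfrak{P}(\cdot)$ is defined in Lemma 5 as follows.
    Lemma 5.
    ([51]) Given any tensors $\underline{A}\in\mathbb{R}^{r\times n_2\times n_3}$, $\underline{B}\in\mathbb{R}^{n_1\times n_2\times n_3}$, suppose the tensor $\underline{B}*\underline{A}^{\top}$ has the t-SVD $\underline{B}*\underline{A}^{\top}=\underline{U}*\underline{\Lambda}*\underline{V}^{\top}$, where $\underline{U}\in\mathbb{R}^{n_1\times r\times n_3}$ and $\underline{V}\in\mathbb{R}^{r\times r\times n_3}$. Then, the problem:
    $$\min_{\underline{Q}^{\top}*\underline{Q}=\underline{I}_r}\|\underline{Q}*\underline{A}-\underline{B}\|_F^2 \tag{46}$$
    has a closed-form solution:
    $$\underline{Q}=\mathfrak{P}(\underline{B}*\underline{A}^{\top}):=\underline{U}*\underline{V}^{\top}. \tag{47}$$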
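  • Update $(\underline{X},\underline{R})$: We update $(\underline{X},\underline{R})$ by minimizing $\mathcal{L}_\rho$ with the other variables fixed as follows:
    $$\min_{(\underline{X},\underline{R})}\mathcal{L}_\rho(\underline{L}^{t+1},\underline{S}^{t+1},\underline{Q}^{t+1},\underline{X},\underline{R},\underline{Y}_1^{t},\underline{Y}_2^{t})=\min_{(\underline{X},\underline{R})}\gamma\|\underline{X}\|_{\mathrm{TNN}}+\gamma\lambda\|\underline{R}\|_1+\langle\underline{Y}_1^{t},\underline{Q}^{t+1}*\underline{X}-\underline{L}^{t+1}\rangle+\frac{\rho}{2}\|\underline{Q}^{t+1}*\underline{X}-\underline{L}^{t+1}\|_F^2+\langle\underline{Y}_2^{t},\underline{R}-\underline{S}^{t+1}\rangle+\frac{\rho}{2}\|\underline{R}-\underline{S}^{t+1}\|_F^2. \tag{48}$$
    Please note that Problem (48) is separable in $\underline{X}$ and $\underline{R}$ and can be solved as follows:
    $$\begin{aligned}\underline{X}^{t+1}&=\operatorname*{argmin}_{\underline{X}}\gamma\|\underline{X}\|_{\mathrm{TNN}}+\langle\underline{Y}_1^{t},\underline{Q}^{t+1}*\underline{X}-\underline{L}^{t+1}\rangle+\frac{\rho}{2}\|\underline{Q}^{t+1}*\underline{X}-\underline{L}^{t+1}\|_F^2\\&=\operatorname*{argmin}_{\underline{X}}\gamma\|\underline{X}\|_{\mathrm{TNN}}+\frac{\rho}{2}\big\|\underline{Q}^{t+1}*\underline{X}-(\underline{L}^{t+1}-\rho^{-1}\underline{Y}_1^{t})\big\|_F^2\\&\overset{(i)}{=}\operatorname*{argmin}_{\underline{X}}\gamma\|\underline{X}\|_{\mathrm{TNN}}+\frac{\rho}{2}\big\|\underline{X}-(\underline{Q}^{t+1})^{\top}*(\underline{L}^{t+1}-\rho^{-1}\underline{Y}_1^{t})\big\|_F^2\\&=\mathfrak{S}^{\|\cdot\|_{\mathrm{TNN}}}_{\gamma\rho^{-1}}\big((\underline{Q}^{t+1})^{\top}*(\underline{L}^{t+1}-\rho^{-1}\underline{Y}_1^{t})\big),\end{aligned} \tag{49}$$
    and
    $$\underline{R}^{t+1}=\operatorname*{argmin}_{\underline{R}}\gamma\lambda\|\underline{R}\|_1+\langle\underline{Y}_2^{t},\underline{R}-\underline{S}^{t+1}\rangle+\frac{\rho}{2}\|\underline{R}-\underline{S}^{t+1}\|_F^2=\mathfrak{S}^{\|\cdot\|_1}_{\gamma\lambda\rho^{-1}}\big(\underline{S}^{t+1}-\rho^{-1}\underline{Y}_2^{t}\big). \tag{50}$$
    The equality (i) in Equation (49) holds because, according to $\underline{Q}^{\top}*\underline{Q}=\underline{I}$, the two quadratic terms differ only by a constant independent of $\underline{X}$ and therefore share the same minimizers:
    $$\begin{aligned}\operatorname*{argmin}_{\underline{X}}\|\underline{Q}*\underline{X}-\underline{Y}\|_F^2&=\operatorname*{argmin}_{\overline{X}}\frac{1}{n_3}\|\overline{Q}\cdot\overline{X}-\overline{Y}\|_F^2=\operatorname*{argmin}_{\overline{X}}\frac{1}{n_3}\|\overline{Y}\|_F^2-\frac{2}{n_3}\langle\overline{Q}\cdot\overline{X},\overline{Y}\rangle+\frac{1}{n_3}\|\overline{Q}\cdot\overline{X}\|_F^2\\&=\operatorname*{argmin}_{\overline{X}}\frac{1}{n_3}\|\overline{Y}\|_F^2-\frac{2}{n_3}\langle\overline{X},\overline{Q}^{H}\cdot\overline{Y}\rangle+\frac{1}{n_3}\|\overline{X}\|_F^2=\operatorname*{argmin}_{\overline{X}}\frac{1}{n_3}\|\overline{X}-\overline{Q}^{H}\cdot\overline{Y}\|_F^2=\operatorname*{argmin}_{\underline{X}}\|\underline{X}-\underline{Q}^{\top}*\underline{Y}\|_F^2.\end{aligned} \tag{51}$$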
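  • Update $(\underline{Y}_1,\underline{Y}_2)$. The Lagrangian multipliers are updated by gradient ascent as follows:
    $$\underline{Y}_1^{t+1}=\underline{Y}_1^{t}+\rho(\underline{Q}^{t+1}*\underline{X}^{t+1}-\underline{L}^{t+1}),\qquad \underline{Y}_2^{t+1}=\underline{Y}_2^{t}+\rho(\underline{R}^{t+1}-\underline{S}^{t+1}). \tag{52}$$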
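The stationarity system (43) decouples across entries into $2\times 2$ linear systems, so the $(\underline{L},\underline{S})$ update can be solved in closed form. The snippet below is a minimal sketch under stated assumptions rather than the authors' code: `update_L_S` is a hypothetical name, the mask is assumed binary, $\rho>0$, and `QX` stands for the already computed product $\underline{Q}^{t}*\underline{X}^{t}$. It solves the system by Cramer's rule and checks the residuals of (43) on random data.

```python
import numpy as np

def update_L_S(QX, R, Y1, Y2, M, B, rho):
    """Entry-wise solution of the two stationarity equations in (43).

    Per entry with mask value b, the system reads
        (b + rho) * L + b * S = rho * QX + Y1 + b * M
        b * L + (b + rho) * S = rho * R  + Y2 + b * M
    and is solved by Cramer's rule (determinant rho * (2b + rho) > 0).
    """
    a = rho * QX + Y1 + B * M
    c = rho * R + Y2 + B * M
    det = rho * (2.0 * B + rho)
    L = ((B + rho) * a - B * c) / det
    S = ((B + rho) * c - B * a) / det
    return L, S

# Random test data (shapes and values are illustrative only).
rng = np.random.default_rng(0)
shape = (4, 5, 3)
QX, R, Y1, Y2, M = (rng.standard_normal(shape) for _ in range(5))
B = (rng.random(shape) < 0.8).astype(float)   # binary observation mask
rho = 1.3

L, S = update_L_S(QX, R, Y1, Y2, M, B, rho)
res1 = B * (L + S) - B * M - Y1 + rho * (L - QX)
res2 = B * (L + S) - B * M - Y2 + rho * (S - R)
print(np.abs(res1).max(), np.abs(res2).max())  # both expected near machine precision
```

On unobserved entries the mask value is zero and the update reduces to $\underline{L}=\underline{Q}*\underline{X}+\rho^{-1}\underline{Y}_1$ and $\underline{S}=\underline{R}+\rho^{-1}\underline{Y}_2$, which the same formula covers.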

The algorithmic steps are summarized in Algorithm 2. The complexity analysis is given as follows.

In each iteration of Algorithm 2, the update of $\underline{L}$ requires the FFT/IFFT and $n_3$ multiplications of $n_1$-by-$r$ and $r$-by-$n_2$ matrices, which costs $O\big((n_1n_2+rn_1+rn_2)n_3\log n_3+rn_1n_2n_3\big)$; updating $\underline{S}$ costs $O(n_1n_2n_3)$; updating $\underline{Q}$ involves the FFT/IFFT and $n_3$ SVDs of $n_1$-by-$r$ matrices, which costs $O(rn_1n_3\log n_3+r^2n_1n_3)$; updating $\underline{X}$ involves the FFT/IFFT and $n_3$ SVDs of $r$-by-$n_2$ matrices, which costs $O(rn_2n_3\log n_3+r^2n_2n_3)$. Then, the per-iteration computational complexity of Algorithm 2 is dominated by:

$$O\big(\max\{n_1n_2n_3\log n_3,\ r^2(n_1+n_2)n_3\}\big).$$

Since the low-tubal-rank assumption $r\ll\min\{n_1,n_2\}$ is adopted in this paper, the per-iteration cost of Algorithm 2 is much lower than that of Algorithm 1.

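Algorithm 2 Solving Problem (40) using ADMM.
  • Input: The observed tensor $\underline{M}$, an upper estimate $r$ of $r_{\mathrm{tubal}}(\underline{L}_0)$, the parameters $\gamma,\lambda,\rho,\delta$.
  • 1: Initialize $t=0$, $\underline{L}^{0}=\underline{S}^{0}=\underline{R}^{0}=\underline{Y}_1^{0}=\underline{Y}_2^{0}=\underline{0}\in\mathbb{R}^{n_1\times n_2\times n_3}$, $\underline{Q}^{0}=\underline{0}\in\mathbb{R}^{n_1\times r\times n_3}$, $\underline{X}^{0}=\underline{0}\in\mathbb{R}^{r\times n_2\times n_3}$.
  • 2: for $t=0,\dots,T_{\max}$ do
  • 3: Update $(\underline{L}^{t+1},\underline{S}^{t+1})$ by Equation (42);
  • 4: Update $\underline{Q}^{t+1}$ by Equation (45);
  • 5: Update $(\underline{X}^{t+1},\underline{R}^{t+1})$ by Equations (49)–(50);
  • 6: Update $(\underline{Y}_1^{t+1},\underline{Y}_2^{t+1})$ by Equation (52);
  • 7: Check the convergence criteria:
    • (i) convergence of variables: $\|\underline{A}^{t+1}-\underline{A}^{t}\|_\infty\le\delta,\ \forall\,\underline{A}\in\{\underline{L},\underline{S},\underline{R},\underline{Q},\underline{X}\}$
    • (ii) convergence of constraints: $\max\{\|\underline{Q}^{t+1}*\underline{X}^{t+1}-\underline{L}^{t+1}\|_\infty,\ \|\underline{R}^{t+1}-\underline{S}^{t+1}\|_\infty\}\le\delta$.
  • 8: end for
  • Output: $(\hat{\underline{L}},\hat{\underline{S}})=(\underline{L}^{t+1},\underline{S}^{t+1})$.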
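Steps 4 and 5 of Algorithm 2 reduce to three standard building blocks, which can be sketched in NumPy as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, and the tensor singular value thresholding assumes that TNN is normalized as the average nuclear norm of the Fourier-domain frontal slices, so that the threshold is applied unchanged to every slice (a different normalization would rescale the threshold).

```python
import numpy as np

def update_Q(A, Bt):
    """Lemma 5: argmin over Q^T * Q = I of ||Q * A - Bt||_F^2.
    Per Fourier slice, Q_k = U V^H from the skinny SVD of Bt_k A_k^H."""
    Af = np.fft.fft(A, axis=2)
    Bf = np.fft.fft(Bt, axis=2)
    Qf = np.empty((Bt.shape[0], A.shape[0], A.shape[2]), dtype=complex)
    for k in range(A.shape[2]):
        U, _, Vh = np.linalg.svd(Bf[:, :, k] @ Af[:, :, k].conj().T,
                                 full_matrices=False)
        Qf[:, :, k] = U @ Vh
    return np.real(np.fft.ifft(Qf, axis=2))

def t_svt(Z, tau):
    """Proximal operator of tau * TNN at Z: soft-threshold the singular
    values of every Fourier-domain frontal slice (normalization ASSUMED)."""
    Zf = np.fft.fft(Z, axis=2)
    Xf = np.empty_like(Zf)
    for k in range(Z.shape[2]):
        U, s, Vh = np.linalg.svd(Zf[:, :, k], full_matrices=False)
        Xf[:, :, k] = (U * np.maximum(s - tau, 0.0)) @ Vh
    return np.real(np.fft.ifft(Xf, axis=2))

def soft_threshold(Z, tau):
    """Proximal operator of tau * l1-norm: entry-wise shrinkage."""
    return np.sign(Z) * np.maximum(np.abs(Z) - tau, 0.0)

# Illustrative call pattern for one iteration (random placeholders).
rng = np.random.default_rng(0)
r, n1, n2, n3, rho, gamma, lam = 3, 5, 6, 4, 1.0, 0.1, 0.2
X_t = rng.standard_normal((r, n2, n3))
target = rng.standard_normal((n1, n2, n3))      # plays the role of L - Y1 / rho
Q_next = update_Q(X_t, target)
Qt_target = np.real(np.fft.ifft(
    np.einsum("irk,ijk->rjk", np.fft.fft(Q_next, axis=2).conj(),
              np.fft.fft(target, axis=2)), axis=2))   # Q^T * target
X_next = t_svt(Qt_target, gamma / rho)
R_next = soft_threshold(rng.standard_normal((n1, n2, n3)), gamma * lam / rho)
```

Only SVDs of $n_1\times r$ and $r\times n_2$ slices appear here, which is where the saving over the full $n_1\times n_2$ SVDs of Algorithm 1 comes from.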

5. Experiments

5.1. Synthetic Data

We first verify the correctness of Theorem 1. Specifically, we check whether the following two statements indicated in Theorem 1 hold in experiments on synthetic data sets:

  • (I).

    (Exact recovery in the noiseless setting.) Our analysis guarantees that the underlying low-rank tensor $\underline{L}_0$ and sparse tensor $\underline{S}_0$ can be exactly recovered in the noiseless setting. This statement will be checked in Section 5.1.1.

  • (II).

    (Linear scaling of errors with the noise level.) In Theorem 1, the estimation errors on $\underline{L}_0$ and $\underline{S}_0$ scale linearly with the noise level $\delta$. This statement will be checked in Section 5.1.2.

Signal Generation. With a given tubal rank $r_0$, we first generate the underlying tensor $\underline{L}_0\in\mathbb{R}^{n_1\times n_2\times n_3}$ by $\underline{L}_0=\underline{A}*\underline{B}/n_3$, where the tensors $\underline{A}\in\mathbb{R}^{n_1\times r_0\times n_3}$ and $\underline{B}\in\mathbb{R}^{r_0\times n_2\times n_3}$ are generated with i.i.d. standard Gaussian elements. Then, the sparse corruption tensor $\underline{S}_0$ is generated by choosing its support uniformly at random. The non-zero elements of $\underline{S}_0$ are sampled i.i.d. from a distribution that will be specified afterwards. Furthermore, the noise tensor $\underline{E}_0$ is generated with entries sampled i.i.d. from $\mathcal{N}(0,\sigma^2)$ with $\sigma=c\|\underline{L}_0\|_F/\sqrt{n_1n_2n_3}$, where the constant $c$ controls the signal-to-noise ratio. Finally, the observed tensor $\underline{M}$ is formed by $\underline{M}=\underline{L}_0+\underline{S}_0+\underline{E}_0$.
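The signal generation procedure can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' Matlab code: the function names, the `sparsity` argument (fraction of corrupted entries) and the `outlier` switch are hypothetical.

```python
import numpy as np

def t_product(A, B):
    """t-product A * B via slice-wise products in the Fourier domain (mode 3)."""
    Cf = np.einsum("irk,rjk->ijk", np.fft.fft(A, axis=2), np.fft.fft(B, axis=2))
    return np.real(np.fft.ifft(Cf, axis=2))

def generate_stpcp_data(n1, n2, n3, r0, sparsity, c, rng, outlier="bernoulli"):
    """Return (L0, S0, E0, M) with M = L0 + S0 + E0."""
    # Low-tubal-rank component L0 = A * B / n3 with i.i.d. Gaussian factors.
    A = rng.standard_normal((n1, r0, n3))
    B = rng.standard_normal((r0, n2, n3))
    L0 = t_product(A, B) / n3
    # Sparse component: support chosen uniformly at random.
    N = n1 * n2 * n3
    idx = rng.choice(N, size=int(round(sparsity * N)), replace=False)
    if outlier == "bernoulli":
        values = rng.choice([-1.0, 1.0], size=idx.size)   # symmetric Bernoulli
    else:
        values = rng.standard_normal(idx.size)            # standard Gaussian
    S0 = np.zeros((n1, n2, n3))
    S0.flat[idx] = values
    # Dense noise with sigma = c * ||L0||_F / sqrt(n1 n2 n3).
    sigma = c * np.linalg.norm(L0) / np.sqrt(N)
    E0 = sigma * rng.standard_normal((n1, n2, n3))
    return L0, S0, E0, L0 + S0 + E0

# Small noiseless example (sizes are illustrative only).
L0, S0, E0, M = generate_stpcp_data(10, 12, 4, r0=2, sparsity=0.05, c=0.0,
                                    rng=np.random.default_rng(1))
```

Setting `c=0.0` gives the noiseless setting of Section 5.1.1, while `c>0` gives the noisy setting of Section 5.1.2.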

5.1.1. Exact Recovery in the Noiseless Setting

We first check Statement (I), i.e., exact recovery in the noiseless setting. Specifically, we show that Algorithm 1 and Algorithm 2 can exactly recover the underlying tensor $\underline{L}_0$ and the sparse corruption $\underline{S}_0$. We first test the recovery performance for different tensor sizes by setting $n=n_1=n_2\in\{100,160,200\}$ and $n_3=20$, with $(r_{\mathrm{tubal}}(\underline{L}_0),\|\underline{S}_0\|_0)=(0.05n,0.05n^2n_3)$. The non-zero elements of $\underline{S}_0$ are sampled i.i.d. from the symmetric Bernoulli distribution, i.e., each takes the value 1 or −1 with probability 1/2. The results are shown in Table 1. Both Algorithm 1 and Algorithm 2 attain a relative standard error (RSE) smaller than $10^{-5}$, so $\underline{L}_0$ and $\underline{S}_0$ can be regarded as exactly recovered. Algorithm 2 is also roughly twice as fast as Algorithm 1 in every setting.

Table 1.

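Performance of Algorithm 1 and Algorithm 2 in both accuracy and speed for different tensor sizes when the gross corruptions (outliers) are drawn from the symmetric Bernoulli distribution. Observation tensor $\underline{M}\in\mathbb{R}^{n\times n\times n_3}$, $n_3=20$, $r_{\mathrm{tubal}}(\underline{L}_0)=0.05n$, $\|\underline{S}_0\|_0=0.05n^2n_3$, noise level $c=0$, $r=\max\{2r_{\mathrm{tubal}}(\underline{L}_0),15\}$.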

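| $n$ | $r_{\mathrm{tubal}}(\underline{L}_0)$ | $\|\underline{S}_0\|_0$ | Method | $r_{\mathrm{tubal}}(\hat{\underline{L}})$ | $\|\hat{\underline{L}}-\underline{L}_0\|_F/\|\underline{L}_0\|_F$ | $\|\hat{\underline{S}}-\underline{S}_0\|_F/\|\underline{S}_0\|_F$ | time/s |
|---|---|---|---|---|---|---|---|
| 100 | 5 | 1×10⁴ | Algorithm 1 | 5 | 5.13×10⁻⁶ | 5.27×10⁻⁶ | 3.63 |
| 100 | 5 | 1×10⁴ | Algorithm 2 | 5 | 4.92×10⁻⁶ | 5.12×10⁻⁶ | 1.76 |
| 160 | 8 | 2.56×10⁴ | Algorithm 1 | 8 | 3.86×10⁻⁶ | 3.52×10⁻⁶ | 9.52 |
| 160 | 8 | 2.56×10⁴ | Algorithm 2 | 8 | 4.48×10⁻⁶ | 4.08×10⁻⁶ | 4.42 |
| 200 | 10 | 4×10⁴ | Algorithm 1 | 10 | 3.46×10⁻⁶ | 3.59×10⁻⁶ | 14.16 |
| 200 | 10 | 4×10⁴ | Algorithm 2 | 10 | 4.12×10⁻⁶ | 4.63×10⁻⁶ | 7.44 |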

We then test whether the recovery performance is affected by the distribution of the corruptions. This is done by sampling the non-zero elements of $\underline{S}_0$ i.i.d. from the standard Gaussian distribution. The experimental results are reported in Table 2. Both Algorithm 1 and Algorithm 2 again exactly recover the true $\underline{L}_0$ and $\underline{S}_0$, and Algorithm 2 runs at least twice as fast as Algorithm 1.

Table 2.

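Performance of Algorithm 1 and Algorithm 2 in both accuracy and speed for different tensor sizes when the gross corruptions (outliers) are drawn from the standard Gaussian distribution. Observation tensor $\underline{M}\in\mathbb{R}^{n\times n\times n_3}$, $n_3=20$, $r_{\mathrm{tubal}}(\underline{L}_0)=0.05n$, $\|\underline{S}_0\|_0=0.05n^2n_3$, noise level $c=0$, $r=\max\{2r_{\mathrm{tubal}}(\underline{L}_0),15\}$.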

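| $n$ | $r_{\mathrm{tubal}}(\underline{L}_0)$ | $\|\underline{S}_0\|_0$ | Method | $r_{\mathrm{tubal}}(\hat{\underline{L}})$ | $\|\hat{\underline{L}}-\underline{L}_0\|_F/\|\underline{L}_0\|_F$ | $\|\hat{\underline{S}}-\underline{S}_0\|_F/\|\underline{S}_0\|_F$ | time/s |
|---|---|---|---|---|---|---|---|
| 100 | 5 | 1×10⁴ | Algorithm 1 | 5 | 2.7×10⁻⁶ | 2.6×10⁻⁶ | 4.43 |
| 100 | 5 | 1×10⁴ | Algorithm 2 | 5 | 2.9×10⁻⁶ | 3.2×10⁻⁶ | 1.82 |
| 160 | 8 | 2.56×10⁴ | Algorithm 1 | 8 | 4.76×10⁻⁶ | 4.08×10⁻⁶ | 10.45 |
| 160 | 8 | 2.56×10⁴ | Algorithm 2 | 8 | 4.24×10⁻⁶ | 4.05×10⁻⁶ | 5.15 |
| 200 | 10 | 4×10⁴ | Algorithm 1 | 10 | 3.78×10⁻⁶ | 3.64×10⁻⁶ | 18.97 |
| 200 | 10 | 4×10⁴ | Algorithm 2 | 10 | 3.78×10⁻⁶ | 3.63×10⁻⁶ | 8.04 |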

We also conduct STPCP by Algorithm 1 and Algorithm 2 with missing entries. After generating $\underline{L}_0$, $\underline{S}_0$ and $\underline{E}_0$, we obtain the observation by Model (27). We choose the support of $\underline{B}$ uniformly at random with probability 0.8 and set the elements in the chosen support to 1. Thus, 20% of the entries are missing. The corrupted observation $\underline{M}$ is then formed by $\underline{M}=\underline{B}\odot(\underline{L}_0+\underline{S}_0+\underline{E}_0)$. The recovery results are shown in Table 3. We can see that the underlying low-rank tensor $\underline{L}_0$ can be exactly recovered, and the observed part of the corruption tensor, $\underline{B}\odot\underline{S}_0$, can also be exactly recovered (please note that it is impossible to recover the unobserved entries of a sparse tensor $\underline{S}_0$ [52]).

Table 3.

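Performance of Algorithm 1 and Algorithm 2 in both accuracy and speed for different tensor sizes when the gross corruptions (outliers) are drawn from the symmetric Bernoulli distribution. Observation tensor $\underline{M}\in\mathbb{R}^{n\times n\times n_3}$, $n_3=20$, $r_{\mathrm{tubal}}(\underline{L}_0)=0.05n$, $\|\underline{S}_0\|_0=0.05n^2n_3$, noise level $c=0$, $r=\max\{2r_{\mathrm{tubal}}(\underline{L}_0),15\}$, with 20% random missing entries.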

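| $n$ | $r_{\mathrm{tubal}}(\underline{L}_0)$ | $\|\underline{B}\odot\underline{S}_0\|_0$ | Method | $r_{\mathrm{tubal}}(\hat{\underline{L}})$ | $\|\hat{\underline{L}}-\underline{L}_0\|_F/\|\underline{L}_0\|_F$ | $\|\hat{\underline{S}}-\underline{B}\odot\underline{S}_0\|_F/\|\underline{B}\odot\underline{S}_0\|_F$ | time/s |
|---|---|---|---|---|---|---|---|
| 100 | 5 | 8×10³ | Algorithm 1 | 5 | 7.52×10⁻⁶ | 5.97×10⁻⁶ | 3.87 |
| 100 | 5 | 8×10³ | Algorithm 2 | 5 | 7.50×10⁻⁶ | 5.96×10⁻⁶ | 1.69 |
| 160 | 8 | 2.048×10⁴ | Algorithm 1 | 8 | 4.46×10⁻⁶ | 5.17×10⁻⁶ | 9.64 |
| 160 | 8 | 2.048×10⁴ | Algorithm 2 | 8 | 5.60×10⁻⁶ | 4.71×10⁻⁶ | 4.46 |
| 200 | 10 | 3.2×10⁴ | Algorithm 1 | 10 | 4.78×10⁻⁶ | 4.04×10⁻⁶ | 14.78 |
| 200 | 10 | 3.2×10⁴ | Algorithm 2 | 10 | 5.13×10⁻⁶ | 4.20×10⁻⁶ | 7.77 |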

5.1.2. Linear Scaling of Errors with the Noise Level

We then verify Statement (II), i.e., that the estimation errors scale linearly with the noise level. The estimation errors are measured using the mean-squared error (MSE):

$$\mathrm{MSE}(\hat{\underline{L}})=\frac{\|\hat{\underline{L}}-\underline{L}_0\|_F^2}{n_1n_2n_3},\qquad \mathrm{MSE}(\hat{\underline{S}})=\frac{\|\hat{\underline{S}}-\underline{S}_0\|_F^2}{n_1n_2n_3},$$

for the low-rank component $\underline{L}_0$ and the sparse component $\underline{S}_0$, respectively. We test tensors of 3 different sizes by choosing $n\in\{60,80,100\}$ and $n_3=20$. The tubal rank $r_{\mathrm{tubal}}(\underline{L}_0)$ of $\underline{L}_0$ and the sparsity $s$ of $\underline{S}_0$ are set as $(r_{\mathrm{tubal}}(\underline{L}_0),s)=(5,0.1n^2n_3)$. We vary the constant $c$ over $0.03,0.06,\dots,0.6$, which is proportional to the noise level $\delta$. We run the proposed Algorithm 1 for 50 trials and report the averaged MSEs. The MSEs of $\hat{\underline{L}}$ and $\hat{\underline{S}}$ versus $c^2$ are shown in sub-figures (a) and (b) of Figure 4. The estimation error scales linearly with the noise level, as Theorem 1 indicates. Since the results for $n=80$ and $n=100$ are quite similar to the case of $n=60$, they are omitted.

Figure 4.

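The MSEs of $\hat{\underline{L}}$ and $\hat{\underline{S}}$ versus $c^2$ for tensors of size $60\times60\times20$, where the tubal rank $r_{\mathrm{tubal}}(\underline{L}_0)$ of $\underline{L}_0$ and the sparsity $s$ of $\underline{S}_0$ are set as $(r_{\mathrm{tubal}}(\underline{L}_0),s)=(5,0.1n^2n_3)$. (a): MSE of $\hat{\underline{L}}$ vs. $c^2$. (b): MSE of $\hat{\underline{S}}$ vs. $c^2$.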

5.2. Real Data Sets

In this section, experiments on real data sets (color images and videos) are carried out to evaluate the effectiveness and efficiency of the proposed Algorithms 1 and 2. Besides noise and sparse corruptions, we also consider missing values, which makes the problem more challenging. The proposed algorithms are compared with the following typical models:

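  • (I).
    NN-I: tensor recovery based on matrix nuclear norms of frontal slices, formulated as follows:
    $$\min_{\underline{L},\underline{S}}\ \frac12\big\|\underline{B}\odot(\underline{M}-\underline{L}-\underline{S})\big\|_F^2+\gamma\sum_{k=1}^{n_3}\big(\|L^{(k)}\|_*+\lambda\|S^{(k)}\|_1\big). \tag{53}$$

    This model will be used for image restoration in Section 5.2.1. Please note that Model (53) is equivalent to parallel matrix recovery on each frontal slice.

  • (II).
    NN-II: tensor recovery based on the matrix nuclear norm, formulated as follows:
    $$\min_{\underline{L},\underline{S}}\ \frac12\big\|\underline{B}\odot(\underline{M}-\underline{L}-\underline{S})\big\|_F^2+\gamma\|L\|_*+\gamma\lambda\|\underline{S}\|_1, \tag{54}$$
    where $L=[l_1,l_2,\dots,l_{n_3}]\in\mathbb{R}^{n_1n_2\times n_3}$ with $l_k:=\mathrm{vec}(L^{(k)})\in\mathbb{R}^{n_1n_2}$ defined as the vectorization [40] of the frontal slice $L^{(k)}$, for all $k=1,2,\dots,n_3$. This model will be used for video restoration in Section 5.2.2.
  • (III).
    SNN: tensor recovery based on SNN, formulated as follows:
    $$\min_{\underline{L},\underline{S}}\ \frac12\big\|\underline{B}\odot(\underline{M}-\underline{L}-\underline{S})\big\|_F^2+\gamma\sum_{i=1}^{3}\alpha_i\|L_{(i)}\|_*+\gamma\|\underline{S}\|_1, \tag{55}$$
    where $L_{(i)}\in\mathbb{R}^{n_i\times\prod_{j\ne i}n_j}$ is the mode-$i$ matricization of the tensor $\underline{L}\in\mathbb{R}^{n_1\times n_2\times n_3}$, for all $i=1,2,3$.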

We solve the above Models (53)–(55) using ADMM, implemented by ourselves in Matlab. The effectiveness of the algorithms is measured by the Peak Signal-to-Noise Ratio (PSNR):

$$\mathrm{PSNR}:=10\log_{10}\frac{n_1n_2n_3\|\underline{L}_0\|_\infty^2}{\|\hat{\underline{L}}-\underline{L}_0\|_F^2}.$$

Please note that a larger PSNR value indicates higher quality of $\hat{\underline{L}}$.
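The PSNR computation can be sketched as follows. This is a minimal NumPy illustration with a hypothetical function name; it assumes the peak value in the numerator is the largest absolute entry of $\underline{L}_0$ (the infinity norm).

```python
import numpy as np

def psnr(L_hat, L0):
    """PSNR in dB: 10 * log10(N * max|L0|^2 / ||L_hat - L0||_F^2), N = number of entries."""
    peak_sq = np.max(np.abs(L0)) ** 2          # ASSUMED peak: largest absolute entry
    err_sq = np.sum((L_hat - L0) ** 2)         # squared Frobenius error
    return 10.0 * np.log10(L0.size * peak_sq / err_sq)

# Toy example: a constant tensor with a uniform error of 0.1 per entry.
L0 = np.ones((2, 2, 2))
value = psnr(L0 + 0.1, L0)   # ratio is 8 * 1 / (8 * 0.01) = 100, i.e. about 20 dB
```

A perfect reconstruction makes the denominator zero, so in practice a small guard or an explicit check is needed before taking the logarithm.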

5.2.1. Color Images

Color images are the most commonly used 3-way tensors. We test the twenty 256-by-256-by-3 color images which have been used in [37], and carry out robust tensor recovery with missing entries (see Figure 5). Following [37], for a color image $\underline{L}_0\in\mathbb{R}^{n\times n\times3}$, we choose the support of the corruption uniformly at random with ratio $\rho_s$ and fill in the values with i.i.d. symmetric Bernoulli variables to generate $\underline{S}_0$. The noise tensor $\underline{E}_0$ is generated with i.i.d. zero-mean Gaussian entries whose standard deviation is given by $\sigma=0.05\|\underline{L}_0\|_F/\sqrt{3n^2}$. Then, we form the binary observation mask $\underline{B}$ by choosing its support uniformly at random with ratio $\rho_{\mathrm{obs}}$. Finally, the partially observed corrupted tensor $\underline{M}=\underline{B}\odot(\underline{L}_0+\underline{S}_0+\underline{E}_0)$ is formed.

Figure 5.

The 20 color images used.

We consider two scenarios by setting $(\rho_{\mathrm{obs}},\rho_s)\in\{(0.9,0.1),(0.8,0.2)\}$. For NN-I (Model (53)), we set the regularization parameter $\lambda=1/\sqrt{n\rho_{\mathrm{obs}}}$ (suggested by [46]), and set the parameter $\gamma=\|\underline{E}_0\|_{sp}$, where $\|\underline{E}_0\|_{sp}$ is estimated as $6.5\sigma\sqrt{3\rho_{\mathrm{obs}}n\log(6n)}$ (suggested by [5]). For SNN, the parameters are chosen as $\gamma=0.05$, $\alpha_1=\alpha_2=\sqrt{3n\rho_{\mathrm{obs}}}$, $\alpha_3=0.01\sqrt{3n\rho_{\mathrm{obs}}}$. For Algorithm 1 and Algorithm 2, we set $\gamma=0.3\|\underline{E}_0\|_{sp}$ and $\lambda=1/\sqrt{3n\rho_{\mathrm{obs}}}$. The initialized rank $r$ in Algorithm 2 is set as 60. In each setting, we test each color image 10 times and report the averaged PSNR and time. For quantitative comparison, we show the PSNR values and running times in Figure 6 and Figure 7 for the settings $(\rho_{\mathrm{obs}},\rho_s)=(0.9,0.1)$ and $(\rho_{\mathrm{obs}},\rho_s)=(0.8,0.2)$, respectively. Several visual examples are shown in Figure 8 for qualitative comparison in the setting $(\rho_{\mathrm{obs}},\rho_s)=(0.8,0.2)$. We can see from Figure 6, Figure 7 and Figure 8 that the proposed Algorithm 1 has the highest recovery quality, while the proposed Algorithm 2 has the second highest quality and the shortest running time.

Figure 6.

The quantitative comparison in PSNR and time on color images. First, 10% of the entries of each image are corrupted by i.i.d. symmetric Bernoulli variables, then polluted by Gaussian noise of noise level $c=0.05$, and finally 10% of the corrupted entries are missing uniformly at random. (a): the PSNR values of each algorithm; (b): the running time of each algorithm.

Figure 7.

The quantitative comparison in PSNR and time on color images. First, 20% of the entries of each image are corrupted by i.i.d. symmetric Bernoulli variables, then polluted by Gaussian noise of noise level $c=0.05$, and finally 20% of the corrupted entries are missing uniformly at random. (a): the PSNR values of each algorithm; (b): the running time of each algorithm.

Figure 8.

The visual results for image recovery of different algorithms. First, 20% of the entries of each image are corrupted by i.i.d. symmetric Bernoulli variables, then polluted by Gaussian noise of noise level $c=0.05$, and finally 20% of the corrupted entries are missing uniformly at random. (a): the original image; (b): the corrupted image; (c): image recovered by Algorithm 1; (d): image recovered by Algorithm 2; (e): image recovered by the matrix nuclear norm (NN)-based Model (53); (f): image recovered by the SNN-based Model (55).

5.2.2. Videos

In this subsection, video restoration experiments are conducted on four widely used YUV videos: Akiyo_qcif, Scilent_qcif, Carphone_qcif, and Claire_qcif (they can be downloaded from https://sites.google.com/site/subudhibadri/fewhelpfuldownloads). Owing to computational limitations, we simply use the first 32 frames of the Y components of all the videos, which results in four 144-by-176-by-30 tensors. For a 3-way data tensor $\underline{L}_0\in\mathbb{R}^{n_1\times n_2\times n_3}$, to generate the corruption $\underline{S}_0$, the support is chosen uniformly at random with ratio $\rho_s$ and then the elements in the support are filled in with i.i.d. symmetric Bernoulli variables. The noise tensor $\underline{E}_0$ is also generated with i.i.d. zero-mean Gaussian entries whose standard deviation is given by $\sigma=0.05\|\underline{L}_0\|_F/\sqrt{n_1n_2n_3}$. Then, the binary observation mask $\underline{B}$ is formed by choosing its support uniformly at random with ratio $\rho_{\mathrm{obs}}$. Finally, the partially observed corrupted tensor $\underline{M}=\underline{B}\odot(\underline{L}_0+\underline{S}_0+\underline{E}_0)$ is formed.

We also consider two scenarios by setting $(\rho_{\mathrm{obs}},\rho_s)\in\{(0.9,0.1),(0.8,0.2)\}$. The NN-II Model (54) is used in video restoration. For NN-II, we set the regularization parameter $\lambda=1/\sqrt{n_1n_2\rho_{\mathrm{obs}}}$ (suggested by [46]), and set the parameter $\gamma=\|\underline{E}_0\|_{sp}$, where $\|\underline{E}_0\|_{sp}$ is estimated as $6.5\sigma\sqrt{\rho_{\mathrm{obs}}n_1n_3\log((n_1+n_2)n_3)}$ (suggested by [5]). For SNN, the parameters are chosen as $\gamma=0.05$, $\alpha_1=\alpha_2=\sqrt{n_1n_3\rho_{\mathrm{obs}}}$, $\alpha_3=5\sqrt{n_1n_3\rho_{\mathrm{obs}}}$. For Algorithm 1 and Algorithm 2, we set $\gamma=0.3\|\underline{E}_0\|_{sp}$ and $\lambda=1/\sqrt{\max\{n_1,n_2\}n_3\rho_{\mathrm{obs}}}$ after careful parameter tuning. The initialized rank $r$ in Algorithm 2 is set as 60. In each setting, we test each video 10 times and report the averaged PSNR and time. For quantitative comparison, we show the PSNR values and running times in Table 4. Algorithm 1 has the highest recovery quality, while the proposed Algorithm 2 has the second highest quality and the shortest running time.

Table 4.

PSNR values and running time (in seconds) of different algorithms on video data. First, $\rho_sn_1n_2n_3$ entries of each video are corrupted by i.i.d. symmetric Bernoulli variables, then polluted by Gaussian noise of noise level $c=0.05$, and finally $(1-\rho_{\mathrm{obs}})n_1n_2n_3$ of the corrupted entries are missing uniformly at random. In every setting, Algorithm 1 attains the highest PSNR value and Algorithm 2 the shortest running time.

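| Data Set | $(\rho_{\mathrm{obs}},\rho_s)$ | Index | NN-II, Model (54) | SNN, Model (55) | Algorithm 1 | Algorithm 2 |
|---|---|---|---|---|---|---|
| Akiyo | (0.9,0.1) | PSNR | 31.74 | 32.09 | 33.94 | 33.36 |
| Akiyo | (0.9,0.1) | time/s | 29.48 | 51.13 | 20.10 | 12.39 |
| Akiyo | (0.8,0.2) | PSNR | 30.59 | 30.70 | 32.44 | 32.07 |
| Akiyo | (0.8,0.2) | time/s | 30.65 | 51.17 | 19.53 | 14.92 |
| Silent | (0.9,0.1) | PSNR | 28.26 | 30.39 | 31.74 | 31.23 |
| Silent | (0.9,0.1) | time/s | 28.91 | 49.79 | 21.21 | 14.76 |
| Silent | (0.8,0.2) | PSNR | 26.95 | 27.60 | 30.42 | 30.07 |
| Silent | (0.8,0.2) | time/s | 36.51 | 60.81 | 22.43 | 15.62 |
| Carphone | (0.9,0.1) | PSNR | 26.87 | 28.79 | 29.15 | 28.94 |
| Carphone | (0.9,0.1) | time/s | 28.55 | 47.17 | 22.12 | 14.41 |
| Carphone | (0.8,0.2) | PSNR | 26.12 | 26.43 | 28.17 | 27.99 |
| Carphone | (0.8,0.2) | time/s | 26.72 | 49.21 | 20.55 | 14.74 |
| Claire | (0.9,0.1) | PSNR | 30.56 | 32.20 | 34.27 | 34.02 |
| Claire | (0.9,0.1) | time/s | 29.75 | 47.32 | 21.43 | 13.52 |
| Claire | (0.8,0.2) | PSNR | 29.94 | 30.43 | 32.96 | 32.78 |
| Claire | (0.8,0.2) | time/s | 29.43 | 50.46 | 19.47 | 13.04 |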

6. Conclusions

This paper studied the problem of stable tensor principal component pursuit, which aims to recover a tensor from observations contaminated by noise and sparse corruptions. We proposed a constrained tubal nuclear norm-based model and established upper bounds on the estimation error. In contrast to prior work [37], our theory can guarantee exact recovery in the noiseless setting. We also designed two algorithms: an ADMM algorithm, and a faster algorithm that accelerates it by adopting a factorization strategy. We validated the correctness of our theory by simulations on synthetic data, and evaluated the effectiveness and efficiency of the proposed algorithms via experiments on color images and videos.

For future directions, it is a natural and interesting extension to consider recovery of 4-way tensors [35] with arbitrary linear transformation [53,54]. It is also interesting to use tensor factorization-based methods [55,56] for STPCP. Another challenging future direction is developing tools to verify whether the unknown tensor satisfies the tensor incoherence condition from its incomplete or corrupted observations.

For extensions of the proposed approach to higher-way tensors, we suggest the following two ideas:

  1. By recursively applying DFT over successive modes higher than 3 and then unfolding the obtained tensor into 3-way [57], the proposed algorithms and theoretical analysis can be extended to higher-way tensors.

  2. By using the overlapped orientation invariant tubal nuclear norm [58], we can extend the proposed algorithm to higher-order cases and obtain orientation invariance.

Acknowledgments

We sincerely thank Andong Wang, who shared the codes of [37] and gave us some suggestions on the proofs.

Appendix A. Proofs of Lemmas and Theorems

Appendix A.1. The Proof of Theorem 1

Appendix A.1.1. Key Lemmas for the Proof of Theorem 1

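Before proving Theorem 1, we first define some notation and operators.

Suppose $\underline{L}_0\in\mathbb{R}^{n_1\times n_2\times n_3}$ with tubal rank $r$ has the skinny t-SVD $\underline{L}_0=\underline{U}*\underline{\Lambda}*\underline{V}^{\top}$, where $\underline{U}\in\mathbb{R}^{n_1\times r\times n_3}$, $\underline{V}\in\mathbb{R}^{n_2\times r\times n_3}$ are orthogonal tensors, and $\underline{\Lambda}\in\mathbb{R}^{r\times r\times n_3}$ is an f-diagonal tensor. Define the following set:

$$T:=\big\{\underline{U}*\underline{A}+\underline{B}*\underline{V}^{\top}\ \big|\ \underline{A}\in\mathbb{R}^{r\times n_2\times n_3},\ \underline{B}\in\mathbb{R}^{n_1\times r\times n_3}\big\}\subset\mathbb{R}^{n_1\times n_2\times n_3}. \tag{A1}$$

Then, define the projectors onto $T$ and its orthogonal complement $T^{\perp}$ for any tensor $\underline{T}\in\mathbb{R}^{n_1\times n_2\times n_3}$ as follows:

$$P_T(\underline{T}):=\underline{U}*\underline{U}^{\top}*\underline{T}+\underline{T}*\underline{V}*\underline{V}^{\top}-\underline{U}*\underline{U}^{\top}*\underline{T}*\underline{V}*\underline{V}^{\top},\qquad P_{T^{\perp}}(\underline{T}):=(\underline{I}-\underline{U}*\underline{U}^{\top})*\underline{T}*(\underline{I}-\underline{V}*\underline{V}^{\top}). \tag{A2}$$

Let $\Omega^{\perp}$ be the complement of $\Omega\subset[n_1]\times[n_2]\times[n_3]$, which is the support of $\underline{S}_0$. Then, with $\mathfrak{e}_{ijk}:=\mathring{e}_i*\dot{e}_k*\mathring{e}_j^{\top}$, define two operators $P_\Omega,P_{\Omega^{\perp}}$ as follows:

$$P_\Omega(\underline{T}):=\sum_{(i,j,k)\in\Omega}\langle\underline{T},\mathfrak{e}_{ijk}\rangle\,\mathfrak{e}_{ijk},\qquad P_{\Omega^{\perp}}(\underline{T}):=\sum_{(i,j,k)\in\Omega^{\perp}}\langle\underline{T},\mathfrak{e}_{ijk}\rangle\,\mathfrak{e}_{ijk}, \tag{A3}$$

for any $\underline{T}\in\mathbb{R}^{n_1\times n_2\times n_3}$.

Define two sets $\Gamma$ and $\Gamma^{\perp}$ as follows:

$$\Gamma=\{(\underline{A},\underline{A})\ |\ \underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}\},\qquad \Gamma^{\perp}=\{(\underline{A},-\underline{A})\ |\ \underline{A}\in\mathbb{R}^{n_1\times n_2\times n_3}\}. \tag{A4}$$

Then, for any tensors $\underline{X}_\iota,\underline{X}_s\in\mathbb{R}^{n_1\times n_2\times n_3}$, the projections of the pair $\underline{X}=(\underline{X}_\iota,\underline{X}_s)$ onto the sets $\Gamma$ and $\Gamma^{\perp}$ are given as follows, respectively:

$$P_\Gamma(\underline{X})=\Big(\frac{\underline{X}_\iota+\underline{X}_s}{2},\frac{\underline{X}_\iota+\underline{X}_s}{2}\Big),\qquad P_{\Gamma^{\perp}}(\underline{X})=\Big(\frac{\underline{X}_\iota-\underline{X}_s}{2},\frac{\underline{X}_s-\underline{X}_\iota}{2}\Big). \tag{A5}$$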
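The two projections in (A5) are simple averaging and differencing operations on the pair. The sketch below, with hypothetical function names, checks numerically that they form an orthogonal decomposition of the pair.

```python
import numpy as np

def proj_gamma(X_l, X_s):
    """Projection of the pair (X_l, X_s) onto Gamma = {(A, A)}."""
    avg = 0.5 * (X_l + X_s)
    return avg, avg

def proj_gamma_perp(X_l, X_s):
    """Projection of the pair (X_l, X_s) onto Gamma_perp = {(A, -A)}."""
    half = 0.5 * (X_l - X_s)
    return half, -half

rng = np.random.default_rng(0)
X_l, X_s = rng.standard_normal((2, 3, 4, 5))     # two random 3 x 4 x 5 tensors
G = proj_gamma(X_l, X_s)
Gp = proj_gamma_perp(X_l, X_s)

# The two pieces add up to the original pair and are mutually orthogonal.
recon_ok = np.allclose(G[0] + Gp[0], X_l) and np.allclose(G[1] + Gp[1], X_s)
inner = np.sum(G[0] * Gp[0]) + np.sum(G[1] * Gp[1])
```

This decomposition is what allows the error pair to be split into a part controlled directly by the data-fidelity constraint and a part controlled through the dual certificate.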

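For any tensors $\underline{X}_\iota,\underline{X}_s\in\mathbb{R}^{n_1\times n_2\times n_3}$, define two operators on $\underline{X}=(\underline{X}_\iota,\underline{X}_s)$ as follows:

$$(P_T\times P_\Omega)(\underline{X})=\big(P_T(\underline{X}_\iota),P_\Omega(\underline{X}_s)\big),\qquad (P_{T^{\perp}}\times P_{\Omega^{\perp}})(\underline{X})=\big(P_{T^{\perp}}(\underline{X}_\iota),P_{\Omega^{\perp}}(\underline{X}_s)\big). \tag{A6}$$

Also define two norms as follows:

$$\|\underline{X}\|_F=\sqrt{\|\underline{X}_\iota\|_F^2+\|\underline{X}_s\|_F^2},\qquad \|\underline{X}\|_{F,\mu}=\sqrt{\|\underline{X}_\iota\|_F^2+\mu^2\|\underline{X}_s\|_F^2}, \tag{A7}$$

where $\mu$ is a constant that will be determined afterwards.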

We first give Lemma A1 which can be seen as a modified version of Lemma C.1 in [2].

Lemma A1.

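Assume that $\|P_\Omega P_T\|\le\frac12$ and $\lambda\le\frac{1}{2\sqrt{n_3}}$. Suppose there exists a tensor $\underline{G}$ satisfying the following conditions:

$$P_T(\underline{G})=\underline{U}*\underline{V}^{\top},\quad \|P_{T^{\perp}}(\underline{G})\|_{sp}\le\frac12,\quad \|P_\Omega(\underline{G}-\lambda\,\mathrm{sign}(\underline{S}_0))\|_F\le\frac{\lambda}{4},\quad \|P_{\Omega^{\perp}}(\underline{G})\|_\infty\le\frac{\lambda}{2}. \tag{A8}$$

Then for any perturbation $\underline{\Delta}\in\mathbb{R}^{n_1\times n_2\times n_3}$, one has:

$$\|\underline{L}_0+\underline{\Delta}\|_{\mathrm{TNN}}+\lambda\|\underline{S}_0-\underline{\Delta}\|_1\ge\|\underline{L}_0\|_{\mathrm{TNN}}+\lambda\|\underline{S}_0\|_1+\Big(\frac34-\|P_{T^{\perp}}(\underline{G})\|_{sp}\Big)\|P_{T^{\perp}}(\underline{\Delta})\|_{\mathrm{TNN}}+\Big(\frac34\lambda-\|P_{\Omega^{\perp}}(\underline{G})\|_\infty\Big)\|P_{\Omega^{\perp}}(\underline{\Delta})\|_1. \tag{A9}$$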
Proof. 

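Let $\underline{G}_\iota\in\partial\|\underline{L}_0\|_{\mathrm{TNN}}$, i.e., any sub-gradient of $\|\cdot\|_{\mathrm{TNN}}$ at $\underline{L}_0$; then it satisfies:

$$P_T(\underline{G}_\iota)=\underline{U}*\underline{V}^{\top},\qquad \|P_{T^{\perp}}(\underline{G}_\iota)\|_{sp}\le1. \tag{A10}$$

Similarly, let $\underline{G}_s\in\partial(\lambda\|\underline{S}_0\|_1)$. According to the convexity of $\|\cdot\|_{\mathrm{TNN}}$ and $\|\cdot\|_1$, we have:

$$\|\underline{L}_0+\underline{\Delta}\|_{\mathrm{TNN}}\ge\|\underline{L}_0\|_{\mathrm{TNN}}+\langle\underline{G}_\iota,\underline{\Delta}\rangle,\qquad \lambda\|\underline{S}_0-\underline{\Delta}\|_1\ge\lambda\|\underline{S}_0\|_1-\langle\underline{G}_s,\underline{\Delta}\rangle. \tag{A11}$$

By choosing $\underline{G}_\iota=\underline{U}*\underline{V}^{\top}+\underline{P}*\underline{Q}^{\top}$, where $\underline{P}$ and $\underline{Q}$ come from the skinny t-SVD $P_{T^{\perp}}(\underline{\Delta})=\underline{P}*\underline{\Sigma}*\underline{Q}^{\top}$, one has:

$$\langle\underline{G}_\iota,\underline{\Delta}\rangle=\langle\underline{G},\underline{\Delta}\rangle+\langle\underline{G}_\iota-\underline{G},\underline{\Delta}\rangle=\langle\underline{G},\underline{\Delta}\rangle+\langle P_{T^{\perp}}(\underline{G}_\iota),P_{T^{\perp}}(\underline{\Delta})\rangle-\langle P_{T^{\perp}}(\underline{G}),P_{T^{\perp}}(\underline{\Delta})\rangle\ge\langle\underline{G},\underline{\Delta}\rangle+\big(1-\|P_{T^{\perp}}(\underline{G})\|_{sp}\big)\|P_{T^{\perp}}(\underline{\Delta})\|_{\mathrm{TNN}}. \tag{A12}$$

Also, by choosing $\underline{G}_s=\lambda\,\mathrm{sign}(\underline{S}_0)-\lambda\,\mathrm{sign}(P_{\Omega^{\perp}}(\underline{\Delta}))$, one has:

$$\begin{aligned}-\langle\underline{G}_s,\underline{\Delta}\rangle&=-\langle\underline{G},\underline{\Delta}\rangle-\langle\underline{G}_s-\underline{G},\underline{\Delta}\rangle\\&=-\langle\underline{G},\underline{\Delta}\rangle-\langle P_\Omega(\lambda\,\mathrm{sign}(\underline{S}_0)-\underline{G}),P_\Omega(\underline{\Delta})\rangle-\langle P_{\Omega^{\perp}}(\underline{G}_s),P_{\Omega^{\perp}}(\underline{\Delta})\rangle+\langle P_{\Omega^{\perp}}(\underline{G}),P_{\Omega^{\perp}}(\underline{\Delta})\rangle\\&\ge-\langle\underline{G},\underline{\Delta}\rangle-\|P_\Omega(\lambda\,\mathrm{sign}(\underline{S}_0)-\underline{G})\|_F\|P_\Omega(\underline{\Delta})\|_F+\lambda\|P_{\Omega^{\perp}}(\underline{\Delta})\|_1-\|P_{\Omega^{\perp}}(\underline{G})\|_\infty\|P_{\Omega^{\perp}}(\underline{\Delta})\|_1\\&\ge-\langle\underline{G},\underline{\Delta}\rangle-\frac{\lambda}{4}\|P_\Omega(\underline{\Delta})\|_F+\big(\lambda-\|P_{\Omega^{\perp}}(\underline{G})\|_\infty\big)\|P_{\Omega^{\perp}}(\underline{\Delta})\|_1.\end{aligned} \tag{A13}$$

Also note that:

$$\begin{aligned}\|P_\Omega(\underline{\Delta})\|_F&\le\|P_\Omega P_T(\underline{\Delta})\|_F+\|P_\Omega P_{T^{\perp}}(\underline{\Delta})\|_F\le\frac12\|\underline{\Delta}\|_F+\|P_\Omega P_{T^{\perp}}(\underline{\Delta})\|_F\\&\le\frac12\|P_\Omega(\underline{\Delta})\|_F+\frac12\|P_{\Omega^{\perp}}(\underline{\Delta})\|_F+\|P_\Omega P_{T^{\perp}}(\underline{\Delta})\|_F,\end{aligned} \tag{A14}$$

which leads to:

$$\|P_\Omega(\underline{\Delta})\|_F\le\|P_{\Omega^{\perp}}(\underline{\Delta})\|_F+2\|P_\Omega P_{T^{\perp}}(\underline{\Delta})\|_F\le\|P_{\Omega^{\perp}}(\underline{\Delta})\|_1+2\sqrt{n_3}\,\|P_{T^{\perp}}(\underline{\Delta})\|_{\mathrm{TNN}}. \tag{A15}$$

Putting things together, we have:

$$\|\underline{L}_0+\underline{\Delta}\|_{\mathrm{TNN}}+\lambda\|\underline{S}_0-\underline{\Delta}\|_1-\big(\|\underline{L}_0\|_{\mathrm{TNN}}+\lambda\|\underline{S}_0\|_1\big)\ge\Big(1-\frac{\lambda\sqrt{n_3}}{2}-\|P_{T^{\perp}}(\underline{G})\|_{sp}\Big)\|P_{T^{\perp}}(\underline{\Delta})\|_{\mathrm{TNN}}+\Big(\frac34\lambda-\|P_{\Omega^{\perp}}(\underline{G})\|_\infty\Big)\|P_{\Omega^{\perp}}(\underline{\Delta})\|_1. \tag{A16}$$

Since $\lambda\le\frac{1}{2\sqrt{n_3}}$, it holds that:

$$\|\underline{L}_0+\underline{\Delta}\|_{\mathrm{TNN}}+\lambda\|\underline{S}_0-\underline{\Delta}\|_1\ge\|\underline{L}_0\|_{\mathrm{TNN}}+\lambda\|\underline{S}_0\|_1+\Big(\frac34-\|P_{T^{\perp}}(\underline{G})\|_{sp}\Big)\|P_{T^{\perp}}(\underline{\Delta})\|_{\mathrm{TNN}}+\Big(\frac34\lambda-\|P_{\Omega^{\perp}}(\underline{G})\|_\infty\Big)\|P_{\Omega^{\perp}}(\underline{\Delta})\|_1,$$

for any perturbation $\underline{\Delta}\in\mathbb{R}^{n_1\times n_2\times n_3}$. □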

Lemma A2.

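Suppose that $\|P_\Omega P_T\|\le1/2$; then for any $\underline{X}=(\underline{X}_\iota,\underline{X}_s)$, we have:

$$\|P_\Gamma(P_T\times P_\Omega)(\underline{X})\|_{F,\mu}^2\ge\frac{1+\mu^2}{8}\|(P_T\times P_\Omega)(\underline{X})\|_F^2. \tag{A17}$$
Proof. 

According to the definitions of $P_\Gamma$ and $P_T\times P_\Omega$, we have:

$$P_\Gamma(P_T\times P_\Omega)(\underline{X})=\Big(\frac{P_T(\underline{X}_\iota)+P_\Omega(\underline{X}_s)}{2},\frac{P_T(\underline{X}_\iota)+P_\Omega(\underline{X}_s)}{2}\Big). \tag{A18}$$

Then, we have:

$$\begin{aligned}\|P_\Gamma(P_T\times P_\Omega)(\underline{X})\|_{F,\mu}^2&=(1+\mu^2)\cdot\frac14\cdot\Big(\|P_T(\underline{X}_\iota)\|_F^2+\|P_\Omega(\underline{X}_s)\|_F^2+2\langle P_T(\underline{X}_\iota),P_\Omega(\underline{X}_s)\rangle\Big)\\&=\frac{1+\mu^2}{4}\Big(\|P_T(\underline{X}_\iota)\|_F^2+\|P_\Omega(\underline{X}_s)\|_F^2+2\langle P_\Omega P_TP_T(\underline{X}_\iota),P_\Omega(\underline{X}_s)\rangle\Big)\\&\ge\frac{1+\mu^2}{4}\Big(\|P_T(\underline{X}_\iota)\|_F^2+\|P_\Omega(\underline{X}_s)\|_F^2-2\|P_\Omega P_TP_T(\underline{X}_\iota)\|_F\|P_\Omega(\underline{X}_s)\|_F\Big)\\&\ge\frac{1+\mu^2}{4}\Big(\|P_T(\underline{X}_\iota)\|_F^2+\|P_\Omega(\underline{X}_s)\|_F^2-2\cdot\frac12\|P_T(\underline{X}_\iota)\|_F\|P_\Omega(\underline{X}_s)\|_F\Big)\\&\ge\frac{1+\mu^2}{4}\Big(\|P_T(\underline{X}_\iota)\|_F^2+\|P_\Omega(\underline{X}_s)\|_F^2-\frac{\|P_T(\underline{X}_\iota)\|_F^2+\|P_\Omega(\underline{X}_s)\|_F^2}{2}\Big)=\frac{1+\mu^2}{8}\|(P_T\times P_\Omega)(\underline{X})\|_F^2.\end{aligned} \tag{A19}$$

This completes the proof. □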

Appendix A.1.2. Proof of Theorem 1

Proof. 

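For $\underline{X}=(\underline{L},\underline{S})$, define $\|\underline{X}\|_\diamond=\|\underline{L}\|_{\mathrm{TNN}}+\lambda\|\underline{S}\|_1$. Let $\hat{\underline{X}}=(\hat{\underline{L}},\hat{\underline{S}})$ and $\underline{X}^{*}=(\underline{L}_0,\underline{S}_0)$. According to the optimality of $(\hat{\underline{L}},\hat{\underline{S}})$ and the feasibility of $(\underline{L}_0,\underline{S}_0)$, we directly have:

$$\|\hat{\underline{X}}\|_\diamond\le\|\underline{X}^{*}\|_\diamond, \tag{A20}$$
$$\|\hat{\underline{L}}+\hat{\underline{S}}-\underline{M}\|_F\le\delta, \tag{A21}$$
$$\|\underline{L}_0+\underline{S}_0-\underline{M}\|_F\le\delta. \tag{A22}$$

Let $\underline{\Delta}_\iota=\hat{\underline{L}}-\underline{L}_0$, $\underline{\Delta}_s=\hat{\underline{S}}-\underline{S}_0$. Then, we have:

$$\|\underline{\Delta}_\iota+\underline{\Delta}_s\|_F=\|\hat{\underline{L}}+\hat{\underline{S}}-\underline{M}-(\underline{L}_0+\underline{S}_0-\underline{M})\|_F\le\|\hat{\underline{L}}+\hat{\underline{S}}-\underline{M}\|_F+\|\underline{L}_0+\underline{S}_0-\underline{M}\|_F\le2\delta. \tag{A23}$$

Define the pair of error tensors $\underline{\Delta}=\hat{\underline{X}}-\underline{X}^{*}=(\underline{\Delta}_\iota,\underline{\Delta}_s)$. The goal is to bound $\|\underline{\Delta}\|_{F,\mu}$.

First, we use the decomposition $\underline{\Delta}=P_\Gamma(\underline{\Delta})+P_{\Gamma^{\perp}}(\underline{\Delta})$, and write $\underline{\Delta}^{\Gamma}=P_\Gamma(\underline{\Delta})=(\underline{\Delta}_\iota^{\Gamma},\underline{\Delta}_s^{\Gamma})=\big(\frac{\underline{\Delta}_\iota+\underline{\Delta}_s}{2},\frac{\underline{\Delta}_\iota+\underline{\Delta}_s}{2}\big)$ and $\underline{\Delta}^{\Gamma^{\perp}}=P_{\Gamma^{\perp}}(\underline{\Delta})=(\underline{\Delta}_\iota^{\Gamma^{\perp}},\underline{\Delta}_s^{\Gamma^{\perp}})=\big(\frac{\underline{\Delta}_\iota-\underline{\Delta}_s}{2},\frac{\underline{\Delta}_s-\underline{\Delta}_\iota}{2}\big)$ for simplicity. Then, we have:

$$\|\underline{\Delta}\|_{F,\mu}=\|\underline{\Delta}^{\Gamma}+\underline{\Delta}^{\Gamma^{\perp}}\|_{F,\mu}\le\|\underline{\Delta}^{\Gamma}\|_{F,\mu}+\|\underline{\Delta}^{\Gamma^{\perp}}\|_{F,\mu}. \tag{A24}$$

Please note that $\underline{\Delta}_\iota^{\Gamma}=\underline{\Delta}_s^{\Gamma}=\frac{\underline{\Delta}_\iota+\underline{\Delta}_s}{2}$, thus $\|\underline{\Delta}^{\Gamma}\|_{F,\mu}$ can be bounded easily as follows:

$$\|\underline{\Delta}^{\Gamma}\|_{F,\mu}=\sqrt{\|\underline{\Delta}_\iota^{\Gamma}\|_F^2+\mu^2\|\underline{\Delta}_s^{\Gamma}\|_F^2}=\frac{\sqrt{1+\mu^2}}{2}\|\underline{\Delta}_\iota+\underline{\Delta}_s\|_F\le\delta\sqrt{1+\mu^2}. \tag{A25}$$

Then, it remains to bound $\|\underline{\Delta}^{\Gamma^{\perp}}\|_{F,\mu}$. By the triangle inequality:

$$\|\underline{\Delta}^{\Gamma^{\perp}}\|_{F,\mu}\le\|(P_T\times P_\Omega)\underline{\Delta}^{\Gamma^{\perp}}\|_{F,\mu}+\|(P_{T^{\perp}}\times P_{\Omega^{\perp}})\underline{\Delta}^{\Gamma^{\perp}}\|_{F,\mu}. \tag{A26}$$

(A) Bound $\|(P_{T^{\perp}}\times P_{\Omega^{\perp}})\underline{\Delta}^{\Gamma^{\perp}}\|_{F,\mu}$. By the triangle inequality for $\|\cdot\|_\diamond$, we have:

$$\|\underline{X}^{*}+\underline{\Delta}\|_\diamond=\|\underline{X}^{*}+\underline{\Delta}^{\Gamma}+\underline{\Delta}^{\Gamma^{\perp}}\|_\diamond\ge\|\underline{X}^{*}+\underline{\Delta}^{\Gamma^{\perp}}\|_\diamond-\|\underline{\Delta}^{\Gamma}\|_\diamond. \tag{A27}$$

Using Lemma A1, we have:

$$\|\underline{X}^{*}+\underline{\Delta}^{\Gamma^{\perp}}\|_\diamond\ge\|\underline{X}^{*}\|_\diamond+\Big(\frac34-\|P_{T^{\perp}}(\underline{G})\|_{sp}\Big)\|P_{T^{\perp}}(\underline{\Delta}_\iota^{\Gamma^{\perp}})\|_{\mathrm{TNN}}+\Big(\frac34\lambda-\|P_{\Omega^{\perp}}(\underline{G})\|_\infty\Big)\|P_{\Omega^{\perp}}(\underline{\Delta}_s^{\Gamma^{\perp}})\|_1\ge\|\underline{X}^{*}\|_\diamond+\frac14\|(P_{T^{\perp}}\times P_{\Omega^{\perp}})\underline{\Delta}^{\Gamma^{\perp}}\|_\diamond. \tag{A28}$$

Combining Equations (A20), (A27) and (A28), we have:

$$\|\underline{\Delta}^{\Gamma}\|_\diamond\ge\frac14\|(P_{T^{\perp}}\times P_{\Omega^{\perp}})\underline{\Delta}^{\Gamma^{\perp}}\|_\diamond. \tag{A29}$$

Then, with $\mu=\sqrt{n_3}\lambda$, we reach a bound on $\|(P_{T^{\perp}}\times P_{\Omega^{\perp}})\underline{\Delta}^{\Gamma^{\perp}}\|_{F,\mu}$ as follows:

$$\begin{aligned}\|(P_{T^{\perp}}\times P_{\Omega^{\perp}})\underline{\Delta}^{\Gamma^{\perp}}\|_{F,\mu}&\le\|P_{T^{\perp}}(\underline{\Delta}_\iota^{\Gamma^{\perp}})\|_F+\mu\|P_{\Omega^{\perp}}(\underline{\Delta}_s^{\Gamma^{\perp}})\|_F\le\sqrt{n_3}\|P_{T^{\perp}}(\underline{\Delta}_\iota^{\Gamma^{\perp}})\|_{\mathrm{TNN}}+\mu\|P_{\Omega^{\perp}}(\underline{\Delta}_s^{\Gamma^{\perp}})\|_1\\&\le\sqrt{n_3}\Big(\|P_{T^{\perp}}(\underline{\Delta}_\iota^{\Gamma^{\perp}})\|_{\mathrm{TNN}}+\lambda\|P_{\Omega^{\perp}}(\underline{\Delta}_s^{\Gamma^{\perp}})\|_1\Big)=\sqrt{n_3}\|(P_{T^{\perp}}\times P_{\Omega^{\perp}})\underline{\Delta}^{\Gamma^{\perp}}\|_\diamond\le4\sqrt{n_3}\|\underline{\Delta}^{\Gamma}\|_\diamond\\&=4\sqrt{n_3}\Big(\|\underline{\Delta}_\iota^{\Gamma}\|_{\mathrm{TNN}}+\lambda\|\underline{\Delta}_s^{\Gamma}\|_1\Big)\le4\sqrt{n_3}\Big(\sqrt{\min\{n_1,n_2\}}\|\underline{\Delta}_\iota^{\Gamma}\|_F+\lambda\sqrt{n_1n_2n_3}\|\underline{\Delta}_s^{\Gamma}\|_F\Big)\\&=4\sqrt{n_3}\Big(\sqrt{\min\{n_1,n_2\}}+\lambda\sqrt{n_1n_2n_3}\Big)\|\underline{\Delta}_\iota^{\Gamma}\|_F\le4\sqrt{n_3}\Big(\sqrt{\min\{n_1,n_2\}}+\lambda\sqrt{n_1n_2n_3}\Big)\delta.\end{aligned} \tag{A30}$$

(B) Bound $\|(P_T\times P_\Omega)\underline{\Delta}^{\Gamma^{\perp}}\|_{F,\mu}$. Please note that:

$$P_\Gamma(\underline{\Delta}^{\Gamma^{\perp}})=\underline{0}=P_\Gamma(P_T\times P_\Omega)(\underline{\Delta}^{\Gamma^{\perp}})+P_\Gamma(P_{T^{\perp}}\times P_{\Omega^{\perp}})(\underline{\Delta}^{\Gamma^{\perp}}), \tag{A31}$$

which means:

$$\|P_\Gamma(P_T\times P_\Omega)(\underline{\Delta}^{\Gamma^{\perp}})\|_{F,\mu}=\|P_\Gamma(P_{T^{\perp}}\times P_{\Omega^{\perp}})(\underline{\Delta}^{\Gamma^{\perp}})\|_{F,\mu}\le\|(P_{T^{\perp}}\times P_{\Omega^{\perp}})(\underline{\Delta}^{\Gamma^{\perp}})\|_{F,\mu}. \tag{A32}$$

According to Lemma A2, we have:

$$\begin{aligned}\|(P_T\times P_\Omega)(\underline{\Delta}^{\Gamma^{\perp}})\|_{F,\mu}&\le\|P_T(\underline{\Delta}_\iota^{\Gamma^{\perp}})\|_F+\mu\|P_\Omega(\underline{\Delta}_s^{\Gamma^{\perp}})\|_F\le\sqrt{1+\mu^2}\sqrt{\|P_T(\underline{\Delta}_\iota^{\Gamma^{\perp}})\|_F^2+\|P_\Omega(\underline{\Delta}_s^{\Gamma^{\perp}})\|_F^2}\\&=\sqrt{1+\mu^2}\,\|(P_T\times P_\Omega)(\underline{\Delta}^{\Gamma^{\perp}})\|_F\le\sqrt{1+\mu^2}\cdot\sqrt{\frac{8}{1+\mu^2}}\cdot\|P_\Gamma(P_T\times P_\Omega)(\underline{\Delta}^{\Gamma^{\perp}})\|_{F,\mu}.\end{aligned} \tag{A33}$$

According to Equations (A32) and (A33), we obtain:

$$\|(P_T\times P_\Omega)(\underline{\Delta}^{\Gamma^{\perp}})\|_{F,\mu}\le2\sqrt2\,\|(P_{T^{\perp}}\times P_{\Omega^{\perp}})(\underline{\Delta}^{\Gamma^{\perp}})\|_{F,\mu}. \tag{A34}$$

Thus, combining Equations (A24), (A25), (A26), (A30) and (A34), and setting $\mu=\sqrt{n_3}\lambda$, we obtain:

$$\|\underline{\Delta}\|_{F,\mu}\le\Big(\sqrt{1+n_3\lambda^2}+4(1+2\sqrt2)\big(\sqrt{\min\{n_1,n_2\}n_3}+n_3\lambda\sqrt{n_1n_2}\big)\Big)\delta. \tag{A35}$$

Since $\lambda=\frac{1}{\sqrt{\max\{n_1,n_2\}n_3}}$, we have:

$$\|\underline{\Delta}\|_{F,\mu}\le\Big(\sqrt{1+\frac{1}{\max\{n_1,n_2\}}}+8(1+2\sqrt2)\sqrt{\min\{n_1,n_2\}n_3}\Big)\delta, \tag{A36}$$

which indicates that:

$$\begin{aligned}\|\hat{\underline{L}}-\underline{L}_0\|_F&\le\Big(\sqrt{1+\frac{1}{\max\{n_1,n_2\}}}+8(1+2\sqrt2)\sqrt{\min\{n_1,n_2\}n_3}\Big)\delta,\\ \|\hat{\underline{S}}-\underline{S}_0\|_F&\le\Big(\sqrt{1+\max\{n_1,n_2\}}+8(1+2\sqrt2)\sqrt{n_1n_2n_3}\Big)\delta.\end{aligned} \tag{A37}$$

Moreover, according to the analysis in [2], the condition $\|P_\Omega P_T\|\le\frac12$ and Equation (A8) in Lemma A1 hold with probability at least $1-c_1(n_3\max\{n_1,n_2\})^{-c_2}$, where $c_1$ and $c_2$ are positive constants.

In this way, the proof of Theorem 1 is completed. □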

Appendix A.2. Proof of Theorem 2

Proof. 

The key idea is to rewrite Problem (29) into a standard two-block ADMM problem. For notational simplicity, let:

f(x)=12L_+S_Y_F2,g(z)=γK_TNN+γλR(S_),

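where $x,y,z$ and $A$ are defined as follows:

$$x=\begin{bmatrix}\mathrm{vec}(\underline{L})\\ \mathrm{vec}(\underline{S})\end{bmatrix},\quad y=\begin{bmatrix}\mathrm{vec}(\underline{Y}_1)\\ \mathrm{vec}(\underline{Y}_2)\end{bmatrix},\quad z=\begin{bmatrix}\mathrm{vec}(\underline{K})\\ \mathrm{vec}(\underline{R})\end{bmatrix},\quad A=\begin{bmatrix}\mathrm{diag}(\mathrm{vec}(\underline{B}))&0\\0&\mathrm{diag}(\mathrm{vec}(\underline{B}))\end{bmatrix},$$

and $\mathrm{vec}(\cdot)$ denotes the operation of tensor vectorization (see [40]).

It can be verified that $f(\cdot)$ and $g(\cdot)$ are closed, proper convex functions. Then, Problem (29) can be re-written as follows:

$$\min_{x,z}\ f(x)+g(z)\quad \text{s.t.}\ Ax-z=0.$$

According to the convergence analysis in [48], we have:

$$\begin{aligned}\text{objective convergence:}&\quad \lim_{t\to\infty}f(x^{t})+g(z^{t})=f^{\star}+g^{\star},\\ \text{dual variable convergence:}&\quad \lim_{t\to\infty}y^{t}=y^{\star},\\ \text{constraint convergence:}&\quad \lim_{t\to\infty}Ax^{t}-z^{t}=0,\end{aligned}$$

where $f^{\star},g^{\star}$ are the optimal values of $f(x)$, $g(z)$, respectively. The variable $y^{\star}$ is a dual optimal point defined as:

$$y^{\star}=\begin{bmatrix}\mathrm{vec}(\underline{Y}_1^{\star})\\ \mathrm{vec}(\underline{Y}_2^{\star})\end{bmatrix},$$

where $(\underline{Y}_1^{\star},\underline{Y}_2^{\star})$ is the dual component of a saddle point $(\underline{L}^{\star},\underline{S}^{\star},\underline{K}^{\star},\underline{R}^{\star},\underline{Y}_1^{\star},\underline{Y}_2^{\star})$ of the unaugmented Lagrangian $\mathcal{L}(\underline{L},\underline{S},\underline{K},\underline{R},\underline{Y}_1,\underline{Y}_2)$. □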

Appendix A.3. Proof of Lemma 4

Proof. 

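Let the full t-SVD of $\underline{X}$ be $\underline{X}=\underline{U}*\underline{\Lambda}*\underline{V}^{\top}$, where $\underline{U},\underline{V}\in\mathbb{R}^{r\times r\times n_3}$ are orthogonal tensors and $\underline{\Lambda}\in\mathbb{R}^{r\times r\times n_3}$ is f-diagonal. Then:

$$\|\underline{X}\|_{\mathrm{TNN}}=\big\|\overline{\underline{U}*\underline{\Lambda}*\underline{V}^{\top}}\big\|_*=\big\|\overline{U}\cdot\overline{\Lambda}\cdot\overline{V}^{H}\big\|_*=\|\overline{\Lambda}\|_*. \tag{A38}$$

Then $\underline{Q}*\underline{X}=(\underline{Q}*\underline{U})*\underline{\Lambda}*\underline{V}^{\top}$. Since

$$(\underline{Q}*\underline{U})^{\top}*(\underline{Q}*\underline{U})=\underline{U}^{\top}*\underline{Q}^{\top}*\underline{Q}*\underline{U}=\underline{I}, \tag{A39}$$

we obtain that:

$$\|\underline{Q}*\underline{X}\|_{\mathrm{TNN}}=\big\|\overline{\underline{Q}*\underline{X}}\big\|_*=\big\|\overline{(\underline{Q}*\underline{U})*\underline{\Lambda}*\underline{V}^{\top}}\big\|_*=\big\|\overline{(\underline{Q}*\underline{U})}\cdot\overline{\Lambda}\cdot\overline{V}^{H}\big\|_*=\|\overline{\Lambda}\|_*. \tag{A40}$$

Thus, $\|\underline{Q}*\underline{X}\|_{\mathrm{TNN}}=\|\underline{X}\|_{\mathrm{TNN}}$. □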

Appendix A.4. Proof of Theorem 3

Proof. 

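Let $(\underline{Q}^{*},\underline{X}^{*},\underline{S}^{*})$ denote a global optimal solution to Problem (39) and $(\underline{L}^{\circ},\underline{S}^{\circ})$ the solution to Problem (28). Please note that $(\underline{Q}^{*}*\underline{X}^{*},\underline{S}^{*})$ is a feasible point of Problem (28); then, using Lemma 4, we have:

$$\begin{aligned}\frac12\big\|\underline{B}\odot(\underline{L}^{\circ}+\underline{S}^{\circ}-\underline{M})\big\|_F^2+\gamma\big(\|\underline{L}^{\circ}\|_{\mathrm{TNN}}+\lambda\|\underline{S}^{\circ}\|_1\big)&\le\frac12\big\|\underline{B}\odot(\underline{Q}^{*}*\underline{X}^{*}+\underline{S}^{*}-\underline{M})\big\|_F^2+\gamma\big(\|\underline{Q}^{*}*\underline{X}^{*}\|_{\mathrm{TNN}}+\lambda\|\underline{S}^{*}\|_1\big)\\&=\frac12\big\|\underline{B}\odot(\underline{Q}^{*}*\underline{X}^{*}+\underline{S}^{*}-\underline{M})\big\|_F^2+\gamma\big(\|\underline{X}^{*}\|_{\mathrm{TNN}}+\lambda\|\underline{S}^{*}\|_1\big).\end{aligned} \tag{A41}$$

By the assumption that $r_{\mathrm{tubal}}(\underline{L}^{\circ})\le r$, there exists a decomposition $\underline{L}^{\circ}=\underline{Q}^{\circ}*\underline{X}^{\circ}$ such that $(\underline{Q}^{\circ},\underline{X}^{\circ},\underline{S}^{\circ})$ is also a feasible point of Problem (39).

Moreover, since $(\underline{Q}^{*},\underline{X}^{*},\underline{S}^{*})$ is a global optimal solution to Problem (39), we have that

$$\frac12\big\|\underline{B}\odot(\underline{Q}^{*}*\underline{X}^{*}+\underline{S}^{*}-\underline{M})\big\|_F^2+\gamma\big(\|\underline{X}^{*}\|_{\mathrm{TNN}}+\lambda\|\underline{S}^{*}\|_1\big)\le\frac12\big\|\underline{B}\odot(\underline{Q}^{\circ}*\underline{X}^{\circ}+\underline{S}^{\circ}-\underline{M})\big\|_F^2+\gamma\big(\|\underline{X}^{\circ}\|_{\mathrm{TNN}}+\lambda\|\underline{S}^{\circ}\|_1\big).$$

By $\underline{L}^{\circ}=\underline{Q}^{\circ}*\underline{X}^{\circ}$, we have:

$$\|\underline{L}^{\circ}\|_{\mathrm{TNN}}=\|\underline{Q}^{\circ}*\underline{X}^{\circ}\|_{\mathrm{TNN}}=\|\underline{X}^{\circ}\|_{\mathrm{TNN}}. \tag{A42}$$

Thus, we deduce:

$$\frac12\big\|\underline{B}\odot(\underline{Q}^{*}*\underline{X}^{*}+\underline{S}^{*}-\underline{M})\big\|_F^2+\gamma\big(\|\underline{X}^{*}\|_{\mathrm{TNN}}+\lambda\|\underline{S}^{*}\|_1\big)\le\frac12\big\|\underline{B}\odot(\underline{L}^{\circ}+\underline{S}^{\circ}-\underline{M})\big\|_F^2+\gamma\big(\|\underline{L}^{\circ}\|_{\mathrm{TNN}}+\lambda\|\underline{S}^{\circ}\|_1\big). \tag{A43}$$

According to Equations (A41) and (A43), we further have:

$$\frac12\big\|\underline{B}\odot(\underline{Q}^{*}*\underline{X}^{*}+\underline{S}^{*}-\underline{M})\big\|_F^2+\gamma\big(\|\underline{X}^{*}\|_{\mathrm{TNN}}+\lambda\|\underline{S}^{*}\|_1\big)=\frac12\big\|\underline{B}\odot(\underline{L}^{\circ}+\underline{S}^{\circ}-\underline{M})\big\|_F^2+\gamma\big(\|\underline{L}^{\circ}\|_{\mathrm{TNN}}+\lambda\|\underline{S}^{\circ}\|_1\big). \tag{A44}$$

In this way, $(\underline{Q}^{*}*\underline{X}^{*},\underline{S}^{*})$ is also an optimal solution to Problem (28). □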

Author Contributions

Conceptualization, W.F. Data curation, D.W. and R.Z. Formal analysis, W.F. Methodology, W.F., D.W and R.Z. Software, D.W. and R.Z. Writing, original draft, W.F., D.W. and R.Z.

Funding

This research was funded by the Key Projects of Natural Science Research in Universities in Anhui Province under grant number KJ2019A0994.

Conflicts of Interest

The authors declare no conflict of interest.

References

  • 1.Liu J., Musialski P., Wonka P., Ye J. Tensor completion for estimating missing values in visual data. IEEE Trans. Pattern Anal. Mach. Intell. 2013;35:208–220. doi: 10.1109/TPAMI.2012.39. [DOI] [PubMed] [Google Scholar]
  • 2.Lu C., Feng J., Chen Y., Liu W., Lin Z., Yan S. Tensor robust principal component analysis with a new tensor nuclear norm. IEEE Trans. Pattern Anal. Mach. Intell. 2019 doi: 10.1109/TPAMI.2019.2891760. [DOI] [PubMed] [Google Scholar]
  • 3.Xu Y., Hao R., Yin W., Su Z. Parallel matrix factorization for low-rank tensor completion. Inverse Probl. Imaging. 2015;9:601–624. doi: 10.3934/ipi.2015.9.601. [DOI] [Google Scholar]
  • 4.Liu Y., Shang F. An Efficient Matrix Factorization Method for Tensor Completion. IEEE Signal Process. Lett. 2013;20:307–310. doi: 10.1109/LSP.2013.2245416. [DOI] [Google Scholar]
  • 5.Wang A., Wei D., Wang B., Jin Z. Noisy Low-Tubal-Rank Tensor Completion Through Iterative Singular Tube Thresholding. IEEE Access. 2018;6:35112–35128. doi: 10.1109/ACCESS.2018.2850324. [DOI] [Google Scholar]
  • 6.Tan H., Feng G., Feng J., Wang W., Zhang Y.J., Li F. A tensor-based method for missing traffic data completion. Transp. Res. Part C. 2013;28:15–27. doi: 10.1016/j.trc.2012.12.007. [DOI] [Google Scholar]
  • 7.Peng Y., Lu B.L. Discriminative extreme learning machine with supervised sparsity preserving for image classification. Neurocomputing. 2017;261:242–252. doi: 10.1016/j.neucom.2016.05.113. [DOI] [Google Scholar]
  • 8.Cichocki A., Mandic D., De Lathauwer L., Zhou G., Zhao Q., Caiafa C., Phan H.A. Tensor decompositions for signal processing applications: From two-way to multiway component analysis. IEEE Signal Process. Mag. 2015;32:145–163. doi: 10.1109/MSP.2013.2297439. [DOI] [Google Scholar]
  • 9.Vaswani N., Bouwmans T., Javed S., Narayanamurthy P. Robust subspace learning: Robust PCA, robust subspace tracking, and robust subspace recovery. IEEE Signal Process. Mag. 2018;35:32–55. doi: 10.1109/MSP.2018.2826566. [DOI] [Google Scholar]
  • 10.Cichocki A., Lee N., Oseledets I., Phan A.H., Zhao Q., Mandic D.P. Tensor Networks for Dimensionality Reduction and Large-scale Optimization: Part 1 Low-Rank Tensor Decompositions. Found. Trends® Mach. Learn. 2016;9:249–429. doi: 10.1561/2200000059. [DOI] [Google Scholar]
  • 11.Yuan M., Zhang C.H. On Tensor Completion via Nuclear Norm Minimization. Found. Comput. Math. 2016;16:1–38. doi: 10.1007/s10208-015-9269-5. [DOI] [Google Scholar]
  • 12.Candès E.J., Tao T. The power of convex relaxation: Near-optimal matrix completion. IEEE Trans. Inf. Theory. 2010;56:2053–2080. doi: 10.1109/TIT.2010.2044061. [DOI] [Google Scholar]
  • 13.Hillar C.J., Lim L. Most Tensor Problems Are NP-Hard. J. ACM. 2009;60:45. doi: 10.1145/2512329. [DOI] [Google Scholar]
  • 14.Yuan M., Zhang C.H. Incoherent Tensor Norms and Their Applications in Higher Order Tensor Completion. IEEE Trans. Inf. Theory. 2017;63:6753–6766. doi: 10.1109/TIT.2017.2724549. [DOI] [Google Scholar]
  • 15.Tomioka R., Suzuki T. Convex tensor decomposition via structured schatten norm regularization; Proceedings of the Advances in Neural Information Processing Systems; Lake Tahoe, NV, USA. 5–10 December 2013; pp. 1331–1339. [Google Scholar]
  • 16.Semerci O., Hao N., Kilmer M.E., Miller E.L. Tensor-Based Formulation and Nuclear Norm Regularization for Multienergy Computed Tomography. IEEE Trans. Image Process. 2014;23:1678–1693. doi: 10.1109/TIP.2014.2305840. [DOI] [PubMed] [Google Scholar]
  • 17.Mu C., Huang B., Wright J., Goldfarb D. Square Deal: Lower Bounds and Improved Relaxations for Tensor Recovery; Proceedings of the International Conference on Machine Learning; Beijing, China. 21–26 June 2014; pp. 73–81. [Google Scholar]
  • 18.Zhao Q., Meng D., Kong X., Xie Q., Cao W., Wang Y., Xu Z. A Novel Sparsity Measure for Tensor Recovery; Proceedings of the IEEE International Conference on Computer Vision; Santiago, Chile. 7–13 December 2015; pp. 271–279. [Google Scholar]
  • 19.Wei D., Wang A., Wang B., Feng X. Tensor Completion Using Spectral (k, p)-Support Norm. IEEE Access. 2018;6:11559–11572. doi: 10.1109/ACCESS.2018.2811396. [DOI] [Google Scholar]
  • 20.Tomioka R., Hayashi K., Kashima H. Estimation of low-rank tensors via convex optimization. arXiv. 2010; arXiv:1010.0789. [Google Scholar]
  • 21.Chretien S., Wei T. Sensing tensors with Gaussian filters. IEEE Trans. Inf. Theory. 2016;63:843–852. doi: 10.1109/TIT.2016.2633413. [DOI] [Google Scholar]
  • 22.Ghadermarzy N., Plan Y., Yılmaz Ö. Near-optimal sample complexity for convex tensor completion. arXiv. 2017; arXiv:1711.04965. doi: 10.1093/imaiai/iay019. [DOI] [Google Scholar]
  • 23.Ghadermarzy N., Plan Y., Yılmaz Ö. Learning tensors from partial binary measurements. arXiv. 2018; arXiv:1804.00108. doi: 10.1109/TSP.2018.2879031. [DOI] [Google Scholar]
  • 24.Liu Y., Shang F., Fan W., Cheng J., Cheng H. Generalized Higher-Order Orthogonal Iteration for Tensor Decomposition and Completion; Proceedings of the Advances in Neural Information Processing Systems; Montreal, QC, Canada. 8–13 December 2014; pp. 1763–1771. [Google Scholar]
  • 25.Zhang Z., Ely G., Aeron S., Hao N., Kilmer M. Novel methods for multilinear data completion and de-noising based on tensor-SVD; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Columbus, OH, USA. 23–28 June 2014; pp. 3842–3849. [Google Scholar]
  • 26.Lu C., Feng J., Lin Z., Yan S. Exact Low Tubal Rank Tensor Recovery from Gaussian Measurements; Proceedings of the 27th International Joint Conference on Artificial Intelligence; Stockholm, Sweden. 13–19 July 2018; pp. 1948–1954. [Google Scholar]
  • 27.Jiang J.Q., Ng M.K. Exact Tensor Completion from Sparsely Corrupted Observations via Convex Optimization. arXiv. 2017; arXiv:1708.00601. [Google Scholar]
  • 28.Xie Y., Tao D., Zhang W., Liu Y., Zhang L., Qu Y. On Unifying Multi-view Self-Representations for Clustering by Tensor Multi-rank Minimization. Int. J. Comput. Vis. 2018;126:1157–1179. doi: 10.1007/s11263-018-1086-2. [DOI] [Google Scholar]
  • 29.Ely G.T., Aeron S., Hao N., Kilmer M.E. 5D seismic data completion and denoising using a novel class of tensor decompositions. Geophysics. 2015;80:V83–V95. doi: 10.1190/geo2014-0467.1. [DOI] [Google Scholar]
  • 30.Liu X., Aeron S., Aggarwal V., Wang X., Wu M. Adaptive Sampling of RF Fingerprints for Fine-grained Indoor Localization. IEEE Trans. Mob. Comput. 2016;15:2411–2423. doi: 10.1109/TMC.2015.2505729. [DOI] [Google Scholar]
  • 31.Wang A., Lai Z., Jin Z. Noisy low-tubal-rank tensor completion. Neurocomputing. 2019;330:267–279. doi: 10.1016/j.neucom.2018.11.012. [DOI] [Google Scholar]
  • 32.Sun W., Chen Y., Huang L., So H.C. Tensor Completion via Generalized Tensor Tubal Rank Minimization using General Unfolding. IEEE Signal Process. Lett. 2018;25:868–872. doi: 10.1109/LSP.2018.2819892. [DOI] [Google Scholar]
  • 33.Kilmer M.E., Braman K., Hao N., Hoover R.C. Third-order tensors as operators on matrices: A theoretical and computational framework with applications in imaging. SIAM J. Matrix Anal. Appl. 2013;34:148–172. doi: 10.1137/110837711. [DOI] [Google Scholar]
  • 34.Liu X.Y., Aeron S., Aggarwal V., Wang X. Low-tubal-rank tensor completion using alternating minimization. arXiv. 2016; arXiv:1610.01690. [Google Scholar]
  • 35.Liu X.Y., Wang X. Fourth-order tensors with multidimensional discrete transforms. arXiv. 2017; arXiv:1705.01576. [Google Scholar]
  • 36.Gu Q., Gui H., Han J. Robust tensor decomposition with gross corruption; Proceedings of the Advances in Neural Information Processing Systems; Montreal, QC, Canada. 8–13 December 2014; pp. 1422–1430. [Google Scholar]
  • 37.Wang A., Jin Z., Tang G. Robust tensor decomposition via t-SVD: Near-optimal statistical guarantee and scalable algorithms. Signal Process. 2020;167:107319. doi: 10.1016/j.sigpro.2019.107319. [DOI] [Google Scholar]
  • 38.Zhang Z., Aeron S. Exact Tensor Completion Using t-SVD. IEEE Trans. Signal Process. 2017;65:1511–1526. doi: 10.1109/TSP.2016.2639466. [DOI] [Google Scholar]
  • 39.Goldfarb D., Qin Z. Robust low-rank tensor recovery: Models and algorithms. SIAM J. Matrix Anal. Appl. 2014;35:225–253. doi: 10.1137/130905010. [DOI] [Google Scholar]
  • 40.Kolda T.G., Bader B.W. Tensor decompositions and applications. SIAM Rev. 2009;51:455–500. doi: 10.1137/07070111X. [DOI] [Google Scholar]
  • 41.Cheng L., Wu Y.C., Zhang J., Liu L. Subspace identification for DOA estimation in massive/full-dimension MIMO systems: Bad data mitigation and automatic source enumeration. IEEE Trans. Signal Process. 2015;63:5897–5909. doi: 10.1109/TSP.2015.2458788. [DOI] [Google Scholar]
  • 42.Cheng L., Xing C., Wu Y.C. Irregular Array Manifold Aided Channel Estimation in Massive MIMO Communications. IEEE J. Sel. Top. Signal Process. 2019;13:974–988. doi: 10.1109/JSTSP.2019.2937392. [DOI] [Google Scholar]
  • 43.Zhao Q., Zhou G., Zhang L., Cichocki A., Amari S.I. Bayesian robust tensor factorization for incomplete multiway data. IEEE Trans. Neural Networks Learn. Syst. 2016;27:736–748. doi: 10.1109/TNNLS.2015.2423694. [DOI] [PubMed] [Google Scholar]
  • 44.Zhou Y., Cheung Y. Bayesian Low-Tubal-Rank Robust Tensor Factorization with Multi-Rank Determination. IEEE Trans. Pattern Anal. Mach. Intell. 2019 doi: 10.1109/TPAMI.2019.2923240. [DOI] [PubMed] [Google Scholar]
  • 45.Zhou Z., Li X., Wright J., Candes E., Ma Y. Stable principal component pursuit; Proceedings of the 2010 IEEE International Symposium on Information Theory; Austin, TX, USA. 12–18 June 2010; pp. 1518–1522. [Google Scholar]
  • 46.Candès E.J., Li X., Ma Y., Wright J. Robust principal component analysis? J. ACM. 2011;58:11. doi: 10.1145/1970392.1970395. [DOI] [Google Scholar]
  • 47.Lu C., Feng J., Chen Y., Liu W., Lin Z., Yan S. Tensor Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Tensors via Convex Optimization; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Las Vegas, NV, USA. 27–30 June 2016; pp. 5249–5257. [Google Scholar]
  • 48.Boyd S., Parikh N., Chu E., Peleato B., Eckstein J. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends® Mach. Learn. 2011;3:1–122. doi: 10.1561/2200000016. [DOI] [Google Scholar]
  • 49.Peng Y., Lu B.L. Robust structured sparse representation via half-quadratic optimization for face recognition. Multimed. Tools Appl. 2017;76:8859–8880. doi: 10.1007/s11042-016-3510-3. [DOI] [Google Scholar]
  • 50.Liu G., Yan S. Active subspace: Toward scalable low-rank learning. Neural Comput. 2012;24:3371–3394. doi: 10.1162/NECO_a_00369. [DOI] [PubMed] [Google Scholar]
  • 51.Wang A., Jin Z., Yang J. A Factorization Strategy for Tensor Robust PCA. ResearchGate; Berlin, Germany: 2019. [Google Scholar]
  • 52.Jiang Q., Ng M. Robust Low-Tubal-Rank Tensor Completion via Convex Optimization; Proceedings of the 28th International Joint Conference on Artificial Intelligence; Macao, China. 10–16 August 2019; pp. 2649–2655. [Google Scholar]
  • 53.Kernfeld E., Kilmer M., Aeron S. Tensor–tensor products with invertible linear transforms. Linear Algebra Its Appl. 2015;485:545–570. doi: 10.1016/j.laa.2015.07.021. [DOI] [Google Scholar]
  • 54.Lu C., Peng X., Wei Y. Low-Rank Tensor Completion With a New Tensor Nuclear Norm Induced by Invertible Linear Transforms; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Long Beach, CA, USA. 16–20 June 2019; pp. 5996–6004. [Google Scholar]
  • 55.Liu X.Y., Aeron S., Aggarwal V., Wang X. Low-tubal-rank tensor completion using alternating minimization; Proceedings of the SPIE Defense+ Security; Baltimore, MD, USA. 17–21 April 2016; Bellingham, WA, USA: International Society for Optics and Photonics; 2016. p. 984809. [Google Scholar]
  • 56.Zhou P., Lu C., Lin Z., Zhang C. Tensor Factorization for Low-Rank Tensor Completion. IEEE Trans. Image Process. 2018;27:1152–1163. doi: 10.1109/TIP.2017.2762595. [DOI] [PubMed] [Google Scholar]
  • 57.Martin C.D., Shafer R., Larue B. An Order-p Tensor Factorization with Applications in Imaging. SIAM J. Sci. Comput. 2013;35:A474–A490. doi: 10.1137/110841229. [DOI] [Google Scholar]
  • 58.Wang A., Jin Z. Orientation Invariant Tubal Nuclear Norms Applied to Robust Tensor Decomposition. [(accessed on 3 December 2019)]; Available online: https://www.researchgate.net/publication/329116872_Orientation_Invariant_Tubal_Nuclear_Norms_Applied_to_Robust_Tensor_Decomposition.

Articles from Sensors (Basel, Switzerland) are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
