Abstract
This paper extends robust principal component analysis (RPCA) to nonlinear manifolds. Suppose that the observed data matrix is the sum of a sparse component and a component drawn from some low-dimensional manifold. Is it possible to separate them using ideas similar to those of RPCA? Is there any benefit to treating the manifold as a whole, as opposed to treating each local region independently? We answer these two questions affirmatively by proposing and analyzing an optimization framework that separates the sparse component from the manifold under noisy data. Theoretical error bounds are provided when the tangent spaces of the manifold satisfy certain incoherence conditions. We also provide a near-optimal choice of the tuning parameters for the proposed optimization formulation with the help of a new curvature estimation method. The efficacy of our method is demonstrated on both synthetic and real datasets.
1. Introduction
Manifold learning and graph learning are nowadays widely used in computer vision, image processing, and biological data analysis on tasks such as classification, anomaly detection, data interpolation, and denoising. In most applications, graphs are learned from the high dimensional data and used to facilitate traditional data analysis methods such as PCA, Fourier analysis, and data clustering [7, 8, 9, 15, 12]. However, the quality of the learned graph may be greatly jeopardized by outliers which cause instabilities in all the aforementioned graph assisted applications.
In recent years, several methods have been proposed to handle outliers in nonlinear data [11, 21, 3]. Despite their success, these methods only aim at detecting the outliers instead of correcting them. In addition, very few of them are equipped with a theoretical analysis of their statistical performance. In this paper, we propose a novel non-task-driven algorithm for the mixed noise model in (1) and provide theoretical guarantees that control its estimation error. Specifically, we consider the mixed noise model
X̂i = Xi + Si + Ei,  i = 1, …, n,  (1)

where Xi ∈ ℝp is the noiseless data independently drawn from some manifold 𝓜 with an intrinsic dimension d ≪ p, Ei is the i.i.d. Gaussian noise with small magnitudes, and Si is the sparse noise with possibly large magnitudes. If Si has a large entry, then the corresponding X̂i is usually considered an outlier. The goal of this paper is to simultaneously recover Xi and Si from X̂i, i = 1, …, n.
There are several benefits to recovering the noise term Si along with the signal Xi. First, the support of Si indicates the locations of the anomalies, which is informative in many applications. For example, if Xi is the gene expression data from the ith patient, the nonzero elements in Si indicate differentially expressed genes that are candidates for personalized medicine. Similarly, if Si is the result of malfunctioning hardware, its nonzero elements indicate the locations of the malfunctioning parts. Secondly, the recovery of Si allows the “outliers” to be pulled back to the data manifold instead of simply being discarded. This prevents a waste of information and is especially beneficial when data is insufficient. Thirdly, in some applications, the sparse Si is part of the clean data rather than a noise term; the algorithm then provides a natural decomposition of the data into a sparse and a non-sparse component that may carry different pieces of information.
Along a similar line of research, Robust Principal Component Analysis (RPCA) [2] has received considerable attention and has demonstrated its success in separating data from sparse noise in many applications. However, its assumption that the data lies in a low dimensional subspace is somewhat strict. In this paper, we generalize the Robust PCA idea to the nonlinear manifold setting. The major new components of our algorithm are: 1) an incorporation of the manifold curvature information into the optimization framework, and 2) a unified way to apply RPCA to a collection of tangent spaces of the manifold.
2. Methodology
Let X̂ = [X̂1, …, X̂n] ∈ ℝp×n be the noisy data matrix containing n samples. Each sample is a vector in ℝp independently drawn from (1). The overall data matrix has the representation

X̂ = X + S + E,

where X is the clean data matrix, S is the matrix of the sparse noise, and E is the matrix of the Gaussian noise. We further assume that the clean data X lies on some manifold 𝓜 embedded in ℝp with a small intrinsic dimension d ≪ p and that the samples are sufficient (n ≥ p). The small intrinsic dimension assumption ensures that the data is locally low dimensional, so that each local data matrix is of low rank. This property allows the data to be separated from the sparse noise.
The key idea behind our method is to handle the data locally. We use the k nearest neighbors (kNN) to construct local data matrices, where k is larger than the intrinsic dimension d. For a data point Xi, we define the local patch centered at it to be the set consisting of its kNN and itself, and the local data matrix X(i) associated with this patch is X(i) = [Xi1, Xi2, …, Xik, Xi], where Xij is the jth-nearest neighbor of Xi. Let Pi be the n × (k + 1) matrix that selects the columns of X in the ith patch, so that the restriction of X to the ith patch is X(i) = XPi. Similarly, we define X̂(i) = X̂Pi, S(i) = SPi, and E(i) = EPi.
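To make the patch construction concrete, here is a minimal numpy sketch of the kNN patches and the restriction operator (the function names `build_patches` and `restrict` are ours, not the paper's):

```python
import numpy as np

def build_patches(X_hat, k):
    """Build kNN local patches from a p-by-n data matrix.

    For each point i, return the indices of its k nearest neighbors
    followed by i itself, so each patch has k + 1 columns, mirroring
    the local data matrix X^(i) = X P_i in the text.
    """
    p, n = X_hat.shape
    # pairwise squared Euclidean distances between columns
    G = X_hat.T @ X_hat
    sq = np.diag(G)
    D2 = sq[:, None] - 2 * G + sq[None, :]
    patches = []
    for i in range(n):
        order = np.argsort(D2[i])                  # nearest first
        nbrs = [j for j in order if j != i][:k]    # k nearest neighbors
        patches.append(np.array(nbrs + [i]))       # center point last
    return patches

def restrict(M, patch):
    """Restriction operator: select the columns of M in the patch (M P_i)."""
    return M[:, patch]
```

Each restriction is then a p × (k + 1) matrix, matching the dimensions used throughout §2.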
Since each local data matrix X(i) is nearly of low rank and S is sparse, we can decompose the noisy data matrix into low-rank parts and sparse parts through solving the following optimization problem
min over L(1), …, L(n) and S of

F := Σi ( ‖L(i)C‖* + β‖S(i)‖1 + (λi/2)‖X̂(i) − S(i) − L(i)‖F2 ),  subject to S(i) = SPi, i = 1, …, n;  (2)
here we take β = 1/√max{k + 1, p} as in RPCA, X̂(i) = X̂Pi is the local data matrix on the ith patch, and C := I − (1/(k+1))11T is the centering operator that subtracts the column mean, where 1 is the (k + 1)-dimensional column vector of all ones. Here we are decomposing the data on each patch into a low-rank part L(i) and a sparse part S(i) by imposing the nuclear norm and the entry-wise ℓ1 norm on L(i) and S(i), respectively. There are two key components in this formulation. 1) The local patches overlap (for example, the first data point X1 may belong to several patches). Thus, the constraint S(i) = SPi is particularly important because it ensures that copies of the same point on different patches (and those of the sparse noise on different patches) remain the same. 2) We do not require the L(i)s to be restrictions of a universal L to the ith patch, because the L(i)s correspond to the local affine tangent spaces, and there is no reason for a point on the manifold to have the same projection on different tangent spaces. This seemingly subtle difference has a large impact on the final result.
If the data only contains sparse noise, i.e., E = 0, then X̂ − S̃, where S̃ is the minimizer of (2), is the final estimate of X. If E ≠ 0, we apply Singular Value Hard Thresholding [6] to truncate the recovered L(i)s and remove the Gaussian noise (see §6), and use the resulting denoised patches L̂(i) to construct a final estimate of X via least squares fitting
X̃ = argmin over X of Σi ‖XPi − L̂(i)‖F2.  (3)
The following discussion revolves around (2) and (3), and the structure of the paper is as follows. In §3, we explain the geometric meaning of each term in (2). In §4, we establish theoretical recovery guarantees for (2), which justify our choice of β and allow us to theoretically choose the λi. The calculation of the λi uses the curvature of the manifold, so in §5 we provide a simple method, robust to sparse noise, for estimating the average manifold curvature. The optimization algorithms that solve (2) and (3) are presented in §6, and the numerical experiments are in §7.
3. Geometric explanation
We provide a geometric intuition for the formulation (2). Let us write the clean data matrix X(i) on the ith patch in its Taylor expansion along the manifold,
X(i) = Xi1T + T(i) + R(i),  (4)
where the Taylor series is expanded at Xi (the center point of the ith patch), T(i) stores the first order term with columns lying in the tangent space of the manifold at Xi, and R(i) contains all the higher order terms. The sum of the first two terms, Xi1T + T(i), is the linear approximation to X(i), which is unknown if the tangent space is not given. This linear approximation precisely corresponds to the L(i)s in (2), i.e., L(i) = Xi1T + T(i). Since the tangent space has the same dimensionality d as the manifold, with randomly chosen points we have, with probability one, rank(T(i)) = d. As a result, rank(L(i)) = rank(Xi1T + T(i)) ≤ d + 1. By the assumption that d < min{p, k}, we know that L(i) is indeed low rank.
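The rank count above can be checked numerically. The sketch below (with arbitrary dimensions of our choosing) samples a patch from a d-dimensional affine space, i.e., the first-order part Xi1T + T(i) with R(i) = 0, and verifies that the patch matrix has rank d + 1, dropping to d after centering:

```python
import numpy as np

rng = np.random.default_rng(0)
p, d, k = 20, 3, 10                    # ambient dim, intrinsic dim, patch size

# Points on a d-dimensional affine subspace c + A t: this mimics the
# first-order part X_i 1^T + T^(i) of the Taylor expansion.
c = rng.standard_normal(p)             # patch center X_i
A = rng.standard_normal((p, d))        # basis of the tangent space
t = rng.standard_normal((d, k + 1))    # local coordinates of the k+1 points
L = c[:, None] + A @ t                 # clean local patch, p x (k+1)

Lc = L - L.mean(axis=1, keepdims=True) # subtract the column mean (centering)
print(np.linalg.matrix_rank(L), np.linalg.matrix_rank(Lc))  # d + 1 and d
```

The centering step is exactly why the formulation in (2) applies the centering operator before measuring rank through the nuclear norm.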
Combining (4) with X̂(i) = X(i) + S(i) + E(i), we find that, at L(i) = Xi1T + T(i), the misfit term in (2) equals E(i) + R(i). This implies that the misfit contains the higher order residues (i.e., the linear approximation error) and the Gaussian noise.
4. Theoretical choice of tuning parameters
To establish the error bound, we need a coherence condition on the tangent spaces of the manifold.
Definition 4.1
Let U ∈ ℝp×r be a matrix with U∗U = I; the coherence of U is defined as

μ(U) = (p/r) · maxk ‖U∗ek‖22,

where ek is the kth element of the canonical basis. For a subspace T, its coherence is defined as

μ(T) := μ(V),

where V is an orthonormal basis of T. The coherence is independent of the choice of basis.
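In code, the coherence can be computed as follows (a sketch assuming the standard RPCA normalization p/r; the function names are ours):

```python
import numpy as np

def coherence(U):
    """mu(U) = (p / r) * max_k ||U^* e_k||_2^2 for U (p x r) with
    orthonormal columns. Ranges from 1 (incoherent) to p/r (maximally
    coherent, e.g. a span of canonical basis vectors)."""
    p, r = U.shape
    return (p / r) * np.max(np.sum(U**2, axis=1))

def subspace_coherence(B):
    """Coherence of the subspace spanned by the columns of B.
    The value is basis-independent, so we may orthonormalize first."""
    Q, _ = np.linalg.qr(B)
    return coherence(Q)
```

For example, the span of two canonical basis vectors in ℝ6 attains the maximal value p/r = 3, regardless of which basis of that span is supplied.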
The following theorem is proved for local patches constructed using η-neighborhoods. We use kNN in the experiments because kNN is more robust to insufficient samples. The full version of Theorem 4.2 can be found in the supplementary material.
Theorem 4.2
[succinct version] Let each Xi, i = 1, …, n, be independently drawn from a compact manifold 𝓜 with an intrinsic dimension d, endowed with the uniform distribution. Let Xij, j = 1, …, ki, be the ki points falling in the η-neighborhood of Xi, where η > 0 is some fixed small constant; these points form the matrix X(i). For any q ∈ 𝓜, let Tq be the tangent space of 𝓜 at q. Suppose the support of the noise matrix S(i) is uniformly distributed among all sets of cardinality mi. Then, provided ki and mi satisfy conditions governed by the positive constants ρr and ρs (stated in the full version), with probability over 1 − c1n−c2 for some constants c1 and c2, the minimizer to (2) with weights
| (5) |
has the error bound
Here the ϵi will be estimated in the next section, ϵ = [ϵ1, …, ϵn], ‖ · ‖2,1 stands for taking the ℓ2 norm along columns and then the ℓ1 norm along rows, and T(i) is the projection of X(i) − Xi1T onto the tangent space at Xi.
Remark.
We can interpret ϵ as the total noise in the data. As explained in §3, the misfit on each patch equals R(i) + E(i); thus ϵ = 0 if the manifold is linear and the Gaussian noise is absent. The factor in front of ‖ϵ‖2 accounts for the use of different norms on the two sides (the right hand side is the Frobenius norm of the noise matrix obtained by stacking the R(i) + E(i) associated with each patch into one big matrix). A second factor is due to the small weight βi on ‖S(i)‖1 compared to the weight 1 on the nuclear norm term. A third factor appears because, on average, each column of the data belongs to several patches and is therefore added several times on the left hand side.
5. Estimating the curvature
The definition of λi in (5) involves an unknown quantity, the size of the patch-wise noise R(i) + E(i). We assume the standard deviation σ of the i.i.d. Gaussian entries of E(i) is known, so ‖E(i)‖F can be approximated. Since R(i) is independent of E(i), the cross term 〈R(i), E(i)〉 is small. Our main task is therefore estimating ‖R(i)‖F, the linear approximation error defined in §3. In local regions, the second order terms dominate the linear approximation residue, hence estimating ‖R(i)‖F requires the curvature information.
5.1. A short review of related concepts in Riemannian geometry
The principal curvatures at a point on a high dimensional manifold are defined as the singular values of the second fundamental forms [10]. As estimating all the singular values from the noisy data may not be stable, we are only interested in estimating the mean curvature, that is, the root mean square of the principal curvatures.
For simplicity of illustration, we review the related concepts on a 2D surface embedded in ℝ3 (Figure 1). For any curve γ(s) on the surface parametrized by arclength with unit tangent vector tγ(s), its curvature is the norm of the covariant derivative of tγ: ‖dtγ(s)/ds‖ = ‖γ″(s)‖. In particular, we have the following decomposition

γ″(s) = kn(s)n(s) + kg(s)u(s),

where n(s) is the unit normal direction of the manifold at γ(s) and u(s) is the direction perpendicular to both n(s) and tγ(s), i.e., u(s) = n(s) × tγ(s). The coefficient kn(s) along the normal direction is called the normal curvature, and the coefficient kg(s) along the perpendicular direction is called the geodesic curvature. The principal curvatures depend purely on kn. In particular, in 2D, the principal curvatures are precisely the maximum and minimum of kn among all possible directions.
Figure 1: Local manifold geometry.
A natural way to compute the normal curvature is through geodesic curves. The geodesic curve between two points is the shortest curve on the manifold connecting them; geodesic curves are therefore usually viewed as “straight lines” on the manifold. They have the favorable property that their curvature has zero contribution from kg. That is to say, the second order derivative of a geodesic curve parameterized by arclength has magnitude exactly |kn|.
5.2. The proposed method
All existing curvature estimation methods we are aware of come from computer vision, where the objects are 2D surfaces in 3D [5, 4, 19, 14]. Most of these methods are difficult to generalize to higher (> 3) dimensions, with the exception of the integral invariant based approaches [17]. However, the integral invariant based approaches are not robust to sparse noise and are thus unsuited to our problem.
We propose a new method to estimate the mean curvature from the noisy data. Although the graphical illustration is made in 3D, the method is dimension independent. To compute the average normal curvature at a point p, we randomly pick m points q1, …, qm on the manifold lying within a proper distance to p, as specified in Algorithm 1. Let γi be the geodesic curve between p and qi. For each i, we compute the pairwise Euclidean distance ‖p − qi‖2 and the pairwise geodesic distance dg(p, qi) using Dijkstra’s algorithm. Through a circular approximation of the geodesic curve, as drawn in Figure 1, we can compute the curvature of the geodesic curve as the inverse of the radius
kn(tγi) = 1/Rγi,  (6)
where tγi is the tangent direction along which the curvature is calculated and Rγi is the radius of the circular approximation to the curve γi at p, which can be solved along with the angle θi through the geometric relations
dg(p, qi) = Rγi θi,  ‖p − qi‖2 = 2Rγi sin(θi/2),  (7)
as indicated in Figure 1. Finally, we define the average curvature at p to be
K(p) = √( (1/m) Σi (1/Rγi)2 ).  (8)
To estimate the mean curvature from the data, we construct two matrices D and A. D ∈ ℝn×n is the pairwise distance matrix, where Dij denotes the Euclidean distance between the points Xi and Xj. A is a type of adjacency matrix, defined as follows, that is used to compute the pairwise geodesic distances from the data,
| (9) |
Algorithm 1 estimates the mean curvature at some point p and Algorithm 2 estimates the overall curvature within some region Ω on the manifold.
The geodesic distance is computed using Dijkstra’s algorithm, which is not accurate when p and q are too close to each other. The constant r1 in Algorithms 1 and 2 is thus used to make sure that p and q are sufficiently far apart. The constant r2 makes sure that q is not too far away from p since, after all, we are computing the mean curvature around p.
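The circular approximation step can be sketched as follows: given the Euclidean (chord) and geodesic (arc) distances between p and qi, solve the relations in (7), arc = Rθ and chord = 2R sin(θ/2), for the radius R by bisection. This is our own minimal sketch; the paper's Algorithms 1 and 2 additionally handle the random sampling and the r1, r2 cutoffs.

```python
import math

def radius_from_chord_and_arc(chord, arc):
    """Solve  arc = R * theta,  chord = 2 R sin(theta / 2)  for R,
    given chord (Euclidean) and arc (geodesic) distances between two
    points. The curvature estimate along this direction is then 1/R.
    Uses bisection: 2R sin(arc/(2R)) is increasing in R."""
    assert 0 < chord < arc, "chord must be shorter than the arc"
    f = lambda R: 2 * R * math.sin(arc / (2 * R)) - chord
    lo, hi = arc / math.pi, arc * 1e6     # theta in (0, pi]: f(lo) < 0 < f(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Sanity check on a circle of radius 2 with a 0.5 rad arc:
R_true, theta = 2.0, 0.5
arc = R_true * theta
chord = 2 * R_true * math.sin(theta / 2)
print(1 / radius_from_chord_and_arc(chord, arc))   # close to 1/2
```

In practice the arc length would come from Dijkstra's algorithm on the adjacency graph A, and the chord from the distance matrix D.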
5.3. Estimating λi from the mean curvature
We provide a way to approximate λi when the number of points n is finite. In the asymptotic limit (k → ∞, k/n → 0), all the approximation signs “≈” below become “=”.
Fix a point p ∈ 𝓜 and another point qi in the η-neighborhood of p. Let γi be the geodesic curve between them. With the computed curvature 1/Rγi, we can estimate the linear approximation error of expanding qi at p as qi ≈ p + PTp(qi − p), where PTp is the projection onto the tangent space at p. Let ϵi be the error of this linear approximation, i.e., the magnitude of the component of qi − p in the orthogonal complement of the tangent space. From Figure 1, the relation between ϵi, ‖p − qi‖2, and Rγi is
ϵi ≈ ‖p − qi‖22 / (2Rγi).  (10)
To obtain a closed-form formula for E[ϵi2], we assume that, for the fixed p and a randomly chosen qi in an η-neighborhood of p, the projection PTp(qi − p) follows a uniform distribution in a ball with radius η′ (in fact η′ ≈ η: when η is small, the projection of qi − p is almost qi − p itself, so the radius of the projected ball almost equals the radius of the original neighborhood). Under this assumption, let ri be the magnitude of the projection and ϕi its direction; by [20], ri and ϕi are independent of each other. As the curvature Rγi only depends on the direction, the numerator and the denominator of the right hand side of (10) are independent of each other. Therefore,
E[ϵi2] ≈ E[ri4] · E[1/(4Rγi2)] ≈ (E[ri4]/4) · K(p)2,  (11)
where the first equality used the independence of ri and Rγi and the last equality used the definition of the mean curvature in the previous subsection.
Now we apply this estimation to the neighborhood of Xi. Let p = Xi, and let Xij, j = 1, …, k, be the neighbors of Xi. Using (11), the average linear approximation error on this patch is
(1/k) Σj ϵij2 ≈ (E[ri4]/4) · K(Xi)2,  (12)
where the right hand side can also be estimated with
E[ri4] ≈ (1/k) Σj ‖Xij − Xi‖24,  (13)
so when k is sufficiently large, the left hand side of (12) is also close to (K(Xi)2/4) · (1/k) Σj ‖Xij − Xi‖24, which can be computed entirely from the data. Combining this with the argument at the beginning of §5, we obtain the estimate ‖R(i) + E(i)‖F2 ≈ ‖R(i)‖F2 + ‖E(i)‖F2 with ‖R(i)‖F2 ≈ (K(Xi)2/4) Σj ‖Xij − Xi‖24.
Thus we can set λi due to (5). We show in the supplementary material that this empirical estimate is consistent with the theoretical weights in (5).
6. Optimization algorithm
To solve the convex optimization problem (2) in a memory-economic way, we first write each L(i) as a function of S and eliminate the L(i)s from the problem. We can do so by fixing S and minimizing the objective function with respect to L(i):
L(i)(S) = argmin over L(i) of ‖L(i)C‖* + (λi/2)‖X̂(i) − S(i) − L(i)‖F2.  (14)
Notice that L(i) can be decomposed as L(i) = L(i)C + (1/(k+1))L(i)11T; set L1(i) = L(i)C and L2(i) = (1/(k+1))L(i)11T. Since C is an orthogonal projection, (14) is equivalent to

min over L1(i), L2(i) of ‖L1(i)‖* + (λi/2)‖(X̂(i) − S(i))C − L1(i)‖F2 + (λi/2)‖(1/(k+1))(X̂(i) − S(i))11T − L2(i)‖F2,

which decouples into

min over L1(i) of ‖L1(i)‖* + (λi/2)‖(X̂(i) − S(i))C − L1(i)‖F2,  and  min over L2(i) of ‖(1/(k+1))(X̂(i) − S(i))11T − L2(i)‖F2.

The problems above have closed form solutions
L1(i) = D1/λi((X̂(i) − S(i))C),  L2(i) = (1/(k+1))(X̂(i) − S(i))11T,  (15)
where Dτ is the soft-thresholding operator on the singular values with threshold τ = 1/λi: if M = UΣVT is the singular value decomposition, then Dτ(M) = U(Σ − τI)+VT.
Combining L1(i) and L2(i), we have derived the closed form solution for L(i)(S):
L(i)(S) = D1/λi((X̂(i) − S(i))C) + (1/(k+1))(X̂(i) − S(i))11T.  (16)
Plugging (16) into F in (2), the resulting optimization problem depends solely on S. We then apply FISTA [1, 18] to find the optimal solution with
| (17) |
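The proximal step underlying the closed form above can be written in a few lines (a sketch; `svt` is our naming, not the paper's):

```python
import numpy as np

def svt(M, tau):
    """Singular value soft-thresholding D_tau(M): shrink every singular
    value of M by tau and clip at zero. This is the proximal operator of
    the nuclear norm, i.e. the minimizer over L of
        tau * ||L||_* + (1/2) * ||M - L||_F^2.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
```

With the weights λi of the text, the patch-wise low-rank update is `svt(centered_patch, 1.0 / lam_i)`, matching the threshold 1/λi in (15).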
Once the minimizer S̃ is found, if the data has no Gaussian noise, then the final estimate of X is X̂ − S̃; if there is Gaussian noise, we use the following denoised local patches
L̂(i) = H(L(i)(S̃)),  (18)
where H is the Singular Value Hard Thresholding operator with the optimal threshold as defined in [6]. This optimal thresholding removes the Gaussian noise from the patches. With the denoised L̂(i), we solve (3) to obtain the denoised data
X̃ = (Σi L̂(i)PiT)(Σi PiPiT)−1,  (19)

where Σi PiPiT is a diagonal matrix whose jth entry counts the patches containing Xj.
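These last two steps can be sketched as follows: the Gavish-Donoho hard threshold for a known noise level, and the least-squares recombination of overlapping patches, which reduces to averaging the copies of each column (function names are ours; `patches[i]` lists the column indices of patch i):

```python
import numpy as np

def svht_denoise(L, sigma):
    """Hard-threshold the singular values of L at the Gavish-Donoho
    level for known noise sigma: lambda*(beta) * sqrt(n) * sigma, with
    beta = m/n the aspect ratio (m <= n)."""
    m, n = min(L.shape), max(L.shape)
    beta = m / n
    lam = np.sqrt(2 * (beta + 1) + 8 * beta /
                  (beta + 1 + np.sqrt(beta**2 + 14 * beta + 1)))
    U, s, Vt = np.linalg.svd(L, full_matrices=False)
    s[s < lam * np.sqrt(n) * sigma] = 0.0
    return U @ np.diag(s) @ Vt

def recombine(patches, L_hats, n):
    """Least-squares fit of one p x n matrix to the overlapping denoised
    patches: each column of the result is the average of its copies."""
    p = L_hats[0].shape[0]
    acc = np.zeros((p, n))
    cnt = np.zeros(n)
    for idx, Lh in zip(patches, L_hats):
        acc[:, idx] += Lh
        cnt[idx] += 1
    return acc / cnt
```

For a square matrix, the threshold constant reduces to the familiar 4/√3 of [6].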
The proposed Nonlinear Robust Principal Component Analysis (NRPCA) algorithm is summarized in Algorithm 3. There is one caveat in solving (2): strong sparse noise may result in a wrong neighborhood assignment when constructing the local patches. Therefore, once S̃ is obtained and removed from the data, we update the neighborhood assignment and re-compute S̃. This procedure is repeated T times.
7. Numerical experiment
Simulated Swiss roll:
We demonstrate the superior performance of NRPCA on a synthetic dataset following the mixed noise model (1). We sampled 2000 noiseless data points Xi uniformly from a 3D Swiss roll and generated the Gaussian noise matrix E with i.i.d. entries. The sparse noise matrix S was generated by randomly replacing 100 entries of a zero p × n matrix with i.i.d. samples of the form (−1)y · z, where y ∼ Bernoulli(0.5). We applied NRPCA to the simulated data with patch size k = 15. Figure 2 reports the denoising results in the original space (3D), looking down from above. We compare two ways of using the outputs of NRPCA: 1) removing only the sparse noise from the data, X̂ − S̃; 2) removing both the sparse and the Gaussian noise, X̃. In addition, we plotted X̂ − S̃ with and without the neighbourhood update. These results are all superior to an ad-hoc application of Robust PCA on the individual local patches.
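The simulation setup can be reproduced along these lines (a sketch: the noise magnitudes, i.e., the Gaussian sigma and the distribution of z, did not survive extraction, so the values below are illustrative guesses):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 2000, 3

# Swiss roll: a 2D manifold embedded in R^3
t = 1.5 * np.pi * (1 + 2 * rng.random(n))          # roll parameter
h = 21 * rng.random(n)                             # height
X = np.stack([t * np.cos(t), h, t * np.sin(t)])    # clean data, p x n

# Gaussian noise with small magnitude (sigma = 0.5 is a guess)
E = 0.5 * rng.standard_normal((p, n))

# Sparse noise: 100 random entries with random signs (-1)^y, y ~ Bernoulli(0.5),
# and large magnitudes z (z's distribution here is a guess)
S = np.zeros((p, n))
idx = rng.choice(p * n, size=100, replace=False)
signs = rng.integers(0, 2, size=100) * 2 - 1       # +/- 1 with prob 1/2
S.ravel()[idx] = signs * (5 + 5 * rng.random(100))

X_noisy = X + E + S    # observed data, following the mixed noise model (1)
```

NRPCA would then be run on `X_noisy` with patch size k = 15 as in the text.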
Figure 2: NRPCA applied to the noisy 3D Swiss roll dataset. X̂ − S̃ is the result after subtracting the sparse noise estimated by setting T = 1 in NRPCA, i.e., no neighbour update; “X̂ − S̃ with one neighbor update” uses the S̃ obtained by setting T = 2 in NRPCA; clearly, the neighbour update helped remove more sparse noise. X̃ is the data obtained by fitting the denoised tangent spaces as in (3); compared to “X̂ − S̃ with one neighbor update”, it further removed the Gaussian noise from the data. “Patch-wise Robust PCA” refers to the ad-hoc application of vanilla Robust PCA to each local patch independently, whose performance is worse than the proposed joint-recovery formulation.
The MNIST dataset:
We observed some interesting dimension reduction results on MNIST with the help of NRPCA. It is well known that the handwritten digits 4 and 9 are so similar that the popular dimension reduction methods Isomap and Laplacian Eigenmaps fail to separate them into two clusters (first column of Figure 3). We conjecture that the similarity between the two clusters is caused by personalized writing styles of the beginning and finishing strokes. As this type of variation is better modeled by sparse noise than by Gaussian or Poisson noise, we applied NRPCA to the raw MNIST images. The right column of Figure 3 shows that after NRPCA denoising (with k = 11), the separability of the two clusters in the first two coordinates of Isomap and Laplacian Eigenmaps increases. In addition, these new embeddings seem to suggest that some trajectory patterns exist in the data. We provide additional plots in the supplementary material to support this observation.
Figure 3: Laplacian Eigenmaps and Isomap results for the original and the NRPCA-denoised digits 4 and 9 from the MNIST dataset.
Biological data:
We illustrate the potential usefulness of the NRPCA algorithm on an embryoid body (EB) differentiation dataset over a 27-day time course, which consists of gene expression measurements for 31,000 cells obtained with single-cell RNA-sequencing (scRNAseq) [13, 16]. This EB data, comprising expression measurements for cells originating from embryoid bodies at different stages, is developmental in nature: because all cells arise from a single oocyte and then develop into highly differentiated tissues, the data should exhibit a progressive structure, such as a tree. This progression is often missing when we directly apply dimension reduction methods to the data, as shown in Figure 4, because biological data such as scRNAseq are highly noisy and often contaminated with outliers from sources including environmental effects and measurement error. Here, we aim to reveal the progressive nature of the single-cell data from transcript abundance as measured by scRNAseq.
Figure 4: LLE results for the denoised scRNAseq dataset.
We first normalized the scRNAseq data following the procedure described in [16] and randomly selected 1000 cells using stratified sampling to maintain the ratios among the different developmental stages. We applied our NRPCA method to the normalized subset of the EB data and then applied Locally Linear Embedding (LLE) to the denoised results. The two-dimensional LLE results are shown in Figure 4. Our analysis demonstrates that although LLE is unable to show the progression structure using the noisy data, after NRPCA denoising LLE successfully extracts the trajectory structure in the data, which reflects the underlying smooth differentiation processes of embryonic cells. Interestingly, using the denoised data X̂ − S̃ with neighbor update, the LLE embedding shows a branching at around day 9 and increased variance at later time points, which was confirmed by manual analysis using 80 biomarkers in [16].
8. Conclusion
In this paper, we proposed the first outlier correction method for nonlinear data analysis that corrects outliers caused by the addition of large sparse noise. The method is a generalization of Robust PCA to the nonlinear setting. We provided procedures to treat the nonlinearity by working with overlapping local patches of the data manifold and incorporating the curvature information into the denoising algorithm. We established a theoretical error bound on the denoised data that holds under conditions depending only on the intrinsic properties of the manifold. We tested our method on both synthetic and real datasets that are known to have nonlinear structures and reported promising results.
Supplementary Material
Acknowledgements
The authors would like to thank Shuai Yuan, Hongbo Lu, Changxiong Liu, Jonathan Fleck, Yichen Lou, and Lijun Cheng for useful discussions. This work was supported in part by the NIH grants U01DE029255, 5RO3DE027399 and the NSF grants DMS-1902906, DMS-1621798, DMS-1715178, CCF-1909523 and NCS-1630982.
References
- [1].Beck Amir and Teboulle Marc. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM journal on imaging sciences, 2(1):183–202, 2009. [Google Scholar]
- [2].Candes Emmanuel J., Li Xiaodong, Ma Yi, and Wright John. Robust Principal Component Analysis? J. ACM, 58(3):11:1–11:37, June 2011. [Google Scholar]
- [3].Du Chun, Sun Jixiang, Zhou Shilin, and Zhao Jingjing. An Outlier Detection Method for Robust Manifold Learning. In Yin Zhixiang, Pan Linqiang, and Fang Xianwen, editors, Proceedings of The Eighth International Conference on Bio-Inspired Computing: Theories and Applications (BIC-TA), 2013, Advances in Intelligent Systems and Computing, pages 353–360. Springer Berlin Heidelberg, 2013. [Google Scholar]
- [4].Eppel Sagi. Using curvature to distinguish between surface reflections and vessel contents in computer vision based recognition of materials in transparent vessels. arXiv preprint arXiv:1602.00177, 2016. [Google Scholar]
- [5].Flynn Patrick J and Jain Anil K. On reliable curvature estimation. Computer Vision and Pattern Recognition, 89:110–116, 1989. [Google Scholar]
- [6].Gavish M and Donoho DL. The optimal hard threshold for singular values is 4/√3. IEEE Transactions on Information Theory, 60(8):5040–5053, Aug 2014. [Google Scholar]
- [7].Hammond David K., Vandergheynst Pierre, and Gribonval Rémi. Wavelets on graphs via spectral graph theory. Applied and Computational Harmonic Analysis, 30(2):129–150, March 2011. [Google Scholar]
- [8].Shi Jianbo and J. Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888–905, August 2000. [Google Scholar]
- [9].Jiang Bo, Ding Chris, Luo Bin, and Tang Jin. Graph-Laplacian PCA: Closed-Form Solution and Robustness. In 2013 IEEE Conference on Computer Vision and Pattern Recognition, pages 3492–3498, June 2013. [Google Scholar]
- [10].Kobayashi Shoshichi and Nomizu Katsumi. Foundations of differential geometry. 2, 1996. [Google Scholar]
- [11].Li Xiang-Ru, Li Xiao-Ming, Li Hai-Ling, and Cao Mao-Yong. Rejecting Outliers Based on Correspondence Manifold. Acta Automatica Sinica, 35(1):17–22, January 2009. [Google Scholar]
- [12].Little Anna, Xie Yuying, and Sun Qiang. An analysis of classical multidimensional scaling. arXiv preprint arXiv:1812.11954, 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [13].Martin GR and Evans MJ. Differentiation of clonal lines of teratocarcinoma cells: formation of embryoid bodies in vitro. Proceedings of the National Academy of Sciences, 72(4):1441–1445, April 1975. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [14].Meek Dereck S. and Walton Desmond J.. On surface normal and gaussian curvature approximations given data sampled from a smooth surface. Computer Aided Geometric Design, 17(6):521–543, 2000. [Google Scholar]
- [15].Meila Marina and Shi Jianbo. Learning Segmentation by Random Walks. In Leen TK, Dietterich TG, and Tresp V, editors, Advances in Neural Information Processing Systems 13, pages 873–879. MIT Press, 2001. [Google Scholar]
- [16].Moon Kevin, van Dijk David, Wang Zheng, Gigante Scott, Burkhardt Daniel B., Chen William S., Yim Kristina, van den Elzen Antonia, Hirn Matthew J., Coifman Ronald R., Ivanova Natalia B., Wolf Guy, and Krishnaswamy Smita. Visualizing Structure and Transitions for Biological Data Exploration. bioRxiv, page 120378, April 2019. [Google Scholar]
- [17].Pottmann Helmut, Wallner Johannes, Yang Yong-Liang, Lai Yu-Kun, and Hu Shi-Min. Principal curvatures from the integral invariant viewpoint. Computer Aided Geometric Design, 24(8):428–442, 2007. [Google Scholar]
- [18].Sha Ningyu, Yan Ming, and Lin Youzuo. Efficient seismic denoising techniques using robust principal component analysis. In SEG Technical Program Expanded Abstracts 2019, pages 2543–2547. Society of Exploration Geophysicists, 2019. [Google Scholar]
- [19].Tong Wai-Shun and Tang Chi-Keung. Robust estimation of adaptive tensors of curvature by tensor voting. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 27(3):434–449, 2005. [DOI] [PubMed] [Google Scholar]
- [20].Vershynin Roman. High-dimensional probability: An introduction with applications in data science, volume 47. Cambridge University Press, 2018. [Google Scholar]
- [21].Tang Zhigang, Yang Jun, and Yang Bingru. A new Outlier detection algorithm based on Manifold Learning. In 2010 Chinese Control and Decision Conference, pages 452–457, May 2010. [Google Scholar]