Significance
Manifold fitting, a crucial challenge in nonlinear data analysis, holds immense potential for efficient and accurate modeling. However, existing methods struggle to balance accuracy and computational efficiency. In this study, we harness a generative adversarial network to model the relationship between the low-dimensional latent space and the high-dimensional ambient space, mirroring the Riemannian exponential and logarithmic maps, for robust manifold analysis. Our extensive simulations and real data tests confirm our method’s ability to capture complex manifold structures in high-dimensional data. This approach has profound implications in statistics and computer science, especially for dimensionality reduction and processing non-Euclidean data. By addressing previous methods’ limitations, our research paves the way for enhanced data analysis and offers valuable insights for diverse applications in the scientific community.
Keywords: manifold fitting, deep generative neural network, CycleGAN
Abstract
Manifold fitting, which offers substantial potential for efficient and accurate modeling, poses a critical challenge in nonlinear data analysis. This study presents an approach that employs neural networks to fit the latent manifold. Leveraging the generative adversarial framework, this method learns smooth mappings between low-dimensional latent space and high-dimensional ambient space, echoing the Riemannian exponential and logarithmic maps. The well-trained neural networks provide estimations for the latent manifold, facilitate data projection onto the manifold, and even generate data points that reside directly within the manifold. Through an extensive series of simulation studies and real data experiments, we demonstrate the effectiveness and accuracy of our approach in capturing the inherent structure of the underlying manifold within the ambient space data. Notably, our method overcomes the computational efficiency limitations of previous approaches and offers control over the dimensionality and smoothness of the resulting manifold. This advancement holds significant potential in the fields of statistics and computer science. The seamless integration of powerful neural network architectures with generative adversarial techniques unlocks possibilities for manifold fitting, thereby enhancing data analysis. The implications of our findings span diverse applications, from dimensionality reduction and data visualization to generating authentic data. Collectively, our research paves the way for future advancements in nonlinear data analysis and offers a beacon for subsequent scholarly pursuits.
Numerous prevailing statistical methods operate under the assumption of linear dependence among their features. While these methods offer the advantage of simplifying data representation, they entail challenges in accurately capturing more complex patterns, particularly in high-dimensional scenarios where data tend to reside close to low-dimensional manifolds. To overcome these limitations, the integration of more sophisticated manifold-learning techniques becomes imperative.
Manifold-learning techniques, as classified by Yao et al. (1), encompass three distinct categories: manifold embedding, manifold denoising, and manifold fitting. The manifold-embedding methods aim to uncover a low-dimensional representation of high-dimensional datasets while preserving point-wise distances (2–7). In contrast, manifold-denoising methods target the identification and mitigation of outliers within data distributions around low-dimensional manifolds (8–12). However, despite their notable contributions to dimension reduction and various other fields, many of these techniques suffer from a dearth of geometric detail and a lack of robust theoretical foundations (1).
Manifold fitting, the third category, is a fundamental task that involves reconstructing a smooth manifold to accurately approximate the geometry and topology of a hidden low-dimensional manifold. Noteworthy contributions by Genovese et al. have focused on the minimax risk of manifold fitting under Hausdorff distance, providing valuable insights into effectively handling noise and improving smoothness (13, 14). Furthermore, the utilization of the ridge of the empirical distribution as a proxy has been explored to mitigate the sample size requirements (15–17). However, as noted by Dunson and Wu (18), these kernel-based methodologies overlook the intrinsic geometry of the domain and lead to sub-optimal outcomes with convergence rates dependent on the ambient dimensionality rather than on the intrinsic dimensionality. To address this limitation, several algorithms, including those of refs. 17, 19, 20, and 21, incorporate smooth functions that capture spatial properties and represent the output manifold as their root set or ridge, while other methods, such as those of refs. 22 and 23, utilize local covariance information to project sample points onto low-dimensional structures. However, these algorithms often entail high computational costs, necessitating further efforts to improve efficiency and reduce the computational complexity of manifold fitting tasks.
The concept of smooth manifolds is also fundamental in machine learning, especially in computer vision. Dating back to the era of basic image classification, it has illuminated the intricate relationship between images and their defining features. Recently, generative models such as Autoencoders (24) and Generative Adversarial Networks [GANs, (25)], along with their numerous variations, have garnered widespread attention, all underpinned by the manifold hypothesis. These techniques have made considerable strides in navigating distributions adjacent to manifolds, with some even venturing into the domain of manifold denoising within image spaces. Nevertheless, most of them focus on the efficiency of noise reduction, the plausibility of synthesized images, and the results induced by feature modulation. In contrast, the task of reconstructing the intrinsic structures of latent manifolds has been comparatively neglected and represents an essential avenue for future research efforts.
To exploit the remarkable potential of neural networks in fitting smooth functions describing the latent manifold, we propose a manifold fitting method based on the generative adversarial framework. This approach is inspired by recent advances in deep generative neural networks, density estimators (26, 27), and manifold estimators (1, 20) and offers several key advantages.
First, utilizing deep neural networks, we can accurately fit smooth functions and generate image sets that provide more precise estimations of the latent manifold, ensuring control over dimensionality and smoothness. Second, our proposed method offers intuitive mapping functions between low-dimensional latent space and high-dimensional ambient space, echoing the Riemannian exponential and logarithmic maps, enabling seamless projection of new data points onto the manifold. Last, by establishing mappings between spaces, we gain insights into point-wise relationships along the manifold, laying the groundwork for further research. In what follows, we will present our model and substantiate its outstanding performance and broad applicability through extensive experimentation.
Methodology
Our framework centers on viewing the observed distribution as the convolution of a distribution defined on a smooth manifold and a noise distribution within the ambient space. Given that any smooth manifold admits a complete Riemannian metric (as any manifold embeds into some Euclidean space as a closed subset according to the Whitney embedding theorem), the exponential map defined at any chosen point on a connected smooth manifold yields a smooth and surjective map to the manifold. In other words, fitting a manifold can be regarded as learning a smooth function that maps from a low-dimensional Euclidean space to a high-dimensional Euclidean space, a task in which neural networks excel.
To fit such functions and ensure their near-bijective nature, we leverage the fundamental framework of CycleGAN (26) to train the two generators as the exponential map and logarithmic map at a specific point on the latent manifold.
Model and Assumption.
Assume there are two domains 𝒳 and 𝒴, with dimensionality d and D respectively, where d < D. We focus on a random vector Y ∈ 𝒴, which can be represented as the sum of two components: Z and ξ. Here, Z is an unobserved random vector that follows a distribution ω supported on the latent manifold ℳ. Additionally, ξ represents the observation noise in ambient space, which is independent of Z and has a SD σ. Consequently, the distribution of Y can be viewed as the convolution ν = ω ∗ φ_σ, whose density at point y can be expressed as ν(y) = ∫_ℳ φ_σ(y − z) dω(z).
We assume that 𝒴_obs = {y_i}_{i=1}^N is the collection of observed data points, each of the form y_i = z_i + ξ_i for i = 1, …, N. Here, 𝒴_obs corresponds to N independent and identically distributed (iid) realizations of Y. Our objective is to estimate the latent manifold ℳ based on the observations 𝒴_obs, and we make the following main assumptions: 1) The ambient space is the D-dimensional Euclidean space with the standard norm, i.e., 𝒴 = ℝ^D. 2) The noise distribution is a Gaussian distribution supported on ℝ^D with density φ_σ(ξ) = (2πσ²)^{−D/2} exp(−‖ξ‖²/(2σ²)). 3) The latent manifold ℳ is a twice-differentiable and compact d-dimensional sub-manifold, embedded in ℝ^D. 4) The distribution ω is a uniform distribution, with respect to the d-dimensional Hausdorff measure, on ℳ. 5) The intrinsic dimension d and the noise SD σ are both known. These assumptions are inherited from previous theoretical works and are subject to potential relaxation. A more detailed discussion can be found in SI Appendix.
Our target is to learn smooth mappings between the domains 𝒳 and 𝒴. The procedure of our method is depicted in Fig. 1. Our model comprises two generators: G_y : 𝒳 → 𝒴 and G_x : 𝒴 → 𝒳. During the forward propagation step, we collect a batch of real data points x and y. To generate synthetic data in domain 𝒴, denoted as ỹ, we introduce Gaussian noise to G_y(x), that is, ỹ = G_y(x) + ξ. In contrast, in domain 𝒳, synthetic data x̃ are directly obtained through x̃ = G_x(y). Subsequently, we calculate the recovered data: x̂ = G_x(ỹ) and ŷ = G_y(x̃). In order to differentiate between real and synthetic data points, we introduce two adversarial discriminators, D_x and D_y, aiming to distinguish between {x, x̃} and {y, ỹ}, respectively. Additionally, we incorporate a sub-module F performing a pseudo manifold fitting step before passing data to D_y, which helps in aligning the data points with the latent manifold and expedites the training process (1). During the training of the neural network modules, our objective function incorporates two primary components: adversarial losses (25) and cycle consistency loss (26). These terms serve distinct purposes to ensure effective learning and coherence between the two domains.
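For concreteness, the forward propagation step can be sketched as follows in PyTorch; the function and variable names (forward_pass, sigma) are ours, and the released implementation may organize this step differently.

```python
import torch

def forward_pass(G_x, G_y, x, y, sigma):
    """One forward propagation step (a sketch; names are ours).

    x : batch of latent points in the d-dimensional domain, shape (m, d)
    y : batch of observed points in the D-dimensional domain, shape (n, D)
    sigma : noise SD added to G_y(x) to form the synthetic observations
    """
    gy_x = G_y(x)
    y_tilde = gy_x + sigma * torch.randn_like(gy_x)  # synthetic data in the ambient domain
    x_tilde = G_x(y)                                 # synthetic data in the latent domain
    x_hat = G_x(y_tilde)                             # recovered latent points
    y_hat = G_y(x_tilde)                             # recovered ambient points
    return y_tilde, x_tilde, x_hat, y_hat
```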
Fig. 1.
The schematic representation of our framework. The shaded rectangles depict the two domains, while the circular components symbolize the individual sub-networks. The solid arrows indicate the transfer paths within the generators, while the dashed arrows denote the paths for computing the loss function.
Adversarial Loss.
Adversarial loss, inspired by ref. 25, is employed to align the latent distributions in the two domains. By using discriminators D_x and D_y, we adjust the generated samples, x̃ and ỹ, to resemble the real samples, x and y, respectively. This adversarial training strategy enables the networks to capture the underlying characteristics and statistical properties of the target domain. To stabilize the training and generate highly credible results, we adopt the least square GAN loss proposed in ref. 28. The loss function of G_y and its discriminator D_y can be expressed as
L_adv(G_y, D_y) = (1/n) Σ_{j=1}^{n} [D_y(F(y_j)) − 1]² + (1/m) Σ_{i=1}^{m} [D_y(F(ỹ_i))]²,
where m and n are the batched sample sizes from spaces 𝒳 and 𝒴, respectively. By removing the sub-module F, we introduce a similar adversarial loss for G_x and D_x as well, i.e., L_adv(G_x, D_x).
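A minimal sketch of the least-squares adversarial terms for one direction is given below; the split into separate discriminator and generator objectives and the batch-mean form are standard LSGAN practice and are assumptions here rather than details taken from the released code.

```python
def lsgan_losses(D, real, fake):
    """Least-squares adversarial terms for one direction (a sketch of the loss in ref. 28).

    `real` and `fake` are batches that have already passed through any preprocessing,
    e.g., the manifold fitting sub-module F in front of D_y.
    """
    # Discriminator: push scores of real samples toward 1 and of synthetic ones toward 0.
    d_loss = ((D(real) - 1) ** 2).mean() + (D(fake.detach()) ** 2).mean()
    # Generator: push scores of synthetic samples toward 1.
    g_loss = ((D(fake) - 1) ** 2).mean()
    return d_loss, g_loss
```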
Cycle Consistency Loss.
Although adversarial training has the potential to learn mappings that yield outputs distributed identically to the target, a network with sufficient capacity can map the same input set to various random permutations of the target set. Consequently, any of the acquired mappings can induce an output distribution that aligns with the target distribution (26). To further reduce the space of possible mappings, we introduce a cycle consistency loss term:
L_cyc(G_x, G_y) = (1/m) Σ_{i=1}^{m} ‖x̂_i − x_i‖₁ + (1/n) Σ_{j=1}^{n} ‖ŷ_j − y_j‖₁,
with ‖·‖₁ being the ℓ1 norm. By controlling the cycle consistency loss, we can achieve additional objectives such as noise reduction and interpolation, as demonstrated in the sections below.
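Under that assumption, the cycle consistency term reduces to a pair of batch-averaged ℓ1 distances, as in the sketch below.

```python
def cycle_consistency_loss(x, x_hat, y, y_hat):
    """Batch-averaged L1 cycle consistency term: each x should survive the round
    trip through G_y and back, and likewise for y (a sketch)."""
    return (x_hat - x).abs().mean() + (y_hat - y).abs().mean()
```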
Full Objective.
By combining these loss terms, we formulate the comprehensive objective for the model as follows:
L(G_x, G_y, D_x, D_y) = L_adv(G_y, D_y) + L_adv(G_x, D_x) + λ L_cyc(G_x, G_y),
where λ is a nonnegative parameter that strikes a balance between the two types of losses. The overall optimization problem in this study can be expressed as:
G_x*, G_y* = arg min_{G_x, G_y} max_{D_x, D_y} L(G_x, G_y, D_x, D_y). [1]
After an iterative model training process, we obtain Ĝ_x and Ĝ_y as estimators for the optimal mappings. Moreover, by controlling the distribution on 𝒳, we can achieve the following objectives: 1) Manifold estimating: Ĝ_y and Ĝ_x estimate the Riemannian exponential and logarithmic maps, correspondingly. The image set Ĝ_y(𝒳) serves as an estimator for ℳ, with its smoothness and dimensionality easily controllable. 2) Noise canceling: The composite mapping Ĝ_y ∘ Ĝ_x projects any observed point y to Ĝ_y(Ĝ_x(y)), which is less affected by noise. 3) Nonlinear interpolating: For points y_1 and y_2, we can study the curves between them by applying Ĝ_y to interior points between Ĝ_x(y_1) and Ĝ_x(y_2).
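As a sketch of how the trained generators are used afterward, the helpers below illustrate manifold sampling and the noise-canceling projection; the uniform latent support used here is an illustrative assumption.

```python
import torch

@torch.no_grad()
def sample_manifold(G_y, n, d):
    """Manifold estimating: draw new points on the fitted manifold as the image of
    latent samples under G_y. The uniform support on [0, 1]^d is an assumption."""
    return G_y(torch.rand(n, d))

@torch.no_grad()
def project(G_x, G_y, y):
    """Noise canceling: map an observed point (or batch) onto the fitted manifold."""
    return G_y(G_x(y))
```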
Implementation
Network Architecture.
Within our framework, the individual sub-networks can be adapted to specific needs, provided the input and output dimensions match the data. In practice, we can tailor the model to a size that, over extended epochs, may lead to overfitting, and we subsequently select the model weights used for inference based on the loss functions observed during training. A concise elaboration on this is available in SI Appendix, Text and Figs. S2 and S3.
To accommodate the intricate nature of subsequent simulation experiments, we have devised several pre-defined models. These models are available in our GitHub repository (https://github.com/zhigang-yao/MFCGAN). In the vector space cases, both the generators and discriminators consist of fully connected neural networks, featuring hidden layers with a width of 100 and a depth of 9. Conversely, for data represented in image space, we employ a fully connected neural network as the generator, incorporating hidden layers of 400 in width and 15 in depth, and a 3-layer PatchGAN discriminator (29) as the discriminator. All fully connected neural networks use the rectified linear unit [ReLU (30)] activation function between linear layers, without incorporating Dropout.
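A minimal sketch of the fully connected backbone described above; the layer counts and widths follow the text, while everything else (initialization, output activation) is left at PyTorch defaults and may differ from the released models.

```python
import torch.nn as nn

def make_mlp(in_dim, out_dim, width=100, depth=9):
    """Fully connected backbone for the vector-space generators and discriminators:
    `depth` hidden layers of width `width` with ReLU activations and no Dropout;
    the final layer is linear (a sketch of the described design)."""
    layers, prev = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(prev, width), nn.ReLU()]
        prev = width
    layers.append(nn.Linear(prev, out_dim))
    return nn.Sequential(*layers)

# Example for a d = 1 latent space and a D = 2 ambient space (the circle case).
G_y, G_x = make_mlp(1, 2), make_mlp(2, 1)
D_y, D_x = make_mlp(2, 1), make_mlp(1, 1)
```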
To facilitate the sub-module F, we adhere to the simplified algorithm outlined in ref. 1. For a more complete formulation, please refer to SI Appendix. After initializing the manifold fitting sub-module with the sample set and three radii, namely r_0, r_1, and r_2, we proceed with the following steps for a given input data point y. Initially, we compute the weighted average of the samples within a radius of r_0 around y, denoting it as μ(y). Subsequently, leveraging the directionality of y − μ(y), we construct a hypercylinder with radii r_1 and r_2. The weighted average of the samples within this hypercylinder serves as the output of the manifold fitting sub-module, namely F(y). By shifting points in close proximity to the manifold toward the manifold itself, this sub-module substantially boosts the discriminator’s ability to discern crucial distinctions, thereby expediting the training process and ameliorating the overall performance of the generator.
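The sub-module can be sketched as below; the Gaussian-type weights, the centering of the hypercylinder at y, and the radius names r0, r1, r2 are our illustrative choices, and the exact formulation follows ref. 1 and SI Appendix.

```python
import torch

def pseudo_manifold_fit(y, samples, r0, r1, r2):
    """Pseudo manifold fitting step F(y), sketched after the simplified algorithm of
    ref. 1. `samples` is the (N, D) observed sample set; weights and the cylinder
    construction are illustrative assumptions, not the exact released choices."""
    diff = samples - y
    dist = diff.norm(dim=1)

    # Step 1: weighted average of the samples within radius r0 of y.
    w0 = torch.exp(-dist ** 2 / r0 ** 2) * (dist < r0)
    mu = (w0[:, None] * samples).sum(0) / w0.sum()

    # Step 2: hypercylinder aligned with the direction between mu and y.
    u = (y - mu) / (y - mu).norm()
    along = diff @ u                                   # displacement along u
    perp = (diff - along[:, None] * u).norm(dim=1)     # displacement orthogonal to u
    inside = (along.abs() < r1) & (perp < r2)

    # Output: weighted average of the samples inside the cylinder.
    w1 = torch.exp(-perp ** 2 / r2 ** 2) * inside
    return (w1[:, None] * samples).sum(0) / w1.sum()
```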
Training Details.
To address the optimization problem outlined in Eq. 1, we employ the Adam optimizer (31). Initially, the optimizer trains with a predetermined learning rate, which subsequently decreases in a linear fashion after a specified number of epochs. In all subsequent investigations, the distribution on 𝒳 is consistently set as a d-dimensional uniform distribution.
To further enhance the training effectiveness of the generator and achieve the desired manifold fitting objective, we use two supplementary techniques. First, we dissociate the noise component from ỹ: during each forward step, we create several replicated instances of each G_y(x) and incorporate iid noise into them. By adjusting the associated loss function accordingly, we amplify the representational capacity of G_y with regard to the manifold. Second, within each optimization iteration, we initially hold part of the sub-networks fixed and perform 20 optimization steps for the remaining ones; we then retain the learned parameters while executing a single optimization step for the sub-networks that were previously held fixed. This procedural approach produces theoretical advantages by facilitating the training process, ultimately leading to improved performance.
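A sketch of the optimizer and linear learning-rate decay used in training; the learning rate, epoch counts, and betas shown are illustrative values rather than the settings used in the paper, and the 20-to-1 alternation between the two groups of sub-networks would wrap the resulting optimizers.

```python
from torch.optim import Adam
from torch.optim.lr_scheduler import LambdaLR

def build_optimizers(G_x, G_y, D_x, D_y, lr=2e-4, n_epochs=100, decay_start=50):
    """Adam optimizers with a constant learning rate that decays linearly to zero
    after `decay_start` epochs; lr, epoch counts, and betas are illustrative."""
    opt_G = Adam(list(G_x.parameters()) + list(G_y.parameters()), lr=lr, betas=(0.5, 0.999))
    opt_D = Adam(list(D_x.parameters()) + list(D_y.parameters()), lr=lr, betas=(0.5, 0.999))

    def linear_decay(epoch):
        if epoch < decay_start:
            return 1.0
        return max(0.0, 1.0 - (epoch - decay_start) / (n_epochs - decay_start))

    return opt_G, opt_D, LambdaLR(opt_G, linear_decay), LambdaLR(opt_D, linear_decay)
```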
Results
We initiate our analysis with a comparative assessment of our manifold fitting method, termed MFCGAN, juxtaposed against contemporary neural network-based techniques and established manifold fitting methods across several simulated vector-valued datasets. Subsequently, we underscore the adaptability and relevance of our algorithm by highlighting its performance across diverse image space domains. Owing to space constraints, results for additional settings and further figures are detailed in SI Appendix, Figs. S4–S6.
Simulation in Euclidean Spaces.
To enable a thorough comparison of different models in Euclidean space, we have devised three distinct datasets that exhibit diverse geometries and dimensionalities.
Circle: Sample an angle θ_i and a noise vector ξ_i for i = 1, …, N, where ξ_i follows the two-dimensional standard Gaussian distribution. The sample points are obtained by mapping each θ_i to the corresponding point on a circle and adding the noise σξ_i.
Involute: Sample θ_i and ξ_i for i = 1, …, N as in the circle case. The sample points are obtained by mapping each θ_i to the corresponding point on an involute of a circle and adding the noise σξ_i.
Torus: Sample two angles θ_i and φ_i, together with a three-dimensional noise vector ξ_i, for i = 1, …, N. The dataset is then obtained by mapping each pair (θ_i, φ_i) to the corresponding point on a torus and adding the noise σξ_i.
By following these steps, the sample set 𝒴_obs can be obtained. Throughout all subsequent experiments, we maintain a consistent sample size N. Furthermore, to meet the criteria of MFCGAN, we draw additional data points, uniformly distributed across their respective intrinsic dimensions, serving as samples from 𝒳. A minimal data-generation sketch, with illustrative parameters standing in for the exact settings, follows.
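The sketch below assumes a unit circle, the involute of a unit circle, and a torus with radii 2 and 1; the true radii, angle ranges, sample size, and noise level used in the paper may differ.

```python
import numpy as np

def make_dataset(name, n=10_000, sigma=0.05, seed=0):
    """Noisy samples around a circle, an involute, or a torus; all radii, ranges,
    the sample size, and the noise level are illustrative stand-ins."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0, 2 * np.pi, n)
    if name == "circle":                    # unit circle in R^2
        z = np.stack([np.cos(t), np.sin(t)], axis=1)
    elif name == "involute":                # involute of the unit circle in R^2
        z = np.stack([np.cos(t) + t * np.sin(t),
                      np.sin(t) - t * np.cos(t)], axis=1)
    elif name == "torus":                   # torus in R^3 with radii R = 2, r = 1
        s = rng.uniform(0, 2 * np.pi, n)
        z = np.stack([(2 + np.cos(t)) * np.cos(s),
                      (2 + np.cos(t)) * np.sin(s),
                      np.sin(t)], axis=1)
    else:
        raise ValueError(name)
    return z + sigma * rng.standard_normal(z.shape)

# Latent samples for MFCGAN: uniform points of the matching intrinsic dimension.
x_latent = np.random.default_rng(1).uniform(0, 1, size=(10_000, 1))
```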
Comparison with neural network-based methods.
To conduct a thorough comparison, we implement and train several alternative neural network-based methods, including CycleGAN (26), denoising autoencoder [DAE, (32)], RoundTrip (27), Masked Autoencoder for Distribution Estimation [MADE, (33)], Masked Autoregressive Flow [MAF, (34)], and real-valued nonvolume preserving transformation method [RealNVP, (35)]. We implement the CycleGAN and DAE methods independently, ensuring they share the same network architecture, hyperparameters, and tuning parameters as the proposed MFCGAN model. We utilize version 2.0.1 of RoundTrip, available on the Python Package Index; during training, we adjust only the dimensions of the domains and the number of iterations in the default parameter list. For the other methods based on normalizing flows, we adopt an implementation obtained from GitHub (downloaded as committed on Jan 22, 2020) at https://github.com/kamenbliznashki/normalizing_flows/tree/master; during training, we specify the number of epochs and enable the no_batch_norm option. Since these methods were originally designed for density estimation, we make the necessary code modifications to save generated samples from the estimated distributions for all four methods, ensuring compatibility with our evaluation framework.
Following extensive training on these datasets, we employ each of the aforementioned models, including MFCGAN, to generate additional sample points. For both MFCGAN and CycleGAN, we employ uniformly generated random samples as inputs to their respective generators G_y. In the case of DAE, we utilize its encoder–decoder architecture to generate denoised data points. With RoundTrip, we leverage its integrated sampler to supply inputs to its generator. For the other methods under consideration, we retain the intermediate samples produced during the estimation phase for subsequent comparison.
To provide a clear visual representation of the generated samples from each algorithm, we present scatter plots of these samples in Fig. 2, where only the 2D projection is plotted for the 3D torus case. It is evident that the points generated by MFCGAN, exhibiting a smooth distribution, closely fit the underlying latent manifold. This outcome underscores the effectiveness of our method in achieving manifold fitting objectives, wherein the resulting estimator manifests as a smooth manifold, particularly in these simulated scenarios. In comparison to the other methods, MFCGAN exhibits the ability to capture and identify more intricate details of the manifold within the generated distribution. This observation suggests that MFCGAN is a promising approach for applications in more complex tasks.
Fig. 2.
Comparative scatter plots in Euclidean space for three simulation cases. Each plot presents the input training set or newly generated sample sets from MFCGAN, CycleGAN, RoundTrip, MADE, MAF, and RealNVP, alongside denoised datasets obtained through DAE.
In the case of the involute dataset (columns 2 and 5 in Fig. 2), MFCGAN and CycleGAN generate numerous additional lines. The extra lines can be attributed to the fact that the latent involute manifold possesses two boundary points, whereas our approach tends to produce a closed image set G_y(𝒳). Consequently, the optimization process sacrifices certain low-probability regions (the extra lines) to reconnect the head and tail of the output, thus preserving the closed structure.
To provide a quantitative assessment of the output generated by each method, we calculate the distances from sample points to the latent manifold and compile the results in Table 1. The table includes the results of all seven methods, together with those of the training data for reference. We employ three primary metrics: the mean distance, the 95th percentile distance, and the percentage of samples beyond a fixed threshold.
Table 1.
Summary of the distance from the sample points to the latent manifold with respect to seven methods in three simulated cases
| Method | Metric | Circle | Involute | Torus |
|---|---|---|---|---|
| Input | Mean (SD) | 8.13 (6.14) | 8.03 (6.04) | 7.98 (6.04) |
| | 95% quantile | 0.02 | 0.02 | 0.02 |
| | Percent | 0.42% | 0.33% | 0.22% |
| MFCGAN | Mean (SD) | 5.66 (4.44) | 5.12 (0.02) | 2.57 (9.10) |
| | 95% quantile | 1.60 | 0.01 | 5.51 |
| | Percent | 0.00% | 3.51% | 0.30% |
| CycleGAN | Mean (SD) | 1.63 (1.08) | 5.75 (0.01) | 3.79 (0.01) |
| | 95% quantile | 3.65 | 0.01 | 7.66 |
| | Percent | 0.00% | 2.32% | 1.06% |
| DAE | Mean (SD) | 1.31 (9.09) | 0.03 (0.05) | 0.02 (0.03) |
| | 95% quantile | 3.04 | 0.15 | 0.07 |
| | Percent | 0.00% | 23.27% | 23.90% |
| RoundTrip | Mean (SD) | 0.02 (8.06) | 0.02 (0.02) | 0.02 (0.01) |
| | 95% quantile | 0.03 | 0.08 | 0.04 |
| | Percent | 12.10% | 15.30% | 13.14% |
| MADE | Mean (SD) | 0.37 (0.30) | 0.09 (0.08) | 0.19 (0.18) |
| | 95% quantile | 0.88 | 0.17 | 0.48 |
| | Percent | 93.57% | 79.86% | 87.75% |
| MAF | Mean (SD) | 0.03 (0.06) | 0.08 (0.09) | 0.09 (0.10) |
| | 95% quantile | 0.13 | 0.16 | 0.32 |
| | Percent | 23.55% | 71.16% | 59.21% |
| RealNVP | Mean (SD) | 0.03 (0.09) | 0.07 (0.14) | 0.08 (0.14) |
| | 95% quantile | 0.14 | 0.15 | 0.32 |
| | Percent | 13.45% | 54.67% | 46.87% |
Upon inspection of the table, it is evident that the output of MFCGAN exhibits closer proximity to the latent manifold compared to the training data. This finding once again highlights the ability of MFCGAN to obtain accurate estimates of the manifold structure. Moreover, for manifolds with fixed curvature, the estimation error from MFCGAN may be a higher-order term of σ, although further investigation is necessary to validate this claim due to the absence of a suitable metric.
Notably, the results of MFCGAN significantly outperform the other methods in terms of proximity to the latent manifold. This disparity is reasonable given that density estimation is the primary objective of the other methods. However, it is worth noting that MFCGAN lags behind the other methods in terms of the tail distribution, which is a meaningful indicator of its ability to fit well with high probability. This aspect warrants further investigation in subsequent research endeavors.
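For reference, the three summaries reported in Table 1 can be computed as in the sketch below; the threshold value is illustrative, and the closed-form distance shown applies to the circle case only.

```python
import numpy as np

def distance_summary(points, dist_to_manifold, threshold=0.1):
    """Table 1-style summaries: mean (SD) distance, 95th percentile, and the share
    of samples farther than `threshold` (an illustrative value) from the manifold."""
    d = dist_to_manifold(points)
    return {"mean": d.mean(), "sd": d.std(),
            "q95": np.quantile(d, 0.95),
            "percent": 100.0 * (d > threshold).mean()}

# Closed-form distance for the circle case: |‖y‖ − 1| for a unit circle.
circle_distance = lambda pts: np.abs(np.linalg.norm(pts, axis=1) - 1.0)
```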
Comparison with established manifold fitting methods.
Numerous algorithms for manifold fitting have been proposed in the literature, as highlighted by works such as ysl23 (1), yx19 (20), and cf18 (19). To benchmark our method against these, we conduct simulations across the three distinct scenarios using the algorithms hosted at https://github.com/zhigang-yao/manifold-fitting. Fig. 3 showcases the results of the torus case, with additional outcomes of the other two simulation cases available in SI Appendix. The parameters for these simulations align with those detailed in the preceding subsection. For MFCGAN, all samples in 𝒴_obs are mapped onto the identified manifold with the composite transformation G_y ∘ G_x, applied either directly (MFCGAN w/o F) or after the pseudo manifold fitting sub-module F (MFCGAN w/ F). Conversely, traditional manifold fitting methods project a subset of samples onto their respective fitted manifolds, a choice driven by computational considerations.
Fig. 3.
Comparison of MFCGAN with other manifold fitting algorithms in the torus case. Black points represent (A) the noisy input data and (B–F) their projections onto the manifolds fitted by (B) MFCGAN w/o F, (C) MFCGAN w/ F, (D) ysl23, (E) yx19, and (F) cf18. Red points indicate the projections of the black points onto the latent manifold. (G) A boxplot showcasing the distances between corresponding red and black points.
In terms of fitting error, the traditional manifold fitting methods exhibit a marginally better performance than MFCGAN, an outcome that aligns with theoretical expectations. Given that these methods directly measure the manifold using the Hausdorff distance, a strong alignment of results is expected. MFCGAN, on the other hand, seeks to approximate the observed distribution with the generated one, based on specific divergence metrics. However, traditional approaches necessitate that query data remain proximate to the latent manifold, a constraint that can result in isolated points, as evident in panels (E)–(G) of Fig. 3. Such a limitation does not constrain MFCGAN.
From a computational vantage point, MFCGAN notably excels in its capacity to efficiently generate new data on the fitted manifold and project data points onto it, even when accounting for its training duration. In contrast, traditional manifold fitting methods, such as yx19 and cf18, demand rigorous iterative computations for point projection, with their computational requirements escalating as the dimensionality of the ambient space increases. Notably, in our simulation cases, yx19 and cf18 consume several minutes, sometimes extending to 10 min, to process merely 1,000 points. While the streamlined algorithm of ysl23 operates swiftly, it lacks assurances regarding dimensionality. In stark contrast, MFCGAN completes the projection, with exactly dimension d, for 10,000 points in a mere 10 s. This efficiency stems from MFCGAN’s innovative strategy of harnessing two generative networks to emulate exponential and logarithmic maps, positioning it as a potentially more adept solution for high-dimensional data scenarios.
In summary, while traditional manifold fitting methods have an edge in fitting accuracy, MFCGAN emerges as a compelling contender, especially when considering high-dimensional data and computational efficiency. The observed differences in fitting error between the two methodologies highlight avenues for future exploration, especially in the development of improved discriminators to refine the fitting error assessment of MFCGAN.
Simulation with Calabi–Yau Manifolds.
Calabi–Yau manifolds, as described by Calabi (36), represent a distinguished class of compact, complex Kähler manifolds characterized by a vanishing first Chern class. Their zero Ricci curvature renders them congruent with the theoretical models of the universe postulated by physicists. In this subsection, we explore a specific instance of Calabi–Yau manifolds: a reduced Fermat quartic, whose points satisfy:
z₁⁴ + z₂⁴ = 1,  (z₁, z₂) ∈ ℂ².  [2]
To construct the training set, we commence by creating a uniform mesh grid of points on the manifold specified by Eq. 2. Subsequently, we perturb both the real and imaginary components of each point by adding Gaussian noise with a fixed SD. This process yields the dataset 𝒴_obs. Fig. 4A depicts a 3D projection of the Fermat quartic alongside some points from 𝒴_obs. Additional information about data generation and projection is available in SI Appendix.
Fig. 4.
Fitting a Calabi–Yau manifold. (A) 3D projection of the manifold (surface) with samples from the noisy inputs (blue points) and their projections with MFCGAN (red points). (B) Boxplot showing the pseudo-bias of input data and outputs from neural network-based methods; the upper part is presented in log scale.
Given the computational challenges posed by the sample size, our comparison is limited to the neural network-based approaches. The model structure and the majority of hyperparameters remain consistent with those used in the previous subsection. However, there are adjustments in the training regimen for MFCGAN, CycleGAN, and DAE. Specifically, these models are trained with a constant learning rate for an initial 20 epochs, followed by an additional 40 epochs with a decreasing learning rate.
To evaluate the learning results, a subset of 𝒴_obs is projected onto the learned manifolds of MFCGAN, CycleGAN, and DAE. For the remaining density estimation methods, additional samples are generated in the same manner described before. Due to the complexity of calculating the distance from a point to the manifold, we use a pseudo-bias, defined as the absolute deviation of each point from the defining equation in Eq. 2.
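Assuming the reduced Fermat quartic takes the affine form given in Eq. 2, the pseudo-bias can be sketched as follows; the coordinate layout (real and imaginary parts stacked into four columns) is our assumption.

```python
import numpy as np

def pseudo_bias(points):
    """Pseudo-bias under the assumed affine form z1**4 + z2**4 = 1 of Eq. 2, with
    each row stored as (Re z1, Im z1, Re z2, Im z2)."""
    z1 = points[:, 0] + 1j * points[:, 1]
    z2 = points[:, 2] + 1j * points[:, 3]
    return np.abs(z1 ** 4 + z2 ** 4 - 1.0)  # deviation from the defining equation
```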
The pseudo-bias of input data and outputs from various methods are encapsulated in the boxplot shown in Fig. 4B. The primary segment of the box corresponding to MFCGAN exhibits the narrowest distribution range, indicating that the projections of sample points on the manifolds learned by MFCGAN have reduced pseudo-bias. This suggests that MFCGAN fits the latent Calabi–Yau manifold more accurately than other methods, implying its potential applicability to explore more intricate geometric structures.
Application in Images.
To provide examples of the potential applications of our approach, we conduct a series of experiments on image datasets. We initiate the process by constructing a basic graph from the density function of a Gaussian mixture model whose component means resemble an “L” shape, and we subsequently rotate the components in the plane in steps of 10°. A total of 36 resulting images are captured, each of which is replicated 15 times with white noise added, yielding a dataset of 540 images (as illustrated in Fig. 5A). In theory, these images are positioned around a 1-manifold, yet they are embedded in a much higher-dimensional pixel space. Our objective is to use MFCGAN to learn the features of this particular embedding.
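A hedged sketch of how such a training set could be rendered is given below; the image size, mixture parameters, and noise level are illustrative stand-ins for the settings used in the experiment.

```python
import numpy as np

def mixture_image(angle_deg, size=32, noise_sd=0.05, rng=None):
    """One training image: the density of a Gaussian mixture whose component means
    form an "L" shape, rotated by `angle_deg` in the plane, plus white noise.
    Image size, mixture parameters, and noise level are illustrative."""
    rng = rng or np.random.default_rng(0)
    a = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
    means = (rot @ np.array([[0.0, 0.0], [0.0, 0.6], [0.6, 0.0]]).T).T  # "L"-shaped means
    xs = np.linspace(-1, 1, size)
    gx, gy = np.meshgrid(xs, xs)
    grid = np.stack([gx, gy], axis=-1)
    img = sum(np.exp(-((grid - m) ** 2).sum(-1) / 0.1) for m in means)
    return img / img.max() + noise_sd * rng.standard_normal((size, size))

# 36 rotations x 15 noisy replicates = 540 images.
rng = np.random.default_rng(0)
dataset = np.stack([mixture_image(a, rng=rng) for _ in range(15) for a in range(0, 360, 10)])
```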
Fig. 5.
Results of MFCGAN on image data. (A) A group of random samples from the training set; (B) The noise reduction result from MFCGAN, obtained by passing the images in A through G_y ∘ G_x; (C) Nonlinear interpolation of the two boxed images in A, calculated by feeding linear combinations of G_x(y₁) and G_x(y₂) to G_y.
One potential application of MFCGAN is to remove noise from signals. As depicted in Fig. 5, we utilize the noisy images shown in Fig. 5A as input and pass them through the composite mapping G_y ∘ G_x to obtain cleaner images, as shown in Fig. 5B. Remarkably, the orientation of the graph in the output images remains largely consistent with that of the original, indicating that the cycle consistency loss component effectively contributes to the training process. In other words, we can employ G_y ∘ G_x as a projection function to map points in 𝒴 onto the latent manifold, thereby achieving an objective similar to noise reduction.
Another valuable application of MFCGAN is nonlinear interpolation. We select two graphs from Fig. 5A, labeled y₁ and y₂. By computing G_y(t·G_x(y₁) + (1 − t)·G_x(y₂)) for various values of t ∈ [0, 1], we can generate a series of images that lie between y₁ and y₂, as depicted in Fig. 5C. Notably, these interpolated images are not present in the training set, yet they effectively capture the geometric connection between y₁ and y₂, resulting in a smooth transition from y₁ to y₂. This outcome demonstrates that MFCGAN enables us not only to project points onto the manifold but also to construct a neighborhood of any given point y on the manifold by leveraging the information obtained from G_x(y). Such capabilities are of great importance for further statistical studies on non-Euclidean data.
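The interpolation itself is a one-liner over the trained generators, as sketched below; the step count is arbitrary.

```python
import torch

@torch.no_grad()
def interpolate(G_x, G_y, y1, y2, steps=8):
    """Nonlinear interpolation sketch: move linearly between G_x(y1) and G_x(y2)
    in the latent domain, then map back to image space with G_y."""
    z1, z2 = G_x(y1), G_x(y2)
    ts = torch.linspace(0.0, 1.0, steps).view(-1, *([1] * z1.dim()))
    return G_y(ts * z1 + (1 - ts) * z2)
```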
We also conducted experiments on two sets of real portrait photographs; however, only one result is included here due to space limitations, and further details can be found in SI Appendix. For this task, we use a black-and-white GIF featuring Charlie Chaplin, from which 34 black-and-white bitmap frames can be extracted. The training set consists of the odd-numbered frames, which undergo data augmentation through the addition of white noise 20 times per frame, while the even-numbered frames remain unchanged and serve as the test set. The neural network structures are consistent with the description provided in the previous section. Following 200 epochs of training the entire network, we evaluate our method by passing the test set through G_y ∘ G_x to assess its ability to capture manifold information not present in the training set.
Fig. 6 presents an intriguing result, showcasing frames 9 and 11 of the GIF on the Left- and Right-most panels, respectively, with frame 10 and its recovery depicted in the two Center panels. Notably, the recovered image is more than just a simple average of the previous and next frames, as evident from the angle of the cane and the shape of the legs. This suggests that MFCGAN effectively captures implicit information closer to frame 10 that is absent from the training data. However, the clarity of the recovered data is compromised due to the absence of image order labels and the limited data size during training. The model may sort the images in a different order than their temporal order. Additional details can be found in SI Appendix.
Fig. 6.
Results on real data: Projection of test image onto the learned manifold and comparison with the previous and next training frame.
Discussion
In this study, we introduce MFCGAN as an approach for manifold fitting, utilizing a modified framework derived from CycleGAN. In accordance with the existing theoretical framework, we address several modeling assumptions in our work. However, it is essential to reiterate that the assumptions related to uniform and Gaussian distributions, as well as the knowledge of d and σ, are subject to potential relaxation. Our proposed method effectively captures embedded manifold information in high-dimensional data, as demonstrated by extensive numerical simulations and real data experiments. Nonetheless, some areas require improvement: the current algorithm demands considerable storage space, and its training efficiency also needs to be enhanced. Furthermore, future research should focus on extending our computational approach to incorporate theoretical advancements in manifold fitting.
Supplementary Material
Appendix 01 (PDF)
Acknowledgments
Z.Y. has been supported by Singapore Ministry of Education Tier 2 grant (A-0008520-00-00, A-8001562-00-00) and Tier 1 grant (A-0004809-00-00, A8000987-00-00) at the National University of Singapore; J.S. is supported by Research Assistantship under grant A-0004826-00-00 at the National University of Singapore. Part of the work comes from Z.Y.’s collaboration with S.-T.Y. at the Center of Mathematical Sciences and Applications at Harvard University.
Author contributions
Z.Y. and S.-T.Y. designed research; Z.Y. and J.S. performed research; and Z.Y., J.S., and S.-T.Y. wrote the paper.
Competing interests
The authors declare no competing interest.
Footnotes
Reviewers: D.D., Duke University; and S.H., Georg August University of Göttingen.
Contributor Information
Zhigang Yao, Email: zhigang.yao@nus.edu.sg.
Shing-Tung Yau, Email: styau@tsinghua.edu.cn.
Data, Materials, and Software Availability
The PyTorch-based implementation, encompassing generators, discriminators, and manifold fitting sub-module of the model, along with the requisite code for data generation and comparison in simulations, is accessible via https://github.com/zhigang-yao/MFCGAN (37). All other data are included in the manuscript and/or SI Appendix.
References
1. Z. Yao, J. Su, B. Li, Manifold fitting: An invitation to statistics. arXiv [Preprint] (2023). http://arxiv.org/abs/2304.07680 (Accessed 15 June 2023).
2. Tenenbaum J. B., Silva Vd., Langford J. C., A global geometric framework for nonlinear dimensionality reduction. Science 290, 2319–2323 (2000).
3. Roweis S. T., Saul L. K., Nonlinear dimensionality reduction by locally linear embedding. Science 290, 2323–2326 (2000).
4. Donoho D. L., Grimes C., Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data. Proc. Natl. Acad. Sci. U.S.A. 100, 5591–5596 (2003).
5. Belkin M., Niyogi P., Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15, 1373–1396 (2003).
6. Zhang Z., Zha H., Principal manifolds and nonlinear dimensionality reduction via tangent space alignment. SIAM J. Sci. Comput. 26, 313–338 (2004).
7. L. McInnes, J. Healy, J. Melville, Umap: Uniform manifold approximation and projection for dimension reduction. arXiv [Preprint] (2018). http://arxiv.org/abs/1802.03426 (Accessed 12 May 2023).
8. S. Deutsch, A. Ortega, G. Medioni, “Manifold denoising based on spectral graph wavelets” in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Z. Ding, Ed. (IEEE, 2016), pp. 4673–4677.
9. Yang T., Meng J., Manifold fitting algorithm of noisy manifold data based on variable-scale spectral graph. Soft Comput. 27, 1–12 (2021).
10. S. Luo, W. Hu, “Differentiable manifold reconstruction for point cloud denoising” in Proceedings of the 28th ACM International Conference on Multimedia, C. W. Chen, Ed. (Association for Computing Machinery, Seattle, WA, 2020), pp. 1330–1338.
11. W. Wang, M. A. Carreira-Perpinán, “Manifold blurring mean shift algorithms for manifold denoising” in 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, L. Davis, Ed. (IEEE, 2010), pp. 1759–1766.
12. Sober B., Levin D., Manifold approximation by moving least-squares projection (MMLS). Construct. Approx. 52, 433–478 (2020).
13. Genovese C., Perone-Pacifico M., Verdinelli I., Wasserman L., Minimax manifold estimation. J. Mach. Learn. Res. 13, 1263–1291 (2012).
14. Genovese C. R., Perone-Pacifico M., Verdinelli I., Wasserman L., Manifold estimation and singular deconvolution under Hausdorff loss. Ann. Stat. 40, 941–963 (2012).
15. Genovese C. R., Perone-Pacifico M., Verdinelli I., Wasserman L., Nonparametric ridge estimation. Ann. Stat. 42, 1511–1545 (2014).
16. Chen Y. C., Genovese C. R., Wasserman L., Asymptotic theory for density ridges. Ann. Stat. 43, 1896–1928 (2015).
17. K. Mohammed, H. Narayanan, Manifold learning using kernel density estimation and local principal components analysis. arXiv [Preprint] (2017). http://arxiv.org/abs/1709.03615.
18. Dunson D. B., Wu H. T., Wu N., Graph based Gaussian processes on restricted domains. J. R. Stat. Soc. Ser. B: Stat. Methodol. 84, 414–439 (2022).
19. C. Fefferman, S. Ivanov, Y. Kurylev, M. Lassas, H. Narayanan, “Fitting a putative manifold to noisy data” in Conference On Learning Theory, S. Bubeck, V. Perchet, P. Rigollet, Eds. (PMLR, Stockholm, Sweden, 2018), pp. 688–720.
20. Z. Yao, Y. Xia, Manifold fitting under unbounded noise. arXiv [Preprint] (2019). http://arxiv.org/abs/1909.10228 (Accessed 13 January 2023).
21. C. Fefferman, S. Ivanov, M. Lassas, H. Narayanan, Fitting a manifold of large reach to noisy data. arXiv [Preprint] (2021). http://arxiv.org/abs/1910.05084 (Accessed 4 January 2022).
22. Z. Yao, Y. Xia, Z. Fan, Random fixed boundary flows. J. Am. Stat. Assoc. 1–22 (2023).
23. Panaretos V. M., Pham T., Yao Z., Principal flows. J. Am. Stat. Assoc. 109, 424–436 (2014).
24. D. E. Rumelhart, G. E. Hinton, R. J. Williams, “Learning internal representations by error propagation” in Parallel Distributed Processing (The MIT Press, 1986).
25. Goodfellow I., et al., Generative adversarial networks. Commun. ACM 63, 139–144 (2020).
26. J. Y. Zhu, T. Park, P. Isola, A. A. Efros, “Unpaired image-to-image translation using cycle-consistent adversarial networks” in Proceedings of the IEEE International Conference on Computer Vision (IEEE, Venice, Italy, 2017), pp. 2223–2232.
27. Liu Q., Xu J., Jiang R., Wong W. H., Density estimation using deep generative neural networks. Proc. Natl. Acad. Sci. U.S.A. 118, e2101344118 (2021).
28. X. Mao et al., “Least squares generative adversarial networks” in Proceedings of the IEEE International Conference on Computer Vision, K. Ikeuchi, Ed. (IEEE, Venice, Italy, 2017), pp. 2794–2802.
29. P. Isola, J. Y. Zhu, T. Zhou, A. A. Efros, “Image-to-image translation with conditional adversarial networks” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, R. Chellappa, Ed. (IEEE, Honolulu, HI, 2017), pp. 1125–1134.
30. A. F. Agarap, Deep learning using rectified linear units (ReLU). arXiv [Preprint] (2018). http://arxiv.org/abs/1803.08375 (Accessed 12 June 2023).
31. D. P. Kingma, J. Ba, Adam: A method for stochastic optimization. arXiv [Preprint] (2014). http://arxiv.org/abs/1412.6980 (Accessed 12 June 2023).
32. P. Vincent, H. Larochelle, Y. Bengio, P. A. Manzagol, “Extracting and composing robust features with denoising autoencoders” in Proceedings of the 25th International Conference on Machine Learning, W. W. Cohen, Ed. (Association for Computing Machinery, Helsinki, Finland, 2008), pp. 1096–1103.
33. M. Germain, K. Gregor, I. Murray, H. Larochelle, “Made: Masked autoencoder for distribution estimation” in International Conference on Machine Learning, F. Bach, D. Blei, Eds. (PMLR, 2015), pp. 881–889.
34. Papamakarios G., Pavlakou T., Murray I., Masked autoregressive flow for density estimation. Adv. Neural Inf. Process. Syst. 30, 2338–2347 (2017).
35. L. Dinh, J. Sohl-Dickstein, S. Bengio, Density estimation using real NVP. arXiv [Preprint] (2016). http://arxiv.org/abs/1605.08803 (Accessed 12 June 2023).
36. E. Calabi, “On Kähler manifolds with vanishing canonical class” in Algebraic Geometry and Topology. A Symposium in Honor of S Lefschetz, R. H. Fox, Ed. (Princeton University Press, Princeton, New Jersey, 1957), vol. 12, pp. 78–89.
37. Z. Yao, J. Su, S.-T. Yau, GitHub - zhigang-yao/MFCGAN. GitHub. https://github.com/zhigang-yao/MFCGAN. Deposited 7 December 2023.