Representation learning of RNA velocity reveals robust cell transitions

Chen Qiao; Yuanhua Huang

doi:10.1073/pnas.2105859118

2021 Dec 3;118(49):e2105859118. doi: 10.1073/pnas.2105859118

PMCID: PMC8670433 PMID: 34873054

Significance

The recently introduced RNA velocity methods, by leveraging the intrinsic RNA splicing process, have shown their unique capability of identifying the directionality of the cell differentiation trajectory. However, due to the minimal amount of unspliced RNA contents, the estimation of RNA velocity suffers from high noise and may result in less reliable trajectories. Here, we present Velocity Autoencoder (VeloAE), a tailored autoencoder to denoise RNA velocity for more accurate quantification of cell transitions. Through various biological systems, we demonstrate its effectiveness for correcting the inferred trajectory and its interpretability for linking the learned dimensions to underlying biological processes.

Keywords: single-cell RNA velocity, autoencoder, cellular transitions

Abstract

RNA velocity is a promising technique for quantifying cellular transitions from single-cell transcriptome experiments and revealing transient cellular dynamics among a heterogeneous cell population. However, the cell transitions estimated from high-dimensional RNA velocity are often unstable or inaccurate, partly due to the high technical noise and less informative projection. Here, we present Velocity Autoencoder (VeloAE), a tailored representation learning method, to learn a low-dimensional representation of RNA velocity on which cellular transitions can be robustly estimated. On various experimental datasets, we show that VeloAE can both accurately identify stimulation dynamics in time-series designs and effectively capture expected cellular differentiation in different biological systems. VeloAE, therefore, enhances the usefulness of RNA velocity for studying a wide range of biological processes.

Single-cell RNA sequencing (scRNA-seq), by probing the full transcriptome of many individual cells, has become a revolutionary tool for studying the dynamic processes of cells such as cell cycle, differentiation, and organogenesis (1). Numerous computational methods have been developed for cell trajectory inference over the past few years (2), covering various modern techniques, including principal graph fitting (3), diffusion map (4), Gaussian processes (5), and optimal transport (6). However, it is still highly challenging to automatically identify the directionality of the inferred trajectories. A fundamental reason is that all these methods can access only current-state transcriptomes, while indications of past and/or future states are lacking.

Besides the conventional use of mature messenger RNAs (mRNAs), most scRNA-seq protocols, including poly(A)-enriched ones, contain a nonnegligible proportion of nascent RNAs reflecting intrinsic RNA dynamics (7, 8). Such informative unspliced RNAs have been further formulated into the concept of RNA velocity, i.e., the time derivative of mature mRNAs, which is becoming instrumental to the identification of past and future states of genes and consequently the transition directionality between cells (8). Recently, this technique has been further extended to a full dynamical framework with a probabilistic setting (9). However, despite its increasing popularity and wide success, estimation of RNA velocities is often found to be less robust or even inaccurate for indicating cellular transitions (10). This is partly due to the inherently low content of unspliced RNAs and the high noise of scRNA-seq data and partly due to the lack of an effective approach for integrating the high dimensions of the transcriptome during cellular state projection. For the latter, one possible solution is to identify a set of dynamics-related genes, e.g., by manual curation (10) or by prioritization with supervised covariates (11).

Instead of selecting a subset of genes, in this study, we propose Velocity Autoencoder (VeloAE), a tailored representation learning method, to address the above challenges. A similar strategy of dimensionality reduction has been widely adopted in routine analysis of scRNA-seq data for both data denoising and computational efficiency, including principal component analysis (PCA), a latent variable model (12), autoencoder (13), and variational autoencoder with non-Gaussian noise (14, 15). Uniquely, VeloAE contributes to the robust estimation of cellular transitions by constructing a joint representation for both current mRNA states and velocity vectors. Thus, transition estimation in this dense low-dimensional space can avoid the sparsity and noise problems of the raw representations. This framework is not only highly capable of denoising the scRNA-seq count data (13), but also promising for learning informative representations for robust downstream analysis like cellular transition estimation.

Results

High-Level Description of VeloAE

Briefly, VeloAE is a principled autoencoder, a common choice of representation learning framework parameterized via neural networks (16). The encoder of the proposed framework consists of a conventional encoder, i.e., a multilayer perceptron (MLP) with a single hidden layer, and a graph convolutional network (GCN) module (i.e., cohort aggregation in Fig. 1; Methods). The GCN aims to smooth the preencoded latent states (output of the MLP) within neighboring cells. The neighborhood of each cell is predefined via an adjacency graph based on transcriptome similarity, e.g., a K-nearest neighbor graph by default. For the decoder, we introduce an attentive combination module, which was recently proposed for machine translation and soon popularized for a wide range of tasks (17, 18). This module aims to capture patterns from complete gene profiles for reconstructing the input, as well as strengthen the biological interpretation of the latent dimensions (Methods).
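As an illustration of the predefined neighborhood graph mentioned above, the following is a minimal sketch of a symmetric K-nearest neighbor adjacency built from Euclidean distances. The function name `knn_adjacency`, the distance measure, and the symmetrization rule are assumptions made for illustration; in practice such a graph is typically produced by the scRNA-seq preprocessing pipeline.

```python
import numpy as np

def knn_adjacency(X, k):
    """Symmetric binary K-nearest-neighbor adjacency without self-loops.

    X: (N, d) array of cell representations (hypothetical input, e.g. PCA scores).
    k: number of nearest neighbors per cell.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a cell is not its own neighbor here
    nbrs = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros((n, n), dtype=float)
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    # Assumption: keep an edge if either cell lists the other as a neighbor.
    return np.maximum(A, A.T)
```

Self-loops are left out here so that they can be added explicitly by whichever aggregation step consumes the graph.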

Fig. 1. — Overview of the VeloAE model with exemplified low-dimensional projection effects. Compared to a standard autoencoder, VeloAE has a cohort aggregation module with a GCN in the encoder and an attentive combination module as decoder. Once fitted, velocity can be jointly quantified in the lower-dimensional space of spliced and unspliced expressions. Examples in box: The cell transition probability $π_{i, j}$ and its directionality are smoothed and corrected from original space (*Left*) to a learned low-dimensional space (*Right*) for one cell to its neighboring cells (*Upper Row*) or for neighboring cells to a more consistent common differentiation direction (*Lower Row*).

The VeloAE model will be jointly fitted on the observed cell-by-gene count matrices for both spliced and unspliced RNAs and optionally for aggregated expression if not redundant. After fitting the coefficient $γ_{z}$ in the low-dimensional space via linear regression (i.e., the steady-state model of refs. 8 and 9) on the encoded unspliced reads $u_{z}$ by the spliced reads $s_{z}$ , the low-dimensional velocities $v_{z}$ can be computed following the definition formula of single-cell RNA velocity (Methods and Fig. 1). On this informative and denoised representation space, more consistent transition probabilities from source to target cells can be computed (see lower-row examples in Fig. 1 and transition calculation method Eq. 10). Due to the computing power of graphic processing units (GPUs), VeloAE can be fitted within around 10 min (20,000 epochs) on datasets of around 3,000 cells, and 40 to 100 min on larger datasets of 35,000 to 90,000 cells (SI Appendix, Table S3).
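The low-dimensional velocity step described above can be sketched as follows, assuming the steady-state relation $v_{z} = u_{z} - γ_{z} s_{z}$ with $γ_{z}$ fitted per latent dimension by least squares through the origin over all cells. The function name `latent_velocity` is hypothetical, and the no-intercept, all-cells fit is a simplifying assumption (steady-state implementations often fit on extreme quantiles only).

```python
import numpy as np

def latent_velocity(u_z, s_z):
    """Fit u_z ≈ gamma_z * s_z per latent dimension, then return the residual velocity.

    u_z, s_z: (N, d_z) encoded unspliced and spliced representations.
    Returns gamma_z with shape (d_z,) and v_z with shape (N, d_z).
    """
    u_z = np.asarray(u_z, dtype=float)
    s_z = np.asarray(s_z, dtype=float)
    num = (u_z * s_z).sum(axis=0)
    den = (s_z ** 2).sum(axis=0)
    # Guard against an all-zero latent dimension (gamma is then 0).
    gamma_z = num / np.where(den > 0, den, 1.0)
    v_z = u_z - gamma_z * s_z
    return gamma_z, v_z
```

With this fit, the velocity in each dimension is the regression residual, so positive values mark cells whose encoded unspliced signal exceeds the steady-state expectation.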

Moreover, for quantitative evaluation of velocity estimation, we additionally propose two metrics, cross-boundary direction correctness (CBDir) and in-cluster coherence (ICVCoh), for scoring the direction correctness and coherence of estimated velocities, between or within cell groups, respectively (see Evaluation Metrics for details). These metrics can complement the usual vague visual evaluation based on plotted velocity fields.
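To make the intent of the two metrics concrete, here is a rough sketch under stated assumptions: cosine similarity is used as the agreement measure, coherence compares a cell's velocity with those of its same-cluster neighbors, and direction correctness compares a source cell's velocity with the displacement toward its neighbors in the target cluster. The exact definitions are those given in Evaluation Metrics; all function names below are hypothetical.

```python
import numpy as np

def cosine(a, b, eps=1e-12):
    """Cosine similarity with a small constant to avoid division by zero."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def in_cluster_coherence(V, labels, neighbors, cluster):
    """Mean cosine similarity between velocities of neighboring cells in one cluster."""
    scores = []
    for i in np.flatnonzero(labels == cluster):
        same = [j for j in neighbors[i] if labels[j] == cluster]
        if same:
            scores.append(np.mean([cosine(V[i], V[j]) for j in same]))
    return float(np.mean(scores)) if scores else float("nan")

def cross_boundary_correctness(X, V, labels, neighbors, source, target):
    """Mean cosine similarity between source-cell velocities and displacements
    toward their neighbors in the target cluster (boundary cells only)."""
    scores = []
    for i in np.flatnonzero(labels == source):
        tgt = [j for j in neighbors[i] if labels[j] == target]
        if tgt:
            scores.append(np.mean([cosine(V[i], X[j] - X[i]) for j in tgt]))
    return float(np.mean(scores)) if scores else float("nan")
```

Under this reading, both scores lie in [-1, 1], and a negative cross-boundary score indicates velocities pointing away from the expected target cluster.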

VeloAE Corrects Cell Transitions in Time-Series Stimulations

To evaluate the performance of VeloAE in identifying cell transitions, we first applied it to a proof-of-principle design, where neurons are stimulated with KCl for 0, 15, 30, 60, and 120 min (10); hence the cellular transcriptomes are expected to transit along the dense stimulation time. We herein refer to this dataset as scNTseq, since it is obtained using the technology scNTseq proposed and analyzed in ref. 10. To assess whether RNA velocity is able to reconstruct the expected cell transitions, we estimated it from intronic and exonic reads by the software package scVelo (9) (Methods). However, we found the default RNA velocity in the original gene space struggles to identify the correct transition directions in the early time points for both scVelo’s stochastic mode (Fig. 2A) and its dynamical setting (SI Appendix, Fig. S1). We therefore ask whether most genes have the velocity at 0-min cells direct to the stimulated cells, by measuring the median velocity for 0-min cells (i.e., ${\bar{v}}_{0}$ ) and their median expression differences from 15-min cells [i.e., $δ (\bar{s}) = median (median (s_{15}) - s_{0})$ ]. Surprisingly, the relation between velocity ${\bar{v}}_{0}$ and forward expression difference $δ (\bar{s})$ is largely stochastic (Pearson’s R = –0.042), and a substantial fraction of genes (50.7%) have opposite signs between them, indicating a backward transition (Fig. 2C). A similar pattern is also found in scVelo’s dynamical mode (SI Appendix, Fig. S1). In other words, contradicting transition directions widely exist among different genes (for example, Arih1 has forward direction while Cltc has backward direction; Fig. 2C and SI Appendix, Fig. S2), which hinders the identification of the expected direction in early time points.
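The per-gene consistency check described above can be sketched as below, following the stated definitions of ${\bar{v}}_{0}$ and $δ (\bar{s})$. The function name `direction_consistency` and the array layout (cells in rows, genes in columns) are assumptions for illustration.

```python
import numpy as np

def direction_consistency(S0, S15, V0):
    """Compare median velocity at 0 min with the forward expression difference.

    S0:  (n0, G) spliced expression of 0-min cells.
    S15: (n15, G) spliced expression of 15-min cells.
    V0:  (n0, G) velocities of 0-min cells.
    Returns per-gene v_bar, per-gene delta, Pearson's R, and the fraction of
    genes whose v_bar and delta have opposite signs.
    """
    v_bar = np.median(V0, axis=0)
    # delta = median over 0-min cells of (median(s_15) - s_0), per gene.
    delta = np.median(np.median(S15, axis=0) - S0, axis=0)
    r = float(np.corrcoef(v_bar, delta)[0, 1])
    opposite = float(np.mean(np.sign(v_bar) * np.sign(delta) < 0))
    return v_bar, delta, r, opposite
```

The same function applies unchanged to latent dimensions in place of genes, which is how the raw-space and low-dimensional comparisons can be put on an equal footing.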

Fig. 2. — Comparison of scVelo and VeloAE for RNA velocity analysis on the scNTseq dataset. (A) scVelo stochastic mode in raw gene space. (B) Velocity projected into low-dimensional space by VeloAE. (C) Scatter plot of $δ (\bar{s})$ (15-0) over ${\bar{v}}_{0}$ in the raw space. (D) Scatter plot of $δ ({\bar{s}}_{z})$ (15-0) over ${\bar{v}}_{z 0}$ in the low-dim space. (E) In-cluster coherence scores. (F) Cross-boundary direction correctness (A → B) scores.

By contrast, the learned representation in a 100-dimensional space by VeloAE can largely retain a positive relation between ${\bar{v}}_{z 0}$ and $δ ({\bar{s}}_{z})$ (Pearson’s R = 0.962 on 98% interval) and more consistent signs between them (87.8%, Fig. 2D). Therefore, the cell transitions estimated from these latent dimensions are remarkably corrected, especially with a clear direction between 0 and 15 min (Fig. 2B). Overall, VeloAE’s representation not only substantially enhances the proportion of correct direction between two proximate states (mean CBDir: 0.253 by scVelo vs. 0.392 by VeloAE; Fig. 2F and Methods), but also increases the transition coherence within each subgroup of cells (mean ICVCoh from 0.914 by scVelo to 1.000 by VeloAE; Fig. 2E and Methods).

VeloAE Strengthens Directionality in Oligodendrocyte Lineages.

Next, we applied VeloAE to a well-studied dataset on neurogenesis, where the development of mouse dentate gyrus was measured at two time points (P12 and P35) with scRNA-seq data (10x Genomics) (9, 19). As demonstrated in the scVelo paper (9), major differentiation lineages, e.g., neuroblasts developing into granule cells, can be successfully identified by both its stochastic and dynamical modes. However, sublineages remain challenging to identify, particularly the differentiation from oligodendrocyte precursor cells (OPCs) into myelinating oligodendrocytes (OLs). Only the dynamical mode is able to moderately detect the right direction between them (9), and the stochastic mode instead returns erroneous transitions (Fig. 3A). This challenge occurs partly because both cell types are in substationary states and transient cells are limited. By visualizing the consistency between velocities at OPCs and the expression difference from OLs, we found there is only a weak relation between the velocities and the expression difference (Pearson’s R = 0.093) and 41.1% of genes show opposite signs between them (Fig. 3C).

Fig. 3. — Comparison of scVelo and VeloAE for RNA velocity analysis on dentate gyrus dataset. (A) scVelo stochastic mode in raw gene space. (B) Velocity projected into low-dimensional space by VeloAE. (C) Scatter plot of $δ (\bar{s})$ (OL-OPC) over ${\bar{v}}_{opc}$ in the raw space (by scVelo). (D) Scatter plot of $δ ({\bar{s}}_{z})$ (OL-OPC) over ${\bar{v}}_{z, opc}$ in the low-dim space (by VeloAE). (E) In-cluster coherence scores. (F) Cross-boundary direction correctness (A → B) scores.

By contrast, the latent dimensions learned by VeloAE show a strong positive correlation between the velocity and expected differences (Pearson’s R = 0.861) and a high proportion of consistent signs (65.3%; Fig. 3D). Consequently, projecting onto these lower dimensions yields a clear direction from OPCs to OLs (Fig. 3B), increasing the direction correctness from –0.886 (scVelo stochastic) or –0.086 (scVelo dynamical) to 0.981 (VeloAE; Fig. 3F and Table 1). Also, the transition directions are largely smoothed (in-cluster coherence: 1.000 by VeloAE vs. 0.936 by scVelo; Fig. 3E).

Table 1. Performance comparison across datasets between our proposed method VeloAE, its variants, and other baseline methods

Method	scNTseq ICVCoh	scNTseq CBDir	Dentate gyrus ICVCoh	Dentate gyrus CBDir	scEUseq ICVCoh	EryMouse ICVCoh	EryMouse CBDir	EryHuman ICVCoh	EryHuman CBDir	Pancreas ICVCoh	Pancreas CBDir
scVelo (stc)	0.9141	0.2527	0.9363	–0.8861	0.8297	0.7725	–0.1002	0.8769	–0.5253	0.8360	0.4787
scVelo (dyn)	0.9532	0.2726	0.8294	–0.0855	0.7286	0.9151	–0.2164	0.9298	–0.5036	0.7846	0.4678
FA	0.3297	–0.1005	0.6577	0.6715	0.5870	0.3341	0.2086	0.2967	0.1002	0.3430	0.0915
PCA	0.4148	–0.3527	0.8876	–0.8178	0.9411	0.9930	–0.6140	0.8711	–0.3358	0.3987	0.2533
AE	1.0000	0.2441	0.9993	–0.4937	1.0000	0.9868	0.1224	0.9850	–0.0023	0.9998	–0.0145
VeloAE	1.0000	0.3919	1.0000	0.9809	0.9997	0.9999	0.8276	0.9996	0.6499	1.0000	0.5168
AE w/ CohAgg	1.0000	0.4364	0.9999	–0.8969	1.0000	0.9998	0.6012	0.9999	–0.4461	1.0000	0.1720
AE w/ AttComb	0.9991	0.3347	0.9993	0.9540	0.9984	0.9879	0.2545	0.9953	0.3003	0.9928	0.2898

ICVCoh, in-cluster coherence score; CBDir, cross-boundary direction correctness score; w/, with only the specific configuration; stc, scVelo’s stochastic mode in raw space; dyn, scVelo’s dynamical mode in raw space.

VeloAE Identifies Intestinal Organoid Differentiation

In addition, we applied VeloAE to an intestinal organoid dataset, where a snapshot is taken during the differentiation from stem cells to secretory cells or enterocytes (20). We herein refer to this dataset as scEUseq, since it is obtained using the technology scEU-seq proposed in ref. 20. In the original paper, two strong differentiation trajectories were identified by Monocle2 (3), and the directions were manually added by annotating the stem and differentiated cell types. Here, we ask whether RNA velocity can automatically identify the differentiation trajectory and its directionality. By using the spliced and unspliced RNAs, we found that the RNA velocity estimated by the scVelo stochastic mode is able to identify the branch 1 trajectory from stem cells to secretory cells, but fails to find the fully correct direction on branch 2 terminating at enterocytes (see Fig. 4A for the stochastic mode and SI Appendix, Fig. S1 for the dynamical mode). On this branch, two opposite directions were wrongly suggested by scVelo, possibly because of the low cell density in the middle of this trajectory.

Fig. 4. — Comparison between scVelo and VeloAE for RNA velocity analysis on organoid differentiation dataset (scEUseq). (A) scVelo stochastic mode in raw gene space. (B) Velocity projected into low-dimensional space (VeloAE). (C) Heat map of branch 1 and branch 2 marker genes’ attention distributions over latent dimensions. (D) Low-dim expression heat map of five most strongly attended dimensions along the two differentiation branches (i.e., monocle pseudotime). (E) In-cluster coherence scores.

By contrast, VeloAE can successfully identify both differentiation trajectories and their directions through the learned lower-dimensional representations (Fig. 4B). By examining the attention weights in the attentive combination module, we found that the major marker genes (see SI Appendix, Fig. S3 for their enrichment over two branches) of the two differentiation trajectories dominantly enrich in five latent dimensions: branch 1 (secretory cells) genes Defa24 in dim 39, Rnase4 in dim 84, and Defa17 and Lyz1 in dim 17 and branch 2 (enterocytes) genes Apoa1 in dim 57 and Ephx2, Gstn3, Cyp3a25 and Muc3 in dim 84 (Fig. 4C). Interestingly, by tracing the expression states of these five latent dimensions, we found that they can evidence the dynamics of the two differentiation branches (Fig. 4D). Particularly, latent dimension 17 enriches at the terminal state of branch 1, while latent dimensions 84 and 57 enrich in that of branch 2. Furthermore, due to the effective representation, the estimated transition directions are significantly more coherent compared to the original gene space (0.830 by scVelo vs. 1.000 by VeloAE; Fig. 4E).

VeloAE’s Performance on Other Featured Datasets

Moreover, to evaluate our framework in different data settings, three additional datasets are explored. Two of them are about erythroid development respectively from mouse and human (21). According to the original study, both datasets contain Multiple Rate Kinetics (Murk) genes that result in a completely reversed estimation of cellular transitions by scVelo (Fig. 5), possibly because of the violation of the RNA velocity model assumption. The authors reported that only by manually removing those genes could scVelo correctly recover the transitional direction (21). We are hence motivated to test our method on these datasets to investigate whether VeloAE can automatically diminish the impact of those misleading genes in cellular transition estimation.

Besides, to evaluate our method on multibranched development, we enrolled the pancreas dataset used in the CellRank paper (22), where cells develop from “Ngn3 EP” to “Fev+” and then branch into four lineages: alpha, beta, epsilon, and delta terminal fates.

The performance of scVelo and VeloAE on these three datasets is shown in Fig. 5, demonstrating that the low-dimensional estimations of VeloAE (Fig. 5, Bottom) are able to filter the noise of “Murk” genes and recover the multibranched development trends in the low-dimensional space.

Comparison with Multiple Baseline Methods

To assess the contribution of each module in our proposed method, we compared the velocity coherence and direction correctness between VeloAE and its two variants by keeping only the GCN in the encoder for cohort aggregation (i.e., with only cohort aggregation [w/CohAgg]) or only the attention module in the decoder (i.e., with only attentive combination [w/AttComb]) or by turning off both modules and changing it to a vanilla autoencoder (AE). We also included PCA and factor analysis (FA) models as additional baseline methods.

Compared with scVelo, almost all autoencoder-based methods consistently achieve higher within-cluster velocity coherence (ICVCoh) across the datasets (Table 1), probably due to their capability of data denoising. Among them, with higher ICVCoh scores, VeloAE is usually the top-performing method.

Importantly, when examining the direction correctness, only VeloAE and its variant with the attentive combination module (i.e., AE w/AttComb) achieve positive CBDir scores across all the datasets applicable for CBDir measurement (Table 1). This highlights the importance of the attentive combination module for ensuring an informative projection of spliced and unspliced RNA reads and hence more robust evaluation of RNA velocities for trajectory inference. Meanwhile, the cohort aggregation module further boosts VeloAE’s performance in inferring correct transition directions, as VeloAE attains a higher CBDir score than AE w/AttComb on all five of these datasets.

Taken together, VeloAE by combining the GCN module in the encoder and attention module in the decoder has a balanced performance on both high coherence and accurate directionality.

Furthermore, in SI Appendix, Table S4 we show the effects of randomness on model performance by reporting the mean and SD of the metrics after fitting VeloAE on the six datasets with 10 different random seeds. The results evidence consistent model behaviors and indicate robust estimation of cellular transitions of VeloAE under different random seed settings.

VeloAE’s Potential Applicability in Extreme Scenarios

To further examine the broad applicability, we explored VeloAE on four more challenging datasets where technical quality and/or ground-truth annotations are limited, for which reason we did not include them in the quantitative evaluation experiments reported in Table 1.

SI Appendix, Fig. S4 shows the inferred trajectories of mouse intestinal epithelium development from a dataset with relatively shallow sequencing depth (median 689 unique molecular identifiers [UMIs] per cell in the top 2,000 genes; SI Appendix, Fig. S4 A). Nonetheless, we found VeloAE to be more promising than scVelo in identifying reasonable transition flows, e.g., maturation of distal and proximal enterocytes (SI Appendix, Fig. S4 B and C) (23).

In SI Appendix, Fig. S5, we reevaluated the cell differentiation in mouse bone marrow with the original dataset used in ref. 8. This dataset has extremely low coverage, for both total count (median 214 UMIs per cell in the top 2,000 genes; SI Appendix, Fig. S5 A) and marker genes of progenitor-to-activating transition (SI Appendix, Fig. S5 B) and hematopoietic stem cells (HSCs) (SI Appendix, Fig. S5 C). Surprisingly, even with such low coverage, both scVelo and VeloAE successfully identified the major expected transitions from progenitor to activating cells (SI Appendix, Fig. S5 D and E). However, they both conflict with the transitions estimated by ref. 8 between the progenitors and dividing cells (SI Appendix, Fig. S5 D–F). Although progenitors going back to dividing cells is concerning to us, we found expressed HSC marker genes in progenitors (shown in SI Appendix, Fig. S5 C), which may be an additional confounder for hindering the trajectory inference besides the limited coverage.

The human bone marrow dataset from ref. 24 has a much higher proportion of unspliced mRNA reads (48% versus generally 15 to 25%; SI Appendix, Fig. S6 A), making it unique from all the other evaluated datasets in this study. The effects of a changing transcription rate of certain genes (i.e., Murk genes with boosting expression) are supposed to cause the reversed transitions highlighted from Ery1 to Ery2 (24) in SI Appendix, Fig. S6 B and C, possibly due to the violation of a fixed transcription rate assumed by both scVelo (steady-state and stochastic modes) and VeloAE. Comparing this dataset with the above mouse and human erythroid datasets (Fig. 5) where VeloAE successfully counteracted the effects of Murk genes, we suspect that it is the unexpected spliced-to-unspliced ratio and the complex branches that add to the difficulty of estimating the expected transitions. One major conflicting result between scVelo and VeloAE is the direction between monocytes and precursors (SI Appendix, Fig. S6 B and C). While we are not certain about the genuine differentiation, it is seemingly more problematic to observe the transitions from monocytes to HSCs estimated by scVelo. As pointed out in the original study, this dataset may bring multifaceted challenges to RNA velocity analysis (24).

Finally, we evaluate a complex biological system of endoderm progenitor reprogramming (25), whose two clonal reprogramming trajectories are shown in SI Appendix, Fig. S7 A. The partially overlapped and crossed trajectories greatly challenge the capability of RNA velocity methods, making scVelo fail in producing expected transitions (SI Appendix, Fig. S7 B). By contrast, VeloAE managed to discover transitions that partially match with the expected trajectories for both clones, particularly in later stages (SI Appendix, Fig. S7 C). Nonetheless, such long and complex trajectories might have been beyond the modeling capacity of RNA velocity featured in transient RNA dynamics. More targeted trajectory modeling tools such as CellRank (22) may be potential alternatives.

Discussion

In this work, we aim to infer cell transitions from RNA velocity. This is a paradigm of trajectory inference where the intrinsic dynamics of transitions are inferred from two layers of molecular observations: the spliced and unspliced RNAs. Computationally, cosine- or correlation-based metrics on vectors are usually used to measure the similarity between cells and to estimate transition directionality. As the velocity vectors are not self-normalized, the results of such measurements are substantially different from commonly used Euclidean distances on transcription profiles in conventional trajectory inference methods. Therefore, a denser representation in a low-dimensional space could be crucial for RNA velocity analysis, which motivates us to develop VeloAE here, a tailored autoencoder that learns effective representations for RNA velocity projection. Our findings from the evaluation experiments on different datasets demonstrate the benefits of applying VeloAE in postprocessing raw velocity estimations for more robust results. The robustness of projections is impressively indicated in both visual demonstrations and quantitative metrics.

Our VeloAE model generally outperforms baseline methods across datasets due to the proposed two mechanisms. While cohort aggregation explicitly constrains a cell’s low-dimensional representation to resemble its neighboring cells, attentive combination urges the representations to retain as much information from the original gene space as possible for reconstructing the input. Compared with either PCA or a standard autoencoder, VeloAE can hence naturally project similar cells closer to each other (effect of cohort aggregation), forming more centered neighborhoods or clusters for later analysis. Meanwhile, by replacing the decoder (e.g., an MLP) in AE with the attentive combination mechanism, VeloAE can implicitly embed the similarity information among gene profiles into the low dimensions, which the decoder of a standard AE may fail to learn. Consequently, the synergistic effect is that the low-dimensional cell representations not only keep the key information for reconstructing the original space but also explicitly encode evidence of spatial closeness among cells. By contrast, PCA, FA, and vanilla AE lack explicit mechanisms for encoding either gene-level or cell-level structures of a dataset into the low-dimensional space. Therefore, although at times FA and AE could retain dataset structures in the low-dimensional space and achieve high ICVCoh or positive CBDir scores, their applicability is likely restricted to a certain set of scenarios.

Although in the dentate gyrus dataset, VeloAE managed to correctly infer cellular transitions between “island”-like cell clusters OPC and OL, in general, given the grounding on transient dynamics of unspliced and spliced RNA states, RNA velocity is more suitable for short-term transition predictions, and so is VeloAE. Therefore, if the formation of islands of cell clusters is due to too long a time interval between developed cells, these cell clusters are probably already in different stationary states and hence lack the transient state of RNA splicing process. In such a situation, RNA velocity and VeloAE are both likely to fail, as also pointed out in ref. 22.

Furthermore, complex experiment designs are increasingly applied for dissecting different dynamics-related factors, making data integration an inevitable challenge; for example, six of the datasets used here are collected separately from multiple time points or conditions. Batch effect-minimized designs are often desired; otherwise, postcorrection of batch effects may be needed, e.g., for cell visualization and clustering; see a recent review of different strategies (26). This challenge also exists in RNA velocity analysis, possibly even more severe given the normalization needed for both layers of unspliced and spliced RNAs. Therefore, data integration and batch effect correction are worth further exploration with a focus on RNA velocity analysis.

Great challenges for transition estimation also come from the quality of single-cell datasets and the complexity of the underlying system. As shown in SI Appendix, Figs. S4–S7, the low depth of sequencing, the existence of assumption-violation genes (e.g., Murk genes), the unusual proportions of spliced and unspliced mRNA reads, and the complex developmental processes may all affect the success of transition estimation results, separately or collectively. Since VeloAE shows promising performance even in such extreme scenarios, we expect our approach of denoised low-dimensional estimation could serve as a different perspective on tackling the mentioned challenges, which are more comprehensively discussed in ref. 24.

Finally, we mention that the proposed method shares some merits with a range of dimensionality reduction methods in manifold learning (27). Assuming high-dimensional data to have concentrated low-dimensional intrinsic structures (i.e., manifolds), manifold learning methods attempt to map data from the original space into a low-dimensional space, where the main variation is kept along the manifold. These types of methods have also rapidly emerged in recent years for revealing complex structures in single-cell transcriptomics, as reviewed by Moon et al. (28). Therefore, we expect our method opens a door of manifold learning for multilayer data analysis and dynamical modeling and motivates methodology development for further enhancing the robustness of single-cell RNA velocity analysis.

Methods

RNA VeloAE Model

A standard autoencoder consists of an encoder and a decoder (29), which are usually parameterized by MLPs with a single hidden layer (SI Appendix, Fig. S8). When processing data, an encoder transforms input data points into a low-dimensional space, where key information about the data is retained. A decoder then reconstructs the input dimensions from the dimension-reduced representations, aiming to recover the original input. The low-dimensional representations are of particular interest to us, as they, when preserving key information, are expected to filter noise or less important information, thus enabling more coherent estimations for single-cell velocities.

Effective as it is, the standard autoencoding architecture is limited in incorporating intersample correlations into the encoding procedure. More specifically, a standard autoencoder assumes the input data to be independent and identically distributed (IID); hence, there is no explicit mechanism for keeping cells’ neighborhood information in the low-dimensional space. Nor can it capture cross-cell distribution patterns of a gene to identify key latent dimensions for its reconstruction. It is, however, noted that the neighborhood information of cells plays a critical role in cell-level velocity inference and that the steady-state approach of gene-level velocity inference also exploits cross-cell patterns of a gene.

To mitigate the above issues, this work proposes two mechanisms that can be embedded into the encoder and decoder modules of an autoencoder:

• Cohort aggregation with GCNs (30) for incorporating neighborhood information into low-dimensional representations; and
• Attentive combination of latent dimensions for input dimension reconstruction.

The overall computational framework is illustrated in SI Appendix, Fig. S9. In the following text, we formalize our framework and the proposed two mechanisms.

Suppose an input matrix $X \in R^{N \times d_{x}}$ denotes any matrix of transcriptome, spliced, or unspliced reads, with normalized counts of d_x genes across N cells. The encoder parameterized by a single hidden-layer MLP transforms it into a low-dimensional space as $Z^{o} = MLP (X) \in R^{N \times d_{z}}$ , where d_z is much smaller than d_x. To associate representations of cell neighborhood, the following cohort aggregation procedure is applied to $Z^{o}$ with the help of a prespecified neighborhood graph.
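As a rough illustration of this encoding step, the following NumPy sketch maps a toy matrix X to $Z^{o}$ with a single hidden-layer MLP. The layer sizes, the random weights, and the helper names gelu and mlp_encode are assumptions made for illustration only; the GELU activation follows the Implementation section, and the tanh approximation used here is just one common way to evaluate it.

```python
import numpy as np

def gelu(x):
    # tanh approximation of the Gaussian error linear unit
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mlp_encode(X, W1, b1, W2, b2):
    """Single hidden-layer MLP: X (N x d_x) -> Z_o (N x d_z)."""
    hidden = gelu(X @ W1 + b1)
    return hidden @ W2 + b2

rng = np.random.default_rng(0)
N, d_x, d_h, d_z = 6, 20, 8, 3                # toy sizes (hypothetical)
X = rng.random((N, d_x))                      # stand-in for normalized counts
W1 = rng.normal(scale=0.1, size=(d_x, d_h))
b1 = np.zeros(d_h)
W2 = rng.normal(scale=0.1, size=(d_h, d_z))
b2 = np.zeros(d_z)

Z_o = mlp_encode(X, W1, b1, W2, b2)           # one d_z-dimensional row per cell
```

In a trained model the weights would be learned jointly with the decoder; here they are random and serve only to show the shapes involved.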

CohAgg

The CohAgg module leverages GCN to enrich the low-dimensional representation of a cell by taking the weighted average of its neighborhood including itself. The weights and self-loop enable the representation of a target cell to be smoothed by its neighbors while keeping cell-specific information from others, allowing less abundant cell types to also maintain specific representations. Formally, an aggregated low-dimensional representation ${\tilde{z}}_{i, :}$ for the ith cell is computed as

{\tilde{z}}_{i, :} = f (Z^{o}, G_{N}) = \sum_{j \in N (i) \cup \{i\}} \frac{w_{i, j}}{\sqrt{\deg (i)} \cdot \sqrt{\deg (j)}} \cdot (Θ^{(1)} \cdot z_{j, :}^{o}),

where $G_{N}$ is a given neighborhood graph (usually generated when scRNA-seq data are processed), $\deg (\cdot)$ returns node degree in the graph, $w_{i, j}$ is the similarity score between cells i and j, and the set operation $N (i) \cup \{i\}$ groups cell i and its nearest neighbors in a common set. $Θ^{(\cdot)} \in R^{d_{z} \times d_{z}}$ is the learnable parameter matrix of the GCN. In aggregation, $w_{i, j}$ weighs the importance of neighbor representation j, and the inverse of the degrees normalizes the result.

Compactly, the computation can be conducted in matrix form as

\tilde{Z} = {\tilde{D}}^{- 1 / 2} \tilde{W} {\tilde{D}}^{- 1 / 2} Z^{o} Θ^{(1)},

[1]

where $\tilde{W} = W + I_{N} \in R^{N \times N}$ is the weighted adjacency matrix $W \in R^{N \times N}$ of the neighborhood graph $G_{N}$ enhanced by self-loops of weight one, and $\tilde{D} \in R^{N \times N}$ is the degree matrix with each diagonal element computed as ${\tilde{D}}_{i, i} = \sum_{j = 1}^{N} {\tilde{W}}_{i, j}$ .

Moreover, to encode the influence of the second-order neighborhood, the same cohort aggregation mechanism is applied to $\tilde{Z}$ , which yields the final encoding result Z that explicitly encodes cohort relationship into low-dimensional representations:

Z = {\tilde{D}}^{- 1 / 2} \tilde{W} {\tilde{D}}^{- 1 / 2} \tilde{Z} Θ^{(2)} .

[2]
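Eqs. 1 and 2 can be sketched in NumPy as follows. The toy graph, the identity parameter matrices, and the function name cohort_aggregate are assumptions made for illustration; the authors' implementation uses GCN layers from pytorch-geometric rather than this dense computation.

```python
import numpy as np

def cohort_aggregate(Z, W, Theta):
    """One cohort aggregation step: D~^{-1/2} (W + I) D~^{-1/2} Z Theta."""
    W_tilde = W + np.eye(W.shape[0])          # add self-loops of weight one
    deg = W_tilde.sum(axis=1)                 # diagonal of the degree matrix D~
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    A_norm = d_inv_sqrt[:, None] * W_tilde * d_inv_sqrt[None, :]
    return A_norm @ Z @ Theta

# Toy neighborhood graph over 4 cells with symmetric similarity weights (hypothetical).
W = np.array([[0.0, 0.8, 0.0, 0.0],
              [0.8, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.9],
              [0.0, 0.0, 0.9, 0.0]])
rng = np.random.default_rng(0)
Z_o = rng.normal(size=(4, 3))                 # stand-in for the MLP output
Theta1 = np.eye(3)                            # learnable in practice
Theta2 = np.eye(3)

Z_tilde = cohort_aggregate(Z_o, W, Theta1)    # Eq. 1: first-order neighborhood
Z = cohort_aggregate(Z_tilde, W, Theta2)      # Eq. 2: second-order neighborhood
```

With an empty graph (W equal to zero) the normalized matrix reduces to the identity, so each cell keeps its own representation; this is the role of the self-loops.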

AttComb

The latent dimensions of Z compress multiple original dimensions. Consequently, reconstructing an input dimension from Z requires querying information encoded in multiple latent columns. We adopt a global attention mechanism to evaluate the contribution of each latent dimension to a target input dimension.

To enable attention computation, a prespecified representation matrix G for genes is required. The matrix offers in each column $G_{:, i} \in R^{d_{g}}$ a vectorized representation for a target gene. In practice, G can be prepared using techniques such as PCA on the columns of a count matrix.

Given a target gene represented as $G_{:, i}$ , reconstruction of its corresponding input dimension ${\hat{x}}_{i} \in R^{N}$ requires computing the attention weights for each of the latent dimensions in Z. To achieve this, we first transform the gene representation to a query vector $q_{i} \in R^{d_{t}}$ and each latent dimension $Z_{:, j}$ to a key vector $k_{j} \in R^{d_{t}}$ , both with separately parameterized MLPs. Scaled dot products between each pair of the query and key vectors are then calculated and normalized with Gumbel-Softmax function (31) to yield the weights, e.g., for gene i and latent dimension j:

α_{i, j} = Gumbel - Softmax (\frac{q_{i}^{T} \cdot k_{j}}{\sqrt{d_{t}}}) .

[3]

By setting the temperature hyperparameter of Gumbel-Softmax τ to be small (e.g., 1 or 5), the weights could be less evenly distributed and larger on only a small number of all the dimensions, which are hence forced to be dissimilar and to capture different structures of the raw gene profiles. The weights are then used to aggregate corresponding dimensions of Z by weighted sum to reconstruct the target input dimension. For reconstructing the reads (denoted as ${\hat{x}}_{n, i}$ ) of the ith gene in the nth cell, the procedure works as follows:

{\hat{x}}_{n, i} = \sum_{j = 1}^{d_{z}} α_{i, j} z_{n, j} .

Compactly, the computation for reconstructing all the input dimensions can be conducted in matrix form as

\hat{X} = Z \cdot A^{T} .

[4]

Note that the attention matrix A plays the role of a single fully connected feedforward neural layer, mimicking a standard decoder without any hidden layer or nonlinearity. However, different from a standard decoder, each attention weight in A is computed based on the global information of both latent and target dimensions, instead of fitting locally under the IID assumption as a standard decoder does. Moreover, as a by-product, the attention weights can help us better interpret the latent dimensions of Z by tracing their contributions to input dimensions.
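The attentive combination of Eqs. 3 and 4 can be sketched in NumPy as below. Several details are assumptions: linear maps stand in for the two separately parameterized MLPs, the Gumbel-Softmax is normalized over the latent dimensions for each gene (our reading of the text), and all sizes and names (gumbel_softmax, W_q, W_k) are hypothetical.

```python
import numpy as np

def gumbel_softmax(logits, tau, rng):
    """Gumbel-Softmax relaxation applied to each row of `logits`."""
    gumbel = rng.gumbel(size=logits.shape)    # standard Gumbel(0, 1) noise
    y = (logits + gumbel) / tau
    y = y - y.max(axis=1, keepdims=True)      # subtract row max for stability
    e = np.exp(y)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
N, d_x, d_z, d_g, d_t = 5, 7, 3, 4, 6         # toy sizes (hypothetical)
Z = rng.normal(size=(N, d_z))                 # latent representation of N cells
G = rng.normal(size=(d_g, d_x))               # gene representations, one column per gene

# Linear maps used here as stand-ins for the query and key MLPs.
W_q = rng.normal(size=(d_g, d_t))
W_k = rng.normal(size=(N, d_t))
Q = G.T @ W_q                                 # queries, one row per gene
K = Z.T @ W_k                                 # keys, one row per latent dimension

logits = Q @ K.T / np.sqrt(d_t)               # scaled dot products, shape (d_x, d_z)
A = gumbel_softmax(logits, tau=1.0, rng=rng)  # Eq. 3: attention weights
X_hat = Z @ A.T                               # Eq. 4: reconstructed input, shape (N, d_x)
```

Each row of A sums to one, so every reconstructed gene is a convex combination of the latent dimensions; inspecting a row of A indicates which latent dimensions contribute to that gene.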

Fitting and Velocity Estimation

The primary loss for fitting the parameters of the proposed framework is the mean squared error (MSE) between the reconstructed $\hat{X}$ and the original X. To tackle the problem of aligning latent dimensions, we initialize only one model and use it to simultaneously encode and reconstruct both spliced (S) and unspliced (U) RNA reads. The errors are then added up to form the following reconstruction loss $L_{rec}$ :

L_{rec} = MSE (S, \hat{S}) + MSE (U, \hat{U}) .

[5]

Meanwhile, an auxiliary regression loss is computed on the latent projections $S_{z}$ and $U_{z}$ , to mimic the fitting of RNA degradation rate and also add more structures to the latent dimensions. Following refs. 8 and 9, this is achieved by performing a linear regression on cells potentially at steady state, namely those with spliced and unspliced reads at extreme quantiles falling outside of the [0.05, 0.95] quantile interval. In our context, in the low-dimensional space, the degradation rate of the ith low dimension γ_i is fitted by solving the regression function $u = γ \cdot s + ϵ$ on the extreme quantile cells at the ith dimension (column) of $S_{z}$ and $U_{z}$ , respectively. Because there is an analytical solution to linear regression, we can directly estimate γ_i and compute the total regression loss $L_{reg}$ of all the latent dimensions:

L_{reg} = \sum_{i} MSE ({\tilde{u}}_{i}, {\hat{γ}}_{i} \cdot {\tilde{s}}_{i}),

[6]

where {\hat{γ}}_{i} = \frac{{\tilde{u}}_{i}^{T} \cdot {\tilde{s}}_{i}}{{\tilde{s}}_{i}^{T} \cdot {\tilde{s}}_{i}},

[7]

and vectors ${\tilde{u}}_{i}$ and ${\tilde{s}}_{i}$ keep only those extreme-quantile samples (cells) at the ith dimension of $U_{z}$ and $S_{z}$ , respectively.

The final loss for optimization is the sum of $L_{rec}$ and $L_{reg}$ :

L = L_{rec} + L_{reg} .

[8]
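The loss in Eqs. 5 to 8 can be sketched in NumPy as follows. The rule for selecting extreme-quantile cells is an assumption (a cell is kept when its spliced or unspliced latent value lies outside that dimension's [0.05, 0.95] quantile interval); the authors' code may select cells differently. The function names and toy data are hypothetical.

```python
import numpy as np

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def extreme_mask(s, u, lo=0.05, hi=0.95):
    """Assumed rule: keep cells whose s or u lies outside its [lo, hi] quantile interval."""
    s_lo, s_hi = np.quantile(s, [lo, hi])
    u_lo, u_hi = np.quantile(u, [lo, hi])
    return (s <= s_lo) | (s >= s_hi) | (u <= u_lo) | (u >= u_hi)

def regression_loss(S_z, U_z):
    """Eqs. 6 and 7: closed-form gamma per latent dimension plus the residual MSE."""
    total, gammas = 0.0, []
    for i in range(S_z.shape[1]):
        keep = extreme_mask(S_z[:, i], U_z[:, i])
        s_t, u_t = S_z[keep, i], U_z[keep, i]
        gamma = float(u_t @ s_t) / float(s_t @ s_t)   # Eq. 7
        gammas.append(gamma)
        total += mse(u_t, gamma * s_t)                # one term of Eq. 6
    return total, np.array(gammas)

def total_loss(S, S_hat, U, U_hat, S_z, U_z):
    l_rec = mse(S, S_hat) + mse(U, U_hat)             # Eq. 5
    l_reg, gammas = regression_loss(S_z, U_z)
    return l_rec + l_reg, gammas                      # Eq. 8

rng = np.random.default_rng(0)
S, U = rng.random((50, 10)), rng.random((50, 10))     # toy input matrices
S_z = rng.random((50, 3)) + 0.1                       # toy latent projections
U_z = 2.0 * S_z                                       # exact steady state with gamma = 2
loss, gammas = total_loss(S, S, U, U, S_z, U_z)       # perfect reconstruction for the toy case
```

In this toy case the unspliced latent values are exactly twice the spliced ones, so the fitted rates are all 2 and both loss terms vanish.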

After fitting the parameters with Eq. 8, the encoder of the framework can project S and U into low-dimensional representations as $S_{z} = encode (S)$ and $U_{z} = encode (U)$ , respectively. By using Eq. 7, we can compute per-dimension rate γ_i and then calculate all the velocities in the low-dimensional space:

V_{z} = U_{z} - Γ_{z} ⊙ S_{z},

[9]

where $Γ_{z}$ is generated by repeating the row vector $[γ_{1}, γ_{2}, \dots, γ_{d_{z}}]$ N times to match the number of rows of $S_{z}$ and $U_{z}$ , and $⊙$ is the elementwise multiplication operation.
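A small worked instance of Eq. 9 in NumPy is given below; the numbers and the function name latent_velocity are illustrative assumptions. The rate vector is tiled explicitly to mirror the definition of $Γ_{z}$ , although NumPy broadcasting would give the same result.

```python
import numpy as np

def latent_velocity(S_z, U_z, gammas):
    """Eq. 9: V_z = U_z - Gamma_z * S_z with the rate vector repeated for every cell."""
    Gamma_z = np.tile(gammas, (S_z.shape[0], 1))   # one copy of the rates per cell
    return U_z - Gamma_z * S_z

S_z = np.array([[1.0, 2.0],
                [3.0, 4.0]])                       # 2 cells, 2 latent dimensions
U_z = np.array([[2.0, 2.0],
                [2.0, 2.0]])
gammas = np.array([1.0, 0.5])                      # per-dimension rates

V_z = latent_velocity(S_z, U_z, gammas)
# V_z is [[1.0, 1.0], [-1.0, 0.0]]
```

A positive entry indicates more unspliced signal than expected at steady state for that latent dimension, and a negative entry indicates less.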

Cell Transitions from Low-Dimensional Representations

With low-dimensional estimations $S_{z}$ and $V_{z}$ ready, the estimation for cellular transitions can take place in the low-dimensional space and is expected to achieve more robust results.

In accordance with scVelo, the transition strength $π_{i, j}$ from cell i to cell j in the low-dimensional space is defined as the cosine similarity between the low-dimensional velocity of cell i and the difference of splicing expressions between cells j and i:

π_{i, j} = \frac{{(s_{z, j} - s_{z, i})}^{T} \cdot v_{z, i}}{\| s_{z, j} - s_{z, i} \| \cdot \| v_{z, i} \|} .

[10]

This transition score is then normalized with softmax function (with optional weights) over the neighborhood of cell i to yield the transition probabilities (details in ref. 9).
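Eq. 10 and the subsequent normalization can be sketched in NumPy as below. This is a simplified sketch: it applies a plain, unweighted softmax over each cell's neighbors, whereas scVelo's own normalization (ref. 9) includes further details not reproduced here. The neighbor lists and the function name are hypothetical.

```python
import numpy as np

def transition_probabilities(S_z, V_z, neighbors):
    """Cosine transition scores (Eq. 10) followed by a softmax over each neighborhood."""
    probs = []
    for i, nbrs in enumerate(neighbors):
        delta = S_z[nbrs] - S_z[i]                       # displacements to neighbors
        denom = np.linalg.norm(delta, axis=1) * np.linalg.norm(V_z[i])
        cos = (delta @ V_z[i]) / np.maximum(denom, 1e-12)
        e = np.exp(cos - cos.max())                      # plain softmax (assumption)
        probs.append(e / e.sum())
    return probs

S_z = np.array([[0.0, 0.0], [1.0, 0.0], [-1.0, 0.0]])    # 3 cells in a 2D latent space
V_z = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])     # all velocities point to +x
neighbors = [[1, 2], [0, 2], [0, 1]]                     # hypothetical neighbor lists

P = transition_probabilities(S_z, V_z, neighbors)
```

For cell 0, the neighbor lying along its velocity (cell 1) has cosine 1 and the opposite neighbor (cell 2) has cosine -1, so cell 1 receives the larger transition probability.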

Settings of Used Models

To validate the performance of the proposed framework, evaluation experiments were conducted on multiple single-cell sequencing datasets, with baseline methods involved for comparison. On each dataset, we fit all the candidate models and estimate cell velocities using the scVelo package (9) in both stochastic and dynamical modes. As scVelo uses the first-order moments (mean) of the cell neighborhood instead of raw spliced and unspliced reads for estimation, to prevent mismatch we follow this practice and fit all the other candidate models (those in Methods and Baseline Methods) using the same moment matrices of spliced and unspliced reads.

The default data preparation procedures introduced in ref. 9 and implemented in scVelo are adopted when we preprocess all the evaluation datasets: The top 2,000 highly variable genes passing a minimum threshold of 20 expressed counts are normalized with an add-one logarithm and kept; and a 30 nearest-neighbor graph is constructed based on 30 principal components in the PCA space for computing each cell’s first- and second-order expression moments, all using the scVelo methods:

scv.pp.filter_and_normalize(adata, min_shared_counts=20, n_top_genes=2000)
scv.pp.moments(adata, n_neighbors=30, n_pcs=30)

Note that in our experiment we additionally evaluated model performance by varying the number of highly variable genes (results shown in SI Appendix, Table S5). As the performances are either close to or worse than the 2,000-gene setting, following the suggested number by Bergen et al. (9), in this study we fix the number of highly variable genes to be 2,000.

Then, regarding the stochastic mode, we apply the default settings in the scVelo package and fit all the models with scv.tl.velocity(adata, mode="stochastic"). Similarly, we adopt all the default configurations of the package to fit scVelo dynamical models, except that we increase the maximum number of iterations from 10 to 100 for more thorough fitting and force all 2,000 genes to be involved in the estimation, which allows slight gains in quantitative performance metrics. The methods for fitting dynamical models are as follows:

scv.tl.recover_dynamics(adata, max_iter=100, var_names="all")
scv.tl.velocity(adata, mode="dynamical")

Baseline Methods

All the hyperparameters of the candidate models are summarized in SI Appendix, Table S2.

FA

FA (32) models the variance of observed variables with a potentially lower number of unobserved latent variables called factors. To make it comparable to our method, we fit FA on the concatenated matrices of transcriptome, spliced, and unspliced expression reads and use the same computation as Eqs. 7 and 9 to obtain the low-dimensional velocities.

PCA

PCA (33) projects high-dimensional data into a low-dimensional space with an orthogonal basis, while keeping most of the variance of the original data. To make it comparable to the proposed framework, as with FA, we fit PCA on the concatenated matrices of transcriptome, spliced, and unspliced reads and use the same strategy as Eqs. 7 and 9 to obtain the dimension-reduced velocities.

Standard autoencoder

It applies MLPs with single hidden layers to parameterize encoders and decoders, apart from which the same reconstruction task and projection strategy are adopted in fitting and applying the model.

Ablation: w/CohAgg

In this ablation configuration, the attentive combination module is replaced with a single feedforward neural layer without any hidden layer or nonlinearity. Such a configured decoding layer is best comparable to the global attention weights in Eq. 4, which allows us to compare the influence of attentive combination versus a standard decoder in input dimension reconstruction. This ablation model also shares the same reconstruction task and projection strategy with the proposed model.

Ablation: w/AttComb

In this ablation configuration the GCN-enabled cohort aggregation module is removed, which is done to study the impact of incorporating neighborhood information in encoding the input dimensions. Similarly, the same reconstruction task and projection strategy are also adopted in this configuration.

Implementation

We implemented the standard autoencoder, the proposed VeloAE, and the ablation models using PyTorch. The GCN layers of VeloAE and CohAgg are implemented using the pytorch-geometric package. We chose the Gaussian error linear unit as the nonlinear activation for any MLP with hidden layer(s) and used AdamW for optimizing the model parameters. The learning rate for each dataset is set to either $1 \times 10^{- 6}$ or $1 \times 10^{- 5}$ because, in practice, training losses on some datasets fluctuate more strongly and require a smaller rate to converge. We also adopt a learning rate decay strategy that diminishes the learning rate at a rate of 0.9 and resumes training from the last best result when a loss increases during training. The default temperature parameter τ of Gumbel-Softmax in the AttComb module is set to 1.0, which strongly constrains the latent dimensions so that attention is concentrated on a few dimensions, but it also makes training harder to converge to a lower loss value. Thus, on the dentate gyrus and scNTseq datasets, we set τ to 5.0, which is the middle value of the [1, 10] interval, where a temperature of 10 can make the softmax more evenly distributed (31). Other hyperparameters, such as the size of the low-dimensional space d_z and the number of epochs, are kept consistent across all relevant models and are listed in SI Appendix, Table S2. PCA is implemented with the scikit-learn package. All the neural network models are fitted with a single NVIDIA GTX 1080 Ti GPU.
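The decay-and-resume strategy can be sketched on a toy one-parameter problem. This is only one plausible reading of the description: the sketch assumes that the learning rate is multiplied by 0.9 and that a loss increase is judged against the best loss seen so far. The quadratic objective, the use of plain gradient descent instead of AdamW, and the name fit_with_decay are hypothetical.

```python
def fit_with_decay(loss_fn, grad_fn, w0, lr, n_steps, decay=0.9):
    """Gradient descent that shrinks lr and restores the best parameters on a loss increase."""
    w = w0
    best_w, best_loss = w0, loss_fn(w0)
    for _ in range(n_steps):
        w = w - lr * grad_fn(w)
        loss = loss_fn(w)
        if loss < best_loss:
            best_w, best_loss = w, loss    # new best result
        else:
            lr *= decay                    # diminish the learning rate
            w = best_w                     # resume from the last best result
    return best_w, best_loss, lr

def loss_fn(w):
    return (w - 3.0) ** 2                  # toy objective with minimum at w = 3

def grad_fn(w):
    return 2.0 * (w - 3.0)

# The initial rate is deliberately too large, so the first steps overshoot.
best_w, best_loss, final_lr = fit_with_decay(loss_fn, grad_fn, w0=0.0, lr=1.2, n_steps=200)
```

In this toy run the first two steps increase the loss, the rate is reduced twice, and training then converges from the restored starting point.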

Evaluation Metrics

Prior works usually rely on visual observations on the plotted velocity estimation results to analyze and evaluate velocity estimation methods. However, we argue that visual judgment of velocity estimation results may be misleading and easily fall short in differentiating results that are less visually evident. Quantitative metrics hence should be developed. Unfortunately, the only off-the-shelf metric available is scVelo’s coherence score that indicates a cell’s general velocity coherence with its immediate and second-order neighbors.

Coherence alone, however, cannot reliably demonstrate the correctness of velocity directions for cells with known differentiation order. To illustrate, it is very likely that the directions of velocities could be totally reversed against ground-truth differentiation orders, but still be coherent, thus retaining an unfair high score.

To allow for a more comprehensive evaluation, we propose below two types of metrics targeting direction correctness and velocity coherence, respectively. Note that the metric with the “cross-boundary” prefix requires ground-truth development directions for pairs of cell clusters as input, e.g., A → B. The statistics are computed for boundary cells, i.e., cells of type A with type B cells in their neighborhood. Formally, we define boundary cells indexed by the source cells (type A) and represent the set as

C_{A \to B} = \{c \in C_{A} \mid \exists c' \in C_{B} \cap N (c)\},

[11]

where C_A and C_B are sets of type A and type B cells, respectively, and $N (c)$ retrieves the neighboring cells of c. The $\exists$ condition filters out nonboundary cells.
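The boundary-cell set of Eq. 11 can be computed with a short Python function. The labels, neighbor lists, and function name below are hypothetical toy inputs; in practice the neighborhoods would come from the precomputed nearest-neighbor graph.

```python
def boundary_cells(labels, neighbors, src, dst):
    """Eq. 11: source-type cells with at least one target-type cell in their neighborhood."""
    return [c for c, label in enumerate(labels)
            if label == src and any(labels[n] == dst for n in neighbors[c])]

labels = ["A", "A", "B", "B"]                 # cell type of each cell
neighbors = [[1], [0, 2], [1, 3], [2]]        # neighbor indices of each cell

cells_a_to_b = boundary_cells(labels, neighbors, "A", "B")   # [1]
cells_b_to_a = boundary_cells(labels, neighbors, "B", "A")   # [2]
```

Cell 0 is of type A but has no type B neighbor, so it is filtered out; only cell 1 sits on the A → B boundary.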

CBDir (A → B) is designed for evaluating how likely, following its current velocity, a cell can develop to a target cell. If a cell’s velocity is correctly estimated, then the velocity direction should be consistent with the development trajectory of the cell. Note that both source and target cells are represented in a common vector space, and thus the development trajectory in a very short time period could be approximated simply with the location displacement from source to target cells in this space. Therefore, given a ground-truth developmental direction from cell type A to type B, an ideal velocity of a type A cell is expected to be consistent with its displacement to a type B cell. Specifically, we consider only cross-boundary cells that mimic cell development at a very short time, to prevent unexpected impacts of cell development within the same clusters; i.e., the development trajectory can hardly be approximated by simple displacement when a type B cell has developed to later stages greatly different from type A.

The formula for computing this score is

CBDir (c) = \frac{1}{| \{c' \in C_{B} \cap N (c)\} |} \sum_{c' \in C_{B} \cap N (c)} \frac{v_{c} \cdot (x_{c'} - x_{c})}{| v_{c} | \cdot | x_{c'} - x_{c} |},

where $x_{c'}$ and $x_{c}$ are vectors representing cells $c'$ and c in a low-dimensional space obtained via the uniform manifold approximation and projection (UMAP) algorithm (34), $x_{c'} - x_{c}$ is the cell displacement in this space, and $v_{c}$ is the decomposed UMAP velocity representation in the same space. In the scVelo package UMAP representations can be computed using the function scv.pl.velocity_embedding_stream.
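A minimal NumPy sketch of the CBDir computation is given below. Averaging the per-cell scores over all boundary cells is an assumption about how a transition-level score is obtained, and the toy embedding, velocities, and function name are hypothetical.

```python
import numpy as np

def cbdir(X, V, labels, neighbors, src, dst):
    """Mean cross-boundary direction score for the transition src -> dst."""
    scores = []
    for c, label in enumerate(labels):
        if label != src:
            continue
        targets = [n for n in neighbors[c] if labels[n] == dst]
        if not targets:
            continue                                   # c is not a boundary cell
        disp = X[targets] - X[c]                       # displacements to target cells
        denom = np.linalg.norm(disp, axis=1) * np.linalg.norm(V[c])
        cos = (disp @ V[c]) / np.maximum(denom, 1e-12)
        scores.append(cos.mean())                      # per-cell CBDir(c)
    return float(np.mean(scores)) if scores else float("nan")

X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])     # toy 2D embedding
labels = ["A", "A", "B"]
neighbors = [[1], [0, 2], [1]]
V_forward = np.array([[1.0, 0.0]] * 3)                 # velocities pointing from A to B
V_reversed = -V_forward                                # velocities pointing away from B

score_forward = cbdir(X, V_forward, labels, neighbors, "A", "B")
score_reversed = cbdir(X, V_reversed, labels, neighbors, "A", "B")
```

The forward velocities give a score of 1 and the reversed ones a score of -1, which is the distinction between correct and reversed directions that this metric is meant to expose.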

ICVCoh is computed with a cosine similarity scoring function between cell velocities within the same cluster. Specifically, the neighborhood of some cell c (e.g., of type A) is searched, and those cells within the same type (i.e., type A) are kept. Then, an average velocity similarity score for cell c is computed as

ICVCoh (c) = \frac{1}{| \{c' \in C_{A} \cap N (c)\} |} \sum_{c' \in C_{A} \cap N (c)} \frac{v_{c} \cdot v_{c'}}{| v_{c} | \cdot | v_{c'} |} .

Then the mean of the ICVCoh scores across all selected cells is taken as the output score.
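The ICVCoh score can be sketched in NumPy as follows. The sketch assumes that a cell's neighborhood excludes the cell itself; the toy velocities, labels, and function name are hypothetical.

```python
import numpy as np

def icvcoh(V, labels, neighbors, cluster):
    """Mean in-cluster velocity coherence over all cells of one cluster."""
    scores = []
    for c, label in enumerate(labels):
        if label != cluster:
            continue
        same = [n for n in neighbors[c] if labels[n] == cluster]
        if not same:
            continue                                   # no same-type neighbor to compare with
        denom = np.linalg.norm(V[same], axis=1) * np.linalg.norm(V[c])
        cos = (V[same] @ V[c]) / np.maximum(denom, 1e-12)
        scores.append(cos.mean())                      # per-cell ICVCoh(c)
    return float(np.mean(scores)) if scores else float("nan")

labels = ["A", "A", "A", "B"]
neighbors = [[1, 2], [0, 2, 3], [0, 1], [1]]
V_aligned = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
V_mixed = np.array([[1.0, 0.0], [-1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

coh_aligned = icvcoh(V_aligned, labels, neighbors, "A")
coh_reversed = icvcoh(-V_aligned, labels, neighbors, "A")
coh_mixed = icvcoh(V_mixed, labels, neighbors, "A")
```

Aligned and fully reversed velocities both score 1, which illustrates why coherence alone cannot establish direction correctness, whereas the mixed case scores lower.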

From the equation, it is evident that ICVCoh, like CBDir, is a local-neighborhood–based measurement, which first computes per-cell coherence on the immediate neighborhood before aggregating individual scores to yield a cluster- or dataset-level score. This mechanism allows ICVCoh to account for heterogeneous velocity within a cluster to a certain degree. Different from CBDir, this score does not require the prior knowledge of the transition direction between clusters, but rather it summarizes the consistency of the inferred directions for a cluster (e.g., a differentiation branch), regardless of the correctness.

Datasets, Processing, and Availability

To evaluate our methods, single-cell sequencing datasets of multiple different biological systems are enrolled: scNTseq (10), scEUseq (20), dentate gyrus neurogenesis (19), erythroid developmental datasets of mouse and human used in ref. 21, and pancreas dataset used in ref. 22. These datasets contain known differentiation orders for all or part of the cell clusters, with which we can validate both the coherence and correctness of estimated velocities. Note that the “cell clusters” in this study are more generally referring to different groups of cells whose transitional directions are known and studied in the original paper.

The datasets’ information relevant to our experiment is summarized in SI Appendix, Table S1, and the ground-truth orders of cell development in the datasets are further elaborated below:

The scNTseq dataset has time labels (in minutes) for each cell. The labels indicate length of stimulation on cells and hence their transition orders. We evaluate the velocities between immediate consecutive time points as 0 → 15, 15 → 30, 30 → 60, and 60 → 120, respectively. The preprocessed data (unspliced/spliced counts and metadata, see below) are directly obtained from the original paper (10).

In the dentate gyrus dataset the known cell differentiation direction is from OPC to OL cells, which serves as the only ground truth for evaluation. The preprocessed dataset is directly from the scVelo package (9).

The scEUseq dataset groups all cells into three monocle branches. Although there are no interbranch development orders known for applying our proposed metrics, cells within two branches (branches 1 and 2) exhibit unidirectional development trajectory. Therefore, correctly estimated velocities should be highly coherent within the two branching trajectories. We need only evaluate their in-branch coherence and then check visually whether the velocities are pointing from the branching start to end to know their correctness. The preprocessed data are directly obtained from the original paper (20).

The mouse erythroid dataset has erythroid cells annotated by stages 1, 2, and 3 of development, which serve as the order of ground-truth transition. The preprocessed data were requested from the authors of ref. 21, as was the human erythroid dataset.

The human erythroid dataset has erythroid cells annotated by early, middle, and late stages of development, which is exactly the order of ground-truth transition.

The pancreas dataset has multiple lineages of cell development. The initial cluster of endocrine progenitors (EPs) turns from low level of neurogenin 3 (Ngn3 low EP) to high level (Ngn3 high EP) and then to Fev+ state, before branching toward four terminal states, i.e., alpha, beta, epsilon, and delta cell fates. We evaluate velocity on these transition directions: Ngn3 low EP → Ngn3 high EP, Ngn3 high EP → Fev+, Fev+ → Alpha, Fev+ → Beta, Fev+ → Epsilon, Fev+ → Delta. The preprocessed data are directly from the CellRank package (22).

In addition to the above datasets, four more extreme datasets are enrolled to explore VeloAE’s potential applicability in conditions of technically low coverage, unexpected high proportion of unspliced reads, and biologically complex trajectories, including the intestinal epithelium dataset (23), the mouse bone marrow dataset (8), the human bone marrow dataset (24), and the endoderm progenitor reprogramming dataset (25). Because of the limitation of technical quality and ground-truth annotations, we did not include them for quantitative assessment like the above six ones, but visualize only the estimated cellular transitions in SI Appendix, Figs. S4–S7. The intestinal epithelium dataset is downloaded from Gene Expression Omnibus: GSE92332 (batch 1 and batch 2) and preprocessed with the velocyto command line with default settings. The endoderm progenitor reprogramming dataset was obtained through the CellRank package (22). All the other datasets, in preprocessed form, are directly obtained from the original papers, including the mouse bone marrow requested from the authors of ref. 8.

Supplementary Material

Supplementary File

pnas.2105859118.sapp.pdf (24.2 MB, PDF)

Acknowledgments

We thank Melania Barile, Berthold Göttgens, and Peter Kharchenko for kindly sharing the preprocessed erythroid and mouse bone marrow datasets. We acknowledge support from the University of Hong Kong and its Li Ka Shing Faculty of Medicine through a startup fund.

Footnotes

The authors declare no competing interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2105859118/-/DCSupplemental.

Data Availability

VeloAE is an open-source Python package available at GitHub, https://github.com/qiaochen/VeloAE. All the analysis notebooks for reproducing the results are also available in this repository. Previously published data were used for this work (8, 10, 19–25).

References

1. Trapnell C., Defining cell types and states with single-cell genomics. Genome Res. 25, 1491–1498 (2015).
2. Saelens W., Cannoodt R., Todorov H., Saeys Y., A comparison of single-cell trajectory inference methods. Nat. Biotechnol. 37, 547–554 (2019).
3. Qiu X., et al., Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 14, 979–982 (2017).
4. Haghverdi L., Büttner M., Wolf F. A., Buettner F., Theis F. J., Diffusion pseudotime robustly reconstructs lineage branching. Nat. Methods 13, 845–848 (2016).
5. Boukouvalas A., Hensman J., Rattray M., BGP: Identifying gene-specific branching dynamics from single-cell data with a branching Gaussian process. Genome Biol. 19, 65 (2018).
6. Schiebinger G., et al., Optimal-transport analysis of single-cell gene expression identifies developmental trajectories in reprogramming. Cell 176, 928–943.e22 (2019).
7. Gaidatzis D., Burger L., Florescu M., Stadler M. B., Analysis of intronic and exonic reads in RNA-seq data characterizes transcriptional and post-transcriptional regulation. Nat. Biotechnol. 33, 722–729 (2015). Correction in: Nat. Biotechnol. 3, 210 (2016).
8. La Manno G., et al., RNA velocity of single cells. Nature 560, 494–498 (2018).
9. Bergen V., Lange M., Peidli S., Wolf F. A., Theis F. J., Generalizing RNA velocity to transient cell states through dynamical modeling. Nat. Biotechnol. 38, 1408–1414 (2020).
10. Qiu Q., et al., Massively parallel and time-resolved RNA sequencing in single cells with scNT-seq. Nat. Methods 17, 991–1001 (2020).
11. Huang Y., Sanguinetti G., BRIE2: Computational identification of splicing phenotypes from single-cell transcriptomic experiments. Genome Biol. 22, 251 (2021).
12. Buettner F., Pratanwanich N., McCarthy D. J., Marioni J. C., Stegle O., f-scLVM: Scalable and versatile factor analysis for single-cell RNA-seq. Genome Biol. 18, 212 (2017).
13. Eraslan G., Simon L. M., Mircea M., Mueller N. S., Theis F. J., Single-cell RNA-seq denoising using a deep count autoencoder. Nat. Commun. 10, 390 (2019).
14. Lopez R., Regier J., Cole M. B., Jordan M. I., Yosef N., Deep generative modeling for single-cell transcriptomics. Nat. Methods 15, 1053–1058 (2018).
15. Wang D., Gu J., VASC: Dimension reduction and visualization of single-cell RNA-seq data by deep variational autoencoder. Genomics Proteomics Bioinformatics 16, 320–331 (2018).
16. Bengio Y., Courville A., Vincent P., Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1798–1828 (2013).
17. Bahdanau D., Cho K., Bengio Y., “Neural machine translation by jointly learning to align and translate” in 3rd International Conference on Learning Representations (ICLR, 2015).
18. Chaudhari S., Mithal V., Polatkan G., Ramanath R., An attentive survey of attention models. ACM Trans. Intell. Syst. Technol. 12, 1–32 (2021).
19. Hochgerner H., Zeisel A., Lönnerberg P., Linnarsson S., Conserved properties of dentate gyrus neurogenesis across postnatal development revealed by single-cell RNA sequencing. Nat. Neurosci. 21, 290–299 (2018).
20. Battich N., et al., Sequencing metabolically labeled transcripts in single cells reveals mRNA turnover strategies. Science 367, 1151–1156 (2020).
21. Barile M., et al., Coordinated changes in gene expression kinetics underlie both mouse and human erythroid maturation. Genome Biol. 22, 197 (2021).
22. Lange M., et al., CellRank for directed single-cell fate mapping. BioRxiv [Preprint] (2020). https://www.biorxiv.org/content/10.1101/2020.10.19.345983v2 (Accessed 10 February 2021).
23. Haber A. L., et al., A single-cell survey of the small intestinal epithelium. Nature 551, 333–339 (2017).
24. Bergen V., Soldatov R. A., Kharchenko P. V., Theis F. J., RNA velocity-current challenges and future perspectives. Mol. Syst. Biol. 17, e10282 (2021).
25. Biddy B. A., et al., Single-cell mapping of lineage and identity in direct reprogramming. Nature 564, 219–224 (2018).
26. Argelaguet R., Cuomo A. S. E., Stegle O., Marioni J. C., Computational principles and challenges in single-cell data integration. Nat. Biotechnol. 39, 1202–1215 (2021).
27. Burges C., Dimension Reduction: A Guided Tour (Foundations and Trends in Machine Learning, Now Publishers, 2010).
28. Moon K. R., et al., Manifold learning-based methods for analyzing single-cell RNA-sequencing data. Curr. Opin. Syst. Biol. 7, 36–46 (2018).
29. Hinton G. E., Salakhutdinov R. R., Reducing the dimensionality of data with neural networks. Science 313, 504–507 (2006).
30. Kipf T. N., Welling M., “Semi-supervised classification with graph convolutional networks” in 5th International Conference on Learning Representations (ICLR, 2017).
31. Jang E., Gu S., Poole B., “Categorical reparameterization with Gumbel-Softmax” in 5th International Conference on Learning Representations (ICLR, 2017).
32. Harman H., Modern Factor Analysis (University of Chicago Press, 1976).
33. Jolliffe I., “Principal component analysis” in International Encyclopedia of Statistical Science, Lovric M., Ed. (Springer Berlin Heidelberg, Berlin, Germany, 2011), pp. 1094–1096.
34. McInnes L., Healy J., Saul N., Grossberger L., UMAP: Uniform manifold approximation and projection. J. Open Source Softw. 3, 861 (2018).

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary File

pnas.2105859118.sapp.pdf^{(24.2MB, pdf)}
