2021 May 17;118(22):e2100293118. doi: 10.1073/pnas.2100293118

Published under the PNAS license.

Fig. 1. Schematic demonstration of DA-seq. (A) Illustration of the DA-seq algorithm. DA-seq detects differentially abundant (DA) subpopulations by analyzing cells from two biological states. The input of the algorithm is the union of the data from the two states after initial dimension reduction. Step 1: Computing a multiscale score vector for each cell, based on its $k$-nearest neighbors ($k$NN), for several values of $k$ (e.g., $k = 4, 8, 12$). Step 2: Training a logistic classifier to predict the biological state of each cell from its multiscale score vector, which yields a single DA measure per cell. The algorithm retains only cells whose DA measure is above a threshold $\tau_h$ or below a threshold $\tau_l$, since these cells may reside in DA subpopulations. Step 3: Clustering the cells retained in step 2 to obtain contiguous DA subpopulations above a predefined size. These subpopulations are denoted DA1, DA2, and DA3. The degree of their differential abundance is quantified by a DA score (SI Appendix, Note 1). Step 4: Detecting subsets of genes that characterize each of the DA subpopulations. For example, the genes G7 and G8 characterize DA3. (B) Standard clustering analysis vs. DA-seq. (Left) Cluster information obtained through standard clustering analysis. (Center) DA subpopulations identified through DA-seq. (Right) Normalized differential abundance of DA subpopulations and clusters, represented by the DA score.
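
Steps 1 and 2 of the caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several choices are assumptions made only for illustration: each entry of the multiscale score is taken to be the fraction of a cell's $k$ nearest neighbors that come from state 1, the classifier is a plain gradient-descent logistic regression, and the function names (`multiscale_scores`, `fit_logistic`, `da_candidates`) and the default thresholds `tau_l = 0.2` and `tau_h = 0.8` are hypothetical. Step 3 (clustering the retained cells) and step 4 (gene detection) are omitted.

```python
import numpy as np

def multiscale_scores(X, labels, ks=(4, 8, 12)):
    """Step 1 (sketch): one score per cell and per k.

    Assumption: the score for a given k is the fraction of the cell's
    k nearest neighbors that belong to state 1 (labels are 0 or 1).
    """
    # Brute-force pairwise squared Euclidean distances (fine for small n).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)        # a cell is not its own neighbor
    order = np.argsort(d2, axis=1)      # neighbor indices, nearest first
    return np.column_stack(
        [labels[order[:, :k]].mean(axis=1) for k in ks]
    )

def fit_logistic(S, y, lr=0.5, n_iter=2000):
    """Step 2 (sketch): logistic regression fitted by gradient descent.

    Returns the predicted probability of state 1 for every cell, used
    here as the single DA measure.
    """
    A = np.column_stack([np.ones(len(S)), S])   # add intercept column
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ w)))
        w -= lr * (A.T @ (p - y)) / len(y)      # mean log-loss gradient
    return 1.0 / (1.0 + np.exp(-(A @ w)))

def da_candidates(X, labels, ks=(4, 8, 12), tau_l=0.2, tau_h=0.8):
    """Keep cells whose DA measure is above tau_h or below tau_l."""
    S = multiscale_scores(X, labels, ks)
    measure = fit_logistic(S, labels.astype(float))
    keep = (measure > tau_h) | (measure < tau_l)
    return measure, keep
```

Here `X` would be the dimension-reduced coordinates of the pooled cells (one row per cell) and `labels` a 0/1 array giving each cell's biological state. Cells in well-mixed regions receive scores near the overall state-1 proportion and fall between the two thresholds, so they are dropped, while cells in regions dominated by one state are retained for clustering.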