This is a preprint.

It has not yet been peer reviewed by a journal.

[Preprint]. 2024 Jul 31:2024.03.26.586728. Originally published 2024 Mar 28. [Version 2] doi: 10.1101/2024.03.26.586728

The art of seeing the elephant in the room: 2D embeddings of single-cell data do make sense

Jan Lause, Dmitry Kobak, Philipp Berens
PMCID: PMC10996625  PMID: 38585748

Abstract

A recent paper in PLOS Computational Biology (Chari and Pachter, 2023) claimed that t-SNE and UMAP embeddings of single-cell datasets fail to capture true biological structure. The authors argued that such embeddings are as arbitrary and as misleading as forcing the data into an elephant shape. Here we show that this conclusion was based on inadequate and limited metrics of embedding quality. More appropriate metrics quantifying neighborhood and class preservation reveal the elephant in the room: while t-SNE and UMAP embeddings of single-cell data do not preserve high-dimensional distances, they can nevertheless provide biologically relevant information.
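
The abstract does not say how neighborhood and class preservation are measured, so the following is only a minimal sketch of two commonly used choices, assuming that neighborhood preservation is scored as k-nearest-neighbor recall and class preservation as leave-one-out kNN classification accuracy in the embedding. The function names (`knn_indices`, `knn_recall`, `knn_accuracy`), the value of `k`, and the synthetic data are all hypothetical and are not taken from the paper; the "embedding" here is a plain 2D projection standing in for a t-SNE or UMAP result.

```python
import numpy as np


def knn_indices(X, k):
    """Indices of the k nearest neighbors (Euclidean) of each row, excluding the row itself."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared pairwise distances
    np.fill_diagonal(d2, np.inf)                      # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def knn_recall(X_high, X_low, k=10):
    """Mean fraction of each point's k high-dimensional neighbors kept among its k embedding neighbors."""
    nn_high = knn_indices(X_high, k)
    nn_low = knn_indices(X_low, k)
    overlap = [len(set(a) & set(b)) for a, b in zip(nn_high, nn_low)]
    return float(np.mean(overlap) / k)


def knn_accuracy(X_low, labels, k=10):
    """Leave-one-out kNN classification accuracy in the embedding (majority vote over neighbors)."""
    labels = np.asarray(labels)
    nn = knn_indices(X_low, k)
    correct = 0
    for i, neigh in enumerate(nn):
        values, counts = np.unique(labels[neigh], return_counts=True)
        correct += int(values[np.argmax(counts)] == labels[i])
    return correct / len(labels)


# Synthetic stand-in data (hypothetical): two well-separated classes in 50 dimensions.
rng = np.random.default_rng(0)
X_high = rng.normal(size=(100, 50))
X_high[50:, :2] += 20.0                # shift the second class along two axes
labels = np.array([0] * 50 + [1] * 50)
X_low = X_high[:, :2]                  # crude 2D "embedding", NOT t-SNE or UMAP

recall = knn_recall(X_high, X_low, k=10)
accuracy = knn_accuracy(X_low, labels, k=10)
print(f"kNN recall: {recall:.2f}, kNN accuracy: {accuracy:.2f}")
```

In this toy setup the two scores can diverge: an embedding may lose many exact nearest neighbors while still keeping the classes apart, which is the kind of distinction that a distance-preservation metric alone would not show.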


Articles from bioRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints