Skip to main content

This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

bioRxiv logoLink to bioRxiv
[Preprint]. 2024 Jul 4:2024.03.23.586420. Originally published 2024 Mar 27. [Version 2] doi: 10.1101/2024.03.23.586420

Accelerated dimensionality reduction of single-cell RNA sequencing data with fastglmpca

Eric Weine, Peter Carbonetto, Matthew Stephens
PMCID: PMC10996495  PMID: 38585920

Abstract

Summary

Motivated by theoretical and practical issues that arise when applying Principal Components Analysis (PCA) to count data, Townes et al introduced “Poisson GLM-PCA”, a variation of PCA adapted to count data, as a tool for dimensionality reduction of single-cell RNA sequencing (RNA-seq) data. However, fitting GLM-PCA is computationally challenging. Here we study this problem, and show that a simple algorithm, which we call “Alternating Poisson Regression” (APR), produces better quality fits, and in less time, than existing algorithms. APR is also memory-efficient, and lends itself to parallel implementation on multi-core processors, both of which are helpful for handling large single-cell RNA-seq data sets. We illustrate the benefits of this approach in two published single-cell RNA-seq data sets. The new algorithms are implemented in an R package, fastglmpca.

Availability and implementation

The fastglmpca R package is released on CRAN for Windows, macOS and Linux, and the source code is available at github.com/stephenslab/fastglmpca under the open source GPL-3 license. Scripts to reproduce the results in this paper are also available in the GitHub repository.

Contact

mstephens@uchicago.edu

Supplementary information

Supplementary data are available on BioRxiv online.

Full Text Availability

The license terms selected by the author(s) for this preprint version do not permit archiving in PMC. The full text is available from the preprint server.


Articles from bioRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints

RESOURCES