[Preprint]. 2024 Aug 7:rs.3.rs-4819117. [Version 1] doi: 10.21203/rs.3.rs-4819117/v1

Figure 1. sciRED overview.

A) sciRED comprises four main steps: factor discovery, factor interpretation, factor evaluation and biological characterization. B) In the factor discovery step, a Poisson generalized linear model is fitted to the data to remove technical covariates, and the residuals are extracted and factorized using PCA. The resulting score and loading matrices are then rotated for enhanced interpretability. The score matrix represents the projection of the original data onto the new factor space, illustrating the relationship between cells and factors. Each entry in this matrix reflects how strongly a given cell scores on a given factor. The loading matrix contains the weights or coefficients that define the factors as linear combinations of the original genes. These weights can be used to rank genes according to their contribution to each factor, facilitating further interpretation of the factors. C) Factor interpretation uses an ensemble classifier to match factors with given covariates, generating a Factor-Covariate Association (FCA) table. Covariate-matched factors are identified by applying a threshold derived from the distribution of all FCA scores. Unannotated factors may capture novel biological processes or other covariates. D) Factor-interpretability scores (FIS) are computed for each factor. E) The top genes and enriched pathways associated with a selected factor are identified for manual interpretation.
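
The factor discovery step in panel B can be illustrated with a minimal NumPy sketch. It is not the sciRED implementation and rests on several assumptions not stated in the caption: the residuals are taken to be Pearson residuals, the rotation is taken to be varimax, one GLM is fitted per gene by iteratively reweighted least squares, and the first column of the design matrix is an intercept. The function names (`pearson_residuals`, `pca`, `varimax`) are hypothetical.

```python
import numpy as np

def pearson_residuals(counts, design, n_iter=25, tol=1e-8):
    """Fit a log-link Poisson GLM per gene by IRLS; return Pearson residuals.

    counts: (n_cells, n_genes) non-negative counts.
    design: (n_cells, n_covariates) technical covariates; column 0 is assumed
            to be an intercept (an assumption of this sketch).
    """
    n_cells, n_genes = counts.shape
    resid = np.empty((n_cells, n_genes), dtype=float)
    for g in range(n_genes):
        y = counts[:, g].astype(float)
        beta = np.zeros(design.shape[1])
        beta[0] = np.log(y.mean() + 1e-8)        # start from the intercept-only fit
        for _ in range(n_iter):
            eta = np.clip(design @ beta, -30, 30)
            mu = np.exp(eta)
            z = eta + (y - mu) / mu              # working response
            WX = design * mu[:, None]            # IRLS weights equal mu for Poisson
            new_beta = np.linalg.solve(design.T @ WX, WX.T @ z)
            converged = np.max(np.abs(new_beta - beta)) < tol
            beta = new_beta
            if converged:
                break
        mu = np.exp(np.clip(design @ beta, -30, 30))
        resid[:, g] = (y - mu) / np.sqrt(mu)     # Pearson residual
    return resid

def pca(matrix, n_factors):
    """PCA via SVD; returns cell scores and gene loadings."""
    centered = matrix - matrix.mean(axis=0)
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_factors] * S[:n_factors]    # (n_cells, n_factors)
    loadings = Vt[:n_factors].T                  # (n_genes, n_factors)
    return scores, loadings

def varimax(loadings, max_iter=100, tol=1e-6):
    """Return the orthogonal varimax rotation matrix for a loading matrix."""
    p, k = loadings.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        grad = loadings.T @ (L**3 - L @ np.diag((L**2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt
        d_new = s.sum()
        if d_new < d * (1 + tol):                # criterion stopped improving
            break
        d = d_new
    return R

def top_genes(rotated_loadings, factor, n=10):
    """Indices of the n genes with the largest absolute loading on a factor."""
    return np.argsort(-np.abs(rotated_loadings[:, factor]))[:n]
```

Given a count matrix and a design matrix, the steps chain as `resid = pearson_residuals(counts, design)`, `scores, loadings = pca(resid, k)`, `R = varimax(loadings)`, after which `scores @ R` and `loadings @ R` are the rotated score and loading matrices. Because `R` is orthogonal, the same rotation applied to both matrices leaves their product unchanged, so the rotation changes how the factors are expressed without changing the low-rank fit.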