2025 Aug 22;8:1264. doi: 10.1038/s42003-025-08695-4

Table 1.

Transcriptional features commonly or potentially used to identify malignant cells from scRNA-seq data

Commonly used features

| Feature/aberration | Readout | Comments |
| --- | --- | --- |
| Expression of cell-of-origin marker genes | Gene signature score | Not sufficient to distinguish between normal and malignant cells of the same type; usually combined with other features |
| Inter-patient tumor heterogeneity | Index of cluster mixing (e.g., LISI score, entropy) | Requires multiple samples; may be confounded by batch effects |
| Copy-number alterations | Copy-number profile/aneuploidy score | Requires a reference of “normal” ploidy; will not detect malignant cells without chromosomal alterations |

Supporting features

| Feature/aberration | Readout | Comments |
| --- | --- | --- |
| Single-nucleotide alterations and mutational burden | Mutations in known sites/total number of mutations | Works best when combined with WES of matched samples; limited by low coverage of scRNA-seq technologies |
| Formation of fusion transcripts | Expression of fused genes | Specific to individual cancer types; limited by low coverage of scRNA-seq technologies |
| Sustained proliferation | Signature score for cycling gene sets | Commonly measured as cycling enrichment by cluster |
| Pathway dysregulation | Signature score for altered pathway | Specific to individual cancer types |

Potentially discriminating features

| Feature/aberration | Readout | Comments |
| --- | --- | --- |
| MHC downregulation | Signature score for antigen-presenting machinery | Specific to individual cancer types, TMEs, or individual cancer sub-clones |
| Overexpression of checkpoint molecules | Checkpoint ligand expression | Limited evidence in scRNA-seq |
| Expression of telomerase subunits | Gene or signature score | Limited evidence in scRNA-seq |
| Metabolic signatures | Signature score | Adjacent normal cells may exhibit similar alterations |
| Pro-angiogenic signaling | Gene or signature score | Limited evidence in scRNA-seq |
| Drivers of invasion (EMT) | Signature score | Intermediate EMT states may be difficult to capture |
| Oncofetal reprogramming | Gene or signature score | Specific to individual cancer types |
| Number of unique expressed genes | Gene count | Can be confounded by heterogeneous sequencing depth |
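
Three of the readouts listed in Table 1 (signature score, entropy-based index of cluster mixing, and gene count) are simple enough to sketch in a few lines. The Python below is a minimal illustration under stated assumptions, not the method of any particular tool: the function names, the z-score-based scoring, and the normalization of entropy to the range 0 to 1 are all choices made here for illustration. It does not implement LISI. In the usual reading of the cluster-mixing feature, clusters dominated by a single patient (entropy near 0) are candidates for malignant cells, while well-mixed clusters (entropy near 1) tend to be non-malignant, subject to the batch-effect caveat in the table.

```python
import numpy as np


def signature_score(expr, gene_names, signature):
    """Mean z-scored expression of the signature genes, per cell.

    expr: cells x genes array of (log-)normalized expression.
    This is a simplified score chosen for illustration (assumption);
    published tools typically also subtract a control gene set.
    """
    expr = np.asarray(expr, dtype=float)
    wanted = set(signature)
    idx = [i for i, g in enumerate(gene_names) if g in wanted]
    if not idx:
        raise ValueError("no signature genes found in gene_names")
    sub = expr[:, idx]
    sd = sub.std(axis=0)
    sd[sd == 0] = 1.0  # avoid division by zero for constant genes
    return ((sub - sub.mean(axis=0)) / sd).mean(axis=1)


def patient_entropy_by_cluster(clusters, patients):
    """Normalized Shannon entropy of patient labels within each cluster.

    Returns a dict: cluster label -> value in [0, 1], where 0 means the
    cluster contains cells from one patient only and 1 means all
    patients contribute equally.
    """
    clusters = np.asarray(clusters)
    patients = np.asarray(patients)
    n_patients = len(np.unique(patients))
    result = {}
    for c in np.unique(clusters):
        _, counts = np.unique(patients[clusters == c], return_counts=True)
        p = counts / counts.sum()
        h = max(0.0, float(-(p * np.log(p)).sum()))
        result[c.item()] = h / np.log(n_patients) if n_patients > 1 else 0.0
    return result


def genes_detected(counts):
    """Number of genes with at least one count in each cell."""
    return (np.asarray(counts) > 0).sum(axis=1)


# Toy data; gene names and labels are illustrative only.
genes = ["EPCAM", "PTPRC", "ACTB"]
expr = [[5, 0, 1], [4, 0, 1], [0, 3, 1], [0, 4, 1]]
epithelial_score = signature_score(expr, genes, ["EPCAM"])

clusters = ["a", "a", "a", "a", "b", "b", "b", "b"]
patients = ["p1", "p1", "p1", "p1", "p1", "p2", "p1", "p2"]
mixing = patient_entropy_by_cluster(clusters, patients)

n_genes = genes_detected([[0, 1, 2], [0, 0, 3]])
```

In the toy data, cluster "a" contains cells from a single patient and cluster "b" is evenly split between two, so the mixing index separates them cleanly. As the table notes, none of these readouts is sufficient on its own: gene count depends on sequencing depth, and a cell-of-origin signature cannot distinguish normal from malignant cells of the same type.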