Abstract
Cancer treatment failure is often attributed to tumor heterogeneity, where diverse malignant cell clones exist within a patient. Despite a growing understanding of heterogeneous tumor cells depicted by single-cell RNA sequencing (scRNA-seq), there is still a gap in the translation of such knowledge into treatment strategies tackling the pervasive issue of therapy resistance. In this review, we survey methods leveraging large-scale drug screens to generate cellular sensitivities to various therapeutics. These methods enable efficient drug screens in scRNA-seq data and serve as the bedrock of drug discovery for specific cancer cell groups. We envision that they will become an indispensable tool for tailoring patient care in the era of heterogeneity-aware precision medicine.
Introduction
In an ongoing effort to curb cancer morbidity and mortality, computational methods have shown great potential in aiding drug discovery.1,2 Building on artificial intelligence (AI) principles, including both classical machine learning (ML) and deep learning (DL) frameworks, many computational approaches leverage high-throughput drug screens (HTS) on cancer cell lines (CCLs) to infer sensitivities to various drugs in a target dataset.1 These HTS data, in combination with CCL molecular profiles, offer insights into connections between CCL expression profiles and drug response phenotypes. Models trained on these data can be applied to target expression datasets (e.g., patient tumors) to predict vulnerabilities to various anti-cancer drugs and thereby guide drug selection in specific settings.
In recent years, a growing number of single-cell RNA-seq (scRNA-seq) studies have suggested that heterogeneous tumors, in which cancer cells display temporal and spatial diversity, are causally related to a critical challenge in cancer treatment: disease progression through drug resistance.3 Within a heterogeneous tumor, various subpopulations of cells, each with distinct genetic and phenotypic traits, can coexist. When exposed to therapeutic interventions, these diverse cell subclones may respond differently, with some intrinsically resistant to treatment. The selective pressure exerted by the therapy can lead to the proliferation of these therapy-resistant subclones, ultimately contributing to treatment failure.4 The pivotal role of tumor heterogeneity therefore necessitates cell type-aware drug discovery that targets problematic cell groups within patient tumors and provides treatment opportunities with clinical impact. In this context, computational frameworks are expected to enable efficient early development by generating drug response profiles at the single-cell (SC) level.5 However, because 1) large-scale CCL expression profiles are systematically surveyed by bulk RNA-seq, which captures an averaged estimate across within-tumor cell subtypes, and 2) bulk RNA-seq and scRNA-seq techniques differ fundamentally, specialized methods are needed to utilize large-scale drug screens of CCLs to infer drug activity at the SC level.
A key requirement of such methods is the successful application of bulk-learned drug-gene relationships to SC datasets.6 This involves converting relationships learned between CCL molecular profiles and drug response into references from which cellular drug sensitivity can be generated for scRNA-seq data. Several transfer learning approaches have been proposed to facilitate this process, including data integration through matrix factorization, variational autoencoder networks, and biomarker/signature-based frameworks.
Successful prediction of cellular response is the keystone of cell type-aware drug discovery. It will facilitate the development of therapeutics that target therapy-resistant cells. Furthermore, such specialized treatments can be combined with standard-of-care (SOC) therapies to reach all malignant cells and help eliminate heterogeneous tumors. In this review, we present computational methods that are designed to infer drug activity at the SC level. We focus on how inferring cellular drug response facilitates pre-clinical research by enabling hypothesis generation and providing actionable drug candidates. We also reason that predicted cellular drug response can be used to design combination therapies that target heterogeneous tumors and help achieve cures.
Literature Data Collection
Harzing’s Publish or Perish software was used to query Google Scholar for work published between 2020 and 2023, pulling the first 200 entries whose titles contained the keyword ‘single cell drug’ or ‘single cell therapeutic’.7 As shown in Figure 1, a total of 374 entries were identified and subsequently filtered to remove review articles, abstracts, and preprint archives. Duplicate papers were removed, as were papers whose title or abstract did not contain the keyword ‘drug response’. From the 20 papers that remained after filtering, we focused on eight publications that described computational drug response prediction at the single-cell level through integration with bulk RNA-seq data. A summary of these works is depicted in Figure 2; they include neural network/deep learning methods (scDEAL8, SCAD9), biomarker- or signature-based methods (DREEP10,11, Beyondcell12, ASGARD13, and scDR14), and traditional machine learning approaches (CaDRReS-Sc15, scDrug16).
Figure 1. Literature selection workflow.
Figure 2. Overview of single-cell drug sensitivity prediction methods.
HTS drug screens and CCL molecular data sources
All eight published works reviewed here were developed to infer drug response at the single-cell level using a collection of CCL HTS databases. To date, the largest publicly available HTS screening efforts are the Sanger Institute’s Genomics of Drug Sensitivity in Cancer (GDSC),17 the Broad Institute’s Cancer Therapeutics Response Portal (CTRP),18 and the PRISM Repurposing dataset (PRISM).19 In addition, the Library of Integrated Network-Based Cellular Signatures (LINCS) consortium has generated encyclopedic profiles of cellular response (termed cellular signatures) under drug perturbations.20 The cell lines and compounds included in these datasets represent a variety of cancers and molecular targets.21 The CTRP evaluates 481 drugs on 860 CCLs; the GDSC tests roughly 300 to 400 drugs across two initiatives; and PRISM expands the screened drugs to include various non-oncology molecules and covers thousands of treatments. The LINCS data contain CCL perturbation profiles under tens of thousands of small molecules. For CCL transcriptomic data, the Broad Institute’s Cancer Cell Line Encyclopedia (CCLE) provides comprehensive RNA-seq profiles of approximately 1000 CCLs.22 In addition, the GDSC platform measures CCL gene expression through microarray probes. Together, CCL molecular features and drug response phenotypes provide a gateway for unearthing drug-gene relationships and constructing predictive models. To describe drug sensitivity, HTS data include summary statistics in the form of half-maximal inhibitory concentration (IC50) values in GDSC and area-under-the-dose-response-curve (AUC) values in CTRP and PRISM.
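To make the two summary statistics concrete, the following minimal sketch derives a normalized AUC and an interpolated IC50 from a measured dose-response series. It is an illustration only: the function names and the example doses are hypothetical, and the trapezoidal rule and log-linear interpolation are simplifying assumptions; GDSC, CTRP, and PRISM each fit their own dose-response curve models.

```python
import math

def normalized_auc(concs, viability):
    """Trapezoidal area under viability vs. log10(dose), scaled to [0, 1].

    Lower values indicate stronger killing across the tested dose range.
    """
    x = [math.log10(c) for c in concs]
    area = sum((x[i + 1] - x[i]) * (viability[i] + viability[i + 1]) / 2
               for i in range(len(x) - 1))
    return area / (x[-1] - x[0])

def interpolated_ic50(concs, viability):
    """Dose at which viability first crosses 50% (log-linear interpolation)."""
    for i in range(len(concs) - 1):
        v0, v1 = viability[i], viability[i + 1]
        if v0 >= 0.5 > v1:
            frac = (v0 - 0.5) / (v0 - v1)
            x0, x1 = math.log10(concs[i]), math.log10(concs[i + 1])
            return 10 ** (x0 + frac * (x1 - x0))
    return None  # 50% inhibition is not reached within the tested doses

# Hypothetical example: four doses (in µM) and the fraction of viable cells.
doses = [0.01, 0.1, 1.0, 10.0]
viable = [1.0, 0.9, 0.4, 0.1]
print(normalized_auc(doses, viable), interpolated_ic50(doses, viable))
```

Note that an IC50 is undefined when a drug never reaches 50% inhibition in the tested range, whereas an AUC can always be computed, which is one practical difference between the two statistics.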
SC drug response prediction methods
The aforementioned HTS and CCL data constitute the majority of data sources used by computational methods to derive drug-gene relationships. A breakdown of the specific data used by each SC drug response prediction method is provided in Table 1. Since the CCL transcriptomic profiles were analyzed at the bulk level, where expression of a gene is aggregated over the entire sample, learned drug-gene information cannot be directly applied to scRNA-seq data for meaningful drug response projection. To overcome this, several methods attempt to unify bulk and SC data to maximize similarities between the two sources. Other methods instead seek to identify informative markers from bulk CCL data and subsequently apply these markers to SC data to infer cellular drug response. An overview of the key methodologies used in these works is given in Table 1 and discussed in detail below.
Table 1. Key aspects of SC drug response prediction methods.
| Original Method | HTS Dataset Used | Feature and Biomarker Selection | Data Integration Approach | Drug Response Statistics | Drugs Used for Benchmarking |
|---|---|---|---|---|---|
| scDEAL (https://github.com/OSU-BMBL/scDEAL) | GDSC | Selects variable genes. Detects critical genes for drug response. | Neural networks | Converts AUC to binary labels of sensitive and resistant | Cisplatin, Docetaxel, Erlotinib, Gefitinib, I-BET-762 |
| SCAD | GDSC | Extracts invariant features between bulk and SC domains | Neural networks | Converts IC50 to binary labels | Afatinib, AR-42, Cetuximab, Gefitinib, NVP-TAE684, PLX4720, Sorafenib, Vorinostat |
| CaDRReS-Sc (https://github.com/CSB5/CaDRReS-Sc) | GDSC | Adopts a predefined essential gene set | Matrix factorization | Uses IC50-derived binary labels and refits drug response curves | Docetaxel, Doxorubicin, Epothilone B, Gefitinib, Obatoclax Mesylate, PHA-793887, PI-103, Vorinostat |
| Beyondcell (https://github.com/cnio-bu/beyondcell) | GDSC, CTRP, LINCS | Identifies drug response biomarkers from bulk data | Relies on bulk-based biomarkers | Calculates a unit-free signature score | 321 anticancer drugs from Ben-David et al.23 |
| scDR | CTRP | Identifies drug response biomarkers from bulk data | Relies on bulk-based biomarkers | Calculates a unit-free signature score | 77 FDA-approved drugs |
| DREEP (https://github.com/gambalab/DREEP) | CTRP | Identifies drug response biomarkers from bulk data | Relies on bulk-based biomarkers | Calculates enrichment scores via GSEA | 450 drugs from the CTRP (bulk level) |
| ASGARD (https://github.com/lanagarmire/Asgard) | LINCS | Identifies genes altered by drug perturbations in bulk data | Relies on bulk-based perturbation biomarkers | Calculates a customized score based on signature reversion | 150 drugs from different diseases. Regards a drug as positive if it is: 1) FDA-approved, 2) used in advanced clinical trials, or 3) proven effective in animal models |
scDrug adopts CaDRReS-Sc in its entirety and is thus not listed here as an original approach.
Neural Network/DL methods
A number of deep learning (DL) methods based on neural networks have been proposed to calibrate drug response modeling for more precise predictions.24–26 Adaptations of these methods for predicting SC drug sensitivities have been discussed and implemented in recent years.6 Such computational pipelines leverage the abundance of scRNA-seq data as DL algorithms often include many parameters and consume a lot of data for adequate training of these parameters. Aided by flexibility in finding latent space and features, specialized neural networks are designed to minimize distributional discrepancies between the input bulk RNA-seq and scRNA-seq sources, such that drug-gene relationships learned solely from bulk RNA-seq can be meaningfully applied to the target scRNA-seq dataset.
To this end, SCAD adopts an adversarial learning approach,9 training a domain discriminator to counter cross-domain bias between the two data sources. This forces extraction of features that are invariant across the bulk and SC RNA-seq domains, thereby integrating the two sources. For drug response learning, bulk sample labels from GDSC are binarized and used to supervise a prediction network, which minimizes a binary cross-entropy (BCE) loss and generates predicted labels in scRNA-seq data.9 scDEAL has similar functional components but employs denoising autoencoders for feature selection from bulk and SC RNA-seq data.8 As is typical of autoencoders, the input dataset is compressed into a “bottleneck” on which data integration is performed. To facilitate transfer learning, a loss function incorporating the maximum mean discrepancy (MMD), a measure of distance between probability distributions, is minimized to retain similarity between bulk and SC data. scDEAL also uses binary bulk sample drug response to train a prediction network by minimizing the BCE. Ultimately, both methods assemble multiple specialized DL networks to accomplish bulk-to-SC drug label prediction. DL models have the potential to account for heterogeneous scRNA-seq data and provide fine-grained cellular drug sensitivity labels, thanks to their flexibility and capacity for learning complex relationships. However, it is unclear whether dichotomization of continuous drug response measurements always preserves pharmacological meaning, as response is better characterized as a spectrum.27 Also, optimal model structures can be drug dependent, and learning such structures often involves intensive computation, which can limit their practicality for hypothesis generation and early development within the pharmacological research community. Both methods require neural network parameter tuning for each drug to calibrate an optimal structure for transfer learning using measured drug sensitivity. This may consequently limit their feasibility for evaluating many drugs or conducting large-scale drug screens.
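The two loss terms named above can be sketched in a few lines. The example below is a simplified illustration, not the implementation of scDEAL or SCAD: it computes a biased estimate of the squared MMD with a Gaussian (RBF) kernel between bulk and SC latent vectors, a BCE on binarized bulk labels, and their weighted sum. The kernel choice, the weight `lam`, and all function names are assumptions made for this sketch.

```python
import math

def rbf(u, v, gamma=1.0):
    """Gaussian (RBF) kernel between two latent vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def mmd_squared(X, Y, gamma=1.0):
    """Biased estimate of the squared maximum mean discrepancy between X and Y."""
    def mean_kernel(A, B):
        return sum(rbf(a, b, gamma) for a in A for b in B) / (len(A) * len(B))
    return mean_kernel(X, X) + mean_kernel(Y, Y) - 2 * mean_kernel(X, Y)

def bce(labels, probs, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        total += y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
    return -total / len(labels)

def transfer_loss(labels, probs, bulk_latent, sc_latent, lam=1.0):
    """Supervised BCE on bulk labels plus an MMD penalty aligning the two domains."""
    return bce(labels, probs) + lam * mmd_squared(bulk_latent, sc_latent)

# Hypothetical 2-D latent vectors for two bulk samples and two single cells.
bulk = [[0.0, 0.0], [1.0, 0.0]]
cells = [[3.0, 3.0], [4.0, 3.0]]
print(transfer_loss([1, 0], [0.8, 0.3], bulk, cells))
```

The MMD term is zero when the two sets of latent vectors coincide and grows as the bulk and SC distributions drift apart, so minimizing the combined loss encourages a latent space in which a classifier trained on bulk labels remains applicable to single cells.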
Biomarker- or signature-based methods
While most methods involve recognition of molecules whose activities associate closely with drug sensitivity, a few methods center on the identification and application of such biomarkers to maximize their predictive value for drug response.28–30 Genes or gene signature sets are learned from paired CCL bulk RNA-seq and HTS drug response data, frequently through correlation analysis.31,32 Next, instead of relying on detection of shared expression patterns between bulk and SC data, the same markers are selected directly from the scRNA-seq data, and their normalized expression values are combined into a scalar describing the relative likelihood of a cell having the desired phenotype (e.g., high sensitivity to a drug). As discovery of biomarkers depends only on bulk RNA-seq and bulk sample labels, cellular drug response predictions can be generated separately, for any SC dataset, once the markers are defined.
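The two-stage logic described above can be sketched as follows. This is a generic illustration under stated assumptions rather than the scoring scheme of any reviewed tool: markers are chosen by absolute Pearson correlation with bulk drug sensitivity, and each cell's score is the mean signed z-score of those markers. The function names and the toy data are hypothetical.

```python
import math
import statistics

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def select_markers(bulk_expr, sensitivity, top_n):
    """Stage 1 (bulk only): keep the top_n genes by |correlation| with sensitivity.

    Returns {gene: +1 or -1}; +1 means higher expression tracks higher sensitivity.
    """
    corr = {g: pearson(values, sensitivity) for g, values in bulk_expr.items()}
    top = sorted(corr, key=lambda g: abs(corr[g]), reverse=True)[:top_n]
    return {g: (1 if corr[g] > 0 else -1) for g in top}

def score_cells(sc_expr, markers):
    """Stage 2 (SC only): mean signed z-score of the marker genes in each cell."""
    genes = [g for g in markers if g in sc_expr]
    n_cells = len(next(iter(sc_expr.values())))
    scores = [0.0] * n_cells
    for g in genes:
        mean, sd = statistics.fmean(sc_expr[g]), statistics.pstdev(sc_expr[g])
        if sd == 0:
            continue  # constant gene (e.g. all drop-outs) contributes no signal
        for i, value in enumerate(sc_expr[g]):
            scores[i] += markers[g] * (value - mean) / sd
    return [s / max(len(genes), 1) for s in scores]

# Toy data: three genes across four cell lines, then two markers across three cells.
bulk = {"G1": [1, 2, 3, 4], "G2": [4, 3, 2, 1], "G3": [1, 3, 1, 3]}
markers = select_markers(bulk, sensitivity=[0.1, 0.2, 0.3, 0.4], top_n=2)
print(score_cells({"G1": [5, 0, 1], "G2": [0, 4, 1]}, markers))
```

Because the first stage never touches the SC data, the marker set can be computed once per drug and reused across any number of scRNA-seq datasets.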
To predict drug response in breast cancer cells, Gambardella et al. developed DREEP, which learns drug-specific biomarkers correlated with either the sensitive or the resistant phenotype from the CTRP and GDSC. At the SC level, the top 250 most expressed genes in each cell are compared against a drug’s biomarker profile through Gene Set Enrichment Analysis (GSEA), yielding an enrichment score (ES) for each cell-drug pair. A cell with an extreme ES for a drug is classified as sensitive to that drug.10,11 Beyondcell derives sensitivity signature sets (SS) from the CTRP and GDSC using differential expression (DE) analysis with the R package limma. Additionally, it compares expression levels before and after drug perturbation in the LINCS dataset to derive perturbation signature sets (PS). With PS, drug response can be inferred based on the “signature reversion” principle, which prioritizes drugs that induce reverse-to-normal expression changes in disease models.12 A drug’s signature is divided into up- and down-regulated sets, each containing the 250 most significant genes, and applied to the target SC data to calculate a cell-specific “Beyondcell Score” (BCS) indicating relative sensitivity to the drug.12 In scDR, CCLs in CTRP are dichotomized into sensitive and resistant groups. Differential expression analysis is then carried out between the two CCL groups, and the top 200 biomarkers in each of the up- and down-directions are identified based on their log2 fold change (log2FC). The log2FC values of the marker genes are used in conjunction with gene expression Z-scores in the SC data to generate drug response scores.14 ASGARD, also demonstrated on breast cancer scRNA-seq data, requires both disease samples and normal tissue samples as input and pairs cell identity clusters between the two types. To identify drugs for a specific cell cluster, DE analysis is first carried out between the normal and disease clusters. DE genes are then used to screen the LINCS dataset for drugs that significantly reverse expression patterns in the disease cluster toward those of the normal cluster.13
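The signature reversion principle lends itself to a compact illustration. The sketch below scores a drug by the negative cosine similarity between a disease-versus-normal log fold-change profile and the drug's perturbation profile, so that a drug pushing genes in the opposite direction to the disease receives a score near 1. This is a deliberately simplified stand-in: neither ASGARD's customized score nor Beyondcell's BCS is computed this way, and all names and values are hypothetical.

```python
import math

def reversion_score(disease_lfc, drug_lfc):
    """Negative cosine similarity over shared genes, in [-1, 1].

    Positive values mean the drug tends to reverse the disease expression change.
    """
    shared = [g for g in disease_lfc if g in drug_lfc]
    dot = sum(disease_lfc[g] * drug_lfc[g] for g in shared)
    norm = (math.sqrt(sum(disease_lfc[g] ** 2 for g in shared))
            * math.sqrt(sum(drug_lfc[g] ** 2 for g in shared)))
    return -dot / norm if norm > 0 else 0.0

def rank_drugs(disease_lfc, drug_signatures):
    """Return (drug, score) pairs sorted from most to least reversing."""
    scores = {d: reversion_score(disease_lfc, sig)
              for d, sig in drug_signatures.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical log fold changes for one disease cluster and two drugs.
disease = {"A": 2.0, "B": -1.0, "C": 1.5}
signatures = {
    "drug_x": {"A": -2.0, "B": 1.0, "C": -1.5},  # opposes the disease change
    "drug_y": {"A": 2.0, "B": -1.0, "C": 1.5},   # mimics the disease change
}
print(rank_drugs(disease, signatures))
```

In a cluster-aware setting, the disease profile would be computed separately for each cell cluster, which yields a cluster-specific drug ranking.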
Unlike the deep learning methods discussed above, these approaches require no training data or intensive computation, as learned gene signatures can be applied to any SC dataset independently. However, owing to low and sparse expression levels in single cells and the stochastic nature of drop-outs, the predictive power of a pre-defined gene set is not guaranteed. Beyondcell addresses this potential pitfall by penalizing cells with high sparsity in their corresponding signature genes. Applications of ASGARD can be limited by the availability of paired disease-normal scRNA-seq data from the same cohort. Furthermore, compared to normal tissues, expression changes in certain advanced cancers are not unidirectional, which greatly complicates application of the signature reversion principle.33
Machine learning methods
Traditional machine learning approaches have a rich history in the area of drug discovery.34,35 They have commonly been used to integrate various genomic spaces, including drug-gene interactions, disease-gene interactions, and gene-gene interactions.34,36,37 Great effort has been made toward applying these approaches to drug response prediction.33,38,39 CaDRReS-Sc is a machine learning framework, based on matrix factorization, for cellular-level cancer drug response prediction.15 It incorporates a set of 1856 essential genes41 identified through CRISPR screens as encoding components of fundamental pathways. CaDRReS-Sc is an extension of CaDRReS,40 calibrated for single-cell transcriptomic profiles. The purpose of the factorization is to learn a 10-dimensional latent pharmacogenomic space that captures relationships between cell line gene expression and known drug response. The dot product between a cell line’s vector and a drug’s vector in this latent space indicates the response of that cell line to that drug and is used to impute drug response for unseen samples (e.g., patients or cell lines). Fitting the factorization thus constitutes model training. Unlike CaDRReS, CaDRReS-Sc computes kernel features using both bulk and single-cell RNA-seq data prior to model training; these features are essentially the Pearson correlation coefficients between per-sample gene expression profiles. scDrug is another cellular-level drug response prediction tool, which couples CaDRReS-Sc with an automated pipeline that clusters scRNA-seq data through unsupervised machine learning.16 The clustering resolution is selected according to the optimal silhouette coefficient or distance between clusters. For each cluster, differentially expressed genes are ranked and cell types are annotated using scMatch.42 These data are then used together with bulk RNA profiles to predict how each cluster will affect patient survival using CaDRReS-Sc.
Unlike the biomarker-based methods discussed above, the essential genes comprising the signature used by CaDRReS-Sc and scDrug were originally identified for their essential roles in cell survival and are not compound specific.41 However, drug-gene associations can encompass a wide variety of genes; therefore, using only essential genes does not guarantee that the gene-drug relationship information is adequate for inferring drug response in unseen data.
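The prediction step of a latent-space model of this kind can be sketched as follows, assuming the model has already been trained. A new sample is first described by kernel features (its Pearson correlation with each reference sample over the selected genes), these features are projected into the latent space by a learned matrix, and the response is read out as a dot product with the drug vector. The projection matrix `W`, the drug vector, and the bias term are hypothetical placeholders, and two latent dimensions are used instead of ten for brevity; this is not the CaDRReS-Sc code.

```python
import math
import statistics

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0

def kernel_features(sample_expr, reference_exprs):
    """Similarity of one sample to each reference sample over the same genes."""
    return [pearson(sample_expr, ref) for ref in reference_exprs]

def project(features, W):
    """Map kernel features (length n_ref) to a latent vector using W (n_ref x k)."""
    k = len(W[0])
    return [sum(f * W[i][j] for i, f in enumerate(features)) for j in range(k)]

def predict_response(cell_vec, drug_vec, drug_bias=0.0):
    """Predicted response: drug-specific bias plus the latent dot product."""
    return drug_bias + sum(p * q for p, q in zip(cell_vec, drug_vec))

# Hypothetical trained parameters and a new sample measured on three genes.
references = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
drug_vec = [0.5, -0.5]
latent = project(kernel_features([2.0, 4.0, 6.0], references), W)
print(predict_response(latent, drug_vec, drug_bias=0.2))
```

Because the new sample enters only through its similarity to reference samples, the same trained drug vectors can be reused for bulk samples, single cells, or cluster-level averages, provided their expression is measured on the same gene set.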
Combination of biomarker/signature and machine learning based methods
A new method, scIDUC, leverages drug-gene signatures within a machine learning framework to infer cellular-level drug response.43 Prior to integrating bulk and single-cell data, the R package limma is used to identify drug response-relevant genes (DRGs) from the known bulk sample phenotype. The datasets are then integrated using canonical correlation analysis to capture gene expression patterns common to the two datasets, adjusting for discrepancies. Non-negative matrix factorization can also be used to find correlations between these datasets as an alternative approach to integration. Lastly, a linear regression model is formulated using the integrated training and single-cell datasets. Owing to the ease of parameter tuning, scIDUC avoids the intensive computation potentially associated with the deep learning methods discussed previously. Those methods depend heavily on the existence of scRNA-seq data with measured drug sensitivity for parameter fine-tuning, whereas scIDUC can more easily be applied to a large collection of drugs. Unlike the signature-based methods, scIDUC also does not require corresponding normal tissue samples, which are not always readily available, or dichotomization of CCL training data into resistant or sensitive labels, which does not capture the nuances and spectrum of drug response. Additionally, the traditional machine learning approaches discussed above do not consider drug-specific genes and instead use an essential gene set that may not be specific enough to accurately project sensitivity to each drug. scIDUC’s DRGs, on the other hand, are learned directly from the bulk data used to train the model, and the gene sets identified for each drug are those most significantly associated with drug response.
Promising trends and potential pitfalls
Facilitating heterogeneity-aware drug discovery
Most of the 8 reviewed works enable projection of cell specific vulnerability to therapeutics, which can be associated with cell identities to facilitate drug nomination for certain cell types. For example, Fustero-Torre et al. used Beyondcell to generate sensitivities of 451HLu human melanoma cells to various drugs. They screened for drugs showing high predicted efficacy among BRAF inhibitor (BRAFi) resistant cell clusters as candidate therapeutics to combat BRAFi resistance in melanoma.12 With increasing availability of scRNA-seq data from varying diseases, such a strategy can be adapted to facilitate drug discovery targeting specific cell clones in many indications. However, most examples provided by the current works only referenced existing studies to justify reliability of prediction results. To truly evaluate the utility of these methods, experimental analysis of the proposed therapeutics should be carried out in vitro or in vivo.
These methods model drug response using either resistant/sensitive labels or a continuous spectrum; they do not attempt to predict drug dosage, which plays a role in improving our understanding of tumor heterogeneity. Specifically, tumor heterogeneity is linked to the emergence of therapy-resistant cell subclones. By predicting and adjusting drug dosages, it may be possible to target and inhibit these subclones more effectively, reducing the likelihood of resistance development. Methods for this purpose are under active development.44
In addition, scRNA-seq allows for a detailed distinction between normal and malignant cell subpopulations.45–48 This distinction, in turn, greatly facilitates heterogeneity-aware drug discovery and helps avoid inhibition of normal populations, minimizing toxicity. It is imperative for future research to harness the full capabilities of scRNA-seq to optimize treatment outcomes.
Spearheading drug combination efficacy prediction
At an individual level, therapeutic vulnerability profiles within a patient may enable modeling of drug combinations as tailored treatments to combat heterogeneous tumors. First, identified drugs for therapy-resistant cell groups can be used with SOC treatments to form drug combinations to help eliminate a tumor. This is a direct extension of cell specific drug discovery. Moreover, predictions of drug combination efficacy may be achieved using cellular drug response. Combination efficacy can be inferred probabilistically by assuming predicted cellular drug response indicates likelihood of cell kill.49 In this case, methods including continuous variables for drug response show advantages over those using only binary labels. For example, AUCs can be scaled to indicate percentages of cell death under a treatment or probabilities of cell kill; for a combination with multiple agents, their cell kill probabilities can be aggregated to generate a probability of cell kill for the whole population. This can be done at the cellular level or at the cell cluster level. An averaged combination probability therefore estimates efficacy over the whole heterogeneous population. However, it is unclear if current drug combination prediction rationales grant desirable predictive powers, especially given the discrepancy between complex intratumoral structures and information reflected by current scRNA-seq techniques.
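The probabilistic aggregation described above can be illustrated with a short sketch. It assumes that a normalized AUC in [0, 1] can be read as the surviving fraction (so that 1 minus the AUC approximates the probability of cell kill) and that the agents in a combination act independently; both are simplifying assumptions introduced here for illustration, and the function names are hypothetical.

```python
def kill_probability_from_auc(auc):
    """Treat a normalized AUC in [0, 1] as surviving fraction; return kill probability."""
    return min(max(1.0 - auc, 0.0), 1.0)

def combo_kill_probability(per_drug_probs):
    """Probability that a cell is killed by at least one agent (independent action)."""
    survive = 1.0
    for p in per_drug_probs:
        survive *= 1.0 - p
    return 1.0 - survive

def population_efficacy(cells):
    """Average combination kill probability over cells (or over cell clusters)."""
    return sum(combo_kill_probability(c) for c in cells) / len(cells)

# Two hypothetical cells with opposite predicted AUCs for drugs A and B.
predicted_auc = [[0.1, 0.9], [0.8, 0.2]]
kill = [[kill_probability_from_auc(a) for a in cell] for cell in predicted_auc]
print(population_efficacy(kill))
```

In this toy example each drug alone kills only one of the two cells efficiently, whereas the combination achieves a high average kill probability, which is the rationale for pairing an SOC agent with a drug nominated for the resistant subpopulation.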
Data availability as a limiting factor
The key limiting factor in predicting drug response at the single-cell level is insufficient training power due to the lack of public benchmark data. Understanding how to appropriately integrate bulk and scRNA-seq data alleviates this limitation. Ultimately, it will enable the design of more precise therapeutic regimens that take into account a patient’s specific microenvironment and tumor heterogeneity. A pitfall of this approach, however, is that the ability to impute cellular-level drug response is contingent on the type and quality of the bulk CCL data used in the integration. Specifically, the techniques covered in this review largely leverage response to chemotherapeutics, whereas chemotherapy and immunotherapy may be used alone, together (‘chemoimmunotherapy’), or in combination with other treatments (e.g., radiation therapy or surgery), settings for which these approaches cannot yet generate predictions.
Conclusion
We review several recent methods that leverage HTS data to project cell-level drug sensitivity onto a given scRNA-seq dataset. They employ a variety of computational principles, including deep learning frameworks, more traditional machine learning approaches, and biomarker-based scoring. These methods center on techniques for transferring bulk-learned information into SC prediction anchors, directly or indirectly. Applications of these methods demonstrate their utility in generating and testing hypotheses for heterogeneity-aware drug discovery. Depending on the specific research question and biological model, different methods might be preferred for hypothesis generation. Eventually, application of these methods may help reduce the occurrence of drug resistance and cancer relapse, and potentially lead to complete tumor regression.
Footnotes
Danielle Maeser: Software, Writing-Original Draft; Weijie Zhang: Conceptualization, Writing-Original Draft; Yingbo Huang: Visualization, Writing-Review and Editing; R. Stephanie Huang: Supervision, Conceptualization, Writing-Review and Editing
References
- 1. Adam G et al. Machine learning approaches to drug response prediction: challenges and recent progress. npj Precis. Onc 4, 19 (2020).
- 2. Jarada TN, Rokne JG & Alhajj R. A review of computational drug repositioning: strategies, approaches, opportunities, challenges, and directions. J Cheminform 12, 46 (2020).
- 3. Dagogo-Jack I & Shaw AT. Tumour heterogeneity and resistance to cancer therapies. Nat Rev Clin Oncol 15, 81–94 (2018).
- 4. Brady SW et al. Combating subclonal evolution of resistant cancer phenotypes. Nat Commun 8, 1231 (2017).
- 5. Qi R & Zou Q. Trends and Potential of Machine Learning and Deep Learning in Drug Study at Single-Cell Level. Research 6, 0050 (2023).
- 6. Wu Z et al. Single-Cell Techniques and Deep Learning in Predicting Drug Response. Trends in Pharmacological Sciences 41, 1050–1065 (2020).
- 7. Harzing AW (2007) Publish or Perish, available from https://harzing.com/resources/publish-or-perish
- 8. Chen J et al. Deep transfer learning of cancer drug responses by integrating bulk and single-cell RNA-seq data. Nat Commun 13, 6494 (2022). * scDEAL is a deep transfer learning approach based on a neural network architecture to predict cancer drug response through integrating bulk and single-cell RNA sequencing (RNA-seq) data. It predicts single-cell drug response based on a trained model and the single-cell RNA-seq data, utilizing the maximum mean discrepancy as a loss function to produce binary drug response labels. Results from the application of scDEAL to six drug-treated single-cell datasets with experimental validation indicated that scDEAL is robust. The available scDEAL Python package currently supports only the datasets that appear in Chen et al.
- 9.Zheng Z et al. Enabling Single-Cell Drug Response Annotations from Bulk RNASeq Using SCAD. Advanced Science 10, 2204113 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10.Gambardella G et al. A single-cell analysis of breast cancer cell lines to study tumour heterogeneity and drug response. Nat Commun 13, 1714 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
- 11.Pellecchia S, Viscido G, Franchini M & Gambardella G Predicting drug response from single-cell expression profiles of tumours. 10.1101/2023.06.01.543212 (2023) doi:10.1101/2023.06.01.543212. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Fustero-Torre C et al. Beyondcell: targeting cancer therapeutic heterogeneity in single-cell RNA-seq data. Genome Med 13, 187 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]; * Beyondcell is a method aimed to predict cellular drug response from single-cell RNA sequencing data. It generates a Beyondcell enrichment score (BCS) for each drug-cell pair, ranging from 0–1, and it is indicative of the activity of a gene-drug signature in a given gene expression matrix. This score also measures how susceptible each single-cell is to a drug under scrutiny where a high BCS represents high concordance between the gene signature and the single-cell analyzed. This method was applied to a breast cancer single-cell dataset where a BCS was computed for a panel of drugs. It was able to identify distinct drug-response cell subpopulations before and after bortezomib treatment, drug-resistant cellular populations, and single-cell variability in drug response across cancer patients. Through deconvolving tumor heterogeneity, it was also able to propose drug treatments and therefore can help design more precise treatment regimens.
- 13.He B et al. ASGARD is A Single-cell Guided Pipeline to Aid Repurposing of Drugs. Nat Commun 14, 993 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]; * ASGARD is a method for imputing single-cell drug sensitivities through identifying drugs that can reverse single-cell gene expression data from a diseased state to a normal state. Thus, it requires both diseased and normal single-cell data from the same subjects as input data. Drug efficacy is measured by a drug score at the individual patient level and takes into account the significance of the reversed differential gene expression pattern between diseased and normal samples. ASGARD was compared to other drug repurposing methods using bulk RNA-seq samples by summarizing single-cell RNA sequencing (RNA-seq) data into pseudo-bulk RNA-seq data and was found to predict drugs more accurately. It was also found to have robust performance across different single-cell populations with clinical validation. In short, ASGARD demonstrated itself to be a promising drug recommendation pipeline.
- 14. Lei W et al. scDR: Predicting Drug Response at Single-Cell Resolution. Genes 14, 268 (2023).
- 15. Suphavilai C et al. Predicting heterogeneity in clone-specific therapeutic vulnerabilities using single-cell transcriptomic signatures. Genome Med 13, 189 (2021). * By leveraging a recommender system and information shared across drugs, CaDRReS-Sc can predict drug response from single-cell RNA sequencing (scRNA-seq) data with high accuracy (80%). It can thus capture transcriptomic heterogeneity and predict drug response in unseen cell types. Its latent pharmacogenomic model also eases visualization and interpretation when examining key drug pathways. The method was extended to drug combinations, where the drug pairs identified were more effective than individual drugs in vitro.
- 16. Hsieh C-Y et al. scDrug: From single-cell RNA-seq to drug response prediction. Computational and Structural Biotechnology Journal 21, 150–157 (2023).
- 17. Yang W et al. Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Research 41, D955–D961 (2012).
- 18. Seashore-Ludlow B et al. Harnessing Connectivity in a Large-Scale Small-Molecule Sensitivity Dataset. Cancer Discovery 5, 1210–1223 (2015).
- 19. Yu C et al. High-throughput identification of genotype-specific cancer vulnerabilities in mixtures of barcoded tumor cell lines. Nat Biotechnol 34, 419–423 (2016).
- 20. Keenan AB et al. The Library of Integrated Network-Based Cellular Signatures NIH Program: System-Level Cataloging of Human Cells Response to Perturbations. Cell Systems 6, 13–24 (2018).
- 21. Ling A, Gruener RF, Fessler J & Huang RS More than fishing for a cure: The promises and pitfalls of high throughput cancer cell line screens. Pharmacology & Therapeutics 191, 178–189 (2018).
- 22. Barretina J et al. The Cancer Cell Line Encyclopedia enables predictive modeling of anticancer drug sensitivity. Nature 483, 603–607 (2012).
- 23. Ben-David U et al. Genetic and transcriptional evolution alters cancer cell line drug response. Nature 560, 325–330 (2018).
- 24. Ji Y, Lotfollahi M, Wolf FA & Theis FJ Machine learning for perturbational single-cell omics. Cell Systems 12, 522–537 (2021).
- 25. Zhang H, Chen Y & Li F Predicting Anticancer Drug Response With Deep Learning Constrained by Signaling Pathways. Front. Bioinform 1, 639349 (2021).
- 26. Jia P et al. Deep generative neural network for accurate drug response imputation. Nat Commun 12, 1740 (2021).
- 27. Bouhaddou M et al. Drug response consistency in CCLE and CGP. Nature 540, E9–E10 (2016).
- 28. Ben-Hamo R et al. Predicting and affecting response to cancer therapy based on pathway-level biomarkers. Nat Commun 11, 3296 (2020).
- 29. Miranda SP, Baião FA, Fleck JL & Piccolo SR Predicting drug sensitivity of cancer cells based on DNA methylation levels. PLoS ONE 16, e0238757 (2021).
- 30. Chen Y et al. Response prediction biomarkers and drug combinations of PARP inhibitors in prostate cancer. Acta Pharmacol Sin 42, 1970–1980 (2021).
- 31. Lee JH, Park YR, Jung M & Lim SG Gene regulatory network analysis with drug sensitivity reveals synergistic effects of combinatory chemotherapy in gastric cancer. Sci Rep 10, 3932 (2020).
- 32. Rees MG et al. Correlating chemical sensitivity and basal gene expression reveals mechanism of action. Nat Chem Biol 12, 109–116 (2016).
- 33. Koudijs KKM, Böhringer S & Guchelaar H-J Validation of transcriptome signature reversion for drug repurposing in oncology. Briefings in Bioinformatics 24, bbac490 (2023).
- 34. Dai W et al. Matrix Factorization-Based Prediction of Novel Drug Indications by Integrating Genomic Space. Computational and Mathematical Methods in Medicine 2015, 1–9 (2015).
- 35. Poleksic A Hyperbolic matrix factorization improves prediction of drug-target associations. Sci Rep 13, 959 (2023).
- 36. Korsunsky I et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat Methods 16, 1289–1296 (2019).
- 37. Hsu LL & Culhane AC Impact of Data Preprocessing on Integrative Matrix Factorization of Single Cell Data. Front. Oncol 10, 973 (2020).
- 38. Ammad-ud-din M et al. Drug response prediction by inferring pathway-response associations with kernelized Bayesian matrix factorization. Bioinformatics 32, i455–i463 (2016).
- 39. Wang L, Li X, Zhang L & Gao Q Improved anticancer drug response prediction in cell lines using matrix factorization with similarity regularization. BMC Cancer 17, 513 (2017).
- 40. Suphavilai C, Bertrand D & Nagarajan N Predicting Cancer Drug Response using a Recommender System. Bioinformatics 34, 3907–3914 (2018).
- 41. Wang T et al. Identification and characterization of essential genes in the human genome. Science 350, 1096–1101 (2015).
- 42. Hou R, Denisenko E & Forrest ARR scMatch: a single-cell gene expression profile annotation tool using reference datasets. Bioinformatics 35, 4688–4695 (2019).
- 43. Zhang W et al. Inferring therapeutic vulnerability within tumors through integration of pan-cancer cell line and single-cell transcriptomic profiles. Preprint at doi:10.1101/2023.10.29.564598 (2023).
- 44. Ianevski A et al. Single-cell transcriptomes identify patient-tailored therapies for selective co-inhibition of cancer clones. Preprint at doi:10.1101/2023.06.26.546571 (2023).
- 45. Gao R et al. Delineating copy number and clonal substructure in human tumors from single-cell transcriptomes. Nat Biotechnol 39, 599–608 (2021).
- 46. Van Galen P et al. Single-Cell RNA-Seq Reveals AML Hierarchies Relevant to Disease Progression and Immunity. Cell 176, 1265–1281.e24 (2019).
- 47. Sh Y et al. CaSee: A lightning transfer-learning model directly used to discriminate cancer/normal cells from scRNA-seq. Oncogene 41, 4866–4876 (2022).
- 48. Nofech-Mozes I, Soave D, Awadalla P & Abelson S Pan-cancer classification of single cells in the tumour microenvironment. Nat Commun 14, 1615 (2023).
- 49. Pomeroy AE, Schmidt EV, Sorger PK & Palmer AC Drug independence and the curability of cancer by combination chemotherapy. Trends in Cancer 8, 915–929 (2022).
Data Availability Statement
The key factor limiting drug response prediction at the single-cell level is insufficient training power, owing to the lack of public benchmark data. Understanding how to appropriately integrate bulk and scRNA-seq data alleviates this limitation. Ultimately, it will enable the design of more precise therapeutic regimens that take into account a patient’s specific microenvironment and tumor heterogeneity. A pitfall of this approach, however, is that the ability to impute cellular-level drug response is contingent on the type and quality of the bulk CCL data used in the integration. Specifically, the techniques covered in this review largely leverage responses to chemotherapeutics, whereas chemotherapy and immunotherapy may be used alone, together (‘chemoimmunotherapy’), or in combination with other treatments (e.g., radiation therapy or surgery); these approaches cannot yet generate predictions for such regimens.
