Cell Reports Medicine. 2024 May 15;5(6):101568. doi: 10.1016/j.xcrm.2024.101568

scRank infers drug-responsive cell types from untreated scRNA-seq data using a target-perturbed gene regulatory network

Chengyu Li 1,2,6, Xin Shao 1,2,6, Shujing Zhang 1,3, Yingchao Wang 1,3, Kaiyu Jin 1,2, Penghui Yang 1,2, Xiaoyan Lu 1,3, Xiaohui Fan 1,2,4,5,7,∗∗, Yi Wang 1,2,∗∗∗
PMCID: PMC11228399  PMID: 38754419

Summary

Cells respond divergently to drugs due to the heterogeneity among cell populations. Thus, it is crucial to identify drug-responsive cell populations in order to accurately elucidate the mechanism of drug action, which is still a great challenge. Here, we address this problem with scRank, which employs a target-perturbed gene regulatory network to rank drug-responsive cell populations via in silico drug perturbations using untreated single-cell transcriptomic data. We benchmark scRank on simulated and real datasets, which shows the superior performance of scRank over existing methods. When applied to medulloblastoma and major depressive disorder datasets, scRank identifies drug-responsive cell types that are consistent with the literature. Moreover, scRank accurately uncovers the macrophage subpopulation responsive to tanshinone IIA and its potential targets in myocardial infarction, with experimental validation. In conclusion, scRank enables the inference of drug-responsive cell types using untreated single-cell data, thus providing insights into the cellular-level impacts of therapeutic interventions.

Keywords: drug response, perturbation, single cell, cell-type prioritization, gene network

Graphical abstract

Highlights

  • scRank infers drug-responsive cell types exclusively from untreated scRNA-seq data

  • Target-perturbed GRN models drug effects via network alignment and diffusion

  • Applicable to diverse drugs and complex diseases in both in vivo and in vitro data


Identifying which specific cell populations are most responsive to drug treatments remains a critical challenge. Li et al. develop a computational approach, scRank, to prioritize the cellular targets of drugs of interest from untreated single-cell RNA-seq (scRNA-seq) data, which may facilitate precision therapy and maximize the clinical utility of scRNA-seq data.

Introduction

The cellular response to drugs exhibits tremendous heterogeneity, which is largely attributed to the diversity of cell populations.1 In the context of specific biological networks belonging to different cells,2 drugs interacting with their targets (e.g., receptors or enzymes) exert different effects.3,4 For example, a kinase inhibitor might impede tumor cell proliferation while having minimal impact on normal cells due to differential pathway activation.5 Hence, understanding the heterogeneous drug response requires the characterization of cell subpopulations within complex tissues.6 Traditionally, tissue transcriptome profiling yielded only aggregated expression data, obscuring the distinct molecular signatures of varied cell populations.7 These distinct molecular signatures are crucial in mediating how cells respond differently to treatments.8 Recent advances in single-cell RNA sequencing (scRNA-seq) have provided opportunities to delineate cellular heterogeneity at the single-cell level. This approach has unveiled a wealth of information, including the identification of cell types, their spatial distribution, intercellular communication patterns, and the underlying gene regulatory mechanisms that are key to understanding disease progression and cellular response to drugs.9,10,11,12,13,14 As a result, drug development strategies have shifted from targeting specific factors to targeting critical cell types, thus opening a new avenue for precision medicine.15,16 Recent studies, including those using therapeutic clusters to identify druggable cell populations17 and deep learning methods for prioritizing cell types in disease contexts,18 have furthered our understanding of cellular targets for drug development. 
Additionally, tools integrating bulk expression data with scRNA-seq analyses offer complementary insights into determining phenotype-associated cell clusters.19 Despite the advances in scRNA-seq, identifying the key cell types that respond to drug treatment remains a critical challenge, as it requires an understanding of pharmacodynamics, encompassing both the direct interactions of drugs with their targets and the subsequent cascade of effects on cellular pathways. Such knowledge is essential for understanding drug mechanisms of action at the cellular level20,21 and developing precise therapeutic strategies with fewer side effects.22 While most previous efforts to dissect drug responses have focused on identifying significant changes in gene expression or protein activity, the question of which cell type exhibits the highest responsiveness to drug treatment has yet to be adequately addressed.

In this study, the drug response of a cell type is defined as the extent to which that cell type exhibits changes in cellular state upon exposure to a drug. A common means of exploring the drug response of cell types is based on the number of differentially expressed genes (DEGs).23 More recently, a method called Augur was developed that determines the response priority of cell types using their separability within a high-dimensional space.24 Although these strategies allow for quantifying drug response, they inevitably ignore the prior knowledge of the drug target. Furthermore, because such analyses require disease-treatment paired datasets, they are unsuitable for inferring drug-responsive cell types from existing datasets that include only a disease condition without drug intervention.25 Since adequate diseased datasets are often available but drug treatment datasets are lacking, the scalability of existing methods in diseased samples is limited. In fact, with the increasing availability of diseased state data,26 inferring the key cell types responsive to treatments solely from untreated datasets has the potential to extend our knowledge of cellular-level impacts of drugs and therefore guide further experiment design. As a result, a method is urgently needed that integrates the knowledge of drug targets with that of diseased cellular states to identify drug-responsive cell types exclusively from untreated datasets. Given that genes interact extensively,27 and the drug response largely depends on transcriptional programs associated with a given drug target, gene regulatory networks (GRNs)28 may represent a plausible basis for inferring cellular drug response.

To this end, we present scRank, a drug-responsive cell-type inference method using the target-perturbed GRN (tpGRN) to model and score in silico drug perturbations in untreated scRNA-seq data. Our work is based on two assumptions. First, intrinsic cellular states in different cell types can be reflected by the cell-type-specific GRN. Second, drug (inhibitor) perturbation in cells can be modeled as a deletion of drug target gene nodes in the GRN that results in both global and local effects. The algorithm of scRank is designed to follow these assumptions. First, scRank performs cluster-wise network inference for cell types. To simulate the drug effect in a GRN, scRank then deletes all edges of the drug target gene node to create a tpGRN. To quantify the degree of drug perturbation in the tpGRN of each cell type and in turn prioritize the cell types, we evaluated global and local perturbation effects using a manifold alignment algorithm29 and network diffusion.30 We conducted extensive simulations to assess the accuracy and robustness of scRank in identifying drug-responsive cell types. We further validated scRank’s performance using untreated scRNA-seq datasets, which include both in vivo and in vitro data from a range of complex diseases such as cancer and diabetes. We showed that scRank can successfully identify drug-targeted cell types and that it outperforms other existing approaches. Moreover, we applied scRank to our previous scRNA-seq dataset from macrophages in a mouse model of myocardial infarction and identified another potential target for tanshinone IIA, which we confirmed experimentally. In short, scRank enables the inference of drug-responsive cell types exclusively from untreated scRNA-seq data, thus revealing the therapeutic mechanisms of drugs.

Results

Overview of scRank

We propose scRank (https://github.com/ZJUFanLab/scRank), a computational tool for ranking cell types based on their potential gene network activity in response to an assigned drug (Figure 1). Drug perturbation not only exerts an effect on a single target but also influences the downstream genes.28,31 These target-related downstream genes tend to be co-expressed in common pathways and often function in concert to respond to drug treatment.32 Upon interaction with targets, drugs exert their effects not just through direct target modulation but by initiating a cascade of signaling events through those co-expressed downstream genes that culminate in specific biological functions.33 Consequently, a comprehensive assessment of a drug’s impact must extend beyond simply examining the target gene’s expression, encompassing the drug’s integral influence within the broader biological network.34,35 Our approach with scRank leverages network methodologies to predict the extent of a drug’s impact across cell types. We hypothesize that the inhibition of the targets by a compound perturbation should propagate along the molecular network to finally affect the overall structure and strength of the network. By perturbing the node of the drug’s direct target gene in the molecular network, which is reconstructed from the gene expression profile based on co-expression relationships, we can capture the system-wide influence of the drug. We modeled this perturbation effect as a combination of the global network difference resulting from in silico drug perturbation and the local network diffusion within the target-related signal. Implementing this hypothesis with the tpGRN, we used a perturbation score to evaluate the in silico drug perturbation effect on each cell type and rank the cell types. Because the drug perturbation is performed in silico, our approach allows one to infer the cell-type priority in the drug response based solely on untreated data.

Figure 1.

Workflow of the scRank method

Overview of scRank including the input, detailed strategy of cell-type ranking based on a target-perturbed gene regulatory network (tpGRN), and output.

(A) Untreated single-cell RNA sequencing data and direct target gene.

(B) Concept of ranking cell types responsive to the drug based on the tpGRN. Cell-type-specific GRNs are first constructed. After determination of the target gene, the tpGRN is created by cutting off all the edges of the target gene node. By comparing untreated and tpGRNs, the drug perturbation is quantified to rank cell types.

(C) Local and global perturbation effects were combined to prioritize the drug response for each cell type.

(D) Procedure to construct the GRN. Target genes of the inhibitor drug collected from DGIdb, the top 2,000 highly variable genes, and transcription factors are considered to construct the GRN. Principal-component regression is used to obtain the co-expression adjacency matrix. For each cell type, a specific GRN is created with the same gene nodes.

(E) Schematic representation of calculating global perturbation effect via manifold alignment.

(F) Schematic representation of calculating local diffusion perturbation effect using network diffusion. Dd represents the distance of the drug target node, Dn represents the distance of the 1-hop gene node n, Wout represents the weight of the outgoing edge, Win represents the weight of the incoming edge, and Deg represents the degree of the gene node.

Generally, scRank takes both single-cell transcriptomic data and drug direct targets of interest as input data, yielding a drug-responsive cell-type rank as the output (Figure 1A). The input data consist solely of an untreated expression count matrix derived from disease conditions, as well as gene names corresponding to drug direct targets for the purpose of executing in silico drug perturbation. After comparing the GRN between the untreated and drug target perturbed in silico for each cell type, the drug perturbation is evaluated using manifold alignment and network diffusion (Figure 1B). The output perturbation score represents the extent to which cell type responds to the drug (Figure 1C).

For single-cell GRN reconstruction, the untreated expression data are first separated by cell type to construct a set of cell-type-specific GRNs. To preserve the main features of the disease state and incorporate as many biological processes and drug target genes as possible for the efficient construction of the GRN, we only considered limited expression-related features, namely the top 2,000 highly variable genes (HVGs), transcription factors, and drug target genes. In practice, HVGs were determined using Seurat,36 transcription factors were taken from AnimalTFDB,37 and drug targets were taken from DGIdb;38 these were integrated as input gene features to construct the GRN (Figure 1D). This specific combination of features makes the prediction of drug response effective (Figure S1). Inspired by scTenifoldNet39 and scTenifoldKnk,40 we adopted a machine-learning-based framework for constructing the network. However, to ensure a robust characterization of the biological process when adapting the framework to our specific drug response inference problem, we made certain modifications, including to the input features and the subsampling strategy. In practice, using the principal-component regression algorithm41 combined with a random sampling strategy and tensor decomposition,42 it was possible to determine the co-expression relationship between genes from the expression data, yielding a gene-gene adjacency matrix representing the GRN of each cell type.
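The principal-component regression step can be illustrated with a minimal sketch. It assumes a normalized cells-by-genes matrix for a single cell type and regresses each gene on the leading principal components of all other genes, mapping the coefficients back to gene space. The function name `pc_regression_network` and the parameter `n_comp` are hypothetical, and the random subsampling and tensor decomposition described above are omitted, so this is not the scRank implementation itself.

```python
import numpy as np

def pc_regression_network(X, n_comp=3):
    """Sketch of a principal-component regression co-expression network.

    X is a cells x genes matrix for one cell type (assumed normalized).
    Returns a genes x genes matrix W in which W[i, j] is the weight of
    gene i when gene j is predicted from all other genes.
    """
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)                 # center every gene
    n_genes = X.shape[1]
    W = np.zeros((n_genes, n_genes))
    for j in range(n_genes):
        others = [i for i in range(n_genes) if i != j]
        predictors = X[:, others]
        response = X[:, j]
        # Principal axes of the predictor genes
        _, _, vt = np.linalg.svd(predictors, full_matrices=False)
        loadings = vt[:n_comp].T           # predictor genes x components
        pcs = predictors @ loadings        # cells x components
        # Least-squares fit of the response gene on the components
        coef, *_ = np.linalg.lstsq(pcs, response, rcond=None)
        # Map component coefficients back to per-gene weights
        W[others, j] = loadings @ coef
    return W
```

Regressing on a few components rather than on all genes keeps the fit stable when genes are strongly correlated, which is the usual motivation for principal-component regression. The diagonal of `W` stays at zero because a gene is never used to predict itself.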

To rank drug-responsive cell types, the tpGRN was first generated to mimic the inhibitor effect by setting the weights of all edges of the target gene in the untreated GRN to zero (Figure 1E). We also tested the ability of the tpGRN to model the effect of agonists (Figure S2). Then, a manifold alignment algorithm was used to compare the untreated and target-perturbed networks. The distance between the same gene nodes under the two conditions in the shared manifold space was calculated to measure the difference in network structure, which represents the global perturbation. To account for local diffusion, meaning that the effect of perturbation spreads from the target gene to downstream genes along co-expression edges, we considered 2-hop neighborhood diffusion and scored this effect (Figure 1F). The global and local effects in the network were then combined as a perturbation score to quantify the degree of perturbation and rank cell types.
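The edge-deletion step and the 2-hop neighborhood can be sketched as follows, assuming the GRN is stored as a dense gene-by-gene weight matrix with a matching list of gene names. The helper names `perturb_target` and `two_hop_neighbors` and the `min_weight` threshold are hypothetical. The sketch only builds the tpGRN and lists the genes within two hops of the target; it does not reproduce the manifold alignment or the diffusion scoring used by scRank.

```python
import numpy as np

def perturb_target(adj, genes, target):
    """Return a copy of the GRN with every edge of `target` set to zero."""
    t = genes.index(target)
    tp = np.array(adj, dtype=float, copy=True)
    tp[t, :] = 0.0   # edges leaving the target
    tp[:, t] = 0.0   # edges entering the target
    return tp

def two_hop_neighbors(adj, genes, target, min_weight=0.0):
    """List the 1-hop and 2-hop neighbors of `target`, ignoring edge direction."""
    adj = np.asarray(adj, dtype=float)
    t = genes.index(target)
    linked = (np.abs(adj) > min_weight) | (np.abs(adj.T) > min_weight)
    hop1 = set(np.flatnonzero(linked[t]).tolist()) - {t}
    hop2 = set()
    for n in hop1:
        hop2 |= set(np.flatnonzero(linked[n]).tolist())
    hop2 -= hop1 | {t}
    return sorted(genes[i] for i in hop1), sorted(genes[i] for i in hop2)

# Toy network with made-up weights: Smo - A - B, and C unconnected
genes = ["Smo", "A", "B", "C"]
grn = np.zeros((4, 4))
grn[0, 1] = 0.8
grn[1, 2] = 0.5
tp_grn = perturb_target(grn, genes, "Smo")
first_hop, second_hop = two_hop_neighbors(grn, genes, "Smo")
```

The neighbors are taken from the untreated network, because the target node has no edges left in the tpGRN.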

Validation of scRank using synthetic ground-truth datasets

We first tested scRank on synthetic data in three scenarios to evaluate its performance for drug-responsive cell-type identification. In our study, synthetic datasets were designed to mimic the cellular transcriptome in a disease state. To achieve this, we simulated single-cell gene expression datasets from a pre-defined GRN using SERGIO43 where disease-associated genes interact and form disease modules. We hypothesized that if a drug’s target gene is located within the disease modules of high activity, then the perturbation effect of the drug would be more significant within the cellular network. Consequently, cell types with active target genes should exhibit a more substantial response to this perturbation, which results in a ground truth of cell-type ranking (Figure 2A). Specifically, we developed three distinct scenarios, each with a defined ground-truth rank of drug-responsive cell types (Figure 2B). This approach allowed us to assess the predictive accuracy of our model in different situations.

Figure 2.

Benchmark test for scRank in simulated datasets

(A) Schematic diagram for generating simulated ground-truth data. We use pre-defined GRNs to simulate disease scRNA data with varied module activity across cell types using SERGIO for three scenarios. We hypothesize that the cell type with high drug-related module activity could be perturbed largely. The ground truth of the drug-responsive cell-type ranking can then be obtained via pre-defined module activity.

(B) Ground-truth rank of cell types in each scenario.

(C) 3,000 simulated cells of four cell types with an increasing module 1 activity gradient. The scaled perturbation score calculated by scRank across cell types is presented in a dot-line graph. Data are presented as the median ± SD. The number of data points is 25 for each cell type. The averaged rank of cell types with increasing cell numbers predicted by scRank is presented in a heatmap, where the asterisk represents the top-ranked cell type for each dataset.

(D) 3,000 simulated cells with six cell types, which can be grouped into high and low groups based on their binary module 1 activity. The scaled perturbation score is calculated by scRank for two groups. Data are presented as boxplots (minima, 25th percentile; median, 75th percentile; and maxima). The number of data points is 75 for each group. The p value is calculated via Wilcoxon rank-sum test with increasing cell number.

(E) 3,000 simulated cells from four cell types with cell-type-specific modules exhibiting high activity. The bar plot is the perturbation score calculated by scRank for different module genes. Data are presented as the median ± SD. The number of data points is 25 for each module. The top-ranked cell type for different types of perturbation with increasing cell numbers is shown in a dot-line plot.

In scenario 1, we simulated scRNA-seq datasets with four cell types exhibiting stepwise increases in the activity of gene module 1 to illustrate the common pattern of disease progression (Figure 2C). Assuming a drug target in module 1, we tested whether scRank could correctly prioritize cell types based on the gene activity related to the drug. The results showed that scRank successfully prioritized cell types when genes in module 1 were perturbed and consistently ranked cell type A highest in most cases with increasing cell numbers (Figure 2C).

In scenario 2, instead of increasing linearly, module 1 was exhibited in a binary manner for two groups of cell types (Figure 2D), reflecting a distinct disease pattern. The results indicated that scRank could also distinguish the “high group” in gene module 1 from the “low group”, with a significant difference in every case. In scenario 3, the gene modules were exhibited in a cell-type-specific manner (Figure 2E), which is crucial for understanding differential drug responses among various cell types. Each cell type had a specific gene module of high activity. We then iteratively perturbed the genes of modules 1 to 4. For each perturbation of specific module genes, the target cell type’s perturbation score was greater than for any other cell type, and the average rank of the target cell type was the highest in most cases. Collectively, the validation in synthetic datasets proved that scRank could accurately rank cell types.

Performance comparison of scRank with other methods

To further evaluate the performance of scRank in identifying drug-targeted cell types, we compared it with existing methods. These methods prioritize cell types that most respond to drugs based on the relative number of DEGs or separability between the same cell types under different conditions (pre- and post-treatment). Differential expression analysis methods, such as the Wilcoxon rank-sum test,36 MAST,44 the bimod likelihood-ratio test,45 and DESeq2,46 focus on identifying statistically significant DEGs.

It should be noted that both types of methods require datasets with pre- and post-treatment conditions, whereas scRank requires only pre-treatment datasets and drug information without any post-treatment data. Therefore, to compare scRank to these methods, we first applied them to synthetic paired-condition datasets (scenario 4, Figure 3A). The datasets comprised two conditions, disease and treatment. For the disease condition, four cell types with gradient-increased activity of module 1 represented the abnormally activated gene module in the disease. When a drug is introduced to inhibit genes in module 1, the activity of module 1 should decrease to basal levels in each cell type. We reasoned that the cell type with higher gene module activity would be affected by the drug to a greater extent, which generates the ground-truth rank of the cell types. Hence, the Spearman rank test was adopted for comparison between the predicted and true cell-type ranks. With increasing numbers of cells in the dataset, the cell-type ranks from all methods were consistent with the ground truth. Nevertheless, scRank exhibited robust performance across the four benchmarked datasets with increasing cell numbers, outperforming existing methods in the inference of drug-responsive cell types (Figure 3A).
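The comparison between predicted and ground-truth ranks can be illustrated with a short sketch of the Spearman rank correlation. It uses the closed-form expression for untied ranks, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)). The function names and the perturbation scores below are invented for illustration and are not values from the study.

```python
def rank_descending(scores):
    """Map each cell type to its rank (1 = highest score); assumes no ties."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}

def spearman_rho(rank_a, rank_b):
    """Spearman correlation between two untied rankings of the same items."""
    n = len(rank_a)
    d2 = sum((rank_a[k] - rank_b[k]) ** 2 for k in rank_a)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical example: ground truth ranks cell type A as most responsive
truth = {"A": 1, "B": 2, "C": 3, "D": 4}
predicted_scores = {"A": 0.9, "B": 0.4, "C": 0.6, "D": 0.1}  # made-up scores
rho = spearman_rho(truth, rank_descending(predicted_scores))
```

A value of 1 means the predicted order matches the ground truth exactly, and swapping two adjacent cell types, as in this example, lowers the coefficient. With tied scores, the closed form no longer applies and a rank-averaging implementation would be needed.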

Figure 3.

Superior performance of scRank over existing methods

(A) Simulated data with four cell types in pre- and post-treatment conditions. The co-expression network across cell types and conditions is presented in a heatmap. The perturbation score across cell types in the disease condition calculated via scRank is presented in a boxplot (minima, 25th percentile; median, 75th percentile; and maxima). The number of data points is 25 for each cell type. p values are calculated using the Wilcoxon test. ∗p ≤ 0.05 and ∗∗∗p ≤ 0.001. The performance comparison of scRank with existing cell-type prioritization methods for drug perturbation (Augur, differential expression analysis via Wilcoxon rank-sum test) in simulated data is presented by a dot-line plot with increasing cell numbers. The asterisk represents the best prioritization for all cell types in a specific dataset.

(B) Low-dimensional projection for datasets with the ground-truth target cell type. The target cell type is circled by a dashed line.

(C) Performance comparison of scRank with other methods (Augur and differential expression analysis methods, including Wilcox, MAST, bimod, and DESeq2) across nine responsive cell types and two non-responsive cell types. The asterisk represents the best prioritization of responsive and non-responsive cell types in line with the ground truth.

(D) Averaged performance for each method. Data are presented as boxplots (minima, 25th percentile; median, 75th percentile; and maxima). The number of data points is 9 for each method.

(E) Performance comparison between scRank and other drug response prediction methods (beyondcell and scDEAL) in in vitro samples, where the mean accuracies are 77.8% for scRank, 72.2% for beyondcell, and 64.8% for scDEAL. Data are presented as boxplots (minima, 25th percentile; median, 75th percentile; and maxima). The number of data points is 18 for each method. The beyondcell method includes three strategies (beyondcell, bc_SwitchBinary, bc_BCS) to rank the cell type. The scDEAL method includes two strategies (scDEAL, scDEAL_binary) to rank the cell type.

(F) The boxplot represents the rank predicted by scRank for responsive and non-responsive cell lines, encompassing an analysis of 179 cell lines across 53 drugs. The number of data points is 159 for each group. The p value is calculated using a one-sided paired Wilcoxon test.

In addition, to further compare the performance of scRank with existing methods on real scRNA-seq data with ground truth, we collected five single-cell drug perturbation datasets47,48,49,50,51 with nine known target cell types (Figure 3B). Each dataset, with the drug-targeted cell type provided by the authors, consisted of paired conditions. In practice, we performed differential expression analysis and Augur using datasets with two conditions, whereas scRank was applied only to the disease condition datasets. scRank identified the nine drug-targeted cell types in the five datasets as being top ranked (Figures 3C and S3), exceeding the rank given by other methods in most cases. Furthermore, scRank ranked two non-responsive cell types, which were used as negative controls, at the bottom position (Figure 3C); this was lower than the rank given by any of the other methods. On average, scRank ranked target cell types higher than the other methods did (Figure 3D), indicating that it could more accurately determine drug-targeted cell types from datasets that only contain an untreated condition. We also evaluated the performance of scRank against other perturbation prediction methods such as scGen,52 scVIDR,53 and CellOracle,54 and scRank outperformed these tools at identifying responsive cell types (Figure S4). We further validated scRank’s performance using three additional cancer scRNA-seq datasets,55,56,57 focusing on five classical treatments: anti-PD1, anti-CTLA4, anti-CD40, anti-PDL1, and docetaxel (Figure S5). Through these datasets, as well as other in vivo samples included in our study, we confirmed the efficacy of scRank in identifying drug-responsive cell types. This validation extends beyond the conventional approach of using solely the expression of either direct targets or downstream targets, highlighting scRank’s robustness and versatility in diverse therapeutic contexts (Figure S6).

Next, we incorporated three additional cancer cell line scRNA-seq datasets58,59 to assess the performance of scRank in cases with low cellular heterogeneity and compared scRank with drug-response-predicting tools such as beyondcell17 and scDEAL60 using these datasets (Figures 3E and S7). The results demonstrate that even in in vitro samples with lower variability, scRank maintains superior performance. Furthermore, we applied scRank to scRNA-seq data encompassing 179 distinct cell lines from Kinker et al.59 We leveraged each cell line’s drug sensitivity information from the CTRP dataset61 to evaluate the performance of scRank in identifying responsive cell lines across 53 drugs. The results demonstrated that scRank is applicable to various drugs, with an overall accuracy of 71.3% (Figures 3F and S8). Notably, responsive cell lines were ranked significantly higher than non-responsive ones.

Identification of sensitive and resistant tumor subtypes for vismodegib in medulloblastoma

As an extension of the benchmark in real scRNA-seq data, we selected one of the benchmarked datasets, which was generated from mice bearing medulloblastoma with defined drug-sensitive and -resistant cell types,47 in order to demonstrate how scRank would identify the drug-targeted cell type and associated biological mechanism (Figure 4A). With the cells in pre-treatment conditions and the vismodegib direct target gene Smo as input data (Figure 4A), scRank successfully identified drug-sensitive and -resistant tumor subtypes (Node_B and Node_A) based on their highest and lowest perturbation scores, respectively, which was consistent with the findings of the original paper (Figure 4B).

Figure 4.

Validation of predicted drug-sensitive and drug-resistant tumor subtypes for vismodegib in medulloblastoma

(A) scRNA data from medulloblastoma-bearing mice under two treatment conditions. Vehicle scRNA data and vismodegib target gene Smo were the input in scRank. The t-distributed stochastic neighbor embedding (t-SNE) plot is colored by the predicted top-ranked percentage in each cell type.

(B) t-SNE plot of the original sample in the two conditions colored by cell type. Node_A to Node_D represent tumor cells. Node_A and Node_B were defined as resistant and sensitive tumor subtypes, respectively, based on the relative difference between pre-treatment and post-treatment, as outlined in the original study.

(C) Smo-related subnetwork of Node_A and Node_B in pre-treatment conditions. Network nodes are colored by gene module category.

(D) Heatmaps of GRNs in the two tumor subtypes and the difference between these (post-treatment minus pre-treatment).

(E) Gene Ontology (GO) analysis of different module genes.

(F) Heatmap of differentially expressed genes in Node_A and Node_B. Bar represents the average expression in each sample.

(G) Significantly activated pathways in Node_B (top) and Node_A (bottom) determined via gene set enrichment analysis (GSEA) of the marker genes.

(H) Survival probability analysis of marker genes for Node_A and Node_B. High expression of signature genes in the drug-resistant tumor subtype (Node_A) exhibited lower survival probability. High expression of signature genes in the drug-sensitive tumor subtype (Node_B) exhibited higher survival probability. The p value is calculated with the two-sided log-rank test.

To elucidate the molecular mechanisms of medulloblastomas in response to Smo inhibitor vismodegib, we applied scRank to reconstruct the cell-type-specific gene networks for the two tumor subtypes in the pre-treatment conditions (Figures 4C and 4D). We focused on Smo-linked genes and visualized GRNs. As shown in Figure 4C, the genes in the network were divided into two gene modules (M1 and M2), with the drug target gene (Smo) located in module 1. Gene module 1 was involved in cell cycle progression, cell division, and G2/M phase transition, whereas gene module 2 was mainly implicated in protein phosphorylation (Figure 4E). Notably, the co-expression patterns related to Smo were significantly different between the networks of Node_A and Node_B in the pre-treatment condition, where Smo formed an isolated node in the network of Node_A and a compact structure in the network of Node_B (Figure 4C). This co-expression relationship with Smo suggested that the gene was highly relevant to cell proliferation in Node_B but not Node_A (Figure 4E), in line with the observation that the module activities of M1 and M2 in Node_B were significantly higher than that in Node_A (Figure S9). Furthermore, the network difference between pre-treatment and post-treatment for the two cell types indicated that the activity of cell-cycle-related module M1 was suppressed in Node_B while upregulated in Node_A (Figure 4D), providing evidence that the effect of the drug depends on the gene activity within the target-gene-related subnetwork. Intriguingly, the relationship between Smo and Cnbp, which has been proposed to participate in a non-canonical SHH/AMPK axis to support tumor proliferation,62 was found in the M1 module of Node_B (Figures 4C and S9), indicating that the signal from Smo could be transduced through this non-canonical pathway to promote growth. 
Thus, inhibiting the Smo signal in Node_B, instead of Node_A, with high activation of the Shh pathway could be effective for the suppression of cell proliferation. These results indicate that differences at the gene network level, rather than those at the expression level (Figure S9), gave rise to distinct drug responses between different cell types.

To further validate the predicted cell-type prioritization in response to SHH inhibitor therapy in clinical samples, we employed two tumor cell signatures to score 178 human sonic hedgehog subtype medulloblastoma (SHH-MB) bulk RNA-seq samples.63 We found that Node_A and Node_B followed a continuous development path from dividing to differentiation. Node_A defined proliferation status with cell-cycle-related gene expression, whereas Node_B defined differentiation status with high expression of HEY1, MDK, SOX9, and HES6 (Figures 4F and 4G). Patients whose tumors exhibited greater proliferation and presumably harbored greater Node_A activity had a significantly poorer prognosis (p < 0.0001, Figure 4H). Consistent with this, patients with tumors exhibiting higher differentiation had a significantly better prognosis (p = 0.0023, Figure 4H). Therefore, the scRank-predicted top-ranked cell type is expected to indeed represent the therapeutic target.
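Scoring bulk samples with a cell-type signature and stratifying patients can be sketched as below. The text does not state how the signature scores were computed, so the mean expression over the signature genes and a median split are assumptions made purely for illustration. The function names, sample names, and expression values are hypothetical.

```python
from statistics import mean, median

def signature_score(expr, signature):
    """Mean expression of the signature genes found in each sample.

    expr maps sample -> {gene: expression}; averaging is an assumed
    scoring rule, not necessarily the one used in the study.
    """
    scores = {}
    for sample, profile in expr.items():
        values = [profile[g] for g in signature if g in profile]
        scores[sample] = mean(values) if values else float("nan")
    return scores

def split_by_median(scores):
    """Label samples 'high' or 'low' relative to the median score."""
    cut = median(scores.values())
    return {s: ("high" if v > cut else "low") for s, v in scores.items()}

# Toy bulk profiles with made-up values for two Node_B signature genes
bulk = {
    "patient_1": {"HEY1": 4.0, "SOX9": 2.0},
    "patient_2": {"HEY1": 1.0, "SOX9": 1.0},
    "patient_3": {"HEY1": 2.0, "SOX9": 2.0},
}
node_b_scores = signature_score(bulk, ["HEY1", "SOX9", "HES6"])
groups = split_by_median(node_b_scores)
```

The resulting high and low groups are the kind of strata that a log-rank test, as reported for Figure 4H, would then compare.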

Characterization of the excitatory neuron subtype responsive to SSRIs in MDD

We then examined whether scRank could be applied to identify the target cell type responsive to selective serotonin reuptake inhibitors (SSRIs) in major depressive disorder (MDD). SSRIs are the most prescribed medication for treating MDD. They inhibit the serotonin transporter encoded by the SLC6A4 gene, which blocks serotonin reuptake and thus increases serotonin availability around nerve cells (Figure 5A).64 However, despite a comprehensive understanding of this mechanism of action, the SSRI-responsive neuronal subpopulation remains unknown. To this end, we collected single-nucleus RNA sequencing (snRNA-seq) data from the dorsolateral prefrontal cortex region of healthy and untreated MDD human brains.65 Using snRNA-seq data from the untreated MDD brain tissue and the direct target gene of the SSRI fluoxetine, SLC6A4,66 as input data, scRank ranked the cell types responsive to fluoxetine (Figure 5B).

Figure 5.


Utilization of scRank for identifying neuronal subtypes targeted by fluoxetine in MDD

(A) Top image displays the snRNA-seq dataset of the dorsolateral prefrontal cortex (DLPFC) in Brodmann area 9 (BA9) derived from 17 patients with MDD. The lower image shows a schematic diagram of the mechanism of fluoxetine for MDD. snRNA-seq data for the MDD condition and the fluoxetine target gene were input data for scRank.

(B) UMAP visualizations of data with 25 cell types colored based on the predicted rank for drug response. Bar plot shows the scaled perturbation score for each cell type. Ex, excitatory neurons; Inhib, inhibitory neurons; Oligos, oligodendrocytes; Endo, endothelial; Astro, astrocytes; OPC, oligodendrocyte precursor cells.

(C) Heatmap of the GRN in excitatory neuron cluster 9 (Ex_9) with 205 MDD risk genes, which were separated into four modules. The top right heatmap represents the subnetwork of modules 2–4 for Ex_9, while the bottom right heatmap represents the network for inhibitory neuron cluster 5 (Inhib_5). Both heatmaps represent the GRN in untreated samples. The graphs to the right of the heatmap provide the network visualizations for module 2 in corresponding cell types, where gene nodes are colored based on their module.

(D) Significantly enriched biological processes and pathways for module genes determined using the Metascape web tool.

(E) Average module 2 activity for each neuron subtype in MDD. Data are presented as boxplots (minima; 25th percentile; median; 75th percentile; and maxima). The number of data points is 26 for each group. The comparison of the drug-target-related module for Ex_9 between the control group and MDD group is shown on the right, with respect to the averaged edge weight of 26 module 2 genes evaluated via a paired two-sided Wilcoxon test.

(F) Averaged log fold change of MDD risk genes between healthy state and disease state.

(G) Significantly activated pathways in Ex_9 determined via GSEA of the differentially expressed MDD risk genes.

(H and I) Spatial mapping of Ex_9 using CellTrek. The images on the left in both (H) and (I) represent the layer annotation for each spatial spot in mouse anterior brain tissue (H) or human dorsolateral prefrontal cortex tissue (I). The images on the right in (H) and (I) represent the spatial distribution of Ex_9, where red pixels specifically mark the location of these neurons. The deeper layers (layers 5 and 6) are particularly highlighted. The subsequent bar plot shows the relative proportion of cells in each layer.

We first employed scRank to examine the GRN of the top-ranked and bottom-ranked cell types (Ex_9 [excitatory neuron cluster 9] and Inhib_5 [inhibitory neuron cluster 5]). To inspect disease-related biological processes, we chose only MDD risk genes. We found that the MDD risk gene network in Ex_9 modularized into four gene modules, with gene modules 2, 3, and 4 showing higher activity (Figure 5C). Gene modules were enriched in different pathways involving dopaminergic synapses, serotonergic synapses, glutamatergic synapses, and neurodegeneration (Figure 5D). Interestingly, Ex_9 exhibited abnormally high activity in module 2, where the drug-target-encoding gene SLC6A4 is located, indicating that this excitatory neuron subtype may be more active in monoamine transport and may thus be a target of SLC6A4 inhibitors. In contrast, in the bottom-ranked cell type Inhib_5, SLC6A4 in gene module 2 formed an isolated node, and the whole network was very loose, suggesting a lower potential for targeting by the SLC6A4 inhibitor. We then examined the activity of gene module 2 across all neuron subtypes. Ex_9 exhibited the highest activity for module 2 among all neuron subtypes (Figure 5E). Furthermore, the activity of module 2 in Ex_9 was significantly higher in the MDD group than in the control group (Figure 5E). Comparing Ex_9 between the control and MDD groups, we found that the network weight and expression fold change of genes associated with the therapeutic SSRI response67,68 (SLC6A4, MAOA, HTR2A) were highly upregulated, leading us to characterize Ex_9 as exhibiting abnormally upregulated serotonin transporter activity (Figures 5F and 5G). These results suggested that Ex_9 contributed to the development of MDD and acted as the drug-targeted cell type, owing to the strong activation of the subnetwork related to SLC6A4.

Recent research has indicated that the SLC6A4 gene is mainly expressed in a subset of deep layer neurons within the prefrontal cortex, with neurons in layers 5 and 6 potentially targeted by SSRIs,69,70 while neurons in the deep layer exhibit abnormal neuronal size.71 Therefore, we asked whether the spatial distribution of Ex_9 corresponded to these findings. We combined scRNA-seq data from MDD with mouse and human brain spatial transcriptomic data72 using CellTrek,73 recovering the spatial coordinates of Ex_9 in tissue sections (Figures 5H and 5I). Consistent with prior research, most Ex_9 neurons were distributed in the deep layers of the brain, exhibiting network characteristics that distinguish them from other neurons in these layers (Figure S10). Therefore, the top-ranked Ex_9 predicted by scRank could represent the target cell type responsive to SSRI. Further analysis of this neuronal subtype could be beneficial for MDD drug development.

Demonstration of a responsive macrophage subpopulation and another potential target of tanshinone IIA in myocardial infarction

Macrophages play diverse roles in the pathophysiological processes that lead to myocardial infarction, exerting a spectrum of pro-inflammatory and anti-inflammatory effects, which are determined by their heterogeneous phenotypes.74,75 Such diverse macrophage subsets can also exhibit distinct responses to drugs. It is therefore essential to identify the key drug-responsive macrophage subsets in order to strengthen the understanding of the pharmacological mechanism and discover novel therapeutic targets at the cellular level. Hence, we applied scRank to an scRNA-seq dataset of mouse cardiac macrophages under three conditions (sham, myocardial infarction, and tanshinone IIA treatment) from our previous study (Figure 6A).21 We used scRNA-seq data from myocardial infarction and the tanshinone IIA direct target gene Ctsk,76 as input data. As shown in Figure 6B, macrophage 5 (M∅-5) was the top-ranked cell type responsive to tanshinone IIA, which was in agreement with our original conclusion that this compound could dramatically decrease the proportion and inflammatory effects of M∅-5 (Figure 6C).21 We then zoomed in on the Ctsk-centric subnetwork of the predicted top-ranked and bottom-ranked macrophage subtypes, constructed based on the data of pre-treatment conditions. We found that the subnetwork was modularized into three gene modules, with the drug target gene Ctsk in module 1 (Figures 6D and 6E). When compared to M∅-11, M∅-5 exhibited higher module 1 activity, which was predominantly enriched in inflammation-related functions, as determined via Gene Ontology analysis (Figures 6D and 6F). Moreover, we observed that the activity of module 1 in M∅-5, but not in M∅-11, was significantly reduced by tanshinone IIA (Figure 6E), indicating that the M∅-5 subset was the therapeutic target.

Figure 6.


Utilization of scRank for identifying the responsive macrophage subpopulation and potential targets of tanshinone IIA

(A) scRNA-seq data of immune cells isolated from the left ventricles of C57BL/6 mouse hearts under three conditions: sham, 3 days after left anterior descending artery ligation resulting in myocardial infarction (MI), and 3 days after ligation and tanshinone IIA treatment.

(B) scRNA-seq data for MI and the tanshinone IIA target gene Ctsk were the input for scRank. t-SNE visualization of MI data where each cell type is colored based on its prioritization predicted via scRank, from red (top rank) to black (bottom rank). Top-ranked macrophage subtype M∅-5 is circled by a dashed line.

(C) t-SNE visualization of data with combined group colored by experimental condition. Previously validated macrophage subtype M∅-5, as the ground-truth tanshinone IIA target cell type, is circled by a dashed line.

(D) Heatmaps of GRNs in top-ranked and bottom-ranked macrophage subtypes (M∅-5 and M∅-11) in different conditions (pre-treatment and after tanshinone IIA treatment) with 131 Ctsk-linked genes.

(E) Comparison of the drug-target-related modules among M∅-5 (top) and M∅-11 (bottom) in pre-treatment and post-treatment conditions with respect to the averaged edge weight of 47 module 1 genes evaluated via a paired two-sided Wilcoxon test.

(F) GO analysis of genes in different modules.

(G) Schematic diagram of scRank for ranking genes. scRank-calculated perturbation score for each gene in the GRN of M∅-5, where the input is the GRN of the target cell type and the output is the rank of the gene.

(H) Expression changes of predicted target genes in M∅-5 across different experimental conditions. Data are presented as boxplots (minima; 25th percentile; median; 75th percentile; and maxima). The numbers of data points are 93, 896, and 140 for each condition. Asterisks indicate that the expression value is significantly upregulated (MI versus sham) or downregulated (MI versus tanshinone IIA) (∗∗∗p < 0.0001).

(I) 3D structures and binding modes showing the hydrogen bonds formed between the active site of CTSB and tanshinone IIA.

(J) Sensorgram for the interactions of tanshinone IIA with CTSB.

(K) Immunofluorescence staining for the macrophage markers CD68 (green), CTSB (red), and DAPI (4′,6-diamidino-2-phenylindole; blue), showing the suppression of macrophage CTSB levels by tanshinone IIA treatment.

Since tanshinone IIA protects the cardiovascular system through multiple pathways, it should have more than one target77,78 (Figure S11). Therefore, after determining its target cell type, we sought to determine whether the perturbation score across genes in the GRN of M∅-5 could indicate the potential for being a target of tanshinone IIA. In other words, given the target cell type, we can rank genes based on the perturbation score calculated via scRank to identify novel drug targets. In practice, we calculated the perturbation score for all genes in the GRN of M∅-5, with the top five genes (Ctsb, Ctsd, Fcgr1, Lgmn, Cd68) considered potential tanshinone IIA targets (Figure 6G). Notably, these targets were not identifiable through simple correlation analysis (Figure S11). We also used simulated data to validate the accuracy of scRank in ranking target genes (Figure S12). Based on the assumption that tanshinone IIA exerts inhibitory effects on gene expression, we investigated whether the top five candidates predicted by scRank were suppressed by treatment. Indeed, all were significantly downregulated by tanshinone IIA (Figures 6H and S12), which we confirmed experimentally (Figure S13). These genes were also abnormally upregulated in the myocardial infarction condition (Figures 6H and S13), suggesting the downregulation of those genes by tanshinone IIA due to the alleviation of myocardial ischemia. In fact, these genes were grouped into a module within the protein-protein interaction network, implicated in the regulation of hydrolase activity in lysosomes and immune system processes, thus suggesting a mechanism of action for tanshinone IIA (Figure S12). We also checked whether tanshinone IIA, as an inhibitor, directly binds to the proteins encoded by these genes via molecular docking analysis (Figure S12). Because Ctsb had the dominant perturbation score, we focused on it. Molecular docking results showed the putative binding of tanshinone IIA to the active site of CTSB via hydrogen-bonding interactions with His110 and Cys119 (Figure 6I). To confirm that tanshinone IIA binds to CTSB, we carried out a surface plasmon resonance assay and found that tanshinone IIA could directly bind to the CTSB protein and that this binding occurs in a concentration-dependent manner (Figure 6J). Furthermore, subsequent immunofluorescence staining experiments confirmed that tanshinone IIA significantly reduced CTSB expression in macrophages (Figure 6K). These results corresponded with recent research79,80 and indicated that CTSB could represent a potential target for tanshinone IIA.

Discussion

The accurate identification of key cell types responsive to drug treatment remains a challenge. Most methods for inferring therapy response focus on the gene expression of known drug sensitivity markers, potentially missing critical associations between drug targets and downstream signaling genes. These associations are important for understanding the heterogeneous drug response, since drug perturbation not only exerts effects on a single target but also influences signal transduction and GRNs.

In this study, we present scRank, a new method for inferring and ranking drug-responsive cell types using a tpGRN. The underlying assumption is that the therapeutic response of each cell type depends on its intrinsic gene network related to drug targets. The main goal of scRank is to model the in silico drug perturbation in the GRN and quantify its effect in different cell types. To this end, scRank creates tpGRNs for modeling in silico drug perturbation and compares each tpGRN with the original GRN to measure the drug-induced shift in the network and thereby rank cell types. A common approach for comparing networks is to use a manifold alignment algorithm to project nodes from different networks onto a shared manifold space, calculating node similarity there to assess network structure dissimilarity.40,81 However, this approach fails to capture the dynamic changes in the tpGRN, which can pose a challenge in accurately measuring the shift in the network induced by the drug. To address this, we model drug perturbation in the tpGRN as having both global and local effects. Specifically, the perturbed target gene can cause global distortion in each gene node, and the resulting distortion could diffuse alongside the target-centric subnetwork. In practice, the distance is integrated with the drug target and its related subnetwork signal to quantify the drug effect. While the creation of the tpGRN may simplify the complex network rewiring that occurs in response to drug perturbation, it is designed to reflect genetic perturbations of putative drug targets through the gene-gene correlations within the network. Our comparison of GRNs and simulated tpGRNs to GRNs derived from treated cells validates that this approach can accurately simulate drug effect (Figure S14). To the best of our knowledge, the approach we have described here, which integrates drugs and biological networks to model and evaluate in silico drug perturbation, represents the first instance of inferring drug-responsive cell types based on untreated scRNA-seq data.

A significant challenge in single-cell drug response prediction is the accurate identification of various cellular states, as these states can exhibit markedly different responses to the same drug and may transition into distinct fates upon treatment.82 While scRank was specifically tailored for use with pre-labeled cell types, it has shown adaptability and robustness across various clustering methodologies in our supplementary benchmark tests, effectively discerning different cellular states (Figure S15). Another critical aspect to consider is drug resistance, which has a significant impact on treatment outcomes.83 We have incorporated an integrated score to also consider the intrinsic drug resistance mechanism in untreated samples (Figure S16). Moreover, we have extended scRank’s application to drugs with multiple targets, and the preliminary results demonstrate scRank’s potential effectiveness (Figure S17). However, it is crucial to acknowledge the inherent complexities of multi-target perturbation, such as the synergistic effect.84 These intricate target-to-target interactions necessitate specifically tailored methodologies, which we aim to develop in future research.

Overall, scRank makes the following conceptual advances. Our approach overcomes the limitations of experimental drug treatment and can provide valuable insights into a drug response without the need for costly and time-consuming experiments. In addition, scRank uses an explainable model based on target-related networks to illustrate the underlying mechanism of drug responsiveness, favoring the hypothesis that densely connected target-centric subnetworks may be subjected to stronger drug perturbation due to higher gene interactions85,86 (Figure S18). Moreover, our network-based method goes beyond the traditional reliance on expression profiles of direct and downstream drug targets, which often prove inadequate for accurately predicting drug responses.34,87 scRank, with its emphasis on target-related networks, offers a more holistic view of drug responses. This approach is particularly effective in cases where the drug target gene is not a marker gene for specific cell types (Figures S5 and S6). Considering that the curative efficacy of a drug depends on how it affects the cell types that are most important for the disease, we also confirmed the disease relevance of the highest-ranking cell types in our three case studies (Figures S9–S11). Additionally, we assessed the functional impact of these drugs on the highest-ranking cell types using gene set enrichment analysis (Figures S9–S11), thereby enhancing our tool’s clinical utility. Furthermore, scRank is also effective in dealing with drugs from natural products whose mechanisms of action are unknown, such as tanshinone IIA, as it helps identify and prioritize potential drug targets.

In our comparison of scRank, beyondcell, and scDEAL, we observed disparities in predictive performance. For instance, the low performance of scDEAL and beyondcell on sorafenib is, to an extent, influenced by the quality and imbalance of the bulk datasets they utilized. Meanwhile, scRank encounters challenges when dealing with drugs that have complex mechanisms, such as cytarabine. The mechanism of action of cytarabine goes beyond inhibition to involve direct DNA damage and integration into DNA. This introduces complex perturbations in the biological network that are challenging for scRank to predict.

We foresee several future applications of our method. Firstly, by understanding the relative drug sensitivity among cell types, scRank can help optimize treatments through better drug combinations and delivery strategies to minimize side effects and target key disease cells. Secondly, scRank can enhance our understanding of FDA-approved drugs at the cellular level. Although treatment efficacy is established, precise knowledge about which cell types are responsive to a drug is limited. Using scRank, we can identify these vital cell types for further study, thereby enhancing our knowledge of drug-associated therapeutic mechanisms. In our study, we chose scRNA-seq data over proteomic data due to their wider availability. Moving forward, with the anticipated increase in the availability of single-cell proteomic data,88 we foresee the potential for applying a similar methodology to protein-protein interaction networks derived from these proteomic measures. Such an approach would enable a more direct investigation of drug responses at the protein level.

Limitations of the study

One limitation of scRank is its primary focus on drugs acting as inhibitors, which is partly due to the nature of the available data. Most of the data we currently have access to pertain to inhibitory drugs, guiding the development and testing of scRank predominantly within this context. However, recognizing the need to broaden the scope of scRank, we have also conducted preliminary tests on agonists, adapting the methodology of scRank to model their effects (Figure S2). Specifically, to reflect the activation effect of agonists within the network, we set the weights of the edges emanating from the target gene node to their maximum possible value, resulting in the tpGRN. As more single-cell data for agonist drugs become available, we plan to further refine and expand scRank’s capabilities.
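The edge-weight modification described above can be illustrated with a minimal sketch. It is not the scRank implementation: the function name `perturb_target`, the convention that `W[i, j]` is the weight of the edge from gene i to gene j, the default maximum weight of 1.0, and the removal of self-loops are all assumptions made for illustration. The `"inhibitor"` mode, which zeroes the outgoing edges, is likewise an assumed counterpart to the agonist case.

```python
import numpy as np

def perturb_target(W, genes, target, mode="agonist", max_weight=1.0):
    """Return a copy of weight matrix W with the target's outgoing edges altered.

    Assumption: W[i, j] is the weight of the edge from gene i to gene j,
    and weights are scaled so that max_weight is the largest possible value.
    """
    W = np.array(W, dtype=float, copy=True)  # never modify the untreated GRN
    i = list(genes).index(target)
    if mode == "agonist":
        W[i, :] = max_weight   # activation: outgoing edges set to the maximum
    elif mode == "inhibitor":
        W[i, :] = 0.0          # assumed inhibitor counterpart: edges removed
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    W[i, i] = 0.0              # assumption: no self-loop on the target
    return W
```

Returning a copy keeps the untreated network available, so the perturbed and original matrices can be compared afterwards.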

STAR★Methods

Key resources table

REAGENT or RESOURCE SOURCE IDENTIFIER
Antibodies

CTSB (WB) Cell Signaling Technology Cat# 31718S; RRID: AB_2687580
CTSB (IF) Proteintech Cat# 12216-1-AP; RRID: AB_2086929
CTSD (WB) Cell Signaling Technology Cat# 69854S
CTSD (IF) Proteintech Cat# 21327-1-AP; RRID: AB_10733646
Lgmn Cell Signaling Technology Cat# 93627S; RRID: AB_2800205
CD68 Servicebio Cat# GB113109-100; RRID: AB_2935658
FCGR1 Invitrogen Cat# MA5-29706; RRID: AB_2785530
β-tubulin Beyotime Cat# AF1216; RRID: AB_2924787

Chemicals, peptides, and recombinant proteins

Tanshinone IIA Solarbio Cat# ST8020
Tanshinone IIA sulfonate Shanghai Aladdin Biochemical Technology Co. Ltd Cat# S107694
Human Cathepsin B/CTSB Protein ACROBiosystems CTB-H5222
Native human Cathepsin D protein Abcam ab91123

Deposited data

Single-cell RNA-seq data of mouse medulloblastoma tumors Ocasio et al.47 (2019) GEO: GSE129730
Single-cell RNA-seq data of mouse non-hematopoietic bone marrow cells Leimkühler et al.48 GEO: GSE156644
Single-cell RNA-seq data of mouse small intestinal organoid Mead et al.49 GEO: GSE148524
Single-cell RNA-seq data of mouse kidney Wu et al.50 GEO: GSE181382
Single-cell RNA-seq data of colorectal cancer ascites-derived epithelial cells Poonpanichakul et al.51 GEO: GSE155953
Single-cell RNA-seq data of human prostate Tuong et al.55 EGA: EGAS00001005787
Single-cell RNA-seq data of colorectal cancer Lee et al.56; Khaliq et al.57 GEO: GSE144735, GSE200997
Single-cell RNA-seq data of melanoma cell line Ho et al.58 GEO: GSE108394
Pan-cancer single-cell RNA-seq data of cell lines Kinker et al.59 GEO: GSE157220
Bulk RNA-seq data of medulloblastoma tumor Weishaupt et al.63 GEO: GSE124814
Single-cell RNA-seq data of human prefrontal cortex in major depressive disorder Nagy et al.65 GEO: GSE144136
10x Visium data of mouse brain cortex 10x Genomics https://satijalab.org/seurat/articles/spatial_vignette
10x Visium data of human dorsolateral prefrontal cortex Maynard et al.72 https://github.com/LieberInstitute/HumanPilot/
Single-cell RNA-seq data of heart immune cells Jin et al.21 GEO: GSE163465
Drug response data from the Cancer Therapeutic Response Portal (CTRP) Seashore-Ludlow et al.61 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4631646/bin/NIHMS711523-supplement-4.xlsx

Experimental models: Organisms/strains

C57BL6/J Mice Beijing Vital River Laboratory Animal Technology Co. Ltd. N/A

Oligonucleotides

qRT-PCR primers This study STAR Methods

Software and algorithms

R (v4.1.1) R Core Team https://www.r-project.org/
Python (v3.9) Python Software Foundation https://www.python.org/
rTensor Li et al.89 https://github.com/rikenbit/rTensor
Manifold alignment Vu et al.29 https://github.com/perimosocordiae/ManifoldWarping
beyondcell Fustero-Torre et al.17 https://github.com/cnio-bu/beyondcell
Augur Skinnider et al.24 https://github.com/neurorestore/Augur
Seurat (v4.3.0) Hao et al.36 https://satijalab.org/seurat/
AnimalTFDB (v3.0) Hu et al.37 http://bioinfo.life.hust.edu.cn/AnimalTFDB/
DGIdb (v4.0) Freshour et al.38 https://www.dgidb.org/
scTenifoldNet Osorio et al.39 https://github.com/cailab-tamu/scTenifoldNet
SERGIO Payam Dibaeinia and Saurabh Sinha43 https://github.com/PayamDiba/SERGIO
MAST Finak et al.44 https://github.com/RGLab/MAST
bimod McDavid et al.45 https://github.com/RGLab/MAST
DESeq2 Love et al.46 https://bioconductor.org/packages/release/bioc/html/DESeq2.html
scGen Lotfollahi et al.52 https://github.com/theislab/scgen
scVIDR Kana et al.53 https://github.com/BhattacharyaLab/scVIDR
CellOracle Kamimoto et al.54 https://github.com/morris-lab/CellOracle
scDEAL Chen et al.60 https://github.com/OSU-BMBL/scDEAL
CellTrek Wei et al.73 https://github.com/navinlabcode/CellTrek
biomaRt (v2.48.3) Durinck et al.90 https://bioconductor.org/packages/release/bioc/html/biomaRt.html
scSHC Grabski et al.91 https://github.com/igrabski/sc-SHC
Metascape Zhou et al.92 https://metascape.org/
clusterProfiler (v4.6.2) Wu et al.93 https://bioconductor.org/packages/release/bioc/html/clusterProfiler.html
Molecular Signatures Database (v7.4) Subramanian et al.94 https://www.gsea-msigdb.org/gsea/msigdb
dynamicTreeCut R package Langfelder et al.95 https://cran.r-project.org/web/packages/dynamicTreeCut/index.html
survival R package Therneau and Grambsch96 https://cran.r-project.org/web/packages/survival/index.html
Meeko python package Stefano Forli and Arthur J. Olson97 https://pypi.org/project/meeko/
AutoDock vina (v1.2.3) Eberhardt et al.98 https://github.com/ccsb-scripps/AutoDock-Vina
scRank This study https://github.com/ZJUFanLab/scRank

Resource availability

Lead contact

Further information and requests should be directed to and will be fulfilled by the lead contact, Xiaohui Fan (fanxh@zju.edu.cn).

Materials availability

This study did not generate new unique reagents.

Data and code availability

  • Data from five publicly available single-cell datasets with nine known drug-targeted cell types were processed for validation and analyzed using scRank, as described below. For these datasets, the drug names and the corresponding target genes input into scRank are listed in Table S1.

  • scRNA-seq data from mouse medulloblastoma tumors after vismodegib treatment and matched vehicle-treated samples were obtained from GEO (accession: GEO: GSE129730). Data were processed following the workflow provided by the authors (https://github.com/ben-babcock/Gershon_single-cell). The score was calculated by comparing the network constructed from vehicle-treated samples and perturbation of vismodegib target Smo.

  • scRNA-seq data from colorectal cancer ascites-derived epithelial cells of patients before and after chemotherapy (mFOLFOX6) were obtained from GEO (accession: GEO: GSE155953). Metadata, including cell types, were provided by the authors. The score was calculated by comparing the network of pre-treatment samples and in-silico perturbation of TOP1 and TYMS, which encode the targets of leucovorin and 5-fluorouracil, respectively.

  • scRNA-seq data from a ThPO-induced bone marrow fibrosis mouse model were obtained from GEO (accession: GEO: GSE156644). Cell type annotation was provided by the authors. The score was calculated by comparing the untreated network to that of perturbation of S100a8 and S100a9, which are the targets of tasquinimod.

  • scRNA-seq data from miniaturized organoid models of intestinal stem cell differentiation into Paneth cells before and after treatment with KPT-330 were obtained from the Broad Institute’s Single-Cell Portal (https://singlecell.broadinstitute.org/; study SCP1547). Cell type annotation was provided by the authors. The score was calculated by comparing networks of the pretreatment samples and perturbation of KPT-330 target Xpo1.

  • scRNA-seq data from the kidneys of mice with diabetic kidney disease (Wu et al.), before and after dapagliflozin treatment, were obtained from GEO (accession: GEO: GSE181382). Metadata were provided by the authors. The score was calculated by comparing the networks of pretreatment samples and in-silico perturbation of dapagliflozin target Slc5a2.

  • scRNA-seq data from the prefrontal cortex of patients with MDD were obtained from GEO (accession: GEO: GSE144136). For mouse brain ST data (10X Genomics Visium), we downloaded Seurat objects from (https://satijalab.org/seurat/articles/spatial_vignette.html). Human DLPFC ST data (151673) were downloaded from (https://github.com/LieberInstitute/HumanPilot/).

  • Our previously published scRNA-seq data of heart immune cells after myocardial infarction and matched tanshinone IIA-treated samples on day 3 are available in GEO (accession: GEO: GSE163465).

  • Three cancer scRNA-seq datasets were included: Tuong2021 (EGA: EGAS00001005787), Lee2020 (GEO: GSE144735), and Khaliq2022 (GEO: GSE200997). For classical treatments, the target cell types are as follows: T cells for anti-PD1 and anti-CTLA4 treatment, tumor cells for anti-PDL1 and docetaxel treatment, and B cells for anti-CD40 treatment.

  • Three cancer cell line scRNA-seq datasets were included: a melanoma cell line (GEO: GSE108394), the SCC47 cell line (GEO: GSE157220), and the JHU006 cell line (GEO: GSE157220). Drug-responsive cell populations were defined in the original research.

  • Pan-cancer scRNA-seq data of cell lines are from GEO: GSE157220. The CTRP drug response data are from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4631646/bin/NIHMS711523-supplement-4.xlsx. The 3 cell lines with the highest drug sensitivity were defined as responsive cell lines, and the 3 with the lowest sensitivity as non-responsive cell lines.

  • All original code has been deposited at GitHub (https://github.com/ZJUFanLab/scRank).

Method details

scRank algorithm

We designed scRank to rank cell types based on the gene network activity associated with the inhibitor-targeted gene. We reasoned that the drug’s effect will manifest in network connectivity. We attempted to quantify the potency of the drug by comparing untreated GRNs to tpGRNs. scRank quantifies the difference in network structure and the effect of network diffusion as the basis of cell type ranking. The method consists of three steps: network construction, in-silico drug perturbation, and perturbation evaluation. The first step is to reconstruct the gene regulatory network from the expression profiles. The second step is to simulate the effect of the in-silico drug perturbation in the GRN. The third step is to estimate the perturbation. The resource usage of scRank has been evaluated using simulated datasets, as detailed in Tables S2 and S3.

Network construction

Constructing the network is the core step for accurately estimating the drug perturbation in the tpGRN. Recently, several approaches have been developed for constructing gene regulatory networks from scRNA-seq transcriptomic data. scTenifoldNet39 is an advanced approach that outperforms existing tools like GENIE3 in both accuracy and efficiency. This is achieved through the use of principal component regression, combined with ensemble and denoising strategies, which help to robustly build networks. Given its efficacy, we incorporated analogous methodologies for network construction, with further modifications to customize the approach to our research objective. To ensure that estimates of drug perturbation are comparable across all cell types, instead of constructing a GRN for the entire sample, we use a cluster-wise strategy to construct cell type-specific GRNs in which the network nodes are consistent. The selection of gene features used in constructing GRNs was purposeful, with a focus on preserving direct regulatory relationships, drug target information, and functional biological processes. In an effort to minimize noise and simplify input gene features, we further excluded mitochondrial and ribosomal genes from our analysis. The subsampling approach has been customized to address the challenge of rare cell types and their varying abundance. Specifically, a fixed proportion of cells is selected for subsampling, while cells belonging to types with fewer than 25 cells are excluded. Several parameters have been optimized, including the threshold for edge weight and the denoising precision, resulting in enhanced performance for drug-responsive cell type identification. The detailed steps are as follows.

For a given scRNA-seq expression matrix $M$ with $M_i \in \mathbb{R}^{n \times c}$ ($n$ genes and $c$ cells) in cell type $i$ ($i \in \{1, 2, \ldots, C\}$, where $C$ denotes the total number of cell types in the dataset), we utilized $M_i$ to construct a cell type-specific gene regulatory network for every cell type. To overcome the heterogeneity among cell types and construct robust GRNs, we first randomly selected $\delta c$ cells ($\delta \in (0,1)$ denotes the selection ratio, and its default value is 0.5) $t$ times from $M_i$, resulting in $T_i = [X_1, \ldots, X_t]$, a set of gene expression matrices with $\delta c$ cells. The set of gene expression matrices representing a more comprehensive cellular status in cell type $i$ was used to build GRNs.
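The repeated subsampling described above can be sketched in a few lines of NumPy. This is an illustration only, not the scRank implementation (which is an R package): the function name `subsample_cells`, the value `t=5`, the seed, and the choice to sample without replacement within each draw are assumptions made for the example.

```python
import numpy as np

def subsample_cells(M_i, delta=0.5, t=5, seed=0):
    """Draw t random subsets of cells (columns) from a genes x cells matrix.

    delta is the selection ratio (0.5 is the default stated in the text);
    t and seed are hypothetical values chosen for this sketch.
    """
    rng = np.random.default_rng(seed)
    n_cells = M_i.shape[1]
    n_keep = max(1, int(round(delta * n_cells)))  # delta * c cells per draw
    subsets = []
    for _ in range(t):
        # Assumption: cells are sampled without replacement within one draw.
        cols = rng.choice(n_cells, size=n_keep, replace=False)
        subsets.append(M_i[:, cols])
    return subsets

# Toy cell type: 6 genes x 40 cells of Poisson counts.
M_i = np.random.default_rng(1).poisson(2.0, size=(6, 40))
T_i = subsample_cells(M_i, delta=0.5, t=5)
```

Each element of `T_i` plays the role of one matrix $X_t$ from which a separate GRN would be built.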

To reduce the computational cost while preserving the most meaningful biological data, we selected only a subset of expressed genes instead of the whole transcriptome. In practice, the top 2000 HVGs across the entire scRNA-seq dataset identified by Seurat, TFs from AnimalTFDB, and drug (inhibitor) target genes from DGIdb were integrated to construct the GRN. We also filtered out mitochondrial, ribosomal, and redundant genes to retain meaningful cell-to-cell variance and relevant biological processes. The number of genes in the final gene set $G = [\mathrm{HVG} + \mathrm{TF} + \mathrm{Tar}]$ used to construct the GRN is denoted by $g$.

To recover gene-gene correlation relationships from the gene expression matrix $X$ in cell type $i$ with a low computational cost, we adopted an approach similar to that of scTenifoldNet. To construct the cell type-specific GRN, we applied principal component analysis (PCA) to each subset matrix $X_t \in \mathbb{R}^{g \times \delta c}$ in the expression profile set $T_i$. This allowed us to address the multicollinearity problem by combining the correlated independent variables into a few principal components while keeping the cell type-specific information. We then applied a regression model to estimate regression coefficients between the target gene and all other genes, which were used as edge weights in the GRN. In practice, given a gene expression matrix $X \in \mathbb{R}^{n \times p}$ (here with $n$ observations and $p$ genes), $Y_{n \times 1} = (y_1, \ldots, y_n)^T$ denotes the vector of target gene expression, and $X_{n \times (p-1)} = (x_1, \ldots, x_n)^T$ denotes the expression matrix of the remaining $p-1$ regulator genes. Assuming that the value of $Y$ is dependent on that of $X$, the relationship between the target gene and the other $p-1$ genes can be modeled by:

$$Y = X\beta + \varepsilon$$

where $\beta \in \mathbb{R}^{p-1}$ denotes the vector of the regression coefficients, and $\varepsilon$ denotes the vector of random errors. To reduce the computational cost and over-fitting, we regressed the target gene on $k$ latent covariates obtained via PCA instead of on the $p-1$ genes ($k \ll p-1$). Let $\tilde{X}_{n \times k} = X_{n \times (p-1)} V_{(p-1) \times k}$ denote the selected top-ranked $k$ principal components, where $V_{(p-1) \times k}$ is the PC loading matrix of $X$ obtained via singular value decomposition. The regression problem can then be transformed into:

$$Y = \tilde{X}\tilde{\beta} + \varepsilon$$

Here, $\tilde{\beta} \in \mathbb{R}^{k}$ denotes the vector of regression coefficients in the transformed data matrix $\tilde{X}$, and it can be estimated using ordinary least squares. Then, the estimate of $\beta$, representing the association between the target gene and the $p-1$ regulator genes, is given by $\hat{\beta} = V_k \hat{\tilde{\beta}} \in \mathbb{R}^{p-1}$. By iteratively regressing every gene in $G$ for each random selection, we finally obtained $t$ gene adjacency matrices, including the relationship and strength of every gene pair for cell type $i$. To retain significant biological signals in the GRN, we kept only the top $\theta$ of edges in terms of their absolute weight value ($\theta = 5\%$ by default). Since principal component regression considers the asymmetric nature of gene relationships, assigning a different weight from gene A to gene B than from gene B to gene A, it results in bidirectional relationships, especially between TFs and their target genes. Then, to amplify the TFs' regulatory influence, we made a trade-off between the bidirectional edges: the edge with the lower weight was retained at 25% of its strength. Finally, we removed self-correlation by setting the diagonal values to 0 to obtain the final gene adjacency matrix, denoted as $A$.
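The principal component regression step can be illustrated with a minimal NumPy sketch. It is not the scRank code: the function names, the value `k=3`, the cells-by-genes input layout, and the convention that row j of the output holds the coefficients obtained when gene j is the regression target are all assumptions made for the example. The 25% down-weighting of the weaker bidirectional edge is not reproduced here.

```python
import numpy as np

def pcr_edge_weights(X, k=3):
    """Principal component regression of every gene on all other genes.

    X is a cells x genes matrix. Row j of the returned genes x genes
    matrix holds the back-transformed coefficients for target gene j
    (an orientation chosen for this sketch).
    """
    n, p = X.shape
    X = X - X.mean(axis=0)                      # center each gene
    W = np.zeros((p, p))
    for j in range(p):
        y = X[:, j]                             # target gene expression
        others = np.delete(np.arange(p), j)
        Xr = X[:, others]                       # n x (p-1) regulator genes
        # PC loadings of the regulators via singular value decomposition.
        _, _, Vt = np.linalg.svd(Xr, full_matrices=False)
        V = Vt[:k].T                            # (p-1) x k loadings
        Z = Xr @ V                              # n x k principal components
        beta_tilde, *_ = np.linalg.lstsq(Z, y, rcond=None)  # OLS on the PCs
        W[j, others] = V @ beta_tilde           # back to gene space
    return W                                    # diagonal stays 0

def keep_top_edges(W, theta=0.05):
    """Keep only the top theta fraction of entries by absolute weight."""
    A = W.copy()
    np.fill_diagonal(A, 0.0)
    cutoff = np.quantile(np.abs(A), 1.0 - theta)
    A[np.abs(A) < cutoff] = 0.0
    return A

# Toy data: 50 cells x 8 genes.
X_toy = np.random.default_rng(0).normal(size=(50, 8))
W = pcr_edge_weights(X_toy, k=3)
A = keep_top_edges(W, theta=0.05)
```

When `k` equals the number of regulators, the back-transformed coefficients coincide with ordinary least squares, which is a convenient sanity check on the transformation.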

To improve the robustness of the GRN and retain heterogeneity, we extracted the main signals in each GRN $A_t$, obtained from each subset matrix $X_t$ in $T_i$, and integrated them into one matrix using tensor component analysis (TCA). We first combined the $t$ GRNs into an $S \times T \times R$ third-order tensor denoted as $\chi$, where $S$, $T$, and $R$ correspond to the number of GRNs from randomly selected data, target genes, and regulator genes, respectively. Then, the TCA method named CANDECOMP/PARAFAC was conducted on $\chi$, resulting in the decomposition and approximation of $\chi$ by the sum of $R$ tensors of rank-1:

$$\chi \approx \sum_{r=1}^{R} s_r \circ t_r \circ r_r$$

where the notation $\circ$ represents the outer product, while $s_r$, $t_r$, and $r_r$ are vectors of the factor $r$ that contain the loadings of the respective elements in each dimension of the tensor. We selected the top five factors to represent the critical component of $\chi$ and summed them as a new tensor, $\hat{\chi}$, approximating the original tensor $\chi$. We then averaged the values in $\hat{\chi}$ across $S$ and further normalized each weight by dividing it by the maximum absolute value to obtain the final GRN for the given cell type $i$, denoted as $A$. The final GRN encompasses interactions across all input genes and forms the basis for the subsequent creation of the target-perturbed GRN.
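The last part of this step (summing rank-1 terms, averaging over the subsample dimension, and normalizing) can be sketched as follows. The sketch assumes the CANDECOMP/PARAFAC factor matrices have already been computed by some decomposition routine; it does not perform the decomposition itself, and the function name `grn_from_cp_factors` and the toy factor values are hypothetical.

```python
import numpy as np

def grn_from_cp_factors(s_f, t_f, r_f):
    """Rebuild a denoised GRN from CP factor matrices.

    s_f: subsamples x factors, t_f: target genes x factors,
    r_f: regulator genes x factors (e.g. the top five factors).
    """
    # Sum of rank-1 outer products s_r o t_r o r_r over the factors.
    chi_hat = np.einsum("ar,br,cr->abc", s_f, t_f, r_f)
    A = chi_hat.mean(axis=0)          # average across the subsample dimension
    max_abs = np.abs(A).max()
    return A / max_abs if max_abs > 0 else A   # scale weights into [-1, 1]

# Toy factors: 3 subsampled GRNs, 4 genes, 5 factors.
rng = np.random.default_rng(0)
s_f = rng.normal(size=(3, 5))
t_f = rng.normal(size=(4, 5))
r_f = rng.normal(size=(4, 5))
A_final = grn_from_cp_factors(s_f, t_f, r_f)
```

After normalization every weight lies in [-1, 1], which is consistent with the agonist perturbation later using 1 as the maximum possible weight.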

In-silico drug perturbation

To mimic the effect of the inhibitor perturbation on the gene regulatory network, we created the target-perturbed GRN (tpGRN) $A_{tp}$ by setting all values in the rows of the drug's direct target genes to 0. Conversely, to mimic the effects of agonist perturbations, we created the tpGRN by setting the weights of the out-edges emanating from the target gene node to their maximum possible value (weight = 1).
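A minimal sketch of the two perturbation modes is given below. It assumes that row i of the adjacency matrix stores the out-edges of gene i, and that the agonist mode only modifies edges that already exist (non-zero weights); both points, along with the function name `perturb_grn`, are assumptions made for illustration.

```python
import numpy as np

def perturb_grn(A, genes, targets, mode="inhibitor"):
    """Return a target-perturbed copy of the adjacency matrix A.

    genes: list of gene names in row/column order.
    targets: names of the drug's direct target genes.
    """
    A_tp = A.copy()
    idx = [genes.index(g) for g in targets]
    if mode == "inhibitor":
        A_tp[idx, :] = 0.0                 # silence the target gene rows
    elif mode == "agonist":
        for i in idx:
            row = A_tp[i]                  # view into A_tp
            row[row != 0] = 1.0            # assumption: existing edges only
    else:
        raise ValueError("mode must be 'inhibitor' or 'agonist'")
    return A_tp

genes = ["G1", "G2", "G3"]
A = np.array([[0.0, 0.5, -0.2],
              [0.3, 0.0, 0.0],
              [0.0, 0.8, 0.0]])
A_inh = perturb_grn(A, genes, ["G1"], mode="inhibitor")
A_ago = perturb_grn(A, genes, ["G1"], mode="agonist")
```

The original matrix is left untouched so that the untreated GRN and the tpGRN can be compared afterwards.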

Perturbation evaluation

To model the in-silico drug perturbation effect in the tpGRN, we considered both the global and local effects in the network, which we formulated using the distance between the same node in the manifold space and the network diffusion effect.

Calculation of gene node distance between GRNs

To estimate drug perturbation in terms of the network, we first jointly projected the two high-dimensional adjacency matrices into a shared low-dimensional latent space using manifold alignment (Figure 1E) and interpreted the distance between every matched gene node as the perturbation effect of the drug. Manifold alignment projects two datasets generated from a similar process into a new low-dimensional space, which maintains the smallest distance between matched points while retaining their original structure. A matched point with a greater distance represents greater topological changes of the point between the two datasets. To perform manifold alignment, we are given two adjacency matrices, $A \in \mathbb{R}^{p \times p}$ (the untreated GRN) and $A_{PB} \in \mathbb{R}^{p \times p}$ (the target-perturbed GRN), where the $p$ genes are those of the gene set $G$ used for constructing the GRN. Then, we can combine $A$ and $A_{PB}$ into a joint matrix $W$:

$$W = \begin{bmatrix} A & \lambda I \\ \lambda I & A_{PB} \end{bmatrix}$$

where $I$ is an identity matrix describing the correspondence between gene nodes in the two GRNs, and $\lambda$ is a tuning parameter. We then used the eigenvectors associated with the $d$ smallest non-zero eigenvalues of the Laplacian Eigenmaps problem as the shared low-dimensional projection. Let $E \in \mathbb{R}^{2p \times d} = (e_1, \ldots, e_{2p})^T$ denote the eigenvector matrix, whose rows $e_j$ and $e_{p+j}$ are the embeddings of gene $j$ in the two GRNs. Then, the Euclidean distance between the two matched nodes of each gene is calculated as $d_j = \lVert e_j - e_{p+j} \rVert$ for $j = 1, \ldots, p$, giving $D = (d_1, \ldots, d_p)$.
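The alignment and distance computation can be sketched with NumPy as below. Several simplifications are assumptions of this example rather than details from the text: weights are made non-negative and symmetric before building the Laplacian, the unnormalized graph Laplacian is used, and the values `d=2` and `lam=0.9` as well as the function name `node_distances` are hypothetical.

```python
import numpy as np

def node_distances(A, A_pb, d=2, lam=0.9):
    """Per-gene distance between two GRNs after joint spectral embedding."""
    p = A.shape[0]
    I = np.eye(p)
    # Joint matrix: the two networks on the diagonal, correspondences off it.
    W = np.block([[A, lam * I], [lam * I, A_pb]])
    # Assumption: absolute, symmetrized weights give a symmetric PSD Laplacian.
    W = np.abs(W)
    W = (W + W.T) / 2.0
    L = np.diag(W.sum(axis=1)) - W             # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)             # eigenvalues in ascending order
    keep = np.where(vals > 1e-8)[0][:d]        # d smallest non-zero eigenvalues
    E = vecs[:, keep]                          # 2p x d embedding
    # Rows j and p + j are the same gene in the two networks.
    return np.linalg.norm(E[:p] - E[p:], axis=1)

rng = np.random.default_rng(0)
A = rng.random((5, 5))
np.fill_diagonal(A, 0.0)
A_pb = A.copy()
A_pb[0, :] = 0.0                               # inhibit gene 0
D = node_distances(A, A_pb, d=2, lam=0.9)
```

The vector `D` holds one distance per gene, matching the $D_{Tar}$ and $D_n$ quantities used in the diffusion score.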

Network diffusion

To infer drug perturbation more accurately, we considered the diffusion effect in the network. Given a target gene Tar, we considered that the effect of perturbation would spread from the target gene into downstream genes alongside edges in the GRN. We assumed that this effect is related to both edge weight and node distance. Therefore, the perturbation score was calculated as follows:

$$\mathrm{score} = \frac{D_{Tar} \, W_{out}}{Deg_{Tar}} + \sum_{n \in N} D_n W_{in} + \sum_{n \in N} \frac{D_n W_{out}}{Deg_n}$$

where $D_{Tar}$ is the distance of the target gene, $D_n$ is the distance of downstream gene $n$, $Deg$ is the degree of the corresponding gene node, $W_{in}$ is the weight of the in-edge, and $W_{out}$ is the weight of the out-edge. The first term is the normalized drug perturbation effect on the target gene, the second term is the diffused effect on the 1-hop nodes, and the third term is the normalized diffused effect on the 2-hop nodes.

Data processing

For all ST and scRNA-seq datasets, cell type annotations were either provided by the authors or manually obtained based on their code. All cell types with more than 25 cells were subjected to the scRank pipeline. For mouse brain ST data, we used biomaRt (2.48.3) [90] to convert mouse genes to human genes in order to map the human DLPFC data using CellTrek.

For all ST and scRNA-seq datasets, raw counts were normalized using the global-scaling normalization method LogNormalize in preparation for downstream analysis, and raw counts were used to run the scRank pipeline.
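For reference, Seurat's LogNormalize divides each cell's counts by that cell's total count, multiplies by a scale factor (10,000 by default), and applies a natural-log transform with a pseudocount of 1. The NumPy sketch below mirrors that arithmetic for illustration only; the analysis itself used Seurat in R, and the function name `log_normalize` is hypothetical.

```python
import numpy as np

def log_normalize(counts, scale_factor=10_000):
    """LogNormalize-style transform for a genes x cells raw count matrix."""
    totals = counts.sum(axis=0, keepdims=True)      # total counts per cell
    totals = np.where(totals == 0, 1, totals)       # guard against empty cells
    return np.log1p(counts / totals * scale_factor)

counts = np.array([[1, 0],
                   [3, 2]])
normalized = log_normalize(counts)
```

In the toy matrix the first cell has 4 counts in total, so its first gene becomes log(1 + 1/4 × 10,000).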

Simulation

The simulated data were generated using the “SERGIO” Python package. Cell type expression profiles were generated following a predefined GRN, supplied as the SERGIO input file, which contains the GRN structure and its parameters. We defined genes that separate into four gene modules, each containing 25 co-regulated genes. Next, we adjusted the in-module strength by tuning the parameters of production rate and interaction strength to generate cell type-specific gene module activity. All GRNs of the simulated data were reconstructed using scRank from simulated single-cell expression data. Through parameter tuning, we evaluated the performance of scRank in multiple scenarios.

The first scenario was designed to generate cell types with an orderly increase in module activity. In practice, we simulated four datasets containing different cell numbers (1000, 3000, 5000, and 10000 cells of each cell type). Each dataset consisted of four cell types, with 100 genes each. Each cell type had three gene modules (gene modules 2, 3, and 4) with similar activity and one gene module (gene module 1) with gradient-increased activity across cell types. We assumed that the drug target gene was in gene module 1. We iteratively input genes from module 1 and single-cell gene expression profiles into the scRank model.

The second scenario was designed to generate six cell types, grouped into two sets, with high or low gene module activity. Cell types A, B, and C were assigned to the group with higher activity in module 1, whereas cell types D, E, and F had lower activity in module 1. We iteratively input genes from module 1 and single-cell gene expression profiles into the scRank model.

The last scenario was designed to generate cell types with cell type-specific gene modules. Specifically, gene modules 1/2/3/4 corresponded to cell types A/B/C/D. For each module, we iteratively input genes within the module and single-cell gene expression profiles into the scRank model.

To simulate the disease-treatment paired dataset, we first simulated the disease-like dataset with abnormally high co-expressed gene module 1, whose activity is gradient-increased in cell types as in scenario 1. Correspondingly, we also simulated a treatment-like dataset with very low activity in module 1, representing the potential status of the target gene being inhibited by the drug.

In the first simulation, we evaluated the performance of scRank in terms of identifying the continuous cellular state by comparing perturbation scores across cell types. Considering the drug target in gene module 1, we iteratively perturbed genes 1–25 for each cell type GRN in-silico to obtain the final perturbation score calculated by scRank. Performance was assessed using both the top-rank percentage of cell type A and the Spearman rank correlation coefficient (SRCC). In the second simulation, we evaluated the performance of scRank in terms of identifying discrete cellular states based on statistical significance. In the third simulation, we considered that each cell type with specific gene modules responds to a specific drug, indicating that when the drug target gene is in gene module 1, cell type A with specifically high module 1 activity should be most responsive to the drug among other cell types. For each module, we iteratively perturbed the genes in gene modules in-silico and assessed the performance of scRank for identifying cell type-specific responses to drugs based on statistical significance.
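The two metrics used in the first simulation can be sketched in plain Python. The data layout (one dictionary of cell type scores per perturbed gene), the function names, the toy scores, and the assumption that cell type A carries the highest module 1 activity are all illustrative choices, not details taken from the scRank code.

```python
def ranks(values):
    """Average ranks (1 = smallest); tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

def top_rank_percentage(score_table, expected_top):
    """Percentage of perturbed genes for which expected_top ranks first."""
    hits = sum(1 for s in score_table if max(s, key=s.get) == expected_top)
    return 100.0 * hits / len(score_table)

# Hypothetical perturbation scores for three perturbed genes of module 1.
score_table = [
    {"A": 0.90, "B": 0.60, "C": 0.40, "D": 0.10},
    {"A": 0.80, "B": 0.85, "C": 0.30, "D": 0.20},
    {"A": 0.70, "B": 0.50, "C": 0.60, "D": 0.10},
]
expected_activity = [4, 3, 2, 1]   # assumed module 1 activity order A > B > C > D
pct_top = top_rank_percentage(score_table, "A")
srcc = [spearman([s[c] for c in "ABCD"], expected_activity) for s in score_table]
```

A perturbed gene whose scores follow the assumed activity gradient exactly yields a correlation of 1, while swapping two adjacent cell types lowers it.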

In the fourth simulation, based on the disease-treatment paired datasets, we compared scRank to other cell type-ranking methods. For scRank, we only used the disease-like dataset to rank cell types. Classic differential expression analysis (via Wilcoxon’s rank-sum test in Seurat) and Augur both analyze the disease and treatment datasets as input data in order to prioritize cell types. Before running these two methods, gene expression data were normalized and integrated across conditions, followed by the standard integration workflow in Seurat with default parameters.

Comparison with other methods

Simulated data and five real datasets were used to compare the performance of scRank to that of other existing cell-type ranking methods. For each condition in simulated and real datasets with paired conditions, the number of DEGs for each cell type between pre-treatment and post-treatment conditions was used to quantify the extent of drug perturbation and rank cell types. The AUC score representing the perturbation degree was quantified by Augur to rank cell types. The top-ranking percentage of the targeted cell type was calculated to evaluate the performance of the methods.

In the comparison study with other perturbation prediction methods (scGen, scVIDR, and CellOracle), we used the length of the estimated perturbation vector as the magnitude of perturbation and, in turn, used it to rank cell types. Given that CellOracle is specifically designed to support perturbations of transcription factors (TFs), we selected the TF gene exhibiting the highest fold change between the pre-treatment and post-treatment states within the drug-targeted cell type to serve as the perturbed gene for input into CellOracle.

In the comparison study with other drug response prediction methods, we employed different strategies to evaluate the performance of beyondcell and scDEAL, using different metrics to rank cell populations based on their responsiveness to treatment. Beyondcell generates a beyondcell score (BCS) for each cell, which measures the susceptibility of each cell to a given drug. To rank cell types using beyondcell, we calculated the average BCS across all cells within a cell subpopulation. This approach, referred to as “beyondcell”, focuses on population-level prioritization. Additionally, given that beyondcell offers single-cell resolution, we implemented an alternative approach, “bc_BCS”, in which we ranked every cell individually based on its BCS and then calculated the average rank for each cell type. Moreover, beyondcell defines a switch point of the BCS to categorize cells as sensitive or resistant. Following this concept, we ranked cell types based on the average of their binary prioritization (0 for resistant, 1 for sensitive) under the switch point criterion. This approach is referred to as “bc_SwitchBinary”. scDEAL assigns a relative response probability to each cell, known as the sensitive score. To rank cell types with scDEAL, we averaged these sensitive scores across all cells within a cell type. This approach, referred to as “scDEAL”, similarly evaluates population-level responsiveness. Like beyondcell, scDEAL also provides a binary label (sensitive or resistant) for each cell. We used this label to rank cell types by averaging the binary prioritization for each cell type (0 for resistant, 1 for sensitive). This approach is referred to as “scDEAL_binary”.
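The population-level ranking strategies above all reduce to averaging a per-cell value within each cell type and sorting the averages. The sketch below shows that aggregation with made-up inputs; it does not call beyondcell or scDEAL, the per-cell scores and binary labels are taken as given, and the function name `rank_by_mean` is hypothetical.

```python
from collections import defaultdict

def rank_by_mean(cell_types, values):
    """Rank cell types by the mean of a per-cell value (highest first)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for cell_type, value in zip(cell_types, values):
        sums[cell_type] += value
        counts[cell_type] += 1
    means = {ct: sums[ct] / counts[ct] for ct in sums}
    return sorted(means, key=means.get, reverse=True), means

# Hypothetical per-cell annotations, continuous scores, and binary labels.
cell_types = ["T", "T", "B", "B", "Mono"]
scores = [0.9, 0.7, 0.2, 0.4, 0.5]        # e.g. a per-cell sensitivity score
labels = [1, 1, 0, 1, 0]                  # 1 = sensitive, 0 = resistant

order_by_score, _ = rank_by_mean(cell_types, scores)
order_by_label, _ = rank_by_mean(cell_types, labels)
```

Passing per-cell ranks instead of scores as `values` would give an averaged-rank variant in the same way.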

In the comparison with an expression-based method, we used the mean expression levels of both direct and downstream target genes across cell types in untreated conditions to rank cell types. The downstream targets refer to network neighbor genes in the modularized target gene-associated network.

For all tools, we followed the guidance in their original repositories and used the default values for all parameters.

Benchmarking tests of scRank with different clustering methods

We benchmarked scRank’s performance across three clustering methods (Leiden, Louvain, and scSHC [91]) on three real-world datasets showcasing an increasing complexity of cell states, ranging from distinct cell types to more intricate cellular states. In practice, Leiden and Louvain were executed via the function “FindClusters” in the Seurat package with resolutions of 0.1, 0.5, and 1, whereas scSHC was run without such parameter tuning. The known responsive cell types were used as ground truth labels. The proportion of these labels within the highest-ranking cell types predicted by scRank, based on clusters identified by each of the three algorithms, indicates the performance of scRank under different clustering methods.

GSEA and GO enrichment analyses

The Metascape web tool (https://metascape.org/) [92] was used to perform enrichment analysis of pathways and biological processes, wherein the top 100 DEGs were selected according to the fold-change in average gene expression. GSEA was performed on the ranked gene list with the clusterProfiler [93] tool to identify significantly activated pathways and biological processes, whose signatures were obtained from the Molecular Signatures Database v7.4 [94] (MSigDB, http://www.gsea-msigdb.org/gsea/msigdb), including GO and canonical pathway gene sets derived from the KEGG, Reactome, and WikiPathways pathway databases.

Module analysis

We extracted discrete gene clusters from the hierarchical clustering using the R function cutree, with the number of clusters estimated using dynamicTreeCut. Then, module activity was calculated using the average edge weights of all genes in the same cluster.

Survival analysis

The human medulloblastoma dataset (GEO: GSE124814) was used to evaluate the prognostic performance of the samples containing different proportions of drug-resistant and drug-sensitive tumor cells. We extracted the signatures of the two types of tumor cells using their marker genes and then used the “AddModuleScore” function in the Seurat R package to evaluate the degree to which samples harbor sensitive or resistant cells. The low and high groups were determined based on the mean module score. The Kaplan-Meier survival curves for these two groups of patients were drawn using the survival package in R (v3.2-13).

Animal experiment

All animal studies were approved by the ZJU-Laboratory Animal Welfare and Ethics Review Committee (Permit No. ZJU20230087). Male C57BL/6J mice, aged 6–8 weeks, were purchased from Beijing Vital River Laboratory Animal Technology Co., Ltd. The animals were acclimatized and then subjected to myocardial infarction modeling. Mice were anesthetized with an intraperitoneal injection of 150 mg/kg tribromoethanol and secured in a supine position. Following tracheal intubation, the animals were connected to a ventilator. A thoracotomy was performed between the third and fourth left intercostal spaces to expose the heart, and the pericardium was incised. The left anterior descending (LAD) coronary artery was ligated with a 7-0 nylon suture. The thorax was then sequentially closed, and the skin was disinfected with iodine solution. In the sham operation group, the procedure was identical except that the suture was passed under the artery without ligation. Post-surgery, animals were randomly divided into groups. The treatment group received oral administration of tanshinone IIA (20 mg/kg/day) for three consecutive days, whereas the sham and model groups were given an equivalent dose of the vehicle (0.5% sodium carboxymethyl cellulose solution). Three days after treatment, the mice were re-anesthetized, and their hearts were perfused with chilled saline, followed by fixation in formaldehyde for histological examination or snap freezing in liquid nitrogen for subsequent Western blot and PCR analyses.

Immunofluorescence

Heart tissues were fixed in 4% paraformaldehyde and embedded in paraffin. Sections (5 μm thickness) were cut and mounted on microscope slides. Paraffin-embedded heart samples were dewaxed and rehydrated in xylene and ethyl alcohol, followed by incubation in 0.3% methanol to block endogenous peroxidase. For the double staining of CTSB and CD68, samples were stained first with anti-CTSB (Proteintech, 21327-1-AP) and HRP-conjugated goat anti-rabbit IgG antibody using the Cy3-TSA kit, and then with anti-CD68 (Servicebio, GB113109) and Alexa Fluor 488-conjugated goat anti-rabbit IgG antibody, according to the manufacturer’s protocol. Nuclei were stained with DAPI. Images were acquired using fluorescence microscopy (Nikon, Eclipse C1) and photographed with the imaging system (Nikon, DS-U3).

Western blot

Heart tissues were lysed in lysis buffer containing phenylmethylsulfonyl fluoride (PMSF) and a protease inhibitor cocktail. Total proteins were separated by SDS-PAGE and blotted onto polyvinylidene fluoride (PVDF) membranes. The membranes were probed with antibodies against Cathepsin B, Cathepsin D, CD68 (all from Cell Signaling Technology), and beta Tubulin (from Beyotime, Hangzhou, China, https://www.beyotime.com/), followed by exposure to horseradish peroxidase-conjugated secondary antibodies (from Beyotime). ECL reagent was used to develop the membranes.

Quantitative reverse transcription-PCR (RT-qPCR)

The primer sequences used to perform real-time semiquantitative PCR are listed as follows. actin: forward primer: 5′-GTG CTA TGT TGC TCT AGA CTT CG-3′, reverse primer: 5′-ATG CCA CAG GAT TCC ATA CC-3′; CTSB: forward primer: 5′- TCC TTG ATC CTT CTT TCT TGCC-3′, reverse primer: 5′-ACA GTG CCA CAC AGC TTC TTC-3′; CTSD: forward primer: 5′-GCT TCC GGT CTT TGA CAA CCT-3′, reverse primer: 5′-CAC CAA GCA TTA GTT CTC CTCC-3′; CD68: forward primer: 5′-TGT CTG ATC TTG CTA GGA CCG-3′, reverse primer: 5′-GAG AGT AAC GGC CTT TTT GTGA-3′; LGMN: forward primer: 5′-TGG ACG ATC CCG AGG ATGG-3′, reverse primer: 5′-GTG GAT GAT CTG GTA GGC GT-3′; FCGR1: forward primer: 5′-TGC TGG ATT CTA CTG GTG TGA-3′, reverse primer: 5′-AAA CCA GAC AGG AGC TGA TGA-3′. Actin was amplified as an endogenous reference gene.

Molecular docking of tanshinone IIA to predicted targets

The ligand for tanshinone IIA was prepared using the Python package “Meeko” [97] with default parameters. The predicted target used for docking was downloaded from the Protein Data Bank (PDB), with the deletion of heteroatoms and water. Molecular modeling was performed in AutoDock Vina v1.2.3 [98], with the previously prepared ligand and protein sets. The ligands were docked to this area with an exhaustiveness of 32. The resulting poses were exported along with theoretical binding affinities. The theoretical Ki values were calculated from the theoretical binding affinities using AutoDock Vina software.

Surface plasmon resonance (SPR) experiments

The surface plasmon resonance (SPR) experiments were carried out using a BIAcore 8K to measure the binding affinities. Human CTSB (ACRO) was diluted in sodium acetate solution (pH 4.5) to a final concentration of 30 μg/mL. CTSB was immobilized on a CM5 sensor chip (Cytiva) by amine coupling to reach target densities of 20000 resonance units (RU). Immobilized CTSB was used to capture the chemical compound. The running buffer contained PBS with 0.05% Tween 20 (pH 7.4) and 5% DMSO. Then, 6 concentrations of tanshinone IIA sulfonate (12.5 μM, 25 μM, 50 μM, 100 μM, 200 μM) were injected at a flow rate of 30 μL/min, with an association time of 60 s and a dissociation time of 150 s.

Quantification and statistical analysis

The quantitative and statistical analyses are described in the relevant sections of the Method details and in the figure legends. R (version 4.1.1) and Python 3.9 were used for all statistical analyses.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (U23A20513 to X.F., 82173941 to Yi Wang, and 82200725 to X.S.), the Ningbo Top Medical and Health Research Program (no. 2022030309), and the Fundamental Research Funds for the Central Universities (226-2024-00001 to X.F.). The authors thank the High-Performance Computing Cluster of the Zhejiang University Innovation Center of Yangtze River Delta and the Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare for their technical support.

Author contributions

Yi Wang, X.F., and X.S. conceived and designed the study. C.L., P.Y., and K.J. collected and analyzed the scRNA-seq and spatial transcriptomics (ST) data. C.L. and X.S. constructed the framework and developed the package of scRank. S.Z. and Yingchao Wang designed and conducted the experiment. All authors wrote the manuscript and read and approved the final manuscript.

Declaration of interests

The authors declare no competing interests.

Published: May 15, 2024

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.xcrm.2024.101568.

Contributor Information

Xin Shao, Email: xin_shao@zju.edu.cn.

Xiaohui Fan, Email: fanxh@zju.edu.cn.

Yi Wang, Email: zjuwangyi@zju.edu.cn.

Supplemental information

Document S1. Figures S1–S18 and Tables S1–S3
Document S2. Article plus supplemental information

References

  • 1. Zeng H. What is a cell type and how to define it? Cell. 2022;185:2739–2755. doi: 10.1016/j.cell.2022.06.031.
  • 2. Vidal M., Cusick M.E., Barabási A.L. Interactome Networks and Human Disease. Cell. 2011;144:986–998. doi: 10.1016/j.cell.2011.02.016.
  • 3. Covey T.M., Putta S., Cesano A. Single Cell Network Profiling (SCNP): Mapping Drug and Target Interactions. Assay Drug Dev. Technol. 2010;8:321–343. doi: 10.1089/adt.2009.0251.
  • 4. Blucher A.S., McWeeney S.K., Stein L., Wu G. Visualization of drug target interactions in the contexts of pathways and networks with ReactomeFIViz. F1000Res. 2019;8:908. doi: 10.12688/f1000research.19592.1.
  • 5. Vincent E.E., Elder D.J.E., O'Flaherty L., Pardo O.E., Dzien P., Phillips L., Morgan C., Pawade J., May M.T., Sohail M., et al. Glycogen Synthase Kinase 3 Protein Kinase Activity Is Frequently Elevated in Human Non-Small Cell Lung Carcinoma and Supports Tumour Cell Proliferation. PLoS One. 2014;9 doi: 10.1371/journal.pone.0114725.
  • 6. Raghavan S., Winter P.S., Navia A.W., Williams H.L., DenAdel A., Lowder K.E., Galvez-Reyes J., Kalekar R.L., Mulugeta N., Kapner K.S., et al. Microenvironment drives cell state, plasticity, and drug response in pancreatic cancer. Cell. 2021;184:6119–6137.e26. doi: 10.1016/j.cell.2021.11.017.
  • 7. Zhao W., Dovas A., Spinazzi E.F., Levitin H.M., Banu M.A., Upadhyayula P., Sudhakar T., Marie T., Otten M.L., Sisti M.B., et al. Deconvolution of cell type-specific drug responses in human tumor tissue with single-cell RNA-seq. Genome Med. 2021;13 doi: 10.1186/s13073-021-00894-y.
  • 8. Goyal Y., Busch G.T., Pillai M., Li J., Boe R.H., Grody E.I., Chelvanambi M., Dardani I.P., Emert B., Bodkin N., et al. Diverse clonal fates emerge upon drug treatment of homogeneous cancer cells. Nature. 2023;620:651–659. doi: 10.1038/s41586-023-06342-8.
  • 9. Shao X., Liao J., Lu X., Xue R., Ai N., Fan X. scCATCH: Automatic Annotation on Cell Types of Clusters from Single-Cell RNA Sequencing Data. iScience. 2020;23 doi: 10.1016/j.isci.2020.100882.
  • 10. Liao J., Qian J., Fang Y., Chen Z., Zhuang X., Zhang N., Shao X., Hu Y., Yang P., Cheng J., et al. De novo analysis of bulk RNA-seq data at spatially resolved single-cell resolution. Nat. Commun. 2022;13:6498. doi: 10.1038/s41467-022-34271-z.
  • 11. Preface for Special Issue: Single-Cell and Spatially Resolved Omics. J. Pharm. Anal. 2023;13:689–690. doi: 10.1016/j.jpha.2023.07.005.
  • 12. Shao X., Lu X., Liao J., Chen H., Fan X. New avenues for systematically inferring cell-cell communication: through single-cell transcriptomics data. Protein Cell. 2020;11:866–880. doi: 10.1007/s13238-020-00727-5.
  • 13. Yang Y., Li G., Zhong Y., Xu Q., Lin Y.-T., Roman-Vicharra C., Chapkin R.S., Cai J.J. scTenifoldXct: A semi-supervised method for predicting cell-cell interactions and mapping cellular communication graphs. Cell Syst. 2023;14:302–311.e4. doi: 10.1016/j.cels.2023.01.004.
  • 14. Aibar S., González-Blas C.B., Moerman T., Huynh-Thu V.A., Imrichova H., Hulselmans G., Rambow F., Marine J.-C., Geurts P., Aerts J., et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods. 2017;14:1083–1086. doi: 10.1038/nmeth.4463.
  • 15. Hahn W.C., Bader J.S., Braun T.P., Califano A., Clemons P.A., Druker B.J., Ewald A.J., Fu H., Jagu S., Kemp C.J., et al. An expanded universe of cancer targets. Cell. 2021;184:1142–1155. doi: 10.1016/j.cell.2021.02.020.
  • 16. Balzer M.S., Doke T., Yang Y.-W., Aldridge D.L., Hu H., Mai H., Mukhi D., Ma Z., Shrestha R., Palmer M.B., et al. Single-cell analysis highlights differences in druggable pathways underlying adaptive or fibrotic kidney regeneration. Nat. Commun. 2022;13:4018. doi: 10.1038/s41467-022-31772-9.
  • 17. Fustero-Torre C., Jiménez-Santos M.J., García-Martín S., Carretero-Puche C., García-Jimeno L., Ivanchuk V., Di Domenico T., Gómez-López G., Al-Shahrour F. Beyondcell: targeting cancer therapeutic heterogeneity in single-cell RNA-seq data. Genome Med. 2021;13:187. doi: 10.1186/s13073-021-01001-x.
  • 18. Johnson T.S., Yu C.Y., Huang Z., Xu S., Wang T., Dong C., Shao W., Zaid M.A., Huang X., Wang Y., et al. Diagnostic Evidence GAuge of Single cells (DEGAS): a flexible deep transfer learning framework for prioritizing cells in relation to disease. Genome Med. 2022;14:11. doi: 10.1186/s13073-022-01012-2.
  • 19. Sun D., Guan X., Moran A.E., Wu L.-Y., Qian D.Z., Schedin P., Dai M.-S., Danilov A.V., Alumkal J.J., Adey A.C., et al. Identifying phenotype-associated subpopulations by integrating bulk and single-cell sequencing data. Nat. Biotechnol. 2022;40:527–538. doi: 10.1038/s41587-021-01091-3.
  • 20. Gan Z., Gangadharan V., Liu S., Körber C., Tan L.L., Li H., Oswald M.J., Kang J., Martin-Cortecero J., Männich D., et al. Layer-specific pain relief pathways originating from primary motor cortex. Science. 2022;378:1336–1343. doi: 10.1126/science.add4391.
  • 21. Jin K., Gao S., Yang P., Guo R., Li D., Zhang Y., Lu X., Fan G., Fan X. Single-Cell RNA Sequencing Reveals the Temporal Diversity and Dynamics of Cardiac Immunity after Myocardial Infarction. Small Methods. 2022;6 doi: 10.1002/smtd.202100752.
  • 22. Kim M., Jeong M., Hur S., Cho Y., Park J., Jung H., Seo Y., Woo H.A., Nam K.T., Lee K., Lee H. Engineered ionizable lipid nanoparticles for targeted delivery of RNA therapeutics into different types of cells in the liver. Sci. Adv. 2021;7 doi: 10.1126/sciadv.abf4398.
  • 23. Avey D., Sankararaman S., Yim A.K.Y., Barve R., Milbrandt J., Mitra R.D. Single-Cell RNA-Seq Uncovers a Robust Transcriptional Response to Morphine by Glia. Cell Rep. 2018;24:3619–3629.e4. doi: 10.1016/j.celrep.2018.08.080.
  • 24. Skinnider M.A., Squair J.W., Kathe C., Anderson M.A., Gautier M., Matson K.J.E., Milano M., Hutson T.H., Barraud Q., Phillips A.A., et al. Cell type prioritization in single-cell data. Nat. Biotechnol. 2021;39:30–34. doi: 10.1038/s41587-020-0605-1.
  • 25. Eraslan G., Drokhlyansky E., Anand S., Fiskin E., Subramanian A., Slyper M., Wang J., Van Wittenberghe N., Rouhana J.M., Waldman J., et al. Single-nucleus cross-tissue molecular reference maps toward understanding disease gene function. Science. 2022;376 doi: 10.1126/science.abl4290.
  • 26. Smillie C.S., Biton M., Ordovas-Montanes J., Sullivan K.M., Burgin G., Graham D.B., Herbst R.H., Rogel N., Slyper M., Waldman J., et al. Intra- and Inter-cellular Rewiring of the Human Colon during Ulcerative Colitis. Cell. 2019;178:714–730.e22. doi: 10.1016/j.cell.2019.06.029.
  • 27. Barabási A.L., Oltvai Z.N. Network biology: understanding the cell’s functional organization. Nat. Rev. Genet. 2004;5:101–113. doi: 10.1038/nrg1272.
  • 28. Woo J.H., Shimoni Y., Yang W.S., Subramaniam P., Iyer A., Nicoletti P., Rodríguez Martínez M., López G., Mattioli M., Realubit R., et al. Elucidating Compound Mechanism of Action by Network Perturbation Analysis. Cell. 2015;162:441–451. doi: 10.1016/j.cell.2015.05.056.
  • 29. Vu H., Carey C., Mahadevan S. Manifold Warping: Manifold Alignment over Time. Proc. AAAI Conf. Artif. Intell. 2021;26:1155–1161. doi: 10.1609/aaai.v26i1.8281.
  • 30. Cowen L., Ideker T., Raphael B.J., Sharan R. Network propagation: a universal amplifier of genetic associations. Nat. Rev. Genet. 2017;18:551–562. doi: 10.1038/nrg.2017.38.
  • 31. Isik Z., Baldow C., Cannistraci C.V., Schroeder M. Drug target prioritization by perturbed gene expression and network information. Sci. Rep. 2015;5 doi: 10.1038/srep17417.
  • 32.Paci P., Fiscon G., Conte F., Wang R.-S., Farina L., Loscalzo J. Gene co-expression in the interactome: moving from correlation toward causation via an integrated approach to disease module discovery. Npj Syst. Biol. Appl. 2021;7:3–11. doi: 10.1038/s41540-020-00168-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Ruiz C., Zitnik M., Leskovec J. Identification of disease treatment mechanisms through the multiscale interactome. Nat. Commun. 2021;12:1796. doi: 10.1038/s41467-021-21770-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Fernández-Torras A., Duran-Frigola M., Aloy P. Encircling the regions of the pharmacogenomic landscape that determine drug response. Genome Med. 2019;11:17. doi: 10.1186/s13073-019-0626-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Santolini M., Barabási A.L. Predicting perturbation patterns from the topology of biological networks. Proc. Natl. Acad. Sci. USA. 2018;115:E6375–E6383. doi: 10.1073/pnas.1720589115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Hao Y., Hao S., Andersen-Nissen E., Mauck W.M., Zheng S., Butler A., Lee M.J., Wilk A.J., Darby C., Zager M., et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184:3573–3587.e29. doi: 10.1016/j.cell.2021.04.048. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Hu H., Miao Y.-R., Jia L.-H., Yu Q.-Y., Zhang Q., Guo A.-Y. AnimalTFDB 3.0: a comprehensive resource for annotation and prediction of animal transcription factors. Nucleic Acids Res. 2019;47:D33–D38. doi: 10.1093/nar/gky822.
  • 38.Freshour S.L., Kiwala S., Cotto K.C., Coffman A.C., McMichael J.F., Song J.J., Griffith M., Griffith O.L., Wagner A.H. Integration of the Drug–Gene Interaction Database (DGIdb 4.0) with open crowdsource efforts. Nucleic Acids Res. 2021;49:D1144–D1151. doi: 10.1093/nar/gkaa1084.
  • 39.Osorio D., Zhong Y., Li G., Huang J.Z., Cai J.J. scTenifoldNet: A Machine Learning Workflow for Constructing and Comparing Transcriptome-wide Gene Regulatory Networks from Single-Cell Data. Patterns. 2020;1 doi: 10.1016/j.patter.2020.100139.
  • 40.Osorio D., Zhong Y., Li G., Xu Q., Yang Y., Tian Y., Chapkin R.S., Huang J.Z., Cai J.J. scTenifoldKnk: An efficient virtual knockout tool for gene function predictions via single-cell gene regulatory network perturbation. Patterns. 2022;3 doi: 10.1016/j.patter.2022.100434.
  • 41.Jolliffe I.T. In: Principal Component Analysis Springer Series in Statistics. Jolliffe I.T., editor. Springer; 1986. Principal Components in Regression Analysis; pp. 129–155.
  • 42.Rabanser S., Shchur O., Günnemann S. Introduction to Tensor Decompositions and their Applications. Mach. Learn. 2017 doi: 10.48550/ARXIV.1711.10781.
  • 43.Dibaeinia P., Sinha S. SERGIO: A Single-Cell Expression Simulator Guided by Gene Regulatory Networks. Cell Syst. 2020;11:252–271.e11. doi: 10.1016/j.cels.2020.08.003.
  • 44.Finak G., McDavid A., Yajima M., Deng J., Gersuk V., Shalek A.K., Slichter C.K., Miller H.W., McElrath M.J., Prlic M., et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol. 2015;16:278. doi: 10.1186/s13059-015-0844-5.
  • 45.McDavid A., Finak G., Chattopadyay P.K., Dominguez M., Lamoreaux L., Ma S.S., Roederer M., Gottardo R. Data exploration, quality control and testing in single-cell qPCR-based gene expression experiments. Bioinformatics. 2013;29:461–467. doi: 10.1093/bioinformatics/bts714.
  • 46.Love M.I., Huber W., Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15:550. doi: 10.1186/s13059-014-0550-8.
  • 47.Ocasio J.K., Babcock B., Malawsky D., Weir S.J., Loo L., Simon J.M., Zylka M.J., Hwang D., Dismuke T., Sokolsky M., et al. scRNA-seq in medulloblastoma shows cellular heterogeneity and lineage expansion support resistance to SHH inhibitor therapy. Nat. Commun. 2019;10:5829. doi: 10.1038/s41467-019-13657-6.
  • 48.Leimkühler N.B., Gleitz H.F.E., Ronghui L., Snoeren I.A.M., Fuchs S.N.R., Nagai J.S., Banjanin B., Lam K.H., Vogl T., Kuppe C., et al. Heterogeneous bone-marrow stromal progenitors drive myelofibrosis via a druggable alarmin axis. Cell Stem Cell. 2021;28:637–652.e8. doi: 10.1016/j.stem.2020.11.004.
  • 49.Mead B.E., Hattori K., Levy L., Imada S., Goto N., Vukovic M., Sze D., Kummerlowe C., Matute J.D., Duan J., et al. Screening for modulators of the cellular composition of gut epithelia via organoid models of intestinal stem cell differentiation. Nat. Biomed. Eng. 2022;6:476–494. doi: 10.1038/s41551-022-00863-9.
  • 50.Wu J., Sun Z., Yang S., Fu J., Fan Y., Wang N., Hu J., Ma L., Peng C., Wang Z., et al. Kidney single-cell transcriptome profile reveals distinct response of proximal tubule cells to SGLT2i and ARB treatment in diabetic mice. Mol. Ther. 2022;30:1741–1753. doi: 10.1016/j.ymthe.2021.10.013.
  • 51.Poonpanichakul T., Shiao M.-S., Jiravejchakul N., Matangkasombut P., Sirachainan E., Charoensawan V., Jinawath N. Capturing tumour heterogeneity in pre- and post-chemotherapy colorectal cancer ascites-derived cells using single-cell RNA-sequencing. Biosci. Rep. 2021;41 doi: 10.1042/BSR20212093.
  • 52.Lotfollahi M., Wolf F.A., Theis F.J. scGen predicts single-cell perturbation responses. Nat. Methods. 2019;16:715–721. doi: 10.1038/s41592-019-0494-8.
  • 53.Kana O., Nault R., Filipovic D., Marri D., Zacharewski T., Bhattacharya S. Generative modeling of single-cell gene expression for dose-dependent chemical perturbations. Patterns. 2023;4 doi: 10.1016/j.patter.2023.100817.
  • 54.Kamimoto K., Stringa B., Hoffmann C.M., Jindal K., Solnica-Krezel L., Morris S.A. Dissecting cell identity via network inference and in silico gene perturbation. Nature. 2023;614:742–751. doi: 10.1038/s41586-022-05688-9.
  • 55.Tuong Z.K., Loudon K.W., Berry B., Richoz N., Jones J., Tan X., Nguyen Q., George A., Hori S., Field S., et al. Resolving the immune landscape of human prostate at a single-cell level in health and cancer. Cell Rep. 2021;37 doi: 10.1016/j.celrep.2021.110132.
  • 56.Lee H.-O., Hong Y., Etlioglu H.E., Cho Y.B., Pomella V., Van den Bosch B., Vanhecke J., Verbandt S., Hong H., Min J.-W., et al. Lineage-dependent gene expression programs influence the immune landscape of colorectal cancer. Nat. Genet. 2020;52:594–603. doi: 10.1038/s41588-020-0636-z.
  • 57.Khaliq A.M., Erdogan C., Kurt Z., Turgut S.S., Grunvald M.W., Rand T., Khare S., Borgia J.A., Hayden D.M., Pappas S.G., et al. Refining colorectal cancer classification and clinical stratification through a single-cell atlas. Genome Biol. 2022;23:113. doi: 10.1186/s13059-022-02677-z.
  • 58.Ho Y.-J., Anaparthy N., Molik D., Mathew G., Aicher T., Patel A., Hicks J., Hammell M.G. Single-cell RNA-seq analysis identifies markers of resistance to targeted BRAF inhibitors in melanoma cell populations. Genome Res. 2018;28:1353–1363. doi: 10.1101/gr.234062.117.
  • 59.Kinker G.S., Greenwald A.C., Tal R., Orlova Z., Cuoco M.S., McFarland J.M., Warren A., Rodman C., Roth J.A., Bender S.A., et al. Pan-cancer single-cell RNA-seq identifies recurring programs of cellular heterogeneity. Nat. Genet. 2020;52:1208–1218. doi: 10.1038/s41588-020-00726-6.
  • 60.Chen J., Wang X., Ma A., Wang Q.-E., Liu B., Li L., Xu D., Ma Q. Deep transfer learning of cancer drug responses by integrating bulk and single-cell RNA-seq data. Nat. Commun. 2022;13:6494. doi: 10.1038/s41467-022-34277-7.
  • 61.Seashore-Ludlow B., Rees M.G., Cheah J.H., Cokol M., Price E.V., Coletti M.E., Jones V., Bodycombe N.E., Soule C.K., Gould J., et al. Harnessing Connectivity in a Large-Scale Small-Molecule Sensitivity Dataset. Cancer Discov. 2015;5:1210–1223. doi: 10.1158/2159-8290.CD-15-0235.
  • 62.D’Amico D., Antonucci L., Di Magno L., Coni S., Sdruscia G., Macone A., Miele E., Infante P., Di Marcotullio L., De Smaele E., et al. Non-canonical Hedgehog/AMPK-Mediated Control of Polyamine Metabolism Supports Neuronal and Medulloblastoma Cell Growth. Dev. Cell. 2015;35:21–35. doi: 10.1016/j.devcel.2015.09.008.
  • 63.Weishaupt H., Johansson P., Sundström A., Lubovac-Pilav Z., Olsson B., Nelander S., Swartling F.J. Batch-normalization of cerebellar and medulloblastoma gene expression datasets utilizing empirically defined negative control genes. Bioinformatics. 2019;35:3357–3364. doi: 10.1093/bioinformatics/btz066.
  • 64.Sangkuhl K., Klein T.E., Altman R.B. Selective serotonin reuptake inhibitors pathway. Pharmacogenetics Genom. 2009;19:907–909. doi: 10.1097/FPC.0b013e32833132cb.
  • 65.Nagy C., Maitra M., Tanti A., Suderman M., Théroux J.F., Davoli M.A., Perlman K., Yerko V., Wang Y.C., Tripathy S.J., et al. Single-nucleus transcriptomics of the prefrontal cortex in major depressive disorder implicates oligodendrocyte precursor cells and excitatory neurons. Nat. Neurosci. 2020;23:771–781. doi: 10.1038/s41593-020-0621-y.
  • 66.Iceta R., Mesonero J.E., Alcalde A.I. Effect of long-term fluoxetine treatment on the human serotonin transporter in Caco-2 cells. Life Sci. 2007;80:1517–1524. doi: 10.1016/j.lfs.2007.01.020.
  • 67.Drago A., De Ronchi D., Serretti A. Pharmacogenetics of antidepressant response: An update. Hum. Genom. 2009;3:257–274. doi: 10.1186/1479-7364-3-3-257.
  • 68.Serretti A., Artioli P. The pharmacogenomics of selective serotonin reuptake inhibitors. Pharmacogenomics J. 2004;4:233–244. doi: 10.1038/sj.tpj.6500250.
  • 69.Soiza-Reilly M., Meye F.J., Olusakin J., Telley L., Petit E., Chen X., Mameli M., Jabaudon D., Sze J.-Y., Gaspar P. SSRIs target prefrontal to raphe circuits during development modulating synaptic connectivity and emotional behavior. Mol. Psychiatr. 2019;24:726–745. doi: 10.1038/s41380-018-0260-9.
  • 70.Chen X., Petit E.I., Dobrenis K., Sze J.Y. Spatiotemporal SERT expression in cortical map development. Neurochem. Int. 2016;98:129–137. doi: 10.1016/j.neuint.2016.05.010.
  • 71.Cotter D., Mackay D., Chana G., Beasley C., Landau S., Everall I.P. Reduced Neuronal Size and Glial Cell Density in Area 9 of the Dorsolateral Prefrontal Cortex in Subjects with Major Depressive Disorder. Cerebr. Cortex. 2002;12:386–394. doi: 10.1093/cercor/12.4.386.
  • 72.Maynard K.R., Collado-Torres L., Weber L.M., Uytingco C., Barry B.K., Williams S.R., Catallini J.L., Tran M.N., Besich Z., Tippani M., et al. Transcriptome-scale spatial gene expression in the human dorsolateral prefrontal cortex. Nat. Neurosci. 2021;24:425–436. doi: 10.1038/s41593-020-00787-0.
  • 73.Wei R., He S., Bai S., Sei E., Hu M., Thompson A., Chen K., Krishnamurthy S., Navin N.E. Spatial charting of single-cell transcriptomes in tissues. Nat. Biotechnol. 2022;40:1190–1199. doi: 10.1038/s41587-022-01233-1.
  • 74.Swirski F.K., Nahrendorf M. Leukocyte behavior in atherosclerosis, myocardial infarction, and heart failure. Science. 2013;339:161–166. doi: 10.1126/science.1230719.
  • 75.Nahrendorf M., Swirski F.K. Abandoning M1/M2 for a Network Model of Macrophage Function. Circ. Res. 2016;119:414–417. doi: 10.1161/CIRCRESAHA.116.309194.
  • 76.Panwar P., Law S., Jamroz A., Azizi P., Zhang D., Ciufolini M., Brömme D. Tanshinones that selectively block the collagenase activity of cathepsin K provide a novel class of ectosteric antiresorptive agents for bone: Ectosteric inhibitors of cathepsin K. Br. J. Pharmacol. 2018;175:902–923. doi: 10.1111/bph.14133.
  • 77.Gao S., Liu Z., Li H., Little P.J., Liu P., Xu S. Cardiovascular actions and therapeutic potential of tanshinone IIA. Atherosclerosis. 2012;220:3–10. doi: 10.1016/j.atherosclerosis.2011.06.041.
  • 78.Zhang X., Wang Q., Wang X., Chen X., Shao M., Zhang Q., Guo D., Wu Y., Li C., Wang W., Wang Y. Tanshinone IIA protects against heart failure post-myocardial infarction via AMPKs/mTOR-dependent autophagy pathway. Biomed. Pharmacother. 2019;112 doi: 10.1016/j.biopha.2019.108599.
  • 79.Ni J., Wu Z., Stoka V., Meng J., Hayashi Y., Peters C., Qing H., Turk V., Nakanishi H. Increased expression and altered subcellular distribution of cathepsin B in microglia induce cognitive impairment through oxidative stress and inflammatory response in mice. Aging Cell. 2019;18 doi: 10.1111/acel.12856.
  • 80.Ma L., Han Z., Yin H., Tian J., Zhang J., Li N., Ding C., Zhang L. Characterization of Cathepsin B in Mediating Silica Nanoparticle-Induced Macrophage Pyroptosis via an NLRP3-Dependent Manner. J. Inflamm. Res. 2022;15:4537–4545. doi: 10.2147/JIR.S371536.
  • 81.Nguyen N.D., Blaby I.K., Wang D. ManiNetCluster: a novel manifold learning approach to reveal the functional links between gene networks. BMC Genom. 2019;20:1003. doi: 10.1186/s12864-019-6329-2.
  • 82.Rukhlenko O.S., Halasz M., Rauch N., Zhernovkov V., Prince T., Wynne K., Maher S., Kashdan E., MacLeod K., Carragher N.O., et al. Control of cell state transitions. Nature. 2022;609:975–985. doi: 10.1038/s41586-022-05194-y.
  • 83.Perna D., Karreth F.A., Rust A.G., Perez-Mancera P.A., Rashid M., Iorio F., Alifrangis C., Arends M.J., Bosenberg M.W., Bollag G., et al. BRAF inhibitor resistance mediated by the AKT pathway in an oncogenic BRAF mouse melanoma model. Proc. Natl. Acad. Sci. USA. 2015;112:E536–E545. doi: 10.1073/pnas.1418163112.
  • 84.Kuenzi B.M., Park J., Fong S.H., Sanchez K.S., Lee J., Kreisberg J.F., Ma J., Ideker T. Predicting Drug Response and Synergy Using a Deep Learning Model of Human Cancer Cells. Cancer Cell. 2020;38:672–684.e6. doi: 10.1016/j.ccell.2020.09.014.
  • 85.Barabási A.L., Gulbahce N., Loscalzo J. Network medicine: a network-based approach to human disease. Nat. Rev. Genet. 2011;12:56–68. doi: 10.1038/nrg2918.
  • 86.Su Y., Ko M.E., Cheng H., Zhu R., Xue M., Wang J., Lee J.W., Frankiw L., Xu A., Wong S., et al. Multi-omic single-cell snapshots reveal multiple independent trajectories to drug tolerance in a melanoma cell line. Nat. Commun. 2020;11:2345. doi: 10.1038/s41467-020-15956-9.
  • 87.Kong J., Lee H., Kim D., Han S.K., Ha D., Shin K., Kim S. Network-based machine learning in colorectal and bladder organoid models predicts anti-cancer drug efficacy in patients. Nat. Commun. 2020;11:5485. doi: 10.1038/s41467-020-19313-8.
  • 88.Bennett H.M., Stephenson W., Rose C.M., Darmanis S. Single-cell proteomics enabled by next-generation sequencing or mass spectrometry. Nat. Methods. 2023;20:363–374. doi: 10.1038/s41592-023-01791-5.
  • 89.Li J., Bien J., Wells M.T. rTensor: An R Package for Multidimensional Array (Tensor) Unfolding, Multiplication, and Decomposition. J. Stat. Softw. 2018;87:1–31. doi: 10.18637/jss.v087.i10.
  • 90.Durinck S., Spellman P.T., Birney E., Huber W. Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat. Protoc. 2009;4:1184–1191. doi: 10.1038/nprot.2009.97.
  • 91.Grabski I.N., Street K., Irizarry R.A. Significance analysis for clustering with single-cell RNA-sequencing data. Nat. Methods. 2023;20:1196–1202. doi: 10.1038/s41592-023-01933-9.
  • 92.Zhou Y., Zhou B., Pache L., Chang M., Khodabakhshi A.H., Tanaseichuk O., Benner C., Chanda S.K. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat. Commun. 2019;10:1523. doi: 10.1038/s41467-019-09234-6.
  • 93.Wu T., Hu E., Xu S., Chen M., Guo P., Dai Z., Feng T., Zhou L., Tang W., Zhan L., et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innovation. 2021;2 doi: 10.1016/j.xinn.2021.100141.
  • 94.Subramanian A., Tamayo P., Mootha V.K., Mukherjee S., Ebert B.L., Gillette M.A., Paulovich A., Pomeroy S.L., Golub T.R., Lander E.S., Mesirov J.P. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. USA. 2005;102:15545–15550. doi: 10.1073/pnas.0506580102.
  • 95.Langfelder P., Zhang B., Horvath S. dynamicTreeCut: Methods for Detection of Clusters in Hierarchical Clustering Dendrograms. R package version 1.63-1. 2016. https://CRAN.R-project.org/package=dynamicTreeCut
  • 96.Therneau T.M., Grambsch P.M. Springer; New York: 2000. Modeling Survival Data: Extending the Cox Model. ISBN 0-387-98784-3.
  • 97.Forli S., Olson A.J. A Force Field with Discrete Displaceable Waters and Desolvation Entropy for Hydrated Ligand Docking. J. Med. Chem. 2012;55:623–638. doi: 10.1021/jm2005145.
  • 98.Eberhardt J., Santos-Martins D., Tillack A.F., Forli S. AutoDock Vina 1.2.0: New Docking Methods, Expanded Force Field, and Python Bindings. J. Chem. Inf. Model. 2021;61:3891–3898. doi: 10.1021/acs.jcim.1c00203.

Associated Data

Supplementary Materials

Document S1. Figures S1–S18 and Tables S1–S3
Document S2. Article plus supplemental information

Data Availability Statement

  • Data from five publicly available single-cell datasets with nine known drug-targeted cell types were processed for validation and analyzed using scRank, as described below. For each dataset, the drug name and the corresponding target gene supplied to scRank are listed in Table S1.

  • scRNA-seq data from mouse medulloblastoma tumors after vismodegib treatment and matched vehicle-treated samples were obtained from GEO (accession: GEO: GSE129730). Data were processed following the workflow provided by the authors (https://github.com/ben-babcock/Gershon_single-cell). The score was calculated by comparing the network constructed from vehicle-treated samples with the network after in silico perturbation of the vismodegib target Smo.

  • scRNA-seq data from colorectal cancer ascites-derived epithelial cells of patients before and after chemotherapy (mFOLFOX6) were obtained from GEO (accession: GEO: GSE155953). Metadata, including cell types, were provided by the authors. The score was calculated by comparing the network of pre-treatment samples with the network after in silico perturbation of TOP1 and TYMS, which encode the targets of leucovorin and 5-fluorouracil, respectively.

  • scRNA-seq data from a ThPO-induced bone marrow fibrosis mouse model were obtained from GEO (accession: GEO: GSE156644). Cell type annotation was provided by the authors. The score was calculated by comparing the untreated network with the network after in silico perturbation of S100a8 and S100a9, which are the targets of tasquinimod.

  • scRNA-seq data from miniaturized organoid models of intestinal stem cell differentiation into Paneth cells before and after treatment with KPT-330 were obtained from the Broad Institute’s Single-Cell Portal (https://singlecell.broadinstitute.org/; study SCP1547). Cell type annotation was provided by the authors. The score was calculated by comparing the network of pre-treatment samples with the network after in silico perturbation of the KPT-330 target Xpo1.

  • scRNA-seq data from the kidneys of mice with diabetic kidney disease, before and after dapagliflozin treatment (Wu et al.), were obtained from GEO (accession: GEO: GSE181382). Metadata were provided by the authors. The score was calculated by comparing the network of pre-treatment samples with the network after in silico perturbation of the dapagliflozin target Slc5a2.

  • scRNA-seq data from the prefrontal cortex of patients with MDD were obtained from GEO (accession: GEO: GSE144136). Mouse brain ST data (10X Genomics Visium) were downloaded as Seurat objects from https://satijalab.org/seurat/articles/spatial_vignette.html. Human DLPFC ST data (151673) were downloaded from https://github.com/LieberInstitute/HumanPilot/.

  • Our previously published scRNA-seq data of heart immune cells after myocardial infarction and matched tanshinone IIA-treated samples on day 3 are available in GEO (accession: GEO: GSE163465).

  • The three cancer scRNA-seq datasets are Tuong2021 (EGA: EGAS00001005787), Lee2020 (GEO: GSE144735), and Khaliq2022 (GEO: GSE200997). For classical treatments, the target cell types are as follows: T cells for anti-PD1 and anti-CTLA4 treatment, tumor cells for anti-PDL1 and docetaxel treatment, and B cells for anti-CD40 treatment.

  • The three cancer cell line scRNA-seq datasets are the melanoma cell line (GEO: GSE108394), the SCC47 cell line (GEO: GSE157220), and the JHU006 cell line (GEO: GSE157220). Drug-responsive cell populations were defined in the original studies.

  • Pan-cancer scRNA-seq data of cell lines were obtained from GEO: GSE157220. The CTRP drug response data were obtained from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4631646/bin/NIHMS711523-supplement-4.xlsx. For each drug, the three cell lines with the highest sensitivity were defined as responsive cell lines, and the three with the lowest sensitivity as non-responsive cell lines.

  • All original code has been deposited at GitHub (https://github.com/ZJUFanLab/scRank).
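
The labeling rule stated above for the CTRP cell lines (the three most sensitive lines as responsive, the three least sensitive as non-responsive) can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `label_cell_lines`, the cell line names, and the scores are hypothetical, and the sketch assumes a score in which larger values mean greater sensitivity. If the metric were an area under the dose-response curve, where smaller values usually indicate greater sensitivity, the sort order would need to be reversed.

```python
from typing import Dict, List, Tuple


def label_cell_lines(
    sensitivity: Dict[str, float], n: int = 3
) -> Tuple[List[str], List[str]]:
    """Return (responsive, non_responsive) cell lines for one drug.

    Assumption: a larger score means the cell line is more sensitive.
    """
    if len(sensitivity) < 2 * n:
        raise ValueError("need at least 2 * n cell lines to avoid overlap")
    # Rank cell lines from most to least sensitive.
    ranked = sorted(sensitivity, key=sensitivity.get, reverse=True)
    # Top n are labeled responsive, bottom n non-responsive.
    return ranked[:n], ranked[-n:]


# Hypothetical scores for one drug; names and values are placeholders.
example_scores = {
    "line_A": 0.9, "line_B": 0.1, "line_C": 0.5, "line_D": 0.7,
    "line_E": 0.3, "line_F": 0.2, "line_G": 0.8,
}
responsive, non_responsive = label_cell_lines(example_scores)
print("responsive:", responsive)
print("non-responsive:", non_responsive)
```

Requiring at least 2 × n cell lines keeps the two groups disjoint, so no cell line can be labeled both responsive and non-responsive.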


Articles from Cell Reports Medicine are provided here courtesy of Elsevier
