Abstract
Spatial cellular heterogeneity contributes to differential drug responses in a tumor lesion and potential therapeutic resistance. Recent emerging spatial technologies such as CosMx, MERSCOPE and Xenium delineate spatial gene expression patterns at single-cell resolution. This provides unprecedented opportunities to identify spatially localized cellular resistance and to optimize the treatment for individual patients. In this work, we present a graph-based domain adaptation model, SpaRx, to reveal the heterogeneity of spatial cellular response to drugs. SpaRx transfers the knowledge from pharmacogenomics profiles to single-cell spatial transcriptomics data, through hybrid learning with dynamic adversarial adaptation. Comprehensive benchmarking demonstrates the superior and robust performance of SpaRx at different dropout rates, noise levels and transcriptomics coverage. Further application of SpaRx to the state-of-the-art single-cell spatial transcriptomics data reveals that tumor cells in different locations of a tumor lesion present heterogeneous sensitivity or resistance to drugs. Moreover, resistant tumor cells interact with themselves or the surrounding constituents to form an ecosystem for drug resistance. Collectively, SpaRx characterizes the spatial therapeutic variability, unveils the molecular mechanisms underpinning drug resistance and identifies personalized drug targets and effective drug combinations.
Keywords: graph transformer, adversarial learning, spatial cellular drug response, single-cell spatial transcriptomics
INTRODUCTION
Understanding how different cells spatially localize and communicate in their microenvironment is critical to personalized treatment [1, 2]. For instance, the tumor cells of one patient are spatially heterogeneous and present differential responses to treatment. For those tumor cells, their protective microenvironment and intercellular communications contribute to therapeutic failure and disease relapse [3, 4]. Therefore, to precisely treat a patient, it is crucial to identify which tumor cells inside the tumor lesion are resistant to a candidate drug and what cell–cell communications are responsible for such tumor cell resistance. Unfortunately, these issues are not satisfactorily addressed, largely due to the lack of biotechnologies to accurately delineate the spatial heterogeneity of cells within tissues.
Recently, emerging single-cell spatial transcriptomics (SCST) technologies, such as NanoString’s CosMx™ Spatial Molecular Imager (SMI) [5] and Vizgen’s MERSCOPE [6], hold the promise to unravel spatial tissue architectures at the subcellular level and to further our understanding of the underlying functional mechanisms of tumor metastasis [7] and drug resistance [8]. The SCST technologies provide the spatial locations of cells as well as their gene expression patterns, which offer a unique opportunity to investigate the therapeutic heterogeneity in the tumor microenvironment. Moreover, the existing pharmacogenomics databases, including the Cancer Cell Line Encyclopedia (CCLE) [9] and the Genomics of Drug Sensitivity in Cancer (GDSC) [10, 11], provide valuable references relating gene expression patterns to drug response and treatment efficacy. The integration of these existing resources with SCST data presents unprecedented opportunities to elucidate how individual cells within complex tissues will differentially respond to drugs, thus meeting the needs of personalized treatment.
Given the available data from drug screening cell lines, several studies have explored the connection between cell line profiles and single-cell RNA sequencing (scRNA-seq) data to investigate drug response at the single-cell level. For instance, Gambardella et al. [12] proposed a method called DREEP, which utilizes scRNA-seq data to predict the drug sensitivity of individual cells. They discovered that cells exhibiting transcriptional heterogeneity displayed varying degrees of drug sensitivity. Chen et al. [13] developed a deep transfer learning framework, scDEAL, which integrates large-scale bulk cell-line data with scRNA-seq data to predict the response of single cells to cancer drugs. Similarly, Zheng et al. [14] introduced SCAD, an adversarial discriminative domain adaptation framework that leverages scRNA-seq data from the GDSC database to identify drug sensitivities. However, these methods are not directly applicable to SCST data, as they do not take into account the spatial cell locations. To address this limitation, graph-based domain adaptation models have emerged as promising solutions for uncovering the spatial cellular response to drugs. Such graph-based models aggregate information not only from individual cells but also from their spatial neighbors, enabling a more comprehensive understanding of drug response within a spatial context.
Herein, we present SpaRx, a graph-based domain adaptation model, to reveal spatial therapeutic complexity with distinct drug responses by leveraging high-throughput pharmacological profiles and SCST datasets. SpaRx is able to identify the cellular drug responses within a complex tissue, the spatial surrounding microenvironment of resistant cells, and the cell–cell communications involved in drug resistance. SpaRx can accurately transfer drug response predictors trained on a source domain (e.g. cell lines) to a target domain (e.g. spatial tumor cells). Our model explicitly considers the fundamental differences between the source and target domains rather than modeling these differences as technical batch effects only. SpaRx will facilitate mechanistic studies for overcoming drug resistance and advance therapeutic research for precision medicine. It also holds promise for prioritizing candidate drugs for individual patients, guiding synergistic drug combinations and repurposing anti-cancer drugs for other diseases such as Alzheimer’s disease (AD).
RESULTS
Overview of SpaRx
SpaRx employs a novel domain adaptation strategy to transfer the knowledge of drug responses from large-scale drug screening databases (source domain) to predict the drug sensitivity of cells in SCST data (target domain). We hypothesize that there are domain-invariant relations between molecular profiles and drug responses. SpaRx is built to learn such transferable knowledge of drug responses across the source and the target domain through end-to-end adversarial training (Figure 1A). As shown in Figure 1B, SpaRx consists of a feature extractor to project molecular profiles to a latent space, a drug response predictor to predict cellular sensitivity to a drug according to the latent representation of gene expression patterns and three domain discriminators to distinguish the source domain from the target domain. A hybrid learning strategy is used, with an adversarial learning procedure involving the feature extractor and the domain discriminators to learn domain-invariant molecular features, and a supervised procedure involving the feature extractor and the predictor to learn molecular features that are responsible for drug responses. With this hybrid learning strategy, SpaRx can transfer the domain-invariant information from the source knowledgebase to predict the drug responses of individual cells in SCST data.
Figure 1.
Schematic overview of the SpaRx method. (A) Concept design of the SpaRx model. With the pharmacogenomic profiles serving as the source data, SpaRx predicts the cells’ response to drugs in SCST data via graph-based adversarial adaptation. (B) SpaRx comprises three components (feature extractor, drug-response predictor and discriminators) to leverage the source domain and predict cellular response in the target domain.
In the adversarial learning procedure, SpaRx includes dynamic adversarial adaptation learning to balance the learning of global and drug-specific domain-invariant gene expression patterns. To achieve this, three discriminators are used: a global discriminator that distinguishes the source domain from the target domain for all cells or cell lines, and two drug-specific discriminators that distinguish the source domain from the target domain within each category, drug-sensitive and drug-resistant. A dynamic learnable factor is used to balance the contribution of these discriminators. Trained through the predictor-based loss that captures the knowledge of drug response, as well as the domain adaptation loss regarding global and drug-specific distributions, SpaRx is able to predict cellular sensitivity to drugs in SCST data. In this way, SpaRx translates preclinical knowledge into cell-level drug response in SCST data, which facilitates deep insights into the underlying mechanisms of drug resistance and advances therapeutic effectiveness.
SpaRx demonstrates accurate predictions of drug response
As SpaRx is the first method to predict drug response in SCST data, here we benchmark it against four deep learning (DL) models including SpaRx-GAT, SpaRx-GCN, scDEAL [13] and SCAD [14], as well as four machine learning (ML) methods including support vector machines (SVM), random forest (RF), LightGBM and XGBoost (see Materials and Methods). Among them, SpaRx, SpaRx-GAT and SpaRx-GCN share a similar model architecture but use graph transformer [15], GAT [16] and GCN [17] as the feature extractor, respectively. For benchmarking datasets, we randomly select a proportion (p) of cell lines from the pharmacological database as the source domain. The remaining cell lines (1-p) are used to synthesize the single-cell gene expression data in the target domain, with cellular complexity and drug response generated (see Materials and Methods). Benchmarking performance is evaluated by the accuracy of predicted drug response in the target domain.
First, we randomly select 30% of cell lines (p = 30%) for the source domain data, and the other 70% for synthesizing the target domain data. The performance of SpaRx and other methods for predicting drug responses in the target domain is measured using the F1 score. SpaRx consistently demonstrates better performance across 80 drugs than the other methods including SpaRx-GAT (Figure 2A), SpaRx-GCN (Figure 2B), RF (Figure 2C) and SVM (Figure 2D), as well as SCAD, scDEAL, LightGBM and XGBoost (Supplementary Figure 1A). SpaRx achieves the highest accuracy (median F1 = 0.938, Figure 2E), which is significantly higher than other DL models (median F1 of SpaRx-GAT: 0.787; SpaRx-GCN: 0.751; SCAD: 0.856; scDEAL: 0.669) and ML methods (RF: 0.628; SVM: 0.564; LightGBM: 0.576; XGBoost: 0.588). Meanwhile, SpaRx demonstrates particularly higher accuracy relative to SpaRx-GAT and SpaRx-GCN for certain drugs. For example, SpaRx shows a noticeably higher F1 score than SpaRx-GAT (0.923 versus 0.683) for the hormone therapy drug tamoxifen, and than SpaRx-GCN (0.921 versus 0.445) for the kinase inhibitor alisertib.
Figure 2.
SpaRx accurately identifies drug response in the target domain. (A-D) Performance of SpaRx and other methods (SpaRx-GAT, SpaRx-GCN, RF and SVM) is measured by the F1 scores across 80 different drug compounds. Each point represents the F1 score of SpaRx versus an alternative method for one drug. (E) Boxplot of F1 scores based on different sizes of source data.
Moreover, we evaluate the performance of SpaRx in the settings of different p (50, 30, 10%) based on the F1 score. Across these different settings, SpaRx (median F1 = 0.934, Figure 2B, Supplementary Figure 1B) is consistently superior to other DL models (median F1; SpaRx-GAT: 0.784; SpaRx-GCN: 0.765; SCAD: 0.873; scDEAL: 0.669), and ML methods (median F1; RF: 0.638; SVM: 0.581; LightGBM: 0.585; XGBoost: 0.604). For example, for the commonly used liver cancer drug mitoxantrone, SpaRx achieves a higher median F1 (0.957) than other DL models (SpaRx-GAT: 0.844; SpaRx-GCN: 0.657; SCAD: 0.678; scDEAL: 0.662). In addition to the F1 score, metrics including AUROC, AUPR, precision and recall (Supplementary Figures 2–5, Supplementary File 1) further demonstrate that SpaRx not only presents superior performance with different sizes of source data, but also achieves accurate response predictions for different types of drugs.
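For readers less familiar with the evaluation metric, the following is a minimal sketch of how precision, recall and the F1 score are computed for binary drug-response labels. The label encoding (1 = sensitive, 0 = resistant), the choice of the positive class and the toy labels are assumptions made for illustration only; they are not taken from the benchmarking data.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels (hypothetical): 1 = sensitive, 0 = resistant.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
precision, recall, f1 = f1_score(y_true, y_pred)
```

Because F1 is the harmonic mean of precision and recall, it penalizes a method that predicts only one response class, which is relevant when sensitive and resistant labels are imbalanced.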
SpaRx accurately predicts drug response in different scenarios
We further evaluate the performance of SpaRx in the scenarios of different noise levels, dropout rates and numbers of genes in the source and the target data (see Materials and Methods). Benchmarking methods include four DL models (SpaRx-GAT, SpaRx-GCN, SCAD, scDEAL) and four ML models (RF, SVM, LightGBM, XGBoost).
Figure 3A shows the F1 scores achieved by different methods for each drug at the noise level of 1. SpaRx achieves higher F1 scores than SpaRx-GAT and SpaRx-GCN (median F1; 0.938, 0.787, 0.751), and also performs significantly better than SCAD and scDEAL (median F1: 0.856, 0.669, Supplementary Figure 1B). Moreover, when the noise level increases (noise level = 1, 1.5, 2), SpaRx maintains accurate predictions with median F1 as 0.938, 0.921 and 0.893, respectively (Figure 3B). In contrast, the other methods, such as SpaRx-GAT and SpaRx-GCN, are affected by the increased noise in source and target data, indicating these methods are more likely to be undermined by data noise. The other metrics including AUROC, AUPR, precision and recall (Supplementary Figures 2–5, Supplementary File 1) support that SpaRx is robust to data noise in real applications.
Figure 3.
Performance evaluation in different benchmarking scenarios. (A) Accuracy of identifying ground truth labels for the scenario with extra noise (noise level = 1) in source and target domain. Dashed lines refer to the 15th percentile of SpaRx’s F1 scores. (B) Boxplot of F1 scores over three scenarios with extra noise (noise level = 1, 1.5, 2) in source and target data. (C) Accuracy of identifying ground truth labels for the scenario with extensive dropouts (dropout rate = 70%) in source and target domain. (D) Boxplot of F1 scores over three scenarios with different dropout levels (dropout rate = 70, 80, 90%) in source and target domain. (E) Accuracy of identifying ground truth labels for the scenario with a limited number of genes (number of genes = 2 k) in source and target domain. (F) Boxplot of F1 scores over three scenarios with a limited number of genes (number of genes = 2 k, 853, 500) in source and target domain.
In addition, we evaluate the performance of SpaRx with different dropout rates. When the dropout rate is 70%, SpaRx remains more accurate than SpaRx-GAT and SpaRx-GCN (median F1: 0.916, 0.738, 0.751; Figure 3C), as well as SCAD and scDEAL (median F1: 0.846, 0.650; Supplementary Figure 1B). When the dropout rate increases (Figure 3D, Supplementary Figure 1B), SpaRx is still superior to the other DL methods (median F1; SpaRx: 0.882; SpaRx-GAT: 0.709; SpaRx-GCN: 0.707; SCAD: 0.833; scDEAL: 0.620) and ML models (median F1; RF: 0.675; SVM: 0.709; LightGBM: 0.641; XGBoost: 0.615). Other metrics including AUROC, AUPR, precision and recall (Supplementary Figures 2–5, Supplementary File 1) demonstrate that SpaRx provides accurate predictions at different dropout levels.
Finally, we evaluate the performance of SpaRx with reduced number of genes in source and target data. Based on only 2000 genes, SpaRx remains superior to SpaRx-GAT and SpaRx-GCN (median F1; 0.960, 0.823, 0.883; Figure 3E) as well as SCAD and scDEAL (median F1: 0.853, 0.669; Supplementary Figure 1B). With the number of genes decreasing to ~1 k genes and 500 genes captured by NanoString CosMx and Vizgen MERSCOPE, respectively, SpaRx maintains much more reliable performance than the other methods (median F1; SpaRx: 0.930; SpaRx-GAT: 0.753; SpaRx-GCN: 0.866; SCAD: 0.754; scDEAL: 0.595; Figure 3F and Supplementary Figure 1B). The other metrics including AUROC, AUPR, precision and recall (Supplementary Figures 2–5, Supplementary File 1) show that SpaRx outperforms benchmarking methods.
Collectively, these results demonstrate that SpaRx achieves superior predictions in different scenarios, even when the target data have extra noise, high dropout rates and a limited number of genes. These evaluations demonstrate the effectiveness of SpaRx in transferring drug-related intrinsic information across different biological domains, which enables the prediction of cellular drug response in SCST data.
SpaRx reveals the spatial cellular heterogeneity of drug response in lung cancer
To reveal spatial cell variability in drug response, we first apply SpaRx to the NanoString CosMx lung cancer SCST data with different cell types on eight fields of view (FOVs) [5] (Figure 4A). The zoomed-in image on the right shows that tumor cells (colored in light blue) are infiltrated with immune cells such as macrophages (colored in orange) and B cells (colored in green). Based on this tissue slice, we apply SpaRx to predict the tumor cells’ response to a typical lung cancer drug, cisplatin, whose mechanism of action is to cause DNA damage in cancer cells, blocking cell division and leading to apoptotic cell death. As shown in Figure 4B, SpaRx uncovers tumor cells’ response to cisplatin, which exhibits strong heterogeneity of sensitivity and resistance. Interestingly, in contrast to the clustered resistant cells, the zoomed-in FOV presents a scattered pattern of sensitive cells.
Figure 4.
Spatial heterogeneity of cellular response to cisplatin. (A) Spatial visualization of single-cell spatial data from lung cancer tissue. Eight FOVs are included. (B) Spatial visualization of cellular response to cisplatin. (C) Surrounding microenvironment of resistant (R) and sensitive (S) cells in each FOV. (D) Heatmap quantitatively delineates the constituents around sensitive and resistant cells. (E) Summary across eight FOVs of the cell type distributions surrounding tumor cells resistant and sensitive to cisplatin. (F) Characterization of the infiltrated cell types in responders and nonresponders to cisplatin in TCGA lung cancer patient samples.
Given that those tumor cells respond differentially to cisplatin, we further interrogate whether the surrounding microenvironment of resistant tumor cells is different from that of sensitive ones. As shown in Figure 4C, for each FOV, the spatial distributions (i.e. proportions) of cell types adjacent to resistant and sensitive cells are different. Such differences also vary across FOVs. Notably, across FOV 1–3 (Figure 4C), CD8 memory T cells, B cells and natural killer (NK) cells are consistently reduced in the surroundings of resistant cells. The heatmap in Figure 4D further quantitatively delineates the distinctive microenvironment between sensitive and resistant cells, where CD4 and CD8 memory T cells are less infiltrated in the surroundings of resistant cells. Moreover, fewer B cells, dendritic cells (DC) and NK cells are present in the microenvironment of resistant cells. Further averaging of the surrounding cell type proportions across the eight FOVs (Figure 4E) shows that most cell types except macrophages are more abundant in the microenvironment of sensitive cells. Similar patterns are also observed in the TCGA lung cancer patients receiving cisplatin treatment (Figure 4F). Specifically, after bulk RNA-seq decomposition by CIBERSORT [18], CD4 and CD8 memory T cells, B cells and NK cells are shown to be more prevalent in responders than nonresponders to cisplatin. These results indicate that the surrounding microenvironment may be relevant to or modulate the tumor cells’ responses to cisplatin.
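The comparison of microenvironments can be illustrated with a small sketch that tallies the cell types found around resistant versus sensitive tumor cells. It assumes that "adjacent" means the k nearest cells by Euclidean distance, which the text does not specify; the function name `neighborhood_composition`, the coordinates and the cell-type labels are hypothetical.

```python
import math
from collections import Counter


def neighborhood_composition(coords, cell_types, focus_ids, k=2):
    """Proportion of each cell type among the k nearest neighbors of the focus cells."""
    counts = Counter()
    for i in focus_ids:
        # Rank all other cells by Euclidean distance to cell i (ties broken by index).
        ranked = sorted(
            (math.dist(coords[i], coords[j]), j)
            for j in range(len(coords)) if j != i
        )
        for _, j in ranked[:k]:
            counts[cell_types[j]] += 1
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


# Toy tissue (hypothetical): two tumor cells with different neighborhoods.
coords = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
cell_types = ["tumor", "NK", "B cell", "tumor", "macrophage", "macrophage"]
sensitive_ids, resistant_ids = [0], [3]

around_sensitive = neighborhood_composition(coords, cell_types, sensitive_ids)
around_resistant = neighborhood_composition(coords, cell_types, resistant_ids)
```

Computing these proportions per FOV and then averaging them would correspond to the kind of summary described for Figure 4C–E, under the stated adjacency assumption.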
Spatial cellular crosstalk mediates drug resistance
Given the distinctive microenvironment surrounding sensitive and resistant tumor cells, further characterization of cell–cell communications can explain how neighboring cells modulate the differential responses of tumor cells to cisplatin. Using spaCI [19], a cell–cell communication tool specifically designed for SCST data, we infer the ligand-receptor (L–R) interactions involving tumor cells in the zoomed-in FOV (Figure 4A). The aggregated L–R interactions that occur between tumor cells and adjacent cells are shown in the chord diagram (Figure 5A), with the chord width indicating the interaction strength. Of note, we observe that macrophages, fibroblasts, and CD4 and CD8 memory T cells interact strongly with both sensitive and resistant tumor cells. NK cells uniquely crosstalk with sensitive but not resistant tumor cells.
Figure 5.
Roles of cell–cell communications in tumor cell resistance. (A) Summary chord diagram of the identified cell–cell communication network. The chord width is proportional to the interaction strength across different cell types. (B) Heatmap shows the identified L–R interactions between major cell types and resistant (R) or sensitive (S) cells. (C) Spatial visualization of cellular response to cisplatin, docetaxel and their combinations.
The L–R pairs that contribute to the cellular crosstalk with tumor cells are further presented in Figure 5B. The gradient colors represent the interaction strength of each L–R pair. Specifically, CD4 and CD8 memory T cells are involved in more L–R interactions with resistant than sensitive cells. MMP9–CD44 is uniquely involved in the interactions between NK and sensitive cells, which is supported by previous studies [20, 21]. Moreover, fibroblast-expressed DCN (ligand) shows stronger interactions with resistant tumor cells’ MET (receptor). DCN has been reported to interact antagonistically with the MET receptor (c-Met) [22, 23], and to play roles in cancer development and metastasis [24]. Other L–R interactions including HGF [25]–MET [26], VCAN [27]–ITGB1 [28] and VCAN [27]–CD44 [29] also play a crucial role in cancer cells forming a drug-resistant state.
SpaRx can be used to explore optimal drug combinations. For example, in addition to cisplatin, SpaRx also identifies the spatially differential cellular response to another lung cancer drug, docetaxel. As shown in Figure 5C, some tumor cells that are resistant to one drug appear sensitive to the other drug. The sets of tumor cells sensitive to each of the two drugs are largely complementary, with a Jaccard similarity of 0.381. This result suggests that the combined therapy of cisplatin and docetaxel may overcome resistance and improve therapeutic efficacy, which has also been confirmed in clinical trials for patients with unresectable NSCLC [30–32].
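The Jaccard similarity quoted above is the size of the intersection of two sets divided by the size of their union, so a low value means the two drugs are predicted to be effective on largely different cells. A minimal sketch follows; the cell identifiers are hypothetical and unrelated to the lung cancer data.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections of cell identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical sets of tumor cells predicted sensitive to each drug.
sensitive_to_cisplatin = {"c1", "c2", "c3", "c4"}
sensitive_to_docetaxel = {"c3", "c4", "c5", "c6", "c7"}
similarity = jaccard(sensitive_to_cisplatin, sensitive_to_docetaxel)

# Cells covered by at least one of the two drugs.
covered_by_combination = sensitive_to_cisplatin | sensitive_to_docetaxel
```

In this toy case two of seven cells are sensitive to both drugs, while the union covers all seven, which is the intuition behind considering the combination.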
SpaRx uncovers an orderly pattern of resistant tumor cells
Next, we apply SpaRx to the Vizgen MERSCOPE liver cancer SCST data (Figure 6A). In this case, most tumor cells (green colored cells) are confined in three regions with clear boundaries, with some tumor cells infiltrating among the hepatocytes. Specifically, the tumor region on the left (region-1) is surrounded by Kupffer cells, and the other two tumor cell regions (region-2 and region-3) on the right are surrounded by hepatoblasts. SpaRx is applied to predict the tumor cells’ response to a typical liver cancer drug, mitoxantrone. As shown in Figure 6B, SpaRx uncovers tumor cells’ response to mitoxantrone, with both sensitive and resistant cells revealed. Interestingly, sensitive cells are mostly located at the periphery, whereas resistant cells are mainly located in the interior of each tumor region. Such orderly patterns of resistant cells shared by three tumor regions indicate that those resistant cells may share similar molecular characteristics.
Figure 6.
SpaRx reveals drug resistance pattern in liver tumor tissue. (A) Spatial visualization of the single-cell spatial data from liver tumor tissue. (B) Spatial visualization of cellular response to mitoxantrone. (C) Volcano plot shows the DEGs of resistant versus sensitive tumor cells in tumor region-1. (D) Overlaps of DEGs across three tumor regions. (E) Enriched pathways based on overlapped DEGs across three tumor regions.
To investigate the underlying differences between the resistant and sensitive cells, differentially expressed gene (DEG) analysis is performed for each tumor region. For example, the DEGs of resistant cells in region-1 are shown in Figure 6C, among which VTN [33] and VEGFA [34] are overexpressed. Vitronectin (encoded by VTN) has been reported to protect cancer cells from drug-induced apoptosis [33]. VEGFA decreases the sensitivity of cancer cells to chemotherapy by suppressing VEGFA-mediated autophagy [34]. These overexpressed genes identified in resistant cells may serve as resistance biomarkers and potential therapeutic targets. More importantly, the DEGs of both resistant and sensitive cells largely overlap across the three tumor regions (Figure 6D), further supporting that these regions share similar molecular mechanisms for mitoxantrone resistance. Enrichment analysis of these shared DEGs (Figure 6E) among resistant cells (R1, R2 and R3) reveals signaling pathways that are potentially responsible for mitoxantrone resistance, including the focal adhesion-induced PI3K-AKT signaling. In contrast, interleukin signaling and cytokine signaling pathways are enriched in sensitive cells.
DISCUSSION
The spatial heterogeneity of cells and their microenvironment plays a critical role in the treatment of complex diseases such as cancers [35] and AD [36]. For example, the tumor microenvironment is crucial for tumor cell metastasis [37] and drug resistance [38]. Recent emerging single-cell spatial technologies utilizing molecular imaging for targeted gene profiling provide deep insights into the spatial cellular ecosystems [39–41]. These state-of-the-art technologies help resolve the heterogeneous cellular response to drugs, the intercellular communications that contribute to drug resistance, and how the tumor ecosystem acquires drug resistance.
In this work, we have developed a novel SpaRx model that leverages the pharmacogenomics knowledgebase with SCST data to systematically reveal the spatial complexity of therapeutic response. To our knowledge, SpaRx is the first method to incorporate large-scale pharmacogenomics profiles with SCST data to accurately predict the heterogeneous cellular response to drugs. SpaRx is able to reveal the spatial cell variability in drug response and uncover the underlying biological mechanisms for drug resistance. For example, based on the lung cancer SCST data, we observe a multitude of interactions related to tumor cell resistance and identify the spatially adjacent cell interactions that may alter tumor cells’ sensitivity to cisplatin. In addition to cancers, SpaRx also holds promise for repurposing anti-cancer drugs for complex diseases such as AD, which is also known for its complexity and heterogeneity. Collectively, SpaRx is anticipated to reveal the mechanisms of drug resistance, prioritize tailored drugs for complex diseases and provide clues for drug repositioning.
Despite these advantages, SpaRx can be improved in several aspects. First, current single-cell spatial technologies are still not able to detect a sufficient number of genes, which may limit the potential of SpaRx to some degree. Future advances in spatial technologies that capture more genes with fewer dropouts will help enhance the SpaRx model. Second, with the rapid development of single-cell spatial omics technologies [42], SpaRx can also be improved by incorporating spatial multi-omics. Although the current version of SpaRx enables the prediction of drug responses based on SCST data, it can be improved by utilizing new data types, e.g. single-cell spatial ATAC-seq profiles [43], to further unveil the underlying mechanisms such as the upstream cis-regulatory elements and associated transcription factors involved in drug resistance.
MATERIALS AND METHODS
Data sources and preparation
(i) The GDSC [10] and CCLE [9] cell-line-based drug screening databases. The gene expression profiles of cell lines and the sensitivity profiles (IC50) of drugs are downloaded from the GDSC and CCLE databases. The binary drug responses for each cell line are obtained from previous studies [11, 13]. For the GDSC and CCLE databases, cell lines and drug information without missing values are retained and integrated based on overlapping drug compounds. Collectively, we obtain 1280 cell lines with drug response information across 80 shared drug compounds.
(ii) The NanoString CosMx lung-13 SCST data [5] and the Vizgen MERSCOPE liver cancer-1 SCST data [6].
Benchmarking data
The benchmarking datasets are generated from the collected 1280 cell lines. Here, we randomly select a proportion of cell lines (p) as the source domain, whereas the remaining cell lines (1-p) are used to synthesize the target domain. For the target domain, the gene expression profiles of the remaining cell lines (1-p) are further mixed randomly to mimic tumor cell complexity. These mixed data are then downsampled to ensure that the total counts are comparable with the single-cell level, which allows the synthesized gene expression data in the target domain to mimic that of single tumor cells in SCST data. Specifically, to mimic the complexity of tumor cells, we select 2–10 cell lines from the remaining cell lines (1-p), then combine their transcriptomic profiles as one tumor cell profile. To better mimic a real tumor cell, if the total counts of the resulting tumor cell profile exceed 2000, we downsample it accordingly. In this way, the synthesized gene expression data serving as the target domain are more likely to resemble the real heterogeneous tumor cells in SCST data.
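The synthesis procedure can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure: "combine" is taken to mean summing integer counts gene by gene, and downsampling is taken to mean drawing 2000 reads without replacement. The function name `synthesize_cell` and the toy count matrix are hypothetical.

```python
import random


def synthesize_cell(profiles, rng, min_lines=2, max_lines=10, max_counts=2000):
    """Mix 2-10 cell-line count profiles into one pseudo single-cell profile."""
    n_mix = rng.randint(min_lines, min(max_lines, len(profiles)))
    chosen = rng.sample(range(len(profiles)), n_mix)
    n_genes = len(profiles[0])
    # Assumption: profiles are combined by summing counts per gene.
    mixed = [sum(profiles[c][g] for c in chosen) for g in range(n_genes)]
    if sum(mixed) > max_counts:
        # Assumption: downsample by drawing max_counts reads without replacement.
        reads = [g for g, n in enumerate(mixed) for _ in range(n)]
        mixed = [0] * n_genes
        for g in rng.sample(reads, max_counts):
            mixed[g] += 1
    return chosen, mixed


rng = random.Random(0)
# Hypothetical integer count matrix: 12 held-out cell lines x 50 genes.
profiles = [[rng.randint(0, 40) for _ in range(50)] for _ in range(12)]
chosen, cell = synthesize_cell(profiles, rng)
```

Sampling reads without replacement keeps each gene's count at or below its mixed value, so the relative composition of the mixture is preserved on average while the library size is capped.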
Moreover, four benchmarking scenarios are included. (i) Different numbers of cell lines in the source data. We choose different proportions of cell lines as the source domain, i.e. p = 10, 30, 50%, respectively. The remaining cell lines are used to generate the target domain. (ii) Different levels of noise. For both source and target domain based on the setting of p = 30%, we add extra noise randomly sampled from normal distributions with standard deviation $\sigma$ of 1, 1.5 and 2, respectively. (iii) Different levels of dropouts. For both source and target domain based on the setting of p = 30%, dropouts are simulated by replacing gene expression values with zeros, so that the proportions of zeros among all gene expression values are 70, 80 and 90%, respectively. (iv) Different numbers of genes. Based on the setting of p = 30%, the number of genes in the cell line profiles is reduced to 2 k, 853 and 500 genes. The 2 k genes are randomly selected, whereas the 853 and 500 genes are selected based on the RNA panels used in the NanoString CosMx and the Vizgen MERSCOPE data, respectively.
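Scenarios (ii) and (iii) can be sketched in a few lines. The sketch assumes zero-mean Gaussian noise (only the standard deviation is given in the text) and assumes that dropout positions are chosen uniformly at random among non-zero entries; `add_noise`, `simulate_dropout` and the toy matrix are hypothetical names and data.

```python
import random


def add_noise(matrix, sigma, rng):
    """Add Gaussian noise with standard deviation sigma to every entry."""
    # Assumption: the noise is zero-mean.
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in matrix]


def simulate_dropout(matrix, zero_fraction, rng):
    """Zero out random non-zero entries until zero_fraction of all entries are zero."""
    out = [row[:] for row in matrix]
    n_total = sum(len(row) for row in out)
    nonzero = [(i, j) for i, row in enumerate(out) for j, v in enumerate(row) if v != 0]
    n_already_zero = n_total - len(nonzero)
    n_to_drop = max(0, round(zero_fraction * n_total) - n_already_zero)
    for i, j in rng.sample(nonzero, min(n_to_drop, len(nonzero))):
        out[i][j] = 0
    return out


rng = random.Random(1)
# Hypothetical expression matrix: 10 samples x 20 genes, all entries non-zero.
expr = [[rng.randint(1, 9) for _ in range(20)] for _ in range(10)]
noisy = add_noise(expr, sigma=1.5, rng=rng)
sparse = simulate_dropout(expr, zero_fraction=0.7, rng=rng)
zero_rate = sum(v == 0 for row in sparse for v in row) / 200
```

Counting existing zeros before dropping more entries matters because real expression matrices are already sparse; the target of 70, 80 or 90% refers to the final proportion of zeros, not to the fraction of values removed.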
SpaRx model
Source domain. The GDSC and the CCLE data are used as the source domain, denoted as $D_s = \{X_s, Y_s\}$. Here, $s$ represents the source domain (the cell-line based drug response profiles), $X_s = \{x_i^s\}_{i=1}^{n_s} \in \mathbb{R}^{n_s \times m}$ represents the gene expression, with $x_i^s$ representing a cell line, $m$ representing the number of genes and $n_s$ denoting the number of cell lines. $Y_s$ refers to the binarized drug response of each cell line, i.e. sensitive or resistant. The cell-line similarity graph $G_s$ is constructed using mutual nearest neighbors (MNN) [44], with the number of MNN denoted as $k$. In the graph $G_s$, each node $v_i^s$ represents a cell line $x_i^s$, and if two nodes $v_i^s$ and $v_j^s$ are connected, it means that the corresponding gene expression profiles $x_i^s$ and $x_j^s$ are similar.
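The idea of a mutual-nearest-neighbor graph can be sketched as follows: two cell lines are connected only if each is among the other's k nearest neighbors. Euclidean distance on the expression profiles is an assumption, as the similarity measure is not stated here; the function names and toy profiles are hypothetical.

```python
import math


def knn(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbors.append({j for _, j in ranked[:k]})
    return neighbors


def mutual_nearest_neighbor_edges(points, k):
    """Keep edge (i, j) only if i is among j's k nearest neighbors and vice versa."""
    nn = knn(points, k)
    return sorted(
        (i, j)
        for i in range(len(points))
        for j in nn[i]
        if i < j and i in nn[j]
    )


# Toy expression profiles (hypothetical, two genes per cell line).
profiles = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0), (2.0, 2.0)]
edges = mutual_nearest_neighbor_edges(profiles, k=1)
```

Requiring mutuality removes one-directional links such as the one from the isolated profile at (2.0, 2.0), which has a nearest neighbor but is nobody's nearest neighbor, so outlying cell lines do not pull unrelated profiles together.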
Target domain. The SCST data are used as the target domain. Each SCST dataset is represented by $D_t = \{X_t\}$, $X_t = \{x_i^t\}_{i=1}^{n_t} \in \mathbb{R}^{n_t \times m}$, where $t$ denotes the target domain (the SCST data), $m$ denotes the number of genes, $n_t$ represents the number of cells and $x_i^t$ represents a cell in the SCST data. A spatial cell graph $G_t$ is constructed according to cell locations using $k$-nearest neighbors. If two cells $x_i^t$ and $x_j^t$ are spatially adjacent, then the corresponding nodes $v_i^t$ and $v_j^t$ are connected in $G_t$.
The SpaRx model uses cell lines in the source domain and cells in the target domain as samples, gene expressions as features and drug responses as outcomes. SpaRx is composed of three components: (i) feature extractor to extract gene expression features from the source and the target domain, (ii) drug response predictor for both cell lines and single cells and (iii) global and drug-specific discriminators. The final output of the SpaRx is the predicted drug responses of each cell in the target SCST domain.
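(i) Feature extractor: the shared feature extractor is composed of multi-head graph transformer [15] layers that project the graph representation of the cellular transcriptomics data to a latent space in which cells that demonstrate similar responses to treatments are close to each other. Briefly, for a cell line $x_i^s$ from the GDSC or the CCLE data or a cell $x_i^t$ from the SCST data, the propagation of the graph transformer from the $l$-th layer to the $(l+1)$-th layer is defined as
$$h_i^{(l+1)} = \mathrm{ReLU}\Big(W_1^{(l)} h_i^{(l)} + \sum_{j \in N(i)} \alpha_{ij}^{(l)} W_2^{(l)} h_j^{(l)}\Big) \tag{1}$$
where $i$ represents either a cell line ($x_i^s$) in the source domain or a cell ($x_i^t$) in the target domain, $j$ represents a neighbor cell line or cell in their corresponding graph (i.e. $j \in N(i)$), and the rectified linear unit (ReLU) [45] is used as the nonlinear gated activation function. When $l = 0$, $h_i^{(0)} = x_i^s$ for the cell line data and $h_i^{(0)} = x_i^t$ for the SCST data. The attention module is defined as $\alpha_{ij}^{(l)} = \mathrm{softmax}_{j \in N(i)}\big(e_{ij}^{(l)}\big)$, where:
$$e_{ij}^{(l)} = \frac{\big(q_i^{(l)}\big)^{\top} k_j^{(l)}}{\sqrt{d}}$$
$$q_i^{(l)} = W_3^{(l)} h_i^{(l)}$$
$$k_j^{(l)} = W_4^{(l)} h_j^{(l)}$$
and $d$ is the dimension of each attention head; $W_1^{(l)}, \dots, W_4^{(l)}$ are learnable weight matrices. The multi-head attentions are concatenated. In this way, we obtain the latent representation $Z_s = F(X_s, G_s)$ as the extracted features for the source domain, and $Z_t = F(X_t, G_t)$ as the extracted features for the target domain. The feature extractor $F$ is shared by both the source and the target domains.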
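To make the propagation step concrete, the following is a small single-head sketch of an attention-weighted neighbor update followed by ReLU. It assumes the common graph-transformer form (a self term plus attention-weighted neighbor messages, with scaled dot-product attention normalized over each node's neighbors); the weight matrices, function names and toy graph are hypothetical, and the authors' layers may differ in detail.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def graph_transformer_update(h, neighbors, W_self, W_nbr, W_q, W_k):
    """One single-head layer: ReLU(W_self h_i + sum_j alpha_ij W_nbr h_j)."""
    updated = []
    for i, h_i in enumerate(h):
        agg = matvec(W_self, h_i)
        nbrs = neighbors.get(i, [])
        if nbrs:
            q = matvec(W_q, h_i)
            scale = math.sqrt(len(q))
            # Scaled dot-product scores between node i and each neighbor.
            scores = [dot(q, matvec(W_k, h[j])) / scale for j in nbrs]
            top = max(scores)  # subtract the max for numerical stability
            exps = [math.exp(s - top) for s in scores]
            norm = sum(exps)
            for e, j in zip(exps, nbrs):
                msg = matvec(W_nbr, h[j])
                agg = [a + (e / norm) * m for a, m in zip(agg, msg)]
        updated.append([max(0.0, a) for a in agg])  # ReLU
    return updated


identity = [[1.0, 0.0], [0.0, 1.0]]
h0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
graph = {0: [1, 2], 1: [0], 2: [0]}
h1 = graph_transformer_update(h0, graph, identity, identity, identity, identity)
print(h1)
```

A multi-head version would run this update with several independent sets of weights and concatenate the resulting vectors, as described in the text.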
(ii) Drug response predictor: The predictor ($P$), a fully connected classifier, is designed to classify the drug response using latent features from the feature extractor $F$. It is trained by minimizing the difference between the predicted and the ground-truth drug response labels in the source domain using the cross-entropy loss, which is formulated as
$$L_y = -\sum_{i=1}^{n_s}\Big[\mathbb{1}(y_i^s = \text{sensitive})\,\log p_{\mathrm{sens}}(x_i^s) + \mathbb{1}(y_i^s = \text{resistant})\,\log p_{\mathrm{res}}(x_i^s)\Big], \qquad \big(p_{\mathrm{sens}}(x_i^s),\, p_{\mathrm{res}}(x_i^s)\big) = P\big(F(x_i^s, G)\big) \tag{2}$$
where $p_{\mathrm{sens}}(x_i^s)$ is the probability of $x_i^s$ being drug sensitive and $p_{\mathrm{res}}(x_i^s)$ is the probability of $x_i^s$ being drug resistant. $P$ is the response predictor and $F$ is the feature extractor. $G$ represents the graph $G_s$ in the source domain. The drug response predictor is shared by both the source and the target domains.
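The supervised part of the training signal is an ordinary two-class cross-entropy on the labeled cell lines. A minimal sketch, assuming labels encoded as 1 for sensitive and 0 for resistant and a loss averaged over cell lines (both are assumptions, as is the function name):

```python
import math


def binary_cross_entropy(p_sensitive, labels):
    """Mean cross-entropy between predicted sensitivity probabilities and labels.

    labels: 1 = sensitive, 0 = resistant (assumed encoding).
    """
    eps = 1e-12  # keeps log() finite for probabilities of exactly 0 or 1
    total = 0.0
    for p, y in zip(p_sensitive, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


labels = [1, 0, 1]
confident_right = binary_cross_entropy([0.9, 0.1, 0.8], labels)
confident_wrong = binary_cross_entropy([0.1, 0.9, 0.2], labels)
print(confident_right, confident_wrong)
```

Only source-domain samples enter this loss, because drug response labels are unavailable for the spatial cells.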
(iii) Discriminators. A global discriminator is trained to align the latent representations of the source and the target domains. The loss of the global discriminator is formulated as
$$L_g = \sum_{x \in X_s \cup X_t} \mathrm{CE}\big(D_g(F(x, G)),\, d\big) \tag{3}$$
where $\mathrm{CE}$ denotes the cross-entropy, $D_g$ denotes the global discriminator, $F$ is the feature extractor and $d$ is the domain label for the input $x$ ($d = 0$ for the source domain, $d = 1$ for the target domain). $G$ represents the graph $G_s$ when $d = 0$ and $G_t$ when $d = 1$.
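Drug-specific discriminators ($D_{\mathrm{sens}}$ and $D_{\mathrm{res}}$): $D_{\mathrm{sens}}$ and $D_{\mathrm{res}}$ are used to match the latent representations from the source and the target domains under the drug-sensitive and the drug-resistant category, respectively. Both drug-specific discriminators are trained to minimize the differences in the latent representations of the source and the target domains under each drug category. The output of the drug response predictor ($P$) gives the probability of a sample belonging to each drug category. The loss for each discriminator is calculated using cross-entropy:
$$L_c = \sum_{x \in X_s \cup X_t} \mathrm{CE}_c\big(D_c(p_c(x)\, F(x, G)),\, d\big), \qquad c \in \{\mathrm{sens}, \mathrm{res}\} \tag{4}$$
where $L_{\mathrm{sens}}$, $L_{\mathrm{res}}$ and $\mathrm{CE}_{\mathrm{sens}}$, $\mathrm{CE}_{\mathrm{res}}$ are the drug-specific discriminator losses and their cross-entropy losses associated with the drug categories, respectively. $p_{\mathrm{sens}}(x)$ and $p_{\mathrm{res}}(x)$ are the predicted probabilities of the input $x$ belonging to the drug-sensitive or the drug-resistant category, i.e. $\big(p_{\mathrm{sens}}(x),\, p_{\mathrm{res}}(x)\big) = P(F(x, G))$. $d$ is the domain label for the input $x$ ($d = 0$ for the source domain, $d = 1$ for the target domain). $G$ represents the graph $G_s$ when $d = 0$ and $G_t$ when $d = 1$.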
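The idea behind the drug-specific discriminators, namely that each sample contributes to a category's domain loss in proportion to its predicted probability of belonging to that category, can be sketched as below. This is an illustration only: the discriminator is replaced by a hypothetical logistic unit (`weights`, `bias`), the loss is averaged over samples, and scaling the latent features by the predicted probability follows the usual class-conditional adversarial setup rather than a confirmed detail of the authors' implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def domain_cross_entropy(prob_target, domain_label):
    """Cross-entropy for one sample; domain_label is 0 (source) or 1 (target)."""
    eps = 1e-12
    p = min(max(prob_target, eps), 1.0 - eps)
    return -(domain_label * math.log(p) + (1 - domain_label) * math.log(1.0 - p))


def drug_specific_loss(features, class_probs, domain_labels, weights, bias=0.0):
    """Mean domain loss where each latent vector is first scaled by the predicted
    probability of the drug category (sensitive or resistant)."""
    total = 0.0
    for z, p_c, d in zip(features, class_probs, domain_labels):
        weighted = [p_c * v for v in z]  # soft assignment to the category
        logit = sum(w * v for w, v in zip(weights, weighted)) + bias
        total += domain_cross_entropy(sigmoid(logit), d)
    return total / len(features)


latent = [[0.2, 1.0], [0.5, -0.3], [1.5, 0.4]]
p_sensitive = [0.9, 0.2, 0.6]
domains = [0, 0, 1]  # two cell lines (source), one spatial cell (target)
loss_sens = drug_specific_loss(latent, p_sensitive, domains, weights=[0.3, -0.1])
loss_global = drug_specific_loss(latent, [1.0] * 3, domains, weights=[0.3, -0.1])
print(loss_sens, loss_global)
```

Setting every class probability to 1.0, as in `loss_global`, recovers an unweighted domain loss of the kind used by the global discriminator.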
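Loss function. Given the three major components of our model (the feature extractor, the domain discriminators and the label classifier), the final learning objective is formulated as
$$L = L_y - \big[(1 - \omega)\, L_g + \omega\,(L_{\mathrm{sens}} + L_{\mathrm{res}})\big] \tag{5}$$
In the loss function, $\omega$ is a dynamic adversarial factor, which balances the relative weight of the global and the drug-specific discriminator losses. During training, $\omega$ is dynamically updated according to the losses of the three discriminators: $\omega = \dfrac{d_{\mathcal{A},g}}{d_{\mathcal{A},g} + \frac{1}{2}\big(d_{\mathcal{A},\mathrm{sens}} + d_{\mathcal{A},\mathrm{res}}\big)}$, where $d_{\mathcal{A},g}$, $d_{\mathcal{A},\mathrm{sens}}$ and $d_{\mathcal{A},\mathrm{res}}$ are the proxy $\mathcal{A}$-distances [46] between the source and the target domains for the three domain discriminators, respectively. Specifically, for the discriminator $D_g$, $d_{\mathcal{A},g} = 2\big(1 - 2L_g/(n_s + n_t)\big)$; for the discriminator $D_{\mathrm{sens}}$, $d_{\mathcal{A},\mathrm{sens}} = 2\big(1 - 2L_{\mathrm{sens}}/(n_s + n_t)\big)$; and for the discriminator $D_{\mathrm{res}}$, $d_{\mathcal{A},\mathrm{res}} = 2\big(1 - 2L_{\mathrm{res}}/(n_s + n_t)\big)$, where $n_s$ represents the number of cell lines in the source domain and $n_t$ is the number of cells in the target domain.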
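The dynamic weighting can be illustrated with a short sketch. It assumes the formulation commonly used in dynamic adversarial adaptation, in which each proxy A-distance is estimated as 2(1 - 2·loss) from a discriminator's mean loss and the factor is the global distance divided by the sum of the global distance and the average drug-specific distance. The function names and the fallback value are hypothetical, and the exact expressions used by SpaRx may differ.

```python
def proxy_a_distance(mean_domain_loss):
    """Proxy A-distance estimate from a discriminator's mean loss (assumed form)."""
    return 2.0 * (1.0 - 2.0 * mean_domain_loss)


def dynamic_factor(loss_global, loss_sensitive, loss_resistant):
    """Weight balancing the global and the drug-specific discriminator losses."""
    d_global = proxy_a_distance(loss_global)
    d_specific = (proxy_a_distance(loss_sensitive)
                  + proxy_a_distance(loss_resistant)) / 2.0
    total = d_global + d_specific
    # Fall back to equal weighting if the distances cancel out (an assumption).
    return d_global / total if total else 0.5


omega = dynamic_factor(loss_global=0.2, loss_sensitive=0.3, loss_resistant=0.4)
print(omega)
```

In a training loop, the factor would typically be recomputed once per epoch from the accumulated discriminator losses and then used to weight the next epoch's adversarial terms.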
The SpaRx model is trained using the stochastic gradient descent (SGD) optimizer. In the SpaRx model, the parameters including the number of adjacent neighbors or MNNs in graph construction, as well as the latent dimensions in graph transformer layers, are determined through grid-based hyper-parameter fine tuning. The hyperparameters used in the final model are: for the SGD optimizer, momentum = 0.9 and weight decay = ; the learning rate is set to
; gradient clip threshold at 5; the number of graph transformer layers is 2, with the dimensions of 512 and 64, respectively. After the model training, SpaRx accurately predicts the drug response labels of cells in the spatial data and uncovers the spatially heterogeneous responses to different types of drugs.
Benchmarking methods and comparison measurement
To evaluate the performance of SpaRx, we compare it with four DL models (SpaRx-GAT, SpaRx-GCN, scDEAL [13] and SCAD [14]) and four ML methods (random forest (RF), SVM, lightGBM and XGBoost). SpaRx-GAT is built on the SpaRx model, with GAT [16] layers as the feature extractor rather than the graph transformer [15]. SpaRx-GCN uses GCN [17] layers as the feature extractor. scDEAL [13] and SCAD [14] were proposed to predict single-cell responses to cancer drugs by integrating large-scale bulk cell-line data and scRNA-seq data. To evaluate the performance of each model, we use the F1 score to assess the agreement between the predicted drug response and the ground truth. The F1 score ranges from 0 to 1, with higher values indicating closer agreement between the predicted drug response and the ground truth. With TP denoting true positives, FP representing false positives and FN representing false negatives, the F1 score is calculated by $F1 = \frac{2\,TP}{2\,TP + FP + FN}$. Additional metrics including precision, recall, AUROC and AUPR are included for comprehensive evaluation of SpaRx and the benchmarking methods.
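For reference, the F1 score can be computed directly from the confusion counts. The sketch below uses a hypothetical label encoding (1 for the positive class) and returns 0 when there are no positive predictions or labels, which is a convention chosen here rather than one stated in the text.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
print(f1_score(y_true, y_pred))  # TP = 2, FP = 1, FN = 1
```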
Identifying surrounding microenvironment, L–R interactions and adjacent cell communications
When characterizing the surrounding microenvironment, we count, for each cell type, the adjacent cells (the five nearest neighbors) around resistant/sensitive cells. After dividing by the total number of cells within each cell type, we obtain the percentage of different cell types in the microenvironment of resistant/sensitive cells. To identify L–R interactions, our previous tool spaCI [19] is applied to the SCST data. With the L–R interactions, we further characterize the adjacent cell communications by interaction strength. Specifically, for an L–R interaction pair, we define its interaction strength as the product of the average expression values of the ligand and the receptor among adjacent cells, where the top and bottom 10% of the expression values of the ligand and the receptor are ignored. The interaction strengths of all identified L–R pairs are then summarized as the interaction strength between two cell types. Thus, the higher the interaction strength, the more strongly the two adjacent cell types interact.
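The interaction strength of a single L–R pair can be sketched as a product of trimmed means. The function names are hypothetical, and rounding the number of trimmed values down (so that very small samples are not trimmed at all) is an assumption not specified in the text.

```python
def trimmed_mean(values, trim=0.10):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    ordered = sorted(values)
    cut = int(len(ordered) * trim)  # number of values dropped at each end
    kept = ordered[cut:len(ordered) - cut] if cut else ordered
    return sum(kept) / len(kept)


def interaction_strength(ligand_expr, receptor_expr, trim=0.10):
    """Product of the trimmed mean ligand and receptor expression among adjacent cells."""
    return trimmed_mean(ligand_expr, trim) * trimmed_mean(receptor_expr, trim)


# Ligand expression in sender cells (one extreme outlier) and receptor
# expression in the adjacent receiver cells.
ligand = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0]
receptor = [2.0] * 10
print(interaction_strength(ligand, receptor))
```

Trimming keeps a single extreme value, such as the ligand outlier above, from dominating the average; the per-pair strengths would then be aggregated over all identified L–R pairs to score a pair of cell types.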
Key Points
We have developed a novel graph-based domain adaptation model, SpaRx, to reveal the heterogeneity of spatial cellular responses to different types of drugs; it bridges the gap between pharmacogenomics knowledge bases and SCST data.
SpaRx is tailored to SCST data and is available as ready-to-use open-source software that demonstrates high accuracy and robust performance.
SpaRx uncovers that tumor cells located in different areas within a tumor lesion exhibit varying levels of sensitivity or resistance to drugs. Moreover, SpaRx reveals that tumor cells interact with themselves and the surrounding microenvironment to form an ecosystem capable of drug resistance.
Author Biographies
Ziyang Tang is a PhD candidate in the Department of Computer and Information Technology, Purdue University, Indiana, USA. His research focuses on developing novel artificial intelligence methods in interdisciplinary science.
Xiang Liu is a postdoctoral researcher in the Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indiana, USA. His research focuses on developing novel deep learning methods in biomedical informatics.
Zuotian Li is a PhD candidate in the Department of Computer Graphics Technology, Purdue University, Indiana, USA. Her research focuses on developing web tools for data visualization.
Tonglin Zhang is an Associate Professor in the Department of Statistics, Purdue University, Indiana, USA. His research focuses on developing novel statistical models for interdisciplinary research.
Baijian Yang is a Professor in the Department of Computer and Information Technology, Purdue University, Indiana, USA. His research focuses on developing novel statistical models for interdisciplinary research.
Jing Su is an Assistant Professor in the Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indiana, USA. His research focuses on graph artificial intelligence and machine learning in biomedical informatics and precision health.
Qianqian Song is an Assistant Professor in the Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Florida, USA. She is also an adjunct Assistant Professor in the Department of Cancer Biology, Wake Forest University School of Medicine, North Carolina, USA. Her research focuses on developing innovative computational methods to decipher disease mechanisms and identify therapeutic biomarkers.
Contributor Information
Ziyang Tang, Department of Computer and Information Technology, Purdue University, Indiana, USA.
Xiang Liu, Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indiana, USA.
Zuotian Li, Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indiana, USA; Department of Computer Graphics Technology, Purdue University, Indiana, USA.
Tonglin Zhang, Department of Statistics, Purdue University, Indiana, USA.
Baijian Yang, Department of Computer and Information Technology, Purdue University, Indiana, USA.
Jing Su, Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indiana, USA.
Qianqian Song, Department of Cancer Biology, Wake Forest University School of Medicine, North Carolina, USA; Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Florida, USA.
FUNDING
Q.S. is supported by the Bioinformatics Shared Resources under the NCI Cancer Center Support Grant to the Comprehensive Cancer Center of Wake Forest University Health Sciences (P30CA012197), and the National Institute of General Medical Sciences of the National Institutes of Health (R35GM151089). J.S. is partially supported by the Indiana University Precision Health Initiative and the Indiana University Melvin and Bren Simon Comprehensive Cancer Center Support Grant from the National Cancer Institute (P30CA082709). J.S. is also supported by the National Library of Medicine of the National Institutes of Health (R01LM013771).
DATA AVAILABILITY
NanoString CosMx SMI data: The single-cell spatial dataset (Lung-13), profiled by CosMx SMI on formalin-fixed paraffin-embedded (FFPE) samples of non-small-cell lung cancer (NSCLC) tissue [5], is available from https://nanostring.com/resources/smi-ffpe-dataset-lung13-data/. Vizgen MERSCOPE data: We include the Vizgen MERFISH liver cancer 1 dataset, which contains a MERFISH measurement of a 500-gene panel. Data are available at https://info.vizgen.com/merscope-ffpe-solution and https://console.cloud.google.com/storage/browser/vz-ffpe-showcase/HumanLiverCancerPatient1;tab=objects?pageState= The gene expression profiles of GDSC and CCLE cell lines are downloaded from https://www.ebi.ac.uk/biostudies/arrayexpress/studies/E-MTAB-3610 and https://depmap.org/portal/download/all/?releasename=DepMap+Public+22Q2&filename=CCLE_expression_full.csv. The gene expression profile data of TCGA lung cancer patients, including lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) patients, are available in the UCSC Xena database (http://xena.ucsc.edu/) at https://xenabrowser.net/datapages/?dataset=TCGA.LUAD.sampleMap%2FHiSeqV2&host=https%3A%2F%2Ftcga.xenahubs.net&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443 (LUAD) and https://xenabrowser.net/datapages/?dataset=TCGA.LUSC.sampleMap%2FHiSeqV2&host=https%3A%2F%2Ftcga.xenahubs.net&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443 (LUSC). The corresponding response information to cisplatin is retrieved from previous studies [47], where responders (including complete response and partial response) and nonresponders (including stable disease and progressive disease) are characterized according to the RECIST standard [48]. https://github.com/QSong-github/SpaRx/tree/main/Data.
CODE AVAILABILITY
SpaRx is provided as a Python package available at https://github.com/QSong-github/SpaRx, with detailed functions for the general applicability on different SCST data.
References
- 1. Nirmal AJ, Maliga Z, Vallius T, et al. The spatial landscape of progression and immunoediting in primary melanoma at single-cell resolution. Cancer Discov 2022;12:1518–41.
- 2. Luca BA, Steen CB, Matusiak M, et al. Atlas of clinically distinct cell states and ecosystems across human solid tumors. Cell 2021;184:5482–5496.e28.
- 3. El-Sayes N, Vito A, Mossman K. Tumor heterogeneity: a great barrier in the age of cancer immunotherapy. Cancer 2021;13:806.
- 4. Kemper K, Krijgsman O, Cornelissen-Steijger P, et al. Intra- and inter-tumor heterogeneity in a vemurafenib-resistant melanoma patient and derived xenografts. EMBO Mol Med 2015;7:1104–18.
- 5. He S, Bhatt R, Brown C, et al. High-plex imaging of RNA and proteins at subcellular resolution in fixed tissue by spatial molecular imaging. Nat Biotechnol 2022;40:1794–806.
- 6. Fang R, Xia C, Close JL, et al. Conservation and divergence of cortical cell organization in human and mouse revealed by MERFISH. Science 2022;377:56–62.
- 7. Zhang Q, Abdo R, Iosef C, et al. The spatial transcriptomic landscape of non-small cell lung cancer brain metastasis. Nat Commun 2022;13:5983.
- 8. Dagogo-Jack I, Shaw AT. Tumour heterogeneity and resistance to cancer therapies. Nat Rev Clin Oncol 2018;15:81–94.
- 9. Barretina J, Caponigro G, Stransky N, et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 2012;483:603–7.
- 10. Yang W, Soares J, Greninger P, et al. Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res 2012;41:D955–61.
- 11. Iorio F, Knijnenburg TA, Vis DJ, et al. A landscape of pharmacogenomic interactions in cancer. Cell 2016;166:740–54.
- 12. Gambardella G, Viscido G, Tumaini B, et al. A single-cell analysis of breast cancer cell lines to study tumour heterogeneity and drug response. Nat Commun 2022;13:1714.
- 13. Chen J, Wang X, Ma A, et al. Deep transfer learning of cancer drug responses by integrating bulk and single-cell RNA-seq data. Nat Commun 2022;13:6494.
- 14. Zheng Z, Chen J, Chen X, et al. Enabling single-cell drug response annotations from bulk RNA-seq using SCAD. Advanced Science 2023;10:e2204113.
- 15. Vaswani A, et al. Attention is all you need. Advances in neural information processing systems 2017;30.
- 16. Veličković P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y. Graph attention networks. arXiv preprint arXiv:1710.10903. 2017.
- 17. Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907. 2016.
- 18. Newman AM, Liu CL, Green MR, et al. Robust enumeration of cell subsets from tissue expression profiles. Nat Methods 2015;12:453–7.
- 19. Tang Z, Zhang T, Yang B, Su J, Song Q. spaCI: deciphering spatial cellular communications through adaptive graph model. Briefings in Bioinformatics 2023;24:bbac563.
- 20. Yosef G, Hayun H, Papo N. Simultaneous targeting of CD44 and MMP9 catalytic and hemopexin domains as a therapeutic strategy. Biochem J 2021;478:1139–57.
- 21. Albini A, Noonan DM. Decidual-like NK cell polarization: from cancer killing to cancer nurturing. Cancer Discov 2021;11:28–33.
- 22. Neill T, Painter H, Buraschi S, et al. Decorin antagonizes the angiogenic network: concurrent inhibition of Met, hypoxia inducible factor 1alpha, vascular endothelial growth factor A, and induction of thrombospondin-1 and TIMP3. J Biol Chem 2012;287:5492–506.
- 23. Gubbiotti MA, Vallet SD, Ricard-Blum S, Iozzo RV. Decorin interacting network: a comprehensive analysis of decorin-binding partners and their versatile functions. Matrix Biol 2016;55:7–21.
- 24. Sofeu Feugaing DD, Götte M, Viola M. More than matrix: the multifaceted role of decorin in cancer. Eur J Cell Biol 2013;92:1–11.
- 25. Grugan KD, Miller CG, Yao Y, et al. Fibroblast-secreted hepatocyte growth factor plays a functional role in esophageal squamous cell carcinoma invasion. Proc Natl Acad Sci U S A 2010;107:11026–31.
- 26. Dong Y, Xu J, Sun B, et al. MET-targeted therapies and clinical outcomes: a systematic literature review. Mol Diagn Ther 2022;26:203–27.
- 27. Chang MY, Kang I, Gale M, Jr, et al. Versican is produced by Trif- and type I interferon-dependent signaling in macrophages and contributes to fine control of innate immunity in lungs. Am J Physiol Lung Cell Mol Physiol 2017;313:L1069–86.
- 28. Wantoch von Rekowski K, König P, Henze S, Schlesinger M, Zawierucha P, Januchowski R, Bendas G. The impact of integrin-mediated matrix adhesion on cisplatin resistance of W1 ovarian cancer cells. Biomolecules 2019;9:788.
- 29. Yang GH, Fan J, Xu Y, et al. Osteopontin combined with CD44, a novel prognostic biomarker for patients with hepatocellular carcinoma undergoing curative resection. Oncologist 2008;13:1155–65.
- 30. Schiller JH, Harrington D, Belani CP, et al. Comparison of four chemotherapy regimens for advanced non-small-cell lung cancer. N Engl J Med 2002;346:92–8.
- 31. Kaya AO, Buyukberber S, Benekli M, et al. Concomitant chemoradiotherapy with cisplatin and docetaxel followed by surgery and consolidation chemotherapy in patients with unresectable locally advanced non-small cell lung cancer. Med Oncol 2010;27:152–7.
- 32. Katayama H, Ueoka H, Kiura K, et al. Preoperative concurrent chemoradiotherapy with cisplatin and docetaxel in patients with locally advanced non-small-cell lung cancer. Br J Cancer 2004;90:979–84.
- 33. Ding Y, Shen Y. Notch increased vitronection adhesion protects myeloma cells from drug induced apoptosis. Biochem Biophys Res Commun 2015;467:717–22.
- 34. Li X, Hu Z, Shi H, et al. Inhibition of VEGFA increases the sensitivity of ovarian cancer cells to chemotherapy by suppressing VEGFA-mediated autophagy. Onco Targets Ther 2020;13:8161–71.
- 35. Baghban R, Roshangar L, Jahanban-Esfahlan R, et al. Tumor microenvironment complexity and therapeutic implications at a glance. Cell Commun Signal 2020;18:59.
- 36. Guo T, Zhang D, Zeng Y, et al. Molecular and cellular mechanisms underlying the pathogenesis of Alzheimer’s disease. Mol Neurodegener 2020;15:40.
- 37. Quail DF, Joyce JA. Microenvironmental regulation of tumor progression and metastasis. Nat Med 2013;19:1423–37.
- 38. Maacha S, Bhat AA, Jimenez L, et al. Extracellular vesicles-mediated intercellular communication: roles in the tumor microenvironment and anti-cancer drug resistance. Mol Cancer 2019;18:55.
- 39. Moses L, Pachter L. Museum of spatial transcriptomics. Nat Methods 2022;19:534–46.
- 40. Lewis SM, Asselin-Labat ML, Nguyen Q, et al. Spatial omics and multiplexed imaging to explore cancer biology. Nat Methods 2021;18:997–1012.
- 41. Moffitt JR, Lundberg E, Heyn H. The emerging landscape of spatial profiling technologies. Nat Rev Genet 2022;23:741–59.
- 42. Vickovic S, Lötstedt B, Klughammer J, et al. SM-omics is an automated platform for high-throughput spatial multi-omics. Nat Commun 2022;13:795.
- 43. Deng Y, Bartosovic M, Ma S, et al. Spatial profiling of chromatin accessibility in mouse and human tissues. Nature 2022;609:375–83.
- 44. Haghverdi L, Lun AT, Morgan MD, Marioni JC. Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat Biotechnol 2018;36:421–7.
- 45. Agarap AF. Deep learning using rectified linear units (relu). arXiv preprint arXiv:1803.08375. 2018.
- 46. Ben-David S, Blitzer J, Crammer K, Pereira F. Analysis of representations for domain adaptation. Advances in neural information processing systems 2006;19.
- 47. Ding Z, Zu S, Gu J. Evaluating the molecule-based prediction of clinical drug responses in cancer. Bioinformatics 2016;32:2891–5.
- 48. Eisenhauer EA, Therasse P, Bogaerts J, et al. New response evaluation criteria in solid tumours: revised RECIST guideline (version 1.1). Eur J Cancer 2009;45:228–47.