Abstract
Background
Predicting individual patient responses to anticancer drugs is a central challenge in precision oncology, hindered by the scarcity of clinical pharmacogenomic data and substantial biological dissimilarity between preclinical models and patient tumors. Patient-derived xenograft (PDX) models offer significantly enhanced tumor biological fidelity compared to in vitro cancer cell line models, yet computational methods to translate PDX-based drug response predictions (DRP) into clinical settings remain limited.
Methods
We developed TRANSPIRE-DRP, a deep learning framework that bridges the translational gap between PDX models and patient tumors through unsupervised domain adaptation. The framework employs a two-stage architecture: first, an autoencoder-based pretraining phase learns domain-invariant genomic representations from large-scale unlabeled data; second, an adversarial adaptation phase aligns these representations while preserving drug response signals from PDX models. We evaluated TRANSPIRE-DRP across three therapeutic agents—Cetuximab, Paclitaxel, and Gemcitabine—in real-life clinical prediction scenarios.
Results
TRANSPIRE-DRP consistently outperformed both cell line-based state-of-the-art models and PDX-based baselines, demonstrating superior translational capacity. Notably, the learned representations preserved tumor-specific molecular features and spontaneously recapitulated established drug-cancer type associations without requiring explicit histological annotations. Interpretability analyses revealed biologically coherent pathway enrichments consistent with known drug mechanisms of action, including EGFR-Wnt signaling crosstalk for Cetuximab, mitotic arrest mechanism for Paclitaxel, and NF-κB-mediated immunomodulation for Gemcitabine.
Conclusions
TRANSPIRE-DRP establishes a scalable, interpretable, and clinically relevant framework for translating preclinical PDX data into personalized therapeutic predictions, providing a robust computational foundation for advancing precision oncology beyond the inherent limitations of traditional in vitro systems.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12967-025-07371-9.
Keywords: Precision oncology, Patient-derived xenografts, Drug response prediction, Domain adaptation, Deep learning
Background
Cancer remains one of the most formidable health challenges worldwide, with approximately 20 million new cases and 9.7 million deaths recorded in 2022 [1]. The paradigm of precision oncology has emerged to reduce this burden by tailoring therapeutic interventions based on specific molecular or genetic vulnerabilities of individual tumors. This personalized approach has demonstrated remarkable clinical benefits, including doubled median overall survival (51.7 vs. 25.8 weeks) and substantial reductions in healthcare costs compared to conventional treatment strategies [2]. Central to precision oncology is pharmacogenomics, which seeks to predict patient-specific drug responses by leveraging comprehensive molecular profiles [3, 4]. However, a critical bottleneck impedes clinical implementation: patient-level drug response datasets remain limited in scale and often lack public accessibility due to high costs, limited accrual rates, and a complex regulatory landscape, necessitating the development of robust preclinical models as patient surrogates [5].
Among available preclinical models, cancer cell lines have dominated pharmaceutical screening platforms due to their cost-effectiveness, standardized cultivation protocols, and compatibility with high-throughput evaluation systems. Large-scale collaborative initiatives, including the Cancer Cell Line Encyclopedia (CCLE) [6], Genomics of Drug Sensitivity in Cancer (GDSC) [7], and Cancer Therapeutics Response Portal (CTRP) [8], have established comprehensive drug sensitivity databases spanning hundreds of cancer lineages with extensive molecular characterization. These rich repositories have catalyzed the development of diverse computational frameworks for drug response prediction (DRP), encompassing matrix factorization approaches [9, 10], network-based methodologies [11, 12], conventional machine learning strategies [13, 14], and sophisticated deep learning architectures [15, 16]. Despite these advances, in vitro cell line models suffer from fundamental biological limitations that severely compromise their translational utility. Extended cultivation periods inevitably diminish tumor heterogeneity, eliminate critical microenvironmental interactions, and promote selection for rapid proliferation characteristics that diverge substantially from in vivo tumor biology. These systematic alterations contribute to a remarkably poor clinical translation rate, with only 5% of novel oncology compounds successfully progressing from cell line-based investigations to approved therapeutic applications [17]. This substantial translational gap fundamentally limits the utility of cell line-based approaches for precision medicine applications.
Patient-derived xenograft (PDX) models represent a compelling alternative preclinical platform. Generated by directly implanting fresh patient tumor fragments into immunodeficient mice, PDX models demonstrate superior biological fidelity to original tumor characteristics compared to cancer cell lines, as these in vivo models preserve the histological architecture, three-dimensional spatial organization, and genetic profiles of the original patient tumors [18]. Clinical validation studies have consistently demonstrated remarkable concordance between PDX drug responses and patient treatment outcomes [19–21], with concordance rates ranging from 81 to 100% across diverse tumor types [22]. Recognizing this superior clinical relevance, the National Cancer Institute announced its strategic transition from the traditional “NCI-60 Human Tumor Cell Lines Screen” to PDX-based screening platforms in 2016 [23]. Recent large-scale initiatives, particularly the Novartis PDX panel implementing a systematic “1 × 1 × 1” design (one patient-derived tumor, one PDX model, and one matched drug response dataset), have generated high-quality molecular profiling data paired with standardized in vivo drug efficacy measurements [24].
However, despite their enhanced biological relevance, available high-quality PDX datasets remain substantially smaller in scale compared to established cancer cell line repositories. This limited scale has significantly constrained research efforts in developing specialized computational methodologies for PDX-based DRP, resulting in a remarkably underdeveloped computational landscape. The pioneering computational approach, PDXGEM (Patient-Derived Xenograft Gene Expression Model) [25], implements a three-stage pipeline: (1) manual feature engineering through statistical techniques to identify drug sensitivity biomarkers from PDX gene expression profiles; (2) concordant co-expression analysis (CCEA) to align PDX and patient domains by retaining biomarkers with consistent expression patterns; and (3) random forest classification for patient response prediction. While PDXGEM represents an important initial step, it suffers from critical methodological limitations that constrain its clinical utility. First, its reliance on manual feature selection through univariate statistical tests fails to capture the complex, non-linear molecular interactions that are crucial for accurate DRP. Second, the simplistic correlation-based CCEA alignment strategy cannot adequately address the domain shifts that exist between PDX models and patient tumors. These limitations highlight the urgent need for more advanced computational approaches that can fully exploit the biological advantages of PDX models.
Recent advances in cell line-based DRP have successfully incorporated domain adaptation strategies to bridge the biological gap between these in vitro models and patient tumors, demonstrating substantial improvements in translational relevance [26–28]. The application of sophisticated domain adaptation techniques to PDX-based prediction holds exceptional promise, as PDX models inherently exhibit greater biological similarity to patient tumors compared to cancer cell lines. This enhanced similarity suggests that domain adaptation approaches may achieve more effective cross-domain alignment when applied to PDX-patient translation, potentially leading to significantly superior predictive performance for clinical applications.
To address these challenges and opportunities, we present TRANSPIRE-DRP (TRANSlating PDX Information for Real-world Estimation toward Drug Response Prediction), a novel deep learning framework specifically designed for transferring DRPs from PDXs (source domain) to clinical patients (target domain). Specifically, TRANSPIRE-DRP follows a two-phase scheme of pre-training and adaptation. In the pre-training phase, we leverage large-scale unlabeled genomic profiles from both domains to learn robust, domain-invariant representations that serve as informative initializations for subsequent adaptation with limited PDX samples, thereby mitigating the data scarcity challenge. In the subsequent adaptation phase, the pre-trained encoder undergoes fine-tuning within a domain adversarial framework that preserves drug response signals learned from the PDX domain while simultaneously aligning these representations with the patient domain, enabling direct clinical application of the resulting model. Comprehensive evaluation across multiple therapeutic agents demonstrates that TRANSPIRE-DRP achieves superior predictive performance compared to both cell line-based state-of-the-art models and PDX-based baselines. Systematic analyses confirm successful domain alignment while preserving biologically meaningful feature representations, highlighting the enhanced translational potential of our framework for precision oncology applications.
Methods
Overview of the TRANSPIRE-DRP framework
TRANSPIRE-DRP addresses the fundamental challenge of translating therapeutic sensitivity predictions from PDX models to clinical patients through a novel two-stage deep learning architecture. The framework conceptualizes this translation as a cross-domain knowledge transfer problem, where PDX models serve as the labeled source domain and patient cohorts represent the unlabeled target domain. Our approach integrates representation learning with adversarial domain adaptation to bridge the molecular gap between these distinct biological contexts while preserving therapeutically relevant signals. The architectural design encompasses two sequential phases: (1) an unsupervised representation learning phase that extracts domain-invariant genomic representations from both PDX and patient molecular profiles, and (2) an adaptation phase that employs domain adversarial training to align PDX-derived drug response patterns with patient molecular signatures (Fig. 1).
Fig. 1.
Schematic illustration of the TRANSPIRE-DRP framework. The framework consists of three primary phases: (1) Autoencoder pre-training: In this phase, the model learns domain-invariant genomic representations from large-scale unlabeled PDX and patient genomic data. The shared encoder learns a common representation, while the private encoder learns a domain-specific representation. The outputs of both encoders are used for reconstructing the input genomic data, ensuring generalizability across domains. (2) Domain adaptation: This phase fine-tunes the shared encoder using adversarial training, where the goal is to align the feature space between the domain-invariant parts of the PDX data (source domain) and the patient data (target domain). The drug response classifier is trained on labeled PDX data, while a domain discriminator helps to minimize cross-domain distribution discrepancies. (3) Drug response prediction: The aligned domain-invariant representations are used to predict drug response for patient samples in the target domain, providing predictions for clinical applications
Problem formulation
We formulate the PDX-to-patient therapeutic efficacy modeling challenge as an unsupervised domain adaptation problem. Let $\mathcal{D}^s = \{(x_i^s, y_i)\}_{i=1}^{N_s}$ denote the source domain dataset composed of $N_s$ sample-label pairs, and $\mathcal{D}^t = \{x_i^t\}_{i=1}^{N_t}$ denote the target domain dataset composed of $N_t$ samples without labels. Here, $x_i^s$ and $x_i^t$ stand for the $d$-dimensional genomic feature vectors for the $i$-th PDX and patient, respectively. The binary response label $y_i \in \{0, 1\}$ indicates drug sensitivity (1) or resistance (0) for the $i$-th PDX sample. Our objective is to train a pharmacogenomic model (including representation extraction and response prediction) on the source domain that can accurately infer drug response labels for the target domain samples.
Pre-training of domain representation extractor
The pre-training phase strategically leverages abundant unlabeled molecular data to learn generalizable representations, which can then facilitate effective model adaptation on small-scale drug response datasets. Following the architecture of previous work [29], we implement a specialized autoencoder that systematically decomposes input genomic profiles into domain-shared and domain-specific components, facilitating robust representation extraction across heterogeneous biological contexts. Specifically, each input $x$ from the unlabeled source domain dataset $\mathcal{X}^s$ or target domain dataset $\mathcal{X}^t$ is encoded into two separate representations: one by its corresponding PDX or patient private encoder $E_p$, and another by a shared encoder $E_c$. The concatenation of the two representations is then used to reconstruct an approximation $\hat{x}$ of the input genomic features via a shared decoder $G$:
$$\hat{x} = G\big(E_p(x) \oplus E_c(x)\big) \qquad (1)$$
where $\oplus$ stands for the vector concatenation operation. The mean squared error loss is adopted to minimize the distance between $x$ and $\hat{x}$, ensuring consistency in genomic features before and after reconstruction:
$$\mathcal{L}_{recon} = \frac{1}{|\mathcal{X}|}\sum_{x \in \mathcal{X}} \lVert x - \hat{x} \rVert_2^2, \quad \mathcal{X} = \mathcal{X}^s \cup \mathcal{X}^t \qquad (2)$$
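As an illustration only, the shared/private encoding and the reconstruction objective described above can be sketched in NumPy. The layer sizes, the single-layer tanh encoders, the random weights, and all function names are hypothetical choices made for this sketch and are not taken from the TRANSPIRE-DRP implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear_encoder(in_dim, out_dim):
    """Return a toy one-layer encoder (random weights, tanh activation)."""
    w = rng.normal(scale=0.1, size=(in_dim, out_dim))
    return lambda x: np.tanh(x @ w)

d, k = 8, 3  # hypothetical input and latent sizes
private_enc = {"pdx": linear_encoder(d, k),      # one private encoder per domain
               "patient": linear_encoder(d, k)}
shared_enc = linear_encoder(d, k)                # shared across both domains
w_dec = rng.normal(scale=0.1, size=(2 * k, d))   # shared linear decoder

def reconstruct(x, domain):
    """Concatenate private and shared codes, then decode with the shared decoder."""
    z = np.concatenate([private_enc[domain](x), shared_enc(x)], axis=1)
    return z @ w_dec

def recon_loss(x, x_hat):
    """Mean squared error between the input and its reconstruction."""
    return float(np.mean((x - x_hat) ** 2))

x_pdx = rng.normal(size=(5, d))   # five synthetic PDX profiles
x_hat = reconstruct(x_pdx, "pdx")
loss = recon_loss(x_pdx, x_hat)
```

In a trained model the encoder and decoder weights would be optimized to reduce `loss`; here they are random, so the value only demonstrates the data flow.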
Moreover, a difference loss is applied to encourage the encoders to produce such separate representations. To this end, we introduce a soft subspace orthogonality constraint between private and shared representations in each domain:
$$\mathcal{L}_{diff} = \lVert {H_c^s}^{\top} H_p^s \rVert_F^2 + \lVert {H_c^t}^{\top} H_p^t \rVert_F^2 \qquad (3)$$
where $H_c^s$, $H_c^t$ (or $H_p^s$, $H_p^t$) are matrices whose rows indicate the shared (or private) representations $h_c^s$, $h_c^t$ (or $h_p^s$, $h_p^t$) from source and target domain samples, respectively; $\lVert \cdot \rVert_F^2$ means the squared Frobenius norm; and $\top$ indicates the transpose operation.
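The soft subspace orthogonality constraint can be illustrated with a short NumPy sketch. It assumes, as in the description above, that each matrix holds one representation per row; the function names and the toy matrices are hypothetical.

```python
import numpy as np

def difference_loss(shared, private):
    """Squared Frobenius norm of shared^T @ private (rows are samples)."""
    return float(np.sum((shared.T @ private) ** 2))

def total_difference_loss(hs_c, hs_p, ht_c, ht_p):
    """Sum of the source-domain and target-domain orthogonality penalties."""
    return difference_loss(hs_c, hs_p) + difference_loss(ht_c, ht_p)

# Toy case: every shared column is orthogonal to every private column
# across the three samples, so the penalty vanishes.
shared = np.array([[1.0, 0.0],
                   [0.0, 0.0],
                   [0.0, 1.0]])
private = np.array([[0.0, 0.0],
                    [1.0, 0.0],
                    [0.0, 0.0]])

orthogonal_penalty = difference_loss(shared, private)
overlap_penalty = difference_loss(shared, shared)  # identical codes are penalized
```

The penalty is zero only when the shared and private codes are uncorrelated across samples, which is what pushes the two encoders toward complementary information.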
By combining the above two losses, we can ensure that the shared representation remains unaffected by the private representation of each domain, thereby yielding more invariant information that generalizes well across domains:
$$\mathcal{L}_{pre} = \mathcal{L}_{recon} + \mathcal{L}_{diff} \qquad (4)$$
Domain adversarial adaptation for drug response prediction
The adaptation phase implements an adversarial training framework that fine-tunes the pre-trained shared encoder for effective therapeutic sensitivity modeling while aligning domain-invariant feature representations. This adversarial architecture comprises three interconnected components operating in dynamic equilibrium: the pre-trained shared encoder $E_c$ that transforms genomic inputs into intermediate representations, a drug response classifier $C$ implemented as a multi-layer perceptron for binary sensitivity prediction, and a domain discriminator $D$ that attempts to identify the domain origin of feature representations.
To address the prevalent challenges in preclinical drug screening datasets, particularly severe class imbalance where resistant samples typically outnumber sensitive ones, and the widespread presence of ambiguous cases near decision boundaries, we employ focal loss as our classification objective [30]. Unlike standard cross-entropy loss that treats all samples equally, focal loss dynamically adjusts the contribution of individual samples based on their classification difficulty and corresponding class:
$$\mathcal{L}_{cls} = -\frac{1}{N_s}\sum_{i=1}^{N_s}\Big[\alpha\, y_i\,(1 - p_i)^{\gamma}\log p_i + (1 - \alpha)(1 - y_i)\, p_i^{\gamma}\log(1 - p_i)\Big] \qquad (5)$$
where $p_i$ is the predicted probability of drug sensitivity for the $i$-th of the $N_s$ labeled PDX samples and $y_i$ is its binary response label. The balancing factor $\alpha$ compensates for skewed class distributions, while the modulating factors $(1 - p_i)^{\gamma}$ and $p_i^{\gamma}$ down-weight contributions from well-classified examples, compelling the model to concentrate on hard-to-classify samples that often represent borderline drug responses. The focusing parameter $\gamma$ controls this effect, with higher values placing more emphasis on challenging cases. This is particularly crucial in pharmacogenomics, where these ambiguous cases typically harbor competing sensitivity and resistance mechanisms with complex molecular crosstalk, enabling the model to learn nuanced decision boundaries critical for clinically heterogeneous tumors.
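A minimal pure-Python sketch of a binary focal loss shows how the balancing and focusing parameters interact. The default values `alpha=0.25` and `gamma=2.0` are the commonly cited defaults from the focal loss literature and are assumptions here; the values used for TRANSPIRE-DRP are not stated in this section.

```python
import math

def focal_loss(probs, labels, alpha=0.25, gamma=2.0, eps=1e-12):
    """Mean binary focal loss.

    probs  : predicted probabilities of sensitivity (class 1)
    labels : 0/1 response labels
    alpha  : weight on the positive class (1 - alpha on the negative class)
    gamma  : focusing parameter; gamma = 0 gives alpha-weighted cross-entropy
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        if y == 1:
            total += -alpha * (1.0 - p) ** gamma * math.log(p)
        else:
            total += -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)
    return total / len(probs)

easy = focal_loss([0.9], [1])   # confidently correct sensitive sample
hard = focal_loss([0.6], [1])   # borderline sensitive sample
```

With `gamma > 0` the confidently classified sample contributes far less than the borderline one, which is the down-weighting behaviour described above.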
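Simultaneously, the domain discriminator $D$ attempts to distinguish whether the encoded representations originate from the source or target domain. The discrimination loss is formulated as:
$$\mathcal{L}_{dis} = -\frac{1}{N_s}\sum_{i=1}^{N_s}\log D\big(E_c(x_i^s)\big) - \frac{1}{N_t}\sum_{i=1}^{N_t}\log\Big(1 - D\big(E_c(x_i^t)\big)\Big) \qquad (6)$$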
The adversarial training mechanism employs a gradient reversal layer (GRL) [31] positioned between the shared encoder and domain discriminator, which reverses gradient signs during backpropagation with scaling factor $\lambda$, effectively training the encoder to produce representations that maximize discriminator confusion while preserving drug response predictive capabilities:
$$\theta_E \leftarrow \theta_E - \mu\left(\frac{\partial \mathcal{L}_{cls}}{\partial \theta_E} - \lambda\,\frac{\partial \mathcal{L}_{dis}}{\partial \theta_E}\right) \qquad (7)$$
where $\theta_E$ represents the encoder parameters, $\mu$ is the learning rate, and $\lambda$ controls the adversarial adaptation strength.
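The effect of gradient reversal on the encoder update can be illustrated with a framework-free sketch. The function name and the toy gradient values are hypothetical; in practice the gradients come from backpropagation, and the reversal is implemented as a layer that is the identity in the forward pass.

```python
def encoder_step(theta, grad_cls, grad_dis, lr, lam):
    """One update of encoder parameters placed behind a gradient reversal layer.

    The classifier gradient is followed as usual (descent), while the
    discriminator gradient enters with a flipped sign scaled by lam, so the
    encoder moves to increase the discriminator loss.
    """
    return [t - lr * (gc - lam * gd)
            for t, gc, gd in zip(theta, grad_cls, grad_dis)]

# Toy numbers: a single encoder parameter.
with_reversal = encoder_step([1.0], grad_cls=[0.5], grad_dis=[2.0], lr=0.1, lam=1.0)
without_adaptation = encoder_step([1.0], grad_cls=[0.5], grad_dis=[2.0], lr=0.1, lam=0.0)
```

Setting `lam` to zero recovers plain supervised fine-tuning, which makes the scaling factor a convenient knob for ramping adaptation strength.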
Model training and optimization
TRANSPIRE-DRP employs a staged optimization protocol tailored to each phase’s specific objectives. The pre-training phase utilizes the AdamW optimizer with an initial learning rate of
, training for
epochs with balanced mini-batches containing samples from both domains to ensure equitable representation learning.
The adaptation phase implements the SGD optimizer with Nesterov momentum (momentum = 0.9) and weight regularization with decay of
for all network parameters. The combined training objective balances drug response accuracy with domain alignment:
$$\mathcal{L}_{total} = \mathcal{L}_{cls} + \beta\,\mathcal{L}_{dis} \qquad (8)$$
where $\beta$
controls the trade-off between response prediction and domain adaptation. Training continues for
epochs with model selection based on validation performance to ensure optimal generalization to unseen data.
Hyperparameter optimization was conducted through systematic tuning of three critical parameters: the pre-training epoch
, the latent feature embedding size
, and the trade-off weight
. Each parameter was individually optimized while maintaining the others at empirically determined default values. The search spaces encompassed:
,
, and
. The parameter selection was based on validation performance evaluated within the 5-fold cross-validation framework.
The entire framework is implemented using PyTorch 1.13.1 and trained on NVIDIA GeForce RTX 3090 GPUs.
Datasets
For the pre-training phase, we assembled a comprehensive genomic repository comprising 399 PDX samples from the NIBR PDXE [24] and 9,808 patient tumor samples from TCGA [32], serving as source and target domains, respectively. Each sample was represented by its gene expression profile, quantified as transcripts per million (TPM) and then log-transformed. To further ensure comparability across samples and enhance downstream learning, we performed gene-level mean centering and standardization on the transformed expression data. The input feature space was constructed by integrating 1,426 highly variable genes previously identified in preclinical models and clinical tumors [28] with 1,127 established oncogenes from recent pan-cancer analyses [33], resulting in 2,358 unique genes after deduplication. For the adaptation phase, we curated labeled drug response datasets encompassing 178 PDX samples with corresponding in vivo efficacy measurements and 1,100 patient samples with documented clinical outcomes. Patient response categorization followed standardized Response Evaluation Criteria In Solid Tumors (RECIST) guidelines [34], where complete response (CR) and partial response (PR) were classified as sensitivity, while stable disease (SD) and progressive disease (PD) were designated as resistance. PDX response labels were determined using modified RECIST criteria (mRECIST) [24], with complete response (mCR), partial response (mPR), and stable disease (mSD) classified as sensitivity, and progressive disease (mPD) as resistance. Given computational requirements for robust model training and inherent class imbalance challenges in preclinical datasets, we implemented stringent sample size criteria for drug selection. Eligible therapeutic agents required a minimum of 30 PDX samples for computational power, with at least 5 positive (sensitive) samples to ensure adequate representation of the minority class.
This filtering process initially identified four candidate drugs, from which Trametinib was subsequently excluded due to insufficient corresponding patient response data, resulting in three drugs (Cetuximab, Paclitaxel, and Gemcitabine) for comprehensive evaluation (Fig. 2A).
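The response binarization and the drug eligibility rule described above can be written down compactly. This is a sketch of the stated rules only; the function and constant names are hypothetical.

```python
# Categories treated as sensitive (label 1) in each domain, per the text above.
PATIENT_SENSITIVE = {"CR", "PR"}        # RECIST: SD and PD count as resistant
PATIENT_RESISTANT = {"SD", "PD"}
PDX_SENSITIVE = {"mCR", "mPR", "mSD"}   # mRECIST: only mPD counts as resistant
PDX_RESISTANT = {"mPD"}

def binarize(category, domain):
    """Map a RECIST/mRECIST category to 1 (sensitive) or 0 (resistant)."""
    if domain == "pdx":
        sensitive, resistant = PDX_SENSITIVE, PDX_RESISTANT
    else:
        sensitive, resistant = PATIENT_SENSITIVE, PATIENT_RESISTANT
    if category in sensitive:
        return 1
    if category in resistant:
        return 0
    raise ValueError(f"unknown response category: {category!r}")

def drug_is_eligible(pdx_labels, min_samples=30, min_positive=5):
    """Apply the sample-size criteria to a drug's binarized PDX labels."""
    return len(pdx_labels) >= min_samples and sum(pdx_labels) >= min_positive
```

Note that stable disease maps to different labels in the two domains (`binarize("SD", "patient")` is 0, whereas `binarize("mSD", "pdx")` is 1), which is the asymmetry examined in the sensitivity analysis under a unified definition.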
Fig. 2.
PDX dataset overview and molecular similarity analysis. (A) Sample size distribution across seven anti-cancer drugs showing PDXE (blue) and TCGA (red) samples, with positive (orange) and negative (pink) response labels. Only three drugs (Cetuximab, Paclitaxel, Gemcitabine) met minimum sample size criteria for model training. (B) UMAP visualization of gene expression profiles across cancer cell lines (orange circles), PDX models (blue triangles), and patient tumors (red circles). (C) Quantitative comparison of molecular similarity between patient samples and preclinical models using pairwise cosine similarity analysis. Boxplot shows the distribution of similarity scores between TCGA patient samples and CCLE cell lines (TCGA2CCLE, yellow) versus TCGA patient samples and PDXE models (TCGA2PDXE, blue). The P values are calculated using two-sided Wilcoxon rank-sum test. *** P value < 0.001
Evaluation protocol
This work focused on examining the performance of DRP models in a cross-domain transfer learning scenario, where the model predicts drug responses in out-of-distribution patient data (target domain) whose labels are never observed during training using knowledge transferred from PDX models (source domain). We implemented a 5-fold cross-validation protocol on the PDX dataset, where in each iteration, four out of five folds of PDX samples with drug response labels were used for training while the remaining one fold of samples served as validation for model selection (Supplementary Fig. 1A). Subsequently, the trained model was tested on the patient cohort, to predict drug response labels for patients. During the adaptation phase, the model performs unsupervised domain alignment between labeled PDX training samples and unlabeled test patient samples to bridge the domain gap through adversarial training. To address potential information leakage concerns, we additionally implemented a stricter evaluation protocol with complete test set isolation (Supplementary Fig. 1B). Under this conservative setting, test patient samples were entirely excluded from the domain alignment process during training. Instead, we randomly sampled an equivalent number of patients from the non-test patient pool for domain adaptation. Two common metrics were utilized to measure the classification performance of models: area under the ROC curve (AUC), and area under the precision-recall curve (AUPR).
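The two evaluation settings can be sketched as follows. The sketch assumes plain (unstratified) random folds and simple random sampling from the non-test pool, since the text does not specify stratification; all names are hypothetical.

```python
import random

def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, validation_indices) for each cross-validation fold."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, folds[k]

def sample_alignment_patients(all_patient_ids, test_ids, seed=0):
    """Strict-isolation setting: draw as many unlabeled patients as there are
    test patients, using only patients outside the test set."""
    test = set(test_ids)
    pool = [p for p in all_patient_ids if p not in test]
    return random.Random(seed).sample(pool, len(test))

splits = list(five_fold_splits(35))                       # e.g. 35 labeled PDX samples
aligned = sample_alignment_patients(range(100), range(10))
```

`random.sample` raises `ValueError` if the non-test pool is smaller than the test set, so the strict setting implicitly requires a sufficiently large unlabeled patient pool.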
To objectively evaluate domain alignment effectiveness, we employ two complementary approaches. First, we calculated the CORAL (CORrelation ALignment) distance [35] to quantify domain discrepancy:
$$d_{CORAL} = \frac{1}{4d^2}\lVert C_s - C_t \rVert_F^2 \qquad (9)$$
where $C_s$ and $C_t$ denote the covariance matrices of source and target domains respectively, and $d$ represents the feature dimensionality. Lower CORAL values indicate reduced inter-domain mismatch. Second, we examined cluster preservation using a per-cluster preservation ratio derived from Silhouette coefficients. For each sample $i$ assigned to cluster $c$, the Silhouette coefficient is expressed as:
$$s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}} \qquad (10)$$
where $a(i)$ is the average Euclidean distance from sample $i$ to all other samples within the same cluster, and $b(i)$ is the average Euclidean distance to all samples in the nearest neighboring cluster. For each histology-defined cluster $c$, we computed the Silhouette coefficient across all samples $i \in c$ before and after alignment. The cluster preservation ratio for cluster $c$ is then defined as:
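$$R_c = \frac{S_c^{after}}{S_c^{before}} \qquad (11)$$
where $S_c^{before}$ and $S_c^{after}$ represent cluster-level Silhouette coefficients before and after alignment. A value $R_c > 1$ indicates improved within-cluster cohesion; $0 < R_c < 1$ indicates partial preservation; negative values arise when $S_c^{after} < 0$ despite $S_c^{before} > 0$, consistent with boundary erosion.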
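Both alignment diagnostics can be sketched in a few lines of NumPy. The CORAL distance follows the standard squared-Frobenius form with the 1/(4d²) normalization; the preservation ratio is assumed here to be the after-alignment cluster Silhouette divided by the before-alignment one, which is what the interpretation of its values suggests. Function names and the synthetic data are hypothetical.

```python
import numpy as np

def coral_distance(source, target):
    """CORAL distance between two (n_samples, d) feature matrices."""
    d = source.shape[1]
    c_s = np.cov(source, rowvar=False)   # d x d covariance of source features
    c_t = np.cov(target, rowvar=False)   # d x d covariance of target features
    return float(np.sum((c_s - c_t) ** 2) / (4 * d * d))

def preservation_ratio(s_before, s_after):
    """Ratio of cluster-level Silhouette coefficients (after / before).

    Assumes s_before is non-zero; the sign interpretation in the text
    further assumes s_before > 0.
    """
    return s_after / s_before

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 4))

same = coral_distance(x, x)            # identical covariance structure
shifted = coral_distance(x, x + 5.0)   # mean shift leaves covariance unchanged
scaled = coral_distance(x, 2.0 * x)    # rescaling changes covariance
```

Because CORAL compares only second-order statistics, a pure mean shift between domains is invisible to it, which is one reason it is paired with a complementary cluster-level check.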
Results and discussion
PDX models exhibit superior molecular fidelity to patient tumors
To establish the biological rationale for our PDX-based approach, we systematically compared the molecular similarity between preclinical models and patient tumor samples. Using Uniform Manifold Approximation and Projection (UMAP) dimensionality reduction, we visualized gene expression profiles of cancer cell lines from CCLE [6], PDX models from PDXE, and patient tumors from TCGA. The analysis revealed distinct clustering patterns that highlighted fundamental differences in translational relevance between preclinical platforms (Fig. 2B). Cancer cell lines (orange circles) formed tight, well-separated clusters that were distinctly isolated from patient tumor samples (red circles) in the UMAP space. This pronounced molecular segregation reflects the extensive adaptation of cell lines to in vitro culture conditions, including selection for rapid proliferation and accumulation of culture-induced genetic alterations that systematically diverge from primary tumor biology [36]. In stark contrast, PDX models (blue triangles) demonstrated extensive co-localization with patient clusters across multiple regions of the UMAP space, indicating superior preservation of original tumor characteristics compared to cancer cell lines.
To quantify these observations, we calculated pairwise cosine similarities between patient samples and each preclinical model type (Fig. 2C). Patient tumors exhibited significantly higher molecular similarity to PDX models than to cancer cell lines (TCGA-to-PDXE vs. TCGA-to-CCLE, P value < 0.001). However, the TCGA-to-PDXE similarities, while statistically superior, showed considerable variability and remained moderate in magnitude, indicating that molecular differences persist between patient tumors and PDX models. This quantitative finding was also consistent with the UMAP visualization, where PDX samples demonstrated substantial co-localization with patient clusters but did not achieve complete overlap (Fig. 2B).
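For illustration, pairwise cosine similarity between two sets of expression profiles can be computed as below. The function name and the toy vectors are hypothetical, and rows are assumed to be non-zero.

```python
import numpy as np

def pairwise_cosine(a, b):
    """Cosine similarity between every row of a and every row of b.

    a : (n_a, d) array, b : (n_b, d) array; returns an (n_a, n_b) matrix.
    Rows are assumed to have non-zero norm.
    """
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_n @ b_n.T

patients = np.array([[1.0, 0.0],
                     [0.0, 2.0]])
models = np.array([[3.0, 0.0],
                   [1.0, 1.0]])
sim = pairwise_cosine(patients, models)
```

The flattened similarity matrices for the two preclinical platforms are the two samples that a rank-based test such as the Wilcoxon rank-sum test would then compare.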
These findings provide compelling evidence for PDX models’ enhanced biological fidelity, while the documented molecular disparities between PDX and patient samples justify the implementation of sophisticated transfer learning strategies to bridge systematic platform-specific differences and optimize predictive accuracy for clinical applications.
TRANSPIRE-DRP achieves superior predictive performance across multiple therapeutic agents
To validate the clinical utility of TRANSPIRE-DRP, we conducted comprehensive real-life therapeutic sensitivity predictions across three drugs: Cetuximab, Paclitaxel, and Gemcitabine. These agents were selected based on stringent sample size criteria (Fig. 2A; see Methods) and represented diverse data availability scenarios: Gemcitabine represented a source domain-limited scenario with relatively limited PDX samples, Cetuximab exemplified a target domain-limited case with constrained patient samples, while Paclitaxel had adequate samples in both domains. Systematic hyperparameter optimization yielded drug-specific configurations, with sensitivity analysis confirming robust parameter selection across all three therapeutic agents (Supplementary Fig. 2).
Firstly, to address potential systematic bias from differential response evaluation criteria between domains, we conducted sensitivity analyses using a unified cross-domain definition where stable disease (SD/mSD) was consistently classified as resistance in both domains. TRANSPIRE-DRP demonstrated minimal performance changes under this unified definition, suggesting that differential stable disease handling is unlikely to be a major confounding factor in our models (Supplementary Fig. 3).
To demonstrate TRANSPIRE-DRP’s superiority, we then compared it with six competing approaches, including four cancer cell line-based state-of-the-art models (TRANSACT [27], Velodrome [37], CODE-AE [28], and WISER [38]) and two PDX-based baselines implementing advanced unsupervised domain adaptation algorithms (MDD [39] and ToAlign [40]). Among the cell line-based models, TRANSACT utilizes nonlinear subspace alignment through kernel methods to establish a consensus space capturing shared biological processes between cancer cell lines and human tumors. Velodrome pioneers semi-supervised domain generalization with consistency and alignment losses to leverage both labeled cancer cell line data and unlabeled patient data. CODE-AE implements a context-aware deconfounding autoencoder with adversarial training to separate biological signals from confounding factors. WISER employs supervised domain-invariant learning to align cell line and patient profiles while preserving drug-specific patterns, then applies weak supervision with subset selection to generate high-quality pseudo-labels for unlabeled patient data. For PDX-based baselines, MDD minimizes margin disparity discrepancy between source and target domains through adversarial learning, providing theoretical guarantees for generalization performance. ToAlign performs task-oriented unsupervised domain adaptation by decomposing source features into task-relevant and task-irrelevant components, aligning only the discriminative features. Regarding computational efficiency, we compared the running time of all methods within the 500 training epochs, excluding the pre-training phase. As presented in Supplementary Table 1, TRANSPIRE-DRP demonstrates competitive efficiency.
In terms of performance comparison, TRANSPIRE-DRP demonstrated superior performance across all evaluation scenarios (Fig. 3 and Supplementary Fig. 4A). Specifically, for Cetuximab, our framework achieved exceptional performance with AUC of 0.6800 and AUPR of 0.7495, outperforming the best competitor by 3.43% in AUC and 5.96% in AUPR despite the limited patient cohort. For Paclitaxel, TRANSPIRE-DRP maintained competitive advantage (AUC: 0.6221, AUPR: 0.7609) with consistent improvements of 2.62% in AUC and 1.71% in AUPR. Notably, even under the challenging Gemcitabine scenario with only 35 PDX training samples, our framework achieved performance comparable to cell line-based methods (AUC: 0.5888, AUPR: 0.5648) while outperforming PDX baselines. Furthermore, experimental results demonstrated that PDX-based baselines exhibited drug-dependent performance variability, achieving competitive results for Paclitaxel (AUC: 0.5816–0.5899, AUPR: 0.7292–0.7366) but underperforming significantly for Cetuximab and Gemcitabine compared to cell line-based methods. This disparity likely reflects the interplay between limited sample diversity and the complex optimization requirements of advanced domain adaptation algorithms, where insufficient training data may lead to suboptimal convergence. To ensure the robustness of our model, we conducted comprehensive verification experiments comparing TRANSPIRE-DRP against cell line-based deep learning baselines. Statistical significance testing through 5 independent runs of 5-fold cross-validation, resulting in 25 performance measurements per model, confirmed TRANSPIRE-DRP’s consistent superiority over Velodrome and WISER for both Cetuximab and Paclitaxel, though not over CODE-AE, primarily due to higher variance across different random initializations—a known challenge when training deep networks on limited datasets where random initialization could occasionally lead to suboptimal convergence (Supplementary Fig. 5). 
Furthermore, control experiments under a strict isolation protocol, where test patient samples were entirely excluded from training, demonstrated that TRANSPIRE-DRP maintained superior performance with minimal change across all three drugs (Supplementary Fig. 1B and Supplementary Fig. 6; see Methods). These results further substantiate the effectiveness and robustness of our approach.
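For readers less familiar with the two metrics reported above, the following minimal sketch shows how AUC and AUPR can be computed from binary response labels and predicted probabilities. It is an illustration only, not the evaluation code of this study: the function names are hypothetical, AUPR is approximated by average precision, and tied scores are not handled specially in the precision-recall calculation.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random responder outscores a random non-responder."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision)."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled responder
    return total / n_pos


if __name__ == "__main__":
    # Toy example: 1 = responder, 0 = non-responder (made-up values).
    y = [1, 0, 1, 1, 0, 0]
    p = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    print(round(roc_auc(y, p), 4), round(average_precision(y, p), 4))
```

Under the 5 runs of 5-fold cross-validation described above, such a routine would be called once per held-out fold, giving 25 values per model and metric.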
Fig. 3.
Performance comparison of TRANSPIRE-DRP with compared models across three anticancer drugs. (A) Area under the ROC curve (AUC) and (B) area under the precision-recall curve (AUPR) for response prediction on Cetuximab (left), Paclitaxel (middle), and Gemcitabine (right). Red bars represent TRANSPIRE-DRP, dark gray bars represent PDX baseline models (MDD and ToAlign), and light gray bars represent cell line-based state-of-the-art models (TRANSACT, Velodrome, CODE-AE, and WISER). Error bars indicate standard deviation across 5-fold cross-validation
Beyond the discriminative performance metrics, we further evaluated the clinical applicability of TRANSPIRE-DRP. Calibration analysis showed that the Cetuximab and Paclitaxel models produced reasonably well-calibrated predictions close to the ideal diagonal, while the Gemcitabine model exhibited more pronounced deviations, consistent with the constraints imposed by limited training data (Supplementary Fig. 4B). To establish clinically actionable decision boundaries, we performed systematic threshold optimization by maximizing the F1 score, which balances the competing demands of sensitivity and precision in identifying treatment-responsive patients. This optimization yielded drug-specific decision thresholds: Cetuximab (threshold = 0.50593716, F1 = 0.6327), Paclitaxel (threshold = 0.5108448, F1 = 0.6790), and Gemcitabine (threshold = 0.5000005, F1 = 0.6276) (Supplementary Fig. 4C). These results provide practical guidance for clinical implementation while remaining transparent about model confidence across different therapeutic contexts.
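The threshold search described above can be sketched as follows. This is a hedged illustration rather than the procedure used in the study: it assumes that every distinct predicted probability is tried as a candidate threshold and that ties in F1 are resolved toward the lowest threshold, neither of which is stated in the text. The function names are hypothetical.

```python
def f1_at_threshold(labels, probs, threshold):
    """F1 score when samples with probability >= threshold are called responders."""
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for y, yhat in zip(labels, preds) if y == 1 and yhat == 1)
    fp = sum(1 for y, yhat in zip(labels, preds) if y == 0 and yhat == 1)
    fn = sum(1 for y, yhat in zip(labels, preds) if y == 1 and yhat == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f1_threshold(labels, probs):
    """Return (threshold, F1) maximizing F1 over all observed probabilities."""
    candidates = sorted(set(probs))
    # max() keeps the first maximum, i.e. the lowest threshold among ties.
    best = max(candidates, key=lambda t: f1_at_threshold(labels, probs, t))
    return best, f1_at_threshold(labels, probs, best)


if __name__ == "__main__":
    # Toy labels and probabilities (made-up values).
    y = [1, 1, 0, 1, 0, 0]
    p = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
    print(best_f1_threshold(y, p))
```

In practice the threshold would be selected on data not used for the final performance estimate, since tuning and evaluating on the same samples gives an optimistic F1.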
In conclusion, these experiments support our hypothesis that PDX models provide superior translational relevance when coupled with appropriate domain adaptation mechanisms, and they indicate TRANSPIRE-DRP’s potential for precision oncology applications. Moreover, the framework’s data efficiency, reflected in strong performance with substantially smaller training datasets than those used by cell line-based approaches, suggests that it can scale as PDX biobanks expand globally.
Ablation study
To systematically evaluate the contribution of each architectural component within TRANSPIRE-DRP, we conducted comprehensive ablation studies across six model variants, each removing or modifying a specific module to assess its individual impact on predictive performance:
TRANSPIRE-DRP (w/o CG) that removes the cancer gene features from the input feature space, using only the transcriptomic profiles without incorporating curated oncogenic information.
TRANSPIRE-DRP (w/o AE) that removes the autoencoder pre-training phase, without performing robust shared representation extraction across domains.
TRANSPIRE-DRP (w/o FT) that freezes the pre-trained encoder parameters without fine-tuning, preventing adaptive parameter adjustment during the adaptation phase.
TRANSPIRE-DRP (w/o RS) that replaces the separate shared and private encoders with a single unified encoder, removing the explicit representation separation.
TRANSPIRE-DRP (w/o AL) that removes the discrimination loss used for domain alignment.
TRANSPIRE-DRP (w/o FL) that replaces the focal loss with standard binary cross-entropy loss, removing the class imbalance handling mechanism that down-weights well-classified examples and emphasizes hard negatives.
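To make the difference between the full model and the w/o FL variant concrete, the sketch below implements the binary focal loss of Lin et al. [30] for a single sample. The default values gamma = 2.0 and alpha = None are assumptions chosen for illustration; the hyperparameters used in TRANSPIRE-DRP are not stated in this section. Setting gamma = 0 with no alpha weighting recovers standard binary cross-entropy.

```python
import math


def binary_focal_loss(y_true, p_pred, gamma=2.0, alpha=None, eps=1e-12):
    """Focal loss for one sample: -alpha_t * (1 - p_t)**gamma * log(p_t).

    y_true is 0 or 1 and p_pred is the predicted probability of class 1.
    gamma and alpha are illustrative defaults, not values from the paper.
    """
    p = min(max(p_pred, eps), 1.0 - eps)  # clip to avoid log(0)
    p_t = p if y_true == 1 else 1.0 - p   # probability assigned to the true class
    if alpha is None:
        weight = 1.0
    else:
        weight = alpha if y_true == 1 else 1.0 - alpha
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = binary_focal_loss(1, 0.9)   # well-classified responder
    hard = binary_focal_loss(1, 0.1)   # misclassified responder
    bce_easy = binary_focal_loss(1, 0.9, gamma=0.0)
    print(easy, hard, bce_easy)
```

With gamma = 2, a responder predicted at 0.9 contributes about 1% of its cross-entropy loss, whereas one predicted at 0.1 retains 81% of it, which is how the loss shifts training toward hard examples.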
The ablation results demonstrated that each component contributes substantially to overall performance (Table 1; the values quoted below are averages across the three drugs). Replacing focal loss with standard binary cross-entropy loss (w/o FL) caused the most severe drop, with average AUC declining from 0.6303 to 0.5605 and average AUPR from 0.6917 to 0.6207, highlighting the value of focal loss in handling the class imbalance and sample difficulty variations characteristic of high-throughput drug screening datasets. Eliminating cancer gene features (w/o CG) also led to substantial degradation, with AUC dropping to 0.5665 and AUPR to 0.6472, underscoring the value of incorporating oncogenic prior knowledge in anticancer drug prediction. In addition, the variants w/o AE (AUC: 0.5950, AUPR: 0.6702) and w/o FT (AUC: 0.5883, AUPR: 0.6486) both exhibited compromised performance, indicating that both stages of our two-stage learning approach contribute to cross-domain transfer of drug response signals. The representation disentanglement mechanism, verified through cosine similarity analysis, achieved near-orthogonal separation between shared and private representations across all drug models (Supplementary Fig. 7), and its removal through unified encoding (w/o RS) resulted in substantial performance degradation (AUC: 0.5665, AUPR: 0.6427). Together, these findings indicate that explicit separation of domain-specific and domain-invariant features is important for effective cross-domain transfer. Moreover, removing the adversarial alignment component (w/o AL) also degraded performance (AUC: 0.5916, AUPR: 0.6459), confirming that explicit domain adaptation mechanisms contribute meaningfully to reliable predictions of clinical therapeutic sensitivity.
These results collectively demonstrate that TRANSPIRE-DRP’s superior performance emerges from the synergistic integration of all architectural components, with each module addressing specific challenges inherent in cross-domain DRP.
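The cosine similarity check mentioned in the ablation analysis can be illustrated with a short sketch. It assumes that the shared and private embeddings of each sample are compared pairwise and summarized by the mean absolute cosine similarity; the exact summary statistic used for Supplementary Fig. 7 is not given here, and the function names are hypothetical. Values near 0 indicate near-orthogonal representations.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def mean_abs_cosine(shared, private):
    """Mean |cosine| over paired per-sample shared and private embeddings."""
    if len(shared) != len(private) or not shared:
        raise ValueError("expected two non-empty lists of equal length")
    sims = [abs(cosine_similarity(s, p)) for s, p in zip(shared, private)]
    return sum(sims) / len(sims)


if __name__ == "__main__":
    # Toy embeddings (made-up values): each pair is exactly orthogonal.
    shared = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
    private = [[0.0, 1.0, 0.0], [0.0, 0.0, 3.0]]
    print(mean_abs_cosine(shared, private))
```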
Table 1.
Performance of TRANSPIRE-DRP and its variants
| Model | Cetuximab AUC | Cetuximab AUPR | Paclitaxel AUC | Paclitaxel AUPR | Gemcitabine AUC | Gemcitabine AUPR | Average AUC | Average AUPR |
|---|---|---|---|---|---|---|---|---|
| TRANSPIRE-DRP (w/o CG) | 0.5686 | 0.6670 | 0.5685 | 0.7260 | 0.5624 | 0.5486 | 0.5665 | 0.6472 |
| TRANSPIRE-DRP (w/o AE) | 0.6286 | 0.7296 | 0.5831 | 0.7503 | 0.5734 | 0.5306 | 0.5950 | 0.6702 |
| TRANSPIRE-DRP (w/o FT) | 0.6229 | 0.6731 | 0.5793 | 0.7377 | 0.5629 | 0.5352 | 0.5883 | 0.6486 |
| TRANSPIRE-DRP (w/o RS) | 0.5771 | 0.6686 | 0.5885 | 0.7470 | 0.5338 | 0.5124 | 0.5665 | 0.6427 |
| TRANSPIRE-DRP (w/o AL) | 0.6143 | 0.6628 | 0.6077 | 0.7492 | 0.5530 | 0.5258 | 0.5916 | 0.6459 |
| TRANSPIRE-DRP (w/o FL) | 0.5486 | 0.6100 | 0.6000 | 0.7417 | 0.5329 | 0.5104 | 0.5605 | 0.6207 |
| TRANSPIRE-DRP | **0.6800** | **0.7495** | **0.6221** | **0.7609** | **0.5888** | **0.5648** | **0.6303** | **0.6917** |
The highest score in each column is in bold
TRANSPIRE-DRP preserves biological heterogeneity while aligning PDX and patient features
To evaluate domain adaptation efficacy, we examined the learned feature representations through UMAP visualization, assessing whether TRANSPIRE-DRP effectively minimizes domain discrepancy while preserving biological characteristics. The original transcriptomic profiles exhibited pronounced domain separation, with PDX and patient samples forming discrete clusters (Fig. 2B). Following model training, TRANSPIRE-DRP successfully mitigated this domain divergence, achieving substantial integration of source and target domains in the learned embedding space (Fig. 4).
Fig. 4.
Visualization analysis of domain adaptation efficacy in TRANSPIRE-DRP. (A-C) UMAP visualization of learned feature representations of patient samples (circles) and PDX samples (triangles) after TRANSPIRE-DRP training for (A) Cetuximab, (B) Paclitaxel, and (C) Gemcitabine models. Each sample is colored according to the tissue of origin, including breast, colorectal, skin, lung, pancreas, and others
A compelling finding was the spontaneous emergence of tumor type organization within the aligned feature space. Despite the absence of histological labels during training, the learned representations self-organized into tumor type-specific clusters, indicating that TRANSPIRE-DRP captures intrinsic biological patterns rather than performing superficial domain matching. This unsupervised biological stratification demonstrates successful preservation of clinically relevant molecular heterogeneity while eliminating systematic disparities between preclinical and clinical contexts. Moreover, the quality of tumor type stratification correlated with training data availability across the three drug models. Paclitaxel, with the most robust training set that provided sufficient molecular diversity for pattern recognition, exhibited well-defined tumor type clusters with clear boundaries (Fig. 4B). Cetuximab demonstrated moderate clustering quality with distinguishable tumor type separation (Fig. 4A), while Gemcitabine showed less distinct patterns with increased tumor type intermixing (Fig. 4C), consistent with the relatively limited PDX training samples that constrained the model’s ability to establish robust tumor-specific signatures.
To complement the qualitative UMAP visualizations, we quantified the effectiveness of domain alignment in three ways. First, CORAL distance analysis showed substantial domain alignment: the distance between PDX and patient representations decreased from 0.0151 before alignment to 0.0036 for Cetuximab (76.18% reduction), 0.0048 for Paclitaxel (68.34% reduction), and 0.0024 for Gemcitabine (84.28% reduction) (Fig. 5A). Second, domain discriminator analysis on held-out validation sets confirmed effective adversarial training: prediction probabilities were centered around 0.5 with substantial variance (0.5342 ± 0.2544 for Cetuximab, 0.4981 ± 0.1754 for Paclitaxel, 0.5033 ± 0.1829 for Gemcitabine), indicating that the discriminator could not reliably identify the domain of origin (Fig. 5A). Finally, silhouette coefficient analysis demonstrated that tumor type clustering remained largely intact following domain alignment (Fig. 5B). Most cancer types maintained reasonable clustering structure, with the Paclitaxel model showing particularly robust preservation across all evaluated tumor types.
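As an illustration of the first analysis, the sketch below computes a CORAL distance in the form used by Deep CORAL, that is, the squared Frobenius norm of the difference between source and target feature covariance matrices divided by 4d², where d is the feature dimension. Whether TRANSPIRE-DRP uses exactly this normalization is an assumption, and the function names are hypothetical. Plain Python lists keep the example dependency-free; an array library would be the natural choice for real embeddings.

```python
def covariance(rows):
    """Sample covariance matrix (d x d) of a list of d-dimensional rows."""
    n, d = len(rows), len(rows[0])
    if n < 2:
        raise ValueError("at least two samples are required")
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def coral_distance(source, target):
    """Squared Frobenius distance between covariances, scaled by 1 / (4 d^2)."""
    d = len(source[0])
    cov_s, cov_t = covariance(source), covariance(target)
    sq_norm = sum((cov_s[i][j] - cov_t[i][j]) ** 2
                  for i in range(d) for j in range(d))
    return sq_norm / (4 * d * d)


def percent_reduction(before, after):
    """Relative decrease of a distance after alignment, in percent."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Toy 2-D features (made-up values) standing in for PDX and patient embeddings.
    pdx = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
    patients = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 4.0]]
    print(coral_distance(pdx, patients))
    print(round(percent_reduction(0.0151, 0.0036), 2))
```

Applied to the rounded Cetuximab distances quoted above, `percent_reduction` gives about 76.16%, close to the reported 76.18%, which was presumably computed from unrounded distances.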
Fig. 5.
Domain discrepancy reduction and cluster preservation after alignment with TRANSPIRE-DRP. (A) Domain alignment results for Cetuximab (left), Paclitaxel (middle), and Gemcitabine (right). Density plots and boxplots illustrate the prediction probability distributions of the domain discriminator, while pie charts indicate the percentage reduction in CORAL distance relative to pre-alignment (dark blue segments). (B) Radar plots showing cluster preservation ratio across five cancer types (breast, colorectal, skin, lung, pancreas) for Cetuximab (left), Paclitaxel (middle), and Gemcitabine (right). Points located outside the dark cyan dashed line (> 1) indicate tighter clustering of tumor types; points inside the dashed line but above zero (< 1) indicate preserved tumor-type clustering; and points below zero (displayed as 0) indicate cluster boundary erosion
These visualization and quantitative analyses show that TRANSPIRE-DRP achieves a critical balance: successful domain transfer without sacrificing biological relevance. The preservation of tumor-specific molecular signatures within the unified feature space indicates that the framework captures clinically meaningful heterogeneity while eliminating cross-platform systematic disparities, providing the foundation for reliable cross-domain DRP in precision oncology applications.
TRANSPIRE-DRP captures drug-specific pharmacological mechanisms
To elucidate the biological mechanisms underlying TRANSPIRE-DRP’s predictive performance, we conducted interpretability analysis using integrated gradients to quantify gene-level contributions to DRP. The top 50 contributory genes were identified for each drug model based on absolute gradient values, and pathway enrichment analysis was conducted to reveal the captured biological processes.
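The attribution step can be sketched on a toy model. The example below applies integrated gradients to a logistic model standing in for the trained network, using an all-zero baseline and a midpoint Riemann sum with 200 steps; the baseline, step count, model, and gene names are all assumptions made for illustration and are not taken from the paper.

```python
import math


def predict(weights, bias, x):
    """Toy stand-in for the trained predictor: logistic regression."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def gradient(weights, bias, x):
    """Gradient of the logistic output with respect to each input feature."""
    p = predict(weights, bias, x)
    return [p * (1.0 - p) * w for w in weights]


def integrated_gradients(weights, bias, x, baseline=None, steps=200):
    """Approximate IG_i = (x_i - b_i) * integral of dF/dx_i along the straight path."""
    if baseline is None:
        baseline = [0.0] * len(x)
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint rule
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(gradient(weights, bias, point)):
            totals[i] += g
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]


def top_k_genes(gene_names, attributions, k=50):
    """Genes ranked by absolute attribution, largest first."""
    ranked = sorted(zip(gene_names, attributions), key=lambda pair: -abs(pair[1]))
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    genes = ["GENE_A", "GENE_B", "GENE_C"]   # hypothetical gene names
    w, b = [2.0, -1.0, 0.0], 0.1             # made-up model parameters
    x = [1.0, 0.5, 3.0]                      # made-up expression profile
    attr = integrated_gradients(w, b, x)
    print(top_k_genes(genes, attr, k=2))
```

A useful sanity check is the completeness property: the attributions should sum approximately to the difference between the prediction at the input and at the baseline.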
The Cetuximab model exhibited enrichment patterns dominated by core oncogenic networks, including pathways in cancer, microRNAs in cancer, and Wnt signaling pathway (Fig. 6A). The enrichment of pathways in cancer captures a fundamental therapeutic challenge where Cetuximab effectively neutralizes EGFR-mediated signaling, yet tumor cells frequently circumvent this blockade through activation of parallel oncogenic cascades. Notably, the prominent Wnt signaling enrichment reflects well-documented EGFR-Wnt crosstalk mechanisms, as converging EGFR and Wnt pathways can synergistically promote tumorigenesis and therapeutic resistance [41]. Moreover, the enrichment of microRNA-related pathways indicates capture of post-transcriptional regulatory networks that modulate anti-EGFR therapy outcomes, highlighting the multi-layered molecular determinants underlying therapeutic response.
Fig. 6.
Pathway enrichment analysis of top contributory genes identified by TRANSPIRE-DRP. (A-C) Heatmaps showing pathway enrichment patterns (rows) for the top 50 most contributory genes (columns) identified by integrated gradients for the TRANSPIRE-DRP models of (A) Cetuximab, (B) Paclitaxel, and (C) Gemcitabine. Both pathways and genes are ordered by frequency of appearance across enrichment results. Color intensity reflects gene attribution rank to model prediction
The Paclitaxel model demonstrated mechanistically coherent enrichment across multiple layers of drug action (Fig. 6B). Specifically, the enrichment of cellular response to organic cyclic compound directly corresponds to recognition of Paclitaxel’s complex tetracyclic diterpenoid structure [42]. Furthermore, the identification of negative regulation of cell population proliferation precisely reflects the drug’s primary mechanism of mitotic arrest and cell death induction [42]. Additionally, the enriched regulation of cytoskeleton organization captures comprehensive cellular structural responses, encompassing not only microtubule dynamics but also broader cytoskeletal networks that collectively determine treatment sensitivity through coordinated structural adaptations [43]. This comprehensive capture of chemical recognition, direct pharmacological action, and downstream cellular responses illustrates TRANSPIRE-DRP’s ability to decipher multi-dimensional aspects of drug-cell interactions.
The Gemcitabine model revealed a distinct immunomodulatory profile reflecting this nucleoside analog’s dual therapeutic mechanism (Fig. 6C). The enrichment of NF-κB signaling pathway and cellular response to cytokine stimulus represents a mechanistically coherent signature capturing Gemcitabine’s ability to activate inflammatory responses beyond primary cytotoxic effects. Specifically, the prominent NF-κB pathway enrichment reflects the drug’s capacity to trigger this master inflammatory regulator, which coordinates immune responses against tumor cells [44]. The concurrent cellular response to cytokine stimulus enrichment indicates capture of the inflammatory cascade where activated NF-κB triggers cytokine production that enhances immune-mediated tumor clearance [45, 46].
The comprehensive analysis demonstrates that TRANSPIRE-DRP captures interpretable features highly consistent with established pharmacological knowledge for each therapeutic agent. Notably, the enrichment of immune and inflammatory signatures suggests that our domain adaptation process may help recover certain microenvironmental characteristics that are incomplete in PDX models, enhancing the biological relevance of predictions for clinical translation.
Unsupervised predictions recapitulate established clinical drug-cancer associations
To assess the clinical translational potential of TRANSPIRE-DRP, we evaluated whether the framework could recapitulate established drug-cancer type associations using the TCGA patient samples that lack drug response records. We applied the trained models to generate response predictions for all of these samples, stratified patients by predicted probability into sensitive (top 10%) and resistant (bottom 10%) cohorts, and performed cancer type enrichment analysis using Fisher’s exact test to identify tumor types significantly over-represented in each response category.
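The stratification and enrichment procedure can be sketched as follows. The example assumes a one-sided Fisher’s exact test for over-representation and a 2 × 2 table that contrasts the selected cohort with all remaining samples; the sidedness and the exact table construction used in the study are not stated here, and the function names are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p-value P(X >= a) for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    numerator = sum(comb(col1, k) * comb(n - col1, row1 - k)
                    for k in range(a, upper + 1))
    return numerator / comb(n, row1)


def split_top_bottom(scores, fraction=0.10):
    """Indices of the top and bottom `fraction` of samples by predicted score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    k = max(1, int(len(scores) * fraction))
    return order[-k:], order[:k]  # (sensitive, resistant)


def enrichment_p(cancer_types, group_idx, cancer_type):
    """P-value for over-representation of `cancer_type` within the selected group."""
    group = set(group_idx)
    a = sum(1 for i in group if cancer_types[i] == cancer_type)
    b = len(group) - a
    c = sum(1 for i, t in enumerate(cancer_types)
            if t == cancer_type and i not in group)
    d = len(cancer_types) - a - b - c
    return fisher_exact_greater(a, b, c, d)


if __name__ == "__main__":
    # Toy cohort of 20 samples (made-up scores and labels).
    scores = [i / 20 for i in range(20)]
    types = ["GBM" if i >= 18 else "KIRP" for i in range(20)]
    sensitive, resistant = split_top_bottom(scores)
    print(enrichment_p(types, sensitive, "GBM"))
```

In this toy cohort both top-scoring samples are GBM, so the enrichment p-value is 1/190; with real data, each cancer type would be tested separately in the sensitive and resistant cohorts.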
The analysis revealed cancer type-specific response patterns with remarkable concordance to established clinical knowledge across all three drugs (Fig. 7). For Cetuximab, the EGFR-targeting monoclonal antibody, glioblastoma multiforme (GBM) demonstrated significant therapeutic sensitivity (P value = 0.0414), which aligns with clinical observations that approximately 50% of glioblastomas overexpress EGFR [47, 48] and with ongoing research into blood-brain barrier penetration strategies for anti-EGFR therapeutics [49]. Conversely, kidney renal papillary cell carcinoma (KIRP) and diffuse large B-cell lymphoma (DLBC) showed pronounced therapeutic resistance (P value = 0.0076 and 0.0159, respectively), consistent with their independence from EGFR-driven oncogenic pathways. Similarly, Paclitaxel predictions demonstrated clinically coherent patterns, with stomach adenocarcinoma (STAD) showing significant treatment sensitivity (P value = 0.0255), directly reflecting current therapeutic guidelines where taxane-based regimens constitute standard-of-care treatment for advanced gastric cancer [50, 51]. The predicted marked treatment resistance in sarcoma (SARC) and mesothelioma (MESO) (P value = 0.0461 and 0.0215, respectively) corroborates extensive clinical experience documenting these malignancies’ refractoriness to conventional chemotherapy [52–55]. Additionally, Gemcitabine predictions showed notable treatment sensitivity in lung adenocarcinoma (LUAD) and prostate adenocarcinoma (PRAD) (both P value = 0.0190), consistent with the established clinical efficacy of Gemcitabine-based combinations in advanced non-small cell lung cancer [56, 57] and castration-resistant prostate cancer [58].
Conversely, the predicted therapeutic resistance in kidney renal clear cell carcinoma (KIRC) (P value = 0.0467) aligns with this tumor type’s well-documented resistance to conventional cytotoxic chemotherapy, where Gemcitabine demonstrated markedly inferior efficacy compared to targeted therapies and immunotherapies [59].
Fig. 7.
Cancer type enrichment analysis reveals clinically validated drug-cancer associations predicted by TRANSPIRE-DRP. (A-C) Bar plots showing cancer type enrichment in drug-sensitive (red bars) versus drug-resistant (blue bars) patient cohorts for (A) Cetuximab, (B) Paclitaxel, and (C) Gemcitabine. For each drug, unlabeled TCGA patient samples were stratified by TRANSPIRE-DRP prediction scores into sensitive (top 10%) and resistant (bottom 10%) groups. Cancer type enrichment was calculated using Fisher’s exact test. The x-axis represents -log10(P value), with the vertical dashed line indicating P value = 0.05 significance threshold. Abbreviations for cancer types are listed in Supplementary Table 2
The spontaneous emergence of these clinically validated drug-cancer associations through purely unsupervised domain adaptation—achieved without explicit incorporation of clinical knowledge or therapeutic guidelines—provides compelling evidence for TRANSPIRE-DRP’s ability to capture authentic biological relationships underlying drug response mechanisms. However, several well-established therapeutic relationships remained undetected, such as Gemcitabine’s role as backbone therapy for pancreatic adenocarcinoma [60, 61], reflecting the complexity of real-life clinical prediction where outcomes are influenced by multifactorial determinants, including tumor heterogeneity, patient-specific pharmacokinetics, and drug resistance mechanisms that extend beyond transcriptomic signatures alone.
Conclusions
This study introduces TRANSPIRE-DRP, a novel deep learning framework that addresses the critical translational gap in precision oncology by leveraging PDX models for reliable prediction of clinical drug responses. Through comprehensive molecular similarity analyses, we first established that PDX models demonstrate markedly superior biological fidelity to patient tumors compared to conventional cancer cell lines, which suffer from diminished tumor heterogeneity, loss of three-dimensional architecture, and selection for rapid proliferation characteristics during extended cultivation [36, 62], providing compelling rationale for their integration into computational pipelines. However, the computational landscape for PDX-based clinical drug efficacy prediction remains remarkably underdeveloped, representing a significant missed opportunity given its enhanced biological relevance. Through the integration of unsupervised representation learning and adversarial domain adaptation, TRANSPIRE-DRP successfully bridges the molecular gap between PDX models and patient samples while preserving therapeutically relevant signals, achieving superior predictive performance across multiple therapeutic agents. Beyond predictive accuracy, TRANSPIRE-DRP exhibits remarkable biological interpretability. The framework spontaneously recapitulates established drug-cancer type associations without explicit incorporation of histological annotations, while capturing mechanistically coherent pathway signatures that align precisely with known pharmacological mechanisms. Therefore, this work ultimately demonstrates that integrating PDX models with sophisticated domain adaptation strategies can bridge the translational gap in pharmacogenomics, moving the field closer to truly personalized therapeutic decision-making based on faithful molecular evidence.
Additionally, the superior data efficiency demonstrated by TRANSPIRE-DRP—achieving robust performance with substantially smaller training datasets compared to cell line-based approaches—positions this framework as particularly valuable for precision medicine implementation. However, several limitations warrant consideration. The scalability challenge across broader drug portfolios represents a significant constraint, requiring substantially larger PDX biobanks with standardized experimental protocols for comprehensive drug library coverage. Furthermore, PDX models inherently lack human immune components, limiting applicability to immunotherapeutics where immune-tumor interactions are critical determinants of response. The domain adaptation approach may also not fully capture complex tumor microenvironmental factors present in clinical settings, including stromal heterogeneity and drug delivery constraints.
Future research should prioritize developing humanized PDX models with reconstituted immune systems to enable immunotherapy prediction capabilities [63, 64]. Additionally, integrating multi-modal data beyond transcriptomics and establishing standardized screening protocols across institutions would facilitate the larger, more diverse datasets necessary for broad-spectrum drug response modeling. Prospective clinical validation studies will be essential to establish real-world utility in guiding therapeutic decision-making. As PDX biobanks continue expanding globally and humanized PDX models progress, these developments provide compelling opportunities for addressing current limitations while establishing TRANSPIRE-DRP as a forward-compatible solution for precision oncology.
Acknowledgements
Not applicable.
Author contributions
Conceptualization, J.Y. and X.N.; data curation, J.Y. and F.X.; formal analysis, J.Y.; funding acquisition, X.N. and J.G.; investigation, J.Y. and X.N.; methodology, J.Y.; project administration, J.Y. and X.N.; software, J.Y.; supervision, X.N. and J.G.; validation, J.Y. and T.W.; visualization, J.Y.; writing–original draft, J.Y. and T.W.; writing–review & editing, J.Y., T.W., X.N., and J.G. All authors read and approved the final manuscript.
Funding
This research was funded by the Hubei Provincial Natural Science Foundation and Traditional Chinese Medicine Innovation and Development Joint Foundation of China (2024AFD228 to X.N.), the Biological Breeding-National Science and Technology Major Project (2023ZD0404702 to X.N.), the Fundamental Research Funds for the Central Universities (2662024XXPY002 to J.G.), and the Huazhong Agricultural University Scientific & Technological Self-innovation Foundation (11041810351 to J.G.).
Data availability
Datasets and source code are publicly available at https://github.com/YJY-98/TRANSPIRE-DRP.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Jianye Yang and Tian Wu are joint authors.
Contributor Information
Jing Gong, Email: gong.jing@mail.hzau.edu.cn.
Xiaohui Niu, Email: niuxiaoh@mail.hzau.edu.cn.
References
- 1. Bray F, Laversanne M, Sung HYA, Ferlay J, Siegel RL, Soerjomataram I, Jemal A. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2024;74:229–63.
- 2. Haslem DS, Chakravarty I, Fulde G, Gilbert H, Tudor BP, Lin K, Ford JM, Nadauld LD. Precision oncology in advanced cancer patients improves overall survival with lower weekly healthcare costs. Oncotarget. 2018;9:12316–22.
- 3. Pal SK, Miller MJ, Agarwal N, Chang SM, Chavez-MacGregor M, Cohen E, Cole S, Dale W, Magid Diefenbach CS, Disis ML, et al. Clinical cancer advances 2019: annual report on progress against cancer from the American society of clinical oncology. J Clin Oncol. 2019;37:834–49.
- 4. Marquart J, Chen EY, Prasad V. Estimation of the percentage of US patients with cancer who benefit from Genome-Driven oncology. JAMA Oncol. 2018;4:1093–8.
- 5. Adam G, Rampasek L, Safikhani Z, Smirnov P, Haibe-Kains B, Goldenberg A. Machine learning approaches to drug response prediction: challenges and recent progress. Npj Precision Oncol. 2020;4:19.
- 6. Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, Wilson CJ, Lehár J, Kryukov GV, Sonkin D, et al. The cancer cell line encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 2012;483:603–7.
- 7. Iorio F, Knijnenburg TA, Vis DJ, Bignell GR, Menden MP, Schubert M, Aben N, Gonçalves E, Barthorpe S, Lightfoot H, et al. A landscape of Pharmacogenomic interactions in cancer. Cell. 2016;166:740–54.
- 8. Seashore-Ludlow B, Rees MG, Cheah JH, Cokol M, Price EV, Coletti ME, Jones V, Bodycombe NE, Soule CK, Gould J, et al. Harnessing connectivity in a Large-Scale Small-Molecule sensitivity dataset. Cancer Discov. 2015;5:1210–23.
- 9. Ammad-ud-din M, Khan SA, Malani D, Murumägi A, Kallioniemi O, Aittokallio T, Kaski S. Drug response prediction by inferring pathway-response associations with kernelized bayesian matrix factorization. Bioinformatics. 2016;32:455–63.
- 10. Wang L, Li XZ, Zhang LX, Gao Q. Improved anticancer drug response prediction in cell lines using matrix factorization with similarity regularization. BMC Cancer. 2017;17:1–12.
- 11. Meybodi FY, Eslahchi C. Predicting anti-cancer drug response by finding optimal subset of drugs. Bioinformatics. 2021;37:4509–16.
- 12. Zhang NQ, Wang HY, Fang Y, Wang J, Zheng XQ, Liu XS. Predicting anticancer drug responses using a Dual-Layer integrated cell Line-Drug network model. PLoS Comput Biol. 2015;11:e1004498.
- 13. Lind AP, Anderson PC. Predicting drug activity against cancer cells by random forest models based on minimal genomic information and chemical properties. PLoS ONE. 2019;14:e0219774.
- 14. Gerdes H, Casado P, Dokal A, Hijazi M, Akhtar N, Osuntola R, Rajeeve V, Fitzgibbon J, Travers J, Britton D, et al. Drug ranking using machine learning systematically predicts the efficacy of anti-cancer drugs. Nat Commun. 2021;12:1850.
- 15. Lawrence PJ, Burns B, Ning X. Enhancing drug and cell line representations via contrastive learning for improved anti-cancer drug prioritization. Npj Precision Oncol. 2024;8:106.
- 16. Liu Q, Hu ZQ, Jiang R, Zhou M. DeepCDR: a hybrid graph convolutional network for predicting cancer drug response. Bioinformatics. 2020;36:I911–8.
- 17. Moreno L, Pearson ADJ. How can attrition rates be reduced in cancer drug discovery? Expert Opin Drug Discov. 2013;8:363–8.
- 18. Goto T. Patient-Derived tumor xenograft models: toward the establishment of precision cancer medicine. J Personalized Med. 2020;10:64.
- 19. Blanchard Z, Brown EA, Ghazaryan A, Welm AL. PDX models for functional precision oncology and discovery science. Nat Rev Cancer. 2025;25:153–66.
- 20. Hidalgo M, Bruckheimer E, Rajeshkumar NV, Garrido-Laguna I, De Oliveira E, Rubio-Viqueira B, Strawn S, Wick MJ, Martell J, Sidransky D. A pilot clinical study of treatment guided by personalized tumorgrafts in patients with advanced cancer. Mol Cancer Ther. 2011;10:1311–6.
- 21. Liu YH, Wu WT, Cai CJ, Zhang H, Shen H, Han Y. Patient-derived xenograft models in cancer therapy: technologies and applications. Signal Transduct Target Therapy. 2023;8:160.
- 22. Izumchenko E, Paz K, Ciznadija D, Sloma I, Katz A, Vasquez-Dunddel D, Ben-Zvi I, Stebbing J, McGuire W, Harris W, et al. Patient-derived xenografts effectively capture responses to oncology therapy in a heterogeneous cohort of patients with solid tumors. Ann Oncol. 2017;28:2595–605.
- 23. Ledford H. US cancer Institute to overhaul tumour cell lines. Nature. 2016;530:391.
- 24. Gao H, Korn JM, Ferretti S, Monahan JE, Wang YZ, Singh M, Zhang C, Schnell C, Yang GZ, Zhang Y, et al. High-throughput screening using patient-derived tumor xenografts to predict clinical trial drug response. Nat Med. 2015;21:1318–25.
- 25. Kim Y, Kim D, Cao BW, Carvajal R, Kim M. PDXGEM: patient-derived tumor xenograft-based gene expression model for predicting clinical response to anticancer therapy in cancer patients. BMC Bioinformatics. 2020;21:288.
- 26. Kim J, Park SH, Lee H. PANCDR: precise medicine prediction using an adversarial network for cancer drug response. Brief Bioinform. 2024;25:bbae088.
- 27. Mourragui SMC, Loog M, Vis DJ, Moore K, Manjon AG, van de Wiel MA, Reinders MJT, Wessels LFA. Predicting patient response with models trained on cell lines and patient-derived xenografts by nonlinear transfer learning. Proc Natl Acad Sci USA. 2021;118:e2106682118.
- 28. He D, Liu Q, Wu Y, Xie L. A context-aware deconfounding autoencoder for robust prediction of personalized clinical drug response from cell-line compound screening. Nat Mach Intell. 2022;4:879–92.
- 29. Bousmalis K, Trigeorgis G, Silberman N, Krishnan D, Erhan D. Domain separation networks. In Proceedings of the 30th International Conference on Neural Information Processing Systems; Barcelona, Spain. Curran Associates Inc.; 2016: 343–351.
- 30. Lin TY, Goyal P, Girshick R, He KM, Dollár P. Focal loss for dense object detection. IEEE Trans Pattern Anal Mach Intell. 2020;42:318–27.
- 31.Ganin Y, Lempitsky V. Unsupervised domain adaptation by backpropagation. In Proceedings of the 32nd International Conference on Machine Learning; Lille, France. JMLR.org; 2015: 1180–1189.
- 32.Weinstein JN, Collisson EA, Mills GB, Shaw KRM, Ozenberger BA, Ellrott K, Shmulevich I, Sander C, Stuart JM, Network CGAR. The cancer genome atlas Pan-Cancer analysis project. NAT GENET. 2013;45:1113–20. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 33.Yang JY, Fu HT, Xue FY, Li ML, Wu YY, Yu ZH, Luo HH, Gong J, Niu XH, Zhang W. Multiview representation learning for identification of novel cancer genes and their causative biological mechanisms. Brief Bioinform. 2024;25:bbae418. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Therasse P, Arbuck SG, Eisenhauer EA, Wanders J, Kaplan RS, Rubinstein L, Verweij J, Van Glabbeke M, van Oosterom AT, Christian MC, Gwyther SG. New guidelines to evaluate the response to treatment in solid tumors. J Natl Cancer Inst. 2000;92:205–16. [DOI] [PubMed] [Google Scholar]
- 35.Sun B, Feng J, Saenko K. Return of frustratingly easy domain adaptation. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence; Phoenix, Arizona. AAAI Press; 2016: 2058–2065.
- 36.Ben-David U, Siranosian B, Ha G, Tang H, Oren Y, Hinohara K, Strathdee CA, Dempster J, Lyons NJ, Burns R, et al. Genetic and transcriptional evolution alters cancer cell line drug response. Nature. 2018;560:325–30. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Sharifi-Noghabi H, Harjandi PA, Zolotareva O, Collins CC, Ester M. Out-of-distribution generalization from labelled and unlabelled gene expression data for drug response prediction. Nat Mach Intell. 2021;3:962–72. [Google Scholar]
- 38.Shubham K, Jayagopal A, Danish SM, AP P, Rajan V. WISER: Weak supervision and supervised representation learning to improve drug response prediction in cancer. In Proceedings of the 41st International Conference on Machine Learning; Vienna, Austria. JMLR.org; 2024: 45228–45243.
- 39.Zhang Y, Liu T, Long M, Jordan M. Bridging theory and algorithm for domain adaptation. In Proceedings of the 36th International Conference on Machine Learning; Proceedings of Machine Learning Research. Edited by Kamalika C, Ruslan S. PMLR; 2019: 7404–7413.
- 40.Wei G, Lan C, Zeng W, Zhang Z, Chen Z. ToAlign: task-oriented alignment for unsupervised domain adaptation. In Proceedings of the 35th International Conference on Neural Information Processing Systems. Curran Associates Inc.; 2021: 13834–13846.
- 41.Hu TH, Li CX. Convergence between Wnt-β-catenin and EGFR signaling in cancer. MOL CANCER. 2010;9:236. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Sharifi-Rad J, Quispe C, Patra JK, Singh YD, Panda MK, Das G, Adetunji CO, Michael OS, Sytar O, Polito L et al. Paclitaxel: application in modern oncology and nanomedicine-based cancer therapy. Oxid Med Cell Longev 2021;2021:3687700. [DOI] [PMC free article] [PubMed]
- 43.Tommasi S, Mangia A, Lacalamita R, Bellizzi A, Fedele V, Chiriatti A, Thomssen C, Kendzierski N, Latorre A, Lorusso V, et al. Cytoskeleton and Paclitaxel sensitivity in breast cancer: the role of β-tubulins. INT J CANCER. 2007;120:2078–85. [DOI] [PubMed] [Google Scholar]
- 44.Arlt A, Gehrz A, Müerköster S, Vorndamm J, Kruse ML, Fölsch UR, Schäfer H. Role of NF-κB and Akt/PI3K in the resistance of pancreatic carcinoma cell lines against gemcitabine-induced cell death. Oncogene. 2003;22:3243–51. [DOI] [PubMed] [Google Scholar]
- 45.Piadel K, Dalgleish AG, Smith PL. Gemcitabine in the era of cancer immunotherapy. J Clin Haematol. 2020;1:107–20. [Google Scholar]
- 46.Nemati M, Hsu CY, Nathiya D, Kumar MR, Oghenemaro EF, Kariem M, Kaur P, Bhanot D, Hjazi A, Saedi TA. Gemcitabine: Immunomodulatory or immunosuppressive role in the tumor microenvironment. FRONT IMMUNOL. 2025;16:1536428. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Gan HK, Cvrljevic AN, Johns TG. The epidermal growth factor receptor variant III (EGFRvIII): where wild things are altered. FEBS J. 2013;280:5350–70. [DOI] [PubMed] [Google Scholar]
- 48.Harari PM. Epidermal growth factor receptor Inhibition strategies in oncology. ENDOCR RELAT CANCER. 2004;11:689–708. [DOI] [PubMed] [Google Scholar]
- 49.Porret E, Kereselidze D, Dauba A, Schweitzer-Chaput A, Jegot B, Selingue E, Tournier N, Larrat B, Novell A, Truillet C. Refining the delivery and therapeutic efficacy of cetuximab using focused ultrasound in a mouse model of glioblastoma: an Zr-cetuximab ImmunoPET study. EUR J PHARM BIOPHARM. 2023;182:141–51. [DOI] [PubMed] [Google Scholar]
- 50.Lordick F, Carneiro F, Cascinu S, Fleitas T, Haustermans K, Piessen G, Vogel A, Smyth EC, Committee EG. Gastric cancer: ESMO clinical practice guideline for diagnosis, treatment and follow-up. Ann Oncol. 2022;33:1005–20. [DOI] [PubMed] [Google Scholar]
- 51.Kruijtzer CMF, Boot H, Beijnen JH, Lochs HL, Parnis FX, Planting AST, Pelgrims JMG, Williams R, Mathôt RAA, Rosing H, et al. Weekly oral Paclitaxel as first-line treatment in patients with advanced gastric cancer. Ann Oncol. 2003;14:197–204. [DOI] [PubMed] [Google Scholar]
- 52.Rieth J, Monga V, Milhem M. The decline and fall of the current chemotherapy paradigm in soft tissue sarcoma. Cancers. 2025;17:1203. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Casper ES, Waltzman RJ, Schwartz GK, Sugarman A, Pfister D, Ilson D, Woodruff J, Leung D, Bertino JR. Phase II trial of Paclitaxel in patients with soft-tissue sarcoma. CANCER INVEST. 1998;16:442–6. [DOI] [PubMed] [Google Scholar]
- 54.Tomek S, Emri S, Krejcy K, Manegold C. Chemotherapy for malignant pleural mesothelioma: past results and recent developments. BR J CANCER. 2003;88:167–74. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.vanMeerbeeck J, Debruyne C, vanZandwijk N, Postmus PE, Pennucci MC, vanBreukelen F, Galdermans D, Groen H, Pinson P, vanGlabbeke M, et al. Paclitaxel for malignant pleural mesothelioma: A phase II study of the EORTC lung cancer cooperative group. BR J CANCER. 1996;74:961–3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 56.Sandler AB, Nemunaitis J, Denham C, von Pawel J, Cormier Y, Gatzemeier U, Mattson K, Manegold C, Palmer MC, Gregor A, et al. Phase III trial of gemcitabine plus cisplatin versus cisplatin alone in patients with locally advanced or metastatic non-small-cell lung cancer. J Clin Oncol. 2000;18:122–30. [DOI] [PubMed] [Google Scholar]
- 57.Crinò L, Scagliotti GV, Ricci S, De Marinis F, Rinaldi M, Gridelli C, Ceribelli A, Bianco R, Marangolo M, Di Costanzo F, et al. Gemcitabine and cisplatin versus mitomycin, ifosfamide, and cisplatin in advanced non-small-cell lung cancer: A randomized phase III study of the Italian lung cancer project. J Clin Oncol. 1999;17:3522–30. [DOI] [PubMed] [Google Scholar]
- 58.Lee JL, Ahn JH, Choi MK, Kim Y, Hong SW, Lee KH, Jeong IG, Song C, Hong BS, Hong JH, Ahn H. Gemcitabine-oxaliplatin plus prednisolone is active in patients with castration-resistant prostate cancer for whom docetaxel-based chemotherapy failed. BR J CANCER. 2014;110:2472–8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 59.Motzer RJ, Jonasch E, Agarwal N, Alva A, Baine M, Beckermann K, Carlo MI, Choueiri TK, Costello BA, Derweesh IH, et al. Kidney Cancer, version 3.2022, NCCN clinical practice guidelines in oncology. J NATL COMPR CANC NETW. 2022;20:71–90. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 60.Von Hoff DD, Ervin T, Arena FP, Chiorean EG, Infante J, Moore M, Seay T, Tjulandin SA, Ma WW, Saleh MN, et al. Increased survival in pancreatic cancer with nab-Paclitaxel plus gemcitabine. N ENGL J MED. 2013;369:1691–703. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 61.Burris HA, Moore MJ, Andersen J, Green MR, Rothenberg ML, Madiano MR, Cripps MC, Portenoy RK, Storniolo AM, Tarassoff P, et al. Improvements in survival and clinical benefit with gemcitabine as first-line therapy for patients with advanced pancreas cancer: A randomized trial. J Clin Oncol. 1997;15:2403–13. [DOI] [PubMed] [Google Scholar]
- 62.Feng FYM, Shen BH, Mou XQ, Li YX, Li H. Large-scale Pharmacogenomic studies and drug response prediction for personalized cancer medicine. J Genet Genomics. 2021;48:540–51. [DOI] [PubMed] [Google Scholar]
- 63.Scherer SD, Riggio A, Haroun F, DeRose YS, Ekiz HA, Fujita M, Toner J, Zhao L, Li ZQ, Oesterreich S, et al. An immune-humanized patient-derived xenograft model of estrogen-independent, hormone receptor positive metastatic breast cancer. BREAST CANCER RES. 2021;23:100. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 64.Chuprin J, Buettner H, Seedhom MO, Greiner DL, Keck JG, Ishikawa F, Shultz LD, Brehm MA. Humanized mouse models for immuno-oncology research. Nat Reviews Clin Oncol. 2023;20:192–206. [DOI] [PMC free article] [PubMed] [Google Scholar]
Data Availability Statement
Datasets and source code are publicly available at https://github.com/YJY-98/TRANSPIRE-DRP.


















