Skip to main content
Molecular Therapy. Nucleic Acids logoLink to Molecular Therapy. Nucleic Acids
. 2020 Jul 10;21:676–686. doi: 10.1016/j.omtn.2020.07.003

An Improved Anticancer Drug-Response Prediction Based on an Ensemble Method Integrating Matrix Completion and Ridge Regression

Chuanying Liu 1, Dong Wei 1, Ju Xiang 2,3, Fuquan Ren 1, Li Huang 4, Jidong Lang 5, Geng Tian 5, Yushuang Li 1,, Jialiang Yang 2,5,∗∗
PMCID: PMC7403773  PMID: 32759058

Abstract

In this study, we proposed an ensemble learning method, simultaneously integrating a low-rank matrix completion model and a ridge regression model to predict anticancer drug response on cancer cell lines. The model was applied to two benchmark datasets, including the Cancer Cell Line Encyclopedia (CCLE) and the Genomics of Drug Sensitivity in Cancer (GDSC). As previous studies suggest, the dual-layer integrated cell line-drug network model was one of the best models by far and outperformed most state-of-the-art models. Thus, we performed a head-to-head comparison between the dual-layer integrated cell line-drug network model and our model by a 10-fold crossvalidation study. For the CCLE dataset, our model has a higher Pearson correlation coefficient between predicted and observed drug responses than that of the dual-layer integrated cell line-drug network model in 18 out of 23 drugs. For the GDSC dataset, our model is better in 26 out of 28 drugs in the phosphatidylinositol 3-kinase (PI3K) pathway and 26 out of 30 drugs in the extracellular signal-regulated kinase (ERK) signaling pathway, respectively. Based on the prediction results, we carried out two types of case studies, which further verified the effectiveness of the proposed model on the drug-response prediction. In addition, our model is more biologically interpretable than the compared method, since it explicitly outputs the genes involved in the prediction, which are enriched in functions, like transcription, Src homology 2/3 (SH2/3) domain, cell cycle, ATP binding, and zinc finger.

Keywords: anticancer drug response, functional biomarker, ensemble learning method, matrix completion, ridge regression

Graphical Abstract

graphic file with name fx1.jpg


This study proposed an ensemble learning method, simultaneously integrating a low-rank matrix completion model and a ridge regression model to predict anticancer drug response on cancer cell lines. The model was effectively applied to two benchmark datasets. It also explicitly outputs the genes involved in the prediction, which are enriched in functions.

Introduction

Due to its direct impact on the therapy of cancer patients, the accurate assessment of the anticancer drug response of an individual is essential for precision medicine. It has been reported that many genomic biomarkers are closely related to drug response, and patients sometimes benefit from these biomarkers in clinical trials.1,2 However, human tissue samples are usually heterogeneous in cellular compositions,3 which makes it difficult to predict drug response at the tissue level. A compromised strategy is to study drug response at the cancer cell line level. During the past decade, many large-scale drug screenings have been performed on cultured human cancer cell line panels, including the National Cancer Institute (NCI)-60 study,4 connectivity map,5 Cancer Cell Line Encyclopedia (CCLE),6 Genomics of Drug Sensitivity in Cancer (GDSC) project,7 and more recently, Cancer Target Discovery and Development (CTD2) Project.8,9

These large compendia of genomic and pharmacological data have provided a basis as well as an unprecedented opportunity to infer functional biomarkers associated with drug sensitivity. For example, Geeleher et al.10 studied variability in general levels of drug sensitivity in preclinical models to improve cancer biomarker discovery. By studying the expression patterns of 48 ATP-binding cassette (ABC) transporters on 60 cancer cell lines during the treatment of 1,429 anticancer drugs, Szakács et al.11 confirmed the correlation between ABC transporter genes and the response of cytotoxic drugs in cancer cell lines. Through systematic analyses on 130 anticancer drugs and 639 human tumor cell lines, Garnett et al.7 identified several fusion genes, like EWS-FLI1, to be poly (ADP-ribose) polymerase (PARP) inhibitors. We applied a modified linear regression model to the CCLE data and identified several to hundreds of genes associated with the response of 24 anticancer drugs, respectively.12 These findings emphasize the need for the use of molecular information to predict drug response and thus optimize personalized cancer therapy.

Provided with drug-response-associated genes, it is usually feasible to predict computationally the responses of cancer cell lines on anticancer drugs, which are often quantified by the activity area (the area above the fitted dose-response curve), IC50 (the concentration of an anticancer drug to kill one-half inhibitory concentration of cancer cells), EC50 (the concentration of an anticancer drug that gives half-maximal effective concentration cancer cell killing effect), or AUC (the area under the nonlinear dose-response curve).6 Generally speaking, a drug-response prediction problem mainly includes the sensitivity estimation of single compounds,6 sensitive or resistant prediction of either a particular drug class or a specific cancer type,13 and sensitivity prediction of a combination of several drugs.14 Among the three typical problems, single drug sensitivity prediction is the simplest, for which various computational methods have been developed recently. These methods primarily rely on regression, classification, heterogeneous network, multiple kernel learning, or ensemble/model selection. For example, Zhang et al.15 developed a dual-layer integrated cell line-drug network, consisting of both drug similarity network and cell line similarity network; verified the assumption that similar cell lines and similar drugs exhibit similar responses; and presented network-based models to estimate anticancer drug responses of cell lines.16 Based on this assumption, Liu et al.17 and Zhang et al.18 formulated a drug-response prediction as a recommender system problem, which was solved by two proposed techniques, respectively: the neighbor-based collaborative filtering with global effect removal method and the hybrid interpolation weighted collaborative filtering method; Wang et al.19 and Guan et al.20 applied novel matrix factorizations with similar restrictions to infer anticancer drug response in cell lines. Despite the relative good-prediction performances of network or matrix factorization-based methods, most of them are uninterpretable in the sense that they provide very little information on what genes/biomarkers are associated with drug responses. In contrast, more interpretable models, like the Elastic-Net model used in the original CCLE paper, are usually poor in prediction accuracy. The readers are referred to Azuaje21 for a systematical summarization of key strategies, resources, techniques, challenges, and trends in this area.

In summary, recent progress in computational techniques has significantly improved anticancer drug-response prediction and contributed to preclinical drug screening.22,23 However, highly accurate yet biologically interpretable models are still in demand. In this paper, we proposed an ensemble learning method, simultaneously integrating a low-rank matrix completion (MC) model and a ridge regression (RR) model, and applied it into two benchmark datasets, including CCLE and GDSC. Comparison with other methods suggests that our model achieves both high prediction accuracy and good biological interpretability simultaneously.

Results

Parameter Optimization

The MC model contains two parameters: θ1 andr. θ1 varied from 0 to 1 with an increment of 0.01, and r varied with an increment of 1 from 1 to 24 for CCLE, 1 to 28 for phosphatidylinositol 3-kinase (PI3K) pathway, and 1 to 30 for extracellular signal-regulated kinase (ERK) signaling pathway, respectively. Figure 1 shows the trend of Pearson correlation coefficients (PCCs) between predicted and observed response values with an increase of r for CCLE—first increases, then achieves it maximum atr=6, and decreases after that. Similar trends were observed for other two datasets (see Figure S1). The detail information about θ1 and r in all three datasets was listed in Table S1.

Figure 1.

Figure 1

The Parameter r Optimization for the MC Model in CCLE

The horizontal axis denotes the rank number, and the vertical axis denotes the Pearson correlation between predicted and observed response values.

For the RR model, we selected 0.11σkˆ0.37 for CCLE and 0.1σkˆ0.3 for both the PI3K pathway and ERK signaling pathway. The detailed information about σkˆ for three datasets was shown in Tables S2A–S2C, respectively. τk ranged over [σkˆ0.1, σkˆ+0.1] with an increment of 0.01, and θ2 ranged over [0, 2 × 105] with an increment of 10. For the combination model, the weight parameter βk ranged over [0, 1], with an increment of 0.01. The detailed information aboutτk, θ2, and βk for three datasets was shown in Tables S3A–S3C, respectively. βk is greater than 0.5 for 79% of 24 drugs in CCLE, especially, βk=1 for nutlin-3 (see Figure 2 for a more intuitive illustration). The results confirmed that the MC model plays a more important role than the RR model for most drugs in the combination model. Similar observations could be found in Figure S2: βk0.5 for 61% of 28 drugs in PI3K and for 63% of 30 drugs in ERK, however βk=0.18 for the drug FTI-277 in ERK, which illustrated a weak contribution of MC to the final prediction.

Figure 2.

Figure 2

The Weight Parameter βk Optimization for the Combination Model in CCLE

Prediction Performance of the Proposed Models

We first applied our models to the CCLE dataset. It has been reported that the prediction accuracy of the dual-layer integrated cell line-drug network model (abbreviated as the integrated model) is significantly higher than some models (cell line similarity network [CSN], drug similarity network [DSN], elastic net model, random forest, support vector regression, and prediction of drug response through an iterative sure independence screening [DISIS]).15,24 Here, we compared CSN, DSN, and the integrated model, based on the PCC. Figure 3A showed the comparison for CCLE. As can be seen, the PCCs of 23 drugs obtained by the combination model are all higher than 0.6, 11 drugs higher than 0.7, and 3 drugs higher than 0.8 and superior to the integrated model for 78% of 23 drugs, especially the drug PD-0325901, in which its PCC reaches 0.86. In fact, the MC model alone is already superior to the compared models and got higher correlations than CSN, DSN, and the integrated model for 100%, 78%, and 52% drugs, respectively. Obviously, the combination model is superior to both MC and RR, which demonstrated that the correlations among drugs and the relationships between drug-response and gene-expression profiles are both important to prediction. For a better view, we exhibited an overall prediction comparison across 23 drugs (see Figure 3B) and also supplied scatterplots of four example drugs (see Figure 3C).

Figure 3.

Figure 3

Prediction Performance Analysis of the Combination Model in CCLE

(A) Bar graph showing the Pearson correlations between the predicted and observed activity areas of six different models for 24 drugs. CSN, based on the cell line similarity network; DSN, based on the drug similarity network; Integrated, based on the dual-layer integrated cell line-drug network model that integrates CSN and DSN; MC, based on the matrix completion model; RR, based on the ridge regression model; COMB, based on the MC-RR combination model. (B) Boxplot showing the Pearson correlation distributions of six models. (C) Scatterplots of observed and predicted responses for four example drugs using the combination model.

For two other datasets in GDSC, the combination model also exhibited good performance, superior to the integrated model for 93% of 28 drugs in PI3K and 87% of 30 drugs in ERK. Other comparison details for CSN and DSN were shown in Table 1. Notably, for the drug OSI-906, the combination model achieves 66% and 58% higher correlation than the integrated model in PI3K and ERK, respectively (see Figure S3). Meanwhile, the scatterplots (see Figure S4) of four example drugs from two datasets confirmed that the good performance of the combination model is not affected by a small number of outliers. In addition, the MC model is better than CSN, DSN, and the integrated model for about 90%, 90%, and 70% drugs in two datasets (see Table S4). The RR model is superior to CSN for the PI3K dataset but worse than DSN and significantly worse than the integrated model for all datasets (see Table S5). The boxplots (see Figure S5) showed the correlation comparison among six different models in two datasets, consistent with the above conclusions.

Table 1.

Prediction Comparison of the Combination Model with CSN, DSN, and the Integrated Models in Two Datasets (the Entry Is the Percentage of Drugs in the Dataset in Which the Correlations Are Higher Than That of the Compared Model)

Dataset >CSN (%) >DSN (%) >Integrated (%)
PI3K 96 96 93
ERK 100 93 87

It is of note that there are other prediction models that consider various, additional information. For example, Stanfield et al.25 integrated cell line mutation data and protein-protein interaction. HNMDRP, a heterogeneous network-based method, introduced cell line genomic profile, drug chemical structure, and drug-target and protein-protein interaction.26 Due to the different data information, we here did not compare with them.

The common techniques to further test the predictive power of a model are to assess its reliability in identifying the estimation of missing data and to evaluate its ability in classifying cell lines sensitive or resistant to a specific drug. The estimation of missing data is considered to be reliable if they exhibit a consistent distribution pattern with that of observed data. Following this rule, we discussed two MEK inhibitors, PD-0325901 and RDEA119, in GDSC, for which about 7% response values are missing across all cell lines. We utilized p value to illustrate “consistent pattern” statistically. As is shown in Figure 4, the observed response values of wild cell lines are significantly higher than that of BRAF mutant cell lines when treated with both PD-0325901 (p = 7.75e−07) and RDEA119 (p = 6.39e−07). Consistently, the predicted (missing) response values of wild cell lines are also significantly higher than that of BRAF mutant cell lines when treated with both PD-0325901 (p = 0.04841) and RDEA119 (p = 0.02999). In summary, BRAF mutant cell lines are more sensitive to MEK inhibitors than wild cell lines, which is in agreement with published literature.15, 16, 17 In addition, with the use of a similar method in Wei et al.,16 Dong et al.,27 and Staunton et al.,28 we ranked cell lines for each drug according to the observed response values. Specifically, we referred to the top 200 cell lines as “sensitive,” the bottom 200 cell lines as “resistant,” and the remaining cell lines as “intermediate,” which were removed from our analysis. Based on this criterion, we discussed that the cell line is sensitive or resistant to a specific drug based on the predicted response values of the combination model and obtained better results than our previous study.16 Figure 5 illustrated the desired cell line-type recognition to the example drug PD-0325901 in two datasets, with accuracy = 0.885, AUC = 0.956 (the area under the receiver operating characteristic [ROC] curve), and p = 4.286159e−56 in CCLE, and accuracy = 0.9, AUC = 0.964, and p = 1.82395e−64 in GDSC. The detailed cell line type to PD-0325901 in CCLE, identified by the combination model, was listed in Table S6. We found that many non-small cell lung carcinoma (NSCLC) cell lines (such as A549, CAL12T, NCIH1299, NCIH23) and NRAS mutant cell lines (such as CHP212, IPC298, SKMEL2, HUT78, HDMYZ, SKNAS, TT2609C02, HEC151, HT1080, NCIH1299, KMM1) are more sensitive to PD-0325901, in agreement with the previous studies.6,29 It is well known that activating mutations in NRAS are top predictors of sensitivity for the MEK inhibitor PD-0325901.2

Figure 4.

Figure 4

Consistent Drug-Response Patterns between Predicted (Missing) and Observed (Existing) Data for the BRAF Mutant and Wild Cell Lines When Treated with Two MEK Inhibitors in GDSC

(A) The responses of the BRAF mutant and wild cell lines to PD-0325901. (B) The responses of the BRAF mutant and wild cell lines to RDEA119.

Figure 5.

Figure 5

The Results of Cell Line-Type Recognition (Sensitive or Resistant) to an Example Drug PD-0325901, Respectively, in Two Datasets

(A) Boxplot of type recognition for PD-0325901 in CCLE. (B) ROC curve of type recognition for PD-0325901 in CCLE. (C) Boxplot of type recognition for PD-0325901 in GDSC. (D) ROC curve of type recognition for PD-0325901 in GDSC.

Drug-Response-Associated Genes and Their Function Enrichment Analyses

We assembled the top 1,000 genes for each drug in three datasets based on the RR model (see Table S7). Interestingly, many of the selected genes are well-known drug-sensitivity biomarkers. For example, NQO1, in which its expression produces a high-potency intermediate (17-AAGH2), was identified as the top-most predictive feature for the response of the heat shock protein 90 (HSP90) inhibitor 17-AAG. This finding is consistent with other literature.6,7,30,31 SPRY2, which encodes a regular of MAPK output, was also verified as the top predictive feature for the MEK inhibitors PD-0325901 and AZD6244.2,6,7 SLFN11—with the top-most features of two DNA topoisomerase I inhibitors, irinotecan and topotecan—was also known to be associated with the sensitivity of two drugs.6 In addition, SLFN11 was observed to be associated with the response of microtubule-stabilizing agent paclitaxel, which is cytotoxic, as are irinotecan and topotecan. It was suggested that topoisomerase I inhibitors and microtubule-stabilizing agents might offer an effective treatment option for some cancer type. There were other top predictors, including ERBB2 and HGF, for the epidermal growth factor receptor (EGFR)/HER2 inhibitor lapatinib6,7,32 and mesenchymal-epithelial transition (MET)/anaplastic lymphoma kinase (ALK) inhibitor PF-2341066.6,33 Top genes IGF1R, EGFR, and MDM2 have been demonstrated to be associated with the response of the IGF-1R inhibitor AEW541,6,34 EGFR inhibitor erlotinib,6,35 and MDM2 inhibitor nutlin-3,6,36 respectively. Notably, PTEN expression was significantly correlated with 13 drug sensitivities in CCLE, which indicated that PTEN expression might inform the therapeutic selectivity of many anticancer drugs.

Besides the aforementioned gene features, we also detected many other drug-response-related genes not mentioned in Barretina et al.6 but that played significant roles in our RR model. For example, BCL11A, a transcriptional repressor that inhibits expression of fetal globin genes in adults and a potential therapeutic target for the treatment of globinopathies,37 was related to 6 drug responses in CCLE. BCL11B, closely correlated to BCL11A, was also a common valuable feature gene for 4 drugs. FAM19A1, belonging to a novel secreted family with conserved cysteine residues and restricted expression in the brain,38 was identified as the top predictor for the gamma-secretase inhibitor L-685458 and MDM2 inhibitor nutlin-3. DPEP3, encoding a membrane-bound glycoprotein from the family of dipeptidases involved in hydrolytic metabolism of various dipeptides, made the highest contributions to the sensitivities of multi-kinase inhibitors sorafenib and TKI258. ENPP6, highly expressed in liver sinusoidal endothelial cells and developing oligodendrocytes,39 was verified as the top predictor for IGF-1R inhibitor AEW541, ALK inhibitor TAE684, and multi-kinase inhibitor TKI258. PPIA, a key target for treating APOE4-mediated neurovascular injury and the resulting neuronal dysfunction and degeneration,40 not only had the top 2 features for histone deacetylase (HDAC) inhibitor panobinostat and DNA topoisomerase I inhibitor irinotecan but also closely associated with 5 drug responses in CCLE, which indicated that PPIA might be involved in some fundamental mechanisms of the multi-drug targeting process. APOL4, which may play a role in lipid exchange and transport throughout the body, as well as in reverse cholesterol transport from peripheral cells to the liver, also exhibited substantial connections with the sensitivity of vascular endothelial growth factor receptor (VEGFR) and EGFR inhibitor ZD-6474, Abl inhibitor nilotinib, and Src and Abl inhibitor AZD0530. All of these prioritized genes could act as candidate biomarkers of drug sensitivity and might ultimately be useful for the deployment of targeted therapies in cancer.

We also performed function enrichment analyses of drug-response-associated genes by the Database for Annotation, Visualization and Integrated Discovery (DAVID) tools41 (https://david.ncifcrf.gov/home.jsp) and listed the terms with Benjamini false discovery rate (FDR) less than 0.05 in Table S8. The word cloud plot of the 186 functions existing in at least 2 drugs in CCLE was drawn in Figure 6. Clearly, many functions have a close relationship with cancers. For example, “transcription” and “transcription regulation” were the two most significant terms and enriched in 11 drugs, respectively. It has been reported that changes in certain genes during transcription play an important role in cancer development.42 “mRNA processing” and “SH2” (Src homology 2) domain were the second most enriched terms. The mechanisms of 3′ end mRNA processing and of its regulation are highly relevant both in biology and in medicine.43 SH2 domain, a structurally conserved protein domain, is important to the treatment of breast cancer.44 Besides the gene functions mentioned above, “RNA binding” and “zinc” also deserve attention. Alterations in the expression and function of RNA-binding proteins amplify the effects of cancer driver genes, accelerate tumor progression, and promote aggressiveness.45 Zinc supplementation should have beneficial effects on cancer by decreasing angiogenesis and induction of inflammatory cytokines while increasing apoptosis in cancer cells.46

Figure 6.

Figure 6

The Word Cloud Plot of 186 Functional Annotations Occurring in at Least 2 out of 24 Drugs in CCLE

The word cloud plots of the enriched gene functions for two other datasets were shown in Figure S6. As an example, the term “kinase” was enriched in multiple drugs. It has been discovered that receptor tyrosine kinases (RTKs) play key roles in growth, metabolism, adhesion, motility, death, and oncogenesis;47 protein kinases are critical in many cellular processes;48 and the Src family of protein tyrosine kinases is also related with oncogenesis, proliferation, and survival.49 In addition, several other top enriched gene functions have also been verified to associate with cancers, such as ATP binding, which plays a role in human breast cancer drug resistance;50 cell membrane, an adjuvant in cancer therapy;51 nucleotide binding, contributing to cancer cell radioresistance;52 cell junction molecules, which are diagnostic and prognostic markers;53 and SH3 domain, a potential target for anticancer drug design.54 In a word, these results showed that the drug-response-related genes prioritized by the RR model indeed are involved in various biological activities of carcinomas.

Discussion

The aim of this study is to utilize gene-expression profiles of cell lines and observed cell line-drug responses to predict known/unknown anticancer drug sensitivities. To our best knowledge, it is the first time to model the drug-response prediction as a combination of MC and RR, for which MC is superior in prediction sensitivity, whereas RR presents biological interpretable results. As expected, the combination model outperforms a popular integrated model in three tested data. To improve the prediction performance further, one direction is to introduce more drug-response-associated features, for example, the information of chemical structure of drugs and the genomic information of cell lines, including gene mutation and copy variation number.27,55 The other possible direction is to incorporate more efficient mathematical models. The performance of an ensemble method usually increases with the number of predictors. However, it is out of the scope of this study.

Based on the RR model, we consistently found many important drug-response-associated genes. For example, SLFN11 is one of the top genes inferred by our model, which has been demonstrated to be topoisomerase inhibitors.6 Furthermore, functional analyses prioritized transcription, SH2/3 domain, “cell cycle,” “ATP binding,” and “zinc finger” as top functions enriched in drug-response-associated genes, which is also consistent with previous literature. However, the finding that AHR expression is strongly correlated with sensitivity of the MEK inhibitor PD-0325901 in Barretina et al.6 was not observed in our work. It might be caused by the different feature selection methods. To enhance the identification of strong biomarkers of the RR model, further attention should be paid on gene screening. In addition, as we all know, many genes function corporately. Thus, it might be better to add this information by adopting a group feature-based model, like group lasso.

Finally, we compared prediction results of 19 common drugs in PI3K and ERK signaling pathways based on AUC (see Figure S7). Most drugs have different correlations between predicted and observed response values in two pathways under three models. The disagreements are mainly from three aspects, including different cell line-drug-response matrices, different sample groups of 10-fold crossvalidation, and different final screened genes corresponding to two pathways. It is interesting that the disagreements derived by the combination model are the lowest, which indicates, to a certain degree, that the combination model usually produces a more stable prediction.

Materials and Methods

Datasets

The CCLE6 and GDSC7 projects provide two most important publicly available resources for investigating the anticancer drug response in which drug responses of several hundred cancer cell lines, together with their genomic and transcriptomic information, including gene expression, mutation, and copy number variation, were profiled. For CCLE (https://portals.broadinstitute.org/ccle), there are 491 cancer cell lines with both drug-response profiles of 24 anticancer drugs (measured by the activity area, the more sensitive cell lines will get higher activity areas to a drug) and expression profiles of 18,900 genes available. Specifically, the cell line-drug-response information can be formulated as a matrix of size 491 × 24 (11,784 entries) with row representing cell lines and column representing drugs, among which 424 (3.6%) are missing values. For GDSC (release-5.0; https://www.cancerrxgene.org), there are 655 cancer cell lines with both responses of 140 drugs (measured by AUC, the more sensitive cell lines will get lower AUC values to a drug) and expression profiles of 12,072 genes available. The drug-response profile can be formulated as a matrix of size 655 × 140 (91,700 entries), among which 18,029 (19.66%) are missing. Our data and software are publicly available at https://zenodo.org/record/1325121#.W2IiA1i0Xcs.

The MC Model

MC is a traditional mathematical model that has been applied in many fields, such as influenza antigenic cartography56,57 and microRNA (miRNA)-disease association.58,59 The aim of MC is to impute the missing entries in an incomplete matrix by taking the advantages of the relationships among row vectors and column vectors. Thus, the anticancer drug-response prediction can be formulated as the MC problem. We first replaced all missing values in the observed cell line-drug-response matrix with 0, and then standardized each column by Z scores to derive the normalized matrix denoted by M=[Mik]m×p, wherem is the number of cell lines, and p is the number of drugs. Let E={(i,k)|Mik0,i[m],k[p]}, here [m]={1, 2, , m} and[p]={1, 2, , p}, and let Z=[Zik]m×p denote the underlying cell line-drug-response matrix. Then we could use the following optimization problem to estimate the drug-response matrix,

                min12(i,k)E(ZikMik)2+θ1G(Z) s.t. Z=Um×rΣr×r(Vp×r)T (1)

where θ1 is a model parameter. Similar to previous studies,56,57 the regularization function isG(Z)=i=1mg(Ui23αr)+k=1pg(Vi23αr), whereg(z)=e(z1)21 when z1, and g(z)=0otherwise. Ui(Vi) denotes the ith row of U(V), andα=max(m, p).

We utilized the alternating gradient descend (AGD) algorithm56 to solve the problem (Equation 1) by the following steps.

  • (1)

    Trimming: randomly assign some observed values to 0s from a row (column) in M when the row (column) contains more than (2|E|/m)(2|E|/p) observed values, and denote the resulted matrix by M(0).

  • (2)

    Initialization: do singular value decomposition M(0)=UΣVT. Set U(0)=U0m and V(0)=V0m, where U0 and V0 are the first r columns of U and V, respectively.

  • (3)

    Iteration:

  • (a) Fix U(k) and V(k), and calculate the matrix Σr×r to minimize the squared error

12(i,k)E(ZikMik)2
  • (b)

    Set U(k+1)=U(k)+tU(k) and V(k+1)=V(k)+tV(k).

  • (c)

    Repeat (a) and (b) until they converge or reach predetermined iteration times.

Here,supposeU=[a11a1ram1amr]andV=[b11b1rbp1bpr]

Then,

U=((UΣVT)EM)VΣT+UQU+θ143αrU
V=((UΣVT)EM)TUΣ+VQV+θ143αrV
whereQU=1mUT(M(UΣVT)E)VΣT,QV=1pVT(M(UΣVT)E)TUΣ
U=[a11e(a112++a1r23αr1)2(a112++a1r23αr1)a1re(a112++a1r23αr1)2(a112++a1r23αr1)am1e(am12++amr23αr1)2(am12++amr23αr1)amre(am12++amr23αr1)2(am12++amr23αr1)]
V=[b11e(b112++b1r23αr1)2(b112++b1r23αr1)b1re(b112++b1r23αr1)2(b112++b1r23αr1)bp1e(bp12++bpr23αr1)2(bp12++bpr23αr1)bpre(bp12++bpr23αr1)2(bp12++bpr23αr1)]

The RR Model

It has been suggested that regression models, such as elastic net and RR, tend to exhibit good and robust performance in different settings.60,61 Here, we applied RR to predict drug responses by assuming a linear relationship between the gene-expression profiles of cell lines and drug responses. However, the number of genes is much larger than that of cell lines in our data. To avoid overfitting, for each drug, we need to filter out unimportant genes.

Let X=[Xij]m×n denote the cell line-gene-expression matrix, where n is the number of genes. Let Mk be the kth column of M(1kp). We then divided Mk into ten nonoverlapping groups randomly with almost the same size and selected nine groups as the training set and the remaining group as the test set. Without loss of generality, we selected the qth group as the test set and the other 9 groups as the training set. With the deletion of all zeros in the training set, we obtained the resulting vector Mk(q). Meanwhile, we deleted gene-expression values of corresponding cell lines, normalized the remaining gene-expression values by Z score, and obtained the matrix X(q). Let Xj(q) be the jth column of X(q), j=1, ,n. We computed PCCs ρ, such that the size of the set Ωσk(q)={j |j[n] and |ρ(Xj(q),Mk(q))|σk(q)} is close to the number of cell lines as much as possible. We set σk=min{σk(q), q=1, , 10}, and repeated the above procedure 10 times. The minimum value of the 10 σks was referred to as σkˆ. Under the help of σkˆ, we could use the above training sets to find τk easily (satisfying that the size of the set Ωτk={j |j[n] and |ρ(Xj(q),Mk(q))|τk} is close to the number of cell lines as much as possible) and to train the RR model by minimizing the following formula:

12iEk(jΩτkXijλjkτkMik)2+θ2jΩτk(λjkτk)2 (Equation 2)

Here, Ek={i | i[m] and Mik0}, λjkτk are RR estimators, and θ2 is a penalty factor for reducing the number of effective features associated with drug responses. Finally, we used the test sets to determine the optimal parameters in the formula (Equation 2).

Let τ=[τ1, ,τk,, τp], and λτ=[λjk]n×p, λjk=λjkτk if jΩτk, otherwise λjk=0. Then, the predicted cell line-drug-response matrix Y under τ can be formulated as Y=Xλτ.

The MC-RR Combination Model

Ensemble learning usually achieves better predicting accuracy than single predictors and thus is promising in prediction.15,62, 63, 64 It is reasonable to combine the MC and RR models since the former emphasizes relationships among drugs and deals with all drugs simultaneously, whereas the latter takes advantage of the close relationship between drug-response and gene-expression profiles and tackles each drug separately. We combined two models as follows:

Mˆk=βkZk+(1βk)Yk,1kp (Equation 3)

where Mˆk is the kth column of the prediction matrix Mˆ of the combination model; Zk and Yk, respectively, denote the kth column of the MC-prediction matrix Z and RR-prediction matrix Y; and βk is a weight parameter. Figure 7 showed the workflow of the MC-RR combination model.

Figure 7.

Figure 7

The Workflow of the MC-RR Combination Model

Crossvalidation and Evaluation Criteria

We conducted 10-fold crossvalidation to evaluate the performance of models and determine model parameters. Observed values in M were divided into 10 groups randomly with almost the same size. One group was taken as the testing set and the rest as the training set each time. The operation was repeated 10 runs, such that each value in M was predicted once, and PCCs between predicted and observed values in M were calculated. It is of noting that the MC model was performed on the wholeM with all of the drugs simultaneously, but the RR model and the combination model were applied to each drug separately. Hence, to each drug, we restricted the grouping of training and testing sets to train a separate model in each run and required model parameters to be equal for all 10 runs to avoid overfitting. The whole crossvalidation was run 10 times, the average PCC was used to measure the performance of models, and the optimal parameters were determined by maximizing the average PCC.

Author Contributions

J.Y. and Y.L. conceived the concept of the work and designed the experiments. C.L. and D.W. performed the literature search and experiments. C.L., F.R., and Y.L. completed theoretical derivation. J.X. and L.H. collected and analyzed the data. D.W., J.L., and G.T. conducted case studies. Y.L., J.Y., and C.L. wrote the paper. All authors have approved the manuscript.

Conflicts of Interest

The authors declare no competing interests.

Acknowledgments

This work was supported by the Natural Science Foundation of Hebei, China (grant no. A2020203021); Natural Science Foundation of Hunan, China (grant no. 2018JJ2461 and 2018JJ3568); National Natural Science Foundation of China (grant nos. 61807029 and 61702054); and Scientific Research Fund of Education Department of Hunan Province, China (grant no. 17A024).

Footnotes

Supplemental Information can be found online at https://doi.org/10.1016/j.omtn.2020.07.003.

Contributor Information

Yushuang Li, Email: yushuangli@ysu.edu.cn.

Jialiang Yang, Email: yangjl@geneis.cn.

Supplemental Information

Document S1. Figures S1–S7 and Tables S1–S5
mmc1.pdf (1MB, pdf)
Table S6. Cell Line Type Recognition to PD-0325901 in CCLE Using the Combination Model
mmc2.zip (27.3KB, zip)
Table S7. Top 1000 Genes for Each Drug in Three Datasets
mmc3.xlsx (521.7KB, xlsx)
Table S8. Terms with the Benjamini False Discovery Rate (FDR) Less Than 0.05
mmc4.xlsx (322.9KB, xlsx)
Document S2. Article plus Supplemental Information
mmc5.pdf (4MB, pdf)

References

  • 1.Druker B.J., Talpaz M., Resta D.J., Peng B., Buchdunger E., Ford J.M., Lydon N.B., Kantarjian H., Capdeville R., Ohno-Jones S., Sawyers C.L. Efficacy and safety of a specific inhibitor of the BCR-ABL tyrosine kinase in chronic myeloid leukemia. N. Engl. J. Med. 2001;344:1031–1037. doi: 10.1056/NEJM200104053441401. [DOI] [PubMed] [Google Scholar]
  • 2.Solit D.B., Garraway L.A., Pratilas C.A., Sawai A., Getz G., Basso A., Ye Q., Lobo J.M., She Y., Osman I. BRAF mutation predicts sensitivity to MEK inhibition. Nature. 2006;439:358–362. doi: 10.1038/nature04304. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Dalerba P., Kalisky T., Sahoo D., Rajendran P.S., Rothenberg M.E., Leyrat A.A., Sim S., Okamoto J., Johnston D.M., Qian D. Single-cell dissection of transcriptional heterogeneity in human colon tumors. Nat. Biotechnol. 2011;29:1120–1127. doi: 10.1038/nbt.2038. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Shoemaker R.H. The NCI60 human tumour cell line anticancer drug screen. Nat. Rev. Cancer. 2006;6:813–823. doi: 10.1038/nrc1951. [DOI] [PubMed] [Google Scholar]
  • 5.Lamb J., Crawford E.D., Peck D., Modell J.W., Blat I.C., Wrobel M.J., Lerner J., Brunet J.P., Subramanian A., Ross K.N. The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science. 2006;313:1929–1935. doi: 10.1126/science.1132939. [DOI] [PubMed] [Google Scholar]
  • 6.Barretina J., Caponigro G., Stransky N., Venkatesan K., Margolin A.A., Kim S., Wilson C.J., Lehár J., Kryukov G.V., Sonkin D. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 2012;483:603–607. doi: 10.1038/nature11003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Garnett M.J., Edelman E.J., Heidorn S.J., Greenman C.D., Dastur A., Lau K.W., Greninger P., Thompson I.R., Luo X., Soares J. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature. 2012;483:570–575. doi: 10.1038/nature11005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Seashore-Ludlow B., Rees M.G., Cheah J.H., Cokol M., Price E.V., Coletti M.E., Jones V., Bodycombe N.E., Soule C.K., Gould J. Harnessing Connectivity in a Large-Scale Small-Molecule Sensitivity Dataset. Cancer Discov. 2015;5:1210–1223. doi: 10.1158/2159-8290.CD-15-0235. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Rees M.G., Seashore-Ludlow B., Cheah J.H., Adams D.J., Price E.V., Gill S., Javaid S., Coletti M.E., Jones V.L., Bodycombe N.E. Correlating chemical sensitivity and basal gene expression reveals mechanism of action. Nat. Chem. Biol. 2016;12:109–116. doi: 10.1038/nchembio.1986. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Geeleher P., Cox N.J., Huang R.S. Cancer biomarker discovery is improved by accounting for variability in general levels of drug sensitivity in pre-clinical models. Genome Biol. 2016;17:190. doi: 10.1186/s13059-016-1050-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Szakács G., Annereau J.P., Lababidi S., Shankavaram U., Arciello A., Bussey K.J., Reinhold W., Guo Y., Kruh G.D., Reimers M. Predicting drug sensitivity and resistance: profiling ABC transporter genes in cancer cells. Cancer Cell. 2004;6:129–137. doi: 10.1016/j.ccr.2004.06.026. [DOI] [PubMed] [Google Scholar]
  • 12.Liu X., Yang J., Zhang Y., Fang Y., Wang F., Wang J., Zheng X., Yang J. A systematic study on drug-response associated genes using baseline gene expressions of the Cancer Cell Line Encyclopedia. Sci. Rep. 2016;6:22811. doi: 10.1038/srep22811. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Tran T.P., Ong E., Hodges A.P., Paternostro G., Piermarocchi C. Prediction of kinase inhibitor response using activity profiling, in vitro screening, and elastic net regression. BMC Syst. Biol. 2014;8:74. doi: 10.1186/1752-0509-8-74. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Bansal M., Yang J., Karan C., Menden M.P., Costello J.C., Tang H., Xiao G., Li Y., Allen J., Zhong R., NCI-DREAM Community. NCI-DREAM Community A community computational challenge to predict the activity of pairs of compounds. Nat. Biotechnol. 2014;32:1213–1222. doi: 10.1038/nbt.3052. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Zhang N., Wang H., Fang Y., Wang J., Zheng X., Liu X.S. Predicting Anticancer Drug Responses Using a Dual-Layer Integrated Cell Line-Drug Network Model. PLoS Comput. Biol. 2015;11:e1004498. doi: 10.1371/journal.pcbi.1004498. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Wei D., Liu C., Zheng X., Li Y. Comprehensive anticancer drug response prediction based on a simple cell line-drug complex network model. BMC Bioinformatics. 2019;20:44. doi: 10.1186/s12859-019-2608-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Liu H., Zhao Y., Zhang L., Chen X., Anti-cancer Drug Response Prediction Using Neighbor-Based Collaborative Filtering with Global Effect Removal Anti-cancer Drug Response Prediction Using Neighbor-Based Collaborative Filtering with Global Effect Removal. Mol. Ther. Nucleic Acids. 2018;13:303–311. doi: 10.1016/j.omtn.2018.09.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Zhang L., Chen X., Guan N.N., Liu H., Li J.Q. A Hybrid Interpolation Weighted Collaborative Filtering Method for Anti-cancer Drug Response Prediction. Front. Pharmacol. 2018;9:1017. doi: 10.3389/fphar.2018.01017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Wang L., Li X., Zhang L., Gao Q. Improved anticancer drug response prediction in cell lines using matrix factorization with similarity regularization. BMC Cancer. 2017;17:513. doi: 10.1186/s12885-017-3500-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Guan N.N., Zhao Y., Wang C.C., Li J.Q., Chen X., Piao X. Anticancer Drug Response Prediction in Cell Lines Using Weighted Graph Regularized Matrix Factorization. Mol. Ther. Nucleic Acids. 2019;17:164–174. doi: 10.1016/j.omtn.2019.05.017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Azuaje F. Computational models for predicting drug responses in cancer research. Brief. Bioinform. 2017;18:820–829. doi: 10.1093/bib/bbw065. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Boehm J.S., Golub T.R. An ecosystem of cancer cell line factories to support a cancer dependency map. Nat. Rev. Genet. 2015;16:373–374. doi: 10.1038/nrg3967. [DOI] [PubMed] [Google Scholar]
  • 23.Klijn C., Durinck S., Stawiski E.W., Haverty P.M., Jiang Z., Liu H., Degenhardt J., Mayba O., Gnad F., Liu J. A comprehensive transcriptional portrait of human cancer cell lines. Nat. Biotechnol. 2015;33:306–312. doi: 10.1038/nbt.3080. [DOI] [PubMed] [Google Scholar]
  • 24.Fang Y., Qin Y., Zhang N., Wang J., Wang H., Zheng X. DISIS: prediction of drug response through an iterative sure independence screening. PLoS ONE. 2015;10:e0120408. doi: 10.1371/journal.pone.0120408. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Stanfield Z., Coşkun M., Koyutürk M. Drug Response Prediction as a Link Prediction Problem. Sci. Rep. 2017;7:40321. doi: 10.1038/srep40321. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Zhang F., Wang M., Xi J., Yang J., Li A. A novel heterogeneous network-based method for drug response prediction in cancer cell lines. Sci. Rep. 2018;8:3355. doi: 10.1038/s41598-018-21622-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Dong Z., Zhang N., Li C., Wang H., Fang Y., Wang J., Zheng X. Anticancer drug sensitivity prediction in cell lines from baseline gene expression through recursive feature selection. BMC Cancer. 2015;15:489. doi: 10.1186/s12885-015-1492-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Staunton J.E., Slonim D.K., Coller H.A., Tamayo P., Angelo M.J., Park J., Scherf U., Lee J.K., Reinhold W.O., Weinstein J.N. Chemosensitivity prediction by transcriptional profiling. Proc. Natl. Acad. Sci. USA. 2001;98:10787–10792. doi: 10.1073/pnas.191368598. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Suphavilai C., Bertrand D., Nagarajan N. Predicting Cancer Drug Response using a Recommender System. Bioinformatics. 2018;34:3907–3914. doi: 10.1093/bioinformatics/bty452. [DOI] [PubMed] [Google Scholar]
  • 30.Guo W., Reigan P., Siegel D., Zirrolli J., Gustafson D., Ross D. Formation of 17-allylamino-demethoxygeldanamycin (17-AAG) hydroquinone by NAD(P)H:quinone oxidoreductase 1: role of 17-AAG hydroquinone in heat shock protein 90 inhibition. Cancer Res. 2005;65:10006–10015. doi: 10.1158/0008-5472.CAN-05-2029. [DOI] [PubMed] [Google Scholar]
  • 31.Kelland L.R., Sharp S.Y., Rogers P.M., Myers T.G., Workman P. DT-Diaphorase expression and tumor cell sensitivity to 17-allylamino, 17-demethoxygeldanamycin, an inhibitor of heat shock protein 90. J. Natl. Cancer Inst. 1999;91:1940–1949. doi: 10.1093/jnci/91.22.1940. [DOI] [PubMed] [Google Scholar]
  • 32.Konecny G.E., Pegram M.D., Venkatesan N., Finn R., Yang G., Rahmeh M., Untch M., Rusnak D.W., Spehar G., Mullin R.J. Activity of the dual kinase inhibitor lapatinib (GW572016) against HER-2-overexpressing and trastuzumab-treated breast cancer cells. Cancer Res. 2006;66:1630–1639. doi: 10.1158/0008-5472.CAN-05-1182. [DOI] [PubMed] [Google Scholar]
  • 33.Zou H.Y., Li Q., Lee J.H., Arango M.E., McDonnell S.R., Yamazaki S., Koudriakova T.B., Alton G., Cui J.J., Kung P.P. An orally available small-molecule inhibitor of c-Met, PF-2341066, exhibits cytoreductive antitumor efficacy through antiproliferative and antiangiogenic mechanisms. Cancer Res. 2007;67:4408–4417. doi: 10.1158/0008-5472.CAN-06-4443. [DOI] [PubMed] [Google Scholar]
  • 34.Moreau P., Cavallo F., Leleu X., Hulin C., Amiot M., Descamps G., Facon T., Boccadoro M., Mignard D., Harousseau J.L. Phase I study of the anti insulin-like growth factor 1 receptor (IGF-1R) monoclonal antibody, AVE1642, as single agent and in combination with bortezomib in patients with relapsed multiple myeloma. Leukemia. 2011;25:872–874. doi: 10.1038/leu.2011.4. [DOI] [PubMed] [Google Scholar]
  • 35.Greshock J., Bachman K.E., Degenhardt Y.Y., Jing J., Wen Y.H., Eastman S., McNeil E., Moy C., Wegrzyn R., Auger K. Molecular target class is predictive of in vitro response profile. Cancer Res. 2010;70:3677–3686. doi: 10.1158/0008-5472.CAN-09-3788. [DOI] [PubMed] [Google Scholar]
  • 36.Müller C.R., Paulsen E.B., Noordhuis P., Pedeutour F., Saeter G., Myklebost O. Potential for treatment of liposarcomas with the MDM2 antagonist Nutlin-3A. Int. J. Cancer. 2007;121:199–205. doi: 10.1002/ijc.22643. [DOI] [PubMed] [Google Scholar]
  • 37.Canver M.C., Smith E.C., Sher F., Pinello L., Sanjana N.E., Shalem O., Chen D.D., Schupp P.G., Vinjamur D.S., Garcia S.P. BCL11A enhancer dissection by Cas9-mediated in situ saturating mutagenesis. Nature. 2015;527:192–197. doi: 10.1038/nature15521. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Tom Tang Y., Emtage P., Funk W.D., Hu T., Arterburn M., Park E.E., Rupp F. TAFA: a novel secreted family with conserved cysteine residues and restricted expression in the brain. Genomics. 2004;83:727–734. doi: 10.1016/j.ygeno.2003.10.006. [DOI] [PubMed] [Google Scholar]
  • 39.Morita J., Kano K., Kato K., Takita H., Sakagami H., Yamamoto Y., Mihara E., Ueda H., Sato T., Tokuyama H. Structure and biological function of ENPP6, a choline-specific glycerophosphodiester-phosphodiesterase. Sci. Rep. 2016;6:20995. doi: 10.1038/srep20995. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Bell R.D., Winkler E.A., Singh I., Sagare A.P., Deane R., Wu Z., Holtzman D.M., Betsholtz C., Armulik A., Sallstrom J. Apolipoprotein E controls cerebrovascular integrity via cyclophilin A. Nature. 2012;485:512–516. doi: 10.1038/nature11087. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Huang D.W., Sherman B.T., Tan Q., Collins J.R., Alvord W.G., Roayaei J., Stephens R., Baseler M.W., Lane H.C., Lempicki R.A. The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists. Genome Biol. 2007;8:R183. doi: 10.1186/gb-2007-8-9-r183. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Ongen H., Andersen C.L., Bramsen J.B., Oster B., Rasmussen M.H., Ferreira P.G., Sandoval J., Vidal E., Whiffin N., Planchon A. Putative cis-regulatory drivers in colorectal cancer. Nature. 2014;512:87–90. doi: 10.1038/nature13602. [DOI] [PubMed] [Google Scholar]
  • 43.Danckwardt S., Hentze M.W., Kulozik A.E. 3′ end mRNA processing: molecular mechanisms and implications for health and disease. EMBO J. 2008;27:482–498. doi: 10.1038/sj.emboj.7601932. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Morlacchi P., Robertson F.M., Klostergaard J., McMurray J.S. Targeting SH2 domains in breast cancer. Future Med. Chem. 2014;6:1909–1926. doi: 10.4155/fmc.14.120. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Pereira B., Billaud M., Almeida R. RNA-Binding Proteins in Cancer: Old Players and New Actors. Trends Cancer. 2017;3:506–528. doi: 10.1016/j.trecan.2017.05.003. [DOI] [PubMed] [Google Scholar]
  • 46.Prasad A.S., Beck F.W., Snell D.C., Kucuk O. Zinc in cancer prevention. Nutr. Cancer. 2009;61:879–887. doi: 10.1080/01635580903285122. [DOI] [PubMed] [Google Scholar]
  • 47.Sharma P.S., Sharma R., Tyagi T. Receptor tyrosine kinase inhibitors as potent weapons in war against cancers. Curr. Pharm. Des. 2009;15:758–776. doi: 10.2174/138161209787582219. [DOI] [PubMed] [Google Scholar]
  • 48.Manning G., Plowman G.D., Hunter T., Sudarsanam S. Evolution of protein kinase signaling from yeast to man. Trends Biochem. Sci. 2002;27:514–520. doi: 10.1016/s0968-0004(02)02179-5. [DOI] [PubMed] [Google Scholar]
  • 49.Roskoski R., Jr. Src kinase regulation by phosphorylation and dephosphorylation. Biochem. Biophys. Res. Commun. 2005;331:1–14. doi: 10.1016/j.bbrc.2005.03.012. [DOI] [PubMed] [Google Scholar]
  • 50.Leonessa F., Clarke R. ATP binding cassette transporters and drug resistance in breast cancer. Endocr. Relat. Cancer. 2003;10:43–73. doi: 10.1677/erc.0.0100043. [DOI] [PubMed] [Google Scholar]
  • 51.Zalba S., Ten Hagen T.L. Cell membrane modulation as adjuvant in cancer therapy. Cancer Treat. Rev. 2017;52:48–57. doi: 10.1016/j.ctrv.2016.10.008. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Fukumoto M., Amanuma T., Kuwahara Y., Shimura T., Suzuki M., Mori S., Kumamoto H., Saito Y., Ohkubo Y., Duan Z. Guanine nucleotide-binding protein 1 is one of the key molecules contributing to cancer cell radioresistance. Cancer Sci. 2014;105:1351–1359. doi: 10.1111/cas.12489. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Knights A.J., Funnell A.P., Crossley M., Pearson R.C. Holding Tight: Cell Junctions and Cancer Spread. Trends Cancer Res. 2012;8:61–69. [PMC free article] [PubMed] [Google Scholar]
  • 54.Smithgall T.E. SH2 and SH3 domains: potential targets for anti-cancer drug design. J. Pharmacol. Toxicol. Methods. 1995;34:125–132. doi: 10.1016/1056-8719(95)00082-7. [DOI] [PubMed] [Google Scholar]
  • 55.Costello J.C., Heiser L.M., Georgii E., Gönen M., Menden M.P., Wang N.J., Bansal M., Ammad-ud-din M., Hintsanen P., Khan S.A., NCI DREAM Community A community effort to assess and improve drug sensitivity prediction algorithms. Nat. Biotechnol. 2014;32:1202–1212. doi: 10.1038/nbt.2877. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Cai Z., Zhang T., Wan X.F. A computational framework for influenza antigenic cartography. PLoS Comput. Biol. 2010;6:e1000949. doi: 10.1371/journal.pcbi.1000949. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Huang L., Li X., Guo P., Yao Y., Liao B., Zhang W., Wang F., Yang J., Zhao Y., Sun H. Matrix completion with side information and its applications in predicting the antigenicity of influenza viruses. Bioinformatics. 2017;33:3195–3201. doi: 10.1093/bioinformatics/btx390. [DOI] [PubMed] [Google Scholar]
  • 58.Chen X., Wang L., Qu J., Guan N.N., Li J.Q. Predicting miRNA-disease association based on inductive matrix completion. Bioinformatics. 2018;34:4256–4265. doi: 10.1093/bioinformatics/bty503. [DOI] [PubMed] [Google Scholar]
  • 59.Chen X., Sun L.G., Zhao Y. NCMCMDA: miRNA-disease association prediction through neighborhood constraint matrix completion. Brief. Bioinform. 2020 doi: 10.1093/bib/bbz159. Published online January 12, 2020. [DOI] [PubMed] [Google Scholar]
  • 60.Jang I.S., Neto E.C., Guinney J., Friend S.H., Margolin A.A. Systematic assessment of analytical methods for drug sensitivity prediction from cancer cell line data. Pac. Symp. Biocomput. 2014:63–74. [PMC free article] [PubMed] [Google Scholar]
  • 61.Cancer Cell Line Encyclopedia Consortium. Genomics of Drug Sensitivity in Cancer Consortium Pharmacogenomic agreement between two cancer cell line data sets. Nature. 2015;528:84–87. doi: 10.1038/nature15736. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Cortés-Ciriano I., van Westen G.J., Bouvier G., Nilges M., Overington J.P., Bender A., Malliavin T.E. Improved large-scale prediction of growth inhibition patterns using the NCI60 cancer cell line panel. Bioinformatics. 2016;32:85–95. doi: 10.1093/bioinformatics/btv529. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Wan Q., Pal R. An ensemble based top performing approach for NCI-DREAM drug sensitivity prediction challenge. PLoS ONE. 2014;9:e101183. doi: 10.1371/journal.pone.0101183. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Chen X., Zhu C.C., Yin J. Ensemble of decision tree reveals potential miRNA-disease associations. PLoS Comput. Biol. 2019;15:e1007209. doi: 10.1371/journal.pcbi.1007209. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Document S1. Figures S1–S7 and Tables S1–S5
mmc1.pdf (1MB, pdf)
Table S6. Cell Line Type Recognition to PD-0325901 in CCLE Using the Combination Model
mmc2.zip (27.3KB, zip)
Table S7. Top 1000 Genes for Each Drug in Three Datasets
mmc3.xlsx (521.7KB, xlsx)
Table S8. Terms with the Benjamini False Discovery Rate (FDR) Less Than 0.05
mmc4.xlsx (322.9KB, xlsx)
Document S2. Article plus Supplemental Information
mmc5.pdf (4MB, pdf)

Articles from Molecular Therapy. Nucleic Acids are provided here courtesy of The American Society of Gene & Cell Therapy

RESOURCES