Abstract
The number of large-scale high-dimensional datasets recording different aspects of a single disease is growing, accompanied by a need for frameworks that can create one coherent model from multiple tensors of matched columns, e.g., patients and platforms, but independent rows, e.g., probes. We define and prove the mathematical properties of a novel tensor generalized singular value decomposition (GSVD), which can simultaneously find the similarities and dissimilarities, i.e., patterns of varying relative significance, between any two such tensors. We demonstrate the tensor GSVD in comparative modeling of patient- and platform-matched but probe-independent ovarian serous cystadenocarcinoma (OV) tumor, mostly high-grade, and normal DNA copy-number profiles, across each chromosome arm, and combination of two arms, separately. The modeling uncovers previously unrecognized patterns of tumor-exclusive platform-consistent co-occurring copy-number alterations (CNAs). We find, first, and validate that each of the patterns across only 7p and Xq, and the combination of 6p+12p, is correlated with a patient’s prognosis, is independent of the tumor’s stage, the best predictor of OV survival to date, and together with stage makes a better predictor than stage alone. Second, these patterns include most known OV-associated CNAs that map to these chromosome arms, as well as several previously unreported, yet frequent focal CNAs. Third, differential mRNA, microRNA, and protein expression consistently map to the DNA CNAs. A coherent picture emerges for each pattern, suggesting roles for the CNAs in OV pathogenesis and personalized therapy. In 6p+12p, deletion of the p21-encoding CDKN1A and p38-encoding MAPK14 and amplification of RAD51AP1 and KRAS encode for human cell transformation, and are correlated with a cell’s immortality, and a patient’s shorter survival time. In 7p, RPA3 deletion and POLD2 amplification are correlated with DNA stability, and a longer survival. 
In Xq, PABPC5 deletion and BCAP31 amplification are correlated with a cellular immune response, and a longer survival.
Introduction
The growing number of large-scale high-dimensional datasets recording different aspects of a single disease promises to enhance basic understanding of life on the molecular level as well as medical diagnosis, prognosis, and treatment. This is accompanied by a fundamental need for mathematical frameworks that can create one coherent model from multiple datasets arranged in multiple order-matched, column-matched, and row-independent tensors, i.e., tensors of the same number of dimensions each, with one-to-one mappings among the columns across all but one of the corresponding dimensions among the tensors, but not necessarily among the rows across the one remaining dimension in each tensor. Consider, e.g., the structure of the DNA copy-number datasets in the Cancer Genome Atlas (TCGA) [1, 2]. Profiles of tumor and normal tissues from the same set of patients have the structure of two matrices, i.e., second-order tensors, with a one-to-one mapping between the columns that correspond to the same set of patients, but not necessarily between the rows, which correspond to the DNA copy-number probes with valid data in either the tumor or the normal dataset and may differ between the two. When the tumor and normal profiles are measured in replicates, e.g., by the same set of profiling platforms, then the structure of the tumor and normal datasets is that of two third-order tensors, of matched columns that correspond to the same sets of patients and platforms, and independent rows that correspond to the probes in either the tumor or the normal dataset.
The higher-order generalized singular value decomposition (HO GSVD) is the only simultaneous decomposition to date of more than two such column-matched but row-independent datasets that is by definition exact, and whose mathematical properties allow interpreting its variables and operations in terms of the similar as well as dissimilar, e.g., biomedical, reality among the datasets [3, 4]. The HO GSVD generalizes the GSVD [5–12], which was demonstrated in comparative modeling of, e.g., patient-matched but probe-independent glioblastoma (GBM) brain tumor and normal DNA copy-number profiles from TCGA [13]. The modeling uncovered a previously unrecognized genome-wide pattern of tumor-exclusive copy-number alterations (CNAs). Prior to the modeling, DNA copy-number subtypes of GBM predictive of survival and response to chemotherapy were not conclusively identified [14, 15], and the best predictor of GBM survival was the patient’s age at diagnosis [16, 17]. Survival analyses [18, 19] showed and validated that the pattern is correlated with a GBM patient’s prognosis and response to chemotherapy, is independent of age, and together with age makes a better predictor than age alone. Segmentation [20, 21] of the pattern showed that it includes most known GBM-associated changes in chromosome numbers and focal CNAs, as well as several previously unreported, yet frequent CNAs. This suggested that the pattern is not only correlated, but also possibly causally coordinated with the GBM tumor’s pathogenesis. Previously unrecognized targets for personalized GBM drug therapy were also suggested, namely the tousled-like kinase 2 TLK2 and the methyltransferase-like 2A METTL2A [22–24]. The GSVD comparative modeling, therefore, resulted in new insights into the poorly understood relations between a GBM tumor’s genome and a patient’s survival phenotype.
The GSVD and HO GSVD, however, are limited to datasets arranged in second-order tensors, i.e., matrices. We define, therefore, a novel tensor GSVD, i.e., an exact simultaneous decomposition of two datasets, arranged in two higher-than-second-order tensors of matched column dimensions but independent row dimensions. The tensor GSVD factors or separates the pair of tensors into corresponding pairs of “subtensors”, i.e., pairs of outer products or combinations of a paired set of patterns each: patterns, one across each of the matched column dimensions, which are identical for both tensors, combined with one pattern across the independent row dimension of either one of the two tensors. The pairs of subtensors are of varying relative mathematical significance, i.e., the significance of one subtensor in a pair in the corresponding tensor relative to the significance of the second subtensor in the second tensor varies among the pairs of subtensors. We prove that the tensor GSVD extends the GSVD and the tensor higher-order singular value decomposition (HOSVD) [25–28] from a decomposition of either two column-matched matrices or one tensor, respectively, to a decomposition of two order-matched, column-matched, and row-independent tensors [29]. We also show that the mathematical properties of the tensor GSVD allow interpreting the subtensors in terms of the biomedical similarities and dissimilarities between the two corresponding high-dimensional datasets.
We demonstrate the tensor GSVD in comparative modeling of patient- and platform-matched but probe-independent ovarian serous cystadenocarcinoma (OV) tumor and normal DNA copy-number profiles from TCGA. Most of the tumors, i.e., >95%, are high-grade tumors [30]. OV accounts for about 90% of all ovarian cancers. Despite recent large-scale profiling efforts, the best predictor of OV survival to date has remained the tumor’s stage at diagnosis, a pathological assessment of the spread of the cancer numbering I to IV [31]. About 25% of primary OV tumors are resistant, and most recurrent OV tumors develop resistance to platinum-based chemotherapy, the first-line treatment for more than 30 years now [32]. Even though there exist drugs for platinum-based chemotherapy-resistant OV tumors, no pathology laboratory diagnostic exists that distinguishes between resistant and sensitive tumors before the treatment [33]. OV tumors exhibit significant CNA variation among them, much more so than, e.g., GBM tumors, and very few frequent CNAs typical of OV have been identified so far. We, therefore, model the profiles across each chromosome arm, and each combination of two chromosome arms, separately. The modeling uncovers previously unrecognized chromosome arm-wide patterns of tumor-exclusive and platform-consistent co-occurring CNAs.
By using survival analyses of the discovery and, separately, validation set of patients, as well as only the platinum-based chemotherapy patients in the discovery and validation sets, we find, first, and validate that each of the patterns across only the chromosome arms 7p and Xq, and across only the combination of the two chromosome arms 6p+12p (but not 6p nor 12p separately), is correlated with an OV patient’s prognosis and response to platinum-based chemotherapy, is independent of stage, and together with stage makes a better predictor than stage alone. By using survival analyses of only the > 95% patients with high-grade tumors, we find and validate that these patterns are also independent of the OV tumor’s grade. We observe three groups of significantly different prognoses among the patients classified by a combination of the 6p+12p, 7p, and Xq tensor GSVD classifications, suggesting a possible implementation of the patterns in a pathology laboratory test. Second, by using segmentation of the 6p+12p, 7p, and Xq patterns, we find that the amplifications and deletions identified by these patterns include most known OV-associated CNAs that map to these chromosome arms [34], as well as several previously unreported, yet frequent focal CNAs [35–38]. Third, by using gene ontology enrichment analyses of the OV tumor mRNA expression profiles of the patients [39, 40], we find that differential mRNA expression between the patients, classified by any one of the three tensor GSVDs, is enriched in ontologies corresponding to one of three hallmarks of cancer [41]: a cell’s immortality in 6p+12p, DNA instability in 7p, and cellular immune response suppression in Xq. The differential mRNA expression of genes from these enriched ontologies that are located on any one of the chromosome arms is consistent with the CNAs across that arm. 
Genes that map to amplifications or deletions on any one pattern are overexpressed or underexpressed, respectively, in the patients whose tumor profiles are classified as highly similar to that pattern. The differential expression of all microRNAs and proteins that map to any one of the chromosome arms is also consistent with the CNAs across that arm.
Taken together, a coherent picture emerges for each of these previously unrecognized chromosome arm-wide patterns of tumor-exclusive and platform-consistent co-occurring alterations, suggesting roles for the DNA CNAs in OV pathogenesis in addition to personalized diagnosis, prognosis, and treatment. In 6p+12p, loss of the p21-encoding CDKN1A and the p38-encoding MAPK14 on 6p, and gain of KRAS on 12p, combined but not separately, can lead to transformation of human normal to tumor cells [42, 43]. These transformation-encoding CNAs, together with deletion of TNF on 6p, and amplification of RAD51AP1 and ITPR2 on 12p, are correlated with a suppression of cell cycle arrest, senescence, and apoptosis, i.e., a tumor cell’s immortality, and a patient’s shorter survival time [44–55]. Note that there already exist drugs that interact with CDKN1A, MAPK14, and RAD51AP1, even though these genes were not recognized previously as targets for OV drug therapy [56]. In 7p, RPA3 deletion and POLD2 amplification are correlated with DNA repair during replication, i.e., DNA stability, and a longer survival time [57, 58]. In Xq, PABPC5 deletion and BCAP31 amplification are correlated with a cellular immune response, and a longer survival time [59].
Mathematical Method: Tensor GSVD
Discovery Datasets are Pairs of Column-Matched but Row-Independent Tensors
We selected primary OV tumor and normal DNA copy-number profiles of a set of 249 TCGA patients [2] (Sec. 1.1 in S1 Appendix, and S1 Dataset). Each profile was measured in two replicates by the same set of two DNA microarray platforms. For each chromosome arm or combination of two chromosome arms, the structure of these tumor and normal discovery datasets $\mathcal{D}_1$ and $\mathcal{D}_2$, of $K_1$-tumor and $K_2$-normal probes × $L$-patients, i.e., arrays × $M$-platforms, is that of two third-order tensors with one-to-one mappings between the column dimensions $L$ and $M$, but different row dimensions $K_1$ and $K_2$, where $K_1, K_2 \geq LM$.
The Tensor GSVD
We define, therefore, a novel tensor GSVD that simultaneously separates the paired datasets into weighted sums of $LM$ paired “subtensors”, i.e., combinations or outer products of three patterns each: either one tumor-specific pattern of copy-number variation across the tumor probes, i.e., a “tumor arraylet” $u_{1,a}$, or the corresponding normal-specific pattern across the normal probes, i.e., the “normal arraylet” $u_{2,a}$, combined with one pattern of copy-number variation across the patients, i.e., an “x-probelet” $v_{x,b}^T$, and one pattern across the platforms, i.e., a “y-probelet” $v_{y,c}^T$, which are identical for both the tumor and normal datasets (Fig. 1, and Figs. A and B in S1 Appendix),
$$\mathcal{D}_i = \mathcal{R}_i \times_a U_i \times_b V_x \times_c V_y = \sum_{a=1}^{LM} \sum_{b=1}^{L} \sum_{c=1}^{M} \mathcal{R}_{i,abc}\, \mathcal{S}_i(a,b,c), \qquad \mathcal{S}_i(a,b,c) = u_{i,a} \otimes v_{x,b}^T \otimes v_{y,c}^T, \qquad i = 1, 2, \tag{1}$$
where $\times_a U_i$, $\times_b V_x$, and $\times_c V_y$ denote tensor-matrix multiplications, which contract the $LM$-arraylet, $L$-x-probelet, and $M$-y-probelet dimensions of the “core tensor” $\mathcal{R}_i$ with those of $U_i$, $V_x$, and $V_y$, respectively, and where $\otimes$ denotes an outer product.
Construction
Suppose that unfolding (or matricizing) both tensors $\mathcal{D}_i$ into matrices, each preserving the $K_i$-row dimension, e.g., by appending the $LM$ columns $\mathcal{D}_{i,:lm}$ of the corresponding tensor, gives two full column-rank matrices $D_i \in \mathbb{R}^{K_i \times LM}$. We obtain the column basis vectors $U_i$ from the GSVD of $D_i$ [5–13], i.e., the “row mode GSVD”,
$$D_i = U_i \Sigma_i V^T = \sum_{a=1}^{LM} \sigma_{i,a}\, u_{i,a} \otimes v_a^T, \qquad i = 1, 2. \tag{2}$$
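Suppose, similarly, that unfolding both tensors $\mathcal{D}_i$ into matrices, each preserving the $L$-x- (or $M$-y-) column dimension, e.g., by appending the $K_i M$ rows $\mathcal{D}_{i,k:m}$ (or the $K_i L$ rows $\mathcal{D}_{i,kl:}$) of the corresponding tensor, gives two full column-rank matrices $D_{ix} \in \mathbb{R}^{K_i M \times L}$ (or $D_{iy} \in \mathbb{R}^{K_i L \times M}$). We obtain the x- (or y-) row basis vectors $v_{x,b}^T$ (or $v_{y,c}^T$) from the GSVD of $D_{ix}$ (or $D_{iy}$), i.e., the x- (or y-) column mode GSVD,
$$D_{ix} = U_{ix} \Sigma_{ix} V_x^T = \sum_{b=1}^{L} \sigma_{ix,b}\, u_{ix,b} \otimes v_{x,b}^T, \qquad D_{iy} = U_{iy} \Sigma_{iy} V_y^T = \sum_{c=1}^{M} \sigma_{iy,c}\, u_{iy,c} \otimes v_{y,c}^T, \qquad i = 1, 2. \tag{3}$$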
Note that the x- and y-row basis vectors are, in general, non-orthogonal but normalized, and $V_x$ and $V_y$ are invertible. The column basis vectors are normalized and orthogonal, i.e., uncorrelated, such that $U_i^T U_i = I$.
The generalized singular values are positive, and are arranged in $\Sigma_i$, $\Sigma_{ix}$, and $\Sigma_{iy}$ in decreasing orders of the corresponding “GSVD angular distances”, i.e., decreasing orders of the ratios $\sigma_{1,a}/\sigma_{2,a}$, $\sigma_{1x,b}/\sigma_{2x,b}$, and $\sigma_{1y,c}/\sigma_{2y,c}$, respectively. We then compute the core tensors $\mathcal{R}_i$ by contracting the row-, x-, and y-column dimensions of the tensors $\mathcal{D}_i$ with those of the matrices $U_i$, $V_x^{-1}$, and $V_y^{-1}$, respectively. For real tensors, the “tensor generalized singular values” $\mathcal{R}_{i,abc}$ tabulated in the core tensors are real but not necessarily positive. Our tensor GSVD construction generalizes the GSVD to higher orders in analogy with the generalization of the singular value decomposition (SVD) by the HOSVD [25–28], and is different from other approaches to the decomposition of two tensors [29].
Existence, uniqueness and special cases
We prove that our tensor GSVD exists for two tensors of any order because it is constructed from the GSVDs of the tensors unfolded into full column-rank matrices (Lemma A in S1 Appendix). The tensor GSVD has the same uniqueness properties as the GSVD, where the column basis vectors $u_{i,a}$ and the row basis vectors $v_{x,b}^T$ and $v_{y,c}^T$ are unique, except in degenerate subspaces, defined by subsets of equal generalized singular values $\sigma_{i,a}$, $\sigma_{ix,b}$, and $\sigma_{iy,c}$, respectively, and up to phase factors of ±1, such that each vector captures both parallel and antiparallel patterns (Lemma B in S1 Appendix). The tensor GSVD of two second-order tensors reduces to the GSVD of the corresponding matrices (Corollary A in S1 Appendix). The tensor GSVD of the tensor $\mathcal{D}_1 \in \mathbb{R}^{LM \times L \times M}$, whose row mode unfolding gives the identity matrix $D_1 = I \in \mathbb{R}^{LM \times LM}$, and a tensor $\mathcal{D}_2$ of the same column dimensions reduces to the HOSVD of $\mathcal{D}_2$ (Theorem A in S1 Appendix).
Interpretation
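The significance of the subtensor $\mathcal{S}_i(a,b,c)$ in the tensor $\mathcal{D}_i$ is defined proportional to the magnitude of the corresponding tensor generalized singular value $\mathcal{R}_{i,abc}$ (Fig. C in S1 Appendix), in analogy with the HOSVD,
$$\mathcal{P}_{i,abc} = \mathcal{R}_{i,abc}^2 \Big/ \sum_{a=1}^{LM} \sum_{b=1}^{L} \sum_{c=1}^{M} \mathcal{R}_{i,abc}^2, \qquad i = 1, 2. \tag{4}$$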
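As a minimal sketch of Equation (4), assuming the significance is the squared core-tensor entry normalized by the sum of all squared entries (the HOSVD-style "fraction"), the following NumPy snippet computes the fractions for a small core tensor and locates the most significant subtensor. The function name and the example values are hypothetical and serve only as an illustration.

```python
import numpy as np

def significance_fractions(core):
    """Return P[a, b, c] = core[a, b, c]**2 / (sum of all squared entries)."""
    squared = np.asarray(core, dtype=float) ** 2
    return squared / squared.sum()

# Hypothetical 2 x 2 x 1 core tensor; entries are real but may be negative.
core = np.array([[[3.0], [-1.0]],
                 [[0.5], [0.5]]])
fractions = significance_fractions(core)

# Indices (a, b, c) of the most significant subtensor.
a, b, c = np.unravel_index(np.argmax(fractions), fractions.shape)
```

Because the entries are squared, the sign of a tensor generalized singular value does not affect its significance.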
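The significance of $\mathcal{S}_1(a,b,c)$ in $\mathcal{D}_1$ relative to that of $\mathcal{S}_2(a,b,c)$ in $\mathcal{D}_2$ is defined by the “tensor GSVD angular distance” $\Theta_{abc}$ as a function of the ratio $\mathcal{R}_{1,abc}/\mathcal{R}_{2,abc}$. This is in analogy with, e.g., the row mode GSVD angular distance $\theta_a$, which defines the significance of the column basis vector $u_{1,a}$ in the matrix $D_1$ of Equation (2) relative to that of $u_{2,a}$ in $D_2$ as a function of the ratio $\sigma_{1,a}/\sigma_{2,a}$,
$$\Theta_{abc} = \arctan(\mathcal{R}_{1,abc}/\mathcal{R}_{2,abc}) - \pi/4, \qquad \theta_a = \arctan(\sigma_{1,a}/\sigma_{2,a}) - \pi/4. \tag{5}$$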
Because the ratios of the positive generalized singular values satisfy $\sigma_{1,a}/\sigma_{2,a} \in [0, \infty)$, the row mode GSVD angular distances satisfy $\theta_a \in [-\pi/4, \pi/4]$. The maximum (or minimum) angular distance, i.e., $\theta_a = \pi/4$, which corresponds to $\sigma_{1,a}/\sigma_{2,a} \gg 1$ (or $-\pi/4$, which corresponds to $\sigma_{1,a}/\sigma_{2,a} \ll 1$), indicates that the row basis vector $v_a^T$ of Equation (2), which corresponds to the column basis vectors $u_{1,a}$ in $D_1$ and $u_{2,a}$ in $D_2$, is exclusive to $D_1$ (or $D_2$). An angular distance of $\theta_a = 0$, which corresponds to $\sigma_{1,a}/\sigma_{2,a} = 1$, indicates a row basis vector which is of equal significance in, i.e., common to both $D_1$ and $D_2$.
Thus, while the ratio $\sigma_{1,a}/\sigma_{2,a}$ indicates the significance of $u_{1,a}$ in $D_1$ relative to the significance of $u_{2,a}$ in $D_2$, this relative significance is defined, as previously described [12, 13], by the angular distance $\theta_a$, a function of the ratio $\sigma_{1,a}/\sigma_{2,a}$, which is antisymmetric in $D_1$ and $D_2$. Note also that while other functions of the ratio $\sigma_{1,a}/\sigma_{2,a}$ exist that are antisymmetric in $D_1$ and $D_2$, the angular distance $\theta_a$, which is a function of the arctangent of the ratio, i.e., $\arctan(\sigma_{1,a}/\sigma_{2,a})$, is the natural function to use, because the GSVD is related to the cosine-sine (CS) decomposition, as previously described [9], and, thus, $\sigma_{1,a}$ and $\sigma_{2,a}$ are related to the sine and the cosine functions of the angle $\theta_a$, respectively.
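The behavior of the angular distance at its limits can be illustrated numerically. The sketch below assumes the form θ = arctan(σ1/σ2) − π/4, which is consistent with the stated range [−π/4, π/4] and with θ = 0 at a ratio of 1. The function name and the input values are hypothetical.

```python
import numpy as np

def angular_distance(sigma1, sigma2):
    """theta = arctan(sigma1 / sigma2) - pi/4 for positive generalized singular values."""
    sigma1 = np.asarray(sigma1, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    return np.arctan(sigma1 / sigma2) - np.pi / 4

# Ratios much greater than 1, equal to 1, and much less than 1.
theta = angular_distance([1e6, 1.0, 1e-6], [1.0, 1.0, 1.0])

# Swapping the two datasets flips the sign (antisymmetry).
swapped_sum = angular_distance(2.0, 3.0) + angular_distance(3.0, 2.0)
```

The first entry approaches π/4 (exclusive to the first dataset), the second is 0 (common to both), and the third approaches −π/4 (exclusive to the second dataset).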
Theorem 1. The tensor GSVD angular distance equals the row mode GSVD angular distance, i.e., $\Theta_{abc} = \theta_a$.
Proof. The unfolding of $\mathcal{D}_i$ of Equation (1) into $D_i$ of Equation (2) unfolds the core tensors $\mathcal{R}_i$ of Equation (1) into matrices $R_i$, which preserve the row dimensions, i.e., the $LM$-column basis dimensions of $\mathcal{R}_i$, and gives
$$D_i = U_i R_i (V_x^T \otimes V_y^T) = U_i \Sigma_i V^T, \qquad R_i = \Sigma_i V^T (V_x^T \otimes V_y^T)^{-1}, \qquad i = 1, 2, \tag{6}$$
where $\otimes$ denotes a Kronecker product. Because $\Sigma_i$ are positive diagonal matrices, it follows that $\mathcal{R}_{1,abc}/\mathcal{R}_{2,abc} = R_{1,a}/R_{2,a} = \sigma_{1,a}/\sigma_{2,a}$. Substituting this in Equation (5) gives $\Theta_{abc} = \theta_a$. Note that the proof holds for tensors of higher-than-third order.
From this it follows that the tensor GSVD angular distance satisfies $|\Theta_{abc}| \leq \pi/4$, and that, therefore, the ratio of the tensor generalized singular values $\mathcal{R}_{1,abc}/\mathcal{R}_{2,abc} > 0$, even though $\mathcal{R}_{1,abc}$ and $\mathcal{R}_{2,abc}$ are not necessarily positive. It also follows that $\Theta_{abc} = \pm\pi/4$ indicates a subtensor exclusive to either $\mathcal{D}_1$ or $\mathcal{D}_2$, respectively, and that $\Theta_{abc} = 0$ indicates a subtensor common to both.
Note that since the generalized singular values are arranged in $\Sigma_i$ of Equation (2) in a decreasing order of the row mode GSVD angular distances $\theta_a$, the most tumor-exclusive tumor subtensors, i.e., $\mathcal{S}_1(a,b,c)$ where $a$ maximizes $\theta_a$ of Equation (5), correspond to $a = 1$, whereas the most normal-exclusive normal subtensors, i.e., $\mathcal{S}_2(a,b,c)$ where $a$ minimizes $\theta_a$, correspond to $a = LM$.
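The construction described in this section can be sketched numerically. The following NumPy code is an illustration under stated assumptions, not the authors' implementation (which is provided as a Mathematica notebook). The helper `gsvd` is a hypothetical routine that computes a matrix GSVD through a QR factorization followed by an SVD, the tensors are random stand-ins for real data, and the core tensors are obtained by contracting with $U_i$, $V_x^{-1}$, and $V_y^{-1}$.

```python
import numpy as np

def gsvd(D1, D2):
    """GSVD of two full column-rank matrices with the same number of columns N.

    Returns U1, U2 (orthonormal columns), s1, s2 (positive), and V (unit-norm
    columns) with D_i = U_i @ diag(s_i) @ V.T, ordered by decreasing s1 / s2.
    """
    K1 = D1.shape[0]
    Q, R = np.linalg.qr(np.vstack([D1, D2]))              # thin QR of the stack
    Q1, Q2 = Q[:K1], Q[K1:]
    U1, c, Wt = np.linalg.svd(Q1, full_matrices=False)    # Q1 = U1 diag(c) Wt
    Q2W = Q2 @ Wt.T                                       # orthogonal columns
    s = np.linalg.norm(Q2W, axis=0)                       # s = sqrt(1 - c**2)
    U2 = Q2W / s
    Vt = Wt @ R                                           # shared row basis
    n = np.linalg.norm(Vt, axis=1)                        # normalize its rows
    return U1, U2, c * n, s * n, (Vt / n[:, None]).T

def tensor_gsvd(T1, T2):
    """Tensor GSVD sketch for tensors of shapes (K1, L, M) and (K2, L, M)."""
    L, M = T1.shape[1:]
    # Row mode: K_i x LM unfoldings give the arraylets U_i and the ratios s1/s2.
    U1, U2, s1, s2, _ = gsvd(T1.reshape(-1, L * M), T2.reshape(-1, L * M))
    # x mode: K_i M x L unfoldings give the x-probelets (columns of Vx).
    Vx = gsvd(T1.transpose(0, 2, 1).reshape(-1, L),
              T2.transpose(0, 2, 1).reshape(-1, L))[4]
    # y mode: K_i L x M unfoldings give the y-probelets (columns of Vy).
    Vy = gsvd(T1.reshape(-1, M), T2.reshape(-1, M))[4]
    Vx_inv, Vy_inv = np.linalg.inv(Vx), np.linalg.inv(Vy)
    # Core tensors: contract each mode of T_i with U_i, Vx^-1, and Vy^-1.
    cores = [np.einsum('klm,ka,bl,cm->abc', T, U, Vx_inv, Vy_inv)
             for T, U in ((T1, U1), (T2, U2))]
    return (U1, U2), (s1, s2), Vx, Vy, cores

# Random stand-in data with K1, K2 >= L * M (hypothetical sizes).
rng = np.random.default_rng(0)
K1, K2, L, M = 40, 50, 5, 2
T1 = rng.standard_normal((K1, L, M))
T2 = rng.standard_normal((K2, L, M))
(U1, U2), (s1, s2), Vx, Vy, (R1, R2) = tensor_gsvd(T1, T2)

# Rebuild both tensors from the core tensors and the three sets of patterns.
T1_rebuilt = np.einsum('abc,ka,lb,mc->klm', R1, U1, Vx, Vy)
T2_rebuilt = np.einsum('abc,ka,lb,mc->klm', R2, U2, Vx, Vy)
```

Under these assumptions the rebuilt tensors should match the inputs to numerical precision, and the ratio of the core-tensor entries `R1[a, b, c] / R2[a, b, c]` should equal `s1[a] / s2[a]` for every `b` and `c`, which is the content of Theorem 1.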
Discovery and Validation of CNAs Predicting OV Survival
We compute the tensor GSVD of the tumor and normal discovery datasets for each chromosome arm and each combination of two chromosome arms, separately (S1 Mathematica Notebook). For each arm or arms we examine the most significant subtensor in the tumor dataset, i.e., $\mathcal{S}_1(a,b,c)$, where $a$, $b$, and $c$ maximize $\mathcal{P}_{1,abc}$ of Equation (4).
First, we require the subtensor to be tumor-exclusive and platform-consistent, i.e., to include the tumor arraylet $u_{1,a}$ that is the most exclusive to the tumor dataset, i.e., $u_{1,1}$, as well as a y-probelet of consistent, i.e., approximately equal, copy numbers in both platforms. Second, we require the subtensor to be correlated with an OV patient’s prognosis in the discovery set of patients, i.e., include an x-probelet that classifies the discovery set of patients into two groups of high (> 0.5 standardized median absolute deviation, i.e., sMAD, from the median) and low coefficients, of significantly (log-rank test P-value < 0.05) and robustly (throughout the range of ±0.1 sMAD around the cutoff) different prognoses (Fig. 2). Third, we require the subtensor to be correlated with prognosis in the validation set of patients, i.e., include an arraylet that classifies the validation set of patients into two groups of high and low Spearman’s rank correlation coefficients of significantly different prognoses, consistent with the x-probelet’s classification of the discovery set of patients (Fig. 3, and Sec. 1.3 in S1 Appendix). Note that the validation set includes 148 TCGA patients, mutually exclusive of the discovery set, with primary OV tumor profiles measured by at least one of the two DNA microarray platforms that were used to measure the discovery datasets (S2 Dataset).
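The high/low classification by x-probelet coefficient can be sketched as follows. This is a hedged illustration only: it assumes that sMAD is the median absolute deviation scaled by 1.4826 (a common convention that the text does not state), that "high" means more than 0.5 sMAD above the median, and it omits the log-rank test. The function name and coefficient values are hypothetical.

```python
import numpy as np

def classify_high(coefficients, cutoff=0.5, scale=1.4826):
    """Flag coefficients lying more than `cutoff` sMAD above the median.

    Assumption: sMAD = scale * median(|x - median(x)|).
    """
    x = np.asarray(coefficients, dtype=float)
    median = np.median(x)
    smad = scale * np.median(np.abs(x - median))
    return (x - median) / smad > cutoff

# Hypothetical x-probelet coefficients for six patients.
coefficients = [-2.0, -1.0, 0.0, 1.0, 2.0, 5.0]
high = classify_high(coefficients)

# Robustness check: the classification should not change for cutoffs
# within +/- 0.1 sMAD of the 0.5 cutoff.
robust = all(
    np.array_equal(classify_high(coefficients, cutoff=c), high)
    for c in (0.4, 0.5, 0.6)
)
```

In a full analysis, the two resulting patient groups would then be compared by a log-rank test at each cutoff in the ±0.1 sMAD range.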
We find that each of the tensor GSVDs of only the chromosome arms 7p and Xq, and only the combination of the two chromosome arms 6p+12p (but not 6p nor 12p separately), uncovers a pattern of tumor-exclusive and platform-consistent co-occurring CNAs that is correlated with an OV patient’s prognosis in the discovery and, separately, validation set of patients.
Biological Results
Independent Chromosome Arm-Wide Predictors of OV Survival and Response to Platinum-Based Chemotherapy
To date, the best predictor of OV survival has remained the tumor’s stage at diagnosis [31] (Sec. 2.1, and Figs. D and E in S1 Appendix). Additional indicators, such as the residual disease after surgery, the outcome of subsequent therapy, and the neoplasm status, which is the last known status of the disease, are determined during treatment. No diagnostic exists that distinguishes between platinum-based chemotherapy-resistant and -sensitive tumors before the treatment [32, 33].
We find and validate, by using survival analyses of the discovery and, separately, validation set of patients, as well as only the 88% and 95% platinum-based chemotherapy patients in the discovery and validation sets, respectively (Fig. F in S1 Appendix), that each of the patterns, across 6p+12p, 7p, and Xq, is correlated with an OV patient’s prognosis and response to platinum-based chemotherapy, is independent of stage, and together with stage makes a better predictor than stage alone.
We also find and validate that each of these three tensor GSVDs is independent of each of the additional standard indicators (Tables A and B in S1 Appendix). For example, survival analyses of the discovery set classified by the 6p+12p tensor GSVD into high and low x-probelet coefficients, and by pathology at diagnosis into tumor stages I-II and III-IV, give the bivariate Cox hazard ratios of 1.5 and 4.0, which are similar to the corresponding univariate ratios of 1.7 and 4.4, respectively [18]. Similarly, survival analyses of the validation set classified by the 6p+12p tensor GSVD into high and low arraylet correlation coefficients, and by pathology at diagnosis into tumor stages III and IV, give the bivariate Cox hazard ratios of 1.9 and 1.8, which are the same as the corresponding univariate ratios (Fig. G in S1 Appendix). This means that the 6p+12p tensor GSVD and stage are independent predictors of survival. Therefore, combined with any one of the standard indicators, each of the three tensor GSVDs makes a better predictor than the standard indicator alone (Figs. H and I in S1 Appendix). For example, the Kaplan-Meier (KM) median survival time difference of 61 months among the discovery set of patients classified by both the 6p+12p tensor GSVD and stage, is about 85% and more than two years greater than the 33 month difference between the patients classified by stage alone [19]. The KM median survival difference of 34 months among the validation set of patients classified by both the 6p+12p tensor GSVD and stage, is about 62% and more than one year greater than the 21 month difference between the patients classified by stage alone.
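The percentages quoted above follow from simple arithmetic on the reported KM median survival time differences. A minimal sketch (the function name is illustrative) reproduces them:

```python
def survival_gain(combined_months, stage_only_months):
    """Return the extra separation in months and as a percentage of stage alone."""
    extra = combined_months - stage_only_months
    return extra, 100.0 * extra / stage_only_months

# Discovery set: 61 months (tensor GSVD and stage) vs. 33 months (stage alone).
discovery_extra, discovery_percent = survival_gain(61, 33)
# Validation set: 34 months (tensor GSVD and stage) vs. 21 months (stage alone).
validation_extra, validation_percent = survival_gain(34, 21)
```

The discovery set gains 28 months (about 85%, more than two years) and the validation set gains 13 months (about 62%, more than one year), in agreement with the text.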
Note that while the discovery set of patients reflects the general OV patient population, with approximately 5%, 7%, 76%, and 12% of the patients diagnosed at stages I, II, III, and IV, respectively, the validation set reflects the high-stage OV patient population, with approximately 20% and 80% of the patients diagnosed at stages III and IV, respectively. The 6p+12p, 7p, and Xq tensor GSVDs, therefore, predict survival both in the general as well as in the high-stage OV patient population. Note also that the discovery and validation sets each include mostly, i.e., > 95% high-grade, i.e., grades 2 and higher tumors. Tumor grade does not correlate with survival in either the discovery or the validation set of patients. Survival analyses of only the > 95% patients with high-grade tumors in the discovery and, separately, validation set give qualitatively the same and quantitatively similar results to those of the analyses of 100% of the patients in each set, respectively. The 6p+12p, 7p, and Xq tensor GSVDs, therefore, predict survival in the high-grade OV patient population, and are independent of the OV tumor’s grade as well as the molecular distinctions between high- and low-grade OV tumors [30].
We observe three groups of significantly different prognoses among the discovery and, separately, validation set of patients, as well as only the platinum-based chemotherapy patients, classified by a combination of the three, i.e., 6p+12p, 7p, and Xq, tensor GSVD classifications, each of which is binomial (Fig. 4). In group A, a combination of a low 6p+12p x-probelet coefficient or arraylet correlation, and high 7p and Xq x-probelet coefficients or arraylet correlations is indicative of a patient’s significantly longer survival time and better response to platinum-based chemotherapy. In group B, the three combinations where just one of the three binomial classifications differs from that of group A, indicate shorter survival time and worse response to chemotherapy than those of group A. In group C, the four combinations where at least two of the three binomial classifications differ from that of group A, indicate shorter survival time and worse response to chemotherapy than those of group B as well as group A. For example, the KM median survival times of the discovery set of patients classified into groups A, B, and C are 86, 52, and 36 months, such that the median survival time of group A is more than four years greater than, and more than twice that of group C.
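The assignment of a patient to group A, B, or C from the three binary classifications can be written as a short rule. The sketch below is illustrative only; the function and argument names are hypothetical, and each argument is `True` for a high x-probelet coefficient or arraylet correlation.

```python
from collections import Counter
from itertools import product

def prognosis_group(high_6p12p, high_7p, high_xq):
    """Group A is low 6p+12p, high 7p, and high Xq; count departures from it."""
    differences = int(high_6p12p) + int(not high_7p) + int(not high_xq)
    if differences == 0:
        return "A"
    if differences == 1:
        return "B"
    return "C"

# Enumerate all eight combinations of the three binary classifications.
group_sizes = Counter(
    prognosis_group(*combination)
    for combination in product([False, True], repeat=3)
)
```

The eight possible combinations split into one combination for group A, three for group B, and four for group C, matching the description above.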
This suggests a possible implementation of the 6p+12p, 7p, and Xq patterns in a pathology laboratory test, where a patient’s survival and response to platinum-based chemotherapy is predicted based upon the combination of the correlations of the OV tumor’s DNA copy-number profile with the 6p+12p, 7p, and Xq patterns.
Novel Frequent Focal CNAs Indicating Survival
OV tumors exhibit significant CNA variation among them, much more so than, e.g., GBM brain tumors [2, 13]. Very few frequently occurring OV CNAs have been identified to date.
We find, by using segmentation [20, 21], that the three tensor GSVD arraylets include most known OV-associated CNAs that map to the corresponding chromosome arms, and several previously unreported yet frequent CNAs in > 23% of the patients. For example, the 6p+12p arraylet includes two segments corresponding to the only known OV focal CNAs that map to 6p+12p, 7p, or Xq (Sec. 2.2 in S1 Appendix). One, a deletion (6p11.2), overlaps the 3’ end unique to isoform a of the DNA primase polypeptide 2-encoding PRIM2 [2]. The other, an amplification (12p12.1-p11.23), contains several genes, including the Kirsten rat sarcoma viral oncogene homolog KRAS, one of three human Ras genes, and the 5’ ends of isoforms b and d of the SRY (sex determining region Y)-box 5-encoding SOX5 [34], and is significantly (log-rank test P-value < 0.05, and KM median survival time difference ≥ 12 months) correlated with OV survival (S3 Dataset).
We also find that the three arraylet patterns include novel frequent focal CNAs (segments < 125 probes). Among these, four amplifications and two deletions are significantly correlated with OV survival (Fig. J in S1 Appendix). The amplifications flank the segment that contains KRAS. Two consecutive segments (12p12.1) contain the 5’ ends of isoforms a and e of SOX5, and exons 5 and 6, the first exons that are common to isoforms a, b, d, and e of SOX5 [35]. Two other consecutive segments (12p11.23) contain the inositol 1,4,5-trisphosphate receptor type 2-encoding ITPR2, and the asunder spermatogenesis regulator-encoding ASUN. ASUN was discovered in a screen of expressed sequence tags on 12p11-p12, whose DNA amplification correlated with mRNA overexpression in four human testicular seminomas and one ovarian papillary serous adenocarcinoma cell line, exemplifying human germ cell tumors [36]. ASUN and its homologs are essential for nuclear division after DNA replication in the HeLa human cervical cancer cell line, the frog, and the fly [37]. One deletion (7p22.1-p21.3) contains the replication protein A3-encoding RPA3. The other (Xq21.31) contains the cytoplasmic poly(A)-binding protein 5-encoding PABPC5, and the sequence tag site DX214 adjacent to translocation breakpoints observed in premature ovarian failure [38].
Possible Roles in OV Pathogenesis
We find, by using gene ontology enrichment analyses of the OV tumor mRNA expression profiles of the patients [39, 40], that differential mRNA expression between the patients, classified by any one of the three tensor GSVDs, is enriched in ontologies corresponding to one of three hallmarks of cancer [41]: cell immortality in 6p+12p, DNA instability in 7p, and cellular immune response suppression in Xq.
The differential mRNA expression of genes from these enriched ontologies that are located on any one of the chromosome arms is consistent with the CNAs across that arm (Fig. K in S1 Appendix, and S4 Dataset). Genes that map to amplifications or deletions on any one arraylet pattern are overexpressed or underexpressed, respectively, in the patients whose tumor profiles are classified, by the corresponding tensor GSVD, as highly similar to that pattern, i.e., patients of high x-probelet coefficients or arraylet correlations. The differential expression of all microRNAs and proteins that map to any one of the chromosome arms is also consistent with the CNAs across that arm (Sec. 2.3, and Figs. L and M in S1 Appendix, and S5 and S6 Datasets). A coherent picture emerges for each pattern, suggesting roles for the CNAs in OV pathogenesis in addition to personalized diagnosis, prognosis, and treatment.
6p+12p. A cell’s transformation and immortality are correlated with a patient’s shorter survival
The genes that are significantly (Mann-Whitney-Wilcoxon P-values < 0.05) differentially expressed between the 6p+12p tensor GSVD classes, i.e., in the patient group of high 6p+12p x-probelet coefficient or arraylet correlation, relative to the patient group of low coefficient or correlation, are enriched (hypergeometric P-values < $10^{-3}$) in the ontologies of cellular response to ionizing radiation (GO:0071479), and major histocompatibility (MHC) protein complex (GO:0042611). Most of the GO:0071479 genes are underexpressed, including the p21 cyclin-dependent kinase inhibitor-encoding CDKN1A, and the p38 mitogen-activated protein kinase-encoding MAPK14, which map to a deletion > 45 Mbp on the telomeric part of 6p (6p25.3-p21.1). Also underexpressed is p38, the protein encoded by MAPK14. All GO:0042611 genes, including the tumor necrosis factor-encoding TNF, are underexpressed, and map to the same deletion. The one microRNA that is significantly differentially expressed between the 6p+12p tensor GSVD classes, and maps to the same deletion, is the splicing-dependent microRNA miR-877*, which is encoded by the 13th intron of the ATP-binding cassette subfamily F member 1-encoding gene ABCF1 [44]. Both miR-877* and ABCF1 are consistently underexpressed.
One of only two GO:0071479 overexpressed genes is the RAD51-associated protein 1-encoding RAD51AP1, which maps to an amplification > 9 Mbp on the telomeric part of 12p (12p13.33-p13.31) that is significantly correlated with OV survival. All four microRNAs that are differentially expressed between the 6p+12p tensor GSVD classes, and map to the same amplification, miR-200c, miR-200c*, miR-141, and miR-141*, are consistently overexpressed. The second protein that is significantly differentially expressed between the 6p+12p tensor GSVD classes is p27. Consistently, the cyclin-dependent kinase inhibitor CDKN1B, which encodes p27, maps to a 4.5 Mbp amplification (12p13.2-p12.3) that is significantly correlated with OV survival, and its mRNA is overexpressed. The mRNA encoded by KRAS is also overexpressed.
Note that while the 6p+12p pattern of CNAs is correlated with survival in the discovery and, separately, validation sets, neither the 6p nor the 12p pattern alone is correlated with survival. Indeed, experiments studying the conditions for the transformation of human normal to tumor cells indicate that cells in which both p21 and p38 are inactive are susceptible to Ras-mediated transformation [42, 43]. However, the activation of Ras alone induces tumor-suppressing cellular senescence via the activities of either p21 or p38. The 6p+12p pattern, therefore, which includes the loss of the p21-encoding CDKN1A and the p38-encoding MAPK14 on 6p, and the gain of KRAS on 12p, encodes for cellular conditions that combined but not separately can lead to transformation.
In addition, p21 and p38 are necessary for p53-mediated cell cycle arrest [45] and apoptosis [46], respectively, in response to DNA damage. Overexpression of the p21-encoding CDKN1A is correlated with a low malignant potential of an ovarian tumor [47]. RAD51AP1 overexpression disrupts cell cycle arrest and apoptosis, can lead to cellular resistance to DNA-damaging cancer therapies, such as platinum-based chemotherapy, and may increase DNA instability [48]. TNF-induced apoptosis is correlated with downregulation of ITPR2 [49]. Overexpression of miR-200c and miR-141, both of which putatively target the BRCA1 associated protein-1 oncosuppressor-encoding BAP1, is correlated with OV tumor growth, dedifferentiation, and invasiveness [50, 51]. Overexpression of the CDKN1B-encoded p27, which can promote cellular migration [52] and even proliferation [53], is correlated with an OV patient’s poor prognosis [54, 55].
Taken together, previously unrecognized co-occurring deletion of CDKN1A and MAPK14 on 6p and amplification of KRAS on 12p, which encode for human cell transformation, together with deletion of TNF on 6p, and amplification of RAD51AP1 and ITPR2 on 12p, are correlated with a suppression of cell cycle arrest, senescence, and apoptosis, i.e., a tumor cell’s immortality, and a patient’s shorter survival time. Note that there already exist drugs that interact with CDKN1A, MAPK14, and RAD51AP1, even though these genes were not recognized previously as targets for OV drug therapy [56].
7p. A cell’s DNA stability is correlated with a longer survival
The genes that are significantly differentially expressed between the 7p tensor GSVD classes are enriched (hypergeometric P-value < 10⁻¹⁰) in the ontology of DNA strand elongation involved in DNA replication (GO:0006271). Most of these genes are overexpressed, including the DNA polymerase delta subunit 2-encoding POLD2 that is essential for DNA replication and repair, which maps to an amplification > 17 Mbp on the centromeric part of 7p (7p14.1-p11.2). Only two genes are underexpressed: RPA3 on 7p and the DNA ligase IV-encoding LIG4 on 13q. The interaction of p53 with the RPA3-encoded protein mediates suppression of homologous recombination (HR), the preferred cellular mechanism for DNA double-strand break (DSB) repair during replication [57]. LIG4 is essential for DSB repair via the more error-prone nonhomologous end joining pathway [58]. HR defects are thought to facilitate the significant CNA heterogeneity among OV tumors [2].
Taken together, previously unrecognized co-occurring deletion and underexpression of RPA3, and amplification and overexpression of POLD2 on 7p are correlated with DNA DSB repair via HR during replication, i.e., DNA stability, and a longer survival time.
Xq. Cellular immune response is correlated with a longer survival
The genes that are differentially expressed between the Xq tensor GSVD classes are enriched (hypergeometric P-value < 10⁻⁶) in the ontology of antigen processing and presentation of peptide antigen (GO:0048002). Most of these genes are overexpressed, including the B-cell receptor-associated protein 31-encoding BCAP31, which maps to an amplification > 11 Mbp on the telomeric part of Xq (Xq27.3-q28). All three microRNAs that are differentially expressed between the Xq tensor GSVD classes, and map to the same amplification, miR-888, miR-224, and miR-452, together with the gamma-aminobutyric acid (GABA) A receptor epsilon-encoding GABRE, which hosts mir-224 and mir-452 in its introns, are consistently overexpressed. Underexpression of miR-224 was implicated in OV pathogenesis [50]. PABPC5, which maps to a focal deletion on Xq, is suppressed upon viral infection [59].
Taken together, previously unrecognized co-occurring deletion of PABPC5, and amplification and overexpression of BCAP31 on Xq are correlated with a cellular immune response, and a longer survival time.
Discussion
We defined a novel tensor GSVD, an exact simultaneous decomposition of two datasets, arranged in two higher-than-second-order tensors of matched column dimensions but independent row dimensions. We showed that the mathematical properties of the tensor GSVD allow interpreting its variables and operations in terms of what is similar as well as what is dissimilar between the datasets, e.g., in their biomedical reality. We demonstrated the tensor GSVD in comparative modeling of patient- and platform-matched but probe-independent OV tumor and normal DNA copy-number profiles from TCGA. The modeling resulted in new insights into the poorly understood relations between an OV tumor’s genome and a patient’s survival phenotype. Three previously unrecognized chromosome arm-wide patterns of tumor-exclusive and platform-consistent co-occurring alterations were uncovered, across 6p+12p, 7p, and Xq. These patterns are correlated with an OV patient’s survival and response to platinum-based chemotherapy, have possible roles in OV pathogenesis, and could possibly be implemented in a pathology laboratory test for personalized OV diagnosis, prognosis, and treatment.
Note that unlike previous analyses of the TCGA OV DNA copy-number data, notably by TCGA [2], our analyses were not limited to the 22 human autosomal chromosomes, and include the X chromosome. This is because the tensor GSVD, like the GSVD, comparatively—based upon the structure of the data—separates the matched datasets into uncorrelated, i.e., orthogonal patterns across the tumor and normal probes. Patterns of copy-number variation across the tumor probes that occur in the normal human genome, and are common to the tumor and normal datasets, such as the female-specific X chromosome amplification, are orthogonal to, and, therefore, are separated from the patterns that are exclusive to the tumor dataset. For example, the GSVD comparative modeling of patient-matched GBM tumor and normal copy-number profiles separated the prognosis-correlated GBM tumor-exclusive pattern from the female-specific X chromosome amplification as well as from experimental artifacts (or batch effects) due to experimental variations in, e.g., tissue batch, genomic center, hybridization date, and scanner, without a-priori knowledge of these variations.
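The separation described above can be illustrated with the matrix GSVD [7, 8], to which the text compares the tensor GSVD. The following is a minimal sketch and not the implementation used in this study. It assumes that each matrix has at least as many rows as columns, that the appended matrix has full column rank, and that no generalized singular value is zero. The function names and the QR-then-SVD construction are illustrative choices, and the angular distance follows the convention of the earlier matrix GSVD studies [12, 13].

```python
import numpy as np


def gsvd(d1, d2):
    """Matrix GSVD of two column-matched matrices.

    Returns u1, c, u2, s, vt such that
        d1 = u1 @ diag(c) @ vt  and  d2 = u2 @ diag(s) @ vt,
    with orthonormal columns in u1 and u2, and c**2 + s**2 = 1.
    Assumes d1.shape[0] >= n, d2.shape[0] >= n, and full column rank.
    """
    m1 = d1.shape[0]
    # Reduced QR of the appended matrix; q has orthonormal columns.
    q, r = np.linalg.qr(np.vstack([d1, d2]))
    q1, q2 = q[:m1], q[m1:]
    # The SVD of the top block gives the first set of row patterns.
    u1, c, wt = np.linalg.svd(q1, full_matrices=False)
    # The columns of q2 @ w are mutually orthogonal with norms s.
    b = q2 @ wt.T
    s = np.linalg.norm(b, axis=0)
    u2 = b / s
    vt = wt @ r  # shared, generally non-orthogonal, column patterns
    return u1, c, u2, s, vt


def angular_distance(c, s):
    """Relative significance of each pattern in d1 versus d2,
    between -pi/4 (exclusive to d2) and +pi/4 (exclusive to d1)."""
    return np.arctan(c / s) - np.pi / 4
```

Under these assumptions, a pattern with an angular distance near +π/4 is almost exclusive to the first (e.g., tumor) matrix, one near −π/4 is almost exclusive to the second (e.g., normal) matrix, and one near zero is common to both.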
Unlike recent approaches to the integrative modeling of different types of large-scale molecular biological profiles from the same set of patients, notably clustering [60, 61], our comparative modeling was not limited to tumor profiles, and included also patient- and platform-matched normal DNA copy-number profiles. This is because the tensor GSVD, like the GSVD, finds not just the similarities but, at the same time, also the dissimilarities among the profiles without making any assumptions, except for the structure of the data: two third-order tensors, of matched columns that correspond to the same sets of patients and platforms, and independent rows that correspond to the probes in either the tumor or the normal dataset. The patients, platforms, tumor and normal probes, as well as the tissue types, each represent a degree of freedom. When the tensors are unfolded into two matrices or appended into a single tensor (or even unfolded and appended into a single matrix), some of the degrees of freedom are lost and much of the information in the datasets might also be lost. For example, SVD of the GBM tumor and normal profiles appended into a single matrix, while it is related to the GSVD of the data, would not separate the tumor dataset into patterns across the tumor probes that are orthogonal.
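The last point can be checked numerically. The sketch below uses random data of hypothetical sizes, not the profiles analyzed here. It unfolds two third-order tensors into matrices, which merges the patient and platform indices into one column index, appends them, and computes the SVD of the result. The function name and the tensor shapes are assumptions made for illustration.

```python
import numpy as np


def appended_svd_grams(tumor, normal):
    """Gram matrices of the row patterns from the SVD of the appended,
    unfolded tensors: (all rows, tumor rows only)."""
    m1 = tumor.shape[0]
    # Unfold probes x patients x platforms into probes x (patients*platforms).
    stacked = np.vstack(
        [tumor.reshape(m1, -1), normal.reshape(normal.shape[0], -1)]
    )
    u, _, _ = np.linalg.svd(stacked, full_matrices=False)
    u_tumor = u[:m1]  # part of each pattern that spans the tumor probes
    return u.T @ u, u_tumor.T @ u_tumor


rng = np.random.default_rng(0)
tumor = rng.standard_normal((40, 5, 2))   # tumor probes, patients, platforms
normal = rng.standard_normal((30, 5, 2))  # normal probes, patients, platforms
gram_full, gram_tumor = appended_svd_grams(tumor, normal)
off_diagonal = gram_tumor - np.diag(np.diag(gram_tumor))
print(np.allclose(gram_full, np.eye(10)), np.abs(off_diagonal).max())
```

The patterns are orthonormal across all of the appended rows, so `gram_full` is the identity, but their restriction to the tumor probes has nonzero off-diagonal overlaps in general. In a GSVD, by contrast, the patterns across the tumor probes are orthonormal by construction.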
Additional possible applications of the tensor GSVD in personalized medicine include comparative modeling of two patient- and tissue-matched datasets, each corresponding to (i) a set of large-scale molecular biological profiles, e.g., DNA copy numbers, acquired by a high-throughput technology, e.g., DNA microarrays; (ii) a set of biomedical images or signals; or (iii) a set of cellular pathological observations, e.g., a tumor’s stage. Such tensor GSVD comparative models can uncover variations across the patients and tissues that are common to, and possibly causally coordinated between, the two aspects of the disease. In clinical settings, such tensor GSVD comparative models can determine an individual patient’s medical status in relation to all the other patients in a set, and inform the patient’s diagnosis, prognosis, and treatment.
Acknowledgments
We thank RA Horn for thoughtful discussions of matrix analysis in general, and the tensor GSVD in particular. We thank DDL Bowtell and MM Janát-Amsbury for useful notes on OV in general, and the molecular distinctions between high- and low-grade OV tumors in particular. We also thank RA Weinberg for helpful comments on the hallmarks of cancer in general, and the transformation of human normal to tumor cells in particular.
Data Availability
All relevant data are within the paper and its Supporting Information files.
Funding Statement
This research was supported by the Utah Science, Technology, and Research (USTAR) Initiative, National Human Genome Research Institute (NHGRI) R01 Grant HG-004302 and National Science Foundation (NSF) CAREER Award DMS-0847173 (to OA). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature. 2008;455: 1061–1068. doi:10.1038/nature07385
- 2. Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature. 2011;474: 609–615. doi:10.1038/nature10166
- 3. Ponnapalli SP, Golub GH, Alter O. A novel higher-order generalized singular value decomposition for comparative analysis of multiple genome-scale datasets. Stanford University and Yahoo! Research Workshop on Algorithms for Modern Massive Datasets (MMDS) (Stanford, CA). 2006; June 21–24.
- 4. Ponnapalli SP, Saunders MA, Van Loan CF, Alter O. A higher-order generalized singular value decomposition for comparison of global mRNA expression from multiple organisms. PLoS One. 2011;6: e28072. doi:10.1371/journal.pone.0028072
- 5. Golub GH, Van Loan CF. Matrix Computations. 4th ed. Baltimore, MD: Johns Hopkins University Press; 2012.
- 6. Horn RA, Johnson CR. Matrix Analysis. 2nd ed. Cambridge, UK: Cambridge University Press; 2012.
- 7. Van Loan CF. Generalizing the singular value decomposition. SIAM J Numer Anal. 1976;13: 76–83. doi:10.1137/0713009
- 8. Paige CC, Saunders MA. Towards a generalized singular value decomposition. SIAM J Numer Anal. 1981;18: 398–405. doi:10.1137/0718026
- 9. Van Loan CF. Computing the CS and the generalized singular value decompositions. Numer Math. 1985;46: 479–491. doi:10.1007/BF01389653
- 10. Bai Z, Demmel JW. Computing the generalized singular value decomposition. SIAM J Sci Comput. 1993;14: 1464–1486. doi:10.1137/0914085
- 11. Friedland S. A new approach to generalized singular value decomposition. SIAM J Matrix Anal Appl. 2005;27: 434–444. doi:10.1137/S0895479804439791
- 12. Alter O, Brown PO, Botstein D. Generalized singular value decomposition for comparative analysis of genome-scale expression data sets of two different organisms. Proc Natl Acad Sci USA. 2003;100: 3351–3356. doi:10.1073/pnas.0530258100
- 13. Lee CH, Alpert BO, Sankaranarayanan P, Alter O. GSVD comparison of patient-matched normal and tumor aCGH profiles reveals global copy-number alterations predicting glioblastoma multiforme survival. PLoS One. 2012;7: e30098. doi:10.1371/journal.pone.0030098
- 14. Wiltshire RN, Rasheed BK, Friedman HS, Friedman AH, Bigner SH. Comparative genetic patterns of glioblastoma multiforme: potential diagnostic tool for tumor classification. Neuro Oncol. 2000;2: 164–173. doi:10.1093/neuonc/2.3.164
- 15. Misra A, Pellarin M, Nigro J, Smirnov I, Moore D, Lamborn KR, et al. Array comparative genomic hybridization identifies genetic subgroups in grade 4 human astrocytoma. Clin Cancer Res. 2005;11: 2907–2918. doi:10.1158/1078-0432.CCR-04-0708
- 16. Curran WJ Jr, Scott CB, Horton J, Nelson JS, Weinstein AS, Fischbach AJ, et al. Recursive partitioning analysis of prognostic factors in three Radiation Therapy Oncology Group malignant glioma trials. J Natl Cancer Inst. 1993;85: 704–710. doi:10.1093/jnci/85.9.704
- 17. Gorlia T, van den Bent MJ, Hegi ME, Mirimanoff RO, Weller M, Cairncross JG, et al. Nomograms for predicting survival of patients with newly diagnosed glioblastoma: prognostic factor analysis of EORTC and NCIC trial 26981–22981/CE.3. Lancet Oncol. 2008;9: 29–38. doi:10.1016/S1470-2045(07)70384-4
- 18. Cox DR. Regression models and life-tables. J Roy Statist Soc B. 1972;34: 187–220.
- 19. Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Amer Statist Assn. 1958;53: 457–481. doi:10.1080/01621459.1958.10501452
- 20. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, et al. The human genome browser at UCSC. Genome Res. 2002;12: 996–1006. doi:10.1101/gr.229102
- 21. Olshen AB, Venkatraman ES, Lucito R, Wigler M. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics. 2004;5: 557–572. doi:10.1093/biostatistics/kxh008
- 22. Hopkins AL, Groom CR. The druggable genome. Nat Rev Drug Discov. 2002;1: 727–730. doi:10.1038/nrd892
- 23. Silljé HH, Takahashi K, Tanaka K, Van Houwe G, Nigg EA. Mammalian homologues of the plant Tousled gene code for cell-cycle-regulated kinases with maximal activities linked to ongoing DNA replication. EMBO J. 1999;18: 5691–5702. doi:10.1093/emboj/18.20.5691
- 24. Pellegrini M, Cheng JC, Voutila J, Judelson D, Taylor J, Nelson SF, et al. Expression profile of CREB knockdown in myeloid leukemia cells. BMC Cancer. 2008;8: 264. doi:10.1186/1471-2407-8-264
- 25. De Lathauwer L, De Moor B, Vandewalle J. A multilinear singular value decomposition. SIAM J Matrix Anal Appl. 2000;21: 1253–1278. doi:10.1137/S0895479896305696
- 26. Omberg L, Golub GH, Alter O. A tensor higher-order singular value decomposition for integrative analysis of DNA microarray data from different studies. Proc Natl Acad Sci USA. 2007;104: 18371–18376. doi:10.1073/pnas.0709146104
- 27. Omberg L, Meyerson JR, Kobayashi K, Drury LS, Diffley JFX, Alter O. Global effects of DNA replication and DNA replication origin activity on eukaryotic gene expression. Mol Syst Biol. 2009;5: 312. doi:10.1038/msb.2009.70
- 28. Kolda TG, Bader BW. Tensor decompositions and applications. SIAM Rev. 2009;51: 455–500. doi:10.1137/07070111X
- 29. Vandewalle J, De Lathauwer L, Comon P. The generalized higher order singular value decomposition and the oriented signal-to-signal ratios of pairs of signal tensors and their use in signal processing. In: Proc ECCTD’03, European Conf on Circuit Theory and Design; 2003. pp. I-389–I-392.
- 30. Ayhan A, Kurman RJ, Yemelyanova A, Vang R, Logani S, Seidman JD, et al. Defining the cut point between low-grade and high-grade ovarian serous carcinomas: a clinicopathologic and molecular genetic analysis. Am J Surg Pathol. 2009;33: 1220–1224. doi:10.1097/PAS.0b013e3181a24354
- 31. Prisco MG, Zannoni GF, De Stefano I, Vellone VG, Tortorella L, Fagotti A, et al. Prognostic role of metastasis tumor antigen 1 in patients with ovarian cancer: a clinical study. Hum Pathol. 2012;43: 282–288. doi:10.1016/j.humpath.2011.05.002
- 32. Harries M, Gore M. Chemotherapy for epithelial ovarian cancer: treatment at first diagnosis. Lancet Oncol. 2002;3: 529–536. doi:10.1016/S1470-2045(02)00846-X
- 33. Pujade-Lauraine E, Hilpert F, Weber B, Reuss A, Poveda A, Kristensen G, et al. Bevacizumab combined with chemotherapy for platinum-resistant recurrent ovarian cancer: The AURELIA open-label randomized phase III trial. J Clin Oncol. 2014;32: 1302–1308. doi:10.1200/JCO.2013.51.4489
- 34. Engler DA, Gupta S, Growdon WB, Drapkin RI, Nitta M, Sergent PA, et al. Genome wide DNA copy number analysis of serous type ovarian carcinomas identifies genetic markers predictive of clinical outcome. PLoS One. 2012;7: e30996. doi:10.1371/journal.pone.0030996
- 35. Ikeda T, Zhang J, Chano T, Mabuchi A, Fukuda A, Kawaguchi H, et al. Identification and characterization of the human long form of Sox5 (L-SOX5) gene. Gene. 2002;298: 59–68. doi:10.1016/S0378-1119(02)00927-7
- 36. Bourdon V, Naef F, Rao PH, Reuter V, Mok SC, Bosl GJ, et al. Genomic and expression analysis of the 12p11–p12 amplicon using EST arrays identifies two novel amplified and overexpressed genes. Cancer Res. 2002;62: 6218–6223.
- 37. Lee LA, Lee E, Anderson MA, Vardy L, Tahinci E, Ali SM, et al. Drosophila genome-scale screen for PAN GU kinase substrates identifies Mat89Bb as a cell cycle regulator. Dev Cell. 2005;8: 435–442. doi:10.1016/j.devcel.2004.12.008
- 38. Blanco P, Sargent CA, Boucher CA, Howell G, Ross M, Affara NA. A novel poly(A)-binding protein gene (PABPC5) maps to an X-specific subinterval in the Xq21.3/Yp11.2 homology block of the human sex chromosomes. Genomics. 2001;74: 1–11. doi:10.1006/geno.2001.6530
- 39. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. Nat Genet. 2000;25: 25–29. doi:10.1038/75556
- 40. Eden E, Navon R, Steinfeld I, Lipson D, Yakhini Z. GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists. BMC Bioinformatics. 2009;10: 48. doi:10.1186/1471-2105-10-48
- 41. Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell. 2011;144: 646–674. doi:10.1016/j.cell.2011.02.013
- 42. Karnoub AE, Weinberg RA. Ras oncogenes: split personalities. Nat Rev Mol Cell Biol. 2008;9: 517–531. doi:10.1038/nrm2438
- 43. Hahn WC, Counter CM, Lundberg AS, Beijersbergen RL, Brooks MW, Weinberg RA. Creation of human tumour cells with defined genetic elements. Nature. 1999;400: 464–468. doi:10.1038/22780
- 44. Sibley CR, Seow Y, Saayman S, Dijkstra KK, El Andaloussi, Weinberg MS, et al. The biogenesis and characterization of mammalian microRNAs of mirtron origin. Nucleic Acids Res. 2012;40: 438–448. doi:10.1093/nar/gkr722
- 45. Waldman T, Kinzler KW, Vogelstein B. p21 is necessary for the p53-mediated G1 arrest in human cancer cells. Cancer Res. 1995;55: 5187–5190.
- 46. Bulavin DV, Saito S, Hollander MC, Sakaguchi K, Anderson CW, Appella E, et al. Phosphorylation of human p53 by p38 kinase coordinates N-terminal phosphorylation and apoptosis in response to UV radiation. EMBO J. 1999;18: 6845–6854. doi:10.1093/emboj/18.23.6845
- 47. Anglesio MS, Arnold JM, George J, Tinker AV, Tothill R, Waddell N, et al. Mutation of ERBB2 provides a novel alternative mechanism for the ubiquitous activation of RAS-MAPK in ovarian serous low malignant potential tumors. Mol Cancer Res. 2008;6: 1678–1690. doi:10.1158/1541-7786.MCR-08-0193
- 48. Klein HL. The consequences of Rad51 overexpression for normal and tumor cells. DNA Repair. 2008;7: 686–693. doi:10.1016/j.dnarep.2007.12.008
- 49. Diaz F, Bourguignon LY. Selective down-regulation of IP3 receptor subtypes by caspases and calpain during TNFα-induced apoptosis of human T-lymphoma cells. Cell Calcium. 2000;27: 315–328. doi:10.1054/ceca.2000.0126
- 50. Iorio MV, Visone R, Di Leva G, Donati V, Petrocca F, Casalini P, et al. MicroRNA signatures in human ovarian cancer. Cancer Res. 2007;67: 8699–8707. doi:10.1158/0008-5472.CAN-07-1936
- 51. Yang D, Sun Y, Hu L, Zheng H, Ji P, Pecot CV, et al. Integrated analyses identify a master microRNA regulatory network for the mesenchymal subtype in serous ovarian cancer. Cancer Cell. 2013;23: 186–199. doi:10.1016/j.ccr.2012.12.020
- 52. Nagahara H, Vocero-Akbani AM, Snyder EL, Ho A, Latham DG, Lissy NA, et al. Transduction of full-length TAT fusion proteins into mammalian cells: TAT-p27Kip1 induces cell migration. Nat Med. 1998;4: 1449–1452. doi:10.1038/4042
- 53. Kwon YH, Jovanovic A, Serfas MS, Tyner AL. The Cdk inhibitor p21 is required for necrosis, but it inhibits apoptosis following toxin-induced liver injury. J Biol Chem. 2003;278: 30348–30355. doi:10.1074/jbc.M300996200
- 54. Chu IM, Hengst L, Slingerland JM. The Cdk inhibitor p27 in human cancer: prognostic potential and relevance to anticancer therapy. Nat Rev Cancer. 2008;8: 253–267. doi:10.1038/nrc2347
- 55. Duncan TJ, Al-Attar A, Rolland P, Harper S, Spendlove I, Durrant LG. Cytoplasmic p27 expression is an independent prognostic factor in ovarian cancer. Int J Gynecol Pathol. 2010;29: 8–18. doi:10.1097/PGP.0b013e3181b64ec3
- 56. Ahmed J, Meinel T, Dunkel M, Murgueitio MS, Adams R, Blasse C, et al. CancerResource: a comprehensive database of cancer-relevant proteins and compound interactions supported by experimental knowledge. Nucleic Acids Res. 2011;39: D960–D967. doi:10.1093/nar/gkq910
- 57. Romanova LY, Willers H, Blagosklonny MV, Powell SN. The interaction of p53 with replication protein A mediates suppression of homologous recombination. Oncogene. 2004;23: 9025–9033. doi:10.1038/sj.onc.1207982
- 58. Moynahan ME, Jasin M. Mitotic homologous recombination maintains genomic stability and suppresses tumorigenesis. Nat Rev Mol Cell Biol. 2010;11: 196–207. doi:10.1038/nrm2851
- 59. Kumar GR, Shum L, Glaunsinger BA. Importin α-mediated nuclear import of cytoplasmic poly(A) binding protein occurs as a direct consequence of cytoplasmic mRNA depletion. Mol Cell Biol. 2011;31: 3113–3125. doi:10.1128/MCB.05402-11
- 60. Shen R, Olshen AB, Ladanyi M. Integrative clustering of multiple genomic data types using a joint latent variable model with application to breast and lung cancer subtype analysis. Bioinformatics. 2009;25: 2906–2912. doi:10.1093/bioinformatics/btp543
- 61. Mo Q, Wang S, Seshan VE, Olshen AB, Schultz N, Sander C, et al. Pattern discovery and cancer gene identification in integrated cancer genomic data. Proc Natl Acad Sci USA. 2013;110: 4245–4250. doi:10.1073/pnas.1208949110