Abstract
Primary brain tumors are a leading cause of cancer-related mortality among young adults and children. The most common primary malignant brain tumor, glioma, carries a median survival of only 14 months. Two major multi-institutional programs, the Glioma Molecular Diagnostic Initiative and The Cancer Genome Atlas, have pursued a comprehensive genomic characterization of a large number of clinical glioma samples using a variety of technologies to measure gene expression, chromosomal copy number alterations, somatic and germline mutations, DNA methylation, microRNA, and proteomic changes. Classification of gliomas on the basis of gene expression has revealed six major subtypes and provided insights into the underlying biology of each subtype. Integration of genome-wide data from different technologies has been used to identify many potential protein targets in this disease, while increasing the reliability and biological interpretability of results. Mapping genomic changes onto both known and inferred cellular networks represents the next level of analysis, and has yielded proteins with key roles in tumorigenesis. Ultimately, the information gained from these approaches will be used to create customized therapeutic regimens for each patient based on the unique genomic signature of the individual tumor. In this Review, we describe efforts to characterize gliomas using genomic data, and consider how insights gained from these analyses promise to increase understanding of the biological underpinnings of the disease and lead the way to new therapeutic strategies.
Introduction
Malignant primary brain tumors, the most common of which are gliomas, are a major cause of cancer-related mortality in children and young adults. Approximately 10,000 new cases of high-grade or malignant glioma occur each year, the majority of which are glioblastoma multiforme (GBM). GBMs are highly infiltrative, making complete surgical resection impossible. These tumors progress rapidly, demonstrating relative resistance to both radiotherapy and most chemotherapeutic agents. Thus, despite aggressive multimodality treatment, the median survival of patients with GBM is still only about 14 months. The high morbidity and mortality of this disease has been a strong imperative to better understand GBMs at a genetic and molecular level for the purpose of identifying new molecular drug targets, and to develop personalized rational therapy based on the highly heterogeneous biology of each patient’s individual tumor.
Traditional genomic studies of primary brain tumors over the past 20 years have revealed common alterations in canonical central signaling pathways.1–5 Such discoveries do not, however, hint at the complexity and heterogeneity of the genomic landscape of individual GBMs or the varied clinical course of patients with these tumors. Newer high-throughput technology has heralded the age of so-called ‘omic’ biology with the ability to generate vast quantities of molecular and genomic data from individual tumor samples in a given instance. This information promises to increase the understanding of the biological changes underlying gliomas, as well as ultimately guiding customized therapy for each patient. Although the accumulation of these high-throughput genomic data holds great promise for the treatment of all cancer types, challenges remain for the management and curation of large amounts of genome-wide data, as well as for the integration of these data to understand how diverse alterations in cellular networks give rise to tumorigenesis, and the translation of this knowledge into improved patient therapy.
In this Review, we discuss advances in genomic technologies and how they have contributed to the understanding of the biology of primary brain tumors. We further examine bioinformatic approaches to integrating genome-scale data from different experimental sources with existing biological knowledge. Finally, we outline current efforts to create data repositories of genomic data from the analysis of clinical brain-tumor samples, and consider how this information may be used in the future to design custom therapeutic regimens tailored for each patient.
Gene expression profiling
Gene expression signatures
Historically, the traditional pathological classification schemas for gliomas have been problematic, with substantial intraobserver variability, no biological basis, poor individual prognostic significance, and inability to predict rational therapeutic strategies for any given tumor. Thus, keen interest has been expressed in finding objective and biologically based ways of classifying gliomas. After the successful classification of individual breast cancers and lymphomas into unique prognostically significant biological subgroups on the basis of gene expression profiles, interest in applying such an approach to primary brain tumors has been growing.6–9
The use of gene expression microarray data for molecular subtyping involves a class of algorithms called cluster analysis (Figure 1). In hierarchical clustering, the expression levels of each gene are examined across all samples. A distance metric, such as the Euclidean distance, is then computed between the expression vectors of each pair of genes, and the genes with the closest profiles are paired. Pairs of genes are linked to other pairs using an additional metric, such as the average of all distances between genes (average linkage). This process continues until genes are linked in a hierarchical network. Because the clustering process is sensitive to the order of the initial gene list, bootstrapping or consensus clustering is often used to increase the reliability of the resulting subtypes. A second type of clustering uses a partitioning procedure; K-means is the most common algorithm of this type. Genes are first randomly assigned to K groups, and an iterative procedure refines these groups successively to minimize within-group variance. Unlike hierarchical clustering, K-means clustering requires the specification of the number of groups before clustering. Techniques to assess cluster stability, such as bootstrapping or a silhouette plot, can assist in determining the number of clusters. Other commonly used techniques to subtype tumors on the basis of gene expression include principal component analysis and the closely related non-negative matrix factorization.
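The agglomerative procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in any of the cited studies: it clusters samples (the same steps apply to genes), uses Euclidean distance with average linkage, and stops when a requested number of groups remains. The function names, the sample names and the expression values are all hypothetical.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering with average linkage.

    profiles: dict mapping a sample name to its list of expression values.
    Returns a sorted list of clusters, each a sorted list of sample names.
    """
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Average linkage: mean of all pairwise distances between groups.
            dists = [euclidean(profiles[a], profiles[b])
                     for a in clusters[i] for b in clusters[j]]
            d = sum(dists) / len(dists)
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)

# Hypothetical toy data: four samples, three genes (log-scale expression).
toy_profiles = {
    "tumor_A": [8.1, 2.0, 5.5],
    "tumor_B": [7.9, 2.2, 5.4],
    "tumor_C": [3.0, 9.1, 1.2],
    "tumor_D": [3.2, 8.8, 1.0],
}
groups = average_linkage(toy_profiles, n_clusters=2)
```

Recording the order and height of each merge, rather than stopping at a fixed number of groups, would give the full dendrogram. In practice an established library routine with resampling-based stability checks would be preferable to this quadratic-per-step loop.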
Figure 1 |.
Techniques for analyzing gene expression data. a | Clustering of gene expression values using a hierarchical clustering algorithm. In hierarchical clustering, distance is first computed for each pair of sample expression values across all genes using a metric such as Euclidean distance. Pairs showing the closest distances are found and these sets of observations are then joined by using a metric of group similarity, such as average or complete linkage. This process is continued until an entire dendrogram is formed that describes the relationship and ordering between all samples. b | Differential gene expression is a technique commonly used to determine genes that display statistically significant changes across conditions. A procedure such as the T test is applied to each gene to test for significant changes across conditions. P-values may be converted to q-values using false discovery rate to account for multiple hypothesis testing. c | A gene expression signature can be combined with a classification algorithm, such as linear discriminant analysis or support vector machines. After development on a training set, the model can then be applied to classification of samples in an external test set. This process can be used to assign tumor samples to subtype or generate predictive models of drug response.
Gene expression signatures are sets of genes that correspond to a subtype or other characteristic of tumor biology. Signatures can be derived from the results of clustering or from another variable of interest (for example, association with a clinical parameter or outcome) by performing statistical tests to identify differentially expressed genes (Figure 1). The most common test for differentially expressed genes uses a T test to calculate the probability that any single gene shows a difference in sample means between two conditions. To correct for type I errors (incorrectly rejecting the null hypothesis) that can result from testing thousands of genes, the P values are usually corrected using such methods as conversion to a false discovery rate. Other methods for calculation of differentially expressed genes that can offer greater sensitivity than classical inferential statistics include significance analysis of microarrays (SAM) and Cyber-T.10 Signatures can also be generated using a continuous variable such as age or survival using a standard analysis of variance or SAM. Once signatures are developed from a training set, they can be used to classify new samples using established machine-learning techniques such as support vector machines.
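As a rough illustration of the per-gene test and the multiple-testing correction described above, the following Python sketch computes a two-sample t statistic and converts a list of P values to q-values. Two choices are assumptions made here and are not specified in the text: the t statistic uses Welch's unequal-variance form, and the false discovery rate conversion uses the Benjamini-Hochberg procedure. The P values are invented for illustration.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for one gene measured in two conditions."""
    va, vb = variance(group_a), variance(group_b)
    se = math.sqrt(va / len(group_a) + vb / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

def benjamini_hochberg(p_values):
    """Convert P values to Benjamini-Hochberg q-values (same order as input)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values

# Hypothetical P values for five genes (illustrative only).
p = [0.001, 0.04, 0.03, 0.2, 0.5]
q = benjamini_hochberg(p)
t_stat = welch_t([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])
```

Turning the t statistic into a P value requires the t distribution, which the standard library does not provide, so that step is left out of the sketch.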
Classification
A number of early studies showed that gene expression profiling of glioma samples could yield insights into tumor biology and increase the reliability of subtype classification over traditional histological and clinical criteria. Further work showed that genes related to the cell cycle and mitosis were downregulated in secondary GBMs compared with primary GBMs, and that primary GBMs had a stromal–mesenchymal signature previously observed in other aggressive cancer types and mesenchymal stem cells.11–19
In 2006, Phillips and colleagues20 performed an analysis that investigated the possible lineage-dependent origins of molecular subtypes in GBM. In 76 grade III or IV GBM samples, genes were first identified that correlated with patient survival. This signature was then used in a two-way agglomerative clustering to separate the tumors into three groups, designated ‘proneural’, ‘mesenchymal’, and ‘proliferative’. The study authors then compared a 35-gene signature developed from the GBM samples with the expression of the same 35 genes in a variety of other human tissue types. They found that the proneural subclass showed the closest similarity to adult and fetal brain, the mesenchymal type paired most closely to bone, synovium, smooth muscle, endothelial and dendritic cells, and the proliferative type was most closely related to hematopoietic stem cells and the lymphocyte-derived Jurkat cell line. Overall, both mesenchymal and proliferative types showed the most pronounced expression of stem-cell-related genes, similar chromosomal losses on chromosome 10 and gains on chromosome 7, and worse survival compared with the proneural subtype. Akt and Notch pathway signaling was also implicated in poorer survival.20
Later, in 2009, Li et al.21 built on previous work by investigating a larger number of glioma clinical samples (n = 159), which included an unbiased full range of glioma subtypes (astrocytoma, oligodendroglioma and mixed glioma) and grades, using a gene expression microarray (GeneChip® human genome U133 plus 2.0 array, Affymetrix, Santa Clara, CA, USA) that offered coverage of the entire human genome. Using both K-means clustering and non-negative matrix factorization, they showed that gliomas could be broadly clustered into a longer surviving group (oligodendroglioma or ‘O-like’) and a poor survival group (GBM or ‘G-like’). These investigators then found that the O group could be further sub-classified into two groups (OA and OB), and the G group could be subtyped sequentially into four independent groups (G1A, G1B, G2A, G2B). Gene expression signatures from this classification scheme were developed and showed high accuracy when tested on clinical samples taken from an independent patient cohort.22,23
Verhaak and colleagues23 later confirmed the existence of four subtypes of malignant gliomas that highly overlapped with those found by Li and co-workers,22 using clustering microarray results from 200 GBM and two normal brain samples as part of The Cancer Genome Atlas (TCGA) project.23 Although using an Affymetrix array with less than whole-genome coverage (Affymetrix® GeneChip® human genome U133A 2.0 array), the authors combined these results with those from two other microarray platforms (Affymetrix® GeneChip® Human exon array; Agilent 244K CGH array, Agilent Technologies, Santa Clara, CA, USA). Clustering was performed by average-linkage hierarchical clustering with significance of clusters confirmed using the SigClust algorithm.24 Tumors in the mesenchymal subtype showed expression of the mesenchymal-type genes previously found by Phillips and colleagues,20 including CHI3L1, MET, CD44, and MERTK. Other differentially expressed genes in this group included those from the tumor necrosis factor superfamily and NF-κB pathway, including TRADD, RELB, and TNFRSF1A. The proneural class expressed genes typical of an oligodendrocytic lineage, including PDGFRA, NKX2-2, and OLIG2. Additional genes, including SOX genes, DCX, DLL3, ASCL1 and TCF4, were shown to be differentially expressed, which confirmed data by Phillips and co-workers.20
Interestingly, both the Verhaak23 and Li21,22 studies showed that the ‘proliferative’ subtype (according to Phillips et al.20) clustered consistently into two further subtypes, ‘classical’ and ‘neural’. The classical subtype showed alterations of the retinoblastoma pathway, mediated in part by the deletion of CDKN2A. This subtype also showed increased expression levels of genes associated with neural stem cells, including NES and members of the Notch and hedgehog pathways. The neural subtype expressed genes known to be associated with a neuronal lineage, including NEFL, GABRA1, SYT1 and SLC12A5.
In summary, the initial attempts at gene expression profiling, which started almost a decade ago, led to limited and often contradictory biological information, but did demonstrate the heterogeneous nature of glial tumors that were previously thought to represent a single tumor type. With refinements in computational methodology, technically improved array platforms with greater probe density, and analysis of substantially increased numbers of tumors, high-throughput analyses in the past few years have yielded increasingly consistent results. We can conclude that the family of gliomas constitutes at least six major biological subtypes, including four subtypes of high-grade gliomas formerly thought to fall into the single pathological diagnosis of GBM (Table 1).
Table 1 |.
Genomic characteristics of four high-grade glioma subtypes
| Glioma subtype | Genomic changes | Marker genes | Characteristic signaling pathways |
|---|---|---|---|
| Proneural | PDGFRA (mutation or gain); IDH1 (mutation); PIK3CA or PIK3R1 (mutation); TP53 (mutation or loss); PTEN (mutation or loss); CDKN2A (loss) | Oligodendrocytic markers: PDGFRA, NKX2-2, OLIG2; proneural markers: SOX, DCX, DLL3, ASCL1, TCF4 | PI3K and PDGFRA |
| Classical | EGFR (mutation or gain); PTEN (mutation or loss); CDKN2A (loss) | NES | Notch and hedgehog |
| Neural | EGFR (mutation or gain); TP53 (mutation or loss); PTEN (mutation or loss); CDKN2A (loss) | Neuronal markers: NEFL, GABRA1, SYT1, SLC12A5 | Unknown |
| Mesenchymal | NF1 (mutation or loss); TP53 (mutation or loss); PTEN (mutation or loss); CDKN2A (loss) | Mesenchymal and astrocytic markers: CHI3L1, MET, CD44, MERTK | Tumor necrosis factor and NF-κB |
Integrative genomics
Copy number alterations
Since the advent of gene expression microarrays more than a decade ago, several other genomic technologies have emerged that have been used to study gliomas. One consistent theme in a survey of neuro-oncology informatics is the way in which data from different platforms and modalities are combined to yield more reliable and biologically meaningful results.
Cancer is characterized by genomic instability, and the associated abnormal and varied karyotypes characteristic of individual tumors have been recognized for years. Indeed, these deletions, amplifications, and loss of heterozygosity in parts of individual chromosomes can account for some of the aberrant gene expression found within cancer cells. Thus, there has been a longstanding interest in genotyping of tumors on the basis of these chromosomal copy number alterations (CNAs), both for a better understanding of the biology of the tumor and for tumor classification. Historically, hybridization of fluorescently labeled chromosomal probes directly onto tumor DNA (comparative genomic hybridization or CGH) was the standard for assessing tumor CNAs. Now, single nucleotide polymorphism (SNP)-based arrays (or SNP chips) have become the tool of choice in measuring these chromosomal alterations (Figure 2). Originally developed to quantify SNPs for genome-wide association studies, SNP chips were soon recognized as offering higher resolution and precision than other genome-wide techniques such as CGH.25,26 For instance, the new genome-wide Human SNP array 6.0 from Affymetrix® contains 1.8 million SNPs and nonpolymorphic copy number probes, making the median intermarker distance 680 bases.
Figure 2 |.
Copy number alteration detection using SNP chips. a | Tumor samples and reference samples are individually analyzed. Gain or loss of chromosomal regions is computed by comparing signal intensity in the tumor sample with the background signal in the reference and then applying a segmenting and smoothing algorithm. LOH frequently occurs in cancer when an individual carries one functioning copy of a tumor suppressor that is lost through chromosomal deletion. LOH can also be analyzed with SNP chips by detecting SNPs that show a heterozygous to homozygous transition between the reference and tumor sample. b | Classic chromosomal changes seen in glioma, with gain of chromosome 7 containing the EGFR gene and loss of chromosome 10 containing the PTEN gene. c | Chromosomal changes in oligodendrogliomas include 1p and 19q deletion. Abbreviations: LOH, loss of heterozygosity; SNP, single nucleotide polymorphism. Parts a–c were adapted from Kotliarov, Y. et al. BMC Med. Genom. 3, 11 (2010), which is published under an open-access license by Biomed Central.
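The two per-SNP quantities described in the Figure 2 legend can be sketched as follows. This is a simplified illustration under stated assumptions: genotype calls are taken as two-letter strings, the copy number signal is expressed as a log2 ratio of tumor to reference intensity, and the segmenting and smoothing step is omitted. The function names and SNP identifiers are hypothetical.

```python
import math

def log2_ratio(tumor_intensity, reference_intensity):
    """Log2 ratio of tumor to reference signal; 0 means no copy number change."""
    return math.log2(tumor_intensity / reference_intensity)

def loh_snps(reference_calls, tumor_calls):
    """Return SNP ids that are heterozygous in the reference but homozygous in the tumor."""
    flagged = []
    for snp_id, ref_genotype in reference_calls.items():
        tumor_genotype = tumor_calls.get(snp_id)
        if tumor_genotype is None:
            continue  # no genotype call for this SNP in the tumor sample
        ref_het = ref_genotype[0] != ref_genotype[1]
        tumor_hom = tumor_genotype[0] == tumor_genotype[1]
        if ref_het and tumor_hom:
            flagged.append(snp_id)
    return flagged

# Hypothetical genotype calls for three SNPs.
reference = {"rs_hyp_1": "AG", "rs_hyp_2": "CC", "rs_hyp_3": "CT"}
tumor = {"rs_hyp_1": "AA", "rs_hyp_2": "CC", "rs_hyp_3": "CT"}
loh = loh_snps(reference, tumor)
```

Only SNPs that are heterozygous in the reference are informative for loss of heterozygosity, which is why `rs_hyp_2` is never flagged even though it is homozygous in the tumor.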
CNAs involve insertions or deletions within large regions of the chromosomes as well as changes to smaller regions. Algorithms to analyze CNAs typically attempt segmentation and estimate chromosomal breakpoints on the basis of individual samples. Attempts at analyzing CNA in glioma from the past 5 years make use of multiple samples from different tumors to identify recurrent regions of aberration.25,26 Combining CNA information and gene expression data might enable the separation of chromosomal alterations that drive, or contribute to the formation of, cancer from incidental ‘passenger’ alterations that have no meaningful biological effect. In particular, identification of genes that show concordant CNAs and gene expression changes may decrease the false positives that result from screening for glioma-specific genes with either technique alone. This method was applied by one group of investigators using array-CGH to yield two potential novel tumor suppressor genes (PCDH9 and STARD13) and by another group of investigators using a different microarray on another set of clinical samples to yield additional candidate genes (CXCL12, PTER and LINGO2).27–30
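The idea of keeping only genes with concordant copy number and expression changes can be illustrated with a short Python sketch. It is not the procedure used in the cited studies: the cutoffs, the input format (per-gene log2 copy number ratio and log2 expression fold change) and all numeric values are hypothetical.

```python
def concordant_genes(copy_number_log2, expression_log2_fc,
                     cn_cutoff=0.3, expr_cutoff=1.0):
    """Genes whose copy number change and expression change point the same way.

    Both inputs map gene symbol -> log2 value. The cutoffs are illustrative
    assumptions, not values taken from the literature.
    """
    hits = {}
    for gene, cn in copy_number_log2.items():
        fc = expression_log2_fc.get(gene)
        if fc is None:
            continue  # gene not measured on the expression platform
        if cn >= cn_cutoff and fc >= expr_cutoff:
            hits[gene] = "gain_and_up"
        elif cn <= -cn_cutoff and fc <= -expr_cutoff:
            hits[gene] = "loss_and_down"
    return hits

# Hypothetical values; GENE_X and GENE_Y stand in for passenger alterations.
copy_number = {"EGFR": 1.8, "PTEN": -0.9, "GENE_X": 0.6, "GENE_Y": -0.5}
expression = {"EGFR": 2.5, "PTEN": -1.4, "GENE_X": -0.2, "GENE_Y": 0.1}
candidates = concordant_genes(copy_number, expression)
```

Genes with a copy number change but no matching expression change are dropped, which is the filtering effect the paragraph above describes.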
The TCGA group also showed how both CNAs and somatic mutations varied across the four GBM subtypes that had been defined previously by gene expression changes.23 The samples in the classical subtype all showed chromosome 7 gain and chromosome 10 loss, and most showed amplification of EGFR, lack of TP53 mutations, and alterations of various components of the retinoblastoma pathway (CDKN2A, RB1, CDK4 and CCND2). The mesenchymal subtype was typified by mutations in NF1, PTEN and TP53, and CNAs of EGFR, CDK6, MET, PTEN, CDKN2A and RB1. The proneural type had distinct changes in PDGFRA and mutations in IDH1, together with other changes associated with chromosome 7 gain and chromosome 10 loss. The neural subtype seemed to be similar to the classical subtype but with a higher frequency of TP53 mutations. Overall, all four subtypes had distinct, but overlapping, patterns of CNAs and gene mutations.
In addition to using SNP chips to identify CNAs in tumors, SNP data have been used in genome-wide association studies to probe for germline mutations that increase the risk of developing glioma. Two separate studies identified the CDKN2A–CDKN2B locus, as well as the genes RTEL1, TERT, CCDC26 and PHLDB1, as risk factors.31,32 Another group used germline SNP and somatic CNA analysis to demonstrate that genes with either kinase or transferase function or genes involved in synapse formation were enriched in amplified or deleted portions of chromosomes in GBM.33 By combining this method with gene expression analysis, the genes EGFR and DOCK4 were additionally recovered.
MicroRNA
MicroRNAs (miRNAs) are noncoding sequences of RNA with an average length of 22 nucleotides. They bind to the untranslated regions of messenger RNA transcripts and affect gene expression, most often through degradation of the transcript before it can be translated to a functional protein sequence. Emerging evidence points to an important role for miRNAs in glioma biology that involves the regulation of genes involved in proliferation, apoptosis, migration, angiogenesis and aspects of stem-cell biology.34–38 Dong et al.39 integrated miRNA, somatic mutation, CNA and gene expression data to reconstruct networks associated with tumor initiation and progression in glioma. Wuchty and colleagues40 also integrated miRNA and gene expression data from glioma tumor samples, and found 128 miRNAs that were linked to 246 differentially expressed genes, as well as identifying a network of miRNAs strongly associated with the kinase WEE1. Overexpression of miR-128, an miRNA known to be downregulated in GBM, resulted in downregulation of WEE1 expression. miRNA expression in glioma tumor samples has also been used to derive clinical subtypes that seem to demonstrate cellular origins from different developmental lineages.41
Epigenetics
The likelihood that epigenetic changes in glioma, as in other tumors, are an important driver in tumorigenesis is becoming increasingly apparent (as reviewed by Nagarajan and colleagues4). Both DNA methylation and histone modifications have been studied on a genome-wide scale in gliomas. DNA methylation occurs through the enzymatic addition of a methyl group to certain cytosine nucleotides within CpG islands in chromosomal DNA. This methylation usually occurs within the upstream promoter regions of genes and mediates their transcriptional repression. Global hypomethylation of large regions of the genome is a common event in many cancers and is present in 80% of GBM tumors.42,43 The gene-specific methylation of promoter regions is now being mapped at a genome scale using oligonucleotide tiling arrays, as well as next-generation sequencing. Genes from well-studied pathways in glioma (such as RB1, PTEN and TP53) and potentially new target genes (such as EMP3, PCDHGA11 and SOCS1) show methylation changes in glioma.43–51 Methylation of MGMT, which encodes a protein involved in the repair of alkylating-agent-mediated DNA damage, has proven to be a marker for the predicted clinical response to temozolomide in patients with GBM.52 It remains unclear, however, whether gene-specific methylation changes are a cause or effect in glioma progression.
Methylation data have also been combined with other genomic technologies to help further classify gliomas with different biologies. For example, investigators in the TCGA project have identified a CpG island hypermethylation phenotype associated with CNA and somatic mutations of IDH1 in younger patients in the GBM proneural group who displayed longer survival times.53 In another example of an integrated genomic approach, a group of researchers has identified an oncomir–oncogene cluster comprising miR-26a, CDK4 and CENTG1 that, when amplified, was associated with decreased survival.54
Proteomics
As most of the work of a cell occurs at the protein level, the study of the ‘proteome’ of the cell promises to offer the greatest insight into cellular function in real time. Routine assessment and informatic investigation into the cellular proteome has, however, lagged behind that seen in cellular genomic and global RNA assays because of limitations in our current technology to develop assays that can readily and reproducibly deal with the complexity of protein biochemistry and the rapid and ever-changing post-translational modifications found in proteins. Nevertheless, a number of new proteomic technologies have emerged to study both glioma cell lines and clinical samples (reviewed elsewhere55). Both gel-based and liquid chromatography approaches have been applied to identify changes in protein levels and post-translational modifications. Multiplexed antibody-based approaches promise to increase the throughput of these techniques by enabling multiple analytes to be simultaneously detected in each sample.
Proteomics is yet another genome-scale technology that is integrated with other information sources to study primary brain tumors.56–58 A 2009 targeted proteomics study looked at 56 proteins or phosphoproteins in 20 GBM samples.59 Principal component analysis and K-means clustering of protein levels (determined by western blot) divided samples into three groups. When compared with CNA and mutation data generated from the TCGA project, these groups seemed to correspond to the classical–neural, mesenchymal and proneural groups, characterized by overexpression of EGFR, NF1 and PDGFRA, respectively. Multiplexed bead-based profiling of 62 tyrosine kinases in 130 human cancer cell lines showed that the kinase SRC is often phosphorylated in glioma cell lines and is sensitive to the drug dasatinib.60 Subsequent work showed that SRC is also active in clinical GBM samples. Another group profiled the protein-expression level of all human kinases between different cancers and demonstrated that the Wee1-like protein kinase is overexpressed in GBM.61 WEE1 mediates DNA-damage-induced G2 cell cycle arrest, which allows TP53-mutated tumor cells time to repair the DNA damage before proceeding through the cell cycle. Genetic or pharmacological inhibition of WEE1 sensitizes glioma cells to radiation and DNA damaging agents, thereby laying out the preclinical rationale for the pharmaceutical development of WEE1 inhibitors, which are about to enter clinical trials in patients with malignant gliomas. The fact that two separate computational analyses of two totally divergent GBM databases (proteomics and miRNA profiling) resulted in the convergent identification of WEE1 as an important therapeutic target adds validity to the finding, and demonstrates the potential power of computational biological approaches for finding new treatment targets and strategies.
Network and functional analysis
Mapping of genomic data onto both known and inferred regulatory networks represents the next level of analysis required to understand cancer biology (Figures 3 and 4). Known pathways include signaling, gene regulatory and metabolic pathways that have been curated from the literature and compiled in databases such as KEGG and BioCarta.62,63 Inferred networks can be assembled from both genome-wide protein–protein interaction maps and gene expression microarray data. Association of genomic changes in cancer with networks of functionally and/or physically interacting proteins serves to explain the higher-order logic driving tumor formation, and can help to identify central ‘hub’ proteins in these networks. Creation of dynamic models of these networks may ultimately provide a more-sensitive means to identify potential protein targets.
Figure 3 |.
Mapping genomic changes in glioma to known pathways. Mutations and copy number changes from 91 tumors were applied to the known topologies of the RTK–Ras–PI3K, p53, and RB signaling pathways. 88% of patients showed alterations in the RTK–Ras–PI3K pathway, 87% showed alterations in the p53 pathway, and 78% showed alterations in the RB pathway. Overall, 74% of patients showed changes in all three pathways. Red shading indicates activating genetic alterations, with darker color representing frequently altered genes. Blue color indicates inactivating alterations, with darker shades representing a higher percentage of alteration. Abbreviations: FOXO, forkhead box protein O; PI3K, phosphoinositide 3-kinase; RB, retinoblastoma; RTK, receptor tyrosine kinase. Permission obtained from Nature Publishing Group © The Cancer Genome Atlas Research Network. Nature 455, 1061–1068 (2008).
Figure 4 |.
Mapping genomic changes in glioma to inferred pathways. a | Core module of transcription factors associated with mesenchymal transformation of glioma.72 b | p53 network in glioma created by combining copy number alteration, mutation, and protein–protein interaction data.65 c | Network of microRNAs associated with inhibition of the kinase WEE1.40 Permission for part a obtained from Nature Publishing Group © Carro, M. S. et al. Nature 463, 318–327 (2010). Part b was adapted from Cerami, E. et al. PLoS ONE 5, e8918 (2010) and part c was adapted from Wuchty, S. et al. PLOS ONE 6, e14681 (2010), which are published under an open-access license by Public Library of Science.
The TCGA group and Parsons and colleagues64 put CNA and somatic mutations in the context of RTK–Ras–PI3K, retinoblastoma and p53 pathways previously shown to be deregulated in glioma.23,64 By looking at CNAs for the multiple genes that comprised each pathway, the researchers showed alteration of these pathways in 66%, 77% and 59% of tumors, respectively. Adding somatic mutations increased these percentages to 88%, 78% and 87%. Overall, 74% of patients showed at least one aberration in each of the three pathways. Mapping of both CNAs and mutations onto a database of protein–protein interactions enables the identification of new functional modules not previously implicated in a particular cancer. This approach has been used for GBM, and identified both the canonical RTK–retinoblastoma–p53 pathway and a new potential target gene, AGAP2 (previously known as CENTG1).30,65 Full exome sequencing for GBM clinical samples should enable expansion of this approach to discover new pathways involved in the pathogenesis of these tumors.
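The pathway-level percentages quoted above amount to counting, for each pathway, the tumors that carry at least one altered member gene. A minimal Python sketch of that tally follows. The pathway memberships are heavily simplified and the tumor alteration sets are invented, so the resulting fractions are illustrative only and do not reproduce the published figures.

```python
def pathway_alteration_frequency(altered_genes_by_tumor, pathways):
    """Fraction of tumors with at least one altered gene in each pathway,
    plus the fraction of tumors altered in every pathway.

    altered_genes_by_tumor: dict tumor id -> set of altered gene symbols.
    pathways: dict pathway name -> set of member gene symbols.
    """
    n = len(altered_genes_by_tumor)
    per_pathway = {}
    for name, genes in pathways.items():
        hit = sum(1 for altered in altered_genes_by_tumor.values()
                  if altered & genes)
        per_pathway[name] = hit / n
    all_hit = sum(
        1 for altered in altered_genes_by_tumor.values()
        if all(altered & genes for genes in pathways.values())
    )
    return per_pathway, all_hit / n

# Simplified, assumed pathway memberships (not a curated gene list).
toy_pathways = {
    "RTK-Ras-PI3K": {"EGFR", "PTEN", "NF1", "PDGFRA"},
    "p53": {"TP53", "MDM2", "CDKN2A"},
    "RB": {"RB1", "CDK4", "CDKN2A"},
}
# Hypothetical alteration calls (CNA or mutation) for four tumors.
toy_tumors = {
    "t1": {"EGFR", "CDKN2A"},
    "t2": {"TP53"},
    "t3": {"PTEN", "RB1"},
    "t4": {"NF1", "TP53", "CDK4"},
}
per_pathway, all_three = pathway_alteration_frequency(toy_tumors, toy_pathways)
```

Running the same function on CNA calls alone and then on the union of CNA and mutation calls would show how adding a second data type raises the per-pathway frequencies.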
Microarray gene expression data sets are another way to infer biological changes in glioma.66,67 Using two different clinical data sets of GBM samples, one effort focused on identifying gene-coexpression modules present in both GBM and other cancers. One module, also present in breast cancer, was inferred to be downstream of EGFR and contained a new potential target gene, ASPM, which seemed to operate as a hub gene in the coexpression network. Gene set enrichment analysis can also be used to map changes in global gene expression to known pathways.68 A version of this analysis tool has been designed to detect alteration in signaling cascades between classes of samples and was used to analyze microarray gene expression from GBM tumors. The analysis returned many genes previously known to be deregulated in GBM, such as EGFR, NFKB1, VEGF and BCL3.69 Two other groups have incorporated both gene expression changes and protein–protein interactions for reverse-engineering glioma-specific networks.70,71
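To give a feel for how expression changes are mapped to known pathways, the sketch below computes a hypergeometric over-representation P value for the overlap between a list of differentially expressed genes and a pathway. This is a deliberately simpler test than the rank-based gene set enrichment analysis cited above, and is swapped in here only because it fits in a few lines. The function name and the gene counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_pathway, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_universe genes, of which n_pathway belong to the pathway."""
    upper = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 10 measured genes, 4 in the pathway,
# 3 differentially expressed, 2 of those in the pathway.
p_enrich = hypergeom_enrichment_p(10, 4, 3, 2)
```

When many pathways are tested, the resulting P values need the same multiple-testing correction as per-gene tests.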
A particularly promising application of gene expression data has emerged in the latest attempts to reconstruct gene regulatory networks in the tumor and identify transcription factors that could be viable drug targets. Carro et al.72 used a reverse-engineering method based on information theory to derive a core network of transcription factors that were associated with a mesenchymal-like gene expression signature in GBM. On the basis of these analyses, two transcription factors, STAT3 and C/EBPβ, were shown to be expressed in GBM clinical samples and associated with a poor outcome. Further validation showed that STAT3 and C/EBPβ reprogrammed neural stem cells towards a mesenchymal lineage and when knocked down in glioma cells in vitro led to a reversal of the mesenchymal phenotype. Other work has used both gene expression and transcription-factor-specific motifs in gene promoters to increase the reliability of network reconstruction.73,74
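The text does not specify which information-theoretic method was used, so the following Python sketch should be read only as an example of the kind of quantity such methods build on: the mutual information between the expression profiles of a transcription factor and a candidate target gene. The equal-width binning, the number of bins and the function names are assumptions made for illustration, and the sketch is not the algorithm of Carro et al.

```python
import math
from collections import Counter

def discretize(values, n_bins=3):
    """Equal-width binning of one gene's expression values across samples."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant genes
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mutual_information(x_bins, y_bins):
    """Mutual information (in bits) between two discretized profiles."""
    n = len(x_bins)
    px, py = Counter(x_bins), Counter(y_bins)
    pxy = Counter(zip(x_bins, y_bins))
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        # p_joint / (p_x * p_y), written with raw counts.
        mi += p_joint * math.log2(p_joint * n * n / (px[x] * py[y]))
    return mi

# Hypothetical profiles across four samples.
tf_bins = discretize([1.0, 2.0, 3.0, 4.0], n_bins=2)
mi_dependent = mutual_information(tf_bins, [0, 0, 1, 1])
mi_independent = mutual_information(tf_bins, [0, 1, 0, 1])
```

A network reconstruction would compute this score for every transcription factor and target pair, keep the pairs that exceed a significance threshold, and then prune indirect interactions; those steps are beyond this sketch.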
Databases for neuroinformatics
Clinical data enrich and inform the corollary genomic data in the understanding of human glioma biology. Standard clinical data (such as patient demographics, age, survival time from diagnosis, and treatment history) are basic but invaluable. Advances in neuroimaging (dynamic and functional MRI, metabolic imaging with magnetic resonance spectroscopy, and PET scans) represent an ever-growing, rich and complex data source for structural and functional information about tumors that will increasingly be correlated with genomic data.75
The Glioma Molecular Diagnostic Initiative
The Glioma Molecular Diagnostic Initiative (GMDI) was the first large-scale, multi-institutional effort to collect genomic data for GBM and integrate this information with clinical and demographic features. The bioinformatics component of GMDI, the REMBRANDT (Repository of Molecular BRAin Neoplasia DaTa) database,76 was designed to serve as both a data repository and a means to actively query and analyze genomic and clinical details simultaneously.28,77,78 REMBRANDT was built to leverage existing bioinformatics infrastructure developed at the US National Cancer Institute Center for Bioinformatics, such as the caArray79 gene expression tools, the Cancer Genome Anatomy Project (CGAP),80 the C3D clinical informatics system,81 and the caCORE infrastructure.82 In addition, REMBRANDT makes use of tools developed as part of the caBIG (cancer Biomedical Informatics Grid) project,83 and GenePattern,84 the database and analysis tool developed at the Broad Institute of Harvard and Massachusetts Institute of Technology, USA.
The conceptual model for REMBRANDT shows how these resources are tied into different elements of the database. Tumor sample identifiers act as the central hub that ties together clinical and genetic information. The entire database system was designed according to an n-tier architecture, with separate resources for: a data tier implemented using Java 2 Enterprise Edition and Oracle 10g database (Oracle, Redwood Shores, CA, USA); a middle tier that handles logic and analysis, which was implemented using Java and caBIG elements; an analytical server that handles heavy computational tasks and communicates to the middle tier through the Java messaging service; and a data presentation tier. For performance reasons, genomic data are stored as binary files readable by the R language on the data tier.
The database schema was implemented as a so-called ‘star’ topology. In this schema type, tables are classified as either ‘fact’ or ‘dimension’ tables. Fact tables hold the actual data and help eliminate the performance inefficiencies that would result if the data were divided into many smaller tables that needed to be combined in join operations. Dimension tables are smaller than the fact tables and hold attributes describing the data. In the case of REMBRANDT, dimension tables represent categories of data analysis. A typical query that returns gene expression data, for example, would combine the BIOSPECIMEN_DM table and the GENE_DM table. Despite the complexity of the REMBRANDT structure, the power of the platform is that it allows data analysis by users at all levels of informatics sophistication. Experienced statisticians and bioinformaticians can use the advanced REMBRANDT analysis tools or download raw data from the REMBRANDT data warehouse for their own analysis. By contrast, REMBRANDT’s simple query tool allows less experienced investigators, such as clinicians, to ask basic questions connecting genomic and clinical data in a user-friendly, intuitive manner, thereby bringing the power of bioinformatics to the forefront of clinical investigation.
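The star topology described above can be illustrated with a minimal, in-memory example. The Python sketch below uses the standard sqlite3 module and borrows the two dimension-table names mentioned in the text (BIOSPECIMEN_DM and GENE_DM). Everything else is an assumption made for illustration: the fact-table name GENE_EXPR_FACT, all column names, and the toy rows are hypothetical and do not reflect the actual REMBRANDT schema, which runs on Oracle.

```python
import sqlite3
from typing import List, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny star schema: one fact table and two dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension tables: small, descriptive attributes.
        CREATE TABLE BIOSPECIMEN_DM (
            biospecimen_id INTEGER PRIMARY KEY,
            sample_code    TEXT,
            diagnosis      TEXT
        );
        CREATE TABLE GENE_DM (
            gene_id INTEGER PRIMARY KEY,
            symbol  TEXT
        );
        -- Fact table (hypothetical name): one row per specimen and gene.
        CREATE TABLE GENE_EXPR_FACT (
            biospecimen_id INTEGER REFERENCES BIOSPECIMEN_DM(biospecimen_id),
            gene_id        INTEGER REFERENCES GENE_DM(gene_id),
            expression     REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO BIOSPECIMEN_DM VALUES (?, ?, ?)",
        [(1, "S001", "GBM"), (2, "S002", "GBM"), (3, "S003", "oligodendroglioma")],
    )
    conn.executemany(
        "INSERT INTO GENE_DM VALUES (?, ?)", [(1, "EGFR"), (2, "TP53")]
    )
    conn.executemany(
        "INSERT INTO GENE_EXPR_FACT VALUES (?, ?, ?)",
        [(1, 1, 9.5), (2, 1, 7.25), (3, 1, 6.0),
         (1, 2, 5.0), (2, 2, 5.5), (3, 2, 4.75)],
    )
    return conn


def expression_for(conn: sqlite3.Connection, symbol: str,
                   diagnosis: str) -> List[Tuple[str, float]]:
    """Join the fact table to both dimensions and filter on their attributes."""
    return conn.execute(
        """
        SELECT b.sample_code, f.expression
        FROM GENE_EXPR_FACT AS f
        JOIN BIOSPECIMEN_DM AS b ON b.biospecimen_id = f.biospecimen_id
        JOIN GENE_DM        AS g ON g.gene_id = f.gene_id
        WHERE g.symbol = ? AND b.diagnosis = ?
        ORDER BY b.sample_code
        """,
        (symbol, diagnosis),
    ).fetchall()


if __name__ == "__main__":
    connection = build_demo_db()
    print(expression_for(connection, "EGFR", "GBM"))
```

The query touches the large fact table once and filters it through the two small dimension tables, which is the access pattern a star schema is intended to keep fast.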
The Cancer Genome Atlas
TCGA is a large-scale collaboration between the US National Human Genome Research Institute and National Cancer Institute that began as a pilot project in 2006 and was expanded in 2009. Although the goal of TCGA is to collect comprehensive genomic information on all major human cancers, initial efforts focused on glioblastoma, lung cancer and ovarian cancer.85,86 As with GMDI, TCGA utilizes a wide array of genomic platforms to gather data on a large number of samples (more than 500) using well-defined quality control protocols.
In addition to three different types of gene expression microarray, as well as SNP chips, the TCGA project has gathered genome-wide data looking at methylation and somatic mutations. To accommodate the diverse formats of these genomics platforms, TCGA adopted a data object model based on MAGE-OM, developed originally by the Functional Genomics Data Society.87 This object model enables the description of both matrix and vector-based data, such as gene expression microarray data, and nonmatrix formats, such as next-generation sequencing. The common elements and controlled vocabulary of the object model are both flexible and general in nature, and enable the description of new concepts and relationships without the requirement for new formats specific to each technology. To reduce complexity and file size of the metadata, a tab-based format (MAGE-TAB) was ultimately chosen over extensible markup language (MAGE-XML).
The diverse data requirements of different genomic platforms were also reflected in the topology of the database schema chosen by the TCGA informatics team. In contrast with REMBRANDT, the TCGA project uses a database schema often called third normal form (3NF). In this topology, a larger number of tables are created to minimize redundancy and maximize internal consistency of each table. The 3NF topology, common in large data-warehouse applications, increases flexibility and reduces the need to aggregate data before populating the database. However, this flexibility often produces a decrease in performance for complex queries. As with REMBRANDT, the TCGA database was implemented using the resources of the open-source caBIG.75,88–92
The TCGA project also introduced a tiered system for data access, to protect sensitive clinical data and to allow researchers access to genomic information with different levels of preprocessing. For clinical data, a first tier is publicly available that prevents aggregation of results that could potentially identify any particular individual. A second tier that does not mask this information is available only to researchers at participating institutions who have signed an agreement to protect patient privacy and will use the information for biomedical research. TCGA provides access to genomic data at four specific levels. The first level allows access to the raw data, for example the binary .CEL files of the Affymetrix® platform. The second level provides access to the transformed data as referenced by the manufacturer identification for each data point, which in the case of Affymetrix® is the probe set identification that maps a collection of 25-mer oligonucleotide probes to each targeted transcript. The third level connects each genomic data point to a gene-level annotation. Finally, a fourth level provides further transformation of the data, such as genes that are differentially expressed between two conditions.
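The progression from the second to the fourth data level can be sketched in a few lines of Python. In the example below, probe-set values (level 2) are collapsed to gene-level values (level 3) through an annotation map, and a simple difference in mean log2 expression between two groups stands in for a level 4 result. The probe-set identifiers, gene labels, sample names and values are hypothetical, and averaging probe sets per gene is an assumption made for illustration rather than the procedure TCGA uses.

```python
from statistics import mean
from typing import Dict, List

# Hypothetical probe-set to gene annotation (level 2 -> level 3 mapping).
PROBE_TO_GENE: Dict[str, str] = {
    "ps_001": "GENE_A",
    "ps_002": "GENE_A",
    "ps_003": "GENE_B",
}

# Toy level 2 data: log2 expression per probe set for each sample.
LEVEL2: Dict[str, Dict[str, float]] = {
    "tumor_1":  {"ps_001": 8.0, "ps_002": 10.0, "ps_003": 5.0},
    "tumor_2":  {"ps_001": 9.0, "ps_002": 9.0,  "ps_003": 5.0},
    "normal_1": {"ps_001": 6.0, "ps_002": 6.0,  "ps_003": 5.0},
    "normal_2": {"ps_001": 7.0, "ps_002": 7.0,  "ps_003": 5.0},
}


def to_gene_level(probe_values: Dict[str, float],
                  probe_to_gene: Dict[str, str]) -> Dict[str, float]:
    """Average all annotated probe sets that map to the same gene."""
    grouped: Dict[str, List[float]] = {}
    for probe, value in probe_values.items():
        gene = probe_to_gene.get(probe)
        if gene is None:
            continue  # skip probe sets without a gene annotation
        grouped.setdefault(gene, []).append(value)
    return {gene: mean(values) for gene, values in grouped.items()}


def log2_fold_change(group_a: List[Dict[str, float]],
                     group_b: List[Dict[str, float]]) -> Dict[str, float]:
    """Mean log2 expression in group_b minus group_a, per shared gene."""
    shared = set.intersection(*(set(s) for s in group_a + group_b))
    return {
        gene: mean(s[gene] for s in group_b) - mean(s[gene] for s in group_a)
        for gene in sorted(shared)
    }


if __name__ == "__main__":
    level3 = {name: to_gene_level(v, PROBE_TO_GENE) for name, v in LEVEL2.items()}
    normals = [level3["normal_1"], level3["normal_2"]]
    tumors = [level3["tumor_1"], level3["tumor_2"]]
    print(log2_fold_change(normals, tumors))
```

A real level 4 analysis would add a statistical test and multiple-testing correction; the fold change alone is shown only to make the distinction between the levels concrete.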
Towards personalized medicine
Analysis of large-scale genomic data sets has started to yield results for discriminating subgroups of patients with GBM who seem to differ in their responses to chemotherapeutic agents.52,93,94 As a new generation of drugs is developed that targets specific proteins, personalized treatment for brain cancer may ultimately move beyond a classification model towards identifying small subsets of patients who are likely to respond to a particular agent. For diseases such as GBM, in which few treatment options are available, drug signatures will also have the potential to guide clinical trials by enabling the selection of patients who have the best chance of responding to the drug under investigation.
Previous work has shown that drug-specific gene expression signatures can be developed and successfully applied to predict drug response in vitro.95–100 These signatures can be used for prospective drug screening against a panel of cell lines that are representative of a particular cancer type.96,101–103 Extension of this technique to successfully predict in vitro responses of tumor stem cells may be particularly valuable for those cancers, such as glioma, in which stem cells are thought to have a major role. Gene expression signatures have also been used successfully in screens for other cellular states, such as differentiation. For example, one study identified the synergistic potential of histone deacetylase inhibitors and retinoids in inducing differentiation in neuroblastoma cell lines.104
The clinical use of drug-specific gene signatures is still in the early stages of development, but seems promising. Whole-exome sequencing, currently on the horizon for clinical samples, should bring an even greater level of reliability to the prediction of drug response.99 Past experience and current evidence, however, point to the conclusion that linking drug efficacy to mutations in a single protein might be too simplistic. The probable clinical efficacy of any given drug for a specific tumor depends on multiple factors in addition to the status of the target protein (expression and activation level, mutational state), including: the activation states of other pathways in tumor cells, the status of ATP-driven pumps that confer drug resistance, and the tumor cell microenvironment.105 A 2010 study of a CDK4–CDK6 inhibitor in glioma cell lines reinforced this idea by showing that, surprisingly, the expression levels of CDK4–CDK6 were not highly predictive of drug potency.106 Chromosomal deletion of the CDKN2A-CDKN2C locus was a better predictor of drug potency, which suggests that the development of integrated genomic models might ultimately be a superior approach for predicting drug response.
The development of personalized medicine approaches to glioma therapeutics is likely to integrate genomic data sets with other types of data. The practical limitations of repeated brain tumor biopsies make the use of imaging data (such as MRI or PET scans) a particularly attractive option for obtaining real-time and repeated assessments of a patient’s tumor. Neuroimaging has been combined with gene expression microarray analysis in patients with GBM to show that imaging could act as a surrogate for molecular diagnostics in predicting proliferative and hypoxia-associated phenotypes, as well as accurately predicting EGFR overexpression and an ‘infiltrative’ phenotype associated with decreased survival.107,108
Conclusions
The amount of glioma-specific genomic data produced during the past 3 years is truly remarkable. Database infrastructures have evolved to efficiently warehouse this information and allow both computational and molecular biologists to perform integrated genome-scale analysis of curated clinical tumor samples on a scale previously unimaginable. Bioinformatics approaches for analyzing this information have also grown and have led to improved classification of tumor subtypes and insights into glioma biology, and are beginning to probe the dense web of connected intracellular pathways that drive the formation and progression of brain tumors. Future challenges lie in translating this knowledge into individualized therapy for the cancer patient.
Key points
Genomic analysis has revealed that gliomas fall into six subtypes, two of which have characteristics of oligodendrogliomas and four of which show poor survival
The glioma subtypes show distinct, but overlapping, patterns of mutations, copy number alterations, and gene expression
Integration of genomic data, such as gene expression, single nucleotide polymorphism chips, proteomics, microRNA, and epigenomics, increases the reliability and biological interpretability of results
Mapping of genomic data onto both known and inferred cellular networks reveals new insights into tumorigenesis
Review criteria
A literature search of PubMed for articles published between 1990 and 2010 was performed using the key words “glioma”, “genomics”, “genetics”, “microRNA”, “gene expression”, “microarray”, “SNP”, “copy number”, “integrative”, “systems”, “pathways”, “sequencing”, “proteomics”, “methylation” and “epigenetic”, both alone and in combination. In addition, we relied on our knowledge of important integrative genomics papers for glioma that have been published in the past 10 years, as well as manually searching the reference sections of these papers for further relevant articles.
References
- 1. Purow B & Schiff D Advances in the genetics of glioblastoma: are we reaching critical mass? Nat. Rev. Neurol 5, 419–426 (2009).
- 2. Goodwin CR & Laterra J Neuro-oncology: unmasking the multiforme in glioblastoma. Nat. Rev. Neurol 6, 304–305 (2010).
- 3. Macdonald TJ et al. Expression profiling of medulloblastoma: PDGFRA and the RAS/MAPK pathway as therapeutic targets for metastatic disease. Nat. Genet 29, 143–152 (2001).
- 4. Nagarajan RP & Costello JF Molecular epigenetics and genetics in neuro-oncology. Neurotherapeutics 6, 436–446 (2009).
- 5. Robinson JP et al. Activated BRAF induces gliomas in mice when combined with Ink4a/Arf loss or Akt activation. Oncogene 29, 335–344 (2010).
- 6. Nevins JR & Potti A Mining gene expression profiles: expression signatures as cancer phenotypes. Nat. Rev. Genet 8, 601–609 (2007).
- 7. Hedenfalk I et al. Gene-expression profiles in hereditary breast cancer. N. Engl. J. Med 344, 539–548 (2001).
- 8. Sotiriou C et al. Breast cancer classification and prognosis based on gene expression profiles from a population-based study. Proc. Natl Acad. Sci. USA 100, 10393–10398 (2003).
- 9. Alizadeh AA et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403, 503–511 (2000).
- 10. Institute for Genomics and Bioinformatics, University of California, Irvine: Cyber-T [online], http://cybert.ics.uci.edu/ (2010).
- 11. Tso C-L et al. Distinct transcription profiles of primary and secondary glioblastoma subgroups. Cancer Res 66, 159–167 (2006).
- 12. MacDonald TJ et al. Expression profiling of medulloblastoma: PDGFRA and the RAS/MAPK pathway as therapeutic targets for metastatic disease. Nat. Genet 29, 143–152 (2001).
- 13. Sallinen SL et al. Identification of differentially expressed genes in human gliomas by DNA microarray and tissue chip techniques. Cancer Res 60, 6617–6622 (2000).
- 14. Rickman DS et al. Distinctive molecular profiles of high-grade and low-grade gliomas based on oligonucleotide microarray analysis. Cancer Res 61, 6885–6891 (2001).
- 15. Mischel PS et al. Identification of molecular subtypes of glioblastoma by gene expression profiling. Oncogene 22, 2361–2373 (2003).
- 16. Shai R et al. Gene expression profiling identifies molecular subtypes of gliomas. Oncogene 22, 4918–4923 (2003).
- 17. Freije WA et al. Gene expression profiling of gliomas strongly predicts survival. Cancer Res 64, 6503–6510 (2004).
- 18. Tso CL et al. Distinct transcription profiles of primary and secondary glioblastoma subgroups. Cancer Res 66, 159–167 (2006).
- 19. Tso CL et al. Primary glioblastomas express mesenchymal stem-like properties. Mol. Cancer Res 4, 607–619 (2006).
- 20. Phillips HS et al. Molecular subclasses of high-grade glioma predict prognosis, delineate a pattern of disease progression, and resemble stages in neurogenesis. Cancer Cell 9, 157–173 (2006).
- 21. Li A et al. Unsupervised analysis of transcriptomic profiles reveals six glioma subtypes. Cancer Res 69, 2091–2099 (2009).
- 22. Li A et al. Genomic changes and gene expression profiles reveal that established glioma cell lines are poorly representative of primary human gliomas. Mol. Cancer Res 6, 21–30 (2008).
- 23. Verhaak RGW et al. Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell 17, 98–110 (2010).
- 24. Liu J et al. Identification of a gene signature in cell cycle pathway for breast cancer prognosis using gene expression profiling data. BMC Med. Genomics 1, 39 (2008).
- 25. Beroukhim R et al. Assessing the significance of chromosomal aberrations in cancer: methodology and application to glioma. Proc. Natl Acad. Sci. USA 104, 20007–20012 (2007).
- 26. Freire P et al. Exploratory analysis of the copy number alterations in glioblastoma multiforme. PLoS ONE 3, e4076 (2008).
- 27. de Tayrac M et al. Integrative genome-wide analysis reveals a robust genomic glioblastoma signature associated with copy number driving changes in gene expression. Genes Chromosomes Cancer 48, 55–68 (2009).
- 28. Kotliarov Y et al. CNAReporter: a GenePattern pipeline for the generation of clinical reports of genomic alterations. BMC Med. Genomics 3, 11 (2010).
- 29. Kotliarov Y et al. Correlation analysis between single-nucleotide polymorphism and expression arrays in gliomas identifies potentially relevant target genes. Cancer Res 69, 1596–1603 (2009).
- 30. Bredel M et al. A network model of a cooperative genetic landscape in brain tumors. JAMA 302, 261–275 (2009).
- 31. Wrensch M et al. Variants in the CDKN2B and RTEL1 regions are associated with high-grade glioma susceptibility. Nat. Genet 41, 905–908 (2009).
- 32. Shete S et al. Genome-wide association study identifies five susceptibility loci for glioma. Nat. Genet 41, 899–904 (2009).
- 33. LaFramboise T, Dewal N, Wilkins K, Pe’er I & Freedman ML Allelic selection of amplicons in glioblastoma revealed by combining somatic and germline analysis. PLoS Genet. 6, e1001086 (2010).
- 34. Asadi-Moghaddam K, Chiocca EA & Lawler SE Potential role of miRNAs and their inhibitors in glioma treatment. Expert Rev. Anticancer Ther 10, 1753–1762 (2010).
- 35. Chiocca EA & Lawler SE The many functions of microRNAs in glioblastoma. World Neurosurg. 73, 598–601 (2010).
- 36. Godlewski J, Bronisz A, Nowicki MO, Chiocca EA & Lawler S microRNA-451: A conditional switch controlling glioma cell proliferation and migration. Cell Cycle 9, 2742–2748 (2010).
- 37. Godlewski J, Newton HB, Chiocca EA & Lawler SE MicroRNAs and glioblastoma; the stem cell connection. Cell Death Differ. 17, 221–228 (2010).
- 38. Turner JD et al. The many roles of microRNAs in brain tumor biology. Neurosurg. Focus 28, E3 (2010).
- 39. Dong H et al. Integrated analysis of mutations, miRNA and mRNA expression in glioblastoma. BMC Syst. Biol 4, 163 (2010).
- 40. Wuchty S et al. Prediction of associations between microRNAs and gene expression in glioma biology. PLoS ONE 6, e14681 (2011).
- 41. Kim T, Huang W, Park R, Park PJ & Johnson MD A developmental taxonomy of glioblastoma defined and maintained by microRNAs. Cancer Res 71, 3387–3399 (2011).
- 42. Cadieux B, Ching TT, VandenBerg SR & Costello JF Genome-wide hypomethylation in human glioblastomas associated with specific copy number alteration, methylenetetrahydrofolate reductase allele status, and increased proliferation. Cancer Res 66, 8469–8476 (2006).
- 43. Fanelli M et al. Loss of pericentromeric DNA methylation pattern in human glioblastoma is associated with altered DNA methyltransferases expression and involves the stem cell compartment. Oncogene 27, 358–365 (2008).
- 44. Alaminos M et al. EMP3, a myelin-related gene located in the critical 19q13.3 region, is epigenetically silenced and exhibits features of a candidate tumor suppressor in glioma and neuroblastoma. Cancer Res 65, 2565–2571 (2005).
- 45. Amatya VJ, Naumann U, Weller M & Ohgaki H TP53 promoter methylation in human gliomas. Acta Neuropathol. 110, 178–184 (2005).
- 46. Baeza N, Weller M, Yonekawa Y, Kleihues P & Ohgaki H PTEN methylation and expression in glioblastomas. Acta Neuropathol. 106, 479–485 (2003).
- 47. Costello JF, Berger MS, Huang HS & Cavenee WK Silencing of p16/CDKN2 expression in human gliomas by methylation and chromatin condensation. Cancer Res 56, 2405–2410 (1996).
- 48. Dallol A et al. Frequent epigenetic inactivation of the SLIT2 gene in gliomas. Oncogene 22, 4611–4616 (2003).
- 49. Foltz G et al. DNA methyltransferase-mediated transcriptional silencing in malignant glioma: a combined whole-genome microarray and promoter array analysis. Oncogene 28, 2667–2677 (2009).
- 50. Hesson L et al. Frequent epigenetic inactivation of RASSF1A and BLU genes located within the critical 3p21.3 region in gliomas. Oncogene 23, 2408–2419 (2004).
- 51. Veeriah S et al. The tyrosine phosphatase PTPRD is a tumor suppressor that is frequently inactivated and mutated in glioblastoma and other human cancers. Proc. Natl Acad. Sci. USA 106, 9435–9440 (2009).
- 52. Mladkova N & Chakravarti A Molecular profiling in glioblastoma: prelude to personalized treatment. Curr. Oncol. Rep 11, 53–61 (2009).
- 53. Noushmehr H et al. Identification of a CpG island methylator phenotype that defines a distinct subgroup of glioma. Cancer Cell 17, 510–522 (2010).
- 54. Kim H et al. Integrative genome analysis reveals an oncomir/oncogene cluster regulating glioblastoma survivorship. Proc. Natl Acad. Sci. USA 107, 2183–2188 (2010).
- 55. Niclou SP, Fack F & Rajcevic U Glioma proteomics: status and perspectives. J. Proteomics 73, 1823–1838 (2010).
- 56. Huang PH, Xu AM & White FM Oncogenic EGFR signaling networks in glioma. Science Signaling 2, re6 (2009).
- 57. Rich JN et al. A genetically tractable model of human glioma formation. Cancer Res 61, 3556–3560 (2001).
- 58. Deighton RF, McGregor R, Kemp J, McCulloch J & Whittle IR Glioma pathophysiology: insights emerging from proteomics. Brain Pathol. 20, 691–703 (2010).
- 59. Brennan C et al. Glioblastoma subclasses can be defined by activity among signal transduction pathways and associated genomic alterations. PLoS ONE 4, e7752 (2009).
- 60. Du J et al. Bead-based profiling of tyrosine kinase phosphorylation identifies SRC as a potential target for glioblastoma therapy. Nat. Biotechnol 27, 77–83 (2009).
- 61. Mir SE et al. In silico analysis of kinase expression identifies WEE1 as a gatekeeper against mitotic catastrophe in glioblastoma. Cancer Cell 18, 244–257 (2010).
- 62. Kanehisa Laboratories. KEGG: Kyoto Encyclopedia of Genes and Genomes [online], http://www.genome.jp/kegg/ (2010).
- 63. BioCarta. BioCarta Pathways [online], https://www.biocarta.com/genes/index.asp (2010).
- 64. Parsons DW et al. An integrated genomic analysis of human glioblastoma multiforme. Science 321, 1807–1812 (2008).
- 65. Cerami E, Demir E, Schultz N, Taylor BS & Sander C Automated network analysis identifies core pathways in glioblastoma. PLoS ONE 5, e8918 (2010).
- 66. Beckner ME et al. Identification of ATP citrate lyase as a positive regulator of glycolytic function in glioblastomas. Int. J. Cancer 126, 2282–2295 (2010).
- 67. Zheng H et al. PLAGL2 regulates Wnt signaling to impede differentiation in neural stem cells and gliomas. Cancer Cell 17, 497–509 (2010).
- 68. Wong DJ et al. Revealing targeted therapy for human cancer by gene module maps. Cancer Res 68, 369–378 (2008).
- 69. Keller A et al. A novel algorithm for detecting differentially regulated paths based on gene set enrichment analysis. Bioinformatics 25, 2787–2794 (2009).
- 70. Bredel M et al. Functional network analysis reveals extended gliomagenesis pathway maps and three novel MYC-interacting genes in human gliomas. Cancer Res 65, 8679–8689 (2005).
- 71. Wuchty S et al. Gene pathways and subnetworks distinguish between major glioma subtypes and elucidate potential underlying biology. J. Biomed. Inform 43, 945–952 (2010).
- 72. Carro MS et al. The transcriptional network for mesenchymal transformation of brain tumours. Nature 463, 318–325 (2009).
- 73. Bozdag S, Li A, Wuchty S & Fine HA FastMEDUSA: a parallelized tool to infer gene regulatory networks. Bioinformatics 26, 1792–1793 (2010).
- 74. FANTOM Consortium et al. The transcriptional network that controls growth arrest and differentiation in a human myeloid leukemia cell line. Nat. Genet 41, 553–562 (2009).
- 75. Cooper LAD et al. An integrative approach for in silico glioma research. IEEE Trans. Biomed. Eng 57, 2617–2621 (2010).
- 76. National Cancer Institute. REMBRANDT database [online], http://caintegrator-info.nci.nih.gov/rembrandt (2010).
- 77. Madhavan S et al. Rembrandt: helping personalized medicine become a reality through integrative translational research. Mol. Cancer Res 7, 157–167 (2009).
- 78. Li A, Bozdag S, Kotliarov Y & Fine HA GliomaPredict: a clinically useful tool for assigning glioma patients to specific molecular subtypes. BMC Med. Inform. Decis. Mak 10, 38 (2010).
- 79. National Cancer Institute. caArray [online], https://cabig.nci.nih.gov/tools/caArray (2011).
- 80. National Cancer Institute. Cancer Genome Anatomy Project [online], http://cgap.nci.nih.gov/cgap.html (2011).
- 81. National Cancer Institute. Cancer Central Clinical Database [online], https://cabig.nci.nih.gov/tools/c3d (2011).
- 82. National Cancer Institute. caCORE SDK [online], https://cabig.nci.nih.gov/tools/caCORE_SDK (2011).
- 83. National Cancer Institute. caBIG [online], https://cabig.nci.nih.gov/ (2011).
- 84. Broad Institute. GenePattern [online], http://www.broadinstitute.org/cancer/software/genepattern/ (2011).
- 85. National Cancer Institute. The Cancer Genome Atlas [online], http://cancergenome.nih.gov/ (2011).
- 86. Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455, 1061–1068 (2008).
- 87. Jones AR et al. The Functional Genomics Experiment model (FuGE): an extensible framework for standards in functional genomics. Nat. Biotechnol 25, 1127–1133 (2007).
- 88. Fenstermacher D et al. The Cancer Biomedical Informatics Grid (caBIG). Conf. Proc. IEEE Eng. Med. Biol. Soc 1, 743–746 (2005).
- 89. Kakazu KK, Cheung LWK & Lynne W The Cancer Biomedical Informatics Grid (caBIG): pioneering an expansive network of information and tools for collaborative cancer research. Hawaii Med. J 63, 273–275 (2004).
- 90. Deus HF et al. Exposing the Cancer Genome Atlas as a SPARQL endpoint. J. Biomed. Inform 43, 998.– (2010).
- 91. Gadaleta E, Lemoine NR & Chelala C Online resources of cancer data: barriers, benefits and lessons. Brief. Bioinform 12, 52–63 (2010).
- 92. Ovaska K et al. Large-scale data integration framework provides a comprehensive view on glioblastoma multiforme. Genome Med 2, 65 (2010).
- 93. Schadt EE, Friend SH & Shaywitz DA A network view of disease and compound screening. Nat. Rev. Drug Discov 8, 286–295 (2009).
- 94. Colman H et al. A multigene predictor of outcome in glioblastoma. Neuro Oncol 12, 49–57 (2010).
- 95. Staunton JE et al. Chemosensitivity prediction by transcriptional profiling. Proc. Natl Acad. Sci. USA 98, 10787–10792 (2001).
- 96. Lee JK et al. A strategy for predicting the chemosensitivity of human cancers and its application to drug discovery. Proc. Natl Acad. Sci. USA 104, 13086–13091 (2007).
- 97. Sos ML et al. Predicting drug susceptibility of non-small cell lung cancers based on genetic lesions. J. Clin. Invest 119, 1727–1740 (2009).
- 98. Wallqvist A, Rabow AA, Shoemaker RH, Sausville EA & Covell DG Establishing connections between microarray expression data and chemotherapeutic cancer pharmacology. Mol. Cancer Ther 1, 311–320 (2002).
- 99. McDermott U et al. Identification of genotype-correlated sensitivity to selective kinase inhibitors by using high-throughput tumor cell line profiling. Proc. Natl Acad. Sci. USA 104, 19936–19941 (2007).
- 100. Riddick G et al. Predicting in vitro drug sensitivity using Random Forests. Bioinformatics 27, 220–224 (2011).
- 101. Mori S, Chang JT, Andrechek ER, Potti A & Nevins JR Utilization of genomic signatures to identify phenotype-specific drugs. PLoS ONE 4, e6772 (2009).
- 102. Nelander S et al. Models from experiments: combinatorial drug perturbations of cancer cells. Mol. Syst. Biol 4, 216 (2008).
- 103. Ring BZ, Chang S, Ring LW, Seitz RS & Ross DT Gene expression patterns within cell lines are predictive of chemosensitivity. BMC Genomics 9, 74 (2008).
- 104. Hahn CK et al. Expression-based screening identifies the combination of histone deacetylase inhibitors and retinoids for neuroblastoma differentiation. Proc. Natl Acad. Sci. USA 105, 9751–9756 (2008).
- 105. Kamb A, Wee S & Lengauer C Why is cancer drug discovery so difficult? Nat. Rev. Drug Discov 6, 115–120 (2007).
- 106. Wiedemeyer WR et al. Pattern of retinoblastoma pathway inactivation dictates response to CDK4/6 inhibition in GBM. Proc. Natl Acad. Sci. USA 107, 11501–11506 (2010).
- 107. Diehn M et al. Identification of noninvasive imaging surrogates for brain tumor gene-expression modules. Proc. Natl Acad. Sci. USA 105, 5213–5218 (2008).
- 108. Segal E et al. Decoding global gene expression programs in liver cancer by noninvasive imaging. Nat. Biotechnol 25, 675–680 (2007).




