Molecular & Cellular Proteomics : MCP. 2019 Jul 15;18(9):1893–1898. doi: 10.1074/mcp.TIR119.001673

Integration and Analysis of CPTAC Proteomics Data in the Context of Cancer Genomics in the cBioPortal*

Pamela Wu ‡,§, Zachary J Heins, James T Muller, Lizabeth Katsnelson, Ino de Bruijn, Adam A Abeshouse, Nikolaus Schultz ‖,**, David Fenyö ‡,§,¶¶, Jianjiong Gao ‖,**,‡‡
PMCID: PMC6731080  PMID: 31308250

This work integrates the CPTAC mass spectrometry-based proteomics data into the cBioPortal. The data cover 77 breast, 95 colorectal, and 174 ovarian tumors that have already been profiled by TCGA for mutations, copy number alterations, gene expression, and DNA methylation.

Keywords: Proteogenomics, Mass Spectrometry, Phosphoproteome, Cancer Biology, Cancer Biomarker(s)


Highlights

  • Support for mass spectrometry-based proteomics in cBioPortal.

  • User-friendly web interface, a web API, and an R client to query proteogenomic data.

  • Integration of Clinical Proteomic Tumor Analysis Consortium data with cBioPortal.

Abstract

The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced extensive mass spectrometry-based proteomics data for selected breast, colon, and ovarian tumors from The Cancer Genome Atlas (TCGA). We have incorporated the CPTAC proteomics data into the cBioPortal to support easy exploration and integrative analysis of these proteomic datasets in the context of the clinical and genomics data from the same tumors. cBioPortal is an open source platform for exploring, visualizing, and analyzing multidimensional cancer genomics and clinical data. The public instance of the cBioPortal (http://cbioportal.org/) hosts more than 200 cancer genomics studies, including all of the data from TCGA. Its biologist-friendly interface provides many rich analysis features, including a graphical summary of gene-level data across multiple platforms, correlation analysis between genes or other data types, survival analysis, and per-patient data visualization. Here, we present the integration of the CPTAC mass spectrometry-based proteomics data into the cBioPortal, consisting of 77 breast, 95 colorectal, and 174 ovarian tumors that have already been profiled by TCGA for mutations, copy number alterations, gene expression, and DNA methylation. As a result, the CPTAC data can now be easily explored and analyzed in the cBioPortal in the context of clinical and genomics data. Integrating CPTAC data into cBioPortal overcomes limitations of the TCGA proteomics array data and provides a user-friendly web interface, a web API, and an R client for querying the mass spectrometry data together with genomic, epigenomic, and clinical data.


In the last decade, The Cancer Genome Atlas (TCGA) consortium has generated multiplatform cancer genomics data, including somatic mutations, copy number alterations, gene expression, and DNA methylation, in more than 30 cancer types (1). TCGA also generated some proteomics data using the RPPA platform, measuring protein levels in tumors for about 150 proteins and 50 phosphoproteins (2). However, the RPPA technology is limited by the availability and binding efficiency of antibodies for detecting proteins and posttranslational modifications. The Clinical Proteomic Tumor Analysis Consortium (CPTAC)¹ aims to characterize the protein inventory in tumors by leveraging the latest developments in mass spectrometry-based discovery proteomics. The initial CPTAC projects have generated extensive mass spectrometry-based proteomics data on TCGA tumors of breast cancer (proteome and phosphoproteome) (3, 4), ovarian cancer (proteome, phosphoproteome, and glycoproteome) (5, 6), and colorectal cancer (proteome only, with matching normal samples) (7–9). Currently, the consortium is extending the proteogenomic analysis to tumor samples for other cancer types, including clear cell renal cell carcinoma, endometrial carcinoma, lung adenocarcinoma, lung squamous cell carcinoma, glioblastoma multiforme, head and neck squamous cell carcinoma, and pancreatic ductal adenocarcinoma.

By profiling the same cancer patients already profiled by TCGA, CPTAC results provide a unique opportunity for performing integrative analysis of cancer genomics and proteomics data, which can link proteomics to genotypes and potentially phenotypes in cancer. There is a growing demand for exploratory analysis tools that integrate cancer genomics, proteomics, and clinical data and provide easy access to the multidimensional datasets, and efforts such as TCPA and LinkedOmics have started to fill this gap (10, 11). The cBioPortal for Cancer Genomics has been one of the leading resources for analyzing cancer genomics data, including all TCGA projects and many datasets curated from the literature (12, 13). cBioPortal features a biologist-friendly interface, biology-aware visualization, and integrative analysis features, making it one of the most popular resources in the community of cancer genomics researchers, especially biologists without bioinformatics skills.

We have updated cBioPortal to integrate the CPTAC mass spectrometry-based proteomics data, making the high-quality proteomics data easily accessible for visualization and analysis in the context of cancer genomics. First, we developed a data transformation pipeline for converting mass spectrometry results produced by CPTAC members into a data format that is compatible with the cBioPortal data pipeline. This data transformation pipeline is also applicable for uploading non-CPTAC mass spectrometry results into cBioPortal, as they become available. In order to better support integrative analysis of genomics and proteomics data, we have made various adjustments to the cBioPortal interface, including protein expression heatmap visualization in OncoPrint (graphic summary of gene-level data and clinical attributes), correlation analysis, and differential analysis. By integrating CPTAC data into cBioPortal, we aim to increase the accessibility of mass spectrometry-based proteomics data to cancer researchers in the context of cancer genomics and provide researchers with an intuitive interface to explore and analyze interactions between genomics and proteomics in cancer.

Experimental Procedures and Results

Introduction to the cBioPortal for Cancer Genomics Interface

The public instance of the cBioPortal hosts more than 200 studies, including all TCGA projects and published studies from the literature. A key step in cBioPortal that enables the integration of various datasets is gene-level mapping of the genomics data. For example, by mapping mutations, copy number, and gene expression onto a gene, the visualization and analysis features of the cBioPortal support studying different types of gene alterations in tumors simultaneously. These features include, among others:

Gene-oriented Query

A simple web form that allows users to query alterations to genes of interest in individual cancer studies or across studies. Samples are classified into the altered group (if there is an alteration in any of the query genes in the sample) or the unaltered group for downstream analysis.

OncoPrint

A graphical summary of genomic alterations across samples in the query genes, represented by different glyphs and color coding. This graphical representation gives an overview of genomic alterations to the genes of interest in a selected cohort. Clinical attributes can also be visualized together with the genomics data. To better support the visualization of proteomics data, we have added heatmap visualization into OncoPrint.

Correlation Analysis

The Plots tab can be used to perform correlation analysis between genomics data (copy number alteration and gene expression), epigenomics data (DNA methylation), proteomics data (RPPA, and now mass spectrometry-based data), and clinical attributes. Mutation and copy number data can be overlaid onto the correlation plots.

Co-expression Analysis

For each queried gene, Pearson's and Spearman's correlation coefficients are calculated against all other genes in a selected gene expression profile.

Enrichments Analysis

After classifying samples into altered and unaltered groups based on the query, this analysis identifies mutations or copy number alterations that are enriched in either group (by a Fisher's exact test). It also identifies genes and proteins that are over/underexpressed in a group (by a two-sided, two-sample Student's t test).

Survival Analysis

Kaplan–Meier estimators and plots for overall survival and disease-free or progression-free survival are generated to study the survival difference between altered and unaltered groups.

Mutual Exclusivity Analysis

A Fisher's exact test is used to analyze whether alterations are significantly mutually exclusive or co-occurring between every pair of query genes.

Patient View Page

Summary and visualization of clinical attributes and genomic alterations in a tumor or a patient.

Study View Page

Summary, visualization, and interactive exploration of clinical attributes and genomic alterations in a cohort of tumors.
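
The enrichment and mutual exclusivity analyses listed above both rest on a Fisher's exact test applied to a 2 × 2 table of sample counts. The following is a minimal sketch of that test in plain Python, not the cBioPortal implementation; the function names and the six-sample toy cohort are hypothetical and serve only to illustrate the calculation.

```python
from math import comb


def contingency_table(altered_a, altered_b):
    """Count samples by alteration status in gene A and gene B."""
    pairs = list(zip(altered_a, altered_b))
    both = sum(1 for a, b in pairs if a and b)
    only_a = sum(1 for a, b in pairs if a and not b)
    only_b = sum(1 for a, b in pairs if b and not a)
    neither = sum(1 for a, b in pairs if not a and not b)
    return both, only_a, only_b, neither


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def tendency(a, b, c, d):
    """Label the direction of association from observed vs. expected overlap."""
    n = a + b + c + d
    expected_both = (a + b) * (a + c) / n
    return "co-occurrence" if a > expected_both else "mutual exclusivity"


# Hypothetical toy cohort: alteration calls for two genes in six samples
gene_a = [True, True, True, False, False, False]
gene_b = [False, False, False, True, True, True]
table = contingency_table(gene_a, gene_b)
p_value = fisher_exact_two_sided(*table)
print(table, tendency(*table), p_value)
```

With so few samples the test cannot reach conventional significance even for a perfectly exclusive pattern, which is one reason cohort size matters when interpreting these tabs.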

Transforming CPTAC Data for the cBioPortal Database

Processed CPTAC mass spectrometry data were downloaded from the CPTAC Data Portal (https://cptac-data-portal.georgetown.edu/). The downloaded data had been processed using the Common Data Analysis Pipeline (CDAP) (14), which standardizes the treatment of data across the consortium. The CDAP software is flexible enough to accommodate different types of mass spectrometry data. Peptides are identified using MS-GF+ (15), and the protein FASTA file used is a concatenation of RefSeq Homo sapiens build 37 with the sequence for Sus scrofa trypsinogen added. The MS-GF+ phosphopeptide assignments are fed into PhosphoRS to obtain phosphosite localizations. The results are written to both peptide-level and protein-level tab-separated files consisting of iTRAQ ratios for the breast and ovarian samples and precursor area intensity measurements for the colon samples.

We mapped the proteomics data onto genes because the query interface and analysis features of cBioPortal are gene centric, so protein levels were mapped directly to their corresponding genes by mapping RefSeq protein IDs to HUGO gene symbols. Only peptides that mapped uniquely to a single gene were included in the quantitation. By using gene quantitation with unique peptides, the issue of mapping between proteins and peptides was avoided (including protein groups and one-to-many and many-to-many mapping). For PTM data, we focused on the phosphoproteomics data. For phosphoprotein levels (currently, the only PTM added to cBioPortal), we make special symbols for them with the following pattern:

<HUGOgenesymbol>_P{S,T,Y}<AAmodified>

For example, EIF4EBP1 phosphorylated at the serine located at position 65 is denoted as EIF4EBP1_PS65. In order to easily access phosphoproteins while querying, each phosphoprotein in the database has as an alias

PHOSPHO<HUGOgenesymbol>

which, when queried, brings up a list of phosphoproteins sharing that alias, from which the user can choose the phosphoproteins of interest.
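
The naming scheme above can be sketched in a few lines of Python. This is an illustration of the pattern only, not code from the cBioPortal pipeline: the function names are hypothetical, the regular expression assumes gene symbols made of letters, digits, and hyphens, and the second site in the example list is included purely for illustration.

```python
import re

# <HUGOgenesymbol>_P{S,T,Y}<AAmodified>, e.g. EIF4EBP1_PS65
PHOSPHOSITE = re.compile(
    r"^(?P<gene>[A-Za-z0-9-]+)_P(?P<residue>[STY])(?P<position>\d+)$"
)


def make_symbol(gene, residue, position):
    """Build the phosphosite symbol for a gene, residue (S, T, or Y), and position."""
    if residue not in ("S", "T", "Y"):
        raise ValueError("residue must be S, T, or Y")
    return f"{gene}_P{residue}{int(position)}"


def parse_symbol(symbol):
    """Split a phosphosite symbol into (gene, residue, position), or return None."""
    match = PHOSPHOSITE.match(symbol)
    if match is None:
        return None
    return match["gene"], match["residue"], int(match["position"])


def alias_for(gene):
    """Alias shared by all phosphosites of one gene: PHOSPHO<HUGOgenesymbol>."""
    return f"PHOSPHO{gene}"


def build_alias_index(symbols):
    """Group phosphosite symbols under their shared alias for lookup at query time."""
    index = {}
    for symbol in symbols:
        parsed = parse_symbol(symbol)
        if parsed is not None:
            index.setdefault(alias_for(parsed[0]), []).append(symbol)
    return index


symbols = [make_symbol("EIF4EBP1", "S", 65), "EIF4EBP1_PT70", "not_a_site"]
index = build_alias_index(symbols)
print(index)
```

Querying the alias then amounts to a dictionary lookup that returns every phosphosite symbol registered for that gene.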

In order for the data to be used by cBioPortal, the distribution of intensities per sample must be reasonably close to a normal distribution because the interface relies on Z-scores for many of its computations and displays. For both the iTRAQ and the label-free quantitation data, the log-transformed intensities are approximately normally distributed, and therefore their values were kept as intensities. In addition, the global proteomics and PTM data were formatted as required by the cBioPortal import pipeline.
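
As a rough illustration of why approximate normality matters, the sketch below log2-transforms a list of intensities and converts them to Z-scores. It is an assumption-laden example rather than the portal's own code: the helper names are hypothetical, the values are standardized against their own mean and sample standard deviation, and the reference population that cBioPortal actually uses for its Z-scores is not specified here.

```python
from math import log2
from statistics import mean, stdev


def log2_transform(intensities):
    """Log2-transform positive intensities; missing or non-positive values become None."""
    return [log2(v) if v is not None and v > 0 else None for v in intensities]


def z_scores(values):
    """Standardize non-missing values against their own mean and sample SD."""
    observed = [v for v in values if v is not None]
    mu, sd = mean(observed), stdev(observed)
    return [None if v is None else (v - mu) / sd for v in values]


# Hypothetical intensities; the zero stands in for a value that was not measured
raw = [1.0, 2.0, 4.0, 8.0, 0.0]
logged = log2_transform(raw)
scores = z_scores(logged)
print(logged)
print(scores)
```

Keeping missing values as None, rather than dropping them, preserves the alignment between values and sample identifiers.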

We also implemented an option to convert MS intensity values into estimates of protein copy number per cell, enabled with the -proteomic-ruler parameter. This function uses the methods outlined in Wiśniewski et al. (16) to transform protein intensities into copy number values. The calculation is based on two observations: 1) DNA and histone mass are fairly constant per cell, and 2) the ratio between protein and histone mass is proportional to the ratio between protein and histone mass spectrometry signals. The equation they have outlined is as follows:

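$$N_{\mathrm{protein}} = S_{\mathrm{protein}} \cdot \frac{N_{\mathrm{Avogadro}}}{M_{\mathrm{molar}}} \cdot \frac{M_{\mathrm{DNA}}}{S_{\mathrm{histone}}}$$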

N stands for number (Avogadro's number, 6.022 × 10^23, and the measured protein's copy number), S stands for mass spectrometry signal value (of histones and the measured protein), and M stands for mass (of DNA per cell, which is estimated to be 6.5 pg (16), and the molar mass of the measured protein). cBioPortal currently only hosts the raw, unprocessed intensities, but this feature for the processing pipeline has been added to increase the interpretability of intensity values of mass spectrometry data.
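
The proteomic ruler relation described above can be worked through numerically. The sketch below is a hedged illustration, not the pipeline's implementation: the function and constant names are hypothetical, the signal values and the 50 kDa molar mass are invented for the example, and only Avogadro's number and the 6.5 pg DNA mass per cell come from the text.

```python
AVOGADRO = 6.022e23            # molecules per mole
DNA_MASS_PER_CELL_G = 6.5e-12  # 6.5 pg of DNA per cell, expressed in grams


def copies_per_cell(protein_signal, histone_signal, molar_mass_g_per_mol,
                    dna_mass_g=DNA_MASS_PER_CELL_G):
    """Estimate protein copy number per cell from MS signals.

    N_protein = S_protein * (N_Avogadro / M_molar) * (M_DNA / S_histone)
    """
    if histone_signal <= 0 or molar_mass_g_per_mol <= 0:
        raise ValueError("histone signal and molar mass must be positive")
    return (protein_signal
            * (AVOGADRO / molar_mass_g_per_mol)
            * (dna_mass_g / histone_signal))


# Hypothetical inputs: a 50 kDa protein whose signal is 1% of the summed histone signal
estimate = copies_per_cell(protein_signal=1.0,
                           histone_signal=100.0,
                           molar_mass_g_per_mol=50_000.0)
print(f"{estimate:.0f} copies per cell")
```

Because only the ratio of protein signal to histone signal enters the formula, the estimate does not depend on the absolute scale of the intensities.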

The classes created for the processing of CDAP and MaxQuant proteome and PTM file formats can be found on GitHub (see Supplementary Information 5), which also includes a usage tutorial that leads users through the API. For MaxQuant proteome data, which is generally contained in the “proteinGroups.txt” file, intensity data are filtered for q-values under 0.05. MaxQuant PTM data are filtered for a localization probability of over 0.75.
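
The two MaxQuant filters described above can be sketched as follows. This is not the code from the GitHub repository: the helper names are hypothetical, the column names "Q-value" and "Localization prob" are assumptions about the MaxQuant output headers, and the small in-memory tables stand in for real files.

```python
import csv
import io


def filter_rows(tsv_text, column, keep):
    """Return rows of a tab-separated table whose numeric `column` passes `keep`."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        try:
            value = float(row[column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric value
        if keep(value):
            kept.append(row)
    return kept


def as_tsv(rows):
    """Join a list of field lists into tab-separated text."""
    return "\n".join("\t".join(fields) for fields in rows)


# Hypothetical stand-ins for a proteome table and a phosphosite table
protein_groups = as_tsv([
    ["Protein IDs", "Q-value", "Intensity"],
    ["P1", "0.001", "1500000"],
    ["P2", "0.20", "42000"],
    ["P3", "0.049", "880000"],
])
phosphosites = as_tsv([
    ["Protein", "Position", "Localization prob"],
    ["P1", "65", "0.99"],
    ["P1", "70", "0.50"],
    ["P3", "12", "0.75"],
])

# Proteome: keep q-values under 0.05; PTM: keep localization probability over 0.75
kept_proteins = filter_rows(protein_groups, "Q-value", lambda q: q < 0.05)
kept_sites = filter_rows(phosphosites, "Localization prob", lambda p: p > 0.75)
print([row["Protein IDs"] for row in kept_proteins])
print([(row["Protein"], row["Position"]) for row in kept_sites])
```

Both thresholds are strict inequalities here, so a site with a localization probability of exactly 0.75 is excluded.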

Use Case 1: The cBioPortal web Interface Facilitates Exploration of Regulatory Patterns of mRNA and Protein Expression Data of Cancer Patients

The CPTAC proteomics data can now be interactively analyzed and visualized in the context of TCGA genomics and clinical data in the cBioPortal. We have added the new proteomics data in the query interface as a new genetic profile (currently for TCGA breast, colorectal, and ovarian provisional studies), which allows for querying the protein and phosphoprotein levels together with gene-level genomics data (mutations, copy number alterations, and gene expression). After submitting a query, the proteomics data can be explored in downstream visualization and analysis features.

One new feature we added was to support the generation of a heatmap of expression data in the OncoPrint tab. Upon query submission, the user now has an option to append heatmaps of gene and protein expression to the OncoPrint summarizing genomic alterations of the query genes and samples. The new heatmap feature allows researchers to identify expression patterns at a glance. For example, it is possible to view the protein levels of a gene and its identified phosphosites and cluster the data along both dimensions (Fig. 1). These new features can be combined with previously released features to gain an integrative overview of genomic and clinical attributes of the patient samples, which can be used to easily identify data trends or individual patient samples of interest. For example, in Fig. 1, it can be observed that ERBB2 copy number amplification, mRNA expression, and protein and phosphosite level detection are all very closely correlated with whether ERBB2 receptors are detected on the surface of tumors, as indicated by the HER2-immunohistochemistry score. The RPPA measurements for ERBB2 are highly correlated with mass spectrometry-based measurements (ρ = 0.82) (Fig. S1). We can also observe that TCGA-A2-A0T6 has an unusual mutation spectrum that is enriched for the thymine to guanine transversion. This patient, diagnosed with breast cancer lobular carcinoma, has a lower than average diagnosis age and overall survival. Interestingly, this tumor sample has three oncogenic ERBB2 mutations (shown when hovering over the sample), and its mRNA expression of ERBB2 is above average but its protein and phosphoprotein expression is below average.

Fig. 1.

OncoPrint visualization of clinical, genomic, and proteomic features of 77 TCGA breast tumors. Clinical information for this example includes mutation spectrum, diagnosis age, overall survival, detailed cancer type, and HER2 (ERBB2) immunohistochemistry score, which indicates whether or not immunohistochemistry staining was able to detect ERBB2 receptors on the surface of the tumors (negative corresponds to a score of 0–1, equivocal corresponds to a score of 2, and 3+ is positive for ERBB2). ERBB2 genomic and proteomic features are visualized, including mutations, copy number amplifications, mRNA, and protein and phosphoprotein levels (based on CPTAC data).

Other tabs can also allow users to explore potential correlations in depth. For example, the Plots tab can be used to select the DNA copy number, methylation, gene expression, or protein levels from the list of queried genes and posttranslational modifications, which allows the user to browse the Pearson and Spearman correlations between the protein levels of ERBB2 and its phosphosites. By modifying the query to select other cancer types, it is also possible to see how these correlations shift across cancers (Fig. S2).
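
For readers who want to reproduce the two coefficients reported in the Plots tab on exported data, the sketch below computes Pearson's and Spearman's correlations in plain Python. It is an illustration under stated assumptions, not the portal's code: the function names are hypothetical, the two value lists are invented rather than taken from CPTAC, and Spearman's coefficient is obtained as Pearson's coefficient on average ranks.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical protein and phosphosite levels across five samples
protein_level = [1.0, 2.0, 3.0, 4.0, 5.0]
phosphosite_level = [1.0, 4.0, 9.0, 16.0, 25.0]
r = pearson(protein_level, phosphosite_level)
rho = spearman(protein_level, phosphosite_level)
print(round(r, 4), round(rho, 4))
```

The toy data are monotonic but not linear, so Spearman's coefficient reaches 1 while Pearson's stays below it, which is the usual reason to look at both.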

To illustrate the value of data exploration using well-established examples, we centered on ERBB2, which is a known oncogene in breast and ovarian cancer. The gene sequences of ERBB2 and GRB7 lie on the same amplicon on chromosome 17q12–21 (17). It has been shown that GRB7 can act as an ERBB2-dependent oncogene by enhancing ERBB2 phosphorylation (18). Using the cBioPortal Plots tab, users can see a perfect co-occurrence of copy number categorizations between ERBB2 and GRB7. This persists as a high correlation between copy number and mRNA levels and mRNA and protein levels. On the other hand, ERBB4, which is in the same receptor family as ERBB2, is not located on the same amplicon, so their copy number changes are not correlated. This explains why mRNA levels of ERBB2 and ERBB4 are not correlated. However, their protein levels are correlated, consistent with ERBB2 and ERBB4 forming a heteromeric complex (Fig. S3) (19).

Use Case 2: Exploring cBioPortal Data Through its Web API Using the R Package cgdsr to Access the Full Database Now Loaded with Mass Spectrometry Data

While the cBioPortal web interface offers many prebuilt analysis tools for exploratory analysis, users may want to perform further custom and automated analysis of the data that cannot be done on the interface. The cBioPortal public database has integrated all the different genomic profiles from TCGA, CPTAC, and other sources together into a MySQL database and made the data available programmatically via a web API. The R package cgdsr accesses all the genomics and clinical data by connecting to the web API. An example R script to access cBioPortal data including the proteomics data is provided here (see Supplementary Information 6). In this example, we looked at how well PAM50-based subtypes are conserved when using mass spectrometry-based proteomics values for clustering. Using the “ward.D2” setting for the hclustfun parameter of heatmap.2 in R, we performed hierarchical clustering on the 77 breast cancer samples containing mass spectrometry data and color coded the samples by their PAM50 subtype classification (see Fig. S4). A more rigorous version of this analysis has been performed on the same data (4), but this example shows the streamlining of data retrieval and analysis that is achieved by wrapping the CPTAC data with the cBioPortal API (see Supplementary Script S1).

Use Case 3: Loading Unpublished or Institution-specific Proteomics Data onto a Local Private cBioPortal Instance for Integrative Analysis

Many institutions now have a private instance of cBioPortal hosting institution-specific research and/or clinical data, which allows researchers to perform cBioPortal-enabled exploratory analysis on private institutional data. An import pipeline now exists to convert mass spectrometry data processed by CDAP into a format for effective loading into the cBioPortal database. All features and functionalities have been released to the cBioPortal GitHub code repository so that any institutions with a private instance of cBioPortal can utilize these features for exploring their private proteogenomic dataset. Instructions for the deployment of a private instance can be found in the documentation (see Supplementary Information 7).

CONCLUSION

Integrating multiple -omics domains and synthesizing them into actionable insights is a modern challenge created by the proliferation of high-throughput sequencing and spectrometry techniques. To address this challenge, the cBioPortal has been developed to provide a public and user-friendly interface for performing interactive and integrative data exploration, analysis, and visualization. To incorporate the mass spectrometry data from CPTAC, we have updated the cBioPortal data pipeline and the public web interface. The data processing pipeline not only enables the integration of current and future CPTAC data but also allows private mass spectrometry data to be loaded into databases of institutional instances of the cBioPortal. We also updated the web interface and web API of cBioPortal to support easy access and better visualization and analysis of the mass spectrometry data in the context of cancer genomics and clinical datasets.

The inclusion of mass spectrometry data from CPTAC in the cBioPortal is a significant step toward a better understanding of the molecular profile of various cancer types. In TCGA projects, RPPA was adopted as the proteomics platform to profile key proteins, which has generated a valuable resource for studying cancer proteomics and their association with cancer genomics. However, because it depends on the availability of antibodies, the coverage of RPPA for measuring the proteome (both total proteins and PTMs) is limited, while mass spectrometry can profile a large portion of the proteome and many different types of PTMs without bias in the initial data collection associated with the availability of detection agents (for example, see comparison in Fig. S4). At the moment, the number of tumors analyzed by mass spectrometry is still considerably smaller than the number analyzed by RPPA (see Table I), but with continuing efforts such as CPTAC, we expect to see a large increase in the number of tumors and cancer types profiled by mass spectrometry. All in all, integrating mass spectrometry data with genomics data provides opportunities to uncover novel associations between genome, proteome, and phenotype in cancer.

Table I. Cancer studies in cBioPortal with CPTAC mass spectrometry data as of July 2019. Query Menu Location refers to the location in the tree structure that organizes the cancer studies on the homepage of cBioPortal where the initial query parameters are set.

Tissue | RPPA Samples | MS Samples | Query Menu Location
Breast | 892 | 77 | Breast → Invasive Breast Carcinoma → Breast Invasive Carcinoma (TCGA, Provisional)
Colorectal | 357 | 95 | Bowel → Colorectal Adenocarcinoma → Colorectal Adenocarcinoma (TCGA, Provisional)
Ovarian | 498 | 174 | Ovary → Ovarian Epithelial Tumor → Serous Ovarian Cancer → High-Grade Serous Ovarian Cancer → Ovarian Serous Cystadenocarcinoma (TCGA, Provisional)

DATA AVAILABILITY

All data are available at the CPTAC Data Portal (https://cptac-data-portal.georgetown.edu/cptacPublic/) and the Genomic Data Commons (GDC, https://gdc.cancer.gov/) and have been deposited into the public cBioPortal (https://cbioportal.org).

Supplementary Material

Supplementary Script S1 and Figures S1-S4

Footnotes

* P.W. and L.K. were funded by the Google Summer of Code program. D.F. was funded by National Cancer Institute (NCI) CPTAC award U24 CA210972 and by contract 13XS068 from Leidos Biomedical Research. J.G. and N.S. were funded by the Marie-Josée and Henry R. Kravis Center for Molecular Oncology, a National Cancer Institute Cancer Center Core Grant (P30-CA008748), an NCI ITCR grant (NCI-U24CA220457), and the Robertson Foundation (N.S.).

This article contains supplemental material Figs. S1–S4.

1 The abbreviations used are:

CPTAC: Clinical Proteomic Tumor Analysis Consortium
TCGA: The Cancer Genome Atlas
CDAP: Common Data Analysis Pipeline
RPPA: reverse-phase protein array
HUGO: Human Genome Organization
iTRAQ: isobaric tag for relative quantitation
FASTA: a format for encoding protein and nucleotide sequences
PTM: posttranslational modification
API: application programming interface

REFERENCES

  • 1. Cancer Genome Atlas Network, Weinstein J. N., Collisson E. A., Mills G. B., Shaw K. R., Ozenberger B. A., Ellrott K., Shmulevich I., Sander C., and Stuart J. M. (2013) The Cancer Genome Atlas Pan-Cancer analysis project. Nat. Genet. 45, 1113–1120
  • 2. Akbani R., Ng P. K., Werner H. M., Shahmoradgoli M., Zhang F., Ju Z., Liu W., Yang J. Y., Yoshihara K., Li J., Ling S., Seviour E. G., Ram P. T., Minna J. D., Diao L., Tong P., Heymach J. V., Hill S. M., Dondelinger F., Städler N., Byers L. A., Meric-Bernstam F., Weinstein J. N., Broom B. M., Verhaak R. G., Liang H., Mukherjee S., Lu Y., and Mills G. B. (2014) A pan-cancer proteomic perspective on The Cancer Genome Atlas. Nat. Commun. 5, 3887
  • 3. Cancer Genome Atlas Network. (2012) Comprehensive molecular portraits of human breast tumours. Nature 490, 61–70
  • 4. Mertins P., Mani D. R., Ruggles K. V., Gillette M. A., Clauser K. R., Wang P., Wang X., Qiao J. W., Cao S., Petralia F., Kawaler E., Mundt F., Krug K., Tu Z., Lei J. T., Gatza M. L., Wilkerson M., Perou C. M., Yellapantula V., Huang K. L., Lin C., McLellan M. D., Yan P., Davies S. R., Townsend R. R., Skates S. J., Wang J., Zhang B., Kinsinger C. R., Mesri M., Rodriguez H., Ding L., Paulovich A. G., Fenyö D., Ellis M. J., and Carr S. A.; NCI CPTAC. (2016) Proteogenomics connects somatic mutations to signalling in breast cancer. Nature 534, 55–62
  • 5. Cancer Genome Atlas Network. (2011) Integrated genomic analyses of ovarian carcinoma. Nature 474, 609–615
  • 6. Zhang H., Liu T., Zhang Z., Payne S. H., Zhang B., McDermott J. E., Zhou J. Y., Petyuk V. A., Chen L., Ray D., Sun S., Yang F., Chen L., Wang J., Shah P., Cha S. W., Aiyetan P., Woo S., Tian Y., Gritsenko M. A., Clauss T. R., Choi C., Monroe M. E., Thomas S., Nie S., Wu C., Moore R. J., Yu K. H., Tabb D. L., Fenyö D., Bafna V., Wang Y., Rodriguez H., Boja E. S., Hiltke T., Rivers R. C., Sokoll L., Zhu H., Shih I. M., Cope L., Pandey A., Zhang B., Snyder M. P., Levine D. A., Smith R. D., Chan D. W., and Rodland K. D.; CPTAC Investigators. (2016) Integrated proteogenomic characterization of human high-grade serous ovarian cancer. Cell 166, 755–765
  • 7. Cancer Genome Atlas Network. (2012) Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330–337
  • 8. Zhang B., Wang J., Wang X., Zhu J., Liu Q., Shi Z., Chambers M. C., Zimmerman L. J., Shaddox K. F., Kim S., Davies S. R., Wang S., Wang P., Kinsinger C. R., Rivers R. C., Rodriguez H., Townsend R. R., Ellis M. J., Carr S. A., Tabb D. L., Coffey R. J., Slebos R. J., and Liebler D. C.; NCI CPTAC. (2014) Proteogenomic characterization of human colon and rectal cancer. Nature 513, 382–387
  • 9. Slebos R. J., Wang X., Wang X., Wang X., Zhang B., Tabb D. L., and Liebler D. C. (2015) Proteomic analysis of colon and rectal carcinoma using standard and customized databases. Scientific Data 2, 150022
  • 10. Li J., Lu Y., Akbani R., Ju Z., Roebuck P. L., Liu W., Yang J. Y., Broom B. M., Verhaak R. G. W., Kane D. W., Wakefield C., Weinstein J. N., Mills G. B., and Liang H. (2013) TCPA: A resource for cancer functional proteomics data. Nat. Methods 10, 1046–1047
  • 11. Vasaikar S. V., Straub P., Wang J., and Zhang B. (2018) LinkedOmics: Analyzing multi-omics data within and across 32 cancer types. Nucleic Acids Res. 46, D956–D963
  • 12. Cerami E., Gao J., Dogrusoz U., Gross B. E., Sumer S. O., Aksoy B. A., Jacobsen A., Byrne C. J., Heuer M. L., Larsson E., Antipin Y., Reva B., Goldberg A. P., Sander C., and Schultz N. (2012) The cBio Cancer Genomics Portal: An open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2, 401–404
  • 13. Gao J., Aksoy B. A., Dogrusoz U., Dresdner G., Gross B., Sumer S. O., Sun Y., Jacobsen A., Sinha R., Larsson E., Cerami E., Sander C., and Schultz N. (2013) Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci. Signal. 6, 1–20
  • 14. Rudnick P. A., Markey S. P., Roth J., Mirokhin Y., Yan X., Tchekhovskoi D. V., Edwards N. J., Thangudu R. R., Ketchum K. A., Kinsinger C. R., Mesri M., Rodriguez H., and Stein S. E. (2016) A description of the Clinical Proteomic Tumor Analysis Consortium (CPTAC) common data analysis pipeline. J. Proteome Res. 15, 1023–1032
  • 15. Kim S., and Pevzner P. A. (2014) MS-GF+ makes progress towards a universal database search tool for proteomics. Nat. Commun. 5, 5277
  • 16. Wiśniewski J. R., Hein M. Y., Cox J., and Mann M. (2014) A “proteomic ruler” for protein copy number and concentration estimation without spike-in standards. Mol. Cell. Proteomics 13, 3497–3506
  • 17. Katoh M., and Katoh M. (2003) MGC9753 gene, located within PPP1R1B-STARD3-ERBB2-GRB7 amplicon on human chromosome 17q12, encodes the seven-transmembrane receptor with extracellular six-cystein domain. Intl. J. Oncol. 22, 1369–1374
  • 18. Saito M., Kato Y., Ito E., Fujimoto J., Ishikawa K., Doi A., Kumazawa K., Matsui A., Takebe S., Ishida T., Azuma S., Mochizuki H., Kawamura Y., Yanagisawa Y., Honma R., Imai J., Ohbayashi H., Goshima N., Semba K., and Watanabe S. (2012) Expression screening of 17q12–21 amplicon reveals GRB7 as an ERBB2-dependent oncogene. FEBS Lett. 586, 1708–1714
  • 19. Yarden Y., and Sliwkowski M. X. (2001) Untangling the ErbB signalling network. Nat. Rev. Mol. Cell Biol. 2, 127–137


Articles from Molecular & Cellular Proteomics : MCP are provided here courtesy of American Society for Biochemistry and Molecular Biology
