2018 Aug 10;11(1):31–39. doi: 10.1007/s12551-018-0446-z

Table 1.

Details of the key omics datasets available from representative cancer cell lines and patient genomic resources, along with the dataset sizes and dimensionalities of the raw and processed profiles for the NGS-based datatypes (rows in italics). For the other datatypes, only the dimensionality of the processed data is reported for comparison.

NCI-DREAM7¹ NCI-60² GDSC1000³ TCGA⁴/TCPA⁵
Cancer type Breast cancer 9 tissue types 29 tissue types 33 tissue types
Number of samples 53 cell lines 59 cell lines 1124 cell lines ~ 11,000 patient tumors
Total size of NGS datatypes (raw datasets) ~ 13 TB ~ 15 TB 260 TB ~ 2.5 PB
Whole genome sequencing (~ 3.2 billion reads/sample) ~ 8 TB ~ 9 TB ~ 170 TB ~ 1.6 PB
Whole exome sequencing (150 million reads/sample) ~ 4.5 TB ~ 5 TB ~ 90 TB ~ 0.8 PB
RNA sequencing (30–100 million reads/sample) ~ 55 GB ~ 60 GB ~ 1 TB ~ 11 TB
MicroRNA profiles ~ 5 GB ~ 500 GB
Total size (processed datasets) ~ 27 GB ~ 33 GB ~ 90 GB ~ 480 TB
Whole genome sequencing ~ 20,000 genes ~ 17,000 genes 19,100 genes ~ 21,000 genes
Whole exome sequencing ~ 22,000 genes ~ 13,000 genes ~ 23,000 genes ~ 20,000 genes
RNA sequencing ~ 40,000 transcripts ~ 60,000 transcripts ~ 50,000 transcripts ~ 55,000 transcripts
MicroRNA profiles ~ 800 miRNA transcripts ~ 1800 miRNA transcripts
Microarray gene expression ~ 18,000 genes 25,722 genes 17,737 genes ~ 22,000 genes
Somatic mutation calling ~ 33,000 SNPs⁶ ~ 500,000 SNPs ~ 485,000 SNPs ~ 500,000 SNPs
Copy number variation ~ 27,000 variants ~ 25,000 variants ~ 50,000 variants ~ 50,000 variants
DNA methylation patterns ~ 27,000 CpGs⁷ 20,000 CpGs ~ 35,000 CpGs ~ 486,000 CpGs
RPPA⁸ proteomics 131 proteins 162 proteins ~ 240 proteins
MS⁹ proteomics 10,350 proteins ~ 16,000 proteins¹⁰
Drug response data 28 compounds > 100,000 compounds 265 compounds Survival data for clinical treatments

Boldface entries represent total sizes of raw and processed datasets. These are not statistical significance values.

¹ NCI-DREAM7, DREAM7 Challenge (http://dreamchallenges.org/), organized together with the National Cancer Institute (NCI; Costello et al. 2014)

² NCI-60, The National Cancer Institute drug screening panel (Shoemaker 2006)

³ GDSC1000, Genomics of Drug Sensitivity in Cancer project (Yang et al. 2012)

⁴ TCGA, The Cancer Genome Atlas (http://cancergenome.nih.gov/; Weinstein et al. 2013)

⁵ TCPA, The Cancer Proteome Atlas (http://tcpaportal.org/tcpa/; Li et al. 2013)

⁶ SNPs, single-nucleotide polymorphisms

⁷ CpGs, CpG sites in DNA, where a cytosine (“C”) is followed by a guanine (“G”) and joined to it by a phosphodiester bond (“p”)

⁸ RPPA, reverse phase protein array

⁹ MS, mass spectrometry

¹⁰ CPTAC, Clinical Proteomic Tumor Analysis Consortium (https://proteomics.cancer.gov/programs/cptac)
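
The raw-data rows of Table 1 can be cross-checked with simple arithmetic. The sketch below is illustrative only. It sums the whole genome, whole exome and RNA sequencing sizes for each resource, compares the result with the stated raw total, and derives an approximate whole genome sequencing footprint per sample. It assumes decimal units (1 PB = 1000 TB, 1 TB = 1000 GB) and treats every approximate value in the table as exact. MicroRNA sizes are left out because the flattened table does not show which resources they belong to. The names `RAW_TB` and `summarize` are hypothetical and not part of the source.

```python
# Raw NGS sizes copied from Table 1, converted to terabytes.
# Assumption: decimal units (1 PB = 1000 TB, 1 TB = 1000 GB).
# MicroRNA profiles are omitted because their column assignment is unclear.
RAW_TB = {
    "NCI-DREAM7": {"samples": 53, "wgs": 8.0, "wes": 4.5, "rnaseq": 0.055, "total": 13.0},
    "NCI-60": {"samples": 59, "wgs": 9.0, "wes": 5.0, "rnaseq": 0.060, "total": 15.0},
    "GDSC1000": {"samples": 1124, "wgs": 170.0, "wes": 90.0, "rnaseq": 1.0, "total": 260.0},
    "TCGA/TCPA": {"samples": 11000, "wgs": 1600.0, "wes": 800.0, "rnaseq": 11.0, "total": 2500.0},
}


def summarize(entry):
    """Compare the summed components with the stated total for one resource."""
    component_sum = entry["wgs"] + entry["wes"] + entry["rnaseq"]
    return {
        # Sum of the three sequencing datatypes, in TB.
        "component_sum_tb": component_sum,
        # Relative difference between that sum and the stated raw total.
        "relative_gap": abs(component_sum - entry["total"]) / entry["total"],
        # Approximate whole genome sequencing size per sample, in GB.
        "wgs_gb_per_sample": 1000.0 * entry["wgs"] / entry["samples"],
    }


SUMMARY = {name: summarize(entry) for name, entry in RAW_TB.items()}

if __name__ == "__main__":
    for name, s in SUMMARY.items():
        print(
            f"{name}: components {s['component_sum_tb']:.1f} TB, "
            f"gap {100 * s['relative_gap']:.1f}%, "
            f"WGS {s['wgs_gb_per_sample']:.0f} GB/sample"
        )
```

Under these assumptions, the summed components fall within 10% of each stated total, and whole genome sequencing works out to roughly 150 GB per sample in all four resources. This suggests the raw sizes scale mainly with the number of samples.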