Table 1.
| | NCI-DREAM7¹ | NCI-60² | GDSC1000³ | TCGA⁴/TCPA⁵ |
---|---|---|---|---|
Cancer type | Breast cancer | 9 tissue types | 29 tissue types | 33 tissue types |
Number of samples | 53 cell lines | 59 cell lines | 1124 cell lines | ~ 11,000 patient tumors |
**Total size of NGS data types (raw datasets)** | **~ 13 TB** | **~ 15 TB** | **260 TB** | **~ 2.5 PB** |
Whole genome sequencing (~ 3.2 billion reads/sample) | ~ 8 TB | ~ 9 TB | ~ 170 TB | ~ 1.6 PB |
Whole exome sequencing (150 million reads/sample) | ~ 4.5 TB | ~ 5 TB | ~ 90 TB | ~ 0.8 PB |
RNA sequencing (30–100 million reads/sample) | ~ 55 GB | ~ 60 GB | ~ 1 TB | ~ 11 TB |
MicroRNA profiles | – | ~ 5 GB | – | ~ 500 GB |
**Total size (processed datasets)** | **~ 27 GB** | **~ 33 GB** | **~ 90 GB** | **~ 480 TB** |
Whole genome sequencing | ~ 20,000 genes | ~ 17,000 genes | 19,100 genes | ~ 21,000 genes |
Whole exome sequencing | ~ 22,000 genes | ~ 13,000 genes | ~ 23,000 genes | ~ 20,000 genes |
RNA sequencing | ~ 40,000 transcripts | ~ 60,000 transcripts | ~ 50,000 transcripts | ~ 55,000 transcripts |
MicroRNA profiles | – | ~ 800 miRNA transcripts | – | ~ 1800 miRNA transcripts |
Microarray gene expression | ~ 18,000 genes | 25,722 genes | 17,737 genes | ~ 22,000 genes |
Somatic mutation calling | ~ 33,000 SNPs⁶ | ~ 500,000 SNPs | ~ 485,000 SNPs | ~ 500,000 SNPs |
Copy number variation | ~ 27,000 variants | ~ 25,000 variants | ~ 50,000 variants | ~ 50,000 variants |
DNA methylation patterns | ~ 27,000 CpGs⁷ | 20,000 CpGs | ~ 35,000 CpGs | ~ 486,000 CpGs |
RPPA⁸ proteomics | 131 proteins | 162 proteins | – | ~ 240 proteins |
MS⁹ proteomics | – | 10,350 proteins | – | ~ 16,000 proteins¹⁰ |
Drug response data | 28 compounds | > 100,000 compounds | 265 compounds | Survival data for clinical treatments |
Boldface entries give the total sizes of the raw and processed datasets; they do not indicate statistical significance.
¹ NCI-DREAM7, DREAM7 Challenge (http://dreamchallenges.org/), organized together with the National Cancer Institute (NCI; Costello et al. 2014)
² NCI-60, The National Cancer Institute drug screening panel (Shoemaker 2006)
³ GDSC1000, Genomics of Drug Sensitivity in Cancer project (Yang et al. 2012)
⁴ TCGA, The Cancer Genome Atlas (http://cancergenome.nih.gov/; Weinstein et al. 2013)
⁵ TCPA, The Cancer Proteome Atlas (http://tcpaportal.org/tcpa/; Li et al. 2013)
⁶ SNPs, single-nucleotide polymorphisms
⁷ CpGs, CpG sites in DNA, where a cytosine “C” is connected to a guanine “G” by a phosphodiester bond “p”
⁸ RPPA, reverse phase protein array
⁹ MS, mass spectrometry
¹⁰ CPTAC, Clinical Proteomic Tumor Analysis Consortium (https://proteomics.cancer.gov/programs/cptac)
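
As a rough consistency check on the raw-data rows of Table 1, the per-assay sizes can be summed and compared with the reported totals. The sketch below is illustrative only: the dictionary names, the assay abbreviations, and the conversion 1 PB = 1000 TB are assumptions made here, and the table values are themselves approximate.

```python
# Raw-data sizes transcribed from Table 1, expressed in terabytes.
# Assumption: 1 PB is taken as 1000 TB and 1 GB as 0.001 TB.
RAW_TB = {
    "NCI-DREAM7": {"WGS": 8.0, "WES": 4.5, "RNA-seq": 0.055},
    "NCI-60": {"WGS": 9.0, "WES": 5.0, "RNA-seq": 0.060, "miRNA": 0.005},
    "GDSC1000": {"WGS": 170.0, "WES": 90.0, "RNA-seq": 1.0},
    "TCGA/TCPA": {"WGS": 1600.0, "WES": 800.0, "RNA-seq": 11.0, "miRNA": 0.5},
}

# Totals reported in the "Total size of NGS data types (raw datasets)" row.
REPORTED_TOTAL_TB = {
    "NCI-DREAM7": 13.0,
    "NCI-60": 15.0,
    "GDSC1000": 260.0,
    "TCGA/TCPA": 2500.0,
}


def component_sum(dataset: str) -> float:
    """Sum the per-assay raw sizes (in TB) listed for one dataset."""
    return sum(RAW_TB[dataset].values())


def relative_gap(dataset: str) -> float:
    """Relative difference between the summed components and the reported total."""
    reported = REPORTED_TOTAL_TB[dataset]
    return abs(component_sum(dataset) - reported) / reported


for name in RAW_TB:
    print(f"{name}: components = {component_sum(name):.3f} TB, "
          f"reported = {REPORTED_TOTAL_TB[name]:.0f} TB, "
          f"gap = {relative_gap(name):.1%}")
```

Under these assumptions the summed components land within a few percent of each reported total, which is what one would expect from rounded, order-of-magnitude figures.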