Table 1.
Data availability: summary representation of data available to account holders in the Cloud Resources.
Data category or node (hosting cloud) | Dataset | Broad FireCloud | ISB-CGC | SB-CGC |
---|---|---|---|---|
Reference genomes and files | e.g., GTEx, 1000 Genomes | ✓ | ✓ | ✓ |
Derived data | e.g., gene expression matrices | ✓ | ✓ | |
Connection to non-cancer data | e.g., AnVIL | ✓ | ✓ | ✓ |
GDC^a,b (AWS and GCP) | TCGA (The Cancer Genome Atlas) | ✓ | ✓ | ✓ |
GDC^a,b (AWS and GCP) | TARGET (Therapeutically Applicable Research to Generate Effective Treatments) | ✓ | ✓ | ✓ |
GDC^a,b (AWS and GCP) | CCLE (Cancer Cell Line Encyclopedia) | ✓ | ✓ | ✓ |
PDC^a,b (AWS) | CPTAC (Clinical Proteomic Tumor Analysis Consortium) | ✓ | ✓ | |
PDC^a,b (AWS) | APOLLO (Applied Proteogenomics Organizational Learning and Outcomes) | ✓ | ✓ | |
PDC^a,b (AWS) | ICPC (International Cancer Proteogenomic Consortium) | ✓ | ✓ | |
PDC^a,b (AWS) | CBTN (Children's Brain Tumor Network) | ✓ | ✓ | |
ICDC^a (AWS) | CMPC (The Comparative Molecular Characterization Program) | ✓ | | |
ICDC^a (AWS) | COP (Comparative Oncology Program) | ✓ | | |
ICDC^a (AWS) | PCCR (The Purdue University Center for Cancer Research) | ✓ | | |
CDS^a,b (AWS) | PPTC (Pediatric Preclinical Testing Consortium) | ✓ | | |
CDS^a,b (AWS) | HTAN (Human Tumor Atlas Network) | ✓ | ✓ | |
CDS^a,b (AWS) | CCDI (Childhood Cancer Data Initiative) | ✓ | | |
IDC (GCP) | TCGA (The Cancer Genome Atlas) | ✓ | | |
Note: The cloud(s) hosting each data node are also listed. Refer to Supplementary Table S3 for a complete list of acronyms and definitions. The datasets shown are those most commonly requested and used by cancer researchers.
^a More data are available than the datasets highlighted in this table. Refer to the individual websites for a full list of available datasets.
^b Data portals include both controlled and open-access data. To access controlled data, researchers must obtain the appropriate dbGaP permissions. CRDC provides a list of key datasets on its website.