2018 Nov 21;18:107. doi: 10.1186/s12911-018-0665-z

Table 1.

Overview of data types and value ranges for data elements covered by the core data model for minimum variant level data

| Class | Attribute | Value range | Example |
|---|---|---|---|
| Allele descriptive | | | |
| Gene | Gene ID | Internal ID | G0002V5Z |
| | Gene name | HGNC gene symbols | KRAS |
| | Chromosome | 1..22, X, Y | 12 |
| | Entrez gene ID | Entrez gene IDs | 3845 |
| | Ensembl gene ID | Ensembl gene IDs | ENSG00000133703 |
| | RefSeq gene ID | RefSeq gene IDs | NG_007524 |
| Gene transcript | Gene ID | Internal ID | G0002V5Z |
| | Gene transcript ID | Internal ID | T0006OOW |
| | RefSeq transcript ID | RefSeq transcript IDs | NM_033360 |
| | RefSeq protein ID | RefSeq protein IDs | NP_203524 |
| | Ensembl transcript ID | Ensembl transcript IDs | ENST00000256078 |
| | UniProt ID | UniProt IDs | P01116 |
| Gene position | Gene ID | Internal ID | G0002V5Z |
| | Genome version | Genome build IDs | GRCh37.p13 |
| | DNA position | Genomic coordinate | 12p12.1 |
| Gene pathway | Gene ID | Internal ID | G0002V5Z |
| | Pathway ID | Internal ID | P003V724 |
| Gene pathway | Pathway ID | Internal ID | P003V724 |
| | Common name | | Activation of RAS in B cells |
| | KEGG ID | KEGG IDs | map04014 |
| | Reactome ID | Reactome IDs | R-HSA-1169092 |
| | PathwayCommons ID | PathwayCommons IDs | R-HSA-1169092 |
| Allele interpretive | | | |
| Variant | Variant ID | Internal ID | V0000LBB |
| | Variant type | “Single nucleotide variant (SNV)”, “multinucleotide variant (MNV)”, “insertion (INS)”, “deletion (DEL)” | SNV |
| Variant position | Variant ID | Internal ID | V0000LBB |
| | Genome version | Genome build IDs | GRCh37.p13 |
| | DNA sub. & position | HGVS genomic coordinate | NC_000012.11:g.25398284C>G |
| Gene variant | Gene ID | Internal ID | G0002V5Z |
| | Variant ID | Internal ID | V0000LBB |
| | Variant consequence | “Non-sense”, “missense”, “silent”, “frame shift”, “in-frame”, “3UTR”, “5UTR”, “splice”, “splice-region”, “intronic”, “upstream”, “downstream” | missense |
| Gene variant transcript | Gene ID | Internal ID | G0002V5Z |
| | Variant ID | Internal ID | V0000LBB |
| | Gene transcript ID | Internal ID | T0006OOW |
| | Protein sub. & position | HGVS formatted variants | NM_033360.3(KRAS):c.35G>C (p.Gly12Ala) |
| | Protein domain | Descriptive name of protein domain | Small GTP-binding protein domain |
| | Variant consequence | “Expression”, “amplification”, “deletion”, “fusion”, “loss of function”, “missense” | missense |
| | Risk score | FATHMM, SIFT, PolyPhen | 0.98468, 0, 0.97 |
| Somatic interpretive | | | |
| Cancer type | Cancer type ID | Internal ID | C000WQFL |
| | Cancer type name | NCI thesaurus or Oncotree IDs | Colorectal cancer |
| | UMLS ID | UMLS concept IDs | C1527249 |
| | HPO ID | HPO concept IDs | HP:0003003 |
| Cancer variant | Cancer variant ID | Internal ID | CV00XBQW |
| | Variant ID | Internal ID | V0000LBB |
| | Cancer type ID | Internal ID | C000WQFL |
| | Biomarker class | “Diagnostic”, “prognostic”, “predictive”, “predisposing”, “pharmacogenomic” | predictive |
| | Clinical relevance level | “Tier 1”, “Tier 2”, “Tier 3” [8] | Tier 2 |
| Cancer variant sample | Cancer variant ID | Internal ID | CV00XBQW |
| | Sample ID | Internal ID | SXBQW0A7 |
| | Somatic classification | “Confirmed somatic”, “confirmed germline”, “unknown” | somatic |
| | Allele frequency | Allele frequency in global population | 0.00001647 |
| Sample specimen | Sample ID | Internal ID | SXBQW0A7 |
| | Tumor purity | Ratio | 0.763 |
| | TNM status | TNM values | T2N1M1 |
| | Primary / relapse | Primary or relapse | primary |
| Cancer variant drug | Cancer variant ID | Internal ID | CV00XBQW |
| | Drug ID | Internal ID | D00000Z9 |
| Cancer variant drug effect | Cancer variant ID | Internal ID | CV00XBQW |
| | Drug ID | Internal ID | D00000Z9 |
| | Effect | “Resistant”, “responsive”, “non-responsive”, “sensitive”, “reduced sensitivity”, “other” | Resistance or non-response |
| | Level of evidence | see Table 6 | C |
| | Sublevel of evidence | see Table 6 | 3A |
| Drug | Drug ID | Internal ID | D00000Z9 |
| | Substance name | FDA approved or DrugBank substance names | Panitumumab |
| | DrugBank ID | DrugBank IDs | DB01269 |
| | PharmGKB ID | PharmGKB IDs | PA162373091 |
| | FDA ID | FDA IDs | 125147 |
| Drug mechanism | Drug ID | Internal ID | D00000Z9 |
| | Molecular mechanism | Description | Binds to the epidermal growth factor receptor (EGFR) on both normal and tumor cells[...] |

Example data for evidence recording is given in Additional file 3.
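
To illustrate how the classes in Table 1 link through their internal IDs, the following is a minimal Python sketch and not the authors' implementation. The class and field names (`Gene`, `Variant`, `CancerVariant`, `gene_id`, and so on) are hypothetical snake_case renderings of the table's attributes. The controlled vocabularies are copied from the "Value range" column, and the abbreviations in `VARIANT_TYPES` are assumed to be the stored form because the table's example value is "SNV". Only three of the classes are modeled.

```python
from dataclasses import dataclass

# Controlled vocabularies taken from the "Value range" column of Table 1.
CHROMOSOMES = {str(n) for n in range(1, 23)} | {"X", "Y"}
VARIANT_TYPES = {"SNV", "MNV", "INS", "DEL"}
BIOMARKER_CLASSES = {
    "diagnostic", "prognostic", "predictive", "predisposing", "pharmacogenomic",
}
CLINICAL_RELEVANCE_LEVELS = {"Tier 1", "Tier 2", "Tier 3"}


def _check(value, allowed, attribute):
    """Raise ValueError if value is outside the attribute's value range."""
    if value not in allowed:
        raise ValueError(f"{attribute}: {value!r} is not in {sorted(allowed)}")


@dataclass(frozen=True)
class Gene:
    gene_id: str      # internal ID
    gene_name: str    # HGNC gene symbol
    chromosome: str   # 1..22, X, Y

    def __post_init__(self):
        _check(self.chromosome, CHROMOSOMES, "chromosome")


@dataclass(frozen=True)
class Variant:
    variant_id: str    # internal ID
    variant_type: str  # SNV, MNV, INS or DEL

    def __post_init__(self):
        _check(self.variant_type, VARIANT_TYPES, "variant_type")


@dataclass(frozen=True)
class CancerVariant:
    cancer_variant_id: str  # internal ID
    variant_id: str         # refers to Variant.variant_id
    cancer_type_id: str     # refers to a cancer type's internal ID
    biomarker_class: str
    clinical_relevance_level: str

    def __post_init__(self):
        _check(self.biomarker_class, BIOMARKER_CLASSES, "biomarker_class")
        _check(self.clinical_relevance_level, CLINICAL_RELEVANCE_LEVELS,
               "clinical_relevance_level")


# Records built from the "Example" column of Table 1.
gene = Gene(gene_id="G0002V5Z", gene_name="KRAS", chromosome="12")
variant = Variant(variant_id="V0000LBB", variant_type="SNV")
cancer_variant = CancerVariant(
    cancer_variant_id="CV00XBQW",
    variant_id=variant.variant_id,
    cancer_type_id="C000WQFL",
    biomarker_class="predictive",
    clinical_relevance_level="Tier 2",
)
```

Checking the value range at construction time means an out-of-range entry such as `Variant("V0000LBB", "SNP")` raises a `ValueError` instead of being stored silently. In a relational implementation, the same constraints would more likely be expressed as foreign keys and check constraints.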