Author manuscript; available in PMC: 2021 Jul 14.
Published in final edited form as: Cancer Epidemiol Biomarkers Prev. 2020 May 28;29(8):1519–1534. doi: 10.1158/1055-9965.EPI-19-1551

Table 1.

Coding fields used for reviewing papers.

Topic: Coding fields
Publication: PubMed ID; Journal; Year; Author; Title; Abstract
Study general: Goal; Study design; Source of individuals; Cancer type; Ethnicity; Sequencing center; Data repository
Number of individuals sequenced in discovery: Number of cases, controls, families, and cases per family sequenced in discovery phase
Sequencing technique: Sample type; Exome and/or genome; Capture kit; Sequencer; Coverage/depth
Processing of raw data: Aligner; Reference genome; Variant caller and calling QC; Annotation software and sources
Technical validation: Yes/No; Validation technology; Number of cases, controls, families, and cases per family sequenced in validation phase; Variants/genes validated
Independent replication: Yes/No; Replication technology and analysis; Number of cases, controls, families, and cases per family sequenced in replication phase; Variants/genes replicated
Functional validation: In silico functional analyses; Experimental functional study
Variants and genes data analysis: Candidate analysis approach; Filtering strategy overall; Analytical methods
Variants and genes identification: Yes/No; Identified variants and genes; Number of cases, controls, and families carrying the identified variants and genes
Authors' comments and conclusions: Challenges; Suggested next steps; Conclusions
Derived filtering criteria/categories shown in Figure 1:
f_1: variant passing quality control metrics
f_2: heterozygous variant
f_3: homozygous variant
f_4: variant located in a coding region
f_5: nonsynonymous or splice variant
f_6: variant damaging according to in silico algorithms
f_7: truncating variant
f_8: variant altering protein properties according to molecular modeling
f_9: not hypervariable gene
f_10: variant absent from Minor Allele Frequency (MAF) databases
f_11: variant rare in MAF databases
f_12: variant segregating with disease status in the family
f_13: variant present in multiple families or independent cases
f_14: variant enriched in cases compared to controls
f_15: gene mutated in multiple families or independent cases
f_16: gene enriched in cases compared to controls
f_17: variant present in disease related databases
f_18: gene known to be linked to disease
f_19: genetic region known to be linked to disease
f_20: biological or molecular pathway known to be linked to disease
f_21: pathway analysis indicating a gene-disease link
f_22: variant confirmed through technical validation
f_23: variant or gene replicating in independent cases
f_24: variant loss of heterozygosity (LOH) observed in tumor
f_25: relevant somatic mutations observed in tumor
f_26: gene-disease link supported by functional experiments
f_27: variant splicing supported by experiment
f_28: variant-disease link supported by functional experiment
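
As an illustration of how a reviewed study's filtering strategy might be recorded with these categories, the following minimal Python sketch represents a handful of the criteria (f_1, f_5, f_7, f_10, f_11) as predicates and applies a chosen combination to variant records. It is not taken from the paper: the `Variant` fields, the consequence labels, and the 0.01 MAF cutoff are all hypothetical choices made for the example, and f_11 is assumed here to mean "present in a MAF database but below the cutoff" so that it stays distinct from f_10.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Variant:
    """Hypothetical variant record; field names are illustrative only."""
    gene: str
    passed_qc: bool
    consequence: str          # e.g. "missense", "synonymous", "stop_gained"
    maf: Optional[float]      # None = absent from the MAF databases consulted


# Assumed consequence labels; real annotation tools use their own vocabularies.
NONSYNONYMOUS_OR_SPLICE = {"missense", "stop_gained", "frameshift", "splice"}
TRUNCATING = {"stop_gained", "frameshift"}

# Assumed cutoff for "rare"; Table 1 does not specify a threshold.
MAF_THRESHOLD = 0.01

# A subset of the Table 1 criteria, keyed by their f_n labels.
FILTERS: Dict[str, Callable[[Variant], bool]] = {
    "f_1": lambda v: v.passed_qc,
    "f_5": lambda v: v.consequence in NONSYNONYMOUS_OR_SPLICE,
    "f_7": lambda v: v.consequence in TRUNCATING,
    "f_10": lambda v: v.maf is None,
    "f_11": lambda v: v.maf is not None and v.maf < MAF_THRESHOLD,
}


def apply_filters(variants: List[Variant], filter_ids: List[str]) -> List[Variant]:
    """Keep the variants that satisfy every listed criterion."""
    return [v for v in variants if all(FILTERS[f](v) for f in filter_ids)]


# Made-up example data with placeholder gene names.
variants = [
    Variant("GENE_A", True, "missense", None),
    Variant("GENE_B", True, "synonymous", 0.0005),
    Variant("GENE_C", False, "stop_gained", None),
    Variant("GENE_D", True, "frameshift", 0.2),
]

kept = apply_filters(variants, ["f_1", "f_5", "f_10"])
print([v.gene for v in kept])  # ['GENE_A']
```

Keying the predicates by their f_n labels lets a study's strategy be stored as a plain list of labels, which makes strategies from different papers directly comparable.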