2018 Mar 28;8:5307. doi: 10.1038/s41598-018-23563-4

Table 1.

Overview of possible contamination types, their consequences and suitable filtering options. PC-AF: pool-complement allelic fraction.

Columns: Contamination type; Cause (the type of co-multiplexed samples); Possible somatic variant calling artefacts; Prevalence of given contamination type in affected datasets; Suitable post-sequencing filtering options

a) Contaminant germline variants in a tumour sample
Cause (the type of co-multiplexed samples): Any samples from other individuals
Possible somatic variant calling artefacts: False positive somatic variants in the form of germline variation from other individuals
Prevalence of given contamination type in affected datasets: The most likely contamination type to occur; Contamination targets are expected to be more affected in copy number loss regions*
Suitable post-sequencing filtering options: A variant filter based on an appropriate germline variant database or a relevant panel of normal samples; A filter based on PC-AF values (if a more discriminative solution is necessary)

b) Contaminant somatic variants in a tumour sample
Cause (the type of co-multiplexed samples): Other tumour samples
Possible somatic variant calling artefacts: False positive “recurrent” somatic variants in the form of somatic variation from other tumour samples – whether from other individual(s) or the same individual
Prevalence of given contamination type in affected datasets: Expected to be relevant in tumour sample pools enriched** for specific somatic variants; Contamination targets are expected to be more affected in copy number loss regions*
Suitable post-sequencing filtering options: A filter based on PC-AF values (non-discriminative filtering might lead to false negatives of high importance)

c) Contaminant germline variants in a control sample
Cause (the type of co-multiplexed samples): Any samples from other individuals
Possible somatic variant calling artefacts: False negatives/missed somatic variant calls – only concerning somatic variants that also occur as germline variants
Prevalence of given contamination type in affected datasets: Dependent on the occurrence of important variants as both germline and somatic in a given project’s setting
Suitable post-sequencing filtering options: Review of calls not classified as somatic, adjustment of the variant caller parameters

d) Contaminant somatic variants in a control sample
Cause (the type of co-multiplexed samples): Any tumour samples
Possible somatic variant calling artefacts: False negatives/missed somatic variant calls – concerning all somatic variants
Prevalence of given contamination type in affected datasets: Elevated relevancy when matched samples are co-multiplexed; Prevalence dependent on the enrichment** of potential contaminant variants in a given sample pool; Consequences dependent on variant caller’s tendency to reject a somatic variant candidate due to evidence of its presence in the matched control
Suitable post-sequencing filtering options: Review of calls not classified as somatic, adjustment of the variant caller parameters

*Copy number loss regions of high-purity tumour samples will be especially affected.

**The enrichment will increase with a given variant’s recurrence, as well as with the purity of the tumour samples that carry the variant.
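
The PC-AF-based filter listed for contamination types a) and b) could be sketched roughly as follows. This is only an illustration under stated assumptions: the table expands PC-AF as "pool-complement allelic fraction" without giving a formula, so the sketch assumes it is the variant-allele fraction computed over all co-multiplexed samples other than the one being examined. The `SiteCounts` structure, the function names, the sample names and both thresholds are hypothetical and would need tuning for a real project.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Read counts for one sample at one variant site (hypothetical structure)."""
    alt: int    # reads supporting the variant allele
    total: int  # all reads covering the site


def pc_af(pool: dict[str, SiteCounts], sample: str) -> float:
    """Assumed PC-AF: variant-allele fraction over every pool sample except `sample`."""
    alt = sum(c.alt for name, c in pool.items() if name != sample)
    total = sum(c.total for name, c in pool.items() if name != sample)
    return alt / total if total else 0.0


def is_suspect(pool: dict[str, SiteCounts], sample: str,
               min_pc_af: float = 0.05, max_sample_af: float = 0.02) -> bool:
    """Flag a call that is weak in `sample` but well supported in the rest of the pool.

    Both thresholds are illustrative placeholders, not values from the source.
    """
    own = pool[sample]
    sample_af = own.alt / own.total if own.total else 0.0
    return sample_af <= max_sample_af and pc_af(pool, sample) >= min_pc_af


# Made-up read counts for one site in a pool of three co-multiplexed samples.
pool = {
    "tumour_A": SiteCounts(alt=2, total=200),
    "tumour_B": SiteCounts(alt=90, total=180),
    "normal_B": SiteCounts(alt=0, total=150),
}
suspect_a = is_suspect(pool, "tumour_A")
suspect_b = is_suspect(pool, "tumour_B")
```

Requiring a low allelic fraction in the sample itself, in addition to a high PC-AF, is one way to keep the filter discriminative, in line with the table's warning that non-discriminative filtering of type b) contamination might discard recurrent somatic variants of high importance.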