Table 1. Override parameters selected for customizing hotspot calling of BRCA1 and BRCA2 variants in this study.
Parameter | Description | Applied value | Recommended value | Allowed values |
---|---|---|---|---|
Minimum allele frequency | Minimum observed allele frequency required for a non-reference variant call. Lowering this value improves sensitivity and decreases specificity (and increases the ratio of false positives to true positives). | For SNPs/hotspots: 0.1; for indels: 0.1 | For SNPs: 0.01-0.2; for indels: 0.05-0.2 | Floats 0.0-1.0 |
Minimum quality | Do not call variants if the phred-scaled call quality is below this value. Lowering this value improves sensitivity and decreases specificity. | ≥ 10 | ≥ 10 | Integers ≥ 0 |
Minimum coverage | Do not call variants if the total coverage on both strands is below this value. For germline workflows, lowering coverage improves sensitivity. Lowering this value is dangerous for homopolymer indels because it drastically decreases specificity. | For SNPs/hotspots: 6; for indels: 15 | For SNPs/hotspots: 5-20; for indels: 15-30 | Integers ≥ 0 |
Minimum coverage on either strand | Do not call variants if coverage on either strand is below this value. For indel calling, reducing this value improves sensitivity but at a high cost of specificity. | For SNPs/hotspots: ≥ 0/3; for indels: ≥ 5 | ≥ 3 | Integers ≥ 0 |
Maximum strand bias | Do not call variants if the proportion of variant alleles from one strand is higher than this ratio. | For SNPs/hotspots: 0.95; for indels: 0.85 | 0.95 | Floats 0.5-1.0 |
Minimum relative read quality | Do not call variants if Relative Read Quality is below this threshold. A phred-scaled minimum average evidence per read or no-call. | ≥ 6.5 | ≥ 6.5 | Floats ≥ 0 |
Maximum common signal shift | Do not call variants if Common Signal Shift exceeds this threshold. If the predictions are distorted to fit the data more than this distance (relative to the size of the variant), filter this candidate position out. | 0.7 | 0.3 = 30% of variant change size | Floats ≥ 0 |
Maximum reference/variant signal shift | Do not call insertions if Reference or Variant Signal Shift exceeds this threshold. Filter observed clusters that deviate from predictions by more than this amount (relative to the size of the variant). | For ins: 0.4; for del: 0.2 | 0.2 = 20% of variant change size | Floats ≥ 0 |
hp_max_length | Maximum homopolymer length for calling indels. | 10 | 8 | Integers ≥ 1 |
downsample_to_coverage | Reduce coverage in over-sampled locations to this value. | 10,000 | For germline: 400; for somatic: 2,000 | Integers ≥ 1 |
outlier_probability | Prior probability that a read comes from some other distribution. Lower numbers reduce the influence of outlier observations. Higher numbers increase the influence of outliers. Empirical adjustment indicates that increasing the influence of outliers leads to more false positives and slightly more true positives, but at a poor tradeoff. | 0.01 | 0.005-0.01 | Floats 0.0-1.0 |
prediction_precision | Number of pseudo-data-points suggesting our predictions match the measurements without bias. | 1.0 | 1.0 | Floats ≥ 0 |
heavy_tailed | How heavy the T-distribution tails are to allow for unusual spread in the data. This value represents the prior probability that a given read comes from some distribution other than the possibilities being evaluated. Lower values mean that more reads are forced to be assigned to one of the tested alleles, even at very poor data fit (fewer reads are thrown out, with the likely tradeoff of more false positive calls). Higher values mean that reads that are merely slightly noisy | 3.0 | NA | NA |
Abbreviations: SNP, single nucleotide polymorphism; indel, insertion and deletion; NA, not available.
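
To illustrate how the count-based thresholds in Table 1 act together, the following is a minimal Python sketch that applies the applied indel values to a single candidate. It is not the variant caller's implementation: the class names, field names, and the `failed_filters` function are hypothetical, the candidate is reduced to per-strand read counts, and the strand-bias check uses the simple proportion given in the table's description rather than the caller's internal computation. The signal-shift and read-quality filters are omitted because they depend on flow-signal data not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndelThresholds:
    """Applied indel values from Table 1 (field names are illustrative)."""
    min_allele_freq: float = 0.1
    min_quality: float = 10
    min_coverage: int = 15
    min_cov_each_strand: int = 5
    max_strand_bias: float = 0.85
    hp_max_length: int = 10


@dataclass(frozen=True)
class IndelCandidate:
    """Hypothetical summary of the read evidence at one candidate indel."""
    fwd_ref: int    # reference-supporting reads, forward strand
    rev_ref: int    # reference-supporting reads, reverse strand
    fwd_alt: int    # variant-supporting reads, forward strand
    rev_alt: int    # variant-supporting reads, reverse strand
    quality: float  # phred-scaled call quality
    hp_length: int  # length of the homopolymer containing the indel


def failed_filters(c, t=IndelThresholds()):
    """Return the names of the thresholds the candidate fails (empty = pass)."""
    fwd_cov = c.fwd_ref + c.fwd_alt
    rev_cov = c.rev_ref + c.rev_alt
    total = fwd_cov + rev_cov
    alt = c.fwd_alt + c.rev_alt

    failed = []
    if total < t.min_coverage:
        failed.append("min_coverage")
    if min(fwd_cov, rev_cov) < t.min_cov_each_strand:
        failed.append("min_cov_each_strand")
    if c.quality < t.min_quality:
        failed.append("min_quality")
    if total == 0 or alt / total < t.min_allele_freq:
        failed.append("min_allele_freq")
    # Simplified strand bias: share of variant reads on the dominant strand.
    if alt > 0 and max(c.fwd_alt, c.rev_alt) / alt > t.max_strand_bias:
        failed.append("max_strand_bias")
    if c.hp_length > t.hp_max_length:
        failed.append("hp_max_length")
    return failed


well_supported = IndelCandidate(40, 40, 10, 10, quality=30, hp_length=4)
poorly_supported = IndelCandidate(5, 2, 4, 0, quality=8, hp_length=11)
print(failed_filters(well_supported))
print(failed_filters(poorly_supported))
```

Returning the list of failed thresholds, rather than a single pass/fail flag, makes it easier to see which parameter is responsible when a candidate is rejected, which is useful when tuning the applied values against the recommended ranges.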