Author manuscript; available in PMC: 2021 Feb 7.
Published in final edited form as: Nat Methods. 2021 Jan 4;18(2):144–155. doi: 10.1038/s41592-020-01013-2

Table 1: Checklist of recommended best practices

Recommendation: High-depth sequencing (>60x) of biopsy samples with the highest pathological purity possible, ideally complemented with deep targeted sequencing of SNVs
Rationale: Increasing read depth improves the limit of detection for minor subclones and the resolution of CCF estimation [22,23,54]. High purity ensures that most reads come from tumor cells, increasing NRPCC.

Recommendation: Ensure the number of SNVs called is sufficient for subclonal reconstruction
Rationale: A low coding substitution rate can lead to insufficient data for accurate subclonal reconstruction in exome-based studies [23,90].

Recommendation: Sequence multiple regions from a single tumor
Rationale: Single-region bulk sequencing systematically underestimates the number of subclones, and locally dominant subclones can be mistaken for clonal populations [13,21,91]. Multi-region sequencing also provides better subclone resolution and allows phylogeny inference.

Recommendation: Minimize germline variant contamination:
  • Sequence matched normal tissue, ideally from an unrelated tissue source (e.g. blood)
  • Remove known germline variants
  • Combine multiple SNV detection algorithms
  • Remove SNVs in genomic regions where read mapping is difficult
  • Use a panel of normal samples
Rationale: Germline contamination can lead to false-positive SNVs with high VAF that can be mislabeled as a cluster. Using a consensus call set can improve the sensitivity and specificity of variant detection [23].

Recommendation: Call somatic variants with a highly sensitive algorithm
Rationale: Increased algorithm sensitivity facilitates detection of low-VAF SNVs, improves clustering accuracy and better captures the level of tumor heterogeneity. Highly sensitive detection algorithms can also improve the chances of detecting clinically relevant minor subclones [22,23,92]. However, users should be cautious of false-positive SNVs, which are often seen at low VAF and may form a spurious low-VAF cluster.

Recommendation: For CNA reconstructions, review solutions for incorrect CP and WGD estimation and adjust accordingly. Optimally, perform experimental ploidy validation.
Rationale: CNA reconstructions must decide between multiple equally likely ploidy and purity solutions. Ideally, inform CNA calling with experimental ploidy estimates using FACS, image cytometry, or FISH.

Recommendation: Carry out orthogonal copy number estimation
Rationale: Multiple copy number solutions are usually possible; estimating copy number from WES data can be especially challenging [90].

Recommendation: Perform CNA- and SNV-based reconstruction using a method that incorporates a Binomial or Beta-binomial noise model
Rationale: Binomial and Beta-binomial noise models better capture the noise in read sampling for a given read depth and CP, improving SNV clustering accuracy.

Recommendation: If possible, use phasing or single-cell sequencing data to support inferred mutation ordering. Ideally, perform multi-sample sequencing.
Rationale: An unambiguous phylogeny cannot always be derived from the crossing and sum rules alone. Phasing or single-cell sequencing information can support or refute a proposed phylogeny [14,16,73]. Our preferred setup is multi-sample sequencing; intelligent designs combine high- and low-depth sequencing to minimize cost [93].

Recommendation: Validate subclonal SNVs of interest using high-depth targeted sequencing
Rationale: SNVs detected in one sample may occur at very low VAFs in another, and high-depth targeted sequencing can detect these rare subclonal populations [17,23,92].
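
The roles of read depth, purity and the noise model in the checklist can be made concrete with a short sketch. The table does not define NRPCC or any noise model explicitly, so the formulas below are assumptions: NRPCC is taken as depth × purity / (purity × tumor ploidy + (1 − purity) × normal ploidy), the expected VAF of an SNV follows from its CCF, multiplicity and local copy number, and the Beta-binomial is parameterized by a mean and a hypothetical `precision` value that controls overdispersion. All function names are illustrative and do not come from any particular reconstruction tool.

```python
import math


def nrpcc(depth, purity, tumor_ploidy, normal_ploidy=2.0):
    """Average number of reads per chromosome copy (assumed definition)."""
    return depth * purity / (purity * tumor_ploidy + (1.0 - purity) * normal_ploidy)


def expected_vaf(ccf, purity, multiplicity=1, tumor_cn=2, normal_cn=2):
    """Expected VAF of an SNV carried by a fraction `ccf` of cancer cells.

    Assumes the SNV sits on `multiplicity` copies in each mutated cell and
    that the locus has `tumor_cn` copies in tumor and `normal_cn` in normal cells.
    """
    mutated_copies = purity * ccf * multiplicity
    total_copies = purity * tumor_cn + (1.0 - purity) * normal_cn
    return mutated_copies / total_copies


def _log_choose(n, k):
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def _log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_binomial_pmf(k, n, p):
    """Log-probability of k variant reads out of n, for 0 < p < 1."""
    return _log_choose(n, k) + k * math.log(p) + (n - k) * math.log1p(-p)


def log_beta_binomial_pmf(k, n, mean, precision):
    """Log-probability under a Beta-binomial with the given mean.

    Smaller `precision` means more overdispersion; as `precision` grows,
    the distribution approaches the Binomial with p = mean.
    """
    a = mean * precision
    b = (1.0 - mean) * precision
    return _log_choose(n, k) + _log_beta(k + a, n - k + b) - _log_beta(a, b)


if __name__ == "__main__":
    depth, purity = 60, 0.8
    print("NRPCC:", nrpcc(depth, purity, tumor_ploidy=2.0))
    for ccf in (1.0, 0.25):
        vaf = expected_vaf(ccf, purity)
        k = round(vaf * depth)  # a plausible variant read count at this depth
        print(
            f"CCF={ccf}: expected VAF={vaf:.3f}, "
            f"log P_binom(k={k})={log_binomial_pmf(k, depth, vaf):.3f}, "
            f"log P_betabinom(k={k})={log_beta_binomial_pmf(k, depth, vaf, 50.0):.3f}"
        )
```

A clustering method would evaluate likelihoods of this kind for every SNV against each candidate cluster. The Beta-binomial variant assigns more probability to read counts far from the expected VAF, which is one way to absorb sequencing noise that exceeds pure sampling variance.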