Sample and case selection |
|
Matched normal samples |
- Sequencing matched normal tissue is essential for removing germline variants and identifying mapping artifacts or sequencing errors.
- For hematologic cancers, skin normals should be collected at remission to reduce tumor contamination of the normal.
- For solid tumors, use blood instead of adjacent normals to avoid tumor infiltration.
- In the absence of a matched normal, use as many unmatched normal samples as possible (e.g., a pool of healthy individuals).
|
Library construction |
- Improve coverage, reduce amplification-related errors, and improve SV detection by constructing multiple independent libraries per sample. This approach resulted in PCR error rates below those detectable from the assays that were performed (<0.23–0.35%).
- A large amount (>1 μg) of starting input DNA allows for multiple libraries, decreases duplication rates, and enables adequate sampling of rare subclonal populations.
|
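To make the point about sampling rare subclones concrete, the following is a rough back-of-the-envelope sketch, not a measurement from the study. It assumes roughly 6.6 pg of DNA per diploid human genome (a commonly used approximation); the function names and the 1% cell fraction are hypothetical and chosen only for illustration.

```python
# Assumption: ~6.6 pg of DNA per diploid human genome (approximate value).
PG_PER_DIPLOID_GENOME = 6.6


def diploid_genome_copies(input_ng, pg_per_genome=PG_PER_DIPLOID_GENOME):
    """Approximate number of diploid genome equivalents in `input_ng` of DNA."""
    return input_ng * 1000.0 / pg_per_genome  # 1 ng = 1,000 pg


def subclone_genome_copies(input_ng, cell_fraction):
    """Expected genome equivalents contributed by a subclone that makes up
    `cell_fraction` of the cells in the sample."""
    return diploid_genome_copies(input_ng) * cell_fraction


# Compare a generous input (1 ug = 1,000 ng) with a scarce one (10 ng)
# for a hypothetical subclone present in 1% of cells.
for input_ng in (1000.0, 10.0):
    total = diploid_genome_copies(input_ng)
    rare = subclone_genome_copies(input_ng, 0.01)
    print(f"{input_ng:>6.0f} ng: ~{total:,.0f} genomes, ~{rare:,.0f} from a 1% subclone")
```

Under these assumptions, the number of template molecules from a rare subclone scales linearly with input, which is one reason low-input libraries can under-sample such populations before any sequencing takes place.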
Sequencing platform |
- Choose a platform that allows for cost-effective generation of high-depth data.
- Orthogonal sequencing methods have value for confirmation of low-frequency variants.
- Single-cell sequencing can be useful for resolving tumor phylogeny.
|
Sequencing depth |
- Greater depth is needed in the case of impure tumors, tumor contamination of the normal sample, aneuploidy, and clonal heterogeneity.
- Expect non-uniform coverage across the genome. Total coverage levels may need to be increased to ensure adequate depth in certain regions (e.g., GC-rich promoter regions).
- 30× WGS was insufficient for inferring clonal architecture or identifying variants with <15% VAF, even in a tumor with >90% purity.
- 50× WGS was insufficient to detect variants at <10% VAF, including many important for relapse.
- An increase in coverage from ~30× to ~300× (coupled with a less-contaminated normal) resulted in the identification of 4 additional subclones and over 11× as many variants in this case.
|
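The relationship between depth and the detectability of low-VAF variants can be illustrated with a simple binomial model. This is a minimal sketch under stated assumptions, not the method used in the study: it treats every site as having exactly the nominal depth, ignores sequencing error and mapping bias, and uses a hypothetical threshold of 3 variant-supporting reads (`min_alt_reads`). Real callers typically demand more evidence, so practical sensitivity will be lower than these idealized figures.

```python
from math import comb


def detection_probability(depth, vaf, min_alt_reads=3):
    """Probability of observing at least `min_alt_reads` variant reads.

    Assumes the variant read count follows Binomial(depth, vaf).
    `min_alt_reads` is an illustrative threshold, not a caller default.
    """
    # P(fewer than min_alt_reads variant reads)
    p_too_few = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_too_few


# Tabulate the idealized detection probability for a few depths and VAFs.
for depth in (30, 50, 300):
    for vaf in (0.05, 0.10, 0.15):
        p = detection_probability(depth, vaf)
        print(f"{depth:>3}x, VAF {vaf:.2f}: P(>=3 variant reads) = {p:.3f}")
```

Even in this optimistic model, the probability of collecting enough supporting reads falls off quickly at low VAF and modest depth, and non-uniform coverage means many sites sit below the nominal depth.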
Whole genome sequencing |
- WGS is essential for detection of CNVs and other SVs.
- Difficult-to-capture coding regions may be better covered in WGS.
- WGS enables detection of non-coding mutations that may be biologically relevant or serve as clonal markers.
|
Targeted sequencing |
- WGS should be accompanied by either commercial exome or custom capture for increased coverage of key cancer genes.
- “Spiking in” oligonucleotide probes allows for more coverage (>1,000×) and improved sensitivity in critical ‘hotspot’ regions (these can be cancer-specific or pan-cancer). We achieved ~5-fold greater coverage across 264 genes recurrently mutated in AML with little exome-wide loss of coverage.
|
Sequence alignment |
- The choice of reference sequence and alignment algorithm impacts variant calling. VAFs calculated from the same data aligned with different algorithms had Spearman correlations that varied from 0.56 to 0.99.
- Local assembly of indels and realignment can produce more accurate VAF estimates, especially for multi-base-pair events.
|
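A comparison of VAFs between aligners, like the one summarized above, can be sketched with a rank correlation over the per-variant VAFs from two alignments of the same reads. The example below is a self-contained illustration in plain Python. The VAF values are made up, and the helper names (`average_ranks`, `pearson`, `spearman`) are hypothetical rather than part of any pipeline described in the source.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up VAFs for the same five variants under two hypothetical aligners.
vafs_aligner_a = [0.48, 0.12, 0.31, 0.05, 0.22]
vafs_aligner_b = [0.45, 0.15, 0.20, 0.07, 0.29]
print(f"Spearman rho = {spearman(vafs_aligner_a, vafs_aligner_b):.2f}")
```

A rank-based measure is a reasonable choice here because it captures whether two alignments order variants consistently by VAF, which matters for clonal clustering, without assuming a linear relationship.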
Variant calling |
- Current SNV callers are not optimized for detecting low-VAF events in high-depth data. Optimization of parameters may help, but new algorithms are probably needed in the long term.
- Using multiple variant callers is a viable strategy for improving performance. Intersections improve PPV, while unions improve sensitivity.
- Match the goals of a project to algorithms that provide the right balance of sensitivity and specificity.
- Indels and SVs are harder to detect than SNVs; expect poorer performance.
- Samples from multiple time points increase confidence and enable detection of key low-VAF variants that are enriched during clonal evolution.
|
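The trade-off between intersecting and uniting call sets can be shown with a toy example. The variant identifiers, the two "callers", and the truth set below are all hypothetical and were constructed only to illustrate the direction of the effect. Real performance depends on the callers and on the validation data.

```python
def ppv_and_sensitivity(calls, truth):
    """Return (PPV, sensitivity) of a call set against a truth set."""
    true_positives = len(calls & truth)
    ppv = true_positives / len(calls) if calls else 0.0
    sensitivity = true_positives / len(truth) if truth else 0.0
    return ppv, sensitivity


# Hypothetical validated variants and two hypothetical callers' outputs.
truth = {"chr1:100A>T", "chr2:200G>C", "chr3:300C>T", "chr4:400T>G"}
caller_a = {"chr1:100A>T", "chr2:200G>C", "chr5:500A>G"}
caller_b = {"chr1:100A>T", "chr3:300C>T", "chr6:600C>A"}

call_sets = {
    "caller A": caller_a,
    "caller B": caller_b,
    "intersection": caller_a & caller_b,  # called by both
    "union": caller_a | caller_b,         # called by either
}

for name, calls in call_sets.items():
    ppv, sensitivity = ppv_and_sensitivity(calls, truth)
    print(f"{name:<12} PPV = {ppv:.2f}  sensitivity = {sensitivity:.2f}")
```

In this toy data the intersection keeps only the calls both callers agree on, which raises PPV at the cost of sensitivity, while the union does the opposite. Which combination suits a project depends on whether false positives or missed variants are more costly.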
Subclonal inference |
- Accurate estimation of tumor VAFs requires high depths to overcome sampling error. Plan for 500–1,000× coverage or more if detailed inference of subclonal populations is important.
- Exomes or targeted assays may not provide enough variants for accurate clonal clustering, especially in cancers with low mutation rates.
- Temporally and/or spatially separated samples aid in subclonal inference and tracking tumor evolution.
|
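The sampling error mentioned above can be approximated with the binomial standard error of a VAF estimate. The sketch below is a simplified illustration that assumes read counts are binomial and uses a normal approximation. The `separable` helper and its non-overlapping-interval criterion are hypothetical heuristics for a single variant pair, not the clustering approach used in the study, which pools information across many variants.

```python
from math import sqrt


def vaf_standard_error(vaf, depth):
    """Binomial standard error of a VAF estimated from `depth` reads."""
    return sqrt(vaf * (1.0 - vaf) / depth)


def separable(vaf_a, vaf_b, depth, z=1.96):
    """Crude check: do the approximate 95% intervals of two VAFs not overlap?

    Uses a normal approximation; `z` and the criterion are illustrative.
    """
    margin = z * (vaf_standard_error(vaf_a, depth) + vaf_standard_error(vaf_b, depth))
    return abs(vaf_a - vaf_b) > margin


# Two hypothetical subclones with true VAFs of 20% and 30%.
for depth in (30, 300, 1000):
    se = vaf_standard_error(0.20, depth)
    verdict = "separable" if separable(0.20, 0.30, depth) else "not separable"
    print(f"{depth:>4}x: SE at 20% VAF = {se:.3f}; 20% vs 30% is {verdict}")
```

Because the standard error shrinks only with the square root of depth, narrowing a VAF interval by half takes roughly four times the coverage, which is consistent with the recommendation to plan for very deep sequencing when fine-grained subclonal structure matters.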
RNA sequencing |
- Variants detected in both DNA-seq and RNA-seq have high confidence because they are confirmed by orthogonal library and alignment strategies.
- RNA-seq may be used to assess expression status of coding somatic variants and fusions as well as the functional impact of regulatory variants.
|
Overall recommended strategy |
The sequencing strategy will always depend on the goals and budget of the project, but an ideal tumor profiling study might include:
- WGS to a depth of 200–300×
- Exome sequencing to a depth of 1,000× (possibly with spike-in probes for mutational hotspots)
- Analysis with multiple alignment strategies and variant callers
- Validation of variants with custom capture and deep sequencing (1,000–10,000×)
- RNA-seq with 250–300 million mapped 2×100 reads or greater for robust integration with DNA-seq data
|