letter
Clin Chem. 2020 Mar 31;66(4):616–618. doi: 10.1093/clinchem/hvaa026

Fragment Size Analysis May Distinguish Clonal Hematopoiesis from Tumor-Derived Mutations in Cell-Free DNA

Francesco Marass 1,2, Dennis Stephens 3, Ryan Ptashkin 4, Ahmet Zehir 4, Michael F Berger 4,5,6, David B Solit 5,7,8, Luis A Diaz Jr 3, Dana W Y Tsui 4,5,6
PMCID: PMC7108495  PMID: 32191320

To the Editor:

Noninvasive detection of somatic, solid tumor-derived mutations in the blood is an important clinical and investigative tool. However, analysis of cell-free DNA (cfDNA) for somatic mutations can be confounded by the presence of mutations that are not of tumor origin. These include germline alterations, mutations from clonal events in nonneoplastic tissue, and artifacts from the sequencing process (1). The most abundant set of clonal mutations is derived from the hematopoietic system and these may be mistaken for tumor mutations since similar genetic alterations may be present in both (2). One strategy to determine whether mutations stem from this process, termed “clonal hematopoiesis” (CH), or from the tumor is to sequence matched white blood cells. However, cfDNA sequencing is frequently not paired with a matched blood control. Multiple studies have shown that tumor-derived cfDNA consists on average of shorter fragments than cfDNA derived from white blood cells (3). We therefore hypothesized that the size profile of fragments bearing CH mutations would be more similar to the profile of normal white blood cells than to the profile of circulating tumor DNA, and that this difference may allow discrimination between the 2 types of mutation in cell-free DNA.

To test this hypothesis, we studied 44 patients with solid tumors (including prostate, bladder, breast, melanoma, and lung cancers) whose CH mutations had previously been identified by matched tumor-normal analysis using our institutional FDA-authorized clinical test, Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT). We then analyzed the matched plasma cfDNA collected from these patients. The protocol was approved by the Memorial Sloan Kettering Cancer Center institutional review board, and informed consent was obtained from all patients. Blood samples were processed to extract cfDNA (4) and subjected to the MSK-IMPACT hybridization-capture protocol as described, except that the adapter concentration was adjusted to 4.5 μM (5). Captured DNA libraries were sequenced on a HiSeq 4000 with PE100 reads to a mean of 646× coverage per sample, then demultiplexed and aligned (5). CH-derived and tumor-derived nonsynonymous mutations from the tumor-normal MSK-IMPACT data were genotyped in the matched cfDNA.

In the cohort, 38 patients had 69 CH-derived mutations and 42 patients had 349 tumor-derived mutations. In the matched cfDNA, we detected a total of 63 CH-derived mutations (variant allele frequency [VAF] median 3.85%, range 0.1–39.3%) and 169 tumor-derived mutations (VAF median 4%, range 0.1–80%). Fragments bearing either tumor-derived or CH-derived mutations were extracted from the aligned files, resulting in 13 353 CH mutant reads, 25 373 tumor mutant reads, and 429 769 wild-type reads, aggregated across multiple loci in each group. Fragment lengths in the range of 1–720 bp were tallied, and the counts were normalized into proportions. We then computed the difference between the fragment length proportions of tumor-derived and CH fragments to highlight regions of differential enrichment, which approximately follow the ∼160 bp periodic nucleosomal pattern. This allowed us to define 2 predominantly tumor-specific regions (127–141 bp and 272–292 bp, inclusive) and 2 CH-specific regions (173–191 bp and 346–361 bp, inclusive), consistent with the hypothesis that fragments of tumor origin are shorter than cfDNA from noncancer cells (Fig. 1). For each mutation, whether tumor or CH, we computed the proportion of fragments falling in the 2 tumor regions out of all fragments falling in the 4 selected regions. Mutations with fewer than 4 supporting reads across the selected regions were removed before classification. Classification based on this simple statistic achieved an area under the curve (AUC) of 0.74. Performance improved when we considered only mutations with at least 20 supporting reads (AUC 0.8089), because estimation of the statistic from few reads was inaccurate. Doing so reduced the number of mutations from 232 to 125 (54%); of these, 35 were CH mutations. As the threshold was increased further, performance on this dataset plateaued.
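The per-mutation statistic and its evaluation can be illustrated with a minimal Python sketch. It assumes each mutation is represented simply as a list of the lengths of its supporting fragments. The function names (`tumor_region_fraction`, `auc`) are hypothetical and not taken from the study, and the AUC is computed with the generic pairwise (rank-based) definition, which may differ from the software the authors used.

```python
from typing import Iterable, List, Optional, Sequence, Tuple

# Regions (bp, inclusive) as reported in the text.
TUMOR_REGIONS: List[Tuple[int, int]] = [(127, 141), (272, 292)]
CH_REGIONS: List[Tuple[int, int]] = [(173, 191), (346, 361)]


def in_regions(length: int, regions: List[Tuple[int, int]]) -> bool:
    """True if a fragment length falls inside any of the inclusive regions."""
    return any(lo <= length <= hi for lo, hi in regions)


def tumor_region_fraction(lengths: Iterable[int], min_reads: int = 4) -> Optional[float]:
    """Fraction of region-falling fragments that lie in the tumor regions.

    Returns None when fewer than `min_reads` fragments fall in the four
    selected regions, mirroring the removal of poorly supported mutations.
    """
    lengths = list(lengths)
    n_tumor = sum(in_regions(x, TUMOR_REGIONS) for x in lengths)
    n_ch = sum(in_regions(x, CH_REGIONS) for x in lengths)
    total = n_tumor + n_ch
    if total < min_reads:
        return None
    return n_tumor / total


def auc(tumor_scores: Sequence[float], ch_scores: Sequence[float]) -> float:
    """Probability that a random tumor mutation scores higher than a random
    CH mutation, with ties counted as one half (pairwise definition of AUC)."""
    wins = 0.0
    for t in tumor_scores:
        for c in ch_scores:
            if t > c:
                wins += 1.0
            elif t == c:
                wins += 0.5
    return wins / (len(tumor_scores) * len(ch_scores))
```

Raising `min_reads` to 20 corresponds to the stricter threshold described above. It trades the number of classifiable mutations for a more stable estimate of the statistic.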

Fig. 1.

Fragment size analysis of reads bearing mutations derived from tumor and CH in plasma cell-free DNA. Relative enrichment between tumor (positive values) and clonal hematopoiesis (CH) fragments (negative values), obtained by subtracting the normalized CH size profile from the normalized tumor profile. Shown in black is a LOESS fit. Colored areas denote the selected regions (orange for tumor, blue for CH).
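The relative enrichment plotted in Fig. 1 can be sketched as follows, assuming the fragment lengths of tumor-mutant and CH-mutant reads are available as plain lists of integers. The helper names (`size_profile`, `relative_enrichment`) are illustrative only, and the LOESS smoothing shown in the figure is not reproduced here.

```python
from collections import Counter
from typing import Dict, Iterable


def size_profile(lengths: Iterable[int], min_len: int = 1, max_len: int = 720) -> Dict[int, float]:
    """Tally fragment lengths in [min_len, max_len] and normalize to proportions."""
    counts = Counter(x for x in lengths if min_len <= x <= max_len)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no fragments in the requested length range")
    return {size: counts.get(size, 0) / total for size in range(min_len, max_len + 1)}


def relative_enrichment(tumor_lengths: Iterable[int], ch_lengths: Iterable[int]) -> Dict[int, float]:
    """Normalized tumor profile minus normalized CH profile, per fragment length.

    Positive values indicate lengths enriched in tumor-derived fragments,
    negative values lengths enriched in CH-derived fragments.
    """
    tumor = size_profile(tumor_lengths)
    ch = size_profile(ch_lengths)
    return {size: tumor[size] - ch[size] for size in tumor}
```

Because both profiles are proportions that each sum to 1, the differences sum to approximately zero across the 1–720 bp range.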

As a proof of concept, our data indicate that tumor-derived cfDNA presents a shorter fragment size distribution than CH-derived cfDNA. This supports a strategy to distinguish CH-derived mutations from tumor-derived mutations in cfDNA. Incorporating additional information such as patient age may further improve prediction accuracy. Larger datasets will be needed to refine the definition of the regions of interest, the statistic used for classification, and the read threshold. Finally, the predictive performance of this approach will need to be evaluated in independent datasets.

Author Contributions

All authors confirmed they have contributed to the intellectual content of this paper and have met the following 4 requirements: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; (c) final approval of the published article; and (d) agreement to be accountable for all aspects of the article thus ensuring that questions related to the accuracy or integrity of any part of the article are appropriately investigated and resolved.

D.W.Y. Tsui, L.A. Diaz, Jr, study design; F. Marass, D. Stephens, R. Ptashkin, statistical analysis; D.W.Y. Tsui, A. Zehir, M.F. Berger, data collection, administrative support, provision of study material or patients; D.B. Solit, financial support, provision of study material or patients.

Authors’ Disclosures or Potential Conflicts of Interest

Upon manuscript submission, all authors completed the author disclosure form. Disclosures and/or potential conflicts of interest:

Employment or Leadership

L.A. Diaz, Jr, PGDx Board of Directors, Jounce Therapeutics Board of Directors.

Consultant or Advisory Role

F. Marass, Memorial Sloan Kettering Cancer Center, Tsui Lab; D. Stephens, Jounce Therapeutics, Pyxis Oncology; M.F. Berger, Roche; D.B. Solit, Pfizer, Loxo Oncology, Lilly Oncology, Illumina, Vividion Therapeutics, and QED Therapeutics; L.A. Diaz, Jr, PGDx, NeoPhore, 4Paws.

Stock Ownership

F. Marass, Inivata Ltd; L.A. Diaz, Jr, PGDx, Jounce Therapeutics, NeoPhore, 4Paws, Thrive.

Honoraria

A. Zehir, Illumina; D.W.Y. Tsui, Nanodigmbio, Cowen, BoA Merrill Lynch.

Research Funding

D.W.Y. Tsui, ThermoFisher Scientific, EPIC Sciences, Prostate Cancer Foundation (these funders were not involved in the design of the study). We acknowledge the funding support by NIH/NCI Cancer Center Support Grant P30 CA008748, the Marie-Josée and Henry R. Kravis Center for Molecular Oncology, and Cycle for Survival for the generation of data involved in this study.

Expert Testimony

None declared.

Patents

F. Marass, D.W.Y. Tsui, Patent pending; D.W.Y. Tsui, M.F. Berger, provisional application no. 62/658,489.

Other Remuneration

D.W.Y. Tsui, Nanodigmbio (Travel).

Acknowledgments

We acknowledge the important contribution of the following collaborators: Agnes Viale and Kety Huberman for overseeing data generation from cell-free DNA at the Integrated Genomics Operation core facility; Marc Ladanyi, Maria Arcila, and Ryma Benayed for overseeing data generation from clinical MSK-IMPACT testing; Nicholas Socci and the Bioinformatic Core for processing of cell-free DNA raw data; Rose A Brannon, Julie Yang, Maha Shady, and Caitlin Stewart for help with data collection and management; and Michael Cheng, Paul Chapman, Samuel Funt, Darren Feldman, Bob T. Li, Pedram Ravazi, Jonathan Rosenberg, Dean Bajorin, Gopa Iyer, Howard I. Scher, and Dana Rathkopf for patient recruitment and sample collection.

References

  • 1. Ptashkin RN, Mandelker DL, Coombs CC, Bolton K, Yelskaya Z, Hyman DM, et al. Prevalence of clonal hematopoiesis mutations in tumor-only clinical genomic profiling of solid tumors. JAMA Oncol 2018;4:1589–93.
  • 2. Bauml J, Levy B. Clonal hematopoiesis: a new layer in the liquid biopsy story in lung cancer. Clin Cancer Res 2018;24:4352–4.
  • 3. Jiang P, Lo Y. The long and short of circulating cell-free DNA and the ins and outs of molecular diagnostics. Trends Genet 2016;32:360–71.
  • 4. Shukla NN, Patel JA, Magnan H, Zehir A, You D, Tang J, et al. Plasma DNA-based molecular diagnosis, prognostication, and monitoring of patients with EWSR1 fusion-positive sarcomas. JCO Precis Oncol 2017;1–11.
  • 5. Cheng DT, Mitchell TN, Zehir A, Shah RH, Benayed R, Syed A, et al. Memorial Sloan Kettering-integrated mutation profiling of actionable cancer targets (MSK-IMPACT): a hybridization capture-based next-generation sequencing clinical assay for solid tumor molecular oncology. J Mol Diagn 2015;17:251–64.

Articles from Clinical Chemistry are provided here courtesy of Oxford University Press
