Author manuscript; available in PMC: 2026 Jan 10.
Published in final edited form as: Nat Genet. 2025 Jun 2;57(6):1504–1511. doi: 10.1038/s41588-025-02204-3

Longitudinal, multi-site sampling reveals mutational and copy number evolution during metastatic dissemination

Karena Zhao 1,2,3, Joris Vos 1,2, Stanley Lam 1,2,3, Lillian A Boe 4, Daniel Muldoon 5,6, Catherine Y Han 1,2, Cristina Valero 1,2, Mark Lee 1,2, Conall Fitzgerald 1,2, Andrew S Lee 1,2,3, Manu Prasad 1,2, Swati Jain 1,2, Xinzhu Deng 1,2, Timothy A Chan 7, Michael F Berger 5,6, Chaitanya Bandlamudi 5,6,*, Xi Kathy Zhou 8,*, Luc GT Morris 1,2,*
PMCID: PMC12784189  NIHMSID: NIHMS2121751  PMID: 40457077

Abstract

To understand genetic evolution in cancer during metastasis, we analyzed genomic profiles of 3,732 cancer patients in whom multiple tumor sites were longitudinally biopsied. During distant metastasis, tumors were observed to accumulate copy number alterations to a much greater degree than mutations. In particular, the development of whole genome duplication was a common event during metastasis, emerging de novo in 28% of patients. Loss of 9p (including CDKN2A) developed during metastasis in 11% of patients. To a lesser degree, mutations and allelic loss in HLA class I and other genes associated with antigen presentation also emerged. Increasing copy number alteration, but not increasing mutational load, was associated with immune evasion in patients treated with immunotherapy. Taken together, these data suggest that copy number alteration, rather than mutational accumulation, is enriched during cancer metastasis, perhaps due to a more favorable balance of enhanced cellular fitness versus immunogenicity.


Cancer that has metastasized to distant sites is generally incurable; as a result, metastasis is the predominant cause of death from cancer. Because of the selection pressures that occur during tumor dissemination, immune evasion, and adaptation to new microenvironments, the genetic profiles of tumors have been observed to evolve during metastasis1–3.

As progressing and/or metastasizing tumors undergo genetic evolution, both mutational and copy number changes may accumulate under positive selection. While both classes of genetic change are associated with poorer prognosis4–6, they have opposite anticipated effects on immunogenicity: increasing mutation count is expected to generate more immunogenic neoantigens, whereas increasing copy number instability has been associated with immune evasion7,8. At the same time, effects on cellular fitness differ profoundly: some types of copy number instability, such as genome doubling, can buffer the deleterious effects of accumulating mutations (Muller’s ratchet), by preserving wild-type copies of essential genes. Therefore, our objective was to determine the degree to which mutations and/or copy number instability are selected for, as tumors progress or metastasize.

Understanding the genomic differences between primary and metastatic tumors is also of clinical importance, because genomic biomarkers are increasingly used by clinicians to inform decision-making around the use of immunotherapy drugs. For example, tumor mutational burden (TMB) is a predictive biomarker associated with response to immune checkpoint inhibitor (ICI) therapy, and a cutoff of 10 mutations/megabase identifies TMB-high tumors for which anti-PD1 ICI drugs are FDA-approved9,10. The degree of copy number alteration (CNA) or aneuploidy in tumors has also been strongly associated with ICI treatment response7,8,11. However, these biomarkers could conceivably differ across tumor sites within a patient. Prior analyses of unmatched primary and metastatic cancer samples have suggested that metastatic tumors might harbor a higher number of somatic mutations, and more frequent aneuploidy and copy number alterations1. Such observations would be consistent with a process of increasing genetic complexity, genomic instability, or development of novel cancer driver genes, as tumors evolve to pass through a bottleneck and disseminate to distant sites12,13. These mechanisms could lead to genomic biomarkers such as TMB and CNA differing between metastases and the primary site, which may have relevance to the site of disease being sampled for genomic profiling. If these differences are substantial, they may confound genomic biomarkers, and hinder precision oncology approaches to treatment selection.

However, these observations have largely been drawn from comparisons of unpaired primary and metastatic tumors arising in different patients. Unpaired analyses may not accurately capture associations between genetic alterations and metastasis, as they compare tumors from different individuals with distinct clinical behavior and evolutionary trajectories. There are many relevant clinical and tumor biological differences between patients with and without metastatic disease, and between patients in whom different sites have been selected for genomic profiling. For these reasons, the process of metastatic spread is ideally observed in multi-site samples from the same patient. Because prior datasets of paired primary-metastatic tumors have not included large numbers of patients14–17, we assembled a cohort of patients in whom multiple sites were biopsied and genomically profiled. This enabled us to longitudinally evaluate the tumor site-dependent changes in biomarker values, while accounting for within-patient variation, and incorporating relevant clinical covariates in mixed effects models.

The study population included 3,732 patients with 8,171 tumor samples (all with matched normal DNA) profiled using a targeted next-generation sequencing platform (MSK-IMPACT; Supplementary Table 1; Fig. 1; Methods)18 at median coverage of 620× (IQR=485–758). All patients also had more than one tumor biopsy profiled (median=2 samples) and confirmed to be genetically related. Median interval between a patient’s first and last sample was 354 days (IQR=85–771.2). In 1,509 patients, both primary and metastatic samples were available; 1,013 patients had sequential primary samples only, and 1,210 had multiple metastatic samples only.

Figure 1: Flow diagram of cohort samples included.

TMB = tumor mutational burden

The degree of CNA was assessed as the fraction of genome altered (FGA): the percentage of the genome with absolute log2 copy number ratio > 0.2 (ref. 18). TMB and FGA were not correlated (Spearman’s ρ=0.1; Extended Data Fig. 1). Bland-Altman plots depict how each primary-metastatic pair in our sample differed in TMB and FGA (Fig. 2ab). These plots demonstrated that the ratio of patients with metastatic TMB > primary TMB to patients with primary TMB > metastatic TMB was 1.91. For FGA, this ratio was 1.61. This indicates that both TMB and FGA tend to be higher in metastatic compared to primary sites.
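The sign ratio described above can be illustrated with a minimal sketch. The function name `sign_ratio` and the example values are hypothetical and are not data from the cohort. Ties are ignored here, which is an assumption, since the text does not state how equal values were handled.

```python
def sign_ratio(pairs):
    """Count pairs by direction of change and return their ratio.

    pairs: iterable of (primary_value, metastatic_value) for one biomarker.
    Returns (n_higher_in_metastasis, n_higher_in_primary, ratio).
    Pairs with equal values are ignored (an assumption of this sketch).
    """
    higher_met = sum(1 for pri, met in pairs if met > pri)
    higher_pri = sum(1 for pri, met in pairs if pri > met)
    ratio = higher_met / higher_pri if higher_pri else float("inf")
    return higher_met, higher_pri, ratio


# Hypothetical (primary, metastatic) TMB values in mutations/megabase.
example_pairs = [(2.6, 3.5), (5.3, 4.4), (1.8, 1.8), (7.0, 9.7), (3.5, 6.1)]
print(sign_ratio(example_pairs))
```

A ratio above 1 means that more pairs show a higher value in the metastatic sample than in the primary sample. It says nothing about the size of the difference, which is what the mixed effects models estimate.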

Figure 2: Difference in tumor mutational burden and fraction of genome altered among paired primary and metastatic samples.

Bland-Altman plot demonstrating the difference in tumor mutational burden (TMB, a) and fraction of genome altered (FGA, b) across patients with paired primary and metastatic samples. Column graph depicts number of pairs with positive or negative differences, with ratio displayed above column.

To quantify more precisely how tumor location (primary vs. metastases) affects these biomarkers, we used mixed effects models, where random effects included patients, cancer type, and interactions between cancer type and sex, race, sample type (tumor location: primary or metastasis), and sample site (organ); and fixed effects included age, sex, race, systemic therapy history, sample type, sample coverage, tumor purity, MSK-IMPACT panel version, and time from first sample (Methods). The mixed effects models included all samples, including subsequent primary and metastatic tumors, to improve covariate representation, minimizing the possibility of confounding that might arise from pairing constraints alone. This also better refines random effects structures by providing more information on between-subject and between-category variation.

TMB and FGA in Primary Versus Metastatic Sites

To compare trajectories of mutations and copy number alteration, mixed effect models were created for log2-transformed TMB (Fig. 3a) and arcsine square root-transformed FGA (Fig. 3b). With adjustment for other clinical covariates, a sample from a metastatic site, compared to a sample from the same patient’s primary site, was associated with an increased TMB (β=0.1, 95% CI=0.03–0.2): corresponding to a 6.4% (95% CI=1.9–11.2%) higher TMB. For FGA, a sample from the metastatic site was estimated to have a higher FGA (β=0.04, 95% CI=0.01–0.07): corresponding to a 20.6% (95% CI=6.2–36.0%) higher FGA in metastatic samples at our cohort’s median FGA of 0.2. This analysis was also performed with alternate, high stringency criteria for genetic relatedness (Methods), with very similar results (Extended Data Fig. 2 and 3). Similar results were also obtained across a wide range of minimum tumor purity requirements (Extended Data Fig. 4, Supplementary Table 2).

Figure 3: Results of mixed effects models demonstrating how tumor mutational burden and fraction of genome altered change between primary and metastatic sites, and further stratified by cancer type.

Forest plot of fixed effects TMB (a) and FGA (b) with point estimate and 95% confidence intervals, derived from 8,171 samples. In c and d, forest plots show the point estimate (center dot) and 95% confidence intervals (lines). Difference in TMB (c) and FGA (d) between primary and metastatic sites (Metastatic minus Primary) for the 20 most prevalent cancer types in our cohort. Cancer types that demonstrate a significant difference between primary and metastatic sites in both TMB and FGA are shown in grey. Exact FDR-corrected q-values are included above each cancer type as an adjustment for multiple comparisons. In c and d, plots show the point estimate (dot) and 1.5x the 95% confidence interval (whiskers).

Other covariates including age, tumor purity, prior or intervening systemic therapy, and sample coverage, were associated with an increased TMB, FGA, or both. Sex, race, and MSK-IMPACT gene panel version were not significantly associated with changes in either TMB or FGA.

We then performed analyses by cancer type and metastatic site using separate mixed effects models for each cancer type1. We used the mixed effects models to compare how TMB (Fig. 3c) and FGA (Fig. 3d) varied between primary and metastatic sites among the 20 most common cancer types in our cohort. In the majority of cancer types, metastatic sites had numerically higher TMB (14 of 20) or FGA (16 of 20). These differences were clearest in pancreas, breast, and prostate: in these cancer types, both TMB and FGA were significantly higher in metastases compared to primary tumors, after adjustment for other covariates. We then examined histologic subtypes among several cancer types, and found no significant differences in TMB or FGA trajectories during metastasis in breast cancer (by ER, PR, and Her2 status), pancreatic cancer (adenocarcinomas vs. neuroendocrine tumors), or NSCLC (squamous cell carcinomas vs. adenocarcinomas; Extended Data Fig. 5).

Additional analyses by site of metastasis (lymph nodes, liver, lung/pleura, peritoneum, bone, CNS/brain, and other) were performed (Fig. 4; Supplementary Table 3). Across all cancer types and metastatic sites, all (50 of 50) statistically significant (FDR q<0.05) differences favored a higher TMB or FGA in the metastatic site.

Figure 4: Results of mixed effects models demonstrating how tumor mutation burden and fraction of genome altered vary between different metastatic sites, stratified by cancer type.

Associations between metastatic dissemination and tumor mutational burden (TMB) or fraction of genome altered (FGA), stratified by cancer type and site of metastasis. Number of samples, number of patients, sex distribution, and sample type distribution are shown on the left. Number of samples and patients are shown on a log-scale. Mixed effects models were generated for each cancer type. False discovery rate adjusted p-values are starred for q < 0.05 as an adjustment for multiple comparisons. Estimated effects are represented with different hues, with positive estimated effects shown in red and negative estimated effects shown in blue.

Taken together, these data indicate that the process of metastatic dissemination in cancers is consistently associated with increased genetic complexity as tumors pass through an evolutionary bottleneck – as evidenced by higher somatic mutation count and more copy number alterations. This pattern was observed in most cancer types, and across most metastatic sites. With adjustment for other covariates, the differences in TMB were modest, translating to approximately 6.4% higher TMB in metastatic samples. Differences for FGA were more pronounced, with a median patient having 20.6% higher FGA in a metastatic sample. For TMB, which is the biomarker more widely used and reported in the clinical setting, the difference in values may not be large enough to be clinically relevant; however, if FGA or other measures of copy number alteration become more widely used clinically, differences across metastatic sites may be more important.

Interestingly, this pattern was not observed in tumors starting in the highest quartile of TMB and FGA (Extended Data Fig. 6), where TMB and FGA did not increase further during metastasis. This may be because these tumors with high genetic complexity gain minimal fitness advantage from additional genetic alterations, or because beyond this level, additional mutations or CNAs become deleterious.

TMB, FGA, and Immunogenicity

To evaluate the hypothesis that copy number instability would be enriched over mutational accumulation during tumor evolution under immune selection, we compared the effect of each process on immunogenicity, by examining tumor response to immunotherapy. Analyzing a separate, previously reported cohort5 of 1,892 cancer patients treated at our center with checkpoint inhibitor drugs, using multivariable Cox and logistic regression models including TMB, FGA, cancer type and stage as covariates, we found that increasing TMB was consistently associated with significantly improved overall survival (OS; HR=0.99 per mutation/MB, 95% CI=0.98–0.99, p=1.7 × 10^−6), progression-free survival (PFS; HR=0.98, 95% CI=0.97–0.98, p=6.7 × 10^−18), and tumor response (OR of response=1.03, 95% CI=1.02–1.04, p=5.7 × 10^−11). In contrast, increasing FGA had the opposite association – significantly poorer OS (HR=1.6, 95% CI=1.2–2.2, p=0.003), PFS (HR=1.5, 95% CI=1.1–1.9, p=0.004) and tumor response (OR=0.45, 95% CI=0.2–0.8, p=0.007), in patients treated with checkpoint inhibition (Extended Data Fig. 7). These findings suggest that, during tumor genetic evolution, more mutations are associated with higher immunogenicity, whereas copy number alterations may be more associated with immune escape. These immunological consequences, together with the increase in intratumoral heterogeneity19–21 and metastatic potential22–24, may in part explain why copy number alteration is more selected for than mutagenesis during the metastatic process.
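A hazard ratio reported per mutation/MB looks close to 1 even when the effect over a clinically meaningful TMB difference is larger. The sketch below shows the arithmetic. It assumes TMB entered the Cox model as a continuous linear term (which the per-unit reporting suggests but the text does not spell out), and it uses the rounded published hazard ratio, so the result is only approximate. The function name `scale_hazard_ratio` is hypothetical.

```python
def scale_hazard_ratio(hr_per_unit, units):
    """Hazard ratio implied for a difference of `units` in the covariate.

    Under a Cox model with a linear term, the log hazard changes by
    log(hr_per_unit) for each unit, so per-unit ratios multiply.
    """
    return hr_per_unit ** units


# Rounded OS hazard ratio from the text, scaled to a 10 mutations/MB difference.
print(round(scale_hazard_ratio(0.99, 10), 3))
```

With the rounded value of 0.99, a 10 mutations/MB difference corresponds to a hazard ratio of roughly 0.90.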

Emergence of Whole Genome Duplication

To examine copy number alteration further, we considered a more extreme form: tetraploidization, or genome doubling. Whole genome duplication (WGD) events are common in tumors and strongly associated with chromosomal instability. WGD provides a substrate for more rapid genetic evolution, the development of further genomic instability, and may be a protective factor against the accumulation of deleterious mutations, by providing additional wild type genetic copies25. As a result, WGD has been linked with poorer prognosis across multiple cancer types, and may be more prevalent in metastases25. WGD was determined using FACETS (excluding low-purity [<20%] samples), and was observed in 40.4% of samples, similar to prior reports25. The prevalence of WGD was higher in metastatic samples compared to primary samples (47.1 vs. 37.9%) in primary-metastasis pairs from the same patient (p<0.001, McNemar’s test). WGD emerged de novo in metastatic samples in 28.5% (154 of 541; 95% CI=24.8–32.4%) of patients in whom primary tumors had clear absence of WGD; this rate was highest in colorectal cancer patients (42.2%; 19 of 45, Fig. 5ab). This finding was robust to more stringent definitions of WGD, more stringent requirements for minimum tumor purity (Supplementary Table 4), and more stringent criteria of genetic relatedness (Extended Data Fig. 3). In comparison to WGD occurring during metastasis, we then examined the de novo development of WGD in subsequent primary tumors, among 359 patients with WGD data who had 2 or more serial biopsies of the primary tumor (median interval, 6.3 months). We observed that WGD arose de novo in a subsequent primary tumor, but at a lower rate (15.9%; 95% CI=12.5–20.0%), indicating that WGD may also develop during the process of tumor evolution, but much less commonly than during metastasis.
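The confidence interval for the de novo WGD rate can be illustrated as follows. This is a minimal sketch that assumes a Wilson score interval; the text does not state which interval method was used, and the function name `wilson_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion (two-sided)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts from the text: WGD emerged de novo in 154 of 541 patients.
low, high = wilson_interval(154, 541)
print(f"{154 / 541:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Under this assumption, 154 of 541 gives an interval that rounds to the reported 24.8–32.4%.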

Figure 5: De novo emergence of alterations in selected genes, HLA class I loss of heterozygosity, and whole genome duplication.

Percentage of patients with paired primary and metastatic samples (a) and paired initial primary to subsequent primary samples (b) demonstrating de novo mutations in antigen presenting machinery (APM) genes, p53, HLA Class I loss of heterozygosity (LOH), and whole genome duplication (WGD), by cancer type. Samples with purity<20% were excluded. Primary to metastatic de novo mutations are shown on the left and primary to subsequent primary de novo mutations are shown on the right. Percentage of patients with paired primary and metastatic samples (c) and paired initial primary to subsequent primary samples (d) demonstrating de novo mutations or copy number alterations in top driver mutations.

De Novo Mutations in Metastases

Several specific genomic alterations have been studied in metastasis, such as the development of novel driver mutations26,27. However, prior studies have generally relied on comparisons of unpaired samples and have not identified individual genes that are consistently and clearly enriched or depleted during the process of metastasis, with the possible exception of TP53 (refs. 1,26,28,29). Thus, we examined the de novo emergence of the above genetic alterations in the metastatic samples, and subsequent primary tumor samples, of patients in whom index primary samples did not harbor evidence of these alterations (Fig. 5ab; excluding samples with low purity). In this cohort, 45 of 614 (7.3%; 95% CI=5.5–9.7%) patients whose primary tumor lacked a TP53 mutation or deletion developed a de novo mutation or homozygous deletion of TP53 in the metastatic sample. This rate was similar to the rate of de novo alteration in subsequent primary tumor samples (6.7%; 95% CI=4.6–9.4%).

Immune evasion is an important mechanism of successful distant tumor seeding. We examined metastatic samples, and subsequent primary tumor samples, for the de novo emergence of mutations and/or homozygous deletions in HLA class I genes and antigen presenting machinery genes: B2M, TAP1, TAP2, HLA-A, HLA-B, and HLA-C. In metastatic samples, the rate of de novo mutation in each gene individually ranged from 0.2–0.7% (Extended Data Fig. 8). Across all APM genes combined, 29 of 1203 patients had a de novo mutation in any one of these genes emerge in a metastatic sample (2.4%; 95% CI=1.7–3.4%). This rate was similar to the rate of de novo alteration in subsequent primary tumor samples (3.4%; 95% CI=2.3–4.9%; Fig. 5ab). Alterations in these immune evasion genes appear uncommon during tumor progression. The prevalence of these alterations was robust to more stringent minimum purity requirements (Supplementary Table 5) and genetic relatedness criteria (Extended Data Fig. 3).

We then analyzed the de novo emergence of alterations in the 36 most commonly mutated oncogenes and tumor suppressor genes in metastatic cancer28 (Fig. 5cd). The majority of genes had low rates (0–5%) of de novo emergence in metastatic samples, or in subsequent primary samples. Of note, we observed that CDKN2A alterations (nearly all of which were deletions) emerged de novo in 10.8% (95% CI=9.1–12.8%) of metastatic samples, a rate that was significantly higher than the rate of emergence in subsequent primary tumor samples (6.2%; 95% CI=4.6–8.3%; χ² = 10.51, p = 0.001). This process was particularly prevalent in NSCLC metastasis: CDKN2A loss developed in 24% of cases. Deletions in CDKN2A and the 9p21 locus have been linked with metastasis, poor survival, and immune evasion in various cancer types, perhaps explaining its increased incidence in metastatic samples. There were also some cancer-type-specific mutations observed to emerge during metastasis, such as ESR1 mutation in breast cancer, and AR amplification and PTEN mutation in prostate cancer, consistent with prior studies30–32.

Clonal dynamics of mutations in these genes were then examined. Driver mutations tended to remain clonal across the process of metastasis. While monoclonal seeding of metastases by subclonal driver genes was observed, and differential polyclonal dispersal across various metastatic sites was also observed, these scenarios were relatively uncommon (Extended Data Fig. 9).

HLA-I Loss of Heterozygosity in Metastases

Finally, allelic loss of HLA class I genes (HLA loss of heterozygosity, LOH) was assessed using LOHHLA (Methods)33, with 23.9% of samples demonstrating evidence of HLA LOH in at least one site. Focusing on patients in whom the primary tumor did not have evidence of HLA LOH, we observed that 125/797 patients (15.7%; 95% CI=13.3–18.4%) had HLA LOH emerge de novo within the metastatic tumor (differing thresholds for absence of LOH in the primary tumor are shown in Supplementary Table 6). This rate was similar to the rate of de novo alteration in subsequent primary samples (16.2%; 95% CI=13.4–19.6%; Fig. 5ab). However, in general, the development of LOH at the HLA-I locus was associated with a global process of increasing LOH in tumors. The fraction of genome with loss of heterozygosity (FGLOH) was strongly associated with de novo emergence of HLA-I LOH in logistic regression (OR=8.62, p=4.41 × 10^−10), and the change in FGLOH from primary to metastatic samples was significantly higher in patients who developed de novo HLA-I LOH (Extended Data Fig. 10).

Taken together, these findings provide insight into the genetic alterations that emerge during the process of cancer metastasis – an increase in mutation count, copy number alterations and genome duplication. The higher mutation counts found in metastases may reflect a process of genetic evolution under selective pressure, as subclones with mutations conferring increased fitness or therapeutic resistance grow out and seed metastases1. However, increasing mutations may also exert negative effects on tumor cell fitness: first, as the process generates more potentially immunogenic neoantigens, it may be constrained by immune surveillance; second, accumulating mutations may eventually become deleterious to essential genes.

Increasing copy number alteration may mitigate some of these effects. First, increasing copy number alteration and aneuploidy have both been associated with immune evasion7,34, providing one explanation for the more pronounced increases in FGA and development of WGD in the metastatic samples in our dataset35. Consistent with this mechanism, we observed directionally opposite associations between these processes and immunotherapy response, suggesting that immune surveillance may, to some degree, constrain mutagenesis but not copy number alteration. Second, increases in copy number alteration also allow for better adaptation to selective pressures posed by the tumor microenvironment (such as hypoxia) and anti-cancer therapies. WGD in particular has been identified as a process that tumors use to mitigate the “Muller’s ratchet” effect of accumulating deleterious mutations. This may explain the notable enrichment in WGD during the process of metastasis. However, both mutations and CNAs are likely restrained by a cap on genetic complexity, in that above a certain threshold, additional alterations may no longer confer a fitness advantage.

Certain genetic alterations associated with immune evasion were also observed to emerge during the process of tumor evolution – in both subsequent primary tumor samples and metastatic samples. Of these, CDKN2A deletion seemed to be most specific for the process of metastasis, possibly reflecting the role of this gene, and the 9p21 locus, in metastasis and immune evasion36.

There are important caveats to these findings. First, while pan-cancer analyses may offer some clues into the genetic evolution of tumors, differences were more pronounced in some cancer types than others. However, despite this variation, the statistically significant differences favored higher TMB and FGA in metastatic tumors, across all cancer types and all metastatic sites (Fig. 4). A second caveat is that this cohort of patients having more than one sample biopsied and sequenced will tend to be enriched for more aggressive tumors. This might lead to an overestimation of the frequency of these events.

Nevertheless, these data validate and expand upon current benchmarks for the interpretation of the FDA-approved biomarker TMB, suggesting that sample location (primary vs metastatic) is unlikely to substantially affect this biomarker. In contrast, copy number instability (as reflected in FGA and the presence of WGD) was strongly enriched in metastatic sites, perhaps due to the role of chromosomal instability in immune evasion7, epithelial to mesenchymal transition22, production of cytosolic dsDNA22,24, and protection against deleterious mutations37,38, thereby promoting metastatic potential.

In conclusion, by drawing on matched primary and metastatic samples from patients with advanced cancer, we observe that during the bottlenecking process of distant metastasis, positive selection appears to favor tumor subclones with more mutational and copy number complexity. Distant metastasis was most strongly associated with copy number instability, rather than mutagenesis. This may be attributable to potentially negative effects of mutational accumulation on tumor cell fitness: increasing immunogenicity, and deleterious mutations to essential genes. Both of these processes may be buffered by increasing copy number alteration, especially WGD. Indeed, WGD was the most common feature observed to emerge de novo in metastasis. We also observed genetic alterations in HLA and APM genes that facilitate immune evasion, developing during the process of tumor evolution and/or metastasis. Importantly, some of these alterations that facilitate immune evasion are common in recurrent or metastatic cancers, and have also been associated with resistance to immunotherapeutic strategies such as adoptive cell transfer and checkpoint blockade. Therefore, newer strategies (for example, those that include cytotoxic systemic therapy, targeted therapies to modify the immune microenvironment, or restoration of HLA) are likely to be critical for the achievement of durable tumor responses in patients with metastatic cancer.

Methods

Patient Selection

After receiving Memorial Sloan Kettering Cancer Center (MSK) institutional review board approval, we identified 10,667 patients with various solid cancers for whom ≥2 tumor biopsies (total 28,432 samples) were genomically profiled using MSK-IMPACT, a targeted next-generation sequencing (tNGS) panel18. All patients were treated at MSK between April 7, 2015 and November 21, 2021. Patients with genomically profiled biopsies that were obtained from different synchronous cancers, CNS tumors, or a second primary tumor of different histology, were excluded. Cancer types with fewer than 25 samples, and cases in which metastatic samples were taken before primary samples, were also excluded from analysis. Tumor samples from the same patient were required to share at least one identical somatic gene alteration as evidence of relatedness. For patients with more than 2 samples in our cohort, we only included samples that had at least one shared gene alteration. Our final cohort consisted of 8,171 genomically profiled tumor samples, obtained from 3,732 patients. The median age of patients was 63 years (IQR: 52–71) and 54.4% of patients were female. All patients provided informed consent to an MSK institutional review board-approved protocol, permitting the return of results from sequencing analyses for research.

Clinical and Genomic Data

Tumor biopsies were profiled using different versions of the MSK-IMPACT tNGS panel: 341 (n=275), 410 (n=1201), 468 (n=5063), or 505 gene (n=1632) versions. Tumor mutational burden was defined as the total number of somatic non-synonymous tumor mutations normalized to the exonic coverage of the MSK-IMPACT panel, in mutations/megabase39. The degree of copy number alteration burden in a tumor was quantified using the metric FGA: fraction of genome altered11, which is calculated as the percentage of the genome with absolute log2 copy number ratios > 0.2 (ref. 18). Tumor purity was calculated utilizing FACETS40. If purity could not be calculated from FACETS (in 3.7% of our samples; e.g., in cases of purely diploid tumors), pathologist-estimated purity was instead utilized. Potentially prognostically relevant clinical covariates39,41 obtained through patient chart review were age, sex, race, cancer type, whether the patient had received systemic therapy prior to IMPACT sampling, the sample site (primary/metastatic), and the interval between the patient’s profiled tumor samples.
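The two biomarker definitions above can be sketched in a few lines. This is an illustration under stated assumptions, not the MSK-IMPACT pipeline: the segment representation `(start, end, log2_ratio)`, the function names, the example segments, and the 1.2 Mb covered-bases figure are all hypothetical.

```python
def fraction_genome_altered(segments, threshold=0.2):
    """Fraction of profiled bases in segments with |log2 copy number ratio| > threshold.

    segments: iterable of (start, end, log2_ratio) tuples (hypothetical layout).
    """
    total = sum(end - start for start, end, _ in segments)
    altered = sum(
        end - start for start, end, ratio in segments if abs(ratio) > threshold
    )
    return altered / total if total else 0.0


def tumor_mutational_burden(n_nonsynonymous, covered_bases):
    """Non-synonymous somatic mutations per megabase of covered exonic sequence."""
    return n_nonsynonymous / (covered_bases / 1e6)


# Hypothetical copy number segments spanning 100 Mb.
segments = [
    (0, 50_000_000, 0.05),            # near-neutral, not counted
    (50_000_000, 80_000_000, -0.45),  # loss, counted
    (80_000_000, 100_000_000, 0.31),  # gain, counted
]
fga = fraction_genome_altered(segments)
tmb = tumor_mutational_burden(12, 1_200_000)  # hypothetical panel footprint
print(fga, tmb)
```

In this example half of the profiled bases exceed the threshold, and 12 mutations over 1.2 Mb sit at the 10 mutations/megabase cutoff mentioned in the introduction.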

Statistical Analyses

Statistics & Reproducibility

All analyses were conducted in RStudio v2023.12.1+402. No statistical method was used to predetermine sample size. Some samples were excluded, based on the exclusion criteria specified above (Fig. 1).

Correlation

The relationship between TMB and FGA in our database was investigated with a Spearman’s rank correlation coefficient.

Mixed Effects Models

To quantify overall cancer site (primary vs. metastatic sites) related differences in TMB and FGA and to account for contributions from other relevant variables, two linear mixed models (one for TMB and one for FGA) using restricted maximum likelihood estimators (REML) were created in order to account for repeated measurements within the same patient, using the lme4 package in R42. The mixed effects models included patient ID, cancer type, and interactions between cancer type and sample site, between cancer type and sex (for TMB only), and between cancer type and race (for TMB only) as random effects. Random effects were examined for contribution to the explanation of the total variance and only the ones that could be reliably estimated and without convergence problems in each model were kept. Age, sex, race, previous systemic therapy, whether the sample was from a primary or metastatic site, sample coverage, tumor purity, the version of the IMPACT gene panel, and the interval between samples were included as fixed effects. Previous systemic therapy was coded into 3 categories: no systemic therapy prior to sampling, a new systemic therapy regimen <6 months prior to sampling, and a new systemic therapy regimen ≥6 months prior to sampling.

To ensure that model assumptions were satisfied, TMB and FGA were transformed using a log2 and arcsine square root transformation, respectively. For each mixed effects model, forest plots were created to show the effect size of the fixed effects on TMB and FGA.

We then analyzed the association between TMB and FGA and metastatic site, by cancer type. TMB and FGA were analyzed with separate mixed effects models and the point estimates and 95% confidence intervals shown on forest plots. Percent change for TMB was calculated using $2^{\text{Estimate}} - 1$. The FGA model estimate reports the difference in square roots (e.g. $\text{estimate} = \sqrt{\text{FGA(met)}} - \sqrt{\text{FGA(pri)}}$). Percent change for FGA was reported for the cohort’s median FGA value (e.g. $\text{FGA(met)}/\text{FGA(pri)} = \left(1 + \text{estimate}/\sqrt{\text{FGA(pri)}}\right)^{2}$). In calculating the estimated percent change in FGA, we utilized the median FGA value for the value of FGA(pri), and the estimate from the mixed effects model for the value of “estimate.” Purity cut-offs were not utilized in the mixed effects models, because purity was included as a covariate.
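The transformations and back-transformations described above can be sketched as follows. This sketch reads the TMB percent change as 2 raised to the model estimate, minus 1, and the FGA estimate as a difference of square roots, so that the metastatic-to-primary FGA ratio is the square of (1 + estimate / sqrt(FGA(pri))). The function names and the numeric inputs in the example are hypothetical, and FGA is assumed to be expressed as a fraction between 0 and 1.

```python
import math


def transform_tmb(tmb):
    """log2(TMB + 1), the response used in the TMB model."""
    return math.log2(tmb + 1)


def transform_fga(fga):
    """Arcsine square root of FGA (FGA as a fraction in [0, 1])."""
    return math.asin(math.sqrt(fga))


def tmb_percent_change(estimate):
    """Percent change implied by an estimate on the log2 scale."""
    return (2 ** estimate - 1) * 100


def fga_ratio(estimate, fga_primary):
    """FGA(met) / FGA(pri) when the estimate is a difference of square roots."""
    return (1 + estimate / math.sqrt(fga_primary)) ** 2


def fga_percent_change(estimate, fga_primary):
    """Percent change in FGA at a given reference primary FGA."""
    return (fga_ratio(estimate, fga_primary) - 1) * 100


# Hypothetical numbers: an estimate of 0.1 at a reference primary FGA of 0.25.
example_change = fga_percent_change(0.1, 0.25)
```

Because the TMB model is fitted to log2(TMB + 1), the percent change strictly applies to TMB + 1. The FGA percent change depends on the reference value chosen for FGA(pri), which is why the text fixes it at the cohort median.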

The following models were implemented in R with the lme4 package.

log2(TMB+1)~Age+Sex+Race+Chemotherapy+Sample Type+Sample Coverage+Tumor Purity+IMPACT Panel Version+Time from First Sample+(1|Patient ID)+(1|Cancer Type)+(1|Cancer Type:Sample Type:Sample Site)+(1|Cancer Type:Sex)+(1|Cancer Type:Race)
asin(sqrt(FGA))~Age+Sex+Race+Chemotherapy+Sample Type+Sample Coverage+Tumor Purity+IMPACT Panel Version+Time from First Sample+(1|Patient ID)+(1|Cancer Type)+(1|Cancer Type:Sample Type:Sample Site)

The variability of TMB between metastatic sites was relatively modest, approximately 12% of between-patient variability and 19% of between-cancer-type variability. FGA showed more variability between metastatic sites, approximately 18% of between-patient variability and 28% of between-cancer-type variability, as defined by the standard deviations of the random effects defined in the mixed effects model.

Further analysis was then carried out to compare how TMB and FGA varied between the primary and each specific metastatic site, stratified by cancer type. Two linear mixed effects models were created, one for TMB and one for FGA, with patient ID as a random effect and age, sex, race, systemic therapy, sample coverage, tumor purity, IMPACT panel version, days from patient’s first sample, cancer type, sample type, and the cancer type by sample type interaction as fixed effects. Again, TMB was log transformed and FGA was arcsine square root transformed to ensure that the model assumptions were met. Simultaneous inference of general linear functions of model parameters was used to estimate the differences between primary and each specific metastatic site for each cancer group for TMB and FGA. Point estimates and 95% confidence intervals are shown in error bar plots.

In addition, we used the following mixed effects models for each cancer type.

log2(TMB+1)~I(Age/10)+Race+Sex+Sample Site+Chemotherapy+I(Sample Coverage/1000)+Tumor Purity+Gene Panel Version+I(Months from First Sample)+(1|Patient ID)
asin(sqrt(FGA))~I(Age/10)+Race+Sex+Sample Site+Chemotherapy+I(Sample Coverage/1000)+Tumor Purity+Gene Panel Version+I(Months from First Sample)+(1|Patient ID)

The effect sizes of the various sample sites (primary vs. all metastatic sites, lymph nodes, liver, lung/pleura, peritoneum, CNS/brain, and other) were collected, and p-values were adjusted using false discovery rate (FDR) correction43. Only cancer types with more than 20 patients with a sample from the metastatic site were included. Exact sample sizes are shown in Supplementary Table 3.
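The FDR adjustment cited above is the Benjamini-Hochberg step-up procedure. The following standard-library sketch shows how adjusted p-values are obtained; the function name and the example p-values are hypothetical, and the study itself used R rather than this code.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four site-specific effects.
example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values non-decreasing in the raw p-values.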

Immunotherapy Response Analyses

To examine the effect of TMB and FGA on survival and response to immunotherapy, we analyzed a cohort5 of 2,037 cancer patients treated with checkpoint inhibitor drugs and genomically profiled using MSK-IMPACT. We excluded CNS tumors (n=75), tumors of unknown primary origin (n=29), samples without complete FGA data (n=9), and samples without tumor stage available (n=32). Using the remaining 1,892 samples, Cox and logistic regression models were fitted using the coxph and glm functions from the R survival44 and stats packages, respectively.

Overall Survival~TMB+FGA+Tumor Stage+Cancer Type
Progression Free Survival~TMB+FGA+Tumor Stage+Cancer Type
Response~TMB+FGA+Tumor Stage+Cancer Type

Tumor stage was dichotomized (Stage IV vs. Stages I–III). Overall survival (OS) was defined as the time between the ICI start date and death from any cause. If the patient did not have a death notification, OS was censored at the last time the patient had an order, infusion, message, or appointment at MSK. Progression-free survival (PFS) was defined as the time between the ICI start date and tumor progression or death. If the tumor did not progress, PFS was censored at the last time the patient had a medical appointment at MSK. Hazard ratios, odds ratios, and their 95% confidence intervals were plotted in forest plots using the R forester package45.
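The endpoint definitions above amount to a (time, event indicator) pair per patient, which is the input a Cox model expects. The sketch below shows one way to derive these pairs from dates. All names are hypothetical, time is measured in days as an assumption, and taking the earlier of progression and death as the PFS event is one reading of the definition.

```python
from datetime import date


def time_to_event(start, event_date, last_contact):
    """Return (days, event_observed); censor at last contact if no event."""
    if event_date is not None:
        return (event_date - start).days, True
    return (last_contact - start).days, False


def overall_survival(ici_start, death_date, last_contact):
    """OS: ICI start to death from any cause, else censored at last contact."""
    return time_to_event(ici_start, death_date, last_contact)


def progression_free_survival(ici_start, progression_date, death_date,
                              last_contact):
    """PFS: ICI start to the earlier of progression or death, else censored."""
    events = [d for d in (progression_date, death_date) if d is not None]
    first_event = min(events) if events else None
    return time_to_event(ici_start, first_event, last_contact)


# Hypothetical patient: progressed after one month, alive at last contact.
example_pfs = progression_free_survival(
    date(2020, 1, 1), date(2020, 2, 1), None, date(2020, 6, 1))
```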

De Novo Alteration Analyses

For the following analyses involving WGD, TP53, APM genes, and HLA-I LOH, all samples included were required to have a minimum purity of 20%. Previous studies have cited purity thresholds ranging from 10–30%46–49, and prior research suggests that tumor purities as low as 20% may still offer accurate FGA, WGD and LOH calls50–52. To mitigate the confounding effect of low-purity samples while still retaining a sufficient sample size for our analyses, we have reported our main findings with a purity threshold of 20%. However, sensitivity analyses are included in Supplementary Tables 4 and 5, demonstrating similar results with purity thresholds of up to 60%.

Whole genome duplication (WGD) was analyzed in FACETS as previously described, to identify tumors in which >50% of the autosomal genome had a major copy number (MCN) greater than or equal to 2 (ref. 25). For this analysis, all samples passed the more stringent FACETS QC filter of best fit. In calculating the percentage of tumors with de novo development of WGD, we opted not to include cases with only marginal change in WGD around 50% (e.g., cases where both samples had MCN ≥ 2 in close to 50% of the genome). Thus, we used more stringent thresholds, requiring MCN ≥ 2 in <40% of the genome in the first sample and MCN ≥ 2 in >60% of the genome in subsequent samples; this is anticipated to underestimate the rate of de novo whole genome doubling. McNemar’s analysis was performed on patients who had both primary and metastatic samples (n=1004). The first available primary and first metastatic samples were used for each patient. Additional analyses applied more stringent filters to the primary cases (requiring duplication in <30% and <20% of the genome), or higher purity thresholds, all of which produced similar results (Supplementary Table 2).
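The de novo WGD rule and the paired comparison described above can be illustrated with a short sketch. The input to the classifier is the fraction of the autosomal genome with MCN ≥ 2, which in the study came from FACETS. The function names are hypothetical, and the exact (binomial) form of McNemar’s test shown here is an assumption, since the text does not state which variant was used.

```python
from math import comb


def wgd_status(frac_mcn_ge2, low=0.40, high=0.60):
    """Classify a sample by the fraction of autosomal genome with MCN >= 2."""
    if frac_mcn_ge2 < low:
        return "non-WGD"
    if frac_mcn_ge2 > high:
        return "WGD"
    return "borderline"


def is_de_novo_wgd(first_frac, later_fracs):
    """De novo WGD: first sample clearly non-WGD, a later sample clearly WGD."""
    return (wgd_status(first_frac) == "non-WGD"
            and any(wgd_status(f) == "WGD" for f in later_fracs))


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    b: pairs with WGD only in the metastatic sample
    c: pairs with WGD only in the primary sample
    """
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

Leaving a gap between the 40% and 60% cut-offs means that samples hovering around 50% are never counted as a change in either direction, which is why the text expects this rule to underestimate the de novo rate.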

For analyses of mutations and deletions in antigen presenting machinery (APM) genes and TP53, and of HLA-I loss of heterozygosity (LOH), samples were filtered to those with purity >20%. For analysis of de novo genetic alterations in TP53 and APM genes, the cohort was subsetted to patients with both a primary and a metastatic sample available, in whom single nucleotide variants and/or homozygous deletions (defined as log2 copy number ratio ≤−1.5) were assessed. A genetic alteration in a metastatic sample was categorized as a de novo event if none of the same patient’s primary tumor samples harbored a mutation or deletion in that gene. To ensure that our results were not affected by cases with low sample purity, we repeated these analyses with a more stringent purity filter, excluding samples with purity <30% (Supplementary Table 3).
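The de novo rule described above can be sketched as a set comparison between a metastatic sample and all primary samples from the same patient. The data layout and function names are hypothetical; only the log2 cut-off of −1.5 for homozygous deletion comes from the text, and applying the rule gene by gene is an assumption.

```python
HOMDEL_LOG2_CUTOFF = -1.5  # log2 copy number ratio at or below this is a deletion


def altered_genes(mutated_genes, log2_ratio_by_gene):
    """Genes carrying a mutation and/or a homozygous deletion in one sample."""
    deleted = {gene for gene, ratio in log2_ratio_by_gene.items()
               if ratio <= HOMDEL_LOG2_CUTOFF}
    return set(mutated_genes) | deleted


def de_novo_genes(primary_samples, metastatic_sample, genes_of_interest):
    """Genes altered in the metastasis but in none of the primary samples.

    primary_samples: list of sets of altered genes, one set per primary sample
    metastatic_sample: set of altered genes in the metastatic sample
    """
    seen_in_primary = set().union(*primary_samples)
    return (metastatic_sample - seen_in_primary) & set(genes_of_interest)


# Hypothetical patient: TP53 mutated in both samples, B2M deleted only in the
# metastasis.
primary = [altered_genes({"TP53"}, {"B2M": -0.2})]
metastasis = altered_genes({"TP53"}, {"B2M": -1.8})
example_de_novo = de_novo_genes(primary, metastasis, {"TP53", "B2M", "TAP1"})
```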

Loss of heterozygosity was called using the Loss of Heterozygosity in Human Leukocyte Antigen (LOHHLA) tool, a computational method developed to infer haplotype-specific copy number of the HLA locus29. A tumor sample was considered to have HLA loss of heterozygosity if the estimated copy number of allele 1 or allele 2 of HLA class I, using binning and b-allele frequencies, was <0.5 with p-value <0.001. The rate of LOH emerging de novo in metastatic samples was assessed among patients in whom the primary sample did not harbor HLA LOH. To ensure that the de novo rate of HLA LOH was not overestimated by mis-categorizing primary tumors with borderline LOH as negative for LOH, additional sensitivity analyses were performed to exclude patients with sub-threshold cases of LOH in primary tumors: specifically, those with estimated copy number of HLA-I allele 1 or allele 2 using binning and b-allele frequencies <0.5 but with p-values up to <0.05 and <0.10 (Supplementary Table 4). To ensure that our results were not affected by cases with low sample purity, we repeated these analyses excluding all samples with purity <30%, <40%, and <60% (Supplementary Table 3).
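The HLA LOH call and the sensitivity filter on borderline primaries can be sketched as below. The inputs stand in for the allele-specific copy number estimates and p-value reported by LOHHLA; the function names, the tuple layout, and the use of None to mark ineligible patients are all hypothetical choices for this illustration.

```python
def has_hla_loh(allele1_cn, allele2_cn, p_value, cn_cutoff=0.5, p_cutoff=0.001):
    """LOH call: either allele's copy number below cutoff, with a small p-value."""
    return min(allele1_cn, allele2_cn) < cn_cutoff and p_value < p_cutoff


def de_novo_hla_loh(primary, metastatic, primary_exclusion_p=None):
    """Return True/False for de novo LOH, or None if the patient is not eligible.

    primary, metastatic: (allele1_cn, allele2_cn, p_value) tuples.
    primary_exclusion_p: optional relaxed p-value cutoff (e.g. 0.05 or 0.10)
        used to drop patients whose primary shows sub-threshold LOH.
    """
    if has_hla_loh(*primary):
        return None  # LOH already present in the primary
    if primary_exclusion_p is not None and has_hla_loh(
            *primary, p_cutoff=primary_exclusion_p):
        return None  # borderline primary, excluded in the sensitivity analysis
    return has_hla_loh(*metastatic)
```

Relaxing the p-value cutoff only for the exclusion step makes the denominator stricter without changing how LOH is called in the metastatic sample.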

Fraction of genome with loss of heterozygosity (FGLOH) was inferred using the “facets-suite” utility from MSKCC53 (Extended Data Fig. 10).

Genetic Relatedness Criteria

In our main cohort, we required all samples for each patient to share at least one identical genetic alteration (mutation, deep deletion, or amplification), in addition to sharing tumor histology. However, to minimize the possibility that a patient with multiple synchronous primary tumors might seed distinct metastases that appear to be matched based only on hotspot mutations (unlikely, given that patients with known multiple primary tumors were excluded), we also repeated these analyses with substantially more stringent criteria for genetic relatedness, which required that all samples from a patient have a Jaccard similarity score of > 0.7 in CNA across all loci in the genome AND share at least two mutations; or in the case of tumors with low mutation count, share at least one mutation and one high-level copy number alteration. Our main mixed effects models and de novo emergence analyses were repeated with these new criteria and are included in Extended Data Figs. 2 and 3.
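One possible reading of the stringent relatedness criteria is sketched below for a pair of samples. The text does not specify how loci were encoded for the Jaccard score or what counts as a low mutation count, so the set-based encoding, the dictionary keys, and the `low_mutation_cutoff` parameter are all hypothetical. The sketch also assumes that the Jaccard requirement applies only to the standard rule, not to the low-mutation fallback.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    # Assumption: two samples with no altered loci are treated as identical.
    return len(a & b) / len(union) if union else 1.0


def stringently_related(s1, s2, jaccard_cutoff=0.7, low_mutation_cutoff=3):
    """Check the stringent relatedness criteria for one pair of samples.

    Each sample is a dict with set-valued keys 'mutations', 'cna_loci',
    and 'high_level_cna' (hypothetical layout).
    """
    shared_mutations = len(s1["mutations"] & s2["mutations"])
    shared_high_cna = len(s1["high_level_cna"] & s2["high_level_cna"])
    cna_similar = jaccard(s1["cna_loci"], s2["cna_loci"]) > jaccard_cutoff
    low_count = min(len(s1["mutations"]),
                    len(s2["mutations"])) < low_mutation_cutoff
    standard = cna_similar and shared_mutations >= 2
    fallback = low_count and shared_mutations >= 1 and shared_high_cna >= 1
    return standard or fallback
```

For a patient with more than two samples, the check would need to hold for every pair, since the criteria are stated for all samples from a patient.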

Driver Mutations

Common oncogenes and tumor suppressor genes (TSGs) were defined as the 20 most commonly mutated genes in each category in metastatic samples, as reported by Priestley et al.28. Six genes (FHIT, DMD, ZNF703, ZBTB10, PABPC1, and ZFP36L2) were not included in our analyses because they are not covered by the MSK-IMPACT panel. De novo analyses included mutations, deep deletions, and high-level amplifications.

Clonality

Mutations with a cancer cell fraction (CCF) > 80%, or with a CCF > 70% and an upper bound of the 95% confidence interval of the CCF > 90%, were considered clonal, and mutations below these thresholds were considered sub-clonal. Clonality was classified as indeterminate if the segment harboring the mutation locus was less than 60% of the purity54.
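The clonality rule above reduces to a small decision function. In this sketch CCF values are expressed as fractions between 0 and 1, the function name is hypothetical, and the indeterminate criterion is passed in as a precomputed flag because its exact computation is not spelled out here.

```python
def classify_clonality(ccf, ccf_upper_95, indeterminate=False):
    """Classify a mutation as clonal, subclonal, or indeterminate.

    ccf: point estimate of the cancer cell fraction (0 to 1)
    ccf_upper_95: upper bound of the 95% confidence interval of the CCF
    indeterminate: precomputed flag for the segment-based criterion
    """
    if indeterminate:
        return "indeterminate"
    if ccf > 0.80 or (ccf > 0.70 and ccf_upper_95 > 0.90):
        return "clonal"
    return "subclonal"
```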

Extended Data

Extended Data Fig. 1. Correlation between tumor mutational burden and fraction of genome altered.

Spearman’s rho of log2(TMB + 1) and FGA. TMB = tumor mutational burden. FGA = fraction of genome altered.

Extended Data Fig. 2. Mixed effects model results in original dataset and dataset with more stringent mutation matching criteria.

Comparison of forest plots in original dataset (a, c) to dataset with more stringent mutation matching criteria (b, d; see Methods).

Extended Data Fig. 3. De novo analyses results in original dataset and dataset with more stringent mutation matching criteria.

Comparison of de novo mutation analyses in original dataset (a, c) to dataset with more stringent mutation matching criteria (b, d; see Methods).

Extended Data Fig. 4. Mixed effects model results with different minimum purity requirements.

Estimated effect of metastatic sample on TMB (a) and FGA (b) from mixed effects models run with increasing minimum purity requirements. TMB = tumor mutational burden. FGA = fraction of genome altered.

Extended Data Fig. 5. Estimated effect of metastatic sample on tumor mutational burden and fraction of genome altered by cancer subtype.

(a-b) Subtype analysis of infiltrating ductal carcinoma, and the estimated effect of metastatic sample on TMB and FGA. HR = hormone receptor. (c-d) Subtype analysis of pancreatic cancer, and the estimated effect of metastatic sample on TMB and FGA. NET = neuroendocrine tumor. (e-f) Subtype analysis of non-small cell lung cancer, and the estimated effect of metastatic sample on TMB and FGA. The plots show the point estimate (dot), with the whiskers representing the 95% confidence interval. FGA = fraction of genome altered.

Extended Data Fig. 6. Tumor mutational burden and fraction of genome altered changes by quantile.

The ratio of Metastasis > Primary to Metastasis < Primary cases, alongside the mean difference in TMB (a) and FGA (b), for each quartile of primary tumor scores. Mixed effects model results for the patients in the lowest (first) quartile of TMB and FGA values for their first available sample (c, e) and in the highest (fourth) quartile of TMB and FGA values for their first available sample (d, f). TMB = tumor mutational burden. FGA = fraction of genome altered.

Extended Data Fig. 7. Association of tumor mutational burden and fraction of genome altered with overall survival, progression free survival, and immunotherapy response.

Forest plots depicting multivariable regression models examining associations with overall survival in Cox multivariable regression (a), progression-free survival in Cox multivariable regression (b), and tumor response to immunotherapy in logistic regression (c). Covariates in the models included tumor mutational burden (TMB), fraction of genome altered (FGA), cancer stage, and cancer type. Hazard or odds ratios with 95% confidence intervals and corresponding p-values are shown on the right.

Extended Data Fig. 8. De novo development of different genetic alterations between paired primary and metastatic sites by cancer type.

Percentage of patients with paired primary and metastatic samples demonstrating de novo mutations in antigen presenting machinery (APM) genes, including B2M, TAP1, TAP2, HLA-A, HLA-B, and HLA-C. All samples were filtered to ensure a sample purity of greater than 20%.

Extended Data Fig. 9. Clonality status between paired primary and metastatic samples.

(a) Scatterplot showing paired mutations that appeared in both a primary sample and a metastatic sample from the same patient. Cancer cell fraction (CCF) of the mutation in the primary sample was plotted against CCF of the mutation in the metastatic sample. (b) Alluvial plot showing fate of cancer cell fraction (CCF) of mutations among patients with both a primary and a metastatic site sampled. This plot tracks how the clonality status, determined by CCF, changes in the metastatic site. Ind = indeterminate.

Extended Data Fig. 10. Relationship between fraction of genome with loss of heterozygosity and HLA-I loss of heterozygosity status.

The difference in fraction of genome with loss of heterozygosity (FGLOH) between paired primary and metastatic samples (a) and paired initial and subsequent primary samples (b) in different subgroups based on de novo HLA-I LOH status.

Supplementary Material

Supplemental Tables

Supplementary Table 1: Patient and sample characteristics in the whole cohort and by sample type.

Supplementary Table 2: Mixed effects model results at different purity cut-offs. Red shading indicates that p-value is < 0.05.

Supplementary Table 3: Number of samples by cancer type and metastatic sample site. Reference for fig. 4.

Supplementary Table 4: Results of analysis of de novo whole genome duplication at different purity stringency cutoffs.

Supplementary Table 5: Results of analysis of de novo mutations and alterations at different purity stringency cutoffs.

Supplementary Table 6: Results of analysis of de novo HLA-I loss of heterozygosity at different LOHHLA significance thresholds.

Acknowledgments

We are grateful to our patients and their families for their bravery and support of cancer research. We thank members of the Morris Lab and the Center for Molecular Oncology at MSK for illuminating discussions. This study was supported by the Department of Defense Peer Reviewed Cancer Research Program and Rare Cancer Research Program, The Geoffrey Beene Cancer Research Center, Cycle for Survival: Team Fearless4Jen, The Jayme Flowers Fund, The Larry De Shon Fund, The Raquel and Riccardo Di Capua Fund (to LGTM), the Weill Cornell CTSC Grant 2UL1-TR-002384 (to KZ), the Area of Concentration Program at Weill Cornell Medical College (to KZ), and the NIH/NCI Cancer Center Support Grant P30 CA008748 (institutional, to MSKCC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests Statement

M.F.B. reports personal fees from AstraZeneca and Paige.AI; research support from Boundless Bio; and intellectual property rights with SOPHiA Genetics. T.A.C. acknowledges grant funding from Bristol Myers Squibb, AstraZeneca, Illumina, Pfizer, AN2H and Eisai, has served as an advisor for Bristol Myers Squibb, Illumina, Eisai and AN2H, holds equity in AN2H and is a cofounder of Gritstone Oncology and holds equity in the company. T.A.C. and L.G.T.M. are listed inventors on intellectual property held by Memorial Sloan Kettering Cancer Center, unrelated to this work. The remaining authors declare no competing interests.

Data Availability

A de-identified dataset—containing the clinical features and processed data that underlie the results reported in this article derived from MSK-IMPACT tumor sequencing—is available on Zenodo: https://zenodo.org/records/14538739

Code Availability

No custom code was generated for this article. Analyses described in the manuscript were conducted using the following freely available software: lme4 v1.1.35.1; lmerTest v3.1.3; ggplot2 v3.4.4; patchwork v1.2.0; stats v4.3.2; survival v3.5.7; and forester v0.3.0.

References

  • 1.Birkbak NJ & McGranahan N Cancer Genome Evolutionary Trajectories in Metastasis. Cancer Cell vol. 37 8–19 Preprint at 10.1016/j.ccell.2019.12.004 (2020).
  • 2.Rauwerdink DJW et al. Mixed Response to Immunotherapy in Patients with Metastatic Melanoma. Ann Surg Oncol 27, 3488–3497 (2020).
  • 3.Morinaga T et al. Mixed Response to Cancer Immunotherapy is Driven by Intratumor Heterogeneity and Differential Interlesion Immune Infiltration. Cancer Research Communications 2, 739–753 (2022).
  • 4.van Dijk E et al. Chromosomal copy number heterogeneity predicts survival rates across cancers. Nat Commun 12, (2021).
  • 5.Valero C et al. The association between tumor mutational burden and prognosis is dependent on treatment context. Nat Genet 53, 11–15 (2021).
  • 6.Liu L et al. Combination of TMB and CNA stratifies prognostic and predictive responses to immunotherapy across metastatic cancer. Clinical Cancer Research 25, 7413–7423 (2019).
  • 7.Davoli T, Uno H, Wooten EC & Elledge SJ Tumor aneuploidy correlates with markers of immune evasion and with reduced response to immunotherapy. Science 355, (2017).
  • 8.Spurr LF, Weichselbaum RR & Pitroda SP Tumor aneuploidy predicts survival following immunotherapy across multiple cancers. Nat Genet 54, 1782–1785 (2022).
  • 9.Rizvi NA et al. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer. Science 348, 124–128 (2015).
  • 10.Marcus L et al. FDA approval summary: Pembrolizumab for the treatment of tumor mutational burden-high solid tumors. Clinical Cancer Research 27, 4685–4689 (2021).
  • 11.Hieronymus H et al. Tumor copy number alteration burden is a pan-cancer prognostic factor associated with recurrence and death. (2018) doi: 10.7554/eLife.37294.001.
  • 12.Gupta GP & Massagué J Cancer Metastasis: Building a Framework. Cell vol. 127 679–695 Preprint at 10.1016/j.cell.2006.11.001 (2006).
  • 13.Zhang Y, Chen F & Creighton CJ Pan-cancer molecular subtypes of metastasis reveal distinct and evolving transcriptional programs. Cell Rep Med 4, (2023).
  • 14.van de Haar J et al. Limited evolution of the actionable metastatic cancer genome under therapeutic pressure. Nat Med 27, 1553–1563 (2021).
  • 15.Brastianos PK et al. Genomic characterization of brain metastases reveals branched evolution and potential therapeutic targets. Cancer Discov 5, 1164–1177 (2015).
  • 16.Turajlic S et al. Tracking Cancer Evolution Reveals Constrained Routes to Metastases: TRACERx Renal. Cell 173, 581–594.e12 (2018).
  • 17.Yates LR et al. Genomic Evolution of Breast Cancer Metastasis and Relapse. Cancer Cell 32, 169–184.e7 (2017).
  • 18.Cheng DT et al. Memorial sloan kettering-integrated mutation profiling of actionable cancer targets (MSK-IMPACT): A hybridization capture-based next-generation sequencing clinical assay for solid tumor molecular oncology. Journal of Molecular Diagnostics 17, 251–264 (2015).
  • 19.Sansregret L & Swanton C The role of aneuploidy in cancer evolution. Cold Spring Harbor Perspectives in Medicine vol. 7 Preprint at 10.1101/cshperspect.a028373 (2017).
  • 20.Andor N et al. Pan-cancer analysis of the extent and consequences of intratumor heterogeneity. Nat Med 22, 105–113 (2016).
  • 21.Mroz EA, Tward AM, Hammon RJ, Ren Y & Rocco JW Intra-tumor Genetic Heterogeneity and Mortality in Head and Neck Cancer: Analysis of Data from The Cancer Genome Atlas. PLoS Med 12, (2015).
  • 22.Bakhoum SF et al. Chromosomal instability drives metastasis through a cytosolic DNA response. Nature 553, 467–472 (2018).
  • 23.Gao C et al. Chromosome instability drives phenotypic switching to metastasis. Proc Natl Acad Sci U S A 113, 14793–14798 (2016).
  • 24.MacKenzie KJ et al. CGAS surveillance of micronuclei links genome instability to innate immunity. Nature 548, 461–465 (2017).
  • 25.Bielski CM et al. Genome doubling shapes the evolution and prognosis of advanced cancers. Nat Genet 50, 1189–1195 (2018).
  • 26.Robinson DR et al. Integrative clinical genomics of metastatic cancer. Nature 548, 297–303 (2017).
  • 27.Powell E, Piwnica-Worms D & Piwnica-Worms H Contribution of p53 to metastasis. Cancer Discovery vol. 4 405–414 Preprint at 10.1158/2159-8290.CD-13-0136 (2014).
  • 28.Priestley P et al. Pan-cancer whole-genome analyses of metastatic solid tumours. Nature 575, 210–216 (2019).
  • 29.Nguyen B et al. Genomic characterization of metastatic patterns from prospective clinical sequencing of 25,000 patients. Cell 185, 563–575.e11 (2022).
  • 30.Jamaspishvili T et al. Clinical implications of PTEN loss in prostate cancer. Nature Reviews Urology vol. 15 222–234 Preprint at 10.1038/nrurol.2018.9 (2018).
  • 31.van Dessel LF et al. The genomic landscape of metastatic castration-resistant prostate cancers reveals multiple distinct genotypes with potential clinical impact. Nat Commun 10, (2019).
  • 32.Toy W et al. ESR1 ligand-binding domain mutations in hormone-resistant breast cancer. Nat Genet 45, 1439–1445 (2013).
  • 33.McGranahan N et al. Allele-Specific HLA Loss and Immune Escape in Lung Cancer Evolution. Cell 171, 1259–1271.e11 (2017).
  • 34.Taylor AM et al. Genomic and Functional Approaches to Understanding Cancer Aneuploidy. Cancer Cell 33, 676–689.e3 (2018).
  • 35.Turajlic S & Swanton C Metastasis as an evolutionary process. Science vol. 352 167–169 Preprint at 10.1126/science.aaf6546 (2016).
  • 36.William W Jr et al. Immune evasion in HPV − head and neck precancer-cancer transition is driven by an aneuploid switch involving chromosome 9p loss. PNAS 118, (2021).
  • 37.López S et al. Whole Genome Doubling mitigates Muller’s Ratchet in Cancer Evolution. Preprint at 10.1101/513457 (2019).
  • 38.Alfieri F, Caravagna G & Schaefer MH Cancer genomes tolerate deleterious coding mutations through somatic copy number amplifications of wild-type regions. Nat Commun 14, (2023).
  • 39.Samstein RM et al. Tumor mutational load predicts survival after immunotherapy across multiple cancer types. Nature Genetics vol. 51 202–206 Preprint at 10.1038/s41588-018-0312-8 (2019).
  • 40.Shen R & Seshan VE FACETS: Allele-specific copy number and clonal heterogeneity analysis tool for high-throughput DNA sequencing. Nucleic Acids Res 44, (2016).
  • 41.Chowell D et al. Improved prediction of immune checkpoint blockade efficacy across multiple cancer types. Nat Biotechnol 40, 499–506 (2022).
  • 42.Bates D, Maechler M, Bolker B & Walker S Lme4: Linear Mixed-Effects Models Using Eigen and S4. Preprint at (2022).
  • 43.Benjamini Y & Hochberg Y Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple. Source: Journal of the Royal Statistical Society. Series B (Methodological) vol. 57 (1995).
  • 44.Therneau T A Package for Survival Analysis in R. Preprint at (2023).
  • 45.Boyes R Forester: An R package for creating publication-ready forest plots. Preprint at (2021).
  • 46.Kato S et al. Multicenter experience with large panel next-generation sequencing in patients with advanced solid cancers in Japan. Jpn J Clin Oncol 49, 174–182 (2019).
  • 47.Hong TH et al. Clinical advantage of targeted sequencing for unbiased tumor mutational burden estimation in samples with low tumor purity. J Immunother Cancer 8, (2020).
  • 48.Burdett NL et al. Timing of whole genome duplication is associated with tumor-specific MHC-II depletion in serous ovarian cancer. Nature Communications 15, (2024).
  • 49.Schoenfeld AJ et al. Clinical and molecular correlates of PD-L1 expression in patients with lung adenocarcinomas. Annals of Oncology 31, 599–608 (2020).
  • 50.Lozac’hmeur A et al. Detecting HLA loss of heterozygosity within a standard diagnostic sequencing workflow for prognostic and therapeutic opportunities. NPJ Precis Oncol 8, (2024).
  • 51.Shin HT et al. Prevalence and detection of low-allele-fraction variants in clinical cancer samples. Nat Commun 8, (2017).
  • 52.Fan X, Luo G & Huang YS Accucopy: accurate and fast inference of allele-specific copy number alterations from low-coverage low-purity tumor sequencing data. BMC Bioinformatics 22, (2021).
  • 53.Lin AL et al. Genome-wide loss of heterozygosity predicts aggressive, treatment-refractory behavior in pituitary neuroendocrine tumors. Acta Neuropathol 147, (2024).
  • 54.Clinton TN et al. Genomic heterogeneity as a barrier to precision oncology in urothelial cancer. Cell Rep 41, (2022).
