
This is a preprint.

It has not yet been peer reviewed by a journal.

bioRxiv [Preprint]. 2024 Nov 28:2024.11.27.624656. [Version 1] doi: 10.1101/2024.11.27.624656

SNMF: Integrated Learning of Mutational Signatures and Prediction of DNA Repair Deficiencies

Sander Goossens, Yasin I Tepeli, Colm Seale, Joana P Gonçalves
PMCID: PMC11623639  PMID: 39651280

Abstract

Motivation

Many tumours show deficiencies in DNA damage response (DDR), which influence tumorigenesis and progression but also expose vulnerabilities with therapeutic potential. Assessing which patients might benefit from DDR-targeting therapy requires knowledge of tumour DDR deficiency status, and mutational signatures are reportedly better predictors of this status than loss-of-function mutations in select genes. However, signatures are identified independently using unsupervised learning, and are therefore not optimised to distinguish between different pathway or gene deficiencies.

Results

We propose SNMF, a supervised non-negative matrix factorisation that jointly optimises the learning of signatures: (1) shared across samples, and (2) predictive of DDR deficiency. We applied SNMF to mutation profiles of human induced pluripotent cell lines carrying gene knockouts linked to three DDR pathways. The SNMF model achieved high accuracy (0.971) and learned more complete signatures of the DDR status of a sample, further discerning distinct mechanisms within a pathway. Cell line SNMF signatures recapitulated tumour-derived COSMIC signatures and predicted DDR pathway deficiency of TCGA tumours with high recall, suggesting that SNMF-like models can leverage libraries of induced DDR deficiencies to decipher intricate DDR signatures underlying patient tumours.
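The joint optimisation described above can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository under Availability for that); it only assumes one plausible formulation, in which a non-negative factorisation X ≈ WH is trained together with a softmax classifier on the exposures W. The loss form, the weight `lam`, the projected gradient descent optimiser, and all function and parameter names below are hypothetical choices made for illustration.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax, shifted for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def supervised_nmf_sketch(X, y, n_signatures, n_classes,
                          lam=1.0, lr=0.05, n_iter=1000, seed=0):
    """Illustrative supervised NMF (assumed formulation, not the paper's).

    Minimises (1/n) * ||X - W H||_F^2 + lam * mean cross-entropy of
    softmax(W B + b) against labels y, keeping W >= 0 and H >= 0
    by projecting onto the non-negative orthant after each step.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, n_signatures))   # exposures per sample
    H = rng.random((n_signatures, m))   # signatures shared across samples
    B = np.zeros((n_signatures, n_classes))  # classifier weights
    b = np.zeros(n_classes)                  # classifier bias
    Y = np.eye(n_classes)[y]                 # one-hot labels
    history = []
    for _ in range(n_iter):
        R = W @ H - X                   # reconstruction residual
        P = softmax(W @ B + b)          # predicted class probabilities
        recon = (R ** 2).sum() / n
        ce = -np.log(P[np.arange(n), y] + 1e-12).mean()
        history.append(recon + lam * ce)
        G = (P - Y) / n                 # gradient of mean CE w.r.t. logits
        grad_W = 2.0 * R @ H.T / n + lam * G @ B.T
        grad_H = 2.0 * W.T @ R / n
        grad_B = lam * W.T @ G
        grad_b = lam * G.sum(axis=0)
        # Projected gradient step: clip W and H at zero.
        W = np.maximum(W - lr * grad_W, 0.0)
        H = np.maximum(H - lr * grad_H, 0.0)
        B = B - lr * grad_B
        b = b - lr * grad_b
    return W, H, B, b, history

# Toy usage on synthetic data (not real mutation profiles).
rng = np.random.default_rng(1)
W_true = rng.random((40, 3))
H_true = rng.random((3, 12))
X_toy = W_true @ H_true
y_toy = W_true.argmax(axis=1)  # label = dominant synthetic signature
W, H, B, b, history = supervised_nmf_sketch(X_toy, y_toy, 3, 3)
```

In this sketch the classification term feeds back into the exposures W through `grad_W`, which is what distinguishes it from running unsupervised NMF first and fitting a classifier afterwards. Scoring unseen samples would additionally require estimating their exposures against the fixed signatures H, which is omitted here.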

Availability

https://github.com/joanagoncalveslab/SNMF

Full Text Availability

The license terms selected by the author(s) for this preprint version do not permit archiving in PMC. The full text is available from the preprint server.


Articles from bioRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints
