This is a preprint.

It has not yet been peer reviewed by a journal.

bioRxiv [Preprint]. 2024 Sep 24:2024.09.19.613933. [Version 1] doi: 10.1101/2024.09.19.613933

A Machine Learning-Based Investigation of Integrin Expression Patterns in Cancer and Metastasis

Hossain Shadman, Saghar Gomrok, Qianyi Cheng, Yu Jiang, Xiaohua Huang, Jesse D Ziebarth, Yongmei Wang
PMCID: PMC11463510  PMID: 39386595

Abstract

Background

Integrins, a family of transmembrane receptor proteins, play complex roles in cancer development and metastasis. These roles could be better delineated through machine learning of transcriptomic data to reveal relationships between integrin expression patterns and cancer.

Methods

We collected publicly available RNA-Seq integrin expression data from 8 healthy tissues and their corresponding tumors, along with data from metastatic breast cancer. We then used machine learning methods, including t-SNE visualization and Random Forest classification, to investigate changes in integrin expression patterns.
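As a rough illustration of this kind of workflow, the sketch below runs t-SNE and a Random Forest classifier on synthetic data using scikit-learn. It is not the authors' pipeline: the six-gene panel, the sample counts, the simulated expression values, the planted shift in one gene, and all hyperparameters are assumptions chosen only to make the example self-contained.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.manifold import TSNE
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical gene panel; a real analysis would cover all integrin subunit genes.
genes = ["ITGA1", "ITGA5", "ITGA7", "ITGB1", "ITGB3", "ITGB4"]
n_per_class = 60

# Synthetic log-scale expression values (not real data). Both groups share
# one distribution, except that one gene is shifted in the "cancer" group.
# The shift and its direction are made up purely for illustration.
healthy = rng.normal(5.0, 1.0, size=(n_per_class, len(genes)))
cancer = rng.normal(5.0, 1.0, size=(n_per_class, len(genes)))
cancer[:, genes.index("ITGA7")] -= 3.0

X = np.vstack([healthy, cancer])
y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = healthy, 1 = cancer

# Two-dimensional t-SNE embedding, intended for visual inspection only.
embedding = TSNE(n_components=2, perplexity=20, random_state=0).fit_transform(X)

# Random Forest classification of disease status with 5-fold cross-validation.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
cv_accuracy = cross_val_score(clf, X, y, cv=5).mean()

# Fit on all samples and rank genes by impurity-based feature importance.
clf.fit(X, y)
ranked = sorted(
    zip(genes, clf.feature_importances_), key=lambda pair: pair[1], reverse=True
)

print(f"mean CV accuracy: {cv_accuracy:.2f}")
for gene, importance in ranked:
    print(f"{gene}: {importance:.3f}")
```

The embedding can be passed to any scatter-plot routine and colored by `y`. The importance ranking here only recovers the signal planted in the synthetic data and says nothing about real integrin biology.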

Results

Integrin expression varied across tissues and cancers, and between healthy and cancer samples from the same tissue, enabling the creation of models that classify samples by tissue or disease status. The integrins whose expression was important to these classifiers were identified. For example, ITGA7 was key to classification of breast samples by disease status. Analysis in breast tissue revealed that cancer rewires co-expression for most integrins, but the co-expression relationships of some integrins remain unchanged between healthy and cancer samples. Integrin expression in primary breast tumors differed from that in their metastases, with liver metastasis notably having reduced expression.
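One simple way to look for changes in co-expression is to compute a gene-by-gene correlation matrix separately for each condition and then subtract the two. The sketch below does this with NumPy on synthetic data. The abstract does not say how co-expression was measured, so the use of Pearson correlation is an assumption, as are the gene panel, the simulated couplings, and the sample size.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical gene panel and sample count, for illustration only.
genes = ["ITGA5", "ITGB1", "ITGAV", "ITGB3"]
n_samples = 200


def simulate(coupled_pairs):
    """Return an (n_samples, n_genes) matrix of synthetic expression values.

    Each listed gene pair receives a shared latent factor, which makes the
    two genes positively correlated. All other genes are independent noise.
    """
    data = rng.normal(size=(n_samples, len(genes)))
    for gene_a, gene_b in coupled_pairs:
        shared = rng.normal(size=n_samples)
        data[:, genes.index(gene_a)] += 2.0 * shared
        data[:, genes.index(gene_b)] += 2.0 * shared
    return data


# Made-up scenario: one coupling is present in both conditions,
# the other is present only in the "healthy" samples.
healthy = simulate([("ITGA5", "ITGB1"), ("ITGAV", "ITGB3")])
cancer = simulate([("ITGAV", "ITGB3")])

# Pearson correlation between genes (columns) within each condition.
corr_healthy = np.corrcoef(healthy, rowvar=False)
corr_cancer = np.corrcoef(cancer, rowvar=False)

# Entries far from zero mark gene pairs whose co-expression changed.
delta = corr_cancer - corr_healthy

for i in range(len(genes)):
    for j in range(i + 1, len(genes)):
        print(f"{genes[i]}-{genes[j]}: change in r = {delta[i, j]:+.2f}")
```

In this toy setup the pair that keeps its coupling shows a change near zero, while the pair that loses it shows a large negative change. With real data, a permutation test or similar check would be needed before calling any difference meaningful.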

Conclusions

Integrin expression patterns vary widely across tissues and are greatly impacted by cancer. Machine learning of these patterns can effectively distinguish samples by tissue or disease status.

Full Text Availability

The license terms selected by the author(s) for this preprint version do not permit archiving in PMC. The full text is available from the preprint server.


Articles from bioRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints