This is a preprint.

It has not yet been peer reviewed by a journal.

bioRxiv
[Preprint]. 2023 Jul 12:2023.07.06.548037. [Version 2] doi: 10.1101/2023.07.06.548037

Using Published Pathway Figures in Enrichment Analysis and Machine Learning

Min-Gyoung Shin, Alexander Pico
PMCID: PMC10350053  PMID: 37461614

Abstract

Pathway Figure OCR (PFOCR) is a novel kind of pathway database approaching the breadth and depth of Gene Ontology while providing rich, mechanistic diagrams and direct literature support. PFOCR content is extracted from published pathway figures, which are currently emerging at a rate of 1,000 new pathways each month. Here, we compare the pathway information contained in PFOCR against popular pathway databases with respect to overall and disease-specific coverage. In addition to common pathway analysis use cases, we present two advanced case studies demonstrating unique advantages of PFOCR in cancer subtype and grade prediction analyses.

Full Text Availability

The license terms selected by the author(s) for this preprint version do not permit archiving in PMC. The full text is available from the preprint server.


Articles from bioRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints
