
This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

bioRxiv [Preprint]. 2024 Apr 15:2024.04.12.589302. [Version 1] doi: 10.1101/2024.04.12.589302

A novel method to guide biomarker combinations to optimize the sensitivity

Seyyed Mahmood Ghasem, Johannes F Fahrmann, Samir Hanash, Kim-Anh Do, James P Long, Ehsan Irajizad
PMCID: PMC11042214  PMID: 38659773

Abstract

Logistic regression is widely used to classify binary-labeled data by maximum likelihood. In many biological and clinical settings, however, the goal is to find coefficients that yield the highest sensitivity at a pre-specified specificity, or vice versa. Maximum likelihood does not target that objective directly, which limits the usefulness of logistic regression in such settings. To address this, we developed SMAGS, an improved regression framework for binary classification that, for a given specificity, finds the linear decision rule with maximum sensitivity. We also used the method for feature selection, to identify the features that best serve the sensitivity-maximization goal. We compared our method with standard logistic regression on real clinical data and on synthetic data. On the real data (a colorectal cancer dataset), we found a 14% improvement in sensitivity at 98.5% specificity.
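The objective described in the abstract can be illustrated with a minimal sketch. It is not the SMAGS implementation: the function names `sensitivity_at_specificity` and `random_search`, and the use of random search as the optimizer, are assumptions made here for illustration only. The sketch scores each sample with a linear rule, places the threshold so that the required share of negatives falls at or below it, and reports the share of positives above it.

```python
import math
import random


def sensitivity_at_specificity(beta, X, y, specificity):
    """Sensitivity of the rule `score > t` for the linear score beta . x.

    The threshold t is the smallest negative-class score such that at
    least `specificity` of the negatives score at or below it.
    """
    scores = [sum(b * x for b, x in zip(beta, row)) for row in X]
    neg = sorted(s for s, label in zip(scores, y) if label == 0)
    pos = [s for s, label in zip(scores, y) if label == 1]
    # Number of negatives that must be classified as negative.
    # Rounding up keeps the achieved specificity at or above the target.
    k = math.ceil(specificity * len(neg))
    threshold = neg[k - 1] if k > 0 else -math.inf
    return sum(s > threshold for s in pos) / len(pos)


def random_search(X, y, specificity, n_trials=2000, seed=0):
    """Hypothetical stand-in optimizer: try random coefficient vectors and
    keep the one with the highest sensitivity at the target specificity."""
    rng = random.Random(seed)
    n_features = len(X[0])
    best_beta, best_sens = None, -1.0
    for _ in range(n_trials):
        beta = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        sens = sensitivity_at_specificity(beta, X, y, specificity)
        if sens > best_sens:
            best_beta, best_sens = beta, sens
    return best_beta, best_sens


# Toy data (made up for illustration): three negatives, three positives.
X_toy = [[0, 0], [1, 0], [0, 1], [3, 3], [4, 2], [2, 4]]
y_toy = [0, 0, 0, 1, 1, 1]
best_beta, best_sens = random_search(X_toy, y_toy, specificity=1.0, n_trials=200)
```

Because the objective is a step function of the coefficients, gradient-based fitting does not apply to it directly; random search is used here only because it is simple to read, and a practical method would need a more careful optimizer.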

Availability and implementation

The software is available as a Python implementation (https://github.com/smahmoodghasemi/SMAGS).

Full Text Availability

The license terms selected by the author(s) for this preprint version do not permit archiving in PMC. The full text is available from the preprint server.


Articles from bioRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints
