This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

[Preprint]. 2020 Dec 4:2020.12.03.20243493. [Version 1] doi: 10.1101/2020.12.03.20243493

Supervised Image Classification Algorithm Using Representative Spatial Texture Features: Application to COVID-19 Diagnosis Using CT Images

Zehor Belkhatir, Raúl San José Estépar, Allen R Tannenbaum
PMCID: PMC7724681  PMID: 33300010

Abstract

Although texture has no universal definition, the concept is widely used in various forms and is a key element of visual perception for analyzing images in different fields. The main idea of the present work relies on the assumption that there exist representative samples, which we also call references, i.e., “good or bad” samples that represent the dataset investigated in a particular data analysis problem. These representative samples need to be accounted for when designing predictive models in order to improve their performance. In particular, based on a selected subset of texture gray-level co-occurrence matrices (GLCMs) from the training cohort, we propose new representative spatial texture features, which we incorporate into a supervised image classification pipeline. The pipeline relies on the support vector machine (SVM) algorithm along with Bayesian optimization and the Wasserstein metric from optimal mass transport (OMT) theory. The best “good and bad” GLCM references are selected for each classification label during the training phase of the SVM classifier using a Bayesian optimizer. We assume that the fitness of a sample is defined by its closeness (in the sense of the Wasserstein metric) to, and its high correlation (in the sense of Spearman’s rank correlation) with, other samples in the same class. The newly defined spatial texture features then consist of the Wasserstein distances between the optimally selected references and the remaining samples. We assessed the performance of the proposed classification pipeline in diagnosing coronavirus disease 2019 (COVID-19) from computed tomography (CT) images.
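
The feature construction described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a single pixel offset for the GLCM and, as a strong simplification, flattens each GLCM row by row into a one-dimensional histogram so that the Wasserstein-1 distance reduces to a sum of absolute differences between cumulative sums. The abstract does not state which ground metric or Wasserstein order is used. The function names `glcm`, `wasserstein_1d`, and `reference_features` are hypothetical, and the references are simply passed in instead of being selected by a Bayesian optimizer.

```python
from itertools import accumulate


def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset (dx, dy).

    `image` is a list of rows of integer gray levels in [0, levels).
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1.0
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[v / total for v in row] for row in counts]


def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two histograms on the same
    unit-spaced bins, computed as the sum of |CDF_p - CDF_q|."""
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def flatten(matrix):
    """Row-major flattening (assumption: imposes a 1-D ground metric)."""
    return [v for row in matrix for v in row]


def reference_features(sample_glcm, reference_glcms):
    """Feature vector: distance from one sample to each reference GLCM."""
    s = flatten(sample_glcm)
    return [wasserstein_1d(s, flatten(ref)) for ref in reference_glcms]


# Toy usage with two gray levels; the images are made up for illustration.
sample = glcm([[0, 0, 1], [1, 1, 0]], levels=2)
reference = glcm([[0, 0, 0], [0, 0, 1]], levels=2)
features = reference_features(sample, [reference])
```

In a pipeline of the kind the abstract outlines, one such feature vector per image would be passed to the SVM classifier. A two-dimensional optimal transport solver would respect the GLCM geometry better than the row-major flattening assumed here.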

Full Text Availability

The license terms selected by the author(s) for this preprint version do not permit archiving in PMC. The full text is available from the preprint server.


Articles from medRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints
