
This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

medRxiv [Preprint]. 2024 Oct 30:2024.10.25.24316081. [Version 1] doi: 10.1101/2024.10.25.24316081

Explainable machine-learning model to classify culprit calcified carotid plaque in embolic stroke of undetermined source

Yu Sakai, Jiehyun Kim, Huy Q Phi, Andrew C Hu, Pargol Balali, Konstanze V Guggenberger, John H Woo, Daniel Bos, Scott E Kasner, Brett L Cucchiara, Luca Saba, Zhi Huang, Daniel Haehn, Jae W Song
PMCID: PMC11581071  PMID: 39574846

Abstract

Background

Embolic stroke of undetermined source (ESUS) may be associated with carotid artery plaques with <50% stenosis. Plaque vulnerability is multifactorial and may be related to intraplaque hemorrhage (IPH), lipid-rich necrotic core (LRNC), perivascular adipose tissue (PVAT), and calcification morphology. Machine-learning (ML) approaches to plaque classification are increasingly popular, but their black-box nature often limits clinical interpretability. We apply an explainable ML approach that uses noncalcified plaque components and calcification features with the SHapley Additive exPlanations (SHAP) framework to classify calcified carotid plaques as culprit or non-culprit.

Methods

In this retrospective cross-sectional study, we analyzed patients with unilateral anterior circulation ESUS who underwent neck CT angiography and had calcific carotid plaque. Calcification-level features were derived from manual segmentations. Plaque-level features were assessed by a neuroradiologist blinded to the stroke side and by semi-automated software. Calcifications and plaques were classified as culprit if ipsilateral to the stroke side. Eight baseline ML models were compared. Three CatBoost models were trained: Plaque-level, Calcification-level, and Combined. SHAP was incorporated to explain model decisions.
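To make the idea behind SHAP concrete, the following is a minimal sketch of exact Shapley value attribution for a single sample. It is not the authors' pipeline and does not use the `shap` library or CatBoost: the scoring function `toy_score`, its coefficients, and the feature values are hypothetical, and "absent" features are simply replaced by a baseline value, which is one common convention assumed here.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample x relative to a baseline sample.

    Features outside a coalition are set to their baseline value.
    Cost grows exponentially with the number of features, so this is
    only practical for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_score(features):
    """Hypothetical scoring function, NOT the paper's CatBoost model."""
    thickness_mm, pvat_volume = features
    return 0.5 * thickness_mm + 0.1 * pvat_volume + 0.05 * thickness_mm * pvat_volume


# Hypothetical sample and baseline (illustrative values only)
x = [3.0, 2.0]
baseline = [1.0, 1.0]
phi = shapley_values(toy_score, x, baseline)
```

The attributions sum to the difference between the score for `x` and the score for `baseline`, which is the additivity property that lets each prediction be read as a sum of per-feature contributions.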

Results

Seventy patients yielded 116 calcific carotid plaques (60 ipsilateral to stroke) and 270 calcifications (146 ipsilateral). Seventeen plaque-level and 15 calcification-level features were extracted. The baseline CatBoost model outperformed the other models. The Combined model achieved a test AUC of 0.77 (95% CI: 0.59-0.92), an accuracy of 0.82 (95% CI: 0.71-0.91), and a mean cross-validation AUC of 0.78. The Plaque-level and Calcification-level models performed worse (AUC 0.41, 95% CI: 0.15-0.68, and AUC 0.60, 95% CI: 0.44-0.76, respectively). The Combined model used five features: plaque thickness, IPH/LRNC volume ratio, PVAT volume, calcification minimum density, and the ratio of total calcification volume to mean density. Plaque thickness was the most important feature based on SHAP values, with a potential threshold at >2.6 mm.
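The abstract does not state how the confidence intervals were obtained. As a sketch of one common approach, assumed here rather than taken from the paper, the code below computes AUC from pairwise comparisons and a percentile bootstrap interval. The labels and scores are synthetic, and the function names are illustrative.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for AUC."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class, so AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Synthetic labels (1 = culprit) and model scores, for illustration only
labels = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.5, 0.2, 0.1, 0.6, 0.65]
point = auc(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
```

With small test sets the bootstrap interval is wide, which is consistent with the broad intervals reported above for a cohort of this size.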

Conclusions

An ML model trained with noncalcified plaque and calcification features can classify culprit calcific carotid plaque with greater accuracy than models trained using only plaque-level or calcification-level features. A model that uses clinically interpretable features with the SHAP framework provides explanations for its decisions and allows identification of potential thresholds for high-risk features.

Full Text

The Full Text of this preprint is available as a PDF (772.4 KB). The Web version will be available soon.


Articles from medRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints
