Validation of a convolutional neural network algorithm for calcium score quantification using a multivendor dataset

Emanuele Muscogiuri; Marly van Assen; Giovanni Tessarin; Alexander Razavi; Chris Schwemmer; Max Schoebinger; Michael Wels; Saikiran Rapaka; George SK Fung; Arthur E Stillman; Carlo N De Cecco

doi:10.1016/j.jcct.2023.10.014

Author manuscript; available in PMC: 2024 Nov 7.

Published in final edited form as: J Cardiovasc Comput Tomogr. 2023 Nov 7;17(6):473–475. doi: 10.1016/j.jcct.2023.10.014

PMCID: PMC10908358; NIHMSID: NIHMS1967360; PMID: 37945453

Coronary artery calcium (CAC) is a measure of subclinical atherosclerosis burden and a strong predictor of atherosclerotic cardiovascular disease (ASCVD) events.¹ CAC quantification is recommended to improve risk stratification and guide treatment strategies when ASCVD risk is uncertain.² CAC analysis can be labor-intensive and time-consuming. Artificial intelligence (AI) is increasingly used in medical imaging, with convolutional neural networks (CNNs) among the most widely used approaches. The aim of this study was to validate a novel CNN algorithm for CAC measurement in a heterogeneous population with multivendor CT datasets.

We retrospectively included adult patients (>18 years old) from a single-center cohort spanning multiple locations who underwent cardiac CT (CCT) imaging for CAC scoring and had diagnostic image quality. Patients with prior cardiac surgery, metallic devices degrading image quality, or coronary anomalies were excluded. The study was performed in accordance with HIPAA regulations and approved by our ethics committee, which waived the need for informed consent.

Datasets were obtained by multiple CT scanners from different vendors: Siemens (SOMATOM Force/Flash/Definition), Philips (iCT256 Brilliance) and General Electric (LightSpeed VCT/Revolution HD, Discovery CT750 HD). CCT was performed according to the clinical practice with a prospective ECG-triggered protocol.

Quality assessment and CAC quantification were performed by a radiologist with 4 years of experience in CCT imaging (EM, hereafter R1) using commercially available software (CT CaScoring Syngo.via, version VB40, Siemens Healthineers, Forchheim, Germany). CAC, quantified with the Agatston score (AS), was reported per-patient and per-vessel. The left main (LM) and left anterior descending (LAD) arteries were grouped, while the left circumflex (LCx) and right coronary (RCA) arteries were analyzed individually. Analysis time was recorded for the algorithm and for R1 on a sample of 50 patients. Automated CAC measurement was obtained using a deep-learning (DL) based CNN algorithm that is not yet commercially available (AI-Heart, Siemens Healthineers, Forchheim, Germany) and is an updated version (VB70) of the one described by Winkel et al. (VB50).³

For risk classification, patients were divided according to AS thresholds into six classes (0, 1–10, 11–100, 101–400, 401–1000, >1000) and also dichotomously (0 vs >0; <100 vs ≥100).
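The binning described above can be sketched in a few lines. This is only an illustration, not the study's software: the function names are hypothetical, the bins are assumed to be non-overlapping (0, 1–10, 11–100, 101–400, 401–1000, >1000), and the placement of fractional scores between 0 and 1 in the "1-10" class is an assumption the text does not address.

```python
from bisect import bisect_left

# Upper bounds (inclusive) of the first five classes; anything above 1000
# falls into the sixth class. Bins are assumed to be non-overlapping.
UPPER_BOUNDS = [0, 10, 100, 400, 1000]
LABELS = ["0", "1-10", "11-100", "101-400", "401-1000", ">1000"]


def risk_class(agatston: float) -> str:
    """Map an Agatston score to its ordinal risk class label (hypothetical helper)."""
    if agatston < 0:
        raise ValueError("Agatston score cannot be negative")
    # bisect_left returns the index of the first bound >= score,
    # which is exactly the index of the class the score belongs to.
    return LABELS[bisect_left(UPPER_BOUNDS, agatston)]


def dichotomize(agatston: float) -> tuple[bool, bool]:
    """Return the two binary splits from the text: (score > 0, score >= 100)."""
    return agatston > 0, agatston >= 100


if __name__ == "__main__":
    for score in (0, 7, 100, 101, 400, 401, 1500):
        print(score, risk_class(score), dichotomize(score))
```

Keeping the bounds in a single sorted list makes the class edges easy to audit against the thresholds quoted in the text.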

CAC was compared per-patient and per-vessel, with R1 as the reference standard. Correlation and agreement between R1 and the algorithm were evaluated with Spearman’s rank correlation coefficient (ρ) and the intraclass correlation coefficient (ICC). A sub-analysis was performed for the three CT vendors. A Bland-Altman plot was used to display the bias and the 95 % limits of agreement between the CNN algorithm and R1. Agreement in risk classification was tested using weighted kappa (κ) and accuracy. Statistical analyses were conducted using SPSS version 27 (IBM, Armonk, New York).
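For readers unfamiliar with the Bland-Altman quantities, the following is a minimal sketch of how bias and limits of agreement are conventionally computed from paired measurements. It is not the SPSS procedure used in the study; the function name, the sample values and the 1.96 multiplier are assumptions made for illustration.

```python
from statistics import mean, stdev


def bland_altman(cnn_scores, reader_scores, z=1.96):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Differences are taken as CNN minus reader; z=1.96 is the conventional
    multiplier for 95 % limits of agreement (an assumption here).
    """
    if len(cnn_scores) != len(reader_scores) or len(cnn_scores) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [c - r for c, r in zip(cnn_scores, reader_scores)]
    bias = mean(diffs)   # mean difference
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - z * sd, bias + z * sd


if __name__ == "__main__":
    # Made-up Agatston scores, purely for demonstration.
    cnn = [0, 12, 105, 398, 1210]
    reader = [0, 10, 110, 405, 1200]
    bias, lower, upper = bland_altman(cnn, reader)
    print(f"bias={bias:.2f}, limits of agreement=[{lower:.2f}, {upper:.2f}]")
```

Each point in the plot is then the pair (mean of the two measurements, their difference), with horizontal lines drawn at the three returned values.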

The study population included 432 individuals: mean age 56 years, 243 males (53.6 %). CCT scans were approximately equally distributed among the three vendors (150, 142 and 140 datasets acquired with Siemens, Philips and GE scanners, respectively).

The agreement between the CNN algorithm and R1 was excellent per-patient (ρ = 0.963, ICC = 0.994, all p < 0.001), as well as per-vessel (ρ = 0.970, 0.890 and 0.932, ICC = 0.980, 0.952 and 0.999, for LM/LAD, LCx and RCA respectively, all p < 0.001). The Bland-Altman plot comparing the per-patient readings of the CNN algorithm and R1 showed upper and lower limits of agreement of 78.46 and −87.72 respectively, a mean difference of −4.63, and only a few outliers outside the limits of agreement (Fig. 1). The per-vendor analysis showed excellent per-patient agreement, with ρ = 0.968, 0.974 and 0.947, ICC = 0.984, 0.998 and 0.998, for Siemens, Philips and GE respectively (all p < 0.001). Scatter plots comparing the measurements of R1 and the CNN per patient and per vendor are provided in Fig. 2.
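As a consistency check on the reported Bland-Altman values, and assuming the conventional definition of the limits of agreement as the mean difference ± 1.96 standard deviations (the multiplier is not stated in the text), the midpoint of the two limits reproduces the reported mean difference, and their width implies the standard deviation of the differences. ULoA and LLoA denote the upper and lower limits of agreement:

```latex
\begin{align*}
\text{mean difference} &= \frac{\text{ULoA} + \text{LLoA}}{2}
  = \frac{78.46 + (-87.72)}{2} = -4.63, \\
\text{SD}_{\text{diff}} &\approx \frac{\text{ULoA} - \text{LLoA}}{2 \times 1.96}
  = \frac{166.18}{3.92} \approx 42.4 .
\end{align*}
```

The standard deviation of about 42.4 Agatston units is derived here under that assumption; it is not a value reported by the authors.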

Fig. 1. The x-axis shows the mean of the CNN and R1 measurements, and the y-axis shows the difference between them. The three red lines in the Bland-Altman plot represent, from top to bottom, the upper limit of agreement (78.46), the mean difference (−4.63), and the lower limit of agreement (−87.72). There are only a few outliers outside the limits of agreement (mostly for Agatston score values <1000). (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Fig. 2. The upper left panel compares the CNN and R1 CAC measurements per patient. The upper right, lower left and lower right panels show the same comparison for Siemens, Philips and General Electric (GE) scanners, respectively. CAC: coronary artery calcium score; CNN: convolutional neural network; GE: General Electric; R1: reader 1.

Analysis time was significantly shorter for the CNN algorithm (median 4 s [4–6 s]) than for R1 (37 s [34–38 s] for negative cases and 113 s [86–119 s] for positive cases; p < 0.001). Concerning risk classification, the agreement between the CNN and R1 was very good: κ = 0.859, overall accuracy = 90.2 % (all p < 0.001). The analysis based on the AS thresholds of 0 and 100 showed high agreement and accuracy: κ = 0.875 and 0.958, accuracy of 93.7 % and 98.6 % respectively (all p < 0.001).
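The risk-classification metrics reported above can be illustrated with a short sketch of Cohen's weighted kappa and plain accuracy for ordinal class indices. The text does not say whether linear or quadratic weights were used, so linear weights are an assumed default here; the function names and the example labels are hypothetical and unrelated to the study data.

```python
def weighted_kappa(rater_a, rater_b, n_classes, power=1):
    """Cohen's weighted kappa for ordinal labels 0..n_classes-1.

    power=1 gives linear disagreement weights, power=2 quadratic ones.
    Which weighting the study used is not stated; linear is an assumption.
    """
    n = len(rater_a)
    if n == 0 or n != len(rater_b) or n_classes < 2:
        raise ValueError("need equally long, non-empty series and >= 2 classes")
    # Observed joint proportions of (rater_a class, rater_b class).
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(n_classes)]
    col = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]
    # Weighted observed vs chance-expected disagreement.
    observed_dis = expected_dis = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (abs(i - j) / (n_classes - 1)) ** power
            observed_dis += weight * observed[i][j]
            expected_dis += weight * row[i] * col[j]
    if expected_dis == 0:
        raise ValueError("kappa is undefined when both raters use a single class")
    return 1.0 - observed_dis / expected_dis


def accuracy(rater_a, rater_b):
    """Fraction of cases in which both raters assign the same class."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


if __name__ == "__main__":
    # Made-up class indices (0..5) for eight hypothetical patients.
    reader = [0, 0, 1, 2, 3, 4, 5, 2]
    cnn = [0, 1, 1, 2, 3, 4, 5, 3]
    print(weighted_kappa(cnn, reader, n_classes=6), accuracy(cnn, reader))
```

Weighted kappa penalises a disagreement by how many classes apart the two assignments are, which suits ordered risk categories better than unweighted kappa.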

In this study, we validated a fully automated CNN algorithm for CAC measurement on a heterogeneous multi-vendor CT dataset. Although this algorithm (VB70) has been cleared for use in the USA (510(k) K221219), no publications on it are yet available to cite.

Compared with several similar studies on the same topic,³⁻⁵ the results obtained by our algorithm proved consistent and showed that implementing such a system in clinical routine could be beneficial, yielding fast and reliable CAC measurement. Moreover, the consistency on a multi-vendor dataset supports wide applicability to different clinical scenarios.

Some limitations are worth discussing. The CNN algorithm has only been compared with manual readings made with semi-automatic clinical software from the same vendor. In addition, the inclusion of data from both high-end and older scanners implies a wide variety of image quality, possibly affecting the analysis. Follow-up studies are required to evaluate the performance on low-quality images and the added value of the algorithm in these cases.

In conclusion, the fully automated CNN algorithm for CAC quantification showed excellent agreement with expert reader analysis in a heterogeneous multivendor CT dataset with significantly reduced image analysis time.

Abbreviations

AI: Artificial Intelligence
AS: Agatston Score
ASCVD: Atherosclerotic Cardiovascular Disease
CAC: Coronary Artery Calcium Score
CCT: Cardiac Computed Tomography
CNN: Convolutional Neural Network
LAD: Left Anterior Descending Artery
LCx: Left Circumflex Artery
LM: Left Main Coronary Artery
RCA: Right Coronary Artery

Footnotes

Declaration of competing interest

Dr. De Cecco, Dr. van Assen and Emory receive research funding from Siemens Healthineers. Drs. Schwemmer, Wels, Rapaka and Fung are employed by Siemens Healthineers. The remaining authors do not have anything to declare regarding funding and possible conflict of interest.

Data generated or analyzed during the study are available from the corresponding author by request.

Contributor Information

Emanuele Muscogiuri, Department of Radiology and Imaging Sciences, Emory University Hospital | Emory Healthcare, Inc., Atlanta, GA, USA; Thoracic Imaging Division, Department of Radiology, University Hospitals Leuven, Leuven, Belgium.

Marly van Assen, Department of Radiology and Imaging Sciences, Emory University Hospital | Emory Healthcare, Inc., Atlanta, GA, USA.

Giovanni Tessarin, Department of Radiology and Imaging Sciences, Emory University Hospital | Emory Healthcare, Inc., Atlanta, GA, USA; Department of Medicine-DIMED, Institute of Radiology, University of Padova, Padua, Italy; Department of Radiology, Ca’ Foncello General Hospital, Treviso, Italy.

Alexander Razavi, Department of Cardiology, Emory University Hospital | Emory Healthcare, Inc., Atlanta, GA, USA.

Chris Schwemmer, Computed Tomography, Siemens Healthineers, Forchheim, Germany.

Max Schoebinger, Computed Tomography, Siemens Healthineers, Forchheim, Germany.

Michael Wels, Computed Tomography, Siemens Healthineers, Forchheim, Germany.

Saikiran Rapaka, Siemens Healthineers, Princeton, NJ, USA.

George S.K. Fung, Siemens Healthineers, Malvern, PA, USA.

Arthur E. Stillman, Department of Radiology and Imaging Sciences, Emory University Hospital | Emory Healthcare, Inc., Atlanta, GA, USA.

Carlo N. De Cecco, Department of Radiology and Imaging Sciences, Emory University Hospital | Emory Healthcare, Inc., Atlanta, GA, USA.

References

1. Greenland P, Blaha MJ, Budoff MJ, Erbel R, Watson KE. Coronary calcium score and cardiovascular risk. J Am Coll Cardiol 2018;72:434–447.
2. Knuuti J, Wijns W, Saraste A, et al. 2019 ESC Guidelines for the diagnosis and management of chronic coronary syndromes. Eur Heart J 2020;41:407–477.
3. Winkel DJ, Suryanarayana VR, Ali AM, et al. Deep learning for vessel-specific coronary artery calcium scoring: validation on a multi-centre dataset. European Heart Journal - Cardiovascular Imaging 2021;23:846–854.
4. Eng D, Chute C, Khandwala N, et al. Automated coronary calcium scoring using deep learning with multicenter external validation. npj Digital Medicine 2021;4.
5. Martin SS, Van Assen M, Rapaka S, et al. Evaluation of a deep learning–based automated CT coronary artery calcium scoring algorithm. JACC: Cardiovascular Imaging 2020;13:524–526.

Fig. 1. Bland-Altman plot representing the comparison between the CNN and R1 regarding coronary artery calcium score classification per patient.

Fig. 2. Scatter plots of coronary artery calcium score (CAC) measurements performed by the CNN and R1.
