Abstract
Purpose
To develop an artificial intelligence–based model to detect mitral regurgitation on chest radiographs.
Materials and Methods
This retrospective study included echocardiographs and associated chest radiographs consecutively collected at a single institution between July 2016 and May 2019. Associated radiographs were those obtained within 30 days of echocardiography. These radiographs were labeled as positive or negative for mitral regurgitation on the basis of the echocardiographic reports and were divided into training, validation, and test datasets. An artificial intelligence model was developed by using the training dataset and was tuned by using the validation dataset. To evaluate the model, the area under the curve, sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were assessed by using the test dataset.
Results
This study included a total of 10 367 images from 5270 patients. The training dataset included 8240 images (4216 patients), the validation dataset included 1073 images (527 patients), and the test dataset included 1054 images (527 patients). The area under the curve, sensitivity, specificity, accuracy, positive predictive value, and negative predictive value in the test dataset were 0.80 (95% CI: 0.77, 0.82), 71% (95% CI: 67, 75), 74% (95% CI: 70, 77), 73% (95% CI: 70, 75), 68% (95% CI: 64, 72), and 77% (95% CI: 73, 80), respectively.
Conclusion
The developed deep learning–based artificial intelligence model may differentiate patients with and without mitral regurgitation by using chest radiographs.
Keywords: Computer-aided Diagnosis (CAD), Cardiac, Heart, Valves, Supervised Learning, Convolutional Neural Network (CNN), Deep Learning Algorithms, Machine Learning Algorithms
Supplemental material is available for this article.
© RSNA, 2022
Summary
An artificial intelligence–based model that can identify signs of mitral regurgitation on chest radiographs and that may thus serve as a rapid and cost-effective diagnostic aid was developed.
Key Points
■ The model detected mitral regurgitation, with an area under the receiver operating characteristic curve of 0.80 in both the validation and test datasets.
■ Visualization of model focus showed that the left atrium was the primary region of interest, extending to the left ventricle and superior vena cava as mitral regurgitation severity increased.
Introduction
Mitral regurgitation is a condition in which blood flows backward through the mitral valve into the left atrium during contraction of the left ventricle. Mitral regurgitation has the highest prevalence of valve diseases, followed by aortic stenosis (1,2). The severity of mitral regurgitation is associated with the prognosis (3–5). The incidence of cardiovascular events within 5 years, including death, is high for patients with moderate or severe mitral regurgitation and a left ventricular ejection fraction (LVEF) below 50% (6). In the case of chronic mitral regurgitation, the 5-year mortality rate from heart disease varies widely with severity: It is 36% for patients with an effective regurgitant orifice area of 40 mm² or greater but 3% for those with an orifice area of less than 20 mm² (7).
Guidelines for patients with suspected mitral regurgitation recommend determining the diagnosis by performing transthoracic echocardiography, which should be followed by regular monitoring with echocardiography (4) at appropriate intervals to prevent irreversible damage to the ventricles and pulmonary circulation, which can occur without symptoms. Although diagnosing mitral regurgitation by using interviews or auscultation is noninvasive, neither method is objective, and the accuracy of the examiners varies widely (8–10). Echocardiography also requires skill and time to perform, and some patients may not be able to lie still on their backs for extended periods. Thus, an additional objective method to identify mitral regurgitation when echocardiography is not available would be a supportive tool for physicians. Radiographic findings of chronic mitral regurgitation (11,12) may include left atrial hypertrophy due to volume loading in mild mitral regurgitation (13,14). As the disease progresses, left ventricular hypertrophy appears. In severe cases, when pulmonary edema occurs, superior vena cava dilatation due to pulmonary vein hypertension is observed (15). However, the diagnostic accuracy of these radiographic findings of mitral regurgitation has not previously been well established in the literature, and chest radiography is thus rarely used as a diagnostic tool for valvular disease.
Deep learning (16,17) can extract features from target data and is therefore particularly helpful for the classification and quantification of objects with complicated and even unknown features. We created a model to diagnose mitral regurgitation from chest radiographs by using deep learning, and we visualized the extracted features by using heat maps.
Materials and Methods
Study Design
We built a deep learning–based model that classifies mitral regurgitation by using digital chest radiographs. Chest radiographs were retrospectively collected from patients who had undergone echocardiography at our secondary care institution to determine mitral regurgitation status. We then applied a class activation heat map to help visualize the region of interest on the chest radiographs. The ethics board of our institution reviewed and approved the protocol of the study. The need for informed consent was waived because the images had been acquired during daily clinical practice from patients who consented to the comprehensive research use of their data. Patients were guaranteed the opportunity to opt out of the study.
Study Patients, Examination, and Image Acquisition
Comprehensive two-dimensional echocardiographic examinations from both inpatients and outpatients were consecutively collected from July 2016 through May 2019. Echocardiography to evaluate mitral regurgitation was performed by using an iE33 (Philips Medical Systems); Vivid E9 (GE Healthcare); or Aplio 500, Aplio 80, or Artida (Canon Medical Systems) system with a high-frequency transducer.
Chest radiographs were retrospectively collected if they were obtained in a posteroanterior view in the standing position within 30 days of an eligible echocardiographic examination, as described above. These radiographs were obtained by using a DR CALNEO C 1417 Wireless SQ (Fujifilm Medical Systems), AeroDR1717 (Konica Minolta), or DigitalDiagnost VR (Philips Medical Systems) system. If one patient had multiple eligible chest radiographs, then all were collected.
Ground Truth Labeling
Chest radiographs were labeled according to the echocardiographic reports. Those with echocardiographically reported mitral regurgitation ranging from mild to severe, according to the American Society of Echocardiography recommendations (5), were defined as being positive for mitral regurgitation, and all others were defined as being negative for mitral regurgitation.
For moderate to severe mitral regurgitation, we extracted the mechanism (18) (degenerative or functional) from the echocardiographic reports. Finally, we extracted the LVEF (19) and whether the patient had atrial fibrillation at the time of echocardiography from the electronic health records of all patients.
Data Partition
All labeled chest radiographs were randomly divided into training, validation, and test datasets at a ratio of 8:1:1. Partition on a patient basis ensured there was no overlap of images or patients among the respective datasets. The definition of each dataset is provided in Appendix E1A (supplement).
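The patient-level partition described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_by_patient`, the fixed random seed, and the toy identifiers are assumptions introduced here. The sketch shuffles patients, not images, so that every radiograph from one patient lands in the same dataset.

```python
import random


def split_by_patient(image_to_patient, ratios=(8, 1, 1), seed=0):
    """Split images into train/validation/test so that no patient spans two sets."""
    # Shuffle the unique patients (not the images) with a reproducible seed.
    patients = sorted(set(image_to_patient.values()))
    random.Random(seed).shuffle(patients)

    # Integer arithmetic on the patient count; the remainder goes to the test set.
    total = sum(ratios)
    n_train = len(patients) * ratios[0] // total
    n_val = len(patients) * ratios[1] // total
    groups = {
        "train": patients[:n_train],
        "validation": patients[n_train:n_train + n_val],
        "test": patients[n_train + n_val:],
    }

    # Every image follows its patient into exactly one partition.
    patient_to_partition = {
        patient: name for name, members in groups.items() for patient in members
    }
    partitions = {name: [] for name in groups}
    for image_id, patient_id in image_to_patient.items():
        partitions[patient_to_partition[patient_id]].append(image_id)
    return partitions, groups


# Hypothetical toy data: 20 patients with two radiographs each.
toy = {f"img_{p}_{k}": f"patient_{p}" for p in range(20) for k in range(2)}
partitions, groups = split_by_patient(toy)
```

With 5270 patients, the same integer arithmetic yields 4216, 527, and 527 patients, which agrees with the dataset sizes reported in this study.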
Image Processing
All chest radiographs in each dataset were resized to 320 × 320–pixel portable network graphics files. First, the longer side was downscaled to 320 pixels while maintaining the aspect ratio. Second, the shorter side of the radiographs was padded with black pixels to 320 pixels.
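The geometry of this two-step resizing can be illustrated with a short sketch. The function name `letterbox_geometry` and the example detector dimensions are hypothetical, and centering the image within the padded canvas is an assumption; the text specifies only that the shorter side was padded with black.

```python
def letterbox_geometry(width, height, target=320):
    """Return the resized image size and the black-padding margins (in pixels)."""
    # Scale so that the longer side becomes exactly `target` pixels.
    scale = target / max(width, height)
    new_w = min(target, max(1, round(width * scale)))
    new_h = min(target, max(1, round(height * scale)))

    # Pad the shorter side up to `target`; centering is an assumption.
    pad_w = target - new_w
    pad_h = target - new_h
    left, top = pad_w // 2, pad_h // 2
    return {
        "size": (new_w, new_h),
        "padding": (left, top, pad_w - left, pad_h - top),  # left, top, right, bottom
    }


# Hypothetical source image of 2800 x 3400 pixels (width x height).
geometry = letterbox_geometry(2800, 3400)
```

For this hypothetical input, the image is scaled to 264 × 320 pixels and 28 black columns are added on each side to complete the 320 × 320 canvas.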
Model Development
The model was developed by using the ResNet50 model (20) in the TensorFlow framework (21). The deep learning–based model was trained from scratch with the training dataset and tuned with the validation dataset. The model determines which features distinguish between mitral regurgitation–positive and mitral regurgitation–negative images during training. All images were augmented by using random rotation, random shifts, and brightness shifts and were horizontally flipped. The model with the smallest value for the loss function (within 100 epochs) in the validation dataset was adopted as the best-performing model. Our work is open source, and the trained model is available under the Apache 2.0 license (22). Detailed processes for development of the deep learning model are shown in Appendix E1B (supplement), an outline of the model is shown in Figure E1 (supplement), and the source code is available online (22). Appendix E3A (supplement) shows this model developed with another architecture. The machine environment is provided in Appendix E1C (supplement).
Model Test
With use of the best-performing model and the same thresholds as those for the validation dataset, the diagnostic performance of the model was assessed on both the validation and test datasets.
Visualizing Regions of Interest for the Trained Model by Using Heat Maps
A heat map was generated for each chest radiograph to visualize the focus of the deep learning model as it classified radiographs by the presence or absence of mitral regurgitation. Gradient-weighted class activation mapping (Grad-CAM) was applied to create a class-discriminative visualization of the chest radiograph (23). A detailed explanation of the heat map generation model is shown in Figure E2 (supplement), and the source code is available online (22).
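For reference, the heat map computation can be written compactly, assuming that the visualization follows the Grad-CAM formulation of reference 23. Here, $A^{k}$ is the $k$th feature map of the last convolutional layer, $y^{c}$ is the model score for class $c$ (mitral regurgitation positive) before the softmax, and $Z$ is the number of spatial positions in the feature map. Which layer the authors used is not stated in this section.

```latex
\alpha_{k}^{c} = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^{c}}{\partial A_{ij}^{k}},
\qquad
L_{\text{Grad-CAM}}^{c} = \operatorname{ReLU}\!\left( \sum_{k} \alpha_{k}^{c} A^{k} \right)
```

The weights $\alpha_{k}^{c}$ average the gradients over each feature map, and the rectification keeps only the regions that contribute positively to the class score. The resulting coarse map is then upsampled to the input resolution and overlaid on the radiograph.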
After applying the visualization model to the test dataset, two board-certified general radiologists with 8 and 10 years of experience evaluated the anatomic regions indicated by the heat maps for all mitral regurgitation–positive chest radiographs. The radiologists were blinded to the model predictions but were aware that these patients had been clinically diagnosed with mitral regurgitation. Because radiographic findings of mitral regurgitation include left atrial enlargement, left ventricular enlargement, superior vena cava enlargement, and hilar shadow enhancement, the location of the heat map distribution in these regions was visually evaluated. When there was disagreement between radiologists, consensus was achieved by discussion.
Statistical Analysis
To evaluate the model, sensitivity, specificity, accuracy, positive predictive value, negative predictive value, the receiver operating characteristic curve, and the area under the curve were assessed. Sensitivity was assessed on the basis of the severity and mechanism (degenerative or functional) of mitral regurgitation. Sensitivity and specificity were assessed on the basis of the grade of LVEF and whether or not atrial fibrillation was present. All analyses were performed using R statistical software (version 3.6.0, R Project for Statistical Computing) (24). These statistical inferences were performed with a two-sided significance level of 5%.
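The threshold-dependent metrics listed above follow directly from the four cells of a confusion matrix. The sketch below uses standard definitions; the function name `diagnostic_metrics` and the counts are hypothetical and are not taken from this study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # positives correctly detected
        "specificity": tn / (tn + fp),            # negatives correctly excluded
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=80, fp=30, tn=70, fn=20)
```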
Results
Datasets
In total, 10 367 radiographs from 7555 echocardiograms in 5270 patients were included in this study. Of the originally collected 14 544 echocardiographic examinations, those from patients with a postoperative mitral valve (n = 1254) and those with no corresponding chest radiograph (n = 5735) were excluded. The training dataset included 8240 images (4216 patients; mean age ± SD, 67 years ± 15; 56% men). The validation dataset included 1073 images (527 patients; mean age ± SD, 69 years ± 15; 55% men). The test dataset included 1054 images (527 patients; mean age ± SD, 67 years ± 15; 56% men). Figure E3 (supplement) shows a flowchart of the dataset criteria, and detailed demographics for the datasets are shown in Table 1. The radiographic and echocardiographic equipment are described in Appendix E2A and Table E1 (supplement).
Table 1: Dataset Demographics
Model Development
The model was developed by using the training dataset applied for 100 training epochs. During this period, the lowest total loss value occurred at 66 epochs, when the value was 0.41 in the validation dataset; therefore, the parameters present at that time were adopted as the best-performing model. The learning curves are shown in Figure E4 (supplement).
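The selection rule described above, keeping the parameters from the epoch with the lowest validation loss, can be sketched as follows. The function name `best_epoch` and the loss values are hypothetical; in practice the losses would come from the training history.

```python
def best_epoch(validation_losses):
    """Return the 1-based epoch with the smallest validation loss, and that loss."""
    # min() returns the first minimum, so ties resolve to the earliest epoch.
    index = min(range(len(validation_losses)), key=validation_losses.__getitem__)
    return index + 1, validation_losses[index]


# Hypothetical validation losses, one per epoch.
losses = [0.69, 0.58, 0.47, 0.44, 0.41, 0.43, 0.46]
epoch, loss = best_epoch(losses)
```

In this toy history, the loss rises again after the fifth epoch, so the fifth checkpoint would be kept even though training continued.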
Model Evaluation
The model had an area under the curve, sensitivity, specificity, accuracy, positive predictive value, and negative predictive value of 0.80 (95% CI: 0.77, 0.82), 73% (95% CI: 69, 77), 72% (95% CI: 68, 75), 72% (95% CI: 69, 75), 65% (95% CI: 61, 69), and 78% (95% CI: 75, 81), respectively, in the validation dataset; in the test dataset, these values were 0.80 (95% CI: 0.77, 0.82), 71% (95% CI: 67, 75), 74% (95% CI: 70, 77), 73% (95% CI: 70, 75), 68% (95% CI: 64, 72), and 77% (95% CI: 73, 80), respectively. The area under the curve of 0.80 was set as our benchmark because this is considered to be acceptable to excellent performance. The sensitivity for determining degenerative and functional mitral regurgitation was 84% (95% CI: 73, 93) and 89% (95% CI: 80, 96), respectively, in the test dataset. Detailed results and receiver operating characteristic curves are shown in Table 2 and Figure 1, respectively. The results of the model metrics according to the equipment used for chest radiography and echocardiography are shown in Appendix E2B and Table E2 (supplement). The predictive values for the deep learning model across a full prevalence range for the detection of mitral regurgitation are provided in Appendix E2C and Figure E5 (supplement). Model results after ensuring only one radiograph per patient are also available (Appendix E2D, Fig E6, and Table E3 [supplement]). The learning curve from modeling with another architecture is shown in Figure E7 (supplement), and the receiver operating characteristic curve is shown in Figure E8 (supplement). These results are also summarized in Appendix E3B and Table E5 (supplement), and Table E7 (supplement) shows results according to age groups. Appendix E3D, Figure E9, and Table E6 (supplement) show results of the models prepared with other severity cutoffs.
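The area under the receiver operating characteristic curve has a threshold-free interpretation: it is the probability that a randomly chosen positive radiograph receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the function name `roc_auc`, the labels, and the scores are hypothetical, and the quadratic loop is chosen for clarity, not speed.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = mitral regurgitation) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

In this toy example, eight of the nine positive-negative pairs are ranked correctly, which gives an area of 8/9.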
Table 2: Model Performance Results
Figure 1: Receiver operating characteristic curves of the model for the validation and test datasets. The deep learning model discriminated chest radiographs showing mitral regurgitation from those that did not show mitral regurgitation, with areas under the receiver operating characteristic curve of 0.80 (95% CI: 0.77, 0.82) in the validation dataset (yellow line) and 0.80 (95% CI: 0.77, 0.82) in the test dataset (red line). The shaded areas show 95% CIs.
Heat Map Visualization
A total of 327 true-positive results in the test dataset were visualized. The distribution of the heat map was strongest in most radiographs over the left atrium (326 of 327) and hilum (322 of 327). As mitral regurgitation severity progressed, the heat map expanded from accenting only the left atrium and hilum to also accenting the left ventricle and superior vena cava. A heat map highlighting the left ventricle was observed in 22% (51 of 232) of radiographs for mild mitral regurgitation and 73% (16 of 22) of radiographs for severe mitral regurgitation, whereas a heat map highlighting the superior vena cava was observed in 20% (47 of 232) of mild mitral regurgitation cases and 68% (15 of 22) of severe mitral regurgitation cases. Heat maps from patients without mitral regurgitation were not focused on a specific region. Heat maps of false-positive results were focused on the left atrium, showing regions of interest similar to those for the true-positive results. Detailed results are shown in Table 3, heat map interpretation by each reader-radiologist is shown in Table E4 (supplement), and representative heat maps for each severity level of mitral regurgitation are shown in Figure 2.
Table 3: Heat Map Distribution by Severity
Figure 2: Representative heat maps by severity of mitral regurgitation. These chest radiographs were correctly diagnosed by the model, and the heat maps show features on which the model focused when making the determination of mitral regurgitation. Dark blue represents very low effect on model output, and dark red represents a very high effect on model output. (A) An 80-year-old man with mild mitral regurgitation. The left atrium was identified as a hot spot on the radiograph. (B) A 74-year-old man with moderate mitral regurgitation. The left atrium and hilar regions were identified as hot spots on the radiograph. (C) A 95-year-old woman with severe mitral regurgitation. The left atrium, left ventricle, and hilar regions were identified as hot spots on the radiograph.
Discussion
We developed a deep learning–based model for diagnosing mitral regurgitation from chest radiographs. The area under the curve, sensitivity, and specificity of our model were 0.80 (95% CI: 0.77, 0.82), 71% (95% CI: 67, 75), and 74% (95% CI: 70, 77), respectively, in the independent test dataset. Visualization analysis showed that the model focused on the left atrium and hilum, regardless of severity. In more severe cases, the model focused on the left ventricle and superior vena cava as well. These results indicate that chest radiographs may have intrinsic features that can help with diagnosing mitral regurgitation.
To our knowledge, this is the first study to create and validate a diagnostic model for mitral regurgitation from chest radiographs. We used a visualization technique (23) to confirm that the model focused on features of the radiographs that agreed with expected changes in cardiac morphologic characteristics as mitral regurgitation progressed. On true-positive chest radiographs, heat maps focused on the left atrium for all severities (326 of 327). As the severity increased, heat maps included the left ventricle and superior vena cava. These visual findings of the model were consistent with reported findings (11–15); however, it is difficult for radiologists to diagnose mitral regurgitation by using these findings exclusively. When this model is used as an objective test, chest radiographs may help in the diagnosis of mitral regurgitation.
In this study, model performance also correlated with the expected magnitude of the morphologic changes of the heart. The sensitivity of the model increased as mitral regurgitation became more severe. For moderate to severe mitral regurgitation, the sensitivity was 85% or over in both the validation and the test datasets, indicating high performance. These findings suggest that as mitral regurgitation progresses, the changes in heart morphologic characteristics become more pronounced because of volume overload. Our results by mechanism (18) of mitral regurgitation, LVEF (19), and the presence of atrial fibrillation follow this trend. For the mechanism specifically, the model was more sensitive to functional mitral regurgitation, which may have more morphologic abnormalities than degenerative mitral regurgitation. In the presence of either atrial fibrillation or low LVEF, sensitivity increased while specificity decreased, which may be due to the association of these conditions with cardiac enlargement. This decrease in model specificity may have increased the likelihood of false-positive output.
Our institution is a secondary care hospital with a high prevalence of mitral regurgitation. Although it is advantageous from a modeling perspective to have a larger mitral regurgitation dataset, we need to be mindful that this prevalence is different from that of the target population. For example, in a cohort that is not enriched with patients with mitral regurgitation (25), it may be beneficial to use our model for patients with symptoms of mitral regurgitation. In this case, the model should be used in combination with auscultation and interviewing for arterial hypertension, dyslipidemia, diabetes, smoking, and a history of alcohol use to increase the positive predictive value. Our model could also potentially benefit patients who are not able to lie still on their backs for echocardiography, as chest radiographs can be performed in a matter of seconds while the patient is standing. Furthermore, this model may be useful in facilities where echocardiography is not available because of the unavailability of equipment or skilled technicians. Additionally, we can raise the threshold of the model to find only moderate to severe mitral regurgitation, but we must consider radiation exposure to the patient and cost if this method is considered for screening.
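The dependence of predictive values on prevalence can be made concrete with Bayes' rule. The sketch below uses the test-dataset sensitivity (71%) and specificity (74%) reported in this study; the function name `predictive_values` and the prevalence values are hypothetical and are chosen only to span low-prevalence and enriched settings.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given prevalence by applying Bayes' rule."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Sensitivity and specificity from the test dataset; prevalences are hypothetical.
SENSITIVITY, SPECIFICITY = 0.71, 0.74
table = {
    prevalence: predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    for prevalence in (0.01, 0.05, 0.20, 0.50)
}

for prevalence, (ppv, npv) in table.items():
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

At a fixed operating point, the positive predictive value falls sharply as prevalence decreases while the negative predictive value rises, which is why pretest selection by symptoms or auscultation matters in a cohort that is not enriched.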
This study had several limitations. First, there exists inherent bias due to the retrospective nature of this study. Second, the data we collected were obtained from scanners from multiple vendors but only at a single center. Further validation with a test dataset acquired from another institution is desirable. Third, our model has been trained with multiple images for some patients. Although this could mean the model has the performance to make decisions about changes over time in individual patients, it also may affect the performance of the algorithm in other institutions or the general population. Fourth, even though we visualized the model’s region of interest with a heat map, it is still unclear which specific features the model used to determine mitral regurgitation in that region. Last, the accuracy of mitral regurgitation detection may increase by including left lateral images, but this may be practically difficult because often, only frontal images are captured.
In this study, we developed a model that uses deep learning–based artificial intelligence to diagnose mitral regurgitation by using features inherent to chest radiographs. Our model demonstrated high performance and may prove valuable for aiding in the diagnosis of mitral regurgitation. Further research in multicenter and prospective studies is needed to confirm the robustness of this research. Additionally, evaluating the abilities of machine learning versus radiologists or use of this model as a second reader are interesting possibilities for future research.
Acknowledgments
We are grateful to Mitsuhiro Inomata and Yoshikazu Hashimoto for technical assistance in regard to deep learning.
Supported in part by the Japan Society for the Promotion of Science (KAKENHI grant JP20K16769).
Disclosures of conflicts of interest: D.U. No relevant relationships. S.E. No relevant relationships. A.Y. No relevant relationships. S.I. No relevant relationships. K.A. No relevant relationships. S.L.W. No relevant relationships. T.M. No relevant relationships. A.S. No relevant relationships. M.Y. No relevant relationships. Y.M. No relevant relationships.
Abbreviations:
- LVEF: left ventricular ejection fraction
References
- 1. Iung B, Baron G, Butchart EG, et al. A prospective survey of patients with valvular heart disease in Europe: the Euro Heart Survey on Valvular Heart Disease. Eur Heart J 2003;24(13):1231–1243.
- 2. Nkomo VT, Gardin JM, Skelton TN, Gottdiener JS, Scott CG, Enriquez-Sarano M. Burden of valvular heart diseases: a population-based study. Lancet 2006;368(9540):1005–1011.
- 3. Nishimura RA, Otto CM, Bonow RO, et al. 2017 AHA/ACC focused update of the 2014 AHA/ACC guideline for the management of patients with valvular heart disease: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines. J Am Coll Cardiol 2017;70(2):252–289.
- 4. Nishimura RA, Otto CM, Bonow RO, et al. 2014 AHA/ACC guideline for the Management of Patients With Valvular Heart Disease: executive summary: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines. Circulation 2014;129(23):2440–2492. [Published correction appears in Circulation 2014;129(23):e650.]
- 5. Zoghbi WA, Enriquez-Sarano M, Foster E, et al. Recommendations for evaluation of the severity of native valvular regurgitation with two-dimensional and Doppler echocardiography. J Am Soc Echocardiogr 2003;16(7):777–802.
- 6. Avierinos JF, Gersh BJ, Melton LJ 3rd, et al. Natural history of asymptomatic mitral valve prolapse in the community. Circulation 2002;106(11):1355–1361.
- 7. Enriquez-Sarano M, Avierinos JF, Messika-Zeitoun D, et al. Quantitative determinants of the outcome of asymptomatic mitral regurgitation. N Engl J Med 2005;352(9):875–883.
- 8. Etchells E, Bell C, Robb K. Does this patient have an abnormal systolic murmur? JAMA 1997;277(7):564–571.
- 9. Mangione S. Cardiac auscultatory skills of physicians-in-training: a comparison of three English-speaking countries. Am J Med 2001;110(3):210–216.
- 10. Mangione S, Nieman LZ. Cardiac auscultatory skills of internal medicine and family practice trainees. A comparison of diagnostic proficiency. JAMA 1997;278(9):717–722.
- 11. Woolley K, Stark P. Pulmonary parenchymal manifestations of mitral valve disease. RadioGraphics 1999;19(4):965–972.
- 12. Webb WR, Higgins CB. Thoracic imaging: pulmonary and cardiovascular radiology. Philadelphia, Pa: Lippincott Williams & Wilkins, 2010.
- 13. Enriquez-Sarano M, Akins CW, Vahanian A. Mitral regurgitation. Lancet 2009;373(9672):1382–1394.
- 14. Gahl K, Sutton R, Pearson M, Caspari P, Lairet A, McDonald L. Mitral regurgitation in coronary heart disease. Br Heart J 1977;39(1):13–18.
- 15. Otto CM, Prendergast B. Aortic-valve stenosis—from patients at risk to severe valve obstruction. N Engl J Med 2014;371(8):744–756.
- 16. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521(7553):436–444.
- 17. Hinton G. Deep learning—a technology with the potential to transform health care. JAMA 2018;320(11):1101–1102.
- 18. Carpentier A. Cardiac valve surgery—the “French correction”. J Thorac Cardiovasc Surg 1983;86(3):323–337.
- 19. Lang RM, Badano LP, Mor-Avi V, et al. Recommendations for cardiac chamber quantification by echocardiography in adults: an update from the American Society of Echocardiography and the European Association of Cardiovascular Imaging. J Am Soc Echocardiogr 2015;28(1):1–39.e14.
- 20. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: 29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway, NJ: Institute of Electrical and Electronics Engineers, 2016;770–778.
- 21. TensorFlow. https://www.tensorflow.org. Accessed April 1, 2020.
- 22. xp-mr. GitHub. https://github.com/xp-mr. Accessed October 1, 2020.
- 23. Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: visual explanations from deep networks via gradient-based localization. Int J Comput Vis 2020;128(2):336–359.
- 24. R Core Team. R: a language and environment for statistical computing. R Foundation for Statistical Computing. http://www.R-project.org. Published 2013. Accessed August 1, 2021.
- 25. Salmi LR, Coureau G, Bailhache M, Mathoulin-Pélissier S. To screen or not to screen: reconciling individual and population perspectives on screening. Mayo Clin Proc 2016;91(11):1594–1605.