Abstract
Purpose: To improve and test the generalizability of a deep learning-based model for assessment of COVID-19 lung disease severity on chest radiographs (CXRs) from different patient populations.

Materials and Methods: A published convolutional Siamese neural network-based model, previously trained on hospitalized patients with COVID-19, was tuned using 250 outpatient CXRs. This model produces a quantitative measure of COVID-19 lung disease severity, the pulmonary x-ray severity (PXS) score. The model was evaluated on CXRs from four test sets: three from the United States (patients hospitalized at an academic medical center [N=154], patients hospitalized at a community hospital [N=113], and outpatients [N=108]) and one from Brazil (patients at an academic medical center emergency department [N=303]). Radiologists from both countries independently assigned reference standard CXR severity scores, which were correlated with the PXS scores as a measure of model performance (Pearson r). The Uniform Manifold Approximation and Projection (UMAP) technique was used to visualize the neural network results.

Results: Tuning the deep learning model with outpatient data improved model performance in the two United States hospitalized patient datasets (r=0.88 and r=0.90, compared to baseline r=0.86). Model performance was similar, though slightly lower, when tested on the United States outpatient and Brazil emergency department datasets (r=0.86 and r=0.85, respectively). UMAP showed that the model learned disease severity information that generalized across test sets.

Conclusions: Performance of a deep learning-based model that extracts a COVID-19 severity score from CXRs improved when the model was tuned with training data from a different patient cohort (outpatient versus hospitalized), and the model generalized across multiple populations.
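The performance metric named above, Pearson r between model PXS scores and radiologist reference scores, can be illustrated with a minimal sketch. The function name `pearson_r` and the toy score lists are hypothetical and are not taken from the study; the preprint's actual evaluation code and data are not shown here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations about the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values for illustration only (not study data):
# one model score and one radiologist reference score per radiograph.
pxs_scores = [1.2, 3.4, 5.1, 7.8, 9.0]
reference_scores = [1.0, 3.0, 6.0, 7.0, 9.5]

r = pearson_r(pxs_scores, reference_scores)
print(f"Pearson r = {r:.2f}")
```

In this framing, each test set yields one r value, so the reported comparisons (for example, r=0.88 versus baseline r=0.86) correspond to running the same calculation on the same radiographs with scores from the tuned and baseline models.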
Full Text Availability
The license terms selected by the author(s) for this preprint version do not permit archiving in PMC. The full text is available from the preprint server.