Abstract
Automatic detection of some pulmonary abnormalities in chest X-rays (CXRs) may be adversely impacted by bony structures, like the ribs and the clavicles, that obscure the soft tissue. Automated bone suppression methods would increase soft-tissue visibility and enhance automated disease detection. We evaluate this hypothesis using a custom ensemble of convolutional neural network (CNN) models, which we call DeBoNet, that suppresses bones in frontal CXRs. First, we train and evaluate variants of U-Nets, Feature Pyramid Networks, and other proposed custom models using a private collection of CXR images and their bone-suppressed counterparts. The DeBoNet, constructed using the top-3 performing models, outperformed the individual models in terms of peak signal-to-noise ratio (PSNR) (36.7977±1.6207), multi-scale structural similarity index measure (MS-SSIM) (0.9848±0.0073), and other metrics. Next, the best-performing bone-suppression model is applied to CXR images, pooled from several sources, showing no abnormality or findings consistent with COVID-19. The impact of bone suppression is demonstrated by evaluating the gain in performance in detecting pulmonary abnormality consistent with COVID-19 disease. We observe that the model trained on bone-suppressed CXRs (Matthews correlation coefficient (MCC): 0.9645, 95% confidence interval (0.9510, 0.9780)) significantly outperformed (p < 0.05) the model trained on non-bone-suppressed images (MCC: 0.7961, 95% confidence interval (0.7667, 0.8255)) in detecting findings consistent with COVID-19, indicating the benefit of automatic bone suppression for disease classification. The code is available at https://github.com/sivaramakrishnan-rajaraman/Bone-Suppresion-Ensemble.
Introduction
Chest X-ray (CXR) is a commonly performed radiological examination to visualize various abnormalities in the thoracic cavity [1]. However, accurate interpretation of pulmonary abnormalities, such as those consistent with COVID-19, is particularly challenging because their visibility may be obstructed by bony structures like the ribs and clavicles. This reduced visibility may lead to erroneous interpretation by an expert or an artificial intelligence (AI) algorithm, thereby severely impacting clinical decision-making. It has been noted in the literature that the presence of ribs and clavicles in CXR images has led to missed lung cancer nodules and false interpretations [2].
Advanced radiology methods like dual-energy subtraction (DES) chest radiography are used to produce “bone-only” and soft-tissue images [3]. However, DES has several limitations compared to traditional CXR imaging [4]: (i) it exposes subjects to a slightly higher radiation dosage; (ii) it can only be performed in the posterior-anterior view; (iii) its cost is higher; and (iv) it is recommended only for patients above 16 years of age. Therefore, an automated bone suppression method for traditional CXRs would add value by enhancing soft-tissue visibility and aiding improved detection of pulmonary manifestations.
A study of the literature reveals several works published on suppressing bones in CXRs. These studies use (i) commercial software, (ii) conventional machine learning methods with hand-crafted feature descriptors, or (iii) state-of-the-art deep learning (DL) models to first generate bone-only images and then subtract them from the original CXR to increase soft-tissue visibility. In [5], the authors used commercial software to suppress bones and improve performance in detecting lung nodules. The performance of the experts significantly improved (p < 0.05) with the bone-suppressed CXRs, yielding an area under the receiver-operating-characteristic curve (AUROC) of 0.863, compared to an AUROC of 0.82 with non-bone-suppressed CXRs. Another study used commercial software to suppress bones in CXRs and investigated the resulting performance improvement in tuberculosis (TB) detection [6]; the average AUROC of experts improved from 0.882 to 0.933 using bone-suppressed images. A convolutional neural network (CNN)-based model was used in [7] to generate a bone-only image that was subtracted from the original CXR to increase soft-tissue visibility, resulting in 89.2% bone suppression. A cascade of CNNs was used in [8] to create bone-only images at multiple scales; the generated images were fused to form the final bone-only image, which was subtracted from the original CXR to generate a “bone-free” image. In another study [9], an artificial neural network was used to generate a bone-only image that was subtracted from the original CXR to increase the visibility of soft tissues. A method based on independent component analysis was proposed in [10] to suppress bones and increase lung nodule visibility. Other studies [11–14] adopted bone suppression methods to improve performance in detecting lung nodules and other pulmonary manifestations. These studies, in general, propose multiple steps to generate bone-only images and subtract them from the original CXRs to increase soft-tissue visibility. A limitation of this approach is that inaccurate generation of bone-only images would introduce noise, reduce the visibility of soft tissues, increase interpretation errors, and adversely impact decision-making. As of the writing of this manuscript and to the best of our knowledge, other than [15] there are no other articles in the literature that propose an automated method to generate a soft-tissue image directly from the original CXR image, alleviating the need for intermediate bone-image generation and subsequent subtraction.
Though CNN models demonstrate state-of-the-art performance in natural and medical vision recognition tasks, they are often found to suffer from bias and variance issues that could adversely affect their interpretation. These issues could be tackled through ensemble learning that optimally combines the predictions of several models to improve prediction performance compared to the individual constituent models and reduce prediction spread or dispersion [16]. Ensemble learning is widely used in medical computer vision tasks such as segmentation, object detection, and classification [17]. To the best of our knowledge, we do not find any literature that evaluates the performance of DL model ensembles for bone suppression in CXRs.
In this study, we propose DeBoNet, an ensemble of DL models, for suppressing bones in frontal CXRs. Through its use, we aim to improve disease classification and interpretation performance, which we demonstrate through the detection of findings consistent with COVID-19 on CXRs [18]. We train several state-of-the-art architectures such as U-Nets [19] and Feature Pyramid Networks (FPNs) [20], using several ImageNet classifiers as backbones, and also propose custom models for bone suppression. DeBoNet is constructed by (i) measuring the multi-scale structural similarity index (MS-SSIM) score between the sub-blocks of the bone-suppressed image predicted by each of the top-3 performing bone-suppression models and the corresponding sub-blocks in the respective ground-truth soft-tissue image, and (ii) performing a majority voting of the MS-SSIM scores computed in each sub-block to identify the sub-block with the maximum MS-SSIM score and use it in constructing the final bone-suppressed image. We empirically determine the sub-block size that delivers superior bone suppression performance. The performances of the individual models and DeBoNet are evaluated using several metrics: peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), MS-SSIM, correlation, intersection, chi-square distance, and Bhattacharyya distance. Next, the best-performing bone suppression model is selected, truncated, and appended with classification layers. This is done to transfer CXR modality-specific knowledge and improve performance in the task of classifying CXRs as showing normal lungs or findings consistent with COVID-19. The performances of the classification models trained on non-bone-suppressed and bone-suppressed CXRs are compared through several metrics: accuracy, AUROC, precision, recall, the area under the precision-recall curve (AUPRC), F-score, and the Matthews correlation coefficient (MCC). Additionally, we used our in-house class-selective relevance map (CRM) algorithm [21] to interpret model predictions. Fig 1 shows the graphical abstract of our proposed approach.
Fig 1. Graphical abstract of the proposal.
Several proposed bone suppression models (M1, M2, …, Mn, n = 1, 2, …, 14) are trained on a set of input CXRs. The predictions of the top-3 performing bone suppression models are combined using a majority voting approach to construct DeBoNet. A classification model is trained on the non-bone-suppressed and bone-suppressed images to classify them into COVID-19 or normal categories.
Our novel contributions are highlighted as follows:
To the best of our knowledge, this is the first study to develop a model ensemble for suppressing bones in CXRs, which we call DeBoNet, and demonstrate its effectiveness through extensive qualitative and quantitative analyses.
We train and evaluate variants of U-Nets, Feature Pyramid Networks, and other proposed custom models toward the bone suppression task.
The individual constituent models and the DeBoNet proposed in this study are not restricted to the task of CXR bone suppression but can be potentially applied to other image denoising applications.
Materials and methods
Datasets
The following datasets are used in this study:
COVID-19 CXR collection: A total of 3016 de-identified, publicly available CXR images showing findings consistent with COVID-19, which serve as the set of cases in our study, are pooled from several sources. A majority of these CXRs come from the BIMCV-COVID19+ CXR data collection, which contains 2473 CXRs showing COVID-19-like manifestations [22]. A set of 183 CXR images showing findings consistent with COVID-19 are collected from a GitHub repository hosted by the Institute for Diagnostic and Interventional Radiology, Hannover Medical School, Hannover, Germany [23]. These CXR images are accompanied by other metadata such as admission status and patient demographics. Another 226 CXRs manifesting COVID-19 are collected, as in [17], from a public GitHub repository hosted by the authors of [24]; this collection is accompanied by metadata including sex, age, finding, and intubation status. A further 134 CXRs were acquired from SARS-CoV-2 PCR-positive patients at a hospital in Spain and posted by a radiologist in a public Twitter thread [25]. The ground-truth COVID-19 disease-specific region of interest (ROI) annotations, verified by two expert radiologists for a subset of this collection (n = 36), are used, as in [17], for interpreting model performance.
RSNA CXR dataset: To serve as experimental controls, we randomly select an equal number of 3016 de-identified CXR images showing no abnormalities from the publicly available RSNA CXR dataset, released for the RSNA pneumonia detection challenge hosted by Kaggle [26]. The full collection includes 26,684 CXR images, of which 8851 show no abnormalities, 6012 show pneumonia-related lung opacities, and 11,821 show other pulmonary abnormalities.
NIH-CC-DES-Set 1: This set consists of 27 de-identified DES CXR images [15] that were acquired at the National Institutes of Health (NIH) Clinical Center (CC) as a part of routine clinical care. A GE Discovery XR656 digital radiography system was used to acquire the DES images at 120 and 50 kilovoltage peak (kVp) to capture the soft-tissue and bone-only images, respectively. This dataset is used to evaluate the performance of the bone suppression models proposed in this study.
NIH-CC-DES-Set 2: Another set of 100 de-identified DES CXRs was acquired similarly to NIH-CC-DES-Set 1. This collection contains DES images of 54 females and 46 males; the average age (± standard deviation) of the males and females is 48.9 ± 14.5 and 45.4 ± 13.6 years, respectively. This dataset is augmented and used to train the bone suppression models.
The NIH-CC-DES-Set 1 and NIH-CC-DES-Set 2 data are samples of adult subjects with no radiological findings, selected from the NIH archives, that were de-identified and manually verified before use. The NIH Institutional Review Board (IRB) exempted their use from full review. The total number of CXRs pooled from different sources is given in Table 1.
Table 1. Dataset sources and the number of CXR images pooled from each.

| Source | COVID-19 | Normal |
|---|---|---|
| BIMCV-COVID19+ CXR | 2473 | – |
| Hannover Medical School, Hannover | 183 | – |
| Cohen et al. | 226 | – |
| Twitter COVID-19 CXR | 134 | – |
| RSNA CXR | – | 3016 |
| NIH-CC-DES-Set 1 | – | 27 |
| NIH-CC-DES-Set 2 | – | 100 |
Bone suppression models
The set of 100 grayscale DES CXR image pairs (i.e., the original CXRs and their soft-tissue counterparts) from the NIH-CC-DES-Set 2 dataset is augmented using affine transformations, such as rotations (-10 to 10 degrees), horizontal and vertical shifting (-5 to 5 pixels), horizontal mirroring, and zooming, and filtering operations, such as median filtering, Gaussian blurring, and unsharp masking, resulting in 1000 DES CXR pairs. The augmented images are resized to 256×256 pixels to reduce computational complexity. The contrast of the images is enhanced by clipping the top and bottom 1% of all pixel values. The pixel values are then normalized.
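For illustration, a minimal Python sketch of the resizing, percentile-based contrast clipping, and normalization steps is given below; the exact constants and the augmentation pipeline in the released code may differ.

```python
import cv2
import numpy as np

def preprocess(img, size=256):
    """Resize a grayscale CXR, clip the top/bottom 1% of pixel values for
    contrast enhancement, and rescale intensities to [0, 1]."""
    img = cv2.resize(img, (size, size), interpolation=cv2.INTER_AREA)
    lo, hi = np.percentile(img, (1, 99))  # 1st and 99th intensity percentiles
    img = np.clip(img, lo, hi)
    return ((img - lo) / (hi - lo)).astype(np.float32)
```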
We propose the following model architectures for the task of bone suppression in CXRs: (i) Autoencoder-BS (BS—Bone Suppression); (ii) ResNet-BS; (iii) U-EB0-BS; (iv) U-Res18-BS; (v) U-SE-Res18-BS; (vi) U-D121-BS; (vii) U-IV3-BS; (viii) U-MobileV2-BS; (ix) F-EB0-BS; (x) F-Res18-BS; (xi) F-SE-Res18-BS; (xii) F-D121-BS; (xiii) F-IV3-BS; and (xiv) F-MobileV2-BS. These architectures are discussed in subsequent sections.
Autoencoder with separable convolutions (Autoencoder-BS)
The Autoencoder-BS model is a denoising autoencoder with symmetrical encoder and decoder layers. Fig 2 illustrates the architecture of the proposed Autoencoder-BS model.
Fig 2. The architecture of the Autoencoder-BS model.
The input to the model is a grayscale CXR image. The model has a symmetrical separable convolutional encoder and decoder architecture.
The encoder consists of four separable convolutional blocks. Each block, except for the last, contains two separable convolutional layers. Separable convolutions are used to reduce computational complexity, thereby facilitating faster convergence and real-time deployment [27]. The separable convolutional blocks of the encoder contain 64, 128, 256, and 512 filters, respectively. Except for the last block, a max-pooling layer follows each separable convolutional block to compute the maximum value over individual patches of the feature map. Upsampling layers are used correspondingly in the symmetric decoder blocks to restore the spatial resolution of the input.
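A minimal Keras sketch of this layout follows; the kernel size, the single convolution in the deepest encoder block, and the sigmoid output head are assumptions, as these details are not fully specified above.

```python
from tensorflow.keras import layers, models

def build_autoencoder_bs(shape=(256, 256, 1)):
    """Sketch of Autoencoder-BS: symmetric separable-convolution encoder/decoder."""
    inputs = layers.Input(shape)
    x = inputs
    # Encoder: blocks with 64/128/256/512 filters; two separable convs per
    # block except the last (assumed one); max-pooling after all but the last.
    for i, f in enumerate([64, 128, 256, 512]):
        for _ in range(2 if i < 3 else 1):
            x = layers.SeparableConv2D(f, 3, padding="same", activation="relu")(x)
        if i < 3:
            x = layers.MaxPooling2D()(x)
    # Symmetric decoder: upsampling restores the input resolution.
    for f in [256, 128, 64]:
        x = layers.UpSampling2D()(x)
        for _ in range(2):
            x = layers.SeparableConv2D(f, 3, padding="same", activation="relu")(x)
    outputs = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)
    return models.Model(inputs, outputs)
```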
ResNet-based model with residual scaling (ResNet-BS)
The architecture of the proposed ResNet-BS model is shown in Fig 3. The first and last convolutional layers contain 128 filters of dimension 3×3. We used residual blocks with shortcuts that skip over layers. This approach helps overcome convergence issues due to vanishing gradients in deeper models: the skip connections allow the activations of earlier layers to be reused while the succeeding layers update their weights. Each residual block consists of two convolutional layers with 3×3 filters and 128 feature maps.
Fig 3. The architecture of the ResNet-BS model.
The input is a grayscale CXR image. The block R denotes the modified residual block. The proposed model has 16 residual blocks, each having two convolutional layers with 3×3 filters and 128 feature maps. The deepest convolutional layer with sigmoidal activation predicts the grayscale bone-suppressed image.
Inspired by [28], we used a modified residual block in which (i) the batch normalization layers are removed, as they are reported to reduce the range flexibility of the features through normalization, and (ii) activations are not used outside the residual blocks or in the final layer. The network consists of 16 residual blocks with an identical layout. We used zero padding to preserve the spatial dimensions of the input image. The residuals after the deepest convolutional layer in each residual block are scaled by an empirically determined factor (0.1) before being added back to the convolutional path. This scaling stabilizes training in deeper models with high computational complexity [28]. The deepest convolutional layer, with a sigmoid activation function, predicts the grayscale bone-suppressed image.
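The modified residual block with residual scaling can be sketched in Keras as follows; the single-channel sigmoid output head reflects our reading of the description above, and unspecified details (e.g., initializers) are assumptions.

```python
from tensorflow.keras import layers, models

def residual_block(x, filters=128, scaling=0.1):
    """Modified residual block: no batch normalization; the residual branch
    is scaled by 0.1 before being added back (after [28])."""
    r = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    r = layers.Conv2D(filters, 3, padding="same")(r)
    r = layers.Lambda(lambda t: t * scaling)(r)  # residual scaling
    return layers.Add()([x, r])

def build_resnet_bs(shape=(256, 256, 1), n_blocks=16):
    """Sketch of ResNet-BS with 16 residual blocks of 128 feature maps."""
    inputs = layers.Input(shape)
    x = layers.Conv2D(128, 3, padding="same")(inputs)  # first conv, no activation
    for _ in range(n_blocks):
        x = residual_block(x)
    x = layers.Conv2D(128, 3, padding="same")(x)       # last 128-filter conv
    outputs = layers.Conv2D(1, 3, padding="same", activation="sigmoid")(x)
    return models.Model(inputs, outputs)
```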
U-Net and FPN-based models
The U-Net models are widely used in image segmentation tasks [19]. The U-Net is composed of an encoder and decoder. The encoder or the contracting path extracts image features at multiple scales and the decoder or the expanding path semantically projects the features learned by the encoder onto the pixel space.
Feature Pyramid Networks (FPNs) are widely used as feature extractors for object detection [20]. Fig 4 shows the general architecture of the U-Net and FPN models. The FPN is composed of bottom-up and top-down pathways. The bottom-up pathway constitutes the encoder backbone that extracts image features at multiple scales (with a scaling step of 2). A convolutional layer with a 1×1 filter reduces the feature dimensions of the deepest convolutional layer in the bottom-up pathway to 256; this constitutes the first layer of the top-down pathway. Going deeper, the preceding layer is up-sampled by a factor of 2 using nearest-neighbor up-sampling. A 1×1 convolution is applied to the corresponding feature maps in the bottom-up pathway, and the result is added element-wise to the up-sampled map. A 3×3 convolution is then applied to all the merged layers to reduce aliasing effects. This helps to generate high-resolution features at each scale.
Fig 4.
The general architecture of the (a) U-Net and (b) FPN model. The input is a three-channel CXR image. The following encoder backbones, pretrained on the ImageNet dataset, are used in this study: EfficientNet-B0, ResNet-18, SE-ResNet-18, DenseNet-121, Inception-V3, and MobileNet-V2.
The grayscale CXR is duplicated across three channels and fed into the U-Net and FPN models because we use ImageNet-pretrained models, trained on RGB images, as the encoder backbones. We experimented with several encoder backbones for the U-Net and FPN models [29] for the task of bone suppression in CXRs: (i) EfficientNet-B0 [30], (ii) ResNet-18 [31], (iii) SE-ResNet-18 [32], (iv) DenseNet-121 [33], (v) Inception-V3 [34], and (vi) MobileNet-V2 [35]. We are motivated by the fact that these ImageNet-pretrained models have demonstrated superior performance in medical visual recognition tasks [17]. The final layer of the U-Net and FPN models is a convolutional layer with sigmoid activation that predicts the grayscale bone-suppressed CXR.
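Models of this kind can be instantiated in a few lines with the segmentation_models library [29]; the snippet below is a minimal sketch with an assumed 256×256×3 input, not the exact configuration of our released code.

```python
import segmentation_models as sm

sm.set_framework("tf.keras")  # use the tf.keras backend

# One-channel sigmoid output regresses the grayscale soft-tissue image.
unet = sm.Unet(backbone_name="efficientnetb0", input_shape=(256, 256, 3),
               classes=1, activation="sigmoid", encoder_weights="imagenet")
fpn = sm.FPN(backbone_name="resnet18", input_shape=(256, 256, 3),
             classes=1, activation="sigmoid", encoder_weights="imagenet")
```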
The proposed bone-suppression models are trained on the augmented NIH-CC-DES-Set 2 dataset and tested with the NIH-CC-DES-Set 1 dataset. We allocated 10% of the training data for validation using a fixed seed. We compiled the models using an Adam optimizer with an initial learning rate of 1e-3 and monitored the following validation performance metrics: (i) loss, (ii) PSNR, (iii) SSIM, and (iv) MS-SSIM. We propose a mixed-loss function that benefits from the combination of mean absolute error (MAE) and MS-SSIM losses, given by,
$L_{mixed} = \Omega \cdot L_{MS\text{-}SSIM} + (1 - \Omega) \cdot L_{MAE} \quad (1)$

where $L_{MS\text{-}SSIM} = 1 - \text{MS-SSIM}(\hat{y}, y)$ and $L_{MAE}$ is the mean absolute error between the predicted image $\hat{y}$ and the ground-truth soft-tissue image $y$.
We empirically set the value of Ω to 0.84. The MS-SSIM loss is given higher weightage since the bone-suppressed image should closely match the structure of the ground truth. The MAE loss is given comparatively lower significance because it focuses on pixel intensities (contrast and luminance) that are expected to change while suppressing the bones. We reduced the learning rate whenever the validation performance ceased to improve and used early stopping with a patience of 10 epochs. The best-performing models (with the least validation loss) are further used to predict bone-suppressed CXR images on the test set. An Ubuntu Linux system with an NVIDIA GeForce GTX 1080 graphics card and the Keras framework with a TensorFlow backend is used for model training and evaluation.
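For illustration, a minimal TensorFlow sketch of the mixed loss in Eq (1) is shown below, assuming model outputs and ground truths normalized to [0, 1]; the implementation in our released code may differ in detail.

```python
import tensorflow as tf

OMEGA = 0.84  # empirically chosen weight (see text)

def mixed_loss(y_true, y_pred):
    """Eq (1): weighted sum of the MS-SSIM loss (1 - MS-SSIM) and the MAE."""
    ms_ssim = tf.image.ssim_multiscale(y_true, y_pred, max_val=1.0)
    l_ms_ssim = 1.0 - tf.reduce_mean(ms_ssim)
    l_mae = tf.reduce_mean(tf.abs(y_true - y_pred))
    return OMEGA * l_ms_ssim + (1.0 - OMEGA) * l_mae

# model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss=mixed_loss)
```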
DeBoNet—bone suppression model ensemble
The bone suppression model ensemble, which we call DeBoNet, is constructed using the top-3 models in terms of the MS-SSIM metric on the NIH-CC-DES-Set 1 test set. Each of the top-3 models predicts a bone-suppressed image for an input CXR. The predicted image from each model is divided into sub-blocks of M×M pixels. The optimal value of M ∈ {4, 8, 16, 32, 64, 128, 256} is determined through extensive empirical evaluations. For a given sub-block size and in each sub-block, the following are performed: (i) we measure the MS-SSIM score between the sub-block of the bone-suppressed image predicted by each of the top-3 models and the corresponding sub-block in the respective ground-truth soft-tissue image; (ii) we perform a majority voting over the MS-SSIM scores to find the image sub-block with the maximum MS-SSIM score and use it in constructing the final bone-suppressed image. The algorithm below summarizes these steps, and Fig 5 illustrates the construction of the DeBoNet.
Fig 5. The architecture of the proposed DeBoNet.
Algorithm
Input:
  Ground-truth bone-suppressed image K of 256×256 resolution;
  Bone-suppressed images I = (IM1, IM2, IM3) of 256×256 resolution predicted by the top-3 performing bone-suppression models M1, M2, and M3;
  Image sub-block sizes B = [4, 8, 16, 32, 64, 128, 256]
Output: Final bone-suppressed image J

for each sub-block size b in B
  for each set of bone-suppressed images I predicted by M1, M2, M3
    for each b×b sub-block position in K and IM1, IM2, IM3
      compute MS-SSIM(K, IM1), MS-SSIM(K, IM2), and MS-SSIM(K, IM3) over the sub-block
      perform majority voting: select max(MS-SSIM(K, IM1), MS-SSIM(K, IM2), MS-SSIM(K, IM3))
      place the sub-block with the maximum MS-SSIM value in its respective position in the final bone-suppressed image J
    end for
  end for
end for
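A minimal NumPy sketch of this block-wise selection is given below. Since MS-SSIM is not defined on very small sub-blocks, single-scale SSIM with a block-appropriate window stands in for the MS-SSIM score here; this substitution and the function layout are illustrative assumptions, not the exact implementation.

```python
import numpy as np
from skimage.metrics import structural_similarity

def debonet_fuse(preds, gt, block=4):
    """Block-wise fusion: for every block, keep the prediction whose block
    best matches the ground-truth soft-tissue image.
    preds: list of three (H, W) arrays; gt: (H, W) array; values in [0, 1]."""
    H, W = gt.shape
    fused = np.zeros_like(gt)
    win = min(block, 7)
    win = win if win % 2 else win - 1  # SSIM window must be odd and <= block
    for y in range(0, H, block):
        for x in range(0, W, block):
            gt_blk = gt[y:y + block, x:x + block]
            scores = [structural_similarity(gt_blk, p[y:y + block, x:x + block],
                                            data_range=1.0, win_size=win)
                      for p in preds]
            best = preds[int(np.argmax(scores))]
            fused[y:y + block, x:x + block] = best[y:y + block, x:x + block]
    return fused
```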
DeBoNet evaluation
We performed evaluations using the histograms of the ground truths and the bone-suppressed images predicted by the individual bone-suppression models and the DeBoNet. Several metrics, namely correlation, intersection, chi-square distance, and Bhattacharyya distance, are measured to quantify similarity. The higher the correlation and intersection values, the more similar are the histograms of the image pairs; for the distance-based metrics (chi-square and Bhattacharyya), a smaller value indicates a superior match, i.e., that the histogram of the predicted bone-suppressed image closely matches that of its ground truth. The mathematical formulations of these metrics can be found in the literature [36]. The average values of these metrics are computed for each model and the DeBoNet and compared for statistical significance.
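These four measures correspond to OpenCV's histogram comparison methods [36]; a minimal sketch, assuming 8-bit grayscale inputs, follows.

```python
import cv2

def histogram_similarity(pred, gt):
    """Compare grayscale histograms of a predicted bone-suppressed image and
    its ground-truth soft-tissue counterpart (both uint8 arrays)."""
    h1 = cv2.calcHist([pred], [0], None, [256], [0, 256])
    h2 = cv2.calcHist([gt], [0], None, [256], [0, 256])
    cv2.normalize(h1, h1, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX)
    cv2.normalize(h2, h2, alpha=0, beta=1, norm_type=cv2.NORM_MINMAX)
    return {"correlation": cv2.compareHist(h1, h2, cv2.HISTCMP_CORREL),
            "intersection": cv2.compareHist(h1, h2, cv2.HISTCMP_INTERSECT),
            "chi-square": cv2.compareHist(h1, h2, cv2.HISTCMP_CHISQR),
            "bhattacharyya": cv2.compareHist(h1, h2, cv2.HISTCMP_BHATTACHARYYA)}
```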
Classification model
For classification, we initially used a custom U-Net model proposed in [17] to segment the lung ROI on the CXRs. This approach ensures that the models learn relevant features from the lung ROI and not the surrounding context. The U-Net model is trained to generate 256×256-dimension lung masks. The generated masks are overlaid on the input CXRs to delineate the lung boundaries. The delineated boundaries are cropped to a bounding box containing the lung pixels. The lung-cropped CXRs are further preprocessed to enhance image contrast by clipping the top and bottom 1%, respectively, of all pixel values. We further performed pixel normalization, centering, and standardization to reduce computational complexity during model training.
The encoder of the best-performing bone-suppression model is truncated and appended with the following layers: (i) Zero padding (ZP); (ii) Convolutional layer with 512 filters, each of size 3×3; (iii) Global average pooling (GAP), and (iv) a dense layer with two nodes and Softmax activation to classify the CXRs as showing normal lungs or other findings consistent with COVID-19. This approach is followed to transfer the CXR modality-specific knowledge learned from the bone suppression task to improve performance in a relevant classification task. A study of the literature reveals several works that used CXR modality-specific models to transfer knowledge and improve classification and localization performance in a relevant task [17, 37, 38].
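A Keras sketch of this head construction follows; the saved-model path and the name of the encoder's deepest layer are hypothetical placeholders, and the convolutional activation is an assumption.

```python
from tensorflow.keras import layers, models

# Load the trained bone-suppression model and truncate it at the encoder
# output; "f_eb0_bs.h5" and "encoder_deepest" are placeholder names.
bs_model = models.load_model("f_eb0_bs.h5", compile=False)
encoder_out = bs_model.get_layer("encoder_deepest").output
x = layers.ZeroPadding2D()(encoder_out)
x = layers.Conv2D(512, (3, 3), activation="relu")(x)  # activation assumed
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(2, activation="softmax")(x)    # normal vs. COVID-19
classifier = models.Model(bs_model.input, outputs)
```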
Recall that we use the COVID-19 CXR collection as cases and the RSNA CXR collection as controls for the classification task. Since ground-truth soft-tissue images are not available for these CXRs, the DeBoNet could not be directly used; instead, the best-performing bone suppression model is applied to these collections. We used 90% of these data for training and 10% for hold-out testing and, for consistency, allocated 10% of the training data for validation using a fixed seed. The model is retrained individually on the non-bone-suppressed and bone-suppressed CXR images to classify them as showing no abnormalities or findings consistent with COVID-19. We performed augmentation with random affine transformations, such as rotations (-10 to 10 degrees), horizontal and vertical pixel shifting (-5 to 5 pixels), zooming, and horizontal mirroring, to introduce variability into the training process and reduce overfitting to the training data. The model is compiled using a stochastic gradient descent optimizer with an initial learning rate of 1e-3. The learning rate is reduced whenever the validation performance ceases to improve. We used callbacks to checkpoint model weights, applied early stopping to prevent overfitting, and retained the best weights for further analysis. The best model is used to predict the test set and output class probabilities.
The following metrics are measured to compare model performance: (i) accuracy; (ii) AUROC; (iii) precision (P); (iv) recall (R); (v) AUPRC; (vi) F-score; and (vii) MCC. The AUROC and AUPRC are computed from the respective curves; the remaining metrics are expressed below.
$\text{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN} \quad (2)$

$P = \dfrac{TP}{TP + FP} \quad (3)$

$R = \dfrac{TP}{TP + FN} \quad (4)$

$\text{F-score} = \dfrac{2 \times P \times R}{P + R} \quad (5)$

$\text{MCC} = \dfrac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \quad (6)$
Here, TP, TN, FP, and FN denote the true positive, true negative, false positive, and false negative values, respectively. Additionally, we used our in-house CRM algorithm [21] to interpret the predictions of the models trained on non-bone-suppressed and bone-suppressed images and to verify that they learned to highlight regions containing findings consistent with COVID-19.
Statistical analyses
We performed statistical analyses to determine whether significant differences existed in the performance achieved by the bone suppression and classification models. For bone suppression, we performed a one-way analysis of variance (ANOVA) to analyze whether a significant difference existed in the MS-SSIM and chi-square distance values obtained using the top-3 performing bone-suppression models and DeBoNet. We performed Shapiro-Wilk and Levene tests to verify that the prerequisite conditions of data normality and homogeneity of variances were satisfied. For classification, we measured the 95% binomial confidence intervals (CIs) as the Exact Clopper-Pearson interval for the MCC metric to compare the classification performance achieved by the models trained on non-bone-suppressed and bone-suppressed images. We used R statistical software (version 4.1.1) to perform these evaluations.
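Although we performed these analyses in R, equivalent checks exist in Python's SciPy; the sketch below mirrors the procedure and takes hypothetical per-image score arrays (one per model) as input.

```python
from scipy import stats

def anova_check(groups, alpha=0.05):
    """One-way ANOVA preceded by its prerequisite checks.
    groups: list of 1-D arrays of per-image scores (e.g., MS-SSIM), one per model."""
    # Shapiro-Wilk normality test on each group
    normal = all(stats.shapiro(g)[1] > alpha for g in groups)
    # Levene test for homogeneity of variances across groups
    _, p_levene = stats.levene(*groups)
    if not (normal and p_levene > alpha):
        raise ValueError("ANOVA assumptions violated; consider a non-parametric test")
    f_stat, p_value = stats.f_oneway(*groups)
    return f_stat, p_value  # Tukey post-hoc: statsmodels' pairwise_tukeyhsd
```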
Results
Bone suppression
Recall that the proposed bone suppression models are trained on the augmented NIH-CC-DES-Set 2 dataset and tested using the NIH-CC-DES-Set 1 collection. The performance achieved by the bone suppression models is shown in Table 2. Fig 6 shows the bone-suppressed images predicted using the proposed bone suppression models for an input CXR instance from the test set.
Table 2. Performance achieved by the proposed bone suppression models using the NIH-CC-DES-Set 1 test set.
The values are given in terms of mean ± standard deviation. The best performances are denoted by bold numerical values in the corresponding columns.
| Model | PSNR | SSIM | MS-SSIM | Correlation | Intersection | Chi-square | Bhattacharyya |
|---|---|---|---|---|---|---|---|
| Autoencoder-BS | 33.1861±3.5922 | 0.9371±0.0310 | 0.9798±0.0093 | 0.5949±0.1800 | 8.4827±1.4190 | 1.4279±0.9773 | 0.4009±0.0878 |
| ResNet-BS | 30.9168±3.1286 | 0.9420±0.0261 | 0.9817±0.0092 | 0.5142±0.1831 | 8.2680±1.5036 | 2.6780±1.6202 | 0.4281±0.0884 |
| U-EB0-BS | 35.9098±1.5674 | 0.9359±0.0306 | 0.9795±0.0084 | 0.6529±0.1576 | 8.8000±1.3606 | 0.9004±0.6436 | 0.3813±0.0845 |
| U-Res18-BS | 35.7993±1.4498 | 0.9402±0.0283 | 0.9809±0.0080 | 0.6518±0.1403 | 8.8879±1.4312 | 0.9767±0.4622 | 0.3796±0.0833 |
| U-SE-Res18-BS | 35.5310±1.6773 | 0.9325±0.0310 | 0.9773±0.0077 | 0.6421±0.1505 | 8.6794±1.3098 | 1.0484±0.8215 | 0.3830±0.0836 |
| U-D121-BS | 33.7751±1.3033 | 0.9284±0.0301 | 0.9746±0.0083 | 0.6017±0.1543 | 8.4233±1.6595 | 1.7434±1.2997 | 0.3852±0.0838 |
| U-IV3-BS | 34.8914±1.7280 | 0.9368±0.0294 | 0.9795±0.0089 | 0.6411±0.1339 | 8.8026±1.4659 | 1.1195±0.4987 | 0.3836±0.0816 |
| U-MobileV2-BS | 27.6842±0.1715 | 0.8593±0.0342 | 0.9136±0.0139 | 0.2583±0.1131 | 5.7133±1.5060 | 10.9967±4.2341 | 0.4704±0.0631 |
| F-EB0-BS | **36.5525±1.6923** | **0.9449±0.0290** | **0.9840±0.0081** | **0.6654±0.1473** | **9.0462±1.4529** | **0.6893±0.4005** | **0.3790±0.0846** |
| F-Res18-BS | 36.3233±1.7004 | 0.9428±0.0281 | 0.9823±0.0079 | 0.6417±0.1424 | 8.8840±1.4194 | 0.9392±0.3799 | 0.3856±0.0833 |
| F-SE-Res18-BS | 36.0318±1.6900 | 0.9418±0.0294 | 0.9821±0.0084 | 0.6334±0.1559 | 8.8531±1.4131 | 1.0227±0.5185 | 0.3853±0.0841 |
| F-D121-BS | 35.2788±1.4938 | 0.9402±0.0283 | 0.9794±0.0082 | 0.6290±0.1365 | 8.7087±1.5015 | 1.2203±0.9092 | 0.3827±0.0818 |
| F-IV3-BS | 33.7446±1.8066 | 0.9369±0.0310 | 0.9793±0.0084 | 0.6225±0.1560 | 8.6645±1.3670 | 1.1846±0.7676 | 0.3910±0.0817 |
| F-MobileV2-BS | 33.5028±1.3452 | 0.9255±0.0320 | 0.9734±0.0088 | 0.5767±0.1743 | 8.1361±1.6643 | 2.3053±1.4224 | 0.3877±0.0844 |
Fig 6. Bone-suppressed CXR images predicted by the proposed models using a CXR sample from the NIH-CC-DES-Set 1 test set.
(a) Original CXR; (b) Ground truth soft tissue image; (c) U-EB0-BS; (d) U-Res18-BS; (e) U-SE-Res18-BS; (f) U-D121-BS; (g) U-IV3-BS; (h) U-MobileV2-BS; (i) F-EB0-BS; (j) F-Res18-BS; (k) F-SE-Res18-BS; (l) F-D121-BS; (m) F-IV3-BS and (n) F-MobileV2-BS.
Table 2 shows that the FPN model with the EfficientNet-B0 encoder backbone (F-EB0-BS) demonstrated superior performance for all metrics compared to the other models. Fig 6 shows that all models predicted images with substantial suppression of the bony structures; we therefore performed a quantitative evaluation to differentiate model performance. The F-EB0-BS model demonstrated the smallest chi-square and Bhattacharyya distances and the highest correlation and intersection measures, signifying that, compared to the other models, its bone-suppressed predictions most closely match the ground-truth soft-tissue images. This performance is followed by the FPN model with the ResNet-18 encoder backbone (F-Res18-BS) and the U-Net model with the ResNet-18 encoder backbone (U-Res18-BS), which demonstrated markedly improved values for the PSNR, SSIM, MS-SSIM, correlation, intersection, chi-square, and Bhattacharyya distance measures compared to the remaining models. These top-3 performing models are further considered to construct the ensemble.
The bone-suppressed images predicted by the top-3 performing models are divided into sub-blocks of M×M pixels. We empirically determined the value of M ∈ {4, 8, 16, 32, 64, 128, 256} that delivers superior bone suppression performance. For a given sub-block size and in each sub-block, (i) we measured the MS-SSIM score between the sub-block of the bone-suppressed image predicted by each of the top-3 performing models and the corresponding sub-block in the respective ground truth, and (ii) performed a majority voting of the MS-SSIM scores in each sub-block to identify the sub-block with the maximum MS-SSIM score and use it in constructing the final bone-suppressed image. Table 3 shows the performance achieved while constructing the DeBoNet using varying sub-block sizes. The DeBoNet performance with various sub-block sizes is superior to that achieved using the top-3 performing models individually (from Table 2). Using a sub-block size of 4×4, the DeBoNet achieved superior performance in terms of PSNR, SSIM, MS-SSIM, correlation, intersection, chi-square, and Bhattacharyya distances compared to using other sub-block sizes and the top-3 performing models. Curiously, we also note relatively high performance at the 256×256 sub-block size, where the sub-block spans the whole image and the voting reduces to selecting the best single-model prediction per image. Studying the correlation between block granularity and the MS-SSIM score is left as future work.
Table 3. Performance achieved by the DeBoNet using various sizes for the sub-blocks.
The values are given in terms of mean ± standard deviation. The best performances are denoted by bold numerical values in the corresponding columns.
| Block size | PSNR | SSIM | MS-SSIM | Correlation | Intersection | Chi-square | Bhattacharyya |
|---|---|---|---|---|---|---|---|
| 4×4 | **36.7977±1.6207** | **0.9465±0.0272** | **0.9848±0.0073** | **0.6720±0.1404** | **9.0862±1.4413** | **0.6174±0.2726** | **0.3778±0.0839** |
| 8×8 | 36.4574±1.4724 | 0.9226±0.0255 | 0.8721±0.0223 | 0.6344±0.1361 | 8.5230±1.3419 | 1.2636±0.4771 | 0.3806±0.0822 |
| 16×16 | 36.7651±1.6012 | 0.9437±0.0256 | 0.9837±0.0073 | 0.6598±0.1431 | 9.0193±1.4464 | 0.7282±0.3169 | 0.3800±0.0839 |
| 32×32 | 35.7137±1.2588 | 0.8965±0.0266 | 0.8390±0.0237 | 0.5161±0.1218 | 7.1297±1.1228 | 3.4226±1.0079 | 0.3901±0.0801 |
| 64×64 | 36.2657±1.4698 | 0.9218±0.0272 | 0.8719±0.0221 | 0.6297±0.1409 | 8.4949±1.3528 | 1.3402±0.5874 | 0.3815±0.0834 |
| 128×128 | 36.4872±1.5982 | 0.9380±0.0282 | 0.9213±0.0174 | 0.6667±0.1483 | 8.9424±1.4365 | 0.7807±0.4931 | 0.3784±0.0850 |
| 256×256 | 36.5787±1.6885 | 0.9458±0.0284 | 0.9841±0.0080 | 0.6641±0.1470 | 9.0304±1.4599 | 0.7026±0.4129 | 0.3796±0.0849 |
We performed a one-way ANOVA analysis to determine whether a statistically significant difference existed in the MS-SSIM and chi-square values obtained using DeBoNet with sub-block size 4×4 and the top-3 performing bone-suppression models, namely the F-EB0-BS, F-Res18-BS, and U-Res18-BS models. Fig 7 shows the mean plots for the MS-SSIM and chi-square values obtained by the models.
Fig 7. Statistical analyses using one-way ANOVA.
(a) and (b) show the mean plots for the MS-SSIM and chi-square values, respectively, obtained by the DeBoNet (4×4), F-EB0-BS, F-Res18-BS, and U-Res18-BS models.
One-way ANOVA requires that the assumptions of normally distributed data and homogeneity of variances are satisfied. We performed the Shapiro-Wilk test for normality and the Levene test for homogeneity of variances. For the MS-SSIM metric, the p-values for the Levene (p = 0.9828) and Shapiro-Wilk (p = 0.3824) tests are not statistically significant (p > 0.05), confirming that both assumptions are satisfied. Hence, we performed one-way ANOVA by measuring the size of each group, the variance within groups, and the variance between the group means; this information is collectively used to compute the F statistic. In this study, we have four groups/models (i.e., the 4×4 DeBoNet, F-EB0-BS, F-Res18-BS, and U-Res18-BS models) with 27 observations (images) each; hence the distribution is given as F(3, 104). Considering the MS-SSIM metric, no statistically significant difference existed between the 4×4 DeBoNet and the top-3 performing models (F(3, 104) = 0.886, p = 0.451). A similar analysis is performed using the chi-square distance metric. The conditions of data normality and homogeneous variances are satisfied based on the p-values obtained using the Shapiro-Wilk (p = 0.4768) and Levene (p = 0.4321) tests (p > 0.05). The one-way ANOVA revealed a statistically significant difference in the chi-square values obtained using the 4×4 DeBoNet, F-EB0-BS, F-Res18-BS, and U-Res18-BS models (F(3, 104) = 5.838, p = 0.001). We further performed Tukey post-hoc analyses to identify the models responsible for these significant differences. The chi-square distance obtained using the 4×4 DeBoNet (0.6174±0.2726) is significantly smaller than those of the F-Res18-BS (0.9392±0.3799, p = 0.0142) and U-Res18-BS (0.9767±0.4622, p = 0.0047) models, underscoring that the 4×4 DeBoNet achieved significantly smaller values for the chi-square metric (p < 0.05). Also, the chi-square value obtained using the F-EB0-BS model is significantly smaller (p = 0.0355) than that of the U-Res18-BS model. Compared to the individual top-3 models, the bone-suppressed images predicted by the 4×4 DeBoNet thus more closely resembled the ground-truth soft-tissue images.
Recall that the best-performing F-EB0-BS bone suppression model is used to suppress the bones in the CXRs used for the classification task, because ground-truth soft-tissue images are not available for these CXRs and the DeBoNet could therefore not be used. Fig 8 shows the bone-suppressed images predicted by the F-EB0-BS model for instances of CXRs showing findings consistent with COVID-19. Note that the F-EB0-BS model generalizes to these unseen CXRs, which were not used during bone-suppression model training and validation: we observed superior suppression of bones while image resolution is preserved.
Fig 8. Bone-suppressed images predicted by the F-EB0-BS model using instances of CXRs with COVID-19-consistent findings.
(a) CXR from the BIMCV-COVID19+ CXR data collection; (b) Corresponding bone-suppressed image; (c) CXR from the Twitter COVID-19 CXR collection, and (d) Corresponding bone-suppressed image.
Classification
Recall that the encoder of the best-performing F-EB0-BS bone suppression model is truncated and appended with classification layers to classify the CXRs as showing normal lungs or COVID-19-consistent findings. This approach is followed to transfer CXR modality-specific knowledge and improve classification performance. The classification model is retrained on the non-bone-suppressed and bone-suppressed CXR images, and the measured performance is shown in Table 4 and illustrated in Fig 9 in terms of the AUROC, confusion matrix, normalized Sankey diagram, and AUPRC curves.
Table 4. Classification performance achieved with the model trained on non-bone-suppressed and bone-suppressed images.
Data in parentheses denote the 95% binomial CI measured as the Exact Clopper-Pearson interval for the MCC metric. Bold numerical values denote superior performance in respective columns.

| Data | Accuracy | AUROC | AUPRC | Sensitivity | Precision | F-score | MCC |
|---|---|---|---|---|---|---|---|
| Non-bone-suppressed | 0.8964 | 0.9470 | 0.9275 | 0.8964 | 0.8997 | 0.8962 | 0.7961 (0.7667, 0.8255) |
| Bone-suppressed | **0.9820** | **0.9980** | **0.9981** | **0.9820** | **0.9825** | **0.9820** | **0.9645 (0.9510, 0.9780)** |
Fig 9. Classification performance achieved by the model trained on non-bone-suppressed and bone-suppressed images.
(a), (c), (e), and (g) denote the AUROC, confusion matrix, Sankey diagram, and AUPRC curves achieved through training the model with non-bone-suppressed images; (b), (d), (f), and (h) denote the AUROC, confusion matrix, Sankey diagram, and AUPRC curves achieved through training with the bone-suppressed images.
We observed from Table 4 and Fig 9 that the classification model trained on bone-suppressed images demonstrated superior performance in terms of accuracy, AUROC, AUPRC, sensitivity, precision, F-score, and MCC, compared to the model trained on non-bone-suppressed images. The 95% binomial CI obtained for the MCC metric using the model trained on bone-suppressed images has a tighter error margin (i.e., higher precision) and is significantly superior (p < 0.05) to the MCC achieved by the model trained on non-bone-suppressed images.
We qualitatively evaluated the models trained on non-bone-suppressed and bone-suppressed images to verify whether they learned to highlight regions containing COVID-19-consistent findings rather than the surrounding context. We used the CRM localization tool to interpret model behavior. Fig 10 shows instances of CXRs and the CRM-based disease-ROI localization obtained using the trained models.
Fig 10. CRM-based localization of COVID-19-consistent manifestations.
(a), (d), and (g) denote instances of CXRs from the Twitter COVID-19 CXR collection showing COVID-19 manifestations with expert annotations (blue bounding boxes); (b), (e), and (h) show the regions highlighted by the model trained on non-bone-suppressed images; (c), (f), and (i) show the COVID-19-consistent ROIs highlighted by the model trained on bone-suppressed images.
Fig 10A, 10D and 10G show instances of CXRs from the Twitter COVID-19 CXR collection with expert annotations shown in blue bounding boxes. Fig 10B, 10E and 10H show the localization achieved using the model trained on non-bone-suppressed images. This model highlighted the surrounding context rather than the COVID-19-consistent manifestations, demonstrating that it did not learn relevant features of the findings consistent with COVID-19. Fig 10C, 10F and 10I show the localization achieved using the model trained on bone-suppressed images. This model precisely highlighted regions specific to findings consistent with COVID-19, demonstrating that it learned task-specific features that conform to the experts’ knowledge about the disease.
Discussion and conclusions
The observations made in this study underscore the need for (i) customizing a model for the problem under study, (ii) constructing a model ensemble for bone suppression, and (iii) interpreting model behavior.
Our proposed approach predicts a bone-suppressed image directly from an input CXR. This is more computationally efficient than approaches in the literature [5–9] that follow a series of steps to generate bone-only images and subtract them from input CXRs to increase soft-tissue visibility. A limitation of those approaches is that sub-optimal generation of bone-only images introduces noise and distortion into the process and may adversely impact decision-making. We proposed several custom models and experimented with state-of-the-art architectures like U-Nets and FPNs using various ImageNet-pretrained encoder backbones to obtain superior bone suppression performance. To the best of our knowledge, this study is the first to explore the use of these models in the context of an image denoising problem in which the bony structures in an input CXR are considered noise. Through extensive empirical evaluations, we observed that the FPN model with the EfficientNet-B0 encoder backbone delivered superior bone suppression performance, followed by the FPN and U-Net models with ResNet-18 encoder backbones. The bone-suppressed images predicted by these top-3 models appeared sharp while preserving soft-tissue characteristics; they could therefore be used for further CXR image analysis such as screening for cardiopulmonary diseases. We propose an ensemble approach toward bone suppression, called DeBoNet, that demonstrated superior values for the PSNR, SSIM, MS-SSIM, correlation, intersection, chi-square, and Bhattacharyya distance metrics compared to the individual constituent models. This underscores that the DeBoNet improved bone suppression performance such that the predicted bone-suppressed image closely matched the ground-truth soft-tissue image.
We observed the effect of bone suppression on improving COVID-19 detection using CXRs. The classification model trained on bone-suppressed images demonstrated significantly superior performance in terms of accuracy, AUPRC, AUROC, precision, recall, F-score, and MCC, compared to the model trained on non-bone-suppressed images. We further observed through localization studies that the model trained on bone-suppressed images precisely highlighted regions showing findings consistent with COVID-19, conforming to expert knowledge of the disease. This underscores that, unlike the model trained on non-bone-suppressed images, the model trained on bone-suppressed images learned task-specific features, and not the surrounding context, to classify the CXRs into their respective classes. The model trained on non-bone-suppressed images is accurate; however, it demonstrated sub-optimal localization. This underscores that (i) the disease-specific ROI localization ability of a trained model is not related to its classification accuracy, and (ii) localization studies are therefore indispensable for interpreting the learned behavior of trained models.
This study suffers from the following limitations: (i) We used the best-performing bone-suppression model, and not the DeBoNet, to suppress bones in the CXR data used for the classification task, because we do not have the ground-truth soft-tissue images for these CXRs. On a positive note, however, DeBoNet ensemble training helps develop and identify the best-performing individual model. (ii) The lack of large-scale, publicly available DES CXR datasets is a significant limitation in training the bone-suppression models. The studies reported in the literature [5–9] used the JSRT CXR images and their bone-suppressed counterparts generated by an automated algorithm developed by researchers from the Budapest University of Technology and Economics [39] to train bone suppression models. However, such automated algorithms might have introduced noise and artifacts into the bone suppression process, leading to sub-optimal model training and inference. To the best of our knowledge, this is the first study to use DES CXRs to train bone-suppression models. However, ours is not large-scale data and hence may not encompass a wide range of variability in bone structures. With the increased availability of DES CXRs introducing sufficient data diversity into the training process, it would be possible to propose deeper architectures and improve model confidence, performance, and generalization to real-world data. (iii) This is not a classification-related study, but we wanted to evaluate whether bone suppression would improve COVID-19 detection. We observed that the model trained on bone-suppressed CXRs improved the detection of findings consistent with COVID-19, signifying that CXR bone suppression improved model sensitivity for COVID-19 classification and localization. Empirically determining the best classification model is outside the scope of this study.
The proposed approach could be extended to other image denoising problems. Using bone-suppressed CXRs to detect other cardiopulmonary abnormalities, including lung nodules, TB, and pneumothorax, would be a promising research avenue. We believe our results will improve human visual interpretation of COVID-19-consistent findings, as well as automated detection in AI-driven workflows.
Data Availability
The minimal data required to reproduce this study are available in terms of the figures, performance metrics, and other measures reported in the tables. The code and related information for the algorithm are made publicly available at https://github.com/sivaramakrishnan-rajaraman/Bone-Suppresion-Ensemble to promote global research.
Funding Statement
This research was supported by the Intramural Research Program of the National Library of Medicine, and the Clinical Center, both parts of the National Institutes of Health. The funders had no role in study design, data collection, and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Santosh KC, Antani S. Automated chest x-ray screening: Can lung region symmetry help detect pulmonary abnormalities? IEEE Trans Med Imaging. 2018. doi: 10.1109/TMI.2017.2775636
- 2. Shah PK, Austin JHM, White CS, Patel P, Haramati LB, Pearson GDN, et al. Missed non-small cell lung cancer: Radiographic findings of potentially resectable lesions evident only in retrospect. Radiology. 2003. doi: 10.1148/radiol.2261011924
- 3. Manji F, Wang J, Norman G, Wang Z, Koff D. Comparison of dual energy subtraction chest radiography and traditional chest X-rays in the detection of pulmonary nodules. Quant Imaging Med Surg. 2016. doi: 10.3978/j.issn.2223-4292.2015.10.09
- 4. Kuhlman JE, Collins J, Brooks GN, Yandow DR, Broderick LS. Dual-energy subtraction chest radiography: What to look for beyond calcified nodules. Radiographics. 2006. doi: 10.1148/rg.261055034
- 5. Li F, Hara T, Shiraishi J, Engelmann R, MacMahon H, Doi K. Improved detection of subtle lung nodules by use of chest radiographs with bone suppression imaging: Receiver operating characteristic analysis with and without localization. Am J Roentgenol. 2011. doi: 10.2214/AJR.10.4816
- 6. Kodama N, Loc T Van, Hai PT, Cong N Van, Katsuhara S, Kasai S, et al. Effectiveness of bone suppression imaging in the diagnosis of tuberculosis from chest radiographs in Vietnam: An observer study. Clin Imaging. 2018. doi: 10.1016/j.clinimag.2018.05.021
- 7. Matsubara N, Teramoto A, Saito K, Fujita H. Bone suppression for chest X-ray image using a convolutional neural filter. Australas Phys Eng Sci Med. 2019. doi: 10.1007/s13246-019-00822-w
- 8. Yang W, Chen Y, Liu Y, Zhong L, Qin G, Lu Z, et al. Cascade of multi-scale convolutional neural networks for bone suppression of chest radiographs in gradient domain. Med Image Anal. 2017. doi: 10.1016/j.media.2016.08.004
- 9. Suzuki K, Abe H, MacMahon H, Doi K. Image-processing technique for suppressing ribs in chest radiographs by means of massive training artificial neural network (MTANN). IEEE Trans Med Imaging. 2006. doi: 10.1109/TMI.2006.871549
- 10. Nguyen HX, Dang TT. Ribs suppression in chest X-ray images by using ICA method. IFMBE Proceedings. 2015. doi: 10.1007/978-3-319-11776-8_47
- 11. Freedman MT, Lo SCB, Seibel JC, Bromley CM. Lung nodules: Improved detection with software that suppresses the rib and clavicle on chest radiographs. Radiology. 2011. doi: 10.1148/radiol.11100153
- 12. Oda S, Awai K, Suzuki K, Yanaga Y, Funama Y, MacMahon H, et al. Performance of radiologists in detection of small pulmonary nodules on chest radiographs: Effect of rib suppression with a massive-training artificial neural network. Am J Roentgenol. 2009. doi: 10.2214/AJR.09.2431
- 13. Li F, Engelmann R, Pesce LL, Doi K, Metz CE, MacMahon H. Small lung cancers: Improved detection by use of bone suppression imaging—Comparison with dual-energy subtraction chest radiography. Radiology. 2011. doi: 10.1148/radiol.11110192
- 14. Li F, Engelmann R, Pesce L, Armato SG, MacMahon H. Improved detection of focal pneumonia by chest radiography with bone suppression imaging. Eur Radiol. 2012. doi: 10.1007/s00330-012-2550-y
- 15. Rajaraman S, Zamzmi G, Folio L, Alderson P, Antani S. Chest x-ray bone suppression for improving classification of tuberculosis-consistent findings. Diagnostics. 2021;11: 1–21. doi: 10.3390/diagnostics11050840
- 16. Dietterich TG. Ensemble methods in machine learning. Mult Classif Syst. 2000;1857: 1–15. doi: 10.1007/3-540-45014-9
- 17. Rajaraman S, Sornapudi S, Alderson PO, Folio LR, Antani SK. Analyzing inter-reader variability affecting deep ensemble learning for COVID-19 detection in chest radiographs. PLoS One. 2020. doi: 10.1371/journal.pone.0242301
- 18. Bhattacharya S, Reddy Maddikunta PK, Pham Q-V, Gadekallu TR, Krishnan S SR, Chowdhary CL, et al. Deep learning and medical image processing for coronavirus (COVID-19) pandemic: A survey. Sustain Cities Soc. 2021;65: 102589. doi: 10.1016/j.scs.2020.102589
- 19. Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. Lecture Notes in Computer Science. 2015. doi: 10.1007/978-3-319-24574-4_28
- 20. Xie X, Liao Q, Ma L, Jin X. Gated feature pyramid network for object detection. Lecture Notes in Computer Science. 2018;11259 LNCS: 199–208. doi: 10.1007/978-3-030-03341-5_17
- 21. Kim I, Rajaraman S, Antani S. Visual interpretation of convolutional neural network predictions in classifying medical image modalities. Diagnostics. 2019. doi: 10.3390/diagnostics9020038
- 22. Vayá M de la I, Saborit JM, Montell JA, Pertusa A, Bustos A, Cazorla M, et al. BIMCV COVID-19+: a large annotated dataset of RX and CT images from COVID-19 patients. 2020; 1–22. Available: http://arxiv.org/abs/2006.01174
- 23. Institute for Diagnostic and Interventional Radiology, Hannover Medical School. COVID-19 image repository. 2020 [cited 8 Aug 2021]. Available: https://github.com/ml-workgroup/covid-19-image-repository/tree/master/png
- 24. Cohen JP, Morrison P, Dao L. COVID-19 image data collection. 2020. Available: http://arxiv.org/abs/2003.11597
- 25. Imaging C. This is a thread of COVID-19 CXR (all SARS-CoV-2 PCR+) from my hospital (Spain). 2020 [cited 8 Aug 2021]. Available: https://threadreaderapp.com/thread/1243928581983670272.html
- 26. Shih G, Wu CC, Halabi SS, Kohli MD, Prevedello LM, Cook TS, et al. Augmenting the National Institutes of Health chest radiograph dataset with expert annotations of possible pneumonia. Radiol Artif Intell. 2019. doi: 10.1148/ryai.2019180041
- 27. Chollet F. Xception: Deep learning with depthwise separable convolutions. arXiv preprint arXiv:1610.02357. 2016; 1–14.
- 28. Lim B, Son S, Kim H, Nah S, Lee KM. Enhanced deep residual networks for single image super-resolution. IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). 2017. doi: 10.1109/CVPRW.2017.151
- 29. Yakubovskiy P. Segmentation Models. In: GitHub [Internet]. 2020 [cited 2 May 2021]. Available: https://github.com/qubvel/segmentation_models
- 30. Tan M, Le QV. EfficientNet: Rethinking model scaling for convolutional neural networks. 36th International Conference on Machine Learning (ICML). 2019.
- 31. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016. pp. 770–778. doi: 10.1109/CVPR.2016.90
- 32. Hu J, Shen L, Sun G. Squeeze-and-excitation networks. Proc IEEE Conf Comput Vis Pattern Recognit. 2018; 7132–7141. doi: 10.1109/CVPR.2018.00745
- 33. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ. Densely connected convolutional networks. Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2017. doi: 10.1109/CVPR.2017.243
- 34. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the Inception architecture for computer vision. Proc IEEE Conf Comput Vis Pattern Recognit. 2016; 2818–2826.
- 35. Sandler M, Howard A, Zhu M, Zhmoginov A, Chen LC. MobileNetV2: Inverted residuals and linear bottlenecks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2018. doi: 10.1109/CVPR.2018.00474
- 36. Open Source Computer Vision. Histogram comparison. 2020 [cited 3 Mar 2020]. Available: https://docs.opencv.org/3.4/d8/dc8/tutorial_histogram_comparison.html
- 37. Rajaraman S, Sornapudi S, Kohli M, Antani S. Assessment of an ensemble of machine learning models toward abnormality detection in chest radiographs. Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). 2019. doi: 10.1109/EMBC.2019.8856715
- 38. Islam MT, Aowal MA, Minhaz AT, Ashraf K. Abnormality detection and localization in chest X-rays using deep convolutional neural networks. arXiv. 2017. Available: http://arxiv.org/abs/1705.09850
- 39. Budapest University of Technology and Economics (BME). Bone shadow eliminated images of the JSRT database. 2013 [cited 6 Mar 2020]. Available: https://www.mit.bme.hu/eng/events/2013/04/18/boneshadow-eliminated-images-jsrt-database