PLOS ONE. 2020 Mar 12;15(3):e0229839. doi: 10.1371/journal.pone.0229839

Introducing Hann windows for reducing edge-effects in patch-based image segmentation

Nicolas Pielawski 1,*, Carolina Wählby 1,2,*
Editor: Jie Zhang
PMCID: PMC7067425  PMID: 32163435

Abstract

There is a limit to the size of image that can be processed using computationally demanding methods such as Convolutional Neural Networks (CNNs). Furthermore, many networks are designed to work with a pre-determined, fixed image size. Some imaging modalities, notably biological and medical, can result in images up to a few gigapixels in size, meaning that they have to be divided into smaller parts, or patches, for processing. When performing pixel classification, however, this may lead to undesirable artefacts, such as edge effects in the final re-combined image. We introduce windowing methods from signal processing to effectively reduce such edge effects. With the assumption that the central part of an image patch often holds richer contextual information than its sides and corners, we reconstruct the prediction from overlapping patches that are weighted by 2-dimensional window functions. We compare the results of simple averaging and four different windows: Hann, Bartlett-Hann, Triangular, and a recently proposed window by Cui et al., and show that the cosine-based Hann window achieves the best improvement as measured by the Structural Similarity Index (SSIM). We also apply the Dice score to show that classification errors close to patch edges are reduced. The proposed windowing method can be used together with any CNN model for segmentation without modification and significantly improves network predictions.

Introduction

Semantic image segmentation is the process of separating an image into regions, e.g. representing different object types. The problem is ill-defined because there is no general definition of a region, and learning-based methods such as CNNs have started to outperform classical rule-based methods in recent years. In 2015, Ronneberger et al. [1] introduced the U-Net neural network architecture, consisting of a compressing and a decompressing part with skip-connections in between. In 2017, Jégou et al. [2] greatly increased the number of skip-connections used and reached state-of-the-art semantic segmentation with very few trainable parameters, albeit at the cost of a larger memory footprint.

Segmentation tasks are memory intensive, mainly due to the size of the input images and the preservation of feature maps along the computational graph of a CNN. In some cases, such as when handling gigapixel-sized whole slide images (WSI) in digital pathology, it is not possible to process the whole image at once [3]. To counteract these memory issues, patch-based segmentation methods use different techniques to feed more or less contextual information to the neural network. Edge effects are well known to appear when working with CNNs and have been approached in different ways: for instance, the original U-Net architecture keeps a contextual border in the input that is removed from the output. In 2018, Innamorati et al. [4] showed that segmentation errors are higher for pixels near the patch edges, and worse still for the corners. The same year, Cui et al. [5] proposed a method to reduce edge effects and increase the final segmentation quality after reassembling the different patches. Their method consists of weighting the loss function and the patches with a specific mask, which will be referred to as the Pyramidal window in this paper.

In signal processing, one of the best-known window functions is the Hann window, invented by Julius von Hann around 1900 and named after him by Blackman and Tukey [6] in 1958. Window functions are often used to taper a signal, by multiplying a patch extracted from the signal by a window, reducing the importance of the borders. Their usage is broad, for instance when transforming a signal into a spectrogram from which the original signal can later be reconstructed without artefacts. Another example lies in the field of statistics, where curve fitting can be given a weighting factor, the window function, also known as a kernel. It is desirable for these windows to integrate to one (with a stride of half a window); not fulfilling this criterion requires an additional normalisation step after reconstruction, as is the case for the Pyramidal window. This supplementary step can introduce additional artefacts due to rounding errors in floating point arithmetic.

In this paper we explore the idea of windowing to reduce edge effects after CNN-based image segmentation. Our contributions are the following:

  • We present a method that can be applied post hoc to any CNN segmentation output without needing to retrain, modify the loss function, or renormalise the output.

  • We handle edge and corner cases separately, which gives a weighted estimate given the available context. We also avoid additional floating point errors by using windows that integrate to one.

Methods

The proposed windowed patch-based method is a refinement step that reduces edge artefacts at patch borders. Inspired by signal processing, we multiply each patch by a 2-dimensional window function, which gives more emphasis to its centre and less to its edges and corners. We hypothesise that the most reliable information is kept when combining the window-weighted CNN outputs, increasing the quality of the predictions.

Our method follows this pipeline:

  1. Extracting overlapping patches, with a stride of half a patch size.

  2. Performing the prediction on the patches.

  3. Multiplying each patch by the appropriate window depending on its absolute location: the window must be of the same size as the patch, and must be replaced by an edge or corner variant if the patch lies on a border or corner of the image.

  4. Summing all the patches at their absolute location.
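As an illustration (a sketch of ours, not code from the paper), the four steps above can be written in a few lines of numpy; `model` stands for any patch-to-patch predictor, and for brevity the same interior window is applied everywhere, so the edge and corner replacement of step 3 is omitted:

```python
import numpy as np

def hann2d(size):
    """Separable 2-D Hann window, W(i, j) = w(i) w(j)."""
    w = 0.5 * (1.0 - np.cos(2.0 * np.pi * np.arange(size) / (size - 1)))
    return np.outer(w, w)

def predict_windowed(image, model, patch=128):
    """Reassemble overlapping model outputs weighted by a 2-D window.

    Steps: extract patches at half-patch stride (1), predict (2),
    multiply by the window (3), and sum at the absolute location (4).
    """
    stride = patch // 2
    window = hann2d(patch)
    out = np.zeros_like(image, dtype=float)
    for y in range(0, image.shape[0] - patch + 1, stride):
        for x in range(0, image.shape[1] - patch + 1, stride):
            pred = model(image[y:y + patch, x:x + patch])
            out[y:y + patch, x:x + patch] += pred * window
    return out
```

On a constant image with an identity `model`, the interior of the output stays close to 1, since the overlapping Hann windows approximately sum to one at a stride of half a patch.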

Window functions

We evaluate three different windows from classical signal processing: Hann [7], Bartlett-Hann [7] and Triangular [8], and compare them with the Pyramidal window [5] as well as with simple averaging of patch overlaps. We chose to focus on evaluating these windows rendered in 2 dimensions. The choice of window function is arbitrary as long as it is separable and sums to 1 when integrated over the patches with a stride of half a window size. The Pyramidal window does not fulfil this requirement and thus needs an extra normalisation step, which can introduce additional artefacts.

In signal processing, for an arbitrary 1-dimensional window function w, Speake et al. [9] described the 2-dimensional version W in separable form as:

W(i,j)=w(i)w(j)

We can thus derive the 2-dimensional versions of the original windows as follows:

W_Average(i, j) = 1/4  (1)

W_Hann(i, j) = (1/4) (1 - cos(2πi / (I - 1))) (1 - cos(2πj / (J - 1)))  (2)

W_Bartlett-Hann(i, j) = (a0 - a1 |i/I - 1/2| - a2 cos(2πi/I)) (a0 - a1 |j/J - 1/2| - a2 cos(2πj/J))  (3)

W_Triangular(i, j) = (1 - |2i/I - 1|) (1 - |2j/J - 1|)  (4)

where i is the current horizontal position, I the width of the patch, j the current vertical position, J the height of the patch, a0 = 0.62, a1 = 0.48, and a2 = 0.38 [7].
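Eqs 1–4 are straightforward to realise from their 1-dimensional factors. The following numpy sketch (ours, with hypothetical function names) builds the 2-D Hann and Triangular windows by outer product and checks the overlap at a stride of half a window:

```python
import numpy as np

I = 128  # patch width; the vertical dimension J is identical by symmetry

def w_hann(n, N):
    """1-D factor of Eq 2 (the 1/4 in Eq 2 is the product of two 1/2s)."""
    return 0.5 * (1.0 - np.cos(2.0 * np.pi * n / (N - 1)))

def w_triangular(n, N):
    """1-D factor of Eq 4."""
    return 1.0 - np.abs(2.0 * n / N - 1.0)

n = np.arange(I)
W_hann = np.outer(w_hann(n, I), w_hann(n, I))          # separable 2-D form
W_tri = np.outer(w_triangular(n, I), w_triangular(n, I))

# Partition of unity at half-window stride: at each position of the
# overlap region, the two shifted 1-D windows should sum to 1.
h = I // 2
overlap_tri = w_triangular(n[h:], I) + w_triangular(n[:h], I)
overlap_hann = w_hann(n[h:], I) + w_hann(n[:h], I)
```

Two overlapping triangular windows sum exactly to 1, while the Hann form of Eq 2 does so only up to a small discretisation error caused by the (I − 1) denominator.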

Cui et al. [5] defined a weighted loss function which is later used as a window. The resulting Pyramidal window is defined as follows:

W_Pyramidal(i, j) = α · D_e(i, j) / (D_c(i, j) + D_e(i, j))  (5)

α = I·J / (Σ_{i=1..I} Σ_{j=1..J} D_e(i, j) / (D_c(i, j) + D_e(i, j)))

where D_e(i, j) is the absolute distance from the edge and D_c(i, j) the distance from the centre. The Pyramidal window formula above cannot be expressed in separable form, which consequently increases the cost of computing the 2-dimensional window.
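For comparison, Eq 5 can be sketched as follows. Note that the exact distance definitions here are our assumption (Chebyshev distances to the nearest patch edge and to the patch centre); Cui et al. [5] should be consulted for the precise formulation:

```python
import numpy as np

def pyramidal_window(I, J):
    """Sketch of Eq 5; the Chebyshev distance choice is an assumption."""
    ii, jj = np.meshgrid(np.arange(I), np.arange(J), indexing="ij")
    # D_e: distance to the nearest patch edge (0 on the border).
    d_edge = np.minimum.reduce([ii, I - 1 - ii, jj, J - 1 - jj])
    # D_c: distance to the patch centre.
    d_centre = np.maximum(np.abs(ii - (I - 1) / 2), np.abs(jj - (J - 1) / 2))
    ratio = d_edge / (d_centre + d_edge)
    alpha = ii.size / ratio.sum()  # normalisation constant of Eq 5
    return alpha * ratio
```

The normalisation constant α makes the window average to 1 over the patch, which is the extra step the separable windows avoid.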

The 2-dimensional realisation of the different windows can be seen in Fig 1.

Fig 1. Illustration of the different 2-dimensional windows.


Every window gives more emphasis to the information in the centre than the information on the borders and in corners. (a) 2D Hann window, (b) 2D Bartlett-Hann window, (c) 2D Triangular window, and (d) 2D Pyramidal window (normalised between 0 and 1).

Complexity

A non-overlapping reconstruction requires nm computations of individual patches, with n the number of patches horizontally and m the number of patches vertically. Our method has a complexity of

(2n-1)(2m-1)=4nm-2n-2m+1

That is approximately 4nm, and this approximation becomes more accurate as n or m grows, i.e., the image gets larger.

In practice, the non-overlapping reconstruction can be performed first in order to display a preview to the user, and the missing information can then be completed by computing the remaining three quarters of the patches in the background. Another optimisation is to combine both methods: use the fast non-overlapping reconstruction for the inessential details of an image (e.g. the background) and the windowed method for the objects of interest.

Edges and corners

Contextual information from adjacent patches is naturally not available at the edges and corners of the full-size image. Therefore we propose specific windows to increase inference accuracy in these parts of the final image. The patches nearby the edges and corners are weighted with a different set of windows to compensate for the missing information so that the sum of all overlapping windows over the full image is 1. The 2-dimensional border windows of an image can be constructed from an arbitrary 1-dimensional window w:

W_Up(i, j) = w(i) if j < J/2, and w(i) w(j) otherwise  (6)

W_Down(i, j) = w(i) if j > J/2, and w(i) w(j) otherwise  (7)

W_Left(i, j) = w(j) if i < I/2, and w(i) w(j) otherwise  (8)

W_Right(i, j) = w(j) if i > I/2, and w(i) w(j) otherwise  (9)

with W(i, j) the resulting border window and w(i) the evaluation of an arbitrary 1-dimensional window w at position i.

The formula of the upper left corner patch of an image is defined as:

W_UpLeft(i, j) = 1 if i ≤ I/2 and j ≤ J/2; w(i) if i > I/2 and j ≤ J/2; w(j) if i ≤ I/2 and j > J/2; and w(i) w(j) otherwise  (10)

The three remaining corner windows can be constructed from Eq 10 by symmetry. These formulas yield eight new windows, as visualised in Fig 2.

Fig 2. The different configurations of Hann windows of size 128x128 for edge and corner cases.


Summing all the windows with the appropriate amount of overlap (128/2 = 64 pixels in this example) makes the total window weight applied to every pixel of the reconstructed image sum to 1.
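The border construction of Eqs 6–10 can be sketched in 1 dimension (our code, using the triangular window so that the overlap sums are exact); the 2-dimensional edge and corner windows then follow by outer products of these axis factors:

```python
import numpy as np

def tri(n):
    """1-D triangular window (Eq 4, one axis)."""
    k = np.arange(n)
    return 1.0 - np.abs(2.0 * k / n - 1.0)

def axis_windows(n):
    """Leading-edge, interior and trailing-edge 1-D factors (Eqs 6-10):
    the half facing the image border is flattened to 1."""
    w = tri(n)
    first = w.copy(); first[: n // 2] = 1.0
    last = w.copy(); last[n // 2:] = 1.0
    return first, w, last

def reconstruct_weight(length, patch):
    """Sum of all overlapping 1-D window factors along one image axis."""
    first, mid, last = axis_windows(patch)
    total = np.zeros(length)
    for off in range(0, length - patch + 1, patch // 2):
        if off == 0:
            total[off:off + patch] += first      # image border, Eq 6/8 case
        elif off == length - patch:
            total[off:off + patch] += last       # image border, Eq 7/9 case
        else:
            total[off:off + patch] += mid        # interior window
    return total
```

With these replacements, the summed weight is 1 at every pixel of the axis, which is exactly the property described for Fig 2.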

Experiments

We present an experiment where six different methods for combining the results of patch-based image segmentation are compared. The experiments follow the same protocol and data as in Cui et al. [5], where a U-Net neural network architecture is trained on a hematoxylin and eosin (H&E)-stained tissue dataset. The model is trained with a cross-entropy loss, and the weighting loss as constructed by Cui et al. [5] was discarded. The patch size is 128x128 for a complete image of 1024x1024 pixels. The six different methods that are compared are: No overlap, Average, Pyramidal, Hann, Bartlett-Hann, and Triangular windows.

Fig 3(a)–3(h) show input, ground-truth, and examples of results of the different methods. Although subtle, artefacts are most visible in the “No overlap” and the Average windowing, while the visual artefacts after reconstruction with the proposed windowing approaches are marginal.

Fig 3. Input, ground-truth and different reconstruction methods.


The figure is cropped from 1024x1024 to 128x128 for visibility. (a) The input colour image. (b) The ground-truth. (c) The baseline, where the patches are assembled without overlaps, “No overlap”. (d) The overlapping patches assembled with an average window. (e) The patches assembled and weighted with a Pyramidal window. (f) The patches assembled and weighted with a Hann window. (g) The patches assembled and weighted with a Bartlett-Hann window. (h) The patches assembled and weighted with a Triangular window. (i) The binary mask used to define patch edges and vicinity of edges as used for calculation of Dice coefficients, and (j), an overlay of the mask and (c).

Here, we focus on evaluating the benefit of using windowing when combining patches of a U-Net output. However, the ground truth we have access to consists of manual annotations describing the desired U-Net output. This means that any direct comparison to ground truth will be a mixture of U-Net performance and windowing effects. We therefore chose two strategies for evaluation. First, we compare ground truth (exemplified in Fig 3(b)) with the U-Net output without reconstruction (No overlap, exemplified in Fig 3(c)) as well as with the different reconstructed outputs (exemplified in Fig 3(d)–3(h)) using the structural similarity index [10] (SSIM). Next, we compare to the ground truth using the Dice coefficient, and to separate U-Net performance errors from windowing effects, we measure the Dice coefficient separately for pixels in patch centres and pixels in the vicinity of edges, using a binary mask as described below.

The SSIM is a method for measuring the similarity between two images and is related to human visual perception. Moreover, the SSIM generates statistics from local structures by using a window to compute a score, as opposed to single-pixel methods such as the PSNR (Peak Signal-to-Noise Ratio). The U-Net prediction consists of three classes; we calculated the SSIM for each of the three classes and then averaged them to obtain the resulting scores. As pointed out above, the proposed windowing reduces artefacts that are present on only a small number of pixels at the edges of patches. This results in a low variance of the SSIM scores. To better visualise the effects of windowing, we subtract the SSIM of the baseline with the most prominent artefacts, i.e. the result of not having patch overlap (No overlap), from the SSIM of each method. Fig 4(a) and 4(b) show the adjusted SSIM for each windowing approach and each image. Each marker represents an image with a unique combination of symbol and colour.
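The class-wise averaging can be illustrated with a simplified SSIM computed from global image statistics; the reference implementation [10] uses local sliding windows instead, so this sketch (ours, with hypothetical function names) is only indicative:

```python
import numpy as np

def ssim_global(x, y, data_range=1.0):
    """Simplified SSIM from global statistics; the reference method [10]
    computes the same quantities over local sliding windows."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()  # covariance
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def classwise_ssim(pred, truth):
    """Average SSIM over the class channels of (C, H, W) probability maps,
    mirroring the class-wise computation described in the text."""
    return float(np.mean([ssim_global(p, t) for p, t in zip(pred, truth)]))
```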

Fig 4. Comparison of windowing approaches using SSIM and Dice coefficients.


(a) and (b) show SSIM for five different windowing methods as adjusted to a baseline, which is the SSIM from “No overlap” patches. The SSIM indexes have been computed class-wise and then averaged. (b) shows a zoomed in version of (a) excluding the average window in order to emphasise the differences. The differences between the Pyramidal and the Hann windows are not clearly visible and dashed lines have been plotted to display a visual trend. (c) and (d) show the micro and macro Dice coefficients respectively, after adjusting (by subtraction) for the corresponding Dice coefficients of “No overlap” patches, to separate windowing errors from U-Net errors. The Dice coefficients were calculated separately for patch centres and edge vicinity, showing how the windowing has less of an effect in patch centres but improves the Dice coefficient at edge vicinity. Each of the 14 sample images has a unique combination of symbol and colour.

The Dice coefficient is typically used to compare segmentation masks, and as we have three classes we calculated both micro and macro Dice [11]. The micro-averaging and macro-averaging of the Dice coefficients were computed following the definition of Sebastiani [12, p. 33]. We started from the reconstructed U-Net outputs and assigned each pixel its most probable class, resulting in three binary images representing objects (blue), object edges (green), and image background (red). Next, the micro and macro Dice coefficients relative to the ground truth were calculated separately for pixels in patch centres and pixels in the vicinity of edges, using a binary mask as shown in Fig 3(i) and overlaid in Fig 3(j). As with the SSIM comparison, we adjust the scores by subtracting the result of not having patch overlap (No overlap) from the Dice score of each method. Fig 4(c) and 4(d) show the adjusted micro and macro Dice scores for patch centres and edge vicinity for each windowing approach and image.
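The micro and macro averaging can be sketched as follows (our code, with the assumed convention that a class absent from both masks scores 1): micro pools the per-class intersections and mask sizes before dividing, while macro averages the per-class Dice coefficients.

```python
import numpy as np

def dice_scores(pred, truth, n_classes=3):
    """Micro- and macro-averaged Dice for integer class maps."""
    inter, sizes, per_class = [], [], []
    for c in range(n_classes):
        p, t = pred == c, truth == c
        i = np.logical_and(p, t).sum()
        s = p.sum() + t.sum()
        inter.append(i)
        sizes.append(s)
        per_class.append(2.0 * i / s if s else 1.0)  # absent class -> 1
    micro = 2.0 * sum(inter) / sum(sizes)   # pool counts, then divide
    macro = float(np.mean(per_class))       # average per-class scores
    return micro, macro
```

Restricting `pred` and `truth` to the pixels selected by the edge-vicinity mask (or its complement) gives the edge and centre scores of Fig 4(c) and 4(d).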

Hypothesis testing

Besides the average window, all windows’ SSIM significantly outperformed the baseline (using adjacent patches without overlaps, i.e. “No overlap”). A paired t-test was conducted and showed evidence that the Pyramidal window (mean: 0.02079±0.00597; t(13) = 13.0294, p < .0001), Hann window (mean: 0.02112±0.00603; t(13) = 13.1095, p < .0001), Bartlett-Hann window (mean: 0.02026±0.00562; t(13) = 13.4807, p < .0001) and Triangular window (mean: 0.01691±0.00393; t(13) = 16.0797, p < .0001) all yielded a better score than the baseline.

Surprisingly, predictions with overlapping patches weighted by an average window (mean: −0.00710±0.00447) performed worse than our baseline; t(13) = 5.9431, p < 0.0001, even though they contain about four times as much information. This most likely happens because the overlapping patches do not weigh down edge artefacts thus yielding four times as many artefacts.

In our results, the Hann window outperformed the Pyramidal window by a small margin, and we performed an exact sign test [13] to highlight these differences in SSIM. The Hann window elicited a statistically significant mean increase in SSIM (0.00033±0.00017) compared to the Pyramidal window; t(13) = 7.0849, p < .0001.

In terms of the Dice coefficient, all methods significantly outperformed the baseline, including the prediction with overlapping patches weighted by an average window (see Fig 5(a)). The Triangular and Pyramidal windows obtained exactly the same Dice coefficients for all images, and significantly outperformed the other methods, except for the overlapping patches. As stochasticity is added when computing the argmax of the softmax, it becomes more difficult to compare the Dice coefficients of the different methods (see Fig 5(b)).

Fig 5. Paired t-test between the Dice coefficient within grid and outside of grid (patch centres) between the different methods.


(a) Paired t-test between the methods and the baseline (all are statistically significant), (b) Paired t-test between the methods and the triangular window.

Discussion and future work

Our method offers an immediate reduction of edge artefacts as assessed with SSIM, and can easily be implemented and integrated without modifying any existing Deep Learning model. Moreover, signal theory provides a basis for further improvements. Our results suggest that using a Hann window is an effective way of reducing edge artefacts, and testing the method on different datasets and imaging modalities could further confirm our hypothesis. More window types could be compared, and a more in-depth study could focus on determining the best window type depending on the circumstances.

Assessment of segmentation accuracy with the Dice coefficient showed that it is difficult to separate variations in accuracy due to a non-perfect ground truth, a sub-optimal network, and edge effects. We therefore focused on improvements in accuracy at patch edges as compared to patch centres when applying different types of weighted windows, and showed that all versions of weighted windows improved the result. However, the choice of window should be left as a hyper-parameter dependent on the application. Even though the weighting can erase a great deal of information, predictions in the vicinity of patch edges were shown to be noisy in [4], and this hypothesis was confirmed by the results of our experiments. Using a window that weights the centre more than the vicinity of the patch was beneficial to both the SSIM and Dice metrics.

The proposed method assumes that a constant amount of context is needed for an accurate prediction. It could be of interest to focus on reducing the amount of context needed, which would result in saturating windows. For instance, the Tukey window [8] has a tunable parameter that varies the amount of context. This would reduce the amount of overlap between the patches and could bring important computational gains.

An optimisation of the context could also be achieved with deep Bayesian neural networks, which yield a prediction associated with an uncertainty. This uncertainty, or precision, could be combined with a Bayesian prior to compute a different window for each patch. The window would then be predicted depending on how much context the neural network needs: the more certain a predicted area, the more weight it will receive.

Conclusion

In this paper we described a new method that introduces Hann windows for reducing edge effects when performing image segmentation with CNNs. We explained how to construct arbitrary windows in 2 dimensions and how they can be extended to border and corner cases. To demonstrate the concept, we tested six different methods on a cell nuclei dataset and showed that the Hann window compares favourably with the existing method of Cui et al. [5]. Finally, the method is readily available and simple to implement in existing Deep Learning models, even if they are already trained.

Data Availability

https://github.com/easycui/nuclei_segmentation/tree/master/datasets/Multi_organs.

Funding Statement

This project was financially supported by the Swedish Foundation for Strategic Research (grant SB16-0046 and BD150008) and the European Research Council (ERC-2015-CoG 683810). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Ronneberger O, Fischer P, Brox T. U-Net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer; 2015. p. 234–241.
  • 2. Jégou S, Drozdzal M, Vazquez D, Romero A, Bengio Y. The one hundred layers tiramisu: Fully convolutional densenets for semantic segmentation. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE; 2017. p. 1175–1183.
  • 3. Solorzano L, Almeida GM, Mesquita B, Martins D, Oliveira C, Wählby C. Whole Slide Image Registration for the Study of Tumor Heterogeneity. In: Stoyanov D, Taylor Z, Ciompi F, Xu Y, Martel A, Maier-Hein L, et al., editors. Computational Pathology and Ophthalmic Medical Image Analysis. Cham: Springer International Publishing; 2018. p. 95–102.
  • 4. Innamorati C, Ritschel T, Weyrich T, Mitra NJ. Learning on the Edge: Explicit Boundary Handling in CNNs. arXiv preprint arXiv:1805.03106. 2018.
  • 5. Cui Y, Zhang G, Liu Z, Xiong Z, Hu J. A Deep Learning Algorithm for One-step Contour Aware Nuclei Segmentation of Histopathological Images. arXiv preprint arXiv:1803.02786. 2018.
  • 6. Blackman RB, Tukey JW. The measurement of power spectra from the point of view of communications engineering—Part I. Bell System Technical Journal. 1958;37(1):185–282. doi: 10.1002/j.1538-7305.1958.tb03874.x
  • 7. Ha YH, Pearce JA. A new window and comparison to standard windows. IEEE Transactions on Acoustics, Speech, and Signal Processing. 1989;37(2):298–301. doi: 10.1109/29.21693
  • 8. Harris FJ. On the use of windows for harmonic analysis with the discrete Fourier transform. Proceedings of the IEEE. 1978;66(1):51–83. doi: 10.1109/PROC.1978.10837
  • 9. Speake T, Mersereau R. A note on the use of windows for two-dimensional FIR filter design. IEEE Transactions on Acoustics, Speech, and Signal Processing. 1981;29(1):125–127. doi: 10.1109/TASSP.1981.1163515
  • 10. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing. 2004;13(4):600–612. doi: 10.1109/TIP.2003.819861
  • 11. Dice LR. Measures of the amount of ecologic association between species. Ecology. 1945;26(3):297–302. doi: 10.2307/1932409
  • 12. Sebastiani F. Machine learning in automated text categorization. ACM Computing Surveys (CSUR). 2002;34(1):1–47. doi: 10.1145/505282.505283
  • 13. Dixon WJ, Mood AM. The Statistical Sign Test. Journal of the American Statistical Association. 1946;41(236):557–566. doi: 10.1080/01621459.1946.10501898

Decision Letter 0

Jie Zhang

27 Nov 2019

PONE-D-19-29003

Introducing Hann windows for reducing edge-effects in patch-based image segmentation

PLOS ONE

Dear Mr. Pielawski,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please revise the manuscript by considering the reviewers' comments.

We would appreciate receiving your revised manuscript by Jan 11 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Jie Zhang

Academic Editor

PLOS ONE

Journal Requirements:

1. When submitting your revision, we need you to address these additional requirements.

Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

Reviewer #3: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors present a procedure to address edge effects in CNN image segmentation. Their experiments are documented in detail and the manuscript is in general well-written. I do however have some recommendations that the authors should consider.

References: I find it hard to believe that only 11 references were relevant for this manuscript. The authors should significantly improve their literature review, in order to place their work more rigorously within existing literature.

Computational complexity: Although a mention of the general complexity of their method is shown, it would be very much helpful for the reader to display the computational time for the authors technique – along with the benchmark methods. One of the objectives in the era of big data processing is to find the right balance between improvements in accuracy, and time needed to retrieve the desired outputs and as such, this type of information can be particularly useful.

Validation: I understand why the authors decided to use the SSIM as validation metric. One thing, the SSIM should be presented in detail and not just put in a reference and second, wouldn’t other established accuracy metrics (such as overall accuracy, precision, recall and F1 score) be also fruitful?

Reviewer #2: The paper introduced Hann windows for patch-based image segmentation. The results show that, in terms of the Structural Similarity Index, the Hann window slightly outperforms the baseline. The novelty of the paper is limited. Eq 6 to Eq 10 are not clear enough, especially the definition of w(i).

Only SSIM is used for evaluation, it is necessary to add commonly used metrics for segmentation tasks, such as DICE.

Minor: Eq 10, should be if i <= L/2 and j <= J/2.

Reviewer #3: This manuscript describes the use of windowing functions on microscope image data that is to be segmented using convolutional neural network (CNN) approaches. In specific, the manuscript compares the effectiveness of several different window functions, including Hann windows. A strength of the approach described in this manuscript is that it provides an effective way to perform CNN-based segmentation on large images by subdividing them into many smaller images, while minimizing any artifacts caused by the segmentation algorithm near the edges of each subdivided image (i.e., edge artifacts). The manuscript is well thought out and well written and only very minimal comments were identified, as described below:

1. With reference to Figure 1 and the color-bar and color look-up tables used in the figure: was a different color look-up table used for the pyramidal window? It was mentioned in the manuscript that the integrated area of the pyramidal window is not 1, while the other window functions do have an integrated area of 1. Was it necessary to compensate for this different integrated area by using a different color look-up table, or alternatively, were the data normalized prior to visualizing as a 2D heatmap?

2. The description of the different approaches and different methods is a bit confusing on page 5, for the top 3 paragraphs. For example, the first paragraph refers to "The five different approaches", while the 3rd paragraph (which is the Fig. 3 caption) refers to "the adjusted SSIM for six different methods". It is a bit unclear of what an "approach" is vs. a "method" and whether there are 5 or 6 of them. In Fig. 3a, there appear to be 5 different window functions shown. For similar reasons, the 2nd paragraph on page 5 is also confusing, in the sentence that reads "Due to a low variance in the SSIM indices for each method but a high variance between methods..." It is unclear here as well what the method is that the sentence is referring to. Clearing up the language in these 3 paragraphs is needed. Finally, along similar lines, the authors might help to clarify the vertical axis of Figure 3 by explaining that this is the improvement in SSIM over baseline, or some wording similar to this.

3. On page 6, first paragraph of the Discussion section, the 2nd sentence contains the word "theory" twice, which is a bit redundant.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files to be viewed.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Mar 12;15(3):e0229839. doi: 10.1371/journal.pone.0229839.r002

Author response to Decision Letter 0


13 Jan 2020

We thank the reviewers for their constructive comments and have revised the manuscript to address their concerns.

REVIEWER #1

> References: I find it hard to believe that only 11 references were relevant for this manuscript. The authors should significantly improve their literature review, in order to place their work more rigorously within existing literature.

This issue is rarely discussed, as many researchers have access to powerful computers or can reformulate the problem to work with smaller images. One alternative approach consists of cropping the patch borders (e.g. Rudolph, Robert, et al. "Efficient identification, localization and quantification of grapevine inflorescences in unprepared field images using fully convolutional networks." arXiv preprint arXiv:1807.03770 (2018)), but it can still generate artifacts.

> Although a mention of the general complexity of their method is shown, it would be very much helpful for the reader to display the computational time for the authors technique – along with the benchmark methods.

The total time complexity is <number of patches> times <time complexity of the deep learning model>, and thus it will vary from model to model, machine to machine and so on. Using half-overlapping windows increases the number of patches by a factor of approximately 4, and increases the total time complexity by the same factor.
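As a rough illustration of this factor (the image and patch sizes below are hypothetical, and `patch_count` is an illustrative helper, not code from the article), the patch counts for non-overlapping versus half-overlapping tilings can be estimated as:

```python
import math

def patch_count(image_size, patch_size, stride):
    """Number of patches needed to cover one image dimension."""
    return math.ceil((image_size - patch_size) / stride) + 1

side, patch = 4096, 256  # hypothetical large image, fixed patch size

# Non-overlapping tiling: stride equals the patch size.
n_plain = patch_count(side, patch, patch) ** 2         # 16^2 = 256 patches

# Half-overlapping tiling (stride = patch // 2), as used for windowed blending.
n_overlap = patch_count(side, patch, patch // 2) ** 2  # 31^2 = 961 patches

print(n_overlap / n_plain)  # close to 4
```

The ratio approaches exactly 4 as the image grows relative to the patch size, since the number of half-overlapping patches is roughly doubled along each dimension.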

> The SSIM should be presented in detail and not just put in a reference; and second, wouldn't other established accuracy metrics (such as overall accuracy, precision, recall and F1 score) also be fruitful?

Agreed. We added a new metric: the DICE coefficient, which corresponds to the F1 score.
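For reference, a minimal sketch of this equivalence on binary masks (the function `dice` is our own illustration, not the article's code): the Dice coefficient 2|A∩B|/(|A|+|B|) equals the F1 score 2TP/(2TP+FP+FN) for binary predictions.

```python
import numpy as np

def dice(pred, target):
    """Dice coefficient on binary masks; algebraically equal to the F1 score."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / (pred.sum() + target.sum())

pred   = np.array([1, 1, 0, 0, 1])
target = np.array([1, 0, 0, 1, 1])
# TP = 2, FP = 1, FN = 1  ->  Dice = F1 = 2*2 / (2*2 + 1 + 1) = 2/3
print(dice(pred, target))
```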

REVIEWER #2

> Eq 6 to Eq 10 are not clear enough, especially the definition of w(i).

The text was modified to clarify and provide a better explanation of the equations.

> It is necessary to add commonly used metrics for segmentation tasks, such as DICE.

Agreed, we added two more experiments (see fig. 3).

> Eq 10, should be if i <= L/2 and j <= J/2.

Corrected.

REVIEWER #3

> Was a different color look-up table used for the pyramidal window? […] Was it necessary to compensate for this different integrated area by using a different color look-up table, or alternatively, were the data normalized prior to visualizing as a 2D heatmap?

The same colour look-up table was used throughout the article. The un-normalized windows were normalized prior to visualization.

> The description of the different approaches and different methods is a bit confusing on page 5, for the top 3 paragraphs. For example, the first paragraph refers to "The five different approaches", while the 3rd paragraph (which is the Fig. 3 caption) refers to "the adjusted SSIM for six different methods". It is a bit unclear of what an "approach" is vs. a "method" and whether there are 5 or 6 of them.

Changed all occurrences of “approach” to “method”, and 5 to 6. There are 5 methods plus 1 baseline, clarified in the revised article.

> In Fig. 3a, there appear to be 5 different window functions shown. For similar reasons, the 2nd paragraph on page 5 is also confusing, in the sentence that reads "Due to a low variance in the SSIM indices for each method but a high variance between methods..." It is unclear here as well what the method is that the sentence is referring to. Clearing up the language in these 3 paragraphs is needed. The authors might help to clarify the vertical axis of Figure 3 by explaining that this is the improvement in SSIM over baseline.

The three paragraphs were partially reformulated.

> On page 6, first paragraph of the Discussion section, the 2nd sentence contains the word "theory" twice.

Corrected.

Attachment

Submitted filename: rebuttal_letter.docx

Decision Letter 1

Jie Zhang

27 Jan 2020

PONE-D-19-29003R1

Introducing Hann windows for reducing edge-effects in patch-based image segmentation

PLOS ONE

Dear Mr. Pielawski,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please address the comments of reviewer 2. 

We would appreciate receiving your revised manuscript by Mar 12 2020 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Jie Zhang

Academic Editor

PLOS ONE


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: The author showed the experimental result of Dice in Fig 4, but didn't give the overall performance comparison in terms of Dice, and the discussion is not sufficient.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No



PLoS One. 2020 Mar 12;15(3):e0229839. doi: 10.1371/journal.pone.0229839.r004

Author response to Decision Letter 1


14 Feb 2020

We thank the reviewers for their constructive comments and have revised the manuscript to address their concerns.

> The author showed the experiment result of Dice in Fig 4, but didn't give the overall performance comparison in terms of Dice

We added a new figure (Fig 5), containing statistical testing of the different methods with respect to the baseline and the Triangular/Cui windows.

> The discussion is not sufficient.

The discussion section was modified and some content was added.

Attachment

Submitted filename: rebuttal_letter_2nd.pdf

Decision Letter 2

Jie Zhang

18 Feb 2020

Introducing Hann windows for reducing edge-effects in patch-based image segmentation

PONE-D-19-29003R2

Dear Dr. Pielawski,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,

Jie Zhang

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

The authors have adequately addressed the reviewers' comments and the revised manuscript can be accepted.

Reviewers' comments:

Acceptance letter

Jie Zhang

25 Feb 2020

PONE-D-19-29003R2

Introducing Hann windows for reducing edge-effects in patch-based image segmentation

Dear Dr. Pielawski:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Jie Zhang

Academic Editor

PLOS ONE

