Skip to main content
Computational and Mathematical Methods in Medicine logoLink to Computational and Mathematical Methods in Medicine
. 2012 Feb 20;2012:438617. doi: 10.1155/2012/438617

Nonlocal Means-Based Denoising for Medical Images

Ke Lu 1,*, Ning He 2, Liang Li 2
PMCID: PMC3291081  PMID: 22454694

Abstract

Medical images often consist of low-contrast objects corrupted by random noise arising in the image acquisition process. Thus, image denoising is one of the fundamental tasks required by medical imaging analysis. Nonlocal means (NL-means) method provides a powerful framework for denoising. In this work, we investigate an adaptive denoising scheme based on the patch NL-means algorithm for medical imaging denoising. In contrast with the traditional NL-means algorithm, the proposed adaptive NL-means denoising scheme has three unique features. First, we use a restricted local neighbourhood where the true intensity for each noisy pixel is estimated from a set of selected neighbouring pixels to perform the denoising process. Second, the weights used are calculated thanks to the similarity between the patch to denoise and the other patches candidates. Finally, we apply the steering kernel to preserve the details of the images. The proposed method has been compared with similar state-of-art methods over synthetic and real clinical medical images showing an improved performance in all cases analyzed.

1. Introduction

Medical images obtained from Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) and Ultrasound imaging (US) are the most common tools for diagnosis. These images are often affected by random noise arising in the image acquisition process. The presence of noise not only produces undesirable visual quality but also lowers the visibility of low-contrast objects. Image denoising is one of the classical problems in digital image processing. As a primary basis image processing procedure, noise removal has been extensively studied and many denoising schemes have been proposed, from the earlier smoothing filters and frequency domain denoising methods to the lately developed wavelet- [15], curvelet- [6], and ridgelet- [7] based methods, sparse representation [8] and K-SVD [9] methods, shape adaptive transform [10], bilateral filtering [11], NL-means based methods [12, 13], and more recently proposed nonlinear variational methods like the total variation minimization [1416]. With the rapid development of modern digital imaging devices and their increasingly wide applications in our daily life, there are increasing requirements of new denoising algorithms for higher image quality. Particularly, in medical imaging, denoising is challenging since all kinds of noise cannot be easily modeled and are known to be tissue dependent, such as ultrasound images. Although noise gives an image a generally undesirable appearance, the most significant factor is that noise can cover and reduce the visibility of certain features within the image. The presence of noise gives an image a mottled, grainy, textured, or snowy appearance. In the imaging process, the energy of the high-frequency waves is partially reflected and transmitted at the boundaries between tissues having different acoustic impedances. Nevertheless, the diagnosis quality is often low and reducing speckle while preserving anatomic information is necessary to delineate reliably and accurately the regions of interest. Recently, it has been demonstrated that image patches are relevant features for denoising images in adverse situations. The related methodology can be adapted to derive a robust filter for medical images. Accordingly, in this paper we introduce a novel restoration scheme for medical images, inspired from the NL-means approach introduced by Buades et al. [12] to denoise 2D natural images corrupted by an additive white Gaussian noise.

The rest of this paper is organized as follows. The noise distribution and estimation in medical images are depicted in Section 2.1. The brief description of NL-means algorithm is discussed in Section 2.2 while the improved NL-means algorithm and the denoising performance under Rician noise are analyzed Section 2.3. The supporting experimental results of improved NL-means algorithm compared to other denoising methods under various conditions are illustrated in Section 3. Finally, concluding remarks are given in Section 4.

2. Improved NL-Means Denoising Method

2.1. Noise Distribution and Estimation in Medical Images

The most MR images acquired in the Fourier domain are characterized by a zero-mean Gaussian probability density function (PDF). After the inverse Fourier transform, the noise distribution in the real and imaginary components will still be Gaussian due to the linearity and the orthogonality of the Fourier transform. However, due to the subsequent transform to a magnitude image, the noise distribution will be no longer Gaussian but Rician distributed. For an MR magnitude image defined on a discrete grid Ω, M = {m i | iΩ}, then the PDF of m i is

p(miA,σ)=miσ2e(mi2+A2/2σ2)I0(Ami/σ2)ε(mi), (1)

where I 0(·) is the 0th-order modified Bessel function of the first kind and ε(·) is the Heaviside step function. σ 2 denotes the variance of the Gaussian noise in the complex MR data, which can be independently estimated. When the underlying intensity A equals zero, the Rician PDF simplifies to a Rayleigh distribution:

p(miA,σ)=miσ2e(mi2/2σ2)ε(mi). (2)

At high SNR, the Rician PDF approaches to a Gaussian PDF with a mean A and variance σ 2 (see Figure 1):

Figure 1.

Figure 1

At high SNR, Rician data is approximately Gaussian. At low-medium SNR, neither Gaussian nor Rayleigh is a great approximation.

p(miA,σ)=12πσ2e((miA)2/2σ2)ε(mi). (3)

That is, Rician noise in magnitude MR images behaves like Gaussian distributed when SNR is high and Rayleigh distributed for low SNR.

Now we discuss how to measure the noise variance from an MR image without the need for high SNR regions or a background region.

Let m 1, m 2     ,…, m n be the n Rician distributed magnitude data points, and region of constant signal intensity is A. Then the joint PDF of the observations is

p({mi}A,σ)=Πi=1nmiσ2e(mi2+A2)/2σ2I0(Amiσ2), (4)

where {m i} are the magnitude variables corresponding to the magnitude observations m i. The maximum likelihood (ML) estimate of A and σ is then found from the global maximum of log⁡L:

A^ML=i=1nln(miσ2)i=1nmi2+A22σ2+i=1nlnI0(Amiσ2). (5)

Since the noise is estimated from the available piecewise constant regions in the image, this estimation neither depends on the image background nor on the SNR of the image.

2.2. NL-Means Filter

We focus on the problem of denoising: an observed image Y is assumed to be a noisy version of an unobserved image f corrupted by white Gaussian noise. Let ΩZ 2 be the indexing set of the pixels. For any pixel xΩ,

Y(x)=f(x)+ε(x), (6)

where ε is a centered Gaussian random variable with known variance σ 2 and the noise components ε(x) are independent. For each pixel the output of the procedure is a weighted average of the whole image. The weights used are selected using a “metric” which determines whether two pixels are similar or not. The core idea of the NL-means is to create a metric governed by patches surrounding each pixel, regardless of their position, that is, nonlocal in the image space. For a fixed (odd) width p, a patch P x is a subimage of width p, centered around the pixel x, and the NL-means estimator of f(x) is:

f^(x)=xΩw(x,x)Y(x)xΩw(x,x), (7)

where w(x, x′) = exp⁡(−||P xP x||2,a 2/2h 2), which measures the proximity between patches. h > 0 is the bandwidth, which has a smoothing effect and plays the same role as the bandwidth for kernel methods in statistics. The larger the bandwidth is, the smoother the image becomes. ||·||2,a is a weighted Euclidean norm in R |P|(|P| = p 2) using the Gaussian kernel, a controlling the concentration of the norm around the central pixel. The denominator is a normalizing factor ensuring the weights sum to one. The patch size P is generally chosen equal to 5, 7, or 9. From the patch estimator, it is possible to recover a pixel estimator by reprojection.

In the following, the proposed filter is realized in three steps: (a) finding the image patches similar to a given patch; (b) applying the Rician estimation on the 3D block; (c) collaborative adaptive filtering.

2.3. Adaptation to Rician Noise Denoising Model

In case of Rician noise, there is no closed form for the ML estimate of the true signal μ given n such measures x i. However, the even order moments of the Rician law have very simple expressions. In particular, the second-order moment is E(X i 2) = μ 2 + 2σ 2 where σ 2 is the variance of the Gaussian noise of MRI data. The measured value of x i 2 (and that of x i) is thus usually overestimated compared to its true, unknown value, which is termed the Rician bias in the following. Using the same remark as in the Gaussian case, that is, E(∑i w i X i 2) = μ 2 + 2σ 2, it then seems natural to restore x as iwixi2-2σ2, the weights w i summing to (1). The voxel value x can be restored as

NLMR(x)=(xiVwixi2)2σ2, (8)

where σ 2 is the noise variance. As noted by others in case if i.i.d random variables X i and with w i = 1/n, the term under the square root has a nonnull probability to be negative, which decreases when n is large. In such cases the restored value is set to zero. In practice, on real data, negative values are mainly found in the background of the images.

On the other hand, we should identify features that capture the underlying geometry of the image patches, without regard to the average intensity of the patches. For this, we make use of the data adaptive steering kernels developed by Takeda et al. [17]. In that work on Steering Kernel Regression (SKR), robust estimates of the local gradients are taken into account in analyzing the similarity between two pixels in a patch. The gradients are then used to describe the shape and size of the kernel. The steering kernel weight at the jth pixel in ith patch, which is a measure of similarity between the two pixels, is then given by

w(i,j)=det(Cj)2πh2exp{(xixj)TCj(xixj)2h2}, (9)

where h is a global smoothing parameter also known as the bandwidth of the kernel. The matrix C j denotes the gradient covariance formed from the estimated vertical and horizontal gradient of the jth pixel that lies in the ith patch. The 3 × 3 data-dependent steering matrix C j can be defined as C j = h(H i)−1/2, where h is a global smoothing parameter and H i is a 3 × 3 covariance matrix based on the sample variations in a local neighborhood around sample x i. The weight w(i, j) is calculated for each location in the ith patch to form the weight matrix (or kernel). It is interesting to see that the weight matrix thus formed is indicative of the underlying image geometry. This fact is illustrated in Figure 2. Note that in each point of the weight matrix a different C j is used to compute the weight, and hence, the kernels do not have simple elliptical contours.

Figure 2.

Figure 2

Steering kernels at different locations of the Lena image. The patch size is chosen to be 11 × 11.

However, when dealing with nonstationary noise the use of a global noise variance across the image will lead to suboptimal results. To deal with this situation, local noise estimation should be introduced.

Such estimation can be obtained by observing that the expectation of the squared Euclidean distance of two noisy patches as pointed out by Buades et al. is [12]

d(Ni,Nj)=E||u(Ni)u(Nj)||22=||u0(Ni)u0(Nj)||22+2σ2, (10)

where u 0 is the noise-free image. Therefore, d(N i, N j) = 2σ 2 if N i = N j. If we assume that each 2D patch in the volume has at least one patch equal to itself then the noise variance can be estimated as

σ2=min(d(Ni,Nj))2ji. (11)

However, we found experimentally that this assumption is not normally met in real clinical conditions. In order to relax such an assumption we estimated the local variance as

σ2=min(d(Ri,Rj))2ji,  R=uψ(u), (12)

where the distance is calculated from a volume R computed as the subtraction of the original noisy volume u and the lowpass filtered volume ψ(u). We have found experimentally that the minimum distance in this case is approximately equal to σ 2 due to the removal of low-frequency information and the application of the minimum operator.

This Rician adapted filter removes bias intensity using the properties of the second-order moment of a Rice law. In fact, the second-order moment of a random variable X following a Rice distribution can be written as

E(X2)=μ2+2σ2. (13)

Consider a gray-scale image y = (y(x))xΩ defined over a bounded domain ΩR 2, and y(x) ∈ R + is the noisy observed intensity at pixel xΩ. The weighting associated to the patch P is computed from the steering kernel:

WP(i,j)=det(Cj)σ2exp12(||y(x)y(xi)||σ22n1)2, (14)

where ||·|| denotes the Euclidean distance. y(x) : = (y(x k), y kB(x)) ∈ R n is a vectorized image patch. B(x) is a n×n neighborhood centered at pixel x(n = 7). Δ(x) is a square neighborhood of N = |Δ(x)| pixels. y(x i) is a vectorized image patch such that x i ∈ Δ(x). σ 2 is the noise variance assumed to be known or estimated. The final estimate is given by

INLσ,ny(x)=PWP(i,j)y(xi)PWP(i,j). (15)

The algorithm is divided in two identical separate steps, the image is scanned pixel per pixel. Let us denote by P the current reference patch which size is n × n (with k = 5) and x r the current central pixel of P. The loop on the image is done on x r.

This approach has two important benefits. On the one hand, it allows finding more similar patches with the same pattern but with different mean level compensating intensity inhomogeneities typically present on MRI data, and on the other hand, overestimation of the noise variance will be minimized in cases with unique patches in the search volume. Thus, the adaptive filter proposed will set the parameter h 2 equal to the minimum distance estimation as described in (11).

3. Experiments and Results

To evaluate and compare the proposed method with state-of-the-art methods, we did experiments on both synthetic and real medical images. To conduct the experiments on synthetic data, we use the standard MR images phantom of the brain obtained from the BrainWeb database [18]. The proposed algorithm was compared with the following recently proposed methods.

  1. NL-means: Nonlocal Means Image Denoising Method [12]. The size of the patch and research window depend on the value of σ. The search window size used for experiments was 9 × 9 × 9, neighborhood size was 3 × 3 × 3, and value of the decay parameter h and σ were 0.4 σ, 20.

  2. NL-PCA: Nonlocal Principal Component Analysis Method [19]. Local neighborhood size used for the experiments was 3 × 3 × 3. Other parameters are fixed to α = 2.1; K hard = K wien = 3; n hard = n wien = 15.

  3. DCT: Local Discrete Cosine Transform Method [20]. The method decomposes the image into local patches, and denoises the patches with thresholding estimate in the DCT domain. The local patches of size used for the experiments was N=16×16.

  4. Proposed Method. The search window n for the experiments was 5 × 5 × 5.

For quantitative analysis of the denoising methods, we used the peak signal-to-noise ratio (PSNR), the structural similarity index matrix (SSIM).

Figure 3 displays the results of the image denoised with NL-means, NL-PCA, DCT, and proposed method. This experiment was conducted on the 2D slice of the synthetic images of the brain in the 3D environment after corrupting the image by uniform Rician noise with σ = 20. The proposed filter was executed using a neighborhood size for denoising as 13 × 13 × 13 and a neighborhood size for the local computation of range as 5 × 5 × 5. It can be observed from Figure 3 that the image denoised with the proposed method is closer to the original image than the images denoised with other approaches. The graph in Figure 4 shows the quantitative analysis of the proposed method with other recently proposed methods based on the similarity measures PSNR, MSSIM, respectively. This experiment was also conducted on the BrainWeb MR image with σ of the noise ranging from 10 to 30. All the methods with which the proposed method was compared are based on the Rician noise model. In the quantitative analysis, the background was excluded; that is, only the area of the image inside the skull was considered. It can be seen from the graph that the performance of the proposed method is best for each similarity measure.

Figure 3.

Figure 3

Denoising MRI with several methods: (a) original image; (b) original image corrupted by Rician noise of σ = 20; (c) denoised with NL-means method; (d) denoised with DCT method; (e) denoised with NL-PCA method; (f) denoised with proposed method.

Figure 4.

Figure 4

Comparative analysis of the proposed method with other methods based on PSNR, MSSIM for different values of the noise standard deviation.

Figure 5 shows the extremely noisy data and we use the proposed method to remove the noise. The absolute value of the residuals of the filtering process clearly show the capabilities of the proposed approach on the extremely noisy data.

Figure 5.

Figure 5

The denoising results obtained with the proposed filter. (a) The original noisy images (σ = 30); (b) the denoised images (PSNR is 38.9 and 37.7, resp.); (c) the differences of the images.

4. Conclusion

A new method to denoise the medical images by applying NL-means method is proposed in this paper. To demonstrate the efficiency of the proposed method, experiments were conducted on both simulated and real medical images. Comparative analysis with other recently proposed methods based on the similarity measures, PSNR, MSSIM, proves that the proposed method is superior to them in terms of image quality.

Acknowledgments

This work was supported by the NSFC (Grant nos. 61103130, 61070120, 60982145); National Program on Key basic research Project (973 Programs) (Grant nos. 2010CB731804-1, 2011CB706901-4); Beijing Natural Science Foundation (Grant no. 4112021); Foundation of Beijing Educational Committee (no. KM201111417015); the opening project of Shanghai key laboratory of integrate administration technologies for information security (no. AGK2010005).

References

  • 1.Chang SG, Yu B, Vetterli M. Spatially adaptive wavelet thresholding with context modeling for image denoising. IEEE Transactions on Image Processing. 2000;9(9):1522–1531. doi: 10.1109/83.862630. [DOI] [PubMed] [Google Scholar]
  • 2.Pižurica A, Philips W, Lemahieu I, Acheroy M. A joint inter- and intrascale statistical model for Bayesian wavelet based image denoising. IEEE Transactions on Image Processing. 2002;11(5):545–557. doi: 10.1109/TIP.2002.1006401. [DOI] [PubMed] [Google Scholar]
  • 3.Zhang L, Bao P, Wu X. Hybrid inter- and intra-wavelet scale image restoration. Pattern Recognition. 2003;36(8):1737–1746. [Google Scholar]
  • 4.Zhang L, Bao P, Wu X. Multiscale LMMSE-based image denoising with optimal wavelet selection. IEEE Transactions on Circuits and Systems for Video Technology. 2005;15(4):469–481. [Google Scholar]
  • 5.Pižurica A, Philips W. Estimating the probability of the presence of a signal of interest in multiresolution single- and multiband image denoising. IEEE Transactions on Image Processing. 2006;15(3):654–665. doi: 10.1109/tip.2005.863698. [DOI] [PubMed] [Google Scholar]
  • 6.Starck JL, Candès EJ, Donoho DL. The curvelet transform for image denoising. IEEE Transactions on Image Processing. 2002;11(6):670–684. doi: 10.1109/TIP.2002.1014998. [DOI] [PubMed] [Google Scholar]
  • 7.Chen GY, Kégl B. Image denoising with complex ridgelets. Pattern Recognition. 2007;40(2):578–585. [Google Scholar]
  • 8.Elad M, Aharon M. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing. 2006;15(12):3736–3745. doi: 10.1109/tip.2006.881969. [DOI] [PubMed] [Google Scholar]
  • 9.Aharon M, Elad M, Bruckstein A. K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing. 2006;54(11):4311–4322. [Google Scholar]
  • 10.Foi A, Katkovnik V, Egiazarian K. Pointwise shape-adaptive DCT for high-quality denoising and deblocking of grayscale and color images. IEEE Transactions on Image Processing. 2007;16(5):1395–1411. doi: 10.1109/tip.2007.891788. [DOI] [PubMed] [Google Scholar]
  • 11.Barash D. A fundamental relationship between bilateral filtering, adaptive smoothing, and the nonlinear diffusion equation. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2002;24(6):844–847. [Google Scholar]
  • 12.Buades A, Coll B, Morel JM. A review of image denoising algorithms, with a new one. Multiscale Modeling and Simulation. 2005;4(2):490–530. [Google Scholar]
  • 13.Kervrann C, Boulanger J. Optimal spatial adaptation for patch-based image denoising. IEEE Transactions on Image Processing. 2006;15(10):2866–2878. doi: 10.1109/tip.2006.877529. [DOI] [PubMed] [Google Scholar]
  • 14.Chen S, Zhao M, Wu G, Yao C, Zhang J. Recent advances in morphological cell image analysis. Computational and Mathematical Methods in Medicine. 2012;2012:15 pages. doi: 10.1155/2012/101536. Article ID 101536. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Chen SY, Guan Q. Parametric shape representation by a deformable NURBS model for cardiac functional measurements. IEEE Transactions on Biomedical Engineering. 2011;58(3):480–487. doi: 10.1109/TBME.2010.2087331. [DOI] [PubMed] [Google Scholar]
  • 16.Chen S, Tong H, Cattani C. Markov models for image labeling. Mathematical Problems in Engineering. 2012;2012:18 pages. Article ID 814356. [Google Scholar]
  • 17.Takeda H, Farsiu S, Milanfar P. Kernel regression for image processing and reconstruction. IEEE Transactions on Image Processing. 2007;16(2):349–366. doi: 10.1109/tip.2006.888330. [DOI] [PubMed] [Google Scholar]
  • 18.Coupe P, Yger P, Prima S, Hellier P, Kervrann C, Barillot C. An optimized blockwise nonlocal means denoising filter for 3-D magnetic resonance images. IEEE Transactions on Medical Imaging. 2008;27(4):425–441. doi: 10.1109/TMI.2007.906087. Article ID 4359947. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Zhang L, Dong W, Zhang D, Shi G. Two-stage image denoising by principal component analysis with local pixel grouping. Pattern Recognition. 2010;43(4):1531–1549. [Google Scholar]
  • 20.Yu G, Sapiro G. DCT image denoising: a simple and effective image denoising algorithm. Image Processing On Line. 2011 [Google Scholar]

Articles from Computational and Mathematical Methods in Medicine are provided here courtesy of Wiley

RESOURCES