Optical Coherence Tomography Noise Reduction Using Anisotropic Local Bivariate Gaussian Mixture Prior in 3D Complex Wavelet Domain

Hossein Rabbani; Milan Sonka; Michael D Abramoff

doi:10.1155/2013/417491

2013 Oct 10;2013:417491. doi: 10.1155/2013/417491

PMCID: PMC3810483 PMID: 24222760

Abstract

In this paper, an MMSE estimator is employed to recover noise-free 3D OCT data in the 3D complex wavelet domain. Since the distribution assumed for the noise-free data plays a key role in the performance of the MMSE estimator, we propose a prior distribution for the noise-free 3D complex wavelet coefficients that is able to model the main statistical properties of wavelets. We model the coefficients with a mixture of two bivariate Gaussian pdfs with local parameters, which is able to capture the heavy-tailed property and the inter- and intrascale dependencies of the coefficients. In addition, based on the special structure of OCT images, we use an anisotropic windowing procedure for local parameter estimation that results in improved visual quality. On this basis, several OCT despeckling algorithms are obtained using Gaussian or two-sided Rayleigh noise distributions and homomorphic or nonhomomorphic models. To evaluate the performance of the proposed algorithms, we use 156 selected ROIs from a 650 × 512 × 128 OCT dataset in the presence of wet AMD pathology. Our simulations show that the best MMSE estimator using the local bivariate mixture prior is obtained for the nonhomomorphic model in the presence of Gaussian noise, which results in an improvement of 7.8 ± 1.7 in CNR.

1. Introduction

Optical coherence tomography (OCT) is an optical signal acquisition and processing method that captures 3D images from within optical scattering media such as biological tissues [1–4]. For example, in ophthalmology, OCT is used to obtain detailed images from within the retina [4]. Like other optical tomographic techniques, OCT suffers from speckle noise, which hampers image interpretation [5]. Noise reduction is therefore an essential part of OCT image processing systems. Several techniques for OCT noise reduction have been reported [6–14]. Early methods operate in the complex domain [15], that is, before the magnitude of the OCT interference signal is produced, while most despeckling methods are applied after the OCT image is formed [6–14]. These methods, which usually assume multiplicative noise for speckled data, can be further categorized into image domain and transform domain methods. As an example of an image domain technique, in [16] rotating kernel transform (RKT) filters apply a set of oriented kernels to the image and keep the largest filter output for each pixel. Other image domain methods based on the enhanced Lee filter [17], median filter [17], symmetric nearest neighbor filter [17], adaptive Wiener filter [17], I-divergence regularization [6], and PDE-based nonlinear diffusion [14] have been reported in the literature.

Transform domain techniques typically outperform image domain techniques because speckle statistics are easier to incorporate into the despeckling process in sparse domains. Such techniques apply a sparse transform (such as the wavelet or curvelet transform) [7–12, 18] directly to the data (nonhomomorphic methods) or to the log-transformed data (homomorphic methods) and assume that in the sparse domain the noise is converted to additive white Gaussian noise (AWGN) [13], or follows another model, which can be removed using an appropriate shrinkage function. For example, in [18], a spatially adaptive wavelet thresholding method is used for speckle suppression in the log-transformed domain. Since the actual signal in OCT images consists of horizontal edges arising from reflections at the layer boundaries, most of the edge information is in the “low-pass”-“high-pass” (LH) subbands (and some of it in the HH subbands). Therefore, an increased threshold is chosen in the vertical subbands, using a constant multiplier (K = 4), to further decrease noise with a minimal effect on edge sharpness. Other transform domain methods based on hard thresholding in the 3D curvelet domain [8], soft thresholding in the discrete complex wavelet transform (DCWT) domain [9], and temporal and spatial wavelet-based filtering [10] have also been reported.

In fact denoising is the problem of obtaining the noise-free data from noisy data observation, which may be solved in a deterministic or probabilistic framework. In the first case, each voxel is considered as an unknown deterministic variable, and non-Bayesian techniques are employed to solve this problem. In the second case, the data is modeled as a random field, and Bayesian methods are used for the estimation of clean data from the noisy environment. Therefore, the proposed prior probability distributions for noise-free data and noise (i.e., proposed as speckle for OCT data) play a key role in the noise reduction problem.

1.1. Statistical Properties of Noise-Free Coefficients

Description of the statistical properties of natural signals is facilitated in the wavelet domain [19] by the sparseness and decorrelation properties of wavelets [20]. The sparseness property states that the marginal pdf of the wavelet coefficients in each subband has a large peak at zero and tails that fall to zero more slowly than those of the Gaussian pdf (i.e., it is leptokurtic). On this basis, long-tailed pdfs such as the generalized Gaussian distribution (GGD) [21, 22], α-stable distributions [23], Bessel K form densities [24, 25], and mixture pdfs [26–31] have been proposed. Although the decorrelation property of wavelets states that coefficients at the same positions in adjacent scales are uncorrelated, this does not mean that they are independent. The interscale dependency of wavelet coefficients states that large/small values of wavelet coefficients tend to propagate across scales [32]. Some researchers have proposed hidden Markov models (HMMs) [33] and Markov random fields (MRFs) [34] to model the interscale dependency [35]. Recently, it has been shown that some non-Gaussian bivariate joint pdfs for each coefficient and its parent, such as the circularly symmetric Laplacian pdf [36], the bivariate Cauchy distribution [37], the (multivariate) Gaussian scale mixture (GSM) model [27, 38, 39], and bivariate Laplacian mixture models [40], are able to capture this property easily and produce better denoising results with lower computational complexity.

The dependencies between wavelet coefficients are not restricted to the interscale dependency. There is another dependency between spatially adjacent coefficients in each subband, namely, the intrascale dependency [42]. This dependency states that if a particular wavelet coefficient is large/small, then the spatially adjacent coefficients are likely to be large/small too. This property is usually captured using local parameters for the pdfs [37], and it has been shown that denoising algorithms that exploit it in the statistical modeling of wavelet coefficients improve the denoising results [43–45]. For example, Mihçak et al. [43] employ a local variance for a Gaussian pdf to model the intrascale dependency. In [44], a mixture of two Laplace pdfs with local parameters is proposed to capture the heavy-tailed nature and the intrascale dependency simultaneously. Reference [45] uses a local variance for the model proposed in [36] and improves the noise reduction results because this local pdf models both interscale and intrascale dependencies. In this paper, we extend the pdf proposed in [46], which is based on a mixture of bivariate Gaussian pdfs with local parameters for noise-free wavelet coefficients. Since the empirically observed distribution of wavelet coefficient pairs in adjacent scales has elliptical symmetry, we use different variances for the marginal pdfs, which leads to an elliptically symmetric bivariate pdf instead of a circularly symmetric one. Recently, it has been shown that using an anisotropic window instead of a square window can improve the denoising results [47]. Based on the special structure of OCT data, we choose an anisotropic windowing procedure for local parameter estimation that results in improved visual quality.

1.2. Discrete Complex Wavelet Transform (DCWT)

The wavelet based image denoising consists of the following steps.

Signal transformation of the noisy observation.
Modification of the noisy wavelet coefficients based on some criteria.
Inverse signal transformation of modified coefficients.
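The three steps above can be sketched in a few lines of Python. This is a minimal illustration only: it uses a one-level 1D Haar transform in place of the 3D DCWT used in this paper, and the function names (`haar_forward`, `haar_inverse`, `soft_threshold`, `denoise`), the sample signal, and the threshold value are assumptions made for the example.

```python
import math

def haar_forward(x):
    """Step 1: one-level orthonormal Haar transform of an even-length list."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Step 3: exact inverse of haar_forward."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.extend([(a + d) / s, (a - d) / s])
    return x

def soft_threshold(c, t):
    """One possible step-2 rule (an assumption, not the estimator of this paper)."""
    return math.copysign(max(abs(c) - t, 0.0), c)

def denoise(x, shrink):
    approx, detail = haar_forward(x)        # step 1: transform
    detail = [shrink(d) for d in detail]    # step 2: modify the coefficients
    return haar_inverse(approx, detail)     # step 3: inverse transform

noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 5.1, 4.9, 5.2]
clean = denoise(noisy, lambda d: soft_threshold(d, 0.5))
```

Only the `shrink` argument changes when a different estimator is used in the second step; the transform and its inverse stay the same.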

As explained earlier, the second step depends on the type of estimator. For a minimum mean square error (MMSE) estimator, the performance of the algorithm is determined by the model assumed for signal and noise (a multiplicative model in this paper), the pdf assumed for the noise-free wavelet coefficients (modeled in this paper as a mixture of bivariate Gaussian pdfs with local parameters), and the pdf assumed for the noise (we test both Gaussian and two-sided Rayleigh distributions). For the first and last steps of a wavelet-based denoising algorithm, however, the type of transformation plays a key role. In this paper, we use the DCWT [48] instead of the ordinary discrete wavelet transform (DWT). Although the DWT is a sparse representation that outperforms many signal processing approaches, it does not lead to optimum performance in all applications and suffers from several fundamental shortcomings (especially in high-dimensional cases) that the DCWT avoids. These shortcomings are as follows.

In the neighborhood of an edge, the DWT produces both large and small wavelet coefficients. In contrast, the magnitudes of DCWT coefficients are more directly related to their proximity to the edge. The main reason for this phenomenon is that the bandpass filters produce DWT coefficients that oscillate positively and negatively around singularities, which complicates wavelet-based processing.
The DWT is not shift invariant. A small shift in the input signal can substantially change the total energy of the wavelet coefficients in a subband. Such a shift greatly perturbs the oscillation pattern of the DWT coefficients around singularities, which complicates wavelet-domain processing.
Since the DWT coefficients in each subband are produced via critical sampling after nonideal low-pass and high-pass filtering, substantial aliasing is produced. If the wavelet coefficients are not changed, the inverse DWT cancels this aliasing. Any processing of the wavelet coefficients (such as thresholding) upsets this balance between the forward and inverse transforms, which leads to artifacts in the reconstructed signal.
The directional selectivity of 2D DCWT has been explained in Appendix A. Similar to the 2D case, the standard 3D data transforms, which are separable multiplication of 1D tensors, do not provide useful representations with good energy compaction property for 3D data. For example, the multi-dimensional standard separable DWT mixes orientations and motions in its subbands and produces the checkerboard artifacts (Figure 1). In contrast, since the spectrum of the (approximately) analytic 1D wavelet is supported on only one side of the frequency axis, the spectrum of the DCWT in 3D domain is supported in only 1/27 of the 3D frequency plane. So, instead of 3D DWT, usually oriented transforms such as 3D DCWT are proposed for 3D data processing [41, 48–52]. Figure 1 shows a comparison between subbands of 3D DWT and 3D DCWT.

Figure 1: A comparison between the idealized support of the Fourier spectrum of each standard and complex wavelet in the 3D frequency domain. (a) Isosurfaces of the 7 3D wavelets for a standard 3D wavelet transform. The blue and red colors have the same amplitude, but their phases are complementary. (b) Isosurfaces of 7 of the 28 3D wavelets for a 3D DCWT. Each subband corresponds to motion in a specific direction [41].

1.3. Organization of the Paper

In Section 2, we explain our proposed pdf for noise-free 3D DCWT coefficients, that is, a mixture of bivariate Gaussian distributions with local parameters. In Section 3, we first obtain a local thresholding function by assuming a bivariate Gaussian pdf with local variance as the prior distribution, and then, in a Bayesian framework, we derive new shrinkage functions from the proposed pdf using Gaussian or two-sided Rayleigh noise distributions and homomorphic or nonhomomorphic models. In Section 4, we explain the proposed anisotropic window selection procedure for local parameter estimation, which is based on the special structure of OCT data. In Section 5, we use our model for wavelet-based denoising of several 3D OCT datasets and compare our methods visually and in terms of PSNR. Also in that section, we use the proposed method for nonspeckle noise reduction. Finally, in Section 6, we summarize the paper and suggest some future work.

2. Bivariate Gaussian Mixture Model with Local Parameters

One of the primary properties of the wavelet transform is compression. This property means that the marginal distributions of wavelet coefficients are highly kurtotic, so long-tailed distributions are suitable models for the marginal pdf. A zero-mean mixture model can have a large peak at zero and long tails. For example, in [22, 26, 29, 31] a mixture of Gaussian distributions is proposed to model the heavy-tailed nature of wavelet coefficients. Figure 2 shows such a model, which consists of two zero-mean Gaussian distributions with different variances. The Gaussian pdf with low variance models the large peak at zero, and the Gaussian pdf with high variance models the tails of the distribution. The secondary properties of the wavelet transform are clustering and persistence. The clustering property, also called the intrascale dependency, states that if a particular wavelet coefficient is large/small, then adjacent coefficients are very likely to be large/small as well [36]; local pdfs are usually able to model this property. The persistence property, also called the interscale dependency, states that large/small values of wavelet coefficients tend to propagate across scales [36]. As an example, Figure 3 illustrates the empirical joint parent-child histogram of wavelet coefficients computed from 200 images of size 512 × 512 from the Corel image database [42]. This property can usually be modeled using suitable bivariate pdfs.

Figure 2: Zero-mean Gaussian mixture model (left image) and empirical histogram of wavelet coefficients in a subband together with the Gaussian mixture model (right image) [26].

Figure 3: Empirical joint parent-child histogram of wavelet coefficients (computed from the Corel image database) [42].

2.1. Description of the Proposed Model

In this paper, we assume a pdf as a mixture of two bivariate Gaussian pdfs with local parameters in order to model the distribution of wavelet coefficients of images as follows:

\begin{matrix} p_{\bar{w} (k)} (\bar{w} (k)) = a (k) p_{1} (\bar{w} (k)) + (1 - a (k)) p_{2} (\bar{w} (k)) \\ = \frac{a (k) e^{(- w_{1}^{2} (k) / 2 σ_{11}^{2} (k)) - (w_{2}^{2} (k) / 2 σ_{12}^{2} (k))}}{2 π σ_{11} (k) σ_{12} (k)} \\ + \frac{(1 - a (k)) e^{(- w_{1}^{2} (k) / 2 σ_{21}^{2} (k)) - (w_{2}^{2} (k) / 2 σ_{22}^{2} (k))}}{2 π σ_{21} (k) σ_{22} (k)}, \end{matrix}

(1)

where a(k) ∈ [0,1], σ ₁₁(k), σ ₁₂(k), σ ₂₁(k), and σ ₂₂(k) are the mixture model parameters. In each bivariate random variable, the second component is the parent of the first; that is, w ₂(k) is the parent of w ₁(k), located at the same spatial position as the kth wavelet coefficient w ₁(k) but at the next coarser scale.

The proposed model, a mixture of bivariate Gaussian pdfs with local parameters, is at once a mixture, bivariate, and local. Therefore, it is able to simultaneously capture the heavy-tailed property (through the mixture), the interscale dependency (through the bivariate parent-child pdf), and the intrascale dependency (through the local parameters).

After substitution of mixture model in the definition of E(w ₁(k)w ₂(k)), we can see that this pdf represents two uncorrelated random variables as follows:

\begin{matrix} E (w_{1} (k) w_{2} (k)) \\ = \iint w_{1} (k) w_{2} (k) p_{\bar{w} (k)} (\bar{w} (k)) d \bar{w} (k) \\ = (1 - a (k)) \\ \times \iint w_{1} (k) w_{2} (k) p_{2} (\bar{w} (k)) d w_{1} (k) d w_{2} (k) \\ + a (k) \iint w_{1} (k) w_{2} (k) p_{1} (\bar{w} (k)) d w_{1} (k) d w_{2} (k) \\ = 0 . \end{matrix}

(2)

Interestingly, the marginal pdf of w _i(k) for i = 1, 2 is a mixture of two univariate Gaussian pdfs with local parameters [44]; for i = 1,

\begin{matrix} p_{w_{1} (k)} (w_{1} (k)) = \int_{- \infty}^{\infty} p_{\bar{w} (k)} (\bar{w} (k)) d w_{2} (k) \\ = a (k) \frac{\exp (- w_{1}^{2} (k) / 2 σ_{11}^{2} (k))}{σ_{11} (k) \sqrt{2 π}} \\ + (1 - a (k)) \frac{\exp (- w_{1}^{2} (k) / 2 σ_{21}^{2} (k))}{σ_{21} (k) \sqrt{2 π}} . \end{matrix}

(3)

It is easy to see that w ₁(k), w ₂(k) are not independent; that is,

\begin{matrix} p_{\bar{w} (k)} (\bar{w} (k)) \neq p_{w_{1} (k)} (w_{1} (k)) p_{w_{2} (k)} (w_{2} (k)) . \end{matrix}

(4)

See Appendix B for more explanation.
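The relation between (1), (3), and (4) can be checked numerically. The sketch below evaluates the joint mixture pdf and the product of its two marginals at one point. The function names and the parameter values are assumptions chosen for illustration and are not estimated from OCT data.

```python
import math

def gauss(x, sigma):
    """Zero-mean univariate Gaussian pdf with standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(w1, w2, a, s11, s12, s21, s22):
    """Joint pdf (1): each component is a product of two univariate Gaussians."""
    p1 = gauss(w1, s11) * gauss(w2, s12)
    p2 = gauss(w1, s21) * gauss(w2, s22)
    return a * p1 + (1.0 - a) * p2

def marginal_pdf(w, a, s_first, s_second):
    """Marginal pdf as in (3): a mixture of two univariate Gaussians."""
    return a * gauss(w, s_first) + (1.0 - a) * gauss(w, s_second)

# Illustrative parameters (assumed): a narrow and a wide component.
a, s11, s12, s21, s22 = 0.5, 1.0, 1.0, 3.0, 3.0

joint = mixture_pdf(0.0, 0.0, a, s11, s12, s21, s22)
product = marginal_pdf(0.0, a, s11, s21) * marginal_pdf(0.0, a, s12, s22)

# The two values differ, which is the statement of (4).
print(joint, product)
```

With a(k) = 1 the mixture collapses to a single bivariate Gaussian and the joint pdf factorizes, so the dependence comes entirely from mixing components with different variances.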

2.2. Local EM Algorithm

To specify the model in (1), we need the parameters σ ₁₁(k), σ ₂₁(k), σ ₁₂(k), σ ₂₂(k), and a(k). We estimate them with an iterative numerical algorithm; the expectation maximization (EM) algorithm is the one most frequently used for mixture models. Usually, the EM algorithm employs all the data in each subband, so this global EM algorithm yields the same parameters for all data in the subband. However, to model the intrascale dependency, we must incorporate the local statistics and therefore need different parameters for each voxel in each subband. So, we introduce a local version of the EM algorithm, which obtains separate parameters for each voxel by running the EM algorithm in a window N(k) centered at w(k). This iterative algorithm has two steps. Given the observed data $\bar{w} (k)$ for k = 1,…, N, the E-step calculates the responsibility factors for each data point as follows:

\begin{matrix} r_{1} (k) ⟵ \frac{a (k) p_{1} (\bar{w} (k))}{a (k) p_{1} (\bar{w} (k)) + (1 - a (k)) p_{2} (\bar{w} (k))}, \\ r_{2} (k) ⟵ 1 - r_{1} (k) . \end{matrix}

(5)

The M-step updates the parameters a(k), σ ₁₁(k), σ ₁₂(k), σ ₂₁(k), and σ ₂₂(k). a(k) is computed by

\begin{matrix} a (k) ⟵ \frac{1}{M} \sum_{j \in N (k)} r_{1} (j), \end{matrix}

(6)

where M is the number of coefficients in the square window N(k) centered at w(k).

The variances σ ₁₁(k), σ ₁₂(k), σ ₂₁(k), and σ ₂₂(k) are computed by [40]:

\begin{matrix} σ_{i m}^{2} (k) ⟵ \frac{\sum_{j \in N (k)}^{} r_{i} (j) w_{m}^{2} (j)}{\sum_{j \in N (k)}^{} r_{i} (j)}, i, m = 1,2 . \end{matrix}

(7)
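The E-step (5) and M-step (6), (7) can be sketched for a single window as follows. This is a minimal illustration under stated assumptions: the window is passed as a plain list of (child, parent) pairs, the initial values, the number of iterations, and the variance floor are arbitrary choices, and the function names are hypothetical. A full implementation would repeat this for the window around every voxel.

```python
import math

def gauss2(w1, w2, s1, s2):
    """One bivariate component of (1) with standard deviations s1 and s2."""
    return math.exp(-w1 * w1 / (2 * s1 * s1) - w2 * w2 / (2 * s2 * s2)) / (2 * math.pi * s1 * s2)

def weighted_std(resp, values, floor):
    """Square root of (7); the floor (an assumption) avoids zero variances."""
    den = sum(resp)
    if den <= 0.0:
        return math.sqrt(floor)
    var = sum(r * v * v for r, v in zip(resp, values)) / den
    return math.sqrt(max(var, floor))

def local_em(window, a=0.5, s11=0.5, s12=0.5, s21=2.0, s22=2.0, n_iter=20, floor=1e-8):
    """Run EM on the (w1, w2) pairs of one window N(k)."""
    m = len(window)
    w1s = [p[0] for p in window]
    w2s = [p[1] for p in window]
    for _ in range(n_iter):
        # E-step (5): responsibility of component 1 for each pair
        r1 = []
        for w1, w2 in window:
            p1 = a * gauss2(w1, w2, s11, s12)
            p2 = (1.0 - a) * gauss2(w1, w2, s21, s22)
            total = p1 + p2
            r1.append(p1 / total if total > 0.0 else 0.5)
        r2 = [1.0 - r for r in r1]
        # M-step (6): mixing weight
        a = sum(r1) / m
        # M-step (7): the four standard deviations
        s11, s12 = weighted_std(r1, w1s, floor), weighted_std(r1, w2s, floor)
        s21, s22 = weighted_std(r2, w1s, floor), weighted_std(r2, w2s, floor)
    return a, s11, s12, s21, s22

# Toy window (assumed data): six small pairs and three large pairs.
window = [(0.1, -0.1), (-0.1, 0.1), (0.05, 0.1), (-0.1, -0.05), (0.1, 0.1),
          (-0.05, -0.1), (3.0, -3.0), (-3.0, 2.5), (2.5, 3.0)]
a, s11, s12, s21, s22 = local_em(window)
```

On this toy window the low-variance component takes responsibility for the small pairs and the high-variance component for the large ones, which is the behavior the mixture model relies on.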

3. Denoising Using MMSE Estimator

In this section, the denoising of 3D OCT data is considered. We assume that the dominant noise in OCT data is speckle. In this case, we adopt the common multiplicative model:

\begin{matrix} x (i) = s (i) g (i), \end{matrix}

(8)

where x(i), s(i), and g(i) denote the observed (speckled) data, the noise-free data, and the speckle noise at voxel i, respectively, and i runs from 1 to the number of voxels.

As explained in the Introduction, the transform-based OCT noise reduction methods reported in the literature [7–12, 18] usually first transform the data into the log domain and assume that the noise in the log domain is AWGN:

\begin{matrix} W (\log x (i)) = W (\log s (i)) + W (\log g (i)), \end{matrix}

(9)

where W denotes the 3D DCWT operator in this paper. So, we can write

\begin{matrix} y (k) = w (k) + n (k), \end{matrix}

(10)

where w(k), y(k), and n(k) are, respectively, the kth noise-free 3D DCWT coefficients, noisy 3D DCWT coefficients, and noise in the 3D DCWT domain.

Recently, it has been reported [56–59] that nonhomomorphic techniques, which do not use this nonlinear operation and apply the wavelet transform directly to the speckled data, lead to unbiased estimation of the data and decrease the computational complexity. On this basis, after applying the 3D DCWT directly to the data, we have

\begin{matrix} W (x (i)) = W (s (i) g (i)) = W (s (i) + s (i) (g (i) - 1)) \\ = W (s (i)) + W (s (i) (g (i) - 1)) . \end{matrix}

(11)

Again we can write

\begin{matrix} y (k) = w (k) + n (k), \end{matrix}

(12)

where w(k), y(k), and n(k) are, respectively, the kth noise-free 3D DCWT coefficients, noisy 3D DCWT coefficients, and noise in the 3D DCWT domain. Since speckle noise g can be modeled as a unit-mean random process independent of the noise-free data, we would have E[W(s(g − 1))] = 0, and also it can be easily shown [58] that E[W(s)W(s(g − 1))] = 0 which means that w(k) and n(k) are zero-mean uncorrelated random variables. If w _p(k), y _p(k), and n _p(k) show the parent coefficients of w(k), y(k), and n(k), respectively, we can write

\begin{matrix} y_{p} (k) = w_{p} (k) + n_{p} (k) . \end{matrix}

(13)

Based on the persistence property, we need to have a bivariate model based on parent-child pairs. So, we can propose the following bivariate model:

\begin{matrix} \bar{y} (k) = \bar{w} (k) + \bar{n} (k), \end{matrix}

(14)

where $\bar{w} (k) = (w (k), w_{p} (k))$ , $\bar{y} (k) = (y (k), y_{p} (k))$ , and $\bar{n} (k) = (n (k), n_{p} (k))$ are, respectively, the kth parent-child pairs of noise-free 3D DCWT coefficients, noisy 3D DCWT coefficients, and additive noise in the 3D DCWT domain. In the literature, several models such as K-distribution, Rayleigh, Weibull, log-normal, and Nakagami distributions have been proposed [57, 58, 60–63] for speckle in image domain. In this paper, we test both AWGN and two-sided Rayleigh model for noise in wavelet domain as follows:

\begin{matrix} p_{\bar{n}} (\bar{n} (k)) = \frac{1}{2 π σ_{n}^{2}} \exp (- \frac{n_{1}^{2} (k) + n_{2}^{2} (k)}{2 σ_{n}^{2}}), \end{matrix}

(15)

\begin{matrix} p_{\bar{n}} (\bar{n} (k)) = \frac{| n_{1} (k) n_{2} (k) |}{4 α^{4}} \exp (- \frac{n_{1}^{2} (k) + n_{2}^{2} (k)}{2 α^{2}}), \end{matrix}

(16)

where σ _n ² = 2α ² denotes the noise variance.
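The relation σ _n ² = 2α ² can be checked numerically for one component of (16). The sketch below integrates the one-dimensional two-sided Rayleigh pdf |n|/(2α²) exp(−n²/(2α²)) with a midpoint rule. The function names, the value of α, and the integration range are assumptions made for the example.

```python
import math

def two_sided_rayleigh_pdf(n, alpha):
    """One-dimensional factor of the noise pdf (16)."""
    return abs(n) / (2.0 * alpha ** 2) * math.exp(-n * n / (2.0 * alpha ** 2))

def moment(pdf, order, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of x**order * pdf(x)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += (x ** order) * pdf(x) * h
    return total

alpha = 1.5  # illustrative value (assumed)
mass = moment(lambda n: two_sided_rayleigh_pdf(n, alpha), 0, -15.0, 15.0)
variance = moment(lambda n: two_sided_rayleigh_pdf(n, alpha), 2, -15.0, 15.0)

# mass should be close to 1 and variance close to 2 * alpha**2
print(mass, variance, 2.0 * alpha ** 2)
```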

Now our goal is to estimate $\bar{w} (k)$ from $\bar{y} (k) = \bar{w} (k) + \bar{n} (k)$ , where $\bar{n} (k)$ follows the Gaussian model (15) or the two-sided Rayleigh model (16).

If we employ the MMSE estimator for the estimation problem, we get the posterior mean as an optimal solution:

\begin{matrix} \hat{w} (k) = \iint w (k) p_{\bar{w} (k) ∣ \bar{y} (k)} (\bar{w} (k) ∣ \bar{y} (k)) d \bar{w} (k) \\ = \iint w (k) \frac{p_{\bar{y} (k) ∣ \bar{w} (k)} (\bar{y} (k) ∣ \bar{w} (k)) p_{\bar{w} (k)} (\bar{w} (k))}{p_{\bar{y} (k)} (\bar{y} (k))} d \bar{w} (k) \\ = \frac{\iint w (k) p_{\bar{y} (k) ∣ \bar{w} (k)} (\bar{y} (k) ∣ \bar{w} (k)) p_{\bar{w} (k)} (\bar{w} (k)) d \bar{w} (k)}{p_{\bar{y} (k)} (\bar{y} (k))} \\ = \frac{\iint w (k) p_{\bar{y} (k) ∣ \bar{w} (k)} (\bar{y} (k) ∣ \bar{w} (k)) p_{\bar{w} (k)} (\bar{w} (k)) d \bar{w} (k)}{\iint p_{\bar{y} (k) ∣ \bar{w} (k)} (\bar{y} (k) ∣ \bar{w} (k)) p_{\bar{w} (k)} (\bar{w} (k)) d \bar{w} (k)} \\ = \frac{\iint w (k) p_{\bar{n}} (\bar{y} (k) - \bar{w} (k)) p_{\bar{w} (k)} (\bar{w} (k)) d \bar{w} (k)}{\iint p_{\bar{n}} (\bar{y} (k) - \bar{w} (k)) p_{\bar{w} (k)} (\bar{w} (k)) d \bar{w} (k)} . \end{matrix}

(17)

3.1. Denoising Based on Modeling Noise-Free Data by Bivariate Gaussian PDF with Local Variance

In order to solve (17), we must know the prior distribution of the 3D DCWT coefficients, that is, $p_{\bar{w} (k)} (\bar{w} (k))$ . Defining $Gauss (x, σ) := \exp (- x^{2} / (2 σ^{2})) / (σ \sqrt{2 π})$ , if we suppose that w(k) and w _p(k) are independent zero-mean Gaussian random variables with standard deviations σ(k) and σ _p(k), the following bivariate Gaussian pdf with local variances can be proposed for the noise-free wavelet coefficients:

\begin{matrix} p_{\bar{w} (k)} (\bar{w} (k)) = p_{w (k)} (w (k)) \cdot p_{w_{p} (k)} (w_{p} (k)) \\ = Gauss (w (k), σ (k)) \cdot Gauss (w_{p} (k), σ_{p} (k)) \\ \Rightarrow p_{\bar{w} (k)} (\bar{w} (k)) \\ = \exp [- \frac{w^{2} (k)}{2 σ^{2} (k)} - \frac{w_{p}^{2} (k)}{2 σ_{p}^{2} (k)}] \\ \times {(2 π σ (k) σ_{p} (k))}^{- 1} . \end{matrix}

(18)

In this case, w(k) and w _p(k) are uncorrelated and independent, and therefore the MMSE estimator of w(k), w _p(k) yields the shrinkage function corresponding to univariate Gaussian pdf, that is, Wiener filter [21] as follows:

\begin{matrix} \hat{w} (k) = \frac{\iint w (k) p_{\bar{n}} (\bar{y} (k) - \bar{w} (k)) p_{\bar{w} (k)} (\bar{w} (k)) d \bar{w} (k)}{\iint p_{\bar{n}} (\bar{y} (k) - \bar{w} (k)) p_{\bar{w} (k)} (\bar{w} (k)) d \bar{w} (k)} \\ = y (k) \frac{σ^{2} (k)}{σ^{2} (k) + σ_{n}^{2}} . \end{matrix}

(19)

And so we can write

\begin{matrix} \hat{\bar{w}} (k) = (\frac{y (k) σ^{2} (k)}{σ^{2} (k) + σ_{n}^{2}}, \frac{y_{p} (k) σ_{p}^{2} (k)}{σ_{p}^{2} (k) + σ_{n}^{2}}) . \end{matrix}

(20)
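The shrinkage rules (19) and (20) are simple enough to state directly in code. The sketch below is a minimal illustration; the function names are hypothetical, and the arguments are standard deviations (so they are squared inside).

```python
def wiener_shrink(y, sigma, sigma_n):
    """Equation (19): scale the noisy coefficient by sigma^2 / (sigma^2 + sigma_n^2)."""
    return y * sigma ** 2 / (sigma ** 2 + sigma_n ** 2)

def wiener_shrink_pair(y, y_p, sigma, sigma_p, sigma_n):
    """Equation (20): the same rule applied to the child and to its parent."""
    return (wiener_shrink(y, sigma, sigma_n), wiener_shrink(y_p, sigma_p, sigma_n))

# When the signal and noise variances are equal the coefficient is halved.
print(wiener_shrink(2.0, 1.0, 1.0))
```

The gain approaches 1 when the local signal variance dominates the noise variance and approaches 0 when the noise dominates, which is why the quality of the local variance estimate matters so much.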

Similarly, if we choose two-sided Rayleigh pdf for noise distribution, the following estimator is obtained [44]:

\begin{matrix} \hat{w} (k) = 2 z (k) \sqrt{2} (2 - \frac{σ^{2} (k)}{α^{2}}) + \sqrt{\frac{π}{2}} (1 - 2 \frac{σ^{2} (k) z^{2} (k)}{α^{2}}) \\ \times (erfc x (z (k)) - erfc x (- z (k))) \\ \times (\sqrt{\frac{1}{α^{2}} + \frac{1}{σ^{2} (k)}} (2 + z (k) \sqrt{π} erfc x (- z (k)) \\ {- z (k) \sqrt{π} erfc x (z (k))))}^{- 1}, \end{matrix}

(21)

where

\begin{matrix} z (k) = \frac{y (k)}{σ^{2} (k)} \sqrt{\frac{1}{2 / α^{2} + 2 / σ^{2} (k)}}, \\ erfc x (u) = \frac{2}{\sqrt{π}} \int_{0}^{\infty} e^{- t^{2} - 2 t u} d t . \end{matrix}

(22)
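The integral in (22) is the scaled complementary error function, which equals exp(u²) erfc(u). The sketch below compares this closed form with a direct midpoint-rule evaluation of the integral. The function names, the integration limit, and the step count are assumptions; the closed form as written overflows for large negative u, so a production implementation would use a numerically stable routine instead.

```python
import math

def erfcx(u):
    """Closed form of (22): exp(u^2) * erfc(u). Not safe for large |u|."""
    return math.exp(u * u) * math.erfc(u)

def erfcx_integral(u, upper=12.0, steps=40000):
    """Midpoint-rule evaluation of (2 / sqrt(pi)) * integral of exp(-t^2 - 2 t u) over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.exp(-t * t - 2.0 * t * u) * h
    return 2.0 / math.sqrt(math.pi) * total

for u in (0.0, 0.5, -1.0):
    print(u, erfcx(u), erfcx_integral(u))
```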

And so, with z _p(k) defined as in (22) using y _p(k) and σ _p(k) in place of y(k) and σ(k), we can write

\begin{matrix} \hat{\bar{w}} (k) = ((2 z (k) \sqrt{2} (2 - \frac{σ^{2} (k)}{α^{2}}) \\ + \sqrt{\frac{π}{2}} (1 - 2 \frac{σ^{2} (k) z^{2} (k)}{α^{2}}) \\ \times (erfc x (z (k)) - erfc x (- z (k)))) \\ \times (\sqrt{\frac{1}{α^{2}} + \frac{1}{σ^{2} (k)}} \\ \times (2 + z (k) \sqrt{π} erfc x (- z (k)) \\ {- z (k) \sqrt{π} erfc x (z (k))))}^{- 1}, \\ (2 z_{p} (k) \sqrt{2} (2 - \frac{σ_{p}^{2} (k)}{α^{2}}) \\ + \sqrt{\frac{π}{2}} (1 - 2 \frac{σ_{p}^{2} (k) z_{p}^{2} (k)}{α^{2}}) \\ \times (erfc x (z_{p} (k)) - erfc x (- z_{p} (k)))) \\ \times (\sqrt{\frac{1}{α^{2}} + \frac{1}{σ_{p}^{2} (k)}} \\ \times (2 + z_{p} (k) \sqrt{π} erfc x (- z_{p} (k)) \\ {- z_{p} (k) \sqrt{π} erfc x (z_{p} (k))))}^{- 1}) . \end{matrix}

(23)

Suppose that the input noise variance is known. To implement (20) or (23), we must know the prior parameter σ(k) (we suppose that σ(k) = σ _p(k)). Mihçak et al. [43] showed that using a local variance (instead of a global variance) for the Wiener filter leads to a substantial improvement in denoising results, because the local variance incorporates the local statistics of the image into the prior. It has been shown in the literature that the accuracy of the variance estimate is an important factor in denoising [23, 27, 34, 42–46]. Thus, the criteria used for estimating the variance, such as the data involved in the estimation (e.g., in some approaches the coarser scales are used as a source of prior information), the type of estimator, and the shape and size of the window used for local estimation, play key roles in the performance of the denoising procedure. For example, in [54] a recurrence equation using a local Gaussian pdf is used to estimate σ(k), and in [64] the variable size of the locally adaptive window is obtained using a region-based approach. However, for each data point $\bar{y} (k)$ , a simple estimate of σ(k) can be formed from a local neighborhood N(k). In the simplest case, we can use a square window N(k) centered at $\bar{y} (k)$ and suppose that the variance is approximately constant in this window. Then, an empirical estimate of σ(k) can be obtained as follows:

\begin{matrix} {\hat{σ}}^{2} (k) = \frac{1}{2 M} \sum_{j \in N (k)} (y^{2} (j) + y_{p}^{2} (j)) - σ_{n}^{2}, \end{matrix}

(24)

where M is the number of coefficients in N(k) and σ _n can be estimated by [4] σ _n = median{|noisy wavelet coefficients in finest scale|}/0.6745. This estimate uses the coarser scale as a source of prior information, but another estimate can be obtained using only the spatially adjacent coefficients in the same scale. It has been shown [47] that the local features at the edges of images are not isotropic and so can be better modeled with shape-adaptive window selection. We discuss this in Section 4, where we try to improve the denoising results by using an anisotropic window instead of a square window for the estimation of local parameters (such as the variance in (24)).
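The two estimates above can be sketched as follows. The function names are hypothetical, the windows are passed as plain lists of child and parent coefficients, and clipping negative values of (24) to zero is an added assumption (the formula itself can go negative when the window energy falls below the noise variance).

```python
import statistics

def estimate_noise_sigma(finest_coeffs):
    """Median-based noise estimate: median(|coefficients in finest scale|) / 0.6745."""
    return statistics.median([abs(c) for c in finest_coeffs]) / 0.6745

def local_variance(y_window, yp_window, sigma_n):
    """Equation (24) over one window N(k), clipped at zero (clipping is an assumption)."""
    m = len(y_window)
    energy = sum(y * y + yp * yp for y, yp in zip(y_window, yp_window))
    return max(energy / (2.0 * m) - sigma_n ** 2, 0.0)

# Toy values (assumed) to show the arithmetic of (24): 16 / 4 - 1 = 3.
print(local_variance([2.0, -2.0], [2.0, -2.0], 1.0))
```

Replacing the square window by an anisotropic one only changes which coefficients are passed in as `y_window` and `yp_window`.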

3.2. Denoising Based on Modeling Noise-Free Data by a Mixture of Bivariate Gaussian PDFs with Local Parameters

A nonlinear shrinkage function for wavelet-based denoising, which is derived by assuming that the noise-free wavelet coefficients follow a bivariate Gaussian mixture model with local parameters given by (1), is introduced in this section. Substituting (1) in (17), we can write

\begin{matrix} \hat{w} (k) = (\iint w (k) p_{\bar{n}} (\bar{y} (k) - \bar{w} (k)) \\ \times [a (k) p_{1} (\bar{w} (k)) \\ + (1 - a (k)) p_{2} (\bar{w} (k))] d \bar{w} (k)) \\ \times (\iint p_{\bar{n}} (\bar{y} (k) - \bar{w} (k)) \\ \times [a (k) p_{1} (\bar{w} (k)) \\ {+ (1 - a (k)) p_{2} (\bar{w} (k))] d \bar{w} (k))}^{- 1} \\ = \frac{a (k) \iint w (k) p_{\bar{n}} (\bar{y} (k) - \bar{w} (k)) p_{1} (\bar{w} (k)) d \bar{w} (k)}{a (k) g_{1} (\bar{y} (k)) + (1 - a (k)) g_{2} (\bar{y} (k))} \\ + ((1 - a (k)) \\ \times \iint w (k) p_{\bar{n}} (\bar{y} (k) - \bar{w} (k)) p_{2} (\bar{w} (k)) d \bar{w} (k)) \\ \times {(a (k) g_{1} (\bar{y} (k)) + (1 - a (k)) g_{2} (\bar{y} (k)))}^{- 1}, \end{matrix}

(25)

where

\begin{matrix} g_{i} (\bar{y} (k)) = \iint p_{\bar{n}} (\bar{y} (k) - \bar{w} (k)) p_{i} (\bar{w} (k)) d \bar{w} (k), \\ i = 1,2 . \end{matrix}

(26)

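In fact, $g_{i} (\bar{y} (k))$ is the 2D convolution of the noise pdf $p_{\bar{n}}$ (defined in (15) or (16)) with the mixture component p _i (defined in (1)). When (15) is used as the noise model, both $p_{\bar{n}}$ and p _i are bivariate Gaussian pdfs, so their convolution is again a bivariate Gaussian pdf whose variances are the sums of the noise and component variances. So, we obtain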

\begin{matrix} g_{i} (\bar{y} (k)) = \exp (- (1 / 2) \\ \times (y^{2} (k) / (σ_{n}^{2} + σ_{i 1}^{2} (k)) \\ + y_{p}^{2} (k) / (σ_{n}^{2} + σ_{i 2}^{2} (k)))) \\ \times {(2 π \sqrt{(σ_{n}^{2} + σ_{i 1}^{2} (k)) (σ_{n}^{2} + σ_{i 2}^{2} (k))})}^{- 1} \\ i = 1,2 . \end{matrix}

(27)

For two-sided Rayleigh noise (16), more computations are needed. After some simplifications, the final formula would be

\begin{matrix} g_{i} (\bar{y} (k)) \\ = \frac{\exp (- y^{2} (k) / 2 σ_{i 1}^{2} (k) - y_{p}^{2} (k) / 2 σ_{i 2}^{2} (k))}{8 π (1 + σ_{i 1}^{2} (k) / α^{2}) (1 + σ_{i 2}^{2} (k) / α^{2}) σ_{i 1} (k) σ_{i 2} (k)} \\ \times (2 + z_{i} (k) \sqrt{π} erfc x (- z_{i} (k)) \\ - z_{i} (k) \sqrt{π} erfc x (z_{i} (k))) \\ \times (2 + z_{i p} (k) \sqrt{π} erfc x (- z_{i p} (k)) \\ - z_{i p} (k) \sqrt{π} erfc x (z_{i p} (k))), \\ i = 1,2, \end{matrix}

(28)

where

\begin{matrix} z_{i} (k) = \frac{y (k)}{σ_{i 1}^{2} (k)} \sqrt{\frac{1}{2 / α^{2} + (2 / σ_{i 1}^{2} (k))}}, i = 1,2, \\ z_{i p} (k) = \frac{y_{p} (k)}{σ_{i 2}^{2} (k)} \sqrt{\frac{1}{2 / α^{2} + (2 / σ_{i 2}^{2} (k))}}, i = 1,2 . \end{matrix}

(29)

Using (19), we can obtain numerators of (25), and finally (25) for AWGN can be written as

\begin{matrix} \hat{w} (k) = (σ_{11}^{2} (k) / (σ_{11}^{2} (k) + σ_{n}^{2}) \\ + R (\bar{y} (k)) (σ_{21}^{2} (k) / (σ_{21}^{2} (k) + σ_{n}^{2}))) \\ {\times (1 + R (\bar{y} (k)))}^{- 1} y (k), \end{matrix}

(30)

where

\begin{matrix} R (\bar{y} (k)) = (((1 - a (k)) \\ \times \exp (- \frac{1}{2} \\ \times (\frac{y^{2} (k)}{σ_{n}^{2} + σ_{21}^{2} (k)} + \frac{y_{p}^{2} (k)}{σ_{n}^{2} + σ_{22}^{2} (k)}))) \\ \times {(\sqrt{(σ_{n}^{2} + σ_{21}^{2} (k)) (σ_{n}^{2} + σ_{22}^{2} (k))})}^{- 1}) \\ \times ((a (k) \exp (- \frac{1}{2} (\frac{y^{2} (k)}{σ_{n}^{2} + σ_{11}^{2} (k)} \\ + \frac{y_{p}^{2} (k)}{σ_{n}^{2} + σ_{12}^{2} (k)}))) \\ {\times {(\sqrt{(σ_{n}^{2} + σ_{11}^{2} (k)) (σ_{n}^{2} + σ_{12}^{2} (k))})}^{- 1})}^{- 1} . \end{matrix}

(31)

We call the new obtained bivariate local shrinkage function as BiGaussMixShrinkL. Figure 4 shows this shrinkage function with sample constant parameters.

A shrinkage function produced from BiGaussMixShrink for sample parameters.
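The AWGN shrinkage rule in (27), (30), and (31) can be sketched in a few lines of Python. This is only an illustrative sketch for a single coefficient with given parameters; the function and argument names are hypothetical, and the posterior weight is computed as a g₁ / (a g₁ + (1 − a) g₂), which equals 1 / (1 + R) with R taken from (31).

```python
import math


def gaussian_component(y, y_p, var1, var2, noise_var):
    """Eq. (27): density of the noisy pair (y, y_p) under one mixture component."""
    s1 = noise_var + var1
    s2 = noise_var + var2
    expo = -0.5 * (y * y / s1 + y_p * y_p / s2)
    return math.exp(expo) / (2.0 * math.pi * math.sqrt(s1 * s2))


def bigauss_mix_shrink(y, y_p, a, var11, var12, var21, var22, noise_var):
    """Eqs. (30)-(31): estimate of the noise-free coefficient from (y, y_p).

    var11, var12 are the variances of component 1 (coefficient, parent);
    var21, var22 are those of component 2. All names are illustrative.
    """
    g1 = gaussian_component(y, y_p, var11, var12, noise_var)
    g2 = gaussian_component(y, y_p, var21, var22, noise_var)
    # Posterior weight of component 1, i.e. 1 / (1 + R) in Eq. (30).
    # Note: this sketch does not guard against both densities underflowing to 0.
    post1 = a * g1 / (a * g1 + (1.0 - a) * g2)
    gain1 = var11 / (var11 + noise_var)  # Wiener-type gain of component 1
    gain2 = var21 / (var21 + noise_var)  # Wiener-type gain of component 2
    return (post1 * gain1 + (1.0 - post1) * gain2) * y
```

With a narrow and a wide component, small observations are shrunk mostly by the gain of the narrow component and large observations mostly by the gain of the wide one, which is the qualitative behaviour expected of such a shrinkage curve.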

Similarly, using (21), we can obtain numerators of (25), and finally (25) for two-sided Rayleigh noise is obtained as

\begin{matrix} \hat{w} (k) = \frac{1}{1 + R (\bar{y} (k))} \times \frac{2 z_{1} (k) \sqrt{2} \left( 2 - \frac{σ_{11}^{2} (k)}{α^{2}} \right) + \sqrt{\frac{π}{2}} \left( 1 - \frac{σ_{11}^{2} (k) z_{1}^{2} (k)}{α^{2}} \right) \left( \mathrm{erfcx} (z_{1} (k)) - \mathrm{erfcx} (- z_{1} (k)) \right)}{\sqrt{\frac{1}{α^{2}} + \frac{1}{σ_{11}^{2} (k)}} \left( 2 + z_{1} (k) \sqrt{π} \, \mathrm{erfcx} (- z_{1} (k)) - z_{1} (k) \sqrt{π} \, \mathrm{erfcx} (z_{1} (k)) \right)} \\ + \frac{R (\bar{y} (k))}{1 + R (\bar{y} (k))} \times \frac{2 z_{2} (k) \sqrt{2} \left( 2 - \frac{σ_{21}^{2} (k)}{α^{2}} \right) + \sqrt{\frac{π}{2}} \left( 1 - \frac{σ_{21}^{2} (k) z_{2}^{2} (k)}{α^{2}} \right) \left( \mathrm{erfcx} (z_{2} (k)) - \mathrm{erfcx} (- z_{2} (k)) \right)}{\sqrt{\frac{1}{α^{2}} + \frac{1}{σ_{21}^{2} (k)}} \left( 2 + z_{2} (k) \sqrt{π} \, \mathrm{erfcx} (- z_{2} (k)) - z_{2} (k) \sqrt{π} \, \mathrm{erfcx} (z_{2} (k)) \right)}, \end{matrix}

(32)

where

\begin{matrix} R (\bar{y} (k)) = \frac{1 - a (k)}{a (k)} \times \frac{\left( 1 + σ_{11}^{2} (k) / α^{2} \right) \left( 1 + σ_{12}^{2} (k) / α^{2} \right) σ_{11} (k) σ_{12} (k)}{\left( 1 + σ_{21}^{2} (k) / α^{2} \right) \left( 1 + σ_{22}^{2} (k) / α^{2} \right) σ_{21} (k) σ_{22} (k)} \\ \times \frac{\exp \left( - \frac{y^{2} (k)}{2 σ_{21}^{2} (k)} - \frac{y_{p}^{2} (k)}{2 σ_{22}^{2} (k)} \right)}{\exp \left( - \frac{y^{2} (k)}{2 σ_{11}^{2} (k)} - \frac{y_{p}^{2} (k)}{2 σ_{12}^{2} (k)} \right)} \\ \times \frac{\left( 2 + z_{2} (k) \sqrt{π} \, \mathrm{erfcx} (- z_{2} (k)) - z_{2} (k) \sqrt{π} \, \mathrm{erfcx} (z_{2} (k)) \right) \left( 2 + z_{2p} (k) \sqrt{π} \, \mathrm{erfcx} (- z_{2p} (k)) - z_{2p} (k) \sqrt{π} \, \mathrm{erfcx} (z_{2p} (k)) \right)}{\left( 2 + z_{1} (k) \sqrt{π} \, \mathrm{erfcx} (- z_{1} (k)) - z_{1} (k) \sqrt{π} \, \mathrm{erfcx} (z_{1} (k)) \right) \left( 2 + z_{1p} (k) \sqrt{π} \, \mathrm{erfcx} (- z_{1p} (k)) - z_{1p} (k) \sqrt{π} \, \mathrm{erfcx} (z_{1p} (k)) \right)} . \end{matrix}

(33)

We call this bivariate local shrinkage function BiGaussRayMixShrinkL. Figure 5 shows this shrinkage function with sample constant parameters.

A shrinkage function produced from BiGaussRayMixShrink for sample parameters.
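As a rough illustration of how the building blocks of (28) and (29) might be evaluated, the following Python sketch computes z and the component g for given parameters. It assumes that the special function appearing in (28) is the scaled complementary error function erfcx(z) = exp(z²) erfc(z); the naive form used here overflows for large |z|, and all function names are hypothetical.

```python
import math


def erfcx(z):
    """Scaled complementary error function exp(z^2) * erfc(z), naive form."""
    return math.exp(z * z) * math.erfc(z)


def z_value(y, var, alpha):
    """Eq. (29): argument z for an observation y with prior variance var."""
    return (y / var) * math.sqrt(1.0 / (2.0 / alpha ** 2 + 2.0 / var))


def bracket(z):
    """The factor 2 + z*sqrt(pi)*erfcx(-z) - z*sqrt(pi)*erfcx(z) of Eq. (28)."""
    root_pi = math.sqrt(math.pi)
    return 2.0 + z * root_pi * erfcx(-z) - z * root_pi * erfcx(z)


def rayleigh_component(y, y_p, var1, var2, alpha):
    """Eq. (28): g_i for one mixture component under two-sided Rayleigh noise."""
    z = z_value(y, var1, alpha)
    z_p = z_value(y_p, var2, alpha)
    expo = math.exp(-y * y / (2.0 * var1) - y_p * y_p / (2.0 * var2))
    denom = (8.0 * math.pi
             * (1.0 + var1 / alpha ** 2) * (1.0 + var2 / alpha ** 2)
             * math.sqrt(var1 * var2))  # sqrt(var1 * var2) = sigma_i1 * sigma_i2
    return expo / denom * bracket(z) * bracket(z_p)
```

Because erfcx(−z) − erfcx(z) = 2 exp(z²) erf(z), the bracketed factor is an even function of z and equals 2 at z = 0; a numerically robust implementation would use a dedicated erfcx routine instead of the naive product.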

For implementation of our denoising algorithm, we must estimate the parameters σ_ij(k), i, j = 1,2, and a(k) (which describe the noise-free data) from the noisy observation. For AWGN, the noisy observation follows a Gaussian mixture model with parameters a(k) and $\sqrt{σ_{n}^{2} + σ_{i j}^{2} (k)}$ for i, j = 1,2. So, the following local EM algorithm is used to obtain the parameters.

E-step

\begin{matrix} r_{1} (k) ⟵ \frac{a (k) g_{1} (\bar{y} (k))}{a (k) g_{1} (\bar{y} (k)) + (1 - a (k)) g_{2} (\bar{y} (k))}, \\ r_{2} (k) ⟵ 1 - r_{1} (k) . \end{matrix}

(34)

M-step

\begin{matrix} a (k) ⟵ \frac{1}{M} \sum_{j \in N (K)} r_{1} (j), \end{matrix}

(35)

\begin{matrix} σ_{i1}^{2} (k) ⟵ \frac{\sum_{j \in N (K)} r_{i} (j) y^{2} (j)}{\sum_{j \in N (K)} r_{i} (j)} - σ_{n}^{2}, \quad i = 1,2, \end{matrix}

(36)

\begin{matrix} σ_{i2}^{2} (k) ⟵ \frac{\sum_{j \in N (K)} r_{i} (j) y_{p}^{2} (j)}{\sum_{j \in N (K)} r_{i} (j)} - σ_{n}^{2}, \quad i = 1,2, \end{matrix}

(37)

where M is the number of coefficients in the window N(K) centered at $\bar{y} (k)$. As discussed in the literature [40], for non-Gaussian mixture models, which is the case for two-sided Rayleigh noise, iterating (34)–(37) still converges to the final results.
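The local EM iteration for the AWGN case can be sketched as follows for a single window. This is a minimal sketch under several assumptions that are not taken from the paper: the window is passed as a list of (y, y_p) pairs, the initialisation and the fixed iteration count are arbitrary, the loop tracks the noisy variances σ_n² + σ_ij² and subtracts σ_n² only at the end, and negative variance estimates are clipped to zero.

```python
import math


def component(y, y_p, s1, s2):
    """Zero-mean bivariate Gaussian density with variances s1 (y) and s2 (y_p)."""
    expo = -0.5 * (y * y / s1 + y_p * y_p / s2)
    return math.exp(expo) / (2.0 * math.pi * math.sqrt(s1 * s2))


def local_em(window, noise_var, n_iter=20, floor=1e-6):
    """Estimate a and the noise-free variances from one window of (y, y_p) pairs.

    Returns (a, var) with var[i] = [sigma_i1^2, sigma_i2^2] for component i.
    """
    m = len(window)
    e_y = sum(y * y for y, _ in window) / m
    e_p = sum(p * p for _, p in window) / m
    # Hypothetical initialisation: one narrow and one wide component.
    a = 0.5
    noisy = [[0.5 * e_y + floor, 0.5 * e_p + floor],
             [2.0 * e_y + floor, 2.0 * e_p + floor]]
    for _ in range(n_iter):
        # E-step, Eq. (34): responsibility of component 1 for every sample.
        r1 = []
        for y, p in window:
            n1 = a * component(y, p, noisy[0][0], noisy[0][1])
            n2 = (1.0 - a) * component(y, p, noisy[1][0], noisy[1][1])
            r1.append(n1 / (n1 + n2) if n1 + n2 > 0.0 else 0.5)
        resp = [r1, [1.0 - r for r in r1]]
        # M-step, Eq. (35): mixing weight.
        a = sum(r1) / m
        # M-step, Eqs. (36)-(37): responsibility-weighted second moments.
        for i in range(2):
            total = max(sum(resp[i]), floor)
            m_y = sum(r * y * y for r, (y, _) in zip(resp[i], window)) / total
            m_p = sum(r * p * p for r, (_, p) in zip(resp[i], window)) / total
            noisy[i] = [max(m_y, floor), max(m_p, floor)]
    # Remove the noise variance to obtain noise-free variances, clipped at zero.
    var = [[max(v - noise_var, 0.0) for v in comp] for comp in noisy]
    return a, var
```

In a full implementation this routine would be run for every coefficient k over its own window, which is why the window shape discussed in the next section matters for the estimates.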

Our denoising algorithm is summarized in Algorithm 1.

4. Shape Adaptive Windows Selection

It has been shown that using an anisotropic, shape-adaptive window for local parameter estimation can substantially improve modeling and processing results. For example, in [47] an image denoising method is introduced that constructs an anisotropic window around each pixel of the image and obtains the denoised pixel using only the data located in that window. Compared with denoising methods that use an isotropic window around each pixel (e.g., [23, 27, 34, 42–46]), the method in [47] is able to segment the image into rather smooth regions before denoising, owing to the anisotropic window selection, which improves the denoising results. As explained before, the mixture model parameters in each subband are estimated locally using an isotropic window around each voxel. In this section, we first explain the structure of macular OCT and then introduce the 3D “linear polynomial approximation-intersection confidence interval” (LPA-ICI) method for shape-adaptive window selection around each voxel in the 3D DCWT domain. We thus try to improve the despeckling results in the 3D DCWT domain by choosing an anisotropic (instead of isotropic) window for estimating the parameters of the mixture model locally in each subband.

4.1. OCT Structure

To select the shape-adaptive window, we must take a look at the special structure of OCT data. In ophthalmology, the OCT data shows detailed images from within the retina. The automated analysis of OCT images can be used for the image-guided retinal therapy. Every year, many people become blind as a result of age-related macular degeneration (AMD) due to affecting the central retina where our central vision is perceived. The most sight-threatening form of AMD is called exudative or wet AMD. Choroidal neovascularization (CNV) is a common symptom of the degenerative maculopathy wet AMD. A wealth of powerful new treatments for CNV, especially anti-VEGF agents, have become available very recently to restore normal visual function. The risk of ocular adverse events, including the devastating intraocular infection, endophthalmitis, increases with repeated intravitreal treatment injections, and the effects of chronic treatment with anti-VEGF agents on the retina are unknown. Ideally a more cost-effective, patient-specific dosing strategy with the minimally necessary number of anti-VEGF injections is required. With all the promise, these novel treatments will only reach their full potential when objective and early indices of treatment response are developed. Prior to the introduction of retinal OCT imaging, clinical assessment of whether the preservation or restoration of visual function is successful, which indeed is the ultimate goal of treatment, could only be obtained by measuring visual function. Unfortunately, visual function lags structural response and is cumbersome and noisy, and its reproducibility is limited. Two-dimensional OCT imaging of the retina was introduced several years ago, and was rapidly adopted, among others, to qualitatively measure macular structure as an indicator of AMD treatment response and for guidance of retreatment in CNV recurrence. 
It is now becoming clear that these simplified structural measures, though leading indicators of visual function, are inadequate, as they are based on a simplified interpretation of single transverse slices of the macula: some patients do not recover visual function even though their total macular thickness has become normally thin after treatment, and others paradoxically gain visual acuity while their macula is still thickened.

True 3D spectral OCT imaging, which became available in 2007, is fast (1.5 s per volume scan), allows full 3D retinal coverage at a much higher resolution, and offers improved imaging of subtle differences in retinal structure. In recent years [65, 66], 3D analysis of 3D OCT as an improved measure of subtle macular structure has been proposed, motivated by hypotheses such as the following: a model of retinal response to initial anti-VEGF treatment for CNV, based on quantitative 3D OCT-derived measures, can predict the timing of retreatment.

On this basis, developing analysis methods and approaches for 3D spectral OCT image analysis in the presence of wet AMD pathology (Symptomatic Exudate Associated Derangements or SEADs, also known as AMD-related cysts, vessel leakages, etc.) and assessing their performance by comparison to expert analyses are of utmost interest. Another interesting subject is determining how well the quantitative SEAD- and layer-derived measures from 3D OCT predict the patient-specific outcome parameters in response to postinduction anti-VEGF treatment in patients with CNV, in order to predict the timing of retreatment.

Figure 6 shows several sample macular OCTs and detected SEADs by an expert as the region of interest (ROI). As we can see in this figure, the most important information of OCT data (about retina layers) is located in the center of OCT images.

4.2. 3D LPA-ICI for Data between the First and Last Layers

In [47], an image denoising method based on an anisotropic window around each pixel of the image is introduced. To select the anisotropic window, linear directional filters g_{h,θ}, obtained using local polynomial approximation (LPA), are employed. Here θ indicates the direction of the filter and is a member of the countable set {θ₁, θ₂,…, θ_L}, where L is the number of directions. A common choice is L = 8, which results in the set {0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°}. For each θ, the length of the window is selected from the countable and increasing set {h₁, h₂,…, h_J}. So, for the noisy observation y(k), we have the following estimate:

\begin{matrix} x_{h, θ}^{est} (k) = g_{h, θ} (k) * y (k) . \end{matrix}

(38)

For each θ and k, an appropriate value of h called h ⁺ is estimated using the nonlinear intersection of confidence intervals (ICI) rule. h ⁺ is the largest h from the h ₁ < h ₂ < ⋯<h _J provided that the estimated data using h ⁺ does not have noticeable difference with the estimated data with smaller h's. For this reason, the following confidence intervals are defined:

\begin{matrix} C_{s} = [x_{h_{s, θ}}^{est} (k) - R σ_{x_{h_{s, θ}}^{est} (k)}, x_{h_{s, θ}}^{est} (k) + R σ_{x_{h_{s, θ}}^{est} (k)}], \end{matrix}

(39)

where R is the smoothing parameter and σ _{x_{h_s,θ}^est(k)} is the standard deviation of x _{h_s,θ} ^est(k), whose variance is obtained as follows:

\begin{matrix} σ_{x_{h_{s, θ}}^{est} (k)}^{2} = \int P_{x_{h_{s, θ}}^{est}} (f) d f = \int P_{y} (f) {| G_{h, θ} (f) |}^{2} d f, \end{matrix}

(40)

where P(·) shows the power spectral density function and G _h,θ(f) is the Fourier transform of g _h,θ, and for a white random process, (40) is simplified to

\begin{matrix} σ_{x_{h_{s, θ}}^{est} (k)}^{2} = \int σ_{n}^{2} {| G_{h, θ} (f) |}^{2} d f = σ_{n}^{2} \sum_{k} g_{h, θ}^{2} (k) . \end{matrix}

(41)

According to the ICI rule, D _s is defined using the following formula:

\begin{matrix} D_{s} = ⋂_{i = 1}^{s} C_{i} . \end{matrix}

(42)

The largest s for which D _s is nonempty is called s ⁺, and finally h ⁺(k, θ) is obtained using h ⁺(k, θ) = h _s⁺.
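The ICI rule of (39) and (42) can be illustrated with a small 1D Python sketch. It assumes zero-order LPA, so that each directional estimate is a plain average over h samples, and it uses the standard result that such an average of white noise with standard deviation σ_n has standard deviation σ_n/√h; the function names, the toy signal, and the value R = 2 are illustrative choices only.

```python
import math


def directional_means(signal, k, lengths):
    """Zero-order LPA estimates at position k: averages over h samples to the right."""
    return [sum(signal[k:k + h]) / h for h in lengths]


def averaging_std(sigma_n, h):
    """Standard deviation of a length-h average of white noise with std sigma_n."""
    return sigma_n / math.sqrt(h)


def ici_select(estimates, sigmas, r=2.0):
    """Return the index s+ of the largest scale whose interval C_s still
    intersects the intervals of all smaller scales (Eqs. (39) and (42))."""
    lower, upper = float("-inf"), float("inf")
    best = 0
    for s, (x, sd) in enumerate(zip(estimates, sigmas)):
        lower = max(lower, x - r * sd)  # largest lower bound seen so far
        upper = min(upper, x + r * sd)  # smallest upper bound seen so far
        if lower > upper:               # D_s is empty: keep the previous scale
            break
        best = s
    return best


# Toy example: a step edge four samples to the right of position k = 0.
signal = [0.0] * 4 + [10.0] * 8
lengths = [1, 2, 4, 8]
estimates = directional_means(signal, 0, lengths)
sigmas = [averaging_std(1.0, h) for h in lengths]
h_plus = lengths[ici_select(estimates, sigmas)]
```

In this toy case the window grows until it would cross the edge, so the selected length stops at the largest h that stays on one side of the discontinuity.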

Figure 7 shows an example of mentioned anisotropic window selection for a SEAD.

The red line shows the SEAD detected by an expert. The yellow circles show isotropic windows with various radii. The green line illustrates the anisotropic window obtained with the LPA-ICI rule.

Since applying LPA-ICI in each subband is a time-consuming process, a fast version of the algorithm can be obtained by applying LPA-ICI only to the low-pass subbands, using L = 12 with an offset of 15°, which results in the set {15°, 45°, 75°, 105°, 135°, 165°, 195°, 225°, 255°, 285°, 315°, 345°} in the 2D case. Since, in this case, each subband extracts the information concentrated in a specific direction corresponding to {15° (195°), 45° (225°), 75° (255°), 105° (285°), 135° (315°), 165° (345°)}, the value h ⁺(k, θ) = h _s⁺ obtained by applying LPA-ICI to the corresponding low-pass subband is used for estimating the local parameters of the kth pixel. For example, suppose that the DCWT is used with 3 scales and we want to calculate the local parameters of the coarsest scale for the oriented real subband around 45° (225°). For this purpose, h ⁺(k, 45°) and h ⁺(k, 225°) are extracted from the results of applying LPA-ICI to the LL subband of the real part (or imaginary part) of the DCWT. Then, if we are in the jth scale, only 2^{j−1} h ⁺(k, 45°) pixels in the direction of 45° and 2^{j−1} h ⁺(k, 225°) pixels in the direction of 225° are used to extract the local parameters of the kth pixel in this subband (Figure 8).

From left to right: imaginary LL subband of one slice of OCT data, the oriented (imaginary) subband around 45° (225°), h ⁺(·, 45°) for LL subband, and h ⁺(·, 225°) for LL subband extracted by applying LPA-ICI to the LL subband of imaginary part of DCWT. As indicated in the second image for p ₁ = (46,20), p ₂ = (47,21), and p ₃ = (46,22) we would have h ⁺(p ₁, 45°) = 2, h ⁺(p ₁, 225°) = 3 (green dash), h ⁺(p ₂, 45°) = 1, h ⁺(p ₂, 225°) = 3 (orange dash), and h ⁺(p ₃, 45°) = 3, h ⁺(p ₃, 225°) = 3 (red dash).

A similar approach can be used in the 3D case [67]. However, instead of the 2D direction θ _i, we use the 3D direction (θ _i, φ _i). As shown in Figure 9, in the 2D case we use a circular sector for each direction, while in the 3D case a conical body is produced for each direction (θ _i, φ _i), and the sphere is (partly) covered by these cones. Similar to the 2D case, g _h,θ,φ is defined, and for each (θ _i, φ _i) the best h, called h ⁺(k, θ _i, φ _i), is obtained using the ICI rule.

Comparison between a circular sector for direction θ in 2D case (b) with a conical body produced for direction (θ, φ) in 3D case (a).

Note that in order to incorporate the anisotropic window selection for each DCWT coefficient in our OCT denoising algorithm explained in Algorithm 1, instead of using a square window for parameter estimation, an anisotropic window is obtained for each coefficient k using the explained LPA-ICI method in this section, and only available data in this window are used for estimating a(k) and σ ₁₁(k), σ ₁₂(k), σ ₂₁(k), and σ ₂₂(k).
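To make the window construction concrete, the following Python sketch collects the 2D coordinates that an anisotropic window would contain, given a selected length h⁺ for each direction. It is a simplified illustration rather than the procedure of [47]: each direction is represented by a single line of pixels instead of a sector, and the function name, the dictionary input, and the convention that rows grow downwards are assumptions.

```python
import math


def anisotropic_window(center, h_plus, shape):
    """Collect pixel coordinates reachable from center along each direction.

    center: (row, col) of the coefficient k.
    h_plus: dict mapping a direction in degrees to its selected length h+.
    shape:  (rows, cols) of the subband; out-of-range coordinates are dropped.
    """
    r0, c0 = center
    coords = {(r0, c0)}
    for theta, h in h_plus.items():
        dr = -math.sin(math.radians(theta))  # image rows grow downwards
        dc = math.cos(math.radians(theta))
        scale = max(abs(dr), abs(dc))        # one pixel per step along the line
        dr, dc = dr / scale, dc / scale
        for step in range(1, h + 1):
            r = r0 + int(round(step * dr))
            c = c0 + int(round(step * dc))
            if 0 <= r < shape[0] and 0 <= c < shape[1]:
                coords.add((r, c))
    return sorted(coords)
```

The returned coordinates would then select the coefficients over which the local estimates of a(k) and σ_ij(k) are computed, in place of a square window.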

5. Experimental Results

In this section, we apply the proposed despeckling algorithm to OCT image noise reduction. For this purpose, we use 20 three-dimensional OCT datasets in the presence of wet AMD pathology (SEAD) and use the mean signal-to-noise ratio (MSNR) and the contrast-to-noise ratio (CNR) as two quality measures for OCT data. To calculate these measures, we must define the region of interest (ROI). In this paper, we select this region within the SEAD as illustrated in Figure 10. The base MSNR and CNR are defined as follows:

\begin{matrix} MSNR_{ROI} = \frac{μ_{ROI}}{σ}, \\ CNR = | MSNR_{ROI1} - MSNR_{ROI2} |, \end{matrix}

(43)

where μ _ROI shows the mean of ROI and σ indicates the standard deviation of a large region outside the ROI (noise ROI in Figure 10).
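The two measures in (43) are straightforward to compute once the regions are available as lists of pixel values. The sketch below is illustrative only; the function names are hypothetical and the population standard deviation (dividing by the number of samples) is an assumption, since the normalisation is not specified here.

```python
import math


def mean(values):
    return sum(values) / len(values)


def std(values):
    """Population standard deviation (divides by the number of samples)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def msnr(roi, noise_region):
    """Eq. (43): mean of the ROI divided by the std of the noise region."""
    return mean(roi) / std(noise_region)


def cnr(roi1, roi2, noise_region):
    """Eq. (43): absolute difference between the MSNR values of two ROIs."""
    return abs(msnr(roi1, noise_region) - msnr(roi2, noise_region))
```

Because both MSNR values share the same noise standard deviation, the CNR is simply the absolute difference of the two ROI means divided by that standard deviation.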

Table 1 shows the MSNR and CNR results for the proposed ROIs in OCT data using our algorithm. As discussed in Section 3, various shrinkage functions can be obtained with our algorithm, depending on whether a log transformation is applied before the 3D DCWT (we use the prefix homomorphic for this method and non-homomorphic when no log transformation is used) and on whether AWGN or a two-sided Rayleigh pdf is used to model the noise in the 3D DCWT domain (we name these BiGaussMixShrinkL and BiGaussRayMixShrinkL, resp.). Figures 11 and 12, respectively, show the results of applying non-homomorphic and homomorphic methods to (a slice of) the OCT image depicted in Figure 10. In these figures, and in Table 1, we also include the results of the nonlocal versions of the methods to show the effect of the anisotropic window selection technique. To show the improvements, CNR curves for 156 selected ROIs are depicted in Figure 13. It is clear that the non-homomorphic BiGaussMixShrinkL method outperforms the others.

Table 1.

The results of MSNR and CNR using several ROIs, shown in Figure 12.

Local (L) / Nonlocal (NL)	Homomorphic (H) / Nonhomomorphic (NH)	Gaussian noise (G) / Two-sided Rayleigh noise (R)	MSNR_ROI1	MSNR_ROI2	CNR
L	H	G	7.00	15.76	8.76
NL	H	G	7.56	17.03	9.47
L	NH	G	12.27	27.76	13.49
NL	NH	G	10.77	22.73	11.95
L	H	R	5.89	13.11	7.22
NL	H	R	8.63	19.59	10.95
L	NH	R	10.75	22.55	11.81
NL	NH	R	10.88	23.05	12.17
Original image			2.56	5.30	2.74


The results of applying homomorphic methods on proposed image in Figure 10. From top-left clockwise: despeckled data using *BiGaussMixShrinkL,* nonlocal BiGaussMixShrinkL, nonlocal BiGaussRayMixShrinkL, and *BiGaussRayMixShrinkL.*

The results of applying non-homomorphic methods on proposed image in Figure 10. From top-left clockwise: despeckled data using *BiGaussMixShrinkL,* nonlocal BiGaussMixShrinkL, nonlocal BiGaussRayMixShrinkL, and *BiGaussRayMixShrinkL.*

A comparison between CNR curves for 156 selected ROIs from OCT dataset.

Another way of evaluating the effect of our despeckling algorithm is to investigate the intralayer segmentation results. Figure 14 shows a comparison between the segmented layers of a 650 × 512 × 128 dataset from a Topcon 3D OCT-1000 imaging system, obtained using the method proposed in [53]. It is clear that the first layer is detected correctly after despeckling.

A comparison between the segmented layers of a 650 × 512 × 128 dataset from a Topcon 3D OCT-1000 imaging system, obtained using the method proposed in [53]. From left to right: original image, image denoised by the nonlocal homomorphic *BiGaussRayMixShrinkL* method, and image denoised by the local homomorphic *BiGaussRayMixShrinkL* method.

6. Conclusion and Future Work

In this paper, we introduced a new noise reduction algorithm for 3D OCT data. We derived new shrinkage functions that employ a mixture of bivariate Gaussian pdfs for modeling the wavelet coefficients in each subband of the complex wavelet transform. The parameters of this mixture model are estimated locally in a shape-adaptive manner based on the special structure of OCT data. We also used this model for denoising of other kinds of noise. Experiments show that our model gives better results than other methods, both visually and in terms of PSNR, especially for crowded images. In this paper, we assume that the parameters estimated by the EM algorithm are constant within each extracted window. It is possible to improve the EM algorithm, for example, by using recurrence equations. It is also possible to consider only the main section of the data (between the first and last layers), which contains the retinal layer information, and to apply our algorithm to the selected data in order to improve the speed and performance of the denoising process.

Using 3D DCWT instead of other transforms such as 3D DWT is a main reason for improvement of the denoising results [30]. In [27], it has been shown that other kinds of oriented transforms such as steerable pyramid decomposition can produce better denoising results. However, for 3D case, 3D transforms that are applied on whole 3D data (not slice by slice) such as surfacelet [68] and 3D discrete curvelet [69] can be investigated.

Appendices

A. Directional Selectivity Property of 2D DCWT

Since DWT in 2D domain is produced using separable (row-column) implementation, it has a poor directional selectivity. For example, the HH wavelet is the product of the high-pass functions along the first and second dimensions. Because DWT uses real filters, the HH wavelet mixes +45° and −45° orientations that results in the checkerboard artifact because it fails to isolate these orientations. In contrast, since the spectrum of the (approximately) analytic 1D wavelet is supported on only one side of the frequency axis, the spectrum of the DCWT in 2D domain is supported in only one quadrant of the 2D frequency plane. Figure 15 illustrates a comparison between subbands of 2D DWT and 2D DCWT.

A comparison between subbands of DWT and DCWT. (a) The wavelets in the space domain (LH, HL, and HH). (b) The idealized support of the Fourier spectrum of each wavelet in the 2D frequency domain. We can see the checkerboard artifact of the third wavelet. (c) The complex wavelets in the space domain. (d) The idealized support of the Fourier spectrum of each wavelet in the 2D frequency plane. The absence of the checkerboard phenomenon is observed in both the space and frequency domains [48].

B. A Sample Bivariate Gaussian Mixture Model

Figure 16 shows the pdf of a bivariate Gaussian mixture model for sample parameters. We can see the marginal distribution produced from the model in this figure.

The pdf of a bivariate Gaussian mixture model for sample parameters and its marginal distribution.

C. Other Kinds of Noise

In this appendix, we briefly explain the abilities of the proposed denoising algorithm in this paper for other kinds of noise.

C.1. Stationary Noise

We tested the shrinkage function BiGaussMixShrinkL for stationary noise and compared it with other methods such as ProbShrink [34], BiShrink [45], and BLS-GSM [27] and found that our algorithm outperforms these techniques visually and in terms of peak signal-to-noise ratio (PSNR) for various levels of noise. For example, our algorithm for crowded images preserves details of images while BiShrink [45] in some cases produces blurred images (e.g., compare the area around the corresponding arrows in Figure 17). In addition, in [40] using other bivariate mixture models such as bivariate Laplacian was examined. We compared the results of using our model with method in [40], and our simulations show that our algorithm is faster and outperforms the reported shrinkage functions in [40] for some images. For example, for a 512 × 512 8-bit grayscale Barbara image, an improvement of 0.6 dB is obtained for noise level of 30, and BiGaussMixShrinkL is two times faster.

(a) shows a part of Barbara image denoised using BiShrink [45] for stationary noise with σ _n = 40 and (b) shows denoised image using our method. Comparing the area around the corresponding arrows, we understand that our method is able to better preserve the details of images.
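For reference, the PSNR used in such comparisons is commonly computed as 10 log₁₀(peak² / MSE). The short Python sketch below assumes this standard definition with a peak value of 255 for 8-bit images and images given as flat lists of pixel values; the exact settings behind the reported numbers are not stated here, so the function is only illustrative.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(test):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

Under this definition, a gain of 0.6 dB corresponds to a reduction of the mean squared error by a factor of about 10^0.06 ≈ 1.15.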

C.2. Nonstationary Noise

This section presents nonstationary noise reduction examples in complex wavelet domain. Although the stationary noise model is able to simplify the implementation of denoising algorithms, the statistical properties of the noise are not always accurately described with this assumption. For example, in some applications, the noise statistics are spatially varying and the noise power varies between pixels or samples. In these cases, the nonstationary noise assumption is more reasonable and can improve the denoising results. For example, we contaminate three 512 × 512 grayscale images, namely, Lena, Boat, and Barbara using signal-dependent Gaussian noise with variance σ _g(i) that is defined as a linear function of the pixel intensities s(i) as [54]:

\begin{matrix} σ_{g} (i) = k_{0} s (i) + k_{1} . \end{matrix}

(C.1)
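A signal-dependent contamination of the kind described by (C.1) might be simulated as in the following Python sketch. Here σ_g(i) is used as the standard deviation of the Gaussian noise added to pixel i, which is an assumption about how (C.1) is applied; the function name and the fixed seed are likewise illustrative.

```python
import random


def add_signal_dependent_noise(pixels, k0, k1, seed=0):
    """Add zero-mean Gaussian noise whose spread grows linearly with intensity.

    pixels: flat list of noise-free intensities s(i).
    k0, k1: parameters of Eq. (C.1), sigma_g(i) = k0 * s(i) + k1.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    noisy = []
    for s in pixels:
        sigma = k0 * s + k1    # Eq. (C.1), treated here as a standard deviation
        noisy.append(s + rng.gauss(0.0, sigma))
    return noisy
```

With k₀ = 0 the noise becomes stationary with standard deviation k₁, so the stationary model of the previous subsection is recovered as a special case.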

Since the variance of each noise component varies spatially with the corresponding signal content, nonstationary processes are able to model the statistical properties of this noise. A comparison between the denoised image using soft thresholding, proposed method in [54], and the denoised image using a mixture of two bivariate Gaussian pdfs with local parameters (BiGaussMixShrinkL) for different noise levels can be seen in Table 2. In this table, the highest PSNR value is bolded. We can see from the table that our proposed algorithm gives better results than the others, especially for the Barbara image (which contains details) at high noise levels.

Table 2.

PSNR (in dB) values of test images for different nonstationary noise levels.

Noise parameters σ_g(i) = k₀ s(i) + k₁	Lena				Boat				Barbara
	Noisy image	Soft thresh. [54]	Proposed method in [55]	Our method	Noisy image	Soft thresh. [54]	Proposed method in [55]	Our method	Noisy image	Soft thresh. [54]	Proposed method in [55]	Our method
k ₀ = 0.05, k ₁ = 4	27.72	34.13	34.61	35.60	27.51	32.44	32.59	33.26	27.94	31.99	32.20	33.56
k ₀ = 0.1, k ₁ = 4	23.48	31.62	32.49	33.31	23.22	29.95	30.23	30.75	23.69	29.05	25.81	30.91
k ₀ = 0.2, k ₁ = 4	18.49	27.50	29.65	30.52	18.20	26.77	27.60	28.13	18.71	25.81	25.94	27.91


In [54], it has been shown that the proposed algorithm based on the signal-dependent Gaussian noise can also be effective for the reduction of Poisson noise. On this basis, we use our algorithm for noise reduction of images corrupted by Poisson noise generated using the corresponding voxel intensities. For this purpose, we use the software provided on http://willett.ece.wisc.edu/software.html to compare our method with the Fast TI Haar algorithm [55]. The PSNR of the grayscale images Lena, Boat, Barbara, Confocal Microscopy Phantom, Bowl, and Shepp Logan Phantom can be seen in Table 3. We can see that our algorithm outperforms the Fast TI Haar algorithm. Figure 18 illustrates a comparison between the denoised images produced by the two algorithms. It is clear that our algorithm gives better results, especially for crowded images. In fact, the Fast TI Haar algorithm has reasonable performance for smooth images such as Confocal Microscopy Phantom, but since this algorithm blurs the produced images, for images with details such as Barbara, high-frequency features are removed and important information is lost.

Table 3.

Comparison between PSNRs (in dB) of denoised images with Fast TI Haar algorithm [55] and BiGaussMixShrinkL.

	Lena	Boat	Barbara	Confocal Phantom	Shepp Logan Phantom	Bowl
Noisy image	27.22	27.05	27.49	35.74	47.68	28.21
Fast TI Haar	32.11	29.30	26.59	44.49	60.63	46.79
BiGaussMixShrinkL	39.88	37.57	37.78	47.36	64.65	47.09


(a–d) show denoising results for Confocal Microscopy Phantom: from left to right: noise-free image, noisy image, and denoised image with our model and denoised image with Fast TI Haar algorithm. (e–h) show from left to right parts of denoised Barbara image with BiGaussMixShrinkL method and parts of denoised Barbara image with Fast TI Haar algorithm.

References

1.Huang D, Swanson EA, Lin CP, et al. Optical coherence tomography. Science. 1991;254(5035):1178–1181. doi: 10.1126/science.1957169. [DOI] [PMC free article] [PubMed] [Google Scholar]
2.Schmitt JM. Optical coherence tomography (OCT): a review. IEEE Journal on Selected Topics in Quantum Electronics. 1999;5(4):1205–1215. doi: 10.1109/2944.796347. [DOI] [PMC free article] [PubMed] [Google Scholar]
3.Choma MA, Sarunic MV, Yang C, Izatt JA. Sensitivity advantage of swept source and Fourier domain optical coherence tomography. Optics Express. 2003;11(18):2183–2189. doi: 10.1364/oe.11.002183. [DOI] [PubMed] [Google Scholar]
4.van Velthoven MEJ, Faber DJ, Verbraak FD, van Leeuwen TG, de Smet MD. Recent developments in optical coherence tomography for imaging the retina. Progress in Retinal and Eye Research. 2007;26(1):57–77. doi: 10.1016/j.preteyeres.2006.10.002. [DOI] [PubMed] [Google Scholar]
5.Schmitt JM, Xiang SH, Yung KM. Speckle in optical coherence tomography. Journal of Biomedical Optics. 1999;4(1):95–105. doi: 10.1117/1.429925. [DOI] [PubMed] [Google Scholar]
6.Marks DL, Ralston TS, Boppart SA. Speckle reduction by I-divergence regularization in optical coherence tomography. Journal of the Optical Society of America A. 2005;22(11):2366–2371. doi: 10.1364/josaa.22.002366. [DOI] [PubMed] [Google Scholar]
7.Jian Z, Yu Z, Yu L, Rao B, Chen Z, Tromberg BJ. Speckle attenuation in optical coherence tomography by curvelet shrinkage. Optics Letters. 2009;34(10):1516–1518. doi: 10.1364/ol.34.001516. [DOI] [PMC free article] [PubMed] [Google Scholar]
8.Jian Z, Yu L, Rao B, Tromberg BJ, Chen Z. Three-dimensional speckle suppression in optical coherence tomography based on the curvelet transform. Optics Express. 2010;18(2):1024–1032. doi: 10.1364/OE.18.001024. [DOI] [PMC free article] [PubMed] [Google Scholar]
9.Chitchian S, Fiddy MA, Fried NM. Denoising during optical coherence tomography of the prostate nerves via wavelet shrinkage using dual-tree complex wavelet transform. Journal of Biomedical Optics. 2009;14(1) doi: 10.1117/1.3081543.014031 [DOI] [PubMed] [Google Scholar]
10.Zlokolica V, Jovanov Lj, Pizurica A, Philips W. Wavelet-based denoising for OCT images, Interaction between image processing, optics and photonics. Symposium on Optical Science and Technology; August 2007; San Diego, Calif, USA: [Google Scholar]
11.Gupta V, Chan CC, Poh C-L, Chow TH, Meng TC, Koon NB. Computerized automation of wavelet based denoising method to reduce speckle noise in OCT images. Proceedings of the 5th International Conference on Information Technology and Applications in Biomedicine; May 2008; pp. 120–123. [Google Scholar]
12.Puvanathasan P, Bizheva K. Speckle noise reduction algorithm for optical coherence tomography based on interval type II fuzzy set. Optics Express. 2007;15(24):15747–15758. doi: 10.1364/oe.15.015747. [DOI] [PubMed] [Google Scholar]
13.Pižurica A, Jovanov L, Huysmans B, et al. Multiresolution denoising for optical coherence tomography: a review and evaluation. Current Medical Imaging Reviews. 2008;4(4):270–284. [Google Scholar]
14.Salinas HM, Fernández DC. Comparison of PDE-based nonlinear diffusion approaches for image enhancement and denoising in optical coherence tomography. IEEE Transactions on Medical Imaging. 2007;26(6):761–771. doi: 10.1109/TMI.2006.887375. [DOI] [PubMed] [Google Scholar]
15.Yung KM, Lee SL, Schmilt JM. Phase-domain processing of optical coherence tomography images. Journal of Biomedical Optics. 1999;4(1):125–136. doi: 10.1117/1.429942. [DOI] [PubMed] [Google Scholar]
16.Rogowska J, Brezinski ME. Evaluation of the adaptive speckle suppression filter for coronary optical coherence tomography imaging. IEEE Transactions on Medical Imaging. 2000;19(12):1261–1266. doi: 10.1109/42.897820. [DOI] [PubMed] [Google Scholar]
17.Ozcan A, Bilenca A, Desjardins AE, Bouma BE, Tearney GJ. Speckle reduction in optical coherence tomography images using digital filtering. Journal of the Optical Society of America A. 2007;24(7):1901–1910. doi: 10.1364/josaa.24.001901. [DOI] [PMC free article] [PubMed] [Google Scholar]
18.Adler DC, Ko TH, Fujimoto JG. Speckle reduction in optical coherence tomography images by use of a spatially adaptive wavelet filter. Optics Letters. 2004;29(24):2878–2880. doi: 10.1364/ol.29.002878. [DOI] [PubMed] [Google Scholar]
19.Mallat SG. A Wavelet Tour of Signal Processing. San Diego, Calif, USA: Academic Press; 1998. [Google Scholar]
20.Vidakovic B. Statistical Modeling by Wavelets. New York, NY, USA: John Wiley & Sons; 1999. [Google Scholar]
21.Simoncelli EP, Adelson EH. Noise removal via Bayesian wavelet coring. Proceedings of the IEEE International Conference on Image Processing; September 1996; Lausanne, Switzerland. pp. 379–382. [Google Scholar]
22.Simoncelli EP. Bayesian denoising of visual images in the wavelet domain. In: Muller P, Vidakovic B, editors. Bayesian Inference in Wavelet Based Models. New York, NY, USA: Springer; 1999. pp. 291–308. [Google Scholar]
23.Achim A, Bezerianos A, Tsakalides P. Novel Bayesian multiscale method for speckle removal in medical ultrasound images. IEEE Transactions on Medical Imaging. 2001;20(8):772–783. doi: 10.1109/42.938245. [DOI] [PubMed] [Google Scholar]
24.Fadili JM, Boubchir L. Analytical form for a Bayesian wavelet estimator of images using the Bessel K form densities. IEEE Transactions on Image Processing. 2005;14(2):231–240. doi: 10.1109/tip.2004.840704. [DOI] [PubMed] [Google Scholar]
25.Khazron PA, Selesnick IW. Bayesian estimation of bessel K form random vectors in AWGN. IEEE Signal Processing Letters. 2008;15:261–264. [Google Scholar]
26.Crouse MS, Nowak RD, Baraniuk RG. Wavelet-based statistical signal processing using hidden Markov models. IEEE Transactions on Signal Processing. 1998;46(4):886–902. [Google Scholar]
27.Portilla J, Strela V, Wainwright MJ, Simoncelli EP. Image denoising using scale mixtures of Gaussians in the wavelet domain. IEEE Transactions on Image Processing. 2003;12(11):1338–1351. doi: 10.1109/TIP.2003.818640. [DOI] [PubMed] [Google Scholar]
28.Rabbani H, Gazor S. Image denoising employing local mixture models in sparse domains. IET Image Processing. 2010;4(5):413–428. [Google Scholar]
29. Elmzoughi A, Benazza-Benyahia A, Pesquet J-C. An interscale multivariate statistical model for MAP multicomponent image denoising in the wavelet transform domain. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05); March 2005; pp. II45–II48.
30. Rabbani H, Vafadust M. Image/video denoising based on a mixture of Laplace distributions with local parameters in multidimensional complex wavelet domain. Signal Processing. 2008;88(1):158–173.
31. Chipman HA, Kolaczyk ED, McCulloch RE. Adaptive Bayesian wavelet shrinkage. Journal of the American Statistical Association. 1997;92(440):1413–1421.
32. Borran MJ, Nowak RD. Wavelet-based denoising using hidden Markov models. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing; May 2001; pp. 3925–3928.
33. Romberg JK, Choi H, Baraniuk RG. Bayesian tree-structured image modeling using wavelet-domain hidden Markov models. IEEE Transactions on Image Processing. 2001;10(7):1056–1068. doi: 10.1109/83.931100.
34. Pižurica A, Philips W, Lemahieu I, Acheroy M. A joint inter- and intrascale statistical model for Bayesian wavelet based image denoising. IEEE Transactions on Image Processing. 2002;11(5):545–557. doi: 10.1109/TIP.2002.1006401.
35. Bharath AA, Ng J. A steerable complex wavelet construction and its application to image denoising. IEEE Transactions on Image Processing. 2005;14(7):948–959. doi: 10.1109/tip.2005.849295.
36. Şendur L, Selesnick IW. Bivariate shrinkage functions for wavelet-based denoising exploiting interscale dependency. IEEE Transactions on Signal Processing. 2002;50(11):2744–2756.
37. Rabbani H, Vafadust M, Gazor S, Selesnick I. Image denoising employing a bivariate Cauchy distribution with local variance in complex wavelet domain. Proceedings of the 12th IEEE Digital Signal Processing Workshop and 4th IEEE Signal Processing Education Workshop; September 2006; Grand Teton National Park, Wyo, USA. pp. 203–208.
38. Portilla J. Full blind denoising through noise covariance estimation using Gaussian scale mixtures in the wavelet domain. Proceedings of the International Conference on Image Processing (ICIP '04); October 2004; pp. 1217–1220.
39. Lyu S, Simoncelli EP. Modeling multiscale subbands of photographic images with fields of Gaussian scale mixtures. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2009;31(4):693–706. doi: 10.1109/TPAMI.2008.107.
40. Rabbani H, Nezafat R, Gazor S. Wavelet-domain medical image denoising using bivariate Laplacian mixture model. IEEE Transactions on Biomedical Engineering. 2009;56(12):2826–2837. doi: 10.1109/TBME.2009.2028876.
41. Selesnick I, Li K. Video denoising using 2D and 3D dual-tree complex wavelet transforms. Wavelet Applications in Signal and Image Processing; August 2003; San Diego, Calif, USA.
42. Cai Z, Cheng TH, Lu C, Subramanian KR. Efficient wavelet-based image denoising algorithm. Electronics Letters. 2001;37(11):683–685.
43. Mihçak MK, Kozintsev I, Ramchandran K, Moulin P. Low-complexity image denoising based on statistical modeling of wavelet coefficients. IEEE Signal Processing Letters. 1999;6(12):300–303.
44. Rabbani H, Vafadust M, Abolmaesumi P, Gazor S. Speckle noise reduction of medical ultrasound images in complex wavelet domain using mixture priors. IEEE Transactions on Biomedical Engineering. 2008;55(9):2152–2160. doi: 10.1109/TBME.2008.923140.
45. Şendur L, Selesnick IW. Bivariate shrinkage with local variance estimation. IEEE Signal Processing Letters. 2002;9(12):438–441.
46. Rabbani H, Vafadust M, Gazor S. Image denoising based on a mixture of bivariate Gaussian distributions with local parameters in complex wavelet domain. Proceedings of the International Conference on Biomedical and Pharmaceutical Engineering; December 2006; pp. 174–179.
47. Katkovnik V, Egiazarian K, Astola J. Adaptive window size image de-noising based on intersection of confidence intervals (ICI) rule. Journal of Mathematical Imaging and Vision. 2002;16(3):223–235.
48. Selesnick IW, Baraniuk RG, Kingsbury NG. The dual-tree complex wavelet transform. IEEE Signal Processing Magazine. 2005;22(6):123–151.
49. Wang B, Wang Y, Selesnick I, Vetro A. Video coding using 3D dual-tree wavelet transform. EURASIP Journal on Image and Video Processing. 2007;2007:42761, 15 pages.
50. Jingwei W, Xinbo G, Juanjuan Z. A video watermarking based on 3-D complex wavelet. Proceedings of the 14th IEEE International Conference on Image Processing; September 2007; pp. V493–V496.
51. Yang J, Wang Y, Xu W, Dai Q. Image and video denoising using adaptive dual-tree discrete wavelet packets. IEEE Transactions on Circuits and Systems for Video Technology. 2009;19(5):642–655.
52. Bayram I, Selesnick IW. A dual-tree rational-dilation complex wavelet transform. IEEE Transactions on Signal Processing. 2011;59(12):6251–6256.
53. Lee K, Niemeijer M, Garvin MK, Kwon YH, Sonka M, Abramoff MD. Segmentation of the optic disc in 3-D OCT scans of the optic nerve head. IEEE Transactions on Medical Imaging. 2010;29(1):159–168. doi: 10.1109/TMI.2009.2031324.
54. Lo WY, Selesnick IW. Wavelet-domain soft-thresholding for non-stationary noise. Proceedings of the IEEE International Conference on Image Processing; October 2006; pp. 1441–1444.
55. Willett RM, Nowak RD. Fast multiresolution photon-limited image reconstruction. Proceedings of the 2nd IEEE International Symposium on Biomedical Imaging: Macro to Nano (ISBI '04); April 2004; pp. 1192–1195.
56. Pižurica A, Philips W, Lemahieu I, Acheroy M. A versatile wavelet domain noise filtration technique for medical imaging. IEEE Transactions on Medical Imaging. 2003;22(3):323–331. doi: 10.1109/TMI.2003.809588.
57. Gupta S, Chauhan RC, Saxena SC. Robust non-homomorphic approach for speckle reduction in medical ultrasound images. Medical and Biological Engineering and Computing. 2005;43(2):189–195. doi: 10.1007/BF02345953.
58. Gupta S, Kaur L, Chauhan RC, Saxena SC. A versatile technique for visual enhancement of medical ultrasound images. Digital Signal Processing. 2007;17(3):542–560.
59. Yan S, Yuan J, Liu M, Hou C. Speckle noise reduction of ultrasound images based on an undecimated wavelet packet transform domain nonhomomorphic filtering. Proceedings of the 2nd International Conference on Biomedical Engineering and Informatics; October 2009; pp. 1–5.
60. Wagner RF, Insana MF, Brown DG. Statistical properties of radio-frequency and envelope-detected signals with applications to medical ultrasound. Journal of the Optical Society of America A. 1987;4(5):910–922. doi: 10.1364/josaa.4.000910.
61. Shankar PM. Ultrasonic tissue characterization using a generalized Nakagami model. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control. 2001;48(6):1716–1720. doi: 10.1109/58.971725.
62. Shankar PM, Reid JM, Ortega H, Piccoli CW, Goldberg BB. Use of non-Rayleigh statistics for the identification of tumors in ultrasonic B-scans of the breast. IEEE Transactions on Medical Imaging. 1993;12(4):687–692. doi: 10.1109/42.251119.
63. Shankar PM. A model for ultrasonic scattering from tissues based on the K distribution. Physics in Medicine and Biology. 1995;40(10):1633–1649. doi: 10.1088/0031-9155/40/10/006.
64. Eom IK, Kim YS. Wavelet-based denoising with nearly arbitrarily shaped windows. IEEE Signal Processing Letters. 2004;11(12):937–940.
65. Drexler W, Fujimoto JG. State-of-the-art retinal optical coherence tomography. Progress in Retinal and Eye Research. 2008;27(1):45–88. doi: 10.1016/j.preteyeres.2007.07.005.
66. Abramoff MD, Garvin MK, Sonka M. Retinal imaging and image analysis. IEEE Reviews in Biomedical Engineering. 2010;3:169–208. doi: 10.1109/RBME.2010.2084567.
67. Ercole C, Foi A, Katkovnik V, Egiazarian K. Spatio-temporal pointwise adaptive denoising of video: 3D non-parametric approach. Proceedings of the 1st International Workshop on Video Processing and Quality Metrics for Consumer Electronics (VPQM '05); 2005.
68. Lu YM, Do MN. Multidimensional directional filter banks and surfacelets. IEEE Transactions on Image Processing. 2007;16(4):918–931. doi: 10.1109/tip.2007.891785.
69. Ying L, Demanet L, Candès E. 3D discrete curvelet transform. Wavelets XI; August 2005; pp. 1–11.