Abstract
In image denoising (IDN), the low-rank property is usually considered an important image prior. As a convex relaxation of the rank function, nuclear norm-based algorithms and their variants have attracted significant attention. These algorithms can be collectively called image domain-based methods, whose common drawback is that they require a large number of iterations to reach an acceptable solution. Meanwhile, the sparsity of images in a certain transform domain has also been exploited in image denoising problems. Sparsifying transform learning algorithms can achieve extremely fast computations as well as desirable performance. To combine the advantages of the image domain and the transform domain in a general framework, we propose a sparsifying transform learning and weighted singular values minimization method (STLWSM) for IDN problems. The proposed method makes full use of the strengths of both domains. To solve the nonconvex cost function, we also present an efficient alternating solution for acceleration. Experimental results show that the proposed STLWSM achieves improvement both visually and quantitatively by a large margin over state-of-the-art approaches based on a single domain. It also needs far fewer iterations than all the image domain algorithms.
1. Introduction
Noise inevitably arises in images during real-world scene acquisition because of physical limitations, which makes image denoising (IDN) a fundamental task in image processing. Recent IDN methods can be categorized into data-driven and prior-driven approaches.
The data-driven methods rely on deep convolutional neural networks, such as Universal Denoising Net (UDN) [1] and Fractional Optimal Control Net [2], for the IDN problem. Although these CNN models have achieved great success when provided with sufficient training samples, they may not perform well in small-scale data applications. For example, one cannot obtain acceptable network parameters from a single corrupted image, which is the case considered in this study. The aim of the prior-driven methods for image denoising is to restore the degraded image using a certain image prior or other properties, such as local smoothness, nonlocal similarity, low-rank structure, and so forth [3–5]. More specifically, the prior-based image denoising process seeks the underlying ideal image from the degraded one by extracting a few significant factors and excluding the noisy information. It is a typical ill-posed linear inverse problem, and a widely used image degradation model can be generally formulated as follows [6–9]:
| $Y = HX + N$ | (1) |
where X and Y are both matrices representing the original image and the degraded one, respectively. H is also a matrix denoting the noninvertible degradation operator, and N is the additive noise.
To cope with the ill-posed problem, the general image denoising problem can be formulated as follows [9, 10]:
| (2) |
where F(X) is regarded as the image prior knowledge, including local smoothness, nonlocal similarity, low rank, and sparsity, and ‖·‖_F denotes the Frobenius norm. According to the sparsity property, the degraded image x (x is the vectorization of X, x ∈ R^n) satisfies x = Dκ + e, where D ∈ R^{n×m} is an overcomplete synthesis dictionary, κ ∈ R^m is the sparse coefficient, and e is an approximation term in the image domain [11]. This model is called the synthesis model, and κ is assumed to be sparse (‖κ‖_0 ≪ m).
To be specific, given an image x, the synthesis sparse coding problem is to find a sparse κ that minimizes ‖x − Dκ‖_2^2. Various algorithms have been proposed [10, 12–15] to approximately solve this NP-hard problem. Numerous researchers have learned the synthesis dictionary and updated the nonzero coefficients simultaneously to better represent the potential high-quality image, and these methods have proved useful in image denoising. Specifically, these synthesis models typically alternate between two steps: sparse coding and dictionary learning. However, synthesis models require some rigorous conditions in practice, which are often violated in applications.
While the synthesis model has attracted extensive attention, the analysis model has also been gaining notice recently [16, 17]. The analysis model assumes that an image x ∈ R^n satisfies ‖Ωx‖_0 ≪ m, where Ω ∈ R^{m×n} is regarded as an analysis dictionary, since it "analyzes" the image x into a sparse form. In essence, Ωx defines the subspace to which the image belongs. The noisy observation is formulated as y = x + ξ, with ξ representing the noise. The denoising problem is to find x by minimizing ‖y − x‖_2^2 subject to ‖Ωx‖_0 ≪ m. This problem is also NP-hard and resembles sparse coding in the synthesis model. Approximation algorithms for learning the analysis dictionary have been proposed in recent years; like those for the synthesis case, they are computationally expensive.
More recently, a generalized analysis model named the transform learning model has been proposed. It follows the intuition that images are essentially sparse in a certain transform domain and can be expressed as Wx = μ + ε, where W ∈ R^{m×n} is the transform matrix, μ ∈ R^m is the sparse coefficient, and ε is the approximation error [18]. The feature that distinguishes it from the synthesis and analysis models is that the approximation error ε of the transform learning model lies in the transform domain and is likely to be small. Another advantage of the transform model over the image domain model is that the former can achieve exact and extremely fast computations.
Instead of learning a synthesis or analysis dictionary, the transform learning model aims at learning the transform matrix to minimize the approximation error ε. After the transform W has been learned, the original image is recovered by W†μ, where W† is the pseudoinverse of W. The transform learning model has achieved great success in image denoising in terms of both efficiency and effectiveness [18–21].
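The transform model Wx = μ + ε and the recovery step W†μ can be illustrated with a small sketch. This is only an illustration under stated assumptions, not a learned transform: it uses a fixed orthonormal DCT as a stand-in for W, and the names `dct_matrix` and `transform_denoise`, the test signal, and the noise level are all hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n, used here as a fixed analytical W."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    W = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    W[0, :] /= np.sqrt(2.0)           # rescale the DC row so that W W^T = I
    return W

def transform_denoise(x, W, s):
    """Keep the s largest-magnitude entries of W x, then map back with pinv(W)."""
    coeffs = W @ x
    mu = np.zeros_like(coeffs)
    keep = np.argsort(np.abs(coeffs))[-s:]   # indices of the s largest magnitudes
    mu[keep] = coeffs[keep]
    return np.linalg.pinv(W) @ mu, mu

rng = np.random.default_rng(0)
n = 64
W = dct_matrix(n)

# Hypothetical clean signal that is exactly 2-sparse in the transform domain.
kappa = np.zeros(n)
kappa[3], kappa[7] = 5.0, -4.0
clean = W.T @ kappa
noisy = clean + 0.3 * rng.standard_normal(n)

denoised, mu = transform_denoise(noisy, W, s=2)
print(np.linalg.norm(noisy - clean), np.linalg.norm(denoised - clean))
```

Because the stand-in W is orthonormal, the pseudoinverse reduces to the transpose; for a learned, non-orthogonal W the `np.linalg.pinv` call is what implements W†.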
Nonetheless, a remaining drawback is that the transform model overemphasizes the transform domain but ignores the primary image domain. There is always a connection between the image domain and the transform domain, and this can be treated as a regularization term in image denoising.
To take full advantage of both the image domain and the transform domain for the single-image denoising problem, this study focuses on sparsifying transform learning and the essential sparsity property of images, and proposes a novel algorithm named sparsifying transform learning and weighted singular values minimization (STLWSM). Specifically, our model simultaneously considers the sparsifying transform learning and the weighted singular values minimization of image patches.
The remainder of this paper is organized as follows. In the next section, a brief review of the transform domain and image domain for IDN is provided. In Section 3, we propose our method and derive an efficient solution. Section 4 provides experimental results on gray images and color images. Conclusions are drawn in Section 5.
2. Related Works
2.1. Transform Domain for IDN
As mentioned in the previous section, the transform model can utilize the sparsity of an image in the transform domain to increase efficiency. Therefore, analytical transform models such as wavelets and the discrete cosine transform (DCT) are widely used in practical applications, for instance, in image compression standards such as JPEG2000. As a classical and effective tool, transform models have been increasingly used in image denoising. Inspired by dictionary learning, Ravishankar and Bresler [19] proposed a learning sparsifying transform (LST) model. In [19], any noisy image X ∈ R^{h×l} is first reshaped into a matrix X′ ∈ R^{p×N}, where each column represents a square patch of the original X extracted by a sliding window. Second, a transform matrix W ∈ R^{p×p} is randomly initialized to formulate the transform sparse coding problem as follows:
| (3) |
where μ ∈ R^{p×N} is the sparse coefficient, μ_i is the ith column of μ, and s is a constant representing the sparsity level. The additional regularization term λ lg det|W| is used to avoid a trivial solution. λ is a balance coefficient, and lg det|W| is the log-determinant of W with base 10. Ravishankar and Bresler [19] solved the proposed problem by alternately updating W and μ and proved the convergence. Building on this work, they further proposed learning doubly sparse transforms (LDST) for IDN [21]. Specifically, W′ = BΦ is adopted to replace the original W, where B and Φ are both square matrices of the same size. B is a transform constrained to be sparse, and Φ is an analytical transform with an efficient implementation. They used the doubly sparse transform model in image denoising and obtained faster and better results than with unstructured transforms. Then, Wen et al. [18, 20] proposed a structured overcomplete sparsifying transform learning (SOLST) model. Its main difference from the aforementioned transform models is that Wen et al. cluster image patches and learn a different W for each patch group. This process can be formulated as follows:
| (4) |
where Q(W) = −log|det W| + ‖W‖_F^2 is a regularization term that prevents trivial solutions. {C_k} indicates the specific class of image X′, K is the number of categories, and G is the set of all classes.
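For a fixed transform W, the sparse coding step of these models (finding the closest coefficient matrix whose columns each have at most s nonzeros) reduces to keeping the s largest-magnitude entries in every column of WX′. The sketch below shows this column-wise hard thresholding; the function name `hard_threshold_columns` and the toy matrix are illustrative assumptions, not code from [19].

```python
import numpy as np

def hard_threshold_columns(A, s):
    """Keep the s largest-magnitude entries in each column of A and zero the rest."""
    A = np.asarray(A, dtype=float)
    out = np.zeros_like(A)
    # Row indices of the s largest |entries| in every column (order within the s is arbitrary).
    idx = np.argpartition(np.abs(A), -s, axis=0)[-s:, :]
    cols = np.arange(A.shape[1])
    out[idx, cols] = A[idx, cols]
    return out

# Toy example: one surviving coefficient per column.
A = np.array([[3.0, -1.0],
              [-5.0, 0.5],
              [1.0, 4.0]])
mu = hard_threshold_columns(A, s=1)
print(mu)
```

In an alternating scheme, `A` would be the product of the current transform and the patch matrix, so each call costs one matrix product plus a partial sort per column.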
2.2. Image Domain for IDN
While transform learning models have achieved great success, various IDN algorithms have also been proposed in the image domain. As mentioned before, in the general image denoising model, F(X) is an additional regularization term. Widely studied regularizers include the l1, l2, and l1/2 norms, the nuclear norm, the low-rank property, and so on [22–24]. Focusing on the patch (matrix) form instead of the vector form, the low-rank property has attracted significant research interest. As a convex relaxation of the low-rank matrix factorization (LRMF) problem, nuclear norm minimization (NNM) has attracted increasing attention [4, 6, 24, 25]. The nuclear norm of an image X is defined as ‖X‖_* = Σ_i |σ_i(X)|_1, where σ_i(X) is the ith singular value of X. However, many researchers hold that different singular values should be treated differently in the minimization. Liu et al. [4] proposed weighted nuclear norm minimization (WNNM) for image denoising problems. The weighted nuclear norm is defined as ‖X‖_{w,*} = Σ_i |w_i σ_i(X)|_1, where w = [w_1, w_2, …, w_n] is nonnegative. At this point, we can treat F(X) as F(X) = ‖X‖_{w,*}, and the denoising model is
| (5) |
By taking into account the different singular values as well as the image structure, WNNM shows strong denoising capability. Meanwhile, Hu et al. [6] proposed truncated nuclear norm regularization (TNNR) for matrix completion. They held that minimizing only the smallest min(m, n) − r singular values, while keeping the first r nonzero singular values fixed, can maintain the original matrix rank. Using F(X) = Σ_{i=r+1}^{min(m,n)} σ_i(X), the TNNR constrained model can be written as follows:
| (6) |
TNNR provides a better approximation to the rank function than nuclear norm-based approaches. Inspired by both WNNM and TNNR, Liu et al. [26] improved the previous algorithms by reweighting the residual error separately and simultaneously minimizing the truncated nuclear norm of the error matrix (TNNR-WRE). In their work, F(X) is considered as follows:
| (7) |
where H = X − Y, U and V are the left and right singular-vector matrices of the singular value decomposition (SVD) of H, respectively, and r is the truncation parameter. TNNR-WRE achieves higher accuracy than TNNR.
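The three regularizers discussed in this subsection (nuclear norm, weighted nuclear norm, and truncated nuclear norm) differ only in how the singular values are combined. The following sketch computes each of them from the singular values of a small matrix; the function names and the weight vector are illustrative choices.

```python
import numpy as np

def nuclear_norm(X):
    """Sum of all singular values of X."""
    return float(np.linalg.svd(X, compute_uv=False).sum())

def weighted_nuclear_norm(X, w):
    """Sum of w_i * sigma_i(X), with singular values in descending order."""
    sigma = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(np.asarray(w, dtype=float) * sigma))

def truncated_nuclear_norm(X, r):
    """Sum of the smallest min(m, n) - r singular values (the first r are excluded)."""
    sigma = np.linalg.svd(X, compute_uv=False)
    return float(sigma[r:].sum())

X = np.diag([3.0, 2.0, 1.0])                      # singular values are 3, 2, 1
print(nuclear_norm(X))                            # 3 + 2 + 1
print(weighted_nuclear_norm(X, [0.0, 1.0, 2.0]))  # 0*3 + 1*2 + 2*1
print(truncated_nuclear_norm(X, r=1))             # 2 + 1
```

With all weights equal to 1 the weighted norm reduces to the nuclear norm, and with r = 0 the truncated norm does as well.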
As shown above, nuclear norm-based algorithms can usually obtain good results because of the essential low-rank property in the image domain. To combine the advantages of the transform domain and the image domain in IDN, the sparsifying transform learning and weighted singular values minimization (STLWSM) method is proposed. In contrast to LST, LDST, SOLST, WNNM, TNNR, and TNNR-WRE, the proposed STLWSM jointly considers sparsity in the transform domain and low rank in the image domain. The main contributions of our work are as follows:
(1) We propose a general framework for image processing in both the transform domain and the image domain, which combines the sparsifying transform learning of image patches and the low-rank property of the original image.
(2) As image patches can take advantage of the nonlocal similarity inherent in the image, we learn the sparsifying transform for each group of similar patches, where similarity is measured by Euclidean distance.
(3) To solve the proposed NP-hard problem, we present an efficient alternating optimization algorithm. In practical applications, our method requires a limited number of iterations, mostly fewer than 3, to reach the final solution.
(4) We applied our model to IDN, and the results show that STLWSM achieves evident PSNR (peak signal-to-noise ratio) improvements over other state-of-the-art methods.
3. Proposed Method
In this section, we propose a general framework in both transform domain and image domain. To be clear, we take sparsifying transform learning in transform domain and weighted singular values minimization in image domain simultaneously. To solve this NP-hard problem, an efficient solution is also derived.
3.1. Sparsifying Transform Learning and Weighted Singular Values Minimization (STLWSM)
In light of the observations mentioned above, we first introduce sparsifying transform learning based on image patches and then utilize weighted singular values minimization to improve the image quality.
Given a noisy image X ∈ R^{h×l}, nonlocal similarity is a well-known patch-based prior, which means that a patch in an image has many similar patches [7–9]. Accordingly, overlapping image patches can be extracted with a sliding window at a fixed step size. For each specific patch, we choose the M most similar patches by Euclidean distance [4, 7, 18–20] to obtain a potential low-rank structure, and a matrix X_i′ ∈ R^{p×M} is constructed. The patch size is √p × √p, and the total number N′ of matrices X_i′ depends on the size of the original image X, the patch size, and the step size. After the aggregation of similar patches, X_i′ is obtained for each group, and X′ = [X_1′, X_2′, ..., X_{N′}′] ∈ R^{p×M×N′}. Following the idea of the transform learning algorithm [18–20], with the obtained X_i′ and some initialized W_i, our preliminary model can be formulated as follows:
| (8) |
The definition of Q(W_i) is the same as the one in problem (4), but μ_i ∈ R^{p×M} is the sparse representation of X_i′ in the transform domain, which is a matrix. Suppose the transform W_i and the sparse coefficient μ_i have been updated. The denoised patch group can be obtained by X_i″ = W_i†μ_i. Obviously, X_i″ also has a low-rank structure; hence, we utilize weighted singular values minimization to approximate the matrix. The unified denoising minimization is
| (9) |
where α_i and β_i are the regularization parameters, which are usually set empirically. This formulation minimizes the residual in the transform domain and the rank of the recovered matrix X_i″ simultaneously.
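The construction of a patch group X_i′ described above (sliding-window extraction followed by selection of the M nearest patches in Euclidean distance) can be sketched as follows. This is a simplified illustration: it searches over all patches of the image rather than a local window, and the function names, image size, patch side, and step are hypothetical values chosen for the example.

```python
import numpy as np

def extract_patches(image, patch_side, step):
    """Vectorize overlapping patch_side x patch_side patches into the columns of a p x N matrix."""
    h, l = image.shape
    cols = []
    for r in range(0, h - patch_side + 1, step):
        for c in range(0, l - patch_side + 1, step):
            cols.append(image[r:r + patch_side, c:c + patch_side].reshape(-1))
    return np.stack(cols, axis=1)

def group_similar(patches, ref_index, M):
    """Return the p x M matrix of the M patches closest to the reference patch."""
    ref = patches[:, [ref_index]]                    # keep as a column for broadcasting
    dist = np.sum((patches - ref) ** 2, axis=0)      # squared Euclidean distances
    nearest = np.argsort(dist, kind="stable")[:M]    # the reference itself comes first
    return patches[:, nearest], nearest

rng = np.random.default_rng(0)
image = rng.random((16, 16))                         # stand-in for a noisy image
patches = extract_patches(image, patch_side=4, step=2)
group, nearest = group_similar(patches, ref_index=0, M=5)
print(patches.shape, group.shape)
```

Each `group` plays the role of one X_i′: its columns are similar patches, which is what makes a low-rank approximation of the group plausible.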
3.2. Efficient Optimization of the Proposed Model
In this subsection, we introduce an efficient solution for the nonconvex sparsifying transform learning and weighted singular values minimization problem. According to [16–19], the transform learning process is not sensitive to the initialization of W. As a result, with W_i given, the subproblem of μ_i can be solved using cheap hard-thresholding, μ_i = Th_s(W_i X_i′). Here, Th_s(·) is the hard-thresholding operator. The subproblem of W_i is as follows:
| (10) |
Because the term βi‖Wi†μi‖w,∗ is more like a postfix operator, we divide the updating process of Wi into two parts:
| (11) |
(a) The first formula is
| (12) |
Decompose X_i′X_i′^T + λ_i I_p as Z_i Z_i^T, and let O_i = W_i Z_i. Then, W_i X_i′ μ_i^T can be written as O_i Z_i^{−1} X_i′ μ_i^T. Let O_i and Z_i^{−1} X_i′ μ_i^T have the full SVDs UΦV^T and PΨQ^T, respectively. If we consider only their diagonal matrices, the foregoing formula can be rewritten as
| (13) |
where log|det Z^{−1}| is a constant and can be omitted. The revised problem is convex in φ_i, so the optimal solution can be found by taking the partial derivative with respect to φ_i and setting it to 0.
| (14) |
Therefore, excluding the nonpositive results, the solution is
| (15) |
To sum up, the transform update step can be computed as follows:
| (16) |
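The structure of this transform update can be checked numerically. The sketch below assumes the objective ‖WX′ − μ‖_F^2 + λ(‖W‖_F^2 − log|det W|), which matches the regularizer Q(W) defined earlier but whose exact scaling in the original derivation is an assumption here. Under that assumption, the stationarity condition for each diagonal entry is a scalar quadratic whose positive root is φ = (ψ + √(ψ² + 2λ))/2, in the style of the closed-form updates known from the transform learning literature. The names `transform_update` and `objective_gradient` and the random test data are hypothetical.

```python
import numpy as np

def transform_update(X, mu, lam):
    """Closed-form minimizer of ||W X - mu||_F^2 + lam * (||W||_F^2 - log|det W|)."""
    p = X.shape[0]
    Z = np.linalg.cholesky(X @ X.T + lam * np.eye(p))    # Z Z^T = X X^T + lam I
    Z_inv = np.linalg.inv(Z)
    P, psi, Qt = np.linalg.svd(Z_inv @ X @ mu.T)         # full SVD: P diag(psi) Qt
    phi = 0.5 * (psi + np.sqrt(psi ** 2 + 2.0 * lam))    # positive root of 2 phi^2 - 2 psi phi - lam = 0
    return Qt.T @ np.diag(phi) @ P.T @ Z_inv

def objective_gradient(W, X, mu, lam):
    """Gradient of the assumed objective with respect to W."""
    return 2.0 * (W @ X - mu) @ X.T + 2.0 * lam * W - lam * np.linalg.inv(W).T

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 20))     # stand-in for one patch group
mu = rng.standard_normal((4, 20))    # stand-in for its sparse code
lam = 0.5
W = transform_update(X, mu, lam)
grad = objective_gradient(W, X, mu, lam)
print(np.abs(grad).max())            # should be close to machine precision
```

The vanishing gradient confirms that the update is a stationary point of the assumed objective; since every φ is strictly positive, the returned W is always invertible, which is the purpose of the log-determinant term.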
(b) The second formula is
| (17) |
With W_i fixed as obtained in step (a), this part can be simply seen as
| (18) |
where L_{W_i†μ_i} w_i Σ_{W_i†μ_i} R_{W_i†μ_i} = SVD(W_i†μ_i), and W_i†μ_i represents the denoised matrix. Following Liu et al. [4], a desirable weighting vector w_i in the image domain can be given as
| (19) |
where σ_i(W_i†μ_i) is the ith singular value of W_i†μ_i, c is a positive constant, and ε = 10^{−16} is used to avoid division by zero. The optimal solution of the second formula is
| (20) |
where X_i″ = W_i†μ_i and the soft-thresholding operator S_w(Σ_i) is defined as S_w(Σ_i) = max(Σ_i − w_i, 0).
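Step (b) amounts to an SVD followed by weighted soft-thresholding of the singular values. In the sketch below, the weight is assumed to take the simple form w_i = c/(σ_i + ε), that is, inversely proportional to the singular value; any additional scaling factor used in the original weighting formula is not reproduced here. The function name `weighted_svt` and the toy matrix are illustrative.

```python
import numpy as np

def weighted_svt(X, c, eps=1e-16):
    """Weighted singular value soft-thresholding of X."""
    L, sigma, Rt = np.linalg.svd(X, full_matrices=False)
    w = c / (sigma + eps)                  # assumed weight: large singular values are shrunk less
    shrunk = np.maximum(sigma - w, 0.0)    # S_w(Sigma) = max(Sigma - w, 0)
    return L @ np.diag(shrunk) @ Rt, shrunk

X = np.diag([4.0, 1.0, 0.1])               # singular values 4, 1, 0.1
X_hat, shrunk = weighted_svt(X, c=1.0)
print(shrunk)                              # weights are 0.25, 1, 10
print(np.linalg.matrix_rank(X_hat))
```

Because small singular values receive large weights, they are driven to zero while the dominant ones are almost preserved, which is the intended low-rank behavior.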
The summary of our optimization solution is presented in Algorithm 1 where the similar patches are determined by Euclidean distance.
Algorithm 1.

Efficient Solution of STLWSM.
4. Experiment Results
In this section, we choose 25, 12, 15, and 10 reference images with a size of 256 × 256 from TID2008 [27], USC-SIPI, Live-IQAD [28], and IVC-SQDB [29], respectively, to test the image denoising effects. As we apply six different noise levels to the test images in our experiments, the total number of distorted images is 372. Some representative images from the USC-SIPI database are shown in Figures 1 and 2. Four recently proposed methods are adopted for comparison: the patch-based algorithm GSR, the weighted nuclear norm method WNNM, the sparsifying transform learning scheme SOLST, and the sparsifying transform learning and low-rank model STROLLR. The noisy images are obtained by adding Gaussian noise with σn = 15, 20, 30, 40, 50, and 75. All competing algorithms use their default settings, which have been finely tuned and verified in their original publications. Since our method is derived from schemes in both the image domain and the transform domain, we set our parameters the same as the representative methods in these two domains, i.e., WNNM and SOLST, for fairness. That is, for the image denoising application, when σn ≤ 20, p is 6, M is 70, and λi is 0.54. When 20 < σn ≤ 40, p is 7, M is 90, and λi is 0.56. When 40 < σn ≤ 60, p is 8, M is 120, and λi is 0.58. For all other values of σn, p is 9, M is 140, and λi is 0.58. In addition, 6 images of size 512 × 512 from USC-SIPI (Figure 3) are used in the image inpainting application, for which we follow a similar setting rule. The balance parameters αi and βi are both set as αi = βi = 10·‖Xi′‖_F^2. Table 1 shows the detailed parameter settings in our experiments, where the values in brackets are used for the 512 × 512 images and the plain ones for the 256 × 256 images.
Figure 1.

Original gray images.
Figure 2.

Original color images.
Figure 3.

Original images of size 512 × 512.
Table 1.
Parameter setting in our experiments.
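| σn (σm) | 15 (15%) | 20 (20%) | 30 (30%) | 40 (40%) | 50 (50%) | 75 |
|---|---|---|---|---|---|---|
| p | 6 (12) | 6 (12) | 7 (14) | 7 (14) | 8 (16) | 9 |
| M | 70 (200) | 70 (200) | 90 (260) | 90 (260) | 120 (300) | 140 |
| λi | 0.54 (0.54) | 0.54 (0.54) | 0.56 (0.56) | 0.56 (0.56) | 0.58 (0.58) | 0.58 |
| αi | 10·‖Xi′‖_F^2 (10·‖Xi′‖_F^2) | 10·‖Xi′‖_F^2 (10·‖Xi′‖_F^2) | 10·‖Xi′‖_F^2 (10·‖Xi′‖_F^2) | 10·‖Xi′‖_F^2 (10·‖Xi′‖_F^2) | 10·‖Xi′‖_F^2 (10·‖Xi′‖_F^2) | 10·‖Xi′‖_F^2 |
| βi | 10·‖Xi′‖_F^2 (10·‖Xi′‖_F^2) | 10·‖Xi′‖_F^2 (10·‖Xi′‖_F^2) | 10·‖Xi′‖_F^2 (10·‖Xi′‖_F^2) | 10·‖Xi′‖_F^2 (10·‖Xi′‖_F^2) | 10·‖Xi′‖_F^2 (10·‖Xi′‖_F^2) | 10·‖Xi′‖_F^2 |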
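The noise-level-dependent settings for the 256 × 256 denoising experiments can be written as a small lookup, together with the additive Gaussian noise used to create the test images. This is a sketch for illustration: the function names, the dictionary keys, and the seed handling are hypothetical, and the values of p, M, and λ are simply those reported above.

```python
import numpy as np

def denoising_params(sigma_n):
    """Return the reported settings (p, M, lambda) for a given noise level sigma_n."""
    if sigma_n <= 20:
        return {"p": 6, "M": 70, "lam": 0.54}
    if sigma_n <= 40:
        return {"p": 7, "M": 90, "lam": 0.56}
    if sigma_n <= 60:
        return {"p": 8, "M": 120, "lam": 0.58}
    return {"p": 9, "M": 140, "lam": 0.58}

def add_gaussian_noise(image, sigma_n, seed=0):
    """Add zero-mean Gaussian noise with standard deviation sigma_n (no clipping)."""
    rng = np.random.default_rng(seed)
    return image.astype(float) + sigma_n * rng.standard_normal(image.shape)

clean = np.full((64, 64), 128.0)             # stand-in for a reference image
noisy = add_gaussian_noise(clean, sigma_n=20)
print(denoising_params(20), float(np.std(noisy - clean)))
```

Whether the noisy image is clipped or quantized back to 8 bits is not specified in the text, so the sketch leaves the values as floating point.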
The peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) are used to evaluate the quality of the denoised images. PSNR is defined by
| $\mathrm{PSNR} = 10 \log_{10}\left(\dfrac{255^2}{\mathrm{MSE}}\right)$ | (21) |
where MSE is the mean squared error between the original image and the denoised one. SSIM is defined as [28, 30]
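| $\mathrm{SSIM}(x, y) = \dfrac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$ | (22) |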
where x and y represent the original image and the denoised one, respectively, μ_x and μ_y are the mean values of x and y, σ_x^2 and σ_y^2 are the variances, and σ_xy is the covariance. C_1 and C_2 denote two stabilization constants.
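Both quality measures can be computed in a few lines. The sketch below assumes 8-bit images with a peak value of 255 and evaluates SSIM once over the whole image rather than over local windows, which is a simplification of common practice. The constants k1 = 0.01 and k2 = 0.03 are conventional defaults and are an assumption here, since the values of C_1 and C_2 are not stated in the text.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((reference.astype(float) - estimate.astype(float)) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))

def global_ssim(x, y, peak=255.0, k1=0.01, k2=0.03):
    """SSIM evaluated once over the whole image (no sliding window)."""
    x = x.astype(float)
    y = y.astype(float)
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2     # assumed stabilization constants
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(32, 32)).astype(float)
shifted = np.full((8, 8), 100.0) + 15.0             # constant error of 15 gray levels
print(psnr(np.full((8, 8), 100.0), shifted))        # 10*log10(255^2 / 15^2)
print(global_ssim(ref, ref))                        # identical images
```

A windowed SSIM averaged over the image is what most toolboxes report; the global version here is only meant to show how the terms of the formula fit together.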
For a thorough comparison, we list the average denoising results from all the 372 distorted images in Table 2. Also, the experimental results from all the gray images of USC-SIPI are provided in Table 3.
Table 2.
Average denoising results with different noise level (PSNR/SSIM).
| σn | GSR | WNNM | SOLST | STROLLR | STLWSM |
|---|---|---|---|---|---|
| 15 | 38.63/0.574 | 50.78/0.935 | 47.74/0.914 | 42.99/0.677 | 55.49/0.983 |
| 20 | 35.25/0.431 | 48.36/0.925 | 45.34/0.894 | 40.11/0.584 | 53.20/0.978 |
| 30 | 31.23/0.293 | 45.88/0.903 | 41.91/0.822 | 36.67/0.465 | 52.90/0.973 |
| 40 | 28.59/0.216 | 43.40/0.874 | 39.45/0.790 | 33.72/0.375 | 50.52/0.962 |
| 50 | 26.60/0.165 | 43.82/0.811 | 37.54/0.734 | 31.70/0.328 | 50.75/0.955 |
| 75 | 23.04/0.095 | 41.19/0.488 | 34.05/0.609 | 28.17/0.254 | 48.07/0.920 |
Table 3.
Gray images' denoising results (PSNR/SSIM).
| Image | σn | GSR | WNNM | SOLST | STROLLR | STLWSM |
|---|---|---|---|---|---|---|
| Baboon | 15 | 38.27/0.765 | 50.89/0.981 | 47.76/0.960 | 43.00/0.752 | 55.74/0.992 |
| | 20 | 35.41/0.650 | 48.42/0.967 | 45.35/0.933 | 40.12/0.640 | 53.36/0.987 |
| | 30 | 31.60/0.478 | 45.98/0.937 | 41.91/0.865 | 36.34/0.469 | 53.08/0.979 |
| | 40 | 29.01/0.359 | 43.45/0.896 | 39.45/0.788 | 33.70/0.356 | 50.63/0.963 |
| | 50 | 27.03/0.275 | 43.89/0.809 | 37.54/0.710 | 31.71/0.280 | 50.88/0.955 |
| | 75 | 23.47/0.154 | 41.21/0.360 | 34.05/0.535 | 28.18/0.170 | 48.15/0.906 |
| Camera | 15 | 39.32/0.577 | 50.74/0.979 | 47.72/0.959 | 42.89/0.741 | 55.32/0.990 |
| | 20 | 35.77/0.429 | 48.36/0.964 | 45.32/0.932 | 40.04/0.629 | 53.11/0.985 |
| | 30 | 31.67/0.295 | 45.89/0.935 | 41.92/0.864 | 36.31/0.458 | 52.81/0.976 |
| | 40 | 29.03/0.225 | 43.43/0.894 | 39.45/0.788 | 33.69/0.347 | 50.47/0.961 |
| | 50 | 27.04/0.179 | 43.79/0.806 | 37.54/0.709 | 31.70/0.271 | 50.71/0.952 |
| | 75 | 23.47/0.112 | 41.16/0.359 | 34.05/0.535 | 28.17/0.162 | 48.06/0.902 |
| Couple | 15 | 38.82/0.719 | 50.82/0.980 | 47.75/0.960 | 42.95/0.746 | 55.57/0.991 |
| | 20 | 35.61/0.584 | 48.40/0.967 | 45.34/0.933 | 40.09/0.634 | 53.26/0.986 |
| | 30 | 31.64/0.411 | 45.87/0.936 | 41.90/0.865 | 36.33/0.463 | 52.98/0.978 |
| | 40 | 29.02/0.305 | 43.44/0.895 | 39.45/0.288 | 33.76/0.350 | 50.57/0.963 |
| | 50 | 27.03/0.233 | 43.82/0.807 | 37.54/0.711 | 31.69/0.275 | 50.83/0.954 |
| | 75 | 23.47/0.132 | 41.19/0.359 | 34.05/0.535 | 28.18/0.166 | 48.14/0.905 |
| Lax | 15 | 38.39/0.751 | 50.80/0.980 | 47.76/0.959 | 42.64/0.717 | 55.65/0.992 |
| | 20 | 35.46/0.636 | 48.38/0.966 | 45.34/0.931 | 39.86/0.600 | 53.29/0.986 |
| | 30 | 31.61/0.470 | 45.88/0.935 | 41.91/0.863 | 36.20/0.425 | 52.97/0.977 |
| | 40 | 29.01/0.357 | 43.39/0.894 | 39.45/0.787 | 33.59/0.313 | 50.55/0.962 |
| | 50 | 27.02/0.277 | 43.83/0.806 | 37.53/0.710 | 31.61/0.241 | 50.78/0.953 |
| | 75 | 23.47/0.160 | 41.20/0.359 | 34.05/0.536 | 28.11/0.140 | 48.08/0.903 |
| Man | 15 | 38.79/0.690 | 50.74/0.979 | 47.73/0.959 | 42.88/0.739 | 55.37/0.990 |
| | 20 | 35.61/0.552 | 48.36/0.965 | 45.33/0.931 | 40.03/0.627 | 53.13/0.985 |
| | 30 | 31.64/0.381 | 45.89/0.935 | 41.90/0.863 | 36.29/0.454 | 52.85/0.976 |
| | 40 | 29.02/0.279 | 43.38/0.894 | 39.45/0.786 | 33.67/0.342 | 50.48/0.961 |
| | 50 | 27.03/0.212 | 43.77/0.806 | 37.53/0.708 | 31.67/0.267 | 50.71/0.952 |
| | 75 | 23.47/0.119 | 41.19/0.358 | 34.05/0.533 | 28.15/0.159 | 48.05/0.903 |
| Woman1 | 15 | 39.02/0.648 | 50.81/0.996 | 47.75/0.960 | 43.08/0.759 | 55.53/0.991 |
| | 20 | 35.68/0.499 | 48.34/0.990 | 45.34/0.933 | 40.18/0.650 | 53.22/0.986 |
| | 30 | 31.66/0.333 | 45.87/0.936 | 41.90/0.865 | 36.39/0.479 | 52.91/0.978 |
| | 40 | 29.02/0.240 | 43.38/0.895 | 39.45/0.788 | 33.73/0.366 | 50.52/0.962 |
| | 50 | 27.03/0.181 | 43.81/0.807 | 37.54/0.710 | 31.71/0.289 | 50.74/0.954 |
| | 75 | 23.47/0.102 | 41.19/0.358 | 34.05/0.534 | 28.20/0.177 | 48.06/0.904 |
From these two tables, we can observe that, among the competing algorithms, GSR also exploits nonlocal similarity by grouping image patches to obtain a low-rank structure. However, it requires too many iterations in practical applications, e.g., 100 or even up to 200. In contrast, WNNM needs fewer iterations, around 14, and achieves better results than the other three competing algorithms, by an average of 8.26 dB for gray images. Meanwhile, the proposed STLWSM needs the fewest iterations and achieves the best performance.
SOLST and STROLLR are both transform-based algorithms and are highly efficient. SOLST trains transform matrices for each group, while STROLLR combines nonlocal low rank and transform learning; SOLST also achieves better results than STROLLR, by an average of 5.54 dB. In Table 3, the numerical results of the proposed STLWSM are all shown in bold, which marks the best result among the five algorithms. It is evident that the proposed method achieves a visible improvement in PSNR at all noise levels, with an average of 13.61 dB. More visual results are shown in Figure 4, in which our method clearly outperforms all other methods.
Figure 4.

PSNR AVG of gray images denoising results.
Moreover, considering that GSR needs too many iterations and that pure transform learning algorithms are much faster, we compare our time consumption against WNNM, and the results are shown in Figure 5. It can be seen that our method spends much less time than WNNM, at an average of 55.46%.
Figure 5.

Elapsed time comparison in gray images.
Our algorithm also has good scalability. We further use RGB images in IDN, and the experimental results show that the proposed STLWSM still outperforms the other algorithms; specific numerical comparisons are shown in Table 4. Again, Figures 6 and 7, respectively, show the visual results in terms of the average PSNR and the elapsed time, which also demonstrate our superiority over the other competitors. Figures 8 and 9 show the visual results of the average SSIM comparison for gray images and color images, respectively. It can be seen that our method can preserve the structure of the denoised image even at high noise levels.
Table 4.
Color images' denoising results (PSNR/SSIM).
| Image | σn | GSR | WNNM | SOLST | STROLLR | STLWSM |
|---|---|---|---|---|---|---|
| House | 15 | 38.89/0.507 | 50.87/0.980 | 47.76/0.960 | 43.06/0.756 | 55.73/0.992 |
| | 20 | 35.08/0.550 | 48.43/0.967 | 45.35/0.933 | 40.14/0.645 | 53.35/0.987 |
| | 30 | 30.90/0.448 | 45.96/0.937 | 41.91/0.865 | 36.37/0.477 | 53.04/0.978 |
| | 40 | 28.24/0.356 | 43.40/0.896 | 39.45/0.788 | 33.73/0.364 | 50.63/0.963 |
| | 50 | 26.24/0.268 | 43.86/0.808 | 37.54/0.710 | 31.71/0.287 | 50.86/0.955 |
| | 75 | 22.67/0.152 | 41.19/0.359 | 34.05/0.534 | 28.19/0.176 | 48.14/0.905 |
| House 2 | 15 | 38.20/0.329 | 50.78/0.980 | 47.74/0.960 | 43.19/0.770 | 55.55/0.992 |
| | 20 | 34.87/0.319 | 48.37/0.967 | 45.33/0.933 | 40.26/0.663 | 53.25/0.986 |
| | 30 | 30.85/0.265 | 45.88/0.937 | 41.90/0.865 | 36.44/0.466 | 52.97/0.978 |
| | 40 | 28.22/0.211 | 43.42/0.896 | 39.44/0.789 | 33.80/0.383 | 50.57/0.963 |
| | 50 | 26.23/0.172 | 43.82/0.809 | 37.54/0.710 | 31.76/0.278 | 50.84/0.955 |
| | 75 | 22.67/0.109 | 41.22/0.358 | 34.05/0.533 | 28.21/0.190 | 48.15/0.907 |
| Lake | 15 | 38.10/0.484 | 50.67/0.979 | 47.71/0.959 | 42.93/0.746 | 55.29/0.990 |
| | 20 | 34.83/0.461 | 48.27/0.965 | 45.31/0.932 | 40.08/0.635 | 53.09/0.985 |
| | 30 | 30.84/0.381 | 45.83/0.935 | 41.89/0.864 | 36.32/0.466 | 52.82/0.977 |
| | 40 | 28.22/0.291 | 43.38/0.895 | 39.44/0.788 | 33.71/0.354 | 50.48/0.962 |
| | 50 | 26.23/0.226 | 43.78/0.808 | 37.54/0.710 | 31.68/0.278 | 50.72/0.954 |
| | 75 | 22.67/0.130 | 41.22/0.359 | 34.04/0.534 | 28.16/0.169 | 48.08/0.906 |
| Pepper | 15 | 38.52/0.535 | 50.74/0.978 | 47.74/0.959 | 42.94/0.744 | 55.28/0.989 |
| | 20 | 34.97/0.492 | 48.33/0.964 | 45.33/0.932 | 40.07/0.632 | 53.06/0.984 |
| | 30 | 30.88/0.439 | 45.84/0.933 | 41.98/0.864 | 39.52/0.476 | 52.73/0.975 |
| | 40 | 28.23/0.344 | 43.35/0.892 | 39.45/0.787 | 33.69/0.348 | 50.45/0.959 |
| | 50 | 26.24/0.279 | 43.81/0.805 | 37.54/0.709 | 31.70/0.272 | 50.61/0.950 |
| | 75 | 22.67/0.158 | 41.18/0.358 | 34.05/0.534 | 28.16/0.164 | 47.95/0.900 |
| Plane | 15 | 38.44/0.451 | 50.82/0.980 | 47.74/0.961 | 43.31/0.782 | 55.57/0.992 |
| | 20 | 34.94/0.431 | 48.37/0.967 | 45.34/0.939 | 40.35/0.676 | 53.25/0.987 |
| | 30 | 30.87/0.348 | 45.88/0.937 | 41.91/0.867 | 36.51/0.511 | 52.95/0.979 |
| | 40 | 28.23/0.265 | 43.41/0.896 | 39.45/0.790 | 33.85/0.397 | 50.55/0.963 |
| | 50 | 26.23/0.204 | 43.85/0.809 | 37.54/0.712 | 31.81/0.318 | 50.79/0.955 |
| | 75 | 22.67/0.117 | 41.17/0.358 | 34.05/0.534 | 28.23/0.200 | 48.14/0.906 |
| Woman 2 | 15 | 38.82/0.398 | 50.73/0.979 | 47.74/0.959 | 42.89/0.737 | 55.36/0.990 |
| | 20 | 35.07/0.390 | 48.32/0.965 | 45.34/0.932 | 40.03/0.623 | 53.11/0.985 |
| | 30 | 30.90/0.302 | 45.86/0.934 | 41.90/0.863 | 36.29/0.451 | 52.73/0.976 |
| | 40 | 28.23/0.227 | 43.42/0.893 | 39.44/0.786 | 33.67/0.338 | 50.39/0.960 |
| | 50 | 26.24/0.174 | 43.84/0.805 | 37.53/0.708 | 31.67/0.264 | 50.61/0.951 |
| | 75 | 22.67/0.102 | 41.14/0.357 | 34.04/0.533 | 28.14/0.157 | 47.93/0.901 |
Figure 6.

PSNR AVG of color images denoising results.
Figure 7.

Elapsed time comparison in color images.
Figure 8.

SSIM AVG of gray images denoising results.
Figure 9.

SSIM AVG of color images denoising results.
To illustrate the efficiency of our algorithm in more detail, we report its results over different numbers of iterations (up to 10). The experimental results are shown in Figure 10. The PSNR values of all 12 images are averaged for each noise level. The PSNR values of the original noisy images at the different noise levels are shown as the starting point, where the top black line is the maximum value of 24.63, the bottom black line is the minimum value of 10.65, and the red line represents the median of 17.64. The green star marks the average of 17.72. Figure 10 shows that our algorithm converges quickly and needs a limited number of iterations, mostly 3, to reach the final solution.
Figure 10.

Average PSNR of 12 images denoising in each epoch of different image noise levels.
We also applied our method to image inpainting with 6 images of size 512 × 512. The degraded images are obtained by element-wise multiplication with a random logical matrix, and the missing rates are set as σm = {15%, 20%, 30%, 40%, 50%}. The original images are shown in Figure 3, and the image inpainting results are shown in Table 5. The results show that all methods achieve admirable inpainting results when filling in missing pixels, and the proposed STLWSM still outperforms all the other state-of-the-art algorithms. Taking the image denoising results into account as well, our STLWSM is more robust, with much smaller PSNR changes than the other competing approaches.
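The degradation used for inpainting, element-wise multiplication with a random logical matrix, can be sketched as follows. The function names, the seed, and the stand-in image are hypothetical; the convention that True marks an observed pixel is also a choice made for this example.

```python
import numpy as np

def random_mask(shape, missing_rate, seed=0):
    """Random logical matrix: True marks an observed pixel, False a missing one."""
    rng = np.random.default_rng(seed)
    return rng.random(shape) >= missing_rate

def degrade(image, mask):
    """Zero out the missing pixels by element-wise multiplication with the mask."""
    return image * mask

rng = np.random.default_rng(1)
image = rng.integers(1, 256, size=(64, 64)).astype(float)   # stand-in image, strictly positive
mask = random_mask(image.shape, missing_rate=0.30)
degraded = degrade(image, mask)
print(1.0 - mask.mean())   # empirical missing rate, close to 0.30
```

Each pixel is dropped independently, so the empirical missing rate only approximates the nominal σm; an exact rate could be obtained by shuffling a fixed number of zeros instead.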
Table 5.
Image inpainting results for images of size 512 × 512.
| Image | σm (%) | WNNM | SOLST | STROLLR | STLWSM |
|---|---|---|---|---|---|
| Boats | 15 | 57.88 | 56.51 | 56.88 | 58.05 |
| | 20 | 57.36 | 56.19 | 56.32 | 57.82 |
| | 30 | 56.57 | 55.76 | 55.87 | 57.32 |
| | 40 | 56.08 | 55.18 | 55.53 | 56.64 |
| | 50 | 55.86 | 54.79 | 55.05 | 55.63 |
| Clock | 15 | 54.39 | 53.85 | 53.91 | 55.12 |
| | 20 | 54.16 | 53.45 | 53.62 | 54.81 |
| | 30 | 53.99 | 53.76 | 53.94 | 55.52 |
| | 40 | 53.42 | 53.14 | 53.49 | 53.79 |
| | 50 | 52.15 | 52.14 | 52.50 | 52.71 |
| Factory | 15 | 59.11 | 58.18 | 58.26 | 59.68 |
| | 20 | 58.85 | 57.73 | 57.76 | 59.45 |
| | 30 | 58.26 | 56.16 | 56.35 | 58.98 |
| | 40 | 57.55 | 55.14 | 55.46 | 57.57 |
| | 50 | 56.86 | 54.49 | 55.11 | 56.66 |
| Baboon | 15 | 57.95 | 56.18 | 57.97 | 58.54 |
| | 20 | 57.25 | 55.85 | 56.95 | 57.94 |
| | 30 | 57.09 | 55.47 | 56.12 | 57.58 |
| | 40 | 56.56 | 54.95 | 55.27 | 57.07 |
| | 50 | 56.01 | 54.35 | 54.48 | 56.18 |
| Beans | 15 | 56.13 | 54.26 | 55.18 | 56.57 |
| | 20 | 55.85 | 54.19 | 54.79 | 56.19 |
| | 30 | 54.32 | 53.92 | 54.22 | 55.55 |
| | 40 | 53.61 | 53.14 | 53.29 | 54.74 |
| | 50 | 52.52 | 51.95 | 52.03 | 53.67 |
| Tree | 15 | 57.15 | 56.74 | 56.91 | 57.85 |
| | 20 | 57.08 | 56.34 | 56.66 | 57.64 |
| | 30 | 56.59 | 55.73 | 56.71 | 57.11 |
| | 40 | 54.73 | 54.67 | 54.71 | 55.26 |
| | 50 | 53.56 | 53.22 | 53.34 | 53.95 |
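The degradation step used in the inpainting experiment (element-wise multiplication by a random logical matrix at a given missing rate) can be illustrated with a minimal Python sketch. It is not taken from the authors' code: the function names `random_mask`, `degrade`, and `psnr` are hypothetical, the mask is assumed to be drawn independently per pixel, and the peak value of 255 assumes 8-bit images. A synthetic array stands in for a test image.

```python
import numpy as np

def random_mask(shape, missing_rate, rng):
    """Random logical matrix: True = pixel observed, False = pixel missing.

    Assumption: each pixel is dropped independently with probability
    `missing_rate`; the paper only states that the matrix is random.
    """
    return rng.random(shape) >= missing_rate

def degrade(image, mask):
    """Element-wise product of the image with the logical mask."""
    return image * mask

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB (peak=255 assumes 8-bit images)."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for a 512 x 512 grayscale test image.
    image = rng.integers(0, 256, size=(512, 512)).astype(np.float64)
    for rate in (0.15, 0.20, 0.30, 0.40, 0.50):
        mask = random_mask(image.shape, rate, rng)
        observed = degrade(image, mask)
        missing = 1.0 - mask.mean()
        print(f"target rate {rate:.2f}, realized missing fraction {missing:.3f}, "
              f"PSNR of degraded input {psnr(image, observed):.2f} dB")
```

An inpainting method would take `observed` and `mask` as input and return an estimate that can be scored against `image` with `psnr`.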
5. Conclusions
In this paper, we have proposed a unified image denoising framework that uses knowledge from both the image domain and the transform domain, namely, sparsifying transform learning and weighted singular values minimization (STLWSM). Specifically, we learn a transform matrix for each group of patches with similar structure. After obtaining the optimized transform matrix and the sparse coefficients with an efficient optimization algorithm, we further restore the image patch groups through their low-rank prior. Applying STLWSM to all the groups yields the reconstructed denoised image. For both gray and color images, experimental results show that the proposed model achieves a visible improvement in PSNR over other state-of-the-art approaches. Our efficient optimization algorithm also requires much less running time than the typical image domain-based methods. Note that while the pure transform learning methods run faster than STLWSM, they perform worse by a large margin. Further improving the efficiency of our framework will be our main work in the near future.
Acknowledgments
This work was supported in part by the National Key Research and Development Program of China under Grant No. 2018YFE0126100, the National Science Fund of China under Grant Nos. 51875524, 61873240, and 61602413, and the Natural Science Foundation of Zhejiang Province of China under Grant No. LY19F030016.
Data Availability
The image data are provided in the manuscript, and all images can be found at http://sipi.usc.edu/database/. The code for this article is available at https://github.com/Yapan0975/STLWSM.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
References
- 1. Lefkimmiatis S. Universal denoising networks: a novel CNN architecture for image denoising. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; June 2018; Salt Lake City, UT, USA. IEEE; pp. 3204–3213.
- 2. Jia X., Liu S., Feng X., Zhang L. FOCNet: a fractional optimal control network for image denoising. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; June 2019; Long Beach, CA, USA. IEEE; pp. 6054–6063.
- 3. Fang X., Xu Y., Li X., Fan Z., Liu H., Chen Y. Locality and similarity preserving embedding for feature selection. Neurocomputing. 2014;128:304–315. doi: 10.1016/j.neucom.2013.08.040.
- 4. Liu J., Zhang L., Yang J. Mixed noise removal by weighted encoding with sparse nonlocal regularization. IEEE Transactions on Image Processing. 2014;23(6):2651–2662. doi: 10.1109/tip.2014.2317985.
- 5. Wei W., Jia Q. Weighted feature Gaussian kernel SVM for emotion recognition. Computational Intelligence and Neuroscience. 2016;2016:1–7. doi: 10.1155/2016/7696035.
- 6. Hu Y., Zhang D., Ye J., Li X., He X. Fast and accurate matrix completion via truncated nuclear norm regularization. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2012;35(9):2117–2130. doi: 10.1109/tpami.2012.271.
- 7. He S., Zhang L., Zuo W., Feng X. Weighted nuclear norm minimization with application to image denoising. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; June 2014; Columbus, OH, USA. IEEE; pp. 2862–2869.
- 8. Zhang J., Zhao D., Gao W. Group-based sparse representation for image restoration. IEEE Transactions on Image Processing. 2014;23(8):3336–3351. doi: 10.1109/tip.2014.2323127.
- 9. Liu H., Xiong R., Zhang J., Gao W. Image denoising via adaptive soft-thresholding based on non-local samples. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; June 2015; Boston, MA, USA. IEEE; pp. 484–492.
- 10. Li F., Lv X. A decoupled method for image inpainting with patch-based low rank regularization. Applied Mathematics and Computation. 2017;314:334–348. doi: 10.1016/j.amc.2017.06.027.
- 11. Sahoo S. K., Makur A. Dictionary training for sparse representation as generalization of K-means clustering. IEEE Signal Processing Letters. 2013;20(6):587–590. doi: 10.1109/lsp.2013.2258912.
- 12. Liu Z., Yu L., Sun H. Image restoration via Bayesian dictionary learning with nonlocal structured beta process. Journal of Visual Communication & Image Representation. 2018;52:159–169. doi: 10.1016/j.jvcir.2018.02.011.
- 13. Jiajun D., Donghai B., Qingpei W. A novel multi-dictionary framework with global sensing matrix design for compressed sensing. Signal Processing. 2018;152:69–78. doi: 10.1016/j.sigpro.2018.05.012.
- 14. Hongyao D., Qingxin Z., Xiuli S. A decision-based modified total variation diffusion method for impulse noise removal. Computational Intelligence and Neuroscience. 2017;2017:1–20. doi: 10.1155/2017/2024396.
- 15. Xu J., Zhang L., Zhang D. Multi-channel weighted nuclear norm minimization for real color image denoising. IEEE International Conference on Computer Vision (ICCV); October 2017; Venice, Italy. IEEE.
- 16. Foucart S. Dictionary-sparse recovery via thresholding-based algorithms. Journal of Fourier Analysis and Applications. 2016;22(1):6–19. doi: 10.1007/s00041-015-9411-4.
- 17. Xie J., Liao A., Lei Y. A new accelerated alternating minimization method for analysis sparse recovery. Signal Processing. 2018;145:167–174. doi: 10.1016/j.sigpro.2017.12.010.
- 18. Wen B., Ravishankar S., Bresler Y. Structured overcomplete sparsifying transform learning with convergence guarantees and applications. International Journal of Computer Vision. 2015;114(2-3):137–167. doi: 10.1007/s11263-014-0761-1.
- 19. Ravishankar S., Bresler Y. Learning sparsifying transforms. IEEE Transactions on Signal Processing. 2012;61(5):1072–1086. doi: 10.1109/tsp.2012.2226449.
- 20. Wen B., Ravishankar S., Bresler Y. Learning overcomplete sparsifying transforms with block cosparsity. 2014 IEEE International Conference on Image Processing (ICIP); October 2014; Paris, France. IEEE; pp. 803–807.
- 21. Ravishankar S., Bresler Y. Learning doubly sparse transforms for images. IEEE Transactions on Image Processing. 2013;22(12):4598–4612. doi: 10.1109/tip.2013.2274384.
- 22. Zhang K. S., Zhong L., Zhang X. Y. Image restoration via group l2,1 norm-based structural sparse representation. International Journal of Pattern Recognition and Artificial Intelligence. 2018;32(04):1854008. doi: 10.1142/s0218001418540083.
- 23. Wang Q., Zhang X., Wu Y., Tang L., Zha Z. Non-convex weighted lp minimization based group sparse representation framework for image denoising. IEEE Signal Processing Letters. 2017;24(11):1686–1690. doi: 10.1109/lsp.2017.2731791.
- 24. Kim D.-G., Shamsi Z. H. Enhanced residual noise estimation of low rank approximation for image denoising. Neurocomputing. 2018;293:1–11. doi: 10.1016/j.neucom.2018.02.063.
- 25. Zheng J., Qin M., Zhou X. Efficient implementation of truncated reweighting low-rank matrix approximation. IEEE Transactions on Industrial Informatics. 2019;16(1):488–500. doi: 10.1109/tii.2019.2916986.
- 26. Liu Q., Lai Z., Zhou Z. A truncated nuclear norm regularization method based on weighted residual error for matrix completion. IEEE Transactions on Image Processing. 2015;25(1):316–330. doi: 10.1109/tip.2015.2503238.
- 27. Ponomarenko N., Lukin V., Zelensky A. TID2008-a database for evaluation of full-reference visual quality assessment metrics. Advances of Modern Radioelectronics. 2009;10(4):30–45.
- 28. Wang Z., Bovik A. C., Sheikh H. R., Simoncelli E. P. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing. 2004;13(4):600–612. doi: 10.1109/tip.2003.819861.
- 29. Le Callet P., Autrusseau F. Subjective Quality Assessment IRCCyN/IVC database. 2005.
- 30. Wang Z., Bovik A. C. Mean squared error: love it or leave it? A new look at signal fidelity measures. IEEE Signal Processing Magazine. 2009;26(1):98–117. doi: 10.1109/msp.2008.930649.
