Author manuscript; available in PMC: 2018 Sep 1.
Published in final edited form as: Phys Med Biol. 2017 Sep 1;62(18):N428–N435. doi: 10.1088/1361-6560/aa837d

A sequential solution for anisotropic total variation image denoising with interval constraints

Jingyan Xu 1, Frédéric Noo 2
PMCID: PMC5779866  NIHMSID: NIHMS924913  PMID: 28862998

Abstract

We show that two problems involving the anisotropic total variation (TV) and interval constraints on the unknown variables admit, under some conditions, a simple sequential solution. Problem 1 is a constrained TV penalized image denoising problem; Problem 2 is a constrained fused lasso signal approximator. The sequential solution entails finding first the solution to the unconstrained problem, and then applying a thresholding to satisfy the constraints. If the interval constraints are uniform, this sequential solution solves problem 1. If the interval constraints furthermore contain zero, the sequential solution solves problem 2. Here uniform interval constraints refer to all unknowns being constrained to the same interval. A typical example of application is image denoising in x-ray CT, where the image intensities are non-negative as they physically represent linear attenuation coefficient in the patient body. Our results are simple yet seem unknown; we establish them using the Karush-Kuhn-Tucker conditions for constrained convex optimization.

Keywords: denoising, smoothing, total variation, lasso, fused lasso, interval constraints, box constraints, constrained optimization

1. Introduction

Total variation (TV) is widely used in signal processing and in image denoising, restoration, and reconstruction as an edge-preserving penalty to cope with ill-posedness in the problem. The TV penalty, in both its anisotropic and isotropic forms, is convex but nonsmooth, and thus causes difficulty for numerical optimization. Some numerical algorithms replace the TV by a smoothed version so that conventional gradient-descent type algorithms can be applied. These algorithms introduce a nuisance parameter ε > 0 into the TV penalty; the value of ε trades off computational accuracy against algorithm efficiency [1]. Other methods, such as the primal-dual type [2], the operator-splitting type [3], and numerous others [4–6], have been developed to deal with the nonsmooth TV directly without introducing any nuisance parameters.

In many imaging applications, the image intensity values are known to be non-negative. For example, in x-ray CT, the pixel values can only be non-negative as they represent the x-ray linear attenuation coefficient of human tissues. In nuclear medicine, the pixel values represent radioactivity uptake, which is also non-negative. Incorporating this prior knowledge not only produces more interpretable results [7–9], it also improves image quality; see, e.g., [10–14]. In SPECT and CT, it is also often possible to identify an upper bound on the image intensity values. Such interval constraints have been shown to reduce reconstruction time and generate more accurate attenuation maps in the case of truncated SPECT transmission data [15, 16].

TV problems with interval constraints have also received attention from a numerical optimization perspective; see, e.g., [10, 17–19]. The existing algorithms for the constrained problems often consider a generic image restoration problem with either the anisotropic or the isotropic TV model. In many practical implementations, the interval constraints are simply handled in the following ad-hoc manner: first, the solution to the unconstrained TV problem is obtained; second, this solution is thresholded to satisfy the constraints [13, 20]. In this work, we show that this seemingly naive solution is in fact exact for a subclass of constrained TV problems, namely, when the problem is a denoising problem using the anisotropic TV and when the interval constraints are uniform, i.e., all unknowns share the same interval constraint [l, u], l < u. Note that this subclass includes TV denoising with the non-negativity constraint as a special case, i.e., l = 0 and u = ∞.

The rest of the paper is organized as follows. In Sec. 2 we define Problem 1 and establish that the two-step solution mentioned above is exact. Note that this exactness holds for nonsmooth TV minimization, i.e., without introducing any nuisance parameter ε. Counter-examples with smoothed TV or with nonuniform interval constraints are given in the Appendix. The analysis in Sec. 2 is more detailed and serves as a model for Sec. 3, which discusses a generalization of Problem 1, a constrained fused lasso signal approximator. The constrained fused lasso also admits a sequential solution when the interval constraint is uniform and contains zero. We discuss some related developments in the literature in Sec. 4 and conclude in Sec. 5.

2. TV-penalized denoising with uniform interval constraints

Consider the following problem

$$\min_{\{x_i\}} \left\{ \Theta_1(\gamma) = \sum_{i=1}^{N} \frac{w_i}{2}(x_i - y_i)^2 + \gamma \sum_i \iota_{[l,u]}(x_i) + \sum_i \sum_{j \in \mathcal{N}_i} \beta_{ij}\, |x_i - x_j| \right\} \quad (1)$$

where $x_i$, $y_i$, $i = 1, \cdots, N$ are the unknowns and the noisy input data, respectively, and where $\mathcal{N}_i$ is a fixed set of neighbors for each $i$. The weight $w_i > 0$ is often chosen to be inversely proportional to the variance of $y_i$, and $\beta_{ij} > 0$ is the penalty weight.

Note that we use $\iota_C$ to denote the indicator function of the convex set $C$, i.e., $\iota_C(x) = 0$ if $x \in C$, and $\iota_C(x) = \infty$ if $x \notin C$. In (1), the parameter γ of the indicator function is either 0 or 1, so that the objective function is unconstrained, Θ1(0), or constrained, Θ1(1), respectively. When γ = 1, the unknowns in (1) are all uniformly constrained to the interval [l, u]. We assume l < u and allow l = −∞ or u = ∞, so the constraint can be one-sided. Problem (1) is a general smoothing or denoising problem.
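To make the setup concrete, the objective Θ1(γ) can be evaluated numerically. The sketch below is an illustration only, not part of the paper's method: it assumes a 1-D chain neighborhood $\mathcal{N}_i = \{i+1\}$ with a single constant penalty weight β (the paper allows general neighborhoods and weights $\beta_{ij}$), and it implements the indicator term by returning ∞ at infeasible points.

```python
import numpy as np

def theta1(x, y, w, beta, l=-np.inf, u=np.inf, constrained=False):
    """Evaluate the objective of problem (1) on a 1-D chain.

    Illustrative assumptions: each sample's neighbor set is
    N_i = {i+1}, and beta_ij = beta is one constant.  When
    `constrained` is True (gamma = 1), the indicator term makes
    the objective +inf at any infeasible point.
    """
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    if constrained and (np.any(x < l) or np.any(x > u)):
        return np.inf  # iota_[l,u] contributes +inf outside the box
    data_term = 0.5 * np.sum(w * (x - y) ** 2)   # weighted data fidelity
    tv_term = beta * np.sum(np.abs(np.diff(x)))  # anisotropic TV on a chain
    return data_term + tv_term
```

For instance, with y = {−1, 0, 1}, unit weights, and β = 0.5 (the parameters of the Appendix examples), `theta1` reproduces the objective values quoted there.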

Since the objective function in (1) is strongly convex, Θ1(γ) has a unique global minimizer for each γ ∈ {0, 1}. To simplify notation, we write $z \equiv \{z_i\}$ for a generic vector variable $z$. The solution $x^*$ to the constrained problem Θ1(1) can be obtained from the Karush-Kuhn-Tucker (KKT) conditions for convex nonsmooth problems [21, Theorem 3.34]: if the triple $(x^*, \lambda^{1*}, \lambda^{2*})$ satisfies the following conditions

$$0 \in w_i(x_i^* - y_i) + \sum_{j \in \mathcal{N}_i} \beta_{ij}\, t_{ij}(x^*) - \sum_{j : i \in \mathcal{N}_j} \beta_{ji}\, t_{ji}(x^*) - \lambda_i^{1*} + \lambda_i^{2*}, \quad \forall i \quad (2a)$$
$$x_i^* \le u, \quad \lambda_i^{2*} \ge 0, \quad \lambda_i^{2*}(x_i^* - u) = 0, \quad \forall i \quad (2b)$$
$$x_i^* \ge l, \quad \lambda_i^{1*} \ge 0, \quad \lambda_i^{1*}(x_i^* - l) = 0, \quad \forall i \quad (2c)$$

where $t_{mn}(x)$ is the subdifferential of $|x_m - x_n|$ at $x$, then $x^*$ is the minimizer of Θ1(1). By definition, $t_{mn}(x) = \operatorname{sign}(x_m - x_n)$ if $x_m \ne x_n$, and $t_{mn}(x) = [-1, 1]$ if $x_m = x_n$. The auxiliary variables $\lambda_i^{1*}, \lambda_i^{2*}$, $i = 1, \cdots, N$, are the (optimal) Lagrange multipliers introduced for the constraints.

Denote by $\bar{x}^*$ the solution to minimizing Θ1(0); i.e., the difference between $\bar{x}^*$ and $x^*$ is that $x^*$ satisfies the constraints $l \le x_i \le u$, $\forall i$. We establish that $x_i^* = \operatorname{median}\{\bar{x}_i^*, l, u\}$, which expands to

$$x_i^* = \begin{cases} l & \bar{x}_i^* \le l \\ \bar{x}_i^* & l < \bar{x}_i^* < u \\ u & \bar{x}_i^* \ge u. \end{cases} \quad (3)$$

In other words, x* can be obtained in a two-step manner. First we solve (1) without considering the constraints; second, we perform pixelwise thresholding.
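In code, the second step is just elementwise clipping. The minimal sketch below (an illustration using NumPy, not part of the paper) confirms that the median formula (3) and clipping to [l, u] coincide elementwise:

```python
import numpy as np

def interval_threshold(x_bar, l, u):
    """Pixelwise thresholding step (3): x_i* = median{xbar_i*, l, u}.
    Written with min/max so that one-sided bounds (l = -inf or
    u = +inf) are handled as well."""
    x_bar = np.asarray(x_bar, dtype=float)
    return np.minimum(np.maximum(x_bar, l), u)

x_bar = np.array([-2.0, -0.3, 0.4, 1.7])    # a hypothetical unconstrained solution
out = interval_threshold(x_bar, 0.0, 1.0)   # clip to [0, 1]
med = np.array([np.median([v, 0.0, 1.0]) for v in x_bar])  # median form of (3)
```

Both `out` and `med` equal {0, 0, 0.4, 1}, confirming the equivalence of the two forms.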

The solution $\bar{x}^*$ from the first step satisfies the first order optimality condition, i.e.,

$$0 \in w_i(\bar{x}_i^* - y_i) + \sum_{j \in \mathcal{N}_i} \beta_{ij}\, t_{ij}(\bar{x}^*) - \sum_{j : i \in \mathcal{N}_j} \beta_{ji}\, t_{ji}(\bar{x}^*), \quad \forall i \quad (4)$$

and the second (thresholding) step in (3) is the solution to the following constrained minimization problem

$$\min_{\hat{x}_i \in [l,u]} \frac{1}{2} \sum_i (\bar{x}_i^* - \hat{x}_i)^2. \quad (5)$$

By applying the KKT conditions to (5), we find that the solution satisfies

$$(\hat{x}_i - \bar{x}_i^*) - \hat{\lambda}_i^1 + \hat{\lambda}_i^2 = 0, \quad \forall i \quad (6a)$$
$$\hat{x}_i \le u, \quad \hat{\lambda}_i^2 \ge 0, \quad (\hat{x}_i - u)\hat{\lambda}_i^2 = 0, \quad \forall i \quad (6b)$$
$$\hat{x}_i \ge l, \quad \hat{\lambda}_i^1 \ge 0, \quad (\hat{x}_i - l)\hat{\lambda}_i^1 = 0, \quad \forall i \quad (6c)$$

for some $\{\hat{\lambda}_i^1\}, \{\hat{\lambda}_i^2\}$. We show that $\{\hat{x}_i\}, \{w_i \hat{\lambda}_i^1\}, \{w_i \hat{\lambda}_i^2\}$ is a triple that satisfies the KKT conditions (2). Hence, $\hat{x} = x^*$, the unique minimizer of Θ1(1).

Given the relation in (6) between $\bar{x}^*$ and $\hat{x}$, we must have

$$t_{ij}(\bar{x}^*) \subseteq t_{ij}(\hat{x}). \quad (7)$$

This is because the thresholding from $\bar{x}^*$ to $\hat{x}$ is order-preserving when all pixels are subject to the same constraint $[l, u]$: if $\bar{x}_i^* \le \bar{x}_j^*$, then $\hat{x}_i \le \hat{x}_j$. More specifically,

  • If $\hat{x}_i \ne \hat{x}_j$, then $t_{ij}(\hat{x}) = \operatorname{sign}(\hat{x}_i - \hat{x}_j) = t_{ij}(\bar{x}^*)$;

  • If $\hat{x}_i = \hat{x}_j$, then $t_{ij}(\bar{x}^*) \subseteq t_{ij}(\hat{x}) = [-1, 1]$.

If we multiply (6a) by wi and add it to (4), then

$$0 \in w_i(\hat{x}_i - y_i) + \sum_{j \in \mathcal{N}_i} \beta_{ij}\, t_{ij}(\bar{x}^*) - \sum_{j : i \in \mathcal{N}_j} \beta_{ji}\, t_{ji}(\bar{x}^*) - (w_i \hat{\lambda}_i^1) + (w_i \hat{\lambda}_i^2) \quad (8)$$

which is almost (2a), except that in (8) the subdifferentials $t_{ij}$ are evaluated at $\bar{x}^*$. The subset relation in (7) ensures that we can replace $t_{ij}(\bar{x}^*)$ by $t_{ij}(\hat{x})$ in (8) and the inclusion still holds. Our claim is then established, as the triple $\{\hat{x}_i\}, \{w_i \hat{\lambda}_i^1\}, \{w_i \hat{\lambda}_i^2\}$ obviously satisfies (2b) and (2c).

3. Denoising with the fused lasso penalty subject to uniform intervals

We define the following minimization problem

$$\min_{\{x_i\}} \left\{ \Theta_2(\gamma, \eta) = \sum_{i=1}^{N} \frac{w_i}{2}(x_i - y_i)^2 + \gamma \sum_i \iota_{[l,u]}(x_i) + \eta \sum_i |x_i| + \sum_i \sum_{j \in \mathcal{N}_i} \beta_{ij}\, |x_i - x_j| \right\}. \quad (9)$$

The $\ell_1$ penalty encourages sparse coefficients in the unknown $x$, with hyperparameter η ≥ 0. When $w_i = w$ for all $i$, Θ2(0, η) is the objective function of the fused lasso signal approximator (FLSA) [22]. When η = 0, we have Θ2(γ, 0) ≡ Θ1(γ); thus (9) includes (1) as a special case. Again, since the objective function is strongly convex, there is a unique global minimizer of Θ2(γ, η) for each γ ∈ {0, 1} and all η ≥ 0.

Applying the KKT conditions for convex nonsmooth problems [21, Theorem 3.34], the solution to (9) satisfies

$$0 \in w_i(x_i^* - y_i) + \eta\, s(x_i^*) + \sum_{j \in \mathcal{N}_i} \beta_{ij}\, t_{ij}(x^*) - \sum_{j : i \in \mathcal{N}_j} \beta_{ji}\, t_{ji}(x^*) - \lambda_i^{1*} + \lambda_i^{2*}, \quad \forall i \quad (10a)$$
$$x_i^* \le u, \quad \lambda_i^{2*} \ge 0, \quad \lambda_i^{2*}(x_i^* - u) = 0, \quad \forall i \quad (10b)$$
$$x_i^* \ge l, \quad \lambda_i^{1*} \ge 0, \quad \lambda_i^{1*}(x_i^* - l) = 0, \quad \forall i \quad (10c)$$

Comparing (10) with (2), the only addition is the term related to η, where s(xi) is the subdifferential of |xi|, namely s(xi) = sign(xi) if xi ≠ 0, and s(xi) = [−1, 1] otherwise.

On the other hand, the solution $\bar{x}^*$ to the unconstrained counterpart satisfies the first order optimality condition:

$$0 \in w_i(\bar{x}_i^* - y_i) + \eta\, s(\bar{x}_i^*) + \sum_{j \in \mathcal{N}_i} \beta_{ij}\, t_{ij}(\bar{x}^*) - \sum_{j : i \in \mathcal{N}_j} \beta_{ji}\, t_{ji}(\bar{x}^*), \quad \forall i \quad (11)$$

The second (thresholding) step, which produces the output $\hat{x}_i$ from $\bar{x}_i^*$ in (11), is the same as in (6) and will not be repeated. We claim that $\{\hat{x}_i\}, \{w_i \hat{\lambda}_i^1\}, \{w_i \hat{\lambda}_i^2\}$ is a triple that satisfies the KKT conditions (10). Hence, we must have $\hat{x} = x^*$, the unique minimizer of Θ2(1, η). To prove this, we multiply both sides of (6a) by $w_i$ and add it to (11) to obtain

$$0 \in w_i(\hat{x}_i - y_i) + \eta\, s(\bar{x}_i^*) + \sum_{j \in \mathcal{N}_i} \beta_{ij}\, t_{ij}(\bar{x}^*) - \sum_{j : i \in \mathcal{N}_j} \beta_{ji}\, t_{ji}(\bar{x}^*) - (w_i \hat{\lambda}_i^1) + (w_i \hat{\lambda}_i^2) \quad (12)$$

which is almost (10a), except that the subdifferentials are evaluated at $\bar{x}_i^*$ and not $\hat{x}_i$. As discussed in Sec. 2, since all pixels are subject to the same constraint $[l, u]$ and the thresholding step is order-preserving, we can replace $t_{ij}(\bar{x}^*)$ by $t_{ij}(\hat{x})$ without affecting the validity of (12). If in addition the interval $[l, u]$ contains zero, i.e., $0 \in [l, u]$, then the thresholding step (3) is sign-preserving as well. Therefore we have $s(\bar{x}_i^*) \subseteq s(\hat{x}_i)$ for all $i$. Indeed, if $\hat{x}_i > 0$ then $\bar{x}_i^* > 0$ and thus $s(\hat{x}_i) = s(\bar{x}_i^*)$; if $\hat{x}_i = 0$, then $\bar{x}_i^*$ can be nonzero if $l = 0$ or $u = 0$, but we always have $s(\bar{x}_i^*) \subseteq s(\hat{x}_i) = [-1, 1]$; if $\hat{x}_i < 0$, then $\bar{x}_i^* < 0$ and thus $s(\hat{x}_i) = s(\bar{x}_i^*)$. In summary, replacing $\bar{x}^*$ by $\hat{x}$ maintains the correctness of (12). Hence our claim is established, as the triple $\{\hat{x}_i\}, \{w_i \hat{\lambda}_i^1\}, \{w_i \hat{\lambda}_i^2\}$ obviously satisfies (10b) and (10c).

It is interesting to point out that the FLSA itself, i.e., the problem of minimizing Θ2(0, η) for η > 0, admits a two-step solution when all weights wi = w are equal. First, the solution to the problem of minimizing Θ2(0, 0) is obtained, then pixelwise soft-thresholding can be applied to the first step solution to obtain the solution to Θ2(0, η) [23]. Combining this observation with our result, the solution to minimizing Θ2(1, η) can be obtained in a sequential manner. Starting from the solution to Θ2(0, 0), we apply pixelwise soft-thresholding (to minimize Θ2(0, η)) and then another thresholding to satisfy the interval constraints, i.e., to solve Θ2(1, η).
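The fully sequential recipe above can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's code: it presumes that a solution `x_tv` of Θ2(0, 0) has already been computed by some unconstrained TV solver, that all data weights equal a common w, and that 0 ∈ [l, u].

```python
import numpy as np

def soft_threshold(x, tau):
    """Pixelwise soft-thresholding with threshold tau: the step that
    maps the solution of Theta2(0, 0) to that of Theta2(0, eta) [23]."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def constrained_flsa(x_tv, eta, w, l, u):
    """Sequential solution of Theta2(1, eta), assuming a uniform data
    weight w and 0 in [l, u]: soft-threshold with eta/w (the prox of
    eta*|x| under the weight-w quadratic), then clip to [l, u]."""
    x_flsa = soft_threshold(np.asarray(x_tv, dtype=float), eta / w)
    return np.minimum(np.maximum(x_flsa, l), u)
```

For example, starting from a hypothetical TV solution {−0.5, 0, 0.5} with η = 0.2, w = 1, and the non-negativity constraint, soft-thresholding gives {−0.3, 0, 0.3} and clipping gives {0, 0, 0.3}.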

4. Discussion

Throughout the text, we have assumed that a solver for the unconstrained TV-penalized problem is available. Our contribution shows that thresholding the output of this solver yields the solution to the problem with uniform interval constraints. Any solver for the nonsmooth TV problem can be used. When the problem has additional structure, e.g., if the unknowns xi form a chain [14, 24], a tree, or an acyclic graph [25], a direct (noniterative) solution to the unconstrained problem can be obtained; applying our result then produces a direct solution to the constrained counterpart.

Some problem instances in sparse reconstruction and machine learning are known to admit a sequential solution, see e.g., [26–28]. These solutions can be addressed in a general setting using proximal decomposition [29]. We did not take the approach of proximal decomposition as that would limit us to uniform data weighting wi = w > 0. Our sequential characterization of the solution to (1) and (9) works under general nonuniform data weighting.

5. Conclusions

We have discussed constrained image denoising problems that involve the anisotropic total variation (TV) penalty. Two specific problems were considered: (i) the constrained TV denoising problem, and (ii) the constrained fused lasso problem. When all unknowns are subject to the same interval constraint, and when the interval contains 0 for problem (ii), we have shown that the solution to the constrained problem is the thresholded version of the solution to the unconstrained problem. Our work thus provides justification for this approach, which is often adopted in an ad-hoc manner. However, the sequential approach is in general not correct for non-uniform interval constraints, nor for smoothed TV using the Huber potential (see the counter-examples in the Appendix). Whether the sequential approach works with the isotropic TV remains an open question. We also do not know whether the sequential solution is valid for a generic image reconstruction problem. However, an image reconstruction algorithm often contains image denoising in an inner loop; within such an inner loop, our sequential solution is applicable to solve the constrained image denoising problem.

Acknowledgments

The work of F. Noo was supported by Siemens Medical Solutions, USA; the concepts presented in this paper are based on research and are not commercially available.

Appendix

Examples

We provide two 1-D counter-examples showing that the two-step approach is incorrect for (i) non-uniform interval constraints and (ii) a smooth TV using the Huber function. We consider the following objective function,

$$\sum_{i=1}^{N} \frac{1}{2} w_i (x_i - y_i)^2 + \beta \sum_{i=1}^{N-1} \phi(x_{i+1} - x_i), \quad (13)$$

where ϕ(·) is a generic penalty function on neighboring differences. The following parameters are common to both examples.

  • Number of data points, N = 3;

  • Input y = {y1, y2, y3} = {−1, 0, 1};

  • Data weight w1 = w2 = w3 = 1;

  • Penalty weight β = 0.5;

Example (i)

Here ϕ(s) = |s|, so (13) is a special case of (1) in Sec. 2. The unknowns x1, x2, x3 are subject to the nonuniform constraints x1 ≥ 0.5, x2 ≥ 0, and x3 ≥ 0.5; that is, we let the upper bound u = ∞ in this example. The unconstrained solution is $\bar{x}^* = \{-0.5, 0, 0.5\}$, as can be verified by checking the first order optimality condition. To satisfy the nonuniform constraints, we need to threshold x1, obtaining $\hat{x} = \{0.5, 0, 0.5\}$, which has an objective function value of 1.75. However, this is not optimal, as another feasible candidate {0.5, 0.5, 0.5} achieves a lower objective value of 1.375. Thus, the two-step solution is incorrect when the interval constraints are not the same for all unknowns.
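The numbers above can be checked directly; this short script (an illustration only) evaluates (13) with ϕ(s) = |s| at the thresholded point and at the better feasible candidate.

```python
import numpy as np

def obj_abs(x, y, w, beta):
    """Objective (13) with phi(s) = |s| on the 1-D chain."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return 0.5 * np.sum(w * (x - y) ** 2) + beta * np.sum(np.abs(np.diff(x)))

y, w, beta = np.array([-1.0, 0.0, 1.0]), 1.0, 0.5
two_step = obj_abs([0.5, 0.0, 0.5], y, w, beta)  # thresholded solution -> 1.75
better = obj_abs([0.5, 0.5, 0.5], y, w, beta)    # feasible competitor -> 1.375
```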

Example (ii)

We replace the TV penalty by the Huber function, given by ϕ(s) = inf_t {|t| + (s − t)²/(2ε)}, where ε > 0 is a nuisance parameter. As ε ↓ 0, ϕ(s) ↑ |s|. Equivalently, ϕ(s) = s²/(2ε) if |s| ≤ ε, and ϕ(s) = |s| − ε/2 otherwise. We set ε = 0.1 in this example. The unknown x = {x1, x2, x3} is subject to the non-negativity constraint, i.e., xi ≥ 0 for i = 1, 2, 3. It can be verified that the unconstrained solution is $\bar{x}^* = \{-0.5, 0, 0.5\}$, which is then thresholded to give $\hat{x} = \{0, 0, 0.5\}$; the objective function value at $\hat{x}$ is 0.85. This solution is not optimal, as another feasible candidate {0, 1/12, 0.5} achieves a lower objective value of ≈ 0.83. Thus the two-step solution is not correct for smoothed TV in general, even with uniform interval constraints.
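Again the claimed values can be reproduced numerically; the snippet below (an illustration only) evaluates (13) with the Huber ϕ at ε = 0.1.

```python
import numpy as np

def huber(s, eps):
    """Huber function: s^2/(2*eps) for |s| <= eps, |s| - eps/2 otherwise."""
    a = np.abs(s)
    return np.where(a <= eps, a ** 2 / (2.0 * eps), a - eps / 2.0)

def obj_huber(x, y, beta, eps):
    """Objective (13) with phi = Huber and unit data weights."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return 0.5 * np.sum((x - y) ** 2) + beta * np.sum(huber(np.diff(x), eps))

y, beta, eps = np.array([-1.0, 0.0, 1.0]), 0.5, 0.1
two_step = obj_huber([0.0, 0.0, 0.5], y, beta, eps)       # -> 0.85
better = obj_huber([0.0, 1.0 / 12.0, 0.5], y, beta, eps)  # -> ~0.829
```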

Footnotes

Note that xi is not necessarily a 1-D signal; it can be a multi-dimensional signal, represented using lexicographical ordering of the entries.

References

  • 1. Nesterov Y. Smooth minimization of non-smooth functions. Mathematical Programming. 2005;103(1):127–152.
  • 2. Chambolle A, Pock T. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision. 2011;40(1):120–145.
  • 3. Goldstein T, Osher S. The split Bregman method for L1-regularized problems. SIAM Journal on Imaging Sciences. 2009;2(2):323–343.
  • 4. Chambolle A. An algorithm for total variation minimization and applications. Journal of Mathematical Imaging and Vision. 2004;20(1):89–97.
  • 5. Tai X-C, Wu C. Augmented Lagrangian method, dual methods and split Bregman iteration for ROF model. Berlin, Heidelberg: Springer; 2009. pp. 502–513.
  • 6. Kunisch K, Hintermüller M. Total bounded variation regularization as a bilaterally constrained optimization problem. SIAM Journal on Applied Mathematics. 2004;64(4):1311–1333.
  • 7. Schafer RW, Mersereau RM, Richards MA. Constrained iterative restoration algorithms. Proceedings of the IEEE. 1981 Apr;69:432–450.
  • 8. Persson M, Bone D, Elmqvist H. Total variation norm for three-dimensional iterative reconstruction in limited view angle tomography. Physics in Medicine and Biology. 2001;46(3):853. doi: 10.1088/0031-9155/46/3/318.
  • 9. Dey N, Blanc-Féraud L, Zimmer C, Kam Z, Olivo-Marin JC, Zerubia J. A deconvolution method for confocal microscopy with total variation regularization. 2004 2nd IEEE International Symposium on Biomedical Imaging: Nano to Macro. 2004 Apr;2:1223–1226.
  • 10. Krishnan D, Lin P, Yip AM. A primal-dual active-set method for non-negativity constrained total variation deblurring problems. IEEE Transactions on Image Processing. 2007 Nov;16:2766–2777. doi: 10.1109/tip.2007.908079.
  • 11. Vogel C. Computational Methods for Inverse Problems. Society for Industrial and Applied Mathematics; 2002.
  • 12. Chen X, Ng MK, Zhang C. Non-Lipschitz ℓp-regularization and box constrained model for image restoration. IEEE Transactions on Image Processing. 2012 Dec;21:4709–4721. doi: 10.1109/TIP.2012.2214051.
  • 13. Hansis E, Schäfer D, Dössel O, Grass M. Evaluation of iterative sparse object reconstruction from few projections for 3-D rotational coronary angiography. IEEE Transactions on Medical Imaging. 2008 Nov;27:1548–1555. doi: 10.1109/TMI.2008.2006514.
  • 14. Storath M, Brandt C, Hofmann M, Knopp T, Salamon J, Weber A, Weinmann A. Edge preserving and noise reducing reconstruction for magnetic particle imaging. IEEE Transactions on Medical Imaging. 2017 Jan;36:74–85. doi: 10.1109/TMI.2016.2593954.
  • 15. Byrne C. Iterative algorithms for deconvolution and deblurring with constraints. Inverse Problems. 1998;14:1455–1467.
  • 16. Narayanan MV, Byrne CL, King MA. An interior point iterative maximum-likelihood reconstruction algorithm incorporating upper and lower bounds with application to SPECT transmission imaging. IEEE Transactions on Medical Imaging. 2001;20(4):342–353. doi: 10.1109/42.921483.
  • 17. Chartrand R, Wohlberg B. Total-variation regularization with bound constraints. 2010 IEEE International Conference on Acoustics, Speech and Signal Processing; 2010. pp. 766–769.
  • 18. Vogel C. Solution of linear systems arising in nonlinear image deblurring. Scientific Computing: Proceedings of the Workshop. 1998:148–158.
  • 19. Fu H, Ng MK, Nikolova M, Barlow JL. Efficient minimization methods of mixed ℓ2-ℓ1 and ℓ1-ℓ1 norms for image restoration. SIAM Journal on Scientific Computing. 2006;27(6):1881–1902.
  • 20. Beck A, Teboulle M. Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Transactions on Image Processing. 2009 Nov;18:2419–2434. doi: 10.1109/TIP.2009.2028250.
  • 21. Ruszczyński AP. Nonlinear Optimization. Princeton University Press; 2006.
  • 22. Tibshirani R, Saunders M, Rosset S, Zhu J, Knight K. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2005;67(1):91–108.
  • 23. Friedman J, Hastie T, Höfling H, Tibshirani R. Pathwise coordinate optimization. Annals of Applied Statistics. 2007;1(2):302–332.
  • 24. Condat L. A direct algorithm for 1-D total variation denoising. IEEE Signal Processing Letters. 2013 Nov;20:1054–1057.
  • 25. Loeliger HA. An introduction to factor graphs. IEEE Signal Processing Magazine. 2004 Jan;21:28–41.
  • 26. Jenatton R, Mairal J, Obozinski G, Bach F. Proximal methods for hierarchical sparse coding. Journal of Machine Learning Research. 2011;12:2297–2334.
  • 27. Chartrand R, Wohlberg B. A nonconvex ADMM algorithm for group sparsity with sparse groups. 2013 IEEE International Conference on Acoustics, Speech and Signal Processing; 2013. pp. 6009–6013.
  • 28. Beygi S, Mitra U, Ström EG. Nested sparse approximation: structured estimation of V2V channels using geometry-based stochastic channel model. IEEE Transactions on Signal Processing. 2015 Sep;63:4940–4955.
  • 29. Yu Y-L. On decomposing the proximal map. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, editors. Advances in Neural Information Processing Systems 26. Curran Associates, Inc.; 2013. pp. 91–99.
