Enhancing sparse representation of color images by cross channel transformation

Laura Rebollo-Neira; Aurelien Inacio

doi:10.1371/journal.pone.0279917

2023 Jan 26;18(1):e0279917. doi: 10.1371/journal.pone.0279917

Editor: Eugene Demidenko³

PMCID: PMC9879438 PMID: 36701348

Abstract

Transformations for enhancing sparsity in the approximation of color images by 2D atomic decomposition are discussed. The sparsity is firstly considered with respect to the most significant coefficients in the wavelet decomposition of the color image. The discrete cosine transform is singled out as an effective 3 point transformation for this purpose. The enhanced feature is further exploited by approximating the transformed arrays using an effective greedy strategy with a separable highly redundant dictionary. The relevance of the achieved sparsity is illustrated by a simple encoding procedure. On typical test images the compression at high quality recovery is shown to significantly improve upon JPEG and WebP formats.

1 Introduction

In the signal processing field sparse representation usually refers to the approximation of a signal as a superposition of elementary components, called atoms, which are members of a large redundant set, called a dictionary [1]. The superposition, termed atomic decomposition, aims at approximating the signal involving as few atoms as possible [1–4]. Sparsity is also relevant to data collection. Within the emerging theory of sampling known as compressive sensing (CS) this property is key for reconstructing signals from a reduced number of measurements [5–7]. In particular, distributed compressive sensing (DCS) algorithms exploit inter signal correlation structure for multiple signal recovery [8].

Sparse image representation using redundant dictionaries has been considered in numerous works e.g. [9–11] and in the context of applications such as image restoration [12–14], feature extraction [15–18] and super resolution [19–22].

The sparsity property of some types of 3D images benefits from 3D processing [23–26]. In particular most RGB (Red-Green-Blue) images admit a sparser atomic decomposition if approximated by 3D atoms [27]. Within the 3D framework, though, the gain in sparsity comes at the expense of computational cost.

The purpose of this work is to show that the application of a transformation across the direction of the colors improves sparsity in the wavelet domain representation of the 2D color channels. The relevance of this feature is demonstrated by a simple encoding procedure rendering good compression results in comparison to commonly used image compression standards.

The work is organized as follows: Sec. 2 introduces the mathematical notation. Sec. 3 compares several cross color transformations enhancing sparsity. A numerical example on a large data set is used to illustrate the suitability of the dct for that purpose. Sec. 4 demonstrates the gain in sparsity obtained by the atomic decomposition of color images when the dct is applied across the RGB channels. Sec. 5 illustrates the relevance of the approach to image compression with high quality recovery. The conclusions are summarized in Sec. 6.

2 Mathematical notation

Throughout the paper we use the following notational convention. $R$ represents the set of real numbers. Boldface letters indicate Euclidean vectors, 2D and 3D arrays. Standard mathematical fonts indicate components, e.g., $d \in R^{N}$ is a vector of components $d (i) \in R, i = 1, \dots, N$ . The elements of a 2D array $I \in R^{L_{x} \times L_{y}}$ are indicated as I(i, j), i = 1, …, L_x, j = 1, …, L_y and the color channels of $I \in R^{L_{x} \times L_{y} \times 3}$ as I(:, :, z), z = 1, 2, 3. The transpose of a matrix, G say, is indicated as G^⊤. A set of, say M, color images is represented by the arrays $I {m} \in R^{L_{x} \times L_{y} \times 3}, m = 1, \dots, M .$

The inner product between arrays in $R^{L_{x} \times L_{y}}$ is given by the Frobenius inner product 〈⋅, ⋅〉_F as

{⟨ G, I ⟩}_{F} = \sum_{i = 1}^{L_{x}} \sum_{j = 1}^{L_{y}} G (i, j) I (i, j) .

Consequently, the Frobenius norm ‖⋅‖_F is calculated as

{∥ G ∥}_{F} = \sqrt{\sum_{i = 1}^{L_{x}} \sum_{j = 1}^{L_{y}} G {(i, j)}^{2}} .

The inner product between arrays in $R^{N}$ is given by the Euclidean inner product 〈⋅, ⋅〉 as $〈 d, g 〉 = \sum_{i = 1}^{N} d (i) g (i) .$

3 Cross color transformations

Given a color image I(i, j, z), i = 1, …, L_x, j = 1, …, L_y, z = 1, 2, 3 the processing of the 3 channels can be realized either in the pixel/intensity or in the wavelet domain. Since the representation of most images is sparser in the wavelet domain [27–30] we approximate in that domain and reconstruct the approximated image by the inverse wavelet transform. Thus, using a 3 × 3 matrix T, we construct the transformed arrays $U \in R^{L_{x} \times L_{y} \times 3}$ and $W \in R^{L_{x} \times L_{y} \times 3}$ as follows

\begin{matrix} U (:, :, z) & = & \sum_{l = 1}^{3} I (:, :, l) T (l, z), z = 1, 2, 3 . \end{matrix}

(1)

\begin{matrix} W (:, :, z) & = & dwt (U (:, :, z)), z = 1, 2, 3, \end{matrix}

(2)

where dwt indicates the 2D wavelet transform. For the transformation T we consider the following cases

(i) The dct.
(ii) The reversible YCbCr color space transform.
(iii) The principal components (PC) transform.
(iv) A transformation learned from an independent set of images.

The dct is given by the matrix

(\begin{matrix} \frac{1}{\sqrt{3}} & \frac{\sqrt{2}}{\sqrt{3}} cos (\frac{π}{6}) & \frac{\sqrt{2}}{\sqrt{3}} cos (\frac{π}{3}) \\ \frac{1}{\sqrt{3}} & 0 & \frac{\sqrt{2}}{\sqrt{3}} cos (π) \\ \frac{1}{\sqrt{3}} & \frac{\sqrt{2}}{\sqrt{3}} cos (\frac{5 π}{6}) & \frac{\sqrt{2}}{\sqrt{3}} cos (\frac{5 π}{3}) \end{matrix}),

whereas the YCbCr color space transform is given by the matrix below [31]

(\begin{matrix} 0.299 & - 0.169 & 0.5 \\ 0.587 & - 0.331 & - 0.419 \\ 0.114 & 0.5 & - 0.0813 \end{matrix}) .

The columns of the principal components transform are the normalized eigenvectors of the covariance matrix of the RGB pixels. Thus, the transformation is image dependent and optimal with respect to decorrelating the color channels.
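As an illustration of cases (i) and (iii), the following Python sketch builds the 3 point dct matrix entry by entry, applies it across the channels as in (1), and computes a PC matrix from the pixel covariance. It is a minimal sketch, not the authors' implementation: the function names and the random toy image are assumptions introduced here.

```python
import numpy as np

def dct3_matrix():
    """3-point orthonormal dct matrix T, laid out as in the text:
    row l indexes the input channel, column z the transformed channel."""
    T = np.zeros((3, 3))
    for l in range(3):
        for z in range(3):
            scale = np.sqrt(1.0 / 3.0) if z == 0 else np.sqrt(2.0 / 3.0)
            T[l, z] = scale * np.cos(np.pi * (2 * l + 1) * z / 6.0)
    return T

def cross_channel(image, T):
    """Eq. (1): U(:, :, z) = sum_l I(:, :, l) T(l, z)."""
    return image @ T  # matmul acts on the last (channel) axis

def pc_matrix(image):
    """Columns are the normalized eigenvectors of the 3 x 3 covariance
    matrix of the RGB pixels (case (iii))."""
    pixels = image.reshape(-1, 3)
    cov = np.cov(pixels, rowvar=False)
    _, vecs = np.linalg.eigh(cov)
    return vecs

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8, 3)).astype(float)  # toy image (assumption)
T = dct3_matrix()
U = cross_channel(image, T)
recovered = cross_channel(U, T.T)  # T is orthonormal, so its inverse is T^T
P = pc_matrix(image)
```

Because the dct matrix is orthonormal, the inverse across channels is simply the transpose, which is why `recovered` reproduces `image` up to rounding.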

As a first test, the approximation of the transformed channels is realized by keeping a fixed number of the largest absolute value entries, and setting the others equal to zero. In relation to this, for an image of size L_x × L_y × 3 we define the Sparsity Ratio (SR) as follows

\begin{matrix} SR = \frac{L_{x} \cdot L_{y} \cdot 3}{Number of nonzero entries in the three channels} . \end{matrix}

(3)

The quality of a reconstructed image $\tilde{I}$ , with respect to the original 8-bit image I, is compared using the Peak Signal-to-Noise Ratio (PSNR)

PSNR = 10 {log}_{10} (\frac{255^{2}}{MSE}), with MSE = \frac{1}{L_{x} \cdot L_{y} \cdot 3} \sum_{i, j, z = 1}^{L_{x}, L_{y}, 3} {(I (i, j, z) - \tilde{I} (i, j, z))}^{2} .
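The two quantities just defined can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names are hypothetical, and the transformed array is taken to have the same number of entries as the image, so that SR equals the array size divided by the number of retained entries.

```python
import numpy as np

def keep_largest(W, sr):
    """Zero all but the K largest-magnitude entries of W, with K chosen
    so that SR = W.size / K, as in Eq. (3)."""
    K = max(1, int(round(W.size / sr)))
    flat = W.ravel()
    keep = np.argsort(np.abs(flat))[-K:]  # indices of the K largest |entries|
    out = np.zeros_like(flat)
    out[keep] = flat[keep]
    return out.reshape(W.shape)

def psnr(original, approx):
    """PSNR in dB for 8-bit images (peak value 255)."""
    diff = np.asarray(original, dtype=float) - np.asarray(approx, dtype=float)
    mse = np.mean(diff ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)

rng = np.random.default_rng(1)
W = rng.normal(size=(16, 16, 3))          # stand-in for a transformed array
W_sparse = keep_largest(W, sr=10)
sparsity_ratio = W.size / np.count_nonzero(W_sparse)
```

Since K must be an integer, the achieved SR can differ slightly from the requested one (here 768 entries and K = 77).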

For the numerical examples below the transformation corresponding to case (iv) is learned from a set of images I{m}, m = 1, …, M all of the same size. Starting from an invertible 3 × 3 matrix T^k, with k = 1, the learning algorithm proceeds through the following instructions.

1) Use T^k to transform the arrays I{m}→U^k{m}→W^k{m} as in (1) and (2).
2) Approximate each transformed array W^k{m} to obtain ${\tilde{W}}^{k} {m}$ by keeping the largest K absolute value entries.
3) Apply the inverse 2D wavelet transform idwt to reconstruct the approximated arrays ${\tilde{U}}^{k} {m}, m = 1, \dots, M$ as
$\begin{matrix} {\tilde{U}}^{k} {m} (:, :, z) & = & idwt ({\tilde{W}}^{k} {m} (:, :, z)), z = 1, 2, 3 . \end{matrix}$
4) Use the original images I{m}, m = 1, …, M to find G = T⁻¹ by least squares fitting, i.e.
$G^{✱} = \underset{\begin{matrix} G (z, l) \\ z, l = 1, 2, 3 \end{matrix}}{arg min} E (G),$
where
$E (G) = \sum_{m = 1}^{M} \sum_{i, j, l = 1}^{L_{x}, L_{y}, 3} {(I {m} (i, j, l) - \tilde{I} {m} (i, j, l))}^{2}$
with
$\tilde{I} {m} (i, j, l) = \sum_{z = 1}^{3} {\tilde{U}}^{k} {m} (i, j, z) G (z, l) .$
5) While $E$ decreases and the maximum number of allowed iterations has not been reached, set k → k + 1, T^k = (G*)⁻¹ and repeat steps 1)–5).

Given the arrays ${\tilde{U}}^{k} {m}, m = 1, \dots, M$ the least squares problem for determining the transformation T^k has a unique solution. However, the joint optimization with respect to the arrays ${\tilde{U}}^{k} {m}, m = 1, \dots, M$ and the transformation T^k is not convex. Hence, the algorithm’s outcome depends on the initial value.
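The least squares step that determines G can be sketched as follows in Python. The sketch covers only that step: the wavelet transform and the thresholding are left out, the approximated arrays are replaced by random stand-ins, and the function name is an assumption made for illustration.

```python
import numpy as np

def fit_inverse_transform(U_approx_list, image_list):
    """Least squares fit of the 3 x 3 matrix G minimizing
    sum_m || I{m} - U~{m} G ||_F^2, with the pixels stacked as rows."""
    A = np.vstack([U.reshape(-1, 3) for U in U_approx_list])
    B = np.vstack([I.reshape(-1, 3) for I in image_list])
    G, *_ = np.linalg.lstsq(A, B, rcond=None)
    return G

rng = np.random.default_rng(2)
G_true = rng.normal(size=(3, 3))                         # hypothetical ground truth
U_list = [rng.normal(size=(4, 5, 3)) for _ in range(3)]  # stand-ins for U~{m}
I_list = [U @ G_true for U in U_list]                    # images consistent with G_true
G_star = fit_inverse_transform(U_list, I_list)
T_next = np.linalg.inv(G_star)                           # T^{k+1} = (G*)^{-1}
```

With noiseless synthetic data the fit recovers `G_true` exactly, which reflects the uniqueness of the solution for fixed arrays; the non-convexity mentioned above only arises once the arrays themselves depend on the transformation.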

The transformation (iv) used in the numerical examples of Secs. 3.1 and 4.1 has been learned from M = 100 images, all of size 384 × 512 × 3, from the UCID data set [32], which contains images of buildings, places and cars. The learning curves for two random orthonormal initializations are shown in Fig 1.

It is worth mentioning that, as shown in Fig 1, learning is richer when starting comparatively distant from a local minimum (left graph in Fig 1). However, since the convergence is to a local minimum other random initializations, even if generating less learning, may produce better results (right graph in Fig 1).

The aim of the next numerical test is to demonstrate the effect on the SR (c.f. (3)) produced by the above (i)–(iv) transformations across the color channels.

3.1 Numerical test I

Using the whole Berkeley data set [33], consisting of 300 images all of size 321 × 481 × 3 we proceed with each image as in (1) and (2). The dwt corresponds to the Cohen-Daubechies-Feauveau 9/7 (CDF97) wavelet family. Fig 2 shows the transformed channels of the image in Fig 3, including the dct transformation across channels (right graph) and without T transformation (left graph). As noticeable in the figure, the effect of the dct is to re-distribute the intensity in the color channels by transferring values between channels.

For the numerical test the approximations are realized fixing SR = 20 and SR = 10. The reconstructed images are obtained for the approximated arrays $\tilde{W}$ as

\begin{matrix} \tilde{U} (:, :, z) & = & idwt (\tilde{W} (:, :, z)), z = 1, 2, 3 \end{matrix}

(4)

\begin{matrix} \tilde{I} (:, :, z) & = & \sum_{l = 1}^{3} \tilde{U} (:, :, l) T^{- 1} (l, z), z = 1, 2, 3 \end{matrix}

(5)

where T⁻¹ is the inverse of T. When no transformation T is considered the image is reconstructed directly from (4).

The 2nd and 4th columns of Table 1 show the mean value PSNR ( $\bar{PSNR}$ ), with respect to the whole data set, for SR = 20 and SR = 10 respectively, corresponding to the transformations T listed in the 1st column of the table. The 3rd and 5th columns give the standard deviations (std).

Table 1. Mean value PSNR, with respect to the 300 images in the Berkeley data set.

The approximation of each image is realized by setting the least significant entries in the arrays W{m}, m = 1, …, 300 equal to zero, in order to obtain SR = 20 for all the images (2nd column) and SR = 10 for all the images (4th column).

Transf.	$\bar{PSNR}$ (SR = 20)	std	$\bar{PSNR}$ (SR = 10)	std
(i) dct	33.9	5.0	38.9	5.1
(ii) YCbCr	33.7	5.0	38.7	5.1
(iii) PC	33.7	4.9	38.4	5.0
(iv) Learned	33.7	4.9	38.7	5.0
(v) No transf.	29.7	4.8	32.9	5.2

While Table 1 shows that all (i)–(iv) transformations render equivalent superior results, with respect to not applying a transformation (case (v)), the dct slightly exceeds the others. Case (iv) refers to the best result achieved when initializing the learning algorithm with 500 different random orthonormal transformations. When initializing the algorithm with transformations (i) and (ii) it appears that each of these transformations is close to a local minimizer of the method. This stems from the fact that such initializations do not generate significant learning.

The common feature of most of the 300 images in the data set used in this numerical example is the correlation property of the three color channels. This property is assessed by the correlation coefficients

\begin{matrix} r (z) = \frac{\sum_{i = 1}^{L_{x}} \sum_{j = 1}^{L_{y}} Γ (i, j, z) Γ (i, j, z + 1)}{σ (z) σ (z + 1)}, z = 1, 2, r (3) = \frac{\sum_{i = 1}^{L_{x}} \sum_{j = 1}^{L_{y}} Γ (i, j, 1) Γ (i, j, 3)}{σ (1) σ (3)}, \end{matrix}

where

\begin{matrix} Γ (i, j, z) = (I (i, j, z) - \bar{I (:, :, z)}), σ (z) = \sqrt{\sum_{i = 1}^{L_{x}} \sum_{j = 1}^{L_{y}} {(I (i, j, z) - \bar{I (:, :, z)})}^{2}}, \end{matrix}

and $\bar{I (:, :, z)}$ indicates the mean value of channel z.

The range of the correlation coefficient r(z), z = 1, 2, 3 for the images in the data set can be better estimated using the Fisher transform [34, 35] which is defined as follows

ζ (z) = \frac{1}{2} ln (\frac{1 + r (z)}{1 - r (z)}), z = 1, 2, 3 .

As seen in the graphs of Fig 4, the histograms of the transformed coefficients ζ(z), z = 1, 2, 3 resemble normal distributions. Hence, the confidence intervals can be well estimated in this domain. Once that is done, the intervals for the correlation coefficients are retrieved through the inverse transformation

r (z) = \frac{e^{2 ζ (z)} - 1}{e^{2 ζ (z)} + 1}, z = 1, 2, 3 .
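The procedure can be sketched in Python as below, writing the Fisher transform as ζ = atanh(r), whose inverse is the tanh expression just given. The text does not spell out how the intervals are formed in the ζ domain; the sketch assumes mean ± 1 and ± 2 sample standard deviations, which cover roughly 68% and 95% of the values under normality. The coefficients in the example are made up.

```python
import math

def fisher(r):
    """Fisher transform: zeta = 0.5 * ln((1 + r) / (1 - r)) = atanh(r)."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

def inverse_fisher(zeta):
    """Inverse transform: r = (e^{2 zeta} - 1) / (e^{2 zeta} + 1) = tanh(zeta)."""
    e = math.exp(2.0 * zeta)
    return (e - 1.0) / (e + 1.0)

def r_interval(r_values, n_std=1.0):
    """Interval mean +/- n_std * std computed in the zeta domain and
    mapped back to correlation coefficients (assumed procedure)."""
    zetas = [fisher(r) for r in r_values]
    mean = sum(zetas) / len(zetas)
    std = math.sqrt(sum((z - mean) ** 2 for z in zetas) / (len(zetas) - 1))
    return inverse_fisher(mean - n_std * std), inverse_fisher(mean + n_std * std)

r_values = [0.95, 0.90, 0.97, 0.85, 0.99]          # made-up coefficients
low68, high68 = r_interval(r_values, n_std=1.0)    # roughly 68% under normality
low95, high95 = r_interval(r_values, n_std=2.0)    # roughly 95% under normality
```

Working in the ζ domain keeps the retrieved intervals inside (−1, 1) and makes them asymmetric around the mean correlation, as is the case for the intervals reported for this data set.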

Table 2 gives the range of the correlation coefficients concerning approximately 68% and 95% of the images in the data set.

Table 2. Intervals for the correlation coefficients between R and G channels, r(1), G and B channels, r(2), and R and B channels, r(3), involving approximately 68% and 95% of the images in the data set.

Percentage	r(1)-interval	r(2)-interval	r(3)-interval
68%	(0.8821, 0.9882)	(0.8559, 0.9854)	(0.6881, 0.9854)
95%	(0.6612, 0.9964)	(0.5968, 0.9955)	(0.2617, 0.9884)

In view of the high correlation between the color channels for most images in the data set it is surprising that the PC transformation (iii), which completely decorrelates the channels, does not outperform the other transformations. On the contrary, among cases (i)–(iv) it yields the lowest mean PSNR for SR = 10 (c.f. Table 1). This feature has also been noticed in the context of bit allocation for subband color image compression [36].

4 Approximations by atomic decomposition

We have seen (c.f. Table 1) that by transformation of channels it is possible to gain quality when reducing nonzero entries in the channels. Now we discuss how to improve quality further by approximating the 2D arrays (2) by an atomic decomposition, rather than just by neglecting their less significant entries. For the approximation to be successful it is important to use an appropriate dictionary. To this end, one possibility could be to learn the dictionary from training data [37–40]. However, as demonstrated in previous works [27, 29, 30], a separable dictionary, which is easy to construct, is well suited for the purposes of achieving sparsity and delivers a fast implementation of the approach. Since we use that dictionary in the numerical examples, below we describe the method for constructing the atomic decomposition of the array W considering specifically a separable dictionary.

Firstly we concatenate the 3 planes W(:, :, z), z = 1, 2, 3 into an extended 2D array $W^{'} \in R^{3 L_{x} \times L_{y}}$ and divide this array into small blocks $W_{q}^{'}, q = 1, \dots, Q$ . Without loss of generality the blocks are assumed to be square of size N_b × N_b say, and are approximated using separable dictionaries $D^{x} = \{d_{n}^{x} \in R^{N_{b}}, ∥ d_{n}^{x} ∥_{2} = 1\}_{n = 1}^{M_{b}}$ and $D^{y} = \{d_{m}^{y} \in R^{N_{b}}, ∥ d_{m}^{y} ∥_{2} = 1\}_{m = 1}^{M_{b}}$ .

For q = 1, …, Q every element ${W^{'}}_{q} \in R^{N_{b} \times N_{b}}$ is approximated by an atomic decomposition as below:

\begin{matrix} {W^{'}}_{q}^{k_{q}} = \sum_{n = 1}^{k_{q}} c^{k_{q}, q} (n) d_{ℓ_{n}^{x, q}}^{x} {(d_{ℓ_{n}^{y, q}}^{y})}^{⊤}, \end{matrix}

(6)

where $ℓ_{n}^{y, q}$ is the index in the set {1, 2, …, M_b} corresponding to the label of the atom in the dictionary $D^{y}$ contributing to the n-th term in the approximation of the q-th block. The index $ℓ_{n}^{x, q}$ has the equivalent description. The assembling of the approximated blocks gives rise to the approximated array ${W^{'}}^{a} = {\hat{J}}_{q = 1}^{Q} {W^{'}}_{q}^{k_{q}}$ , where $\hat{J}$ represents the assembling operation, i.e. the operation that retrieves the approximation ${W^{'}}^{a} \in R^{3 L_{x} \times L_{y}}$ of the array $W^{'} \in R^{3 L_{x} \times L_{y}}$ from the approximation of the blocks in the partition. The approximated array ${W^{'}}^{a} \in R^{3 L_{x} \times L_{y}}$ is reshaped back into 3 planes, $W^{a} (:, :, z) \in R^{L_{x} \times L_{y}}, z = 1, 2, 3$ , to be converted back to the approximated RGB intensity image as in (4) and (5).
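The bookkeeping described above (concatenation of the planes, partition into blocks, assembling, and reshaping back into planes) can be sketched in Python as follows. The function names are hypothetical, the planes are stacked vertically to match the size 3L_x × L_y, and the array dimensions are assumed to be multiples of the block size, so padding is not handled.

```python
import numpy as np

def concatenate_planes(W):
    """Stack the 3 planes W(:, :, z) vertically into a (3*Lx) x Ly array."""
    return np.concatenate([W[:, :, z] for z in range(3)], axis=0)

def split_blocks(W2d, nb):
    """Partition a 2D array into nb x nb blocks, scanned row by row."""
    rows, cols = W2d.shape
    return [W2d[i:i + nb, j:j + nb].copy()
            for i in range(0, rows, nb)
            for j in range(0, cols, nb)]

def assemble_blocks(blocks, shape, nb):
    """Inverse of split_blocks: the assembling operation."""
    out = np.zeros(shape)
    it = iter(blocks)
    for i in range(0, shape[0], nb):
        for j in range(0, shape[1], nb):
            out[i:i + nb, j:j + nb] = next(it)
    return out

def planes_from_concatenated(W2d):
    """Reshape the (3*Lx) x Ly array back into an Lx x Ly x 3 array."""
    Lx = W2d.shape[0] // 3
    return np.stack([W2d[z * Lx:(z + 1) * Lx, :] for z in range(3)], axis=2)

rng = np.random.default_rng(3)
W = rng.normal(size=(16, 24, 3))             # toy transformed array
W_ext = concatenate_planes(W)                # shape (48, 24)
blocks = split_blocks(W_ext, nb=8)           # Q = 6 * 3 = 18 blocks
W_back = planes_from_concatenated(assemble_blocks(blocks, W_ext.shape, 8))
```

In an actual approximation each block would be replaced by its atomic decomposition before assembling; here the blocks are left untouched, so the round trip returns the input exactly.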

The approximation of the partition ${W^{'}}_{q} \in R^{N_{b} \times N_{b}}, q = 1, \dots, Q$ is carried out iteratively as a two step process which selects i) the atoms in the atomic decomposition (6) and ii) the sequence in which the blocks in the partition are approximated. The procedure is called Hierarchized Block Wise (HBW) implementation of greedy pursuit strategies [28, 41]. For the selection of the atoms we apply the Orthogonal Matching Pursuit (OMP) approach [42] dedicated to 2D with separable dictionaries (OMP2D) [43]. Thus, the whole algorithm is termed HBW-OMP2D [28]. The method iterates as described below.

On setting k_q = 0 and $R_{q}^{0} = {W^{'}}_{q} \in R^{N_{b} \times N_{b}}$ at iteration k_q + 1 the algorithm selects the indices $ℓ_{k_{q} + 1}^{x, q}$ and $ℓ_{k_{q} + 1}^{y, q}$ , as follows:

\begin{matrix} ℓ_{k_{q} + 1}^{x, q}, ℓ_{k_{q} + 1}^{y, q} = \underset{\begin{matrix} n = 1, \dots, M_{b} \\ m = 1, \dots, M_{b} \end{matrix}}{arg max} | {⟨ d_{n}^{x}, R_{q}^{k_{q}} d_{m}^{y} ⟩}_{F} |, q = 1, \dots, Q, \end{matrix}

(7)

where $R_{q}^{k_{q}} = {W^{'}}_{q} - {W^{'}}_{q}^{k_{q}}$ . The calculation of ${W^{'}}_{q}^{k_{q}}$ is realized in order to minimize $∥ R_{q}^{k_{q}} ∥_{F}$ , which is equivalent to finding the orthogonal projection onto the subspace spanned by the selected atoms $\{A_{n} = d_{ℓ_{n}^{x, q}}^{x} {(d_{ℓ_{n}^{y, q}}^{y})}^{⊤} \in R^{N_{b} \times N_{b}}\}_{n = 1}^{k_{q}}$ . In our implementation, the calculation of the coefficients $c^{k_{q}, q} (n)$, n = 1, …, k_q in (6) is realized as

\begin{matrix} c^{k_{q}, q} (n) = {⟨ B_{n}^{k_{q}}, {W^{'}}_{q}^{k_{q}} ⟩}_{F}, n = 1, \dots, k_{q}, \end{matrix}

(8)

where the set ${B_{n}^{k_{q}} \in R^{N_{b} \times N_{b}}}_{n = 1}^{k_{q}}$ is biorthogonal to the set ${A_{n} \in R^{N_{b} \times N_{b}}}_{n = 1}^{k_{q}}$ and needs to be upgraded and updated to account for each newly selected atom. Starting from $B_{1}^{1} = Q_{1} = A_{1} = d_{ℓ_{1}^{x}}^{x} {(d_{ℓ_{1}^{y}}^{y})}^{⊤}$ the updating and upgrading is realized through the recursive equations [43, 44]:

\begin{matrix} \begin{matrix} B_{n}^{k_{q} + 1} & = B_{n}^{k_{q}} - B_{k_{q} + 1}^{k_{q} + 1} {⟨ A_{k_{q} + 1}, B_{n}^{k_{q}} ⟩}_{F}, n = 1, \dots, k_{q} \\ B_{k_{q} + 1}^{k_{q} + 1} & = Q_{k_{q} + 1} / {∥ Q_{k_{q} + 1} ∥}_{F}^{2}, where \\ Q_{k_{q} + 1} & = A_{k_{q} + 1} - \sum_{n = 1}^{k_{q}} \frac{Q_{n}}{∥ Q_{n} ∥_{F}^{2}} {⟨ Q_{n}, A_{k_{q} + 1} ⟩}_{F}, \end{matrix} \end{matrix}

(9)

with the additional re-orthogonalization step

\begin{matrix} Q_{k_{q} + 1} \leftarrow Q_{k_{q} + 1} - \sum_{n = 1}^{k_{q}} \frac{Q_{n}}{∥ Q_{n} ∥_{F}^{2}} {⟨ Q_{n}, Q_{k_{q} + 1} ⟩}_{F} . \end{matrix}

(10)

As discussed in [28, 41], for $ℓ_{k_{q} + 1}^{x, q}$ and $ℓ_{k_{q} + 1}^{y, q}, q = 1, \dots, Q$ the indices resulting from (7), the block to be approximated in the next iteration corresponds to the value q^⋆ such that

q^{⋆} = \underset{q = 1, \dots, Q}{arg max} | ⟨ d_{ℓ_{k_{q} + 1}^{x, q}}^{x}, R_{q}^{k_{q}} d_{ℓ_{k_{q} + 1}^{y, q}}^{y} ⟩ | .

The algorithm stops when the required total number of $K = \sum_{q = 1}^{Q} k_{q}$ atoms has been selected. This number can be fixed using the SR, which is now calculated as

\begin{matrix} SR = \frac{L_{x} \cdot L_{y} \cdot 3}{K} . \end{matrix}

(11)
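For a single block, the greedy selection (7) followed by an orthogonal projection can be sketched in Python as below. This is a simplified illustration and not the HBW-OMP2D implementation: the projection is computed with a generic least squares solver instead of the recursions (9) and (10), the hierarchized ranking of blocks is omitted, and the random dictionary and the function name are assumptions.

```python
import numpy as np

def omp2d(block, Dx, Dy, k_max):
    """Greedy approximation of one block with separable atoms dx_n dy_m^T.
    Dx and Dy hold unit-norm atoms as columns."""
    residual = block.copy()
    selected = []                        # list of (n, m) index pairs
    atoms = []                           # vectorized selected atoms
    coeffs = np.zeros(0)
    for _ in range(k_max):
        # Selection step, as in Eq. (7): maximize |<dx_n, R dy_m>|.
        corr = np.abs(Dx.T @ residual @ Dy)
        n, m = (int(v) for v in np.unravel_index(np.argmax(corr), corr.shape))
        if corr[n, m] < 1e-12 or (n, m) in selected:
            break                        # nothing left to gain
        selected.append((n, m))
        atoms.append(np.outer(Dx[:, n], Dy[:, m]).ravel())
        # Orthogonal projection onto the span of the selected atoms,
        # here via least squares rather than the recursion (9)-(10).
        A = np.column_stack(atoms)
        coeffs, *_ = np.linalg.lstsq(A, block.ravel(), rcond=None)
        residual = block - (A @ coeffs).reshape(block.shape)
    return selected, coeffs, residual

rng = np.random.default_rng(4)
Dx = rng.normal(size=(4, 8))
Dx /= np.linalg.norm(Dx, axis=0)         # toy redundant dictionary (assumption)
Dy = Dx.copy()
block = 2.0 * np.outer(Dx[:, 1], Dy[:, 5]) - 3.0 * np.outer(Dx[:, 6], Dy[:, 0])
selected, coeffs, residual = omp2d(block, Dx, Dy, k_max=6)
```

After each projection the residual is orthogonal to every selected atom, which is the property that distinguishes the orthogonal variant from plain matching pursuit. A full HBW scheme would run the selection step for all blocks and update only the block with the largest inner product.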

Remark 1. The above described implementation of HBW-OMP2D is very effective in terms of speed, but demanding in terms of memory (the partial outputs corresponding to all the blocks in the partition need to be stored at every iteration). An alternative implementation, termed HBW Self Projected Matching Pursuit (HBW-SPMP) [29, 45], would enable the application of the identical strategy to much larger images than the ones considered in this work.

4.1 Numerical example II

For this and the next numerical example, we use a mixed dictionary consisting of two classes of sub-dictionaries of different nature:

I) The trigonometric dictionaries $D_{C}^{x}$ and $D_{S}^{x}$ , defined below, for i = 1, …, N_b
$D_{C}^{x} = \{w_{c} (n) cos \frac{π (2 i - 1) (n - 1)}{2 M}\}_{n = 1}^{M_{x}}, D_{S}^{x} = \{w_{s} (n) sin \frac{π (2 i - 1) n}{2 M}\}_{n = 1}^{M_{x}} .$
w_c(n) and w_s(n), n = 1, …, M_x are normalization factors.
II) The dictionary $D_{L}^{x}$ , which is constructed by translation of the prototype atoms in Fig 5.

The mixed dictionary $D^{x}$ is built as $D^{x} = D_{C}^{x} \cup D_{S}^{x} \cup D_{L}^{x}$ and $D^{y} = D^{x}$ .
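The trigonometric part of the mixed dictionary can be sketched in Python as follows. Two assumptions are made and should be checked against the actual codec: the M in the argument of the cosine and sine is taken to be M_x, and M_x is set to a hypothetical redundancy factor times N_b. The dictionary $D_{L}^{x}$ is omitted, since its prototype atoms are only given graphically.

```python
import numpy as np

def trig_dictionaries(nb, redundancy=2):
    """Cosine and sine dictionaries with unit-norm atoms as columns.
    ASSUMPTION: M equals M_x = redundancy * nb."""
    M = redundancy * nb
    i = np.arange(1, nb + 1)[:, None]    # sample index, as a column
    n = np.arange(1, M + 1)[None, :]     # atom index, as a row
    Dc = np.cos(np.pi * (2 * i - 1) * (n - 1) / (2 * M))
    Ds = np.sin(np.pi * (2 * i - 1) * n / (2 * M))
    Dc /= np.linalg.norm(Dc, axis=0)     # plays the role of w_c(n)
    Ds /= np.linalg.norm(Ds, axis=0)     # plays the role of w_s(n)
    return Dc, Ds

Dc, Ds = trig_dictionaries(nb=8, redundancy=2)
D_trig = np.hstack([Dc, Ds])             # D_C and D_S side by side (D_L omitted)
```

With redundancy 1 the cosine dictionary reduces to the orthonormal dct basis; larger factors add atoms at intermediate frequencies, which is what makes the dictionary redundant.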

Table 3 shows the improvement in $\bar{PSNR}$ achieved by atomic decompositions using the mixed dictionary for SR = 20 and SR = 10.

Table 3. Mean value PSNR, with respect to the 300 images in the Berkeley data set, produced by 2D atomic decomposition of the arrays W{m}, m = 1, …, 300 in order to obtain SR = 20 (2nd and 6th column) and SR = 10 (4th and 8th column).

Block size	8 × 8				16 × 16
Transf.	$\bar{PSNR}$ (SR = 20)	std	$\bar{PSNR}$ (SR = 10)	std	$\bar{PSNR}$ (SR = 20)	std	$\bar{PSNR}$ (SR = 10)	std
(i) dct	40.5	5.0	48.1	4.4	40.8	4.9	48.6	4.2
(ii) YCbCr	40.3	5.0	47.8	4.4	40.6	4.9	48.3	4.2
(iii) PC	39.6	4.8	46.3	4.5	39.9	4.8	46.7	4.5
(iv) Learned	40.3	4.9	47.4	4.3	40.5	4.9	47.8	4.2
(v) No transf.	34.0	5.2	39.1	5.6	34.1	5.2	39.3	5.5

Notice that case (v), which does not include any T transformation, gives results superior to those obtained by disregarding entries (c.f. Table 1). When applying any of the transformations (i)–(iv) the results improve further. The PC transform, however, appears significantly less effective than the others. In addition to rendering the best results, the dct brings along the additional advantage of being orthonormal. Consequently, it does not magnify errors at the inversion step. Because of this, we single out the dct as the most convenient cross color transformation out of the four considered here.

5 Application to image compression

In order to achieve compression by filing an atomic decomposition we need to address two issues, namely the quantization of the coefficients c_q(n), n = 1, …, k_q, q = 1, …, Q in (6) and the storage of the indices $(ℓ_{n}^{x, q}, ℓ_{n}^{y, q}), n = 1, \dots, k_{q}, q = 1, \dots, Q$ . We tackle both matters by simple but effective procedures [30].

For q = 1, …, Q the absolute value coefficients |c_q(n)|, n = 1, …, k_q are converted to integers through uniform quantization as follows

\begin{matrix} c_{q}^{Δ} (n) = \left\{ \begin{matrix} ⌈ \frac{| c_{q} (n) | - θ}{Δ} ⌉, & if | c_{q} (n) | \geq θ \\ 0, & otherwise . \end{matrix} \right. \end{matrix}

(12)

The signs of the coefficients are encoded separately as a vector, s_q, using a binary alphabet. Each pair of indices $(ℓ_{n}^{x, q}, ℓ_{n}^{y, q})$ corresponding to the atoms in the decompositions of the block $W_{q}^{'}$ is mapped into a single index o_q(n). The set o_q(1), …, o_q(k_q) is sorted in ascending order $o_{q} (n) \to {\tilde{o}}_{q} (n), n = 1, \dots, k_{q}$ to take the differences $δ_{q} (n) = {\tilde{o}}_{q} (n) - {\tilde{o}}_{q} (n - 1), n = 2, \dots, k_{q}$ and construct the string of non-negative numbers ${\tilde{o}}_{q} (1), δ_{q} (2), \dots, δ_{q} (k_{q})$ . The order of the set ${\tilde{o}}_{q} (n), n = 1, \dots, k_{q}$ induces order in the unsigned coefficients, $c_{q}^{Δ} (n) \to {\tilde{c}}_{q}^{Δ} (n)$ , and in the corresponding signs $s_{q} (n) \to {\tilde{s}}_{q} (n)$ .

For each q the number 0 is added at the end of the indices ${\tilde{o}}_{q} (n), n = 1, \dots, k_{q}$ before concatenation, to be able to separate strings corresponding to different blocks. Each sequence of strings corresponding to q = 1, …, Q is concatenated and encoded using the off-the-shelf MATLAB function Huff06 [46], which implements Huffman coding.
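The handling of the indices for one block can be sketched in Python as below. The text does not specify how a pair of indices is mapped into a single index; the row-major mapping used here, with 1-based indices, is an assumption, as are the function names. The entropy coding stage is not included.

```python
def encode_indices(pairs, mb):
    """Map each 1-based pair (lx, ly), 1 <= lx, ly <= mb, to a single index
    (assumed mapping), sort, and return the first index followed by the
    successive differences, plus the permutation that sorts the atoms."""
    single = [(lx - 1) * mb + ly for lx, ly in pairs]      # values in 1..mb*mb
    order = sorted(range(len(single)), key=lambda t: single[t])
    sorted_idx = [single[t] for t in order]
    deltas = sorted_idx[:1] + [b - a for a, b in zip(sorted_idx, sorted_idx[1:])]
    return deltas, order

def decode_indices(deltas, mb):
    """Invert encode_indices: cumulative sums, then unpack the pairs."""
    pairs, total = [], 0
    for d in deltas:
        total += d
        pairs.append(((total - 1) // mb + 1, (total - 1) % mb + 1))
    return pairs

pairs = [(3, 2), (1, 4), (2, 2)]          # made-up atom indices for one block
deltas, order = encode_indices(pairs, mb=5)
stream = deltas + [0]                     # 0 marks the end of the block
decoded = decode_indices(deltas, mb=5)
```

The permutation `order` is the one that would also be applied to the quantized coefficients and their signs. With this mapping every entry of `deltas` is at least 1 as long as no atom is repeated within a block, so the appended 0 is unambiguous as a separator.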

The compression rate is given in bits-per-pixel (bpp) which is defined as

bpp = \frac{Size of the file in bits}{Number of intensity pixels in a single channel} .

At the reconstruction stage the indices $({\tilde{ℓ}}_{n}^{x, q}, {\tilde{ℓ}}_{n}^{y, q}), n = 1 \dots k_{q}$ are recovered from the string of differences δ_q(n), n = 2, …, k_q. The signs of the coefficients are read from the binary string. The quantized unsigned coefficients are read and transformed into real numbers as:

| {\tilde{c}}_{q}^{r} (n) | = Δ \cdot {\tilde{c}}_{q}^{Δ} (n) + (θ - Δ / 2), n = 1 \dots k_{q} .
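The quantization (12) and the recovery formula above can be sketched for a single coefficient in Python as follows. The function names, the sign convention (0 for non-negative, 1 for negative) and the numerical values of Δ and θ are assumptions for illustration; how coefficients with label 0 are treated at the decoder is not covered.

```python
import math

def quantize(c, delta, theta):
    """Eq. (12): integer label of |c|, with the sign kept separately."""
    mag = abs(c)
    label = math.ceil((mag - theta) / delta) if mag >= theta else 0
    sign = 0 if c >= 0 else 1
    return label, sign

def dequantize(label, sign, delta, theta):
    """Recovery formula |c^r| = delta * label + (theta - delta / 2)."""
    mag = delta * label + (theta - delta / 2.0)
    return -mag if sign else mag

delta, theta = 0.5, 0.5                   # illustrative values only
label, sign = quantize(-2.2, delta, theta)
recovered = dequantize(label, sign, delta, theta)
```

For |c| > θ the recovered magnitude is the midpoint of the quantization interval that contains |c|, so the error is at most Δ/2.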

The codec for reproducing the examples in the next sections has been made available on [47].

5.1 Numerical example III

The relevance to image compression of the achieved sparsity by dct cross color transformation is illustrated in this section by comparison with results yielded by the compression standards JPEG, and WebP, on the 15 images in Table 4. These are typical images, used for compression tests, available in ppm or png format. The first 9 images are classic test images taken from [48]. The last six images are portions of 1024 × 1024 × 3 pixels shown in Fig 6 from very large high resolution images available on [49].

Table 4. Test images.

The last column gives the approximation times to produce the results in Table 5.

No	Image	Size	time (secs)
1	Lenna	512 × 512 × 3	1.8
2	Goldhill	576 × 720 × 3	3.8
3	Barbara	576 × 720 × 3	3.2
4	Baboon	512 × 512 × 3	2.7
5	Zelda	576 × 784 × 3	2.5
6	Sailboat	512 × 512 × 3	1.8
7	Boy	512 × 768 × 3	3.8
8	Jupiter	1072 × 1376 × 3	2.9
9	Saturn	1200 × 1488 × 3	2.6
10	Building	1024 × 1024 × 3	9.5
11	Cathedral	1024 × 1024 × 3	7.6
12	Flower	1024 × 1024 × 3	3.6
13	Spider-web	1024 × 1024 × 3	3.8
14	Bridge	1024 × 1024 × 3	8.5
15	Deer	1024 × 1024 × 3	5.8

All the results have been obtained in the MATLAB environment (version R2019a), using a machine with an Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz and 8GB of RAM. For the image approximation the HBW-OMP2D method was implemented with a C++ MEX file. All the channels and all the images were partitioned into blocks of size 16 × 16. The approximation times to produce the results in Table 5 are displayed in the last column of Table 4.

Table 5. Compression rate (bpp) corresponding to JPEG (bjp), WebP (bwb) and the proposed sparse representation (bsr), for the values of PSNR given in the 2nd column.

The corresponding values of MSSIM are given in the 3rd to 5th columns. ssjp, sswb, and sssr indicate the MSSIM produced by JPEG, WebP and the sparse representation codec respectively.

I	dB	ssjp	sswb	sssr	bjp	bwb	bsr
1	35.9	0.98	0.98	0.97	3.28	2.21	1.37
2	36.6	0.98	0.98	0.98	3.63	2.47	2.01
3	37.2	0.98	0.99	0.99	3.67	2.80	1.77
4	28.8	0.97	0.97	0.96	5.80	4.66	2.48
5	39.3	0.98	0.98	0.95	2.61	1.81	1.07
6	30.9	0.98	0.98	0.95	4.49	3.21	1.51
7	32.6	0.97	0.98	0.97	4.34	3.07	2.22
8	48.2	0.99	0.99	0.99	0.60	0.51	0.20
9	49.0	0.99	0.99	0.99	0.34	0.36	0.12
10	37.4	0.99	0.99	0.99	3.41	2.35	1.75
11	38.5	0.99	0.99	0.99	2.84	1.83	1.20
12	41.5	0.99	0.99	0.99	1.78	1.08	0.53
13	45.0	0.99	0.99	0.99	1.55	1.10	0.57
14	34.5	0.99	0.99	0.99	3.57	2.48	1.48
15	30.9	0.97	0.97	0.96	3.76	2.59	0.90

For realizing the comparison we proceed as follows: we set the required value of PSNR as that produced by JPEG at quality = 95 and tune compression with the other methods to produce the same PSNR. In our codec the tuning is realized by approximating the image up to PSNR_o = 1.025 ⋅ PSNR (where PSNR is the targeted quality) and setting the quantization parameter Δ so as to reproduce the targeted value of PSNR.

For compression with JPEG we use the MATLAB imwrite command. The compression with WebP was realized using the software for Ubuntu distributed on [50]. All the approaches were tuned for producing the same value of PSNR as JPEG for quality 95. The Mean Structural SIMilarity (MSSIM) index [51] was then calculated with the approximation corresponding to those values of PSNR.

6 Conclusions

The application of a cross color transformation for enhancing sparsity in the atomic decomposition of RGB images has been proposed. It was demonstrated that the effect of the transformation is to re-distribute the most significant values in the dwt of the 2D channels. As a result, when approximating the arrays by disregarding the less significant entries, the quality of the reconstructed image improves with respect to disabling the cross color transformation. Four transformations have been considered: (i) a 3 point dct, (ii) the reversible YCbCr color space transform, (iii) the PC transform, (iv) a transformation learned from an independent set of images.

The quality of the image approximation was improved further by approximating the transformed arrays by an atomic decomposition using a separable dictionary and the greedy pursuit strategy HBW-OMP2D. The dct was singled out as the most convenient cross color transformation for approximating RGB color images in the wavelet domain.

The approximation approach has been shown to be relevant for image compression. By means of a simple coding strategy the achieved compression for typical test images considerably improves upon the most commonly used compression standards, namely JPEG and WebP.

Supporting information

S1 Data

(TXT)

Data Availability

The data are available in the Aston University repository (DOI: 10.17036/researchdata.aston.ac.uk.00000590).

Funding Statement

The author(s) received no specific funding for this work.

References

1. Mallat S. and Zhang Z., “Matching pursuit with time-frequency dictionaries,” IEEE Trans. Signal Process., Vol (41,12) 3397–3415 (1993). doi: 10.1109/78.258082 [DOI] [Google Scholar]
2. Chen S., Donoho D. and Saunders M. “Atomic Decomposition by Basis Pursuit”, SIAM Review, 129–159 (2001). doi: 10.1137/S003614450037906X [DOI] [Google Scholar]
3. Mallat S., A Wavelet Tour of Signal Processing: The Sparse Way, Academic Press; (2009). [Google Scholar]
4. Eldar Y., Kuppinger P. and Biölcskei H., “Block-Sparse Signals: Uncertainty Relations and Efficient Recovery”, IEEE Trans. Signal Process., 58, 3042–3054 (2010). doi: 10.1109/TSP.2010.2044837 [DOI] [Google Scholar]
5. Candès E. and Wakin M., “An introduction to compressive sampling”, IEEE Signal Processing Magazine, 25, 21–30 (2008). doi: 10.1109/MSP.2007.914731 [DOI] [Google Scholar]
6. Romberg J., “Imaging via compressive sampling”, IEEE Signal Processing Magazine, 25, 14–20 (2008). doi: 10.1109/MSP.2007.914729 [DOI] [Google Scholar]
7. Baraniuk R., “More Is less: Signal processing and the data deluge”, Science 331, 717–719 (2011). doi: 10.1126/science.1197448 [DOI] [PubMed] [Google Scholar]
8. Torkamani R., Zayyani H., and Sadeghzadeh R., “Model-based decentralized Bayesian algorithm for distributed compressed sensing”, Signal Processing: Image Communication, 95, (2021), 116212. [Google Scholar]
9. Wright J., Ma Yi, Mairal J., Sapiro G., Huang T.S., and Yan S., “Sparse Representation for Computer Vision and Pattern Recognition”, Proc. of the IEEE, 98, 1031–1044 (2010). doi: 10.1109/JPROC.2010.2044470 [DOI] [Google Scholar]
10. Elad M., Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing, Springer; (2010). [Google Scholar]
11. Zhang Z., Xu Y., Yang J., Li X., and Zhang D., “A survey of sparse representation: algorithms and applications”, IEEE access, (2015). [Google Scholar]
12. Mairal J., Eldar M., and Sapiro G., “Sparse Representation for Color Image Restoration”, IEEE Trans. Image Proces., 17, 53–69 (2008). doi: 10.1109/TIP.2007.911828 [DOI] [PubMed] [Google Scholar]
13. Dong W., Zhang L., Shi G., and Li X., “Nonlocally Centralized Sparse Representation for Image Restoration”, IEEE Trans. Image Process., 22, 1620–1630 (2013). doi: 10.1109/TIP.2012.2235847
14. Su Z., Zhu S., Lv X., and Wan Y., “Image restoration using structured sparse representation with a novel parametric data-adaptive transformation matrix”, Signal Processing: Image Communication, 52, 151–172 (2017).
15. Wright J., Yang A. Y., and Ganesh A., “Robust Face Recognition via Sparse Representation”, IEEE Trans. Pattern Analysis and Machine Intelligence, 31, 210–227 (2009). doi: 10.1109/TPAMI.2008.79
16. Yuan X.-T., Liu X., and Yan S., “Visual Classification With Multitask Joint Sparse Representation”, IEEE Trans. Image Process., 21, 4349–4360 (2012). doi: 10.1109/TIP.2012.2205006
17. Heinsohn D., Villalobos E., Prieto L., and Mery D., “Face recognition in low-quality images using adaptive sparse representations”, Image and Vision Computing, 85, 46–58 (2019). doi: 10.1016/j.imavis.2019.02.012
18. Raja K., Ramachandra R., and Busch C., “Collaborative representation of blur invariant deep sparse features for periocular recognition from smartphones”, Image and Vision Computing, 101, 46–58 (2020). doi: 10.1016/j.imavis.2020.103979
19. Yang J., Wright J., and Huang T., “Image Super-Resolution via Sparse Representation”, IEEE Trans. Image Process., 19, 2861–2873 (2010). doi: 10.1109/TIP.2010.2050625
20. Zhang Y., Liu J., Yang W., and Guo Z., “Image Super-Resolution Based on Structure-Modulated Sparse Representation”, IEEE Trans. Image Process., 24, 2797–2810 (2015). doi: 10.1109/TIP.2015.2431435
21. Zhang C., Liu W., Liu J., Liu C., and Shi C., “Sparse representation and adaptive mixed samples regression for single image super-resolution”, Signal Processing: Image Communication, 67, 79–89 (2018).
22. Li X., Cao G., Zhang Y., Shafique A., and Fu P., “Combining synthesis sparse with analysis sparse for single image super-resolution”, Signal Processing: Image Communication, 83, 115805 (2020).
23. Caiafa C. F. and Cichocki A., “Computing sparse representations of multidimensional signals using Kronecker bases”, Neural Computation, 25, 186–220 (2013). doi: 10.1162/NECO_a_00385
24. Cichocki A., Mandic D., De Lathauwer L., Zhou G., Zhao Q., Caiafa C., et al., “Tensor decompositions for signal processing applications: From two-way to multiway component analysis”, IEEE Signal Processing Magazine, 32, 145–163 (2015). doi: 10.1109/MSP.2013.2297439
25. Dai Q., Yoo S., and Kappeler A., “Sparse Representation-Based Multiple Frame Video Super-Resolution”, IEEE Trans. Image Process., 26, 2080–2095 (2017). doi: 10.1109/TIP.2016.2631339
26. Mousavi H. S. and Monga V., “Sparsity-based Color Image Super Resolution via Exploiting Cross Channel Constraints”, IEEE Trans. Image Process., 26, 5094–5106 (2017). doi: 10.1109/TIP.2017.2704443
27. Rebollo-Neira L. and Whitehouse D., “Sparse representation of 3D images for piecewise dimensionality reduction with high quality reconstruction”, Array, 1 (2019). doi: 10.1016/j.array.2019.100001
28. Rebollo-Neira L., Maciol R., and Bibi S., “Hierarchized block wise image approximation by greedy pursuit strategies”, IEEE Signal Process. Letters, 20, 1175–1178 (2013). doi: 10.1109/LSP.2013.2283510
29. Rebollo-Neira L., “Effective sparse representation of X-Ray medical images”, International Journal for Numerical Methods in Biomedical Engineering, 33, e2886 (2017). doi: 10.1002/cnm.2886
30. Rebollo-Neira L., “A competitive scheme for storing sparse representation of X-Ray medical images”, PLoS ONE, 13(8), e0201455 (2018). doi: 10.1371/journal.pone.0201455
31. Taubman D. S. and Marcellin M. W., JPEG2000 Image Compression, Fundamentals, Standards and Practice, Kluwer Academic Publishers (2002).
32. Schaefer G. and Stich M., “UCID: an uncompressed color image database”, Proceedings Volume 5307, Storage and Retrieval Methods and Applications for Multimedia 2004 (2003).
33. https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/ (Last access December 2022).
34. Fisher R.A., “Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population”, Biometrika, 10, 507–521 (1915). doi: 10.2307/2331838
35. Fisher R.A., “On the probable error of a coefficient of correlation deduced from a small sample”, Metron, 1, 3–32 (1921).
36. Gershikov E. and Porat M., “On color transforms and bit allocation for optimal subband image compression”, Signal Processing: Image Communication, 22, 1–18 (2007).
37. Tošić I. and Frossard P., “Dictionary Learning: What is the right representation for my signal?”, IEEE Signal Processing Magazine, 28, 27–38 (2011). doi: 10.1109/MSP.2010.939537
38. Zepeda J., Guillemot C., and Kijak E., “Image Compression Using Sparse Representations and the Iteration-Tuned and Aligned Dictionary”, IEEE Journal of Selected Topics in Signal Processing, 5, 1061–1073 (2011). doi: 10.1109/JSTSP.2011.2135332
39. Srinivas M., Naidu R. R., Sastry C.S., and Krishna Mohan C., “Content based medical image retrieval using dictionary learning”, Neurocomputing, 168, 880–895 (2015). doi: 10.1016/j.neucom.2015.05.036
40. Wen B., Ravishankar S., and Bresler Y., “Structured Overcomplete Sparsifying Transform Learning with Convergence Guarantees and Applications”, International Journal of Computer Vision, 114, 137–167 (2015). doi: 10.1007/s11263-014-0761-1
41. Rebollo-Neira L., “Cooperative greedy pursuit strategies for sparse signal representation by partitioning”, Signal Processing, 125, 365–375 (2016). doi: 10.1016/j.sigpro.2016.02.008
42. Pati Y.C., Rezaiifar R., and Krishnaprasad P.S., “Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition”, Proc. of the 27th ACSSC, 1, 40–44 (1993).
43. Rebollo-Neira L., Bowley J., Constantinides A. and Plastino A., “Self contained encrypted image folding”, Physica A, 391, 5858–5870 (2012). doi: 10.1016/j.physa.2012.06.042
44. Rebollo-Neira L. and Lowe D., “Optimized orthogonal matching pursuit approach”, IEEE Signal Process. Letters, 9, 137–140 (2002). doi: 10.1109/LSP.2002.1001652
45. Rebollo-Neira L., Rozložník M., and Sasmal P., “Analysis of the Self Projected Matching Pursuit Algorithm”, Journal of The Franklin Institute, 8980–8994 (2020). doi: 10.1016/j.jfranklin.2020.06.006
46. Skretting K., https://www.mathworks.com/matlabcentral/fileexchange/2818-huffman-coding-and-arithmetic-codin (Last access December 2022).
47. http://www.nonlinear-approx.info/examples/node015.html (Last access December 2022).
48. https://www.hlevkin.com/hlevkin/06testimages.htm (Last access December 2022).
49. https://imagecompression.info/test_images (Last access December 2022).
50. https://developers.google.com/speed/webp (Last access December 2022).
51. Wang Z., Bovik A. C., Sheikh H. R., and Simoncelli E. P., “Image quality assessment: From error visibility to structural similarity”, IEEE Trans. Image Process., 13, 600–612 (2004). doi: 10.1109/TIP.2003.819861

Associated Data

Supplementary Materials

S1 Data (TXT, 96 B)

Data Availability Statement

The data are available in the Aston University repository (DOI: 10.17036/researchdata.aston.ac.uk.00000590).
