Proc. Natl. Acad. Sci. U.S.A. 2020 May 11; 117(21): 11226–11232. doi: 10.1073/pnas.1913995117

The Noise Collector for sparse recovery in high dimensions

Miguel Moscoso a,1, Alexei Novikov b, George Papanicolaou c,1, Chrysoula Tsogka d
PMCID: PMC7260980  PMID: 32393628

Significance

The ability to detect sparse signals from noisy, high-dimensional data is a top priority in modern science and engineering. For optimal results, current approaches need to tune parameters that depend on the level of noise, which is often difficult to estimate. We develop a parameter-free, computationally efficient, $\ell_1$-norm minimization approach that has a zero false discovery rate (no false positives) with high probability for any level of noise, while it detects the exact location of sparse signals when the noise is not too large.

Keywords: high-dimensional probability, convex geometry, sparsity-promoting algorithms, noisy data

Abstract

The ability to detect sparse signals from noisy, high-dimensional data is a top priority in modern science and engineering. It is well known that a sparse solution of the linear system $A\rho=b_0$ can be found efficiently with an $\ell_1$-norm minimization approach if the data are noiseless. However, detection of the signal from data corrupted by noise is still a challenging problem as the solution depends, in general, on a regularization parameter with optimal value that is not easy to choose. We propose an efficient approach that does not require any parameter estimation. We introduce a no-phantom weight $\tau$ and the Noise Collector matrix $C$ and solve an augmented system $A\rho+C\eta=b_0+e$, where $e$ is the noise. We show that the $\ell_1$-norm minimal solution of this system has zero false discovery rate for any level of noise, with probability that tends to one as the dimension of $b_0$ increases to infinity. We obtain exact support recovery if the noise is not too large and develop a fast Noise Collector algorithm, which makes the computational cost of solving the augmented system comparable with that of the original one. We demonstrate the effectiveness of the method in applications to passive array imaging.


We want to find sparse solutions $\rho\in\mathbb{R}^K$ for

$$A\rho = b \qquad [1]$$

from highly incomplete measurement data $b=b_0+e\in\mathbb{R}^N$ corrupted by noise $e$, where $1\ll N<K$. In the noiseless case, $\rho$ can be found exactly by solving the optimization problem (1)

$$\rho^* = \arg\min_{\rho}\ \|\rho\|_{\ell_1}, \quad \text{subject to}\ A\rho = b, \qquad [2]$$

provided that the measurement matrix $A\in\mathbb{R}^{N\times K}$ satisfies additional conditions (e.g., decoherence or restricted isometry properties) (2, 3) and that the solution vector $\rho$ has a small number $M$ of nonzero components or degrees of freedom. When measurements are noisy, exact recovery is no longer possible. However, the exact support of $\rho$ can still be determined if the noise is not too strong. The most commonly used approach is to solve the $\ell_2$-relaxed form of Eq. 2:

$$\rho_{\lambda} = \arg\min_{\rho}\ \lambda\|\rho\|_{\ell_1} + \|A\rho - b\|_{\ell_2}^2, \qquad [3]$$

which is known as Lasso in the statistics literature (4). There are sufficient conditions for the support of $\rho_\lambda$ to be contained within the true support [e.g., the works of Fuchs (5), Tropp (6), Wainwright (7), and Maleki et al. (8)]. These conditions depend on the signal-to-noise ratio (SNR), which is not known and must be estimated, and on the regularization parameter $\lambda$, which must be carefully chosen and/or adaptively changed (9). Although such an adaptive procedure improves the outcome, the resulting solutions tend to include a large number of “false positives” in practice (10). Belloni et al. (11) proposed to solve the square root Lasso minimization problem instead of Eq. 3, which makes the regularization parameter $\lambda$ independent of the SNR. Our contribution is a computationally efficient method for exact support recovery with no false positives in noisy settings. It also does not require an estimate of the SNR.
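
To make the comparison concrete, here is a minimal NumPy sketch of the Lasso formulation in Eq. 3, solved with plain iterative soft thresholding. The matrix sizes, noise level, step size, and $\lambda$ below are illustrative choices, not values taken from the paper.

```python
import numpy as np

def soft_threshold(x, r):
    # Componentwise soft thresholding S_r(x) = sign(x) * max(0, |x| - r).
    return np.sign(x) * np.maximum(np.abs(x) - r, 0.0)

def lasso_ista(A, b, lam, n_iter=2000):
    # Minimize lam * ||rho||_1 + ||A rho - b||_2^2 with a fixed step size (plain ISTA).
    step = 1.0 / (2 * np.linalg.norm(A, 2) ** 2)   # safe step for the gradient 2 A^T (A rho - b)
    rho = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2 * A.T @ (A @ rho - b)
        rho = soft_threshold(rho - step * grad, step * lam)
    return rho

# Illustrative sizes: N = 50 measurements, K = 200 unknowns, M = 3 nonzeros.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 200))
A /= np.linalg.norm(A, axis=0)                     # unit-length columns, as assumed in the paper
rho_true = np.zeros(200); rho_true[[10, 70, 150]] = 1.0
b = A @ rho_true + 0.05 * rng.standard_normal(50)  # noisy data
print(np.flatnonzero(np.abs(lasso_ista(A, b, lam=0.1)) > 1e-3))   # estimated support
```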

Main Results

Suppose that $\rho$ is an $M$-sparse solution of system [1] with no noise, where the columns of $A$ have unit length. Our main result ensures that we can still recover the support of $\rho$ when the data are noisy by looking at the support of $\rho_\tau$ found as

$$(\rho_\tau,\eta_\tau) = \arg\min_{\rho,\eta}\ \tau\|\rho\|_{\ell_1} + \|\eta\|_{\ell_1}, \quad \text{subject to}\ A\rho + C\eta = b_0 + e, \qquad [4]$$

with an $O(1)$ weight $\tau$ and an appropriately chosen Noise Collector matrix $C\in\mathbb{R}^{N\times\Sigma}$, $\Sigma\gg K$. The minimization problem [4] can be understood as a relaxation of [2], as it works by absorbing all of the noise and, possibly, some signal in $C\eta_\tau$.
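
Problem [4] is an ordinary weighted $\ell_1$ minimization over the augmented dictionary $[A\,|\,C]$, so for small examples it can be solved directly as a linear program with an off-the-shelf solver (the fast iterative algorithm the paper actually uses is described later). The sketch below makes the usual positive/negative-part split; the function name, dimensions, and the random construction of $C$ are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def noise_collector_l1(A, C, b, tau):
    # Solve min tau*||rho||_1 + ||eta||_1  s.t.  A rho + C eta = b  as a linear program,
    # splitting each variable into nonnegative parts: rho = p - q, eta = u - v.
    N, K = A.shape
    S = C.shape[1]
    cost = np.concatenate([tau * np.ones(2 * K), np.ones(2 * S)])
    A_eq = np.hstack([A, -A, C, -C])
    res = linprog(cost, A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
    x = res.x
    rho = x[:K] - x[K:2 * K]
    eta = x[2 * K:2 * K + S] - x[2 * K + S:]
    return rho, eta

# Small illustrative example (dimensions chosen for speed, not from the paper).
rng = np.random.default_rng(1)
N, K, Sigma, tau = 40, 120, 200, 2.0
A = rng.standard_normal((N, K)); A /= np.linalg.norm(A, axis=0)
C = rng.standard_normal((N, Sigma)); C /= np.linalg.norm(C, axis=0)   # random unit columns
rho_true = np.zeros(K); rho_true[[5, 50]] = 1.0
e = rng.standard_normal(N); e *= 0.5 * np.linalg.norm(A @ rho_true) / np.linalg.norm(e)
rho, eta = noise_collector_l1(A, C, A @ rho_true + e, tau)
print(np.flatnonzero(np.abs(rho) > 1e-6))   # estimated support
```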

The following theorem shows that, if the signal is pure noise and the columns of $C$ are chosen independently and at random on the unit sphere $S^{N-1}=\{x\in\mathbb{R}^N : \|x\|_{\ell_2}=1\}$, then $C\eta_\tau=e$ for any level of noise, with large probability.

Theorem 1 (No-Phantom Signal). Suppose that $b_0=0$ and that $e/\|e\|_{\ell_2}$ is uniformly distributed on $S^{N-1}$. Fix $\beta>1$, and draw $\Sigma=N^{\beta}$ columns for $C$ independently from the uniform distribution on $S^{N-1}$. For any $\kappa>0$, there are constants $\tau=\tau(\kappa,\beta)$ and $N_0=N_0(\kappa,\beta)$ such that, for all $N>N_0$, $\rho_\tau$, the solution of Eq. 4, is $0$ with probability $1-1/N^{\kappa}$.

This theorem guarantees with large probability a zero false discovery rate in the absence of signals with meaningful information. The key to a zero false discovery rate is the choice of a no-phantom weight τ. Next, we generalize this result for the case in which the recorded signals carry useful information.

Theorem 2 (Zero False Discoveries). Let $\rho$ be an $M$-sparse solution of the noiseless system $A\rho=b_0$. Assume that $\kappa$, $\beta$, the Noise Collector, and the noise are the same as in Theorem 1. In addition, assume that the columns of $A$ are incoherent in the sense that $|\langle a_i,a_j\rangle|\le\frac{1}{3M}$ for $i\ne j$. Then, there are constants $\tau=\tau(\kappa,\beta)$ and $N_0=N_0(\kappa,\beta)$ such that $\mathrm{supp}(\rho_\tau)\subseteq\mathrm{supp}(\rho)$ for all $N>N_0$ with probability $1-1/N^{\kappa}$.

This theorem holds for any level of noise and the same value of τ as in Theorem 1. The incoherence conditions in Theorem 2 are needed to guarantee that the true signal does not create false positives elsewhere. Theorem 2 guarantees that the support of ρτ is inside the support of ρ. The next theorem shows that, if the noise is not too large, then ρτ and ρ have exactly the same support.

Theorem 3 (Exact Support Recovery). Keep the same assumptions as in Theorem 2. Let $\gamma=\min_{i\in\mathrm{supp}(\rho)}|\rho_i|/\|\rho\|_{\ell_\infty}$. There are constants $\tau=\tau(\kappa,\beta)$, $c_1=c_1(\kappa,\beta,\gamma)$, and $N_0=N_0(\kappa,\beta)$ such that, if the noise level satisfies $\|e\|_{\ell_2}\le c_1\|b_0\|_{\ell_2}^2\|\rho\|_{\ell_1}^{-1}\sqrt{N/\ln N}$, then for all $N>N_0$, $\mathrm{supp}(\rho_\tau)=\mathrm{supp}(\rho)$ with probability $1-1/N^{\kappa}$.

To interpret the last theorem, consider a model case where $A$ is the identity matrix and all coefficients of $b_0=\rho$ are either one or zero. Then, $\|b_0\|_{\ell_2}^2=\|\rho\|_{\ell_1}=M$. In this case, an acceptable relative level of noise is

$$\|e\|_{\ell_2}/\|b_0\|_{\ell_2}\lesssim\sqrt{N/(M\ln N)}. \qquad [5]$$

This means that $\|e\|_{\ell_2}\lesssim\sqrt{N/\ln N}$, and it implies that each coefficient of $b_0$ may be corrupted by $O(1/\sqrt{\ln N})$ on average and that some coefficients of $b_0$ may be corrupted by $O(1)$.
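
The arithmetic behind Eq. 5 in this model case is short; spelling it out,
$$\|e\|_{\ell_2}\;\le\;c_1\,\frac{\|b_0\|_{\ell_2}^2}{\|\rho\|_{\ell_1}}\sqrt{\frac{N}{\ln N}}\;=\;c_1\,\frac{M}{M}\sqrt{\frac{N}{\ln N}}\;=\;c_1\sqrt{\frac{N}{\ln N}},\qquad\text{so}\qquad\frac{\|e\|_{\ell_2}}{\|b_0\|_{\ell_2}}\;=\;\frac{\|e\|_{\ell_2}}{\sqrt{M}}\;\lesssim\;\sqrt{\frac{N}{M\ln N}}.$$
Spread over the $N$ coordinates, $\|e\|_{\ell_2}\lesssim\sqrt{N/\ln N}$ allows an average per-entry corruption of order $1/\sqrt{\ln N}$, while a few entries may be corrupted by $O(1)$.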

Motivation

We are interested in imaging sparse scenes accurately using limited and noisy data. Such imaging problems arise in many areas, such as medical imaging (12), structural biology (13), radar (14), and geophysics (15). In imaging, the $\ell_1$-norm minimization method in Eq. 2 is often used (16–21) as it has the desirable property of superresolution: that is, the enhancement of the fine-scale details of the images. This has been analyzed in different settings by Donoho (22), Candès and Fernandez-Granda (23), Fannjiang and Liao (24), and Borcea and Kocyigit (25) among others. We want to retain this property in our method when the data are corrupted by additive noise.

However, noise fundamentally limits the quality of the images formed with almost all computational imaging techniques. Specifically, $\ell_1$-norm minimization produces images that are unstable for low SNR due to the ill conditioning of superresolution reconstruction schemes. The instability emerges as clutter noise in the image, or grass, that degrades the resolution. Our initial motivation to introduce the Noise Collector matrix $C$ was to regularize the matrix $A$ and thus, to suppress the clutter in the images. We proposed in ref. 26 to seek the minimal $\ell_1$-norm solution of the augmented linear system $A\rho+C\eta=b$. The idea was to choose the columns of $C$ almost orthogonal to those of $A$. Indeed, the condition number of $[A\,|\,C]$ becomes $O(1)$ when $O(N)$ columns of $C$ are taken at random. This essentially follows from the bounds on the largest and the smallest nonzero singular values of random matrices (theorem 4.6.1 in ref. 27).

The idea to create a dictionary for noise is not new. For example, the work by Laska et al. (28) considers a specific version of the measurement noise model so that b=Aρ+Ce, where C is a matrix with fewer (orthonormal) columns than rows and the noise vector e is sparse. C represents the basis in which the noise is sparse and it is assumed to be known. Then, they show that it is possible to recover sparse signals and sparse noise exactly. We stress that we do not assume here that the noise is sparse. In our work, the noise is large (SNR can be small) and is evenly distributed across the data, and therefore, it cannot be sparsely accommodated.

To suppress the clutter, our theory in ref. 26 required exponentially many columns, and therefore, $\Sigma\sim e^{N}$. This seemed to make the Noise Collector impractical, but the numerical experiments suggested that $O(N)$ columns were enough to obtain excellent results. We address this issue here and explain why the Noise Collector matrix $C$ only needs algebraically many columns. Moreover, to absorb the noise completely and thus improve the algorithm in ref. 26, we now introduce the no-phantom weight $\tau$ in Eq. 4. Indeed, by weighting the columns of the Noise Collector matrix $C$ with respect to those in the model matrix $A$, the algorithm now produces images with no clutter at all regardless of how much noise is added to the data.

Finally, we want the Noise Collector to be efficient, with almost no extra computational cost with respect to the Lasso problem in Eq. 3. To this end, the Noise Collector is constructed using circulant matrices that allow for efficient matrix vector multiplications using fast Fourier transforms (FFTs).

We now explain how the Noise Collector works and reduce our theorems to basic estimates in high-dimensional probability.

The Noise Collector

The method has two main ingredients: the Noise Collector matrix $C$ and the no-phantom weight $\tau$. The construction of the Noise Collector matrix $C$ is guided by the following three key properties. First, its columns should be sufficiently orthogonal to the columns of $A$, so that it does not absorb signals with “meaningful” information. Second, the columns of $C$ should be uniformly distributed on the unit sphere $S^{N-1}$, so that a typical noise vector can be approximated well. Third, the number of columns in $C$ should grow more slowly than exponentially with $N$; otherwise, the method is impractical.

One way to guarantee all three properties is to impose

$$|\langle a_i,c_j\rangle| < \frac{\alpha}{\sqrt{N}}\ \ \forall\, i,j \qquad\text{and}\qquad |\langle c_i,c_j\rangle| < \frac{\alpha}{\sqrt{N}}\ \ \forall\, i\ne j \qquad [6]$$

with $\alpha>1$ and fill out $C$, drawing the $c_i$ at random with rejections, until the rejection rate becomes too high. Then, by construction, the columns of $C$ are almost orthogonal to the columns of $A$, and when the rejection rate becomes too high, this implies that we cannot pack more $N$-dimensional unit vectors into $C$; thus, we can approximate well a typical noise vector. Finally, the Kabatjanskii–Levenstein inequality (discussed in ref. 29) implies that the number $\Sigma$ of columns in $C$ grows at most polynomially: $\Sigma\lesssim N^{\alpha^2}$. The first estimate in Eq. 6 implies that any solution of $C\eta=a_i$ satisfies $\|\eta\|_{\ell_1}\gtrsim\sqrt{N}$ for any $i\le K$. This estimate measures how expensive it is to approximate columns of $A$ (i.e., the meaningful signal) with the Noise Collector. In turn, the no-phantom weight $\tau$ should be chosen so that it is expensive to approximate noise using columns of $A$. It cannot be taken too large, however, because we may lose the signal. In fact, one can prove that, if $\tau\ge\sqrt{N}/\alpha$, then $\rho_\tau\equiv 0$ for any $\rho$ and any level of noise. Intuitively, $\tau$ characterizes the rate at which the signal is lost as the noise increases. The most important property of the no-phantom weight $\tau$ is that it does not depend on the level of noise, and therefore, it is chosen before we start using the Noise Collector.
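
A minimal NumPy sketch of the rejection construction in Eq. 6 follows. The value of α, the stopping rule, and the cap on the number of accepted columns are illustrative choices; the paper only prescribes the decoherence bound α/√N and stopping when the rejection rate becomes too high.

```python
import numpy as np

def build_noise_collector(A, alpha=2.0, max_rejections=200, max_cols=500, rng=None):
    # Draw candidate unit vectors and keep those whose inner products with the columns of A
    # and with the previously accepted columns stay below alpha / sqrt(N), as in Eq. 6.
    # Stop after too many consecutive rejections or when max_cols columns have been accepted.
    rng = rng or np.random.default_rng()
    N = A.shape[0]
    bound = alpha / np.sqrt(N)
    cols, rejections = [], 0
    while rejections < max_rejections and len(cols) < max_cols:
        c = rng.standard_normal(N)
        c /= np.linalg.norm(c)
        ok_A = np.max(np.abs(A.T @ c)) < bound
        ok_C = (not cols) or np.max(np.abs(np.array(cols) @ c)) < bound
        if ok_A and ok_C:
            cols.append(c)
            rejections = 0
        else:
            rejections += 1
    return np.column_stack(cols)

# Example: N = 100 measurements, K = 50 model columns.
rng = np.random.default_rng(2)
A = rng.standard_normal((100, 50)); A /= np.linalg.norm(A, axis=0)
C = build_noise_collector(A, rng=rng)
print(C.shape)   # number of accepted columns; it grows at most polynomially in N
```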

It is, however, more convenient for the proofs to use a probabilistic version of Eq. 6. Suppose that the columns of $C$ are drawn independently at random. Then, the dot product of any two random unit vectors is still typically of order $1/\sqrt{N}$ (27). If the number of columns grows polynomially, we only have to sacrifice an asymptotically negligible event where our Noise Collector does not satisfy the three key properties, and the decoherence constraints in Eq. 6 are weakened by a logarithmic factor only. This follows from basic estimates in high-dimensional probability. We will state them in the next lemma after we interpret problem [4] geometrically.

Consider the convex hulls

$$H_1=\left\{x\in\mathbb{R}^N \,:\, x=\sum_{i=1}^{\Sigma}\xi_i c_i,\ \sum_{i=1}^{\Sigma}|\xi_i|\le 1\right\}, \qquad [7]$$
$$H_2=\left\{x\in\mathbb{R}^N \,:\, x=\sum_{i=1}^{K}\xi_i a_i,\ \sum_{i=1}^{K}|\xi_i|\le 1\right\}, \qquad [8]$$

and $H(\tau)=\{\xi h_1 + (1-\xi)h_2/\tau,\ 0\le\xi\le 1,\ h_i\in H_i\}$. Theorem 1 states that, for a typical noise vector $e$, we can find $\lambda_0>0$ such that $e\in\lambda_0 H_1$ and $e\notin\lambda H(\tau)$ for any $\lambda<\lambda_0$.

Lemma 1 (Typical Width of Convex Hulls $H_i$). Suppose that $\Sigma=N^{\beta}$, $\beta>1$; vectors $c_i\in S^{N-1}$, $i=1,2,\ldots,\Sigma$, are drawn at random and independently; and $e\in S^{N-1}$. Then, for any $\kappa>0$, there are constants $c_0=c_0(\kappa,\beta)$, $\alpha=\sqrt{(\beta-1)/2}$, and $N_0=N_0(\kappa,\beta)$ such that, for all $N\ge N_0$,

$$\max\left\{\max_{i\le K}|\langle a_i,e\rangle|,\ \max_{i\le\Sigma}|\langle c_i,e\rangle|\right\} < c_0\sqrt{\ln N/N} \qquad [9]$$

and

$$\alpha\sqrt{\ln N}\;e/\sqrt{N}\ \in\ H_1 \qquad [10]$$

with probability $1-1/N^{\kappa}$.

We sketch the proof of estimates [9] and [10] in Proofs. Estimate [9] can also be derived from Milman’s version (30) of Dvoretzky’s theorem. Informally, inequality [9] states that $H_1$ and $H_2$ are contained in the $\ell_2$ ball of radius $c_0\sqrt{\ln N/N}$ except for a few spikes in statistically insignificant directions (Fig. 1, Left). Inequality [10] states that $H_1$ contains an $\ell_2$ ball of radius $\alpha\sqrt{\ln N/N}$ except for a few statistically insignificant directions.
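
A quick Monte Carlo illustration (not part of the paper’s argument) of the scaling in inequality [9]: the largest inner product between a fixed unit vector and polynomially many random unit vectors stays within a bounded multiple of $\sqrt{\ln N/N}$. The sampling trick below uses the fact that each inner product has the law $g/\sqrt{g^2+\chi^2_{N-1}}$ with $g$ standard normal, so the vectors never need to be stored.

```python
import numpy as np

rng = np.random.default_rng(3)
beta = 1.5
for N in (100, 1000, 10000):
    Sigma = int(N ** beta)
    g = rng.standard_normal(Sigma)
    dots = g / np.sqrt(g**2 + rng.chisquare(N - 1, Sigma))   # Sigma inner products <c_i, e>
    ratio = np.max(np.abs(dots)) / np.sqrt(np.log(N) / N)
    print(f"N={N:6d}  max_i |<c_i,e>| / sqrt(ln N / N) = {ratio:.2f}")
```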

Fig. 1.

(Left) The convex hull $H_1$ is an $\ell_2$ ball of radius $O(\sqrt{\ln N/N})$ with few spikes. (Right) The intersection of $H(\tau)$ with the span of $(a_1,e)$ is a rounded rhombus.

These inequalities immediately imply Theorem 1. We just need to explain how to choose the no-phantom weight $\tau$. There will be no phantoms if $H_2/\tau$ is strictly inside the $\ell_2$ ball of radius $\alpha\sqrt{\ln N/N}$ (up to the statistically insignificant spikes). This can be done if $\tau>c_0/\alpha$.

If the columns of $A$ are orthogonal to each other, then Theorem 2 follows from Theorem 1. We just need to project the linear system in Eq. 4 onto the span of the $a_i$, $i\notin\mathrm{supp}(\rho)$, and apply Theorem 1 to the projections. If, in addition, we assume that $b_0=a_1\rho_1$, then the proof of Theorem 3 is illustrated in Fig. 1, Right. In detail, a typical intersection of $V=\mathrm{span}(a_1,e)$ and $H(\tau)$ is a rounded rhombus, because it is the convex hull of $\pm a_1/\tau$ and the $\ell_2$ ball of radius $c_0\sqrt{\ln N/N}$. If $\lambda_0$ is the smallest dilation such that $a_1\rho_1+e\in\lambda_0 H(\tau)$, then there are two options: 1) $a_1\rho_1+e$ lies on the curved boundary of the rounded rhombus, and then $\mathrm{supp}(\rho_\tau)=\emptyset$; or 2) $a_1\rho_1+e$ lies on the flat boundary of the rounded rhombus, and then $\mathrm{supp}(\rho_\tau)=\mathrm{supp}(\rho)$. The second option happens if the vector $a_1\rho_1+e$ intersects the flat boundary of $H(\tau)$. This gives the support recovery estimate in Theorem 3.

In the general case, the columns of the combined matrix [A|C] are incoherent. This property allows us to prove Theorems 2 and 3 in Proofs using known techniques (26). In particular, we automatically have exact recovery using ref. 2 applied to [A|C] if the data are noiseless.

Lemma 2 (Exact Recovery). Suppose that $\rho$ is an $M$-sparse solution of $A\rho=b$ and that there is no noise, so that $e=0$. In addition, assume that the columns of $A$ are incoherent: $|\langle a_i,a_j\rangle|\le\frac{1}{3M}$ for $i\ne j$. Then, the solution to Eq. 4 satisfies $\rho_\tau=\rho$ for all

$$M<\frac{2}{3}\,\frac{\sqrt{N}}{c_0\,\tau\,\sqrt{\ln N}}\qquad\text{with probability}\quad 1-\frac{1}{N^{\kappa}}. \qquad [11]$$

Fast Noise Collector Algorithm

To find the minimizer in Eq. 4, we consider a variational approach. We define the function

$$F(\rho,\eta,z)=\lambda\left(\tau\|\rho\|_{\ell_1}+\|\eta\|_{\ell_1}\right)+\tfrac{1}{2}\|A\rho+C\eta-b\|_{\ell_2}^2+\langle z,\,b-A\rho-C\eta\rangle \qquad [12]$$

for a no-phantom weight τ and determine the solution as

$$\max_{z}\ \min_{\rho,\eta}\ F(\rho,\eta,z). \qquad [13]$$

The key observation is that this variational principle finds the minimum in Eq. 4 exactly for all values of the regularization parameter λ. Hence, the method has no tuning parameters. To determine the exact extremum in Eq. 13, we use the iterative soft thresholding algorithm GeLMA (generalized Lagrangian multiplier algorithm) (31) that works as follows.

First, pick values for $\beta$ and $\tau$. For optimal results, one can calibrate $\tau$ to be the smallest constant such that Theorem 1 holds: that is, such that no phantom signals appear when the algorithm is fed with pure noise. In our numerical experiments, we use $\beta=1.5$ and $\tau=2$.

Second, pick a value for the regularization parameter $\lambda$ (e.g., $\lambda=1$). Choose step sizes $\Delta t_1<2/\|[A\,|\,C]\|^2$ and $\Delta t_2<\lambda/\|A\|$.* Set $\rho_0=0$, $\eta_0=0$, and $z_0=0$, and iterate for $k\ge 0$:

$$\begin{aligned} r &= b-A\rho_k-C\eta_k,\\ \rho_{k+1} &= S_{\tau\lambda\Delta t_1}\!\left(\rho_k+\Delta t_1\,A^{*}(z_k+r)\right),\\ \eta_{k+1} &= S_{\lambda\Delta t_1}\!\left(\eta_k+\Delta t_1\,C^{*}(z_k+r)\right),\\ z_{k+1} &= z_k+\Delta t_2\,r, \end{aligned} \qquad [14]$$

where $S_r(y_i)=\mathrm{sign}(y_i)\max\{0,|y_i|-r\}$.
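
The iteration [14] is a direct recipe; the following NumPy transcription uses an explicit Noise Collector matrix for clarity (the circulant/FFT implementation described next avoids storing $C$), and the step sizes follow the bounds quoted above. The matrix sizes are illustrative.

```python
import numpy as np

def soft(x, r):
    # S_r(y) = sign(y) * max(0, |y| - r), applied componentwise.
    return np.sign(x) * np.maximum(np.abs(x) - r, 0.0)

def noise_collector_gelma(A, C, b, tau=2.0, lam=1.0, n_iter=5000):
    # Iteration [14]: soft-thresholded steps on rho and eta plus a multiplier update on z.
    AC = np.hstack([A, C])
    dt1 = 1.9 / np.linalg.norm(AC, 2) ** 2      # Delta t_1 < 2 / ||[A|C]||^2
    dt2 = 0.9 * lam / np.linalg.norm(A, 2)      # Delta t_2 < lambda / ||A||
    rho = np.zeros(A.shape[1]); eta = np.zeros(C.shape[1]); z = np.zeros(A.shape[0])
    for _ in range(n_iter):
        r = b - A @ rho - C @ eta
        rho = soft(rho + dt1 * (A.T @ (z + r)), tau * lam * dt1)
        eta = soft(eta + dt1 * (C.T @ (z + r)), lam * dt1)
        z = z + dt2 * r
    return rho, eta

# Illustrative sizes only; tau = 2 and lam = 1 are the values quoted in the text.
rng = np.random.default_rng(4)
N, K, Sigma = 100, 300, 1000
A = rng.standard_normal((N, K)); A /= np.linalg.norm(A, axis=0)
C = rng.standard_normal((N, Sigma)); C /= np.linalg.norm(C, axis=0)
rho_true = np.zeros(K); rho_true[[7, 42, 200]] = 1.0
b0 = A @ rho_true
e = rng.standard_normal(N); e *= np.linalg.norm(b0) / np.linalg.norm(e)   # SNR = 1
rho, eta = noise_collector_gelma(A, C, b0 + e)
print(np.flatnonzero(np.abs(rho) > 1e-3))   # indices of the recovered support estimate
```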

The Noise Collector matrix $C$ is computed by drawing $N^{\beta-1}$ normally distributed $N$-dimensional vectors, normalized to unit length. These are the generating vectors of the Noise Collector. From each of them, a circulant $N\times N$ matrix $C_i$, $i=1,\ldots,N^{\beta-1}$, is constructed. The Noise Collector matrix is obtained by concatenation, $C=[C_1\,|\,C_2\,|\,\cdots\,|\,C_{N^{\beta-1}}]$. Exploiting the circulant structure of the matrices $C_i$, we perform the matrix-vector multiplications $C\eta_k$ and $C^{*}(z_k+r)$ in Eq. 14 using the FFT (32). This makes the complexity associated with the Noise Collector $O(N^{\beta}\log N)$. Note that only the $N^{\beta-1}$ generating vectors are stored and not the entire $N\times N^{\beta}$ Noise Collector matrix. In practice, we use $\beta\approx 1.5$, which makes the cost of using the Noise Collector negligible, as typically $K\gg N^{\beta-1}$. The columns of the Noise Collector matrix $C$ with this circulant structure are uniformly distributed on $S^{N-1}$, and they satisfy Lemma 1. This implies that the theorems of this paper are still valid for such $C$.
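
A minimal sketch of the circulant implementation described above: each block $C_i$ is the circulant matrix generated by a random unit vector, so products with $C$ and $C^{*}$ reduce to FFT-based circular convolutions and correlations. The class and method names are ours, not from the paper.

```python
import numpy as np
from scipy.linalg import circulant

class CirculantNoiseCollector:
    # Noise Collector built from n_blocks circulant N x N blocks, each generated by a random
    # unit vector; only the generating vectors are stored, and products use FFTs.
    def __init__(self, N, n_blocks, rng=None):
        rng = rng or np.random.default_rng()
        G = rng.standard_normal((n_blocks, N))
        G /= np.linalg.norm(G, axis=1, keepdims=True)      # unit-length generating vectors
        self.G_hat = np.fft.fft(G, axis=1)                  # precomputed spectra, shape (n_blocks, N)
        self.N, self.n_blocks = N, n_blocks

    def matvec(self, eta):
        # C @ eta: sum over blocks of circular convolutions, O(N^beta log N) overall.
        eta = eta.reshape(self.n_blocks, self.N)
        return np.fft.ifft(self.G_hat * np.fft.fft(eta, axis=1), axis=1).real.sum(axis=0)

    def rmatvec(self, y):
        # C.T @ y: circular correlations with each generating vector.
        return np.fft.ifft(np.conj(self.G_hat) * np.fft.fft(y), axis=1).real.ravel()

# Consistency check against an explicitly assembled C (small sizes only).
rng = np.random.default_rng(5)
nc = CirculantNoiseCollector(N=64, n_blocks=4, rng=rng)
C_dense = np.hstack([circulant(np.fft.ifft(h).real) for h in nc.G_hat])
eta = rng.standard_normal(64 * 4); y = rng.standard_normal(64)
print(np.allclose(C_dense @ eta, nc.matvec(eta)), np.allclose(C_dense.T @ y, nc.rmatvec(y)))
```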

Application to Imaging

We consider passive array imaging of point sources. The problem consists of determining the positions zj and the complex amplitudes αj, j=1,,M, of a few point sources from measurements of polychromatic signals on an array of receivers (Fig. 2). The imaging system is characterized by the array aperture a, the distance L to the sources, the bandwidth B, and the central wavelength λ0.

Fig. 2.

General setup for passive array imaging. The source at $z_j$ emits a signal that is recorded at all array elements $x_r$, $r=1,\ldots,N_r$.

The sources are located inside an image window (IW), which is discretized with a uniform grid of points $y_k$, $k=1,\ldots,K$. The unknown is the source vector $\rho=[\rho_1,\ldots,\rho_K]\in\mathbb{C}^K$, with components $\rho_k$ that correspond to the complex amplitudes of the $M$ sources at the grid points $y_k$, $k=1,\ldots,K$, with $K\gg M$. For the true source vector, we have $\rho_k=\alpha_j$ if $y_k=z_j$ for some $j=1,\ldots,M$, while $\rho_k=0$ otherwise.

Denoting by G(x,y;ω) Green’s function for the propagation of a signal of angular frequency ω from point y to point x, we define the single-frequency Green’s function vector that connects a point y in the IW with all points xr, r=1,,Nr, on the array as

$$g(y;\omega)=\left[G(x_1,y;\omega),\,G(x_2,y;\omega),\,\ldots,\,G(x_{N_r},y;\omega)\right]\in\mathbb{C}^{N_r}.$$

In three dimensions, $G(x,y;\omega)=\dfrac{\exp\{i\omega|x-y|/c_0\}}{4\pi|x-y|}$ if the medium is homogeneous. The data for the imaging problem are the signals $b(x_r,\omega_l)=\sum_{j=1}^{M}\alpha_j G(x_r,z_j;\omega_l)$ recorded at receiver locations $x_r$, $r=1,\ldots,N_r$, at frequencies $\omega_l$, $l=1,\ldots,S$. These data are stacked in a column vector

$$b=\left[b(\omega_1),\,b(\omega_2),\,\ldots,\,b(\omega_S)\right]\in\mathbb{C}^{N};\qquad N=N_r\,S, \qquad [15]$$

with $b(\omega_l)=\left[b(x_1,\omega_l),\,b(x_2,\omega_l),\,\ldots,\,b(x_{N_r},\omega_l)\right]\in\mathbb{C}^{N_r}$. Then, $A\rho=b$, with $A$ being the $N\times K$ measurement matrix with columns $a_k$ that are the multiple-frequency Green’s function vectors

$$a_k=\left[g(y_k;\omega_1),\,g(y_k;\omega_2),\,\ldots,\,g(y_k;\omega_S)\right]\in\mathbb{C}^{N} \qquad [16]$$

normalized to have length 1. The system $A\rho=b$ relates the unknown vector $\rho\in\mathbb{C}^K$ to the data vector $b\in\mathbb{C}^N$.
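
For concreteness, here is a compact sketch of how the measurement matrix [16] and the data vector [15] can be assembled for a homogeneous medium. The array geometry, grid, and frequencies below are simplified placeholders, not the configuration used in the experiments that follow.

```python
import numpy as np

c0 = 3e8                                             # wave speed in a homogeneous medium

def green(x, y, omega):
    # 3D homogeneous Green's function G(x, y; omega) = exp(i omega |x-y| / c0) / (4 pi |x-y|).
    d = np.linalg.norm(x - y, axis=-1)
    return np.exp(1j * omega * d / c0) / (4 * np.pi * d)

def measurement_matrix(receivers, grid, omegas):
    # Columns a_k stack the Green's function vectors g(y_k; omega_l) over frequencies (Eq. 16),
    # normalized to unit length; rows are ordered frequency by frequency, so N = Nr * S.
    blocks = [green(receivers[:, None, :], grid[None, :, :], w) for w in omegas]
    A = np.vstack(blocks)
    return A / np.linalg.norm(A, axis=0)

# Simplified geometry (placeholders): a small linear array, a coarse image grid, a few frequencies.
receivers = np.stack([np.linspace(-0.25, 0.25, 25), np.zeros(25), np.zeros(25)], axis=1)
xs, zs = np.meshgrid(np.linspace(-0.1, 0.1, 21), np.linspace(0.45, 0.55, 21))
grid = np.stack([xs.ravel(), np.zeros(xs.size), zs.ravel()], axis=1)
omegas = 2 * np.pi * np.linspace(50e9, 70e9, 5)
A = measurement_matrix(receivers, grid, omegas)       # shape (N, K) with N = 25*5, K = 21*21
rho_true = np.zeros(A.shape[1], dtype=complex)
rho_true[np.random.default_rng(6).choice(A.shape[1], 3, replace=False)] = 1.0
b = A @ rho_true                                      # noiseless data vector, Eq. 15
print(A.shape, b.shape)
```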

Next, we illustrate the performance of the Noise Collector in this imaging setup. The most important features are that 1) no calibration is necessary with respect to the level of noise, that 2) exact support recovery is obtained for relatively large levels of noise [i.e., $\|e\|_{\ell_2}\le c_1\|b_0\|_{\ell_2}^2\sqrt{N}/(\|\rho\|_{\ell_1}\sqrt{\ln N})$], and that 3) we have zero false discovery rates for all levels of noise with high probability.

We consider a high-frequency microwave imaging regime with central frequency $f_0=60$ GHz corresponding to $\lambda_0=5$ mm. We make measurements for $S=25$ equally spaced frequencies spanning a bandwidth $B=20$ GHz. The array has $N_r=25$ receivers and an aperture $a=50$ cm. The distance from the array to the center of the imaging window is $L=50$ cm. Then, the resolution is $\lambda_0 L/a=5$ mm in cross-range (direction parallel to the array) and $c_0/B=15$ mm in range (direction of propagation). These parameters are typical in microwave scanning technology (33).

We seek to image a source vector with sparsity $M=12$ (Fig. 3, Left). The size of the imaging window is 20 × 60 cm, and the pixel spacing is 5 × 15 mm. The number of unknowns is, therefore, $K=1{,}681$, and the number of data is $N=625$. The size of the Noise Collector is taken to be $\Sigma=10^4$, and therefore, $\beta\approx 1.5$. When the data are noiseless, we obtain exact recovery, as expected (Fig. 3, Right).
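
The quoted problem sizes follow directly from the geometry; a short check (values taken from the text):

```python
# 25 receivers x 25 frequencies, and a 20 x 60 cm window discretized on a 5 x 15 mm grid.
N_r, S = 25, 25
nx = round(0.20 / 0.005) + 1    # 41 grid points in cross-range
nz = round(0.60 / 0.015) + 1    # 41 grid points in range
print(N_r * S, nx * nz)         # 625 data, 1681 unknowns
```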

Fig. 3.

Noiseless data. The exact solution is recovered for any value of $\lambda$ in algorithm [14]. (Left) The true image. (Right) The recovered solution vector, $\rho_\tau$, is plotted with red stars, and the true solution vector, $\rho$, is plotted with green circles.

In Fig. 4, we display the imaging results with and without the Noise Collector when the data are corrupted by additive noise. The SNR is 1, and therefore, the $\ell_2$ norms of the signal and the noise are equal. In column 1 of Fig. 4, we show the image recovered using $\ell_1$-norm minimization without the Noise Collector. There is a lot of grass in this image, with many nonzero values outside the true support. When the Noise Collector is used, the level of the grass is reduced and the image improves (column 2 of Fig. 4). Still, there are several false discoveries because we use $\tau=1$ in algorithm [14].

Fig. 4.

High level of noise; SNR = 1. (Column 1) $\ell_1$-norm minimization without the Noise Collector. (Column 2) $\ell_1$-norm minimization with a Noise Collector with $\Sigma=10^4$ columns and $\tau=1$ in algorithm [14]. (Column 3) $\ell_1$-norm minimization with a Noise Collector and the correct $\tau=2$ in algorithm [14]. (Column 4) $\ell_2$-norm solution restricted to the support. In Upper, we show the images. In Lower, we show the recovered solution vector with red stars and the true solution vector with green circles. NC = Noise Collector.

In column 3 of Fig. 4, we show the image obtained with a weight $\tau=2$ in algorithm [14]. With this weight, there are no false discoveries, and the recovered support is exact. This simplifies the imaging problem dramatically, as we can now restrict the inverse problem to the true support just obtained and then solve an overdetermined linear system using a classical $\ell_2$ approach. The results are shown in column 4 of Fig. 4. Note that this second step largely compensates for the signal that was lost in the first step due to the high level of noise.
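
The second, debiasing step is a plain least-squares solve on the recovered support. A minimal sketch follows; the names rho_tau, A, and b refer to the quantities of the previous sketches and are assumptions of this illustration.

```python
import numpy as np

def refit_on_support(A, b, support):
    # Restrict the system to the recovered support and solve the overdetermined
    # least-squares problem to re-estimate the complex amplitudes.
    rho = np.zeros(A.shape[1], dtype=complex)
    sol, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)
    rho[support] = sol
    return rho

# Usage sketch, given a support recovered by the Noise Collector:
# support = np.flatnonzero(np.abs(rho_tau) > 1e-3)
# rho_refit = refit_on_support(A, b, support)
```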

In Fig. 5, we illustrate the performance of the Noise Collector for different sparsity levels $M$ and noise levels $\|e\|_{\ell_2}/\|b_0\|_{\ell_2}$. Success in recovering the true support of the unknown corresponds to a value of one (yellow in Fig. 5), and failure corresponds to a value of zero (blue in Fig. 5). The small phase transition zone (green in Fig. 5) contains intermediate values. The black lines in Fig. 5 are the theoretical prediction Eq. 5. These results are obtained by averaging over 10 realizations of the noise. We show results for three data sizes: $N=342$, $N=625$, and $N=961$. In our experiments, the nonzero components of the unknown $\rho$ take values in $[0.6,0.8]$, and therefore, $\|b_0\|_{\ell_2}/\|\rho\|_{\ell_1}=\mathrm{cst}/\sqrt{M}$.

Fig. 5.

Algorithm performance for exact support recovery. Success corresponds to a value of one (yellow), and failure corresponds to a value of zero (blue). The small phase transition zone (green) contains intermediate values. The black lines are the theoretical estimate $\sqrt{N/(M\ln N)}$. The ordinate and abscissa are the sparsity $M$ and $\|e\|_{\ell_2}/\|b_0\|_{\ell_2}$. The data sizes are (Left) $N=342$, (Center) $N=625$, and (Right) $N=961$.

Remark 1: We considered passive array imaging for ease of presentation. The same results hold for active array imaging with or without multiple scattering; ref. 34 discusses the detailed analytical setup.

Remark 2: We have considered a microwave imaging regime. Similar results can be obtained in other regimes.

Proofs

Proof of Lemma 1: Using the rotational invariance of all of our probability distributions, inequality [9] is true if

$$P\left(\max_i|\langle d_i,e\rangle|\ge c_0\sqrt{\ln N/N}\right)\le 1/N^{\kappa},$$

where the $d_i$, $i=1,2,\ldots,K+\Sigma$, are (possibly dependent) unit vectors uniformly distributed on $S^{N-1}$, and we can assume that $e=(1,0,\ldots,0)$. Denote the event

$$\Omega_t=\left\{\max_i|\langle d_i,e\rangle|\ge t/\sqrt{N}\right\}.$$

We have $P\left(|\langle d_i,e\rangle|\ge t/\sqrt{N}\right)\le 2\exp(-t^2/2)$ for each $d_i$. We obtain $P(\Omega_t)\le 2(K+\Sigma)\exp(-t^2/2)$ using the union bound. Choosing $t=c_0\sqrt{\ln N}$ for sufficiently large $c_0$, we get $P(\Omega_t)\le C\,N^{\beta}N^{-c_0^2/2}\le N^{-\kappa}$, where $c_0^2>2(\beta+\kappa)$ and $N\ge N_0$. Hence, Eq. 9 holds with probability $\ge 1-N^{-\kappa}$.

If $N$ columns $c_j$, $j\in S$, of $C$ satisfy

$$\min_{j\in S}|\langle c_j,e\rangle|\ge\theta,\qquad \theta=\alpha\sqrt{\ln N/N}, \qquad [17]$$

then their convex hull will contain $\theta e$ with probability $1-(1/2)^N$. Therefore, inequality [10] follows if [17] holds with probability $1-1/N^{\kappa}$. Using the rotational invariance of all of our probability distributions, we can assume that $e=(1,0,\ldots,0)$. For each $c_i$,

$$P\left(|\langle c_i,e\rangle|\ge\frac{t}{\sqrt{N}}\right)\approx\frac{2}{\sqrt{2\pi}}\int_t^{\infty}e^{-\frac{x^2}{2}}\,dx\ge\frac{1}{2}e^{-t^2}.$$

Split the index set $\{1,2,\ldots,\Sigma\}$ into $N$ nonoverlapping subsets $S_k$, $k=1,2,\ldots,N$, of size $N^{\beta-1}$. For each $S_k$,

$$P\left(\max_{i\in S_k}|\langle c_i,e\rangle|\le\alpha\sqrt{\frac{\ln N}{N}}\right)\le\left(1-\tfrac{1}{2}N^{-\alpha^2}\right)^{N^{\beta-1}}\le e^{-\frac{1}{2}N^{\frac{\beta-1}{2}}}$$

for $\alpha=\sqrt{(\beta-1)/2}$. By independence,

$$P\big([17]\ \text{holds}\big)\ \ge\ \prod_{k=1}^{N}P\left(\max_{i\in S_k}|\langle c_i,e\rangle|\ge\alpha\sqrt{\ln N/N}\right).$$

Then, $P\big([17]\ \text{holds}\big)\ge\left(1-e^{-\frac{1}{2}N^{\frac{\beta-1}{2}}}\right)^{N}\ge 1-Ne^{-\frac{1}{2}N^{\frac{\beta-1}{2}}}$. Choosing $N_0$ sufficiently large, we obtain [10].

Proof of Theorem 2: When the columns of $A$ are not orthogonal, we will choose a $\tau$ smaller than that in Theorem 1 by a factor of two. Suppose that the $M$-dimensional space $V$ is the span of the column vectors $a_j$, with $j$ in the support of $\rho$. Say that $V$ is spanned by $a_1,\ldots,a_M$. Let $W=V^{\perp}$ be the orthogonal complement of $V$. Consider the orthogonal decomposition $a_i=a_i^{v}+a_i^{w}$ for all $i\ge M+1$. Incoherence of the $a_i$ implies that $\|a_i^{w}\|_{\ell_2}\ge 1/2$ for all $i\ge M+1$. Indeed, fix any $i\ge M+1$. Suppose that $a_i^{v}=\sum_{k=1}^{M}\xi_k a_k$ and that $|\xi_j|=\max_{k\le M}|\xi_k|=\xi_l$. Thus, $\frac{1}{3M}\ge|\langle a_j,a_i^{v}\rangle|=\left|\left\langle a_j,\sum_{k=1}^{M}\xi_k a_k\right\rangle\right|\ge\xi_l\left(1-M\frac{1}{3M}\right)$. Then, $\xi_l\le 1/(2M)$. Therefore, $\|a_i^{v}\|_{\ell_2}\le\|\xi\|_{\ell_1}\le M\xi_l\le 1/2$, and $\|a_i^{w}\|_{\ell_2}\ge\|a_i\|_{\ell_2}-\|a_i^{v}\|_{\ell_2}\ge 1/2$.

Project system [4] onto $W$. Then, we obtain a new system of the form [4]. The $\ell_2$ norms of the columns of the new $A$ are at least $1/2$. In all other respects, the new system satisfies the conditions of Theorem 1. Indeed, $b_0$ is projected to zero. All $c_i$ and $e/\|e\|_{\ell_2}$ are projected to vectors uniformly distributed on $S^{N-M-1}$ by the concentration of measure (27). If any $a_i$, $i\ge M+1$, was used in an optimal approximation of $b_0+e$, then its projection $a_i^{w}$ is used in an optimal approximation of the projection of $b_0+e$ onto $W$. This is a contradiction to Lemma 1 if we choose $\tau>c_0/(2\alpha)$ and recall that $\|a_i^{w}\|_{\ell_2}\ge 1/2$.

Proof of Theorem 3: Choose $\tau$ as in Theorem 2. Incoherence of the $a_i$ implies that we can argue as in Proof of Theorem 2 and assume that $\langle a_i,a_j\rangle=0$ for $i\ne j$, $i,j\in\mathrm{supp}(\rho)$. Suppose that the $V_i$ are the two-dimensional (2D) spaces spanned by $e$ and $a_i$ for $i\in\mathrm{supp}(\rho)$. By Lemma 1, all $\lambda H(\tau)\cap V_i$ look like the rounded rhombi depicted in Fig. 1, Right, and $\lambda H_1\cap V_i\subset B_{\lambda}^{i}$ with probability $1-N^{-\kappa}$, where $B_{\lambda}^{i}$ is a 2D $\ell_2$ ball of radius $\lambda c_0\sqrt{\ln N/N}$. Thus, $\lambda H(\tau)\cap V_i\subset H_{\lambda}^{i}$ with probability $1-N^{-\kappa}$, where $H_{\lambda}^{i}$ is the convex hull of $B_{\lambda}^{i}$ and a vector $\lambda f_i$, $f_i=\rho_i\|\rho\|_{\ell_1}^{-1}\tau^{-1}a_i$. Then, $\mathrm{supp}(\rho_\tau)=\mathrm{supp}(\rho)$ if there exists $\lambda_0$ so that $\rho_i a_i+e$ lies on the flat boundary of $H_{\lambda_0}^{i}$ for all $i\in\mathrm{supp}(\rho)$.

If $\min_{i\in\mathrm{supp}(\rho)}|\rho_i|\ge\gamma\|\rho\|_{\ell_\infty}$, then there is a constant $c_2=c_2(\gamma)$ such that, if $\rho_i a_i+e$ lies on the flat boundary of $H_{\lambda}^{i}$ for some $i$ and some $\lambda$, then there exists $\lambda_0$ so that $\rho_i a_i+c_2 e$ lies on the flat boundary of $H_{\lambda_0}^{i}$ for all $i\in\mathrm{supp}(\rho)$. Suppose that $V$ is spanned by $e$ and $b_0$, that $H_{\lambda}\subset V$ is the convex hull of $B_{\lambda}$ and $\lambda f$, with $f=b_0\|\rho\|_{\ell_1}^{-1}\tau^{-1}$, where $B_{\lambda}\subset V$ is an $\ell_2$ ball of radius $\lambda c_0\sqrt{\ln N/N}$. If $b_0+c_2 e$ lies on the flat boundary of $H_{\lambda}$, then there must be an $i\in\mathrm{supp}(\rho)$ such that $\rho_i a_i+c_2 e$ lies on the flat boundary of $H_{\lambda}^{i}$. If

$$\frac{|\langle b_0,\,b_0+c_2 e\rangle|}{\|b_0\|_{\ell_2}\,\|b_0+c_2 e\|_{\ell_2}}\ \ge\ \frac{c_0\sqrt{\ln N}}{\sqrt{N}\,\|f\|_{\ell_2}}, \qquad [18]$$

then $b_0+c_2 e$ lies on the flat boundary of $H_{\lambda}$. Since $|\langle b_0,e\rangle|\le c_0\|e\|_{\ell_2}\|b_0\|_{\ell_2}/\sqrt{N}$ with probability $1-N^{-\kappa}$, Eq. 18 holds if $\|e\|_{\ell_2}/\|b_0\|_{\ell_2}\le\|f\|_{\ell_2}\sqrt{N}/(c_2 c_0\sqrt{\ln N})=c_1\|b_0\|_{\ell_2}\|\rho\|_{\ell_1}^{-1}\sqrt{N/\ln N}$.

Data Availability Statement.

There are no data in this paper; we present an algorithm, its theoretical analysis, and some numerical simulations.

Acknowledgments

The work of M.M. was partially supported by Spanish Ministerio de Ciencia e Innovación Grant FIS2016-77892-R. The work of A.N. was partially supported by NSF Grants DMS-1515187 and DMS-1813943. The work of G.P. was partially supported by Air Force Office of Scientific Research (AFOSR) Grant FA9550-18-1-0519. The work of C.T. was partially supported by AFOSR Grants FA9550-17-1-0238 and FA9550-18-1-0519. We thank Marguerite Novikov for drawing Fig. 1, Left.

Footnotes

The authors declare no competing interest.

*Choosing two step sizes instead of the smaller one Δt1 improves the convergence speed.

We chose to work with real numbers in the previous sections for ease of presentation, but the results also hold with complex numbers.

References

1. Chen S. S., Donoho D. L., Saunders M. A., Atomic decomposition by basis pursuit. SIAM Rev. 43, 129–159 (2001).
2. Donoho D. L., Elad M., Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization. Proc. Natl. Acad. Sci. U.S.A. 100, 2197–2202 (2003).
3. Candès E. J., Tao T., Decoding by linear programming. IEEE Trans. Inf. Theory 51, 4203–4215 (2005).
4. Tibshirani R., Regression shrinkage and selection via the lasso. J. R. Stat. Soc. B 58, 267–288 (1996).
5. Fuchs J. J., Recovery of exact sparse representations in the presence of bounded noise. IEEE Trans. Inf. Theory 51, 3601–3608 (2005).
6. Tropp J. A., Just relax: Convex programming methods for identifying sparse signals in noise. IEEE Trans. Inf. Theory 52, 1030–1051 (2006).
7. Wainwright M. J., Sharp thresholds for high-dimensional and noisy sparsity recovery using ℓ1-constrained quadratic programming (Lasso). IEEE Trans. Inf. Theory 55, 2183–2202 (2009).
8. Maleki A., Anitori L., Yang Z., Baraniuk R., Asymptotic analysis of complex Lasso via complex approximate message passing (CAMP). IEEE Trans. Inf. Theory 59, 4290–4308 (2013).
9. Zou H., The adaptive Lasso and its oracle properties. J. Am. Stat. Assoc. 101, 1418–1429 (2006).
10. Sampson J. N., Chatterjee N., Carroll R. J., Müller S., Controlling the local false discovery rate in the adaptive Lasso. Biostatistics 14, 653–666 (2013).
11. Belloni A., Chernozhukov V., Wang L., Square-root Lasso: Pivotal recovery of sparse signals via conic programming. Biometrika 98, 791–806 (2011).
12. Trzasko J., Manduca A., Highly undersampled magnetic resonance image reconstruction via homotopic ℓ0-minimization. IEEE Trans. Med. Imag. 28, 106–121 (2009).
13. AlQuraishi M., McAdams H. H., Direct inference of protein–DNA interactions using compressed sensing methods. Proc. Natl. Acad. Sci. U.S.A. 108, 14819–14824 (2011).
14. Baraniuk R., Steeghs P., “Compressive radar imaging” in 2007 IEEE Radar Conference (IEEE, 2007), pp. 128–133.
15. Taylor H. L., Banks S. C., McCoy J. F., Deconvolution with the l1 norm. Geophysics 44, 39–52 (1979).
16. Malioutov D., Cetin M., Willsky A. S., A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans. Signal Process. 53, 3010–3022 (2005).
17. Romberg J., Imaging via compressive sampling. IEEE Signal Process. Mag. 25, 14–20 (2008).
18. Herman M. A., Strohmer T., High-resolution radar via compressed sensing. IEEE Trans. Signal Process. 57, 2275–2284 (2009).
19. Tropp J. A., Laska J. N., Duarte M. F., Romberg J. K., Baraniuk R. G., Beyond Nyquist: Efficient sampling of sparse bandlimited signals. IEEE Trans. Inf. Theory 56, 520–544 (2010).
20. Fannjiang A. C., Strohmer T., Yan P., Compressed remote sensing of sparse objects. SIAM J. Imag. Sci. 3, 595–618 (2010).
21. Chai A., Moscoso M., Papanicolaou G., Robust imaging of localized scatterers using the singular value decomposition and ℓ1 optimization. Inverse Probl. 29, 025016 (2013).
22. Donoho D. L., Superresolution via sparsity constraints. SIAM J. Math. Anal. 23, 1303–1331 (1992).
23. Candès E. J., Fernandez-Granda C., Towards a mathematical theory of super-resolution. Comm. Pure Appl. Math. 67, 906–956 (2014).
24. Fannjiang A. C., Liao W., Coherence pattern-guided compressive sensing with unresolved grids. SIAM J. Imag. Sci. 5, 179–202 (2012).
25. Borcea L., Kocyigit I., Resolution analysis of imaging with ℓ1 optimization. SIAM J. Imag. Sci. 8, 3015–3050 (2015).
26. Moscoso M., Novikov A., Papanicolaou G., Tsogka C., Imaging with highly incomplete and corrupted data. Inverse Probl. 36, 035010 (2020).
27. Vershynin R., High-Dimensional Probability: An Introduction with Applications in Data Science (Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, 2018).
28. Laska J. N., Davenport M. A., Baraniuk R. G., “Exact signal recovery from sparsely corrupted measurements through the pursuit of justice” in 2009 Conference Record of the Forty-Third Asilomar Conference on Signals, Systems and Computers (IEEE, Pacific Grove, CA, 2009), pp. 1556–1560.
29. Tao T., “A cheap version of the Kabatjanskii–Levenstein bound for almost orthogonal vectors.” https://terrytao.wordpress.com/2013/07/18/a-cheap-version-of-the-kabatjanskii-levenstein-bound-for-almost-orthogonal-vectors. Accessed 23 January 2019.
30. Milman V. D., A new proof of Dvoretzky’s theorem on cross-sections of convex bodies. Funkcional. Anal. i Priložen. 5, 28–37 (1971).
31. Moscoso M., Novikov A., Papanicolaou G., Ryzhik L., A differential equations approach to l1-minimization with applications to array imaging. Inverse Probl. 28, 105001 (2012).
32. Gray R. M., Toeplitz and circulant matrices: A review. Commun. Inf. Theory 2, 155–239 (2006).
33. Laviada J., Arboleya-Arboleya A., Alvarez-Lopez Y., Garcia-Gonzalez C., Las-Heras F., Phaseless synthetic aperture radar with efficient sampling for broadband near-field imaging: Theory and validation. IEEE Trans. Antennas Propag. 63, 573–584 (2015).
34. Chai A., Moscoso M., Papanicolaou G., Imaging strong localized scatterers with sparsity promoting optimization. SIAM J. Imaging Sci. 7, 1358–1387 (2014).
