Significance
A frame (overcomplete set of vectors) represents an analog coding scheme. Deterministic frame constructions offer useful codes for communication and signal processing tasks. When the coded signal only uses a random subset of the frame vectors (for example, in compressed sensing), the coding quality is determined by the typical covariances within subsets of frame vectors. We provide a method to calculate functions of these typical covariances, which predict specific performance measures of the corresponding coding scheme. Our method uses a universality property: for many well-known deterministic and random frames, typical covariances within subsets of frame vectors do not depend on the frame and are described by the MANOVA (multivariate ANOVA) ensemble, a classical object in statistics and random matrix theory.
Keywords: deterministic frames, MANOVA, analog source coding, equiangular tight frames, restricted isometry property
Abstract
We draw a random subset of rows from a frame with rows (vectors) and columns (dimensions), where and are proportional to . For a variety of important deterministic equiangular tight frames (ETFs) and tight non-ETFs, we consider the distribution of singular values of the -subset matrix. We observe that, for large , they can be precisely described by a known probability distribution: Wachter’s MANOVA (multivariate ANOVA) spectral distribution, a phenomenon that was previously known only for two types of random frames. In terms of convergence to this limit, the -subset matrix from all of these frames is shown to be empirically indistinguishable from the classical MANOVA (Jacobi) random matrix ensemble. Thus, empirically, the MANOVA ensemble offers a universal description of the spectra of randomly selected subframes, even those taken from deterministic frames. The same universality phenomenon is shown to hold for notable random frames as well. This description enables exact calculations of properties of solutions for systems of linear equations based on a random choice of frame vectors of possible vectors and has a variety of implications for erasure coding, compressed sensing, and sparse recovery. When the aspect ratio is small, the MANOVA spectrum tends to the well-known Marčenko–Pastur distribution of the singular values of a Gaussian matrix, in agreement with previous work on highly redundant frames. Our results are empirical, but they are exhaustive, precise, and fully reproducible.
Consider a frame or , and stack the vectors as rows to obtain the -by- frame matrix . Assume that (deterministic frames) or almost surely (random frames). This paper studies properties of a random subframe , where is chosen uniformly at random from and . We let denote the -by- submatrix of created by picking only the rows ; call this object a typical submatrix of . We consider a collection of well-known deterministic frames, listed in Table 1, which we denote by . Most of the frames in are equiangular tight frames (ETFs), and some are near-ETFs.
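As a concrete illustration of a typical submatrix, the following minimal sketch builds a random Fourier frame, draws one uniformly random subset of rows, and computes the eigenvalues of the corresponding Gram matrix. The symbols `n` (number of frame vectors), `m` (dimension), and `k` (subset size), the function names, and the choice to draw DFT columns without replacement are assumptions of this sketch, not the deposited Matlab code.

```python
import numpy as np

def random_fourier_frame(n, m, rng):
    """n-by-m frame built from m distinct columns of the n-by-n DFT matrix.

    Entries are scaled to modulus 1/sqrt(m), so every row has unit norm.
    Sampling columns without replacement is an assumption of this sketch.
    """
    cols = rng.choice(n, size=m, replace=False)
    rows = np.arange(n)
    F = np.exp(-2j * np.pi * np.outer(rows, cols) / n)
    return F / np.sqrt(m)

def typical_submatrix_spectrum(X, k, rng):
    """Eigenvalues of the k-by-k Gram matrix of k uniformly chosen rows of X."""
    n = X.shape[0]
    K = rng.choice(n, size=k, replace=False)   # uniformly random k-subset
    XK = X[K, :]                               # the k-by-m typical submatrix
    G = XK @ XK.conj().T                       # Hermitian Gram matrix
    return np.linalg.eigvalsh(G)               # real eigenvalues, ascending

rng = np.random.default_rng(0)
n, m, k = 256, 128, 64
X = random_fourier_frame(n, m, rng)
lam = typical_submatrix_spectrum(X, k, rng)
```

Because the rows have unit norm, the eigenvalues average to one, and because the frame is tight, none of them can exceed `n / m`.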
Table 1.
Label | Name | Real or complex | Natural | Tight frame | Equiangular | Refs. |
Deterministic frames | ||||||
DSS | Difference set spectrum | Complex | | Yes | Yes | 36 |
GF | Grassmannian frame | Complex | | Yes | Yes | 37, corollary 2.6b |
RealPF | Real Paley’s construction | Real | | Yes | Yes | 37, corollary 2.6a |
ComplexPF | Complex Paley’s construction | Complex | | Yes | Yes | 38 |
Alltop | Quadratic Phase Chirp | Complex | | Yes | No | 26, equation S4 |
SS | Spikes and Sines | Complex | | Yes | No | 6 |
SH | Spikes and Hadamard | Real | | Yes | No | 6 |
Random frames | ||||||
HAAR | Unitary Haar frame | Complex | | Yes | No | 3, 4 |
RealHAAR | Orthogonal Haar frame | Real | | Yes | No | 4 |
RandDFT | Random Fourier transform | Complex | | Yes | No | 3 |
RandDCT | Random Cosine transform | Real | | Yes | No | |
This paper suggests that, for a frame in , it is possible to calculate quantities of the form , where is the vector of eigenvalues of the -by- Gram matrix and is a functional of these eigenvalues. As discussed below, such quantities are of considerable interest in various applications where frames are used across a variety of domains, including compressed sensing, sparse recovery, and erasure coding.
We present a simple and explicit formula for calculating for a given frame in and a given spectral functional . Specifically, for the case ,
where , , and is the density of Wachter’s classical multivariate ANOVA limiting distribution (1), which we denote here by MANOVA. The fluctuations about this approximate value are given exactly by
[1] |
Although the constant may depend on the frame, the exponents and are universal and depend only on and the aspect ratios and . Evidently, the precision of the MANOVA-based approximation is good, is known, and improves as and both grow proportionally to .
Eq. 1 is based on a far-reaching universality hypothesis. For all frames in as well as well-known random frames also listed in Table 1, we find that the spectrum of the typical -submatrix ensemble is indistinguishable from that of the classical MANOVA (Jacobi) random matrix ensemble (2) of the same size. (Interestingly, it will be shown that, for deterministic ETFs, this indistinguishability holds in a stronger sense than for deterministic non-ETFs.) This universality is not asymptotic and concerns finite -by- frames. However, it does imply that the spectrum of the typical -submatrix ensemble converges to a universal limiting distribution, which is none other than Wachter’s MANOVA limiting distribution (1). It also implies that the universal exponents and in Eq. 1 are previously unknown, universal quantities corresponding to the classical MANOVA (Jacobi) random matrix ensemble.
This brief announcement tests Eq. 1 and the underlying universality hypothesis by conducting substantial computer experiments, in which a large number of random submatrices are generated. We study a large variety of deterministic frames, both real and complex. In addition to the universal object (the MANOVA ensemble) itself, we study difference set spectrum (DSS) frames, Grassmannian frame (GF), real Paley (RealPF) frames, complex Paley (ComplexPF) frames, quadratic phase chirp (Alltop) frames, Spikes and Sines (SS) frames, and Spikes and Hadamard (SH) frames.
We report compelling empirical evidence, systematically documented and analyzed, which fully supports the universality hypothesis and Eq. 1. Our results are empirical, but they are exhaustive, precise, and reproducible, and they meet the best standards of empirical science.
For this purpose, we develop a natural framework for empirically testing such hypotheses regarding limiting distribution and convergence rates of random matrix ensembles. Before turning to deterministic frames, we validate our framework on well-known random frames, including real orthogonal Haar frames, complex unitary Haar frames, real random Cosine frames, and complex random Fourier frames. Interestingly, rigorous proofs that identify the MANOVA distribution as the limiting spectral distribution of typical submatrices can be found in the literature for two of these random frames, namely the random Fourier frame (3) and the unitary Haar frame (4).
Motivation
Frames can be viewed as an analog counterpart for digital coding. They provide overcomplete representation of signals, adding redundancy and increasing immunity to noise. Indeed, they are used in many branches of science and engineering for stable signal representation as well as error and erasure correction.
Let denote the vector of nonzero eigenvalues of , and let and denote its maximum and minimum, respectively. Frames were traditionally designed to achieve frame bounds as high as possible [ as low as possible]. Alternatively, they were designed to minimize mutual coherence (5, 6), the maximal pairwise correlation between any two frame vectors.
In the past decade, it has become apparent that neither frame bounds (a global criterion) nor coherence (a local pairwise criterion) is sufficient to explain various phenomena related to overcomplete representations and that one should also look at collective behavior of frame vectors from the frame, . Although different applications focus on different properties of the submatrix , most of these properties can be expressed as a function of and even just an average of a scalar function of the eigenvalues. Here are a few notable examples.
Restricted Isometry Property.
Recovery of any -sparse signal from its linear measurement using minimization is guaranteed if the spectral radius of [restricted isometry property (RIP)], namely
[2] |
Statistical RIP.
Numerous authors have studied a relaxation of the RIP condition suggested in ref. 10. Define
[3] |
Then, is the probability that the RIP condition with bound holds when acts on a signal supported on a random set of coordinates.
Analog Coding of a Source with Erasures.
In ref. 11, two of us considered a typical erasure pattern of random samples known at the transmitter but not known at the receiver. The rate distortion function of the coding scheme suggested in ref. 11 is determined by , with
[4] |
[i.e., is the arithmetic-to-harmonic means ratio of the eigenvalues (the arithmetic mean is because of the normalization of frames)]. This quantity is the signal amplification responsible for the excess rate of the suggested coding scheme. Note that here is the inverse of defined in ref. 11.
Shannon Transform.
The quantity
[5] |
which was suggested in ref. 12, measures the capacity of a linear Gaussian erasure channel. Specifically, it assumes (where and are the channel input and output) followed by random erasures. The quantity in Eq. 5 is the signal-to-noise ratio .
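The functionals described above can be sketched as plain functions of the eigenvalue vector. This is a minimal sketch under stated assumptions: the statistical RIP indicator is taken to be one when every eigenvalue lies within `delta` of one, the analog coding functional is the arithmetic-to-harmonic means ratio as described in the text, and the Shannon transform is taken in its usual form, the mean of `log(1 + alpha * lambda_i)` with the natural logarithm. The function names and the parameters `delta` and `alpha` are hypothetical.

```python
import numpy as np

def strip_indicator(lam, delta):
    """1.0 if all eigenvalues lie in [1 - delta, 1 + delta], else 0.0 (assumed form)."""
    lam = np.asarray(lam, dtype=float)
    return float(max(lam.max() - 1.0, 1.0 - lam.min()) <= delta)

def am_hm_ratio(lam):
    """Arithmetic mean divided by harmonic mean of the eigenvalues."""
    lam = np.asarray(lam, dtype=float)
    return lam.mean() * np.mean(1.0 / lam)

def shannon_transform(lam, alpha):
    """Mean of log(1 + alpha * lambda_i); natural log is an assumption."""
    lam = np.asarray(lam, dtype=float)
    return np.mean(np.log1p(alpha * lam))
```

Averaging `strip_indicator` over many random subsets estimates the probability that the RIP-type bound holds on a random support, which is how the statistical RIP quantity is interpreted above.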
In this paper, we focus on typical case performance criteria [those that seek to optimize over random choice of ] rather than worst case performance criteria [those that seek to optimize , such as RIP]. For the remainder of this paper, will denote a uniformly distributed random subset of size . Importantly, should be allowed to be large, even as large as .
For a given , one would like to design frames that optimize . This optimization turns out to be a difficult task; in fact, it is not even known how to calculate for a given frame . Indeed, to calculate this quantity, one effectively has to average over the spectrum for all subsets . It is of little surprise to the information theorist that the first frame designs, for which performance was formally bounded (and still not calculated exactly), consisted of random vectors (8, 13–17).
Random Frames
When the frame is random, namely when is drawn from some ensemble of random matrices, each submatrix is also a random matrix. Given a specific , rather than seeking to bound for specific and , it can be extremely rewarding to study the limit of as the frame sizes and grow. The reason is that tools from random matrix theory become available, which allow exact asymptotic calculation of and , and also because their limiting values are usually very close to their corresponding values for finite and , even for low values of .
Let us consider then a sequence of dimensions with and a sequence of random frame matrices or . To characterize the collective behavior of submatrices, we choose a sequence with and look at the spectrum of the random matrix as , where is a randomly chosen subset with . Here and below, to avoid cumbersome notation, we omit the subscript and write , , and for , , and .
A mainstay of random matrix theory is the celebrated convergence of the empirical spectral distribution of random matrices, drawn from a certain ensemble, to a limiting spectral distribution corresponding to that ensemble. Such convergence has indeed been established for three random frames:
1. Gaussian i.i.d. Frame.
Let have i.i.d. (independent and identically distributed) normal entries with mean zero and variance . The empirical distribution of famously converges, almost surely in distribution, to the Marčenko–Pastur density (18) with parameter :
[6] |
supported on , where . Moreover, almost surely and ; in other words, the maximal and minimal empirical eigenvalues converge almost surely to the edges of the support of the limiting spectral distribution (19).
2. Random Fourier Frame.
Consider the random Fourier frame, in which the columns of are drawn uniformly at random from the columns of the -by- discrete Fourier transform (DFT) matrix (normalized such that the absolute value of matrix entries is ). Farrell (3) has proved that the empirical distribution of converges, almost surely in distribution, as and and grow proportionally to to the so-called MANOVA limiting distribution, which we now describe briefly.
The classical MANOVA ensemble,* with , is the distribution of the random matrix
[7] |
where are random standard Gaussian i.i.d. matrices with entries in . Wachter (1) discovered that, as and , the empirical spectral distribution of the MANOVA ensemble converges, almost surely in distribution, to the so-called MANOVA limiting spectral distribution,† with density that is given by
[8] |
where . The limiting MANOVA distribution is compactly supported on with
[9] |
The same holds for the MANOVA ensemble.
Note that the support of the MANOVA distribution is smaller than that of the corresponding Marčenko–Pastur law for the same aspect ratios. Fig. 1 shows these two densities for and . Nevertheless, as the MANOVA dimension ratio becomes small, its distribution tends to the Marčenko–Pastur distribution (Eq. 6) [i.e., as ]. Thus, a highly redundant random Fourier frame behaves like a Gaussian i.i.d. frame.
3. Unitary Haar Frame.
Let consist of the first columns of a Haar-distributed -by- unitary matrix normalized by (the Haar distribution being the uniform distribution over the group of -by- unitary matrices). Edelman and Sutton (4) proved that the empirical spectral distribution of also converges, almost surely in distribution, to the MANOVA limiting spectral distribution (refs. 1 and 3, closing remarks).
The maximal and minimal eigenvalues of a matrix from the MANOVA ensemble () are known to converge almost surely to and , respectively (20). Although we are not aware of any parallel results for the random Fourier and Haar frames, the empirical evidence in this paper shows that it must be the case.
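The limiting objects discussed in this section can be explored numerically. The sketch below is illustrative only and rests on assumptions: it writes `beta = k / m` and `gamma = m / n`, uses a standard form of the Wachter (MANOVA) density rescaled so that the eigenvalues average to one, omits the point mass that appears at `1 / gamma` when `beta * gamma + gamma > 1`, and samples the ensemble as `(n / m) * A A^H (A A^H + B B^H)^{-1}` with Gaussian `A` (`k`-by-`m`) and `B` (`k`-by-`(n - m)`). All function names are hypothetical.

```python
import numpy as np

def manova_edges(beta, gamma):
    """Support edges (r_minus, r_plus) of the limiting MANOVA density."""
    a = np.sqrt(beta * (1.0 - gamma))
    b = np.sqrt(1.0 - beta * gamma)
    return (a - b) ** 2, (a + b) ** 2

def manova_density(x, beta, gamma):
    """Continuous part of the limiting density; zero outside the support."""
    r_minus, r_plus = manova_edges(beta, gamma)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.zeros_like(x)
    inside = (x > r_minus) & (x < r_plus)
    xi = x[inside]
    out[inside] = np.sqrt((xi - r_minus) * (r_plus - xi)) / (
        2.0 * np.pi * beta * xi * (1.0 - gamma * xi))
    return out

def marchenko_pastur_density(x, beta):
    """Marchenko-Pastur density with parameter beta <= 1."""
    lo, hi = (1.0 - np.sqrt(beta)) ** 2, (1.0 + np.sqrt(beta)) ** 2
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.zeros_like(x)
    inside = (x > lo) & (x < hi)
    xi = x[inside]
    out[inside] = np.sqrt((xi - lo) * (hi - xi)) / (2.0 * np.pi * beta * xi)
    return out

def sample_manova_eigenvalues(n, m, k, rng, complex_valued=True):
    """Sorted eigenvalues of one draw from the (assumed) MANOVA ensemble."""
    def gauss(rows, cols):
        g = rng.standard_normal((rows, cols))
        if complex_valued:
            g = (g + 1j * rng.standard_normal((rows, cols))) / np.sqrt(2.0)
        return g
    A, B = gauss(k, m), gauss(k, n - m)
    SA, SB = A @ A.conj().T, B @ B.conj().T
    # (SA + SB)^{-1} SA has the same eigenvalues as SA (SA + SB)^{-1}.
    M = np.linalg.solve(SA + SB, SA)
    return np.sort((n / m) * np.linalg.eigvals(M).real)

rng = np.random.default_rng(1)
n, m, k = 400, 200, 100
beta, gamma = k / m, m / n
eigs = sample_manova_eigenvalues(n, m, k, rng)
r_minus, r_plus = manova_edges(beta, gamma)
```

For small `gamma` the edges reduce to those of the Marčenko–Pastur law and the two densities nearly coincide, in line with the remark above about highly redundant frames. Comparing `eigs.min()` and `eigs.max()` with `r_minus` and `r_plus` gives a quick finite-size look at the edge convergence.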
These random matrix phenomena have practical significance for evaluations of functions of the form , such as those mentioned above. The functions and , for example, are called linear spectral statistics in ref. 21, namely functions of that may be written as an integral of a scalar function against the empirical measure of . Convergence of the empirical distribution of to the limiting MANOVA distribution with density implies
[10] |
for both the random Fourier and Haar frames; the integrals on the right-hand side may be evaluated explicitly. Similarly, convergence of and to and implies, for example, that
[11] |
To show why such calculations are significant, we note that Eqs. 10 and 11 immediately allow us to compare the Gaussian i.i.d. frame with the random Fourier and Haar frames in terms of their limiting value of functions of interest. Fig. 2 compares the limiting value of , , and over varying values of . The plots clearly show that frames with a typical submatrix that exhibits a MANOVA spectrum are superior to frames with a typical submatrix that exhibits a Marčenko–Pastur spectrum across the performance measures.
Deterministic Frames: Universality Hypothesis
Deterministic frames, namely frames with design that involves no randomness, have so far eluded this kind of asymptotically exact analysis. Although there are results regarding RIP (22, 23) and statistical RIP (10, 24, 25), for example, of deterministic frame designs, they are mostly focused on highly redundant frames () and the wide submatrix () case, where the spectrum tends to the Marčenko–Pastur distribution. Furthermore, nothing analogous, say, to the precise comparisons of Fig. 2 exists in the literature to the best of our knowledge. Specifically, no results analogous to Eqs. 10 and 11 are known for deterministic frames, let alone the associated convergence rates, if any.
To subject deterministic frames to an asymptotic analysis, we shift our focus from a single frame to a family of deterministic frames created by a common construction. The frame matrix is -by-. Each frame family determines allowable subsequences ; to simplify notation, we leave the subsequence implicit and index the frame sequence simply by . The frame family also determines the aspect ratio limit . In what follows, we also fix a sequence with and let denote a uniformly distributed random subset.
Frames Under Study.
The different frames that we studied are listed in Table 1 in a manner inspired by ref. 26. In addition to our deterministic frames of interest (the set ), Table 1 also contains two examples of random frames (real and complex variants for each) for validation and convergence analysis purposes.
Functionals Under Study.
We studied the functionals from Eq. 3, from Eq. 4, and from Eq. 5. In addition, we studied the maximal and minimal eigenvalues of and its condition number:
Measuring the Rate of Convergence.
To quantify the rate of convergence of the entire spectrum of the -by- matrix , which is a submatrix of an -by- frame matrix , to a limiting distribution, we let denote the empirical cumulative distribution function (CDF) of and denote the CDF of the MANOVA limiting distribution. The quantity
where is the Kolmogorov–Smirnov (KS) distance between CDFs, measures the distance to the hypothesized limit. Here, and are the actual aspect ratios for the matrix at hand. As a baseline, we use , where is a matrix from the MANOVA ensemble, with if is real and if is complex. Fig. 3 illustrates the KS distance between an empirical CDF and the limiting MANOVA CDF.
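The KS distance between an empirical spectrum and a hypothesized limiting CDF can be computed as in the following minimal sketch. The function name and the `cdf` callable are hypothetical; in the setting above, `cdf` would be the limiting MANOVA CDF (for instance, obtained by numerically integrating the limiting density), while the usage line below uses the uniform CDF purely for illustration.

```python
import numpy as np

def ks_distance(samples, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF of `samples`
    and the continuous CDF given by the callable `cdf`."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    F = cdf(x)
    # The empirical CDF jumps at each sample, so check both sides of each jump.
    upper = np.arange(1, n + 1) / n - F
    lower = F - np.arange(0, n) / n
    return max(upper.max(), lower.max())

# Illustration with the uniform CDF on [0, 1].
d = ks_distance([0.25, 0.5, 0.75], lambda t: np.clip(t, 0.0, 1.0))
```

Checking both sides of each jump matters because the supremum of the difference between a step function and a continuous CDF is attained at a sample point, either just before or just after the step.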
Similarly, to quantify the rate of convergence of a functional , the quantity
is the distance between the measured value of on a given submatrix and its hypothesized limiting value. For a baseline, we can use , with if is real and if is complex. For linear spectral functionals, like and , which may be written as for some kernel , we have . For that depends on and , we have .
Universality Hypothesis.
The contributions of this paper are based on the following assertions on the typical -submatrix ensemble corresponding to a frame family . This family may be random or deterministic, real or complex.
H1. Existence of a Limiting Spectral Distribution.
The empirical spectral distribution of , namely the distribution of , converges, as , to a compactly supported limiting distribution; furthermore, and converge to the edges of that compact support.
H2. Universality of the Limiting Spectral Distribution.
The limiting spectral distribution of is the MANOVA distribution (1) with density that is shown in Eq. 8. Also, and , where is given by Eq. 9.
H3. Exact Power Law Rate of Convergence for the Entire Spectrum.
The spectrum of converges to the limiting MANOVA distribution
and in fact, its fluctuations are given by the law
[12] |
for some constants , which may depend on the frame family.
H4. Universality of the Rate of Convergence for the Entire Spectrum of ETFs.
For an ETF family, the exponent in Eq. 12 is universal and does not depend on the frame. Furthermore, Eq. 12 also holds, with the same universal exponent, replacing with a same-sized matrix from the MANOVA distribution defined in Eq. 7 with if is a real frame family and if is a complex frame family. In other words, the universal exponent for ETFs is a property of the MANOVA (Jacobi) random matrix ensemble.
H5. Exact Power Law Rate of Convergence for Functionals.
For a “nice” functional , the value of converges to according to the law
[13] |
for some constants .
H6. Universality of the Rate of Convergence for Functionals.
Although the constant in Eq. 13 may depend on the frame, the exponents are universal. Eq. 13 also holds, with the same universal exponents, replacing with a same-sized matrix from the MANOVA ensemble defined in Eq. 7, with if is a real frame family and if is a complex frame family. In other words, the universal exponents are a property of the MANOVA (Jacobi) random matrix ensemble.
Nonstandard Aspect Ratio .
Although the classical MANOVA ensemble and limiting density are not defined for , in our case, it is certainly possible to sample vectors from the possible frame vectors, resulting in a situation with . In this situation, the hypotheses above require slight modifications. Specifically, the limiting spectral distribution of for is
[14] |
where is the function (no longer a density) defined in Eq. 8. The rate of convergence of the distribution of nonzero eigenvalues to the limiting density is compared with the baseline , where is a matrix from the MANOVA ensemble (i.e., with reversed order of and ).
Methods
The software that we developed has been permanently deposited in the data and code supplement (https://purl.stanford.edu/qg138qm8653). Because many of the deterministic frames under study are only defined for , we primarily studied the aspect ratios with . In addition, we inspected all frames under study that are defined for the aspect ratios and (all random frames as well as DSS and Alltop). We also studied nonstandard aspect ratios as described in SI Appendix (https://purl.stanford.edu/qg138qm8653). For deterministic frames, took allowed values in the ranges , for Grassmannian and SH frames, and for the DSS frame with . For random frames and the MANOVA ensemble, we used a dense grid of values in the range . Hypothesis testing as discussed below was based on a subset of these values, where . For each of the frame families under study and each value of and under study, we selected a sequence . The values and were selected so that would be as close as possible to ; however, because of the different aspect ratio constraints imposed by the different frames, occasionally, we had close but not equal to . We then determined , such that would be as close as possible to . For each , we generated a single -by- frame matrix . We then produced independent samples from the uniform distribution on subsets, , and generated their corresponding submatrices (). Importantly, all of these are submatrices of the same frame matrix . We calculated , the empirical variance of , and , the average value of on as a Monte Carlo approximation to the left-hand side of Eq. 12, variance and MSE (mean square error), respectively. For each of the functionals under study, we also calculated , the average value of on , as a Monte Carlo approximation to the left-hand side of Eq. 13.
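The Monte Carlo procedure described above can be sketched as follows: generate a single frame matrix, draw many independent random subsets from it, and average a functional over the resulting submatrices. This is a minimal Python sketch rather than the deposited Matlab code. The names `real_haar_frame`, `monte_carlo_functional`, and `num_draws`, the symbols `n`, `m`, `k`, and the use of a real Haar frame with a largest-eigenvalue functional are assumptions chosen for illustration.

```python
import numpy as np

def real_haar_frame(n, m, rng):
    """First m columns of a Haar-distributed n-by-n orthogonal matrix,
    scaled by sqrt(n / m) so that rows have unit norm on average."""
    Q, R = np.linalg.qr(rng.standard_normal((n, n)))
    Q = Q * np.sign(np.diag(R))        # sign fix so that Q is Haar distributed
    return Q[:, :m] * np.sqrt(n / m)

def monte_carlo_functional(X, k, psi, num_draws, rng):
    """Mean and sample variance of psi over random k-row submatrices of X."""
    n = X.shape[0]
    values = np.empty(num_draws)
    for t in range(num_draws):
        K = rng.choice(n, size=k, replace=False)     # fresh uniform subset
        XK = X[K, :]
        lam = np.linalg.eigvalsh(XK @ XK.conj().T)   # spectrum of the Gram matrix
        values[t] = psi(lam)
    return values.mean(), values.var(ddof=1)

rng = np.random.default_rng(2)
n, m, k = 200, 100, 50
X = real_haar_frame(n, m, rng)                       # one frame, reused for all draws
mean_max, var_max = monte_carlo_functional(X, k, lambda lam: lam.max(), 50, rng)
```

As in the procedure above, every submatrix comes from the same frame matrix, so the only randomness in the average is the choice of subset.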
Separately, for each triplet (, , ) and , we have performed independent draws from the MANOVA ensembles [7] and calculated analogous quantities , , and .
Test 1: Testing H1–H4.
For each of the frames under study and each value of , we computed the KS distance for submatrices and performed simple linear regression of on with an intercept. We obtained the estimated linear coefficient as an estimate for the exponent and its SE . Similarly, we regressed on to obtain and . We performed Student’s t test to test the null hypotheses using the test statistic
Under the null hypothesis, the test statistic is distributed , where and are the numbers of different values of for which we have collected the data for a frame and the MANOVA ensemble, respectively. We report the of the linear fit, the slope coefficient and its SE, and the P value of the above t test. We next regressed on . Because , a linear fit verifies that .
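The slope comparison in this test can be sketched as below, assuming the standard statistic for two independently estimated regression slopes, the slope difference divided by the square root of the sum of the squared SEs. The function names are hypothetical. The numerical example plugs in the DSS and MANOVA rows of Table 2.

```python
import numpy as np

def fit_slope(x, y):
    """Least-squares slope of y on x (with intercept) and its standard error."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    slope = (xc @ (y - y.mean())) / (xc @ xc)
    intercept = y.mean() - slope * x.mean()
    resid = y - (intercept + slope * x)
    dof = x.size - 2                       # two fitted parameters
    se = np.sqrt((resid @ resid) / dof / (xc @ xc))
    return slope, se

def slope_t_statistic(b1, se1, b2, se2):
    """t statistic for the null hypothesis that two slopes are equal
    (assumed form: independent fits, SEs combined in quadrature)."""
    return (b1 - b2) / np.sqrt(se1 ** 2 + se2 ** 2)

# Slope and SE for DSS and for the complex MANOVA baseline, taken from Table 2.
t_dss = slope_t_statistic(0.93652, 0.00911, 0.92505, 0.00690)
```

With these inputs the statistic is close to one, which is in line with the non-significant P value reported for DSS in Table 2. Converting the statistic to a P value additionally requires the t distribution with the degrees of freedom stated above.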
Test 2: Testing H5–H6.
For each of the frames under study, each of the functionals under study, and each value of , we computed the empirical value of the functionals on submatrices. We first performed linear regression of on and with an intercept for . Let denote the fitted coefficient for , and let denote the fitted coefficient for . This step was based on triplets , yielding accurate aspect ratios in the range . We then performed simple linear regression of on . The estimated linear regression coefficient is the estimate for the exponent in Eq. 13, and is its SE. We used as an estimate for the exponent in Eq. 13. We proceeded as above to test the null hypothesis . We report the of the linear fit, the slope coefficient and its SE, and the P value of the test above.
Computing.
To allow the number of Monte Carlo samples to be as large as and to be as large as , we used a large Matlab cluster running on Amazon Web Services. We used machines with 32 logical cores and 240 GB RAM (random access memory) each, which ran for several hundred hours in total. The code that we executed has been deposited (https://purl.stanford.edu/qg138qm8653); it may easily be executed for smaller values of and on smaller machines.
Results
The raw results obtained in our experiments as well as the analysis results of each experiment have been deposited with their generating code (https://purl.stanford.edu/qg138qm8653).
For space considerations, the full documentation of our results is deferred to SI Appendix (https://purl.stanford.edu/qg138qm8653). To offer a few examples, Fig. 4 and Table 2 show the linear fit to for . Fig. 5 shows the linear fit to for a different value of , namely . Fig. 6 shows the linear fit to for . Fig. 7 and Table 3 show the linear fit to for . Similar figures and tables for the other values , in particular, , , , , , and , are deferred to SI Appendix. Note that, in all coefficient tables, both those shown here and those deferred to SI Appendix, the upper boxes show complex frames (with t test comparison with the complex MANOVA ensemble of the same size denoted “MANOVA”), and the bottom boxes show real frames (with t test comparison with the real MANOVA ensemble of the same size denoted “RealMANOVA”). In each box, the top rows are deterministic frames, and the bottom rows are random frames. Furthermore, note that, in plots for test 2, the horizontal axis is slightly different for real and complex frames, because the preliminary step described above was performed separately for real and complex frames. In the interest of space, we plot all frames over the horizontal axis calculated for complex frames.
Table 2.
Frame | R² | Slope | SE | P value |
MANOVA | 0.99828 | 0.92505 | 0.00690 | 1 |
DSS | 0.99858 | 0.93652 | 0.00911 | 0.32089 |
GF | 0.99921 | 0.92474 | 0.02608 | 0.99082 |
ComplexPF | 0.99950 | 0.92454 | 0.00535 | 0.95390 |
Alltop | 0.98906 | 0.49660 | 0.00883 | 9.4651e-47 |
SS | 0.98767 | 0.47354 | 0.00950 | 5.8136e-45 |
HAAR | 0.99736 | 0.94421 | 0.00873 | 0.09019 |
RandDFT | 0.99544 | 0.94127 | 0.01644 | 0.36788 |
RealMANOVA | 0.99873 | 0.95610 | 0.00613 | 1 |
RealPF | 0.99871 | 0.91244 | 0.00821 | 9.7174e-05 |
SH | 0.99989 | 0.46822 | 0.00492 | 6.3109e-35 |
RealHAAR | 0.99596 | 0.94456 | 0.01081 | 0.35675 |
RandDCT | 0.99773 | 0.93859 | 0.01156 | 0.18737 |
Table 3.
Frame | R² | Slope | SE | P value |
MANOVA | 0.98721 | 1.79936 | 0.03678 | 1 |
DSS | 0.99110 | 1.88674 | 0.04615 | 0.14551 |
GF | 0.99997 | 1.88548 | 0.01073 | 0.03161 |
ComplexPF | 0.99977 | 1.77783 | 0.00701 | 0.56808 |
Alltop | 0.93841 | 1.70618 | 0.07388 | 0.26297 |
SS | 0.95539 | 1.89501 | 0.07355 | 0.24922 |
HAAR | 0.97971 | 1.87082 | 0.04836 | 0.24400 |
RandDFT | 0.96928 | 1.77454 | 0.08157 | 0.78270 |
RealMANOVA | 0.99202 | 2.05451 | 0.03309 | 1 |
RealPF | 0.99834 | 2.00345 | 0.02045 | 0.19576 |
SH | 0.97850 | 1.81297 | 0.26874 | 0.37904 |
RealHAAR | 0.98287 | 2.09078 | 0.04958 | 0.54503 |
RandDCT | 0.98364 | 1.99663 | 0.06648 | 0.43977 |
Validation on Random Frames.
Although our primary interest was in deterministic frames, we also included random frames among the frames under study. For the complex Haar frame and random Fourier frame, convergence of the empirical CDF of the spectrum to the limiting MANOVA distribution has been proved in refs. 3 and 4. To our surprise, not only was our framework validated on the four random frames under study, in the sense of asymptotic empirical spectral distribution, but also, all universality hypotheses H1–H6 were accepted (not rejected at the 0.001 significance level, with very few exceptions).
Test Results on Deterministic Frames.
A tabular summary of our results, per hypothesis and per frame under study, is included for convenience in SI Appendix. Universality hypotheses H1–H3 were accepted on all deterministic frames. For H1 and H2, convergence of the empirical spectral distribution to the MANOVA limit has been observed in all cases. For H3, the linear fit in all cases was excellent with without exception, confirming the power law in Eq. 12 and the polynomial decrease of with . Universality hypothesis H4 was accepted (not rejected) for deterministic ETFs at the 0.001 significance level, with few exceptions (Table 2; full results and a summary table are in SI Appendix); it was rejected for deterministic non-ETFs. For , hypothesis H4 has also been accepted for the Alltop frame (SI Appendix). Universality hypothesis H5 was accepted for all deterministic frames, with excellent linear fits ( without exception), confirming the power law in Eq. 13. Universality hypothesis H6 was accepted (not rejected) at the 0.001 significance level (and even 0.05 with few exceptions) for all deterministic frames. For the reader’s convenience, Table 4 summarizes the universal exponents for convergence of the entire spectrum (H4) and the universal exponents for convergence of the functionals under study (H6) for . The framework developed in this paper readily allows tabulation of these universal exponents for any value of . We have observed that the universal exponents are slightly sensitive to the random seed. However, exact evaluation of this variability requires very significant computational resources and is beyond our scope. Similarly, some sensitivity of the P values to random seed has been observed.
Table 4.
Frame | |||||||||||||
MANOVA | 0.93 | 1.15 | 2.21 | 1.44 | 3.48 | 1.80 | 0.99 | 1.13 | 2.48 | 1.00 | 3.09 | 1.87 | −4.55 |
DSS | 0.94 | 1.14 | 2.18 | 1.40 | 3.40 | 1.89 | 1.04 | 1.10 | 2.41 | 1.00 | 3.11 | 1.87 | −4.56 |
GF | 0.92 | 1.17 | 2.23 | 1.53 | 3.70 | 1.89 | 1.03 | 1.13 | 2.48 | 1.04 | 3.22 | 1.95 | −4.76 |
ComplexPF | 0.92 | 1.13 | 2.17 | 1.44 | 3.49 | 1.78 | 0.98 | 1.10 | 2.41 | 1.00 | 3.12 | 1.87 | −4.56 |
Alltop | 0.50 | 1.14 | 2.18 | 1.46 | 3.53 | 1.71 | 0.94 | 1.11 | 2.42 | 1.01 | 3.13 | 1.86 | −4.54 |
SS | 0.47 | 1.11 | 2.13 | 1.50 | 3.63 | 1.90 | 1.04 | 1.08 | 2.36 | 0.98 | 3.06 | 1.83 | −4.47 |
HAAR | 0.94 | 1.10 | 2.11 | 1.52 | 3.69 | 1.87 | 1.03 | 1.09 | 2.37 | 1.01 | 3.13 | 1.88 | −4.59 |
RandDFT | 0.94 | 1.21 | 2.32 | 1.47 | 3.56 | 1.77 | 0.97 | 1.11 | 2.42 | 1.03 | 3.18 | 1.93 | −4.70 |
RealMANOVA | 0.96 | 0.87 | 3.58 | 1.26 | 5.21 | 1.27 | 5.26 | 0.90 | 3.73 | 0.87 | 3.58 | 0.77 | 3.17 |
RealPF | 0.91 | 0.92 | 3.82 | 1.32 | 5.46 | 1.24 | 5.12 | 0.94 | 3.88 | 0.94 | 3.88 | 0.81 | 3.36 |
SH | 0.47 | 0.93 | 3.82 | 1.34 | 5.53 | 1.14 | 4.71 | 0.93 | 3.82 | 0.93 | 3.82 | 0.85 | 3.51 |
RealHAAR | 0.94 | 0.86 | 3.54 | 1.23 | 5.07 | 1.29 | 5.35 | 0.89 | 3.68 | 0.90 | 3.73 | 0.79 | 3.28 |
RandDCT | 0.94 | 0.99 | 4.08 | 1.30 | 5.38 | 1.24 | 5.10 | 0.94 | 3.89 | 0.95 | 3.93 | 0.82 | 3.40 |
Reproducibility Advisory.
All of the figures and tables in this paper, including those in SI Appendix, are fully reproducible from our raw results and code deposited in the data and code supplement (https://purl.stanford.edu/qg138qm8653).
Discussion
The Hypotheses.
Our universality hypotheses may be surprising in several aspects. First, the frames examined were designed to minimize frame bounds and worst case pairwise correlations. Still, it seems that they perform well when the performance criterion is based on the spectrum of the typical selection of frame vectors. Second, under the universality hypotheses, all of these deterministic frames perform exactly as well as random frame designs, such as the random Fourier frame. Inasmuch as frames are continuous codes, we find deterministic codes matching the performance of random codes. Third, the hypotheses suggest an extremely broad universality property: many different ensembles of random matrices asymptotically exhibit the limiting MANOVA spectrum.
All of the deterministic frames under study satisfy the universality hypotheses (with hypothesis H4 satisfied only for ETFs). This finding should not give the impression that any deterministic frame satisfies these hypotheses. First, the empirical measures of an arbitrary sequence of frames rarely converge (thus violating hypothesis H1). Second, even if they converge, an overly simplistic frame design often leads to concentration of the lower edge of the empirical spectrum near zero, resulting in a non-MANOVA spectrum and poor performance. For example, if the frame is sparse, say, consisting of some columns of the -by- identity matrix, then a fraction of the singular values of a typical submatrix is exactly zero.
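The sparse-frame example can be checked numerically. The following is a minimal sketch, not the authors' code: the sizes `n`, `m`, `k`, the seed, and the helper name `sparse_frame_submatrix` are illustrative assumptions. It builds a frame from randomly chosen columns of the identity matrix, draws a random subset of its rows, and counts the singular values that vanish.

```python
import numpy as np

def sparse_frame_submatrix(n, m, k, seed=0):
    """Hypothetical helper: random k-row submatrix of a frame made of identity columns."""
    rng = np.random.default_rng(seed)
    cols = rng.choice(n, size=m, replace=False)   # identity columns kept in the frame
    frame = np.eye(n)[:, cols]                    # n-by-m frame; rows are frame vectors
    rows = rng.choice(n, size=k, replace=False)   # random subset of frame vectors
    # A selected row is nonzero only if its index is also one of the kept columns.
    overlap = np.intersect1d(rows, cols).size
    return frame[rows, :], overlap

sub, overlap = sparse_frame_submatrix(n=200, m=100, k=50)
s = np.linalg.svd(sub, compute_uv=False)          # min(k, m) singular values
zero_fraction = np.mean(s < 1e-9)
print(f"nonzero singular values: {overlap}, zero fraction: {zero_fraction:.2f}")
```

Each selected row is either a standard basis vector or identically zero, so every singular value is 0 or 1, and the zero fraction stays bounded away from zero as the sizes grow proportionally.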
The frames under study are all ETFs or near-ETFs, all with favorable frame properties. To make this point, we have included in SI Appendix (https://purl.stanford.edu/qg138qm8653) a study of a low-pass frame, in which the Fourier frequencies included in the frame are the lowest ones. This construction is in contrast with the clever choice of frequencies leading to the DSS frame. Indeed, the low-pass frame does not have appealing frame properties. The results in SI Appendix, together with the results on the closely related random Vandermonde ensemble (27), indicate that such frames do not satisfy any of the universality hypotheses H2–H6.
We note that convergence rates of the form Eqs. 12 and 13 are known for other classical random matrix ensembles (28–31).
We further note that hypotheses H1–H4 do not imply hypotheses H5 and H6. Even if the empirical CDF converges in the KS metric to the limiting MANOVA distribution, functionals that are not continuous in the KS metric do not necessarily converge, and moreover, no uniform rate of convergence is a priori implied.
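To illustrate why KS convergence alone does not control every functional, here is a minimal sketch with synthetic data (the stand-in spectrum, the outlier value, and the helper `ks_distance` are assumptions made for illustration, not quantities from the paper). Moving a single eigenvalue out of `k` changes the empirical CDF by only `1/k` in the KS metric, yet it changes the maximal eigenvalue arbitrarily.

```python
import numpy as np

def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov distance between the empirical CDFs of a and b."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])                 # the supremum is attained at a sample point
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return np.max(np.abs(cdf_a - cdf_b))

k = 1000
base = np.linspace(0.2, 1.8, k)                   # stand-in spectrum on a compact interval
perturbed = base.copy()
perturbed[-1] = 100.0                             # move one eigenvalue far outside the bulk

ks = ks_distance(base, perturbed)
print(ks)                                         # equals 1/k up to rounding
print(base.max(), perturbed.max())                # the edge functional differs greatly
```

The maximal eigenvalue is therefore not continuous in the KS metric, which is consistent with the edge behavior being stated as a separate hypothesis rather than derived from bulk convergence.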
Our Contributions.
This paper presents a simple method for approximate computation (with a known, small approximation error) of spectral functionals of the -submatrix ensemble for a variety of random and deterministic frames using Eq. 1. Our results make it possible to tabulate these approximate values, creating a useful resource for scientists. As an example, we include Table 5, a lookup table for the value of the functional on the DSS deterministic frame family, listing by values of and the asymptotic (approximate) value calculated analytically from the limiting distribution, and the standard approximation error.
Table 5.
1,031 | 1,151 | 1,291 | 1,451 | 1,571 | 1,811 | 1,951 | |
3 0.0281 | 3 0.0253 | 3 0.0227 | 3 0.0204 | 3 0.0189 | 3 0.0166 | 3 0.0155 | |
1.75 0.0073 | 1.75 0.0065 | 1.75 0.0058 | 1.75 0.0051 | 1.75 0.0048 | 1.75 0.0041 | 1.75 0.0038 |
To this end, we developed a systematic empirical framework, which allows validation of Eq. 1 and discovery of the exponents there. Our work is fully reproducible, and our framework is available (along with the rest of our results and code) in the code and data supplement (https://purl.stanford.edu/qg138qm8653). In addition, our results provide overwhelming empirical evidence for a number of phenomena.
i) The typical -submatrix ensemble of deterministic frames is an object of interest. Although there is absolutely no randomness involved in the submatrix of a deterministic frame (other than the choice of subset ), the typical submatrix seems to be an ensemble in its own right, with properties so far attributed only to random matrix ensembles, including a universal, compactly supported limiting spectral distribution and convergence of the maximal (minimal) singular value to the upper (lower) edges of the limiting distribution.
ii) MANOVA as a universal limiting spectral distribution. Wachter’s MANOVA distribution is the limiting spectral distribution of as and for the typical -submatrix ensemble of deterministic frames (including difference set, Grassmannian, real Paley, complex Paley, quadratic chirp, SS, and SH). The same is true for real random frames: random cosine transform and random Haar.
iii) Convergence of the edge spectrum. For all of the deterministic frames above as well as the random frames (random cosine, random Fourier, complex Haar, and real Haar), the maximal and minimal eigenvalues of the -typical submatrix ensemble converge to the support edges of the MANOVA limiting distribution. The convergence follows a universal power law rate.
iv) A definite power law rate of convergence for the entire spectrum of the MANOVA ensemble to its MANOVA limit, with different exponents in the real and complex cases.
v) Universality of the power law exponents for the entire spectrum. The complex deterministic ETFs (difference set, Grassmannian, and complex Paley) share the power law exponents with the MANOVA ensemble. The same is true for the complex random frames (random Fourier and complex Haar). The complex tight nonequiangular Alltop frame, which can be constructed for various aspect ratios, also shares the power law exponents with the MANOVA ensemble for . The real deterministic ETF (real Paley) shares the exponent with the MANOVA. The same is true for real random frames (random cosine and real Haar). All non-ETFs under study, with , share different power law exponents (slower convergence).
vi) A definite power law rate of convergence for functionals, including , , and .
vii) Universality of the power law exponents for functionals. For practically all frames under study, both random and deterministic, the power law exponents for functionals agree with those of the MANOVA (real frames) and MANOVA (complex frames).
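The power law rates described above can be estimated by a linear regression in log-log scale. The sketch below applies this to the numbers printed in Table 5, under two stated assumptions: the first row of the table lists the frame size, and the second number in each cell is the standard approximation error. The model `err ≈ C * n**(-b)` and the helper name `fit_power_law` are also assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

# Values copied from Table 5; their interpretation is an assumption (see lead-in).
n = np.array([1031, 1151, 1291, 1451, 1571, 1811, 1951], dtype=float)
err_row1 = np.array([0.0281, 0.0253, 0.0227, 0.0204, 0.0189, 0.0166, 0.0155])
err_row2 = np.array([0.0073, 0.0065, 0.0058, 0.0051, 0.0048, 0.0041, 0.0038])

def fit_power_law(n, err):
    """Fit log(err) = log(C) - b * log(n) by least squares; return (b, C)."""
    slope, intercept = np.polyfit(np.log(n), np.log(err), deg=1)
    return -slope, np.exp(intercept)

b1, c1 = fit_power_law(n, err_row1)
b2, c2 = fit_power_law(n, err_row2)
print(f"row 1: exponent {b1:.2f}, constant {c1:.2f}")
print(f"row 2: exponent {b2:.2f}, constant {c2:.2f}")
```

The fitted slope plays the role of the power law exponent, and the fitted constant corresponds to the intercept (vertical shift) of the regression line.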
Intercepts.
Our results showed a surprising categorization of the deterministic and random frames under study according to the constant in Eq. 12 or equivalently, according to the intercept (vertical shift) in the linear regression on . Figs. 4 and 5 clearly show that the regression lines, while having identical slopes (as predicted by hypothesis H3), are grouped according to their intercepts into the following seven categories: complex MANOVA ensemble and complex Haar (MANOVA and HAAR); real MANOVA ensemble and real Haar (RealMANOVA and RealHAAR); complex ETFs (DSS, GF, and ComplexPF); non-ETFs (SS, SH, and Alltop); real ETF (RealPF); complex random Fourier (RandDFT); and real random Fourier (RandDCT).
Interestingly, the intercepts of all complex frames are larger (meaning that the linear coefficient in Eq. 12 is smaller) than those of all real frames. Also, the less randomness there is in the frame, the higher the intercept: intercepts of deterministic ETFs are higher than those of random Fourier and random cosine, which are, in turn, higher than those of Haar frames and the MANOVA ensembles.
Related Work.
Farrell (3) has conjectured that the phenomenon of convergence of the spectrum of typical submatrices to the limiting MANOVA distribution is indeed much broader and extends beyond the partial Fourier frame that he considered. A related empirical study was conducted by Monajemi et al. (26). In it, the authors considered the so-called sparsity-undersampling phase transition in compressed sensing. This asymptotic quantity poses a performance criterion for frames that interacts with the typical submatrix in a manner possibly more complicated than the spectrum . The authors investigated various deterministic frames, most of which are studied in this paper, and presented empirical evidence that the phase transition for each of these deterministic frames is identical to the phase transition of Gaussian frames. Gurevich and Hadani (24) proposed a certain deterministic frame construction and effectively proved that the empirical spectral distribution of its typical submatrix converges to a semicircle, assuming , a scaling relation different from the one considered here. The work in refs. 32 and 33 also considered deterministic frame designs, chirp sensing codes, and binary linear codes, with a random sampling. In their design, the aspect ratios are large (e.g., in ref. 32, and ), and therefore, the spectrum converges to the Marčenko–Pastur distribution. Tropp (34) provided bounds for and when is a general dictionary. Collins (35) has shown that the spectrum of a matrix model deriving from random projections has the same eigenvalue distribution as the MANOVA ensemble in finite . Wachter (1) used a connection between the MANOVA ensemble and submatrices of Haar matrices to derive the asymptotic spectral distribution MANOVA.
Conclusions
We have observed a surprising universality property for the -submatrix ensemble corresponding to various well-known deterministic frames as well as well-known random frames. The MANOVA ensemble and the MANOVA limiting distribution emerge as key objects in the study of frames, both random and deterministic, in the context of sparse signals and erasure channels. We hope that our findings will invite rigorous mathematical study of these fascinating phenomena.
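The universality observation can be probed at small scale with a short simulation. The sketch below is not the authors' framework: the sizes `n`, `m`, `k`, the seed, and the helper names are illustrative assumptions. It compares the spectrum of a random submatrix of a random Fourier frame with that of a complex Haar frame, using the two-sample KS distance between the two empirical spectra instead of the distance to the analytic MANOVA limit.

```python
import numpy as np

def haar_unitary(n, rng):
    """Haar-distributed unitary via QR of a complex Gaussian matrix with phase correction."""
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    d = np.diag(r)
    return q * (d / np.abs(d))                    # rescale each column by a unit phase

def subframe_spectrum(unitary, m, k, rng):
    """Eigenvalues of the Gram matrix of k random rows of an n-by-m frame."""
    n = unitary.shape[0]
    cols = rng.choice(n, size=m, replace=False)   # frame: m columns of the unitary
    rows = rng.choice(n, size=k, replace=False)   # random subset of k frame vectors
    x = unitary[np.ix_(rows, cols)] * np.sqrt(n / m)
    return np.linalg.eigvalsh(x @ x.conj().T)

def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov distance between empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return np.max(np.abs(cdf_a - cdf_b))

rng = np.random.default_rng(0)
n, m, k = 512, 256, 128
dft = np.fft.fft(np.eye(n)) / np.sqrt(n)          # unitary DFT matrix
spec_dft = subframe_spectrum(dft, m, k, rng)
spec_haar = subframe_spectrum(haar_unitary(n, rng), m, k, rng)
print("KS distance between the two spectra:", ks_distance(spec_dft, spec_haar))
```

Repeating the experiment for growing `n` at fixed ratios `m/n` and `k/m`, and averaging over many draws, is one way to watch the two spectra approach each other; a single draw at this size gives only a rough indication.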
In any frame where our universality hypotheses hold (including all of the frames under study here), Fig. 2 correctly describes the limiting values of , , and and shows that codes based on deterministic frames (involving no randomness and allowing fast implementations) are better across performance measures than i.i.d. random codes.
The empirical framework that we proposed in this paper may be easily applied to new frame families and new functionals , extending our results further and mapping the frontiers of the universality property. In any frame family and for any functional where our universality hypotheses hold, we have proposed a simple, effective method for calculating quantities of the form to known approximation, which improves polynomially with .
Acknowledgments
We thank the anonymous referees for their helpful comments. This work was partially supported by Israeli Science Foundation Grant 1523/16.
Footnotes
Conflict of interest statement: Matan Gavish is a former student of David L. Donoho, and they have published together, most recently in 2014.
This article is a PNAS Direct Submission.
Data deposition: The code and data supplement is available online at https://purl.stanford.edu/qg138qm8653.
*Also known as the beta-Jacobi ensemble with beta = 1 (orthogonal) for and beta = 2 (unitary) for .
†The literature uses the term MANOVA to refer to both the random matrix ensemble, which we denote here by MANOVA, and the limiting spectral distribution, which we denote here by MANOVA.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1700203114/-/DCSupplemental.
References
- 1. Wachter KW. The limiting empirical measure of multiple discriminant ratios. Ann Stat. 1980;8:937–957.
- 2. Forrester PJ. Log-Gases and Random Matrices (LMS-34). Princeton Univ Press; Princeton: 2010.
- 3. Farrell B. Limiting empirical singular value distribution of restrictions of discrete Fourier transform matrices. J Fourier Anal Appl. 2011;17:733–753.
- 4. Edelman A, Sutton BD. The beta-Jacobi matrix model, the CS decomposition, and generalized singular value problems. Found Comput Math. 2008;8:259–285.
- 5. Donoho DL, Elad M, Temlyakov VN. Stable recovery of sparse overcomplete representations in the presence of noise. IEEE Trans Inf Theory. 2006;52:6–18.
- 6. Elad M. Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer; New York: 2010.
- 7. Candes EJ. The restricted isometry property and its implications for compressed sensing. Comptes Rendus Math. 2008;346:589–592.
- 8. Candes EJ, Tao T. Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Trans Inf Theory. 2006;52:5406–5425.
- 9. Foucart S, Lai MJ. Sparsest solutions of underdetermined linear systems via ℓq-minimization for 0 < q ≤ 1. Appl Comput Harmon Anal. 2009;26:395–407.
- 10. Calderbank R, Howard S, Jafarpour S. Construction of a large class of deterministic sensing matrices that satisfy a statistical isometry property. IEEE J Sel Top Signal Process. 2010;4:358–374.
- 11. Haikin M, Zamir R. Proceedings of the IEEE International Symposium on Information Theory Proceedings (ISIT). IEEE; New York: 2016. Analog coding of a source with erasures; pp. 2074–2078.
- 12. Tulino AM, Verdú S. Random Matrix Theory and Wireless Communications. Vol 1. Now Publishers Inc.; Delft, The Netherlands: 2004.
- 13. Haviv I, Regev O. Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM; Philadelphia: 2016. The restricted isometry property of subsampled Fourier matrices; pp. 288–297.
- 14. Rudelson M, Vershynin R. On sparse reconstruction from Fourier and Gaussian measurements. Commun Pure Appl Math. 2008;61:1025–1045.
- 15. Nelson J, Price E, Wootters M. New constructions of RIP matrices with fast multiplication and fewer rows. In: Chekuri C, editor. Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms. SIAM; Philadelphia: 2014. pp. 1515–1528.
- 16. Pfander GE, Rauhut H, Tropp JA. The restricted isometry property for time–frequency structured random matrices. Probab Theory Relat Fields. 2013;156:707–737.
- 17. Cheraghchi M, Guruswami V, Velingker A. Restricted isometry of Fourier matrices and list decodability of random linear codes. SIAM J Comput. 2013;42:1888–1914.
- 18. Marčenko VA, Pastur LA. Distribution of eigenvalues for some sets of random matrices. Math USSR-Sbornik. 1967;1:457–483.
- 19. Bai Z, Silverstein JW. Spectral Analysis of Large Dimensional Random Matrices. Vol 20. Springer; New York: 2010.
- 20. Johnstone IM. Multivariate analysis and Jacobi ensembles: Largest eigenvalue, Tracy–Widom limits and rates of convergence. Ann Stat. 2008;36:2638. doi: 10.1214/08-AOS605.
- 21. Yao J, Bai Z, Zheng S. Large Sample Covariance Matrices and High-Dimensional Data Analysis (No. 39). Cambridge Univ Press; Cambridge, UK: 2015.
- 22. Bandeira AS, Fickus M, Mixon DG, Wong P. The road to deterministic matrices with the restricted isometry property. J Fourier Anal Appl. 2013;19:1123–1149.
- 23. Fickus M, Jasper J, Mixon DG, Peterson J. Group-theoretic constructions of erasure-robust frames. Linear Algebra Appl. 2015;479:131–154.
- 24. Gurevich S, Hadani R. 2008. The statistical restricted isometry property and the Wigner semicircle distribution of incoherent dictionaries. arXiv:0812.2602.
- 25. Mazumdar A, Barg A. Proceedings of the IEEE International Symposium on Information Theory Proceedings (ISIT). IEEE; New York: 2011. General constructions of deterministic (S)RIP matrices for compressive sampling; pp. 678–682.
- 26. Monajemi H, Jafarpour S, Gavish M, Donoho DL. Deterministic matrices matching the compressed sensing phase transitions of Gaussian random matrices. Proc Natl Acad Sci USA. 2013;110:1181–1186. doi: 10.1073/pnas.1219540110.
- 27. Debbah M. 2008. Asymptotic behaviour of random Vandermonde matrices with entries on the unit circle. arXiv:0802.3570.
- 28. Götze F, Tikhomirov A. 2011. On the rate of convergence to the Marchenko–Pastur distribution. arXiv:1110.1284.
- 29. Götze F, Tikhomirov A. Optimal bounds for convergence of expected spectral distributions to the semi-circular law. Probab Theory Relat Fields. 2016;165:163–233.
- 30. Chatterjee S, Bose A. A new method for bounding rates of convergence of empirical spectral distributions. J Theor Probab. 2004;17:1003–1019.
- 31. Meckes ES, Meckes MW. 2016. Rates of convergence for empirical spectral measures: A soft approach. arXiv:1601.03720.
- 32. Applebaum L, Howard SD, Searle S, Calderbank R. Chirp sensing codes: Deterministic compressed sensing measurements for fast recovery. Appl Comput Harmon Anal. 2009;26:283–290.
- 33. Babadi B, Tarokh V. Spectral distribution of random matrices from binary linear block codes. IEEE Trans Inf Theory. 2011;57:3955–3962.
- 34. Tropp JA. On the conditioning of random subdictionaries. Appl Comput Harmon Anal. 2008;25:1–24.
- 35. Collins B. Product of random projections, Jacobi ensembles and universality problems arising from free probability. Probab Theory Relat Fields. 2005;133:315–344.
- 36. Xia P, Zhou S, Giannakis GB. Achieving the Welch bound with difference sets. IEEE Trans Inf Theory. 2005;51:1900–1907.
- 37. Strohmer T, Heath RW. Grassmannian frames with applications to coding and communication. Appl Comput Harmon Anal. 2003;14:257–275.
- 38. Paley RE. On orthogonal matrices. J Math Phys. 1933;12:311–320.