Cusp Universality for Random Matrices I: Local Law and the Complex Hermitian Case

László Erdős; Torben Krüger; Dominik Schröder

doi:10.1007/s00220-019-03657-4

2020 Apr 28;378(2):1203–1278.

PMCID: PMC7426322 PMID: 32831359

Abstract

For complex Wigner-type matrices, i.e. Hermitian random matrices with independent, not necessarily identically distributed entries above the diagonal, we show that at any cusp singularity of the limiting eigenvalue distribution the local eigenvalue statistics are universal and form a Pearcey process. Since the density of states typically exhibits only square root or cubic root cusp singularities, our work complements previous results on the bulk and edge universality and it thus completes the resolution of the Wigner–Dyson–Mehta universality conjecture for the last remaining universality type in the complex Hermitian class. Our analysis holds not only for exact cusps, but approximate cusps as well, where an extended Pearcey process emerges. As a main technical ingredient we prove an optimal local law at the cusp for both symmetry classes. This result is also the key input in the companion paper (Cipolloni et al. in Pure Appl Anal, 2018. arXiv:1811.04055) where the cusp universality for real symmetric Wigner-type matrices is proven. The novel cusp fluctuation mechanism is also essential for the recent results on the spectral radius of non-Hermitian random matrices (Alt et al. in Spectral radius of random matrices with independent entries, 2019. arXiv:1907.13631), and the non-Hermitian edge universality (Cipolloni et al. in Edge universality for non-Hermitian random matrices, 2019. arXiv:1908.00969).

Introduction

The celebrated Wigner–Dyson–Mehta (WDM) conjecture asserts that local eigenvalue statistics of large random matrices are universal: they only depend on the symmetry type of the matrix and are otherwise independent of the details of the distribution of the matrix ensemble. This remarkable spectral robustness was first observed by Wigner in the bulk of the spectrum. The correlation functions are determinantal and they were computed in terms of the sine kernel via explicit Gaussian calculations by Dyson, Gaudin and Mehta [59]. Wigner’s vision continues to hold at the spectral edges, where the correct statistics were identified by Tracy and Widom for both symmetry types in terms of the Airy kernel [70, 71]. These universality results were originally formulated and proven [17, 35, 36, 67–69] for traditional Wigner matrices, i.e. Hermitian random matrices with independent, identically distributed (i.i.d.) entries and their diagonal [55, 57] and non-diagonal [51] deformations. More recently they have been extended to Wigner-type ensembles, where the identical distribution is not required, and even to a large class of matrices with general correlated entries [7, 8, 11]. In different directions of generalization, sparse matrices [1, 32, 47, 56], adjacency matrices of regular graphs [14] and band matrices [19, 20, 66] have also been considered. In parallel developments bulk and edge universal statistics have been proven for invariant $β$-ensembles [12, 15, 17, 18, 29, 30, 52, 61, 62, 64, 65, 73] and even for their discrete analogues [13, 16, 41, 48] but often with very different methods.

A precondition for the Tracy-Widom distribution in all these generalizations of Wigner’s original ensemble is that the density of states vanishes as a square root near the spectral edges. The recent classification of the singularities of the solution to the underlying Dyson equation indeed revealed that at the edges only square root singularities appear [6, 10]. The density of states may also form a cusp-like singularity in the interior of the asymptotic spectrum, i.e. single points of vanishing density with a cubic root growth behaviour on either side. Under very general conditions, no other type of singularity may occur. At the cusp a new local eigenvalue process emerges: the correlation functions are still determinantal but the Pearcey kernel replaces the sine- or the Airy kernel.

The Pearcey process was first established by Brézin and Hikami for the eigenvalues close to a cusp singularity of a deformed complex Gaussian Wigner (GUE) matrix. They considered the model of a GUE matrix plus a deterministic matrix (“external source”) having eigenvalues $\pm 1$ with equal multiplicity [21, 22]. The names Pearcey kernel and Pearcey process were coined in [72] in reference to related functions introduced by Pearcey in the context of electromagnetic fields [63]. As with the universal sine and Airy processes, it was later observed that the universality of the Pearcey process extends beyond the realm of random matrices. Pearcey statistics have been established for non-intersecting Brownian bridges [3] and in skew plane partitions [60], always at criticality. We remark, however, that a critical cusp-like singularity does not always induce a Pearcey kernel, see e.g. [31].

In random matrix theory there are still only a handful of rather specific models for which the emergence of the Pearcey process has been proven. This has been achieved for deformed GUE matrices [2, 4, 23] and for Gaussian sample covariance matrices [42–44] by a contour integration method based upon the Brézin–Hikami formula. Beyond linear deformations, the Riemann-Hilbert method has been used for proving Pearcey statistics for a certain two-matrix model with a special quartic potential with appropriately tuned coefficients [40]. All these previous results concern only specific ensembles with a matrix integral representation. In particular, Wigner-type matrices are out of the scope of this approach.

The main result of the current paper is the proof of the Pearcey universality at the cusps for complex Hermitian Wigner-type matrices under very general conditions. Since the classification theorem excludes any other singularity, this is the third and last universal statistics that emerges from natural generalizations of Wigner’s ensemble.

This third universality class has received somewhat less attention than the other two, presumably because cusps are not present in the classical Wigner ensemble. We also note that the most common invariant $β$-ensembles do not exhibit the Pearcey statistics as their densities do not feature cubic root cusps but are instead 1/2-Hölder continuous for somewhat regular potentials [28]. The density vanishes either as a $2k$-th or a $(2 k + \frac{1}{2})$-th power, each with its own local statistics (see [26] also for the persistence of these statistics under small additive GUE perturbations before the critical time). Cusp singularities, hence Pearcey statistics, however, naturally arise within any one-parameter family of Wigner-type ensembles whenever two spectral bands merge as the parameter varies. The classification theorem implies that cusp formation is the only possible way for bands to merge, so in that sense Pearcey universality is ubiquitous as well.

The bulk and edge universality is characterized by the symmetry type alone: up to a natural shift and rescaling there is only one bulk and one edge statistic. In contrast, the cusp universality has a much richer structure: it is naturally embedded in a one-parameter family of universal statistics within each symmetry class. In the complex Hermitian case these are given by the one-parameter family of (extended) Pearcey kernels, see (2.5) later. Thinking in terms of fine-tuning a single parameter in the space of Wigner-type ensembles, the density of states already exhibits a universal local shape right before and right after the cusp formation; it features a tiny gap or a tiny nonzero local minimum, respectively [5, 10]. When the local lengthscale $ℓ$ of these almost cusp shapes is comparable with the local eigenvalue spacing $δ$, the general Pearcey statistics are expected to emerge, with a parameter determined by the ratio $ℓ / δ$. Thus the full Pearcey universality typically appears in a double scaling limit.

Our proof follows the three-step strategy that is the backbone of the recent approach to the WDM universality, see [38] for a pedagogical exposé and for a detailed history of the method. The first step in this strategy is a local law that identifies, with very high probability, the empirical eigenvalue distribution on a scale slightly above the typical eigenvalue spacing. The second step is to prove universality for ensembles with a tiny Gaussian component. Finally, in the third step this Gaussian component is removed by perturbation theory. The local law is used for precise a priori bounds in the second and third steps.

The main novelty of the current paper is the proof of the local law at optimal scale near the cusp. To put the precision in proper context, we normalize the $N \times N$ real symmetric or complex Hermitian Wigner-type matrix $H$ to have norm of order one. As customary, the local law is formulated in terms of the Green function $G (z) := {(H - z)}^{- 1}$ with spectral parameter $z$ in the upper half plane. The local law then asserts that $G(z)$ becomes deterministic in the large $N$ limit as long as $η := \Im z$ is much larger than the local eigenvalue spacing around $\Re z$. The deterministic approximant $M(z)$ can be computed as the unique solution of the corresponding Dyson equation (see (2.2) and (3.1) later). Near the cusp the typical eigenvalue spacing is of order $N^{- 3 / 4}$; compare this with the $N^{- 1}$ spacing in the bulk and $N^{- 2 / 3}$ spacing near the edges. We remark that a local law at the cusp on the non-optimal scale $N^{- 3 / 5}$ has already been proven in [8]. In the current paper we improve this result to the optimal scale $N^{- 3 / 4}$ and this is essential for our universality proof at the cusp.

The main ingredient behind this improvement is an optimal estimate of the error term $D$ (see (3.4) later) in the approximate Dyson equation that $G(z)$ satisfies. The difference $M - G$ is then roughly estimated by $B^{- 1} (M D)$, where $B$ is the linear stability operator of the Dyson equation. Previous estimates on $D$ (in averaged sense) were of order $ρ / N η$, where $ρ$ is the local density; roughly speaking $ρ \sim 1$ in the bulk, $ρ \sim N^{- 1 / 3}$ at the edge and $ρ \sim N^{- 1 / 4}$ near the cusp. While this estimate cannot be improved in general, our main observation is that, to leading order, we need only the projection of $MD$ in the single unstable direction of $B$. We found that this projection carries an extra hidden cancellation due to a special local symmetry at the cusp and thus the estimate on $D$ effectively improves to $ρ^{2} / N η$. Customary power counting is not sufficient; we need to compute this error term explicitly at least to leading order. We call this subtle mechanism cusp fluctuation averaging since it combines the well established fluctuation averaging procedure with the additional cancellation at the cusp. Similar estimates extend to the vicinity of the exact cusps. We identify a key quantity, denoted by $σ (z)$ (in (3.5b) later), that measures the distance from the cusp in a canonical way: $σ (z) = 0$ characterizes an exact cusp, while $|σ (z)| ≪ 1$ indicates that $z$ is near an almost cusp. Our final estimate on $D$ is of order $(ρ + |σ|) ρ / N η$. Since the error term $D$ is random and we need to control it in high moment sense, we need to lift this idea to a high moment calculation, meticulously extracting the improvement from every single term. This is performed in the technically most involved Sect. 4 where we use a Feynman diagrammatic formalism to keep track of the contributions of all terms.

Originally we developed this language in [34] to handle random matrices with slow correlation decay, based on the revival of the cumulant expansion technique in [45] after [50]. In the current paper we incorporate the cusp into this analysis. We identify a finite set of Feynman subdiagrams, called $σ$-cells (Definition 4.10), with value $σ$ that embody the cancellation effect at the cusp. To exploit the full strength of the cusp fluctuation averaging mechanism, we need to trace the fate of the $σ$-cells along the high moment expansion. The key point is that $σ$-cells are local objects in the Feynman graphs, thus their cancellation effects act simultaneously and the corresponding gains are multiplicative.

Formulated in the jargon of diagrammatic field theory, extracting the deterministic Dyson equation for $M$ from the resolvent equation $(H - z) G (z) = 1$ corresponds to a consistent self-energy renormalization of $G$. One way or another, such a procedure is behind every proof of the optimal local law with high probability. Our $σ$-cells conceptually correspond to a next order resummation of certain Feynman diagrams carrying a special cancellation.

We remark that we prove the optimal local law only for Wigner-type matrices and not yet for general correlated matrices, unlike in [11, 34]. In fact we use the simpler setup only for the estimate on $D$ (Theorem 3.7); the rest of the proof is already formulated for the general case. This simpler setup allows us to present the cusp fluctuation averaging mechanism with the least amount of technicalities. The extension to the correlated case is based on the same mechanism but it requires considerably more involved diagrammatic manipulations, which are better developed in a separate work to contain the length of this paper.

Our cusp fluctuation averaging mechanism has further applications. It is used in [9] to prove an optimal cusp local law for the Hermitization of non-Hermitian random matrices with a variance profile, demonstrating that the technique is also applicable in settings where the flatness assumption is violated. The cusp of the Hermitization corresponds to the edge of the non-Hermitian model via Girko’s formula, thus the optimal cusp local law leads to an optimal bound on the spectral radius [9] and ultimately also to edge universality [25] for non-Hermitian random matrices.

Armed with the optimal local law we then perform the other two steps of the three-step analysis. The third step, relying on the Green function comparison theorem, is fairly standard and previous proofs used in the bulk and at the edge need only minor adjustments. The second step, extracting universality from an ensemble with a tiny Gaussian component, can be done in two ways: (i) Brézin–Hikami formula with contour integration or (ii) Dyson Brownian Motion (DBM). Both methods require the local law as an input. In the current work we follow (i) mainly because this approach directly yields the Pearcey kernel, at least for the complex Hermitian symmetry class. In the companion work [24] we perform the DBM analysis adapting methods of [37, 53, 54] to the cusp. The main novelty in the current work and in [24] is the rigidity at the cusp on the optimal scale provided below. Once this key input is given, the proof of the edge universality from [53] is modified in [24] to the cusp setting, proving universality for the real symmetric case as well. We remark, however, that, to the best of our knowledge, the analogue of the Pearcey kernel for the real symmetric case has not yet been explicitly identified.

We now explain some novelty in the contour integration method. We first note that a similar approach was initiated in the fundamental work of Johansson on the bulk universality for Wigner matrices with a large Gaussian component in [49]. This method was later generalised to Wigner matrices with a small Gaussian component in [35], and it also inspired the proof of bulk universality via the moment matching idea [68] once the necessary local law became available. The double scaling regime has also been studied, where the density is very small but the Gaussian component compensates for it [27]. More recently, the same approach was extended to the cusp for deformed GUE matrices [23, Theorem 1.3] and for sample covariance matrices but only for large Gaussian component [42–44]. For our cusp universality, we need to perform a similar analysis but with a small Gaussian component. We represent our matrix $H$ as $\hat{H} + \sqrt{t} U$, where $U$ is GUE and $\hat{H}$ is an independent Wigner-type matrix. The contour integration analysis (Sect. 5.1) requires a Gaussian component of size at least $t ≫ N^{- 1 / 2}$.

The input of the analysis in Sect. 5.1 for the correlation kernel of H is a very precise description of the eigenvalues of $\hat{H}$ just above $N^{- 3 / 4}$ , the scale of the typical spacing between eigenvalues—this information is provided by our optimal local law. While in the bulk and in the regime of the regular edge finding an appropriate $\hat{H}$ is a relatively simple matter, in the vicinity of a cusp point the issue is very delicate. The main reason is that the cusp, unlike the bulk or the regular edge, is unstable under small perturbations; in fact it typically disappears and turns into a small positive local minimum if a small GUE component is added. Conversely, a cusp emerges if a small GUE component is added to an ensemble that has a density with a small gap. In particular, even if the density function $ρ (τ)$ of H exhibits an exact cusp, the density $\hat{ρ} (τ)$ of $\hat{H}$ will have a small gap: in fact $ρ$ is given by the evolution of the semicircular flow up to time t with initial data $\hat{ρ}$ . Unlike in the bulk and edge cases, here one cannot match the density of H and $\hat{H}$ by a simple shift and rescaling. Curiously, the contour integral analysis for the local statistics of H at the cusp relies on an optimal local law of $\hat{H}$ with a small gap far away from the cusp.

Thus we need an additional ingredient: the precise analysis of the semicircular flow $ρ_{s} := \hat{ρ} ⊞ ρ_{sc}^{(s)}$ near the cusp up to relatively long times $s ≲ N^{- 1 / 2 + ϵ}$; note that $ρ_{t} = ρ$ is the original density with the cusp. Here $ρ_{sc}^{(s)}$ is the semicircular density with variance $s$ and $⊞$ indicates the free convolution. In Sects. 5.2–5.3 we will see that the edges of the support of the density $ρ_{s}$ typically move linearly in the time $s$ while the gap closes at a much slower rate. Already $s ≫ N^{- 3 / 4}$ is beyond the simple perturbative regime of the cusp whose natural lengthscale is $N^{- 3 / 4}$. Thus we need a very careful tuning of the parameters: the analysis of a cusp for $H$ requires constructing a matrix $\hat{H}$ that is far from having a cusp but that after a relatively long time $t = N^{- 1 / 2 + ϵ}$ will develop a cusp exactly at the right location. In the estimates we heavily rely on various properties of the solution to the Dyson equation established in the recent paper [10]. These results go well beyond the precision of the previous work [5] and they apply to a very general class of Dyson equations, including a non-commutative von-Neumann algebraic setup.
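The instability of the cusp under the semicircular flow can be seen in a toy computation. The following minimal Python sketch does not treat a general Wigner-type density; it takes the initial measure $\hat{ρ} = \frac{1}{2} (δ_{- 1} + δ_{+ 1})$, i.e. the external source with eigenvalues $\pm 1$ mentioned above, and assumes the standard subordination relation $m_{s} (z) = \hat{m} (z + s m_{s} (z))$ for the Stieltjes transform of $\hat{ρ} ⊞ ρ_{sc}^{(s)}$. The function names, the damping factor and the regularisation $η$ are illustrative choices and not taken from the paper. For this toy model a direct computation gives $ρ_{s} (0) = \sqrt{s - 1} / (π s)$ for $s > 1$ and a gap around 0 for $s < 1$, so the cusp sits exactly at $s = 1$.

```python
import math


def m_hat(w: complex) -> complex:
    """Stieltjes transform of (delta_{-1} + delta_{+1}) / 2,
    with the convention m(w) = int rho(dx) / (x - w)."""
    return 0.5 * (1.0 / (-1.0 - w) + 1.0 / (1.0 - w))


def flow_stieltjes(z: complex, s: float, tol: float = 1e-13,
                   max_iter: int = 100_000) -> complex:
    """Solve m = m_hat(z + s * m) for Im z > 0 by a damped fixed-point iteration.

    The damping 1/2 is a heuristic choice; both m and m_hat(z + s * m) have
    positive imaginary part, so the iterates stay in the upper half plane.
    """
    m = 1j
    for _ in range(max_iter):
        m_new = 0.5 * m + 0.5 * m_hat(z + s * m)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m


def flow_density(tau: float, s: float, eta: float = 1e-6) -> float:
    """Approximate the density of the flow at tau by Im m_s(tau + i eta) / pi."""
    return flow_stieltjes(complex(tau, eta), s).imag / math.pi


if __name__ == "__main__":
    for s in (0.5, 0.9, 1.21, 2.0):
        print(f"s = {s:4.2f}  rho_s(0) ~ {flow_density(0.0, s):.6f}")
```

Running the loop for times below and above $s = 1$ shows the density at the origin switching from (numerically) zero to a small positive local minimum, which is the mechanism that forces the careful choice of $\hat{H}$ described above.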

Notations. We now introduce some custom notations we use throughout the paper. For non-negative functions $f(A, B)$, $g(A, B)$ we use the notation $f \leq_{A} g$ if there exists a constant $C(A)$ such that $f (A, B) \leq C (A) g (A, B)$ for all $A, B$. Similarly, we write $f \sim_{A} g$ if $f \leq_{A} g$ and $g \leq_{A} f$. We do not indicate the dependence of constants on basic parameters that will be called model parameters later. If the implied constants are universal, we instead write $f ≲ g$ and $f \sim g$. Similarly we write $f ≪ g$ if $f \leq c g$ for some tiny absolute constant $c > 0$.

We denote vectors by bold-faced lower case Roman letters $\mathbf{x}, \mathbf{y} \in \mathbb{C}^{N}$, and matrices by upper case Roman letters $A, B \in \mathbb{C}^{N \times N}$ with entries $A = {(a_{ij})}_{i, j = 1}^{N}$. The standard scalar product and Euclidean norm on $\mathbb{C}^{N}$ will be denoted by $〈\mathbf{x}, \mathbf{y}〉 := N^{- 1} \sum_{i \in [N]} \bar{x_{i}} y_{i}$ and $‖ \mathbf{x} ‖$, while we also write $〈A, B〉 := N^{- 1} \operatorname{Tr} A^{*} B$ for the scalar product of matrices, and $〈A〉 := N^{- 1} \operatorname{Tr} A$, $〈\mathbf{x}〉 := N^{- 1} \sum_{a \in [N]} x_{a}$. We write $\operatorname{diag} R$, $\operatorname{diag} \mathbf{r}$ for the diagonal vector of a matrix $R$ and the diagonal matrix obtained from a vector $\mathbf{r}$, and $S ⊙ R$ for the entrywise (Hadamard) product of matrices $R, S$. The usual operator norm induced by the vector norm $‖ \cdot ‖$ will be denoted by $‖ A ‖$, while the Hilbert-Schmidt (or Frobenius) norm will be denoted by ${‖ A ‖}_{hs} := \sqrt{〈A, A〉}$. For integers $n$ we define $[n] := \{1, \dots, n\}$.

Main Results

The Dyson equation

Let $W = W^{*} \in \mathbb{C}^{N \times N}$ be a self-adjoint random matrix and $A = \operatorname{diag} (\mathbf{a})$ be a deterministic diagonal matrix with entries $\mathbf{a} = {(a_{i})}_{i = 1}^{N} \in \mathbb{R}^{N}$. We say that $W$ is of Wigner-type [8] if its entries $w_{ij}$ for $i \leq j$ are centred, $\mathbb{E} w_{ij} = 0$, independent random variables. We define the variance matrix or self-energy matrix $S = {(s_{ij})}_{i, j = 1}^{N}$ by

$$ s_{ij} := \mathbb{E} {|w_{ij}|}^{2} . \tag{2.1} $$

This matrix is symmetric with non-negative entries. In [8] it was shown that as $N$ tends to infinity, the resolvent $G (z) := {(H - z)}^{- 1}$ of the deformed Wigner-type matrix $H = A + W$ entrywise approaches a diagonal matrix

$$ M (z) := \operatorname{diag} (\mathbf{m} (z)) . $$

The entries $\mathbf{m} = (m_{1}, \dots, m_{N}) : \mathbb{H} \to \mathbb{H}^{N}$ of $M$ have positive imaginary parts and solve the Dyson equation

$$ - \frac{1}{m_{i} (z)} = z - a_{i} + \sum_{j = 1}^{N} s_{ij} m_{j} (z), \qquad z \in \mathbb{H} := \{z \in \mathbb{C} \mid \Im z > 0\}, \quad i \in [N] . \tag{2.2} $$

We call $M$ or $\mathbf{m}$ the self-consistent Green’s function. The normalised trace of $M$ is the Stieltjes transform of a unique probability measure on $\mathbb{R}$ that approximates the empirical eigenvalue distribution of $A + W$ increasingly well as $N \to \infty$, motivating the following definition.

Definition 2.1

(Self-consistent density of states). The unique probability measure $ρ$ on $\mathbb{R}$, defined through

$$ 〈M (z)〉 = \frac{1}{N} \operatorname{Tr} M (z) = \int \frac{ρ (d τ)}{τ - z}, \qquad z \in \mathbb{H}, $$

is called the self-consistent density of states (scDOS). Accordingly, its support $supp ρ$ is called self-consistent spectrum.
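For a concrete variance profile, the Dyson equation (2.2) and the scDOS can be approximated numerically. The sketch below is only an illustration under stated assumptions: it uses a damped fixed-point iteration (a heuristic choice, not the method of the paper), the names `solve_dyson` and `scdos` are hypothetical, and the density is read off from $\Im 〈M (τ + i η)〉 / π$ at a small but positive $η$. For the flat profile $s_{ij} = 1 / N$ with $a_{i} = 0$ the equation reduces to $- 1 / m = z + m$, the equation for the Stieltjes transform of the semicircle law, which serves as a sanity check.

```python
import numpy as np


def solve_dyson(z, a, S, tol=1e-12, max_iter=100_000):
    """Approximate the solution m(z) of -1/m_i = z - a_i + (S m)_i for Im z > 0.

    Uses the damped iteration m <- (m + F(m)) / 2 with F(m) = -1 / (z - a + S m).
    F maps vectors with positive imaginary parts to vectors with positive
    imaginary parts, so the iterates stay in the upper half plane.
    """
    a = np.asarray(a, dtype=float)
    m = np.full(a.shape, 1j, dtype=complex)
    for _ in range(max_iter):
        m_new = 0.5 * m + 0.5 * (-1.0 / (z - a + S @ m))
        if np.max(np.abs(m_new - m)) < tol:
            return m_new
        m = m_new
    return m


def scdos(tau, a, S, eta=1e-3):
    """Approximate the self-consistent density at tau by Im <M(tau + i eta)> / pi."""
    m = solve_dyson(complex(tau, eta), a, S)
    return float(np.mean(m.imag) / np.pi)


if __name__ == "__main__":
    N = 50
    S_flat = np.ones((N, N)) / N   # flat variance profile (Wigner case)
    a_zero = np.zeros(N)           # no deformation
    for tau in (0.0, 1.0, 1.9):
        print(tau, scdos(tau, a_zero, S_flat, eta=1e-2))
```

Other profiles, for instance block-constant $S$ or a nonzero diagonal $A$, can be passed in the same way; the iteration may need many more steps when $η$ is small and $τ$ is close to an edge or a cusp.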

Cusp universality

We make the following assumptions:

Assumption (A)

(Bounded moments). The entries of the Wigner-type matrix $\sqrt{N} W$ have bounded moments and the expectation $A$ is bounded, i.e. there are positive $C_{k}$ such that

$$ |a_{i}| \leq C_{0}, \qquad \mathbb{E} {|w_{ij}|}^{k} \leq C_{k} N^{- k / 2}, \qquad k \in \mathbb{N} . $$

Assumption (B)

(Fullness). If the matrix $W = W^{*} \in \mathbb{C}^{N \times N}$ belongs to the complex Hermitian symmetry class, then we assume

$$ \begin{pmatrix} \mathbb{E} {(\Re w_{ij})}^{2} & \mathbb{E} (\Re w_{ij}) (\Im w_{ij}) \\ \mathbb{E} (\Re w_{ij}) (\Im w_{ij}) & \mathbb{E} {(\Im w_{ij})}^{2} \end{pmatrix} \geq \frac{c}{N} 1_{2 \times 2} \tag{2.3} $$

as quadratic forms, for some positive constant $c > 0$. If $W = W^{T} \in \mathbb{R}^{N \times N}$ belongs to the real symmetric symmetry class, then we assume $\mathbb{E} w_{ij}^{2} \geq \frac{c}{N}$.

Assumption (C)

(Bounded self-consistent Green’s function). In a neighbourhood of some fixed spectral parameter $τ \in \mathbb{R}$ the self-consistent Green’s function is bounded, i.e. for positive $C, κ$ we have

$$ |m_{i} (z)| \leq C, \qquad z \in τ + (- κ, κ) + i \mathbb{R}^{+} . $$

We call the constants appearing in Assumptions (A)–(C) model parameters. All generic constants $C$ in this paper may implicitly depend on these model parameters. Dependence on further parameters, however, will be indicated.

Remark 2.2

The boundedness of $m$ in Assumption (C) can be ensured by assuming some regularity of the variance matrix S. For more details we refer to [5, Chapter 6].

From the extensive analysis in [10] we know that the self-consistent density $ρ$ is described by explicit shape functions in the vicinity of local minima with small value of $ρ$ and around small gaps in the support of $ρ$ . The density in such almost cusp regimes is given by precisely one of the following three asymptotics:

(i) Exact cusp. There is a cusp point $c \in \mathbb{R}$ in the sense that $ρ (c) = 0$ and $ρ (c \pm δ) > 0$ for $0 \neq δ ≪ 1$. In this case the self-consistent density is locally around $c$ given by

$$ ρ (c \pm x) = \frac{\sqrt{3} γ^{4 / 3}}{2 π} x^{1 / 3} [1 + O (x^{1 / 3})], \qquad x \geq 0 \tag{2.4a} $$

for some $γ > 0$.

(ii) Small gap. There is a maximal interval $[e_{-}, e_{+}]$ of size $0 < Δ := e_{+} - e_{-} ≪ 1$ such that ${ρ |}_{[e_{-}, e_{+}]} \equiv 0$. In this case the density around $e_{\pm}$ is, for some $γ > 0$, locally given by

$$ ρ (e_{\pm} \pm x) = \frac{\sqrt{3} {(2 γ)}^{4 / 3} Δ^{1 / 3}}{2 π} Ψ_{edge} (x / Δ) [1 + O (Δ^{1 / 3} Ψ_{edge} (x / Δ))], \qquad x \geq 0 \tag{2.4b} $$

where the shape function around the edge is given by

$$ Ψ_{edge} (λ) := \frac{\sqrt{λ (1 + λ)}}{{(1 + 2 λ + 2 \sqrt{λ (1 + λ)})}^{2 / 3} + {(1 + 2 λ - 2 \sqrt{λ (1 + λ)})}^{2 / 3} + 1}, \qquad λ \geq 0 . \tag{2.4c} $$

(iii) Non-zero local minimum. There is a local minimum at $m \in \mathbb{R}$ of $ρ$ such that $0 < ρ (m) ≪ 1$. In this case there exists some $γ > 0$ such that

$$ ρ (m + x) = ρ (m) + ρ (m) Ψ_{\min} \left( \frac{3 \sqrt{3} γ^{4} x}{2 {(π ρ (m))}^{3}} \right) \left[ 1 + O \left( ρ {(m)}^{1 / 2} + \frac{|x|}{ρ {(m)}^{3}} \right) \right], \qquad x \in \mathbb{R} \tag{2.4d} $$

where the shape function around the local minimum is given by

$$ Ψ_{\min} (λ) := \frac{\sqrt{1 + λ^{2}}}{{(\sqrt{1 + λ^{2}} + λ)}^{2 / 3} + {(\sqrt{1 + λ^{2}} - λ)}^{2 / 3} - 1} - 1, \qquad λ \in \mathbb{R} . \tag{2.4e} $$

We note that the parameter $γ$ in (2.4a) is chosen in a way which is convenient for the universality statement. We also note that the choices for $γ$ in (2.4b)–(2.4d) are consistent with (2.4a) in the sense that in the regimes $Δ ≪ x ≪ 1$ and $ρ {(m)}^{3} ≪ |x| ≪ 1$ the respective formulae asymptotically agree. Depending on the three cases (i)–(iii), we define the almost cusp point $b$ as the cusp $c$ in case (i), the midpoint $(e_{-} + e_{+}) / 2$ in case (ii), and the minimum $m$ in case (iii). When the local length scale of the almost cusp shape starts to match the eigenvalue spacing, i.e. if $Δ ≲ N^{- 3 / 4}$ or $ρ (m) ≲ N^{- 1 / 4}$ , then we call the local shape a physical cusp. This terminology reflects the fact that the shape becomes indistinguishable from the exact cusp with $ρ (c) = 0$ when resolved with a precision above the eigenvalue spacing. In this case we call $b$ a physical cusp point.
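The consistency of the three asymptotics can be checked numerically. The following Python sketch evaluates only the leading terms of (2.4a), (2.4b) and (2.4d), dropping the $O (\cdot)$ corrections, and compares them in the intermediate regimes $Δ ≪ x ≪ 1$ and $ρ {(m)}^{3} ≪ |x| ≪ 1$. The function names and the sample values of $γ$, $Δ$, $ρ (m)$ and $x$ are illustrative assumptions. The two cube-root terms in each denominator are rewritten as $u^{2 / 3} + u^{- 2 / 3}$, which is algebraically identical to (2.4c) and (2.4e) and avoids cancellation for large arguments.

```python
import math


def psi_edge(lam: float) -> float:
    """Edge shape function (2.4c) for lam >= 0."""
    root = math.sqrt(lam * (1.0 + lam))
    u = 1.0 + 2.0 * lam + 2.0 * root
    # (1 + 2 lam - 2 root) = 1 / u, since (1 + 2 lam)^2 - 4 lam (1 + lam) = 1.
    return root / (u ** (2.0 / 3.0) + u ** (-2.0 / 3.0) + 1.0)


def psi_min(lam: float) -> float:
    """Local-minimum shape function (2.4e) for real lam (an even function)."""
    root = math.sqrt(1.0 + lam * lam)
    u = root + abs(lam)
    # (root + lam) * (root - lam) = 1, so the two terms are u^(2/3) and u^(-2/3).
    return root / (u ** (2.0 / 3.0) + u ** (-2.0 / 3.0) - 1.0) - 1.0


def rho_cusp(x: float, gamma: float) -> float:
    """Leading term of (2.4a) at distance x >= 0 from an exact cusp."""
    return math.sqrt(3.0) * gamma ** (4.0 / 3.0) / (2.0 * math.pi) * x ** (1.0 / 3.0)


def rho_gap(x: float, gamma: float, gap: float) -> float:
    """Leading term of (2.4b) at distance x >= 0 from an edge of a gap of size `gap`."""
    pref = math.sqrt(3.0) * (2.0 * gamma) ** (4.0 / 3.0) * gap ** (1.0 / 3.0) / (2.0 * math.pi)
    return pref * psi_edge(x / gap)


def rho_minimum(x: float, gamma: float, rho0: float) -> float:
    """Leading term of (2.4d) at distance x from a local minimum of height rho0."""
    lam = 3.0 * math.sqrt(3.0) * gamma ** 4 * x / (2.0 * (math.pi * rho0) ** 3)
    return rho0 + rho0 * psi_min(lam)


if __name__ == "__main__":
    gamma, x = 1.3, 1e-2
    print(rho_cusp(x, gamma), rho_gap(x, gamma, 1e-9), rho_minimum(x, gamma, 1e-4))
```

In the small-argument regime $Ψ_{edge} (λ) \approx \sqrt{λ} / 3$, so the same function also reproduces the square root edge behaviour right at the boundary of the gap.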

The extended Pearcey kernel with a real parameter $α$ (often denoted by $τ$ in the literature) is given by

$$ K_{α} (x, y) = \frac{1}{{(2 π i)}^{2}} \int_{Ξ} d z \int_{Φ} d w \, \frac{\exp (- w^{4} / 4 + α w^{2} / 2 - y w + z^{4} / 4 - α z^{2} / 2 + x z)}{w - z}, \tag{2.5} $$

where $Ξ$ is a contour consisting of rays from $\pm \infty e^{i π / 4}$ to 0 and rays from 0 to $\pm \infty e^{- i π / 4}$, and $Φ$ is the ray from $- i \infty$ to $i \infty$. The simple Pearcey kernel with parameter $α = 0$ was first observed in the context of random matrix theory by [21, 22]. We note that (2.5) is a special case of a more general extended Pearcey kernel defined in [72, Eq. (1.1)].

It is natural to express universality in terms of a rescaled k-point function $p_{k}^{(N)}$ which we define implicitly by

$$ {\binom{N}{k}}^{- 1} \sum_{\{i_{1}, \dots, i_{k}\} \subset [N]} \mathbb{E} f (λ_{i_{1}}, \dots, λ_{i_{k}}) = \int_{\mathbb{R}^{k}} f (x_{1}, \dots, x_{k}) p_{k}^{(N)} (x_{1}, \dots, x_{k}) \, d x_{1} \dots d x_{k} $$

for test functions f, where the summation is over all subsets of k distinct integers from [N].

Theorem 2.3

Let $H$ be a complex Hermitian Wigner-type matrix satisfying Assumptions (A)–(C). Assume that the self-consistent density $ρ$ within $[τ - κ, τ + κ]$ from Assumption (C) has a physical cusp, i.e. that $ρ$ is locally given by (2.4) for some $γ > 0$ and $ρ$ either (i) has a cusp point $c$, or (ii) a small gap $[e_{-}, e_{+}]$ of size $Δ := e_{+} - e_{-} ≲ N^{- 3 / 4}$, or (iii) a local minimum at $m$ of size $ρ (m) ≲ N^{- 1 / 4}$. Then for any smooth compactly supported test function $F : \mathbb{R}^{k} \to \mathbb{R}$ we have

$$ \left| \int_{\mathbb{R}^{k}} F (\mathbf{x}) \left[ \frac{N^{k / 4}}{γ^{k}} p_{k}^{(N)} \left( b + \frac{\mathbf{x}}{γ N^{3 / 4}} \right) - \det {(K_{α} (x_{i}, x_{j}))}_{i, j = 1}^{k} \right] d \mathbf{x} \right| = O (N^{- c (k)}), $$

where

$$ b := \begin{cases} c & \text{in case (i)}, \\ (e_{+} + e_{-}) / 2 & \text{in case (ii)}, \\ m & \text{in case (iii)}, \end{cases} \qquad α := \begin{cases} 0 & \text{in case (i)}, \\ 3 {(γ Δ / 4)}^{2 / 3} N^{1 / 2} & \text{in case (ii)}, \\ - {(π ρ (m) / γ)}^{2} N^{1 / 2} & \text{in case (iii)}, \end{cases} \tag{2.6} $$

$\mathbf{x} = (x_{1}, \dots, x_{k})$, $d \mathbf{x} = d x_{1} \dots d x_{k}$, and $c (k) > 0$ is a small constant only depending on $k$.
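The role of the physical cusp condition becomes visible when the Pearcey parameter in (2.6) is evaluated for concrete numbers. The short Python sketch below simply transcribes the three cases; the function name, the string labels for the cases and the sample values are hypothetical. It illustrates that $α$ stays of order one exactly when $Δ$ is of order $N^{- 3 / 4}$ or $ρ (m)$ is of order $N^{- 1 / 4}$.

```python
import math


def pearcey_parameter(case: str, N: int, gamma: float,
                      gap: float = 0.0, rho_min: float = 0.0) -> float:
    """Parameter alpha of the extended Pearcey kernel, transcribed from (2.6).

    case = "cusp": exact cusp (i); "gap": small gap of size `gap` (ii);
    "min": non-zero local minimum of height `rho_min` (iii).
    """
    if case == "cusp":
        return 0.0
    if case == "gap":
        return 3.0 * (gamma * gap / 4.0) ** (2.0 / 3.0) * math.sqrt(N)
    if case == "min":
        return -((math.pi * rho_min / gamma) ** 2) * math.sqrt(N)
    raise ValueError(f"unknown case: {case!r}")


if __name__ == "__main__":
    N = 10 ** 8
    # Gap and minimum exactly on the physical-cusp scales N^{-3/4} and N^{-1/4}.
    print(pearcey_parameter("gap", N, gamma=4.0, gap=N ** -0.75))
    print(pearcey_parameter("min", N, gamma=math.pi, rho_min=N ** -0.25))
```

Note the sign convention: a small gap gives $α > 0$ and a small non-zero minimum gives $α < 0$, with the exact cusp $α = 0$ in between.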

Local law

We emphasise that the proof of Theorem 2.3 requires a very precise a priori control on the fluctuation of the eigenvalues even at singular points of the scDOS. This control is expressed in the form of a local law with an optimal convergence rate down to the typical eigenvalue spacing. We now define the scale on which the eigenvalues are predicted to fluctuate around the spectral parameter $τ$ .

Definition 2.4

(Fluctuation scale). We define the self-consistent fluctuation scale $η_{f} = η_{f} (τ)$ through

$$ \int_{- η_{f}}^{η_{f}} ρ (τ + ω) \, d ω = \frac{1}{N}, $$

if $τ \in \operatorname{supp} ρ$. If $τ \notin \operatorname{supp} ρ$, then $η_{f}$ is defined as the fluctuation scale at a nearby edge. More precisely, let $I$ be the largest (open) interval with $τ \in I \subseteq \mathbb{R} \setminus \operatorname{supp} ρ$ and set $Δ := \min \{|I|, 1\}$. Then we define

$$ η_{f} := \begin{cases} Δ^{1 / 9} / N^{2 / 3}, & Δ > 1 / N^{3 / 4}, \\ 1 / N^{3 / 4}, & Δ \leq 1 / N^{3 / 4} . \end{cases} \tag{2.7} $$

We will see later (cf. (A.8b)) that (2.7) is the fluctuation of the edge eigenvalue adjacent to a spectral gap of length $Δ$ as predicted by the local behaviour of the scDOS. The control on the fluctuation of eigenvalues is expressed in terms of the following local law.
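For $τ \in \operatorname{supp} ρ$ the fluctuation scale of Definition 2.4 can be computed numerically from a given density. The sketch below is an illustration and not part of the paper: it assumes a vectorised callable `rho` (a hypothetical name), approximates the integral by a midpoint rule and locates $η_{f}$ by bisection, using that the integral is non-decreasing in $η_{f}$. For the leading term of the exact cusp profile (2.4a) the defining relation can be solved by hand, giving $η_{f} = {(4 π / (3 \sqrt{3} γ^{4 / 3} N))}^{3 / 4}$, which reproduces the $N^{- 3 / 4}$ scale.

```python
import numpy as np


def fluctuation_scale(rho, tau, N, eta_max=1.0, n_quad=20_000, n_bisect=60):
    """Solve int_{-eta}^{eta} rho(tau + w) dw = 1/N for eta by bisection.

    `rho` must accept a NumPy array; `eta_max` is an upper bracket for eta.
    """
    def mass(eta):
        # Midpoint rule; an even number of cells keeps tau on a cell boundary.
        omega = (np.arange(n_quad) + 0.5) / n_quad * 2.0 * eta - eta
        return float(np.mean(rho(tau + omega)) * 2.0 * eta)

    if mass(eta_max) < 1.0 / N:
        raise ValueError("eta_max is too small to bracket the fluctuation scale")
    lo, hi = 0.0, eta_max
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        if mass(mid) < 1.0 / N:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cusp_profile(gamma, c=0.0):
    """Leading term of the exact cusp density (2.4a) around the cusp point c."""
    pref = np.sqrt(3.0) * gamma ** (4.0 / 3.0) / (2.0 * np.pi)
    return lambda x: pref * np.cbrt(np.abs(x - c))


if __name__ == "__main__":
    for N in (10 ** 3, 10 ** 4, 10 ** 5):
        eta_f = fluctuation_scale(cusp_profile(1.0), 0.0, N)
        print(N, eta_f, eta_f * N ** 0.75)
```

The last printed column stays constant in $N$, in line with the cusp spacing $N^{- 3 / 4}$; replacing `cusp_profile` by a constant or a square root profile gives the bulk and edge scalings in the same way.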

Theorem 2.5

(Local law). Let $H$ be a deformed Wigner-type matrix of the real symmetric or complex Hermitian symmetry class. Fix any $τ \in \mathbb{R}$. Assuming (A)–(C), for any $ϵ, ζ > 0$ and $ν \in \mathbb{N}$ the local law holds uniformly for all $z = τ + i η$ with $\operatorname{dist} (z, \operatorname{supp} ρ) \in [N^{ζ} η_{f} (τ), N^{100}]$ in the form

$$ \mathbb{P} \left[ |〈\mathbf{u}, (G (z) - M (z)) \mathbf{v}〉| \geq N^{ϵ} \sqrt{\frac{ρ (z)}{N η}} ‖ \mathbf{u} ‖ ‖ \mathbf{v} ‖ \right] \leq \frac{C}{N^{ν}}, \tag{2.8a} $$

for any $\mathbf{u}, \mathbf{v} \in \mathbb{C}^{N}$ and

$$ \mathbb{P} \left[ |〈B (G (z) - M (z))〉| \geq \frac{N^{ϵ} ‖ B ‖}{N \operatorname{dist} (z, \operatorname{supp} ρ)} \right] \leq \frac{C}{N^{ν}}, \tag{2.8b} $$

for any $B \in \mathbb{C}^{N \times N}$. Here $ρ (z) := 〈\Im M (z)〉 / π$ denotes the harmonic extension of the scDOS to the complex upper half plane. The constants $C > 0$ in (2.8) depend only on $ϵ, ζ, ν$ and the model parameters.
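The averaged statement (2.8b) with $B = 1$ can be illustrated on the simplest example. The Python sketch below samples a single GUE matrix, for which $s_{ij} = 1 / N$ and $a_{i} = 0$, so that $M (z)$ is a multiple of the identity given by the Stieltjes transform of the semicircle law. It is a numerical illustration on one sample at a spectral parameter far above the fluctuation scale, not a verification of the theorem; the helper names and the chosen $N$ and $z$ are assumptions.

```python
import numpy as np


def sample_gue(N, rng):
    """Hermitian matrix with centred Gaussian entries and E|h_ij|^2 = 1/N."""
    X = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2.0)
    return (X + X.conj().T) / np.sqrt(2.0 * N)


def m_semicircle(z):
    """Solution of -1/m = z + m with Im m > 0 for Im z > 0."""
    s = np.sqrt(z * z - 4.0 + 0j)
    m = (-z + s) / 2.0
    return m if m.imag > 0 else (-z - s) / 2.0


def averaged_resolvent(H, z):
    """Normalised trace <G(z)> = N^{-1} Tr (H - z)^{-1}, computed from the eigenvalues."""
    evals = np.linalg.eigvalsh(H)
    return np.mean(1.0 / (evals - z))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = 0.3 + 0.5j
    for N in (100, 200, 400):
        H = sample_gue(N, rng)
        err = abs(averaged_resolvent(H, z) - m_semicircle(z))
        print(N, err, err * N * z.imag)
```

The last column compares the observed error with the scale $1 / (N η)$ appearing on the right-hand side of (2.8b) in the bulk.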

We remark that later we will prove the local law also in a form which is uniform in $τ \in [- N^{100}, N^{100}]$ and $η \in [N^{- 1 + ζ}, N^{100}]$ , albeit with a more complicated error term, see Proposition 3.11. The local law Theorem 2.5 implies a large deviation result for the fluctuation of eigenvalues on the optimal scale uniformly for all singularity types.

Corollary 2.6

(Uniform rigidity). Let $H$ be a deformed Wigner-type matrix of the real symmetric or complex Hermitian symmetry class satisfying Assumptions (A)–(C) for $τ \in \operatorname{int} (\operatorname{supp} ρ)$. Then

$$ \mathbb{P} [|λ_{k (τ)} - τ| \geq N^{ϵ} η_{f} (τ)] \leq \frac{C}{N^{ν}} $$

for any $ϵ > 0$ and $ν \in \mathbb{N}$ and some $C = C (ϵ, ν)$, where we defined the (self-consistent) eigenvalue index $k (τ) := ⌈ N ρ ((- \infty, τ)) ⌉$, and where $⌈ x ⌉ = \min \{k \in \mathbb{Z} \mid k \geq x\}$.

In particular, the fluctuation of the eigenvalue whose expected position is closest to the cusp location does not exceed $N^{- 3 / 4 + ϵ}$ for any $ϵ > 0$ with very high probability. The following corollary specialises Corollary 2.6 to the neighbourhood of a cusp.

Corollary 2.7

(Cusp rigidity). Let H be a deformed Wigner-type matrix of the real symmetric or complex Hermitian symmetry class satisfying Assumptions (A)–(C) and $τ = c$ the location of an exact cusp. Then $N ρ ((- \infty, c)) = k_{c}$ for some $k_{c} \in [N]$ , that we call the cusp eigenvalue index. For any $ϵ > 0$ , $ν \in N$ and $k \in [N]$ with $|k - k_{c}| \leq c N$ we have

\begin{matrix} P [|λ_{k} - γ_{k}| \geq \frac{N^{ϵ}}{{(1 + |k - k_{c}|)}^{1 / 4} N^{3 / 4}}] \leq \frac{C}{N^{ν}}, \end{matrix}

where $C = C (ϵ, ν)$ and $γ_{k}$ are the self-consistent eigenvalue locations, defined through $N ρ ((- \infty, γ_{k})) = k$ .
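To see where the $N^{- 3 / 4}$ scale and the factor ${(1 + |k - k_{c}|)}^{- 1 / 4}$ come from, one can compute the self-consistent locations $γ_{k}$ for a toy density with an exact cubic-root cusp. The sketch below assumes the hypothetical model density $ρ (x) = \frac{2}{3} {|x|}^{1 / 3}$ on $[- 1, 1]$ (a probability density with a cusp at 0, not the scDOS of a specific ensemble) and compares the spacing $γ_{k + 1} - γ_{k}$ with the rigidity scale of Corollary 2.7 at $ϵ = 0$. The function names are illustrative.

```python
import numpy as np

def gamma_toy(k, n):
    """Solve n * rho((-inf, gamma_k)) = k for rho(x) = (2/3)|x|^(1/3) on [-1, 1].

    The distribution function is F(x) = 1/2 + sign(x) |x|^(4/3) / 2.
    """
    t = 2.0 * np.asarray(k, dtype=float) / n - 1.0
    return np.sign(t) * np.abs(t) ** 0.75

def spacing_over_rigidity_scale(n):
    """Ratio of gamma_{k+1} - gamma_k to (1 + |k - k_c|)^(-1/4) n^(-3/4), k >= k_c."""
    k_c = n // 2                      # cusp eigenvalue index (n even)
    j = np.arange(0, n - k_c)         # j = k - k_c
    gaps = gamma_toy(k_c + j + 1, n) - gamma_toy(k_c + j, n)
    scale = (1.0 + j) ** (-0.25) * n ** (-0.75)
    return gaps / scale

if __name__ == "__main__":
    for n in (1000, 10000):
        r = spacing_over_rigidity_scale(n)
        print(n, r.min(), r.max())
```

In this toy model the ratio stays between two $N$-independent constants, which suggests that the bound in Corollary 2.7 is of the order of the local eigenvalue spacing up to the factor $N^{ϵ}$.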

We remark that a variant of Corollary 2.7 holds more generally for almost cusp points. It is another consequence of Corollary 2.6 that with high probability there are no eigenvalues much further than the fluctuation scale $η_{f}$ away from the spectrum. We note that the following corollary generalises [11, Corollary 2.3] by also covering internal gaps of size $≪ 1$ .

Corollary 2.8

(No eigenvalues outside the support of the self-consistent density). Let $τ \notin supp ρ$ . Under the assumptions of Theorem 2.5 we have

\begin{matrix} P [\exists λ \in Spec H \cap [τ - c, τ + c], dist (λ, supp ρ) \geq N^{ϵ} η_{f} (τ)] \leq C N^{- ν}, \end{matrix}

for any $ϵ, ν > 0$ , where c and C are positive constants, depending on model parameters. The latter also depends on $ϵ$ and $ν$ .

Remark 2.9

Theorem 2.5 and its consequences, Corollaries 2.6, 2.7 and 2.8, also hold for both symmetry classes if Assumption (B) is replaced by the condition that there exist an $L \in N$ and $c > 0$ such that ${min}_{i, j} {(S^{L})}_{ij} \geq c / N$ . A variance profile S satisfying this condition is called uniformly primitive (cf. [6, Eq. (2.5)] and [5, Eq. (2.11)]). Note that uniform primitivity is weaker than condition (B) on two accounts. First, it involves only the variance matrix $E {|w_{ij}|}^{2}$ unlike (2.3) in the complex Hermitian case that also involves $E w_{ij}^{2}$ . Second, uniform primitivity allows certain matrix elements of W to vanish. The proof under these more general assumptions follows the same strategy but requires minor modifications within the stability analysis.

Local Law

In order to directly appeal to recent results on the shape of the solution to the Matrix Dyson Equation (MDE) from [10] and the flexible diagrammatic cumulant expansion from [34], we first reformulate the Dyson equation (2.2) for N-vectors $m$ into a matrix equation that will approximately be satisfied by the resolvent G. This viewpoint also allows us to treat diagonal and off-diagonal elements of G on the same footing. In fact, (2.2) is a special case of

\begin{matrix} 1 + (z - A + S [M]) M = 0, \end{matrix}

3.1

for a matrix $M = M (z) \in C^{N \times N}$ with positive definite imaginary part, $I M = (M - M^{*}) / 2 i > 0$ . The uniqueness of the solution M with $I M > 0$ was shown in [46]. Here the linear (self-energy) operator $S : C^{N \times N} \to C^{N \times N}$ is defined as $S [R] : = E W R W$ and it preserves the cone of positive definite matrices. Definition 2.1 of the scDOS and its harmonic extension $ρ (z)$ (cf. Theorem 2.5) directly generalises to the solution to (3.1), see [10, Definition 2.2].

In the special case of Wigner-type matrices the self-energy operator is given by

\begin{matrix} S [R] = diag (S r) + T ⊙ R^{t}, \end{matrix}

3.2

where $r : = {(r_{ii})}_{i = 1}^{N}$ , S was defined in (2.1), $T = {(t_{ij})}_{i, j = 1}^{N} \in C^{N \times N}$ with $t_{ij} = E w_{ij}^{2} 1 (i \neq j)$ and $⊙$ denotes the entrywise Hadamard product. The solution to (3.1) is then given by $M = diag (m)$ , where $m$ solves (2.2). Note that the action of $S$ on diagonal matrices is independent of T; hence the Dyson equation (2.2) for Wigner-type matrices is solely determined by the matrix S, and the matrix T plays no role. However, T does play a role in analyzing the error matrix D, see (3.4) below.
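The reduction from (3.1) to a vector equation can be checked numerically. With $M = diag (m)$ and $t_{ii} = 0$, equation (3.1) reads $1 + (z - a_{i} + {(S m)}_{i}) m_{i} = 0$ entrywise, where $a_{i}$ are the diagonal entries of a diagonal $A$. The sketch below solves this by a damped fixed-point iteration and evaluates the residual of (3.1) using the self-energy operator (3.2). The iteration scheme, the tolerances and all names are assumptions made for illustration (a common numerical heuristic, not a method taken from the paper), and the variance profile in the usage example is hypothetical.

```python
import numpy as np

def solve_vector_dyson(z, a, S_mat, tol=1e-12, max_iter=10_000):
    """Damped fixed-point iteration for 1 + (z - a_i + (S m)_i) m_i = 0, Im m > 0."""
    m = np.full(len(a), 1j, dtype=complex)    # start in the upper half plane
    for _ in range(max_iter):
        m_new = -1.0 / (z - a + S_mat @ m)
        if np.max(np.abs(m_new - m)) < tol:
            return m_new
        m = 0.5 * (m + m_new)                 # damping; keeps Im m > 0
    raise RuntimeError("Dyson iteration did not converge")

def self_energy(R, S_mat, T):
    """S[R] = diag(S r) + T * R^t (entrywise product) as in (3.2), r = diagonal of R."""
    return np.diag(S_mat @ np.diag(R)) + T * R.T

def mde_residual(z, a, S_mat, T):
    """Solve for m and return (m, Frobenius norm of 1 + (z - A + S[M]) M)."""
    n = len(a)
    m = solve_vector_dyson(z, a, S_mat)
    M = np.diag(m)
    res = np.eye(n) + (z * np.eye(n) - np.diag(a) + self_energy(M, S_mat, T)) @ M
    return m, np.linalg.norm(res)

if __name__ == "__main__":
    n = 60
    i = np.arange(n)
    S_mat = (1.0 + 0.5 * np.cos(np.pi * (i[:, None] + i[None, :]) / n)) / n
    T = 0.3 * S_mat
    np.fill_diagonal(T, 0.0)                  # t_ii = 0 by definition
    print(mde_residual(0.3 + 0.5j, np.linspace(-0.2, 0.2, n), S_mat, T)[1])
```

Changing `T` (with zero diagonal) leaves the residual unchanged, in line with the remark that T plays no role for the Dyson equation itself.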

The proof of the local law consists of three largely separate arguments. The first part concerns the analysis of the stability operator

\begin{matrix} B : = 1 - M S [\cdot] M \end{matrix}

3.3

and shape analysis of the solution M to (3.1). The second part is proving that the resolvent G is indeed an approximate solution to (3.1) in the sense that the error matrix

\begin{matrix} D : = 1 + (z - A + S [G]) G = W G + S [G] G \end{matrix}

3.4

is small. In previous works [8, 11, 34] it was sufficient to establish smallness of D in an isotropic form $〈x, D y〉$ and averaged form $〈B, D〉$ with general bounded vectors/matrices $x, y, B$ . In the vicinity of a cusp, however, it becomes necessary to establish an additional cancellation when D is averaged against the unstable direction of the stability operator $B$ . We call this new effect cusp fluctuation averaging. Finally, the third part of the proof consists of a bootstrap argument starting far away from the real axis and iteratively lowering the imaginary part $η = I z$ of the spectral parameter while maintaining the desired bound on $G - M$ .
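The two expressions for D in (3.4) agree because $(A + W - z) G = 1$ gives $1 + (z - A) G = W G$. The following sketch checks this identity numerically and returns both forms. It assumes a GUE-type $W$ (so that $t_{ij} = 0$, $s_{ij} = 1 / N$ and hence $S [R] = 〈R〉 1$ by (3.2)) and a hypothetical diagonal deformation $A$; the sizes and names are illustrative only.

```python
import numpy as np

def error_matrix_forms(n=300, z=0.1 + 0.2j, seed=1):
    """Return D computed as 1 + (z - A + S[G]) G and as W G + S[G] G."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    w = (x + x.conj().T) / (2.0 * np.sqrt(n))   # E|w_ij|^2 = 1/n, E w_ij^2 = 0 for i != j
    a = np.diag(np.linspace(-0.5, 0.5, n))      # hypothetical deformation A
    eye = np.eye(n)
    g = np.linalg.inv(a + w - z * eye)          # resolvent of H = A + W
    s_of_g = (np.trace(g) / n) * eye            # S[G] = <G> 1 for this profile
    d_def = eye + (z * eye - a + s_of_g) @ g
    d_alt = w @ g + s_of_g @ g
    return d_def, d_alt

if __name__ == "__main__":
    d_def, d_alt = error_matrix_forms()
    print(np.max(np.abs(d_def - d_alt)), abs(np.trace(d_def)) / d_def.shape[0])
```

The second printed number is the normalised trace $|〈D〉|$, which one expects to be much smaller than individual entries of D; this is the averaged smallness referred to above.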

Remark 3.1

We remark that the proofs of Theorem 2.5, and Corollaries 2.6 and 2.8 use the independence assumption on the entries of W only very locally. In fact, only the proof of a specific bound on D (see (3.15) later), which follows directly from the main result of the diagrammatic cumulant expansion, Theorem 3.7, uses the vector structure and the specific form of $S$ in (3.2) at all. Therefore, assuming (3.15) as an input, our proof of Theorem 2.5 remains valid also in the correlated setting of [11, 34], as long as $S$ is flat (see (3.6) below), and Assumption (C) is replaced by the corresponding assumption on the boundedness of $‖ M ‖$ .

For brevity we will carry out the proof of Theorem 2.5 only in the vicinity of almost cusps as the local law in all other regimes was already proven in [8, 11] to optimality. Therefore, within this section we will always assume that $z = τ + i η = τ_{0} + ω + i η \in H$ lies inside a small neighbourhood

\begin{matrix} D_{cusp} : = {z \in H | |z - τ_{0}| \leq c}, \end{matrix}

of the location $τ_{0}$ of a local minimum of the scDOS within the self-consistent spectrum $supp ρ$ . Here c is a sufficiently small constant depending only on the model parameters. We will further assume that either (i) $ρ (τ_{0}) \geq 0$ is sufficiently small and $τ_{0}$ is the location of a cusp or internal minimum, or (ii) $ρ (τ_{0}) = 0$ and $τ_{0}$ is an edge adjacent to a sufficiently small gap of length $Δ > 0$ . The results from [10] guarantee that these are the only possibilities for the shape of $ρ$ , see (2.4). In other words, we assume that $τ_{0} \in supp ρ$ is a local minimum of $ρ$ with a shape close to a cusp (cf. (2.4)). For concreteness we will also assume that if $τ_{0}$ is an edge, then it is a right edge (with a gap of length $Δ > 0$ to the right) and $ω \in (- c, \frac{Δ}{2}]$ . The case when $τ_{0}$ is a left edge has the same proof.

We now introduce a quantity that will play an important role in the cusp fluctuation averaging mechanism. We define

3.5a

where $R M : = (M + M^{*}) / 2$ is the real part of $M = M (z)$ . It was proven in [10, Lemma 5.5] that $σ (z)$ extends to the real line as a 1/3-Hölder continuous function wherever the scDOS $ρ$ is smaller than some threshold $c \sim 1$ , i.e. $ρ \leq c$ . In the specific case of $S$ as in (3.2) the definition simplifies to

\begin{matrix} σ (z) : = 〈p, f^{3}〉 = \frac{1}{N} \sum_{i = 1}^{N} \frac{{(I m_{i} (z))}^{3} sgn R m_{i} (z)}{{ρ (z)}^{3} {|m_{i} (z)|}^{3}}, \quad f : = \frac{I m}{ρ |m|}, \quad p : = sgn R m, \end{matrix}

3.5b

since $M = diag (m)$ is diagonal, where multiplication and division of vectors are understood entrywise. When evaluated at the location $τ_{0}$ the scalar $σ (τ_{0})$ provides a measure of how far the shape of the singularity at $τ_{0}$ is from an exact cusp. In fact, if $σ (τ_{0}) = 0$ and $ρ (τ_{0}) = 0$ , then $τ_{0}$ is a cusp location. To see the relationship between the emergence of a cusp and the limit $σ (τ_{0}) \to 0$ , we refer to [10, Theorem 7.7 and Lemma 6.3]. The analogues of the quantities $f, p$ and $σ$ in (3.5b) are denoted by $f_{u}, s$ and $σ$ in [10], respectively. The significance of $σ$ for the classification of singularity types in Wigner-type ensembles was first realised in [5]. Although in this paper we will use only [10] and will not rely on [5], we remark that the definition of $σ$ in [5, Eq. (8.11)] differs slightly from the definition (3.5b). However, both definitions equally fulfil the purpose of classifying singularity types, since the ensuing scalar quantities $σ$ are comparable inside the self-consistent spectrum. For the interested reader, we briefly relate our notations to the respective conventions in [10] and [5]. The quantity denoted by f in both [10] and [5] is the normalized eigendirection of the saturated self-energy operator F in the respective settings and is related to $f$ from (3.5b) via $f = f / ‖ f ‖ + O (η / ρ)$ . Moreover, $σ$ in [5] is defined as $〈f^{3} sgn R m〉$ , justifying the comparability to $σ$ from (3.5b).

Stability and shape analysis

From (3.1) and (3.4) we obtain the quadratic stability equation

\begin{matrix} B [G - M] = - M D + M S [G - M] (G - M), \end{matrix}

for the difference $G - M$ . In order to apply the results of [10] to the stability operator $B$ , we first have to check that the flatness condition [10, Eq. (3.10)] is satisfied for the self-energy operator $S$ . We claim that $S$ is flat, i.e.

\begin{matrix} S [R] \sim 〈R〉 1 = \frac{1}{N} (Tr R) 1, \end{matrix}

3.6

as quadratic forms for any positive semidefinite $R \in C^{N \times N}$ . We remark that in the earlier paper [8] in the Wigner-type case only the upper bound $s_{ij} \leq C / N$ defined the concept of flatness. Here with the definition (3.6) we follow the convention of the more recent works [10, 11, 34] which is more conceptual. We also warn the reader, that in the complex Hermitian Wigner-type case the condition $c / N \leq s_{ij} \leq C / N$ implies (3.6) only if $t_{ij}$ is bounded away from $- s_{ij}$ .

However, the flatness (3.6) is an immediate consequence of the fullness Assumption (B). Indeed, (B) is equivalent to the condition that the covariance operator $Σ$ of all entries above and on the diagonal, defined as $Σ_{a b, c d} : = E w_{ab} w_{cd}$ , is uniformly strictly positive definite. This implies that $Σ \geq c Σ_{G}$ for some constant $c \sim 1$ , where $Σ_{G}$ is the covariance operator of a GUE or GOE matrix, depending on the symmetry class we consider. This means that $S$ can be split into $S = S_{0} + c S_{G}$ , where $S_{G}$ and $S_{0}$ are the self-energy operators corresponding to $Σ_{G}$ and $Σ - c Σ_{G}$ , respectively. It is now an easy exercise to check that $S_{G}$ and thus $S$ is flat.

In particular, [10, Proposition 3.5 and Lemma 4.8] are applicable implying that [10, Assumption 4.5] is satisfied. Thus, according to [10, Lemma 5.1] for spectral parameters z in a neighbourhood of $τ_{0}$ the operator $B$ has a unique isolated eigenvalue $β$ of smallest modulus and associated right $B [V_{r}] = β V_{r}$ and left $B^{*} [V_{l}] = \bar{β} V_{l}$ eigendirections normalised such that $‖ V_{r} ‖_{hs} = ⟨ V_{l}, V_{r} ⟩ = 1$ . We denote the spectral projections to $V_{r}$ and to its complement by $P : = 〈V_{l}, \cdot〉 V_{r}$ and $Q : = 1 - P$ . For convenience of the reader we now collect some important quantitative information about the stability operator and its unstable direction from [10].

Proposition 3.2

(Properties of the MDE and its solution). The following statements hold true uniformly in $z = τ_{0} + ω + i η \in D_{cusp}$ assuming flatness as in (3.6) and the uniform boundedness of $‖ M ‖$ for $z \in τ_{0} + (- κ, κ) + i R_{+}$ ,

(i)
The eigendirections $V_{l}, V_{r}$ are norm-bounded and the operator $B^{- 1}$ is bounded on the complement to its unstable direction, i.e.
$\begin{matrix} {‖ B^{- 1} Q ‖}_{hs \to hs} + ‖ V_{r} ‖ + ‖ V_{l} ‖ ≲ 1 . \end{matrix}$ 3.7a
(ii)
The density $ρ$ is comparable with the explicit function $ρ (τ_{0} + ω + i η) \sim \tilde{ρ} (τ_{0} + ω + i η)$ given by
$\begin{matrix} \tilde{ρ} : = \begin{cases} ρ (τ_{0}) + {(|ω| + η)}^{1 / 3}, & \text{in cases (i), (iii) if } τ_{0} = m, c, \\ {(|ω| + η)}^{1 / 2} {(Δ + |ω| + η)}^{- 1 / 6}, & \text{in case (ii) if } τ_{0} = e_{-}, ω \in [- c, 0], \\ η {(Δ + |ω| + η)}^{- 1 / 6} {(|ω| + η)}^{- 1 / 2}, & \text{in case (ii) if } τ_{0} = e_{-}, ω \in [0, Δ / 2] . \end{cases} \end{matrix}$ 3.7b
(iii)
The eigenvalue $β$ of smallest modulus satisfies
$\begin{matrix} |β| \sim \frac{η}{ρ} + ρ (ρ + |σ|), \end{matrix}$ 3.7c
and we have the comparison relations
$\begin{matrix} \begin{matrix} |〈V_{l}, M S [V_{r}] V_{r}〉| \sim ρ + |σ|, \\ |〈V_{l}, M S [V_{r}] B^{- 1} Q [M S [V_{r}] V_{r}] + M S B^{- 1} Q [M S [V_{r}] V_{r}] V_{r}〉| \sim 1 . \end{matrix} \end{matrix}$ 3.7d
(iv)
The quantities $η / ρ + ρ (ρ + |σ|)$ and $ρ + |σ|$ in (3.7c)–(3.7d) can be replaced by the following more explicit auxiliary quantities
$\begin{matrix} \begin{matrix} {\tilde{ξ}}_{1} (τ_{0} + ω + i η) & : = \begin{cases} {(|ω| + η)}^{1 / 2} {(|ω| + η + Δ)}^{1 / 6}, & \text{if } τ_{0} = e_{-}, \\ {(ρ (τ_{0}) + {(|ω| + η)}^{1 / 3})}^{2}, & \text{if } τ_{0} = m, c, \end{cases} \\ {\tilde{ξ}}_{2} (τ_{0} + ω + i η) & : = \begin{cases} {(|ω| + η + Δ)}^{1 / 3}, & \text{if } τ_{0} = e_{-}, \\ ρ (τ_{0}) + {(|ω| + η)}^{1 / 3}, & \text{if } τ_{0} = m, c, \end{cases} \end{matrix} \end{matrix}$ 3.7e
which are monotonically increasing in $η$ . More precisely, it holds that $η / ρ + ρ (ρ + |σ|) \sim {\tilde{ξ}}_{1}$ and, in the case where $τ_{0} = c, m$ is a cusp or a non-zero local minimum, we also have that $ρ + |σ| \sim {\tilde{ξ}}_{2}$ . For the case when $τ_{0} = e_{-}$ is a right edge next to a gap of size $Δ$ there exists a constant $c_{*}$ such that $ρ + |σ| \sim {\tilde{ξ}}_{2}$ in the regime $ω \in [- c, c_{*} Δ]$ and $ρ + |σ| ≲ {\tilde{ξ}}_{2}$ in the regime $ω \in [c_{*} Δ, Δ / 2]$ .

Proof

We first explain how to translate the notations from the present paper to the notations in [10]: The operators $S, B, Q$ are simply denoted by S, B, Q in [10]; the matrices $V_{l}, V_{r}$ here are denoted by $l / ⟨ l, b ⟩, b$ there. The bound on $B^{- 1} Q$ in (3.7a) follows directly from [10, Eq. (5.15)]. The bounds on $V_{l}, V_{r}$ in (3.7a) follow from the definition of the stability operator (3.3) together with the fact that $‖ M ‖ ≲ 1$ (by Assumption (C)) and ${‖ S ‖}_{hs \to ‖ \cdot ‖} ≲ 1$ , following from the upper bound in flatness (3.6). The asymptotic expansion of $ρ$ in (3.7b) follows from [10, Remark 7.3] and [5, Corollary A.1]. The claims in (iii) follow directly from [10, Proposition 6.1]. Finally, the claims in (iv) follow directly from [10, Remark 10.4]. $□$
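The quadratic stability equation for $G - M$ stated at the beginning of this subsection is a short algebraic consequence of (3.1) and (3.4). A sketch of the computation, assuming only that $M$ is invertible (which holds since $I M > 0$) and writing $S$, $B$ for the self-energy and stability operators:

```latex
\begin{aligned}
\text{(3.1)} \;&\Longrightarrow\; z - A = -M^{-1} - S[M], \\
\text{(3.4)} \;&\Longrightarrow\; D = 1 + (z - A + S[G])\,G = 1 - M^{-1} G + S[G - M]\,G, \\
&\Longrightarrow\; M D = M - G + M S[G - M]\,G, \\
&\Longrightarrow\; G - M = -M D + M S[G - M]\,M + M S[G - M]\,(G - M), \\
&\Longrightarrow\; B[G - M] = (G - M) - M S[G - M]\,M = -M D + M S[G - M]\,(G - M).
\end{aligned}
```

The last line uses the definition (3.3) of $B$; the term $M S [G - M] (G - M)$ is quadratic in the small quantity $G - M$.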

The following lemma establishes simplified lower bounds on ${\tilde{ξ}}_{1}, {\tilde{ξ}}_{2}$ whenever $η$ is much larger than the fluctuation scale $η_{f}$ . We defer the proof of the technical lemma which differentiates various regimes to the Appendix.

Lemma 3.3

Under the assumptions of Proposition 3.2 we have uniformly in $z = τ_{0} + ω + i η \in D_{cusp}$ with $η \geq η_{f}$ that

\begin{matrix} {\tilde{ξ}}_{2} ≳ \frac{1}{N η} + (\frac{ρ}{N η})^{1 / 2}, {\tilde{ξ}}_{1} ≳ {\tilde{ξ}}_{2} (ρ + \frac{1}{N η}) . \end{matrix}

We now define an appropriate matrix norm in which we will measure the distance between G and M. The ${‖ \cdot ‖}_{*}$ -norm is defined exactly as in [11] and similar to the one first introduced in [34]. It is a norm comparing matrix elements on a large but finite set of vectors with a hierarchical structure. To define this set we introduce some notations. For second order cumulants of matrix elements $κ (w_{ab}, w_{cd}) : = E w_{ab} w_{cd}$ we use the short-hand notation $κ (a b, c d)$ . We also use the short-hand notation $κ (x b, c d)$ for the $x = {(x_{a})}_{a \in [N]}$ -weighted linear combination $\sum_{a} x_{a} κ (a b, c d)$ of such cumulants. We use the notation that replacing an index in a scalar quantity by a dot ( $\cdot$ ) refers to the corresponding vector, e.g. $A_{a \cdot}$ is a short-hand notation for the vector ${(A_{ab})}_{b \in [N]}$ . Matrices $R_{x y}$ with vector subscripts $x, y$ are understood as short-hand notations for $〈x, R y〉$ , and matrices $R_{x a}$ with mixed vector and index subscripts are understood as $〈x, R e_{a}〉$ with $e_{a}$ being the ath normalized $‖ e_{a} ‖ = 1$ standard basis vector. We fix two vectors $x, y$ and some large integer K and define the sets of vectors

\begin{matrix} \begin{matrix} I_{0} & : = {x, y} \cup {δ_{a \cdot}, {(V_{l}^{*})}_{a \cdot} | a \in [N]}, \\ I_{k + 1} & : = I_{k} \cup {M u | u \in I_{k}} \cup {κ_{c} ((M u) a, b \cdot), κ_{d} ((M u) a, \cdot b) | u \in I_{k}, a, b \in [N]} . \end{matrix} \end{matrix}

Here the cross and the direct part $κ_{c}, κ_{d}$ of the 2-cumulants $κ (\cdot, \cdot)$ refer to the natural splitting dictated by the Hermitian symmetry. In the specific case of (3.2) we simply have $κ_{c} (a b, c d) = δ_{ad} δ_{bc} s_{ab}$ and $κ_{d} (a b, c d) = δ_{ac} δ_{bd} t_{ab}$ . Then the ${‖ \cdot ‖}_{*}$ -norm is given by

\begin{matrix} {‖ R ‖}_{*} = {‖ R ‖}_{*}^{K, x, y} : = \sum_{0 \leq k < K} N^{- k / 2 K} {‖ R ‖}_{I_{k}} + N^{- 1 / 2} max_{u \in I_{K}} \frac{‖ R_{\cdot u} ‖}{‖ u ‖}, {‖ R ‖}_{I} : = max_{u, v \in I} \frac{|R_{u v}|}{‖ u ‖ ‖ v ‖} . \end{matrix}

We remark that the set $I_{k}$ hence also ${‖ \cdot ‖}_{*}$ depend on z via $M = M (z)$ . We omit this dependence from the notation as it plays no role in the estimates.

In terms of this norm we obtain the following estimate on $G - M$ in terms of its projection $Θ = 〈V_{l}, G - M〉$ onto the unstable direction of the stability operator $B$ . It is a direct consequence of a general expansion of approximate quadratic matrix equations whose linear stability operators have a single eigenvalue close to 0, as given in Lemma A.1.

Proposition 3.4

(Cubic equation for $Θ$ ). Fix $K \in N$ , $x, y \in C^{N}$ and use ${‖ \cdot ‖}_{*} = {‖ \cdot ‖}_{*}^{K, x, y}$ . For fixed $z \in D_{cusp}$ and on the event that ${‖ G - M ‖}_{*} + {‖ D ‖}_{*} ≲ N^{- 10 / K}$ the difference $G - M$ admits the expansion

\begin{matrix} \begin{matrix} G - M & = Θ V_{r} - B^{- 1} Q [M D] + Θ^{2} B^{- 1} Q [M S [V_{r}] V_{r}] + E, \\ {‖ E ‖}_{*} & ≲ N^{5 / K} ({|Θ|}^{3} + |Θ| {‖ D ‖}_{*} + {‖ D ‖}_{*}^{2}), \end{matrix} \end{matrix}

3.8a

with an error matrix E and the scalar $Θ : = 〈V_{l}, G - M〉$ that satisfies the approximate cubic equation

\begin{matrix} Θ^{3} + ξ_{2} Θ^{2} + ξ_{1} Θ = ϵ_{*} . \end{matrix}

3.8b

Here, the error $ϵ_{*}$ satisfies the upper bound

\begin{matrix} |ϵ_{*}| ≲ N^{20 / K} ({‖ D ‖}_{*}^{3} + {|〈R, D〉|}^{3 / 2}) + |〈V_{l}, M D〉| \\ + |〈V_{l}, M (S B^{- 1} Q [M D]) (B^{- 1} Q [M D])〉|, \end{matrix}

3.8c

where R is a deterministic matrix with $‖ R ‖ ≲ 1$ and the coefficients of the cubic equation satisfy the comparison relations

\begin{matrix} |ξ_{1}| \sim \frac{η}{ρ} + ρ (ρ + |σ|), |ξ_{2}| \sim ρ + |σ| . \end{matrix}

3.8d
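The difficulty near a cusp can be read off from the cubic equation (3.8b): when both $ξ_{1}$ and $ξ_{2}$ are small, a given error $ϵ_{*}$ only controls $Θ$ at the level ${|ϵ_{*}|}^{1 / 3}$ instead of $|ϵ_{*}| / |ξ_{1}|$. The following sketch makes this quantitative for three hypothetical coefficient choices (linear, quadratic and cubic regime); the function name and the numerical values are illustrative assumptions, not quantities from the paper.

```python
import numpy as np

def smallest_root(xi1, xi2, eps):
    """Smallest-modulus root of Theta^3 + xi2 Theta^2 + xi1 Theta - eps = 0."""
    roots = np.roots([1.0, xi2, xi1, -eps])
    return roots[np.argmin(np.abs(roots))]

if __name__ == "__main__":
    eps = 1e-6
    # linear regime: xi1 dominates, |Theta| is about eps / |xi1|
    print(abs(smallest_root(1.0, 0.0, eps)))
    # quadratic regime: xi1 = 0, |Theta| is about (eps / |xi2|)^(1/2)
    print(abs(smallest_root(0.0, 1.0, eps)))
    # cubic regime (exact cusp, eta -> 0): |Theta| = eps^(1/3)
    print(abs(smallest_root(0.0, 0.0, eps)))
```

This is why the improved bound on $ϵ_{*}$ from cusp fluctuation averaging is needed: in the cubic regime any loss in $ϵ_{*}$ translates into a cube-root loss in $Θ$.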

Proof

We first establish some important bounds involving the ${‖ \cdot ‖}_{*}$ -norm. We claim that for any matrices $R, R_{1}, R_{2}$

\begin{matrix} \begin{matrix} {‖ M S [R_{1}] R_{2} ‖}_{*} ≲ N^{1 / 2 K} {‖ R_{1} ‖}_{*} {‖ R_{2} ‖}_{*}, \quad {‖ M R ‖}_{*} ≲ N^{1 / 2 K} {‖ R ‖}_{*}, \\ {‖ Q ‖}_{* \to *} ≲ 1, \quad {‖ B^{- 1} Q ‖}_{* \to *} ≲ 1, \quad |〈V_{l}, R〉| ≲ {‖ R ‖}_{*} . \end{matrix} \end{matrix}

3.9

The proof of (3.9) follows verbatim as in [11, Lemma 3.4] with (3.7a) as an input. Moreover, the bound on $〈V_{l}, \cdot〉$ follows directly from the bound on $Q$ . Obviously, we also have ${‖ \cdot ‖}_{*} \leq 2 ‖ \cdot ‖$ .

Next, we apply Lemma A.1 from the Appendix with the choices

\begin{matrix} A [R_{1}, R_{2}] : = M S [R_{1}] R_{2}, X : = M D, Y : = G - M . \end{matrix}

The operator $B$ in Lemma A.1 is chosen as the stability operator (3.3). Then (A.1) is satisfied with $λ : = N^{1 / 2 K}$ according to (3.9) and (3.7a). With $δ : = N^{- 25 / 4 K}$ we verify (3.8a) directly from (A.5), where $Θ = 〈V_{l}, G - M〉$ satisfies

\begin{matrix} μ_{3} Θ^{3} + μ_{2} Θ^{2} - β Θ = - μ_{0} + 〈R, D〉 Θ + O (N^{- 1 / 4 K} {|Θ|}^{3} + N^{20 / K} {‖ D ‖}_{*}^{3}) . \end{matrix}

3.10

Here we used $|Θ| \leq {‖ G - M ‖}_{*} ≲ N^{- 10 / K}$ and ${‖ M D ‖}_{*} ≲ N^{1 / 2 K} {‖ D ‖}_{*}$ . The coefficients $μ_{0}, μ_{2}, μ_{3}$ are defined through (A.4) and R is given by

\begin{matrix} R : = M^{*} {(B^{- 1} Q)}^{*} [S [M^{*} V_{l} V_{r}^{*}] + S [V_{r}^{*}] M^{*} V_{l}] . \end{matrix}

Now we bound $|〈R, D〉 Θ| \leq N^{- 1 / 4 K} {|Θ|}^{3} + N^{1 / 8 K} {|〈R, D〉|}^{3 / 2}$ by Young’s inequality, absorb the error terms bounded by $N^{- 1 / 4 K} {|Θ|}^{3}$ into the cubic term, $μ_{3} Θ^{3} + O (N^{- 1 / 4 K} {|Θ|}^{3}) = {\tilde{μ}}_{3} Θ^{3}$ , by introducing a modified coefficient ${\tilde{μ}}_{3}$ and use that $|μ_{3}| \sim |{\tilde{μ}}_{3}| \sim 1$ for any $z \in D_{cusp}$ . Finally, we safely divide (3.10) by ${\tilde{μ}}_{3}$ to verify (3.8b) with $ξ_{1} : = - β / {\tilde{μ}}_{3}$ and $ξ_{2} : = μ_{2} / {\tilde{μ}}_{3}$ . For the fact $|μ_{3}| \sim 1$ on $D_{cusp}$ and the comparison relations (3.8d) we refer to (3.7c)–(3.7d). $□$
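For completeness, the Young inequality step in the proof above can be spelled out as follows. The exponents are read as $N^{- 1 / 4 K} = N^{- 1 / (4 K)}$ and $N^{1 / 8 K} = N^{1 / (8 K)}$; the auxiliary symbols $a, b$ are introduced only for this computation.

```latex
\begin{aligned}
& ab \le \frac{a^{3}}{3} + \frac{2\, b^{3/2}}{3} \le a^{3} + b^{3/2}
  \qquad \text{for } a, b \ge 0 \quad (\text{Young with exponents } 3 \text{ and } 3/2), \\
& a := N^{-1/(12K)}\, |\Theta|, \qquad b := N^{1/(12K)}\, |\langle R, D \rangle|, \\
& |\langle R, D \rangle\, \Theta| = ab
  \le N^{-1/(4K)}\, |\Theta|^{3} + N^{1/(8K)}\, |\langle R, D \rangle|^{3/2}.
\end{aligned}
```

The second term is bounded by $N^{20 / K} {|〈R, D〉|}^{3 / 2}$ and therefore fits into the error bound (3.8c).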

Probabilistic bound

We now collect bounds on the error matrix D from [34, Theorem 4.1] and Sect. 4. We first introduce the notion of stochastic domination.

Definition 3.5

(Stochastic domination). Let $X = X^{(N)}, Y = Y^{(N)}$ be sequences of non-negative random variables. We say that X is stochastically dominated by Y (and use the notation $X ≺ Y$ ) if

\begin{matrix} P [X > N^{ϵ} Y] \leq C (ϵ, ν) N^{- ν}, N \in N, \end{matrix}

for any $ϵ > 0, ν \in N$ and some family of positive constants $C (ϵ, ν)$ that is uniform in N and other underlying parameters (e.g. the spectral parameter z in the domain under consideration).

It can be checked (see [33, Lemma 4.4]) that $≺$ satisfies the usual arithmetic properties, e.g. if $X_{1} ≺ Y_{1}$ and $X_{2} ≺ Y_{2}$ , then also $X_{1} + X_{2} ≺ Y_{1} + Y_{2}$ and $X_{1} X_{2} ≺ Y_{1} Y_{2}$ . Furthermore, to formulate bounds on a random matrix R compactly, we introduce the notations

for random matrices R and a deterministic control parameter $Λ = Λ (z)$ . We also introduce high moment norms

\begin{matrix} {‖ X ‖}_{p} : = (E {|X|}^{p})^{1 / p}, {‖ R ‖}_{p} : = sup_{x, y} \frac{‖ 〈x, R y〉 ‖_{p}}{‖ x ‖ ‖ y ‖} \end{matrix}

for $p \geq 1$ , scalar valued random variables X and random matrices R. To translate high moment bounds into high probability bounds and vice versa we have the following easy lemma [11, Lemma 3.7].

Lemma 3.6

Let R be a random matrix, $Φ$ a deterministic control parameter such that $Φ \geq N^{- C}$ and $‖ R ‖ \leq N^{C}$ for some $C > 0$ , and let $K \in N$ be a fixed integer. Then we have the equivalences

\begin{matrix} {‖ R ‖}_{*}^{K, x, y} ≺ Φ \text{ uniformly in } x, y \quad ⟺ \quad |R| ≺ Φ \quad ⟺ \quad {‖ R ‖}_{p} \leq_{p, ϵ} N^{ϵ} Φ \text{ for all } ϵ > 0, p \geq 1 . \end{matrix}
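The passage from high moment bounds to high probability bounds in Lemma 3.6 rests on Markov's inequality. A sketch for a scalar random variable $X$ (for instance $X = 〈x, R y〉 / ‖ x ‖ ‖ y ‖$ with fixed deterministic $x, y$), assuming that $\leq_{p, ϵ}$ stands for an inequality up to a constant $C (p, ϵ)$:

```latex
\begin{aligned}
\mathbf{P}\big[\, |X| > N^{\epsilon} \Phi \,\big]
 &\le \frac{\mathbf{E} |X|^{p}}{N^{\epsilon p}\, \Phi^{p}}
  = \frac{\| X \|_{p}^{p}}{N^{\epsilon p}\, \Phi^{p}}
  \le \frac{\big( C(p, \epsilon/2)\, N^{\epsilon/2}\, \Phi \big)^{p}}{N^{\epsilon p}\, \Phi^{p}}
  = C(p, \epsilon/2)^{p}\, N^{-\epsilon p / 2}, \\
 &\le C'(\epsilon, \nu)\, N^{-\nu} \qquad \text{upon choosing } p \ge 2\nu / \epsilon .
\end{aligned}
```

The moment bound was applied with $ϵ / 2$ in place of $ϵ$; since $ϵ$ and $ν$ are arbitrary, this is exactly the form required in Definition 3.5.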

Expressed in terms of the ${‖ \cdot ‖}_{p}$ -norm we have the following high-moment bounds on the error matrix D. The bounds (3.11a)–(3.11b) have already been established in [34, Theorem 4.1]; we just list them for completeness. The bounds (3.11c)–(3.11d), however, are new and they capture the additional cancellation at the cusp and are the core novelty of the present paper. The additional smallness comes from averaging against specific weights $p, f$ from (3.5b).

Theorem 3.7

(High moment bound on D with cusp fluctuation averaging). Under the assumptions of Theorem 2.5 for any compact set $D \subset {z \in C | I z \geq N^{- 1}}$ there exists a constant C such that for any $p \geq 1, ϵ > 0$ , $z \in D$ and matrices/vectors $B, x, y$ it holds that

\begin{matrix} ‖ 〈x, D y〉 ‖_{p} \leq_{ϵ, p} ‖ x ‖ ‖ y ‖ N^{ϵ} ψ_{q}^{'} (1 + {‖ G ‖}_{q})^{C} (1 + \frac{{‖ G ‖}_{q}}{\sqrt{N}})^{Cp}, \end{matrix}

3.11a

\begin{matrix} ‖ 〈B, D〉 ‖_{p} \leq_{ϵ, p} ‖ B ‖ N^{ϵ} [ψ_{q}^{'}]^{2} (1 + {‖ G ‖}_{q})^{C} (1 + \frac{{‖ G ‖}_{q}}{\sqrt{N}})^{Cp} . \end{matrix}

3.11b

Moreover, for the specific weight matrix $B = diag (p f)$ we have the improved bound

\begin{matrix} ‖ 〈diag (p f) D〉 ‖_{p} & \leq_{ϵ, p} N^{ϵ} σ_{q} [ψ + ψ_{q}^{'}]^{2} (1 + {‖ G ‖}_{q})^{C} (1 + \frac{{‖ G ‖}_{q}}{\sqrt{N}})^{Cp}, \end{matrix}

3.11c

and the improved bound on the off-diagonal component

\begin{matrix} ‖ 〈diag (p f) [T ⊙ G^{t}] G〉 ‖_{p} & \leq_{ϵ, p} N^{ϵ} σ_{q} [ψ + ψ_{q}^{'}]^{2} (1 + {‖ G ‖}_{q})^{C} (1 + \frac{{‖ G ‖}_{q}}{\sqrt{N}})^{Cp}, \end{matrix}

3.11d

where we defined the following z-dependent quantities

\begin{matrix} ψ : = \sqrt{\frac{ρ}{N η}}, ψ_{q}^{'} : = \sqrt{\frac{{‖ I G ‖}_{q}}{N η}}, ψ_{q}^{''} : = {‖ G - M ‖}_{q}, σ_{q} : = |σ| + ρ + ψ + \sqrt{η / ρ} + ψ_{q}^{'} + ψ_{q}^{''} \end{matrix}

and $q = C p^{3} / ϵ$ .

Theorem 3.7 will be proved in Sect. 4. We now translate the high moment bounds of Theorem 3.7 into high probability bounds via Lemma 3.6 and use those to establish bounds on $G - M$ and the error in the cubic equation for $Θ$ . To simplify the expressions we formulate the bounds in the domain

\begin{matrix} D_{ζ} : = {z \in D_{cusp} | I z \geq N^{- 1 + ζ}} . \end{matrix}

3.12

Lemma 3.8

(High probability error bounds). Fix $ζ, c > 0$ sufficiently small and suppose that $|G - M| ≺ Λ$ , $|I (G - M)| ≺ Ξ$ and $|Θ| ≺ θ$ hold at fixed $z \in D_{ζ}$ , and assume that the deterministic control parameters $Λ, Ξ, θ$ satisfy $Λ + Ξ + θ ≲ N^{- c}$ . Then for any sufficiently small $ϵ > 0$ it holds that

\begin{matrix} |Θ^{3} + ξ_{2} Θ^{2} + ξ_{1} Θ| ≺ N^{2 ϵ} (ρ + |σ| + \frac{η^{1 / 2}}{ρ^{1 / 2}} + (\frac{ρ + Ξ}{N η})^{1 / 2}) \frac{ρ + Ξ}{N η} + N^{- ϵ} θ^{3}, \end{matrix}

3.13a

as well as

\begin{matrix} |G - M| ≺ θ + \sqrt{\frac{ρ + Ξ}{N η}}, {|G - M|}_{av} ≺ θ + \frac{ρ + Ξ}{N η}, \end{matrix}

3.13b

where the coefficients $ξ_{1}, ξ_{2}$ are those from Proposition 3.4, and we recall that $Θ = 〈V_{l}, G - M〉$ .

Proof

We translate the high moment bounds (3.11a)–(3.11b) into high probability bounds using Lemma 3.6 and $|G| ≺ ‖ M ‖ + Λ ≲ 1$ to find

\begin{matrix} |D| ≺ \sqrt{\frac{ρ + Ξ}{N η}}, {|D|}_{av} ≺ \frac{ρ + Ξ}{N η} . \end{matrix}

3.14

In particular, these bounds together with the assumed bounds on $G - M$ guarantee the applicability of Proposition 3.4. Now we use (3.14) and (3.9) in (3.8a) to get (3.13b). Here we used (3.9), translated ${‖ \cdot ‖}_{p}$ -bounds into $≺$ -bounds on ${‖ \cdot ‖}_{*}$ and vice versa via Lemma 3.6, and absorbed the $N^{1 / K}$ factors into $≺$ by using that K can be chosen arbitrarily large. It remains to verify (3.13a). In order to do so, we first claim that

\begin{matrix} \begin{matrix} |〈V_{l}, M D〉| + |〈V_{l}, M (S B^{- 1} Q [M D]) (B^{- 1} Q [M D])〉| \\ ≺ N^{ϵ} (|σ| + ρ + \frac{η^{1 / 2}}{ρ^{1 / 2}} + Λ + (\frac{ρ + Ξ}{N η})^{1 / 2}) \frac{ρ + Ξ}{N η} + θ^{2} (N^{- ϵ} Λ + (\frac{ρ + Ξ}{N η})^{1 / 2}) \end{matrix} \end{matrix}

3.15

for any sufficiently small $ϵ > 0$ .

Proof of (3.15)

We first collect two additional ingredients from [10] specific to the vector case.

(a)
The imaginary part $I m$ of the solution $m$ is comparable to its average, $I m \sim 〈I m〉 = π ρ$ , in the sense that $c 〈I m〉 \leq I m_{i} \leq C 〈I m〉$ for all $i$ and some $c, C > 0$ , and, in particular, $m = R m + O (ρ)$ .
(b)
The eigendirections $V_{l}, V_{r}$ are diagonal and are approximately given by
$\begin{matrix} V_{l} = c diag (f / |m|) + O (ρ + η / ρ), \quad V_{r} = c^{'} diag (f |m|) + O (ρ + η / ρ) \end{matrix}$ 3.16
for some constants $c, c^{'} \sim 1$ .

Indeed, (a) follows directly from [10, Proposition 3.5] and the approximations in (3.16) follow directly from [10, Corollary 5.2]. The fact that $V_{l}, V_{r}$ are diagonal follows from simplicity of the eigendirections in the matrix case, and the fact that $M = diag (m)$ is diagonal and that $B$ preserves the space of diagonal matrices as well as the space of off-diagonal matrices. On the latter $B$ acts stably as $1 + O_{hs \to hs} (N^{- 1})$ . Thus the unstable directions lie inside the space of diagonal matrices.

We now turn to the proof of (3.15) and first note that, according to (a) and (b) we have

\begin{matrix} M = diag (p |m|) + O (ρ), V_{l} = c diag (f / |m|) + O (ρ + η / ρ) \end{matrix}

3.17

with errors in $‖ \cdot ‖$ -norm-sense, for some constant $c \sim 1$ to see

\begin{matrix} 〈V_{l}, M D〉 = c 〈diag (p f) D〉 + O (ρ + η / ρ) 〈diag (w_{1}) D〉, \end{matrix}

where $w_{1} \in C^{N}$ is a deterministic vector with uniformly bounded entries. Since $|〈diag (w_{1}) D〉| ≺ (ρ + Ξ) / N η$ by (3.14), the bound on the first term in (3.15) follows together with (3.11c) via Lemma 3.6. Now we consider the second term in (3.15). We split $D = D_{d} + D_{o}$ into its diagonal and off-diagonal components. Since $B$ and $S$ preserve the space of diagonal and the space of off-diagonal matrices we find

\begin{matrix} \begin{matrix} 〈V_{l}, M (S B^{- 1} Q [M D]) (B^{- 1} Q [M D])〉 \\ = \frac{1}{N^{2}} \sum_{i, j} u_{ij} d_{ii} d_{jj} + 〈V_{l}, M (S B^{- 1} Q [M D_{o}]) (B^{- 1} Q [M D_{o}])〉, \end{matrix} \end{matrix}

3.18

with an appropriate deterministic matrix $u_{ij}$ having bounded entries. In particular, the cross terms vanish and the first term is bounded by

3.19

according to (3.14). By taking the off-diagonal part of (3.8a) and using the fact that M and $V_{r}$ and therefore also $B^{- 1} Q [M S [V_{r}] V_{r}]$ are diagonal (cf. (b) above) we have

\begin{matrix} |B^{- 1} Q [M D_{o}] + G_{o}| ≺ θ^{3} + θ (\frac{ρ + Ξ}{N η})^{1 / 2} + \frac{ρ + Ξ}{N η} ≲ N^{- ϵ} θ^{2} + N^{ϵ} \frac{ρ + Ξ}{N η} \end{matrix}

for any $ϵ$ such that $θ ≲ N^{- ϵ}$ by Young’s inequality in the last step. Together with (3.17), (3.14) and the assumption that $|G_{o}| = |{(G - M)}_{o}| ≺ Λ$ we then compute

\begin{matrix} \begin{matrix} 〈V_{l}, M (S B^{- 1} Q [M D_{o}]) (B^{- 1} Q [M D_{o}])〉 \\ = c 〈diag (p f) (S B^{- 1} Q [M D_{o}]) (B^{- 1} Q [M D_{o}])〉 + O ((ρ + \frac{η}{ρ}) \frac{ρ + Ξ}{N η}) \\ = c 〈diag (p f) S [G_{o}] G_{o}〉 + O ((ρ + \frac{η}{ρ}) \frac{ρ + Ξ}{N η} + ((\frac{ρ + Ξ}{N η})^{1 / 2} + Λ) [N^{- ϵ} θ^{2} + N^{ϵ} \frac{ρ + Ξ}{N η}]) . \end{matrix} \end{matrix}

Thus the bound on the second term on the lhs. in (3.15) follows together with (3.18)–(3.19) by $S [G_{o}] = T ⊙ G^{t}$ and (3.11d) via Lemma 3.6. This completes the proof of (3.15). $□$

With (3.14) and (3.15) the upper bound (3.8c) on the error $ϵ_{*}$ of the cubic equation (3.8b) takes the same form as the rhs. of (3.15) if K is sufficiently large depending on $ϵ$ . By the first estimate in (3.13b) we can redefine the control parameter $Λ$ on $|G - M|$ as $Λ : = θ + {((ρ + Ξ) / N η)}^{1 / 2}$ and the claim (3.13a) follows directly with (3.15), thus completing the proof of Lemma 3.8. $□$

Bootstrapping

Now we will show that the difference $G - M$ converges to zero uniformly for all spectral parameters $z \in D_{ζ}$ as defined in (3.12). For convenience we refer to existing bounds on $G - M$ far away from the real line to establish a rough bound on $G - M$ in, say, $D_{1}$ . We then iteratively lower the threshold on $η$ by appealing to Proposition 3.4 and Lemma 3.8 until we establish the rough bound in all of $D_{ζ}$ . As a second step we then improve the rough bound iteratively until we obtain Theorem 2.5.

Lemma 3.9

(Rough bound). For any $ζ > 0$ there exists a constant $c > 0$ such that on the domain $D_{ζ}$ we have the rough bound

\begin{matrix} |G - M| ≺ N^{- c} . \end{matrix}

3.20

Proof

The rough bound (3.20) in a neighbourhood of a cusp has first been established for Wigner-type random matrices in [8]. For the convenience of the reader we present a streamlined proof that is adapted to the current setting. The lemma is an immediate consequence of the following statement. Let $ζ_{s} > 0$ be a sufficiently small step size, depending on $ζ$ . Then for any $N_{0} ∋ k \leq 1 / ζ_{s}$ on the domain $D_{max {1 - k ζ_{s}, ζ}}$ we have

\begin{matrix} |G - M| ≺ N^{- 4^{- k} ζ} . \end{matrix}

3.21

We prove (3.21) by induction over k. For sufficiently small $ζ$ the induction start $k = 0$ holds due to the local law away from the self-consistent spectrum, e.g. [34, Theorem 2.1].

Now as induction hypothesis suppose that (3.21) holds on ${\tilde{D}}_{k} : = D_{max {1 - k ζ_{s}, ζ}}$ , and in particular, $|G| ≺ 1$ , ${‖ G ‖}_{p} \leq_{ϵ, p} N^{ϵ}$ for any $ϵ, p$ according to Lemma 3.6. The monotonicity of the function $η \mapsto {η ‖ G (τ + i η) ‖}_{p}$ (see e.g. [34, proof of Prop. 5.5]) implies ${‖ G ‖}_{p} \leq_{ϵ, p} N^{ϵ + ζ_{s}} \leq N^{2 ζ_{s}}$ and therefore, according to Lemma 3.6, that $|G| ≺ N^{2 ζ_{s}}$ on ${\tilde{D}}_{k + 1}$ . This, in turn, implies $|D| ≺ N^{- ζ / 3}$ on ${\tilde{D}}_{k + 1}$ by (3.11a) and Lemma 3.6, provided $ζ_{s}$ is chosen small enough. We now fix $x, y$ and a large integer K as the parameters of ${‖ \cdot ‖}_{*} = {‖ \cdot ‖}_{*}^{x, y, K}$ for the rest of the proof and omit them from the notation but we stress that all estimates will be uniform in $x, y$ . We find ${sup}_{z \in {\tilde{D}}_{k + 1}} {‖ D (z) ‖}_{*} ≺ N^{- ζ / 3}$ , by using a simple union bound and $‖ \partial_{z} D ‖ \leq N^{C}$ for some $C > 0$ . Thus, for K large enough, we can use (3.8a), (3.8b), (3.8c) and (3.9) to infer

\begin{matrix} \begin{matrix} |Θ^{3} + ξ_{2} Θ^{2} + ξ_{1} Θ| & ≲ N^{1 / 2 K} {‖ D ‖}_{*} ≺ N^{1 / 2 K - ζ / 3}, \\ {‖ G - M ‖}_{*} & ≲ |Θ| + N^{1 / K} {‖ D ‖}_{*} ≺ |Θ| + N^{1 / K - ζ / 3}, \end{matrix} \end{matrix}

3.22

on the event ${‖ G - M ‖}_{*} + {‖ D ‖}_{*} ≲ N^{- 10 / K}$ , and on ${\tilde{D}}_{k + 1}$ . Now we use the following lemma [10, Lemma 10.3] to translate the first estimate in (3.22) into a bound on $|Θ|$ . For the rest of the proof we keep $τ = \Re z$ fixed and consider the coefficients $ξ_{1}, ξ_{2}$ and $Θ$ as functions of $η$ .

Lemma 3.10

(Bootstrapping cubic inequality). For $0 < η_{*} < η^{*} < \infty$ let $ξ_{1}, ξ_{2} : [η_{*}, η^{*}] \to \mathbb{C}$ be complex valued functions and ${\tilde{ξ}}_{1}, {\tilde{ξ}}_{2}, d : [η_{*}, η^{*}] \to \mathbb{R}^{+}$ be continuous functions such that at least one of the following holds true:

(i)
$|ξ_{1}| \sim {\tilde{ξ}}_{1}$ , $|ξ_{2}| \sim {\tilde{ξ}}_{2}$ , and ${\tilde{ξ}}_{2}^{3} / d, {\tilde{ξ}}_{1}^{3} / d^{2}, {\tilde{ξ}}_{1}^{2} / d {\tilde{ξ}}_{2}$ are monotonically increasing, and $d^{2} / {\tilde{ξ}}_{1}^{3} + d {\tilde{ξ}}_{2} / {\tilde{ξ}}_{1}^{2} ≪ 1$ at $η^{*}$ ,
(ii)
$|ξ_{1}| \sim {\tilde{ξ}}_{1}$ , $|ξ_{2}| ≲ {\tilde{ξ}}_{1}^{1 / 2}$ , and ${\tilde{ξ}}_{1}^{3} / d^{2}$ is monotonically increasing.

Then any continuous function $Θ : [η_{*}, η^{*}] \to \mathbb{C}$ that satisfies the cubic inequality $|Θ^{3} + ξ_{2} Θ^{2} + ξ_{1} Θ| ≲ d$ on $[η_{*}, η^{*}]$ , has the property

\begin{matrix} \text{if } |Θ| ≲ \min \{d^{1 / 3}, \frac{d^{1 / 2}}{{\tilde{ξ}}_{2}^{1 / 2}}, \frac{d}{{\tilde{ξ}}_{1}}\} \text{ at } η^{*}, \text{ then } |Θ| ≲ \min \{d^{1 / 3}, \frac{d^{1 / 2}}{{\tilde{ξ}}_{2}^{1 / 2}}, \frac{d}{{\tilde{ξ}}_{1}}\} \text{ on } [η_{*}, η^{*}] . \end{matrix}

3.23

By direct arithmetic we can now verify that the coefficients $ξ_{1}, ξ_{2}$ in (3.8b) and the auxiliary coefficients ${\tilde{ξ}}_{1}, {\tilde{ξ}}_{2}$ defined in (3.7e) satisfy the assumptions in Lemma 3.10 with the choice of the constant function $d = N^{- 4^{- k} ζ + δ}$ for any $δ > 0$ , by using only the information on $ξ_{1}, ξ_{2}$ given by the comparison relations (3.8d). As an example, in the regime where $τ_{0}$ is a right edge and $ω \sim Δ$ , we have ${\tilde{ξ}}_{1} \sim {(η + Δ)}^{2 / 3}$ and ${\tilde{ξ}}_{2} \sim {(η + Δ)}^{1 / 3}$ and both functions are monotonically increasing in $η$ . Since ${\tilde{ξ}}_{2} \sim {\tilde{ξ}}_{1}^{1 / 2}$ and d is constant, Assumption (ii) of Lemma 3.10 is satisfied. All other regimes are handled similarly.

We now set $η^{*} : = N^{- k ζ_{s}}$ and

\begin{matrix} η_{*} : = inf \{η \in [N^{- (k + 1) ζ_{s}}, η^{*}] | sup_{η^{'} \geq η} {‖ G (τ + i η^{'}) - M (τ + i η^{'}) ‖}_{*} \leq N^{- 10 / K} / 2\} . \end{matrix}

By the induction hypothesis we have $|Θ (η^{*})| ≲ d ≲ min {d^{1 / 3}, d^{1 / 2} {\tilde{ξ}}_{2}^{- 1 / 2}, d {\tilde{ξ}}_{1}^{- 1}}$ with overwhelming probability, so that the condition in (3.23) holds, and we conclude $|Θ (η)| ≺ d^{1 / 3} = N^{- (4^{- k} ζ - δ) / 3}$ for $η \in [η_{*}, η^{*}]$ . For small enough $δ > 0$ the second bound in (3.22) implies ${‖ G - M ‖}_{*} ≺ N^{- 4^{- (k + 1)} ζ}$ . By continuity and the definition of $η_{*}$ we conclude $η_{*} = N^{- (k + 1) ζ_{s}}$ , finishing the proof of (3.21). $□$
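The exponent bookkeeping of this induction step can be checked mechanically. The sketch below is illustrative only: the helper name and the sample values of $ζ$, $δ$ and K are assumptions, not taken from the paper. Writing every bound as $N^{-e}$, it tracks the exponent e through one step in exact rational arithmetic and confirms that the exponent obtained from $d^{1/3}$ and from the second estimate in (3.22) is at least $4^{-(k+1)} ζ$, provided $δ$ is at most a quarter of the current exponent and K is large.

```python
from fractions import Fraction


def induction_step_exponent(k, zeta, delta, K):
    """Exponent e such that one bootstrap step yields ||G - M||_* ≺ N^{-e}.

    Hypothetical helper: every bound is encoded by its exponent of 1/N.
    """
    a_k = zeta / 4**k                       # hypothesis (3.21): |G - M| ≺ N^{-a_k}
    theta_exp = (a_k - delta) / 3           # |Theta| ≺ d^{1/3} with d = N^{-a_k + delta}
    error_exp = zeta / 3 - Fraction(1, K)   # term N^{1/K - zeta/3} in (3.22)
    return min(theta_exp, error_exp)        # a sum of two bounds keeps the worse exponent


zeta, K = Fraction(1, 10), 1000             # sample values (assumptions)
for k in range(6):
    delta = zeta / 4**k / 8                 # any delta <= a_k / 4 suffices
    target = zeta / 4**(k + 1)              # exponent required by (3.21) at step k + 1
    assert induction_step_exponent(k, zeta, delta, K) >= target
```

The margin comes from $1/3 > 1/4$: the cube root costs a factor 3 in the exponent, while the induction only asks for a factor 4.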

Proof of Theorem 2.5

The bounds within the proof hold true uniformly for $z \in D_{ζ}$ , unless explicitly specified otherwise. We therefore suppress this qualifier in the following statements. First we apply Lemma 3.8 with the choice $Ξ = Λ$ , i.e. we do not treat the imaginary part of the resolvent separately. With this choice the first inequality in (3.13b) becomes self-improving and after iteration shows that

\begin{matrix} |G - M| ≺ θ + \sqrt{\frac{ρ}{N η}} + \frac{1}{N η}, \end{matrix}

3.24

and, in other words, (3.13a) holds with $Ξ = θ + {(ρ / N η)}^{1 / 2} + 1 / N η$ . This implies that if $|Θ| ≺ θ ≲ N^{- c}$ for some arbitrarily small $c > 0$ , then

\begin{matrix} |Θ^{3} + ξ_{2} Θ^{2} + ξ_{1} Θ| ≲ N^{5 \tilde{ϵ}} d_{*} + N^{- \tilde{ϵ}} (θ^{3} + {\tilde{ξ}}_{2} θ^{2}) \end{matrix}

3.25

holds for all sufficiently small $\tilde{ϵ}$ with overwhelming probability, where we defined

\begin{matrix} d_{*} : = {\tilde{ξ}}_{2} (\frac{\tilde{ρ}}{N η} + \frac{1}{{(N η)}^{2}}) + \frac{1}{{(N η)}^{3}} + (\frac{\tilde{ρ}}{N η})^{3 / 2} . \end{matrix}

3.26

For this conclusion we used the comparison relations (3.8d), Proposition 3.2(iv) as well as (3.7b), and the bound $\sqrt{η / ρ} \sim \sqrt{η / \tilde{ρ}} ≲ {\tilde{ξ}}_{2}$ . $□$

The bound (3.25) is a self-improving estimate on $|Θ|$ in the following sense. For $k \in N$ and $l \in N \cup {*}$ let

\begin{matrix} d_{k} : = max {N^{- k \tilde{ϵ}}, N^{6 \tilde{ϵ}} d_{*}}, θ_{l} : = min {d_{l}^{1 / 3}, \frac{d_{l}^{1 / 2}}{{\tilde{ξ}}_{2}^{1 / 2}}, \frac{d_{l}}{{\tilde{ξ}}_{1}}} . \end{matrix}

Then (3.25) with $|Θ| ≺ θ_{k}$ implies that $|Θ^{3} + ξ_{2} Θ^{2} + ξ_{1} Θ| ≲ N^{- \tilde{ϵ}} d_{k}$ . Applying Lemma 3.10 with $d = N^{- \tilde{ϵ}} d_{k}$ , $η^{*} \sim 1$ , $η_{*} = N^{ζ - 1}$ yields the improvement $|Θ| ≺ θ_{k + 1}$ . Here we needed to check the condition in (3.23) but at $η^{*} \sim 1$ we have ${\tilde{ξ}}_{1} \sim 1$ , so $|Θ| ≲ N^{- \tilde{ϵ}} d_{k} \leq d_{k + 1} \sim θ_{k + 1}$ . After a k-step iteration until $N^{- k \tilde{ϵ}}$ becomes smaller than $N^{6 \tilde{ϵ}} d_{*}$ , we find $|Θ| ≺ θ_{*}$ , where we used that $\tilde{ϵ}$ can be chosen arbitrarily small. We are now ready to prove the following bound which we, for convenience, record as a proposition.

Proposition 3.11

For any $ζ > 0$ we have the bounds

\begin{matrix} |G - M| ≺ θ_{*} + \sqrt{\frac{ρ}{N η}} + \frac{1}{N η}, {|G - M|}_{av} ≺ θ_{*} + \frac{ρ}{N η} + \frac{1}{{(N η)}^{2}} in D_{ζ}, \end{matrix}

3.27

where $θ_{*} : = min {d_{*}^{1 / 3}, d_{*}^{1 / 2} / {\tilde{ξ}}_{2}^{1 / 2}, d_{*} / {\tilde{ξ}}_{1}}$ , and $d_{*}, \tilde{ρ}, {\tilde{ξ}}_{1}, {\tilde{ξ}}_{2}$ are given in (3.26), (3.7b) and (3.7e), respectively.

Proof

Using $|Θ| ≺ θ_{*}$ proven above, we apply (3.24) with $θ = θ_{*}$ to conclude the first inequality in (3.27). For the second inequality in (3.27) we use the estimate on ${|G - M|}_{av}$ from (3.13b) with $θ = θ_{*}$ and $Ξ = {(ρ / N η)}^{1 / 2} + 1 / N η$ . $□$
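The self-improving iteration on $d_{k}$ that leads to $|Θ| ≺ θ_{*}$ terminates after a number of steps that does not depend on N. The following sketch makes this explicit on the logarithmic scale $\log_{N}$; the function names and the sample values of $\tilde{ϵ}$ and $d_{*}$ are assumptions chosen for illustration. Since $d_{*}$ contains the term $1 / {(N η)}^{3}$, one expects the step count to be bounded by roughly $3 / \tilde{ϵ}$.

```python
from fractions import Fraction


def log_d(k, log_d_star, eps):
    """log_N of d_k = max{N^{-k eps}, N^{6 eps} d_*}, where d_* = N^{log_d_star}."""
    return max(-k * eps, 6 * eps + log_d_star)


def steps_until_saturation(log_d_star, eps):
    """Smallest k for which N^{-k eps} <= N^{6 eps} d_*."""
    k = 0
    while -k * eps > 6 * eps + log_d_star:
        k += 1
    return k


eps = Fraction(1, 20)            # sample value of tilde-epsilon (assumption)
log_d_star = Fraction(-3, 2)     # sample size d_* = N^{-3/2} (assumption)
k_final = steps_until_saturation(log_d_star, eps)

# d_k decreases by a factor N^{-eps} per step until it reaches N^{6 eps} d_*.
exponents = [log_d(k, log_d_star, eps) for k in range(k_final + 3)]
assert exponents == sorted(exponents, reverse=True)
assert exponents[k_final] == 6 * eps + log_d_star
```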

The bound on $|G - M|$ from (3.27) implies a complete delocalisation of eigenvectors uniformly at singularities of the scDOS. The following corollary was established already in [8, Corollary 1.14] and, given (3.27), the proof follows the same line of reasoning.

Corollary 3.12

(Eigenvector delocalisation). Let $u \in C^{N}$ be an eigenvector of H corresponding to an eigenvalue $λ \in τ_{0} + (- c, c)$ for some sufficiently small positive constant $c \sim 1$ . Then for any deterministic $x \in C^{N}$ we have

\begin{matrix} |〈u, x〉| ≺ \frac{1}{\sqrt{N}} ‖ u ‖ ‖ x ‖ . \end{matrix}

The bounds (3.27) simplify in the regime $η \geq N^{ζ} η_{f}$ above the typical eigenvalue spacing to

\begin{matrix} |G - M| ≺ \sqrt{\frac{ρ}{N η}} + \frac{1}{N η}, {|G - M|}_{av} ≺ \frac{1}{N η}, for η \geq N^{ζ} η_{f} \end{matrix}

3.28

using Lemma 3.3 which implies $θ_{*} \leq d_{*} / {\tilde{ξ}}_{1} \leq 1 / N η$ . The bound on ${|G - M|}_{av}$ is further improved in the case when $τ_{0} = e_{-}$ is an edge and, in addition to $η \geq N^{ζ} η_{f}$ , we assume $N^{δ} η \leq ω \leq Δ / 2$ for some $δ > 0$ , i.e. if $ω$ is well inside a gap of size $Δ \geq N^{δ + ζ} η_{f}$ . Then we find $Δ > N^{- 3 / 4}$ by the definition of $η_{f} = Δ^{1 / 9} / N^{2 / 3}$ in (2.7) and use Lemma 3.3 and (3.7b), (3.7e) to conclude

\begin{matrix} θ_{*} + \frac{\tilde{ρ}}{N η} + \frac{1}{{(N η)}^{2}} ≲ \frac{{\tilde{ξ}}_{2}}{{\tilde{ξ}}_{1}} (\frac{\tilde{ρ}}{N η} + \frac{1}{{(N η)}^{2}}) \sim \frac{Δ^{1 / 6}}{ω^{1 / 2}} (\frac{η}{Δ^{1 / 6} ω^{1 / 2}} + \frac{1}{N η}) \frac{1}{N η} ≲ \frac{N^{- δ / 2}}{N η} . \end{matrix}

3.29

In the last bound we used $1 / N ω \leq N^{- δ} / N η$ and $Δ^{1 / 6} / (N η ω^{1 / 2}) \leq N^{- δ / 2}$ . Using (3.29) in (3.27) yields the improvement

\begin{matrix} {|G - M|}_{av} ≺ \frac{N^{- δ / 2}}{N η}, for τ = e_{-} + ω, Δ / 2 \geq ω \geq N^{δ} η \geq N^{ζ + δ} η_{f} . \end{matrix}

3.30
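Two pieces of exponent arithmetic enter (3.29) and (3.30): the lower bound $Δ > N^{- 3 / 4}$ implied by $Δ \geq N^{δ + ζ} η_{f}$, and the estimate $Δ^{1 / 6} / (N η^{3 / 2}) \leq N^{- 3 ζ / 2}$ that follows from $η \geq N^{ζ} η_{f}$. The sketch below checks both on the scale $\log_{N}$ using exact fractions. The helper names and the sample values of $δ$ and $ζ$ are assumptions made for illustration.

```python
from fractions import Fraction as F


def log_eta_f(gap_exp):
    """log_N of eta_f = Delta^{1/9} / N^{2/3} for a gap Delta = N^{gap_exp}."""
    return F(gap_exp) / 9 - F(2, 3)


def min_gap_exponent(delta, zeta):
    """Smallest e with N^e >= N^{delta + zeta} eta_f, i.e. e >= delta + zeta + e/9 - 2/3."""
    return F(9, 8) * (delta + zeta) - F(3, 4)


def log_last_factor(gap_exp, zeta):
    """log_N of Delta^{1/6} / (N eta^{3/2}) at the smallest admissible eta = N^zeta eta_f."""
    log_eta = zeta + log_eta_f(gap_exp)
    return F(gap_exp) / 6 - 1 - F(3, 2) * log_eta


delta, zeta = F(1, 10), F(1, 50)          # sample values (assumptions)
e = min_gap_exponent(delta, zeta)
assert e == delta + zeta + log_eta_f(e)   # e solves the defining relation with equality
assert e > F(-3, 4)                       # hence Delta > N^{-3/4}
assert log_last_factor(e, zeta) == -F(3, 2) * zeta
```

Combined with $ω \geq N^{δ} η$, the last identity gives $Δ^{1 / 6} / (N η ω^{1 / 2}) \leq N^{- δ / 2 - 3 ζ / 2}$, which is slightly stronger than the bound used after (3.29).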

The bounds on ${|G - M|}_{av}$ from (3.28) and (3.30), inside and outside the self-consistent spectrum, allow us to show the uniform rigidity, Corollary 2.6. We postpone these arguments until after we finish the proof of Theorem 2.5. The uniform rigidity implies that for $dist (z, supp ρ) \geq N^{ζ} η_{f}$ we can estimate the imaginary part of the resolvent via

\begin{matrix} \Im 〈x, G x〉 = \sum_{λ} \frac{η {|〈u_{λ}, x〉|}^{2}}{η^{2} + {(τ_{0} + ω - λ)}^{2}} ≺ η + \frac{1}{N} \sum_{|λ - τ_{0}| \leq c} \frac{η}{η^{2} + {(τ_{0} + ω - λ)}^{2}} ≺ ρ (z), \end{matrix}

3.31

for any normalised $x \in C^{N}$ , where $u_{λ}$ denotes the normalised eigenvector corresponding to $λ$ . For the first inequality in (3.31) we used Corollary 3.12 and for the second we applied Corollary 2.6 that allows us to replace the Riemann sum with an integral as ${[η^{2} + {(τ_{0} + ω - λ)}^{2}]}^{1 / 2} = |z - λ| \geq N^{ζ} η_{f}$ .
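The first equality in (3.31) is the spectral decomposition of the resolvent. As a sanity check, the following sketch compares both sides for a $2 \times 2$ real symmetric matrix, where the inverse and the eigenvectors are available in closed form. The matrix entries, the vector x and the spectral parameter are arbitrary sample values, and the function names are hypothetical.

```python
import math


def im_quadratic_form_resolvent(a, b, c, x, tau, eta):
    """Im <x, G x> for H = [[a, b], [b, c]] and G = (H - z)^{-1}, z = tau + i*eta."""
    z = complex(tau, eta)
    det = (a - z) * (c - z) - b * b
    # Closed-form inverse of the 2x2 matrix H - z.
    G = [[(c - z) / det, -b / det], [-b / det, (a - z) / det]]
    Gx = [G[0][0] * x[0] + G[0][1] * x[1], G[1][0] * x[0] + G[1][1] * x[1]]
    return (x[0].conjugate() * Gx[0] + x[1].conjugate() * Gx[1]).imag


def im_via_spectral_sum(a, b, c, x, tau, eta):
    """Sum over eigenvalues of eta |<u, x>|^2 / (eta^2 + (tau - lambda)^2); needs b != 0."""
    mean, radius = (a + c) / 2, math.hypot((a - c) / 2, b)
    total = 0.0
    for lam in (mean - radius, mean + radius):
        u = (b, lam - a)                               # real eigenvector of H for lam
        norm = math.hypot(u[0], u[1])
        overlap = (u[0] * x[0] + u[1] * x[1]) / norm   # <u, x> with normalised u
        total += eta * abs(overlap) ** 2 / (eta ** 2 + (tau - lam) ** 2)
    return total


x = (0.6 + 0j, 0.8j)                                   # normalised sample vector
lhs = im_quadratic_form_resolvent(0.3, 0.5, -0.2, x, 0.1, 0.05)
rhs = im_via_spectral_sum(0.3, 0.5, -0.2, x, 0.1, 0.05)
assert math.isclose(lhs, rhs, rel_tol=1e-9)
```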

Using (3.31), we apply Lemma 3.8, repeating the strategy from the beginning of the proof. But this time we can choose the control parameter $Ξ = ρ$ . In this way we find

\begin{matrix} |G - M| ≺ θ_{#} + \sqrt{\frac{ρ}{N η}}, {|G - M|}_{av} ≺ θ_{#} + \frac{ρ}{N η}, for dist (z, supp ρ) \geq N^{ζ} η_{f}, \end{matrix}

3.32

where we defined

\begin{matrix} θ_{#} : = min {\frac{d_{#}}{{\tilde{ξ}}_{1}}, \frac{d_{#}^{1 / 2}}{{\tilde{ξ}}_{2}^{1 / 2}}, d_{#}^{1 / 3}}, d_{#} : = {\tilde{ξ}}_{2} \frac{\tilde{ρ}}{N η} + (\frac{\tilde{ρ}}{N η})^{3 / 2} . \end{matrix}

Note that the estimates in (3.32) are simpler than those in (3.27). The reason is that the additional terms $1 / N η$ , $1 / {(N η)}^{2}$ and $1 / {(N η)}^{3}$ in (3.27) are a consequence of the presence of $Ξ$ in (3.13a), (3.13b). With $Ξ = ρ$ these are immediately absorbed into $ρ$ and not present any more. The second term in the definition of $d_{#}$ can be dropped since we still have ${\tilde{ξ}}_{2} ≳ {(ρ / N η)}^{1 / 2}$ (this follows from Lemma 3.3 if $η \geq N^{ζ} η_{f}$ , and directly from (3.7b), (3.7e) if $ω \geq N^{ζ} η_{f}$ ). This implies $θ_{#} ≲ d_{#}^{1 / 2} / {\tilde{ξ}}_{2}^{1 / 2} ≲ {(ρ / N η)}^{1 / 2}$ , so the first bound in (3.32) proves (2.8a).

Now we turn to the proof of (2.8b). Given the second bound in (3.28), it is sufficient to consider the case when $τ = e_{-} + ω$ and $η \leq ω \leq Δ / 2$ with $ω \geq N^{ζ} η_{f}$ . In this case Proposition 3.2 yields ${\tilde{ξ}}_{2} \tilde{ρ} / {\tilde{ξ}}_{1} + \tilde{ρ} ≲ η / ω \sim η / dist (z, supp ρ)$ . Thus we have

\begin{matrix} θ_{#} + \frac{ρ}{N η} ≲ \frac{d_{#}}{{\tilde{ξ}}_{1}} + \frac{\tilde{ρ}}{N η} ≲ \frac{1}{N dist (z, supp ρ)} \end{matrix}

and therefore the second bound in (3.32) implies (2.8b). This completes the proof of Theorem 2.5. $□$

Rigidity and absence of eigenvalues

The proofs of Corollaries 2.6 and 2.8 rely on the bounds on ${|G - M|}_{av}$ from (3.28) and (3.30). As before, we may restrict ourselves to the neighbourhood of a local minimum $τ_{0} \in supp ρ$ of the scDOS which is either an internal minimum with a small value of $ρ (τ_{0}) > 0$ , a cusp location or a right edge adjacent to a small gap of length $Δ > 0$ . All other cases, namely the bulk regime and regular edges adjacent to large gaps, have been treated prior to this work [8, 11].

Proof of Corollary 2.8

Let us denote the empirical eigenvalue distribution of H by $ρ_{H} = \frac{1}{N} \sum_{i = 1}^{N} δ_{λ_{i}}$ and consider the case when $τ_{0} = e_{-}$ is a right edge, $Δ \geq N^{δ} η_{f}$ for any $δ > 0$ and $η_{f} = η_{f} (e_{-}) \sim Δ^{1 / 9} N^{- 2 / 3}$ . Then we show that there are no eigenvalues in $e_{-} + [N^{δ} η_{f}, Δ / 2]$ with overwhelming probability. We apply [8, Lemma 5.1] with the choices

\begin{matrix} ν_{1} : = ρ, ν_{2} : = ρ_{H}, η_{1} : = η_{2} : = ϵ : = N^{ζ} η_{f}, τ_{1} : = e_{-} + ω, τ_{2} : = e_{-} + ω + N^{ζ} η_{f}, \end{matrix}

for any $ω \in [N^{δ} η_{f}, Δ / 2]$ and some $ζ \in (0, δ / 4)$ . We use (3.30) to estimate the error terms $J_{1}, J_{2}$ and $J_{3}$ from [8, Eq. (5.2)] by $N^{2 ζ - δ / 2 - 1}$ and see that $(ρ_{H} - ρ) ([τ_{1}, τ_{2}]) = ρ_{H} ([τ_{1}, τ_{2}]) ≺ N^{2 ζ - δ / 2 - 1}$ , showing that with overwhelming probability the interval $[τ_{1}, τ_{2}]$ does not contain any eigenvalues. A simple union bound finishes the proof of Corollary 2.8. $□$

Proof of Corollary 2.6

Now we establish Corollary 2.6 around a local minimum $τ_{0} \in supp ρ$ of the scDOS. Its proof has two ingredients. First we follow the strategy of the proof of [8, Corollary 1.10] to see that

\begin{matrix} |(ρ - ρ_{H}) ((- \infty, τ_{0} + ω])| ≺ \frac{1}{N}, \end{matrix}

3.33

for any $|ω| \leq c$ , i.e. we have a very precise control on $ρ_{H}$ . In contrast to the statement in that corollary we have a local law (3.28) with uniform $1 / N η$ error and thus the bound (3.33) does not deteriorate close to $τ_{0}$ . We warn the reader that the standard argument inside the proof of [8, Corollary 1.10] has to be adjusted slightly to arrive at (3.33). In fact, when inside that proof the auxiliary result [8, Lemma 5.1] is used with the choice $τ_{1} = - 10$ , $τ_{2} = τ$ , $η_{1} = η_{2} = N^{ζ - 1}$ for some $ζ > 0$ , this choice should be changed to $τ_{1} = - C$ , $τ_{2} = τ$ , $η_{1} = N^{ζ - 1}$ and $η_{2} = N^{ζ} η_{f} (τ)$ , where $C > 0$ is chosen sufficiently large such that $τ_{1}$ lies far to the left of the self-consistent spectrum.

The control (3.33) suffices to prove Corollary 2.6 for all $τ = τ_{0} + ω$ except for the case when $τ_{0} = e_{-}$ is an edge at a gap of length $Δ \geq N^{ζ} η_{f}$ and $ω \in [- N^{ζ} η_{f}, 0]$ for some fixed $ζ > 0$ and $η_{f} = η_{f} (e_{-}) \sim Δ^{1 / 9} / N^{2 / 3}$ , i.e. except for some $N^{ζ}$ eigenvalues close to the edge with arbitrarily small $ζ > 0$ . In all other cases, the proof follows the same argument as the proof of [8, Corollary 1.11] using the uniform 1/N-bound from (3.33) and we omit the details here.

The reason for having to treat the eigenvalues very close to the edge $e_{-}$ separately is that (3.33) does not give information on which side of the gap these $N^{ζ}$ eigenvalues are found. To get this information requires the second ingredient, the band rigidity,

\begin{matrix} P [ρ ((- \infty, e_{-} + ω]) = ρ_{H} ((- \infty, e_{-} + ω])] \geq 1 - N^{- ν}, \end{matrix}

3.34

for any $ν \in N$ , $Δ \geq ω \geq N^{ζ} η_{f}$ and large enough N. The combination of (3.34) and (3.33) finishes the proof of Corollary 2.6.

Band rigidity has been shown in case $Δ$ is bounded from below in [11] as part of the proof of Corollary 2.5. We will now adapt this proof to the case of small gap sizes $Δ \geq N^{ζ - 3 / 4}$ . Since by Corollary 2.8 with overwhelming probability there are no eigenvalues in $e_{-} + [N^{ζ} η_{f}, Δ / 2]$ , it suffices to show (3.34) for $ω = Δ / 2$ . As in the proof of [11, Corollary 2.5] we consider the interpolation

\begin{matrix} H_{t} : = \sqrt{1 - t} W + A - t S M (τ), t \in [0, 1], \end{matrix}

between the original random matrix $H = H_{0}$ and the deterministic matrix $H_{1} = A - S M (τ)$ , for $τ = e_{-} + Δ / 2$ . The interpolation is designed such that the solution $M_{t}$ of the MDE corresponding to $H_{t}$ is constant at spectral parameter $τ$ , i.e. $M_{t} (τ) = M (τ)$ . Let $ρ_{t}$ denote the scDOS of $H_{t}$ . Exactly as in the proof from [11] it suffices to show that no eigenvalue crosses the gap along the interpolation with overwhelming probability, i.e. that for any $ν \in N$ we have

\begin{matrix} P [a_{t} \in Spec (H_{t}) for some t \in [0, 1]] \leq \frac{C (ν)}{N^{ν}} . \end{matrix}

3.35

Here $t \mapsto a_{t} \in \mathbb{R} \setminus supp ρ_{t}$ is some spectral parameter inside the gap, continuously depending on t, such that $a_{0} = τ$ . In [11] $a_{t}$ was chosen independent of t, but the argument remains valid with any other choice of $a_{t}$ . We call $I_{t}$ the connected component of $\mathbb{R} \setminus supp ρ_{t}$ that contains $a_{t}$ and denote by $Δ_{t} = |I_{t}|$ the gap length. In particular, $Δ_{0} = Δ$ and $τ \in I_{t}$ for all $t \in [0, 1]$ by [10, Lemma 8.1(ii)]. For concreteness we choose $a_{t}$ to be the spectral parameter lying exactly in the middle of $I_{t}$ . The 1/3-Hölder continuity of $ρ_{t}$ , and hence of $I_{t}$ and $a_{t}$ , in t follows from [10, Proposition 10.1(a)]. Via a simple union bound it suffices to show that for any fixed $t \in [0, 1]$ we have no eigenvalue in $a_{t} + [- N^{- 100}, N^{- 100}]$ .

Since $‖ W ‖ ≲ 1$ with overwhelming probability, in the regime $t \geq 1 - ϵ$ for some small constant $ϵ > 0$ , the matrix $H_{t}$ is a small perturbation of the deterministic matrix $H_{1}$ whose resolvent ${(H_{1} - τ)}^{- 1} = M (τ)$ at spectral parameter $τ$ is bounded by Assumption (C); in particular $Δ_{1} ≳ 1$ . Hence, by 1/3-Hölder continuity, $Δ_{t} ≳ 1$ and $Spec (H_{t}) \subset Spec (H_{1}) + [- C ϵ^{1 / 3}, C ϵ^{1 / 3}]$ for some $C \sim 1$ in this regime with very high probability. Since $Spec (H_{1}) \subset supp ρ_{t} + [- C ϵ^{1 / 3}, C ϵ^{1 / 3}]$ by [10, Proposition 10.1(a)] there are no eigenvalues of $H_{t}$ in a neighbourhood of $a_{t}$ , proving (3.35) for $t \geq 1 - ϵ$ .

For $t \in [ϵ, 1 - ϵ]$ we will now show that $Δ_{t} \sim_{ϵ} 1$ for any $ϵ > 0$ . In fact, we have $dist (τ, supp ρ_{t}) ≳_{ϵ} 1$ . This is a consequence of [10, Lemma D.1]. More precisely, we use the equivalence of (iii) and (v) of that lemma. We check (iii) and conclude the uniform distance to the self-consistent spectrum by (v). Since $M_{t} (τ) = M (τ)$ and $‖ M (τ) ‖ ≲ 1$ we only need to check that the stability operator $B_{t} = t + (1 - t) B$ of $H_{t}$ has a bounded inverse. We write $B_{t} = C (1 - (1 - t) \tilde{C} F) C^{- 1}$ in terms of the saturated self-energy operator $F = C S C$ , where $C [R] : = {|M (τ)|}^{1 / 2} R {|M (τ)|}^{1 / 2}$ and $\tilde{C} [R] : = (sgn M (τ)) R (sgn M (τ))$ . Afterwards we use that ${‖ F ‖}_{hs \to hs} \leq 1$ (cf. [7, Eq. (4.24)]) and $‖ \tilde{C} ‖_{hs \to hs} = 1$ to first show the uniform bound $‖ B_{t}^{- 1} ‖_{hs \to hs} ≲ 1 / t$ and then improve the bound to $‖ B_{t}^{- 1} ‖ ≲ 1 / t$ using the trick of expanding in a geometric series from [7, Eqs. (4.60)–(4.63)]. This completes the argument that $Δ_{t} \sim_{ϵ} 1$ . Now we apply [34, Corollary 2.3] to see that there are no eigenvalues of $H_{t}$ around $a_{t}$ as long as t is bounded away from zero and one, proving (3.35) for this regime.

Finally, we are left with the regime $t \in [0, ϵ]$ for some sufficiently small $ϵ > 0$ . By [10, Proposition 10.1(a)] the self-consistent Green’s function $M_{t}$ corresponding to $H_{t}$ is bounded even in a neighbourhood of $τ$ , whose size only depends on model parameters. In particular, Assumptions (A)–(C) are satisfied for $H_{t}$ and Corollary 2.8, which was already proved above, is applicable. Thus it suffices to show that the size $Δ_{t}$ of the gap in $supp ρ_{t}$ containing $τ$ is bounded from below by $Δ_{t} \geq N^{ζ - 3 / 4}$ for some $ζ > 0$ . The size of the gap can be read off from the following relationship between the norm of the saturated self-energy operator and the size of the gap: Let H be a random matrix satisfying (A)–(C) and $τ$ be well inside the interior of the gap of length $Δ \in [0, c]$ in the self-consistent spectrum for a sufficiently small $c \sim 1$ . Then

\begin{matrix} 1 - {‖ F (τ) ‖}_{hs \to hs} \sim lim_{η ↘ 0} \frac{η}{ρ (τ + i η)} \sim {(Δ + dist (τ, supp ρ))}^{1 / 6} dist {(τ, supp ρ)}^{1 / 2} \sim Δ^{2 / 3}, \end{matrix}

3.36

where in the first step we used [7, Eqs. (4.23)–(4.25)], in the second step (3.7b), and in the last step that $dist (τ, supp ρ) \sim Δ$ . Applying the analogue of (3.36) for $H_{t}$ with $F_{t} (τ)$ and using that $dist (τ, supp ρ_{t}) ≲ Δ_{t}$ , we obtain $1 - ‖ F_{t} {(τ) ‖}_{hs \to hs} ≲ Δ_{t}^{2 / 3}$ . Combining this inequality with (3.36) and using that $F_{t} (τ) = (1 - t) F (τ)$ for $t \in [0, c]$ , we have $Δ_{t}^{2 / 3} ≳ t + (1 - t) Δ^{2 / 3}$ , i.e. $Δ_{t} ≳ t^{3 / 2} + Δ$ . In particular, the gap size $Δ_{t}$ never drops below $c Δ ≳ N^{ζ - 3 / 4}$ . This completes the proof of the last regime in (3.35). $□$
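The last step of the argument is elementary: since $1 - ‖ F_{t} ‖ = t + (1 - t) (1 - ‖ F ‖)$ for $F_{t} = (1 - t) F$, a lower bound of the form $Δ_{t}^{2 / 3} ≳ t + (1 - t) Δ^{2 / 3}$ gives $Δ_{t} ≳ t^{3 / 2} + Δ$ after raising to the power 3/2. The sketch below checks the underlying scalar inequality on a grid, writing $s = Δ^{2 / 3}$. The constant $2^{- 5 / 2}$, the restriction $t \leq 1 / 2$ and the grid are choices made for this illustration only.

```python
def gap_lower_bound_holds(t, s, c=2 ** -2.5):
    """Check (t + (1 - t) s)^{3/2} >= c (t^{3/2} + s^{3/2}), where s = Delta^{2/3}."""
    return (t + (1 - t) * s) ** 1.5 >= c * (t ** 1.5 + s ** 1.5)


grid = [j / 100 for j in range(101)]
# For t <= 1/2 we have t + (1 - t) s >= max(t, s / 2), which yields the constant c.
assert all(gap_lower_bound_holds(t, s) for t in grid if t <= 0.5 for s in grid)
```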

Cusp Fluctuation Averaging and Proof of Theorem 3.7

We will use the graphical multivariate cumulant expansion from [34] which automatically exploits the self-energy renormalization of $D$ to highest order. Since the final formal statement requires some custom notations, we first give a simple motivating example to illustrate the type of expansion and its graphical representation. If $W$ is Gaussian, then integration by parts shows that

\begin{matrix} \begin{matrix} E {〈D〉}^{2} = & \sum_{α, β} κ (α, β) E 〈Δ^{α} G〉 〈Δ^{β} G〉 \\ & + \sum_{α_{1}, β_{1}} κ (α_{1}, β_{1}) \sum_{α_{2}, β_{2}} κ (α_{2}, β_{2}) E 〈Δ^{α_{1}} G Δ^{β_{2}} G〉 〈Δ^{α_{2}} G Δ^{β_{1}} G〉, \end{matrix} \end{matrix}

4.1

where we recall that $κ (α, β) : = κ (w_{α}, w_{β})$ is the second cumulant of the matrix entries $w_{α}, w_{β}$ indexed by double indices $α = (a, b)$ , $β = (a^{'}, b^{'})$ , and $Δ^{(a, b)}$ denotes the matrix of all zeros except for a 1 in the (a, b)th entry. Since for non-Gaussian $W$ or higher powers of $〈D〉$ the expansion analogous to (4.1) consists of much more complicated polynomials in resolvent entries, we represent them concisely as the values of certain graphs. As an example, the rhs. of (4.1) is represented simply by

4.2
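For a single real centred Gaussian variable w with variance $κ$, the integration by parts behind (4.1) reduces to the identity $E [w f (w)] = κ E [f^{'} (w)]$. The following sketch verifies this scalar identity exactly for a polynomial f, using the Gaussian moments $E [w^{n}] = (n - 1)!! \, κ^{n / 2}$ for even n. It is a simplified illustration and not the matrix-valued expansion of the paper; the function names and the sample polynomial are assumptions.

```python
from fractions import Fraction


def gaussian_moment(n, kappa):
    """E[w^n] for a centred real Gaussian w with variance kappa."""
    if n % 2 == 1:
        return Fraction(0)
    result = Fraction(1)
    for j in range(1, n, 2):          # (n - 1)!! * kappa^{n/2} as a running product
        result *= j * kappa
    return result


def expect_poly(coeffs, kappa):
    """E[f(w)] for f(w) = sum_j coeffs[j] * w^j."""
    return sum(c * gaussian_moment(j, kappa) for j, c in enumerate(coeffs))


def expect_w_times_f(coeffs, kappa):
    """E[w f(w)]: multiplying by w shifts every coefficient up by one degree."""
    return expect_poly([0] + list(coeffs), kappa)


def kappa_times_expect_f_prime(coeffs, kappa):
    """kappa * E[f'(w)], with f' computed coefficientwise."""
    derivative = [j * c for j, c in enumerate(coeffs)][1:]
    return kappa * expect_poly(derivative, kappa)


f = [2, -1, 3, 5]                     # f(w) = 2 - w + 3 w^2 + 5 w^3 (sample polynomial)
kappa = Fraction(3, 7)
assert expect_w_times_f(f, kappa) == kappa_times_expect_f_prime(f, kappa)
```

In the matrix setting, w is replaced by the entries $w_{α}$, the variance by the cumulants $κ (α, β)$, and the derivative acts on resolvent entries, which is what produces the additional factors of G in (4.1).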

The graphs retain only the relevant information of the complicated expansion terms and chains of estimates can be transcribed into simple graph surgeries. Graphs also help identify critical terms that have to be estimated more precisely in order to obtain the improved high moment bound on $D$ . For example, the key cancellation mechanism behind the cusp fluctuation averaging is encoded in a small distinguished part of the expansion that can conveniently be identified as certain subgraphs, called the $σ$ -cells, see Definition 4.10 later. It is easy to count, estimate and manipulate $σ$ -cells as part of a large graph, while following the same operations on the level of formulas would be almost intractable.

First we review some of the basic nomenclature from [34]. We consider random matrices $H = A + W$ with diagonal expectation A and complex Hermitian or real symmetric zero mean random component W indexed by some abstract set J of size $|J| = N$ . We recall that Greek letters $α, β, \dots$ stand for labels, i.e. double-indices from $I = J \times J$ , whereas Roman letters $a, b, \dots$ stand for single indices. If $α = (a, b)$ , then we set $α^{t} : = (b, a)$ for its transpose. Underlined Greek letters stand for multisets of labels, whereas bold-faced Greek letters stand for tuples of labels with the counting combinatorics being their—for our purposes—only relevant difference.

According to [34, Proposition 4.4] with $N (α) = {α, α^{t}}$ it follows from the assumed independence that for general (conjugate) linear functionals $Λ^{(k)}$ , of bounded norm $‖ Λ^{(k)} ‖ = O (1)$

\begin{matrix} E \prod_{k \in [p]} Λ^{(k)} (D) = E \prod_{l \in [p]} \left(1 + \sum_{α_{l}, β_{l}}^{\sim (l)}\right) \prod_{k \in [p]} \begin{cases} Λ_{α_{k}, {\underline{β}}^{k}}^{(k)} & \text{if } \sum_{α_{k}}, \\ Λ_{{\underline{β}}_{< k}^{k}, {\underline{β}}_{> k}^{k}}^{(k)} & \text{else} \end{cases} + O (N^{- p}), \end{matrix}

4.3a

where we recall that

4.3b

and that

\begin{matrix} \begin{matrix} Λ_{α_{1}, \dots, α_{k}} & : = - {(- 1)}^{k} Λ (Δ^{α_{1}} G \dots Δ^{α_{k}} G), Λ_{{α_{1}, \dots, α_{m}}} : = \sum_{σ \in S_{m}} Λ_{α_{σ (1)}, \dots, α_{σ (m)}}, \\ Λ_{α, {α_{1}, \dots, α_{m}}} & : = \sum_{σ \in S_{m}} Λ_{α, α_{σ (1)}, \dots, α_{σ (m)}}, Λ_{\underline{α}, \underline{β}} : = \sum_{α \in \underline{α}} Λ_{α, \underline{α} \cup \underline{β} \ {α}}, \\ {\underline{β}}_{< k}^{k} & : = ⨆_{j < k} {\underline{β}}_{j}^{k}, {\underline{β}}_{> k}^{k} : = ⨆_{j > k} {\underline{β}}_{j}^{k} . \end{matrix} \end{matrix}

4.3c

Some notations in (4.3) require further explanation. The qualifier “if $\sum_{α_{k}}$ ” is satisfied for those terms in which $α_{k}$ is a summation variable when the brackets in the product $\prod_{j} (1 + \sum)$ are opened. The notation $⨆$ indicates the union of multisets.

For even p we apply (4.3) with $Λ^{(k)} (D) : = 〈diag (f p) D〉$ for $k \leq p / 2$ and $Λ^{(k)} (D) : = \bar{〈diag (f p) D〉}$ for $k > p / 2$ . This is obviously a special case of $Λ^{(k)} (D) = 〈B, D〉$ which was considered in the so-called averaged case of [34] with arbitrary B of bounded operator norm since $‖ diag (f p) ‖ = {‖ f p ‖}_{\infty} \leq C$ . It was proved in [34] that

\begin{matrix} |〈diag (f p) D〉| ≲ \frac{ρ}{N η}, \end{matrix}

which is not good enough at the cusp. We can nevertheless use the graphical language developed in [34] to estimate the complicated right hand side of (4.3).

Graphical representation via double index graphs

The graphs (or Feynman diagrams) introduced in [34] encode the structure of all terms in (4.3). Their (directed) edges correspond to resolvents G, while vertices correspond to $Δ$ ’s. Loop edges are allowed while parallel edges are not. Resolvents G and their Hermitian conjugates $G^{*}$ are distinguished by different types of edges. Each vertex v carries a label $α_{v}$ and we need to sum over all labels. Some labels are summed independently; these are the $α$ -labels in (4.3), while the $β$ -labels are strongly restricted: in the independent case they can only be of the type $α$ or $α^{t}$ . These graphs will be called “double indexed” graphs since the vertices are naturally equipped with labels (double indices). Here we introduced the terminology “double indexed” for the graphs in [34] to distinguish them from the “single indexed” graphs to be introduced later in this paper.

To be more precise, the graphs in [34] were vertex-coloured graphs. The colours encoded a resummation of the terms in (4.3): vertices whose labels (or their transpose) appeared in one of the cumulants in (4.3) received the same colour. We then first summed up the colours and only afterwards we summed up all labels compatible with the given colouring. According to [34, Proposition 4.4] and the expansion of the main term [34, Eq. (49)] for every even p it holds that

\begin{matrix} E {|〈diag (f p) D〉|}^{p} = \sum_{Γ \in G^{av (p, 6 p)}} Val (Γ) + O (N^{- p}), \end{matrix}

4.4a

where $G^{av (p, 6 p)}$ is a certain finite collection of vertex coloured directed graphs with p connected components, and $Val (Γ)$ , the value of the graph $Γ$ , will be recalled below. According to [34] each graph $Γ \in G^{av (p, 6 p)}$ fulfils the following properties:

Proposition 4.1

(Properties of double index graphs). There exists a finite set $G^{av (p, 6 p)}$ of double index graphs $Γ$ such that (4.4) hold. Each $Γ$ fulfils the following properties.

(a) There exist exactly p connected components, all of which are oriented cycles. Each vertex has one incoming and one outgoing edge.
(b) Each connected component contains at least one vertex and one edge. Single vertices with a looped edge are in particular legal connected components.
(c) Each colour colours at least two and at most 6p vertices.
(d) If a colour colours exactly two vertices, then these vertices are in different connected components.
(e) The edges represent the resolvent matrix G or its adjoint $G^{*}$ . Within each component either all edges represent G or all edges represent $G^{*}$ . Accordingly we call the components either G or $G^{*}$ -cycles.
(f) Within each cycle there is one designated edge which is represented as a wiggled line in the graph. The designated edge represents the matrix $G diag (p f)$ in a G-cycle and the matrix $diag (p f) G^{*}$ in a $G^{*}$ -cycle.
(g) For each colour there exists at least one component in which a vertex of that colour is connected to the matrix $diag (f p)$ . According to (f) this means that if the relevant vertex is in a G-cycle, then the designated (wiggled) edge is its incoming edge. If the relevant vertex is in a $G^{*}$ -cycle, then the designated edge is its outgoing edge.

If V is the vertex set of $Γ$ and for each colour $c \in C$ , $V_{c}$ denotes the c-coloured vertices then we recall that

\begin{matrix} \begin{matrix} Val (Γ) & = {(- 1)}^{|V|} (\prod_{c \in C} \prod_{v \in V_{c}} \sum_{α_{v}} \frac{κ ({α_{v}}_{v \in V_{c}})}{(|V_{c}| - 1)!}) \\ & \times E \prod_{Cyc (v_{1}, \dots, v_{k}) \in Γ} \begin{cases} 〈G diag (f p) Δ^{α_{v_{1}}} G \dots G Δ^{α_{v_{k}}}〉 \\ 〈Δ^{α_{v_{k}}} G^{*} \dots G^{*} Δ^{α_{v_{1}}} diag (f p) G^{*}〉 \end{cases} \end{matrix} \end{matrix}

4.4b

where the ultimate product is the product over all p of the cycles in the graph. By the notation $Cyc (v_{1}, \dots, v_{k})$ we indicate a directed cycle with vertices $v_{1}, \dots, v_{k}$ . Depending upon whether a given cycle is a G-cycle or $G^{*}$ -cycle, it then contributes with one of the factors indicated after the last curly bracket in (4.4b) with the vertex order chosen in such a way that the designated edge represents the $G diag (f p)$ or $diag (f p) G^{*}$ matrix. As an example illustrating (4.4b) we have

4.5

In fact, the graphical representation of the graph $Γ$ in [34] is simplified and does not contain all the information encoded in the graph. First, the directions of the edges are not indicated. In the picture both cycles should be oriented clockwise. Secondly, the types of the edges are not indicated, apart from the wiggled line. In fact, the edges in the second subgraph stand for $G^{*}$ , while those in the first subgraph stand for G. To translate the pictorial representation directly let the striped vertices in the first and second cycle be associated with $α_{1}, β_{1}$ and the dotted vertices with $α_{2}, β_{2}$ . Accordingly, the wiggled edge in the first cycle stands for $G diag (f p)$ , while the wiggled edge in the second cycle stands for $diag (f p) G^{*}$ . The reason why these details were omitted in the graphical representation of a double index graph is that they do not influence the basic power counting estimate of its value used in [34].

Single index graphs

In [34] we operated with double index graphs that are structurally simple and appropriate for bookkeeping complicated correlation structures, but they are not suitable for detecting the additional smallness we need at the cusp. The contributions of the graphs in [34] were estimated by a relatively simple power counting argument where only the number of (typically off-diagonal) resolvent elements was recorded. In fact, for many subleading graphs this procedure already gave a very good bound that is sufficient at the cusps as well. The graphs carrying the leading contribution, however, now have to be computed to a higher accuracy and this leads to the concept of “single index graphs”. These are obtained by a certain refinement and reorganization of the double index graphs via a procedure we will call graph resolution, to be defined later. The main idea is to restructure the double index graph in such a way that instead of labels (double indices) $α = (a, b)$ its vertices naturally represent single indices a and b. Every double indexed graph will give rise to a finite number of resolved single index graphs. The double index graphs that require a more precise analysis compared with [34] will be resolved to single index graphs. After we explain the structure of the single index graphs and the graph resolution procedure, double index graphs will not be used in this paper any more. Thus, unless explicitly stated otherwise, by graph we will mean single index graph in the rest of this paper.

We now define the set $G$ of single index graphs we will use in this paper. They are directed graphs, where parallel edges and loops are allowed. Let the graph be denoted by $Γ$ with vertex set $V (Γ)$ and edge set $E (Γ)$ . We will assign a value to each $Γ$ which comprises weights assigned to the vertices and specific values assigned to the edges. Since an edge may represent different objects, we will introduce different types of edges that will be graphically distinguished by different line style. We now describe these ingredients precisely.

Vertices.

Each vertex $v \in V (Γ)$ is equipped with an associated index $a_{v} \in J$ . Graphically the vertices are represented by small unlabelled bullets, i.e. in the graphical representation the actual index is not indicated. It is understood that all indices will be independently summed up over the entire index set J when we compute the value of the graph.

Vertex weights.

Each vertex $v \in V (Γ)$ carries some weight vector $w^{(v)} \in C^{J}$ which is evaluated as $w_{a_{v}}^{(v)}$ at the index $a_{v}$ associated with the vertex. We generally assume these weights to be uniformly bounded in N, i.e. ${sup}_{N} {‖ w^{(v)} ‖}_{\infty} < \infty$ . Visually we indicate vertex weights by incoming arrows. Vertices without explicitly indicated weight may carry an arbitrary bounded weight vector. We also use a separate notation to indicate the constant $1$ vector as the weight; this corresponds to summing up the corresponding index unweighted.

G-edges.

The set of G-edges is denoted by $GE (Γ) \subset E (Γ)$ . These edges describe resolvents and there are four types of G-edges. First of all, there are directed edges corresponding to G and $G^{*}$ in the sense that a directed G or $G^{*}$ -edge $e = (v, u) \in E$ initiating from the vertex $v = i (e)$ and terminating in the vertex $u = t (e)$ represents the matrix elements $G_{a_{v} a_{u}}$ or respectively $G_{a_{v} a_{u}}^{*}$ evaluated in the indices $a_{v}, a_{u}$ associated with the vertices v and u. Besides these two there are also edges representing $G - M$ and ${(G - M)}^{*}$ . Distinguishing between G and $G - M$ , for practical purposes, is only important if it occurs in a loop. Indeed, ${(G - M)}_{aa}$ is typically much smaller than $G_{aa}$ , while ${(G - M)}_{ab}$ basically acts just like $G_{ab}$ when a, b are summed independently. Graphically we will denote the four types of G-edges by

where all these edges can also be loops. The convention is that continuous lines represent G, dashed lines correspond to $G^{*}$ , while the diamond on both types of edges indicates the subtraction of M or $M^{*}$ . An edge $e \in GE (Γ)$ carries its type as its attribute, so as a short hand notation we can simply write $G_{e}$ for $G_{a_{i (e)}, a_{t (e)}}$ , $G_{a_{i (e)}, a_{t (e)}}^{*}$ , ${(G - M)}_{a_{i (e)}, a_{t (e)}}$ and ${(G - M)}_{a_{i (e)}, a_{t (e)}}^{*}$ depending on which type of G-edge e represents. Due to their special role in the later estimates, we will separately bookkeep those $G - M$ or $G^{*} - M^{*}$ edges that appear looped. We thus define the subset ${GE}_{g - m} \subset GE$ as the set of G-edges $e \in GE (Γ)$ of type $G - M$ or $G^{*} - M^{*}$ such that $i (e) = t (e)$ . We write $g - m$ to refer to the fact that looped edges are evaluated on the diagonal ${(g - m)}_{a_{v}}$ of ${(G - M)}_{a_{v} a_{v}}$ .

(G-)edge degree.

For any vertex v we define its in-degree ${deg}^{-} (v)$ and out-degree ${deg}^{+} (v)$ as the number of incoming and outgoing G-edges. Looped edges (v, v) are counted for both in- and out-degree. We denote the total degree by $deg (v) = {deg}^{-} (v) + {deg}^{+} (v)$ .

Interaction edges.

Besides the G-edges we also have interaction edges, $IE (Γ)$ , representing the cumulants $κ$ . A directed interaction edge $e = (u, v)$ represents the matrix $R^{(e)} = (r_{ab}^{(e)})_{a, b \in J}$ given by the cumulant

\begin{matrix} r_{ab}^{(u, v)} & = \frac{1}{(deg (u) - 1)!} κ (\underbrace{a b, \dots, a b}_{{deg}^{-} (u) \text{ times}}, \underbrace{b a, \dots, b a}_{{deg}^{+} (u) \text{ times}}) \\ & = \frac{1}{(deg (v) - 1)!} κ (\underbrace{a b, \dots, a b}_{{deg}^{+} (v) \text{ times}}, \underbrace{b a, \dots, b a}_{{deg}^{-} (v) \text{ times}}) . \end{matrix}

4.6

For all graphs $Γ \in G$ and all interaction edges $e = (u, v)$ we have the symmetries ${deg}^{-} (u) = {deg}^{+} (v)$ and ${deg}^{-} (v) = {deg}^{+} (u)$ . Thus (4.6) is compatible with exchanging the roles of u and v. For the important case when $deg (u) = deg (v) = 2$ it follows that the interaction from u to v is given by S if u has one incoming and one outgoing G-edge and T if u has two incoming G-edges, i.e.

\begin{matrix} s_{ab} = κ (a b, b a), \quad t_{ab} = κ (a b, a b) . \end{matrix}

Visually we will represent interaction edges as

Although the interaction matrix $R^{(e)}$ is completely determined by the in- and out-degrees of the adjacent vertices i(e), t(e) we still write out the specific S and T names because these will play a special role in the latter part of the proof. As a short hand notation we shall frequently use $R_{e} : = R_{a_{i (e)}, a_{t (e)}}^{(e)}$ to denote the matrix element selected by the indices $a_{i (e)}, a_{t (e)}$ associated with the initial and terminal vertex of e. We also note that we do not indicate the direction of edges associated with S as the matrix S is symmetric.

Generic weighted edges.

Besides the specific G-edges and interaction edges, we additionally allow for generic edges reminiscent of the generic vertex weights introduced above. They will be called generic weighted edges, or weighted edges for short. To every weighted edge e we assign a weight matrix $K^{(e)} = {(k_{ab}^{(e)})}_{a, b \in J}$ which is evaluated as $k_{a_{i (e)}, a_{t (e)}}^{(e)}$ when we compute the value of the graph by summing up all indices. To simplify the presentation we will not indicate the precise form of the weight matrix $K^{(e)}$ but only its entry-wise scaling as a function of N. Graphically, a weighted edge is labelled by this scaling and represents an arbitrary weight matrix $K^{(e)}$ whose entries scale accordingly. We denote the set of weighted edges by $WE (Γ)$ . For a given weighted edge $e \in WE$ we record the entry-wise scaling of $K^{(e)}$ in an exponent $l (e) \geq 0$ in such a way that we always have $|k_{ab}^{(e)}| ≲ N^{- l (e)}$ .

Graph value.

For graphs $Γ \in G$ we define their value

\begin{matrix} Val (Γ) & : = {(- 1)}^{|GE (Γ)|} (\prod_{v \in V (Γ)} \sum_{a_{v} \in J} w_{a_{v}}^{(v)}) (\prod_{e \in IE (Γ)} r_{a_{i (e)}, a_{t (e)}}^{(e)}) (\prod_{e \in WE (Γ)} k_{a_{i (e)}, a_{t (e)}}^{(e)}) \\ \times E (\prod_{e \in GE (Γ)} G_{e}), \end{matrix}

4.7

which differs slightly from that in (4.4b) because it applies to a different class of graphs.
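To make the definition (4.7) concrete, the following minimal Python sketch evaluates the index sum by brute force for one fixed matrix G, i.e. without taking the expectation and without weighted edges. The function name `graph_value`, the edge encoding and the toy data are illustrative assumptions and not notation from the paper.

```python
from itertools import product

def graph_value(n, weights, g_edges, i_edges, G, M=None):
    """Brute-force evaluation of the sum in (4.7) for ONE fixed G (no expectation).

    n        -- size of the index set J = {0, ..., n-1}
    weights  -- dict: vertex name -> weight vector of length n
    g_edges  -- list of (kind, u, v) with kind in {"G", "G*", "G-M", "(G-M)*"}
    i_edges  -- list of (R, u, v) with R an n x n interaction matrix
    Weighted edges are omitted in this sketch.
    """
    vertices = sorted(weights)

    def entry(kind, a, b):
        if kind == "G":
            return G[a][b]
        if kind == "G*":                      # (G^*)_{ab} = conj(G_{ba})
            return G[b][a].conjugate()
        if kind == "G-M":
            return G[a][b] - M[a][b]
        if kind == "(G-M)*":
            return (G[b][a] - M[b][a]).conjugate()
        raise ValueError(kind)

    total = 0j
    # Every vertex index is summed independently over the whole index set.
    for assignment in product(range(n), repeat=len(vertices)):
        idx = dict(zip(vertices, assignment))
        term = 1 + 0j
        for v in vertices:
            term *= weights[v][idx[v]]
        for R, u, v in i_edges:
            term *= R[idx[u]][idx[v]]
        for kind, u, v in g_edges:
            term *= entry(kind, idx[u], idx[v])
        total += term
    return (-1) ** len(g_edges) * total

# Toy data: two vertices joined by one interaction edge, one G-edge and one G^*-edge.
n = 3
G = [[complex(i + 1, j - 1) for j in range(n)] for i in range(n)]
S = [[1.0 / (1 + i + j) for j in range(n)] for i in range(n)]
w = [0.5, -1.0, 2.0]
print(graph_value(n, {"u": w, "v": [1.0] * n},
                  [("G", "v", "u"), ("G*", "u", "v")], [(S, "u", "v")], G))
```

The cost grows like $N^{|V|}$, so this is only meant for checking small examples by hand, not for actual estimates.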

Single index resolution

There is a natural mapping from double indexed graphs to a collection of single indexed graphs that encodes the rearranging of the terms in (4.4b) when the summation over labels $α_{v}$ is reorganized into summation over single indices. Now we describe this procedure.

Definition 4.2

(Single index resolution). By the single index resolution of a double index graph we mean the collection of single index graphs obtained through the following procedure.

(i)
For each colour, the identically coloured vertices of the double index graph are mapped into a pair of vertices of the single index graph.
(ii)
The pair of vertices in the single index graph stemming from a fixed colour is connected by an interaction edge in the single index graph.
(iii)
Every (directed) edge of the double index graph is naturally mapped to a G-edge of the single index graph. While mapping equally coloured vertices $x_{1}, \dots, x_{k}$ in the double index graph to vertices u, v connected by an interaction edge $e = (u, v)$ there are $k - 1$ binary choices of whether we map the incoming edge of $x_{j}$ to an incoming edge of u and the outgoing edge of $x_{j}$ to an outgoing edge of v or vice versa. In this process we are free to consider the mapping of $x_{1}$ (or any other vertex, for that matter) as fixed by symmetry of $u \leftrightarrow v$ .
(iv)
If a wiggled G-edge is mapped to an edge from u to v, then v is equipped with a weight of $p f$ . If a wiggled $G^{*}$ -edge is mapped to an edge from u to v, then u is equipped with a weight of $p f$ . All vertices with no weight specified in this process are equipped with the constant weight $1$ .

We define the set $G (p) \subset G$ as the set of all graphs obtained from the double index graphs $G^{av (p, 6 p)}$ via the single index resolution procedure.

Remark 4.3

(i)
We note that some ingredients described in Sect. 4.2 for a typical graph in $G$ will be absent for graphs $Γ \in G (p) \subset G$ . For example, $WE (Γ) = {GE}_{g - m} (Γ) = \emptyset$ for all $Γ \in G (p)$ .
(ii)
We also remark that loops in double index graphs are never mapped into loops in single index graphs along the single index resolution. Indeed, double index loops are always mapped to edges parallel to the interaction edge of the corresponding vertex.

A few simple facts immediately follow from the single index construction in Definition 4.2. From (i) it is clear that the number of vertices in the single index graph is twice the number of colours of the double index graph. From (ii) it follows that the number of interaction edges in the single index graph equals the number of colours of the double index graph. Finally, from (iii) it is obvious that if for some colour c there are $k = k (c)$ vertices in the double index graph with colour c, then the resolution of this colour gives rise to $2^{k (c) - 1}$ single index graphs. Since these resolutions are done independently for each colour, we obtain that the number of single index graphs originating from one double index graph is

\begin{matrix} \prod_{c} 2^{k (c) - 1} \end{matrix}

Since the number of double index graphs in $G^{av (p, 6 p)}$ is finite, so is the number of graphs in $G (p)$ .
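The counting formula can be sketched in a few lines of Python. The function name `num_resolutions` and the input format (one entry $k (c)$ per colour) are illustrative assumptions.

```python
from math import prod

def num_resolutions(colour_sizes):
    """Number of single index graphs arising from one double index graph.

    colour_sizes lists k(c), the number of vertices of colour c, for every
    colour c. Each colour contributes 2**(k(c) - 1) binary choices, as in
    Definition 4.2 (iii).
    """
    if any(k < 1 for k in colour_sizes):
        raise ValueError("every colour must contain at least one vertex")
    return prod(2 ** (k - 1) for k in colour_sizes)

# Two colours with two vertices each, as in the example (4.8).
print(num_resolutions([2, 2]))  # prints 4
```

The value 4 agrees with the four possibilities found for the example graph (4.8).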

Let us present an example of single index resolution applied to the graph from (4.5) where we, for the sake of transparency, label all vertices and edges. $Γ$ is a graph consisting of one 2-cycle on the vertices $x_{1}, y_{2}$ and one 2-cycle on the vertices $x_{2}, y_{1}$ as in

[Graphic: the double index graph $Γ$ with one 2-cycle on $x_{1}, y_{2}$ and one 2-cycle on $x_{2}, y_{1}$ ]

4.8

with $x_{1}, y_{1}$ and $x_{2}, y_{2}$ being of equal colour (i.e. being associated to labels connected through cumulants). In order to explain steps (i)–(iii) of the construction we first neglect that some edges may be wiggled, but we restore the orientation of the edges in the picture. We then fix the mapping of $x_{i}$ to pairs of vertices $(u_{i}, v_{i})$ for $i = 1, 2$ in such a way that the incoming edges of $x_{i}$ are incoming at $u_{i}$ and the outgoing edges from $x_{i}$ are outgoing from $v_{i}$ . It remains to map $y_{i}$ to $(u_{i}, v_{i})$ and for each i there are two choices of doing so, so that we obtain the four possibilities [graphic: the four resolved graphs]

which translates to

4.9

in the language of single index graphs where the S, T assignment agrees with (4.6). Finally we want to visualize step (iv) in the single index resolution in our example. Suppose that in (4.8) the edges $e_{1}$ and $e_{2}$ are G-edges while $e_{3}$ and $e_{4}$ are $G^{*}$ edges with $e_{2}$ and $e_{4}$ being wiggled (in agreement with (4.5)). According to (iv) it follows that the terminal vertex of $e_{2}$ and the initial vertex of $e_{4}$ are equipped with a weight of $p f$ while the remaining vertices are equipped with a weight of $1$ . The first graph in (4.9) would thus be equipped with the weights [graphic: the first graph in (4.9) with its vertex weights]

Single index graph expansion.

With the value definition in (4.7) it follows from Definition 4.2 that

\begin{matrix} E {|〈diag (f p) D〉|}^{p} = N^{- p} \sum_{Γ \in G (p)} Val (Γ) + O (N^{- p}) . \end{matrix}

4.10

We note that in contrast to the value definition for double index graphs (4.4), where each average in (4.4b) contains a 1/N prefactor, the single index graph value (4.7) does not include the $N^{- p}$ prefactor. We chose this convention in this paper mainly because the exponent p in the prefactor $N^{- p}$ cannot be easily read off from the single index graph itself, whereas in the double index graph p is simply the number of connected components.

We now collect some simple facts about the structure of these graphs in $G (p)$ which directly follow from the corresponding properties of the double index graphs listed in Proposition 4.1.

Fact 1

The interaction edges $IE (Γ)$ form a perfect matching of $Γ$ , in particular $|V| = 2 |IE|$ . Moreover, $1 \leq |IE (Γ)| \leq p$ and therefore the number of vertices in the graph is even and satisfies $2 \leq |V (Γ)| \leq 2 p$ . Finally, since for $(u, v) \in IE$ we have ${deg}^{-} (u) = {deg}^{+} (v)$ and ${deg}^{-} (v) = {deg}^{+} (u)$ , consequently also $deg (e) : = deg (u) = deg (v)$ . The degree furthermore satisfies the bounds $2 \leq deg (e) \leq 6 p$ for each $e \in IE (Γ)$ .

Fact 2

The weights associated with the vertices are some non-negative powers of $f p$ in such a way that the total power of all $f p$ ’s is exactly p. The trivial zeroth power, i.e. the constant weight $1$ , is allowed. Furthermore, the $f p$ weights are distributed in such a way that at least one non-trivial $f p$ weight is associated with each interaction edge $(u, v) = e \in IE (Γ)$ .

Examples of graphs

We now turn to some examples explaining the relation between the double index graphs from [34] and single index graphs. We note that the single index graphs actually contain more information because they specify edge direction, specify weights explicitly and differentiate between G and $G^{*}$ edges. This information was not necessary for the power counting arguments used in [34], but for the improved estimates it will be crucial.

We start with the graphs representing the following simple equality following from $κ (α, β) = E w_{α} w_{β}$

\begin{matrix} N^{2} E \sum_{α, β} κ (α, β) 〈diag (f p) Δ^{α} G〉 〈G^{*} Δ^{β} diag {(f p)}^{*}〉 \\ = \sum_{a, b} s_{ab} {(p f)}_{a}^{2} E G_{ba} G_{ab}^{*} + \sum_{a, b} t_{ab} {(p f)}_{a} {(p f)}_{b} E G_{ba} G_{ba}^{*} \end{matrix}

which can be represented as

We now turn to the complete graphical representation for the second moment in the case of Gaussian entries,

4.11

where we again stress that the double index graphs hide the specific weights and the fact that one of the connected components actually contains $G^{*}$ edges. In terms of single index graphs, the rhs. of (4.11) can be represented as the sum over the values of the six graphs

[Graphic: the six single index graphs]

4.12

The first two graphs were already explained above. The additional four graphs come from the second term in the rhs. of (4.11). Since $κ (α_{1}, β_{1})$ is non-zero only if $α_{1} = β_{1}$ or $α_{1} = β_{1}^{t}$ , there are four possible choices of relations among the $α$ and $β$ labels in the two kappa factors. For example, the first graph in the second line of (4.12) corresponds to the choice $α_{1}^{t} = β_{1}$ , $α_{2}^{t} = β_{2}$ . Written out explicitly with summation over single indices, this value is given by

\begin{matrix} \sum_{a_{1}, b_{1}} \sum_{a_{2}, b_{2}} {(p f)}_{a_{1}} {(p f)}_{b_{2}} s_{a_{1} b_{1}} s_{a_{2} b_{2}} E G_{a_{2} a_{1}} G_{b_{1} b_{2}} G_{a_{1} a_{2}}^{*} G_{b_{2} b_{1}}^{*} \end{matrix}

where in the picture the left index corresponds to $a_{1}$ , the top index to $b_{2}$ , the right one to $a_{2}$ and the bottom one to $b_{1}$ .

We conclude this section by providing an example of a graph with some degree higher than two which only occurs in the non-Gaussian situation and might contain looped edges. For example, in the expansion of $N^{2} E {|〈diag (f p) D〉|}^{2}$ in the non-Gaussian setup there is the term

where $r_{ab} = κ (a b, b a, b a) / 2$ and $s_{ab} = κ (a b, b a)$ , in accordance with (4.6).

Simple estimates on $Val (Γ)$

In most cases we aim only at estimating the value of a graph instead of precisely computing it. The simplest power counting estimate on (4.7) uses that the matrix elements of G and those of the generic weight matrix K are bounded by an $O (1)$ constant, while the matrix elements of $R^{(e)}$ are bounded by $N^{- deg (e) / 2}$ . Thus the naive estimate on (4.7) is

\begin{matrix} |Val (Γ)| ≲ (\prod_{v \in V (Γ)} N) (\prod_{e \in IE (Γ)} N^{- deg (e) / 2}) = \prod_{e \in IE (Γ)} N^{2 - deg (e) / 2} \leq \prod_{e \in IE (Γ)} N \leq N^{p} \end{matrix}

4.13

where we used that the interaction edges form a perfect matching and that $deg (e) \geq 2$ , $|IE (Γ)| \leq p$ . The somewhat informal notation $≲$ in (4.13) hides a technical subtlety. The resolvent entries $G_{ab}$ are indeed bounded by an $O (1)$ constant in the sense of very high moments but not almost surely. We will make bounds like the one in (4.13) rigorous in a high moments sense in Lemma 4.8.

The estimate (4.13) ignores the fact that typically only the diagonal resolvent matrix elements of G are of $O (1)$ , the off-diagonal matrix elements are much smaller. This is manifested in the Ward-identity

\begin{matrix} \sum_{a \in J} {|G_{ab}|}^{2} = {(G^{*} G)}_{bb} = \frac{{(G - G^{*})}_{bb}}{2 i η} = \frac{\Im G_{bb}}{η} . \end{matrix}

4.14a

Thus the sum of off-diagonal resolvent elements $G_{ab}$ is usually smaller than its naive size of order N, at least in the regime $η ≫ N^{- 1}$ . This is quantified by the so called Ward estimates

\begin{matrix} \sum_{a \in J} {|G_{ab}|}^{2} = N \frac{\Im G_{bb}}{N η} ≲ N ψ^{2}, \quad \sum_{a \in J} |G_{ab}| ≲ N ψ, \quad ψ := {(\frac{ρ}{N η})}^{1 / 2} . \end{matrix}

4.14b

Similarly to (4.13) the inequalities $≲$ in (4.14b) are meant in a power counting sense ignoring that the entries of $\Im G$ might not be bounded by $ρ$ almost surely but only in some high moment sense.
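The Ward identity (4.14a) is an exact algebraic identity for the resolvent $G = {(H - z)}^{- 1}$ of any Hermitian matrix H with $z = E + i η$ , $η > 0$ , so it can be checked numerically. The following standard-library Python sketch does this for a small random Hermitian matrix; the helper names, the matrix size, the entry distribution and the spectral parameter are all illustrative assumptions.

```python
import random

def resolvent(H, z):
    """Return G = (H - z)^{-1} via Gauss-Jordan elimination with partial pivoting."""
    n = len(H)
    # Augmented matrix [H - z | I] with complex entries.
    A = [[complex(H[i][j]) - (z if i == j else 0) for j in range(n)]
         + [1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        pivot = A[col][col]
        A[col] = [x / pivot for x in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]

def random_hermitian(n, rng):
    """Hermitian matrix with independent Gaussian entries above the diagonal."""
    H = [[0j] * n for _ in range(n)]
    for i in range(n):
        H[i][i] = complex(rng.gauss(0, 1), 0) / n ** 0.5
        for j in range(i + 1, n):
            H[i][j] = complex(rng.gauss(0, 1), rng.gauss(0, 1)) / (2 * n) ** 0.5
            H[j][i] = H[i][j].conjugate()
    return H

def ward_defect(n=8, E=0.3, eta=0.05, seed=0):
    """Largest deviation over b of sum_a |G_ab|^2 from Im G_bb / eta."""
    G = resolvent(random_hermitian(n, random.Random(seed)), complex(E, eta))
    return max(abs(sum(abs(G[a][b]) ** 2 for a in range(n)) - G[b][b].imag / eta)
               for b in range(n))

print(ward_defect())  # expected to be at the level of floating point rounding
```

The check concerns only the identity (4.14a); the estimates (4.14b) additionally require the local law input on $\Im G_{bb}$ and are not tested by this sketch.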

As a consequence of (4.14b) we can gain a factor of $ψ$ for each off-diagonal (that is, connecting two separate vertices) G-factor, but clearly only for at most two G-edges per adjacent vertex. Moreover, this gain can obviously only be used once for each edge and not twice, separately when summing up the indices at both adjacent vertices. As a consequence a careful counting of the total number of $ψ$ -gains is necessary, see [34, Section 4.3] for details.

Ward bounds for the example graphs from Sect. 4.4. From the single index graphs drawn in (4.12) we can easily obtain the known bound $E {|〈diag (f p) D〉|}^{2} ≲ ψ^{4}$ . Indeed, the last four graphs contribute a combinatorial factor of $N^{4}$ from the summations over four single indices and a scaling factor of $N^{- 2}$ from the size of S, T. Furthermore, we can gain a factor of $ψ$ for each G-edge through Ward estimates and the bound follows. Similarly, the first two graphs contribute a factor of $N = N^{2 - 1}$ from summation and S/T and a factor of $ψ^{2}$ from the Ward estimates, which overall gives $N^{- 1} ψ^{2} ≲ ψ^{4}$ . As this example shows, the bookkeeping of available Ward-estimates is important and we will do so systematically in the following sections.
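Written out, the power counting for the six graphs reads as follows. The prefactor $N^{- 2}$ is the one from (4.10) with $p = 2$ , and the last step uses $N^{- 1 / 2} ≲ ψ$ ; the factors are listed in the order prefactor, index summation, size of S/T, Ward gain.

```latex
\begin{align*}
\text{last four graphs in (4.12):}\quad
  & N^{-2} \cdot N^{4} \cdot N^{-2} \cdot \psi^{4} = \psi^{4}, \\
\text{first two graphs in (4.12):}\quad
  & N^{-2} \cdot N^{2} \cdot N^{-1} \cdot \psi^{2} = N^{-1} \psi^{2} \lesssim \psi^{4}.
\end{align*}
```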

Improved estimates on $Val (Γ)$ : Wardable edges

For the sake of transparency we briefly recall the combinatorial argument used in [34], which also provides the starting point for the refined estimate in the present paper. Compared to [34], however, we phrase the counting argument directly in the language of the single index graphs. We only aim to gain from the G-edges adjacent to vertices of degree two or three; for vertices of higher degree the most naive estimate $|G_{ab}| ≲ 1$ is already sufficient as demonstrated in [34]. We collect the vertices of degree two and three in the set $V_{2, 3}$ and collect the G-edges adjacent to $V_{2, 3}$ in the set $E_{2, 3}$ . In [34, Section 4.3] a specific marking procedure on the G-edges of the graph is introduced that has the following properties. For each $v \in V_{2, 3}$ we put a mark on at most two adjacent G-edges in such a way that those edges can be estimated via (4.14b) while performing the $a_{v}$ summation. In this case we say that the mark comes from the v-perspective. An edge may have two marks coming from the perspective of each of its adjacent vertices. Later, marked edges will be estimated via (4.14b) while summing up $a_{v}$ . After doing this for all of $V_{2, 3}$ we call an edge in $E_{2, 3}$ marked effectively if it either (i) has two marks, or (ii) has one mark and is adjacent to only one vertex from $V_{2, 3}$ . While subsequently using (4.14b) in the summation of $a_{v}$ for $v \in V_{2, 3}$ (in no particular order) on the marked edges (and estimating the remaining edges adjacent to v trivially) we can gain at least as many factors of $ψ$ as there are effectively marked edges. Indeed, this follows simply from the fact that effectively marked edges are never estimated trivially during the procedure just described, no matter the order of vertex summation.

Fact 3

For each $Γ \in G (p)$ there is a marking of edges adjacent to vertices of degree at most 3 such that there are at least $\sum_{e \in IE (Γ)} {(4 - deg (e))}_{+}$ effectively marked edges.

Proof

On the one hand we find from Fact 1 (more specifically, from the equality $deg (e) = deg (u) = deg (v)$ for $(u, v) = e \in IE (Γ)$ ) that

\begin{matrix} |E_{2, 3}| \geq \sum_{v \in V_{2, 3}} \frac{1}{2} deg (v) = \sum_{e \in IE (Γ), deg (e) \in \{2, 3\}} deg (e) . \end{matrix}

4.15

On the other hand it can be checked that for every pair $(u, v) = e \in IE (Γ)$ with $deg (e) = 2$ all G-edges adjacent to u or v can be marked from the u, v-perspective. Indeed, this is a direct consequence of Proposition 4.1(d): Because the two vertices in the double index graph being resolved to (u, v) cannot be part of the same cycle it follows that all of the (two, three or four) G-edges adjacent to the vertices with index u or v are not loops (i.e. do not represent diagonal resolvent elements). Therefore they can be bounded by using (4.14b). Similarly, it can be checked that for every edge $(u, v) = e \in IE (Γ)$ with $deg (e) = 3$ at most two G-edges adjacent to u or v can remain unmarked from the u, v-perspective. By combining these two observations it follows that at most

\begin{matrix} \sum_{e \in IE (Γ), deg (e) \in \{2, 3\}} (2 deg (e) - 4) \end{matrix}

4.16

edges in $E_{2, 3}$ are ineffectively marked since those are counted as unmarked from the perspective of one of its vertices. Subtracting (4.16) from (4.15) it follows that in total at least

\begin{matrix} \sum_{e \in IE (Γ)} {(4 - deg (e))}_{+} = \sum_{e \in IE (Γ), deg (e) \in \{2, 3\}} (4 - deg (e)) \end{matrix}

edges are marked effectively, just as claimed. $□$

In [34] it was sufficient to estimate the value of each graph in $G (p)$ by subsequently estimating all effectively marked edges using (4.14b). For the purpose of improving the local law at the cusp, however, we need to introduce certain operations on the graphs of $G (p)$ which allow us to estimate the graph value to a higher accuracy. It is essential that during those operations we keep track of the number of edges we estimate using (4.14b). Therefore we now introduce a more flexible way of recording these edges. We first recall a basic definition [58] from graph theory.

Definition 4.4

For $k \geq 1$ a graph $Γ = (V, E)$ is called k-degenerate if any induced subgraph has minimal degree at most k.

It is well known that being k-degenerate is equivalent to the following sequential property. We provide a short proof for convenience.

Lemma 4.5

A graph $Γ = (V, E)$ is k-degenerate if and only if there exists an ordering of vertices $\{v_{1}, \dots, v_{n}\} = V$ such that for each $m \in [n] := \{1, \dots, n\}$ it holds that

\begin{matrix} {deg}_{Γ [\{v_{1}, \dots, v_{m}\}]} (v_{m}) \leq k \end{matrix}

4.17

where for $V^{'} \subset V$ , $Γ [V^{'}]$ denotes the induced subgraph on the vertex set $V^{'}$ .

Proof

Suppose the graph is k-degenerate and let $n := |V|$ . Then there exists some vertex $v_{n} \in V$ such that $deg (v_{n}) \leq k$ by definition. We now consider the subgraph induced by $V^{'} := V \setminus \{v_{n}\}$ and, by definition, again find some vertex $v_{n - 1} \in V^{'}$ of degree ${deg}_{Γ [V^{'}]} (v_{n - 1}) \leq k$ . Continuing inductively we find a vertex ordering with the desired property.

Conversely, assume there exists a vertex ordering such that (4.17) holds for each m. Let $V^{'} \subset V$ be an arbitrary subset and let $m := max \{l \in [n] \mid v_{l} \in V^{'}\}$ . Then it holds that

\begin{matrix} {deg}_{Γ [V^{'}]} (v_{m}) \leq {deg}_{Γ [\{v_{1}, \dots, v_{m}\}]} (v_{m}) \leq k \end{matrix}

and the proof is complete. $□$

The reason for introducing this graph theoretical notion is that it is equivalent to the possibility of estimating edges effectively using (4.14b). A subset ${GE}^{'}$ of G-edges in $Γ \in G$ can be fully estimated using (4.14b) if and only if there exists a vertex ordering such that we can subsequently remove vertices in such a way that in each step at most two edges from ${GE}^{'}$ are removed. Due to Lemma 4.5 this is the case if and only if $Γ^{'} = (V, {GE}^{'})$ is 2-degenerate. For example, the graph $Γ_{eff} = (V, {GE}_{eff})$ induced by the effectively marked G-edges ${GE}_{eff}$ is a 2-degenerate graph. Indeed, each effectively marked edge is adjacent to at least one vertex which has degree at most 2 in $Γ_{eff}$ : Vertices of degree 2 in $(V, GE)$ are trivially at most of degree 2 in $Γ_{eff}$ , and vertices of degree 3 in $(V, GE)$ are also at most of degree 2 in $Γ_{eff}$ as they can only be adjacent to 2 effectively marked edges. Consequently any induced subgraph of $Γ_{eff}$ has to contain some vertex of degree at most 2 and thereby $Γ_{eff}$ is 2-degenerate.

Definition 4.6

For a graph $Γ = (V, GE \cup IE \cup WE) \in G$ we call a subset of G-edges ${GE}_{W} \subset GE$ Wardable if the subgraph $(V, {GE}_{W})$ is 2-degenerate.
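The sequential characterisation in Lemma 4.5 translates into a simple greedy test, which for $k = 2$ decides whether a given edge set is Wardable in the sense of Definition 4.6. The Python sketch below is one possible implementation under the following assumptions: edges are treated as undirected, parallel edges are counted with multiplicity, loops are excluded, and the function names are illustrative.

```python
from collections import defaultdict

def degeneracy_ordering(vertices, edges, k):
    """Return an ordering v_1, ..., v_n satisfying (4.17), or None if none exists.

    edges is a list of pairs (u, v) of distinct vertices; parallel edges are
    allowed and counted with multiplicity.
    """
    deg = {v: 0 for v in vertices}
    nbrs = defaultdict(list)
    for u, v in edges:
        if u == v:
            raise ValueError("loops are not handled in this sketch")
        deg[u] += 1
        deg[v] += 1
        nbrs[u].append(v)
        nbrs[v].append(u)
    remaining = set(vertices)
    removed = []  # collects v_n, v_{n-1}, ..., v_1
    while remaining:
        v = min(remaining, key=lambda x: deg[x])
        if deg[v] > k:
            # The induced subgraph on `remaining` has minimal degree > k.
            return None
        remaining.remove(v)
        removed.append(v)
        for w in nbrs[v]:
            if w in remaining:
                deg[w] -= 1
    return removed[::-1]

def is_k_degenerate(vertices, edges, k):
    return degeneracy_ordering(vertices, edges, k) is not None

# A 4-cycle is 2-degenerate, the complete graph on 4 vertices is not.
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
k4 = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
print(is_k_degenerate([1, 2, 3, 4], cycle, 2), is_k_degenerate([1, 2, 3, 4], k4, 2))
```

Removing a vertex of minimal degree at each step is sufficient: if the minimal degree of the remaining induced subgraph exceeds k at some stage, that subgraph itself violates Definition 4.4.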

Lemma 4.7

For each $Γ \in G (p)$ there exists a Wardable subset ${GE}_{W} \subset GE$ of size

\begin{matrix} |{GE}_{W}| = \sum_{e \in IE} {(4 - deg (e))}_{+} . \end{matrix}

4.18

Proof

This follows immediately from Fact 3, the observation that $(V, {GE}_{eff})$ is 2-degenerate and the fact that sub-graphs of 2-degenerate graphs remain 2-degenerate.

$□$

For each $Γ \in G (p)$ we choose a Wardable subset ${GE}_{W} (Γ) \subset GE (Γ)$ satisfying (4.18). At least one such set is guaranteed to exist by the lemma. For graphs with several possible such sets, we arbitrarily choose one, and consider it permanently assigned to $Γ$ . Later we will introduce certain operations on graphs $Γ \in G (p)$ which produce families of derived graphs $Γ^{'} \in G \supset G (p)$ . During those operations the chosen Wardable subset ${GE}_{W} (Γ)$ will be modified in order to produce eligible sets of Wardable edges ${GE}_{W} (Γ^{'})$ and we will select one among those to define the Wardable subset of $Γ^{'}$ . We stress that the relation (4.18) on the Wardable set is required only for $Γ \in G (p)$ but not for the derived graphs $Γ^{'}$ .

We now give a precise meaning to the vague bounds of (4.13), (4.14b). We define the N-exponent, $n (Γ)$ , of a graph $Γ = (V, GE \cup IE \cup WE)$ as the effective N-exponent in its value-definition, i.e. as

\begin{matrix} n (Γ) : = |V| - \sum_{e \in IE} \frac{deg (e)}{2} - \sum_{e \in WE} l (e) . \end{matrix}
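As a small illustration, the N-exponent can be computed directly from the number of vertices, the degrees of the interaction edges and the exponents of the weighted edges. The function name `n_exponent` and its argument format are illustrative assumptions; exact fractions are used because $deg (e) / 2$ may be a half-integer.

```python
from fractions import Fraction

def n_exponent(num_vertices, interaction_degrees, weighted_exponents=()):
    """N-exponent n(Γ) = |V| - sum_{e in IE} deg(e)/2 - sum_{e in WE} l(e)."""
    n = Fraction(num_vertices)
    n -= sum(Fraction(d, 2) for d in interaction_degrees)
    n -= sum(Fraction(l) for l in weighted_exponents)
    return n

print(n_exponent(4, [2, 2]))  # last four graphs in (4.12): 4 - 1 - 1 = 2
print(n_exponent(2, [2]))     # first two graphs in (4.12): 2 - 1 = 1
```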

We defer the proof of the following technical lemma to the Appendix.

Lemma 4.8

For any $c > 0$ there exists some $C > 0$ such that the following holds. Let $Γ = (V, GE \cup IE \cup WE) \in G$ be a graph with Wardable edge set ${GE}_{W} \subset GE$ and at most $|V| \leq c p$ vertices and at most $|GE| \leq c p^{2}$ G-edges. Then for each $0 < ϵ < 1$ it holds that

\begin{matrix} |Val (Γ)| \leq_{ϵ} N^{ϵ p} (1 + {‖ G ‖}_{q})^{C p^{2}} \text{W-Est} (Γ), \end{matrix}

4.19a

where

\begin{matrix} \text{W-Est} (Γ) := N^{n (Γ)} (ψ + ψ_{q}^{'})^{|{GE}_{W}|} (ψ + ψ_{q}^{'} + ψ_{q}^{''})^{|{GE}_{g - m}|}, \quad q := C p^{3} / ϵ . \end{matrix}

4.19b

Remark 4.9

(i)
We consider $ϵ$ and p as fixed within the proof of Theorem 3.7 and therefore do not explicitly track the dependence on them in quantities like $\text{W-Est}$ .
(ii)
We recall that the factors involving ${GE}_{g - m}$ and $WE$ do not play any role for graphs $Γ \in G (p)$ as those sets are empty in this restricted class of graphs (see Remark 4.3).
(iii)
Ignoring the difference between $ψ$ and $ψ_{q}^{'}$ , $ψ_{q}^{''}$ and the irrelevant order $O (N^{p ϵ})$ factor in (4.19), the reader should think of (4.19) as the heuristic inequality
$\begin{matrix} |Val (Γ)| ≲ N^{n (Γ)} ψ^{|{GE}_{W}| + |{GE}_{g - m}|} . \end{matrix}$
Using Lemma 4.7, $N^{- 1 / 2} ≲ ψ ≲ 1$ , $|V| = 2 |IE| \leq 2 p$ and $deg (e) \geq 2$ (from Fact 1) we thus find
$\begin{matrix} N^{- p} |Val (Γ)| & ≲ N^{|IE| - p} \prod_{e \in IE} N^{1 - deg (e) / 2} ψ^{{(4 - deg (e))}_{+}} \\ & ≲ ψ^{2 p - 2 |IE|} \prod_{e \in IE} ψ^{deg (e) - 2 + {(4 - deg (e))}_{+}} \leq ψ^{2 p} \end{matrix}$ 4.20
for any $Γ = (V, GE \cup IE) \in G (p)$ .
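For the reader's convenience, the exponent bookkeeping behind the last two steps can be spelled out as follows. It only uses $N^{- 1 / 2} ≲ ψ ≲ 1$ and $|IE| \leq p$ .

```latex
\begin{align*}
N^{|IE| - p} &= \bigl(N^{-1/2}\bigr)^{2(p - |IE|)} \lesssim \psi^{2p - 2|IE|},
\qquad
N^{1 - \deg(e)/2} = \bigl(N^{-1/2}\bigr)^{\deg(e) - 2} \lesssim \psi^{\deg(e) - 2}, \\
\deg(e) - 2 + (4 - \deg(e))_{+} &=
\begin{cases}
2, & \deg(e) \in \{2, 3\}, \\
\deg(e) - 2 \geq 2, & \deg(e) \geq 4,
\end{cases} \\
\psi^{2p - 2|IE|} \prod_{e \in IE} \psi^{\deg(e) - 2 + (4 - \deg(e))_{+}}
  &\lesssim \psi^{2p - 2|IE|}\, \psi^{2|IE|} = \psi^{2p}.
\end{align*}
```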

Improved estimates on $Val (Γ)$ at the cusp: $σ$ -cells

Definition 4.10

For $Γ \in G$ we call an interaction edge $(u, v) = e \in IE (Γ)$ a $σ$ -cell if the following four properties hold: (i) $deg (e) = 2$ , (ii) there are no G-loops adjacent to u or v, (iii) precisely one of u, v carries a weight of $p f$ while the other carries a weight of $1$ , and (iv) e is not adjacent to any other non- $GE$ edges. Pictorially, possible $σ$ -cells are given by

For $Γ \in G$ we denote the number of $σ$ -cells in $Γ$ by $σ (Γ)$ .

Next, we state a simple lemma, estimating $\text{W-Est} (Γ)$ for graphs in the restricted class $Γ \in G (p)$ .

Lemma 4.11

For each $Γ = (V, IE \cup GE) \in G (p)$ it holds that

\begin{matrix} N^{- p} |\text{W-Est} (Γ)| \leq_{p} (\sqrt{η / ρ})^{p - σ (Γ)} {(ψ + ψ_{q}^{'})}^{2 p} \prod_{\substack{e \in IE \\ deg (e) \geq 4}} N^{2 - deg (e) / 2} . \end{matrix}

Proof

We introduce the short-hand notations ${IE}_{k} := \{e \in IE \mid deg (e) = k\}$ and ${IE}_{\geq k} := ⋃_{l \geq k} {IE}_{l}$ . Starting from (4.19b) and Lemma 4.7 we find

\begin{matrix} N^{- p} |\text{W-Est} (Γ)| \\ \leq N^{- (p - |IE|)} (\prod_{e \in {IE}_{2}} {(ψ + ψ_{q}^{'})}^{2}) (\prod_{e \in {IE}_{3}} \frac{ψ + ψ_{q}^{'}}{\sqrt{N}}) (\prod_{e \in {IE}_{\geq 4}} \frac{1}{N}) (\prod_{e \in {IE}_{\geq 4}} N^{2 - deg (e) / 2}) . \end{matrix}

Using $N^{- 1 / 2} = ψ \sqrt{η / ρ} \leq C ψ$ it then follows that

\begin{matrix} N^{- p} |\text{W-Est} (Γ)| \\ \leq_{p} [\frac{η}{ρ} ψ^{2}]^{p - |IE|} (\prod_{e \in {IE}_{2}} {(ψ + ψ_{q}^{'})}^{2}) (\prod_{e \in {IE}_{\geq 3}} \sqrt{\frac{η}{ρ}} {(ψ + ψ_{q}^{'})}^{2}) (\prod_{e \in {IE}_{\geq 4}} N^{2 - deg (e) / 2}) . \end{matrix}

4.21

It remains to relate (4.21) to the number $σ (Γ)$ of $σ$ -cells in $Γ$ . Since each interaction edge of degree two which is not a $σ$ -cell has an additional weight $p f$ attached to it, it follows from Fact 2 that $|{IE}_{2}| - σ (Γ) \leq p - |IE|$ . Therefore, from (4.21) and $η / ρ \leq C$ we have that

\begin{matrix} N^{- p} |\text{W-Est} (Γ)| \\ \leq_{p} [\sqrt{η / ρ} {(ψ + ψ_{q}^{'})}^{2}]^{p - |IE| + |{IE}_{\geq 3}| + |{IE}_{2}| - σ (Γ)} [{(ψ + ψ_{q}^{'})}^{2}]^{σ (Γ)} (\prod_{e \in {IE}_{\geq 4}} N^{2 - deg (e) / 2}), \end{matrix}

proving the claim. $□$
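The exponent bookkeeping in the last step can be made explicit as follows, where $Ψ := ψ + ψ_{q}^{'}$ is an abbreviation introduced only for this display. The first line uses $ψ \leq Ψ$ , $η / ρ \leq C$ and $0 \leq |{IE}_{2}| - σ (Γ) \leq p - |IE|$ ; the last two lines use $|IE| = |{IE}_{2}| + |{IE}_{\geq 3}|$ .

```latex
\begin{align*}
\Bigl[\tfrac{\eta}{\rho}\,\psi^{2}\Bigr]^{p - |IE|}
  &\leq_{p} \Bigl[\sqrt{\eta/\rho}\,\Psi^{2}\Bigr]^{p - |IE|}
     \bigl(\sqrt{\eta/\rho}\bigr)^{|IE_{2}| - \sigma(\Gamma)}, \\
\prod_{e \in IE_{2}} \Psi^{2}
  &= \bigl[\Psi^{2}\bigr]^{|IE_{2}| - \sigma(\Gamma)}\,\bigl[\Psi^{2}\bigr]^{\sigma(\Gamma)}, \\
\text{power of } \sqrt{\eta/\rho}:\quad
  & p - |IE| + |IE_{\geq 3}| + |IE_{2}| - \sigma(\Gamma) = p - \sigma(\Gamma), \\
\text{power of } \Psi^{2}:\quad
  & \bigl(p - |IE| + |IE_{\geq 3}| + |IE_{2}| - \sigma(\Gamma)\bigr) + \sigma(\Gamma) = p .
\end{align*}
```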

Using Lemma 4.8 and $\sqrt{η / ρ} \leq σ_{q}$ , the estimate in Lemma 4.11 has improved the previous bound (4.20) by a factor $σ_{q}^{p - σ (Γ)}$ (ignoring the irrelevant factors). In order to prove (3.11c), we thus need to remove the $- σ (Γ)$ from this exponent, in other words, we need to show that from each $σ$ -cell we can multiplicatively gain a factor of $σ_{q}$ . This is the content of the following proposition.

Proposition 4.12

Let $c > 0$ be any constant and $Γ \in G$ be a single index graph with at most cp vertices and $c p^{2}$ edges with a $σ$ -cell $(u, v) = e \in IE (Γ)$ . Then there exists a finite collection of graphs $\{Γ_{σ}\} ⊔ G_{Γ}$ with at most one additional vertex and at most 6p additional G-edges such that

\begin{matrix} \begin{matrix} Val (Γ) & = σ Val (Γ_{σ}) + \sum_{Γ^{'} \in G_{Γ}} Val (Γ^{'}) + O (N^{- p}), \\ W - Est (Γ_{σ}) & = W - Est (Γ), W - Est (Γ^{'}) \leq_{p} σ_{q} W - Est (Γ), Γ^{'} \in G_{Γ} \end{matrix} \end{matrix}

4.22

and all graphs $Γ_{σ}$ and $Γ^{'} \in G_{Γ}$ have exactly one $σ$ -cell less than $Γ$ .

Using Lemmas 4.8 and 4.11 together with the repeated application of Proposition 4.12 we are ready to present the proof of Theorem 3.7.

Proof of Theorem 3.7

We remark that the isotropic local law (3.11a) and the averaged local law (3.11b) are verbatim as in [34, Theorem 4.1]. We therefore only prove the improved bound (3.11c)–(3.11d) in the remainder of the section. We recall (4.10) and partition the set of graphs $G (p) = G_{0} (p) \cup G_{\geq 1} (p)$ into those graphs $G_{0} (p)$ with no $σ$ -cells and those graphs $G_{\geq 1} (p)$ with at least one $σ$ -cell. For the latter group we then use Proposition 4.12 for some $σ$ -cell to find

\begin{matrix} \begin{matrix} E {|〈diag (p f) D〉|}^{p} & = N^{- p} \sum_{Γ \in G_{0} (p)} Val (Γ) + O (N^{- 2 p}) \\ + N^{- p} \sum_{Γ \in G_{\geq 1} (p)} (σ Val (Γ_{σ}) + \sum_{Γ^{'} \in G_{Γ}} Val (Γ^{'})), \end{matrix} \end{matrix}

4.23

where the number of $σ$ -cells is reduced by 1 for $Γ_{σ}$ and each $Γ^{'} \in G_{Γ}$ as compared to $Γ$ . We note that the Ward-estimate $W - Est (Γ)$ from Lemma 4.11 together with Lemma 4.8 is already sufficient for the graphs in $G_{0} (p)$ . For those graphs $G_{1} (p)$ with exactly one $σ$ -cell the expansion in (4.23) is sufficient because $σ \leq σ_{q}$ and, according to (4.22), each $Γ^{'} \in G_{Γ}$ has a Ward estimate which is already improved by $σ_{q}$ . For the other graphs we iterate the expansion from Proposition 4.12 until no $σ$ -cells are left.

It only remains to count the number of G-edges and vertices in the successively derived graphs to make sure that Lemma 4.8 and Proposition 4.12 are applicable and that the last two factors in (3.11c) come out as claimed. Since each of the $σ (Γ) \leq p$ applications of Proposition 4.12 creates at most $6p$ additional G-edges and one additional vertex, it follows that $|GE (Γ)| \leq C^{'} p^{2}$ , $|V| \leq C^{'} p$ also in any successively derived graph. Finally, it follows from the last factor in Lemma 4.11 that for each $e \in IE$ with $deg (e) \geq 5$ we gain additional factors of $N^{- 1 / 2}$ . Since $|IE| \leq p$ , we easily conclude that if there are more than $4p$ G-edges, then each of them comes with an additional gain of $N^{- 1 / 2}$ . Now (3.11c) follows immediately after taking the $p$th root.

We turn to the proof of (3.11d). We first write out

\begin{matrix} 〈diag (p f) [T ⊙ G^{t}] G〉 = \frac{1}{N} \sum_{a, b} {(p f)}_{a} t_{ab} G_{ba} G_{ba} \end{matrix}

and therefore can, for even p, write the pth moment as the value

\begin{matrix} E {|〈diag (p f) [T ⊙ G^{t}] G〉|}^{p} = N^{- p} Val (Γ_{0}) \end{matrix}

of the graph $Γ_{0} = (V, GE \cup IE) \in G$ which is given by $p$ disjoint 2-cycles (figure not reproduced),

where there are p/2 cycles of G-edges and p/2 cycles of $G^{*}$ edges. It is clear that $(V, GE)$ is 2-degenerate and since $|GE| = 2 p$ it follows that

\begin{matrix} W - Est (Γ_{0}) \leq N^{p} {(ψ + ψ_{q}^{'})}^{2 p} . \end{matrix}

On the other hand each of the $p$ interaction edges in $Γ_{0}$ is a $σ$ -cell and we can use Proposition 4.12 $p$ times to obtain (3.11d) just as in the proof of (3.11c). $□$

Proof of Proposition 4.12

It follows from the MDE that

\begin{matrix} G = M - M S [M] G - M W G = M - G S [M] M - G W M, \end{matrix}

which we use to locally expand a term of the form $G_{xa} G_{ay}^{*}$ for fixed a, x, y further. To make the computation local we allow for an arbitrary random function $f = f (W)$ , which in practice encodes the remaining G-edges in the graph. A simple cumulant expansion shows

\begin{matrix} \sum_{b} B_{ab} E G_{xb} G_{by}^{*} f & = E M_{xa} G_{ay}^{*} f - \sum_{k = 2}^{6 p} \sum_{b} \sum_{β \in I^{k}} κ (b a, \underline{β}) m_{a} E \partial_{β} [G_{xb} G_{ay}^{*} f] + O (N^{- p}) \\ + \sum_{b} s_{ba} m_{a} E [G_{xa} {(g - m)}_{b} G_{ay}^{*} + G_{xb} {\bar{(g - m)}}_{a} G_{by}^{*} - G_{xb} G_{ay}^{*} \partial_{ab}] f \\ + \sum_{b} t_{ba} m_{a} E [G_{xb} {(G - M)}_{ab} G_{ay}^{*} + G_{xb} G_{ab}^{*} G_{ay}^{*} - G_{xb} G_{ay}^{*} \partial_{ba}] f \end{matrix}

4.24

where $\partial_{α} : = \partial_{w_{α}}$ and where we introduced the stability operator $B : = 1 - diag ({|m|}^{2}) S$ . The stability operator B appears from rearranging the equation obtained from the cumulant expansion to express the quantity $E G_{xb} G_{by}^{*} f$ . In our graphical representation, the stability operator is a special edge that we can also express as

4.25

An equality like (4.25) is meant locally in the sense that the pictures only represent subgraphs of the whole graph with the empty, labelled vertices symbolizing those vertices which connect the subgraph to its complement. Thus (4.25) holds true for every fixed graph extending x, y consistently in all three graphs. The doubly drawn edge in (4.25) means that the external vertices x, y are identified with each other and the associated indices are set equal via a $δ_{a_{x}, a_{y}}$ function. Thus (4.25) should be understood as the equality

4.26

where the outside edges incident at the merged vertices x, y are reconnected to one common vertex in the middle graph. For example, in the picture (4.26) the vertex x is connected to the rest of the graph by two edges, and the vertex y by one.

In order to represent (4.24) in terms of graphs we have to define a notion of differential edge. First, we define a targeted differential edge represented by an interaction edge with a red $\partial$ -sign written on top and a red-coloured target G-edge to denote the collection of graphs

4.27

The second picture in (4.27) shows that the target G-edge may be a loop; the definition remains the same. This definition extends naturally to $G^{*}$ edges and is exactly the same for $G - M$ edges (note that this is compatible with the usual notion of derivative as M does not depend on W). Graphs with the differential signs should be viewed only as an intermediate simplifying picture but they really mean the collection of graphs indicated in the right hand side of (4.27). They represent the identities

\begin{matrix} \begin{matrix} \sum_{α} κ (u v, α) \partial_{α} G_{xy} & = - s_{uv} G_{xv} G_{uy} - t_{uv} G_{xu} G_{vy}, \\ \sum_{α} κ (u v, α) \partial_{α} G_{xx} & = - s_{uv} G_{xv} G_{ux} - t_{uv} G_{xu} G_{vx} \end{matrix} \end{matrix}

In other words we introduced these graphs only to temporarily encode expressions with derivatives (e.g. the second term on the rhs. of (4.24)) before the differentiation is actually performed. We can then further define the action of an untargeted differential edge according to the Leibniz rule as the collection of graphs with the differential edge being targeted on all G-edges of the graph one by one (in particular not only those in the displayed subgraph), i.e. for example

4.28

Here the union is a union in the sense of multisets, i.e. allows for repetitions in the resulting set (note that also this is compatible with the usual action of derivative operations). The $⊔ \dots$ symbol on the rhs. of (4.28) indicates that the targeted edge cycles through all G-edges in the graph, not only the ones in the subgraph. For example, if there are k G-edges in the graph, then the picture (4.28) represents a collection of 2k graphs arising from performing the differentiation

\begin{matrix} \begin{matrix} \sum_{α} κ (u v, α) \partial_{α} [G_{xy} G_{yz} f] \\ = \sum_{α} κ (u v, α) [\partial_{α} G_{xy}] G_{yz} f + \sum_{α} κ (u v, α) G_{xy} [\partial_{α} G_{yz}] f \\ + \sum_{α} κ (u v, α) G_{xy} G_{yz} [\partial_{α} f] \\ = - s_{uv} [G_{xv} G_{uy} G_{yz} f + G_{xy} G_{yv} G_{uz} f - G_{xy} G_{yz} (\partial_{vu} f)] \\ - t_{uv} [G_{xu} G_{vy} G_{yz} f + G_{xy} G_{yu} G_{vz} f - G_{xy} G_{yz} (\partial_{uv} f)], \end{matrix} \end{matrix}

where $f = f (W)$ represents the value of the G-edges outside the displayed subgraph.
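The identities behind the targeted differential edge rest on the resolvent derivative $\partial_{uv} G_{xy} = - G_{xu} G_{vy}$ , where $w_{uv}$ is treated as an independent variable. The following minimal sketch checks this numerically by a central finite difference. The $2 \times 2$ matrix, the spectral parameter and all function names are hypothetical choices made for illustration only; they are not taken from the paper.

```python
def resolvent(W, z):
    """Return G = (W - z)^{-1} for a 2x2 complex matrix W given as nested lists."""
    a, b = W[0][0] - z, W[0][1]
    c, d = W[1][0], W[1][1] - z
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def finite_difference(W, z, u, v, x, y, h=1e-6):
    """Central difference of G_xy with respect to the single entry w_uv."""
    Wp = [row[:] for row in W]
    Wm = [row[:] for row in W]
    Wp[u][v] += h
    Wm[u][v] -= h
    return (resolvent(Wp, z)[x][y] - resolvent(Wm, z)[x][y]) / (2 * h)


def max_identity_error(W, z):
    """Largest deviation from d G_xy / d w_uv = -G_xu G_vy over all indices."""
    G = resolvent(W, z)
    idx = range(2)
    return max(
        abs(finite_difference(W, z, u, v, x, y) + G[x][u] * G[v][y])
        for u in idx for v in idx for x in idx for y in idx
    )


# Hypothetical sample data (assumption, for illustration only).
W = [[0.2 + 0j, 0.5 - 0.1j], [0.5 + 0.1j, -0.4 + 0j]]
z = 0.3 + 1.0j
print(max_identity_error(W, z))  # small, limited only by the finite-difference error
```

Summing this derivative against the second cumulants $κ (u v, v u) = s_{uv}$ and $κ (u v, u v) = t_{uv}$ gives the two terms on the right hand side of the identities above.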

Finally we introduce a notation for a differential edge which is targeted on all G-edges except for those in the displayed subgraph. This differential edge targeted on the outside will be denoted by $\hat{\partial}$ .

Regarding the value of the graph, we define the value of a collection of graphs as the sum of their values. We note that this definition is for the collection of graphs encoded by the differential edges also consistent with the usual differentiation.

Written in a graphical form (4.24) reads

[Graphical equation; image not reproduced]

4.29

where the ultimate graph encodes the ultimate terms in the last two lines of (4.24).

We worked out the example for the resolution of the quantity $E G_{xa} G_{ay}^{*} f$ , but very similar formulas hold if the order of the fixed indices (x, y) and the summation index a changes in the resolvents, as well as for other combinations of the complex conjugates. In graphical language this corresponds to changing the arrows of the two G-edges adjacent to a, as well as their types. In other words, equalities like the one in (4.29) hold true for any other degree two vertex but the stability operator changes slightly: In total there are 16 possibilities, four for whether the two edges are incoming or outgoing at a and another four for whether the edges are of type G or of type $G^{*}$ . The general form for the stability operator is

\begin{matrix} B : = 1 - diag (m^{#_{1}} m^{#_{2}}) R, \end{matrix}

4.30

where $R = S$ if there is one incoming and one outgoing edge, $R = T$ if there are two outgoing edges and $R = T^{t}$ otherwise, and where $#_{1}, #_{2}$ represent complex conjugations if the corresponding edges are of $G^{*}$ type. Thus, for example, the stability operator at a for $G_{xa}^{*} G_{ya}^{*}$ is $1 - diag ({\bar{m}}^{2}) T^{t}$ . Note that the stability operator at a vertex of degree two is exclusively determined by the type and orientation of the two G-edges adjacent to a. In the sequel the letter B will refer to the appropriate stability operator; we will not distinguish their 9 possibilities ( $R = S, T, T^{t}$ and $m^{#_{1}} m^{#_{2}} = {|m|}^{2}, m^{2}, {\bar{m}}^{2}$ ) in the notation.

Lemma 4.13

Let $c > 0$ be any constant, $Γ \in G$ be a single index graph with at most cp vertices and $c p^{2}$ edges and let $a \in V (Γ)$ be a vertex of degree $deg (a) = 2$ not adjacent to a $G$ -loop. The insertion of the stability operator B (4.30) at a as in (4.29) produces a finite set of graphs with at most one additional vertex and 6p additional edges, denoted by $G_{Γ}$ , such that

\begin{matrix} Val (Γ) = \sum_{Γ^{'} \in G_{Γ}} Val (Γ^{'}) + O (N^{- p}), \end{matrix}

and all of them have a Ward estimate

\begin{matrix} W - Est (Γ^{'}) \leq_{p} (ρ + ψ + η / ρ + ψ_{q}^{'} + ψ_{q}^{''}) W - Est (Γ) \leq_{p} σ_{q} W - Est (Γ), Γ^{'} \in G_{Γ} . \end{matrix}

Moreover all $σ$ -cells in $Γ$ , except possibly a $σ$ -cell adjacent to a, remain $σ$ -cells also in each $Γ^{'}$ .

Proof

As the proofs for all of the 9 cases of B-operators are almost identical we prove the lemma for the case (4.29) for definiteness. Now we compare the value of the graph (figure not reproduced)

with the graph in the lhs. of (4.29), i.e. when the stability operator B is attached to the vertex a. We remind the reader that the displayed graphs only show a certain subgraph of the whole graph. The goal is to show that $W - Est (Γ^{'}) \leq (ρ + ψ + η / ρ + ψ_{q}^{'} + ψ_{q}^{''}) W - Est (Γ)$ for each graph $Γ^{'}$ occurring on the rhs. of (4.29). The forthcoming reasoning is based on comparing the quantities $|V|$ , $|{GE}_{W}|$ , $|{GE}_{g - m}|$ and $\sum_{e \in IE} deg (e) / 2$ defining the Ward estimate $W - Est$ from (4.19b) of the graph $Γ$ and the various graphs $Γ^{'}$ occurring on the rhs. of (4.29).

We begin with the first graph and claim that Due to the double edge which identifies the x and a vertices it follows that $|V (Γ^{'})| = |V (Γ)| - 1$ . The degrees of all interaction edges remain unchanged when going from $Γ$ to $Γ^{'}$ . As the 2-degenerate set of Wardable edges ${GE}_{W} (Γ^{'})$ we choose ${GE}_{W} (Γ) \setminus N (a)$ , i.e. the 2-degenerate edge set in the original graph except for the edge-neighbourhood $N (a)$ of a, i.e. those edges adjacent to a. As a subgraph of $(V, {GE}_{W} (Γ))$ it follows that $(V \setminus \{a\}, {GE}_{W} (Γ^{'}))$ is again 2-degenerate. Thus $|{GE}_{W} (Γ)| \geq |{GE}_{W} (Γ^{'})| \geq |{GE}_{W} (Γ)| - 2$ and the claimed bound follows since $|{GE}_{g - m} (Γ^{'})| = |{GE}_{g - m} (Γ)|$ and
$\begin{matrix} \frac{W - Est (Γ^{'})}{W - Est (Γ)} = \frac{1}{N {(ψ + ψ_{q}^{'})}^{|{GE}_{W} (Γ)| - |{GE}_{W} (Γ^{'})|}} \leq \frac{1}{N ψ^{2}} . \end{matrix}$
Next, we consider the third and fourth graph and claim that Here there is one more vertex (corresponding to an additional summation index), $|V (Γ^{'})| = |V (Γ)| + 1$ , whose effect in (4.19b) is compensated by one additional interaction edge e of degree 2. Hence the N-exponent $n (Γ)$ remains unchanged. In the first graph we can simply choose ${GE}_{W} (Γ^{'}) = {GE}_{W} (Γ)$ , whereas in the second graph we choose ${GE}_{W} (Γ^{'}) = {GE}_{W} (Γ) \setminus \{(x, a), (a, y)\} \cup \{(x, b), (b, y)\}$ which is 2-degenerate as a subgraph of a 2-degenerate graph together with an additional vertex of degree 2. Thus in both cases we can choose ${GE}_{W} (Γ^{'})$ (if necessary, by removing excess edges from ${GE}_{W} (Γ^{'})$ again) in such a way that $|{GE}_{W} (Γ^{'})| = |{GE}_{W} (Γ)|$ but the number of $(g - m)$ -loops is increased by 1, i.e. $|{GE}_{g - m} (Γ^{'})| = |{GE}_{g - m} (Γ)| + 1$ .
Similarly, we claim for the fifth and sixth graph that There is one more vertex whose effect in (4.19b) is compensated by one more interaction edge of degree 2, whence the N-exponent remains unchanged. The number of Wardable edges can be increased by one by setting ${GE}_{W} (Γ^{'})$ to be a suitable subset of ${GE}_{W} (Γ) \setminus \{(x, a), (a, y)\} \cup \{(x, b), (a, b), (a, y)\}$ which is 2-degenerate as the subset of a 2-degenerate graph together with two vertices of degree 2. The number of $(g - m)$ -loops remains unchanged.
For the last graph in (4.29), i.e. where the derivative targets an outside edge, we claim that Here the argument on the lhs., $Γ^{'}$ , stands for a whole collection of graphs but we essentially only have to consider two types: The derivative edge either hits a G-edge or a $(g - m)$ -loop, i.e. which encodes the graphs as well as the corresponding transpositions (as in (4.27)). In both cases the N-size of $W - Est$ remains constant since the additional vertex is balanced by the additional degree two interaction edge. In both cases all four displayed edges can be included in ${GE}_{W} (Γ^{'})$ . So $|{GE}_{W}|$ can be increased by 1 in the first case and by 2 in the second case while the number of $(g - m)$ -loops remains constant in the first case and is decreased by 1 in the second case. The claim follows directly in the first case and from
$\begin{matrix} \frac{W - Est (Γ^{'})}{W - Est (Γ)} = \frac{{(ψ + ψ_{q}^{'})}^{2}}{ψ + ψ_{q}^{'} + ψ_{q}^{''}} \leq ψ + ψ_{q}^{'} + ψ_{q}^{''} \end{matrix}$
in the second case.
It remains to consider the second graph in the rhs. of (4.29) with the higher derivative edge. We claim that for each $k \geq 2$ it holds that We prove the claim by induction on k starting from $k = 2$ . For any $k \geq 2$ we write $\partial^{k} = \partial^{k - 1} \partial$ . For the action of the last derivative we distinguish three cases: (i) action on an edge adjacent to the derivative edge, (ii) action on a non-adjacent G-edge and (iii) an action on a non-adjacent $(g - m)$ -loop. Graphically this means
4.31
We ignored the case where the derivative acts on (a, y) since it is estimated identically to the first graph. We also neglected the possibility that the derivative acts on a g-loop, as this is estimated exactly as the last graph and the result is even better since no $(g - m)$ -loop is destroyed. After performing the last derivative in (4.31) we obtain the following graphs $Γ^{'}$
4.32
where we neglected the transposition of the third graph with u, v exchanged because this is equivalent with regard to the counting argument. First, we handle the second, third and fourth graphs in (4.32). In all these cases the set ${GE}_{W} (Γ^{'})$ is defined simply by adding all edges drawn in (4.32) to the set ${GE}_{W} (Γ) \setminus \{(x, a), (a, y)\}$ . The new set remains 2-degenerate since all these new edges are adjacent to vertices of degree 2. Compared to the original graph, $Γ$ , we thus have increased $|{GE}_{W}| + |{GE}_{g - m}|$ by at least 1.

We now continue with the first graph in (4.32), where we explicitly expand the action of another derivative (notice that this is the only graph where $k \geq 2$ is essentially used). We distinguish four cases, depending on whether the derivative acts on (i) the b-loop, (ii) an adjacent edge, (iii) a non-adjacent edge or (iv) a non-adjacent $(g - m)$ -loop, i.e. graphically we have
4.33
After performing the indicated derivative, the encoded graphs $Γ^{'}$ are
4.34
where we again neglected the version of the third graph with u, v exchanged. We note that both the first and the second graph in (4.33) produce the first graph in (4.34). Now we define how to get the set ${GE}_{W} (Γ^{'})$ from ${GE}_{W} (Γ) \setminus \{(x, a), (a, y)\}$ for each case. In the first graph of (4.34) we add all three non-loop edges to ${GE}_{W} (Γ^{'})$ , in the second graph we add both non-loop edges, in the third and fourth graph we add the non-looped edge adjacent to b as well as any two non-looped edges adjacent to a. Thus, compared to the original graph the number $|{GE}_{W}| + |{GE}_{g - m}|$ is at least preserved. On the other hand the N-power counting is improved by $N^{- 1 / 2}$ . Indeed, there is one additional vertex b, yielding a factor N, which is compensated by the scaling factor $N^{- 3 / 2}$ from the interaction edge of degree 3.

To conclude the inductive step we note that additional derivatives (i.e. the action of $\partial^{k - 2}$ ) can only decrease the Ward-value of a graph. Indeed, any single derivative can at most decrease the number $|{GE}_{W} (Γ)| + |{GE}_{g - m}|$ by 1 by either differentiating a $(g - m)$ -loop or differentiating an edge from ${GE}_{W}$ . Thus the number $|{GE}_{W}| + |{GE}_{g - m}|$ is decreased by at most $k - 2$ while the number $|{GE}_{g - m}|$ is not increased. In particular, by choosing a suitable subset of Wardable edges, we can define ${GE}_{W} (Γ^{'})$ in such a way that $|{GE}_{W}| + |{GE}_{g - m}|$ is decreased by exactly $k - 2$ . But at the same time each derivative provides a gain of $c N^{- 1 / 2} \leq ψ \leq ψ + ψ_{q}^{'}$ since the degree of the interaction edge is increased by one. Thus we have
$\begin{matrix} \frac{W - Est (Γ^{'})}{W - Est (Γ)} \leq_{p} {(ψ + ψ_{q}^{'})}^{k - 1 + |{GE}_{W} (Γ^{'})| + |{GE}_{g - m} (Γ^{'})| - |{GE}_{W} (Γ)| - |{GE}_{g - m} (Γ)|} = ψ + ψ_{q}^{'}, \end{matrix}$
just as claimed. $□$

Lemma 4.13 shows that the insertion of the B-operator reduces the Ward-estimate by at least $ρ$ . However, this insertion does not come for free since the inverse

\begin{matrix} B^{- 1} = {(1 - diag (m^{#_{1}} m^{#_{2}}) R)}^{- 1} \end{matrix}

is generally not a uniformly bounded operator. For example, it follows from (2.2) that

\begin{matrix} \Im m = η {|m|}^{2} + {|m|}^{2} S \Im m \end{matrix}

and therefore ${(1 - diag ({|m|}^{2}) S)}^{- 1}$ is singular for small $η$ with $\Im m$ being the unstable direction. It turns out, however, that B is invertible on the subspace complementary to some bad direction $b^{(B)}$ . At this point we distinguish two cases. If B has a uniformly bounded inverse, i.e. if $‖ B^{- 1} ‖_{\infty \to \infty} \leq C$ for some constant $C > 0$ , then we set $P_{B} : = 0$ . Otherwise we define $P_{B}$ as the spectral projection operator onto the eigenvector $b^{(B)}$ of B corresponding to the eigenvalue $β$ with smallest modulus:

\begin{matrix} P_{B} : = \frac{〈l^{(B)}, \cdot〉}{〈l^{(B)}, b^{(B)}〉} b^{(B)}, Q_{B} : = 1 - P_{B}, \end{matrix}

4.35

where $〈v, w〉 : = N^{- 1} \sum_{a} \bar{v_{a}} w_{a}$ denotes the normalized inner product and $l^{(B)}$ is the corresponding left eigenvector, $(B^{*} - β) l^{(B)} = 0$ .
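The instability of $B = 1 - diag ({|m|}^{2}) S$ for small $η$ can be made concrete in the simplest special case. The sketch below assumes the Wigner case $s_{ij} = 1 / N$ with $A = 0$ , which is an illustrative assumption and not the general setting of the paper. Then $m$ is the constant vector given by the semicircle solution of $- 1 / m = z + m$ , the constant vector is an eigenvector of $B$ with eigenvalue $1 - {|m|}^{2}$ , and taking the imaginary part of the equation gives $1 - {|m|}^{2} = η {|m|}^{2} / \Im m$ . The function names are hypothetical.

```python
import cmath


def m_semicircle(z):
    """Solution of -1/m = z + m with Im m > 0 (requires Im z > 0)."""
    root = cmath.sqrt(z * z - 4)
    m = (-z + root) / 2
    # The two roots have product 1, so exactly one has positive imaginary part.
    return m if m.imag > 0 else (-z - root) / 2


def stability_gap(E, eta):
    """Eigenvalue 1 - |m|^2 of B = 1 - |m|^2 S on the constant vector."""
    m = m_semicircle(complex(E, eta))
    return 1 - abs(m) ** 2


def identity_rhs(E, eta):
    """eta |m|^2 / Im m, equal to 1 - |m|^2 by the imaginary part of the equation."""
    m = m_semicircle(complex(E, eta))
    return eta * abs(m) ** 2 / m.imag


# The smallest eigenvalue of B shrinks linearly in eta inside the bulk.
for eta in (1e-1, 1e-3, 1e-5):
    print(eta, stability_gap(0.5, eta), identity_rhs(0.5, eta))
```

In this special case the eigenvalue of $B$ in the unstable direction is of order $η / ρ$ in the bulk, so $B^{- 1}$ is not uniformly bounded as $η \to 0$ , which is why the projection $P_{B}$ is split off.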

Lemma 4.14

For all 9 possible B-operators in (4.30) it holds that

\begin{matrix} ‖ B^{- 1} Q_{B} ‖_{\infty \to \infty} \leq C < \infty \end{matrix}

4.36

for some constant $C > 0$ , depending only on model parameters.

Proof

First we remark that it is sufficient to prove the bound (4.36) on $B^{- 1} Q_{B}$ as an operator on $C^{N}$ with the Euclidean norm, i.e. $‖ B^{- 1} Q_{B} ‖ \leq C$ . For this insight we refer to [5, Proof of (5.28) and (5.40a)]. Recall that $R = S$ , $R = T$ or $R = T^{t}$ , depending on which stability operator we consider (cf. (4.30)). We begin by considering the complex Hermitian symmetry class and the cases $R = T$ and $R = T^{t}$ . We will now see that in this case B has a bounded inverse and thus $Q_{B} = 1$ . Indeed, we have

\begin{matrix} ‖ B^{- 1} ‖ ≲ \frac{1}{1 - ‖ F^{(R)} ‖}, \end{matrix}

where $F^{(R)} w : = |m| R (|m| w)$ . The fullness Assumption (B) in (2.3) implies that $|t_{ij}| \leq (1 - c) s_{ij}$ for some constant $c > 0$ and thus $‖ F^{(R)} ‖ \leq (1 - c) ‖ F^{(S)} ‖ \leq 1 - c$ for $R = T, T^{t}$ . Here we used $‖ F^{(S)} ‖ \leq 1$ , a general property of the saturated self-energy matrix $F^{(S)}$ that was first established in [6, Lemma 4.3] (see also [7, Eq. (4.24)] and [10, Eq. (4.5)]). Now we turn to the case $R = S$ for both the real symmetric and complex Hermitian symmetry classes. In this case B is the restriction to diagonal matrices of an operator $\mathcal{T} : C^{N \times N} \to C^{N \times N}$ , where $\mathcal{T} \in \{Id - M^{*} S [\cdot] M, Id - M S [\cdot] M, Id - M^{*} S [\cdot] M^{*}\}$ . All of these operators were covered in [10, Lemma 5.1] and thus (4.36) is a consequence of that lemma. Recall that the flatness (3.6) of $S$ ensured the applicability of the lemma. $□$

We will insert the identity $1 = P_{B} + B B^{- 1} Q_{B}$ , and we will perform an explicit calculation for the $P_{B}$ component, while using the boundedness of $B^{- 1} Q_{B}$ in the other component. We are thus left with studying the effect of inserting B-operators and suitable projections into a $σ$ -cell. To include all possible cases with regard to edge-direction and edge-type (i.e. G or $G^{*}$ ), in the pictures below we neither indicate directions of the G-edges nor their type but implicitly allow all possible assignments. We recall that both the R-interaction edge as well as the relevant B-operators (cf. (4.30)) are completely determined by the type of the four G-edges as well as their directions. To record the type of the inserted B, $P_{B}$ , $Q_{B}$ operators we call those inserted on the rhs. of the R-edge $B^{'}$ , $P_{B}^{'}$ and $Q_{B}^{'}$ in the following graphical representations. Pictorially we first decompose the $σ$ -cell subgraph of some graph $Γ$ as

4.37

where we allow the vertices x, y to agree with z or w. With formulas, the insertion in (4.37) means the following identity

\begin{matrix} \sum_{ab} {(p f)}_{a} G_{ya} G_{xa} R_{ab} G_{bw} G_{bz} = \sum_{abc} {(p f)}_{c} G_{ya} G_{xa} (P_{ac} + Q_{ac}) R_{cb} G_{bw} G_{bz} \end{matrix}

since $P_{ac} + Q_{ac} = δ_{ac}$ . We first consider the second graph in (4.37), whose treatment is independent of the specific weights, so we already removed the weight information. We insert the B operator as

and notice that due to Lemma 4.14 the matrix $K = {(B^{- 1})}^{t} Q_{B}^{t} R$ , assigned to the weighted edge in the last graph, is entry-wise bounded, $|k_{ab}| \leq c N^{- 1}$ (the transpositions compensate for the opposite orientation of the participating edges). It follows from Lemma 4.13 that

4.38

where all $Γ^{'} \in G_{Γ}$ satisfy $W - Est (Γ^{'}) \leq_{p} σ_{q} W - Est (Γ)$ and all $σ$ -cells in $Γ$ except for the currently expanded one remain $σ$ -cells in $Γ^{'}$ . We note that it is legitimate to compare the Ward estimate of $Γ^{'}$ with that of $Γ$ because with respect to the Ward-estimate there is no difference between $Γ$ and the modification of $Γ$ in which the R-edge is replaced by a generic $N^{- 1}$ -weighted edge.

We now consider the first graph in (4.37) and repeat the process of inserting projections $P_{B}^{'} + Q_{B}^{'}$ to the other side of the R-edge to find

4.39

where we already neglected those weights which are of no importance to the bound. The argument for the second graph in (4.39) is identical to the one we used in (4.38) and we find another finite collection of graphs $G_{Γ}^{'}$ such that

4.40

where the weighted edge carries the weight matrix $K = P_{B}^{t} R Q_{B^{'}} B^{' - 1}$ , which, according to Lemma 4.14, indeed scales like $|k_{ab}| \leq c N^{- 1}$ . The graphs $Γ^{'} \in G_{Γ}^{'}$ also satisfy $W - Est (Γ^{'}) \leq_{p} σ_{q} W - Est (Γ)$ and all $σ$ -cells in $Γ$ except for the currently expanded one remain $σ$ -cells in $Γ^{'}$ .

It remains to consider the first graph in (4.39) in the situation where B does not have a bounded inverse. We compute the weight matrix of the $P_{B}^{t} R P_{B}^{'}$ interaction edge as

\begin{matrix} \begin{matrix} P_{B}^{t} diag (p f) R P_{B}^{'} & = (\frac{〈\bar{b^{(B)}}, \cdot〉}{〈\bar{b^{(B)}}, \bar{l^{(B)}}〉} \bar{l^{(B)}}) [diag (p f) R \frac{〈l^{(B^{'})}, \cdot〉}{〈l^{(B^{'})}, b^{(B^{'})}〉} b^{(B^{'})}] \\ & = \frac{〈b^{(B)} p f (R b^{(B^{'})})〉}{〈\bar{b^{(B)}}, \bar{l^{(B)}}〉} \frac{〈l^{(B^{'})}, \cdot〉 \bar{l^{(B)}}}{〈l^{(B^{'})}, b^{(B^{'})}〉} \end{matrix} \end{matrix}

which we separate into the scalar factor

\begin{matrix} \frac{〈b^{(B)} p f (R b^{(B^{'})})〉 〈l^{(B^{'})}, \bar{l^{(B)}}〉}{〈\bar{b^{(B)}}, \bar{l^{(B)}}〉 〈l^{(B^{'})}, b^{(B^{'})}〉} \end{matrix}

and the weighted edge

\begin{matrix} K = \frac{〈l^{(B^{'})}, \cdot〉 \bar{l^{(B)}}}{〈l^{(B^{'})}, \bar{l^{(B)}}〉} \end{matrix}

4.41

which scales like $|k_{ab}| \leq c N^{- 1}$ since $l$ is $ℓ^{2}$ -normalised and delocalised. Thus we can write

4.42

Note that the B and $B^{'}$ operators are not completely independent: According to Fact 1 it follows that for an interaction edge $e = (u, v)$ associated with the matrix R the number of incoming G-edges in u is the same as the number of outgoing G-edges from v, and vice versa. Thus, according to (4.30), the B-operator at u comes with an S if and only if the $B^{'}$ -operator at v also comes with an S. Furthermore, if the B-operator comes with a T, then the $B^{'}$ -operator comes with a $T^{t}$ , and vice versa. The distribution of the conjugation operators to $B, B^{'}$ in (4.30), however, can be arbitrary. We now use the fact that the scalar factor in (4.42) can be estimated by $|σ| + ρ + η / ρ$ (cf. Lemma A.2). Summarising the above arguments, from (4.37)–(4.42), the proof of Proposition 4.12 is complete.

Cusp Universality

The goal of this section is the proof of cusp universality in the sense of Theorem 2.3. Let H be the original Wigner-type random matrix with expectation $A : = E H$ and variance matrix $S = (s_{ij})$ with $s_{ij} : = E {|h_{ij} - a_{ij}|}^{2}$ and $T = (t_{ij})$ with $t_{ij} : = E {(h_{ij} - a_{ij})}^{2}$ . We consider the Ornstein–Uhlenbeck process $\{{\tilde{H}}_{t} | t \geq 0\}$ starting from ${\tilde{H}}_{0} = H$ , i.e.

\begin{matrix} d {\tilde{H}}_{t} = - \frac{1}{2} ({\tilde{H}}_{t} - A) d t + Σ^{1 / 2} [d B_{t}], Σ [R] : = E W Tr (W R) \end{matrix}

5.1

which preserves expectation and variance. In our setting of deformed Wigner-type matrices the covariance operator $Σ : C^{N \times N} \to C^{N \times N}$ is given by

\begin{matrix} Σ [R] : = S ⊙ R + T ⊙ R^{t} . \end{matrix}

The OU process effectively adds a small Gaussian component to ${\tilde{H}}_{t}$ along the flow in the sense that ${\tilde{H}}_{t} = A + e^{- t / 2} (H - A) + {\tilde{U}}_{t}$ in distribution with ${\tilde{U}}_{t}$ being an independent centred Gaussian matrix with covariance $Cov ({\tilde{U}}_{t}) = (1 - e^{- t}) Σ$ . Due to the fullness Assumption (B) there exist small $c, t_{*}$ such that ${\tilde{U}}_{t}$ can be decomposed as ${\tilde{U}}_{t} = \sqrt{ct} U + U_{t}^{'}$ with $U \sim GUE$ and $U_{t}^{'}$ Gaussian and independent of U for $t \leq t_{*}$ . Thus there exists a Wigner-type matrix $H_{t}$ such that

\begin{matrix} \begin{matrix} {\tilde{H}}_{t} & = H_{t} + \sqrt{ct} U, S_{t} = S - c t S^{GUE}, E H_{t} = A, \\ U & \sim GUE, S^{GUE} [R] : = 〈R〉 = \frac{1}{N} Tr R \end{matrix} \end{matrix}

5.2

with U independent of $H_{t}$ . Note that we do not define $H_{t}$ as a stochastic process and we will use the representation (5.2) only for one carefully chosen $t = N^{- 1 / 2 + ϵ}$ . We note that $H_{t}$ satisfies the assumption of our local law from Theorem 2.5. It thus follows that $G_{t} : = {(H_{t} - z)}^{- 1}$ is well approximated by the solution $M_{t} = diag (M_{t})$ to the MDE

\begin{matrix} - M_{t}^{- 1} = z - A + S_{t} [M_{t}], \qquad ρ_{t} (E) : = lim_{η ↘ 0} \frac{\Im 〈M_{t} (E + i η)〉}{π} . \end{matrix}

In particular, by setting $t = 0$ , $M_{0}$ well approximates the resolvent of the original matrix H and $ρ_{0} = ρ$ is its self-consistent density. Note that the Dyson equation of ${\tilde{H}}_{t}$ and hence its solution as well are independent of t, since they are entirely determined by the first and second moments of ${\tilde{H}}_{t}$ that are the same A and S for any t. Thus the resolvent of ${\tilde{H}}_{t}$ is well approximated by the same $M_{0}$ and the self-consistent density of ${\tilde{H}}_{t}$ is given by $ρ_{0} = ρ$ for any t. While H and ${\tilde{H}}_{t}$ have identical self-consistent data, structurally they differ in a key point: ${\tilde{H}}_{t}$ has a small Gaussian component. Thus the correlation kernel of the local eigenvalue statistics has a contour integral representation using a version of the Brézin–Hikami formulas, see Sect. 5.2.
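That the flow keeps the second moments fixed can be seen in a scalar analogue. The sketch below assumes a one-dimensional version $d X_{t} = - \frac{1}{2} (X_{t} - a) d t + σ d B_{t}$ of (5.1), which is a simplification made for illustration. By Itô's formula the variance $v (t)$ solves $v^{'} = - v + σ^{2}$ , so $v (t) = e^{- t} v (0) + (1 - e^{- t}) σ^{2}$ , and the independent Gaussian part carries the fraction $1 - e^{- t} \approx t$ of the variance. All function names are hypothetical.

```python
import math


def ou_variance(t, v0, sigma2):
    """Variance at time t of the scalar flow dX = -(X - a)/2 dt + sqrt(sigma2) dB."""
    return math.exp(-t) * v0 + (1.0 - math.exp(-t)) * sigma2


def gaussian_fraction(t):
    """Share of the stationary variance carried by the independent Gaussian part."""
    return 1.0 - math.exp(-t)


def ou_variance_euler(t, v0, sigma2, steps=100_000):
    """Explicit Euler solution of v' = -v + sigma2, as an independent cross-check."""
    dt, v = t / steps, v0
    for _ in range(steps):
        v += dt * (sigma2 - v)
    return v


# Starting at the stationary variance, the variance does not move.
print(ou_variance(0.7, 2.5, 2.5))
# For short times the Gaussian share is approximately t.
print(gaussian_fraction(1e-3))
```

When the initial variance equals $σ^{2}$ the variance is constant in time, which is the scalar counterpart of the statement that ${\tilde{H}}_{t}$ has the same $A$ and $S$ for every $t$ .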

The contour integration analysis requires a Gaussian component of size at least $c t ≫ N^{- 1 / 2}$ and a very precise description of the eigenvalues of $H_{t}$ just above the scale of the eigenvalue spacing. This information will come from the optimal rigidity, Corollary 2.6, and the precise shape of the self-consistent density of states of $H_{t}$ . The latter will be analysed in Sect. 5.1 where we describe the evolution of the density near the cusp under an additive GUE perturbation $\sqrt{s} U$ . We need to construct $H_{t}$ with a small gap carefully so that after a relatively long time $s = c t$ the matrix $H_{t} + \sqrt{ct} U$ develops a cusp exactly at the right location. In fact, the process has two scales in the shifted variable $ν = s - c t$ that indicates the time relative to the cusp formation. It turns out that the locations of the edges typically move linearly with $ν$ , while the length of the gap itself scales like ${(- ν)}_{+}^{3 / 2}$ , i.e. it varies much slower and we need to fine-tune the evolution of both.

To understand this tuning process, we fix $t = N^{- 1 / 2 + ϵ}$ and we consider the matrix flow $s \to H_{t} (s) : = H_{t} + \sqrt{s} U$ for any $s \geq 0$ and not just for $s = c t$ . It is well known that the corresponding self-consistent densities are given by the semicircular flow. Equivalently, these densities can be described by the free convolution of $ρ_{t}$ with a scaled semicircular distribution $ρ_{sc}$ . In short, the self-consistent density of $H_{t} (s)$ is given by $ρ_{s}^{fc} : = ρ_{t} ⊞ \sqrt{s} ρ_{sc}$ , where we omitted t from the notation $ρ_{s}^{fc}$ since we consider t fixed. In particular we have $ρ_{0}^{fc} = ρ_{t}$ , the density of $H_{t}$ and $ρ_{ct}^{fc} = ρ$ , the density of ${\tilde{H}}_{t} = H_{t} + \sqrt{ct} U$ as well as that of H. Hence, as a preparation to the contour integration, in Sect. 5.1 we need to describe the cusp formation along the semicircular flow. Before going into details, we describe the strategy.
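On the level of Stieltjes transforms, the free convolution with a semicircular distribution of variance $s$ is commonly expressed through the standard relation $m_{s} (z) = m_{0} (z + s m_{s} (z))$ . The following sketch checks this relation in the one case where everything is explicit, namely when the initial density is itself a standard semicircle, so that the evolved density is a semicircle of variance $1 + s$ . This starting density is an assumption chosen for illustration; in the text $ρ_{t}$ is a general self-consistent density. The function names are hypothetical.

```python
import cmath


def stieltjes_semicircle(z, variance=1.0):
    """Stieltjes transform of the semicircle law with the given variance (Im z > 0).

    It is the root of variance * m^2 + z * m + 1 = 0 with positive imaginary part.
    """
    root = cmath.sqrt(z * z - 4 * variance)
    m = (-z + root) / (2 * variance)
    return m if m.imag > 0 else (-z - root) / (2 * variance)


def flow_residual(z, s):
    """|m_s(z) - m_0(z + s m_s(z))| for m_0 the standard semicircle transform."""
    m_s = stieltjes_semicircle(z, 1.0 + s)
    return abs(m_s - stieltjes_semicircle(z + s * m_s, 1.0))


# The residual vanishes up to rounding for any z in the upper half plane.
for z in (0.3 + 0.05j, -1.2 + 0.5j, 2.5 + 0.01j):
    print(z, flow_residual(z, 0.4))
```

The same fixed point relation is what one would solve numerically for a general initial density, where no closed formula for $m_{s}$ is available.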

Since in the sequel the densities $ρ_{s}^{fc}$ and their local minima and gaps will play an important role, we introduce the convention that properties of the original density $ρ$ will always carry $ρ$ as a superscript for the remainder of Sect. 5. In particular, the points $c, e_{\pm}, m$ and the gap size $Δ$ from (2.4) and Theorem 2.3 will from now on be denoted by $c^{ρ}, e_{\pm}^{ρ}, m^{ρ}$ and $Δ^{ρ}$ . In particular a superscript of $ρ$ never denotes a power.

Proof strategy

First we consider case (i) when $ρ$ , the self-consistent density associated with H, has an exact cusp at the point $c^{ρ} \in R$ . Note that $c^{ρ}$ is also a cusp point of the self-consistent density of ${\tilde{H}}_{t}$ for any t.

We set $t : = N^{- 1 / 2 + ϵ}$ . Define the functions

\begin{matrix} Δ (ν) : = {(2 γ)}^{2} {(ν / 3)}^{3 / 2} \quad \text{and} \quad ρ^{min} (ν) : = γ^{2} \sqrt{ν} / π \end{matrix}

for any $ν \geq 0$ . For $s < c t$ denote the gap in the support of $ρ_{s}^{fc}$ close to $c^{ρ}$ by $[e_{s}^{-}, e_{s}^{+}]$ and its length by $Δ_{s} : = e_{s}^{+} - e_{s}^{-}$ . In Sect. 5.1 we will prove that if $ρ$ has an exact cusp in $c^{ρ}$ as in (2.4a), then $ρ_{s}^{fc}$ has a gap of size $Δ_{s} \approx Δ (c t - s)$ , and, in particular, $ρ_{t} = ρ_{0}^{fc}$ has a gap of size $Δ_{0} \approx Δ (c t) \sim t^{3 / 2}$ , only depending on c, t and $γ$ . The distance of $c^{ρ}$ from the gap is $\approx const \cdot t$ . This overall shift will be relatively easy to handle, but notice that it must be tracked very precisely since the gap changes much slower than its location. For $s > c t$ with $s - c t = O (1)$ we will similarly prove that $ρ_{s}^{fc}$ has no gap anymore close to $c^{ρ}$ but a unique local minimum in $m_{s}$ of size $ρ_{s}^{fc} (m_{s}) \approx ρ^{min} (s - c t)$ .

Now we consider the case where $ρ$ has no exact cusp but a small gap of size $Δ^{ρ} > 0$ . We parametrize this gap length via a parameter $t^{ρ} > 0$ defined by $Δ^{ρ} = Δ (t^{ρ})$ . It follows from the associativity (5.3b) of the free convolution that $ρ_{t}$ has a gap of size $Δ_{0} \approx Δ (c t + t^{ρ})$ .

Finally, the third case is where $ρ$ has a local minimum of size $ρ (m^{ρ})$ . We parametrize it as $ρ (m^{ρ}) = ρ^{min} (t^{ρ})$ with $0 < t^{ρ} < c t$ ; it then follows that $ρ_{t}$ has a gap of size $Δ_{0} \approx Δ (c t - t^{ρ})$ .

Note that these conclusions follow purely from the considerations in Sect. 5.1 for exact cusps and the associativity of the free convolution. We note that in both almost cusp cases $t^{ρ}$ should be interpreted as a time (or reverse time) to the cusp formation.
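The parametrization of the three cases by a time $t^{ρ}$ can be made concrete with a minimal sketch. It only evaluates the leading-order functions $Δ (ν)$ and $ρ^{min} (ν)$ defined above and their inverses; all function names are hypothetical and the multiplicative error factors are ignored.

```python
import math

def gap_size(nu, gamma):
    """Leading-order gap length Delta(nu) = (2*gamma)^2 * (nu/3)^(3/2)."""
    return (2.0 * gamma) ** 2 * (nu / 3.0) ** 1.5

def rho_min(nu, gamma):
    """Leading-order local minimum rho^min(nu) = gamma^2 * sqrt(nu) / pi."""
    return gamma ** 2 * math.sqrt(nu) / math.pi

def time_from_gap(delta, gamma):
    """Solve Delta(t) = delta for t (the 'time to the cusp')."""
    return 3.0 * delta ** (2.0 / 3.0) / (2.0 * gamma) ** (4.0 / 3.0)

def time_from_min(rho, gamma):
    """Solve rho^min(t) = rho for t (the 'time since the cusp')."""
    return (math.pi * rho) ** 2 / gamma ** 4

def initial_gap(ct, gamma, gap=None, minimum=None):
    """Leading-order gap Delta_0 of rho_t in the three cases of the strategy.

    gap=None, minimum=None : exact cusp, Delta_0 ~ Delta(ct)
    gap given              : small gap,  Delta_0 ~ Delta(ct + t_rho)
    minimum given          : small minimum, Delta_0 ~ Delta(ct - t_rho)
    """
    if gap is not None:
        return gap_size(ct + time_from_gap(gap, gamma), gamma)
    if minimum is not None:
        return gap_size(ct - time_from_min(minimum, gamma), gamma)
    return gap_size(ct, gamma)
```

For fixed $c t$ and $γ$ the three cases are ordered as one would expect: a density with a residual gap requires a larger initial gap than an exact cusp, which in turn requires a larger initial gap than a density with a small local minimum.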

In the final part of the proof in Sects. 5.2–5.3 we will write the correlation kernel of $H_{t} + \sqrt{ct} U$ as a contour integral purely in terms of the mesoscopic shape parameter $γ$ and the gap size $Δ_{0}$ of the density $ρ_{t}$ associated with $H_{t}$ . If $Δ_{0} \approx Δ (c t)$ , then the gap closes after time $s \approx c t$ and we obtain a Pearcey kernel with parameter $α = 0$ . If $Δ_{0} \approx Δ (c t + t^{ρ})$ and $t^{ρ} \sim N^{- 1 / 2}$ , then the gap does not quite close at time $s = c t$ and we obtain a Pearcey kernel with $α > 0$ , while for $Δ_{0} \approx Δ (c t - t^{ρ})$ with $t^{ρ} \sim N^{- 1 / 2}$ the gap after time $s = c t$ is transformed into a tiny local minimum and we obtain a Pearcey kernel with $α < 0$ . The precise value of $α$ in terms of $Δ^{ρ}$ and $ρ (m^{ρ})$ is given in (2.6). Note that as an input to the contour integral analysis, in all three cases we use the local law only for $H_{t}$ , i.e. in a situation when there is a small gap in the support of $ρ_{t}$ , given by $Δ_{0}$ defined as above in each case.

Free convolution near the cusp

In this section we quantitatively investigate the free semicircular flow before and after the formation of the cusp. We first establish the exact rate at which a gap closes to form a cusp, and the rate at which the cusp is transformed into a non-zero local minimum. We now suppose that $ρ^{*}$ is a general density with a small spectral gap $[e_{-}^{*}, e_{+}^{*}]$ whose Stieltjes transform $m^{*}$ can be obtained from solving a Dyson equation. Let $ρ_{sc} (x) : = \sqrt{{(4 - x^{2})}_{+}} / (2 π)$ be the density of the semicircular distribution and let $s \geq 0$ be a time parameter. The free semicircular convolution $ρ_{s}^{fc}$ of $ρ^{*}$ with $\sqrt{s} ρ_{sc}$ is then defined implicitly via its Stieltjes transform

\begin{matrix} m_{s}^{fc} (z) = m^{*} (ξ_{s} (z)) = m^{*} (z + s m_{s}^{fc} (z)), ξ_{s} (z) : = z + s m_{s}^{fc} (z), z, m_{s}^{fc} (z) \in H . \end{matrix}

5.3a

It follows directly from the definition that $s \mapsto m_{s}^{fc}$ is associative in the sense that

\begin{matrix} m_{s + s^{'}}^{fc} (z) = m_{s} (z + s^{'} m_{s + s^{'}}^{fc} (z)), s, s^{'} \geq 0 . \end{matrix}

5.3b
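As a sanity check of (5.3a) and (5.3b), the following sketch uses the simplest explicit example: if the initial density is itself a semicircle, the flow only increases its variance linearly in $s$, so both identities can be tested in closed form. This toy example has no spectral gap and is not the setting of Lemma 5.1; the function names and the test point are illustrative assumptions.

```python
import cmath

def m_semicircle(z, variance):
    """Stieltjes transform of the semicircle law with the given variance.

    It is the root of variance*m^2 + z*m + 1 = 0 with Im m > 0 for Im z > 0.
    """
    root = cmath.sqrt(z * z - 4.0 * variance)
    m = (-z + root) / (2.0 * variance)
    if m.imag <= 0:
        m = (-z - root) / (2.0 * variance)
    return m

def m_fc(z, s, variance0=1.0):
    """Free semicircular flow started from a semicircle of variance `variance0`."""
    return m_semicircle(z, variance0 + s)

z, s, s_prime = 0.3 + 0.2j, 0.5, 0.25

# (5.3a): m_s(z) = m_*(z + s * m_s(z))
m_s = m_fc(z, s)
defect_53a = abs(m_semicircle(z + s * m_s, 1.0) - m_s)

# (5.3b): m_{s+s'}(z) = m_s(z + s' * m_{s+s'}(z))
m_total = m_fc(z, s + s_prime)
defect_53b = abs(m_fc(z + s_prime * m_total, s) - m_total)
```

Both defects vanish up to rounding, since the shifted argument stays in the upper half-plane where the root with positive imaginary part is unique.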

Figure 1a illustrates the quantities in the following lemma. We state the lemma for scDOSs from arbitrary data pairs $(A_{*}, S_{*})$ satisfying the conditions in [10], i.e.

\begin{matrix} ‖ A_{*} ‖ \leq C, c 〈R〉 \leq S_{*} [R] \leq C 〈R〉 \end{matrix}

5.4

for any self-adjoint $R = R^{*}$ and some constants $c, C > 0$ .

Fig. 1 — (a) illustrates the evolution of $ρ_{s}^{fc}$ along the semicircular flow at two times $0 < s < t_{*} < s^{'}$ before and after the cusp. We recall that $ρ^{*} = ρ_{0}^{fc}$ and $ρ = ρ_{t_{*}}^{fc}$ . (b) shows the points $ξ_{s} (e_{s}^{\pm})$ as well as their distances to the edges $e_{0}^{\pm}$

Lemma 5.1

Let $ρ^{*}$ be the density of a Stieltjes transform $m^{*} = 〈M_{*}〉$ associated with some Dyson equation

\begin{matrix} - 1 = (z - A_{*} + S_{*} [M_{*}]) M_{*}, \end{matrix}

with $(A_{*}, S_{*})$ satisfying (5.4). Then there exists a small constant c, depending only on the constants in Assumptions (5.4) such that the following statements hold true. Suppose that $ρ^{*}$ has an initial gap $[e_{-}^{*}, e_{+}^{*}]$ of size $Δ^{*} = e_{+}^{*} - e_{-}^{*} \leq c$ . Then there exists some critical time $t_{*} ≲ {(Δ^{*})}^{2 / 3}$ such that $m_{t_{*}}^{fc}$ has exactly one exact cusp in some point $c^{*}$ with Inline graphic , and that $ρ_{t_{*}}^{fc}$ is locally around $c^{*}$ given by (2.4a) for some $γ > 0$ . Considering the time evolution $[0, 2 t_{*}] ∋ s \mapsto m_{s}^{fc}$ we then have the following asymptotics.

(i)
After the cusp. For $t_{*} < s \leq 2 t_{*}$ , $ρ_{s}^{fc}$ has a unique non-zero local minimum in some point $m_{s}$ such that
$\begin{matrix} ρ_{s}^{fc} (m_{s}) = & \frac{\sqrt{s - t_{*}} γ^{2}}{π} [1 + O ({(s - t_{*})}^{1 / 2})], \\ |m_{s} - c^{*} + (s - t_{*}) \Re m_{s}^{fc} (m_{s})| ≲ {(s - t_{*})}^{3 / 2 + 1 / 4} . \end{matrix}$ 5.5a
Furthermore, $m_{s}$ can approximately be found by solving a simple equation, namely there exists ${\tilde{m}}_{s}$ such that
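$\begin{matrix} {\tilde{m}}_{s} - c^{*} + (s - t_{*}) \Re m_{s}^{fc} ({\tilde{m}}_{s}) = 0, \quad |m_{s} - {\tilde{m}}_{s}| ≲ {(s - t_{*})}^{7 / 4}, \quad \Im m_{s}^{fc} ({\tilde{m}}_{s}) \sim {(s - t_{*})}^{1 / 2} . \end{matrix}$ 5.5b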
(ii)
Before the cusp. For $0 \leq s < t_{*}$ , the support of $ρ_{s}^{fc}$ has a spectral gap $[e_{s}^{-}, e_{s}^{+}]$ of size $Δ_{s} : = e_{s}^{+} - e_{s}^{-}$ near $c^{*}$ which satisfies
$\begin{matrix} Δ_{s} = {(2 γ)}^{2} (\frac{t_{*} - s}{3})^{3 / 2} [1 + O ({(t_{*} - s)}^{1 / 3})] . \end{matrix}$ 5.5c
In particular we find that the initial gap $Δ^{*} = Δ_{0}$ is related to $t_{*}$ via $Δ^{*} = {(2 γ)}^{2} {(t_{*} / 3)}^{3 / 2} [1 + O (t_{*}^{1 / 3})]$ .

Proof

Within the proof of the lemma we rely on the extensive shape analysis from [10]. We are doing so not only for the density $ρ^{*} = ρ_{0}^{fc}$ and its Stieltjes transform, but also for $ρ_{s}^{fc}$ and its Stieltjes transform $m_{s}^{fc}$ for $0 \leq s \leq 2 t_{*}$ . The results from [10] also apply here since $m_{s}^{fc} (z) = 〈M_{*} (ξ_{s} (z))〉$ can also be realized as the solution

\begin{matrix} - M_{*} {(ξ_{s} (z))}^{- 1} = & z + s 〈M_{*} (ξ_{s} (z))〉 - A_{*} + S_{*} [M_{*} (ξ_{s} (z))] \\ = & z - A_{*} + (S_{*} + s S^{GUE}) [M_{*} (ξ_{s} (z))] \end{matrix}

to the Dyson equation with perturbed self-energy $S_{*} + s S^{GUE}$ . Since $t_{*} ≲ 1$ it follows that the shape analysis from [10] also applies to $ρ_{s}^{fc}$ for any $s \in [0, 2 t_{*}]$ .

We begin with part (i). Set $ν : = s - t_{*}$ , then for $0 \leq ν \leq t_{*}$ we want to find $x_{ν}$ such that $\Im m_{s}^{fc}$ has a local minimum in $m_{s} : = c^{*} + x_{ν}$ near $c^{*}$ , i.e.

\begin{matrix} x_{ν} : = {\arg \min}_{x} \Im m_{s}^{fc} (c^{*} + x), \quad |x_{ν}| ≲ ν . \end{matrix}

First we show that $x_{ν}$ with these properties exists and is unique by using the extensive shape analysis in [10]. Uniqueness directly follows from [10, Theorem 7.2(ii)]. For the existence, we set

\begin{matrix} a_{ν} (x) : = \Im m_{s}^{fc} (c^{*} + x), \quad b_{ν} (x) : = \Re m_{s}^{fc} (c^{*} + x), \quad a_{ν} : = a_{ν} (x_{ν}), \quad b_{ν} : = b_{ν} (x_{ν}) . \end{matrix}

Set $δ : = K ν$ with a large constant K. Since $a_{0} (x) = \Im m_{t_{*}}^{fc} (c^{*} + x) \sim {|x|}^{1 / 3}$ , we have $a_{0} (\pm δ) \sim δ^{1 / 3}$ and $a_{0} (0) = 0$ . Recall from [10, Proposition 10.1(a)] that the map $s \mapsto m_{s}^{fc}$ is 1/3-Hölder continuous. It then follows that $a_{ν} (\pm δ) \sim δ^{1 / 3} + O (ν^{1 / 3})$ , while $a_{ν} (0) ≲ ν^{1 / 3}$ . Thus $a_{ν}$ necessarily has a local minimum in $(- δ, δ)$ if K is sufficiently large. This shows the existence of a local minimum with $|x_{ν}| ≲ K ν \sim ν$ .

We now study the function $f_{ν} (x) = x + ν b_{ν} (x)$ in a small neighbourhood around 0. From [10, Eqs. (7.62),(5.43)–(5.45)] it follows that

\begin{matrix} \begin{matrix} b_{ν}^{'} (x) & = \Re \frac{c_{1} (x) + O (a_{ν} (x))}{- i c_{2} (x) a_{ν} (x) + a_{ν} {(x)}^{2} + O (a_{ν} {(x)}^{3})} + O (1) \\ & = \frac{c_{1} (x)}{c_{2} {(x)}^{2} + a_{ν} {(x)}^{2}} + O (\frac{1}{c_{2} (x) + a_{ν} (x)}) \end{matrix} \end{matrix}

5.6

whenever $a_{ν} (x) ≪ 1$ , with appropriate real functions$^{3}$ $c_{1} (x) \sim 1$ and $c_{2} (x) \geq 0$ . Moreover, $|c_{2} (0)| ≪ 1$ since $c^{*}$ is an almost cusp point for $m_{s}^{fc}$ for any $s \in [0, 2 t_{*}]$ . Thus it follows that $b_{ν}^{'} (x) > 0$ whenever $a_{ν} (x) + c_{2} (x) ≪ 1$ . Due to the 1/3-Hölder continuity$^{4}$ of both $a_{ν} (x)$ and $c_{2} (x)$ and $a_{ν} (0) + |c_{2} (0)| ≪ 1$ , it follows that $b_{ν}^{'} (x) > 0$ whenever $|x| ≪ 1$ . We can thus conclude that $f_{ν}$ satisfies $f_{ν}^{'} \geq 1$ in some $O (1)$ -neighbourhood of 0. As $|f_{ν} (0)| ≲ ν$ we can conclude that there exists a root ${\tilde{x}}_{ν}$ , $f_{ν} ({\tilde{x}}_{ν}) = 0$ of size $|{\tilde{x}}_{ν}| ≲ ν$ . With ${\tilde{m}}_{s} : = c^{*} + {\tilde{x}}_{ν}$ we have thus shown the first equality in (5.5b).

Using (2.4a), we now expand the defining equation

\begin{matrix} a_{ν} (x) = \Im m_{t_{*}}^{fc} (c^{*} + x + ν b_{ν} (x) + i ν a_{ν} (x)) \end{matrix}

for the free convolution in the regime for those x sufficiently close to ${\tilde{x}}_{ν}$ such that Inline graphic to find

\begin{matrix} \begin{matrix} a_{ν} (x) & = \frac{\sqrt{3} γ^{4 / 3}}{2 π} ν a_{ν} (x) \int_{R} \frac{{|λ|}^{1 / 3} + O ({|λ|}^{2 / 3})}{{(λ - x - ν b_{ν} (x))}^{2} + {(ν a_{ν} (x))}^{2}} d λ \\ = \frac{\sqrt{3} γ^{4 / 3}}{2 π} \int_{R} \frac{{(ν a_{ν} (x))}^{1 / 3} {|λ|}^{1 / 3}}{{(λ - [x + ν b_{ν} (x)] / ν a_{ν} (x))}^{2} + 1} d λ + O ({(ν a_{ν} (x))}^{2 / 3}) \\ = {(ν a_{ν} (x))}^{1 / 3} γ^{4 / 3} [1 + \frac{1}{9} {(\frac{x + ν b_{ν} (x)}{ν a_{ν} (x)})}^{2} + O ({(\frac{x + ν b_{ν} (x)}{ν a_{ν} (x)})}^{4} + {(ν a_{ν} (x))}^{1 / 3})], \end{matrix} \end{matrix}

i.e.

\begin{matrix} a_{ν} (x) = ν^{1 / 2} γ^{2} {[1 + \frac{1}{9} {(\frac{x + ν b_{ν} (x)}{ν a_{ν} (x)})}^{2} + O ({(\frac{x + ν b_{ν} (x)}{ν a_{ν} (x)})}^{4} + {(ν a_{ν} (x))}^{1 / 3})]}^{3 / 2} . \end{matrix}

5.7

Note that (5.7) implies that $ν a_{ν} ({\tilde{x}}_{ν}) \sim ν^{3 / 2}$ , i.e. the last claim in (5.5b). We now pick some large K and note that from (5.7) it follows that $a_{ν} ({\tilde{x}}_{ν} \pm K ν^{7 / 4}) > a_{ν} ({\tilde{x}}_{ν})$ . Thus the interval $[{\tilde{x}}_{ν} - K ν^{7 / 4}, {\tilde{x}}_{ν} + K ν^{7 / 4}]$ contains a local minimum of $a_{ν} (x)$ , but by the uniqueness this must then be $x_{ν}$ . We thus have $|x_{ν} - {\tilde{x}}_{ν}| ≲ ν^{7 / 4}$ , proving the second claim in (5.5b). By 1/3-Hölder continuity of $a_{ν} (x)$ and by $a_{ν} ({\tilde{x}}_{ν}) \sim ν^{1 / 2}$ from (5.7), we conclude that $a_{ν} = a_{ν} (x_{ν}) \sim ν^{1 / 2}$ as well. Using that ${\tilde{x}}_{ν} + ν b_{ν} ({\tilde{x}}_{ν}) = 0$ and $b_{ν}^{'} ≲ 1 / ν$ from (5.6) and $a_{ν} (x) ≳ \sqrt{ν}$ , we conclude that $|x_{ν} + ν b_{ν} (x_{ν})| ≲ ν^{7 / 4}$ , i.e. the second claim in (5.5a). Plugging this information back into (5.7), we thus find $a_{ν} = γ^{2} \sqrt{ν} (1 + O (ν^{1 / 2}))$ and have also proven the first claim in (5.5a).

We now turn to part (ii). It follows from the analysis in [10] that $ρ_{s}^{fc}$ exhibits either a small gap, a cusp or a small local minimum close to $c^{*}$ . It follows from (i) that a cusp is transformed into a local minimum, and a local minimum cannot be transformed into a cusp along the semicircular flow. Therefore it follows that the support of $ρ_{s}^{fc}$ has a gap of size $Δ_{s} = e_{s}^{+} - e_{s}^{-}$ between the edges $e_{s}^{\pm}$ . Evidently $e_{t_{*}}^{-} = e_{t_{*}}^{+} = c^{*}$ , $e_{0}^{+} - e_{0}^{-} = Δ_{0}$ , $e_{0}^{\pm} = e_{\pm}^{*}$ and for $s > 0$ we differentiate (5.3a) to obtain

\begin{matrix} \frac{{(m_{s}^{fc})}^{'} (z)}{1 + s {(m_{s}^{fc})}^{'} (z)} = m_{*}^{'} (z + s m_{s}^{fc} (z)) \quad \text{and conclude} \quad m_{*}^{'} (ξ_{s} (e_{s}^{\pm})) = 1 / s \end{matrix}

5.8

by considering the $z \to e_{s}^{\pm}$ limit and the fact that $ρ_{s}^{fc}$ has a square-root edge there (for $s < t_{*}$ ), hence ${(m_{s}^{fc})}^{'}$ blows up at this point. Denoting the $d / d s$ derivative by a dot, from

\begin{matrix} \frac{d}{d s} m_{s}^{fc} (e_{s}^{\pm}) = m_{*}^{'} (ξ_{s} (e_{s}^{\pm})) ({\dot{e}}_{s}^{\pm} + m_{s}^{fc} (e_{s}^{\pm}) + s \frac{d}{d s} m_{s}^{fc} (e_{s}^{\pm})) = \frac{{\dot{e}}_{s}^{\pm} + m_{s}^{fc} (e_{s}^{\pm})}{s} + \frac{d}{d s} m_{s}^{fc} (e_{s}^{\pm}) \end{matrix}

we can thus conclude that ${\dot{e}}_{s}^{\pm} = - m_{s}^{fc} (e_{s}^{\pm})$ . This implies that the gap as a whole moves with linear speed (for non-zero $m_{s}^{fc} (e_{s}^{\pm})$ ), and, in particular, the distance of the gap of $ρ^{*}$ to $c^{*}$ is an order of magnitude larger than the size of the gap. It follows that the size $Δ_{s} : = e_{s}^{+} - e_{s}^{-}$ of the gap of $ρ_{s}^{fc}$ satisfies

\begin{matrix} {\dot{Δ}}_{s} = m_{s}^{fc} (e_{s}^{-}) - m_{s}^{fc} (e_{s}^{+}) = \int_{R} [\frac{1}{x - e_{s}^{-}} - \frac{1}{x - e_{s}^{+}}] ρ_{s}^{fc} (x) d x = - Δ_{s} \int_{R} \frac{ρ_{s}^{fc} (x)}{(x - e_{s}^{-}) (x - e_{s}^{+})} d x . \end{matrix}

We now use the precise shape of $ρ_{s}^{fc}$ close to $e_{s}^{\pm}$ according to (2.4b) which is given by

\begin{matrix} ρ_{s}^{fc} (e_{s}^{\pm} \pm x) = \frac{\sqrt{3} {(2 γ)}^{4 / 3} Δ_{s}^{1 / 3}}{2 π} [(1 + O ({(t_{*} - s)}^{1 / 3})) Ψ_{edge} (x / Δ_{s}) + O (Δ_{s}^{1 / 3} Ψ_{edge}^{2} (x / Δ_{s}))], \end{matrix}

5.9

where $Ψ_{edge}$ defined in (2.4c) exhibits the limiting behaviour

\begin{matrix} lim_{Δ \to 0} Δ^{1 / 3} Ψ_{edge} (x / Δ) = {|x|}^{1 / 3} / 2^{4 / 3} . \end{matrix}

Using (5.9), we compute

\begin{matrix} \begin{matrix} {\dot{Δ}}_{s} & = - (1 + O ({(t_{*} - s)}^{1 / 3})) \frac{\sqrt{3} {(2 γ)}^{4 / 3} Δ_{s}^{1 / 3}}{π} \int_{0}^{\infty} \frac{Ψ_{edge} (x)}{x (1 + x)} d x \\ & = - γ^{4 / 3} {(2 Δ_{s})}^{1 / 3} [1 + O ({(t_{*} - s)}^{1 / 3} + Δ_{s}^{1 / 3})], \end{matrix} \end{matrix}

5.10

where the $(1 + O ({(t_{*} - s)}^{1 / 3}))$ factor in (5.9) encapsulates two error terms; both are due to the fact that the shape factor $γ_{s}$ of $ρ_{s}^{fc}$ from (2.4b) is not exactly the same as $γ$ , i.e. the one for $s = t_{*}$ . To track this error in $γ$ we go back to [10]. First, $|σ|$ in [10, Eq. (7.5a)] is of size ${(t_{*} - s)}^{1 / 3}$ by the fact that $σ$ vanishes at $s = t_{*}$ and is 1/3-Hölder continuous according to [10, Lemma 10.5]. Secondly, according to [10, Lemma 10.5] the shape factor $Γ$ (which is directly related to $γ$ in the present context) is also 1/3-Hölder continuous and therefore we know that the shape factors of $ρ^{*}$ at $e_{0}^{\pm}$ are at most multiplicatively perturbed by a factor of $(1 + O ({(t_{*} - s)}^{1 / 3}))$ . By solving the differential equation (5.10) with the initial condition $Δ_{t_{*}} = 0$ , the claim (5.5c) follows. $□$
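The last step can be checked at leading order: dropping the error factors, (5.10) reads $\dot{Δ}_{s} = - γ^{4 / 3} {(2 Δ_{s})}^{1 / 3}$ , and the leading term of (5.5c) should satisfy it. The sketch below compares a central finite difference of the closed form with the right-hand side; the parameter values and function names are arbitrary illustrative choices. The closed form is tested directly rather than integrating from $Δ_{t_{*}} = 0$ , because the right-hand side is not Lipschitz at zero and $Δ \equiv 0$ is also a solution there.

```python
def gap(s, t_star, gamma):
    """Leading term of (5.5c): (2*gamma)^2 * ((t_star - s)/3)^(3/2)."""
    return (2.0 * gamma) ** 2 * ((t_star - s) / 3.0) ** 1.5

def flow_rhs(delta, gamma):
    """Leading term of the right-hand side of (5.10)."""
    return -(gamma ** (4.0 / 3.0)) * (2.0 * delta) ** (1.0 / 3.0)

def max_residual(t_star=0.2, gamma=1.3, samples=50, h=1e-6):
    """Largest mismatch between d(gap)/ds and flow_rhs on a grid inside (0, t_star)."""
    worst = 0.0
    for k in range(1, samples):
        s = t_star * k / (samples + 1)
        slope = (gap(s + h, t_star, gamma) - gap(s - h, t_star, gamma)) / (2.0 * h)
        worst = max(worst, abs(slope - flow_rhs(gap(s, t_star, gamma), gamma)))
    return worst
```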

Besides the asymptotic expansion for gap size and local minimum we also require some quantitative control on the location of $ξ_{t_{*}} (c^{*})$ , as defined in (5.3a), and some slight perturbations thereof within the spectral gap $[e_{-}^{*}, e_{+}^{*}]$ of $ρ^{*}$ . We remark that the point $ξ^{*} : = ξ_{t_{*}} (c^{*})$ plays a critical role for the contour integration in Sect. 5.2 since it will be the critical point of the phase function. From (5.5c) we recall that the gap size scales as $t_{*}^{3 / 2}$ which makes it natural to compare distances on that scale. In the regime where $t^{'} ≪ t_{*}$ all of the following estimates thus identify points very close to the centre of the initial gap.

Lemma 5.2

Suppose that we are in the setting of Lemma 5.1. We then find that $ξ_{t_{*}} (c^{*})$ is very close to the centre of $[e_{-}^{*}, e_{+}^{*}]$ in the sense that

5.11a

Furthermore, for $0 \leq t^{'} \leq t_{*}$ we have that

5.11b

Proof

We begin with proving (5.11a). For $s < t_{*}$ we denote the distance of $ξ_{s} (e_{s}^{\pm})$ to the edges $e_{0}^{\pm}$ by $D_{s}^{\pm} : = \pm (e_{0}^{\pm} - ξ_{s} (e_{s}^{\pm}))$ , cf. Fig. 1b. We have, by differentiating $m_{*}^{'} (ξ_{s} (e_{s}^{\pm})) = 1 / s$ from (5.8) that

\begin{matrix} {\dot{D}}_{s}^{\pm} = \mp \frac{d}{d s} ξ_{s} (e_{s}^{\pm}), - \frac{1}{s^{2}} = m_{*}^{''} (ξ_{s} (e_{s}^{\pm})) \frac{d}{d s} ξ_{s} (e_{s}^{\pm}) \end{matrix}

5.12

and by differentiating (5.3a),

\begin{matrix} {(m_{s}^{fc})}^{'} = m_{*}^{'} (ξ_{s}) ξ_{s}^{'}, ξ_{s}^{'} {(m_{s}^{fc})}^{''} = m_{*}^{''} (ξ_{s}) {(ξ_{s}^{'})}^{3} + {(m_{s}^{fc})}^{'} ξ_{s}^{''}, m_{*}^{''} (ξ_{s}) = \frac{{(m_{s}^{fc})}^{''}}{{(1 + s {(m_{s}^{fc})}^{'})}^{3}} . \end{matrix}

We now consider $z = e_{s}^{\pm} + i η$ with $η \to 0$ and compute from (5.9), for any $s < t_{*}$ ,

\begin{matrix} \begin{matrix} lim_{η ↘ 0} \sqrt{η} {(m_{s}^{fc})}^{'} (z) & = lim_{η ↘ 0} \sqrt{η} \int_{R} \frac{ρ_{s}^{fc} (x)}{{(x - z)}^{2}} d x = lim_{η ↘ 0} \frac{\sqrt{3 η} {(2 γ)}^{4 / 3} Δ_{s}^{1 / 3}}{2 π} \int_{0}^{\infty} \frac{Ψ_{edge} (x / Δ_{s})}{{(x - i η)}^{2}} d x \\ = \frac{{(2 γ)}^{4 / 3}}{2 \sqrt{3} Δ_{s}^{1 / 6} π} \int_{0}^{\infty} \frac{x^{1 / 2}}{{(x - i)}^{2}} d x = \frac{{(2 γ)}^{4 / 3} \sqrt{i}}{4 \sqrt{3} Δ_{s}^{1 / 6}} \end{matrix} \end{matrix}

and

\begin{matrix} \begin{matrix} lim_{η ↘ 0} η^{3 / 2} {(m_{s}^{fc})}^{''} (z) & = lim_{η ↘ 0} η^{3 / 2} 2 \int_{R} \frac{ρ_{s}^{fc} (x)}{{(x - z)}^{3}} d x \\ & = lim_{η ↘ 0} \frac{\sqrt{3} η^{3 / 2} {(2 γ)}^{4 / 3} Δ_{s}^{1 / 3}}{π} \int_{0}^{\infty} \frac{Ψ_{edge} (x / Δ_{s})}{{(x - i η)}^{3}} d x \\ & = \frac{{(2 γ)}^{4 / 3}}{\sqrt{3} Δ_{s}^{1 / 6} π} \int_{0}^{\infty} \frac{x^{1 / 2}}{{(x - i)}^{3}} d x = \frac{{(2 γ)}^{4 / 3} i^{3 / 2}}{8 \sqrt{3} Δ_{s}^{1 / 6}} . \end{matrix} \end{matrix}

Here we used the fact that the error terms in (5.9) become irrelevant in the $η \to 0$ limit. We conclude, together with (5.12), that

\begin{matrix} \begin{matrix} m_{*}^{''} (ξ_{s} (e_{s}^{\pm})) & = \pm \frac{3 {(2 Δ_{s})}^{1 / 3}}{s^{3} γ^{8 / 3}}, \\ {\dot{D}}_{s}^{\pm} & = \pm {(s^{2} m_{*}^{''} (ξ_{s} (e_{s}^{\pm})))}^{- 1} = \frac{s γ^{8 / 3}}{3 {(2 Δ_{s})}^{1 / 3}} = \frac{s γ^{2}}{2 \sqrt{3} \sqrt{t_{*} - s}} [1 + O (t_{*}^{1 / 3})] . \end{matrix} \end{matrix}

Since $D_{0}^{-} = D_{0}^{+} = 0$ and ${\dot{D}}_{s}^{-} \approx {\dot{D}}_{s}^{+}$ it follows that, to leading order, $D_{s}^{+} \approx D_{s}^{-}$ and more precisely

\begin{matrix} D_{s}^{\pm} = γ^{2} \frac{2 t_{*}^{3 / 2} - s \sqrt{t_{*} - s} - 2 t_{*} \sqrt{t_{*} - s}}{3^{3 / 2}} [1 + O (t_{*}^{1 / 3})] . \end{matrix}

In particular it follows that $|e_{0}^{\pm} - ξ_{t_{*}} (c^{*})| = [1 + O ({t_{*}}^{1 / 3})] 2 γ^{2} t_{*}^{3 / 2} / 3^{3 / 2}$ . Together with the $s = 0$ case from (5.5c) we thus find

proving (5.11a).
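The integration leading to the closed form for $D_{s}^{\pm}$ can be verified numerically at leading order. The sketch below integrates $\dot{D}_{s} = s γ^{2} / (2 \sqrt{3} \sqrt{t_{*} - s})$ from $D_{0} = 0$ with a plain midpoint rule and compares it with the displayed formula. It also checks that at $s = t_{*}$ the distance equals half of the leading-order initial gap ${(2 γ)}^{2} {(t_{*} / 3)}^{3 / 2}$ , which is the leading-order content of the statement that $ξ_{t_{*}} (c^{*})$ sits at the centre of the gap. Parameter values and function names are illustrative assumptions.

```python
import math

def d_dot(s, t_star, gamma):
    """Leading-order velocity of D_s."""
    return s * gamma ** 2 / (2.0 * math.sqrt(3.0) * math.sqrt(t_star - s))

def d_closed(s, t_star, gamma):
    """Closed form of D_s at leading order."""
    root = math.sqrt(t_star - s)
    numerator = 2.0 * t_star ** 1.5 - s * root - 2.0 * t_star * root
    return gamma ** 2 * numerator / 3.0 ** 1.5

def d_midpoint(s, t_star, gamma, n=20000):
    """Midpoint-rule integral of d_dot over [0, s], valid for s < t_star."""
    h = s / n
    return h * sum(d_dot((k + 0.5) * h, t_star, gamma) for k in range(n))

t_star, gamma = 0.2, 1.3
quadrature_error = abs(d_midpoint(0.1, t_star, gamma) - d_closed(0.1, t_star, gamma))
half_gap = 0.5 * (2.0 * gamma) ** 2 * (t_star / 3.0) ** 1.5
centre_defect = abs(d_closed(t_star, t_star, gamma) - half_gap)
```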

We now turn to the proof of (5.11b) where we treat the small gap and small non-zero minimum separately. We start with the first inequality. We observe that (5.11a) in the setting where $(ρ^{*}, t_{*})$ are replaced by $(ρ_{t_{*} - t^{'}}^{fc}, t^{'})$ implies

5.13

Furthermore, we infer from the definition of $ξ$ and the associativity (5.3b) of the free convolution that

\begin{matrix} ξ_{t_{*} - t^{'}} (c^{*} + t^{'} m_{t_{*}}^{fc} (c^{*})) = c^{*} + t^{'} m_{t_{*}}^{fc} (c^{*}) + (t_{*} - t^{'}) m_{t_{*} - t^{'}}^{fc} (c^{*} + t^{'} m_{t_{*}}^{fc} (c^{*})) = ξ_{t_{*}} (c^{*}) \end{matrix}

and can therefore estimate

just as claimed. In the last step we used (5.13) and the fact that

\begin{matrix} |ξ_{s} (a) - ξ_{s} (b)| ≲ |a - b| + s {|a - b|}^{1 / 3}, \end{matrix}

5.14

which directly follows from the definition of $ξ$ and the 1/3-Hölder continuity of $m_{s}^{fc}$ .

Finally, we address the second inequality in (5.11b) and appeal to Lemma 5.1(i) to establish the existence of ${\tilde{m}}_{t_{*} + t^{'}}$ such that

\begin{matrix} c^{*} - {\tilde{m}}_{t_{*} + t^{'}} = t^{'} \Re m_{t_{*} + t^{'}}^{fc} ({\tilde{m}}_{t_{*} + t^{'}}) . \end{matrix}

5.15

It thus follows from (5.5b) that Inline graphic and therefore from (5.14) that

Using (5.15) twice, as well as the associativity (5.3b) of the free convolution and $\Im m_{t_{*}}^{fc} (c^{*}) = 0$ we then further compute

\begin{matrix} \begin{matrix} ξ_{t_{*} + t^{'}} ({\tilde{m}}_{t_{*} + t^{'}}) - ξ_{t_{*}} (c^{*}) & = {\tilde{m}}_{t_{*} + t^{'}} + (t_{*} + t^{'}) m_{t_{*} + t^{'}}^{fc} ({\tilde{m}}_{t_{*} + t^{'}}) - c^{*} - t_{*} m_{t_{*}}^{fc} (c^{*}) \\ & = t_{*} \Re [m_{t_{*}}^{fc} (c^{*} + i t^{'} \Im m_{t_{*} + t^{'}}^{fc} ({\tilde{m}}_{t_{*} + t^{'}})) - m_{t_{*}}^{fc} (c^{*})] + i (t_{*} + t^{'}) \Im m_{t_{*} + t^{'}}^{fc} ({\tilde{m}}_{t_{*} + t^{'}}) . \end{matrix} \end{matrix}

5.16

By Hölder continuity we can, together with (5.11a) and $\Im m_{t_{*} + t^{'}}^{fc} ({\tilde{m}}_{t_{*} + t^{'}}) \sim {(t^{'})}^{1 / 2}$ from (5.5b), conclude that

In the first term we used (5.14) and the second estimate of (5.5b). In the second term we used (5.16) together with $\Im m_{t_{*} + t^{'}}^{fc} ({\tilde{m}}_{t_{*} + t^{'}}) \sim {(t^{'})}^{1 / 2}$ from (5.5b) and 1/3-Hölder continuity of $m_{t_{*}}^{fc}$ . Finally, the last term was already estimated in the exact cusp case, i.e. in (5.11a). $□$

Correlation kernel as contour integral

We denote the eigenvalues of $H_{t}$ by $λ_{1}, \dots, λ_{N}$ . Following the work of Brézin and Hikami (see e.g. [22, Eq. (2.14)] or [35, Eq. (3.13)] for the precise version used in the present context) the correlation kernel of ${\tilde{H}}_{t} = H_{t} + \sqrt{ct} U$ can be written as

\begin{matrix} {\hat{K}}_{N}^{t} (u, v) : = \frac{N}{{(2 π i)}^{2} c t} \int_{Υ} d z \int_{Γ} d w \frac{exp (N [w^{2} - 2 v w + v^{2} - z^{2} + 2 z u - u^{2}] / 2 c t)}{w - z} \\ \prod_{i} \frac{w - λ_{i}}{z - λ_{i}}, \end{matrix}

where $Υ$ is any contour around all $λ_{i}$ , and $Γ$ is any vertical line not intersecting $Υ$ . With this notation, the k-point correlation function of the eigenvalues of ${\tilde{H}}_{t}$ is given by

\begin{matrix} p_{k}^{(N)} (x_{1}, \dots, x_{k}) = det (\frac{1}{N} {\hat{K}}_{N}^{t} (x_{i}, x_{j}))_{i, j \in [k]} . \end{matrix}

Due to the determinantal structure we can freely conjugate $K_{N}$ with $v \mapsto e^{N (ξ v - v^{2} / 2) / c t}$ for $ξ : = ξ_{ct} (b)$ to redefine the correlation kernel as

\begin{matrix} K_{N}^{t} (u, v) : = \frac{N}{{(2 π i)}^{2} c t} \int_{Υ} d z \int_{Γ} d w \frac{exp (N [w^{2} - 2 v (w - ξ) - z^{2} + 2 u (z - ξ)] / 2 c t)}{w - z} \\ \prod_{i} \frac{w - λ_{i}}{z - λ_{i}} . \end{matrix}

This redefinition $K_{N}^{t}$ does not agree point-wise with the previous definition ${\hat{K}}_{N}^{t}$ , but gives rise to the same determinant, and in particular to the same k-point correlation function. Here $b$ is the base point chosen in Theorem 2.3. The central result concerning the correlation kernel is the following proposition.

Proposition 5.3

Under the assumptions of Theorem 2.3, the rescaled correlation kernel

\begin{matrix} {\tilde{K}}_{N}^{t} (x, y) : = \frac{1}{N^{3 / 4} γ} K_{N}^{t} (b + \frac{x}{N^{3 / 4} γ}, b + \frac{y}{N^{3 / 4} γ}) \end{matrix}

5.17

around the base point $b$ chosen in (2.6) converges uniformly to the Pearcey kernel from (2.5) in the sense that

\begin{matrix} |{\tilde{K}}_{N}^{t} (x, y) - K_{α} (x, y)| \leq C N^{- c} \end{matrix}

for $x, y \in [- R, R]$ . Here R is an arbitrary large threshold, $c > 0$ is some universal constant, $C > 0$ is a constant depending only on the model parameters and R, and $α$ is chosen according to (2.6).

Proof

We now split the contour $Υ$ into two parts, one encircling all eigenvalues $λ_{i}$ to the left of $ξ = b + c t 〈M (b)〉$ , and the other one encircling all eigenvalues $λ_{i}$ to the right of $ξ$ , which does not change the value of $K_{N}^{t}$ . We then move the vertical $Γ$ contour so that it crosses the real axis in $ξ$ . This also does not change the value of $K_{N}^{t}$ as the only pole is the one in z for which the residue reads

\begin{matrix} \frac{N}{{(2 π i)}^{2} c t} \int_{Υ} d z exp (\frac{N}{c t γ} (u - v) (z - ξ)) = 0 . \end{matrix}

We now perform a linear change of variables $z \mapsto ξ + Δ_{0} z$ , $w \mapsto ξ + Δ_{0} w$ in (5.17) to transform the contours $Υ, Γ$ into contours

\begin{matrix} \hat{Γ} : = (Γ - ξ) / Δ_{0}, \hat{Υ} : = (Υ - ξ) / Δ_{0} \end{matrix}

5.18

to obtain

\begin{matrix} {\tilde{K}}_{N}^{t} (x, y) = \frac{N^{1 / 4} Δ_{0}}{{(2 π i)}^{2} c t γ} \int_{\hat{Υ}} d z \int_{\hat{Γ}} d w \frac{exp (Δ_{0} N^{1 / 4} (x z - y w) / c t γ + N Δ_{0}^{2} [\tilde{f} (w) - \tilde{f} (z)] / c t)}{w - z}, \end{matrix}

5.19

where

\begin{matrix} \tilde{f} (z) : = \frac{z^{2}}{2} - \frac{ct}{Δ_{0}^{2}} \int_{ξ}^{ξ + Δ_{0} z} 〈G_{t} (u) - M_{t} (ξ)〉 d u . \end{matrix}

Here $Δ_{0} : = e_{0}^{+} - e_{0}^{-}$ indicates the length of the gap $[e_{0}^{-}, e_{0}^{+}]$ in the support of $ρ_{t}$ . From Lemma 5.1 with $ρ^{*} = ρ_{t}$ and $t_{*} = c t$ we infer $Δ_{0} \sim t^{3 / 2} \sim N^{- 3 / 4 + 3 ϵ / 2}$ . In order to obtain (5.19) we used the relation $ξ - b = c t m_{ct}^{fc} (b) = c t 〈M_{t} (b + c t m_{ct}^{fc} (b))〉 = c t 〈M_{t} (ξ)〉$ .
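The powers of $N$ in (5.19) can be tracked with a small piece of exact arithmetic. In the sketch below a quantity of size $N^{a + b ϵ}$ is stored as the pair $(a, b)$ ; constants such as $c$ and $γ$ are ignored, and the helper names are hypothetical. With $t = N^{- 1 / 2 + ϵ}$ and $Δ_{0} \sim t^{3 / 2}$ , the prefactor of the linear term is of size $N^{ϵ / 2}$ and the prefactor of the phase is of size $N^{2 ϵ}$ , so both the linear term and a quartic phase term are of order one for $|z| \sim N^{- ϵ / 2}$ . This is only a consistency check of the scaling and not part of the proof.

```python
from fractions import Fraction as F

def mul(*powers):
    """Multiply quantities N^(a + b*eps), each stored as a pair (a, b)."""
    return (sum(p[0] for p in powers), sum(p[1] for p in powers))

def power(p, k):
    """Raise N^(a + b*eps) to the power k."""
    return (p[0] * k, p[1] * k)

N = (F(1), F(0))
t = (F(-1, 2), F(1))           # t = N^(-1/2 + eps)
delta0 = power(t, F(3, 2))     # Delta_0 ~ t^(3/2)

# Delta_0 * N^(1/4) / (c t gamma): prefactor of (x z - y w) in (5.19)
linear_prefactor = mul(delta0, power(N, F(1, 4)), power(t, -1))
# N * Delta_0^2 / (c t): prefactor of the phase difference in (5.19)
phase_prefactor = mul(N, power(delta0, 2), power(t, -1))

z_scale = (F(0), F(-1, 2))     # |z| ~ N^(-eps/2)
linear_term = mul(linear_prefactor, z_scale)
quartic_term = mul(phase_prefactor, power(z_scale, 4))
```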

We begin by analysing the deterministic variant of $\tilde{f} (z)$ ,

\begin{matrix} f (z) : = \frac{z^{2}}{2} - \frac{ct}{Δ_{0}^{2}} \int_{ξ}^{ξ + Δ_{0} z} 〈M_{t} (u) - M_{t} (ξ)〉 d u . \end{matrix}

We separately analyse the large- and small-scale behaviour of f(z). On the one hand, using the 1/3-Hölder continuity of $u \mapsto 〈M_{t} (u)〉$ , eq. (5.5c) and

\begin{matrix} \frac{ct}{Δ_{0}^{2}} \int_{ξ}^{ξ + Δ_{0} z} |〈M_{t} (u) - M_{t} (ξ)〉| d u ≲ \frac{t {(Δ_{0} |z|)}^{4 / 3}}{Δ_{0}^{2}} ≲ {|z|}^{4 / 3}, \end{matrix}

we conclude the large-scale asymptotics

\begin{matrix} f (z) = \frac{z^{2}}{2} + O ({|z|}^{4 / 3}), |z| ≫ 1 . \end{matrix}

5.20

We now turn to the small-scale $|z| ≪ 1$ asymptotics. We first specialize Lemmas 5.1 and 5.2 to $ρ^{*} = ρ_{t}$ and collect the necessary conclusions in the following Lemma.

Lemma 5.4

Under the assumptions of Theorem 2.3 it follows that $ρ_{t}$ has a spectral gap $[e_{0}^{-}, e_{0}^{+}]$ of size

\begin{matrix} Δ_{0} = e_{0}^{+} - e_{0}^{-} = Δ (c t \pm t^{ρ}) [1 + O (t^{1 / 3})], \quad \text{where} \quad \pm t^{ρ} : = \begin{cases} 0 & \text{in case (i)} \\ 3 {(Δ^{ρ})}^{2 / 3} / {(2 γ)}^{4 / 3} & \text{in case (ii)} \\ - π^{2} ρ {(m^{ρ})}^{2} / γ^{4} & \text{in case (iii)} . \end{cases} \end{matrix}

5.21a

Furthermore, in all three cases we have that $ξ$ is very close to the centre of the gap in the support of $ρ_{t}$ in the sense that

\begin{matrix} |ξ - \frac{e_{0}^{+} + e_{0}^{-}}{2}| = O (t^{3 / 2} N^{- ϵ / 2}) . \end{matrix}

5.21b

Proof

We prove (5.21a)–(5.21b) separately in cases (i), (ii) and (iii).

(i)
Here (5.21a) follows directly from (5.5c) with $ρ^{*} = ρ_{t}$ , $t_{*} = c t$ , $s = 0$ and $c^{*} = c^{ρ}$ . Furthermore (5.21b) follows from (5.11a) with $ρ^{*} = ρ_{t}$ , $t_{*} = c t$ and $c^{*} = c^{ρ}$ .
(ii)
We apply (5.5c) with $ρ^{*} = ρ = ρ_{ct}^{fc}$ , $t_{*} = t^{ρ}$ , $s = 0$ to conclude that $Δ^{ρ} = {(2 γ)}^{2} {(t^{ρ} / 3)}^{3 / 2} [1 + O ({(t^{ρ})}^{1 / 3})]$ , and that $ρ_{c t + t^{ρ}}^{fc}$ has an exact cusp in some point $c$ . Thus (5.21a) follows from another application of (5.5c) with $ρ^{*} = ρ_{t}$ , $t_{*} = c t + t^{ρ}$ , $s = 0$ and $c^{*} = c$ . Furthermore, (5.21b) follows again from (5.11b) but this time with $ρ^{*} = ρ_{t}$ , $t_{*} = c t + t^{ρ}$ , $t^{'} = t^{ρ}$ and $e_{t_{*} - t^{'}}^{\pm} = e_{\pm}^{ρ}$ , and using that $t_{*}^{1 / 9} \leq N^{- ϵ / 2}$ for sufficiently small $ϵ$ .
(iii)
We apply (5.5a) with $ρ^{*} = ρ_{t}$ , $t_{*} = c t - t^{ρ}$ , $s = c t$ to conclude $ρ (m^{ρ}) = [1 + O ({(t^{ρ})}^{1 / 2})] γ^{2} \sqrt{t^{ρ}} / π$ , and that $ρ_{c t - t^{ρ}}^{fc}$ has an exact cusp in some point $c$ . Finally, (5.21b) follows again from (5.11b) but with $ρ^{*} = ρ_{t}$ , $t_{*} = c t - t^{ρ}$ , $t^{'} = t^{ρ}$ and $m_{t_{*} + t^{'}} = m^{ρ}$ , and using $t^{'} / t_{*} ≲ t^{ρ} / c t ≲ N^{- ϵ}$ and $t_{*}^{1 / 12} \leq N^{- ϵ / 2}$ for sufficiently small $ϵ$ . $□$

Equipped with Lemma 5.4 we can now turn to the small scale analysis of f(z) and write out the Stieltjes transform to find

\begin{matrix} \begin{matrix} f (z) & = \frac{z^{2}}{2} - \frac{ct}{Δ_{0}^{2}} \int_{R} \int_{ξ}^{ξ + Δ_{0} z} \frac{u - ξ}{(x - u) (x - ξ)} ρ_{t} (x) d u d x \\ = \frac{z^{2}}{2} - \frac{ct}{Δ_{0}} \int_{R} \int_{0}^{z} \frac{u}{(x - u) x} ρ_{t} (ξ + Δ_{0} x) d u d x . \end{matrix} \end{matrix}

Note that these integrals are not singular since $ρ_{t} (ξ + Δ_{0} x)$ vanishes for $|x| \leq 1 / 2$ . We now perform the u integration to find

\begin{matrix} f (z) = \frac{z^{2}}{2} - \frac{ct}{Δ_{0}} \int_{R} [log x - log (x - z) - \frac{z}{x}] ρ_{t} (ξ + Δ_{0} x) d x . \end{matrix}

5.22

By using the precise shape (5.9) (with $s = 0$ ) of $ρ_{t}$ close to the edges $e_{0}^{\pm}$ , and recalling the gap size from (5.21a) and location of $ξ$ from (5.21b) we can then write

\begin{matrix} f (z) = (1 + O (t^{1 / 3})) \tilde{g} (z) + O ({|z|}^{2} t^{1 / 3}) \end{matrix}

5.23

with

\begin{matrix} \tilde{g} (z) : = \frac{z^{2}}{2} - \frac{3 \sqrt{3}}{2 π (1 \pm t^{ρ} / c t)} \int_{R} [log x - log (x - z) - \frac{z}{x}] Ψ_{edge} (|x| - 1 / 2) 1_{|x| \geq 1 / 2} d x \end{matrix}

being the leading order contribution. Here ± indicates that the formula holds for all three cases (i), (ii) and (iii) simultaneously, where $t^{ρ} = 0$ in case (i). The contribution of the error term in (5.9) to the integral in (5.22) is of order $O ({|z|}^{2} t^{1 / 2})$ using that $log x - log (x - z) - z / x = O ({|z / x|}^{2})$ and that $|x| \geq 1 / 2$ on the support of $ρ_{t} (ξ + Δ_{0} x)$ . By the explicit integrals

\begin{matrix} \frac{3 \sqrt{3}}{2 π} \int_{0}^{\infty} \frac{Ψ_{edge} (x)}{{(x + 1 / 2)}^{2}} d x = \frac{1}{2}, \frac{3 \sqrt{3}}{2 π} \int_{0}^{\infty} \frac{Ψ_{edge} (x)}{{(x + 1 / 2)}^{4}} d x = \frac{8}{27} \end{matrix}

and a Taylor expansion of the logarithm $log (x - z)$ we find that the quadratic term $z^{2} / 2$ almost cancels and we conclude the small-scale asymptotics

\begin{matrix} \tilde{g} (z) = (\frac{\pm t^{ρ}}{ct} \frac{z^{2}}{2} - \frac{4 z^{4}}{27}) (1 + O (t^{ρ} / t)) + O ({|z|}^{5}), |z| ≪ 1 . \end{matrix}

5.24
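The coefficients in (5.24) can be reproduced by exact arithmetic from the two explicit integrals. Expanding $log x - log (x - z) - z / x = \sum_{k \geq 2} {(z / x)}^{k} / k$ and using that the weight $Ψ_{edge} (|x| - 1 / 2)$ is even in $x$ , only even powers of $z$ survive, and each half-line contributes equally. The sketch below takes the two half-line integrals as given inputs (it does not evaluate $Ψ_{edge}$ ) and writes `tau` for the signed ratio $\pm t^{ρ} / c t$ ; the names are hypothetical.

```python
from fractions import Fraction as F

# (3*sqrt(3)/(2*pi)) * int_0^infty Psi_edge(x) / (x + 1/2)^k dx, quoted from the text
HALF_LINE_MOMENTS = {2: F(1, 2), 4: F(8, 27)}

def small_z_coefficients(tau):
    """Coefficients of z^2 and z^4 in g~(z) for signed tau = +-t^rho/(ct)."""
    tau = F(tau)
    weight = 1 / (1 + tau)
    # x >= 1/2 and x <= -1/2 contribute equally to even powers of z
    full_line = {k: 2 * value for k, value in HALF_LINE_MOMENTS.items()}
    c2 = F(1, 2) - weight * full_line[2] / 2
    c4 = -weight * full_line[4] / 4
    return c2, c4
```

For `tau = 0` the quadratic coefficient vanishes exactly and the quartic one equals $- 4 / 27$ . For small non-zero `tau` one obtains $\tau / (2 (1 + \tau))$ and $- 4 / (27 (1 + \tau))$ , in agreement with (5.24) up to the stated relative error $O (t^{ρ} / t)$ .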

Contour deformations

We now argue that we can deform the contours $Υ, Γ$ and thereby via (5.18) the derived contours $\hat{Υ}, \hat{Γ}$ , in a way which bounds the sign of $\Re g$ away from zero along the contours. Here g(z) is the N-independent variant of $\tilde{g} (z)$ given by

5.25

The topological aspect of our argument is inspired by the approach in [42–44].

Lemma 5.5

For all sufficiently small $δ > 0$ there exists $K = K (δ)$ such that the following holds true. The contours $Υ, Γ$ then can be deformed, without touching $(supp ρ_{t} + [- 1, 1]) \ {ξ}$ or each other, in such a way that the rescaled contours $\hat{Υ}, \hat{Γ}$ defined in (5.18) satisfy $R g \geq K$ on $\hat{Υ} \cap {|z| > δ}$ and $R g \leq - K$ on $\hat{Γ} \cap {|z| > δ}$ . Furthermore, locally around 0 the contours can be chosen in such a way that

\begin{matrix} \begin{matrix} \hat{Γ} \cap \{z \in \mathbb{C} : |z| \leq δ\} & = (- i δ, i δ), \\ \hat{Υ} \cap \{z \in \mathbb{C} : |z| \leq δ\} & = (- δ e^{i π / 4}, δ e^{i π / 4}) \cup (- δ e^{- i π / 4}, δ e^{- i π / 4}) . \end{matrix} \end{matrix}

5.26

Proof

Just as in (5.24) we have the expansion

\begin{matrix} g (z) = - \frac{4 z^{4}}{27} + O ({|z|}^{5}), |z| ≪ 1 . \end{matrix}

5.27

It thus follows that for some small $δ > 0$ , and

we have $Ω_{\pm 1}^{<}, Ω_{\pm 3}^{<} \subset Ω_{+} : = \{\Re g > 0\}$ and $Ω_{0}^{<}, Ω_{\pm 2}^{<}, Ω_{4}^{<} \subset Ω_{-} : = \{\Re g < 0\}$ in agreement with Fig. 2c. For large z, however, it also follows from (5.20) together with (5.25) and (5.23) that for some large R, and

\begin{matrix} Ω_{k}^{>} : = \{z \in \mathbb{C} : |z| > R, \frac{(k - 1) π}{4} + δ < \arg z < \frac{(k + 1) π}{4} - δ\} \end{matrix}

we have $Ω_{0}^{>}, Ω_{4}^{>} \subset Ω_{+}$ and $Ω_{\pm 2}^{>} \subset Ω_{-}$ , in agreement with Fig. 2a. We denote the connected component of $Ω_{\pm}$ containing some set A by $cc (A)$ .
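The sign pattern of $\Re g$ in the sectors can be illustrated on the leading terms alone. The sketch below evaluates the sign of $\Re (- 4 z^{4} / 27)$ , the leading term from (5.27), on the rays $\arg z = k π / 4$ . For large $|z|$ it assumes, for illustration only, that $z^{2} / 2$ dominates $g$ ; this assumption is consistent with the inclusions stated above but is not derived here. The function names are hypothetical.

```python
import cmath
import math


def sign_small(theta: float) -> int:
    """Sign of Re(-4 z^4 / 27) on the ray arg z = theta (leading term near 0)."""
    val = (-4.0 / 27.0 * cmath.exp(4j * theta)).real
    return (val > 0) - (val < 0)


def sign_large(theta: float) -> int:
    """Sign of Re(z^2 / 2) on the ray arg z = theta (assumed leading term at infinity)."""
    val = (0.5 * cmath.exp(2j * theta)).real
    return (val > 0) - (val < 0)


# Rays through the centres of the small sectors, k = -3, ..., 4.
small_signs = {k: sign_small(k * math.pi / 4) for k in range(-3, 5)}

# Rays through the centres of the large sectors, k = 0, +-2, 4.
large_signs = {k: sign_large(k * math.pi / 4) for k in (0, 2, -2, 4)}
```

Odd $k$ give a positive sign and even $k$ a negative sign near the origin, while at infinity only $k = 0, 4$ are positive. The connectivity argument below explains how the eight small sectors attach to the four large ones.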

Claim 1: $cc (Ω_{0}^{>}), cc (Ω_{4}^{>})$ are the only two unbounded connected components of $Ω_{+}$ . Suppose there was another unbounded connected component A of $Ω_{+}$ . Since $Ω_{\pm 2}^{>} \subset Ω_{-}$ we would be able to find some $z_{0} \in A$ with arbitrarily large $|\Re z_{0}|$ . If $\Re z_{0} > 0$ , then we note that the map $x \mapsto \Re g (z_{0} + x)$ is increasing, and otherwise we note that the map $x \mapsto \Re g (z_{0} - x)$ is increasing. Thus it follows in both cases that the connected component A actually coincides with $cc (Ω_{0}^{>})$ or with $cc (Ω_{4}^{>})$ , respectively.
Claim 2: $cc (Ω_{\pm 2}^{>})$ are the only two unbounded connected components of $Ω_{-}$ . This follows very similarly to Claim 1.
Claim 3: $cc (Ω_{\pm 1}^{<}), cc (Ω_{\pm 2}^{<}), cc (Ω_{\pm 3}^{<})$ are unbounded. We note that the map $z \mapsto \Re g (z)$ is harmonic on $\mathbb{C} \setminus ([1 / 2, \infty) \cup (- \infty, - 1 / 2])$ and subharmonic on $\mathbb{C}$ . Therefore, by the maximum principle, it follows that $cc (Ω_{\pm 1}^{<}), cc (Ω_{\pm 3}^{<}) \subset Ω_{+}$ are unbounded. Since these sets are moreover symmetric with respect to the real axis it then also follows that $cc (Ω_{\pm 2}^{<}) \cap ((- \infty, - 1 / 2] \cup [1 / 2, \infty)) = \emptyset$ . This implies that $\Re g (z)$ is harmonic on $cc (Ω_{\pm 2}^{<})$ and consequently also that $cc (Ω_{\pm 2}^{<})$ are unbounded.
Claim 4: $cc (Ω_{1}^{<}) = cc (Ω_{- 1}^{<}) = cc (Ω_{0}^{>})$ and $cc (Ω_{3}^{<}) = cc (Ω_{- 3}^{<}) = cc (Ω_{4}^{>})$ . This follows from Claims 1–3.
Claim 5: $cc (Ω_{2}^{<}) = cc (Ω_{2}^{>})$ and $cc (Ω_{- 2}^{<}) = cc (Ω_{- 2}^{>})$ . This also follows from Claims 1–3.

The claimed bounds on $\Re g$ now follow from Claims 4–5 and compactness. The claimed small scale shape (5.26) follows by construction of the sets $Ω_{k}^{<}$ . $□$

From Lemmas 5.5 and 2.8 it follows that $K_{N}^{t}$ and thereby also ${\tilde{K}}_{N}^{t}$ remain, with overwhelming probability, invariant under the chosen contour deformation. Indeed, $K_{N}^{t}$ only has poles where $z = w$ or $z = λ_{i}$ for some i. Due to self-adjointness and Lemma 5.5, $z = λ_{i}$ can only occur if $λ_{i} = ξ$ or $dist (λ_{i}, supp ρ_{t}) > 1$ . Both probabilities are exponentially small as a consequence of Lemma 2.8, since for the former we have $η_{f} (ξ) \sim N^{- 3 / 4 + ϵ / 6}$ according to (2.7), while $dist (ξ, supp ρ_{t}) \sim N^{- 3 / 4 + 3 ϵ / 2}$ .

For $z \in \hat{Γ} \cup \hat{Υ}$ it follows from (5.26) that we can estimate

5.28

Indeed, for (5.28) we used (5.26) to obtain $dist (\Re u, supp ρ_{t}) ≳ t^{3 / 2}$ , so that the required bound follows from the local law in (2.8b).

We now distinguish three regimes: $|z| ≲ N^{- ϵ / 2}$ , $N^{- ϵ / 2} ≲ |z| ≪ 1$ and finally $|z| ≳ 1$ which we call microscopic, mesoscopic and macroscopic. We first consider the latter two regimes as they only contribute small error terms.

Macroscopic regime.

If either $|z| \geq δ$ or $|w| \geq δ$ , it follows from Lemma 5.5 that $\Re g (w) \leq - K$ and/or $\Re g (z) \geq K$ , and therefore together with (5.23), (5.25) and (5.28) that $\Re \tilde{f} (w) ≲ - K$ and/or $\Re \tilde{f} (z) ≳ K$ with overwhelming probability. Using $Δ_{0} \sim N^{- 3 / 4 + 3 ϵ / 2}$ from (5.21a), we find that $N Δ_{0}^{2} / c t \sim N^{2 ϵ}$ and $Δ_{0} N^{1 / 4} / c t γ \sim N^{ϵ / 2}$ , so that the integrand in (5.19) in the considered regime is exponentially small.

Mesoscopic regime.

If either $δ \geq |z| ≫ N^{- ϵ / 2}$ or $δ \geq |w| ≫ N^{- ϵ / 2}$ , then $\Re g (w) \sim - {|w|}^{4} ≪ - N^{- 2 ϵ}$ and/or $\Re g (z) \sim {|z|}^{4} ≫ N^{- 2 ϵ}$ from (5.27). Thus it follows from (5.23) and (5.25) that also $\Re f (w) ≪ - N^{- 2 ϵ}$ and/or $\Re f (z) ≫ N^{- 2 ϵ}$ and by (5.28) that with overwhelming probability $\Re \tilde{f} (w) ≪ - N^{- 2 ϵ}$ and/or $\Re \tilde{f} (z) ≫ N^{- 2 ϵ}$ . Since $1 / |w - z|$ is integrable over the contours it thus follows that the contribution to ${\tilde{K}}_{N}^{t} (x, y)$ , as in (5.19), from z, w with either $|z| ≫ N^{- ϵ / 2}$ or $|w| ≫ N^{- ϵ / 2}$ is negligible.

Microscopic regime.

We can now concentrate on the important regime where $|z| + |w| ≲ N^{- ϵ / 2}$ and to do so perform another change of variables $z \mapsto c t γ z / Δ_{0} N^{1 / 4} \sim N^{- ϵ / 2} z$ , $w \mapsto c t γ w / Δ_{0} N^{1 / 4} \sim N^{- ϵ / 2} w$ which gives rise to two new contours

\begin{matrix} {\hat{Γ}}^{'} : = \frac{Δ_{0} N^{1 / 4}}{c t γ} \hat{Γ}, {\hat{Υ}}^{'} : = \frac{Δ_{0} N^{1 / 4}}{c t γ} \hat{Υ}, \end{matrix}

as depicted in Fig. 2b, and the kernel

\begin{matrix} {\tilde{K}}_{N}^{t} (x, y) = \frac{1}{{(2 π i)}^{2}} \int_{{\hat{Υ}}^{'}} d z \int_{{\hat{Γ}}^{'}} d w \frac{exp (x z - y w + \frac{N Δ_{0}^{2}}{ct} [\tilde{f} (\frac{c t γ w}{Δ_{0} N^{1 / 4}}) - \tilde{f} (\frac{c t γ z}{Δ_{0} N^{1 / 4}})])}{w - z} . \end{matrix}

5.29

We only have to consider w, z with $|w| + |z| ≲ 1$ in (5.29) since $t / Δ_{0} N^{1 / 4} \sim N^{- ϵ / 2}$ and the other regime has already been covered in the previous paragraph before the change of variables.

We now separately estimate the errors stemming from replacing $\tilde{f} (z)$ first by f(z), then by $\tilde{g} (z)$ and finally by $\pm t^{ρ} z^{2} / 2 c t - 4 z^{4} / 27$ . We recall that $Δ_{0} \sim t^{3 / 2} = N^{- 3 / 4 + 3 ϵ / 2}$ from (5.21a), $t^{ρ} ≲ N^{- 1 / 2}$ from the definition of $t^{ρ}$ in (5.21a), and that $t = N^{- 1 / 2 + ϵ}$ which will be used repeatedly in the following estimates. According to (5.28), we have

\begin{matrix} \frac{N Δ_{0}^{2}}{ct} |\tilde{f} (\frac{c t γ z}{Δ_{0} N^{1 / 4}}) - f (\frac{c t γ z}{Δ_{0} N^{1 / 4}})| ≺ \frac{N Δ_{0}^{2}}{t} \frac{t}{Δ_{0} N^{1 / 4}} N^{- 2 ϵ} |z| ≲ N^{- ϵ / 2} . \end{matrix}

5.30a

Next, from (5.23) we have

\begin{matrix} \frac{N Δ_{0}^{2}}{ct} |f (\frac{c t γ z}{Δ_{0} N^{1 / 4}}) - \tilde{g} (\frac{c t γ z}{Δ_{0} N^{1 / 4}})| ≲ t^{1 / 3} {|\frac{c t γ z}{Δ_{0} N^{1 / 4}}|}^{2} \frac{N Δ_{0}^{2}}{ct} + t^{1 / 3} \frac{N Δ_{0}^{2}}{ct} ≲ N^{- 1 / 6 + 7 ϵ / 3} . \end{matrix}

5.30b

Then we estimate the error from replacing $\tilde{g} (z)$ by its Taylor expansion (5.24) and find

\begin{matrix} \frac{N Δ_{0}^{2}}{ct} |\tilde{g} (\frac{c t γ z}{Δ_{0} N^{1 / 4}}) - \frac{\pm t^{ρ}}{2 c t} (\frac{c t γ z}{Δ_{0} N^{1 / 4}})^{2} + \frac{4}{27} (\frac{c t γ z}{Δ_{0} N^{1 / 4}})^{4}| ≲ N^{- ϵ / 2} . \end{matrix}

5.30c
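The powers of N in the estimates (5.30a)–(5.30c) can be verified by simple exponent bookkeeping. In the sketch below a quantity of size $N^{a + b ϵ}$ is stored as the pair $(a, b)$ ; the order-one constants $c$ and $γ$ are ignored, and all helper names are hypothetical.

```python
from fractions import Fraction as F


def mul(*powers):
    """Multiply powers N^(a + b*eps), each stored as a pair (a, b)."""
    return (sum(p[0] for p in powers), sum(p[1] for p in powers))


def inv(p):
    """Reciprocal of N^(a + b*eps)."""
    return (-p[0], -p[1])


def power(p, r):
    """Raise N^(a + b*eps) to the power r."""
    return (p[0] * r, p[1] * r)


N = (F(1), F(0))
t = (F(-1, 2), F(1))            # t = N^(-1/2 + eps)
delta0 = (F(-3, 4), F(3, 2))    # Delta_0 ~ N^(-3/4 + 3*eps/2)

prefactor = mul(N, power(delta0, 2), inv(t))           # N Delta_0^2 / t
rescale = mul(t, inv(delta0), (F(-1, 4), F(0)))        # t / (Delta_0 N^(1/4))

bound_a = mul(prefactor, rescale, (F(0), F(-2)))       # right-hand side of (5.30a)
bound_b = mul(power(t, F(1, 3)), prefactor)            # dominant term in (5.30b)
bound_c = mul(prefactor, power(rescale, 5))            # |z|^5 remainder in (5.30c)
```

The pairs come out as $(0, 2)$ and $(0, - 1 / 2)$ for the prefactor and the rescaling, and as $(0, - 1 / 2)$ , $(- 1 / 6, 7 / 3)$ and $(0, - 1 / 2)$ for the three bounds, matching $N^{2 ϵ}$ , $N^{- ϵ / 2}$ and the right-hand sides of (5.30a)–(5.30c).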

Finally, from (5.21a) and the definition of $α$ from (2.6) we obtain that

\begin{matrix} \frac{N Δ_{0}^{2}}{ct} [\frac{\pm t^{ρ}}{2 c t} {(\frac{c t γ z}{Δ_{0} N^{1 / 4}})}^{2} - \frac{4}{27} {(\frac{c t γ z}{Δ_{0} N^{1 / 4}})}^{4}] = (α \frac{z^{2}}{2} - \frac{z^{4}}{4}) [1 + O (t^{1 / 3})] . \end{matrix}

5.30d

From (5.30) and the integrability of $1 / |z - w|$ for small z, w along the contours we can thus conclude

\begin{matrix} {\tilde{K}}_{N}^{t} (x, y) = (1 + O (N^{- c})) \frac{1}{{(2 π i)}^{2}} \int_{{\hat{Υ}}^{'}} d z \int_{{\hat{Γ}}^{'}} d w \frac{e^{x z - y w + z^{4} / 4 - α z^{2} / 2 - w^{4} / 4 + α w^{2} / 2}}{w - z} . \end{matrix}

5.31

Furthermore, it follows from (5.26) that, as $N \to \infty$ , the contours ${\hat{Υ}}^{'}, {\hat{Γ}}^{'}$ are those depicted in Fig. 2b, i.e.

\begin{matrix} {\hat{Υ}}^{'} = (- e^{i π / 4} \infty, e^{i π / 4} \infty) \cup (- e^{- i π / 4} \infty, e^{- i π / 4} \infty), {\hat{Γ}}^{'} : = (- i \infty, i \infty) . \end{matrix}

We recognize (5.31) as the extended Pearcey kernel from (2.5).

It is easy to see that all error terms along the contour integration are uniform in x, y running over any fixed compact set. This proves that ${\tilde{K}}_{N}^{t} (x, y)$ converges to $K_{α} (x, y)$ uniformly in x, y in a compact set. This completes the proof of Proposition 5.3. $□$

Green function comparison

We will now complete the proof of Theorem 2.3 by demonstrating that the local k-point correlation function at the common physical cusp location $τ_{0}$ of the matrices ${\tilde{H}}_{t}$ does not change along the flow (5.1). Together with Proposition 5.3 this completes the proof of Theorem 2.3. A version of this continuity of the matrix Ornstein-Uhlenbeck process with respect to the local correlation functions that is valid in the bulk or at regular edges is the third step in the well known three step approach to universality [38]. We will present this argument in the more general setup of correlated random matrices, i.e. in the setting of [34]. In particular, we assume that the cumulants of the matrix elements $w_{ab}$ satisfy the decay conditions [34, Assumptions (C,D)], an assumption that is obviously fulfilled for deformed Wigner-type matrices.

We claim that the k-point correlation function $p_{k}^{(N)}$ of $H = {\tilde{H}}_{0}$ and the corresponding k-point correlation function ${\tilde{p}}_{k, t}^{(N)}$ of ${\tilde{H}}_{t}$ stay close along the OU-flow in the sense that

\begin{matrix} |\int_{\mathbb{R}^{k}} F (x) [N^{k / 4} p_{k}^{(N)} (b + \frac{x}{γ N^{3 / 4}}) - {\tilde{p}}_{k, t}^{(N)} (b + \frac{x}{γ N^{3 / 4}})] d x_{1} \dots d x_{k}| = O (N^{- c}), \end{matrix}

5.32

for $ϵ > 0$ , $t \leq N^{- 1 / 4 - ϵ}$ , smooth functions F and some constant $c = c (k, ϵ)$ , where $b$ is the physical cusp point. The proof of (5.32) follows the standard arguments of computing t-derivatives of products of traces of resolvents ${\tilde{G}}^{(t)} = {({\tilde{H}}_{t} - z)}^{- 1}$ at spectral parameters z just below the fluctuation scale of eigenvalues, i.e. for $\Im z \geq N^{- ζ} η_{f} (\Re z)$ . Since the procedure detailed e.g. in [38, Chapter 15] is well established and not specific to the cusp scaling, we keep our explanations brief.

The only cusp-specific part of the argument is estimating products of random variables

\begin{matrix} X_{t} = X_{t} (x) : = N^{1 / 4} 〈\Im {\tilde{G}}^{(t)} (b + γ^{- 1} N^{- 3 / 4} x + i N^{- 3 / 4 - ζ})〉 \end{matrix}

and we claim that

\begin{matrix} E [\prod_{j = 1}^{k} X_{t} (x_{j}) - \prod_{j = 1}^{k} X_{0} (x_{j})] ≲ N^{- c} \end{matrix}

5.33

as long as $t \leq N^{- 1 / 4 - ϵ}$ for some $c = c (k, ϵ, ζ)$ . For simplicity we first consider $k = 1$ and find from Itô’s Lemma that

\begin{matrix} E \frac{d X_{t}}{d t} = E [- \frac{1}{2} \sum_{α} w_{α} \partial_{α} X_{t} + \frac{1}{2} \sum_{α, β} κ (α, β) \partial_{α} \partial_{β} X_{t}], \end{matrix}

5.34

which we further compute using a standard cumulant expansion, as already done in the bulk regime in [34, Proof of Corollary 2.6] and in the edge regime in [11, Section 4.2]. We recall that $κ (α, β)$ , and more generally $κ (α, β_{1}, \dots, β_{k})$ , denote the joint cumulants of the random variables $w_{α}, w_{β}$ and $w_{α}, w_{β_{1}}, \dots, w_{β_{k}}$ , respectively, which accordingly scale like $N^{- 1}$ and $N^{- (k + 1) / 2}$ . Here Greek letters $α, β \in {[N]}^{2}$ are double indices. After cumulant expansion, the leading term in (5.34) cancels, and the next order contribution is

\begin{matrix} \sum_{α, β_{1}, β_{2}} κ (α, β_{1}, β_{2}) E [\partial_{α} \partial_{β_{1}} \partial_{β_{2}} X_{t}], \end{matrix}

with $N^{- 3 / 2}$ being the size of the cumulant $κ (α, β_{1}, β_{2})$ . With $α = (a, b)$ and $β_{i} = (a_{i}, b_{i})$ we then estimate

\begin{matrix} \begin{matrix} N^{- 3 / 4} \sum_{a, b, c} \sum_{a_{1}, b_{1}, a_{2}, b_{2}} |κ (a b, a_{1} b_{1}, a_{2} b_{2})| E |{\tilde{G}}_{ca}^{(t)} {\tilde{G}}_{b a_{1}}^{(t)} {\tilde{G}}_{b_{1} a_{2}}^{(t)} {\tilde{G}}_{b_{2} c}^{(t)}| \\ \leq N^{- 3 / 4 - 3 / 2 + 2 + 3 / 4 + ζ} ‖ \Im {\tilde{G}}^{(t)} ‖_{3} {‖ {\tilde{G}}^{(t)} ‖}_{3}^{2}, \end{matrix} \end{matrix}

where we used the Ward identity and that $\max_{α} \sum_{β_{1}, β_{2}} κ (α, β_{1}, β_{2}) ≲ N^{- 3 / 2}$ . We now use that according to [34, Proof of Prop. 5.5], $η \mapsto η ‖ {\tilde{G}}^{(t)} ‖_{p}$ and similarly $η \mapsto η ‖ \Im {\tilde{G}}^{(t)} ‖_{p}$ are monotonically increasing, which we apply with $η^{'} = N^{- 3 / 4 + ζ}$ to find $‖ \Im {\tilde{G}}^{(t)} ‖_{p} \leq_{p} N^{3 ζ - 1 / 4}$ and $‖ {\tilde{G}}^{(t)} ‖_{p} \leq_{p} N^{3 ζ}$ from the local law from Theorem 2.5 and the scaling of $ρ$ at $η^{'}$ . Since all other error terms can be handled similarly and give an even smaller contribution it follows that

\begin{matrix} |E \frac{d X_{t}}{d t}| ≲ N^{1 / 4 + C ζ} \quad \text{and similarly, but more generally,} \quad |E \frac{d}{d t} \prod_{j = 1}^{k} X_{t} (x_{j})| ≲ N^{1 / 4 + C k ζ}, \end{matrix}

5.35

for some constant $C > 0$ . Now (5.33) and therefore (5.32) follow from (5.35) as in [38, Theorem 15.3] using the choice $t = N^{- 1 / 2 + ϵ} \leq N^{- 1 / 4 - ϵ}$ and choosing $ζ$ sufficiently small.
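For the reader's convenience, the integration step behind this conclusion can be spelled out as follows. This is only a sketch of the standard argument; it assumes that (5.35) holds uniformly for all times $s \in [0, t]$ , with the same constant $C$ .

```latex
\Big| \mathbf{E} \prod_{j=1}^{k} X_{t}(x_{j}) - \mathbf{E} \prod_{j=1}^{k} X_{0}(x_{j}) \Big|
  \le \int_{0}^{t} \Big| \mathbf{E} \frac{d}{ds} \prod_{j=1}^{k} X_{s}(x_{j}) \Big| \, ds
  \lesssim t \, N^{1/4 + C k \zeta}
  \le N^{-\epsilon + C k \zeta},
  \qquad t \le N^{-1/4 - \epsilon}.
```

Any choice $ζ < ϵ / (C k)$ thus gives (5.33) with $c = ϵ - C k ζ > 0$ . For the specific time $t = N^{- 1 / 2 + ϵ}$ the same computation gives the stronger bound $N^{- 1 / 4 + ϵ + C k ζ}$ .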

Acknowledgements

Open access funding provided by Institute of Science and Technology (IST Austria). The authors are very grateful to Johannes Alt for numerous discussions on the Dyson equation and for his invaluable help in adjusting [10] to the needs of the present work.

Appendix A. Technical lemmata

Lemma A.1

Let $\mathbb{C}^{N \times N}$ be equipped with a norm $‖ \cdot ‖$ . Let $A : \mathbb{C}^{N \times N} \times \mathbb{C}^{N \times N} \to \mathbb{C}^{N \times N}$ be a bilinear form and let $B : \mathbb{C}^{N \times N} \to \mathbb{C}^{N \times N}$ be a linear operator with a non-degenerate isolated eigenvalue $β$ . Denote the spectral projection corresponding to $β$ by $P$ and by $Q$ the one corresponding to the spectral complement of $β$ , i.e.

\begin{matrix} P : = - lim_{ϵ ↘ 0} \frac{1}{2 π i} \oint_{\partial B_{ϵ} (β)} \frac{d ω}{B - ω} = 〈V_{l}, \cdot〉 V_{r}, Q : = 1 - P, \end{matrix}

where $V_{r}$ is the eigenmatrix corresponding to $β$ and $〈V_{l}, \cdot〉$ a linear functional. Assume that for some positive constant $λ > 1$ the bounds

\begin{matrix} ‖ A ‖ + ‖ B^{- 1} Q ‖ + ‖ 〈V_{l}, \cdot〉 ‖ + ‖ V_{r} ‖ \leq λ, \end{matrix}

A.1

are satisfied, where we denote the induced norms on linear operators, linear functionals and bilinear forms on $\mathbb{C}^{N \times N}$ by the same symbol $‖ \cdot ‖$ . Then there exists a universal constant $c > 0$ such that for any $δ \in (0, 1)$ and any $Y, X \in \mathbb{C}^{N \times N}$ with $‖ Y ‖ + ‖ X ‖ \leq c λ^{- 4}$ that satisfy the quadratic equation

\begin{matrix} B [Y] - A [Y, Y] + X = 0, \end{matrix}

A.2

the following holds: The scalar quantity

\begin{matrix} Θ : = 〈V_{l}, Y〉, \end{matrix}

fulfils the cubic equation

\begin{matrix} μ_{3} Θ^{3} + μ_{2} Θ^{2} + μ_{1} Θ + μ_{0} = λ^{12} O (δ {|Θ|}^{3} + {|Θ|}^{4} + δ^{- 2} {‖ X ‖}^{3}), \end{matrix}

A.3

with coefficients

\begin{matrix} \begin{matrix} μ_{3} & = 〈V_{l}, A [V_{r}, B^{- 1} Q A [V_{r}, V_{r}]] + A [B^{- 1} Q A [V_{r}, V_{r}], V_{r}]〉 \\ μ_{2} & = 〈V_{l}, A [V_{r}, V_{r}]〉 \\ μ_{1} & = - 〈V_{l}, A [B^{- 1} Q [X], V_{r}] + A [V_{r}, B^{- 1} Q [X]]〉 - β \\ μ_{0} & = 〈V_{l}, A [B^{- 1} Q [X], B^{- 1} Q [X]] - X〉 . \end{matrix} \end{matrix}

A.4

Furthermore,

\begin{matrix} Y = Θ V_{r} - B^{- 1} Q [X] + Θ^{2} B^{- 1} Q A [V_{r}, V_{r}] + λ^{7} O ({|Θ|}^{3} + |Θ| ‖ X ‖ + {‖ X ‖}^{2}) . \end{matrix}

A.5

Here, the constants implicit in the $O$ -notation depend on c only.

Proof

We decompose Y as

\begin{matrix} Y = Y_{1} + Y_{2}, Y_{1} = Θ V_{r} - B^{- 1} Q [X], Y_{2} = Q [Y] + B^{- 1} Q [X] . \end{matrix}

Then (A.2) takes the form

\begin{matrix} Θ β V_{r} + P [X] + B Q [Y_{2}] = A [Y, Y] . \end{matrix}

A.6

We project both sides with $Q$ , invert $B$ and take the norm to conclude

\begin{matrix} ‖ Y_{2} ‖ = λ^{2} O (‖ Y_{1} ‖^{2} + ‖ Y_{2} ‖^{2}) . \end{matrix}

Then we use the smallness of $Y_{2}$ by properly choosing $δ$ and the definition of $Y_{1}$ to infer $Y_{2} = λ^{4} O_{2}$ , where we introduced the notation

\begin{matrix} O_{k} = O ({|Θ|}^{k} + {‖ X ‖}^{k}) . \end{matrix}

Inserting this information back into (A.6) and using $|Θ| + ‖ X ‖ = O (λ^{- 3})$ reveals

\begin{matrix} Y_{2} = B^{- 1} Q A [Y_{1}, Y_{1}] + λ^{7} O_{3} . \end{matrix}

A.7

In particular, (A.5) follows. Plugging (A.7) into (A.6) and applying the projection $P$ yields

\begin{matrix} \begin{matrix} Θ β V_{r} + P [X] & = P [A [Y_{1}, Y_{1}] + A [Y_{1}, Y_{2}] + A [Y_{2}, Y_{1}]] + λ^{11} O_{4} \\ = P [A [Y_{1}, Y_{1}] + A [Y_{1}, B^{- 1} Q A [Y_{1}, Y_{1}]] + A [B^{- 1} Q A [Y_{1}, Y_{1}], Y_{1}]] + λ^{11} O_{4} . \end{matrix} \end{matrix}

For a linear operator $K_{1}$ and a bilinear form $K_{2}$ with $‖ K_{1} ‖ + ‖ K_{2} ‖ \leq 1$ we use the general bounds

\begin{matrix} Θ K_{2} [R, R] \leq δ Θ^{3} + δ^{- 1 / 2} {‖ R ‖}^{3}, Θ^{2} K_{1} [R] \leq δ Θ^{3} + δ^{- 2} {‖ R ‖}^{3}, \end{matrix}

for any $R \in C^{N \times N}$ and $δ > 0$ to find

\begin{matrix} \begin{matrix} Θ β V_{r} + P [X] & = P [A [Θ V_{r} - B^{- 1} Q [X], Θ V_{r} - B^{- 1} Q [X]] + Θ^{3} A [V_{r}, B^{- 1} Q A [V_{r}, V_{r}]] \\ + Θ^{3} A [B^{- 1} Q A [V_{r}, V_{r}], V_{r}]] \\ + λ^{8} O (δ {|Θ|}^{3} + λ^{3} {|Θ|}^{4} + δ^{- 2} {‖ X ‖}^{3}), \end{matrix} \end{matrix}

which proves (A.3). $□$
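The statement of Lemma A.1 can be sanity-checked on a toy example. The sketch below is not part of the proof: it replaces $\mathbb{C}^{N \times N}$ by $\mathbb{R}^{2}$ , takes $B = diag (β, 1)$ with $V_{r} = V_{l} = e_{1}$ (so that $B^{- 1} Q = diag (0, 1)$ ), and uses an arbitrary bilinear map $A$ with coefficients chosen purely for illustration. It solves (A.2) by fixed-point iteration and evaluates the residual of the cubic (A.3) and of the expansion (A.5).

```python
def bilinear(A, u, v):
    """A[u, v]_i = sum_{j,k} A[i][j][k] * u_j * v_k on R^2."""
    return [sum(A[i][j][k] * u[j] * v[k] for j in range(2) for k in range(2))
            for i in range(2)]


def add(u, v):
    return [u[0] + v[0], u[1] + v[1]]


beta = 0.5                                   # isolated eigenvalue of B = diag(beta, 1)
A = [[[0.3, -0.5], [0.7, 0.2]],              # illustrative coefficients only
     [[1.0, 0.4], [-0.6, 0.8]]]
X = [1e-3, 2e-3]
Vr = [1.0, 0.0]                              # V_r = e_1, and <V_l, v> = v[0]


def B_inv(v):
    return [v[0] / beta, v[1]]


def BinvQ(v):
    """B^{-1} Q with Q the projection onto e_2."""
    return [0.0, v[1]]


# Solve B[Y] - A[Y, Y] + X = 0 via the iteration Y <- B^{-1}(A[Y, Y] - X).
Y = [0.0, 0.0]
for _ in range(100):
    AYY = bilinear(A, Y, Y)
    Y = B_inv([AYY[0] - X[0], AYY[1] - X[1]])

theta = Y[0]                                 # Theta = <V_l, Y>
W = BinvQ(bilinear(A, Vr, Vr))               # B^{-1} Q A[V_r, V_r]
R = BinvQ(X)                                 # B^{-1} Q [X]

mu3 = add(bilinear(A, Vr, W), bilinear(A, W, Vr))[0]
mu2 = bilinear(A, Vr, Vr)[0]
mu1 = -add(bilinear(A, R, Vr), bilinear(A, Vr, R))[0] - beta
mu0 = bilinear(A, R, R)[0] - X[0]

cubic_residual = mu3 * theta**3 + mu2 * theta**2 + mu1 * theta + mu0

# Remainder in (A.5): Y - (Theta V_r - B^{-1}Q[X] + Theta^2 B^{-1}Q A[V_r, V_r]).
approx = [theta * Vr[0] - R[0] + theta**2 * W[0],
          theta * Vr[1] - R[1] + theta**2 * W[1]]
expansion_error = max(abs(Y[0] - approx[0]), abs(Y[1] - approx[1]))
```

In this example the individual terms of the cubic are of size about $10^{- 3}$ , while the residual is several orders of magnitude smaller, as the error term in (A.3) predicts.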

Proof of Lemma 3.3

Due to the asymptotics $Ψ_{edge} \sim \min \{λ^{1 / 2}, λ^{1 / 3}\}$ and $Ψ_{\min} \sim \min \{λ^{2}, {|λ|}^{1 / 3}\}$ and the classification of singularities in (2.4), we can infer the following behaviour of the self-consistent fluctuation scale from Definition 2.4. There exists a constant $c > 0$ only depending on the model parameters such that we have the following asymptotics. First of all, in the spectral bulk we trivially have that $η_{f} (τ) \sim N^{- 1}$ as long as $τ$ is at least a distance of $c > 0$ away from local minima of $ρ$ . In the remaining cases we use the explicit shape formulae from (2.4) to compute $η_{f}$ directly from Definition 2.4.

Non-zero local minimum or cusp. Let $τ$ be the location of a non-zero local minimum $ρ (τ) = ρ_{0} > 0$ or a cusp $ρ (τ) = ρ_{0} = 0$ . Then
$\begin{matrix} η_{f} (τ + ω) \sim \begin{cases} 1 / (N \max \{ρ_{0}, {|ω|}^{1 / 3}\}), & \max \{ρ_{0}, {|ω|}^{1 / 3}\} > N^{- 1 / 4}, \\ N^{- 3 / 4}, & \max \{ρ_{0}, {|ω|}^{1 / 3}\} \leq N^{- 1 / 4}, \end{cases} \end{matrix}$ A.8a
for $ω \in (- c, c)$ .
Edge. Let $τ = e_{\pm}$ be the position of a left/right edge at a gap in $supp ρ \cap (e_{\pm} - κ, e_{\pm} + κ)$ of size $Δ \in (0, κ]$ (cf. (2.4b)). Then
$\begin{matrix} η_{f} (e_{\pm} \pm ω) \sim \begin{cases} N^{- 3 / 4}, & ω \leq Δ \leq N^{- 3 / 4}, \\ Δ^{1 / 6} / ω^{1 / 2} N, & Δ^{1 / 9} / N^{2 / 3} < ω \leq Δ, \\ Δ^{1 / 9} / N^{2 / 3}, & ω \leq Δ^{1 / 9} / N^{2 / 3}, Δ > N^{- 3 / 4}, \\ N^{- 3 / 4}, & Δ < ω \leq N^{- 3 / 4}, \\ 1 / ω^{1 / 3} N, & ω \geq N^{- 3 / 4}, ω > Δ, \end{cases} \end{matrix}$ A.8b
for $ω \in [0, c)$ .

The claimed bounds in Lemma 3.3 now follow directly from (3.7e) and (A.8) by distinguishing the respective regimes. $□$
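The regimes in (A.8) fit together continuously at their boundaries, which can be checked numerically. The sketch below implements the right-hand sides of (A.8a) and (A.8b) with all implicit constants set to one; the function names are hypothetical and the values are only meaningful up to constant factors.

```python
import math


def eta_f_cusp(omega: float, rho0: float, N: float) -> float:
    """Right-hand side of (A.8a), constants set to 1."""
    m = max(rho0, abs(omega) ** (1 / 3))
    return 1 / (N * m) if m > N ** -0.25 else N ** -0.75


def eta_f_edge(omega: float, gap: float, N: float) -> float:
    """Right-hand side of (A.8b) for omega >= 0, constants set to 1."""
    s = N ** -0.75
    if gap <= s:
        # Small gap: cusp-like scale below s, cubic-root regime above.
        return s if omega <= s else 1 / (omega ** (1 / 3) * N)
    edge_scale = gap ** (1 / 9) / N ** (2 / 3)
    if omega <= edge_scale:
        return edge_scale                          # edge fluctuation scale
    if omega <= gap:
        return gap ** (1 / 6) / (omega ** 0.5 * N)  # square-root regime
    return 1 / (omega ** (1 / 3) * N)               # cubic-root regime


N = 1e8
gap = 1e-3
edge_scale = gap ** (1 / 9) / N ** (2 / 3)
```

For instance, with $N = 10^{8}$ and $Δ = 10^{- 3}$ the square-root regime matches the edge scale $Δ^{1 / 9} / N^{2 / 3}$ at its lower end and the cubic-root regime $1 / ω^{1 / 3} N$ at $ω = Δ$ .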

Proof of Lemma 4.8

We start from (4.7) and estimate all vertex weights $w^{(v)}$ , interaction matrices $R^{(e)}$ and weight matrices $K^{(e)}$ trivially by

to obtain

\begin{matrix} |Val (Γ)| \leq C^{|V| + |IE| + |WE|} N^{n (Γ) - |V|} {∥(\prod_{v \in V} \sum_{a_{v} \in J}) \prod_{e \in GE} G_{e}∥}_{1} . \end{matrix}

We now choose the vertex ordering $V = \{v_{1}, \dots, v_{m}\}$ as in Lemma 4.5. In the first step we partition the set of G-edges into three parts $GE = E_{1} \cup E_{2} \cup E_{3}$ : the edges not adjacent to $v_{m}$ , $E_{1} = GE \setminus N (v_{m})$ , the non-Wardable edges adjacent to $v_{m}$ , $E_{2} = GE \cap N (v_{m}) \setminus {GE}_{W}$ and the Wardable edges adjacent to $v_{m}$ , $E_{3} = {GE}_{W} \cap N (v_{m})$ . By the choice of ordering it holds that $|E_{3}| \leq 2$ . We introduce the shorthand notation $G_{E_{i}} = \prod_{e \in E_{i}} G_{e}$ and use the general Hölder inequality for any collection of random variables $\{X_{A}\}$ and $\{Y_{A}\}$ indexed by some arbitrary index set $\mathcal{A}$

\begin{matrix} {∥\sum_{A \in \mathcal{A}} |X_{A} Y_{A}|∥}_{q} \leq {∥\sum_{A \in \mathcal{A}} |X_{A}|∥}_{q_{1}} {|\mathcal{A}|}^{1 / q_{2}} \max_{A \in \mathcal{A}} {‖ Y_{A} ‖}_{q_{2}}, \quad \frac{1}{q} = \frac{1}{q_{1}} + \frac{1}{q_{2}} \end{matrix}

to compute

\begin{matrix} \begin{matrix} {∥\sum_{a_{v_{1}}, \dots, a_{v_{m - 1}}} |G_{E_{1}}| \sum_{a_{v_{m}}} |G_{E_{2}} G_{E_{3}}|∥}_{q} \\ \leq N^{(m - 1) / q_{2}} {∥\sum_{a_{v_{1}}, \dots, a_{v_{m - 1}}} |G_{E_{1}}|∥}_{q_{1}} \max_{a_{v_{1}}, \dots, a_{v_{m - 1}}} ({∥\sum_{a_{v_{m}}} |G_{E_{3}}|∥}_{2 q_{2}} N^{1 / 2 q_{2}} \max_{a_{v_{m}}} {‖ G_{E_{2}} ‖}_{2 q_{2}}), \end{matrix} \end{matrix}

where we choose $1 / q = 1 / q_{1} + 1 / q_{2}$ in such a way that $q_{2} \geq p / c ϵ$ . Since $|E_{3}| \leq 2$ we can use (4.14a) to estimate

\begin{matrix} {∥\sum_{a_{v_{m}}} |G_{E_{3}}|∥}_{2 q_{2}} \leq N {(ψ_{2 q_{2}}^{'})}^{|E_{3}|} \leq N {(ψ + ψ_{2 q_{2}}^{'})}^{|E_{3}|} \end{matrix}

and it thus follows from

\begin{matrix} ‖ G_{E_{2}} ‖_{2 q_{2}} \leq \prod_{e \in E_{2}} ‖ G_{e} ‖_{2 |E_{2}| q_{2}} = {‖ G - M ‖}_{2 |E_{2}| q_{2}}^{|E_{2} \cap {GE}_{g - m}|} {‖ G ‖}_{2 |E_{2}| q_{2}}^{|E_{2} \setminus {GE}_{g - m}|} \end{matrix}

that

\begin{matrix} {∥\sum_{a_{v_{1}}, \dots, a_{v_{m - 1}}} |G_{E_{1}}| \sum_{a_{v_{m}}} |G_{E_{2}} G_{E_{3}}|∥}_{q} \\ \leq N^{ϵ / c} {∥\sum_{a_{v_{1}}, \dots, a_{v_{m - 1}}} |G_{E_{1}}|∥}_{q_{1}} N {(ψ + ψ_{q^{'}}^{'})}^{|E_{3}|} {(ψ + ψ_{q^{'}}^{'} + ψ_{q^{'}}^{''})}^{|E_{2} \cap {GE}_{g - m}|} {(1 + {‖ G ‖}_{q^{'}})}^{|E_{2}|} \end{matrix}

A.9

for $q^{'} \geq 2 q_{2} |GE|$ . By using (A.9) inductively $m = |V| \leq c p$ times it thus follows that

\begin{matrix} {∥(\prod_{v \in V} \sum_{a_{v} \in J}) \prod_{e \in GE} G_{e}∥}_{1} \leq N^{p ϵ} N^{|V|} {(ψ + ψ_{q^{'}}^{'})}^{|{GE}_{W}|} {(ψ + ψ_{q^{'}}^{'} + ψ_{q^{'}}^{''})}^{|{GE}_{g - m}|} {(1 + {‖ G ‖}_{q^{'}})}^{|GE|}, \end{matrix}

proving the lemma. $□$
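The general Hölder inequality used in this proof holds for every probability measure, in particular for the empirical measure of a finite sample, so it can be tested directly. The sketch below does this with synthetic data; the sample sizes, distributions and exponents are arbitrary illustrative choices, and the helper name is hypothetical.

```python
import random


def lp_norm(values, p):
    """L^p norm of a random variable under the uniform measure on `values`."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)


rng = random.Random(0)
n_samples, n_index = 500, 5          # sample size and size of the index set

# X[s][a], Y[s][a]: value of X_A, Y_A with index a in sample s.
X = [[rng.gauss(0, 1) for _ in range(n_index)] for _ in range(n_samples)]
Y = [[rng.gauss(0, 1) ** 3 for _ in range(n_index)] for _ in range(n_samples)]

q1, q2 = 3.0, 6.0
q = 1.0 / (1.0 / q1 + 1.0 / q2)      # 1/q = 1/q1 + 1/q2, so q = 2

lhs = lp_norm([sum(abs(x * y) for x, y in zip(xr, yr)) for xr, yr in zip(X, Y)], q)
rhs = (lp_norm([sum(abs(x) for x in xr) for xr in X], q1)
       * n_index ** (1.0 / q2)
       * max(lp_norm([yr[a] for yr in Y], q2) for a in range(n_index)))
```

The inequality follows by bounding $\sum_{A} |X_{A} Y_{A}| \leq (\sum_{A} |X_{A}|) \max_{A} |Y_{A}|$ pointwise, applying the usual Hölder inequality, and estimating the $q_{2}$ -norm of the maximum by the $q_{2}$ -norm of the sum.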

Lemma A.2

For the coefficient in (4.42) we have the expansion

\begin{matrix} \frac{〈b^{(B)} p f (R b^{(B^{'})})〉 〈l^{(B^{'})}, \bar{l^{(B)}}〉}{〈\bar{b^{(B)}}, \bar{l^{(B)}}〉 〈l^{(B^{'})}, b^{(B^{'})}〉} = c σ ‖ F ‖ 〈{|m|}^{- 2}, f^{2}〉 + O (ρ + η / ρ), \end{matrix}

A.10

for some $|c| \sim 1$ , provided $‖ B^{- 1} ‖_{\infty \to \infty} \geq C$ for some large enough constant $C > 0$ .

Proof

Recall from the explanation after (4.42) that $R^{'} = S, T, T^{t}$ if $R = S, T^{t}, T$ , respectively. As we saw in the proof of Lemma 4.14, in the case $R = T, T^{t}$ in the complex Hermitian symmetry class, the operator B as well as $B^{'}$ has a bounded inverse. Since we assume that $‖ B^{- 1} ‖_{\infty \to \infty}$ is large, we have $R = R^{'} = S$ , which also includes the real symmetric symmetry class. In particular, we also have $‖ {(B^{'})}^{- 1} ‖_{\infty \to \infty} \geq C$ and all subsequent statements hold simultaneously for B and $B^{'}$ . We call $f^{(S)}$ the normalised eigenvector corresponding to the eigenvalue with largest modulus of $F^{(S)} : = |M| S |M|$ , recalling $M = diag (m)$ . Since $B = |M| (1 - F^{(S)} + O (ρ)) {|M|}^{- 1}$ we can use perturbation theory of $F^{(S)}$ to analyse spectral properties of B. In particular, we find

\begin{matrix} \begin{matrix} b^{(B)} & = |M| f^{(S)} + O (ρ), l^{(B)} = {|M|}^{- 1} f^{(S)} + O (ρ), \\ B^{- 1} Q_{B} & = |M| (1 - F^{(S)})^{- 1} (1 - P_{f^{(S)}}) {|M|}^{- 1} + O (ρ), \end{matrix} \end{matrix}

A.11

where $P_{f^{(S)}}$ is the orthogonal projection onto the $f^{(S)}$ direction. The error terms are measured in ${‖ \cdot ‖}_{\infty}$ -norm. For the expansions (A.11) we used that F has a spectral gap in the sense that

\begin{matrix} Spec (F^{(S)} / ‖ F^{(S)} ‖) \subseteq [- 1 + c, 1 - c] \cup \{1\}, \end{matrix}

for some constant $c > 0$ , depending only on model parameters. By using (A.11) we see that the lhs. of (A.10) becomes $\pm 〈{(f^{(S)})}^{2} p f〉 ‖ F^{(S)} ‖ 〈{|m|}^{- 2}, {(f^{(S)})}^{2}〉 + O (ρ)$ . To complete the proof of the Lemma we note that $f^{(S)} = f / ‖ f ‖ + O (η / ρ)$ according to [10, Eq. (5.10)].

$□$

Footnotes

See Appendix B of arXiv:1809.03971v2 for details.

This equivalent property is commonly known as having a colouring number of at most $k + 1$ , see e.g. [39].

We have $c_{1} = π / ψ$ , $c_{2} = 2 σ / ψ$ with the notations $ψ, σ$ in [10], where $ψ \sim 1$ and $|σ| ≪ 1$ near the almost cusp, but we refrain from using these letters in the present context to avoid confusions.

See [10, Lemma 5.5] for the 1/3-Hölder continuity of quantities $ψ, σ$ in the definition of $c_{2}$ .

L. Erdős: Partially supported by ERC Advanced Grant No. 338804.

T. Krüger: Partially supported by the Hausdorff Center for Mathematics.

D. Schröder: Partially supported by the IST Austria Excellence Scholarship.

Contributor Information

László Erdős, Email: lerdos@ist.ac.at.

Torben Krüger, Email: torben-krueger@uni-bonn.de.

Dominik Schröder, Email: dschroed@ist.ac.at, Email: dschroeder@ethz.ch.

References

1.Adlam, B., Che, Z.: Spectral statistics of sparse random graphs with a general degree distribution. Preprint (2015). arXiv:1509.03368
2.Adler M, Cafasso M, van Moerbeke P. From the Pearcey to the Airy process. Electron. J. Probab. 2011;16(36):1048–1064. [Google Scholar]
3.Adler M, Ferrari PL, van Moerbeke P. Airy processes with wanderers and new universality classes. Ann. Probab. 2010;38:714–769. [Google Scholar]
4.Adler M, van Moerbeke P. PDEs for the Gaussian ensemble with external source and the Pearcey distribution. Commun. Pure Appl. Math. 2007;60:1261–1292. [Google Scholar]
5.Ajanki, O.H., Erdős, L., Krüger, T.: Quadratic vector equations on complex upperhalf-plane. Mem. Amer. Math. Soc. 261(1261), v+133 (2019)
6.Ajanki OH, Erdős L, Krüger T. Singularities of solutions to quadratic vector equations on the complex upper half-plane. Commun. Pure Appl. Math. 2017;70:1672–1705. [Google Scholar]
7.Ajanki OH, Erdős L, Krüger T. Stability of the matrix Dyson equation and random matrices with correlations. Probab. Theory Relat. Fields. 2019;173:293–373. [Google Scholar]
8.Ajanki OH, Erdős L, Krüger T. Universality for general Wigner-type matrices. Probab. Theory Relat. Fields. 2017;169:667–727. [Google Scholar]
9.Alt, J., Erdős, L., Krüger, T.: Spectral radius of random matrices with independent entries. Preprint (2019). arXiv:1907.13631
10.Alt, J., Erdős, L., Krüger, T.: The Dyson equation with linear self-energy: spectral bands, edges and cusps. Preprint (2018). arXiv:1804.07752
11.Alt, J., Erdős, L., Krüger, T., Schröder, D.: Correlated random matrices: Band rigidity and edge universality. Ann. Probab. (2018). arXiv:1804.07744 (to appear)
12.Anderson PW. Absence of diffusion in certain random lattices. Phys. Rev. 1958;109:1492–1505. [Google Scholar]
13.Baik, J., Kriecherbauer, T., McLaughlin, K.T.-R., Miller, P.D.: Discrete Orthogonal Polynomials, vol. 64. Annals of Mathematics Studies, Asymptotics and Applications, pp . viii+170. Princeton University Press, Princeton, NJ (2007)
14.Bauerschmidt R, Huang J, Knowles A, Yau H-T. Bulk eigenvalue statistics for random regular graphs. Ann. Probab. 2017;45:3626–3663. [Google Scholar]
15.Bekerman F, Figalli A, Guionnet A. Transport maps for $β$ -matrix models and universality. Commun. Math. Phys. 2015;338:589–619. [Google Scholar]
16.Borodin A, Okounkov A, Olshanski G. Asymptotics of Plancherel measures for symmetric groups. J. Am. Math. Soc. 2000;13:481–515. [Google Scholar]
17.Bourgade P, Erdős L, Yau H-T. Edge universality of beta ensembles. Commun. Math. Phys. 2014;332:261–353. [Google Scholar]
18.Bourgade P, Erdős L, Yau H-T. Universality of general $β$ -ensembles. Duke Math. J. 2014;163:1127–1190. [Google Scholar]
19.Bourgade P, Erdős L, Yau H-T, Yin J. Universality for a class of random band matrices. Adv. Theor. Math. Phys. 2017;21:739–800. [Google Scholar]
20.Bourgade, P., Yau, H.-T., Yin, J.: Random band matrices in the delocalized phase, I: quantum unique ergodicity and universality. Preprint (2018). arXiv:1807.01559
21.Brézin E, Hikami S. Level spacing of random matrices in an external source. Phys. Rev. E. 1998;3(58):7176–7185. [Google Scholar]
22.Brézin E, Hikami S. Universal singularity at the closure of a gap in a random matrix theory. Phys. Rev. E. 1998;3(57):4140–4149. [Google Scholar]
23.Capitaine M, Péché S. Fluctuations at the edges of the spectrum of the full rank deformed GUE. Probab. Theory Relat. Fields. 2016;165:117–161. [Google Scholar]
24.Cipolloni G, Erdős L, Krüger T, Schröder D. Cusp universality for random matrices II: the real symmetric case. Pure Appl. Anal. 2019;1(4):615–707. [Google Scholar]
25.Cipolloni, G., Erdős, L., Schröder, D.: Edge universality for non-Hermitian random matrices. Preprint (2019). arXiv:1908.00969 [DOI] [PMC free article] [PubMed]
26.Claeys T, Kuijlaars ABJ, Liechty K, Wang D. Propagation of singular behavior for Gaussian perturbations of random matrices. Commun. Math. Phys. 2018;362:1–54. [Google Scholar]
27.Claeys T, Neuschel T, Venker M. Boundaries of sine kernel universality for Gaussian perturbations of Hermitian matrices. Random Matrices Theory Appl. 2019;8:1950011, 50. [Google Scholar]
28.Deift P, Kriecherbauer T, McLaughlin KT-R. New results on the equilibrium measure for logarithmic potentials in the presence of an external field. J. Approx. Theory. 1998;95:388–475. [Google Scholar]
29.Deift P, Kriecherbauer T, McLaughlin KT-R, Venakides S, Zhou X. Uniform asymptotics for polynomials orthogonal with respect to varying exponential weights and applications to universality questions in random matrix theory. Commun. Pure Appl. Math. 1999;52:1335–1425.
30.Deift P, Gioev D. Universality at the edge of the spectrum for unitary, orthogonal, and symplectic ensembles of random matrices. Commun. Pure Appl. Math. 2007;60:867–910.
31.Duse E, Johansson K, Metcalfe A. The cusp-Airy process. Electron. J. Probab. 2016;21:50.
32.Erdős L, Knowles A, Yau H-T, Yin J. Spectral statistics of Erdős–Rényi graphs II: eigenvalue spacing and the extreme eigenvalues. Commun. Math. Phys. 2012;314:587–640.
33.Erdős L, Knowles A, Yau H-T, Yin J. The local semicircle law for a general class of random matrices. Electron. J. Probab. 2013;18(59):58.
34.Erdős L, Krüger T, Schröder D. Random matrices with slow correlation decay. Forum Math. Sigma. 2019;7:e8, 89.
35.Erdős L, Péché S, Ramírez JA, Schlein B, Yau H-T. Bulk universality for Wigner matrices. Commun. Pure Appl. Math. 2010;63:895–925.
36.Erdős L, Schlein B, Yau H-T. Universality of random matrices and local relaxation flow. Invent. Math. 2011;185:75–119.
37.Erdős L, Schnelli K. Universality for random matrix flows with time-dependent density. Ann. Inst. Henri Poincaré Probab. Stat. 2017;53:1606–1656.
38.Erdős, L., Yau, H.-T.: A Dynamical Approach to Random Matrix Theory, Vol. 28, Courant Lecture Notes in Mathematics, Courant Institute of Mathematical Sciences, pp. ix+226. American Mathematical Society, Providence, RI (2017)
39.Erdős P, Hajnal A. On chromatic number of graphs and set-systems. Acta Math. Acad. Sci. Hung. 1966;17:61–99.
40.Geudens, D., Zhang, L.: Transitions between critical kernels: from the tacnode kernel and critical kernel in the two-matrix model to the Pearcey kernel. International Mathematics Research Notices IMRN 5733–5782 (2015)
41.Guionnet A, Huang J. Rigidity and edge universality of discrete $β$ -ensembles. Commun. Pure Appl. Math. 2019;72(9):1875–1982.
42.Hachem, W., Hardy, A., Najim, J.: A survey on the eigenvalues local behavior of large complex correlated Wishart matrices. In: Modelisation Aleatoire et Statistique—Journées MAS 2014, vol. 51, ESAIM Proceedings Surveys, EDP Sciences, Les Ulis, pp. 150–174 (2015)
43.Hachem W, Hardy A, Najim J. Large complex correlated Wishart matrices: fluctuations and asymptotic independence at the edges. Ann. Probab. 2016;44:2264–2348.
44.Hachem W, Hardy A, Najim J. Large complex correlated Wishart matrices: the Pearcey kernel and expansion at the hard edge. Electron. J. Probab. 2016;21:36.
45.He Y, Knowles A. Mesoscopic eigenvalue statistics of Wigner matrices. Ann. Appl. Probab. 2017;27:1510–1550.
46.Helton, J. W., Rashidi Far, R., Speicher, R.: Operator-valued semicircular elements: solving a quadratic matrix equation with positivity constraints. International Mathematics Research Notices IMRN, Art. ID rnm086, 15 (2007)
47.Huang J, Landon B, Yau H-T. Bulk universality of sparse random matrices. J. Math. Phys. 2015;56:123301, 19.
48.Johansson K. Discrete orthogonal polynomial ensembles and the Plancherel measure. Ann. Math. (2) 2001;153:259–296.
49.Johansson K. Universality of the local spacing distribution in certain ensembles of Hermitian Wigner matrices. Commun. Math. Phys. 2001;215:683–705.
50.Khorunzhy AM, Khoruzhenko BA, Pastur LA. Asymptotic properties of large random matrices with independent entries. J. Math. Phys. 1996;37:5033–5060.
51.Knowles A, Yin J. Anisotropic local laws for random matrices. Probab. Theory Relat. Fields. 2017;169:257–352.
52.Krishnapur M, Rider B, Virág B. Universality of the stochastic Airy operator. Commun. Pure Appl. Math. 2016;69:145–199.
53.Landon B, Yau H-T. Convergence of local statistics of Dyson Brownian motion. Commun. Math. Phys. 2017;355:949–1000.
54.Landon, B., Yau, H.-T.: Edge statistics of Dyson Brownian motion. Preprint (2017). arXiv:1712.03881
55.Lee JO, Schnelli K. Edge universality for deformed Wigner matrices. Rev. Math. Phys. 2015;27:1550018, 94.
56.Lee JO, Schnelli K. Local law and Tracy-Widom limit for sparse random matrices. Probab. Theory Relat. Fields. 2018;171:543–616.
57.Lee JO, Schnelli K, Stetler B, Yau H-T. Bulk universality for deformed Wigner matrices. Ann. Probab. 2016;44:2349–2425.
58.Lick DR, White AT. k-degenerate graphs. Can. J. Math. 1970;22:1082–1096.
59.Mehta ML. Random Matrices and the Statistical Theory of Energy Levels. New York: Academic Press; 1967. p. x+259.
60.Okounkov A, Reshetikhin N. Random skew plane partitions and the Pearcey process. Commun. Math. Phys. 2007;269:571–609.
61.Pastur L, Shcherbina M. Bulk universality and related properties of Hermitian matrix models. J. Stat. Phys. 2008;130:205–250.
62.Pastur L, Shcherbina M. On the edge universality of the local eigenvalue statistics of matrix models. Mat. Fiz. Anal. Geom. 2003;10:335–365.
63.Pearcey T. The structure of an electromagnetic field in the neighbourhood of a cusp of a caustic. Philos. Mag. 1946;7(37):311–317.
64.Shcherbina M. Change of variables as a method to study general $β$ -models: bulk universality. J. Math. Phys. 2014;55:043504, 23.
65.Shcherbina M. Edge universality for orthogonal ensembles of random matrices. J. Stat. Phys. 2009;136:35–50.
66.Sodin S. The spectral edge of some random band matrices. Ann. Math. (2) 2010;172:2223–2251.
67.Soshnikov A. Universality at the edge of the spectrum in Wigner random matrices. Commun. Math. Phys. 1999;207:697–733.
68.Tao T, Vu V. Random matrices: universality of local eigenvalue statistics. Acta Math. 2011;206:127–204.
69.Tao T, Vu V. Random matrices: universality of local eigenvalue statistics up to the edge. Commun. Math. Phys. 2010;298:549–572.
70.Tracy CA, Widom H. Level-spacing distributions and the Airy kernel. Commun. Math. Phys. 1994;159:151–174.
71.Tracy CA, Widom H. On orthogonal and symplectic matrix ensembles. Commun. Math. Phys. 1996;177:727–754.
72.Tracy CA, Widom H. The Pearcey process. Commun. Math. Phys. 2006;263:381–400.
73.Valkó B, Virág B. Continuum limits of random matrices and the Brownian carousel. Invent. Math. 2009;177:463–508.
