Published in final edited form as: Phys Rev E Stat Nonlin Soft Matter Phys. 2015 Jan 26;91(1):012820. doi: 10.1103/PhysRevE.91.012820

Properties of networks with partially structured and partially random connectivity

Yashar Ahmadian 1,2, Francesco Fumarola 1, Kenneth D Miller 1,2
PMCID: PMC4745946  NIHMSID: NIHMS753882  PMID: 25679669

Abstract

Networks studied in many disciplines, including neuroscience and mathematical biology, have connectivity that may be stochastic about some underlying mean connectivity represented by a nonnormal matrix. Furthermore, the stochasticity may not be i.i.d. across elements of the connectivity matrix. More generally, the problem of understanding the behavior of stochastic matrices with nontrivial mean structure and correlations arises in many settings. We address this by characterizing large random N × N matrices of the form A = M + LJR, where M, L and R are arbitrary deterministic matrices and J is a random matrix of zero-mean independent and identically distributed elements. M can be nonnormal, and L and R allow correlations that have separable dependence on row and column indices. We first provide a general formula for the eigenvalue density of A. For A nonnormal, the eigenvalues do not suffice to specify the dynamics induced by A, so we also provide general formulae for the transient evolution of the magnitude of activity and the frequency power spectrum in an N-dimensional linear dynamical system with a coupling matrix given by A. These quantities can also be thought of as characterizing the stability and the magnitude of the linear response of a nonlinear network to small perturbations about a fixed point. We derive these formulae and work them out analytically for some examples of M, L and R motivated by neurobiological models. We also argue that the persistence as N → ∞ of a finite number of randomly distributed outlying eigenvalues outside the support of the eigenvalue density of A, as previously observed, arises in regions of the complex plane Ω where there are nonzero singular values of L−1(z1 − M)R−1 (for z ∈ Ω) that vanish as N → ∞. When such singular values do not exist and L and R are equal to the identity, there is a correspondence in the normalized Frobenius norm (but not in the operator norm) between the support of the spectrum of A for J of norm σ and the σ-pseudospectrum of M.

I. INTRODUCTION

Knowledge of the statistics of eigenvalues and eigenvectors of random matrices has applications in the modeling of phenomena relevant to a wide range of disciplines [1–3]. In many applications, however, the matrices of interest are not entirely random, but feature substantial deterministic structure. Furthermore, this structure, as well as the disorder on top of it, is in general described by nonnormal matrices.

In neuroscience, for example, connections between neurons typically have restricted spatial range and show specificity with respect to neuronal type, location and response properties. Experience-based synaptic plasticity, which underlies learning and memory, naturally gives rise to synaptic connectivity matrices that encode aspects of the statistical structure of the sensory environment, while containing significant randomness partly due to the inherent stochasticity of particular histories of sensory experience. Another simple example of structured neural connectivity is due to what is known as Dale’s principle [4–6]: neurons come in two main types, excitatory and inhibitory. This empirical principle imposes a certain structure on the synaptic connectivity matrix, forcing all elements in each column of the matrix, describing the synaptic projections of a certain neuron, to have the same sign. Particularly when the typical weight magnitude is much larger than typical differences between the magnitudes of excitatory and inhibitory weights, such a matrix can be extremely nonnormal by some measures, much more so than a fully random matrix [7]. Similarly, biological knowledge imparts a great deal of structure to models of biochemical [8–11] or ecological networks [12–16], and matrices characterizing such interactions are typically nonnormal. Yet our knowledge of connectivity or interactions is at best probabilistic. To describe realistic biological behavior, we must generalize from the behavior of a fixed, regular connectivity to the expected behavior of a typical sample from an appropriate connectivity ensemble.

Furthermore, nonnormality can lead to important dynamical properties not seen for normal matrices [17]. In general, networks with a recurrent connectivity pattern described by a nonnormal matrix can be described as having a hidden feedforward connectivity structure between orthogonal activity patterns, each of which can also excite or inhibit itself [7, 18, 19]. In neural networks such hidden feedforward connectivity arises from the natural separation of excitatory and inhibitory neurons, yielding so-called “balanced amplification” of patterns of activity without any dynamical slowing [7]. Underlying this is the phenomenon of “transient amplification”: a small perturbation from a fixed point of a stable system with nonnormal connectivity can lead to a large transient response over finite time [17]. Transient amplification also yields unexpected results in ecological networks [20–22], and has been conjectured to play a key role in many biochemical systems [23]. Networks that yield long hidden feedforward chains can also generate long time scales and provide a substrate for working memory [18, 19]. Systems with nonnormal connectivity can also exhibit pseudo-resonance frequencies in their power spectrum at which the system responds strongly to external inputs, even though the external frequency is not close to any of the system’s natural frequencies as determined by its eigenvalues [17].

While Hermitian random matrices and fully random non-Hermitian matrices with zero-mean, independent and identically distributed (iid) elements have been widely studied, there is a shortage of results on quantities of interest for nonnormal matrices that fall in between the two extremes of fully random or fully deterministic. A natural departure from a nonnormal deterministic structure, described by a connectivity matrix M, is to additively perturb it with a fully random matrix J with zero-mean, iid elements. In many important examples, however, the strength of disorder (deviations from the mean structure) is not uniform and itself has some structure (e.g. for each connection it can depend on the types of the connected nodes or neurons). Moreover, the deviations of the strength of different connections or interactions from their average need not be independent. Hence it is important to move beyond a simple iid deviation from the mean structure. Here, we study ensembles of large N × N random matrices of the form A = M + LJR where M, L and R are arbitrary (M) or arbitrary invertible (L and R) deterministic matrices that are in general nonnormal, and J is a completely random matrix with zero-mean iid elements of variance 1/N. The matrix M is thus the average of A, and describes average connectivity. Note that when L and R are diagonal, they specify variances that depend separably on the row and column of A; while when they are not diagonal, the elements of A are not statistically independent. As we show in Sec. II C 3, this form arises naturally, for example, in linearizations of dynamical systems involving simple classes of nonlinearities. This type of ensemble is also natural from the random matrix theory viewpoint, as it describes a classical fully random ensemble – an iid random matrix J – modified by the two basic algebraic operations of matrix multiplication and addition.

We study the eigenvalue distribution of such matrices, but also directly study the dynamics of a linear system of differential equations governed by such matrices. Specifically, for matrices of the above type, using the Feynman diagram technique in the large N limit (we follow the particular version of this method developed by Refs. [24, 25]), we have derived a general formula for the density of their eigenvalues in the complex plane, which generalizes the well-known circular law for fully random matrices [26–30]. It also generalizes a result [31] obtained for the case where L and R are scalar multiples of 1, the N-dimensional identity matrix (the same result was obtained in [32] using the methods and language of free probability theory; the eigenvalue density for the case L ∝ R ∝ 1 and a normal M was also calculated in Ref. [24] in the limit N → ∞, and that result was extended to finite N in Ref. [33]). Apart from generalization to arbitrary invertible L and R, we also provide a correct regularizing procedure for finding the support of the eigenvalue density in the limit N → ∞, in certain highly nonnormal cases of M; the naive interpretation of the formulae fails in these cases, which were not previously discussed. Furthermore, with the aim of studying dynamical signatures of nonnormal connectivity, we focused on the dynamics directly, deriving general formulae for the magnitude of the response of the system to a delta function pulse of input (which provides a measure of the time-course of potential transient amplification), as well as the frequency power spectrum of the system’s response to external time-dependent inputs.

These general results are presented in the next section. There, we also present the explicit results of analytical or numerical calculations based on these general formulae for some specific examples of M, L and R. Sections III and IV contain the detailed derivations of our general formulae for the eigenvalue density and the response magnitude, respectively. Section V contains the detailed analytical calculations of these quantities for the specific examples presented in Sec. II, based on the general formulae. We conclude the paper in Sec. VI.

II. SUMMARY OF RESULTS

We study ensembles of large N × N random matrices of the form

$A = M + LJR,$  (2.1)

where M, L and R are arbitrary (M) or arbitrary invertible (L and R) deterministic matrices [59], and J is a random matrix of independent and identically distributed (iid) elements with zero mean and variance 1/N. Since J and therefore LJR have zero mean, M is the ensemble average of A. The random fluctuations of A around its average are given by the matrix LJR, which for general L and/or R has dependent and non-identically distributed elements, due to the possible mixing and non-uniform scaling of the rows (columns) of the iid J by L (R).
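As a concrete numerical illustration (a minimal sketch of ours; the particular M, L and R below are toy choices, not examples analyzed in the text), such an ensemble can be sampled as follows:

```python
# Minimal sketch (ours) of sampling Eq. (2.1), A = M + L J R, with iid J of
# zero mean and variance 1/N; the choices of M, L, R here are illustrative only.
import numpy as np

def sample_A(M, L, R, rng, complex_J=False):
    """Draw one realization of A = M + L J R with iid J, Var(J_ij) = 1/N."""
    N = M.shape[0]
    if complex_J:
        # complex Ginibre: real and imaginary parts each with variance 1/(2N)
        J = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2 * N)
    else:
        # real Gaussian J; by universality the bulk density is insensitive to this choice
        J = rng.standard_normal((N, N)) / np.sqrt(N)
    return M + L @ J @ R

rng = np.random.default_rng(0)
N = 500
M = np.diag(np.ones(N - 1), k=1)     # a feedforward chain, cf. Eq. (2.35) with w = 1
L = np.eye(N)
R = 0.5 * np.eye(N)                  # iid disorder of strength sigma = 0.5
eigs = np.linalg.eigvals(sample_A(M, L, R, rng))
print("largest |eigenvalue|:", np.abs(eigs).max())   # ~ sqrt(1 + 0.5^2), cf. Sec. II C 1
```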

We are firstly interested in the statistics of the eigenvalues of Eq. (2.1). While the statistics of the eigenvalues and eigenvectors of A are of interest in their own right, we also directly consider certain properties of the linear dynamical system

$\frac{dx(t)}{dt} = -\gamma x(t) + A x(t) + I(t),$  (2.2)

for an N -dimensional state vector x(t), when A is a sample of the ensemble Eq. (2.1). Here, γ is a scalar and I(t) is an external, time-dependent input. In studying this system, we generally assume that Eq. (2.2) is asymptotically stable. This means that M, L, R and γ must be chosen such that for any typical realization of J, no eigenvalue of −γ1 + M + LJR has a positive real part; this can normally be achieved, for example, by choosing a large enough γ > 0.

Using the diagrammatic technique in the non-crossing approximation, which is valid for large N, we have derived general formulae for several useful properties of such matrices involving their eigenvalues and eigenvectors (see Sec. III–IV for the details of the derivations and the definition of the non-crossing approximation). We present these results in this section. In our derivation of these results, we assume the random J belongs to the complex Ginibre ensemble [26], i.e. the distribution of the elements of J is complex Gaussian. However, we emphasize that universality theorems ensure that, for given M, L and R, the obtained result for the eigenvalue density in the limit N → ∞ will not depend on the exact choice of the distribution of the elements of J, beyond its first two moments, and extends to any iid J (including, e.g., J with real binary or log-normal elements) whose elements have the same first two moments, i.e. zero mean and variance 1/N; the universality of the eigenvalue density for general M, L and R was established in Ref. [34], following earlier work on the universality of the circular law established and successively strengthened in Refs. [26–30]. Furthermore, empirically, from limited simulations, we have thus far found (but have not proved) such universal behavior to also hold for the other quantities we compute here (however, it is possible that universality for these quantities might require the existence of some higher moments beyond the second, as has been found for universality of certain other properties of random matrices; see e.g. Ref. [35]). To demonstrate the universality of our results, we have used non-Gaussian and/or real J’s in most of the numerical examples below.
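The following minimal check (ours; toy sizes) illustrates this universality empirically by comparing a Gaussian and a binary J with the same first two moments:

```python
# Sketch (ours): the bulk eigenvalue density of M + sigma*J is insensitive to the
# element distribution of the iid J (Gaussian vs. binary), matching first two moments.
import numpy as np

rng = np.random.default_rng(7)
N = 800
M = np.diag(np.ones(N - 1), k=1)                         # chain example, Eq. (2.35), w = 1
Jg = rng.standard_normal((N, N)) / np.sqrt(N)            # Gaussian, mean 0, variance 1/N
Jb = rng.choice([-1.0, 1.0], size=(N, N)) / np.sqrt(N)   # binary, same mean and variance

for name, J in [("gaussian", Jg), ("binary  ", Jb)]:
    eigs = np.linalg.eigvals(M + 0.5 * J)
    # a crude summary statistic of the bulk: the fraction of eigenvalues with |z| <= 1
    print(name, np.mean(np.abs(eigs) <= 1.0))
```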

Hereinafter, we adopt the following notations. For any matrix B, we denote its operator norm (its maximum singular value) by ∥B∥ and we define its (normalized) Frobenius norm via

$\|B\|_F^2 \equiv \frac{1}{N} \sum_{ij} |B_{ij}|^2 = \frac{1}{N} \mathrm{Tr}(B^\dagger B)$  (2.3)

(equivalently, ∥B∥F is the root mean square of the singular values of B). For general matrices, A and B,

$\mathrm{tr}(A) \equiv \frac{1}{N}\mathrm{Tr}(A), \qquad A^{-\dagger} \equiv (A^\dagger)^{-1}, \qquad \frac{1}{A} \equiv A^{-1}, \qquad \frac{A}{B} \equiv A B^{-1},$

and when adding a scalar to a matrix, it is implied that the scalar is multiplied by the appropriate identity matrix. We denote the identity matrix in any dimension (deduced from the context) by 1. For a complex variable z = x + iy, the Dirac delta function is defined by δ2(z) ≡ δ(x)δ(y), and we define ∂z̄ ≡ ∂/∂z̄ = (∂/∂x + i∂/∂y)/2 and ∂z ≡ ∂/∂z = (∂/∂x − i∂/∂y)/2. For simplicity, we use the notation f(z) (instead of f(z, z̄)) for general, nonholomorphic functions on the complex plane. We say a quantity is O(f(N)) (resp. Θ(f(N))) when, for large enough N, the absolute value of that quantity is bounded above (resp. above and below) by a fixed positive multiple of |f(N)|. Finally, we say a quantity is o(f(N)) when its ratio to |f(N)| vanishes as N → ∞.

The only conditions we impose on M, L and R are that ∥M∥F, ∥L∥F, ∥R∥F, ∥L−1MR−1∥F and ∥(LR)−1∥ are bounded as N → ∞. We use the bound on ∥(LR)−1∥ in Appendices A and B; the Frobenius norm conditions are assumptions in the universality theorem of Ref. [34] which we use as discussed above. Finally, we assume that for all z ∈ ℂ, the distribution of the eigenvalues of Mz†Mz, where Mz is defined below in Eq. (2.6), tends to a limit distribution as N → ∞. This last condition simply makes precise the requirement that M, L and R are defined consistently as functions of N, such that a limit spectral density for M + LJR is meaningful; in particular, it does not impose any further limits on the growth of the eigenvalues of Mz†Mz with N, beyond the various norm bounds imposed above.

A. Spectral density

1. Summary of results

The density of the eigenvalues of M + LJR in the complex plane for a realization of J (also known as the empirical spectral distribution) is defined by

$\rho_J(z) = \frac{1}{N} \sum_{\alpha} \delta^2(z - \lambda_\alpha),$  (2.4)

where λα are the eigenvalues of M + LJR. It is known [34] that ρJ(z) is asymptotically self-averaging, in the sense that with probability one ρJ(z) − ρ(z) converges to zero (in the distributional sense) as N → ∞, where ρ(z) ≡ ⟨ρJ(z)⟩J is the ensemble average of ρJ(z). Thus for large enough N, any typical realization of J yields an eigenvalue density ρJ(z) that is arbitrarily close to ρ(z).

Our general result is that for large N, with certain cautions and excluding certain special cases as described below (Eqs. (2.19)–(2.20) and preceding discussion), ρ(z) is nonzero in the region of the complex plane satisfying

$\mathrm{tr}\left[ (M_z^\dagger M_z)^{-1} \right] \geq 1,$  (2.5)

where we defined

$M_z \equiv L^{-1}(z - M) R^{-1}.$  (2.6)

Using the definition Eq. (2.3), we can also express Eq. (2.5) as

$\left\| R \frac{1}{z - M} L \right\|_F \geq 1.$  (2.7)

Inside this region, ρ(z) is given by

$\rho(z) = \frac{1}{\pi}\, \partial_{\bar z}\, \mathrm{tr}\left[ (RL)^{-1} \frac{M_z^\dagger}{M_z M_z^\dagger + g(z)^2} \right],$  (2.8)

where g(z) is a real, scalar function found by solving

$\mathrm{tr}\left[ \frac{1}{M_z^\dagger M_z + g^2} \right] = 1,$  (2.9)

for g for each z. As a first example, for the well-known case of M = 0, L = 1, and R = σ1, we have Mz = z/σ and the circular law follows immediately from Eq. (2.5), which yields σ2/|z|2 ≥ 1 or |z| ≤ σ for the support, and from Eqs. (2.8)–(2.9), which yield the uniform ρ(z) = 1/(πσ2) within that support. As we noted in the introduction, formulae (2.5)–(2.9) generalize the results of Refs. [31] and [32] for the special case L ∝ R ∝ 1 to arbitrary invertible L and R (the eigenvalue density for the case L ∝ R ∝ 1 and a normal M was also calculated in Ref. [24]).
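This reduction to the circular law is easy to verify numerically; the following sketch (ours, with arbitrary parameters) compares the predicted radius and uniform density with a sampled spectrum:

```python
# Sketch (ours): for M = 0, L = 1, R = sigma*1, Eqs. (2.5) and (2.8)-(2.9) give the
# circular law, i.e. support |z| <= sigma with uniform density 1/(pi sigma^2).
import numpy as np

rng = np.random.default_rng(1)
N, sigma = 1000, 0.7
eigs = np.linalg.eigvals(sigma * rng.standard_normal((N, N)) / np.sqrt(N))

print("predicted support radius:", sigma)
print("largest |eigenvalue|:    ", np.abs(eigs).max())     # ~ sigma for large N

# uniform density implies the fraction inside radius r is (r/sigma)^2
r = 0.5 * sigma
print("fraction inside r:", np.mean(np.abs(eigs) <= r), " predicted:", (r / sigma) ** 2)
```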

It is possible and illuminating to express Eqs. (2.7)–(2.9) exclusively in terms of the singular values of Mz, which we denote by si(z) (we include possibly vanishing singular values among the si(z), so that we always have N of them). First, noting that the squared singular values of Mz are the eigenvalues of the Hermitian matrix Mz†Mz, we can evaluate the trace in Eq. (2.9) in the eigenbasis of the latter matrix, and rewrite this equation as

$\frac{1}{N} \sum_{i=1}^{N} \frac{1}{s_i(z)^2 + g^2} = 1.$  (2.10)

Similarly, Eq. (2.5) can be equivalently rewritten as

$\frac{1}{N} \sum_{i=1}^{N} s_i(z)^{-2} \geq 1.$  (2.11)

As we prove at the end of Sec. III, Eq. (2.8) can also be written in a form that makes it explicit that the dependence of ρ(z) on M, L and R is only through the singular values of Mz and their derivatives with respect to z and z̄. We have

$\rho(z) = \frac{1}{\pi}\, \partial_{\bar z} \left[ \frac{1}{N} \sum_{i=1}^{N} \frac{\partial_z \left( s_i(z)^2 \right)}{s_i(z)^2 + g(z)^2} \right].$  (2.12)
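In practice, Eqs. (2.10)–(2.11) give a simple numerical recipe: compute the singular values of Mz, test the support condition, and solve for g(z) by one-dimensional root finding. A minimal sketch of ours (it implements the naive, finite-N version of the support test, so the ordering-of-limits caveats discussed below still apply):

```python
# Sketch (ours) of the recipe in Eqs. (2.10)-(2.11): form M_z = L^{-1}(z - M)R^{-1},
# test the (naive) support condition mean(1/s_i^2) >= 1, and solve
# mean(1/(s_i^2 + g^2)) = 1 for g by bisection.
import numpy as np

def g_of_z(M, Linv, Rinv, z, tol=1e-10):
    N = M.shape[0]
    Mz = Linv @ (z * np.eye(N) - M) @ Rinv
    s2 = np.linalg.svd(Mz, compute_uv=False) ** 2
    if np.mean(1.0 / s2) < 1.0:          # Eq. (2.11) fails: z outside the support
        return 0.0
    lo, hi = 0.0, 1.0
    while np.mean(1.0 / (s2 + hi**2)) > 1.0:   # bracket the root of Eq. (2.10)
        hi *= 2.0
    while hi - lo > tol:                        # bisection; the LHS decreases in g
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if np.mean(1.0 / (s2 + mid**2)) > 1.0 else (lo, mid)
    return 0.5 * (lo + hi)

# check: M = 0, L = 1, R = sigma*1 gives g(z)^2 = 1 - |z|^2/sigma^2 inside the disk
N, sigma = 200, 0.7
I = np.eye(N)
print(g_of_z(np.zeros((N, N)), I, I / sigma, 0.3) ** 2, 1 - (0.3 / sigma) ** 2)
```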

For the special case of M = 0 and general L and R, our formulas can be simplified considerably. The spectrum is isotropic around the origin in this case, i.e. ρ(z) depends only on r ≡ |z|, and its support is a disk centered at the origin with radius

$r_0 = \| RL \|_F = \left[ \frac{1}{N} \sum_{i=1}^{N} \sigma_i^2 \right]^{1/2},$  (2.13)

where σi are the singular values of RL (this follows from Eq. (2.11) by noting that for M = 0, the singular values of Mz = z(RL)−1 are si(z) = |z|/σi). Within this support the spectral density is given by

$\rho(r) = -\frac{1}{2\pi r}\, \frac{\partial}{\partial r} \left( g(r)^2 \right),$  (2.14)

where g(r)2 > 0 is found by solving

$1 = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{\sigma_i^{-2} r^2 + g(r)^2}.$  (2.15)

Integrating Eq. (2.14), we see that the proportion of eigenvalues lying a distance larger than r from the origin is, in this case, given by

$n_>(r) = \begin{cases} g(r)^2 & (r < r_0) \\ 0 & (r \geq r_0). \end{cases}$  (2.16)

In Sec. III we prove that the eigenvalue density, given by Eqs. (2.14)–(2.15), is always a decreasing function of r = |z|, i.e. for r > 0 its derivative with respect to r is strictly negative, as long as the limit distribution of the {σi} as N → ∞ has nonzero variance (otherwise ρ(z) is given by the circular law with radius Eq. (2.13)). The values of the spectral density at r = 0 and r = r0 can be calculated explicitly for general L and R:

$\rho(r = 0) = \frac{1}{\pi} \frac{1}{N} \sum_{i=1}^{N} \sigma_i^{-2},$  (2.17)
$\rho(r = r_0) = \frac{r_0^2}{\pi} \left[ \frac{1}{N} \sum_{i=1}^{N} \sigma_i^4 \right]^{-1} \leq \rho(r = 0).$  (2.18)
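The M = 0 formulas above are straightforward to evaluate for any given set of singular values σi of RL; the following sketch (ours, with an arbitrary bimodal choice of the σi) compares Eqs. (2.13)–(2.16) with a sampled spectrum of LJR:

```python
# Sketch (ours) of the M = 0 results, Eqs. (2.13)-(2.16), for a toy RL with two
# groups of singular values, compared against one sampled spectrum of LJR.
import numpy as np

rng = np.random.default_rng(2)
N = 1000
svals = np.where(np.arange(N) < N // 2, 0.5, 1.5)   # singular values sigma_i of RL (toy)
L, R = np.eye(N), np.diag(svals)                     # so RL = diag(svals)

r0 = np.sqrt(np.mean(svals**2))                      # Eq. (2.13)
eigs = np.linalg.eigvals(L @ (rng.standard_normal((N, N)) / np.sqrt(N)) @ R)
print("r0 =", r0, " max|eig| =", np.abs(eigs).max())

def n_greater(r, tol=1e-10):
    """n_>(r) = g(r)^2, with g(r) solving Eq. (2.15); zero for r >= r0 (Eq. (2.16))."""
    if r >= r0:
        return 0.0
    f = lambda g2: np.mean(1.0 / (svals**-2 * r**2 + g2))
    lo, hi = 0.0, 1.0
    while f(hi) > 1.0:                               # bracket, then bisect
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) > 1.0 else (lo, mid)
    return 0.5 * (lo + hi)

r = 0.6
print("predicted n_>(r):", n_greater(r), " empirical:", np.mean(np.abs(eigs) > r))
```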

As noted above, certain cautions apply in using the above formulae for the eigenvalue density and its boundary (Eqs. (2.5)–(2.9), or equivalently Eqs. (2.10)–(2.12), and for M = 0, Eqs. (2.14)–(2.15)). We have written these formulas for finite N (assuming it is large). However, the non-crossing approximation used in deriving these formulas is only guaranteed to yield the correct result for the eigenvalue density in the limit, i.e. limN→∞ ρ(z) (see Appendix A); finite-size corrections obtained from Eqs. (2.5)–(2.9) are not in general correct, and o(1) contributions to g(z)2 or ρ(z) obtained from Eqs. (2.9) and (2.8) should be discarded.

Furthermore, in general, the correct way of finding the support of limN→∞ ρ(z) using Eq. (2.5) is by setting the left side of the inequality (2.5) to the limit limg→0+ limN→∞ (in that order) of the left side of Eq. (2.9), as discussed in Sec. III and Appendix A. However, in writing Eq. (2.5) we have simply set g = 0 in Eq. (2.9), and thus implicitly taken the limit g2 → 0+ before the N → ∞ limit. To correctly express the support, we must first define the function

$K(g, z) \equiv \lim_{N \to \infty} \mathrm{tr}\left[ \frac{1}{M_z^\dagger M_z + g^2} \right] = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} \frac{1}{s_i(z)^2 + g^2}$  (2.19)

for fixed, strictly positive g, which serves to regularize the denominators in Eq. (2.19) for si(z) which are zero or vanishing in the limit N → ∞. The generally correct way of expressing Eq. (2.5) or Eq. (2.11) is then

$K(0^+, z) \equiv \lim_{g \to 0^+} K(g, z) \geq 1.$  (2.20)

Let us denote the support of limN→∞ ρ(z), given by Eq. (2.20), by S0+ and the region specified by the limit N → ∞ of Eq. (2.5) or Eq. (2.11) by S0. For many examples of M, L and R, the limits N → ∞ and g → 0+ commute everywhere and hence S0+ = S0. However, if there are z’s at which some of the smallest si(z) are either zero or vanish in the limit N → ∞, the two limits may fail to commute, and the naive use of Eq. (2.5) can yield a region, S0, strictly larger than and containing S0+, the correct support of limN→∞ ρ(z). For example, at z’s for which a Θ(1) number of si(z) are zero or o(1), these singular values do not make a contribution to K(g, z) for g > 0 (their contribution to the sum in Eq. (2.19) is O(N−1)) and hence to K(0+, z), but if they vanish sufficiently fast as N → ∞ they can make a nonzero contribution to the left side of Eq. (2.11); such z may fall within S0, but not within S0+. For finite N, the si(z) can vanish exactly when z coincides with an eigenvalue of M; thus the above situation can, e.g., arise close to eigenvalues of M that are isolated and far from the rest of M’s spectrum, so that they fall outside the support of limN→∞ ρ(z). In such cases, the spectrum of M + LJR will nonetheless typically also contain isolated eigenvalues (which do not contribute to limN→∞ ρ(z)) with effectively deterministic location, i.e. within o(1) distance of corresponding isolated eigenvalues of M; examples of this phenomenon, for which S0 \ S0+ is not empty but has zero measure, have been studied in Refs. [36, 37] (for symmetric matrices, outlier eigenvalues corresponding to eigenvalues of the mean matrix were first studied in Ref. [38]). For some choices of M, L and R, however, a more interesting case can arise such that for z in a certain region of the complex plane with nonzero measure, all si(z) are nonzero at finite N (hence M has no eigenvalue there), but a few si(z) are o(1) and vanish sufficiently fast as N → ∞; in particular when L ∝ R ∝ 1, this can occur for certain highly nonnormal M [60]. In such cases the non-commutation of the two limits can lead to a difference S0 \ S0+ with nonzero measure. In cases we have examined this signifies that there exists a finite, non-vanishing region outside the support of limN→∞ ρ(z) (typically surrounding it) where, although ρ(z) is o(1), it nonetheless converges to zero sufficiently slowly that a Θ(1) number of “outlier” eigenvalues lie there (note that the vast majority of eigenvalues, i.e. (1 − o(1))N of them, lie within the support of the limit density). We will discuss examples of this phenomenon in Sec. II C below; in one of the examples (discussed in Sec. II C 2), the existence of such outlier eigenvalues was first noted in Ref. [39], and their distribution was quantitatively characterized in Ref. [36]. However, the connection between such outlier eigenvalues and nonzero but o(1) singular values of Mz, which arise, e.g., for highly nonnormal M, was not noted before to the best of our knowledge. We have observed in simulations (and this is also supported by [36]) that the distribution of these outliers remains random as N → ∞, is in general less universal than limN→∞ ρ(z) (e.g. it could depend on the choice of real vs. complex ensembles for J), and its average behavior may not be correctly given by the non-crossing approximation.

2. Relationship to pseudospectra

Finally, we note a remarkable connection between our general result for the support of the spectrum, Eq. (2.5), and the notion of pseudospectra, in the case in which the limits g2 → 0+ and N → ∞ commute (so that Eq. (2.5) correctly describes the support). Pseudospectra are generalizations of eigenvalue spectra, which are particularly useful in the case of nonnormal matrices (see Ref. [17] for a review). The eigenvalue spectrum of a matrix M can be thought of as the set of points, z, in the complex plane where (z − M)−1 is singular, i.e. it has infinite norm. Given a fixed choice of matrix norm, ∥·∥, the pseudospectrum of M at level σ, or its “σ-pseudospectrum” in the given norm, is the set of points z for which ∥(z − M)−1∥ ≥ σ−1 (thus as σ → 0 we recover the spectrum). For the specific choice of the operator norm (i.e. when ∥A∥ is taken to be the maximum singular value of A), the σ-pseudospectrum can equivalently be characterized as the set of points, z, for which there exists a matrix perturbation ΔM, with ∥ΔM∥ ≤ σ, such that z is in the eigenvalue spectrum of M + ΔM [17][61]. In words, in the operator norm, the σ-pseudospectrum of M is the set to which its spectrum can be perturbed by adding to it arbitrary perturbations of size σ or smaller.

In our setting we can think of LJR as a perturbation of M. Let us focus on the case where L and R are proportional to the identity, i.e., we have ΔM = σJ, with a positive scalar σ. Our result Eq. (2.7) in this case reads ∥σ(z − M)−1∥F ≥ 1 or ∥(z − M)−1∥F ≥ σ−1. In other words, as N → ∞, the spectrum of M + σJ, for an iid random J with zero mean and variance 1/N, is the σ-pseudospectrum of M in the normalized Frobenius norm defined by Eq. (2.3). Interestingly, the perturbation, ΔM = σJ, has normalized Frobenius norm σ as N → ∞: this norm is $\sigma \sqrt{\sum_{ij} |J_{ij}|^2 / N}$, which, by the law of large numbers, converges to σ for large N. That is, as N → ∞, the spectrum in response to the random perturbation σJ, which has size σ (in normalized Frobenius norm), is the σ-pseudospectrum of M in the normalized Frobenius norm.
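This correspondence can be probed numerically; the sketch below (ours; M is an arbitrary normal matrix chosen purely for illustration) tests whether given points lie in the Frobenius-norm σ-pseudospectrum and compares with sampled eigenvalues of M + σJ:

```python
# Sketch (ours): points z with ||(z - M)^{-1}||_F >= 1/sigma (normalized Frobenius
# norm, Eq. (2.3)) should, for large N, coincide with the support of the spectrum
# of M + sigma*J; finite-N effects can blur the boundary.
import numpy as np

rng = np.random.default_rng(3)
N, sigma = 400, 0.3
M = np.diag(rng.uniform(-1, 1, N))            # a simple normal M, for illustration only

def in_frobenius_pseudospectrum(z):
    Rz = np.linalg.inv(z * np.eye(N, dtype=complex) - M)   # resolvent (z - M)^{-1}
    fro = np.sqrt(np.sum(np.abs(Rz) ** 2) / N)             # normalized Frobenius norm
    return fro >= 1.0 / sigma

eigs = np.linalg.eigvals(M + sigma * rng.standard_normal((N, N)) / np.sqrt(N))
z_inside = eigs[np.argmin(np.abs(eigs))]       # an eigenvalue of A, deep in the bulk
print(in_frobenius_pseudospectrum(z_inside))   # expect True
print(in_frobenius_pseudospectrum(3 + 3j))     # far from the spectrum: expect False
```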

This result sounds similar to the equivalence of the two definitions of pseudospectra for the operator norm which we noted above (one based on the norm of (z − M)−1, and one based on the spectra of bounded perturbations), but it differs in two key respects. First, unlike in the case of the operator norm, the general equivalence of the two notions of pseudospectra noted above does not hold for the normalized Frobenius norm. Second, for the operator norm, it is not in general the case that the σ-pseudospectrum of M is equivalent to the spectrum obtained from a single random perturbation of M of size σ, even in the limit N → ∞ (although the spectra arising from such random perturbations are sometimes used as a “poor man’s version” or approximation of the pseudospectra [17]). This can be seen as follows. The operator norm of the random iid perturbation, σJ, i.e. its maximum singular value, converges almost surely to 2σ as N → ∞ [40]. Condition (2.7) for z to be in the spectrum under this random perturbation is ∥(z − M)−1∥F ≥ σ−1, or rms({si(z)−1}) ≥ 1, where the si(z) are the singular values of (z − M)/σ and rms({xi}) represents the root-mean-square of the set of values {xi}. This is not equivalent to the condition that z be in the 2σ-pseudospectrum of M in the operator norm, i.e. that ∥(z − M)−1∥ ≥ (2σ)−1, or smin(z)−1 ≥ 1/2, where smin(z) is the minimum of the si(z); in fact, noting that smin(z)−1 ≥ rms({si(z)−1}), it is easy to see that the spectrum under random iid perturbations with operator norm ∥σJ∥ = 2σ is strictly a proper subset of the 2σ-pseudospectrum in the operator norm. For example, for M = 0, the “poor man’s 2σ-pseudospectrum” in the limit N → ∞ is a ball of radius σ about the origin (the circular law), while the true 2σ-pseudospectrum of the zero matrix is the ball of radius 2σ about the origin.

In sum, in the operator norm, the σ-pseudospectrum of M for any N is equivalent to the set of points z for which some perturbation ΔM with ∥ΔM∥ ≤ σ can be found such that z is in the spectrum of M + ΔM [17]. In the normalized Frobenius norm in the limit N → ∞, however, the σ-pseudospectrum of M is equivalent to the spectrum of M + ΔM where ΔM is any random perturbation with zero-mean iid elements with ∥ΔM∥F = σ. This statement for the normalized Frobenius norm holds when the two limits N → ∞ and g → 0+ commute; when the two limits do not commute, the support of the spectral distribution of M + ΔM is a subset of the σ-pseudospectrum of M in the normalized Frobenius norm.

B. Average norm squared and power spectrum

As we mentioned in the introduction, an important phenomenon encountered in dynamics governed by nonnormal matrices, as described by Eq. (2.2) with I(t) = 0, is transient amplification in asymptotically stable systems. In any stable system, the size of the response to an initial perturbation eventually decays to zero, with an asymptotic rate set by the system’s eigenvalues. In stable nonnormal systems, however, after an initial perturbation, the size of the network activity, as measured, e.g., by its norm squared ∥x(t)∥2 = x(t)Tx(t), can nonetheless exhibit transient, yet possibly large and long-lasting, growth before it eventually decays to zero. By contrast, in stable normal systems, ∥x(t)∥2 can only decrease with time. The strength and even the time scale of transient amplification are set by properties of the matrix A beyond its eigenvalues; they depend on the degree of non-normality of the matrix, as measured, e.g., by the degree of non-orthogonality of its eigenvectors, or alternatively by its hidden feedforward structure (see Eq. (2.34) for the latter’s definition).

Nonnormal systems can also exhibit pseudo-resonances at frequencies that could be very different from their natural frequencies as determined by their eigenvalues; such pseudo-resonances will be manifested in the frequency power spectrum of the response of the system to time dependent inputs. ∥x(t)∥2 and the power spectrum of response are examples of quantities that depend not only on the eigenvalues of M + LJR but also on its eigenvectors.

Here, we present a few closely related formulas for general M, L and R. These include a formula for 〈∥x(t)∥2〉J, i.e. the ensemble average of the norm squared of the state vector, x(t), as it evolves under Eq. (2.2) with I(t) = 0, as well as a formula for the ensemble average of the power spectrum of the response of the network to time-varying inputs. The results of this section are valid, and in the case of the power spectrum meaningful, when the system Eq. (2.2) is asymptotically stable. As we mentioned after Eq. (2.2), this means that M, L, R and γ must be chosen such that for any typical realization of J, all eigenvalues of −γ1 + M + LJR have negative real part. In particular, the entire support of the eigenvalue density of M + LJR, as determined by Eq. (2.5), must fall to the left of the vertical line of z’s with real part γ; this is a necessary condition, but may not be sufficient either at finite N or in cases where an O(1) number of eigenvalues remain outside this region of support even as N → ∞.

First, we consider the time evolution of the squared norm, ∥x(t)∥2, of the response of the system to an impulse input, I(t) = x0δ(t), at t = 0, before which we assume the system was in its stable fixed point x = 0 (for t > 0 this is equivalent to the squared norm of the activity as it evolves according to Eq. (2.2) with I(t) = 0, starting from the initial condition x(0) = x0). We provide a formula for the ensemble average of the more general quadratic function, x(t)T Bx(t), where B is any N × N symmetric matrix; the norm squared corresponds to B = 1. The result for general B, M, L and R is given as a double inverse Fourier transform

$\langle x(t)^T B\, x(t) \rangle_J = \int \frac{d\omega_1}{2\pi} \int \frac{d\omega_2}{2\pi}\, e^{it(\omega_1 - \omega_2)}\, \mathrm{Tr}\left[ B\, C^x(\omega_1, \omega_2;\, x_0 x_0^T) \right],$  (2.21)

in terms of the N × N Fourier-domain “covariance matrix,” $C^x(\omega_1, \omega_2;\, x_0 x_0^T) \equiv \langle \tilde{x}(\omega_1)\, \tilde{x}(\omega_2)^\dagger \rangle_J$ (where x̃(ω) is the Fourier transform of x(t)). The expression for the latter is given by

$C^x(\omega_1, \omega_2; C^I) = C_0^x(\omega_1, \omega_2; C^I) + \Delta C^x(\omega_1, \omega_2; C^I),$  (2.22)

where

$C_0^x(\omega_1, \omega_2; C^I) \equiv \frac{1}{\gamma + i\omega_1 - M}\, C^I\, \frac{1}{\gamma - i\omega_2 - M^\dagger},$  (2.23)

yields the result obtained by ignoring the randomness in the connectivity (i.e. by setting A = M), and

$\Delta C^x(\omega_1, \omega_2; C^I) \equiv \frac{1}{\gamma + i\omega_1 - M}\, L L^\dagger\, \frac{1}{\gamma - i\omega_2 - M^\dagger} \times \frac{\mathrm{tr}\left( R^\dagger R\, \frac{1}{\gamma + i\omega_1 - M}\, C^I\, \frac{1}{\gamma - i\omega_2 - M^\dagger} \right)}{1 - \mathrm{tr}\left( R^\dagger R\, \frac{1}{\gamma + i\omega_1 - M}\, L L^\dagger\, \frac{1}{\gamma - i\omega_2 - M^\dagger} \right)},$  (2.24)

is the contribution of the random part of the connectivity, LJR. For later use, we have provided these expressions for a general third argument in Cx(·, ·; ·); for use in Eq. (2.21), CI must be substituted with x0x0T. In the special case of ⟨∥x(t)∥2⟩J corresponding to B = 1, and iid disorder (L = 1, R = σ1), the contributions from Eqs. (2.23)–(2.24) can be more compactly combined into

$\langle \|x(t)\|^2 \rangle_J = \int \frac{d\omega_1}{2\pi} \int \frac{d\omega_2}{2\pi}\, e^{it(\omega_1 - \omega_2)}\, \frac{x_0^T\, \frac{1}{\gamma - i\omega_2 - M^\dagger}\, \frac{1}{\gamma + i\omega_1 - M}\, x_0}{1 - \sigma^2\, \mathrm{tr}\left( \frac{1}{\gamma + i\omega_1 - M}\, \frac{1}{\gamma - i\omega_2 - M^\dagger} \right)}$  (2.25)

(we used $\mathrm{Tr}\left( \frac{1}{z_1 - M}\, x_0 x_0^T\, \frac{1}{z_2 - M^\dagger} \right) = x_0^T\, \frac{1}{z_2 - M^\dagger}\, \frac{1}{z_1 - M}\, x_0$ to write the numerator in Eq. (2.25)).

Next, we look at the power spectrum of the response of the system to a noisy input, I(t), that is temporally white, with zero mean and covariance

$\overline{I_i(t_1) I_j(t_2)} = \delta(t_1 - t_2)\, C^I_{ij}.$  (2.26)

Here the bar indicates averaging over the input noise (or by ergodicity, over a long enough time). Our general result for the ensemble average of the matrix power spectrum of the response, which by definition is the Fourier transform of the steady-state response covariance,

$C^x_{ij}(\omega) \equiv \int d\tau\, e^{-i\omega\tau}\, \overline{x_i(t + \tau)\, x_j(t)},$  (2.27)

is given by

$\langle C^x(\omega) \rangle_J = C_0^x(\omega) + \Delta C^x(\omega).$  (2.28)

Here,

$C_0^x(\omega) \equiv C_0^x(\omega, \omega; C^I)$  (2.29)

and

$\Delta C^x(\omega) \equiv \Delta C^x(\omega, \omega; C^I)$  (2.30)

are the power spectrum matrices obtained by ignoring the randomness in connectivity (i.e. by setting A = M), and the contribution of the quenched randomness LJR, respectively.

A closely related quantity is the total power of the steady-state response of the system to a sinusoidal input $I(t) = I_0 \sqrt{2} \cos \omega t$ (the $\sqrt{2}$ serves to normalize the average power of $\sqrt{2} \cos \omega t$ to unity, so that the total power in the input is ∥I0∥2). For such an input, the steady-state activity, which we denote by xω(t), is also sinusoidal (with a possible phase shift). By total power of the steady-state response we mean the time average of the squared norm of the activity, $\overline{\|x_\omega(t)\|^2}$, where now the bar indicates temporal averaging (we call this total power, because the squared norm sums the power in all components of xω(t)). As in Eqs. (2.21)–(2.24), we present a formula for the ensemble average of the more general quantity $\overline{x_\omega^T B\, x_\omega}$. We have

$\left\langle \overline{x_\omega^T B\, x_\omega} \right\rangle_J = \mathrm{Tr}\left( B\, \langle C^x(\omega) \rangle_J \right),$  (2.31)

where 〈Cx(ω)〉J is given by Eqs. (2.28)–(2.30) with CI replaced by I0I0T. For the special case of B = 1, corresponding to the total power of the response at frequency ω, using Eqs. (2.23)–(2.24) with ω1 = ω2 = ω, this formula can be simplified into

$\left\langle \overline{\|x_\omega\|^2} \right\rangle_J = \left\| \frac{1}{z - M}\, I_0 \right\|^2 + \frac{ \left\| \frac{1}{z - M}\, L \right\|_F^2\, \left\| R\, \frac{1}{z - M}\, I_0 \right\|^2 }{ 1 - \left\| R\, \frac{1}{z - M}\, L \right\|_F^2 },$  (2.32)

where z = γ + iω, ∥ · ∥ denotes the vector norm, and ∥ · ∥F denotes the normalized Frobenius norm defined in Eq. (2.3). Finally, for the case that the random part of the matrix is iid, i.e. L = σ1 and R = 1, we can further simplify Eq. (2.32) into

$\left\langle \overline{\|x_\omega\|^2} \right\rangle_J = \frac{ \left\| (\gamma + i\omega - M)^{-1}\, I_0 \right\|^2 }{ 1 - \sigma^2 \left\| (\gamma + i\omega - M)^{-1} \right\|_F^2 }.$  (2.33)

The stability of the x = 0 fixed point guarantees the positivity of the expressions Eqs. (2.32)–(2.33) for the power spectrum. This is true because, as we noted above, stability requires that the support of the eigenvalue density of A is entirely to the left of the vertical line Re(z) = γ. By our result Eq. (2.7) for that support, this can only be true if the denominators of the last terms in Eqs. (2.32)–(2.33) are positive, which guarantees the positivity of the full expressions.
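As an illustration of how Eq. (2.33) is used, the following sketch (ours; parameters are arbitrary but chosen to satisfy the stability condition) evaluates the formula for the feedforward-chain example of Sec. II C 1 and compares it with a single realization:

```python
# Sketch (ours): evaluate Eq. (2.33) for the ensemble-averaged total power at one
# frequency, and compare with one realization of A = M + sigma*J (self-averaging).
import numpy as np

rng = np.random.default_rng(4)
N, sigma, gamma, w = 300, 0.5, 1.2, 1.0        # gamma > sqrt(w^2 + sigma^2): stable
M = w * np.diag(np.ones(N - 1), k=1)           # the chain of Eq. (2.35), lambda_n = 0
I0 = np.zeros(N); I0[-1] = 1.0                 # drive the last Schur mode

def numerator(omega, B):
    """||(gamma + i*omega - B)^{-1} I0||^2, the numerator of Eq. (2.33) with M -> B."""
    G = np.linalg.inv((gamma + 1j * omega) * np.eye(N) - B)
    return np.linalg.norm(G @ I0) ** 2

omega = 0.4
G = np.linalg.inv((gamma + 1j * omega) * np.eye(N) - M)
denom = 1.0 - sigma**2 * np.sum(np.abs(G) ** 2) / N    # 1 - sigma^2 ||(z - M)^{-1}||_F^2
pred = numerator(omega, M) / denom                     # Eq. (2.33)

A = M + sigma * rng.standard_normal((N, N)) / np.sqrt(N)
print("Eq. (2.33):", pred, " one realization:", numerator(omega, A))
```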

Note that the first term in Eq. (2.32) and the numerator in Eq. (2.33) represent the power spectrum in the absence of randomness, i.e. if A in Eq. (2.2) is replaced with M. Thus, formulae (2.32)–(2.33) show that the correct average power spectrum is always strictly larger than the naive power spectrum obtained by assuming that random effects will “average out”. Furthermore, due to the denominators of the last terms in Eqs. (2.32)–(2.33), the power spectrum will be larger for frequencies where the support of the eigenvalue density, Eq. (2.7), is closer to the vertical line with Re(z) = γ. Similar, but less precise, statements can also be made about the strength of transient amplification using formulae (2.21)–(2.25) for the squared norm of the impulse response. One measure of the strength of transient amplification up to time T is $\int_0^T \|x(t)\|^2\, dt$. Integrating formulae Eq. (2.21) (with B = 1) or Eq. (2.25) over t, one obtains formulae for $\int_0^T \|x(t)\|^2\, dt$ that are the same as Eqs. (2.21)–(2.25), except for the factor $e^{it(\omega_1 - \omega_2)}$ in the integrands of Eqs. (2.21) and (2.25) being replaced by $\frac{i\left[ 1 - e^{iT(\omega_1 - \omega_2 + i\epsilon)} \right]}{\omega_1 - \omega_2 + i\epsilon}$ (with ϵ → 0+). Due to the denominator in this factor (for T sufficiently large the numerator is constant), the main contribution to the integrals over ω1 and ω2 should typically arise for ω1 ≈ ω2. On the other hand, note that for ω1 = ω2 the denominators in Eqs. (2.24)–(2.25) reduce to those in Eqs. (2.32)–(2.33), with the connection to the support of the spectral density noted above. Thus this dominant contribution to $\int_0^T \|x(t)\|^2\, dt$ must be larger, the closer the support of the eigenvalue density, Eq. (2.7), is to the vertical line with Re(z) = γ. This also suggests that, as in the case of the power spectrum, the strength of transient amplification would typically be underestimated if randomness of connectivity is ignored and only its ensemble average M is taken into account in solving Eq. (2.2).

Numerical simulations indicate that the quantities ∥x(t)∥2 and $\overline{\|x_\omega\|^2}$ are self-averaging in the large N limit; that is, for large N, ∥x(t)∥2 and $\overline{\|x_\omega\|^2}$ for any typical random realization of J will be very close to their ensemble averages, given by Eq. (2.25) and Eq. (2.32), respectively, with the random deviations from these averages approaching zero as N goes to infinity (see Figs. 1, 3 and 8, below). This conclusion is also corroborated by rough estimations (not shown), based on Feynman diagrams, of the variance of the fluctuations of these quantities for different realizations of J (the diagrammatic method is introduced in Secs. III and IV).

FIG. 1.

(Color online) Top panel: the total power spectrum of the steady-state response, $\overline{\|x_\omega\|^2}$, as a function of input frequency ω, Eq. (2.33), for the system Eq. (2.2) with A = M + σJ, and M given by Eq. (2.35) with w = 1 and λn = ±i (with +i and −i alternating). Here, N = 700, σ = 0.5, and γ = 0.8. The input was fed into the last component of x (the beginning of the feedforward chain characterized by Eq. (2.35)), which for the matrix M has natural frequency −1. That is, the input was $I_0 \sqrt{2} \cos \omega t$ where I0 was 1 for the last component and 0 for all other components. The green (thick dashed) curve is the ensemble average of the total power spectrum, $\langle \overline{\|x_\omega\|^2} \rangle_J$, calculated numerically using the general formula Eq. (2.33), which is compared with an empirical average over 100 realizations of real Gaussian J (solid red line, mostly covered by the dashed green line). The pink (light gray) area shows the standard deviation among these 100 realizations around this average. The blue (thin) line shows the result when the disorder, σJ, is ignored, i.e. A is replaced by its ensemble average M. Bottom panel: the eigenvalue spectrum of M + σJ (black dots). Red big dots at ±i show the eigenvalues of M. The red curve is the outer boundary of the eigenvalue spectrum of A as computed numerically using Eq. (2.5). The real and imaginary axes of the complex plane are interchanged, so that the frequency axis in the top panel can be matched with the imaginary part of the eigenvalues, i.e. the natural frequencies of Eq. (2.2).

Finally, we note that the general formulae presented in this section are valid only for cases where the initial condition, x0, or the input structure, I0 or CI, are chosen independently of the particular realization of the random matrix J (e.g., cases where x0 is itself random but independent of J, or when x0 is chosen based on properties of M, L or R). In particular, our results do not apply to cases in which the initial condition or the input is tailored or optimized for the particular realization of the quenched randomness, J, in which case the true result could be significantly different from those given by the formulae of this section.

C. Some specific examples of M, R and L

In this section we present the results of explicit calculations of the eigenvalue density Eq. (2.8), the average squared norm of response to impulse Eqs. (2.21) and (2.25), and the total power in response to sinusoidal input Eq. (2.33), for specific examples of M, L and R (the details of the calculations for the results presented here can be found in Sec. V). For many of the examples presented here, L and R are both proportional to the identity matrix; thus in these examples the full matrix is of the form M + σJ where σ > 0 determines the strength of disorder in the matrix. In Secs. II C 2 and II C 3, we also present examples with nontrivial L and/or R.

Any matrix, M, can be turned into an upper-triangular form by a unitary transformation, i.e.

$M = U T U^\dagger,$  (2.34)

where U is unitary and T is upper-triangular (i.e. Tij = 0 if i > j) with its main diagonal consisting of the eigenvalues of M. The difference between nonnormal and normal matrices is that for the latter, T can be taken to be strictly diagonal. Equation (2.34) is referred to as a Schur decomposition of M [41], and we refer to the orthogonal modes of activity represented by the columns of U as Schur modes. The Schur decomposition provides an intuitive way of characterizing the dynamical system Eq. (2.2). Rewriting Eq. (2.2), with J and I(t) set to zero, in the Schur basis by defining y = U†x (i.e. yi is the activity in the i-th Schur mode), we obtain dy/dt = −γy + Ty. We see that activity in the j-th Schur mode provides an input to the equation for the i-th mode only when i ≤ j (as Tij = 0 for i > j). Thus the coupling between modes is feedforward, going only from higher modes to lower ones, without any feedback. We refer to the Tij for j > i as feedforward weights. As these vanish for normal matrices, we can say a matrix is more nonnormal the stronger its feedforward weights are.

Due to the invariance of the trace, the norm, and the adjoint operation under unitary transforms, our general formulae for the spectral density, Eq. (2.8), and the average squared norm in time and frequency space, Eqs. (2.25) and (2.33), take the same form in any basis, so in particular we can work in the Schur basis of M. Hence M can be replaced by T, provided L and R are also expressed in M’s Schur basis and x0 or I0 are replaced by U†x0 or U†I0, respectively [62]. Thus we use the feedforward structure of the Schur decomposition to characterize the different examples we consider below. Our examples are chosen to demonstrate interesting features of nonnormal matrices in the simplest possible settings.
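For concreteness, the Schur form and the feedforward weights are directly computable with standard numerical routines; a minimal sketch (ours), using the smallest instance of the excitatory-inhibitory example of Eq. (2.41) below:

```python
# Sketch (ours): reading off the hidden feedforward structure from the complex
# Schur decomposition M = U T U^dagger, Eq. (2.34).
import numpy as np
from scipy.linalg import schur

# a 2x2 nonnormal toy matrix with the excitatory/inhibitory sign structure of
# Eq. (2.41) below, with K = 1 (so one feedforward weight of magnitude 1 is expected)
M = 0.5 * np.array([[1.0, -1.0],
                    [1.0, -1.0]])
T, U = schur(M, output='complex')              # M = U @ T @ U.conj().T, T upper triangular
print(np.round(T, 10))                         # diagonal: eigenvalues (both 0);
                                               # |T[0, 1]| = 1: the feedforward weight
print(np.allclose(M, U @ T @ U.conj().T))      # True
```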

1. Single feedforward chain of length N

In the first example, each and every Schur mode is only connected to its lower adjacent mode, forming a long feedforward chain of length N. For simplicity, we take all feedforward weights in this chain to have the same value w, so that

$M = T = \begin{pmatrix} \lambda_1 & w & & \\ & \lambda_2 & w & \\ & & \ddots & \ddots \\ & & & \lambda_N \end{pmatrix}$  (2.35)

or more succinctly Mnm = w δn+1,m + λnδnm.

Figure 1 shows the power spectrum of the response (top panel) and the eigenvalue distribution (bottom panel) of A = M + σJ for an example M of the form Eq. (2.35) with alternating imaginary eigenvalues, λn = (−1)n+1i. The black dots in the bottom panel of Fig. 1 show the eigenvalues of A for one realization of J, scattered around the highly degenerate spectrum of M at ±i (red dots). The top panel shows the ensemble average of the total power spectrum of the response, $\langle \overline{\|x_\omega\|^2} \rangle_J$, of the system Eq. (2.2) to sinusoidal stimuli as given by our general formula Eq. (2.33) (green curve), showing that it perfectly matches the empirical average (red curve) over a set of 100 realizations of J (the latter was obtained by generating 100 realizations of J, calculating $\overline{\|x_\omega\|^2}$ for each realization, which is given by the numerator of Eq. (2.33) with M replaced by M + σJ, and then averaging the results over the 100 realizations). The pink (light gray) shading shows the standard deviation of the power spectrum over these 100 realizations. This will shrink to zero as N goes to infinity, so that for large N the power spectrum of any single realization of A = M + σJ will lie very close to the ensemble average. The system (2.2) in the zero disorder case, σ = 0, has two highly degenerate resonant frequencies, ω0± = ±1 (the imaginary parts of the eigenvalues of M), leading to possible peaks in the power spectrum at these frequencies. The smaller the decay of these modes (in this case given by γ), i.e. the closer the eigenvalues of the combined matrix −γ + M are to the imaginary axis, the sharper and stronger are the resonances. Comparing the zero disorder power spectrum (blue curve) with that for A = M + σJ, we see that the disorder has led to strong but unequal amplification of the two resonances relative to the case without disorder. This is partly due to the disorder scattering some of the eigenvalues of −γ + A much closer to the imaginary axis, creating larger resonances.

For M of the form (2.35) with all eigenvalues zero we have analytically calculated the eigenvalue density, Eq. (2.8), the magnitude of the response to an impulse, Eq. (2.25), and the power spectrum, Eq. (2.33). In this case, using Eq. (2.5) naively yields $|z| \leq \sqrt{|w|^2 + \sigma^2}$ for the support of the eigenvalue density. However, using the correct procedure, Eqs. (2.19)–(2.20), we find that this formula is only correct for σ ≥ |w|, while for σ < |w|, the true support of the eigenvalue density in the limit N → ∞ is the annulus

$\sqrt{|w|^2 - \sigma^2} \leq |z| \leq \sqrt{|w|^2 + \sigma^2}$  (2.36)

(this result was obtained in Ref. [31]). Within this support the eigenvalue density in either case is

$\rho(z) = \frac{1}{\pi \sigma^2} \left[ 1 - \frac{|w|^2}{\sqrt{4 |w|^2 |z|^2 + \sigma^4}} \right].$  (2.37)

Figure 2 demonstrates the close agreement of Eqs. (2.36)–(2.37) with the empirical spectrum of M + σJ for a single realization of J, for N = 2000 and two different values of σ. The discrepancy between the results obtained by the naive use of Eq. (2.5) and Eq. (2.36) is due to the fact that for |z| < |w|, Mz = (z − M)/σ has an exponentially small, O(e−cN), singular value (see next paragraph), which makes the result of Eqs. (2.19)–(2.20) dependent on the order of the two limits N → ∞ and g → 0+. As we discussed after Eq. (2.12), such a discrepancy can signify the existence of an O(1) number of outlier eigenvalues outside the support of limN→∞ ρ(z). Simulations show that this is the case for $|z| < \sqrt{|w|^2 - \sigma^2}$ (see Fig. 2).

FIG. 2.

(Color online) The eigenvalue spectra of A = M + σJ for N = 2000 and M given by Eq. (2.35) with λn = 0, w = 1, for single realizations of real Gaussian J. σ = 0.95 and 0.5 in the left and right panels, respectively. The red circles mark the circular boundaries of the spectral support given by Eq. (2.36). The insets show a comparison of the analytic formula Eq. (2.37) for the spectral density (black smooth trace) and histograms corresponding to the particular realization shown in the main plot (red jagged trace).

The most striking aspect of these results is revealed in the limit σ → 0. For σ = 0, the spectrum is that of M, which is concentrated at the origin. Remarkably, however, as seen from Eqs. (2.36)–(2.37), for very small but nonzero σ the bulk of the eigenvalues are concentrated in the narrow ring with modulus |z| ≈ |w|. Thus in the limit N → ∞ the spectrum has a discontinuous jump at σ = 0. This is a consequence of the extreme non-normality of M, which manifests itself in the extreme sensitivity of its spectrum to small perturbations, which is well known (see Ref. [17], Ch. 7). The notion of pseudospectra quantifies this sensitivity: the (operator norm) ϵ-pseudospectrum of M is the region of the complex plane to which its spectrum can be perturbed by adding to M a matrix of operator norm no larger than ϵ. As we mentioned in Sec. II A, this is precisely the set of complex values z for which ∥(z − M)−1∥ > ϵ−1 [17], and therefore, by the definition of the operator norm ∥·∥, the region in which ∥(z − M)−1−1 = smin(z − M) < ϵ, where smin(z − M) is the least singular value of z − M. As noted above, for |z| < |w|, smin(z − M) is exponentially small: $s_{\min}(z - M) \approx |w| \left| z/w \right|^N$ (for a proof see after Eq. (5.15) in Sec. V A). Thus the ϵ-pseudospectrum of M contains the set of points z satisfying $|w| \left| z/w \right|^N < \epsilon$, i.e. the centered disk with radius $|w| (\epsilon / |w|)^{1/N}$, which approaches |w| as N → ∞. In other words, for large enough N, any point |z| < |w| is in the ϵ-pseudospectrum for any fixed ϵ, no matter how small. It has been stated [17] that dense random perturbations, of the form σJ considered here, tend to trace out the entire ϵ-pseudospectrum (where ϵ = ∥σJ∥ ≈ 2σ). Our result shows that, for ϵ, σ ≪ |w|, the spectrum of such perturbations traces out the ϵ-pseudospectrum in quite an uneven fashion; the vast majority (Θ(N)) of the perturbed eigenvalues only trace out the boundary of the pseudospectrum, |z| ≈ |w|, while only a few (O(1)) eigenvalues lie in its interior. Thus, dense random perturbations can fail as a way of visualizing (operator norm based) pseudospectra.

We now turn to the dynamics. We have explicitly calculated the average evolution of the magnitude of x(t), Eq. (2.25), and the total power spectrum of the steady-state response, Eq. (2.33), for the case where the initial condition is (or the input is fed into) the last Schur mode, i.e. the beginning of the feedforward chain: x0 = (0, · · · , 0, 1)T (or I0 ∝ (0, · · · , 0, 1)T). For the evolution of the average norm squared, with the initial condition x0 = (0, … , 0, 1)T, we obtain

$\langle \|x(t)\|^2 \rangle_J = e^{-2\gamma t}\, I_0\!\left( 2t \sqrt{|w|^2 + \sigma^2} \right), \qquad (t \geq 0)$  (2.38)

where Iν(x) is the ν-th modified Bessel function of the first kind. Figure 3 plots the function Eq. (2.38) and compares it with the result obtained by ignoring the disorder (corresponding to σ = 0). The main difference between the two curves is the slower asymptotic decay of the σ ≠ 0 result (green) compared with the zero-disorder case (purple). This is the result of the disorder spreading some of the eigenvalues of −γ + A closer to the imaginary axis, creating modes with smaller decay. Importantly, in neither case do we see transient amplification. By contrast, in the σ = 0 case and for small enough decay, i.e. for γ < |w|, the system Eq. (2.2) exhibits very strong transient amplification. In this case, starting from the initial condition x0 = (0, · · · , 0, 1)T, the solution for the (N − n)-th Schur component is $x_{N-n}(t) = \frac{(wt)^n}{n!} e^{-\gamma t}$ (for 0 ≤ n ≤ N − 1), which is maximized at t = n/γ with a value $\max_t x_{N-n} \sim (w/\gamma)^n$ for n ≫ 1. Thus up to time t ~ N/γ the norm of the activity grows exponentially, $\|x(t)\|^2 \sim (w/\gamma)^{2\gamma t}$ for t ≲ N/γ. For larger times the activity reaches the end of the N-long feedforward chain and starts decaying to zero; asymptotically $\|x(t)\|^2 \sim e^{-2\gamma t}$ for t ≫ N/γ. However, as we have seen, the spectrum of M is extremely sensitive to perturbations; even for very small but nonzero σ, the spectrum of −γ1 + A has eigenvalues with real part as large as |w| − γ. Therefore, in the limit N → ∞, the system Eq. (2.2) is unstable for |w| > γ, as soon as σ ≠ 0. Conversely, in the presence of disorder (even infinitesimally small disorder in the N → ∞ limit), as long as the system is stable (which from Eq. (5.15) requires $\gamma > \sqrt{|w|^2 + \sigma^2}$), it exhibits no transient amplification for the initial condition along the last Schur mode. Let us note, however, that as we mentioned after Eq. (2.33), Eq. (2.25) and hence Eq. (2.38) do not yield the correct answer when the direction of the impulse is optimized for the specific realization of the quenched disorder J; such disorder-tuned initial conditions can yield significant transient amplification even for the stable σ ≠ 0 system.
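Equation (2.38) is also easy to check against a direct integration of Eq. (2.2); the following sketch (ours; a single realization with arbitrary sizes) does so for the stable case:

```python
# Sketch (ours): compare Eq. (2.38) with a direct simulation of Eq. (2.2)
# (I(t) = 0, x(0) = x0) for the uniform chain with lambda_n = 0; a single large-N
# realization should lie close to the ensemble average by self-averaging.
import numpy as np
from scipy.special import i0          # modified Bessel function I_0
from scipy.linalg import expm

rng = np.random.default_rng(5)
N, w, sigma = 400, 1.0, 0.5
gamma = 1.005 * np.sqrt(w**2 + sigma**2)       # just inside stability, as in Fig. 3
M = w * np.diag(np.ones(N - 1), k=1)
A = M + sigma * rng.standard_normal((N, N)) / np.sqrt(N)
x = np.zeros(N); x[-1] = 1.0                   # impulse on the last Schur mode

dt = 0.1
P = expm((A - gamma * np.eye(N)) * dt)         # one-step propagator of Eq. (2.2)
for step in range(1, 51):
    x = P @ x
    if step % 10 == 0:
        t = step * dt
        pred = np.exp(-2 * gamma * t) * i0(2 * t * np.sqrt(w**2 + sigma**2))  # Eq. (2.38)
        print(f"t={t:4.1f}  simulated {x @ x:.4f}  Eq. (2.38) {pred:.4f}")
```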

FIG. 3.

(Color online) The norm squared of the response to an impulse, ∥x(t)∥2, of the system Eq. (2.2), for A = M + σJ, with binary J, and M given by Eq. (2.35) (with λn = 0) describing an N-long feedforward chain with uniform weights w. Here, w = 1, σ = 0.5, $\gamma = 1.005 \sqrt{\sigma^2 + w^2} \approx 1.124$, and N = 700. The green (thick dashed) curve shows our result, Eq. (2.38), for the average squared impulse response, ⟨∥x(t)∥2⟩J, which lies on top of the red (thick solid) curve showing the empirical average of ∥x(t)∥2 over 100 realizations of binary J. The five thin dashed black curves show the result for five particular realizations of J, and the pink (light gray) area shows the standard deviation among the 100 realizations. The standard deviation shrinks to zero as N → ∞, and ∥x(t)∥2 for any realization lies close to its average for large N. For comparison, the purple (thin, lowest) curve shows ∥x(t)∥2 obtained by ignoring the effect of quenched disorder, i.e. by setting A = M.

Incidentally, we can also read off the result for M = 0 from Eq. (2.38), by setting w = 0, obtaining ⟨∥x(t)∥2⟩J = e−2γtI0(2σt). Since in this case all directions are equivalent, this is the answer for the (normalized) initial condition along any direction, again as long as the direction is chosen independently of the specific realization of J.

Finally, the total power of response to a sinusoidal input with amplitude I0 = (0, · · · , 0, I0)T is given by

$\left\langle \overline{\|x_\omega\|^2} \right\rangle_J = \frac{I_0^2}{\omega^2 + \gamma^2 - |w|^2 - \sigma^2}.$  (2.39)

The main effect of the disorder is to reduce the width of the resonance (the peak of $\langle \overline{\|x_\omega\|^2} \rangle_J$ at ω = 0) and increase its height. This is partly a consequence of the scattering of the eigenvalues of −γ + A closer to the imaginary line by the disorder, creating modes with smaller decay.

2. Examples motivated by Dale’s law: 1 or N/2 feedforward chains of length 2

In this section we consider examples motivated by Dale’s law [4–6] in neurobiology. Dale’s law is the observation (which holds generally but with some exceptions [42, 43]) that individual neurons release the same neurotransmitter at all of their synapses. In the context of many theoretical papers including this one, it refers more specifically to the fact that an individual neuron either makes only excitatory synapses or only inhibitory synapses; that is, each column of the synaptic connectivity matrix has a fixed sign, positive for excitatory neurons and negative for inhibitory ones. We will first consider two examples of connectivity matrices respecting Dale’s law which take the form Eq. (2.1) with L = σ−1R = 1 (i.e. L = 1 and R = σ1) for a scalar σ. At the end of this subsection we consider an example with nontrivial L and R.

In the first example, we consider a matrix M, which as we will show, has a Schur form T that is composed of N/2 disjoint feedforward chains, each connecting only two modes (we assume N is even). For simplicity we will focus on the case where all eigenvalues are zero. Thus in the Schur basis we have

$T = \begin{pmatrix} 0 & w_1 & & & \\ 0 & 0 & & & \\ & & 0 & w_2 & \\ & & 0 & 0 & \\ & & & & \ddots \end{pmatrix} = W \otimes \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},$  (2.40)

where we defined W to be the N/2 × N/2 diagonal matrix of Schur weights W = diag(w1, w2, …, wN/2). T in Eq. (2.40) arises as the Schur form of a mean matrix of the form

$M = \frac{1}{2} \begin{pmatrix} K & -K \\ K & -K \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix} \otimes K,$  (2.41)

where K is a normal (but otherwise arbitrary) N/2 × N/2 matrix (note that M is nonetheless nonnormal). The feedforward weights in Eq. (2.40) are then the eigenvalues of K. When K has only positive entries, matrices of the form Eq. (2.41) satisfy Dale’s principle, and were studied in Ref. [7] in the context of networks of excitatory and inhibitory neurons. We imagine a grid of N/2 spatial positions, with an excitatory and an inhibitory neuron at each position. ½K, a matrix with positive entries, describes the mean connectivity strength between spatial positions, which is taken to be identical regardless of whether the projecting, or receiving, neuron is excitatory or inhibitory. The sign of the weight, on the other hand, depends on the excitatory or inhibitory nature of the projecting or presynaptic neuron; the first (last) N/2 columns of M represent the projections of the excitatory (inhibitory) neurons and are positive (negative). Since K is normal it can be diagonalized by a unitary transform: K = EWE†, where W is as above, and E = (e1, e2, …) is the matrix of the orthonormal eigenvectors eb of K, b = 1, …, N/2 (with eigenvalues wb). Then transforming to the basis $\left\{ \binom{e_1}{0}, \binom{0}{e_1}, \binom{e_2}{0}, \binom{0}{e_2}, \ldots, \binom{e_{N/2}}{0}, \binom{0}{e_{N/2}} \right\}$ (where 0 represents the N/2-dimensional vector of 0’s) transforms the matrix to being block-diagonal, with the 2 × 2 matrices $\frac{1}{2} \begin{pmatrix} w_b & -w_b \\ w_b & -w_b \end{pmatrix}$, b = 1, …, N/2, along the diagonal. The b-th block becomes $\begin{pmatrix} 0 & w_b \\ 0 & 0 \end{pmatrix}$ in its Schur basis $\left\{ \frac{1}{\sqrt{2}} \binom{e_b}{e_b}, \frac{1}{\sqrt{2}} \binom{e_b}{-e_b} \right\}$, so the full matrix takes the form Eq. (2.40). Thus, the b-th difference mode $\frac{1}{\sqrt{2}} \binom{e_b}{-e_b}$ feeds forward to the b-th sum mode $\frac{1}{\sqrt{2}} \binom{e_b}{e_b}$ with weight wb. This feedforward structure leads to a specific form of nonnormal transient amplification, which the authors of Ref. [7] dubbed “balanced amplification”; small differences in the activity of excitatory and inhibitory modes feed forward to, and cause possibly large transients in, modes in which the excitatory and inhibitory activities are balanced.

Another interesting example of Dale’s law is that in which M simply captures the differences between the mean inhibitory and mean excitatory synaptic strengths and between the numbers of excitatory and inhibitory neurons, with no other structure assumed (uniform mean connectivity), as studied in Ref. [39]. Thus, all excitatory projections have the same mean μEN, and all inhibitory ones have the mean μIN. If we assume a fraction f of all neurons are excitatory, then we can write M as

$M = u v^T,$  (2.42)

where u = N−1/2(1, …, 1)T is a unit vector, and the vector v has components vi = µE for i ≤ fN and vi = −µI for i > fN (for f = 1/2 and µE = µI, Eq. (2.42) is a special case of Eq. (2.41)). The rank-one matrix M has only one nonzero eigenvalue, given by $v \cdot u = \frac{1}{\sqrt{N}} \sum_i v_i$, with eigenvector u. The case in which the excitatory and inhibitory weights are balanced on average, in the sense that $\sum_i v_i = 0$, is of particular interest; mathematically it is in a sense the least symmetric and most nonnormal case, as v · u = 0. In this case all eigenvalues of M are equal to zero. Furthermore, since in this case u and v are orthogonal, we can readily read off the Schur decomposition of M from Eq. (2.42). The normalized Schur modes are given by u, v/∥v∥ and N − 2 other unit vectors spanning the subspace orthogonal to both u and v. All feedforward Schur weights are zero, except for one very large weight, equal to $\|v\| = \mu \sqrt{N}$, which feeds from v/∥v∥ to u. Thus the Schur representation of M has the form Eq. (2.40) with $w_1 = \|v\| = \mu \sqrt{N}$ and wb≠1 = 0, where we defined

μ2tr(MM)=v2N=fμE2+(1f)μI2. (2.43)

Note that this is again a case of balanced amplification: differences between excitatory and inhibitory activity, represented by v, feed forward to balanced excitatory and inhibitory activity, represented by u, with a very large weight. In the following we present results only for this balanced case of Eq. (2.42), which as just noted is a special case of Eqs. (2.40).

We start by presenting the results for the eigenvalue density. For general diagonal W in Eq. (2.40) (or equivalently, for general normal K in Eq. (2.41)), the eigenvalue density, ρ(z), of A = M +σJ is isotropic around the origin z = 0, and depends only on r = |z|. The spectral support is a disk centered at the origin. In cases in which all the weights wb are O(1), the radius of this disk can be found directly from Eq. (2.5), which yields

r0=σ[12+14+wb2b2σ2]12. (2.44)

Here, 〈|wb|2b is the average of the squared feedforward weights over all blocks of Eq. (2.40); equivalently, (|wb|2)b = 2 tr(M M) ≡ 2µ2. As long as some wb are nonzero, r0 is larger than the radius of the circular law, σ, with the difference an increasing function of (|wb|2)b; thus the spreading of the spectrum of M (originally concentrated at the origin) after the random perturbation by σJ, is larger the more nonnormal M is. In cases in which the feedforward weights of some of the 2 × 2 blocks of Eq. (2.40) grow without bound as N → ∞, there is a corresponding singular value of MzzM for every such block which is nonzero for z ≠ 0 but vanishes in the limit, scaling like ~z2wb where wb is the unbounded weight of that block (see Eq. (5.37) and its preceding paragraph). (Note that as stated after Eq. (2.3) we assume MF=μ=wb22, so that at most o(N) number of weights can be unbounded, and each can at most scale like O(N 1/2).) In line with the general discussion after Eq. (2.12), in such cases the naive use of Eq. (2.5) may yield an area larger than the true support of limN →∞ ρ(z); the correct support must be found by using Eqs. (2.19)(2.20), which in this case can yield a support radius strictly smaller than Eq. (2.44). We have calculated the explicit results for limN →∞ ρ(z) for two specific examples of M with the Schur form Eq. (2.40). The first example belongs to the first case (bounded wb’s) where limN →∞ ρ(z) is Θ(1) within the entire disk rr0, while the second belongs to the second case (unbounded wb’s) where the limit density is only nonzero in a proper subset of that disk.

In the first example, we take all the Schur weights in Eq. (2.40) to have the same value, which we denote by w. In this case, the eigenvalue density is given by ρ(r)=12πrn<(r)r, where n<(r) is the proportion of eigenvalues within a distance r from the origin and is given by

n<(r)=r2σ2[1w2σ2+σ4+w4+4w2r2]. (2.45)

n< (r) reaches unity exactly at r = r0 given by Eq. (2.44), and ρ(r) is Θ(1) for any smaller r. Figure 4 shows the close agreement of Eq. (2.45) with empirical results based on single binary realizations of J, for N as low as 60.

FIG. 4.

FIG. 4

(Color online) The eigenvalue spectra of A = M + σJ for a binary J with σ = 0.1 and M given by Eq. (2.41) with K = 1 (corresponding to wb = 1 for all the diagonal 2 × 2 blocks in Eq. (2.40)). The main panels show the eigenvalues for single realizations of J, with N = 600 (left) and N = 60 (right). The red circles mark the boundaries of the spectral support, Eq. (2.44). Since A is real in this case, its eigenvalues are either exactly real, or come in complex conjugate pairs; the spectrum is symmetric under reflections about the real axis. However, such signatures of the reality of the matrix appear only as subleading corrections to the spectral density ρ(z); they are finite size effects which vanish as N → ∞. The insets show a comparison of the analytic formula Eq. (2.45) (black curve) and the empirical result, based on the eigenvalues of the realizations in the main panels, for the proportion, n< (r), of eigenvalues lying within a radius r of the origin (red dots). The random fluctuations and the average bias of the empirical n< (r) are both already small for N = 60, and negligible for N = 600.

The second example is that of the balanced Eq. (2.42) with u · v = 0. As we saw, all wb are zero in this case except for one very large, unbounded weight w1=μN. As discussed above, in this case MzzM has an o(1) smallest singular value, approximately given by z2μN. Using Eqs. (2.19)(2.20), we find that the support of limN →∞ ρ(z) is the disc with radius σ (within the annulus σ < |z| ≤ r0 the eigenvalue density is o(1)), and solving Eqs. (2.8)(2.9) for |z| ≤ σ, we find that the spectral density is in fact identical with the circular law (the eigenvalue density for the M = 0 case), i.e.

ρ(r)={1πσ2+o(1),(r<σ)o(1)(r>σ).} (2.46)

It was shown in Refs. [36, 44] that more generally, for any M of rank o(N) and bounded ∥MF, the eigenvalue density of A = M + σJ is given by the circular law in the limit N → ∞. For single rank M (as in the present case) and a diagonal R, it was shown in Ref. [45] that the eigenvalue density of M + J R agrees with that of J R as N → ∞. In the present example, it was observed in Ref. [39] that even though the majority of the eigenvalues are distributed according to the circular law, there also exist a number of “outlier” eigenvalues spread outside the circle |z| = σ, which unlike in the M = 0 case, may lie at a significant distance away from it (see Fig. 5). As we mentioned in Sec. II A, the non-crossing approximation cannot be trusted to correctly yield the o(1) contributions to ρ(z) by these outliers for |z| > σ. However, we found that if we ignore this warning and use Eqs. (2.8)(2.9), keeping track of finite-size, o(1) contributions, we obtain results that agree surprisingly well (though not completely) with simulations. First, for the total number of outlier eigenvalues lying outside the circle |z| = σ we obtain

FIG. 5.

FIG. 5

(Color online) The eigenvalue spectra of A = M + σJ for the M given by Eq. (2.42) in the balanced case, vT u = 0. Here, N = 800, σ = 1 and µ = 12 (see equation Eq. (2.43)). The black dots are the superimposed eigenvalues of A for 20 different realizations of complex Gaussian J. The small red circle enclosing the vast majority of the eigenvalues has radius σ = 1, corresponding to the standard circular law Eq. (2.46). A Θ(N) number of eigenvalues lie within this circle. Aϴ(N) number lie just outside of this circle in a thin boundary layer which shrinks to zero as N → ∞. Finally, a Θ(1) number of eigenvalues lie at macroscopic distances outside the unit circle. The dashed blue circle shows radius r0 given by Eq. (2.44); outliers can even lie outside this boundary.

N>(σ)Nn>(σ)=N+O(1) (2.47)

(here we defined n> (r) = 1 − n< (r) to be the proportion of eigenvalues lying outside the radius r); see Fig. 6 for a comparison of Eq. (2.47) with simulations. The vast majority of the outlier eigenvalues counted in Eq. (2.47) lie in a narrow boundary layer immediately outside the circle |z| = σ, the width of which shrinks with growing N. In addition to these, however, there are a Θ(1) number of eigenvalues lying at macroscopic, Θ(1) distances outside the circle |z| = σ. Using Eqs. (2.8)(2.9) we have calculated N> (r), the number of outlier eigenvalues lying outside radius r for r > σ. Figure 7 shows a plot of N> (r) and compares it with the results of simulations for different N. For roughly the inner half of the annulus σ < |z| <r0, N> (r) agrees well with simulations, but as r increases it deviates significantly from the empirical averages. In particular, N> (r) calculated from Eqs. (2.8)(2.9) vanishes at r0 given by Eq. (2.44), while the empirical average of the number of outliers is nonzero well beyond r0. Finally, we note that the distribution of these eigenvalues is not self-averaging, and depends on the real vs. complex nature of the random matrix J [36]. In the real case, their distribution has been recently characterized as that of the inverse roots of a certain random power series with iid standard real Gaussian coefficients [36].

FIG. 6.

FIG. 6

(Color online) The number of eigenvalues of M + σJ, for the M given by Eq. (2.42), lying outside the circle of radius σ vs. N (red line). Here, σ = 1, µ = 12 and vT u = 0. The numbers (red points connected by solid red lines) are obtained by numerically calculating the eigenvalues and counting the outliers for 200 realizations of J, and taking the average of the counts over all realizations, for N = 100, 200, 400, 800, 1600 (error bars show standard error of mean). The black dashed line plots N for comparison with our theoretical result Eq. (2.47); the (dashed) blue line which includes subleading corrections to N, is obtained by numerically solving Eq. (5.42) and substituting the result in Eq. (5.43) (these formulae are in turn obtained from Eqs. (2.8)(2.9) in Sec. V B).

FIG. 7.

FIG. 7

(Color online) The number, N> (r), of outlier eigenvalues of A = M + σJ, for the M given by Eq. (2.42), lying farther from the origin than r, as a function of r. Here, σ = 1, µ = 12 and vT u = 0. The vertical line marks |z| = r0 ≃ 3.54 where r0 is given by Eq. (2.44). The colored (shades of gray) connected points are N> (r) for realizations of A, based on 200 samples of J, each color for a different N, for N = 100, 200, 400, 800, 1600 and 3200 (error bars show standard error of sample mean). Note the lack of scaling of N> (r) with N.

As for the dynamics, we have analytically calculated the magnitude of impulse response, Eq. (2.25), as well as the power-spectrum of steady-state response Eq. (2.33), for A = M + σJ with M given by Eqs. (2.40)(2.41) with general wb, when the (impulse or sinusoidal) input feeds into the second Schur mode in one of the N/2 chains/blocks of Eq. (2.40); we denote the index for this block by a. For the average magnitude of impulse response we find

x(t)2J=[1+Ca2I0(2r0t)+1Ca2J0(2r1t)]e2γt (2.48)

where J0(x) (I0(x)) is the (modified) Bessel function, r0

is given by Eq. (2.44), we defined r12=r02σ2, and

Ca1+2wa2σ21+2wb2bσ2, (2.49)

with 〈|wb|2b = 2 tr(M M) denoting the average squared feedforward weight among all the blocks of Eq. (2.40). In Fig. 8 we plot Eq. (2.48) and compare it with the result obtained by ignoring the disorder (i.e. by setting σ = 0); in the latter case, the block a is decoupled from the rest of the network, and solving the 2 × 2 linear system governed by the matrix (γwa0γ) , we obtain x(t)2=(1+wa2t2)e2γt. From the figure, we see that the σ ≠ 0 result (green) has a slower asymptotic decay compared with the zero-disorder case (purple); this is due to the disorder having spread some eigenvalues closer to the imaginary axis, creating modes with smaller decay, along with the fact that the coupling between the 2 × 2 blocks induced by the disorder insures that these more slowly decaying modes will be activated. Indeed, for large t, ∥x(t)∥ decays like eγt when σ = 0, while in the σ > 0 case, based on Eq. (2.44) it must decay like e−(γr0)t, i.e. by a rate set by the largest real part of the spectrum shifted by −γ (this is indeed what we obtain from Eq. (2.48) using the asymptotics of Bessel functions). In addition, both curves exhibit transient amplification where the magnitude of activity initially grows to a maximum, before it decays asymptotically to zero. The σ ≠ 0 curve shows larger and longer transient amplification, which is most likely attributable both to the eigenvalues being closer to the Re(z) = γ line and to augmented nonnormal effects (e.g. larger effective feed-forward weights, or longer chains). We also mention that, as in our previous examples, if the input direction is optimized for the particular realization of J, significantly larger transient amplification may be achieved.

FIG. 8.

FIG. 8

(Color online) The squared norm of response to impulse, x(t) 2, of the system Eq. (2.2), for A = M + σJ, with log-normal J, and M given by Eq. (2.40) describing N/2 doublet feedforward chains weights wb. Here, wa = |wb|2 b = 3, σ = 0.4, γ = 1, and N = 1400. The green (thick dashed) curve shows our result, Eqs. (2.48)(2.49), for the average norm squared which, except for a small window around its peak, lies on top of the red (thick solid) curve showing the empirical average of x(t) 2 over 100 realizations of binary J. The five thin dashed black curves show the result for five particular realizations of J, and the pink (light gray) area shows the standard deviation among the 100 realizations. The standard deviation shrinks to zero as N → ∞ and x(t) 2 for any realization lies close to its average for large N. For comparison the purple (thin, lowest) curve shows x(t) 2 obtained by ignoring the effect of quenched disorder, i.e. by setting A = M.

Finally, the total power spectrum of response to a sinusoidal input, Eq. (2.32), is given by the explicit formula

xω2¯J=ω2+γ2+wa2(ω2+γ2)2σ(ω2+γ2+μ2)I02. (2.50)

where µ2 ≡ tr(M M) = (|wb|2)b/2 and, as noted above, the direction of I0 is that of the second Schur mode in block a.

The example Eq. (2.42) motivated by Dale’s law with neurons of either excitatory or inhibitory types, can be generalized to a network of neurons belonging to one of C different types (these could be subtypes of excitatory or inhibitory neurons), in which not only the mean but also the variance of connection strengths depends on the pre- and post-synaptic types. When this dependence is factorizable, in a way we will now describe, the connectivity matrix of such a network will be of the form Eq. (2.1) with non-trivial L and R. Let c(i) ∈ {1, …, C} denote the type of neuron i, and let fc denote the fraction of neurons of type c(soc=1Cfc=1); we assume C and fc are all Θ(1). Assume further that each synaptic weight is a product of a pre- and a post-synaptic factor, and that in each synapse these factors are chosen independently from the same distribution, except for a deterministic sign and overall scale that depend only on the type of the pre and post-synaptic neurons, respectively. Thus if Aij denotes the weight of the synaptic projection from neuron j to neuron i, we have

Aij=1N(lc(i)xij)(rc(j)yij) (2.51)

where xij’s and yij are positive random variables chosen iid from the distributions Px(x) and Py (y), respectively. Here, lc and rc determine the sign and the scale (apart from the overall 1N of the pre and post-synaptic factors N of the neurons in cluster c, respectively. Note that when all lc are positive, Aij satisfies Dale’s law. By absorbing appropriate constants into lc’s and rc’s we can assume that Var[xy] = 〈x2〉 〈y2〉 − 〈x2y2 = 1. Then it is easy to see that A can be cast in the form Eq. (2.1) with

Lij=lc(i)δij (2.52)
Rij=rc(i)δij (2.53)
Jij=1N(xijyijξ) (2.54)
M=sLuuTR (2.55)

where u is the unit vector 1N(1,,1)T,

sξN, (2.56)

and ξ ≡ 〈x〉〈y〉 is dimensionless and Θ(1) (note that J, given by Eq. (2.54), indeed has iid elements with zero mean and variance N −1). Being single-rank, M has N −1 zero eigenvalues; its only (potentially) non-null eigenvector is Lu, with a generically large eigenvalue

λM=suTRLu=s1Ni=1Nrc(i)lc(i)=ξNσcc (2.57)

where we defined

σclcrc, (2.58)
Xccc=1CfcXc. (2.59)

As for the example Eq. (2.42), we will focus on the balanced case in which λM ∝ 〈σcc = 0. From Eq. (2.55), M=u~v~T with u~=Lu and v~=sRu. The balanced condition is equivalent to u~v~=0 (see Eq. (2.57)). Thus, similar to Eq. (2.42), the Schur representation of M has the form (2.40) with w1=u~v~ and wb = 0 for b > 1.

In Sec. V C we prove that, as for Eq. (2.42), for the ensemble Eqs. (2.52)(2.55) the limit of the eigenvalue distribution, limN →∞ ρ(z), is also not affected by the nonzero mean matrix Eq. (2.55); hence we can obtain limN →∞ ρ(z) for that example by safely setting M to zero, and using formulae Eqs. (2.13)(2.16) with L and R given by Eqs. (2.52)(2.53). Thus limN →∞ ρ(z) is isotropic and its support is the disk with radius

r0=RLF=σc2c. (2.60)

As in the previous example, when the balance condition 〈σcc = 0 holds, use of the naive formula Eq. (2.5) with M=u~v~T would have yielded

r~0=r0[12+14+ξ2]12, (2.61)

which is larger than the correct result Eq. (2.60). As discussed above, this result is not correct, but it indicates the existence of Θ(1) number of outlier eigenvalues lying outside the boundary of limN →∞ ρ(z) given by Eq. (2.60). For r < r0, the N → ∞ limit of the proportion, n>(r), of eigenvalues lying farther than distance r of the origin is given by g2(r) which is found by solving Eq. (2.15), or equivalently

1g2+σc2r2c=1. (2.62)

The results Eqs. (2.17)(2.18) also hold, wherein the normalized sums over i can be replaced with appropriate averages 〈·〉c. In the case of two neuronal types a closed solution can be obtained for n>(r) and ρ(r). Identifying the two types with excitatory and inhibitory neurons, and assuming that lc = 1, σEσ1 > 0 and σIσ2 < 0 (we will use E and I as indices instead of c = 1 and 2, respectively) the ensemble Eqs. (2.52)(2.55) describes a synaptic connectivity matrix in which all excitatory (inhibitory) connections are iid with mean ξσEN12(ξσIN12) and variance σE2N1(σI2N1). In this case, Eq. (2.62) yields a quadratic equation. Differentiating the solution of that equation with respect to r2 we obtain the explicit result

ρ(r)=σE2+σI22π[1(σE2+σI2)r212+r022r2σE2+σ12((σE2+σI2)r21)24+r2(r02r2)(σEσI)2] (2.63)

This result was first obtained (in a less simplified form) in Ref. [39]. Figure 9 shows two examples of spectra for single realizations of matrices of the form Eq. (2.51), with three neural types (C = 3), where xij and yij, and hence Jij, have log-normal distributions. The insets compare n>(r) based on the numerically calculated eigenvalues, with those found by solving Eq. (2.62). In the right panel, the normally distributed log Jij have a higher standard deviation, and hence the distribution of Jij has a heavier tail. The right panel’s inset demonstrates that the convergence to the universal, N → ∞ limit can be considerably slow when the distribution of Jij is heavy-tailed.

FIG. 9.

FIG. 9

(Color online) The eigenvalue spectra of A = M + LJR with M, L and R given by Eqs. (2.52)(2.55) with neurons belonging to one of three different types (C = 3). The main panels show the eigenvalues for two particular realizations of J. In both panels, N = 2000, f1 = 0.6, f2 = f3 = 0.2, lc = 1, σ1 = r1 = 0.76, σ2 = r2 = −0.57, σ3 = r3 = −1.71 (so 〈σcc = 0 and r02=σc2c = 1), and Jij had real entries with log-normal distribution; in the left (right) panel, the normally distributed log10 Jij had standard deviation 0.5 (0.75). The solid red circles mark the boundaries of the spectral support as given by Eq. (2.60), and the dashed blue circles show the radii given by Eq. (2.61). The insets compare n>(r) based on the numerically calculated eigenvalues shown in the main panels (connected red dots), with that found by solving Eq. (2.62) (black curve). In the right panel’s inset we have also plotted (green connected dots lying slightly above the red connected circles) the empirically calculated n>(r) for a single realization with the same ensemble parameters, but with N = 8000; the convergence to the universal limit at N → ∞ is significantly slower in the right panel in which the distribution of Jij had a considerably heavier tail.

3. Linearizations of nonlinear neural and ecological networks

In neuroscience applications, Eq. (2.2) can arise as a linearization of nonlinear firing rate equations for a recurrent neural network of N neurons, around some stationary background. The nonlinear dynamical equations for the evolution of the network activity typically take the form [63]

Tdv(t)dt=v(t)+Wf(v(t))+Iv(t). (2.64)

Here v(t) is the vector of state variables of all neurons at time t; its i-th component, vi(t), is commonly thought of as the voltage of the i-th neuron, or the total synaptic input it receives. f (·) is the neuronal nonlinear input-output function, which is imposed element-by-element on its vector argument, with f (v)if (vi) giving the output, i.e. the firing rate, of neuron i; Iv (t) is the external input vector; T = diag(τ1, τ2, · · · , τN) is a N × N diagonal matrix whose diagonal elements are the positive time-constants of the neurons (hence T is invertible); and W is the N × N synaptic connectivity matrix.

Suppose that for a constant external input, Iv, Eq. (2.64) has a fixed point v. Then, given a small perturbation in the input, Iv(t)=Iv+σIv(t), we can write v(t) = v + x(t), and linearize the dynamics around the fixed point by expanding Eq. (2.64) to first order in x(t) and δIv (t). This yields the set of linear differential equations

Tdx(t)dt=x(t)+WΦx(t)+δIv(t), (2.65)

for the (small) deviations, where we defined the diagonal Jacobian

Φ=diag(f(v)). (2.66)

Now suppose that the original connectivity matrix can be written as W = 〈W〉 + δW, with a quenched disorder part that is an iid random matrix: δW = σJ. Then multiplying Eq. (2.65) by T −1, we can convert Eq. (2.65) into the form Eq. (2.2) with γ = 0 and A = M + LJR with

M=T1(1+WΦ) (2.67)
L=T1 (2.68)
R=σΦ (2.69)

and input

I(t)=T1δIv(t). (2.70)

This observation is not limited to neuroscience applications, and can also apply to many other frameworks, e.g. those used in mathematical biology. Generalized Lotka-Volterra (GLV) equations [47] used in modeling the dynamics of food webs provide an example. Let n(t) = (n1(t), …, nN (t))T denote the vector of population sizes of N species. The GLV equations take the form dnidt=ni(ri+jWijnj) or

dndt=diag(r+Wn)n (2.71)

where ri > 0 are the species’ intrinsic growth rates and W is the interaction matrix. Linearizing Eq. (2.71) around a fixed point, n, yields again a linear system of the form Eq. (2.2) with γ = I(t) = 0. Starting with the same simple model W = 〈W〉 + σJ, we find that A can be written in the form Eq. (2.1) with

R=σ1,L=diag(n), (2.72)
M=diag(r+Wn)+LW. (2.73)

Note that if no species is extinct in the fixed point, i.e. if all ni > 0, then M = L(W).

Assuming the linear systems thus obtained, i.e. the fixed points v* or n*, are stable, we can therefore think of our results for ∥x(t)∥2 and ∥xω2 as characterizing the temporal evolution and the spectral properties of the linear response of the nonlinear system Eq. (2.64) (Eq. (2.71)) in its fixed point v (n) to perturbations.

The necessary and sufficient condition for the stability of a fixed point (without any change in the external input) is that all eigenvalues of the corresponding A have negative real parts. Our formula for the boundary of the eigenvalue distribution, Eq. (2.5), can be applied in these cases to map out the region in parameter space (parameters here mean the time constants or intrinsic growth rates in T or r, or the connectivity parameters determining the random ensemble for W, i.e. σ and the parameters of 〈W〉) in which a particular fixed point is stable. Recently our general formula Eq. (2.5) was used in this way by colleagues [48] to determine the phase diagram of a clustered network of neurons, in which intra-cluster connectivity is large, but inter-cluster connectivity is random and weak. Because of the strong intra-cluster connectivity, each cluster behaves as a unit with a single self-coupling a. Letting the random inter-cluster couplings between N clusters have zero mean and variance g2/N, their analysis starts from the equation

dv(t)dt=v(t)+atanh(v(t))+gJtanh(v(t)) (2.74)

where J is an iid random matrix as above. Here, v is a vector whose i-th component is the mean voltage of cluster i, while the nonlinear function tanh(v(t)) (with the hyperbolic tangent acting component-wise) represents the vector of mean firing rates of the clusters. The analysis of Ref. [48] shows that there is a region of the phase plane (a, g) where the self-connectivity, a, is excitatory and sufficiently strong, in which the system eventually relaxes to non-zero random attractor fixed points v; for smaller values of a, the dynamics is chaotic (chaos in the a = 0 case was established in Ref. [49]). The form of these fixed points (the distribution of the elements of v as N → ∞ for a given (a, g)) can be obtained using mean-field theory, and the linearization about v leads to an equation in the form of Eq. (2.2), with A = M + J R, where M and R are the diagonal matrices

M=diag(1+atanh(v)) (2.75)
R=diag(gtanh(v)). (2.76)

Given this form, it can be shown that the fixed point v is stable if z = 0 is outside and to the right of the spectrum of the Jacobian matrix of the linearization, A. The mean field solution for v determines the statistics of the elements of R2M −2 for a given (a, g). From these it can be determined if z = 0 is outside the spectrum using our formula for the boundary of spectrum Eq. (2.5), which yields the requirement tr(R2M2)<1. In this way, the region of stability of the fixed points in the (a, g) plane can be mapped (see Ref. [48] for the results, and a complete discussion of the analysis outlined here). Figure 10 shows a numerical example of the eigenvalue distribution for A for a given (a, g) and the superimposed boundary calculated using Eq. (2.5).

FIG. 10.

FIG. 10

(Color online) The eigenvalues (black dots) of A = M + J R, with M and R given by Eqs. (2.75)(2.76) with g = 0.01, a = 1.02 and N = 2000. This matrix governs the dynamics of small perturbations away from a non-trivial random fixed point in a clustered network of neurons (see Eq. (2.74)), studied in Ref. [48]. The cyan dots on the real line are the eigenvalues of M, and the red curve is the boundary of support of the eigenvalue distribution, as calculated numerically from Eq. (2.5).

In closing we note a potential caveat in the applicability of our formulae to the linearization analysis of systems like Eq. (2.65) and Eq. (2.71). We have derived the general formulae of Secs. II A–II B assuming that M, L and R are independent of J. However, M and R as given by Eqs. (2.67) and (2.69) (or M and L in Eqs. (2.72)(2.73)) depend on J via their dependence on v (n). However, in our experience this dependence is often too weak and indirect to render our formulae inapplicable; an example is provided by the excellent agreement of the empirical spectrum and the red boundary given by our formula in Fig. 10, which also held for other parameter choices of the model of Ref. [48].

III. DERIVATION OF THE FORMULA FOR THE SPECTRAL DENSITY

In this section we will derive the formulae Eqs. (2.5)(2.8) for the average spectral density, ρ(z), of random matrices of the form A = M +LJR where M, L and R are deterministic matrices, and J is random with iid elements of zero mean and variance 1/N. We will use the Hermitianized diagrammatic method developed in Refs. [24, 25] (and reviewed in Ref. [50]), which we will recapitulate here for completeness. As mentioned in Sec. II, the spectral density is self-averaging for large N. Furthermore, as established in Ref. [34], it is also universal in the large N limit, in the sense that it is independent of the details of the distribution of the elements of J as long its mean and variance are as stated. The same universality theorem also ensures that the real or complex nature of J does not by itself affect ρ(z) to leading order. Therefore, for simplicity we consider the case where J is a zero-mean complex Gaussian random matrix with 〈JabJcd〉 = 0, and

JabJcd=1Nδacδbd. (3.1)

Thus Jab2=1N, and all other first and second moments of J (including Jab2) vanish. The measure on J can be written as

dμ(J)eNTr(JJ)abdImJabdReJab. (3.2)

In this form, and by the invariance of the trace, it is clear that the measure is symmetric with respect to the group U (N) ⊗ U (N), acting on J by JUJV where U and V are arbitrary N × N unitary matrices.

For a particular realization of J, we define the “Green’s function” G(z; J) by

G(z;J)1MzJ, (3.3)

where Mz = L−1(zM)R−1 (Eq. (2.6)). In the case L, R1, G(z; J) will be proportional to the resolvent of A, 1zA. More generally we have

1zA=R1G(z;J)L1. (3.4)

Following Ref. [24], we will use the identity

δ2(z)=1πzzlnz2=1πz(1z) (3.5)

where the first identity follows by noting that 4zz=2, where ∇2 is the 2-D Laplacian, and recalling from electrostatics that the solution of Poisson’s equation for a point charge at origin, i.e.2φ(z) = 4πδ2(z), in 2-D is given by the potential field φ(z) = ln |z|2; the second identity follows from zlnz2=z(lnz+lnz)=1z+0. Using Eq. (3.5) we can write the empirical spectral density, defined in Eq. (2.4), as

ρJ(z)=1πz1Nα1zλα=1πztr1zA. (3.6)

Performing the ensemble average we obtain

ρ(z)ρJ(z)J=1πztr[(RL)1G(z;J)J], (3.7)

where we used Eq. (3.4), and the linearity and cyclicity of the trace. Thus, to calculate ρ(z), our task boils down to calculating 〈G(z; J)〉J.

The diagrammatic technique provides a method for calculating averages of products of G(z; J)’s. However, this method in its standard form relies on A being a Hermitian matrix. It starts by an expansion of G(z; J) in powers of J, which is only valid when z is far enough from the spectrum of A, i.e. away from the points we are most interested in. For Hermitian matrices, this is no problem as the spectrum is confined to the real line, and therefore G(z; J) and 〈G(z; J)〉J will be analytic outside the real line. Thus one can use the expansion for z far away outside the real line, perform the averaging over J, and sum up the most dominant contributions to obtain a result analytic in z. This result can then be analytically continued to z arbitrarily close to the spectrum on the real line, yielding information about the spectrum. All this would seemingly fail in the case of a nonnormal (and in particular non-Hermitian) A, with eigenvalues that in general cover a two dimensional region in the complex plane. However, using a trick introduced by Ref. [24], we can turn this problem to an auxiliary problem of averaging the Green’s functions for a Hermitian matrix. By doubling the degrees of freedom, one defines a z-dependent, 2N × 2N Hermitian “Hamiltonian”

H(z)(0MzJMzJ0), (3.8)

and the corresponding 2N × 2N resolvent matrix or Green’s function depending on a new complex variable η:

G=(η,z;J)(ηH(z))1=(η(MzJ)(MzJ)η2MzJ(MzJ)(MzJ)η2(MzJ)(MzJ)(MzJ)η2η(MzJ)(MzJ)η2). (3.9)

For ηi0, we see that

G(0,z;J)=(0(MzJ)(MzJ)10). (3.10)

and thus from Eq. (3.3), for any realization of J

G(z;J)=limηi0G21(η,z;J) (3.11)

Here, we have used the notation

G(η,z;J)=(G11(η,z;J)G12(η,z;J)G21(η,z;J)G22(η,z;J)), (3.12)

where Gαβ (with α, β ∈ {1, 2}) are N ×N matrices, forming the four blocks of G. We have written the limit in Eq. (3.11) as ηi0 to emphasize that until the end of our calculations η is to retain a nonzero imaginary part, which serves to regularize the denominators in Eq. (3.9); c.f. the discussion after Eq. (3.35). We will be carrying out a perturbation expansion in powers of J, so we decompose the Hamiltonian according to

H(z)=H0(z)J, (3.13)
J=(0JJ0),H0(z)(0MzMz0). (3.14)

We will sometimes use a tensor product notation to denote matrices in this doubled up space, e.g. writing J = σ+J + σJ , where we defined the 2 × 2 matrices

σ+=(0100)σ=(0010). (3.15)

By a slight abuse of notation we also denote 2N × 2N matrices σ±1N×N by σ±, and we will denote the identity matrix in any space by 1. From Eqs. (3.11) we obtain and tr[(RL)-1 G(z;J)] = - tr [(σ+ ⊗(RL)-1)G(i0+, z;J)], and from Eq. (3.7)

ρ(z)=limηi01πztr((σ+(RL)1)G(η,z)), (3.16)
=limηi01πztr(((RL)1)G21(η,z)), (3.17)

where we defined

G(η,z)G(η,z;J)J. (3.18)

Having expressed ρ(z) in terms of the ensemble average of the Green’s function for a Hermitian matrix, we now develop the diagrammatic method for calculating ensemble averages of products of G(η, z; J) (including G(η, z)). Note that, being the Green’s function of a Hermitian matrix, G(η, z; J) and hence G(η, z) = 〈G(η, z; J)〉J are analytic functions of η for η outside the real line, and therefore analytic continuation can be used to take the limit ηi0 after obtaining the average over J for η sufficiently away from the real line.

We will denote the elements of a generic 2N × 2N matrix A by Aabαβ, where the Greek indices range in {1, 2} and the Latin indices range in {1, …, N }. Using this notation, the definition Eq. (3.14), and Eq. (3.1), we can write the covariance for the components of J as

JabαβJcdγδJ=1Nδadδbc(σαβ+σγδ+σαβσγδ+) (3.19)

(the terms proportional to σ+σ+ and σσ involve 〈JabJcd〉, or its complex conjugate, which vanish for the complex Gaussian ensemble). It will be more handy to rewrite the parenthesis on the right side of Eq. (3.19) as παδ1πγβ2+παδ2πγβ1, where

π1(1000)π2(0001), (3.20)

yielding

JabαβJcdγδJ=1Nr=12(παβrδad)(πγβrδcb). (3.20)

Also, since Jab have zero mean, we have 〈JJ = 0.

The starting point of the diagrammatic method is the perturbation expansion of G(η, z; J) = (ηH0(z)−J)−1 in powers of J

G(η,z;J)=G(η,z;0)[JG(η,z;0)]n (3.22)

where G(η, z; 0) is given by Eq. (3.9) with the J ’s set to zero. This equation is represented diagrammatically in the third line of Fig. 11; the thin arrows defined in the first line of the figure represents G(η, z; 0), and the dashed lines represent a power of J before ensemble averaging. To obtain the average resolvent, G(η, z), we then average Eq. (3.22), term by term, with respect to the ensemble Eq. (3.2). Since the measure is Gaussian with zero mean, according to Wick’s formula, the average of each term of Eq. (3.22) involving n factors of J is given by a sum over the contributions of all possible complete pairings of the J ’s in that term (in particular, since 〈J〉J = 0, terms in Eq. (3.22) with odd powers of J vanish after averaging). Each pairing can be represented as a Feynman diagram, as shown in Fig. 11, the first two lines of which define the diagram elements. For example, the last diagram in the fourth line of Fig. 11 shows one possible pairing of the term in Eq. (3.22) corresponding to n = 6. The contribution of each pairing diagram is given by a product of factors, one per each pair, given by Eq. (3.21) (represented by wavy lines) with the right indices for that pair, as well as the factors of G(η, z; 0) (represented by thin arrows), with all the intervening Greek and Latin matrix indices summed over their proper ranges. We show in Appendix A that for Im η ≠ 0, and so long as ∥(RL)−1∥ remains bounded as N → ∞, only non-crossing pairings need to be retained in the large N limit, as crossing pairings are suppressed by inverse powers of N and do not contribute in the limit (a pairing diagram is non-crossing if it can be drawn on a plane, with the wavy lines drawn only on the half-plane above the straight arrow line, without any wavy lines crossing). As the last two lines of Fig. 11 demonstrate, all non-crossing diagrams can be generated by iterating the equation

FIG. 11.

FIG. 11

The first two lines define different elements of Feynman diagrams: the Green’s function for J = 0 (zero disorder), Gabαβ(η,z;0), the covariance of two J elements, the ensemble averaged Green’s function, G(η, z) ≡ G(η, z; J) J, and the self-energy Σ(η, z), Eq. (3.24) (the matrix indices for G(η, z) and Σ(η, z) are arranged as for Gabαβ(η,z;0)). The third line is the diagrammatic representation of the expansion Eq. (3.22) of G(η, z; J) before averaging over J, where the J’s are represented by dashed lines. Averaging over Eq. (3.2) is performed by pairing all J ’s and connecting them with the wavy lines representing 〈J J〉. In the large N limit, the contribution of crossing pairings is suppressed by negative powers of N ; the sum of all non-crossing diagrams, shown on the fourth line, yields the leading contribution to G(η, z) for large N. The last line shows the diagrammatic representation of Eq. (3.23), which if iterated generates all the non-crossing diagrams. Alternatively, G(η, z) can be found by solving this self-consistent equation directly.

G(η,z)=G(η,z;0)+G(η,z;0)(η,z)G(η,z), (3.23)

starting from G(0)(η, z) = G(η, z; 0). This equation is represented diagrammatically in the last line of Fig. 11, with the “self-energy” matrix, Σ(η, z), defined by the diagram in the second line of that figure, i.e.

(η,z)JG(η,z)JJ. (3.24)

Using Eq. (3.21) we obtain

adαδ(η,z)=δadr=12παδr1NTr(π3rG(η,z)), (3.25)

which using Eq. (3.20) we can write as

(η,z)=(ig2(η,z)100ig1(η,z)1), (3.26)

where we defined the scalar functions

gα(η,z)itrGαα(η,z). (3.27)

Using Eq. (3.26) we can solve Eqs. (3.23)(3.26) for G(η, z) at once, in terms of gα(η, z), and then use Eq. (3.27) to obtain a self-consistency equation, which can be solved for gα(η, z). To this end, we multiply Eq. (3.23) by G−1(η, z; 0) on the left, and by G−1(η, z) on the right, to obtain

G(η,z)=[G1(η,z;0)(η,z)]1=[ηH0(z)(η,z)]1. (3.28)

Using this expression with Eqs. (3.14) and (3.26), it can be easily checked that

G(η,z)=((η+ig1)K11(zM)K21(zM)K11(η+ig2)K21), (3.29)

where K1MzMz+(g1iη)(g2iη) and K2MzMz+(g1iη)(g2iη), and we dropped the argu(η, z) for succinctness. Imposing Eq. (3.27) we obtain the self-consistency equations

g1=(g1iη)tr(K11), (3.30)
g2=(g2iη)tr(K21). (3.31)

Before solving these equations for g1 and g2, we first show that tr (K1-1) = tr(K2-1) One way to see this is to use the singular value decomposition (SVD) of Mz in the form

Mz=UzSzVz, (3.32)

where Sz is a nonnegative diagonal matrix with the singular values of M, si(z) (i = 1, · · · , N), on the diagonal, and Uz and Vz are unitary matrices (as in Sec. II A we include possibly vanishing singular values among si(z), so that Sz, Uz and Vz are always N × N matrices). Using the invariance of trace under similarity transforms, we obtain tr(K11)=tr(K21)tr(Sz2+(g1iη)(g2iη))1. Given this equality, it is not hard to see that Eqs. (3.30) cannot be simultaneously satisfied unless g1(η, z) = g2(η, z) ≡ g(η, z), with g(η, z) satisfying

g=(giη)tr[1Sz2+(giη)2], (3.33)

or as written in the original basis

g=(giη)tr[1MzMz+(giη)2], (3.34)

Noting from Eqs. (3.26), that the self-energy is thus proportional to the 2N × 2N identity matrix, from Eqs. (3.28) and (3.9) (for J = 0) we obtain

G(η,z)=G(η+ig(η,z),z;0)=(iγMzMz+γ2MzMzMz+γ2MzMzMz+γ2iγMzMz+γ2) (3.35)

where γg(η, z) − .

According to Eq. (3.11), for our case of interest we must solve Eq. (3.34) in the limit ηi0. Note, however, that as shown in Appendix A, the non-crossing approximation is in general guaranteed to work only for Im η ≠ 0; hence the limit ηi0 must be taken after the limit N → ∞ (as already pointed out in Sec. II, taking the limits in this order is important in cases where some of the singular values in Sz vanish in the limit N → ∞). For our purposes, it suffices to let η = iE for some real positive ϵ, and take the limit ϵ → 0+ at the end. In this case one must seek a positive solution for g(iE, z) in Eq. (3.34); this is because by definition, g(η, z) = itr G11(η, z) = (tr iG11(η, z; J))J and from Eq. (3.9) we obtain g(iϵ,z)=trϵ(MzJ)(MzJ)+ϵ2J which for E > 0, is the ensemble average of the trace of a positive definite matrix and hence positive. Taking the limit N → ∞ while keeping ϵ (and hence ϵ + g) positive and nonzero, we define

K(γ,z)limNtr[1MzMz+γ2]=limNtr[1Sz2+γ2] (3.36)

for γ = g + E > 0. We can then rewrite Eq. (3.34) as

γ(1K(γ,z))=ϵ, (3.37)

with γ = g + ϵ. Since ϵ and γ = g + ϵ are positive, it follows that 1−K(γ, z) must also be positive. In the limit ϵ → 0+ there are two possible situations: 1) g, γ → 0+, in which case we must have

limγ0+K(γ,z)<1, (3.38)

or limγ→0 + K(γ, z) = 1, or 2) the solution for g stays finite and positive in the limit, while K(γ, z) → 1 as γg+. Thus in the second case g(z) ≡ limϵ→0 + g(ϵ, z) must satisfy K(g(z), z) = 1, i.e.

1=limNtr[1Sz2+g(z)2]. (3.39)

Note further that since K(γ, z) is a decreasing function of γ, in the second case we have K(0+, z) ≥ K(g(z), z) = 1, i.e.

limγ0+K(γ,z)1. (3.40)

Thus the two possible solutions are realized in complimentary regions (with a shared boundary) of the complex plane for z, respectively given by Eqs. (3.38) and (3.40).

Let us substitute the g(z) = 0 solution for the case (3.38) in Eq. (3.35), and naively set η = iE (and thus γ) to zero, to obtain

G(η=i0+,z)=G(η=i0+,z;0). (3.41)

From Eqs. (3.10)(3.11), this solution yields G(z;J)J=G21(η=i0+,z)=Mz1=R(zM)1L which is analytic outside the spectrum of M. Hence from Eq. (3.7), it yields ρ(z) = 0, at least outside the spectrum of M; a more careful analysis presented in Appendix B, in which we correctly take the limit N → ∞ in Eq. (3.17) before taking ϵ → 0+, confirms that in the region Eq. (3.38), limN →∞ ρ(z) always vanishes. We conclude that the support of limN →∞ ρ(z) is where Eq. (3.40) holds (which is Eq. (2.20) of Sec. II); here g(z) is to be found by solving Eq. (3.39), or equivalently Eq. (2.9) or Eq. (2.10). In this region, we obtain ρ(z) by substituting Eq. (3.35), with the solution of Eq. (3.39), into Eqs. (3.17). This yields Eq. (2.8), which we rewrite here as

ρ(z)=1πzE(z) (3.42)
E(z)tr[(RL)1MzMzMz+g(z)2], (3.43)

with g(z) given by Eq. (3.39), or equivalently Eq. (2.9).

We will now obtain an alternative expression for ρ(z), equivalent to Eqs. (3.42)(3.43), which explicitly shows that it depends only on the singular values of Mz. Noting that, from Eq. (2.6), (MzMz)=(RL)1Mz, we can write Eq. (3.43) as

E(z)=tr[z(MzMz)MzMz+g(z)2]. (3.44)

On the other hand, we have

ztrln[MzMz+g(z)2]=tr[z(MzMz+g2(z))MzMz+g(z)2],=E(z)+z(g2(z)), (3.45)

where to write the last term we used Eq. (2.9). Thus we obtain

E(z)=zψ(z), (3.46)
ψ(z)g2(z)+tr ln[MzMz+g(z)2]. (3.47)

or using the SVD, Eq. (3.32),

ψ(z)=g2(z)+tr ln[Sz2+g(z)2], (3.48)
=g(z)2+1Ni=1Nln[si(z)2+g(z)2]. (3.49)

Finally, substituting Eq. (3.49) in Eq. (3.46), and using Eq. (2.10), we obtain Eq. (2.12).

For the special case of M = 0, we have Mz = z(RL)−1. If we let σi to be the singular values of RL, then the singular values of Mz will be given by si(z) = |z| σi1. Substituting this in Eq. (2.10) and multiplying both sides by r2 = |z|2, we obtain Eq. (2.15). We see immediately that g(z), ϕ(z) and ρ(z) depend only on the radius r = |z|. Similarly we can rewrite Eq. (3.49) as

ψ(r)=g(r)2+1Ni=1Nln[r2σi2+g(r)2]. (3.50)

To find the spectral radius (boundary of the spectrum) r0 we have to solve Eq. (2.15) for r, setting g(r) = 0. This yields r02=1Ni=1Nσi2=RLF2, yielding Eq. (2.13). Let us define the proportion of eigenvalues lying outside a radius r from the origin by n>(r). To obtain Eqs. (2.14) and (2.16), first note that

ρ(r)=1πzzψ(z)=14π2ψ(z)=14πrr(rrψ(r)), (3.51)

where we used the expression of Laplacian, 2=x2+y2, in 2-D polar coordinates in the last equality. Using this with the definition n>(r)=2πrρ(r)rdr, we obtain n>(r)=[r2rψ(r)]r. For the limit at r → ∞, note that for r > r0, g(r) = 0 and we have ψ(r)=1Ni=1Nln(r2σi2)=2lnr2Nlndet(RL), and hence r2rψ(r)1 as r → ∞. Thus we obtain

n>(r)=1r2rψ(r). (3.52)

Differentiating Eq. (3.50) and using Eq. (2.15) we obtain

rψ(r)=2r1Ni=1N1r2+σi2g(r)2, (3.53)

and

n>(r)=1r21Ni=1N1r2+σi2g(r)2, (3.54)
=g(r)21Ni=1Nσi2r2+σi2g(r)2. (3.55)

Using Eq. (2.15) once again we obtain Eq. (2.16). Finally, using the latter together with Eqs. (3.51)(3.52) yields Eq. (2.14).

We will prove further general properties for the eigenvalue density for M = 0. Let us first define

In,k(g,r)σk(g2+σ2r2)nσ (3.56)

and

f(σ)σlimN1Ni=1Nf(σi). (3.57)

(We assume σi have a limit density, ρσ (σ), such that f(σ)σ=0f(σ)ρσ(σ)dσ is well-defined for f(σ) with sufficiently fast decay at infinity. Note that since we assumed that ∥(RL)−1∥ = (mini σi)−1 = O(1), this density has no measure at σ = 0 and hence the averages in Eq. (3.56) are non-singular for n, k ≥ 0. Also (f (σ))σ is finite as long as f (σ) = O(σ2) as σ → ∞, as we are assuming that the ∥RLF = O(1) and limNRLF2=σ2σ.) First, we obtain general expressions for ρ(r = 0) and ρ(r = r0), with r0 given by Eq. (2.13). From Eq. (2.14), πρ(r)=n>(r)(r2), which using Eq. (2.15), re-expressed as I1,0(g, r) = 1, we can write as

πρ(r)=I1,0(r2)I1,0(g2)=I2,2(g,r)I2,0(g,r). (3.58)

Using the facts that at r = 0, g = 1, and at r = r0, g = 0, we obtain

ρ(r=0)=1πI2,2(1,0)I2,0(1,0)=1πσ2σ (3.59)
ρ(r=r0)=1πI2,2(0,r0)I2,0(0,r0)=1πσ2σσ4σ. (3.60)

Using the fact that σ4 and σ−2 are anti-correlated and that σ2 = σ4σ−2, we see that σ2σσ4σσ2σ or

ρ(r=r0)ρ(r=0), (3.61)

with equality if and only if ρ(σ) is deterministic, i.e., a delta-function. This can happen if all but an o(1) fraction of the σi’s have the same limit as N → ∞; in that case the eigenvalue distribution is given by the circular law. More generally, we can prove that ρ(r) is a decreasing function of r for any choice of L and R (with M = 0). Using dρ(r)dr=2rdρ(r)d(r2), and Eq. (3.58) we obtain

dρ(r)dr=2rdI2,2d(r2)I2,0I2,2dI2,0d(r2)I2,02, (3.62)

and using dd(r2)=(r2)+(g2)(r2)(g2)=(r2)ρ(r)(g2) and In,k(g2)=nIn+1,k and In,k(r2)=nIn+1,k+2 (we will drop the explicit (g, r) dependence of In,k ’s when convenient) we find

dρ(r)dr=4rI2,02I3,42I2,2I2,0I3,2+I2,22I3,0I2,03 (3.63)

Defining

f(σ)σf(σ)(g2+σ2r2)2σ1(g2+σ2r2)2σ (3.64)

(f(σ))σ is a bonafide expectation operator) we can write

dρ(r)dr=4r[σ4g2+σ2r2σ2σ2g2+σ2r2σσ2σ+1g2+σ2r2σσ2σ2] (3.65)

or

dρ(r)dr=4r[Cov[σ2g2+σ2r2,σ2]σ2σCov[1g2+σ2r2,σ2]] (3.66)

where Cov[f,g]fgσfσgσ is the covariance under σ. Now since σ2g2+σ2r2 and σ−2 are both strictly decreasing functions of σ (since g > 0 for r < r0), while 1g2+σ2r2 is a strictly increasing function of σ (for r > 0), the first covariance on the right hand side of Eq. (3.66) is positive, while the second one is negative, and therefore

dρ(r)dr0. (3.67)

This slope is zero at r = 0 and strictly negative for r > 0 as long as Var[σ] > 0 (again when Var[σ] = 0 we obtain the circular law). At r = r0 we obtain

ρ(r0)=4r0σ2σ(σ2σσ2σ1)=4r0σ2σσ4σ3(σ2σσ6σσ4σ2) (3.68)

The curvature of ρ(r) at zero can also be evaluated by taking the limit r → 0 of the bracket in Eq. (3.66), noting that g → 1 as r → 0. We obtain

ρ(r=0)=4Var[σ2]=4Var[σ2]σ4σ20. (3.69)

IV. DERIVATION OF THE FORMULA FOR THE AVERAGE NORM SQUARED

In this section, we focus on the dynamics governed by the matrix A = M + LJR, according to Eq. (2.2), and derive the general formulae presented in Sec. II B. We will first consider the system’s response to an impulse input, I(t) = x0δ(t), at t = 0, before which we assume the system was at rest in its fixed point x = 0. We assume x = 0 is a stable fixed point, i.e. all eigenvalues of −γ1+A have negative real parts, or equivalently, all eigenvalues of A have real parts less than γ (more precisely, we assume that as N → ∞, this will be the case almost surely, i.e. for any typical realization of J ; in particular, the vertical line of z’s with real part γ must be to the right of the support of ρ(z), the average eigenvalue density for A, as found by solving Eq. (2.5)). This means that x(t) decays exponentially as t → ∞, and therefore its Fourier transform, x~(ω)eiωtx(t)dt=0eiwtx(t)dt is well-defined. Fourier transformation of Eq. (2.2) with I(t) = x0δ(t) yields iωx~(ω)=(γ+A)x~(ω)+x0. Solving algebraically for x~(ω), we obtain x~(ω)=(γ+iωA)1x0, or using Eqs. (3.3)(3.4), x~(ω)=R1G(γ+iω;J)L1x0. The inverse Fourier transform, x(t)=eitωx~(ω)dω2π yields

x(t)=dω2πeitωR1G(γ+iω;J)L1x0. (4.1)

Our goal is to study the statistics of x(t) (e.g., its moments) under the distribution Eq. (3.2). Equation (4.1) allows us to reduce this task to the calculation of various moments of G(z; J) and its adjoint, and these can be found using the diagrammatic technique. Note that, σ g2 +σ2 r2 in general, these moments involve not only the statistics of the eigenvalues, but also that of the eigenvectors of A = M + LJR; this can be seen from the spectral representation R−1G(z; J)L−1 = (zA)−1 = V (z − Λ)−1V −1 where Λ is a diagonal matrix of the eigenvalues of A, and V is the matrix whose columns are the eigenvectors of A. Here we will look at the simplest interesting statistic involving the eigenvectors: the average square norm of the state vector, namely, 〈∥x(t)∥2J〉. As we discussed in Sec. II B, its study is also motivated by the fact that transient amplification due to nonnormality of A manifests itself in the transient growth of ∥x(t)∥2 = x(t)T x(t). With a slight generalization, we derive a formula for the average of a general quadratic function x(t)T Bx(t) where B is any symmetric matrix; the norm squared corresponds to B = 1. Using, x(t)T = x(t) (x(t) is real), the identity xBx = Tr (Bxx), and Eq. (4.1), we obtain

x(t)TBx(t)=dω12πdω22πeit(ω1ω2)Tr(BRG(γ+iω1;J)CLG(γ+iω2;J)), (4.2)

where we defined CLL1x0x0TL and BRR−†BR−1. Using Eq. (3.11) and G(z; J) = − limηi0+ G (η, z; J), and the 2 × 2 matrices πr defined in Eq. (3.20), we can rewrite the trace in Eq. (4.2) as Tr(π2BRG(0, z1; J) π1CLG(0, z2; J)), with zi = γ + i, where now the trace is performed over 2N × 2N matrices. Averaging over J we then obtain

x(t)TBx(t)J=dω12πdω22πeit(ω1ω2)F(γ+iω1,γ+iω2;B,x0x0T), (4.3)

where, for general matrix arguments B and C, we define

F(z1,z2;B,C)Tr(BG(0,z1;J)CG(0,z2;J))J. (4.4)

with

Bπ2BR,BRRBR1, (4.5)
Cπ1CL,CLL1CL. (4.6)

Before proceeding to the calculation of F(z1, z2; B, C) using the diagrammatic technique, we will also express the other quantities presented in Sec. II B in terms of F(γ + iω, γ + ; B, C), with appropriate B’s and C’s. First, we obtain the desired expression for the matrix power spectrum, Eq. (2.27), of the steady-state response to a temporally white noisy input, I(t), with covariance Eq. (2.26). Using the Fourier transform of Eq. (2.2), and following similar steps to those leading to Eq. (4.1), we can write the steady-state solution for x(t) as in Eq. (4.1) with x0 replaced by the Fourier transform of the input, I~(ω). Using this and exploiting xj(t2)=xj(t2) we can write (after averaging over the input noise)

xi(t1)xj(t2)¯=dω12πdω22πeit1ω1it2ω2Kij(ω1,ω2) (4.7)

where the Fourier-domain covariance matrix, K(ω1,ω2)x~(ω1)x~(ω2)¯, is given by

K(ω1,ω2)R1G(γ+iω1;J)L1CI(ω1,ω2)LG(γ+iω2;J)R. (4.8)

Here, the bars indicate averaging over the input noise distribution, and we defined CI(ω1,ω2)I~(ω1)I~(ω2)¯. On the other hand, the Fourier transform of Eq. (2.26) yields

CI(ω1,ω2)I~(ω1)I~(ω2)¯=2πδ(ω1ω2)CI, (4.9)

where we also exploited I~j(ω)=I~j(ω) for a real I(t). Substituting Eq. (4.9) into Eqs. (4.7)(4.8) we obtain

xi(t1)xj(t2)¯=dω2πeiω(t1t2)Cijx(ω), (4.10)

where

Cx(ω)=R1G(γ+iω;J)L1CILG(γ+iω;J)R. (4.11)

Noting that Eq. (4.10) expresses the covariance of the response as an inverse Fourier transform, we see that Cx(ω) is indeed the power spectrum of the response, as defined in Eq. (2.27). Finally note that the element, Cij, of any matrix can be expressed as Tr (ejeiTC), where ei are the unit basis vectors (i.e. vectors whose a-th component is δia). Using this trick with Eq. (4.11), and following the steps leading from Eq. (4.2) to Eq. (4.3), we see that after ensemble averaging, Cijx(ω)J can be written in the form

Cijx(ω)J=F(γ+iω,γ+iω;ejeiT,CI) (4.12)

where F was defined by Eqs. (4.4)(4.6).

Next, consider the system Eq. (2.2) being driven by a sinusoidal input I(t)=I02cosωt (the factor of 2 serves to normalize the time average of (2cosωt)2 to one), and consider the steady state response, which will also oscillate at frequency ω. Decomposing the input, I(t), and the steady-state response, xω (t), into their positive and negative frequency components (proportional to eiωt and eiωt, respectively), from Eq. (2.2) we obtain

xω(t)=2R1Re[eiωtG(γ+iω;J)]L1I0. (4.13)

Thus the norm squared of the steady state response, ∥x(t)∥ = x(t)x(t), will have a zero frequency component, plus components oscillating at ±2ω. Averaging over time kills the latter, leaving the zero frequency component intact, yielding

xω(t)Txω(t)¯=I0TL1G(z;J)RR1G(z;J)L1I0=Tr(RR1G(z;J)ρIG(z;J)) (4.14)

where z = γ + , the bar indicates temporal averaging, and we defined ρIL1I0I0TL. Generalizing to xω(t)TBxω(t)¯, averaging over the ensemble, and following the steps leading from Eq. (4.2) to Eq. (4.3), we obtain

xω(t)TBxw(t)¯J=F(γ+iω,γ+iω;I0I0T), (4.15)

where F is given by Eqs. (4.4)(4.6). Comparing Eq. (4.15) with Eq. (4.12), we also obtain

xω(t)TBxω(t)¯j=Tr(BCx(ω)J) (4.16)

which is Eq. (2.31) of Sec. II, it being understood that CI in Eq. (4.12) is replaced by I0I0T as in Eq. (4.15).

Now that we have expressed all our quantities of interest in terms of the kernel F as defined in Eq. (4.4), our task boils down to performing the average over J in Eq. (4.4) to obtain a closed formula for F with general arguments B and C. To this end, we now proceed to calculate the more general object

Fμ1ν2;μ2ν1(1;2)Gμ1ν1(1;J)Gμ2ν2(2;J)J, (4.17)

using the diagrammatic technique. Here, we adopted the abbreviated notation (1) ≡ (η1, z1) and (2) ≡ (η2, z2) for the function arguments, and µi = (αi, ai) (similarly for νi) for indices in the 2N dimensional space (as in Sec. III, α, β, …, and a, b, … denote indices in the 2 and N dimensional spaces, respectively). Once we have calculated Fµ1 ν2 ;µ2 ν1 (1; 2), we can obtain F (z1, z2; B, C), with the appropriate B and C, via

F(z1,z2;B,C)=Bν2μ1Fμ1ν2;μ2ν1(0,z1;0,z2)Cν1μ2, (4.18)

where all indices are summed over, and B and C were defined in Eqs. (4.5)(4.6).

As before, we start by using the expansion Eq. (3.22) for the two Green’s functions in Eq. (4.17). This is shown diagrammatically in the first line of Fig. 12, for the contribution of m-th and n-th terms in the expansion of the first and the second Green’s function, respectively. As before, for large N, averaging over J entails summing the contribution of all non-crossing pairings. This is indicated in the second line of Fig. 12. Finally, the third line of Fig. 12 shows that summing over all m’s, and n’s and all non-crossing pairings, is equivalent to replacing all solid lines with thick solid lines representing the average Green’s function in the non-crossing approximation, G(ηi, zi) (defined diagrammatically in the third line of Fig. (11), and given by Eq. (3.35) as we found in the previous section), and summing over all non-crossing pairings with every pairing connecting the thick arrow lines on top and bottom (and not each to itself). This procedure yields a sum over all ladder diagrams with different number of rungs, as shown in the third line of Fig. 12.

FIG. 12.

FIG. 12

Contribtutions to Eq. (4.17) in the non-crossing approximation. The first line shows Eq. (4.17) written using the expansion Eq. (3.22). The diagram shows the contribution of the m-th and n-th terms in the expansion for two Green’s functions, respectively. Thus the top (bottom) solid line contains m (n) factors of J, shown by dashed lines. In the large N limit, averaging each summand over J boils down to summing all non-crossing pairings (NCP) of the dashed lines. The second row shows a specific non-crossing pairing for the diagram shown in the first line. Finally, summing over all m, and n and all NCP’s, is equivalent to replacing all solid lines (representing G(ηi, zi; J = 0)) with thick solid lines representing the non-crossing average Green’s function, G(ηi, zi) (calculated according to Eqs. (3.28)(3.26)), and summing over all NCP’s with every pairing connecting the straight lines on top and bottom (and not each to itself). This procedure yields the ladder diagrams, the sum over which is shown in the third line.

As shown in the first row of Fig. 13, the sum of all ladder diagrams can be written as a sum

FIG. 13.

FIG. 13

The first row is the diagrammatic representation of Eqs. (4.19)(4.21). In the last term, ρ’s and λ’s are summed over. It shows the sum of all ladder diagram contributing to Eq. (4.17) (i.e. the last line of Fig. 12) in terms of D, which is defined in the second row. The first term on the right side of the first row equation (the ladder with zero rungs) is the disconnected average Eq. (4.20); it corresponds to taking the average of each Green’s function in Eq. (4.17) separately and then multiplying. The last row shows an iterative form of the equation in the second row, which can be solved to give the expression Eqs. (4.23) and (4.26) for D.

F=F0+FD, (4.19)

where

Fμ1ν2;μ2ν10(1;2)Gμ1ν1(1)Gμ2ν2(2), (4.20)

is the disconnected average of the two Green’s functions, and Fμ1ν2;μ2ν1D is the sum of ladder diagrams in which the two Green’s function are connected by at least one wavy line. The latter can be written in the form

Fμ1ν2;μ2ν1D(1;2)Gλ2ν2(2)Gμ1ρ1(1)Dρ1λ2;ρ2λ1(1;2)Gλ1ν1(1)Gμ2ρ2(2), (4.21)

where all repeated indices are summed over, and the “diffuson”, D, is given by the sum of all diagrams in the second row of Fig. 13.

To calculate D, it helps to first rewrite Eq. (3.21) as

JabαβJcdγδJ=1Nr,s=12(παδrδad)σrs1(πγβsδcb), (4.22)

where σ1=(0110) is the first Pauli matrix. This helps us because in the expansion of Fig. 13, the two factors in Eq. (3.21) involving πr and πs decouple and get absorbed in adjacent loops, or contribute to form factors in the left or right ends of the ladder diagrams. This is demonstrated in Fig. 14 for the second term in the series expansion of D shown in the second line of Fig. 13. Extending this similarly to all the terms in that expansion, we obtain

FIG. 14.

FIG. 14

The contribution to Dad;cbαδ;γβ (1; 2) from the second term in the series shown in the second row of Fig. 13, in more detail. The covariance of J in the form Eq. (4.22) is used to write this expression in a more manageable form. The repeated indices, r, t, u, s, are summed over 1 and 2. The matrices inside the loop multiply each other in cyclic order, giving rise to the trace Tr (G(2)πtG(1)πu). The whole diagram gives 1Nrs(πr1)adαδ[σ1ΠDσ1]rs(πs1)cbγβ where the “polarization matrix” ΠtuD was defined in Eq. (4.25).

Dμρ;λν(1;2)=Dad;cbαδ;γβ(1;2)=1Nr,s=12(παδrδad)Drs(1;2)(πγβsδcb), (4.23)

where µ = (α, a), ν = (β, b), λ = (γ, c), ρ = (δ, d) and we defined the 2 × 2 matrices

D(1;2)σ1+σ1ΠDσ1+=σ1n=0(ΠDσ1)n, (4.24)

and the “polarization matrix” for the diffuson

ΠrsD(1;2)tr(πrG(1)πsG(2))=tr(Grs(1)Gsr(2)). (4.25)

Here, as before, with the trace performed over the 2N dimensional space, and we used Eq. (3.20) to write the last form of ΠD. Summing the geometric series in Eq. (4.24) we obtain

D(1;2)=σ1(12×2ΠD(1;2)σ1)1. (4.26)

The 2 × 2 matrix inversion yields

D(1;2)=1(1Π12D)(1Π21D)Π11DΠ22D(Π22D1Π12D1Π21DΠ11D), (4.27)

where all ΠD ’s have arguments (1; 2) = (η1, z1; η2, z2) which were suppressed for clarity.

Going back to Eq. (4.18), we can also break up F (z1, z2; B, C) into a disconnected part and a connected part mirroring the decomposition Eqs. (4.19)(4.21):

F(z1,z2;B,C)=F0(z1,z2;B,C)+ΔF(z1,z2;B,C), (4.28)

where F 0(z1, z2; B, C) and ΔF (z1, z2; B, C) are defined as in Eq. (4.18), but with Fµ1 ν2 ;µ2 ν1 on the right side replaced by Fμ1ν2;μ2ν10 and Fμ1ν2;μ2ν1D, respectively. Using Eqs. (4.20)(4.21) and (4.23), we then obtain

F0(z1,z2;B,C)=Tr(BG(0,z1)CG(0,z2))=Tr(BRG21(0,z1)CLG12(0,z2)), (4.29)

and

ΔF(z1,z2;B,C)=1Nr,sTr(BRG2r(0,z1)Gr2(0,z2))×Drs(0,z1;0,z2)Tr(Gs1(0,z1)CLG1s(0,z2)), (4.30)

where r and s are summed over {1, 2}.

According to Eq. (4.3) we are interested in zi = γ + i (i = 1,2) for arbitrary real ωi. As we mentioned before Eq. (4.1), these trace a vertical line in the complex plane that is entirely to the right of the support of the average eigenvalue density, ρ(z), of A, i.e. they are in the region where the the valid solution of Eq. (3.34) is the trivial g(0, z) = 0. In this case, we have Eq. (3.41), and for ηi0+, from Eq. (3.10) (replacing A with M, corresponding to J = 0) we have

G(0,zi)=(0MziMzi10). (4.31)

Using this in Eqs. (4.29)(4.30) we obtain

F0(z1,z2;B,C)=Tr(BRMz11CLMz2), (4.32)

and

ΔF(z1,z2;B,C)=tr(BrG21(0,z1)G12(0,z2))×D12(0,z1;0,z2)Tr(G21(0,z1)CLG12(0,z2)) (4.33)

Using the definitions Eq. (2.6) and Eqs. (4.5)(4.6) we can simplify Eq. (4.32) to

F0(z1,z2;B,C)=Tr(B1z1MC1z2M). (4.34)

From Eqs. (4.25) and (4.31) we see that (for zi of interest and for ηi going to zero) Πrr = 0 and Π12 = Π21, and from Eq. (4.27) we obtain

D12(0,z1;0,z2)=11Π21D(0,z1;0,z2)=11tr(G21(0,z21)G12(0,z2)). (4.35)

Substituting this in Eq. (4.33) and using Eq. (4.31) once again, we finally obtain

ΔFz1,z2;B,C=tr(BRMz11Mz2)Tr(Mz11CLMz2)1tr(Mz11Mz2)

and after simplification using Eqs. (2.6) and (4.5)(4.6),

ΔF(z1,z2;B,C)=tr(B1z1MLL1z2M)Tr(RR1z1MC1z2M)1tr(RR1z1MLL1z2M). (4.36)

The general formulae of Sec. II B readily follow. Equations (2.21)(2.24), with CI replaced by x0xT, for the case of response to an impulse input follow from Eqs. (4.3), (4.28), (4.34) and (4.36), respectively. Equations (2.28)(2.30) (with Cx and ΔCx defined in Eqs. (2.23)(2.24)) for the power spectrum of the response to a temporally white noisy input, are similarly obtained from Eq. (4.12) by using Eqs. (4.28), (4.34) and Eq. (4.36), after setting B=ejeiT, C = CI and z1 = z2 = γ + (with the traces involving B=ejeiT turned into matrices in Eqs. (2.28)(2.30), using Tr(ejeiTX)=Xij). The result Eq. (2.31) for the steady state response to a sinusoidal input was already derived in Eq. (4.16).

We see that according to Eqs. (4.3) and (4.28)

x(t)TBX(t)J=[x(t)TBx(t)]J=0+ΔfB(t), (4.37)

where the two terms on the right hand side are obtained by replacing F (·, ·; B) in Eq. (4.3) with Eq. (4.34) and Eq. (4.36), respectively. The integrals over ω1 and ω2 decouple for the first term yielding the expected result for J = 0,

[x(t)TBx(t)]J=0=e2γtTr(BetMx0x0TetM),=x0Tet(γ+M)Bet(γ+M)x0. (4.38)

Unlike the J = 0 contribution, it is not possible to perform the double Fourier transform, Eq. (4.3), needed for obtaining ΔfB (t) for arbitrary M, L and R. In the next section, we will analytically calculate this for some special examples of M, with L and R proportional to the identity matrix (i.e. for iid quenched randomness).

V. CALCULATIONS FOR SPECIFIC EXAMPLES OF M

In this section we give the detailed calculations of the explicit expressions for the spectral density Eq. (2.8), the power spectrum Eq. (2.31), and the average squared norm Eqs. (2.21) and (2.25), for the specific examples of M, L and R presented in Sec. II C.

In the examples worked out in the subsections V A and V B, both R and L are proportional to the identity matrix; we take L = 1 and R = σ1. Furthermore, for such examples we will do the calculations by choosing the unit of time such that σ = 1 (notice that given Eq. (2.2), the elements of A and M have dimensions of frequency); then at the end of our calculations using the replacements t, zz/σ, γγ/σ, MM/σ, and ρσ2ρ (with the latter applying to both the eigenvalue density and the power spectral density), we obtain the result for general σ. The eigenvalue density and the norm squared ∥x2 are invariant with respect to unitary transforms, and, for L and R proportional to the identity, so is the distribution of the random part of A, Eq. (3.2). Thus by effecting a unitary transform MU M U, we can assume M is already in its Schur form Eq. (2.40) without loss of generality.

A. Single feedforward chain of length N : Mij = w δi+1,jγδij

We start with the example in Sec. II C 1, where M is

M=T=(0w000w) (5.1)

or Mij = w δi+1,j. First we calculate the eigenvalue density. According to Eqs. (2.8)(2.9), in order to calculate the spectral density, we need to calculate first the inverse of Mz M + g2 = (zM)(zM) + g2 (remember that we have set σ = 1, as we explained in the beginning of the section). To this end, notice that Kijr(zM)(zM)†lij = Qij − |w| δiN δjN where

Qij(z2+w2)δijwzδi+1,jwzδi,j+1. (5.2)

As the difference (zM)(zM)Q = −|w|2e T is single rank, we can use the Woodbury formula for matrix inversion to write

1K+g2=1Q+g2+1Q+g2eNeNT1Q+g2(1w2eNT(Q+g2)1eN), (5.3)

where eNT=(0,,0,1). (The only conditions for the validity of Eq. (5.3) is that the factor in parenthesis is not singular, i.e. eNT(Q+g2)1eNw2 we will consider the validity of this condition below.) Since Q is Toeplitz and Hermitian, it can be diagonalized easily. Using standard methods, we find that the eigenvalues and eigenvectors of Q, satisfying Qvn = λnvn, are given by

λn=zweiϕn2,ϕnπnN+1 (5.4)
vnj=2N+1(wzwz)j2sinϕnJ (5.5)

for n = 1, …, N. The eigenvectors are orthonormal vnvm=δnm, and we have the spectral representation

1Q+g2=n=1Nvn1λn+g2vn. (5.6)

Using Eqs. (5.4)(5.6) in Eq. (5.3) we obtain

tr1K+g2=I1(g,z)+1NI2(g,z)g2w2I2(g,z), (5.7)

where we defined

I1(g,z)tr1Q+g2=1Nn=1N1λn+g2 (5.8)
I2(g,z)eNT1Q+g2eN=1N+1n=1N2sin2ϕnλn+g2. (5.9)

In writing the numerator of the last term in Eq. (5.7), we used Eqs. (5.5)(5.6) to write Tr(1Q+g2eNeNT1Q+g2)=1Q+g2eN2=1N+1n=1N2sin2ϕn(λn+g2)2=I2(g,z)g2. In the N → ∞ limit, the sums in Eqs. (5.8)(5.9) can be approximated by the integrals

I1(g,z)=02π1zweiϕ2+g2dϕ2π, (5.10)
I2(g,z)=02π2sin2ϕzweiϕ2+g2dϕ2π. (5.11)

Some elementary contour integration then yields

I1(g,z)=[(z2+w2+g2)24w2z2]12, (5.12)
I2(g,z)=z2+w2+g2I1(g,z)12w2z2. (5.13)

In particular, we see that I2(0, z) = min(|w|−2, |z|−2), so that the condition for the validity of Eq. (5.3) would be violated for |z| < |w|, if g turns out to be zero. However, note that I2(g, z) is a decreasing function of g2, so for finite g2 > 0, the denominator in Eq. (5.7) is always positive (as is its numerator, for the same reason). Thus if we follow the correct procedure of Eq. (2.19)(2.20), taking the N → ∞ limit before sending g2 to zero, we are justified in using Eqs. (5.3) and (5.7). Furthermore, for g2 > 0 the second term in Eq. (5.7) is O(N −1), and should be neglected. Solving Eq. (2.9) (with left hand side correctly interpreted as Eq. (2.19)), which now takes the form I1(g, z) = 1, yields

g(z)2=z2w2+4w2z21. (5.14)

This is positive if and only if

w21zw2+1, (5.15)

which after the proper rescaling yields Eq. (2.36) for general σ. Note that Eq. (5.15) is precisely the region given by Eq. (2.20), which in the present case reads I1(0, z) ≥ 1. It is instructive to compare this result with what we would obtain by naively using Eq. (2.5), i.e. tr (K−1) ≥ 1, wherein g is set to zero before taking the N → ∞ limit; as we now show, that only yields the right inequality in Eq. (5.15). To see this, first note that for |w| > |z|, we can use Eq. (5.7) even for g2 = 0 (since the denominator of the last term does not vanish), which yields [(MzMz)1]=tr(K1)I1(0,z)+o(1), and by Eq. (2.5), the right inequality in Eq. (5.15). For |z| < |w|, however, we cannot set g = 0 in Eq. (5.7). In fact, when |z| < |w|, the matrix zM has an exponentially small singular value; to see this, note that the vector u with components ui=(zw)i1 satisfies (zM)u=w(zw)NeN, so that (zM)u=wzwN and since smin(zM)(zM)uu, it follows that smin(zM)wzwN, which is O(ecN) for |z| < |w|. For large enough N, this singular value alone suffices to make Eq. (2.11) (equivalent to Eq. (2.5)) hold for any |z| < |w|, as 1Nsmin(z)2 diverges despite its 1N prefactor.

Let us now calculate the eigenvalue density in the annulus Eq. (5.15). In order to use Eq. (2.8), we will first calculate

trzMK+g(z)2=ztrMK+g(z)2 (5.16)

where we used Eq. (2.9) to write the last expression. To obtain trMK+g(z)2, we will again use Eq. (5.3). In the region Eq. (5.15), the contribution of the second term in Eq. (5.3) is again suppressed by 1/N, and from Eq. (5.1) and (5.6) we have trMQ+g21=w1Nn=1N(j=1N1vnivnj+1)1λn+g2. A straightforward calculation using Eq. (5.5) (and the orthonormality of vn) yields j=1N1vnjvnj+1=(wzwz)12cosϕn. Using this and approximating the sum over n with an integral, we obtain

tr[MQ+g(z)2]wzz02πcosϕzweiϕ2+g(z)2dϕ2π,=12z[(z2+w2+g(z)2)I1g((z),z)1]. (5.17)

Using Eq. (5.17) with I1(g(z), z) = 1 (true in the region Eq. (5.15)), differentiating Eq. (5.16) with respect to z, and substituting in Eq. (2.8), we finally obtain

ρ(z)=1π[1w24w2z2+1], (5.18)

for z in the region Eq. (5.15). After the proper rescaling this yields Eq. (2.37).

We now turn to the calculation of 〈∥x(t)∥2J, using (2.25). To calculate the trace in the denominator of Eq. (2.25), first note that for Eq. (5.1) the expansion (zM)1=n=0N1Mnzn+1 terminates and is exact, yielding.

[1zM]i,j=1z(wz)ji, (5.19)

for ji, and zero otherwise. Turning the sums in the trace into a sum over the nonzero diagonals of Eq. (5.19) we obtain

tr(1z2M1z1M)=1z2z1n=0N1(1nN)qn (5.20)

where qw2(z2z1) and zi = γ + i. The condition of stability of Eq. (2.2) requires the entire spectrum of −γ1+A = −γ1+M +J to be to the left of the imaginary axis. By Eq. (5.15), this requires γ>w2+1>w. It follows that |q| < 1, and therefore the geometric series Eq. (5.20) converges as N → ∞. Summing the series and retaining terms of leading order as N → ∞, we obtain

tr(1z2M1z1M)=1z2z1w2. (5.21)

If we set the initial condition x0 in Eqs. (2.25) (or the input amplitude I0 in Eq. (2.31)) to eN = (0, · · · , 0, 1), and use Eq. (5.19), we find that the numerator in Eq. (2.25) is also given by the right hand side of Eq. (5.21). Using this and Eqs. (5.21), we obtain

F(z1,z2)=1z2z1w21, (5.22)

for the integrand of Eq. (2.25) which we denoted by F (z1, z2), with zi = γ + i, (i = 1, 2). By comparing the integrand of Eq. (2.25) with Eq. (2.33), we see that to obtain the total power spectrum for the input amplitude I0 = I0(0, · · · , 0, 1)T, we need to multiply Eq. (5.22) by I02I02, and substitute z1 = z2 = γ +. With the proper rescaling, this yields Eq. (2.39) for general σ. To obtain the formula for 〈∥x(t)∥2J, we substitute Eq. (5.22) with zi = γ + i for the integrand of Eq. (2.25). Changing the integration variables by ω1 = Ω + ω/2 and ω2 = Ω − ω/2, we obtain

x(t)2J=dω2πeitωdΩ2π1Ω2+(γ+iω2)2w21,=12dω2πeitω(γ+iω2)2w21. (5.23)

Finally consulting a table of Laplace transforms [51], we obtain

x(t)2J=e2γtI0(2tw2+1),(t0) (5.24)

where I0(x) is the 0-th modified Bessel function. Implementing the rescalings t, γγ/σ and ww/σ, we obtain Eq. (2.38).

B. N/2 feedforward chains of length 2

Here we carry out the explicit calculations for the example of Sec. II C 2 where M is given by Eq. (2.40) (without loss of generality, we assume M has its Schur form), using formulae (2.8)(2.9) for the spectral density and Eq. (2.25) for 〈∥x(t)∥2J. First we will calculate the eigenvalue density. From Eq. (2.40), KMzMz=(zM)(zM) (we are setting L = R = 1 in Eq. (2.6); see the comments at the beginning of this section) is a block-diagonal matrix with 2 × 2 diagonal blocks, with the b-th block (b = 1, …, N/2) given by

(zwb0z)(z0wbz)=(z2+wb2wbzwbzz2), (5.25)

where wb is the corresponding Schur weight in Eq. (2.40). Likewise, (K + g2)−1 whose trace appears in Eqs. (2.9) is given by a block-diagonal matrix with diagonal blocks 1(z2+g2)2+wb2g2(z2+g2wbzwbzz2+g2+wb2) . Taking the normalized trace we thus obtain

tr(K+g2)1=z2+g2+12wb2(z2+g2)2+wb2g2b, (5.26)

where (·)b means averaging over the N/2 blocks, i.e. f(wb)b1N2b=1N2f(wb).

Let us first calculate the support boundary of ρ(z). As discussed in Sec. II A, when (for |z| ≠ 0) all singular values of Mz = zM are bounded from below as N → ∞, the support is correctly given by Eq. (2.5) (we will discuss cases in which some si(z) are o(1) further below). Setting g = 0 in Eq. (5.26), and substituting in Eq. (2.5), this yields

1trK1=z2+μ2z4, (5.27)

where we defined μ2=12wb2b=tr(MM). It follows that the support is the disk |z| ≤ r0, where

r02=12+14+μ2. (5.28)

The replacements µµ/σ and r0r0 then yield Eq. (2.44).

From Eqs. (2.9) and (5.26), within the support, g2(z) is found by solving the equation

tr1K+g2=z2+g2+12wb2(z2+g2)2+wb2g2b=1, (5.29)

while for |z| > r0 we have g(z) = 0. It is clear from Eq. (5.29) that g2(z) depends on z and z only through |z| ≡ r. From Eq. (2.8), within its support the eigenvalue density is given by

πρ(z)=ztr[Mz(K+g2)1]=1ztr[M(K+g2)1], (5.30)

where we are now using the short-hand g2 = g2(|z|) (the solution of Eq. (5.29)), and in writing the second line we used Mz=zM and Eq. (5.29). From Eqs. (2.40) and (5.26) we see that M(K+g2)1 has the same block-diagonal structure as Eq. (2.40), and a short calculation shows that tr[M(K+g2)1]=z, where we defined

I3(r)12wb2(r2+g(r)2)2+wb2g(r)2b. (5.31)

I3(r) is manifestly positive (assuming some wb are nonzero), while when g2 > 0, from Eq. (5.29) we have I3(r) ≤ 1, and thus

0<I3(r)1. (5.32)

Replacing this in Eq. (5.30), and using zf(z)z=2rf(r)rr=z, we obtain

πρ(z)=12rr[r2r2I3(r)], (5.33)

for r = |z| ≤ r0, and zero otherwise; the spectral density is rotationally symmetric and depends only on r = |z|. The advantage of writing the density as a complete derivative is that it can be immediately integrated to yield n< (r), the proportion of eigenvalues with modulus smaller than some radius r. We have n<(r)=2π0rρ(r)rdr, which upon substitution of Eq. (5.33), yields

n<(r)=r2(1I3(r))(rr0). (5.34)

Likewise, we define n> (r) ≡ 1 − n< (r) to be the proportion of eigenvalues with modulus larger than r. From these definitions we have

ρ(r)=12πrn<(r)r=12πrn>(r)r, (5.35)

and from Eqs. (5.34) and n> (r) = 1 − n< (r), after some manipulation exploiting Eq. (5.29), we obtain

n>(r)=g(r)2(1+I3(r)). (5.36)

We see that beyond the radius r at which g2 vanishes (which when all wb’s are bounded is r = r0), n> (r) and ρ(r)=12πrn>r vanish identically, while for smaller r they are positive.

In cases in which some wb grow without bound as N → ∞, some singular values, si(z), of Mz = zM are o(1), and more care is needed. First, to see this, note that by definition si(z)2 are the eigenvalues of the block-diagonal K=MzMz; thus they come in pairs composed of the eigenvalues of K’s 2 × 2 blocks, given by Eq. (5.25). We denote the pair of eigenvalues corresponding to block b by sb, (z)2, with the plus and minus subscripts denoting the larger and smaller singular value, respectively. The sum sb+(z)2 + sb (z)2 and the product sb+ (z)2sb (z)2 are given by the trace and determinant of Eq. (5.25), i.e. by |wb|2 + 2|z|2 and |z|4, respectively. It follows that for blocks where the feedforward weight wb is O(1), both sb,±(z) will be Θ(1) for |z| ≠ 0, while for blocks in which as N, we have

sb+2(z)=wb2+O(1) (5.37)
sb2(z)z4wb2=o(1). (5.38)

(Note that as stated after Eq. (2.3) we assume MF2=μ2=wb2b2 is O(1), so that at most o(N) number of weights can be unbounded, and each such wb can at most be O(N).) If all the wb are O(1), and hence all singular values are Θ(1) (for |z| ≠ 0), Eq. (5.28) yields the correct support radius as noted above, and for rr0, Eq. (5.29) yields a Θ(1) solution for g(r), which leads to a Θ(1) solution for n> (r) and ρ(r) via Eqs. (5.36)(5.35). In cases in which some wb are unbounded, however, Eq. (5.28) (derived from Eq. (2.5)) may not yield the correct support boundary. Such cases are examples of the highly nonnormal cases mentioned in the general discussion after Eq. (2.12), for which the support of limN →∞ ρ(z) must be found by using Eqs. (2.19)(2.20). This is equivalent to solving Eq. (5.29) after the limit N → ∞ is taken (assuming g2 > 0), and then finding where the solution for g2(|z|) vanishes, which yields the correct support radius. From Eq. (5.36) this is indeed the radius at which limN →∞ n>(r) and hence limN →∞ ρ(r) vanish as well. This radius is in general smaller than r0 as given by Eq. (5.28).

We now calculate ρ(z) for two specific examples of M from each group. The first example is that of equal and O(1) feedforward weights in all blocks, which we denote by w (in terms of Eq. (2.41), this case corresponds to K = w1). Here we can drop the block averages in Eqs. (5.29) and (5.31), replacing wb with w. Solving Eqs. (5.29) for g2(|z| = r) we find

g2(r)=12w2r2+121+w4+4w2r2. (5.39)

Substituting this into Eq. (5.31) and Eq. (5.34) yields

n<(r)=r2w2r21+1+w4+4w2r2. (5.40)

The replacements ww/σ and rr/σ then yield Eq. (2.45) for general σ, and ρ(r) can be caclulated using Eq. (5.35).

The second case is that of Eq. (2.42). In this case only one of the blocks has a nonzero Schur weight given by |w1|2 = Tr (M M ) = N µ2 = O(N ), where µ = O(1) is given by Eq. (2.43). Equation (5.29) now yields

1=1r2+g2+μ2(r2+g2)2+Nμ2g2r2g2r2+g2, (5.41)

or

r2+g21r2g2=μ2(r2+g2)2+Nμ2g2. (5.42)

The right hand side of this last equation is I3(r), as follows from Eq. (5.31); thus using Eq. (5.42) we can rewrite Eq. (5.36) as

n>(r)=g2(r)2r21r2g2(r). (5.43)

Let us now solve Eq. (5.42) to find g(r)2. As noted above, and in accordance with the general prescription given after Eq. (2.12), for the purpose of obtaining limN →∞ ρ(z) we have to first take the N → ∞ limit in Eq. (5.41), keeping g2 > 0 fixed, and only then solve for g2. Doing so makes the last term in Eq. (5.41) vanish, and we obtain g2(r) = 1 − r2. This is positive for r ≤ 1 and vanishes at r = 1, the correct support radius of limN →∞ ρ(z), which is strictly smaller than r0 given by Eq. (5.28). From Eq. (5.43) we obtain n> (r) = g (r) = 1 − r. It then follows from Eq. (5.35) that the N → ∞ limit of the eigenvalue density is identical with the circular law (the result for the M = 0), i.e. limNρ(r)=1π for r ≤ 1 and zero otherwise. With the correct scaling, this yields Eq. (2.46).

Contrary to the general prescription given after Eq. (2.12), we will now solve equations Eqs. (5.29), (5.31) and Eq. (5.36) for r > 1, without taking the limit N → ∞ first. As we will see, the obtained solution for g(r)2, and by Eqs. (5.32) and (5.36) therefore the solutions for n>(r) and ρ(r), will be nonzero but o(1) for 1 < rr0. As discussed in Sec. II C 2, these finite-size corrections, which in general are not trustworthy, in the present case are in surprisingly good agreement with simulations for some range of r’s beyond rΘ(1), but deviate from the true n>(r) for larger r (see Fig. 7). At finite N, it can indeed be checked that Eq. (5.29) has a positive solution for g2 if and only if r < r0, with r0 given by Eq. (5.28). Simplifying Eq. (5.42) yields a cubic equation in g2. However, it turns out that ignoring the cubic term in g2 is harmless for large N ; the quadratic approximation has the positive solution

g2(r)=[1r22]2+r2(r2+μ2)r6μ2N+1r22, (5.44)

and for all r < r0, corrections to Eq. (5.44) when the cubic term is reinstated decay faster than the leading contribution from Eq. (5.44) (nevertheless we numerically solved the full cubic equation (5.29) to obtain the black curve in Fig. 7, and the blue trace in Fig. 6). First, analyzing Eq. (5.44) we see that g2(r) is indeed Θ(1) only for r < 1, where as we already found g2(r) = 1 − r2 + o(1). Furthermore, for a fixed r > 1 (such that r − 1 does not vanish as N → ∞), the solution for g(r) is O(N −1). Thus from Eq. (5.43) we thus see that N n> (r), i.e. the total number of eigenvalues with modulus larger than r, for 1 < r < r0 (and r − 1 = Θ(1)) is only O(1); the solution for N n> (r) is shown in Fig. 7. Correspondingly, from Eqs. (5.43) and (5.35) we see that ρ(r) is o(1) in this region and vanishes in the limit N → ∞, as already found.

Now let us calculate the total number of eigenvalues lying outside the circle |z| = 1. This is given by N n> (1). From Eq. (5.44) we find g2(1)=1N, and substituting in Eq. (5.43) we obtain

N>(1)Nn>(1)=N+O(1). (5.45)

With the proper rescaling this yields Eq. (2.47) for general σ. Note that, according to Eq. (5.44), g(r) (and hence n> (r)) remains Θ(N − 1/2) (as opposed to O(N −1)) in a thin boundary layer outside of width Θ(N −1/2) just outside of the circle |z| = 1.

We will now work out the formula for 〈∥x(t)∥2J, Eqs. (2.25)(2.21), when the initial condition x0 is the second Schur-vector in block b = a, which we denote by ea2; in the Schur representation, Eq. (2.40), we have ea2 = (0, 1)T (we only write the components of ea2 in block a). To calculate the numerator in Eq. (2.25) we first calculate (zTa2) where Ta=(0wa00) denotes the a-th diagonal 2 × 2 block of Eq. (2.40). Since Ta2=0, we have (zTa)1=z1+z2Ta, which yields va(z) ≡ (zTa)−1ea2 = (waz−2, z−1)T. We thus obtain

x0T1z2M1z1Mx0=va(z2)va(z1)=1z1z2+wa2z12z22. (5.46)

On the other hand, we have

tr1z2M1z1M=12Tr2×2(z2Tb)1(z1Tb)1b=1z1z2+wb2b2z12z22. (5.47)

Substituting Eqs. (5.46)(5.47) in Eq. (2.25) we obtain

F(z1,z2)=z1z2+wa2(z1z2)2(z1z2+μ2). (5.48)

where we used µ2 = (|wb|2)b//2, and we denoted the integrand of Eq. (2.25) by F (z1, z2) with zi = γ + i, (i = 1, 2). By comparing the integrand of Eq. (2.25) with Eq. (2.33), we see that substituting z1 = z2 = γ + into Eq. (5.48) yields the total power spectrum, xw2¯J. After the proper rescalings, this yields Eq. (2.50) for general σ. To obtain x(t)2J, on the other hand, we should substitute Eq. (5.48) into Eq. (2.21) with zi = γ+i. Let us use the change of variables ω1 = Ω+ω and ω2 = Ω − ω. Then we have z1z2=Ω+(γ+iω)2 and from Eq. (2.21) we obtain

x(t)2J=dω2πe2itωfa(γ+iω) (5.49)

where we defined

fa(u)2dΩ2πΩ2+u2+wa2(Ω2+u2)2(Ω2+u2+μ2). (5.50)

Let us rewrite the integrand in Eq. (5.50) as

Ω2+u2+wa2(Ω2+u2r02)(Ω2+u2+r12)=Ω2+u2+wa2r02+r12×[1Ω2+u2r021Ω2+u2+r12], (5.51)

where r02 was defined in Eq. (5.28) and

r12r0210. (5.52)

One can calculate the integral over Ω in Eq. (5.50) by contour integration, closing the contour, say, in the upper half of complex plane. The poles of the first and the second terms on the second line of Eq. (5.51) are located at Ω0,±=±iu2r02 and Ω1,±=±iu2+r12, respectively. For u = γ + (γ > 0) the roots falling in the upper half plane are Ω0,+ and Ω1,+, independently of ω. From their residues we obtain

fa(u)=1r02+r12[r02+wa2u2r02+r12wa2u2+r12]. (5.53)

The integral of Eq. (5.53) in Eq. (5.49) is essentially the inverse Laplace transform of Eq. (5.53). Consulting a table of Laplace transforms [51] yields

x(t)2J=e2γt[r02+wa2r02+r12I0(2r0t)+r12wa2r02+r12J0(2r1t)]. (5.54)

where J0(x) (I0(x)) is the 0-th Bessel function (modified Bessel function). From Eqs. (5.28) and (5.52) it follows that r02+r12=1+4μ2, and using μ2wb2b2 once again, we obtain

x(t)2J=[1+Ca2I0(2r0t)+1Ca2J0(2r1t)]e2γt (5.55)

where we defined

Ca1+2wa21+2wa2b. (5.56)

Effecting the proper rescalings we obtain the result for general σ, Eqs. (2.48)(2.49).

C. Network with different neural types and independent, factorizable weights

Here we carry out the explicit calculations for the network with C neural types presented Sec. II C 2, with M, L and R are given by Eqs. (2.52)(2.55). From Eqs. (2.6) and (2.52)(2.55) we obtain Mz = z(RL)−1suuT , and

MzMz=z2(RL)2zsvuTzsvuT+s2uuT, (5.57)

where we defined v ≡ (RL)−1u. Using the Woodbury matrix identity we can write

1g2+MzMz=QQU1D1+UQUUQ (5.58)

where we defined the N × 2 matrix U = (u , v), and

Q1g2+z2(RL)2 (5.59)
D(s2zszs0). (5.60)

We will argue that for g>0,tr(g2)+MzMz1=trQ up to o(1) corrections. From Eq. (5.58), for the remainder Δ(g,z)(g2+MzMz)1trQ, we obtain

Δ(g,z)=1NTr[UQ2UD1+UQU] (5.61)

where the trace is now over 2 × 2 matrices. We have

D1=(0(zs)1(zs)1z2) (5.62)

and for n = 1, 2 we obtain

UQnU=(uQnuuQnvuQnvvQnv)=(In,0In,1In,1In,2), (5.63)

where

In,k(g,z)1Ni=1N(lc(i)rc(i))k[g2+z2(lc(i)rc(i))2]n (5.64)
=σck(g2+σc2z2)nc (5.65)

and we are using the notation Eq. (2.59) (we will drop the explicit g and z dependence of In,k when convenient). Note that all In,k (g, z) are O(1) and for even k are positive. Inverting D−1 + U QU we obtain

Δ(g,z)=1NT(g,z)det(D1+UQU) (5.66)

where

T(g,z)Tr[(I2,0I2,1I2,1I2,2)(I1,2z2(zs)1I1,1(zs)1I1,1I1,0)]I2,2I1,0+I2,0(I1,21z2)2I2,1(I1,11sRez)

and

det(D1+UQU)=I1,0(1z2I1,2)+I1,11sz2=g2z2I1,02+I1,11sz2. (5.67)

We see that both T (g, z) and −det(D−1 + U QU ) are O(1) (to obtain their limits as N → ∞ we can set s−1 = O(N −1/2) equal to zero) and since det(D1)(D1+UQU)g2z2I1,02 and I1,02>0 we see that for g > 0, the denominator in Eq. (5.66) is bounded away from zero, and hence Δ(g, z) = O(N −1) and can be safely ignored for g > 0.

We will thus use tr(g2+MzMz)1=trQ+o(1). From Eq. (5.59) we obtain tr Q = I1,0(g, z) and hence from Eqs. (2.19),

K(g,z)=limNtrQ=1g2+σc2r2c (5.68)

where r ≡ |z|. Note that the approximation tr(g2)+MzMz1=trQ is equivalent to using MzMz=z2(RL)2 instead of the full expression Eq. (5.57) and hence to setting M = 0. Accordingly, the support of the eigenvalue distribution is given by Eq. (2.13), or equivalently by Eq. (2.60), and within this support, g2 is depends only on |z| = r and is found by solving Eq. (2.15), or equivalently Eq. (2.62). Similar considerations show that in using Eq. (2.8) to obtain limN →∞ ρ(z) we can set M = 0, yielding an isotropic eigenvalue density. From Eqs. (2.14)(2.16), the proportion, n>, of eigenvalues lying a distance larger than r is equal to g2(r), which is found by solving Eq. (2.62). The results Eqs. (2.17)(2.18) also hold, wherein the normalized sums over i can be replaced with appropriate averages (·)c.

Let us now go back to the expression for Δ(g, z), and consider the case g = 0. In this case

In,k(0,z)=z2nσc2nkc, (5.69)

and we obtain

T(g,z)=z6(σc2c22σc3cσcc)+2σc3cs1z4Rez (5.70)

and

det(D1+UQU)=z2σccs1z2. (5.71)

In the special case in which (σc)c = 0 (this corresponds to the special case of the example Eq. (2.42) with f µE − (1 − f )µIu · v = 0, which we considered above), the determinant will have a vanishing limit as N → ∞ (or s−1 → 0). This leads to a finite limit for Δ(0, z) and we obtain

Δ(0,z)=s2σc22Nz4=ξ2σc22z4,(σcc=0). (5.72)

Adding this to tr Q in the right side of Eq. (5.68), and using the naive formula Eq. (2.5) or K(0, z) = 1 for the spectral boundary, we would have obtained the equation

1=σc2cr2+ξ2σc22r4. (5.73)

This in turn yields the radius Eq. (2.61) which is larger than the true boundary of the support of limN →∞ ρ(z) given by Eq. (2.60).

VI. CONCLUSIONS

We have provided a general formula for the eigenvalue density of partly random matrices, i.e. matrices with general mean and non-trivial covariance structure. General formulae have also been derived for the magnitude of impulse response and frequency power spectrum in an N -dimensional linear dynamical system with a coupling given by such partly random matrices. Our theory makes no requirement on the normality of matrices; its applications include therefore the stability and linear response analysis of neural circuits, whose linearized dynamics is always nonnormal. We have demonstrated our theory by tackling analytically two specific neural circuits: a feedforward chain of length N, and a set of randomly coupled feedforward subchains of length 2. A connection has also been revealed between the eigenvalue spectra of dense random matrix perturbations, and the theory of pseudospectra.

The non-crossing diagrammatic method can be used to calculate other quantities of interest for matrix ensembles of the form A = M + LJR, considered here as well; possible examples are direct statistics of eigenvectors [52], or the correlations of the random fluctuations of the eigenvalue density (δρJ (z)δρJ (z + w))J for macroscopic w (i.e. for |w| = Θ(1)). On the other hand, quantities such as the microscopic structure of (δρJ (z)δρJ (z + w))J, e.g. for |w| = Θ(N −1/2) with z inside the support, which could be of interest in the study of eigenvalue repulsion are not accessible to the non-crossing approximation. This is also the case, in general, for the statistics of the “outlier” eigenvalues that we discussed after Eq. (2.12) and in the examples of Sec. II C 1 and Sec. II C 2, which may be of importance in practical applications. The calculation of such quantities is possible, for example, by using the replica technique (see e.g. Ref. [53]).

Finally, there are important forms of disorder which are not covered by the general ensemble A = M + LJR with iid, and hence dense, J. Examples of relevance to neuroscientific applications include sparse A [5456] (note that, e.g., binary matrices with probability of a nonzero weight, p, which is small but Θ(1) as N → ∞ are covered by our formulae; by “sparse” disorder we refer, e.g. to the case p = o(1)), or more general structure of correlations between the elements of A (in the ensemble considered in this article, and for real J, the covariance AijδAijJ=(LLT)ii(RTR)jj is single rank); the latter is of importance in considering networks with local topologies where, e.g., the matrix A has a banded structure. Generalization to other forms of random disorder is thus an important direction for future research.

Acknowledgments

We thank Larry Abbott and Merav Stern for helpful discussions. Y.A. was supported by the Kavli Institute for Brain Science Postdoctoral Fellowship and by the Swartz Program in Computational Neuroscience at Columbia University, supported by a gift from the Swartz Foundation. K.D.M. was supported by grant R01-EY11001 from the NIH and by the Gatsby Charitable Foundation through the Gatsby Initiative in Brain Circuitry at Columbia University.

Appendix A. Validity of the non-crossing approximation

In this appendix we will give the justification for the non-crossing approximation used in Sec. (III) and (IV). That is, we will show that the only diagrams not suppressed by inverse powers of N are the non-crossing diagrams. We will limit our discussion to the case of the eigenvalue density considered in Sec. (III), but the generalization to the quantities calculated in (IV) is straightforward. As explained after Eq. (3.22), averaging of G(η, z; J ) over the disorder J involves summing over all complete pairings of the factors of J in every term of the expansion Eq. (3.22), with each pairing of each term represented by a diagram as shown in Fig. 11. Each such diagram is composed of a solid directed line (each segment of which represents a factor of Gabαβ(η,z;0), with a number of wavy lines (each representing the expression Eq. (3.21), with different indices) connecting different points on the solid arrow line, and all the internal matrix indices summed over. For the purpose of calculating the eigenvalue density, according to Eqs. (3.16)(3.18), what we need to calculate is actually tr(σ+(RL)−1 (G(η, z; J ))J ); thus we can imagine the solid arrow making a loop by closing-in on itself sandwiching σ+ ⊗ (RL)−1 (see Fig. 15).

FIG. 15.

FIG. 15

(Color online) The orbits (shown by thin red paths) for two diagrams for the spectral density in a complex J ensemble. The non-crossing diagram on top has three orbits: orbit (1) is the external orbit connecting the two ends of the Green’s function, while orbits (2) and (3) are the internal orbits. As in Eqs. (A1) and (A2), they contribute tr(σ+G(η,z;0)πr1G(η,z;0)),Tr(π3r1G(η,z;0)πr2G(η,z;0)) and Tr(π3r2G(η,z;0)) respectively, with r1 and r2 summed over 1 and 2 (cf. Eq. (3.21)). The trace contributed by each of the three orbits is O(N ), which when combined with the three factors of 1/N accounting for the two wavy lines and the normalization of the external orbit’s trace, yield an O(1) expression for this diagram. By contrast, the crossing diagram on the right has no internal orbits. Its only external orbit contributes Tr(σ+G(η,z;0)πr2G(η,z;0)π3r1G(η,z;0)π3r2G(η,z;0)πr1G(η,z;0)) which after normalization is O(1). Accounting for two factors of 1/N coming from the wavy lines, we then see that this crossing diagram is O(N −2) and hence is suppressed as N → ∞.

Given the structure of the Kronecker deltas in Eq. (3.21), it is more convenient for our purpose here, however, to think of each diagram as a number of “orbits,” each formed by starting somewhere on the solid line and moving on it always along its arrow until the next wavy line is encountered, whereby we leave the solid line, continuing on the wavy line without crossing it (because Eq. (3.21) is composed of two Kronecker deltas, one for each side of the wavy line, enforcing index identification at the corresponding ends on each side [64]) and return somewhere else on the solid line, continuing as before until we reach the initial point (see Fig. 15). As we go around this orbit, for each solid line traversed we write down, from right to left, a G(η, z; 0) and for each wavy line a πri (see Eqs. (3.20)) where i is the index of the wavy line. Because all matrix indices are summed over, such adjacent factors multiply like matrices, and since the orbit forms a loop, in the end we obtain the trace of the matrix product thus obtained. (This recipe for assigning the contribution of each orbit accounts for the Kronecker deltas and πr ’s in Eq. (3.21), but not for the factor 1N and the sum over r’s; we will account for these, at the end, after Eq. (A2).) A generic orbit, which we refer to as internal, closes on itself after traversing, say, m wavy lines sanwiching m Green’s functions (e.g. the orbits labeled 2 and 3 in panel (a) of Fig. 15), and thus contributes a trace of the form

Im,rTr(G(η,z;0)πri1G(η,z;0)πrim), (A1)

where r is short-hand for {ri1 , …, rim }, and ik are the indices of the wavy lines traversed in the orbit. In every diagram, there is also exactly one orbit (e.g. the orbits labeled 1 in both panels of Fig. 15) which in addition includes the factor σ+(RL)−1 sandwhiched between the two external Green’s functions. This orbit, which we call the external orbit, contributes a trace of the form

Em,r~Tr(σ+(RL)1G(η,z;0)πrj1πrjnG(η,z;0)) (A2)

where n is the number wavy lines the orbit traverses and r~ is short for {rj1 , …, rjn }, and jk are the indices of the wavy lines traversed in this orbit (in writing Eq. (A2) we dropped the 1N that normalizes the trace in Eqs. (3.16), but we will account for it below). For succinctness, in Eqs. (A1)(A2) we suppressed the arguments (η, z) for Im,r and En,r~ on which they depend. The full expression for the diagram is obtained by multiplying all such trace factors contributed by every orbit in the diagram, as well as a factor of N w−1 where w is the number of wavy lines in the diagrams, to account for the N −1 in Eq. (3.21) for each wavy line, as well as the extra N −1 which normalizes the trace in the external orbit Eq. (A2) as dictated by Eq. (3.16). The obtained expression is finally summed over all the r-indices corresponding to each wavy line, as required by Eq. (3.21).

The justification for the non-crossing approximation is based on the claim that each trace contributed by a orbit (external or internal) as in Eqs. (A1)(A2) is O(N ), irrespective of η, z, m or r. We will provide justification for this claim below. However, accepting it as true, we see that any diagram’s scaling with N solely depends on the number of orbits and wavy lines it contains. A wellknown topological argument then shows that the contributions of crossing diagrams are suppressed by inverse powers of N [57, 58]; for completeness we will summarize this argument here. First note that, assuming the claim, any diagram will yield an expression that is O(N α) with

αfw1, (A3)

where f is the number of orbits in the diagram (the sum over at most 2w possible configurations of ri does not contribute to the scaling with N ). Let V denote the total number of vertices in the diagram (i.e. the number of intersections of wavy lines and the solid line, plus an extra one representing the insertion of σ+(RL)−1 in the solid line loop) and let E denote its total number of edges, i.e. Ew + s, where s is the number of solid line segments (s = 5 in both panels of Fig. 15). It is easy to see that V = s. Thus we have EV = w. Formally defining the number of “faces” in the diagram by Ff + 1, and its “Euler characteristic” by

χFE+V, (A4)

we then find that χ = F − (EV ) = f + 1 − w. From Eq. (A3) we then obtain

α=χ2. (A5)

Thus the contribution of a diagram is O(N α), with α determined solely by the diagram’s formal “Euler characteristic” via Eq. (A5). It can be shown that a diagram with F formal “faces” and a formal “Euler characteristic” χ as defined above, can be drawn on (embedded in) a two-dimensional oriented surface with Euler characteristic χ, such that no edges (solid or wavy) cross to create new vertices, and each face created on the surface by its partitioning by the drawn diagram, a) is topologically a disk, and b) has a one-to-one correspondence with and is encircled by an orbit in the diagram, where we now count among the orbits, also the loop formed by the solid arrow line. Thus the number of faces on the surface is indeed F = f + 1, and the χ, as defined above for the diagram, indeed agrees with the Euler characteristic of the surface, as conventionally defined. Topologically, such a surface is a generalized torus with g holes, satisfying χ = 2 − 2g; the surface with zero holes is the sphere, or after decompactification, the plane (e.g. the diagram in panel (b) of Fig. 15 can be drawn in this manner on a torus). We thus see that

α=2g, (A6)

and therefore the only diagrams that are not suppressed by inverse powers of N are those that can be drawn, as described above, on the plane. Since we took the area enclosed by the solid arrow line loop as a face by itself, this means that the diagram should be drawable with the wavy lines remaining outside this area (in order not to partition it into several faces) without crossing each other; this is the precise definition of the diagram being non-crossing [65].

Let us now go back to justifying the claim that the traces contributed by the orbits as in Eqs. (A1)(A2) are O(N ). For this purpose we will make use of the singular value decomposition of Mz introduced in Eq. (3.32). Defining the unitary matrix

Uz(Uz00Vz), (A7)

and using Eq. (3.32), we can write H0(z), defined in Eq. (3.14), as

H0(z)=UzH~0(z)Uz, (A8)

where

H~0(z)(0SzSz0). (A9)

Let us also define G~(η,z;0)UzG(η,z;0)Uz, such that

G(η,z;0)=UzG~(η,z;0)Uz. (A10)

Then using the definition G(η, z; 0) = (ηH0(z))−1, we see that

G~(η,z;0)=1ηH~0(z)=(ηη2Sz2Szη2Sz2Szη2Sz2ηη2Sz2), (A11)

where we used Eq. (A9) to write the last equality. Given the block-diagonal nature of Eq. (A7) and the definitons Eq. (3.20), we also have

πr=UzπrUz. (A12)

We now substitute G(η, z; 0) and πri in Eqs. (A1)(A2) with the right hand sides of Eqs. (A10) and (A12), respectively. After canceling the Uz ’s we obtain

Im,r=Tr(G~(η,z;0)πri1G~(η,z;0)πrim), (A13)
En,r~=Tr(σ+A(z)G~(η,z;0)πrj1πrjnG~(η,z;0)) (A13)

where we defined

A(z)Uz(RL)1Vz, (A15)

such that Uz[σ+(RL)1]Uz=σ+A(z)σ+A(z). For the internal orbits, we see from Eq. (A11) that each G~(η,z;0), depending on whether it is sandwiched between the same projectors πr, or between two opposite projectors, πr and π3−r, contributes a diagonal factor equal to η(η2Sz2)orSz(η2Sz2), respectively. Thus, for any configuration of ri’s, if the number of Green’s functions sandwiched the second way is k (1 ≤ km), we obtain

Im,r(η,z)=i=1Nηmksi(z)k(η2si(z)2)m(1km) (A16)

for the internal orbits (in particular, we see that the sole dependence of Im,r (η, z) on r is via the number k). We therefore have

Im,r(η,z)Nmaxiηmksi(z)k(η2si(z)2)m. (A17)

When the imaginary part of η is nonzero, the denominator in the right hand side of Eq. (A17) cannot vanish for any value of si(z) (while as ηi0, which is the limit we have to take after summing up the relevant diagrammatic series, si(z) that approach zero as N grows can make this expression unbounded as N → ∞). Assuming Im η > 0, it will be sufficient for our purposes to substitute Eq. (A17) with the weaker bound

Im,r(η,z)Nmaxsηmksk(η2s2)m,(Imη>0) (A18)

where now the maximum is taken for s ranging over the whole [0, ∞). Since Im η > 0 the expression has no singularities at finite real s, and since 2m > k, it cannot diverge as s → ∞ either; thus it has a finite maximum independent of N. More precisely, it is easy to show that maxsηmksk(η2s2)m[2Imη], irrespective of k as long as 1 ≤ km, yielding

Im,r(η,z)N[2Imη]m,(Imη>0). (A19)

Similarly, the trace for the external orbit can be written in the new basis Eq. (A11) as

En,r~(η,z)=i=1NAii(z)ηnk~si(z)k~(η2si(z)2)n,(1k~n), (A20)

where k~ is the number of Green’s functions in Eq. (A2) sandwiched between two πr ’s with different superscripts; this convention works correctly for the external orbit as well, if we account for the presence of σ+ by imagining a π2 (π1) to the left (right) of the leftmost (rightmost) Green’s function. From Eq. (A15), we can write Aii(z) = ui(z)(RL)−1vi(z), where we defined the vectors ui(z) and vi(z) to be the i-th column of Uz and Vz, respectively. By the Cauchy-Schwartz inequality we then have

Aii(z)ui(z)(RL)1vi(z)ui(z)vi(z)(RL)1, (A21)

where ∥(RL)−1∥ is the operator norm, or the maximum singular value, of (RL)−1. But since Uz and Vz are unitary matrices, ui(z) and vi(z) are unit vectors, and we obtain

Aii(z)(RL)1. (A22)

Going back to Eq. (A20), this yields the bound

En,r~(η,z)N(RL)1maxiηnk~si(z)k~(η2si(z)2)n. (A23)

The only difference with the inequality for Im,r is the factor ∥(RL)−1∥. Repeating the same argument as for the internal traces, we therefore see that

En,r~(η,z)N[2Imη]n(RL)1,(Imη>0), (A24)

and thus a sufficient condition for En,r~ to be O(N ) for Im η > 0, is that ∥(RL)−1∥ remains bounded as N → ∞, i.e.

(RL)1=O(1). (A25)

Combining Eqs. (A19) and (A24), and given the prescription after Eq. (A2), we can bound the absolute value of the contribution of a diagram with genus g (or g crossings), w wavy lines, and s solid lines, by 2w[2Imη]s(RL)1N2g (the power of s is obtained by noting that the powers of m and n in the bounds Eqs. (A19) and (A24), when summed over all orbits must equal s, since every Green’s function or solid line appears in exactly one orbit). Hence for a fixed, nonzero Im η, the contribution of crossing diagrams (i.e. those with g ≥ 1) goes to zero as N → ∞. Thus if we take the limit N → ∞ before the limit ηi0+, ignoring the crossing diagrams is safe, and the expression for ρ(z) obtained from Eq. (3.16) after analytic continuation of tr(σ+(RL)−1G(η, z)) to η = i0, with G(η, z) given by the contribtiuon of non-crossing diagrams to (G(η, z; J ))J, gives the correct result for limN →∞ ρ(z). We mention that when the smallest singular value si(z) remains bounded away from zero as N → ∞, even at η = 0 the traces Eqs. (A16) and (A20) are O(N ), as is not hard to check, justifying the non-crossing approximation at η = 0. Thus it is only when some si(z) are o(1) that it becomes important to send η to i0+ only after the limit N → ∞ has been taken. In particular, in such cases, applying the limit ηi0+ to the results obtained using the non-crossing approximation before taking the limit N → ∞, may yield finite-size contributions to limN →∞ ρ(z), which in general may yield incorrect subleading corrections.

Appendix B. ρ(z) vanishes in the region Eq. (3.38)

In this appendix we prove more rigorously that in the region Eq. (3.38), the eigenvalue density vanishes. More precisely, we prove that ρ(z) ≡ limϵ→0+ limN →∞ ρN (z, E) = 0, where

ρN(z,ϵ)1πztr[(RL)1MzMzMz+γ2], (B1)

is obtained by substituting Eq. (3.35) into Eqs. (3.17). Here, γ = g(z, E) + E, is the solution of Eq. (3.37), which as we argued in Sec. III, vanishes as ϵ → 0+ when z is in the region Eq. (3.38) (note that since Eq. (3.37) is defined in the limit N → ∞, γ has no dependence on N ). Recall that for E > 0, g(z, E) is positive and therefore γ > E > 0. Expanding the derivative in Eq. (B1) we obtain

πρN(z,ϵ)=tr[(RL)1(RL)MzMz+γ2]tr[(RL)1Mz1MzMz+γ2Mz(RL)1MzMz+γ2]tr[(RL)1MzMzMz+γ21MzMz+γ2]z(γ2)=tr[(RL)1Q(RL)MzMz+γ2]tr[(RL)1MzMzMz+γ21MzMz+γ2]z(γ2), (B2)

where we defined Q=1Mz1MzMz+γ2Mz (we suppress the explicit dependence of γ on z for simplicity). By the Woodbury matrix identity Q=1MzMz+γ2 which upon substitution in Eq. (B2) yields

πρN(z,ϵ)=tr[(RL)1MzMz+γ2(RL)MzMz+γ2]γ2tr[(RL)1MzMzMz+γ21MzMz+γ2]z(γ2). (B3)

Differentiating Eq. (3.37) with respect to z yields

z(γ2)=2γ2zK1K2γ2γ2K, (B4)

with the partial derivatives of K(γ, z) given by

γ2K=T>(γ)limNTN>(γ)zK=V(γ)limNVN(γ), (B5)

where we defined

TN>(γ)tr[1(MzMz+γ2)2] (B6)
VN(γ)tr[1MzMz+γ2Mz(RL)MzMz+γ2]. (B7)

We thus obtain

πρN(z,ϵ)=γ2TN(γ)+2γ2V(γ)VN(γ)1K(γ)+2γ2T>(γ) (B8)

where we defined

TN(γ)tr[(RL)1MzMz+γ2(RL)MzMz+γ2]. (B9)

Having eliminated derivatives of γ, we now simply need to show that limγ→0+ limN →∞ of the right side of Eq. (B8) vanishes for z is in the region Eq. (3.38) (where γ = 0+ is the solution of Eq. (3.37) as ϵ → 0+).

We will start by bounding the traces TN (γ) and VN (γ) in Eq. (B8). For VN (γ) we use the singular value decomposition Eq. (3.32):

VN(γ)=tr[(RL)1MzMzMz+γ21MzMz+γ2]=tr[Uz(RL)1VzSz(Sz2+γ2)2](RL)1tr[Sz(Sz2+γ2)2] (B10)

(where in the last line we used Eq. (A22)), i.e.

VN(γ)(RL)1VN>(γ) (B11)

where we defined

VN>(γ)1Ni=0Nsi(z)(si(z)2+γ2)2. (B12)

Taking the limit N → ∞, we obtain from Eq. (B11)

V(γ)CV>(γ) (B13)

where V>(γ)limNVN>(γ), and C is an upper bound on ∥(LR)−1∥ (which we have assumed is O(1) as N → ∞). To bound TN (γ), we use the inequality

tr(ABCD)ACtr(BB)12tr(DD)12. (B14)

This can be derived by first using the Cauchy-Schwartz inequality, |tr (AB)|2 ≤ tr (AA)tr (BB), and then using the inequality |tr (AB)| ≤ ∥B∥tr (A), valid for positive semi-definite A (which in turn follows from the definition of ∥B∥ after unitary diagonalization of A). Using (B14) we obtain

tr[(RL)1MzMz+γ2(RL)MzMz+γ2](RL)12tr[(1MzMz+γ2)2] (B15)

or

TN(γ)(RL)12TN>(γ). (B16)

Using the inequalities, (B11), (B13) and (B16) in Eq. (B8) we obtain

πρN(z,ϵ)C2[γ2TN>(γ)+2γ2V>(γ)VN>(γ)1K(γ)+2γ2T>(γ)] (B17)

Taking the N → ∞ limit (while keeping γ finite), and defining ρ(z, E) ≡ limN →∞ ρN (z, E), we obtain

πρ(z,ϵ)C2[γ2T>(γ)+2(γV>(γ))21K(γ)+2γ2T>(γ)] (B18)

where we defined

T>(γ)limN1Ni=0N1(si(z)2+γ2)2 (B19)
V>(γ)limN1Ni=0Nsi(z)(si(z)2+γ2)2. (B20)

Thus, to show that limϵ→0+ ρ(z, E) = 0, it suffices to show that γ2T>(γ) and γV>(γ) vanish as γ → 0+ (since z is in the region Eq. (3.38), 1−K(γ) and hence the denominator in the last term in Eq. (B18) remains positive as γ → 0+). Let us rewrite Eq. (B19) as

T>(γ)=0ρS(s;z)ds(s2+γ2)2 (B21)
V>(γ)=0sρS(s;z)ds(s2+γ2)2 (B22)

where we defined

ρS(s;z)=limN1Ni=0Nδ(ssi(z)) (B23)

as the limit of the density of the singular values of Mz [66]. Note that contributions to T>(γ) and V>(γ) from integration on [s0, ∞) for any fixed, nonzero s0 remain finite as γ → 0+; only singular contributions arising from the region s = O(γ) « 1 can contribute to γ2T>(γ) and γ2V>(γ) as γ → 0+. Thus we only need concern ourselves with the portion of integrals from 0 to some arbitrary small, but fixed s0, and show that γ20s0ρS(s;z)ds(s2+γ2)2 and γ0s0sρS(s;z)ds(s2+γ2)2 vanish as γ → 0. Let us first consider the situation similar to that in the two examples Eqs. (5.1) and (2.42). For those examples, there is a region of z outside Eq. (3.38), where a single (more generally O(1)) singular value si(z) vanishes as N → ∞, while all the other si(z) remain bounded from below. But an O(1) set of (vanishing) singular values does not contribute to the density Eq. (B23) and since the other si(z) are bounded from below, there is an s0 below which ρS (s; z) identically vanishes. So the claim is clearly true for such cases. More generally, we exploit the fact that z is in the region Eq. (3.38), so that

limγ0+0ρS(s;z)dss2+γ2<1. (B24)

We conclude that as s → 0+ the density, ρS (s; z), must vanish at least as fast as sα, i.e. it must be O(sα), for some α > 1; otherwise the integral in Eq. (B24) diverges in the limit. Let us therefore choose s0 to be small enough such that for ss0, ρ(s; z) < csα for some constant c and α > 1. It is then an elementary exercise to show that γ20s0sαds(s2+γ2)2 and γ20s0sα+1ds(s2+γ2)2 are O(γmin(2,α1)) and O(γmin(1,α1)), respectively, as γ → 0+, and since α > 1, they both vanish in the limit, proving the claim.

References

  • [1].Mehta ML. Random Matrices. Academic Press; 2004. [Google Scholar]
  • [2].Bai Z, Silverstein JW. Spectral Analysis of Large Dimensional Random Matrices. Science Press; 2006. [Google Scholar]
  • [3].Guhr T, Müller-Groeling A, Weidenmüller HA. Phys. Rep. 1998;299:189. [Google Scholar]
  • [4].Dale H. Proc. R. Soc. Med. 1935;28:319. doi: 10.1177/003591573502800330. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [5].Eccles JC, Fatt P, Koketsu K. J. Physiol. 1954;126:524. doi: 10.1113/jphysiol.1954.sp005226. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [6].Strata P, Harvey R. Brain. Res. Bull. 1999;50:349. doi: 10.1016/s0361-9230(99)00100-8. [DOI] [PubMed] [Google Scholar]
  • [7].Murphy BK, Miller KD. Neuron. 2009;61:635. doi: 10.1016/j.neuron.2009.02.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [8].Jeong H, Tombor B, Albert R, Oltvai ZN, Barabasi AL. Nature. 2000;407:651. doi: 10.1038/35036627. [DOI] [PubMed] [Google Scholar]
  • [9].Barabasi AL, Oltvai ZN. Nat. Rev. Genet. 2004;5:101. doi: 10.1038/nrg1272. [DOI] [PubMed] [Google Scholar]
  • [10].Zhu X, Gerstein M, Snyder M. Genes Dev. 2007;21:1010. doi: 10.1101/gad.1528707. [DOI] [PubMed] [Google Scholar]
  • [11].Vidal M, Cusick ME, Barabási A-L. Cell. 2011;144:986. doi: 10.1016/j.cell.2011.02.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [12].May RM. Nature. 1972;238:413. doi: 10.1038/238413a0. [DOI] [PubMed] [Google Scholar]
  • [13].Camacho J, Guimer`a R, Nunes Amaral LA. Phys. Rev. Lett. 2002;88:228102. doi: 10.1103/PhysRevLett.88.228102. [DOI] [PubMed] [Google Scholar]
  • [14].Valdovinos FS, Ramos-Jiliberto R, Garay-Narváez L, Urbani P, Dunne JA. Ecology Letters. 2010;13:1546. doi: 10.1111/j.1461-0248.2010.01535.x. [DOI] [PubMed] [Google Scholar]
  • [15].Vermaat JE, Dunne JA, Gilbert AJ. Ecology. 2009;90:278. doi: 10.1890/07-0978.1. [DOI] [PubMed] [Google Scholar]
  • [16].Guimer`a R, Stouffer DB, Sales-Pardo M, Leicht EA, Newman MEJ, Amaral LAN. Ecology. 2010;91:2941. doi: 10.1890/09-1175.1. [DOI] [PubMed] [Google Scholar]
  • [17].Trefethen LN, Embree M. Spectra and Pseudospectra. Princeton University Press; 2005. [Google Scholar]
  • [18].Ganguli S, Huh D, Sompolinsky H. Proceedings of the National Academy of Sciences. 2008;105:18970. doi: 10.1073/pnas.0804451105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [19].Goldman MS. Neuron. 2009;61:621. doi: 10.1016/j.neuron.2008.12.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [20].Neubert MG, Caswell H. Ecology. 1997;78:653. [Google Scholar]
  • [21].Chen X, Cohen JE. Proc. Biol. Sci. 2001;268:869. doi: 10.1098/rspb.2001.1596. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [22].Tang S, Allesina S. Front. Ecol. Evol. 2014;2 [Google Scholar]
  • [23].McCoy JH. New Journal of Physics. 2013;15:113036. [Google Scholar]
  • [24].Feinberg J, Zee A. Nuclear Physics B. 1997;504:579. [Google Scholar]
  • [25].Feinberg J, Zee A. Nuclear Physics B. 1997;501:643. [Google Scholar]
  • [26].Ginibre J. J. Math. Phys. 1965;6:440. [Google Scholar]
  • [27].Girko VL. Theory Probab. Appl. 1984;29:694. [Google Scholar]
  • [28].Bai ZD. Ann. Probab. 1997;25:494. [Google Scholar]
  • [29].Tao T, Vu V. Commun. Contemp. Math. 2008;10:261. [Google Scholar]
  • [30].Götze F, Tikhomirov A. Ann. Probab. 2010;38:1444. [Google Scholar]
  • [31].Khoruzhenko B. Journal of Physics A: Mathematical and General. 1996;29:L165. [Google Scholar]
  • [32].Biane P, Lehner F. Colloq. Math. 2001;90:181. [Google Scholar]
  • [33].Hikami S, Pnini R. Journal of Physics A: Mathematical and General. 1998;31:L587. doi: 10.1088/0305-4470/31/28/002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [34].Tao T, Vu V, Krishnapur M. Ann. Probab. 2010;38:2023. [Google Scholar]
  • [35].Tao T, Vu V. Acta Mathematica. 2011;206:127. [Google Scholar]
  • [36].Tao T. Probab. Theory Related Fields. 2013;155:231. [Google Scholar]
  • [37].O’Rourke S, Renfrew D. Electron. J. Probab. 2014;19:1. [Google Scholar]
  • [38].Edwards SF, Jones RC. Journal of Physics A: Mathematical and General. 1976;9:1595. [Google Scholar]
  • [39].Rajan K, Abbott LF. Phys. Rev. Lett. 2006;97:188104. doi: 10.1103/PhysRevLett.97.188104. [DOI] [PubMed] [Google Scholar]
  • [40].Yin Y, Bai Z, Krishnaiah P. Probab. Theory Related Fields. 1988;78:509. [Google Scholar]
  • [41].Horn RA, Johnson RG. Matrix Analysis. Cambridge University Press; 1990. [Google Scholar]
  • [42].Jonas P, Bischofberger J, Sandkuhler J. Science. 1998;281:419. doi: 10.1126/science.281.5375.419. [DOI] [PubMed] [Google Scholar]
  • [43].Root DH, Mejias-Aponte CA, Zhang S, Wang H-L, Hoffman AF, Lupica CR, Morales M. Nat Neurosci. 2014;17:1543. doi: 10.1038/nn.3823. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [44].Chafaï D. Journal of Theoretical Probability. 2010;23:945. [Google Scholar]
  • [45].Wei Y. Phys. Rev. E. 2012;85:066116. doi: 10.1103/PhysRevE.85.066116. [DOI] [PubMed] [Google Scholar]
  • [46].Miller KD, Fumarola F. Neural Comput. 2012;24:25. doi: 10.1162/NECO_a_00221. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [47].Hofbauer J, Sigmund K. Evolutionary Games and Population Dynamics. Cambridge University Press; Cambridge, England: 1998. [Google Scholar]
  • [48].Stern M, Sompolinsky H, Abbott LF. 2014. arXiv:1409.2535v1. [DOI] [PMC free article] [PubMed]
  • [49].Sompolinsky H, Crisanti A, Sommers HJ. Phys. Rev. Lett. 1988;61:259. doi: 10.1103/PhysRevLett.61.259. [DOI] [PubMed] [Google Scholar]
  • [50].Feinberg J. Journal of Physics A: Mathematical and General. 2006;39:10029. [Google Scholar]
  • [51].Abramowitz M, Stegun IA. Handbook of mathematical functions with formulas, graphs, and mathematical tables. Wiley-Interscience; New York: 1970. [Google Scholar]
  • [52].Mehlig B, Chalker JT. J. Math. Phys. 2000;41:3233. [Google Scholar]
  • [53].Nishigaki SM, Kamenev A. Journal of Physics A: Mathematical and General. 2002;35:4571. [Google Scholar]
  • [54].Rogers T, Castillo IP. Phys. Rev. E. 2009;79:012101. doi: 10.1103/PhysRevE.79.012101. [DOI] [PubMed] [Google Scholar]
  • [55].Slanina F. Phys. Rev. E. 2011;83:011118. doi: 10.1103/PhysRevE.83.011118. [DOI] [PubMed] [Google Scholar]
  • [56].Neri I, Metz FL. Phys. Rev. Lett. 2012;109:030602. doi: 10.1103/PhysRevLett.109.030602. [DOI] [PubMed] [Google Scholar]
  • [57].Hooft G. t. Nuclear Physics B. 1974;72:461. [Google Scholar]
  • [58].Brézin E, Itzykson C, Parisi G, Zuber JB. Communications in Mathematical Physics. 1978;59:35. [Google Scholar]
  • [59]. Since we will present results for the limit of the eigenvalue density, etc, as N → ∞, M, L, R and J must each be more precisely understood as an infinite sequence of matrices dependent on N.
  • [60]. The designation “highly nonnormal” can be motivated, when L and R are proportional to the identity matrix, as follows. Let us denote the (operator norm based) ϵ-pseudospectrum of M, i.e. the region of z’s over which ∥(z − M)−1∥ > ϵ−1, by Σϵ(M). For fixed N, the true spectrum of M, which we denote by Σ(M), is the set of points over which the smallest singular value of (z − M) is exactly zero and hence ∥(z − M)−1∥ = ∞. For finite N, limϵ0+ϵ(M)=(M) for any M. However, for non-normal M this approach could be much slower than in the normal case (see our discussion in Sec. II A 2, and the book [17] for a complete discussion of pseudospectra and their relationship with nonnormality). Now suppose that, as in the atypical cases under discussion, in a finite region of the complex plane the smallest singular value of Mz nonzero for finite N, but vanishes in the limit N → ∞. This means that the operator norm of (z − M)−1 ∝ M−1 is finite over such a region but goes to infinity as N → ∞. Hence, if we define ϵ(M)limNϵ(M) and (M)limNϵ(M), we see that in such cases limϵ0+ϵ(M)(M) (or equivalently, limϵ0+limNϵ(M)limNlimϵ0+ϵ(M). More generally but less precisely, this indicates that at finite but large N, the ϵ-pseudospectra of such matrices can cover a significantly broader region than the spectrum even for very small ϵ, indicating extreme nonnormality.
  • [61]. This equivalence is true more generally for any matrix norm derived from a general vector norm; see Ref. [17] for a proof.
  • [62]. The unitary invariance of these formulae is in turn a consequence of the invariance of both the corresponding quantities (ρ(z) and x(t) 2), as well as the statistical ensemble for J, Eq. (3.2), and hence that of LJR when L ∝ R ∝ 1, under unitary transforms like Eq. (2.34)
  • [63]. It is also common to write the firing rate equations in the different form, Tdr(t)dt=r(t)+f(Wr(t))+Ir(t). At least in the case where all neurons have equal time constants, i.e.T ∝ 1, the two formulations are equivalent and are related by the change of variable v = W r + Ir [46]
  • [64]. This structure is a consequence of using a complex ensemble for J, for which the covariances JabJcd vanishes. For the real Gaussian ensemble, by contrast, the latter do not vanish; in this case Eq. (3.21) becomes JabαβJcdγδ=1N[σαβ+σγδ+σαβσγδ++δacδbd(σαβ+σγδ++σαβσγδ)]=1Nr=12[(παδrδad)(πβγ3rδbc)+(παγrδac)(πβδ3rδbd)]
  • [65]. Notice that this is a more restrictive property than planarity of the diagram; for example the graph in panel (b) of Fig. 15 is planar, as one of the wavy lines can be drawn inside the solid loop without crossing any other line, but it is not non-crossing as defined here.
  • [66]. More precisely, we only need to define the limit Eq. (B23) in the sense of distributions, i.e. such that for any regular test function, f(s2), bounded at infinity and regular everywhere, including at s2 → 0+, we have limN1Ni=1Nf(si(z)2)=0f(s2)ρS(s;z)ds. We do not assume any smooth form for ρS(s; z); in particular, ρS(s; z) may have delta function singularities when an O(N) singular values converge to the same value as N → ∞. Also note that this assumption does not forbid the possibility that some si(z) diverge as N → ∞; our requirement that ∥M∥F remain bounded automatically guarantees that these will not be numerous enough contribute to ρS(s; z) at infinity.

RESOURCES