2023 Dec 27;390(1):1171–1200. doi: 10.1007/s00208-023-02777-6

Bounds for Kloosterman sums on GL(n)

Valentin Blomer, Siu Hang Man
PMCID: PMC11424748  PMID: 39346752

Abstract

This paper establishes power-saving bounds for Kloosterman sums associated with the long Weyl element for GL(n) for arbitrary $n \geq 3$, as well as for another type of Weyl element of order 2. These bounds are obtained by establishing an explicit representation as exponential sums. As an application we go beyond Sarnak’s density conjecture for the principal congruence subgroup of prime level. We also obtain power-saving bounds for all Kloosterman sums on GL(4).

Mathematics Subject Classification: Primary 11L05

Introduction

Classical Kloosterman sums

Kloosterman sums belong to the most universal exponential sums in number theory, algebra and automorphic forms. The classical Kloosterman sum for parameters $m, m' \in \mathbb{Z}$ and a modulus $c \in \mathbb{N}$ is

$$S(m, m', c) = \sum_{d \,(\mathrm{mod}\, c)}^{*} e\Big(\frac{md + m'\bar{d}}{c}\Big) \tag{1.1}$$

where $d\bar{d} \equiv 1$ (mod $c$) and the asterisk indicates that the sum is over $(d, c) = 1$. Kloosterman [10] introduced this type of exponential sum in his thesis as a crucial tool to study the representation of integers by positive quaternary quadratic forms. Since then they have been ubiquitous in number theory, for instance as the finite Fourier transform of exponential sums containing modular inverses, as Fourier coefficients of classical Poincaré series, in various instances of delta-symbol methods and perhaps most prominently in the relative trace formula of Petersson-Kuznetsov type (to which Poincaré series are a precursor). One of their key properties is the Weil bound [16] (complemented by Salié for powerful moduli [13])

$$|S(m, m', c)| \leq \tau(c)\, c^{1/2}\, (m, m', c)^{1/2} \tag{1.2}$$

where $\tau$ denotes the divisor function. This essentially exhibits square root cancellation relative to the trivial bound $|S(m, m', c)| \leq \phi(c)$ where $\phi$ is Euler’s $\phi$-function.
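As a quick numerical illustration of (1.1) and (1.2), the following minimal sketch evaluates the classical Kloosterman sum by brute force and compares it with the Weil bound. The function names `kloosterman`, `tau` and `weil_bound` are ours and purely illustrative; the check only covers small moduli and is no substitute for the proof.

```python
import cmath
from math import gcd, sqrt

def kloosterman(m, mp, c):
    """Brute-force value of S(m, m', c) from (1.1), for c >= 2."""
    total = 0
    for d in range(1, c + 1):
        if gcd(d, c) != 1:
            continue
        d_inv = pow(d, -1, c)  # modular inverse (Python >= 3.8)
        total += cmath.exp(2j * cmath.pi * (m * d + mp * d_inv) / c)
    return total

def tau(c):
    """Number of positive divisors of c."""
    return sum(1 for k in range(1, c + 1) if c % k == 0)

def weil_bound(m, mp, c):
    """Right-hand side of (1.2)."""
    return tau(c) * sqrt(c) * sqrt(gcd(gcd(m, mp), c))

# Compare both sides of (1.2) for a few small parameters.
for c in range(2, 31):
    for m in range(1, 5):
        for mp in range(1, 5):
            assert abs(kloosterman(m, mp, c)) <= weil_bound(m, mp, c) + 1e-9
```

The sum is real up to floating point error, since replacing $d$ by $-d$ conjugates each term.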

We proceed to describe the more general set-up of Kloosterman sums. Let $G$ be a reductive group, $T$ the maximal torus, and $U$ the standard maximal unipotent subgroup. Let $N = N_G(T)$ be the normalizer of $T$ in $G$, $W = N/T$ the Weyl group and $\omega : N \to W$ the quotient map. For $w \in W$, we define $U_w = U \cap w^{-1} U^{\top} w$ and $\bar{U}_w = U \cap w^{-1} U w$. For $n \in N(\mathbb{Q}_p)$ and a finite index subgroup $\Gamma \subseteq G(\mathbb{Z}_p)$ we define

$$C(n) = U(\mathbb{Q}_p)\, n\, U(\mathbb{Q}_p) \cap \Gamma, \quad X(n) = U(\mathbb{Z}_p) \backslash C(n) / U_{\omega(n)}(\mathbb{Z}_p)$$

and the projection maps

$$u : X(n) \to U(\mathbb{Z}_p) \backslash U(\mathbb{Q}_p), \quad u' : X(n) \to U_{\omega(n)}(\mathbb{Q}_p) / U_{\omega(n)}(\mathbb{Z}_p).$$

Let $\psi, \psi'$ be two characters on $U(\mathbb{Q}_p)$ that are trivial on $U(\mathbb{Z}_p)$. The (local) Kloosterman sum is then defined to be

$$\mathrm{Kl}_p(\psi, \psi', n) = \sum_{x = u(x)\, n\, u'(x) \in X(n)} \psi(u(x))\, \psi'(u'(x)). \tag{1.3}$$

In this paper we will take $\Gamma = G(\mathbb{Z}_p)$, but we may think of $\Gamma$ as a more general “congruence subgroup”. Of course, by the Chinese remainder theorem it suffices to study local Kloosterman sums. There is some flexibility in the definition. For instance, some authors have $U_{\omega(n)}$ on the left and $U$ on the right.

Example 1

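For $G = \mathrm{GL}(2)$, $n = \begin{pmatrix} & -1/c \\ c & \end{pmatrix}$ (so that $U_{\omega(n)} = U$) with $c = p^r$, say, $\psi\begin{pmatrix} 1 & x \\ & 1 \end{pmatrix} = \psi_0(mx)$, $\psi'\begin{pmatrix} 1 & y \\ & 1 \end{pmatrix} = \psi_0(m'y)$ for the standard additive character $\psi_0$ on $\mathbb{Q}_p/\mathbb{Z}_p$ and $m, m' \in \mathbb{Z}$ we have

$$\begin{pmatrix} 1 & x \\ & 1 \end{pmatrix} n \begin{pmatrix} 1 & y \\ & 1 \end{pmatrix} = \begin{pmatrix} cx & -1/c + cxy \\ c & cy \end{pmatrix} \in \mathrm{GL}_2(\mathbb{Z}_p)$$

if and only if $x, y \in p^{-r}\mathbb{Z}_p/\mathbb{Z}_p$, $xy - p^{-2r} \in p^{-r}\mathbb{Z}_p$, and we recover (1.1).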
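The reduction in Example 1 can be checked numerically. The sketch below (an illustration with our own helper names, working with rational representatives $x = a/p^r$, $y = b/p^r$) sums $\psi_0(mx)\psi_0(m'y)$ over the pairs satisfying the integrality condition and compares the result with the classical sum (1.1) for $c = p^r$.

```python
import cmath
from fractions import Fraction
from math import gcd

def e(t):
    """e(t) = exp(2*pi*i*t) for a rational t; only t mod 1 matters."""
    return cmath.exp(2j * cmath.pi * float(Fraction(t) % 1))

def in_scaled_Zp(t, p, k):
    """True if the rational t lies in p^(-k) Z_p, i.e. v_p(t) >= -k."""
    return (Fraction(t) * p**k).denominator % p != 0

def gl2_sum(m, mp, p, r):
    """Sum over x, y in p^(-r)Z_p / Z_p with xy - p^(-2r) in p^(-r) Z_p."""
    c = p**r
    total = 0
    for a in range(c):
        for b in range(c):
            x, y = Fraction(a, c), Fraction(b, c)
            if in_scaled_Zp(x * y - Fraction(1, c * c), p, r):
                total += e(m * x) * e(mp * y)
    return total

def classical(m, mp, c):
    """Classical Kloosterman sum (1.1)."""
    return sum(e(Fraction(m * d + mp * pow(d, -1, c), c))
               for d in range(1, c + 1) if gcd(d, c) == 1)

print(abs(gl2_sum(1, 2, 3, 2) - classical(1, 2, 9)) < 1e-9)
```

The integrality condition forces $ab \equiv 1$ (mod $p^r$), which is exactly the pairing of $d$ with $\bar{d}$ in (1.1).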

Example 2

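For $G = \mathrm{GL}(3)$ and $n = \begin{pmatrix} & & -1/c_1 \\ & c_1/c_2 & \\ c_2 & & \end{pmatrix}$ in the big Bruhat cell an explicit form of the corresponding Kloosterman sum already becomes much more complicated. Using Plücker coordinates, Bump, Friedberg and Goldfeld [4, Section 4] derived for two general characters $\psi_{n_1,n_2}$ and $\psi_{m_1,m_2}$ of 3-by-3 upper triangular unipotent matrices the explicit formula for the global Kloosterman sum

$$\sum_{\substack{B_1, C_1 \,(\mathrm{mod}\, c_1) \\ B_2, C_2 \,(\mathrm{mod}\, c_2) \\ (B_j, C_j, c_j) = 1 \\ c_1 c_2 \,\mid\, c_1 C_2 + B_1 B_2 + C_1 c_2}} e\Big(\frac{m_1 B_1 + n_1 (Y_1 c_2 - Z_1 B_2)}{c_1} + \frac{m_2 B_2 + n_2 (Y_2 c_1 - Z_2 B_1)}{c_2}\Big)$$

where $Y_1, Y_2, Z_1, Z_2$ are chosen such that $Y_j B_j + Z_j C_j \equiv 1$ (mod $c_j$) for $j = 1, 2$.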

We see from these examples that Kloosterman sums, while defined rather naturally in terms of the Bruhat decomposition, turn out to be extremely complicated exponential sums. We are well equipped with deep technology from algebraic geometry and p-adic analysis to bound general exponential sums in a relatively sharp fashion, but it is unfortunately not clear a priori how to apply these to general Kloosterman sums given by (1.3). In fact, even determining the size of X(n) and hence the “trivial” bound for (1.3) (ignoring cancellation in the characters) is a deep result of Dąbrowski and Reeder [6], see Lemma 2 below.

One case is classical: for $G = \mathrm{GL}(n)$ and the Weyl element $\begin{pmatrix} & 1 \\ I_{n-1} & \end{pmatrix}$ (“Voronoi element”) the Kloosterman sum (1.3) becomes a hyper-Kloosterman sum [8, Theorem B] in the sense of Deligne and hence its size is well-understood. In addition we have good bounds for Kloosterman sums on GL(3) [4, 5, 15], Sp(4) [11] and for the long Weyl element on GL(4) [9, Appendix]. In all other cases, the best we know is the “trivial” bound of Dąbrowski-Reeder [6]. A typical strategy, employed for instance in [3, 4, 9], is to use Plücker coordinates to understand the Bruhat decomposition of GL(n) in an explicit fashion. Unfortunately, for large $n$ the Plücker relations become extremely complicated, and it seems that this path is not very promising in general.

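In this paper, we investigate Kloosterman sums for the general linear group $G = \mathrm{GL}(n+1)$ for arbitrary $n \geq 2$. (The notation $n+1$ in place of $n$ is slightly more convenient.) We parametrize the unipotent upper triangular matrices $U$ of dimension $n+1$ as

$$u = \begin{pmatrix} 1 & u_{11} & u_{12} & \cdots & u_{1n} \\ & 1 & u_{22} & \cdots & u_{2n} \\ & & \ddots & \ddots & \vdots \\ & & & 1 & u_{nn} \\ & & & & 1 \end{pmatrix}.$$

We consider two characters $\psi, \psi'$ of $U(\mathbb{Q}_p)/U(\mathbb{Z}_p)$, defined by

$$\psi(u) = e\Big(\sum_{j=1}^{n} \psi_j u_{jj}\Big), \quad \psi'(u) = e\Big(\sum_{j=1}^{n} \psi'_j u_{jj}\Big) \tag{1.4}$$

for $\psi_j, \psi'_j \in \mathbb{Z}$, $1 \leq j \leq n$. The Weyl group of $G$ consists of $(n+1)!$ permutation matrices, but only Weyl elements of the form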

[Equation image not reproduced: the general shape of these Weyl elements, built from identity blocks $I_{d_j}$]

with identity matrices $I_{d_j}$ of dimension $d_j$ lead to well-defined objects (see [8, p. 175]), which reduces the number of relevant Weyl elements to $2^n$. We consider two of them, namely the long Weyl element

[Equation image not reproduced: the long Weyl element $w_l$]

and with a particular application in mind (to be described in Subsection 1.4)

$$w = \pm\begin{pmatrix} & & 1 \\ & -I_{n-1} & \\ 1 & & \end{pmatrix}.$$

The chosen representatives of the Weyl elements $w_l$ and $w$ here are such that the $(k \times k)$-minors formed by the bottom $k$ rows and appropriate columns are positive. The actual choice of the representatives is unimportant to the theory, but this particular choice makes the computations in later sections more convenient.
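The count $2^n$ of relevant Weyl elements corresponds to the number of ways of splitting $n+1$ into an ordered list of block sizes $(d_1, \dots, d_r)$. The following small sketch (with a hypothetical helper `compositions`, not taken from the paper) enumerates these lists; under this reading the long Weyl element corresponds to all blocks of size 1 and $w$ to the block sizes $(1, n-1, 1)$.

```python
def compositions(total):
    """All ordered tuples of positive integers (d_1, ..., d_r) summing to total."""
    if total == 0:
        return [()]
    out = []
    for first in range(1, total + 1):
        for rest in compositions(total - first):
            out.append((first,) + rest)
    return out

n = 3  # G = GL(n + 1) = GL(4)
block_sizes = compositions(n + 1)
print(len(block_sizes) == 2**n)           # number of relevant Weyl elements
print((1,) * (n + 1) in block_sizes)      # long Weyl element: all blocks of size 1
print((1, n - 1, 1) in block_sizes)       # the element w: block sizes (1, n-1, 1)
```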

We write a typical modulus as $C = (p^{r_1}, \dots, p^{r_n})$ with $r_j \in \mathbb{N}_0$, which we embed into the torus $T$ as $C = \mathrm{diag}(p^{-r_1}, p^{r_1 - r_2}, \dots, p^{r_{n-1} - r_n}, p^{r_n})$. The first main achievement of this paper is an explicit and reasonably compact expression of the considered Kloosterman sums as exponential sums, which we describe in the following two subsections. We try to present this in a user-friendly way, which may ease further investigations.

Kloosterman sums for G=GL(n+1) for the long Weyl element

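Our stratification starts with the following decomposition. Let a modulus $C$ be given as above with an exponent vector $r = (r_1, \dots, r_n)$. Let

$$\mathcal{M}_{w_l}(r) := \Big\{ \underline{m} = (m_{ij})_{1 \leq i \leq j \leq n} \in \mathbb{N}_0^{n(n+1)/2} \;\Big|\; \sum_{i \leq k \leq j} m_{ij} = r_k, \; 1 \leq k \leq n \Big\}. \tag{1.5}$$

For $\underline{m} \in \mathcal{M}_{w_l}(r)$ we define

$$\mathcal{C}_{w_l}(\underline{m}) := \Big\{ \underline{c} = (c_{ij})_{1 \leq i \leq j \leq n} \;\Big|\; c_{ij} \in \mathbb{Z} \Big/ \prod_{k=j}^{n} p^{m_{ik}} \mathbb{Z}, \; (c_{ij}, p^{m_{ij}}) = 1 \Big\}$$

as well as the partial Kloosterman sum $\mathrm{Kl}_p(\underline{m}, \psi, \psi', w_l)$ as

$$\sum_{\underline{c} \in \mathcal{C}_{w_l}(\underline{m})} e\Bigg( \sum_{j=1}^{n} \sum_{i=1}^{j} \psi_j\, c_{i(j-1)}\, \overline{c_{ij}} \prod_{k=1}^{i-1} p^{m_{k(j-1)}} \prod_{k=1}^{i} p^{-m_{kj}} + \sum_{i=1}^{n} \sum_{j=1}^{i} \psi'_i\, c_{(n+1-i)(n+1-j)}\, \overline{c_{(n+2-i)(n+1-j)}} \prod_{k=1}^{j-1} p^{m_{(n+2-i)(n+1-k)}} \prod_{k=1}^{j} p^{-m_{(n+1-i)(n+1-k)}} \Bigg). \tag{1.6}$$

Here we employ the convention that $c_{ij} = 1$ and $m_{ij} = 0$ for $i > j$. Moreover, for $1 \leq i \leq j \leq n$ we set $\overline{c_{ij}} = 0$ when $m_{ij} = 0$, and of course a bar means the modular inverse. It is not obvious from the definition, but will follow from the proof, that this expression is well-defined (i.e. independent of the system of representatives chosen for the $c_{ij}$).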
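To make the index set (1.5) concrete, here is a minimal brute-force sketch that lists all exponent arrays $\underline{m}$ for a given vector $r$. The names `index_pairs` and `M_long` are ours; the search is exponential and only meant for very small $n$ and $r$.

```python
from itertools import product

def index_pairs(n):
    """All index pairs (i, j) with 1 <= i <= j <= n."""
    return [(i, j) for i in range(1, n + 1) for j in range(i, n + 1)]

def M_long(r):
    """All m = (m_ij) with sum over i <= k <= j of m_ij equal to r_k, as in (1.5)."""
    n = len(r)
    pairs = index_pairs(n)
    bound = max(r) if r else 0
    result = []
    for values in product(range(bound + 1), repeat=len(pairs)):
        m = dict(zip(pairs, values))
        if all(sum(v for (i, j), v in m.items() if i <= k <= j) == r[k - 1]
               for k in range(1, n + 1)):
            result.append(m)
    return result

# For n = 2 and r = (1, 1) there are two admissible arrays:
# m_12 = 1, or m_11 = m_22 = 1.
for m in M_long((1, 1)):
    print(m)
```

Summing the defining constraints over $k$ gives $\sum_{i \leq j} (j-i+1) m_{ij} = \sum_k r_k$ for every admissible $\underline{m}$.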

As an example, the case n=3 (i.e. G=GL(4)) is spelled out explicitly in (8.1) in the appendix, where in addition all Weyl elements for GL(4) are analyzed.

We are now ready to state our first main result.

Theorem 1

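For $C = (p^{r_1}, \dots, p^{r_n})$ we have with the above notation

$$\mathrm{Kl}_p(\psi, \psi', C w_l) = \sum_{\underline{m} \in \mathcal{M}_{w_l}(r)} \mathrm{Kl}_p(\underline{m}, \psi, \psi', w_l)$$

where $\mathrm{Kl}_p(\underline{m}, \psi, \psi', w_l)$ is defined in (1.6).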

Remarks: The exact formula of Theorem 1 has a number of nice features.

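First of all, we see the “trivial bound” of Dąbrowski-Reeder with the naked eye by simply counting the number of terms in the summation:

$$\# \mathcal{C}_{w_l}(\underline{m}) \leq \prod_{1 \leq i \leq j \leq n} \prod_{k=j}^{n} p^{m_{ik}} = \prod_{1 \leq i \leq j \leq n} p^{(j-i+1) m_{ij}} = \prod_{1 \leq k \leq n} p^{r_k} \tag{1.7}$$

for $\underline{m} \in \mathcal{M}_{w_l}(r)$ as defined in (1.5). We are now in a position to exploit cancellation to obtain non-trivial bounds in Corollary 1 below.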

It is also structurally quite interesting because it consists of nested GL(2) Kloosterman sums. This is somewhat reminiscent of the archimedean case where the (non-degenerate) GL(n)-Whittaker function can be expressed as nested integrals of Bessel K-functions (i.e. GL(2) Whittaker functions), see [15].

The exact formula can also be used for certain exact evaluations (see below), it can be generalized to congruence subgroups, and can perhaps ultimately be a means for deeper methodological machinery in analytic number theory such as Poisson summation etc.

Corollary 1

With the notation as above, there exists a $\delta = \delta_n > 0$ such that

$$\mathrm{Kl}_p(\psi, \psi', C w_l) \ll \Big(\max_{1 \leq j \leq n} |\psi_j|_p^{-1/2}\Big) \Big(\prod_{1 \leq k \leq n} p^{r_k}\Big)^{1-\delta}.$$

Xinchen Miao kindly informed us that he obtained independently and by a slightly different method a similar bound in [12, Theorem 5.1].

In Subsection 5.3 we will give a quick argument that shows that one can choose $\delta \gg 1/n^2$. It is an interesting question whether $\delta$ can be chosen independently of $n$. In this particular situation we do not know the answer, but for general Weyl elements the answer is no, as we will see in the next subsection.

Kloosterman sums for G=GL(n+1) for w

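We now establish similar results for the Weyl element $w$. For this Weyl element we have relatively severe restrictions on the moduli $C = (p^{r_1}, \dots, p^{r_n})$, see [8, Proposition 1.3]. For instance, if $\psi, \psi'$ satisfy $p \nmid \psi_j \psi'_j$ and $n \geq 3$, then the exponents $r_k$ need to form an arithmetic progression.

In the present situation we define

$$\mathcal{M}_{w}(r) := \Big\{ \underline{m} = (m_{ij})_{\substack{1 \leq i \leq j \leq n \\ i = 1 \text{ or } j = n}} \in \mathbb{N}_0^{2n-1} \;\Big|\; \sum_{i \leq k \leq j} m_{ij} = r_k, \; 1 \leq k \leq n \Big\}.$$

For $\underline{m} \in \mathcal{M}_{w}(r)$ we define

$$\mathcal{C}_{w}(\underline{m}) = \Big\{ \underline{c} = (c_{ij})_{\substack{1 \leq i \leq j \leq n \\ i = 1 \text{ or } j = n}} \;\Big|\; c_{1j} \in \mathbb{Z} \Big/ \prod_{k=j}^{n} p^{m_{1k}} \mathbb{Z}, \; 1 \leq j \leq n, \;\; c_{in} \in \mathbb{Z} \Big/ \prod_{k=2}^{i} p^{m_{kn}} \mathbb{Z}, \; 2 \leq i \leq n, \;\; (c_{ij}, p^{m_{ij}}) = 1 \Big\}$$

as well as the partial Kloosterman sum $\mathrm{Kl}_p(\underline{m}, \psi, \psi', w)$ as

$$\sum_{\underline{c} \in \mathcal{C}_{w}(\underline{m})} e\Bigg( \frac{\psi_1 \overline{c_{11}}}{p^{m_{11}}} + \sum_{j=2}^{n} \psi_j \Big( \frac{c_{1(j-1)} \overline{c_{1j}}}{p^{m_{1j}}} + \frac{c_{(j+1)n} \overline{c_{jn}}}{p^{-m_{1(j-1)} + m_{1j} + m_{jn}}} \Big) + \frac{\psi'_1 c_{2n}}{p^{m_{2n}}} + \psi'_n \Big( \frac{c_{1n} \overline{c_{nn}}}{p^{m_{1n}}} + \frac{c_{1(n-1)}}{p^{-m_{nn} + m_{1(n-1)} + m_{1n}}} \Big) \Bigg). \tag{1.8}$$

Again we employ the convention that $c_{ij} = 1$ and $m_{ij} = 0$ for $i > j$. Moreover, for $1 \leq i \leq j \leq n$ we set $\overline{c_{ij}} = 0$ when $m_{ij} = 0$. For the case $n = 3$, we refer to (8.2).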

Theorem 2

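For $C = (p^{r_1}, \dots, p^{r_n})$ we have with the above notation

$$\mathrm{Kl}_p(\psi, \psi', C w) = \sum_{\underline{m} \in \mathcal{M}_{w}(r)} \mathrm{Kl}_p(\underline{m}, \psi, \psi', w)$$

whenever the Kloosterman sum on the left hand side is well-defined.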

Corollary 2

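There exists a $\delta = \delta_n > 0$ such that

$$\mathrm{Kl}_p(\psi, \psi', C w) \ll \Big(\max_{1 \leq j \leq n} |\psi_j|_p^{-1/2}\Big) \Big(\prod_{1 \leq k \leq n} p^{r_k}\Big)^{1-\delta}.$$

Our proof will show that one can take $\delta \gg 1/n$. This should be seen in light of the following exact evaluation.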

Corollary 3

For $C = (p, \dots, p)$ and characters $\psi, \psi'$ with $p \nmid \psi_j \psi'_j$ for $1 \leq j \leq n$ we have $\mathrm{Kl}_p(\psi, \psi', C w) = p^{n-1} + p^{n-2}$.

This shows that Kloosterman sums can be fairly large. In particular, up to the constant the saving of size $\delta_n \gg 1/n$ in Corollary 2 is asymptotically best possible. Corollary 3 is a variation of [2, Theorem 3], but proved by a completely different argument, since [2, Theorem 3] is special to the congruence subgroup $\Gamma_0(p)$. A similar, but more involved computation shows for instance

$$\mathrm{Kl}_p(\psi, \psi', C w_l) = (-1)^n (p+1)$$

for $C = (p, \dots, p)$ and characters $\psi, \psi'$ with $p \nmid \psi_j \psi'_j$ for $1 \leq j \leq n$ (generalizing [4, Property 4.10] for $n = 2$).

An application

Kloosterman sums come up most prominently in the Kuznetsov formula. Up until now, in higher rank one could at best employ the trivial bound. With non-trivial bounds at hand, we can now refine the main result of [1].

The Ramanujan conjecture for the group GL(n) states that cuspidal automorphic representations are tempered. This is way out of reach, but one may hope to prove that it holds in reasonable families in the following quantitative average sense: automorphic forms that are “far away” from being tempered should occur “rarely”. With this in mind, let $\mathcal{F} = \mathcal{F}_{\Gamma(q)}(M)$ be the finite family of cusp forms $\varpi$ (eigenforms for the unramified Hecke algebra) for the principal congruence subgroup $\Gamma(q) \subseteq \mathrm{SL}_n(\mathbb{Z})$ with bounded spectral parameter $\|\mu_{\varpi,\infty}\| \leq M$ (see [1] for more details). For a place $v$ of $\mathbb{Q}$ and $\sigma \geq 0$ define

$$N_v(\sigma, \mathcal{F}) = \#\{\varpi \in \mathcal{F} \mid \sigma_{\varpi,v} \geq \sigma\}$$

where $\sigma_{\varpi,v} = \max_j |\Re \mu_{\varpi,v}(j)|$. The trivial bound is $N_v(\sigma, \mathcal{F}) \leq N_v(0, \mathcal{F}) = \#\mathcal{F}$. On the other hand, we have $\sigma_{\mathrm{triv},v} = \frac{1}{2}(n-1)$. One might hope that a trace formula can interpolate linearly between these two bounds:

$$N_v(\sigma, \mathcal{F}) \ll_{v,\varepsilon,n} M^{O(1)} [\mathrm{SL}_n(\mathbb{Z}) : \Gamma(q)]^{1 - \frac{2\sigma}{n-1} + \varepsilon}$$

where $M^{O(1)} [\mathrm{SL}_n(\mathbb{Z}) : \Gamma(q)]$ should be thought of as a proxy for $\#\mathcal{F}$. This is a version of Sarnak’s density conjecture [14] and was proved in [1] for squarefree $q$. Of course, the trivial representation is not cuspidal, and so one may hope to do even better, but even for $n = 2$ this is a very hard problem when the Selberg trace formula is employed. On the other hand, the Kuznetsov formula is better suited in this situation, since the residual spectrum is a priori excluded. We use this observation to go beyond Sarnak’s density conjecture in the scenario described above for prime moduli (the primality assumption is used before (7.1)).

Proposition 4

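Let $M > 0$, $n \geq 5$. Let $q$ be a large prime and let $\mathcal{F} = \mathcal{F}_{\Gamma(q)}(M)$ be the set of cuspidal automorphic forms for $\Gamma(q) \subseteq \mathrm{SL}_n(\mathbb{Z})$ with archimedean spectral parameter $\|\mu\| \leq M$. Fix a place $v$ of $\mathbb{Q}$. There exist constants $K, \delta$ depending only on $n$, such that

$$N_v(\sigma, \mathcal{F}) \ll_{v,n} M^{K} [\mathrm{SL}_n(\mathbb{Z}) : \Gamma(q)]^{1 - \frac{(2+\delta)\sigma}{n-1}}$$

for $\sigma \geq 0$.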

The stratification of Dąbrowski and Reeder

Here we convert results in [6] to conform with our convention. Although nothing in this section is new, it might be convenient for the reader to compile some results.

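Let $G$ be a simply connected Chevalley group over $\mathbb{Q}_p$ with Lie algebra $\mathfrak{g}$. Let $W$ be the Weyl group of $G$, and let $\Phi$ and $\Phi^+$ denote the set of roots and the set of positive roots of $\mathfrak{g}$ respectively. We fix a set of simple roots $\Delta$ of $\mathfrak{g}$. For each positive root $\beta \in \Phi^+$, there is a natural homomorphism $\phi_\beta : \mathrm{SL}(2) \to G$. For $t \in \mathbb{Q}_p$ we write

$$x_\beta(t) = \phi_\beta\begin{pmatrix} 1 & t \\ & 1 \end{pmatrix}, \quad x_{-\beta}(t) = \phi_\beta\begin{pmatrix} 1 & \\ t & 1 \end{pmatrix}.$$

Through the canonical bijection $\beta \mapsto \check{\beta}$ between the set of roots $\Phi$ and the set of coroots $\check{\Phi}$, we have

$$\check{\beta}(c) = \phi_\beta\begin{pmatrix} c & \\ & c^{-1} \end{pmatrix}$$

for $c \in \mathbb{Q}_p^{\times}$. For $m \in \mathbb{Z}$, we write $m\check{\beta}$ for $\check{\beta}(p^m)$. This induces a natural embedding

$$\check{X} := \mathrm{Hom}(\mathbb{G}_m, T) \hookrightarrow T$$

from the set of cocharacters of $T$ into the maximal torus $T$. If $\lambda = -\sum_{\beta \in \Delta} r_\beta \check{\beta} \in \check{X}$ for $r_\beta \in \mathbb{N}_0$, we define the height of $\lambda$ to be

$$\mathrm{ht}(\lambda) := \sum_{\beta \in \Delta} r_\beta. \tag{2.1}$$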

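For $w \in W$, we write

$$R(w) := \{\beta \in \Phi^+ \mid w\beta \in -\Phi^+\}.$$

If $w = s_{\beta_1} \cdots s_{\beta_l}$ is a reduced expression of $w$ as a product of simple reflections, then

$$R(w^{-1}) = \{\gamma_j := s_{\beta_1} \cdots s_{\beta_{j-1}} \beta_j \mid j = 1, \dots, l\}. \tag{2.2}$$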

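For $a \in \mathbb{Q}_p$ we set

$$\mu(a) := \max\{0, -v_p(a)\}, \quad a^* := \begin{cases} 0 & \text{if } a \in \mathbb{Z}_p, \\ p^{2 v_p(a)} a^{-1} & \text{if } a \notin \mathbb{Z}_p. \end{cases}$$

In particular, if $a \notin \mathbb{Z}_p$, say $a = c p^{-m}$ for $m \geq 1$, $c \in \mathbb{Z}_p^{\times}$, then we have $a^* = c^{-1} p^{-m}$. This means the map $a \mapsto a^*$ preserves $\mu$. For $\beta \in \Delta$ we define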
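The quantities $\mu(a)$ and $a^*$ are easy to experiment with for rational $a$. The sketch below uses exact rational arithmetic; the helper names `vp`, `mu` and `star` are ours, and the formula for `star` encodes the property $a^* = c^{-1}p^{-m}$ for $a = cp^{-m}$ stated in the text.

```python
from fractions import Fraction

def vp(a, p):
    """p-adic valuation of a nonzero rational number a."""
    a = Fraction(a)
    if a == 0:
        raise ValueError("valuation of 0 is infinite")
    v, num, den = 0, a.numerator, a.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def mu(a, p):
    """mu(a) = max(0, -v_p(a)); equals 0 exactly when a lies in Z_p."""
    a = Fraction(a)
    return 0 if a == 0 else max(0, -vp(a, p))

def star(a, p):
    """a* = 0 for a in Z_p, and a* = c^(-1) p^(-m) for a = c p^(-m), m >= 1."""
    a = Fraction(a)
    if mu(a, p) == 0:
        return Fraction(0)
    return Fraction(p) ** (2 * vp(a, p)) / a

a = Fraction(5, 9)                        # a = 5 * 3^(-2), so mu(a) = 2 for p = 3
print(star(a, 3))                         # 1/45 = 5^(-1) * 3^(-2)
print(mu(star(a, 3), 3) == mu(a, 3))      # the map a -> a* preserves mu
```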

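$$b_\beta(a) := x_\beta(a^*)\, (-\mu(a)\check{\beta})\, \overline{s_\beta}\, x_\beta(a), \tag{2.3}$$

where $\overline{s_\beta}$ is some fixed representative of the simple reflection $s_\beta$ in $N = N_G(T)$. When no confusion can arise, we simply write $s_\beta$ for $\overline{s_\beta}$. We observe that $b_\beta(a) \in G(\mathbb{Z}_p)$ for all $a \in \mathbb{Q}_p$. Indeed, we have

$$b_\beta(a) = \phi_\beta\begin{pmatrix} & -1 \\ 1 & a \end{pmatrix} \text{ for } a \in \mathbb{Z}_p, \quad b_\beta(a) = \phi_\beta\begin{pmatrix} c^{-1} & \\ p^m & c \end{pmatrix} \text{ for } a = c p^{-m} \notin \mathbb{Z}_p.$$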

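Let $l \in \mathbb{N}$, let $w = s_{\beta_1} \cdots s_{\beta_l}$ be a reduced representation of $w$ as a product of simple reflections, and write $\underline{\beta} = (\beta_1, \dots, \beta_l)$. For $\underline{a} = (a_1, \dots, a_l) \in \mathbb{Q}_p^l$, let $b(\underline{a}) := b_{\underline{\beta}}(\underline{a})$ denote the image of $b_{\beta_1}(a_1) \cdots b_{\beta_l}(a_l)$ in $U(\mathbb{Z}_p) \backslash G(\mathbb{Q}_p)$. By [6, Proposition 2.1], the map $b_{\underline{\beta}} : \mathbb{Q}_p^l \to U(\mathbb{Z}_p) \backslash G(\mathbb{Q}_p)$ is injective, and its image is contained in $U(\mathbb{Z}_p) \backslash (BwU \cap G(\mathbb{Z}_p))$, where $B = TU$ is the standard Borel subgroup of $G$.

We remark that the map $b_{\underline{\beta}}$ really depends on the choice of the reduced representation of $w$. Moreover, $b_{\underline{\beta}}$ also depends on the choice of the representative $\overline{s_\beta}$ of $s_\beta$. If we fix a representative of $w$ in $G(\mathbb{Z}_p)$, then by convention we choose the representatives such that the toral part of the Bruhat decomposition of $b_{\underline{\beta}}(\underline{a})$ has positive entries.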

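For $\underline{m} = (m_1, \dots, m_l) \in \mathbb{N}_0^l$, we set

$$Y_{\underline{\beta}}(\underline{m}) = \{\underline{a} = (a_1, \dots, a_l) \in \mathbb{Q}_p^l \mid \mu(a_i) = m_i, \; i = 1, \dots, l\}.$$

Let $\beta \in \Delta$, and

$$u = \prod_{\beta' \in \Phi^+} x_{\beta'}(a_{\beta'}) \in U(\mathbb{Q}_p).$$

The $\beta$-coordinate function $f_\beta$ is defined by $f_\beta(u) := a_\beta$. Note that this does not depend on the order of the product. For $a \in \mathbb{Q}_p$ and $u \in U(\mathbb{Q}_p)$, we define

$$R_\beta^a(u) := b_\beta(a)\, u\, b_\beta(a + f_\beta(u))^{-1}.$$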

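Then $R_\beta^a(U(\mathbb{Z}_p)) \subseteq U(\mathbb{Z}_p)$. For a sequence $\underline{\beta} = (\beta_1, \dots, \beta_l)$ of simple roots, we can construct a right action $\ast = \ast_{\underline{\beta}} : \mathbb{Q}_p^l \times U(\mathbb{Z}_p) \to \mathbb{Q}_p^l$, $(\underline{a}, u) \mapsto \underline{a} \ast u =: \underline{a}'$ as follows: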

alal+fβl(u),aj=aj+fβjRβj+1j+1Rβj+2j+2Rβll(u),1jl-1.

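Then $b_{\underline{\beta}} : \mathbb{Q}_p^l \to U(\mathbb{Z}_p) \backslash G(\mathbb{Q}_p)$ is right $U(\mathbb{Z}_p)$-equivariant with respect to $\ast_{\underline{\beta}}$, that is, we have $b_{\underline{\beta}}(\underline{a} \ast_{\underline{\beta}} u) = b_{\underline{\beta}}(\underline{a})\, u$ for $u \in U(\mathbb{Z}_p)$ and $\underline{a} \in \mathbb{Q}_p^l$. It follows that $b_{\underline{\beta}}$ induces a map

$$\underline{b}_{\underline{\beta}} : \mathbb{Q}_p^l / U_w(\mathbb{Z}_p) \to U(\mathbb{Z}_p) \backslash (BwB \cap G(\mathbb{Z}_p)) / U_w(\mathbb{Z}_p),$$

where $\mathbb{Q}_p^l / U_w(\mathbb{Z}_p)$ denotes the set of $U_w(\mathbb{Z}_p)$-orbits with respect to the right action $\ast_{\underline{\beta}}$.

For any cocharacter $\lambda \in \check{X}$, we define

$$Y_\lambda = \bigcup \Big\{ Y_{\underline{\beta}}(\underline{m}) \;\Big|\; \underline{m} \in \mathbb{N}_0^l, \; \lambda = -\sum_{j=1}^{l} m_j \check{\gamma}_j \Big\},$$

where $\gamma_j$ is defined in (2.2). Now we are able to state the main result in this section.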

Lemma 1

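([6, Proposition 3.3]) Let $\lambda \in \check{X}$, $n = \lambda w \in N(\mathbb{Q}_p)$, and let $w = s_{\beta_1} \cdots s_{\beta_l}$ be a reduced representation of $w$ as a product of simple reflections. Write $\underline{\beta} = (\beta_1, \dots, \beta_l)$. Then $\underline{b}_{\underline{\beta}}$ gives a bijection between $Y_\lambda / U_w(\mathbb{Z}_p)$ and the Kloosterman set $X(n)$.

The following lemma gives the trivial bound for Kloosterman sums.

Lemma 2

([6, Proposition 3.4]) Assume the settings above. For $\underline{m} \in \mathbb{N}_0^l$ we have

$$\# Y_{\underline{\beta}}(\underline{m}) / U_w(\mathbb{Z}_p) = p^{\mathrm{ht}(\lambda)} (1 - p^{-1})^{\kappa(\underline{m})},$$

where $\kappa(\underline{m})$ is the number of nonzero entries in $\underline{m}$ and $\mathrm{ht}(\lambda)$ was defined in (2.1).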

Proof of Theorem 1

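We apply the results from Sect. 2 to our case. Let $G = \mathrm{GL}(n+1)$. A set of simple roots of $G$ is given by $\Delta = \{\alpha_1, \dots, \alpha_n\}$, where in usual notation $\alpha_i := e_i - e_{i+1}$. Using this root basis, the set of positive roots of $G$ is given by

$$\Phi^+ = \{\alpha_{ij} := \alpha_i + \alpha_{i+1} + \dots + \alpha_j \mid 1 \leq i \leq j \leq n\}.$$

Throughout this section, we use the following reduced representation of $w_l$:

$$w_l = (s_{\alpha_1} \cdots s_{\alpha_n})(s_{\alpha_1} \cdots s_{\alpha_{n-1}}) \cdots (s_{\alpha_1} s_{\alpha_2})\, s_{\alpha_1}. \tag{3.1}$$

Recall the definition of $\gamma_j$ in (2.2). For the reduced representation (3.1) of $w_l$, we have

$$\underline{\gamma} = (\alpha_{11}, \alpha_{12}, \dots, \alpha_{1n}, \alpha_{22}, \dots, \alpha_{2n}, \dots, \alpha_{nn}).$$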
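The list $\underline{\gamma}$ can be checked mechanically from (2.2). In the sketch below (our own illustration), a root $e_a - e_b$ of GL(n+1) is stored as a vector with entries $+1$ and $-1$, and the simple reflection $s_{\alpha_i}$ acts by swapping coordinates $i$ and $i+1$. The function names are hypothetical; the second word is the reduced word $s_{\alpha_1} \cdots s_{\alpha_n} \cdots s_{\alpha_1}$ used for the Weyl element $w$ in the proof of Theorem 2.

```python
def gammas(word, n):
    """gamma_j = s_{b_1} ... s_{b_{j-1}} (alpha_{b_j}) for a word of simple reflections.

    Returns pairs (i, j) encoding alpha_ij = e_i - e_{j+1}.
    """
    out = []
    for pos, b in enumerate(word):
        v = [0] * (n + 1)
        v[b - 1], v[b] = 1, -1               # alpha_b = e_b - e_{b+1}
        for s in reversed(word[:pos]):       # apply s_{b_{j-1}} first, s_{b_1} last
            v[s - 1], v[s] = v[s], v[s - 1]
        out.append((v.index(1) + 1, v.index(-1)))
    return out

def long_word(n):
    """The reduced word (3.1): (1..n)(1..n-1)...(1 2)(1)."""
    return [i for top in range(n, 0, -1) for i in range(1, top + 1)]

def star_word(n):
    """The word 1, 2, ..., n-1, n, n-1, ..., 2, 1."""
    return list(range(1, n + 1)) + list(range(n - 1, 0, -1))

print(gammas(long_word(3), 3))   # [(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)]
print(gammas(star_word(3), 3))   # [(1, 1), (1, 2), (1, 3), (3, 3), (2, 3)]
```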

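Now we give a characterisation for $b(\underline{a})$, for $\underline{a} = (a_{11}, \dots, a_{1n}, a_{22}, \dots, a_{nn}) \in \mathbb{Q}_p^{n(n+1)/2}$. Note that every $a_{ij} \in \mathbb{Q}_p$ can be written uniquely as $a_{ij} = c_{ij} p^{-m_{ij}}$ with $m_{ij} \geq 0$, $c_{ij} \in \mathbb{Z}_p$, and $(c_{ij}, p^{m_{ij}}) = 1$.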

Lemma 3

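Let $\underline{a} = (a_{11}, \dots, a_{nn}) \in \mathbb{Q}_p^{n(n+1)/2}$. Write $a_{ij} = c_{ij} p^{-m_{ij}}$, with $m_{ij} \geq 0$, $c_{ij} \in \mathbb{Z}_p$, and $(c_{ij}, p^{m_{ij}}) = 1$. Then $b(\underline{a})$ has a Bruhat decomposition $b(\underline{a}) = LNR$, where

[Equation image not reproduced: the shapes of the matrices $L$, $N$, $R$]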

with

Ni(n+2-i)=(-1)(n+1-i)k=1i-1pmk(i-1)k=1npmik,(1in+1),Lij=1δ1<<δj-ij-1Lijδ1,,δj-i,(1i<jn+1),Lijδ1,,δj-i=k=1j-icδk(i-2+k)k=1j-it=δk-1+1δk-1pmt(i-2+k)k=1j-icδk(i-1+k)k=1j-ipmk(j-1),Rij=n+1-i1j-inRij1,,j-i,(1i<jn+1),Rij1,,j-i=k=1j-ic(n+2-i-k)kk=1+1npm(n+2-i)kk=1j-ic(n+3-i-k)kk=1j-it=kk+1pm(n+2-i-k)t.

To interpret the formula above, we set cij:=1 and mij:=0 if the condition 1ijn is not satisfied, and δ0:=0, j-i+1:=n. As a convention, when mij=0 for 1ijn, we define cij-1:=0 as a formal symbol.

Remarks: The hard part is to find these explicit formulae for the Bruhat decomposition. Once the formulae are given, the proof is a straightforward inductive verification by simply matching terms on both sides of the matrix equation. The indices for N look overly complicated, but are chosen in analogy with the ones in Lemma 5 below. Our application of Lemma 3 to the proof of Theorem 1 will only require the values of Lij and Rij on the first off-diagonal, i.e. for j-i=1 for which the formulae simplify substantially.

Proof

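First we justify the definition $c_{ij}^{-1} := 0$ as a formal symbol when $m_{ij} = 0$. We recall the definition of $b_\beta(a)$ for $\beta \in \Phi^+$, $a \in \mathbb{Q}_p$. We write $a$ as a product $a = c p^{-m}$, with $m \geq 0$, $c \in \mathbb{Z}_p$, and $(c, p^m) = 1$. When $m \geq 1$, we have

$$b_\beta(c p^{-m}) = \phi_\beta\left( \begin{pmatrix} 1 & c^{-1} p^{-m} \\ & 1 \end{pmatrix} \begin{pmatrix} & -p^{-m} \\ p^{m} & \end{pmatrix} \begin{pmatrix} 1 & c p^{-m} \\ & 1 \end{pmatrix} \right). \tag{3.2}$$

Meanwhile, when $m = 0$ we have

$$b_\beta(c) = \phi_\beta\left( \begin{pmatrix} 1 & 0 \\ & 1 \end{pmatrix} \begin{pmatrix} & -1 \\ 1 & \end{pmatrix} \begin{pmatrix} 1 & c \\ & 1 \end{pmatrix} \right).$$

Hence (3.2) also works for $m = 0$ if we treat $c^{-1} := 0$ as a formal symbol. As $b_{\underline{\beta}}(\underline{a})$ is a product of such matrices, our convention is justified.

Now we prove the actual formula by induction. For easier manipulation, we assume $m_{ij} \geq 1$ for all $1 \leq i \leq j \leq n$; when some $m_{ij} = 0$ we apply the convention $c_{ij}^{-1} := 0$ to the result. When $n = 1$, the formula reads

$$b_{\alpha_1}(a_{11}) = \begin{pmatrix} 1 & c_{11}^{-1} p^{-m_{11}} \\ & 1 \end{pmatrix} \begin{pmatrix} & -p^{-m_{11}} \\ p^{m_{11}} & \end{pmatrix} \begin{pmatrix} 1 & c_{11} p^{-m_{11}} \\ & 1 \end{pmatrix},$$

which is precisely (2.3). For the general case, let

$$\underline{\beta}[n] := (\alpha_1, \dots, \alpha_n, \alpha_1, \dots, \alpha_{n-1}, \dots, \alpha_1, \alpha_2, \alpha_1)$$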

denote the reduced representation (3.1). By induction, we have a Bruhat decomposition

bβ_[n-1](a22,,ann)=L1N1R1,

where the entries L, N, R are given by the formulae above, with indices ij replaced with (i+1)(j+1). By a slight abuse of notation, we shall denote the matrices above also by L, N, R respectively. On the other hand, it is straightforward to compute that

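$$\Upsilon := b_{(\alpha_1, \dots, \alpha_n)}(a_{11}, \dots, a_{1n}) = \begin{pmatrix} \frac{1}{c_{11}} & & & & \\ p^{m_{11}} & \frac{c_{11}}{c_{12}} & & & \\ & p^{m_{12}} & \frac{c_{12}}{c_{13}} & & \\ & & \ddots & \ddots & \\ & & & p^{m_{1n}} & c_{1n} \end{pmatrix}.$$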

So it remains to show that

LNR=ΥLNR=bβ_[n](a_) 3.3

is indeed a Bruhat decomposition of bβ_[n](a_). This is a straightforward brute force computation. For convenience, we provide the details. We expand

(LNR)ij=k,LikNkRj,(ΥLNR)ij=r,k,ΥirLrkNkRj.

As each row of N has exactly one nonzero entry, we can collapse the sum and write

(LNR)ij=kLikNk(n+2-k)R(n+2-k)j=kδ,Lik(δ)R(n+2-k)j(). 3.4

By the same argument, we write

(ΥLNR)ij=c1(i-1)c1ik=1nLikNk(n+1-k)R(n+1-k)j+Li(n+1)R(n+1)j+pm1(i-1)k=1nL(i-1)kNk(n+1-k)R(n+1-k)j+L(i-1)(n+1)R(n+1)j. 3.5

Since

Li(n+1)R(n+1)j=1ifi=j=n+1,0otherwise,

it follows that for 1i,jn we have

(ΥLNR)ij=c1(i-1)c1ik=1nLikNk(n+1-k)R(n+1-k)j+pm1(n-1)k=1nL(i-1)kNk(n+1-k)R(n+1-k)j=c1(i-1)c1ik=1nδ,Lik(δ)Nk(n+1-k)R(n+1-k)j()+pm1(n-1)k=1nδ,L(i-1)k(δ)Nk(n+1-k)R(n+1-k)j(),

where

Lij(δ)=k=1j-ic(δk+1)(i-1+k)k=1j-it=δk-1+1δk-1pm(t+1)(i-1+k)k=1j-ic(δk+1)(i+k)k=1j-ipm(k+1)j,Rij()=Rij().

It is then straightforward to verify that

Lik(1,δ2,,δk-i)Nk(n+2-k)R(n+2-k)j()=c1(i-1)c1iLi(k-1)(δ2-1,,δk-i-1)N(k-1)(n+2-k)R(n+2-k)j(),

and

Lik(δ1,,δk-i)Nk(n+2-k)R(n+2-k)j()=pm1(i-1)L(i-1)(k-1)(δ1-1,,δk-i-1)N(k-1)(n+2-k)R(n+2-k)j()

if δ12. Matching the terms with (3.4) yields (LNR)ij=(ΥLNR)ij for 1i,jn.

Now consider the case where 1in, j=n+1. From (3.5), we deduce that

(ΥLNR)i(n+1)=0,1in.

It remains to show that (LNR)i(n+1)=0 for 1in. By straightforward computation, we have

Lik(δ1,,δk-i)Nk(n+2-k)R(n+2-k)j(1,,k-1)=-Lik(δ1,,δk-i+1)N(k+1)(n+1-k)R(n+1-k)j(1,,k),

where 1=k, =maxk,-1 for 2k, and δk-i+1=k+=1k-1-=2k. Putting this back into (3.4) yields (LNR)i(n+1)=0 as desired.

Now consider the case where i=n+1, 1jn. Then (3.4) and (3.5) say

(ΥLNR)(n+1)j=pm1nk=1nLnkNk(n+1-k)R(n+1-k)j=pm1(i-1)Nn1R1j.(LNR)(n+1)j=k=1n+1L(n+1)kNk(n+2-k)R(n+2-k)j=N(n+1)1R1j.

Since r1j()=r1j(), and N(n+1)1=pm1nNn1, it follows that (ΥLNR)(n+1)j=(LNR)(n+1)j.

Finally, for i=j=n+1, we have

(LNR)(n+1)(n+1)=N(n+1)1R1(n+1)=c1n=(ΥLNR)(n+1)(n+1).

So (3.3) holds, finishing the proof.

Lemma 4

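Assume the settings above. For $\underline{m} \in \mathbb{N}_0^{n(n+1)/2}$, a complete system of coset representatives for $Y_{\underline{\beta}}(\underline{m}) / U(\mathbb{Z}_p)$ is given by

$$\Big\{ (c_{ij} p^{-m_{ij}})_{1 \leq i \leq j \leq n} \;\Big|\; c_{ij} \; \Big(\mathrm{mod}\, \prod_{k=j}^{n} p^{m_{ik}}\Big), \; (c_{ij}, p^{m_{ij}}) = 1 \Big\}.$$

Remark: The shape of the system is not completely obvious (to us), but once it is given, the verification is somewhat lengthy, but straightforward.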
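As a consistency check between Lemma 2 and Lemma 4, one can count the proposed representatives by brute force and compare with $p^{\mathrm{ht}(\lambda)}(1-p^{-1})^{\kappa(\underline{m})}$, where $\mathrm{ht}(\lambda) = \sum_{i \leq j} (j-i+1) m_{ij}$ as in (1.7). The sketch below is only an illustration for small parameters; the function names are ours.

```python
from math import gcd

def count_representatives(m, n, p):
    """Number of tuples (c_ij) with c_ij mod p^(m_ij + ... + m_in), (c_ij, p^m_ij) = 1."""
    total = 1
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            modulus = p ** sum(m[(i, k)] for k in range(j, n + 1))
            total *= sum(1 for c in range(modulus) if gcd(c, p ** m[(i, j)]) == 1)
    return total

def dabrowski_reeder_count(m, n, p):
    """p^ht * (1 - 1/p)^kappa, computed exactly as p^(ht - kappa) * (p - 1)^kappa."""
    ht = sum((j - i + 1) * v for (i, j), v in m.items())
    kappa = sum(1 for v in m.values() if v != 0)
    return p ** (ht - kappa) * (p - 1) ** kappa

m = {(1, 1): 0, (1, 2): 2, (2, 2): 1}   # n = 2, so r = (2, 3)
print(count_representatives(m, 2, 3), dabrowski_reeder_count(m, 2, 3))  # 108 108
```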

Proof

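From Lemma 2 we already know the number of coset representatives needed. So it remains to show that all these coset representatives are inequivalent under right action by $U(\mathbb{Z}_p)$. Again we argue inductively. The case $n = 1$ is straightforward to verify. Indeed, suppose we have

$$\begin{pmatrix} 1 & c_{11} p^{-m_{11}} \\ & 1 \end{pmatrix} = \begin{pmatrix} 1 & c'_{11} p^{-m_{11}} \\ & 1 \end{pmatrix} \begin{pmatrix} 1 & u_{11} \\ & 1 \end{pmatrix}$$

for some $u_{11} \in \mathbb{Z}_p$. This actually says

$$\frac{c_{11}}{p^{m_{11}}} = \frac{c'_{11}}{p^{m_{11}}} + u_{11},$$

which implies $u_{11} = 0$, and $c_{11} = c'_{11}$ as desired.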

Now we consider the general case. For n=r, we set

R:=bβ_(c11p-m11,,crrp-mrr),R:=bβ_(c11p-m11,,crrp-mrr),

and

u=1u11···u1r1···u2r1,

such that

R=Ru. 3.6

Removing the final column and the final row of the matrices yields the problem for n=r-1 (with a renaming of variables cijc(i-1)(j-1),cijc(i-1)(j-1)). By induction, we deduce that the first r rows and columns of R and R are identical, and uij=0 for all 1ijr-1. Using Lemma 3, we deduce that cij=cij for 2ijr.

It remains to consider the final columns of the matrices. The (1,r+1)-th entry of (3.6) reads

c1rj=1rp-mjr=c1rj=1rp-mjr+j=2r+1u(r+2-j)rcjrk=jrp-mkr, 3.7

where again we set c(r+1)r:=1. From (3.7), we deduce that c1r=c1r. It then follows that

urr=-R1r-1u(r-1)rR1(r-1)++u2rR12+u1r=-j=1r-1ujrc(r+2-j)rc2rk=2r+1-jpmkr. 3.8

Now we turn to the (2,r+1)-th entry of (3.6). It reads

R2(r+1)=R2(r+1)+j=2rujnR2j.

From Lemma 3, we see that there is exactly one term in R2(r+1) and R2(r+1) that depends on c1j for some jr-1. Since cij=cij for 2ijr, we can remove all the other terms within R2(r+1) and R2(r+1), and obtain

c1(r-1)pmrr-m1(r-1)-m1rk=2r-1pmk(r-1)=c1(r-1)pmrr-m1(r-1)-m1rk=2r-1pmk(r-1)+j=2rujrR2j.

To show that c1(r-1)=c1(r-1), it suffices to prove that

j=2rujrR2jpmrrk=2r-1p-mk(r-1)Zp.

Using (3.8), we rewrite

j=2rujrR2j=j=1r-1ujrR2j-R1jR2rR1r. 3.9

We expand

R2j=n-11j-2nR2j(),

where

R2j()=c(r-1)1c(r+2-j)j-2cr1c(r+3-j)j-21pMpmrr,1r,1,1=r.

where

M=m(r-1)1++m(r-1)2+m(r-2)2++m(r-2)3++m(r+2-j)j-2++m(r+2-j)r.

For 0lj-2, we write [j](l)=(1,,j-2), with k=r-1 if kl, and k=r otherwise. From (3.8), it is easy to check that

R1jR2r([r](k))R1rpmrrk=2r-1p-mk(r-1)Zp.

for kj-1. On the other hand, we verify that

R2j([j](k))=R1jR2r([r](k))R1r

for 0jk-2. So we conclude that

R2j-R1jR2rR1rpmrrk=2r-1p-mk(r-1)Zp

for 1jr-1. The claim then follows from (3.9).

By similar arguments, we proceed inductively and show that

j=irujrRijk=r+2-inpm(r+2-i)kk+2r+1-ip-mk(r+1-i)Zp

for 3ir, and thus c1j=c1j for 1jr-2. This finishes the proof of the statement.

Combining the previous computations, we complete the proof of Theorem 1, noting that $\lambda = -\sum_{j=1}^{n} r_j \check{\alpha}_j \in \check{X}$, $r_j \geq 0$, corresponds to the components of $r = (r_1, \dots, r_n)$. From Lemma 4 we obtain the summation condition in (1.6) and from Lemma 3 the shape of the exponential for two characters as in (1.4).

Proof of Theorem 2

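The proof of Theorem 2 is similar. We omit the analogous straightforward verification and just write down the relevant formulae. We fix a reduced representation of $w$ as follows

$$w = s_{\alpha_1} s_{\alpha_2} \cdots s_{\alpha_{n-1}} s_{\alpha_n} s_{\alpha_{n-1}} \cdots s_{\alpha_2} s_{\alpha_1}.$$

Recall the definition of $\gamma_j$ in (2.2). For the reduced representation of $w$, we have

$$\underline{\gamma} = (\alpha_{11}, \alpha_{12}, \dots, \alpha_{1n}, \alpha_{nn}, \dots, \alpha_{2n}).$$

Now we give a characterisation for $b(\underline{a})$, for $\underline{a} = (a_{11}, \dots, a_{1n}, a_{nn}, \dots, a_{2n}) \in \mathbb{Q}_p^{2n-1}$. Again, every $a_{ij} \in \mathbb{Q}_p$ can be written uniquely as $a_{ij} = c_{ij} p^{-m_{ij}}$, with $m_{ij} \geq 0$, $c_{ij} \in \mathbb{Z}_p$, and $(c_{ij}, p^{m_{ij}}) = 1$.

Lemma 5

Let $\underline{a} = (a_{11}, \dots, a_{1n}, a_{nn}, \dots, a_{2n}) \in \mathbb{Q}_p^{2n-1}$. Write $a_{ij} = c_{ij} p^{-m_{ij}}$, with $m_{ij} \geq 0$, $c_{ij} \in \mathbb{Z}_p$, and $(c_{ij}, p^{m_{ij}}) = 1$. Then $b(\underline{a})$ has a Bruhat decomposition $b(\underline{a}) = LNR$, where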

L=1L12L13···L1(n+1)1L23···L2(n+1)1Ln(n+1)1,N=N1(n+1)N22NnnN(n+1)1,R=1R12R13···R1(n+1)1R23···R2(n+1)1Rn(n+1)1,

where

N1(n+1)=(-1)nj=1np-m1j,Nii=-pm1(i-1)-min(2in),N(n+1)1=j=1npmjn,L1j=(-1)jc1(j-1)-1k=1j-1p-m1k(2jn),L1(n+1)=c11-1c2n-1k=1np-mkn,Lij=(-1)j-i+1c1(i-1)c1(j-1)k=ij-1p-m1k+c1ic(i+1)nc1(j-1)cinpm1(i-1)-mink=ij-1p-m1k(2i<jn),Li(n+1)=c1(i-1)c1ic(i+1)np-m1nk=i+1np-mkn+cin-1pm1(i-1)-m1nk=inp-min(2in),R1j=cjnk=2jp-mkn(2jn),R1(n+1)=c1nk=1np-mkn,Ri(n+1)=(-1)n-ic1ic(i+1)ncink=inp-m1k+c1(i-1)pmink=i-1np-m1k(2in).

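To interpret the formula above, we set $c_{ij} := 1$ and $m_{ij} := 0$ if the condition $1 \leq i \leq j \leq n$ is not satisfied. As a convention, when $m_{ij} = 0$ for $1 \leq i \leq j \leq n$, we define $c_{ij}^{-1} := 0$ as a formal symbol.

Proof

Similar to the proof of Lemma 3.

Lemma 6

Assume the settings above. For $\underline{m} \in \mathbb{N}_0^{2n-1}$, a complete system of coset representatives for $Y_{\underline{\beta}}(\underline{m}) / U(\mathbb{Z}_p)$ is given by

$$\Big\{ (c_{ij} p^{-m_{ij}})_{\substack{i = 1, \; 1 \leq j \leq n \\ 2 \leq i \leq n, \; j = n}} \;\Big|\; c_{1j} \; \Big(\mathrm{mod}\, \prod_{k=j}^{n} p^{m_{1k}}\Big), \; 1 \leq j \leq n, \;\; c_{in} \; \Big(\mathrm{mod}\, \prod_{k=2}^{i} p^{m_{kn}}\Big), \; 2 \leq i \leq n, \;\; (c_{ij}, p^{m_{ij}}) = 1 \Big\}.$$

Proof

Similar to the proof of Lemma 4.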

Combining the above results, we complete the proof of Theorem 2.

Non-trivial bounds for Kloosterman sums

General preparation

In this section we prove Corollaries 1 and 2. We first prove Corollary 1. The idea is that the partial Kloosterman sum $\mathrm{Kl}_p(\underline{m}, \psi, \psi', w_l)$ defined in (1.6) with

$$\sum_{i \leq k \leq j} m_{ij} = r_k, \quad 1 \leq k \leq n, \tag{5.1}$$

is a nested sum of classical GL(2) Kloosterman sums, for which we have Weil’s bound (1.2) available. We start with a simple lemma.

Lemma 7

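Let $\gamma_1, \gamma_2 \geq 0$, $b_1, b_2 \in \mathbb{Z}$ with $\min(b_1, b_2) < 0$. Then

$$\sum_{c_1 = 1}^{p^{\gamma_1}} \sum_{c_2 = 1}^{p^{\gamma_2}} |c_1 p^{b_1} + c_2 p^{b_2}|_p^{-1/2} \ll p^{\gamma_1 + \gamma_2 + \frac{1}{2}\min(b_1, b_2)}.$$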

Proof

The sum on the left hand side equals

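$$\sum_{\delta_1 = 0}^{\gamma_1} \sum_{\delta_2 = 0}^{\gamma_2} \sum_{\substack{c_1 = 1 \\ (c_1, p) = 1}}^{p^{\gamma_1 - \delta_1}} \sum_{\substack{c_2 = 1 \\ (c_2, p) = 1}}^{p^{\gamma_2 - \delta_2}} |c_1 p^{b_1 + \delta_1} + c_2 p^{b_2 + \delta_2}|_p^{-1/2}. \tag{5.2}$$

Suppose for notational simplicity that $b_2 \leq b_1$ (the other case is completely analogous). Let us first assume that $b_1 + \delta_1 \neq b_2 + \delta_2$. Then the two inner sums are bounded by

$$p^{\gamma_1 - \delta_1 + \gamma_2 - \delta_2 + \frac{1}{2}\min(\delta_1 + b_1, \delta_2 + b_2)} \leq p^{\gamma_1 + \gamma_2 + \frac{1}{2} b_2 - \delta_1 - \frac{1}{2}\delta_2}$$

where the second inequality can be seen by distinguishing the cases $\delta_2 + b_2 \leq \delta_1 + b_1$ and $\delta_2 + b_2 > \delta_1 + b_1$.

Let us now assume $b_1 + \delta_1 = b_2 + \delta_2$. Then the inner two sums are at most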

p12(b2+δ2)δmax(γ2-δ2,γ1-δ1)pδ/2c1=1pγ1-δ1c2=1pγ2-δ2pδc1+c21p12(b2+δ2)δmax(γ2-δ2,γ1-δ1)pδ/2(pmin(γ1-δ1,γ2-δ2)+pγ1-δ1+γ2-δ2pδ)p12(b2+δ2)(p12(γ1+γ2-δ1-2δ2)+pγ1+γ2-δ1-δ2).

(For p=2 the δ-sum runs up to max(γ2-δ2,γ1-δ1)+1.) Thus in all cases we bound (5.2) by

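$$\sum_{\delta_1 = 0}^{\gamma_1} \sum_{\delta_2 = 0}^{\gamma_2} p^{\gamma_1 + \gamma_2 + \frac{1}{2}\min(b_1, b_2) - \frac{1}{2}(\delta_1 + \delta_2)} \ll p^{\gamma_1 + \gamma_2 + \frac{1}{2}\min(b_1, b_2)},$$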

and the lemma follows.
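For small parameters the two sides of Lemma 7 can be compared directly. The following sketch (with illustrative helper names of our own) evaluates the left hand side exactly via $|x|_p^{-1/2} = p^{v_p(x)/2}$ and prints the ratio to $p^{\gamma_1 + \gamma_2 + \frac{1}{2}\min(b_1, b_2)}$; it is a numerical illustration only and says nothing about the implied constant in general.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def lemma7_lhs(p, g1, g2, b1, b2):
    """Sum of |c1 p^b1 + c2 p^b2|_p^(-1/2) over 1 <= c1 <= p^g1, 1 <= c2 <= p^g2."""
    total = 0.0
    for c1 in range(1, p**g1 + 1):
        for c2 in range(1, p**g2 + 1):
            x = c1 * Fraction(p) ** b1 + c2 * Fraction(p) ** b2
            total += p ** (vp(x, p) / 2)   # |x|_p^(-1/2) = p^(v_p(x)/2)
    return total

def lemma7_rhs(p, g1, g2, b1, b2):
    return p ** (g1 + g2 + min(b1, b2) / 2)

for params in [(3, 1, 1, -1, -1), (3, 2, 2, -1, -2), (5, 2, 1, -3, -1)]:
    print(params, lemma7_lhs(*params) / lemma7_rhs(*params))
```

For $p = 3$, $\gamma_1 = \gamma_2 = 1$, $b_1 = b_2 = -1$ the left hand side is $3 + 2\sqrt{3}$: three of the nine pairs have $3 \mid c_1 + c_2$ and contribute 1 each, while the other six contribute $3^{-1/2}$ each.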

We return to the partial Kloosterman sum (1.6) for the long Weyl element. Let

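$$C_{ij} = p^{m_{ij} + \dots + m_{in}} \tag{5.3}$$

be the modulus of the $c_{ij}$-sum, for any $1 \leq i \leq j \leq n$. For $j < i$ we put $C_{ij} = 1$.

Let us fix one variable $c_{ij}$. Then the $c_{ij}$-sum in (1.6) is given by

$$\Sigma_{ij} := \sum_{\substack{1 \leq c_{ij} \leq C_{ij} \\ (c_{ij}, p^{m_{ij}}) = 1}} e(c_{ij} A + \bar{c}_{ij} B) \tag{5.4}$$

where

$$A = A_{ij} = \psi_{j+1} \bar{c}_{i,j+1} p^{a_1} + \psi'_{n+1-i} \bar{c}_{i+1,j} p^{a_2}, \quad B = B_{ij} = \psi_j c_{i,j-1} p^{b_1} + \psi'_{n+2-i} c_{i-1,j} p^{b_2}$$

with

$$\begin{aligned} a_1 = a_1(i,j) &= m_{1,j} + \dots + m_{i-1,j} - m_{1,j+1} - \dots - m_{i,j+1}, \\ a_2 = a_2(i,j) &= m_{i+1,n} + \dots + m_{i+1,j+1} - m_{i,n} - \dots - m_{i,j}, \\ b_1 = b_1(i,j) &= m_{1,j-1} + \dots + m_{i-1,j-1} - m_{1,j} - \dots - m_{i,j}, \\ b_2 = b_2(i,j) &= m_{i,n} + \dots + m_{i,j+1} - m_{i-1,n} - \dots - m_{i-1,j}. \end{aligned} \tag{5.5}$$

Here we apply the following conventions, in this order: if $m_{ij} = 0$ for some $1 \leq i \leq j \leq n$, we put $\overline{c_{ij}} = 0$. If $j < i$, we put $m_{ij} = 0$ and $c_{ij} = 1$ and $C_{ij} = 1$. If none of the above cases apply, and $i < 0$ or $j > n$, we put $c_{ij} = \overline{c_{ij}} = m_{ij} = 0$ and $C_{ij} = 1$.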

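Let us assume

$$m_{ij} \neq 0.$$

Let $v_p(A) = -\alpha$, $v_p(B) = -\beta$. Assume without loss of generality $\alpha \geq \beta$, the other case is analogous. If $\alpha \leq 0$, then trivially $|\Sigma_{ij}| \leq C_{ij}$. If $\alpha > 0$, we extend the range of summation to avoid issues of well-definedness, and obtain by Weil’s bound

$$|\Sigma_{ij}| = \Bigg| p^{-\alpha} \sum_{\substack{1 \leq c_{ij} \leq C_{ij} p^{\alpha} \\ (c_{ij}, p) = 1}} e\Big( \frac{c_{ij}\, A C_{ij} p^{\alpha} + \bar{c}_{ij}\, B C_{ij} p^{\alpha}}{C_{ij} p^{\alpha}} \Big) \Bigg| \leq 2 p^{-\alpha} (C_{ij} p^{\alpha})^{1/2} (C_{ij}, C_{ij} p^{\alpha - \beta}, C_{ij} p^{\alpha})^{1/2} = 2 C_{ij} p^{-\alpha/2}.$$

We conclude in all cases (still assuming $m_{ij} \neq 0$)

$$|\Sigma_{ij}| \leq 2 C_{ij} \min(1, |A_{ij}|_p^{-1/2}, |B_{ij}|_p^{-1/2}). \tag{5.6}$$

Note that this uses no specific information about $A$ and $B$ and holds for any sum of the type (5.4).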

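From this and the previous lemma we see that

$$\sum_{c_{i,j-1}} \sum_{c_{i-1,j}} |\Sigma_{ij}| \ll \Big(\max_{1 \leq j \leq n} |\psi_j|_p^{-1/2}\Big) C_{i,j-1} C_{i-1,j} C_{ij}\, p^{\frac{1}{2}\min(0, b_1(i,j))}$$

if $i \neq 1$ (in which case $i - 1 > 0$). Note that this continues to hold for $i = j$ by our general conventions. If $i = 1$ (in which case $c_{i-1,j} = 0$), a similar, but simpler argument confirms the bound, too. Here we dropped potential savings in the exponents $a_1, a_2, b_2$.

If $m_{ij} = 0$ we simply estimate trivially, and therefore obtain in all cases the bound

$$\sum_{c_{i,j-1}} \sum_{c_{i-1,j}} |\Sigma_{ij}| \ll \Big(\max_{1 \leq j \leq n} |\psi_j|_p^{-1/2}\Big) C_{i,j-1} C_{i-1,j} C_{ij}\, p^{\frac{1}{2}\tilde{b}(i,j)} \tag{5.7}$$

where

$$\tilde{b}(i,j) := \delta_{m_{ij} \neq 0} \min(0, b_1(i,j)).$$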

A soft argument

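With a view towards possible generalizations we first demonstrate a soft argument. We define an ordering on the set of indices $(i,j)$, $1 \leq i \leq j \leq n$ as follows

$$(1,n) < (1,n-1) < \dots < (1,1) < (2,n) < \dots < (2,2) < \dots < (n,n). \tag{5.8}$$

Let

$$\mu_{ij} = \max_{(\alpha,\beta) < (i,j)} m_{\alpha\beta}.$$

Then (5.7) implies

$$\sum_{\substack{c_{\nu\mu} \\ (\nu,\mu) \neq (i,j)}} \Big| \sum_{c_{ij}} (\dots) \Big| \ll \Big(\max_{1 \leq j \leq n} |\psi_j|_p^{-1/2}\Big) \Big(\prod_{1 \leq i \leq j \leq n} C_{ij}\Big) p^{-m_{ij}/2 + O(\mu_{ij})}$$

(which is trivially true if $m_{ij} = 0$) for any $(i,j)$. Choosing the index pair $(i,j)$ suitably, we conclude

$$\mathrm{Kl}_p(\underline{m}, \psi, \psi', w_l) \ll \Big(\max_{1 \leq j \leq n} |\psi_j|_p^{-1/2}\Big) p^{r_1 + \dots + r_n - \delta_0 \max_{i,j} m_{ij}}$$

for some $\delta_0 > 0$ (depending on $n$), from which we easily obtain the statement of Corollary 1, observing that the number of $\underline{m}$ for a given vector $r$ is $O(p^{\varepsilon(r_1 + \dots + r_n)})$ for every $\varepsilon > 0$.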

A refined argument

The previous argument uses cancellation only in one index pair (ij). It is very flexible and requires only the ordering (5.8), but no further computations. On the other hand, it gives only a small value of δ (exponentially decreasing in n). A more refined argument runs as follows. We partition the index pairs into 4 classes depending on the parity of i and j and obtain

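$$\mathrm{Kl}_p(\underline{m}, \psi, \psi', w_l) \ll \Big(\prod_{1 \leq i \leq j \leq n} C_{ij}\Big) \Big(\prod_{\substack{1 \leq i \leq j \leq n \\ i \equiv i_0 \,(\mathrm{mod}\, 2) \\ j \equiv j_0 \,(\mathrm{mod}\, 2)}} p^{\frac{1}{2}\tilde{b}(i,j)}\Big)$$

for $i_0, j_0 \in \{0, 1\}$. Recall that

$$\prod_{1 \leq i \leq j \leq n} C_{ij} = \prod_{1 \leq i \leq j \leq n} p^{(j-i+1) m_{ij}},$$

cf. (5.3) and also (1.7). Taking geometric means, we get

$$\mathrm{Kl}_p(\underline{m}, \psi, \psi', w_l) \ll \Big(\prod_{1 \leq i \leq j \leq n} C_{ij}\Big) \Big(\prod_{1 \leq i \leq j \leq n} p^{\frac{1}{8}\tilde{b}(i,j)}\Big).$$

We now observe that

$$\sum_{1 \leq i \leq j \leq n} (n+1-j)\, b_1(i,j) = -\sum_{1 \leq i \leq j \leq n} (j-i+1)\, m_{ij}$$

and

$$\sum_{i_0 = 1}^{i} \tilde{b}(i_0, j) \leq b_1(i,j).$$

Taken together, this implies

$$\sum_{1 \leq i \leq j \leq n} n^2\, \tilde{b}(i,j) \leq \sum_{1 \leq i \leq j \leq n} (n+1-j)(n+1-i)\, \tilde{b}(i,j) \leq -\sum_{1 \leq i \leq j \leq n} (j-i+1)\, m_{ij},$$

and so

$$\mathrm{Kl}_p(\underline{m}, \psi, \psi', w_l) \ll \Big(\prod_{1 \leq i \leq j \leq n} C_{ij}\Big)^{1 - \frac{1}{8n^2}}.$$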
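The combinatorial identity and the resulting inequality used in the refined argument can be tested on random exponent arrays. The sketch below takes $b_1(i,j)$ from (5.5) and the truncated quantity $\delta_{m_{ij} \neq 0} \min(0, b_1(i,j))$ from (5.7), here called `b_tilde`; the function names and the random sampling are our own illustrative choices.

```python
import random

def pairs(n):
    return [(i, j) for i in range(1, n + 1) for j in range(i, n + 1)]

def b1(m, i, j):
    """b_1(i,j) = m_{1,j-1} + ... + m_{i-1,j-1} - m_{1,j} - ... - m_{i,j}, see (5.5)."""
    up = sum(m.get((k, j - 1), 0) for k in range(1, i))
    down = sum(m.get((k, j), 0) for k in range(1, i + 1))
    return up - down

def b_tilde(m, i, j):
    """Truncated exponent: min(0, b_1(i,j)) if m_ij != 0, and 0 otherwise."""
    return min(0, b1(m, i, j)) if m[(i, j)] != 0 else 0

def check(m, n):
    weight = sum((j - i + 1) * m[(i, j)] for (i, j) in pairs(n))
    identity = sum((n + 1 - j) * b1(m, i, j) for (i, j) in pairs(n)) == -weight
    inequality = n * n * sum(b_tilde(m, i, j) for (i, j) in pairs(n)) <= -weight
    return identity and inequality

random.seed(0)
n = 4
samples = [{ij: random.randint(0, 3) for ij in pairs(n)} for _ in range(200)]
print(all(check(m, n) for m in samples))
```

The inequality is exactly what turns the product of the factors $p^{\tilde{b}(i,j)/8}$ into the saving $(\prod C_{ij})^{-1/(8n^2)}$.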

The element w

The proof of Corollary 2 is similar. We apply again a soft argument and use the ordering

(1n)<(1,n-1)<<(11)<(2,n)<(3,n)<<(n,n).

Analyzing (1.8), we see that Σij is of the shape (5.4) with

A1n=ψncnn¯p-m1n,B1n=ψnc1(n-1)p-m1n,A1(n-1)=ψnpmnn-m1(n-1)+m1n+ψnc1n¯p-m1n,B1(n-1)=ψn-1c1(n-2)p-m1(n-1),A1j=ψj+1c1(j+1)¯p-m1(j+1),B1j=ψjc1(j-1)p-m1j,1jn-2,A2n=ψ1p-m2n,B2n=ψ2c3np-m12+m11-m2n,Ain=ψi-1c(i-1)n¯p-m1(i-1)+m1(i-2)-m(i-1)n,Bin=ψic(i+1)np-m1i+m1(i-1)-min,3in-1,Ann=ψn-1c(n-1)n¯p-m1(n-1)+m1(n-2)-m(n-1)n,Bnn=ψnc1np-m1n+ψnpm1(n-1)-m1n-mnn

with the same conventions as explained after (5.5).

Arguing as before based on (5.6), we obtain

cνμ(ν,μ)(i,j)|cij(...)|(max1jn|ψj|p-1/2)pr1++rn-mij/2+O(μij)

and conclude the proof as in Sect. 5.2 for some $\delta > 0$.

We can make this quantitative as in Sect. 5.3. We have

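$$\begin{aligned}
\sum_{c_{11}} (\dots) &\ll |\psi_1|_p^{-1/2} C_{11}\, p^{-m_{11}/2},\\
\sum_{c_{1(j-1)}} \Big| \sum_{c_{1j}} (\dots) \Big| &\ll |\psi_j|_p^{-1/2} C_{1(j-1)} C_{1j}\, p^{-m_{1j}/2}, \quad 2 \le j \le n,\\
\sum_{c_{(i+1)n}} \Big| \sum_{c_{in}} (\dots) \Big| &\ll |\psi_i|_p^{-1/2} C_{(i+1)n} C_{in}\, p^{-\delta_{m_{in} \neq 0} (m_{in} + m_{1i} - m_{1(i-1)})/2}, \quad 2 \le i \le n-1,
\end{aligned}$$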

and by a small variation of Lemma 5.6, we also have

$$\sum_{c_{1n}} \Big| \sum_{c_{nn}} (\dots) \Big| \ll |\psi_n|_p^{-1/2} C_{1n} C_{nn}\, p^{-\delta_{m_{nn} \neq 0} (m_{nn} + m_{1n} - m_{1(n-1)})/2}.$$

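We put $\tilde{b}(1,j) = -m_{1j}$, $\tilde{b}(i,n) = -m_{in} - m_{1n} + m_{1(n-1)}$ for $i \ge 2$, and $b(i,j) = \delta_{m_{ij} \neq 0} \min(0, \tilde{b}(i,j))$. We put the $2n-1$ nodes $(i,j)$ with $i = 1$ or $j = n$ into the two classes $\mathcal{C}_1$ with indices of the form (1, odd), (even, $n$) and $\mathcal{C}_2$ with indices of the form (1, even), (odd, $n$). Then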

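$$\mathrm{Kl}_p(\underline{m}, \psi, \psi', w) \ll \Big(\prod_{i,j} C_{ij}\Big) \min_{\nu = 1,2} \Big(\prod_{(i,j) \in \mathcal{C}_\nu} p^{\frac{1}{2} b(i,j)}\Big) \le \Big(\prod_{i,j} C_{ij}\Big) \Big(\prod_{i,j} p^{\frac{1}{4} b(i,j)}\Big).$$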

We now observe that

$$\sum_{2 \le i \le n} (n+1-i)\, b(i,n) + \sum_{1 \le j \le n} n\, b(1,j) \le -\sum_{i,j} (j-i+1)\, m_{ij},$$

and so

$$\mathrm{Kl}_p(\underline{m}, \psi, \psi', w) \ll \Big(\prod_{i,j} C_{ij}\Big)^{1 - \frac{1}{4n}}.$$

An exact evaluation

Here we prove Corollary 3. For the vector $r = (1, \dots, 1)$, the relevant $\underline{m}$ satisfying (5.1) are

  • $m_{1n} = 1$, $m_{ij} = 0$ otherwise;

  • $m_{1k} = m_{k+1,n} = 1$, $m_{ij} = 0$ otherwise, for some $1 \le k \le n-1$.

In the first case we obtain

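$$\mathrm{Kl}_p(\underline{m}, \psi, \psi', w) = \sum_{c_{11}, \dots, c_{1(n-2)} \,(\mathrm{mod}\, p)} \; \sum_{c_{1(n-1)} \,(\mathrm{mod}\, p)} \; \sum_{c_{1n} \,(\mathrm{mod}\, p)}^{*} e\Big(\frac{\psi_n c_{1(n-1)} \overline{c_{1n}}}{p} + \frac{\psi'_n c_{1(n-1)}}{p}\Big).$$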

The two innermost sums equal $p$, and so we obtain $\mathrm{Kl}_p(\underline{m}, \psi, \psi', w) = p^{n-1}$.

In the second case we consider first the case $2 \le k \le n-1$. Then $\mathrm{Kl}_p(\underline{m}, \psi, \psi', w)$ contains the sum

$$\sum_{\substack{c_{1(k-1)},\, c_{1k} \,(\mathrm{mod}\, p) \\ (c_{1k}, p) = 1}} e\Big(\frac{\psi_k c_{1(k-1)} \overline{c_{1k}}}{p}\Big) = 0.$$

Finally, if $k = 1$, we obtain

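$$\mathrm{Kl}_p(\underline{m}, \psi, \psi', w) = \sum_{c_{3n}, \dots, c_{nn} \,(\mathrm{mod}\, p)} \; \sum_{c_{11},\, c_{2n} \,(\mathrm{mod}\, p)}^{*} e\Big(\frac{\psi_1 \overline{c_{11}}}{p} + \frac{\psi'_1 c_{2n}}{p}\Big).$$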

The two inner sums together equal 1 (each of them is a Ramanujan sum equal to $-1$), and so $\mathrm{Kl}_p(\underline{m}, \psi, \psi', w) = p^{n-2}$. This completes the proof.
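The three elementary evaluations used in this proof can be confirmed by brute force for a small prime. The following Python sketch does this under the assumption that the relevant character values are coprime to $p$; the function names and the sample values are hypothetical and serve only as an illustration.

```python
import cmath


def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def first_case(p, psi, psi_prime):
    """Two innermost sums for m_{1n} = 1: c_{1(n-1)} free, c_{1n} a unit mod p."""
    total = 0
    for a in range(p):            # a plays the role of c_{1(n-1)}
        for c in range(1, p):     # c plays the role of c_{1n}
            c_inv = pow(c, -1, p)
            total += e(((psi * a * c_inv + psi_prime * a) % p) / p)
    return total


def second_case(p, psi):
    """Sum over c_{1(k-1)} mod p and units c_{1k} mod p, for 2 <= k <= n-1."""
    total = 0
    for a in range(p):
        for c in range(1, p):
            total += e(((psi * a * pow(c, -1, p)) % p) / p)
    return total


def third_case(p, psi, psi_prime):
    """Sum over units c_{11}, c_{2n} mod p in the case k = 1."""
    total = 0
    for c11 in range(1, p):
        for c2n in range(1, p):
            total += e(((psi * pow(c11, -1, p) + psi_prime * c2n) % p) / p)
    return total


if __name__ == "__main__":
    p = 7
    print(first_case(p, 3, 5))   # numerically close to p
    print(second_case(p, 3))     # numerically close to 0
    print(third_case(p, 3, 5))   # numerically close to 1
```

The three-argument form `pow(c, -1, p)` for modular inverses requires Python 3.8 or later.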

Beyond Sarnak’s density conjecture

We finally prove Proposition 4. This requires some minor modifications in Sections 4 and 5 of [1] that we now describe. We use the notation from [1]. Since $q$ is prime, the argument simplifies a bit, and we need [1, Lemma 4.2] for $(\alpha,\beta) = (0,0)$ and $(1,0)$. For $(\alpha,\beta) = (0,0)$ we use it as is; for $(\alpha,\beta) = (1,0)$ we make a small improvement. We recall that we need to count $x_{ij}, y_{ij}$ satisfying the size conditions [1, (4.15)] and the congruences [1, (4.17), (4.18)], where in the case $\beta = 0$ the congruence (4.18) can be written more simply as (4.19). The count for (4.15) is given in (4.16). In order to count the saving imposed by the congruences (4.17), (4.19), we proceed as described after (4.18), but obtain an extra saving for $y_{1n}$ from (4.19) of size $p^{n-3}$. Thus we see that

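$$C_{1,0} \ll \frac{p^{\frac{1}{6}(n^3 + 3n^2 + 2n - 12) + n(n-1)}}{p^{2(n-2)} \cdot p^{\frac{1}{2}(n-2)(n-3)} \cdot p^{n-2 + \frac{1}{2}(n-1)(n-2)} \cdot p^{n-3}} = \frac{N_q}{q^2}\, q^{n+2} \le \frac{N_q}{q^2}\, q^{\frac{7}{4}(n-1)}$$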

for $n \ge 5$. (It is important to have exponent strictly less than 2.) Together with the improved bound of Corollary 2, we now obtain the following variation of [1, Theorem 4.3] under the additional assumptions that $n \ge 5$, $q$ is prime and $c = (c_1, \dots, c_{n-1}) = (q^n \gamma_1, \dots, q^n \gamma_{n-1})$ satisfies $\gamma_j < q^2$ (which implies that only the cases $(\alpha,\beta) = (0,0)$ and $(1,0)$ are relevant in the proof):

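$$S^{v}_{q,w}(M, N, c) \ll q^{\varepsilon}\, \frac{N_q}{q^{n-1}}\, \frac{c_1 \cdots c_{n-1}}{(\gamma_1 \cdots \gamma_{n-1})^{\delta}}\, (\gamma_1 \cdots \gamma_{n-1}, q)^{3/4 + \delta} \tag{7.1}$$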

with $\delta$ as in our Corollary 2. We can and will assume without loss of generality that $\delta < 1/10$.

With this in hand, we move to the discussion after [1, Lemma 5.1]. The key point is that we can now slightly relax [1, (5.3)] to

$$mZ \le K^{-1} (1 + 1/r + R)^{-K} q^{n+1+\delta_0}$$

for some sufficiently small $\delta_0 > 0$ to be chosen in a moment (cf. also [1, (1.4)]). If $n \ge 4$ and $\delta_0 < 1/10$, then by Remark 2 after [1, Lemma 4.1] we can still conclude that only the trivial Weyl element and $w$ give a non-zero contribution. The contribution of the trivial Weyl element is given in [1, (5.4)]; for the contribution of $w$ we invoke (7.1), getting

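$$q^{\varepsilon}\, \frac{N_q}{q^{n-1}}\, Z^{2\eta_1} \sum_{\gamma_1, \dots, \gamma_{n-1} \ll q^{1+\delta_0}} \frac{(\gamma_1 \cdots \gamma_{n-1}, q)}{(\gamma_1 \cdots \gamma_{n-1})^{\delta}} \ll N_q Z^{2\eta_1}$$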

if $(1+\delta_0)(1-\delta) < 1$. In this way we obtain an improved version of [1, Proposition 5.2] where (under the current assumptions $q$ prime, $n \ge 5$) we only need the relaxed condition $T \le M^{-K} q^{n+1+\delta_0}$. This can be directly inserted into [1, Corollary 6.11] and completes the proof of Proposition 4.

Appendix: the case G=GL(4)

It might be useful for applications to use the method of proof of Theorem 1 and Corollary 1 to obtain explicit formulae and non-trivial bounds for all Weyl elements in the case G=GL(4). See [9, Appendix] for a list of the relevant consistency relations, and [7] for a version in terms of Plücker coordinates.

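There are 8 Weyl elements. We do not need to discuss the trivial Weyl element or the Voronoi element $\left(\begin{smallmatrix} & 1 \\ I_3 & \end{smallmatrix}\right)$, which is covered in [8] (with non-trivial bounds following from Deligne’s estimates). The element $\left(\begin{smallmatrix} & I_3 \\ 1 & \end{smallmatrix}\right)$ is analogous.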

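(a) For the long Weyl element $w_l = \left(\begin{smallmatrix} & & & -1 \\ & & 1 & \\ & -1 & & \\ 1 & & & \end{smallmatrix}\right)$ we obtain by (1.6) that $\mathrm{Kl}_p(\underline{m}, \psi, \psi', w_l)$ is given by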

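$$\begin{aligned}
\sum_{\underline{c} \in \mathcal{C}_{w_l}(\underline{m})} e\Big( & \frac{\psi_1 \overline{c_{11}}}{p^{m_{11}}} + \psi_2 \Big( \frac{c_{11} \overline{c_{12}}}{p^{m_{12}}} + \frac{\overline{c_{22}}}{p^{-m_{11}+m_{12}+m_{22}}} \Big) \\
& + \psi_3 \Big( \frac{c_{12} \overline{c_{13}}}{p^{m_{13}}} + \frac{c_{22} \overline{c_{23}}}{p^{-m_{12}+m_{13}+m_{23}}} + \frac{\overline{c_{33}}}{p^{-m_{12}-m_{22}+m_{13}+m_{23}+m_{33}}} \Big) \\
& + \frac{\psi'_1 c_{33}}{p^{m_{33}}} + \psi'_2 \Big( \frac{c_{23} \overline{c_{33}}}{p^{m_{23}}} + \frac{c_{22}}{p^{-m_{33}+m_{23}+m_{22}}} \Big) \\
& + \psi'_3 \Big( \frac{c_{13} \overline{c_{23}}}{p^{m_{13}}} + \frac{c_{12} \overline{c_{22}}}{p^{-m_{23}+m_{13}+m_{12}}} + \frac{c_{11}}{p^{-m_{23}-m_{22}+m_{13}+m_{12}+m_{11}}} \Big) \Big)
\end{aligned} \tag{8.1}$$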

where the sum runs over

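$$c_{11} \,(\mathrm{mod}\, p^{m_{11}+m_{12}+m_{13}}), \quad c_{12} \,(\mathrm{mod}\, p^{m_{12}+m_{13}}), \quad c_{13} \,(\mathrm{mod}\, p^{m_{13}}), \quad c_{22} \,(\mathrm{mod}\, p^{m_{22}+m_{23}}), \quad c_{23} \,(\mathrm{mod}\, p^{m_{23}}), \quad c_{33} \,(\mathrm{mod}\, p^{m_{33}})$$

subject to $(c_{ij}, p^{m_{ij}}) = 1$.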

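(b) For the Weyl element $w = \left(\begin{smallmatrix} & & & -1 \\ & -1 & & \\ & & -1 & \\ 1 & & & \end{smallmatrix}\right)$ we obtain by (1.8) that $\mathrm{Kl}_p(\underline{m}, \psi, \psi', w)$ is given by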

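$$\begin{aligned}
\sum_{\underline{c} \in \mathcal{C}_{w}(\underline{m})} e\Big( & \frac{\psi_1 \overline{c_{11}}}{p^{m_{11}}} + \psi_2 \Big( \frac{c_{11} \overline{c_{12}}}{p^{m_{12}}} + \frac{c_{33} \overline{c_{23}}}{p^{-m_{11}+m_{12}+m_{22}}} \Big) + \psi_3 \Big( \frac{c_{12} \overline{c_{13}}}{p^{m_{13}}} + \frac{\overline{c_{33}}}{p^{-m_{12}+m_{13}+m_{33}}} \Big) \\
& + \frac{\psi'_1 c_{23}}{p^{m_{23}}} + \psi'_3 \Big( \frac{c_{13} \overline{c_{33}}}{p^{m_{13}}} + \frac{c_{12}}{p^{-m_{33}+m_{13}+m_{12}}} \Big) \Big)
\end{aligned} \tag{8.2}$$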

where the sum runs over

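$$c_{11} \,(\mathrm{mod}\, p^{m_{11}+m_{12}+m_{13}}), \quad c_{12} \,(\mathrm{mod}\, p^{m_{12}+m_{13}}), \quad c_{13} \,(\mathrm{mod}\, p^{m_{13}}), \quad c_{23} \,(\mathrm{mod}\, p^{m_{33}}), \quad c_{33} \,(\mathrm{mod}\, p^{m_{23}+m_{33}})$$

subject to $(c_{ij}, p^{m_{ij}}) = 1$. These two cases are illustrative examples of the general results presented in the body of the paper; Corollaries 1 and 2 establish non-trivial bounds.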

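(c) We are left with three remaining Weyl elements. We first consider $w = \left(\begin{smallmatrix} & & 1 & \\ & & & 1 \\ 1 & & & \\ & 1 & & \end{smallmatrix}\right)$. Here we choose the representative $w = s_{\alpha_2} s_{\alpha_1} s_{\alpha_3} s_{\alpha_2}$, so that

$$\underline{\gamma} = \alpha_{22}, \alpha_{12}, \alpha_{23}, \alpha_{13}.$$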

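We then obtain for $\mathrm{Kl}_p(\underline{m}, \psi, \psi', w)$ by similar computations the formula

$$\begin{aligned}
\sum_{c_{22} \,(\mathrm{mod}\, p^{m_{12}+m_{22}+m_{23}})}^{*} \; \sum_{c_{12} \,(\mathrm{mod}\, p^{m_{12}+m_{13}})}^{*} \; \sum_{c_{23} \,(\mathrm{mod}\, p^{m_{13}+m_{23}})}^{*} \; \sum_{c_{13} \,(\mathrm{mod}\, p^{m_{13}})}^{*} e\Big( & \psi_1 \Big( \frac{c_{22} \overline{c_{12}}}{p^{m_{12}}} + \frac{c_{23} \overline{c_{13}}}{p^{-m_{22}+m_{12}+m_{13}}} \Big) + \frac{\psi_2 \overline{c_{22}}}{p^{m_{22}}} \\
& + \psi_3 \Big( \frac{c_{22} \overline{c_{23}}}{p^{m_{23}}} + \frac{c_{12} \overline{c_{13}}}{p^{-m_{22}+m_{13}+m_{23}}} \Big) + \frac{\psi'_2 c_{13}}{p^{m_{13}}} \Big)
\end{aligned}$$

where $*$ indicates the usual coprimality condition $(c_{ij}, p^{m_{ij}}) = 1$.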

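Here we obtain a saving relative to the trivial bound as follows: the $c_{13}$-sum and the $c_{22}$-sum save $O(|\psi'_2|_p^{-1/2} p^{-m_{13}/2})$ and $O(|\psi_2|_p^{-1/2} p^{-m_{22}/2})$ respectively. If $p \nmid c_{22}$, an analogous argument works for the indices $(1,2)$, $(2,3)$, and if $m_{22} = 0$ (which is the only situation in which we can have $p \mid c_{22}$), then the $c_{22}$-sum implies

$$\psi_1 \overline{c_{12}}\, p^{m_{23}} + \psi_3 \overline{c_{23}}\, p^{m_{12}} \equiv 0 \,(\mathrm{mod}\, p^{m_{12}+m_{23}}).$$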

Thus in all cases we get a saving of

$$\max\big(|\psi_1|_p^{-1/2}, |\psi_2|_p^{-1/2}, |\psi_3|_p^{-1/2}, |\psi'_1|_p^{-1/2}\big)\, p^{-\frac{1}{2} \max(m_{12}, m_{13}, m_{22}, m_{23})},$$

so that one can choose $\delta < 1/16$ in Corollary 2 below. This is just for concreteness; it is easy to improve the numerical value.
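The congruence extracted from the $c_{22}$-sum in case (c) is an instance of orthogonality of additive characters, which can be checked numerically. In the Python sketch below, the function names and sample values are hypothetical, and we assume that for $m_{22} = 0$ the only non-trivial dependence on $c_{22}$ comes from the two terms $\psi_1 c_{22} \overline{c_{12}} / p^{m_{12}}$ and $\psi_3 c_{22} \overline{c_{23}} / p^{m_{23}}$, with $c_{22}$ running over all residues modulo $p^{m_{12}+m_{23}}$.

```python
import cmath


def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def c22_coefficient(p, m12, m23, psi1, psi3, c12_inv, c23_inv):
    """Numerator psi1*c12_inv*p^m23 + psi3*c23_inv*p^m12 over the common modulus."""
    return psi1 * c12_inv * p ** m23 + psi3 * c23_inv * p ** m12


def c22_sum(p, m12, m23, psi1, psi3, c12_inv, c23_inv):
    """Sum over all c22 mod p^(m12+m23) of
    e(psi1*c22*c12_inv/p^m12 + psi3*c22*c23_inv/p^m23)."""
    modulus = p ** (m12 + m23)
    coeff = c22_coefficient(p, m12, m23, psi1, psi3, c12_inv, c23_inv)
    return sum(e(((coeff * c22) % modulus) / modulus) for c22 in range(modulus))


def congruence_holds(p, m12, m23, psi1, psi3, c12_inv, c23_inv):
    """True if the coefficient vanishes modulo p^(m12+m23)."""
    modulus = p ** (m12 + m23)
    coeff = c22_coefficient(p, m12, m23, psi1, psi3, c12_inv, c23_inv)
    return coeff % modulus == 0


if __name__ == "__main__":
    # The sum is non-zero exactly when the congruence holds.
    print(c22_sum(3, 1, 2, 1, 1, 1, 6), congruence_holds(3, 1, 2, 1, 1, 1, 6))
    print(c22_sum(3, 1, 2, 1, 1, 1, 1), congruence_holds(3, 1, 2, 1, 1, 1, 1))
```

When the congruence holds the sum equals the full modulus $p^{m_{12}+m_{23}}$; otherwise it vanishes.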

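(d) Finally we treat the Weyl element $w = \left(\begin{smallmatrix} & & & -1 \\ & & 1 & \\ 1 & & & \\ & 1 & & \end{smallmatrix}\right)$, the Weyl element $\left(\begin{smallmatrix} & & 1 & \\ & & & 1 \\ & -1 & & \\ 1 & & & \end{smallmatrix}\right)$ being analogous. Here we choose a representative $w = s_{\alpha_1} s_{\alpha_2} s_{\alpha_3} s_{\alpha_1} s_{\alpha_2}$, so that

$$\underline{\gamma} = \alpha_{11}, \alpha_{12}, \alpha_{13}, \alpha_{22}, \alpha_{23}.$$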

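We then obtain for $\mathrm{Kl}_p(\underline{m}, \psi, \psi', w)$ the formula

$$\begin{aligned}
\sum_{c_{11} \,(\mathrm{mod}\, p^{m_{11}+m_{12}+m_{13}})}^{*} \; \sum_{c_{12} \,(\mathrm{mod}\, p^{m_{12}+m_{13}})}^{*} \; & \sum_{c_{13} \,(\mathrm{mod}\, p^{m_{13}})}^{*} \; \sum_{c_{22} \,(\mathrm{mod}\, p^{m_{22}+m_{23}})}^{*} \; \sum_{c_{23} \,(\mathrm{mod}\, p^{m_{23}})}^{*} \\
e\Big( & \frac{\psi_1 \overline{c_{11}}}{p^{m_{11}}} + \psi_2 \Big( \frac{c_{11} \overline{c_{12}}}{p^{m_{12}}} + \frac{\overline{c_{22}}}{p^{-m_{11}+m_{12}+m_{22}}} \Big) + \psi_3 \Big( \frac{c_{12} \overline{c_{13}}}{p^{m_{13}}} + \frac{c_{22} \overline{c_{23}}}{p^{-m_{12}+m_{13}+m_{23}}} \Big) \\
& + \frac{\psi'_2 c_{23}}{p^{m_{23}}} + \psi'_3 \Big( \frac{c_{11}}{p^{-m_{22}-m_{23}+m_{11}+m_{12}+m_{13}}} + \frac{c_{12} \overline{c_{22}}}{p^{-m_{23}+m_{12}+m_{13}}} + \frac{c_{13} \overline{c_{23}}}{p^{m_{13}}} \Big) \Big).
\end{aligned}$$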

Arguing as in Sect. 5 with the ordering

$$(1,3) < (1,2) < (1,1) < (2,3) < (2,2)$$

we obtain a saving of

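$$\ll_{\psi, \psi'} p^{-\frac{1}{2} \max(m_{13}, m_{12}, m_{11}, m_{23}, m_{22} - m_{11})} \le p^{-\frac{1}{4} \max(m_{13}, m_{12}, m_{11}, m_{23}, m_{22})}$$

so that we can choose $\delta < 1/36$ in the following corollary. Again it is very easy to improve this numerical value. We conclude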

Corollary 5

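Let $w$ be a non-trivial Weyl element for $G = \mathrm{GL}(4)$ and $C = (p^{r_1}, p^{r_2}, p^{r_3})$. Then there exists an absolute constant $\delta > 0$ such that

$$\mathrm{Kl}_p(\psi, \psi', Cw) \ll \Big(\max_{1 \le j \le n} \big(|\psi_j|_p^{-1/2}, |\psi'_j|_p^{-1/2}\big)\Big) \big(p^{r_1 + r_2 + r_3}\big)^{1-\delta}.$$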

Funding

Open Access funding enabled and organized by Projekt DEAL.

Availability of data and material

Not applicable.

Declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Footnotes

V. Blomer was supported in part by the DFG-SNF Lead Agency Program grant BL 915/2-2, Germany’s Excellence Strategy grant EXC-2047/1 - 390685813 and ERC Advanced Grant 101054336. S. H. Man was supported by the European Research Council grant 101001179 and the Czech Science Foundation GAČR grant 21-00420M.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Assing, E., Blomer, V.: The density conjecture for principal congruence subgroups. Duke Math. J., to appear
  • 2. Blomer, V.: Density theorems for $\mathrm{GL}(n)$. Invent. Math. 232, 783–811 (2023)
  • 3. Broughan, K.: An algorithm for the explicit evaluation of $\mathrm{GL}(n;\mathbb{R})$ Kloosterman sums. ACM Commun. Comput. Algebra 43, 1–10 (2009)
  • 4. Bump, D., Friedberg, S., Goldfeld, D.: Poincaré series and Kloosterman sums for $\mathrm{SL}(3,\mathbb{Z})$. Acta Arith. 50, 31–89 (1988)
  • 5. Dąbrowski, R., Fisher, B.: A stationary phase formula for exponential sums over $\mathbb{Z}/p^m\mathbb{Z}$ and applications to $\mathrm{GL}(3)$-Kloosterman sums. Acta Arith. 80, 1–48 (1997)
  • 6. Dąbrowski, R., Reeder, M.: Kloosterman sets in reductive groups. J. Number Theory 73, 228–255 (1998)
  • 7. Friedberg, S.: Explicit determination of $\mathrm{GL}(n)$ Kloosterman sums. Séminaire de théorie des nombres, 1985-1986 (Talence), Exp. No. 3, 22 pp
  • 8. Friedberg, S.: Poincaré series for $\mathrm{GL}(n)$: Fourier expansion, Kloosterman sums, and algebreo-geometric estimates. Math. Z. 196, 165–188 (1987)
  • 9. Goldfeld, D., Stade, E., Woodbury, M.: An orthogonality relation for $\mathrm{GL}(4,\mathbb{R})$ (with an appendix by Bingrong Huang). Forum Math. Sigma 9, Paper No. e47, 83 pp (2021)
  • 10. Kloosterman, H.D.: On the representation of numbers in the form $ax^2 + by^2 + cz^2 + dt^2$. Acta Math. 49, 407–464 (1927)
  • 11. Man, S.H.: Symplectic Kloosterman sums and Poincaré series. Ramanujan J. 57, 707–753 (2022)
  • 12. Miao, X.: Bessel functions and Kloosterman integrals on GL(n). arXiv:2208.01016
  • 13. Salié, H.: Über die Kloostermanschen Summen $S(u,v;q)$. Math. Z. 34, 91–109 (1931)
  • 14. Sarnak, P.: Diophantine Problems and Linear Groups. Proceedings of the ICM Kyoto, pp. 459–471 (1990)
  • 15. Stevens, G.: Poincaré series on $\mathrm{GL}(r)$ and Kloostermann sums. Math. Ann. 277, 25–51 (1987)
  • 16. Weil, A.: On some exponential sums. Proc. Natl. Acad. Sci. USA 34, 204–207 (1948)


Articles from Mathematische Annalen are provided here courtesy of Springer
