Proceedings of the National Academy of Sciences of the United States of America. 2016 Jul 14;113(31):E4446–E4454. doi: 10.1073/pnas.1605366113

Unexpected biases in the distribution of consecutive primes

Robert J Lemke Oliver a,b,1, Kannan Soundararajan a,1
PMCID: PMC4978288  PMID: 27418603

Significance

Prime numbers play a central role in analytic number theory, and are well known to be very well distributed among the reduced residue classes $\pmod q$. Surprisingly, the same does not appear to be true for sequences of consecutive primes, with different patterns occurring with wildly different frequencies. We formulate a precise conjecture, based on the Hardy−Littlewood conjectures, which explains this phenomenon. In particular, we predict that all patterns do occur their fair share of the time in the limit, but that there are secondary terms only very slowly tending to zero that create the observed biases.

Keywords: prime numbers, consecutive primes, Hardy–Littlewood conjectures, singular series

Abstract

Although the sequence of primes is very well distributed in the reduced residue classes $\pmod q$, the distribution of pairs of consecutive primes among the permissible $\phi(q)^2$ pairs of reduced residue classes $\pmod q$ is surprisingly erratic. This paper proposes a conjectural explanation for this phenomenon, based on the Hardy−Littlewood conjectures. The conjectures are then compared with numerical data, and the observed fit is very good.

1. Introduction

The prime number theorem in arithmetic progressions shows that the sequence of primes is equidistributed among the reduced residue classes $\pmod q$. If the Generalized Riemann Hypothesis is true, then this holds in the more precise form

$$\pi(x;q,a) = \frac{\mathrm{li}(x)}{\phi(q)} + O(x^{1/2+\epsilon}), \quad\text{where}\quad \mathrm{li}(x) = \int_2^x \frac{dt}{\log t},$$

and $\pi(x;q,a)$ denotes the number of primes up to $x$ lying in the reduced residue class $a \pmod q$. Nevertheless, it was noticed by Chebyshev that certain residue classes seem to be slightly preferred; for example, among the first million primes, we find that

$$\pi(x_0;3,1) = 499{,}829 \quad\text{and}\quad \pi(x_0;3,2) = 500{,}170, \qquad \pi(x_0) = 10^6.$$

Chebyshev’s bias is beautifully explained by the work of Rubinstein and Sarnak (1) (see ref. 2 for a survey of related work), who showed (in a certain sense and under some natural conjectures) that π(x;3,2)>π(x;3,1) for 99.9% of all positive x.
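The residue-class counts behind Chebyshev's bias are easy to reproduce on a small scale. The following is a minimal sketch, assuming a plain sieve of Eratosthenes; the helper names `primes_up_to` and `residue_counts` are illustrative and are not taken from the authors' code.

```python
from math import isqrt


def primes_up_to(limit):
    """Return all primes <= limit using a sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            # Cross out p*p, p*p + p, ... in one slice assignment.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if sieve[n]]


def residue_counts(limit, q):
    """Count the primes p <= limit in each residue class p mod q."""
    counts = {}
    for p in primes_up_to(limit):
        counts[p % q] = counts.get(p % q, 0) + 1
    return counts


if __name__ == "__main__":
    # Classes 1 and 2 (mod 3) are nearly balanced, with class 2 slightly ahead.
    print(residue_counts(10**6, 3))
```

Note that this sketch counts primes up to a bound rather than the first million primes, so its totals are not directly comparable with the figures quoted above.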

What happens if we consider the patterns of residues $\pmod q$ among strings of consecutive primes? Let $p_n$ denote the sequence of primes in ascending order. Let $r \ge 1$ be an integer, and let $\mathbf{a} = (a_1, a_2, \ldots, a_r)$ denote an $r$-tuple of reduced residue classes $\pmod q$. Define

$$\pi(x;q,\mathbf{a}) := \#\{p_n \le x : p_{n+i-1} \equiv a_i \pmod q \text{ for each } 1 \le i \le r\},$$

which counts the number of occurrences of the pattern $\mathbf{a} \pmod q$ among $r$ consecutive primes the least of which is below $x$. When $r \ge 2$, little is known about the distribution of such patterns among the primes. When $r=2$ and $\phi(q)=2$ (thus $q=3$, 4, or 6), Knapowski and Turán (3) observed that all of the four possible patterns of length 2 appear infinitely many times. The main significant result in this direction is due to Shiu (4), who established that, for any $q \ge 3$, a reduced residue class $a \pmod q$, and any $r \ge 2$, the pattern $(a, a, \ldots, a)$ occurs infinitely often. Recent progress in sieve theory has led to a new proof of Shiu’s result (see ref. 5), and, moreover, Maynard (6) has shown that $\pi(x;q,(a,\ldots,a)) \gg \pi(x)$.

Despite the lack of understanding of $\pi(x;q,\mathbf{a})$, any model based on the randomness of the primes would suggest strongly that every permissible pattern of $r$ consecutive primes appears roughly equally often; that is, if $\mathbf{a}$ is an $r$-tuple of reduced residue classes $\pmod q$, then $\pi(x;q,\mathbf{a}) \sim \pi(x)/\phi(q)^r$. However, a look at the data might shake that belief! For example, among the first million primes (for convenience, restricting to those greater than 3), we find

$$\pi(x_0;3,(1,1)) = 215{,}873, \quad \pi(x_0;3,(1,2)) = 283{,}957, \quad \pi(x_0;3,(2,1)) = 283{,}957, \quad\text{and}\quad \pi(x_0;3,(2,2)) = 216{,}213.$$

These numbers show substantial deviations from the expectation that all four quantities should be roughly 250,000. Further, Chebyshev’s bias (mod 3) might have suggested a slight preference for the pattern (2,2) over the other possibilities, and this is clearly not the case.
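The pair counts can be explored with a few lines of code. The sketch below assumes a simple sieve and counts pairs of consecutive primes up to a bound (rather than among the first million primes); `pair_counts` is a hypothetical helper name, not the authors' implementation.

```python
from collections import Counter
from math import gcd, isqrt


def primes_up_to(limit):
    """Return all primes <= limit using a sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if sieve[n]]


def pair_counts(limit, q):
    """Count residue patterns (p_n mod q, p_{n+1} mod q) over consecutive primes.

    Only pairs with both primes <= limit and both coprime to q are counted,
    so primes dividing q never contribute to a pattern.
    """
    primes = primes_up_to(limit)
    return Counter(
        (p % q, r % q)
        for p, r in zip(primes, primes[1:])
        if gcd(p * r, q) == 1
    )


if __name__ == "__main__":
    counts = pair_counts(10**6, 3)
    for pattern in sorted(counts):
        print(pattern, counts[pattern])
```

Even at this modest scale the diagonal patterns (1,1) and (2,2) fall well short of the off-diagonal ones.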

The discrepancy observed above persists for larger $x$, and also exists for other moduli $q$. For example, among the first hundred million primes modulo 10, there is substantial deviation from the prediction that each of the 16 pairs $(a,b)$ should have about 6.25 million occurrences. Specifically, with $\pi(x_0) = 10^8$, we find the following.

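[Table (image file pnas.1605366113t01.jpg, not reproduced here): counts of $\pi(x_0;10,(a,b))$ for the 16 pairs $(a,b)$, with $\pi(x_0) = 10^8$.]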

Apart from the fact that the entries vary dramatically (much more than in Chebyshev’s bias), the key feature to be observed in these data is that the diagonal classes $(a,a)$ occur significantly less often than the nondiagonal classes. Chebyshev’s bias (mod 10) states that the residue classes 3 and 7 (mod 10) very often contain slightly more primes than the residue classes 1 and 9 (mod 10), but curiously in our data the patterns (3,3) and (7,7) appear less frequently than (1,1) and (9,9); this suggests again that a different phenomenon is at play here.

The purpose of this paper is to develop a heuristic, based on the Hardy−Littlewood prime $k$-tuples conjecture, which explains the biases seen above. We are led to conjecture that although the primes counted by $\pi(x;q,\mathbf{a})$ do have density $1/\phi(q)^r$ in the limit, there are large secondary terms in the asymptotic formula which create biases toward and against certain patterns. The dominant factor in this bias is determined by the number of $i$ for which $a_{i+1} \equiv a_i \pmod q$, but there are also lower-order terms that do not have an easy description.

Main Conjecture.

With notation as above, we have

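$$\pi(x;q,\mathbf{a}) = \frac{\mathrm{li}(x)}{\phi(q)^r}\left(1 + c_1(q;\mathbf{a})\frac{\log\log x}{\log x} + c_2(q;\mathbf{a})\frac{1}{\log x} + O\left(\frac{1}{(\log x)^{7/4}}\right)\right),$$

where

$$c_1(q;\mathbf{a}) = \frac{\phi(q)}{2}\left(\frac{r-1}{\phi(q)} - \#\{1 \le i < r : a_i \equiv a_{i+1} \pmod q\}\right).$$

When $r=2$, the constant $c_2(q;\mathbf{a})$ is given in [2.23]. If $r \ge 3$, it is given by

$$c_2(q;\mathbf{a}) = \sum_{i=1}^{r-1} c_2(q;(a_i,a_{i+1})) + \frac{\phi(q)}{2}\sum_{j=1}^{r-2}\frac{1}{j}\left(\frac{r-1-j}{\phi(q)} - \#\{i : a_i \equiv a_{i+j+1} \pmod q\}\right).$$

In general, the quantity $c_2(q;\mathbf{a})$ seems complicated, but there are some situations where it simplifies. For example, if $\mathbf{a} = (a,a)$ for a reduced residue class $a \pmod q$, then, regardless of the choice of $a$, we have

$$c_2(q;(a,a)) = \frac{\phi(q)\log(q/2\pi) + \log 2\pi}{2} - \frac{\phi(q)}{2}\sum_{p \mid q}\frac{\log p}{p-1}. \qquad [1.1]$$

We can also show that $c_2(q;(a,b)) = c_2(q;(-b,-a))$ for any two reduced residue classes $a$ and $b \pmod q$. Moreover, although $c_2(q;(a,b))$ seems involved, the symmetric quantity $c_2(q;(a,b)) + c_2(q;(b,a))$ simplifies nicely: For distinct reduced residue classes $a$, $b \pmod q$, we have

$$c_2(q;(a,b)) + c_2(q;(b,a)) = \log(2\pi) - \phi(q)\frac{\Lambda(q/(q,b-a))}{\phi(q/(q,b-a))}, \qquad [1.2]$$

where $\Lambda$ denotes the von Mangoldt function. In particular, this expression depends only on the difference $b-a$.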
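The constants that have closed forms can be evaluated directly. The sketch below implements $c_1(q;\mathbf{a})$, the diagonal constant [1.1], and the symmetric sum [1.2]; the function names are illustrative assumptions, and the brute-force totient and factorization are chosen for clarity rather than speed.

```python
from math import gcd, log, pi


def euler_phi(q):
    """Euler's totient, by brute force (fine for small moduli)."""
    return sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)


def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    factors, p = [], 2
    while p * p <= n:
        if n % p == 0:
            factors.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of a prime p, and 0 otherwise."""
    factors = prime_factors(n)
    return log(factors[0]) if len(factors) == 1 else 0.0


def c1(q, pattern):
    """c_1(q; a): depends only on the number of immediate repetitions."""
    phi = euler_phi(q)
    repeats = sum(1 for x, y in zip(pattern, pattern[1:]) if (x - y) % q == 0)
    return phi / 2 * ((len(pattern) - 1) / phi - repeats)


def c2_diagonal(q):
    """c_2(q; (a, a)) from [1.1]; independent of the class a."""
    phi = euler_phi(q)
    tail = sum(log(p) / (p - 1) for p in prime_factors(q))
    return (phi * log(q / (2 * pi)) + log(2 * pi)) / 2 - phi / 2 * tail


def c2_symmetric_sum(q, a, b):
    """c_2(q; (a, b)) + c_2(q; (b, a)) from [1.2], for distinct a, b mod q."""
    m = q // gcd(q, b - a)
    return log(2 * pi) - euler_phi(q) * von_mangoldt(m) / euler_phi(m)


if __name__ == "__main__":
    print(c1(10, (3, 3)), c2_diagonal(10), c2_symmetric_sum(10, 1, 3))
```

For $q=3$ these reduce to $c_1 = \mp 1/2$ and $c_2(3;(a,a)) = \tfrac12\log(3/2\pi)$, and for $q=8$ the diagonal value agrees with $(5\log 2 - 3\log\pi)/2$ quoted in Section 5.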

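Conjecture 1.1. If $a$ and $b$ are distinct reduced residue classes $\pmod q$, then $\pi(x;q,(a,b)) + \pi(x;q,(b,a))$ equals

$$\frac{2\,\mathrm{li}(x)}{\phi(q)^2}\left(1 + \frac{\log\log x}{2\log x} + \left(\log(2\pi) - \phi(q)\frac{\Lambda(q/(q,b-a))}{\phi(q/(q,b-a))}\right)\frac{1}{2\log x} + O\left(\frac{1}{(\log x)^{7/4}}\right)\right),$$

whereas $\pi(x;q,(a,a))$ equals

$$\frac{\mathrm{li}(x)}{\phi(q)^2}\left(1 - \frac{\phi(q)-1}{2}\frac{\log\log x}{\log x} + \left(\phi(q)\log\frac{q}{2\pi} + \log 2\pi - \phi(q)\sum_{p \mid q}\frac{\log p}{p-1}\right)\frac{1}{2\log x} + O\left(\frac{1}{(\log x)^{7/4}}\right)\right).$$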

We give a few amusing consequences of the Main Conjecture. The famous biases $\pi(x) < \mathrm{li}(x)$, or $\pi(x;3,1) < \pi(x;3,2)$, or $\pi(x;4,1) < \pi(x;4,3)$ are known to be false infinitely often. However, we conjecture that the robust biases in pairs of consecutive primes (mod 3) or (mod 4) may hold always and from the very start!

Conjecture 1.2. Let $q=3$ or 4, and let $a$ be either $1 \pmod q$ or $-1 \pmod q$. Then, for all $x \ge 5$, we have

$$\pi(x;q,(a,-a)) > \pi(x;q,(a,a)).$$

Indeed, for large $x$, we have

$$\pi(x;q,(a,-a)) - \pi(x;q,(a,a)) = \frac{x}{4(\log x)^2}\log\left(\frac{2\pi\log x}{q}\right) + O\left(\frac{x}{(\log x)^{11/4}}\right).$$

Given a prime q, the product of two consecutive primes prefers to be a quadratic nonresidue rather than a quadratic residue.

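Conjecture 1.3. Let $q$ be a fixed odd prime. For large $x$, we have

$$\sum_{p_n \le x}\left(\frac{p_n}{q}\right)\left(\frac{p_{n+1}}{q}\right) = -\frac{x}{2(\log x)^2}\log\left(\frac{2\pi\log x}{q}\right) + O\left(\frac{x}{(\log x)^{11/4}}\right).$$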
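The sign predicted by Conjecture 1.3 can be observed numerically. The following sketch, under the assumption that summing over consecutive primes below a fixed bound is an acceptable stand-in for the sum over $p_n \le x$, evaluates the Legendre-symbol sum with Euler's criterion; `legendre` and `consecutive_legendre_sum` are illustrative names.

```python
from math import isqrt


def primes_up_to(limit):
    """Return all primes <= limit using a sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if sieve[n]]


def legendre(a, q):
    """Legendre symbol (a/q) for an odd prime q, via Euler's criterion."""
    a %= q
    if a == 0:
        return 0
    return 1 if pow(a, (q - 1) // 2, q) == 1 else -1


def consecutive_legendre_sum(limit, q):
    """Sum of (p_n/q)(p_{n+1}/q) over consecutive primes, both <= limit."""
    primes = primes_up_to(limit)
    return sum(legendre(p, q) * legendre(r, q) for p, r in zip(primes, primes[1:]))


if __name__ == "__main__":
    for q in (3, 5, 7):
        print(q, consecutive_legendre_sum(10**6, q))
```

A negative total means that products of consecutive primes land on quadratic nonresidues more often than on residues, in line with the conjecture.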

The constants in the Main Conjecture also simplify dramatically if one only cares about patterns exhibited by $p_n$ and $p_{n+k}$ for $k \ge 2$.

Conjecture 1.4. If $k \ge 2$ and $a$ and $b$ are distinct reduced residues $\pmod q$, then

$$\#\{p_n \le x : p_n \equiv a \pmod q,\ p_{n+k} \equiv b \pmod q\} = \frac{\mathrm{li}(x)}{\phi(q)^2}\left(1 + \frac{1}{2(k-1)}\frac{1}{\log x} + O\left(\frac{1}{(\log x)^{7/4}}\right)\right),$$

while

$$\#\{p_n \le x : p_n \equiv p_{n+k} \equiv a \pmod q\} = \frac{\mathrm{li}(x)}{\phi(q)^2}\left(1 - \frac{\phi(q)-1}{2(k-1)}\frac{1}{\log x} + O\left(\frac{1}{(\log x)^{7/4}}\right)\right).$$

Form a $\phi(q) \times \phi(q)$ transition matrix (with rows and columns indexed by reduced residue classes) whose $(a,b)$th entry is the probability that a prime $p_n \equiv a \pmod q$ is followed by $p_{n+1} \equiv b \pmod q$. Then Conjecture 1.4 shows that the corresponding transition matrix going from $p_n$ to $p_{n+2}$ is not the square of the transition matrix going from $p_n$ to $p_{n+1}$. Thus, the primes $\pmod q$ are not Markovian, and this may also be seen directly from the Main Conjecture by the formula given for $c_2(q;\mathbf{a})$ when $r \ge 3$ (which is used to derive Conjecture 1.4).
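The non-Markovian behavior can be probed empirically by comparing the square of the one-step matrix with the observed two-step matrix. This is a rough sketch with hypothetical helper names (`transition_matrix`, `mat_square`), using empirical frequencies up to a bound as a proxy for the probabilities in the text.

```python
from math import gcd, isqrt


def primes_up_to(limit):
    """Return all primes <= limit using a sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if sieve[n]]


def transition_matrix(primes, q, step=1):
    """Empirical frequencies of p_{n+step} mod q given p_n mod q.

    Rows and columns are indexed by the reduced residue classes in
    increasing order; primes dividing q are skipped.
    """
    classes = [a for a in range(1, q) if gcd(a, q) == 1]
    index = {a: i for i, a in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for p, r in zip(primes, primes[step:]):
        if p % q in index and r % q in index:
            counts[index[p % q]][index[r % q]] += 1
    return [[c / sum(row) for c in row] for row in counts]


def mat_square(m):
    """Square of a small dense matrix given as a list of rows."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    ps = primes_up_to(10**6)
    one_step = transition_matrix(ps, 10, step=1)
    two_step = transition_matrix(ps, 10, step=2)
    squared = mat_square(one_step)
    gap = max(abs(squared[i][j] - two_step[i][j]) for i in range(4) for j in range(4))
    print("largest entrywise gap between P^2 and the two-step matrix:", gap)
```

At any finite bound some gap is expected from sampling noise alone, so the printed value is only suggestive; the conjecture concerns the systematic part of the discrepancy.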

The ideas that lead to the Main Conjecture imply that there will be symmetries between the number of occurrences of different patterns.

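Conjecture 1.5. Given $\mathbf{a}$ and $q$ as above, define $\mathbf{a}^{\mathrm{opp}} = (-a_r, -a_{r-1}, \ldots, -a_1)$. For large $x$, we have

$$\pi(x;q,\mathbf{a}) = \pi(x;q,\mathbf{a}^{\mathrm{opp}}) + O(x^{1/2+\epsilon}).$$

Example. We find

$$\pi(10^{11};7,(1,6,3)) = 24{,}344{,}117$$

and

$$\pi(10^{11};7,(4,1,6)) = 24{,}349{,}025,$$

while the nearest number of occurrences of another pattern is

$$\pi(10^{11};7,(6,2,1)) = 24{,}570{,}765.$$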
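The opposite pattern in Conjecture 1.5 is obtained by reversing the tuple and negating each class. A minimal sketch follows, with illustrative names `opposite_pattern` and `pattern_count`; the counter works up to a bound far smaller than $10^{11}$, so it does not reproduce the figures in the example.

```python
from math import isqrt


def primes_up_to(limit):
    """Return all primes <= limit using a sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if sieve[n]]


def opposite_pattern(pattern, q):
    """Return a^opp = (-a_r, ..., -a_1), reduced to representatives in [0, q)."""
    return tuple((-a) % q for a in reversed(pattern))


def pattern_count(limit, q, pattern):
    """Count windows of consecutive primes <= limit whose residues match pattern."""
    residues = [p % q for p in primes_up_to(limit)]
    r = len(pattern)
    target = tuple(a % q for a in pattern)
    return sum(
        1 for i in range(len(residues) - r + 1) if tuple(residues[i : i + r]) == target
    )


if __name__ == "__main__":
    a = (1, 6, 3)
    print(opposite_pattern(a, 7))
    print(pattern_count(10**6, 7, a), pattern_count(10**6, 7, opposite_pattern(a, 7)))
```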

If the modulus is a prime power, there are additional symmetries.

Conjecture 1.6. Let $q$ be a prime and let $v \ge 2$. If $\mathbf{a} = (a_1, \ldots, a_r)$ and $\mathbf{b} = (b_1, \ldots, b_r)$ are such that $a_1 \equiv b_1 \pmod q$ and $a_{i+1} - a_i \equiv b_{i+1} - b_i \pmod{q^v}$ for each $1 \le i < r$, then

$$\pi(x;q^v,\mathbf{a}) = \pi(x;q^v,\mathbf{b}) + O(x^{1/2+\epsilon}).$$

In particular, if $a$ is odd, then, up to an error $O(x^{1/2+\epsilon})$, $\pi(x;2^v,(a,b))$ depends only on $b-a \pmod{2^v}$.

Example. We find

$$\pi(10^{11};8,(1,3)) = 278{,}676{,}326, \quad \pi(10^{11};8,(3,5)) = 278{,}696{,}997, \quad \pi(10^{11};8,(5,7)) = 278{,}692{,}843, \quad\text{and}\quad \pi(10^{11};8,(7,1)) = 278{,}681{,}776.$$

In the direction of these conjectures, the earliest work we found is the paper of Knapowski and Turán (3), who “guess” that the events $p_n \equiv a \pmod 4$ and $p_{n+1} \equiv b \pmod 4$ for the four possibilities of $a$ and $b$ are “not equally probable.” However, Knapowski and Turán go on to suggest that $\pi(x;4,(1,1)) = o(\pi(x))$, which is now definitively false by Maynard’s work (6). The paper (3) was published after the death of both authors, and perhaps they had something else in mind, maybe along the lines of our Conjecture 1.2 above? More recently, in Ko (7), numerical results observing the biases in the distribution of consecutive primes for small moduli are given. The paper by Ash, Beltis, Gross, and Sinnott (8) again observes these biases in pairs of consecutive primes and initiates an attempt toward understanding them based on the Hardy−Littlewood conjectures. The heuristic expression in ref. 8 is a large sum of singular series, and, as the authors note, it is unclear from that expression whether $\pi(x;q,(a,b))$ tends to $\pi(x)/\phi(q)^2$ for large $x$. They also note symmetries akin to Conjectures 1.5 and 1.6 for pairs of consecutive primes.

In the Main Conjecture, we expect that the remainder term $O((\log x)^{-7/4})$ is given by a sum involving the zeros of Dirichlet $L$-functions $\pmod q$. The main terms given in the Main Conjecture are the same for all repeating patterns $(a, a, \ldots, a)$; nevertheless, numerically, one observes some deviations in the counts of such patterns, and we expect the lower-order fluctuations to account for these deviations. In addition to the contributions from zeros, which we expect to be oscillating, there also appear to be nonoscillating lower-order terms of size $(\log\log x/\log x)^2$, which may play a bigger role for the computable ranges of $x$. We hope to understand these lower-order terms in future work.

An initial guess for why there is a bias against the repeating patterns might be that, after a prime occurs that is $a \pmod q$, all other classes have a chance to represent a prime before $a$ occurs again. However, a straightforward application of the Selberg sieve shows that the number of primes for which $p_{n+1} - p_n < q$ is $O(x/\log^2 x)$, which is of a smaller order of magnitude than the bias predicted by the Main Conjecture.

Although we do not pursue this here, it should be possible to prove unconditional analogs of the Main Conjecture in other settings, for example, to numbers free of small prime factors or for squarefree integers (in the latter case, the biases will be manifested already at the level of the constant in the main term). More generally, analogous biases seem to arise for many other sifted sets, for example, in the sums of two squares. We also mention two other settings in which large biases are seen: the distribution of prime geodesics for compact hyperbolic surfaces into various homology classes (see the discussion at the end of ref. 1) and the recent work of Dummit, Granville, and Kisilevsky (9) concerning the distribution of numbers that are products of two primes.

2. The Heuristic for r=2

In this section, we develop a heuristic explanation of the Main Conjecture in the case $r=2$. The heuristic (like several other conjectures about the primes; see, for example, refs. 10–14) is based upon the Hardy−Littlewood prime $k$-tuples conjecture. We begin by reviewing quickly the Hardy−Littlewood conjectures and some related results, before proceeding to develop an analog suitable for understanding $\pi(x;q,\mathbf{a})$.

2.1. The Hardy−Littlewood Conjectures.

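Let $\mathcal{H}$ be a finite subset of $\mathbb{Z}$, and let $\mathbf{1}_{\mathcal{P}}$ denote the characteristic function of the primes. In a strong form, the Hardy−Littlewood conjecture asserts that

$$\sum_{n \le x}\prod_{h \in \mathcal{H}} \mathbf{1}_{\mathcal{P}}(n+h) = \mathfrak{S}(\mathcal{H})\int_2^x \frac{dy}{(\log y)^{|\mathcal{H}|}} + O(x^{1/2+\epsilon}),$$

where the singular series $\mathfrak{S}(\mathcal{H})$ is given by

$$\mathfrak{S}(\mathcal{H}) = \prod_p \left(1 - \frac{\#(\mathcal{H} \bmod p)}{p}\right)\left(1 - \frac{1}{p}\right)^{-|\mathcal{H}|}.$$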
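As a sanity check on the definitions, the singular series can be approximated by truncating the Euler product, and the conjecture can be compared with a direct count for $\mathcal{H} = \{0,2\}$ (twin primes). The sketch below makes two assumptions that are not part of the paper: the product is cut off at primes up to $10^5$, and the integral is evaluated by Simpson's rule after the substitution $y = e^u$. All function names are illustrative.

```python
from math import exp, isqrt, log


def primes_up_to(limit):
    """Return all primes <= limit using a sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if sieve[n]]


def singular_series(shifts, prime_bound=10**5):
    """Euler product for S(H), truncated at primes <= prime_bound."""
    shifts = set(shifts)
    k = len(shifts)
    value = 1.0
    for p in primes_up_to(prime_bound):
        occupied = len({h % p for h in shifts})
        value *= (1 - occupied / p) * (1 - 1 / p) ** (-k)
    return value


def hl_integral(x, k, steps=20000):
    """Simpson approximation to the integral of dy/(log y)^k over [2, x]."""
    lo, hi = log(2), log(x)
    width = (hi - lo) / steps  # steps must be even for Simpson's rule

    def f(u):
        return exp(u) / u**k

    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * width)
    return total * width / 3


def twin_count(x):
    """Number of n <= x with n and n + 2 both prime."""
    primes = primes_up_to(x + 2)
    prime_set = set(primes)
    return sum(1 for p in primes if p <= x and p + 2 in prime_set)


if __name__ == "__main__":
    x = 10**6
    predicted = singular_series({0, 2}) * hl_integral(x, 2)
    print(twin_count(x), round(predicted))
```

The truncated value of $\mathfrak{S}(\{0,2\})$ is close to $1.3203$, and $\mathfrak{S}(\{0,1\})$ vanishes because the two shifts cover both residue classes modulo 2.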

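In our calculations, it will be important to understand the behavior of the singular series “on average.” Here, Gallagher (10) established that, for any $k \ge 1$ and as $h \to \infty$,

$$\sum_{\substack{\mathcal{H} \subset [1,h] \\ |\mathcal{H}| = k}} \mathfrak{S}(\mathcal{H}) \sim \binom{h}{k} \sim \frac{h^k}{k!}, \qquad [2.1]$$

so that the singular series is 1 on average. A refined version of this asymptotic was established by Montgomery and Soundararajan (13), who introduced the modified singular series

$$\mathfrak{S}_0(\mathcal{H}) = \sum_{\mathcal{T} \subset \mathcal{H}} (-1)^{|\mathcal{H} \setminus \mathcal{T}|}\mathfrak{S}(\mathcal{T}), \quad\text{so that}\quad \mathfrak{S}(\mathcal{H}) = \sum_{\mathcal{T} \subset \mathcal{H}} \mathfrak{S}_0(\mathcal{T}),$$

with $\mathfrak{S}(\emptyset) = \mathfrak{S}_0(\emptyset) = 1$. The modified singular series $\mathfrak{S}_0$ arises naturally in the following version of the Hardy−Littlewood conjecture (thinking of the elements of $\mathcal{H}$ as being small in comparison with $x$):

$$\sum_{n \le x}\prod_{h \in \mathcal{H}}\left(\mathbf{1}_{\mathcal{P}}(n+h) - \frac{1}{\log n}\right) = \mathfrak{S}_0(\mathcal{H})\int_2^x \frac{dy}{(\log y)^{|\mathcal{H}|}} + O(x^{1/2+\epsilon}),$$

and the term $1/\log n$ that is subtracted above arises naturally as the probability that the “random number” $n+h$ is prime. Montgomery and Soundararajan showed that

$$\sum_{\substack{\mathcal{H} \subset [1,h] \\ |\mathcal{H}| = k}} \mathfrak{S}_0(\mathcal{H}) = \frac{\mu_k}{k!}(-h\log h + Ah)^{k/2} + O_k\left(h^{k/2 - 1/(7k) + \epsilon}\right), \qquad [2.2]$$

where $\mu_k$ is the $k$th moment of the standard Gaussian (in particular, $\mu_k = 0$ if $k$ is odd) and $A$ is a constant independent of $k$. This refines Gallagher’s asymptotic [2.1], and shows that $\mathfrak{S}_0(\mathcal{H})$ exhibits roughly square-root cancellation in each variable.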

2.2. Modified Hardy−Littlewood Conjectures.

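We need a slight modification of the Hardy−Littlewood conjecture, taking into account congruence conditions $\pmod q$. For any integer $q \ge 1$ and a finite subset $\mathcal{H}$ of the integers, we define the singular series at the primes away from $q$ by

$$\mathfrak{S}_q(\mathcal{H}) := \prod_{p \nmid q}\left(1 - \frac{\#(\mathcal{H} \bmod p)}{p}\right)\left(1 - \frac{1}{p}\right)^{-|\mathcal{H}|}.$$

If $a \pmod q$ is such that $(h+a,q) = 1$ for all $h \in \mathcal{H}$, then we expect that

$$\sum_{\substack{n < x \\ n \equiv a \pmod q}}\prod_{h \in \mathcal{H}} \mathbf{1}_{\mathcal{P}}(n+h) \sim \mathfrak{S}_q(\mathcal{H})\left(\frac{q}{\phi(q)}\right)^{|\mathcal{H}|}\frac{1}{q}\int_2^x \frac{dy}{(\log y)^{|\mathcal{H}|}}, \qquad [2.3]$$

where the factor $(q/\phi(q))^{|\mathcal{H}|}$ arises because $h+a$ is conditioned to be coprime to $q$ for all $h \in \mathcal{H}$, and the factor $1/q$ arises because we are restricting $n$ to one residue class $\pmod q$. In analogy with $\mathfrak{S}_0$, it is also useful to define $\mathfrak{S}_{q,0}(\mathcal{H}) := \sum_{\mathcal{T} \subset \mathcal{H}} (-1)^{|\mathcal{H} \setminus \mathcal{T}|}\mathfrak{S}_q(\mathcal{T})$, so that $\mathfrak{S}_q(\mathcal{H}) = \sum_{\mathcal{T} \subset \mathcal{H}} \mathfrak{S}_{q,0}(\mathcal{T})$. Once again, the quantity $\mathfrak{S}_{q,0}$ arises naturally in the asymptotic [conditioning $(h+a,q) = 1$ for all $h \in \mathcal{H}$]

$$\sum_{\substack{n \le x \\ n \equiv a \pmod q}}\prod_{h \in \mathcal{H}}\left(\mathbf{1}_{\mathcal{P}}(n+h) - \frac{q}{\phi(q)\log n}\right) \sim \mathfrak{S}_{q,0}(\mathcal{H})\left(\frac{q}{\phi(q)}\right)^{|\mathcal{H}|}\frac{1}{q}\int_2^x \frac{dy}{(\log y)^{|\mathcal{H}|}}, \qquad [2.4]$$

where the term $q/(\phi(q)\log n)$ being subtracted arises naturally as the probability that $n+h$ is prime, conditioned on the fact that $n+h$ is coprime to $q$.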

2.3. First Steps Toward the Conjecture.

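Let $a$ and $b$ be two reduced residue classes $\pmod q$, and let $h$ be a positive integer with $h \equiv b-a \pmod q$. We now formulate a conjecture for the number of primes $n \le x$ with $n \equiv a \pmod q$ and such that the next prime after $n$ is $n+h$. The gaps between consecutive primes are conjectured to be distributed like a Poisson process with mean $\log x$ (and Gallagher showed that this follows from the Hardy−Littlewood conjectures), and so $h$ should be thought of as a parameter on the scale of $\log x$. With this in mind, we are interested in

$$\sum_{\substack{n \le x \\ n \equiv a \pmod q}} \mathbf{1}_{\mathcal{P}}(n)\mathbf{1}_{\mathcal{P}}(n+h)\prod_{\substack{0<t<h \\ (t+a,q)=1}}\bigl(1 - \mathbf{1}_{\mathcal{P}}(n+t)\bigr) = \sum_{\substack{n \le x \\ n \equiv a \pmod q}} \mathbf{1}_{\mathcal{P}}(n)\mathbf{1}_{\mathcal{P}}(n+h)\prod_{\substack{0<t<h \\ (t+a,q)=1}}\left(1 - \frac{q}{\phi(q)\log(n+t)} - \tilde{\mathbf{1}}_{\mathcal{P}}(n+t)\right), \qquad [2.5]$$

where, for a variable $n$ conditioned to be coprime to $q$, we set $\tilde{\mathbf{1}}_{\mathcal{P}}(n) = \mathbf{1}_{\mathcal{P}}(n) - q/(\phi(q)\log n)$. Write also $\mathbf{1}_{\mathcal{P}}(n) = q/(\phi(q)\log n) + \tilde{\mathbf{1}}_{\mathcal{P}}(n)$ and similarly for $\mathbf{1}_{\mathcal{P}}(n+h)$, and then expand out the product in [2.5]; thus we arrive at [ignoring the small differences between $\log n$, $\log(n+h)$ or $\log(n+t)$]

$$\sum_{\mathcal{A} \subset \{0,h\}}\ \sum_{\substack{\mathcal{T} \subset [1,h-1] \\ (t+a,q)=1\ \forall t \in \mathcal{T}}} (-1)^{|\mathcal{T}|}\sum_{\substack{n \le x \\ n \equiv a \pmod q}}\left(\frac{q}{\phi(q)\log n}\right)^{2-|\mathcal{A}|}\prod_{\substack{t \in [1,h-1] \\ (t+a,q)=1 \\ t \notin \mathcal{T}}}\left(1 - \frac{q}{\phi(q)\log n}\right)\prod_{t \in \mathcal{A} \cup \mathcal{T}} \tilde{\mathbf{1}}_{\mathcal{P}}(n+t). \qquad [2.6]$$

Given reduced residue classes $a$ and $b$, and a positive $h \equiv b-a \pmod q$, we may write

$$\#\{0<t<h : (t+a,q)=1\} = \frac{\phi(q)}{q}h + \epsilon_q(a,b), \qquad [2.7]$$

where $\epsilon_q(a,b)$ is independent of $h$. We also write, for convenience,

$$\alpha(y) = 1 - \frac{q}{\phi(q)\log y}. \qquad [2.8]$$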
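The claim in [2.7] that $\epsilon_q(a,b)$ does not depend on the particular $h \equiv b-a \pmod q$ can be checked with exact rational arithmetic. The helper `epsilon_q` below is an illustrative name; it takes $a$ and $h$ as inputs, with $b$ implied by $b \equiv a+h \pmod q$.

```python
from fractions import Fraction
from math import gcd


def epsilon_q(q, a, h):
    """Return #{0 < t < h : (t + a, q) = 1} - (phi(q)/q) * h as an exact fraction."""
    phi = sum(1 for k in range(1, q + 1) if gcd(k, q) == 1)
    count = sum(1 for t in range(1, h) if gcd(t + a, q) == 1)
    return count - Fraction(phi, q) * h


if __name__ == "__main__":
    # q = 10, a = 1, b = 3: every h congruent to 2 mod 10 gives the same value.
    print([epsilon_q(10, 1, h) for h in (2, 12, 22, 32)])
```

Increasing $h$ by $q$ adds exactly $\phi(q)$ to both the count and the linear term, which is why the difference is constant along the progression.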

Appealing now to the conjectured relation [2.4], we are led to hypothesize that the quantity in [2.5] (and [2.6]) is

$$\sum_{\mathcal{A} \subset \{0,h\}}\ \sum_{\substack{\mathcal{T} \subset [1,h-1] \\ (t+a,q)=1\ \forall t \in \mathcal{T}}} (-1)^{|\mathcal{T}|}\mathfrak{S}_{q,0}(\mathcal{A} \cup \mathcal{T})\left(\frac{1}{q}\int_2^x \left(\frac{q}{\phi(q)\log y}\right)^{2+|\mathcal{T}|}\alpha(y)^{h\phi(q)/q + \epsilon_q(a,b) - |\mathcal{T}|}\,dy\right). \qquad [2.9]$$

Before proceeding further, a few points are in order. Note that $\alpha(x)^{h\phi(q)/q}$ is about $e^{-h/\log x}$, and this exponential decay in $h$ is in keeping with the conjecture that gaps between consecutive primes are distributed like a Poisson process. Secondly, by replacing $\mathcal{A}$ and $\mathcal{T}$ above with $h-\mathcal{A}$ and $h-\mathcal{T}$, and noting also that $\epsilon_q(a,b) = \epsilon_q(-b,-a)$, we may see that the quantity [2.9] above does not change if we replace $(a,b)$ by $(-b,-a)$; this is an example of the symmetry between $\pi(x;q,\mathbf{a})$ and $\pi(x;q,\mathbf{a}^{\mathrm{opp}})$ noted in Conjecture 1.5. Similarly, under the hypotheses of Conjecture 1.6, the conditions satisfied by $h$ and $\mathcal{T}$ are exactly the same for $\pi(x;q,\mathbf{a})$ and $\pi(x;q,\mathbf{b})$. Lastly, in arriving at [2.9], we have paid no attention to error terms, and, moreover, have used a uniform version of the Hardy−Littlewood conjecture both in terms of the size of the parameters in the set $\mathcal{A} \cup \mathcal{T}$ (this is relatively minor) and in terms of the size of the set $\mathcal{A} \cup \mathcal{T}$. To mitigate the last point, we note that, in expanding out the inclusion−exclusion product in [2.5], we may obtain upper and lower bounds by stopping after an odd or an even number of steps (as in Brun’s sieve, for example); in this manner, only a mildly uniform version of the Hardy−Littlewood conjectures seems needed. For the present, we ignore these details, but it would be desirable to place the conjecture [2.9] on a firmer footing.

With conjecture [2.9] in hand, we have a conjecture for $\pi(x;q,(a,b))$: Namely, we sum the quantity in [2.9] over all positive integers $h \equiv b-a \pmod q$. Thus, we expect that

$$\pi(x;q,(a,b)) \sim \frac{1}{q}\int_2^x \alpha(y)^{\epsilon_q(a,b)}\left(\frac{q}{\phi(q)\log y}\right)^2 \mathcal{D}(a,b;y)\,dy, \qquad [2.10]$$

say, where

$$\mathcal{D}(a,b;y) = \sum_{\substack{h>0 \\ h \equiv b-a \pmod q}}\ \sum_{\mathcal{A} \subset \{0,h\}}\ \sum_{\substack{\mathcal{T} \subset [1,h-1] \\ (t+a,q)=1\ \forall t \in \mathcal{T}}} (-1)^{|\mathcal{T}|}\mathfrak{S}_{q,0}(\mathcal{A} \cup \mathcal{T})\left(\frac{q}{\phi(q)\alpha(y)\log y}\right)^{|\mathcal{T}|}\alpha(y)^{h\phi(q)/q}. \qquad [2.11]$$

2.4. Discarding Singular Series Involving Sets with Three or More Elements.

We now conjecture that only terms with $\mathcal{A} = \mathcal{T} = \emptyset$ [which gives rise to the main term of $\mathrm{li}(x)/\phi(q)^2$ for $\pi(x;q,(a,b))$] and $|\mathcal{A}| + |\mathcal{T}| = 2$ give significant contributions leading to the Main Conjecture, and that all other terms contribute to $\pi(x;q,(a,b))$ an amount $O(x(\log\log x)^2/(\log x)^3)$. To argue this, we will use as a guide the work of Montgomery and Soundararajan (13), in particular [2.2] above, which shows that sums over singular series exhibit square-root cancellation in each variable.

Suppose, for example, that $\mathcal{A} = \emptyset$ and $|\mathcal{T}| = \ell \ge 4$ in [2.11]. After summing over the variable $h$, these terms may be thought of as $(\log y)^{1-\ell}$ times an average of $\mathfrak{S}_{q,0}(\mathcal{T})$ over $\ell$ element sets $\mathcal{T}$ whose elements are all of size about $\log y$. The estimate [2.2] now suggests that this contribution is $\ll (\log\log y)^{\ell/2}(\log y)^{1-\ell/2}$, and, because $\ell \ge 4$, the final contribution to $\pi(x;q,(a,b))$ is $O(x(\log\log x)^2/(\log x)^3)$. If $\ell = 3$, then the same argument, drawing on [2.2] with $k=3$ there (so that the main term there vanishes and the bound is $O(h^{3/2 - 1/21 + \epsilon})$), indicates that such terms contribute to $\pi(x;q,(a,b))$ an amount $O(x(\log x)^{-5/2 - 1/21 + \epsilon})$ that is already smaller than the secondary main terms claimed in the Main Conjecture. We believe that, when $k$ is odd, the work of Montgomery and Soundararajan (13) can be refined, and the actual size of the sum in [2.2] is $h^{(k-1)/2}(\log h)^{(k+1)/2}$. This expectation suggests that the terms with $\mathcal{A} = \emptyset$ and $|\mathcal{T}| = 3$ also make a contribution of $O(x(\log\log x)^2/(\log x)^3)$.

When $\mathcal{A} = \{0\}$ or $\{h\}$, then a similar heuristic to the above shows that terms with $|\mathcal{T}| \ge 2$ make a contribution to $\pi(x;q,(a,b))$ of $O(x(\log\log x)^2/(\log x)^3)$. Finally, if $\mathcal{A} = \{0,h\}$ and $|\mathcal{T}| = \ell \ge 1$, then the contribution to [2.11] may be roughly thought of as $(\log y)^{-\ell}$ times an average of singular series $\mathfrak{S}_{q,0}(\{0\} \cup \mathcal{T}^+)$ where $\mathcal{T}^+$ (standing for $\mathcal{T} \cup \{h\}$) runs over $\ell+1$ element sets with elements of size $\log y$. Because the singular series $\mathfrak{S}_{q,0}$ is translation-invariant, one can think of this last sum as being $1/(\log y)$ times the average over $\ell+2$ element sets with all elements of size $\log y$. After making this observation, we can draw on [2.2] (with its proposed refinement for odd $k$) as earlier, and this leads to the prediction that the contribution to $\pi(x;q,(a,b))$ of terms with $\mathcal{A} = \{0,h\}$ and any nonempty $\mathcal{T}$ is $O(x(\log\log x)^2/(\log x)^3)$.

Thus, discarding all terms with $|\mathcal{A}| + |\mathcal{T}| \ge 3$, we now replace the density $\mathcal{D}(a,b;y)$ in [2.11] with

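$$\mathcal{D}(a,b;y) = \mathcal{D}_0(a,b;y) + \mathcal{D}_1(a,b;y) + \mathcal{D}_2(a,b;y), \qquad [2.12]$$

where (keeping in mind that $\mathfrak{S}_{q,0}$ is 1 for the empty set and 0 for a singleton)

$$\mathcal{D}_0(a,b;y) = \sum_{\substack{h>0 \\ h \equiv b-a \pmod q}}\bigl(1 + \mathfrak{S}_{q,0}(\{0,h\})\bigr)\alpha(y)^{h\phi(q)/q}, \qquad [2.13]$$

$$\mathcal{D}_1(a,b;y) = -\frac{q}{\phi(q)\alpha(y)\log y}\sum_{\substack{h>0 \\ h \equiv b-a \pmod q}}\ \sum_{\substack{t \in [1,h-1] \\ (t+a,q)=1}}\bigl(\mathfrak{S}_{q,0}(\{0,t\}) + \mathfrak{S}_{q,0}(\{t,h\})\bigr)\alpha(y)^{h\phi(q)/q}, \qquad [2.14]$$

and

$$\mathcal{D}_2(a,b;y) = \left(\frac{q}{\phi(q)\alpha(y)\log y}\right)^2\sum_{\substack{h>0 \\ h \equiv b-a \pmod q}}\ \sum_{\substack{1 \le t_1 < t_2 < h \\ (t_1+a,q)=(t_2+a,q)=1}} \mathfrak{S}_{q,0}(\{t_1,t_2\})\alpha(y)^{h\phi(q)/q}. \qquad [2.15]$$

Inserting this in [2.10], we thus conjecture that, up to $O(x(\log\log x)^2/(\log x)^3)$, there holds

$$\pi(x;q,(a,b)) = \frac{q}{\phi(q)^2}\int_2^x \frac{\alpha(y)^{\epsilon_q(a,b)}}{(\log y)^2}(\mathcal{D}_0 + \mathcal{D}_1 + \mathcal{D}_2)(a,b;y)\,dy. \qquad [2.16]$$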

2.5. The Main Proposition.

To evaluate the sums over two-term singular series above, we invoke the following proposition whose proof we defer to Section 3, Proof of the Proposition.

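Proposition 2.1. Let $q \ge 2$, and let $v \pmod q$ be any residue class. For any positive real number $H$, define

$$S_0(q,v;H) = \sum_{\substack{h>0 \\ h \equiv v \pmod q}} \mathfrak{S}_{q,0}(\{0,h\})e^{-h/H}.$$

Then we may write

$$S_0(q,0;H) = -\frac{\phi(q)}{2q}\log H + S_0^c(q,0) + Z_{q,0}(H) + O(H^{-1+\epsilon}),$$

where

$$S_0^c(q,0) = \frac{\phi(q)}{2q}\log\frac{q}{2\pi} - \frac{\phi(q)}{2q}\sum_{p \mid q}\frac{\log p}{p-1} + \frac{1}{2},$$

and, for any $v \pmod q$, the quantity $Z_{q,v}(H)$ is described in [3.2] below, satisfies the bound $Z_{q,v}(H) = O(H^{-1/2+\epsilon})$, and is conjectured to be $O(H^{-3/4})$. Further, if $(v,q) = d$ with $d<q$, then

$$S_0(q,v;H) = S_0^c(q,v) + Z_{q,v}(H) + O(H^{-1+\epsilon}),$$

where

$$S_0^c(q,v) = -\frac{\phi(q)}{2q}\frac{\Lambda(q/d)}{\phi(q/d)} - B_q(v) + \frac{1}{\phi(q/d)}\sum_{\chi \ne \chi_0 \pmod{q/d}} \bar{\chi}(v/d)L(0,\chi)L(1,\chi)A_{q,\chi},$$

with $B_q(v) = 1/2 - v/q$ for $1 \le v \le q$ and extended periodically for all $v$, and

$$A_{q,\chi} = \prod_{p \mid q}\left(1 - \frac{\chi(p)}{p}\right)\prod_{p \nmid q}\left(1 - \frac{(1-\chi(p))^2}{(p-1)^2}\right).$$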

2.6. Completing the Heuristic.

Returning to our heuristic calculation, we will apply Proposition 2.1 with

$$H = H(y) := -\frac{q}{\phi(q)}\frac{1}{\log\alpha(y)} = \log y - \frac{q}{2\phi(q)} + O\left(\frac{1}{\log y}\right). \qquad [2.17]$$
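The choice of $H(y)$ in [2.17] is exactly what turns $\alpha(y)^{h\phi(q)/q}$ into $e^{-h/H}$, and its expansion is easy to confirm numerically. The sketch below uses illustrative function names and takes $\phi(q)$ as an explicit argument to stay self-contained.

```python
from math import exp, log


def alpha(y, q, phi_q):
    """alpha(y) = 1 - q / (phi(q) log y), as in [2.8]."""
    return 1 - q / (phi_q * log(y))


def H_of_y(y, q, phi_q):
    """H(y) = -(q / phi(q)) / log(alpha(y)), as in [2.17]."""
    return -(q / phi_q) / log(alpha(y, q, phi_q))


if __name__ == "__main__":
    y, q, phi_q = 1e12, 10, 4
    H = H_of_y(y, q, phi_q)
    print("H(y)               :", H)
    print("log y - q/(2 phi)  :", log(y) - q / (2 * phi_q))
    # By construction alpha^(h phi/q) equals exp(-h/H) for every h.
    h = 20
    print(alpha(y, q, phi_q) ** (h * phi_q / q), exp(-h / H))
```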

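We begin by simplifying a bit the expressions for $\mathcal{D}_0$, $\mathcal{D}_1$, and $\mathcal{D}_2$, discarding terms of size $O(\log\log y/\log y)$, which are negligible for the Main Conjecture. Thus, after summing the geometric series and using [2.17],

$$\mathcal{D}_0 = S_0(q,b-a;H) + \sum_{\substack{h>0 \\ h \equiv b-a \pmod q}} e^{-h/H} = S_0(q,b-a;H) + \frac{H}{q} + B_q(b-a) + O\left(\frac{1}{H}\right) = \frac{\log y}{q} + S_0(q,b-a;H) + B_q(b-a) - \frac{1}{2\phi(q)} + O\left(\frac{1}{\log y}\right). \qquad [2.18]$$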
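The geometric-series step in [2.18], namely $\sum_{h \equiv v} e^{-h/H} = H/q + B_q(v) + O(1/H)$, can be verified against the closed form of the series. The names `B` and `geometric_sum` are illustrative.

```python
from math import exp


def B(q, v):
    """B_q(v) = 1/2 - v/q for 1 <= v <= q, extended periodically in v."""
    v = (v - 1) % q + 1  # representative of v in the range 1..q
    return 0.5 - v / q


def geometric_sum(q, v, H):
    """Closed form of the sum of exp(-h/H) over h > 0 with h = v (mod q)."""
    v = (v - 1) % q + 1
    return exp(-v / H) / (1 - exp(-q / H))


if __name__ == "__main__":
    q, H = 10, 30.0
    for v in (2, 4, 10):
        exact = geometric_sum(q, v, H)
        approx = H / q + B(q, v)
        print(v, exact, approx, exact - approx)
```

The printed differences shrink like $1/H$ as $H$ grows, consistent with the stated error term.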

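The definition of $\mathcal{D}_1$ involves two singular series, $\mathfrak{S}_{q,0}(\{0,t\})$ and $\mathfrak{S}_{q,0}(\{t,h\})$. Consider the terms arising from the second case. Replace $\mathfrak{S}_{q,0}(\{t,h\})$ by $\mathfrak{S}_{q,0}(\{0,r\})$ where $r = h-t$ also lies in $[1,h-1]$ and note that the condition $(t+a,q) = 1$ becomes $(r-b,q) = 1$. Thus, ignoring terms of size $O(\log\log y/\log y)$, the second case in $\mathcal{D}_1$ contributes

$$-\frac{q}{\phi(q)\alpha(y)\log y}\sum_{\substack{r>0 \\ (r-b,q)=1}} \mathfrak{S}_{q,0}(\{0,r\})\sum_{\substack{h>r \\ h \equiv b-a \pmod q}} e^{-h/H} = -\frac{1}{\phi(q)}\sum_{\substack{v \pmod q \\ (v-b,q)=1}} S_0(q,v;H).$$

Arguing similarly with the first case, we conclude that

$$\mathcal{D}_1 = -\frac{1}{\phi(q)}\sum_{\substack{v \pmod q \\ (v+a,q)=1}} S_0(q,v;H) - \frac{1}{\phi(q)}\sum_{\substack{v \pmod q \\ (v-b,q)=1}} S_0(q,v;H) + O\left(\frac{\log\log y}{\log y}\right). \qquad [2.19]$$

Finally, note that

$$\sum_{\substack{h>0 \\ h \equiv b-a \pmod q}} e^{-h/H}\sum_{\substack{1 \le t_1 < t_2 < h \\ (t_1+a,q)=1 \\ (t_2+a,q)=1}} \mathfrak{S}_{q,0}(\{t_1,t_2\}) = \sum_{\substack{1 \le t_1 < t_2 \\ (t_1+a,q)=1 \\ (t_2+a,q)=1}} \mathfrak{S}_{q,0}(\{0,t_2-t_1\})\sum_{\substack{h \equiv b-a \pmod q \\ h>t_2}} e^{-h/H} = \frac{H^2}{q^2}\sum_{\substack{v_1,v_2 \pmod q \\ (v_1,q)=1 \\ (v_2,q)=1}} S_0(q,v_2-v_1;H) + O(H\log H),$$

so that

$$\mathcal{D}_2 = \frac{1}{\phi(q)^2}\sum_{\substack{v_1,v_2 \pmod q \\ (v_1,q)=1 \\ (v_2,q)=1}} S_0(q,v_2-v_1;H) + O\left(\frac{\log\log y}{\log y}\right). \qquad [2.20]$$

Using Proposition 2.1 to evaluate [2.18], [2.19], and [2.20] and then inserting that in [2.10] leads to the Main Conjecture. The term involving $c_1(q;(a,b))$ arises from terms involving $S_0(q,0;H)$, which has a leading term of size $\log H$ whereas all other $S_0(q,v;H)$ are only of constant size. Thus, isolating the $-[\phi(q)/2q]\log H$ leading contribution to $S_0(q,0;H)$ and tracking its appearance in our expressions for $\mathcal{D}_0$, $\mathcal{D}_1$ and $\mathcal{D}_2$ gives

$$-\frac{\phi(q)}{2q}(\log H)\delta(a=b) - \frac{2}{\phi(q)}\left(-\frac{\phi(q)}{2q}\log H\right) + \frac{1}{\phi(q)}\left(-\frac{\phi(q)}{2q}\log H\right) = \frac{\phi(q)}{2q}(\log\log y)\left(\frac{1}{\phi(q)} - \delta(a=b)\right) + O\left(\frac{\log\log y}{\log y}\right).$$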

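The term involving $c_2(q;(a,b))$ is complicated, but follows straightforwardly from our work above. Having already treated the $-[\phi(q)/2q]\log H$ term arising in $S_0(q,0)$, the contributions leading to $c_2(q;(a,b))$ come from the $S_0^c(q,v)$ terms in Proposition 2.1. We thus have

$$\frac{c_2(q;\mathbf{a})}{q} = -\frac{\epsilon_q(a,b)}{\phi(q)} + S_0^c(q,b-a) + B_q(b-a) - \frac{1}{2\phi(q)} - \frac{1}{\phi(q)}\sum_{\substack{v \pmod q \\ (v+a,q)=1}} S_0^c(q,v) - \frac{1}{\phi(q)}\sum_{\substack{v \pmod q \\ (v-b,q)=1}} S_0^c(q,v) + \frac{1}{\phi(q)^2}\sum_{\substack{v_1,v_2 \pmod q \\ (v_1,q)=1 \\ (v_2,q)=1}} S_0^c(q,v_2-v_1). \qquad [2.21]$$

With $C_{q,\chi} = L(0,\chi)L(1,\chi)A_{q,\chi}$ (which is zero unless $\chi$ is an odd character), we may also derive the following alternative expression:

$$\frac{c_2(q;\mathbf{a})}{q} = \frac{\log 2\pi}{2q} + S_0^c(q,b-a) + B_q(b-a) - \frac{1}{\phi(q)}\sum_{\substack{d \mid q \\ d>1}}\frac{1}{\phi(d)}\sum_{\substack{\chi \pmod d \\ \chi(-1)=-1}} C_{q,\chi}\Biggl(\sum_{\substack{u \pmod d \\ (uq/d+a,q)=1}} + \sum_{\substack{u \pmod d \\ (uq/d-b,q)=1}}\Biggr)\bar{\chi}(u). \qquad [2.22]$$

If $\chi$ is induced by the primitive character $\chi^*$, then, writing $\chi = \chi_{0,m}\chi^*$ for some $m$ coprime to the conductor of $\chi$, we have

$$C_{q,\chi} = C_{q,\chi^*}\prod_{p \mid m}(1 - \chi^*(p)).$$

Further, it is helpful to write $q = q_0 2^r$ with $q_0$ odd. If now $\chi$ is a character to an odd modulus and $q$ is even, then

$$C_{q,\chi} = \frac{\bar{\chi}(2)}{2}C_{q_0,\chi}.$$

Using these facts, it is possible to simplify the formula in [2.22] further, and obtain

$$c_2(q;(a,b)) = \frac{\log 2\pi}{2} + qS_0^c(q,b-a) + qB_q(b-a) - \frac{q_0}{\phi(q_0)}\sum_{d \mid q_0}\frac{\mu(d)}{\phi(d)}\sum_{\chi \pmod d} C_{q_0,\chi}\bigl(\bar{\chi}(b) - \bar{\chi}(a)\bigr). \qquad [2.23]$$

For example, if $q$ is prime and $a \ne b$, then

$$c_2(q;(a,b)) = \frac{1}{2}\log\frac{2\pi}{q} + \frac{q}{\phi(q)}\sum_{\chi \ne \chi_0} C_{q,\chi}\left(\bar{\chi}(b-a) + \frac{1}{\phi(q)}\bigl(\bar{\chi}(b) - \bar{\chi}(a)\bigr)\right).$$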

This completes our discussion of the Main Conjecture in the case r=2, and the other conjectures follow as simple consequences.

3. Proof of the Proposition

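The proof follows along standard lines, and the closely related case of evaluating asymptotically $\sum_{h \le H} \mathfrak{S}_0(\{0,h\})(H-h)$ is mentioned in ref. 15 and treated in detail in ref. 16. We will therefore be brief. Let $\chi$ be a Dirichlet character modulo $m \mid q$; possibly, $\chi$ could be imprimitive, or the principal character. Define, for $\mathrm{Re}(s)>1$,

$$F_{q,\chi}(s) := \sum_{h \ge 1}\frac{\chi(h)}{h^s}\mathfrak{S}_q(\{0,h\}) = \prod_{p \mid q}\left(1 - \frac{\chi(p)}{p^s}\right)^{-1}\prod_{p \nmid q}\left(1 - \frac{1}{(p-1)^2} + \frac{\chi(p)}{p^s}\left(1 - \frac{1}{p}\right)^{-1}\left(1 - \frac{\chi(p)}{p^s}\right)^{-1}\right),$$

so that

$$\sum_{h \ge 1}\chi(h)\mathfrak{S}_q(\{0,h\})e^{-h/H} = \frac{1}{2\pi i}\int_{(2)} F_{q,\chi}(s)H^s\Gamma(s)\,ds. \qquad [3.1]$$

We now note that

$$F_{q,\chi}(s) = L(s,\chi)\prod_{p \nmid q}\left(1 - \frac{1}{(p-1)^2} + \frac{\chi(p)}{p^{s-1}}\frac{1}{(p-1)^2}\right) = L(s,\chi)L(s+1,\chi)\prod_{p \mid q}\left(1 - \frac{\chi(p)}{p^{s+1}}\right)\prod_{p \nmid q}\left(1 - \frac{(1-\chi(p)/p^s)^2}{(p-1)^2}\right),$$

which furnishes a meromorphic continuation of $F_{q,\chi}(s)$ to $\mathrm{Re}(s) > -1/2$ with possible poles at $s=0$ or $s=1$ in case $\chi$ is principal. We may also express the above as

$$F_{q,\chi}(s) = \frac{L(s,\chi)L(s+1,\chi)}{L(2s+2,\chi^2)}\prod_{p \mid q}\left(1 + \frac{\chi(p)}{p^{s+1}}\right)^{-1}\prod_{p \nmid q}\left(1 - \frac{1}{(p-1)^2} + \frac{2p\chi(p)}{(p-1)^2(p^{s+1}+\chi(p))}\right),$$

and now the final product above is analytic in $\mathrm{Re}(s) > -1$, but for which the line $\mathrm{Re}(s) = -1$ forms a natural boundary.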
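The three forms of the Euler factor at a prime $p \nmid q$ can be checked against one another with exact rational arithmetic. In the sketch below, the variable `u` stands in for $\chi(p)/p^s$ and is given arbitrary rational test values, which is an assumption made purely so that equality can be tested exactly; the function name is illustrative.

```python
from fractions import Fraction


def local_factor(p, u):
    """Euler factor of F_{q,chi} at p not dividing q, with u = chi(p)/p^s."""
    return 1 - Fraction(1, (p - 1) ** 2) + Fraction(p, p - 1) * u / (1 - u)


def check_identities(p, u):
    """Return True if all three rewritings of the local factor agree."""
    base = local_factor(p, u)
    # After removing the L(s, chi) factor: multiply by (1 - u).
    step1 = 1 - Fraction(1, (p - 1) ** 2) + u * Fraction(p, (p - 1) ** 2)
    # After also removing the L(s + 1, chi) factor: multiply by (1 - u/p).
    step2 = 1 - (1 - u) ** 2 / (p - 1) ** 2
    # Form used with L(2s + 2, chi^2): divide by (1 + u/p) instead.
    step3 = 1 - Fraction(1, (p - 1) ** 2) + 2 * u / ((p - 1) ** 2 * (1 + u / p))
    return (
        base * (1 - u) == step1
        and step1 * (1 - u / p) == step2
        and step1 / (1 + u / p) == step3
    )


if __name__ == "__main__":
    print(all(check_identities(p, Fraction(1, d)) for p in (3, 5, 7, 11) for d in (2, 3, 7)))
```

Setting $u = \chi(p)$ (that is, $s=0$) in the second form recovers the factor of $A_{q,\chi}$ at $p \nmid q$ from Proposition 2.1.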

If $\chi$ is nonprincipal, then, by shifting the line of integration to $\mathrm{Re}(s) = -1/2+\epsilon$, we find that the quantity in [3.1] is $L(0,\chi)L(1,\chi)A_{q,\chi} + O(H^{-1/2+\epsilon})$, with the main term coming from the pole of $\Gamma(s)$ at $s=0$. Moreover, we may even shift the line of integration to $\mathrm{Re}(s) = -1+\epsilon$ at the cost of picking up residues from the zeros of $L(2s+2,\chi^2)$. The contribution from these zeros is

$$Z_{q,\chi}(H) := \sum_{\substack{\rho,\ \mathrm{Re}(\rho)>0 \\ L(\rho,\chi^2)=0}} \operatorname*{Res}_{s=\rho/2-1}\bigl(F_{q,\chi}(s)H^s\Gamma(s)\bigr).$$

If we suppose that GRH holds for $L(s,\chi^2)$, that its zeros are simple, and that $|L'(\rho,\chi^2)|$ is not too small so that [in view of the exponential decay of $\Gamma(s)$] the sum over residues is absolutely convergent, then we would expect that $Z_{q,\chi}(H)$ is an oscillating term of size $H^{-3/4}$.

If $\chi$ is principal, but $m>1$, then $F_{q,\chi}(s)$ has a pole at $s=1$ with residue $\phi(m)/m$, but there is no pole of $F_{q,\chi}$ at $s=0$ because $L(s,\chi_0) = -\tfrac{1}{2}s\Lambda(m) + O(s^2)$ for $s$ near 0. Therefore, in this situation, we find

$$\sum_{h \ge 1}\chi_0(h)e^{-h/H}\mathfrak{S}_q(\{0,h\}) = \frac{\phi(m)}{m}H - \frac{\phi(q)}{2q}\Lambda(m) + Z_{q,\chi_0}(H) + O(H^{-1+\epsilon}).$$

Finally, if $m=1$ (and $\chi$ is naturally principal), the corresponding $F_{q,\chi}(s)$ has a simple pole at $s=0$ in addition to the pole at $s=1$. Thus, there is a double pole of the integrand in [3.1], and, computing residues, we obtain that

$$\sum_{h \ge 1} e^{-h/H}\mathfrak{S}_q(\{0,h\}) = H - \frac{\phi(q)}{2q}\left[\log 2\pi H + \sum_{p \mid q}\frac{\log p}{p-1}\right] + Z_{q,\zeta}(H) + O(H^{-1+\epsilon}).$$

Because

$$\sum_{\substack{h>0 \\ h \equiv v \pmod q}} e^{-h/H}\mathfrak{S}_q(\{0,h\}) = S_0(q,v;H) + \frac{H}{q} + B_q(v) + O\left(\frac{1}{H}\right),$$

our proposition follows, with

$$Z_{q,v}(H) = \frac{1}{\phi(q/d)}\sum_{\chi \pmod{q/d}} \bar{\chi}(v/d)Z_{q,\chi}(H/d). \qquad [3.2]$$

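4. Modifications to the Heuristic When $r \ge 3$

The ideas leading to the general case of the Main Conjecture are similar to those for $r=2$, and so we just give a brief sketch. For $r \ge 3$ and $\mathbf{a} = (a_1, \ldots, a_r)$, we start by writing $\pi(x;q,\mathbf{a})$ as

$$\sum_{\substack{n \le x \\ n \equiv a_1 \pmod q}}\ \sum_{\substack{h_1,\ldots,h_{r-1}>0 \\ h_i \equiv a_{i+1}-a_i \pmod q}} \mathbf{1}_{\mathcal{P}}(n)\prod_{i=1}^{r-1}\Biggl[\mathbf{1}_{\mathcal{P}}(n+h_1+\cdots+h_i) \times \prod_{\substack{0<t<h_i \\ (t+a_i,q)=1}}\bigl(1 - \mathbf{1}_{\mathcal{P}}(n+h_1+\cdots+h_{i-1}+t)\bigr)\Biggr].$$

As before, we expand this out, invoke the Hardy−Littlewood conjectures, and then discard all singular series terms except for the empty set and sets with two elements. This leads to

$$\pi(x;q,\mathbf{a}) = \int_2^x \frac{q^{r-1}}{\phi(q)^r}\left(1 - \frac{q}{\phi(q)\log y}\right)^{\epsilon_q(\mathbf{a})}(\mathcal{D}_0 + \mathcal{D}_1 + \mathcal{D}_2)(\mathbf{a};y)\frac{dy}{(\log y)^r} + O\left(\frac{x(\log\log x)^2}{(\log x)^3}\right),$$

where $\epsilon_q(\mathbf{a}) = \epsilon_q(a_1,a_2) + \cdots + \epsilon_q(a_{r-1},a_r)$ and $\mathcal{D}_0$, $\mathcal{D}_1$, and $\mathcal{D}_2$ are certain smooth sums of singular series. For $\mathcal{D}_0$, we have [with $H = H(y)$ as before]

$$\mathcal{D}_0 = \sum_{\substack{h_1,\ldots,h_{r-1}>0 \\ h_i \equiv a_{i+1}-a_i \pmod q}} e^{-(h_1+\cdots+h_{r-1})/H}\Biggl(1 + \sum_{0 \le i<j \le r-1} \mathfrak{S}_{q,0}(\{0,h_{i+1}+\cdots+h_j\})\Biggr).$$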

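Notice that, if $j = i+1$ in the inner summation, the resulting expression is $(H/q)^{r-2}$ times the analogous $\mathcal{D}_0$ term in our calculation for $\pi(x;q,(a_j,a_{j+1}))$. If $j-i>1$, we will need to consider sums of the form

$$S_0^k(q,v;H) := \sum_{\substack{h>0 \\ h \equiv v \pmod q}} h^k e^{-h/H}\mathfrak{S}_{q,0}(\{0,h\}),$$

where $k = j-i-1$. This can be understood via contour integration as in Proposition 2.1; a key difference is that, for $k \ge 1$, we have $S_0^k(q,v;H) = O(H^{k-1/2})$ unless $v=0$, in which case $S_0^k(q,0;H) = -[\phi(q)/2q]\Gamma(k)H^k + O(H^{k-1/2})$. Using this to evaluate $\mathcal{D}_0$, we find that it is [up to $O(H^{r-3})$]

$$\frac{H^{r-1}}{q^{r-1}} + \frac{H^{r-2}}{q^{r-2}}\sum_{i=1}^{r-1}\left[S_0(q,a_{i+1}-a_i;H) + B_q(a_{i+1}-a_i) + \sum_{k=1}^{r-i-1}\frac{S_0^k(q,a_{i+k+1}-a_i;H)}{k!H^k}\right] \sim \frac{H^{r-1}}{q^{r-1}} + \frac{H^{r-2}}{q^{r-2}}\sum_{i=1}^{r-1}\left[S_0(q,a_{i+1}-a_i;H) + B_q(a_{i+1}-a_i) - \frac{\phi(q)}{2q}\sum_{k=1}^{r-i-1}\frac{\delta(a_i = a_{i+k+1})}{k}\right],$$

and it is this last term that creates the additional bias [in $c_2(q;\mathbf{a})$] against patterns with a nonimmediate repetition.

For $\mathcal{D}_1$, up to $O(H^{r-2})$, we obtain a contribution of $(H/q)^{r-1}(1 - [\phi(q)/q]\log y)^{-1}$ times

$$\sum_{j=1}^{r-1}\Biggl[\Biggl(\sum_{(v+a_j,q)=1} + \sum_{(v-a_{j+1},q)=1}\Biggr)S_0(q,v;H) + \sum_{k=1}^{j-1}\sum_{(v,q)=1}\frac{S_0^k(q,v-a_{j-k};H)}{k!H^k} + \sum_{k=1}^{r-1-j}\sum_{(v,q)=1}\frac{S_0^k(q,-v+a_{j+1+k};H)}{k!H^k}\Biggr] \sim \sum_{j=1}^{r-1}\Biggl(\sum_{(v+a_j,q)=1} + \sum_{(v-a_{j+1},q)=1}\Biggr)S_0(q,v;H) - \frac{\phi(q)}{q}\sum_{k=1}^{r-2}\frac{r-1-k}{k}.$$

Finally, from $\mathcal{D}_2$, we obtain $(H/q)^r(1 - [\phi(q)/q]\log y)^{-2}$ times

$$\sum_{j=1}^{r-1}\Biggl(\sum_{\substack{(v_1,q)=1 \\ (v_2,q)=1}} S_0(q,v_2-v_1;H) + \sum_{k=1}^{r-1-j}\sum_{\substack{(v_1,q)=1 \\ (v_2,q)=1}}\frac{S_0^k(q,-v_1+v_2;H)}{k!H^k}\Biggr) \sim (r-1)\sum_{\substack{(v_1,q)=1 \\ (v_2,q)=1}} S_0(q,v_2-v_1;H) - \frac{\phi(q)^2}{2q}\sum_{k=1}^{r-2}\frac{r-1-k}{k}.$$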

Assembling these contributions yields the Main Conjecture.

5. Comparison of the Conjecture with Numerical Data

We begin by comparing the Main Conjecture with the data for r=2 and q=3 or 4. In each of these cases, our conjecture is that

$$\pi(x;q,\mathbf{a}) = \frac{\mathrm{li}(x)}{4}\left(1 \pm \frac{1}{2\log x}\log\left(\frac{2\pi\log x}{q}\right)\right) + O\left(\frac{x}{(\log x)^{11/4}}\right), \qquad [5.1]$$

with the sign being negative if $a_1 \equiv a_2 \pmod q$ and positive if not. However, to obtain [5.1] in such a clean form, a number of asymptotic approximations were used throughout Section 2, The Heuristic for r = 2, and it is reasonable to expect that the unsimplified integral expression [2.16] for $\pi(x;q,\mathbf{a})$ would provide a better fit to the data. Indeed, we find the following.

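[Table (image file pnas.1605366113t02.jpg, not reproduced here): comparison of $\pi(x;q,\mathbf{a})$ for $q=3$ and 4 with the approximations [5.1] and [2.16].]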

Going forward, we will present only the comparison of π(x;q,a) against [2.16], so we explain briefly how we compute this approximation. In [2.18], [2.19], and [2.20], we determined D0, D1, and D2 in terms of S0(q,v;H) and, in the process, replaced geometric progressions in h with suitable approximations. Of course, the geometric progressions could just be computed exactly. We keep the exact but messy expressions so obtained and, for S0(q,v;H), use the main terms described in Proposition 2.1. This yields an expression for π(x;q,a) as an explicit integral, which we computed numerically in Sage. The actual values of π(x;q,a) were computed in C++ using the primesieve library. Code for both computations can be found on the first author’s website.

Next we consider $q=8$. Here too the constants simplify, with $c_2(8;(a,b))$ depending only on the difference $b-a \pmod 8$ (a fact reflected in the data, as predicted by Conjecture 1.6). Explicitly, we have $c_2(8;(a,a)) = (5\log 2 - 3\log\pi)/2$, $c_2(8;(a,a+2)) = c_2(8;(a,a+6)) = (\log\pi - \log 2)/2$, and $c_2(8;(a,a+4)) = (\log\pi - 3\log 2)/2$. Thus, we should expect that, among the nondiagonal patterns, those with $b-a=4$ should be the least frequent, and those with $b-a=2$ and 6 should be rather close. Indeed, we find the following.

[Table (image file pnas.1605366113t03.jpg, not reproduced here): comparison of $\pi(x;8,(a,b))$ with [2.16].]

We now turn to the patterns (mod 12). Here, the quadratic character $\chi \pmod 3$ plays a role for those patterns (a,b) with $a \not\equiv b \pmod 3$. In particular, it does not play a role in the diagonal patterns, for which $c_2(12;\mathbf{a})$ is given by [1.1]. For nondiagonal patterns, we have the following.

graphic file with name pnas.1605366113t04.jpg

[The other values of $c_2(12;\mathbf{a})$ are determined by $c_2(12;\mathbf{a}^{\mathrm{opp}})$.]

Here, $A_{12,\chi} \approx 1.036$, so that $c_2(12;(5,7))$ and $c_2(12;(11,1))$ are the largest of these. Moreover, as in the (mod 8) case, there are symmetries between patterns with the same difference $b-a$. We find the following.

graphic file with name pnas.1605366113t05.jpg

We close by considering q=5 (which amounts to considering the last decimal digit of primes). Essentially, no simplifications can be made for the constants c2(q;a). For any nondiagonal pattern (a,b), we find

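$$c_2(5;(a,b)) = \frac{\log(2\pi/5)}{2} + \frac{5}{2}\,\mathrm{Re}\left(L(0,\chi)\,L(1,\chi)\,A_{5,\chi}\left[\bar{\chi}(b-a) + \frac{\bar{\chi}(b)-\bar{\chi}(a)}{4}\right]\right),$$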

where χ is either of the complex characters (mod 5). Apart from the understood symmetry $c_2(5;(a,b)) = c_2(5;(-b,-a))$, the value of $c_2$ determines the pattern. Thus, we might expect significant variation between the various patterns and, in particular, no additional symmetries like those we saw (mod 8) and (mod 12). We find the following, presenting only the first of $(a,b)$ and $(-b,-a)$.

graphic file with name pnas.1605366113t06.jpg

An interesting feature to be observed here is that, initially, π(x;5,(1,2)) is larger than π(x;5,(1,3)), despite our conjecture predicting the opposite ordering. In fact, this is true for all x between 41,231 and $5.076 \times 10^{11}$. However, at about $5.082 \times 10^{11}$, π(x;5,(1,3)) becomes consistently larger, seemingly forever, exactly as our conjecture would predict. We take this as reasonable evidence for our speculation that there are even more lower-order terms [e.g., on the order of $x(\log\log x)^2/(\log x)^3$], which, in this case, apparently conspire to point in the direction opposite to the bias in the Main Conjecture.
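The character factor $\bar{\chi}(b-a) + (\bar{\chi}(b)-\bar{\chi}(a))/4$ appearing in the formula for c2(5;(a,b)) can be tabulated without computing any L-values. The Python sketch below is an illustration under stated assumptions: it fixes the complex character (mod 5) with χ(2) = i (the other complex character is its conjugate), the names `chi` and `bracket` are ours, and the factors L(0,χ), L(1,χ), and A_{5,χ} are left out entirely.

```python
def chi(n):
    """Complex Dirichlet character mod 5 with chi(2) = i (assumed choice).

    2 is a primitive root mod 5, so chi(2**k) = i**k; chi(n) = 0 when 5 | n.
    This character is odd: chi(-1) = chi(4) = -1.
    """
    values = {1: 1, 2: 1j, 4: -1, 3: -1j}
    return complex(values.get(n % 5, 0))


def bracket(a, b):
    """Factor chi_bar(b - a) + (chi_bar(b) - chi_bar(a)) / 4 for q = 5."""
    chi_bar = lambda n: chi(n).conjugate()
    return chi_bar(b - a) + (chi_bar(b) - chi_bar(a)) / 4


if __name__ == "__main__":
    for a in range(1, 5):
        for b in range(1, 5):
            if a != b:
                print((a, b), bracket(a, b))
```

For this odd character the factor takes the same value at (a,b) and at (−b,−a), it changes sign when a and b are swapped, and for fixed a its values over the three nondiagonal b sum to zero.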

Acknowledgments

We thank Tadashi Tokieda, whose lecture on “Rock, paper, scissors in probability” inspired the present work; James Maynard for drawing our attention to ref. 3; Paul Abbott for pointing us to ref. 7; and Alexandra Florea, Andrew Granville, and Peter Sarnak for helpful comments. The first author is partially supported by National Science Foundation (NSF) postdoctoral fellowship Division of Mathematical Sciences 1303913. The second author is partially supported by the NSF, and by a Simons Investigator Award from the Simons Foundation.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

References

  • 1. Rubinstein M, Sarnak P. Chebyshev’s bias. Exp Math. 1994;3(3):173–197.
  • 2. Granville A, Martin G. Prime number races. Am Math Mon. 2006;113(1):1–33.
  • 3. Knapowski S, Turán P. Number Theory and Algebra. Academic; New York: 1977. On prime numbers ≡ 1 resp. 3 mod 4; pp. 157–165.
  • 4. Shiu DKL. Strings of congruent primes. J Lond Math Soc. 2000;61(2):359–373.
  • 5. Banks WD, Freiberg T, Turnage-Butterbaugh CL. Consecutive primes in tuples. Acta Arith. 2015;167(3):261–266.
  • 6. Maynard J. 2014. Dense clusters of primes in subsets. arXiv:14052953.
  • 7. Ko C-M. Distribution of the units digit of primes. Chaos Solitons Fractals. 2002;13(6):1295–1302.
  • 8. Ash A, Beltis L, Gross R, Sinnott W. Frequencies of successive pairs of prime residues. Exp Math. 2011;20(4):400–411.
  • 9. Dummit D, Granville A, Kisilevsky H. Big biases amongst products of two primes. Mathematika. 2016;62(2):502–507.
  • 10. Gallagher PX. On the distribution of primes in short intervals. Mathematika. 1976;23(1):4–9.
  • 11. Goldston DA, Ledoan AH. The jumping champion conjecture. Mathematika. 2015;61(3):719–740.
  • 12. Granville A, van de Lune J, te Riele HJJ. Checking the Goldbach conjecture on a vector computer. In: Mollin RA, editor. Number Theory and Applications. Kluwer; Dordrecht, The Netherlands: 1989. pp. 423–433.
  • 13. Montgomery HL, Soundararajan K. Primes in short intervals. Commun Math Phys. 2004;252(1-3):589–617.
  • 14. Odlyzko A, Rubinstein M, Wolf M. Jumping champions. Exp Math. 1999;8(2):107–118.
  • 15. Goldston DA. Linnik’s theorem on Goldbach numbers in short intervals. Glasg Math J. 1990;32(3):285–297.
  • 16. Montgomery HL, Soundararajan K. 2002. Beyond pair correlation. Paul Erdős and His Mathematics, I (Budapest, 1999), Bolyai Society Mathematical Studies (János Bolyai Math Soc, Budapest), Vol 11, pp 507–514.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
