Exceptional characters and prime numbers in sparse sets

We develop a lower bound sieve for primes under the (unlikely) assumption of infinitely many exceptional characters. Compared with the illusory sieve due to Friedlander and Iwaniec which produces asymptotic formulas, we show that less arithmetic information is required to prove non-trivial lower bounds. As an application of our method, assuming the existence of infinitely many exceptional characters we show that there are infinitely many primes of the form $a^2+b^8$.


Introduction
Understanding the distribution of prime numbers along polynomial sequences is one of the basic questions in analytic number theory. For sparse polynomial sequences the problem is solved only in a handful of cases. The most notable are the theorem of Friedlander and Iwaniec [1998b] on primes of the form $a^2+b^4$ and the result of Heath-Brown [2001] on primes of the form $a^3+2b^3$, which has been generalized to binary cubic forms by Heath-Brown and Moroz [2002] and to general incomplete norm forms by Maynard [2020]. Also, the result of Friedlander and Iwaniec has been extended by Heath-Brown and Li [2017] to primes of the form $a^2+p^4$ where $p$ is a prime.
Let $\pm D$ be a fundamental discriminant and let $\chi_D(n) = (D/n)$ be the associated primitive real character. We say that $\chi_D$ is exceptional if $L(1,\chi_D)$ is very small, say,
$$L(1,\chi_D) \le \log^{-100} D. \tag{1-1}$$
It is conjectured that (for an exponent such as 100) there are at most finitely many exceptional characters, which is closely related to the conjecture that L-functions do not have zeros close to $s = 1$ (so-called Siegel zeros). However, assuming that there do exist infinitely many exceptional characters, it is possible to prove very strong results on the distribution of prime numbers. For example, Heath-Brown [1983] has shown that the twin prime conjecture follows from such an assumption, and Drappeau and Maynard [2019] have bounded sums of Kloosterman sums along primes. The potential benefit of such results is that for an unconditional proof we are now allowed to assume the nonexistence of exceptional characters, which in turn implies strong regularity in the distribution of primes in arithmetic progressions. Such a bifurcation in the proof has been successfully used to solve problems, for example, in the proof of Linnik's theorem [1944] and in many results in the theory of L-functions.
The state of the art method using exceptional characters is the so-called illusory sieve developed by Friedlander and Iwaniec [2003; 2004; 2005], which is geared towards counting primes in sparse sets. Assuming the existence of infinitely many exceptional characters (with the exponent 100 in (1-1) replaced by 200), Friedlander and Iwaniec [2005] proved that there are infinitely many prime numbers of the form $a^2+b^6$. Their method requires solving the corresponding ternary divisor problem, that is, showing an asymptotic formula for $\tau_3(a^2+b^6)$. This essentially comes down to showing that the sequence has an exponent of distribution $\tfrac23-\varepsilon$. Friedlander and Iwaniec [2006] have solved this problem for $a^2+b^6$ in a form that is narrowly sufficient for the illusory sieve.
Their method fails for sparser polynomial sequences such as $a^2+b^8$, which has an exponent of distribution $\tfrac58-\varepsilon$. The purpose of this article is to develop a lower bound version of the illusory sieve. That is, instead of aiming for an asymptotic formula for primes of the form $a^2+b^8$, we just want to prove a lower bound of the correct order of magnitude for the number of primes. Morally speaking, we are able to show a nontrivial lower bound for primes in sequences with a level of distribution greater than $(1+\sqrt{e})/(1+2\sqrt{e}) = 0.61634\ldots$ (see Theorem 16), so that the sequence $a^2+b^8$ qualifies. We will state the general version of our lower bound sieve at the end of this article (Theorem 16). For now we state the result for primes of the form $a^2+b^8$. For any $n \ge 0$ define
$$\kappa_n := \int_0^1 \sqrt{1-t^n}\,dt.$$
Theorem 1. If there are infinitely many exceptional primitive characters $\chi$, then there are infinitely many prime numbers of the form $a^2+b^8$.
Remark. Note that $\kappa_2 = \pi/4$, so that the coefficient is in fact $\kappa_8/\kappa_2$, and $\tfrac{4}{\pi}\kappa_8 x^{5/8}$ is the expected main term. It turns out that the upper bound result is much easier and for this having an exponent of distribution $\tfrac12$ is sufficient.
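The size $\kappa_8 x^{5/8}$ of the sequence can be checked numerically. The sketch below is only an illustration: it assumes that $\kappa_n$ denotes $\int_0^1 \sqrt{1-t^n}\,dt$ (a normalization chosen here because it reproduces $\kappa_2 = \pi/4$), and the helper names `kappa` and `count_representations` are hypothetical.

```python
import math


def kappa(n: int, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of the integral of sqrt(1 - t**n) over [0, 1].

    ASSUMPTION: this normalization of kappa_n is inferred from kappa_2 = pi/4.
    """
    h = 1.0 / steps
    return h * sum(math.sqrt(1.0 - ((i + 0.5) * h) ** n) for i in range(steps))


def count_representations(x: int, k: int = 8) -> int:
    """Number of pairs (a, b) with a, b >= 1 and a**2 + b**k <= x."""
    total = 0
    b = 1
    while b ** k < x:
        # For this b the admissible a are 1, ..., isqrt(x - b**k).
        total += math.isqrt(x - b ** k)
        b += 1
    return total


if __name__ == "__main__":
    x = 10 ** 16
    ratio = count_representations(x) / (kappa(8) * x ** (5 / 8))
    print(f"kappa_2 = {kappa(2):.6f}, pi/4 = {math.pi / 4:.6f}")
    print(f"count / (kappa_8 * x^(5/8)) = {ratio:.4f}")
```

The ratio stays slightly below 1 because lattice points with $a, b \ge 1$ undercount the area by a boundary term of order $x^{1/2}$.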
1.1. Sketch of the argument. We present here a nonrigorous sketch of the proof of the lower bound in Theorem 1. Let $a_n$ count the representations $n = a^2+b^8$, so that our goal is to estimate $\sum_{n\sim x} a_n\Lambda(n)$.
Let $\chi = \chi_D$. Similarly as in [Friedlander and Iwaniec 2005], we define the Dirichlet convolutions
$$\lambda := 1 * \chi, \qquad \lambda' := \chi * \log.$$
Note that $\lambda(n) \ge 0$ and $\lambda'(n) \ge \Lambda(n) \ge 0$ (by using $\lambda' = \lambda * \Lambda$). The basic idea in arguments using the exceptional characters is as follows. Since $\chi$ is exceptional, we expect that $\chi(p) = \mu(p)$ for most primes (in a range depending on $D$), so that heuristically we have $\chi \approx \mu$ and $\lambda' \approx \Lambda$. Hence, we expect that
$$\sum_{n\sim x} a_n\Lambda(n) \approx \sum_{n\sim x} a_n\lambda'(n). \tag{1-3}$$
Since the modulus of $\chi$ is small, morally $\lambda'(n)$ is of the same complexity as the divisor function $\tau(n)$, so that we have replaced the original sum by a much simpler sum.
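The pointwise inequalities $\lambda(n) \ge 0$ and $\lambda'(n) \ge \Lambda(n)$ can be tested directly. The following sketch assumes the convolutions are $\lambda = 1 * \chi$ and $\lambda' = \chi * \log$, which is the reading consistent with $\lambda' = \lambda * \Lambda$, and it uses the character modulo 4 purely as a stand-in for a real character; that character is not exceptional, and all function names are hypothetical.

```python
import math


def chi4(n: int) -> int:
    """Non-principal real character mod 4 (illustrative stand-in only)."""
    return (0, 1, 0, -1)[n % 4]


def divisors(n: int) -> list[int]:
    return [d for d in range(1, n + 1) if n % d == 0]


def von_mangoldt(n: int) -> float:
    """log p if n is a power of a prime p, and 0 otherwise."""
    if n < 2:
        return 0.0
    p = next(d for d in range(2, n + 1) if n % d == 0)  # smallest prime factor
    m = n
    while m % p == 0:
        m //= p
    return math.log(p) if m == 1 else 0.0


def lam(n: int) -> int:
    """(1 * chi)(n)."""
    return sum(chi4(d) for d in divisors(n))


def lam_prime(n: int) -> float:
    """(chi * log)(n)."""
    return sum(chi4(d) * math.log(n // d) for d in divisors(n))


def lam_conv_mangoldt(n: int) -> float:
    """(lambda * Lambda)(n), which should agree with lam_prime(n)."""
    return sum(lam(d) * von_mangoldt(n // d) for d in divisors(n))


if __name__ == "__main__":
    for n in (3, 5, 9, 15, 25):
        print(n, lam(n), round(lam_prime(n), 6), round(von_mangoldt(n), 6))
```

For a character with $\chi(p) = -1$ at a prime $p$, the value $\lambda'(p) = \log p$ equals $\Lambda(p)$ exactly, which is the pointwise form of the heuristic $\lambda' \approx \Lambda$.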
Making the approximation (1-3) rigorous is the difficult part of the argument, especially for sparse sequences $a_n$. Friedlander and Iwaniec succeeded in this under the assumption that the exponent of distribution is almost $\tfrac23$, which was sufficient to handle primes in the sequence $a^2+b^6$. In our application $a_n$ has the exponent of distribution $\tfrac58-\varepsilon$. This results in an additional error term compared to [Friedlander and Iwaniec 2005], but we are able to show that the contribution from this is smaller than (but of the same order as) the main term.
Let $z = x^{\varepsilon}$ (in the proof we choose a slightly smaller $z$ for technical reasons). Then
Note that by removing the small prime factors we have guaranteed that $m \ge z$ in the second sum, so that we expect $\lambda(m) \approx (1 * \mu)(m) = 0$ for almost all $m$ in $S_2$. Thus, we expect that $S_1$ gives us the main term and that $S_2 = o(S_1)$.

Remark.
The above decomposition has a close resemblance to the recent work of Granville [2021] using the identity
For the main term $S_1$ we can handle the condition $(n,P(z)) = 1$ by the fundamental lemma of the sieve, so we ignore this detail for the moment. Thus, we have to evaluate
We have $m \ge x^{1/2}$ or $n \ge x^{1/2}$, so that we are able to compute $S_1$ provided that our sequence $a_n$ has an exponent of distribution $\tfrac12$. This is because the modulus of $\chi$ is $x^{o(1)}$, so that $\chi$ is essentially of the same complexity as the constant function 1. We find that $S_1$ gives the expected main term, so that we need to bound the error term $S_2$.
Similarly as in the argument in [Friedlander and Iwaniec 2005], the range $x^{2/3}$ plays a special role. With this in mind, we define $\gamma = \tfrac{1}{24}+\varepsilon$ so that $\tfrac23-\gamma = \tfrac58-\varepsilon$ is the exponent of distribution. We split $S_2$ into three parts depending on the size of $k$
By similar arguments as in [Friedlander and Iwaniec 2005], we are able to use the lacunarity of $\lambda(m)$ to bound the terms $S_{21}$ and $S_{23}$ suitably in terms of $L(1,\chi)$, using the fact that the exponent of distribution is $\tfrac23-\gamma$. That is, for $S_{21}$ we write
and for $S_{23}$ we drop $1_{(m,P(z))=1}$ by positivity and write
where $c$ or $d$ is $> x^{1/3+\gamma}$. In all cases we get a variable $> x^{1/3+\gamma}$, so that these can be evaluated as Type I sums. This gives
which is sufficient by the assumption that $\chi$ is an exceptional character.
The novel part in our argument is the treatment of the middle range
Note that also in [Friedlander and Iwaniec 2005] a narrow range near $x^{2/3}$ has to be discarded, but the argument there requires $\gamma = o(1)$. Thanks to the restriction $(m,P(z)) = 1$, it turns out that we are able to handle all parts of $S_{22}$ except when $m$ is a prime number. To see this, if $m$ is not a prime, then $m = m_1m_2$ for some $m_1, m_2 \ge z$, and we essentially get (recall that $\lambda(m) \ge 0$)
since $\lambda$ is multiplicative and the part where $(m_1,m_2) > 1$ gives a negligible contribution. For the part $km_1 > x^{1/2}$ we use $\lambda(m_1) \le \tau(m_1) \ll 2^{1/\varepsilon}$ and combine variables $\ell = km_1$ to get a bound
which can be bounded suitably in terms of $L(1,\chi)$ by a similar argument as with $S_{21}$. The part $km_1 \le x^{1/2}$ is handled similarly, using $\lambda(m_2) \le \tau(m_2) \ll 2^{1/\varepsilon}$ and extracting $L(1,\chi)$ from $\lambda(m_1)$ this time. Thus, the contribution from the composite $m$ is negligible. Hence, it remains to bound
Here we are not able to make use of the lacunarity of $\lambda(p)$. However, since $S_{222}$ counts products of two primes of medium sizes, we immediately see that $S_{222}$ should be smaller than the main term by a factor of $O(\gamma)$, so that at least for small enough $\gamma$ we get a nontrivial lower bound. We apply the linear sieve upper bound to the variable $p$ to make this upper bound rigorous and precise, which leads to the constant 0.189 in Theorem 1.
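Two facts used in the treatment of composite $m$ are elementary and can be verified by brute force: an integer $m \le x$ with $(m,P(z)) = 1$ has at most $\log x/\log z$ prime factors (hence $\tau(m) \le 2^{1/\varepsilon}$ when $z = x^{\varepsilon}$), and a composite such $m$ splits as $m_1m_2$ with $m_1, m_2 \ge z$. The sketch assumes that $P(z)$ denotes the product of the primes below $z$, the usual sieve convention; the function names are hypothetical.

```python
import math
from collections import Counter


def prime_factors(m: int) -> list[int]:
    """Prime factors of m with multiplicity, by trial division."""
    out, p = [], 2
    while p * p <= m:
        while m % p == 0:
            out.append(p)
            m //= p
        p += 1
    if m > 1:
        out.append(m)
    return out


def tau(m: int) -> int:
    """Number of divisors of m."""
    result = 1
    for exponent in Counter(prime_factors(m)).values():
        result *= exponent + 1
    return result


def rough_numbers(x: int, z: int) -> list[int]:
    """Integers 1 < m <= x coprime to P(z), ASSUMING P(z) = product of primes < z."""
    return [m for m in range(2, x + 1) if min(prime_factors(m)) >= z]


def check_rough_structure(x: int, z: int) -> bool:
    """Verify the prime-factor count, the divisor bound and the splitting m = m1 * m2."""
    # Avoid x equal to an exact power of z, where the float quotient may round down.
    max_factors = math.floor(math.log(x) / math.log(z))
    for m in rough_numbers(x, z):
        factors = prime_factors(m)
        if len(factors) > max_factors or tau(m) > 2 ** max_factors:
            return False
        if len(factors) >= 2:
            m1 = factors[0]  # smallest prime factor
            m2 = m // m1
            if m1 < z or m2 < z:
                return False
    return True


if __name__ == "__main__":
    print(check_rough_structure(20_000, 11))
```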
The paper is structured as follows. In Section 2 we carry out the sieve argument and the proof of Theorem 1 assuming a sufficient exponent of distribution for $a_n$ (Propositions 8 and 9). In Section 3 we prove Propositions 8 and 9 by generalizing the arguments in [Friedlander and Iwaniec 2006]. Lastly, in Section 4 we state a general version of the sieve and explain how the method could be improved assuming further arithmetic information.
Remark. Our sieve argument is inspired by Harman's sieve method [2007], although the exact details in this setting turn out to be quite different. The moral of the story is that all sieve arguments should be continuous with respect to the quality of the arithmetic information, which in this case is measured solely by the exponent of distribution. That is, even though we fail to obtain an asymptotic formula after some point (in this case $\tfrac23$), we still expect to be able to produce lower and upper bounds of the correct order of magnitude with slightly less arithmetic information.
1.2. Notations. For functions $f$ and $g$ with $g \ge 0$, we write $f \ll g$ or $f = O(g)$ if there is a constant $C$ such that $|f| \le Cg$. The constant may depend on some parameter, which is indicated in the subscript (e.g., $\ll_{\varepsilon}$). We write $f = o(g)$ if $f/g \to 0$ for large values of the variable. For summation variables we write $n \sim N$ meaning $N < n \le 2N$.
For two functions $f$ and $g$ with $g \ge 0$, it is convenient for us to denote
For a statement $E$ we denote by $1_E$ the characteristic function of that statement. For a set $A$ we use $1_A$ to denote the characteristic function of $A$.

The sieve argument
In this section we state the arithmetic information (Propositions 8 and 9) and, assuming this, give the proof of Theorem 1 using a sieve argument with exceptional characters. We postpone the proof of Propositions 8 and 9 to Section 3. From here on we let $q$ denote the modulus of the exceptional character $\chi = \chi_q$, to avoid conflating it with the level of distribution which we will denote by $D$; this also agrees with the notation in [Friedlander and Iwaniec 2005, Section 14]. Throughout this section $a_n$ denotes the number of representations $n = a^2+b^8$ and $b_n$ the corresponding weighted number of representations $n = a^2+b^2$, both restricted to $(n,q) = 1$.
In $b_n$ we are counting the representations $a^2+b^2$ weighted with the probability that $b$ is a perfect fourth power, so that heuristically we expect $\sum_{n\sim x} a_n\Lambda(n) = (1+o(1))\sum_{n\sim x} b_n\Lambda(n)$. Differing from [Friedlander and Iwaniec 2005], it is convenient for us to write certain parts of the argument as a comparison between $a_n$ and $b_n$. This is inspired by Harman's sieve method [2007], where the idea is to apply the same combinatorial decompositions to the sums over $a_n$ and $b_n$ and then compare, using positivity to drop certain terms entirely.
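The phrase "probability that $b$ is a perfect fourth power" can be made concrete: the weight $\tfrac14 b^{-3/4}$ (the case $k = 4$ of the weight $b^{-1+1/k}/k$ that appears in Section 3) sums to roughly $B^{1/4}$ over $b \le B$, which is the number of fourth powers up to $B$. A minimal numerical check, with hypothetical function names:

```python
def weighted_count(B: int) -> float:
    """Sum of the weights (1/4) * b**(-3/4) over 1 <= b <= B."""
    return sum(0.25 * b ** -0.75 for b in range(1, B + 1))


def fourth_power_count(B: int) -> int:
    """Number of integers r >= 1 with r**4 <= B, using exact integer arithmetic."""
    r = 0
    while (r + 1) ** 4 <= B:
        r += 1
    return r


if __name__ == "__main__":
    for B in (10 ** 4, 10 ** 6):
        print(B, fourth_power_count(B), round(weighted_count(B), 3))
```

The two counts differ by a bounded amount, which is negligible against $B^{1/4}$ for large $B$.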
We let $g(d)$ denote the multiplicative function defined by
where $\varrho(d)$ denotes the number of solutions to $\nu^2+1 \equiv 0 \pmod d$. Note that for all primes $p$ we have $\varrho(p) = 1+\chi_4(p)$. We also define
2.1. Preliminaries. We have collected here some standard estimates that will be needed in the sieve argument.
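The identity $\varrho(p) = 1+\chi_4(p)$ for the number of solutions of $\nu^2+1 \equiv 0 \pmod p$ is easy to confirm by brute force. The sketch below is only a sanity check with hypothetical function names; it also illustrates that $\varrho$ is multiplicative on coprime moduli.

```python
def rho(d: int) -> int:
    """Number of residues v mod d with v**2 + 1 divisible by d."""
    return sum(1 for v in range(d) if (v * v + 1) % d == 0)


def chi4(n: int) -> int:
    """Non-principal character mod 4."""
    return (0, 1, 0, -1)[n % 4]


def is_prime(n: int) -> bool:
    return n >= 2 and all(n % p for p in range(2, int(n ** 0.5) + 1))


if __name__ == "__main__":
    for p in (2, 3, 5, 7, 13):
        print(p, rho(p), 1 + chi4(p))
```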
Proof. The first asymptotic follows from
and Mertens' theorem. To get the second part we apply the prime number theorem for Gaussian primes $a+ib$, splitting the sum into boxes, noting that the contribution from boxes with $z_1 \le x/\log^{10} x$ or $z_2 \le x/\log^{10} x$ is trivially $\ll x^{5/8}/\log x$ (by writing $\Lambda(n) \le \log x$). The prime number theorem in small boxes follows by splitting the boxes into smaller polar boxes and applying [Iwaniec and Kowalski 2004, Theorem 5.36], for instance.
Here the condition $(a^2+b^2,q) = 1$ implicit in $b_n$ cancels the multiplicative factor $G_q$, since by an expansion using the Möbius function
For the last asymptotic note that by the change of variables $t = u^{1/4}$
$$\frac14\int_0^1 u^{-3/4}\sqrt{1-u^2}\,du = \int_0^1 \sqrt{1-t^8}\,dt = \kappa_8.$$
We also require the following basic estimate; see [Friedlander and Iwaniec 1998a, Lemma 1], for instance.
Lemma 3. For every square-free integer $n$ and every $k \ge 2$ there exists some $d$
From this we get the more general version.
Lemma 4. For every integer $n$ and every $k \ge 2$ there exists some $d$
By Lemma 3 for all $j \le k-1$ there is
To bound the final error term we require the linear sieve upper bound for primes; apply [Friedlander and Iwaniec 2010, Theorem 11.12] with $z = D$ and $s = 1$, using $F(1) = 2e^{\gamma}$, where $\gamma$ in this formula denotes Euler's constant.
Lemma 5 (linear sieve upper bound for primes). Let $(c_n)_{n\ge1}$ be a sequence of nonnegative real numbers. For some fixed $X_0$ depending only on the sequence $(c_n)_{n\ge1}$, define $r_d$ for all square-free $d \ge 1$ by
where $g_0(d)$ is a multiplicative function, depending only on the sequence $(c_n)_{n\ge1}$, satisfying $0 \le g_0(p) < 1$ for all primes $p$. Let $D \ge 2$ (the level of distribution). Suppose that there exists a constant $L > 0$ such that for any $2 \le w < D$ we have
The following lemma gives a basic upper bound for smooth numbers; see [Tenenbaum 2015, Chapter III.5, Theorem 1], for instance.
Lemma 6. For any $2 \le z \le y$ we have
where $u := \log y/\log z$.
We also need the following simple divisor sum bound.
Proof. For some $L = L(K)$ we have by a standard sieve bound
by computing the sum over $n_j = \max\{n_1,\dots,n_L\}$ first. □
2.2. Arithmetic information. For the sieve argument we need arithmetic information given by the following two propositions, which state that $a_n$ has an exponent of distribution $\tfrac58-\varepsilon$. We will prove these in Section 3. The first is just a standard sieve axiom on the level of distribution of the sequence $a_n$, and the second is similar but twisted with the quadratic character $\chi$. For the rest of this section we denote
Recall that $X \asymp x^{5/8}$ by Lemma 2.
Proposition 8 (type I information). Let $B > 0$ be a large constant and let $\delta \in [\log^{-B} x, 1]$. Let $\varepsilon > 0$ be small but fixed. Let $D \le x^{5/8-\varepsilon}$ and $N$ be such that $DN \asymp x$. Let $\alpha(d)$ be divisor bounded coefficients and let $g(d)$ be as in (2-1). Then for any $C > 0$
Furthermore, for $D \le x^{2/3+\varepsilon}$ we have the last asymptotic
Remark. In our setup the last asymptotic actually holds up to $D \le x^{1-\varepsilon}$, but we will not need this.
Proposition 9 (type I $\chi$ information). Let $B > 0$ be a large constant and let $\delta \in [\log^{-B} x, 1]$. Let $\exp(\log^{10} q) < x < \exp(\log^{16} q)$. Let $D \le x^{5/8-\varepsilon}$ and $N$ be such that $DN \asymp x$. Let $\alpha(d)$ be divisor bounded coefficients. Then for any $C > 0$
Furthermore, the bounds for the sums with $b_{dn}$ hold up to $D \le x^{2/3+\varepsilon}$.
We will also need the following proposition to bound certain error terms in terms of $L(1,\chi)$. This follows from [Friedlander and Iwaniec 2005, Lemmata 3.7 and 3.9] (as mentioned in [Friedlander and Iwaniec 2005, Section 14], the $g(d)$ defined by (2-1) is easily shown to satisfy the required assumptions).
Remark. For technical reasons we have chosen $z$ a bit smaller than $x^{\varepsilon}$ (compare with Section 1.1). This has the benefit that evaluating $S_1$ is a lot easier. On the downside, bounding $S_2$ is slightly more difficult and we require Lemma 4 for this.
2.4. Sum $S_1$. Let $D_1 := x^{\varepsilon}$ for some small $\varepsilon > 0$. We expand the condition $1_{(n,P(z))=1}$ by using the Möbius function and split the sum to get
To handle the error term $R_1$, note that if $d \mid P(z)$ and $d > D_1$, then $d$ has a divisor in $[D_1, 2zD_1]$. Since $z = x^{1/(\log\log x)^2}$, by Lemma 4 (with $k = 2$ applied to the variable $n/d$ to get $n = cdn'$ with $\tau(n) \le \tau(c)^{O(1)}$), Proposition 8, and Lemma 6 we get
$\ldots^{O(1)}/(cd)$ and apply Lemma 4 to the variable $d$ before using Lemma 6.
For the main term we write a mn χ (m) log n =: S 11 + S 12 .

We write (denoting
We will use Proposition 8 to evaluate this sum, but first we need to remove the cross-condition $d_2n > x^{1/2}$ and the weight $\log d_2n$ by using a finer-than-dyadic decomposition of the sums over $d_2$ and $n$. That is, for
Here we can write
where the error term will contribute by Lemma 4 and Proposition 8
The cross-condition $d_2n > x^{1/2}$ holds trivially and may be dropped except in the diagonal part where
The contribution from this diagonal part is bounded by using Proposition 8 by choosing $C = B$. Hence, the cross-condition $d_2n > x^{1/2}$ may be dropped and we get $S_{11} = S''_{11} + O_B(x\log^{O(1)-B} x)$ with
Applying a similar decomposition to the corresponding sum with $b_{d_1d_2mn}$ and using Proposition 8 we get
Similarly, we get by Proposition 9 (denoting $d_2 = (n,d)$)
That is, in the sums $S_{11}$ and $S_{12}$ we have managed to replace $a_n$ by $b_n$. By reversing the steps to recombine we get
By a similar argument as in (2-3) we can add the part $d > D_1$ back into the sum and we get
by using $\lambda'(n) \ge \Lambda(n)$. Thus, by Lemma 2 we have
so that for the lower bound result it suffices to show that
We now proceed to do this, and at the end of this section we will show how to get the upper bound in Theorem 1.
Remark. We have used Lemma 6 to handle the restriction $(n,P(z)) = 1$ instead of applying the fundamental lemma of the sieve. Thanks to this we were able to use the trivial lower bound $\lambda'(n) \ge \Lambda(n)$ to simplify the evaluation of the main term.
2.5. Sum $S_2$. Recall that $\gamma = \tfrac{1}{24}+\varepsilon$ and $\tfrac23-\gamma = \tfrac58-\varepsilon$. We split the sum $S_2$ into three ranges according to the size of $k$
Using the assumption that $L(1,\chi)$ is small, we will show that the contribution from $S_{21}$ and $S_{23}$ is negligible, and that $S_{22} \le (0.811$
2.5.1. Sum $S_{21}$. Here we have $k > x^{1/3+\gamma}$, so that by a crude estimate we get
where
By Proposition 8 we get
and by Proposition 10 we have
Hence, we have
2.5.2. Sum $S_{23}$. Recall that here $m \gg x^{2/3+2\gamma}$. By positivity we may drop the condition $(m,P(z)) = 1$. Writing
we split the sum $S_{23}$ into two ranges, $d \le x^{1/3+\gamma}$ or $d > x^{1/3+\gamma}$. We get $S_{23} \le S_{231}+S_{232}$, where
By Proposition 9 we get (after applying a finer-than-dyadic decomposition similarly as with $S_{11}$ to remove cross-conditions)
By Propositions 8 and 10 we get (since the contribution from $(k,d) > 1$ is trivially negligible)
Combining the bounds, we have
2.5.3. Sum $S_{22}$. We have
It turns out that we can handle all parts except when $m$ is a prime, so we write
We split this sum into two parts according to $km_1 > x^{1/2}$ or $km_1 \le x^{1/2}$. In either case we get $m_j \ll x^{1/2}$ for some $j \in \{1,2\}$. We combine the variables $\ell = km_{3-j}$ and use $\lambda(m_{3-j}) \le \tau(m_{3-j})$ to obtain by Lemma 4
By Proposition 8 we get (once we choose $K$ large enough so that $\tfrac12+1/K < \tfrac23-\gamma$)
where
since the contribution from the part $(d,m) > 1$ is negligible by a trivial bound. Thus, by Proposition 10 and Lemma 7 we have
Combining the above bounds we get
so all that remains is to bound the sum $S_{222}$. The savings here will come from the fact that $k$ is restricted to a fairly narrow range.
2.6. Bounding the error term $S_{222}$. We have
We will apply the linear sieve upper bound to the nonnegative sequence
with level of distribution $x^{2/3-\gamma}/k$ (note that by exploiting the cancellation from $\chi(n)$ we save a factor of 2 compared to using the trivial bound $\lambda$
Note that the contribution from sums with $(d,k) > 1$ is negligible by trivial estimates. Then by Lemma 5 with $D_k = x^{2/3-\gamma}/k$ we have
where
and
by Propositions 8 and 9. Applying Lemma 2 we get
By the prime number theorem (for $p \equiv 1 \pmod 4$) we have (denoting
We have $D(\tfrac{1}{24}) < 0.811$. Since $\varepsilon > 0$ can be taken to be arbitrarily small, this implies
completing the proof of Theorem 1. □
2.7. Proof of the upper bound result. We now explain how to get the upper bound result in Theorem 1. By Section 2.4 we have by negativity of
where by reversing the initial decomposition on the $b_n$-side (Section 2.3)
which is the same as $S_2$ but with $a_n$ replaced by $b_n$. Now $M_2$ can be bounded similarly as $S_2$, except that we decompose with $\gamma = 0$ to get $M_2 = M_{21}+M_{23}$ with
By similar arguments as above for $S_{21}$, $S_{23}$ we get
since for $b_n$ we have an exponent of distribution $> \tfrac23$ by Propositions 8 and 9. That is, to prove the upper bound we only needed that $a_n$ has an exponent of distribution $\tfrac12+\varepsilon$ instead of $\tfrac58-\varepsilon$.

Type I sums
In this section we will prove Propositions 8 and 9. The arguments are straightforward generalizations of the arguments in [Friedlander and Iwaniec 2006; 2005, Section 14]. Since it does not require much additional effort, we give the arguments in this section for the sequences $a^2+b^{2k}$ for any $k \ge 1$, which yields the exponent of distribution $\tfrac12+\tfrac{1}{2k}-\varepsilon$, as claimed in [Friedlander and Iwaniec 2006, below Theorem 4].
For the arguments in this section it is convenient for us to define $\prec\prec$ to mean an inequality modulo logarithmic factors, that is, for two functions $f$ and $g$ with $g \ge 0$ we write $f(N) \prec\prec g(N)$ if $f(N) \ll g(N)\log^{O(1)} N$. For parameters such as $\varepsilon$ we write $f(N) \prec\prec_{\varepsilon} g(N)$ to mean $f(N) \ll_{\varepsilon} g(N)\log^{O_{\varepsilon}(1)} N$.
Proposition 8 is a consequence of the following proposition, which we will prove in this section.
Proof of Proposition 8 assuming Proposition 11. For the sequence $b_n$, which counts $n = a^2+b^2$ weighted with $b^{-1+1/k}/k$, we will apply similar arguments as below but with $k = 1$, renormalizing the corresponding $\lambda_\ell$ appropriately. For $a_n$, which counts $n = a^2+b^8$, we write $m = a$ and $\ell = b^4$, so that we are applying the above proposition with $k = 4$. Similarly as with the treatment of the sum $S_{11}$, we use a finer-than-dyadic decomposition to remove the cross-condition $m^2+\ell^2 \sim x$; that is, writing $\delta = \log^{-B} x$ for some large $B$, we partition the sum into $\ll \delta^{-2}\log^2 x$ parts where $\ell \in [L_0, L_0(1+\delta)]$
In fact, we need to refine this decomposition so that for $m$ we use a $C^\infty$-smooth finer-than-dyadic partition of unity. Then the resulting coefficients for $m$ are $C^\infty$-smooth functions of the form $\psi_M(m-M_0)$, where $M = \delta M_0$ is the width of the window around $M_0 \ll \sqrt{x}$. We can now drop the condition $\ell^2+m^2 \sim x$, with an error contribution bounded by $x^{5/8}\log^{-B+O(1)} x$ coming from the edges. To see this, note that we have
by Proposition 11 using $M_0$, and that the number of edge cases is $\ll \log^{B+O(1)} x$, so that we save a factor of $\log^{O(1)-B/k} x$, which is sufficient for $B \gg k$.
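The sharp (non-smoothed) version of the finer-than-dyadic decomposition is simple to sketch: a range is cut into consecutive windows whose endpoints differ by a factor $1+\delta$, so that about $\delta^{-1}\log x$ windows cover $[1,x]$ in one variable and $\ll \delta^{-2}\log^2 x$ boxes arise for two variables. The code below is an illustration only; the name `delta` for the window parameter and the function name are assumptions, and the smooth partition of unity used for $m$ in the proof is not modelled.

```python
import math


def finer_than_dyadic(lo: float, hi: float, delta: float) -> list[tuple[float, float]]:
    """Cut [lo, hi) into consecutive windows [L0, (1 + delta) * L0).

    The last window is truncated at hi. Requires lo > 0 and delta > 0.
    """
    if lo <= 0 or delta <= 0:
        raise ValueError("lo and delta must be positive")
    windows = []
    left = lo
    while left < hi:
        right = min(left * (1 + delta), hi)
        windows.append((left, right))
        left = right
    return windows


if __name__ == "__main__":
    x = 1e6
    delta = 1 / math.log(x)  # corresponds to B = 1 in delta = log^{-B} x
    parts = finer_than_dyadic(1.0, x, delta)
    print(len(parts), "windows; delta^{-1} log x =", round(math.log(x) / delta, 1))
```

Since $\log(1+\delta) \ge \delta\log 2$ for $0 < \delta \le 1$, the number of windows is at most $\log(\mathrm{hi}/\mathrm{lo})/(\delta\log 2)$ up to rounding.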
We can now apply Proposition 11 in each of the parts separately. Note that then we have $L, M \ll x^{1/2}$ and $D \ll x^{5/8-\varepsilon}$, so that the error term is bounded by $x^{5/8-\varepsilon/4}$. To remove the condition $(\ell^2+m^2,q) = 1$ implicit in Proposition 8 we may expand using the Möbius function to get
since $(d,q) = 1$, and apply Proposition 11 with level $x^{5/8-\varepsilon}q \ll x^{5/8-\varepsilon/2}$. Denote $\lambda^{(1)}_\ell = 1_{\ell = n^k}$ and $\lambda^{(2)}$
We still have to evaluate the main term in Proposition 11 to get
Recombining the finer-than-dyadic decomposition to a dyadic one for the variable $\ell$, this follows once we show that for $j \in \{1,2\}$
which follows easily once we show that
Then, since $M \prec\prec x^{1/2}$, the claim (3-1) follows once we show
To show this, note also that
Then for $\lambda_\ell = 1_{\ell = n^k}$ (and similarly for $\lambda$
by writing $\ell = (nce)^k$ since $ce$ is square-free. □
Proposition 9 follows by a similar argument from the following (recall that $a_n$ and $b_n$ are supported on $(n,q) = 1$).
From $h = 0$ we get a total contribution
by using the bound (Lemma 15) and the fact that $q_\ell = q_0^k \ll q^{\eta k}$ for some small $\eta$. For $h \ne 0$ we can by symmetry restrict to $h < 0$. We first want to remove the cross-condition $\chi(\beta^2d^2+q_\ell^2)$ between the variables $d$ and $\ell$. To do this we fix the value of $q_\ell$ modulo $q$ and split $\ell$ into congruence classes $q_\ell \equiv \gamma \pmod q$. Hence, we get for some $|c_{h,\ell}(t,q,\beta,\gamma)| \le 1$ and $|c_{h,\ell}(t,q)| \le 1$ that the total contribution from $h \ne 0$ is
Remark. With much more effort it is possible to get the same result as above with $L(1,\chi)\log x$ in place of $L(1,\chi)\log^5 x$, so that one only needs $L(1,\chi_D) = o(1/\log D)$.
Remark. Unfortunately the above theorem just misses the next case $a^2+b^{10}$, which has an exponent of distribution $\tfrac35-\varepsilon$. Similarly as with the linear sieve, further improvements are possible if we make use of well-factorability of the weights [Friedlander and Iwaniec 2010, Chapter 12.7]. For example, the upper bound for the sum $S_{222}$ can be improved if we are able to handle certain Type I/II sums (that is, Type I sums where the modulus is $kd$ with $d$ well-factorable). Note also that in $S_{21}$ and $S_{23}$ the weight factorizes and furthermore there is some smoothness available in the weight. Hence, assuming suitable arithmetic information (of Type I/II or Type I$_2$) we could handle some parts near the edges of $S_{22}$ by a similar argument as for the sums $S_{21}$ or $S_{23}$. Unfortunately we do not know how to carry this out for the sequence $a^2+b^{10}$, but possibly methods based on sums of Kloosterman sums might be able to handle these sums. It is also unclear if the handling of the sum $S_{222}$ is optimal, but we have not found a way to improve this.
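The arithmetic behind "just misses" can be laid out explicitly. The sketch compares the limiting exponents $\tfrac12+\tfrac{1}{2k}$ for $a^2+b^{2k}$ (with the $\varepsilon$ dropped) against the threshold $(1+\sqrt{e})/(1+2\sqrt{e})$ quoted in the introduction; the function names are hypothetical, and the comparison is only the heuristic "morally speaking" criterion, not a substitute for Theorem 16.

```python
import math
from fractions import Fraction

# Threshold for the level of distribution quoted in the introduction.
THRESHOLD = (1 + math.sqrt(math.e)) / (1 + 2 * math.sqrt(math.e))


def exponent_of_distribution(k: int) -> Fraction:
    """Limiting exponent 1/2 + 1/(2k) for the sequence a^2 + b^(2k), epsilon dropped."""
    return Fraction(1, 2) + Fraction(1, 2 * k)


def qualifies(k: int) -> bool:
    """Heuristic check: does the limiting exponent exceed the threshold?"""
    return exponent_of_distribution(k) > THRESHOLD


if __name__ == "__main__":
    print(f"threshold = {THRESHOLD:.5f}")
    for k in (3, 4, 5):
        e = exponent_of_distribution(k)
        print(f"a^2 + b^{2 * k}: exponent {e} = {float(e):.5f}, qualifies: {qualifies(k)}")
```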
Remark. The ideas in this paper can also be applied to the problem of primes in short intervals, to improve the result of Friedlander and Iwaniec [2004] which gives primes in intervals of length $x^{39/79} < x^{1/2}$ under the assumption of exceptional characters. The sieve argument is slightly different here, since for this problem we can also utilize the available Type I/II and Type I$_2$ information furnished by the exponential sum estimates used for the problem of the largest prime factor in short intervals [Baker and Harman 2009; Fouvry and Iwaniec 1989; Liu and Wu 1999]. The details will appear elsewhere.