Directional square functions

Quantitative formulations of Fefferman's counterexample for the ball multiplier are naturally linked to square function estimates for conical and directional multipliers. In this article we develop a novel framework for these square function estimates, based on a directional embedding theorem for Carleson sequences and multi-parameter time-frequency analysis techniques. As applications we prove sharp or quantified bounds for Rubio de Francia type square functions of conical multipliers and of multipliers adapted to rectangles pointing along $N$ directions. A suitable combination of these estimates yields a new and currently best-known logarithmic bound for the Fourier restriction to an $N$-gon, improving on previous results of A. Córdoba. Our directional Carleson embedding extends to the weighted setting, yielding previously unknown weighted estimates for directional maximal functions and singular integrals.


1. Motivation and main results
The celebrated theorem of Charles Fefferman from [16] shows that the ball multiplier is an unbounded operator on $L^p(\mathbb{R}^n)$ for all $p \neq 2$ whenever $n \geq 2$. A well-known argument originally due to Yves Meyer, [11], exhibits the intimate relationship of the ball multiplier with vector-valued estimates for directional singular integrals along all possible directions. Fefferman proves in [16] the impossibility of such estimates by testing these vector-valued inequalities on a Kakeya set.
Besicovitch or Kakeya sets are compact sets in Euclidean space that contain a line segment of unit length in every direction. Sets of this type with zero Lebesgue measure do exist. However, in two dimensions, Kakeya sets are necessarily of full Hausdorff dimension. The question of the Hausdorff dimension of Kakeya sets can then be formulated as a question of quantitative boundedness of the Kakeya maximal function, which is a maximal directional average along rectangles of fixed eccentricity pointing along arbitrary directions.
The importance of the ball multiplier for the summation of higher dimensional Fourier series, as well as its intimate connection to Kakeya sets, have motivated a host of problems in harmonic analysis which have been driving relevant research since the 1970s. Finitary or smooth models of the ball multiplier such as the polygon multiplier and the Bochner-Riesz means quantify the failure of boundedness of the ball multiplier and formalize the close relation of these operators with directional maximal and singular averages.
This paper is dedicated to the study of a variety of operators in the plane that are all connected in one way or another with the ball multiplier. Our point of view is through the analysis of directional operators mapping into $L^p(\mathbb{R}^2; \ell^q)$-spaces where the inner $\ell^q$-norm is taken with respect to the set of directions. Different values of $q$ are relevant in our analysis but the cases $q = 2$ and $q = \infty$ are of particular interest. On one hand, the case $q = \infty$ arises when considering maximal directional averages and the corresponding differentiation theory along directions; see [2,7,15,21] for classical and recent work on the subject. On the other hand, the case $q = 2$ is especially relevant for Meyer's argument that bounds the norm of a vector-valued directional Hilbert transform by the norm of the ball multiplier. It also arises when dealing with square functions associated to conical or directional Fourier multipliers of the type $f \mapsto \{T_j f : j = 1, \ldots, N\}$ where each $T_j$ is adapted to a different coordinate pair and the $T_j$ have disjoint or well-separated Fourier support. These estimates are directional analogues of the celebrated square function estimate for Fourier restriction to families of disjoint cubes, due to Rubio de Francia [31], and they appear naturally when seeking quantitative estimates on the $N$-gon Fourier multiplier.
While such square function estimates have been considered previously in the literature, and usually approached directly via weighted norm inequalities, our treatment is novel and leads to improved and in certain cases sharp estimates in terms of the cardinality of the set of directions. It rests on a new directional Carleson measure condition and corresponding embedding theorem, which is subsequently applied to intrinsic directional square functions of time-frequency nature. The link between the abstract Carleson embedding theorem and the applications is provided by directional, one and two-parameter time-frequency analysis models. The latter allow us to reduce estimates for directional operators to those of the corresponding intrinsic square functions involving directional wave packet coefficients. We note that in the fixed coordinate system case, related square functions have appeared in Lacey's work [25], while a single-scale directional square function similar to those of Section 4 is present in [14] by Guo, Thiele, Zorin-Kranich and the second author.
Having clarified the context of our investigation, we turn to the detailed description of our main results and techniques.
A new approach to directional square functions. While we address several types of square functions associated to directional multipliers, our analysis of each relies on a common first step. This is an $L^4$-square function inequality for abstract Carleson measures associated with one and two-parameter collections of rectangles in $\mathbb{R}^2$, pointing along a finite set of $N$ directions; this setup is presented in Section 2 and the central result is Theorem C. Section 2 builds upon the proof technique first introduced by Katz [21] and revisited by Bateman [2] in the study of sharp weak $L^2$-bounds for maximal directional operators. Our main novel contributions are the formulation of an abstract directional Carleson condition which is flexible enough to be applied in the context of time-frequency square functions, and the realization that square functions in $L^4$ can be treated in a $TT^*$-like fashion. The advancements over [2,21] also include the possibility of handling two-parameter collections of rectangles.
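The $TT^*$-like treatment of $L^4$ square functions rests on an elementary identity, sketched here for a generic finite family $\{f_j\}$ of functions on $\mathbb{R}^2$. The symbols $f_j$ are placeholders chosen for illustration and are not the wave packet coefficients used later. The fourth power of the $L^4$ norm of a square function equals the squared $L^2$ norm of a sum, which expands into pairwise interactions:

```latex
\[
\Big\| \Big( \sum_{j} |f_j|^2 \Big)^{1/2} \Big\|_{L^4(\mathbb{R}^2)}^4
= \Big\| \sum_{j} |f_j|^2 \Big\|_{L^2(\mathbb{R}^2)}^2
= \sum_{j,k} \int_{\mathbb{R}^2} |f_j(x)|^2 \, |f_k(x)|^2 \, dx .
\]
```

In this form the estimate reduces to controlling the interaction between pairs of indices, which is the kind of quantity a Carleson-type condition is designed to handle.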
In Section 4, we verify that the Carleson condition, which is a necessary assumption in the directional embedding of Theorem C, is satisfied by the intrinsic directional wave packet coefficients associated with certain time-frequency tile configurations, and Theorem C may thus be applied to obtain sharp estimates for discrete time-frequency models of, for instance, directional Rubio de Francia square functions. Establishing the Carleson condition requires a precise control of spatial tails of the wave packets: this control is obtained by a careful use of Journé's product theory lemma.
The estimates obtained for the time-frequency model square functions are then applied to three main families of operators described below. All of them are defined in terms of an underlying set of $N$ directions. As in Fefferman's counterexample for the ball multiplier, the Kakeya set is the main obstruction for obtaining uniform estimates. Depending on the type of operator, the usable estimates will be restricted to the range $2 < p < 4$ for square function estimates or to the range $4/3 < p < 4$ for the self-adjoint case of the polygon multiplier. The fact that the estimates should be logarithmic in $N$ in the $L^p$-ranges above is dictated by the Besicovitch construction of the Kakeya set. It is easy to see that for $p$ outside this range the only available estimates are essentially trivial polynomial estimates. Further obstructions deter any estimates for Rubio de Francia type square functions in the range $p < 2$ already in the one-directional case.
Sharp Rubio de Francia square function estimates in the directional setting. Section 5 concerns quantitative estimates of Rubio de Francia type for the square function associated with $N$ finitely overlapping cone multipliers, of both rough and smooth type. Beginning with the seminal article of Nagel, Stein and Wainger [27], square functions of this type are crucial in the theory of maximal operators, in particular along lacunary directions, see for instance [28,32]. In the case of $N$ uniformly spaced cones, logarithmic estimates with unspecified dependence were proved by A. Córdoba in [10] using weighted theory.
In order to make the discussion above more precise, and to give a flavor of the results of this paper, we introduce some basic notation. Let $\tau \subset (0, 2\pi)$ be an interval and consider the corresponding smooth restriction to the frequency cone subtended by $\tau$, defined in terms of a smooth indicator $\beta_\tau$ on $\tau$: that is, $\beta_\tau$ is supported in $\tau$ and is identically one on the middle half of $\tau$.
One of the main results of this paper is a quantitative estimate for a square function associated with the smooth conical multipliers of a finite collection of intervals with bounded overlap. In the statement of the theorem below $\ell^2_{\boldsymbol{\tau}}$ denotes the $\ell^2$-norm on the finite set of directions $\boldsymbol{\tau}$.
The dependence on $\#\boldsymbol{\tau}$ in the estimates above is best possible.
The sharp estimate of Theorem A above can be suitably bootstrapped in order to provide an estimate for rough conical frequency projections; the precise statement can be found in Theorem J of Section 5. The sharpness of the estimates in Theorem A above is discussed in §8.6.
A similar square function estimate associated with disjoint rectangular directional frequency projections is presented in Section 6. This is a square function that is very close in spirit to the one originally considered by Rubio de Francia in [31], and especially to the two-parameter version of Journé from [20] and revisited by Lacey in [25]. The novel element is the directional aspect, which comes from the fact that the frequency rectangles are allowed to point along a set of $N$ different directions. Our method of proof can deal equally well with one-parameter rectangular projections or collections of arbitrary eccentricities. As before we prove a sharp estimate, in terms of the number of directions, for the smooth square function associated with rectangular frequency projections along $N$ directions; this is the content of Theorem K. The main term in the upper bound of Theorem K matches the logarithmic lower bound associated with the Kakeya set.
The polygon multiplier. The square function estimates discussed above may be combined with suitable vector-valued estimates in the directional setting in order to obtain a quantitative estimate for the operator norm of the $N$-gon multiplier, namely the Fourier restriction to a regular $N$-gon $\mathcal{P}_N$,

(1.1) $T_{\mathcal{P}_N} f := \big(\mathbf{1}_{\mathcal{P}_N} \widehat{f}\,\big)^{\vee}.$

In Section 7 we give the details and proof of the following quantitative estimate for the polygon multiplier.
Theorem B. Let $\mathcal{P}_N$ be a regular $N$-gon in $\mathbb{R}^2$ and $T_{\mathcal{P}_N}$ be the corresponding Fourier restriction operator defined in (1.1). We have the estimate

We limit ourselves to treating the regular $N$-gon case; however, it will be clear from the proof that this restriction may be significantly weakened by requiring instead a well-distribution type assumption on the arcs defining the polygon, similar to the one that is implicit in Theorem A.
Precise $L^p$-bounds for the $N$-gon multiplier as a function of $N$ quantify Fefferman's counterexample and so the failure of boundedness of the ball multiplier when $p \neq 2$. A logarithmic type estimate for $T_{\mathcal{P}_N}$ was first obtained by A. Córdoba in [8]. While the exact dependence in [8] is not explicitly tracked, the upper bound on the operator norm obtained in [8] must be necessarily larger than $O\big((\log N)^{5/4}\big)$ for $p$ close to the endpoints of the relevant interval: see Remark 7.12 and §8.4 for details. While the dependence obtained in Theorem B is a significant improvement over previous results, it does not match the currently best known lower bound, which is the same as that for the Meyer lemma constant in Lemma 7.21 and §8.1.
Remark. Let $\delta > 0$ and $S_j$ be a smooth frequency restriction to one of the $O(\delta^{-1})$ tangential $\delta \times \delta^2$ boxes covering the $\delta^2$ neighborhood of $\mathbb{S}^1$. Unlike the sharp forward square function estimate we prove in this article, the reverse square function estimate

(1.2) $\|f\|_{p} \leq C_{p,\delta} \, \big\| \{S_j f\} \big\|_{L^p(\mathbb{R}^2; \ell^2_j)}$

holds with $C_{4,\delta} = O(1)$ at the endpoint $p = 4$. For the proof of this $L^4$-decoupling estimate see [8,17]. An extension to the range $2 < p < 4$ is at the moment only possible via vector-valued methods, which introduce the loss $C_{p,\delta} = O(|\log \delta|^{1/2 - 1/p})$. In fact (1.2) with the loss $C_{p,\delta}$ claimed above follows easily from Lemma 7.18; the details are contained in Remark 7.22.
Reverse square function inequalities of the type (1.2) have been popularized by Wolff in his proof of local smoothing estimates in the large $p$ regime; see also the related works [18,22,23,30]. We refer to Carbery's note [6] for a proof that the $p = 2n/(n-1)$ case of the $\mathbb{S}^{n-1}$ reverse square function estimate implies the corresponding $L^n(\mathbb{R}^n)$ Kakeya maximal inequality, as well as the Bochner-Riesz conjecture. In [6], the author also asks whether a $\delta$-free estimate holds in the range $2 < p < 2n/(n-1)$. At the moment this is not known in any dimension.
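As a quick consistency check (a worked instance for orientation, not a statement taken from [6]), the critical exponent $2n/(n-1)$ specializes in the plane to the endpoint $p = 4$ of the remark above:

```latex
\[
p_n := \frac{2n}{n-1},
\qquad
p_2 = \frac{2 \cdot 2}{2 - 1} = 4,
\qquad
p_3 = \frac{2 \cdot 3}{3 - 1} = 3 .
\]
```

In particular, for $n = 2$ the open range $2 < p < p_2$ in Carbery's question is the interval $2 < p < 4$ on which the vector-valued argument of the remark incurs a logarithmic loss.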
On a different but related note, weakening (1.2) by replacing the right hand side with the larger square function of the norms $\|S_j f\|_p$ yields a sample (weak) decoupling inequality: a full range of sharp decoupling inequalities for hypersurfaces with curvature have been established starting from the recent, seminal paper by Bourgain and Demeter [4]. In the case of $\mathbb{S}^1$, the weak decoupling inequality holds in the wider range $2 \leq p \leq 6$, with $C_\varepsilon \delta^{-\varepsilon}$ type bounds outside of $[2,4]$: our methods do not seem to provide insights on the quantitative character of weak decoupling in this wider range.
Weighted estimates for the maximal directional function. The simplest example of application of the directional Carleson embedding theorem is the adjoint of the directional maximal function; this was already noticed by Bateman [2], re-elaborating on the approach of Katz [21]. By duality, the $L^2$-directional Carleson embedding theorem of Section 2 yields the sharp bound $\|M_V\|_{L^2 \to L^{2,\infty}} \lesssim \sqrt{\log N}$ for the weak $(2,2)$ norm of the Hardy-Littlewood maximal function $M_V$ along $N$ arbitrary directions; this result first appeared in the quoted article [21] by Katz.
Theorem C may be extended to the directional weighted setting. We describe this extension in Section 3, see Theorem D, and derive several novel weighted estimates for directional maximal and singular integrals as an application.
More specifically, our weighted Carleson embedding Theorem D yields a Fefferman-Stein type inequality for the operator $M_V$ with sharp dependence on the number of directions; this result is the content of Theorem E. Specializing to $A_1$-weights in the directional setting yields the first sharp weighted result for the maximal function along arbitrary directions. Furthermore, Theorem F contains an $L^{2,\infty}(w)$-estimate for the maximal directional singular integrals along $N$ directions, for suitable directional weights $w$, with a quantified logarithmic dependence in $N$. This is a weighted counterpart of the results of [12,13].
Acknowledgments. The authors are grateful to Ciprian Demeter and Jongchon Kim for fruitful discussions on reverse square function estimates, and for providing additional references on the subject.

2. An $L^2$ inequality for directional Carleson sequences
In this section we prove an abstract $L^2$-inequality for certain Carleson sequences adapted to sets of directions: the main result is Theorem C below. The Carleson sequences we will consider are indexed by parallelograms with long side pointing in a given set of directions in $\mathbb{R}^2$, and possessing certain natural properties. The definitions below are motivated by the applications we have in mind, all of them lying in the realm of directional singular and averaging operators.
2.1. Parallelograms and sheared grids. Fix a coordinate system and the associated horizontal and vertical projections of subsets of $\mathbb{R}^2$, the horizontal one being denoted by $\pi_1$. We will conflate the descriptions of directions in terms of slopes in $S$ and in terms of vectors in $V$ with no particular mention. In order to describe the setup for our general result we introduce a collection of directional dyadic grids of parallelograms. In order to define these grids we consider the two-parameter product dyadic grid obtained by taking the cartesian product of the standard dyadic grid $\mathcal{D}(\mathbb{R})$ with itself; we note that we only consider the rectangles in $\mathcal{D} \times \mathcal{D}$ whose horizontal side is longer than their vertical one. Define the sheared grids We will also use the notation Note that $\mathcal{D}^2_s$ is a special subcollection of $\mathcal{P}^2_s$. In particular, $R \in \mathcal{D}^2_s$ is a parallelogram oriented along $v = (1, s)$ with vertical sides parallel to the $y$-axis and such that $\pi_1(R)$ is a standard dyadic interval. Furthermore our assumptions on $S$ and the definition of $\mathcal{D}^2_0$ imply that the parallelograms in $\mathcal{D}^2_S$ have long side with slope $|s| \leq 1$ and a vertical short side. With a slight abuse of language we will continue referring to the rectangles in $\mathcal{D}^2_S$ as dyadic.
Several results in this paper will involve collections of parallelograms $\mathcal{R} \subset \mathcal{D}^2_S$. In general for any collection $\mathcal{R}$ of parallelograms we will use the notation for the shadow of the collection. Finally, for any collection of parallelograms $\mathcal{R}$ we define the corresponding maximal operator We will also use the following notations for directional maximal functions: If $V \subset \mathbb{R}^2$ is a compact set of directions with $0 \notin V$, we write In the definitions above and throughout the paper we use the notation whenever $f$ is a locally integrable function in $\mathbb{R}^2$ and $E \subset \mathbb{R}^2$ has finite measure.

2.5. An embedding theorem for directional Carleson sequences. In this section we will be dealing with Carleson-type sequences $a = \{a_R\}_{R \in \mathcal{D}^2_S}$, indexed by dyadic parallelograms. In order to define them precisely we need a preliminary notion.
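Definition 2.6. Let $\mathcal{L} \subset \mathcal{P}^2_S$ be a collection of parallelograms and let $s \in S$. We will say that $\mathcal{L}$ is subordinate to a collection $\mathcal{T} \subset \mathcal{P}^2_s$ We write $T(a)$ for $T_{\mathcal{R}}(a)$ when $\mathcal{R} = \mathcal{D}^2_S$. For $1 \leq p \leq 2$ we then define the balayage norms $\mathsf{mass}_{a,p}(\mathcal{R}) := \|T_{\mathcal{R}}(a)\|_{L^p}$. Note that $\mathsf{mass}_{a,1}(\mathcal{R}) = \sum_{R \in \mathcal{R}} a_R \leq \mathsf{mass}_a$.

Remark 2.9 (Elementary properties of mass). Let $\mathcal{R} \subset \mathcal{D}^2_\tau$ for some fixed $\tau \in S$. Then $\mathcal{R}$ is subordinate to itself and if $a$ is an $L^\infty$-normalized Carleson sequence we have for some fixed $\tau \in S$.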
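Also, the very definition of mass and the log-convexity of the $L^p$-norm imply

(2.10) $\mathsf{mass}_{a,p}(\mathcal{R}) \leq \mathsf{mass}_{a,1}(\mathcal{R})^{1 - \frac{2}{p'}} \, \mathsf{mass}_{a,2}(\mathcal{R})^{\frac{2}{p'}}$

for all $1 \leq p \leq 2$, with $p'$ its dual exponent.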
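The log-convexity step can be spelled out as follows. This is a sketch for a generic function $g$ on $\mathbb{R}^2$, where $g$ stands in for the balayage whose $L^p$ norms define the mass quantities; the exponent $\theta$ is notation introduced only for this computation.

```latex
\[
\frac{1}{p} = \frac{1-\theta}{1} + \frac{\theta}{2}
\quad\Longleftrightarrow\quad
\theta = 2\Big(1 - \frac{1}{p}\Big) = \frac{2}{p'},
\qquad 1 \le p \le 2 ,
\]
\[
\int |g|^p
= \int |g|^{p(1-\theta)} \, |g|^{p\theta}
\le \Big( \int |g| \Big)^{p(1-\theta)} \Big( \int |g|^2 \Big)^{\frac{p\theta}{2}},
\]
\[
\text{so that}\qquad
\|g\|_{L^p} \le \|g\|_{L^1}^{\,1 - \frac{2}{p'}} \; \|g\|_{L^2}^{\,\frac{2}{p'}} .
\]
```

The middle line is Hölder's inequality with the conjugate exponents $\frac{1}{p(1-\theta)}$ and $\frac{2}{p\theta}$, whose reciprocals sum to $p(1-\theta) + \frac{p\theta}{2} = 1$ by the choice of $\theta$.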
We are now ready to state the main result of this section. The result below should be interpreted as a reverse Hölder-type bound for the balayages of directional Carleson sequences.
The proof of Theorem C occupies the next subsection. The argument relies on several lemmata, whose proof is postponed to the final Subsection 2.23.
Remark 2.11. There are essentially two cases in the assumption of Theorem C above. If for each $s \in S$ the family $\mathcal{R}_s$ happens to be a one-parameter family, then the corresponding maximal operator $M_{\mathcal{R}_s}$ is of weak-type $(1, 1)$, whence the assumption holds with $\gamma = 0$. In the generic case that $\mathcal{R} = \mathcal{D}^2_S$, for each $s$ the operator $M_{\mathcal{R}_s} = M_{\mathcal{D}^2_s}$ is a skewed copy of the strong maximal function and the assumption holds with $\gamma = 1$.
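The value $\gamma = 1$ for the strong maximal function can be traced as follows. This sketch assumes that the hypothesis of Theorem C is a bound of the form $\sup_{s} \|M_{\mathcal{R}_s}\|_{L^p \to L^{p,\infty}} \lesssim (p')^{\gamma}$ as $p \to 1^+$, a reading inferred from how the assumption is used later in the proof. The symbols $M_1$, $M_2$ denote the one-dimensional Hardy-Littlewood maximal functions in the two coordinate directions, and only the unsheared case $s = 0$ is written out, since a shear preserves Lebesgue measure.

```latex
\[
\frac{1}{|I \times J|} \int_{I \times J} |f|
\;\le\; \frac{1}{|I|} \int_{I} M_2 f(x', y) \, dx'
\;\le\; M_1 (M_2 f)(x, y),
\qquad (x,y) \in I \times J ,
\]
\[
\big\| M_{\mathcal{D}^2_0} f \big\|_{L^{p,\infty}}
\le \big\| M_1 (M_2 f) \big\|_{L^{p,\infty}}
\lesssim \| M_2 f \|_{L^p}
\lesssim p' \, \| f \|_{L^p},
\qquad p \to 1^+ .
\]
```

The first bound in the second line uses that the weak-type $(p,p)$ norm of $M_1$ stays bounded as $p \to 1^+$, a consequence of its weak-type $(1,1)$ and $L^\infty$ bounds; the last step is the standard $O(p')$ growth of the strong $L^p$ norm of $M_2$. For a one-parameter family only the first kind of bound is needed, which is consistent with $\gamma = 0$.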
2.12. Main line of proof of Theorem C. Throughout the proof, we use the following partial order between parallelograms $Q, R \in \mathcal{D}^2_S$: (2.13) Notice that, since $Q, R \in \mathcal{D}^2_S$, we have that $\pi_1(Q), \pi_1(R)$ belong to the standard dyadic grid $\mathcal{D}$ on $\mathbb{R}$.
It is convenient to encode the main inequality of Theorem C by means of the following dimensionless quantity associated with a collection $\mathcal{R} \subset \mathcal{D}^2_S$ and a Carleson sequence $a$, where the supremum is taken over all finite subcollections $\mathcal{L} \subset \mathcal{R}$ and all $L^\infty$-normalized Carleson sequences $a = \{a_R\}_{R \in \mathcal{D}^2_S}$. There is an easy, albeit lossy, a priori estimate for $\mathsf{U}_p(\mathcal{R})$ for general $\mathcal{R} \subset \mathcal{D}^2_S$.

Lemma 2.14. Let $S \subset [-1, 1]$ be a finite set of $N$ slopes and $a = \{a_R\}_{R \in \mathcal{R}}$ be a normalized Carleson sequence as above. For every $\mathcal{R} \subset \mathcal{D}^2_S$ we have the estimate

Theorem C is then an easy consequence of the following bootstrap-type estimate. For an arbitrary finite collection of parallelograms $\mathcal{R} \subset \mathcal{D}^2_S$ The remainder of the section is dedicated to the proof of (2.15). We begin by expanding the square of the $L^2$-norm of $T_{\mathcal{R}}(a)$ as follows: For any $\mathcal{L} \subset \mathcal{R}$ and $R \in \mathcal{R}$ we have implicitly defined (2.17)

Remark 2.18. Observe that for any $\mathcal{L} \subset \mathcal{R}$ and every fixed $s \in S$ we have (2.20) Here $\lambda > 0$ is the constant used to define the collections $\mathcal{R}_{s,k}$, and in the last lines we used the definition of a Carleson sequence and Remark 2.9.
The following lemma encodes the exponential decay relation between mass and the quantities defined in (2.17), and is in fact the main step of the proof of Theorem C.  with $\mathcal{L} \subseteq \mathcal{R}$. We assume that for some $p \in [1, 2)$ ) for a sufficiently large numerical constant $C > 1$ then there exists $\mathcal{L}_1 \subset \mathcal{L}$ such that: (2.17), we have that

The final lemma we make use of in the argument translates the exponential decay of the mass of each $\mathcal{R}_{s,k}$ into exponential decay of the support size, which is what we need in the estimate (2.20).
We assume that the operators $\{M_{\mathcal{R}_s} : s \in S\}$ map $L^p(\mathbb{R}^2)$ to $L^{p,\infty}(\mathbb{R}^2)$ uniformly with constant $C_p$. For $k \geq 1$ we then have the estimate with absolute implicit constant.
With these lemmata in hand we now return to the proof of (2.15). Substituting the estimate of Lemma 2.22 into (2.20) yields This was proved for an arbitrary collection $\mathcal{R}$ and so also for every $\mathcal{L} \subset \mathcal{R}$. Thus the estimate above and our assumption $C_p \lesssim (p')^{\gamma}$ imply
Now observe that we can assume that $\mathsf{U}_2(\mathcal{R}) \gtrsim 1$, otherwise there is nothing to prove. In this case we can take 1  1 and leads to This is the desired estimate (2.15) and so the proof of Theorem C is complete.

2.23. Proof of the lemmata.
Proof of Lemma 2.14. We follow the proof of [25, Lemma 3.11]. Take $\mathcal{R}$ to be some finite collection and   = 1 such that This means We have proved that for an arbitrary collection $\mathcal{R}$ we have Assuming this for a moment and using Remark 2.9 we can estimate This proves the proposition upon choosing the constant in terms of $\sup_{s \in S} \|M_{\mathcal{R}_s} : L^p \to L^{p,\infty}\|$.
We have to prove the claim. Note that since $\mathcal{R}_s$ is a collection in a fixed direction, the inequality $\mathsf{U}_p(\mathcal{R}_s) \lesssim \sup_{s \in S} \|M_{\mathcal{R}_s} : L^p \to L^{p,\infty}\|$ follows by the John-Nirenberg inequality in the product setting and Remark 2.9; see [25, Lemma 3.11].
Proof of Lemma 2.21. By the invariance under shearing of our statement, we can work in the case $s = 0$. Therefore, R 0 will stand for the collection of rectangles in R 0 such that  L  > λ, where λ ≥  and  > 1 will be specified at the end of the proof. We write  =   ×   for  ∈ R 0 .
Inside-outside splitting. For  ∈ {π 1 () :  ∈ R 0 } and any interval  we define where we recall that the definition of partial order  ≤  was given in (2.13). Set also We claim that if  ⊂ $\mathbb{R}$ is any interval then for all α ∈  we have To see this note that in order for a term appearing in the sum of the left hand side above to be non-zero we must have see Figure 2.26. This clearly implies that for every α ∈  we have which proves the claim.
Smallness of the local average. We now use the previously obtained (2.24) to prove (ii). Let $\mathcal{R}^\star_0$ denote the family of parallelograms  =   ×   ∈ R 0 such that  out   ,  > λ. For each such  let   be the maximal interval  ∈ {  , 3  , . . ., 3    , . ..} such that  out   , > λ; the existence of the maximal interval   is guaranteed for example by the a priori estimate of Lemma 2.14 and the assumption  ∈ $\mathcal{R}^\star_0$. Obviously   ⊇   and  out   ,3  ≤ λ. We show that for  ∈ $\mathcal{R}^\star_0$ we have The first summand is estimated using the maximality of   − ∫ The second summand can be further analyzed by observing that the cubes  appearing in the sum above satisfy π 1 () ⊂  and by our assumption on λ. Combining the estimates above shows that

Defining the subcollection $\mathcal{L}_1$. We set Now note that for each  ∈ $\mathcal{R}^\star_0$ and  =   ∈ K π 1 () we have that while for  ∈ R 0 \ $\mathcal{R}^\star_0$ the same estimate holds using   in place of   . It remains to show the desired estimate for $\mathsf{mass}_{a,1}(\mathcal{L}_1)$ in (i) of the lemma.
Smallness of the mass $\mathsf{mass}_{a,1}(\mathcal{L}_1)$. By the definition of the collections L in , we have that sh($\mathcal{L}_1$) ⊂ If  =   for some  ∈ $\mathcal{R}^\star_0$ we have by definition that  out   ,  > λ. On the other hand for 3), since we have assumed $s = 0$. We will show that for a sufficiently small constant  > 0, where $M_2$ is as in (2.3). To this end let us define which readily yields the existence of  ⊂   with .
This in turn implies that M 2 (1  ) 1 on   × 3  . Now we can conclude by the weak $(1, 1)$ inequality of the directional Hardy-Littlewood maximal function $M_{(1,0)}$. On the other hand we have for the rectangles .
Thus we get by the weak $(p, p)$ assumption for M R 0 that 1) . By the subordination property of $\mathcal{L}_1$ we get ) with sufficiently large  > 1.
Proof of Lemma 2.22. Fix $s \in S$ and choose λ in the definition of $\mathcal{R}_{s,k}$ to be the value given by Lemma 2.21 with Repeat the procedure inductively with  + 1 in place of . When  =  − 1 we have reached the collection L −1 with mass ,1 (L −1 ) 2 − mass ,1 (L 0 ) and  L −1  > λ. This last condition and Remark 2.18 imply that and so, using (2.10), and the lemma follows by the definition of λ since $\mathcal{L}_0 = \mathcal{R}$.

3. A weighted Carleson embedding and applications
In this section, we provide a weighted version of the directional Carleson embedding theorem. We then derive, as applications, novel weighted norm inequalities for maximal and singular directional operators.
The proof of the weighted Carleson embedding follows the strategy used for Theorem C, with suitable modifications. In order to simplify the presentation, we restrict our scope to collections of parallelograms $\mathcal{R} = \{\mathcal{R}_s : s \in S\}$ with the property that the maximal operator $M_{\mathcal{R}_s}$ associated to each collection $\mathcal{R}_s$ satisfies the appropriate weighted weak-$(1, 1)$ inequality. This is the case, for instance, when the collections $\mathcal{R}_s$ are of the form  which is subordinate to some collection $\mathcal{T} \subset \mathcal{P}^2_\tau$ for some fixed $\tau \in S$, we have As before, if $\mathcal{R} \subset \mathcal{D}^2_\tau$ for some fixed $\tau \in S$ then $\mathcal{R}$ is subordinate to itself and for some fixed $\tau \in S$.
Throughout this section all Carleson sequences and related quantities are taken with respect to some fixed weight $w$ which is suppressed from the notation. We can now state our weighted Carleson embedding theorem.

we have

3.5. Proof of Theorem D. We follow the proof of Theorem C and only highlight the differences to accommodate the weighted setting. Write σ From the definition of σ we have that where now for any $\mathcal{L} \subset \mathcal{R}$ we have defined .
Defining the families $\mathcal{R}_{s,k}$ for $s \in S$ and $k \in \mathbb{N}$ as in (2.19) we then have the estimate Again λ > 0 is a constant that will be determined later in the proof and in the last line we used the $w$-Carleson assumption for the sequence $a = \{a_R\}$ for rectangles in a fixed direction. We need the weighted version of Lemma 2.21, which is given under the standing assumptions of Theorem D.  with $\mathcal{L} \subseteq \mathcal{R}$. For every λ >  [, ]  where  is a suitably chosen absolute constant, there exists $\mathcal{L}_1 \subset \mathcal{L}$ such that:

Proof. We can assume that $s = 0$ and let R 0 be the collection of rectangles in R 0 such that where λ is as in the statement of the lemma and  will be specified at the end of the proof. For  ∈ {π 1 () :  ∈ R 0 } and any interval  ⊂ $\mathbb{R}$ we define L in , and L out , as in the proof of Theorem C but now we set We define R 0 to be the subcollection of those  =  × ∈ R 0 such that  out , ≤ λ. By linearity we get for each where Since R 0 ⊂ R 0 we conclude as before that by the two-weight weak type Using the definition of a Carleson sequence we have and so mass ,1 (L 1 ) [, ]  mass ,1 (L)/λ. It remains to deal with parallelograms We define the maximal   such that  out ,  > λ as before; the existence of this maximal interval can be guaranteed for example by assuming the collection $\mathcal{R}$ is finite. We have for each and by this and the $w$-Carleson property for all s subordinate to  × 9 we get Arguing as in the unweighted case of Theorem C we can estimate where In the definition of  above we have that $M_v = M_{(1,s)} = M_1$ since we have reduced to the case $v = (1, s) = (1, 0)$. Using the subordination property of $\mathcal{L}_1$ and the Fefferman-Stein inequality once in the direction $e_2$ for $M_2$ and once in the direction $v = (1, s) = (1, 0)$ for $M_v$ we estimate We have thus proved the lemma upon setting L 1 L 1 ∪ L 1 and choosing λ ≥  [, ]  for a sufficiently large numerical constant  > 1.
Repeating the steps in the proof of Lemma 2.22 for λ as in the statement of Lemma 3.6 we get for the sets $\mathcal{R}_{s,k}$ defined with respect to this λ that and this completes the proof of Theorem D.

3.7. Applications of Theorem D. The first corollary of Theorem D is a two-weighted estimate for the directional maximal operator $M_V$ from (2.4).
Theorem E. Let $V \subset \mathbb{S}^1$ be a finite set of $N$ slopes and $w$ be a weight on $\mathbb{R}^2$. Then

Remark 3.8. In the proof below, we argue for almost horizontal $V$, and use $M_{(0,1)}$ in place of $\max\{M_{(1,0)}, M_{(0,1)}\}$. The usage of $\max\{M_{(1,0)}, M_{(0,1)}\}$ is made necessary to make the statement of the theorem invariant under rotation of $V$.
Proof. By standard limiting arguments, it suffices to prove that for each $k \in \mathbb{Z}$ the estimate when $\mathcal{R}$ is a one-parameter collection as in (3.1), holds uniformly in $k$.
For a nonnegative function $f \in \mathcal{S}(\mathbb{R}^2)$ let   be a linearization of $M_{\mathcal{R}}$, namely By duality, (3.9) turns into We can easily calculate and it is routine to check that $\{w(E \cap F_R)\}_{R \in \mathcal{R}}$ is a $w$-Carleson sequence according to Definition 3.4. The main point here is that the sets $\{E \cap F_R\}_{R \in \mathcal{R}}$ are by definition pairwise disjoint and $F_R \subseteq R$ for each $R \in \mathcal{R}$.
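The duality step can be illustrated by the following computation, written in hypothetical notation: $F_R \subseteq R$ are the pairwise disjoint sets on which the supremum in the maximal operator is attained at $R$, $T$ is the resulting linearization, $\langle f \rangle_R$ is the average of $f$ over $R$, and $E$ is a measurable set of finite measure. It also assumes that the balayage of a sequence $\{a_R\}$ has the form $\sum_R a_R \mathbf{1}_R / |R|$.

```latex
\[
T f := \sum_{R \in \mathcal{R}} \langle f \rangle_R \, \mathbf{1}_{F_R},
\qquad
\langle f \rangle_R := \frac{1}{|R|} \int_R f ,
\]
\[
\int_{E} T f \, w
= \sum_{R \in \mathcal{R}} \langle f \rangle_R \, w(E \cap F_R)
= \int_{\mathbb{R}^2} f \, \Big( \sum_{R \in \mathcal{R}} w(E \cap F_R) \, \frac{\mathbf{1}_R}{|R|} \Big) .
\]
```

Under these assumptions the adjoint of the linearized maximal operator, applied to $\mathbf{1}_E w$, is the balayage of the sequence $a_R = w(E \cap F_R)$. Disjointness of the sets $F_R$ gives $\sum_{R} a_R \leq w(E)$, which is the mechanism behind the Carleson property asserted in the text.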
We may in turn use Theorem E to establish a weighted norm inequality for maximal directional singular integrals with controlled dependence on the cardinality $\#V = N$. Similar considerations may be used to yield weighted bounds for directional singular integrals in $L^p(\mathbb{R}^2)$ for $p > 2$; we do not pursue this issue.
Theorem F. Let $K$ be a standard Calderón-Zygmund convolution kernel on $\mathbb{R}$ and $V \subset \mathbb{S}^1$ be a finite set of $N$ slopes. For $v \in V$ we define Let $w$ be a weight on

We sketch the proof, which is a weighted modification of the arguments for [13, Theorem 1]. Hunt's classical exponential good-λ inequality, see [13, Proposition 2.2] for a proof, may be upgraded to (3.11) $\sqrt{\log N}$ when the estimate is specified to $A_1$ weights as the ones we consider here.

4. Tiles, adapted families, and intrinsic square functions
We define here some general notions of tiles and adapted families of wave packets: definitions in this spirit have appeared in, among others, [1,13,24,25,26]. These will be essential for the time-frequency analysis square functions we use in this paper in order to model the main operators of interest. After presenting these abstract definitions we show some general orthogonality estimates for wave packet coefficients. We then detail how these notions are specialized in three particular cases of interest.
We also use the notation   (1,  ()). There are several different collections of tiles used in this paper; they will generically be denoted by $\mathbf{T}$, $\mathbf{T}_1$, or similar. Given any collection of tiles $\mathbf{T}$ we will often use the notation $\mathcal{R}_{\mathbf{T}} := \{R_t : t \in \mathbf{T}\}$ to denote the collection of spatial components of the tiles in $\mathbf{T}$. The exact geometry of these tiles will be clear from context; however, several estimates hold for generic collections of tiles, as we will see in §4.3.
Let $t = R_t \times \Omega_t$ be a tile and $M \geq 2$. We denote by $\mathcal{A}^M_t$ the collection of Schwartz functions $\phi$ on $\mathbb{R}^2$ such that: In the above display    refers to the center of   and    (•) We thus refer to $\mathcal{A}^M_t$ as the collection of $L^2$-normalized wave packets adapted to $t$ of order $M$. For our purposes, it will suffice to work with moderate values of $M$, say $2^3 \leq M \leq 2^{50}$. In fact, we use $M = M_0 = 2^{50}$ in the definition of the intrinsic wavelet coefficient associated with the tile $t$ and the Schwartz function $f$:

This section is dedicated to square functions involving wavelet coefficients associated with particular collections of tiles which formally look like Δ T ( ) 2  ∑︁  ∈T   ( ) We begin by proving some general global and local orthogonality estimates for collections of tiles with finitely overlapping frequency components. These estimates will be essential in showing that the sequence $\{a_t(f)\}_{t \in \mathbf{T}}$ is Carleson in the sense of Section 2, when $|f| \leq \mathbf{1}_E$ for some measurable set $E \subset \mathbb{R}^2$ with $0 < |E| < \infty$. This in turn will allow us to use the directional Carleson embedding of Theorem C in order to conclude corresponding estimates for intrinsic square functions defined on collections of tiles.
To prove (4.6), we introduce We claim that  Ω ()  2 2 for all , uniformly in Ω ∈ Ω(T). Assuming the claim for a moment and remembering the finite overlap assumption on the frequency components of the tiles we have as desired. It thus suffices to prove the claim. To this end let Then for any  with  2 = 1 we have that  Ω () =  Ω (),  ≤  Ω () 2 and it suffices to prove that  Ω () 2  2 Ω (). A direct computation reveals that where the second inequality in the last display above follows by the polynomial decay of the wave packets {ϕ  : Ω  = Ω}. This completes the proof of the lemma.
We present below a localized orthogonality statement which is needed in order to verify that the coefficients $a_t(f)$ form a Carleson sequence in the sense of §2. Verifying this Carleson condition relies on a variation of Journé's lemma that can be found in [5, Lemma 3.23]; we rephrase it here adjusted to our notation. In the statement of the lemma below we denote by $M_{\mathcal{P}^2_s}$ the maximal function corresponding to the collection $\mathcal{P}^2_s$, where $s \in S$ is a fixed slope.
Note that the proof in [5] corresponds to the case of slope $s = 0$, but the general case $s \in S$ follows easily by a change of variables. Remember here that we have $S \subset [-1, 1]$.
In the statement of the lemma below two parallelograms are called incomparable if none of them is contained in the other.Lemma 4.7.Let  ∈  be a slope and T ⊂ D 2  be a collection of pairwise incomparable parallelograms.De ne sh ★ (T ) M P 2  1 sh(T ) > 2 −6 and for each  ∈ T let   be the least integer  such that 2   ⊄ sh ★ (T ).Then ∑︁ With the suitable analogue of Journé's lemma in hand we are ready to state and prove the localized orthogonality condition for the coe cients   ( ).Lemma 4.8.Let  ∈  be a slope, T ⊂ P 2  be a given collection of parallelograms and T be a collection of tiles such that R T {  :  ∈ T} is subordinate to T .Then we have ∑︁ Proof.We rst make a standard reduction that allows us to pass to a collection of dyadic rectangles.To do this we use that there exist at most 9 2 shifted dyadic grids D 2 , such that for each parallelogram  ∈ T there exists  ∈ ∪  D 2 , with  ⊂  and | | ≤ |  | | |; see for example [19].Now note that for each  ∈ T we have and so |sh( T )| |sh(T )|.Now it is clear that we can replace T with the dyadic collection T in the assumption.Furthermore there is no loss in generality with assuming that T is a pairwise incomparable collection.We do so in the rest of the proof and continue using the notation T assuming it is a dyadic collection.
Since R T is subordinate to T we have the decomposition by Lemma 4.4.We may thus assume that  is supported outside sh ★ (T ).By Lemma 4.7 it then su ces to prove that ∑︁ whenever  is the least integer such that 2   ⊄ sh ★ (T ) and  ∞ = 1.As  is supported o sh ★ (T ) we have have for this choice of  that Let   be the center of  and suppose that  =   (  ×   ) with   ×   ∈ D 2 0 ; remember that we write   (1, ).Let Observe preliminarily that     ∞ 2 −20(+) so that for any constant  > 0 we have as claimed.To pass to the second line we have used estimate (4.5) of Lemma 4.4 together with the easily veri able fact that for each  ∈ T( ) the wave-packet   −1  ϕ  is adapted to  with order  0 − 20 ≥ 2 3 provided the absolute constant  is chosen small enough.4.9.The intrinsic square function associated with rough frequency cones.Let  ∈  be our nite set of slopes.As usual we write   (1, ) for  ∈  and  {  :  ∈  } and switch between the description of directions as slopes or vectors as desired with no particular mention.Now assume we are given a nitely overlapping collection of arcs {ω  } ∈ with each ω  ⊂ S 1 centered at (  /|  |) ⊥ .We will adopt the notation ω  ((  − /|  − |) ⊥ , (  + /|  + |) ⊥ ) assuming that the positive direction on the circle is counterclockwise and  − <  <  + .
For $s \in S$ we define the conical sectors $\Omega_{s,k}$ by (4.10); these form an overlapping cover of the cone $C_s$, with $k \in \mathbb{Z}$ playing the role of the annular parameter. Each sector $\Omega_{s,k}$ is strictly contained in the cone $C_s$.
For each $s \in S$ let $\ell_s \in \mathbb{Z}$ be chosen such that $2^{-\ell_s} < |\omega_s| \leq 2^{-\ell_s+1}$. We perform a further discretization of each conical sector $\Omega_{s,k}$ by considering Whitney-type decompositions with respect to the distance to the lines determined by the two boundary rays of the cone, that is, the rays emanating from the origin in the directions $v^\perp_{s^-}$ and $v^\perp_{s^+}$. For each sector $\Omega_{s,k}$ a central piece, which we call $\Omega_{s,k,0}$, is left uncovered by these Whitney decompositions. This is merely a technical issue and we will treat these central pieces separately in what follows.
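The choice of the dyadic scale $\ell_s$ can be made concrete with a short computation. The following is a minimal Python sketch, assuming only the defining condition $2^{-\ell} < |\omega| \leq 2^{-\ell+1}$ stated above; the function name `dyadic_scale` is hypothetical and not part of the paper.

```python
import math

def dyadic_scale(arc_length: float) -> int:
    """Return the integer l with 2**(-l) < arc_length <= 2**(-l + 1).

    Hypothetical helper: it only illustrates the normalization of the
    arc length to a dyadic scale.
    """
    if arc_length <= 0:
        raise ValueError("arc length must be positive")
    # frexp writes arc_length = mantissa * 2**exponent with 0.5 <= mantissa < 1,
    # which avoids rounding issues of log2 at exact powers of two.
    mantissa, exponent = math.frexp(arc_length)
    if mantissa == 0.5:
        # arc_length is exactly 2**(exponent - 1), the closed right endpoint.
        return 2 - exponent
    return 1 - exponent

for w in (0.01, 0.3, 0.5, 0.7, 1.0, 3.0):
    l = dyadic_scale(w)
    print(w, l, 2.0 ** (-l) < w <= 2.0 ** (-l + 1))
```

Since the upper inequality is closed, an arc whose length is an exact power of two is assigned to the finer of the two candidate scales.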
To make this precise let ,  be xed and de ne the regions The central part that was left uncovered corresponds to  = 0 and is described as Ω ,, , >0 F 4.13.The decomposition of the sector Ω , into Whitney regions, and the spatial grid corresponding to the middle region Ω ,,0 .
We stress here that for each cone $C_s$ we introduce tiles in three possible directions $v_{s^-}, v_s, v_{s^+}$. This turns out to be a technical nuisance more than anything else, as the total number of directions is still comparable to $\#S$, and our estimates will be uniform over all $S$ with the same cardinality. However, in order to avoid confusion we set

Note also that for fixed $s, k, m$ the choice of scales for $R_t$ yields that the tile $t = R_t \times \Omega_{s,k,m}$ obeys the uncertainty principle in both radial and tangential directions.
We then define the associated intrinsic square function $\Delta_{\mathbf{T}}(f)$ by (4.16), where the set of slopes $S$ is kept implicit in the notation. Here we recall that the notation $a_t(f)$ was introduced in (4.2). Using the orthogonality estimates of §4.3 as input for Theorem C we readily obtain the estimates of the following theorem.
Theorem G.We have the estimates (log where the supremum in the last display is taken over all measurable sets  ⊂ R 2 of nite positive measure and all Schwartz functions  on R 2 with  ∞ ≤ 1. Proof of Theorem G. First of all, observe that the case  = 2 of (4.17) is exactly the conclusion of Lemma 4.4.By restricted weak type interpolation it thus su ces to prove (4.18) to obtain the remaining cases of (4.17): we turn to the former task.
For convenience de ne  *  ∪ { − :  ∈  } ∪ { + :  ∈  }  − ∪  ∪  + ; note that this is the actual set of slopes of tiles in T. Let R T {  :  ∈ T} ⊂ D 2  * .Observe that we can write We x  and  as in the statement and we will obtain (4.18) from an application of Theorem C to the Carleson sequence  = {  } ∈R T .First, mass  || as a consequence of Lemma 4.4 since

|𝐸|.
Further, the fact that  is (a constant multiple of) an  ∞ -normalized Carleson sequence is a consequence of the localized estimate of Lemma 4.8.To verify this we need to check the validity of De nition 2.8 for the sequence  above.To that end let L ⊂ D

4.19. The intrinsic square function associated with smooth frequency cones. The tiles in the previous subsection were used to model rough frequency projections on a collection of essentially disjoint cones. Indeed, note that all decompositions were of Whitney type with respect to all the singular sets of the corresponding rough multiplier. In the case of smooth frequency projections on cones we need a simplified collection of tiles that we briefly describe below.
Assuming  is a nite set of slopes and the arcs {ω  } ∈ on S 1 have nite overlap as before we now de ne for  ∈  and  ∈ Z the collections (4.20) with Ω , given by (4.10).Here we also assume that 2 . Notice that each conical sector Ω , now generates exactly one frequency component of possible tiles in contrast with the previous subsection where we need a whole Whitney collection for every  and every ; in fact the tiles T , are for all practical purposes the same as the tiles T ,,0 considered in §4.9.It is of some importance to note here that for each xed  ∈  the collection R T {  :  ∈ T} consists of parallelograms of xed eccentricity 2 ℓ  and thus the corresponding maximal operator M R T is of weak-type (1,1) uniformly in  ∈ : 1.
The intrinsic square function $\Delta_{\mathbf{T}}$ is formally given as in (4.16), but defined with respect to the new collection of tiles defined in (4.20). A repetition of the arguments that led to the proof of Theorem G yields the following.
Theorem H.For T de ned by (4.20) we have the estimates (log #) 1 4 , where the supremum in the last display is taken over all measurable sets  ⊂ R 2 of nite positive measure and all Schwartz functions  on R 2 with  ∞ ≤ 1.

4.21. The intrinsic square function associated with rough frequency rectangles. The considerations in this subsection aim at providing the appropriate time-frequency analysis in order to deal with a Rubio de Francia type square function, given by frequency projections on disjoint rectangles in finitely many directions. The intrinsic setup is described by considering again a finite set of slopes $S$ and corresponding directions $V$. Suppose that we are given a finitely overlapping collection of rectangles $\mathcal{F} = \cup_{s \in S} \mathcal{F}_s$, consisting of rectangles which are tensor products of intervals in the coordinates $v, v^\perp$, $v = (1, s)$, for some $s \in S$. Namely, a rectangle $F \in \mathcal{F}_s$ is a rotation by $s$ of an axis-parallel rectangle. We stress that the rectangles in each collection $\mathcal{F}_s$ are generic two-parameter rectangles, namely their sides have independent lengths (there is no restriction on their eccentricity).
We also note that $\mathcal{F}_s$ consists of rectangles rather than parallelograms, and this difference is important when one deals with rough frequency projections. Our techniques are sufficient to deal with the case of parallelograms as well, but we just choose to detail the setup for the rectangular case. The interested reader will have no trouble adjusting the proof for variations of our main statement below for the case of parallelograms, or for the case that the families $\mathcal{F}_s$ are in fact one-parameter families.
Given  ∈ F  we de ne a two-parameter Whitney discretization as follows.Let  = rot  ( ×  ) +   for some   ∈ R 2 , where rot s denotes counterclockwise rotation by  about the origin and  ×  is an axis parallel rectangle centered at the origin.Note that  = (−| |/2, | |/2) and similarly for  .Then we de ne for The de nition has to be adjusted for  1 = 0 or  2 = 0.For example we de ne for  2 ≠ 0 and symmetrically for  1 ≠ 0 and  2 = 0. Finally Then for  = ( Note again that the tiles de ned above obey the uncertainty principle in both ,  ⊥ for every xed  = (1, ) with  ∈ .
The intrinsic square function associated with the collection $\mathcal{F}$ is denoted by $\Delta_{\mathbf{T}_{\mathcal{F}}}$ and formally has the same definition as (4.16), where now the tiles $\mathbf{T}$ are given by the collection $\mathbf{T}_{\mathcal{F}}$ of (4.22). The corresponding theorem is the intrinsic analogue of a multiparameter directional Rubio de Francia square function estimate.
Theorem I. Let F be a nitely overlapping collection of two-parameter rectangles in directions given by  ∑︁ Consider the collection of tiles T F de ned in (4.22) and Δ T F be the corresponding intrinsic square function.We have the estimates (log #) 1 4 (log log #) 1 4 , where the supremum in the last display is taken over all measurable sets  ⊂ R 2 of nite positive measure and all Schwartz functions  on R 2 with  ∞ ≤ 1.
Remark 4.23.As before, there is slight improvement in the case of one-parameter spatial components in each direction.More precisely suppose that F = ∪ ∈ F  is a given collection of disjoint rectangles in directions given by .If for each  ∈  the family R F  {  :  ∈ T F  } yields a weak-type (1, 1) maximal operator then the estimates of Theorem I hold without the log log-terms.Remark 4.24.Suppose that R = ∈ R  ⊂ P 2  is a family of parallelograms in directions given by , namely we have that if  ∈ R  then  =   ( ×  ) +   for some rectangle  ×  in R 2 with sides parallel to the coordinate axes and centered at 0, and   ∈ R 2 .Now there is an obvious way to construct a Whitney partition of each  ∈ R. Indeed we just de ne the frequency components and T are given as in (4.22).With this de nition there is a corresponding intrinsic square function Δ T R which satis es the bounds of Theorem I.The improvement of Remark 4.23 is also valid if R = ∪ ∈ R  and each R  consists of rectangles of xed eccentricity.
The proof of Theorem I relies again on the global and local orthogonality estimates of §4.3 and a subsequent application of the directional Carleson embedding theorem, Theorem C. We omit the details.

S
We begin this section by recalling the de nition for the smooth conical frequency projections given in the introduction.Let τ ⊂ (0, 2π) be an interval and consider the corresponding rough cone multiplier and its smooth analogue where β is a smooth function on R supported on [−1, 1] and equal to 1 on [− 1 2 , 1 2 ] and  τ , |τ| stand respectively for the center and length of τ.
This section is dedicated to the proofs of two related theorems concerning conical square functions.The rst is a quantitative estimate for a square function associated with the smooth conical multipliers of a nite collection of intervals with bounded overlap given in Theorem A, namely the estimates for 2 ≤  < 4, as well as the restricted type analogue valid for all measurable sets under the assumption of nite overlap The second theorem concerns an estimate for the rough conical square function for a collection of nitely overlapping cones .
Theorem J. Let  be a nite collection intervals in [0, 2) with nite overlap as in (5.2).Then the square function estimate Theorem A is sharp, in terms of log #-dependence, for all 2 ≤  < 4 and for  = 4 up to the restricted type.Theorem J improves on [10, Theorem 1], where the dependence on cardinality is unspeci ed.Examples providing a lower bound of (log #) 1 2 − 1    for the left hand side of (5.3), and showing the sharpness of Theorem A, are detailed in Section 8.
The remainder of the section is organized as follows. In the upcoming Subsection 5.4 we show Theorem A. The subsequent subsection is dedicated to the proof of Theorem J.

5.4. Proof of Theorem A. We are given a finite collection of intervals $\omega$ having bounded overlap as in (5.2). By finite splitting we may reduce to the case of the intervals $\omega$ being pairwise disjoint; we treat this case throughout.
The rst step in the proof of Theorem A is a radial decoupling.Let ψ be a smooth radial function on R 2 with and de ne the Littlewood-Paley projection The following weighted Littlewood-Paley inequality is contained in [3, Proposition 4.1].
Proposition 5.5 (Bennett-Harrison, [3]).Let  be a non-negative locally integrable function. [3] with implicit constant independent of ,  , where we recall that M [3] denotes the three-fold iteration of the Hardy-Littlewood maximal function M with itself.
We may easily deduce the next lemma from the proposition.
Proof.The case  = 2 is trivial so we assume  > 2. Letting   2 > 1 there exists some [3]  and the lemma follows by Hölder's inequality and the boundedness of  [3] on   (R 2 ).
The second and final step of the proof of Theorem A is the reduction of the operator appearing on the right hand side of (5.7) to the model operator of Theorem H.
In order to match the notation of §4.9 we write {  } ∈ for the collection of arcs in S 1 corresponding to the collection of intervals , namely for τ ∈  we implicitly de ne  =  τ by means of We set  { τ : τ ∈ } and de ne the corresponding arcs in S 1 as ω  τ {e  :  ∈ τ}.
Now the cone  τ is the same thing as the cone   and # = #.Similarly we write  • τ =  •  τ so the cones can now be indexed by By nite splitting and rotational invariance there is no loss in generality with assuming that  ⊂ [−1, 1].Notice that the support of the multiplier of

𝑡
for all  ∈ T  ω , and T , is de ned in (4.20).Here  0 = 2 50 is as chosen in (4.2).Fixing ,  for the moment we preliminarily observe that for each ν ≥ 1 the collection R , R T , = {  :  ∈ T , } can be partitioned into subcollections {R  ,,ν : 1 ≤  ≤ 2 8ν } with the property that We will also use below the Schwartz decay of ϕ  ∈ A  0  in the form Using Schwartz decay of ϕ  twice, in particular to bound by an absolute constant the second factor obtained by Cauchy-Schwartz after the rst step, we get Now for xed ω, , ν,  and  ∈ T , observe that there is at most one ρ = ρ  ,,ν () ∈ R  ω,,ν such that ρ ⊄ 2 ν   , ρ ⊂ 2 ν+1   .Thus the estimate above can be written in the form where  ρ = ρ × Ω , ∈ T , is the unique tile with spatial localization given by ρ: this is because 2 −4ν ϕ  ∈ A  0  ρ .We thus conclude that ( ) Comparing with the de nition of Δ T given in (4.16) we may summarize the discussion in the lemma below.
The proof of the upper bound in Theorem A is then completed by juxtaposing the estimates of Lemmata 5.6 and 5.9 with Theorem H. For the optimality of the estimate see §8.6.

5.10. Proof of Theorem J. The proof of Theorem J is necessarily more involved than its smooth counterpart Theorem A. In particular we need to decompose each cone not only in the radial direction as before, but also in the directions perpendicular to the singular boundary of each cone. We describe this procedure below.
Consider a collection of intervals $\{\tau\}$ as in the statement. By the same correspondence as in the proof of Theorem A we pass to a family $\{\omega_s\}_{s \in S}$ consisting of finitely overlapping arcs on $\mathbb{S}^1$ centered at $v_s^\perp/|v_s^\perp|$ and corresponding cones $C_s$. Note that the sectors $\{\Omega_{s,k}\}_{s \in S, k \in \mathbb{Z}}$ defined in (4.10) form a finitely overlapping cover of $\cup_{s \in S} C_s$. We recall that $v_s = (1, s)$, that the endpoints of the arc $\omega_s$ are given by $(v^\perp_{s^-}, v^\perp_{s^+})$, and that the positive direction is counterclockwise. Now, for each fixed $s \in S$ the cover $\{\Omega_{s,k,m}\}_{(k,m) \in \mathbb{Z}^2}$ defined in (4.11), (4.12) is a Whitney cover of $\Omega_{s,k}$ in the product sense: for each $\Omega_{s,k,m}$ the distance from the origin is comparable to $2^k$ and the distance to the boundary is comparable to $2^{k-|m|}|\omega_s|$.
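The Whitney property in the angular variable can be illustrated in one dimension. The following is a minimal Python sketch, not the construction (4.11), (4.12) of the paper: it splits an interval into a central piece ($m = 0$) and dyadic pieces whose length equals their distance to the nearer endpoint. The function name `whitney_pieces`, the truncation parameter `depth`, and the convention that positive $m$ approaches the right endpoint are assumptions made for illustration.

```python
from fractions import Fraction

def whitney_pieces(a, b, depth):
    """Illustrative one-dimensional Whitney-type splitting of (a, b).

    Returns a dict m -> (left, right). The piece m = 0 is the central
    half of the interval; for m != 0 the piece has length equal to its
    distance from the nearer endpoint, that is half * 2**(-|m| - 1).
    """
    a, b = Fraction(a), Fraction(b)
    half = (b - a) / 2
    mid = (a + b) / 2
    pieces = {0: (mid - half / 2, mid + half / 2)}
    for m in range(1, depth + 1):
        outer = half / 2 ** m        # larger distance to the nearer endpoint
        inner = half / 2 ** (m + 1)  # smaller distance to the nearer endpoint
        pieces[m] = (b - outer, b - inner)    # pieces approaching b
        pieces[-m] = (a + inner, a + outer)   # pieces approaching a
    return pieces

pieces = whitney_pieces(0, 1, 3)
for m in sorted(pieces):
    left, right = pieces[m]
    print(m, left, right)
```

Truncating at a finite `depth` leaves a thin uncovered strip next to each endpoint; in the untruncated decomposition the pieces exhaust the open interval.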
The radial decomposition in  will be taken care of by the Littlewood-Paley decomposition {  } ∈Z , de ned as in the proof of Theorem J. Now for xed ,  we consider a smooth partition of unity subordinated to the cover {Ω ,, } ∈Z .Note that one can easily achieve that by choosing {φ , } <0 to be a one-sided (contained in   ) Littlewood-Paley decomposition in the negative direction  − =   − , and constant in the direction ( − ) ⊥ when  < 0, and similarly one can de ne φ , when  > 0, with respect to the positive direction  + .The central piece Ω ,,0 corresponds to φ  0 de ned implicitly as Now the desired partition of unity is  ,, () , where ψ  ψ(2 − •) with the ψ constructed in the proof of Theorem A. Remember that    (ψ  f ) ∨ and let us de ne Φ ,  (φ , f ) ∨ .An important step in the proof is the following square function estimate in   (R 2 ), with 2 ≤  < 4, that decouples the Whitney pieces in every cone   .It comes at a loss in  which appears to be inevitable because of the directional nature of the problem.Lemma 5.11.Let {  } ∈ be a family of frequency cones, given by a family of nitely overlapping arcs  {ω  } ∈ as above.For 2 ≤  < 4 there holds Proof.Observe that the desired estimate is trivial for  = 2 so let us x some  ∈ (2, 4).There exists some  ∈   with  = (/2) = /( − 2) such that and so by Proposition 5.5 we get where we recall that M [3] denotes three iterations of the Hardy-Littlewood maximal function M. Fixing  for a moment we use Proposition 5.5 in the directions   −,   and   + to further estimate ∫ ε M [3]  where we adopted the convention  0  for brevity, and M  is given by (2.3).Remember also that Φ , for  > 0 corresponds to directions  + while Φ , corresponds to directions  − for  < 0, and to directions  0 =  for  = 0. Now for any  ∈ S 1 and  > 1 we have that M [3]    ( ) 2 [M    ] 1  ; see for example [29].Thus M [3]    ε M [3]  ( ) 2 [M  * [M [3] ]  ]  [21] that   * maps   (R 2 ) to   (R 2 ) with a bound (log # * ) 1  for  > 2. 
As $p < 4$ there exists a choice of $1 < r < \frac{p}{2(p-2)}$ so that $\frac{p}{r(p-2)} > 2$, and a theorem of Katz from [21] applies. Using this fact together with Hölder's inequality proves the lemma.
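The exponent bookkeeping in this last step can be checked with exact arithmetic. The following is a minimal Python sketch, assuming the dual exponent is $q = (p/2)' = p/(p-2)$ as in the proof, and that the auxiliary exponent $r$ must satisfy $1 < r < q/2$ so that $q/r > 2$; this reading of the garbled inequalities and the function names are assumptions.

```python
from fractions import Fraction

def dual_exponent(p):
    """Return q = (p/2)' = p / (p - 2) for p > 2."""
    p = Fraction(p)
    if p <= 2:
        raise ValueError("need p > 2")
    return p / (p - 2)

def r_upper_bound(p):
    """Upper endpoint q / 2 = p / (2 (p - 2)) of the assumed range for r."""
    return dual_exponent(p) / 2

for p in (Fraction(5, 2), Fraction(3), Fraction(7, 2), Fraction(4)):
    q = dual_exponent(p)
    upper = r_upper_bound(p)
    # The range 1 < r < upper is nonempty exactly when p < 4.
    print(p, q, upper, upper > 1)
```

At $p = 4$ the upper endpoint equals $1$, so no admissible $r$ exists, which is consistent with the restriction $p < 4$ in the lemma.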
The proof of Theorem J can now be completed as follows. For each $(s, k, m) \in S \times \mathbb{Z} \times \mathbb{Z}$ the composition of the $k$-th Littlewood-Paley projection with $\Phi_{s,m}$ is a smooth frequency projection adapted to the rectangular box $\Omega_{s,k,m}$. Following the same procedure that led to (5.8) in the proof of Theorem A, we can approximate each such piece, applied to $f$, by an operator of the form

where $s^\varepsilon$ follows the sign of $m$ and coincides with $s$ if $m = 0$. The collections of tiles $\mathbf{T}_{s^\varepsilon,k,m}$ are the ones given in (4.14). Now Lemma 5.11 and Theorem G are combined to complete the proof of Theorem J.

D R F
In his seminal paper [31], Rubio de Francia proved a one-sided Littlewood-Paley inequality for arbitrary intervals on the line. This estimate was later extended by Journé, [20], to the case of rectangles ($n$-dimensional intervals) in $\mathbb{R}^n$; a proof more akin to the arguments of the present paper appears in [25]. The aim of this subsection is to present a generalization of the one-sided Littlewood-Paley inequality to the case of rectangles in $\mathbb{R}^2$ with sides parallel to a given set of directions. The set of directions is necessarily finite because of Kakeya counterexamples.
As in the case of cones of §5 we will present two versions, one associated with smooth frequency projections and one with rough.To set things up let  be a nite set of slopes and  be the corresponding directions.We consider a family of rotated rectangles F as in §4.21 where F = ∪ ∈ F  .For each  ∈  a rectangle  ∈ F  is a rotation by  of an axis parallel rectangle, so that the sides of  are parallel to (,  ⊥ ) with  = (1, ).We will write  = rot  (  ×   ) +   for some   ∈ R 2 in order to identify the axes-parallel rectangle   ×   producing  by an -rotation; this writing assumes that   ×   is centered at the origin.Now for each  ∈ F we consider the rough frequency projection and its smooth analogue where γ  is a smooth function on R 2 , supported in , and identically 1 on rot s ( 1 2  × 1 2  ).We rst state the smooth square function estimate.
Theorem K. Let $\mathcal{F}$ be a collection of rectangles in $\mathbb{R}^2$ with sides parallel to $(v, v^\perp)$ for some $v$ in a finite set of directions $V$. Assume that $\mathcal{F}$ has finite overlap. Then

for $2 \leq p < 4$, as well as the restricted type analogue valid for all measurable sets

The dependence on the number of directions in the estimates above is best possible up to the doubly logarithmic term.
Remark 6.1. We record a small improvement of the estimates above in some special cases. Suppose that for fixed $s \in S$ all the rectangles $F \in \mathcal{F}_s$ have one side-length fixed, or that they have fixed eccentricity. In both these cases the collections of spatial components of the tiles needed to discretize these operators, $\mathcal{R}_{\mathbf{T}_{\mathcal{F}_s}} \coloneqq \{R_t : t \in \mathbf{T}_{\mathcal{F}_s}\}$, with the tiles as in (4.22), give rise to maximal operators that are of weak-type (1, 1). Then Remark 4.23 shows that the estimates of Theorem K hold without the doubly logarithmic terms, and as shown in §8.2 this is best possible.
The rough version of this Rubio de Francia type theorem is slightly worse in terms of the dependence on the number of directions.The reason for that is that, as in the case of conical projections, passing from rough to smooth in the directional setting incurs a loss of logarithmic terms, essentially originating in the corresponding maximal function bound.
Theorem L. Let $\mathcal{F}$ be a collection of rectangles in $\mathbb{R}^2$ with sides parallel to $(v, v^\perp)$ for some $v$ in a finite set of directions $V$. Assume that $\mathcal{F}$ has finite overlap. Then the following square function estimate holds for $2 \leq p < 4$

The proofs of these theorems follow the by now familiar path of introducing local Littlewood-Paley decompositions on each multiplier, approximating with time-frequency analysis operators, establishing a directional Carleson condition on the wave-packet coefficients and finally applying Theorem C. We will very briefly comment on the proofs below.
Proof of Theorem L and Theorem K. We first sketch the proof of Theorem L, which is slightly more involved. The first step here is a decoupling lemma which is completely analogous to Lemma 5.11, with the difference that now we need to use two directional Littlewood-Paley decompositions, while in the case of cones only one was needed. This explains the extra logarithmic term in the statement.
Remember that F = ∪  F  with  = (1, ) for some  ∈  ; here  gives the directions (,  ⊥ ) of the rectangles in F  .Using the nitely overlapping Whitney decomposition of §4.21 we have for each  ∈ F  a collection of tiles as in (4.22).Let us for a moment x  and  ∈ F  .The frequency components of the tiles in T  ( ) form a two-parameter Whitney decomposition of  , so let { , 1 , 2 } ( 1 , 2 )∈Z 2 be a smooth partition of unity subordinated to this cover and denote by Φ , 1 , 2 the Fourier multiplier with symbol  , 1 , 2 .
The promised analogue of Lemma 5.11 is the following estimate: for 2 ≤  < 4 there holds (6.2) The proof of this estimate is a two-parameter repetition of the proof of Lemma 5.11, where one applies Proposition 5.5 once in the direction of  and once in the direction of  ⊥ .Using the familiar scheme we can approximate each Φ ,   ( ) as before.Using the orthogonality estimates of §4.3 in Theorem C yields the upper bound in Theorem K.The optimality of the estimates in the statement of Theorem K is discussed in §8.2.

T
Let P = P  be a regular  -gon and  P  be the corresponding Fourier restriction operator on P In this subsection we prove Theorem B, namely we will prove the estimate The idea is to reduce the multiplier problem for the polygon to the directional square function estimates of Theorem K and combine those with vector-valued inequalities for directional averages and directional Hilbert transforms.
We introduce some notation.The large integer  is xed throughout and left implicit in the notation.By scaling, it will be enough to consider a regular polygon P with the following geometric properties: rst, P has vertices on the unit circle S 1 with  1 =   +1 = 0 and oriented counterclockwise so that  +1 −   > 0.
The associated Fourier restriction operator is then de ned by The proof of the estimate of Theorem B for  P occupies the remainder of this section: by self-duality of the estimate it will su ce to consider the range 2 ≤  < 4.
7.1.A preliminary decomposition.Let  be a large positive integer and take κ such that 2 κ−1 <  ≤ 2 κ .For each −2κ ≤  ≤ 0 consider a smooth radial multiplier   which is supported on the annulus and is identically 1 on the smaller annulus We note that  κ is supported in the annulus . With this in mind let us consider radial functions  0 ,  P ∈ S(R 2 ) with 0 ≤  0 ,  P ≤ 1 such that (7.2)  0 +  κ +  P 1 P = 1 P , with the additional requirement that 7.5.Estimating  κ .We aim for the estimate The case  = 2 is obvious whence it su ces to prove the restricted type version at the endpoint  = 4 .
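The dyadic bookkeeping at the start of §7.1 can be sanity-checked numerically. The following is a minimal Python sketch, assuming the passage reads "take $\kappa$ such that $2^{\kappa-1} < N \leq 2^{\kappa}$" with $N$ the number of sides of the polygon, and that the annular index $k$ ranges over $-2\kappa \leq k \leq 0$; the function names are hypothetical.

```python
def kappa_for(n: int) -> int:
    """Return the integer kappa with 2**(kappa - 1) < n <= 2**kappa."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, computed exactly.
    return (n - 1).bit_length()

def annular_indices(n: int) -> list:
    """Indices k with -2 * kappa <= k <= 0 used in the annular decomposition."""
    return list(range(-2 * kappa_for(n), 1))

for n in (5, 8, 9, 1000):
    print(n, kappa_for(n), len(annular_indices(n)))
```

Under these assumptions the number of annular pieces is $2\kappa + 1$, which is comparable to $\log N$.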
Let {ω  :  ∈  } be the collection of intervals on S 1 centered at   exp (2π / ) and of length 2 −κ .Note that these intervals have nite overlap and their centers   form a ∼ 1/net on S 1 .Now let {β  :  ∈  } be a smooth partition of unity subordinated to the nitely overlapping open cover {ω  :  ∈  } so that each β  is supported in   .We can decompose each   as For   ∈  and −2κ ≤  ≤ 0 we de ne the conical sectors  .
But now note that { , } , is a nitely overlapping family of smooth frequency projections on a family of rectangles in at most ∼  directions.Furthermore all these rectangles have one side of xed length since |ω  | = 2 −κ for all  ∈  .So Theorem K with the improvement of Remark 6.1 applies to yield 4 .The last two displays establish (7.7) and thus (7.6).
Remark 7.12.The term  κ is also present in the argument of [8].Therein, an upper estimate of order  (κ 5 4 ) for  near 4 is obtained, by using the triangle inequality and the bound sup {    4 (R 2 ) : −2κ ≤  ≤ 0} ∼ κ 1 4 for the smooth restriction to a single annulus.7.13.Estimating  P .In this subsection we will prove the estimate (7.14) .Let Φ be a smooth radial function with support in the annular region { ∈ R where  is a xed small constant, and satisfying 0 ≤ Φ ≤ 1.Let {β  :  ∈  } be a partition of unity on S 1 relative to intervals ω  as in §7.5.De ne the Fourier multiplier operators on R 2 (7.15) The operators   satisfy a square function estimate which follows in the same way as (7.11), by using Theorem K with the improvement of Remark 6.1.They also obey a vector-valued estimate (7.17) These estimates are easy to prove.Indeed note that it su ces to prove the endpoint restricted estimate at  = 4.Using the Fe erman-Stein inequality for xed  ∈  we can estimate for each function  with where M  is the Hardy-Littlewood maximal function with respect to the collection of parallelograms in D 2   ,−2κ,−κ with   de ned through (−  , 1)   .Now sup  ∈   is the maximal directional maximal function and the number of directions involved in its de nition is comparable to  ∼ 2 κ .Then the maximal theorem of Katz from [21] applies to give the estimate sup This proves the second of the estimates (7.17) and thus both of them by interpolation.
In the estimate for  P we will also need the following decoupling result.and the restricted type estimate (7.19) follows from (7.17).
We come to the main argument for  P .Let  P be as as in (7.2)-( 7.3) and   be the multiplier operators from (7.15) corresponding to the choice Φ =  P .Then obviously We may also tweak Φ and the partition of unity on S 1 to obtain further multiplier operators   as in (7.15) and such that the Fourier transform of the symbol of   equals one on the support of the symbol of   .With these de nitions in hand we estimate for 2 <  < 4 The rst inequality is an application of Lemma 7.18 for   .The last equality is obtained by observing that the polygon multiplier  P on the support of each   may be written as a (sum of  (1)) directional biparameter multipliers    +1 of iterated Hilbert transform type, where   is a Hilbert transform along the direction   , which is the unit vector perpendicular to the -th side of the polygon, and pointing inside the polygon; these are at most ∼  such directions.
In order to complete our estimate for  P we need the following Meyer-type lemma for directional Hilbert transforms of the form Lemma 7.21.Let  ⊂ S 1 be a nite set of directions and   be the Hilbert transform in the direction .Then for Proof.It su ces to prove the estimate for 2 <  < 4. The proof is by way of duality and uses the following inequality for the Hilbert transform: for  > 1 and  a non-negative locally integrable function we have with M  given by (2.3).See for example [29] and the references therein.Using this we have for a suitable  ∈  (/2) of norm one that {    } 2   (R since the number of directions is  = 2 κ .The nal estimate for the right hand side of the display above is a direct application of (7.16) which together with (7.20) yields the estimate for  P   claimed in (7.14).Now the decomposition (7.4) together with the estimate of §7.5 for  κ and the estimate (7.14) for  P complete the proof of Theorem B.
Remark 7.22.Consider a function  in R 2 such that supp( f ) ⊆  δ where  δ is an annulus of width δ 2 around S 1 .Decomposing  δ into a union of  (1/δ) nitely overlapping annular boxes of radial width δ 2 and tangential width δ we can write  =  ∈    where each   is a smooth frequency projection onto one of these annular boxes, indexed by .Then if   is a multiplier operator whose symbol is identically one on the frequency support of    and supported on a slightly larger box, we can write  =       , as in (7.20) above.Then Lemma 7.18 yields    (R 2 ) (log(1/δ)) 1 2 − 1  {   }   (R 2 ;ℓ 2  ) .This is the inverse square function estimate claimed in the remark after Theorem B in the introduction.
8. L 8.1.Sharpness of Meyer's lemma.We brie y sketch the quantitative form of Fe erman's counterexample [16] proving the sharpness of Lemma 7.21.Let  be a large dyadic integer.Using a standard Besicovitch-type construction we produce rectangles {  :  = 1, . . .,  } with sidelengths 1 × 1  , so that the long side of   is oriented along   exp(2π / ).Now we consider the set  to be the union of these rectangles and  These smooth radial multipliers were used extensively in §7.In [9] Córdoba has proved the bound  δ   (log 1/δ) Obviously one gets the same bound by duality for 4/3 <  < 2 while the  2 -bound is trivial.Now these estimates imply Córdoba's estimate for  δ in the open range (3/4, 4) by the decoupling inequality (7.10), also due to Córdoba.On the other hand Córdoba's estimate is sharp.Indeed one uses the same rescaling and modulation arguments as in the previous subsection in order to deduce a vector-valued inequality for smooth averages starting by Córdoba's estimate.Testing this vector-valued estimate against the rectangles of the Besicovitch construction proves the familiar lower bound for  δ and thus also shows the optimality of the estimates in (8.5).We omit the details.
8.6. Lower bounds for the conical square function. We conclude this section with a simple example that provides a lower bound for the operator norm of the conical square function of Theorem J and of the smooth conical square function of Theorem A. The considerations in this subsection also rely on the Besicovitch construction so we adopt

For each $s \in S$ let
$$A_s = \begin{pmatrix} 1 & 0 \\ s & 1 \end{pmatrix}$$
be the corresponding shearing matrix. A parallelogram along $s$ is the image $P = A_s(I \times J)$ of the rectangular box $I \times J$ in the fixed coordinate system with $|I| \ge |J|$. We denote the collection of parallelograms along $s$ by $\mathcal P^2_s$.
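As an illustration of the definition above, here is a minimal sketch that computes the vertices of a parallelogram along a slope $s$ as the image of a box $I \times J$ under the shear $(x, y) \mapsto (x, sx + y)$. The helper names `shear`, `parallelogram_vertices` and `shoelace_area` are hypothetical and not part of the paper; intervals are given as pairs of endpoints.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Interval = Tuple[float, float]


def shear(s: float, point: Point) -> Point:
    """Apply the shearing matrix [[1, 0], [s, 1]] to the point (x, y)."""
    x, y = point
    return (x, s * x + y)


def parallelogram_vertices(s: float, I: Interval, J: Interval) -> List[Point]:
    """Vertices of the sheared box I x J, listed counterclockwise."""
    (a, b), (c, d) = I, J
    corners = [(a, c), (b, c), (b, d), (a, d)]
    return [shear(s, corner) for corner in corners]


def shoelace_area(vertices: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


if __name__ == "__main__":
    # Box with |I| = 4 >= |J| = 1, sheared with slope s = 0.5.
    verts = parallelogram_vertices(0.5, (0.0, 4.0), (0.0, 1.0))
    print(verts)
    print(shoelace_area(verts))
```

The shear has determinant one, so the parallelogram has the same area $|I||J|$ as the box; its two vertical sides stay vertical and its long sides point along the vector $(1, s)$.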

Córdoba's bound from [9] has right-hand side $(\log 1/\delta)^{\left|\frac12 - \frac1p\right|}\, \|f\|_p$ for $\frac43 \le p \le 4$.
In fact the same bound is implicitly proved in §7 in a more refined form, but only in the open range $p \in (4/3, 4)$, with weak-type analogues at the endpoints. More precisely, we have discretized the smooth radial multiplier into a sum of pieces, where each piece is a smooth projection onto an annular box of width $\delta$ and length $\sqrt{\delta}$, pointing along one of $N$ equispaced directions. Then it follows from the considerations in §7 that

if for each $L \in \mathcal L$ there exists $T \in \mathcal T$ such that $L \subset T$; see Figure 2.7. It is important to stress that collections $\mathcal L$ are subordinate to rectangles $\mathcal T \subset \mathcal P^2_s$ having a fixed slope $s$. The Carleson sequences $a = \{a_R\}_{R \in \mathcal R}$ we will be considering will fall under the scope of the following definition.

which by our assumption on the weak $(p, p)$ norm of the maximal operator implies

Lemma 2.22. Let $S \subset [-1, 1]$ and define the collections by (2.19), with $\lambda$ defined as in Lemma 2.21.

Directional weights. Let $S$ be a set of slopes and $w, u \in L^1_{\mathrm{loc}}(\mathbb R^2)$ be nonnegative functions, which we refer to as weights from now on. Our weight classes are related to the maximal

Therefore, if $g \ge 0$ belongs to the unit sphere of the corresponding Lebesgue space on $\mathbb R^2$,

here $[\ell]$ denotes $\ell$-fold composition of an operator with itself. We also highlight the relevance of $[w, u]_S$ in Theorem D below by noticing that

for a fixed $k \in \mathbb Z$. In other words, the parallelograms in direction $s$ have fixed vertical sidelength and arbitrary eccentricity.

3.2. We pause to point out some relevant examples of pairs $w, u$ with $[w, u]_S < \infty$. Recall that for $p > 2$, $\|M_{S;2}\|_{p \to p} \lesssim (\log \#S)^{1/p}$; this is actually a special case of Theorem C and interpolation.

3.3. Weighted Carleson sequences. We begin with the weighted analogue of Definition 2.8, which is given with respect to a fixed weight $w$.

Definition 3.4. Let $a = \{a_R\}_{R \in \mathcal D^2_S}$ be a sequence of nonnegative numbers. Then $a$ will be called an $L^\infty$-normalized $w$-Carleson sequence if for every $\mathcal L \subset \mathcal D^2$

4.1. Tiles and wavelet coefficients. Throughout this section we fix a finite set of slopes $S \subset [-1, 1]$. Remember that alternatively we will refer to the set of vectors $V := \{(1, s) : s \in S\}$. A tile is a set $t := R_t \times \Omega_t \subset \mathbb R^2 \times \mathbb R^2$ where $R_t \in \mathcal D^2_S$ and $\Omega_t \subset \mathbb R^2$ is a measurable set, and $|R_t||\Omega_t| \gtrsim 1$. We denote by $s(t) \in S$ the slope such that $R_t \in \mathcal D^2_{s(t)}$, and then

4.3. Orthogonality estimates for collections of tiles. We begin with an easy orthogonality estimate for wave packet coefficients. For completeness we present a sketch of proof which has a $TT^*$ flavor. The argument follows the lines of proof of [25, Proposition 3.3].

Lemma 4.4. Let $\mathbf T$ be a set of tiles such that $\sum_{t \in \mathbf T} 1_{\Omega_t} \lesssim 1$, let $M \ge 2^3$ and $\{\varphi_t : t \in \mathbf T\}$ be such that $\varphi_t \in \mathcal A^M_t$ for all $t \in \mathbf T$.
We have the estimate
$$\sum_{t \in \mathbf T} |\langle f, \varphi_t \rangle|^2 \lesssim \|f\|_2^2. \tag{4.5}$$

Proof. Fix $M \ge 2^3$. It suffices to prove that for $\|f\|_2 = 1$ and an arbitrary adapted family of wave packets $\{\varphi_t : \varphi_t \in \mathcal A^M_t,\ t \in \mathbf T\}$ there holds
$$\sum_{t \in \mathbf T} |\langle f, \varphi_t \rangle|^2 \lesssim 1. \tag{4.6}$$
Let us first fix some $\Omega \in \Omega(\mathbf T) := \{\Omega_t : t \in \mathbf T\}$ and consider the family

Notice that the collection defined in (4.12) is a finitely overlapping cover of the corresponding frequency sector. Furthermore the whole family has finite overlap, as the cones have finite overlap and, for a fixed cone, the family is Whitney in both of the remaining parameters. These geometric considerations are depicted in Figure 4.13 below.
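The frequency support of this operator is contained in the frequency sector defined in (4.10). By standard procedures of time-frequency analysis, as for example in [13, Section 6], the operator can be recovered by appropriate averages of operators of the form $f \mapsto \sum_{t \in \mathbf T} \langle f, \varphi_t \rangle \varphi_t$, where the $\varphi_t$ are adapted wave packets.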
Now for $2 < p < 4$ there is a choice of $1 < r < \frac{p}{2(p-2)}$ so that $\frac{p}{r(p-2)} > 2$. This means that the maximal theorem of Katz from [21] applies again, and so the proof of the upper bound is complete. The optimality is discussed in §8.1.

Let us now go back to the estimate (7.14). The left hand side of (7.20) contains a double Hilbert transform. By an iterated application of Lemma 7.21 we thus have

Denoting by $\tilde R_j$ the 2-translate of $R_j$ in the direction of $v_j$ we gather that $\{\tilde R_j : j = 1, \ldots, N\}$ is a pairwise disjoint collection. Furthermore if $H_j$ is the Hilbert transform in direction $v_j$, there holds $|H_j 1_{R_j}| \gtrsim 1_{\tilde R_j}$. Therefore for all $1 < p < \infty$

Self-duality of the square function estimate then entails the optimality of the estimate of Lemma 7.21.

8.2. Sharpness of the directional square function bound. In this subsection we prove that the bound of Theorem L is best possible, up to the doubly logarithmic terms. In particular we prove that the bound of Remark 6.1 is best possible. We begin by showing a lower bound for the rough square function estimate
$$\|\{P_F f\}\|_{L^p(\mathbb R^2;\ell^2_{\mathcal F})} \le \big\|\{P_F\} : L^p(\mathbb R^2) \to L^p(\mathbb R^2;\ell^2_{\mathcal F})\big\|\, \|f\|_p, \qquad 2 \le p < 4, \tag{8.3}$$
where the notations are as in §6. Now as in Fefferman's argument in [16] one can easily show that the estimate above implies the vector-valued inequality for directional averages, for directions corresponding to the directions of rectangles in $\mathcal F$. For this let $\#V = N$ where $V$ is the set of directions of rectangles in $\mathcal F$. Now consider functions $\{f_F\}_{F \in \mathcal F}$ with compact Fourier support; by modulating these functions we can assume that $\mathrm{supp}(\hat f_F) \subset B(c_F, \rho)$ for some $\rho > 1$ and $\{c_F\}_{F \in \mathcal F}$ a $100\rho$-net in $\mathbb R^2$. Then if $F$ is a rectangle centered at $c_F$ with short side 1 parallel to a direction $v_F \in V$ and long side of length $N$ parallel to $v_F^\perp$, we have that $|P_F f_F| = |A_F f_F|$, where $A_F$ is the corresponding averaging operator. Note that this is a single-scale average with respect to rectangles of dimensions $1 \times 1/N$ in the directions $v_F$, $v_F^\perp$ respectively. Since the frequency supports of these functions are well separated we gather that for all choices of signs $\varepsilon_F \in \{-1, 1\}$ we have

Thus applying (8.3) with the function $f$ as above and averaging over random signs we get
$$\|\{A_F f_F\}\|_{L^p(\mathbb R^2;\ell^2_{\mathcal F})} \le \big\|\{P_F\} : L^p(\mathbb R^2) \to L^p(\mathbb R^2;\ell^2_{\mathcal F})\big\|\, \|\{f_F\}\|_{L^p(\mathbb R^2;\ell^2_{\mathcal F})}, \qquad 2 \le p < 4.$$
Now we just need to note that, as in §8.1, a pointwise lower bound holds for these averages applied to the indicators of the rectangles used in the Besicovitch construction in §8.1. As before we get

For $p < 2$ the square function estimate (8.3) is known to fail even in the case of a single direction; see for example the counterexample in [31, §1.5].

One can use the same argument in order to show a lower bound for the norm of the smooth square function
$$\|\{P^\circ_F f\}\|_{L^p(\mathbb R^2;\ell^2_{\mathcal F})} \le \big\|\{P^\circ_F\} : L^p(\mathbb R^2) \to L^p(\mathbb R^2;\ell^2_{\mathcal F})\big\|\, \|f\|_p, \qquad 2 \le p < 4.$$
Indeed, following the exact same steps we can deduce a vector-valued inequality for smooth averages

where $\gamma_F$ is the smooth product bump function used in the definition of $P^\circ_F$ in §6. By a direct computation one easily shows the analogous lower bound for the rectangles of the Besicovitch construction, and this completes the proof of the lower bound for smooth projections as well.
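The step "averaging over random signs" in §8.2 is presumably an application of Khintchine's inequality; the display below is a sketch under that assumption, not a formula taken from the text. Here $\{a_F\}$ is a hypothetical finite family of complex numbers and $\{\varepsilon_F\}$ are independent random signs, each equal to $\pm 1$ with probability $1/2$.

```latex
\[
  \Big( \mathbb{E} \Big| \sum_{F} \varepsilon_F \, a_F \Big|^p \Big)^{1/p}
  \simeq_p
  \Big( \sum_{F} |a_F|^2 \Big)^{1/2},
  \qquad 0 < p < \infty .
\]
```

Applying this pointwise, with $a_F$ the value at a point $x$ of the $F$-th term of the randomized sum, and then integrating in $x$ turns the $L^p$ norm of the randomized sum into an $L^p(\mathbb R^2;\ell^2)$ norm, which is how a scalar estimate can yield a vector-valued one.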