On a spatially inhomogeneous nonlinear Fokker-Planck equation: Cauchy problem and diffusion asymptotics

We investigate the Cauchy problem and the diffusion asymptotics for a spatially inhomogeneous kinetic model associated to a nonlinear Fokker-Planck operator. We derive the global well-posedness result with instantaneous smoothness effect, when the initial data lies below a Maxwellian. The proof relies on the hypoelliptic analog of classical parabolic theory, as well as a positivity-spreading result based on the Harnack inequality and barrier function methods. Moreover, the scaled equation leads to the fast diffusion flow under the low field limit. The relative phi-entropy method enables us to see the connection between the overdamped dynamics of the nonlinearly coupled kinetic model and the correlated fast diffusion. The global in time quantitative diffusion asymptotics is then derived by combining entropic hypocoercivity, relative phi-entropy and barrier function methods.


Introduction
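We consider the kinetic Fokker-Planck operator $L_{FP} := \nabla_v \cdot (\nabla_v + v)$ and the spatially inhomogeneous nonlinear drift-diffusion model
$$(\partial_t + v \cdot \nabla_x) f(t,x,v) = \rho_f^\beta(t,x)\, L_{FP} f(t,x,v), \qquad f(0,x,v) = f_{in}(x,v), \tag{1-1}$$
where $\rho_f(t,x) := \int_{\mathbb{R}^d} f(t,x,v)\,dv$ is the local mass. Given a constant $\epsilon \in (0,1)$, the equation under the low field scaling $t \to \epsilon^2 t$, $x \to \epsilon x$ reads
$$(\epsilon\,\partial_t + v \cdot \nabla_x) f_\epsilon(t,x,v) = \frac{1}{\epsilon}\,\rho_{f_\epsilon}^\beta(t,x)\, L_{FP} f_\epsilon(t,x,v), \qquad f_\epsilon(0,x,v) = f_{\epsilon,in}(x,v). \tag{1-2}$$
Our aim is to show the global well-posedness and the trend to equilibrium with smoothness a priori estimates for (1-1), and the quantitative asymptotic dynamics of (1-2) as $\epsilon$ tends to zero.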
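As a sanity check on the scaling, the following short computation sketches how the scaled equation arises. It assumes that (1-1) has the form $(\partial_t + v\cdot\nabla_x) f = \rho_f^\beta L_{FP} f$ with $\rho_f := \int f\,dv$, which is what the nonlinear term $\rho_f^\beta L_{FP} f$ discussed in this introduction suggests; the rescaled unknown $f_\epsilon$ below is introduced only for illustration.

```latex
% Assumed form of (1-1):  (\partial_t + v\cdot\nabla_x) f = \rho_f^\beta L_{FP} f,  \rho_f = \int f\,dv.
% Rescaled unknown (illustrative):  f_\epsilon(t,x,v) := f(t/\epsilon^2, x/\epsilon, v).
\begin{align*}
\partial_t f_\epsilon(t,x,v) &= \epsilon^{-2}\,(\partial_t f)(t/\epsilon^{2}, x/\epsilon, v), \\
\nabla_x f_\epsilon(t,x,v) &= \epsilon^{-1}\,(\nabla_x f)(t/\epsilon^{2}, x/\epsilon, v), \\
\rho_{f_\epsilon}(t,x) &= \rho_f(t/\epsilon^{2}, x/\epsilon).
\end{align*}
% L_{FP} acts in v only, so it commutes with the change of variables.
% Evaluating the assumed equation at (t/\epsilon^2, x/\epsilon, v):
\begin{align*}
\epsilon^{2}\,\partial_t f_\epsilon + \epsilon\, v\cdot\nabla_x f_\epsilon
  &= \rho_{f_\epsilon}^{\beta}\, L_{FP} f_\epsilon, \\
\text{hence}\qquad
(\epsilon\,\partial_t + v\cdot\nabla_x) f_\epsilon
  &= \frac{1}{\epsilon}\,\rho_{f_\epsilon}^{\beta}\, L_{FP} f_\epsilon .
\end{align*}
```

The factor $\epsilon$ in front of the time derivative and the factor $1/\epsilon$ in front of the collision term are the ones interpreted physically in Section 1C.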
1A. Main results. Let us recall that a classical solution of an evolution equation is a nonnegative function satisfying the equation pointwise everywhere and matching the initial data continuously. Unless otherwise specified, any solution we consider below is intended in the classical sense. For $k \in \mathbb{N}$, define $C^k(\Omega)$ to be the set of functions having all derivatives of order less than or equal to $k$ continuous in the domain $\Omega$.
For $\alpha \in (0,1)$, we note that $C^\alpha(\Omega)$ is the classical Hölder space on $\Omega$ with exponent $\alpha$. In addition, we define the measure $dm := dx\,d\mu$, where $\mu(v) := (2\pi)^{-d/2} e^{-|v|^2/2}$ and $d\mu := \mu\,dv$ denote the Gaussian function and the Gaussian measure, respectively. A function that takes the form $C\mu(v)$ for some constant $C > 0$ is called a Maxwellian.
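The Gaussian $\mu$ is the steady state of $L_{FP}$, a fact used repeatedly in what follows. A short verification using only the definitions above is sketched below; the function $h$ is an arbitrary smooth test function with sufficient decay, introduced here for illustration.

```latex
% \mu(v) = (2\pi)^{-d/2} e^{-|v|^2/2}  implies  \nabla_v \mu = -v\,\mu.
\begin{align*}
L_{FP}\,\mu &= \nabla_v\cdot(\nabla_v \mu + v\,\mu) = \nabla_v\cdot(-v\,\mu + v\,\mu) = 0, \\
L_{FP}(\mu h) &= \nabla_v\cdot(\mu\,\nabla_v h + h\,\nabla_v\mu + v\,\mu\, h)
              = \nabla_v\cdot(\mu\,\nabla_v h)
              = \mu\,(\Delta_v h - v\cdot\nabla_v h).
\end{align*}
% Multiplying by h and integrating by parts in v:
\[
\int_{\mathbb{R}^d} h\, L_{FP}(\mu h)\,dv = -\int_{\mathbb{R}^d} |\nabla_v h|^2\,d\mu \le 0 .
\]
```

The dissipation on the right vanishes exactly when $h$ is constant in $v$, that is, when $\mu h$ is a multiple of $\mu$ at each $(t,x)$.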
Theorem 1.1. Let the space domain $\Omega_x$ be equal to $\mathbb{T}^d$ or $\mathbb{R}^d$ and the constants $0 < \lambda < \Lambda$ be given.
(i) […], then there exists a solution $f$ to the Cauchy problem (1-1) such that $0 \le f \le \Lambda\mu$ in $\mathbb{R}_+ \times \Omega_x \times \mathbb{R}^d$. Moreover, for any $\nu \in (0,1)$, $k \in \mathbb{N}$, and any compact subset $K \subset (0,T] \times \Omega_x$, there is some constant $C_{T,\nu,k,K} > 0$ depending only on $d, \beta, \lambda, \Lambda, T, \nu, k, K$, and the initial data such that […] Additionally, if $f_{in}$ is Hölder continuous and $\rho_{f_{in}} \ge \lambda$ in $\Omega_x$, then the solution that lies below any Maxwellian is unique.
(ii) For $\Omega_x = \mathbb{T}^d$, if the initial data satisfies $\lambda\mu \le f_{in} \le \Lambda\mu$ in $\mathbb{T}^d \times \mathbb{R}^d$, then, for any $k \in \mathbb{N}$, there exists some constant $c > 0$ depending only on $d, \beta, \lambda, \Lambda$ and some constant $C_k > 0$ depending additionally on $k$ such that, for any $t \ge 1$, […] For $\Omega_x = \mathbb{R}^d$, if the initial data satisfies $\lambda\mu \le f_{in} \le \Lambda\mu$ in $\mathbb{R}^d \times \mathbb{R}^d$ and $f_{in} - M_1\mu \in L^1(\mathbb{R}^d \times \mathbb{R}^d)$ for some constant $M_1 > 0$, then there is some constant $C' > 0$ depending only on $d, \beta, \lambda, \Lambda, M_1$ such that […]
Remark 1.2. If the general measurable initial data $f_{in}$ satisfies $f_{in} \le \Lambda\mu$ and an extra locally uniform lower bound assumption (see (4-14) below for a precise description), the existence of solutions still holds in some weak sense, as pointed out in Remark 4.9 below.
In order to describe the diffusion asymptotics of (1-2), we introduce the (Bregman) distance characterized by the relative phi-entropy functional $H_\beta$. […]
Theorem 1.4. Let the constants $\alpha_0 \in (0,1)$ and $0 < \lambda < \Lambda$ be given, and consider a sequence of functions […] Let $f_\epsilon$ be the solution to (1-2) associated with the initial data $f_{\epsilon,in}$.
1B1. Cauchy problem of the nonlinear model. The well-posedness of the nonlinear model (1-1) was first studied in [Imbert and Mouhot 2021], in a framework mixing Hölder and Sobolev spaces on the torus, and in [Liao et al. 2018] under the regime of perturbation to the global equilibrium in the whole space. We develop well-posedness with rough initial data by combining the hypoelliptic analog of the parabolic theory with a positivity-spreading result; in particular, the technique we employ allows us to drop the smallness and lower bound assumptions on the initial data, as reflected in Theorem 1.1. In addition, the global behavior of solutions to (1-1) is derived under the assumption of upper and lower bounds on the initial data only.
When the drift-diffusion coefficient $\rho_f^\beta$ in (1-1) is proportional to the local mass of the solution, that is, when $\beta = 1$, equations (1-1) and (1-2) have the same quadratic homogeneity as the Landau equation, but simpler global bounds and conservation laws. Due to the complex structure of the Landau equation, most of the existing results for its classical solutions concern either the global theory in the near-Maxwellian equilibrium regime [Guo 2002; Kim et al. 2020] or the local well-posedness associated with low regularity and nonperturbative initial data [Henderson et al. 2019; 2020a]. By contrast, the boundedness from above and from below by Maxwellians of the initial data is preserved along time for the solutions to (1-1) and (1-2), and the lack of conservation of momentum and energy of (1-2) reduces its hydrodynamic limit to the fast diffusion flow (1-3) rather than the Navier-Stokes dynamics of the scaling limit of the Landau equation, which makes its Cauchy problem and global behavior more tractable in a very general setting.
To address the nonlinear Cauchy problem subject to the single requirement that the initial data lies below a Maxwellian, we propose a method that involves several ingredients. First, in Section 3 we carry out a preliminary study on the linear counterpart of (1-1), that is, the Cauchy problem associated to the Kolmogorov operator […] (1-4) where the coefficients, including the entries of the positive definite $d \times d$ real symmetric matrix $A$ and the $d$-dimensional vector $B$, are Hölder continuous ($B$ is not necessarily bounded over $v \in \mathbb{R}^d$). Even if the well-posedness theory for the Cauchy problem associated to the linear operator (1-4) was already well developed in some sense in the existing literature (see [Anceschi and Polidoro 2020; Manfredini 1997]), the Hölder spaces (see Definition 2.3) considered in those works are different from those studied in [Imbert and Mouhot 2021; Imbert and Silvestre 2021] (see Definition 2.1), which are the ones we study. Indeed, in contrast to [Imbert and Mouhot 2021], the (Schauder-type) a priori estimates proved in the previous literature are weaker and not appropriate for bootstrap arguments proving higher regularity for nonlinear problems (see Section 4C).
Secondly, the treatment of the existence issue for (1-1) in Hölder spaces is based on a fixed-point argument, where the compactness is provided by hypoelliptic regularization results; see Section 4B. A breakthrough on such a priori estimates for spatially inhomogeneous kinetic equations with a quasilinear diffusive structure in velocity was obtained in [Golse et al. 2019] and [Henderson and Snelson 2020; Imbert and Mouhot 2021], where the authors prove the kinetic (hypoelliptic) counterparts of the De Giorgi-Nash-Moser theory and the Schauder theory for classical elliptic equations (see for instance [Gilbarg and Trudinger 2001]), respectively. One may refer to [Mouhot 2018] for a summary. Armed with the Schauder estimate developed in [Imbert and Mouhot 2021] in kinetic Hölder spaces and the bootstrap procedure developed in [Imbert and Silvestre 2022] adapted to this case, we are then able to derive instantaneous $C^\infty$ regularization for the solutions to (1-1) in Section 4C, provided that the solution is bounded from above and bounded away from vacuum, which guarantees the ellipticity in the velocity variable for (1-1).
Thirdly, in order to remove the lower bound assumption on the initial data, in Section 4A we establish a self-generating lower bound result showing that the positivity of solutions spreads everywhere instantaneously. Its proof is based on repeated applications of the spreading of positivity forward in time (see Lemma 4.5) and the spreading for all velocities (see Lemma 4.6), as proposed in [Henderson et al. 2020b]. On the one hand, the barrier function argument will be used in the same spirit as [Henderson et al. 2020b] to show Lemma 4.5. Indeed, a lower (resp. upper) barrier for a certain equation is a subsolution (resp. supersolution) of the equation which bounds its solution from below (resp. above) on the boundary; it then follows from the maximum principle that the barrier function serves as a lower (resp. upper) bound of the solution. On the other hand, combining the local Harnack inequality obtained in [Golse et al. 2019] with the construction of a Harnack chain yields Lemma 4.6. We remark that the idea of the Harnack chain was first used in [Moser 1964], and an example of its application to Kolmogorov equations can be found in [Anceschi et al. 2019]. Essentially, the spreading of positivity can be seen as a lower bound estimate of the fundamental solution, which is thus related to the result in [Henderson et al. 2019], where the authors applied a probabilistic method.
A subtle point of the lower bound result lies in the possible degeneracy of solutions as $t \to 0^+$ or $t \to \infty$, which leads to two delicate issues. First, with the same difficulty as mentioned in [Henderson et al. 2020a], in order to prove the uniqueness for the Cauchy problem (1-1), the nondegeneracy of diffusion up to the initial time is required so that the a priori estimates remain applicable. We remark that, generally speaking, deriving uniqueness of solutions to nonlinear equations in rough spaces is a classical difficulty, and the presence of a vacuum sometimes gives rise to nonuniqueness phenomena even for the limiting equation (1-3); see for instance [Daskalopoulos and Kenig 2007]. Under the additional assumptions of Hölder continuity and absence of vacuum on the initial data, we achieve the uniqueness by using a scaling argument and Grönwall's lemma, since the Hölder estimate around the initial time implies that the integrand in the Grönwall-type inequality becomes integrable with respect to the time variable; see the proof of Proposition 4.11 for more details. Second, we are only able to show the convergence to equilibrium if the drift-diffusion coefficient $\rho_f^\beta$ decays slower than $t^{-1}$ as $t \to \infty$ in Proposition 5.1. Therefore, an additional lower Maxwellian bound on the initial data is imposed in Theorem 1.1(ii) and Theorem 1.4(ii) to ensure that the solutions stay away from the vacuum uniformly in time. We expect that such an additional lower bound assumption could be removed, especially when $\beta$ is small.
1B2. Long time behavior. The drift-diffusion operator $L_{FP}$ acts only on the velocity variable and ceases to be dissipative on its unique steady state $\mu$, which also ensures that the null space of $L_{FP}$ is spanned by $\mu$ and the conservation law of mass is satisfied. Consequently, the convergence to equilibrium is to be expected. With the help of the global smoothness a priori estimates, we are able to pass from the
exponential convergence to equilibrium in the $L^2$-framework to the uniform convergence in $C^\infty$ in Section 5A, when the spatial domain is compact, that is, the periodic box $\mathbb{T}^d$. Therein, the $L^2$-convergence is obtained by the $L^2$-hypocoercivity under a macro-micro (fluid-kinetic) decomposition scheme, which suggests the construction of some proper entropy (Lyapunov) functional that would provide an equivalent $L^2$-norm for solutions. The key ingredient is to control the macroscopic part by means of the microscopic part in view of the decomposition. This hypocoercive theory was studied in [Esposito et al. 2013; Dolbeault et al. 2015; Hérau 2018] via different approaches, while their ideas are essentially the same. In [Esposito et al. 2013], the authors intended to develop the nonlinear energy estimate in an $L^2$-to-$L^\infty$ framework. In [Dolbeault et al. 2015] and [Hérau 2018], the authors studied the $L^2$-hypocoercivity theory in an abstract setting and in the framework of pseudodifferential calculus, respectively. In addition, if the spatial domain is $\mathbb{R}^d$, meaning that it is not confined to a compact region, then the convergence rate slows down to an algebraic decay, for which the hypocoercive theory was developed in [Bouin et al. 2020]. We remark that the $L^2$-framework allows us to avoid some difficulties from the nonlinearity of the operator $\rho_f^\beta L_{FP} f$, in contrast with $H^1$-entropic hypocoercivity methods (see for instance [Villani 2009]).
1B3. Diffusion asymptotics. The diffusion approximation serves as a simplification of collisional kinetic equations when the mean-free path is much smaller than the typical length of observation in a long time scale. This approximation for linear Fokker-Planck models can be traced back to [Degond and Mas-Gallic 1987], where the authors applied the Hilbert expansion method. One is also able to achieve the diffusion limit for (1-2) in some weak sense by applying a similar strategy to the one given in [El Ghani and Masmoudi 2010]. However, weak convergence is sometimes not appropriate for application, as a precise description of the convergence is not given. Still, the nonlinearity of the term $\rho_{f_\epsilon}^\beta L_{FP} f_\epsilon$ in (1-2) associated with nonperturbative initial data reveals some difficulties when deriving a quantitative convergence.
In order to overcome this difficulty, in Section 5B we will rely on the phi-entropy of solutions relative to their limit to see the finite-time asymptotics on the torus. The relative entropy method, which heavily relies on the regularity of solutions to the target equation, has become an effective tool in the study of hydrodynamic limits since [Bardos et al. 1993; Yau 1991] (see also [Saint-Raymond 2009]). The method applied to the diffusion asymptotics of the kinetic Fokker-Planck equation of the type with linear diffusion can be found in [Markou 2017]. The so-called phi-entropy (relative to the global equilibrium) was used to study the convergence of certain kinds of Fokker-Planck equations; see for instance [Arnold et al. 2001; Dolbeault and Li 2018]. Finally, combining the barrier function method with a careful treatment of the regularity estimate of the target equation enables us to deal with the asymptotic dynamics for the cases associated with general Hölder continuous initial data.
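The formal mechanism behind the diffusion limit discussed in this subsection can be sketched with a Hilbert expansion. The computation below is only a heuristic and not the quantitative argument of the paper: it assumes that (1-2) reads $(\epsilon\partial_t + v\cdot\nabla_x) f_\epsilon = \epsilon^{-1}\rho_{f_\epsilon}^\beta L_{FP} f_\epsilon$, that the expansion $f_\epsilon = f_0 + \epsilon f_1 + O(\epsilon^2)$ is valid, and that the limiting density $\rho$ is positive. The symbols $f_0$, $f_1$, and $c$ are illustrative names.

```latex
% Formal Hilbert expansion (heuristic sketch under the assumptions stated above).
% Ansatz: f_\epsilon = f_0 + \epsilon f_1 + O(\epsilon^2),  \rho := \int f_0\,dv.
\begin{align*}
O(\epsilon^{-1}):&\quad \rho^{\beta} L_{FP} f_0 = 0
   \;\Longrightarrow\; f_0 = \rho(t,x)\,\mu(v), \\
O(\epsilon^{0}):&\quad v\cdot\nabla_x(\rho\,\mu) = \rho^{\beta} L_{FP} f_1
   \;\Longrightarrow\; f_1 = -\rho^{-\beta}\,(v\cdot\nabla_x\rho)\,\mu + c(t,x)\,\mu,
\end{align*}
% where we used L_{FP}(v_i\,\mu) = \partial_{v_i}\mu = -v_i\,\mu and L_{FP}\mu = 0.
% Mass conservation: integrating the equation in v and using \int L_{FP} f\,dv = 0,
\[
\partial_t \rho_{f_\epsilon} + \frac{1}{\epsilon}\,\nabla_x\cdot\int_{\mathbb{R}^d} v\, f_\epsilon\,dv = 0,
\qquad
\int v\, f_0\,dv = 0,\qquad
\int v\, f_1\,dv = -\rho^{-\beta}\,\nabla_x\rho ,
\]
% since \int v_i v_j\,\mu\,dv = \delta_{ij} and \int v\,\mu\,dv = 0. Letting \epsilon \to 0 formally:
\[
\partial_t\rho = \nabla_x\cdot\big(\rho^{-\beta}\,\nabla_x\rho\big).
\]
```

For $0 < \beta < 1$ the right-hand side equals $(1-\beta)^{-1}\Delta_x \rho^{1-\beta}$, a nonlinearity of fast diffusion type, which is consistent with the fast diffusion flow referred to as (1-3).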
1C. Physical motivation. The spatially inhomogeneous Fokker-Planck equation (1-1) arises from modeling the evolution of some system of a large number of interacting particles from the statistical mechanical point of view. These models appear for instance in the study of plasma physics and biological dynamics; see [Chavanis 2008; Villani 2002]. Its solution can be interpreted as the probability density of the particles lying at the position $x$ at time $t$ with velocity $v$. The scaled model (1-2) for small $\epsilon$ describes the evolution of the particle density in the small mean-free path and long-time regime, where the nondimensional parameter $\epsilon \in (0,1)$ designates the ratio between the mean-free path (microscopic scale) and the typical macroscopic length. The limiting equation (1-3) characterizes its macroscopic dynamics.
From the perspective of a stochastic process $\{(X_t, V_t) : t \ge 0\}$ driven by a Brownian motion […] the dual equation describing the dynamics of $\{(X_t, V_t) : t \ge 0\}$ is given by (1-1); see the review paper [Chandrasekhar 1943]. Indeed, the nonlinear term $\rho_f^\beta L_{FP} f$ models the collisional interaction of the particles, where the mobility of these particles is hampered by their aggregation. More precisely, the nonlinear dependence on the drift-diffusion coefficient $\rho_f^\beta$ translates the fact that the effect of friction in the interaction is positively correlated to the local mass of particles occupying the position $x$ at time $t$. Moreover, the low field scaling $t \to \epsilon^2 t$, $x \to \epsilon x$ of (1-1) formally implies (1-2). As $\epsilon$ tends to zero, its spatial diffusion phenomena are characterized by (1-3).
Regarding its physical interpretation, we point out that the factor multiplying the time derivative in (1-2) takes into account the long time scale. The inverse of the factor multiplying $\rho_{f_\epsilon}^\beta L_{FP} f_\epsilon$ stands for the scaled average distance traveled by particles between each collision, and it is usually referred to as the mean-free path. In the small mean-free path regime, it was noticed in [Chandrasekhar 1943] that the spatial variation occurs significantly only under the long time scale that is consistent with the particle motion. In such an overdamped process, also called a low field limit or diffusion limit, the statistics of the particle motion translates into the macroscopic behavior of the particle system.
Finally, we recall that the associated phi-entropy introduced in Definition 1.3 is also known as Tsallis entropy in the physics community, which generalizes the Boltzmann-Gibbs entropy (the phi-entropy with $\beta = 1$) in nonextensive statistical mechanics [Tsallis 1988]. It gives some hints for the formulation of the correlated diffusion, where the index $\beta$ measures the degree of nonextensivity and nonlocality of the system; see [Tsallis 2009].
1D. Organization of the paper. The article is organized as follows. In Section 2, we recall some basic notions related to kinetic Hölder spaces that are adapted to the Fokker-Planck equations. Section 3 is devoted to the study of the linear Fokker-Planck equation with Hölder continuous coefficients. The well-posedness result, Theorem 1.1(i), is proved in Section 4. The asymptotic behaviors, including Theorem 1.1(ii) and Theorem 1.4, are proved in Section 5.

Preliminaries
This section is devoted to basic notation, including the invariant structure and the kinetic Hölder space for the equations we are concerned with. The usual parabolic scaling and translations are not the invariant transformations associated with the Kolmogorov operator $L_1$ (see (1-4)); they are replaced by the kinetic scaling and the Galilean transformations, respectively. It then turns out that the appropriate Hölder space as well as its norm should be adapted to the new scaling and transformation.
2A. The geometry associated to Kolmogorov operators. Let $z := (t,x,v) \in \mathbb{R} \times \mathbb{R}^d \times \mathbb{R}^d$. We define the kinetic scaling $S_r(t,x,v) := (r^2 t, r^3 x, r v)$ for $r > 0$ and the Galilean transformation $z_0 \circ z := (t_0 + t,\ x_0 + x + t v_0,\ v_0 + v)$ for $z_0 = (t_0, x_0, v_0)$. With respect to the product $\circ$, we are able to define the inverse of $z$ as $z^{-1} := (-t, -x + tv, -v)$. In view of this structure of scaling and transformation, it is natural to define the cylinder centered at the origin of radius $r > 0$ as $Q_r := (-r^2, 0] \times B_{r^3}(0) \times B_r(0)$.
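The group structure can be checked by hand. The sketch below assumes the standard Galilean composition law $z_0 \circ z = (t_0 + t,\ x_0 + x + t v_0,\ v_0 + v)$, which is the one compatible with the inverse $z^{-1} = (-t, -x + tv, -v)$ stated above; the function $g$ is an illustrative name for the translated solution.

```latex
% Assumed composition law:  z_0 \circ z = (t_0 + t,\; x_0 + x + t\,v_0,\; v_0 + v).
\begin{align*}
z^{-1}\circ z &= \big(-t + t,\; (-x + t v) + x + t(-v),\; -v + v\big) = (0,0,0), \\
z\circ z^{-1} &= \big(t - t,\; x + (-x + t v) + (-t)\,v,\; v - v\big) = (0,0,0), \\
S_r(z_0\circ z) &= \big(r^2(t_0+t),\; r^3(x_0 + x + t v_0),\; r(v_0+v)\big)
                = S_r(z_0)\circ S_r(z).
\end{align*}
% Left-invariance of the transport operator: for g(z) := f(z_0 \circ z),
\begin{align*}
\partial_t g(z) &= (\partial_t f)(z_0\circ z) + v_0\cdot(\nabla_x f)(z_0\circ z), \\
(\partial_t + v\cdot\nabla_x)\, g(z) &= \big(\partial_t f + (v_0 + v)\cdot\nabla_x f\big)(z_0\circ z),
\end{align*}
% and v_0 + v is exactly the velocity component of z_0 \circ z.
```

Derivatives in $v$ are unaffected by the left translation, which is why second-order operators in velocity keep their form under $z \mapsto z_0 \circ z$.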
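More generally, the cylinder centered at $z_0 = (t_0, x_0, v_0)$ with radius $r$ is defined by $Q_r(z_0) := \{ z_0 \circ z : z \in Q_r \} = \{ (t,x,v) : t_0 - r^2 < t \le t_0,\ |x - x_0 - (t - t_0) v_0| < r^3,\ |v - v_0| < r \}$. Roughly speaking, for fixed $z_0 \in \mathbb{R}^{1+2d}$, the Kolmogorov operator $L_1$ is invariant under the kinetic scaling and left-invariant under the Galilean transformation. It means that if $f$ is a solution to the equation […] in $Q_r(z_0)$, then the rescaled function $z \mapsto f(z_0 \circ S_r(z))$ solves an equation of the same ellipticity class in $Q_1$.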
In addition, the associated quasinorm $\|\cdot\|$ is defined by […] Any polynomial $p \in \mathbb{R}[t,x,v]$ can be uniquely written as a linear combination of monomials, and its kinetic degree $\deg_{kin}(p)$ is defined by the maximal kinetic degree of the monomials appearing in $p$. This definition is justified by the fact that $p(S_r(z)) = r^{\deg_{kin}(p)}\, p(z)$ whenever $p$ is a monomial.
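A small worked example may help fix the notion of kinetic degree. It assumes the usual convention that a monomial $t^{a} x^{b} v^{c}$ (with multi-indices $b$, $c$) has kinetic degree $2a + 3|b| + |c|$, which is the only assignment compatible with the homogeneity under $S_r$ stated above; the polynomials $m$ and $p$ are illustrative.

```latex
% Assumed convention: \deg_{kin}(t^{a} x^{b} v^{c}) = 2a + 3|b| + |c|.
\begin{align*}
m(z) &= t\,x_1\,v_2, & \deg_{kin}(m) &= 2 + 3 + 1 = 6, \\
m(S_r(z)) &= (r^2 t)(r^3 x_1)(r v_2) = r^{6}\, m(z), \\
p(z) &= 1 + v_1 + t + v_1 v_2, & \deg_{kin}(p) &= \max\{0, 1, 2, 2\} = 2 .
\end{align*}
```

In particular $t$ counts like two velocity derivatives and $x$ like three, so polynomials of kinetic degree below $2$ contain no $t$ or $x$ terms.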
Definition 2.1. Let the constant $\alpha > 0$ and the open subset […] (2-1) If this property holds for any $z_0, z$ on each compact subset of $\Omega$, then we say $f \in C^\alpha_l(\Omega)$. If the constant $C$ in (2-1) is uniformly bounded for any $z_0, z \in \Omega$, we define the smallest value of $C$ as the seminorm […], where we additionally define $C^0_l(\Omega) := C^0(\Omega)$, the space of continuous functions on $\Omega$, with the norm […]
[…] $C^\alpha_l$-continuity is equivalent to the standard definition of $C^\alpha$-continuity with respect to the distance $\|\cdot\|$. The subscript "l" of $C_l$ stems from the definition of Hölder continuity above, which is given in terms of a left-invariant distance with respect to the group structure of $\circ$.
We also mention another kind of Hölder space suitable for the study of Kolmogorov operators that was first used in [Manfredini 1997].
[…] and is equipped with the norm […]
The consistency between these two definitions is given by [Imbert and Silvestre 2021, Lemma 2.7] (see also [Imbert and Mouhot 2021, Lemma 2.4]), a result that we state here.
Remark 2.6. A subtle difference between $C^2_l$ and $C^2_{kin}$ comes from the fact that, for $f \in C^2_l$, we have […]
We will also employ the following notions of weighted Hölder norms in Section 3.
2C. Other notation. Throughout the article, $B_R$ denotes the Euclidean ball in $\mathbb{R}^d$ centered at the origin with radius $R > 0$. We employ the Japanese bracket defined as $\langle \cdot \rangle := (1 + |\cdot|^2)^{1/2}$. By abuse of notation, $\langle \cdot \rangle$ will also denote the velocity mean in Section 5. Moreover, we assume $0 < \lambda < \Lambda$. We denote by $C$ a universal constant, that is to say, a constant depending only on $\beta, d, \lambda, \Lambda, \alpha, \sigma, \alpha_0$ specified in context. Finally, we write $X \lesssim Y$ to mean that $X \le CY$ for some universal constant $C > 0$, and $X \lesssim_q Y$ to mean that $X \le C_q Y$ for some $C_q > 0$ depending only on universal constants and the quantity $q$.

Kolmogorov-Fokker-Planck equation
This section is devoted to the study of the Cauchy problem associated to the operator (1-4), […] where the $d \times d$ symmetric matrix $A(t,x,v)$ and the $d$-dimensional vector $B(t,x,v)$ satisfy the condition […] where $\alpha \in (0,1)$ and the norm $\|\cdot\|_{C^\alpha_l(\Omega)}$ of a matrix denotes the sum of the norms of its entries. The boundedness condition at infinity means that the solution shall be bounded, which is intended for the validity of the maximum principle; see the proof of Lemma A.1 below.
The aim of this section is to solve the Cauchy problem (3-1) by virtue of the weighted Hölder norm (Definition 2.7) and by means of the standard continuity method combined with Schauder-type estimates. One may refer to [Gilbarg and Trudinger 2001, Subsection 6.5] for the corresponding treatment in classical elliptic theory.
Throughout this section we work with the domain $\Omega := (0,T] \times \mathbb{R}^d \times \mathbb{R}^d$, with $T \in \mathbb{R}_+$. We point out that all of the results below can be restricted to $(0,T] \times \mathbb{T}^d \times \mathbb{R}^d$ whenever required.
3A. Schauder estimates. In order to apply the continuity method, first of all one needs to prove a global a priori estimate for solutions to (3-1) with respect to the weighted Hölder norm. In the kinetic setting, we have at our disposal the interior Schauder estimates proved in [Imbert and Mouhot 2021, Theorem 3.9].
Proposition 3.1 (interior Schauder estimate). Let the constant $\alpha \in (0,1)$ be given and the cylinder $Q_{2r}(z_0)$ be a subset of $\Omega$ with $r$ […] (3-3) In particular, the right-hand side controls $r^2$ […]
First of all, we enhance this result to a global estimate for the Cauchy problem (3-1) under a vanishing condition for the initial data.
[…] $< \infty$, and $f$ be a bounded solution to the Cauchy problem (3-1) under condition (3-2) in $\Omega$. If the initial data $f_{in}$ equals 0, then we have […]
Proof. In view of Proposition 3.1, it suffices to deal with the estimates around the initial time. Without loss of generality, we assume $T \le 1$. Let $z_0 = (t_0, x_0, v_0) \in \Omega$ and $2r = t_0^{1/2}$. Applying the interior Schauder estimate (3-3) yields […] It then follows from the arbitrariness of $z_0$ that, for any $\sigma < 2$ such that […] we apply the maximum principle (Lemma A.1) to the function […] Combining this estimate with (3-4), we get the desired result. □
3B. Cauchy problem for the linear equation. The goal of this subsection is to prove the well-posedness of the Cauchy problem (3-1) with Hölder continuous coefficients. […]
Remark 3.4. In contrast with (3-2), condition (3-5) is weaker, which allows the coefficients of $B$ to not necessarily be bounded globally. This fact will be applied to the Ornstein-Uhlenbeck operator […]
The simplest possible setting of (3-1) under condition (3-5) is recovered by choosing $A = I$ and $B = 0$, which turns out to be the classical Kolmogorov operator $L_0 := \partial_t + v \cdot \nabla_x - \Delta_v$. This operator was first studied in [Kolmogoroff 1934], where its fundamental solution was calculated explicitly as […] (3-6) One can easily see that $\Gamma$ is smooth outside of its pole (the origin). In fact, in this latter case the following result holds.
[…] is the unique bounded solution in $C^{2+\alpha}_l(\Omega)$ to (3-1) with $L_1$ replaced by $L_0$ and $f_{in} = 0$.
Remark 3.6. When the spatial domain is $\mathbb{T}^d$, one can apply Green's function […] which is well defined due to the decay of $\Gamma$.
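For orientation, the classical closed form of the fundamental solution of $L_0 = \partial_t + v\cdot\nabla_x - \Delta_v$ with pole at the origin is recalled below. It is written from the standard literature (it is the Gaussian law of the pair position-velocity when the velocity is $\sqrt{2}$ times a Brownian motion) rather than copied from (3-6), so the presentation in the original display may differ.

```latex
% Fundamental solution of L_0 = \partial_t + v\cdot\nabla_x - \Delta_v (pole at the origin):
\[
\Gamma(t,x,v) = \Big(\frac{\sqrt{3}}{2\pi t^{2}}\Big)^{d}
  \exp\!\Big(-\frac{3\,\big|x - \tfrac{t}{2}v\big|^{2}}{t^{3}} - \frac{|v|^{2}}{4t}\Big)
  \quad\text{for } t > 0,
\qquad \Gamma(t,x,v) = 0 \quad\text{for } t \le 0 .
\]
% Behaviour under the kinetic scaling S_r(t,x,v) = (r^2 t, r^3 x, r v):
\[
\Gamma(S_r(z)) = r^{-4d}\,\Gamma(z),
\]
% consistent with the Jacobian r^{3d}\cdot r^{d} of (x,v) \mapsto (r^3 x, r v)
% and with \int \Gamma(t,x,v)\,dx\,dv = 1 for every t > 0.
```

The Gaussian decay in $x$ for each fixed $t > 0$ is what makes a periodization over the lattice $\mathbb{Z}^d$ converge, in line with Remark 3.6.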
We are now in a position to apply the standard continuity method to derive Proposition 3.3.
Proof of Proposition 3.3. We split the proof into three steps. In the first step, we establish the case for vanishing initial data under the stronger assumption (3-2). We point out that the assumption on the coefficient $B$ can be weakened in the second step. Finally, we deal with general continuous initial data.
Step 1. Assume $f_{in} = 0$ and condition (3-2) holds. Let the constant $\sigma \in (0,2)$ be fixed and consider the Banach space […] In particular, every function lying in $Y$ vanishes at $t = 0$. For $\tau \in [0,1]$, we define the operator $L_\tau := (1-\tau)L_0 + \tau L_1$, which can be written in the form […] where its coefficients $A_\tau := (1-\tau)I + \tau A$ and $\tau B$ still satisfy condition (3-2) (with $\lambda$ and $\Lambda$ replaced by $\min\{1,\lambda\}$ and $\max\{1,\Lambda\}$, respectively). For any $w \in Y$, we have […] (3-9) By Lemma 3.5, we see that $0 \in \mathcal{I}$; in particular, $\mathcal{I}$ is not empty. It now suffices to show that $1 \in \mathcal{I}$. Pick $\tau_0 \in \mathcal{I}$. Then the global Schauder estimate provided by Proposition 3.2 implies that, for any […] (3-10) For any $w \in Y$, since $\tau_0 \in \mathcal{I}$ and (3-8) holds, the following Cauchy problem is solvable for any […] Thus, we can define the mapping $F : Y \to Y$ by setting $F(w) = f$. Armed with (3-10) and (3-8), there exists a universal constant $C > 0$ such that, for any $u, w \in Y$, […] Hence $F$ is a contraction mapping, provided that $|\tau - \tau_0| \le \delta := (2C)^{-1}$. Then, $F$ has a unique fixed point $f \in Y$, which is the unique bounded solution to the Cauchy problem (3-9) in $Y$. By dividing the interval $[0,1]$ into subintervals of length less than $\delta$, we conclude that $1 \in \mathcal{I}$.
Step 2. If $f_{in} = 0$ and condition (3-5) holds, we approximate the coefficient $B$ by $B_n := B\varrho_n$, where […] Then, for each $n \in \mathbb{N}_+$, the result obtained in the previous step provides a bounded solution $f_n$ to (3-1) with $B$ replaced by $B_n$. Indeed, applying the maximum principle (Lemma A.1) to the function $\pm f_n - e^t \sup|s|$ implies $\sup|f_n| \le e^T \sup|s|$. Thanks to the interior Schauder estimate (Proposition 3.1), for any compact subset $K \subset \Omega$, we have that $\{f_n\}_{n \ge N}$ is precompact in $C^2_{kin}(K)$, provided that $N$ (depending on $K$) is large enough. Sending $n \to \infty$ in the equation satisfied by $f_n$ yields that the limit function […]
Step 3. For general […], with the source term equal to $s - L_1 f^\varepsilon_{in}$, and associated with the vanishing initial data. The procedure presented in the previous steps ensures a unique bounded solution $f^\varepsilon$ to (3-1) for each $f^\varepsilon_{in}$. The uniform convergence of $\{f^\varepsilon_{in}\}$ and the maximum principle (Lemma A.1) imply the uniform convergence of $\{f^\varepsilon\}$. We may denote its limit by […] Its uniqueness is again given by the maximum principle. This concludes the proof. □
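The contraction step in Step 1 of the proof above can be spelled out as follows. This is a sketch under stated assumptions: $\|\cdot\|_Y$ denotes the norm of the Banach space $Y$, $s$ denotes the source term, and $C$ is assumed to bound the composition of the solution operator of $L_{\tau_0}$ with $L_0 - L_1$, as supplied by (3-10) and (3-8); the precise norms are those of the original displays.

```latex
% For \tau, \tau_0 \in [0,1]:  L_\tau = L_0 + \tau (L_1 - L_0), so
%   L_\tau = L_{\tau_0} + (\tau - \tau_0)(L_1 - L_0).
% Hence L_\tau f = s is equivalent to
\[
L_{\tau_0} f = s + (\tau - \tau_0)\,(L_0 - L_1) f .
\]
% Define F(w) := f, where f \in Y solves L_{\tau_0} f = s + (\tau - \tau_0)(L_0 - L_1) w.
% By linearity, F(u) - F(w) solves the same problem with source (\tau - \tau_0)(L_0 - L_1)(u - w), so
\[
\|F(u) - F(w)\|_{Y} \le C\,|\tau - \tau_0|\,\|u - w\|_{Y}
  \le \tfrac12\,\|u - w\|_{Y}
  \qquad\text{whenever } |\tau - \tau_0| \le \delta := (2C)^{-1}.
\]
% Since \delta does not depend on \tau_0, at most \lceil 1/\delta \rceil steps of length \delta
% lead from \tau = 0 to \tau = 1.
```

A fixed point of $F$ is exactly a solution of $L_\tau f = s$ in $Y$, and the uniformity of $\delta$ in $\tau_0$ is what allows the argument to be iterated across $[0,1]$.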

Well-posedness of the nonlinear model
This section is devoted to the proof of Theorem 1.1(i), including a self-generating lower bound given in Section 4A, the existence and uniqueness given in Section 4B, and a smoothness a priori estimate given in Section 4C. First, we recast the Cauchy problem (1-1) in terms of the unknown function $g(t,x,v) := \mu(v)^{-1/2} f(t,x,v)$, with $g_{in}(x,v) := \mu(v)^{-1/2} f_{in}(x,v)$, as follows: […] where $R[g]$ and $U[g]$ on the right-hand side are defined by […] The main advantage of this formulation is that it allows us to get rid of the first-order term in $v$, and the zeroth-order term is bounded, since $g$ is bounded from above by a Maxwellian.
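The disappearance of the first-order term under the substitution $f = \mu^{1/2} g$ can be verified directly. The computation below uses only $L_{FP} = \nabla_v\cdot(\nabla_v + v)$ and the definition of $\mu$; the final evolution equation additionally assumes that (1-1) has the form $(\partial_t + v\cdot\nabla_x) f = \rho_f^\beta L_{FP} f$, and it is a sketch of the structure rather than a transcription of (4-1).

```latex
% With f = \mu^{1/2} g and \nabla_v \mu^{1/2} = -\tfrac12 v\,\mu^{1/2}:
\begin{align*}
\nabla_v f + v f &= \mu^{1/2}\big(\nabla_v g + \tfrac12 v\, g\big), \\
L_{FP} f = \nabla_v\cdot(\nabla_v f + v f)
  &= \mu^{1/2}\Big(\Delta_v g + \tfrac12 v\cdot\nabla_v g + \tfrac{d}{2}\, g
      - \tfrac12 v\cdot\nabla_v g - \tfrac14 |v|^2 g\Big) \\
  &= \mu^{1/2}\Big(\Delta_v g + \big(\tfrac{d}{2} - \tfrac{|v|^2}{4}\big)\, g\Big).
\end{align*}
% The transport operator commutes with multiplication by \mu(v)^{-1/2}, so under the assumed form of (1-1):
\[
(\partial_t + v\cdot\nabla_x)\, g
  = \rho_f^{\beta}\,\Big(\Delta_v g + \big(\tfrac{d}{2} - \tfrac{|v|^2}{4}\big)\, g\Big),
\qquad \rho_f = \int_{\mathbb{R}^d} \mu^{1/2}\, g\,dv .
\]
```

The zeroth-order coefficient grows like $|v|^2$, but its product with $g$ stays bounded as soon as $g$ is dominated by a multiple of $\mu^{1/2}$, which matches the remark on boundedness made above.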
For convenience, we are also concerned with the substitution $h(t,x,v) := \mu(v)^{-1} f(t,x,v)$ […] (4-2) In contrast with (1-1), the zeroth-order term disappears. Let us begin by exhibiting the global bounds of solutions to (4-2) […]
Proof. Integrating the equation […] This means that the upper bound is preserved along time. Similarly, the lower bound can be obtained by integrating the equation […]
In particular, the above result preserving global bounds holds for solutions to (4-2) and (5-1) in $(0,T) \times \mathbb{T}^d \times \mathbb{R}^d$. We will also apply this result to the substitution $g = \mu^{1/2} h$ appearing in Section 4B. Unless otherwise specified, throughout this section we set the domain $\Omega := (0,T] \times \mathbb{T}^d \times \mathbb{R}^d$ with $T \in \mathbb{R}_+$. Nevertheless, as specified in Remark 4.4, Corollary 4.10, and Proposition 4.11 below, the results of this section also hold if the spatial domain is $\mathbb{R}^d$.
4A. Self-generating lower bound. Throughout this subsection, we assume that the bounded solution $h$ of (4-2) lies below the universal constant $\Lambda$, which is guaranteed by Lemma 4.1 if the initial data lies below $\Lambda$. The aim of this subsection is to show the following positivity-spreading result. We remark that this proposition only relies on the mixing structure of the classical parabolic-type maximum principle and the transport operator, but not on the structure of the mass conservation.
Proposition 4.2 (lower bound). Let $\delta > 0$, $\underline{T} \in (0,T)$, and $h$ be a bounded solution to (4-2) in $\Omega$ satisfying […] Then, there exist two positive continuous functions $\eta_1(t)$ and $\eta_2(t)$ on $(0,T]$ depending only on universal constants, $\underline{T}$, $\delta$, $r$, and $v_0$ such that, for any […] (4-4)
Remark 4.3. In particular, the functions $\eta_1(t)$ and $\eta_2(t)$ are positive and bounded on any compact subset of $(0,T]$, but $\eta_1$ might degenerate to zero and $\eta_2$ may go to infinity as $t$ tends to zero or infinity.
Remark 4.4. If one is concerned with the problem in the whole space, that is, $\Omega = (0,T] \times \mathbb{R}^d \times \mathbb{R}^d$, we can proceed along the same lines as the proof in Appendix B to see that (4-3) implies the lower bound […] where the functions $\eta_1(t,x)$ and $\eta_2(t,x)$ on $(0,T] \times \mathbb{R}^d$ are positive, continuous and only depend on universal constants, $T$, $\delta$, $r$, and $v_0$. Compared with (4-4), $\eta_1(t,x)$ and $\eta_2(t,x)$ lose the uniformity in $x$ as $\mathbb{R}^d$ is not compact (see Step 3 of the proof of the proposition in Appendix B). In addition, the exponential tail with respect to $v$ cannot be improved to a Gaussian type, since there is no uniform-in-$x$ lower bound on the local mass $\int h\,d\mu$, so that Step 4 in Appendix B fails.
We note that the proof of the proposition is composed mainly of two lemmas. On the one hand, Lemma 4.5 extends the lower bounds forward a short time from a neighborhood of any given point in $\mathbb{T}^d \times \mathbb{R}^d$ and at any given time. On the other hand, Lemma 4.6 is used to spread the lower bound for all velocities. The spreading of the lower bound in space is given by selecting the proper velocity to transport the positivity which is guaranteed by Lemma 4.5. By applying these lemmas repeatedly, as proposed in [Henderson et al. 2020b], we are able to control the solution from below for any finite time. We postpone the full proof of Proposition 4.2, obtained by combining these two lemmas, until Appendix B.
Lemma 4.5 (lower bound forward in time). Let $\delta, \tau, r \in (0,1]$ and $h$ be a bounded solution to […] Then there exists some universal constant $c_0 > 0$ such that […] with the constant $C_0 > 0$ to be determined. The region […] By choosing $C_0 := (1/(8c_0))\,\delta\,\langle \tau r^{-1}\rangle^2 \langle v_0\rangle^2$ for some (small) universal constant $c_0 > 0$, we have […]
The spreading of the lower bound to all velocities relies on the construction of a Harnack chain through iterative application of the local Harnack inequality [Golse et al. 2019, Theorem 1.6]. Although some coefficients of (4-2) are unbounded globally over $v \in \mathbb{R}^d$, we remark that their local boundedness is sufficient for us to achieve the result through a careful study on the rescaling during the construction of the Harnack chain.
Lemma 4.6 (lower bound for all velocities). Let $\delta > 0$, $T, R \in (0,1]$, $\underline{T} \in (0,T)$, and $h$ be a bounded solution to (4-2) in $\Omega$ such that, for any $t \in [0,T]$, […] for some $(x_0, v_0) \in \mathbb{T}^d \times \mathbb{R}^d$. Then there exists some (large) constant $C > 0$ depending only on universal constants, $\underline{T}$, $\delta$, $R$, and $v_0$ such that, for any $t \in [\underline{T}, T]$, we have […]
[…], we will construct a finite sequence of points to reach $z$ from the region $\{t \le T,\ |x - x_0 - t v_0| < R,\ |v - v_0| < R\}$, where the solution is positive by assumption. In particular, $x$ does not exit this region. The nonlocality of the coefficient $R_h$, with assumption (4-7), implies the nondegeneracy of the diffusion in velocity so that the positivity of the solution $h$ propagates over $v \in \mathbb{R}^d$ in a localized space region.
Step 1. Iterate the Harnack inequality. For i ∈ {1, 2, . . ., N + 1} with N ∈ ℕ, we define z N +1 := z and z i := (t i , x i , v i ) by the relation , where the parameters N , r, τ 1 , τ 2 > 0 will be determined next. Consider the function for z := ( t, x, ṽ) ∈ Q 1 : We observe that, if the following is true for any z ∈ Q 1 : then, for 1 ≤ i ≤ N, the function h i+1 (z) satisfies the equation where the coefficients satisfy Applying the Harnack inequality [Golse et al. 2019, Theorem 1.6] to h i implies that there exist constants c 0 , τ 1 ∈ (0, 1) depending only on universal constants, δ, and R such that, for any τ 2 ∈ [0, 1 − τ 1 ] and 1 ≤ i ≤ N, we have Hence it remains to determine the chain {z i } 1≤i≤N +1 such that conditions (4-9) and (4-10) hold and the point z 1 stays in the region

Step 2. Determine the Harnack chain (including N , r , and τ 2 ) from a proper starting time t 1 . For M > 0, we set Recalling that T, R ∈ (0, 1], by choosing we have To determine the parameter M > 0, we point out that there exists some constant C depending only on universal constants, T , δ, R, and v 0 such that M ≤ C and Thus, Nr τ 2 = |v − v 0 |. This setting then guarantees condition (4-9). It also follows from the iteration relation that v 1 = v 0 , and, for 1 ≤ i ≤ N + 1,

Step 3. Determine the starting point x 1 . For any 1 ≤ i ≤ N, we estimate the departure distance from the expression (4-12) Therefore, for any x ∈ B R/2 (x 0 + tv 0 ), there exists some x 1 ∈ B 5R/8 (x 0 + t 1 v 0 ) such that x N +1 = x. In this setting, for any 1 ≤ i ≤ N, we also have M 2 < R. Thus, condition (4-10) ensures the inequality (4-11) is satisfied for 1 ≤ i ≤ N, which yields Recalling that c 0 ∈ (0, 1) appears in (4-11) and , we obtain the desired result (4-8). □

4B. Existence and uniqueness. Let us begin by summarizing some basic a priori estimates for solutions to (4-1).
Remark 4.9. For any given nonnegative continuous function g in that is not identically zero, there is some point (x 0 , v 0 ) ∈ 𝕋^d × ℝ^d and some constants δ, r > 0 such that We will see that the upper bound g in ≤ µ 1/2 and the lower bound (4-14) assumptions on the initial data g in (which could be discontinuous) are sufficient to ensure the existence of a solution g ∈ C 2 kin ( ) in the weak sense such that, for any As solutions become regular instantaneously, the difference between the weak solution and the classical one lies only in the continuity around the initial time.
Proof. Let us assume that g in satisfies (4-14) for some point (x 0 , v 0 ) ∈ 𝕋^d × ℝ^d and some constants δ, r > 0. By Proposition 4.2, for any solution g to (4-1) and for any T ∈ (0, T ), there is some λ * > 0 depending only on universal constants, T , T, δ, r , and v 0 such that (4-16)

Step 1. We first approximate the initial data g in by g ε in := g in * ϱ ε + εµ 1/2 , where ε ∈ (0, 1], Let us fix ε ∈ (0, 1]. In order to establish the existence of a solution to (4-1) associated with the initial data g ε in , we find a fixed point of the mapping F : w → g defined by solving the Cauchy problem on the closed convex subset K of the Banach space C γ l ( ), where the constants γ ∈ (0, 1) and N > 0 are to be determined. We remark that (4-17) is equivalent to By Lemma 4.1 and the fact that R[w] ≥ ε, we have εµ 1/2 ≤ g ≤ (1 + )µ 1/2 in . In particular, for any w ∈ K, we have the following for the lower-order term: R[w] (d/2 − |v|^2/4) g ≲ 1. Thus, the global Hölder estimate [Zhu 2021, Corollary 4.6] implies that there exist some constants γ ∈ (0, 1) and N > 0 depending only on universal constants and ε such that ∥g∥ C 2γ l ( ) ≤ N, which also implies that the lower-order term R It then follows from Proposition 3.3 with the interior Schauder estimate (Proposition 3.1) that the mapping F : ( ) is well defined. In addition, with the help of the Arzelà-Ascoli theorem, we know that F(K) is precompact in C γ l ( ).

As far as the continuity of F is concerned, we take a sequence {w n } converging to w ∞ in C γ l ( ). Since {F(w n )} is precompact in C γ l ( ), there exists a converging subsequence whose limit is g In view of the interior Schauder estimate (Proposition 3.1), {F(w n )} is precompact in C 2 kin (K ) for any compact subset K ⊂ and g ∞ ∈ C 2 kin ( ) ∩ C 0 ( ). Sending n → ∞ in (4-17) satisfied by (w, g) = (w n , F(w n )), we see that (4-17) also holds for the pair of limits (w, g) = (w ∞ , g ∞ ). Then, applying the maximum principle (Lemma A.1) to Then, for every ε ∈ (0, 1], we are allowed to apply the Schauder fixed-point theorem (see for instance [Gilbarg and Trudinger 2001, Corollary 11.2]) to get g ε ∈ C 2 kin ( ) ∩ C 0 ( ) such that F(g ε ) = g ε , which is a (classical) solution to (4-1) associated with the initial data g ε in .
Step 2. Passage to the limit. Recalling the lower bound (4-16) on the coefficient and the higher-order Hölder estimate given by Lemma 4.7(i), for any T ∈ (0, T ), we point out that for some constant α * ∈ (0, 1) with the same dependence as λ * . Hence g ε converges uniformly to g in C 2 kin ([T , T ] × 𝕋^d × ℝ^d), up to a subsequence. Write the equation satisfied by g ε in the weak formulation: that is, for any φ ∈ C ∞ c ( ), Combining the energy estimate derived by choosing φ = g ε above with the upper bound of g ε provided by Lemma 4.1, we have Therefore, after passing to a subsequence, R[g ε ]∇ v g ε converges weakly in L 2 ( ). On account of its local uniform convergence, we know that its weak limit is R[g]∇ v g. In addition, since µ −1/2 g ε is uniformly bounded, by their local uniform convergence, we can also derive that the sequences g ε and gives (4-15). Furthermore, if the initial data g in is continuous, then the barrier function method shows that the continuity around the initial time depends only on the upper bound of the solution and the continuity of g in ; see the derivation of the estimate (5-30) of a general type in Section 5B. Indeed, by (5-30) (with ϵ = 1, R = |v 1 |, h ϵ = µ −1/2 g, and h ϵ,in = µ −1/2 g in ), we see that, for any fixed δ ∈ (0, 1) and This implies the continuity of the solution g around t = 0 and finishes the proof. □

One may extend the above existence result to the case where the spatial domain 𝕋^d is replaced by ℝ^d.
Corollary 4.10. For any , there exists a solution g to the Cauchy problem (4-1) satisfying 0 with periodic extension to ℝ^d. In the light of Proposition 4.8, we take a solution g R to (4-1) associated with the initial data g R in in , where [−R, R] d is considered as a periodic box. After extracting a subsequence, we define the function g := lim R→∞ g R in (0, T ] × ℝ^d × ℝ^d pointwise; furthermore, since 0 × ℝ^d, we know that the limiting function satisfies 0 ≤ µ −1/2 g ≤ in (0, Since the initial data is continuous, unless it is identically zero we may assume that g in ≥ δ1 {|x−x 0 |<r,|v−v 0 |<r } for some point (x 0 , v 0 ) ∈ ℝ^d × ℝ^d and some constants δ, r > 0. Consider R > |x 0 | + r . Applying the lower bound of the solution given by (4-5) yields that, for any compact subset ] is greater than or equal to λ * , where the constant λ * > 0 depends only on universal constants, δ, r, v 0 , and K. In view of the higher-order Hölder estimate given by Lemma 4.7(i), we know that g R converges uniformly to g in C 2 kin (K ), up to a subsequence. Additionally, due to the estimate derived in (4-19), the limiting function g is a solution to (4-1) that matches the initial data g in continuously.
As for (4-20), we notice that the function

Integrating the equation against the function
Sending R → ∞, we acquire , which implies the estimate (4-20) as asserted. □

The following proposition, concerning the uniqueness for the Cauchy problem (4-1), is derived from a Grönwall-type argument. The standard scaling technique and the Hölder estimate up to the initial time given by Lemma 4.7(ii) can improve the integrability with respect to t in the energy estimate so that Grönwall's inequality becomes admissible; see (4-25) for the precise expression. This kind of phenomenon was also noticed in [Henderson et al. 2020a] (see the remarks in §1.4.2). The global energy estimate of (4-1) is not available when the spatial domain is unbounded, since there is no decay of the solution as |x| → ∞. To work around this, we take advantage of an idea originating from the uniformly local spaces used in [Henderson et al. 2019; Kato 1975]. We note that such a technique is not necessary when working with the periodic box 𝕋^d.

Proposition 4.11 (uniqueness). Let the domain x be 𝕋^d or ℝ^d, the constant α 0 ∈ (0, 1), and the functions 0 ≤ g 1 , g 2 ≲ µ 1/2 be two solutions to (4-1) in (0, T ] × x × ℝ^d associated with the same initial data

Proof. In view of the lower bound given by Lemma 4.5 and Proposition 4.2, we know that there is some constant λ * ∈ (0, 1) depending only on universal constants, T, and the initial data such that Therefore, we may assume T = −1 with > 1. Let us set the difference We have to show that g is identically zero.
Since z 0 is arbitrary, we know that, for any s ∈ (0, T ], Substituting this estimate into (4-24) yields that for any t ∈ (0, T ], where the constant C * > 0 depends only on universal constants and the initial data. The desired result is then given by Grönwall's inequality. □

4C. Global regularity. The instantaneous smoothness a priori estimate in Theorem 1.1(i) follows from the lower bound given by Proposition 4.2 together with the following proposition.
Proposition 4.12. Let x = 𝕋^d or ℝ^d, let T ∈ (0, T ), and let the function g be a solution to Then, for any ν ∈ (0, 1/2) and k ∈ ℕ, we have for some constant C T ,ν,k > 0 depending only on universal constants, T , ν, and k.

As an immediate consequence of the above proposition, for any ν ∈ (0, 1/2), k ∈ ℕ, and for any compact subset K ⊂ (0, T ] × x , there exists some constant C ν,k,K > 0 depending only on universal constants, ν, k, and K such that which is exactly the assertion in Theorem 1.1(i).
In order to show the higher regularity, we will apply the bootstrap procedure developed in [Imbert and Silvestre 2022], which was intended for the non-cut-off Boltzmann equation. The classical bootstrap iteration proceeds by differentiating the equation, applying a priori estimates to the new equation to improve the regularity of solutions, and repeating the procedure. Nevertheless, since C^{2+α}_l ⊄ C^1_x for any α ∈ (0, 1) by their definitions, the hypoelliptic structure of (4-1) does not gain enough regularity in the x-variable, which prevents the x-differentiation at each iteration. Indeed, the Schauder-type estimate provided by Lemma 4.7(i) only shows that the solution to (4-1) belongs to C^{(2+α)/3} with respect to the x-variable. In order to overcome this, we have to apply estimates to increments of the solution to recover a full derivative. From now on, for y ∈ ℝ^d and z ∈ ℝ × ℝ^d × ℝ^d, we denote the spatial increment by δ y g(z) := g(z ∘ (0, y, 0)) − g(z).
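To make the increment notation concrete, the following display spells out what the spatial increment does in coordinates. It is a sketch under the assumption that ∘ denotes the Galilean composition law customary for kinetic equations (the convention written below is the one used in the Imbert-Silvestre framework and is not restated in this section); since the time component of (0, y, 0) vanishes, the drift term drops out and only the space variable is shifted.

```latex
% Assumed convention for the Galilean composition (not restated in this section):
\[
  (t_0,x_0,v_0)\circ(t,x,v) := (t_0+t,\; x_0+x+t\,v_0,\; v_0+v).
\]
% Composing with (0,y,0): the time increment is zero, so no drift term appears.
\[
  \delta_y g(t,x,v)
  = g\big((t,x,v)\circ(0,y,0)\big) - g(t,x,v)
  = g(t,\,x+y,\,v) - g(t,x,v).
\]
% A bound of order |y| on the increment is what recovers a full x-derivative:
\[
  \partial_{x_i} g(t,x,v) = \lim_{s\to 0}\frac{\delta_{s e_i} g(t,x,v)}{s}
  \quad\text{whenever the limit exists.}
\]
```

In this form it is visible why an estimate of the increment with exponent 1 in |y|, rather than the exponent 2/3 furnished directly by the kinetic Hölder scale, is needed before one can differentiate in x.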
Let us proceed with the proof of the regularity estimate.
Proof of Proposition 4.12. We are going to show that, for any multi-index k : 2 , there exists some constant α k ∈ (0, 1) depending only on |k| such that, for any Q r (z 0 ) ⊂ For simplicity, we will omit the domain in the estimates below, since the estimates can always be localized around the center z 0 .
Step 1. We will establish that (4-27) holds for any differential operator of the type ∂ k x x . It suffices to show that, for any n ∈ ℕ, k x ∈ ℕ^d with |k x | = n, ν ∈ (0, 1/2), and y ∈ B_{r^3/4}, Indeed, sending y → 0 in (4-28) will complete this step.
Arguing by induction on |k x | = n, we suppose that (4-28) holds for any |k x | ≤ n − 1, which implies, for any We remark that the induction here begins with (4-29) for |k x | = 0, which holds due to the previous step.
Let q := δ y ∂ k x x g with |k x | = n. Lemma C.2 and (4-29) give Therefore, we have to raise the exponent 2/3 on the right-hand side to 1; the price to pay is a smaller Hölder exponent on the left-hand side.
Step 2. For the case k v = 0 in (4-27), we proceed with a bidimensional induction on Based on the previous step (m = 0), we have to show that (4-33) holds for k t = m ≥ 1 and |k x | = n under the induction hypothesis that (4-33) holds for any k t ≤ m − 1 and where we use the notation Di for the differential operator satisfying ∂ k t t D k x x = Di • D i .By the induction hypothesis, each term in the remainder (the summation on the right-hand side of (4-34)) with i ̸ = (0, 0, 0) can be controlled in C α m,n l . It now suffices to deal with the exceptional term so that the whole remainder can be controlled in C α m,n l ; then (4-33) follows from the interior Schauder estimate (Proposition 3.1).To this end, using Lemma 2.4 and the induction hypothesis with the pair Due to the induction hypothesis with the pair (m − 1, n + 1), for any ν ′ ∈ (0, ν), Then, (4-35) and (4-36) produce the bound on µ −ν ′ (v 0 )∥q∥ C αm,n l .
Step 3. Similarly, to show (4-27) for any differential operator ∂ k t t D k x x D k v v , we proceed with a bidimensional induction on (m, n) = (k t + |k x |, k v ) such that, for any ν ∈ (0, 1/2), (4-37) The case n = 0 is treated in the previous step. By Lemma 2.4 and the induction hypothesis (4-37) with Computing the equation satisfied by g and proceeding as in the previous step, we conclude the proof. □

Diffusion asymptotics
This section is devoted to the study of the global-in-time quantitative diffusion asymptotics, which consist of the (uniform-in-ϵ) convergence towards equilibrium over long times and of the finite-time asymptotics, including the results of Theorem 1.1(ii) and Theorem 1.4. We first introduce the required notation. For any scalar or vector-valued function Φ ∈ L^1(ℝ^d, dµ), we denote its velocity mean by
$$\langle \Phi \rangle := \int_{\mathbb{R}^d} \Phi \, d\mu.$$
For any pair of functions (scalars, vectors, or d × d matrices) Φ_1, Φ_2 ∈ L^2(𝕋^d × ℝ^d, dm), we denote their L^2 inner product with respect to the measure dm by
$$(\Phi_1, \Phi_2) := \iint_{\mathbb{T}^d \times \mathbb{R}^d} \Phi_1 \Phi_2 \, dm,$$
where the multiplication between the pair in the integrand is replaced by the scalar contraction product if Φ_1 and Φ_2 are a pair of vectors or matrices.
Recalling our notation for the Ornstein-Uhlenbeck operator L OU = (∇ v − v) • ∇ v , we apply the substitutions f ϵ = µh ϵ and f ϵ,in = µh ϵ,in in (1-2) and obtain (5-1) In this setting, by applying integration by parts, for any We will use this identity repeatedly in the computation below.Then, the operator L OU is self-adjoint with respect to the inner product ( • , • ), and the bracket ⟨ • ⟩ is a projection on the null space of L OU .Moreover, as the total mass is conserved, we define (5-2) Proceeding with the macro-micro (fluid-kinetic) decomposition, we define the orthogonal complement of the projection In this framework, the local mass ⟨h ϵ ⟩ is the macroscopic (fluid) part and the complement h ⊥ ϵ is the microscopic (kinetic) part.In addition, taking the bracket ⟨ • ⟩ after multiplying the equation in (5-1) with 1 and v leads to the macroscopic equations (5-3) where ⟨vh ϵ ⟩ and ⟨v ⊗2 h ϵ ⟩ represent the local momentum and the stress tensor, respectively.
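For orientation, the integration-by-parts identity for the Ornstein-Uhlenbeck operator that underlies the self-adjointness claim can be sketched as follows. This is the standard Gaussian computation, written for smooth functions φ_1, φ_2 with sufficient decay; the names φ_1, φ_2 are placeholders, and the precise function class used in the paper is not restated here.

```latex
% Uses L_OU = (\nabla_v - v)\cdot\nabla_v, d\mu = \mu\,dv, and \nabla_v\mu = -v\,\mu.
\[
  \mu\,L_{\mathrm{OU}}\varphi_2
  = \mu\,\big(\Delta_v\varphi_2 - v\cdot\nabla_v\varphi_2\big)
  = \nabla_v\cdot\big(\mu\,\nabla_v\varphi_2\big),
\]
% Integrating against \varphi_1 and integrating by parts in v:
\[
  \int_{\mathbb{R}^d}\varphi_1\,L_{\mathrm{OU}}\varphi_2\,d\mu
  = \int_{\mathbb{R}^d}\varphi_1\,\nabla_v\cdot\big(\mu\,\nabla_v\varphi_2\big)\,dv
  = -\int_{\mathbb{R}^d}\nabla_v\varphi_1\cdot\nabla_v\varphi_2\,d\mu .
\]
```

The right-hand side is symmetric in φ_1 and φ_2, which gives the self-adjointness with respect to the Gaussian measure; choosing φ_1 ≡ 1 shows that the velocity mean of L_OU φ_2 vanishes, consistent with the bracket being the projection onto the null space.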
5A. Long time behavior. Our aim is to establish the (uniform-in-ϵ) exponential decay towards the equilibrium M 0 for (5-1). In particular, when ϵ = 1, it yields the exponential convergence of derivatives of every order, based on the smoothness a priori estimates given in Section 4C. We note that the classical coercive method is not applicable in our case to obtain the convergence to equilibrium, due to the degeneracy of the ellipticity of the spatially inhomogeneous equation. Indeed, the Poincaré inequality only produces a spectral gap on the orthogonal complement of the projection ⟨ • ⟩; see (5-7). As mentioned in Section 1B, there are several ways to achieve the long-time asymptotics. We mainly follow the argument in [Esposito et al. 2013] (see also [Kim et al. 2020]) in a simpler scenario.
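The spectral-gap statement can be illustrated by the classical Gaussian Poincaré inequality, recorded below for a generic function h of the velocity variable (with x and t as parameters). This is a sketch of the standard inequality only; the precise form of (5-7), including the coefficient and the ϵ-scaling, is not reproduced here.

```latex
% Gaussian Poincare inequality for the standard Gaussian measure d\mu (constant 1):
\[
  \int_{\mathbb{R}^d}\big|h-\langle h\rangle\big|^2\,d\mu
  \;\le\; \int_{\mathbb{R}^d}|\nabla_v h|^2\,d\mu .
\]
% The left-hand side is the microscopic part h^\perp = h - \langle h\rangle only:
% the inequality gives no control on the macroscopic part \langle h\rangle - M_0,
% which is constant in v and hence invisible to \nabla_v.
```

This is why dissipation in velocity alone cannot yield decay of the full distance to equilibrium, and why a modified entropy coupling the macroscopic and microscopic parts is introduced in the proof.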
× ℝ^d associated with the initial data 0 ≤ h ϵ,in ≤ and satisfying then the solution h ϵ converges to the state M 0 in L 2 (dm) as t → ∞. More precisely, there exists some universal constant c > 0 such that, for any t > 0, we have (5-6)

Proof. Since the velocity mean of the microscopic part vanishes, ⟨h ⊥ ϵ ⟩ = 0, using (5-1) and the Poincaré inequality yields (5-7) Now we have to recover a new entropy that would give some bound on the projection ⟨h ϵ ⟩ − M 0 .
For every test function v • (t, x)µ, with a vector-valued function ∈ H 1 t,x (ℝ_+ × 𝕋^d, ℝ^d), we write the weak formulation of (5-1) as Taking the macro-micro decomposition into account, from the above expression we obtain (5-8) Let us now introduce an auxiliary function u(t, x): for any fixed t ∈ ℝ_+, define u(t, x) as the solution of the following elliptic equation under the compatibility condition (5-2): whose elliptic estimate states (5-10) In addition, observing that ⟨vh ϵ ⟩=⟨vh ⊥ ϵ ⟩, from (5-3) we get Combining this macroscopic relation with (5-9), we have It then follows from Hölder's inequality that (5-11) Applying (5-9)-(5-11), we have By the Cauchy-Schwarz inequality, we arrive at (5-12) Then, (5-12) combined with (5-7) implies where the constant δ ∈ (0, 1/2) will be determined and the modified entropy E ϵ is defined by We note that (5-10) also implies (5-13) This means that the modified entropy E ϵ is equivalent (uniformly in ϵ) to the square of the L 2 (dm)-distance between h ϵ and M 0 , when the constant δ > 0 is sufficiently small. Hence we have d dt The conclusion (5-6) then follows from Grönwall's inequality and the equivalence between E ϵ (t) and ∥h ϵ (t,

We point out that the elliptic estimate (5-10) for the Poisson equation (5-9) used in the above proof, which results from a Poincaré-type inequality, relies essentially on the compactness of the spatial domain. It was shown in [Bouin et al. 2020], in an abstract setting, that the related elliptic estimate can be recovered by applying the Nash inequality [1958] when the spatial domain is the whole space ℝ^d. Inspired by the proof of Proposition 5.1 above, we are also able to make the construction of [Bouin et al. 2020] precise and to see that the argument still works for the nonlinear equation (5-1). We remark that the following algebraic decay rate is optimal in the sense that it is the same as in the linear case; see Appendix A of [Bouin et al. 2020].
Combining this with (5-18) and the equivalence between E ϵ and ∥h dm) , we arrive at As far as the case ϵ = 1 is concerned, we conclude the result of convergence to equilibrium.
Proof of Theorem 1.1(ii). Consider g := µ 1/2 h. In view of Proposition 4.8 and Corollary 4.10 with the assumption on initial data, we know that λ ≤ µ −1/2 g ≤ in ℝ_+ × x × ℝ^d for x = 𝕋^d or ℝ^d. By applying Proposition 5.1 to h = µ −1/2 g with λ t = λ and x = 𝕋^d, we obtain a universal constant c > 0 such that Combining this with the Sobolev embedding and interpolation, we derive the following for any k ∈ ℕ with k ≥ d: Since the H 4k -norm on the right-hand side is bounded due to the global regularity estimate given by Proposition 4.12, we obtain the exponential convergence to equilibrium for derivatives of every order.
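The mechanism by which interpolation transfers decay from L^2 to higher norms can be sketched with the standard Sobolev interpolation inequality below. It is stated for a generic function u on 𝕋^n or ℝ^n with unweighted norms, and the constants A, B, c are placeholders; the exact exponents, weights, and norms of the display used in the proof are not reproduced here.

```latex
% Standard interpolation (Cauchy-Schwarz on the Fourier side), for any s >= 0:
\[
  \|u\|_{H^{s}} \;\le\; \|u\|_{L^2}^{1/2}\,\|u\|_{H^{2s}}^{1/2}.
\]
% Consequence: exponential L^2 decay plus a uniform bound on a higher norm,
\[
  \|u(t)\|_{L^2}\le A\,e^{-ct},
  \qquad
  \sup_{t}\|u(t)\|_{H^{2s}}\le B,
\]
% gives exponential decay in the intermediate norm, at half the rate:
\[
  \|u(t)\|_{H^{s}} \;\le\; (AB)^{1/2}\,e^{-ct/2}.
\]
```

The decay rate degrades by a fixed factor at each interpolation, but remains exponential, which is all that is needed for the convergence of derivatives once a Sobolev embedding is applied.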
The asserted result in the case x = ℝ^d is a direct consequence of Proposition 5.2. As a side remark, one is also able to upgrade the long-time convergence to higher-order derivatives of solutions by means of the global regularity estimate and interpolation as above; however, this yields an algebraic decay rate that is not optimal. □

5B. Finite-time asymptotics. The study of macroscopic dynamics for the nonlinear kinetic model (5-1) in this subsection relies on the regularity of the target equation (1-3). On account of this, let us begin by mentioning some standard results for (1-3) without proof. If the initial data satisfies λ ≤ ρ in ≤ , then such bounds are preserved in time, λ ≤ ρ ≤ , in the same spirit as Lemma 4.1. Combining the parabolic De Giorgi-Nash-Moser theory with Schauder theory, we know that the solution ρ is smooth for any positive time. We state the a priori estimate precisely as follows, where its behavior near the initial time is taken into account in view of the standard scaling technique.
Proof of Theorem 1.4. We are going to combine Propositions 5.1 and 5.5 with a delicate analysis of the relative entropy around the initial time to get Theorem 1.4. The analysis is based on the barrier function method. Let α ∈ (0, 1) be the constant provided by Proposition 5.5.

Appendix B: Spreading of positivity
This appendix is devoted to the proof of Proposition 4.2.The argument follows the one presented in [Henderson et al. 2020b], and it is based on the combination of Lemmas 4.5 and 4.6.
Proof of Proposition 4.2.The proof is split into four steps.
Combining this with (B-4) as well as recalling that T = 2t and the space domain 𝕋^d is compact, we know that there exists C_3 > 0 depending only on universal constants, T , δ, r , and v_0 such that, for any (t, x, v) ∈ [T , min{T_0 , T_1 }] × 𝕋^d × ℝ^d,
$$h(t, x, v) \ge C_3^{-1} e^{-C_3 |v|^4}.$$
Since T 0 and T 1 depend only on universal constants, r , and v 0 , by applying the above arguments iteratively, we obtain the result for any finite time.
Step 4. Improving the exponential tail.We remark that this step is not necessary for the applications of the lower bound result, but it shows a more precise decay rate as |v| → ∞.
By the previous step, there is some c > 0 depending only on universal constants, T , T, δ, r , and v 0 such that h ≥ c in [T , T ] × 𝕋^d × B 1 . Consider the barrier function where the constant C 0 > 1 is to be determined. By recalling (4-2) and performing a direct computation, we have In particular, by choosing C 0 sufficiently large (with the same dependence as c), we have In addition, by its definition, h ≥ h on the boundary {t ∈

This appendix is devoted to the proof of two technical lemmas for spatial increments involved in the bootstrapping of higher regularity for solutions to (4-1) presented in Section 4C. For the convenience of the reader, we give a brief proof following the lines of [Imbert and Silvestre 2022, Lemma 8.1] with s = 1 and α 1 = β = 2.