Isotriviality, integral points, and primitive primes in orbits in characteristic p

We prove a characteristic $p$ version of a theorem of Silverman on integral points in orbits over number fields and establish a primitive prime divisor theorem for polynomials in this setting. In characteristic $p$, the Thue–Siegel–Dyson–Roth theorem is false, so the proof requires techniques different from those used by Silverman. The problem is largely that isotriviality can arise in subtle ways, and we define and compare three different definitions of isotriviality for maps, sets, and curves. Using results of Favre and Rivera-Letelier on the structure of Julia sets, we prove that if $\phi$ is a nonisotrivial rational function and $\beta$ is not exceptional for $\phi$, then $\phi^{-n}(\beta)$ is a nonisotrivial set for all sufficiently large $n$; we then apply diophantine results of Voloch and Wang that hold for all nonisotrivial sets. When $\phi$ is a polynomial, we use the nonisotriviality of $\phi^{-n}(\beta)$ for large $n$ along with a partial converse to a result of Grothendieck in descent theory to deduce the nonisotriviality of the curve $y^\ell = \phi^n(x) - \beta$ for large $n$ and small primes $\ell \neq p$ whenever $\beta$ is not postcritical; this enables us to prove stronger results on Zsigmondy sets. We provide some applications of these results, including a finite index theorem for arboreal representations coming from quadratic polynomials over function fields of odd characteristic.


Introduction and Statement of Results
In [Sil93, Theorem A], Silverman proved the following theorem.
We prove that the analogous theorem holds for non-isotrivial rational functions over $\mathbb{F}_p(t)$. Recall that a rational function $\phi \in \mathbb{F}_p(t)(z)$ is said to be isotrivial if there is a $\sigma \in \mathbb{F}_p(t)(z)$ of degree 1 such that $\sigma \circ \phi \circ \sigma^{-1} \in \mathbb{F}_p(z)$. We prove the following.
Silverman [Sil93] also proves Theorem 1.1 over number fields (see [Sil93, Theorem B]). Likewise, our most general form of Theorem 1.2 is stated in terms of $S$-integrality and isotriviality for rational functions defined over finite extensions of $\mathbb{F}_p(t)$. We will define $S$-integrality in the next section (see Definition 2.1). We give our more general definition of isotriviality for rational functions here.
Definition 1.3. Let $K$ be a finite extension of $\mathbb{F}_p(t)$ and let $\phi$ be a rational function in $K(z)$. We say that $\phi$ is an isotrivial rational function if there exists $\sigma \in K(z)$ of degree 1 such that $\sigma \circ \phi \circ \sigma^{-1} \in \mathbb{F}_p(z)$.
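As a quick illustration of Definition 1.3 (our own example, not taken from the text), the map $\phi(z) = tz^2$ has a coefficient involving $t$ but is nevertheless isotrivial. A sketch of the verification, with the hypothetical choice $\sigma(z) = tz$:

```latex
% Illustrative example (an assumption of this sketch, not from the paper):
% \phi(z) = t z^2 over K = \mathbb{F}_p(t), conjugated by \sigma(z) = t z.
\begin{align*}
\sigma^{-1}(z) &= z/t, \\
\phi\bigl(\sigma^{-1}(z)\bigr) &= t \cdot (z/t)^2 = z^2/t, \\
\sigma \circ \phi \circ \sigma^{-1}(z) &= t \cdot \frac{z^2}{t} = z^2 \in \mathbb{F}_p(z).
\end{align*}
```

So the presence of $t$ in the coefficients does not by itself rule out isotriviality; what matters is whether some change of coordinates removes it.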
Also recall that for a rational function ϕ ∈ K(z), a point β ∈ P 1 (K) is said to be exceptional for ϕ if its total orbit (both forward and backward) is finite. However, for the maps that we consider, this amounts to ϕ −2 (β) = {β} by Riemann-Hurwitz. In particular, since totally inseparable maps are isotrivial (which may be seen by moving fixed points to 0 and ∞), we avoid the more exotic cases of exceptional points arising in positive characteristic; see, for instance, [Sil96]. With this in place, we state our general form of Theorem 1.2.
Theorem 1.4. Let K be a finite extension of F p (t), let ϕ ∈ K(z) be a non-isotrivial rational function with deg ϕ > 1, let S be a finite set of places of K, and let α, β ∈ K where β is not exceptional for ϕ. Then {ϕ n (α) | n ∈ Z + } contains only finitely many points that are S-integral relative to β.
The main tools used in the proof of [Sil93, Theorem A] are from diophantine approximation. Roughly, one takes an inverse image $\phi^{-i}(\infty)$ that contains at least three points and applies Siegel's theorem on integral points for the projective line with at least three points deleted to conclude that there are only finitely many $n$ such that $\phi^n(\alpha)$ is integral relative to $\phi^{-i}(\infty)$, and thus only finitely many $n + i$ such that $\phi^{n+i}(\alpha)$ is an integer. Over function fields in characteristic $p$, the problem is more complicated since Roth's theorem is false; in fact, no improvement on Liouville's theorem is possible in general. There is, however, a weaker version of Siegel's theorem, due to Wang [Wan99, Theorem in $\mathbb{P}^1(K)$, Page 337] and Voloch [Vol95], which states that, for function fields in characteristic $p$, there are finitely many $S$-integral points on the projective line with a non-isotrivial set of points deleted. (Note that this is strictly weaker than Siegel's theorem, since any set of three points is automatically isotrivial, and there are isotrivial sets of every countable cardinality.) Basic functorial results on integral points thus imply that Theorem 1.4 will hold whenever $\phi^{-n}(\beta)$ is a non-isotrivial set for some $n$. In Theorem 3.1, we show that $\phi^{-n}(\beta)$ is a non-isotrivial set for large $n$ whenever $\phi$ is a non-isotrivial rational function and $\beta$ is not exceptional, using results of Favre and Rivera-Letelier [FRL10] on the structure of Julia sets at primes of genuinely bad reduction.
In the case where ϕ is a polynomial of separable degree greater than 1, we can prove a bit more than Theorem 1.4. To describe our result we need a bit of terminology. For a sequence {b n } ∞ n=1 of elements of a global field K, we say that a place p of K is a primitive divisor of b n if v p (b n ) > 0 and v p (b m ) ≤ 0 for all m < n.
For a positive integer ℓ, we say that p is a primitive ℓ-divisor of b n if p is a primitive divisor of b n and ℓ ∤ v p (b n ).
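To make the two definitions above concrete, here is a small worked example over the global field $\mathbb{Q}$ (our own illustration, not from the text), using the classical sequence $b_n = 2^n - 1$:

```latex
% Illustrative example: b_n = 2^n - 1 over K = \mathbb{Q}.
\[
\begin{array}{c|cccccc}
n   & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
b_n & 1 & 3 & 7 & 15 = 3 \cdot 5 & 31 & 63 = 3^2 \cdot 7
\end{array}
\]
% Primitive divisors: 3 (for n = 2), 7 (n = 3), 5 (n = 4), 31 (n = 5).
% b_6 has no primitive divisor, since 3 | b_2 and 7 | b_3.
% With \ell = 2: v_7(b_3) = 1 is odd, so 7 is a primitive 2-divisor of b_3.
```

The index $n = 6$ is the kind of exception that a Zsigmondy set is meant to record.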
Likewise, for a positive integer $\ell$ and $\alpha$, $\beta$, and $\phi$ as above, we define the $\ell$-Zsigmondy set $Z(\phi, \alpha, \beta, \ell)$ as the set of positive integers $n$ such that $\phi^n(\alpha) - \beta$ has no primitive $\ell$-divisor.
We will also need a precise definition of critical points to state our next theorem. Let $\phi$ be a rational function in $K(z)$. We let $\deg_s \phi$ denote the degree of the maximal separable extension of $K(\phi(z))$ in $K(z)$ and let $\deg_i \phi = (\deg \phi)/(\deg_s \phi)$; note that $\deg_i \phi$ is also the largest power $p^r$ of $p$ such that $\phi$ can be written as $\phi(z) = g(z^{p^r})$ for some rational function $g \in K(z)$. For $\gamma \in \mathbb{P}^1$, there are degree one rational functions $\sigma, \theta \in K(z)$ such that $\theta(0) = \gamma$ and $\sigma \circ \phi \circ \theta(0) = 0$. We may then write $\sigma \circ \phi \circ \theta(z) = z^e g(z)$ for some rational function $g$ such that $g(0) \neq 0, \infty$. We call $e$ the ramification degree of $\phi$ at $\gamma$ and denote it by $e_\phi(\gamma/\phi(\gamma))$. We say that $\gamma$ is a critical point of $\phi$ if $e_\phi(\gamma/\phi(\gamma)) > \deg_i \phi$.
We let $O^+_\phi(\alpha)$ denote the set $\{\phi^n(\alpha) \mid n \in \mathbb{Z}^+\}$, called the forward orbit of $\alpha$ with respect to $\phi$. Moreover, we say that a point $\beta$ is post-critical if there is a critical point $\gamma$ of $\phi$ such that $\beta \in O^+_\phi(\gamma)$.
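The threshold $\deg_i \phi$ in the definition of a critical point matters in characteristic $p$. A minimal worked example (our own, assuming $p$ is odd) is the map $\phi(z) = z^{2p}$:

```latex
% Illustrative example (assumes p odd): \phi(z) = z^{2p} = g(z^p), g(w) = w^2,
% so \deg_s \phi = 2 and \deg_i \phi = p.
\begin{align*}
\text{At } \gamma = 0:\quad & \phi(z) = z^{2p} \cdot 1,
  && e_\phi(0/0) = 2p > p, \text{ so } 0 \text{ is critical.} \\
\text{At } \gamma = 1:\quad & \phi(z+1) - 1 = \bigl((z+1)^2 - 1\bigr)^p = z^p (z+2)^p,
  && e_\phi(1/1) = p = \deg_i \phi, \text{ so } 1 \text{ is not critical.}
\end{align*}
```

Every point has ramification degree at least $\deg_i \phi$, so only ramification beyond the inseparable part counts as critical.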
With this terminology, we have the following two theorems for polynomials.
Theorem 1.5. Let $K$ be a finite extension of $\mathbb{F}_p(t)$, let $f \in K[z]$ be a non-isotrivial polynomial with $\deg f > 1$, and let $\alpha$ and $\beta$ be elements of $K$ such that $\alpha$ is not preperiodic, $\beta$ is not post-critical, and $\beta \notin O^+_f(\alpha)$. Then for any prime $\ell \neq p$, the Zsigmondy set $Z(f, \alpha, \beta, \ell)$ is finite.
Theorem 1.6. Let $K$ be a finite extension of $\mathbb{F}_p(t)$, let $f \in K[z]$ be a non-isotrivial polynomial with $\deg f > 1$, and let $\alpha$ and $\beta$ be elements of $K$ such that $\alpha$ is not preperiodic, $\beta$ is not exceptional for $f$, and $\beta \notin O^+_f(\alpha)$. Then the Zsigmondy set $Z(f, \alpha, \beta)$ is finite.
Theorem 1.4 is not true in general for isotrivial rational functions, and Theorems 1.5 and 1.6 are not true in general for isotrivial polynomials (see [Pez94]). There are some results in the isotrivial case, however (see [HSW14]), and some of the techniques here do work for a wide class of isotrivial rational functions. We may address these questions in a future paper.
Theorem 1.4 is proved by using two different notions of isotriviality. The first is our Definition 1.3 for functions. We now define an isotrivial set. Here we use a simple, if inelegant, definition rather than a slightly more technical one that generalizes to varieties other than P 1 . Below we regard an element of K(z) as a map from K ∪ ∞ to itself.
Definition 1.7. Let $K$ be a finite extension of $\mathbb{F}_p(t)$ and let $S$ be a finite subset of $K \cup \infty$. We say that $S$ is an isotrivial set if there exists $\sigma \in K(z)$ of degree 1 such that $\sigma(S) \subseteq \mathbb{F}_p \cup \infty$.
We note that if ϕ is a non-isotrivial rational function the set ϕ −1 (β) may still be an isotrivial set; for example any set of three or fewer elements is an isotrivial set, but there are non-isotrivial rational functions of degree 2 and 3.
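The claim about sets of three elements can be checked directly. A sketch, assuming $a, b, c$ are distinct elements of $K$ (the names are ours; if one of the points is $\infty$, the corresponding factors are dropped):

```latex
% For distinct a, b, c in K, the degree one map
\[
\sigma(z) = \frac{(z - a)(b - c)}{(z - c)(b - a)}
\quad\text{satisfies}\quad
\sigma(a) = 0, \qquad \sigma(b) = 1, \qquad \sigma(c) = \infty ,
\]
% so \sigma(\{a, b, c\}) = \{0, 1, \infty\} \subseteq \mathbb{F}_p \cup \infty.
```

Non-isotriviality of a set therefore only becomes a real condition once the set has at least four points.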
Theorem 1.5 is proved using a third notion of isotriviality, this time for curves.
Definition 1.8. Let $K$ be a finite extension of $\mathbb{F}_p(t)$ and let $C$ be a curve defined over $K$. We say that $C$ is an isotrivial curve if there is a curve $C'$ defined over a finite extension $k'$ of $K \cap \overline{\mathbb{F}}_p$ and a finite extension $K'$ of $K$ such that $C \times_K K' \cong C' \times_{k'} K'$.
An outline of the paper is as follows. Throughout this paper, $K$ is a finite extension of $\mathbb{F}_p(t)$ as in Definitions 1.3, 1.7, and 1.8. In Section 2, we introduce some basic facts about heights, integral points, and cross ratios that are used throughout the paper. Following that, we prove Theorem 3.1, which says that if $\phi$ is a non-isotrivial rational function of degree greater than 1 and $\beta$ is not exceptional for $\phi$, then $\phi^{-n}(\beta)$ is a non-isotrivial set for all sufficiently large $n$. The proof uses work of Baker [Bak09] and Favre/Rivera-Letelier [FRL10] to produce elements in $\phi^{-n}(\beta)$ whose $v$-adic cross ratio is not 1 at a place $v$ of bad reduction. We then apply work of [Wan99] (see also [Vol95]) to give a quick proof of Theorem 1.4 in Section 4. In Section 5, we begin by proving Corollary 5.3, which states that if the roots of a polynomial $F$ are distinct and form a non-isotrivial set, then the curve $C$ given by $y^\ell = F(x)$ is a non-isotrivial curve when $\ell \neq p$ is a prime that is small relative to the degree of $F$. The techniques we use to do this build upon work in [HJ20]; the idea is to use the Adjunction Formula to show that the projection map onto the $x$-coordinate is the unique map $\theta : C \to \mathbb{P}^1$ of degree $\ell$ up to change of coordinates on $\mathbb{P}^1$ (see Lemma 5.1). We then use Corollary 5.3 and Theorem 3.1 to show the non-isotriviality of curves associated to $\phi^{-n}(\beta)$, where $\phi$ is a non-isotrivial rational function of degree greater than 1 and $\beta$ is not exceptional for $\phi$, in Theorem 5.5. In Section 6, we prove Proposition 6.1, which immediately implies Theorems 1.5 and 1.6; the proof uses Theorem 3.1 along with height bounds on non-isotrivial curves in characteristic $p$ due to Szpiro [Szp81] and Kim [Kim97] (see Theorem 6.3).
Finally, in Section 7, we present some applications of our results to other dynamical questions.
We note that the proof of Theorem 3.1 works the same for function fields in characteristic 0 as for function fields in characteristic $p$. Theorems 1.4, 1.5, and 1.6 all hold in stronger forms for function fields in characteristic 0, as proved in [GNT13]; the main difference here is that Yamanoi [Yam04] has proved the full Vojta conjecture for algebraic points on curves over function fields of characteristic 0 (see [Voj98, Voj87]), whereas Theorem 6.3 is weaker than the full Vojta conjecture for algebraic points on curves over function fields of characteristic $p$. Analogs of Theorems 1.5 and 1.6 have not yet been proved over number fields, except in some very special cases (see [Ban86, Zsi92, Sch74, PS68, Ric07]), but both theorems are implied by the abc conjecture (see [GNT13]).
Carlo Pagano, Joe Silverman, Dinesh Thakur, Felipe Voloch, and Julie Wang for many helpful conversations. We give special thanks to Juan Rivera-Letelier, who provided us with the argument for Proposition 3.2 and without whose help this paper likely would not have been possible.

Preliminaries
In this section we will review some terminology and results on heights, integral points, and dynamics. For background on heights, see [HS00,Lan83,BG06]. We set some notation below.
Throughout this paper, $K$ will denote a finite extension of $\mathbb{F}_p(t)$ and $k$ will denote the intersection $K \cap \overline{\mathbb{F}}_p$. Equivalently, $K$ is the function field of a smooth, projective curve $B$ defined over $k$.
2.1. Places, heights, and reduction. Let M K be the set of places of K, which corresponds to the set of closed points of B.
Since $K$ is a function field, we choose a place $\mathfrak{q}$ of $K$, denote and let $k_{\mathfrak{p}}$ be the residue field $\mathfrak{o}_K/\mathfrak{p}$. Also, define the local degree of $\mathfrak{p}$ to be Likewise, for each $\mathfrak{p} \in M_K$ we let $|\cdot|_{\mathfrak{p}}$ be a normalized absolute value such that the product formula holds for all $z \in K$. Moreover, we define $K_{\mathfrak{p}}$ to be the completion of $K$ with respect to $|\cdot|_{\mathfrak{p}}$ and define $\mathbb{C}_{\mathfrak{p}}$ to be the completion of the algebraic closure of $K_{\mathfrak{p}}$. For $z \in K$, let $h(z)$ denote the logarithmic height of $z$. For $\phi \in K(z)$ with $\deg \phi = d \geq 2$, let $h_\phi(z)$ denote the Call-Silverman canonical height of $z$ relative to $\phi$ [CS93], defined by $h_\phi(z) = \lim_{n \to \infty} h(\phi^n(z))/d^n$.
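The limit defining the canonical height immediately gives its basic functional equation. The following short derivation is a sketch that takes the existence of the limit for granted (this is part of [CS93]):

```latex
\begin{align*}
h_\phi(\phi(z))
  &= \lim_{n\to\infty} \frac{h\bigl(\phi^{n}(\phi(z))\bigr)}{d^{n}}
   = d \cdot \lim_{n\to\infty} \frac{h\bigl(\phi^{n+1}(z)\bigr)}{d^{n+1}}
   = d \cdot h_\phi(z).
\end{align*}
% If z is preperiodic, then h(\phi^n(z)) takes finitely many values,
% so the limit is 0 and h_\phi(z) = 0.
```

This is why a hypothesis such as "$\alpha$ is not preperiodic" is the natural one when a positive canonical height is needed.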
We will often write sums indexed by primes that satisfy some condition. These are taken to be primes of $\mathfrak{o}_K$. As an example of our indexing convention, observe that
We say that a rational function $\phi \in K(z)$ has good reduction at a place $\mathfrak{p}$ of $K$ if the map it induces on $\mathbb{P}^1$ is non-constant and well-defined modulo $\mathfrak{p}$. More precisely, we write $\phi(x) = f/g$, where all the coefficients of $f$ and $g$ are in $(\mathfrak{o}_K)_{\mathfrak{p}}$, and either $f$ or $g$ has at least one coefficient in $(\mathfrak{o}_K)_{\mathfrak{p}}^*$. We let $f_{\mathfrak{p}}$ and $g_{\mathfrak{p}}$ denote the reductions of $f$ and $g$ at $\mathfrak{p}$. We say that $\phi$ has good reduction at $\mathfrak{p}$ if $f_{\mathfrak{p}}$ and $g_{\mathfrak{p}}$ have no common root in the algebraic closure of the residue field of $\mathfrak{p}$ and $\deg(f_{\mathfrak{p}}/g_{\mathfrak{p}}) = \deg \phi$. We say that $\phi$ has bad reduction at $\mathfrak{p}$ if it does not have good reduction at $\mathfrak{p}$. This notion depends on our choice of coordinates. We say that $\phi$ has potentially good reduction at $\mathfrak{p}$ if there is a finite extension $K'$ of $K$, a prime $\mathfrak{q}$ of $K'$ lying over $\mathfrak{p}$, and a degree one rational function $\sigma \in K'(z)$ such that $\sigma \circ \phi \circ \sigma^{-1}$ has good reduction at $\mathfrak{q}$. We say that $\phi$ has genuinely bad reduction at $\mathfrak{p}$ if $\phi$ does not have potentially good reduction at $\mathfrak{p}$.

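2.2. Integral points. Let $S$ be a non-empty finite subset of $M_K$. The ring of $S$-integers in $K$ is defined to be $\mathfrak{o}_{K,S} = \{z \in K : |z|_{\mathfrak{p}} \leq 1 \text{ for all } \mathfrak{p} \notin S\}$. Given a place $\mathfrak{p}$ of $K$ and two points $\alpha = [x_1 : y_1]$ and $\beta = [x_2 : y_2]$ in $\mathbb{P}^1(\mathbb{C}_{\mathfrak{p}})$, define the $\mathfrak{p}$-adic chordal metric $\delta_{\mathfrak{p}}$ by $\delta_{\mathfrak{p}}(\alpha, \beta) = \frac{|x_1 y_2 - y_1 x_2|_{\mathfrak{p}}}{\max\{|x_1|_{\mathfrak{p}}, |y_1|_{\mathfrak{p}}\} \cdot \max\{|x_2|_{\mathfrak{p}}, |y_2|_{\mathfrak{p}}\}}$. Note that we always have $0 \leq \delta_{\mathfrak{p}}(\alpha, \beta) \leq 1$, and that $\delta_{\mathfrak{p}}(\alpha, \beta) = 0$ if and only if $\alpha = \beta$. Then the ring $\mathfrak{o}_{K,S}$ is equivalent to the set which is maximally distant from $\infty$ outside of $S$, i.e. the set of $z \in K$ such that $\delta_{\mathfrak{p}}(z, \infty) = 1$ for all $\mathfrak{p} \notin S$. We can now extend our definition of $S$-integrality to any divisor $D$ on $\mathbb{P}^1$ that is defined over $K$.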
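Definition 2.1. Fix a non-empty finite set of places $S \subset M_K$. Let $D$ be an effective divisor on $\mathbb{P}^1$ that is defined over $K$. Then $\alpha \in \mathbb{P}^1(K)$ is $S$-integral relative to $D$ provided that for all places $\mathfrak{p} \notin S$, all $\tau \in \mathrm{Gal}(\overline{K}/K)$, and all $\beta \in \mathrm{Supp}\, D$, we have $\delta_{\mathfrak{p}}(\alpha, \tau(\beta)) = 1$.
For affine coordinates $[\alpha : 1] \in \mathbb{P}^1(K)$ and a divisor $D$ defined over $K$ that does not contain the point at infinity in its support, the statement that $[\alpha : 1]$ is $S$-integral relative to $D$ is equivalent to $|\alpha - \tau(\beta)|_{\mathfrak{p}} \geq \max\{|\alpha|_{\mathfrak{p}}, 1\} \cdot \max\{|\tau(\beta)|_{\mathfrak{p}}, 1\}$ for all $\mathfrak{p} \notin S$, all $\tau \in \mathrm{Gal}(\overline{K}/K)$, and all $[\beta : 1] \in \mathrm{Supp}\, D$. Let $\theta$ be a linear fractional change of coordinate on $\mathbb{P}^1(K)$. Then $\alpha$ is $S$-integral relative to $\beta$ if and only if $\theta(\alpha)$ is $S$-integral relative to $\theta(\beta)$, provided we allow an enlargement of $S$ depending only on $\theta$. We prove a variant of this statement for any $\theta \in K[x]$ later in the paper. The following is a simple and standard consequence of our definition of $S$-integrality (see [Soo11, Corollary 2.4], for example). Recall that for a point $\alpha \in \mathbb{P}^1(K)$, the divisor $\phi^*(\alpha)$ is defined as $\sum_{\phi(\beta) = \alpha} e_\phi(\beta/\alpha)\, \beta$.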
Lemma 2.2. Let ϕ ∈ K(x) and S be a set of primes containing all the primes of bad reduction for ϕ. Then, for any α, γ ∈ P 1 (K), we have that ϕ(γ) is S-integral relative to α if and only if γ is S-integral relative to ϕ * (α).
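The link between the chordal metric and the ring of $S$-integers can be seen in a one-line computation. The sketch below assumes the standard normalization of the chordal metric (stated in the first comment; this normalization is an assumption of the sketch) and writes $\infty = [1 : 0]$:

```latex
% Assumed normalization:
% \delta_p([x_1:y_1],[x_2:y_2])
%   = |x_1 y_2 - y_1 x_2|_p / ( max{|x_1|_p,|y_1|_p} max{|x_2|_p,|y_2|_p} ).
\begin{align*}
\delta_{\mathfrak p}([z:1],[1:0])
  &= \frac{|z \cdot 0 - 1 \cdot 1|_{\mathfrak p}}
          {\max\{|z|_{\mathfrak p},1\}\cdot\max\{1,0\}}
   = \frac{1}{\max\{|z|_{\mathfrak p},1\}}, \\
\delta_{\mathfrak p}([z:1],[1:0]) = 1
  &\iff |z|_{\mathfrak p} \le 1 .
\end{align*}
```

So being at maximal chordal distance from $\infty$ at every place outside $S$ is the same as having no poles outside $S$.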
2.3. The cross ratio. Let | · | be a non-Archimedean absolute value on a field L. For any distinct x 1 , x 2 , y 1 , y 2 ∈ L we define: We may extend this to points in x 1 , x 2 , y 1 , y 2 ∈ L ∪ ∞ by eliminating the terms involving ∞; for example, Importantly, for σ ∈ PGL 2 (L), we have [z 1 , z 2 ; z 3 , z 4 ] = [σz 1 , σz 2 ; σz 3 , σz 4 ]. This is easily seen by noting that an element of PGL 2 (L) is a composition of translations, scaling maps, and the map sending every element to its multiplicative inverse, and that [z 1 , z 2 ; z 3 , z 4 ] is invariant under all these types of maps.
We will use the following two lemmas for points x 1 , x 2 , y 1 , y 2 ∈ L. The first lemma is immediate.
Lemma 2.4. Suppose that there are points a 1 , a 2 ∈ L such that |x 1 − a 1 |, |y 1 − a 1 | < |a 1 − a 2 | and |x 2 − a 2 |, |y 2 − a 2 | < |a 1 − a 2 |. Then Proof. After a translation, we may assume that a 1 = 0. Then |x 1 |, |y 1 | < |a 2 | and |x 2 |, |y 2 | = |a 2 |. Thus, we have Remark 2.5. The cross ratio of x 1 , x 2 , y 1 , y 2 is often defined without taking absolute values, i.e. as The advantage of the definition we use is that it extends to points in Berkovich space (see [FRL10]). While we do not use this extension, it can be used to give a quick proof of our Proposition 3.2. We give a slightly longer proof that we think may be more accessible for some readers.
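The inequality behind Lemma 2.4 is a direct consequence of the ultrametric inequality. The following sketch assumes one common normalization of the cross ratio (stated in the first comment; the normalization intended in the text may order the points differently), and uses the reduction $a_1 = 0$ from the proof:

```latex
% Assumed normalization (hypothetical; conventions differ):
% (x_1, x_2; y_1, y_2) = |x_1 - x_2| |y_1 - y_2| / ( |x_1 - y_1| |x_2 - y_2| ).
% With a_1 = 0: |x_1|, |y_1| < |a_2| and |x_2| = |y_2| = |a_2|.
\begin{align*}
|x_1 - x_2| &= |a_2|, & |y_1 - y_2| &= |a_2|, \\
|x_1 - y_1| &\le \max\{|x_1|,|y_1|\} < |a_2|, &
|x_2 - y_2| &\le \max\{|x_2 - a_2|,|y_2 - a_2|\} < |a_2| .
\end{align*}
\[
(x_1,x_2;y_1,y_2) = \frac{|a_2|^2}{|x_1-y_1|\,|x_2-y_2|} > 1 .
\]
```

The same kind of computation handles four points with strictly increasing absolute values, where each difference has the absolute value of its larger term.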

Non-Isotriviality of inverse images
In this section, we will prove the following theorem.
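Theorem 3.1. Let $\phi \in K(z)$ be a rational function of degree greater than 1 and let $\beta \in \mathbb{P}^1(K)$. Suppose that $\phi$ is not isotrivial and that $\beta$ is not exceptional for $\phi$. Then for all sufficiently large $n$ the set $\phi^{-n}(\beta)$ is not an isotrivial set.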
We will derive Theorem 3.1 from the following proposition.
Proposition 3.2. Suppose $\phi \in K(z)$ has genuinely bad reduction at the prime $\mathfrak{p}$. Let $|\cdot|$ be an extension of $|\cdot|_{\mathfrak{p}}$ to $\mathbb{C}_{\mathfrak{p}}$. Then for any non-exceptional $\alpha \in K$, and for all sufficiently large $n$, there are elements $z_1, z_2, z_3, z_4 \in \phi^{-n}(\alpha)$ such that $(z_1, z_2; z_3, z_4) > 1$.
Proof. We work over the non-Archimedean complete field $\mathbb{C}_{\mathfrak{p}}$, and consider the dynamical system induced by $\phi$ on the Berkovich projective line $\mathbb{P}^{1,\mathrm{an}}$. We will use some basic facts about the topology of the Berkovich projective line, including the classification of points as Type I, II, III, or IV; see [BR10] or [Ben19] for a detailed description of the topology of the Berkovich projective line. By [FRL10, Théorème E] (see also [Ben19, Theorem 8.15]), genuinely bad reduction implies that the equilibrium measure $\rho_\phi$ is non-atomic. Thus, there are four or more points all of the same type (I, II, III, or IV) in the support of $\rho_\phi$.
Since ρ ϕ is non-atomic and the inverse images of a non-exceptional point equidistribute we have the following fact. We also have the following basic facts about the topology of P 1,an .
Fact 3.4. Let $\xi(a, r)$, where $a \in K$ and $r > 0$, be a point of Type II or Type III corresponding to the disc $\{x \in K \mid |x - a| \leq r\}$. Then for any $\epsilon > 0$, there is an open set $U \subset \mathbb{P}^{1,\mathrm{an}}$ with $\xi(a, r) \in U$ such that every point $x$ of Type I in $U$ satisfies $r - \epsilon < |x - a| < r + \epsilon$.
Fact 3.5. Let $a_1$ and $a_2$ be any two points of the same type in $\mathbb{P}^{1,\mathrm{an}}$ which are not concentric Type II or III points. Then there exist open sets $U_1$ and $U_2$ with $a_1 \in U_1$ and $a_2 \in U_2$ such that $U_1 \cap \mathbb{P}^1(\mathbb{C}_{\mathfrak{p}})$ and $U_2 \cap \mathbb{P}^1(\mathbb{C}_{\mathfrak{p}})$ are disjoint open discs.
Proof. Since $a_1$ and $a_2$ are not concentric, the point $a_1 \wedge a_2$ is not equal to $a_1$ or $a_2$ (see [FRL10]). Now let $D_i$ be the open disc corresponding to any Type II point in the open interval $(a_i, a_1 \wedge a_2)$, for $i = 1, 2$. Then there are open sets $U_i$ such that $U_i \cap \mathbb{P}^1(\mathbb{C}_{\mathfrak{p}}) = D_i$ for $i = 1, 2$.
Suppose that the support of $\rho_\phi$ contains two non-concentric points $z_1, z_2$ of the same type. Then, by Facts 3.3 and 3.5, for all sufficiently large $n$ there must be open discs $D(a_1, r_1)$ and $D(a_2, r_2)$ with $|a_1 - a_2| > \max\{r_1, r_2\}$ and points $x_1, x_2, y_1, y_2 \in \phi^{-n}(\alpha)$ with $x_1, y_1 \in D(a_1, r_1)$ and $x_2, y_2 \in D(a_2, r_2)$. By Lemma 2.4, we have $(x_1, x_2; y_1, y_2) > 1$, proving the proposition in this case.
Now suppose that the support of $\rho_\phi$ contains four concentric points of Type II or Type III, corresponding to closed discs $D(a, r_i)$, for $i = 1, 2, 3, 4$, for some fixed $a$. We suppose that $r_1 < r_2 < r_3 < r_4$, and after an affine change of coordinates, we may suppose that $a = 0$. By Facts 3.3 and 3.4, for any $\epsilon > 0$, there must be an $n$ such that $\phi^{-n}(\alpha)$ contains points $z_1, z_2, z_3, z_4$ with $|z_i|$ within $\epsilon$ of $r_i$ for each $i$. Choosing $\epsilon$ appropriately, we will then have $|z_1| < |z_2| < |z_3| < |z_4|$. Then $(z_1, z_3; z_2, z_4) > 1$ by Lemma 2.3.
Proof of Theorem 3.1. By [Bak09, Theorem 1.9], since $\phi$ is non-isotrivial, it must have genuinely bad reduction at some prime $\mathfrak{p}$. Then we may apply Proposition 3.2 to obtain four points in $\phi^{-n}(\beta)$ with cross ratio greater than one for any sufficiently large $n$. Since the cross ratio of four points in $\mathbb{F}_p \cup \infty$ is always 1 and the cross ratio is invariant under change of coordinate, we see that $\phi^{-n}(\beta)$ is a non-isotrivial set for all sufficiently large $n$.
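The fact that constants always have cross ratio 1, used in the last step above, comes from the triviality of the absolute value on the constant field. A brief sketch, for four distinct constants $c_1, \dots, c_4$ (our notation) and under the assumption that the cross ratio is a ratio of products of absolute values of pairwise differences:

```latex
% Every nonzero constant c is a root of unity: c^{q-1} = 1 for a suitable
% power q of p, hence |c|^{q-1} = 1 and so |c| = 1.
\begin{align*}
c_i \neq c_j \ \Longrightarrow\ |c_i - c_j| = 1,
\qquad\text{so}\qquad
\frac{|c_1 - c_2|\,|c_3 - c_4|}{|c_1 - c_3|\,|c_2 - c_4|} = \frac{1 \cdot 1}{1 \cdot 1} = 1 .
\end{align*}
% If one of the points is \infty, the terms involving it are dropped and the
% remaining factors are still all equal to 1.
```

Any four points with cross ratio different from 1 therefore cannot be moved into the constants by a change of coordinates.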

Proof of Theorem 1.4
We will use the following theorem due to Wang [Wan99, Theorem in P 1 (K), Page 337] and Voloch [Vol95].
Theorem 4.1. Let D be an effective divisor on P 1 that is defined over K. If the points in Supp D form a non-isotrivial set, then the set of points in P 1 (K) that are S-integral relative to D is finite.
The corollary below follows easily.
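Corollary 4.2. Let $\phi \in K(z)$ and $\beta \in K$, and suppose that for some $i$ the set $\phi^{-i}(\beta)$ is not an isotrivial set. Then for any $\alpha \in K$, the forward orbit $O^+_\phi(\alpha)$ contains only finitely many points that are $S$-integral relative to $\beta$.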
Proof. We may extend S to contain all the primes of bad reduction for ϕ. The set of iterates ϕ n−i (α) that are S-integral relative to (ϕ i ) * (β) is finite by Theorem 4.1, so by Lemma 2.2, the set of points ϕ n (α) that are S-integral relative to β must be finite.
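The index bookkeeping in this proof can be laid out as a chain of equivalences. This is a sketch; it uses the standard fact (assumed here, not stated in the text) that good reduction is preserved under composition, so that Lemma 2.2 applies to the iterate $\phi^i$ with the same set $S$:

```latex
\begin{align*}
\phi^{n}(\alpha) = \phi^{i}\bigl(\phi^{n-i}(\alpha)\bigr)
  \text{ is $S$-integral relative to } \beta
&\iff \phi^{n-i}(\alpha) \text{ is $S$-integral relative to } (\phi^{i})^{*}(\beta), \\
\operatorname{Supp}\,(\phi^{i})^{*}(\beta) &= \phi^{-i}(\beta)
  \quad \text{(a non-isotrivial set by hypothesis).}
\end{align*}
% Theorem 4.1 with D = (\phi^i)^*(\beta) bounds the left-hand side of the
% second condition, hence also the first.
```

Only finitely many points can satisfy the right-hand condition, and each of them accounts for the corresponding points $\phi^n(\alpha)$ on the left.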
The proof of Theorem 1.4 is now easy.
Proof of Theorem 1.4. By Theorem 3.1, there is some i such that ϕ −i (β) is not an isotrivial set. Applying Corollary 4.2 then gives the desired conclusion.

Non-isotriviality of certain curves
Let $\pi : C \to \mathbb{P}^1$ be a separable nonconstant morphism defined over $K$. We define the ramification locus of $\pi$ to be the support of $\pi(R_\pi)$, where $R_\pi$ is the ramification divisor of $\pi$. If the ramification locus of $\pi$ is an isotrivial set, then it follows from descent theory (see [Gro63], for example) that $C$ must be isotrivial. On the other hand, given any finite subset $U$ of $\mathbb{P}^1$, one can use interpolation to construct a nonconstant separable morphism $f : \mathbb{P}^1 \to \mathbb{P}^1$ such that the ramification locus of $f$ contains $U$; thus, there are isotrivial curves that admit nonconstant separable morphisms $\pi : C \to \mathbb{P}^1$ such that the ramification locus of $\pi$ is a non-isotrivial set. We can show, however, that if the degree of $\pi : C \to \mathbb{P}^1$ is a prime $\ell \neq p$ that is small relative to the genus of $C$ and the ramification locus of $\pi$ is a non-isotrivial set, then $C$ must indeed be a non-isotrivial curve. This enables us to prove Theorem 5.5, which gives rise to diophantine estimates used in the proofs of Theorems 1.5 and 1.6. The technique here is similar to that of [HJ20]. We begin with a lemma about uniqueness of low prime degree maps on curves of high genus.
Proof. Suppose that C is isotrivial; then there are finite extensions K ′ of K and k ′ of k such that there is a model C for C × K K ′ over the k'-curve X corresponding to the function field K ′ such that for any place t ∈ X(k ′ ), the curve C t × k(t) L is isomorphic to C × K L, where k(t) is the field of definition of t and L = K ′ · k(t). Let P be a model for P 1 over X. Then, for all but finitely many places t ∈ X(k ′ ), the morphism θ specializes to a degree ℓ morphism θ t : C t −→ P 1 k(t) defined over k(t). Let θ 2 = θ t × k(t) L. Since θ 2 : C −→ P 1 has degree ℓ, and (ℓ − 1) 2 < g, there is a λ ∈ PGL 2 (K) such that θ 2 = λ • θ, by Lemma 5.1. But λ must take the ramification locus of θ to the ramification locus of θ 2 , which is defined over k ′ . Hence, the ramification locus of θ must be isotrivial. That gives a contradiction.
Corollary 5.3. Let $F$ be a polynomial over $K$ without repeated roots such that the roots of $F$ form a non-isotrivial set. Let $\ell$ be a prime number such that $\ell \neq p$ and $\ell - 1 < \deg F/2 - 1$. Then the curve $C$ given by $y^\ell = F(x)$ is not isotrivial.
Proof. Let θ : C −→ P 1 be the map coming from projection onto the x-coordinate. Then deg θ = ℓ. Since the genus of C is at least (ℓ − 1) deg F/2 − (ℓ − 1) by Riemann-Hurwitz and the ramification locus of θ includes the roots of F (note: it will be larger than that if θ also ramifies over the point at infinity), applying Theorem 5.2 shows that C is not isotrivial.
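The genus bound and the role of the hypothesis $\ell - 1 < \deg F/2 - 1$ can be made explicit. The following is a sketch of the Riemann-Hurwitz computation, where $r$ (our notation) denotes the number of branch points of $\theta$; the ramification is tame since $\ell \neq p$, and $\theta$ is totally ramified over each root of $F$ because the roots are simple:

```latex
% r = number of branch points of \theta, so r = \deg F or r = \deg F + 1.
\begin{align*}
2g - 2 &= \ell \cdot (-2) + r(\ell - 1)
  \quad\Longrightarrow\quad
  g = \frac{(\ell - 1)(r - 2)}{2}
    \;\ge\; (\ell - 1)\Bigl(\frac{\deg F}{2} - 1\Bigr), \\
\ell - 1 < \frac{\deg F}{2} - 1
  &\quad\Longrightarrow\quad
  (\ell - 1)^2 < (\ell - 1)\Bigl(\frac{\deg F}{2} - 1\Bigr) \le g .
\end{align*}
```

The second line is exactly the inequality $(\ell - 1)^2 < g$ needed to invoke the uniqueness statement for degree $\ell$ maps.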
As mentioned above, there are obvious examples of maps π : C −→ P 1 , where C is isotrivial but the ramification locus of π is not, but we have not found examples of isotrivial curves of the specific form y m = F (x), for F a polynomial with distinct roots that form a non-isotrivial set and m is an integer greater than 1 that is not a power of p.
Question 5.4. Does there exist an isotrivial curve of the form y m = F (x), where F is a polynomial with distinct roots that form a non-isotrivial set and m is an integer greater than 1 that is not a power of p?
Corollary 5.3 and the techniques of [Hin16] can be used to show that when p is odd and m is even, the answer to Question 5.4 is "no"; we cannot however rule out examples where m is odd or p = 2.
We are now ready to prove a theorem guaranteeing the non-isotriviality of certain curves obtained by taking inverse images of points under iterates of a non-isotrivial rational function.
Theorem 5.5. Let $\phi \in K(x)$ be a non-isotrivial rational function. Let $\beta \in K$ be non-exceptional for $\phi$. Then for any $\ell \neq p$, there is an $n$ such that the curve given by $y^\ell = \prod_{\gamma \in \overline{K},\ \phi^n(\gamma) = \beta} (x - \gamma)$ (where the product is taken without multiplicities) is not an isotrivial curve.
Proof. If $\infty \notin \phi^{-n}(\beta)$ for any $n$, then this is immediate from Corollary 5.3 and Theorem 3.1. Otherwise, since $\deg_s \phi > 1$ (because purely inseparable rational functions are isotrivial) and $\beta$ is not exceptional for $\phi$, there is some $m$ such that $\phi^{-m}(\beta)$ contains at least three points. Thus, there is some point $\beta' \in \phi^{-m}(\beta)$ such that $\infty \notin \phi^{-n}(\beta')$ for any $n$. Then there is some $m'$ such that $\phi^{-m'}(\beta')$ is not isotrivial by Theorem 3.1, and since the set of points other than $\infty$ in $\phi^{-(m+m')}(\beta)$ contains $\phi^{-m'}(\beta')$, this set is non-isotrivial as well, so the curve given by $y^\ell = \prod_{\gamma \in \overline{K},\ \phi^{m+m'}(\gamma) = \beta} (x - \gamma)$ is not an isotrivial curve by Corollary 5.3.
The second author conjectured [Hin16, Conjecture 3.1] that when $\phi$ is a non-isotrivial polynomial of degree prime to $p$ and $\beta$ is not postcritical for $\phi$, then for some $n$ and some $\ell$ prime to $p$, the curve $y^\ell = \phi^n(x) - \beta$ is not isotrivial. Theorem 5.5 answers this with many of the hypotheses removed. Note that by taking the product without multiplicities, we essentially remove the issue of $\beta$ being postcritical. We note that Ferraguti and Pagano have proved Theorem 5.5 in the special case where $\phi$ is a quadratic polynomial, $\ell = 2$, and $p \neq 2$ (see [FP20, Theorem 2.4]).
Proof of Theorems 1.5 and 1.6
Theorems 1.5 and 1.6 will both follow from the following more general statement.
We will prove Proposition 6.1 by combining effective forms of the Mordell Conjecture over function fields (see Theorem 6.3) with Theorem 5.5 and the following lemma from [BT19, Lemma 5.2] (see also [GNT13, Proposition 5.1]). Note that while this lemma is stated in characteristic 0 in [BT19], the proof carries over word for word to finite extensions of $\mathbb{F}_p(t)$.
for some 0 < m < n. Then for any ǫ > 0, we have for all n.
The next result we use follows from (any of the) effective forms of the Mordell Conjecture over function fields [Kim97, Mor94, Szp81]. To make this precise, we need some terminology. Let $C$ be a curve over $K$ and let $P \in C$ be a point on $C$ defined over some finite extension $K(P)/K$. Then we let $h_{K_C}(P)$ denote the logarithmic height of $P$ with respect to the canonical divisor $K_C$ of $C$ and let $d_K(P) = \frac{2g(K(P)) - 2}{[K(P) : K]}$ denote the logarithmic discriminant of $P$; here $g(K(P))$ is the genus of $K(P)$. Then we have the following height bounds for rational points on non-isotrivial curves due to Szpiro [Szp81] and Kim [Kim97].
Theorem 6.3. Let $C$ be a non-isotrivial curve of genus at least two over a finite extension $K$ of $\mathbb{F}_p(t)$. Then there are constants $B_1 > 0$ and $B_2$ (depending only on $C$) such that (6.3.1) $h_{K_C}(P) \leq B_1\, d_K(P) + B_2$ holds for all $P \in C$.
Remark 6.4. The first of these bounds (with explicit B 1 and B 2 in the semistable case) are due to Szpiro [Szp81,§3], and the best possible bounds (i.e., with smallest possible B 1 ) are due to Kim [Kim97]. Strictly speaking, the bound in [Szp81, §3] is stated for semistable curves. However, one may always pass to a finite extension L/K over which C is semistable [Szp81,§1] and thus obtain bounds of the form in (6.3.1). Likewise, the bound in [Kim97] is stated for curves with nonzero Kodaira-Spencer class. However, the general non-isotrivial case follows from this one as follows. Assuming that C/K is non-isotrivial and char(K) = p, there is an inseparability degree r = p e and a separable extension L/K such that C is defined over L r and that the Kodaira-Spencer class of C over L r is nonzero; see [Szp81,. Now apply Kim's theorem to C/L r . In either case, Castelnuovo's inequality [Sti09,Theorem 3.11.3] applied to the composite extensions L(P ) = LK(P ) or L r (P ) = L r K(P ) may be used to appropriately alter B 1 and B 2 to go from bounds with d L or d L r back to those with d K .
Before we apply the height bounds for points on curves from Theorem 6.3 to dynamics, we need the following elementary observation about valuations and powers.
Lemma 6.5. Let $K/\mathbb{F}_p(t)$ be a finite extension and let $\ell \neq p$ be a prime. Then there is a finite extension $L$ of $K$ such that if $u$ is any element of $K$ with the property that $\ell \mid v_{\mathfrak{p}}(u)$ for all primes $\mathfrak{p}$ of $K$, then $u$ is an $\ell$-th power in $L$.
Proof. Suppose that $u \in K$ is such that $\ell \mid v_{\mathfrak{p}}(u)$ for all primes $\mathfrak{p}$ of $K$. Then the divisor $(u) = \ell D_u$ for some divisor $D_u \in \mathrm{Div}^0(K)$ of degree 0. Hence, the linear equivalence class of $D_u$ is an $\ell$-torsion class in $\mathrm{Cl}^0(K)$, the group of divisor classes of degree 0. In particular, there are only finitely many possible linear equivalence classes for $D_u$ by [Sti09, Proposition 5.1.3]. Hence there is a finite set $S$ of $u \in K$ with $(u) = \ell D_u$ for some $D_u \in \mathrm{Div}^0(K)$ such that for any $u' \in K$ with $(u') = \ell D_{u'}$ for some $D_{u'} \in \mathrm{Div}^0(K)$, we have that $D_{u'}$ is linearly equivalent to $D_u$ for some $u \in S$. Let $L'$ be the finite extension of $K$ generated by the $\ell$-th roots of the elements of $S$. Now if $u$ and $u'$ are two such elements of $K$ as above such that $D_u$ and $D_{u'}$ are linearly equivalent, then $D_u - D_{u'} = (w_{u,u'})$ for some $w_{u,u'} \in K$. Hence, $u/u' = c_{u,u'}\, w_{u,u'}^{\ell}$ for some $c_{u,u'}$ in the field of constants of $K$. In particular, there are only finitely many possible such $c_{u,u'}$ since the field of constants of $K$ is finite. Adjoining the $\ell$-th roots of these $c_{u,u'}$ to $L'$ gives a finite extension $L$ of $K$ with the desired property.
Lemma 6.6. Let $S$ be a finite set of primes of $K$, let $F \in \mathfrak{o}_{K,S}[z]$ be a polynomial without repeated roots, and let $\ell \neq p$ be a prime such that $C : y^\ell = F(x)$ is a non-isotrivial curve of genus $g(C) > 1$. Then there are constants $r_1 > 0$ and $r_2$ (depending on $F$, $\ell$, $K$, and $S$) such that (6.6.1) $\sum_{v_{\mathfrak{p}}(F(a)) > 0,\ \ell \nmid v_{\mathfrak{p}}(F(a))} N_{\mathfrak{p}} \geq r_1 h(a) + r_2$ holds for all $a \in \mathfrak{o}_{K,S}$.
Proof. Suppose that $C : y^\ell = F(x)$ is a non-isotrivial curve of genus $g(C) > 1$. Then given $a \in \mathfrak{o}_{K,S}$, we let $u_a := F(a)$ and choose a corresponding point $P_a = (a, \sqrt[\ell]{u_a})$ on $C$. From here, we proceed in cases.
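The passage from linearly equivalent divisors to the relation between $u$ and $u'$ is the key step in this proof. Spelled out as a short derivation (a sketch, writing $w$ for $w_{u,u'}$ and $c$ for $c_{u,u'}$):

```latex
\begin{align*}
(u) = \ell D_u,\quad (u') = \ell D_{u'},\quad D_u - D_{u'} = (w)
&\implies (u/u') = \ell\,(w) = (w^{\ell}) \\
&\implies \bigl(u / (u'\, w^{\ell})\bigr) = 0 \\
&\implies u = c\, u'\, w^{\ell} \text{ for some nonzero constant } c ,
\end{align*}
% since a function with trivial divisor has no zeros or poles and is
% therefore a nonzero constant.
```

Once $u'$ and $c$ have $\ell$-th roots in $L$, so does $u$, because $w$ already lies in $K$.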
Suppose first that ℓ | v p (u a ) for all primes p of K. Then by Lemma 6.5 there exists a finite extension L/K (independent of a) such that u a is an ℓth power in L. In particular, since we may assume that L contains a primitive ℓth root of unity, K(P a ) ⊆ L. Therefore, (6.3.1) implies that h K C (P a ) is absolutely bounded. However, the canonical divisor class is ample in genus at least 2, so that the set of possible points P a is finite in this case. Therefore, h(a) is bounded and (6.6.1) holds trivially (take r 1 = 1 and choose r 2 to be sufficiently negative). Now suppose that there exists a prime p of K such that ℓ ∤ v p (u a ). Then we may apply the genus formula in [Sti09, Corollary 3.7.4] to deduce that since the only way that u a := F (a) can have negative valuation at p is if p ∈ S. However, this is a finite set of primes. Therefore, (6.6.2) implies that On the other hand, if π : C → P 1 is the map given by projection onto the x-coordinate, then π pulls back a degree one divisor on P 1 (yielding the Weil height on P 1 ) to a degree ℓ divisor on C.
Hence, the algebraic equivalence of divisors and [Sil94, Thm III.10.2] together imply that In particular, we may deduce that for all ε > 0 and all a ∈ K (not just a ∈ o_{K,S}). Finally, by choosing ε = 1 and combining (6.3.1), (6.6.3), and (6.6.4), we see that there are constants r_1 > 0 and r_2 (depending on F, ℓ, K, and S) such that (6.6.1) holds for all a ∈ o_{K,S}. In particular, after replacing r_1 and r_2 with the minimum of the corresponding constants from the first and second cases above, we obtain Lemma 6.6.
Lemma 6.7. Let f ∈ K[z] be a non-isotrivial polynomial with deg f = d > 1 and let α, γ ∈ K where γ is not postcritical and α is not preperiodic. Then for any prime ℓ ≠ p, there is a δ > 0 such that for all sufficiently large n, we have
Proof. Let S be a finite set of primes such that α, γ, and all the coefficients of f are in o_{K,S}. Then f^n(α) ∈ o_{K,S} for all n. By Theorem 5.5, there is an m such that the curve given by is not an isotrivial curve. There is an ω ∈ K (the leading term of f^m(z) − γ) and an e (coming from the degree of inseparability of f ℓ ) such that Applying Lemma 6.6 with a = f^{n−m}(α) we see that since ℓ ≠ p, we have constants r_1, r_2 such that .
for all n. Choosing a δ such that 0 < δ < r 1 /d m then gives for all sufficiently large n, as desired.
We are now ready to prove Proposition 6.1.
Proof of Proposition 6.1. We first note that it suffices to prove this after passing to a finite extension of K since ℓ ≠ p. To see this, let L be a finite extension of K, let L_s denote the separable closure of K in L, and let q be a prime in L lying over a prime p of K. Then unless p is in the finite set of primes of K that ramify in L_s. We also note that h_f(α) > 0 since α is not preperiodic and f is not isotrivial, by [Bak09, Corollary 1.8].
We change coordinates so that β = 0. Let r be the smallest positive integer such that f r (γ) = 0. After passing to a finite extension we may assume that all the roots of f r (z) are in K. Let e = e f r (γ/β) and write f r (z) = (z − γ) e g(z). Then for all but finitely many primes p of K we have for all n.
Since γ is not post-critical, by Lemma 6.7, there exists δ > 0 such that for all sufficiently large n, we have Let W be the set of roots of f^r(z) that are not roots of f^{r′}(z) for any r′ < r. Let S_1 be the set of primes of bad reduction for f and let S_2 be the set of primes p such that v_p(f^{r′}(w)) > 0 for some r′ < r and some w ∈ W ∪ {α}. Now, for each n, let Y(n) be the set of primes p such that v_p(f^n(α) − γ) > 0 and v_p(f^{n′}(α)) > 0 for some n′ < n + r. If p ∉ S_1 ∪ S_2, then v_p(f^m(α) − γ′) > 0 for some γ′ ∈ W and some m < n. Thus, since γ is not in the forward orbit of any element of W and the sets W, S_1, and S_2 are all finite, we may apply Lemma 6.2 to each element of W. We obtain (6.7.4) for all sufficiently large n. Combining (6.7.4) with (6.7.2) and (6.7.3), we see that for all sufficiently large n, there is a prime p such that
• v_p(f^{n′}(α)) = 0 for all 0 < n′ < n; and
• v_p(f^{n+r}(α)) = e v_p(f^n(α) − γ).
Since e is prime to ℓ, it follows that the Zsigmondy set Z(f, α, β, ℓ) is finite.

Applications
The original Zsigmondy theorem [Ban86, Zsi92] had to do with orders of algebraic numbers modulo primes. We can treat a related dynamical problem; here we will not assume non-isotriviality. We begin with some notation and terminology. If α ∈ K is an integer at a prime p, we let α_p ∈ k_p be its reduction at p. If f ∈ K[x] and all of the coefficients of f are integers at p, we let f_p ∈ k_p[x] be the reduction of f at p obtained by reducing each coefficient of f at p. If g : U → U is any map from a set to itself and u ∈ U is periodic under g, then the prime period of u for g is the smallest positive integer m such that g^m(u) = u. We say that a polynomial in
Theorem 7.1. Let f be a polynomial of degree greater than 1 and let α ∈ K be a point that is not preperiodic for f. If f is not both isotrivial and additive, then for all but finitely many positive integers n, there is a prime p such that the prime period of α_p for f_p is equal to n. If f is isotrivial and additive, then for all but finitely many positive integers n that are not a power of p, there is a prime p such that the prime period of α_p for f_p is equal to n.
Proof. If f is not isotrivial, this follows immediately from Theorem 1.6 by letting α = β. If f is isotrivial, then after a change of coordinates, we may assume that f ∈ k[x] and α ∈ K \ k for some finite extension k of F p . If f is not additive, then for all but finitely many positive integers n, there exists β_n ∈ k having prime period n for f, by [Pez94,Theorem]. For each such β_n, there exists a prime p_n such that α_{p_n} = β_n, so we see that for all but finitely many positive integers n, there exists a prime p such that the prime period of α_p for f_p is equal to n. If f is additive, then for all but finitely many positive integers n that are not a power of p, there exists β_n ∈ k having prime period n for f, by [Pez94,Theorem]. Then, as in the non-additive case, we may choose p_n such that α_{p_n} = β_n.
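The notion of prime period used in Theorem 7.1 can be illustrated in a toy setting. The sketch below is not part of the argument: it assumes that the residue field is a prime field F_q (so it only models primes of degree one), that f has integer coefficients, and that the reduction of α is represented by an integer a. The names `prime_period` and `reduced_prime_period` are hypothetical helpers introduced only for this illustration.

```python
def prime_period(g, u, max_steps):
    """Smallest m >= 1 with g^m(u) == u, or None if no such m <= max_steps."""
    x = u
    for m in range(1, max_steps + 1):
        x = g(x)
        if x == u:
            return m
    return None


def reduced_prime_period(coeffs, a, q):
    """Prime period of a mod q under f mod q, for a prime q.

    coeffs lists the integer coefficients of f, lowest degree first.
    Assumption: the residue field is the prime field F_q.
    """
    def f_q(x):
        # Horner evaluation of f modulo q.
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % q
        return acc

    # A periodic point of a self-map of a set with q elements has period <= q.
    return prime_period(f_q, a % q, q)


# f(x) = x^2 + 1 and a = 0: modulo 5 the orbit is 0 -> 1 -> 2 -> 0.
print(reduced_prime_period([1, 0, 1], 0, 5))
```

In this toy model a return value of `None` means that the reduced point is preperiodic but not periodic, so that prime contributes no prime period at all. The theorem itself ranges over primes of every degree, which this sketch does not attempt to model.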
Theorem 1.4 allows one to prove characteristic p analogs of various results that rely on the results of [Sil93]. For example, the proofs of Theorems 4 and 5 of [BGH + 13] extend easily to the case of non-isotrivial rational functions over a function field in characteristic p, using Theorem 1.4. Similarly, one can use Theorem 1.4 to prove Theorem 4 of [BIJ + 17] with the additional hypothesis that at least one of the wandering critical points of ϕ has a ramification degree that is not a power of p.
We will now prove a few results about unicritical polynomials that rely on Theorem 1.5, which is not available over number fields.
The following lemma is very similar to [BT18, Proposition 3.1]; we include the proof for the sake of completeness.
Lemma 7.2. Let f(x) = x^d + c where d is an integer greater than 1 that is not divisible by p, let β ∈ K, and let n be a positive integer. Let p be any prime of K such that (i) |c|_p ≤ 1; (ii) |β|_p ≤ 1; and (iii) |f^m(0)|_p = 1 for all 0 ≤ m ≤ n. Then p does not ramify in K(f^{−n}(β)).
Proof. We proceed by induction. The case where n = 1 follows immediately from taking the discriminant of x^d + (c − β). Now, let p be a prime satisfying (i)-(iii) for some n ≥ 2. Then it also satisfies them for n − 1, so by the inductive hypothesis, the prime p does not ramify in K(f^{−(n−1)}(β)). Now, K(f^{−n}(β)) is obtained from K(f^{−(n−1)}(β)) by adjoining elements of the form (c − γ_i)^{1/d} for f^{n−1}(γ_i) = β. For any prime q in K(f^{−(n−1)}(β)) lying over p, we see that |γ_i|_q ≤ 1 by (i) and (ii) and |γ_i|_q ≥ 1 by (i), (ii), and (iii). Thus, each q in K(f^{−(n−1)}(β)) lying over p does not ramify in any K(f^{−(n−1)}(β))((c − γ_i)^{1/d}) = K(f^{−n}(β)). Since each such q does not ramify over p by the inductive hypothesis, it follows that p does not ramify in K(f^{−n}(β)), as desired.
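The base case rests on the fact that the discriminant of x^d + a is, up to sign, d^d a^{d−1}, so that x^d + a stays separable after reduction exactly when the residue characteristic does not divide d and a reduces to a nonzero element. The following sketch checks this separability criterion by computing gcd(g, g′) over a prime field. It is only an illustration under simplifying assumptions: the residue field is taken to be F_q for a prime q, the constant term a is an integer, and the helper names are hypothetical.

```python
def trim(poly):
    """Drop trailing zero coefficients in place (lowest degree first)."""
    while poly and poly[-1] == 0:
        poly.pop()
    return poly


def poly_mod(a, b, q):
    """Remainder of a divided by the nonzero polynomial b in F_q[x]."""
    a = trim([c % q for c in a])
    b = trim([c % q for c in b])
    inv_lead = pow(b[-1], -1, q)  # modular inverse, Python 3.8+
    while len(a) >= len(b):
        factor = a[-1] * inv_lead % q
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - factor * c) % q
        trim(a)
    return a


def poly_gcd(a, b, q):
    """A gcd of a and b in F_q[x] via the Euclidean algorithm."""
    a = trim([c % q for c in a])
    b = trim([c % q for c in b])
    while b:
        a, b = b, poly_mod(a, b, q)
    return a


def is_separable_mod(d, a, q):
    """True when x^d + a has no repeated roots over an algebraic closure of F_q."""
    g = [a % q] + [0] * (d - 1) + [1]   # x^d + a
    dg = [0] * (d - 1) + [d % q]        # derivative d * x^(d-1)
    return len(poly_gcd(g, dg, q)) == 1  # gcd is a nonzero constant


print(is_separable_mod(2, 1, 3), is_separable_mod(3, 1, 3), is_separable_mod(2, 0, 5))
```

The three calls cover the three situations in the criterion: residue characteristic prime to d with nonzero constant term, residue characteristic dividing d, and vanishing constant term.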
The next lemma follows a proof that is similar to that of [BT18, Proposition 3.2] and [BIJ + 17, Theorem 5].
Lemma 7.3. Let f(x) = x^d + c, where c ∈ K \ k and d is an integer greater than 1 that is not divisible by p. Let β ∈ K, let ℓ ≠ p be a prime number, and let e be a positive integer such that ℓ^e divides d. Suppose that p is a primitive ℓ-divisor of f^n(0) − β such that |c|_p = |β|_p = 1. Then for any prime p′ in K(f^{−(n−1)}(β)) that lies over p, there is a prime q in K(f^{−n}(β)) such that ℓ^e divides e(q/p′).
Proof. Let p ′ be a prime in K(f −(n−1) (β)) lying over p. By Lemma 7.2, the prime p does not ramify lying over p ′ , we see that ℓ e |e(q/p ′ ).
Using the lemmas above, we can prove a result for separable non-isotrivial polynomials of the form x^d + c that is a special case of a characteristic p analog of [BT18, Theorem 1.1]. Note that if f(x) = x^d + c and d is not divisible by p, then f is isotrivial if and only if c ∈ F p . To see this, note that h_f(0) = h(c)/d > 0 when c ∉ F p , as can be seen by simply considering the orbit of 0 under f at the places v where |c|_v > 1. Therefore, if c ∉ F p , then f has a critical point that is not preperiodic, and hence f cannot be isotrivial. We note also that a polynomial of the form x^d + c is separable if and only if p ∤ d.
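The height computation for the critical point 0 can be seen concretely in the simplest case. The sketch below assumes K = F_q(t) for a prime q, takes c = t, and uses the degree in t as a stand-in for the height (so h(c) = 1); these choices, and the helper names, are assumptions made for illustration only. It computes the degrees of f^n(0) for f(x) = x^d + t, which grow like d^{n−1}, in line with h_f(0) = h(c)/d.

```python
def poly_mul(a, b, q):
    """Product of two polynomials in F_q[t], coefficients lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % q
    return out


def poly_pow(a, d, q):
    """The d-th power of a in F_q[t] by repeated multiplication."""
    result = [1]
    for _ in range(d):
        result = poly_mul(result, a, q)
    return result


def critical_orbit_degrees(d, q, n_max):
    """Degrees in t of f^n(0) for f(x) = x^d + t over F_q, n = 1..n_max."""
    c = [0, 1]    # the polynomial t
    value = [0]   # f^0(0) = 0
    degrees = []
    for _ in range(n_max):
        powered = poly_pow(value, d, q)
        size = max(len(powered), len(c))
        value = [
            ((powered[i] if i < len(powered) else 0)
             + (c[i] if i < len(c) else 0)) % q
            for i in range(size)
        ]
        # Remove trailing zeros so that len(value) - 1 is the true degree.
        while len(value) > 1 and value[-1] == 0:
            value.pop()
        degrees.append(len(value) - 1)
    return degrees


print(critical_orbit_degrees(2, 3, 4))
```

Dividing the n-th degree by d^n gives 1/d for every n in this example, which is the value h(c)/d predicted above when h(c) = 1.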
Theorem 7.4. Let f (x) = x d + c be a separable non-isotrivial polynomial of degree d > 1. Let β ∈ K. Then for all sufficiently large n, there is a prime p of K such that p ramifies in K(f −n (β)) but not in K(f −(n−1) (β)).
The next result is a characteristic p analog of a theorem of Pagano [Pag21, Theorem 1.3] for number fields (see also [BJS18] for a similar result); the growth condition here is stronger than what Pagano obtains over number fields.
Theorem 7.5. Let f (x) = x d + c be a separable non-isotrivial polynomial of degree d > 1. Let β ∈ K. Then there is a constant C(n, β) > 0 such that [K(f −n (β)) : K] > C(n, β)d n for all positive integers n.
We can now prove a finite index result for iterated monodromy groups of quadratic polynomials. We need a little terminology to state our result.
Let L be a field. Let f be a quadratic polynomial and let β ∈ L. For n ∈ N, let L_n(f, β) = L(f^{−n}(β)) be the field obtained by adjoining the nth preimages of β under f to L(β), and let L_∞(f, β) = ⋃_{n=1}^∞ L_n(f, β). We let G_∞(β) = Gal(L_∞(f, β)/L). The group G_∞(β) embeds into Aut(T^2_∞), the automorphism group of an infinite 2-ary rooted tree T^2_∞ (note that all of the definitions here generalize to rational functions of any degree; see [Odo85] or [JKMT16], for example). Boston and Jones [BJ07] asked whether G_∞(β) has finite index in Aut(T^2_∞) whenever f is not post-critically finite in the case where L is a number field. It was later shown [JKL + 18] that this is true if the pair (f, β) is eventually stable (see below), assuming the abc conjecture. This was also shown to be true unconditionally for non-isotrivial quadratic polynomials over function fields of characteristic 0 in [BDG + 19].
For β ∈ L and a polynomial f ∈ L[x], the pair (f, β) is said to be eventually stable if the number of irreducible factors of f n (x) − β over L(β) is bounded independently of n as n → ∞ (stability and eventual stability can also be defined for rational functions as in [JL17]). We will prove a finite index result for non-isotrivial quadratic polynomials over function fields of odd positive characteristic under an eventual stability assumption.
The technique we use is the same as that used in [BDG + 19] (see also [JKL + 18, BT19, HJ20]). We make use of [BDG + 19, Proposition 7.7], which is stated in characteristic 0 but is true with no changes in the proof in characteristic p provided that K(f −n (β)) is separable over K for all n, which is automatic here when p > 2; the following result is a strengthening of [Hin16, Corollary 1].
Theorem 7.6. Let f be a non-isotrivial quadratic polynomial defined over a field K that is a finite extension of F p (t). Suppose that p > 2 and that β is not post-critical or periodic for f. Suppose furthermore that the pair (f, β) is eventually stable. Then G_∞(β) has finite index in Aut(T^2_∞).
Proof. As in [BDG + 19], it will suffice to show that for all sufficiently large N, we have Gal(K_N/K_{N−1}) ≅ C_2^{2^N}, where C_2 is the cyclic group with two elements. After a change of variables, we may assume that f(x) = x^2 + c for some c ∈ K \ k.
Since (f, β) is eventually stable, there is an m such that f^m(x) − β = (x − γ_1) · · · (x − γ_{2^m}) for γ_i with the property that f^n(x) − γ_i is irreducible over K(γ_i) for all n and for i = 1, . . . , 2^m, by [BT19,Proposition]. Let L = K(γ_1, . . . , γ_{2^m}). It follows from [BDG + 19, Proposition 7.7] and Lemma 7.3 that we must have Gal(K_{n+m}/K_{n+m−1}) ≅ [C_2]^{2^{m+n}} whenever there are primes p_i of L, for i = 1, . . . , 2^m, such that (i) v_{p_i}(c) = v_{p_i}(γ_j) = 0 for j = 1, . . . , 2^m; (ii) 2 ∤ v_{p_i}(f^n(0) − γ_i); (iii) v_{p_i}(f^{n′}(0) − γ_i) = 0 for all n′ < n; and (iv) v_{p_i}(f^{n′}(0) − γ_j) = 0 for all n′ ≤ n and j ≠ i. Note that condition (i) holds for all but finitely many primes p_i. Hence, we will be done if we can show that for all sufficiently large n, there are p_i, for i = 1, . . . , 2^m, that satisfy conditions (ii), (iii), and (iv). Now, fix a γ_i. By Lemma 6.7, there exists δ > 0 such that for all sufficiently large n, we have For any n, let X(n) be the set of primes p such that v_p(f^n(0) − γ_i) > 0 and v_p(f^{n′}(0) − γ_i) > 0 for some n′ < n. Since γ_i is not periodic and h_f(α) > 0, we may apply Lemma 6.2. We see then that for all sufficiently large n, we have (7.6.2) For any n and i ≠ j, we let Y_j(n) be the set of primes p such that v_p(f^n(0) − γ_j) > 0 and v_p(f^{n′}(0) − γ_j) > 0 for some n′ ≤ n. Since f^{n′}(γ_j) ≠ γ_i for all n′ and i ≠ j, we may apply Lemma 6.7 again. Since in addition we have v_p(γ_i − γ_j) = 0 for all but finitely many p when i ≠ j, we see that for all sufficiently large n, we have (7.6.3) Since δh_f(0) > 0, equations (7.6.1), (7.6.2), and (7.6.3) imply that for any sufficiently large n, there is a prime p_i satisfying conditions (ii), (iii), and (iv), and our proof is complete.
Remark 7.7. We note that while conditions (i) and (ii) above are weaker as stated than Condition R from [BDG + 19, Definition 7.2], they do imply that the prime p i ramifies in K(f −n (γ i )) (by Lemma 7.3), which is what [BDG + 19, Proposition 7.7] requires.
It should also be possible to prove a finite index result along the lines of Theorem 7.6 more generally for non-isotrivial polynomials of the form x d + c, where d > 2 and p ∤ d by modifying techniques in [BDG + 19] and combining them with our argument for Theorem 7.5 above.