2026 Mar 24;407(4):78. doi: 10.1007/s00220-026-05602-8

A Classification of Intrinsic Ergodicity for Recognisable Random Substitution Systems

P Gohlke 1, A Mitchell 2
PMCID: PMC13013267  PMID: 41890447

Abstract

We study a class of dynamical systems generated by random substitutions, which contains both intrinsically ergodic systems and instances with several measures of maximal entropy. In this class, we show that the measures of maximal entropy are classified by invariance under an appropriate symmetry relation. All measures of maximal entropy are fully supported and they are generally not Gibbs measures. We prove that there is a unique measure of maximal entropy if and only if an associated Markov chain is ergodic in inverse time. This Markov chain has finitely many states and all transition matrices are explicitly computable. Thereby, we obtain several sufficient conditions for intrinsic ergodicity that are easy to verify. A practical way to compute the topological entropy in terms of inflation words is extended from previous work to a more general geometric setting.

Introduction

A substitution is a symbolic rule that replaces each letter from a finite alphabet with a word consisting of letters from the same alphabet. Dynamical systems arising from substitutions provide the prototypical examples of mathematical quasicrystals. Such systems are well-known to have zero topological entropy and, under a mild condition, support a unique invariant measure. For a detailed introduction to the statistical properties of substitution dynamical systems, we refer the reader to [1, 57].

Random substitutions are a generalisation of substitutions where the substituted image of a letter is determined by a Markov process. In contrast to their deterministic counterparts, their associated dynamical systems typically have positive topological entropy [48] and support uncountably many ergodic measures [29]. Random substitution systems thus provide theoretical models for quasicrystals with local defects, which simultaneously exhibit long-range order alongside local disorder. They also provide a systematic framework to study properties of statistically self-similar structures as introduced by Mandelbrot [42, 43]. In particular, Peyrière studied the convergence of pattern frequencies [56], Godrèche and Luck observed mixed spectral types in the diffraction image [24], and Dekking, Grimmett and Meester classified several phases in random Cantor sets, including the occurrence of fractal percolation [17–19].

Variational principles play a pivotal role in finding physically meaningful distributions on a given system. In thermodynamic formalism these distributions are given by equilibrium measures, which often satisfy certain regularity properties [6, 58]. In the absence of an external potential, these equilibrium states coincide with the measures of maximal entropy (MMEs); compare also [30, 34] for the role of entropy maximisation. Within some classes of dynamical systems, it has been observed that both uniqueness of the MME and the presence of several MMEs are possible, depending on the parameters, thus leading to the occurrence of a phase transition [9, 31].

The question of whether a dynamical system has a unique measure of maximal entropy has been well studied over the last decades for various types of dynamical systems [8, 11, 23, 32, 44, 62]. Still, there is no complete characterisation to date, even for symbolic dynamical systems. Many important classes such as topologically transitive subshifts of finite type have been shown to be intrinsically ergodic, that is, there exists a unique measure of maximal entropy [52, 64, 65]. A common technique for proving a given subshift is intrinsically ergodic is to verify the specification property [5]. However, there exist many examples of intrinsically ergodic subshifts that do not have specification, some of which are covered by appropriate generalisations of the specification property [13–15, 27]. Conversely, some weaker versions of specification have been shown to be compatible with several measures of maximal entropy [40, 54]. A classical example of a non-intrinsically ergodic subshift is the Dyck shift, studied by Krieger [38], which has two fully supported ergodic measures of maximal entropy which are Bernoulli. On the other hand, Haydn produced examples with several measures of maximal entropy that have disjoint topological support [31]. In fact, there are subshifts with uncountably many measures of maximal entropy [10]. Progress on the classification of intrinsic ergodicity has also been made recently in the context of coded systems [55], suspensions over shifts of finite type [33, 39], and bounded density shifts [22]. We also refer to [16, 39] for more on the history of this problem.

In this work, we classify intrinsic ergodicity for primitive random substitution systems under appropriate regularity assumptions. We show that for this class, the problem of intrinsic ergodicity is non-trivial. That is, there exist both intrinsically ergodic and non-intrinsically ergodic examples. All the measures of maximal entropy have full topological support, but are in general not Bernoulli. In fact, it was shown in previous work that they generally violate a (weak) Gibbs property for the zero potential, and in particular that the corresponding subshift does not satisfy specification [27]. Primitive random substitutions produce systems with complex dynamical properties, including mixed spectral types [3, 50], positive entropy [25, 27, 48], a hierarchical structure [3], rich automorphism groups [20], non-trivial dimension spectra [49], and subtle mixing properties [41, 46]. With the results presented in this work we therefore contribute to the study of intrinsic ergodicity in a regime of intricate dynamical behaviour.

A random substitution is given by a set-valued function $\vartheta$ that maps letters from an alphabet $\mathcal{A}$ to sets of words in this alphabet. As an example, consider $\mathcal{A}=\{a,b\}$ and $\vartheta\colon a\mapsto\{aba\},\ b\mapsto\{baa,bba\}$. It is extended to words by concatenating all possible realisations on the individual letters. For instance, in the given example, $\vartheta(ab)=\{ababaa,ababba\}$. A subshift $X_\vartheta\subseteq\mathcal{A}^{\mathbb{Z}}$ is assigned in the standard way, by imposing that every pattern in $x\in X_\vartheta$ can be generated from an iteration of $\vartheta$ on some letter. The standard assumption of primitivity ensures that $X_\vartheta$ is topologically transitive under the shift map.

The class of primitive random substitutions is very large and encompasses subshifts with contrasting dynamical behaviour, including all topologically transitive shifts of finite type [28] and deterministic primitive substitution subshifts, as well as the Dyck shift and similar examples of coded shifts [29]. It is therefore customary to either study isolated examples or to impose further assumptions on the class of random substitutions under consideration. We work with an assumption called geometrical compatibility that generalises two common assumptions in previous work, constant length and compatibility. This is also the minimal restriction to ensure that ϑ allows for a geometric interpretation as a random inflation rule. Such a geometric setting seems natural as it readily generalises to shapes in higher dimensions [24]. This geometric framework is adequately represented by a suspension Yϑ of the subshift Xϑ. In the special cases of compatible or constant length random substitutions, intrinsic ergodicity of Xϑ and Yϑ are equivalent.

An assumption on $\vartheta$ that puts us outside the scope of many of the classical examples for intrinsic ergodicity is recognisability. In fact, it was shown in [20] that the corresponding subshifts have non-residually finite automorphism groups and therefore exclude, for instance, all mixing subshifts of finite type. Recognisability means that every $x\in X_\vartheta$ can be decomposed uniquely into inflation words in $\bigcup_{a\in\mathcal{A}}\vartheta^n(a)$ for all $n\in\mathbb{N}$. While this property is automatic for primitive substitutions with infinite subshifts [51], it has to be imposed as an extra condition for their random analogues. Recognisability allows us to identify inflation words in $x\in X_\vartheta$, and locally swapping words in $\vartheta^n(a)$ for some fixed $a\in\mathcal{A}$ and $n\in\mathbb{N}$ gives another sequence $y\in X_\vartheta$. These symmetry transformations form the so-called shuffle group, which is responsible for the automorphism group being non-residually finite [20]. We call a measure that is invariant under the shuffle group a uniformity measure. In fact, measures of maximal entropy are known to respect any symmetry of exchangeable words, up to a factor reflecting a potential change of length [21, 45]. Our first main result is that, assuming geometric compatibility and recognisability, invariance under the shuffle group entirely characterises the measures of maximal entropy. That is, the measures of maximal entropy on $Y_\vartheta$ are precisely the uniformity measures. Since uniformity measures have full topological support, the same holds for the measures of maximal entropy.

Our second main result gives a characterisation of the uniqueness of uniformity measures, and hence of the intrinsic ergodicity of $Y_\vartheta$. We exploit the fact that equidistributing the inflation words in $\vartheta^n(a)$ for all levels $n$ and $a\in\mathcal{A}$ imposes some rigidity on the uniformity measures in the form of self-consistency relations. These are encoded in a sequence of Markov matrices $Q_\vartheta=(Q_n)_{n\in\mathbb{N}}$, whose entries can be written explicitly in terms of $\#\vartheta^n(a)$ and the combinatorial data of $\vartheta$. We prove that there is a unique uniformity measure if and only if the Markov process $Q_\vartheta$ is ergodic in inverse time. This can be checked via standard tools in probability theory [12]; see also Sect. 2.9. We provide several sufficient conditions and give an explicit example that violates intrinsic ergodicity of both $X_\vartheta$ and $Y_\vartheta$. In particular, this covers and extends all the results on intrinsic ergodicity in [27].

On a technical level, we obtain that uniformity measures have an inverse limit structure under transfer operators that represent the action of $\vartheta$, equipped with appropriate probability vectors on the inflation words. The understanding of such transfer operators is of independent interest, and we expect it to be useful for a more general study of random substitution systems. As an intermediate step to prove that uniformity measures maximise the entropy on $Y_\vartheta$, we also show that this entropy can be obtained from the growth rate of $\#\vartheta^n(a)$ for all $a\in\mathcal{A}$, sometimes referred to as the inflation word entropy. This unifies and generalises results from [25, 48] in a geometric setting. In fact, the equality of topological entropy and inflation word entropy holds without the assumption of recognisability.

Outline

The paper is structured as follows. In Sect. 2, we introduce random substitutions, associated probability structures and their geometric interpretation, and we recall some background on suspension flows, induced systems, inverse-time Markov chains and conditional entropy. This provides us with all the necessary notation to properly formulate our main results in Sect. 3. The equality of the topological entropy of Yϑ and inflation word entropy is presented in Sect. 4, alongside some examples that illustrate the need to change from Xϑ to Yϑ for this result to hold. We start restricting our attention to recognisable random substitutions in Sect. 5, where we show a structural result for the associated subshift. Section 6 is dedicated to the introduction and study of transfer operators on Xϑ that reflect the action of ϑ. This enables us to introduce the class of inverse limit measures in Sect. 7, generalising the class of frequency measures studied in previous work, and to characterise their uniqueness. Interpreting uniformity measures as particular instances of limiting measures, we characterise intrinsic ergodicity in Sect. 8. In this section, we also work out a counterexample to intrinsic ergodicity in detail.

Preliminaries

Symbolic notation

An alphabet $\mathcal{A}$ is a finite collection of symbols, which we call letters. We call a finite concatenation of letters a word, and let $\mathcal{A}^+$ denote the set of all non-empty finite words with letters from $\mathcal{A}$. We write $|u|$ for the length of $u$ and, for each $a\in\mathcal{A}$, let $|u|_a$ denote the number of occurrences of $a$ in $u$. The Abelianisation $\phi$ of a word $v\in\mathcal{A}^+$ is the vector $\phi(v)\in\mathbb{N}_0^{\mathcal{A}}$ with $\phi(v)_a=|v|_a$ for all $a\in\mathcal{A}$. A subword of a word $u\in\mathcal{A}^n$ is a word $v$ such that $v=u_{[i,j]}:=u_i\cdots u_j$ for some $1\le i\le j\le n$. We write $|u|_v$ for the number of times that $v$ appears as a subword of $u$.

We let $\mathcal{A}^{\mathbb{Z}}$ denote the set of all bi-infinite sequences of elements in $\mathcal{A}$ and endow $\mathcal{A}^{\mathbb{Z}}$ with the product topology induced by the discrete topology on $\mathcal{A}$. With this topology, the space $\mathcal{A}^{\mathbb{Z}}$ is compact and metrisable. We let $S$ denote the usual (left-)shift map on $\mathcal{A}^{\mathbb{Z}}$, given by $S(x)_n=x_{n+1}$ for all $x\in\mathcal{A}^{\mathbb{Z}}$ and $n\in\mathbb{Z}$. If $i,j\in\mathbb{Z}$ with $i\le j$ and $x=\cdots x_{-1}x_0x_1\cdots\in\mathcal{A}^{\mathbb{Z}}$, then we write $x_{[i,j]}=x_ix_{i+1}\cdots x_j$. A subshift $X$ is a closed and $S$-invariant subspace of $\mathcal{A}^{\mathbb{Z}}$. For $v\in\mathcal{A}^n$ the corresponding cylinder set is $[v]=\{x\in X:x_{[0,n-1]}=v\}$.

For a given set $B$, we write $\#B$ for the cardinality of $B$ and let $\mathcal{F}(B)$ be the set of non-empty finite subsets of $B$. If $A,B\subseteq\mathcal{A}^+$, we write $AB=\{uv:u\in A,\ v\in B\}$ for the set of all concatenations.

Random substitutions

There are several ways to define a random substitution. We start with a purely combinatorial definition.

Definition 2.1

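A random substitution on a finite alphabet $\mathcal{A}$ is a set-valued function $\vartheta\colon\mathcal{A}\to\mathcal{F}(\mathcal{A}^+)$. It extends to words via

$$\vartheta(v_1\cdots v_n)=\vartheta(v_1)\cdots\vartheta(v_n),$$

for $v\in\mathcal{A}^n$ and $n\in\mathbb{N}$, and to sets of words via $\vartheta(A)=\bigcup_{v\in A}\vartheta(v)$ for all $A\in\mathcal{F}(\mathcal{A}^+)$.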

As we define it here, a random substitution ϑ does not, a priori, carry a probabilistic structure. This is because several properties of ϑ do not depend on the choices of the probabilities, and we wish to keep the flexibility to alternate between different probabilistic structures. In fact, there are several works on random substitutions that use this set-valued definition without ever assigning any probabilities [25, 46, 59].

Note that expressions like $\vartheta^2=\vartheta\circ\vartheta$ are well defined. For convenience, we let $\vartheta^0$ denote the identity map. We call every $v\in\vartheta^n(a)$ a (level-$n$) inflation word of type $a$.

Example 2.2

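The random Fibonacci substitution on $\mathcal{A}=\{a,b\}$ is given by $\vartheta\colon a\mapsto\{ab,ba\},\ b\mapsto\{a\}$. We can iterate this to obtain

$$\vartheta^2(a)=\vartheta(\{ab,ba\})=\vartheta(ab)\cup\vartheta(ba)=\{aba,baa\}\cup\{aab,aba\}=\{aab,aba,baa\}.$$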
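The iteration in Example 2.2 can be checked mechanically. The following is a minimal Python sketch and not part of the original text: the encoding of a random substitution as a dictionary from letters to sets of words, and the helper names `apply_to_word`, `apply_to_set` and `inflation_words`, are illustrative assumptions.

```python
from itertools import product

# Hypothetical encoding: a random substitution is a dict letter -> set of words.
FIB = {"a": {"ab", "ba"}, "b": {"a"}}


def apply_to_word(theta, word):
    """All realisations of theta on a word: concatenate one image per letter."""
    return {"".join(parts) for parts in product(*(theta[c] for c in word))}


def apply_to_set(theta, words):
    """Image of a finite set of words: union of the images of its elements."""
    result = set()
    for w in words:
        result |= apply_to_word(theta, w)
    return result


def inflation_words(theta, letter, n):
    """Set of level-n inflation words of the given type (n = 0 gives the letter)."""
    words = {letter}
    for _ in range(n):
        words = apply_to_set(theta, words)
    return words


print(sorted(inflation_words(FIB, "a", 2)))  # ['aab', 'aba', 'baa']
```

Because the images are stored as sets, the word aba, which arises from both ab and ba, is counted only once, in line with the computation in the example.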

Definition 2.3

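Let $\vartheta$ be a random substitution on $\mathcal{A}$. The language of $\vartheta$ is given by

$$\mathcal{L}_\vartheta=\{v\in\mathcal{A}^+ : v\ \text{is a subword of some}\ w\in\vartheta^n(a),\ \text{for some}\ a\in\mathcal{A},\ n\in\mathbb{N}_0\},$$

and the subshift associated with $\vartheta$ is given by

$$X_\vartheta=\{x\in\mathcal{A}^{\mathbb{Z}} : x_{[i,j]}\in\mathcal{L}_\vartheta\ \text{for all}\ i\le j\}.$$

Note that if $\#\vartheta(a)=1$ for all $a\in\mathcal{A}$, our notion of a random substitution coincides with the standard definition of a substitution (identifying every singleton set with its unique element). In this case we say that $\vartheta$ is deterministic. We recall a few basic notions about substitutions.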

Definition 2.4

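Given a substitution $\theta\colon\mathcal{A}\to\mathcal{A}^+$, its substitution matrix $M=M_\theta\in\mathbb{N}_0^{\mathcal{A}\times\mathcal{A}}$ is given by

$$M_{ab}=|\theta(b)|_a=\phi(\theta(b))_a.$$

We call $\theta$ primitive if $M$ is a primitive matrix, that is, if $M^p$ is strictly positive for some $p\in\mathbb{N}$.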

It is sometimes convenient to regard a random substitution as a local mixture of substitutions.

Definition 2.5

A marginal of a random substitution $\vartheta$ is a map $\theta\colon\mathcal{A}\to\mathcal{A}^+$ such that $\theta(a)\in\vartheta(a)$ for all $a\in\mathcal{A}$. We say that $\vartheta$ is primitive if there is some $n\in\mathbb{N}$ such that $\vartheta^n$ has a primitive marginal. We call $\vartheta$ geometrically compatible if there is some $\lambda>1$ and a vector $L$ with strictly positive entries, such that $L$ is a left eigenvector with eigenvalue $\lambda$ for the substitution matrix of every marginal of $\vartheta$.

Primitivity is a standard assumption which ensures that the corresponding subshift is non-empty and topologically transitive [60], and we will assume that ϑ is primitive throughout most of this work.

There are two special cases of geometrical compatibility that have received some attention in the past. We say that $\vartheta$ is of constant length if there is some $\ell\in\mathbb{N}$ such that $|v|=\ell$ for all $v\in\vartheta(a)$ and $a\in\mathcal{A}$, and we call it compatible if each of its marginals has the same substitution matrix. The relationship between these three conditions is illustrated in Fig. 1. We highlight that all of the inclusions here are strict, and that there is no general relation between constant length and compatible random substitutions.
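The three conditions can be tested directly on small examples. The sketch below is illustrative only and not taken from the paper: the dictionary encoding and all function names are assumptions, and the eigenvector check uses exact integer arithmetic, so it applies only when $\lambda$ and $L$ are integers. It is run on the random Fibonacci substitution of Example 2.2, on the example from the Introduction, and on the substitution $a\mapsto\{abb\},\ b\mapsto\{a,bb\}$ with $\lambda=2$ and $L=(2,1)$ stated in the caption of Fig. 2.

```python
from itertools import product


def marginals(theta):
    """Yield every marginal: one fixed image per letter."""
    letters = sorted(theta)
    for choice in product(*(sorted(theta[a]) for a in letters)):
        yield dict(zip(letters, choice))


def substitution_matrix(marginal):
    """M[a][b] = number of occurrences of letter a in the image of b."""
    letters = sorted(marginal)
    return [[marginal[b].count(a) for b in letters] for a in letters]


def is_constant_length(theta):
    return len({len(v) for images in theta.values() for v in images}) == 1


def is_compatible(theta):
    matrices = [substitution_matrix(m) for m in marginals(theta)]
    return all(M == matrices[0] for M in matrices)


def has_common_left_eigenvector(theta, lam, L):
    """Check L M = lam L for every marginal (exact for integer lam and L).

    L is indexed in the sorted order of the letters; positivity of L and
    lam > 1 are left to the caller.
    """
    d = len(L)
    for m in marginals(theta):
        M = substitution_matrix(m)
        if any(sum(L[a] * M[a][b] for a in range(d)) != lam * L[b] for b in range(d)):
            return False
    return True


FIG2 = {"a": {"abb"}, "b": {"a", "bb"}}
FIB = {"a": {"ab", "ba"}, "b": {"a"}}
INTRO = {"a": {"aba"}, "b": {"baa", "bba"}}

for name, theta in [("Fig. 2", FIG2), ("Fibonacci", FIB), ("Introduction", INTRO)]:
    print(name, "constant length:", is_constant_length(theta),
          "compatible:", is_compatible(theta))
print("Fig. 2 eigenvector check:", has_common_left_eigenvector(FIG2, 2, (2, 1)))
```

In this small sample, the random Fibonacci substitution is compatible but not of constant length, the example from the Introduction is of constant length but not compatible, and the example of Fig. 2 is neither, although the eigenvector check succeeds for it.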

Fig. 1. Implication diagram for some conditions on primitive random substitutions

Remark 2.6

We highlight that every geometrically compatible random substitution $\vartheta$ has the property of having unique realisation paths: for each $u\in\mathcal{A}^n$ and $v\in\vartheta(u)$ there is a unique way to write $v=v_1\cdots v_n$ with $v_i\in\vartheta(u_i)$; compare [27] for details.

We extend the action of $\vartheta$ to bi-infinite sequences in the obvious way. More precisely, for $x\in\mathcal{A}^{\mathbb{Z}}$, let

$$\vartheta(x)=\{\cdots v_{-2}v_{-1}.v_0v_1\cdots : v_i\in\vartheta(x_i)\ \text{for all}\ i\in\mathbb{Z}\}.$$

Here, the lower dot separates the positions indexed by $-1$ and $0$ in $\mathcal{A}^{\mathbb{Z}}$.

Definition 2.7

A random substitution $\vartheta$ is called recognisable if for every $y\in X_\vartheta$, there exists a unique $x\in X_\vartheta$, a unique sequence $(v_i)_{i\in\mathbb{Z}}$ with $v_i\in\vartheta(x_i)$ for all $i$, and a unique $0\le k<|v_0|$, such that

$$S^{-k}y=\cdots v_{-2}v_{-1}.v_0v_1\cdots.$$

We call $(x,k,(v_i)_{i\in\mathbb{Z}})$ the recognisability data of $y$ with respect to $\vartheta$.

It is straightforward to verify that $\vartheta$ being recognisable implies that $\vartheta^n$ is recognisable for all $n\in\mathbb{N}$. In the special case that for all $a\in\mathcal{A}$ all words in $\vartheta(a)$ have the same length, our definition of recognisability coincides with the definition used in earlier work [20, 27].

Every recognisable random substitution satisfies the disjoint set condition, meaning that for all $a\in\mathcal{A}$ and distinct $u,v\in\vartheta(a)$, we have $\vartheta^n(u)\cap\vartheta^n(v)=\varnothing$ for all $n\in\mathbb{N}$. The proof of this fact carries over verbatim from the slightly more restrictive definition used in [27, Lemma 4.5]. The disjoint set condition often simplifies the calculation of entropy, both in the topological and measure theoretic setting [25, 27].

Probabilistic aspects

In this section, we equip a random substitution $\vartheta$ with a probabilistic structure by choosing probability vectors on each of the sets $\vartheta(a)$ with $a\in\mathcal{A}$. This approach goes back to Peyrière [56] and was pursued further by Denker and Koslicki [36, 37].

Definition 2.8

Let $\vartheta$ be a random substitution and $I=\bigcup_{a\in\mathcal{A}}\vartheta(a)$. A probability choice for $\vartheta$ is a column stochastic matrix $P\in[0,1]^{I\times\mathcal{A}}$ such that $P_{u,a}=0$ if $u\notin\vartheta(a)$. We call $P$ non-degenerate if $P_{v,a}>0$ for all $v\in\vartheta(a)$ and $a\in\mathcal{A}$.

We regard $P_{v,a}$ as the probability of choosing the realisation $v\in\vartheta(a)$ when applying $\vartheta$ to $a$. With some abuse of notation and in line with the usual convention, we often refer to the pair $\vartheta_P=(\vartheta,P)$ as a random substitution as well. If $\vartheta(a)=\{u_1,\ldots,u_n\}$, it is customary to represent the combined data of $\vartheta_P$ as

$$\vartheta_P\colon a\mapsto\begin{cases}u_1 & \text{with probability } P_{u_1,a},\\ \ \vdots & \\ u_n & \text{with probability } P_{u_n,a}.\end{cases}$$

In the following we assume that $\vartheta$ has unique realisation paths: for each $u\in\mathcal{A}^n$ and $v\in\vartheta(u)$ there is a unique way to write $v=v_1\cdots v_n$ with $v_i\in\vartheta(u_i)$. This formulation is sufficient for our purposes since every geometrically compatible random substitution has this property (recall Remark 2.6).

Reflecting the idea that neighbouring letters are mapped independently, we extend $P$ to a countable state Markov matrix in $[0,1]^{\mathcal{A}^+\times\mathcal{A}^+}$ via $P_{v,u}=0$ if $v\notin\vartheta(u)$, and by setting for all $u=u_1\cdots u_n\in\mathcal{A}^n$ and $v=v_1\cdots v_n\in\vartheta(u)$,

$$P_{v,u}=\prod_{i=1}^{n}P_{v_i,u_i}.$$

Recall that the splitting of $v$ into words $(v_i)_{i=1}^n$ with $v_i\in\vartheta(u_i)$ is unique due to the fact that $\vartheta$ has unique realisation paths. For a more general definition that works beyond the assumption of unique realisation paths, compare [27]. In this notation, expressions like $P^2=P\cdot P$ are well defined via standard (countable state) matrix multiplication. We emphasise that such a multiplication involves only finite sums, since every column of $P$ has only finitely many non-zero entries. We also note that $P^n$ is a valid probability choice for $\vartheta^n$ for each $n\in\mathbb{N}$. To avoid cumbersome notation, we will write $\vartheta_P^n$ for $(\vartheta^n)_{P^n}=(\vartheta^n,P^n)$.

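Given $u\in\mathcal{A}^+$, the Markov matrix $P$ induces a stationary Markov chain $(\vartheta_P^n(u))_{n\in\mathbb{N}}$ on some probability space $(\Omega_u,\mathcal{F}_u,\mathbb{P}_u)$ via

$$\mathbb{P}_u[\vartheta_P^{n+1}(u)=w\mid\vartheta_P^{n}(u)=v]=\mathbb{P}_v[\vartheta_P(v)=w]=P_{w,v},$$

for all $v,w\in\mathcal{A}^+$ and $n\in\mathbb{N}$. We often write $\mathbb{P}$ for $\mathbb{P}_u$ if the initial word is understood. In this case, we write $\mathbb{E}$ for the expectation with respect to $\mathbb{P}$.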

Equipping a random substitution $\vartheta$ with a probability choice $P$ also allows us to define the substitution matrix $M=M(\vartheta_P)\in\mathbb{R}^{\mathcal{A}\times\mathcal{A}}$ in analogy to deterministic substitutions via

$$M_{ab}=\mathbb{E}[|\vartheta_P(b)|_a]=\sum_{v\in\vartheta(b)}P_{v,b}\,|v|_a.$$

If $\vartheta$ is fixed, we also write $M(P)$ in place of $M(\vartheta_P)$. A routine calculation shows that

$$M(PP)=M(P)\,M(P).$$
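The product extension of $P$ to words and the multiplicativity of the substitution matrix can be verified numerically on a small example. The sketch below is an illustration under stated assumptions, not the authors' code: it uses the substitution $a\mapsto\{abb\},\ b\mapsto\{a,bb\}$ from the caption of Fig. 2, the probabilities $1/3$ and $2/3$ on the images of $b$ are an arbitrary choice, and the helper names are hypothetical. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
from itertools import product

# Hypothetical encoding of a probability choice: letter -> {image: probability}.
P = {"a": {"abb": Fraction(1)},
     "b": {"a": Fraction(1, 3), "bb": Fraction(2, 3)}}


def step(P, dist):
    """Push a distribution on words through one application of theta_P."""
    out = {}
    for u, pu in dist.items():
        # Letters are mapped independently: multiply the per-letter probabilities.
        for parts in product(*(P[c].items() for c in u)):
            w = "".join(v for v, _ in parts)
            pw = pu
            for _, q in parts:
                pw *= q
            out[w] = out.get(w, 0) + pw
    return out


def level_distribution(P, letter, n):
    """Distribution of the level-n random inflation word started from one letter."""
    dist = {letter: Fraction(1)}
    for _ in range(n):
        dist = step(P, dist)
    return dist


def expected_counts(P, n, letters):
    """Matrix with entries M[a][b] = E[ number of a's in theta_P^n(b) ]."""
    dists = {b: level_distribution(P, b, n) for b in letters}
    return [[sum(p * w.count(a) for w, p in dists[b].items()) for b in letters]
            for a in letters]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


M1 = expected_counts(P, 1, "ab")
M2 = expected_counts(P, 2, "ab")
print(M2 == matmul(M1, M1))  # True
```

Here `M2` is computed from the level-2 distribution, that is, from the matrix product $P\cdot P$ restricted to single-letter columns, while `matmul(M1, M1)` squares the level-1 substitution matrix.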

If the matrix $M$ is primitive, Perron–Frobenius (PF) theory implies that it has a simple real (PF) eigenvalue $\lambda$ of maximal modulus and that the corresponding left and right (PF) eigenvectors $L=(L_1,\ldots,L_d)$ and $R=(R_1,\ldots,R_d)^T$ can be chosen to have strictly positive entries. We normalise the right eigenvector according to $\|R\|_1=1$. If the product $LR$ is independent of $P$, we usually normalise $L$ such that $LR=1$. Otherwise, we pick some arbitrary but fixed normalisation of $L$. Like for deterministic substitutions, primitivity can be characterised purely in terms of the substitution matrix [26, Lemma 3.2.18].

Lemma 2.9

A random substitution ϑ is primitive if and only if for some (equivalently all) non-degenerate P the matrix M(P) is primitive and its PF eigenvalue satisfies λ>1.

If ϑ is geometrically compatible, the corresponding data λ and L is precisely the PF data of M(P) for all P. However, the right PF eigenvector R of M(P) does depend on P in the general case.

There is a special family of probability choices that is closely related to the structure of the important family of uniformity measures which will be introduced in Definition 2.17.

Definition 2.10

For $n\in\mathbb{N}_0$, the $n$-productivity distribution for $\vartheta$ is the probability choice $P^{n,1}$ with

$$P^{n,1}_{v,a}=\frac{\#\vartheta^n(v)}{\#\vartheta^{n+1}(a)},$$

for all $v\in\vartheta(a)$.

The fact that $P^{n,1}$ weighs inflation words according to their productivity under $\vartheta^n$ can be seen as an attempt to prepare for a uniform distribution after $n$ more applications of $\vartheta$.
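As a concrete illustration, the productivity distributions can be computed by brute force for small $n$. The following Python sketch is not from the paper: it uses the example $a\mapsto\{aba\},\ b\mapsto\{baa,bba\}$ from the Introduction, and the function names are hypothetical. The weights on $\vartheta(a)$ sum to one exactly when $\#\vartheta^{n+1}(a)=\sum_{v\in\vartheta(a)}\#\vartheta^n(v)$, which holds whenever the sets $\vartheta^n(v)$ with $v\in\vartheta(a)$ are pairwise disjoint.

```python
from fractions import Fraction
from itertools import product

THETA = {"a": {"aba"}, "b": {"baa", "bba"}}  # example from the Introduction


def apply_to_set(theta, words):
    """Image of a finite set of words under the random substitution."""
    out = set()
    for w in words:
        out |= {"".join(p) for p in product(*(theta[c] for c in w))}
    return out


def iterate(theta, words, n):
    """Apply theta n times to a set of words (n = 0 returns the set unchanged)."""
    for _ in range(n):
        words = apply_to_set(theta, words)
    return words


def productivity_distribution(theta, n):
    """P^{n,1}: weight v in theta(a) by #theta^n(v) / #theta^{n+1}(a)."""
    P = {}
    for a, images in theta.items():
        total = len(iterate(theta, {a}, n + 1))
        P[a] = {v: Fraction(len(iterate(theta, {v}, n)), total) for v in images}
    return P


for n in range(3):
    print(n, productivity_distribution(THETA, n))
```

For $n=0$ this is the uniform distribution on each $\vartheta(a)$; for $n=1$ the word $bba$ receives twice the weight of $baa$, since it contains two letters $b$ that can each be substituted in two ways.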

Measures along (random) words

Given a word $w\in\mathcal{A}^+$, let $w^{\mathbb{Z}}$ be the bi-infinite periodic sequence with period length $|w|$ and starting with the word $w$. The unique invariant measure on the orbit of $w^{\mathbb{Z}}$ with total mass $|w|$ is given by

$$\mu_w=\sum_{i=0}^{|w|-1}\delta_{S^i w^{\mathbb{Z}}}.$$

It follows directly that $\mu_w([a])=|w|_a$ for all $a\in\mathcal{A}$. Hence, writing $R^\mu:=(\mu([a]))_{a\in\mathcal{A}}$ for an invariant measure $\mu$, we obtain that

$$R^{\mu_w}=\phi(w).$$

More generally, for any word $v\in\mathcal{A}^+$, we find that $0\le\mu_w([v])-|w|_v\le|v|$, where the maximal discrepancy $|v|$ emerges from occurrences of $v$ in $w^{\mathbb{Z}}$ that overlap several copies of $w$.
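The discrepancy between $\mu_w([v])$ and $|w|_v$ is easy to observe on a short word. The following Python sketch is illustrative and not part of the original: it evaluates $\mu_w([v])$ by counting the shifts $0\le i<|w|$ at which $v$ is read in $w^{\mathbb{Z}}$, and the function names are assumptions.

```python
def cyclic_occurrences(w, v):
    """mu_w([v]): number of shifts i in [0, |w|) at which v is read in w^Z."""
    n = len(w)
    return sum(all(w[(i + j) % n] == v[j] for j in range(len(v))) for i in range(n))


def subword_occurrences(w, v):
    """|w|_v: number of occurrences of v as a subword of the finite word w."""
    return sum(w[i:i + len(v)] == v for i in range(len(w) - len(v) + 1))


w = "aba"
for v in ["a", "b", "aa", "ab", "baa"]:
    print(v, cyclic_occurrences(w, v), subword_occurrences(w, v))
```

For $w=aba$ and $v=aa$ the finite word contains no occurrence of $v$, while the periodic sequence contains one per period, namely across the boundary between two consecutive copies of $w$.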

Given a random word $\omega$, the expression $\mu_\omega$ is a random measure, and we assign an invariant probability measure, called the periodic measure representation of $\omega$, via

$$\overline{\mu}_\omega=\frac{\mathbb{E}[\mu_\omega]}{\mathbb{E}[|\omega|]}.$$

Definition 2.11

Let $(\omega_n)_{n\in\mathbb{N}}$ be a sequence of random words such that $\mathbb{E}[|\omega_n|]\to\infty$ as $n\to\infty$. We call every accumulation point of $(\overline{\mu}_{\omega_n})_{n\in\mathbb{N}}$ an accumulation measure of $(\omega_n)_{n\in\mathbb{N}}$. If $(\overline{\mu}_{\omega_n})_{n\in\mathbb{N}}$ converges to some measure $\mu$, we call $\mu$ the limit measure of $(\omega_n)_{n\in\mathbb{N}}$.

If $\mu$ is the limit measure of $(\omega_n)_{n\in\mathbb{N}}$, its value on cylinder sets can be given explicitly by

$$\mu([v])=\lim_{n\to\infty}\frac{\mathbb{E}[|\omega_n|_v]}{\mathbb{E}[|\omega_n|]}. \tag{2.1}$$

We directly obtain from (2.1) that, whenever all realisations of $\omega_n$ are in $\mathcal{L}_\vartheta$ for all $n\in\mathbb{N}$, every corresponding accumulation measure is supported on $X_\vartheta$.

The sequence $(\vartheta_P^n(a))_{n\in\mathbb{N}}$ has a well-defined limit measure $\mu_P$, which is the same for all $a\in\mathcal{A}$. This measure is called the frequency measure of $\vartheta_P$ and is known to be ergodic under the shift map [29]. A systematic approach to calculating the (measure theoretic) entropy of random substitution subshifts with respect to frequency measures was developed in [27].

Geometric hull

Let $\vartheta$ be a primitive geometrically compatible random substitution. The assumption of geometric compatibility gives that the Perron–Frobenius eigenvalue $\lambda$ and corresponding left eigenvector are independent of the choice of probabilities. This allows us to choose well-defined tile lengths. Let $L$ denote the tile length vector, which is some normalisation of the left PF eigenvector. For each $w\in\mathcal{L}_\vartheta$, we write $L(w)=\sum_{a\in\mathcal{A}}L_a|w|_a$ for the geometric length of $w$.

We will define the geometric hull of $\vartheta$ as an appropriate suspension flow. A roof function on a subshift $X$ is a positive continuous function $\pi\colon X\to\mathbb{R}_+$ that is bounded away from $0$. The suspension flow of $X$ with roof function $\pi$ is defined by

$$\operatorname{Sus}(X,\pi)=\{(x,s):x\in X,\ 0\le s\le\pi(x)\}\subseteq X_\vartheta\times\mathbb{R},$$

where we identify points according to the relation $(x,s+\pi(x))\sim(S(x),s)$, which defines an equivalence relation on the transitive closure. For each $t\in\mathbb{R}$, define $T_t(x,s)=(x,s+t)$, which is well defined on $\operatorname{Sus}(X,\pi)$ via the equivalence relation above. Thus, $T=\{T_t\}$ is a one-parameter transformation group on $\operatorname{Sus}(X,\pi)$.

Definition 2.12

The geometric hull $(Y_\vartheta,T)$ of a primitive, geometrically compatible random substitution $\vartheta$ is the suspension of $(X_\vartheta,S)$ with roof function $\pi(x)=L_{x_0}$.

We recall a few facts about the invariant measures on suspension systems; see [4, 53] for details. Let $m$ be the Lebesgue measure on the real line. Every $S$-invariant probability measure $\mu$ on $X_\vartheta$ can be lifted to a $T$-invariant measure $\widetilde{\mu}$ via

$$\widetilde{\mu}=\frac{(\mu\times m)|_{Y_\vartheta}}{(\mu\times m)(Y_\vartheta)}.$$

The map $\mu\mapsto\widetilde{\mu}$ defines a bijection between the invariant measures on $(X_\vartheta,S)$ and $(Y_\vartheta,T)$. Moreover, $\widetilde{\mu}$ is ergodic if and only if $\mu$ is ergodic. By Abramov’s formula, the relationship between the entropies of $\mu$ and $\widetilde{\mu}$ is given by

$$h_{\widetilde{\mu}}(Y_\vartheta)=\frac{h_\mu(X_\vartheta)}{\mu(\pi)},$$

using the notation $\mu(\pi):=\int\pi\,\mathrm{d}\mu$. For roof functions of the form $\pi(x)=L_{x_0}$, the normalisation factor $\mu(\pi)$ can be expressed as

$$\mu(\pi)=\sum_{a\in\mathcal{A}}L_a\,\mu([a])=LR^\mu.$$

Definition 2.13

The geometric entropy of a shift-invariant probability measure $\mu$ on $X_\vartheta$ is the quantity $h_\mu^g:=h_{\widetilde{\mu}}=h_\mu/(LR^\mu)$. We call $\mu$ a measure of maximal geometric entropy if $h_\mu^g=h_{\mathrm{top}}(Y_\vartheta)$.

Note that $\mu$ is a measure of maximal geometric entropy if and only if $\widetilde{\mu}$ is a measure of maximal entropy on the suspension $(Y_\vartheta,T)$. Since the map $\mu\mapsto\widetilde{\mu}$ forms a bijection between the spaces of invariant measures, intrinsic ergodicity of $(Y_\vartheta,T)$ is equivalent to the existence of a unique measure of maximal geometric entropy on $X_\vartheta$. If there is a vector $R\in\mathbb{R}^{\mathcal{A}}$ such that the letter frequencies of all $x\in X_\vartheta$ are given by $R$, the normalisation factor $\mu(\pi)=LR$ is uniform. In this case, the measures of maximal entropy on $Y_\vartheta$ are precisely the lifts of the measures of maximal entropy on $X_\vartheta$. The same holds in the constant length setting, where the roof function is constant.

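Another convenient interpretation of $Y_\vartheta$ is via Delone sets with finite local complexity, following the approach in [2]. Indeed, every $y\in Y_\vartheta$ can be represented as a (coloured) Delone set $D(y)=\{D_a(y)\}_{a\in\mathcal{A}}$, given by

$$D_a(y)=\{t\in\mathbb{R}:T_t(y)\in[a]\times\{0\}\},$$

for each $a\in\mathcal{A}$. We define the intersection of a coloured set $A=\{A_a\}_{a\in\mathcal{A}}$ with a subset $B\subseteq\mathbb{R}$ as $A\cap B=\{A_a\cap B\}_{a\in\mathcal{A}}$. Similarly, $A-t=\{A_a-t\}_{a\in\mathcal{A}}$. The image $D(Y_\vartheta)$ is a space of coloured Delone sets of finite local complexity, which we equip with a (metrisable) topology in which two coloured sets are close if they agree on a large ball around the origin up to a small translation. It is straightforward to verify that $D\colon Y_\vartheta\to D(Y_\vartheta)$ is a homeomorphism and that it intertwines $T_t$ with $T_t\colon A\mapsto A-t$ in the sense that $D\circ T_t=T_t\circ D$ for all $t\in\mathbb{R}$. Hence, the systems $(Y_\vartheta,T)$ and $(D(Y_\vartheta),T)$ with $T=\{T_t\}$ are isomorphic; in particular, they have the same topological entropy. A patch of size $[0,\ell)$ is an element of

$$\mathcal{P}_\vartheta(\ell)=\{D(y)\cap[0,\ell):y\in X_\vartheta\times\{0\}\}.$$

The patch counting function $p_\vartheta\colon\mathbb{R}_+\to\mathbb{N}$, with $p_\vartheta(\ell)=\#\mathcal{P}_\vartheta(\ell)$, satisfies

$$p_\vartheta(\ell)=\#\{w\in\mathcal{L}_\vartheta:L(w_{[1,|w|-1]})<\ell\le L(w)\}.$$

Since $X_\vartheta$ (and hence $D(Y_\vartheta)$) contains a point with dense orbit, we can use [2, Thm. 1] to obtain the topological entropy of $Y_\vartheta$ from the exponential growth rate of $p_\vartheta$ via

$$h_{\mathrm{top}}(Y_\vartheta)=h_{\mathrm{top}}(D(Y_\vartheta))=\limsup_{\ell\to\infty}\frac{1}{\ell}\log(p_\vartheta(\ell)).$$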

Instead of considering all patterns that are close to a given length, we can restrict our attention to those that arise directly from iterating ϑ. This gives rise to the following.

Definition 2.14

Let $\vartheta$ be a geometrically compatible random substitution. For each $a\in\mathcal{A}$, we define the geometric inflation word entropy of type $a$ by

$$h_a^G=\lim_{m\to\infty}\frac{1}{L(\vartheta^m(a))}\log(\#\vartheta^m(a)),$$

provided this limit exists.
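The quotients appearing in this limit can be evaluated by brute force for the first few levels. The Python sketch below is illustrative only and makes no claim about the value of the limit: it uses the substitution $a\mapsto\{abb\},\ b\mapsto\{a,bb\}$ with $\lambda=2$ and $L=(2,1)$ from the caption of Fig. 2, the function names are hypothetical, and the enumeration is feasible only for small $m$ because the sets grow exponentially.

```python
import math
from itertools import product

THETA = {"a": {"abb"}, "b": {"a", "bb"}}  # example from the caption of Fig. 2
LAM = 2
L = {"a": 2, "b": 1}  # tile lengths


def apply_to_set(theta, words):
    """Image of a finite set of words under the random substitution."""
    out = set()
    for w in words:
        out |= {"".join(p) for p in product(*(theta[c] for c in w))}
    return out


def inflation_words(theta, letter, m):
    """Set of level-m inflation words of the given type."""
    words = {letter}
    for _ in range(m):
        words = apply_to_set(theta, words)
    return words


def geometric_length(word):
    return sum(L[c] for c in word)


def entropy_quotient(letter, m):
    """log(#theta^m(a)) / L(theta^m(a)) for a single level m."""
    words = inflation_words(THETA, letter, m)
    lengths = {geometric_length(w) for w in words}
    # Geometric compatibility: all level-m inflation words of a given type
    # share the geometric length lam^m * L_a.
    assert lengths == {LAM ** m * L[letter]}
    return math.log(len(words)) / lengths.pop()


for m in range(1, 4):
    print(m, entropy_quotient("a", m), entropy_quotient("b", m))
```

The internal assertion confirms, level by level, that $L(\vartheta^m(a))$ is well defined as the common geometric length of all words in $\vartheta^m(a)$.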

Shuffle group and uniformity measures

Let us assume that $\vartheta$ is recognisable and geometrically compatible. For $(y,s)\in Y_\vartheta$, let $(x,k,(v_i)_{i\in\mathbb{Z}})$ be the recognisability data of $y$. Then, we define the recognisability data of $(y,s)$ by $(x,t,(v_i)_{i\in\mathbb{Z}})$, where $0\le t<L(v_0)$ is the unique element such that $(y,s)=T_t(\cdots v_{-2}v_{-1}.v_0v_1\cdots,0)$.

The recognisable structure can be exploited to define a large number of symmetry relations that exchange inflation words of the same type and level. This idea was developed in [20] under the additional assumption of compatibility, but the definition extends to the geometrically compatible setting. For an element $\alpha\in\operatorname{Sym}(\vartheta(a))$ of the permutation group on the set $\vartheta(a)$, the function $f_\alpha\colon Y_\vartheta\to Y_\vartheta$ is defined by replacing each word $v_i\in\vartheta(x_i)$ by $\alpha(v_i)$ whenever $x_i=a$ in the recognisability decomposition of $(y,s)$. More precisely, for $(y,s)$ with recognisability data $(x,t,(v_i)_{i\in\mathbb{Z}})$, we set

$$f_\alpha((y,s))=T_t(\cdots w_{-2}w_{-1}.w_0w_1\cdots,0),$$

where $w_i=\alpha(v_i)$ whenever $x_i=a$, and $w_i=v_i$ otherwise. Since all elements of $\vartheta(a)$ have the same geometric length, the recognisability data of $f_\alpha(y,s)$ is given by $(x,t,(w_i)_{i\in\mathbb{Z}})$. In particular, $f_\alpha$ commutes with the action of $T$. We call $f_\alpha$ a $\vartheta$-shuffle, and $\vartheta^n$-shuffles are defined accordingly.

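Definition 2.15 ([20])

For each $a\in\mathcal{A}$ and $n\in\mathbb{N}$, let $\Gamma_{n,a}=\{f_\alpha:\alpha\in\operatorname{Sym}(\vartheta^n(a))\}$ and $\Gamma_n=\prod_{a\in\mathcal{A}}\Gamma_{n,a}$. We call $\Gamma=\bigcup_{n\in\mathbb{N}}\Gamma_n$ the shuffle group of $\vartheta$.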

Remark 2.16

Note that each element in Γ inherits the property of commuting with the action of T. In [20], the shuffle group was introduced as acting on the subshift Xϑ instead of its suspension, but the naive generalisation of this definition beyond the compatible setting fails to commute with the shift action. This is why we chose to introduce the shuffle group as a symmetry of the suspension (Yϑ,T) where the commutation relation with the translation is ensured by the weaker assumption of geometric compatibility.

Continuity of $f\in\Gamma$ is inherited from the fact that the recognisability data of $y\in X_\vartheta$ depends continuously on $y$; see Lemma 5.1. Hence, $\Gamma$ is a subgroup of the automorphism group on $(Y_\vartheta,T)$. It should be noted that shuffles are nested in the sense that $\Gamma_n$ is a subgroup of $\Gamma_{n+1}$ for all $n\in\mathbb{N}$. A special role will be played by those measures that respect all of these symmetry relations.

Definition 2.17

A shift-invariant probability measure $\mu$ on $X_\vartheta$ is called a uniformity measure if its lift $\widetilde{\mu}$ is invariant under $\Gamma$, that is, if it satisfies $\widetilde{\mu}\circ f=\widetilde{\mu}$ for all $f\in\Gamma$.

We will see later that uniformity measures always exist and have full topological support.

The geometric substitution matrix

A consequence of geometric compatibility is that a letter $a\in\mathcal{A}$ can be interpreted as a placeholder for an interval of length $L_a$. The random substitution can then be thought to act on intervals by inflating the tile by a factor $\lambda$ and randomly dissecting into intervals corresponding to letters in $\mathcal{A}$; compare Fig. 2. The overall length of intervals of type $a$ in $\vartheta_P(b)$ is then given by $|\vartheta_P(b)|_aL_a$, whereas the total geometric length of $\vartheta_P(b)$ is given by $\lambda L_b$. This motivates the following concept.

Fig. 2. The random substitution $\vartheta\colon a\mapsto\{abb\},\ b\mapsto\{a,bb\}$, geometrically compatible with $\lambda=2$ and $L=(2,1)$

Definition 2.18

The geometric substitution matrix $Q=Q(P)=Q(\vartheta_P)$ of a geometrically compatible random substitution $\vartheta_P$ is the Markov matrix given by

$$Q_{ab}=\frac{L_a}{\lambda L_b}M_{ab}=\mathbb{E}[|\vartheta_P(b)|_a]\,\frac{L_a}{\lambda L_b}.$$

For some applications, the Markov property poses a technical advantage over the use of the standard substitution matrix. The geometric substitution matrix controls the expected change of geometric proportions covered by the intervals of different types. To be more precise, for a word $w$ we consider the geometric proportion vector $\phi^g(w)$, with

$$\phi^g(w)_a=\frac{L_a\,\phi(w)_a}{L\,\phi(w)},$$

and obtain via a straightforward calculation,

$$\mathbb{E}[\phi^g(\vartheta_P(w))]=Q(P)\,\phi^g(w).$$
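Both the Markov property of $Q$ and the identity for the expected geometric proportions can be confirmed on the example of Fig. 2. The Python sketch below is an illustration under stated assumptions rather than the authors' implementation: the probabilities $1/3$ and $2/3$ on the images of $b$ are an arbitrary choice, the test word $ab$ is arbitrary, and all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

LETTERS = "ab"
LAM = 2
L = {"a": 2, "b": 1}
# Hypothetical probability choice for a -> {abb}, b -> {a, bb}.
P = {"a": {"abb": Fraction(1)},
     "b": {"a": Fraction(1, 3), "bb": Fraction(2, 3)}}


def geometric_substitution_matrix(P, L, lam):
    """Q[a][b] = L_a * E[ number of a's in theta_P(b) ] / (lam * L_b)."""
    return {a: {b: Fraction(L[a], lam * L[b]) * sum(p * v.count(a) for v, p in P[b].items())
                for b in LETTERS}
            for a in LETTERS}


def geometric_proportions(w, L):
    """Share of the geometric length of w covered by each letter type."""
    total = sum(L[c] for c in w)
    return {a: Fraction(L[a] * w.count(a), total) for a in LETTERS}


def expected_proportions(P, L, w):
    """Expectation of the geometric proportion vector of theta_P(w)."""
    exp = {a: Fraction(0) for a in LETTERS}
    for parts in product(*(P[c].items() for c in w)):
        image = "".join(v for v, _ in parts)
        prob = Fraction(1)
        for _, q in parts:
            prob *= q
        for a, x in geometric_proportions(image, L).items():
            exp[a] += prob * x
    return exp


Q = geometric_substitution_matrix(P, L, LAM)
phi = geometric_proportions("ab", L)
Q_phi = {a: sum(Q[a][b] * phi[b] for b in LETTERS) for a in LETTERS}
print(Q_phi == expected_proportions(P, L, "ab"))  # True
```

Each column of `Q` sums to one because every realisation of $\vartheta_P(b)$ has geometric length $\lambda L_b$.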

In the same vein, it will sometimes be useful to consider a geometric analogue of the letter frequencies $R^\mu$ of an $S$-invariant measure $\mu$, given by the interval proportion vector $\pi^\mu$, with

$$\pi^\mu_a=\frac{L_aR^\mu_a}{LR^\mu}, \tag{2.2}$$

representing the relative geometric proportion of intervals of type $a$ witnessed by $\mu$.

Induced transformation

Given a compact dynamical system $(X, S)$ with invariant probability measure $\mu$ and a measurable (compact) subset $A \subset X$ with $\mu(A) > 0$, the return time $r_A \colon A \to \mathbb{N} \cup \{\infty\}$ is given by

$$r_A(x) = \inf\{n \in \mathbb{N} : S^n(x) \in A\}.$$

For our purposes it is sufficient to consider the case of bounded return times, that is, we assume that there is $r_{\max} \in \mathbb{N}$ such that $r_A(x) \leq r_{\max}$ for all $x \in A$. In this case, the induced transformation is the dynamical system $(A, S_A, \mu_A)$, with $\mu_A(E) = \mu(A \cap E)/\mu(A)$ and

$$S_A(x) = S^{r_A(x)}(x),$$

for all $x \in A$. We recall a few well-known facts about induced transformations; see for instance [61]. First, the induced measure $\mu_A$ is $S_A$-invariant, and it is ergodic if $\mu$ is an ergodic measure. Another useful tool is Kac’s formula, which states that

$$\mu(f) := \int_X f \, d\mu = \int_A \sum_{i=0}^{r_A - 1} f \circ S^i \, d\mu$$

for all $f \in L^1(X,\mu)$. The corresponding statement for ergodic measures can be found in [61, Thm. 1.7]. In fact, the first part of the proof in this reference shows that the statement holds for all invariant measures if $r_A$ is bounded. Applying Kac’s formula with $f \equiv 1$ gives $1 = \int_A r_A \, d\mu = \mu(A)\,\mu_A(r_A)$, so that $\mu(A) = \mu_A(r_A)^{-1}$. Hence,

$$\int_X f \, d\mu = \frac{1}{\mu_A(r_A)} \int_A \sum_{i=0}^{r_A - 1} f \circ S^i \, d\mu_A,$$

which allows us to express $\mu$ completely in terms of $\mu_A$.
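Kac’s formula can be checked by hand on a toy system. The sketch below uses a rotation on $\mathbb{Z}/7\mathbb{Z}$ with the uniform measure; the system, the subset `A` and the test function `f` are assumptions chosen purely for illustration and are unrelated to the subshifts studied here.

```python
from fractions import Fraction

N = 7
A = [0, 2, 3]                       # hypothetical subset of X = Z/7Z

def S(x):
    """Rotation by one step; the uniform measure on Z/7Z is S-invariant."""
    return (x + 1) % N

def return_time(x):
    """First n >= 1 with S^n(x) in A."""
    n, y = 1, S(x)
    while y not in A:
        n, y = n + 1, S(y)
    return n

def f(x):
    """Arbitrary test function."""
    return x * x + 1

def orbit_sum(x):
    """sum_{i=0}^{r_A(x)-1} f(S^i x)."""
    total, y = 0, x
    for _ in range(return_time(x)):
        total += f(y)
        y = S(y)
    return total

mu = Fraction(1, N)                                     # mass of each point
lhs = sum(f(x) * mu for x in range(N))                  # integral of f over X
rhs = sum(orbit_sum(x) * mu for x in A)                 # Kac's right-hand side
mean_return = Fraction(sum(return_time(x) for x in A), len(A))   # mu_A(r_A)
```

Here `lhs` and `rhs` agree, and `mean_return` is the reciprocal of $\mu(A) = 3/7$, in line with the identity $\mu(A) = \mu_A(r_A)^{-1}$.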

Ergodicity of (inverse-time) Markov chains

We collect a few basic properties about the convergence of inhomogeneous, finite-state Markov chains in inverse time. For background and details, we refer the reader to [12].

Definition 2.19

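A sequence of column stochastic matrices $(P_n)_{n \in \mathbb{N}}$ is called ergodic (in inverse time) if for each $n \in \mathbb{N}$ there exists a probability vector $\pi_n$ such that

$$\lim_{k \to \infty} P_n \cdots P_{n+k} = \pi_n \mathbf{1}^T.$$

It will be convenient to measure the difference of probability vectors via the variation distance

$$d_V(p,q) := \frac{1}{2} |p - q|_1 = \frac{1}{2} \sum_{a \in \mathcal{A}} |p_a - q_a|,$$

for all probability vectors $p, q$ on the state space $\mathcal{A}$. Dobrushin’s ergodic coefficient $\delta$ of a (column) stochastic matrix $Q$ is given by

$$\delta(Q) = \max_{i,j} d_V(Q_{\cdot i}, Q_{\cdot j}) = \frac{1}{2} \max_{i,j} \sum_k |Q_{ki} - Q_{kj}|.$$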

This coefficient satisfies several convenient properties (see [7] for more details):

  • $0 \leq \delta(Q) \leq 1$, for all Markov matrices $Q$;

  • $\delta(Q) = 0$ if and only if all columns of $Q$ coincide;

  • $\delta(Q) = 1$ if and only if there are two columns of $Q$ with disjoint support;

  • $\delta(Q_1 Q_2) \leq \delta(Q_1)\delta(Q_2)$ for all Markov matrices $Q_1, Q_2$ with compatible dimensions.
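The properties listed above are easy to probe numerically. The following is a minimal sketch of Dobrushin’s coefficient for column stochastic matrices stored as lists of rows; the helper names `dobrushin` and `matmul` and the sample matrices are assumptions for this example.

```python
def dobrushin(Q):
    """delta(Q) = 1/2 * max_{i,j} sum_k |Q[k][i] - Q[k][j]| for a column stochastic Q."""
    rows, cols = len(Q), len(Q[0])
    return max(
        0.5 * sum(abs(Q[k][i] - Q[k][j]) for k in range(rows))
        for i in range(cols)
        for j in range(cols)
    )

def matmul(A, B):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

Q1 = [[0.5, 0.2], [0.5, 0.8]]          # sample column stochastic matrices
Q2 = [[0.9, 0.4], [0.1, 0.6]]
identity = [[1.0, 0.0], [0.0, 1.0]]    # two columns with disjoint support
rank_one = [[0.3, 0.3], [0.7, 0.7]]    # all columns coincide
```

With these inputs, `dobrushin(identity)` and `dobrushin(rank_one)` realise the two extreme values, and `dobrushin(matmul(Q1, Q2))` can be compared with the product of the individual coefficients to illustrate submultiplicativity.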

In fact, it is possible to express ergodicity (in inverse time) entirely in terms of this coefficient.

Theorem 2.20

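[12]. The sequence $(P_n)_{n \in \mathbb{N}}$ is ergodic in inverse time if and only if

$$\lim_{k \to \infty} \delta(P_n \cdots P_{n+k}) = 0$$

for all $n \in \mathbb{N}$. In particular, each of the following conditions is sufficient (but not necessary) for ergodicity:

  1. $\prod_{n \geq k} \delta(P_n) = 0$ for all $k \in \mathbb{N}$;

  2. $\lim_{n \to \infty} P_n = P$ for some primitive $P$.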

For Markov processes in forward time, the natural analogue of our definition of ergodicity is usually called "strong ergodicity" and is in fact strictly stronger than the condition that $\lim_{k \to \infty} \delta(P_n \cdots P_{n+k}) = 0$ (termed "weak ergodicity"). In this sense, Markov processes in inverse time are better behaved than their analogues in forward time.
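To make the criterion in Theorem 2.20 concrete, the sketch below builds a hypothetical sequence of $2 \times 2$ column stochastic matrices converging to a primitive matrix and tracks $\delta(P_0 \cdots P_k)$. The sequence `P(n)` is an assumption chosen for illustration; it is not one of the matrices $Q_n$ arising from a random substitution.

```python
def dobrushin(Q):
    """Dobrushin's ergodic coefficient of a column stochastic matrix (list of rows)."""
    rows, cols = len(Q), len(Q[0])
    return max(
        0.5 * sum(abs(Q[k][i] - Q[k][j]) for k in range(rows))
        for i in range(cols)
        for j in range(cols)
    )

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def P(n):
    """Hypothetical column stochastic matrices converging to [[0.5, 0.3], [0.5, 0.7]]."""
    eps = 0.4 / (n + 1)
    return [[0.5 + eps, 0.3], [0.5 - eps, 0.7]]

def product(n, k):
    """The product P_n P_{n+1} ... P_{n+k}."""
    prod = P(n)
    for j in range(n + 1, n + k + 1):
        prod = matmul(prod, P(j))
    return prod

deltas = [dobrushin(product(0, k)) for k in range(25)]
limit = product(0, 24)   # columns are numerically indistinguishable at this depth
```

The coefficients in `deltas` decrease towards zero, and the two columns of `limit` approximate the probability vector $\pi_0$ from Definition 2.19.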

Conditional entropy

Let $U$ be a random variable, possibly word valued, with a countable set $\mathrm{Im}(U)$ of possible realisations. Assume that the probability distribution of $U$ is fixed by some probability measure $\mathbb{P}$. The entropy of $U$ with respect to $\mathbb{P}$ is given by

$$H_{\mathbb{P}}(U) = -\sum_{u \in \mathrm{Im}(U)} \mathbb{P}[U = u] \log(\mathbb{P}[U = u]).$$

Often, entropy is defined for a partition, but this leads to an equivalent definition if we consider partitions that are induced by countable state random variables. We write $H(U)$ for $H_{\mathbb{P}}(U)$ if the probability distribution is understood. Given two random variables $U, V$, we write $H(U,V) = H((U,V))$ for the entropy of the random variable $(U, V)$. The conditional entropy of $U$ given $V$ with respect to $\mathbb{P}$ is defined as

$$H_{\mathbb{P}}(U|V) = \sum_{v \in \mathrm{Im}(V)} \mathbb{P}[V = v] \, H_{\mathbb{P}_{\{V = v\}}}(U).$$

We will freely use the following standard properties of (conditional) entropy.

  1. $H(U) \leq \log(\#\mathrm{Im}(U))$, and equality holds if and only if $U$ is uniformly distributed,

  2. $H(U,V) = H(V) + H(U|V)$,

  3. $H(U|V) \leq H(U)$, with equality if and only if $U$ and $V$ are independent,

  4. $H(U,V|W) = H(V|W) + H(U|V,W)$,

  5. $H(U|V,W) \leq H(U|W)$.

We refer to [35, 63] for more details and background. Let us expand a bit more on how to characterise equality in the last item. By a straightforward calculation,

$$H_{\mathbb{P}}(U|V,W) = \sum_{w \in \mathrm{Im}(W)} \mathbb{P}[W = w] \, H_{\mathbb{P}_{\{W = w\}}}(U|V).$$

Using the third property, we obtain that $H_{\mathbb{P}}(U|V,W) = H_{\mathbb{P}}(U|W)$ if and only if $U$ and $V$ are independent over $\mathbb{P}_{\{W = w\}}$ for every realisation $w$ with $\mathbb{P}[W = w] > 0$.
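The first three properties can be verified on any small joint distribution. The sketch below does so for a hypothetical pair $(U, V)$ with two values each; the distribution `joint` and the helper names are assumptions for illustration.

```python
from collections import defaultdict
from math import log

def entropy(dist):
    """Shannon entropy (natural logarithm) of a distribution {outcome: probability}."""
    return -sum(p * log(p) for p in dist.values() if p > 0)

def marginal(joint, idx):
    """Marginal distribution of coordinate idx of a joint distribution {(u, v): p}."""
    out = defaultdict(float)
    for key, p in joint.items():
        out[key[idx]] += p
    return dict(out)

def conditional_entropy(joint):
    """H(U|V) = sum_v P[V = v] * H(U conditioned on V = v)."""
    total = 0.0
    for v, pv in marginal(joint, 1).items():
        cond = {u: q / pv for (u, w), q in joint.items() if w == v}
        total += pv * entropy(cond)
    return total

joint = {("x", 0): 0.4, ("y", 0): 0.1, ("x", 1): 0.2, ("y", 1): 0.3}
H_UV = entropy(joint)
H_U = entropy(marginal(joint, 0))
H_V = entropy(marginal(joint, 1))
H_U_given_V = conditional_entropy(joint)
```

For this choice, `H_UV` equals `H_V + H_U_given_V` up to rounding, and `H_U_given_V` is strictly smaller than `H_U` because $U$ and $V$ are not independent.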

Main Results

Our first main result shows that the topological entropy of the geometric hull can be obtained by counting inflation words. We emphasise that this does not require ϑ to be recognisable. This generalises and unifies the results on topological entropy in [25, 48].

Theorem A

Let $\vartheta$ be a primitive and geometrically compatible random substitution. Then, for all $a \in \mathcal{A}$, the geometric inflation word entropy $h^G_a$ exists and coincides with $h_{\mathrm{top}}(Y_\vartheta)$.

In the symbolic setting, it was shown in [27, 47] that for all primitive random substitutions that are compatible or constant length, there exists a sequence of frequency measures that converges weakly to a measure of maximal entropy. As a consequence of Theorem A, we will obtain the analogous result in the geometrically compatible setting, on the geometric hull.

Corollary B

Let $\vartheta$ be a geometrically compatible random substitution with associated geometric hull $Y_\vartheta$. Then, there exists a sequence $(\mu_m)_m$ of frequency measures (on the symbolic hull $X_\vartheta$) whose push-forwards converge weakly to a measure of maximal entropy on $Y_\vartheta$.

In general, however, the class of frequency measures is too small to contain the measure of maximal (geometric) entropy; an example with a unique MME that is not a frequency measure is presented in Example 8.9. A more adequate family is given by the inverse limit measures, presented in Sect. 7. In particular, this class contains all uniformity measures.

Theorem C

Let $\vartheta$ be a primitive, geometrically compatible and recognisable random substitution. Then, the measures of maximal geometric entropy on $(X_\vartheta, S)$ are precisely the uniformity measures.

Since uniformity measures have full topological support, we conclude that $X_\vartheta$ is (geometric) entropy-minimal, that is, all proper invariant subshifts have a smaller (geometric) entropy.

Theorem D

Let ϑ be a primitive, geometrically compatible and recognisable random substitution and let Qn=Q(Pn,1), where Pn,1 is the n-productivity distribution for ϑ for all nN0. Then, there is a unique uniformity measure if and only if the Markov chain (Qn)nN0 is ergodic in inverse time.

If ϑ is compatible or of constant length, then the measures of maximal geometric entropy are precisely the measures of maximal entropy. We note that several isolated examples were shown to be intrinsically ergodic in [27]. In all of these cases, the n-productivity distributions are uniform distributions and the Markov chain is trivially ergodic. In fact, this is true whenever ϑ is compatible.

Corollary E

If $\vartheta$ is primitive, compatible and recognisable, both $(X_\vartheta, S)$ and $(Y_\vartheta, T)$ are intrinsically ergodic. The unique measure of maximal entropy is the frequency measure $\mu_{\mathbf{P}}$, where $\mathbf{P}$ is the uniform distribution on $\vartheta(a)$ for all $a \in \mathcal{A}$.

In general, this is not true if compatible is relaxed to geometrically compatible. Even in the constant length setting, we can find examples where $(Q_n)_{n \in \mathbb{N}_0}$ is not ergodic, and therefore obtain cases where both $(X_\vartheta, S)$ and $(Y_\vartheta, T)$ are not intrinsically ergodic. A specific example for which this occurs is worked out at the end of Sect. 8.

Topological Entropy of the Geometric Hull

Geometric inflation word entropy

The proof of Theorem A follows a similar line of arguments to those in [48, Thm. 4.1], adapted to the geometric setting.

Proposition 4.1

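Let $\vartheta$ be a geometrically compatible random substitution with associated geometric hull $Y_\vartheta$. Let $k, m \in \mathbb{N}$ and set

$$h_{\max}^{m,k} := \max_{a \in \mathcal{A}} \max_{u \in \vartheta^k(a)} \frac{\log(\#\vartheta^m(u))}{L(\vartheta^m(u))}.$$

Then, the following inequality holds:

$$h_{\mathrm{top}}(Y_\vartheta) \leq \frac{\lambda^m}{\lambda^m - 1} h_{\max}^{m,k}.$$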

Proof

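Fix $k, m \in \mathbb{N}$ and let $n \in \mathbb{N}$. For every legal word $w \in \mathcal{L}_\vartheta$, we have that $L(\vartheta^m(w)) = \lambda^m L(w)$. For each $a \in \mathcal{A}$ and $w \in \vartheta^k(a)$, we let $h^m_w$ be the number such that $\#\vartheta^m(w) = \exp(h^m_w \lambda^m L(w))$. By definition, we have that $h_{\max}^{m,k} = \max_{a \in \mathcal{A}} \max_{w \in \vartheta^k(a)} h^m_w$. Note that if $v = v_1 \cdots v_r$ is the concatenation of level-$k$ inflation words (that is, $v_i \in \vartheta^k(a_i)$ for some $a_i \in \mathcal{A}$, for all $i \in \{1, \ldots, r\}$), then

$$\#\vartheta^m(v) = \prod_{i=1}^r \#\vartheta^m(v_i) = \prod_{i=1}^r \exp\big(h^m_{v_i} \lambda^m L(v_i)\big) = \exp\Big(\sum_{i=1}^r h^m_{v_i} \lambda^m L(v_i)\Big) \leq \exp\big(h_{\max}^{m,k} \lambda^m L(v)\big). \tag{4.1}$$

By definition of $Y_\vartheta$, every patch of length $\lambda^m n$ is contained in the image of a patch with length $n + L_{\max}$, where $L_{\max} = \max_{a \in \mathcal{A}} L_a$. Moreover, the image of such a patch contains at most $C_1$ patches of length $\lambda^m n$, where $C_1$ is a constant dependent on $m$ but not $n$. This is because patches have a control point at the origin by definition, and these control points occur with bounded distances. Hence, the number of patches of length $\lambda^m n$ is bounded above by

$$\#\mathcal{P}_\vartheta(\lambda^m n) \leq C_1 \sum_{\substack{v \in \mathcal{L}_\vartheta \\ L(v_{[1,|v|-1]}) < n + L_{\max} \leq L(v)}} \#\vartheta^m(v). \tag{4.2}$$

For every $v \in \mathcal{L}_\vartheta$, there exists a $w \in \mathcal{L}_\vartheta$ that is the concatenation of level-$k$ inflation words such that $v$ is contained in $w$. Moreover, such a $w$ can be chosen with length at most $|v| + 2|\vartheta^k|$, where $|\vartheta^k| = \max_{a \in \mathcal{A}} \max_{s \in \vartheta^k(a)} |s|$. Therefore, the geometric length $L(w)$ is at most $n + (2|\vartheta^k| + 1)L_{\max}$. Thus, it follows by (4.1) that there is a constant $C_2$ such that

$$\#\vartheta^m(v) \leq \#\vartheta^m(w) \leq \exp\big(h_{\max}^{m,k} \lambda^m L(w)\big) \leq \exp\big(h_{\max}^{m,k} \lambda^m (n + C_2)\big).$$

Substituting this expression into (4.2) gives

$$\#\mathcal{P}_\vartheta(\lambda^m n) \leq C_1 \big(\#\mathcal{P}_\vartheta(n + L_{\max})\big) \exp\big(h_{\max}^{m,k} \lambda^m (n + C_2)\big)$$

and so it follows that

$$\limsup_{n \to \infty} \frac{1}{\lambda^m n} \log(\#\mathcal{P}_\vartheta(\lambda^m n)) \leq \limsup_{n \to \infty} \frac{1}{\lambda^m n} \log(\#\mathcal{P}_\vartheta(n + L_{\max})) + \limsup_{n \to \infty} \Big(1 + \frac{C_2}{n}\Big) h_{\max}^{m,k}.$$

Hence, we obtain

$$h_{\mathrm{top}}(Y_\vartheta) \leq \frac{1}{\lambda^m} h_{\mathrm{top}}(Y_\vartheta) + h_{\max}^{m,k},$$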

and rearranging then gives the desired result.

Proof of Theorem A

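For $n, k \in \mathbb{N}$ and $a \in \mathcal{A}$, let

$$h_a^{n,k} = \max_{u \in \vartheta^k(a)} \frac{\log(\#\vartheta^n(u))}{L(\vartheta^n(u))}$$

and let $u_a^{n,k} \in \vartheta^k(a)$ be a word for which this maximum is achieved. We let $h_{\max}^{n,k} = \max_{a \in \mathcal{A}} h_a^{n,k}$ and $h_{\min}^{n,k} = \min_{a \in \mathcal{A}} h_a^{n,k}$. Since we assumed $\vartheta$ to be primitive, there is a number $N \in \mathbb{N}$ such that $\vartheta^N$ has a marginal with strictly positive substitution matrix. Given $a \in \mathcal{A}$, we can hence choose a word $w = w_1 \cdots w_m \in \vartheta^N(a)$ that contains every letter in $\mathcal{A}$. Assuming $k > N$ and writing $u_j^{n,k-N}$ for $u_{w_j}^{n,k-N}$ (and likewise $h_j^{n,k-N}$ for $h_{w_j}^{n,k-N}$), we note that $u_j^{n,k-N} \in \vartheta^{k-N}(w_j)$ and hence we can pick a realisation $v_k \in \vartheta^{k-N}(w) \subset \vartheta^k(a)$ of the form

$$v_k = u_{w_1}^{n,k-N} \cdots u_{w_m}^{n,k-N}. \tag{4.3}$$

Since $\vartheta$ is geometrically compatible, it has unique realisation paths (recall Remark 2.6), so we have that

$$\log(\#\vartheta^n(v_k)) = \sum_{j=1}^m \log \#\vartheta^n\big(u_j^{n,k-N}\big) = \sum_{j=1}^m h_j^{n,k-N} L\big(\vartheta^n(u_j^{n,k-N})\big). \tag{4.4}$$

Observe that $L(\vartheta^n(u_j^{n,k-N})) = \lambda^{n+k-N} L(w_j)$ and $L(\vartheta^n(v_k)) = \sum_{j=1}^m L(\vartheta^n(u_j^{n,k-N}))$. Hence, for each $j$ (and recalling that the length $m = |w|$ depends only on the letter $a$), we find that

$$\frac{L(\vartheta^n(u_j^{n,k-N}))}{L(\vartheta^n(v_k))} \geq \frac{\min_{b \in \mathcal{A}} L_b}{m \max_{b \in \mathcal{A}} L_b} =: q_a.$$

With $q = \min_{b \in \mathcal{A}} q_b > 0$, the fact that $w$ contains every letter implies that $h_{\max}^{n,k-N}$ appears at least with weight $q L(\vartheta^n(v_k))$ in the last sum in (4.4). Using that $v_k \in \vartheta^k(a)$, we thus obtain

$$h_a^{n,k} \geq \frac{\log(\#\vartheta^n(v_k))}{L(\vartheta^n(v_k))} \geq q h_{\max}^{n,k-N} + (1-q) h_{\min}^{n,k-N}, \tag{4.5}$$

for all $a \in \mathcal{A}$. In particular, the same lower bound holds for $h_{\min}^{n,k}$ in place of $h_a^{n,k}$. Recall that, by Proposition 4.1, we have

$$h_{\max}^{n,j} \geq \frac{\lambda^n - 1}{\lambda^n} h_{\mathrm{top}}(Y_\vartheta) =: h(n)$$

for all $j \in \mathbb{N}$. Together with (4.5), we therefore find

$$h_{\min}^{n,k} \geq q h(n) + (1-q) h_{\min}^{n,k-N}.$$

By applying this inequality repeatedly and comparing with (4.5), we obtain that

$$\frac{\log(\#\vartheta^n(v_k))}{L(\vartheta^n(v_k))} \geq q \sum_{j=0}^{\ell(k)-1} (1-q)^j h(n) + (1-q)^{\ell(k)} h_{\min}^{n,r(k)},$$

where $\ell(k) = \lfloor k/N \rfloor$ and $r(k) = k - N\ell(k)$. Letting $k \to \infty$ in the above, and bounding the second term below by zero, we thus obtain that

$$\liminf_{m \to \infty} \frac{\log(\#\vartheta^m(a))}{L(\vartheta^m(a))} = \liminf_{k \to \infty} \frac{\log(\#\vartheta^{n+k}(a))}{L(\vartheta^{n+k}(a))} \geq \liminf_{k \to \infty} \frac{\log(\#\vartheta^n(v_k))}{L(\vartheta^n(v_k))} \geq h(n) \xrightarrow{n \to \infty} h_{\mathrm{top}}(Y_\vartheta).$$

This shows the lower bound for the inflation word entropy. The upper bound

$$\limsup_{m \to \infty} \frac{\log(\#\vartheta^m(a))}{L(\vartheta^m(a))} \leq h_{\mathrm{top}}(Y_\vartheta)$$

is immediate because all words in $\vartheta^m(a)$ are legal patterns of size $L(\vartheta^m(a))$ in $Y_\vartheta$.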

Measure theoretic entropy of frequency measures

In this section, we prove that frequency measures can be used to approximate the topological entropy to an arbitrary level (Corollary B). This follows by combining Theorem A with the results on entropy of frequency measures proved in [27].

Definition 4.2

For a primitive random substitution ϑP on a finite alphabet A and mN, we let HPm=(HPm,a)aA denote the row vector with entries HPm,a=H(ϑmm(a)) for all aA.

Theorem 4.3

[27, Thm. 3.5]. Let ϑP be a primitive and geometrically compatible random substitution, with Perron–Frobenius eigenvalue λ and right eigenvector R. Then, for all mN,

1λmHPmRhμP(Xϑ)1λm-1HPmR.

Proof of Corollary B

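Let $(\mu_m)_m$ be the sequence of frequency measures corresponding to equi-distributed probabilities on $\vartheta^m$. For each $m \in \mathbb{N}$, let $R_m$ denote the right Perron–Frobenius eigenvector of the substitution matrix for the choice of probabilities associated with the measure $\mu_m$. Further, we let $\tilde{\mu}_m$ denote the push-forward of $\mu_m$ onto $Y_\vartheta$.

Let $\varepsilon > 0$. By Theorem A, there is an $M \in \mathbb{N}$ such that for all $m \geq M$ and all $a \in \mathcal{A}$, we have $\log(\#\vartheta^m(a))/\lambda^m > L_a(h_{\mathrm{top}}(Y_\vartheta) - \varepsilon)$, noting that $L(\vartheta^m(a)) = \lambda^m L_a$. Thus, it follows by Theorem 4.3 and Abramov’s formula that for all $m \geq M$, we have

$$h_{\tilde{\mu}_m}(Y_\vartheta) \geq \frac{1}{L R_m} \sum_{a \in \mathcal{A}} R_{m,a} \frac{\log(\#\vartheta^m(a))}{\lambda^m} > \frac{1}{L R_m} \sum_{a \in \mathcal{A}} R_{m,a} L_a \big(h_{\mathrm{top}}(Y_\vartheta) - \varepsilon\big) = h_{\mathrm{top}}(Y_\vartheta) - \varepsilon.$$

Since this holds for all $\varepsilon > 0$, we conclude that $h_{\tilde{\mu}_m}(Y_\vartheta) \to h_{\mathrm{top}}(Y_\vartheta)$ as $m \to \infty$. Thus, by the compactness of the space of shift-invariant measures on $X_\vartheta$, we conclude that there is a sequence of frequency measures whose push-forwards converge weakly to a measure of maximal entropy on $Y_\vartheta$.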

Relationship between entropy of symbolic and geometric hulls

Definition 4.4

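Let $X$ be a subshift over a finite alphabet $\mathcal{A}$. We say that a probability vector $\eta \in [0,1]^{\mathcal{A}}$ is a letter frequency vector for $X$ if there exists $x \in X$ such that $\eta = \eta(x)$, where

$$\eta(x)_a = \lim_{n \to \infty} \frac{|x_{[-n,n]}|_a}{2n+1},$$

for all $a \in \mathcal{A}$.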

Lemma 4.5

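Let $Y$ be the suspension of a subshift $X$, with associated length vector $L \colon \mathcal{A} \to \mathbb{R}_+$ and roof function $\pi(x) = L_{x_0}$. Further, let $\eta^-$ and $\eta^+$ denote letter frequency vectors that minimise and maximise the quantity $L\eta$ over all letter frequency vectors $\eta$, respectively. Then, the following inequalities hold:

$$\frac{1}{L\eta^+} h_{\mathrm{top}}(X) \leq h_{\mathrm{top}}(Y) \leq \frac{1}{L\eta^-} h_{\mathrm{top}}(X).$$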

Proof

By the variational principle for suspension flows and Abramov’s formula, we have

$$h_{\mathrm{top}}(Y) = \sup h_{\tilde{\mu}} = \sup \frac{h_\mu}{L R^\mu},$$

where the supremum is taken over all ergodic measures. By ergodicity, each such measure satisfies that $R^\mu = \eta(x)$ for some $x \in X$ and therefore is a letter frequency vector for $X$. Hence, $L\eta^- \leq L R^\mu \leq L\eta^+$, implying the desired bounds.

Corollary 4.6

If $\vartheta$ is a constant length or compatible random substitution and $L$ is the left eigenvector, normalised such that $LR = 1$, then $h_{\mathrm{top}}(X_\vartheta) = h_{\mathrm{top}}(Y_\vartheta)$.

In the general geometrically compatible setting, it is possible for the inequalities in Lemma 4.5 to be strict: we give an explicit example in the next section. In fact, there exist examples of geometrically compatible random substitutions (which are not compatible or constant length) for which the measure of maximal entropy is not a measure of maximal geometric entropy: see Example 4.8.

Examples

The following examples illustrate how Theorem A can be used to obtain the topological entropy for subshifts of random substitutions that are neither compatible nor constant length.

Example 4.7

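Let $\vartheta$ be the primitive random substitution defined by

$$\vartheta \colon a \mapsto \{ab, ac\}, \quad b, c \mapsto \{a\},$$

which is geometrically compatible but neither constant length nor compatible. We show that

$$h_{\mathrm{top}}(X_\vartheta) = h_{\mathrm{top}}(Y_\vartheta) = \frac{1}{\tau^2} \log 2,$$

where $\tau = (1 + \sqrt{5})/2$ denotes the golden ratio. To this end, we first observe that the vector

$$L = \frac{1}{1 + \tau^{-2}} (\tau, 1, 1)$$

is the left Perron–Frobenius eigenvector of the substitution matrix for an appropriate choice of normalisation. Since the frequency of $a$ in every element of $X_\vartheta$ is equal to $\tau^{-1}$, every letter frequency vector $\eta$ satisfies $L\eta = 1$. Thus, it follows by Lemma 4.5 that $h_{\mathrm{top}}(X_\vartheta) = h_{\mathrm{top}}(Y_\vartheta)$. By Theorem A, both coincide with the quantity $h^G_a$. Now, note that for all $n \in \mathbb{N}$, we have $\vartheta^{n+1}(a) = \vartheta^n(ab) \cup \vartheta^n(ac) = \vartheta^n(ab)$, so

$$\#\vartheta^{n+1}(a) = \#\vartheta^n(ab) = \#\vartheta^n(a) \, \#\vartheta^n(b) = \#\vartheta^n(a) \, \#\vartheta^{n-1}(a).$$

Iterating this identity, and noting that $\#\vartheta^0(a) = 1$ and $\#\vartheta^1(a) = 2$, we obtain that $\#\vartheta^n(a) = 2^{F_n}$, where $F_n$ denotes the $n$th Fibonacci number (with $F_1 = F_2 = 1$). Hence,

$$h^G_a = \lim_{n \to \infty} \frac{1}{L(\vartheta^n(a))} \log(\#\vartheta^n(a)) = \lim_{n \to \infty} \frac{F_n}{\tau^{n+1} (1 + \tau^{-2})^{-1}} \log 2 = \frac{\tau^2 + 1}{\tau^3 \sqrt{5}} \log 2 = \frac{1}{\tau^2} \log 2,$$

where in the second last equality we have applied Binet’s formula and in the final equality we have used the identity $\tau^2 + 1 = \tau\sqrt{5}$. This establishes the desired result.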
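The counting argument in Example 4.7 can be checked by brute force for small levels. The sketch below enumerates $\vartheta^n(a)$ explicitly and compares the cardinalities with $2^{F_n}$; the helper names (`images`, `level_sets`, `fib`, `ratio`) are assumptions for this illustration, and the enumeration is only feasible for small $n$.

```python
from itertools import product
from math import log, sqrt

RULE = {"a": ["ab", "ac"], "b": ["a"], "c": ["a"]}

def images(word):
    """All realisations of theta(word): every letter is substituted independently."""
    return {"".join(parts) for parts in product(*(RULE[x] for x in word))}

def level_sets(letter, n_max):
    """Return [theta^1(letter), ..., theta^n_max(letter)] as sets of words."""
    current, out = {letter}, []
    for _ in range(n_max):
        current = set().union(*(images(w) for w in current))
        out.append(current)
    return out

def fib(n):
    """Fibonacci numbers with F_0 = 0 and F_1 = F_2 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

counts = [len(s) for s in level_sets("a", 6)]

tau = (1 + sqrt(5)) / 2
L_a = tau / (1 + tau ** -2)          # first entry of the normalised left eigenvector

def ratio(n):
    """log(#theta^n(a)) / L(theta^n(a)), using #theta^n(a) = 2^{F_n}."""
    return fib(n) * log(2) / (tau ** n * L_a)
```

For larger $n$, `ratio(n)` uses the closed form $2^{F_n}$ rather than enumeration and approaches $\tau^{-2}\log 2$.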

The following example demonstrates that, in general, the topological entropy of the symbolic hull need not coincide with the geometric inflation word entropy if the symbolic length $|\vartheta(a)|$ is not well defined for all $a \in \mathcal{A}$.

Example 4.8

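Let $\vartheta$ be the random substitution from Fig. 2, defined by

$$\vartheta \colon a \mapsto \{abb\}, \quad b \mapsto \{a, bb\}.$$

The random substitution $\vartheta$ is geometrically compatible, but neither compatible nor constant length. Further, in contrast to the previous example, the symbolic inflation word lengths $|\vartheta(a_i)|$ are not well defined for all letters $a_i$. All marginals of $\vartheta$ have a substitution matrix with Perron–Frobenius eigenvalue $\lambda = 2$ and left eigenvector $L = (2,1)$. We take this vector $L$ to define $Y_\vartheta$.

By Theorem A, we can calculate $h_{\mathrm{top}}(Y_\vartheta)$ via the geometric inflation word entropy. Note that $\vartheta(a) \subset \vartheta(bb)$ and so it follows inductively that $\#\vartheta^m(b) = (\#\vartheta^{m-1}(b))^2$ for all $m \geq 2$, and therefore $\#\vartheta^m(b) = (\#\vartheta(b))^{2^{m-1}} = 2^{2^{m-1}}$ for all $m \in \mathbb{N}$. Hence,

$$\frac{1}{L(\vartheta^m(b))} \log(\#\vartheta^m(b)) = \frac{1}{2^m} \log\big(2^{2^{m-1}}\big) = \frac{1}{2} \log 2$$

for all $m \in \mathbb{N}$, so we conclude that $h_{\mathrm{top}}(Y_\vartheta) = \log(2)/2$.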

We now demonstrate that the (unique) measure of maximal entropy on the symbolic hull does not correspond to a measure of maximal entropy on the associated geometric hull. Let $X$ be the shift of finite type defined over the alphabet $\{a, b_0, b_1\}$ by the forbidden word set $\mathcal{F} = \{ab_1, b_1b_1, b_0b_0, b_0a\}$ and let $\pi \colon \{a, b_0, b_1\}^{\mathbb{Z}} \to \{a, b\}^{\mathbb{Z}}$ be the factor map given by $\pi(x)_i = \pi(x_i)$, where $\pi(a) = a$, $\pi(b_0) = \pi(b_1) = b$. It can easily be verified that $\pi(X) = X_\vartheta$. Further, $\pi$ is one-to-one everywhere except on the sequence $b^{\mathbb{Z}}$. Thus, the subshift $X_\vartheta$ is intrinsically ergodic with the unique measure of maximal entropy $\mu$ given by the pushforward by $\pi$ of the Parry measure on $X$. Hence, by standard results on the entropy of shifts of finite type, we obtain that $h_\mu(X_\vartheta) = h_{\mathrm{top}}(X_\vartheta) = \log \tau$, where $\tau$ is the golden ratio. Moreover, the letter frequency vector $R^\mu$ associated with $\mu$ is given by

$$R^\mu = \Big(\frac{\tau}{\tau + 2}, \frac{2}{\tau + 2}\Big)$$

and so we have

$$L R^\mu = \frac{2\tau + 2}{\tau + 2}.$$

Thus, it follows by Abramov’s formula that the lift $\tilde{\mu}$ of $\mu$ onto $Y_\vartheta$ has entropy

$$h_{\tilde{\mu}}(Y_\vartheta) = \frac{h_\mu(X_\vartheta)}{L R^\mu} = \frac{\tau + 2}{2\tau + 2} \log \tau.$$

Since

$$0.3325 \approx \frac{\tau + 2}{2\tau + 2} \log \tau < \frac{1}{2} \log 2 \approx 0.3466,$$

it follows that $h_{\tilde{\mu}}(Y_\vartheta) < h_{\mathrm{top}}(Y_\vartheta)$, and so $\tilde{\mu}$ is not a measure of maximal entropy for $Y_\vartheta$.
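Both the inflation word counts and the numerical comparison in Example 4.8 can be reproduced with a few lines of code. The following sketch enumerates $\vartheta^m(b)$ for small $m$ and evaluates the two entropies; the helper names are assumptions made for this illustration.

```python
from itertools import product
from math import log, sqrt

RULE = {"a": ["abb"], "b": ["a", "bb"]}

def images(word):
    """All realisations of theta(word): every letter is substituted independently."""
    return {"".join(parts) for parts in product(*(RULE[x] for x in word))}

def count_level(letter, m):
    """Cardinality of theta^m(letter), obtained by explicit enumeration."""
    current = {letter}
    for _ in range(m):
        current = set().union(*(images(w) for w in current))
    return len(current)

counts_b = [count_level("b", m) for m in range(1, 5)]

tau = (1 + sqrt(5)) / 2
h_lift = (tau + 2) / (2 * tau + 2) * log(tau)   # entropy of the lifted measure
h_top_geometric = log(2) / 2                    # topological entropy of the geometric hull
# Geometric inflation word entropy at each level: L(theta^m(b)) = 2^m since L_b = 1.
level_ratios = [log(c) / 2 ** m for m, c in enumerate(counts_b, start=1)]
```

The entries of `level_ratios` all coincide with $\tfrac{1}{2}\log 2$, and `h_lift` is strictly smaller than `h_top_geometric`, matching the comparison above.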

Structure of Recognisable Subshifts

From now on, we will assume that ϑ is recognisable. For the special case of compatible random substitutions, it was noted in [20] that there is an equivalent local formulation of recognisability. This is in line with the following result.

Lemma 5.1

In the definition of recognisability, the data $x$, $k$ and $(v_i)_{i \in \mathbb{Z}}$ depend continuously on $y$.

Proof

Assume that $y^n \in X_\vartheta$ satisfies $\lim_{n \to \infty} y^n = y$. For $n \in \mathbb{N}$, let $D^n = (x^n, k^n, (v_i^n)_{i \in \mathbb{Z}})$ be the recognisability data of $y^n$. Due to compactness, it suffices to show that the recognisability data of $y$ is the only accumulation point of $(D^n)_{n \in \mathbb{N}}$. Up to restricting to a subsequence, we can in fact assume that $D^n$ converges to some $(x, k, (v_i)_{i \in \mathbb{Z}})$. In particular, $(v_i^n)_{n \in \mathbb{N}}$ and $(x_i^n)_{n \in \mathbb{N}}$ are eventually constant for all $i \in \mathbb{Z}$, implying that $v_i \in \vartheta(x_i)$. For convenience, let $v^n = \cdots v_{-1}^n . v_0^n \cdots$ and similarly for $v$. Given a ball $B$ around the origin, we can choose $n$ large enough that $y|_B = y^n|_B = S^k v^n|_B = S^k v|_B$. Since $B$ was arbitrary, we have $y = S^k v$, and hence $(x, k, (v_i)_{i \in \mathbb{Z}})$ is indeed the unique recognisability data of $y$.

Remark 5.2

As an immediate consequence, we obtain that if $\vartheta$ is recognisable and of constant length $\ell$, the $\ell$-adic odometer $\Omega \subset \prod_{n=1}^{\infty} \mathbb{Z}/\ell^n\mathbb{Z}$, with addition $+1$, is a topological factor of $(X_\vartheta, S)$. Indeed, we obtain an explicit factor map $x \mapsto (k_n)_{n \in \mathbb{N}}$, where $k_n$ is the unique number in $[0, \ell^n)$ such that $x \in S^{k_n}(\vartheta^n(X_\vartheta))$. In particular, $(X_\vartheta, S)$ cannot be topologically mixing. For related results on topological mixing under the assumption of compatibility, we refer the reader to [46]. In particular, the observation in this remark slightly extends [46, Cor. 14] by dropping the assumption of compatibility.

For recognisable random substitutions, it is often convenient to consider the compact subset ϑ(Xϑ) and the associated induced transformation. This gives rise to the following structure.

Lemma 5.3

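Let $\vartheta$ be a recognisable random substitution. Let $A = \vartheta(X_\vartheta)$ and let $(A, S_A)$ be the induced transformation of $(X_\vartheta, S)$ on $A$. Then $(A, S_A)$ is topologically conjugate to the shift map $S$ on the space $Z_\vartheta \subset \mathcal{B}^{\mathbb{Z}}$, where $\mathcal{B} = \{(a, v) : a \in \mathcal{A}, v \in \vartheta(a)\}$ and

$$Z_\vartheta = \{(x_i, v_i)_{i \in \mathbb{Z}} \in \mathcal{B}^{\mathbb{Z}} : (x_i)_{i \in \mathbb{Z}} \in X_\vartheta\}.$$

Proof

We give the conjugacy map $\psi \colon \vartheta(X_\vartheta) \to Z_\vartheta$ explicitly. Given $y \in \vartheta(X_\vartheta)$, recognisability implies that there is a unique element $x \in X_\vartheta$ and a unique sequence $(v_i)_{i \in \mathbb{Z}}$ with $v_i \in \vartheta(x_i)$ such that $y = (v_i)_{i \in \mathbb{Z}}$ is a concatenation of these inflation words. Set

$$\psi(y) = (x_i, v_i)_{i \in \mathbb{Z}}.$$

Both injectivity and surjectivity are readily verified. The inverse is $\psi^{-1}((x_i, v_i)_{i \in \mathbb{Z}}) = (v_i)_{i \in \mathbb{Z}}$. Continuity of $\psi$ follows directly from Lemma 5.1. In the above notation, the first return time $r_A$ of $y$ satisfies $r_A(y) = |v_0|$ and therefore, $S_A y = (v_{i+1})_{i \in \mathbb{Z}}$. This implies

$$\psi(S_A y) = (x_{i+1}, v_{i+1})_{i \in \mathbb{Z}} = S\psi(y),$$

which is the required conjugacy relation.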

Thanks to Lemma 5.3, we often identify $(A, S_A)$ with $(Z_\vartheta, S)$, with slight abuse of notation. Due to the established topological conjugacy, this has no effect on results concerning entropy and ergodic measures, other than a renaming of objects. Note that the same construction works for $A_n = \vartheta^n(X_\vartheta)$ instead of $A$.

The shuffle group has a convenient representation in terms of the family of subshifts $(A_n, S_{A_n})$, with $n \in \mathbb{N}$. Recall that if $f \in \Gamma_n$, then for each $a \in \mathcal{A}$, there is a permutation $\alpha_a$ of $\vartheta^n(a)$ that replaces each word $v \in \vartheta^n(a)$ in the inflation word decomposition of $x \in X_\vartheta$ with $\alpha_a(v)$. Thus, $f$ acts on $y = (x_i, v_i)_{i \in \mathbb{Z}}$ as follows:

$$f(y) = (x_i, w_i)_{i \in \mathbb{Z}}, \quad w_i = \alpha_{x_i}(v_i).$$

This is consistent with the earlier definition, in the sense that for such $y$,

$$f((y, t)) = (f(y), t),$$

for all $t \in \mathbb{R}$. It follows that $\tilde{\mu}$ is invariant under $f$ precisely if the same holds for $\mu_{A_n}$.

Corollary 5.4

$\mu$ is a uniformity measure if and only if $\mu_{A_n}$ is invariant under $\Gamma_n$ for all $n \in \mathbb{N}$.

Measure Transformation

The assumption of recognisability is consistent with the idea of a desubstitution. We implement this on the level of measures in the following way. Let $\mathcal{M}$ denote the space of $S$-invariant Borel probability measures on $X_\vartheta$. For $\nu \in \mathcal{M}$, let $\nu_A$ be the corresponding induced measure on $A = \vartheta(X_\vartheta)$. Using the identification in Lemma 5.3, the measure $\nu_A$ is determined by its value on cylinder sets of the form $[(a_1, v_1) \cdots (a_n, v_n)]$. We define a measure $\mu = D(\nu_A)$ on $X_\vartheta$ by collapsing inflation words to letters via

$$\mu([a_1 \cdots a_n]) = \sum_{v_1, \ldots, v_n} \nu_A([(a_1, v_1) \cdots (a_n, v_n)]),$$

corresponding to a projection to the first coordinate. By construction, this gives an $S$-invariant probability measure that depends continuously on $\nu$. Hence, the map $\Pi \colon \mathcal{M} \to \mathcal{M}$, $\nu \mapsto D(\nu_A)$ is a (weakly) continuous operator. The map $\Pi$ turns out to be surjective, but is generally not injective, since the map $D$ can fail to be injective. We will use the random substitution action to construct appropriate inverse branches.

The action of $\vartheta_{\mathbf{P}}$ is given by a Markov kernel, replacing each letter independently. Under this transition, a given shift-invariant measure $\mu$ on $X_\vartheta$ is replaced by an $S_A$-invariant measure $\nu = \vartheta_{\mathbf{P}}(\mu)$ on $A = \vartheta(X_\vartheta)$. Using the identification in Lemma 5.3, it can be explicitly expressed as

$$(\vartheta_{\mathbf{P}}(\mu))([(a_1, v_1) \cdots (a_n, v_n)]) = \mu([a_1 \cdots a_n]) \, \mathbb{P}[\vartheta_{\mathbf{P}}(a_1 \cdots a_n) = v_1 \cdots v_n], \tag{6.1}$$

for all legal blocks $a_1 \cdots a_n$ and $v_i \in \vartheta(a_i)$. Indeed, this represents an $S_A$-invariant probability measure on $A$. Note that by construction, we have that $D(\vartheta_{\mathbf{P}}(\mu)) = \mu$, that is, $D$ is a left-inverse of $\vartheta_{\mathbf{P}}$.

Lemma 6.1

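The measure $\vartheta_{\mathbf{P}}(\mu)$ is $S_A$-ergodic if and only if $\mu$ is $S$-ergodic.

Proof

Recall that the ergodicity of $\mu$ is equivalent to

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \mu([x] \cap S^{-k}[y]) = \mu([x]) \mu([y]),$$

for all $x, y \in \mathcal{A}^+$. Note that

$$\mu([x] \cap S^{-k}[y]) = \sum_{z \in \mathcal{A}^{k - |x|}} \mu([xzy]),$$

as soon as $k \geq |x|$. Suppose $u \in \vartheta(x)$ and $v \in \vartheta(y)$. Given a word $z \in \mathcal{A}^s$ and $w \in \vartheta(z)$, let $z_w = (z_1, w_1) \cdots (z_s, w_s)$, where $w = w_1 \cdots w_s$ is the unique decomposition with $w_i \in \vartheta(z_i)$ for all $1 \leq i \leq s$. With this notation and $r = |x|$, we obtain for $k \geq r$ that

$$\vartheta_{\mathbf{P}}(\mu)([x_u] \cap S_A^{-k}[y_v]) = \sum_{z \in \mathcal{A}^{k-r},\, w \in \vartheta(z)} \vartheta_{\mathbf{P}}(\mu)([x_u z_w y_v]) = \sum_{z \in \mathcal{A}^{k-r},\, w \in \vartheta(z)} \mu([xzy]) \, \mathbb{P}[\vartheta_{\mathbf{P}}(xzy) = uwv] = \mathbb{P}[\vartheta_{\mathbf{P}}(x) = u] \, \mathbb{P}[\vartheta_{\mathbf{P}}(y) = v] \, \mu([x] \cap S^{-k}[y]).$$

Hence, if $\mu$ is ergodic, we obtain for all $x_u$ and $y_v$ that

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \vartheta_{\mathbf{P}}(\mu)([x_u] \cap S_A^{-k}[y_v]) = \mathbb{P}[\vartheta_{\mathbf{P}}(x) = u] \, \mathbb{P}[\vartheta_{\mathbf{P}}(y) = v] \, \mu([x]) \mu([y]) = \vartheta_{\mathbf{P}}(\mu)([x_u]) \, \vartheta_{\mathbf{P}}(\mu)([y_v]),$$

implying ergodicity of $\vartheta_{\mathbf{P}}(\mu)$. Conversely, if $\vartheta_{\mathbf{P}}(\mu)$ is ergodic, taking the sum over all $u \in \vartheta(x)$ and $v \in \vartheta(y)$ in the relation

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \vartheta_{\mathbf{P}}(\mu)([x_u] \cap S_A^{-k}[y_v]) = \vartheta_{\mathbf{P}}(\mu)([x_u]) \, \vartheta_{\mathbf{P}}(\mu)([y_v])$$

yields the ergodicity of $\mu$.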

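We note that the integral of $r_A$ with respect to $\vartheta_{\mathbf{P}}(\mu)$ is given by

$$\lambda_{\mu,\mathbf{P}} = \int_A r_A \, d\vartheta_{\mathbf{P}}(\mu) = \sum_{(a,v) \in \mathcal{B}} \vartheta_{\mathbf{P}}(\mu)([(a,v)]) \, |v| = \sum_{a \in \mathcal{A}} \mu([a]) \, \mathbb{E}[|\vartheta_{\mathbf{P}}(a)|].$$

We call $\lambda_{\mu,\mathbf{P}}$ the normalisation constant with respect to the defining data $\mu, \mathbf{P}$. Using this normalisation factor, $\vartheta_{\mathbf{P}}(\mu)$ can be drawn back to an $S$-invariant measure $T_{\mathbf{P}}(\mu)$ in a canonical way. This is achieved by taking the average over all shifts prior to the first return to the set $A$, and normalising with the expected first return time, given by $\lambda_{\mu,\mathbf{P}}$.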

Definition 6.2

Let $\mu \in \mathcal{M}$ and let $\mathbf{P}$ be a choice of probabilities for the random substitution $\vartheta$. The $\mathbf{P}$-transfer of $\mu$ is the measure defined by

$$T_{\mathbf{P}}(\mu)(f) = \lambda_{\mu,\mathbf{P}}^{-1} \, \vartheta_{\mathbf{P}}(\mu)\Big(\sum_{i=0}^{r_A - 1} f \circ S^i\Big),$$

for all continuous functions $f$ on $X_\vartheta$.

Indeed, by Kac’s formula, this is the only possible candidate for a shift-invariant measure with induced measure $\vartheta_{\mathbf{P}}(\mu)$ on $(A, S_A)$. That $T_{\mathbf{P}}(\mu)$ indeed defines a Borel measure follows easily from the Riesz–Markov–Kakutani representation theorem. Normalisation is checked by choosing $f \equiv 1$. Finally, $S$-invariance of $T_{\mathbf{P}}(\mu)$ follows in a straightforward manner from the $S_A$-invariance of $\vartheta_{\mathbf{P}}(\mu)$.

Lemma 6.3

The measure TP(μ) is ergodic if and only if μ is ergodic.

Proof

The ergodicity of TP(μ) is equivalent to the ergodicity of its induced measure ϑP(μ). By Lemma 6.1 this is in turn equivalent to μ being ergodic.

Lemma 6.4

For every choice of $\mathbf{P}$, the operator $T_{\mathbf{P}} \colon \mathcal{M} \to \mathcal{M}$ is continuous with respect to the topology of weak convergence.

Proof

The fact that the action of $\vartheta_{\mathbf{P}}$ is continuous follows in a straightforward manner from (6.1). Due to recognisability, the first return time $r_A$ is continuous. Hence, for every continuous function $f$, the function $f_A = \sum_{i=0}^{r_A - 1} f \circ S^i$ is continuous on $A$. Therefore, $\vartheta_{\mathbf{P}}(\mu)(f_A)$ depends continuously on $\mu$ and the same holds for its normalisation $\lambda_{\mu,\mathbf{P}}$, which is bounded away from $0$.

Definition 6.5

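For $\nu \in \mathcal{M}$, let $\mathcal{P}(\nu)$ be the set of probability choices $\mathbf{P}$ satisfying

$$\nu_A([(a,v)]) = \mathbf{P}_{v,a} \sum_{u \in \vartheta(a)} \nu_A([(a,u)]),$$

for all $v \in \vartheta(a)$. Given probability data $\mathbf{P}$, we let

$$\mathcal{M}[\mathbf{P}] = \{\nu \in \mathcal{M} : \mathbf{P} \in \mathcal{P}(\nu)\}.$$

Remark 6.6

Note that for all $\mathbf{P} \in \mathcal{P}(\nu)$ and $a \in \mathcal{A}$, the vector $\mathbf{P}_a = (\mathbf{P}_{v,a})_{v \in \vartheta(a)}$ is uniquely determined, as soon as $\Pi(\nu)([a]) = \sum_{u \in \vartheta(a)} \nu_A([(a,u)]) > 0$, and it is completely arbitrary otherwise. In particular, $\mathcal{P}(\nu)$ is a singleton precisely if $\Pi(\nu)$ is non-vanishing on cylinders of length $1$.

Recall that $R^\mu$ is the letter frequency vector defined by $R^\mu_a = \mu([a])$, for all $a \in \mathcal{A}$. For notational convenience, we write $\mu \sim \nu$ for $\mu, \nu \in \mathcal{M}$ if $R^\mu = R^\nu$. The induced measure of $T_{\mathbf{P}}(\mu)$ is given by $\vartheta_{\mathbf{P}}(\mu)$, which in turn is mapped back to $\mu$ under $D$. Hence, we observe that

$$\Pi \circ T_{\mathbf{P}}(\mu) = \mu,$$

for all invariant probability measures $\mu$. From this, it follows that $T_{\mathbf{P}}(\mu) \in \mathcal{M}[\mathbf{P}] \cap \Pi^{-1}(\mu)$. All measures in this set have the same letter frequencies, as we show below. Given probability data $\mathbf{P}$, recall that $M(\mathbf{P})$ is the substitution matrix of $\vartheta_{\mathbf{P}}$.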

Lemma 6.7

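Given $\mu \in \mathcal{M}$ and some probability data $\mathbf{P}$, all measures $\nu \in \mathcal{M}[\mathbf{P}] \cap \Pi^{-1}(\mu)$ share the same letter frequency vector $R^\nu$, given by

$$R^\nu = \lambda_{\mu,\mathbf{P}}^{-1} M(\mathbf{P}) R^\mu.$$

In particular, this applies to $\nu = T_{\mathbf{P}}(\mu)$. Also, $\mu \sim \mu'$ implies that $T_{\mathbf{P}}(\mu) \sim T_{\mathbf{P}}(\mu')$ for all $\mathbf{P}$.

Proof

First, note that $\mu = \Pi(\nu) = D(\nu_A)$ satisfies

$$\mu([b]) = \sum_{v \in \vartheta(b)} \nu_A([(b,v)]),$$

and hence $\nu_A([(b,v)]) = \mathbf{P}_{v,b} \, \mu([b])$, by the defining relation for $\mathbf{P} \in \mathcal{P}(\nu)$. Hence, for $A = \vartheta(X_\vartheta)$,

$$\nu(A)^{-1} = \int_A r_A \, d\nu_A = \sum_{(a,v) \in \mathcal{B}} \nu_A([(a,v)]) \, |v| = \sum_{a \in \mathcal{A}} \mu([a]) \sum_{v \in \vartheta(a)} \mathbf{P}_{v,a} |v| = \lambda_{\mu,\mathbf{P}}.$$

Combining these observations with Kac’s formula yields

$$\nu([a]) = \frac{1}{\lambda_{\mu,\mathbf{P}}} \sum_{(b,v) \in \mathcal{B}} \nu_A([(b,v)]) \, \phi_a(v) = \frac{1}{\lambda_{\mu,\mathbf{P}}} \sum_{b \in \mathcal{A}} \mu([b]) \sum_{v \in \vartheta(b)} \mathbf{P}_{v,b} \, \phi_a(v),$$

for all $a \in \mathcal{A}$. Since the entries of $M(\mathbf{P})$ are given by

$$M(\mathbf{P})_{ab} = \mathbb{E}_{\mathbf{P}}\big[\phi_a(\vartheta(b))\big] = \sum_{v \in \vartheta(b)} \mathbf{P}_{v,b} \, \phi_a(v),$$

this shows the stated formula for $R^\nu$. For the final claim, we apply this to $\nu = T_{\mathbf{P}}(\mu)$ and observe that the letter frequencies of this measure depend only on $\mathbf{P}$ and $R^\mu$.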

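The updating rule for the frequency vector under $T_{\mathbf{P}}$ takes an even easier form in the geometric setting. Recall that the interval proportion vector $\pi^\mu$ is given by $\pi^\mu_a = L_a R^\mu_a / (L R^\mu)$. We emphasise that the relation between $R^\mu$ and $\pi^\mu$ is one-to-one, since $R^\mu_a = L_a^{-1} \pi^\mu_a / \big(\sum_{b \in \mathcal{A}} L_b^{-1} \pi^\mu_b\big)$.

Corollary 6.8

The interval proportion vectors of $\nu = T_{\mathbf{P}}(\mu)$ and $\mu$ are related by

$$\pi^\nu = Q(\mathbf{P}) \pi^\mu,$$

where $Q(\mathbf{P})$ is the geometric substitution matrix of $\vartheta_{\mathbf{P}}$.

Proof

First note that, due to Lemma 6.7,

$$\lambda_{\mu,\mathbf{P}} \, L R^\nu = L M(\mathbf{P}) R^\mu = \lambda L R^\mu. \tag{6.2}$$

Hence,

$$\pi^\nu_a = \frac{L_a R^\nu_a}{L R^\nu} = \frac{L_a}{\lambda_{\mu,\mathbf{P}} \, L R^\nu} \sum_{b \in \mathcal{A}} M(\mathbf{P})_{ab} R^\mu_b = \sum_{b \in \mathcal{A}} \frac{L_a M(\mathbf{P})_{ab}}{\lambda L_b} \cdot \frac{L_b R^\mu_b}{L R^\mu} = \sum_{b \in \mathcal{A}} Q(\mathbf{P})_{ab} \, \pi^\mu_b,$$

as claimed.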
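The updating rules of Lemma 6.7 and Corollary 6.8 are purely algebraic and can be sanity-checked with exact arithmetic. The sketch below does this for the random substitution from Fig. 2; the probability `p` and the vector `R_mu` are arbitrary assumptions chosen for illustration (in particular, `R_mu` is not claimed to be the letter frequency vector of a specific invariant measure).

```python
from fractions import Fraction as F

L = [F(2), F(1)]                         # length vector L = (2, 1), letters ordered (a, b)
lam = F(2)                               # inflation factor
p = F(1, 3)                              # assumed probability of b -> a
M = [[F(1), p], [F(2), 2 * (1 - p)]]     # M[a][b] = E[|theta_P(b)|_a]
Q = [[L[a] * M[a][b] / (lam * L[b]) for b in range(2)] for a in range(2)]

def proportions(R):
    """Interval proportion vector pi_a = L_a R_a / (L R)."""
    total = sum(L[a] * R[a] for a in range(2))
    return [L[a] * R[a] / total for a in range(2)]

R_mu = [F(2, 5), F(3, 5)]                # arbitrary probability vector
# Normalisation constant lambda_{mu,P} = sum_a mu([a]) E|theta_P(a)| = 1^T M R^mu
lam_mu_P = sum(M[a][b] * R_mu[b] for a in range(2) for b in range(2))
# Lemma 6.7: R^nu = lambda_{mu,P}^{-1} M(P) R^mu
R_nu = [sum(M[a][b] * R_mu[b] for b in range(2)) / lam_mu_P for a in range(2)]

pi_mu, pi_nu = proportions(R_mu), proportions(R_nu)
Q_pi_mu = [sum(Q[a][b] * pi_mu[b] for b in range(2)) for a in range(2)]
```

The vectors `pi_nu` and `Q_pi_mu` agree exactly, which reflects the fact that $L$ is a left eigenvector of $M(\mathbf{P})$ with eigenvalue $\lambda$ for every choice of `p`.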

Recall that Π is a left-inverse of TP, irrespective of P. The special role of TP as an inverse branch of Π is that it maximises the entropy of all measures with a given inflation word distribution.

Proposition 6.9

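For each $\mu \in \mathcal{M}$ and probability data $\mathbf{P}$, the measure $T_{\mathbf{P}}(\mu)$ is the unique measure of maximal geometric entropy in $\Pi^{-1}(\mu) \cap \mathcal{M}[\mathbf{P}]$, and satisfies

$$h^g_{T_{\mathbf{P}}(\mu)} = \frac{1}{\lambda} \Big(h^g_\mu + \frac{H_{\mathbf{P}} R^\mu}{L R^\mu}\Big).$$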

Proof

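For $\nu \in \Pi^{-1}(\mu) \cap \mathcal{M}[\mathbf{P}]$, let $\nu_A$ be the induced measure on $A = \vartheta(X_\vartheta)$. By Abramov’s formula, we have $h_\nu = \nu(A) h_{\nu_A} = \lambda_{\mu,\mathbf{P}}^{-1} h_{\nu_A}$ and, using Lemma 6.7 as in (6.2),

$$L R^\nu = \frac{\lambda}{\lambda_{\mu,\mathbf{P}}} L R^\mu,$$

implying

$$h^g_\nu = \frac{h_{\nu_A}}{\lambda L R^\mu}.$$

Since the normalisation is fixed, $\nu$ achieves maximal geometric entropy precisely if $h_{\nu_A}$ is maximal. The measure $\nu_A$ naturally induces a distribution $\nu_A^n$ on $\mathcal{B}^n$ by setting $\nu_A^n(u) = \nu_A([u])$ for each $u \in \mathcal{B}^n$. With this notation, and the identity map $\mathrm{Id}_n \colon u \mapsto u$ on $\mathcal{B}^n$, we have

$$h_{\nu_A} = \lim_{n \to \infty} \frac{1}{n} H_{\nu_A^n}(\mathrm{Id}_n) = \inf_{n \in \mathbb{N}} \frac{1}{n} H_{\nu_A^n}(\mathrm{Id}_n).$$

We consider several random variables on $\mathcal{B}^n$, which are given for $u = (x_i, v_i)_{i=1}^n$ via $X_i(u) = x_i$, $V_i(u) = v_i$ and $X_{[j,k]}(u) = x_j \cdots x_k$ for $1 \leq j \leq k \leq n$. If not specified otherwise, we fix the distribution $\rho = \nu_A^n$ on $\mathcal{B}^n$ and compute entropies with respect to this measure. Note that $u$ is completely determined by the values of $X_{[1,n]}(u)$ and $(V_i(u))_{i=1}^n$, so we obtain

$$H_{\nu_A^n}(\mathrm{Id}_n) = H(V_1, \ldots, V_n, X_{[1,n]}) = H(X_{[1,n]}) + H(V_1, \ldots, V_n | X_{[1,n]}).$$

The distribution of $X_{[1,n]}$ is given by

$$\rho(\{X_{[1,n]} = x_1 \cdots x_n\}) = \sum_{v_1, \ldots, v_n} \nu_A([(x_1, v_1) \cdots (x_n, v_n)]) = \mu([x_1 \cdots x_n]), \tag{6.3}$$

using that $\nu \in \Pi^{-1}(\mu)$ implies $D(\nu_A) = \mu$ in the last step. Hence, we obtain that

$$\lim_{n \to \infty} \frac{1}{n} H(X_{[1,n]}) = \inf_{n} \frac{1}{n} H(X_{[1,n]}) = h_\mu,$$

which is uniform for all $\nu \in \Pi^{-1}(\mu)$. We thus focus on the term $H(V_1, \ldots, V_n | X_{[1,n]})$. By standard properties of conditional entropy, we have

$$H(V_1, \ldots, V_n | X_{[1,n]}) \leq \sum_{i=1}^n H(V_i | X_{[1,n]}) \tag{6.4}$$

and for each $1 \leq i \leq n$,

$$H(V_i | X_{[1,n]}) \leq H(V_i | X_i). \tag{6.5}$$

The shift-invariance of $\nu_A$ implies that $H(V_i | X_i) = H(V_1 | X_1)$ for all $1 \leq i \leq n$, given by

$$H(V_1 | X_1) = \sum_{a \in \mathcal{A}} \rho(\{X_1 = a\}) \, H_{\rho_{\{X_1 = a\}}}(V_1).$$

Similarly to before, we have that $\rho(\{X_1 = a\}) = \mu([a])$ and, provided that $\mu([a]) > 0$,

$$\rho_{\{X_1 = a\}}(\{V_1 = v\}) = \frac{\nu_A([(a,v)])}{\sum_{u \in \vartheta(a)} \nu_A([(a,u)])} = \mathbf{P}_{v,a},$$

where the last equality follows from $\nu \in \mathcal{M}[\mathbf{P}]$. Hence, $H_{\rho_{\{X_1 = a\}}}(V_1) = H(\vartheta_{\mathbf{P}}(a))$, for such $a \in \mathcal{A}$. If $\mu([a]) = 0$, the set $\{X_1 = a\}$ has vanishing measure and does not contribute to $H(V_1 | X_1)$. Hence,

$$H(V_1 | X_1) = \sum_{a \in \mathcal{A}} \mu([a]) \, H_{\mathbb{P}}(\vartheta_{\mathbf{P}}(a)) = H_{\mathbf{P}} R^\mu.$$

In summary, we obtain that $H(V_1, \ldots, V_n | X_{[1,n]}) \leq n H_{\mathbf{P}} R^\mu$ and thereby

$$h_{\nu_A} = \lim_{n \to \infty} \frac{1}{n} \big(H(X_{[1,n]}) + H(V_1, \ldots, V_n | X_{[1,n]})\big) \leq h_\mu + H_{\mathbf{P}} R^\mu. \tag{6.6}$$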

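We claim that this inequality is an equality precisely if $H(V_1, \ldots, V_n | X_{[1,n]}) = n H_{\mathbf{P}} R^\mu$ for all $n \in \mathbb{N}$. That this condition is sufficient for equality in the entropy expression is apparent. Conversely, assume that for some $n \in \mathbb{N}$, we have $H(V_1, \ldots, V_n | X_{[1,n]}) < n H_{\mathbf{P}} R^\mu$. Then, there is $\varepsilon > 0$ such that $H(V_1, \ldots, V_n | X_{[1,n]}) = n H_{\mathbf{P}} R^\mu - \varepsilon$. For each $m \in \mathbb{N}$, we obtain that

$$H(V_1, \ldots, V_{mn} | X_{[1,mn]}) \leq \sum_{i=0}^{m-1} H(V_{in+1}, \ldots, V_{(i+1)n} | X_{[1,mn]}) \leq \sum_{i=0}^{m-1} H(V_{in+1}, \ldots, V_{(i+1)n} | X_{[in+1,(i+1)n]}) = m H(V_1, \ldots, V_n | X_{[1,n]}) = mn H_{\mathbf{P}} R^\mu - m\varepsilon,$$

using the invariance of $\nu_A$ in the penultimate step. This implies

$$h_{\nu_A} = \lim_{m \to \infty} \frac{1}{mn} \big(H(X_{[1,mn]}) + H(V_1, \ldots, V_{mn} | X_{[1,mn]})\big) \leq h_\mu + H_{\mathbf{P}} R^\mu - \frac{\varepsilon}{n},$$

and the inequality in (6.6) is indeed strict. Note that $H(V_1, \ldots, V_n | X_{[1,n]}) = n H_{\mathbf{P}} R^\mu$ if and only if we have equality in both (6.4) and (6.5). Equality in (6.4) holds if and only if for every set of the form $S_n = \{X_{[1,n]} = x_1 \cdots x_n\}$ with positive $\rho$ measure, the random variables $V_1, \ldots, V_n$ are independent with respect to the induced measure $\rho_{S_n}$. Phrased differently,

$$\rho_{\{X_{[1,n]} = x_1 \cdots x_n\}}(\{V_1 = v_1, \ldots, V_n = v_n\}) = \prod_{i=1}^n \rho_{\{X_{[1,n]} = x_1 \cdots x_n\}}(\{V_i = v_i\}). \tag{6.7}$$

On the other hand, equality in (6.5) means that, given $X_1$, the realisation of $V_1$ is independent of $X_2, \ldots, X_n$. That is, for every realisation $x_1 \cdots x_n$ of positive measure, we have that

$$\rho_{\{X_{[1,n]} = x_1 \cdots x_n\}}(\{V_1 = v_1\}) = \rho_{\{X_1 = x_1\}}(\{V_1 = v_1\}) = \mathbf{P}_{v_1, x_1}. \tag{6.8}$$

Recalling the normalisation in (6.3), equality in (6.6) hence requires that

$$\prod_{i=1}^n \mathbf{P}_{v_i, x_i} = \frac{\rho((x_1, v_1) \cdots (x_n, v_n))}{\mu([x_1 \cdots x_n])}. \tag{6.9}$$

This is equivalent to

$$\nu_A([(x_1, v_1) \cdots (x_n, v_n)]) = \mu([x_1 \cdots x_n]) \, \mathbb{P}[\vartheta_{\mathbf{P}}(x_1 \cdots x_n) = v_1 \cdots v_n],$$

which remains true if $\mu([x_1 \cdots x_n]) = 0$ because $\mu = D(\nu_A)$. We therefore find equivalence to $\nu_A = \vartheta_{\mathbf{P}}(\mu)$, which is in turn equivalent to $\nu = T_{\mathbf{P}}(\mu)$. Conversely, it is straightforward to verify that the distribution $\rho$ fixed by (6.9) indeed satisfies both (6.7) and (6.8). From this, we conclude that equality in (6.6) holds if and only if $\nu = T_{\mathbf{P}}(\mu)$.

In this case, we obtain for the geometric entropy of $\nu = T_{\mathbf{P}}(\mu)$ the explicit expression

$$h^g_\nu = \frac{h_{\nu_A}}{\lambda L R^\mu} = \frac{1}{\lambda} \Big(h^g_\mu + \frac{H_{\mathbf{P}} R^\mu}{L R^\mu}\Big),$$

which is precisely the claimed relation.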

Corollary 6.10

For each μM, the unique measure of maximal geometric entropy in Π-1(μ) is given by TP(μ), where (Pv,a)vϑ(a) is the uniform distribution for each a, and satisfies

hTP(μ)g=1λhμg+1LRμaAμ([a])log#ϑ(a).

Proof

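Note that $H(\vartheta_P(a)) = \log \#\vartheta(a)$ if and only if $P_a = (P_{v,a})_{v \in \vartheta(a)}$ is the uniform distribution on $\vartheta(a)$. Hence,

$$H_P R^{\mu} = \sum_{a \in \mathcal{A}} \mu([a])\, H(\vartheta_P(a)) = \sum_{a \in \mathcal{A}} \mu([a]) \log \#\vartheta(a),$$

if and only if $P_a$ is uniform for all $a$ with $\mu([a]) \neq 0$. We observe that for $\mu([a]) = 0$ the choice of $P_a$ is immaterial for $T_P(\mu)$ and can hence be chosen uniform without altering the measure. The claim therefore follows from Lemma 6.9 by decomposing $\Pi^{-1}(\mu) = \bigcup_{P} \bigl(\Pi^{-1}(\mu) \cap \mathcal{M}[P]\bigr)$.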
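The entropy relation of Corollary 6.10 is easy to evaluate once the letter frequencies are known. The following is a minimal sketch, assuming the formula reads $h^g_{T_P(\mu)} = \lambda^{-1}\bigl(h^g_\mu + (L R^\mu)^{-1}\sum_a \mu([a])\log\#\vartheta(a)\bigr)$; the function name, the dictionary layout and the numerical inputs are illustrative choices and are not taken from the text.

```python
import math

def geometric_entropy_of_image(h_mu_g, lam, L, R, image_counts):
    """Evaluate (1/lam) * (h_mu_g + sum_a R[a] * log(#theta(a)) / (L . R)).

    h_mu_g       -- geometric entropy of mu
    lam          -- inflation factor
    L            -- dict: letter -> tile length L_a
    R            -- dict: letter -> letter frequency mu([a])
    image_counts -- dict: letter -> number of words in theta(a)
    """
    mean_length = sum(L[a] * R[a] for a in R)                      # L R^mu
    production = sum(R[a] * math.log(image_counts[a]) for a in R)  # uniform choice
    return (h_mu_g + production / mean_length) / lam

# Hypothetical input: constant length 3, unit tile lengths, equal letter
# frequencies, and two admissible images for a and b but only one for c.
R = {"a": 1 / 3, "b": 1 / 3, "c": 1 / 3}
L = {"a": 1.0, "b": 1.0, "c": 1.0}
counts = {"a": 2, "b": 2, "c": 1}
value = geometric_entropy_of_image(0.0, 3.0, L, R, counts)
```

With zero entropy on the input side, the value is produced entirely by the uniform choice among the images of $a$ and $b$.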

Lemma 6.11

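We have $T_{PP'} = T_P \circ T_{P'}$.

Proof

Given $\mu \in \mathcal{M}$, we need to prove that $T_{PP'}(\mu) = T_P(\nu)$, where $\nu = T_{P'}(\mu)$. Since both measures are $S$-invariant, it suffices to show that they coincide on $A_2 = \vartheta^2(X_\vartheta) \subset \vartheta(X_\vartheta)$. Every cylinder on $A_2$ is of the form

$$C_2(x,w) = [(x_1,w_1^{(2)}) \cdots (x_n,w_n^{(2)})],$$

where $x = x_1\cdots x_n \in \mathcal{L}^n$ and $w = w_1^{(2)} \cdots w_n^{(2)}$ is the unique decomposition of $w$ such that $w_i^{(2)} \in \vartheta^2(x_i)$ for all $1 \le i \le n$. By recognisability, there is a unique word $v$ with $v \in \vartheta(x)$ and $w \in \vartheta(v)$. Assuming $v = v_1\cdots v_n$, with $v_i \in \vartheta(x_i)$ for all $1 \le i \le n$, there is hence a unique preimage of $C_2(x,w)$ under $\vartheta$ in $A = \vartheta(X_\vartheta)$, given by

$$C_1(x,v) = [(x_1,v_1)\cdots(x_n,v_n)].$$

Since $C_2(x,w) \subset A_2 \subset A$, we obtain

$$T_P(\nu)(C_2(x,w)) = \frac{1}{\lambda_{\nu,P}}\,\vartheta_P(\nu)(C_2(x,w)) = \frac{1}{\lambda_{\nu,P}}\,\nu(C_1(x,v))\,\mathbb{P}[\vartheta_P(v) = w] = \frac{1}{\lambda_{\nu,P}\lambda_{\mu,P'}}\,\mu([x])\,\mathbb{P}[\vartheta_{P'}(x) = v]\,\mathbb{P}[\vartheta_P(v) = w] = \frac{1}{\lambda_{\nu,P}\lambda_{\mu,P'}}\,\mu([x])\,\mathbb{P}[\vartheta^2_{PP'}(x) = w] = \frac{\lambda_{\mu,PP'}}{\lambda_{\nu,P}\lambda_{\mu,P'}}\,T_{PP'}(\mu)(C_2(x,w)).$$

Hence, it remains to show that $\lambda_{\nu,P}\lambda_{\mu,P'} = \lambda_{\mu,PP'}$. Indeed, recalling that $\nu = T_{P'}(\mu)$ and writing $\mathbf{1}$ for the vector with constant entries 1, we obtain by Lemma 6.7 that

$$\lambda_{\nu,P}\lambda_{\mu,P'} = \lambda_{\mu,P'}\,\mathbf{1}^{T} M(P) R^{T_{P'}(\mu)} = \mathbf{1}^{T} M(P) M(P') R^{\mu} = \mathbf{1}^{T} M(PP') R^{\mu} = \lambda_{\mu,PP'}.$$

This shows equality of $T_{PP'}(\mu)$ and $T_P \circ T_{P'}(\mu)$ on $A_2$, implying equality on the whole space due to shift invariance.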

Lemma 6.12

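The frequency measure $\mu_P$ is the unique fixed point of $T_P$. That is, $\mu_P$ is the unique invariant probability measure $\mu$ with $T_P(\mu) = \mu$.

Proof

Let $\nu$ be an arbitrary invariant probability measure. We will show that $T = T_P$ satisfies $\lim_{n\to\infty} T^n(\nu) = \mu_P$. Using the continuity of $T$, this implies that $T(\mu_P) = \mu_P$. Conversely, if $\nu = T(\nu)$ this directly implies that $\nu = \mu_P$. By Lemma 6.11, we know that $T^n = T_{P^n}$. Set $A_n = \vartheta^n(X_\vartheta)$, and given $u \in \mathcal{A}^+$ let $\chi_{[u]}$ be the indicator function of $[u]$. Recall that

$$T^n(\nu)([u]) = \frac{1}{\lambda_{P^n,\nu}}\,\vartheta^n_{P^n}(\nu)(f_u), \qquad f_u = \sum_{i=0}^{r_{A_n}-1} \chi_{[u]} \circ S^i.$$

Since the ratio between the geometric and symbolic length of a word is bounded, we directly obtain that $\lim_{n\to\infty} \lambda_{P^n,\nu} = \infty$. Further, note that on $[(a,v)]$ with $v \in \vartheta^n(a)$, we can estimate $f_u$ via $|v|_u \le f_u \le |v|_u + |u|$, and therefore we obtain

$$\vartheta^n_{P^n}(\nu)(f_u) = \sum_{(a,v) \in B_n} \vartheta^n_{P^n}(\nu)([a,v])\,|v|_u + O(|u|) = \sum_{a \in \mathcal{A}} \nu([a])\,\mathbb{E}\bigl[|\vartheta^n_{P^n}(a)|_u\bigr] + O(|u|) \sim \sum_{a \in \mathcal{A}} \nu([a])\,\mu_P([u])\,\mathbb{E}\bigl[|\vartheta^n_{P^n}(a)|\bigr] = \mu_P([u])\,\lambda_{P^n,\nu},$$

yielding the desired convergence $\lim_{n\to\infty} T^n(\nu)([u]) = \mu_P([u])$.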

Note that for the class of random substitutions considered here, the known formula for the entropy of the frequency measure is a direct consequence of Lemma 6.12 and Lemma 6.9.

Inverse Limit Measures

Construction and entropy maximisation

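We start from a sequence of probability choices $\mathcal{P} = (P_n)_{n \in \mathbb{N}}$ for the random substitution $\vartheta$. Our aim is to construct a measure that represents the word frequencies in $\vartheta_{P_1} \circ \cdots \circ \vartheta_{P_n}$ as $n \to \infty$, provided they are well defined. This is similar to the construction of invariant measures for S-adic systems. We say that a sequence of $S$-invariant probability measures $(\mu_n)_{n \in \mathbb{N}}$ on $X_\vartheta$ is $(\vartheta,\mathcal{P})$-adapted if $\mu_n = T_{P_n}(\mu_{n+1})$ for all $n \in \mathbb{N}$. Likewise, we call $\mu$ a $(\vartheta,\mathcal{P})$ inverse limit measure if $\mu = \mu_1$ for some $(\vartheta,\mathcal{P})$-adapted sequence $(\mu_n)_{n \in \mathbb{N}}$. In this case, we have $\mu_{n+1} = \Pi^n(\mu)$ for all $n \in \mathbb{N}$.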

Remark 7.1

If $P_n = P$ for all $n \in \mathbb{N}$, so that $\mathcal{P}$ is a constant sequence, it follows from the proof of Lemma 6.12 that $\mu_P$ is the unique $(\vartheta,\mathcal{P})$ inverse limit measure. By considering higher powers of $\vartheta$, we also obtain a unique inverse limit measure if $\mathcal{P}$ is periodic and, by extension, if $\mathcal{P}$ is eventually periodic.

In general, the uniqueness of inverse limit measures is a subtle issue, but existence follows routinely via compactness in our setting.

Lemma 7.2

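For each sequence of probability choices $\mathcal{P}$ for $\vartheta$, there exists a $(\vartheta,\mathcal{P})$ inverse limit measure.

Proof

Let $\nu \in \mathcal{M}$ be arbitrary. For all $n \in \mathbb{N}$ and $1 \le i \le n$, let

$$\mu_i^{(n)} = T_{P_i} \circ \cdots \circ T_{P_n}(\nu).$$

By construction, the finite sequence $(\mu_i^{(n)})_{i=1}^{n}$ satisfies

$$\mu_i^{(n)} = T_{P_i}(\mu_{i+1}^{(n)}) \tag{7.1}$$

for all $1 \le i \le n-1$. By compactness of $\mathcal{M}$, for each $i \in \mathbb{N}$, the sequence $(\mu_i^{(n)})_{n \ge i}$ has an accumulation point. Using a diagonal argument, we can choose an increasing subsequence $(n_j)_{j \in \mathbb{N}}$ of natural numbers such that for all $i \in \mathbb{N}$, we have $\lim_{j\to\infty} \mu_i^{(n_j)} = \mu_i$ for some $\mu_i \in \mathcal{M}$. The relation $\mu_i = T_{P_i}(\mu_{i+1})$ follows from (7.1) and the continuity of $T_{P_i}$. This shows that $(\mu_i)_{i \in \mathbb{N}}$ is $(\vartheta,\mathcal{P})$-adapted and hence $\mu_1$ is a $(\vartheta,\mathcal{P})$ inverse limit measure.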

We show that (ϑ,P) inverse limit measures are abundant enough to produce all possible letter frequencies while maximising the corresponding entropy.

Theorem 7.3

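For each $\nu \in \mathcal{M}$, there exists a sequence $\mathcal{P}$ and a $(\vartheta,\mathcal{P})$ inverse limit measure $\mu$ with $\mu \sim \nu$ and $h^{g}_{\mu} \ge h^{g}_{\nu}$. If $\nu$ is not an inverse limit measure, we can choose $\mu$ such that $h^{g}_{\mu} > h^{g}_{\nu}$.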

Proof

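We start from an arbitrary measure $\nu = \mu_1^1 \in \mathcal{M}$ and inductively show the existence of a family $\{\mu_i^n : n \in \mathbb{N},\ 1 \le i \le n\}$ and $\mathcal{P} = (P_i)_{i \in \mathbb{N}}$ with the following properties:

  1. $\mu_{i+1}^{i+1} = \Pi(\mu_i^i)$ for all $i \in \mathbb{N}$;

  2. $P_i = P(\mu_i^i)$ for all $i \in \mathbb{N}$;

  3. $\mu_i^n = T_{P_i}(\mu_{i+1}^n)$ for all $1 \le i < n$;

  4. $\mu_i^n \sim \mu_i^m$ for all $i \in \mathbb{N}$ and $n,m \ge i$;

  5. $h^{g}_{\mu_i^n} \ge h^{g}_{\mu_i^{n-1}}$, with equality if and only if $\mu_i^n = \mu_i^{n-1}$.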

Assume that, for some $N \in \mathbb{N}$ and all $1 \le i \le n \le N$, the measures $\mu_i^n$ are well defined and fulfil the properties above. For $N = 1$ this clearly holds. We perform the inductive step by showing that the same holds up to $N+1$. The first three properties are simply definitions, fixing the value of $P_{N+1}$, as well as $\mu_{N+1}^{N+1} = \Pi(\mu_N^N)$, and $\mu_i^{N+1} = T_{P_i}(\mu_{i+1}^{N+1})$ for all $1 \le i \le N$. For the fourth property, it suffices to show that $\mu_i^N \sim \mu_i^{N+1}$ for all $1 \le i \le N$. For $i = N$, this follows from the fact that $\mu_N^{N+1} = T_{P_N}(\mu_{N+1}^{N+1})$ and $\mu_N^N$ are both in $\Pi^{-1}(\mu_{N+1}^{N+1}) \cap \mathcal{M}[P_N]$. Indeed, by Lemma 6.7, this implies that $\mu_N^{N+1} \sim \mu_N^N$. By the last statement in Lemma 6.7, this also implies that

$$\mu_i^N = T_{P_i} \circ \cdots \circ T_{P_{N-1}}\,\mu_N^N \sim T_{P_i} \circ \cdots \circ T_{P_{N-1}}\,\mu_N^{N+1} = \mu_i^{N+1},$$

for all $i \le N-1$, which completes the proof of the fourth property. For the last property it remains to show that $h^{g}_{\mu_i^{N+1}} \ge h^{g}_{\mu_i^N}$ with equality if and only if the measures are equal. For $i = N$, this follows from the fact that $\mu_N^{N+1} = T_{P_N}(\mu_{N+1}^{N+1})$ is the unique measure of maximal geometric entropy in $\Pi^{-1}(\mu_{N+1}^{N+1}) \cap \mathcal{M}[P_N]$ by Lemma 6.9. Recall that by Lemma 6.9,

$$h^{g}_{T_P(\mu)} = \frac{1}{\lambda}\Bigl(h^{g}_{\mu} + \frac{H_P R^{\mu}}{L R^{\mu}}\Bigr)$$

is strictly increasing in the entropy of $\mu$ as long as the letter frequencies and $P$ remain fixed. Since $\mu_N^N \sim \mu_N^{N+1}$, this also shows that the entropy of $\mu_{N-1}^{N+1} = T_{P_{N-1}}(\mu_N^{N+1})$ is at least the entropy of $\mu_{N-1}^N = T_{P_{N-1}}(\mu_N^N)$, and if the measures are not equal, then $\mu_N^N \neq \mu_N^{N+1}$, implying that the inequality of entropies is strict. Inductively, the same holds for the entropies of $\mu_i^{N+1}$ and $\mu_i^N$, finishing the proof of the last property.

In summary, we obtain for each $i \in \mathbb{N}$ a sequence $(\mu_i^n)_{n \ge i}$ of measures with identical letter frequencies and increasing geometric entropy. There exists an appropriate (diagonal) subsequence $(n_j)_{j \in \mathbb{N}}$ with respect to which each of these measure sequences converges and we set $\mu_i = \lim_{j\to\infty} \mu_i^{n_j}$ for all $i \in \mathbb{N}$. By continuity of the transfer operators and the third property, we obtain that $\mu_i = T_{P_i}(\mu_{i+1})$ and in particular, $\mu_1$ is a $(\vartheta,\mathcal{P})$ inverse limit measure. Since equality of letter frequencies is preserved under weak convergence, we also have $\mu_1 \sim \mu_1^1$. Furthermore, by the upper semi-continuity of entropy, we also obtain that

$$h^{g}_{\mu_1} \ge \sup_{n \in \mathbb{N}} h^{g}_{\mu_1^n} \ge h^{g}_{\mu_1^1},$$

since the sequence $(h^{g}_{\mu_1^n})_{n \in \mathbb{N}}$ is increasing. We can only have equality if the entropy sequence is constant, implying that the measure sequence is constant. In this case $\mu_1 = \mu_1^1$, implying that the starting measure was already an inverse limit measure.

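All of the above can be generalised to higher powers of the random substitution $\vartheta$. In particular, we have $(\vartheta^n,\mathcal{P})$ inverse limit measures for all sequences $\mathcal{P}$ of probability choices for $\vartheta^n$ and $n \in \mathbb{N}$.

For each $n \in \mathbb{N}$ and $\nu \in \mathcal{M}$, we regard the induced measure $\nu_{A_n}$ on $A_n = \vartheta^n(X_\vartheta)$ as a measure on $B_n^{\mathbb{Z}}$, where $B_n = \{(a,v) : a \in \mathcal{A},\ v \in \vartheta^n(a)\}$. In particular, we write $\nu_{A_n} \sim \mu_{A_n}$ if $\nu_{A_n}([(a,v)]) = \mu_{A_n}([(a,v)])$ for all $a \in \mathcal{A}$ and $v \in \vartheta^n(a)$.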

Definition 7.4

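For each $\nu \in \mathcal{M}$ and $n \in \mathbb{N}$ with $A_n = \vartheta^n(X_\vartheta)$, we set

$$\mathcal{M}[\nu,n] = \{\mu \in \mathcal{M} : \mu_{A_n} \sim \nu_{A_n}\}.$$

In particular, $\mathcal{M}[\nu,0] = \{\mu \in \mathcal{M} : \mu \sim \nu\}$.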

Our next aim is to show that these inverse limit measures are dense in the space of all shift-invariant probability measures. As a first step, we show that every M[ν,n] contains an inverse limit measure that maximises the geometric entropy.

Lemma 7.5

For each $n \in \mathbb{N}$ and $\nu \in \mathcal{M}$, every measure of maximal geometric entropy in $\mathcal{M}[\nu,n]$ is a $(\vartheta^n,\mathcal{P})$ inverse limit measure for some sequence of probability choices $\mathcal{P}$ for $\vartheta^n$.

Proof

Since $\Pi^n$ collapses level-$n$ inflation words to letters, it is straightforward to verify that for each $\nu' \in \mathcal{M}[\nu,n]$, we have that $\Pi^n(\nu') \sim \Pi^n(\nu)$. That is, $\nu' \in \Pi^{-n}(\mu)$ for some $\mu \sim \Pi^n(\nu)$ and we can decompose

$$\mathcal{M}[\nu,n] = \bigcup_{\mu :\, \mu \sim \Pi^n(\nu)} \mathcal{M}[\nu,n] \cap \Pi^{-n}(\mu).$$

Writing $P_n$ for (some choice of) the probability data of $\nu$ on level-$n$ inflation words, we easily verify that $\mathcal{M}[\nu,n] \cap \Pi^{-n}(\mu) = \mathcal{M}[P_n] \cap \Pi^{-n}(\mu)$. Applying Lemma 6.9 to $\vartheta^n$, we obtain that the unique measure of maximal geometric entropy in this set is given by $T_{P_n}(\mu)$. Hence, the measures of maximal geometric entropy in $\mathcal{M}[\nu,n]$ are among the set

$$\mathcal{T}_n = \{T_{P_n}(\mu) : \mu \sim \Pi^n(\nu)\}.$$

Due to the explicit entropy expression in Lemma 6.9, the maximal geometric entropy in $\mathcal{T}_n$ is obtained exactly for those measures $\mu \sim \Pi^n(\nu)$ that have maximal geometric entropy. By Theorem 7.3, every such measure $\mu$ is an inverse limit measure, and hence the same holds for $T_{P_n}(\mu)$. In summary, the measures of maximal geometric entropy in $\mathcal{M}[\nu,n]$ are $(\vartheta^n,\mathcal{P})$ inverse limit measures for appropriate $\mathcal{P}$.

Lemma 7.6

Let $(\mu_n)_{n \in \mathbb{N}}$ be a sequence of measures with $\mu_n \in \mathcal{M}[\nu,n]$ for all $n \in \mathbb{N}$. Then, $\lim_{n\to\infty} \mu_n = \nu$ in the weak topology.

Proof

Recall that by Kac's formula we have for an arbitrary $S$-invariant measure $\mu$ and $u \in \mathcal{A}^+$ that

$$\mu([u]) = \frac{\mu_{A_n}(f_u)}{\mu_{A_n}(r_{A_n})}, \qquad f_u = \sum_{i=0}^{r_{A_n}-1} \chi_{[u]} \circ S^i.$$

Assume that $\mu \in \mathcal{M}[\nu,n]$; that is, $\mu_{A_n} \sim \nu_{A_n}$. Since $r_{A_n}$ is constant on cylinders of the form $[(a,v)]$ with $(a,v) \in B_n$, this implies that $\mu_{A_n}(r_{A_n}) = \nu_{A_n}(r_{A_n})$. The same holds for the function $g_u$ which takes the constant value $|v|_u$ on $[(a,v)]$ with $(a,v) \in B_n$. Since $g_u \le f_u \le g_u + |u|$, we obtain that

$$|\mu([u]) - \nu([u])| \le \frac{|u|}{\nu_{A_n}(r_{A_n})}. \tag{7.2}$$

Note that $r_{A_n}$ gives the (symbolic) length of the level-$n$ inflation word at the origin. Since the geometric length grows with $\lambda^n$ and the ratio between symbolic and geometric length is bounded, the difference in (7.2) decays exponentially with $n$.

Proposition 7.7

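Let $\mathcal{M}_I$ be the set of all measures $\mu$ such that $\mu$ is a $(\vartheta^n,\mathcal{P})$ inverse limit measure for some $\mathcal{P}$ and $n \in \mathbb{N}$. Then, for every $\nu \in \mathcal{M}$, we can find a sequence of measures $(\mu_n)_{n \in \mathbb{N}}$ with $\mu_n \in \mathcal{M}_I$ such that $\lim_{n\to\infty} \mu_n = \nu$ and

$$\lim_{n\to\infty} h^{g}_{\mu_n} = \inf_{n \in \mathbb{N}} h^{g}_{\mu_n} = h^{g}_{\nu}.$$

In particular, $\mathcal{M}_I$ is dense in $\mathcal{M}$.

Proof

Due to Lemma 7.5, we can choose for each $n \in \mathbb{N}$ an inverse limit measure $\mu_n \in \mathcal{M}[\nu,n]$ such that $h^{g}_{\mu_n} \ge h^{g}_{\nu}$. Since this sequence of measures mimics the inflation word frequencies of $\nu$, it converges to $\nu$ by Lemma 7.6. The statement on convergence of the geometric entropies follows by the upper semi-continuity of entropy.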

This result shows that all measures of maximal geometric entropy are limits of inverse limit measures of maximal geometric entropy. In particular, if there are only finitely many inverse limit measures of maximal geometric entropy, there can be no further measures of maximal geometric entropy. In our quest to show intrinsic ergodicity we can hence restrict our attention to inverse limit measures.

Uniqueness of inverse limit measures

A priori it is not clear whether a given sequence P of probability choices admits just one or several inverse limit measures. In this section, we characterise the uniqueness of inverse limit measures via the ergodicity of an associated (inverse-time) Markov chain.

Given $\mathcal{P} = (P_n)_{n \in \mathbb{N}}$, we call $Q(\mathcal{P}) = (Q(P_n))_{n \in \mathbb{N}}$ the $\mathcal{P}$-Markov sequence. This represents an inverse-time inhomogeneous Markov process that controls the flow of interval proportions induced by $\mathcal{P}$. More precisely, if $(\mu_n)_{n \in \mathbb{N}}$ is $(\vartheta,\mathcal{P})$-adapted, then $\mu_n = T_{P_n}(\mu_{n+1})$ implies via Corollary 6.8 that

$$\pi^{\mu_n} = Q(P_n)\,\pi^{\mu_{n+1}}$$

for all $n \in \mathbb{N}$. This has a unique solution precisely if the $\mathcal{P}$-Markov sequence is ergodic.

Before we continue, let us expand a bit more on the role of $Q(P)$ as updating the interval proportion vectors of periodic measure representations under the action of $\vartheta_P$. This gives a natural analogue to Corollary 6.8.

Lemma 7.8

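Given a random word $\omega$ and a random substitution $\vartheta_P$, we have that

$$\pi^{\bar{\mu}_{\vartheta_P(\omega)}} = Q(P)\,\pi^{\bar{\mu}_{\omega}}.$$

Proof

Let us set $M = M(P)$ and $Q = Q(P)$. First, we note that by the definition of $\bar{\mu}$, we have that

$$R^{\bar{\mu}_{\omega}} = \frac{\mathbb{E}[\phi(\omega)]}{\mathbb{E}[|\omega|]},$$

for every random word $\omega$. Hence, for the corresponding interval proportion vector, we have

$$\pi_a^{\omega} = \frac{L_a\,\mathbb{E}[\phi(\omega)]_a}{L\,\mathbb{E}[\phi(\omega)]}. \tag{7.3}$$

Note that $\mathbb{E}[\phi(\vartheta_P(\omega))] = M\,\mathbb{E}[\phi(\omega)]$. For convenience, we use the shorthand $v = \mathbb{E}[\phi(\omega)]$ in the following. Applying (7.3) to the random word $\vartheta_P(\omega)$, and recalling that $L_a M_{ab} = \lambda L_b Q_{ab}$, we obtain

$$\pi_a^{\vartheta_P(\omega)} = \frac{L_a (Mv)_a}{L M v} = \frac{\sum_{b \in \mathcal{A}} L_a M_{ab} v_b}{\lambda L v} = \frac{\sum_{b \in \mathcal{A}} Q_{ab} L_b v_b}{L v} = \bigl(Q\,\pi^{\bar{\mu}_{\omega}}\bigr)_a,$$

as required.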
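The computation in the proof of Lemma 7.8 can be checked numerically. The sketch below assumes only the relations $LM = \lambda L$ and $L_a M_{ab} = \lambda L_b Q_{ab}$ used above; the matrix `M`, the vector `v` and all function names are illustrative choices, not data from the text.

```python
import math

# Illustrative data (an assumption): M plays the role of the substitution
# matrix M(P) of some probability choice P on a two-letter alphabet.
M = [[1.0, 1.0],
     [1.0, 0.0]]
lam = (1 + math.sqrt(5)) / 2   # Perron-Frobenius eigenvalue of M
L = [lam, 1.0]                 # left eigenvector, L M = lam * L

def geometric_matrix(M, L, lam):
    """Q_ab = L_a * M_ab / (lam * L_b), the relation recalled in the proof."""
    n = len(M)
    return [[L[a] * M[a][b] / (lam * L[b]) for b in range(n)] for a in range(n)]

def proportions(v, L):
    """Interval proportion vector pi_a = L_a v_a / (L . v), as in (7.3)."""
    total = sum(l * x for l, x in zip(L, v))
    return [l * x / total for l, x in zip(L, v)]

def mat_vec(A, v):
    """Plain matrix-vector product."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

Q = geometric_matrix(M, L, lam)
v = [3.0, 2.0]                        # hypothetical expected letter counts
lhs = proportions(mat_vec(M, v), L)   # proportions after one substitution step
rhs = mat_vec(Q, proportions(v, L))   # Q applied to the previous proportions
col_sums = [sum(Q[a][b] for a in range(2)) for b in range(2)]
```

Both sides agree up to rounding, and the columns of `Q` sum to one, which is what allows $Q(P)$ to be read as a transition matrix.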

Lemma 7.9

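Let $\mathcal{P}$ be a sequence of probability choices with $\mathcal{P}$-Markov sequence $Q(\mathcal{P}) = (Q_n)_{n \in \mathbb{N}}$. Given $a \in \mathcal{A}$ and $m \in \mathbb{N}$, assume that $\pi$ is an accumulation point of $(Q_{[m,m+n]} e_a)_{n \in \mathbb{N}}$. Then, there is a $(\vartheta,\mathcal{P})$-adapted sequence of measures $(\mu_n)_{n \in \mathbb{N}}$ such that $\pi^{\mu_m} = \pi$.

Proof

Let $(n_j)_{j \in \mathbb{N}}$ be a strictly increasing subsequence such that $\pi = \lim_{j\to\infty} Q_{[m,n_j]} e_a$. Up to choosing a further subsequence, we can assume that the sequence of probability vectors $(Q_{[n_j+1,n_{2j}]} e_a)$ converges to some vector $\pi'$. We write $k_j = n_{2j} - n_j$ and note that $Q_{[n_j+1,n_{2j}]}$ is the geometric substitution matrix of the probability choice

$$P^{[j]} := P_{n_j+1} \cdots P_{n_{2j}}.$$

We consider the sequence of random words $(\omega_j)_{j \in \mathbb{N}}$ with $\omega_j = \vartheta^{k_j}_{P^{[j]}}(a)$ for all $j \in \mathbb{N}$. Again up to restricting to a subsequence, we can assume that this sequence has a limit measure $\mu' = \lim_{j\to\infty} \bar{\mu}_{\omega_j}$. By Lemma 7.8, the interval proportion vector of $\bar{\mu}_{\omega_j}$ is given by

$$\pi^{\bar{\mu}_{\omega_j}} = Q(P^{[j]})\,\pi^{\bar{\mu}_{(a)}} = Q_{[n_j+1,n_{2j}]} e_a \xrightarrow{\,j\to\infty\,} \pi',$$

implying that $\pi^{\mu'} = \pi'$. Since all realisations of $\omega_j$ are legal words, we further know that $\mu'$ is supported on $X_\vartheta$. Let $\mu_1$ be an accumulation point of the sequence $(\mu_1^{(j)})_{j \in \mathbb{N}}$ with

$$\mu_1^{(j)} = T_{P_1} \circ \cdots \circ T_{P_{n_j}}(\mu').$$

We claim that $(\Pi^n(\mu_1))_{n \in \mathbb{N}_0}$ is a $(\vartheta,\mathcal{P})$-adapted sequence. Indeed, for $n_j \ge k$, we have that

$$\mu_k^{(j)} := \Pi^{k-1}(\mu_1^{(j)}) = T_{P_k} \circ \cdots \circ T_{P_{n_j}}(\mu')$$

converges along the same subsequence to some $\mu_k$ and satisfies

$$\mu_1^{(j)} = T_{P_1} \circ \cdots \circ T_{P_{k-1}}(\mu_k^{(j)}),$$

which persists in the limit along the corresponding subsequence. Hence $(\mu_k)_{k \in \mathbb{N}}$ is indeed $(\vartheta,\mathcal{P})$-adapted. For the interval proportions of $\mu_m^{(j)}$, we obtain

$$\pi^{\mu_m^{(j)}} = Q_{[m,n_j]}\,\pi^{\mu'} = Q_{[m,n_j]}\,\pi'.$$

Since $\pi' = \lim_{j\to\infty} Q_{[n_j+1,n_{2j}]} e_a$ and the norm of $Q_{[m,n_j]}$ is uniformly bounded, we obtain that

$$\pi^{\mu_m} = \lim_{j\to\infty} Q_{[m,n_j]}\,\pi' = \lim_{j\to\infty} Q_{[m,n_{2j}]} e_a = \pi,$$

as claimed.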

Proposition 7.10

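There is a unique $(\vartheta,\mathcal{P})$ inverse limit measure if and only if $Q(\mathcal{P})$ is ergodic.

Proof

First, assume that $Q(\mathcal{P}) = (Q_n)_{n \in \mathbb{N}}$ is ergodic. This means that for all $n \in \mathbb{N}$ there is some $\pi_n$ such that $Q_{[n,n+k-1]}$ converges to $\pi_n \mathbf{1}^{T}$ as $k \to \infty$. Let $(\mu_n)_{n \in \mathbb{N}}$ be a $(\vartheta,\mathcal{P})$-adapted sequence. By Corollary 6.8, we have that

$$\pi^{\mu_n} = Q_{[n,n+k-1]}\,\pi^{\mu_{n+k}} \xrightarrow{\,k\to\infty\,} \pi_n.$$

Let $P^{(n)} := P_1 \cdots P_n$. Then, $\mu = \mu_1$ satisfies that $\mu = T_{P^{(n)}}(\mu_{n+1})$, and therefore the corresponding induced measure on $A_n = \vartheta^n(X_\vartheta)$ satisfies

$$\mu_{A_n}[(a,v)] = \mu_{n+1}([a])\,P^{(n)}_{v,a}, \tag{7.4}$$

for all $a \in \mathcal{A}$ and $v \in \vartheta^n(a)$. If $(\mu'_n)_{n \in \mathbb{N}}$ is another $(\vartheta,\mathcal{P})$-adapted sequence with $\mu' = \mu'_1$, we have that

$$\pi^{\mu_n} = \pi_n = \pi^{\mu'_n},$$

and therefore $R^{\mu_n} = R^{\mu'_n}$ for all $n \in \mathbb{N}$. That is, $\mu_n \sim \mu'_n$ for all $n \in \mathbb{N}$, implying via (7.4) that $\mu_{A_n} \sim \mu'_{A_n}$. Phrased differently, we have that $\mu' \in \mathcal{M}[\mu,n]$ for all $n \in \mathbb{N}$. By Lemma 7.6, the constant sequence $(\mu')$ therefore converges to $\mu$, meaning that $\mu' = \mu$. We conclude that there can be only one $(\vartheta,\mathcal{P})$ inverse limit measure.

Conversely, assume that $Q(\mathcal{P}) = (Q_n)_{n \in \mathbb{N}}$ is not ergodic. By Theorem 2.20, there exists an $m \in \mathbb{N}$ such that $\delta(Q_{[m,m+k]})$ does not converge to 0 as $k \to \infty$. Since the sequence is non-increasing, there exists a $c > 0$ such that $\delta(Q_{[m,m+k]}) > c$ for all $k \in \mathbb{N}$. In particular, we can find $a,b \in \mathcal{A}$ such that

$$d_V(Q_{[m,n]} e_a, Q_{[m,n]} e_b) > c > 0$$

for all $n$ in some strictly increasing sequence $(n_j)_{j \in \mathbb{N}}$. Up to choosing a subsequence, we may assume that both $\pi = \lim_{j\to\infty} Q_{[m,n_j]} e_a$ and $\pi' = \lim_{j\to\infty} Q_{[m,n_j]} e_b$ exist as a limit. By construction, we have that $d_V(\pi,\pi') \ge c > 0$ and hence $\pi \neq \pi'$. By Lemma 7.9, we can find corresponding $(\vartheta,\mathcal{P})$-adapted sequences $(\mu_n)_{n \in \mathbb{N}}$ and $(\mu'_n)_{n \in \mathbb{N}}$ such that $\pi^{\mu_m} = \pi$ and $\pi^{\mu'_m} = \pi'$. Since $\pi$ and $\pi'$ are different, so are $\mu_m$ and $\mu'_m$, and ultimately $\mu_1 \neq \mu'_1$. This implies that there are several $(\vartheta,\mathcal{P})$ inverse limit measures.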

As a direct consequence of Proposition 7.10 and Theorem 2.20, we obtain the following list of sufficient conditions for the uniqueness of inverse limit measures.

Corollary 7.11

There is a unique $(\vartheta,\mathcal{P})$ inverse limit measure for $\mathcal{P} = (P_n)_{n \in \mathbb{N}}$ if any of the following hold.

  1. $\vartheta$ is compatible.

  2. There is a primitive matrix $M$ such that $\lim_{n\to\infty} M(P_n) = M$.

  3. There is a primitive matrix $Q$ such that $\lim_{n\to\infty} Q(P_n) = Q$.

  4. There is some $n \in \mathbb{N}$ such that all marginals of $\vartheta^n$ have a strictly positive substitution matrix.

  5. $\vartheta$ is defined on a binary alphabet $\mathcal{A}$.

Proof

The first condition is a special case of the second condition. The second and third condition are equivalent and it follows directly from Theorem 2.20 that they imply the ergodicity of the corresponding Markov chain. If the fourth condition holds, we obtain that $\delta(Q(P^{(n)}))$ is bounded away from 1 for all probability choices $P^{(n)}$ of $\vartheta^n$. Using the submultiplicativity of $\delta$, this implies $\lim_{m\to\infty} \delta(Q_n \cdots Q_{n+m}) = 0$ for $Q_n = Q(P_n)$ and hence we obtain ergodicity. Finally, we show that the fifth condition is a special case of the fourth condition. If $\vartheta$ is defined on a binary alphabet, recognisability enforces some combinatorial structure. More precisely, if $a^n \in \vartheta(a)$ or $b^n \in \vartheta(b)$, geometric compatibility (with inflation factor $\lambda > 1$) enforces that $n \ge 2$. This would imply that $a^{\mathbb{Z}}$ or $b^{\mathbb{Z}}$ is a non-recognisable element of the subshift. On the other hand, if $b^n \in \vartheta(a)$ and $a^m \in \vartheta(b)$, it must be $n > 1$ or $m > 1$ and we obtain the same contradiction to recognisability. Hence, we can pick $a$ such that every word in $\vartheta(a)$ contains both $a$ and $b$, and every word in $\vartheta(b)$ contains at least one $a$. It follows that all marginals of $\vartheta^2$ have a strictly positive substitution matrix.

Uniformity Measure and Intrinsic Ergodicity

We still assume that $\vartheta$ is a primitive, recognisable and geometrically compatible random substitution. Based on the previous discussion, we show that uniformity measures are the $(\vartheta,\mathcal{P})$ inverse limit measures for some explicit $\mathcal{P}$. We start with a slight generalisation of the $n$-productivity distributions in Definition 2.10.

Definition 8.1

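For $n,m \in \mathbb{N}$, let $P^{n,m}$ denote the $n$-productivity distribution for $\vartheta^m$, that is,

$$P^{n,m}_{v,a} = \frac{\#\vartheta^n(v)}{\sum_{u \in \vartheta^m(a)} \#\vartheta^n(u)} = \frac{\#\vartheta^n(v)}{\#\vartheta^{n+m}(a)},$$

for all $v \in \vartheta^m(a)$. In particular, $P^{0,m}$ represents the uniform distribution on each $\vartheta^m(a)$.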
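For small levels, the productivity distributions $P^{n,1}$ can be computed exactly by counting realisations. The sketch below uses the random substitution of Example 8.9 from later in this section and assumes, as recognisability guarantees, that the number of realisations of a word is the product of the numbers for its letters and that the images of distinct words in $\vartheta(a)$ are disjoint. The function names are illustrative.

```python
from fractions import Fraction
from math import prod

# Random substitution of Example 8.9 (stated there to be recognisable).
SUBST = {"a": ["abc", "acc"], "b": ["bac", "bcc"], "c": ["aac"]}

def letter_counts(n):
    """Return {a: #theta^n(a)} via #theta^(k+1)(a) = sum_{v in theta(a)} #theta^k(v)."""
    counts = {a: 1 for a in SUBST}
    for _ in range(n):
        counts = {a: sum(prod(counts[x] for x in v) for v in SUBST[a])
                  for a in SUBST}
    return counts

def word_count(n, word):
    """#theta^n(word), multiplicative over letters (unique realisation paths)."""
    counts = letter_counts(n)
    return prod(counts[x] for x in word)

def productivity(n):
    """P^{n,1}_{v,a} = #theta^n(v) / #theta^(n+1)(a) for every v in theta(a)."""
    top = letter_counts(n + 1)
    return {a: {v: Fraction(word_count(n, v), top[a]) for v in SUBST[a]}
            for a in SUBST}
```

For instance, `productivity(0)` is the uniform distribution on each $\vartheta(a)$, while `productivity(1)["a"]` already assigns weight $2/3$ to `abc` and $1/3$ to `acc`.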

Note that the equality $\sum_{u \in \vartheta^m(a)} \#\vartheta^n(u) = \#\vartheta^{n+m}(a)$ makes use of the disjoint set condition, implied by recognisability of $\vartheta^m$. For a word $u = u_1 \cdots u_r$ and $v \in \vartheta^m(u)$, there exists, due to unique realisation paths, a unique decomposition $v = v_1 \cdots v_r$ with $v_i \in \vartheta^m(u_i)$ for all $1 \le i \le r$. We obtain that

$$\mathbb{P}\bigl[\vartheta^m_{P^{n,m}}(u) = v\bigr] = \prod_{i=1}^{r} \frac{\#\vartheta^n(v_i)}{\#\vartheta^{n+m}(u_i)} = \frac{\#\vartheta^n(v)}{\#\vartheta^{n+m}(u)},$$

again using unique realisation paths in the last step. Intuitively, uniformity measures are those that exhibit a uniform distribution of inflation words on each level. More formally, this can be formulated as follows.

Proposition 8.2

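$\nu \in \mathcal{M}$ is a uniformity measure if and only if $\nu \in \mathcal{M}[P^{0,n}]$ for all $n \in \mathbb{N}$.

Proof

Recall from Corollary 5.4 that uniformity measures are precisely those such that $\nu_{A_n}$ is invariant under $\Gamma_n$ for all $n \in \mathbb{N}$. The statement $\nu \in \mathcal{M}[P^{0,n}]$ can be expressed equivalently by

$$\nu_{A_n}([a,v]) = \nu_{A_n}([a,u]) \tag{8.1}$$

for all $a \in \mathcal{A}$, $n \in \mathbb{N}$ and $u,v \in \vartheta^n(a)$. Each of the automorphisms $f_\alpha \in \Gamma_{a,n}$ leaves $A_n \subset B_n^{\mathbb{Z}}$ invariant by construction, and acts on it by a permutation of the letters $\{(a,v) : v \in \vartheta^n(a)\} \subset B_n$. Hence, if $\nu$ is a uniformity measure, we obtain

$$(\nu \circ f_\alpha)_{A_n}([a,v]) = \nu_{A_n} \circ f_\alpha([a,v]) = \nu_{A_n}([a,\alpha(v)]).$$

Choosing $\alpha \in \mathrm{Sym}(\vartheta^n(a))$ with $\alpha(v) = u$ reproduces (8.1) and we conclude that $\nu \in \mathcal{M}[P^{0,n}]$. Conversely, assume that $\nu \in \mathcal{M}[P^{0,n}]$ for all $n \in \mathbb{N}$. Let $n \in \mathbb{N}$ and $f \in \Gamma_n$. Since the groups $\Gamma_n$ are nested, we see that $f$ leaves $A_m$ invariant for all $m \ge n$, and it acts via a permutation $\alpha_{m,b}$ on $\{(b,v) : v \in \vartheta^m(b)\}$ for every $b \in \mathcal{A}$. Hence,

$$(\nu \circ f)_{A_m}([b,v]) = \nu_{A_m}([b,\alpha_{m,b}(v)]) = \nu_{A_m}([b,v]),$$

due to (8.1). From this it follows that $(\nu \circ f)_{A_m} \sim \nu_{A_m}$ for all $m \ge n$. This property enforces $\nu = \nu \circ f$ by Lemma 7.6, and it follows that $\nu$ is a uniformity measure.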

In the following, we prove several useful consistency relations satisfied by the n-productivity distributions.

Lemma 8.3

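For all $n,k,m \in \mathbb{N}_0$, we have that

$$P^{n,k+m} = P^{n,k} P^{n+k,m}.$$

In particular, for all $n \in \mathbb{N}_0$ and $k \in \mathbb{N}$,

$$P^{n,k} = P^{n,1} P^{n+1,1} \cdots P^{n+k-1,1}.$$

Proof

Given $a \in \mathcal{A}$ and $w \in \vartheta^{k+m}(a)$, let $v \in \vartheta^m(a)$ be the unique word with $w \in \vartheta^k(v)$. We obtain

$$\bigl(P^{n,k} P^{n+k,m}\bigr)_{w,a} = P^{n,k}_{w,v}\,P^{n+k,m}_{v,a} = \frac{\#\vartheta^n(w)}{\#\vartheta^{n+k}(v)} \cdot \frac{\#\vartheta^{n+k}(v)}{\#\vartheta^{n+k+m}(a)} = \frac{\#\vartheta^n(w)}{\#\vartheta^{n+k+m}(a)} = P^{n,k+m}_{w,a},$$

proving the first relation. Iterating this relation gives the second claim.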

Definition 8.4

Given $\mathcal{P} = (P^{n,1})_{n \in \mathbb{N}_0}$, we call every $(\vartheta,\mathcal{P})$-adapted sequence of measures $(\mu_n)_{n \in \mathbb{N}_0}$ a uniformity sequence, and every $(\vartheta,\mathcal{P})$ inverse limit measure is referred to as a uniformity limit measure.

Recall that $\nu$ is a uniformity measure if and only if $\nu \in \mathcal{M}[P^{0,n}]$ for all $n \in \mathbb{N}$, due to Proposition 8.2. We will see that this concept coincides with that of a uniformity limit measure. In fact, we can show that uniformity (limit) measures are precisely the measures of maximal entropy. This gives us a slight strengthening of Theorem C.

Theorem 8.5

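Let $\vartheta$ be primitive, geometrically compatible and recognisable. An invariant probability measure on $(X_\vartheta,S)$ has maximal geometric entropy if and only if it is a uniformity measure if and only if it is a uniformity limit measure.

Proof

If $\nu$ is a measure of maximal geometric entropy, it maximises in particular the geometric entropy in $\Pi^{-n}(\Pi^n(\nu))$. Hence, by Corollary 6.10, we have that $\nu = T_{P^{0,n}}(\Pi^n(\nu)) \in \mathcal{M}[P^{0,n}]$ for all $n \in \mathbb{N}$. Hence, $\nu$ is a uniformity measure. If $\nu = \nu_0$ is a uniformity measure, we claim that $(\nu_n)_{n \in \mathbb{N}_0}$ with $\nu_n = \Pi^n(\nu_0)$ is a uniformity sequence. First, we note that, by Lemma 8.3,

$$T_{P^{0,1}} \circ \cdots \circ T_{P^{n-1,1}}(\nu_n) = T_{P^{0,n}}(\nu_n) \in \mathcal{M}[P^{0,n}] \cap \Pi^{-n}(\nu_n) = \mathcal{M}[\nu,n] \cap \Pi^{-n}(\nu_n)$$

converges to $\nu$ by Lemma 7.6. By continuity of $\Pi$, applying $\Pi^m$ to this relation yields that

$$\lim_{n\to\infty} T_{P^{m,1}} \circ \cdots \circ T_{P^{n-1,1}}(\nu_n) = \nu_m,$$

for all $m \in \mathbb{N}$. In particular,

$$\nu_{m-1} = \lim_{n\to\infty} T_{P^{m-1,1}} \circ \cdots \circ T_{P^{n-1,1}}(\nu_n) = T_{P^{m-1,1}}(\nu_m),$$

so $(\nu_n)_{n \in \mathbb{N}_0}$ is $(\vartheta,\mathcal{P})$-adapted for $\mathcal{P} = (P^{n,1})_{n \in \mathbb{N}_0}$. Hence, $\nu_0$ is a uniformity limit measure. Finally, let $\nu_0$ be a uniformity limit measure with uniformity sequence $(\nu_n)_{n \in \mathbb{N}_0}$. We obtain $\nu_0 = T_{P^{0,n}}(\nu_n)$, and hence we can express the geometric entropy of $\nu_0$ via Corollary 6.10 as

$$h^{g}_{\nu_0} = \frac{1}{\lambda^n}\Bigl(h^{g}_{\nu_n} + \frac{1}{L R^{\nu_n}} \sum_{a \in \mathcal{A}} \nu_n([a]) \log \#\vartheta^n(a)\Bigr),$$

for all $n \in \mathbb{N}$. By Theorem A, we have that $\lambda^{-n} \log \#\vartheta^n(a)$ converges to $L_a h_{\mathrm{top}}(Y_\vartheta)$ as $n \to \infty$. Hence, performing this limit in the last relation gives

$$h^{g}_{\nu_0} = h_{\mathrm{top}}(Y_\vartheta)$$

and we conclude that $\nu_0$ has maximal geometric entropy.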

Corollary 8.6

Every measure of maximal entropy on $(Y_\vartheta,T)$ has full topological support.

Proof

For every legal word $v$ there exists some power $n \in \mathbb{N}$ such that $v$ is contained in some realisation of $\vartheta^n(a)$ for every $a \in \mathcal{A}$, due to primitivity. Every uniformity measure equidistributes the inflation words of level $n$ of every given type. Hence, it assigns positive mass to the cylinder $[v]$. This shows that measures of maximal geometric entropy have full support on $X_\vartheta$, and hence their lifts have full support on $Y_\vartheta$.

We now summarise some of our main results on intrinsic ergodicity, covering in particular Theorem D.

Theorem 8.7

Let $\vartheta$ be a primitive, geometrically compatible and recognisable random substitution. There is a unique uniformity measure $\mu_u$ if and only if the Markov sequence $(Q(P^{n,1}))_{n \in \mathbb{N}_0}$ is ergodic. In this case, $(Y_\vartheta,T)$ is intrinsically ergodic, and the measure of maximal entropy is $\widetilde{\mu_u}$, the lift of $\mu_u$ under the suspension.

Proof

The first statement about the uniqueness of $\mu_u$ is a direct consequence of Proposition 7.10. By Theorem 8.5 the uniformity measures are precisely the measures of maximal geometric entropy, and hence their lifts under the suspension are precisely the measures of maximal entropy on $(Y_\vartheta,T)$.

Corollary 8.8

The system $(Y_\vartheta,T)$ is intrinsically ergodic if any of the following hold.

  • $\vartheta$ is compatible.

  • There is a primitive matrix $M$ such that $\lim_{n\to\infty} M(P^{n,1}) = M$.

  • There is a primitive matrix $Q$ such that $\lim_{n\to\infty} Q(P^{n,1}) = Q$.

  • There is some $n \in \mathbb{N}$ such that all marginals of $\vartheta^n$ have a strictly positive substitution matrix.

  • $\vartheta$ is defined on a binary alphabet $\mathcal{A}$.

Proof

This follows directly by combining Theorem 8.7 with Corollary 7.11.

Proof of Corollary E

If $\vartheta$ is primitive, compatible and recognisable, intrinsic ergodicity of $(Y_\vartheta,T)$ and $(X_\vartheta,S)$ are equivalent and follow by Corollary 8.8. In this case, all $n$-productivity distributions are given by the uniform distribution $P = P^{0,1}$, so the sequence $\mathcal{P} = (P^{n,1})_{n \in \mathbb{N}_0}$ is constant. It hence follows from Remark 7.1 that $\mu_P$ is the unique uniformity (limit) measure and thus the measure of maximal entropy.

Example 8.9

Consider the random substitution $\vartheta$ on $\mathcal{A} = \{a,b,c\}$, given by

$$\vartheta\colon\ a \mapsto \{abc,\ acc\},\quad b \mapsto \{bac,\ bcc\},\quad c \mapsto \{aac\}.$$

This example is easily verified to be primitive, of constant length and recognisable. We will show that it gives rise to an intrinsically ergodic subshift, although the productivity weights are non-trivial. First, note that $p_n := \#\vartheta^n(a) = \#\vartheta^n(b)$ follows by induction. Similarly, let $q_n = \#\vartheta^n(c)$ and $r_n = p_n/q_n$ for all $n \in \mathbb{N}_0$. We obtain

$$q_{n+1} = \#\vartheta^{n+1}(c) = \#\vartheta^n(aac) = p_n^2 q_n,$$

and similarly $p_{n+1} = (p_n + q_n) p_n q_n$. This yields the recursive relation $r_{n+1} = 1 + 1/r_n$ with $r_0 = 1$, which is solved by $r_n = 1 + F_n/F_{n+1}$, with $F_n$ being the $n$th Fibonacci number. Hence, the limiting value $\tau = \lim_{n\to\infty} r_n = (1+\sqrt{5})/2$ is the golden ratio. From this, we obtain that $P = \lim_{n\to\infty} P^{n,1}$ exists and is non-degenerate. In particular, $Q(P^{n,1})$ converges to the primitive matrix $Q(P)$. As a consequence of Corollary 8.8 we see that both $(X_\vartheta,S)$ and $(Y_\vartheta,T)$ are intrinsically ergodic. Note that if we replace $c \mapsto \{aac\}$ by $c \mapsto \{acc\}$, we obtain $r_{n+1} = r_n + 1$ such that $r_n \to \infty$ as $n \to \infty$, and hence $P$ turns out to be degenerate for $\vartheta$. In fact, it singles out the marginal $a \mapsto abc$, $b \mapsto bac$, $c \mapsto acc$. Since this is still primitive, we again deduce intrinsic ergodicity by the same criterion.
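The recursion in Example 8.9 can be checked with exact integer arithmetic. The following sketch iterates $p_{n+1} = (p_n+q_n)p_nq_n$ and $q_{n+1} = p_n^2 q_n$ from $p_0 = q_0 = 1$ and compares $r_n = p_n/q_n$ with $1 + F_n/F_{n+1}$, using the convention $F_0 = 0$, $F_1 = 1$ (an assumption about the indexing of the Fibonacci numbers). The function names are illustrative.

```python
from fractions import Fraction

def ratios(n_max):
    """Exact values r_n = p_n / q_n for n = 0, ..., n_max."""
    p, q = 1, 1                     # p_0 = q_0 = 1: one word at level 0
    out = []
    for _ in range(n_max + 1):
        out.append(Fraction(p, q))
        p, q = (p + q) * p * q, p * p * q
    return out

def fib(n):
    """Fibonacci numbers with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

r = ratios(10)
golden = (1 + 5 ** 0.5) / 2
# Weight of abc within theta(a) at level n: p_n / (p_n + q_n) = r_n / (r_n + 1).
weight_abc = [x / (x + 1) for x in r]
```

The ratios alternate around $(1+\sqrt{5})/2 \approx 1.618$ and approach it quickly, so the weight of `abc` within $\vartheta(a)$ stays strictly between 0 and 1, in line with the limit $P$ being non-degenerate.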

Easy sufficient conditions for the violation of intrinsic ergodicity seem to be harder to find. However, we provide an example below, showing that there are indeed primitive, geometrically compatible and recognisable random substitutions with multiple measures of maximal (geometric) entropy.

Example 8.10

Consider the primitive random substitution $\vartheta$ of constant length 4 on the alphabet $\mathcal{A} = \{a_0,a_1,b_0,b_1,c\}$ given by

$$\vartheta\colon\ a_i \mapsto \{a_i a_i a_{i+1} a_{i+1},\ a_i a_{i+1} c c\},\quad b_i \mapsto \{b_i b_i b_{i+1} b_{i+1},\ b_i b_{i+1} c c\},\quad c \mapsto \{a_0 b_0 c c\},$$

where indices are to be understood modulo 2. It is straightforward to see that $\vartheta$ is primitive. To verify recognisability, note that inflation words that contain $cc$ are easy to identify. A pattern of the form $w = a_i a_i a_{i+1} a_{i+1}$ is either a complete inflation word or splits into two inflation words in the middle. The only case in which the next four letters do not force one of the two options is if they form exactly the same word $w$. Repeating the argument, we see that the only obstruction to recognisability would be the existence of words of the form $w^n$ for arbitrarily large $n$. However, $w^6$ cannot be legal, as this would require a word $a_i^6$ or $a_{i+1}^5$ in the preimage, both of which are not legal. By symmetry, the same argument applies to patterns of the form $b_i b_i b_{i+1} b_{i+1}$, and we obtain that $\vartheta$ is indeed recognisable.

The idea behind this example is the following. We can partition the alphabet into three pieces, according to $\mathcal{A} = \{a_0,a_1\} \cup \{b_0,b_1\} \cup \{c\}$. The letter $c$ ensures primitivity but contributes least to entropy production. For the $n$-productivity weights, this causes the images of letters of type $a$ to favour those inflation words that consist only of type $a$ letters. The same holds for letters of type $b$. In the limit, this creates a non-primitive substitution matrix. We verify that the communication to letters of a different type dies out sufficiently fast so that most of the mass starting on $a$ (or $b$) remains trapped. This precludes convergence to a common limit distribution. The details follow.

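We can show by induction on $n \in \mathbb{N}_0$ that $p_n = \#\vartheta^n(a_i) = \#\vartheta^n(b_i)$ does not depend on $i$. We also use the notation $q_n := \#\vartheta^n(c)$ and $r_n = p_n/q_n$ for all $n \in \mathbb{N}_0$. Since the disjoint set condition holds, we obtain $p_{n+1} = p_n^4 + p_n^2 q_n^2$ and $q_{n+1} = p_n^2 q_n^2$. This yields

$$r_{n+1} = r_n^2 + 1,$$

for all $n \in \mathbb{N}_0$, with $r_0 = 1$. This is a rapidly increasing function of $n$. Let $u_i^a = a_i a_i a_{i+1} a_{i+1}$ and $v_i^a = a_i a_{i+1} c c$, and define $u_i^b, v_i^b$ analogously. The cardinalities satisfy

$$\frac{\#\vartheta^n(u_i^a)}{\#\vartheta^n(v_i^a)} = \frac{p_n^2}{q_n^2} = r_n^2.$$

The $n$-productivity distribution for $\vartheta$ therefore satisfies

$$P^{n,1}_{v_i^a,a_i} = P^{n,1}_{v_i^b,b_i} = \frac{\#\vartheta^n(v_i^a)}{\#\vartheta^n(v_i^a) + \#\vartheta^n(u_i^a)} = \frac{1}{1 + r_n^2} = \frac{1}{r_{n+1}}.$$

Note that $\vartheta$ is a mixture of the marginals $\theta'$ and $\theta''$, where

$$\theta'\colon\ a_i \mapsto u_i^a,\ b_i \mapsto u_i^b,\ c \mapsto a_0 b_0 c c, \qquad \theta''\colon\ a_i \mapsto v_i^a,\ b_i \mapsto v_i^b,\ c \mapsto a_0 b_0 c c,$$

which have substitution matrices

$$M' = \begin{pmatrix} 2 & 2 & 0 & 0 & 1 \\ 2 & 2 & 0 & 0 & 0 \\ 0 & 0 & 2 & 2 & 1 \\ 0 & 0 & 2 & 2 & 0 \\ 0 & 0 & 0 & 0 & 2 \end{pmatrix}, \qquad M'' = \begin{pmatrix} 1 & 1 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 \\ 2 & 2 & 2 & 2 & 2 \end{pmatrix},$$

respectively. Note that in the limit $n \to \infty$ the $n$-productivity weights single out the marginal $\theta'$, and hence the limiting productivity matrix $Q = \lim_{n\to\infty} Q(P^{n,1})$ is given by the normalised substitution matrix $Q = M'/4$. The substitution matrix for $P^{n,1}$ is given by

$$M(P^{n,1}) = \frac{r_{n+1} - 1}{r_{n+1}}\,M' + \frac{1}{r_{n+1}}\,M'',$$

and the corresponding geometric variant is $Q_n := Q(P^{n,1}) = M(P^{n,1})/4$.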

Our aim in the following is to rule out intrinsic ergodicity by showing explicitly that $Q_{[1,n]} = Q_1 \cdots Q_n$ does not converge to a one-dimensional projection. To this end, we extract from any matrix $P$ indexed by $\mathcal{A}$ the "upper left corner" via

$$A(P) = (P_{a_i,a_j})_{i,j \in \{0,1\}}.$$

Since both $M'$ and $M''$ exhibit a multiple of the idempotent matrix

$$N = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix}$$

at the corresponding position, we obtain that

$$A(Q_n) = s_n N, \qquad s_n := \frac{r_{n+1} - 1}{r_{n+1}} + \frac{1}{2 r_{n+1}} = \frac{2 r_{n+1} - 1}{2 r_{n+1}}.$$

Since we are dealing with non-negative matrices, extracting a submatrix is supermultiplicative in the sense that $A(QQ') \ge A(Q) A(Q')$, hence

$$A(Q_1 \cdots Q_n) \ge A(Q_1) \cdots A(Q_n) = \prod_{i=0}^{n-1} s_i\, N. \tag{8.2}$$

For $n \in \mathbb{N}_0$, we compute the first few values of $r_{n+1}$ as $2, 5, 26, 677, 458330, \ldots$, giving rise to the values $s_n = 3/4, 9/10, 51/52, \ldots$. We argue that $s = \prod_{i=0}^{\infty} s_i > 1/2$. For every $n,m \in \mathbb{N}$, we can iterate the relation $r_{n+1} > r_n^2$ to obtain $r_{n+m} > r_n^{2^m} \ge r_n^{m+1}$, and using that

$$\log(s_n) \ge 2(s_n - 1) = -\frac{1}{r_{n+1}}$$

for $s_n < 1$ sufficiently close to 1, we obtain that

$$\log \prod_{i=n}^{\infty} s_i = \sum_{i=0}^{\infty} \log(s_{n+i}) \ge -\sum_{i=0}^{\infty} \frac{1}{r_{n+1+i}} > -\sum_{i=0}^{\infty} \frac{1}{r_{n+1}^{\,i+1}} = \frac{1}{1 - r_{n+1}},$$

and therefore

$$\prod_{i=n}^{\infty} s_i \ge \exp\bigl(1/(1 - r_{n+1})\bigr) =: t_n.$$

By explicit calculation, we obtain $t_2 > 0.96$ and therefore

$$s = s_0 s_1 \prod_{i=2}^{\infty} s_i > \frac{3}{4} \cdot \frac{9}{10} \cdot \frac{96}{100} = \frac{648}{1000} > 0.5.$$

Combining this with (8.2), we obtain that for all $n \in \mathbb{N}$, we have

$$(Q_{[1,n]})_{a_0,a_0} + (Q_{[1,n]})_{a_1,a_0} \ge s > 0.5.$$

By symmetry, the same relation holds with $a$ replaced by $b$. But this means that the first and third column of $Q_{[1,n]}$ stay bounded away from each other by a positive distance. Hence, $Q_{[1,n]}$ does not converge to a one-dimensional projection. We conclude that $(Q_n)_{n \in \mathbb{N}}$ is not ergodic and hence that both $(X_\vartheta,S)$ and $(Y_\vartheta,T)$ admit several measures of maximal entropy.
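The trapping of mass in Example 8.10 can also be observed directly by multiplying the first few matrices $Q(P^{n,1})$ in exact arithmetic. The sketch below assumes the letter order $a_0, a_1, b_0, b_1, c$, with columns indexed by the letter being substituted, and forms the product over $n = 0,\ldots,5$; this indexing of the product and all function names are illustrative choices.

```python
from fractions import Fraction

# Substitution matrices of the two marginals, letter order a0, a1, b0, b1, c.
M1 = [[2, 2, 0, 0, 1],
      [2, 2, 0, 0, 0],
      [0, 0, 2, 2, 1],
      [0, 0, 2, 2, 0],
      [0, 0, 0, 0, 2]]
M2 = [[1, 1, 0, 0, 1],
      [1, 1, 0, 0, 0],
      [0, 0, 1, 1, 1],
      [0, 0, 1, 1, 0],
      [2, 2, 2, 2, 2]]

def r_values(n_max):
    """r_0 = 1 and r_{n+1} = r_n^2 + 1, for n up to n_max."""
    r = [Fraction(1)]
    for _ in range(n_max):
        r.append(r[-1] ** 2 + 1)
    return r

def q_matrix(r_next):
    """Q(P^{n,1}) = ((r_{n+1} - 1) / r_{n+1}) M1 / 4 + (1 / r_{n+1}) M2 / 4."""
    w = 1 / r_next
    return [[((1 - w) * M1[i][j] + w * M2[i][j]) / 4 for j in range(5)]
            for i in range(5)]

def mat_mul(A, B):
    """Plain matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

r = r_values(7)
s = [(2 * r[n + 1] - 1) / (2 * r[n + 1]) for n in range(6)]
product = q_matrix(r[1])
for n in range(1, 6):
    product = mat_mul(product, q_matrix(r[n + 1]))

mass_a = product[0][0] + product[1][0]   # column a0, rows a0 and a1
mass_b = product[0][2] + product[1][2]   # column b0, rows a0 and a1
```

The column started at $a_0$ keeps more than half of its mass on $\{a_0,a_1\}$, while the column started at $b_0$ keeps less than half there, so the two columns cannot approach a common limit.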

Acknowledgements

AM thanks Lund University for their hospitality during a research visit in April 2024, where this project began. PG acknowledges support from the German Research Foundation (DFG) through Project 50942770 and AM acknowledges support from EPSRC Grant EP/Y023358/1 and an EPSRC Doctoral Prize Fellowship.

Data Availability

This project has no associated data.


References

  • 1. Baake, M., Grimm, U.: Aperiodic Order. Vol. 1: A Mathematical Invitation. Cambridge University Press, Cambridge (2013)
  • 2. Baake, M., Lenz, D., Richard, C.: Pure point diffraction implies zero entropy for Delone sets with uniform cluster frequencies. Lett. Math. Phys. 82(1), 61–77 (2007)
  • 3. Baake, M., Spindeler, T., Strungaru, N.: Diffraction of compatible random substitutions in one dimension. Indag. Math. (N.S.) 29(4), 1031–1071 (2018)
  • 4. Barreira, L., Radu, L., Wolf, C.: Dimension of measures for suspension flows. Dyn. Syst. 19(2), 89–107 (2004)
  • 5. Bowen, R.: Some systems with unique equilibrium states. Math. Syst. Theory 8(3), 193–202 (1974)
  • 6. Bowen, R.: Equilibrium states and the ergodic theory of Anosov diffeomorphisms. Lect. Notes Math. 470, 11–25 (1975)
  • 7. Brémaud, P.: Markov Chains: Gibbs Fields, Monte Carlo Simulation and Queues, Volume 31 of Texts in Applied Mathematics, 2nd edn. Springer, Cham (2020)
  • 8. Bufetov, A.I., Gurevich, B.M.: Existence and uniqueness of a measure with maximal entropy for the Teichmüller flow on the moduli space of abelian differentials. Mat. Sb. 202(7), 3–42 (2011)
  • 9. Burton, R., Steif, J.E.: New results on measures of maximal entropy. Isr. J. Math. 89(1–3), 275–300 (1995)
  • 10. Buzzi, J.: Subshifts of quasi-finite type. Invent. Math. 159(2), 369–406 (2005)
  • 11. Buzzi, J., Fisher, T., Sambarino, M., Vásquez, C.: Maximal entropy measures for certain partially hyperbolic, derived from Anosov systems. Ergodic Theory Dyn. Syst. 32(1), 63–79 (2012)
  • 12. Chatterjee, S., Seneta, E.: Towards consensus: some convergence theorems on repeated averaging. J. Appl. Probab. 14(1), 89–97 (1977)
  • 13. Climenhaga, V., Pavlov, R.: One-sided almost specification and intrinsic ergodicity. Ergodic Theory Dyn. Syst. 39(9), 2456–2480 (2019)
  • 14. Climenhaga, V., Thompson, D.: Intrinsic ergodicity beyond specification: β-shifts, S-gap shifts, and their factors. Isr. J. Math. 192, 785–817 (2012)
  • 15. Climenhaga, V., Thompson, D.: Equilibrium states beyond specification and the Bowen property. J. Lond. Math. Soc. (2) 87(2), 401–427 (2013)
  • 16. Climenhaga, V., Thompson, D.J.: Beyond Bowen’s specification property. In: Thermodynamic Formalism, volume 2290 of Lecture Notes in Mathematics, pp. 3–82. Springer, Cham (2021)
  • 17. Dekking, F.M., Grimmett, G.R.: Superbranching processes and projections of random Cantor sets. Probab. Theory Related Fields 78(3), 335–355 (1988)
  • 18. Dekking, F.M., Meester, R.W.J.: On the structure of Mandelbrot’s percolation process and other random Cantor sets. J. Stat. Phys. 58(5), 1109–1126 (1990)
  • 19. Dekking, F.M., v.d. Wal, P.: Fractal percolation and branching cellular automata. Probab. Theory Related Fields 120(2), 277–308 (2001)
  • 20. Fokkink, R., Rust, D., Salo, V.: Automorphism groups of random substitution subshifts. Indag. Math. (N.S.) 35(5), 931–958 (2024)
  • 21. García-Ramos, F., Pavlov, R.: Extender sets and measures of maximal entropy for subshifts. J. Lond. Math. Soc. (2) 100(3), 1013–1033 (2019)
  • 22. García-Ramos, F., Pavlov, R., Reyes, C.: Measures of maximal entropy of bounded density shifts. Ergodic Theory Dyn. Syst. 44(10), 2960–2974 (2024)
  • 23. Gelfert, K., Ruggiero, R.O.: Geodesic flows modelled by expansive flows. Proc. Edinb. Math. Soc. (2) 62(1), 61–95 (2019)
  • 24. Godrèche, C., Luck, J.: Quasiperiodicity and randomness in tilings of the plane. J. Stat. Phys. 55, 1–28 (1989)
  • 25. Gohlke, P.: Inflation word entropy for semi-compatible random substitutions. Monatsh. Math. 192, 93–110 (2020)
  • 26. Gohlke, P.: Aperiodic order and singular spectra. Ph.D. thesis, Bielefeld University (2021)
  • 27. Gohlke, P., Mitchell, A., Rust, D., Samuel, T.: Measure theoretic entropy of random substitution subshifts. Ann. Henri Poincaré 24(1), 277–323 (2023)
  • 28. Gohlke, P., Rust, D., Spindeler, T.: Shifts of finite type and random substitutions. Discrete Contin. Dyn. Syst. 39, 5085–5103 (2019)
  • 29. Gohlke, P., Spindeler, T.: Ergodic frequency measures for random substitutions. Studia Math. 255(3), 265–301 (2020)
  • 30. Guiaşu, S., Shenitzer, A.: The principle of maximum entropy. Math. Intell. 7(1), 42–48 (1985)
  • 31. Haydn, N.T.A.: Phase transitions in one-dimensional subshifts. Discrete Contin. Dyn. Syst. 33(5), 1965–1973 (2013)
  • 32. Hofbauer, F.: Examples for the nonuniqueness of the equilibrium state. Trans. Am. Math. Soc. 228, 223–241 (1977)
  • 33. Iommi, G., Velozo, A.: Measures of maximal entropy for suspension flows. Math. Z. 297(3–4), 1473–1482 (2021)
  • 34. Jaynes, E.T.: Information theory and statistical mechanics. Phys. Rev. (2) 106, 620–630 (1957)
  • 35. Keller, G.: Equilibrium States in Ergodic Theory. London Mathematical Society Student Texts, vol. 42. Cambridge University Press, Cambridge (1998)
  • 36. Koslicki, D.: Substitution Markov chains with applications to molecular evolution. Ph.D. thesis, Pennsylvania State University (2012)
  • 37. Koslicki, D., Denker, M.: Substitution Markov chains and Martin boundaries. Rocky Mt. J. Math. 46(6), 1963–1985 (2016)
  • 38. Krieger, W.: On the uniqueness of the equilibrium state. Math. Syst. Theory 8(2), 97–104 (1974)
  • 39. Kucherenko, T., Thompson, D.J.: Measures of maximal entropy for suspension flows over the full shift. Math. Z. 294(1–2), 769–781 (2020)
  • 40. Kwietniak, D., Oprocha, P., Rams, M.: On entropy of dynamical systems with almost specification. Isr. J. Math. 213(1), 475–503 (2016)
  • 41. Mañibo, N., Miro, E.D., Rust, D., Tadeo, G.: Zeckendorf representations and mixing properties of sequences. Tsukuba J. Math. 44(2), 251–269 (2020)
  • 42. Mandelbrot, B.: Intermittent turbulence in self-similar cascades: divergence of high moments and dimension of the carrier. J. Fluid Mech. 62(2), 331–358 (1974)
  • 43. Mandelbrot, B.: The Fractal Geometry of Nature. W. H. Freeman and Co., San Francisco (1982)
  • 44. Markley, N.G., Paul, M.E.: Equilibrium states of grid functions. Trans. Am. Math. Soc. 274(1), 169–191 (1982)
  • 45. Meyerovitch, T.: Gibbs and equilibrium measures for some families of subshifts. Ergodic Theory Dyn. Syst. 33(3), 934–953 (2013)
  • 46. Miro, E.D., Rust, D., Sadun, L., Tadeo, G.: Topological mixing of random substitutions. Isr. J. Math. 255(1), 123–153 (2023)
  • 47. Mitchell, A.: Complexity of dynamical systems arising from random substitutions in one dimension. Ph.D. thesis, University of Birmingham (2023)
  • 48. Mitchell, A.: On word complexity and topological entropy of random substitution subshifts. Proc. Am. Math. Soc. 152, 4361–4377 (2024)
  • 49. Mitchell, A., Rutar, A.: Multifractal analysis of measures arising from random substitutions. Commun. Math. Phys. 405(3), 63 (2024)
  • 50. Moll, M.: Diffraction of random noble means words. J. Stat. Phys. 156(6), 1221–1236 (2014)
  • 51. Mossé, B.: Puissances de mots et reconnaissabilité des points fixes d’une substitution. Theor. Comput. Sci. 99(2), 327–334 (1992)
  • 52. Parry, W.: Intrinsic Markov chains. Trans. Am. Math. Soc. 112, 55–66 (1964)
  • 53. Parry, W., Pollicott, M.: Zeta functions and the periodic orbit structure of hyperbolic dynamics. Astérisque 268 (1990)
  • 54. Pavlov, R.: On intrinsic ergodicity and weakenings of the specification property. Adv. Math. 295, 250–270 (2016)
  • 55. Pavlov, R.: On entropy and intrinsic ergodicity of coded subshifts. Proc. Am. Math. Soc. 148(11), 4717–4731 (2020)
  • 56. Peyrière, J.: Substitutions aléatoires itérées. Séminaire de théorie des nombres de Bordeaux 80–81, 1701–1709 (1981)
  • 57. Queffélec, M.: Substitution Dynamical Systems: Spectral Analysis, 2nd edn. Springer, Berlin (2010)
  • 58. Ruelle, D.: Statistical mechanics of a one-dimensional lattice gas. Commun. Math. Phys. 9, 267–278 (1968)
  • 59. Rust, D.: Periodic points in random substitution subshifts. Monatsh. Math. 193(3), 683–704 (2020)
  • 60. Rust, D., Spindeler, T.: Dynamical systems arising from random substitutions. Indag. Math. 29, 1131–1155 (2018)
  • 61. Sarig, O.: Lecture notes on ergodic theory. Lecture Notes, Penn. State University (2009)
  • 62. Ures, R.: Intrinsic ergodicity of partially hyperbolic diffeomorphisms with a hyperbolic linear part. Proc. Am. Math. Soc. 140(6), 1973–1985 (2012)
  • 63. Walters, P.: An Introduction to Ergodic Theory. Graduate Texts in Mathematics. Springer, Berlin (1982)
  • 64. Weiss, B.: Intrinsically ergodic systems. Bull. Am. Math. Soc. 76(6), 1266–1269 (1970)
  • 65. Weiss, B.: Subshifts of finite type and sofic systems. Monatsh. Math. 77, 462–474 (1973)

Data Availability Statement

This project has no associated data.


Articles from Communications in Mathematical Physics are provided here courtesy of Springer