Complexity of Automatic Sequences

Hans Zantema

2020 Jan 7;12038:260–271. doi: 10.1007/978-3-030-40608-0_18

Editors: Alberto Leporati, Carlos Martín-Vide, Dana Shapira, Claudio Zandron

PMCID: PMC7206635

Abstract

Automatic sequences can be defined by DFAs with output (DFAO) in two natural ways. We propose to consider the minimal size of a corresponding DFAO as the complexity measure of the automatic sequence, for both variants. This paper compares these complexity measures and investigates their properties like the relationships with kernel and morphic sequences. There exist automatic sequences for which the one complexity is exponentially greater than the other one, in both directions. For both complexity measures we investigate the effect of taking basic operations on sequences like removing or adding an element in front, and observe that these operations may increase the complexity by at most a quadratic factor.

Introduction

Automatic sequences form an important class of infinite sequences over a finite alphabet; roughly speaking it is a first regular class going beyond ultimately periodic sequences. They have been extensively studied, in particular in the book [1] that serves as the main reference for research in this area. More recent references on the topic include [5, 9].

Automatic sequences depend on a base $k \geq 2$, with special interest for $k = 2$. Two well-known 2-automatic sequences are the Thue-Morse sequence and the regular paper-folding sequence, to be defined in Sect. 2. Automatic sequences admit several equivalent characterizations, many of which are closely related to the following two. In the first one the ith element $a_i$ of the sequence a is the output of a DFAO when taking as input the k-ary notation of i. The second one is similar, but then the reverse of the k-ary notation of i is taken as input. It is natural to consider the minimal size of a corresponding DFAO as the complexity measure of the automatic sequence, for both variants, and we denote them by $\|a\|_k$ and $\|a\|_k^R$. These complexity measures are the main topic of this paper. We show how they relate to other characterizations; in particular, $\|a\|_k^R$ is closely related to the size of the kernel of a, and $\|a\|_k$ is closely related to the size of the smallest alphabet needed to describe a as a morphic sequence with respect to a k-uniform morphism. In doing so, we follow constructions as presented in [1], for which we investigate the precise effect on the measures $\|a\|_k$ and $\|a\|_k^R$.

A first result states that there is an exponential gap between both measures: there exist sequences of automatic sequences a, b for which $\|a\|_k^R$ is exponential in $\|a\|_k$, and $\|b\|_k$ is exponential in $\|b\|_k^R$.

A next natural question is about the effect of taking basic operations on sequences. For instance, for any sequence a its tail Inline graphic is obtained by removing its first element. We show that and for all k-automatic sequences, and that the last inequality is sharp. Similar results hold for adding an element in front rather than removing. Also other operations are considered, like pointwise combining two sequences and taking particular subsequences. About all of these basic operations f the main observation is that their sizes do not increase more than quadratically: Inline graphic and for all a.

Another interesting question is what happens for periodic sequences. In the current paper we only derive a quadratic upper bound for $\|a\|_k^R$ and a linear upper bound for $\|a\|_k$, so opposite to the effect of $\mathrm{tail}$. Whether and when these upper bounds are reached is a much more involved question that is investigated in [2]. The research project on this topic is a joint project of Wieb Bosma and the current author; as this analysis for periodic sequences requires arguments of a completely different combinatorial flavor than the automata-based arguments in this paper, we decided to present the current paper and [2] separately.

Throughout the paper we make several claims about the exact values of $\|a\|_k$ and $\|a\|_k^R$ for particular sequences a. To compute these values we wrote a program to search for a DFAO of minimal size n having the corresponding property for $a_i$ for all $i < N$ for N being typically around . This was done by expressing the requirements as a satisfiability problem and then calling a SAT solver. The smallest n for which the formula is satisfiable is then given. As only the requirements for $i < N$ are checked, this only yields a lower bound, but for N large enough it gives the exact value. According to [6], Corollary 3.1 (page 59), two states in a DFAO of n states are equivalent if and only if for every string of length they produce the same output. This can be improved to Inline graphic . Applying this for the union of the found automaton and the real automaton with bounds derived in this paper, this shows that for the exact value is obtained.
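The search described above can be imitated on a very small scale. The following sketch replaces the SAT encoding by plain exhaustive enumeration, so it is only feasible for a handful of states; the function names and the bound `max_states` are illustrative and are not taken from the author's tool. As in the text, agreement is only checked on a finite prefix, so the returned value is in general a lower bound.

```python
from itertools import product


def digits(i, k):
    """k-ary notation of i, most significant digit first; 0 gives the empty list."""
    out = []
    while i > 0:
        out.append(i % k)
        i //= k
    return out[::-1]


def consistent(delta, tau, prefix, k, reverse):
    """Check that the DFAO with initial state 0 outputs prefix[i] on input (i)_k."""
    for i, expected in enumerate(prefix):
        word = digits(i, k)
        if reverse:
            word = word[::-1]
        q = 0
        for d in word:
            q = delta[q][d]
        if tau[q] != expected:
            return False
    return True


def min_dfao_size(prefix, k, reverse=False, max_states=3):
    """Smallest n <= max_states for which some n-state k-DFAO matches the prefix."""
    outputs = sorted(set(prefix))
    for n in range(1, max_states + 1):
        rows = list(product(range(n), repeat=k))  # one row of k targets per state
        for delta in product(rows, repeat=n):
            for tau in product(outputs, repeat=n):
                if consistent(delta, tau, prefix, k, reverse):
                    return n
    return None


thue_morse = [bin(i).count("1") % 2 for i in range(64)]
print(min_dfao_size(thue_morse, 2))                # 2
print(min_dfao_size(thue_morse, 2, reverse=True))  # 2
```

A SAT encoding scales far better than this enumeration, whose cost grows roughly like $n^{kn}$ times the number of output assignments.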

This paper is organized as follows. In Sect. 2 we give the basic definitions and a general lemma for proving lower bounds. In Sect. 3 we investigate the exponential gap between $\|a\|_k$ and $\|a\|_k^R$. In Sect. 4 we define the kernel of an automatic sequence and investigate its relationship with $\|a\|_k^R$. In Sect. 5 we present how to define automatic sequences as morphic sequences with respect to uniform morphisms, and investigate the relationship with $\|a\|_k$. In Sect. 6 we investigate the effect of basic operations like $\mathrm{tail}$ on $\|a\|_k$ and $\|a\|_k^R$. In Sect. 7 we give the upper bounds of $\|a\|_k$ and $\|a\|_k^R$ for periodic sequences. We conclude in Sect. 8.

Basic Definitions

Let Inline graphic and .

The set of infinite sequences $a = a_0 a_1 a_2 \cdots$ over a finite alphabet $\Gamma$ is denoted by $\Gamma^\omega$.

A DFA M with output (DFAO) is defined to be a tuple $M = (Q, \Sigma, \delta, q_0, \Gamma, \tau)$, where

$Q$ is the finite set of states,
$\Sigma$ is the finite input alphabet,
$\delta : Q \times \Sigma \to Q$ is the transition function,
$q_0 \in Q$ is the initial state,
$\Gamma$ is the finite output alphabet,
$\tau : Q \to \Gamma$ is the output function.

DFAOs are denoted by states and arrows just as is usual for DFAs; the extra information that $\tau(q) = x$ is denoted by writing q/x in the state q.

As in DFAs, $\delta$ extends to $\delta : Q \times \Sigma^* \to Q$ by $\delta(q, \epsilon) = q$, $\delta(q, ua) = \delta(\delta(q, u), a)$. A DFAO M defines a function $f_M : \Sigma^* \to \Gamma$ defined by $f_M(u) = \tau(\delta(q_0, u))$. A function $f : \Sigma^* \to \Gamma$ is called a finite state function if a DFAO M exists such that $f = f_M$. For every finite state function f there exists a unique (up to renaming of states) DFAO M with a minimal number of states such that $f = f_M$.

A DFAO of which the input alphabet $\Sigma$ is equal to $\{0, 1, \ldots, k-1\}$ is called a k-DFAO.

Every natural number n has a unique representation Inline graphic , where and

for Inline graphic . So and . Note that non-empty strings of which the leftmost symbol is 0 do not occur as for some number n.

Conversely, every Inline graphic represents a number :

For any Inline graphic and any string the reverse of u is defined by .
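A minimal sketch of the two conversions, assuming the usual convention suggested by the text: the most significant digit comes first and the number 0 is represented by the empty string, so that no representation starts with the digit 0. The names `to_base` and `from_base` are hypothetical.

```python
def to_base(n, k):
    """k-ary notation of n as a list of digits, most significant first."""
    digits = []
    while n > 0:
        digits.append(n % k)
        n //= k
    return digits[::-1]


def from_base(u, k):
    """Number represented by the digit string u; leading zeros are allowed."""
    value = 0
    for d in u:
        value = value * k + d
    return value


print(to_base(6, 2))              # [1, 1, 0]
print(to_base(6, 2)[::-1])        # reverse of the notation: [0, 1, 1]
print(from_base([0, 1, 1, 0], 2))  # 6
```

Every number has exactly one representation without leading zeros, while `from_base` accepts arbitrary digit strings, which is why the two maps are inverse in one direction only.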

An infinite sequence $a \in \Gamma^\omega$ is called k-automatic if a k-DFAO M exists such that $a_{[u]_k} = f_M(u)$ for all $u \in \{0, \ldots, k-1\}^*$. According to Theorem 5.2.1 from [1] a is k-automatic if and only if a k-DFAO M exists such that $a_i = f_M((i)_k)$ for all $i \in \mathbb{N}$. According to Theorem 5.2.3 from [1] a is k-automatic if and only if a k-DFAO M exists such that $a_i = f_M((i)_k^R)$ for all $i \in \mathbb{N}$.

Now we are ready to define the two natural measures $\|a\|_k$, $\|a\|_k^R$ for k-automatic sequences that we investigate in this paper.

Definition 1

For any k-automatic sequence $a \in \Gamma^\omega$ its size $\|a\|_k$ is defined to be the size of a smallest k-DFAO M such that $a_i = f_M((i)_k)$ for all $i \in \mathbb{N}$.

For any k-automatic sequence $a \in \Gamma^\omega$ its reversed size $\|a\|_k^R$ is defined to be the size of a smallest k-DFAO M such that $a_i = f_M((i)_k^R)$ for all $i \in \mathbb{N}$.

Conversely, every k-DFAO Inline graphic defines two infinite sequences and over :

for all Inline graphic . From the above definition it is immediate that and .

The Thue-Morse sequence $\mathbf{t} \in \{0, 1\}^\omega$ is defined by $\mathbf{t}_i = 0$ if the number of 1s in $(i)_2$ is even, and $\mathbf{t}_i = 1$ if the number of 1s in $(i)_2$ is odd, see, e.g., [1] Section 1.6, or OEIS A010060. We have $\|\mathbf{t}\|_2 = \|\mathbf{t}\|_2^R = 2$, both justified by the same DFAO.

[Figure: DFAO for the Thue-Morse sequence]
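The following sketch illustrates how one two-state DFAO produces the Thue-Morse sequence for both reading directions. The class `DFAO`, the state names `even` and `odd`, and the helper `digits` are illustrative choices, not notation from the paper.

```python
from dataclasses import dataclass


@dataclass
class DFAO:
    delta: dict   # (state, digit) -> state
    q0: object    # initial state
    tau: dict     # state -> output symbol

    def run(self, word):
        """Output of the automaton after reading the digit list `word`."""
        q = self.q0
        for d in word:
            q = self.delta[(q, d)]
        return self.tau[q]


def digits(i, k):
    """k-ary notation of i, most significant digit first; 0 gives the empty list."""
    out = []
    while i > 0:
        out.append(i % k)
        i //= k
    return out[::-1]


# The state records the parity of the number of 1s read so far.
thue_morse = DFAO(
    delta={("even", 0): "even", ("even", 1): "odd",
           ("odd", 0): "odd", ("odd", 1): "even"},
    q0="even",
    tau={"even": 0, "odd": 1},
)

forward = [thue_morse.run(digits(i, 2)) for i in range(16)]
backward = [thue_morse.run(digits(i, 2)[::-1]) for i in range(16)]
print(forward)   # [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
print(forward == backward)  # True
```

Since the parity of the number of 1s does not depend on the order of the digits, the same automaton serves for the normal and the reversed notation.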

The regular paper-folding sequence Inline graphic (or dragon curve sequence) is defined by for every for the unique representation , see, e.g., [1] Example 5.16, or OEIS A014577. We have , respectively justified by the following two DFAOs.

[Figure: two DFAOs for the regular paper-folding sequence]

The following lemma is the basic tool for lower bounds on $\|a\|_k$ and $\|a\|_k^R$.

Lemma 1

Let a be a k-automatic sequence, and Inline graphic such that for every there exists satisfying , then .

Proof

For the first claim let Inline graphic be a smallest k-DFAO such that for all . For define . For from the assumption we obtain , so . This shows , so .

The proof of the second claim is similar. □

The Exponential Gap

The following theorem shows that there can be an exponential gap between Inline graphic and , in both directions. Its proof is inspired by the folklore result that the language has an NFA of size , and its reverse has a DFA of size , but its smallest DFA has size at least . We found it in [8], Sect. 3.2, page 67, exercise 3. Many similar results on state complexity are known, e.g., in [7], it is proved that all values until Inline graphic can be reached as sizes.

Theorem 1

For every Inline graphic there exist k-automatic sequences a, b such that and , and and .

Proof

Define a by Inline graphic for , and if and only if the nth digit of is j, for , . The following DFAO satisfies by construction:

in which all unlabeled arrows are assumed to be labeled by all symbols Inline graphic . Since this DFAO has states we obtain .

For proving Inline graphic we apply Lemma 1. For define , so the numbers are exactly the numbers of k-ary length n, starting in a digit . For any two distinct such numbers and there is a position p on which they differ, so by choosing , the strings and differ in their n-th position. So the condition of Lemma 1 holds and we conclude Inline graphic .

Define b by Inline graphic for , and if and only if the nth element of is j, for , . A similar argument using the same automaton proves the claim for b.

The k-kernel

For Inline graphic we define by for all . So for we have and .

For an infinite sequence Inline graphic over we define its k-kernel to be the smallest set such that

,
for every and every we have .

We recall from [4], Prop. V.3.3, or [1], Theorem 6.6.2, that a is k-automatic if and only if Inline graphic is finite.

For a k-automatic sequence Inline graphic over the alphabet its k-kernel has a natural DFAO structure: the DFAO , where

the input alphabet is ,
is the set of states,
is defined by ,
a is the initial state,
the output alphabet is ,
the output function is defined by .

Recall that for Inline graphic we have and , so in the 0-steps describe and the 1-steps describe . For the 2-kernel exactly coincides with the DFAO given in Sect. 2, in which coincides with and coincides with the sequence obtained from by swapping symbols 0 and 1. For the 2-kernel exactly coincides with the given DFAO Inline graphic , in which coincides with , with , with and with .

The following theorem is straightforwardly proved by induction on i:

Theorem 2

For every k-automatic sequence Inline graphic and every we have where refer to .

As a consequence, by only giving the DFAO Inline graphic the sequence a is fully defined.
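The kernel DFAO can be explored mechanically. The sketch below makes several assumptions that are not taken from the text: a kernel element is represented by a pair (m, j) standing for the sequence $i \mapsto a_{k^m i + j}$, two kernel elements are identified when their first `window` elements agree (a heuristic that is only reliable when the window is long enough), and the output of a state is taken to be the first element of the corresponding sequence. The paper-folding sequence is computed from the recurrences listed in OEIS A014577.

```python
def thue_morse(i):
    return bin(i).count("1") % 2


def paper_folding(i):
    """Regular paper-folding sequence with offset 0 (OEIS A014577)."""
    n = i + 1
    while n % 2 == 0:
        n //= 2
    return 1 if n % 4 == 1 else 0


def kernel_dfao(a, k, window=64):
    """Return (start, delta, tau) of the k-kernel of a, explored via fingerprints."""
    def fingerprint(m, j):
        step = k ** m
        return tuple(a(step * i + j) for i in range(window))

    start = fingerprint(0, 0)
    representative = {start: (0, 0)}   # fingerprint -> one pair (m, j)
    delta = {}
    todo = [start]
    while todo:
        fp = todo.pop()
        m, j = representative[fp]
        for r in range(k):
            # Taking elements k*i + r of i -> a(k^m i + j) gives
            # i -> a(k^(m+1) i + r k^m + j).
            child = (m + 1, j + r * k ** m)
            child_fp = fingerprint(*child)
            if child_fp not in representative:
                representative[child_fp] = child
                todo.append(child_fp)
            delta[(fp, r)] = child_fp
    tau = {fp: fp[0] for fp in representative}
    return start, delta, tau


def run_reversed(dfao, i, k):
    """Feed the k-ary digits of i, least significant digit first."""
    start, delta, tau = dfao
    q = start
    while i > 0:
        q = delta[(q, i % k)]
        i //= k
    return tau[q]


tm_kernel = kernel_dfao(thue_morse, 2)
pf_kernel = kernel_dfao(paper_folding, 2)
print(len(tm_kernel[2]), len(pf_kernel[2]))  # 2 4
```

Reading the reversed notation of i in the resulting automaton reproduces the ith element, which is the content of Theorem 2 under the assumptions above.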

Theorem 3

The DFAO Inline graphic is the unique DFAO of minimal size such that for every .

Proof

Let Inline graphic . Combining Theorem 2 with the fact that for all yields for every . Assume it is not of minimal size with this property. Then there are two distinct states such that for all . Since are sequences over , applying Theorem 2 to and yield for all . But then are equal as sequences, contradicting that they are distinct. Inline graphic

Recall that Inline graphic is the minimal size |Q| for which a DFAO exists such that for every . We observe that a DFAO with this property does not need to be unique. For instance, for the DFAO is a minimal DFAO with this property, having two states a and , and , , . But the DFAO with the same two states a, b and Inline graphic , , produces the same sequence .

Next we observe that Inline graphic can be strictly smaller than , the size of the state space of . Define if the number of zeros in is odd, and if this number is even. Clearly it admits the following DFAO, in which as usual is denoted by q/x in the state q:

[Figure: DFAO for the sequence a]

Hence Inline graphic ; we obtain since the sequence contains both 0 and 1. However, , since is the following DFAO:

[Figure: the kernel DFAO of a]

The sequences a, b, c, d are as follows:

Observe that a and d differ only at the first position, and similarly for b and c. The next lemma states that this always occurs if the k-kernel of a has more than $\|a\|_k^R$ elements.

Lemma 2

Let a be an infinite sequence over Inline graphic with kernel . Let such that for all . Assume that for . Then

Proof

Let Inline graphic . For any define the numbers by ; this is possible since does not start in 0 since . For any we obtain by considering . Hence

Inline graphic

Theorem 4

Let a be a k-automatic sequence over an alphabet Inline graphic . Then

Moreover, if a is periodic then Inline graphic .

Proof

The inequality Inline graphic holds since the automaton satisfies for every . For the other inequality let be a DFAO of minimal size such that for every . For every choose such that . Define on by .

According to Lemma 2 Inline graphic implies that for all , so the difference between b and c may only be caused by . Hence every equivalence class of has at most elements, while the number of equivalence classes is . This proves .

In case a is periodic then all elements of Inline graphic are periodic too, and for all implies . Hence in that case all equivalence classes consist of a single element, proving .

Morphic Sequences

Recall that Inline graphic for the smallest being the set of states of a DFAO for which for every . Again this DFAO of minimal size is not unique: for the DFAO as given above also satisfies for all , but after changing to this property still holds, since never starts by 0.

Just like $\|a\|_k^R$ is strongly related to the kernel of a as described in Theorem 4, $\|a\|_k$ is strongly related to the number of symbols needed to describe a as a morphic sequence with respect to a k-uniform morphism. A sequence a over an alphabet $\Gamma$ is called morphic with respect to a morphism $h : \Delta^* \to \Delta^*$ and a coding $\tau : \Delta \to \Gamma$ if $a = \tau(h^\omega(x))$ for some $x \in \Delta$ satisfying $h(x) = xu$, by which $h^\omega(x)$ is a fixed point of h. The morphism is called k-uniform if the string $h(x)$ has length k for every $x \in \Delta$. It is well-known (Cobham [3], see also [1] Theorem 6.3.2) that a is k-automatic if and only if it is morphic with respect to a k-uniform morphism. For instance, $\mathbf{t} = h^\omega(0)$ for $h(0) = 01$, $h(1) = 10$, and Inline graphic for , , .
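As a small illustration of a fixed point of a uniform morphism, the sketch below iterates the 2-uniform morphism 0 → 01, 1 → 10 starting from 0 and compares the result with the parity-of-ones definition of the Thue-Morse sequence. The helper name `iterate_morphism` is hypothetical, and the coding is assumed to be the identity.

```python
def iterate_morphism(h, start, length):
    """Prefix of the fixed point of h that starts with `start`.

    Assumes h[start] begins with `start`, so each iterate extends the previous one.
    """
    word = [start]
    while len(word) < length:
        word = [symbol for s in word for symbol in h[s]]
    return word[:length]


h = {0: [0, 1], 1: [1, 0]}
prefix = iterate_morphism(h, 0, 16)
print(prefix)  # [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
```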

Theorem 5

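Let a be a k-automatic sequence. Let d(a) be the minimal size of the alphabet $\Delta$ such that $a = \tau(h^\omega(x))$ for a k-uniform morphism $h : \Delta^* \to \Delta^*$ and a coding $\tau : \Delta \to \Gamma$. Then $\|a\|_k \leq d(a) \leq \|a\|_k + 1$.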

Proof

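The k-DFAO $M = (\Delta, \{0, \ldots, k-1\}, \delta, x, \Gamma, \tau)$ with $\delta(q, i) = q_i$ for $q \in \Delta$ and $0 \leq i < k$, where we write $h(q) = q_0 q_1 \cdots q_{k-1}$, satisfies $a_i = f_M((i)_k)$ for all $i \in \mathbb{N}$, as is shown in the proof of Theorem 6.3.2 of [1]. As $\|a\|_k$ is the smallest size of a k-DFAO with this property we obtain $\|a\|_k \leq d(a)$.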

Conversely, if Inline graphic is a k-DFAO of size with for all , then by choosing a fresh state and defining , for , , for , , for , we obtain the k-DFAO of size with for all . Using the fact that we obtain for h defined by as is shown in the proof of Theorem 6.3.2 of [1]. Hence .
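The passage from a k-uniform morphism to a k-DFAO used in the first half of this proof can be sketched as follows: the states are the symbols of the morphism, and reading digit i in state q leads to the ith symbol of h(q). The example morphism a → ab, b → cb, c → ad, d → cd with coding a, b ↦ 1 and c, d ↦ 0 is a standard presentation of the regular paper-folding sequence from the literature; it is an assumption of this sketch and is not claimed to be the presentation used in this paper.

```python
def digits(i, k):
    """k-ary notation of i, most significant digit first; 0 gives the empty list."""
    out = []
    while i > 0:
        out.append(i % k)
        i //= k
    return out[::-1]


def dfao_from_morphism(h, coding, start):
    """DFAO (delta, q0, tau) with delta(q, i) = i-th symbol of h(q)."""
    delta = {(q, i): image[i] for q, image in h.items() for i in range(len(image))}
    return delta, start, dict(coding)


def run(dfao, word):
    delta, q0, tau = dfao
    q = q0
    for d in word:
        q = delta[(q, d)]
    return tau[q]


def paper_folding(i):
    """Regular paper-folding sequence with offset 0 (OEIS A014577)."""
    n = i + 1
    while n % 2 == 0:
        n //= 2
    return 1 if n % 4 == 1 else 0


h = {"a": "ab", "b": "cb", "c": "ad", "d": "cd"}
coding = {"a": 1, "b": 1, "c": 0, "d": 0}
m = dfao_from_morphism(h, coding, "a")
print([run(m, digits(i, 2)) for i in range(16)])
# [1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1]
```

The construction relies on h(a) starting with a, so that leading zeros do not change the state reached from the initial state.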

The Effect of Basic Operations

For any sequence $a \in \Gamma^\omega$ we define its tail $\mathrm{tail}(a)$ by $\mathrm{tail}(a)_i = a_{i+1}$ for all $i \in \mathbb{N}$.

Theorem 6

For any k-automatic sequence a we have Inline graphic and . For every there exists a k-automatic sequence a such that and .

Proof

For the first claim take a DFAO Inline graphic of size with for all . Let be the smallest number such that exists with . Introduce fresh states and define the DFAO by

By construction we have Inline graphic for all , . So by defining for and for we obtain

and

for all Inline graphic , . Since , and , and every number in is either of the shape or , this proves that is a DFAO for . Since this yields .

For the second claim take a DFAO Inline graphic of size with for all . Define the DFAO of size by

for all Inline graphic , . For every we have either or , for some , , . In the first case we have , in the second case . The DFAO has been constructed in such a way that and . Hence for all we have , proving the second claim.

As Inline graphic , for the last claim it suffices to prove . We define a by if the number of zeros in is divisible by n, and otherwise. A DFAO consisting of a single n-cycle easily produces a, so , and since a smaller one is not possible we obtain . Let , so for all . We prove by Lemma 1. Choose to be the numbers Inline graphic for . Let and for , then .

First we consider the case where Inline graphic and are distinct modulo n, choose r such that is divisible by n and is not. Choose . Then .

In the remaining case Inline graphic and are equal modulo n, and since we obtain that p and are distinct modulo n. Choose r such that is divisible by n and is not. Choose , then .

So the conditions of Lemma 1 hold, and Inline graphic .

For our examples Inline graphic and we have , , and .

For any sequence Inline graphic over , and the sequence is defined by and for all . The next theorem states that the effect of is similar to .

Theorem 7

For any k-automatic sequence a over Inline graphic , and we have and . For every there exists a k-automatic sequence a such that and .

Proof

Similar to the proof of Theorem 6, with the roles of the symbols 0 and Inline graphic swapped, exploiting the property for any string v and any .

For our examples Inline graphic and we have , , and .

Recall that for Inline graphic the operator on sequences a is defined by for all .

Theorem 8

For any k-automatic sequence a and Inline graphic we have and .

Proof

Let Inline graphic be a DFAO of size with for all . Define by for all . Then

for all Inline graphic , so is a DFAO of size producing , so .

For the other claim let Inline graphic be a DFAO of size with for all . Define . Then

for all Inline graphic , so is a DFAO of size producing , so .

For our examples Inline graphic and we have , , and .

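When applying an operator $f : \Gamma_1 \times \Gamma_2 \to \Gamma$ on two sequences $a \in \Gamma_1^\omega$, $b \in \Gamma_2^\omega$, by $f(a, b)$ we mean the sequence defined by $f(a, b)_i = f(a_i, b_i)$ for all $i \in \mathbb{N}$. For instance, $\wedge$ applied on boolean sequences denotes the elementwise conjunction of the two boolean sequences.

Theorem 9

For any two k-automatic sequences $a \in \Gamma_1^\omega$, $b \in \Gamma_2^\omega$ and every function $f : \Gamma_1 \times \Gamma_2 \to \Gamma$ we have $\|f(a, b)\|_k \leq \|a\|_k \cdot \|b\|_k$ and $\|f(a, b)\|_k^R \leq \|a\|_k^R \cdot \|b\|_k^R$.

Proof

Let $(Q_a, \{0, \ldots, k-1\}, \delta_a, q_a, \Gamma_1, \tau_a)$ be a DFAO of size $\|a\|_k$ with $a_i = \tau_a(\delta_a(q_a, (i)_k))$ for all $i \in \mathbb{N}$. Let $(Q_b, \{0, \ldots, k-1\}, \delta_b, q_b, \Gamma_2, \tau_b)$ be a DFAO of size $\|b\|_k$ with $b_i = \tau_b(\delta_b(q_b, (i)_k))$ for all $i \in \mathbb{N}$. Then for $\delta$, $\tau$ defined by $\delta((q, q'), j) = (\delta_a(q, j), \delta_b(q', j))$ and $\tau((q, q')) = f(\tau_a(q), \tau_b(q'))$ for all $q \in Q_a$, $q' \in Q_b$, $0 \leq j < k$, $(Q_a \times Q_b, \{0, \ldots, k-1\}, \delta, (q_a, q_b), \Gamma, \tau)$ is a DFAO of size $\|a\|_k \cdot \|b\|_k$ for f(a, b). The proof for the reversed version is similar.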
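The product construction of this proof can be sketched as follows, here applied to exclusive or of the Thue-Morse sequence and the regular paper-folding sequence. The concrete automata, the tuple layout `(delta, q0, tau)` and the function names are assumptions of this sketch; the four-state automaton for paper folding is a standard one from the literature.

```python
from itertools import product


def digits(i, k):
    """k-ary notation of i, most significant digit first; 0 gives the empty list."""
    out = []
    while i > 0:
        out.append(i % k)
        i //= k
    return out[::-1]


def run(dfao, word):
    delta, q0, tau = dfao
    q = q0
    for d in word:
        q = delta[(q, d)]
    return tau[q]


def product_dfao(m1, m2, f, k):
    """DFAO on pairs of states whose output is f applied to both outputs."""
    d1, q1, t1 = m1
    d2, q2, t2 = m2
    states = list(product(t1, t2))  # the keys of tau are the state sets
    delta = {((p, q), d): (d1[(p, d)], d2[(q, d)])
             for (p, q) in states for d in range(k)}
    tau = {(p, q): f(t1[p], t2[q]) for (p, q) in states}
    return delta, (q1, q2), tau


thue_morse = ({(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}, 0, {0: 0, 1: 1})
paper_folding = (
    {("a", 0): "a", ("a", 1): "b", ("b", 0): "c", ("b", 1): "b",
     ("c", 0): "a", ("c", 1): "d", ("d", 0): "c", ("d", 1): "d"},
    "a",
    {"a": 1, "b": 1, "c": 0, "d": 0},
)

xor = product_dfao(thue_morse, paper_folding, lambda x, y: x ^ y, 2)
print(len(xor[2]))  # 8 states: the product of 2 and 4
print([run(xor, digits(i, 2)) for i in range(8)])
```

The product automaton is an upper bound only; a minimal automaton for the combined sequence may well be smaller.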

Combining our examples Inline graphic and we have and .

Periodic Sequences

Theorem 10

Let Inline graphic be a periodic sequence with . Then and .

Proof

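Writing $a = (u_0 u_1 \cdots u_{n-1})^\omega$ we obtain $a_i = u_{i \bmod n}$ for all $i \in \mathbb{N}$. Define $M = (Q, \{0, \ldots, k-1\}, \delta, q_0, \Gamma, \tau)$ by $Q = \{0, \ldots, n-1\}$, $q_0 = 0$, $\delta(q, j) = (kq + j) \bmod n$, $\tau(q) = u_q$, for all $q \in Q$, $0 \leq j < k$. Then by induction on the length of $w$ one proves that $\delta(q_0, w) = [w]_k \bmod n$ for every $w \in \{0, \ldots, k-1\}^*$. Hence $a_i = f_M((i)_k)$ for all $i \in \mathbb{N}$, proving that $\|a\|_k \leq n$.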

For the other claim we prove that Inline graphic , then the result follows from Theorem 4. The states of are sequences b for which there are numbers q, j such that for all . We have to show that there are at most such sequences b. This follows from the fact that this only depends on the n values for and the at most values for Inline graphic . The latter follows since if k, n are relatively prime, then the values of are among the values , and otherwise there is some dividing both n and k, and the values are among the n/p multiples of p modulo n.
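The linear bound in the first part of this proof rests on an automaton that tracks the value of the input modulo the period. A minimal sketch, assuming the transition $\delta(q, j) = (kq + j) \bmod n$ and using the hypothetical name `periodic_dfao`:

```python
def digits(i, k):
    """k-ary notation of i, most significant digit first; 0 gives the empty list."""
    out = []
    while i > 0:
        out.append(i % k)
        i //= k
    return out[::-1]


def periodic_dfao(u, k):
    """n-state k-DFAO for the periodic sequence u u u ..., where n = len(u)."""
    n = len(u)
    # After reading a word w the state equals the value of w modulo n.
    delta = {(q, d): (q * k + d) % n for q in range(n) for d in range(k)}
    tau = {q: u[q] for q in range(n)}
    return delta, 0, tau


def run(dfao, word):
    delta, q0, tau = dfao
    q = q0
    for d in word:
        q = delta[(q, d)]
    return tau[q]


u = [0, 0, 1]
m = periodic_dfao(u, 2)
print([run(m, digits(i, 2)) for i in range(9)])  # [0, 0, 1, 0, 0, 1, 0, 0, 1]
```

The same automaton does not work for the reversed notation, which is why the reversed size requires the separate kernel argument.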

A natural question is for which cases the bounds of Theorem 10 can be reached, in particular the quadratic bound for Inline graphic . This question is beyond the scope of this paper, but has been addressed in [2]. A main result of [2] is that if is prime and 2 is a primitive root modulo n (on which Artin’s conjecture states that this holds for infinitely many primes), then for .

Conclusions

We investigated two natural complexity measures for a k-automatic sequence a: $\|a\|_k$, closely related to the alphabet size required to present a as a morphic sequence with respect to a k-uniform morphism, and $\|a\|_k^R$, closely related to the size of the kernel of a. We saw how there can be an exponential gap between $\|a\|_k$ and $\|a\|_k^R$, but basic operations like $\mathrm{tail}$, adding an element in front, or applying a binary operator elementwise, never increase $\|a\|_k$ or $\|a\|_k^R$ by more than a quadratic factor. Many other operations, like changing the tenth element of a sequence, can be obtained by combining such basic operations, and hence yield a polynomial upper bound too. Probably these polynomial bounds can be improved strongly. Other open questions include a further investigation of when these upper bounds can be reached. Conversely, our SAT-based tool provides values that are likely to be exact, but formally are only lower bounds. It would make sense to further investigate how to be sure to have the exact value, either depending on particular ways to define automatic sequences, or by giving general criteria for exactness depending on known upper bounds.

On periodic sequences this paper only contains some very basic observations; more involved observations are given in [2].

We want to thank Wieb Bosma for fruitful collaboration on this topic and careful proofreading. We want to thank Jeffrey Shallit for giving pointers to state complexity.

Contributor Information

Alberto Leporati, Email: alberto.leporati@unimib.it.

Carlos Martín-Vide, Email: carlos.martin@urv.cat.

Dana Shapira, Email: shapird@g.ariel.ac.il.

Claudio Zandron, Email: zandron@disco.unimib.it.

Hans Zantema, Email: h.zantema@tue.nl.

References

1. Allouche JP, Shallit J. Automatic Sequences: Theory, Applications, Generalizations. Cambridge: Cambridge University Press; 2003.
2. Bosma W. Complexity of periodic sequences (2019). https://www.math.ru.nl/~bosma/pubs/periodic.pdf
3. Cobham A. Uniform tag sequences. Math. Systems Theory. 1972;6:164–192. doi: 10.1007/BF01706087.
4. Eilenberg S. Automata, Languages and Machines. New York: Academic Press; 1974.
5. Endrullis J, Grabmayer C, Hendriks D. Mix-automatic sequences. In: Dediu A-H, Martín-Vide C, Truthe B, editors. Language and Automata Theory and Applications; Heidelberg: Springer; 2013. pp. 262–274.
6. Gill A. Introduction to the Theory of Finite-State Machines. New York: McGraw-Hill; 1962.
7. Jiraskova G. The ranges of state complexities for complement, star and reversal of regular languages. Int. J. Found. Comput. Sci. 2014;25(1):101–124. doi: 10.1142/S0129054114500063.
8. Lawson MV. Finite Automata. Boca Raton: Chapman and Hall/CRC; 2004.
9. Shallit J. Decidability and enumeration for automatic sequences: a survey. In: Bulatov AA, Shur AM, editors. Computer Science – Theory and Applications; Heidelberg: Springer; 2013. pp. 49–63.
