Abstract
Automatic sequences can be defined by DFAs with output (DFAO) in two natural ways. We propose to consider the minimal size of a corresponding DFAO as the complexity measure of the automatic sequence, for both variants. This paper compares these complexity measures and investigates their properties like the relationships with kernel and morphic sequences. There exist automatic sequences for which the one complexity is exponentially greater than the other one, in both directions. For both complexity measures we investigate the effect of taking basic operations on sequences like removing or adding an element in front, and observe that these operations may increase the complexity by at most a quadratic factor.
Introduction
Automatic sequences form an important class of infinite sequences over a finite alphabet; roughly speaking it is a first regular class going beyond ultimately periodic sequences. They have been extensively studied, in particular in the book [1] that serves as the main reference for research in this area. More recent references on the topic include [5, 9].
Automatic sequences depend on a base
, with special interest for
. Two well-known 2-automatic sequences are the Thue-Morse sequence and the regular paper folding sequence, to be defined in Sect. 2. Automatic sequences admit several equivalent characterizations, many of which are closely related to the following two. In the first one the ith element
of the sequence a is the output of a DFAO when taking as input the k-ary notation of i. The second one is similar, but then the reverse of the k-ary notation of i is taken as input. It is natural to consider the minimal size of a corresponding DFAO as the complexity measure of the automatic sequence, for both variants, and we denote them by
and
. These complexity measures are the main topic of this paper. We show how they relate to other characterizations; in particular,
is closely related to the size of the kernel of a, and
is closely related to the size of the smallest alphabet needed to describe a as a morphic sequence with respect to a k-uniform morphism. In doing so, we follow constructions as presented in [1] for which we investigate the precise effect on the measures
and
.
A first result states that there is an exponential gap between both measures: there exist sequences of automatic sequences a, b for which
is exponential in
, and
is exponential in
.
A next natural question is about the effect of taking basic operations on sequences. For instance, for any sequence a its tail
is obtained by removing its first element. We show that
and
for all k-automatic sequences, and that the last inequality is sharp. Similar results hold for adding an element in front rather than removing one. Other operations are considered as well, such as combining two sequences pointwise and taking particular subsequences. For all of these basic operations f, the main observation is that the size increases at most quadratically:
and
for all a.
Another interesting question is what happens for periodic sequences. In the current paper we only derive a quadratic upper bound for
and a linear upper bound for
, so opposite to the effect of
. Whether and when these upper bounds are reached is a much more involved question that is investigated in [2]. The research on this topic is a joint project of Wieb Bosma and the current author; since the analysis for periodic sequences requires arguments of a completely different combinatorial flavor than the automata-based arguments in this paper, we decided to present the current paper and [2] separately.
Throughout the paper we make several claims about the exact values of
and
for particular sequences a. To compute these values we wrote a program to search for a DFAO of minimal size n having the corresponding property for
for all
for N being typically around
. This was done by expressing the requirements as a satisfiability problem and then calling a SAT solver. The smallest n for which the formula is satisfiable is then reported. As only the requirements for
are checked, this only yields a lower bound, but for N large enough it gives the exact value. According to [6], Corollary 3.1 (page 59), two states in a DFAO of n states are equivalent if and only if for every string of length
they produce the same output. This can be improved to
. Applying this to the union of the found automaton and the real automaton, with bounds derived in this paper, shows that for
the exact value is obtained.
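The idea of searching for a smallest DFAO that agrees with a finite prefix can be tried out on a very small scale. The following Python sketch is not the tool described above: it replaces the SAT encoding by plain enumeration of transition tables, so it is only usable for a handful of states. The names `digits_msd` and `min_dfao_states` are hypothetical, and the sketch assumes that 0 is represented by the empty string. As in the text, checking only a finite prefix gives a lower bound on the true size.

```python
from itertools import product


def digits_msd(n, k):
    """Base-k digits of n, most significant first; 0 gives the empty list."""
    digits = []
    while n > 0:
        digits.append(n % k)
        n //= k
    return digits[::-1]


def min_dfao_states(prefix, k, reverse=False, max_states=3):
    """Smallest number of states of a k-DFAO consistent with `prefix`.

    Only the first len(prefix) elements are checked, so the result is a
    lower bound on the true size. Returns None if max_states is too small.
    """
    words = [digits_msd(n, k) for n in range(len(prefix))]
    if reverse:
        words = [w[::-1] for w in words]
    for size in range(1, max_states + 1):
        # A transition table is a flat tuple: delta(q, d) = table[q * k + d].
        for table in product(range(size), repeat=size * k):
            output = {}  # output symbol forced on each reached state
            consistent = True
            for word, symbol in zip(words, prefix):
                q = 0  # state 0 is the initial state
                for d in word:
                    q = table[q * k + d]
                if output.setdefault(q, symbol) != symbol:
                    consistent = False
                    break
            if consistent:
                return size
    return None


thue_morse = [bin(n).count("1") % 2 for n in range(64)]
print(min_dfao_states(thue_morse, 2))                # 2
print(min_dfao_states(thue_morse, 2, reverse=True))  # 2
```

The output function does not have to be enumerated: a transition table is acceptable exactly when no state is reached by two indices carrying different symbols.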
This paper is organized as follows. In Sect. 2 we give the basic definitions and a general lemma for proving lower bounds. In Sect. 3 we investigate the exponential gap between
and
. In Sect. 4 we define the kernel of an automatic sequence and investigate its relationship with
. In Sect. 5 we present how to define automatic sequences as morphic sequences with respect to uniform morphisms, and investigate the relationship with
. In Sect. 6 we investigate the effect of basic operations like
on
and
. In Sect. 7 we give the upper bounds of
and
for periodic sequences. We conclude in Sect. 8.
Basic Definitions
Let
and
.
The set of infinite sequences
over a finite alphabet
is denoted by
.
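A DFA M with output (DFAO) is defined to be a tuple
$M = (Q, \Sigma, \delta, q_0, \Gamma, \lambda)$, where
$Q$ is the finite set of states,
$\Sigma$ is the finite input alphabet,
$\delta : Q \times \Sigma \to Q$ is the transition function,
$q_0 \in Q$ is the initial state,
$\Gamma$ is the finite output alphabet,
$\lambda : Q \to \Gamma$ is the output function.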
DFAOs are denoted by states and arrows just as is usual for DFAs; the extra information that
is denoted by writing q/x in the state q.
As in DFAs,
extends to
by
,
. A DFAO M defines a function
defined by
. A function
is called a finite state function if a DFAO M exists such that
. For every finite state function f there exists a unique (up to renaming of states) DFAO M with a minimal number of states such that
.
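As an illustration of these definitions, a DFAO and the function it defines can be written down directly. The sketch below is one possible Python encoding; the class name `DFAO`, its field names, and the dictionary representation of the transition function are choices made here and are not notation from the paper.

```python
from dataclasses import dataclass


@dataclass
class DFAO:
    delta: dict    # maps (state, input symbol) to the next state
    initial: str   # the initial state
    output: dict   # maps each state to its output symbol

    def run(self, state, word):
        """Extended transition function: follow `word` symbol by symbol."""
        for symbol in word:
            state = self.delta[(state, symbol)]
        return state

    def f(self, word):
        """Function defined by the DFAO: the output of the state reached."""
        return self.output[self.run(self.initial, word)]


# Two-state example over the input alphabet {0, 1}: parity of the 1s read.
parity = DFAO(
    delta={("even", 0): "even", ("even", 1): "odd",
           ("odd", 0): "odd", ("odd", 1): "even"},
    initial="even",
    output={"even": 0, "odd": 1},
)
print(parity.f([1, 0, 1, 1]))  # 1
print(parity.f([]))            # 0
```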
A DFAO of which the input alphabet
is equal to
, is called a k-DFAO.
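Every natural number n has a unique representation $(n)_k = a_{m-1} a_{m-2} \cdots a_0$ in base k without leading zeros, where $a_i \in \{0, \ldots, k-1\}$ and
$$n = \sum_{i=0}^{m-1} a_i k^i$$
for $m \geq 0$, with $a_{m-1} \neq 0$ if $m > 0$. So $(0)_k = \epsilon$, the empty string. Note that non-empty strings of which the leftmost symbol is 0 do not occur as $(n)_k$ for any number n.
Conversely, every string $u = u_{m-1} u_{m-2} \cdots u_0$ over $\{0, \ldots, k-1\}$ represents a number $[u]_k$:
$$[u]_k = \sum_{i=0}^{m-1} u_i k^i$$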
For any alphabet $\Sigma$ and any string $u = u_1 u_2 \cdots u_m \in \Sigma^*$ the reverse $u^R$ of u is defined by $u^R = u_m u_{m-1} \cdots u_1$.
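The k-ary notation, the number represented by a digit string, and the reverse of a string can be sketched in a few lines of Python. The function names `to_digits`, `to_number` and `reverse` are hypothetical, and the sketch assumes, as in [1], that 0 is represented by the empty string and that the leftmost digit is the most significant one.

```python
def to_digits(n, k):
    """Digits of n in base k, most significant first, no leading zeros."""
    digits = []
    while n > 0:
        digits.append(n % k)
        n //= k
    return digits[::-1]  # the number 0 is represented by the empty list


def to_number(digits, k):
    """Value of a digit string in base k; leading zeros are allowed."""
    value = 0
    for d in digits:
        value = value * k + d
    return value


def reverse(word):
    """The same symbols in opposite order."""
    return word[::-1]


print(to_digits(11, 2))            # [1, 0, 1, 1]
print(reverse(to_digits(11, 2)))   # [1, 1, 0, 1]
print(to_number([0, 0, 1, 1], 2))  # 3
print(to_digits(0, 2))             # []
```

Every number is recovered from its own notation, while distinct strings such as `[1, 1]` and `[0, 0, 1, 1]` may represent the same number.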
An infinite sequence
is called k-automatic if a k-DFAO
exists such that
for all
. According to Theorem 5.2.1 from [1] a is k-automatic if and only if a k-DFAO
exists such that
for all
. According to Theorem 5.2.3 from [1] a is k-automatic if and only if a k-DFAO
exists such that
for all
.
Now we are ready to define the two natural measures
,
for k-automatic sequences that we investigate in this paper.
Definition 1
For any k-automatic sequence
its size
is defined to be the size of a smallest k-DFAO
such that
for all
.
For any k-automatic sequence
its reversed size
is defined to be the size of a smallest k-DFAO
such that
for all
.
Conversely, every k-DFAO
defines two infinite sequences
and
over
:
![]() |
for all
. From the above definition it is immediate that
and
.
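The Thue-Morse sequence $t = 0110100110010110\cdots$ is defined by $t_n = 0$ if the number of 1s in the binary notation $(n)_2$ is even, and $t_n = 1$ if the number of 1s in $(n)_2$ is odd, see, e.g., [1] Section 1.6, or OEIS A010060. Its size and its reversed size are both equal to 2, both justified by the DFAO on the right.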
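The definition of the Thue-Morse sequence can be compared with a two-state automaton in a short Python sketch. Since the figure is not reproduced here, the automaton below is assumed to be the usual one: two states recording the parity of the number of 1s read, with state 0 initial. Because this parity does not depend on the order of the digits, the same automaton works when the binary notation is read in reverse. All function names are hypothetical.

```python
def binary_digits(n):
    """Binary digits of n, most significant first; 0 gives the empty list."""
    return [int(c) for c in bin(n)[2:]] if n > 0 else []


def thue_morse_by_definition(n):
    """0 if the number of 1s in the binary notation of n is even, else 1."""
    return binary_digits(n).count(1) % 2


def thue_morse_by_dfao(n, reverse=False):
    """Run the two-state parity automaton on the binary notation of n."""
    word = binary_digits(n)
    if reverse:
        word = word[::-1]
    state = 0                  # initial state, with output 0
    for digit in word:
        if digit == 1:         # reading 1 switches state, reading 0 does not
            state = 1 - state
    return state               # the output of state q is q itself


prefix = [thue_morse_by_definition(n) for n in range(16)]
print(prefix)  # [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
```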
The regular paper-folding sequence
(or dragon curve sequence) is defined by
for every
for the unique representation
, see, e.g., [1] Example 5.16., or OEIS A014577. We have
, respectively justified by the following two DFAOs.
The following lemma is the basic tool for lower bounds on
and
.
Lemma 1
Let a be a k-automatic sequence, and
such that for every
there exists
satisfying
, then
.
Let a be a k-automatic sequence, and
such that for every
there exists
satisfying
, then
.
Proof
For the first claim let
be a smallest k-DFAO such that
for all
. For
define
. For
from the assumption we obtain
, so
. This shows
, so
.
The proof of the second claim is similar. 
The Exponential Gap
The following theorem shows that there can be an exponential gap between
and
, in both directions. Its proof is inspired by the folklore result that the language
has an NFA of size
, and its reverse has a DFA of size
, but its smallest DFA has size at least
. We found it in [8], Sect. 3.2, page 67, Exercise 3. Many similar results on state complexity are known; e.g., in [7] it is proved that all values up to
can be reached as sizes.
Theorem 1
For every
there exist k-automatic sequences a, b such that
and
, and
and
.
Proof
Define a by
for
, and
if and only if the nth digit of
is j, for
,
. The following DFAO satisfies
by construction:
in which all unlabeled arrows are assumed to be labeled by all symbols
. Since this DFAO has
states we obtain
.
For proving
we apply Lemma 1. For
define
, so the numbers
are exactly the numbers of k-ary length n, starting in a digit
. For any two distinct such numbers
and
there is a position p on which they differ, so by choosing
, the strings
and
differ in their n-th position. So the condition of Lemma 1 holds and we conclude
.
Define b by
for
, and
if and only if the nth element of
is j, for
,
. A similar argument using the same automaton proves the claim for b. 
The k-kernel
For
we define
by
for all
. So for
we have
and
.
For an infinite sequence
over
we define its k-kernel
to be the smallest set
such that
, and for every
and every
we have
.
We recall from [4], Prop. V.3.3, or [1], Theorem 6.6.2, that a is k-automatic if and only if
is finite.
For a k-automatic sequence
over the alphabet
its k-kernel
has a natural DFAO structure: the DFAO
, where
the input alphabet is
,
is the set of states,
is defined by
, a is the initial state,
the output alphabet is
, the output function
is defined by
.
Recall that for
we have
and
, so in
the 0-steps describe
and the 1-steps describe
. For
the 2-kernel exactly coincides with the DFAO
given in Sect. 2, in which
coincides with
and
coincides with the sequence obtained from
by swapping symbols 0 and 1. For
the 2-kernel exactly coincides with the given DFAO
, in which
coincides with
,
with
,
with
and
with
.
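The kernel construction can be explored numerically. The Python sketch below closes a sequence under the k operators that map a sequence b to the subsequence with elements $b_{kn+i}$, and identifies two sequences when their first `length` elements agree. This comparison of finite prefixes is an assumption of the sketch: it can merge sequences that differ later, so the number of elements found is only a lower bound on the kernel size. The names `kernel_by_prefix` and `thue_morse` are hypothetical.

```python
def thue_morse(n):
    """Parity of the number of 1s in the binary notation of n."""
    return bin(n).count("1") % 2


def kernel_by_prefix(a, k, length=32):
    """Close {a} under b -> (b_{k n + i})_n, comparing prefixes only.

    Each element found is the subsequence n -> a(step * n + offset).
    Returns a dict mapping a prefix to (step, offset) of one representative.
    """
    found = {}
    todo = [(1, 0)]  # the sequence a itself
    while todo:
        step, offset = todo.pop()
        prefix = tuple(a(step * n + offset) for n in range(length))
        if prefix in found:
            continue
        found[prefix] = (step, offset)
        for i in range(k):
            # If b_n = a(step*n + offset), then
            # b_{k n + i} = a(step*k*n + step*i + offset).
            todo.append((step * k, step * i + offset))
    return found


kernel = kernel_by_prefix(thue_morse, 2)
print(len(kernel))  # 2
```

For the Thue-Morse sequence the two elements found are the sequence itself and the sequence with the symbols 0 and 1 swapped, in line with the description above.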
The following theorem is straightforwardly proved by induction on i:
Theorem 2
For every k-automatic sequence
and every
we have
where
refer to
.
As a consequence, by only giving the DFAO
the sequence a is fully defined.
Theorem 3
The DFAO
is the unique DFAO of minimal size such that
for every
.
Proof
Let
. Combining Theorem 2 with the fact that
for all
yields
for every
. Assume it is not of minimal size with this property. Then there are two distinct states
such that
for all
. Since
are sequences over
, applying Theorem 2 to
and
yields
for all
. But then
are equal as sequences, contradicting that they are distinct. 
Recall that
is the minimal size |Q| for which a DFAO
exists such that
for every
. We observe that a DFAO with this property does not need to be unique. For instance, for
the DFAO
is a minimal DFAO with this property, having two states a and
, and
,
,
. But the DFAO with the same two states a, b and
,
,
produces the same sequence
.
Next we observe that
can be strictly smaller than
, the size of the state space of
. Define
if the number of zeros in
is odd, and
if this number is even. Clearly it admits the following DFAO, in which as usual
is denoted by q/x in the state q:
Hence
; we obtain
since the sequence contains both 0 and 1. However,
, since
is the following DFAO:
The sequences a, b, c, d are as follows:
![]() |
![]() |
Observe that a and d differ only at the first position, and similarly for b and c. The next lemma states that this always occurs if
is greater than
.
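The example above can be replayed on finite prefixes. In the following Python sketch the convention that the value is 1 when the number of zeros is odd is an assumption, since the kernel size does not depend on how the two symbols are named. The labels b, c, d are chosen here as b = even(a), c = odd(b), d = even(c); whether they match the labels in the figure is also an assumption. The helper names are hypothetical.

```python
def zero_parity(n):
    """1 if the binary notation of n (empty for n = 0) has an odd number of zeros."""
    return bin(n)[2:].count("0") % 2 if n > 0 else 0


def even(b):
    """The subsequence of the elements at even positions."""
    return lambda n: b(2 * n)


def odd(b):
    """The subsequence of the elements at odd positions."""
    return lambda n: b(2 * n + 1)


def prefix(b, length=64):
    return [b(n) for n in range(length)]


a = zero_parity
b = even(a)
c = odd(b)
d = even(c)

# The four sequences are pairwise distinct and closed under even and odd.
assert len({tuple(prefix(s)) for s in (a, b, c, d)}) == 4
assert prefix(odd(a)) == prefix(a) and prefix(even(b)) == prefix(a)
assert prefix(odd(c)) == prefix(c) and prefix(even(d)) == prefix(c)
assert prefix(odd(d)) == prefix(a)

# a and d agree everywhere except at position 0, and likewise b and c.
assert prefix(a)[1:] == prefix(d)[1:] and a(0) != d(0)
assert prefix(b)[1:] == prefix(c)[1:] and b(0) != c(0)

print(prefix(a, 8))  # [0, 0, 1, 0, 0, 1, 1, 0]
```

On the checked prefixes the kernel thus has four elements, although a two-state automaton that switches state on every 0 already produces the sequence.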
Lemma 2
Let a be an infinite sequence over
with kernel
. Let
such that
for all
. Assume that
for
. Then
![]() |
Proof
Let
. For any
define the numbers
by
; this is possible since
does not start in 0 since
. For any
we obtain
by considering
. Hence
![]() |
![]() |

Theorem 4
Let a be a k-automatic sequence over an alphabet
. Then
![]() |
Moreover, if a is periodic then
.
Proof
The inequality
holds since the automaton
satisfies
for every
. For the other inequality let
be a DFAO of minimal size
such that
for every
. For every
choose
such that
. Define
on
by
.
According to Lemma 2
implies that
for all
, so the difference between b and c may only be caused by
. Hence every equivalence class of
has at most
elements, while the number of equivalence classes is
. This proves
.
In case a is periodic then all elements of
are periodic too, and
for all
implies
. Hence in that case all equivalence classes consist of a single element, proving
. 
Morphic Sequences
Recall that
for the smallest
being the set of states of a DFAO
for which
for every
. Again this DFAO of minimal size is not unique: for
the DFAO
as given above also satisfies
for all
, but after changing
to
this property still holds, since
never starts with 0.
Just like
is strongly related to the kernel of a as described in Theorem 4,
is strongly related to the number of symbols needed to describe a as a morphic sequence with respect to a k-uniform morphism. A sequence a over an alphabet
is called morphic with respect to a morphism
and a coding
if
for some
satisfying
, by which
is a fixed point of h. The morphism
is called k-uniform if the string
has length k for every
. It is well-known (Cobham [3], see also [1] Theorem 6.3.2) that a is k-automatic if and only if it is morphic with respect to a k-uniform morphism. For instance,
for
, and
for
,
,
.
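A prefix of a morphic sequence can be computed by iterating the morphism and then applying the coding. The Python sketch below does this for the well-known 2-uniform morphism 0 → 01, 1 → 10, whose fixed point starting in 0 is the Thue-Morse sequence; the identity coding is used. The function name `fixed_point_prefix` and the dictionary encoding of the morphism are choices made here for illustration.

```python
def fixed_point_prefix(h, start, length):
    """Prefix of the fixed point of h that starts with `start`.

    Requires that h[start] begins with `start` and has length at least 2,
    so that each application of h extends the previous word.
    """
    assert h[start][0] == start and len(h[start]) >= 2
    word = [start]
    while len(word) < length:
        word = [y for x in word for y in h[x]]  # apply h to every symbol
    return word[:length]


h = {0: [0, 1], 1: [1, 0]}  # a 2-uniform morphism
coding = {0: 0, 1: 1}       # identity coding

sequence = [coding[x] for x in fixed_point_prefix(h, 0, 16)]
print(sequence)  # [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0]
```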
Theorem 5
Let a be a k-automatic sequence. Let d(a) be the minimal size of the alphabet
such that
for a k-uniform morphism
and a coding
. Then
.
Proof
The k-DFAO
with
and
, where we write
, satisfies
for all
as is shown in the proof of Theorem 6.3.2 of [1]. As
is the smallest size of a k-DFAO with this property we obtain
.
Conversely, if
is a k-DFAO of size
with
for all
, then by choosing a fresh state
and defining
,
for
,
,
for
,
,
for
, we obtain the k-DFAO
of size
with
for all
. Using the fact that
we obtain
for h defined by
as is shown in the proof of Theorem 6.3.2 of [1]. Hence
. 
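The first half of the proof turns a k-uniform morphism into a k-DFAO whose states are the letters of the alphabet. A minimal Python sketch of this direction follows, assuming the standard construction from [1]: reading digit i in state q leads to the i-th letter of h(q), and the coding serves as output function. The example uses the period-doubling morphism 0 → 01, 1 → 00 with the identity coding; all function names are hypothetical.

```python
def digits_msd(n, k):
    """Base-k digits of n, most significant first; 0 gives the empty list."""
    digits = []
    while n > 0:
        digits.append(n % k)
        n //= k
    return digits[::-1]


def dfao_from_morphism(h, k):
    """Transition function with delta(q, i) = i-th letter of h(q)."""
    return {(q, i): h[q][i] for q in h for i in range(k)}


def nth_symbol(delta, start, coding, n, k):
    """Output of the DFAO after reading the base-k notation of n."""
    q = start
    for d in digits_msd(n, k):
        q = delta[(q, d)]
    return coding[q]


def iterate(h, start, length):
    """Prefix of the fixed point of h, obtained by repeated application."""
    word = [start]
    while len(word) < length:
        word = [y for x in word for y in h[x]]
    return word[:length]


h = {0: [0, 1], 1: [0, 0]}  # period-doubling morphism, 2-uniform
coding = {0: 0, 1: 1}
delta = dfao_from_morphism(h, 2)

by_dfao = [nth_symbol(delta, 0, coding, n, 2) for n in range(8)]
print(by_dfao)  # [0, 1, 0, 0, 0, 1, 0, 1]
```

The number of states equals the size of the alphabet of the morphism, which is the content of the first inequality in the proof.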
The Effect of Basic Operations
For any sequence
we define its tail
by
for all
.
Theorem 6
For any k-automatic sequence a we have
and
. For every
there exists a k-automatic sequence a such that
and
.
Proof
For the first claim take a DFAO
of size
with
for all
. Let
be the smallest number
such that
exists with
. Introduce fresh states
and define the DFAO
by
![]() |
![]() |
![]() |
![]() |
By construction we have
for all
,
. So by defining
for
and
for
we obtain
![]() |
and
![]() |
for all
,
. Since
, and
, and every number in
is either of the shape
or
, this proves that
is a DFAO for
. Since
this yields
.
For the second claim take a DFAO
of size
with
for all
. Define the DFAO
of size
by
![]() |
![]() |
![]() |
for all
,
. For every
we have either
or
, for some
,
,
. In the first case we have
, in the second case
. The DFAO
has been constructed in such a way that
and
. Hence for all
we have
, proving the second claim.
As
, for the last claim it suffices to prove
. We define a by
if the number of zeros in
is divisible by n, and
otherwise. A DFAO consisting of a single n-cycle easily produces a, so
, and since a smaller one is not possible we obtain
. Let
, so
for all
. We prove
by Lemma 1. Choose
to be the numbers
for
. Let
and
for
, then
.
First we consider the case where
and
are distinct modulo n, choose r such that
is divisible by n and
is not. Choose
. Then
.
In the remaining case
and
are equal modulo n, and since
we obtain that p and
are distinct modulo n. Choose r such that
is divisible by n and
is not. Choose
, then
.
So the conditions of Lemma 1 hold, and
. 
For our examples
and
we have
,
,
and
.
For any sequence
over
, and
the sequence
is defined by
and
for all
. The next theorem states that the effect of
is similar to
.
Theorem 7
For any k-automatic sequence a over
, and
we have
and
. For every
there exists a k-automatic sequence a such that
and
.
Proof
Similar to the proof of Theorem 6, with the roles of the symbols 0 and
swapped, exploiting the property
for any string v and any
. 
For our examples
and
we have
,
,
and
.
Recall that for
the operator
on sequences a is defined by
for all
.
Theorem 8
For any k-automatic sequence a and
we have
and
.
Proof
Let
be a DFAO of size
with
for all
. Define
by
for all
. Then
![]() |
for all
, so
is a DFAO of size
producing
, so
.
For the other claim let
be a DFAO of size
with
for all
. Define
. Then
![]() |
for all
, so
is a DFAO of size
producing
, so
. 
For our examples
and
we have
,
,
and
.
When applying an operator
on two sequences
,
, by
we mean the sequence defined by
for all
. For instance,
applied on boolean sequences denotes the elementwise conjunction of the two boolean sequences.
Theorem 9
For any two k-automatic sequences
,
and every function
we have
and
.
Proof
Let
be a DFAO of size
with
for all
. Let
be a DFAO of size
with
for all
. Then
for
defined by
and
for all
, is a DFAO of size
for f(a, b). The proof for the reversed version is similar. 
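The product construction used in this proof is short enough to sketch in Python. A DFAO is encoded here as a triple of a transition dictionary, an initial state and an output dictionary; this encoding and the names `run` and `product` are assumptions of the sketch. The example combines the Thue-Morse automaton with a three-state automaton for the indicator of the multiples of 3, both reading the binary notation with the most significant digit first.

```python
def run(dfao, n, k):
    """Output of `dfao` on the base-k notation of n, most significant digit first."""
    delta, state, output = dfao
    digits = []
    while n > 0:
        digits.append(n % k)
        n //= k
    for d in reversed(digits):
        state = delta[(state, d)]
    return output[state]


def product(m1, m2, f, k):
    """Product DFAO whose output is f applied to the two component outputs."""
    (d1, q1, o1), (d2, q2, o2) = m1, m2
    delta = {((p, q), d): (d1[(p, d)], d2[(q, d)])
             for p in o1 for q in o2 for d in range(k)}
    output = {(p, q): f(o1[p], o2[q]) for p in o1 for q in o2}
    return delta, (q1, q2), output


# Thue-Morse: the state is the parity of the number of 1s read so far.
tm = ({(q, d): q ^ d for q in (0, 1) for d in (0, 1)}, 0, {0: 0, 1: 1})
# Multiples of 3: the state is the value read so far modulo 3.
mult3 = ({(q, d): (2 * q + d) % 3 for q in range(3) for d in (0, 1)},
         0, {0: 1, 1: 0, 2: 0})

both = product(tm, mult3, lambda x, y: x & y, 2)
print(len(both[2]))       # 6
print(run(both, 21, 2))   # 1
```

The product has as many states as the product of the two sizes; some of them may be unreachable or equivalent, so the minimal DFAO can be smaller.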
Combining our examples
and
we have
and
.
Periodic Sequences
Theorem 10
Let
be a periodic sequence with
. Then
and
.
Proof
Writing
we obtain
for all
. Define
by
,
,
,
, for all
. Then by induction on the length of
one proves that
for every
. Hence
for all
, proving that
.
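The automaton of this first part of the proof can be sketched in Python. The sketch assumes the natural choice of transition function for a sequence of period n, namely that reading digit i in state q leads to state (q·k + i) mod n, so that the state reached after reading a string is the number it represents modulo n. The output of state q is the q-th element of the period. The function names are hypothetical.

```python
def periodic_dfao(period, k):
    """DFAO with len(period) states for the periodic sequence period^omega."""
    n = len(period)
    delta = {(q, d): (q * k + d) % n for q in range(n) for d in range(k)}
    output = dict(enumerate(period))
    return delta, 0, output


def element(dfao, i, k):
    """Output after reading the base-k notation of i, most significant digit first."""
    delta, state, output = dfao
    digits = []
    while i > 0:
        digits.append(i % k)
        i //= k
    for d in reversed(digits):
        state = delta[(state, d)]
    return output[state]


period = [0, 0, 1, 0, 1]
dfao = periodic_dfao(period, 2)
assert all(element(dfao, i, 2) == period[i % 5] for i in range(1000))
print(len(dfao[2]))  # 5
```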
For the other claim we prove that
, then the result follows from Theorem 4. The states of
are sequences b for which there are numbers q, j such that
for all
. We have to show that there are at most
such sequences b. This follows from the fact that b depends only on the n values for
and the at most
values for
. The latter follows since if k, n are relatively prime, then the values of
are among the
values
, and otherwise there is some
dividing both n and k, and the values are among the n/p multiples of p modulo n. 
A natural question is for which cases the bounds of Theorem 10 can be reached, in particular the quadratic bound for
. This question is beyond the scope of this paper, but has been addressed in [2]. A main result of [2] is that if
is prime and 2 is a primitive root modulo n (Artin’s conjecture states that this holds for infinitely many primes), then
for
.
Conclusions
We investigated two natural complexity measures for a k-automatic sequence a:
closely related to the alphabet size required to present a as a morphic sequence with respect to a k-uniform morphism, and
closely related to the size of the kernel of a. We saw how there can be an exponential gap between
and
, but basic operations like
, adding an element in front, or applying a binary operator elementwise, never increase
or
by more than a quadratic factor. Many other operations, like changing the tenth element of a sequence, can be obtained by combining such basic operations, and hence yield a polynomial upper bound too. These polynomial bounds can probably be improved substantially. Other open questions include a further investigation of when these upper bounds can be reached. Moreover, our SAT-based tool provides values that are likely to be exact, but formally are only lower bounds. It would make sense to investigate further how to be sure of having the exact value, either depending on particular ways to define automatic sequences, or by giving general criteria for exactness depending on known upper bounds.
On periodic sequences this paper only contains some very basic observations; more involved observations are given in [2].
We want to thank Wieb Bosma for fruitful collaboration on this topic and careful proofreading. We want to thank Jeffrey Shallit for giving pointers to state complexity.
Contributor Information
Alberto Leporati, Email: alberto.leporati@unimib.it.
Carlos Martín-Vide, Email: carlos.martin@urv.cat.
Dana Shapira, Email: shapird@g.ariel.ac.il.
Claudio Zandron, Email: zandron@disco.unimib.it.
Hans Zantema, Email: h.zantema@tue.nl.
References
- 1. Allouche JP, Shallit J. Automatic Sequences: Theory, Applications, Generalizations. Cambridge: Cambridge University Press; 2003.
- 2. Bosma W. Complexity of periodic sequences. 2019. https://www.math.ru.nl/~bosma/pubs/periodic.pdf
- 3. Cobham A. Uniform tag sequences. Math. Systems Theory. 1972;6:164–192. doi: 10.1007/BF01706087.
- 4. Eilenberg S. Automata, Languages and Machines. New York: Academic Press; 1974.
- 5. Endrullis J, Grabmayer C, Hendriks D. Mix-automatic sequences. In: Dediu A-H, Martín-Vide C, Truthe B, editors. Language and Automata Theory and Applications. Heidelberg: Springer; 2013. pp. 262–274.
- 6. Gill A. Introduction to the Theory of Finite-State Machines. New York: McGraw-Hill; 1962.
- 7. Jiraskova G. The ranges of state complexities for complement, star and reversal of regular languages. Int. J. Found. Comput. Sci. 2014;25(1):101–124. doi: 10.1142/S0129054114500063.
- 8. Lawson MV. Finite Automata. Boca Raton: Chapman and Hall/CRC; 2004.
- 9. Shallit J. Decidability and enumeration for automatic sequences: a survey. In: Bulatov AA, Shur AM, editors. Computer Science – Theory and Applications. Heidelberg: Springer; 2013. pp. 49–63.