Deciding Classes of Regular Languages: The Covering Approach

Thomas Place

doi:10.1007/978-3-030-40608-0_6

2020 Jan 7;12038:89–112

Editors: Alberto Leporati, Carlos Martín-Vide, Dana Shapira, Claudio Zandron

PMCID: PMC7206641

Abstract

We investigate the membership problem that one may associate to every class of languages $\mathcal{C}$. The problem takes a regular language as input and asks whether it belongs to $\mathcal{C}$. In practice, finding an algorithm provides a deep insight on the class $\mathcal{C}$. While this problem has a long history, many famous open questions in automata theory are tied to membership. Recently, a breakthrough was made on several of these open questions. This was achieved by considering a more general decision problem than membership: covering. In the paper, we investigate how the new ideas and techniques brought about by the introduction of this problem can be applied to get new insight on earlier results. In particular, we use them to give new proofs for two of the most famous membership results: Schützenberger’s theorem and Simon’s theorem.

Keywords: Regular languages, Automata, Covering, Membership, Star-free languages, Piecewise testable languages

Introduction

Historical Context. A prominent question in formal language theory is to solve the membership problem for classes of regular languages. Given a fixed class $\mathcal{C}$, one must find an algorithm which decides whether an input regular language belongs to $\mathcal{C}$. Such a procedure is called a $\mathcal{C}$-membership algorithm. What motivates this question is the deep insight on the class $\mathcal{C}$ that is usually provided by a solution. Intuitively, being able to formulate an algorithm requires a solid understanding of all languages contained in the class $\mathcal{C}$. In other words, membership is used as a mathematical tool whose purpose is to analyze classes.

This research effort started with a famous theorem of Schützenberger [36] which describes the class of star-free languages (SF). These are the languages that can be expressed by a regular expression using union, concatenation and complement, but not Kleene star. This is a prominent class which admits natural alternate definitions. For example, the star-free languages are those which can be defined in first-order logic [15] or equivalently in linear temporal logic [11]. Schützenberger’s theorem yields an algorithm which decides whether an input regular language is star-free (i.e. an SF-membership algorithm). This provides insight on SF not because of the algorithm itself, but rather because of its proof. Indeed, it includes a generic construction which builds an expression witnessing membership in SF for every input language on which the algorithm answers positively. This result was highly influential and pioneered a very successful line of research. The theorem itself was often revisited [5, 7, 8, 10, 14, 16, 17, 21, 23, 41] and researchers successfully obtained similar results for other prominent classes of languages. Famous examples include the locally testable languages [4, 42] or the piecewise testable languages [38]. However, membership is a difficult question and despite years of investigation, there are still many open problems.

Among these open problems, a famous one is the dot-depth problem. Brzozowski and Cohen [2] defined a natural classification of the star-free languages: the dot-depth hierarchy. Each star-free language is assigned a “complexity level” (called dot-depth) according to the number of alternations between concatenations and complements that are required to define it with an expression. It is known that this hierarchy is strict [3]. Hence, a natural question is whether membership is decidable for each level. This has been a very active research topic since the 70s (see [20, 28, 32] for surveys). Yet, only the first two levels are known to be decidable so far. An algorithm for dot-depth one was published by Knast in 1983 [13]. Despite a lot of partial results along the way, it took thirty more years to solve the next level: the decidability of dot-depth two was shown in 2014 [26, 33]. This situation is easily explained: in practice, getting new membership results always required new conceptual ideas and techniques. In the paper, we are interested in the ideas that led to a solution for dot-depth two. The key ingredient was a new more general decision problem called covering.

Covering. The problem was first considered implicitly in [26] and properly defined later in [31]. Given a class $\mathcal{C}$, the $\mathcal{C}$-covering problem is as follows. The input consists of two objects: a regular language L and a finite set of regular languages $\mathbf{L}$. One must decide whether there exists a $\mathcal{C}$-cover $\mathbf{K}$ of L (a finite set of languages in $\mathcal{C}$ whose union includes L) such that no language in $\mathbf{K}$ intersects all languages in $\mathbf{L}$. Naturally, this definition is more involved than the one of membership and it is more difficult to find an algorithm for $\mathcal{C}$-covering than for $\mathcal{C}$-membership. Yet, covering was recently shown to be decidable for many natural classes (see for example [6, 24, 25, 30, 34, 35]) including the star-free languages [29].

At the time of its introduction, there were two motivations for investigating this new question. First, while harder, covering is also more rewarding than membership: it yields a more robust understanding of the classes. Indeed, a $\mathcal{C}$-membership algorithm only yields benefits for the languages of $\mathcal{C}$: we manage to detect them and to build a description witnessing this membership. On the other hand, a $\mathcal{C}$-covering algorithm applies to arbitrary languages. One may view $\mathcal{C}$-covering as an approximation problem: on inputs L and $\mathbf{L}$, we want to over-approximate L with a $\mathcal{C}$-cover while $\mathbf{L}$ specifies what an acceptable approximation is. A second key motivation was the application to the dot-depth hierarchy. It turns out that all recent membership results for this hierarchy rely heavily on covering arguments. More precisely, they are based on techniques that allow one to lift covering results for a level in the hierarchy to membership results for a higher level (see [32] for a detailed explanation).

Contribution. In the paper, we are not looking to provide new covering algorithms. Instead, we look at a slightly different question. As we explained, finding an algorithm for $\mathcal{C}$-covering is even harder than for $\mathcal{C}$-membership. Consequently, the recent breakthroughs that were made on this question required developing new ideas, new techniques and new ways to formulate intricate proof arguments. In the paper, we look back at the original membership problem and investigate how these new developments can be applied to get new insight on earlier results. We prove that even if one is only interested in membership, reasoning in terms of “covers” is quite natural and rather intuitive when presenting proof arguments. In particular, $\mathcal{C}$-covers are a very powerful tool for presenting generic constructions which build descriptions of languages in the class $\mathcal{C}$. We illustrate this point by using covers to give new intuitive proofs for two of the most important membership results in the literature: Schützenberger’s theorem [36] for the star-free languages and Simon’s theorem [38] for the piecewise testable languages.

Organization of the Paper. We first recall standard terminology about regular languages and define membership in Sect. 2. We introduce covering in Sect. 3 and explain why reasoning in terms of covers is intuitive and relevant even if one is only interested in membership. We illustrate this point in Sect. 4 with a new proof of Schützenberger’s theorem. Finally, we present a second example in Sect. 5 with a new proof of Simon’s theorem.

Preliminaries

In this section, we briefly recall standard terminology about finite words and classes of regular languages. Moreover, we introduce the membership problem.

Regular Languages. An alphabet is a finite set A. As usual, $A^*$ denotes the set of all words over A, including the empty word $\varepsilon$. For $w \in A^*$, we write $|w| \in \mathbb{N}$ for the length of w (i.e. the number of letters in w). Moreover, for $u, v \in A^*$, we denote by uv the word obtained by concatenating u and v.

Given an alphabet A, a language (over A) is a subset of $A^*$. Abusing terminology, we shall often denote by u the singleton language $\{u\}$. We lift concatenation to languages: for $K, L \subseteq A^*$, we let $KL = \{uv \mid u \in K \text{ and } v \in L\}$. Finally, we use Kleene star: if $K \subseteq A^*$, $K^+$ denotes the union of all languages $K^n$ for $n \geq 1$ and $K^* = K^+ \cup \{\varepsilon\}$. In the paper, we only consider regular languages. These are the languages that can be equivalently defined by regular expressions, monadic second-order logic, finite automata or finite monoids. We shall use the definition based on monoids which we briefly recall now (see [21] for details).

A monoid is a set M endowed with an associative multiplication $(s, t) \mapsto s \cdot t$ (also denoted by st) having a neutral element $1_M$. An idempotent of a monoid M is an element $e \in M$ such that $ee = e$. It is folklore that for any finite monoid M, there exists a natural number $\omega(M)$ (denoted by $\omega$ when M is understood) such that $s^\omega$ is an idempotent for every $s \in M$. Observe that $A^*$ is a monoid whose multiplication is concatenation (the neutral element is $\varepsilon$). Thus, we may consider monoid morphisms $\alpha: A^* \to M$ where M is an arbitrary monoid. Given such a morphism and $L \subseteq A^*$, we say that L is recognized by $\alpha$ when there exists a set $F \subseteq M$ such that $L = \alpha^{-1}(F)$. A language L is regular if and only if it is recognized by a morphism into a finite monoid.
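
To make the idempotent power concrete, here is a minimal Python sketch that computes the least number n such that every n-th power is idempotent in a finite monoid. The representation (a list of elements plus a multiplication function) and the names `power`, `is_idempotent` and `idempotent_power` are assumptions made for this illustration only; they do not come from the paper.

```python
def power(s, n, mul):
    """Return the product of n copies of s (n >= 1)."""
    result = s
    for _ in range(n - 1):
        result = mul(result, s)
    return result


def is_idempotent(e, mul):
    """An element e is idempotent when ee = e."""
    return mul(e, e) == e


def idempotent_power(elements, mul):
    """Least n >= 1 such that s^n is idempotent for every s.

    The search terminates on every finite monoid, since such a number
    always exists there (the folklore fact recalled above).
    """
    elements = list(elements)
    n = 1
    while not all(is_idempotent(power(s, n, mul), mul) for s in elements):
        n += 1
    return n


# Hypothetical example: the cyclic group Z/3Z, written additively.
z3 = [0, 1, 2]
add_mod_3 = lambda x, y: (x + y) % 3
print(idempotent_power(z3, add_mod_3))  # prints 3
```

In this example the only idempotent is the neutral element 0, so the third power is the first one that works for all elements. This brute-force search is only meant for small monoids.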

Classes. We investigate classes of languages. Mathematically speaking, a class of languages $\mathcal{C}$ is a correspondence $A \mapsto \mathcal{C}(A)$ which associates a (possibly infinite) set of languages $\mathcal{C}(A)$ over A to every alphabet A. For the sake of avoiding clutter, we shall often abuse terminology and omit the alphabet when manipulating classes. That is, whenever A is fixed and understood, we directly write $L \in \mathcal{C}$ to indicate that some language $L \subseteq A^*$ belongs to $\mathcal{C}(A)$.

While this is the mathematical definition, in practice, the term “class” is used to indicate that $\mathcal{C}$ is presented in a specific way. Typically, classes are tied to a particular syntax used to describe all the languages they contain. For example, the regular languages are tied to regular expressions and monadic second-order logic. Consequently, the classes that we consider in practice are natural and have robust properties that we present now.

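A lattice is a class $\mathcal{C}$ which is closed under finite union and intersection: for every alphabet A, we have $\emptyset, A^* \in \mathcal{C}(A)$ and for every $K, L \in \mathcal{C}(A)$, we have $K \cup L, K \cap L \in \mathcal{C}(A)$. Moreover, a Boolean algebra is a lattice which is additionally closed under complement: for every alphabet A and $L \in \mathcal{C}(A)$, we have $A^* \setminus L \in \mathcal{C}(A)$. Finally, we say that a class $\mathcal{C}$ is quotient-closed when for every alphabet A, every $L \in \mathcal{C}(A)$ and every $w \in A^*$, the following two languages belong to $\mathcal{C}(A)$ as well:

$$w^{-1}L = \{u \in A^* \mid wu \in L\} \quad \text{and} \quad Lw^{-1} = \{u \in A^* \mid uw \in L\}.$$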

The techniques that we discuss in the paper are meant to be applied for classes that are quotient-closed lattices and contain only regular languages. The two examples that we detail are quotient-closed Boolean algebras of regular languages.

Membership. When encountering a new class $\mathcal{C}$, a natural objective is to precisely understand the languages it contains. In other words, we want to understand what properties can be expressed with the syntax defining $\mathcal{C}$. Of course, this is an informal objective. In practice, we rely on a decision problem called membership which we use as a mathematical tool to approach this question.

The problem is parameterized by an arbitrary class of languages $\mathcal{C}$: we speak of $\mathcal{C}$-membership. It takes as input a regular language L and asks whether L belongs to $\mathcal{C}$. The key idea is that obtaining an algorithm for $\mathcal{C}$-membership is not possible without a solid understanding of $\mathcal{C}$. In the literature, such an algorithm is also called a decidable characterization of $\mathcal{C}$.

Remark 1

We are not only interested in $\mathcal{C}$-membership algorithms themselves but also in their correctness proofs. In practice, the deep insight that we obtain on the class $\mathcal{C}$ comes from these proofs. Typically, the difficult part in such an argument is to prove that a membership algorithm is sound: when it answers positively, one must prove that the input language does belong to $\mathcal{C}$. This usually requires a generic construction for building a syntactic description of the language witnessing its membership in $\mathcal{C}$.

Finding membership algorithms has been an important quest for a long time in formal languages theory. The solutions that were obtained for important classes are milestones in the theory of regular languages [13, 22, 33, 36, 38, 40]. In the paper, we prove two of them: Schützenberger’s theorem [36] and Simon’s theorem [38]. We frame these proofs using a new formalism based on a more general problem which was recently introduced [31]: covering.

The Covering Problem

The covering problem generalizes membership. It was first considered implicitly in [26, 27] and was later formalized in [31] (along with a detailed framework designed for handling it). At the time, its introduction was motivated by two reasons. First, an algorithm for covering is usually more rewarding than an algorithm for membership as the former provides more insight on the investigated class of languages. Second, covering was introduced as a key ingredient for handling difficult membership questions. For several important classes, membership is effectively reducible to covering for another simpler class. Recently, this idea was applied to prominent hierarchies of classes called “concatenation hierarchies” (see the surveys [28, 32] for details on these results).

In the paper, we are interested in covering for a slightly different reason. In particular, we do not present any covering algorithm. Instead, we look at how the new ideas that were recently introduced with covering in mind can be applied in the simpler membership setting. It turns out that even for the early membership results, reasoning in terms of covers is quite natural and allows one to present arguments in a very intuitive way. We manage to formulate new proof arguments for two famous membership algorithms.

We first define covering and explain why it generalizes membership as a decision problem. Then, we come back to membership and briefly recall the general approach that is usually followed in order to handle it. We show that this approach can actually be formulated in a convenient and natural way with covering. For the sake of avoiding clutter, we fix an arbitrary alphabet A for the presentation: all languages that we consider are over A.

Definition

Similarly to membership, covering is parameterized by an arbitrary class of languages $\mathcal{C}$: we speak of $\mathcal{C}$-covering. It is designed with the same objective in mind: it serves as a mathematical tool for investigating the class $\mathcal{C}$.

For a class $\mathcal{C}$, the $\mathcal{C}$-covering problem takes a language L and a finite set of languages $\mathbf{L}$ as input. It asks whether there exists a $\mathcal{C}$-cover of L which is separating for $\mathbf{L}$. Let us first define these two notions.

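Given a language L, a cover of L is a finite set of languages $\mathbf{K}$ such that $L \subseteq \bigcup_{K \in \mathbf{K}} K$. Additionally, given some class $\mathcal{C}$, a $\mathcal{C}$-cover of L is a cover $\mathbf{K}$ of L such that every $K \in \mathbf{K}$ belongs to $\mathcal{C}$.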

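Moreover, given two finite sets of languages $\mathbf{K}$ and $\mathbf{L}$, we say that $\mathbf{K}$ is separating for $\mathbf{L}$ if for every $K \in \mathbf{K}$, there exists $L \in \mathbf{L}$ which satisfies $K \cap L = \emptyset$. In other words, there exists no language in $\mathbf{K}$ which intersects all languages in $\mathbf{L}$. Given a class $\mathcal{C}$, the $\mathcal{C}$-covering problem is now defined as follows: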

INPUT: A regular language L and a finite set of regular languages $\mathbf{L}$.

OUTPUT: Does there exist a $\mathcal{C}$-cover of L which is separating for $\mathbf{L}$?
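
The two notions used in this definition can be illustrated on a toy instance. The Python sketch below works with finite languages represented as sets of strings, which is an assumption made purely for illustration: the actual problem takes regular (possibly infinite) languages as input, and the sketch does not check that the languages of the cover belong to a given class. The names `is_cover` and `is_separating` are hypothetical.

```python
def is_cover(language, candidates):
    """A finite set of languages covers L when L is included in their union."""
    union = set().union(*candidates)
    return set(language) <= union


def is_separating(candidates, targets):
    """Separating: every candidate language is disjoint from at least one target."""
    return all(
        any(not (set(K) & set(L)) for L in targets)
        for K in candidates
    )


# Hypothetical toy instance over the alphabet {a, b}.
L = {"a", "ab", "b"}
targets = [{"a"}, {"b"}]

good = [{"a", "ab"}, {"b"}]   # each language misses one target
bad = [{"a", "ab", "b"}]      # the single language intersects both targets

print(is_cover(L, good), is_separating(good, targets))  # prints True True
print(is_cover(L, bad), is_separating(bad, targets))    # prints True False
```

Both candidate sets cover L, but only the first one is separating: no language in it intersects all the targets.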

A simple observation is that covering generalizes another well-known decision problem called separation. Given a class $\mathcal{C}$ and two languages $L_1$ and $L_2$, we say that $L_1$ is $\mathcal{C}$-separable from $L_2$ when there exists a third language $K \in \mathcal{C}$ such that $L_1 \subseteq K$ and $K \cap L_2 = \emptyset$. We have the following lemma (see [31] for a proof).

Lemma 2

Let $\mathcal{C}$ be a lattice and $L_1, L_2$ two languages. Then $L_1$ is $\mathcal{C}$-separable from $L_2$ if and only if there exists a $\mathcal{C}$-cover of $L_1$ which is separating for $\{L_2\}$.

Lemma 2 proves that $\mathcal{C}$-covering generalizes $\mathcal{C}$-membership as a decision problem. Indeed, given as input a regular language L, it is immediate that L belongs to $\mathcal{C}$ if and only if L is $\mathcal{C}$-separable from $A^* \setminus L$ (which is also regular). Thus, there exists an effective reduction from $\mathcal{C}$-membership to $\mathcal{C}$-covering.

Yet, this is not the only connection between membership and covering. More importantly, this is not how we use covering in the paper. While each membership algorithm existing in the literature is based on unique ideas (specific to the class under investigation), most of them are formulated and proved within a standard common framework. It turns out that this framework boils down to a particular kind of covering question: this is the property that we shall exploit in the paper.

Application to Membership

We first summarize the standard general approach that is commonly used to handle membership questions and formulate solutions. Historically, this approach was initiated by Schützenberger who applied it to obtain the first known membership algorithm [36] (for the class of star-free languages). We shall detail and prove this result in Sect. 4.

The syntactic approach. Obtaining a membership algorithm for a given class $\mathcal{C}$ is intuitively hard, as it requires deciding a semantic property which may not be apparent on the piece of syntax that defines the input regular language L (be it a regular expression, an automaton or a monoid morphism). To mitigate this issue, the syntactic approach relies on the existence of a canonical recognizer for any given regular language. The idea is that while belonging to $\mathcal{C}$ may not be apparent on an arbitrary syntax for L, it should be apparent on a canonical representation of L. Typically, the syntactic morphism of L serves as this canonical representation. As the name suggests, this object is a canonical morphism into a finite monoid which recognizes L (and can be computed from any representation of L).

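Let us first define the syntactic morphism properly. Consider a language L. One may associate a canonical equivalence relation $\equiv_L$ over $A^*$ to L. Given two words $u, v \in A^*$, we write,

$$u \equiv_L v \quad \text{if and only if} \quad xuy \in L \Leftrightarrow xvy \in L \text{ for every } x, y \in A^*.$$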

Clearly, $\equiv_L$ is an equivalence relation and one may verify that it is a congruence for word concatenation: for every $u, u', v, v' \in A^*$, if $u \equiv_L u'$ and $v \equiv_L v'$, then $uv \equiv_L u'v'$. Consequently, the quotient set $A^*/{\equiv_L}$ is a monoid called the syntactic monoid of L. Moreover, the map $\alpha: A^* \to A^*/{\equiv_L}$ which maps each word to its $\equiv_L$-class is a monoid morphism called the syntactic morphism of L. In particular, this morphism recognizes the language L: $L = \alpha^{-1}(F)$ where F is the set of all $\equiv_L$-classes which intersect L. It is well-known and simple to verify that L is regular if and only if its syntactic monoid is finite. Moreover, in that case, one may compute the syntactic morphism of L from any representation of L (such as an automaton or an arbitrary monoid morphism recognizing L).
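
One standard way to carry out this computation, sketched below in Python, goes through automata: the transition monoid of the minimal complete deterministic automaton of L is isomorphic to the syntactic monoid of L. The sketch assumes that the automaton given as input is already minimal and complete (minimization is not performed), and the encoding of transitions as tuples as well as the name `transition_monoid` are choices made for this illustration only.

```python
def transition_monoid(n_states, delta):
    """Close the letter transformations under composition.

    delta maps each letter to a tuple t of length n_states, where state q
    is sent to t[q]. A word is represented by the transformation of the
    states that it induces; the empty word induces the identity.
    """
    identity = tuple(range(n_states))
    monoid = {identity}
    frontier = [identity]
    while frontier:
        current = frontier.pop()
        for step in delta.values():
            # Read the word of `current` first, then one more letter.
            extended = tuple(step[q] for q in current)
            if extended not in monoid:
                monoid.add(extended)
                frontier.append(extended)
    return monoid


# Minimal complete automaton of (ab)^* over {a, b}:
# state 0 is initial and accepting, state 1 is reached after reading a,
# state 2 is a sink.
ab_star = transition_monoid(3, {"a": (1, 2, 2), "b": (2, 0, 2)})
print(len(ab_star))  # prints 6
```

The six elements correspond to the classes of the empty word, a, b, ab, ba and of the words that lead to the sink from every state (such as aa).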

We are ready to present the key result behind the syntactic approach: for every quotient-closed Boolean algebra $\mathcal{C}$, membership of an arbitrary regular language in $\mathcal{C}$ depends only on its syntactic morphism. This claim is formalized with the following standard result.

Proposition 3

Let $\mathcal{C}$ be a quotient-closed Boolean algebra, L a regular language and $\alpha$ its syntactic morphism. Then L belongs to $\mathcal{C}$ if and only if every language recognized by $\alpha$ belongs to $\mathcal{C}$.

Proof

The right to left implication is immediate since L is recognized by its syntactic morphism. We concentrate on the converse one. Assume that $L \in \mathcal{C}$. We show that every language recognized by $\alpha$ belongs to $\mathcal{C}$ as well. By definition, these languages are exactly the unions of $\equiv_L$-classes. Thus, since $\mathcal{C}$ is closed under union, it suffices to show that every $\equiv_L$-class belongs to $\mathcal{C}$. Observe that the definition of $\equiv_L$ can be reformulated as follows. Given $u, v \in A^*$, we have,

$$u \equiv_L v \quad \text{if and only if} \quad u \in x^{-1}Ly^{-1} \Leftrightarrow v \in x^{-1}Ly^{-1} \text{ for every } x, y \in A^*.$$

Let $x, y \in A^*$. Since L is recognized by $\alpha$, it is clear that whether some word $w \in A^*$ belongs to $x^{-1}Ly^{-1}$ depends only on its image $\alpha(w)$. In other words, $x^{-1}Ly^{-1}$ is recognized by $\alpha$. Moreover, since L is regular, its syntactic monoid is finite which implies that $\alpha$ recognizes finitely many languages. Thus, while there are infinitely many words $x, y \in A^*$, there are finitely many languages $x^{-1}Ly^{-1}$.

Altogether, we obtain that every $\equiv_L$-class is a finite Boolean combination of languages $x^{-1}Ly^{-1}$ where $x, y \in A^*$. Since $L \in \mathcal{C}$ and $\mathcal{C}$ is quotient-closed, every such language belongs to $\mathcal{C}$. Hence, since $\mathcal{C}$ is a Boolean algebra, we conclude that every $\equiv_L$-class belongs to $\mathcal{C}$, completing the proof.

Proposition 3 implies that membership of a regular language L in some fixed quotient-closed Boolean algebra is equivalent to some property of an algebraic abstraction of L: its syntactic morphism. In particular, this is independent from the accepting set $F$. By itself, this is a simple result. Yet, it captures the gist of the syntactic approach.

Naturally, the proposition tells nothing about the actual property on the syntactic morphism that one should look for. This question is specific to each particular class $\mathcal{C}$: one has to find the right decidable property characterizing $\mathcal{C}$.

Remark 4

This may seem counterintuitive. We replaced the question of deciding whether a single language belongs to the class $\mathcal{C}$ by an intuitively harder one: deciding whether all languages recognized by a given monoid morphism belong to $\mathcal{C}$. The idea is that the set of languages recognized by a morphism has a structure which can be exploited in membership arguments.

Remark 5

Proposition 3 is restricted to quotient-closed Boolean algebras. This excludes quotient-closed lattices that are not closed under complement. One may generalize the syntactic approach to such classes (as done by Pin [19]). We do not discuss this as our two examples are quotient-closed Boolean algebras.

Back to Covering. We proved that for every quotient-closed Boolean algebra $\mathcal{C}$, the associated membership problem boils down to deciding whether all languages recognized by an input morphism belong to $\mathcal{C}$. It turns out that this new question is a particular instance of $\mathcal{C}$-covering. In order to explain this properly, we require a last definition.

Consider a morphism $\alpha: A^* \to M$ into a finite monoid M and a finite set of languages $\mathbf{K}$. We say that $\mathbf{K}$ is confined by $\alpha$ if it is separating for the set $\{\alpha^{-1}(M \setminus \{s\}) \mid s \in M\}$. The following fact can be verified from the definitions and reformulates this property in a way that is easier to manipulate.

Fact 6

Let $\alpha: A^* \to M$ be a morphism into a finite monoid and $\mathbf{K}$ a finite set of languages. Then $\mathbf{K}$ is confined by $\alpha$ if and only if for every $K \in \mathbf{K}$, there exists $s \in M$ such that $K \subseteq \alpha^{-1}(s)$.

Proof

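By definition, $\mathbf{K}$ is confined by $\alpha$ if and only if for every $K \in \mathbf{K}$, there exists $s \in M$ such that $K \cap \alpha^{-1}(M \setminus \{s\}) = \emptyset$. Since $\alpha^{-1}(M \setminus \{s\}) = A^* \setminus \alpha^{-1}(s)$, the fact follows.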
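
The reformulation given by Fact 6 is easy to test on small examples: a set of languages is confined by a morphism exactly when all words of each language share a single image. The Python sketch below does this for finite languages given as sets of strings, which is an assumption made for illustration (the fact itself applies to arbitrary languages). The names `image` and `is_confined` and the chosen morphism are hypothetical.

```python
def image(word, letter_image, mul, unit):
    """Image of a word under the morphism defined by the images of the letters."""
    result = unit
    for letter in word:
        result = mul(result, letter_image[letter])
    return result


def is_confined(languages, letter_image, mul, unit):
    """Check that every language is mapped to at most one monoid element."""
    return all(
        len({image(w, letter_image, mul, unit) for w in K}) <= 1
        for K in languages
    )


# Hypothetical morphism into Z/2Z: each word is mapped to the parity of its length.
parity = {"a": 1, "b": 1}
add_mod_2 = lambda x, y: (x + y) % 2

even_and_odd = [{"", "ab", "ba"}, {"a", "aba"}]
mixed = [{"a", "ab"}]

print(is_confined(even_and_odd, parity, add_mod_2, 0))  # prints True
print(is_confined(mixed, parity, add_mod_2, 0))         # prints False
```

The second set is not confined because it contains a language with both a word of odd length and a word of even length.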

We show that given a lattice $\mathcal{C}$ and a morphism $\alpha: A^* \to M$ into a finite monoid, all languages recognized by $\alpha$ belong to $\mathcal{C}$ if and only if there exists a $\mathcal{C}$-cover of $A^*$ which is confined by $\alpha$. The latter question is a particular case of $\mathcal{C}$-covering. In fact, we prove a slightly more general result that we shall need later when dealing with our two examples.

Proposition 7

Let $\mathcal{C}$ be a lattice, $\alpha: A^* \to M$ a morphism into a finite monoid and $H \in \mathcal{C}$ a language. The two following properties are equivalent:

For every language L recognized by $\alpha$, we have $L \cap H \in \mathcal{C}$.
There exists a $\mathcal{C}$-cover of H which is confined by $\alpha$.

Proof

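Assume first that $L \cap H \in \mathcal{C}$ for every language L recognized by $\alpha$. We define $\mathbf{K} = \{\alpha^{-1}(s) \cap H \mid s \in M\}$. Clearly, $\mathbf{K}$ is a cover of H and it is a $\mathcal{C}$-cover by hypothesis. Moreover, it is clear from Fact 6 that $\mathbf{K}$ is confined by $\alpha$.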

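For the converse direction, assume that there exists a $\mathcal{C}$-cover $\mathbf{K}$ of H which is confined by $\alpha$. Let L be a language recognized by $\alpha$, we show that,

$$L \cap H = H \cap \Bigl( \bigcup_{\{K \in \mathbf{K} \mid K \cap L \neq \emptyset\}} K \Bigr).$$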

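This implies that $L \cap H \in \mathcal{C}$ since $H \in \mathcal{C}$, every language in $\mathbf{K}$ belongs to $\mathcal{C}$ and $\mathcal{C}$ is a lattice. The left to right inclusion is immediate since $\mathbf{K}$ is a cover of H. We prove the converse one. Let $K \in \mathbf{K}$ such that $K \cap L \neq \emptyset$, we show that $K \subseteq L$. Let $u \in K$. Consider $v \in K \cap L$ (which is nonempty by definition of K). Since $u, v \in K$ and $\mathbf{K}$ is confined by $\alpha$, we have $\alpha(u) = \alpha(v)$ by Fact 6. Thus, since $v \in L$ and L is recognized by $\alpha$, it follows that $u \in L$, concluding the proof: we obtain $K \subseteq L$.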

Let us combine Propositions 3 and 7. When put together, they imply that for every quotient-closed Boolean algebra $\mathcal{C}$, a regular language L belongs to $\mathcal{C}$ if and only if there exists a $\mathcal{C}$-cover of $A^*$ which is confined by the syntactic morphism of L.

The key point is that this formulation is very convenient when writing proof arguments. As we explained in Remark 1, the technical core of membership proofs consists in generic constructions which build descriptions of languages in $\mathcal{C}$. It turns out that building a $\mathcal{C}$-cover which is confined by some input morphism is an objective that is much easier to manipulate than directly proving that all languages recognized by the morphism belong to $\mathcal{C}$. We illustrate this point in the next two sections with new proofs for two well-known membership algorithms, one for the star-free languages and one for the piecewise testable languages.

Star-Free Languages and Schützenberger’s Theorem

We now illustrate the discussion of the previous section with a first example: Schützenberger’s theorem [36]. This result is important as it started the quest for membership algorithms. It provides such an algorithm for a very famous class: the star-free languages (SF). Informally, these are the languages which can be defined by a regular expression in which the Kleene star is disallowed (hence the name “star-free”) but a new operator for the complement operation is allowed instead. This class is important as it admits several natural alternate definitions. For example, the star-free languages are those which can be defined in first-order logic [15] or equivalently in linear temporal logic [11].

Schützenberger’s theorem states an algebraic characterization of SF: a regular language is star-free if and only if its syntactic monoid is aperiodic. This yields an algorithm for SF-membership as aperiodicity is a decidable property of finite monoids. Historically, Schützenberger’s theorem was the first result of its kind. It motivated the systematic investigation of the membership problem for important classes of languages. It is often viewed as one of the most important results of automata theory. This claim is supported by the number of times this theorem has been revisited over the years and the wealth of existing proofs [5, 7, 8, 10, 14, 16, 17, 21, 23, 41].

In this section, we present our own proof, based on SF-covers. Let us point out that while the formulation is new, the original ideas behind the argument can be traced back to the proof of Wilke [41]. We first recall the definition of the star-free languages. Then, we state the theorem properly and present the proof.

Definition

Let us define the class of star-free languages (SF). For every alphabet A, $\mathrm{SF}(A)$ is the least set containing $\emptyset$ and all singletons $\{a\}$ for $a \in A$, which is closed under union, complement and concatenation. That is, for every $K, L \in \mathrm{SF}(A)$, the languages $K \cup L$, $A^* \setminus K$ and KL belong to $\mathrm{SF}(A)$ as well.

Example 8

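For every sub-alphabet $B \subseteq A$, we have $B^* \in \mathrm{SF}(A)$. Indeed, by closure under complement, $A^* = A^* \setminus \emptyset \in \mathrm{SF}(A)$. We then get $A^* a A^* \in \mathrm{SF}(A)$ for every $a \in A$ by closure under concatenation. Finally, this yields,

$$B^* = A^* \setminus \Bigl( \bigcup_{a \in A \setminus B} A^* a A^* \Bigr) \in \mathrm{SF}(A).$$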

Another standard example is $(ab)^* \in \mathrm{SF}(A)$ (where a, b are two distinct letters of A). Indeed, $(ab)^*$ is the complement of $bA^* \cup A^*a \cup A^*aaA^* \cup A^*bbA^*$ (provided that $A = \{a, b\}$) which is clearly star-free.
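
Over the two-letter alphabet {a, b}, one standard star-free description of (ab)^* is the complement of bA^* ∪ A^*a ∪ A^*aaA^* ∪ A^*bbA^*. As a sanity check (not a proof), the following Python sketch compares the two descriptions on all words up to a given length. The function names are hypothetical and chosen for this illustration.

```python
from itertools import product


def in_ab_star(word):
    """Direct test: the word is a concatenation of copies of 'ab'."""
    return len(word) % 2 == 0 and all(
        word[i:i + 2] == "ab" for i in range(0, len(word), 2)
    )


def in_forbidden_union(word):
    """Membership in bA* U A*a U A*aaA* U A*bbA* over the alphabet {a, b}."""
    return (
        word.startswith("b")
        or word.endswith("a")
        or "aa" in word
        or "bb" in word
    )


def find_counterexample(max_len):
    """Return a word on which the two descriptions disagree, or None."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            word = "".join(letters)
            # (ab)^* should be exactly the complement of the union.
            if in_ab_star(word) == in_forbidden_union(word):
                return word
    return None


print(find_counterexample(10))  # prints None
```

The check relies on the alphabet having exactly the two letters a and b: with a third letter, further words would have to be excluded.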

By definition, SF is a Boolean algebra and one may verify that it is quotient-closed (the details are left to the reader). We complete the definition with a standard property that we require to prove the “easy” direction of Schützenberger’s theorem (every star-free language has an aperiodic syntactic monoid). Another typical application of this property is to show that examples of languages are not star-free. For example, $(AA)^*$ (words with even length) is not star-free since it does not satisfy the following lemma.

Lemma 9

Let A be an alphabet and $L \in \mathrm{SF}(A)$. There exists a number $k \geq 1$ such that for every $\ell \geq k$ and $w \in A^*$, we have $w^\ell \equiv_L w^{\ell+1}$.

Proof

We proceed by structural induction on the definition of L as a star-free language. When $L = \emptyset$, it is clear that the lemma holds for $k = 1$. When $L = \{a\}$ for $a \in A$, one may verify that the lemma holds for $k = 2$. We turn to the inductive cases. Assume first that $L = L_1 \cup L_2$ where $L_1, L_2$ are simpler languages. Induction yields $k_1, k_2 \geq 1$ such that for $i = 1, 2$, if $\ell \geq k_i$ and $w \in A^*$, we have $w^\ell \equiv_{L_i} w^{\ell+1}$. Hence, the lemma holds for $k = \max(k_1, k_2)$ in that case. We turn to complement: $L = A^* \setminus H$ where H is a simpler language. By induction, we get $h \geq 1$ such that for every $\ell \geq h$ and $w \in A^*$, we have $w^\ell \equiv_H w^{\ell+1}$. Clearly, the lemma holds for $k = h$.

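We now consider concatenation: $L = L_1 L_2$ where $L_1, L_2$ are simpler languages. Induction yields $k_1, k_2 \geq 1$ such that for $i = 1, 2$, if $\ell \geq k_i$ and $w \in A^*$, we have $w^\ell \equiv_{L_i} w^{\ell+1}$. Let m be the maximum between $k_1$ and $k_2$. We prove that the lemma holds for $k = 2m + 1$. Let $\ell \geq k$ and $w \in A^*$, we have to show that $w^\ell \equiv_L w^{\ell+1}$, i.e. $x w^\ell y \in L \Leftrightarrow x w^{\ell+1} y \in L$ for every $x, y \in A^*$. We concentrate on the right to left implication (the converse one is symmetrical). Assume that $x w^{\ell+1} y \in L$. Since $L = L_1 L_2$, we get $u_1 \in L_1$ and $u_2 \in L_2$ such that $x w^{\ell+1} y = u_1 u_2$. Since $\ell \geq 2m + 1$, it follows that either $x w^{m+1}$ is a prefix of $u_1$ or $w^{m+1} y$ is a suffix of $u_2$. By symmetry, we assume that the former property holds: we have $u_1 = x w^{m+1} z$ for some $z \in A^*$. Observe that since $x w^{\ell+1} y = u_1 u_2$, it follows that $z u_2 = w^{\ell-m} y$. Moreover, we have $m \geq k_1$ by definition of m. Since $u_1 = x w^{m+1} z \in L_1$, we know therefore that $x w^m z \in L_1$ by definition of $k_1$. Thus, $x w^m z u_2 \in L_1 L_2 = L$. Since $z u_2 = w^{\ell-m} y$, this yields $x w^\ell y \in L$, concluding the proof.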

Schützenberger’s Theorem

We may now present and prove Schützenberger’s theorem. Let us first define aperiodic monoids. There are several equivalent definitions in the literature. We use an equational one based on the idempotent power $\omega$ available in finite monoids. A finite monoid M is aperiodic when it satisfies the following property:

$$\text{for every } s \in M, \quad s^{\omega+1} = s^{\omega}. \qquad (1)$$

We are ready to state Schützenberger’s theorem.

Theorem 10

(Schützenberger [36]). A regular language is star-free if and only if its syntactic monoid is aperiodic.

Theorem 10 illustrates the syntactic approach presented in Sect. 3. It validates Proposition 3: the star-free languages are characterized by a property of their syntactic morphism. In fact, for this particular class, one does not even need the full morphism, the syntactic monoid suffices.

The main application is a membership algorithm for the class of star-free languages. Given as input a regular language L, one may compute its syntactic monoid and check whether it satisfies Eq. (1): this boils down to testing all elements in the monoid. By Theorem 10, this decides whether L is star-free. However, as we explained in Remark 1 when we first introduced membership, this theorem is also important for the arguments that are required to prove it. Indeed, providing these arguments requires a deep insight on SF. The right to left implication is of particular interest: “given a regular language whose syntactic monoid is aperiodic, prove that it is star-free”. This involves devising a generic way to construct a star-free description for every regular language recognized by a monoid satisfying a syntactic property. This is the implication that we handle with covers. On the other hand, the converse implication is simple and standard (essentially, we already proved it with Lemma 9).
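
The test on the elements of the monoid can be sketched as follows in Python. For each element s, the sketch lists the successive powers of s until one repeats: the element passes the test exactly when the powers eventually become constant, which is equivalent to the equation defining aperiodicity for s. The two monoids used as examples are assumptions made for illustration: they are given explicitly as sets of transformations of the states of the minimal automata of (ab)^* and of the language of words with even length over {a, b}. The names `compose` and `is_aperiodic` are hypothetical.

```python
def compose(s, t):
    """Composition of transformations given as tuples: apply s first, then t."""
    return tuple(t[q] for q in s)


def is_aperiodic(elements, mul):
    """Check that the powers of every element eventually become constant."""
    for s in elements:
        powers = [s]
        while True:
            nxt = mul(powers[-1], s)
            if nxt in powers:
                break
            powers.append(nxt)
        # The powers loop back to nxt. The loop has length one exactly
        # when nxt is the last power computed.
        if nxt != powers[-1]:
            return False
    return True


# Transition monoid of the minimal automaton of (ab)^* (six elements).
AB_STAR = {(0, 1, 2), (1, 2, 2), (2, 0, 2), (0, 2, 2), (2, 1, 2), (2, 2, 2)}
# Transition monoid of the minimal automaton of the words with even length.
EVEN_LENGTH = {(0, 1), (1, 0)}

print(is_aperiodic(AB_STAR, compose))      # prints True
print(is_aperiodic(EVEN_LENGTH, compose))  # prints False
```

The second answer reflects the fact that the powers of the nontrivial element alternate between two values forever.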

Proof

We fix an alphabet A and a regular language $L \subseteq A^*$ for the proof. Let $\alpha: A^* \to M$ be the syntactic morphism of L. We prove that $L \in \mathrm{SF}(A)$ if and only if M is aperiodic. Let us first handle the left to right implication.

From star-free languages to aperiodicity. Assume that $L \in \mathrm{SF}(A)$. We prove that M is aperiodic, i.e. that (1) is satisfied. Let $s \in M$, we have to show that $s^{\omega+1} = s^{\omega}$.

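Since $\alpha$ is a syntactic morphism, it is surjective and there exists $w \in A^*$ such that $\alpha(w) = s$. Moreover, since $L \in \mathrm{SF}(A)$, Lemma 9 yields $k \geq 1$ such that $w^{k\omega} \equiv_L w^{k\omega+1}$. By definition of the syntactic morphism, this implies that $s^{k\omega} = s^{k\omega+1}$. Since $s^{k\omega} = s^{\omega}$, this yields $s^{\omega} = s^{\omega+1}$ as desired.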

From aperiodicity to star-free languages. Assume that M is aperiodic. We show that L is star-free. We rely on the notions introduced in Sect. 3 and directly prove that every language recognized by $\alpha$ is star-free.

Remark 11

Intuitively, this property is stronger than L being star-free. Yet, since SF is a quotient-closed Boolean algebra, it is equivalent by Proposition 3.

The argument is based on Proposition 7: we use induction to construct an SF-cover Inline graphic of which is confined by . By the proposition, this implies that every language recognized by belongs to . We start with a preliminary definition that we require to formulate the induction.

Let B be an arbitrary alphabet, Inline graphic a morphism and . We say that a finite set of languages (over B) is -safe if for every and every , we have .

Lemma 12

Let B be an alphabet. Consider a morphism Inline graphic , and . There exists an SF-cover of which is -safe.

We first use Lemma 12 to conclude the main argument. We apply the lemma for Inline graphic , and . This yields an SF-cover of which is -safe. By definition, it follows that for every , we have for all . By Fact 6, this implies that is confined by , completing the main argument.

It remains to prove Lemma 12. Let B be an alphabet, Inline graphic a morphism, and . We build an SF-cover of which is -safe using induction on the three following parameters listed by order of importance:

The size of .
The size of C.
The size of .

Remark 13

The aperiodic monoid M remains fixed throughout the whole proof. On the other hand, the alphabets B and C, the morphism Inline graphic and may change when applying induction.

We distinguish two cases depending on the following property of Inline graphic , C and s. We say that s is -stable when the following holds:

We first consider the case when s is Inline graphic -stable. This is the base case which we handle using the hypothesis that M is aperiodic.

Base case: s is Inline graphic -stable. In that case, we define which is clearly an SF-cover of (we have as seen in Example 8). It remains to show that is -safe. For , we have to show that . We actually prove that for every which implies the desired result. Since s is -stable, we have the following fact.

Fact 14

For every Inline graphic , there exists such that .

Proof

We use induction on the length of Inline graphic . If , the fact holds for . Assume now that . We have for and . Induction yields such that . Moreover, since s is -stable, (2) yields such that . Altogether, we obtain that which concludes the proof.

Consider the word Inline graphic (with as the idempotent power of M). We apply Fact 14 for . This yields such that . Since M is aperiodic, we have by Eq. (1). This yields , concluding the base case.

Inductive case: s is not Inline graphic -stable. By hypothesis, there exists a letter such that the following strict inclusion holds . We fix for the remainder of the argument.

Let D be the sub-alphabet Inline graphic . By definition, . Hence, induction on our second parameter in Lemma 12 (i.e., the size of C) yields an SF-cover of which is -safe. Note that it is clear that our first induction parameter (the size of ) has not increased since .

We distinguish two independent sub-cases. Clearly, we have Inline graphic . The argument differs depending on whether this inclusion is strict or not.

Sub-case 1: Inline graphic . Consider a language . Since is a cover of which is -safe by definition, there exists some element such that for every . The construction of the desired SF-cover of is based on the following fact which we prove using induction on our third parameter (the size of ).

Fact 15

For every language Inline graphic , there exists an SF-cover of which is -safe.

Proof

Since Inline graphic , it is immediate that . Hence, . Moreover, by hypothesis in Sub-case 1. Thus, . Finally, recall that the letter c satisfies by definition. Consequently, we have the strict inclusion . Hence, we may apply induction on our third parameter in Lemma 12 (i.e. the size of ) to obtain the desired SF-cover Inline graphic of which is -safe. Note that here, our first two parameters have not increased (they only depend on and C which remain unchanged).

We may now use Fact 15 to build the desired cover Inline graphic of . We define . Clearly, is an SF-cover of by hypothesis on and since and is closed under concatenation. We need to show that is -safe. Let and , we need to show that . By definition of , there are two cases. When , the result is immediate since is -safe by definition. Otherwise, Inline graphic for and . Thus, we get and such that and . By definition, . Moreover, since is -safe by definition in Fact 15, we have . Altogether, this yields , i.e. as desired.

Sub-case 2: Inline graphic . Let us first explain informally how the cover of is built in this case. Let . Since , w admits a unique decomposition such that and (i.e., v is the largest suffix of w in and u is the corresponding prefix). Using induction, we construct SF-covers of the possible prefixes and suffixes. Then, we combine them to construct a cover of the whole set Inline graphic . Actually, we already covered the suffixes: we have an SF-cover of which is -safe. It remains to cover the prefixes. We do this in the following lemma which we prove using induction on our first parameter (the size of ).

Lemma 16

There exists an SF-cover Inline graphic of which is -safe.

Proof

Let Inline graphic . Using E as a new alphabet, we apply induction on the first parameter in Lemma 12 (i.e., the size of ) to build an auxiliary SF-cover of which we then use to construct .

Since Inline graphic , there exists a natural morphism defined by for every . Clearly, . Since by hypothesis of Sub-case 2, this implies and induction on the first parameter in Lemma 12 yields an SF-cover of which is -safe. We use to construct . First, we define a map .

We let Inline graphic . Otherwise, let be a nonempty word. Since , w admits a unique decomposition with . Hence, we may define with for every (recall that by definition). We are ready to define . We let,

It remains to show that Inline graphic is an SF-cover of which is -safe. It is immediate that is a cover of since was a cover of .

Let us prove that Inline graphic is -safe. Let and . We prove that . By definition, there exists such that . Thus, which implies that since is -safe by definition. One may now verify from the definitions that and . Thus, we obtain as desired.

It remains to show that every Inline graphic is star-free. By definition of , it suffices to show that for every , we have . We proceed by induction on the definition of W as a star-free language. When , it is clear that . Assume now that for some . By definition, . This may be reformulated as follows: with . Clearly, U is the intersection of Inline graphic with a language recognized by . Recall that we have an SF-cover of which is -safe (and therefore confined by ). Hence, Proposition 7 implies that . It follows that as desired. We turn to the inductive cases.

First, assume that there are simpler languages Inline graphic such that either or . By induction, for . Moreover, the definition of implies that and . Hence, we obtain . Finally, assume that for a simpler language . By induction, . Moreover, . Clearly, . Thus, we get as desired.

We are ready to construct the desired SF-cover Inline graphic of . Let be the -safe SF-cover of given by Lemma 16 and consider our -safe SF-cover of . We define . It is immediate by definition that is an SF-cover of since and is closed under concatenation. It remains to verify that is -safe (it is in fact -safe). Let and , we show that Inline graphic (which implies ). By definition, with and . Therefore, and with and . Since U and V are both -safe by definition, we have and . It follows that . This concludes the proof of Lemma 12.

Piecewise Testable Languages and Simon’s Theorem

We turn to our second example: Simon’s theorem [38]. This result states an algebraic characterization of another prominent class of regular languages: the piecewise testable languages (PT). It is quite important in the literature as it was among the first results of this kind after Schützenberger’s theorem (which we proved in Sect. 4). Over the years, many different proofs have been found (examples include [1, 9, 12, 18, 38, 39]). We present a new proof, based on PT-covers and entirely independent from previously known arguments. It relies on a concatenation principle for the piecewise testable languages that can only be formulated with PT-covers.

We first recall the definition of piecewise testable languages. Then, we state the theorem properly and present the proof.

Definition

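Let us define the class of piecewise testable languages (PT). Given an alphabet A and $u, v \in A^*$, we say that u is a piece of v and write $u \preceq v$ when u can be obtained from v by removing letters and gluing the remaining ones together. More precisely, $u \preceq v$ when there exist $a_1, \dots, a_n \in A$ and $v_0, \dots, v_n \in A^*$ such that,

$$u = a_1 a_2 \cdots a_n \quad \text{and} \quad v = v_0 a_1 v_1 a_2 v_2 \cdots a_n v_n.$$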

For instance, acb is a piece of Inline graphic . Note that by definition, the empty word “” is a piece of every word (this is the case ). Furthermore, it is clear that the relation is a preorder on .
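The piece relation is the usual scattered-subword test, and it can be checked with a single left-to-right scan. The sketch below is illustrative only; the function name `is_piece` is a hypothetical choice.

```python
def is_piece(u, v):
    """Return True when u is a piece (scattered subword) of v."""
    remaining = iter(v)
    # Each letter of u must be found in v strictly after the previous match:
    # `letter in remaining` consumes the iterator up to the first occurrence.
    return all(letter in remaining for letter in u)
```

For example, `is_piece("acb", "aacabb")` holds, the empty word is a piece of every word, and `is_piece("ba", "ab")` fails because the letters occur in the wrong order.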

For every word $u \in A^*$, we write $\uparrow u \subseteq A^*$ for the language consisting of all words v such that u is a piece of v. If $u = a_1 \cdots a_n$, we have by definition:

$$\uparrow u = A^* a_1 A^* a_2 A^* \cdots a_n A^*.$$
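Reading the language of all words that admit u = a_1 ... a_n as a piece in its standard form A* a_1 A* a_2 A* ... a_n A*, membership can be tested with an ordinary regular expression. This is a minimal sketch; the helper names `up_regex` and `in_up` are hypothetical, and the alphabet is assumed to be given as a string of single-character letters.

```python
import re


def up_regex(u, alphabet):
    """Build a regex for A* a_1 A* ... a_n A*, where u = a_1 ... a_n."""
    any_word = "(?:" + "|".join(re.escape(a) for a in alphabet) + ")*"
    # Start with A*, then append "a_i A*" for each letter of u.
    return any_word + "".join(re.escape(a) + any_word for a in u)


def in_up(u, v, alphabet):
    """Return True when v belongs to the language of words having u as a piece."""
    return re.fullmatch(up_regex(u, alphabet), v) is not None
```

With the alphabet `"abc"`, the word `"cacbc"` is accepted for u = `"ab"` while `"ba"` is rejected. For the empty word u, the expression degenerates to A*, so every word over the alphabet is accepted.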

We may now define PT. A language $L \subseteq A^*$ is piecewise testable (i.e. $L \in PT$) when L is a (finite) Boolean combination of languages $\uparrow u$ for $u \in A^*$.

Example 17

We let Inline graphic as the alphabet. Then . Indeed, . Moreover, observe that every finite language is piecewise testable. Since is closed under union, it suffices to show that every singleton is piecewise testable. Consider a word . By definition, w is the only word belonging to but not to , where denotes any sequence of Inline graphic letters. Hence, is piecewise testable.
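The singleton argument of Example 17 can be checked by brute force on small inputs. The sketch below assumes the following reading of the example: a word w is the only word that admits w as a piece but admits no piece of length |w| + 1. The helper names and the bound on the word length are hypothetical choices made for illustration.

```python
from itertools import product


def is_piece(u, v):
    """Return True when u is a piece (scattered subword) of v."""
    remaining = iter(v)
    return all(letter in remaining for letter in u)


def words(alphabet, max_len):
    """Enumerate all words over the alphabet of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def in_singleton_combination(v, w, alphabet):
    """v has w as a piece, but has no piece of length |w| + 1."""
    if not is_piece(w, v):
        return False
    longer = ("".join(p) for p in product(alphabet, repeat=len(w) + 1))
    return not any(is_piece(u, v) for u in longer)


# All words of length at most 6 over {a, b} selected by the combination for w = aba.
selected = [v for v in words("ab", 6) if in_singleton_combination(v, "aba", "ab")]
```

Here `selected` contains only `"aba"`: every word of length at least four has a piece of length four, and the only word of length at most three that admits `"aba"` as a piece is `"aba"` itself.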

Clearly PT is a Boolean algebra and one may verify that it is quotient-closed (the details are left to the reader). We complete the definition with two properties of PT. The first one is standard and we shall need it to prove the “easy” direction of Simon’s theorem (every piecewise testable language satisfies the characterization).

Lemma 18

Let A be an alphabet and Inline graphic . There exists such that for every and , we have .

Proof

Since Inline graphic , there exists such that L is a Boolean combination of languages with such that (i.e. w has length at most k). We prove that the lemma holds for this number k. Let and . We show that . By symmetry, we concentrate on : given , we show that . Since , one may verify that for every Inline graphic such that , we have . In other words, . Since L is a Boolean combination of such languages, this implies the equivalence as desired.

The second result is specific to our covering-based approach for proving Simon’s theorem. It turns out that elegant proof arguments for membership algorithms often apply to classes that are closed under concatenation (or some weak variant thereof). As seen in the previous section, the star-free languages are an example. Unfortunately, PT is not closed under concatenation. For example, consider the alphabet . We have and as seen in Example 17. Yet, one may verify with Lemma 18 that .

We solve this issue with a “weak concatenation principle” for piecewise testable languages. This result can only be formulated using PT-covers. While its proof is rather technical, an interesting observation is that it characterizes the piecewise testable languages. In the proof of Simon’s theorem, we only use this concatenation principle and the hypothesis that PT is a Boolean algebra (we never come back to the original definition of PT).

Proposition 19

Let Inline graphic and . Moreover, let and be -covers of and respectively. There exists a -cover of such that for every we have and satisfying .

Proof

We start with standard definitions that we need to describe Inline graphic . For every , we associate a preorder over . For , we write to indicate that for every such that , we have . Clearly, is a preorder which is coarser than : for every such that , we have . Moreover, we write for the equivalence generated by this preorder: if and only if for every Inline graphic such that . Clearly, has finite index.
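The preorder and the equivalence introduced above can be made concrete by computing, for a word, the set of its pieces of length at most k. The sketch assumes the usual reading of these definitions (u is below v when every piece of u of length at most k is also a piece of v, and two words are equivalent when they have the same pieces of length at most k); the names `pieces_up_to`, `leq_k` and `sim_k` are hypothetical.

```python
from itertools import combinations


def pieces_up_to(w, k):
    """Set of all pieces of w whose length is at most k."""
    result = set()
    for n in range(min(k, len(w)) + 1):
        # A piece of length n is obtained by keeping n positions of w in order.
        for positions in combinations(range(len(w)), n):
            result.add("".join(w[i] for i in positions))
    return result


def leq_k(u, v, k):
    """Every piece of u of length at most k is also a piece of v."""
    return pieces_up_to(u, k) <= pieces_up_to(v, k)


def sim_k(u, v, k):
    """u and v have exactly the same pieces of length at most k."""
    return pieces_up_to(u, k) == pieces_up_to(v, k)
```

For instance, `sim_k("abab", "ababa", 2)` holds since both words contain every word of length at most two over {a, b} as a piece, whereas `sim_k("ab", "ba", 2)` fails. The equivalence has finite index because, over a finite alphabet, there are only finitely many sets of words of length at most k.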

Since Inline graphic and are -covers, there exists some number k such that every language is a finite Boolean combination of languages for such that . In other words, every such language K is a union of -classes. Moreover, we may choose k so that and . We shall define the cover as a set of -classes for an appropriate number h that we choose using the following technical lemma.

Lemma 20

Let Inline graphic , and such that . There exist such that , and .

Proof

We claim that there exist Inline graphic with length at most such that and . We first use this claim to prove the lemma. Clearly, and . Therefore, since , it follows that . This yields a decomposition such that and . Since and , this implies and as desired.

It remains to prove the claim. We only construct a piece Inline graphic such that and , as the construction of z is analogous. Let F be the set of all pieces of of size at most k, that is,

Clearly, Inline graphic . For , let be the set of words of F that are pieces of x. Let be some decomposition of . Note that . We say that the occurrence of a given by the decomposition is bad if . Let y be the word obtained from by deleting all bad letters (and keeping the other ones). By construction, Inline graphic and . The latter property implies that for every . By definition of F, this means that . Furthermore, letters of y are not bad, and one may verify that there are at most such letters. Therefore, , which concludes the proof.

We define Inline graphic . It is immediate that every -class is a language of (it is a Boolean combination of languages for such that ). Hence, the set containing all -classes which intersect is a -cover of . It remains to show that for every , there exist and such that . We fix the language for the proof. We need the following result.

Lemma 21

Let Inline graphic be a finite language. There exist and such that .

Proof

Let Inline graphic be the words in H, i.e., . Our goal is to find and such that for all . Therefore, we first have to find a suitable decomposition of each word as , and then to show that all ’s belong to some and all ’s belong to some .

By definition, K is a Inline graphic -class and it intersects . This yields a word such that . Since , there exist and such that . Let . We may write the relations as follows:

Since Inline graphic by definition, we may apply Lemma 20 times to get and such that,

for every and , we have , and,
, and,
.

Since Inline graphic , the first property and the pigeonhole principle yield such that and . For every , we let and . Therefore, for all , we have .

The second and third properties now yield Inline graphic and , whence:

Recall that Inline graphic by definition of k. Since and , it follows that . Since is a cover of , this yields such that . Since is a union of -classes by choice of k and since , we deduce that . Symmetrically, we obtain such that . Finally, since for every , this yields , as desired.

We may now finish the proof. For every Inline graphic , we let be the (finite) language containing all words of length at most n in K. Clearly, and for every . Moreover, Lemma 21 implies that for every , we have and such that . Since and are finite sets, there exist and such that and for infinitely many n. Since for every , it then follows that Inline graphic for every . Finally, since , this implies which concludes the proof.

Simon’s Theorem

We may now present and prove Simon’s theorem. It characterizes the piecewise testable languages as those whose syntactic monoid is $\mathcal{J}$-trivial. The original definition of this notion is based on the Green relation $\mathcal{J}$ defined on every finite monoid. Here, we do not consider this relation. Instead, we use an equational definition. A finite monoid M is $\mathcal{J}$-trivial when it satisfies the following property:

$$(st)^{\omega} s = (st)^{\omega} = t (st)^{\omega} \quad \text{for every } s, t \in M. \qquad (3)$$

Theorem 22

(Simon [38]). A regular language is piecewise testable if and only if its syntactic monoid is $\mathcal{J}$-trivial.

As expected, the main application of Simon’s theorem is the decidability of PT-membership. Given a regular language L as input, one may compute its syntactic monoid and check whether it satisfies Eq. (3) by testing all possible combinations. By Theorem 22, this decides whether L is piecewise testable. Yet, as for the star-free languages in Sect. 4, this theorem is also important for the arguments that are required to prove it. We present such a proof now.
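The test "by all possible combinations" can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure in detail: the input is a complete and minimal DFA whose transition monoid stands in for the syntactic monoid, the equation checked is the standard equational form of J-triviality, (st)^ω s = (st)^ω = t(st)^ω, and all helper names are hypothetical.

```python
def transition_monoid(n_states, alphabet, delta):
    """All state transformations induced by words of a complete DFA."""
    identity = tuple(range(n_states))
    generators = [tuple(delta[(q, a)] for q in range(n_states)) for a in alphabet]
    monoid = {identity}
    frontier = [identity]
    while frontier:
        t = frontier.pop()
        for g in generators:
            tg = tuple(g[t[q]] for q in range(n_states))
            if tg not in monoid:
                monoid.add(tg)
                frontier.append(tg)
    return monoid


def compose(s, t):
    """Product st: apply the transformation s first, then t."""
    return tuple(t[s[q]] for q in range(len(s)))


def idempotent_power(s):
    """Return the unique idempotent among the powers of s."""
    power = s
    while compose(power, power) != power:
        power = compose(power, s)
    return power


def is_j_trivial(monoid):
    """Check (st)^w s = (st)^w = t (st)^w for every pair of elements."""
    for s in monoid:
        for t in monoid:
            e = idempotent_power(compose(s, t))
            if compose(e, s) != e or compose(t, e) != e:
                return False
    return True


# Minimal DFA of A* a A* b A* over {a, b} (words having "ab" as a piece).
has_ab = transition_monoid(
    3, "ab",
    {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 2, (2, "a"): 2, (2, "b"): 2},
)

# Minimal DFA of (ab)* over {a, b}: state 2 is a rejecting sink.
ab_star = transition_monoid(
    3, "ab",
    {(0, "a"): 1, (0, "b"): 2, (1, "a"): 2, (1, "b"): 0, (2, "a"): 2, (2, "b"): 2},
)
```

On these inputs, `is_j_trivial(has_ab)` is true, as expected for a language that is piecewise testable by definition, while `is_j_trivial(ab_star)` is false: with s and t the transformations of the letters a and b, the element (st)^ω s differs from (st)^ω.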

Proof

We fix an alphabet A and a regular language $L \subseteq A^*$ for the proof. Let $\alpha: A^* \to M$ be the syntactic morphism of L. We prove that $L \in PT$ if and only if M is $\mathcal{J}$-trivial. We start with the left to right implication which is essentially immediate from Lemma 18. As expected, the difficult and most interesting part of the proof is the converse implication.

From piecewise testable languages to $\mathcal{J}$-triviality. Assume that we have $L \in PT$. We prove that M is $\mathcal{J}$-trivial: (3) holds. Let $s, t \in M$, we have to show that $(st)^{\omega} s = (st)^{\omega} = t (st)^{\omega}$.

Since Inline graphic is a syntactic morphism, it is surjective and there exists such that and . Moreover, since , Lemma 18 yields such that . By definition of the syntactic morphism, this implies that . Since and , this yields as desired.

From $\mathcal{J}$-triviality to piecewise testable languages. Assume that M is $\mathcal{J}$-trivial. We show that L is piecewise testable. We rely on the notions introduced in Sect. 3 and directly prove that every language recognized by the syntactic morphism of L is piecewise testable. The argument is based on Proposition 7: we use induction to construct a PT-cover of $A^*$ which is confined by this morphism. By the proposition, this implies that every language recognized by the morphism belongs to PT. We start with a preliminary definition that we require to formulate the induction.

Given a finite set of languages Inline graphic , and , we say that is (s, t)-safe if for every and , we have . The argument is based on the following lemma.

Lemma 23

Let Inline graphic and . There exists a -cover of which is (s, t)-safe.

We first use Lemma 23 to complete the main argument. We apply the lemma for Inline graphic and . Since , this yields a -cover of which is -safe. Thus, for every and , we have . By Fact 6, this implies that is confined by , concluding the proof.

It remains to prove Lemma 23. Let Inline graphic and . We construct a -cover of which is (s, t)-safe. We write for the following set:

We proceed by induction on the two following parameters, listed by order of importance:

The size of P[s, w, t].
The length of w.

We consider two cases depending on whether w is empty or not. We first assume that this property holds.

First case: Inline graphic . We handle this case using induction on our first parameter. Let be the language of all words such that . We use induction to build a -cover of H (note that it may happen that H is empty in which case we do not need induction).

Fact 24

There exists a Inline graphic -cover of H which is (s, t)-safe.

Proof

One may verify with a pumping argument that there exists a finite set Inline graphic such that (this is also an immediate consequence of Higman’s lemma). Hence, it suffices to prove that for every , there exists a -cover of which is (s, t)-safe. Indeed, one may then choose to be the union of all covers for . We fix for the proof.

Since Inline graphic , we have . Since is surjective (it is a syntactic morphism), it follows that . Therefore, we have and . Since by definition of H, we get . Hence, induction on the first parameter in Lemma 23 (the size of P[s, w, t]) yields a -cover of which is (s, t)-safe, as desired.

We let Inline graphic be the -cover of H given by Fact 24. We define,

Finally, we let Inline graphic . It is immediate that is a -cover of since is a Boolean algebra. It remains to verify that is (s, t)-safe. Consider and let . We prove that . If , this is immediate since is (s, t)-safe by construction. Hence, it suffices to show that is (s, t)-safe. This is a direct consequence of the following fact. Note that this is the only place in the proof where we use the hypothesis that M satisfies (3).

Fact 25

For every word Inline graphic , we have .

Proof

Let Inline graphic . By definition of , for every . Since is a cover of H, it follows that . By definition of H, it follows that . By definition, this yields such that , and . The latter property yields such that , and . We prove that and , which yields as desired that . By symmetry, we only show that Inline graphic .

Since Inline graphic , we have . Moreover, since , we have and such that and . It follows from (3) that for every , we have:

This yields Inline graphic . Therefore, since we know that , we obtain . Finally, this yields,

This concludes the proof.

Second case: Inline graphic . In that case, we have and such that (the choice of u, v and a is arbitrary). Consider the two following subsets of M:

Moreover, we say that a cover Inline graphic of some language H is tight when for every . We use induction to prove the following fact.

Fact 26

There exist tight Inline graphic -covers and of and which satisfy the following properties:

for every , the cover of is (sr, t)-safe.
for every , the cover of is (s, rt)-safe.

Proof

We construct Inline graphic (the construction of is symmetrical). Let . For every , assume that we already have a -cover of which is -safe. We define,

Since Inline graphic is a Boolean algebra, it is immediate that is a tight -cover of which is (sr, t)-safe for every . Thus, it remains to build for every such a -cover .

We fix Inline graphic for the proof. By definition of , we have for some word . Observe that since , we have by definition: our first induction parameter (i.e., the size of P[s, w, t]) has not increased. Hence, since , it follows by induction on our second parameter in Lemma 23 (the length of w) that there exists a Inline graphic -cover of which is -safe. This concludes the proof.

We are ready to construct the desired Inline graphic -cover of . Consider the tight -covers and of and described in Fact 26. Since , Proposition 19 yields a -cover of such that for every , there exist and satisfying . It remains to prove that is (s, t)-safe. Let and . We prove that .

By definition, Inline graphic for and . Hence, there exist and such that and . Since is a tight cover of , we know that , which implies that by definition. It follows that is -safe by Fact 26. Therefore, since and , we obtain . Symmetrically, one may verify that . Altogether, it follows that , meaning that Inline graphic . This concludes the proof of Lemma 23.

Conclusion

We explained how covering provides a natural and convenient framework for handling membership questions. We illustrated this point by using covers to formulate new proofs for Schützenberger’s theorem and Simon’s theorem. We chose these two examples as they are arguably the two most famous characterization theorems of this kind. However, this approach is also relevant for other prominent characterization theorems. A first promising example is the class of unambiguous languages. It was also characterized by Schützenberger [37] and it is also famous as the class of languages that can be defined in two-variable first-order logic (this was shown by Thérien and Wilke [40]). Another interesting example is Knast’s theorem [13] which characterizes the languages of dot-depth one. This class is a natural generalization of the piecewise testable languages.

Contributor Information

Alberto Leporati, Email: alberto.leporati@unimib.it.

Carlos Martín-Vide, Email: carlos.martin@urv.cat.

Dana Shapira, Email: shapird@g.ariel.ac.il.

Claudio Zandron, Email: zandron@disco.unimib.it.

Thomas Place, Email: tplace@labri.fr.

References

1. Almeida J. Implicit operations on finite j-trivial semigroups and a conjecture of I. Simon. J. Pure Appl. Algebra. 1990;69:205–218. doi: 10.1016/0022-4049(91)90019-X.
2. Brzozowski JA, Cohen RS. Dot-depth of star-free events. J. Comput. Syst. Sci. 1971;5(1):1–16. doi: 10.1016/S0022-0000(71)80006-5.
3. Brzozowski JA, Knast R. The dot-depth hierarchy of star-free languages is infinite. J. Comput. Syst. Sci. 1978;16(1):37–55. doi: 10.1016/0022-0000(78)90049-1.
4. Brzozowski JA, Simon I. Characterizations of locally testable events. Discrete Math. 1973;4(3):243–271. doi: 10.1016/S0012-365X(73)80005-6.
5. Colcombet T. Green’s relations and their use in automata theory. In: Dediu A-H, Inenaga S, Martín-Vide C, editors. Language and Automata Theory and Applications; Heidelberg: Springer; 2011. pp. 1–21.
6. Czerwiński W, Martens W, Masopust T. Efficient separability of regular languages by subsequences and suffixes. In: Fomin FV, Freivalds R, Kwiatkowska M, Peleg D, editors. Automata, Languages, and Programming; Heidelberg: Springer; 2013. pp. 150–161.
7. Diekert V, Gastin P. First-order definable languages. In: Flum J, Grädel E, Wilke T, editors. Logic and Automata: History and Perspectives, Texts in Logic and Games. Amsterdam: Amsterdam University Press; 2008. pp. 261–306.
8. Eilenberg S. Automata, Languages, and Machines. Orlando: Academic Press Inc.; 1976.
9. Higgins P. A proof of Simon’s theorem on piecewise testable languages. Theor. Comput. Sci. 1997;178(1):257–264. doi: 10.1016/S0304-3975(96)00230-7.
10. Higgins PM. A new proof of Schützenberger’s theorem. Int. J. Algebra Comput. 2000;10(02):217–220. doi: 10.1142/S0218196700000066.
11. Kamp, H.W.: Tense logic and the theory of linear order. Ph.D. thesis, Computer Science Department, University of California at Los Angeles, USA (1968)
12. Klima O. Piecewise testable languages via combinatorics on words. Discrete Math. 2011;311(20):2124–2127. doi: 10.1016/j.disc.2011.06.013.
13. Knast R. A semigroup characterization of dot-depth one languages. RAIRO - Theor. Inform. Appl. 1983;17(4):321–330. doi: 10.1051/ita/1983170403211.
14. Lucchesi CL, Simon I, Simon I, Simon J, Kowaltowski T. Aspectos teóricos da computação. Sao Paulo: IMPA; 1979.
15. McNaughton R, Papert SA. Counter-Free Automata. Cambridge: MIT Press; 1971.
16. Meyer AR. A note on star-free events. J. ACM. 1969;16(2):220–225. doi: 10.1145/321510.321513.
17. Perrin, D.: Finite automata. In: Formal Models and Semantics. Elsevier (1990)
18. Pin JE. Varieties of Formal Languages. New York: Plenum Publishing Co.; 1986.
19. Pin JE. A variety theorem without complementation. Russ. Math. (Izvestija vuzov.Matematika) 1995;39:80–90.
20. Pin, J.E.: The dot-depth hierarchy, 45 years later, pp. 177–202. World Scientific (2017). (chap. 8)
21. Pin, J.E.: Mathematical foundations of automata theory (2019, in preparation). https://www.irif.fr/~jep/PDF/MPRI/MPRI.pdf
22. Pin JE, Weil P. Polynomial closure and unambiguous product. Theory Comput. Syst. 1997;30(4):383–422. doi: 10.1007/BF02679467.
23. Pippenger N. Theories of Computability. Cambridge: Cambridge University Press; 1997.
24. Place, T.: Separating regular languages with two quantifier alternations. Log. Methods Comput. Sci. 14(4) (2018)
25. Place T, van Rooijen L, Zeitoun M. Separating regular languages by piecewise testable and unambiguous languages. In: Chatterjee K, Sgall J, editors. Mathematical Foundations of Computer Science 2013; Heidelberg: Springer; 2013. pp. 729–740.
26. Place T, Zeitoun M. Going higher in the first-order quantifier alternation hierarchy on words. In: Esparza J, Fraigniaud P, Husfeldt T, Koutsoupias E, editors. Automata, Languages, and Programming; Heidelberg: Springer; 2014. pp. 342–353.
27. Place, T., Zeitoun, M.: Separating regular languages with first-order logic. In: Proceedings of the Joint Meeting of the 23rd EACSL Annual Conference on Computer Science Logic (CSL 2014) and the 29th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2014), pp. 75:1–75:10. ACM, New York (2014)
28. Place T, Zeitoun M. The tale of the quantifier alternation hierarchy of first-order logic over words. SIGLOG News. 2015;2(3):4–17. doi: 10.1145/2815493.2815495.
29. Place, T., Zeitoun, M.: Separating regular languages with first-order logic. Log. Methods Comput. Sci. 12(1) (2016)
30. Place, T., Zeitoun, M.: Separation for dot-depth two. In: Proceedings of the 32nd Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2017), pp. 202–213. IEEE Computer Society (2017)
31. Place, T., Zeitoun, M.: The covering problem. Log. Methods Comput. Sci. 14(3) (2018)
32. Place T, Zeitoun M. Generic results for concatenation hierarchies. Theory Comput. Syst. (ToCS) 2019;63(4):849–901. doi: 10.1007/s00224-018-9867-0.
33. Place T, Zeitoun M. Going higher in first-order quantifier alternation hierarchies on words. J. ACM. 2019;66(2):12:1–12:65. doi: 10.1145/3303991.
34. Place, T., Zeitoun, M.: On all things star-free. In: Proceedings of the 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019), pp. 126:1–126:14 (2019)
35. Place, T., Zeitoun, M.: Separation and covering for group based concatenation hierarchies. In: Proceedings of the 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2019), pp. 1–13 (2019)
36. Schützenberger MP. On finite monoids having only trivial subgroups. Inf. Control. 1965;8(2):190–194. doi: 10.1016/S0019-9958(65)90108-7.
37. Schützenberger MP. Sur le produit de concaténation non ambigu. Semigroup Forum. 1976;13:47–75. doi: 10.1007/BF02194921.
38. Simon I. Piecewise testable events. In: Brakhage H, editor. Automata Theory and Formal Languages; Heidelberg: Springer; 1975. pp. 214–222.
39. Straubing H, Thérien D. Partially ordered finite monoids and a theorem of I. Simon. J. Algebra. 1988;119(2):393–399. doi: 10.1016/0021-8693(88)90067-1.
40. Thérien, D., Wilke, T.: Over words, two variables are as powerful as one quantifier alternation. In: Proceedings of the 30th Annual ACM Symposium on Theory of Computing (STOC 1998), pp. 234–240. ACM, New York (1998)
41. Wilke T. Classifying discrete temporal properties. In: Meinel C, Tison S, editors. STACS 99; Heidelberg: Springer; 1999. pp. 32–46.
42. Zalcstein Y. Locally testable languages. J. Comput. Syst. Sci. 1972;6(2):151–167. doi: 10.1016/S0022-0000(72)80020-5.
