Abstract
A recently introduced operation of geometrical closure on formal languages is investigated. It is proved that the geometrical closure of a language from the positive variety $\mathcal{V}_{3/2}$, the level 3/2 of the Straubing-Thérien hierarchy of star-free languages, always falls into the variety $\mathcal{R}_{LT}$, which is a new variety consisting of specific R-trivial languages. As a consequence, each class of regular languages lying between $\mathcal{R}_{LT}$ and $\mathcal{V}_{3/2}$ is geometrically closed.
Keywords: Language varieties, Geometrical closure, Straubing-Thérien hierarchy, R-trivial monoid
Introduction
A geometrical closure is an operation on formal languages introduced recently by Dubernard, Guaiana, and Mignot [8]. It is defined as follows: Take any language L over some k-letter alphabet and consider the set called the figure of L in [8], which consists of all elements of $\mathbb{N}^k$ corresponding to Parikh vectors of prefixes of words from L. The geometrical closure of L is the language $\gamma(L)$ of all words w such that the Parikh vectors of all the prefixes of w lie in the figure of L. This closure operator was inspired by the previous works of Blanpain, Champarnaud, and Dubernard [4] and Béal et al. [3], in which geometrical languages are studied. Using the terminology of the later paper [8], these can be described as languages whose prefix closure is equal to their geometrical closure. Note that this terminology was motivated by the fact that a geometrical language is completely determined by its (geometrical) figure. In the particular case of binary alphabets, these (geometrical) figures were illustrated by plane diagrams in [8].
The class of all regular languages is easily observed not to be geometrically closed: one can find a regular language whose geometrical closure is not regular [8] (see also the end of Sect. 2). One possible research aim could be to characterise regular languages L for which $\gamma(L)$ is regular, or to describe some robust classes of languages with this property. Another problem posed in [8] is to find some subclasses of regular languages that are geometrically closed. As we explain in Sect. 3, non-empty group languages have their geometrical closure equal to the universal language $\Sigma^*$. For this reason, it makes sense to look for more interesting geometrically closed subclasses among star-free languages, which are known to be “group-free”. More precisely, a language L is star-free if and only if the syntactic monoid of L is aperiodic, that is, if it does not contain non-trivial groups as subsemigroups.
It is well known that the star-free languages are classified into the Straubing-Thérien hierarchy based on polynomial and Boolean operations. In particular, the variety $\mathcal{V}_1$ (i.e., the variety of languages of level 1) is formed by piecewise testable languages and the positive variety $\mathcal{V}_{3/2}$ is formed by polynomials built from languages of level 1. We refer to the survey paper by Pin [12] for an introduction to the Straubing-Thérien hierarchy of star-free languages and the algebraic theory of regular languages in general. This theory is based on the Eilenberg correspondence between varieties of regular languages and pseudovarieties of finite monoids. Note that one well-known instance of the Eilenberg correspondence, which plays an essential role in our contribution, is given by the pseudovariety of finite R-trivial monoids, for which the corresponding variety of languages is denoted by $\mathcal{R}$. Nevertheless, we emphasise that our contribution is rather elementary, and it does not use sophisticated tools developed in the algebraic theory of regular languages.
It was proved by Dubernard, Guaiana, and Mignot [8] that the class of all binary languages from the positive variety $\mathcal{V}_{3/2}$ is geometrically closed. They obtained this result by decomposing the plane diagram of the figure of a given language into specific types of basic subdiagrams, and using this decomposition to construct a regular expression for the language $\gamma(L)$.
We prove a generalisation of the above-mentioned result in this contribution. Our approach is to concentrate on the form of languages that may arise as $\gamma(L)$ for L taken from $\mathcal{V}_{3/2}$. In other words, we do not construct a concrete regular expression for $\gamma(L)$, but we determine what kind of expression exists for such a language. In particular, we introduce a new variety of languages $\mathcal{R}_{LT}$, which is a subvariety of the variety $\mathcal{R}$. Note that there is a transparent description of languages from $\mathcal{R}$ and also an effective characterisation via the so-called acyclic automata (both are recalled in Sect. 4). The variety of languages $\mathcal{R}_{LT}$ is then characterised in the same manner: a precise description by specific regular expressions and also an automata-based characterisation are given. The letters LT in the notation $\mathcal{R}_{LT}$ refer to a characteristic property of acyclic automata in which “loops are transferred” along paths.
We show that the geometrical closure of a language from the positive variety $\mathcal{V}_{3/2}$ always falls into the variety $\mathcal{R}_{LT}$. As a consequence, each class of regular languages lying between $\mathcal{R}_{LT}$ and $\mathcal{V}_{3/2}$ is geometrically closed. In particular, the positive variety $\mathcal{V}_{3/2}$ is geometrically closed regardless of the alphabet, as is the variety $\mathcal{R}$.
Preliminaries
All automata considered in this paper are understood to be deterministic and finite. An automaton is thus a five-tuple $\mathcal{A} = (Q, \Sigma, \cdot, \iota, F)$, where Q is a finite set of states, $\Sigma$ is a non-empty finite alphabet, $\cdot : Q \times \Sigma \to Q$ is a complete transition function, $\iota \in Q$ is the unique initial state, and $F \subseteq Q$ is the set of final states. The minimal automaton of a given language L is denoted by $\mathcal{D}_L$.
By a (positive) variety of languages, we always understand what is called a (positive) $*$-variety in [12]. We briefly recall this notion for the reader's convenience. A class of languages $\mathcal{C}$ is an operator, which determines, for each finite non-empty alphabet $\Sigma$, a set $\mathcal{C}(\Sigma^*)$ of languages over $\Sigma$. A positive variety is a class of regular languages $\mathcal{V}$ such that each $\mathcal{V}(\Sigma^*)$ is closed under quotients, finite unions and intersections, and the whole class is closed under preimages in homomorphisms. A positive variety $\mathcal{V}$ is a variety if each $\mathcal{V}(\Sigma^*)$ is closed under complementation. Note that an alphabet could be fixed in our contribution, so homomorphisms among different alphabets play no role, and we could consider lattices of languages [9] instead of varieties of languages. However, we prefer to stay within the theory of (positive) varieties of languages, as a primary aim of this paper is to describe robust classes closed under geometrical closure.
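Given words u, v over an alphabet $\Sigma$, we write $u \le v$ if u is a prefix of v. We also write, for each $L \subseteq \Sigma^*$,
$$\mathrm{pref}^{\uparrow}(L) = \{ u \in \Sigma^* \mid \exists w \in L : u \le w \}, \qquad \mathrm{pref}^{\downarrow}(L) = \{ w \in \Sigma^* \mid \forall u \in \Sigma^* : u \le w \Rightarrow u \in L \}.$$
We call these languages the prefix closure and the prefix reduction of L, respectively. Both are prefix-closed, while $\mathrm{pref}^{\downarrow}(L) \subseteq L$ and $L \subseteq \mathrm{pref}^{\uparrow}(L)$.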
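The two prefix operators can be made concrete for finite languages. The following is a minimal sketch, assuming a language is represented as a Python set of strings; the function names are illustrative and not taken from the paper.

```python
def prefixes(word):
    """Return all prefixes of `word`, including the empty word and `word` itself."""
    return [word[:i] for i in range(len(word) + 1)]


def prefix_closure(language):
    """All prefixes of words of a finite language (a set of strings)."""
    return {u for w in language for u in prefixes(w)}


def prefix_reduction(language):
    """Words of the language all of whose prefixes also lie in the language."""
    return {w for w in language if all(u in language for u in prefixes(w))}


L = {"", "a", "ab", "ba"}
print(sorted(prefix_closure(L)))    # ['', 'a', 'ab', 'b', 'ba']
print(sorted(prefix_reduction(L)))  # ['', 'a', 'ab']
```

The word "ba" is dropped by the prefix reduction because its prefix "b" is not in the language, while the prefix closure adds exactly that missing prefix.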
Proposition 1
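Each positive variety $\mathcal{V}$ is closed under the operator $\mathrm{pref}^{\uparrow}$.
Proof
It is well known that each regular language has finitely many right quotients by words. Thus, for each alphabet $\Sigma$ and each $L \in \mathcal{V}(\Sigma^*)$, the language
$$\mathrm{pref}^{\uparrow}(L) = \bigcup_{w \in \Sigma^*} L w^{-1}$$
is a finite union of right quotients of L, and its membership in $\mathcal{V}(\Sigma^*)$ follows.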
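Let $\Sigma = \{a_1, \ldots, a_k\}$ be a linearly ordered alphabet. The Parikh vector of a word w in $\Sigma^*$ is then given by $\Psi(w) = (|w|_{a_1}, \ldots, |w|_{a_k})$, where $|w|_a$ denotes the number of occurrences of the letter a in w. This notation extends naturally to languages: we write $\Psi(L)$ for $\{\Psi(w) \mid w \in L\}$. We denote by [w] the equivalence class of the kernel relation of $\Psi$, i.e. $[w] = \{u \in \Sigma^* \mid \Psi(u) = \Psi(w)\}$. Then we also write, for each language $L \subseteq \Sigma^*$,
$$[L] = \bigcup_{w \in L} [w],$$
and we call [L] the commutative closure of L. A language L such that $L = [L]$ is called commutative. A class of languages $\mathcal{C}$ is said to be closed under commutation if for each alphabet $\Sigma$, the language [L] belongs to $\mathcal{C}(\Sigma^*)$ whenever $L \in \mathcal{C}(\Sigma^*)$.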
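As an illustration of Parikh vectors and the commutative closure, here is a small sketch for finite languages. The representation (strings for words, an ordered string for the alphabet) and the function names are assumptions made for this example only.

```python
from collections import Counter
from itertools import permutations


def parikh(word, alphabet):
    """Parikh vector of `word` with respect to the ordered `alphabet`."""
    counts = Counter(word)
    return tuple(counts[a] for a in alphabet)


def commutative_closure(language):
    """Commutative closure of a finite language: all rearrangements of its words."""
    return {"".join(p) for w in language for p in permutations(w)}


print(parikh("abba", "ab"))                      # (2, 2)
print(sorted(commutative_closure({"ab", "c"})))  # ['ab', 'ba', 'c']
```

Two words lie in the same class [w] exactly when `parikh` returns the same tuple for both, which is why permuting letters suffices for finite languages.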
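In the previous paragraph we consider the mapping $\Psi : \Sigma^* \to \mathbb{N}^k$, where $\mathbb{N}$ is the set of all non-negative integers. Following the ideas of [8], we introduce some technical notations concerning $\mathbb{N}^k$, whose elements are called vectors. We denote by $\mathbf{0}$ the null vector of $\mathbb{N}^k$. Let $\mathbf{x} = (x_1, \ldots, x_k)$ and $\mathbf{y} = (y_1, \ldots, y_k)$ be vectors and $s \in \{1, \ldots, k\}$ be an index. We write $\mathbf{x} \to_s \mathbf{y}$ if $y_s = x_s + 1$ and, at the same time, $y_i = x_i$ for all $i \ne s$. Moreover, $\mathbf{x} \to \mathbf{y}$ means that $\mathbf{x} \to_s \mathbf{y}$ for some index s. A path in $\mathbb{N}^k$ is a finite sequence $\pi = [\mathbf{x}_0, \ldots, \mathbf{x}_n]$ of vectors from $\mathbb{N}^k$ such that $\mathbf{x}_0 = \mathbf{0}$ and $\mathbf{x}_{i-1} \to \mathbf{x}_i$ for $i = 1, \ldots, n$; more specifically, we say that $\pi$ is a path leading to $\mathbf{x}_n$. This means that a path always begins in $\mathbf{0}$ and each other vector of the path is obtained from the previous one by incrementing exactly one of its coordinates by one. If in addition $\mathbf{x}_0, \ldots, \mathbf{x}_n$ all belong to a set $F \subseteq \mathbb{N}^k$, we say that $\pi$ is a path in F and write $\pi \subseteq F$.
Given a word $w = a_{i_1} a_{i_2} \cdots a_{i_n}$ in $\Sigma^*$, we write $\pi(w)$ for the unique path $[\mathbf{x}_0, \ldots, \mathbf{x}_n]$ in $\mathbb{N}^k$ such that $\mathbf{x}_{j-1} \to_{i_j} \mathbf{x}_j$ for $j = 1, \ldots, n$. Conversely, for each path $\pi$ in $\mathbb{N}^k$, there is a unique word w such that $\pi(w) = \pi$. We denote this unique word w by $\|\pi\|$. For each $F \subseteq \mathbb{N}^k$, we denote by $\|F\|$ the set $\{\|\pi\| \mid \pi \subseteq F\}$. Note that the language $\|F\|$ is prefix-closed.
Moreover, we put $\mathrm{fig}(L) = \bigcup_{w \in L} \pi(w)$ for each $L \subseteq \Sigma^*$, where a path is identified with the set of its vectors. The set $\mathrm{fig}(L)$ is a connex figure in the sense of [8], i.e., for each $\mathbf{x} \in \mathrm{fig}(L)$, there is a path $\pi$ leading to $\mathbf{x}$ such that $\pi \subseteq \mathrm{fig}(L)$.
Finally, the geometrical closure of L is the language $\gamma(L) = \|\mathrm{fig}(L)\|$. A class of languages $\mathcal{C}$ is said to be geometrically closed if the language $\gamma(L)$ belongs to $\mathcal{C}(\Sigma^*)$ whenever L does, for each alphabet $\Sigma$.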
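Note that the class of all regular languages is not geometrically closed, as observed in [8]. For instance, the language $L = a^*(ab)^*$ is regular, while its geometrical closure $\gamma(L) = \{ w \in \{a, b\}^* \mid |u|_a \ge |u|_b \text{ for each prefix } u \text{ of } w \}$ is the prefix closure of the Dyck language.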
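Membership in a geometrical closure can be tested directly from the definition once the figure is available. The sketch below is illustrative only: the function names are hypothetical, the figure is passed as a membership predicate, and the second example assumes a figure of the shape {(x, y) : x ≥ y}, which is the figure of the sample language a*(ab)* (an assumed example language).

```python
from collections import Counter


def parikh(word, alphabet):
    """Parikh vector of `word` with respect to the ordered `alphabet`."""
    counts = Counter(word)
    return tuple(counts[a] for a in alphabet)


def figure_of_finite_language(language, alphabet):
    """Parikh vectors of all prefixes of all words of a finite language."""
    return {parikh(w[:i], alphabet) for w in language for i in range(len(w) + 1)}


def in_geometrical_closure(word, in_figure, alphabet):
    """True if the Parikh vector of every prefix of `word` satisfies `in_figure`."""
    return all(in_figure(parikh(word[:i], alphabet)) for i in range(len(word) + 1))


# Finite example: L = {"ab"} over the ordered alphabet (a, b).
fig = figure_of_finite_language({"ab"}, "ab")
print(in_geometrical_closure("ab", fig.__contains__, "ab"))  # True
print(in_geometrical_closure("ba", fig.__contains__, "ab"))  # False

# Infinite figure given as a predicate (assumption: all vectors with x >= y).
dyck_like = lambda v: v[0] >= v[1]
print(in_geometrical_closure("aabab", dyck_like, "ab"))  # True
print(in_geometrical_closure("abba", dyck_like, "ab"))   # False
```

With the predicate `dyck_like`, the accepted words are exactly those in which no prefix contains more occurrences of b than of a, that is, prefixes of well-bracketed words.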
A Characterisation of the Geometrical Closure
We now characterise the operation of geometrical closure via three simpler operations: the prefix closure, the commutative closure, and the prefix reduction. This characterisation is a key to our later considerations.
Proposition 2
If L is a language over $\Sigma$, then $\gamma(L) = \mathrm{pref}^{\downarrow}([\mathrm{pref}^{\uparrow}(L)])$.
Proof
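By definition,
$$\gamma(L) = \|\mathrm{fig}(L)\| = \{ \|\pi\| \mid \pi \subseteq \mathrm{fig}(L) \}.$$
If $w \in \gamma(L)$, then there is a path $\pi \subseteq \mathrm{fig}(L)$ such that $\|\pi\| = w$. For an arbitrary prefix u of w, we have $\Psi(u) \in \pi(v)$ for some $v \in L$. It follows that $\Psi(u)$ belongs to $\Psi(\mathrm{pref}^{\uparrow}(L))$. Hence $u \in [\mathrm{pref}^{\uparrow}(L)]$ and w belongs to $\mathrm{pref}^{\downarrow}([\mathrm{pref}^{\uparrow}(L)])$.
On the other hand, if w belongs to $\mathrm{pref}^{\downarrow}([\mathrm{pref}^{\uparrow}(L)])$, then all prefixes u of w belong to $[\mathrm{pref}^{\uparrow}(L)]$. Thus $\Psi(u)$ is in $\Psi(\mathrm{pref}^{\uparrow}(L)) = \mathrm{fig}(L)$ for each $u \le w$, and $\pi(w)$ is a path in $\mathrm{fig}(L)$, implying that w is in $\gamma(L)$.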
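The decomposition in Proposition 2 can be checked by brute force on small finite languages. The following sketch is only a sanity check under stated assumptions: languages are finite sets of strings, the left-hand side is computed from the definition (every prefix must have its Parikh vector in the figure), and all function names are hypothetical.

```python
from collections import Counter
from itertools import permutations, product


def prefixes(w):
    return [w[:i] for i in range(len(w) + 1)]


def pref_up(L):
    """Prefix closure of a finite language."""
    return {u for w in L for u in prefixes(w)}


def pref_down(L):
    """Prefix reduction of a finite language."""
    return {w for w in L if all(u in L for u in prefixes(w))}


def comm(L):
    """Commutative closure of a finite language."""
    return {"".join(p) for w in L for p in permutations(w)}


def parikh(w, alphabet):
    counts = Counter(w)
    return tuple(counts[a] for a in alphabet)


def gamma(L, alphabet):
    """Geometrical closure of a finite language, computed from the definition."""
    fig = {parikh(u, alphabet) for u in pref_up(L)}
    max_len = max((len(w) for w in L), default=-1)
    result = set()
    # Words in the closure are never longer than the longest word of L.
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if all(parikh(u, alphabet) in fig for u in prefixes(w)):
                result.add(w)
    return result


for L in [{"ab"}, {"ab", "ba"}, {"aab", "b"}, {""}, set()]:
    assert gamma(L, "ab") == pref_down(comm(pref_up(L)))
print(sorted(gamma({"ab"}, "ab")))  # ['', 'a', 'ab']
```

For infinite regular languages the same identity holds, but the three operators then have to be carried out on automata or expressions rather than on explicit sets.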
As a direct consequence of Propositions 1 and 2, we obtain the following sufficient condition, under which a positive variety of languages is geometrically closed.
Corollary 3
Each positive variety of regular languages closed under prefix reduction and commutation is geometrically closed.
Some positive varieties of languages $\mathcal{V}$ are geometrically closed for trivial reasons, for instance all $\mathcal{V}$ such that $\gamma(L) = \Sigma^*$ for all non-empty $L \in \mathcal{V}(\Sigma^*)$. Let us observe that this is the case for L whenever $\mathrm{pref}^{\uparrow}(L) = \Sigma^*$. The following lemma is easy to prove. We just note that by an absorbing state we mean a state p satisfying $p \cdot a = p$ for every $a \in \Sigma$.
Lemma 4
Let L be a regular language over an alphabet $\Sigma$ and $\mathcal{D}_L$ be the minimal automaton of L. Then the following conditions are equivalent:
- (i) $\mathrm{pref}^{\uparrow}(L) = \Sigma^*$;
- (ii) for each state p in $\mathcal{D}_L$, there exists a final state reachable from p;
- (iii) every absorbing state p in $\mathcal{D}_L$ is final.
The conditions of Lemma 4 are satisfied in particular for all non-empty group languages. The variety $\mathcal{G}$, consisting of all languages L such that the syntactic monoid of L is a group, is geometrically closed as a consequence. This result can be extended to languages of the form $L = L_0 a_1 L_1 \cdots a_\ell L_\ell$, where each $a_i$ is a letter, and each $L_i$ is a non-empty group language. Indeed, for every $u \in \Sigma^*$, there is some $w_0 \in \Sigma^*$ such that $u w_0 \in L_0$, and one can find at least one $w_i \in L_i$ for every $i = 1, \ldots, \ell$. Then u is a prefix of the word $u w_0 a_1 w_1 \cdots a_\ell w_\ell \in L$. This implies that $\mathrm{pref}^{\uparrow}(L) = \Sigma^*$. We may thus conclude that the variety $\mathcal{G}_{1/2}$, consisting of languages of level 1/2 in the group hierarchy, is geometrically closed. (The reader not familiar with the group hierarchy is referred to [12].)
In the rest of the paper, we move our attention to star-free languages.
Languages Recognised by LT-acyclic Automata
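We now introduce the class of languages $\mathcal{R}_{LT}$, which plays a central role in our main result. For every alphabet $\Sigma$, the set $\mathcal{R}_{LT}(\Sigma^*)$ consists of languages which are finite unions of languages of the form
$$L = \Sigma_0^* a_1 \Sigma_1^* a_2 \cdots a_\ell \Sigma_\ell^*, \quad \text{where } \Sigma_0 \subseteq \Sigma_1 \subseteq \cdots \subseteq \Sigma_\ell \subseteq \Sigma \text{ and } a_i \in \Sigma \setminus \Sigma_{i-1} \text{ for } i = 1, \ldots, \ell. \tag{1}$$
The previous definition is similar to definitions of other classes of languages that have already been studied in the literature. First of all, if we omit the condition $\Sigma_0 \subseteq \Sigma_1 \subseteq \cdots \subseteq \Sigma_\ell$, we get a definition of languages from the variety $\mathcal{R}$ corresponding to R-trivial monoids, which we recall in more detail later. For now, let us just note that $\mathcal{R}_{LT} \subseteq \mathcal{R}$. Secondly, if we also require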
in (1) for
, then we obtain a variety of languages considered by Pin, Straubing, and Thérien [13] and corresponding to a pseudovariety of finite monoids denoted
. Finally, if we drop in (1) the condition
and then we generate a variety, then we obtain the variety of languages corresponding to the pseudovariety
considered by Almeida [1, p. 236].
Since we want to characterise languages from $\mathcal{R}_{LT}$ in terms of automata, we recall the characterisation of languages from $\mathcal{R}$ first. An automaton $\mathcal{A} = (Q, \Sigma, \cdot, \iota, F)$ is acyclic if every cycle in $\mathcal{A}$ is a loop. This means that if $p \cdot w = p$ for some $p \in Q$ and $w \in \Sigma^*$, then also $p \cdot a = p$ for every letter a occurring in w. The defining condition means that one can number the states in Q as $1, \ldots, |Q|$ in such a way that the state $p \cdot a$, with $p \in Q$ and $a \in \Sigma$, is always greater than or equal to p. For this reason, these automata are called extensive in [11, p. 93]. It is known that they recognise precisely R-trivial languages [6].
We say that an acyclic automaton $\mathcal{A} = (Q, \Sigma, \cdot, \iota, F)$ has the loop transfer property if $p \cdot a = p$ implies $(p \cdot b) \cdot a = p \cdot b$ for every $p \in Q$ and $a, b \in \Sigma$. We then call $\mathcal{A}$ an LT-acyclic automaton for short. This means that if there is an a-labelled loop in a state p in an LT-acyclic automaton, then there is also an a-labelled loop in each state reachable from p. We may thus equivalently take an arbitrary word from $\Sigma^*$ in place of the letter b in the previous definition. The first aim of this section is to show that languages recognised by LT-acyclic automata are precisely those from $\mathcal{R}_{LT}$. We do so via a series of elementary lemmas.
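Both defining conditions are easy to test on a concrete complete deterministic automaton. The sketch below assumes a hypothetical representation in which the transition function is a dictionary keyed by (state, letter); acyclicity is checked by topologically sorting the graph of non-loop transitions, which is one possible technique and not necessarily the one the authors have in mind.

```python
def is_acyclic(states, alphabet, delta):
    """True if every cycle is a loop: the non-loop transitions form a DAG."""
    succ = {p: {delta[p, a] for a in alphabet} - {p} for p in states}
    indeg = {p: 0 for p in states}
    for p in states:
        for q in succ[p]:
            indeg[q] += 1
    queue = [p for p in states if indeg[p] == 0]
    seen = 0
    while queue:  # Kahn's algorithm
        p = queue.pop()
        seen += 1
        for q in succ[p]:
            indeg[q] -= 1
            if indeg[q] == 0:
                queue.append(q)
    return seen == len(states)


def has_loop_transfer(states, alphabet, delta):
    """True if an a-loop in p forces an a-loop in every successor of p."""
    return all(
        delta[delta[p, b], a] == delta[p, b]
        for p in states
        for a in alphabet
        if delta[p, a] == p
        for b in alphabet
    )


def is_lt_acyclic(states, alphabet, delta):
    return is_acyclic(states, alphabet, delta) and has_loop_transfer(states, alphabet, delta)


# b* a {a,b}*: loops only grow along the path, so the automaton is LT-acyclic.
good = {(0, "b"): 0, (0, "a"): 1, (1, "a"): 1, (1, "b"): 1}
# a* b b*: the a-loop in state 0 is lost in state 1, so loop transfer fails.
bad = {(0, "a"): 0, (0, "b"): 1, (1, "b"): 1, (1, "a"): 2, (2, "a"): 2, (2, "b"): 2}
print(is_lt_acyclic({0, 1}, "ab", good))    # True
print(is_lt_acyclic({0, 1, 2}, "ab", bad))  # False
```

Checking loop transfer for single letters b is enough, since the property then propagates along every path by induction on its length.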
Lemma 5
For a language L of the form (1), the automaton $\mathcal{D}_L$ is LT-acyclic.
Proof
Let L be a language
of the form (1). For every
, we denote
and we also put
. Then it is an easy exercise to show that the automaton in Fig. 1 is the minimal automaton of L and that it is an LT-acyclic automaton. 
Fig. 1. An LT-acyclic automaton for the language of the form (1).
Lemma 6
Let L, K be languages over an alphabet $\Sigma$ recognised by LT-acyclic automata. Then $L \cup K$ is also recognised by an LT-acyclic automaton.
Proof
The language $L \cup K$ can be recognised by the direct product of a pair of automata that recognise the languages L and K. It is routine to check that a finite direct product of LT-acyclic automata is an LT-acyclic automaton.
The previous two lemmas show that every language from $\mathcal{R}_{LT}$ is recognised by an LT-acyclic automaton. The following lemma strengthens this observation by implying that the minimal automaton of a language from $\mathcal{R}_{LT}$ is LT-acyclic.
Lemma 7
Let L be a language recognised by an LT-acyclic automaton. Then the minimal automaton of L is also LT-acyclic.
Proof
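Let $\mathcal{A} = (Q, \Sigma, \cdot, \iota, F)$ be an LT-acyclic automaton recognising L. The minimal automaton $\mathcal{D}_L$ is a homomorphic image of some subautomaton of $\mathcal{A}$ [14]. It is clear that a subautomaton of an LT-acyclic automaton is LT-acyclic. Thus we may assume that $\mathcal{A}$ has all states reachable from the initial state $\iota$.
Let $\varphi : Q \to P$ be a surjective mapping, which is a homomorphism from the automaton $\mathcal{A}$ onto an automaton $\mathcal{B} = (P, \Sigma, \bullet, \varphi(\iota), \varphi(F))$. We claim that $\mathcal{B}$ is acyclic. To prove this claim, let $\bar{p} \in P$ and $w \in \Sigma^*$ be such that $\bar{p} \bullet w = \bar{p}$. Then we choose some state $p$ from $\varphi^{-1}(\bar{p})$. For that $p$, we have $\varphi(p \cdot w^m) = \bar{p}$ for every natural number m. Since the sequence $p, p \cdot w, p \cdot w^2, \ldots$ contains only finitely many states, there are natural numbers n and $m \ge 1$ such that $p \cdot w^{n+m} = p \cdot w^n$. Since $\mathcal{A}$ is acyclic, we have $(p \cdot w^n) \cdot a = p \cdot w^n$ for every letter a occurring in w. Consequently, $\bar{p} \bullet a = \varphi((p \cdot w^n) \cdot a) = \varphi(p \cdot w^n) = \bar{p}$. We showed that $\mathcal{B}$ is acyclic.
Now let $\bar{p} \in P$ and $a \in \Sigma$ be such that $\bar{p} \bullet a = \bar{p}$. It follows from the previous paragraph that there is $p \in \varphi^{-1}(\bar{p})$ such that $p \cdot a = p$. Since $\mathcal{A}$ is LT-acyclic, we see that $(p \cdot b) \cdot a = p \cdot b$ for every $b \in \Sigma$. Thus $(\bar{p} \bullet b) \bullet a = \bar{p} \bullet b$. We showed that $\mathcal{B}$ is an LT-acyclic automaton. In particular, it is true for $\mathcal{D}_L$.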
Let us also prove a converse to the statements established above.
Lemma 8
Let $\mathcal{A}$ be an LT-acyclic automaton over an alphabet $\Sigma$. Then the language recognised by $\mathcal{A}$ belongs to $\mathcal{R}_{LT}(\Sigma^*)$.
Proof
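Let $\mathcal{A} = (Q, \Sigma, \cdot, \iota, F)$ and let R be the set of all valid runs in the automaton $\mathcal{A}$ which do not use loops:
$$R = \{ (q_0, a_1, q_1, \ldots, a_n, q_n) \mid n \ge 0;\ q_0 = \iota;\ q_n \in F;\ q_{i-1} \cdot a_i = q_i \ne q_{i-1} \text{ for } i = 1, \ldots, n \}.$$
We see that the set R is finite, because $\mathcal{A}$ is acyclic. Moreover, for each q in Q, let $\Sigma_q$ denote the alphabet $\{ a \in \Sigma \mid q \cdot a = q \}$. Then
$$L_r = \Sigma_{q_0}^* a_1 \Sigma_{q_1}^* a_2 \cdots a_n \Sigma_{q_n}^*$$
is a language of the form (1) for each $r = (q_0, a_1, q_1, \ldots, a_n, q_n)$ in R: the inclusions $\Sigma_{q_{i-1}} \subseteq \Sigma_{q_i}$ follow from the loop transfer property, and $a_i \notin \Sigma_{q_{i-1}}$ since the run uses no loops. The language recognised by $\mathcal{A}$ is
$$\bigcup_{r \in R} L_r.$$
Hence the language recognised by $\mathcal{A}$ belongs to $\mathcal{R}_{LT}(\Sigma^*)$.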
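The construction in this proof can be sketched in code: enumerate the loop-free runs from the initial state to a final state and turn each run into a product of the form (1). The representation below is an assumption of this example (a dictionary-based transition function, single-character letters, Python regular expressions as the output format), and it presupposes that the automaton is acyclic so that the enumeration terminates.

```python
import re
from itertools import product


def loop_free_runs(alphabet, delta, initial, finals):
    """Yield runs [q0, a1, q1, ..., an, qn] to a final state that use no loops."""
    stack = [[initial]]
    while stack:
        run = stack.pop()
        q = run[-1]
        if q in finals:
            yield run
        for a in alphabet:
            if delta[q, a] != q:
                stack.append(run + [a, delta[q, a]])


def run_to_regex(run, alphabet, delta):
    """Regular expression for the product of loop alphabets and letters along a run."""
    parts = []
    for index, item in enumerate(run):
        if index % 2 == 0:  # a state q contributes the starred loop alphabet of q
            loops = [a for a in alphabet if delta[item, a] == item]
            parts.append("[" + "".join(loops) + "]*" if loops else "")
        else:               # a letter a_i
            parts.append(re.escape(item))
    return "".join(parts)


def accepts(word, delta, initial, finals):
    q = initial
    for a in word:
        q = delta[q, a]
    return q in finals


# An LT-acyclic automaton for a a* over {a, b}; state 2 is a non-final sink.
delta = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 1, (1, "b"): 2, (2, "a"): 2, (2, "b"): 2}
regexes = [run_to_regex(r, "ab", delta) for r in loop_free_runs("ab", delta, 0, {1})]
print(regexes)  # ['a[a]*']
```

Each yielded run corresponds to one language in the finite union, so the list of expressions taken together describes the language of the automaton.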
The following theorem provides a summary of the previous lemmas.
Theorem 9
For a language $L \subseteq \Sigma^*$, the following statements are equivalent:
- (i) L belongs to $\mathcal{R}_{LT}(\Sigma^*)$.
- (ii) L is recognised by an LT-acyclic automaton.
- (iii) The minimal automaton of L is LT-acyclic.
Proof
The statement (i) implies (ii) by Lemmas 5 and 6. The statement (ii) implies (iii) by Lemma 7. Finally, (iii) implies (i) by Lemma 8. 
One may prove that $\mathcal{R}_{LT}$ is a variety of languages in several different ways. It is possible to prove directly that the class $\mathcal{R}_{LT}$ is closed under basic language operations. It is also possible to prove that the class of LT-acyclic automata forms a variety of actions in the sense of [7]. Here we complete the previous characterisation by showing the algebraic counterpart of the class $\mathcal{R}_{LT}$; namely, we characterise the corresponding pseudovariety of finite monoids by pseudoidentities. We do not want to recall the notion of pseudoidentities in general. Let us only recall the implicit operation $x^\omega$ here. If we substitute for x some element s in a finite monoid M, then the image of $x^\omega$ is $s^\omega$, which is the unique idempotent in the subsemigroup of M generated by s. It is useful to know that, for a fixed finite monoid M, there is a natural number m such that $s^\omega = s^m$ for each $s \in M$.
Theorem 10
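Let $\Sigma$ be an alphabet, $L \subseteq \Sigma^*$, and $\mathsf{M}_L$ the syntactic monoid of L. The following statements are equivalent:
- (i) L belongs to $\mathcal{R}_{LT}(\Sigma^*)$.
- (ii) $\mathsf{M}_L$ satisfies the pseudoidentities $(xy)^\omega x = (xy)^\omega$ and $x^\omega y x = x^\omega y$.
- (iii) $\mathsf{M}_L$ satisfies the pseudoidentity $(xy)^\omega z x = (xy)^\omega z$.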
Proof
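Let $\mathcal{D}_L = (Q, \Sigma, \cdot, \iota, F)$ be the minimal automaton of the language L. Then $\mathsf{M}_L$ can be viewed as the transition monoid of $\mathcal{D}_L$ (see [12, p. 692]). Elements of $\mathsf{M}_L$ are thus transitions of $\mathcal{D}_L$ determined by words from $\Sigma^*$. More formally, for $w \in \Sigma^*$, we denote by $f_w$ the transition given by the rule $p \mapsto p \cdot w$ for each $p \in Q$. Let m be a natural number such that $s^\omega = s^m$ for each s in $\mathsf{M}_L$.
Let us prove that (i) implies (ii). Suppose that L belongs to $\mathcal{R}_{LT}(\Sigma^*)$. Then $\mathcal{D}_L$ is an LT-acyclic automaton by Theorem 9. In particular, the language L is R-trivial as we already mentioned. Hence, the monoid $\mathsf{M}_L$ is R-trivial, i.e., $\mathsf{M}_L$ satisfies the pseudoidentity $(xy)^\omega x = (xy)^\omega$. Next, let x, y be mapped to elements $f_v, f_u$ in $\mathsf{M}_L$ which are given by words $v, u \in \Sigma^*$. We now need to check that $f_v^m f_u f_v = f_v^m f_u$. Since $\mathcal{D}_L$ is acyclic, we have $(p \cdot v^m) \cdot a = p \cdot v^m$ for every $p \in Q$ and $a \in \Sigma$ occurring in v. Since $\mathcal{D}_L$ is an LT-acyclic automaton, the loop labelled by a in state $p \cdot v^m$ is transferred to every state reachable from $p \cdot v^m$. In particular, for every letter a occurring in v, there is a loop labelled by a in the state $p \cdot v^m u$. The equality $p \cdot v^m u v = p \cdot v^m u$ follows.
Next, let us show that the pseudoidentity $(xy)^\omega z x = (xy)^\omega z$ is a consequence of pseudoidentities from item (ii). We may interpret x, y, z as arbitrary elements of any finite monoid M satisfying these pseudoidentities. Let m be such that $s^\omega = s^m$ for each $s \in M$. Then we use the second pseudoidentity from (ii) repetitively, and we get
$$(xy)^m z (xy)^m = (xy)^m z. \tag{2}$$
By the first pseudoidentity from (ii), we get $(xy)^m x = (xy)^m$. Then we obtain $(xy)^m z x = (xy)^m z (xy)^m x = (xy)^m z (xy)^m = (xy)^m z$ using the equality (2). Thus we get $(xy)^\omega z x = (xy)^\omega z$.
Finally, in order to prove that (iii) implies (i), suppose that $\mathsf{M}_L$ satisfies the pseudoidentity $(xy)^\omega z x = (xy)^\omega z$. Taking $z = 1$, it follows that $\mathsf{M}_L$ satisfies the pseudoidentity $(xy)^\omega x = (xy)^\omega$. Hence, L is R-trivial and $\mathcal{D}_L$ is acyclic. Moreover, let $p \in Q$ and $a \in \Sigma$ be such that $p \cdot a = p$, and take arbitrary $u \in \Sigma^*$. Then $f_a^m f_u$ in $\mathsf{M}_L$ maps p to $p \cdot u$. Similarly, $f_a^m f_u f_a$ in $\mathsf{M}_L$ maps p to $p \cdot ua$. However, taking $x = f_a$, $y = 1$, and $z = f_u$ in $(xy)^\omega z x = (xy)^\omega z$ gives us $f_a^m f_u f_a = f_a^m f_u$. Therefore, $p \cdot ua = p \cdot u$. So, we see that there is a loop labelled by a in the state $p \cdot u$. We proved that $\mathcal{D}_L$ is an LT-acyclic automaton and L belongs to $\mathcal{R}_{LT}(\Sigma^*)$ by Theorem 9.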
Corollary 11
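The class $\mathcal{R}_{LT}$ is a variety of languages corresponding to the pseudovariety of finite monoids $\mathbf{R}_{LT}$ given by
$$\mathbf{R}_{LT} = [\![\, (xy)^\omega z x = (xy)^\omega z \,]\!] = [\![\, (xy)^\omega x = (xy)^\omega,\ x^\omega y x = x^\omega y \,]\!].$$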
Let us also note that
is known to describe the pseudovariety of finite monoids
; cf. Almeida [1, p. 212], who attributes this result to Pin. Therefore,
.
The Main Result
Let us now return to the geometrical closure and prove the main result of this paper: each class of languages lying between the variety of languages $\mathcal{R}_{LT}$ and the positive variety $\mathcal{V}_{3/2}$ is geometrically closed. This strengthens the result from [8] mentioned in the Introduction.
The route that we take to this result (Theorem 16) consists of three steps:
1. We recall that the class $\mathcal{V}_{3/2}$ is closed under commutation [5, 10]. Although it is not necessary to obtain our main result, we refine this observation by proving that a commutative closure of a $\mathcal{V}_{3/2}$-language is piecewise testable.
2. We prove that each commutative $\mathcal{V}_{3/2}$-language belongs to $\mathcal{R}_{LT}$.
3. We observe that the variety $\mathcal{R}_{LT}$ is closed under prefix reduction.
These three observations imply that the geometrical closure of a $\mathcal{V}_{3/2}$-language belongs to $\mathcal{R}_{LT}$, from which our main result follows easily.
Recall the result of Arfi [2], according to which a language belongs to $\mathcal{V}_{3/2}(\Sigma^*)$ if and only if it is given by a finite union of languages $\Sigma_0^* a_1 \Sigma_1^* a_2 \cdots a_\ell \Sigma_\ell^*$, where $a_1, \ldots, a_\ell$ are letters from $\Sigma$ and $\Sigma_0, \ldots, \Sigma_\ell$ are subalphabets of $\Sigma$. It follows by a more general result of Guaiana, Restivo, and Salemi [10], or of Bouajjani, Muscholl, and Touili [5] that $\mathcal{V}_{3/2}$ is closed under commutation, and this observation is a first step to Theorem 16.
Let us show that a commutative closure of a $\mathcal{V}_{3/2}$-language is in fact piecewise testable.
Lemma 12
A commutative closure of a $\mathcal{V}_{3/2}$-language is piecewise testable.
Proof
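Let an alphabet $\Sigma$ be fixed. It is clear that if $K, L \subseteq \Sigma^*$ are languages, then
$$[K \cup L] = [K] \cup [L].$$
As a result, it is enough to prove piecewise testability of [L] for all languages $L = \Sigma_0^* a_1 \Sigma_1^* a_2 \cdots a_\ell \Sigma_\ell^*$, with $a_1, \ldots, a_\ell \in \Sigma$ and $\Sigma_0, \ldots, \Sigma_\ell \subseteq \Sigma$.
Let L be of this form. Denote $\Sigma' = \Sigma_0 \cup \cdots \cup \Sigma_\ell$, and $u = a_1 \cdots a_\ell$. We claim that
$$[L] = \{ w \in \Sigma^* \mid |w|_a \ge |u|_a \text{ for each } a \in \Sigma' \text{ and } |w|_b = |u|_b \text{ for each } b \in \Sigma \setminus \Sigma' \}. \tag{3}$$
Indeed, if w is in [L], then $\Psi(w) = \Psi(v)$ for some $v \in L$, while clearly $|v|_a \ge |u|_a$ for each a in $\Sigma'$, and $|v|_b = |u|_b$ for each b in $\Sigma \setminus \Sigma'$. Conversely, let w in $\Sigma^*$ be such that $|w|_a \ge |u|_a$ for each a in $\Sigma'$, and $|w|_b = |u|_b$ for each b in $\Sigma \setminus \Sigma'$. Then $\Psi(w) = \Psi(v)$ for v in $\Sigma^*$ given by $v = v_0 a_1 v_1 \cdots a_\ell v_\ell$, where $v_i$ ($i = 0, \ldots, \ell$) is given as follows: if $\Sigma_i \setminus (\Sigma_0 \cup \cdots \cup \Sigma_{i-1}) = \{b_1, \ldots, b_n\}$, then
$$v_i = b_1^{|w|_{b_1} - |u|_{b_1}} \cdots b_n^{|w|_{b_n} - |u|_{b_n}}.$$
The word v is in L by construction, hence w belongs to [L].
It remains to observe that the language [L] given by (3) is piecewise testable. However, this language is equal to
$$\bigcap_{a \in \Sigma'} (\Sigma^* a)^{|u|_a} \Sigma^* \;\cap \bigcap_{b \in \Sigma \setminus \Sigma'} \Big( (\Sigma^* b)^{|u|_b} \Sigma^* \setminus (\Sigma^* b)^{|u|_b + 1} \Sigma^* \Big). \tag{4}$$
The language on the right-hand side of (4) is piecewise testable.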
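The counting characterisation of the commutative closure can be compared with a brute-force computation on short words. The sketch below rests on assumptions of this example: a product is given by its list of subalphabets and its string of letters, the counting test checks lower bounds for letters occurring in some subalphabet and exact counts for all other letters, and the brute-force side tries every rearrangement of the word against a regular expression for the product. All names are hypothetical.

```python
import re
from collections import Counter
from itertools import permutations, product


def in_commutative_closure(word, subalphabets, letters, alphabet):
    """Counting test: lower bounds on letters of the subalphabets, exact counts elsewhere."""
    free = set().union(*subalphabets)  # union of all subalphabets
    need = Counter(letters)            # occurrences among the letters a_1 ... a_l
    have = Counter(word)
    return all(
        have[c] >= need[c] if c in free else have[c] == need[c]
        for c in alphabet
    )


def monomial_regex(subalphabets, letters):
    """Regular expression for the product of starred subalphabets and letters."""
    parts = []
    for i, sub in enumerate(subalphabets):
        parts.append("[" + "".join(sorted(sub)) + "]*" if sub else "")
        if i < len(letters):
            parts.append(letters[i])
    return "".join(parts)


def in_closure_brute_force(word, subalphabets, letters):
    """True if some rearrangement of `word` belongs to the product."""
    regex = monomial_regex(subalphabets, letters)
    return any(re.fullmatch(regex, "".join(p)) for p in permutations(word))


subs, lets = ["a", "", "bc"], "ab"  # the product a* a b {b,c}*
for n in range(5):
    for tup in product("abc", repeat=n):
        w = "".join(tup)
        assert in_commutative_closure(w, subs, lets, "abc") == in_closure_brute_force(w, subs, lets)
print(in_commutative_closure("cab", subs, lets, "abc"))  # True
```

The counting test only inspects the Parikh vector of the word, which is what makes the resulting language commutative and describable by subword conditions.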
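We now proceed to prove that the geometrical closure of each language from $\mathcal{V}_{3/2}$ belongs to $\mathcal{R}_{LT}$.
Lemma 13
Every commutative language L from $\mathcal{V}_{3/2}(\Sigma^*)$ belongs to $\mathcal{R}_{LT}(\Sigma^*)$.
Proof
If we take into account the proof of Lemma 12 and the fact that $\mathcal{R}_{LT}(\Sigma^*)$ is closed under finite unions, it is enough to prove that every language of the form (3) belongs to $\mathcal{R}_{LT}(\Sigma^*)$. We may also use the expression (4) for that language. For each letter $a \in \Sigma$ and a natural number m, we may write $(\Sigma^* a)^m \Sigma^* = ((\Sigma \setminus \{a\})^* a)^m \Sigma^*$, which is a language of the form (1). This shows that the language $(\Sigma^* a)^m \Sigma^*$ belongs to $\mathcal{R}_{LT}(\Sigma^*)$. Since $\mathcal{R}_{LT}$ is a variety, we see that also the language $(\Sigma^* b)^m \Sigma^* \setminus (\Sigma^* b)^{m+1} \Sigma^*$ belongs to $\mathcal{R}_{LT}(\Sigma^*)$ for each letter $b \in \Sigma$. Altogether, the language (4) belongs to the variety $\mathcal{R}_{LT}$.
Finally, let us observe that the variety $\mathcal{R}_{LT}$ is closed under prefix reduction.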
Lemma 14
Let L be a language from $\mathcal{R}_{LT}(\Sigma^*)$ for some alphabet $\Sigma$. Then $\mathrm{pref}^{\downarrow}(L)$ belongs to $\mathcal{R}_{LT}(\Sigma^*)$ as well.
Proof
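Let L be recognised by some LT-acyclic automaton $\mathcal{A} = (Q, \Sigma, \cdot, \iota, F)$. If $\iota \notin F$, then L does not contain the empty word, and consequently $\mathrm{pref}^{\downarrow}(L) = \emptyset$, which belongs to $\mathcal{R}_{LT}(\Sigma^*)$. So we may assume that $\iota \in F$.
Informally, we claim that the language $\mathrm{pref}^{\downarrow}(L)$ is recognised by the automaton $\mathcal{A}'$ constructed from $\mathcal{A}$ by replacing all non-final states with a single absorbing non-final state $\tau$. More precisely, we construct an automaton $\mathcal{A}' = (F \cup \{\tau\}, \Sigma, \bullet, \iota, F)$, where $\tau$ is a new state, for which we define $\tau \bullet a = \tau$ for each $a \in \Sigma$. Furthermore, for each $p \in F$ and $a \in \Sigma$, we put $p \bullet a = p \cdot a$ if $p \cdot a \in F$, and $p \bullet a = \tau$ otherwise. As $\mathcal{A}$ contains no cycle other than a loop, the constructed automaton $\mathcal{A}'$ has the same property. Moreover, any state of $\mathcal{A}'$ reachable in $\mathcal{A}'$ from some p in $F$ is either reachable from p in $\mathcal{A}$, or equal to $\tau$. As $\tau \bullet c = \tau$ for each c in $\Sigma$, this implies that $\mathcal{A}'$ is an LT-acyclic automaton and $\mathrm{pref}^{\downarrow}(L)$ belongs to $\mathcal{R}_{LT}(\Sigma^*)$ by Theorem 9.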
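The automaton construction used in this proof is short enough to sketch directly. The code below is an illustration under assumptions of this example: transitions are stored in a dictionary keyed by (state, letter), the new absorbing state is named by a hypothetical `trap` argument, and `None` stands for the empty language when the initial state is not final.

```python
from itertools import product


def prefix_reduction_automaton(alphabet, delta, initial, finals, trap="trap"):
    """Redirect every transition leaving the final states to one absorbing trap state."""
    if initial not in finals:
        return None  # the prefix reduction is empty
    new_delta = {(trap, a): trap for a in alphabet}
    for p in finals:
        for a in alphabet:
            q = delta[p, a]
            new_delta[p, a] = q if q in finals else trap
    return new_delta, initial, set(finals)


def accepts(word, delta, initial, finals):
    q = initial
    for a in word:
        q = delta[q, a]
    return q in finals


# Words over {a, b} with no a or at least two a's; states count a's up to two.
delta = {(0, "b"): 0, (0, "a"): 1, (1, "b"): 1, (1, "a"): 2, (2, "a"): 2, (2, "b"): 2}
new_delta, init, fin = prefix_reduction_automaton("ab", delta, 0, {0, 2})
for n in range(6):
    for tup in product("ab", repeat=n):
        w = "".join(tup)
        expected = all(accepts(w[:i], delta, 0, {0, 2}) for i in range(len(w) + 1))
        assert accepts(w, new_delta, init, fin) == expected
print(accepts("bbb", new_delta, init, fin), accepts("baa", new_delta, init, fin))  # True False
```

In this example the resulting automaton accepts exactly the words consisting of b's only, since any word containing the letter a has a prefix with exactly one a, which lies outside the original language.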
Theorem 15
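Let $\Sigma$ be an alphabet and $L \in \mathcal{V}_{3/2}(\Sigma^*)$. Then $\gamma(L) \in \mathcal{R}_{LT}(\Sigma^*)$.
Proof
We have $\gamma(L) = \mathrm{pref}^{\downarrow}([\mathrm{pref}^{\uparrow}(L)])$ by Proposition 2. As $\mathcal{V}_{3/2}$ is a positive variety of languages, $\mathrm{pref}^{\uparrow}(L)$ belongs to $\mathcal{V}_{3/2}(\Sigma^*)$ whenever L belongs to this set by Proposition 1. The language $[\mathrm{pref}^{\uparrow}(L)]$ is thus a commutative $\mathcal{V}_{3/2}$-language by [5, 10]. (Note that the language $[\mathrm{pref}^{\uparrow}(L)]$ is actually commutative piecewise testable, by Lemma 12.) It follows by Lemma 13 that $[\mathrm{pref}^{\uparrow}(L)]$ belongs to $\mathcal{R}_{LT}(\Sigma^*)$, and by Lemma 14 that the language $\gamma(L) = \mathrm{pref}^{\downarrow}([\mathrm{pref}^{\uparrow}(L)])$ belongs to $\mathcal{R}_{LT}(\Sigma^*)$ as well.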
We are now prepared to state the main result of this article merely as an alternative formulation of the theorem above.
Theorem 16
Let $\mathcal{C}$ be a class of languages containing $\mathcal{R}_{LT}$, which is contained in $\mathcal{V}_{3/2}$. Then $\mathcal{C}$ is geometrically closed.
There are many important (positive) varieties studied in the literature for which the main result can be applied.
Corollary 17
The following classes are geometrically closed: the positive variety $\mathcal{V}_{3/2}$, the variety $\mathcal{R}$, the variety $\mathcal{R}_{LT}$, the variety of all
-recognisable languages, the variety of all $\mathbf{DA}$-recognisable languages.
The variety of all $\mathbf{DA}$-recognisable languages coincides with the intersection of $\mathcal{V}_{3/2}$ and its dual. This class has a natural interpretation in terms of logical descriptions of levels in the Straubing-Thérien hierarchy (see Section 5 in [15]).
Conclusions
We have introduced a new variety of languages $\mathcal{R}_{LT}$ and we have proved that geometrical closures of languages from $\mathcal{V}_{3/2}$ fall into $\mathcal{R}_{LT}$. As a consequence, we have seen that many natural classes of star-free languages are geometrically closed, namely those between the variety $\mathcal{R}_{LT}$ and the positive variety $\mathcal{V}_{3/2}$. In contrast, the variety of all piecewise testable languages $\mathcal{V}_1$ is not geometrically closed. The example is not included in the paper due to space limitations.
There are some interesting questions in connection with the paper. First of all, one may ask how to effectively construct a regular expression for the geometrical closure $\gamma(L)$ for a given language L from $\mathcal{V}_{3/2}(\Sigma^*)$. Note that it is effectively testable, for a given deterministic finite automaton $\mathcal{A}$, whether the language recognised by $\mathcal{A}$ belongs to $\mathcal{V}_{3/2}(\Sigma^*)$ (see [12, p. 725]). It is not clear to us whether a regular expression for $\gamma(L)$ can be effectively computed from $\mathcal{A}$.
Nevertheless, the main open question related to the topic is to clarify the behaviour of the geometrical closure outside the class $\mathcal{V}_{3/2}$.
Footnotes
The first author was supported by Grant 19-12790S of the Grant Agency of the Czech Republic. The second author was supported by the grant VEGA 2/0165/16.
Contributor Information
Alberto Leporati, Email: alberto.leporati@unimib.it.
Carlos Martín-Vide, Email: carlos.martin@urv.cat.
Dana Shapira, Email: shapird@g.ariel.ac.il.
Claudio Zandron, Email: zandron@disco.unimib.it.
Ondřej Klíma, Email: klima@math.muni.cz.
Peter Kostolányi, Email: kostolanyi@fmph.uniba.sk.
References
- 1. Almeida, J.: Finite Semigroups and Universal Algebra. World Scientific, Singapore (1994)
- 2. Arfi, M.: Opérations polynomiales et hiérarchies de concaténation. Theor. Comput. Sci. 91(1), 71–84 (1991). doi: 10.1016/0304-3975(91)90268-7
- 3. Béal, M.-P., Champarnaud, J.-M., Dubernard, J.-P., Jeanne, H., Lombardy, S.: Decidability of geometricity of regular languages. In: Yen, H.-C., Ibarra, O.H. (eds.) Developments in Language Theory, pp. 62–72. Springer, Heidelberg (2012)
- 4. Blanpain, B., Champarnaud, J.M., Dubernard, J.P.: Geometrical languages. In: LATA 2007, pp. 127–138 (2007)
- 5. Bouajjani, A., Muscholl, A., Touili, T.: Permutation rewriting and algorithmic verification. Inf. Comput. 205(2), 199–224 (2007). doi: 10.1016/j.ic.2005.11.007
- 6. Brzozowski, J.A., Fich, F.E.: Languages of $\mathcal{R}$-trivial monoids. J. Comput. Syst. Sci. 20(1), 32–49 (1980). doi: 10.1016/0022-0000(80)90003-3
- 7. Chaubard, L., Pin, J.É., Straubing, H.: Actions, wreath products of C-varieties and concatenation product. Theor. Comput. Sci. 356(1–2), 73–89 (2006). doi: 10.1016/j.tcs.2006.01.039
- 8. Dubernard, J.-P., Guaiana, G., Mignot, L.: Geometrical closure of binary $V_{3/2}$ languages. In: Martín-Vide, C., Okhotin, A., Shapira, D. (eds.) Language and Automata Theory and Applications, pp. 302–314. Springer, Cham (2019)
- 9. Gehrke, M., Grigorieff, S., Pin, J.É.: Duality and equational theory of regular languages. In: Aceto, L., Damgård, I., Goldberg, L.A., Halldórsson, M.M., Ingólfsdóttir, A., Walukiewicz, I. (eds.) Automata, Languages and Programming, pp. 246–257. Springer, Heidelberg (2008)
- 10. Guaiana, G., Restivo, A., Salemi, S.: On the trace product and some families of languages closed under partial commutations. J. Autom. Lang. Comb. 9(1), 61–79 (2004)
- 11. Pin, J.É.: Varieties of Formal Languages. North Oxford Academic Publishers, London (1986)
- 12. Pin, J.-E.: Syntactic semigroups. In: Rozenberg, G., Salomaa, A. (eds.) Handbook of Formal Languages, pp. 679–746. Springer, Heidelberg (1997)
- 13. Pin, J.É., Straubing, H., Thérien, D.: Small varieties of finite semigroups and extensions. J. Aust. Math. Soc. 37(2), 269–281 (1984). doi: 10.1017/S1446788700022084
- 14. Sakarovitch, J.: Elements of Automata Theory. Cambridge University Press, Cambridge (2009)
- 15. Tesson, P., Thérien, D.: Diamonds are forever: the variety DA. In: Semigroups, Algorithms, Automata and Languages, pp. 475–499. World Scientific (2002)