Grammatical Inference by Answer Set Programming

Wojciech Wieczorek; Łukasz Strąk; Arkadiusz Nowakowski; Olgierd Unold

doi:10.1007/978-3-030-50423-6_4

2020 May 23;12140:45–58.

Editors: Valeria V Krzhizhanovskaya, Gábor Závodszky, Michael H Lees, Jack J Dongarra, Peter M A Sloot, Sérgio Brissos, João Teixeira

PMCID: PMC7303697

Abstract

In this paper, the identification of context-free grammars based on the presentation of samples is investigated. The main idea for solving this problem proposed in the literature is reformulated in two different ways: in terms of general constraints and as an answer set program. In a series of experiments, we showed that our answer set programming approach is much faster than both our alternative method and the original SAT encoding method. As in the pioneering work, some well-known context-free grammars have been induced correctly, and we also followed its test procedure with randomly generated grammars, making it clear that using our answer set programs increases computational efficiency. The research can be regarded as further evidence that solutions based on the stable model (answer set) semantics of logic programming may be the right choice for complex problems.

Keywords: Grammatical inference, Answer set programming, Constraint satisfaction problem

Introduction

In grammatical inference [9], a learning algorithm La takes a finite sequence of examples (usually strings) as input and outputs a language description (usually a grammar). There are two main types of presentations: (i) A text for a language L is an infinite sequence of strings $x_1, x_2, \ldots$ from L such that every string of L occurs at least once in the text; (ii) An informant for a language L is an infinite sequence of pairs $(x_1, d_1), (x_2, d_2), \ldots$ in $\Sigma^* \times \{0, 1\}$ such that every string of $\Sigma^*$ occurs at least once in the sequence and $d_i = 1 \iff x_i \in L$. The inference algorithms that use type (ii) of information are said to learn from positive and negative examples. From Gold's results [7], we know that the class of context-free languages (and even the class of regular languages) cannot be identified from presentation (i), but can be identified using presentation (ii). However, de la Higuera [8] showed that it is computationally hard.

In this work, the following informant learning environment is exploited. Suppose that the inferring process is based on the existence of an Oracle, which can be seen as a device that:

Knows the language and has to answer correctly.
Can answer equivalence queries. They are made by proposing some hypothesis to the Oracle. The hypothesis is a grammar representing the unknown language. The Oracle answers Yes in the positive case. In the negative case, the Oracle has to return the shortest string in the symmetric difference between the target language and the submitted hypothesis.

Then the following procedure can be applied. Start from a small¹ sample S and $k = 1$. The parameter k denotes the number of non-terminal symbols in the target grammar. Run an answer set program (or another exact method). Every time it turns out that there is no solution that satisfies all of the constraints, increase k by 1. As long as the Oracle returns a pair (x, d) in response to an equivalence query, add (x, d) to S and run the answer set program (or, respectively, another exact method) again. Stop after the answer is Yes. Unfortunately, there is no guarantee that the procedure will terminate in a polynomial number of steps, even when the target language is regular [1]. The equivalence checking may be done by random sampling. In that case a positive answer could be incorrect, but the probability of such an error decreases if the sampling is repeated.
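The sampling-based equivalence check mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the function name `sampling_oracle`, its parameters, and the representation of both languages as membership predicates are hypothetical and not taken from the paper. Unlike an exact Oracle, the sketch returns the shortest counter-example among the sampled words, which need not be the shortest one overall.

```python
import random
from typing import Callable, Optional, Tuple


def sampling_oracle(
    target: Callable[[str], bool],
    hypothesis: Callable[[str], bool],
    alphabet: str = "ab",
    max_len: int = 8,
    trials: int = 2000,
    seed: int = 0,
) -> Optional[Tuple[str, int]]:
    """Approximate equivalence query (hypothetical helper).

    Returns None for the answer Yes, or a pair (x, d) where x lies in the
    symmetric difference of the two languages and d = 1 iff x is in the target.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        n = rng.randint(1, max_len)
        x = "".join(rng.choice(alphabet) for _ in range(n))
        if target(x) != hypothesis(x):
            # Keep the shortest (then lexicographically smallest) disagreement.
            if best is None or (len(x), x) < (len(best), best):
                best = x
    if best is None:
        return None
    return best, int(target(best))


# Target: equal numbers of a's and b's; hypothesis: every word is accepted.
equal_ab = lambda w: w.count("a") == w.count("b")
print(sampling_oracle(equal_ab, lambda w: True))
```

Repeating the call with different seeds lowers the chance of a wrong Yes, which mirrors the remark that the error probability decreases when the sampling is repeated.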

A very similar procedure for the induction of context-free grammars was proposed by Imada and Nakamura [11]. However, for the exact search for a k-variable grammar, they used Boolean formulas and applied a SAT solver. We took over their main Boolean variables, treating them as predicates, and then constructed a new encoding founded on answer set programming. In an alternative approach, we used general constraints of Gurobi Optimizer² instead of ASP.

Related Work

The work most closely related to CFG identification is by Imada and Nakamura [11]. They proposed a way to synthesize CFGs from positive and negative samples based on solving a Boolean satisfiability problem (SAT). They translated the learning problem for a CFG into a SAT instance, which is then solved by a SAT solver. The satisfying assignment returned by the solver encodes a minimal set of rules (it can be easily changed to a minimal set of variables) that derives all positive samples and no negative samples.

They used one derivation constraint and two main types of Boolean variables:

Derivation variables. A set of derivation variables represents a relation between nonterminal symbols and substrings (in other words, a derivation or parse tree) of each (positive or negative) sample w as follows: for any substring x of w and any nonterminal p, the corresponding derivation variable represents that the nonterminal p derives the string x.
Rule variables. A set of rule variables represents a rule set as follows: for any nonterminals p, q, r and any terminal a, a variable determines whether the production rule $p \to q\,r$ (or $p \to a$) is a member of the set of rules or not.

The derivation constraint is a set of Boolean expressions, one for each substring x of a sample and each nonterminal p. It states that p derives x exactly when there are a rule $p \to q\,r$ and a split $x = yz$ into non-empty parts such that q derives y and r derives z (or, for a single symbol $x = a$, when the rule $p \to a$ is present).

Nakamura et al. have been working on another approach for incremental learning of CFGs implemented in the Synapse system [15]. This approach is based on rule generation by analyzing the results of bottom-up parsing for positive samples and searching for rule sets. Their system can also learn similar CFGs, but it does so only from positive samples. Both methods synthesized similar rule sets for each language in their experiments. They reported that the computation time of the SAT-based approach is shorter than that of Synapse for most languages.

Our Contribution

The purpose of the present proposal is to investigate to what extent the power of an ASP solver makes it possible to tackle the context-free inference problem for large-size instances and to compare our approach with the original one. Because of the possibility of future comparisons with other methods, the Python implementation³ of our winning method is given via GitLab.

The main original scientific contributions are as follows:

the formulation of the induction of a k-variable context-free grammar in terms of logical rules with answer set semantics;
the formulation of the induction of a k-variable context-free grammar in terms of general constraints;
the construction of an informant learning algorithm based on ASP, CSP, and SAT solvers;
the conduct of an appropriate statistical test in order to determine the fastest CFG inference method.

This paper is organized into five sections. In Sect. 2, we present necessary definitions and facts originating from formal languages and declarative problem-solving. Section 3 describes our inference algorithms: (a) based on solving an answer set program, and (b) based on solving a constraint satisfaction program, including general constraints such as AND/OR. Section 4 shows the experimental results of our approaches in comparison with the original one. Concluding comments are made in Sect. 5.

Preliminaries

We assume the reader to be familiar with the basic theory of context-free languages, e.g., from [10], so we introduce only some notation and notions used later in the paper.

Words and Languages

An alphabet is a finite, non-empty set of symbols. We use the symbol $\Sigma$ for the alphabet. A word is a finite sequence of symbols chosen from the alphabet. We denote the length of the word w by |w|. The empty word, denoted by $\varepsilon$, is the word with zero occurrences of symbols. Let x and y be words. Then xy denotes the catenation of x and y, that is, the word formed by making a copy of x and following it by a copy of y. As usual, $\Sigma^*$ denotes the set of words over $\Sigma$. The word w is called a prefix of the word u if there is a word x such that $u = wx$. We call it a proper prefix if $x \neq \varepsilon$. The word w is called a suffix of the word u if there is a word x such that $u = xw$. It is a proper suffix if $x \neq \varepsilon$. A factor (or subword) is a prefix of a suffix. A set of words, all of which are chosen from some $\Sigma^*$, where $\Sigma$ is a particular alphabet, is called a language.
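The notions of prefix, suffix, and factor translate directly into string operations. The sketch below is illustrative; the function names are our own and words are modelled as Python strings.

```python
def is_prefix(w: str, u: str) -> bool:
    """w is a prefix of u if u = wx for some word x."""
    return u.startswith(w)


def is_suffix(w: str, u: str) -> bool:
    """w is a suffix of u if u = xw for some word x."""
    return u.endswith(w)


def is_proper_prefix(w: str, u: str) -> bool:
    """Proper prefix: the remaining word x is non-empty."""
    return is_prefix(w, u) and w != u


def factors(u: str) -> set:
    """All non-empty factors of u (every factor is a prefix of a suffix)."""
    return {u[i:j] for i in range(len(u)) for j in range(i + 1, len(u) + 1)}


print(sorted(factors("aab")))
```

A word of length n has at most n(n + 1)/2 distinct non-empty factors, so the set of factors of a whole sample stays manageable for short words.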

Context-Free Grammars

A context-free grammar (CFG) is defined by a quadruple $G = (V, \Sigma, P, S)$, where V is an alphabet of variables (sometimes called non-terminal symbols), $\Sigma$ is an alphabet of terminal symbols such that $V \cap \Sigma = \emptyset$, P is a finite set of production rules of the form $A \to \alpha$ for $A \in V$ and $\alpha \in (V \cup \Sigma)^*$, and $S \in V$ is a special non-terminal symbol called the start symbol. For simplicity's sake, we write $A \to \alpha_1 \mid \alpha_2 \mid \ldots \mid \alpha_k$ instead of $A \to \alpha_1$, $A \to \alpha_2$, ..., $A \to \alpha_k$. We call a word $x \in (V \cup \Sigma)^*$ a sentential form. Let u, v be two words in $(V \cup \Sigma)^*$ and $A \in V$. Then, we write $uAv \Rightarrow uxv$ if $A \to x$ is a rule in P. That is, we can substitute the word x for the symbol A in a sentential form if $A \to x$ is a rule in P. We call this rewriting a derivation. For any two sentential forms x and y, we write $x \Rightarrow^* y$ if there exists a sequence $x = x_0, x_1, \ldots, x_n = y$ of sentential forms such that $x_i \Rightarrow x_{i+1}$ for all $i = 0, 1, \ldots, n - 1$. The language L(G) generated by G is the set of all words over $\Sigma$ that are generated by G; that is, $L(G) = \{x \in \Sigma^* \mid S \Rightarrow^* x\}$. A language is called a context-free language if it is generated by a context-free grammar. Assume that G is the unknown (target) CFG to be identified. An example (a positive word) of G is a word in L(G), and a counter-example (a negative word) of G is a word not in L(G).

A normal form for context-free grammars is a restricted form into which any grammar can be converted. Amongst all normal forms for context-free grammars, the most useful and the best known one is the Chomsky normal form (CNF). A grammar is said to be in Chomsky normal form if each of its rules is in one of two possible forms:

$A \to a$, where $A \in V$ and $a \in \Sigma$, or
$A \to BC$, where $A, B, C \in V$.
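One practical benefit of the Chomsky normal form is that membership can be decided by the classical CYK dynamic-programming algorithm. The sketch below is not part of the paper's method; the grammar representation (sets of tuples) and the example grammar for $\{a^n b^n \mid n \ge 1\}$ are illustrative assumptions.

```python
def cyk(word, unary, binary, start="S"):
    """Decide whether a CNF grammar derives `word`.

    unary  : set of pairs (A, a) for rules A -> a
    binary : set of triples (A, B, C) for rules A -> B C
    """
    n = len(word)
    if n == 0:
        return False  # a CNF grammar of this shape cannot derive the empty word
    # table[i][l] = set of variables deriving word[i:i + l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = {A for (A, a) in unary if a == ch}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for (A, B, C) in binary:
                    if B in left and C in right:
                        table[i][length].add(A)
    return start in table[0][n]


# Example grammar: S -> A T | A B,  T -> S B,  A -> a,  B -> b
unary = {("A", "a"), ("B", "b")}
binary = {("S", "A", "T"), ("S", "A", "B"), ("T", "S", "B")}
print(cyk("aabb", unary, binary), cyk("aab", unary, binary))
```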

Answer Set Programming

We will briefly introduce the idea of answer set programming (ASP). Those who are interested in a more detailed description of the topic, alternative definitions, and the formal specification of this kind of logic programming are referred to the handbooks [3, 6, 12].

A variable or constant is a term. An atom is $a(t_1, \ldots, t_n)$, where a is a predicate of arity n and $t_1, \ldots, t_n$ are terms. A literal is either a positive literal p or a negative literal $\text{not } p$, where p is an atom.

A rule r is a clause of the form

$a_0 \leftarrow a_1 \wedge \ldots \wedge a_m \wedge \text{not } a_{m+1} \wedge \ldots \wedge \text{not } a_n$

where $a_0, \ldots, a_n$ are atoms. The atom $a_0$ is the head of r, while the conjunction $a_1 \wedge \ldots \wedge a_m \wedge \text{not } a_{m+1} \wedge \ldots \wedge \text{not } a_n$ is the body of r. By $H(r)$, we denote the head atom, and by $B(r)$ the set of the body literals. $B^+(r)$ ($B^-(r)$, resp.) denotes the set of atoms occurring positively (negatively, resp.) in $B(r)$. A program (also called ASP program) is a finite set of rules. A $\text{not}$-free program is called positive. A term, atom, literal, rule, or a program is ground if no variables appear in it.

Let $\Pi$ be a program. Let r be a rule in $\Pi$; a ground instance of r is a rule obtained from r by replacing⁴ every variable X in r by constants occurring in $\Pi$. We denote the set of all the ground instances of the rules occurring in $\Pi$ by $ground(\Pi)$.

An interpretation I for $\Pi$ is a set of ground atoms. A ground positive literal A is true (false, resp.) w.r.t. I if $A \in I$ ($A \notin I$, resp.). A ground negative literal $\text{not } A$ is true (false, resp.) w.r.t. I if $A \notin I$ ($A \in I$, resp.).

Let r be a ground rule in $ground(\Pi)$. The head of r is true w.r.t. I if $H(r) \in I$. The body of r is true w.r.t. I if all body literals of r are true w.r.t. I (i.e., $B^+(r) \subseteq I$ and $B^-(r) \cap I = \emptyset$) and is false w.r.t. I otherwise. The rule r is satisfied (or true) w.r.t. I if its head is true w.r.t. I or its body is false w.r.t. I.

A model for $\Pi$ is an interpretation M for $\Pi$ such that every rule $r \in ground(\Pi)$ is true w.r.t. M.

Given a program $\Pi$ and an interpretation I, the reduct $\Pi^I$ is the set of positive rules defined as follows:

$\Pi^I = \{ H(r) \leftarrow B^+(r) \mid r \in ground(\Pi),\ B^-(r) \cap I = \emptyset \}$

I is an answer set of $\Pi$ if I is the $\subseteq$-smallest model for $\Pi^I$.
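The definition of an answer set via the reduct can be checked mechanically on small ground programs. The following brute-force sketch is for illustration only (real solvers such as clingo work very differently); the encoding of a ground rule as a triple `(head, positive_body, negative_body)` is our own assumption.

```python
from itertools import chain, combinations


def least_model(positive_rules):
    """Smallest model of a positive ground program, by fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_rules:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model


def reduct(program, interp):
    """Drop rules whose negative body meets interp; strip negative literals."""
    return [(h, pos) for (h, pos, neg) in program if not (set(neg) & interp)]


def is_answer_set(program, interp):
    return least_model(reduct(program, interp)) == set(interp)


def answer_sets(program):
    atoms = sorted({h for h, _, _ in program}
                   | {a for _, p, n in program for a in chain(p, n)})
    subsets = chain.from_iterable(
        combinations(atoms, r) for r in range(len(atoms) + 1))
    return [set(s) for s in subsets if is_answer_set(program, set(s))]


# a <- not b.    b <- not a.    c <- a.
program = [("a", (), ("b",)), ("b", (), ("a",)), ("c", ("a",), ())]
print(answer_sets(program))
```

For this three-rule program the enumeration finds two answer sets, one containing b alone and one containing a and c, which shows how negation in the body produces alternative stable models.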

Over the last years, answer set programming has emerged as a declarative problem-solving paradigm. It is a programming methodology rooted in research on artificial intelligence and computational logic, and researchers use it in many areas of science and technology. For the experiments we took advantage of clingo, one of the most efficient and widely used answer set programming systems available⁵ today. In addition to standard definitions, clingo allows defining constraints, i.e., rules with an empty head, for instance

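$\leftarrow a$

By adding this constraint to a program, we eliminate its answer sets that contain $a$. Adding the 'opposite' constraint

$\leftarrow \text{not } a$

eliminates those answer sets that do not contain $a$. A constraint can be translated into a normal rule. To this end, the constraint

$\leftarrow a_1 \wedge \ldots \wedge a_m \wedge \text{not } a_{m+1} \wedge \ldots \wedge \text{not } a_n$

is mapped onto the rule

$x \leftarrow a_1 \wedge \ldots \wedge a_m \wedge \text{not } a_{m+1} \wedge \ldots \wedge \text{not } a_n \wedge \text{not } x$

where x is a new atom.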

Example. Suppose we have three numbered urns and two distinguishable balls. Every ball has been put into an urn, possibly both into the same one. An ASP program to code this knowledge is as follows:

Please notice that, as usual in logic programming, identifiers with initial uppercase letters denote variables. Rules 7–11 are simple facts concerning urns and balls. Rules 12 and 13 define predicates that tell whether a ball is inside a particular urn. The inequality in rule 13 is only used during grounding to eliminate some ground instances of that rule. It is worth mentioning that grounding systems do not make unnecessary replacements, for example, 1 for U. Rules 14 and 15 ensure that every ball is in exactly one urn.

Suppose now that we have discovered that urn 2 is empty and we want to know the possible configurations. It is enough to add two facts, not_in(2, q) and not_in(2, r), and find all answer sets. A possible answer set is: ball(q), ball(r), urn(1), urn(2), in(r), not_in(2, q), not_in(2, r), not_in(3, q), not_in(3, r), contains(1, q), in(q), urn(3), contains(1, r), which describes the placement of both balls into the first urn.
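Independently of any ASP encoding, the configurations that the answer sets are expected to describe can be cross-checked by plain enumeration. The sketch below is not the ASP program itself; it merely lists placements of the two balls and prints them with the `contains` vocabulary used above. The helper name `configurations` is our own.

```python
from itertools import product

urns = [1, 2, 3]
balls = ["q", "r"]


def configurations(empty_urns=()):
    """All placements in which every ball sits in exactly one allowed urn."""
    result = []
    for placement in product(urns, repeat=len(balls)):
        if any(u in empty_urns for u in placement):
            continue  # a ball would sit in an urn known to be empty
        result.append({f"contains({u}, {b})" for u, b in zip(placement, balls)})
    return result


for config in configurations(empty_urns=(2,)):
    print(sorted(config))
```

With urn 2 known to be empty, each ball can only go to urn 1 or urn 3, so four configurations remain; the one with both balls in the first urn corresponds to the answer set quoted above.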

clingo also allows using choice constructions. For instance, a choice rule over the atoms p(U, B) with both cardinality bounds set to 2 describes all possible ways to choose which two of the atoms p(1, q), p(2, q), p(3, q) and which two of the atoms p(1, r), p(2, r), p(3, r) are included in the resultant model. Before and after an expression in braces, we can put integers, which express bounds on the cardinality of the stable models described by the rule. The number on the left is the lower bound (0 is the default), and the number on the right is the upper bound (unbounded is the default).
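The effect of cardinality bounds on a choice construction can be mimicked by enumerating subsets. This is an illustrative sketch rather than clingo syntax; the function `bounded_choices` and the grouping of atoms by ball are assumptions made to match the example in the text.

```python
from itertools import combinations, product

urns = [1, 2, 3]
balls = ["q", "r"]


def bounded_choices(lower, upper):
    """For every ball, pick between `lower` and `upper` atoms p(U, B)."""
    per_ball = []
    for b in balls:
        options = []
        for size in range(lower, upper + 1):
            for chosen in combinations(urns, size):
                options.append(frozenset(f"p({u}, {b})" for u in chosen))
        per_ball.append(options)
    # One candidate model per combination of per-ball selections.
    return [frozenset().union(*parts) for parts in product(*per_ball)]


models = bounded_choices(2, 2)
print(len(models))
```

With both bounds equal to 2 there are three ways to pick two of three atoms for each ball, hence nine candidate models; with bounds 0 and 3 every subset is allowed.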

Proposed Encodings for the Induction of CFGs

Our translation converts CFG identification into an ASP program (the main approach) and a CSP model (an alternative approach; CSP stands for constraint satisfaction problem). Suppose we are given a sample composed of examples, $S_+$, and counter-examples, $S_-$, over an alphabet $\Sigma$, and a positive integer k. We want to find a k-variable CFG $G = (V, \Sigma, P, S)$ such that $S_+ \subseteq L(G)$ and $S_- \cap L(G) = \emptyset$.

Using Logic Programming with Answer Set Semantics

Let F be the set of all factors (excluding the empty word) of $S_+ \cup S_-$. Let us now see how to describe the rules for the relationship between a grammar G and a sample in terms of ASP. There are three main predicates: y(I, J, L), which indicates the presence of $I \to J\,L$ in P; w(I, Q), which indicates that $I \Rightarrow^* q$, where Q represents a factor q; and z(I, A), which indicates the presence of $I \to a$ in P.

We have the following domain specification, our facts (rules 19–24).

The next rules (25–29) ensure that in a grammar G a factor can or cannot be derived from a specific variable and ensure that in the grammar there is a subset of all possible productions.

All examples should be accepted, and no counter-example can be accepted (rules 30 and 31).

For every $f \in F$ for which $|f| \ge 2$ and for every pair (b, c) of such factors that $f = bc$, f can be derived from a non-terminal I if there are two non-terminals, J and L, such that b can be derived from J, c can be derived from L, and there is a production $I \to J\,L$ (rule 32).

On the other hand, if $I \Rightarrow^* f$, then at least one such pair should exist, that is, $I \to J\,L$ is in P and $J \Rightarrow^* b$ and $L \Rightarrow^* c$ (rule 33).
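To make the roles of y, z, and w concrete, the sketch below replaces the ASP solver with a naive exhaustive search over rule sets, which is feasible only for very small k. It is not the paper's encoding: the numbering of non-terminals as 0, ..., k-1, the choice of 0 as the start symbol, and all function names are assumptions made for illustration.

```python
from itertools import combinations, product


def derivable(word, y, z):
    """Pairs (i, f) such that non-terminal i derives the factor f of word."""
    n = len(word)
    w = set()
    for length in range(1, n + 1):  # shorter factors are handled first
        for start in range(n - length + 1):
            f = word[start:start + length]
            if length == 1:
                w |= {(i, f) for (i, a) in z if a == f}
            else:
                for cut in range(1, length):
                    b, c = f[:cut], f[cut:]
                    w |= {(i, f) for (i, j, l) in y
                          if (j, b) in w and (l, c) in w}
    return w


def accepts(word, y, z):
    return (0, word) in derivable(word, y, z)  # 0 is assumed to be the start


def search(k, positives, negatives, alphabet="ab"):
    """Smallest rule set over k non-terminals consistent with the sample."""
    variables = range(k)
    candidates = [("y", r) for r in product(variables, repeat=3)]
    candidates += [("z", r) for r in product(variables, alphabet)]
    for size in range(len(candidates) + 1):
        for chosen in combinations(candidates, size):
            y = {r for kind, r in chosen if kind == "y"}
            z = {r for kind, r in chosen if kind == "z"}
            if (all(accepts(p, y, z) for p in positives)
                    and not any(accepts(q, y, z) for q in negatives)):
                return y, z
    return None


pos, neg = ["aa", "aaaa"], ["a", "aaa", "b", "ab"]
print(search(1, pos, neg), search(2, pos, neg))
```

The search space has $2^{k^3 + k|\Sigma|}$ rule sets, which is why an exact solver with propagation and learning, rather than enumeration, is needed beyond toy instances.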

Using General Constraints

This time, instead of predicates, w, y, and z are binary variables. We use the following constraints

and

for each Inline graphic , where means if then and if then , and .

Experimental Results

In this section, we describe some experiments comparing the performance of our approaches implemented⁶ in Python, using clingo (ASP) and using Gurobi Optimizer, with our implementation of the algorithm of Imada and Nakamura [11] using the PicoSAT solver (SAT), when positive and negative words are given. For these experiments, we use a set of 40 samples: partly based on randomly generated grammars (33 samples) and partly based on the set of fundamental CFGs appearing in grammatical inference research (the last 7 samples).

Benchmarks

For testing the learning power for general CFGs, we randomly generated 33 CFGs and, for each of them, prepared positive and negative samples by exhaustively enumerating words of length at most 14. The grammars are in Chomsky normal form with 6 to 12 rules over the alphabet $\{a, b\}$. In every sample, positive words constitute no less than 20% of the total.

The last seven samples were also obtained by exhaustively enumerating words of length at most 14, but they were generated based on the following descriptions:

The set of palindromes over $\{a, b\}$.
The parentheses language: the set of strings consisting of equal numbers of a’s and b’s such that every prefix does not have more b’s than a’s.
The set of strings consisting of b’s twice as many as a’s.
The set of strings of a’s and b’s not of the form ww.
The complement of the language (b).
.
The set of strings consisting of equal numbers of a’s and b’s.
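Membership tests for five of the descriptions above, together with an exhaustive enumeration of labelled words, can be sketched as follows. The function names are our own, and whether the empty word was part of the original samples is not stated, so the sketch starts at length 1 as an assumption.

```python
from itertools import product


def palindrome(w):
    return w == w[::-1]


def parentheses(w):
    """Equal numbers of a's and b's; no prefix has more b's than a's."""
    depth = 0
    for ch in w:
        depth += 1 if ch == "a" else -1
        if depth < 0:
            return False
    return depth == 0


def twice_b(w):
    return w.count("b") == 2 * w.count("a")


def not_ww(w):
    half = len(w) // 2
    return len(w) % 2 == 1 or w[:half] != w[half:]


def equal_ab(w):
    return w.count("a") == w.count("b")


def labelled_sample(member, max_len, alphabet="ab"):
    """Split all words of length 1..max_len into positive and negative ones."""
    pos, neg = [], []
    for n in range(1, max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            (pos if member(w) else neg).append(w)
    return pos, neg


pos, neg = labelled_sample(parentheses, 4)
print(pos, len(neg))
```

With `max_len=14` the enumeration covers 32766 words over a two-letter alphabet, which is still cheap to generate and label.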

Performance Comparison

In all experiments, we used an Intel Xeon CPU E5-2650 v2, 2.6 GHz (a single core out of eight), under the Ubuntu 18.04 operating system with 60 GB of available RAM. Algorithm 1 shows the process for synthesizing a grammar (the set of production rules, with one fixed non-terminal always being the start symbol) from positive and negative words. In the algorithm, $S_+$ and $S_-$ represent the input sets of positive and negative words. Two working sets, one for positive and one for negative words, hold the samples to be covered in the next loop iteration. The algorithm picks a word from $S_+$ or $S_-$ that is not covered by the inferred grammar G and adds it to the corresponding working set. The function Convert translates the problem into a set of ASP rules R (or Gurobi general constraints, or a Boolean expression). If the ASP solver (or Gurobi Optimizer, or the SAT solver) finds a stable model M, the function Extract returns a set of production rules by analyzing the presence of particular y(i, j, l) and z(i, a) atoms. The algorithm repeats this process, increasing k to relax the limit on the number of non-terminals, until G covers all the given words of $S_+$ and $S_-$.
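The incremental loop described in this paragraph can be sketched as below. It follows the prose only and is not a transcription of Algorithm 1. The callbacks `solve` (standing for Convert, the solver call, and Extract combined) and `accepts`, the starting value `k=1`, and the safety limit `max_k` are hypothetical.

```python
def synthesize(positives, negatives, solve, accepts, k=1, max_k=10):
    """Grow the working sets until the grammar covers the whole sample.

    solve(k, x_pos, x_neg) -> grammar consistent with the working sets, or None
    accepts(grammar, word) -> bool
    """
    x_pos, x_neg = set(), set()
    grammar = None
    while True:
        uncovered_pos = [w for w in positives
                         if grammar is None or not accepts(grammar, w)]
        uncovered_neg = [w for w in negatives
                         if grammar is not None and accepts(grammar, w)]
        if not uncovered_pos and not uncovered_neg:
            return grammar, k
        # Pick one word the current grammar gets wrong.
        if uncovered_pos:
            x_pos.add(uncovered_pos[0])
        else:
            x_neg.add(uncovered_neg[0])
        grammar = solve(k, x_pos, x_neg)
        while grammar is None:  # no solution: relax the limit on non-terminals
            k += 1
            if k > max_k:
                raise RuntimeError("no grammar with at most max_k non-terminals")
            grammar = solve(k, x_pos, x_neg)


# Toy stand-ins: a "grammar" is a set of words, and k caps its size.
toy_solve = lambda k, xp, xn: frozenset(xp) if len(xp) <= k else None
toy_accepts = lambda g, w: w in g
print(synthesize(["a", "aa", "aaa"], ["b"], toy_solve, toy_accepts))
```

Feeding the solver only the words that the current grammar misclassifies keeps each solver call small, which is the point of maintaining working sets instead of passing the whole sample at once.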

The results are listed in Table 1. In order to determine whether the observed CPU time differences between the ASP runs and the runs of the remaining methods occurred by chance, we use the Wilcoxon signed-rank test [17, pp. 915–916] for ASP vs SAT and ASP vs Gurobi. The null hypothesis to be tested is that the median of the paired differences is negative (against the alternative that it is positive). As we can see from Table 2, the p-value is high in both cases, so the null hypothesis cannot be rejected, and we may conclude that using our ASP encoding is likely to improve CPU time performance for most benchmarks of this kind.
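A one-sided Wilcoxon signed-rank test can be reproduced in outline with the standard library alone. The sketch below uses the ASP and SAT columns of Table 1 and the plain normal approximation without tie or continuity corrections, so it is a simplification and its p-value need not match Table 2 digit for digit. The function name is our own.

```python
import math

asp = [51.70, 646.39, 74.31, 75.90, 27.91, 75.96, 68.35, 57.14, 45.17, 211.33,
       62.48, 21.50, 112.69, 943.02, 19358.09, 49.01, 2921.44, 361.52, 63.47,
       12.96, 96.68, 11.38, 11.84, 109.98, 22.65, 38.74, 94.76, 216.61, 271.88,
       228.98, 10.97, 62.17, 10.42, 31.13, 12.84, 118.17, 173.66, 29.33, 4.12,
       66.71]
sat = [48.65, 21049.22, 189.85, 347.84, 64.82, 335.98, 61.87, 118.25, 94.86,
       568.12, 166.65, 58.12, 705.80, 4807.32, 252290.70, 111.22, 8035.44,
       1369.22, 238.71, 5.64, 512.83, 12.02, 43.03, 159.73, 22.40, 271.30,
       295.81, 625.07, 324.43, 412.16, 15.29, 293.98, 18.30, 49.28, 20.97,
       76.98, 191.42, 54.63, 21.00, 50.65]


def wilcoxon_greater(x, y):
    """Return (W+, p) for the alternative 'median of x - y is positive'."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    pos = 0
    while pos < n:  # assign average ranks to groups of tied |differences|
        end = pos
        while (end + 1 < n
               and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]])):
            end += 1
        avg = (pos + end) / 2 + 1
        for t in range(pos, end + 1):
            ranks[order[t]] = avg
        pos = end + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return w_plus, 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z)


print(wilcoxon_greater(asp, sat))
```

Because ASP is slower than SAT on only a handful of the 40 samples, the sum of positive ranks is small and the one-sided p-value is close to 1, in line with the high values reported in Table 2.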

Table 1.

Execution times of exact solving CFG identification in seconds

Language	|V|	ASP	SAT	Gurobi
1	3	51.70	48.65	56.42
2	6	646.39	21049.22	21050
3	4	74.31	189.85	143.76
4	5	75.90	347.84	2000
5	4	27.91	64.82	18.36
6	5	75.96	335.98	10.33
7	4	68.35	61.87	2000
8	4	57.14	118.25	28.85
9	3	45.17	94.86	73.03
10	5	211.33	568.12	568.06
11	5	62.48	166.65	2000
12	3	21.50	58.12	33.28
13	6	112.69	705.80	2000
14	6	943.02	4807.32	4808
15	7	19358.09	252290.70	252291
16	4	49.01	111.22	103.05
17	7	2921.44	8035.44	8036
18	5	361.52	1369.22	2000
19	5	63.47	238.71	186.10
20	2	12.96	5.64	3.88
21	5	96.68	512.83	671.62
22	2	11.38	12.02	10.54
23	3	11.84	43.03	9.92
24	4	109.98	159.73	176.49
25	3	22.65	22.40	29.65
26	5	38.74	271.30	420.11
27	5	94.76	295.81	2000
28	5	216.61	625.07	2000
29	5	271.88	324.43	2000
30	6	228.98	412.16	2000
31	2	10.97	15.29	19.84
32	5	62.17	293.98	105.74
33	3	10.42	18.30	13.15
34	5	31.13	49.28	32.83
35	3	12.84	20.97	12.86
36	4	118.17	76.98	73.74
37	6	173.66	191.42	2000
38	4	29.33	54.63	36.71
39	4	4.12	21.00	9.02
40	3	66.71	50.65	40.40

Table 2.

Obtained p-values from the Wilcoxon signed-rank test

ASP vs SAT	ASP vs Gurobi
0.999999647	0.999987068

ASP-Based CFG Induction on Bioinformatics Datasets

Our induction method can also be applied to other data that are not taken from infinite context-free languages. We evaluated its classification quality on two bioinformatics datasets: the WALTZ-DB database [4], composed of 116 hexapeptides known to induce amyloidosis ($S_+$) and 161 hexapeptides that do not induce amyloidosis ($S_-$), and the Maurer-Stroh et al. database from the same domain [14], where the ratio $|S_+|/|S_-|$ is 240/836.

We chose a few standard machine learning methods for comparison: BNB (Naive Bayes classifier for multivariate Bernoulli models [13, pp. 234–265]), DTC (Decision Trees Classifier, CART method [5]), MLP (Multi-layer Perceptron [16]), and SVM (Support Vector Machine classifier with the linear kernel [18]). In all methods except ASP and BNB, an unsupervised data-driven distributed representation, called ProtVec [2], was applied in order to convert words (protein representations) to numerical vectors. For BNB, we represented words as binary-valued feature vectors that indicated the presence or absence of every pair of protein letters. In the case of ASP, the training set was partitioned randomly into n parts, and the following process was performed m times: one part was chosen for synthesizing a CFG and the remaining $n - 1$ parts were used for validating it. The best of all m grammars, in terms of F-measure, was then confronted with the test set. For WALTZ-DB, n and m have been set to 20; for Maurer-Stroh, n has been set to 10 and m to 30. These values were selected experimentally based on the size of the databases and the running time of the ASP solver.
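The selection protocol used for ASP (partition into n parts, m rounds, keep the grammar with the best F-measure) can be sketched as follows. The callbacks `induce` and `classify` stand for grammar synthesis and membership testing and are hypothetical, as is the assumption that the part used in each round is drawn at random.

```python
import random


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall; 0 when nothing is recovered."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_grammar(train, n, m, induce, classify, seed=0):
    """train is a list of (word, label) pairs; returns (best model, best F)."""
    rng = random.Random(seed)
    data = train[:]
    rng.shuffle(data)
    parts = [data[i::n] for i in range(n)]  # n roughly equal random parts
    best, best_f = None, -1.0
    for _ in range(m):
        idx = rng.randrange(n)
        model = induce(parts[idx])  # synthesize on one part
        rest = [ex for j, part in enumerate(parts) if j != idx for ex in part]
        tp = sum(1 for w, lab in rest if lab and classify(model, w))
        fp = sum(1 for w, lab in rest if not lab and classify(model, w))
        fn = sum(1 for w, lab in rest if lab and not classify(model, w))
        score = f_measure(tp, fp, fn)  # validate on the remaining n - 1 parts
        if score > best_f:
            best, best_f = model, score
    return best, best_f


# Toy stand-ins: the "grammar" is a fixed label and accepts even-length words.
train = [("a" * i, i % 2 == 0) for i in range(1, 13)]
print(select_grammar(train, 4, 5, lambda part: "even",
                     lambda model, w: len(w) % 2 == 0))
```

Synthesizing on a single small part keeps each solver run short, while validating on the remaining parts guards against picking a grammar that merely memorizes its own part.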

To estimate the ability of ASP and of the compared approaches to classify unseen hexapeptides, a repeated 10-fold cross-validation (cv) strategy was used. It means splitting the data randomly into 10 mutually exclusive folds, building a model on all but one fold, and evaluating the model on the skipped fold. The procedure was repeated 10 times and the overall assessment of the model was based on the mean of those 10 individual evaluations. Table 3 summarizes the performance of the compared methods on the WALTZ-DB and Maurer-Stroh databases. It is noticeable that the ASP approach achieved the best F-score for the smaller dataset (Maurer-Stroh) and an average F-score for the bigger one (WALTZ-DB); hence it can be used with high reliability to recognize amyloid proteins. BNB is outstanding for WALTZ-DB and almost as good as ASP for the Maurer-Stroh database.

Table 3.

Performance of compared methods on WALTZ-DB and Maurer-Stroh databases in terms of Precision (P), Recall (R), and F-score (F1)

Method	WALTZ-DB			Maurer-Stroh
Method	P	R	F1	P	R	F1
ASP
BNB
DTC
MLP
SVM

Conclusion

In this paper, we proposed an approach for learning context-free grammars from positive and negative samples by using logic programming. We encode the set of samples, together with a limit on the number of non-terminals to be synthesized, as an answer set program. A stable model (an answer set) for the program contains a set of grammar rules that derives all positive samples and no negative samples. A feature of this approach is that we can synthesize a compact set of rules in Chomsky normal form. The other feature is that our learning method will benefit from future improvements in ASP solvers. We present experimental results on learning CFGs for fundamental context-free languages, including the set of strings composed of equal numbers of a's and b's and the set of strings over $\{a, b\}$ not of the form ww. Another series of experiments on random languages shows that our encoding can speed up computations in comparison with SAT and CSP encodings.

Footnotes

¹ We are aware of this imprecision. The number of words and their lengths should allow executing the program in a reasonable amount of time. In the experiments, we took two words: one example and one counter-example.

² https://www.gurobi.com/.

³ The Python scripting language is used only for generating appropriate AnsProlog facts.

⁴ This process can be done efficiently, because many ground instances can be discarded; see Chapter 4 of [6].

⁵ https://potassco.org/.

⁶ https://gitlab.com/answer-set-programming/asp4cfg.

This research was supported by National Science Center (Poland), grant number 2016/21/B/ST6/02158.

Contributor Information

Valeria V. Krzhizhanovskaya, Email: V.Krzhizhanovskaya@uva.nl

Gábor Závodszky, Email: G.Zavodszky@uva.nl.

Michael H. Lees, Email: m.h.lees@uva.nl

Jack J. Dongarra, Email: dongarra@icl.utk.edu

Peter M. A. Sloot, Email: p.m.a.sloot@uva.nl

Sérgio Brissos, Email: sergio.brissos@intellegibilis.com.

João Teixeira, Email: joao.teixeira@intellegibilis.com.

Wojciech Wieczorek, Email: wojciech.wieczorek@us.edu.pl.

Łukasz Strąk, Email: lukasz.strak@us.edu.pl.

Arkadiusz Nowakowski, Email: arkadiusz.nowakowski@us.edu.pl.

Olgierd Unold, Email: olgierd.unold@pwr.edu.pl.

References

1. Angluin D. Negative results for equivalence queries. Mach. Learn. 1990;5(2):121–150. doi: 10.1007/BF00116034.
2. Asgari E, Mofrad MRK. Continuous distributed representation of biological sequences for deep proteomics and genomics. PLoS ONE. 2015;10(11):1–15. doi: 10.1371/journal.pone.0141287.
3. Baral C. Knowledge Representation, Reasoning, and Declarative Problem Solving. New York: Cambridge University Press; 2003.
4. Beerten J, et al. WALTZ-DB: a benchmark database of amyloidogenic hexapeptides. Bioinformatics. 2015;31(10):1698–1700. doi: 10.1093/bioinformatics/btv027.
5. Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification and Regression Trees. Monterey: Wadsworth and Brooks; 1984.
6. Gebser M, Kaminski R, Kaufmann B, Schaub T. Answer Set Solving in Practice. San Rafael: Morgan & Claypool Publishers; 2012.
7. Gold EM. Language identification in the limit. Inf. Control. 1967;10:447–474. doi: 10.1016/S0019-9958(67)91165-5.
8. de la Higuera C. Characteristic sets for polynomial grammatical inference. Mach. Learn. 1997;27(2):125–138. doi: 10.1023/A:1007353007695.
9. de la Higuera C. Grammatical Inference: Learning Automata and Grammars. New York: Cambridge University Press; 2010.
10. Hopcroft JE, Motwani R, Ullman JD. Introduction to Automata Theory, Languages, and Computation. 2nd ed. Reading: Addison-Wesley; 2001.
11. Imada K, Nakamura K. Learning context free grammars by using SAT solvers. In: Proceedings of the 2009 International Conference on Machine Learning and Applications. IEEE Computer Society; 2009. pp. 267–272.
12. Lifschitz V. Answer Set Programming. Cham: Springer; 2019.
13. Manning CD, Raghavan P, Schütze H. Introduction to Information Retrieval. Cambridge: Cambridge University Press; 2008.
14. Maurer-Stroh S, et al. Exploring the sequence determinants of amyloid structure using position-specific scoring matrices. Nat. Methods. 2010;7:237–242. doi: 10.1038/nmeth.1432.
15. Nakamura K, Matsumoto M. Incremental learning of context free grammars based on bottom-up parsing and search. Pattern Recognit. 2005;38(9):1384–1392. doi: 10.1016/j.patcog.2005.01.004.
16. Pedregosa F, et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 2011;12:2825–2830.
17. Salkind NJ. Encyclopedia of Research Design. London: SAGE Publications Inc.; 2010.
18. Wu TF, Lin CJ, Weng RC. Probability estimates for multi-class classification by pairwise coupling. J. Mach. Learn. Res. 2004;5:975–1005.
