Abstract
In this paper, the identification of context-free grammars based on the presentation of samples is investigated. The main idea for solving this problem proposed in the literature is reformulated in two different ways: in terms of general constraints and as an answer set program. In a series of experiments, we show that our answer set programming approach is much faster than our alternative method and the original SAT encoding method. As in the pioneering work, some well-known context-free grammars have been induced correctly, and we also followed its test procedure with randomly generated grammars, which made it clear that using our answer set programs increases computational efficiency. The research can be regarded as further evidence that solutions based on the stable model (answer set) semantics of logic programming may be the right choice for complex problems.
Keywords: Grammatical inference, Answer set programming, Constraint satisfaction problem
Introduction
In grammatical inference [9], a learning algorithm La takes a finite sequence of examples (usually strings) as input and outputs a language description (usually a grammar). There are two main types of presentations: (i) a text for a language $L$ is an infinite sequence of strings $x_1, x_2, \ldots$ from $L$ such that every string of $L$ occurs at least once in the text; (ii) an informant for a language $L$ is an infinite sequence of pairs $(x_1, d_1), (x_2, d_2), \ldots$ in $\Sigma^* \times \{0, 1\}$ such that every string of $\Sigma^*$ occurs at least once in the sequence and $d_i = 1 \iff x_i \in L$. The inference algorithms that use type (ii) of information are said to learn from positive and negative examples. From Gold's results [7], we know that the class of context-free languages (and even the class of regular languages) cannot be identified from presentation (i), but can be identified using presentation (ii). However, de la Higuera [8] showed that it is computationally hard.
In this work, the following informant learning environment is exploited. Suppose that the inferring process is based on the existence of an Oracle, which can be seen as a device that:
- Knows the language and has to answer correctly.
- Can answer equivalence queries. They are made by proposing some hypothesis to the Oracle. The hypothesis is a grammar representing the unknown language. The Oracle answers Yes in the positive case. In the negative case, the Oracle has to return the shortest string in the symmetric difference between the target language and the submitted hypothesis.
Then the following procedure can be applied. Start from a small¹ sample $S$ and an initial value of $k$. The parameter $k$ denotes the number of non-terminal symbols in the target grammar. Run an answer set program (or another exact method). Every time it turns out that there is no solution that satisfies all of the constraints, increase $k$ by 1. As long as the Oracle returns a pair $(x, d)$ in response to an equivalence query, add $(x, d)$ to $S$ and run the answer set program (or, respectively, another exact method) again. Stop after the answer is Yes. Unfortunately, there is no guarantee that the procedure will terminate in a polynomial number of steps, even when the target language is regular [1]. The equivalence checking may be done by random sampling. A positive answer obtained this way could be incorrect, but the probability of such an error decreases if the sampling is repeated.
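The equivalence check by random sampling can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sample_counterexample`, the uniform sampling of word lengths, and the two membership tests in the demo are all assumptions made for the example. Note that sampling only approximates the Oracle, which is required to return the shortest disagreeing string.

```python
import random

def sample_counterexample(target, hypothesis, alphabet, max_len, trials, rng):
    """Search randomly for a word on which the two membership tests disagree.

    Returns a pair (x, d), where d is 1 if x is in the target language and 0
    otherwise, or None if no disagreement was sampled.
    """
    best = None
    for _ in range(trials):
        length = rng.randint(0, max_len)
        x = "".join(rng.choice(alphabet) for _ in range(length))
        # Keep the shortest disagreement seen so far.
        if target(x) != hypothesis(x) and (best is None or len(x) < len(best)):
            best = x
    return None if best is None else (best, int(target(best)))

# Hypothetical target: equal numbers of a's and b's; hypothesis: even length.
target = lambda x: x.count("a") == x.count("b")
hypothesis = lambda x: len(x) % 2 == 0
answer = sample_counterexample(target, hypothesis, "ab", 6, 500, random.Random(0))
```

A returned pair can be added to the sample directly; a `None` answer plays the role of a (possibly wrong) Yes.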
A very similar procedure for the induction of context-free grammars was proposed by Imada and Nakamura [11]. However, for the exact search for a k-variable grammar, they used Boolean formulas and applied a SAT solver. We took over their main Boolean variables, treating them as predicates, and then constructed a new encoding founded on answer set programming. In an alternative approach, we used general constraints of Gurobi Optimizer² instead of ASP.
Related Work
The work most closely related to CFG identification is by Imada and Nakamura [11]. They proposed a way to synthesize CFGs from positive and negative samples based on solving a Boolean satisfiability problem (SAT). They translated the learning problem for a CFG into a SAT instance, which is then solved by a SAT solver. The satisfying assignment returned by the solver contains a minimal set of rules (it can be easily changed to a minimal set of variables) that derives all positive samples and no negative samples.
They used one derivation constraint and two main types of Boolean variables:
- Derivation variables. A set of derivation variables represents a relation between nonterminal symbols and substrings (in other words, a derivation or parse tree) of each (positive or negative) sample $w$ as follows: for any substring $x$ of $w$ and any nonterminal $p$, the corresponding derivation variable represents that the nonterminal $p$ derives the string $x$.
- Rule variables. A set of rule variables represents a rule set as follows: for any nonterminals $p$, $q$, $r$ and any terminal $a$, a variable determines whether the production rule $p \to q\,r$ (or $p \to a$) is a member of the set of rules or not.
The derivation constraint is the following set of Boolean expressions, stated for any string $x$ and nonterminal $p$.
![]() |
Nakamura et al. have been working on another approach for incremental learning of CFGs implemented in the Synapse system [15]. This approach is based on rule generation by analyzing the results of bottom-up parsing for positive samples and searching for rule sets. Their system can also learn similar CFGs, but it does so only from positive samples. Both methods synthesized similar rule sets for each language in their experiments. They reported that the computation time of the SAT-based approach is shorter than that of Synapse for most languages.
Our Contribution
The purpose of the present proposal is to investigate to what extent the power of an ASP solver makes it possible to tackle the context-free inference problem for large-size instances and to compare our approach with the original one. To allow future comparisons with other methods, the Python implementation³ of our winning method is available via GitLab.
The main original scientific contributions are as follows:
- the formulation of the induction of a k-variable context-free grammar in terms of logical rules with answer set semantics;
- the formulation of the induction of a k-variable context-free grammar in terms of general constraints;
- the construction of an informant learning algorithm based on ASP, CSP, and SAT solvers;
- the conduct of an appropriate statistical test in order to determine the fastest CFG inference method.
This paper is organized into five sections. In Sect. 2, we present necessary definitions and facts originating from formal languages and declarative problem-solving. Section 3 describes our inference algorithms: (a) based on solving an answer set program, and (b) based on solving a constraint satisfaction program, including general constraints such as AND/OR. Section 4 shows the experimental results of our approaches in comparison with the original one. Concluding comments are made in Sect. 5.
Preliminaries
We assume the reader to be familiar with the basics of context-free language theory, e.g., from [10], so we introduce only some notations and notions used later in the paper.
Words and Languages
An alphabet is a finite, non-empty set of symbols. We use the symbol $\Sigma$ for the alphabet. A word is a finite sequence of symbols chosen from the alphabet. We denote the length of the word $w$ by $|w|$. The empty word $\varepsilon$ is the word with zero occurrences of symbols. Let $x$ and $y$ be words. Then $xy$ denotes the catenation of $x$ and $y$, that is, the word formed by making a copy of $x$ and following it by a copy of $y$. As usual, $\Sigma^*$ denotes the set of words over $\Sigma$. The word $w$ is called a prefix of the word $u$ if there is a word $x$ such that $u = wx$. We call it a proper prefix if $x \neq \varepsilon$. The word $w$ is called a suffix of the word $u$ if there is a word $x$ such that $u = xw$. It is a proper suffix if $x \neq \varepsilon$. A factor (or subword) is a prefix of a suffix. A set of words, all of which are chosen from some $\Sigma^*$, where $\Sigma$ is a particular alphabet, is called a language.
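As a small illustration of these notions, the following sketch enumerates the non-empty factors of a word and tests the prefix and suffix relations. The function names are chosen here for the example only.

```python
def factors(word):
    """All non-empty factors (contiguous subwords) of `word`."""
    return {word[i:j] for i in range(len(word)) for j in range(i + 1, len(word) + 1)}

def is_prefix(w, u):
    """True if u = w x for some word x."""
    return u.startswith(w)

def is_suffix(w, u):
    """True if u = x w for some word x."""
    return u.endswith(w)

example = factors("aba")
```

Every factor arises as a prefix of a suffix: `word[i:j]` is the prefix of length `j - i` of the suffix `word[i:]`.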
Context-Free Grammars
A context-free grammar (CFG) is defined by a quadruple $G = (V, \Sigma, P, S)$, where $V$ is an alphabet of variables (or sometimes non-terminal symbols), $\Sigma$ is an alphabet of terminal symbols such that $V \cap \Sigma = \emptyset$, $P$ is a finite set of production rules in the form $A \to \alpha$ for $A \in V$ and $\alpha \in (V \cup \Sigma)^*$, and $S \in V$ is a special non-terminal symbol called the start symbol. For simplicity's sake, we write $A \to \alpha_1 \mid \alpha_2 \mid \ldots \mid \alpha_k$ instead of $A \to \alpha_1, A \to \alpha_2, \ldots, A \to \alpha_k$. We call the word $x \in (V \cup \Sigma)^*$ a sentential form. Let $u$, $v$ be two words in $(V \cup \Sigma)^*$ and $A \in V$. Then, we write $uAv \Rightarrow uxv$, if $A \to x$ is a rule in $P$. That is, we can substitute the word $x$ for symbol $A$ in a sentential form if $A \to x$ is a rule in $P$. We call this rewriting a derivation. For any two sentential forms $x$ and $y$, we write $x \Rightarrow^* y$, if there exists a sequence $x = x_0, x_1, \ldots, x_n = y$ of sentential forms such that $x_i \Rightarrow x_{i+1}$ for all $0 \le i < n$. The language $L(G)$ generated by $G$ is the set of all words over $\Sigma$ that are generated by $G$; that is, $L(G) = \{w \in \Sigma^* \mid S \Rightarrow^* w\}$. A language is called a context-free language if it is generated by a context-free grammar. Assume that $G$ is the unknown (target) CFG to be identified. An example (a positive word) of $G$ is a word in $L(G)$, and a counter-example (a negative word) of $G$ is a word not in $L(G)$.
A normal form for context-free grammars is a form to which any grammar can be converted. Amongst all normal forms for context-free grammars, the most useful and the most well-known one is the Chomsky normal form (CNF). A grammar is said to be in Chomsky normal form if each of its rules is in one of two possible forms: $A \to a$, or $A \to BC$, where $A, B, C \in V$ and $a \in \Sigma$.
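One practical benefit of Chomsky normal form is that membership can be decided bottom-up over the factors of a word, which is the classical CYK procedure. The sketch below is illustrative only: the representation of rules as tuples and the example grammar for the language of words $a^n b^n$ with $n \ge 1$ are assumptions made for the example.

```python
def cyk_accepts(word, unary, binary, start):
    """Decide whether `start` derives `word` in a CNF grammar.

    unary  : set of pairs (A, a) for rules A -> a
    binary : set of triples (A, B, C) for rules A -> B C
    """
    n = len(word)
    if n == 0:
        return False  # a CNF grammar of this form derives no empty word
    # table[i][l] holds the variables deriving the factor word[i:i+l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, symbol in enumerate(word):
        table[i][1] = {A for (A, a) in unary if a == symbol}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for (A, B, C) in binary:
                    if B in left and C in right:
                        table[i][length].add(A)
    return start in table[0][n]

# Hypothetical grammar: S -> A T | A B, T -> S B, A -> a, B -> b
unary = {("A", "a"), ("B", "b")}
binary = {("S", "A", "T"), ("S", "A", "B"), ("T", "S", "B")}
accepted = [w for w in ["ab", "aabb", "aab", "ba", "abab"]
            if cyk_accepts(w, unary, binary, "S")]
```

The table is indexed by start position and length, so each entry corresponds to one factor occurrence of the input word.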
Answer Set Programming
We will briefly introduce the idea of answer set programming (ASP). Those who are interested in a more detailed description of the topic, alternative definitions, and the formal specification of this kind of logic programming are referred to handbooks [3, 6], and [12].
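A variable or constant is a term. An atom is $a(t_1, \ldots, t_n)$, where $a$ is a predicate of arity $n$ and $t_1, \ldots, t_n$ are terms. A literal is either a positive literal $p$ or a negative literal $\mathit{not}\ p$, where $p$ is an atom.
A rule $r$ is a clause of the form
$$a_0 \leftarrow b_1, \ldots, b_k, \mathit{not}\ b_{k+1}, \ldots, \mathit{not}\ b_m \qquad (1)$$
where $a_0, b_1, \ldots, b_m$ are atoms. The atom $a_0$ is the head of $r$, while the conjunction $b_1, \ldots, b_k, \mathit{not}\ b_{k+1}, \ldots, \mathit{not}\ b_m$ is the body of $r$. By $H(r)$, we denote the head atom, and by $B(r)$ the set $\{b_1, \ldots, b_k, \mathit{not}\ b_{k+1}, \ldots, \mathit{not}\ b_m\}$ of the body literals. $B^+(r)$ ($B^-(r)$, resp.) denotes the set of atoms occurring positively (negatively, resp.) in $B(r)$. A program (also called ASP program) is a finite set of rules. A $\mathit{not}$-free program is called positive. A term, atom, literal, rule, or a program is ground if no variables appear in it.
Let $\Pi$ be a program. Let $r$ be a rule in $\Pi$; a ground instance of $r$ is a rule obtained from $r$ by replacing⁴ every variable $X$ in $r$ by constants occurring in $\Pi$. We denote the set of all the ground instances of the rules occurring in $\Pi$ by $\mathit{ground}(\Pi)$.
An interpretation $I$ for $\Pi$ is a set of ground atoms. A ground positive literal $A$ is true (false, resp.) w.r.t. $I$ if $A \in I$ ($A \notin I$, resp.). A ground negative literal $\mathit{not}\ A$ is true (false, resp.) w.r.t. $I$ if $A \notin I$ ($A \in I$, resp.).
Let $r$ be a ground rule in $\mathit{ground}(\Pi)$. The head of $r$ is true w.r.t. $I$ if $H(r) \in I$. The body of $r$ is true w.r.t. $I$ if all body literals of $r$ are true w.r.t. $I$ (i.e., $B^+(r) \subseteq I$ and $B^-(r) \cap I = \emptyset$) and is false w.r.t. $I$ otherwise. The rule $r$ is satisfied (or true) w.r.t. $I$ if its head is true w.r.t. $I$ or its body is false w.r.t. $I$.
A model for $\Pi$ is an interpretation $M$ for $\Pi$ such that every rule $r \in \mathit{ground}(\Pi)$ is true w.r.t. $M$.
Given a program $\Pi$ and an interpretation $I$, the reduct $\Pi^I$ is the set of positive rules defined as follows:
$$\Pi^I = \{ H(r) \leftarrow B^+(r) \mid r \in \mathit{ground}(\Pi),\ I \cap B^-(r) = \emptyset \} \qquad (2)$$
$I$ is an answer set of $\Pi$ if $I$ is the $\subseteq$-smallest model for $\Pi^I$.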
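The definition of an answer set through the reduct can be checked mechanically for small ground programs. The sketch below is an illustration under stated assumptions: a ground rule is represented as a triple (head, positive body atoms, negative body atoms), and the two-rule program in the demo is a standard textbook example rather than one taken from this paper.

```python
from itertools import chain, combinations

def reduct(program, interpretation):
    """Drop rules whose negative body meets the interpretation; strip the rest."""
    return [(head, pos) for (head, pos, neg) in program
            if not (set(neg) & interpretation)]

def least_model(positive_rules):
    """Smallest model of a positive ground program, by forward chaining."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_rules:
            if set(pos) <= model and head not in model:
                model.add(head)
                changed = True
    return model

def is_answer_set(program, interpretation):
    return least_model(reduct(program, interpretation)) == interpretation

# Ground program:  p <- not q.   q <- not p.
program = [("p", [], ["q"]), ("q", [], ["p"])]
atoms = ["p", "q"]
subsets = chain.from_iterable(combinations(atoms, n) for n in range(len(atoms) + 1))
answer_sets = [set(s) for s in subsets if is_answer_set(program, set(s))]
```

For this program the check leaves exactly the interpretations containing one of the two atoms, which shows that a program may have several answer sets.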
Over the last years, answer set programming has emerged as a declarative problem-solving paradigm. It is a programming methodology rooted in research on artificial intelligence and computational logic, and researchers use it in many areas of science and technology. For the experiments we took advantage of clingo, one of the most efficient and widely used answer set programming systems available⁵ today. In addition to standard definitions, clingo allows defining constraints, i.e., rules with an empty head, for instance
$$\leftarrow a \qquad (3)$$
By adding this constraint to a program, we eliminate its answer sets that contain $a$. Adding the 'opposite' constraint
$$\leftarrow \mathit{not}\ a \qquad (4)$$
eliminates those answer sets that do not contain $a$. A constraint can be translated into a normal rule. To this end, the constraint
$$\leftarrow b_1, \ldots, b_k, \mathit{not}\ b_{k+1}, \ldots, \mathit{not}\ b_m \qquad (5)$$
is mapped onto the rule
$$x \leftarrow b_1, \ldots, b_k, \mathit{not}\ b_{k+1}, \ldots, \mathit{not}\ b_m, \mathit{not}\ x \qquad (6)$$
where $x$ is a new atom.
Example. Suppose we have three numbered urns and two distinguishable balls. Every ball has been put into an urn, possibly the same one. An ASP program to code this knowledge is as follows:
![]() |
7 |
![]() |
8 |
![]() |
9 |
![]() |
10 |
![]() |
11 |
![]() |
12 |
![]() |
13 |
![]() |
14 |
![]() |
15 |
Please note that, as usual in logic programming, identifiers with initial uppercase letters denote variables. Rules 7–11 are simple facts concerning urns and balls. Rules 12 and 13 define predicates that tell whether a ball is inside a particular urn. The inequality in the body of rule 13 is only used during grounding to eliminate some ground instances of that rule. It is worth mentioning that grounding systems do not make unnecessary replacements, for example, 1 for U. Rules 14 and 15 ensure that every ball is in exactly one urn.
Suppose now that we have discovered that urn 2 is empty and we want to know possible configurations. It is enough to add two facts:
![]() |
16 |
![]() |
17 |
and find all answer sets. A possible answer set is: ball(q), ball(r), urn(1), urn(2), in(r), not_in(2, q), not_in(2, r), not_in(3, q), not_in(3, r), contains(1, q), in(q), urn(3), contains(1, r), which describes the placement of both balls into the first urn.
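The configurations described by the example can be cross-checked by plain enumeration. This is not an ASP solver, only a brute-force sketch of the same knowledge (each ball in exactly one urn, urn 2 empty); the variable names are chosen for the illustration.

```python
from itertools import product

urns = [1, 2, 3]
balls = ["q", "r"]
empty_urns = {2}  # the discovered fact: urn 2 contains no ball

# Each ball is placed in exactly one urn; placements using an empty urn are dropped.
configurations = [
    dict(zip(balls, placement))
    for placement in product(urns, repeat=len(balls))
    if not (set(placement) & empty_urns)
]
```

The placement of both balls into the first urn, which matches the answer set quoted above, is one of the enumerated configurations.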
clingo also allows using choice constructions, for instance:
![]() |
18 |
describes all possible ways to choose which two of the atoms p(1, q), p(2, q), p(3, q) and which two of the atoms p(1, r), p(2, r), p(3, r) are included in the resultant model. Before and after an expression in braces, we can put integers, which express bounds on the cardinality of the stable models described by the rule. The number on the left is the lower bound (0 is default), and the number on the right is the upper bound (unbounded is default).
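The effect of cardinality bounds on a choice construction can be imitated with a direct enumeration of subsets. The sketch below assumes the situation described in the text (two of the three atoms chosen for each ball); the tuple encoding of atoms and the helper name `choices` are illustrative.

```python
from itertools import combinations, product

urns = [1, 2, 3]
balls = ["q", "r"]

def choices(ball, lower, upper):
    """Subsets of {p(1,ball), p(2,ball), p(3,ball)} with size within the bounds."""
    atoms = [("p", urn, ball) for urn in urns]
    for size in range(lower, upper + 1):
        for subset in combinations(atoms, size):
            yield frozenset(subset)

# Lower and upper bound both equal to 2: exactly two atoms per ball.
models = [for_q | for_r
          for for_q, for_r in product(choices("q", 2, 2), choices("r", 2, 2))]
```

With three ways to pick two atoms for each ball, the enumeration yields nine candidate sets of atoms.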
Proposed Encodings for the Induction of CFGs
Our translation converts CFG identification into an ASP program (the main approach) and into a constraint satisfaction problem (CSP) model (an alternative approach). Suppose we are given a sample composed of examples, $S_+$, and counter-examples, $S_-$, over an alphabet $\Sigma$, and a positive integer $k$. We want to find a $k$-variable CFG $G$ such that $S_+ \subseteq L(G)$ and $S_- \cap L(G) = \emptyset$.
Using Logic Programming with Answer Set Semantics
Let $F$ be the set of all factors (excluding the empty word) of $S_+ \cup S_-$. Let us now see how to describe the rules for the relationship between a grammar $G$ and a sample $S_+ \cup S_-$ in terms of ASP. There are three main predicates: y(I, J, L), which indicates the presence of $I \to J\,L$ in $P$; w(I, Q), which indicates that $I \Rightarrow^* Q$, where Q represents a factor; and z(I, A), which indicates the presence of $I \to A$ in $P$.
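- We have the following domain specification, our facts.
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/0ebb1f7ab3cb/500809_1_En_1_Figf2_HTML.gif) (19)
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/5f09b05b48c2/500809_1_En_1_Figf3_HTML.gif) (20)
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/32baeacfa63b/500809_1_En_1_Figf4_HTML.gif) (21)
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/6cfa69cb78e0/500809_1_En_1_Figf5_HTML.gif) (22)
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/f4c68b7ad6aa/500809_1_En_1_Figf6_HTML.gif) (23)
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/9c91fb8b4fb3/500809_1_En_1_Figf7_HTML.gif) (24)
- The next rules ensure that in a grammar $G$ a factor can or cannot be derived from a specific variable and ensure that in the grammar there is a subset of all possible productions.
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/b542ac43ddc4/500809_1_En_1_Figf8_HTML.gif) (25)
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/4ade3dad8aa4/500809_1_En_1_Figf9_HTML.gif) (26)
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/84ab29b3a575/500809_1_En_1_Figf10_HTML.gif) (27)
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/7d5a20e67ee5/500809_1_En_1_Figf11_HTML.gif) (28)
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/9eba13709ed4/500809_1_En_1_Figf12_HTML.gif) (29)
- All examples should be accepted, and no counter-example can be accepted.
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/80a00a58aa0d/500809_1_En_1_Figf13_HTML.gif) (30)
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/4d211b84c3b0/500809_1_En_1_Figf14_HTML.gif) (31)
- For every $f \in F$ for which $|f| \ge 2$ and for every pair $(b, c)$ ($b, c \in F$) of such factors that $f = bc$, $f$ can be derived from a non-terminal $I$ if there are two non-terminals, $J$ and $L$, such that $b$ can be derived from $J$, $c$ can be derived from $L$, and there is a production $I \to J\,L$.
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/f5b1e51973d6/500809_1_En_1_Figf15_HTML.gif) (32)
- On the other hand, if $I \Rightarrow^* f$, then at least one such pair $(b, c)$ should exist that $I \to J\,L$ is in $P$ and $J \Rightarrow^* b$ and $L \Rightarrow^* c$.
  ![](https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae9b/7304013/7da39a88ff00/500809_1_En_1_Figf16_HTML.gif) (33)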
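To make the meaning of the three predicates concrete, the following sketch replaces the ASP solver with an exhaustive search over rule sets for a fixed $k$. It is a brute-force stand-in, not the encoding of this paper: the sets `y` and `z` mirror the predicates y(I, J, L) and z(I, A), the function `derives` plays the role of w(I, Q), variables are numbered from 0, and variable 0 is assumed to be the start symbol.

```python
from itertools import combinations, product

def derives(i, f, y, z, memo):
    """True if variable i derives the non-empty factor f (the role of w(I, Q))."""
    if (i, f) not in memo:
        if len(f) == 1:
            memo[i, f] = (i, f) in z
        else:
            # f = b c with b derived from J, c from L, and a production I -> J L
            memo[i, f] = any(
                derives(j, f[:c], y, z, memo) and derives(l, f[c:], y, z, memo)
                for (h, j, l) in y if h == i
                for c in range(1, len(f))
            )
    return memo[i, f]

def find_grammar(positives, negatives, k, alphabet):
    """Return (y, z) for some k-variable CNF grammar consistent with the sample."""
    candidates = list(product(range(k), alphabet)) + list(product(range(k), repeat=3))
    for size in range(1, len(candidates) + 1):
        for rules in combinations(candidates, size):
            z = {r for r in rules if len(r) == 2}   # rules I -> a
            y = {r for r in rules if len(r) == 3}   # rules I -> J L
            memo = {}
            if (all(derives(0, x, y, z, memo) for x in positives)
                    and not any(derives(0, x, y, z, memo) for x in negatives)):
                return y, z
    return None

positives, negatives = ["ab"], ["a", "b", "ba"]
results = {k: find_grammar(positives, negatives, k, "ab") for k in (1, 2, 3)}
```

On this toy sample no grammar with one or two variables exists, because telling "ab" from "ba" requires distinct variables for the two letters in addition to the start variable; the search succeeds for $k = 3$. The enumeration grows exponentially with $k$, which is why a dedicated solver is needed for realistic instances.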
Using General Constraints
This time, instead of predicates, w, y, and z are binary variables. We use the following constraints
![]() |
34 |
![]() |
35 |
and
![]() |
36 |
for each
, where
means if
then
and if
then
, and
.
Experimental Results
In this section, we describe some experiments comparing the performance of our approaches implemented⁶ in Python, using clingo (ASP) and using Gurobi Optimizer, with our implementation of the algorithm of Imada and Nakamura [11] using the PicoSAT solver (SAT), when positive and negative words are given. For these experiments, we use a set of 40 samples: partly based on randomly generated grammars (33 samples) and partly based on the set of fundamental CFGs appearing in grammatical inference research (the last 7 samples).
Benchmarks
For testing the learning power for general CFGs, we randomly generated 33 CFGs and prepared positive and negative samples with lengths no longer than 14 exhaustively enumerated for them. The grammars are in Chomsky normal form with 6 to 12 rules on the alphabet
. In every sample, positive words constitute not less than 20% of the total.
The last seven samples are also with lengths no longer than 14 exhaustively enumerated, but they were generated based on the following descriptions:
- (a) The set of palindromes over $\{a, b\}$.
- (b) The parentheses language: the set of strings consisting of equal numbers of a's and b's such that every prefix does not have more b's than a's.
- (c) The set of strings consisting of b's twice as many as a's.
- (d) The set of strings of a's and b's not of the form ww.
- (e) The complement of the language (b).
- (f) [definition missing in the source]
- (g) The set of strings consisting of equal numbers of a's and b's.
Performance Comparison
In all experiments, we used an Intel Xeon CPU E5-2650 v2, 2.6 GHz (a single core out of eight), under the Ubuntu 18.04 operating system with 60 GB of available RAM. Algorithm 1 shows the process for synthesizing a grammar (the set of production rules, with a fixed variable always being the start symbol) from positive and negative words.
In the algorithm, $S_+$ and $S_-$ represent the sets of positive and negative words given as input. Two working sets hold the positive and negative words to be covered in the next loop iteration. The algorithm picks a word from $S_+$ or $S_-$ that is not covered by the inferred grammar $G$ and adds it to the corresponding working set. The function Convert translates the problem into a set of ASP rules R (or Gurobi general constraints or a Boolean expression). If the ASP solver (or Gurobi Optimizer or the SAT solver) finds a stable model M, the function Extract returns a set of production rules by analyzing the presence of particular y(i, j, l) and z(i, a) atoms. The algorithm repeats this process, increasing $k$ to relax the limit on the number of non-terminals, until $G$ covers all the given words of $S_+$ and $S_-$.
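The control flow just described can be sketched with the solver and the membership test passed in as functions. Everything below is an assumption made for illustration: the function names, the empty initial working sets, the bound `max_k`, and in particular the toy "solver", in which a hypothesis is merely a set of at most $k$ memorised words rather than a grammar.

```python
def induce(s_pos, s_neg, solve, accepts, max_k=10):
    """Grow the working sets and k until a hypothesis agrees with the whole sample."""
    work_pos, work_neg, k = set(), set(), 1
    while k <= max_k:
        grammar = solve(work_pos, work_neg, k)
        if grammar is None:
            k += 1  # no solution with k variables: relax the bound
            continue
        missed = next((x for x in sorted(s_pos) if not accepts(grammar, x)), None)
        wrong = next((x for x in sorted(s_neg) if accepts(grammar, x)), None)
        if missed is None and wrong is None:
            return grammar, k
        # Add one word that the current hypothesis handles incorrectly.
        if missed is not None:
            work_pos.add(missed)
        else:
            work_neg.add(wrong)
    return None

def toy_solve(pos, neg, k):
    # Hypothetical stand-in for Convert + solver + Extract.
    return frozenset(pos) if len(pos) <= k and not (pos & neg) else None

def toy_accepts(grammar, word):
    return word in grammar

result = induce({"ab", "aabb"}, {"a"}, toy_solve, toy_accepts)
```

The point of working with subsets of the sample is that the solver is usually called on far fewer words than the full sample contains.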
The results are listed in Table 1. In order to determine whether the observed CPU time differences between the ASP runs and the runs of the remaining methods occurred by chance, we use the Wilcoxon signed-rank test [17, pp. 915–916] for ASP vs SAT and ASP vs Gurobi. The null hypothesis to be tested is that the median of the paired differences is negative (against the alternative that it is positive). As we can see from Table 2, the p-value is high in both cases, so the null hypothesis cannot be rejected, and we may conclude that using our ASP encoding is likely to improve CPU time performance for most benchmarks of this kind.
Table 1. Execution times of exact solving of CFG identification, in seconds
| Language | |V| | ASP | SAT | Gurobi |
|---|---|---|---|---|
| 1 | 3 | 51.70 | 48.65 | 56.42 |
| 2 | 6 | 646.39 | 21049.22 | 21050 |
| 3 | 4 | 74.31 | 189.85 | 143.76 |
| 4 | 5 | 75.90 | 347.84 | 2000 |
| 5 | 4 | 27.91 | 64.82 | 18.36 |
| 6 | 5 | 75.96 | 335.98 | 10.33 |
| 7 | 4 | 68.35 | 61.87 | 2000 |
| 8 | 4 | 57.14 | 118.25 | 28.85 |
| 9 | 3 | 45.17 | 94.86 | 73.03 |
| 10 | 5 | 211.33 | 568.12 | 568.06 |
| 11 | 5 | 62.48 | 166.65 | 2000 |
| 12 | 3 | 21.50 | 58.12 | 33.28 |
| 13 | 6 | 112.69 | 705.80 | 2000 |
| 14 | 6 | 943.02 | 4807.32 | 4808 |
| 15 | 7 | 19358.09 | 252290.70 | 252291 |
| 16 | 4 | 49.01 | 111.22 | 103.05 |
| 17 | 7 | 2921.44 | 8035.44 | 8036 |
| 18 | 5 | 361.52 | 1369.22 | 2000 |
| 19 | 5 | 63.47 | 238.71 | 186.10 |
| 20 | 2 | 12.96 | 5.64 | 3.88 |
| 21 | 5 | 96.68 | 512.83 | 671.62 |
| 22 | 2 | 11.38 | 12.02 | 10.54 |
| 23 | 3 | 11.84 | 43.03 | 9.92 |
| 24 | 4 | 109.98 | 159.73 | 176.49 |
| 25 | 3 | 22.65 | 22.40 | 29.65 |
| 26 | 5 | 38.74 | 271.30 | 420.11 |
| 27 | 5 | 94.76 | 295.81 | 2000 |
| 28 | 5 | 216.61 | 625.07 | 2000 |
| 29 | 5 | 271.88 | 324.43 | 2000 |
| 30 | 6 | 228.98 | 412.16 | 2000 |
| 31 | 2 | 10.97 | 15.29 | 19.84 |
| 32 | 5 | 62.17 | 293.98 | 105.74 |
| 33 | 3 | 10.42 | 18.30 | 13.15 |
| 34 | 5 | 31.13 | 49.28 | 32.83 |
| 35 | 3 | 12.84 | 20.97 | 12.86 |
| 36 | 4 | 118.17 | 76.98 | 73.74 |
| 37 | 6 | 173.66 | 191.42 | 2000 |
| 38 | 4 | 29.33 | 54.63 | 36.71 |
| 39 | 4 | 4.12 | 21.00 | 9.02 |
| 40 | 3 | 66.71 | 50.65 | 40.40 |
Table 2. Obtained p-values from the Wilcoxon signed-rank test
| ASP vs SAT | ASP vs Gurobi |
|---|---|
| 0.999999647 | 0.999987068 |
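The signed-rank computation for the ASP vs SAT comparison can be retraced from the two corresponding columns of Table 1. The sketch below rests on assumptions that the text does not state: the paired differences are taken as ASP minus SAT, the one-sided p-value is for the alternative that the differences tend to be positive, and the normal approximation is used without continuity or tie correction.

```python
import math

asp = [51.70, 646.39, 74.31, 75.90, 27.91, 75.96, 68.35, 57.14, 45.17, 211.33,
       62.48, 21.50, 112.69, 943.02, 19358.09, 49.01, 2921.44, 361.52, 63.47, 12.96,
       96.68, 11.38, 11.84, 109.98, 22.65, 38.74, 94.76, 216.61, 271.88, 228.98,
       10.97, 62.17, 10.42, 31.13, 12.84, 118.17, 173.66, 29.33, 4.12, 66.71]
sat = [48.65, 21049.22, 189.85, 347.84, 64.82, 335.98, 61.87, 118.25, 94.86, 568.12,
       166.65, 58.12, 705.80, 4807.32, 252290.70, 111.22, 8035.44, 1369.22, 238.71, 5.64,
       512.83, 12.02, 43.03, 159.73, 22.40, 271.30, 295.81, 625.07, 324.43, 412.16,
       15.29, 293.98, 18.30, 49.28, 20.97, 76.98, 191.42, 54.63, 21.00, 50.65]

def signed_rank(xs, ys):
    """Return (W+, W-, n) for paired samples, with average ranks for ties."""
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for t in range(i, j + 1):
            ranks[order[t]] = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)

w_plus, w_minus, n = signed_rank(asp, sat)
mean = n * (n + 1) / 4
sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
z = (w_plus - mean) / sd
p_greater = 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the standard normal
```

Only six of the forty differences are positive and they carry small ranks, so the statistic lies far below its mean and the one-sided p-value is very close to 1; under the stated assumptions it should come out close to the ASP vs SAT entry of Table 2.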
ASP-Based CFG Induction on Bioinformatics Datasets
Our induction method can also be applied to other data that are not taken from infinite context-free languages. We tested its classification quality on two bioinformatics datasets: the WALTZ-DB database [4], composed of 116 hexapeptides known to induce amyloidosis ($S_+$) and 161 hexapeptides that do not induce amyloidosis ($S_-$), and the Maurer-Stroh et al. database from the same domain [14], where the ratio $|S_+|/|S_-|$ is 240/836.
We chose a few standard machine learning methods for comparison: BNB (Naive Bayes classifier for multivariate Bernoulli models [13, pp. 234–265]), DTC (Decision Trees Classifier, CART method [5]), MLP (Multi-layer Perceptron [16]), and SVM (Support Vector Machine classifier with the linear kernel [18]). In all methods except ASP and BNB, an unsupervised data-driven distributed representation, called ProtVec [2], was applied in order to convert words (protein representations) to numerical vectors. For BNB, we represented words as binary-valued feature vectors that indicated the presence or absence of every pair of protein letters. In the case of ASP, the training set was partitioned randomly into $n$ parts, and the following process was performed $m$ times: one part was chosen for synthesizing a CFG and the remaining $n - 1$ parts were used for validating it. The best of all $m$ grammars (in terms of F-measure) was then confronted with the test set. For WALTZ-DB, $n$ and $m$ were set to 20; for Maurer-Stroh, $n$ was set to 10 and $m$ to 30. These values were selected experimentally based on the size of the databases and the running time of the ASP solver.
To estimate the ability of ASP and the compared approaches to classify unseen hexapeptides, a repeated 10-fold cross-validation (cv) strategy was used. It means splitting the data randomly into 10 mutually exclusive folds, building a model on all but one fold, and evaluating the model on the skipped fold. The procedure was repeated 10 times and the overall assessment of the model was based on the mean of those 10 individual evaluations. Table 3 summarizes the performances of the compared methods on the WALTZ-DB and Maurer-Stroh databases. It is noticeable that the ASP approach achieved the best F-score for the larger dataset (Maurer-Stroh) and an average F-score for the smaller one (WALTZ-DB), hence it can be used with high reliability to recognize amyloid proteins. BNB is outstanding for WALTZ-DB and almost as good as ASP for the Maurer-Stroh database.
Table 3. Performance of compared methods on WALTZ-DB and Maurer-Stroh databases in terms of Precision (P), Recall (R), and F-score (F1)
| Method | WALTZ-DB P | WALTZ-DB R | WALTZ-DB F1 | Maurer-Stroh P | Maurer-Stroh R | Maurer-Stroh F1 |
|---|---|---|---|---|---|---|
| ASP | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| BNB | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| DTC | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| MLP | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| SVM | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
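For reference, the three measures reported in Table 3 and the fold construction used in cross-validation can be computed as in the following sketch. The label vectors in the demo are invented for illustration and are unrelated to the datasets of this section; the helper names are likewise assumptions.

```python
import random

def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F-score for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def k_fold_indices(n, k, rng):
    """Split the indices 0..n-1 randomly into k mutually exclusive folds."""
    indices = list(range(n))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]

# Invented labels: 3 true positives, 1 false positive, 1 false negative.
scores = precision_recall_f1([1, 1, 1, 0, 0, 0, 0, 1], [1, 0, 1, 0, 1, 0, 0, 1])
folds = k_fold_indices(20, 10, random.Random(0))
```

In repeated cross-validation the fold construction is simply rerun with fresh shuffles and the per-run scores are averaged.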
Conclusion
In this paper, we proposed an approach for learning context-free grammars from positive and negative samples by using logic programming. We encode the set of samples, together with limits on the number of non-terminals to be synthesized, as an answer set program. A stable model (an answer set) for the program contains a set of grammar rules that derives all positive samples and no negative samples. One feature of this approach is that we can synthesize a compact set of rules in Chomsky normal form. Another feature is that our learning method benefits from future improvements in ASP solvers. We presented experimental results on learning CFGs for fundamental context-free languages, including the set of strings composed of equal numbers of a's and b's and the set of strings over $\{a, b\}$ not of the form ww. Another series of experiments on random languages shows that our encoding can speed up computations in comparison with SAT and CSP encodings.
Footnotes
We are aware of this imprecision. The number of words and their lengths should allow executing a program in a reasonable amount of time. In the experiments, we took two words: one example and one counter-example.
The Python scripting language is used only for generating appropriate AnsProlog facts.
This process can be done efficiently, because many ground instances can be discarded; see Chapter 4 of [6].
This research was supported by National Science Center (Poland), grant number 2016/21/B/ST6/02158.
Contributor Information
Valeria V. Krzhizhanovskaya, Email: V.Krzhizhanovskaya@uva.nl
Gábor Závodszky, Email: G.Zavodszky@uva.nl.
Michael H. Lees, Email: m.h.lees@uva.nl
Jack J. Dongarra, Email: dongarra@icl.utk.edu
Peter M. A. Sloot, Email: p.m.a.sloot@uva.nl
Sérgio Brissos, Email: sergio.brissos@intellegibilis.com.
João Teixeira, Email: joao.teixeira@intellegibilis.com.
Wojciech Wieczorek, Email: wojciech.wieczorek@us.edu.pl.
Łukasz Strąk, Email: lukasz.strak@us.edu.pl.
Arkadiusz Nowakowski, Email: arkadiusz.nowakowski@us.edu.pl.
Olgierd Unold, Email: olgierd.unold@pwr.edu.pl.
References
- 1. Angluin D. Negative results for equivalence queries. Mach. Learn. 1990;5(2):121–150. doi: 10.1007/BF00116034.
- 2. Asgari E, Mofrad MRK. Continuous distributed representation of biological sequences for deep proteomics and genomics. PLoS ONE. 2015;10(11):1–15. doi: 10.1371/journal.pone.0141287.
- 3. Baral C. Knowledge Representation, Reasoning, and Declarative Problem Solving. New York: Cambridge University Press; 2003.
- 4. Beerten J, et al. WALTZ-DB: a benchmark database of amyloidogenic hexapeptides. Bioinformatics. 2015;31(10):1698–1700. doi: 10.1093/bioinformatics/btv027.
- 5. Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification and Regression Trees. Monterey: Wadsworth and Brooks; 1984.
- 6. Gebser M, Kaminski R, Kaufmann B, Schaub T. Answer Set Solving in Practice. San Rafael: Morgan & Claypool Publishers; 2012.
- 7. Gold EM. Language identification in the limit. Inf. Control. 1967;10:447–474. doi: 10.1016/S0019-9958(67)91165-5.
- 8. de la Higuera C. Characteristic sets for polynomial grammatical inference. Mach. Learn. 1997;27(2):125–138. doi: 10.1023/A:1007353007695.
- 9. de la Higuera C. Grammatical Inference: Learning Automata and Grammars. New York: Cambridge University Press; 2010.
- 10. Hopcroft JE, Motwani R, Ullman JD. Introduction to Automata Theory, Languages, and Computation. 2nd ed. Reading: Addison-Wesley; 2001.
- 11. Imada K, Nakamura K. Learning context free grammars by using SAT solvers. In: Proceedings of the 2009 International Conference on Machine Learning and Applications. IEEE Computer Society; 2009. pp. 267–272.
- 12. Lifschitz V. Answer Set Programming. Cham: Springer; 2019.
- 13. Manning CD, Raghavan P, Schütze H. Introduction to Information Retrieval. Cambridge: Cambridge University Press; 2008.
- 14. Maurer-Stroh S, et al. Exploring the sequence determinants of amyloid structure using position-specific scoring matrices. Nat. Methods. 2010;7:237–242. doi: 10.1038/nmeth.1432.
- 15. Nakamura K, Matsumoto M. Incremental learning of context free grammars based on bottom-up parsing and search. Pattern Recogn. 2005;38(9):1384–1392. doi: 10.1016/j.patcog.2005.01.004.
- 16. Pedregosa F, et al. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 2011;12:2825–2830.
- 17. Salkind NJ. Encyclopedia of Research Design. London: SAGE Publications Inc.; 2010.
- 18. Wu TF, Lin CJ, Weng RC. Probability estimates for multi-class classification by pairwise coupling. J. Mach. Learn. Res. 2004;5:975–1005.


































































