Entropy-Based Greedy Algorithm for Decision Trees Using Hypotheses

2021 Jun 25;23(7):808. doi: 10.3390/e23070808

Algorithm 1: E

Input: A nonempty decision table T and a number $k \in \{1, \dots, 5\}$.
Output: A decision tree of type k for the table T.

1. Construct a tree G consisting of a single node labeled with T.
2. If no node of the tree G is labeled with a table, then the algorithm ends and returns the tree G.
3. Choose a node v in G that is labeled with a subtable $Θ$ of the table T.
4. If $Θ$ is degenerate, then instead of $Θ$, we label the node v with 0 if $Θ$ is empty and with the decision attached to each row of $Θ$ if $Θ$ is nonempty.
5. If $Θ$ is nondegenerate, then depending on k, we choose a query X (either an attribute or a hypothesis) in the following way:
   - (a) If $k = 1$, then we find an attribute $X \in F(T)$ with the minimum impurity $I(X, Θ)$.
   - (b) If $k = 2$, then we find a hypothesis X over T with the minimum impurity $I(X, Θ)$.
   - (c) If $k = 3$, then we find an attribute $Y \in F(T)$ with the minimum impurity $I(Y, Θ)$ and a hypothesis Z over T with the minimum impurity $I(Z, Θ)$. Between Y and Z, we choose a query X with the minimum impurity $I(X, Θ)$.
   - (d) If $k = 4$, then we find a proper hypothesis X over T with the minimum impurity $I(X, Θ)$.
   - (e) If $k = 5$, then we find an attribute $Y \in F(T)$ with the minimum impurity $I(Y, Θ)$ and a proper hypothesis Z over T with the minimum impurity $I(Z, Θ)$. Between Y and Z, we choose a query X with the minimum impurity $I(X, Θ)$.
6. Instead of $Θ$, we label the node v with the query X. For each answer $S \in A(X)$, we add to the tree G a node $v(S)$ and an edge $e(S)$ connecting v and $v(S)$. We label the node $v(S)$ with the subtable $ΘS$ and label the edge $e(S)$ with the answer S. We then proceed to step 2.
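
The steps of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it relies on several assumptions that this section does not state. The impurity $I(X, Θ)$ is taken to be the largest Shannon entropy among the answer subtables. The answers to an attribute are taken to be its values in T. The answers to a hypothesis are taken to be the hypothesis itself or a counterexample of the form "attribute i has a value different from the hypothesized one". A proper hypothesis is taken to be one that coincides with a row of T. Rows of T are assumed to be pairwise distinct. All names (`build_tree`, `impurity`, `candidates`, and so on) and the tuple encoding of queries and answers are hypothetical. Hypotheses are enumerated by brute force, which is only feasible for tiny tables, and the sketch skips any query that would leave $Θ$ unchanged for some answer; this safeguard is an addition of the sketch to guarantee termination.

```python
from collections import Counter
from itertools import product
from math import log2

# Hypothetical encoding (not from the source): a table is a list of
# (row, decision) pairs with pairwise distinct rows; a query is ("attr", i)
# or ("hyp", delta); an answer is ("eq", i, value) or ("hyp", delta).

def columns(table):
    """Values of each attribute in T, in first-seen order."""
    n = len(table[0][0])
    return [list(dict.fromkeys(r[i] for r, _ in table)) for i in range(n)]

def entropy(theta):
    """Shannon entropy of the decisions in theta (0 for an empty table)."""
    n = len(theta)
    return -sum(c / n * log2(c / n) for c in Counter(d for _, d in theta).values())

def restrict(theta, answer):
    """Subtable of theta consisting of the rows that satisfy the answer."""
    if answer[0] == "eq":
        _, i, value = answer
        return [(r, d) for r, d in theta if r[i] == value]
    return [(r, d) for r, d in theta if r == answer[1]]

def answers(query, table):
    """Assumed A(X): one equation per value for an attribute; for a hypothesis,
    the hypothesis itself or a counterexample f_i = value with value != delta_i."""
    cols = columns(table)
    if query[0] == "attr":
        return [("eq", query[1], v) for v in cols[query[1]]]
    delta = query[1]
    return [query] + [("eq", i, v) for i, col in enumerate(cols)
                      for v in col if v != delta[i]]

def impurity(query, theta, table):
    """Assumed I(X, Theta): the largest entropy among the answer subtables."""
    return max(entropy(restrict(theta, s)) for s in answers(query, table))

def candidates(table, k):
    """Queries allowed for tree type k (hypotheses enumerated by brute force)."""
    attrs = [("attr", i) for i in range(len(table[0][0]))]
    hyps = [("hyp", d) for d in product(*columns(table))]
    proper = [("hyp", r) for r, _ in table]
    return {1: attrs, 2: hyps, 3: attrs + hyps, 4: proper, 5: attrs + proper}[k]

def build_tree(table, k):
    root = {"table": table}                      # step 1
    pending, queries = [root], candidates(table, k)
    while pending:                               # step 2: stop when no table labels remain
        node = pending.pop()                     # step 3
        theta = node.pop("table")
        if len({d for _, d in theta}) <= 1:      # step 4: degenerate subtable
            node["label"] = theta[0][1] if theta else 0
            continue
        # Safeguard added by this sketch: skip queries with an answer that keeps all of theta.
        usable = [q for q in queries
                  if all(len(restrict(theta, s)) < len(theta) for s in answers(q, table))]
        query = min(usable, key=lambda q: impurity(q, theta, table))   # step 5
        node["label"], node["children"] = query, {}                    # step 6
        for s in answers(query, table):
            node["children"][s] = child = {"table": restrict(theta, s)}
            pending.append(child)
    return root

def depth(node):
    """Number of queries on the longest root-to-leaf path."""
    return 1 + max(map(depth, node["children"].values())) if "children" in node else 0

def classify(node, row):
    """Follow answers for a row; for a hypothesis, report the first counterexample."""
    while "children" in node:
        kind, arg = node["label"]
        if kind == "attr":
            s = ("eq", arg, row[arg])
        elif row == arg:
            s = ("hyp", arg)
        else:
            i = next(j for j, v in enumerate(row) if v != arg[j])
            s = ("eq", i, row[i])
        node = node["children"][s]
    return node["label"]

# Toy table: decision "b" only when both binary attributes equal 1.
T = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((1, 1), "b")]
trees = {k: build_tree(T, k) for k in range(1, 6)}
```

On this toy table, and under the assumptions above, the type-1 tree needs two attribute queries on its longest path, whereas the type-2 tree consists of the single hypothesis (1, 1): confirming it yields the decision "b", and every counterexample leads to a subtable whose rows all carry the decision "a".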