Algorithm 1 BestSplitByHistogram
Input: d: training data, max_depth
Input: m: merger fingerprint dimension
nodeSet = {0}              # tree nodes in current level
rowSet = {{0, 1, 2, ...}}  # data indices in tree nodes
for i = 1 to max_depth do
    for node in nodeSet do
        usedRows = rowSet[node]
        for j = 1 to m do
            H = new Histogram()
            # Build histogram
            for k in usedRows do
                bin = d.s[j][k].bin
                H[bin].g += d.g[k]   # Sum of gradients in each bin
                H[bin].n += 1        # Number of samples in each bin
            Find the best split on histogram H.
    Update rowSet and nodeSet according to the best split points
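
The inner loop of Algorithm 1 for a single feature can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The names `Bin`, `build_histogram` and `best_split` are hypothetical, and the sketch assumes the gradient array is indexed by row k. Algorithm 1 does not state how the best split is scored, so the gain G_L^2/n_L + G_R^2/n_R - G^2/n used here is an assumption chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    g: float = 0.0  # sum of gradients of the rows that fall in this bin
    n: int = 0      # number of rows that fall in this bin


def build_histogram(bins, grads, used_rows, num_bins):
    """Accumulate per-bin gradient sums and counts for one feature.

    bins[k]  : bin index of row k for this feature (d.s[j][k].bin)
    grads[k] : gradient of row k (assumed to be indexed by row)
    """
    hist = [Bin() for _ in range(num_bins)]
    for k in used_rows:
        b = bins[k]
        hist[b].g += grads[k]
        hist[b].n += 1
    return hist


def best_split(hist):
    """Scan bin boundaries left to right and return (best_bin, best_gain).

    Rows with bin <= best_bin go left. The gain formula is an assumed
    scoring rule, not one taken from Algorithm 1. Returns (None, 0.0)
    when no boundary gives a positive gain.
    """
    total_g = sum(b.g for b in hist)
    total_n = sum(b.n for b in hist)
    best_bin, best_gain = None, 0.0
    left_g, left_n = 0.0, 0
    for t, b in enumerate(hist[:-1]):
        left_g += b.g
        left_n += b.n
        right_n = total_n - left_n
        if left_n == 0 or right_n == 0:
            continue  # a split must leave rows on both sides
        right_g = total_g - left_g
        gain = (left_g ** 2 / left_n
                + right_g ** 2 / right_n
                - total_g ** 2 / total_n)
        if gain > best_gain:
            best_bin, best_gain = t, gain
    return best_bin, best_gain
```

Building the histogram costs one pass over `usedRows`, after which the split search touches only the bins, so its cost depends on the number of bins rather than the number of rows.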