Algorithm 3: PFS2 Algorithm
Input: Original database D; threshold θ; maximal length constraint upper bound l1; percentage η; privacy budgets ε1, …, ε5.
Output: Frequent sequences FS.
1: /**** Pre-Mining Phase ****/
2: |D| ← get the noisy number of total input sequences using ε1;
3: for l2 = 1; p < η; l2++ do
4:     αl2 ← get the noisy number of input sequences with length l2 using ε2;
5:     p ← (α1 + … + αl2) / |D|;
6: end for
7: lmax ← min{l1, l2};
8: β ← get the noisy maximal support of sequences of length 1 to lmax using ε3;
9: Lf ← estimate_max_frequent_sequence_length(θ, β);
/**** Mining Phase ****/ |
11: |
FS ← ø; |
12: |
dbSet ← randomly_partition_database (D, Lf); |
13: |
ε′ ← ε5/Lf; |
14: |
for
k from 1 to Lf
do
|
15: |
if
k == 1 then
|
16: |
Candidate Set Ck ← all items in the alphabet; |
17: |
else
|
18: |
Candidate Set Ck ← generate_candidates (FSk—1); |
19: |
end if
|
20: |
← Sampling_based_Candidate_Pruning(Ck, dbk, ε4, θ, lmax); /**** See Algorithm 1 ****/ |
21: |
FSk ← discover_frequent_sequences (
, D, ε′, θ); |
22: |
FS += FSk; |
23: |
end for |
24: |
return FS; |
|
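The length-estimation part of the pre-mining phase can be illustrated with a short Python sketch. It is not the authors' implementation and rests on several assumptions: the noisy counts come from the Laplace mechanism with sensitivity 1, p is read as the cumulative fraction of sequences of length at most l2, the loop is capped at l1 so that it always terminates, and the returned value is the first length at which that fraction reaches η. The names `laplace_noise` and `estimate_l_max` are hypothetical.

```python
import random
from collections import Counter


def laplace_noise(scale, rng=random):
    """Draw Laplace(0, scale) noise as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def estimate_l_max(database, eta, l1, eps1, eps2, noise=laplace_noise):
    """Estimate lmax: the smallest length covering a fraction eta of sequences, capped at l1.

    `database` is a list of sequences; only their lengths are used here.
    Sensitivity 1 for every count is an assumption of this sketch.
    """
    # Noisy total number of input sequences, clamped to avoid division by <= 0.
    noisy_total = max(1.0, len(database) + noise(1.0 / eps1))
    length_counts = Counter(len(seq) for seq in database)

    cumulative = 0.0
    l2 = 0
    # Accumulate noisy per-length counts until the covered fraction reaches eta.
    # The cap at l1 is an assumption; the result is min(l1, l2) either way.
    while l2 < l1:
        l2 += 1
        alpha = length_counts.get(l2, 0) + noise(1.0 / eps2)
        cumulative += max(0.0, alpha)  # negative noisy counts are clamped to 0
        if cumulative / noisy_total >= eta:
            break
    return min(l1, l2)


if __name__ == "__main__":
    # Toy database: ten sequences with lengths 1, 1, 2, 2, 2, 3, 5, 5, 5, 5.
    db = [["x"] * n for n in (1, 1, 2, 2, 2, 3, 5, 5, 5, 5)]
    print(estimate_l_max(db, eta=0.55, l1=10, eps1=1.0, eps2=1.0))
```

The `noise` parameter is injectable so that the logic can be checked deterministically (for example with `lambda scale: 0.0`); with real Laplace noise the result varies from run to run.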