Algorithm 1 Clustering-Based Anonymizer (CBA)

input: Dataset D, utility policy U, P comprised of potentially identifying sets of codes, and k
output: Anonymized dataset D̃

1.  D̃ ← D
2.  Populate a priority queue PQ with all sets of codes in P
3.  while (PQ is not empty)
4.      Retrieve the top-most set of codes p from PQ
5.      foreach (i_m ∈ p)
6.          if (i_m is a generalized term)
7.              i_m ← the set of ICD codes mapped to i_m
8.      if (sup(p, D̃) ≥ k)
9.          remove p from PQ
10.     else
11.         while (sup(p, D̃) < k)
12.             find a pair {i_m, i_s} such that i_m is contained in p, i_m and i_s are contained in the same utility constraint u ∈ U, and ILM((i_m, i_s)) is minimal
13.             ĩ ← anonymize({i_m, i_s}, p)
14.             update p by replacing {i_m, i_s} with ĩ
15.             store the mapping of ĩ with the set of all ICD codes contained in {i_m, i_s}
16.         remove p from PQ
17. return D̃
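
The control flow of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the function name `cba_sketch`, the representation of a generalized term as a `frozenset` of ICD codes, the priority order of the queue (larger sets of codes first), and the stand-in `ilm` cost, since the listing does not define ILM, `anonymize`, or the queue ordering. The sketch also assumes a transaction supports a generalized term when it contains at least one of the codes mapped to that term, and it raises an error when no admissible pair exists, a case the listing does not cover.

```python
import heapq
from itertools import count


def cba_sketch(dataset, utility_policy, privacy_policy, k):
    """Illustrative sketch of the CBA loop (not the authors' implementation)."""
    dataset = [set(t) for t in dataset]
    utility_policy = [frozenset(u) for u in utility_policy]
    privacy_policy = [frozenset(p) for p in privacy_policy]

    # gen[c] is the generalized term (a frozenset of ICD codes) that code c
    # currently maps to; initially every code maps to itself.
    codes = set().union(*dataset, *utility_policy, *privacy_policy)
    gen = {c: frozenset([c]) for c in codes}

    def support(p):
        # ASSUMPTION: a transaction supports p if it contains at least one
        # code from every (possibly generalized) item of p.
        return sum(all(t & item for item in p) for t in dataset)

    def ilm(merged):
        # ASSUMED stand-in for ILM: size of the merged term times the number
        # of transactions it touches. The listing does not define ILM.
        return len(merged) * sum(1 for t in dataset if t & merged)

    # Step 2. ASSUMED priority: larger sets of codes are retrieved first.
    tie = count()
    pq = [(-len(p), next(tie), p) for p in privacy_policy]
    heapq.heapify(pq)

    while pq:                                   # step 3
        _, _, raw = heapq.heappop(pq)           # steps 4, 9 and 16 combined
        p = {gen[c] for c in raw}               # steps 5-7: refresh p
        while support(p) < k:                   # step 11
            best = None
            for i_m in p:                       # step 12: cheapest merge
                for u in utility_policy:
                    if not i_m <= u:
                        continue
                    for c in u:
                        i_s = gen[c]
                        if i_s == i_m:
                            continue
                        merged = i_m | i_s
                        # Sorted codes break cost ties deterministically.
                        key = (ilm(merged), sorted(merged))
                        if best is None or key < best[0]:
                            best = (key, merged)
            if best is None:
                # ASSUMPTION: the listing does not say what happens here.
                raise ValueError("no admissible pair left for %r" % sorted(raw))
            merged = best[1]                    # step 13
            for c in merged:                    # step 15: store the mapping
                gen[c] = merged
            p = {gen[c] for c in raw}           # step 14

    # Apply the stored mapping to obtain the anonymized dataset.
    anonymized = [{gen[c] for c in t} for t in dataset]
    return anonymized, gen


# Hypothetical toy data: codes "A", "B", "X", "Y", "C" are placeholders.
anonymized, mapping = cba_sketch(
    dataset=[{"A", "X"}, {"B", "X"}, {"A", "Y"}, {"C"}],
    utility_policy=[{"A", "B"}, {"X", "Y"}],
    privacy_policy=[{"A", "X"}],
    k=2,
)
```

In this toy run the set {A, X} is supported by only one transaction, so the sketch merges A with B, the only code sharing a utility constraint with it, after which two transactions support the generalized set. Keeping the mapping from codes to generalized terms separate from the transactions is a design choice of the sketch: it lets support be computed on the original data while still reflecting every generalization made so far.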