Proceedings of the AMIA Symposium. 2001:726–730.

Hiding information by cell suppression.

S A Vinterbo, L Ohno-Machado, S Dreiseitl
PMCID: PMC2243346  PMID: 11825281

Abstract

Joining relational data can jeopardize patient confidentiality when data disseminated for research can be joined with publicly available data that contain, for example, explicit identifiers. Ambiguity in data hinders the construction of primary keys, which are important when joining data tables. We define two values to be indiscernible if they are the same or if at least one of them is a special value. Two rows in a data table are indiscernible if their corresponding entries are indiscernible. We further define a table to be k-ambiguous if each row is indiscernible from at least k rows in the same table. We present two simple heuristics for making a table k-ambiguous by cell suppression and compare them on example data.
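
The definitions in the abstract can be illustrated with a minimal sketch. It covers only the indiscernibility and k-ambiguity checks, not the two heuristics, which the abstract does not describe. Several details are assumptions made for illustration: the special value is represented by Python's `None`, a row is counted among the rows it is indiscernible from, and the names `SUPPRESSED`, `values_indiscernible`, `rows_indiscernible`, and `is_k_ambiguous` are hypothetical.

```python
from typing import Hashable, Optional, Sequence

# Assumption: None stands in for the abstract's "special value".
SUPPRESSED = None

Row = Sequence[Optional[Hashable]]


def values_indiscernible(a: Optional[Hashable], b: Optional[Hashable]) -> bool:
    """Two values are indiscernible if equal or if either is the special value."""
    return a == b or a is SUPPRESSED or b is SUPPRESSED


def rows_indiscernible(r: Row, s: Row) -> bool:
    """Two rows are indiscernible if all corresponding entries are."""
    return len(r) == len(s) and all(
        values_indiscernible(a, b) for a, b in zip(r, s)
    )


def is_k_ambiguous(table: Sequence[Row], k: int) -> bool:
    """Check that every row is indiscernible from at least k rows.

    Assumption: the count includes the row itself.
    """
    return all(
        sum(rows_indiscernible(r, s) for s in table) >= k for r in table
    )


# Made-up example data: (sex, age, postal code).
original = [
    ("F", 34, "02139"),
    ("F", 34, "02139"),
    ("M", 51, "02139"),
    ("M", 52, "02139"),
]

# The same table with the age cell suppressed in the last two rows.
suppressed = [
    ("F", 34, "02139"),
    ("F", 34, "02139"),
    ("M", SUPPRESSED, "02139"),
    ("M", SUPPRESSED, "02139"),
]

print(is_k_ambiguous(original, 2))
print(is_k_ambiguous(suppressed, 2))
```

Under these assumptions, the unmodified example table is not 2-ambiguous because the two male rows differ in age, while suppressing those two age cells makes every row indiscernible from at least two rows.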



Articles from Proceedings of the AMIA Symposium are provided here courtesy of American Medical Informatics Association
