Introduction
The notion of privacy in the healthcare domain is at least as old as the ancient Greeks. Several decades ago, as electronic medical record (EMR) systems began to take hold, the necessity of patient privacy was recognized as a core principle, or even a right, that must be upheld.1 2 This belief was reinforced as computers and EMRs became more common in clinical environments.3–5 However, the arrival of ultra-cheap data collection and processing technologies is fundamentally changing the face of healthcare. The traditional boundaries of primary and tertiary care environments are breaking down, and health information is increasingly collected through mobile devices,6 in personal domains (eg, in one's home7), and from sensors attached on or in the human body (eg, body area networks8–10). At the same time, the detail and diversity of information collected in the context of healthcare and biomedical research are increasing at an unprecedented rate: clinical and administrative health data are being complemented with a range of *omics data, where genomics11 and proteomics12 are currently leading the charge and other types of molecular data are on the horizon.13 Healthcare organizations (HCOs) are adopting and adapting information technologies to support an expanding array of activities designed to derive value from these growing data archives, in terms of enhanced health outcomes.14
The ready availability of such large volumes of detailed data has also been accompanied by privacy invasions. Recent breach notification laws at the US federal and state levels have brought to the public's attention the scope and frequency of these invasions. For example, there are cases of healthcare providers snooping on the medical records of famous people, family, and friends, use of personal information for identity fraud, and millions of records disclosed through lost and stolen unencrypted mobile devices.15 The danger is that such publicized incidents will erode patient trust over time, and lead to privacy protective behaviors. For example, between 15% and 17% of US adults have changed their behavior to protect the privacy of their health information, doing things such as: going to another doctor, paying out-of-pocket when insured to avoid disclosure, not seeking care to avoid disclosure to an employer, giving inaccurate or incomplete information on medical history, self-treating or self-medicating rather than seeing a provider, or asking a doctor not to write down the health problem or to record a less serious or embarrassing condition.16–18 A survey of service members who had been on active duty found that respondents were concerned that if they received treatment for their mental health problems, it would not be kept confidential and would have a negative impact on future job assignments and career advancement.19 Specific vulnerable populations have reported similar privacy protective behaviors, including adolescents, people with HIV or at high risk for HIV, women undergoing genetic testing, mental health patients, and victims of domestic violence.20–26 A survey of Californian residents found that privacy concerns deterred 15% of respondents from discussing depression with their primary care physician.27
On the other hand, some legal scholars are questioning the survival of conventional privacy expectations.28 Privacy has conventionally been defined as an individual's ability to control the disclosure of personal facts.29 30 However, privacy is also a multi-dimensional concept31 32 and any shifts in privacy expectations are not homogeneous in direction and intensity across all of these dimensions. Furthermore, advances in informatics that may be eroding individuals’ control over their information are being countered by advances in privacy enhancing technologies, as well as regulatory and policy changes that give individuals back control over their information.
This special issue was established to solicit research on privacy as it is currently understood and as it is being redefined for emerging biomedical systems. The selected articles consider the different dimensions of privacy, and describe some novel privacy enhancing technologies and their applications, as well as the governance, regulatory, and policy mechanisms that are being used to manage privacy risks.
Privacy is a major patient, provider, regulator, and legislator concern today. There is therefore a need to address these concerns in a practical way that can be deployed in the short term. Deployment must be preceded by a convincing evidence base demonstrating the rationale, costs, and benefits of an intervention. At the same time, new theoretical models and novel approaches that still need to be evaluated and tested in the field are also necessary to ensure that the field keeps evolving. In putting together this special issue we attempted to balance these two perspectives, with articles presenting results of immediate relevance and applicability, and material covering theoretical work that remains to be proven in practical settings.
There were 53 papers submitted for consideration in this issue, of which 13 were accepted for publication, for an acceptance rate of 25%. All papers were subject to a rigorous review by at least two referees and oversight by one of the guest editors. The review process for papers authored by guest editors, as well as the editor-in-chief, was handled by an unaffiliated associate editor of the journal. In addition to peer-reviewed manuscripts, two invited papers were solicited for the special issue to address the topics of privacy policy and technical data protection mechanisms.
Privacy, zones, and socio-technical tracks
Privacy is an overloaded and complex term.33 The concept of privacy often subsumes various constructs, such as anonymity (ie, the ability to hide one's identity), confidentiality (ie, the ability to share information with a second party without the information being publicly revealed), and solitude (ie, the right to be left alone). Even when the particular construct is unambiguous, it remains difficult to have discussions around the topic of privacy because it is highly contextual, such that the expectations of privacy are often specialized to the situation.34 For instance, a patient's expectation of privacy changes when disclosing information to a care provider versus a random person on the street. The expectation is further modified by the perceived sensitivity of the health information in question. And, the extent to which health information (eg, a positive assertion of an HIV diagnosis) is deemed to be sensitive varies from patient to patient.
This special issue is organized to trace the lifecycle of biomedical information, which we coarsely partition into the following three zones.
Collection zone
The first zone corresponds to the point at which health information is collected from patients. The collection may occur while an individual is physically located at a healthcare provider or beyond (eg, through a website on the internet or an application running on a mobile device). In this zone, privacy tends to be concerned with who can collect health information, how much information should be collected, at what time, and for what purposes. The specific notions of privacy addressed in this zone tend to be associated with anonymity (eg, Is the recipient of the data permitted to know the identity of the individual from whom the information is being collected?), limiting content (eg, What is the minimal amount of information that will satisfy the purpose?), and consent (eg, Did the patient agree to the terms of the data collection?).
Primary use zone
The second zone corresponds to the context in which the data have left the control of the patient and are housed in a system controlled by, or accessed by, those who provide a primary service (eg, provision of care, study of biomedical data explicitly solicited for a specific research project). In this zone, privacy tends to be realized through confidentiality (eg, Who is permitted to access or use the data and for what purposes?) and security (eg, How can we ensure that the data are protected from misuse or abuse while at rest in a database or in transit between authorized entities?).
Secondary use zone
The third zone corresponds to the scenario in which biomedical data are utilized for purposes which are different from their primary use. The data may be used by the organization that initially collected the data (eg, repurposing of clinical data for research) or disseminated to external entities (eg, publication of public use datasets) for the performance of certain tasks (eg, evaluation of health policies). In this zone, the privacy issues that tend to arise are anonymity and consent for individuals and groups (eg, Can data collected from a particular ethnic group be reused to study a specific phenotype?).
Each of these zones can be partitioned into two interacting, although conceptually distinct, tracks. In the first track, biomedical privacy is defined and regulated via socio-legal mechanisms. This is the arena where the public, ethicists, and policy and law makers come together to define what privacy rights and responsibilities exist. In the second track, technical controls are specified and realized in working information technologies to maintain societal expectations of privacy or requirements specified in policy and law. It is critical to integrate these tracks to ensure that privacy expectations are appropriately represented in technical controls and that policies are designed to realistically account for state-of-the-art technical capabilities. It is further important that both societal expectations of privacy and privacy enhancing technologies remain current and cognizant of shifts in technical sophistication.
A characterization of the papers in the privacy lifecycle
Papers in the socio-legal track
The socio-legal track of this special issue commences by delving into the desires and expectations society harbors for privacy. As mentioned earlier, privacy is a societal phenomenon, such that the extent to which it is realized depends on how society chooses to codify the concept in policy and law. This process often begins with field studies and sessions that engage stakeholders to elicit their preferences.35 In this vein, Caine and Hanania report on a study that asks patients who should control access to health information and what granularity of control is desirable.36 Often, the expectations of privacy are dependent on the domain in which information is collected. As the traditional boundaries of the healthcare domain expand, it is important to determine how individuals’ perspectives on privacy relate to new technologies. To begin to address this issue, van der Velden and El Emam focus on the use of social media by teenage patients, and how they perceive their health information privacy when interacting online.37 Insights gained here should inform the more general health data context.
The next set of papers in the socio-legal track move beyond primary uses for biomedical data and into secondary settings. In this environment, it is assumed that policies and laws have been codified. However, policy and law are dependent on the locale, such that it is critical to understand how they guide data management practices. The first paper in this group, by Pencarrick Hertzman, Meagher, and McGrail, presents a case study about how the ‘Privacy by Design’ framework was applied in British Columbia (BC), Canada, to facilitate access to health information in Population Data BC.38 To date, Population Data BC has facilitated over 350 research studies. This work is followed by the first invited paper, by McGraw, which examines the de-identification strategy of the US Health Insurance Portability and Accountability Act (HIPAA).39 This strategy enables HCOs to disclose information about patients in a manner that is no longer subject to oversight by the regulatory authorities because the risk that the patients would be individually identifiable is deemed very small. Clarifications to what de-identification means and how it can be achieved in accordance with HIPAA were recently published by the US federal government.40 The paper by McGraw reports on a workshop, held by the Center for Democracy and Technology, on various stakeholders’ support for the current HIPAA de-identification strategy, and discusses policy proposals to address concerns and improve trust in the process. Petersen and colleagues then recount a recent case before the Supreme Court, Sorrell v. IMS Health, and illustrate the challenges associated with selling prescription records for various purposes, such as post-market effectiveness studies.41 They highlight some of the concerns associated with the dissemination of identifiable prescriber and de-identified patient information.
While the previous papers focus on traditional health information, the last paper in this track, by Kosseim and colleagues, addresses privacy issues associated with the management of *omics data in particular, with a specific focus on genomics.42 This paper describes the legal and ethical principles and practices adopted in the Canadian province of Newfoundland and Labrador to enable research with genomic, phenomic, and genealogical data.
Papers in the technical track
While law and policy codify the rights and requirements for managing data privacy in the biomedical domain, information technology is necessary to uphold and ensure their realization in practice. In this regard, certain aspects of privacy can be achieved through information security. The Security Rule of HIPAA specifies various administrative, physical, and technical safeguards that covered entities must have in place (ie, required controls) or document why such protections are not prudent (ie, addressable controls). For instance, all covered entities are required to ensure that appropriate authorization is provided before employees of an HCO access a patient's EMR. By contrast, the encryption of health information at rest within the HCO is addressable, but not required. Along these lines, the technical track of this special issue begins with a paper by Kwon and Johnson that assesses the extent to which 250 HCOs in the USA have (or have not) adopted various security practices.43 Their analysis demonstrates patterns of leaders, followers, and laggards in adoption, and provides recommendations for improving regulatory compliance. Although this work provides a high-level assessment of the adoption of security practices, it does not provide specific guidance on data management strategies. Thus, the next paper, by Fabbri and LeFevre, discusses a privacy threat encountered on a daily basis in primary care settings, specifically the insider threat.44 This threat is particularly important to study in the healthcare domain because traditional information security controls (eg, role-based access control) are difficult to realize in care settings due to the highly dynamic nature of healthcare teams. An increasing number of publications have proposed auditing strategies for EMRs;45 46 however, this line of work is unique in that it suggests EMR users can be ‘explained’ by the diagnoses that are assigned to the patient records they access.
This work suggests that data-driven auditing strategies may help winnow the set of accesses to patient records to a manageable size for review by administrative officials (eg, privacy officers) of HCOs.
The next set of papers in the technical section focus on various strategies that can be invoked to protect patients’ privacy when data are shared for secondary use. However, before presenting specific protection methodologies, this section begins with an illustration of types of research studies that can be enabled through de-identified data. The paper by White and Horvitz integrates web search data from Bing and geocoded data from mobile devices to show how search for health information online correlates with an individual's physical presence at a healthcare providing facility.47 This research is performed on data that are stripped of user identifiers and location information prior to the analysis. However, there may be times when it is beneficial to link a patient's record across multiple healthcare institutions, or within a single institution. To support such efforts without revealing a patient's identity, there has been a flurry of research in private record linkage48–51 (or entity resolution). Such linkage is increasingly based on hashed versions of patient identifiers (eg, personal names) or quasi-identifiers (eg, demographics). The paper from Cassa, Miller, and Mandl suggests a protocol to derive a secure fingerprint from genomic data.52 They indicate how this approach may be applied to track a patient's record across the research enterprise in place of explicitly identifying information.
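To make the hashed-identifier idea concrete, the sketch below derives a linkage token from quasi-identifiers with a keyed hash. The field choices, normalization, and salt are illustrative assumptions, not the protocol of any cited paper (Cassa and colleagues, for instance, fingerprint genomic data rather than demographics):

```python
import hashlib
import hmac

# Shared secret key, distributed only to the institutions doing the linkage.
# Keyed hashing (HMAC) prevents an outsider without the key from mounting a
# dictionary attack over common names and birth dates.
SECRET_SALT = b"example-shared-secret"  # illustrative value only

def linkage_token(name: str, dob: str, zip_code: str) -> str:
    """Derive a one-way linkage token from normalized quasi-identifiers."""
    normalized = f"{name.strip().lower()}|{dob}|{zip_code}"
    return hmac.new(SECRET_SALT, normalized.encode("utf-8"),
                    hashlib.sha256).hexdigest()

# Two institutions computing the token independently obtain the same value
# for the same patient, enabling linkage without exchanging names.
token_a = linkage_token("Jane Doe", "1970-01-01", "37203")
token_b = linkage_token(" JANE DOE ", "1970-01-01", "37203")
assert token_a == token_b
```

Exact-match hashing of this kind fails on typographical variation, which is one motivation for approximate encodings such as Bloom filters50 in private record linkage.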
The final set of papers focus on strategies for de-identifying various types of health information. Biomedical data can take a wide array of forms, ranging from free text (eg, natural language clinical notes) to structured information (eg, discharge databases) to high-dimensional information (eg, genome-wide scans of single nucleotide polymorphisms). The majority of health information is in free text form, and so a significant amount of research53 54 over the past several years has investigated how to detect and redact a prespecified set of potential identifiers (such as the list of 18 features in the HIPAA Safe Harbor de-identification standard). The paper by Ferrández and colleagues provides an example of how rule-based methods (eg, dictionaries, regular expressions, and rules) and machine learning-based methods (eg, conditional random fields, support vector machines, and naive Bayes classifiers) can be combined to construct a free text scrubber for over 100 different types of Veterans Health Administration clinical notes.55 The paper by Deleger and colleagues illustrates how machine learning-based text de-identification methodology has negligible impact on clinical concept extraction, in the form of medications, from over 22 note types from the Cincinnati Children's Hospital Medical Center.56
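To give a flavor of the rule-based component of such scrubbers, the minimal sketch below redacts two Safe Harbor feature types with regular expressions. The patterns are deliberately simplified assumptions; production systems like those described combine many more rules with machine-learned classifiers:

```python
import re

# Simplified patterns for two HIPAA Safe Harbor identifier types.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def scrub(note: str) -> str:
    """Replace each matched identifier with a category placeholder."""
    for label, pattern in PATTERNS.items():
        note = pattern.sub(f"[{label}]", note)
    return note

print(scrub("Seen on 03/14/2012; call 615-555-0142 to follow up."))
# → Seen on [DATE]; call [PHONE] to follow up.
```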
Although the previous papers, and others in the literature, illustrate how identifiers in clinical text can be detected, residual information in the text may still leak inferences or indicators of the corresponding patient (or their relatives). As such, informaticians have worked to develop de-identification strategies that are more formal in their guarantees. These strategies are often applied to simpler data structures, such as field-structured database tuples. The paper by Atreya and colleagues illustrates how a patient's panel of laboratory test results can be unique and potentially used as a key to track a patient back to their identity.57 To mitigate this attack, they propose a clinically informed perturbation strategy, which adds noise to the test values. An empirical analysis of 61 000 Vanderbilt patients’ records illustrates that such perturbation makes it highly unlikely that a patient's record could be matched to a group of fewer than 10 individuals, while having minimal influence on clinical interpretation. Although offering some probabilistic protection, this style of noise addition does not guarantee defense against an adversary. In this regard, a significant number of publications have suggested aggregation (ie, generalization) strategies could be applied to ensure that every record corresponds to at least k patients (ie, the k-anonymity principle).58 59
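The check underlying these aggregation strategies can be sketched as follows, over an invented toy dataset in which age has already been generalized into ranges and ZIP codes truncated; if the check fails, further generalization is applied:

```python
from collections import Counter

def satisfies_k_anonymity(records, quasi_identifiers, k):
    """True if every quasi-identifier value combination occurs in >= k records."""
    combos = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(combos.values()) >= k

# Toy records (invented for illustration) with generalized quasi-identifiers.
records = [
    {"age": "30-39", "zip": "372**", "dx": "asthma"},
    {"age": "30-39", "zip": "372**", "dx": "diabetes"},
    {"age": "40-49", "zip": "372**", "dx": "asthma"},
    {"age": "40-49", "zip": "372**", "dx": "flu"},
]

print(satisfies_k_anonymity(records, ["age", "zip"], k=2))  # → True
print(satisfies_k_anonymity(records, ["age", "zip"], k=3))  # → False
```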
More recently, it has been suggested that de-identification strategies based on the redaction of a prespecified list of features or more formal aggregation strategies may not be an appropriate model of privacy protection because they can leak inferences about the patients from whom the data were collected.60 While this general notion has been challenged,61 alternative models have been proposed. In the second invited paper, Dwork and Pottenger describe the notion of differential privacy from a theoretical perspective.62 In this model of protection, researchers are permitted to ask queries of a database, which subsequently responds with a perturbed aggregate response (eg, a count of 5 may be reported as 6). This response is perturbed such that it is guaranteed that the researcher cannot determine whether a specific individual contributed to the database within a certain probability and that the perturbed answer is within a certain bound of the non-perturbed answer. This model has a number of important strengths, but also faces a number of empirical and practical barriers to its deployment in healthcare settings.63 The final paper of the special issue, by Gardner and colleagues, provides a more practical perspective on how differential privacy could be applied to databases of health information.64 They demonstrate how this privacy protection model could be applied to a breast cancer dataset from Emory University, but note there are challenges to applying this approach to high-dimensional data.
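The perturbed count queries described by Dwork and Pottenger are commonly realized with the Laplace mechanism. The sketch below is a standard textbook instance, not code from the cited papers, assuming a counting query (sensitivity 1, since adding or removing one individual changes the count by at most 1), with epsilon tuning the privacy/accuracy trade-off:

```python
import math
import random

def dp_count(true_count: int, epsilon: float) -> float:
    """Return a differentially private count via the Laplace mechanism.

    Noise is drawn from Laplace(0, 1/epsilon) by inverse-CDF sampling:
    smaller epsilon means stronger privacy and noisier answers.
    """
    u = random.random() - 0.5
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return true_count + noise

random.seed(0)
# A true count of 5 is reported with small random noise added,
# eg, it may come back as roughly 6.
print(round(dp_count(5, epsilon=1.0)))
```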
Next steps and the future
As this special issue illustrates, the space of data privacy in the biomedical domain is broad and multi-disciplinary. It crosses ethical, legal, and technical boundaries and is specialized to the type of data and process being supported. Consequently, it is not possible to review the entire field in this issue. As we draw this editorial to a close, we stress that numerous topics (eg, access control,65 consent management,66 statistical disclosure control,67 68 and policy specification to manage health information flow69) were not addressed, but are no less important than those reported on in this issue. At the same time, we note that new computing infrastructures and high-throughput technologies are creating new challenges to privacy that the biomedical community will need to handle in the not too distant future. One technology that we wish to highlight is cloud computing. As cloud computing costs decline and the amount of data generated by healthcare providers grows, it is increasingly the case that health information is being stored in systems beyond the direct control and oversight of HCOs, and possibly in foreign jurisdictions with different privacy laws and regulations.70 71 We believe this issue demonstrates that appropriate socio-technical protections can be defined for emerging systems, and that a quite diverse community is working on developing them. We are confident that research in this area will lead to new solutions that appropriately balance privacy with data utility and system usability.
Footnotes
Competing interests: None.
Provenance and peer review: Commissioned; internally peer reviewed.
References
- 1. Boyer B. Computerized medical records and the right to privacy: the emerging federal response. Buffalo Law Rev 1975;25:37–118
- 2. Vuori H. Privacy, confidentiality, and automated health information systems. J Med Ethics 1977;3:174–8
- 3. Gostin L. Health information privacy. Cornell Law Rev 1995;80:451–528
- 4. Rindfleisch T. Privacy, information technology, and health care. Communications of the ACM 1997;40:93–100
- 5. Committee on Maintaining Privacy and Security in Health Care Applications of the National Information Infrastructure, Commission on Physical Sciences, Mathematics, and Applications, National Research Council. For the record: protecting electronic health information. Washington, DC: National Academy Press, 1997
- 6. Estrin D, Sim I. Open mHealth architecture: an engine for health care innovation. Science 2010;330:759–60
- 7. Chan M, Estève D, Escriba C, et al. A review of smart homes—present state and future challenges. Comput Methods Programs Biomed 2008;91:55–81
- 8. Chen M, Gonzalez S, Vasilakos A, et al. Body area networks: a survey. Mobile Netw Appl 2011;16:171–93
- 9. Li M, Lou W, Ren K. Data security and privacy in wireless body area networks. IEEE Wireless Commun 2010;17:51–8
- 10. Powell HC, Barth AT, Ringgenberg K, et al. Body area sensor networks: challenges and opportunities. IEEE Comput 2009;42:58–65
- 11. Green E, Guyer M. Charting a course for genomic medicine from base pairs to bedside. Nature 2011;470:204–13
- 12. Mischak H, Allmaier G, Apweiler R, et al. Recommendations for biomarker identification and qualification in clinical proteomics. Sci Transl Med 2010;2:46ps42
- 13. Chen R, Mias GI, Li-Pook-Than J, et al. Personal omics profiling reveals dynamic molecular and medical phenotypes. Cell 2012;148:1293–307
- 14. Stead WW, Lin H, eds. Committee on Engaging the Computer Science Research Community in Health Care Informatics, Computer Science and Telecommunications Board, Division on Engineering and Physical Sciences, National Research Council. Computational technologies for effective health care: immediate steps and strategic directions. Washington, DC: National Academies Press, 2009
- 15. El Emam K. A guide to the de-identification of health information. New York, NY: CRC Press, 2013
- 16. California Health Care Foundation. Medical privacy and confidentiality survey. California Health Care Foundation, 1999. http://www.chcf.org/publications/1999/01/medical-privacy-and-confidentiality-survey (accessed 14 Nov 2012)
- 17. Harris Interactive. Many US adults are satisfied with use of their personal health information. 2007. http://www.harrisinteractive.com/harris_poll/index.asp?PID=743 (accessed 14 Nov 2012)
- 18. Lee J, Buckley C. For privacy's sake, taking risks to end pregnancy. New York Times 2009, Jan 4
- 19. Tanielian T, Jaycox L. Invisible wounds of war: psychological and cognitive injuries, their consequences, and services to assist recovery. RAND Monograph MG-720, 2008
- 20. Britto MT, Tivorsak TL, Slap GB. Adolescents’ needs for health care privacy. Pediatrics 2010;126:e1469–76
- 21. Cheng T, Savageau J, Sattler J, et al. Confidentiality in health care: a survey of knowledge, perceptions, and attitudes among high school students. JAMA 1993;269:1404–8
- 22. Ginsburg KR, Menapace AS, Slap GB. Factors affecting the decision to seek health care: the voice of adolescents. Pediatrics 1997;100:922–30
- 23. Lothen-Kline C, Howard DE, Hamburger EK, et al. Truth and consequences: ethics, confidentiality, and disclosure in adolescent longitudinal prevention research. J Adolesc Health 2003;33:385–94
- 24. Reddy DM, Fleming R, Swain C. Effect of mandatory parental notification on adolescent girls’ use of sexual health care services. JAMA 2002;288:710–4
- 25. Sankar P, Moran S, Merz J, et al. Patient perspectives on medical confidentiality: a review of the literature. J Gen Intern Med 2003;18:659–69
- 26. Thrall JS, McCloskey L, Ettner SL, et al. Confidentiality and adolescents’ use of providers for health information and for pelvic examinations. Arch Pediatr Adolesc Med 2000;154:885–92
- 27. Bell RA, Franks P, Duberstein PR, et al. Suffering in silence: reasons for not disclosing depression in primary care. Ann Fam Med 2011;9:439–46
- 28. Tene O. Privacy: the new generations. Int Data Privacy Law 2010;1:15–27
- 29. Westin A. Privacy and freedom. New York: Atheneum Press, 1967
- 30. Burgoon JK. Privacy and communication. In: Burgoon M, ed. Communication yearbook 6. Beverly Hills: Sage Publications Ltd, 1982:206–49
- 31. Parrott R, Burgoon JK, Burgoon M, et al. Privacy between physicians and patients: more than a matter of confidentiality. Soc Sci Med 1989;29:1381–5
- 32. Ong LML, de Haes JC, Hoos AM, et al. Doctor-patient communication: a review of the literature. Soc Sci Med 1995;40:903–18
- 33. Solove D. A taxonomy of privacy. Univ Pennsylvania Law Rev 2006;3:477–560
- 34. Schoeman F. Privacy: philosophical dimensions. Am Philos Q 1984;21:199–213
- 35. Slobogin C, Schumacher J. Reasonable expectations of privacy and autonomy in fourth amendment cases: an empirical look at “understandings recognized and permitted by society”. Duke Law J 1993;42:727–75
- 36. Caine K, Hanania R. Patients want granular privacy control over health information in electronic medical records. J Am Med Inform Assoc 2013;20:7–15
- 37. van der Velden M, El Emam K. “Not all my friends need to know”: a qualitative study of teenage patients, privacy, and social media. J Am Med Inform Assoc 2013;20:16–24
- 38. Pencarrick Hertzman C, Meagher N, McGrail K. Privacy by Design at Population Data BC: a case study describing the technical, administrative, and physical controls for privacy-sensitive secondary use of personal information for research in the public interest. J Am Med Inform Assoc 2013;20:25–8
- 39. McGraw D. Building public trust in uses of Health Insurance Portability and Accountability Act de-identified data. J Am Med Inform Assoc 2013;20:29–34
- 40. Office for Civil Rights. Guidance regarding methods for de-identification of protected health information in accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule. Washington, DC: U.S. Department of Health and Human Services, September 4, 2012
- 41. Petersen C, DeMuro P, Goodman K, et al. Sorrell v. IMS Health: issues and opportunities for informaticians. J Am Med Inform Assoc 2013;20:35–7
- 42. Kosseim P, Pullman D, Perrot-Daley A, et al. Privacy protection and public goods: building a genetic database for health research in Newfoundland and Labrador. J Am Med Inform Assoc 2013;20:38–43
- 43. Kwon J, Johnson ME. Security practices and regulatory compliance in the healthcare industry. J Am Med Inform Assoc 2013;20:44–9
- 44. Fabbri D, LeFevre K. Explaining accesses to electronic medical records using diagnosis information. J Am Med Inform Assoc 2013;20:52–60
- 45. Boxwala AA, Kim J, Grillo JM, et al. Using statistical and machine learning to help institutions detect suspicious access to electronic health records. J Am Med Inform Assoc 2011;18:498–505
- 46. Chen Y, Nyemba S, Malin B. Auditing medical record accesses via healthcare interaction networks. AMIA Annu Symp Proc 2012:93–102
- 47. White R, Horvitz E. From web search to healthcare utilization: privacy-sensitive studies from mobile data. J Am Med Inform Assoc 2013;20:61–8
- 48. Churches T, Christen P. Some methods for blindfolded record linkage. BMC Med Inform Decis Mak 2004;4:9
- 49. Durham E, Xue Y, Kantarcioglu M, et al. AMIA Annu Symp Proc 2010:182–6
- 50. Schnell R, Bachteler T, Reiher J. Privacy-preserving record linkage using Bloom filters. BMC Med Inform Decis Mak 2009;9:41
- 51. Weber SC, Lowe H, Das A, et al. A simple heuristic for blindfolded record linkage. J Am Med Inform Assoc 2012;19:e157–61
- 52. Cassa C, Miller R, Mandl K. A novel, privacy-preserving cryptographic approach for sharing sequencing data. J Am Med Inform Assoc 2013;20:69–76
- 53. Meystre SM, Friedlin FJ, South BR, et al. Automatic de-identification of textual documents in the electronic health record: a review of recent research. BMC Med Res Methodol 2010;10:70
- 54. Uzuner O, Luo Y, Szolovits P. Evaluating the state-of-the-art in automatic de-identification. J Am Med Inform Assoc 2007;14:550–63
- 55. Ferrández O, South B, Shen S, et al. BoB, a best-of-breed automated text de-identification system for VHA clinical documents. J Am Med Inform Assoc 2013;20:77–83
- 56. Deleger L, Molnar K, Savova G, et al. Large-scale evaluation of automated clinical note de-identification and its impact on information extraction. J Am Med Inform Assoc 2013;20:84–94
- 57. Atreya R, Smith JC, McCoy A, et al. Reducing patient re-identification risk for laboratory results within research datasets. J Am Med Inform Assoc 2013;20:95–101
- 58. El Emam K, Dankar F. Protecting privacy using k-anonymity. J Am Med Inform Assoc 2008;15:627–37
- 59. Sweeney L. k-anonymity: a model for protecting privacy. Int J Uncertainty Fuzziness Knowledge Based Syst 2002;10:557–70
- 60. Narayanan A, Shmatikov V. Myths and fallacies of “personally identifiable information”. Commun ACM 2010;53:24–6
- 61. El Emam K, Jonker E, Arbuckle L, et al. A systematic review of re-identification attacks on health data. PLoS One 2011;6:e28071
- 62. Dwork C, Pottenger R. Toward practicing privacy. J Am Med Inform Assoc 2013;20:102–7
- 63. Dankar F, El Emam K. The application of differential privacy to health data. Proceedings of the 5th International Workshop on Privacy and Anonymity in the Information Society 2012:158–66
- 64. Gardner J, Xiong L, Xiao Y, et al. SHARE: system design and case studies for statistical health information release. J Am Med Inform Assoc 2013;20:109–16
- 65. Blobel B. Authorisation and access control for electronic health systems. Int J Med Inform 2004;73:251–7
- 66. O'Keefe C, Greenfield P, Goodchild A. A decentralised approach to electronic consent and health information access control. J Res Pract Info Technol 2005;37:161–78
- 67. Duncan GT, Elliot M, Salazar GJJ. Statistical confidentiality: principles and practice. New York: Springer, 2011
- 68. Hundepool A, Domingo-Ferrer J, Franconi L, et al. Statistical disclosure control. West Sussex, United Kingdom: Wiley, 2012
- 69. Peleg M, Beimel D, Dori D, et al. Situation-Based Access Control: privacy management via modeling of patient data access scenarios. J Biomed Inform 2008;41:1028–40
- 70. Kuo A. Opportunities and challenges of cloud computing to improve health care services. J Med Internet Res 2011;13:e67
- 71. Schweitzer EJ. Reconciliation of the cloud computing model with US federal electronic health record regulations. J Am Med Inform Assoc 2012;19:161–5
