Abstract
Objective
Clinical corpora can be deidentified using a combination of machine-learned automated taggers and hiding in plain sight (HIPS) resynthesis. The latter replaces detected personally identifiable information (PII) with random surrogates, allowing leaked PII to blend in or “hide in plain sight.” We evaluated the extent to which a malicious attacker could expose leaked PII in such a corpus.
Materials and Methods
We modeled a scenario where an institution (the defender) externally shared an 800-note corpus of actual outpatient clinical encounter notes from a large, integrated health care delivery system in Washington State. These notes were deidentified by a machine-learned PII tagger and HIPS resynthesis. A malicious attacker obtained this corpus and performed a parrot attack intended to expose leaked PII. Specifically, the attacker mimicked the defender’s process by manually annotating all PII-like content in half of the released corpus, training a PII tagger on these data, and using the trained model to tag the remaining encounter notes. The attacker hypothesized that untagged identifiers would be leaked PII, discoverable by manual review. We evaluated the attacker’s success using measures of leak-detection rate and accuracy.
Results
The attacker correctly hypothesized that 211 (68%) of 310 actual PII leaks in the corpus were leaks, and wrongly hypothesized that 191 resynthesized PII instances were also leaks. One-third of actual leaks remained undetected.
Discussion and Conclusion
A malicious parrot attack on clinical text deidentified by a machine-learned PII tagger and HIPS resynthesis can attenuate, but not eliminate, the protective effect of HIPS deidentification.
Keywords: deidentification; patient privacy; machine learning; natural language processing; patient data privacy
INTRODUCTION
Automated deidentification methods enable health care institutions to support secondary use of information-rich clinical text in a manner compliant with patient privacy regulations.1 The highest-performing deidentification methods incorporate machine learning technologies,2 but even the best systems overlook some identifiers, a problem known as the “residual PII problem.”3 A hiding in plain sight (HIPS) strategy4 has been proposed to address the residual PII problem by replacing the vast majority of personally identifiable information (PII) in a corpus with realistic but fictitious surrogates. This process, referred to as HIPS resynthesis, replaces actual PII in statements such as “post op 1/14/2013 Jane J Doe is an 84-year-old female” with resynthesized surrogate PII: “post op Oct 22, 2013 Mary K Jones is an 82-year-old female.” State-of-the-art deidentification systems can locate 94%–99% of identifiers for such replacement, leaving 1%–6% of PII in its original form (referred to as “leaks”). To a reader who accepts the HIPS premise that resynthesized PII and leaked PII are indistinguishable, and who does not critically scrutinize PII-like content in search of actual leaks, the effectiveness of HIPS deidentification can reasonably be considered to approach 100%. Even when human readers maliciously use analysis and logic to discover leaked PII in a HIPS deidentified corpus, they can do so only with limited success; published research suggests that under a hostile “human reader attack” the HIPS approach may still conceal most PII leaks.3,5
However, beyond such human reader attacks, hostile machine-assisted attacks should be recognized as a potential vulnerability before releasing HIPS deidentified corpora.6 Machine-assisted attacks employ machine-learned software algorithms to assist human readers in exposing leaked PII in a HIPS deidentified corpus. This study assesses the effectiveness of a machine-assisted parrot attack.7 Specifically, we experimentally investigate a scenario in which an attacker exposes leaked PII in a HIPS-deidentified corpus using machine learning methods similar to (ie, that parrot) the methods the defender used to conceal that PII. Although a successful parrot attack does not require the attacker to use methods identical to those the defender used to deidentify the corpus, the attacker’s success is likely to be highly correlated with the degree to which the attacker’s methods mimic the defender’s (as we explain further below). We hypothesized that such an attack could reverse some of the protective effect of HIPS.
Background
Clinical text deidentification is the process of removing a predefined set of identifiers from a corpus, such as the 18 enumerated types of identifiers in the Safe Harbor deidentification model of the Privacy Rule of the US Health Insurance Portability and Accountability Act of 1996 (HIPAA).1 Machine learning methods, often in combination with rule-based methods,8 underlie the highest-performing text deidentification systems.2,9–12 But numerous studies have shown that it is virtually impossible for any scalable machine learning-based method to remove more than about 94%–99% of PII in a corpus of substantial size.2,9,13–24 We believe scalable systems are unlikely to achieve 100% PII removal without severely sacrificing a corpus’ utility through over-redaction.
As noted above, the HIPS approach aims to enhance deidentification by concealing both known PII (ie, PII detected by a deidentification system) and unknown PII (ie, PII the system overlooks). HIPS is designed to overcome a limitation of traditional deidentification, whereby known instances of PII are replaced with symbols denoting the type of content removed, such as “<PT_NAME>” (Figure 1, column B). Because of this, all leaked PII in a traditionally redacted corpus can be readily detected by a careful human reader and thereby exposed. In contrast, the HIPS approach replaces known PII with randomly generated fictitious but realistic surrogates,11,25 preserving as much as possible the original information content by, for example, replacing all references in a corpus to a given patient with the same surrogate patient name. This achieves 2 objectives. First, and like traditional redaction, replacement of known PII prevents exposure of the original identifiers. Because HIPS surrogates are randomly generated they cannot be leveraged to reverse engineer the original PII they replace. Second, and unlike traditional redaction, HIPS aims to conceal known PII in a manner that makes it indistinguishable from leaked PII—at least to a reader without malicious intent.26 When PII-like content in a corpus is predominantly fictitious, leaked PII blends in with, and is thereby concealed by, the fictitious PII—to a nonmalicious reader.25 This is illustrated in Figure 1. Beginning with raw original text containing PII (column A), an automated deidentification software’s PII tagger identifies most PII and replaces this “known” PII with randomly generated surrogates (column C), allowing leaked patient names and ages to hide in plain sight (column D).
Figure 1.
Illustrations of actual clinical text in various stages of deidentification: (A) Original text, (B) traditionally redacted text, (C) defender’s resynthesized text, (D) released text, (E) attacker-tagged text, and (F) attacker leak-detection performance. Performance derived by comparing hypothesized leaks in column E to actual leaks in column C. The text in this figure resembles real clinical text selected from the study corpus, but all of the original personally identifiable information (PII) and some other information has been fictionalized to preserve patient privacy. The tagged and leaked PII instances are realistic.
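To make the resynthesis mechanics concrete, the following minimal sketch (our illustration, not the defender’s actual engine) shows span replacement with consistent surrogates; the surrogate pools and span-tuple format are assumptions of the sketch.

```python
import random

# Minimal sketch of HIPS resynthesis: each tagger-detected PII span is
# replaced with a randomly chosen surrogate of the same type, and repeated
# mentions of the same original value reuse the same surrogate so that
# references stay consistent. Surrogate pools are invented for illustration.
SURROGATE_POOLS = {
    "PT_NAME": ["Mary K Jones", "Robert T Smith", "Ana L Garcia"],
    "DATE": ["Oct 22, 2013", "Mar 3, 2012", "Jul 9, 2011"],
}

def resynthesize(text, tagged_spans, seed=0):
    """tagged_spans: (start, end, pii_type) tuples from the PII tagger."""
    rng = random.Random(seed)
    surrogate_for = {}  # original value -> surrogate, reused across mentions
    pieces, cursor = [], 0
    for start, end, pii_type in sorted(tagged_spans):
        original = text[start:end]
        if original not in surrogate_for:
            surrogate_for[original] = rng.choice(SURROGATE_POOLS[pii_type])
        pieces.append(text[cursor:start])
        pieces.append(surrogate_for[original])
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)

note = "post op 1/14/2013 Jane J Doe is an 84-year-old female"
print(resynthesize(note, [(8, 17, "DATE"), (18, 28, "PT_NAME")]))
```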
An earlier study reported that the HIPS approach allows leaked PII to escape detection by human readers attempting to single out actual PII leaks in a HIPS deidentified corpus.3 To our knowledge, no prior research has quantified the success rate of an attack using machine learning methods to expose leaked PII in a HIPS deidentified corpus. This study implements and evaluates such an attack.
Evaluation of an attack scenario must consider which actual PII is at risk of exposure and which actual PII is not at risk. All PII in an original raw note is at risk of exposure (Figure 1, column A). Both traditional redaction and the HIPS approach eliminate exposure risk for known PII instances (Figure 1 columns B and C) and no attack can undo these redactions. An attack on a HIPS deidentified corpus would be successful if it exposed—with certainty or high probability—any of the leaked PII that HIPS intended to conceal. An attacker may accomplish this by developing an algorithm for determining which content is surrogate PII (Figure 1, column D) and using this information to process the entire corpus (Figure 1, column E). This would allow the attacker to discover, by manual review, which PII is leaked, under the hypothesis that PII-like content not flagged by this algorithm are actual leaks (Figure 1, column F, yellow highlighting).
A parrot attack requires considerable effort and may be hampered by errors in the attacker’s algorithm. The attacker must first annotate a significant portion of the corpus to use as training data and then manually review the processed corpus for untagged PII to discover candidate PII leaks. Imperfections in the attacker's algorithm may lead the attacker to wrongly conclude that actual PII leaks are resynthesized PII or vice versa (green- and blue-highlighted content, respectively, in Figure 1, column F).
It is important to emphasize that the only PII at risk of exposure under this attack scenario is leaked PII that remains in a HIPS deidentified corpus—the leaked PII presumed to be concealed by HIPS. With traditional redaction, all leaked PII is entirely exposed. This investigation is thus about the extent to which the added protection of HIPS deidentification may be reversible, in the worst case rendering HIPS perhaps no better than traditional redaction. Considering worst-case scenarios is a standard component of risk assessment.
MATERIALS AND METHODS
We simulated a defender/attacker scenario in which a defender released a clinical corpus deidentified by a machine-learned PII tagger and HIPS resynthesis (the “release corpus”). After obtaining this corpus, an attacker conducted a machine-assisted parrot attack to reveal leaked PII in the release corpus. To investigate the upper limits of such an attack, we examined a scenario in which sources of minor perturbation that might reduce an attack’s success—such as minor differences in annotation schema, manual annotation errors, or differences in machine-learned algorithms used to train PII taggers—were eliminated. We did this by equipping our simulated attacker with an approach that closely or identically mimicked the simulated defender’s. To evaluate the attack’s impact, we measured the attacker’s ability to correctly identify actual PII leaks in the release corpus.
Study corpus
The study corpus consisted of 1200 actual clinical encounter notes, randomly selected from over 870 000 notes from 16 outpatient medical specialties, for patients aged 5–89, at Kaiser Permanente Washington (KPWA, formerly Group Health Cooperative) between November 2010 and October 2011. The corpus included 70 notes from each specialty. As is standard practice at KPWA, we sought to improve the PII tagger’s performance for identifying patients’ family member names by enriching the corpus with notes from 80 patients’ first oncology encounter following a new cancer diagnosis, as these notes tend to be rich in family member names (Table 1). Electronic encounter notes were obtained from KPWA’s Epic Clarity database. The types and frequencies of PII instances in this corpus are similar to those reported in Table 2.
Table 1.
Sources and counts of 1200 real clinical encounter notes in the study corpus, randomly sampled from 16 medical specialty encounters during November 2010 through October 2011 with patients aged 5–89, and their random allocation to the defender’s model training corpus (n = 400) and the release corpus (n = 800)
| Medical specialty of clinical encounter notes | Count of available encounter notesa | Count of notes randomly sampled for study corpus | Notes allocated to defender model training corpus | Notes allocated to release corpus |
|---|---|---|---|---|
| Cardiology | 73 308 | 70 | 20 | 50 |
| Dermatology | 68 352 | 70 | 20 | 50 |
| Endocrinology | 8616 | 70 | 20 | 50 |
| Gastroenterology | 44 496 | 70 | 20 | 50 |
| Infectious Diseases | 3588 | 70 | 20 | 50 |
| Internal Medicine | 69 420 | 70 | 20 | 50 |
| Nephrology | 13 524 | 70 | 20 | 50 |
| Neurology | 25 104 | 70 | 20 | 50 |
| Obstetrics & Gynecology | 57 360 | 70 | 20 | 50 |
| Otolaryngology | 37 248 | 70 | 20 | 50 |
| Oncology & Hematology | 37 620 | 70 | 20 | 50 |
| Breast ca. Onc./Hem.b | 146 | 80 | 80 | 0 |
| Ophthalmology | 75 744 | 70 | 20 | 50 |
| Orthopedics | 151 596 | 70 | 20 | 50 |
| Pediatrics | 152 508 | 70 | 20 | 50 |
| Pulmonology | 24 360 | 70 | 20 | 50 |
| Urology | 32 148 | 70 | 20 | 50 |
| Total | 874 992 | 1200 | 400 | 800 |
Notes:
a Count is based on the actual count of notes from January 2011, multiplied by 12 to yield a 1-year estimate.
b These encounter notes are from Oncology & Hematology encounters immediately following new, pathologically confirmed diagnoses of primary breast cancer, which occurred 146 times during this period. We included a sample of these notes in the defender’s model training corpus (only) because they tended to have a higher frequency of patient and family member names compared to typical progress notes, thereby enriching the tagger model’s training data.
Table 2.
PII instances in the attacked corpus, attacker's PII tagger results, attacker's success in exposing leaked PII, PII leak-detection rate expected by random chance, and attack evaluation metrics by PII type
| PII type | Surrogate PII in attacked corpus | Leaked PII in attacked corpus | Total PII in attacked corpus | Tagged by attacker (predicted surrogate) | Not tagged (predicted leaked) | Correctly predicted leaked | Correctly predicted surrogate | Wrongly predicted leaked | Wrongly predicted surrogate | Leak-detection rate expected by random chance | PII leak-detection rate (Eq. 1) | PII leak prediction accuracy (Eq. 2) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All PII types | 3695 | 310 | 4005 | 3603 | 402 | 211 | 3504 | 191 | 99 | 8% | 68%a | 52% |
| Age | 262 | 18 | 280 | 247 | 33 | 12 | 241 | 21 | 6 | 6% | 67%a | 36% |
| Date | 1855 | 24 | 1879 | 1816 | 63 | 18 | 1810 | 45 | 6 | 1% | 75%a | 29% |
| Doctor inits | 167 | 18 | 185 | 148 | 37 | 10 | 140 | 27 | 8 | 10% | 56%b | 27% |
| Doctor name | 645 | 83 | 728 | 653 | 75 | 53 | 623 | 22 | 30 | 11% | 64%a | 71% |
| Geog. loc. | 30 | 29 | 59 | 18 | 41 | 23 | 12 | 18 | 6 | 49% | 79%c | 56% |
| MRN | 42 | 0 | 42 | 42 | 0 | 0 | 42 | 0 | 0 | 0% | −d | − |
| Org. name | 205 | 57 | 262 | 183 | 79 | 42 | 168 | 37 | 15 | 22% | 74%a | 53% |
| Other ID | 120 | 8 | 128 | 118 | 10 | 6 | 116 | 4 | 2 | 6% | 75%a | 60% |
| Phone/fax | 15 | 5 | 20 | 15 | 5 | 5 | 15 | 0 | 0 | 25% | 100%d | 100% |
| Patient name | 209 | 54 | 263 | 208 | 55 | 40 | 194 | 15 | 14 | 21% | 74%a | 73% |
| Web address | 2 | 2 | 4 | 0 | 4 | 2 | 0 | 2 | 0 | 50% | 100%d | 50% |
Abbreviations: MRN, medical record number; PII, personally identifiable information.
a P value for Chi-square test of independence < .00001.
b P value for Chi-square test of independence < .0001.
c P value for Chi-square test of independence = .107299.
d Chi-square not calculated due to sparse data.
Reference standard PII annotations
We created reference standard annotations of PII in the study corpus and used these to evaluate the experiment. Our PII annotation schema included (1) patient names, including friends or family names; (2) patient age; (3) dates; (4) doctor initials; (5) doctor names (included because health care institutions, including ours, often require this); (6) geographical locations including addresses; (7) medical record numbers; (8) organization names including health care organizations; (9) phone numbers; (10) Web addresses including email addresses; and (11) a catch-all category for other identifiers, including social security and medical device serial numbers. To create high-quality reference standard PII annotations, we used a 4-step process with 3 layers of review by trained chart abstractors (Supplementary Appendix A).
The parrot attack scenario and assumptions
The machine-assisted parrot attack scenario is depicted in Figure 2. In this scenario a hypothetical attacker attempted to expose leaked PII in an externally shared HIPS deidentified corpus of 800 notes using only the 800 released notes; no information from any other source was used. The attack began by randomly splitting the corpus into two 400-note sets, 1 used for model training (the attacker’s training corpus) and the other for discovering leaked PII (the “attack corpus”). Next, the attacker (a) manually annotated all apparent PII in the training corpus without attempting to distinguish leaked from surrogate PII; (b) trained a machine-learned tagger model from these annotations (specifically, training a model based on textual features of the annotated content and its immediate context); and (c) used this tagger model to automatically tag PII in the 400-note attack corpus. Finally, the attacker manually reviewed the entire attack corpus to reveal content hypothesized to be leaked PII, assuming that any PII not tagged by the attacker’s tagger was leaked PII. This constituted a parrot attack because the process the attacker used to expose leaked PII parroted the defender’s process used to conceal leaked PII. For simplicity, the scenario we present attacked only half of the 800-note release corpus, but the same method may be used to attack the entire corpus.
Figure 2.
Defender process for removing known personally identifying information (PII) in clinical text using a machine-learned (ML) tagger and concealing leaked PII via hiding in plain sight (HIPS) resynthesis (left/blue), and a parrot attack process using a comparable machine learning approach to identify text hypothesized to be leaked PII in the released corpus (right/orange).
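The attack workflow in Figure 2 can be summarized as the following skeleton, a sketch in which the annotate, train_tagger, and tag callables are hypothetical placeholders for human annotation effort and for whatever open source tagger toolkit an attacker adopts; none of these names correspond to a real API.

```python
import random

# Schematic of the parrot attack flow: split the release corpus, annotate
# one half, train a tagger, tag the other half, and treat untagged PII-like
# content as hypothesized leaks. Human-in-the-loop steps are modeled as
# callables passed in by the caller.
def parrot_attack(release_corpus, annotate, train_tagger, tag, seed=42):
    rng = random.Random(seed)
    notes = list(release_corpus)
    rng.shuffle(notes)
    half = len(notes) // 2
    train_notes, attack_notes = notes[:half], notes[half:]

    # (a) manually annotate all apparent PII in the training half,
    #     without trying to distinguish leaked from surrogate PII
    annotations = [annotate(n) for n in train_notes]
    # (b) train a machine-learned tagger from those annotations
    model = train_tagger(train_notes, annotations)
    # (c) tag the attack half; PII-like spans the tagger misses are
    #     hypothesized leaks (annotate() here stands for the manual
    #     review that finds untagged PII-like content)
    hypothesized_leaks = []
    for note in attack_notes:
        tagged = tag(model, note)
        untagged_pii = [s for s in annotate(note) if s not in tagged]
        hypothesized_leaks.append((note, untagged_pii))
    return hypothesized_leaks
```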
Plausible assumptions that may motivate a would-be attacker to consider the parrot attack investigated here are (1) the deidentified release corpus constitutes training data that are very similar to the original (fully identified) training data used to train the defender’s PII tagger model, and (2) using these training data to train a machine-learned PII tagger will yield a tagger that performs exactly like the defender’s tagger for a nontrivial portion of the release corpus. To the extent these assumptions hold, the attacker’s tagger may be much more likely to tag resynthesized PII than leaked PII, and, if so, these tagger results may help an attacker expose leaked PII.
We believe these assumptions are plausible for several reasons. First, as a source of training data, the release corpus is, in fact, very similar to the original corpus. Surrogate PII in the resynthesized release corpus, by design, has textual features—punctuation, capitalization, numeric characters, word frequency properties, and some actual content (such as month names)—that are similar to PII in the defender’s original corpus. For example, the surrogate patient name “Mary K Jones” has textual features similar to those of the PII it replaces, “Jane J Doe.” Second, features of non-PII text immediately preceding or following PII in the resynthesized corpus are (nearly) identical to those in the defender’s original corpus. This is because the defender’s automated resynthesis engine does not attempt to (and rarely does) alter this information, which may contain clinical information of interest. Finally, it is reasonable to assume that the defender and attacker will have access to similar machine learning algorithms, as the best algorithms are typically published and available as open source software.
What is unknown—and the question that motivated this experiment—is the extent to which differences between a defender’s training corpus, which contains real PII, and a parrot attacker’s training corpus, which contains mostly HIPS-resynthesized PII, will impact the attacker’s ability to use a parrot attack to “undo” the protection HIPS deidentification affords to leaked PII.
The parrot attack, we emphasize, cannot automatically generate a list of leaked PII. This is because any technology capable of doing so would be rapidly incorporated into a defender’s process.26 The attack scenario considered here simply attempts to reverse the protective effect of HIPS resynthesis. Even if this attack were 100% successful, its harms would extend no further than exposing the leaked PII that the HIPS resynthesis conceals.
Simulating the attacker’s manual annotation of training data
As noted above, the attacker must create training data by manually annotating all PII-like text, resynthesized and leaked, in the subset of the release corpus designated as training data. Annotation entails marking, for each instance that appears to be PII, its span (ie, beginning and ending locations) and type (eg, patient name). We assume an attacker is motivated to create high-quality annotations because annotation errors in training data negatively impact the performance of models trained on them. But creating high-quality annotation is a burdensome and expensive undertaking,27 and an attacker may therefore tolerate some level of annotation imperfection to reduce the cost of the attack. However, to investigate the upper bound of an attacker’s capability, our experiment assumed the attacker used extremely high-quality annotation. We simulated this programmatically by overlaying information from our reference standard PII annotations and the defender’s PII resynthesis engine, providing the attacker with PII annotations whose spans and types perfectly reflected all resynthesized and leaked PII in the attacker’s training corpus. The experiment is thus free from any perturbations introduced by an attacker's imperfect annotation.
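A minimal sketch of this overlay, assuming both sources yield (start, end, type) span tuples in released-note offsets (our assumption, not the study’s actual data model), follows.

```python
# Sketch of simulating error-free attacker annotations: spans of surrogate
# PII (from the defender's resynthesis log) and spans of leaked PII (from
# the reference standard) are merged into one annotation set, since the
# attacker annotates both kinds identically, by span and type only.
def simulate_attacker_annotations(surrogate_spans, leak_spans):
    """Each span is a (start, end, pii_type) tuple in released-note offsets."""
    return sorted(set(surrogate_spans) | set(leak_spans))
```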
Simulating the attacker’s training and use of a PII tagger model
With a well-annotated 400-note training corpus, familiarity with relevant literature, and willingness to expend moderate analytical effort, an attacker can be expected to identify an open source machine learning algorithm that yields a reasonably high-performing PII tagger model. For example, experiments using random subsets of the attacker’s training corpus may be used to evaluate and select the highest performing of several alternative tagger model training approaches. Inconsistencies between the defender’s and attacker’s tagger models can be expected to negatively impact the attacker’s success rate in exposing leaked PII, as tagger errors introduce erroneous information regarding which PII is likely to be resynthesized. However, knowledge about machine learning algorithms a defender uses for deidentification is likely to be publicly available, consistent with security best practices that avoid reliance on “security through obscurity,”28 giving an attacker access to (nearly) identical software used by the defender.
To investigate the upper bound of an attacker’s capability, and to avoid perturbations due solely to variations in the choice of software used to train a PII tagger, we gave the simulated attacker the exact same tagger model training software our defender used: MIST v.2.0.4.29 The tagger model trained on the attacker’s training corpus was then applied to the 400-note attack corpus. This produced a version of the attack corpus in which each text span the tagger predicted to be PII was clearly marked (Figure 1, column E). This machine-marked version of the corpus was manually reviewed by the attacker with intent to discover leaked (unmarked) PII, as previously described.
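For illustration only, a generic BIO-style sequence tagger built with the open source sklearn-crfsuite package could play this tagger-training role for an attacker lacking MIST; the sketch below is a stand-in under that assumption, not MIST’s API or our experimental code, and the feature helpers are ours.

```python
import sklearn_crfsuite

# Token-and-context features of the kind described above: surface form,
# capitalization, digit content, and the immediately adjacent tokens.
def token_features(tokens, i):
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "istitle": tok.istitle(),
        "isdigit": tok.isdigit(),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i + 1 < len(tokens) else "<EOS>",
    }

def train_tagger(tokenized_notes, bio_labels):
    """tokenized_notes: list of token lists; bio_labels: matching B-/I-/O tags."""
    X = [[token_features(toks, i) for i in range(len(toks))]
         for toks in tokenized_notes]
    crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1,
                               max_iterations=100, all_possible_transitions=True)
    crf.fit(X, bio_labels)
    return crf  # crf.predict(X_new) then yields per-token PII tags
```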
Reviewing the attacker-tagged corpus to expose leaked PII
The attacker’s attempt to expose leaked PII in the attack corpus entails manually reviewing the entire attacker-tagged corpus, and manually marking all untagged PII as presumptive leaks. Manual review is subject to human error, and such error may diminish an attack’s success. However, to eliminate perturbations due solely to manual review error, our simulated attack was free from such error. We did this by programmatically overlaying information from the reference standard (Figure 1, column A), the defender’s resynthesized corpus (Figure 1, column C), and the attacker-tagged corpus (Figure 1, column E). This same information enabled us to determine which of the attacker’s hypothesized leaks were actual leaks and which were resynthesized content wrongly hypothesized to be leaks (Figure 1, column F). We used this information to evaluate the parrot attack’s success.
Metrics for evaluating the parrot attack
We evaluated the parrot attacker’s ability to reveal leaked PII by calculating the following 2 metrics using the 400-note attack corpus:
Eq. 1: $\text{PII leak-detection rate} = \dfrac{\text{actual PII leaks hypothesized by the attacker to be leaks}}{\text{all actual PII leaks in the attack corpus}}$

Eq. 2: $\text{PII leak prediction accuracy} = \dfrac{\text{actual PII leaks hypothesized by the attacker to be leaks}}{\text{all PII instances hypothesized by the attacker to be leaks}}$
The measures in Eq. 1 and Eq. 2 are conceptually identical to recall and precision in information retrieval, respectively. However, we chose not to use the latter terms to avoid potential confusion with their more common usage in PII tagger model performance metrics. Values of Eq. 1 and Eq. 2 range from 0.0 to 1.0, with higher values indicating success in exposing leaked PII and lower values indicating failure. Taken together, Eq. 1 and Eq. 2 indicate both the degree to which an attack can re-expose actual PII leaks and the degree to which the interpretation of purported leaks may be distorted by misleading information. Distortion occurs when the attacker believes resynthesized PII is actual PII.
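As a concrete check, both metrics can be computed directly from the confusion counts reported in Table 2; the short snippet below (variable names are ours) reproduces the all-types values.

```python
# Eq. 1 and Eq. 2 computed from the "All PII types" row of Table 2.
correctly_predicted_leaked = 211  # actual leaks the attacker flagged as leaks
actual_leaks = 310                # all leaked PII in the attack corpus
hypothesized_leaks = 402          # all instances the attacker flagged as leaks

leak_detection_rate = correctly_predicted_leaked / actual_leaks             # Eq. 1
leak_prediction_accuracy = correctly_predicted_leaked / hypothesized_leaks  # Eq. 2
print(f"{leak_detection_rate:.0%}, {leak_prediction_accuracy:.0%}")  # 68%, 52%
```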
To assess whether the attacker's leak-detection rate differed from that expected by random chance, we calculated Chi-square tests of independence and the associated P values for the 2×2 confusion matrix comparing the attacker’s predictions regarding leaked PII to truth (using data from columns in Table 2 labeled “Attacker’s success in exposing leaked PII”). We calculated this test statistic once for all PII types combined and, separately, for each of 11 PII types (ie, 1 test per row in Table 2).
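For readers who wish to reproduce the test, a minimal sketch using SciPy’s chi2_contingency on the all-types 2×2 confusion matrix from Table 2 follows; the matrix layout is the only assumption.

```python
from scipy.stats import chi2_contingency

# Rows: actually leaked / actually surrogate.
# Columns: predicted leaked / predicted surrogate (Table 2, all PII types).
table = [[211,   99],
         [191, 3504]]
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, p = {p:.3g}")  # p < .00001, as reported in Table 2
```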
RESULTS
The 1200 notes available to the defender contained 11 441 gold-standard annotated PII instances. The defender used a random selection of 400 of these notes (containing 4133 gold-standard PII instances) to train the defender’s PII tagger model. This model was then used to tag all known PII in the remaining 800 notes (containing 7308 gold-standard PII instances) and replace it with HIPS-resynthesized surrogates, resulting in the 800-note release corpus. The release corpus contained 6712 instances (92%) of resynthesized PII and 596 instances (8%) of actual PII leaks—the latter potentially vulnerable to discovery by the attacker.
After receiving the release corpus, the attacker randomly split it into two 400-document sets, manually annotated all apparent PII in 1 set (the attacker’s training set; 92% of which was HIPS resynthesized content and 8% actual PII leaks) and used the resulting data to train a machine-learned tagger. The attacker then used this tagger to tag the remaining 400 notes (the attack corpus). Unknown to the attacker, the attack corpus contained 3695 (92%) instances of HIPS resynthesized PII and 310 (8%) actual PII leaks, for a total of 4005 PII-like instances (Table 2). When applied to the attack corpus, the attacker’s system tagged 3603 (90%) of these PII-like instances and left the remaining 402 (10%) untagged. Through a (simulated and error-free) manual review of the entire attack corpus, the attacker found all 402 untagged PII-like instances and hypothesized each was an actual PII leak. Of these 402 hypothesized leaks, 211 (52%) were actual leaks (representing 68% of the 310 actual leaks in the attack corpus) and 191 were HIPS surrogates. Further, the 3603 PII-like instances tagged by the attacker’s tagger included 99 actual PII leaks, which the attacker wrongly hypothesized to be HIPS surrogates, thereby allowing them to remain undetected in this attack. Content the attacker assumes to be leaked PII is that which falls outside the region labeled “Content tagged by the attacker’s tagger” in Figure 3.
Figure 3.
Rectangular Venn diagram illustrating the relationship between the attacker’s PII tagger results, surrogate PII in the release corpus, and leaked PII in the release corpus.
As shown in Table 2, for all PII types combined, the attack achieved a leak-detection rate (Eq. 1) of 68% (211/310) with leak prediction accuracy (Eq. 2) of 52% (211/402). The attacker’s leak-detection rate of 68% was statistically significantly better than the 8% detection rate expected by chance (P < .00001). Excluding rarely occurring PII types (ie, medical record numbers, phone numbers, and Web address), the parrot attack achieved high leak-detection rates and accuracy for patient names (74% detection, 73% accuracy; see Supplementary Appendix B for an analysis showing that deidentification failures involving patient name coreference had only a very limited impact on detection of leaked patient names) and doctor names (64% detection, 71% accuracy). The attack identified all 5 leaked phone/fax numbers with perfect accuracy. The attack correctly concluded all 42 medical record numbers had been resynthesized. Leak-prediction accuracy was lowest for patient ages, dates, and doctor initials.
DISCUSSION
Many users of machine-learned deidentification systems and HIPS resynthesis may have assumed, as we did, that without access to a defender’s original training corpus any malicious attempt to expose (or to substantially increase the likelihood of exposing) leaked PII in a deidentified corpus (containing mostly surrogate PII) would be infeasible. The present experiment challenges this assumption. A parrot attack, using only the released corpus as training data, can mimic the defender’s PII tagger and use it in conjunction with labor-intensive manual review to greatly increase the likelihood of exposing over two-thirds (68%) of leaked PII in the released corpus. Whether this finding is a “glass half empty” or a “glass half full” depends largely on one’s perspective. Stakeholders ethically and legally responsible for preserving patient privacy may perceive this result as a “glass half empty.” Others, including informaticists, may see a “glass half full,” cognizant of the substantial effort needed to implement the attack, the fact that one-third (32%) of leaked PII remained undetected, and the fact that even the “exposed” PII remained partially shielded (52% of attacker-identified “leaks” were actual leaks, compared to an 8% concentration of leaks among all PII instances in the entire corpus). Ultimately, exposing PII leaks in a HIPS resynthesized corpus is a game of chance, but this work suggests a parrot attack can greatly increase the odds of detection for a substantial subset of leaked information. The dramatic attenuation of HIPS protection for leaked patient names, arguably the most sensitive of all HIPAA identifiers, is particularly notable. Three-fourths of actual leaked patient names were hypothesized to be leaks, yielding a set of 55 attacker-assumed leaked names of which 40 (73%) were actual leaks (Table 2). Also notable is the high accuracy in identifying leaked phone numbers, though this result may be unstable due to their low frequency in the release corpus.
There is some consolation in noting that conducting a parrot attack requires considerable effort. Annotation of a large corpus to create training data is time-consuming and tedious,27 as is manually reviewing the entire corpus once tagged by the attacker’s system, and these factors may limit the scalability of the parrot attack. On the other hand, the technical challenges are modest, given the availability of open source software for model training and text tagging tasks.
Given the success of this attack we should consider whether novel countermeasures may mitigate vulnerabilities created by parrot attacks. One potential strategy would be to randomly perturb the release corpus to “poison” the attacker’s training data, an approach that has been used in other machine learning contexts.30 This may be possible, for example, by randomly altering the capitalization of proper names, randomly degrading the formatting of dates, or randomly injecting “decoy” text containing apparent PII—all done after performing HIPS resynthesis in the standard way. Such poisoning would add noise to the attacker’s training data which, in turn, may reduce the attacker’s ability to parrot the defender’s PII tagger. Countermeasures based on altering a corpus’ content must balance advantages of fortifying deidentification against the disadvantages of degrading its information value.
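As an illustration of how such perturbations might be injected after standard HIPS resynthesis, consider the following sketch; the perturbation types and rates are arbitrary choices for illustration, not an evaluated design.

```python
import random

# Illustrative post-resynthesis "poisoning" of surrogate PII: randomly
# perturb capitalization of surrogate names and degrade surrogate date
# formatting to add noise to any training data a parrot attacker might
# derive from the release corpus.
def poison(surrogate_text, pii_type, rng, rate=0.2):
    if rng.random() >= rate:
        return surrogate_text                     # leave most surrogates intact
    if pii_type == "PT_NAME":
        return surrogate_text.lower()             # eg, "mary k jones"
    if pii_type == "DATE":
        return surrogate_text.replace(",", "")    # eg, "Oct 22 2013"
    return surrogate_text

rng = random.Random(7)
print(poison("Mary K Jones", "PT_NAME", rng))
```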
Regardless, we believe that this work underscores the importance of data use agreements governing use of deidentified clinical text by external collaborators. While no agreement can eliminate the risk of attack, data use agreements can impose sanctions to deter flagrant misuse.31,32
Certain limitations of this study should be acknowledged. First, the 8% of PII leaked in the defender's release corpus is above the 1%–6% achieved by state-of-the-art automated deidentification systems. Our results may have differed in a corpus with fewer leaks, but whether this would improve or diminish the attacker’s success is difficult to estimate; attack efficiency may improve as leak rates decline. Second, by simulating error-free manual PII annotation (for attacker training data) and perfect manual review (to discover hypothesized leaks), we have likely identified the upper end of a parrot attack’s ability to attenuate the HIPS effect. This is useful from the perspective of risk assessment, but it may represent a worst-case scenario. Third, our results are based on clinical corpora from a single institution and used a single software system (MIST) to implement the attack; results in other settings and with other software systems may vary.
CONCLUSION
This research demonstrates that clinical text deidentified using a machine-learned PII tagger, along with HIPS resynthesis, is susceptible to attacks that mimic the machine learning used during the deidentification process. In particular, using over one thousand clinical notes annotated for patient identifiers, we illustrated that the attack can attenuate—but not eliminate—the protective effect of the HIPS approach. Still, the attack is plausible since it can be carried out using only the deidentified corpus provided to the recipient and readily available open source software. This finding should be considered when assessing exposure risks to deidentified clinical notes, particularly in settings where hostile attack cannot be ruled out.
FUNDING
This work was supported by the National Library of Medicine grant number R01LM011366 and National Human Genome Research Institute grant numbers RM1HG009034, U01HG008657, and U01HG008701. The work by JA and LH was supported by the MITRE Corporation.
CONTRIBUTIONS
All authors contributed to the conception and design of the project and interpretation of results. DSC oversaw selection and annotation of the study corpus and wrote the first draft of the manuscript. DJC prepared the simulated data and summarized the results of the hypothetical experiment. All other authors provided critical feedback on and approved the final version of the manuscript.
SUPPLEMENTARY MATERIAL
Supplementary material is available at Journal of the American Medical Informatics Association online.
Conflict of interest statement
Dr Li performed the work reported in this article while a student at Vanderbilt University; she is currently employed by Privacy Analytics, Inc., a company that does business in the deidentification space. Other authors have no conflicting interests to declare.
REFERENCES
- 1. US Department of Health and Human Services. Standards for privacy of individually identifiable health information: final rule. Federal Register 2002: 53181–273.
- 2. Meystre SM, Friedlin FJ, South BR, et al. Automatic de-identification of textual documents in the electronic health record: a review of recent research. BMC Med Res Methodol 2010; 10: 70.
- 3. Carrell D, Malin B, Aberdeen J, et al. Hiding in plain sight: use of realistic surrogates to reduce exposure of protected health information in clinical text. J Am Med Inform Assoc 2013; 20 (2): 342–8.
- 4. Hirschman L, Aberdeen J. Measuring risk and information preservation: toward new metrics for de-identification of clinical texts. In: Proceedings of the NAACL HLT 2010 Second Louhi Workshop on Text and Data Mining of Health Documents; June 5, 2010; Los Angeles, California.
- 5. El Emam K, Jonker E, Arbuckle L, et al. A systematic review of re-identification attacks on health data. PLoS One 2011; 6 (12): e28071.
- 6. Xia W, Heatherly R, Ding X, et al. R-U policy frontiers for health data de-identification. J Am Med Inform Assoc 2015; 22 (5): 1029–41.
- 7. Newton EM, Sweeney L, Malin B. Preserving privacy by de-identifying face images. IEEE Trans Knowl Data Eng 2005; 17 (2): 232–43.
- 8. Dehghan A, Kovacevic A, Karystianis G, et al. Combining knowledge- and data-driven methods for de-identification of clinical narratives. J Biomed Inform 2015; 58: S53–9.
- 9. Uzuner O, Luo Y, Szolovits P. Evaluating the state-of-the-art in automatic de-identification. J Am Med Inform Assoc 2007; 14 (5): 550–63.
- 10. Stubbs A, Kotfila C, Uzuner O. Automated systems for the de-identification of longitudinal clinical narratives: overview of 2014 i2b2/UTHealth shared task Track 1. J Biomed Inform 2015; 58: S11–9.
- 11. Aberdeen J, Bayer S, Yeniterzi R, et al. The MITRE Identification Scrubber Toolkit: design, training, and assessment. Int J Med Inform 2010; 79 (12): 849–59.
- 12. Ferrandez O, South BR, Shen S, et al. BoB, a best-of-breed automated text de-identification system for VHA clinical documents. J Am Med Inform Assoc 2013; 20: 77–83.
- 13. Dorr DA, Phillips WF, Phansalkar S, et al. Assessing the difficulty and time cost of de-identification in clinical narratives. Methods Inf Med 2006; 45 (3): 246–52.
- 14. Friedlin FJ, McDonald CJ. A software tool for removing patient identifying information from clinical documents. J Am Med Inform Assoc 2008; 15 (5): 601–10.
- 15. Meystre SM, Savova GK, Kipper-Schuler KC, et al. Extracting information from textual documents in the electronic health record: a review of recent research. Yearb Med Inform 2008; 17: 128–44.
- 16. Morrison FP, Li L, Lai AM, et al. Repurposing the clinical record: can an existing natural language processing system de-identify clinical notes? J Am Med Inform Assoc 2009; 16 (1): 37–9.
- 17. Szarvas G, Farkas R, Busa-Fekete R. State-of-the-art anonymization of medical records using an iterative machine learning framework. J Am Med Inform Assoc 2007; 14: 574–80.
- 18. Wellner B, Huyck M, Mardis S, et al. Rapidly retargetable approaches to de-identification in medical records. J Am Med Inform Assoc 2007; 14 (5): 564–73.
- 19. Yeniterzi R, Aberdeen J, Bayer S, et al. Effects of personal identifier resynthesis on clinical text de-identification. J Am Med Inform Assoc 2010; 17 (2): 159–68.
- 20. Taira RK, Bui AA, Kangarloo H. Identification of patient name references within medical documents using semantic selectional restrictions. In: Proceedings of the AMIA Symposium; November 11, 2002; San Antonio, Texas.
- 21. Neamatullah I, Douglass MM, Lehman LW, et al. Automated de-identification of free-text medical records. BMC Med Inform Decis Mak 2008; 8: 32.
- 22. Mayer J, Shen S, South BR, et al. Inductive creation of an annotation schema and a reference standard for de-identification of VA electronic clinical notes. In: AMIA 2009 Annual Symposium Proceedings; 2009: 416–20.
- 23. Gardner J, Xiong L. An integrated framework for de-identifying unstructured medical data. Data Knowl Eng 2009; 68 (12): 1441–51.
- 24. Dernoncourt F, Lee JY, Uzuner O, Szolovits P. De-identification of patient notes with recurrent neural networks. J Am Med Inform Assoc 2017; 24 (3): 596–606.
- 25. Sweeney L. Replacing personally-identifying information in medical records, the Scrub system. In: Proceedings of the AMIA Annual Fall Symposium; October 1996; Washington, DC.
- 26. Li B, Vorobeychik Y, Li M, Malin B. Scalable iterative classification for sanitizing large-scale datasets. IEEE Trans Knowl Data Eng 2017; 29 (3): 698–711.
- 27. Carrell DS, Cronkite DJ, Malin BA. Is the juice worth the squeeze? Costs and benefits of multiple human annotators for clinical text de-identification. Methods Inf Med 2016; 55: 356–64.
- 28. OWASP. Security by Design Principles. https://www.owasp.org/index.php/Security_by_Design_Principles. Accessed June 25, 2019.
- 29. MITRE. MITRE Identification Scrubber Toolkit (MIST). http://mist-deid.sourceforge.net. Accessed June 25, 2019.
- 30. Mozaffari-Kermani M, Sur-Kolay S, Raghunathan A, et al. Systematic poisoning attacks on and defenses for machine learning in healthcare. IEEE J Biomed Health Inform 2015; 19 (6): 1893–905.
- 31. Paltoo DN, Rodriguez LL, Feolo M, et al. Data use under the NIH GWAS data sharing policy and future directions. Nat Genet 2014; 46 (9): 934–8.
- 32. Wan Z, Vorobeychik Y, Xia W, et al. Expanding access to large-scale genomic data while promoting privacy: a game theoretic approach. Am J Hum Genet 2017; 100 (2): 316–22.