2022 Nov 23;30(2):318–328. doi: 10.1093/jamia/ocac219

Table 4.

Comparison of our best model to several publicly available deidentifier models using F1 score (precision, recall) on the Steinkamp Penn test set (reference 13)

| PHI category | MIST | NLM-Scrubber | Emory HIDE | MIT deid rule-based | NeuroNER | MIT deid transformer-based | Our best model (radiology + i2b2 + augmentation) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| All PHI | 75.5 (94.7, 62.7) | 74.1 (64.1, 87.5) | 92.2 (96.6, 88.2) | 74.0 (81.7, 67.6) | 93.6 (94.5, 92.6) | 78.4 (95.1, 66.7) | 97.9 (98.0, 97.7) |
| Macro-averaged | 61.5 (94.6, 53.7) | 58.8 (51.8, 83.6) | 72.9 (82.1, 66.1) | 28.0 (35.6, 26.0) | 68.6 (75.1, 65.6) | 53.5 (67.9, 48.8) | 89.4 (91.5, 88.0) |
| Dates | 75.1 (97.4, 61.2) | 97.9 (98.3, 97.5) | 96.4 (96.8, 96.0) | 89.0 (96.0, 83.0) | 97.9 (98.4, 97.5) | 83.4 (98.0, 72.6) | 98.9 (99.1, 98.6) |
| Provider names | 80.8 (93.0, 71.4) | None | 86.6 (97.5, 77.9) | None | 87.0 (82.0, 92.6) | 54.3 (84.2, 40.1) | 95.6 (92.9, 98.4) |
| Locations | 79.6 (85.2, 74.7) | None | 83.0 (93.4, 74.7) | 30.8 (51.1, 22.0) | 86.3 (82.0, 77.9) | 45.0 (60.5, 35.8) | 89.4 (90.6, 88.2) |
| Vendors and software | 88.1 (86.7, 89.7) | None | 76.5 (88.6, 67.2) | 6.2 (28.6, 3.4) | 75.9 (82.0, 70.7) | None | 65.0 (78.3, 55.6) |
| IDs | 11.1 (100, 5.9) | None | 90.6 (98.1, 84.1) | 0 (0, 0) | 84.8 (81.1, 88.9) | 55.3 (76.3, 43.3) | 97.3 (95.9, 98.8) |
| Patient names | 19.0 (100, 10.5) | 45.4 (37.3, 57.8) | 0 (0, 0) | 42.1 (37.9, 47.5) | 48.0 (100, 31.6) | None | 95.9 (100, 92.1) |
| Phone numbers | 77.0 (100, 62.5) | 33.0 (19.9, 95.6) | 76.9 (100, 62.5) | 0 (0, 0) | 0 (0, 0) | 29.5 (20.6, 52.0) | 84.0 (84.0, 84.0) |

Notes: A value of “None” means that the corresponding model cannot detect that PHI category. Rule-based models could not be retrained, and their scores suffered from differences in what was considered PHI in the original study, which sometimes excluded years or name titles from PHI labels. Our best model was trained on both radiology reports and i2b2 notes with our data augmentation approach. The “All PHI” category corresponds to the PHI versus non-PHI task, in which labels and predictions are binarized as either PHI or non-PHI. For each PHI category, the best score is emboldened and underlined.

PHI: protected health information.
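
The scores in Table 4 can be related to one another with a short sketch. The Python below is illustrative only and is not the authors' evaluation code: the function names, the use of `"O"` as the non-PHI tag, and the per-item matching are all assumptions made for the example. It shows F1 as the harmonic mean of precision and recall, the PHI versus non-PHI binarization described in the notes, and a macro average taken as the unweighted mean over the seven PHI categories, which reproduces the "Macro-averaged" row for our best model from the per-category rows.

```python
from statistics import mean


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binarize(labels, non_phi_tag="O"):
    """Collapse category labels to PHI (True) / non-PHI (False).

    The "O" tag for non-PHI is an assumption made for this example.
    """
    return [label != non_phi_tag for label in labels]


def precision_recall(gold, pred):
    """Precision and recall over paired boolean labels (hypothetical helper)."""
    tp = sum(g and p for g, p in zip(gold, pred))
    fp = sum((not g) and p for g, p in zip(gold, pred))
    fn = sum(g and (not p) for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# (F1, precision, recall) for "Our best model", copied from Table 4.
OUR_BEST_MODEL = {
    "Dates": (98.9, 99.1, 98.6),
    "Provider names": (95.6, 92.9, 98.4),
    "Locations": (89.4, 90.6, 88.2),
    "Vendors and software": (65.0, 78.3, 55.6),
    "IDs": (97.3, 95.9, 98.8),
    "Patient names": (95.9, 100.0, 92.1),
    "Phone numbers": (84.0, 84.0, 84.0),
}


def macro_average(rows):
    """Unweighted mean of each score column across categories."""
    columns = zip(*rows.values())
    return tuple(round(mean(column), 1) for column in columns)


if __name__ == "__main__":
    # Toy labels, invented for illustration only.
    gold = binarize(["O", "DATE", "NAME", "O"])
    pred = binarize(["O", "NAME", "O", "ID"])
    p, r = precision_recall(gold, pred)
    print(p, r, f1_score(p, r))
    print(macro_average(OUR_BEST_MODEL))
```

Under the binarization, a prediction with the wrong category still counts as correct as long as the item is flagged as PHI, which is why the "All PHI" scores can exceed the macro-averaged ones. Recomputing F1 from the rounded precision and recall printed in the table can differ from the printed F1 by about 0.1 because of rounding.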