Table 2.
Remaining PHI analysis by tool, UCSF test corpus.
| PHI category | Instances of PHI remaining (PHIlter) | Instances of PHI remaining (Physionet) | Instances of PHI remaining (Scrubber) |
|---|---|---|---|
| Age ≥ 90 | 0 | 0 | 0 |
| Patient_Vehicle_or_Device_Id | 0 | 18 | 0 |
| Patient_Account_Number | 0 | 35 | 4 |
| Patient_Medical_Record_Id | 0 | 445 | 0 |
| Patient_Social_Security_Number | 0 | 0 | 6 |
| Patient_Phone_Fax | 0 | 0 | 1 |
| Patient_Initials | 2 | 120 | 132 |
| Patient_Name_or_Family_Member_Name | 6 | 211 | 93 |
| Patient_Address | 7 | 25 | 16 |
| Patient_Unique_ID | 20 | 442 | 34 |
|  | 0 | 1 | 1 |
| URL_IP | 4 | 20 | 153 |
| Date | 7 | 257 | 269 |
| Provider_Certificate_or_License | 0 | 276 | 99 |
| Provider_Name | 12 | 546 | 90 |
| Provider_Initials | 12 | 236 | 217 |
| Provider_Address_or_Location | 43 | 1597 | 210 |
| Provider_Phone_Fax | 45 | 49 | 43 |
Counts of PHI remaining after de-identification by PHIlter, Physionet, and Scrubber on the UCSF test corpus. Each instance is a single token within the span of a PHI item, so a multi-token item of PHI can contribute more than one instance.
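The token-level counting described in the note above can be illustrated with a minimal sketch. It is not the evaluation code used for the table. The function name `count_remaining_phi`, the `(category, token_positions)` input layout, and the example data are all hypothetical and chosen only to show how one multi-token PHI item can yield several remaining instances.

```python
from collections import Counter


def count_remaining_phi(phi_items, redacted_positions):
    """Count PHI tokens left unredacted, grouped by PHI category.

    phi_items: list of (category, token_positions) pairs, one pair per
        annotated PHI item (hypothetical layout, assumed for illustration).
    redacted_positions: set of token positions that the tool obscured.
    """
    remaining = Counter()
    for category, positions in phi_items:
        for pos in positions:
            # Every token of a PHI item is scored on its own.
            if pos not in redacted_positions:
                remaining[category] += 1
    return remaining


# Hypothetical note: a three-token provider name at positions 4-6
# and a single-token date at position 10.
phi_items = [("Provider_Name", [4, 5, 6]), ("Date", [10])]
redacted = {4, 10}  # the tool obscured one name token and the date

counts = count_remaining_phi(phi_items, redacted)
print(dict(counts))  # {'Provider_Name': 2}
```

Under this convention a partially redacted name still adds to the remaining count, which is why the per-category totals can exceed the number of distinct PHI items missed.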