Table 2.
PHI type | N PHI instances in corpus | N PHI instances replaced by system | System PHI recall* | N residual PHI instances in corpus | Reasonable opportunity to test HIPS?† |
A | B | C | D | E | F |
HIPAA PHI | | | | | |
Pat. name | 59 | 27 | 46% | 32 | Yes |
Age | 50 | 50 | 100% | 0 | No |
Phone # | 3 | 3 | 100% | 0 | No |
Address | 3 | 0 | 0% | 3 | No |
Date | 228 | 194 | 85% | 34 | Yes |
MRN | 0 | 0 | NA | 0 | No |
Acct. # | 0 | 0 | NA | 0 | No |
Other ID #s | 0 | 0 | NA | 0 | No |
ALL HIPAA | 343 | 274 | 80% | 69 | |
OTHER PHI | | | | | |
MD name | 53 | 4 | 8% | 49 | No |
Org. name | 63 | 2 | 3% | 61 | No |
ALL OTHER | 116 | 6 | 5% | 110 | |
* A suboptimal training set was used to degrade system recall, thereby increasing the amount of residual PHI available for experimental purposes.
† Criteria for inclusion in the detection experiment were system recall (col. D) ≥ ∼0.5 and N residual instances (col. E) ≥ ∼10.
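
The arithmetic behind columns D to F can be checked with a short sketch. The counts below are copied from the table. The function names and the tolerance `RECALL_SLACK` are hypothetical: the footnote gives only approximate cut-offs, so the slack here is an assumption chosen so that the 46% recall for patient names counts as "about 0.5", as column F implies. It is not a value stated by the authors.

```python
# Counts copied from Table 2: (N instances in corpus, N replaced by system).
COUNTS = {
    "Pat. name": (59, 27),
    "Age": (50, 50),
    "Phone #": (3, 3),
    "Address": (3, 0),
    "Date": (228, 194),
    "MRN": (0, 0),
    "MD name": (53, 4),
    "Org. name": (63, 2),
}

# Assumption: the footnote states approximate cut-offs ("≥ ∼0.5", "≥ ∼10").
# RECALL_SLACK is a hypothetical tolerance, not a value from the source.
RECALL_CUTOFF = 0.5
RECALL_SLACK = 0.05
RESIDUAL_CUTOFF = 10


def recall(total, replaced):
    """Column D: fraction of instances replaced; None when the type is absent (NA)."""
    return replaced / total if total else None


def residual(total, replaced):
    """Column E: instances the system left in the corpus."""
    return total - replaced


def eligible(total, replaced):
    """Column F: approximate recall and residual-count criteria both met."""
    r = recall(total, replaced)
    if r is None:
        return False
    return (r >= RECALL_CUTOFF - RECALL_SLACK
            and residual(total, replaced) >= RESIDUAL_CUTOFF)


def eligible_types(counts):
    """Names of PHI types that satisfy both criteria, in table order."""
    return [name for name, (t, r) in counts.items() if eligible(t, r)]


if __name__ == "__main__":
    for name, (t, r) in COUNTS.items():
        rec = recall(t, r)
        shown = "NA" if rec is None else f"{rec:.0%}"
        print(f"{name:10s} recall={shown:>4s} residual={residual(t, r):3d} "
              f"eligible={eligible(t, r)}")
```

Under these assumptions only patient names and dates qualify, which agrees with column F. With a strict 0.5 cut-off (zero slack) patient names would be excluded, so the tolerance is what reproduces the table's "Yes" for that row.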