2012 Jul 6;20(2):342–348. doi: 10.1136/amiajnl-2012-001034

Table 3.

Results of the high-density identifier (oncology notes) human detection experiment by identifier type and reviewer

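Columns A–D describe the test corpus. R1 = Reviewer #1 (abstractor); R2 = Reviewer #2 (abstractor); Both = both reviewers combined*. Blank precision cells correspond to rows with zero predictions.

| Identifier type (A) | PHI instances (B) | Residual PHI (C) | Expected precision (D) | R1 predictions (E) | R1 correct (F) | R1 recall (G) | R1 precision (H) | R2 predictions (I) | R2 correct (J) | R2 recall (K) | R2 precision (L) | Both* predictions (M) | Both* correct (N) | Both* recall (O) | Both* precision (P) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HIPAA | | | | | | | | | | | | | | | |
| Pat. name | 35 | 6 | 0.17 | 0 | 0 | 0.00 | | 12 | 4 | 0.67 | 0.33 | 12 | 4 | 0.67 | 0.33 |
| Age | 86 | 7 | 0.08 | 5 | 0 | 0.00 | 0.00 | 12 | 0 | 0.00 | 0.00 | 17 | 0 | 0.00 | 0.00 |
| Phone # | 2 | 2 | 1.00 | 0 | 0 | 0.00 | | 1 | 1 | 0.50 | 1.00 | 1 | 1 | 0.50 | 1.00 |
| Address | 6 | 2 | 0.33 | 1 | 0 | 0.00 | 0.00 | 0 | 0 | 0.00 | | 1 | 0 | 0.00 | 0.00 |
| Date | 180 | 17 | 0.09 | 1 | 0 | 0.00 | 0.00 | 35 | 1 | 0.06 | 0.03 | 36 | 1 | 0.06 | 0.03 |
| MRN | 3 | 3 | 1.00 | 0 | 0 | 0.00 | | 0 | 0 | 0.00 | | 0 | 0 | 0.00 | |
| Acct. # | 1 | 1 | 1.00 | 0 | 0 | 0.00 | | 0 | 0 | 0.00 | | 0 | 0 | 0.00 | |
| Other ID #s | 10 | 9 | 0.90 | 0 | 0 | 0.00 | | 2 | 0 | 0.00 | 0.00 | 2 | 0 | 0.00 | 0.00 |
| ALL | 323 | 47 | 0.15 | 7 | 0 | 0.00 | 0.00 | 62 | 6 | 0.13 | 0.10 | 69 | 6 | 0.13 | 0.09 |
| OTHER | | | | | | | | | | | | | | | |
| Prac. name | 82 | 9 | 0.11 | 5 | 4 | 0.44 | 0.80 | 8 | 4 | 0.44 | 0.50 | 12 | 7 | 0.78 | 0.58 |
| Org. name | 27 | 20 | 0.74 | 8 | 6 | 0 | 0.75 | 3 | 1 | 0 | 0.33 | 10 | 7 | 0.35 | 0.70 |
| ALL | 109 | 29 | 0.27 | 13 | 10 | 0 | 0.77 | 11 | 5 | 0.17 | 0.45 | 22 | 14 | 0.48 | 0.64 |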
* Based on the unduplicated count of predictions and correct predictions across the two reviewers.

Expected precision (col. D) is defined as the number of residual PHI instances (col. C) divided by the total number of PHI instances (col. B).
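
The arithmetic behind the table can be sketched as follows. Expected precision follows the footnote definition (col. C divided by col. B). The recall and precision formulas are assumptions inferred from the tabulated values (correct predictions divided by residual PHI, and correct predictions divided by predictions, respectively); the function names are hypothetical and used only for illustration.

```python
def expected_precision(residual_phi, phi_instances):
    """Footnote definition: residual PHI (col. C) / PHI instances (col. B)."""
    return residual_phi / phi_instances


def recall(correct, residual_phi):
    """Assumed definition: correct predictions / residual PHI instances."""
    return correct / residual_phi if residual_phi else None


def precision(correct, predictions):
    """Assumed definition: correct predictions / predictions made.

    Returns None when no predictions were made, which corresponds to the
    blank precision cells in the table.
    """
    return correct / predictions if predictions else None


# Patient-name row, Reviewer #2: B = 35, C = 6, I = 12, J = 4
print(round(expected_precision(6, 35), 2))  # 0.17 (col. D)
print(round(recall(4, 6), 2))               # 0.67 (col. K)
print(round(precision(4, 12), 2))           # 0.33 (col. L)
```

Under these assumed definitions, a reviewer who makes no predictions for an identifier type has a recall of 0.00 and an undefined precision, as in the MRN and Acct. # rows.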