2012 Jul 6;20(2):342–348. doi: 10.1136/amiajnl-2012-001034

Table 3.

Results of the high-density identifier (oncology notes) human detection experiment by identifier type and reviewer

Test corpus				Reviewer #1 (abstractor)				Reviewer #2 (abstractor)				Both reviewers combined^*
Identifier type	PHI instances	Residual PHI	Expected precision^†	Predictions	Correct	Recall	Precision	Predictions	Correct	Recall	Precision	Predictions	Correct	Recall	Precision
A	B	C	D	E	F	G	H	I	J	K	L	M	N	O	P
HIPAA
Pat. name	35	6	0.17	0	0	0.00	–	12	4	0.67	0.33	12	4	0.67	0.33
Age	86	7	0.08	5	0	0.00	0.00	12	0	0.00	0.00	17	0	0.00	0.00
Phone #	2	2	1.00	0	0	0.00	–	1	1	0.50	1.00	1	1	0.50	1.00
Address	6	2	0.33	1	0	0.00	0.00	0	0	0.00	–	1	0	0.00	0.00
Date	180	17	0.09	1	0	0.00	0.00	35	1	0.06	0.03	36	1	0.06	0.03
MRN	3	3	1.00	0	0	0.00	–	0	0	0.00	–	0	0	0.00	–
Acct. #	1	1	1.00	0	0	0.00	–	0	0	0.00	–	0	0	0.00	–
Other ID #s	10	9	0.90	0	0	0.00	–	2	0	0.00	0.00	2	0	0.00	0.00
ALL	323	47	0.15	7	0	0.00	0.00	62	6	0.13	0.10	69	6	0.13	0.09
OTHER
Prac name	82	9	0.11	5	4	0.44	0.80	8	4	0.44	0.50	12	7	0.78	0.58
Org. name	27	20	0.74	8	6	0.30	0.75	3	1	0.05	0.33	10	7	0.35	0.70
ALL	109	29	0.27	13	10	0.34	0.77	11	5	0.17	0.45	22	14	0.48	0.64

^* Based on the unduplicated counts of predictions and of correct predictions across the two reviewers.

^† Defined as the number of residual PHI instances (col. C) divided by the total number of PHI instances (col. B).
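The ratios in the table can be recomputed from the count columns. The following is a minimal Python sketch, not the authors' code. Expected precision follows the footnote definition (col. C divided by col. B). Recall as correct predictions divided by residual PHI, and precision as correct predictions divided by predictions, are assumptions inferred from the table rows. The function names `ratio` and `row_metrics` are hypothetical.

```python
from typing import Dict, Optional


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator rounded to two decimals.

    Returns None when the denominator is zero; the table shows a dash there.
    """
    if denominator == 0:
        return None
    return round(numerator / denominator, 2)


def row_metrics(phi_instances: int, residual_phi: int,
                predictions: int, correct: int) -> Dict[str, Optional[float]]:
    """Recompute the three ratios for one identifier type and one reviewer."""
    return {
        # Footnote definition: col. C / col. B.
        "expected_precision": ratio(residual_phi, phi_instances),
        # Assumption: recall = correct predictions / residual PHI instances.
        "recall": ratio(correct, residual_phi),
        # Assumption: precision = correct predictions / predictions made.
        "precision": ratio(correct, predictions),
    }


# Pat. name row, Reviewer #2: B=35, C=6, predictions=12, correct=4.
print(row_metrics(35, 6, 12, 4))
# prints {'expected_precision': 0.17, 'recall': 0.67, 'precision': 0.33}
```

A reviewer who makes no predictions has undefined precision, which is why `ratio` returns `None` rather than 0 in that case.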