2012 Jul 27;12:109. doi: 10.1186/1471-2288-12-109

Table 2.

PHI category distribution and mapping for the VHA, i2b2 and Swedish Stockholm EPR corpora

| VHA corpus | Instances | i2b2 corpus | Instances | Stockholm EPR De-identified Corpus | Instances |
|---|---|---|---|---|---|
| Patient Name | 206 (3.88%) | Patients | 929 (4.76%) | Person Name: First Name | 923 (20.87%) |
| Relative Name | 30 (0.55%) | | | | |
| Other Person Name | 20 (0.37%) | | | Person Name: Last Name | 929 (21%) |
| Healthcare Provider Name | 492 (9.08%) | Doctors | 3751 (19.24%) | | |
| Street City | 137 (2.53%) | Locations | 263 (1.35%) | Location | 148 (3.35%) |
| State Country | 161 (2.97%) | | | | |
| Zip code | 4 (0.07%) | | | | |
| Deployment | 43 (0.79%) | - | - | - | - |
| Healthcare Unit Name | 1453 (26.83%) | Hospitals | 2400 (12.31%) | Health_Care_Unit | 1021 (23.08%) |
| Other Organization | 86 (1.59%) | - | - | - | - |
| Date | 2547 (47.03%) | Dates | 7098 (36.40%) | Date_Part | 710 (16.05%) |
| | | | | Full_Date | 500 (11.30%) |
| Age > 89 | 4 (0.07%) | Ages | 16 (0.08%) | Age | 56 (1.27%) |
| Phone Number | 90 (1.66%) | Phone Numbers | 232 (1.19%) | Phone Number | 136 (3.07%) |
| Electronic Address | 4 (0.07%) | - | - | - | - |
| SSN | 16 (0.30%) | IDs | 4809 (24.66%) | - | - |
| Other ID Number | 123 (2.27%) | | | - | - |