Box 1.
Process of Deterministic Entity Resolution
Step 1: Assign each distinct pair (valid NHS number, valid date of birth) a distinct CUREd identifier (CUREd ID).
Step 2: Attempt to link records with a valid NHS number but no valid date of birth to a CUREd ID based on approximate birth year (calculated from activity date and age at activity).
Step 3: Attempt to link remaining records to an assigned CUREd ID by matching provider code, provider patient ID, and date of birth (provided this matches only 1 CUREd ID).¹
Step 4: Attempt to link remaining records to an assigned CUREd ID by matching first name, last name, sex, date of birth, and postcode (provided this matches only 1 CUREd ID).²
Step 5: Attempt to link remaining records to an assigned CUREd ID by matching sex, date of birth, and postcode (provided this matches only 1 CUREd ID).³
Step 6: Cluster remaining records by agreement on any of the following patterns, and assign each distinct cluster to a new CUREd ID:
  1. Provider code, provider patient ID, and date of birth¹
  2. First name, last name, sex, date of birth, and postcode²
  3. Sex, date of birth, and postcode³,⁴
Step 7: Assign each remaining record to its own CUREd ID.

¹ Ambulance records were excluded as no provider patient ID was available.
² NHS 111 helpline records were excluded as names were not available.
³ We excluded the 1% of postcodes with the greatest number of distinct registered patients. These likely represent communal establishments, such as prisons.
⁴ Ambulance records were excluded as recorded postcodes related to incident locations rather than place of residence.
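The cascade in Box 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. All field names (`nhs`, `dob`, `approx_birth_year`, and so on) are hypothetical, validation of NHS numbers and dates is assumed to have happened upstream, dates of birth are assumed to be ISO strings, and the one-year tolerance in Step 2 is an assumption because the box does not state one. The sketch also assumes that records linked in an earlier step can serve as link targets in later steps, and that Step 6 clusters are closed transitively. The source-specific exclusions in the footnotes are omitted.

```python
from collections import defaultdict
from itertools import count

# Hypothetical field names; the source does not specify a schema.
KEYS = [
    ("provider_code", "provider_patient_id", "dob"),        # Steps 3 and 6.1
    ("first_name", "last_name", "sex", "dob", "postcode"),  # Steps 4 and 6.2
    ("sex", "dob", "postcode"),                             # Steps 5 and 6.3
]

def key_of(rec, fields):
    """Return the tuple of field values, or None if any field is missing."""
    vals = tuple(rec.get(f) for f in fields)
    return None if any(v is None for v in vals) else vals

def resolve(records):
    """Set rec['cured_id'] on every record dict (in place) and return the list."""
    next_id = count(1)

    # Step 1: one ID per distinct (NHS number, date of birth) pair.
    pair_ids = {}
    for r in records:
        r["cured_id"] = None
        if r.get("nhs") and r.get("dob"):
            pair = (r["nhs"], r["dob"])
            if pair not in pair_ids:
                pair_ids[pair] = next(next_id)
            r["cured_id"] = pair_ids[pair]

    # Step 2: NHS number without a date of birth; compare birth years.
    # The +/- 1 year tolerance is an assumption, not taken from the source.
    for r in records:
        year = r.get("approx_birth_year")
        if r["cured_id"] is None and r.get("nhs") and year is not None:
            hits = {cid for (nhs, dob), cid in pair_ids.items()
                    if nhs == r["nhs"] and abs(int(dob[:4]) - year) <= 1}
            if len(hits) == 1:
                r["cured_id"] = hits.pop()

    # Steps 3-5: link to an already assigned ID only if the match is unique.
    for fields in KEYS:
        index = defaultdict(set)
        for r in records:
            k = key_of(r, fields)
            if r["cured_id"] is not None and k is not None:
                index[k].add(r["cured_id"])
        for r in records:
            k = key_of(r, fields)
            if r["cured_id"] is None and k is not None and len(index.get(k, ())) == 1:
                r["cured_id"] = next(iter(index[k]))

    # Step 6: cluster the rest with union-find; agreement on any key merges.
    rest = [r for r in records if r["cured_id"] is None]
    parent = list(range(len(rest)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}
    for i, r in enumerate(rest):
        for n, fields in enumerate(KEYS):
            k = key_of(r, fields)
            if k is not None:
                j = first_seen.setdefault((n, k), i)
                parent[find(i)] = find(j)

    # Step 7 falls out naturally: an unmatched record is a cluster of one.
    root_ids = {}
    for i, r in enumerate(rest):
        root = find(i)
        if root not in root_ids:
            root_ids[root] = next(next_id)
        r["cured_id"] = root_ids[root]
    return records
```

Requiring exactly one candidate ID in Steps 2 to 5 mirrors the "provided this matches only 1 CUREd ID" condition: an ambiguous record is left unlinked and falls through to the later, weaker steps rather than being assigned arbitrarily.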