2022 May 6;10(5):727. doi: 10.3390/vaccines10050727
Linkage variables used to generate tokens
Token 1
  • Patient last name

  • First initial of patient first name

  • Patient gender

  • Patient DOB

Token 2
  • Patient last name (Soundex)

  • Patient first name (Soundex)

  • Patient gender

  • Patient DOB
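The two field sets above can be sketched in code. The following is a minimal illustration, not the de-identification software's own implementation: it uses standard American Soundex as a stand-in for whichever Soundex variant the software applies, and the function names `soundex`, `token1_fields`, and `token2_fields` are hypothetical. How the fields are then combined or hashed into a token is not described here, so the sketch stops at assembling the inputs.

```python
# Letter-to-digit table for standard American Soundex (an assumption:
# the source does not say which Soundex variant the software uses).
_CODES = {
    **dict.fromkeys("BFPV", "1"),
    **dict.fromkeys("CGJKQSXZ", "2"),
    **dict.fromkeys("DT", "3"),
    "L": "4",
    **dict.fromkeys("MN", "5"),
    "R": "6",
}


def soundex(name: str) -> str:
    """Return the 4-character Soundex code, or "" if name has no A-Z letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = _CODES.get(first, "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters that share a code
        code = _CODES.get(c, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated code after it counts again
    return (first + "".join(digits) + "000")[:4]


def token1_fields(last_name: str, first_name: str, gender: str, dob: str) -> tuple:
    """Inputs for Token 1: last name, first initial, gender, DOB."""
    return (last_name, first_name[:1], gender, dob)


def token2_fields(last_name: str, first_name: str, gender: str, dob: str) -> tuple:
    """Inputs for Token 2: Soundex of last and first name, gender, DOB."""
    return (soundex(last_name), soundex(first_name), gender, dob)
```

Under this sketch, "Robert" and "Rupert" both encode to R163, which shows why Token 2 tolerates spelling variation that would break an exact match on Token 1. The DOB is passed through as an opaque string because its format is not specified here.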

Cleaning and pre-processing of linkage variables
The de-identification software applies a series of validators and cleaners before a token is generated.
Validators
  1. First name and last name must each be longer than 1 character

  2. Gender: the field must be M, F, female, or male (case insensitive)

  3. If any field fails a validation test, the token is not created
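The validation rules above could be expressed roughly as follows. This is a sketch under assumptions: the function names are hypothetical, surrounding whitespace is stripped before checking (the source does not say whether it is), and the gender rule is read as accepting the single letters M or F as well as the words male or female.

```python
ACCEPTED_GENDERS = {"m", "f", "male", "female"}


def valid_name(value: str) -> bool:
    """Rule 1: a name must be longer than 1 character."""
    return len(value.strip()) > 1


def valid_gender(value: str) -> bool:
    """Rule 2: gender must be M, F, male, or female, ignoring case."""
    return value.strip().lower() in ACCEPTED_GENDERS


def can_create_token(last_name: str, first_name: str, gender: str) -> bool:
    """Rule 3: if any field fails validation, no token is created."""
    return (
        valid_name(last_name)
        and valid_name(first_name)
        and valid_gender(gender)
    )
```

For example, `can_create_token("Smith", "Jo", "Female")` passes, while a single-letter last name or an unrecognized gender value causes the record to be skipped.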

Cleaners
  1. Removes all non-alphabetic characters. Alphabetic characters include A–Z and a–z

  2. Removes all non-numeric characters. Numeric characters include 0–9

  3. Removes all characters that are not a number (0–9) or letter (A–Z + a–z)

  4. Capitalizes all alphabetic characters (a–z → A–Z)
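Each of the four cleaners above amounts to a simple string transformation. The sketch below is illustrative only: the function names are hypothetical, and which cleaner is applied to which linkage variable is not stated here, so the pairing in the usage note is an assumption.

```python
import re
import string

# Maps a-z to A-Z only, matching the stated range (a-z -> A-Z).
_ASCII_UPPER = str.maketrans(string.ascii_lowercase, string.ascii_uppercase)


def keep_alpha(value: str) -> str:
    """Cleaner 1: remove everything except A-Z and a-z."""
    return re.sub(r"[^A-Za-z]", "", value)


def keep_numeric(value: str) -> str:
    """Cleaner 2: remove everything except 0-9."""
    return re.sub(r"[^0-9]", "", value)


def keep_alphanumeric(value: str) -> str:
    """Cleaner 3: remove everything except 0-9, A-Z, and a-z."""
    return re.sub(r"[^A-Za-z0-9]", "", value)


def capitalize_all(value: str) -> str:
    """Cleaner 4: convert a-z to A-Z, leaving other characters unchanged."""
    return value.translate(_ASCII_UPPER)
```

Assuming names pass through cleaners 1 and 4, `capitalize_all(keep_alpha("O'Brien-Smith"))` gives `OBRIENSMITH`. Assuming a DOB written as `1980-01-31` passes through cleaner 2, `keep_numeric` gives `19800131`.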

DOB: date of birth.