Table 5. Characteristics of records with discordant classification by deterministic and probabilistic record linkage.
Record characteristics | Pair by deterministic and non-pair by probabilistic linkage (N = 1,285) | | Pair by probabilistic and non-pair by deterministic linkage (N = 527) | |
---|---|---|---|---|
| Median | 95% CI | Median | 95% CI |
Score | - | - | 24.2 | 20.3–32.3 |
| n | % | n | % |
Missing sex value | 0 | 0 | 0 | 0 |
Missing name value | 0 | 0 | 0 | 0 |
Missing mother’s name value | 68 | 5.3 | 76 | 14.4 |
Missing date of birth value | 81 | 6.3 | 58 | 11.0 |
Missing address value | 7 | 0.5 | 20 | 3.8 |
Combined: unknown mother’s name, unknown date of birth, or unknown address | 129 | 10.0 | 141 | 26.7 |
Link characteristics | Pair by deterministic and non-pair by probabilistic linkage (a, b) (N = 733) | | Pair by probabilistic and non-pair by deterministic linkage (a, b) (N = 293) | |
| n | % | n | % |
Difference in sex | 115 | 15.7 | 0 | 0 |
Similarity measure for name lower than 70.0% | 160 | 25.9 | 0 | 0 |
Similarity measure for mother’s name lower than 70.0% | 147 | 26.6 | 36 | 16.6 |
Similarity measure for date of birth lower than 70.0% | 45 | 8.3 | 25 | 10.6 |
| Median | 95% CI | Median | 95% CI |
Similarity measure for name x 100 | 81.6 | 69.4–94.4 | 100 | 95.0–100 |
Similarity measure for mother’s name x 100 | 91.3 | 68.2–100 | 94.4 | 85.7–100 |
Similarity measure for date of birth | 100 | 10.0–100 | 87.5 | 75.0–100 |
a Calculating the Levenshtein distance and assessing differences in sex required comparing records within the group of records belonging to the same patient. Discordant records were compared as follows: (i) when the two discordant records were identified by only one of the techniques, the calculation was done between them; (ii) when the group contained records identified by both techniques and only one record was not identified by one of the techniques, the calculation compared the unidentified record with the highest-scoring record in the group.
b For this calculation, groups of records that were blocked by sex or had missing information for one of the variables were excluded.
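
The comparison described in footnote a can be illustrated with a minimal Python sketch. The table does not state how the Levenshtein distance was converted into a similarity measure, so the normalization used here (one minus the distance divided by the length of the longer string, times 100) is an assumption, as are the function names and the `score` field. Only the 70.0% cut-off and the rule of comparing against the highest-scoring record are taken from the table and its footnote.

```python
# Hypothetical sketch; not the authors' implementation.

LOW_SIMILARITY_CUTOFF = 70.0  # percent; cut-off used in the table rows


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity_pct(a: str, b: str) -> float:
    """Assumed normalization: 100 * (1 - distance / longer length)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1.0 - levenshtein(a, b) / longest)


def is_low_similarity(a: str, b: str) -> bool:
    """True when the similarity falls below the 70.0% cut-off."""
    return similarity_pct(a, b) < LOW_SIMILARITY_CUTOFF


def highest_scoring(group: list[dict]) -> dict:
    """Case (ii): the record in the group with the highest linkage score."""
    return max(group, key=lambda record: record["score"])
```

Under these assumptions, `similarity_pct("ANA", "ANNA")` gives 75.0, which would not count as low similarity, while two names with no characters in common give 0 and would.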