2023 Feb 28;8(1):1771. doi: 10.23889/ijpds.v8i1.1771

Table 1: Distribution of duplicates identified between 2015 and 2020 by linkage criteria.

| Mapping criteria | Matching pairs (N) | Within category (%) | Overall (%) |
|---|---|---|---|
| High probable | | | |
| exact RSA ID, *simscore(surname) >= 0.85, simscore(date of birth) >= 0.95 | 10 070 | 86.21 | 3.47 |
| exact RSA ID, simscore(firstnames) >= 0.85 | 748 | 6.40 | 0.26 |
| exact RSA ID, simscore(date of birth) >= 0.95 | 640 | 5.48 | 0.22 |
| exact RSA ID, switch(exact surname, exact firstnames) | 206 | 1.76 | 0.07 |
| exact RSA ID, simscore(surname) >= 0.85 | 17 | 0.15 | 0.01 |
| Subtotal | 11 681 | 100.00 | 4.02 |
| Probable | | | |
| exact firstnames, exact surname, exact date of birth (incl. RSA ID check) | 80 386 | 77.86 | 27.70 |
| switch(exact surname, exact firstnames), exact date of birth | 22 861 | 22.14 | 7.88 |
| Subtotal | 103 247 | 100.00 | 35.57 |
| High possible | | | |
| exact date of birth, exact firstnames, simscore(surname) >= 0.85 (incl. RSA ID check) | 67 102 | 38.27 | 23.12 |
| exact surname, exact date of birth, simscore(firstnames) >= 0.85 (incl. RSA ID check) | 59 622 | 34.01 | 20.54 |
| exact surname, exact firstnames, simscore(date of birth) >= 0.95 (incl. RSA ID check) | 42 324 | 24.14 | 14.58 |
| exact firstnames, simscore(surname) >= 0.85, simscore(date of birth) >= 0.95 (incl. RSA ID check) | 6 273 | 3.58 | 2.16 |
| Subtotal | 175 321 | 100.00 | 60.40 |
| Grand total | 290 249 | | 100.00 |
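
The two percentage columns in Table 1 can be reproduced from the pair counts alone. The sketch below is illustrative rather than part of the original analysis: it assumes that "Within category (%)" is each count divided by its category subtotal and that "Overall (%)" is each count divided by the grand total, both rounded to two decimals. The names `counts` and `shares` are hypothetical helpers introduced here.

```python
# Pair counts transcribed from Table 1, grouped by match category.
# Assumption: percentages are simple ratios rounded to two decimals.
counts = {
    "High probable": [10070, 748, 640, 206, 17],
    "Probable": [80386, 22861],
    "High possible": [67102, 59622, 42324, 6273],
}

# Grand total of matching pairs across all categories.
grand_total = sum(sum(rows) for rows in counts.values())


def shares(n, subtotal, total):
    """Return (within-category %, overall %) for one criterion's pair count."""
    return round(100 * n / subtotal, 2), round(100 * n / total, 2)


for category, rows in counts.items():
    subtotal = sum(rows)
    # Category line: name, subtotal, and the category's share of all pairs.
    print(category, subtotal, round(100 * subtotal / grand_total, 2))
    for n in rows:
        within, overall = shares(n, subtotal, grand_total)
        print(f"  {n:>7} {within:6.2f} {overall:6.2f}")
```

Under these assumptions the recomputed subtotals and grand total agree with the values reported in the table, which is a quick consistency check on the transcription.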