Table 4: Matching features added or enhanced in the new data linkage system (DLS3).
Feature | Description & purpose | Description of legacy file-based linkage software (FLS) capability | Description of DLS3 capability |
---|---|---|---|
Matching functions | Used to compare different types of fields (e.g. string, numeric, date, location), including parameters for error tolerance. | New match functions cannot be added and existing functions cannot be modified. | Match functions can be designed, added and updated as required. |
Matching conditions | Uses inexact comparisons and permissible field values to further restrict which records may be considered for matching. | No match conditions in FLS. Limited to sub-setting of entire input file (e.g. linking only to women). | Match conditions allow excluding some match pairs based on conditions other than exact matching (as implemented by blocking). |
Data preparation functions | Clean, standardise and transform data in order to improve likelihood of matching with other datasets. | None in FLS. All data preparation must be implemented using custom add-ons developed by DLB, prior to running the FLS. | Can apply additional data cleaning or transformation at linkage stage; for example address standardisation or phonetic transforms. |
Frequency calculations | Allows matching functions to be configured to give uncommon field values (e.g. surnames) more weight than common ones. | Only for the 100 most frequent values. | Frequencies for all values. Can use conditional frequencies (e.g. name frequencies for male vs female). Chain-based rather than record-based (to reduce bias in event-based data). |
Cardinality restrictions | Allows linkage outcomes to be restricted (one-to-one, one-to-many or many-to-one) to meet expectations of the dataset. | 1:1, 1:N, N:1 matching restrictions are possible, but are limited to post-linkage checking (prior to loading links) on a record-to-record basis. | 1:1, 1:N, N:1 matching restrictions are possible on a chain-to-chain basis, and can be triggered as part of a linkage strategy. |
Linkage metadata | Capture current and historical information about software, systems, data and linkage keys. | No metadata recording built in. Must be done manually by users. | Metadata recording built in. Includes: |
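
The frequency calculations row can be illustrated with a minimal sketch. The table does not describe how DLS3 computes its weights, so the following assumes a common approach in which the agreement weight for a value is the base-2 logarithm of the inverse of its relative frequency. The function names `frequency_weights` and `agreement_weight`, the sample surnames and the disagreement penalty are all hypothetical.

```python
import math
from collections import Counter


def frequency_weights(values):
    """Map each observed value to log2(total / count).

    Rare values receive a larger weight than common ones.
    This formula is an assumption, not the documented DLS3 method.
    """
    counts = Counter(values)
    total = sum(counts.values())
    return {value: math.log2(total / count) for value, count in counts.items()}


def agreement_weight(a, b, weights, disagreement=-2.0):
    """Score a pair of field values using frequency-based weights.

    Disagreeing values receive a fixed (hypothetical) penalty.
    Agreeing values not seen when the weights were built are treated
    as being as rare as the rarest observed value.
    """
    if a != b:
        return disagreement
    return weights.get(a, max(weights.values()))


# Hypothetical surname field: frequencies are computed over all values,
# not only the most frequent ones.
surnames = ["smith", "smith", "smith", "jones", "jones",
            "zubrzycki", "nguyen", "nguyen"]
weights = frequency_weights(surnames)

rare_score = agreement_weight("zubrzycki", "zubrzycki", weights)
common_score = agreement_weight("smith", "smith", weights)
mismatch_score = agreement_weight("smith", "jones", weights)
```

In this sketch, agreement on the rare surname scores higher than agreement on the common one. Conditional frequencies of the kind the table mentions could be approximated by building one weight table per subgroup (for example per sex) and choosing the table according to the records being compared.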