2014 Jan 16;2014:bat086. doi: 10.1093/database/bat086

Table 2.

Categorization of article fields

Category	Field-group	Notes
Decisive	PMID	Median/low applicability
		Single selectivity
		High accuracy
	DOI	Median/low applicability
		Single selectivity
		High accuracy
Reliable partially decisive	Year	High applicability (rarely null)
		Low selectivity
		High accuracy
	ISSN and EISSN	Median applicability
		Median selectivity
		High accuracy
	Journal name	High applicability
		Median selectivity
		Median accuracy
		Abbreviations are common
	Title	High applicability
		High selectivity
		Median accuracy
		Parts are sometimes missing
Useful but not reliable	Paging group (volume and issue and page)	Median applicability
		High selectivity
		Median accuracy
		Missing parts are common
	Author list	High applicability
		High selectivity
		Low accuracy
		Missing authors are not rare
		Name word order may vary

Record fields can be categorized by their applicability, selectivity and accuracy. Applicability is the number of non-null values divided by the total number of records; a field with very few null or empty values has high applicability. The average selectivity of a field is 1 – (1/number of unique field values); if each value of a field is shared by only a few records, the field has high selectivity. In particular, a field with single selectivity (a decisive field) is one in which every non-null value is unique among records. Accuracy is the average probability that any given value in the field is correct. A field with low accuracy is a poor duplication indicator.
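
As a rough illustration of the first two definitions, the sketch below computes applicability and selectivity for one field given its values across all records. It assumes that None and the empty string both count as null, and the function names are hypothetical, not taken from the article. Accuracy is left out because it cannot be computed without knowing which values are correct.

```python
from typing import List, Optional


def non_null(values: List[Optional[str]]) -> List[str]:
    """Keep only values that are neither None nor empty (assumed null markers)."""
    return [v for v in values if v is not None and v != ""]


def applicability(values: List[Optional[str]]) -> float:
    """Number of non-null values divided by the total number of records."""
    if not values:
        raise ValueError("applicability is undefined for zero records")
    return len(non_null(values)) / len(values)


def selectivity(values: List[Optional[str]]) -> Optional[float]:
    """1 - (1 / number of unique non-null values); None if the field is all null."""
    unique_count = len(set(non_null(values)))
    if unique_count == 0:
        return None
    return 1 - 1 / unique_count


def has_single_selectivity(values: List[Optional[str]]) -> bool:
    """True when every non-null value occurs in exactly one record."""
    present = non_null(values)
    return len(set(present)) == len(present)


# Made-up sample data for four records (not from the article).
pmid_values = ["101", None, "102", ""]
year_values = ["2013", "2013", "2014", "2013"]
```

With this sample data, the PMID-like field is null in half of the records but never repeats a value, so has_single_selectivity returns True for it, while the year-like field is always present but shares values across records.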