Journal of Epidemiology and Community Health. 1988 Dec;42(4):390–395. doi: 10.1136/jech.42.4.390

The classification of ethnic status using name information.

A J Coldman 1, T Braun 1, R P Gallagher 1
PMCID: PMC1052770  PMID: 3076894

Abstract

Methodology is developed to classify ethnic status by name using a simple probabilistic model. The method considers four rules that may be used to classify individuals from three name components (first, middle and last names). To do this, conditional probabilities of ethnic status are estimated from a sample in which ethnic status is known. Using a split sample technique, the sensitivity and specificity of this methodology were examined in a data set of death registrations. Each of the classification rules performed well on the data from which it was constructed but was less efficient when applied to another population. Nevertheless, a linear model, in which the sum of the conditional probabilities of each name component is used, achieved a sensitivity and specificity of 97% and 100% respectively in males and 89% and 100% in females.
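The linear rule described in the abstract can be illustrated with a minimal sketch. It assumes that the conditional probability for each name component is estimated as the simple proportion of labelled individuals bearing that name who belong to the target group, that names unseen in the labelled sample contribute zero, and that an individual is classified as a member when the summed score reaches a cutoff. The function names, the zero default and the cutoff of 1.0 are hypothetical choices for illustration; the abstract does not state the estimator details or the threshold used by the authors.

```python
from collections import defaultdict

COMPONENTS = ("first", "middle", "last")


def estimate_probabilities(records):
    """Estimate P(target group | name) for each name component.

    `records` is an iterable of (names, is_target) pairs, where `names`
    maps a component ("first", "middle", "last") to a name string.
    The estimate is the plain proportion in the labelled sample (an assumption).
    """
    counts = {c: defaultdict(lambda: [0, 0]) for c in COMPONENTS}
    for names, is_target in records:
        for c in COMPONENTS:
            name = names.get(c)
            if name:
                tally = counts[c][name.lower()]
                tally[0] += int(is_target)  # members of the target group
                tally[1] += 1               # all individuals with this name
    return {
        c: {n: hits / total for n, (hits, total) in table.items()}
        for c, table in counts.items()
    }


def linear_score(names, probs, default=0.0):
    """Sum the conditional probabilities over the available name components."""
    total = 0.0
    for c in COMPONENTS:
        name = names.get(c)
        if name:
            # Unseen names fall back to `default` (hypothetical choice).
            total += probs[c].get(name.lower(), default)
    return total


def classify(names, probs, threshold=1.0):
    """Classify as target group when the score reaches the (hypothetical) cutoff."""
    return linear_score(names, probs) >= threshold


def sensitivity_specificity(records, probs, threshold=1.0):
    """Evaluate the rule on labelled records, e.g. the held-out half of a split sample."""
    tp = fn = tn = fp = 0
    for names, is_target in records:
        predicted = classify(names, probs, threshold)
        if is_target:
            tp += predicted
            fn += not predicted
        else:
            fp += predicted
            tn += not predicted
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

In a split sample design, `estimate_probabilities` would be run on one half of the labelled registrations and `sensitivity_specificity` on the other half, which is what allows the drop in performance on a different population to be observed.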

Articles from Journal of Epidemiology and Community Health are provided here courtesy of BMJ Publishing Group
