Abstract
Genes and proteins are often associated with multiple names, and more names are added as new functional or structural information is discovered. Because authors often alternate between these synonyms, information retrieval and extraction benefit from identifying them. We have developed a method to automatically extract synonymous gene and protein names from MEDLINE and journal articles. We first identified patterns that authors use to list synonymous gene and protein names. We then developed SGPE (for synonym extraction of gene and protein names), a software program that recognizes these patterns and extracts candidate synonymous terms from MEDLINE abstracts and full-text journal articles. SGPE then applies a sequence of filters that automatically screen out terms that are not gene or protein names. In our evaluation, the method achieved an overall precision of 71% across both MEDLINE and journal articles, and 90% precision on the more suitable full-text articles alone.
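The two-stage design described above (pattern-based extraction of candidate pairs, followed by filtering) can be illustrated with a minimal sketch. This is not SGPE itself: the abstract does not list the actual patterns or filters, so the cue phrases, the stopword list, the capitalization/digit heuristic, and all function names below are hypothetical stand-ins chosen only to show the overall shape of such a pipeline. The precision helper simply computes the fraction of extracted pairs that appear in a hand-checked set.

```python
import re

# ASSUMPTION: illustrative cue phrases only; the abstract does not give SGPE's patterns.
CUES = r"(?:also known as|also called|also termed)"
# A candidate name: starts with a letter or digit, may contain hyphens and slashes.
NAME = r"[A-Za-z0-9][A-Za-z0-9/-]*"
# Matches "X (also known as Y" or "X, also called Y".
PATTERN = re.compile(rf"({NAME})\s*[,(]\s*{CUES}\s+({NAME})")

# ASSUMPTION: a toy stopword list standing in for SGPE's real filters.
STOPWORDS = {"the", "a", "an", "this", "that", "it"}


def extract_candidates(text):
    """Stage 1: return every (name, synonym) pair matched by the cue-phrase pattern."""
    return [(m.group(1), m.group(2)) for m in PATTERN.finditer(text)]


def looks_like_gene_name(term):
    """Stage 2 (hypothetical filter): reject stopwords and all-lowercase plain words."""
    if term.lower() in STOPWORDS:
        return False
    return any(c.isupper() or c.isdigit() for c in term)


def extract_synonyms(text):
    """Run both stages: extract candidate pairs, then keep pairs where both terms pass the filter."""
    return [
        (a, b)
        for a, b in extract_candidates(text)
        if looks_like_gene_name(a) and looks_like_gene_name(b)
    ]


def precision(extracted, correct):
    """Fraction of distinct extracted pairs that are in the hand-checked set of correct pairs."""
    extracted = set(extracted)
    if not extracted:
        return 0.0
    return len(extracted & set(correct)) / len(extracted)


if __name__ == "__main__":
    sample = (
        "Mutations in CDKN1A (also known as p21) were found. "
        "The protein, also called it here, was absent."
    )
    print(extract_candidates(sample))  # [('CDKN1A', 'p21'), ('protein', 'it')]
    print(extract_synonyms(sample))    # [('CDKN1A', 'p21')]
```

In this sketch the first stage deliberately over-generates and the second stage discards the spurious pair, which mirrors the extract-then-filter order the abstract describes; a real system would need far richer patterns and filters than these placeholders.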
