Proceedings of the AMIA Symposium. 2001:682–686.

Comparing frequency of word occurrences in abstracts and texts using two stop word lists.

K Su, J E Ries, G M Peterson, M E Cullinan Sievert, T B Patrick, D E Moxley, L D Ries
PMCID: PMC2243477  PMID: 11825272

Abstract

Retrieval tests have assumed that the abstract is a true surrogate of the entire text. However, the frequency of terms in abstracts has never been compared with that in the articles they represent. Even though many sources are now available in full text, many still rely on the abstract for retrieval. A total of 1,138 articles with their abstracts were downloaded from the Journal of the American Medical Association, the New England Journal of Medicine, the British Medical Journal, and the Lancet. Based on two stop word lists, one long and one short, content-bearing words were extracted from the articles and their abstracts, and the frequency of each word was counted in both sources. Each article and its abstract were tested using a chi-squared test to determine whether the words in the abstract occurred as frequently as would be expected. Of the abstracts tested, 96% to 98% were not significantly different from random samples of the articles they represented. In these four journals, the abstracts are lexical, as well as intellectual, surrogates for the articles they represent.
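
The procedure described in the abstract (stop word removal, word counting, and a chi-squared comparison of abstract against article) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tiny stop word list, the tokenization rule, the function names, and the choice to restrict the comparison to the article's vocabulary are all assumptions made for illustration, since the abstract does not give these details.

```python
import re
from collections import Counter

# Hypothetical, tiny stop word list; the paper's long and short lists
# are not reproduced in the abstract.
STOP_WORDS = {"the", "of", "and", "a", "in", "to", "is", "was", "were", "with"}


def content_words(text, stop_words=STOP_WORDS):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in stop_words]


def chi_squared_statistic(abstract, article, stop_words=STOP_WORDS):
    """Goodness-of-fit statistic for abstract word counts against the
    word proportions of the article.

    Assumption: only words that occur in the article are compared, so an
    abstract word absent from the article is ignored.
    """
    art_counts = Counter(content_words(article, stop_words))
    abs_counts = Counter(content_words(abstract, stop_words))
    n_art = sum(art_counts.values())
    if n_art == 0:
        raise ValueError("article contains no content-bearing words")
    # Total abstract tokens, restricted to the article's vocabulary.
    n_abs = sum(abs_counts[w] for w in art_counts)

    stat = 0.0
    for word, art_n in art_counts.items():
        # Count expected if the abstract were a random sample of the article.
        expected = n_abs * art_n / n_art
        observed = abs_counts.get(word, 0)
        stat += (observed - expected) ** 2 / expected
    dof = len(art_counts) - 1
    return stat, dof


stat, dof = chi_squared_statistic("heart risk", "heart disease heart risk")
print(stat, dof)
```

The returned statistic would then be compared with the chi-squared distribution for `dof` degrees of freedom at a chosen significance level, for example with `scipy.stats.chi2.sf(stat, dof)`. Real texts produce many words with very small expected counts, so an actual analysis would likely need to pool or exclude rare words before the chi-squared approximation is reasonable.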



Articles from Proceedings of the AMIA Symposium are provided here courtesy of American Medical Informatics Association
