Abstract
What is the best way to represent the content of documents in an information retrieval system? This study compares the retrieval effectiveness of five different methods for automated (machine-assigned) indexing using three test collections. The consistently best methods are those that use indexing based on the words that occur in the available text of each document. Methods used to map text into concepts from a controlled vocabulary showed no advantage over the word-based methods. This study also examined an approach to relevance feedback, which showed a benefit for both word-based and concept-based methods.
