The National Library of Medicine's (NLM's) success with Web offerings of MEDLINE has brought Medical Subject Headings (MeSH) indexing, once familiar mostly to librarians and researchers, to an entire community of students, teachers, researchers, and practitioners in the life sciences and medicine as well as to the general public. The emergence of MeSH as the accepted standard, however, creates dilemmas for indexers outside NLM. Conflicts between MeSH and subject terms suggested in text may be ignored or resolved with cross-references in indexing textbooks and monographs, but indexers dealing with periodical literature consisting of articles from many different authors depart from MeSH only advisedly, especially when indexing for a readership of frequent MEDLINE users. Professional researchers may regard cross-referencing to appropriate MeSH terms as a minimum functionality in any system for searching the medical literature, their expectations set by their experience with popular MEDLINE interfaces that automatically map keywords to MeSH. In a recent review of reference-management software in JAMA, for example, Satya-Murti, citing the search of MEDLINE as “an irreducible step” in medical research, faults the products under consideration for the lack of MeSH cross-referencing, concluding that “concepts such as continuous glucose monitoring or stool test may produce variable and irrelevant results. To aid in specificity, mapping a free-text term to MeSH vocabulary is critical” [1].
In general, in any index that cumulates work from different sources over time, indexers have little choice but to resort to a controlled vocabulary to overcome the variability of natural language. In a 1995 appraisal of science and technology databases, Milstead concluded simply that thesaurus-based indexing “is not an issue” [2]. Since then, however, the prevalence of Web search engines that search a mass of Web pages for keywords regardless of context has led many, not knowing any better, to accept “variable and irrelevant” search results as a fact of life. They may wonder why no one has yet devised a way to meet such basic and recurring needs as, say, retrieving recent review articles on tuberculosis, when in fact these problems were solved long ago. The literature of both librarianship and medicine now regularly features cautionary articles on the perils of free-text searching through such popular, all-purpose portals as Google or AltaVista that offer neither the flexibility nor the precision required by researchers and, increasingly, by consumers. A recent evaluation of health care information on the Internet published in JAMA rated access through such search engines as “poor and inconsistent” [3].
Unfortunately for indexers working outside NLM, rigorous mapping or cross-referencing to a controlled vocabulary to meet the expectations of a professional clientele is often easier said than done. Medical literature is vast and indiscrete, incorporating many fields of knowledge from the social sciences to the cutting edge of pure science. Building on an established system and with experienced staff, NLM maintains multiple subject hierarchies within MeSH, affording its staff and users orderly analysis of, and access to, the spectrum of journals it indexes. However, MeSH is far from complete or perfect. Not even an institution as large and experienced as NLM can keep up, and only overachievers or irrepressible optimists would think of establishing a competing authority. Where there are gaps in MeSH, introducing local terms to cover rapidly proliferating terminology in such fields as biotechnology inevitably results in an uncoordinated scattering of references under provisional headings, one of the traps MeSH is specifically designed to avoid. Few indexers, though, have the luxury of putting a job aside to wait for a MeSH revision, and tracking non-MeSH terms and replacing them retrospectively as MeSH is updated is a thankless chore, not supported by any off-the-shelf indexing package.
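The chore of tracking provisional headings can be pictured with a minimal sketch. It assumes that each index entry carries a flag marking it as a local, non-MeSH term and that the indexer keeps a table of local terms later covered by a MeSH revision. The names `IndexEntry` and `reconcile`, and all the headings shown, are hypothetical and are not drawn from any existing indexing package.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    heading: str               # subject heading as printed in the index
    locator: str               # volume/page reference or article identifier
    provisional: bool = False  # True when the heading is a local, non-MeSH term


def reconcile(entries, adopted):
    """Replace provisional headings that a MeSH revision now covers.

    `adopted` maps a local heading to the MeSH heading that supersedes it.
    Entries whose headings are not in the table are returned unchanged.
    """
    updated = []
    for entry in entries:
        if entry.provisional and entry.heading in adopted:
            updated.append(IndexEntry(adopted[entry.heading], entry.locator))
        else:
            updated.append(entry)
    return updated


# Hypothetical data: placeholder headings, not real MeSH terms.
entries = [
    IndexEntry("local term A", "57:101", provisional=True),
    IndexEntry("local term B", "57:230", provisional=True),
    IndexEntry("established heading", "57:12"),
]
adopted = {"local term A": "new MeSH heading A"}
revised = reconcile(entries, adopted)
```

Only the first entry changes here. The second stays provisional until a later revision covers it, which is exactly the scattering the paragraph above warns about.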
In most instances, therefore, where MeSH is lacking, indexers must rely on the text regardless of the inconsistencies they may be introducing to cumulated indexes. Even where appropriate MeSH terms exist, non-NLM indexers may also be obliged to follow the style and editorial policies of the publishers they work for, opting for preferred local forms of entry, cross-referencing from MeSH terms rather than to them. AMA journal editors, for example, prefer the less authoritarian “patient adherence” to NLM's “patient compliance” and the more popular “detoxification” to MeSH's “metabolic detoxication, drug,” and these preferences are reflected in their journal indexes.
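Cross-referencing from MeSH terms to preferred local forms can be sketched as a simple lookup table. The two term pairs below come from the AMA examples above. The function names and the "See" wording of the cross-reference lines are assumptions made for illustration and do not describe any publisher's actual system.

```python
# MeSH heading -> locally preferred heading (pairs taken from the text above).
LOCAL_PREFERRED = {
    "patient compliance": "patient adherence",
    "metabolic detoxication, drug": "detoxification",
}


def see_references(preferred_by_mesh):
    """Build 'MeSH term. See local term' lines, sorted by MeSH term."""
    return [
        f"{mesh}. See {local}"
        for mesh, local in sorted(preferred_by_mesh.items())
    ]


def resolve(term, preferred_by_mesh):
    """Map a reader's term to the heading actually used in the index."""
    key = term.strip().lower()
    return preferred_by_mesh.get(key, key)


for line in see_references(LOCAL_PREFERRED):
    print(line)
```

A reader who arrives with the MeSH term is sent to the local heading, while a term that is not in the table passes through unchanged.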
Most troublesome are specialty journals whose highly specific terminology cannot be mapped to MeSH at all. Where such terminology, typically names for rare diseases or syndromes and specialized surgical or therapeutic procedures, appears in article titles, subtitles, and abstracts, most indexers enter it as subject terms, avoiding the NLM system of combining more general MeSH terms with a subheading. For example, in MEDLINE “Effects of Bilateral Posteroventral Pallidotomy on Gait of Subjects with Parkinson Disease” from a February 2000 issue of the Archives of Neurology is indexed under “globus pallidus/surgery.” A non-NLM indexer, however, is free, and generally obliged, to use the more specific title word, “pallidotomy,” thus establishing what is most likely to be the primary point of access.
Erdheim-Chester disease, Hirayama disease, and Kindler syndrome are a few examples of the numerous diseases and syndromes written up in the journal literature and included in standard medical dictionaries such as Stedman's or Dorland's but not in MeSH. As for surgical procedures, most indexers create index entries for the proper names, especially when they appear in article titles and subtitles, knowing not only that this is probably the first place readers look, but also where their most vocal critics—authors and editors—expect to find entries for their articles. The latter are generally not appeased by the MeSH system of using a more general term qualified by a subheading.
Varying from MeSH to please authors and editors is often politic, but indexers' overall goals remain much the same as those of their counterparts at NLM: a consistency made possible primarily through the use of a controlled vocabulary. Furthermore, with the prevalence of Boolean searching allowing the combination of assigned subject descriptors with words and phrases occurring in the abstract or text, indexers, whose print indexes will also serve as the basis for access to electronic publication, need not be overzealous in jumping into gaps in MeSH. Besides the vocabulary of medical specialties appearing in medical dictionaries but not in MeSH, medical literature is replete with obscurities (variously designated syndromes, treatments informally named after the doctors who invented them, and so on) for which no authority at all may be found. Nor is it above the here-today-gone-tomorrow buzzwords that enliven other professions and the business world. Taking a pass on such ephemera is preferable to cluttering the rationalized MeSH system unnecessarily.
A more serious limitation for MeSH-based indexes compiled outside NLM lies in the loss of one of the most powerful capabilities of MEDLINE: its ability to explode headings to capture narrower terms beneath them in the MeSH tree structure. For example, one may enter the term “anti-inflammatory agents, non-steroidal” not only to search for articles that deal with these agents in general, but also to pick up every article indexed under the names of more than forty specific substances, from “aminopyrine” to “tolmetin,” beneath “anti-inflammatory agents, non-steroidal” in the MeSH tree. Such relationships between broader and narrower terms, implicit in MeSH, are usually lost outside MEDLINE. As a rule, most indexers creating back-of-the-book or journal indexes select index headings at the same level of specificity as the textual reference. For them, gathering references under general headings that may not be explicit in the text, such as references to specific drugs under such generic headings as “anti-inflammatory agents,” is more problematic. Such indexing often requires knowledge indexers do not possess, is difficult to do consistently, and may result in indexes that fail both as concise guides to the matter at hand and as ad hoc equivalents to MEDLINE. In indexing, going from specific to general is a matter of editorial judgment based on subject matter and intended audience, and it can almost never be done systematically, as it is in MEDLINE.
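The explode operation described above can be illustrated with a toy sketch. It assumes the broader-to-narrower relationships are held as a simple mapping from a heading to its immediate narrower terms. The tree fragment lists only two of the narrower terms named in the text, the article identifiers are invented, and the function names are hypothetical; this is not a description of how MEDLINE itself is implemented.

```python
# Toy fragment of a MeSH-like tree: heading -> immediate narrower terms.
NARROWER = {
    "anti-inflammatory agents, non-steroidal": ["aminopyrine", "tolmetin"],
}

# Heading -> identifiers of articles indexed under it (invented for illustration).
INDEX = {
    "anti-inflammatory agents, non-steroidal": {"a1"},
    "aminopyrine": {"a2"},
    "tolmetin": {"a3", "a4"},
}


def explode(heading, narrower):
    """Return the heading together with every term beneath it in the tree."""
    seen = set()
    stack = [heading]
    while stack:
        term = stack.pop()
        if term in seen:
            continue
        seen.add(term)
        stack.extend(narrower.get(term, []))
    return seen


def search(heading, index, narrower, exploded=True):
    """Collect articles under the heading, optionally including narrower terms."""
    terms = explode(heading, narrower) if exploded else {heading}
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits


general_only = search("anti-inflammatory agents, non-steroidal", INDEX, NARROWER, exploded=False)
everything = search("anti-inflammatory agents, non-steroidal", INDEX, NARROWER)
```

Without the `NARROWER` mapping, which a printed index does not carry, only the unexploded search is possible. That is the loss the paragraph above describes.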
Unfortunately, 100% recall is impossible even with the most conscientious indexing. Medical literature is simply too vast and dynamic to lend itself to perfect control. Pointing to the inadequacies of medical indexing, a recent article in Archives of Dermatology even went so far as to advocate the hand searching of medical literature:
Hand searching is an essential task since searches of electronic databases, such as MEDLINE, have been shown to miss a large proportion (up to 50%) of trial reports. Some reports of clinical trials are missed because reference to the trial is not made in the title or abstract, and the article is not coded as a trial but as a journal article or letter. The advantage of hand searching is that the hand searcher will examine in more detail the “Materials and Methods” sections and other pages to ensure that the article is clearly a trial. These sections may have been missed by searches of electronic databases. [4]
The authors do not blame indexers. They only underscore the realities of searching a sprawling, ever-evolving domain that embodies indispensable accumulated knowledge and research investments equivalent to many national treasuries.
Contrary to the feeling that indexing has become obsolete or dysfunctional, disillusionment with the haphazard indexing of Websites and intranets has spurred a renewed interest in controlled vocabulary indexing, both in commercial and research settings. Although so far the emphasis seems to be on the development of architectures and standards associated with the creation of metadata and such applications as topic maps, this trend should also provide an opportunity for indexers and librarians to reassert themselves. In the long run, schemes such as topic mapping will face many of the same problems that have already been faced in traditional indexing, and, if in the spirit of innovation, past experience is pushed too far out of sight, a proliferation of local taxonomies hastily compiled for use as embedded metadata, for example, may only add further confusion. Access to information in health sciences, whether in the form of printed documents or Web pages, will still probably be achieved most successfully through established authorities like MeSH with the breadth and depth of its coverage; its widespread acceptance among students, researchers, and doctors; its continual revision; and the overall stability of its publisher, NLM. Darmoni et al. described an application of MeSH in conjunction with a metadata standard in a 2001 article in the Bulletin of the Medical Library Association [5]. More flexible and precise access may yet emerge to meet the demands of the information age, and, as in the past, leadership will most likely come from the benign monopoly held by NLM and its associates. In the meantime, freelancers and staff indexers under eternal deadline pressure do best, perhaps, to follow the advice traditionally doled out to high school examination takers: be specific, be general, base your answers on the assigned reading, and contribute ideas of your own.
References
1. Satya-Murti S. New media reference manager. JAMA. 2000 Sep 27; 284(12):1581–2.
2. Milstead J. DTIC project to improve subject indexing: phase II: subject indexing philosophy. Brookfield, CT: The Jelem Company, Feb 95:21.
3. Berland GK, Elliott MN, Morales LS, Algazy JI, Kravitz RL, Broder MS, Kanouse DE, Munoz JA, Puyol JA, Lara M, Watkins KE, Yang H, McGlynn EA. Health information on the Internet: accessibility, quality, and readability in English and Spanish. JAMA. 2001 May 23–30; 285(20):2612–21.
4. Delamere FM, Williams HC. How can hand searching the dermatological literature benefit people with skin problems? Arch Dermatol. 2001 Mar; 137(3):332–5.
5. Darmoni SJ, Thirion B, Leroy JP, Douyère M. The use of Dublin Core metadata in a structured health resource guide on the Internet. Bull Med Libr Assoc. 2001 Jul; 89(3):297–301.