Journal of the Medical Library Association : JMLA
editorial
2015 Jul;103(3):119–120. doi: 10.3163/1536-5050.103.3.002

Keywords, discoverability, and impact

Tanja Bekhuis
PMCID: PMC4511049  PMID: 26213502

Editors' Note: Keywords will improve article impact and are now required for Journal of the Medical Library Association articles. This brief editorial provides background information.

When the editor of the Journal of the Medical Library Association (JMLA) asked me to write a piece about author keywords in MEDLINE structured abstracts [1], the first thing I did was search for “keywords” in Google and Google Scholar. This exercise was reminiscent of the lithograph Drawing Hands by M. C. Escher [2]. Think about it. I used Google, an über-web search engine, to write keywords to find keywords. Google and Google Scholar returned a deluge of information (721 million and 4.35 million hits, respectively). Not surprisingly, at the top of the first page in Google were hits for Google AdWords and their Keyword Planner Tool [3].

I then did more focused searches in the ACL Anthology, a digital archive of papers in computational linguistics, and in Scientometrics or journals with similar coverage to confirm that keyword analysis is thriving in the text-mining and bibliometrics communities; for example, see Ventura and Silva [4] or Yao et al. [5].

Despite the circularity of my initial searches and my overly narrow follow-up attempts, I learned that the meaning of “keyword” varies by domain. For example, in the search engine optimization (SEO) domain, keywords are terms that improve page rank. Shrewd selection and placement of words or phrases visible to the user or buried in hypertext markup language (HTML) can move a hit toward the top of a list returned by a search engine. Regarding these terms, the world of SEO has some curious neologisms, such as spamdexing, which refers to keyword stuffing, search engine spam, or black-hat SEO [6]. In contrast, white-hat SEO is ethical; its practitioners eschew black-hat techniques. In corpus linguistics, keywords discriminate between collections of documents to identify what is unique about, say, general versus scientific prose, or British versus American English [7]. Text miners and other computational scientists extract informative keywords to classify documents or improve retrieval.

This brings us to why you, as an author, should carefully consider the list of keywords that you will assign to your JMLA article and its relationship to your title and abstract. Think about how optimization principles can make your article discoverable beyond MEDLINE and how that may affect the impact of your work. Overall, enhancing discoverability of JMLA articles should improve the journal's visibility, subsequent citation counts, and impact. This is a desirable outcome for you and the profession.

Discoverability could depend on how well the title, abstract, and keyword list form a miniaturized version of your paper. This is why a good structured abstract resembles a paper written in the “Introduction, Methods, Results, and Discussion” (IMRAD) format (see Cooper's editorial in the April 2015 JMLA). The title includes the most important concepts in your paper and, ideally, the study design; the abstract summarizes the components of your paper; and the keyword list includes relevant concepts but with more detail than in the title. If keywords are too broad or too narrow, they are useless. All three pieces are important because web search engines and text-mining applications target these sections and sometimes weight text more heavily depending on its location. Additionally, when presented to the reader, the title, abstract, and keyword list must be laden with relevant information to capture attention.
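To make the idea of location-dependent weighting concrete, here is a minimal Python sketch. It is only an illustration: the weights, the function name `field_weighted_score`, and the sample abstract and keyword text are hypothetical, and real search engines and text-mining applications use their own, usually undisclosed, schemes.

```python
import re

# Hypothetical weights: a match in the title counts more than one in the
# abstract. These values are assumptions, not those of any real search engine.
FIELD_WEIGHTS = {"title": 3.0, "keywords": 2.0, "abstract": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def field_weighted_score(record, query):
    """Sum, over fields, the field weight times the number of query-term matches."""
    query_terms = set(tokenize(query))
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        tokens = tokenize(record.get(field, ""))
        matches = sum(1 for token in tokens if token in query_terms)
        score += weight * matches
    return score


# Sample record; the keyword and abstract strings are made up for illustration.
record = {
    "title": "Keywords, discoverability, and impact",
    "keywords": "Medical Subject Headings; search engine optimization",
    "abstract": "Author keywords can improve discoverability of an article.",
}

print(field_weighted_score(record, "keywords discoverability"))
```

Under these assumed weights, a query term that appears in the title contributes three times as much to the score as the same term in the abstract, which is one reason to place the most important concepts in the title.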

To write the keyword list for the JMLA, channel your inner indexer. Select the Medical Subject Headings (MeSH) that best characterize your topic to improve retrieval in MEDLINE [8]. Additionally, find words and phrases not covered by MeSH but known to practitioners and researchers in your field. The MeSH terms you proffer could improve decisions that a National Library of Medicine human indexer makes—after all, you are likely to know more about the topic of your paper than the indexer does. Adding non-MeSH terms could improve discoverability of your article by web search engines and by users who search digital repositories aside from PubMed and PubMed Central.

For example, in a recent paper we wrote for the JMLA on building gold standard datasets as a prelude to developing search filters [9], my coauthors and I reported that “oral squamous cell carcinoma” is not covered by MeSH, even though it is the most common cancer of the oral cavity. However, the term is a synonym for “mouth squamous cell carcinoma” in Emtree, the controlled vocabulary for Embase [10]. It also appears in the National Cancer Institute Thesaurus as “oral cavity squamous cell carcinoma” [11]. Any of these terms would have been good keywords for our paper.
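One way to assemble such a list is to merge candidate terms from several vocabularies and drop duplicates. The short Python sketch below does this with the terms named above; the function name `build_keyword_list` and the normalization rule (ignore case and extra whitespace) are assumptions made for illustration, not a JMLA requirement.

```python
def build_keyword_list(*term_groups):
    """Merge groups of candidate terms, dropping case-insensitive duplicates.

    The first spelling encountered is kept, and input order is preserved.
    """
    seen = set()
    keywords = []
    for group in term_groups:
        for term in group:
            cleaned = " ".join(term.split())  # collapse stray whitespace
            normalized = cleaned.lower()
            if normalized and normalized not in seen:
                seen.add(normalized)
                keywords.append(cleaned)
    return keywords


controlled_terms = [
    "mouth squamous cell carcinoma",        # Emtree
    "oral cavity squamous cell carcinoma",  # National Cancer Institute Thesaurus
]
free_text_terms = [
    "oral squamous cell carcinoma",
    "Oral Squamous  Cell Carcinoma",  # duplicate differing only in case and spacing
]

print(build_keyword_list(controlled_terms, free_text_terms))
```

The merged list keeps one entry per distinct term, so an author can review it alongside the MeSH headings selected for the paper.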

In sum, if terms from controlled vocabularies beyond MeSH seem useful, consider adding them to your keyword list. Additionally, consider free-text terms for which users are likely to search. By carefully constructing your title, abstract, and keyword list, you will enhance discoverability of your article and its potential impact.


REFERENCES


