Abstract
This column explains ways to optimize the PubMed search features: Computed Author Sort, PubMed Identifier, PubMed Phrase Index, and proximity search. Two case studies show how to find every citation in PubMed, and how to retrieve comprehensive citations to systematic reviews. The article concludes with why PubMed ignores some search terms.
Introduction
PubMed is a free, searchable database of citations to the biomedical literature. It is provided by the National Library of Medicine (NLM), one of the institutes of the U.S. National Institutes of Health (1). The database comprises more than 36 million citations. PubMed is used by over 3 million unique users each day, including clinicians, researchers, students, librarians, and more (2).
PubMed has existed since 1996 and underwent substantial enhancements in 2019-2020. Those and later enhancements included navigation interface improvements, automated indexing, and, in 2022, the addition of proximity searching to select fields (3-5).
This column describes three built-in search features that can save you time, two case studies on retrieving comprehensive results from PubMed, and a discussion of why search terms in PubMed are sometimes ignored.
Time-saving, built-in search features
Computed author sort aka Authorbot
Introduced in 2012, computed author sort is an algorithm that retrieves all publications by a known author (6). A 2009 study of PubMed web logs showed that about 36% of queries contain an author name, noting, “Out of more than 79 million names in the PubMed collection, more than 10% will lead to retrieval of over 100 citations that are authored by multiple authors” (7). Computed author sort cuts through the noise to show the citations most likely to be by your author. Computed author sort, or Authorbot, activates when a user clicks an author name link on the PubMed abstract display, triggering an algorithmically generated author search in PubMed. Authorbot is not only easier to say when explaining computed author sort, but also easier for users to conceptualize, as the feature acts like a bot in a video game, automatically performing tasks that users would otherwise do manually.
How to activate authorbot
Go to Abstract view of any known PubMed citation and click the author’s name.
The results display citations most likely to be written by the same author. (Figure 1)
Figure 1:
View of a PubMed Abstract for a known work by author Astley R. and the results generated by computed author sort, which is available by clicking any author’s name.
PMID
PMID, or the PubMed Identifier, is a unique identifier for every record in PubMed. All PubMed abstracts are assigned a PMID automatically by NLM during indexing. PMIDs appear at the end of every PubMed citation; they do not change over time or during processing, and they are never reused (8).
PMIDs are useful because the number is all you need to search PubMed and return to the citation. For example, search for the number 1. The search retrieves the first citation ever assigned a PMID. The citation, from the journal Biochemical Medicine in 1975, is about methanol poisoning (9). At almost 50 years since indexing, it’s a vintage PubMed citation.
Another example PMID search is for the number 12345678. PMID 12345678 is a 1994 declaration on population and development (10). This search example is useful when providing PubMed instruction, as it uses a consecutive string of numbers to demonstrate that each citation has a unique PMID. Once you know a specific PMID, the citation can be summoned with this simple search.
A final note about the usefulness of PMIDs: a user can type in a string of PMIDs separated by spaces and get all the citations displayed in a list (Figure 2).
Figure 2:
A PubMed search for a string of PMIDs separated by spaces and the results displayed in a list.
How to search for a PubMed Identifier (PMID)
To search for a PubMed Identifier (PMID), enter the ID with or without the search field tag [pmid]. You can search for several PMIDs by entering each number in the search box separated by a space (e.g., 1 1234567 12345678, as shown in Figure 2). PubMed will OR the PMIDs together (8).
Type a number into the search box and select ‘search’
A unique citation appears
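The PMID search described above can also be prepared outside the search box. The following is a minimal Python sketch, not an NLM tool: the helper names pmid_query and pubmed_search_url are hypothetical, and the URL pattern (a term parameter on pubmed.ncbi.nlm.nih.gov) is assumed from the search-result link cited in reference 15.

```python
from urllib.parse import urlencode

# Base address of the PubMed web interface (URL pattern assumed from reference 15).
PUBMED_URL = "https://pubmed.ncbi.nlm.nih.gov/"


def pmid_query(pmids):
    """Join PMIDs with spaces; PubMed ORs space-separated PMIDs together."""
    cleaned = []
    for pmid in pmids:
        text = str(pmid).strip()
        if not text.isdigit():
            raise ValueError(f"not a PMID: {pmid!r}")
        cleaned.append(text)
    return " ".join(cleaned)


def pubmed_search_url(query):
    """Build a link that opens PubMed with the query already entered."""
    return PUBMED_URL + "?" + urlencode({"term": query})


query = pmid_query([1, 1234567, 12345678])
print(query)                     # 1 1234567 12345678
print(pubmed_search_url(query))  # https://pubmed.ncbi.nlm.nih.gov/?term=1+1234567+12345678
```

Pasting the printed query into the PubMed search box should give the same list of citations as opening the printed link.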
Phrase index
Often, librarians trying to search a phrase in PubMed come across directions to use a phrase index, which is self-referentially defined as “a phrase index to provide phrase searching” (11). The phrase index is actually numerous indexes of all the phrases indexed in a specific field. It is only available in PubMed Advanced Search, and it is browsable for any field listed on Advanced Search. The PubMed User Guide offers detailed directions and search processing rules for browsing the index of terms using the phrase index (12).
How to Find and Use Phrase Index
Go to Advanced Search
Select any field and start typing a term
Select Show Index
An alphabetical display of terms in your search field appears. The approximate number of citations for each term is indicated in parentheses (the actual citation count is returned when the search is executed).
Select Add to Search to add a term to the query box
Once you have finished adding terms to the query box, click Search (or Add to History) to run the search.
Searching for phrases in the Phrase Index can be an exercise in discovery, showing how many misspellings and derivatives exist for a certain phrase. An example: Go to Advanced Search and select All Fields. Enter the term cootie and select “Show Index.” An amusing result: there is 1 publication in PubMed about cooties (13). Search for indexed phrases in any of the fields in the advanced search builder. Using the phrase index is helpful to identify misspelled author names in the Author field, or to uncover derivative references to an institution in the Affiliation field.
How to search a phrase using Proximity Search
To search for a phrase not found in the PubMed phrase index, try the PubMed proximity search with a distance of zero. PubMed proximity search was introduced in November 2022 (14). It uses a structured search string to find words that appear close to one another in the Title or Title/Abstract field.
Enter search terms using the following format:
"search terms"[field:~N]
Search terms = Two or more words enclosed in double quotes.
Field = The search field tag for the [Title] or [Title/Abstract] fields.
N = The maximum number of words that may appear between your search terms.
The syntax for a phrase search in the title field would look like:
"search terms"[Title:~0]
In fact, 33 results contain the phrase “search terms” in the citation title (15).
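The proximity format above is easy to mistype, so it can help to assemble it programmatically. This is a small illustrative sketch in Python; the function name proximity_query and its checks are assumptions made for the example, and only the two fields named in this column are allowed.

```python
# Fields this column names for proximity searching.
ALLOWED_FIELDS = {"Title", "Title/Abstract"}


def proximity_query(terms, field="Title", distance=0):
    """Build a string of the form "search terms"[field:~N].

    A distance of 0 asks for the words to appear next to each other,
    which is how the column uses proximity search as a phrase search.
    """
    words = terms.split()
    if len(words) < 2:
        raise ValueError("proximity search needs two or more words")
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unsupported field: {field}")
    if distance < 0:
        raise ValueError("distance must be zero or greater")
    phrase = " ".join(words)
    return f'"{phrase}"[{field}:~{distance}]'


print(proximity_query("search terms"))                       # "search terms"[Title:~0]
print(proximity_query("search terms", "Title/Abstract", 2))  # "search terms"[Title/Abstract:~2]
```

Note that the field tag, the colon, the tilde, and the number all sit inside a single pair of square brackets.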
Case Study 1. Show me everything in PubMed
“I want to see everything in PubMed” is a cheeky reference question, but not an uncommon one. Someone wants to see over 36 million citations to biomedical literature? Fine. A savvy way to summon all citations in PubMed is to use the field tag search all[sb] (16). Searching all[sb] in PubMed is akin to a secret hack in a video game that gets you all the loot, because it will return the total number of PubMed citations currently in the database with basically one click.
Field tags search a specific field in PubMed, and can be quickly deployed from the search box when you know them. There are a few caveats (16). The search field tag must be enclosed in square brackets. Field tags turn off PubMed’s Automatic Term Mapping (ATM), limiting your search to the specified term only. To date, the PubMed User Guide lists 51 field tags (17). We will examine the Subset field tag [sb] and what it will search.
The ultimate cheat code: all[sb]
How to find all citations in PubMed:
To search for the total number of PubMed citations, enter all[sb] in the search box and select Search
The total number of PubMed citations is returned in the search results.
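For readers who want the same count without opening the web interface, the search can be sent to NLM's E-utilities ESearch service, which is outside the scope of this column and is used here only as an illustration. The sketch below is a hedged example: the function names count_url and fetch_count are hypothetical, and the shape of the JSON reply (a count inside esearchresult) is an assumption based on ESearch's JSON output.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities ESearch endpoint (not covered by this column; used for illustration).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def count_url(term):
    """Build an ESearch request that asks only for the number of matches."""
    params = {"db": "pubmed", "term": term, "rettype": "count", "retmode": "json"}
    return ESEARCH_URL + "?" + urlencode(params)


def fetch_count(term):
    """Send the request and return the count (requires network access).

    Assumes the JSON reply carries the count under esearchresult.count.
    """
    with urlopen(count_url(term), timeout=30) as response:
        payload = json.load(response)
    return int(payload["esearchresult"]["count"])


print(count_url("all[sb]"))
# To run the live query: print(fetch_count("all[sb]"))
```

The count returned changes daily as citations are added, so it will not match any figure quoted in this column.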
Unpacking all[sb]
The field tag [sb] refers to the Subset field, a method of restricting retrieval by subject, citation status and journal category (18). The Subset field is related to PubMed Filters. “You can use filters to narrow your search results by article type, text availability, publication date, species, article language, sex, age, and other” (19). The Other category of the Subset field includes subsets for Citation Status, which indicate the internal processing stage of an article in the PubMed database (20). There you will find the directions, “To search for the total number of PubMed citations, enter all[sb] in the search box.” The PubMed User Guide provides the exact search strings for the stages of citation processing, including the MEDLINE Status Subset, which refers to citations that have been indexed with MeSH terms (medline[sb]), the pubmednotmedline set (pubmednotmedline[sb], citations that will not receive MEDLINE indexing), and several more statuses (21). Of note, the MEDLINE Status Subset appears in the PubMed Search Filters interface under Additional Filters.
Case Study 2. I want all the systematic reviews.
Have you ever had a patron ask you to find all the systematic reviews? Systematic Review is a Filter in PubMed. Using it in tandem with the all[sb] field tag search can quickly answer this maximalist question. Systematic Review was added to PubMed’s Medical Subject Headings vocabulary in 2019 (22). NLM applied this publication type retrospectively to systematic review citations in PubMed in 2018 (23). To date (October 4, 2023), there are 283,027 citations indexed as a systematic review in PubMed. That is over one quarter of a million systematic reviews published since 1957.
How to retrieve ALL the systematic reviews
Enter all[sb] in the search box to retrieve the total number of PubMed citations
Select “Systematic Review” from the Filter - Article Type menu
Results show the total number of PubMed citations filtered to include any citation currently indexed as a systematic review
Limitations to this approach
Filtering the total number of PubMed citations by systematic review retrieves only indexed PubMed citations. Systematic Review is a Medical Subject Heading, and only citations that are indexed in MEDLINE will have a subject heading. Since PubMed includes non-MEDLINE citations, such as citations that still need indexing, or other material that is never indexed, this approach misses some results. For example, citations to systematic reviews that are in process or pubmednotmedline are not included.
A phrase search for “systematic review” yields 317,314 results. (Terms enclosed in quotation marks search a phrase in PubMed.) A text word search for systematic review yields 320,627 results, because a basic PubMed keyword search turns on Automatic Term Mapping and expands the search to include "systematic review"[Publication Type] OR "systematic reviews as topic"[MeSH Terms] OR "systematic review"[All Fields].
To retrieve the most comprehensive set of systematic reviews in PubMed, combine the 3 approaches with the Boolean operator OR, as shown in Figure 3. The full search string used to summon all the systematic reviews: (("all"[Filter] AND "systematic review"[Filter]) OR ("systematic review"[Publication Type] OR "systematic reviews as topic"[MeSH Terms] OR "systematic review"[All Fields]) OR "systematic review"[All Fields]).
Figure 3:
View of the PubMed Advanced Search – History and Search Details section, depicting a step-wise search for systematic reviews.
As of October 5, 2023, the combined approach yields 321,754 citations to systematic reviews (Figure 3). That is a very high number, surely enough to satisfy anyone seeking all the systematic reviews in PubMed.
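Long Boolean strings like the combined systematic review search are easier to get right when each approach is kept as its own clause and joined mechanically. The Python sketch below is one possible way to do that; or_join and is_balanced are hypothetical helpers, and the output wraps every clause in its own parentheses, so it is slightly more bracketed than the string shown in Search Details while meaning the same thing.

```python
def or_join(*clauses):
    """Wrap each clause in parentheses and combine them with OR."""
    return "(" + " OR ".join(f"({clause})" for clause in clauses) + ")"


def is_balanced(query):
    """Return True when every opening parenthesis has a matching close."""
    depth = 0
    for character in query:
        if character == "(":
            depth += 1
        elif character == ")":
            depth -= 1
            if depth < 0:  # a close appeared before its opening parenthesis
                return False
    return depth == 0


# The three approaches described in this case study.
filter_approach = '"all"[Filter] AND "systematic review"[Filter]'
keyword_approach = (
    '"systematic review"[Publication Type] '
    'OR "systematic reviews as topic"[MeSH Terms] '
    'OR "systematic review"[All Fields]'
)
phrase_approach = '"systematic review"[All Fields]'

combined = or_join(filter_approach, keyword_approach, phrase_approach)
print(combined)
print(is_balanced(combined))  # True
```

Running the balance check before pasting a hand-edited string into PubMed is a quick way to catch a dropped parenthesis.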
Why PubMed ignores your search terms.
PubMed users sometimes encounter the message “The following term was ignored: [search terms].” There are several explanations for this message. If you use a hyphen in your search and the phrase is not found in the phrase index, the search will not return any results for that phrase (24). Similarly, a phrase may appear in a PubMed record but not be in the phrase index. Searching within quotation marks for a phrase that is not in the phrase index results in the message “quoted phrase not found” (11). In both cases, the PubMed proximity search with a distance of zero, as described above, can be used instead.
Conclusion
PubMed offers many ways to search and retrieve citations. This column explained 3 ways to optimize PubMed through search features. Use Computed Author Sort (Authorbot) on the Abstract view of any citation to find more citations likely to be by your author. The PMID, the unique identifier for every PubMed citation, can be searched as a string of numbers directly from any search box to call up specific PubMed citations. The phrase index, available only in PubMed Advanced Search, shows all the phrases indexed in a specific field and is useful in identifying misspellings and alternative indexing for any search field. Two case studies walked through approaches to finding every citation in PubMed and methods for retrieving all the systematic reviews. Finally, we explored reasons why search terms might be ignored, and how the new PubMed proximity search can be an alternate search approach. These PubMed features save time in searching, and knowing how they work also saves time otherwise spent searching the PubMed User Guide.
Funding details
Developed resources reported in this document are supported by the National Library of Medicine (NLM), National Institutes of Health (NIH) under cooperative agreement number UG4LM013732 with the University of Utah Spencer S. Eccles Health Sciences Library. The content is solely the responsibility of the authors and does not necessarily represent the official views of NIH.
References
- 1. National Library of Medicine. PubMed Overview. Bethesda (MD./United States): National Library of Medicine; [Internet]. 2023. Aug 15 [cited 2023 Oct 9]. Available from: https://pubmed.ncbi.nlm.nih.gov/about/
- 2. Davidson M. (Librarian, Office of Engagement and Training, National Library of Medicine, Bethesda, MD/United States). Conversation with NNLM Training Office (University of Utah, UT/United States). 2021. Nov 5.
- 3. Chan J. PubMed Updates and Retirement of the Legacy Site. NLM Technical Bulletin. [Internet]. 2020. Sep-Oct;(436):e6. [cited 2023 Oct 9]. Available from: https://www.nlm.nih.gov/pubs/techbull/so20/so20_pubmed_update.html
- 4. MEDLINE 2022 Initiative: Transition to Automated Indexing. NLM Technical Bulletin. [Internet]. 2021. Nov-Dec;(443):e5. [cited 2023 Oct 9]. Available from: https://www.nlm.nih.gov/pubs/techbull/nd21/nd21_medline_2022.html
- 5. PubMed Update: Proximity Search Now Available in PubMed. NLM Technical Bulletin. [Internet]. 2022. Nov-Dec;(449):e4. [cited 2023 Oct 9]. Available from: https://www.nlm.nih.gov/pubs/techbull/nd22/nd22_pubmed_proximity_search_available.html
- 6. Canese K. PubMed and Computed Author Sorted Display. NLM Technical Bulletin. [Internet]. 2012. May-Jun;(386):e2. [cited 2023 Oct 9]. Available from: https://www.nlm.nih.gov/pubs/techbull/mj12/mj12_pm_author_ranking.html
- 7. Islamaj Dogan R, Murray GC, Névéol A, Lu Z. Understanding PubMed user search behavior through log analysis. Database (Oxford). 2009;2009:bap018. doi: 10.1093/database/bap018.
- 8. PMID [pmid]. PubMed User Guide. Bethesda (MD./United States): National Library of Medicine; [Internet]. 2023. Sep 19 [cited 2023 Oct 9]. Available from: https://pubmed.ncbi.nlm.nih.gov/help/#pmid
- 9. Makar AB, McMartin KE, Palese M, Tephly TR. Formate assay in body fluids: application in methanol poisoning. Biochemical Medicine. 1975;13(2):117–126. doi: 10.1016/0006-2944(75)90147-7.
- 10. Ministerial Meeting on Population of the Non-Aligned Movement (1993: Bali). Denpasar Declaration on Population and Development. Integration. 1994. Jun;(40):27–9. doi: 10.1234/2013/999990.
- 11. Phrase Index. PubMed User Guide. Bethesda (MD./United States): National Library of Medicine; [Internet]. 2023. Sep 19 [cited 2023 Oct 9]. Available from: https://pubmed.ncbi.nlm.nih.gov/help/#phrase-index
- 12. Browsing the index of terms. PubMed User Guide. Bethesda (MD./United States): National Library of Medicine; [Internet]. 2023. Sep 19 [cited 2023 Oct 9]. Available from: https://pubmed.ncbi.nlm.nih.gov/help/#browsing-show-index
- 13. Green JS, Teachman BA. Is "Cootie" in the Eye of the Beholder? An Experimental Attempt to Modify Implicit Associations Tied to Contamination Fears. J Exp Psychopathol. 2012;3(3):479–495. doi: 10.5127/jep.026111.
- 14. PubMed Update: Proximity Search Now Available in PubMed. NLM Technical Bulletin. [Internet]. 2022. Nov-Dec;(449):e4. [cited 2023 Oct 9]. Available from: https://www.nlm.nih.gov/pubs/techbull/nd22/nd22_pubmed_proximity_search_available.html
- 15. PubMed Search Results for "search terms"[Title:~0]. Bethesda (MD./United States): National Library of Medicine; [Internet]. 2023. Oct 9 [cited 2023 Oct 9]. Available from: https://pubmed.ncbi.nlm.nih.gov/?term=%22search+terms%22%5BTitle%3A%7E0%5D
- 16. Using search field tags. PubMed User Guide. Bethesda (MD./United States): National Library of Medicine; [Internet]. 2023. Sep 19 [cited 2023 Oct 9]. Available from: https://pubmed.ncbi.nlm.nih.gov/help/#using-search-field-tags
- 17. Search field tags. PubMed User Guide. Bethesda (MD./United States): National Library of Medicine; [Internet]. 2023. Sep 19 [cited 2023 Oct 9]. Available from: https://pubmed.ncbi.nlm.nih.gov/help/#search-tags
- 18. Subset [sb]. PubMed User Guide. Bethesda (MD./United States): National Library of Medicine; [Internet]. 2023. Sep 19 [cited 2023 Oct 9]. Available from: https://pubmed.ncbi.nlm.nih.gov/help/#sb
- 19. Filters. PubMed User Guide. Bethesda (MD./United States): National Library of Medicine; [Internet]. 2023. Sep 19 [cited 2023 Oct 9]. Available from: https://pubmed.ncbi.nlm.nih.gov/help/#help-filters
- 20. Other filters & more subsets. PubMed User Guide. Bethesda (MD./United States): National Library of Medicine; [Internet]. 2023. Sep 19 [cited 2023 Oct 9]. Available from: https://pubmed.ncbi.nlm.nih.gov/help/#filters-other
- 21. Status Subsets. PubMed User Guide. Bethesda (MD./United States): National Library of Medicine; [Internet]. 2023. Sep 19 [cited 2023 Oct 9]. Available from: https://pubmed.ncbi.nlm.nih.gov/help/#citation-status-subsets
- 22. Systematic Review [Publication Type]. Medical Subject Headings. Bethesda (MD./United States): National Library of Medicine; [Internet]. 2019. [cited 2023 Oct 9]. Available from: https://www.ncbi.nlm.nih.gov/mesh/2028176
- 23. Tybaert S. MEDLINE Data Changes—2019. NLM Technical Bulletin. [Internet]. 2018. Nov-Dec;(425):e4a. [cited 2023 Oct 9]. Available from: https://www.nlm.nih.gov/pubs/techbull/nd18/nd18_medline_data_changes_2019.html#update
- 24. Searching for a phrase. PubMed User Guide. Bethesda (MD./United States): National Library of Medicine; [Internet]. 2023. Sep 19 [cited 2023 Oct 9]. Available from: https://pubmed.ncbi.nlm.nih.gov/help/#searching-for-a-phrase



