Journal of Palliative Medicine. 2022 Dec 30;26(1):13–16. doi: 10.1089/jpm.2022.0574

Revelations from a Machine Learning Analysis of the Most Downloaded Articles Published in Journal of Palliative Medicine 1999–2018

Suzanne Tamang 1,2, Zhijing Jin 1, Vyjeyanthi S Periyakoil 1,2
PMCID: PMC10024060  PMID: 36607778

Abstract

The Journal of Palliative Medicine (JPM) is globally recognized as a leading interdisciplinary peer-reviewed palliative care journal providing balanced information that informs and improves the practice of palliative care. JPM shapes the values, integrity, and standards of the subspecialty of palliative medicine by what it chooses to publish. The global JPM readership chooses to download the articles that are of most relevance and utility to them. Utilizing machine learning methods, the top 100 most downloaded articles in JPM were analyzed to gain a better understanding of any latent trends and patterns in the topics between 1999 and 2018. The top five topic themes identified in the first decade were different from the ones identified in the second decade of publication. There is evidence of differentiation and maturation of the field in the context of comprehensive health care. Although noncancer serious illnesses have still not risen to the same prominence as cancer palliation, there is a directional quality to the emerging evidence as it pertains to cardiac, respiratory, neurological, renal, and other etiologies. Across both decades under study, there was persistent evidence of the importance of understanding and managing the mental health care needs of seriously ill patients and their families. A cause for concern is that the word “spirituality” was prominent in the first decade and was lacking in the second. Future palliative care clinical and research initiatives should focus on its development as an essential interprofessional and medical subspecialty germane to all types of serious illnesses and across all venues.

Keywords: artificial intelligence, latent Dirichlet allocation, machine learning methods, palliative care, topic modeling

Introduction

In 1990, the World Health Organization recognized palliative care as a distinct specialty dedicated to relieving suffering and improving quality of life for patients with serious illnesses. The Journal of Palliative Medicine (JPM) was founded to disseminate the best science to guide this important new field. The first volume was published in 1997. The JPM readership includes clinicians, teachers, and researchers across the globe. JPM publishes peer-reviewed balanced information that informs and improves their daily work. In the past 25 years, JPM established itself as a leading interdisciplinary journal.

A quarter century is an important milestone; it is a time both to look back and learn from the past and to set the course for the future. Two approaches were engaged (Periyakoil and von Gunten, this issue). First, the emerging science of artificial intelligence (AI) was applied to the 100 most downloaded articles from 1999 to 2018 to identify any patterns and gaps in the science that might inform the future for JPM and palliative care. Second, a series of seven expert roundtables1–7 on leading-edge topics was conducted. The combination of AI with human intelligence might best yield a blueprint for future study to build the body of evidence to advance the field. The roundtables were published throughout volume 25 in calendar year 2022. This special report describes the machine learning analysis of the most downloaded articles from 1999 to 2018.

Topic Modeling Analysis Process

Topic modeling is a machine learning technique to automatically detect hidden thematic structures in extensive collections of documents.8–11 Topic modeling uses unsupervised machine learning and can extract the main topics that occurred in a collection of articles published in JPM. The main strength of the topic modeling analysis is that it permits identification of topics without imposing a priori hypotheses of what topics should appear.

To cluster the top 100 most downloaded articles in JPM for the purpose of modeling their evolution over time, and without prior information about their theme and composition, the commonly used topic model, latent Dirichlet allocation (LDA), was utilized. LDA is a widely used generative unsupervised statistical algorithm using Bayesian statistics.8 The LDA algorithm models each textual document as a mixture of topics, and each topic is represented as a probability distribution over all the words. Specifically, LDA models a generative process of document generation. It is parameterized by two Dirichlet distributions, one modeling the mixture of topics in documents, and the other modeling the mixture of words in topics. The model uses the expectation-maximization algorithm to derive the optimized set of topics.12 The LDA topic model was implemented using the Python package “gensim.” The software code for data preprocessing and topic modeling is available as open source by contacting the authors.
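The generative process described above can be illustrated with a minimal sketch. It is not the authors' code and does not use gensim; it relies only on the Python standard library, and the toy vocabulary, the number of topics, the Dirichlet parameters, and the function names `sample_dirichlet` and `generate_document` are all assumptions chosen for illustration. Fitting an LDA model runs this story in reverse, inferring the topic mixtures and word distributions from observed text.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one sample from a Dirichlet distribution via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word_dists, vocab, doc_topic_alpha, n_words, rng):
    """Generate one synthetic document following the LDA generative story."""
    # 1. Draw this document's mixture over topics.
    theta = sample_dirichlet(doc_topic_alpha, rng)
    words = []
    for _ in range(n_words):
        # 2. Pick a topic for this word position according to theta.
        topic = rng.choices(range(len(theta)), weights=theta, k=1)[0]
        # 3. Pick a word from that topic's distribution over the vocabulary.
        word = rng.choices(vocab, weights=topic_word_dists[topic], k=1)[0]
        words.append(word)
    return theta, words


rng = random.Random(0)
# Toy vocabulary and settings (assumptions, not values from the study).
vocab = ["pain", "family", "hospital", "cost", "symptom", "home"]
n_topics = 2
# Each topic is itself a Dirichlet draw: a probability distribution over the vocabulary.
topic_word_dists = [sample_dirichlet([0.5] * len(vocab), rng) for _ in range(n_topics)]
theta, words = generate_document(topic_word_dists, vocab, [0.5] * n_topics, 20, rng)
print(theta)
print(words)
```

The two calls to `sample_dirichlet` correspond to the two Dirichlet distributions named in the text: one over topics per document and one over words per topic.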

Topic Modeling Analysis Steps

Preprocessing the data

Per the publisher, from 2004 through 2018, there were 2,252,000 downloads of articles published in JPM. The total number of downloads is not available for the period before 2004. However, the publisher was able to identify the top 100 most downloaded articles for the period 1999–2018 and provide portable document format (PDF) files of these 100 articles.

First, the PDF copies of the 100 full-text articles under study were parsed into computer-readable “json” format by the “scipdf” Python package. Next, the text was tokenized by the “gensim” Python package. During tokenization, the text is split into sentences and the sentences into words. The words were converted to lowercase and punctuation was removed. Words with fewer than three characters and all low information words (e.g., “sample,” “male,” and “female”) were removed. All remaining words were then lemmatized; words in third person were changed to first person and verbs in past and future tenses were changed into present; finally, all words were stemmed, retaining only nouns, verbs, adjectives, and adverbs for topic modeling.
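The token-level filtering steps above can be sketched as follows. This is a simplified illustration using only the Python standard library, not the authors' gensim-based pipeline. The function name `preprocess`, the example sentence, and the stop list (limited to the three example words named in the text) are assumptions. Lemmatization, stemming, and part-of-speech filtering are omitted, and the regular expression also discards digits and non-ASCII letters, which is a simplification beyond what the text specifies.

```python
import re

# Hypothetical stop list; the text names "sample", "male", and "female" as examples.
LOW_INFORMATION_WORDS = {"sample", "male", "female"}


def preprocess(text, stop_words=LOW_INFORMATION_WORDS, min_length=3):
    """Lowercase, strip punctuation, and drop short or low-information tokens."""
    lowered = text.lower()
    # Keep runs of ASCII letters only, which discards punctuation and digits.
    tokens = re.findall(r"[a-z]+", lowered)
    # Remove words with fewer than min_length characters and stop-listed words.
    return [t for t in tokens if len(t) >= min_length and t not in stop_words]


# Made-up sentence for illustration; it does not come from a JPM article.
tokens = preprocess("The sample of 40 male patients reported pain, fatigue, and anxiety.")
print(tokens)
# → ['the', 'patients', 'reported', 'pain', 'fatigue', 'and', 'anxiety']
```

Common function words such as “the” and “and” survive here only because the illustrative stop list is so short; a fuller list of low information words would remove them.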

Data analyses

To analyze how topics in JPM have changed over two decades (1999–2018), the top 100 publications as determined by total downloads were identified. Next, they were separated into two groups, based on their publication year. Topic modeling analysis was performed on each group of articles, with the number of topics set to five. Our Natural Language Processing pipeline for topic modeling is shown in Figure 1.

FIG. 1.

Pipeline of natural language processing topic modeling of the JPM most downloaded articles. JPM, Journal of Palliative Medicine; LDA, latent Dirichlet allocation.
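The grouping step in the data analysis, in which articles are separated by publication year before each group is modeled with five topics, can be sketched as below. The period boundaries follow the two periods reported in the Results (1999–2009 and 2010–2018); the function name `split_by_period`, the record layout, and the placeholder records are assumptions for illustration and are not the actual JPM articles.

```python
def split_by_period(articles, first_period=(1999, 2009), second_period=(2010, 2018)):
    """Group articles into two publication periods using inclusive year bounds."""
    groups = {"first": [], "second": []}
    for article in articles:
        year = article["year"]
        if first_period[0] <= year <= first_period[1]:
            groups["first"].append(article)
        elif second_period[0] <= year <= second_period[1]:
            groups["second"].append(article)
        # Articles outside both periods are left out of the analysis.
    return groups


# Placeholder records, not the actual JPM articles.
articles = [
    {"id": "a", "year": 1999},
    {"id": "b", "year": 2009},
    {"id": "c", "year": 2010},
    {"id": "d", "year": 2018},
]
groups = split_by_period(articles)
print([a["id"] for a in groups["first"]])   # → ['a', 'b']
print([a["id"] for a in groups["second"]])  # → ['c', 'd']
```

Each resulting group would then be passed separately to the topic model, so that the topics of one period are learned without influence from the other.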

Results

Analysis of the top 100 most downloaded articles was performed separately for the first period (1999–2009) and the second period (2010–2018) to identify five topic clusters in each. For each of the five topic clusters generated by the topic modeling analysis, Table 1 shows the highest-probability terms, in order of relevance to the topic cluster, by publication period. Expert external peer reviewers reviewed the analysis and validated the topic clusters.

Table 1.

Results of Topic Model with Five Clusters (k = 5) Identified by Machine Learning Analysis of the Most Downloaded Journal of Palliative Medicine Articles, by Year of Publication for Two Time Periods

1999–2009 Topics (79 full-text articles)
 Topic 1: Initial foundation (death, pain, end, cancer, and symptom)
 Topic 2: Humanism (caregiver, family, intervention, home, and dignity)
 Topic 3: Meaning (religious, cancer, symptom, advanced, and depression)
 Topic 4: Location (hospital, spirituality, nursing, home, and family)
 Topic 5: Logistics (family, end, pain, cost, and preference)
2010–2018 Topics (23 full-text articles)
 Topic 1: Implementation (consultation, management, symptom, dialysis, and cost)
 Topic 2: Intervention expansion (intervention, parent, evidence, child, and outpatient)
 Topic 3: Disease and symptom expansion (symptom, home, child, comorbidities, and distress)
 Topic 4: Standards of care (guideline, family, caregiver, support, and quality)
 Topic 5: Broadening the foundation (hospital, cancer, depression, symptom, and anxiety)

Discussion

Machine learning analysis identified different themes among the most downloaded research articles published in JPM in the two 10-year periods (Table 1). Downloading occurred across the globe from 220 distinct countries. If the analysis had focused on the most cited articles, that would have narrowed and changed the nature of the research, weighting it in favor of the opinions of the researchers in the field. The population of researchers is far smaller than the population of clinicians and researchers who are both interested in the results of research and its application to improve the quality of serious illness care. The smaller number of articles in the second period reflects the broad observation that important research not only remains of interest for many years but also tends to accumulate more downloads over time as the implications of the work become clearer.

During the 1999–2009 period, the JPM articles reflected a focus on the issues that formed the foundation of the field of palliative care. Various aspects of end-of-life care, especially in the context of a person dying of cancer, were foremost. The associated subjects of family, caregivers, and spirituality were also prominent. In contrast, the articles from the subsequent period (2010–2018) demonstrated a difference. In addition to detail about aspects of death, dying, and cancer, there was a clear expansion to other aspects of medicine. Diseases other than cancer, pediatrics, care in all hospital departments (including the emergency department), and ambulatory outpatient settings all appeared. In other words, the field has expanded to all areas of the health care system. The words “consultation,” “management,” “evidence,” “guideline,” and “quality” appeared frequently, demonstrating the formalization of the practice of palliative care.

The directionality of the latent topic themes and their change over time is clear. The field of palliative care has both deepened its attention to all phases of cancer care, including the end-of-life phase, and broadened beyond oncology. In other words, there is clear evidence of differentiation and maturation of the field in the context of comprehensive health care. Although noncancer illnesses have still not risen to the same prominence as cancer (as identified by this LDA analysis), there is a directional quality to the emerging evidence as it pertains to other serious illnesses of cardiac, respiratory, neurological, renal, and other etiologies.

Across decades, there is persistent evidence of the importance of understanding and managing the mental health care needs of seriously ill patients and their families. A cause for concern is that the word “spirituality” was prominent in the first decade and was lacking in the second. This could be a signal that the subspecialty of Hospice and Palliative Medicine is at risk of moving away from its bio-psycho-socio-cultural-spiritual holistic roots13 as our collective clinical practices change to better integrate into the health care medical-industrial complex.14

Finally, to explore robustness of the analysis technique used in this study, the same topic modeling analysis was conducted with a comparator journal—Journal of Women's Health15—published over the same time period by the same publisher. The publisher identified that Journal of Women's Health was the ideal comparator as it is closest in age (founded in 1992) and impact factor (3.017) to JPM. However, the analysis of the top 100 most downloaded articles published in Journal of Women's Health did not demonstrate similar evolutionary changes in the field of women's health.

This study has limitations. Although empirically valid, statistically distinct topics were identified by the topic modeling analysis, they are subject to errors that are not associated with traditional qualitative coding techniques. To reduce errors in this study, the pre- and postprocessing steps make the data more amenable to modeling higher level patterns of interest. For example, the initial analysis yielded a high proportion of low information words such as “sample,” “male,” “female,” “experiment,” “show,” “year,” and “week.” Another limitation is that we only analyzed the top 100 most downloaded articles. Since the purpose of the analysis is to guide the future of the journal, it made sense to use those articles that an unselected group of global readers would choose to download from the Internet.

This is a simple way to crowd-source and identify the articles that sparked the most interest across the broadest possible global population of interprofessional clinicians and researchers. The selection excludes those who read articles in hard copy from in-person subscriptions or in libraries. Of note, newer articles are less likely to have the same number of citations as older articles. The period of analysis was stopped after 2018 as a sufficient time lag is required to identify the most downloaded articles. Nevertheless, the 2022 impact factor of 2.947 indicates that the journal is in the top quartile of medical research journals and on par with other scientific journals serving the field of palliative medicine. Future research could consider the use of other clustering algorithms for natural language processing applications such as “dynamic topic modeling” methods that may offer benefits for explicitly modeling the change of palliative care topics and trends over time.9,16

Conclusion

An independent AI machine learning analysis of the 100 most downloaded research articles from JPM across a 20-year time span shows evolutionary changes in the growth and development of the field as it becomes an integral medical subspecialty. It is a direct refutation of those who asked, “How could there be much to palliative care?” The analysis gives clear direction. If the current momentum is sustained, there will be published research that relates to the foundational aspects of the field (community based, family centered, and multidisciplinary roots).

There will also be research that relates to its development as an essential interprofessional and medical subspecialty germane to all serious illness and all the locations such illness is managed. In short, the evidence base for high-quality care for patients of all ages and their families with all types of serious illnesses will grow. A challenge for JPM will be to decide which articles it selects for publication among this growing and diversifying body of research to ensure the ongoing growth and advancement of the field over the decades to come.

Funding Information

VSP is funded by the following grants: P30 AG059307/AG/NIA NIH HHS/United States; R01 AG062239/AG/NIA NIH HHS/United States; and U54 MD010724/MD/NIMHD NIH HHS/United States. ST is funded by P30 AG059307/AG/NIA NIH HHS/United States.

Author Disclosure Statement

No competing financial interests exist.

References

1. Periyakoil VS, von Gunten CF, Block S, et al. 25 Years: Looking back, looking forward. J Palliat Med 2022;25(12):1761–1766.
2. Periyakoil VS, von Gunten CF, Arnold R, et al. Caught in a loop with advance care planning and advance directives: How to move forward? J Palliat Med 2022;25(3):355–360; doi: 10.1089/jpm.2022.0016
3. Periyakoil VS, von Gunten CF, Check D, et al. Accountable and transparent palliative quality measures will improve care. J Palliat Med 2022;25(4):542–548; doi: 10.1089/jpm.2022.0063
4. Periyakoil VS, von Gunten CF, Fischer S, et al. Generalist versus specialist palliative medicine. J Palliat Med 2022;25(2):193–199; doi: 10.1089/jpm.2021.0644
5. Periyakoil VS, von Gunten CF, Bowman B, et al. Incentives for palliative care. J Palliat Med 2022;25(7):1024–1030; doi: 10.1089/jpm.2022.0280
6. Periyakoil VS, von Gunten CF, Bailey FA, et al. Mid-career training to advance palliative care. J Palliat Med 2022;25(5):705–711; doi: 10.1089/jpm.2022.0144
7. Periyakoil VS, von Gunten CF, Bruera E, et al. Symptom control research. J Palliat Med 2022;25(10):1462–1467; doi: 10.1089/jpm.2022.0442
8. Blei DM, Ng AY, Jordan MI. Latent Dirichlet allocation. J Mach Learn Res 2003;3:993–1022.
9. Blei DM, Lafferty JD. Dynamic topic models. In: Proceedings of the 23rd International Conference on Machine Learning (ICML’06) [Internet]. ACM Press: Pittsburgh, PA, USA; 2006; pp. 113–120. Available from: https://portal.acm.org/citation.cfm?doid=1143844.1143859 [Last accessed: June 25, 2020].
10. Deerwester S, Dumais ST, Furnas GW, et al. Indexing by latent semantic analysis. J Am Soc Inf Sci 1990;41(6):391–407.
11. Hofmann T. Probabilistic latent semantic indexing. In: Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’99) [Internet]. ACM Press: Berkeley, CA, USA; 1999; pp. 50–57. Available from: https://portal.acm.org/citation.cfm?doid=312624.312649 [Last accessed: November 27, 2022].
12. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B Methodol 1977;39(1):1–22.
13. Clark D. “Total pain,” disciplinary power and the body in the work of Cicely Saunders, 1958–1967. Soc Sci Med 1999;49(6):727–736; doi: 10.1016/s0277-9536(99)00098-2
14. The new medical-industrial complex. N Engl J Med 1981;304(4):231–233; doi: 10.1056/NEJM198101223040411
15. Journal of Women's Health. Available from: https://home.liebertpub.com/publications/journal-of-womens-health/42
16. Wang C, Blei D, Heckerman D. Continuous time dynamic topic models. ArXiv12063298 Cs Stat 2015. Available from: https://arxiv.org/abs/1206.3298 [Last accessed: December 1, 2022].

Articles from Journal of Palliative Medicine are provided here courtesy of SAGE Publications
