Medicine. 2022 Aug 26;101(34):e30217. doi: 10.1097/MD.0000000000030217

Using Sankey diagrams to explore the trend of article citations in the field of bladder cancer: Research achievements in China higher than those in the United States

Yen-Ling Lee a,b, Tsair-Wei Chien c, Jhih-Cheng Wang d,e,f,*
PMCID: PMC9410696  PMID: 36042603

Background:

Urology authors are required to evaluate research achievements (RAs) in the field of bladder cancer (BC). However, no such bibliometric indices were appropriately applied to quantify the contributions to BC in research. In this study, we examined 3 questions: whether RAs in China are higher than those in the United States, how the Sankey-based temporal bar graph (STBG) may be applied to the analysis of the trend of article citations in the BC field, and what subthemes were reflected in China’s and the United States’ proportional counts in BC articles.

Methods:

Using the PubMed search engine to download data, we conducted citation analyses of BC articles authored by urology scholars since 2012. A total of 9885 articles were collected and analyzed using the relative citations ratio (RCR) and the STBG. The 3 research goals were verified using the RCRs, the STBG, and medical subject headings (MeSH terms). The choropleth map and the forest plot were used to (1) highlight the geographical distributions of publications and RCRs for countries/regions and (2) compare the differences in themes (denoted by major MeSH terms on proportional counts using social network analysis to cluster topics) between China and the United States.

Results:

There was a significant rise over the years in RCRs within the 9885 BC articles. We found that the RCRs in China were substantially higher than those in the United States since 2017, the STBG successfully explored the RCR trend of BC articles and was easier and simpler than the traditional line charts, area plots, and TBGs, and the subtheme of genetics in China has a significantly higher proportion of articles than the United States. The most productive and influential countries/regions (denoted by RCRs) were {Japan, Germany, and Italy} and {Japan, Germany, New York}, respectively, when the US states and provinces/metropolitan cities/areas in China were separately compared to other countries/regions.

Conclusions:

With an overall increase in publications and RCRs on BC articles, research contributions assessed by the RCRs and visualized by the STBGs are suggested for use in future bibliographical studies.

Keywords: bladder cancer, choropleth map, forest plot, relative citations ratio, research achievement, Sankey diagram, temporal bar graph, urology


Key points

  • We demonstrated the use of the relative citations ratios (RCRs) and the Sankey temporal bar graphs to compare research achievements in bladder cancer articles between China and the United States since 2012.

  • A summary of research achievement trends for urology authors is rarely seen in bibliometrics because of difficulties in (1) displaying trend analyses and proportional comparisons of counts simultaneously using traditional line charts, area plots, and TBGs and (2) obtaining RCRs normalized by citations, years, and disciplines, which are superior to raw citations that depend on article ages and research domains.

  • Both choropleth maps and forest plots applied to compare the differences in publications and RCRs are novel and modern in bibliometrics.

1. Introduction

Bladder cancer (BC), one of the top 10 most frequently occurring neoplasms worldwide, is the most frequently diagnosed malignancy in the urinary system and has high morbidity and mortality rates.[1,2] It leads to over 150,000 deaths per annum.[1] Because chemotherapy is used as a BC treatment, potent anti-BC drugs urgently need to be developed.[3,4] Histone lysine-specific demethylase 1 (LSD1) has been considered a potential cancer therapeutic target for discovering novel anti-BC agents.[5–7] Although a variety of LSD1 inhibitors have been reported to date, many of them show insufficient selectivity toward LSD1.[8]

1.1. Racial differences in BC survival

Risk factors for BC include smoking, family history, prior radiation therapy, frequent bladder infections, and exposure to certain chemicals.[9] Diagnosis is typically by cystoscopy with tissue biopsies.[10] The staging of the cancer is determined by transurethral resection and medical imaging.[11–13] Treatment depends on the stage of cancer.[13] The overall BC survival was 66% for Japanese patients, 64% for Chinese patients, 61% for Caucasians, 59% for Filipino patients, and 52% for Hawaiian patients.[14] Japanese and Chinese BC patients had higher overall survival rates than Caucasians but had substantially lower survival than the United States (77%), Canada (75%), and Europe (68%).[15–17] However, the highest rate of BC occurs in Southern and Western Europe, followed by North America, with rates of 15, 13, and 12 cases per 100,000 people.[18] The highest rates of BC deaths were seen in Northern Africa and Western Asia, followed by Southern Europe.[18] We are thus motivated to examine whether the published BC-related articles in China are also higher than those in the United States.

1.2. Different article themes conducted in countries

Although a bulk of published articles have investigated BC treatments and risk factors in the literature,[19] the characteristics of BC articles remain unclear for citation trends on topical entities (e.g., authors, journals, affiliated institutes, and countries). However, it is rather difficult to give a broad outline of BC on entities in the development and evolution over the past years.

Bibliometric analysis helps further our knowledge of BC research, topics, and trends.[1] Accordingly, it is useful to identify the most influential articles (on an impactful beam plot (IBP)[20,21] rather than those articles listed in their personal biography[1,22]) that are pertinent to this field and help us better understand and manage the BC.

From the view of bibliometric analysis,[23] documents can be organized by theme (e.g., BC) on a topical entity (e.g., author or affiliated country) for a specific feature (e.g., citations and publications). The subthemes can be determined by clustering medical subject headings (MeSH terms)[24,25] using social network analysis (SNA).[26,27] The second motivation was to investigate what subthemes (denoted by MeSH terms) were different in proportional counts between China and the United States.

1.3. The temporal bar graph can be enhanced using the Sankey diagram

We frequently experience the entity arrival over time (e.g., evolution over the years on citations and publications). Strong temporal ordering of the content is necessary for making sense of it via the temporal bar graph (TBG),[28] referred to as particular trends of entities appearing, growing in intensity (namely, burst strength [BS] used in this study later), and then fading away again.[29] Nevertheless, the BS for topical entities on the TBG was not clearly explained in the traditional TBG tool,[30,31] except in these 2 studies.[32,33] We are thus motivated to illustrate the Sankey diagram[34,35] to enhance the traditional TBG (called Sankey TBG [STBG]) using the inflection point (IP) to express the hot spot (HS), the burst strength (BS) and the trend stages to highlight the development in the latest 4 time points.[36,37] The third research question was to demonstrate how the STBG can be applied to explore the trend of article citations in the BC field.

1.4. Study aims

Aims of the study are to investigate whether the research achievements (RAs) in China are higher than those in the United States, how the STBG can be used to evaluate the trend of article citations in the BC field, and which subthemes were significantly different between China and the United States.

2. Methods

2.1. Data source

Two steps were involved in arranging the data. First, the authors searched PubMed using the keywords (urology[Affiliation]) AND (bladder cancer[MeSH Major Topic]) AND ((“2012”[Date - Publication]: “2021”[Date - Publication])) as of April 12, 2022. A total of 9885 abstracts published since 2012 were downloaded and analyzed in this study; see Supplemental File 1 (Supplemental Digital Content 1, http://links.lww.com/MD/H89).
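The search described above can be sketched programmatically. The following is a minimal illustration, assuming the NCBI E-utilities ESearch endpoint is used instead of the PubMed web interface; the function names and the `retmax` default are hypothetical, and the code only builds the request URL without sending it.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (assumption: the study itself used the
# PubMed web search, not this API).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_bc_query(start_year: int = 2012, end_year: int = 2021) -> str:
    """Assemble the PubMed search term quoted in Section 2.1."""
    return (
        "(urology[Affiliation]) AND (bladder cancer[MeSH Major Topic]) "
        f'AND (("{start_year}"[Date - Publication] : '
        f'"{end_year}"[Date - Publication]))'
    )


def build_esearch_url(term: str, retmax: int = 10000) -> str:
    """Return an ESearch URL for the given term (hypothetical helper)."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url(build_bc_query()))
```

The number of records returned depends on the search date, so a rerun may not reproduce the 9885 articles retrieved on April 12, 2022.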

Second, based on the article contents, we extracted 5 topical entities from abstracts, including (1) affiliated countries/regions, (2) journals, (3) medical subject headings (MeSH terms), (4) article identity numbers, and (5) individual authors, based on the article relative citation ratios (RCRs)[38] over the years.

The RCR for each article was extracted from the iCite analysis.[39] The RCR measures the scientific influence of each paper by field- and time-adjusting the citations it has received and benchmarking to the median for National Institutes of Health publications.[38]
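To make the idea of a field- and time-adjusted ratio concrete, the sketch below divides an article's citations per year by an expected citation rate. This is a deliberately simplified illustration: iCite derives the expected rate from the article's co-citation network and a benchmark of National Institutes of Health publications, whereas here `expected_rate` is simply a hypothetical input, and the numbers are invented.

```python
def article_citation_rate(citations: int, pub_year: int, current_year: int) -> float:
    """Citations per year since publication (at least one year is assumed)."""
    years = max(current_year - pub_year, 1)
    return citations / years


def toy_rcr(citations: int, pub_year: int, current_year: int,
            expected_rate: float) -> float:
    """Simplified ratio of observed to expected citations per year.

    expected_rate is a hypothetical stand-in for the benchmarked field
    citation rate that iCite computes; it is NOT the iCite algorithm.
    """
    if expected_rate <= 0:
        raise ValueError("expected_rate must be positive")
    return article_citation_rate(citations, pub_year, current_year) / expected_rate


if __name__ == "__main__":
    # Invented example: 50 citations in 5 years against an expected 5 per year.
    print(toy_rcr(50, 2017, 2022, 5.0))
```

A value above 1 would mean the article is cited more often than the benchmark for its field and age, which is why RCRs can be compared across years.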

Because all data were obtained from a publicly available database, this study does not require ethical approval.

2.2. Using the Sankey to enhance the TBG

The Sankey flow diagram (Sankey),[34,35] named after Captain Matthew Sankey, who used it in 1898, emphasizes the flow/movement/change from one state to another (or one time to another).[17,18] The Sankey has been applied to visualize article features in bibliometrics,[40,41] but not to their trend evolution over the years.

Line charts and area plots are frequently used to display data trends and evolution.[42,43] However, unlike TBGs,[28,32,33] they do not convey the HS and BS (compare the 3 chart types in panels 1 to 3 of Figure 1). The STBG proposed in this study includes the HS, the BS, and the trend. The trend was proposed in a previous study[36] (e.g., with 4 scenarios denoted by colors in the last column; see Fig. 1D).

Figure 1.

Comparison of trend analyses in visualization types. BS = burst strength, RCR = relative citations ratio, TBG = temporal bar graph.

2.3. Using the Sankey to display article features

We often observed that numerous tables and figures are presented to readers without offering an at-a-glance overview. A Sankey-based category plot (called an Alluvial diagram[44,45]) was thus proposed to display all possible entities of the top 3 elements with their hT index[46,47] in a single picture.

2.4. Cluster analysis of MeSH terms using SNA

Cluster analysis of MeSH terms[24,25] was performed using SNA.[26,27] We are concerned with the subthemes with the highest RCRs in each cluster. The proportional counts in subthemes were compared between the 2 countries of China and the United States using the forest plot.[48]

2.5. Geographical distributions of publications and RCRs on choropleth maps

Geographical distributions of publications and RCRs were displayed on choropleth maps.[49] The darker areas indicate more counts/RCRs in countries/areas.

2.6. Hundred top-cited BC articles on impactful beam plot

The 100 top-cited BC articles denoted by each dot were displayed on the IBP, from the left to the right side, by normalized citations from 0 to 100 (i.e., using the MS Excel function of PercentRank[array, x, 1]×100). The red dot indicates the article related to clinical research. The IBP dashboard was shown on Google Maps. The article is immediately linked to PubMed once the dot is clicked on the dashboard.
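The 0 to 100 normalization mentioned above can be sketched in Python. This is a minimal illustration modeled on a percent-rank calculation (values below x divided by values below plus values above x); the function name is hypothetical, the citation counts are invented, and the third argument of the spreadsheet function (the number of significant digits) is not reproduced here.

```python
def percent_rank_score(values: list[float], x: float) -> float:
    """Return the percent rank of x within values, scaled to 0-100."""
    smaller = sum(1 for v in values if v < x)
    larger = sum(1 for v in values if v > x)
    if smaller + larger == 0:
        # Every value equals x, so no ranking is possible; return 0 by convention.
        return 0.0
    return 100.0 * smaller / (smaller + larger)


if __name__ == "__main__":
    citations = [10, 20, 30, 40, 50]  # invented citation counts
    for c in citations:
        print(c, percent_rank_score(citations, c))
```

With this scaling, the least-cited article in the set sits at the left edge of the plot (0) and the most-cited article at the right edge (100).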

2.7. Statistical tools and data analysis

Visual representations of the STBG were drawn using the author-made modules in MS Excel. The significance level (Type I error) was set at 0.05. The study process is displayed in Figure 2. Data and the demonstration of STBG making are presented in Supplemental Files 1 and 2 (Supplemental Digital Content 1, http://links.lww.com/MD/H89 and Supplemental Digital Content 2, http://links.lww.com/MD/H90).

Figure 2.

Study flowchart. MeSH = medical subject headings, RCR = relative citations ratio, TBG = temporal bar graph.

3. Results

3.1. First objective: RAs in China higher than those in the United States

Figure 1D verifies that RAs in China were higher than those in the United States. The RCRs in 2021 were 96.69 and 49.48, with totals of 3923.50 and 3526.80, for China and the United States, respectively. The first research question was thus answered.

3.2. Second objective: Sankey-based TBGs and Alluvial diagram applied to examine the RA for each entity

The most influential entities with higher hT indices were {China, the United States, Japan} of countries, {Urol Oncol, J Urol, Eur Urol} of journals, {Fudan University (Shanghai), Huazhong University of Science and Technology (Hubei), Capital Medical University (Beijing)} of institutes, {Zhang, P (Beijing), Namekawa, Takeshi (Japan), Takayama, Tatsuya (Japan)} of authors, and {Journal Article, Comparative Study, Case Reports} of document types. All of this information about the HS, BS, and the trend is simultaneously displayed on the STBGs in Figure 3. The second research question was answered.

Figure 3.

Comparison of RCR for each entity using the Sankey TBG. BS = burst strength, MeSH = medical subject headings, RCR = relative citations ratio, TBG = temporal bar graph.

Six top-cited entities (i.e., year, country, institute, journal, author, and document type) with their hT indices are shown in Figure 4. Traditionally, more tables and figures would be needed to present these results.

Figure 4.

hT indices for entities shown on the Alluvial diagram.

3.3. Third objective: Subtheme differences in proportional counts between China and the United States

Seven MeSH clusters were identified using SNA,[26,27] as shown in Figure 5. In Figure 6, we observe that the subtheme of genetics in China has significantly higher proportional counts (P < .001) in articles than that in the United States. No difference was found in microbiology (P = .311). The other 5 subthemes favor the United States, including surgery, therapy, drug therapy, urinary bladder neoplasms, and economics (Fig. 6). The third research question was answered.
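A comparison of proportional counts between two countries can be illustrated with a two-proportion z-test. The source does not state which test underlies the forest plot, so the sketch below is only one common choice, and the counts in the example are invented rather than taken from the study.

```python
import math


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Return (z, two-sided P value) for the difference between x1/n1 and x2/n2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:
        raise ValueError("standard error is zero; proportions cannot be compared")
    z = (p1 - p2) / se
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Invented counts: articles on one subtheme out of all articles per country.
    z, p = two_proportion_z_test(300, 1000, 200, 1000)
    print(f"z = {z:.2f}, P = {p:.4g}")
```

A P value below the 0.05 significance level set in Section 2.7 would indicate that the subtheme's share of articles differs between the two countries.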

Figure 5.

Cluster analysis of MeSH terms into ten clusters based on RCR. MeSH = medical subject headings, RCR = relative citations ratio.

Figure 6.

Proportional counts of subthemes between China and the United States.

3.4. Additional visualizations in contrast to the traditional graphs

3.4.1. The region-based geographics rather than the traditional world map.

The publication- and RCR-based choropleth maps[49] are shown in Figure 7. We can see that the most productive and influential countries/regions (denoted by RCRs) were {Japan, Germany, and Italy} and {Japan, Germany, New York}, respectively, when the US states and provinces/metropolitan cities/areas in China were separately compared to other countries/regions.

Figure 7.

Geographical distribution of publications and RCRs on BC articles since 2012. RCR = relative citations ratio.

3.4.2. IBP used to display 100 top-cited BC articles.

The 100 top-cited BC articles are displayed as dots on the IBP shown in Figure 8. A red dot indicates clinical research. The article immediately appears on PubMed once the dot of interest is clicked. The article[50] authored by Babjuk et al from the Czech Republic was cited 621 times as of April 10, 2022. It is entitled “The European Association of Urology (EAU) Guidelines on Nonmuscle-invasive Urothelial Carcinoma of the Bladder” and was published in Eur Urol. in 2017.

Figure 8.

Hundred top-cited BC articles on the impactful beam plot. BC = bladder cancer, RCR = relative citations ratio.

3.5. Online dashboards shown on Google Maps

All dashboards in Figures 5 to 8 appear once the QR code is scanned or the links are clicked. Readers are advised to examine the details of the information for each entity. Links[51–54] are provided for better understanding of Figures 3 and 4.

4. Discussion

4.1. Principal findings

We observed that the RCRs in China were substantially higher than those in the United States since 2017, the STBG successfully explored the RCR trend of BC articles and was easier and simpler than the traditional ways using line charts, area plots, and TBGs, and the subtheme of genetics in China has a significantly higher proportion of articles than the United States. The most productive and influential countries/regions (denoted by RCRs) were {Japan, Germany, and Italy} and {Japan, Germany, New York}, respectively, when the US states and provinces/metropolitan cities/areas in China were separately compared to other countries/regions. The IBP used to display 100 top-cited articles is promising and novel in bibliometrics.

4.2. Trend analysis of topical entities in BC

The most-cited BC articles show that the most productive countries were China (24%), the United States (23%), and Japan (9%), different from the results of the United States (58%) and China (1%) reported from 100 top-cited BC articles since 1950.[1]

Japanese and Chinese BC patients had higher overall survival rates than Caucasians but had substantially lower survival than the United States (77%), Canada (75%), and Europe (68%).[15–17] More BC articles were published in those countries that have lower BC survival rates.

The greatest number of articles in the top 100 were published in the Journal of Urology (n = 15), followed by the Journal of Clinical Oncology (n = 14) and European Urology (n = 13),[1] different from our findings of {Urol Oncol, J Urol, Eur Urol} (Fig. 4) based on the hT index instead of publications alone.

The most cited article, authored by Babjuk et al[50] from the Czech Republic, has been cited 621 times since 2017, different from the finding[1] of the article by Stein et al[55] (Journal of Clinical Oncology, 2001; 851 citations in PubMed).

The article[1] used 2 tables and 6 figures to display study results in contrast to the current study using 2 figures in Methods and 6 figures in results, albeit more information was provided in this study, including the trend of each entity, geographical distribution of publications (and citations) in countries/regions, and comparison of subthemes between China and the United States based on the proportional counts observed in 9885 BC-related articles since 2012 in PubMed.

Importantly, publications have been increasing, whereas citations appear poised to decline based on the last 4 years of data shown in Figure 3A. However, the RCRs of several other entities increased (Fig. 3), indicating that the RCR is a viable and feasible trend measure in BC when compared with traditional total citations, which depend on article age and therefore always tend to show a decreasing trend for all entities.

It is worth mentioning that the enhanced TBG leads us to focus attention on the trend in addition to the HS shown in the traditional TBG.[28,32,33] For example, the RCRs were observed not only in the trend but also in the comparison of proportions among entities over the years. In addition, the HSs are a period of years instead of the turning point only, similar to a previous study[28] displaying HSs for keywords on TBGs.

4.3. Three most-cited articles

The most cited BC articles were {32360052, 27324428, 27375033} based on the article PubMed Unique Identifier with RCRs (124.92, 77.1, 57.39, respectively) rather than citations (198, 621, 450, respectively).

The top-ranked article was titled “European Association of Urology Guidelines on Muscle-invasive and Metastatic Bladder Cancer.”[56] It was cited 198 times, and RCR = 124.92, indicating that guidelines on muscle-invasive and metastatic BC are important to BC research.

The second-ranked article was cited 621 times with RCR = 77.1 and was titled “The European Association of Urology (EAU) Guidelines on Non-Muscle-Invasive Urothelial Carcinoma of the Bladder” and published by Eur Urol. in 2017.[50] The guidelines are also important to BC research.

The third-ranked article was cited 450 times with RCR = 57.39 in PubMed and titled “Updated 2016 EAU Guidelines on Muscle-Invasive and Metastatic Bladder Cancer,”[57] indicating that the guidelines are important to BC research as well.

4.4. Strengths and implications

The strengths and implications of the current study are listed below:

First, the enhanced TBG (called STBG) was demonstrated using the Sankey. We can easily examine the 3 major features: HS, BS, and the RCR trend, which are not mentioned in the traditional TBGs, line charts, and area plots.

Second, the Sankey category diagram (called the Alluvial diagram) is focused on the article entities and their category dimensions on the x-axis. One look is worth 1000 words. Traditionally, the results require ≥5 tables to display. Wikipedia says that an Alluvial diagram is a type of Sankey diagram “that uses the same kind of representation to depict how items regroup”.[44] Although the 2 terms are thus used interchangeably in practice, we prefer to use the name of Alluvial diagram shown in Figure 4, owing to dimensions rather than paths between related events (e.g., steps) on the x-axis.

Third, the traditional TBG[28,32,33] was enhanced through additional information provided for readers to understand the network features of BC articles, including HSs, BS, and the data trend. The STBG has the identical proportional feature as the area plot illustrated in Figure 1, but more information is provided from the Sankey than from line charts, area plots, and traditional TBGs.

Fourth, the hT index was used in this study. The reason is that the hT index is highly associated with the h-index[46,47] and has a higher discrimination power because it provides decimal values instead of the integer values that the h-index has. The computation of the hT index is shown at the link.[58] Readers are invited to use the link to compute the hT index.
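The contrast between integer and decimal values can be sketched as follows. The example assumes that the hT index refers to the tapered h-index, in which every citation contributes a fractional score of 1/(2L − 1), where L is the larger of the article's rank and the citation's position; this assumption and the function names are not confirmed by the source, and the computation at the authors' link[58] may differ.

```python
def h_index(citations: list[int]) -> int:
    """Largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)


def tapered_h_index(citations: list[int]) -> float:
    """Tapered h-index (assumed definition of hT; see the lead-in)."""
    ranked = sorted(citations, reverse=True)
    total = 0.0
    for rank, c in enumerate(ranked, start=1):
        if c <= rank:
            # All citations of this article share the weight 1 / (2 * rank - 1).
            total += c / (2 * rank - 1)
        else:
            # The first `rank` citations share 1 / (2 * rank - 1); each later
            # citation at position j is weighted 1 / (2 * j - 1).
            total += rank / (2 * rank - 1)
            total += sum(1.0 / (2 * j - 1) for j in range(rank + 1, c + 1))
    return total


if __name__ == "__main__":
    sample = [5, 3, 1]  # invented citation counts
    print(h_index(sample), round(tapered_h_index(sample), 3))
```

Under this definition, a citation record that forms a perfect square (for example, 3 articles with 3 citations each) yields the same value for both indices, while any extra citations add decimal increments only to the tapered version.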

Fifth, the method used to conduct this study is deposited in Supplemental File 2 (Supplemental Digital Content 2, http://links.lww.com/MD/H90). Through this, readers can easily learn how to draw the STBG on their own.

Sixth, BC is one of the top 10 most frequently occurring neoplasms worldwide and is the most frequently diagnosed malignancy in the urinary system.[1,2] This kind of bibliometric analysis of BC using the STBG is modern and has not been seen before in the literature. Bibliometric analysis helps further our knowledge of BC research, topics, and trends.[1] It is useful to identify the most influential articles and their impact pertinent to BC, particularly using the IBP to display the 100 top-cited articles on a dashboard. Furthermore, the STBG and Alluvial diagram are recommended for future research in bibliometrics.

4.5. Limitations and suggestions

Nonetheless, there are still some limitations in this study. First, the database was exclusively extracted from PubMed. The results of this study might be different from those of other major citation databases, such as Scopus, Web of Science, and Embase.

Second, the authors used the article RCRs instead of citations as indicators to measure the RAs. The results would be somewhat different from those of studies using citations. Nonetheless, the RCR is recommended for future studies because it has been adjusted and normalized, allowing citations to be compared by year and discipline. Otherwise, raw citations would always show a decreasing trend in recent years because citation counts grow with article age.

Third, the dashboards in Figures 5 to 8 are shown on Google Maps. Google Maps is not free of charge because it requires the application programming interface (API) with a paid project key. A limitation is that the dashboards are not publicly accessible without such an API key. The process of making dashboards is provided in Appendix 2, which helps readers apply the procedures to other topics.

Fourth, the item-response-theory model was applied to determine the IP days (or locations) and the burst spot on the TBG.[28,32,33] Future studies are required to examine other mathematical models (instead of the Newton–Raphson iteration method[59–62]) to determine the IPs on a given ogive curve for use in the Sankey.
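Locating an IP on an ogive curve with Newton–Raphson iteration can be sketched as follows. The source does not give the model equations, so the two-parameter logistic ogive, the parameter names `a` (slope) and `b` (location), and the starting value are all assumptions made for illustration; the iteration searches for the point where the second derivative of the curve equals zero.

```python
import math


def ogive(t: float, a: float, b: float) -> float:
    """Two-parameter logistic ogive (assumed form of the curve)."""
    return 1.0 / (1.0 + math.exp(-a * (t - b)))


def second_derivative(t: float, a: float, b: float) -> float:
    p = ogive(t, a, b)
    return a ** 2 * p * (1.0 - p) * (1.0 - 2.0 * p)


def third_derivative(t: float, a: float, b: float) -> float:
    p = ogive(t, a, b)
    return a ** 3 * p * (1.0 - p) * (1.0 - 6.0 * p + 6.0 * p * p)


def find_inflection_point(a: float, b: float, t0: float,
                          tol: float = 1e-10, max_iter: int = 50) -> float:
    """Newton-Raphson iteration on the second derivative, starting from t0."""
    t = t0
    for _ in range(max_iter):
        d3 = third_derivative(t, a, b)
        if abs(d3) < 1e-15:
            raise ZeroDivisionError("third derivative vanished; choose another t0")
        step = second_derivative(t, a, b) / d3
        t -= step
        if abs(step) < tol:
            return t
    raise RuntimeError("Newton-Raphson did not converge")


if __name__ == "__main__":
    # Invented parameters: the curve is steepest at year 2017.
    print(find_inflection_point(a=1.2, b=2017.0, t0=2017.5))
```

For this logistic form the IP coincides with the location parameter `b`, so the iteration mainly serves as a check; the starting value needs to lie reasonably close to the IP for the method to converge, which is one reason other models or solvers may be worth examining.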

Finally, the growth rate computed in studies[36,37] is based on the last 4 years. Future studies can adapt this window to practical needs using appropriate conditions, such as 2 years, 6 years, or more, to define the growth on the TBG.
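The effect of the window length on a growth measure can be illustrated with a simple least-squares slope over the most recent time points. This is a generic sketch, not the growth-rate formula of the cited studies[36,37], which is not reproduced in this article; the function name and the yearly values are hypothetical.

```python
def recent_trend_slope(values: list[float], window: int = 4) -> float:
    """Least-squares slope of the last `window` values against their index."""
    recent = values[-window:]
    n = len(recent)
    if n < 2:
        raise ValueError("at least 2 time points are needed to estimate a slope")
    x_mean = (n - 1) / 2.0
    y_mean = sum(recent) / n
    numerator = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(recent))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


if __name__ == "__main__":
    yearly_rcr = [12.0, 15.0, 14.0, 18.0, 21.0, 25.0]  # invented yearly values
    for w in (2, 4, 6):
        print(w, round(recent_trend_slope(yearly_rcr, window=w), 3))
```

A positive slope suggests growth over the chosen window and a negative slope suggests decline, and changing `window` shows how sensitive the trend classification is to the number of years considered.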

5. Conclusions

This study was the first to report the trend of the most cited entities in BC articles using Sankey-based TBGs. The results of this study provide a historical perspective on scientific evolution and disclose the research trends of topical entities in the BC field. This study identified 100 top-cited articles on the IBP and drew an Alluvial diagram to present the authors, affiliated countries, journals, and MeSH terms as blocks, in which a taller block indicates a higher hT index, to evaluate the RAs contributed to BC research. Research contributions assessed by the RCRs and visualized by the STBGs are suggested for use in future bibliographical studies.

Acknowledgments

We thank Enago (www.enago.tw) for the English language review of this article.

Authors’ contributions

YL and JC provided the concept and designed this study, TW interpreted the data, and YC monitored the process and the article. TW and YL drafted the article. All authors read the article and approved the final article.

Supplementary Material

medi-101-e30217-s001.xlsx (876.9KB, xlsx)
medi-101-e30217-s002.pdf (577.9KB, pdf)

Abbreviations:

BC = bladder cancer
BS = burst strength
HS = hot spot
IP = inflection point
MeSH = medical subject headings
RA = research achievement
RCR = relative citations ratio
SNA = social network analysis
STBG = Sankey temporal bar graph

How to cite this article: Lee Y-L, Chien T-W, Wang J-C. Using Sankey diagrams to explore the trend of article citations in the field of bladder cancer: Research Achievements in China higher than those in the United States. Medicine 2022;101:34(e30217).

The datasets generated during and/or analyzed during the current study are publicly available. All data used in this study are available in Supplemental Files.

Supplemental Digital Content is available for this article.

The authors declare that they have no competing interests.

The authors have no funding and conflicts of interest to disclose.

Contributor Information

Yen-Ling Lee, Email: yenpig8291@gmail.com.

Tsair-Wei Chien, Email: smile@mail.chimei.org.tw.


Articles from Medicine are provided here courtesy of Wolters Kluwer Health
