Abstract
Background:
Three issues in bibliometrics need to be addressed: (1) the lack of a clear definition of author collaborations in cluster analysis that takes into account collaborations with and without self-connections; (2) the need for a simple yet effective clustering algorithm for use in coword analysis; and (3) the inadequacy of general bibliometrics for comparing research achievements and identifying articles that are worth reading and recommending to readers. The study aimed to put forth a clustering algorithm for cluster analysis (called following leader clustering [FLCA], a follower-leading clustering algorithm), examine the dissimilarities in cluster outcomes when considering collaborations with and without self-connections in cluster analysis, and demonstrate the application of the clustering algorithm in bibliometrics.
Methods:
The study searched the Web of Science core collections for articles and review articles published in JMIR Medical Informatics between 2016 and 2022. The FLCA algorithm was used to identify author collaborations (ACs) and themes over the past 7 years. The study had 3 objectives: (1) comparing the results obtained from scenarios with and without self-connections; (2) applying the FLCA algorithm to ACs and themes; and (3) reporting the findings using traditional bibliometric approaches based on counts and citations. All plots were created using R.
Results:
The study found a significant difference in cluster outcomes between the 2 scenarios with and without self-connections, with a 53.8% overlap (14 of the 26 countries appearing among the top 20 in ACs in either scenario). Over the past 7 years, the top clusters were led by Yonsei University (South Korea) among institutes, Grang Luo (US) among authors, and the keyword "model" among themes. The entities with the most publications in JMIR Medical Informatics were the United States (country), Yonsei University in South Korea (institute), Medical School (department), and Grang Luo from the US (author).
Conclusion:
The FLCA algorithm proposed in this study offers researchers a comprehensive approach to exploring and comprehending the complex connections among authors or keywords. The study suggests that future research on ACs with cluster analysis should employ FLCA and R visualizations.
Keywords: author collaboration, cluster analysis, coword analysis, follower-leading clustering algorithm, JMIR Medical Informatics, R software, visualization
Key Points.
This study demonstrated the novelty of utilizing FLCA to identify author collaborations.
An R platform was applied to visualize the data, including the cluster analysis; such visualizations are innovative and unique in bibliometric analyses.
ACs focus on connections formed through collaboration and exclude entities without coauthored articles; their features and meaning differ from those of analyses that include self-connections, which focus on total publication counts.
1. Introduction
Social network analysis (SNA) is an interdisciplinary field that studies the relationships and connections among individuals, groups, and organizations.[1] It involves the use of mathematical and computational tools to analyze social networks, including their structure, properties, and behavior. SNA has various practical applications, such as studying communication flows, decision-making situations, organizational behavior, and construction project management.[2] SNA can also be used to inform health-related interventions, although there has been less research in this area.[3] SNA can be performed using different tools and techniques, such as Gephi,[4] Python,[5] R,[6] and Excel.[7]
There is a vast array of freely available tools for social network analysis, many of which are open source and accessible to anyone.[8] Scholars have employed various software tools to conduct coword analyses of author collaborations (ACs) and keywords,[9,10] such as CiteSpace,[11] VOSviewer,[12] or Bibexcel.[7] However, the classification methods used in these tools are often unclear and inadequately explained, resulting in inconsistent classification outcomes in unsupervised learning approaches.[13] For instance, the results of methods such as nearest distance or correlation coefficient can differ substantially depending on the co-occurrence relationships between authors,[14] posing a significant challenge to this study.
1.1. Problems in traditional cluster analysis of coword analysis
A previous study[3] used a spectral method derived from graph theory to uncover hidden topological structures in protein–protein interaction networks. The authors found that these hidden structures consist of biologically relevant functional groups, which suggests a new method for predicting the function of uncharacterized proteins. They applied the method to a yeast protein network, identified 48 quasicliques and 6 quasibipartites, and assigned functions to 76 uncharacterized proteins. However, that study[3] encountered challenges that are also prevalent in other research utilizing SNA or coword analysis: difficulty in interpreting complex networks with a large number of connections, clusters with an excess of vertices resulting from spectral analysis methods, and methods that are not described clearly enough to be replicated in future studies.
1.2. Coword analysis in bibliometrics
Researchers often employ tools such as CiteSpace,[11] VOSviewer,[12] or Bibexcel[7] to perform coword analyses on AC and keywords[9,10] in bibliometrics. However, interpreting the data from coword analysis methods can be challenging due to software that does not provide details about the features of the clustering process. To address this issue, we developed the follower-leading clustering algorithm (FLCA), which can be used as a tool to clarify the clustering process in coword analysis and provide a better understanding of ACs and keywords in bibliometrics. Details about the FLCA are illustrated in the Methods section of this study.
1.3. Visualization drawn in R is necessary
Bibliometric studies have seen exponential growth in recent years,[15–18] and the R language[6] is becoming increasingly common for visualizing bibliometrics, especially in cluster naming.[19–23] However, creating network diagrams or similar graphics[20,24] using R remains a challenge. Another challenge is to represent author collaboration results in a graph while explaining the clustering process in detail. No platform that addresses both challenges exists in the literature, which makes it difficult for researchers to use R for visualization. To overcome these challenges, the authors propose an R platform combined with the FLCA algorithm to explore coauthorship and keyword patterns.
1.4. Study aims
The aims of this study were (1) to develop a clustering algorithm appropriate for ACs and keywords; (2) to create a platform that generates R code so that researchers can easily draw visualizations; (3) to assess how the clustering results differ when self-connections are excluded from or included in the network; and (4) to demonstrate the FLCA algorithm for reporting bibliometric results.
2. Methods
2.1. Data source
We searched the Web of Science (WoS) database to collect metadata for articles published in the journal JMIR Medical Informatics. The search yielded 993 articles: 37, 35, 71, 111, 270, 278, and 191 for the years 2016 to 2022, respectively. We then analyzed and visualized the results using a custom-made module on the website.[23] To report which countries frequently coauthor with others, along with other relevant bibliometric information, we applied the FLCA algorithm and presented visualizations based on the R platform.[23]
Since all data, shown in Supplemental Digital Content 1, http://links.lww.com/MD/K402, were obtained from Web of Science, ethical approval was not required for this study.
2.2. Goal 1: FLCA
In SNA, data are often divided into 2 parts: the main data and their dyadic relationships (represented by dataset A and dataset B in Table 1). First, the network's association frequencies are arranged from highest to lowest, and each keyword (or author entity) is assigned a temporary serial number (or label) from 1 to n, where n is the number of entities. The association frequency score determines the size of the bubble, and the weights of the dyadic relationships are summed for the main data. For example, for the record A – B 2 (A and B are coauthors of 2 papers), both A and B are assigned a weight of 2; for the record A – A 2 (A is the sole author of 2 papers), A is assigned a weight of 2.
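The weight summation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `node_weights` and the tuple layout `(name_1, name_2, count)` are assumptions introduced here, and a self-connection is assumed to be counted once for its entity.

```python
from collections import defaultdict


def node_weights(edges):
    """Sum the dyadic weights for each entity.

    edges: iterable of (name_1, name_2, count) records (dataset B).
    A self-connection (name_1 == name_2) is counted once for that entity.
    """
    weights = defaultdict(int)
    for name_1, name_2, count in edges:
        weights[name_1] += count
        if name_2 != name_1:
            weights[name_2] += count
    return dict(weights)


# The two records from the example in the text.
edges = [("A", "B", 2), ("A", "A", 2)]
weights = node_weights(edges)
print(weights)  # {'A': 4, 'B': 2}
```

With both records present, A accumulates 2 from the coauthored papers and 2 from the sole-authored papers, while B receives only the coauthorship weight.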
Table 1.
The FLCA used in this study (searching leaders).
Step | Pseudocode |
---|---|
Input | k; dataset A (name, connections, cluster#) and dataset B (paired names and connections), sorted in descending order, with cluster# (1 to n) assigned to each entity |
Output | Temporary clusters |
1 | Followers search for leaders from lower to higher in connections, taking the maximum as the leader |
2 | For jk = n To 1 Step -1 (dataset A, size = n) |
3 | Entity = dataset A.Cells(jk, 1) |
4 | For j = 1 To m (dataset B, size = m) |
5 | Name1 = dataset B.Cells(j, 4) |
6 | Name2 = dataset B.Cells(j, 5) |
7 | Connections = dataset B.Cells(j, 6) |
8 | If (Entity = Name1 Or Entity = Name2) And Name1 <> Name2 Then |
9 | If Entity = Name1 Then |
10 | Leader = Name2 |
11 | Else |
12 | Leader = Name1 |
13 | End If |
14 | For jk2 = 1 To jk - 1 (counts for leader > counts for follower) |
15 | If dataset A.Cells(jk2, 1) = Leader Then |
16 | dataset A.Cells(jk, 3) = jk2 |
17 | (leader found) |
18 | GoTo 22 (end the loops over jk2 and j) |
19 | End If |
20 | Next jk2 (end dataset A) |
21 | Next j (end dataset B, size = m) |
22 | Next jk (end dataset A, size = n) |
FLCA = following leader clustering.
Table 1 illustrates the algorithm for finding leaders. First, for the main data (such as countries in dataset A), leaders are traced from the smallest entity to the largest (as in steps 2 to 22). Except for the entity with the maximum count (e.g., a country), all others are followers. Each follower is assigned a unique leader (the higher-ranked partner with which it has the highest number of links and, in case of ties, the partner with the higher total number of links) and takes the cluster label of that leader (as in step 17). If a follower is not associated with any other entity, it becomes an isolated entity that forms its own cluster (keeping the initial label from the Input step).
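The leader search in Table 1 can be sketched in Python as follows. This is one reading of the pseudocode under stated assumptions, not the authors' code: the names `find_leaders`, `weights`, and `edges` are hypothetical, ties in the ranking are broken alphabetically (an assumption), and the country counts are made-up toy values.

```python
def find_leaders(weights, edges):
    """Return (ranked names, {name: direct leader}).

    A name is its own leader when no higher-ranked partner exists.
    """
    # Rank by total weight, descending; ties broken by name (an assumption).
    ranked = sorted(weights, key=lambda name: (-weights[name], name))
    rank = {name: position for position, name in enumerate(ranked)}
    leader = {name: name for name in ranked}  # isolated entities keep their label

    # Followers search for leaders from the lowest-ranked entity upward.
    for follower in reversed(ranked):
        best_key, best_partner = None, None
        for name_1, name_2, count in edges:
            if name_1 == name_2 or follower not in (name_1, name_2):
                continue  # skip self-connections and unrelated records
            partner = name_2 if follower == name_1 else name_1
            if rank[partner] >= rank[follower]:
                continue  # a leader must rank above its follower
            key = (count, weights[partner])  # strongest link, then larger leader
            if best_key is None or key > best_key:
                best_key, best_partner = key, partner
        if best_partner is not None:
            leader[follower] = best_partner
    return ranked, leader


# Hypothetical toy data: total weights (dataset A) and dyadic records (dataset B).
weights = {"US": 8, "China": 6, "UK": 4, "Iran": 2}
edges = [("US", "China", 5), ("US", "UK", 3), ("China", "UK", 1), ("Iran", "Iran", 2)]
ranked, leader = find_leaders(weights, edges)
print(leader)  # {'US': 'US', 'China': 'US', 'UK': 'US', 'Iran': 'Iran'}
```

In this toy network, the UK follows the US rather than China because its link with the US is stronger, and Iran remains isolated because it has only a self-connection.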
Next, we proceed to the matching process in Table 2, starting from the largest entity and looking for its members. At this stage, there is a parameter k in the algorithm. When the rank of the entity is within the limit of k (<=k, as in step 3) and the entity has a sufficiently large total number of links (i.e., higher network centrality), and there are followers pointing to it, the entity can form a new cluster (similar to children growing up, finding partners, and leaving their parents to form their own families). Otherwise, the algorithm corrects the previously linked followers and assigns them the label of the new leader (i.e., changing their original cluster label due to following another leader, as in steps 11 to 17). This completes the entire cluster analysis operation.
Table 2.
The FLCA used in this study (matching up).
Step | Pseudocode |
---|---|
Input | k; sorted dataset A (name, connections, cluster#) |
Output | The final clusters |
1 | Set k (among the top k entities, every entity having at least one follower becomes a leader) |
2 | For jk = 1 To n (dataset A, size = n) |
3 | If jk <= k Then |
4 | Cluster# = dataset A.Cells(jk, 3) |
5 | Search for followers (with identical cluster#) |
6 | If found Then |
7 | dataset A.Cells(jk, 3) = initial cluster# |
8 | End If |
9 | Else |
10 | Cluster# = dataset A.Cells(jk, 3) |
11 | For j = jk + 1 To n |
12 | If dataset A.Cells(j, 3) = jk Then |
13 | (= leader's cluster#) |
14 | dataset A.Cells(j, 3) = Cluster# |
15 | (= set the follower's cluster# identical to the leader's cluster#) |
16 | End If |
17 | Next j (end cluster# update) |
18 | End If |
19 | Next jk (end dataset A, size = n) |
20 | Renumber cluster# from 1 to the number of clusters |
FLCA = following leader clustering.
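The matching-up phase in Table 2 can be sketched in Python along the following lines. This is an interpretation under stated assumptions rather than the authors' code: `match_up`, `ranked`, and `leader` are hypothetical names, `leader` maps each entity to its direct leader (or to itself when it has none), and the example network is made up.

```python
def match_up(ranked, leader, k):
    """Assign final cluster numbers.

    ranked: names sorted by descending weight.
    leader: {name: direct leader}, where a name maps to itself if it has none.
    k: entities ranked within the top k that have followers start a cluster.
    """
    has_follower = {leader[name] for name in ranked if leader[name] != name}
    root = {}
    for position, name in enumerate(ranked):
        if leader[name] == name:
            root[name] = name  # top entity or isolated entity
        elif position < k and name in has_follower:
            root[name] = name  # within the top k and followed: new cluster
        else:
            # The leader ranks higher, so its root is already resolved.
            root[name] = root[leader[name]]

    # Renumber clusters from 1 to the number of clusters.
    labels = {}
    clusters = {}
    for name in ranked:
        clusters[name] = labels.setdefault(root[name], len(labels) + 1)
    return clusters


# Hypothetical leader assignments from the first phase.
ranked = ["US", "China", "UK", "Korea", "Iran"]
leader = {"US": "US", "China": "US", "UK": "China", "Korea": "UK", "Iran": "Iran"}
print(match_up(ranked, leader, k=1))
# {'US': 1, 'China': 1, 'UK': 1, 'Korea': 1, 'Iran': 2}
print(match_up(ranked, leader, k=2))
# {'US': 1, 'China': 2, 'UK': 2, 'Korea': 2, 'Iran': 3}
```

Raising k from 1 to 2 lets China, which has a follower, split off from the US cluster, which mirrors the description that a larger k typically yields more clusters.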
The FLCA algorithm can be represented by mathematical expressions (1) and (2) and summarized in 3 steps. First, each follower has at most 1 unique leader, chosen by the maximum association attribute. Second, the leaders are sorted by weight (i.e., the link counts in the network) in descending order, and their followers are found. Third, the number of cluster leaders is determined by the value of k. The number of clusters may therefore vary with k; typically, more clusters are formed with a larger k value. To set an ideal value for k, we can use the minimum value of the absolute advantage coefficient (AAC)[24,25] for weighting.
(3) |
(4) |
(5) |
where the AAC is determined by 3 consecutive values (e.g., the numbers of members in the clusters, in descending order, denoted by A1, A2, and A3 in Eqs. 2 and 3). The AAC ranges from 0 to 1.0 and represents the strength of dominance of the top member compared with the next 2 members. To simplify the computation of the AAC, A2 and A3 are set to 1.0 if there is only 1 cluster, and A3 is set to 1.0 if there are 2 clusters.
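A form of the AAC that is consistent with this description (3 consecutive values, a range from 0 to 1.0, and dominance of the top member over the next 2) is given here as an assumption, not as the authors' exact equations; the symbols R_{12} and R_{23} are labels introduced for illustration, and the definition should be checked against references [24,25].

```latex
R_{12} = \frac{A_1}{A_2}, \qquad
R_{23} = \frac{A_2}{A_3}, \qquad
\mathrm{AAC} = \frac{R_{12}/R_{23}}{1 + R_{12}/R_{23}}
```

Under this assumed form, equal values (A_1 = A_2 = A_3) give AAC = 0.5, and the AAC approaches 1.0 as A_1 grows relative to A_2.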
2.3. Goal 2: Generative code website in R for visualization
When the user selects which graph to draw and inputs the data in the appropriate format, the platform generates R code.[23] Simply pasting the R code into R will create the desired graph. If adjustments are needed for font size and color, they can be easily made by modifying the parameters (see video tutorial[26]). For instance, the R code can promptly generate the network chart through the 2 links.[27,28]
2.4. Goal 3: Comparison of clustering results in 2 scenarios with and without self-connections
Two scenarios were considered for the dataset containing relations between vertices: 1 with self-connections and the other without. The first scenario focused on comparing the total counts of publications, while the second was concerned with author collaborations.[29] To identify differences in clustering results, 3 types of visualizations were used: network charts, a heatmap with a dendrogram, and Venn diagrams.[30]
2.5. Goal 4: Application with the use of FLCA algorithm
In this section, we demonstrate the application of the FLCA algorithm using 2 types of data sources (1 mode and 2 modes): (1) institute-based author collaborations and author collaborations without self-connections, and (2) theme analysis of keywords in WoS. To achieve research goal 4, we applied 2 types of visualizations: network charts and circle bar charts.
Furthermore, to showcase the ability of the FLCA algorithm to reveal hidden topological structures in protein–protein interaction networks, data from the study[3] were utilized to identify the top 10 clusters. A network chart was created to provide readers with a clear and concise visualization of the results.
2.6. Traditional reports in bibliometrics
Bibliometric reports typically include various visualizations to represent and analyze data. In this study, we used a 4-quadrant radar plot[31] and a temporal heatmap[32] to showcase dominant entities and articles worth reading in the journal JMIR Medical Informatics. The 4-quadrant radar plot, which displayed the dominant entities in JMIR Medical Informatics, was based on the CJAL score,[31,33,34] which considers article category, journal impact factor, authorship, and citations based on the L-index.[35]
2.7. Drawing software and packages
Cluster analysis of author collaborations reveals the collaboration patterns among authors. The entity with the highest weighted degree centrality in each cluster is designated as the representative of that cluster. The top 20 authors with the most publications are selected and clustered using the FLCA algorithm, and the results are displayed in visual graphs, including a heatmap with dendrograms, Venn diagrams, and circle bar charts.
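Choosing the representative of each cluster can be sketched as below. The function name `cluster_representatives` and the toy weights and cluster labels are hypothetical; the sketch assumes that the summed link weight of an entity stands in for its weighted degree centrality and that the first entity encountered wins a tie.

```python
def cluster_representatives(weights, clusters):
    """Return {cluster id: name of the entity with the largest weight}."""
    representatives = {}
    for name, cluster_id in clusters.items():
        current = representatives.get(cluster_id)
        if current is None or weights[name] > weights[current]:
            representatives[cluster_id] = name
    return representatives


# Hypothetical weights and final cluster labels.
weights = {"US": 8, "China": 6, "UK": 4, "Iran": 2}
clusters = {"US": 1, "China": 2, "UK": 2, "Iran": 3}
print(cluster_representatives(weights, clusters))
# {1: 'US', 2: 'China', 3: 'Iran'}
```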
The method of conducting this study is displayed in Supplemental Digital Content 2, http://links.lww.com/MD/K403. The R platform[23] generates code in R language (version 4.2.1).[6] RStudio software (version 1.3.959) is suggested. The R platform system was developed by the authors.
3. Results
3.1. Comparison of clustering results in 2 scenarios with and without self-connections
The majority of publications in JMIR Medical Informatics are dominated by 2 countries, the United States and China. The study compared 2 scenarios with and without self-connections in the dataset and found that they resulted in different clustering outcomes when parameter k was set to 2 (Fig. 1). The 2 scenarios had distinct meanings and implications, with 1 focusing on total publication counts and the other on author collaborations in the network.
Figure 1.
Comparison of network outcomes between the two scenarios with and without self-connections across countries on author collaborations.
The heatmap with the dendrogram in the scenario without self-connections showed 7 clusters, while the FLCA algorithm identified 2 clusters in panel B of Figure 2. The study found a significant difference in cluster outcomes between the 2 scenarios in Figure 3, with a 53.8% overlap in the top 20 countries in author collaborations. The results demonstrate the reliability of the FLCA algorithm in accurately identifying clusters in bibliometrics. Figures 1, 2, and 3 provide visualization of these findings.
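The 53.8% figure is consistent with a shared-over-union calculation on the two top-20 lists: with 14 shared countries, the union contains 20 + 20 - 14 = 26 countries, and 14/26 = 53.8%. The sketch below illustrates this arithmetic only; the function name `overlap_share` and the placeholder country labels are hypothetical, and the assumption that the overlap was computed this way is an inference from the reported numbers.

```python
def overlap_share(set_a, set_b):
    """Return (shared count, union count, shared / union)."""
    shared = set_a & set_b
    union = set_a | set_b
    return len(shared), len(union), len(shared) / len(union)


# Placeholder labels: 14 countries in both top-20 lists, 6 unique to each.
common = {f"shared_{i}" for i in range(14)}
with_self = common | {f"only_with_{i}" for i in range(6)}
without_self = common | {f"only_without_{i}" for i in range(6)}

shared, total, share = overlap_share(with_self, without_self)
print(shared, total, round(100 * share, 1))  # 14 26 53.8
```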
Figure 2.
The difference in clustering between the heatmap and the FLCA algorithm (e.g., the scenario without self-connections). FLCA = following leader clustering.
Figure 3.
Differences in countries between the two scenarios with and without self-connections on author collaborations.
3.2. Application with the FLCA algorithm
In the scenario without self-connections, the top clusters over the past 7 years were led by Yonsei University (South Korea) among institutes, Grang Luo (US) among authors, and the keyword "model" among themes, as shown in Figures 4, 5, and 6.
Figure 4.
Institute-based author collaborations without self-connections in JMIR Medical Informatics.
Figure 5.
Author collaborations without self-connections in JMIR Medical Informatics.
Figure 6.
Coword analysis of keywords plus in WoS based on theme evolutions and clusters. WoS = Web of Science.
Figure 7 shows the top 10 clusters identified using the FLCA algorithm to uncover hidden topological structures in protein–protein interaction networks.
Figure 7.
Top 10 clusters classified using the FLCA algorithm to reveal hidden topological structures in protein–protein interaction networks, with data from the study.[3] FLCA = following leader clustering.
3.3. Traditional reports in bibliometrics
The traditional bibliometric reports, based on the 4-quadrant radar plot[31] and a temporal heatmap,[32] showcase the dominant entities and the articles worth reading in the journal JMIR Medical Informatics. The entities with the most publications in JMIR Medical Informatics were the United States, Yonsei University in South Korea, Medical School, and Grang Luo from the US, as shown in Figure 8. Between the rankings by publications and by CJAL scores, only the top institute changed (the University of Washington [US] replaced Yonsei University [South Korea]); the other 3 entities remained the same.
Figure 8.
Comparison of research achievements for entities in JMIR Medical Informatics since 2016 by country, institute, department, and author.
Figure 9 displays the top 20 articles with the highest citation counts in JMIR Medical Informatics. The first 3 articles that deserve reading, cited extensively in the literature,[36–38] will be specifically discussed in the upcoming section.
Figure 9.
Top 20 most cited articles in JMIR Medical Informatics.
4. Discussion
4.1. Principal findings
The study investigated 993 articles published between 2016 and 2022 and demonstrated the effectiveness of the FLCA algorithm in identifying clusters in bibliometrics. The results showed a significant difference in cluster outcomes between the scenarios with and without self-connections, with a 53.8% overlap in the top 20 countries in ACs. The top clusters were led by Yonsei University (South Korea) among institutes, Grang Luo (US) among authors, and the keyword "model" among themes.
The study successfully achieved its 4 objectives: developing a clustering algorithm for ACs, creating a platform for easy visualization creation, evaluating the impact of self-connections in the network, and demonstrating the FLCA algorithm in bibliometric reporting.
4.2. Additional Information
In this study, we utilized SNA[39,40] and co-occurrence analysis in traditional bibliometrics[41] to explore the classification of author collaborations.[42–44] To differentiate clusters, we developed the FLCA algorithm and demonstrated its effectiveness in Figures 2–4. Furthermore, we created a novel visualization R platform using R language to generate visual graphs, as illustrated in Figures 2–8. This study’s contributions are innovative, as no previous research has sufficiently explained the classification methods or processes of various software tools such as CiteSpace,[11] VOSviewer,[12] or Bibexcel[7] that offer cluster analysis functions.
Effective classification is vital in various domains, such as co-occurrence analysis of “keywords,” member analysis of professions in contact with specific patients in hospitals, and the association between drugs and diagnoses. Visual graph presentation is particularly useful in everyday life. The R-generated platform[23] offered in this study addresses the programming challenges faced by many health care researchers. For example, with the R platform, the R code can promptly generate the network chart through the 2 links.[27,28]
Collecting a considerable amount of personal academic achievement data, especially when considering the contribution weights and allocations among coauthors, is a challenging task. More refined comparative classification analyses have not been feasible. Bibliometrics cannot be adequately presented using traditional network graphs alone, and other visual graphs, such as circle bar diagrams and heatmaps with dendrograms, are necessary for demonstration.
Although scholars have asserted that contribution weights among coauthors should be distributed reasonably,[45,46] specific applications and implementations have not been common.[47,48] In this study, we utilized a contribution weight calculation[29] that assigns equal contribution weights to each author of a paper.
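Equal contribution weighting can be illustrated with a short sketch in which each of the n authors of a paper receives a credit of 1/n. This reading of "equal contribution weights" is an assumption, the scheme in reference [29] is not reproduced here, and the function name `equal_author_credit` and the author lists are hypothetical.

```python
from collections import defaultdict


def equal_author_credit(papers):
    """Give every author of a paper an equal share (1 / number of authors)."""
    credit = defaultdict(float)
    for authors in papers:
        share = 1.0 / len(authors)
        for author in authors:
            credit[author] += share
    return dict(credit)


# Hypothetical author lists for three papers.
papers = [["A", "B"], ["A"], ["A", "B", "C", "D"]]
credit = equal_author_credit(papers)
print(credit)  # {'A': 1.75, 'B': 0.75, 'C': 0.25, 'D': 0.25}
```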
The findings of this study suggest that there have been changes in the patterns of author collaborations in JMIR Medical Informatics. Differences were observed between scenarios with and without self-connections, as shown in Figures 2–4. Only 53.8% (n = 14 out of 26) of the ACs were consistent in both scenarios. Additionally, a unique Venn diagram was presented in Figure 4 and further elaborated in Supplemental Digital Content 1, http://links.lww.com/MD/K402 using the R platform developed by the authors.[23]
Figure 2 highlights the importance of evaluating clustering outcomes when self-connections are included or excluded from the network. When self-connections are included, Iran is present in the network. However, if self-connections are removed, the clustering results differ (e.g., Iran disappears), as depicted in Figure 2.
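The effect of removing self-connections can be sketched as follows: an entity whose only records are self-connections drops out of the network once those records are removed. The function names and the counts are hypothetical and chosen only to reproduce the qualitative behavior described for Iran.

```python
def without_self_connections(edges):
    """Keep only the records that link two different entities."""
    return [(a, b, count) for a, b, count in edges if a != b]


def entities(edges):
    """Return the set of names that appear in any record."""
    return {name for a, b, _ in edges for name in (a, b)}


# Hypothetical records: Iran has sole-country papers only.
edges = [("US", "China", 5), ("US", "US", 40), ("Iran", "Iran", 3)]
collaboration_edges = without_self_connections(edges)
dropped = entities(edges) - entities(collaboration_edges)
print(dropped)  # {'Iran'}
```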
4.3. Articles worth reading
The most cited article was authored by Desautels et al[36] from the United States in 2016 and had been cited 207 times in WoS. The authors noted that sepsis is one of the leading causes of mortality in hospitalized patients, but a reliable means of predicting sepsis onset remains elusive. A machine learning classification system (called InSight) was developed to predict sepsis in intensive care unit patients aged 15 years or more. The classification performance of InSight was compared with that of several other scores in determining whether patients would become septic at a fixed period of time before onset.
The second most cited article was authored by Kruse et al[37] from the United States in 2016 and cited 141 times in WoS. The authors conducted a systematic review on big data in health care and reported that a total of 3 searches were performed for publications between 2010 and 2016, and 9 and 14 themes were identified under the categories Challenges and Opportunities, respectively.
The third most cited article was authored by Sheikhalishahi et al[38] from Italy in 2019 and had been cited 127 times in WoS. The authors conducted a systematic review on new approaches to treating chronic diseases based on machine learning. Of the 2652 articles considered, 106 met the inclusion criteria. Of the 43 chronic diseases, 38 were focused on diseases of the circulatory system, while endocrine and metabolic diseases were the fewest. There has been a significant increase in the use of machine learning methods in clinical note analysis, but deep learning methods remain emergent. The authors suggested that efforts are still required to improve clinical NLP methods, such as recognition of relations among entities and temporal extraction.
4.4. Implications and possible changes
This study introduces a novel approach for generating visual graphs using R language (see video introduction[23]), providing a unique way to visualize data on author collaborations (or cowords) in bibliometrics. The study also proposes a classification and comparison of journal country-based author clusters (Fig. 2), each with its distinct characteristics and meanings, utilizing network charts and FLCA processes. In the future, these methods (i.e., FLCA and visualizations in R) could be applied for other comparisons, such as team collaboration in team management or activity contribution evaluations within hospital (or school) departments, which can be displayed using circle bar charts, Venn diagrams, or network charts, especially with the FLCA algorithm.
The Venn diagram[30] is an effective tool for displaying the proportions of each of the 7 chronic diseases and the percentage of individuals who have 2 of them simultaneously. For instance, 24% of individuals have both Alzheimer disease and hypertension, which is the most prevalent combination of diseases.[49] The circle size in the diagram varies based on the proportion. Additionally, interactive design allows us to actively explore the relationships between different chronic diseases, which is beneficial for medical researchers studying treatment plans for individuals with multiple chronic conditions.
4.5. Limitations and suggestions
The generalizability of this study’s findings is limited to the collaboration patterns among authors in JMIR Medical Informatics, and caution must be exercised when extrapolating to other journals or organizations.
To improve the interpretability of SNA visualizations, more suitable graph representations may be needed, and the R platform proposed in this study for visualization still has some limitations, such as the need for user-friendly interfaces.
While this study focuses on clustering the top 20 authors using the FLCA algorithm, attention should be given to 3 key aspects: followers, leader ranking, and the parameter k, as demonstrated in the bottom panel of Figure 1.
The optimal value for the k parameter in the FLCA algorithm has not been clearly addressed, and simulations may be required to determine the minimum AAC for appropriate clustering of ACs or keywords in bibliometrics, as shown in Eqs 1 to 3; see Supplemental Digital Content 4, http://links.lww.com/MD/K405.
While this study presents heatmaps, circle bar charts, Venn diagrams, and network graphs, it may be worthwhile to explore whether other graphical representations, such as Sankey diagrams, can provide additional insights.
R is a useful tool for creating visual diagrams, and this study offers several R plots (e.g., network chart, heatmap with dendrogram, Venn diagram, and circle bar chart) as references for interested readers.
5. Conclusion
This study serves as an illustration of how the FLCA algorithm can be utilized to examine author collaborations in academic papers and how the results can be visualized using R. This can serve as a valuable resource for researchers in the field of bibliometrics who are interested in visualizing their data. Notably, the study offers an examination of AC clusters, which can be found in the Supplemental Digital Content 3, http://links.lww.com/MD/K404 and video for further reference.
Acknowledgments
We thank Enago (www.enago.tw) for the English language review of this manuscript.
Author contributions
Conceptualization: Teng-Yun Cheng, Sam Yu-Chieh Ho.
Data curation: Julie Chi Chow.
Investigation: Willy Chou.
Methodology: Tsair-Wei Chien.
Supplementary Material
Abbreviations: AAC = absolute advantage coefficient, AC = author collaboration, FLCA = following leader clustering, SNA = social network analysis, WoS = Web of Science.
All data are publicly available in the WoS.
The datasets generated during and/or analyzed during the current study are publicly available.
Supplemental Digital Content is available for this article.
The authors have no funding and conflicts of interest to disclose.
How to cite this article: Cheng T-Y, Ho SY-C, Chien T-W, Chow JC, Chou W. A comprehensive approach for clustering analysis using follower-leading clustering algorithm (FLCA): Bibliometric analysis. Medicine 2023;102:42(e35156).
Contributor Information
Teng-Yun Cheng, Email: scylla1003@gmail.com.
Sam Yu-Chieh Ho, Email: t20317@hotmail.com.
Tsair-Wei Chien, Email: smile@gmail.chimei.org.tw.
Julie Chi Chow, Email: jcchow2@yahoo.com.tw.
References
- [1].Yang ACH, Chaudhury H, Ho JCF, et al. Measuring the impact of bedroom privacy on social networks in a long-term care facility for Hong Kong older adults: a spatio-social network analysis approach. Int J Environ Res Public Health. 2023;20:5494. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [2].Trach R, Khomenko O, Trach Y, et al. Application of fuzzy logic and SNA tools to assessment of communication quality between construction project participants. Sustainability. 2023;15:5653. [Google Scholar]
- [3].Bu D, Zhao Y, Cai L, et al. Topological structure analysis of the protein–protein interaction network in budding yeast. Nucleic Acids Res. 2003;31:2443–50. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [4].Bastian M, Heymann S, Jacomy M. Gephi: an open source software for exploring and manipulating networks. In: Third international AAAI conference on weblogs and social media; 2009. [Google Scholar]
- [5].Python Software Foundation. Python Language Reference, version 3.10. Available at: https://docs.python.org/3/. [access date March 3, 2023].
- [6].R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Available at: https://www.R-project.org/. [access date March 3, 2023]. [Google Scholar]
- [7].Persson O. Analyzing bibliographic data to visualize representations. Available at: https://homepage.univie.ac.at/juan.gorraiz/bibexcel/. [access date March. 3, 2023].
- [8].Aishwaryasum. Top 10 social network analysis tools to consider. Available at: https://www.geeksforgeeks.org/top-10-social-network-analysis-tools-to-consider/. [access date March 3, 2023].
- [9].Hu S, Xu S, Lu W, et al. The research on the treatment of primary immunodeficiency diseases by hematopoietic stem cell transplantation: a bibliometric analysis from 2013 to 2022. Medicine (Baltim). 2023;102:e33295. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [10].Cheng H, Lin L, Liu T, et al. Financial toxicity of breast cancer over the last 30 years: a bibliometrics study and visualization analysis via CiteSpace. Medicine (Baltim). 2023;102:e33239. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [11].Pubmed. CiteSpace and VOSviewer used in bibliometrics. Available at: https://pubmed.ncbi.nlm.nih.gov/?term=vosviewer+and+citespace. [access date September 22, 2022].
- [12].van Eck NJ, Waltman L. Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics. 2010;84:523–38.
- [13].Neptune. Exploring clustering algorithms: explanation and use cases. Available at: https://neptune.ai/blog/clustering-algorithms. [access date April 22, 2023].
- [14].Leydesdorff L, Bornmann L, Wagner CS. Generating clustered journal maps: an automated system for hierarchical classification. Scientometrics. 2017;110:1601–14.
- [15].Block JH, Fisch C. Eight tips and questions for your bibliographic study in business and management research. Manag Rev Q. 2020;70:307–12.
- [16].Pubmed. Articles related to bibliometrics. Available at: https://pubmed.ncbi.nlm.nih.gov/?term=bibliometric%5BMeSH%20Major%20Topic%5D&sort=pubdate&timeline=expanded. [access date September 22, 2022].
- [17].Pubmed. Articles related to meta-analysis. Available at: https://pubmed.ncbi.nlm.nih.gov/?term=meta-analysis%5BMeSH%20Major%20Topic%5D&sort=pubdate&timeline=expanded. [access date September 22, 2022].
- [18].Moreno-Morente G, Hurtado-Pomares M, Terol Cantero MC. Bibliometric analysis of research on the use of the nine hole peg test. Int J Environ Res Public Health. 2022;19:10080.
- [19].Zhu H, Shi L, Wang R, et al. Global research trends on infertility and psychology from the past two decades: a bibliometric and visualized study. Front Endocrinol (Lausanne). 2022;13:889845.
- [20].Yacouba A, Olowo-Okere A. Global trends and current status in colistin resistance research: a bibliometric analysis (1973-2019). F1000Res. 2020;9:856.
- [21].Valera-Gran D, Prieto-Botella D, Peral-Gómez P, et al. Bibliometric analysis of research on telomere length in children: a review of scientific literature. Int J Environ Res Public Health. 2020;17:4593.
- [22].Martynov I, Klima-Frysch J, Schoenberger J. A scientometric analysis of neuroblastoma research. BMC Cancer. 2020;20:486.
- [23].Chien TW. To generate R language for visualizations. Available at: https://www.healthup.org.tw/raschonline/cbp.asp. [access date March 3, 2023].
- [24].Yang DH, Chien TW, Yeh YT, et al. Using the absolute advantage coefficient (AAC) to measure the strength of damage hit by COVID-19 in India on a growth-share matrix. Eur J Med Res. 2021;26:61.
- [25].Yang TY, Chien TW, Lai FJ. Citation analysis of the 100 top-cited articles on the topic of hidradenitis suppurativa since 2013 using Sankey diagrams: bibliometric analysis. Medicine (Baltim). 2022;101:e31144.
- [26].Chien TW. How to conduct this study. Available at: https://youtu.be/J-JHSWbI-nw. [access date March 2, 2023].
- [27].Chien TW. An example of FLCA algorithm used in country-based author collaborations. Available at: https://www.healthup.org.tw/raschonline/JMIRMIwd2.htm. [access date May 4, 2023].
- [28].Chien TW. An example of FLCA algorithm used in country-based author collaborations in R. Available at: https://www.healthup.org.tw/raschonline/JMIRMIwd.htm. [access date May 4, 2023].
- [29].Wu JW, Yan YH, Chien TW, et al. Trend and prediction of citations on the topic of neuromuscular junctions in 100 top-cited articles since 2001 using a temporal bar graph: a bibliometric analysis. Medicine (Baltim). 2022;101:e30674.
- [30].Venn J. On the diagrammatic and mechanical representation of propositions and reasonings. Phil Mag. 1880;5:406–18.
- [31].Shao Y, Chien TW, Jang FL. The use of radar plots with the Yk-index to identify which authors contributed the most to the journal of Medicine in 2020 and 2021: a bibliometric analysis. Medicine (Baltim). 2022;101:e31033.
- [32].Ho SY, Chien TW, Tsai KT, et al. Analysis of citation trends to identify articles on delirium worth reading using DDPP model with temporal heatmaps (THM): a bibliometric analysis. Medicine (Baltim). 2023;102:e32955.
- [33].Chow JC, Ho SY, Chien TW, et al. A leading author of meta-analysis does not have a dominant contribution to research based on the CJAL score: bibliometric analysis. Medicine (Baltim). 2023;102:e33519.
- [34].Yeh JT, Shulruf B, Lee HC, et al. Faculty appointment and promotion in Taiwan’s medical schools, a systematic analysis. BMC Med Educ. 2022;22:356.
- [35].Belikov AV, Belikov VV. A citation-based, author- and age-normalized, logarithmic index for evaluation of individual researchers independently of publication counts. F1000Res. 2015;4:884.
- [36].Desautels T, Calvert J, Hoffman J, et al. Prediction of sepsis in the intensive care unit with minimal electronic health record data: a machine learning approach. JMIR Med Inform. 2016;4:e28.
- [37].Kruse CS, Goswamy R, Raval Y, et al. Challenges and opportunities of big data in health care: a systematic review. JMIR Med Inform. 2016;4:e38.
- [38].Sheikhalishahi S, Miotto R, Dudley JT, et al. Natural language processing of clinical notes on chronic diseases: systematic review. JMIR Med Inform. 2019;7:e12239.
- [39].Gyorki DE, Roland CL. ASO author reflections: standardization in the management of retroperitoneal sarcoma through international collaboration. Ann Surg Oncol. 2021;28:7889–90.
- [40].Ho SY, Chien TW, Huang CC, et al. A comparison of 3 productive authors’ research domains based on sources from articles, cited references and citing articles using social network analysis. Medicine (Baltim). 2022;101:e31335.
- [41].Yu F, Patel T, Carnegie A, et al. Evaluating the impact of a CTSA program from 2008 to 2021 through bibliometrics, social network analysis, and altmetrics. J Clin Transl Sci. 2023;7:e44.
- [42].Chien TW, Chang Y, Wang HY. Understanding the productive author who published papers in medicine using National Health Insurance Database: a systematic review and meta-analysis. Medicine (Baltim). 2018;97:e9967.
- [43].Hsieh WT, Chien TW, Kuo SC, et al. Whether productive authors using the national health insurance database also achieve higher individual research metrics: a bibliometric study. Medicine (Baltim). 2020;99:e18631.
- [44].Liu MY, Chou W, Chien TW, et al. Evaluating the research domain and achievement for a productive researcher who published 114 sole-author articles: a bibliometric analysis. Medicine (Baltim). 2020;99:e20334.
- [45].Sahel JA. Quality versus quantity: assessing individual research performance. Sci Transl Med. 2011;3:84cm13.
- [46].Petersen AM, Fortunato S, Pan RK, et al. Reputation and impact in academic careers. Proc Natl Acad Sci U S A. 2014;111:15316–21.
- [47].Batista PD, Campiteli MG, Kinouchi O. Is it possible to compare researchers with different scientific interests? Scientometrics. 2006;68:179–89.
- [48].Hagen NT. Harmonic allocation of authorship credit: source-level correction of bibliometric bias assures accurate publication and citation analysis. PLoS One. 2008;3:e4021.
- [49].The New York Times. For the Elderly, Diseases That Overlap [interactive feature]. NYTimes.com. [access date April 30, 2023].
Associated Data