Appraising scientific literature is often cumbersome. Bibliometrics, a branch of library and information science, deals with the quantitative analysis of bibliographic data. Bibliometric analysis (BA), which primarily focuses on academic productivity, uses published scientific literature (research articles, books, conference proceedings, etc.) to measure research activities in a specific area. 1 BA relies on data such as journals, titles, authors, addresses, abstracts, and the reference lists of published literature. Prominent databases such as Web of Science, Scopus, PubMed, and Dimensions are commonly used by researchers to collect and analyze data. BA is concerned with the quantitative analysis of citations and citation counts, the significance of a research topic, and the research undertaken in particular geographic regions. Data at the heart of BA tend to be massive (e.g., hundreds, if not thousands, of articles and their quantitative details) and objective (e.g., the number of citations an article received, the number of publications by an author, and occurrences of keywords and topics within an article), although BA often relies on the interpretation of these quantitative details of publications.
Scientists use BA for various reasons, such as discovering emerging trends in a journal, article performance, patterns of collaboration, and research constituents (author details, coauthorship, citations obtained, etc.) and examining the intellectual structure of a particular area in the existing literature. 2 BA helps outline and map the cumulative scientific knowledge and evolutionary nuances of an established field and rigorously understand large volumes of unstructured data. Thus, a well-done bibliometric study can build a solid foundation for advancing a field in novel and meaningful ways: it enables scholars to obtain a detailed overview, identify lacunae of research in a particular area, get new ideas for investigation, and position these investigations in their anticipated contribution to the field. 3
BA is also used to identify papers cited most frequently in a researcher’s field of interest and identify author networks on a global scale. The analysis also indicates how influential some articles are in online databases.4–7 Advanced BA is a great way to get a reliable, clear, and unbiased quantitative estimate of the rise in scientific publications, especially in the medical field. In library and information science, bibliometric methods are extensively utilized. Even though these analyses provide an understanding of published literature, unlike systematic reviews and meta-analyses, BA does not perform a synthesis of concerned knowledge and attributes.
The number of publications on BA has increased over time, with a yearly average of more than 1,000 publications in the last decade. This can be attributed to the growth of scientific research itself, as large bibliographic data sets have made classical assessment methods (e.g., literature review by a qualitative approach and meta-analysis by a quantitative approach) difficult to apply. However, to our knowledge, no article describing BA methodology in detail has been published in a medical journal so far.
How It Contrasts with Systematic Review (SR) and Meta-Analysis (MA)
Because of a similar methodology, researchers may find it hard to differentiate between these designs. The differences are listed in Table 1.
Table 1.
Difference Between SR, MA, and BA
| Systematic Review | Meta-Analysis | Bibliometric Analysis |
| Poses a specific clinical question. Often done before a MA. Focus is narrow. | Provides a conclusive summary of evidence in a field of interest and topic based on the research question. Usually performed to quantify the effect of an intervention/treatment. Analyses quantitative data (mean, standard deviation, odds ratio/relative risk, etc.) extracted from studies through a SR. | Helps identify pertinent research papers in a particular area of interest. It gives an account of quantitative summary of author details, academic productivity in a specific field, and the quality of journals. Cannot provide a conclusive summary or depict the effect of an intervention, as in MA. |
| Analyses are often subjective and with qualitative narration. | Pure statistical applications to combine homogeneous studies and obtain a pooled estimate. | Analysis focuses on quantitatively summarizing bibliometric characteristics of research topics and detailing research constituents, authorship features, and intellectual structure of a research area. |
| Results are illustrated in tabular form along with a flow chart representing study screening and selection. | Results depicted using forest plot, funnel plot, and heterogeneity statistics. | Mapping techniques, graphical representation, tabulated form, network diagrams, etc. are used to present results. Analysis is performed with the assistance of software. |
SR: systematic review, MA: meta-analysis, BA: bibliometric analysis.
Citation Network: The Crux of BA
Citation network is the core principle of BA. Citation networks are used to identify trends in the frequency of citations and references between papers based on parameters like publication date and subject area. They can also be used to construct measurements of scientific impact like the h-index and investigate the structure and evolution of various disciplines of academic inquiry. Two well-known methods used are performance analysis and science mapping; both rely on the same principle.
Performance Analysis
Performance analysis (PA), sometimes referred to as citation analysis, has been commonly performed in medical literature in the last decade to quantify the details of published papers. 8 In a research field, PA examines the contributions of various research constituents such as authors, institutes, countries, and journals as a whole. Among the many PA measures, the most meaningful are the number of publications per year and the number of citations per year for each research constituent. The productivity of research constituents can be analyzed by publication metrics that include total publications (TP), number of contributing authors (NCA), number of sole-authored publications (SA), number of coauthored publications (CA), and number of active years of publications (NAY). Productivity per active year of publication is calculated by dividing the total number of publications by the number of active years of publications (PAY = TP ÷ NAY). 2
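The publication metrics above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the `Publication` record and the function name are hypothetical, and NAY is assumed here to mean the number of distinct calendar years in which the constituent published at least once.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    """Hypothetical minimal record: publication year and author names."""
    year: int
    authors: list[str]


def productivity_metrics(pubs: list[Publication]) -> dict[str, float]:
    """Compute TP, NCA, SA, CA, NAY, and PAY for one research constituent."""
    tp = len(pubs)                                      # total publications
    nca = len({a for p in pubs for a in p.authors})     # distinct contributing authors
    sa = sum(1 for p in pubs if len(p.authors) == 1)    # sole-authored publications
    ca = sum(1 for p in pubs if len(p.authors) > 1)     # coauthored publications
    nay = len({p.year for p in pubs})                   # assumed: distinct years with output
    pay = tp / nay if nay else 0.0                      # PAY = TP / NAY
    return {"TP": tp, "NCA": nca, "SA": sa, "CA": ca, "NAY": nay, "PAY": pay}


if __name__ == "__main__":
    # Invented example data, not taken from any real author.
    example = [
        Publication(2019, ["X"]),
        Publication(2019, ["X", "Y"]),
        Publication(2020, ["X", "Z"]),
        Publication(2022, ["X", "Y", "Z"]),
        Publication(2022, ["X"]),
    ]
    print(productivity_metrics(example))
```

If a study defines active years differently (for example, as the span between the first and last publication), only the `nay` line would need to change.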
Citation metrics incorporate total citations and average citations per publication, year, or period. This metric determines the significance and quality of published literature by counting the number of times other researchers have mentioned it in their work. By this method, the impact of an article can be measured. This is not an easy task because the researcher should search various databases to access data. Another part that makes this analysis difficult is the question of free access to well-established databases. 9
Two other widely used indices are the h-index and g-index. The h-index combines an author’s number of publications with the citations those publications have received. If “h” of a scientist’s Np (number of publications) have at least “h” citations each, and the remaining (Np − h) papers have no more than “h” citations each, then the scientist’s index is “h.” An h-index of 4 means the author has four publications with at least four citations each; an example is illustrated in Table 2. The h-index thus reflects both the number of papers and the citations to these papers.
Table 2.
Manual Calculation of h and g Indices
| Articles of Mr. X | Ranking of Articles According to Citations Obtained | No. of Citations Obtained | Sum of Citations | g² (Square of Rank) | h-Index | g-Index |
| A-1 | 1 | 15 | 15 | 1 | – | – |
| A-2 | 2 | 10 | 25 | 4 | – | – |
| A-3 | 3 | 7 | 32 | 9 | – | – |
| A-4 | 4 | 4 | 36 | 16 | 4* | – |
| A-5 | 5 | 3 | 39 | 25 | – | – |
| A-6 | 6 | 2 | 41 | 36 | – | 6** |
| A-7 | 7 | 1 | 42 | 49 | – | – |
Note: *The h-index calculated for Mr. X is four, as article A-4, with rank 4, has received at least four citations. The h-index cannot be five because article A-5, with rank 5, does not have a minimum of five citations. **The g-index is the highest rank at which the square of the rank (g²) is less than or equal to the cumulative sum of citations. Here, the g-index is six, as the g² of article A-6 (36) is less than the corresponding sum of citations (41). The g-index cannot be seven because article A-7 has a g² value of 49, and the corresponding sum of citations is only 42.
The g-index is a modified version of the h-index; it assesses the effect of an author’s articles using a citation-ranking mechanism. The g-index is measured by referring to the cumulative distribution of citations received by the author’s articles. The g-index is the unique largest number such that the top-ranked g articles together received at least g² (g squared) citations (Table 2). Both these indices have their benefits.10–12 However, despite their ubiquity, neither index accounts for how it is influenced by differences in database citation styles. 13
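The manual calculation in Table 2 can be reproduced with a short Python sketch. The function names are illustrative rather than taken from any bibliometric package, and the g-index here is capped at the number of published articles, which is one common convention and is assumed for this example.

```python
from itertools import accumulate


def h_index(citations: list[int]) -> int:
    """Largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations: list[int]) -> int:
    """Largest g such that the top g articles together have at least g^2 citations.

    Assumption: g cannot exceed the number of articles.
    """
    ranked = sorted(citations, reverse=True)
    g = 0
    for rank, cumulative in enumerate(accumulate(ranked), start=1):
        if cumulative >= rank * rank:
            g = rank
    return g


if __name__ == "__main__":
    # Citation counts of articles A-1 to A-7 from Table 2.
    mr_x = [15, 10, 7, 4, 3, 2, 1]
    print("h-index:", h_index(mr_x))  # 4, as in Table 2
    print("g-index:", g_index(mr_x))  # 6, as in Table 2
```

Sorting inside each function means the citation counts can be supplied in any order.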
Collaboration index (CI) is a coauthors-per-article index based solely on multiauthored articles. CI is calculated by dividing the total number of authors of multiauthored articles by the total number of multiauthored articles. The correlation coefficient (a numerical measure of the statistical relationship between two variables), number of cited publications (NCP), proportion of cited publications (PCP), and citations per cited publication (CCP) are some other measures adopted in PA. Although descriptive, PA measures the impact of different research constituents and, therefore, represents the research field. A review limited to PA does not go on to perform science mapping, just as a systematic review may not attempt a meta-analysis.
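As a rough illustration of these measures, the sketch below computes CI from per-article author counts and NCP, PCP, and CCP from per-article citation counts. The formulas PCP = NCP ÷ TP and CCP = total citations ÷ NCP are our reading of the definitions above and should be treated as assumptions; the function names are hypothetical.

```python
def collaboration_index(author_counts: list[int]) -> float:
    """Mean number of authors per article, using multiauthored articles only."""
    multi = [n for n in author_counts if n > 1]
    return sum(multi) / len(multi) if multi else 0.0


def cited_publication_metrics(citations: list[int]) -> tuple[int, float, float]:
    """Return (NCP, PCP, CCP) for a list of per-article citation counts."""
    ncp = sum(1 for c in citations if c > 0)          # number of cited publications
    pcp = ncp / len(citations) if citations else 0.0  # assumed: NCP / TP
    ccp = sum(citations) / ncp if ncp else 0.0        # assumed: total citations / NCP
    return ncp, pcp, ccp


if __name__ == "__main__":
    # Invented example: four articles with 1, 3, 2, and 4 authors.
    print(collaboration_index([1, 3, 2, 4]))
    # Invented example: four articles with 15, 0, 5, and 0 citations.
    print(cited_publication_metrics([15, 0, 5, 0]))
```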
Science Mapping
In the context of bibliometrics, science mapping, or bibliometric mapping, is an essential quantitative technique. Science mapping is a spatial depiction of how disciplines, fields, specialties, and particular texts or authors are related to one another.14,15 In essence, science mapping determines the relationships between research constituents. 16 The basic processes in the overall workflow of science mapping are data retrieval, preprocessing, network extraction, normalization, mapping, analysis, and visualization. At the end of this process, the analyst must draw conclusions from the findings. 17
Various methods for extracting networks from specific units of study have been developed (co-word analysis, coauthor analysis, and bibliographic coupling). Co-word analysis examines the conceptual structure of a research field by examining the essential words or keywords in the papers. 18 Coauthor analysis investigates the social structure and collaboration networks by appraising the authors and their affiliations. 19 In bibliographic coupling, the cited references are utilized to review the study field’s intellectual foundation or to examine papers that quote the same references. 20 It differs from co-citation; while bibliographic coupling indicates a fixed and permanent relationship based on the references in coupled documents, co-citation indicates the dynamic relationship that changes over time. 21
Coauthor Analysis
Coauthor analysis explores and quantifies research and development collaboration between authors, institutions, or countries. 22 In coauthorship networks, authors are represented as nodes connected through coauthored scientific articles, as depicted in Figure 1. Each coauthored publication is represented by a linkage, or edge, between the nodes. 23 In Figure 1, the edges reflect coauthorships, while nodes represent authors. The more papers an author has, the larger the node. The more publications two authors have coauthored, the thicker the edge between them. Measurements can describe the network structure and centrality (the relative significance of a node in the graph compared to others). Nodes become connected when the authors share authorship of a paper, which shows how collaboration spreads across the network. An example of coauthor analysis pertaining to psychiatry is the study of author collaborations in schizophrenia research conducted by Wu and Duan. 24
Figure 1. Graphical Representation of Coauthor Analysis.
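The node and edge logic described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the procedure of any particular BA software; the function name and the input format (one list of author names per paper) are assumptions.

```python
from collections import Counter
from itertools import combinations


def coauthor_network(papers: list[list[str]]) -> tuple[Counter, Counter]:
    """Build node sizes and edge weights from per-paper author lists.

    node_size[author]   = number of papers by that author (larger node)
    edge_weight[(a, b)] = number of papers a and b coauthored (thicker edge)
    """
    node_size: Counter = Counter()
    edge_weight: Counter = Counter()
    for authors in papers:
        unique = sorted(set(authors))          # ignore repeated names within one paper
        node_size.update(unique)
        edge_weight.update(combinations(unique, 2))  # one edge per author pair
    return node_size, edge_weight


if __name__ == "__main__":
    # Invented example with three papers and three authors.
    nodes, edges = coauthor_network([["A", "B"], ["A", "B", "C"], ["C"]])
    print(dict(nodes))
    print(dict(edges))
```

Author names are assumed to be already disambiguated; in practice, name variants of the same author would need to be merged before building the network.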
Co-Word Analysis
Co-word analysis rests on the premise that when two or more keywords reflecting a study topic appear in the same published paper, they are related, and that the more frequently two keywords appear together, the more closely they are related. Methods applied to the co-word matrix include factor analysis, cluster analysis, multivariate analysis, and social network analysis. These methods help researchers determine the background of the field of interest and play a vital role in identifying the value of an academic discipline. 25
For example, Wu et al. used co-word analysis to assess the evolution of research topics in Psychiatry and found that child and adolescent psychiatry, major depression, schizophrenia, and prefrontal cortex were constant research foci between 2001 and 2015. Another co-word analysis by Khoeini et al. found that the growth rate of the total scientific outputs in bibliotherapy was only 3%.26,27
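A co-word matrix of the kind referred to above can be built as in the following minimal Python sketch. The function name is hypothetical, and two simplifying assumptions are made: keywords are matched after lowercasing and trimming whitespace, and the diagonal stores the number of papers in which each keyword appears.

```python
def coword_matrix(keyword_lists: list[list[str]]) -> tuple[list[str], list[list[int]]]:
    """Return (vocabulary, symmetric co-occurrence matrix).

    matrix[i][j] = number of papers containing both vocabulary[i] and vocabulary[j];
    matrix[i][i] = number of papers containing vocabulary[i].
    """
    def norm(keyword: str) -> str:
        return keyword.strip().lower()

    vocab = sorted({norm(k) for kws in keyword_lists for k in kws})
    index = {k: i for i, k in enumerate(vocab)}
    matrix = [[0] * len(vocab) for _ in vocab]
    for kws in keyword_lists:
        present = {index[norm(k)] for k in kws}   # each keyword counted once per paper
        for i in present:
            for j in present:
                matrix[i][j] += 1
    return vocab, matrix


if __name__ == "__main__":
    # Invented keyword lists for three papers.
    vocab, matrix = coword_matrix([
        ["Schizophrenia", "Depression"],
        ["depression", "schizophrenia", "fMRI"],
        ["fMRI"],
    ])
    print(vocab)
    for row in matrix:
        print(row)
```

Such a matrix is the usual input to the clustering and network methods mentioned earlier; real keyword data would also need synonyms and spelling variants merged.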
Bibliographic Coupling
Citing relevant papers is one way for authors to express the intellectual environment in which they work. A group of scientific papers has a significant relationship to one another (they are linked) if they have one or more references in common. For example, articles A and B are bibliographically coupled if both have cited articles C, D, and E. If two papers have similar bibliographies, there is an inferred relationship between them. As an example, Dort et al. conducted a bibliographic coupling analysis of studies on classroom management strategies for children with attention deficit hyperactivity disorder (ADHD) and found that there is relatively little communication between researchers in the fields of psychiatry/psychology and those in the field of education. 6
However, it has been pointed out that bibliographic coupling is not a valid unit of relationship measurement. Because we cannot be confident that two papers referencing a third are citing the same unit of information, bibliographic coupling is not a constant unit, and arguably not a unit at all. It merely indicates the possibility of a relationship, of undetermined value, between two documents. 25 Put simply, the fact that two papers have a reference in common does not mean they are pointing to the same piece of information. 2
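To make the coupling relation concrete, the following Python sketch counts shared references between pairs of citing papers and, for contrast, counts how often pairs of references are cited together (co-citation). It is an illustration only: the function names and the input format (a mapping from each citing paper to the set of works it cites) are assumptions, and the raw counts carry the interpretive caveat discussed above.

```python
from collections import Counter
from itertools import combinations


def coupling_strength(references: dict[str, set[str]]) -> dict[tuple[str, str], int]:
    """Bibliographic coupling: number of references shared by each pair of citing papers."""
    strength = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared:
            strength[(a, b)] = shared
    return strength


def cocitation_counts(references: dict[str, set[str]]) -> Counter:
    """Co-citation: number of citing papers that cite both works in each pair."""
    counts: Counter = Counter()
    for cited in references.values():
        counts.update(combinations(sorted(cited), 2))
    return counts


if __name__ == "__main__":
    # Invented example: papers A and B both cite C, D, and E.
    refs = {"A": {"C", "D", "E"}, "B": {"C", "D", "E", "F"}, "G": {"F"}}
    print(coupling_strength(refs))
    print(cocitation_counts(refs)[("C", "D")])
```

The coupling strength of a pair is fixed once both papers are published, whereas co-citation counts grow as new citing papers are added to `references`, which mirrors the fixed versus dynamic distinction drawn between the two methods.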
The distinction between PA and science mapping is subtle. PA focuses on research constituents (e.g., information on researchers, institutions, locations, and journals as a whole), whereas science mapping focuses on links between research constituents (for instance, how authors, articles, disciplines, and institutions are interlinked in the literature).
Basic Steps in BA
This section presents the steps for conducting BA, after referring to available published literature.28–30 Overall design is portrayed in Figure 2.
Figure 2. Overall Design of Bibliometric Analysis.
Outline the Aims and Scope of the BA
Researchers sometimes make the aim precise with publication year, country of study, journal, type of intervention used, and so on, depending on the research situation and research problem. This step must be performed before selecting the BA technique and data collection. The scope of the study should be substantial and large enough because a BA usually handles massive data sets from databases, roughly thousands of publications, depending on the research question. The scope of the study is assessed by the number of publications as an end product. If it is less than 500, it should be considered as having a small scope and may not warrant a BA. 31 However, the availability of literature depends on the area of study planned.
Selecting Optimal Techniques for BA
The data yielded by the search will be raw; hence, the researchers need to clean it before performing the analysis. The choice of technique depends on the aim of the study. For instance, if the study intends to review the past, present, and future of a specific topic across different time frames, a sizeable bibliometric corpus has to be obtained. In this situation, appropriate techniques will be citation analysis, co-citation analysis, bibliographic coupling, and co-word analysis.
Data Collection
The keywords and their combinations have to be formulated before data collection. The researchers have to keep in mind the techniques planned. For instance, if the researcher has planned a co-word analysis, the focus should be on the title, abstract, keywords, and available full-text articles. One impediment that can arise during data collection is that different databases have different bibliometric formats, and literature searches may give varying results. So, to avoid possible errors, it is recommended to settle on one of the frequently used databases. Data cleaning is required for databases such as Web of Science, Scopus, PubMed, and Dimensions, which are not specifically designed for this type of metric analysis. In particular, duplicated articles, duplicated author names, and publications appearing in multiple journals have to be checked. If such inaccuracies are not removed, they may result in a skewed presentation of data.
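The duplicate check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the cleaning routine of any database or BA software: records are assumed to be dictionaries with hypothetical `doi` and `title` fields, and two records are treated as duplicates if they share a DOI or a normalized title.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase a title and collapse punctuation and whitespace for matching."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first occurrence of each record, matching on DOI or normalized title."""
    seen: set[tuple[str, str]] = set()
    unique = []
    for rec in records:
        keys = set()
        doi = (rec.get("doi") or "").strip().lower()
        if doi:
            keys.add(("doi", doi))
        title = normalize_title(rec.get("title") or "")
        if title:
            keys.add(("title", title))
        if keys & seen:        # already seen under the same DOI or title
            continue
        seen |= keys
        unique.append(rec)
    return unique


if __name__ == "__main__":
    # Invented records exported from three databases.
    records = [
        {"doi": "10.1/abc", "title": "A Study of X", "source": "Scopus"},
        {"doi": "10.1/ABC", "title": "A study of X.", "source": "Web of Science"},
        {"doi": None, "title": "A Study of X", "source": "PubMed"},
        {"doi": "10.1/xyz", "title": "Another paper", "source": "PubMed"},
    ]
    print(len(deduplicate(records)))
```

Matching on title alone can merge distinct papers that share a title (for example, a conference abstract and the later full article), so flagged duplicates are best reviewed manually before removal.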
Performing the Analysis
Manually searching and retrieving data from the databases using keywords, as done for MA and SR, can be difficult in the case of BA, as the data handled here are enormous. Hence, dedicated software, described below, is used for this purpose. Once keywords are finalized, the researcher can perform the search in the selected databases. All the databases mentioned allow the search results to be exported as a single file. The selected software will require the researcher to upload the file and choose the techniques to be adopted. Findings will be provided as a citation map, which can be downloaded in various formats. Details of clusters, themes, number of articles, journal details, and region of publication can also be obtained in the output interface of the software. While interpreting the results, one should remember that they should align with the stated scope of the study. If a researcher combines search outcomes from multiple databases, a table can be prepared to summarize the findings, which will help interpret and compare them.
Reporting of Findings
To report the findings, one can use different methods such as tables, citation mapping, network presentation, and graphs. While writing the report, the authors should be cautious about the wording used because readers must be able to comprehend the bibliometric summary. The narrative of results should be congruent with the summary presented in mapping and PA. The researcher should attempt to explain the implications and characteristics of the summary provided. Reporting of findings is most often journal-specific; some journals may ask for a theoretical description and a presentation of the themes identified. In contrast, some may require only a direct summary of findings. Thus, researchers should consider the target journal before writing up the analysis results.
Software for BA
VOSviewer 1.6.10 (Nees Jan van Eck and Ludo Waltman), CiteSpace V (Dr Chaomei Chen), Gephi (Mathieu Bastian), Leximancer (Dr Andrew Smith), Pajek (Andrej Mrvar and Vladimir Batagelj), and UCINET (Borgatti SP, Everett MG, and Freeman LC) are the major software packages used for analysis and data presentation. Among these, VOSviewer, an open-access software, is user-friendly and can analyze search result files from the Web of Science, Scopus, Dimensions, and PubMed databases. These programs can be used to construct and display bibliometric networks. These networks can be built via citation, bibliographic coupling, co-citation, or coauthorship relationships and can include journals, researchers, or individual articles. They also include text-mining capabilities, which may be used to create and display co-occurrence networks of key terms gathered from the scientific literature.
Limitations
Theoretically, a high number of citations indicates a high impact. However, an article may also be cited merely for its methods, tools, designs, definitions, or standard procedures. This may mislead researchers into believing that papers with a high number of citations are necessarily of good quality or weight. A high number of citations does not always reflect significance, because citations may accumulate when other authors cite only a tool used in the study rather than its overall content; this is a cardinal limitation in interpreting bibliometric results. Another drawback is weighting a paper by the impact factor of its journal; in reality, even in journals with a high impact factor, most papers may not be cited often. 32 Even if a researcher publishes only a few pieces of scholarly work (books or journal articles), this small body of work can be pivotal in a subject and significantly affect the standing of the discipline. Conventional measurements like the h-index will overlook such cases. Conversely, researchers may publish dozens or hundreds of research articles in various disciplines, each with dozens or hundreds of authors. Such researchers often have incredibly high impact metrics, which may or may not accurately represent their impact in the area.
None of the commonly used measures (summing impact factors, counting citations, amassing an h-index, or looking at Eigenfactor scores) is sufficient to serve as a gold standard. 33 The absence of a standard reporting practice in BA is also a concern. Another limitation arises when researchers do not mention the time frame of a paper’s citations, because citations accumulate over time. Other technical limitations include lack of awareness of and access to BA software, limited technical know-how in software-assisted analysis, and the merely theoretical presence of this design in medical research curricula.
Conclusion
BA is a scientific method that researchers can use to glance at prominent areas of medical research and obtain an overview of the landscape of published literature. With advanced menu-driven application tools for assessing research performance and monitoring university departments and institutes, bibliometric methods have progressed to the point that they are now considered high-quality, reliable, and informative instruments. This analysis may not provide strong evidence related to a research question as from MA, but it provides a landscape of the problem and solutions discussed worldwide in an area of interest.
Footnotes
The authors declared no potential conflicts of interest concerning this article’s research, authorship, and publication.
Funding: The authors received no financial support for the research, authorship, and publication of this article.
References
- 1. Mejia C, Wu M, Zhang Y, et al. Exploring topics in bibliometric research through citation networks and semantic analysis. Front Res Metr Anal, 2021; 6: 62.
- 2. Donthu N, Kumar S, Mukherjee D, et al. How to conduct a bibliometric analysis: An overview and guidelines. J Bus Res, September 1, 2021; 133: 285–296.
- 3. Szomszor M, Adams J, Fry R, et al. Interpreting bibliometric data. Front Res Metr Anal, 2021; 5: 30.
- 4. Lee K, Whelan JS, Tannery NH, et al. 50 years of publication in the field of medical education. Med Teach, 2013; 35: 591–598.
- 5. Ji YA, Nam SJ, Kim HG, et al. Research topics and trends in medical education by social network analysis. BMC Med Educ, 2018; 18: 222.
- 6. Sampson M, Horsley T, and Doja A. A bibliometric analysis of evaluative medical education studies: Characteristics and indexing accuracy. Acad Med, 2013; 88: 421–427.
- 7. Azer SA. The top-cited articles in medical education: A bibliometric analysis. Acad Med, 2015; 90: 1147–1161.
- 8. Patel VM, Ashrafian H, Ahmed K, et al. How has healthcare research performance been assessed? A systematic review. J R Soc Med, June 2011; 104(6): 251–261. DOI: 10.1258/jrsm.2011.110005. PMID: 21659400; PMCID: PMC3110970.
- 9. DeGroote. Subject and course guides: Measuring your impact: Impact factor, citation analysis, and other metrics: Citation analysis [Internet]. Researchguides.uic.edu, https://researchguides.uic.edu/c.php?g=252299&p=1683205 (2021, accessed January 11, 2022).
- 10. Hirsch JE. An index to quantify an individual’s scientific research output. Proc Natl Acad Sci USA, November 15, 2005; 102(46): 16569–16572. DOI: 10.1073/pnas.0507655102. Epub November 7, 2005. PMID: 16275915; PMCID: PMC1283832.
- 11. Chew M, Villanueva EV, and Van Der Weyden MB. Life and times of the impact factor: Retrospective analysis of trends for seven medical journals (1994–2005) and their Editors’ views. J R Soc Med, March 2007; 100(3): 142–150. DOI: 10.1177/014107680710000313. PMID: 17339310; PMCID: PMC1809163.
- 12. Garfield E. The history and meaning of the journal impact factor. JAMA, January 4, 2006; 295(1): 90–93. DOI: 10.1001/jama.295.1.90. PMID: 16391221.
- 13. Radicchi F, Fortunato S, and Castellano C. Universality of citation distributions: Toward an objective measure of scientific impact. Proc Natl Acad Sci USA, November 11, 2008; 105(45): 17268–17272. DOI: 10.1073/pnas.0806977105. Epub October 31, 2008. PMID: 18978030; PMCID: PMC2582263.
- 14. Börner K, Chen C, and Boyack KW. Visualizing knowledge domains. Annu Rev Inf Sci Technol, 2003; 37(1): 179–255.
- 15. Small H. Visualizing science by citation mapping. J Am Soc Inf Sci, 1999; 50(9): 799–813.
- 16. Kent BH, Pandey N, Kumar S, et al. A bibliometric analysis of board diversity: Current status, development, and future research directions. J Bus Res, January 1, 2020; 108: 232–246.
- 17. Cobo MJ, López AG, Herrera VE, et al. Science mapping software tools: Review, analysis, and cooperative study among tools. J Am Soc Inf Sci Technol, 2011; 62(7): 1382–1402.
- 18. Callon M, Courtial J-P, Turner WA, et al. From translations to problematic networks: An introduction to co-word analysis. Soc Sci Inf, 1983; 22(2): 191–235.
- 19. Glänzel W. National characteristics in international scientific co-author relations. Scientometrics, 2001; 51(1): 69–115.
- 20. Kessler MM. Bibliographic coupling between scientific papers. Am Doc, 1963; 14(1): 10–25.
- 21. Jarneving B. A comparison of two bibliometric methods for mapping of the research front. Scientometrics, 2005; 65(2): 245–263.
- 22. Bender ME, Edwards S, Von Philipsborn P, et al. Using co-author networks to map and analyse global neglected tropical disease research with an affiliation to Germany. PLOS Negl Trop Dis, 2015; 9(12): e0004182.
- 23. Fonseca B de PF, Sampaio RB, de Araújo Fonseca MV, et al. Co-author network analysis in health research: Method and potential use. Health Res Policy Syst, 2016; 14(1): 1–10.
- 24. Wu Y and Duan Z. Visualization analysis of author collaborations in schizophrenia research. BMC Psychiatry, February 19, 2015; 15: 27. DOI: https://doi.org/10.1186/s12888-015-0407-z. PMID: 25884451; PMCID: PMC4340282.
- 25. Yang Y, Wu M, and Cui L. Integration of three visualization methods based on co-word analysis. Scientometrics, 2012; 90(2): 659–673.
- 26. Wu Y, Jin X, and Xue Y. Evaluation of research topic evolution in psychiatry using co-word analysis. Medicine (Baltimore), June 2017; 96(25): e7349. DOI: https://doi.org/10.1097/MD.0000000000007349. PMID: 28640150; PMCID: PMC5484261.
- 27. Khoeini et al. Co-word analysis of scientific outputs in the field of bibliotherapy in Web of Science. Casp J Scientometrics, 2022; 9(1): 13–28.
- 28. Xin YA and Qing QW. Co-word analysis of the trends in stem cells field based on subject heading weighting. Scientometrics, 2011; 88(1): 133–144.
- 29. Weinberg BH. Bibliographic coupling: A review. Inf Storage Retr, 1974; 10(5–6): 189–196.
- 30. Okubo Y. Bibliometric indicators and analysis of research systems: Methods and examples. Paris: OECD Publishing, 1997, OECD Science, Technology and Industry Working Papers, No. 1997/01.
- 31. Maggio LA, Costello JA, Norton C, et al. Knowledge syntheses in medical education: A bibliometric analysis. Perspect Med Educ, March 1, 2021; 10(2): 79–87.
- 32. Szomszor M, Adams J, Fry R, et al. Interpreting bibliometric data. Front Res Metr Anal, 2021; 5: 30.
- 33. Koltun V and Hafner D. The h-index is no longer an effective correlate of scientific reputation. PLOS One, June 28, 2021; 16(6): e0253397.


