Medical Journal of the Islamic Republic of Iran. 2020 Jun 20;34:64. doi: 10.34171/mjiri.34.64

Coronavirus: Bibliometric analysis of scientific publications from 1968 to 2020

Vasna Joshua 1, Satish Sivaprakasam 1,*
PMCID: PMC7500428  PMID: 32974230

Abstract

Background: The World Health Organization declared the outbreak of COVID-19 a public health emergency of international concern on January 30, 2020. Research metrics on the existing Coronavirus literature would therefore be of added value to researchers seeking to understand the virus.

Methods: Research outputs related to Coronavirus published from January 1968 to March 2020 were retrieved from the Web of Science database and analyzed using MS Office, a Word Cloud generator, VOSviewer, and ArcGIS software. The analysis covered the number of research publications per year, the authors’ clustering pattern, the most preferred journals, the leading publications, document types, broad research areas, commonly used keywords, the geographical distribution of publications, commonly used languages, and productive institutes.

Results: The search retrieved 6424 Coronavirus research publications. There was 1 article in 1968 and there were 275 in 2019. A total of 33 clusters of authors contributed to studies on COVID-19 across the globe. The Journal of Virology was the most productive journal for Coronavirus publications (n=810). An article by Ksiazek TG et al in the New England Journal of Medicine had the most citations (n=2175); 90% of the research outputs were articles; most outputs were broadly classified under Infectious diseases (n=5341); and the most commonly used keyword was ‘Coronavirus’. The highest number of publications came from the USA (n=2345), the most commonly used language was English (n=5948), and the most productive institute was the University of Hong Kong (n=506).

Conclusion: The results of the study showed that the growth pattern was not uniform and that the United States and the University of Hong Kong have played a major role in Coronavirus research. Even though this depicts higher scientific growth, it is an alarming sign to the community for preparedness. Under the prevailing situation of seeking better prevention, treatment, and vaccination for COVID-19, in-depth study of the metrics portrayed above would add to researchers' knowledge.

Keywords: Coronavirus, Bibliometric, Public health, Novel coronavirus, Web of Science, COVID-19


↑ What is “already known” in this topic:

Coronavirus disease 2019 (COVID-19), first detected in Wuhan, China, has spread to more than 212 countries across the globe. On March 20, 2020, the World Health Organization launched a global megatrial called SOLIDARITY, an unprecedented coordinated trial designed to collect robust scientific data rapidly during this pandemic.

→ What this article adds:

Almost every journal called for papers on COVID-19 and made them freely accessible to readers, so scientific research output must have jumped to a higher level. Given the present need for robust data, research metrics covering a longer period, by country, journal, institution, authors’ clustering pattern, and keyword, will be of added value.

Introduction

Coronaviruses are zoonotic viruses found in mammals and birds; when transmitted to humans, they infect the respiratory and gastrointestinal tracts. Coronaviruses were first identified as human pathogens in the mid-1960s (1). Later, zoonotic Coronaviruses emerged to cause human outbreaks, such as Severe Acute Respiratory Syndrome (SARS) in 2003 and the Middle East Respiratory Syndrome (MERS) since 2012. The virus “SARS-CoV-2” caused the novel Coronavirus disease in 2019 (COVID-19) (2). The WHO declared the outbreak of COVID-19 a public health emergency of international concern on January 30, 2020 (3).

Researchers publish their findings in a journal of their choice to provide evidence of their research work. Current developments and innovative technologies in any research field are authenticated by peer-reviewed journal publications. In the digital era, researchers focus on publishing in journals with wide publicity, high productivity, a high impact factor, and many citations. Tools such as bibliometrics and scientometrics are used to measure these research output metrics. This analysis (4) helps researchers assess the history of research, the growth of scientific inventions, and the innovative technologies applied, along with their pros and cons. In other words, bibliometric analyses are used to describe the study of science, including the growth, structure, interrelationships, and productivity of a certain research discipline (5). They bring out the impact of scientific documents, such as research papers, academic journals, and reviews.

Gaining knowledge about the current literature available on Coronaviruses across the globe is of high value. There are several databases (Web of Science, Scopus, Google Scholar, etc.) that would bring out the scientific research metrics available in the literature. Our aim was to use the Web of Science database, which is a manually curated database that also tracks more citations (6).

The objective of this study was to explore the research metrics of Coronavirus articles published from January 1968 to March 2020 using the Web of Science database.

Methods

All research output from January 1968 to March 2020 was retrieved on March 7, 2020 from the Science Citation Index Expanded, a database of the Thomson Reuters Web of Science, by restricting the topic search to ‘Coronavirus’. There were 6424 research outputs during the period. The journal impact factors were retrieved from Thomson Reuters Journal Citation Reports 2017.

The data were analyzed using MS Excel, Word Cloud generator (7), VOSviewer (8), and ArcGIS 10.1 software (9). Data were explored based on the following factors: the number of publications by year of publication, the authors’ clustering pattern as a group of networks, the leading publications and their citations, the distribution of publication types, the classification of research articles under broad research areas, the most commonly used keywords, the geographical distribution of the published research articles, the languages commonly used for publication, and the most productive institutes in Coronavirus publication.

The data generated for the study covered 53 years of publications and were analyzed with MS Office, the online Word Cloud generator, and VOSviewer.

Results

Quantification of Coronavirus publication by year

The growth rate of the research output did not show any definite pattern from January 1968 to March 2020. The number of publications gradually increased from 1 in 1968 to 388 in 2005, following the SARS outbreak. The publications on Coronavirus then slowly declined and increased again between 2014 and 2016, marking the outbreak of MERS. The number of publications in 2019 was 275. Figure 1 shows the year-wise growth pattern of the research publications.

Fig. 1.


Year-wise publication of articles on Coronavirus, WOS, 1968-March 2020

The total number of citations for the publications on Coronavirus was 45514, with an average of 32.7 citations per item.

Author’s clustering pattern as a group of authors’ networks

The pattern of the authors’ networks was mapped with the freely downloadable software VOSviewer (Fig. 2). The software displays the relationships among the units under study as a graph or network, in which the units are nodes drawn as circles and each relationship is a link between 2 nodes. Here, the author was the unit of analysis and coauthorship of articles was the relationship; the number of appearances of an author’s name and the different groups are mapped with different colors. As a result, 33 different groups of authors’ networks have contributed to COVID-19-related research across the globe, and these are shown in different colors as identified by the software.

Fig. 2.


Author’s clustering pattern as a group of authors’ networks
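The idea of treating authors as nodes and coauthorship as links can be illustrated with a minimal Python sketch. It is not VOSviewer's own clustering method: it simply counts coauthorship links and groups authors who are connected through any chain of shared papers (connected components). The author names and the function names `coauthor_links` and `author_groups` are hypothetical and used for illustration only.

```python
from collections import Counter
from itertools import combinations


def coauthor_links(papers):
    """Count how often each pair of authors appears on the same paper."""
    links = Counter()
    for authors in papers:
        for a, b in combinations(sorted(set(authors)), 2):
            links[(a, b)] += 1
    return links


def author_groups(papers):
    """Group authors connected by any chain of coauthorship (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for authors in papers:
        authors = list(authors)
        for a in authors:
            find(a)  # register every author as a node
        for a in authors[1:]:
            parent[find(a)] = find(authors[0])  # link coauthors together

    groups = {}
    for a in list(parent):
        groups.setdefault(find(a), set()).add(a)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical author lists, one list per paper.
papers = [
    ["Author A", "Author B"],
    ["Author B", "Author C"],
    ["Author D", "Author E"],
]
groups = author_groups(papers)
links = coauthor_links(papers)
print(len(groups), "groups:", groups)
```

In this toy input, Authors A, B, and C form one group because B links the first two papers, while D and E form a second group. A dedicated tool would normally split large connected networks further into clusters, which this sketch does not attempt.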

Leading journals in Coronavirus publications and impact factor

The research outputs were published in 100 different periodicals. The Journal of Virology was the most productive journal with 810 publications (12.6%) and an impact factor of 4.368, followed by the Journal of General Virology (n=416; 6.6%) and Virology (n=323; 5.0%). The other top journals by number of publications are shown with their impact factors in Table 1. Among the top journals contributing to Coronavirus publications, Proceedings of the National Academy of Sciences had the highest impact factor (9.504), followed by Emerging Infectious Diseases (7.422), as per Thomson Reuters Journal Citation Reports 2017.

Table 1. Leading journals in Coronavirus publications and impact factor, WOS, 1968-2020.

No Journals Total No. of publications Percentage Impact factor
1 Journal of Virology 810 12.60 4.37
2 Journal of General Virology 416 6.6 2.51
3 Virology 323 5.1 3.37
4 The Journal of Infectious Disease 179 2.7 5.19
5 Advances in Experimental Medicine and Biology 152 2.4 1.76
6 Virus Research 134 2.0 2.48
7 Archives of Virology 130 2.0 2.16
8 Emerging Infectious Diseases 116 1.8 7.42
9 Veterinary Microbiology 99 1.5 2.52
10 Journal of Clinical Microbiology 79 1.2 4.05
11 Journal of Virological Methods 80 1.2 1.76
12 PLOS ONE 79 1.2 2.77
13 Viruses Basel 71 1.1 -
14 Viruses 70 1.0 -
15 Proceedings of the National Academy of Sciences of the United States of America (PNAS) 69 1.0 9.50
16 Antiviral Research 68 1.0 4.31
17 American Journal of Veterinary Research 62 0.96 -

Top 10 Coronavirus publications with their citations and mean citation per year

The article “A novel Coronavirus associated with severe acute respiratory syndrome”, published in the New England Journal of Medicine by Ksiazek TG et al in 2003, was the most cited (n=2175), with a mean of 120.83 citations per year (Table 2). The second most cited article (n=2012), “Identification of a Novel Coronavirus in patients with severe acute respiratory syndrome”, was also published in the New England Journal of Medicine, by Drosten C et al in 2003. These were followed by publications from Science and the Lancet.

Table 2. Top 10 leading Coronavirus publications with their citations and mean citation per year, WOS, Jan 1968-March 2020.

No. Publications Total Citations Mean Citation Per Year
P1 A novel Coronavirus associated with severe acute respiratory syndrome. N Engl J Med. 2003 May 15;348(20):1953-66. 2175 120.83
P2 Identification of a Novel Coronavirus in patients with severe acute respiratory syndrome. N Engl J Med. 2003 May 15;348(20):1967-76. 2012 111.78
P3 Characterization of a novel Coronavirus associated with severe acute respiratory syndrome. Science. 2003 May 30;300(5624):1394-9. 1747 97.06
P4 Coronavirus as a possible cause of severe acute respiratory syndrome. Lancet. 2003 Apr 19;361(9366):1319-25. 1670 92.78
P5 The Genome sequence of the SARS-associated Coronavirus. Science. 2003 May 30;300(5624):1399-404. 1536 85.33
P6 Isolation of a novel Coronavirus from a man with pneumonia in Saudi Arabia. N Engl J Med. 2012 Nov 8;367(19):1814-20. 1371 152.33
P7 Angiotensin-converting enzyme 2 is a functional receptor for the SARS Coronavirus. Nature. 2003 Nov 27;426(6965):450-4. 1101 61.17
P8 Isolation and characterization of viruses related to the SARS Coronavirus from animals in southern China. Science. 2003 Oct 10;302(5643):276-8. 993 55.17
P9 Bats are natural reservoirs of SARS-like Coronaviruses. Science. 2005 Oct 28;310(5748):676-9. 939 58.69
P10 Clinical progression and viral load in a community outbreak of Coronavirus-associated SARS pneumonia: a prospective study. Lancet. 2003 May 24;361(9371):1767-72. 913 50.72
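The mean citation per year in Table 2 can be reproduced with a short Python sketch. The article does not state the formula; the values in the table are consistent with dividing the total citations by the number of calendar years from the publication year through 2020 inclusive, and the sketch assumes exactly that. The function name `mean_citations_per_year` and the constant `CENSUS_YEAR` are illustrative.

```python
CENSUS_YEAR = 2020  # assumption: citations were counted through 2020


def mean_citations_per_year(total_citations, publication_year,
                            census_year=CENSUS_YEAR):
    """Total citations divided by the years elapsed, counted inclusively."""
    years = census_year - publication_year + 1
    return total_citations / years


# (label, total citations, publication year) taken from Table 2.
rows = [
    ("P1", 2175, 2003),
    ("P6", 1371, 2012),
    ("P9", 939, 2005),
]
for label, citations, year in rows:
    print(f"{label}: {mean_citations_per_year(citations, year):.2f}")
```

Under this assumption, P1 gives 2175 / 18 = 120.83, P6 gives 1371 / 9 = 152.33, and P9 gives 939 / 16 = 58.69, matching the table. This also explains why P6, published in 2012, has the highest mean despite ranking sixth by total citations.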

Analysis of the citations for the top 10 most-cited research papers (P1 to P10) revealed that all of them were published in or after 2003 and that the papers published between 1968 and 2002 were not cited as much. The citations of these articles peaked during 2004 and 2005, except for article P6, which was published in 2012 (Fig. 3).

Fig. 3.


Top 10 leading publication citations per year from 2003

Distribution of type of publications

Out of 6424 publications, almost 90% were original articles and 7.4% were review articles; the rest were abstracts, letters, bibliographies, data papers, and reference materials (Table 3).

Table 3. Distribution of type of publications on Coronavirus, WOS, 1968-March 2020.

Document Type Records Percentage of Records
Articles 5787 90.08
Review 473 7.36
Abstract 423 6.58
Meeting 409 6.36
Letter 197 3.06
Editorial 156 2.42
Case report 122 1.89
News 85 1.32
Unspecified 69 1.07
Correction 57 0.88
Book 31 0.48
Early access 22 0.34
Clinical trial 15 0.23
Biography 1 0.01
Data paper 1 0.01
Reference material 1 0.01
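The percentages in Table 3 are each record count divided by the 6424 retrieved publications. The Python sketch below reproduces that arithmetic for the first five rows; the names `TOTAL_RECORDS`, `DOCUMENT_TYPES`, and `percentage_share` are illustrative. Two caveats apply: the table's last digit appears to be truncated rather than rounded for some rows, so a rounded value can differ by 0.01, and the record counts add up to more than 6424, which suggests that a single record can be listed under more than one document type.

```python
TOTAL_RECORDS = 6424  # publications retrieved by the search

# Record counts copied from the first five rows of Table 3.
DOCUMENT_TYPES = {
    "Articles": 5787,
    "Review": 473,
    "Abstract": 423,
    "Meeting": 409,
    "Letter": 197,
}


def percentage_share(count, total=TOTAL_RECORDS):
    """Share of all retrieved records, as a percentage."""
    return 100.0 * count / total


for doc_type, count in DOCUMENT_TYPES.items():
    print(f"{doc_type:<10}{count:>6}{percentage_share(count):>8.2f}")

# The counts overlap: even these five rows exceed the total.
print(sum(DOCUMENT_TYPES.values()), ">", TOTAL_RECORDS)
```

The same calculation applies to the shares by research area, country, language, and institution reported in the later tables.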

Coronavirus articles classified by broad research areas

Each research article has been classified under broad research areas (Table 4). Most of the articles were classified under infectious diseases (n=5341; 83.14%), followed by microbiology (5034; 78.36%), virology (4956; 77.14%), biochemistry molecular biology (4195; 65.30%), genetics heredity (3191; 49.67%) etc.

Table 4. Top 15 Research areas on Coronavirus, WOS, 1968-March 2020.

No. Research Areas No. of papers Percentage
1 Infectious diseases 5341 83.14
2 Microbiology 5034 78.36
3 Virology 4956 77.14
4 Biochemistry Molecular Biology 4195 65.30
5 Genetics heredity 3191 49.67
6 Immunology 2849 44.34
7 Respiratory system 2235 34.79
8 Veterinary sciences 2196 34.18
9 Cell biology 2132 33.18
10 Zoology 1497 23.30
11 Pharmacology pharmacy 1126 17.52
12 Science technology other topics 1010 15.72
13 Gastroenterology Hepatology 884 13.76
14 Public environmental occupational health 875 13.62
15 Pathology 806 12.54

Keywords

The authors' keywords were visualized with the freely downloadable software Word cloud (Fig. 4). The font size reflects how often a keyword was used in the research output: the larger the font, the more frequent the keyword. The authors used keywords such as Coronavirus, Pneumonia, 2019-nCoV, Novel Coronavirus, COVID-19, public health, Human, Animals, Coronavirus Infection, SARS Coronavirus, SARS virus, Coronaviridae, RNA viral, Nucleotide sequence, Virus infection, Virology, Virus replication, and Middle East Respiratory Syndrome Coronavirus. The most commonly used keyword was ‘Coronavirus’, followed by ‘Virus’, ‘Sars’, and ‘Infection’.

Fig. 4.


Author’s keywords in Coronavirus research publications, WOS, 1968-March 2020
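The link between keyword frequency and font size can be sketched in Python as follows. This is only an illustration: the keyword lists are hypothetical, and the linear scaling between `min_pt` and `max_pt` is an assumption, since the article does not describe how the Word Cloud generator maps counts to font sizes.

```python
from collections import Counter


def keyword_counts(keyword_lists):
    """Count the number of papers in which each keyword appears."""
    counts = Counter()
    for keywords in keyword_lists:
        # Normalize case and whitespace; count each keyword once per paper.
        counts.update({k.strip().lower() for k in keywords if k.strip()})
    return counts


def font_sizes(counts, min_pt=10, max_pt=48):
    """Scale counts linearly so the most frequent keyword gets max_pt."""
    if not counts:
        return {}
    top = max(counts.values())
    return {k: min_pt + (max_pt - min_pt) * c / top for k, c in counts.items()}


# Hypothetical author keywords, one list per paper.
sample = [
    ["Coronavirus", "SARS"],
    ["coronavirus", "Infection"],
    ["Coronavirus "],
]
counts = keyword_counts(sample)
sizes = font_sizes(counts)
print(counts.most_common(3))
```

Normalizing case and whitespace before counting matters in practice, because variants such as "Coronavirus" and "coronavirus" would otherwise be drawn as separate words.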

Geographical distribution and production of Coronavirus publications

The map (Fig. 5), generated using ArcGIS 10.1 software, depicts the global geographical distribution of Coronavirus publications. Regions without any color had no publications. The United States of America published the highest number of articles (n=2345; 36.5%), followed by China (n=1067; 16.6%), Germany (n=480; 7.5%), the Netherlands (n=421; 6.6%), etc.

Fig. 5.


Geographical distribution of Coronavirus publications, WOS, 1968-March 2020

Publication of Coronavirus research article by language

The publications were in 14 different languages; most were in English (92.6%), followed by French (2.3%), German (1.5%), Spanish (1.1%), etc. (Table 5).

Table 5. Publication of Coronavirus research article by language, WOS, 1968-March 2020.

No Language Record Count Percentage
1 English 5948 92.6
2 French 145 2.3
3 German 97 1.5
4 Spanish 73 1.1
5 Russian 42 0.7
6 Korean 28 0.4
7 Chinese 19 0.3
8 Portuguese 16 0.2
9 Hungarian 12 0.18
10 Dutch 11 0.17
11 Arabic 10 0.15
12 Japanese 10 0.15
13 Italian 8 0.12
14 Polish 5 0.08

Productive Institute in the publication of Coronavirus research articles

The top 15 institutions/organizations that have contributed to Coronavirus research globally are presented in Figure 6. The University of Hong Kong was the most productive institute with 506 articles (7.8%), followed by the University of North Carolina (n=412; 6.4%), Chinese Academy of Sciences (n=371; 5.8%), and Utrecht University (n=329; 5.1%).

Fig. 6.


Top 15 productive institutions in publishing Coronavirus articles, WoS, 1968-Jan 2020

Discussion

The recent outbreak of the novel Coronavirus disease, COVID-19, was reported in Wuhan, China, in December 2019, creating alarming concerns for public health, health authorities, and policymakers. According to the Worldometer live update as of March 19, 2020, there were 219355 confirmed cases and 8969 deaths from the COVID-19 outbreak (10). The bibliometric analysis of Coronavirus research using the Web of Science database gives an overview of the growth pattern of research over 53 years. Only 1 brief annotation article on Coronavirus was published in 1968, in the journal Nature by Almeida et al (11). The article describes the Coronavirus strains and their properties. Although research publications gradually increased from then on, citations rapidly increased only during 2003-2004, after the emergence of human transmission of zoonotic Coronaviruses. The research publications appeared in nearly 100 different journals, mostly as original articles, in 14 different languages. The articles were mainly classified under infectious diseases, and the most preferred keyword was ‘Coronavirus’. The maximum research output was from the United States of America, and the most productive institute was the University of Hong Kong.

The findings of our study agreed with several bibliometric analyses that used the WoS database (1968 to 2019 and 1970 to 2019) (12) and the Scopus database (1970 to 2019) (13). A bibliometric analysis of a 20-year span (Jan 2000 to March 2020) of Coronavirus research outputs, based on Web of Science, pointed to 2 sharp increases in research yield after the SARS and MERS outbreaks (14). Another bibliometric study (15) evaluated the evolution of knowledge on COVID-19 for 2019 to 2020. Even over this shorter period, the leading organization affiliated with COVID-19 research was the University of Hong Kong, and the preferred keyword was ‘Coronavirus’.

Another bibliometric study (16) assessed the characteristics of publications involving MERS-CoV during 2012-15 and showed growth in the number of publications, with the USA as the major contributor. An editorial letter (17) on a bibliometric analysis of major biomedical databases from January 1951 to January 2020 for SARS-CoV, MERS-CoV, and novel CoV 2019 found that the USA and China have primary roles in CoV research, with the USA leading scientific production with nearly one third of the articles. Even though the time frames analyzed in the above studies differed, the results were similar to our findings. Recent studies (18, 19) of shorter duration showed maximum research contributions from China, followed by the United States.

Conclusion

The growth pattern of research publications shown by our study was not uniform, and intermediate peaks were observed in 2005, 2014, 2016, and 2019, which corresponded to the outbreaks of SARS (2003), MERS (2014), and COVID-19 (2019). The top-cited articles were published in or after 2003, and the citations peaked predominantly in 2004. The USA and the University of Hong Kong were the major contributors to Coronavirus research by region and organization, respectively. Even though this amount of research depicts higher scientific growth, it is an alarming sign to the community for preparedness. Under the prevailing situation of seeking better prevention, treatment, and vaccination for COVID-19, in-depth study of the metrics portrayed above would add to researchers' knowledge.

Conflict of Interests

The authors declare that they have no competing interests.

Cite this article as: Joshua V, Sivaprakasam S. Coronavirus: Bibliometric analysis of scientific publications from 1968 to 2020. Med J Islam Repub Iran. 2020 (20 Jun);34:64. https://doi.org/10.34171/mjiri.34.64

References


Articles from Medical Journal of the Islamic Republic of Iran are provided here courtesy of Iran University of Medical Sciences
