Identification and Ranking of Biomedical Informatics Researcher Citation Statistics through a Google Scholar Scraper

Allison B McCoy; Dean F Sittig; Jimmy Lin; Adam Wright

. 2020 Mar 4;2019:655–663.

PMCID: PMC7153158 PMID: 32308860

Abstract

To overcome limitations of previously developed scientific productivity ranking services, we created the Biomedical Informatics Researchers ranking website (rank.informatics-review.com). The website is composed of four key components that work together to create the automatically updating ranking website: 1) list of biomedical informatics researchers, 2) Google Scholar scraper, 3) display page, and 4) updater. The interactive website has facilitated identification of leaders in each of the key citation statistics categories (i.e., number of citations, h-index, and i10-index), and it has allowed other groups, such as tenure and promotions committees, to more effectively and efficiently evaluate researchers and interpret the various citation statistics reported by candidates. Creation of the biomedical informatics researcher ranking website highlights the vast differences in scholarly productivity among members of the biomedical informatics research community. Future efforts are underway to add new functionality to the website and to expand the work to identify top papers in biomedical informatics.

Introduction

Citation statistics have been used to measure scholarly production of researchers since 1964, when Eugene Garfield created the Science Citation Index.¹ Since then, additional measures have been developed in an attempt to quantify research productivity and scientific impact independent of a researcher’s field of study and years at work, and without inflation from a small number of highly cited articles. Some services have attempted to rank research productivity, including ResearchGate,² SciVal,³ and Highly Cited Researchers.⁴ However, these services have limitations, such as reliance on proprietary metrics, inclusion of only a limited number of highly ranked researchers, and requirements of organizational commitment to commercial sites.

Google created a new paradigm for citation analysis with Google Scholar,⁵ an online, freely available, automatically updating scientific information resource. By allowing researchers to create their own profile page,⁶ complete with multiple bibliometric calculations (e.g., total citations, h-index, i10-index), Google provides a potential new method that allows comparison across researchers. Comparisons of Google Scholar, Web of Science, and Scopus have found variations in citation statistics: Google Scholar counts were frequently considerably higher than those from Web of Science and Scopus because of Google Scholar’s wider inclusion of conference papers and gray literature.⁷,⁸ Despite these limitations, the format and availability of Google Scholar profiles prove advantageous.

To facilitate comparison across researchers, we created a ranking website based on Google Scholar profiles for biomedical informatics researchers.⁹ This site was based on code previously developed by one author [JL] and used to create ranking sites for information retrieval,¹⁰ human-computer interaction,¹¹ and top computer science researchers¹² using a straightforward website scraping approach. These sites computed additional metrics to normalize Google Scholar’s bibliometric measures by dividing each of them by the number of years since the researcher received his or her first citation (i.e., citations/year, h-index/year, i10-index/year).¹³

Design Considerations

Our goal in creating the Biomedical Informatics Researchers ranking website was to produce a freely available, automatically updating information resource based on Google Scholar citation profiles for all individuals interested in the field of biomedical informatics. Creating this resource required us to:

Identify a set of biomedical informatics researchers with publicly-available Google Scholar profiles.
Develop efficient methods to scrape the Google Scholar citation profiles of this list of individuals and extract key citation metrics.
Implement a method to render the ranked list of researchers along with the means to re-order and search the list.
Produce a method to allow new biomedical informatics researchers to add their name and Google Scholar profile location to the ranking site.
Develop a method to automatically update the ranking site with the latest Google Scholar results on a periodic basis.

System Description

A snapshot of the ranking website as of March 11, 2019 is depicted in Figure 1. The website is composed of four key components that all work together to create the automatically updating ranking site: list of researchers, Google Scholar scraper, display page, and updater.

The first is the list of biomedical informatics researchers. The list file is in JSON (JavaScript Object Notation) format, with name / URL pairs represented as:

"Full Name [FACMI] [FAMIA] [FIAHSI] [Collen Year]": "Google Scholar URL"

where:

“FACMI” is an optional indicator designating that the researcher is a fellow of the American College of Medical Informatics (ACMI).¹⁴
“FAMIA” is an optional indicator designating that the researcher is a fellow of the American Medical Informatics Association (AMIA).¹⁵
“FIAHSI” is an optional indicator designating that the researcher is a member of the International Academy of Health Sciences Informatics (IAHSI).¹⁶
“Collen Year” is an optional indicator designating that the researcher is a recipient of the Morris F. Collen Award, together with the year of the award.¹⁷
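
As an illustration of how a list file in this format might be consumed, the following Python sketch parses name/URL pairs and separates the optional indicators from the bare name. It assumes that the square brackets in the template only mark optional parts, so that a key looks like "Jane Doe FACMI Collen 2015"; the sample names, the placeholder URLs, and the `parse_entry` helper are hypothetical and are not taken from the actual list file or scraper.

```python
import json
import re

# Hypothetical sample in the name/URL format described above; the names and
# profile URLs are placeholders, not real entries.
SAMPLE = """
{
  "Jane Doe FACMI FIAHSI Collen 2015": "https://scholar.google.com/citations?user=EXAMPLE1",
  "John Roe FAMIA": "https://scholar.google.com/citations?user=EXAMPLE2",
  "Ann Poe": "https://scholar.google.com/citations?user=EXAMPLE3"
}
"""

INDICATOR = re.compile(r"\b(FACMI|FAMIA|FIAHSI)\b")
COLLEN = re.compile(r"\bCollen (\d{4})\b")


def parse_entry(key, url):
    """Split one list key into the bare name and its optional indicators."""
    collen = COLLEN.search(key)
    indicators = INDICATOR.findall(key)
    # The bare name is whatever remains once the indicators are removed.
    name = COLLEN.sub("", INDICATOR.sub("", key))
    name = " ".join(name.split())
    return {
        "name": name,
        "display": key,  # the full key is what appears on the ranked list
        "url": url,
        "facmi": "FACMI" in indicators,
        "famia": "FAMIA" in indicators,
        "fiahsi": "FIAHSI" in indicators,
        "collen_year": int(collen.group(1)) if collen else None,
    }


researchers = [parse_entry(k, u) for k, u in json.loads(SAMPLE).items()]
```

Keeping the indicators inside the display name, as the list format does, means the display page needs no extra columns, while a parser like this one can still recover them for subgroup analyses.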

The FACMI, FAMIA, FIAHSI, and Collen Year components are appended to the name to be displayed on the ranked list of researchers. The list was created initially through an iterative process that began with manual searches for known biomedical informaticians using Google Scholar. After approximately 100 researchers were identified, we realized that we needed an automated method to develop a more comprehensive list. Therefore, in 2014, we used the “label:biomedical_informatics” search feature that identified all individuals on Google Scholar with “biomedical informatics” as one of their “areas of interest” and at least one publication with one or more citations. We repeated this search using “label:medical_informatics” and other common key words related to biomedical informatics, including health informatics, electronic health record, clinical decision support, and health information technology.

To facilitate new requests to be added to the list of biomedical informatics researchers, we created a Google form that allows a researcher to request that his or her profile be added to the ranking website. The Google form prompts researchers to enter their name and Google Scholar URL; to indicate whether they are an ACMI fellow, AMIA fellow, or member of IAHSI; and to indicate whether they have received the Morris F. Collen award and, if so, to enter the year. One of the co-authors manually verifies the submitted data to prevent errors in running the scraper, and the entry is then added to the list of biomedical informatics researchers to be included on the site with the next update. Since then, we have periodically solicited requests for individuals to add their profiles (or create one if they had not already done so) through the ACMI listserv and other targeted e-mailings (e.g., to department listservs, through AMIA Connect). We have also manually added new profiles found through repeated Google Scholar searches of relevant labels and review of new individuals listed on ACMI, FAMIA, and IAHSI member lists.

The second component is the Google Scholar scraper. This open-source application is written in Node.js and built using commonly available open-source libraries. It takes as input the list of researchers and then iteratively retrieves each person’s Google Scholar citation counts: the total number of citations, the year of first citation, the i10-index, and the h-index. These values are extracted by matching the relevant elements in each page’s DOM (Document Object Model) structure. This approach makes the scraper dependent on the layout of the Google Scholar profile page, so it is not robust to changes in that layout; indeed, the application has broken several times since its initial development in 2014 after Google updated its site. However, no APIs (Application Programming Interfaces) that allow programmatic access to such data are available, so there are few alternatives to this screen-scraping approach.
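
The DOM-matching idea can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the Node.js scraper itself; the markup in `SAMPLE_PAGE`, the `label`/`value` class names, and the `extract_stats` helper are invented for the example and do not reflect the real Google Scholar page structure.

```python
from html.parser import HTMLParser

# Invented markup for illustration only; it is NOT the real Google Scholar
# page structure, which has changed over time.
SAMPLE_PAGE = """
<table id="stats">
  <tr><td class="label">Citations</td><td class="value">1028</td></tr>
  <tr><td class="label">h-index</td><td class="value">15</td></tr>
  <tr><td class="label">i10-index</td><td class="value">20</td></tr>
</table>
"""

REQUIRED = {"Citations", "h-index", "i10-index"}


class StatsParser(HTMLParser):
    """Collect label/value pairs from <td class="label"> and <td class="value">."""

    def __init__(self):
        super().__init__()
        self.stats = {}
        self._cell = None   # class of the <td> currently open, if any
        self._label = None  # most recent label text seen

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._cell = dict(attrs).get("class")

    def handle_endtag(self, tag):
        if tag == "td":
            self._cell = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._cell == "label":
            self._label = text
        elif self._cell == "value" and self._label is not None:
            self.stats[self._label] = int(text)


def extract_stats(html):
    """Return the statistics found in the page, or fail loudly if any is missing."""
    parser = StatsParser()
    parser.feed(html)
    missing = REQUIRED - set(parser.stats)
    if missing:
        # A layout change typically surfaces here as missing fields.
        raise ValueError(f"could not find: {sorted(missing)}")
    return parser.stats


stats = extract_stats(SAMPLE_PAGE)
```

Failing loudly when an expected field is missing is one way to notice a layout change quickly, rather than silently publishing empty or stale values.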

In addition to extracting raw statistics from profile pages, the application also calculates the citations/year, i10-index/year, and h-index/year; all computed values are written into a file in JSON format, which facilitates the display as well as downstream processing by other applications. The following is a brief definition of each of the bibliometric measures included on the ranking site:

Total number of citations – the total number of citations to all of a researcher’s published articles
Year of first citation – the year in which the researcher received his or her first citation, regardless of the year of publication of their first article
i10-index – the number of articles that a researcher has published that have received at least 10 citations
h-index – the largest number h such that the researcher has published h articles that have each received at least h citations.¹⁸ In other words, if a researcher has published 25 articles that have all received at least 25 citations, and no larger set of articles meets the corresponding threshold, then his or her h-index is 25.
Citations/year – a researcher’s total number of citations divided by the number of years in which he or she has been accumulating citations (i.e., current year – year of first citation)
i10-index/year – a researcher’s i10-index divided by the number of years in which he or she has been accumulating citations
h-index/year – a researcher’s h-index divided by the number of years in which he or she has been accumulating citations
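
The definitions above can be made concrete with a short Python sketch. The per-article citation counts are made up for illustration, and the guard that treats a first citation in the current year as one year is an assumption of this sketch, not something the scraper is known to do. On the ranking site the h-index and i10-index are read from the profile page rather than recomputed; they are computed here only to illustrate the definitions.

```python
def h_index(citations):
    """Largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of articles with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


def per_year(value, first_citation_year, current_year):
    """Normalize a metric by the years spent accumulating citations."""
    years = current_year - first_citation_year
    # Assumption: avoid division by zero when the first citation is this year.
    return value / max(years, 1)


# Made-up per-article citation counts for one hypothetical researcher.
article_citations = [50, 30, 12, 10, 9, 3, 0]
total = sum(article_citations)          # 114
h = h_index(article_citations)          # 5
i10 = i10_index(article_citations)      # 4

citations_per_year = per_year(total, first_citation_year=2009, current_year=2019)
h_per_year = per_year(h, first_citation_year=2009, current_year=2019)
```

In this example the fifth-ranked article has 9 citations (at least 5) while the sixth has only 3 (fewer than 6), so the h-index is 5.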

The third component is the display page, which renders the JSON data created by the scraper in HTML/CSS (Hypertext Markup Language / Cascading Style Sheets) with the aid of JavaScript. The display lists the researchers in ranked order and allows a user to re-sort the entire list by any of the column headers (e.g., citations or i10-index). The display page also incorporates a search feature that allows one to display a ranked subset of researchers, for example: 235 ACMI members, the 32 people associated with “Vanderbilt University”, or the 26 people with “David” in either their name or affiliation. This page also includes code that allows Google Analytics to track website traffic.
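
The re-sorting and search behavior described above can be sketched in a few lines. The actual display page does this in JavaScript in the browser; the Python below is only an illustration, and the row fields (`name`, `affiliation`, `citations`, `h_index`), the sample rows, and the `rank_rows` helper are hypothetical.

```python
def rank_rows(rows, sort_key="h_index", query=""):
    """Filter rows by a case-insensitive substring, then rank by one column."""
    q = query.lower()
    matches = [
        row for row in rows
        if q in row["name"].lower() or q in row["affiliation"].lower()
    ]
    ordered = sorted(matches, key=lambda row: row[sort_key], reverse=True)
    # Rank is assigned within the filtered subset, starting at 1.
    return [dict(row, rank=i) for i, row in enumerate(ordered, start=1)]


# Made-up rows; names and affiliations are placeholders.
rows = [
    {"name": "Jane Doe FACMI", "affiliation": "Example University",
     "citations": 5000, "h_index": 30},
    {"name": "David Roe", "affiliation": "Sample Institute",
     "citations": 9000, "h_index": 25},
    {"name": "Ann Poe", "affiliation": "Davidson Institute",
     "citations": 100, "h_index": 5},
]

by_citations = rank_rows(rows, sort_key="citations")
davids = rank_rows(rows, query="david")
fellows = rank_rows(rows, query="FACMI")
```

Because the indicators are part of the displayed name, a plain substring search such as "FACMI" is enough to produce a ranked subset of fellows, and a search for "david" matches both names and affiliations.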

The final component is the updater, a script that periodically re-runs the scraper and pushes the updated data to the website and GitHub. The updater is currently set to run twice per week. Several error conditions are periodically encountered and detected by the updater script, including instances where people delete their Google Scholar profile or make it private, network issues that prevent connection to Google Scholar or GitHub, and temporary blocks imposed by Google. Although Google permits scraping of Google Scholar profiles in its robots exclusion standard (robots.txt) file, it does periodically block the scraper if it is set to run too often.
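
One way an updater could handle these error conditions is sketched below. This is a hedged illustration rather than the actual updater script: the `ProfileUnavailable` exception, the `run_update` function, the policy of keeping previous values after a network failure, and the pause between requests are all assumptions of this sketch. A stand-in fetch function is used so that the example runs without network access.

```python
import time


class ProfileUnavailable(Exception):
    """Hypothetical error: the profile was deleted or made private."""


def run_update(researchers, fetch, previous, delay_seconds=0.0):
    """Refresh stats for each researcher, recording errors instead of stopping."""
    updated, errors = {}, {}
    for name, url in researchers.items():
        try:
            updated[name] = fetch(url)
        except ProfileUnavailable as exc:
            # Assumption: unavailable profiles are dropped from this update.
            errors[name] = f"profile unavailable: {exc}"
        except OSError as exc:
            # Network problems and blocks: keep the last known values, if any.
            errors[name] = f"network error: {exc}"
            if name in previous:
                updated[name] = previous[name]
        time.sleep(delay_seconds)  # space requests out to reduce the chance of blocks
    return updated, errors


def fake_fetch(url):
    """Stand-in for the scraper; simulates the error conditions by URL suffix."""
    if url.endswith("PRIVATE"):
        raise ProfileUnavailable("deleted or private")
    if url.endswith("BLOCKED"):
        raise ConnectionError("temporarily blocked")
    return {"citations": 100, "h_index": 5}


researchers = {
    "Ann Poe": "https://example.org/OK",
    "John Roe": "https://example.org/PRIVATE",
    "Jane Doe": "https://example.org/BLOCKED",
}
previous = {"Jane Doe": {"citations": 90, "h_index": 4}}
updated, errors = run_update(researchers, fake_fetch, previous)
```

Collecting errors per researcher, rather than aborting on the first failure, lets a single private profile or transient block leave the rest of the ranking up to date.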

Current Status of Biomedical Informatics Researcher Ranking Website

The Citation Statistics of Biomedical Informatics Researchers ranking website can be viewed at rank.informatics-review.com. In the nearly five years since its inception, the website has been viewed more than 18,000 times by almost 9,000 users. Of these users, 70% reside in the United States, 6% in India, 2.5% in Australia, 2.4% in Canada, 1.7% in the United Kingdom, and 1% in China. We observed apparent spikes in website traffic in several instances after listserv e-mails were sent or individual researchers mentioned the website on social media (Figure 2). For example, timepoint #1 corresponds with an e-mail sent to the ACMI listserv, and timepoint #4 corresponds with a Tweet by @allisonbmccoy.

Figure 2: Visitors to the biomedical informatics ranking website from Google Analytics with traffic spikes corresponding to known instances of dissemination.

The list of biomedical informatics researchers contains 1,401 individuals, including 235 ACMI fellows, 62 AMIA fellows, 61 IAHSI members, and 12 Morris F. Collen award winners. Requests to be added to the site have been submitted through the Google form for 171 researchers.

Since the BMI ranking list has been available, numerous uses for the list have been identified, including:

To create lists of researchers from individual universities and compare the scholarly productivity of those universities’ biomedical informatics departments. To our knowledge, at least three universities are currently using the website in annual department reviews.¹⁹
To identify productive researchers for nomination as ACMI fellows or IAHSI members.
To identify potential recruits for academic positions.
To help tenure and promotions committees to interpret the various citation statistics reported by candidates.
To identify speakers for conferences.
To identify subfields of biomedical informatics for which citations are highest.²⁰

Notable Statistics for Biomedical Informatics Researchers

Table 1 shows the median, minimum, and maximum values for all biomedical informatics researchers as well as for ACMI fellows, AMIA fellows, IAHSI members, and Morris F. Collen award winners, as identified through the biomedical informatics ranking website as of March 4, 2019. As expected, ACMI and IAHSI members (median h-index 35.5 and 41, respectively) have a median year of first citation 8 to 10 years earlier than that of all researchers (median h-index 15). The median h-index for AMIA fellows (14) is similar to the median for all researchers, which is also expected given that FAMIA recognition is based on application of informatics skills and knowledge, regardless of research productivity. Table 2 shows the values for 10 randomly chosen Nobel Prize winning scientists (median h-index: 120) as an upper extreme for comparison.²²

Table 1: Descriptive analysis of citation statistics for biomedical informatics researchers

		All Biomedical Informatics Researchers (N=1,401)	ACMI Fellows (N=235)	AMIA Fellows (N=62)	IAHSI Members (N=61)	Morris F. Collen Award Winners (N=12)
Year of 1st Citation	Median	2004	1996	2005	1994	1980
	Min	1980	1980	1983	1980	1984
	Max	2017	2009	2012	2006	1997
Total Citations	Median	1,028	5,389	812.5	7,145	9,964.5
	Min	2	300	72	247	4,456
	Max	166,410	166,410	24,324	108,929	108,929
Citations/ year	Median	68	237	55	274	302
	Min	0	10	5	15	117
	Max	9,958	6,400	950	4,951	4,951
h-index	Median	15	36	14	41	46
	Min	1	9	3	7	31
	Max	199	149	72	149	149
h-index/ year	Median	1	1.6	1	1.5	1.6
	Min	0.1	0.2	0.2	0.5	0.8
	Max	15.6	7.7	3.3	6.8	6.8
i10-index	Median	20	77	17.5	117	127.5
	Min	0	8	1	6	77
	Max	922	802	252	695	695
i10-index/ year	Median	1.4	3.4	1.2	4.6	4.35
	Min	0	0.2	0.1	0.4	2
	Max	51.3	33.9	9	31.6	31.6

ACMI = American College of Medical Informatics, AMIA = American Medical Informatics Association, IAHSI = International Academy of Health Sciences Informatics

Table 2: Convenience sample of 10 Nobel Prize winners’ Google Scholar citation statistics (as of March 11, 2019)

Name	Nobel Prize (Year)	Citations	h-index	i10-index
Gerhard Ertl	Chemistry (2007)	71,475	132	573
Michael Levitt	Chemistry (2013)	38,232	91	181
Herbert A. Simon	Economics (1978)	338,316	172	554
Paul R. Krugman	Economics (2008)	217,039	159	862
Christopher A. Sims	Economics (2011)	57,963	77	157
Alvin E. Roth	Economics (2012)	47,211	95	214
Eugene F. Fama	Economics (2013)	266,441	107	192
Jean Tirole	Economics (2014)	128,261	134	306
Albert Einstein	Physics (1921)	241,716	201	800
Yoichiro Nambu	Physics (2008)	25,775	52	93
Median values		99,868	120	260
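
The median row of Table 2 can be checked directly from the ten rows above. The following Python sketch uses only the values printed in the table; the variable names are ours.

```python
from statistics import median

# (name, citations, h-index, i10-index), copied from Table 2.
rows = [
    ("Gerhard Ertl", 71475, 132, 573),
    ("Michael Levitt", 38232, 91, 181),
    ("Herbert A. Simon", 338316, 172, 554),
    ("Paul R. Krugman", 217039, 159, 862),
    ("Christopher A. Sims", 57963, 77, 157),
    ("Alvin E. Roth", 47211, 95, 214),
    ("Eugene F. Fama", 266441, 107, 192),
    ("Jean Tirole", 128261, 134, 306),
    ("Albert Einstein", 241716, 201, 800),
    ("Yoichiro Nambu", 25775, 52, 93),
]

# With an even number of rows, the median is the mean of the two middle values.
median_citations = median(r[1] for r in rows)
median_h = median(r[2] for r in rows)
median_i10 = median(r[3] for r in rows)
```

The computed medians are 99,868 citations, an h-index of 119.5 (reported as 120 in the table), and an i10-index of 260.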

In reviewing the citation ranking statistics on the website (Figure 1) and changing the primary sort order by column, we have made a number of interesting observations about the list:

Eugene Koonin from National Center for Biotechnology Information has the most citations (166,410) and the highest h-index (199).
Twenty-nine researchers are tied for the earliest “year of first citation” (1980). In reviewing these results, we noted that this is a limitation in the Google Scholar profile page, with no citations depicted prior to this date, though prior publication dates are listed on individual researchers’ pages (e.g., Homer Warner, 1951²¹).
Alex Wang from National Institutes of Health has the highest i10-index (922).
Brian Pollack from University of Pittsburgh has the highest citations/year (9,958), h-index/year (15.6), and i10-index/year (51.3).
ACMI members make up 55 of the top 100 researchers when sorted by both h-index and citations, and IAHSI members make up 20 of the top 100.

To evaluate the included citation statistics, we calculated the squared correlation coefficient between the h-index and total citations (r²=0.77) (Figure 3) and between the h-index and the i10-index (r²=0.89) (Figure 4) using Stata/IC 15.1. Overall, the statistics similarly portray researcher productivity; however, in one case a researcher has a disproportionately high total citation count compared to h-index due to a single paper with more than 100,000 citations.
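
For readers who want to reproduce this kind of comparison on their own data, the calculation can be sketched in plain Python. The analysis in the paper was run in Stata; the `pearson_r` helper and the five data points below are made up for illustration and are not the study data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up (h-index, total citations) pairs for five hypothetical researchers.
h_values = [5, 10, 15, 20, 40]
total_citations = [120, 600, 1000, 2500, 9000]

r = pearson_r(h_values, total_citations)
r_squared = r ** 2  # share of variance in one measure explained by the other
```

A single outlier, such as one paper with an extremely high citation count, can pull r² down noticeably, which is consistent with the case described above.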

Figure 3: Graph showing the relationship between h-index and total citations (r²=0.77).

Figure 4: Graph showing the relationship between h-index and i10-index (r²=0.89).

Lessons Learned

Creation of the biomedical informatics researcher ranking website highlights the vast differences in scholarly productivity among members of the biomedical informatics research community. Careful inspection of the citations included on many researchers’ profile pages also highlights many of the limitations of automatically curated lists, including:

For individuals with relatively common family names, articles authored by other researchers are often included erroneously, which can falsely inflate citation statistics and rankings.²³ Authors can curate their own profiles to remove erroneous citations, but few do.
Duplicate citations exist in many profiles and can also falsely inflate citation statistics and rankings;²³ however, Google Scholar has implemented functionality to automatically merge some articles and combine citations when authors do not manually merge citations.
Researchers with publications before the 1990s, when use of the internet substantially increased, are not as well represented in the various citation statistics. Most notable is that no citations before 1980 are included in any of the counts, an important limitation of the Google Scholar profile page and the scraper tool.
Highly cited publications by large consortia, including “Initial sequencing and analysis of the human genome”²⁴ and “Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC”,²⁵ heavily skew some authors’ citation statistics.
Not all included articles are equal, yet Google Scholar often lists blog posts and slide presentations alongside articles from peer-reviewed scientific journals.
Likewise, not all citations are equal, yet Google Scholar counts all citations equally, whether from a website, slide presentation, or top scientific journal.
Most indexed articles are in English, which negatively affects non-English-speaking researchers.

Future Directions

While we believe the current biomedical informatics researcher ranking site is already very useful, we are continuing to identify new researchers, especially those who are highly cited, ACMI fellows, members of IAHSI, or Morris F. Collen award winners. In addition, we are reviewing profiles with a large number of incorrect or duplicate citations and requesting that the individuals curate their profile or be removed from the list. We have also identified numerous enhancements that we hope to make in the future, including:

Adding the total number of articles included in each person’s Google Scholar profile and the year of first publication to the biomedical informatics researcher ranking website.
Adding an indicator for other noteworthy accomplishments, including AMIA signature awards (e.g., Donald A.B. Lindberg Award for Innovation in Informatics, Virginia K. Saba Informatics Award, and AMIA New Investigator Award).
Calculating the longest consecutive string of years in which each researcher published one or more articles that received one or more citations.²⁶
Calculating h_s (universal h-index), or the h-index of an individual divided by the mean h-index of everyone in the field.²⁷
Evaluating and improving the usability and efficiency of the site.
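
Two of the planned measures in the list above lend themselves to a short sketch: the universal h-index and the longest run of consecutive publishing years. The Python below is an illustration under stated assumptions, not the planned implementation: the function names are ours, the input to `longest_streak` is assumed to be the publication years of articles that received at least one citation, and the sample values are made up.

```python
def universal_h(h, field_h_values):
    """h_s: an individual's h-index divided by the mean h-index of the field."""
    field_mean = sum(field_h_values) / len(field_h_values)
    return h / field_mean


def longest_streak(years):
    """Length of the longest run of consecutive calendar years in `years`."""
    best = run = 0
    previous = None
    for year in sorted(set(years)):
        # Extend the run if this year directly follows the previous one.
        run = run + 1 if previous is not None and year == previous + 1 else 1
        best = max(best, run)
        previous = year
    return best


# Made-up values for illustration.
h_s = universal_h(30, field_h_values=[10, 20, 30])            # field mean is 20
streak = longest_streak([2001, 2002, 2002, 2003, 2005, 2006])
```

Here the researcher's h-index is 1.5 times the field mean, and the longest unbroken publishing run is the three years 2001 to 2003.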

Finally, we are exploring opportunities to use the current Google Scholar scraper to identify top papers in biomedical informatics for all time and in the past year. A preliminary version of this new tool retrieved the top 100 most cited publications with 100 or more citations from all profiles in the list of biomedical informatics researchers and found 7,429 papers that met these criteria. The most cited publication had 69,812 citations in total, or 3,173 citations per year.²⁸ A preliminary version of the tool to identify top papers in the last year retrieved all publications in 2018 from all profiles in the list of biomedical informatics researchers and found 3,751 publications. The most cited publication had 2,177 citations in total.²⁹ At present, this new tool has several limitations. One important limitation is the inclusion of all publication types; in 2018, the most cited publication was a textbook. Another limitation is the inclusion of papers published by biomedical informatics researchers in areas that are not directly related to biomedical informatics; for example, an American Heart Association report on which a biomedical informatics researcher played a small role related to informatics development or data analysis is the second most cited publication in 2018.³⁰

Conclusion

We have developed an easily searchable, interactive, automatically updating, open-source bibliometric ranking website using Google Scholar citation profiles that includes over 1,300 biomedical informatics researchers from around the world. While there are limitations to both using bibliometric citation analysis to measure scientific productivity and automatically generated lists of articles and citations, the biomedical informatics ranking website has already proven to be useful for a number of important tasks. Future efforts are underway to add new functionality to the website and to expand the work to identify top papers in biomedical informatics.

References

1. Garfield E. “Science Citation Index”--A New Dimension in Indexing. Science. 1964 May 8;144(3619):649–54. doi: 10.1126/science.144.3619.649.
2. ResearchGate. [Internet]. ResearchGate. [cited 2019 Mar 6]. Available from: https://www.researchgate.net/
3. SciVal | Navigate the world of research with a ready-to-use solution | Elsevier Solutions. [Internet]. [cited 2019 Mar 6]. Available from: https://www.elsevier.com/solutions/scival.
4. Highly Cited Researchers - The Most Influential Scientific Minds. [Internet]. HCR. [cited 2019 Mar 6]. Available from: https://hcr.clarivate.com/
5. Google Scholar. [Internet]. [cited 2019 Mar 6]. Available from: https://scholar.google.com/
6. Allison B. McCoy - Google Scholar Citations. [Internet] [cited 2019 Mar 6]. Available from: https://scholar.google.com/citations?user=8fR6ShUAAAAJ&hl=en.
7. Bar-Ilan J. Which h-index? — A comparison of WoS, Scopus and Google Scholar. Scientometrics. 2008 Feb 1;74(2):257–71.
8. Teixeira da Silva JA, Dobránszki J. Multiple versions of the h-index: cautionary use for formal academic purposes. Scientometrics. 2018 May 1;115(2):1107–13. doi: 10.1007/s11192-018-2683-0.
9. McCoy AB, Lin J. Citations Statistics of Biomedical Informatics Researchers. [Internet] 2019 [cited 2019 Mar 13]. Available from: http://allisonbmccoy.github.io/scholar-scraper/index-bmi.html.
10. Lin J. Citations Statistics of Information Retrieval Researchers. [Internet] [cited 2019 Mar 6]. Available from: http://lintool.github.io/scholar-scraper/
11. Lin J. Citations Statistics of Human-Computer Interaction Researchers. [Internet] [cited 2019 Mar 6]. Available from: http://lintool.github.io/scholar-scraper/index-hci.html.
12. Lin J. Citations Statistics of Top Computer Science Researchers. [Internet] [cited 2019 Mar 6]. Available from: http://lintool.github.io/scholar-scraper/index-stratosphere.html.
13. Lin J. Scrapes citation statistics from Google Scholar. Contribute to lintool/scholar-scraper development by creating an account on GitHub. [Internet] 2019 [cited 2019 Mar 6]. Available from: https://github.com/lintool/scholar-scraper.
14. ACMI Fellows | AMIA. [Internet]. [cited 2019 Mar 6]. Available from: https://www.amia.org/programs/acmi-fellowship/acmi-fellows.
15. Fellows of AMIA | AMIA. [Internet]. [cited 2019 Mar 7]. Available from: https://www.amia.org/fellows-amia.
16. IAHSI. [Internet]. IMIA. 2018 [cited 2019 Mar 7]. Available from: https://imia-medinfo.org/wp/iahsi/
17. ACMI Fellowship | AMIA. [Internet]. [cited 2019 Mar 6]. Available from: https://www.amia.org/acmi-fellowship.
18. Hirsch JE. An index to quantify an individual’s scientific research output. Proc Natl Acad Sci U S A. 2005 Nov 15;102(46):16569–72. doi: 10.1073/pnas.0507655102.
19. Harle CA, Vest JR, Menachemi N. Using Bibliometric Big Data to Analyze Faculty Research Productivity in Health Policy and Management. Author. [Internet] 2016 Mar; [cited 2019 Mar 11]; Available from: https://scholarworks.iupui.edu/handle/1805/17166.
20. Hunter LE. Discovery Informatics: Can AI Do Science? In 2014
21. Warner HR. “Painless” myocardial infarction. Minn Med. 1951 Jan;34(1):49–50.
22. Harzing A-W. A preliminary test of Google Scholar as a source for citation data: a longitudinal study of Nobel prize winners. Scientometrics. 2013 Mar 1;94(3):1057–75.
23. Teixeira da Silva JA. The Google Scholar h-index: useful but burdensome metric. Scientometrics. 2018 Oct 1;117(1):631–5.
24. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature. 2001 Feb;409(6822):860–921. doi: 10.1038/35057062.
25. Chatrchyan S, Khachatryan V, Sirunyan AM, Tumasyan A, Adam W, Aguilo E, et al. Observation of a new boson at a mass of 125 GeV with the CMS experiment at the LHC. Phys Lett B. 2012 Sep 17;716(1):30–61.
26. Ioannidis JPA, Boyack KW, Klavans R. Estimates of the continuously publishing core in the scientific workforce. PloS One. 2014;9(7):e101698. doi: 10.1371/journal.pone.0101698.
27. Kaur J, Radicchi F, Menczer F. Universality of scholarly impact metrics. J Informetr. 2013 Oct;7(4):924–32.
28. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997 Sep 1;25(17):3389–402. doi: 10.1093/nar/25.17.3389.
29. Straus SE, Glasziou P, Richardson WS, Haynes RB. Evidence-Based Medicine: How to Practice and Teach EBM. 5th ed. Edinburgh, London, New York: Elsevier; 2018. 336 p.
30. Benjamin EJ, Virani SS, Callaway CW, Chamberlain AM, Chang AR, Cheng S, et al. Heart Disease and Stroke Statistics—2018 Update: A Report From the American Heart Association. Circulation. 2018 Mar 20;137(12):e67–492. doi: 10.1161/CIR.0000000000000558.
