Abstract
Proposed by Hirsch as a quantitative measure of the total output of a researcher, the h index does not work well in the life sciences, where an author’s position on a paper typically depends on the author’s contribution. We revise the h index by weighting first- and last-authorship papers four times more heavily than middle-authorship papers. The revised index (r) signifies a shift in how we evaluate research output in biology and medicine: it places more value on conducting and directing original, independent investigations than on contributing to projects conducted and directed by others.
Keywords: biology, medicine, publishing, research productivity
Proposed by Hirsch as a quantitative measure of the total effective output of a researcher, the index h was defined as the number of papers with citation number ≥ h (ref. 1). In many countries, the h index is now widely used to aid in selecting applicants for positions, promotions, grants and awards of various types. The h index counts highly cited articles published by an author regardless of the author’s position on the authors’ list: first authorship, last authorship and any middle authorship are all rewarded equally. This works well in scientific fields where an author’s position on a paper has little significance; physics and mathematics are arguably examples.
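The definition above can be made concrete with a short sketch. It reads the definition in the usual way, as the largest h such that at least h papers have at least h citations each; the function name `h_index` and the citation counts are hypothetical and serve only as an illustration.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have >= h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the paper at this rank still has enough citations
        else:
            break  # later papers have even fewer citations
    return h


# Hypothetical citation counts for six papers by one author.
example_citations = [25, 8, 5, 3, 3, 1]
print(h_index(example_citations))
```

Note that nothing in this calculation looks at the author's position on each paper, which is the property discussed in the rest of the text.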
There is, however, a serious problem with applying the h index to evaluate output of workers in the biomedical field, where authors’ contributions are typically reflected in their positions on the authorship list. My impression, which is probably shared by the majority of biomedical scientists, is that the lion’s share of work reported in biomedical papers is typically done by two authors: the first (usually the postdoc who conducted most experiments and prepared a draft of the manuscript) and the last (usually the senior author who planned the research, obtained funding, put a team of junior associates and external collaborators together, coordinated all of the work, provided scientific guidance and had a major role in writing the manuscript, sometimes by re-writing it completely). Middle authors usually contribute to a paper in a less significant way, e.g., by running particular experiments, helping the main postdoc with various tasks involved or providing techniques or reagents. By rewarding all authors equally, regardless of their positions on a paper, the h index creates unfair advantages and disadvantages for certain groups of scientists, as illustrated by three hypothetical examples.
Example 1
A problem arises when the h index is used to compare the research output of young scientists who received training in large vs. small labs. Postdoc L. graduated from a large, highly productive laboratory. He did fine in the lab and eventually published a first-author paper, for which he made a major contribution. Also, because all projects in the lab were discussed at lab meetings, and all members were encouraged to help each other, he was included in seven other papers as a middle author, while making marginal contributions to these papers. If all eight papers receive enough citations, they will increase his h index by 8. Now L. competes for an entry-level academic position with postdoc S., who came from a small lab. S. is exceptionally talented, worked very hard and produced four first-author papers during her postdoctoral training. She would be very happy to discuss and otherwise participate in other projects in the lab, but the lab was small, and there were no other projects. Provided that all of the papers published by L. and S. are of a similar caliber, the output of S. would be viewed by many as higher than that of L. However, S. can only increase her h index by 4—exactly one half of what L. can receive for his output (Table 1). I am aware of committees in several countries that place a strong emphasis on the h index when selecting new hires. Under these circumstances, the h index puts postdocs coming from smaller labs at a severe and unfair disadvantage compared with trainees from larger labs.
Table 1. Potential “rewards” in the h and r indices for papers published in several hypothetical scenarios (see Examples 1–3 in the text).
| | Example 1: Researcher L. | Example 1: Researcher S. | Example 2: Researcher G. | Example 2: Researcher I. | Example 3: Researcher C. | Example 3: Researcher P. |
|---|---|---|---|---|---|---|
| Number of first- and last-author papers | 1 | 4 | 4 | 4 | 0 | 12 |
| Number of middle-author papers | 7 | 0 | 8 | 0 | 12 | 0 |
| Possible gain in h | 8 | 4 | 12 | 4 | 12 | 12 |
| Possible gain in r | 4.4 | 6.4 | 9.6 | 6.4 | 4.8 | 19.2 |
Example 2
A similar problem comes into play when the h index is used to evaluate output of groups comprised of two or more principal investigators (PIs). Let’s take a look at researcher I., who published four papers from work done in his individual lab during the year. In the same department, a similarly productive researcher, let’s call her G., formed a group with two other faculty members, thus forming a three-PI group. The members of this group discussed their projects at joint lab meetings, shared some resources and included each other in all of their publications. So G., without much extra work, published 12 papers that year: four from her lab (as the senior author) and eight from the labs of other members of her group (as a middle author). If all the papers published by I. and G. during that year are cited frequently enough to contribute to their h indices, the h index of each of them will increase by 4 for papers published from the respective PI’s own lab. In addition, G. will receive two times more points (8) for her minor contributions to other projects, and, as a result, her overall productivity for this year, according to the h index, will be three times greater than that of I. (Table 1). The h index gives an unfair advantage to multi-PI groups over PIs working individually. So, if you make an agreement with n investigators to include each other in all papers, and your productivities are equal, the h index of every member of the group will grow (n + 1) times faster.
Example 3
There is another way to accelerate the growth of your h index. Researcher C.’s position in his company allows him to offer a popular compound to outside investigators. C. gave this compound to 12 productive groups and, other than that, had little to do with their projects. Justifiably or not, C. became a co-author on 12 papers while spending little time on any of them. On the other hand, researcher P. worked as a PI for six years to plan, fund, conduct and publish his research in 12 papers, thus spending on average 6 months of his time per paper. Despite the drastic inequality of their contributions, both C. and P. can increase the h index by the same number of points: 12 (Table 1). This example shows that by giving identical gains for unequal contributions, the h index effectively rewards co-authors making marginal contributions. The optimal strategy for increasing the h index would be to make minimal contributions to the maximal number of projects. Nothing will increase your h index faster than providing minor services to highly productive groups.
These problems with the h index have been recognized, and solutions have been offered. Notably, Zhang (ref. 2) proposed calculating a weighted citation number and weighted index w by giving a weight coefficient of 1 to the first author and the corresponding author, but decreasing the coefficient (linearly) for authors with increasing rank (position on the paper). Zhang’s proposal assumes that middle authors are positioned on average in the order from the greatest contribution to the least. This assumption may be adequate for single-method papers with a small number of authors from the same lab. In such papers, a postdoc (who conducted most experiments) is typically followed by a student or another postdoc (who helped substantially), then by a student or technician (who helped less) and finally by the PI, who is typically the corresponding author. For multi-methodological, multi-laboratory, collaborative studies, the order is typically different: all junior participants who made major contributions are listed at the beginning, followed by co-authors who made minor contributions and then by the heads of participating laboratories (who made various contributions). On such papers, there is also a tendency to group authors from the same laboratory together, and because the places close to the first and the last are considered more prestigious, most contributors from the main laboratory are often listed at the beginning and at the end. Zhang’s effective assumptions that author #5 on a 10-author paper contributes more, on average, than either author #6 on a 10-author paper or author #5 on an 11-author paper, and that author #10 on an 11-author paper contributes less, on average, than author #9 on a 10-author paper, are probably unwarranted, and there is probably no simple rule that can adequately differentiate contributions of middle authors based on rank.
The w index also takes too much work to compute. It requires not only ranking one’s publications by the number of citations (as the h index does), but also collecting additional information about each publication that is not present in the list of authors (such as finding out which author is listed as the corresponding author). Furthermore, the w index requires making calculations for each individual paper (in order to determine a weight coefficient for the author of interest) and then applying the weight coefficient to the citation count. From a psychological point of view, the w index receives little enthusiasm, as it gives a lower value to research output than the h index for nearly all researchers (with the exception of those who have only first- and last-author papers among the papers that form their h index).
Here, I propose a solution that eliminates or minimizes the shortcomings noted above. I present the index r—a revised h index for biomedical research. If among the h papers included in a scientist’s h index only a papers are with his or her first or last authorship, then this scientist’s r index is determined as follows:
r = 1.6a + 0.4(h - a)    (Equation 1)
I propose to give two different weight coefficients to papers (not to citation numbers) that form the h index: 1.6 to the first- or last-author papers, and 0.4 to middle-author papers, thus valuing the first and last authorships four times more than any middle authorship, and valuing all middle authorships the same. These coefficients are chosen arbitrarily, but also in such a way that the r index of a “typical researcher” is close to his or her h index. It may be reasonable to estimate that among the papers that form the typical researcher’s h index, one half comes from the main postdoctoral projects (first authorship) and subsequent publications as the PI (first or last authorship), and the other half comes from contributing to other investigators’ projects. If this is the case, the r index of the typical researcher is equal to the h index.
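As an illustration of the definition above, the following minimal sketch computes h, a and r from a list of papers. The function name `r_index`, the tuple layout and the sample data are hypothetical. The sketch also makes one assumption that the text does not address: when several papers are tied at the citation count that closes the h set, the first h papers in sorted order are taken.

```python
def r_index(papers):
    """Compute (h, a, r) from (citations, is_first_or_last_author) tuples.

    h: number of papers in the h set
    a: first- or last-author papers within that set
    r: 1.6 * a + 0.4 * (h - a)
    """
    # Rank papers by citations, most cited first (ties keep input order).
    ranked = sorted(papers, key=lambda p: p[0], reverse=True)
    h = 0
    for rank, (cites, _) in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    # Assumption: the h set is simply the top h papers after sorting.
    h_set = ranked[:h]
    a = sum(1 for _, is_lead in h_set if is_lead)
    # Work in tenths (16 and 4) to avoid floating-point drift in 1.6 and 0.4.
    r = (16 * a + 4 * (h - a)) / 10
    return h, a, r


# Hypothetical record: five papers, two of them first- or last-author.
papers = [(40, True), (30, False), (12, False), (9, True), (2, False)]
print(r_index(papers))
```

In this hypothetical record, half of the papers in the h set are first- or last-author papers, so r comes out equal to h, matching the "typical researcher" case described above.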
The coefficients chosen effectively mean acceptance of the following three assumptions. First, the first author’s and last author’s contributions are equal on average; the same assumption underlies the h index. Second, all middle authors’ contributions are equal on average; again, the same assumption underlies the h index. Third, the first or last author’s contribution is, on average, four times more valuable than any middle author’s contribution. In the last assumption, the ratio 4 is the ratio of weights 1.6 and 0.4 in Equation 1; the h index assumes that this ratio is equal to 1 instead of 4.
This ratio between the contributions of the principal authors and those of middle authors deserves special consideration. Giving it a value of 4 will, in some cases, still overestimate the contributions of middle authors. There are more and more publications with the number of authors in the hundreds or even thousands. It is likely that the scientific contributions of most middle authors on such papers are several orders of magnitude smaller than the contributions of leading authors. In defense of the r index, it rewards middle authors on these papers less generously than the h index.
On the other hand, there may be a concern that the r index would underestimate the contributions of middle authors on typical single-method papers with a small number of authors from the same lab, particularly on three-author papers. In such papers, the scientific contributions of all authors are often comparable, and as a result, the ratio between the contribution of the first or last (third) author and the contribution of the middle (second) author is less than 4. Yet, the potential increase in the r index is only 0.4 for the second author, while it is 1.6 for either the first or third author. This concern, however, is lessened when a larger number of papers are considered. If the same three authors wrote six papers, and each was twice in the first position, twice in the second position and twice in the third position, then each can potentially increase his or her r index by 7.2. Remember, the typical researcher’s r index is equal to his or her h index, meaning that both indices typically reward researchers at a rate of 1.0 for every paper included in the h index (h for h papers). Yet, in the example above, the r index of the three scientists who rotate on three-author papers gives each of them a potential reward of 7.2 for six papers, or 1.2 per paper. Hence, the r index may slightly underestimate the contributions of middle authors on papers with three authors only for those authors who always stay in the middle position. If contributors to three-author papers rotate in all three positions, they will occupy the first and last positions more often than scientists who participate in papers with a larger number of authors and more often than the typical researcher. Accordingly, they will be rewarded by the r index more than the typical researcher and more than by the h index. As a group, participants in papers with a small number of authors would slightly (and perhaps justifiably) benefit from using the r index.
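The arithmetic behind the rotation example can be written out as follows, assuming all six papers are cited often enough to enter each author's h set. Each author holds a first or last position on four papers and the middle position on two:

```latex
\[
  \Delta r = 4 \times 1.6 + 2 \times 0.4 = 6.4 + 0.8 = 7.2,
  \qquad
  \frac{\Delta r}{6\ \text{papers}} = 1.2 \ \text{per paper}.
\]
```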
I started this paper by looking at the situations in which the h index gives unfair advantages and unfair disadvantages (Table 1). The r index lessens or eliminates both. Let’s revisit the young scientists L. and S., who graduated from a large lab and a small lab, respectively (Example 1). The output of S., assessed by the h index to be two times smaller than that of L., is now evaluated by the r index as higher than L.’s output. In example 2, the output of researcher I., who worked individually, is still valued by the r index as lower than the output of researcher G., a group member, who published the same amount of work individually, plus collaborated with others. However, the r index gives G. a 50% reward for her collaborative contributions—not the 200% given by the h index. In example 3, two researchers were rewarded equally by the h index: researcher C. for sharing a compound and researcher P. for leading multiple projects as the PI for several years. The r index corrects the situation and evaluates the output of C. as only one quarter of P.’s output.
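The comparisons above can be checked with a short sketch that recomputes the possible gains listed in Table 1 from the paper counts in Examples 1–3. The helper name `r_gain` and the dictionary layout are hypothetical; as in the table, the sketch assumes that every paper is cited often enough to count toward the index.

```python
def r_gain(lead_papers, middle_papers):
    """Possible gain in r: 1.6 per first/last-author paper, 0.4 per middle-author paper."""
    # Work in tenths (16 and 4) to avoid floating-point drift in 1.6 and 0.4.
    return (16 * lead_papers + 4 * middle_papers) / 10


# (first- or last-author papers, middle-author papers) per researcher, from Table 1.
scenarios = {
    "L": (1, 7), "S": (4, 0),    # Example 1
    "G": (4, 8), "I": (4, 0),    # Example 2
    "C": (0, 12), "P": (12, 0),  # Example 3
}

# Possible gain in h is one point per paper; possible gain in r uses the weights.
gains = {
    name: (lead + middle, r_gain(lead, middle))
    for name, (lead, middle) in scenarios.items()
}

for name, (h_gain, r) in gains.items():
    print(f"Researcher {name}.: gain in h = {h_gain}, gain in r = {r}")
```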
What are the limits and the meaning of the r index? Equation 1 can be simplified as follows:
r = 1.2a + 0.4h    (Equation 2)
When, among h highly cited papers, there are no papers with the first or last authorship (a = 0), then r = 0.4 h. When all of the highly cited papers are first- or last-author papers (a = h), then r = 1.6 h. If one half of the highly cited papers are first- or last-author papers, as in the case of a typical researcher (a = 0.5 h), then r = h. Hence, r ranges between 40% and 160% of the h index, depending on the proportion of first- or last-authorship papers among the highly cited papers. Equation 2 also gives an additional meaning to the r index as a cumulative measure of research output. It defines the r index as the sum of 40% of the highly cited papers and 120% of the first- or last-author papers among the highly cited papers.
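For readers who want the intermediate step, the simplification from Equation 1 to Equation 2 and the three cases discussed above can be written out as:

```latex
\begin{align*}
  r &= 1.6a + 0.4(h - a) = 1.6a + 0.4h - 0.4a = 1.2a + 0.4h, \\
  a = 0 &\;\Rightarrow\; r = 0.4h, \\
  a = 0.5h &\;\Rightarrow\; r = 0.6h + 0.4h = h, \\
  a = h &\;\Rightarrow\; r = 1.2h + 0.4h = 1.6h.
\end{align*}
```

Because a can only take values between 0 and h, r is confined to the interval from 0.4h to 1.6h.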
Designed to revise the h index, the r index may eventually replace it in biomedical sciences. In certain situations, however, the h index can complement the r index, as comparing the two may be useful. For scientists working in some research positions, it may be desirable to have r lower than h. For example, the director of a core facility may or may not be asked to conduct his or her own research, but the documented ability to contribute to studies by others would be critical for such a position. For the same value of r, a researcher with a low proportion of first- or last-authorship papers among his or her highly cited papers (r close to 0.4 h) may be a preferred candidate for this position. A different relationship between r and h is sought when an institution opens a single laboratory to start a new line of research and expects the new hire to conduct investigator-originated projects. A scientist whose r index is much higher than h (close to 1.6 h) would be a good candidate. There are also positions that require both conducting research as a PI and extensively collaborating within the institution. Having similar r and h could be an ideal combination for such positions.
In summary, I propose the r index, which is a revision of the h index aimed at evaluating the total effective scientific output of a biomedical researcher or, more generally, a researcher working in a field where contributions of the first and last authors are substantially higher, on average, than those of middle authors. The r index lessens what I view as the unfair advantages given by the h index to researchers working in large labs or representing multi-PI groups, and gives a higher credit to leading investigators as compared with collaborators. The r index is simple to understand and calculate. It should be easy to switch from the h index to the r index, because the two give a similar numeric value to the output of what may be perceived as the typical biomedical researcher. At the same time, introducing the r index signifies a shift in how we evaluate the scientific work: it places more value on conducting and directing original, independent research as compared with contributing to research projects conducted and directed by others. What is your r index?
Glossary
Abbreviations:
- PI: principal investigator
Footnotes
Previously published online: www.landesbioscience.com/journals/cc/article/22179
References
- 1. Hirsch JE. An index to quantify an individual’s scientific research output. Proc Natl Acad Sci USA. 2005;102:16569–72. doi: 10.1073/pnas.0507655102.
- 2. Zhang CT. A proposal for calculating weighted citations based on author rank. EMBO Rep. 2009;10:416–7. doi: 10.1038/embor.2009.74.