Abstract
Bibliometric counting methods need to be validated against perceived notions of authorship credit allocation, and standardized by rejecting methods with poor fit or questionable ethical implications. Harmonic counting meets these concerns by exhibiting a robust fit to previously published empirical data from medicine, psychology and chemistry, and by complying with three basic ethical criteria for the equitable sharing of authorship credit. Harmonic counting can also incorporate additional byline information about equal contribution, or the elevated status of a corresponding last author. By contrast, several previously proposed counting schemes from the bibliometric literature including arithmetic, geometric and fractional counting, do not fit the empirical data as well and do not consistently meet the ethical criteria. In conclusion, harmonic counting would seem to provide unrivalled accuracy, fairness and flexibility to the long overdue task of standardizing bibliometric allocation of publication and citation credit.
Keywords: Bibliometry, Bibliometric counting, Validation, Counting bias
Introduction
Allocating authorship credit for multi-authored publications according to a harmonic progression was originally suggested by Hodge and Greenberg (1981) in a letter to Science. Their letter was a response to Derek De Solla Price who, although aware that coauthors did not contribute equally, had proposed equal division of publication and citation credit among coauthors as “a deterrent to the otherwise pernicious practice of coining false brownie points by awarding each author full credit for the whole thing” (Price 1981). Ironically, both Price’s proposal for equal division of authorship credit (fractional counting) and the practice he opposed (inflated counting) have persisted as routine bibliometric methods for nearly 30 years. By contrast, harmonic counting went virtually unnoticed until it was reproposed, without acknowledgement of Hodge and Greenberg, in the 17 October 2008 issue of Science (cf. Hagen 2009).
Recently, harmonic counting was shown to improve the accuracy of h index scores by removing distorting bibliometric bias from the input data (Hagen 2008). Such bias is generated by equal allocation of authorship credit, either by inflated or fractional counting, and has the potential to distort all derived bibliometric measures.
In the present study harmonic authorship credit scores are validated by comparison with previously published empirical data from medicine, psychology and chemistry. Such validation does not imply causation, and for that reason harmonic counting is also assessed ethically by contrasting its main features with previously proposed counting schemes from the bibliometric literature, including arithmetic (‘proportional’, Van Hooydonk 1997), geometric (Egghe et al. 2000) and fractional counting (Lindsey 1980; Price 1981).
Methods
Empirical validation
Harmonic authorship credit for the ith author of a publication with N coauthors was calculated according to the following formula:
$$\text{Harmonic } i\text{th author credit} = \frac{1/i}{1 + \frac{1}{2} + \cdots + \frac{1}{N}}$$
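As a minimal sketch, harmonic credit can be computed as follows, assuming the formula takes the form (1/i) / (1 + 1/2 + ... + 1/N), which reproduces the harmonic rows of Table 1. The function name `harmonic_credit` is illustrative only, and the adjustment for corresponding last authors is not covered here.

```python
from fractions import Fraction


def harmonic_credit(i: int, n: int) -> Fraction:
    """Credit for the i-th of n coauthors: (1/i) / (1 + 1/2 + ... + 1/n)."""
    if not 1 <= i <= n:
        raise ValueError("rank i must satisfy 1 <= i <= n")
    # Normalizing constant: the n-th harmonic number, kept exact.
    harmonic_sum = sum(Fraction(1, k) for k in range(1, n + 1))
    return Fraction(1, i) / harmonic_sum


# Credits for a three-author paper; the exact fractions sum to one.
credits = [harmonic_credit(i, 3) for i in range(1, 4)]
print([round(float(c), 4) for c in credits])  # [0.5455, 0.2727, 0.1818]
```

Exact fractions are used so that the credits for one paper sum to exactly one before any rounding for display.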
For medical research, where the corresponding author is customarily listed last to signify elevated status (Wren et al. 2007; Zuckerman 1968), the harmonic authorship credit was calculated assuming approximate equality between the contributions of the first and last authors (Hagen 2008, Fig. 5C therein).
Empirical data from the bibliometric literature were obtained as follows: data for psychology were obtained from an internet-based study on how name-ordering conventions in three different disciplines affect inferences about authorship credit (Maciejovsky et al. 2009). The data for psychology were used because this discipline has a tradition of hierarchical byline positioning, whereas the other two, marketing and economics, do not. For psychology, authorship credit per author for papers with 2, 3, or 4 coauthors was assigned by analyzing responses from 52 faculty members and advanced graduate students. The data were obtained by scanning figure A2 from Maciejovsky et al. (2009), and using the ImageJ (http://rsbweb.nih.gov/ij/download.html) image analysis program to measure the average credit scores for psychology papers with non-alphabetical name ordering.
Empirical data for medicine were obtained from a survey of perceived authorship credit allotted by 87 promotion committee members from a wide selection of American medical schools. The data consisted of mean authorship credit scores and standard deviations for papers with three or five coauthors and the last author as corresponding author (Wren et al. 2007, Table 1 therein).
Empirical data for chemistry were obtained from tabulated authorship scores based on extensive empirical and theoretical investigations (Vinkler 2000, Table 4 therein). The data consisted of authorship credit scores for papers with up to six coauthors. The data were used with one minor correction: the first author credit for a paper with six coauthors was altered from 0.33 to 0.35 in order to make the total credit sum to unity, as was Vinkler’s intention, while maintaining a consistent internal increment of 0.05.
Lack of fit
Lack of fit was calculated as a standardized departure from model predictions as follows:
$$\text{Lack of fit} = \frac{1}{n} \sum \frac{(O - E)^{2}}{E}$$
where n is the total number of empirical observations, O is the empirical observation, and E is the model prediction.
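The calculation can be sketched in a few lines, assuming the standardized departure takes the chi-square-style form of the mean of (O − E)²/E over all n observations. That form is an assumption consistent with the definitions of n, O and E above. The observed scores in the example are hypothetical and are not taken from any of the source studies.

```python
def lack_of_fit(observed, expected):
    """Mean of (O - E)^2 / E over paired observations (assumed form)."""
    if len(observed) != len(expected) or not observed:
        raise ValueError("observed and expected must be non-empty and equal in length")
    total = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return total / len(observed)


# Hypothetical perceived credit for a three-author paper (illustration only).
observed = [0.55, 0.27, 0.18]
harmonic = [6 / 11, 3 / 11, 2 / 11]   # harmonic prediction for N = 3
fractional = [1 / 3, 1 / 3, 1 / 3]    # fractional prediction for N = 3

print(lack_of_fit(observed, harmonic) < lack_of_fit(observed, fractional))  # True
```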
Model predictions of authorship credit for the ith author of a publication with N coauthors were calculated according to the following formulas:
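$$\text{Arithmetic } i\text{th author credit} = \frac{N + 1 - i}{1 + 2 + \cdots + N}$$

$$\text{Geometric } i\text{th author credit} = \frac{2^{N-i}}{2^{N} - 1}$$

$$\text{Fractional } i\text{th author credit} = \frac{1}{N}$$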
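A minimal sketch of the three comparison models follows. The closed forms used here are inferred from the tabulated credit scores in Table 1 rather than quoted from the cited papers, and the function names are illustrative.

```python
from fractions import Fraction


def arithmetic_credit(i: int, n: int) -> Fraction:
    """(N + 1 - i) / (1 + 2 + ... + N), written as 2(N + 1 - i) / (N(N + 1))."""
    return Fraction(2 * (n + 1 - i), n * (n + 1))


def geometric_credit(i: int, n: int) -> Fraction:
    """Each author gets twice the credit of the next: 2^(N - i) / (2^N - 1)."""
    return Fraction(2 ** (n - i), 2 ** n - 1)


def fractional_credit(i: int, n: int) -> Fraction:
    """Equal shares: 1 / N regardless of rank."""
    return Fraction(1, n)


# Print the N = 4 row for each model, rounded as in Table 1.
for model in (arithmetic_credit, geometric_credit, fractional_credit):
    row = [round(float(model(i, 4)), 4) for i in range(1, 5)]
    print(model.__name__, row)
```

Each printed row can be compared with the corresponding N = 4 row of Table 1.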
Results
Validation of the harmonic counting model
It is evident that the harmonic authorship credit scores are in close agreement with the empirical data from psychology (Fig. 1a, Maciejovsky et al. 2009), medicine (Fig. 1b, Wren et al. 2007) and chemistry (Fig. 1c, Vinkler 2000). For medicine the harmonic credit scores were calculated on the assumption that the first and last (corresponding) authors were perceived as equal contributors. This assumption is supported by the close fit between the harmonic credit scores and the empirical means. The large error bars associated with first and last author credit in medicine may be an indication of diverging opinion among the 87 promotion committee members of the original survey about whether the last author position signifies approximate equality with the first author.
Fig. 1.
Harmonic authorship credit scores compared with previously published empirical data from a psychology (Maciejovsky et al. 2009), b medicine (Wren et al. 2007) and c chemistry (Vinkler 2000). n number of coauthors
The overall fit between the predicted harmonic authorship credit scores and the empirical data was close to the line of perfect fit, with no outliers (Fig. 2). The excellent fit to the harmonic authorship credit scores was quantified by a standardized score that estimated the overall departure from the model’s prediction at a mere 0.0035 (Fig. 3).
Fig. 2.
Relationship between predicted harmonic authorship credit scores and previously published empirical data from psychology (Maciejovsky et al. 2009), medicine (Wren et al. 2007) and chemistry (Vinkler 2000). The diagonal line indicates perfect fit between prediction and observation. N = 37 observations
Fig. 3.
Lack of fit between authorship credit scores predicted by harmonic, arithmetic, geometric and fractional counting models, and previously published empirical data from psychology (Maciejovsky et al. 2009), medicine (Wren et al. 2007) and chemistry (Vinkler 2000). N = 37 observations
Contrasting the bibliometric counting methods
The harmonic counting model fits the empirical data better than the arithmetic, geometric or fractional counting methods (Fig. 3). The fractional model, which allocates equal credit to all coauthors, exhibits the greatest discrepancy between model prediction and empirical data with a standardized departure score of 0.064, an 18-fold increase over harmonic counting. Arithmetic and geometric counting models have an intermediate lack of fit, with standardized departure scores for arithmetic more than double, and for geometric more than 6-fold greater than for harmonic counting. To further elucidate the differential lack of fit, a more detailed juxtaposition of how these models allocate authorship credit follows (Fig. 4; Table 1).
Fig. 4.
Comparison of bibliometric counting models. a harmonic, b arithmetic, c geometric, and d fractional counting models. Curves comparing allocated authorship credit are plotted for the first five authors for publications with N ≤ 20 coauthors
Table 1.
Authorship credit scores for papers with up to N = 6 coauthors
Counting method | Coauthors (N) | 1st author | 2nd author | 3rd author | 4th author | 5th author | 6th author
---|---|---|---|---|---|---|---
Harmonic | 1 | 1.0000 | | | | |
Harmonic | 2 | 0.6667 | 0.3333 | | | |
Harmonic | 3 | 0.5455 | 0.2727 | 0.1818 | | |
Harmonic | 4 | 0.4800 | 0.2400 | 0.1600 | 0.1200 | |
Harmonic | 5 | 0.4380 | 0.2190 | 0.1460 | 0.1095 | 0.0876 |
Harmonic | 6 | 0.4082 | 0.2041 | 0.1361 | 0.1020 | 0.0816 | 0.0680
Arithmetic | 1 | 1.0000 | | | | |
Arithmetic | 2 | 0.6667 | 0.3333 | | | |
Arithmetic | 3 | 0.5000 | 0.3333 | 0.1667 | | |
Arithmetic | 4 | 0.4000 | 0.3000 | 0.2000 | 0.1000 | |
Arithmetic | 5 | 0.3333 | 0.2667 | 0.2000 | 0.1333 | 0.0667 |
Arithmetic | 6 | 0.2857 | 0.2381 | 0.1905 | 0.1429 | 0.0952 | 0.0476
Geometric | 1 | 1.0000 | | | | |
Geometric | 2 | 0.6667 | 0.3333 | | | |
Geometric | 3 | 0.5714 | 0.2857 | 0.1429 | | |
Geometric | 4 | 0.5333 | 0.2667 | 0.1333 | 0.0667 | |
Geometric | 5 | 0.5161 | 0.2581 | 0.1290 | 0.0645 | 0.0323 |
Geometric | 6 | 0.5079 | 0.2540 | 0.1270 | 0.0635 | 0.0317 | 0.0159
Fractional | 1 | 1.0000 | | | | |
Fractional | 2 | 0.5000 | 0.5000 | | | |
Fractional | 3 | 0.3333 | 0.3333 | 0.3333 | | |
Fractional | 4 | 0.2500 | 0.2500 | 0.2500 | 0.2500 | |
Fractional | 5 | 0.2000 | 0.2000 | 0.2000 | 0.2000 | 0.2000 |
Fractional | 6 | 0.1667 | 0.1667 | 0.1667 | 0.1667 | 0.1667 | 0.1667
In harmonic counting (Fig. 4a), the ratio of credit allotted to the ith and jth authors is always j:i, regardless of the total number of coauthors (N) (Hodge and Greenberg 1981), i.e. the 1st author always gets twice as much credit as the 2nd author, the 2nd author always gets 1.5 times as much as the 3rd, the 3rd author always gets 1.33 times as much as the 4th author, and so on.
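The fixed ratio follows directly from the harmonic formula, assuming credit has the form (1/i) divided by the harmonic sum; the symbols c_i and H_N are introduced here only for illustration. The normalizing sum cancels, so N drops out of the ratio:

```latex
H_N = \sum_{k=1}^{N} \frac{1}{k}, \qquad
c_i = \frac{1/i}{H_N}
\quad\Longrightarrow\quad
\frac{c_i}{c_j} = \frac{(1/i)/H_N}{(1/j)/H_N} = \frac{j}{i},
\qquad \text{e.g. } \frac{c_2}{c_3} = \frac{3}{2} = 1.5 .
```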
Arithmetic counting also allots twice as much credit to the 1st author when there are only two coauthors (Fig. 4b), but has no fixed ratio of allotment when N increases. First author credit decreases rapidly and continuously, whereas last author credit initially increases and thereafter decreases slowly as N increases, e.g. the 4th author gets 0.1 credits as last author but >0.1 credits for 5 ≤ N < 15.
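The behaviour of the 4th author's credit can be checked numerically. The sketch below assumes the arithmetic formula 2(N + 1 − i) / (N(N + 1)), which reproduces the arithmetic rows of Table 1; variable names are illustrative.

```python
from fractions import Fraction

# Arithmetic credit of the 4th author for N = 4 .. 20 coauthors (exact values).
fourth = {n: Fraction(2 * (n + 1 - 4), n * (n + 1)) for n in range(4, 21)}

# Values of N at which the 4th author receives strictly more than 0.1 credits.
above = [n for n, credit in fourth.items() if credit > Fraction(1, 10)]
print(above)  # [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
```

Under this formula the credit equals exactly 0.1 at both N = 4 and N = 15, which is why the interval is open at 15.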
Geometric counting always allots twice as much credit to the ith author as to the (i + 1)th author (Fig. 4c), which implies that the allotted authorship credit rapidly approximates asymptotic values as N increases, such that the first few authors get most of the credit while negligible credit is allotted to the rest.
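The asymptotic values can be made explicit, assuming the geometric formula 2^(N − i) / (2^N − 1) that reproduces the geometric rows of Table 1; the symbol c_i is illustrative:

```latex
c_i = \frac{2^{N-i}}{2^{N} - 1}
    = \frac{2^{-i}}{1 - 2^{-N}}
\;\xrightarrow[N \to \infty]{}\; 2^{-i},
\qquad \text{so } c_1 \to \tfrac{1}{2},\; c_2 \to \tfrac{1}{4},\; c_3 \to \tfrac{1}{8} .
```

At N = 6 the first author's credit is already 0.5079, within 0.008 of its limiting value.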
Fractional counting (Fig. 4d) systematically favors secondary authors by allotting equal credit to all coauthors. The amount by which secondary authors are favored is equal to the difference between fractional and harmonic authorship credit, and is referred to as equalizing bias. For primary authors the equalizing bias is negative (Hagen 2008, Fig. 3 therein).
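As a small worked example, equalizing bias can be computed per author as fractional credit minus harmonic credit. The function name `equalizing_bias` is illustrative, and the harmonic formula is assumed to be (1/i) divided by the harmonic sum.

```python
from fractions import Fraction


def equalizing_bias(i: int, n: int) -> Fraction:
    """Fractional minus harmonic credit for the i-th of n coauthors."""
    harmonic_sum = sum(Fraction(1, k) for k in range(1, n + 1))
    return Fraction(1, n) - Fraction(1, i) / harmonic_sum


# Bias per byline position for a four-author paper.
bias = [round(float(equalizing_bias(i, 4)), 4) for i in range(1, 5)]
print(bias)  # [-0.23, 0.01, 0.09, 0.13]
```

Because both schemes distribute exactly one credit per paper, the biases for any paper sum to zero: whatever the later authors gain is taken from the first author.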
Discussion
Harmonic counting matches established notions of the relationship between authorship credit and authorship rank in psychology, medicine and chemistry, by providing a robust fit to empirical data from three independent studies using disparate methodologies. It would appear, therefore, that harmonic counting provides a fair and accurate representation of the perceived quantitative norms of the byline hierarchy in branches of scientific publishing where unequal coauthor contribution is the norm. Furthermore, harmonic counting succeeds in capturing the essence of the unadorned byline by ensuring that three basic ethical criteria for equitable sharing of authorship credit are met (Hagen 2008):
1. one publication credit is shared among all coauthors,
2. the first author gets the most credit, and in general the ith author receives more credit than the (i + 1)th author, and
3. the greater the number of authors, the less credit per author.
In contrast, arithmetic counting does not consistently satisfy criterion 3 as the credit of the former last author is initially increased by adding more authors (Fig. 4b). Geometric counting does not consistently satisfy either criterion 1 or 3 because authorship credit rapidly approximates asymptotic values as N increases, so that the first few authors get most of the credit while negligible credit is allotted to the rest (Fig. 4c). And fractional counting violates criterion 2 by systematically favoring secondary authors at the expense of primary authors (Hagen 2008, Fig. 3 therein). In addition, these counting methods do not match the empirical data nearly as well as does the harmonic counting formula (Fig. 3).
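Criteria 2 and 3 can be tested mechanically under a strict reading, as sketched below. The reading is an assumption: criterion 2 is taken to mean strictly decreasing credit along the byline, and criterion 3 to mean that every existing rank loses credit when one author is added. Geometric counting is left out because the objection raised against it concerns saturation at asymptotic values, which a strict inequality test does not capture. Function names are illustrative.

```python
from fractions import Fraction


def harmonic(i, n):
    return Fraction(1, i) / sum(Fraction(1, k) for k in range(1, n + 1))


def arithmetic(i, n):
    return Fraction(2 * (n + 1 - i), n * (n + 1))


def fractional(i, n):
    return Fraction(1, n)


def strictly_ranked(model, max_n=20):
    """Criterion 2: the i-th author gets more than the (i + 1)-th author."""
    return all(model(i, n) > model(i + 1, n)
               for n in range(2, max_n + 1) for i in range(1, n))


def credit_shrinks(model, max_n=20):
    """Criterion 3: each existing rank loses credit when an author is added."""
    return all(model(i, n + 1) < model(i, n)
               for n in range(1, max_n) for i in range(1, n + 1))


for model in (harmonic, arithmetic, fractional):
    print(model.__name__, strictly_ranked(model), credit_shrinks(model))
```

Under these assumptions only the harmonic model passes both checks for N up to 20; the arithmetic model fails the criterion 3 check and the fractional model fails the criterion 2 check.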
Harmonic counting easily accommodates further decoding of explicit byline information about equal contribution of some coauthors (Hu 2009), or implicit information about the approximate equality of contributions by first and last authors, as in biomedical research where the corresponding author is customarily listed last (Buehring et al. 2007; Hagen 2008, Fig. 5 therein; Wren et al. 2007). However, the kind of ambiguity that may arise due to divergent opinion on the preferential status of corresponding last authors (e.g. Buehring et al. 2007; Hodge and Greenberg 1981), or as a result of unwritten conventions about coauthor equality and alphabetical name-ordering (e.g. Boas 1964; Endersby 1996; Maciejovsky et al. 2009), needs to be resolved by requesting unequivocal byline information, explicit contribution statements or editorial clarification.
In conclusion, it would seem that harmonic counting provides unrivalled accuracy, fairness and flexibility to the long overdue task of standardizing bibliometric allocation of publication and citation credit (cf. Larsen 2008).
Acknowledgments
Thanks to H.K. Marshall for improving the linguistic content and logical flow of the manuscript. Bodø University College, Norway provided time for data analysis and manuscript preparation. The institutional library provided database access, extensive full text access, and rapid hard copy information retrieval service.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References
- Boas RP Jr. Mathematical authorship. Science. 1964;145(3629):232. doi: 10.1126/science.145.3629.232.
- Buehring GC, Buehring JE, Gerard PD. Lost in citation: Vanishing visibility of senior authors. Scientometrics. 2007;72(3):459–468. doi: 10.1007/s11192-007-1762-4.
- Egghe L, Rousseau R, Van Hooydonk G. Methods for accrediting publications to authors or countries: Consequences for evaluation studies. Journal of the American Society for Information Science and Technology. 2000;51(2):145–157. doi: 10.1002/(SICI)1097-4571(2000)51:2<145::AID-ASI6>3.0.CO;2-9.
- Endersby JW. Collaborative research in the social sciences: Multiple authorship and publication credit. Social Science Quarterly. 1996;77(2):375–392.
- Hagen NT. Harmonic allocation of authorship credit: Source-level correction of bibliometric bias assures accurate publication and citation analysis. PLoS ONE. 2008;3(12):e4021. doi: 10.1371/journal.pone.0004021.
- Hagen NT. Credit for coauthors. Science. 2009;323(5914):583. doi: 10.1126/science.323.5914.583a.
- Hodge SE, Greenberg DA. Publication credit. Science. 1981;213(4511):950.
- Hu X. Loads of special authorship functions: Linear growth in the percentage of “equal first authors” and corresponding authors. Journal of the American Society for Information Science and Technology. 2009;60(11):2378–2381. doi: 10.1002/asi.21164.
- Larsen PO. The state of the art in publication counting. Scientometrics. 2008;77(2):235–251. doi: 10.1007/s11192-007-1991-6.
- Lindsey D. Production and citation measures in the sociology of science: The problem of multiple authorship. Social Studies of Science. 1980;10(2):145–162. doi: 10.1177/030631278001000202.
- Maciejovsky B, Budescu DV, Ariely D. The researcher as a consumer of scientific publications: How do name-ordering conventions affect inferences about contribution credits? Marketing Science. 2009;28(3):589–598. doi: 10.1287/mksc.1080.0406.
- Price DDS. Multiple authorship. Science. 1981;212(4498):986. doi: 10.1126/science.212.4498.986-a.
- Van Hooydonk G. Fractional counting of multiauthored publications: Consequences for the impact of authors. Journal of the American Society for Information Science and Technology. 1997;48(10):944–945. doi: 10.1002/(SICI)1097-4571(199710)48:10<944::AID-ASI8>3.0.CO;2-1.
- Vinkler P. Evaluation of the publication activity of research teams by means of scientometric indicators. Current Science. 2000;79(5):602–612.
- Wren JD, Kozak KZ, Johnson KR, Deakyne SJ, Schilling LM, Dellavalle RP. The write position—A survey of perceived contributions to papers based on byline position and number of authors. EMBO Reports. 2007;8(11):988–991. doi: 10.1038/sj.embor.7401095.
- Zuckerman HA. Patterns of name ordering among authors of scientific papers: A study of social symbolism and its ambiguity. American Journal of Sociology. 1968;74(3):276–291. doi: 10.1086/224641.