Letter
Entropy. 2020 Aug 24;22(9):927. doi: 10.3390/e22090927

Inferring Authors’ Relative Contributions to Publications from the Order of Their Names When Default Order Is Alphabetical

Yigal Gerchak
PMCID: PMC7597182  PMID: 33286696

Abstract

In attributing individual credit for co-authored academic publications, one issue is how to apportion (unequal) credit, based on the order of authorship. Apportioning credit for completed joint undertakings has always been a challenge. Academic promotion committees are faced with such tasks regularly, when trying to infer a candidate’s contribution to an article they coauthored with others. We propose a method for achieving this goal in disciplines (such as the author’s) where the default order is alphabetical. The credits are those maximizing Shannon entropy subject to order constraints.

Keywords: OR in Scientometrics, joint authorship, apportioning credit, maximum entropy

1. Introduction

More and more published research is a collaboration of several researchers (Shapiro et al., 1994 [1]). Because promotion committees need to estimate the contribution and quality of an individual researcher, this raises the bibliometric issue of apportioning individual credit by the “fractional counting” of joint publications (e.g., references [2,3,4]). Abbas (2011) [2] proposes a set of indices to evaluate the quality of research produced by an author, while Egghe (2008) [4] focuses on a mathematical theory of the h- and g-index in the case of the fractional counting of authorship.

In this paper, we focus on disciplines where the default order of authors is alphabetical (e.g., Social Science and Mathematics; Liu and Fang 2014 [5]). Thus, for example, an order such as (B, A, C) indicates that B’s contribution was “significantly” larger than A’s, while C’s was not. We shall thus assume that each discipline has a “standard”, where if, for example, B’s contribution exceeds A’s by more than the standard, their order will be switched. Therefore, the default order (A, B, C) indicates that neither B’s nor C’s excess contribution exceeds the standard. Other than these inferences, which become constraints on the fractional contributions, we shall assume that the contributions are as uncertain as possible.

Suppose there is a disciplinary standard ε, 0<ε<1, such that the alphabetical order is not changed unless the difference in contributions, in favor of the alphabetically later author, is deemed to be larger than ε. Thus, if, for example, the order of three authors is (B, A, C) (meaning that the author who is alphabetically second is listed before the one who is alphabetically first), it reflects the fact that B’s contribution exceeds A’s by more than ε, while C’s contribution does not exceed A’s (and thus also not B’s) by more than ε. A large value of ε indicates strong adherence to an alphabetical order, while small values correspond to a high sensitivity to the actual relative contributions. The standard ε may or may not be known to those wishing to evaluate the contributions.

We shall make use of the constrained maximal (Shannon) entropy approach, reflecting the most diffused contribution distribution that satisfies the implications of the limited information given by the order. Constrained maximal entropy has been used, among other applications, in physics [6] and finance [7], where the constraints were the mean and/or variance of the distribution. Our constraints are simpler, so solving the problem is often rather trivial. First, we deal with estimating the mean contribution, and then, we propose an appropriate multivariate distribution.

2. Mean Contribution

Start with two authors (A and B), with the respective unknown expected contributions $p$ for A and $1-p$ for B, which we wish to infer. Note that although $p$ is a share and not a probability, we shall perform probability operations on it. The entropy function,

$$-p\log p-(1-p)\log(1-p),$$

is concave in $p$ with its maximum at $p=\frac{1}{2}$, regardless of the base of the logarithm (e.g., Cover and Thomas 2006 [8]). We shall assume that each author’s contribution is at least $\delta$ $\left(\delta<\frac{1}{4}\right)$. Now, the order (A, B) implies that $p>1-p-\varepsilon$, i.e., that $p>\frac{1-\varepsilon}{2}$. Since $\frac{1-\varepsilon}{2}<\frac{1}{2}$ for all $\varepsilon$, it follows that the constrained entropy is maximized at $p^*=\frac{1}{2}$. Thus, the authors are deemed to have contributed equally(!). The order (B, A) implies that $1-p>p+\varepsilon$, i.e., that

$$p<\frac{1-\varepsilon}{2}<\frac{1}{2},$$

so $p^*=\frac{1-\varepsilon}{2}$, $\varepsilon<1$. If $\frac{1}{2}-\frac{1}{2}\varepsilon<\delta$, then $p=\delta$. Thus, if, for example, $\varepsilon=\frac{1}{4}$, A’s mean contribution is estimated at $\frac{3}{8}$. If $\varepsilon=\frac{3}{4}$, A’s mean contribution is $\max\left(\frac{1}{8},\delta\right)$, where the low value reflects A’s demotion despite the high threshold.
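The two-author rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the strict inequality is treated as non-strict so that the boundary value is attained, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from math import log


def binary_entropy(p):
    """Entropy -p log p - (1-p) log(1-p) (natural log), for 0 < p < 1."""
    p = float(p)
    return -p * log(p) - (1 - p) * log(1 - p)


def mean_share_of_a(listed_order, eps, delta):
    """A's inferred mean share for two authors; B's share is one minus this.

    Hypothetical helper: the order constraint is treated as non-strict.
    """
    if listed_order == "AB":
        # Default order: the equal split is feasible, so it maximizes entropy.
        return Fraction(1, 2)
    if listed_order == "BA":
        # Reversed order: entropy increases in p on (0, 1/2), so the bound
        # p <= (1 - eps) / 2 is tight; delta acts as a floor.
        return max((1 - eps) / 2, delta)
    raise ValueError("listed_order must be 'AB' or 'BA'")


eps, delta = Fraction(1, 4), Fraction(1, 10)
share = mean_share_of_a("BA", eps, delta)
print(share, 1 - share)  # prints: 3/8 5/8
```

With ε = 3/4 and the same δ, the function returns 1/8, and a larger floor such as δ = 1/5 would take over instead.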

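For three authors A, B and C, and respective unknown mean shares $p$, $q$, $1-p-q$, the unconstrained entropy is jointly concave and maximized at $\left(\frac{1}{3},\frac{1}{3},\frac{1}{3}\right)$. Now, if the order is (A, B, C), it follows that $p>q-\varepsilon$ and $q>1-p-q-\varepsilon$. The feasible solution with the highest entropy is $\left(\frac{1}{3}-\frac{\varepsilon}{2},\ \frac{1}{3},\ \frac{1}{3}+\frac{\varepsilon}{2}\right)$. If the order is (B, A, C), the constraints are $q>p+\varepsilon$, $q>1-p-q-\varepsilon$ and $0\le p+q\le 1$. Among the feasible solutions, the one that maximizes the entropy is $\left(\frac{1}{3}-\frac{2}{3}\varepsilon,\ \frac{1}{3}+\frac{1}{3}\varepsilon,\ \frac{1}{3}+\frac{1}{3}\varepsilon\right)$, assuming $\varepsilon<\frac{1}{2}$ (note that our notation lists the relative contributions in the alphabetical order of the authors).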

Note that C’s contribution, $\frac{1}{3}+\frac{1}{3}\varepsilon$, is deemed to be larger than A’s, $\frac{1}{3}-\frac{2}{3}\varepsilon$, even though C appears later in the order (but not by more than $2\varepsilon$).

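For (B, C, A), the solution is $\left(\frac{1}{3}-\frac{2}{3}\varepsilon,\ \frac{1}{3}+\frac{1}{3}\varepsilon,\ \frac{1}{3}+\frac{1}{3}\varepsilon\right)$, $\varepsilon<\frac{1}{2}$. If $\frac{1}{3}-\frac{2}{3}\varepsilon<\delta$, then we have $(\delta,\ \delta+\varepsilon,\ \delta+\varepsilon)$. Note that if $\varepsilon$ is large, then, as A was nevertheless demoted to last, its contribution is deemed to be negligible. Another school of thought is that if all the authors were essential to the research, they should receive equal credit. If one adopts this philosophy, a possible approach would be to average the above order-dependent shares with equal shares. Thus, for example, the order (B, C, A) will result in

$$\frac{1}{2}\left[\left(\frac{1}{3}-\varepsilon,\ \frac{1}{3}+\frac{\varepsilon}{2},\ \frac{1}{3}+\frac{\varepsilon}{2}\right)+\left(\frac{1}{3},\ \frac{1}{3},\ \frac{1}{3}\right)\right]=\left(\frac{1}{3}-\frac{\varepsilon}{2},\ \frac{1}{3}+\frac{\varepsilon}{4},\ \frac{1}{3}+\frac{\varepsilon}{4}\right),\quad \varepsilon<\frac{2}{3}.$$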

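For (A, C, B), the solution is $\left(\frac{1}{3}-\frac{1}{3}\varepsilon,\ \frac{1}{3}-\frac{1}{3}\varepsilon,\ \frac{1}{3}+\frac{2}{3}\varepsilon\right)$, $\varepsilon<1$. If $\frac{1}{3}-\frac{1}{3}\varepsilon<\delta$, then the allocation is $(\delta,\ \delta,\ 1-2\delta)$. If $\varepsilon\ge 1$, then the allocation is also $(\delta,\ \delta,\ 1-2\delta)$, so A’s and B’s contributions are deemed negligible. The intuition here is that despite the high requirement for reversing an order, $\varepsilon>1$, C has overtaken B.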

For (C, B, A), the solution is $\left(\frac{1}{3}-\varepsilon,\ \frac{1}{3},\ \frac{1}{3}+\varepsilon\right)$, $\varepsilon+\delta<1$.
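As a sanity check, the three-author allocations stated for the orders (B, A, C), (B, C, A), (A, C, B) and (C, B, A) can be tested for feasibility. The sketch below assumes one particular reading of the order constraints, applied to every pair of authors: if the alphabetically earlier author is listed first, the later author's share exceeds theirs by at most ε; if the pair is reversed, by at least ε. Inequalities are treated as non-strict, the helper names are hypothetical, and the code checks feasibility only, not optimality.

```python
from fractions import Fraction
from itertools import combinations


def respects_order(listed, shares, eps):
    """Check that shares sum to one and satisfy the pairwise order rule.

    listed: authors in printed order, e.g. "BAC".
    shares: dict mapping author letter to share (exact Fractions).
    """
    if sum(shares.values()) != 1:
        return False
    position = {author: i for i, author in enumerate(listed)}
    for x, y in combinations(sorted(listed), 2):  # x precedes y alphabetically
        gap = shares[y] - shares[x]
        if position[x] < position[y]:
            if gap > eps:      # default order kept, so y may not lead by more than eps
                return False
        elif gap < eps:        # order reversed, so y must lead by at least eps
            return False
    return True


def stated_shares(listed, eps):
    """Allocations quoted in the text, in alphabetical order (A, B, C)."""
    third = Fraction(1, 3)
    table = {
        "BAC": (third - 2 * eps / 3, third + eps / 3, third + eps / 3),
        "BCA": (third - 2 * eps / 3, third + eps / 3, third + eps / 3),
        "ACB": (third - eps / 3, third - eps / 3, third + 2 * eps / 3),
        "CBA": (third - eps, third, third + eps),
    }
    return dict(zip("ABC", table[listed]))


eps = Fraction(3, 10)
for order in ("BAC", "BCA", "ACB", "CBA"):
    print(order, respects_order(order, stated_shares(order, eps), eps))
```

Under this reading, an equal split fails the check for any reversed order, since the overtaking author must lead by at least ε.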

Consider now the case of four authors, whose mean contributions $p$, $q$, $r$, $1-p-q-r$ we wish to find.

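For (A, B, C, D), the mean contributions need to satisfy

$$q-p<\varepsilon,\qquad r-q<\varepsilon,\qquad (1-p-q-r)-r<\varepsilon,\ \text{i.e.,}\ r>\frac{1-p-q-\varepsilon}{2}.$$

Max entropy is attained at

$$\left(\frac{1}{4}-\frac{3}{2}\varepsilon,\ \frac{1}{4}-\frac{1}{2}\varepsilon,\ \frac{1}{4}+\frac{1}{2}\varepsilon,\ \frac{1}{4}+\frac{3}{2}\varepsilon\right),\quad \varepsilon<\frac{1}{6}.$$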

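For (D, C, B, A), the mean contributions need to satisfy

$$1-p-q-r>r+\varepsilon,\ \text{i.e.,}\ r<\frac{1-p-q-\varepsilon}{2},\qquad r>q+\varepsilon,\qquad p<q-\varepsilon.$$

Max entropy is attained at

$$\left(\frac{1}{4}-\frac{3}{2}\varepsilon,\ \frac{1}{4}-\frac{1}{2}\varepsilon,\ \frac{1}{4}+\frac{1}{2}\varepsilon,\ \frac{1}{4}+\frac{3}{2}\varepsilon\right),\quad \varepsilon<\frac{1}{6}.$$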

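If $\varepsilon>\frac{1}{6}$, then the allocation is $\left(0,\ \frac{1}{6},\ \frac{1}{3},\ \frac{1}{2}\right)$. For (A, B, D, C), we obtain $\left(\frac{1}{4}-\varepsilon,\ \frac{1}{4},\ \frac{1}{4},\ \frac{1}{4}+\varepsilon\right)$, and so forth.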
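For a fully reversed order such as (D, C, B, A), the stated allocation is an evenly spaced "ladder" centered on the equal split, with adjacent shares exactly ε apart. A minimal sketch of that pattern for n authors follows. The function name is hypothetical, the floor δ is ignored, and capping the step at 2/(n(n−1)) for general n is an assumption that extends the four-author limiting allocation given in the text.

```python
from fractions import Fraction


def reversed_order_shares(n, eps):
    """Shares, in alphabetical order, when n authors are listed in fully
    reversed alphabetical order (illustrative sketch, delta ignored).

    Adjacent shares differ by eps; the step is capped so that the smallest
    share never drops below zero (an assumption for general n).
    """
    if n < 2:
        raise ValueError("need at least two authors")
    limit = Fraction(2, n * (n - 1))   # step at which the first share hits zero
    step = min(eps, limit)
    return [Fraction(1, n) + (i - Fraction(n - 1, 2)) * step for i in range(n)]


print(reversed_order_shares(4, Fraction(1, 10)))  # shares 1/10, 1/5, 3/10, 2/5
print(reversed_order_shares(4, Fraction(1, 2)))   # capped: 0, 1/6, 1/3, 1/2
```

For n = 2 and n = 3 the same function reproduces the (B, A) and (C, B, A) allocations quoted earlier.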

Note that in all cases, the expected contributions are either independent of $\varepsilon$ or dependent on it linearly. Thus, if $\varepsilon$ is a random variable (with some subjective distribution), the only change required is the substitution of $E[\varepsilon]$ for $\varepsilon$ wherever it appears.

3. Joint Distribution of Relative Contributions

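We shall now assume that the joint distribution of the relative contributions is believed to be Dirichlet (e.g., Kotz, Balakrishnan and Johnson 2000, 40.1 [9]). That is, the joint density of the relative contributions $p_1,\dots,p_m$ is

$$f_{p_1,\dots,p_m}(p_1,\dots,p_m)=\frac{\Gamma\left(\sum_{j=0}^{m}\theta_j\right)}{\prod_{j=0}^{m}\Gamma(\theta_j)}\left(1-\sum_{j=1}^{m}p_j\right)^{\theta_0-1}\prod_{j=1}^{m}p_j^{\theta_j-1},\quad p_j\ge 0,\ j=1,\dots,m,\ \sum_{j=1}^{m}p_j\le 1.$$

Note that the marginal density of $p_i$ is beta$\left(\theta_i,\ \sum_{j=0}^{m}\theta_j-\theta_i\right)$, so

$$E[p_i]=\frac{\theta_i}{\sum_{j=0}^{m}\theta_j},$$

and

$$\operatorname{Var}(p_i)=\frac{\theta_i\left(\sum_{j=0}^{m}\theta_j-\theta_i\right)}{\left(\sum_{j=0}^{m}\theta_j\right)^{2}\left(\sum_{j=0}^{m}\theta_j+1\right)}$$

and

$$\operatorname{corr}(p_i,p_j)=-\sqrt{\frac{\theta_i\theta_j}{\left(\sum_{k=0}^{m}\theta_k-\theta_i\right)\left(\sum_{k=0}^{m}\theta_k-\theta_j\right)}}.$$

As we wish to allocate the whole credit to the authors, we have $\theta_0=1$, so

$$E[p_i]=\frac{\theta_i}{1+\sum_{j=1}^{m}\theta_j},\quad i=1,\dots,m.$$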

Note that if we have already estimated (inferred) the mean relative contributions $p_1,\dots,p_m$, then, to maintain these ratios, we need to have $\theta_i=kp_i$, $i=1,\dots,m$, for some $k>0$. The choice of $k$ will determine the parameters of the (Dirichlet) distribution.
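To make the role of k concrete, the sketch below evaluates the standard Dirichlet mean, variance and correlation formulas for θ_i = k·p_i with θ_0 = 1. The function name and the sample inputs are illustrative assumptions, not part of the original method.

```python
from math import sqrt


def dirichlet_moments(mean_shares, k, theta0=1.0):
    """Means, variances and correlation matrix of a Dirichlet distribution
    with parameters theta_i = k * mean_shares[i] and the given theta0."""
    thetas = [k * s for s in mean_shares]
    total = theta0 + sum(thetas)          # sum of theta_j for j = 0..m
    means = [t / total for t in thetas]
    variances = [t * (total - t) / (total ** 2 * (total + 1)) for t in thetas]
    m = len(thetas)
    corr = [
        [
            1.0 if i == j else
            -sqrt(thetas[i] * thetas[j]
                  / ((total - thetas[i]) * (total - thetas[j])))
            for j in range(m)
        ]
        for i in range(m)
    ]
    return means, variances, corr


# Two authors with inferred mean shares 3/8 and 5/8, and an arbitrary k = 7.
means, variances, corr = dirichlet_moments([0.375, 0.625], k=7)
print(means)       # ratios match 3/8 : 5/8
print(variances)
print(corr[0][1])  # negative, as shares compete for a fixed total
```

Under these formulas the ratios of the means equal the ratios of the inferred shares for any k > 0, while the means themselves sum to k/(k+1) when the inferred shares sum to one, so a larger k brings them closer to the inferred shares and shrinks the variances.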

4. Effect of Number of Authors

How does the number of authors affect the relative contribution of one of them? Consider, for concreteness, A’s contribution in orders where he/she is last:

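2. (B, A): $\frac{1}{2}-\frac{\varepsilon}{2}$, $\varepsilon<1$.
3a. (C, B, A): $\delta$
3b. (B, C, A): $\frac{1}{3}-\varepsilon$, $\varepsilon<\frac{1}{3}$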

We see that, in the case of three authors, A’s relative contribution depends on the order of B and C; in 3a, C needs to be rewarded for “overtaking” B (as well as A), which reduces A’s contribution.

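4a. (D, C, B, A): $\frac{1}{4}-\frac{3}{2}\varepsilon$, $\varepsilon<\frac{1}{6}$
4b. (B, C, D, A): $\frac{1}{4}-3\varepsilon$, $\varepsilon<\frac{1}{12}$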

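Now, for 3b and 4a, the ratios of A’s relative contributions are:

$$\frac{(B,A)}{(B,C,A)}=\frac{\frac{1}{2}-\frac{\varepsilon}{2}}{\frac{1}{3}-\varepsilon}=\frac{3(1-\varepsilon)}{2(1-3\varepsilon)},\qquad \frac{(B,C,A)}{(D,C,B,A)}=\frac{\frac{1}{3}-\varepsilon}{\frac{1}{4}-\frac{3}{2}\varepsilon}=\frac{4(1-3\varepsilon)}{3(1-6\varepsilon)}.$$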

The former ratio is larger than the latter for, and only for, $\varepsilon<\frac{-15+\sqrt{297}}{36}\approx 0.06$. Thus, no general conclusion is possible.
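The comparison of the two ratios for cases 3b and 4a can be checked numerically. Cross-multiplying the two ratio expressions (valid for ε < 1/6, where both denominators are positive) reduces "former > latter" to 18ε² + 15ε − 1 < 0; this reduction and the function names below are part of the sketch, not of the original text.

```python
from math import sqrt


def former(eps):
    """Ratio of A's share under (B, A) to A's share under (B, C, A)."""
    return 3 * (1 - eps) / (2 * (1 - 3 * eps))


def latter(eps):
    """Ratio of A's share under (B, C, A) to A's share under (D, C, B, A)."""
    return 4 * (1 - 3 * eps) / (3 * (1 - 6 * eps))


# Positive root of 18*eps**2 + 15*eps - 1 = 0, where the two ratios coincide.
crossover = (-15 + sqrt(15 ** 2 + 4 * 18)) / 36

for eps in (0.05, crossover, 0.10):
    print(f"eps={eps:.4f} former={former(eps):.4f} latter={latter(eps):.4f}")
```

Below the crossover the two-to-three-author ratio is the larger one; above it (and still below 1/6) the three-to-four-author ratio takes over.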

For 3a and 4a, $\frac{(B,A)}{(B,C,A)}$ is larger if $\varepsilon>\frac{7+\sqrt{25+18\delta^{2}}}{12}$.

For 4b, the former ratio is larger if $1<\varepsilon<\frac{13}{12}$.

For $n$ authors in alphabetical internal order with A at the end, A’s contribution is $\frac{1}{n}-\frac{n-1}{2}\varepsilon$, $\varepsilon<\frac{2}{(n-1)n}$. If $\varepsilon>\frac{2}{(n-1)n}$, A’s contribution is $\delta$.

5. Short Discussion and Concluding Remarks

In this paper, we attempted to quantify the significance of deviations from some “natural” or “default” order of authors. We assumed that the unknown relative contributions maximize entropy subject to constraints reflecting default order reversal.

Future study could further demonstrate and validate the proposed method by empirical and data-driven methods. For example, academics’ rankings by the h-index (or some other measure) could be compared with their rankings by total fractional journal paper contributions. Another option is to apply this study to journals where the authors’ contributions are required and published, and to compare the stated contributions with the order of the authors’ names. A survey could also be conducted over some well-cited papers, asking the authors to ascribe their fractional contributions, and comparing these with the contributions determined by the proposed method.

We note that the proposed measure, if and when it is used for personal evaluation and promotion, should be combined with qualitative assessments, as is often done in tenure and promotion (T&P) committees. Otherwise, “automated” evaluation metrics alone could encourage game-playing among collaborators, which would be an undesirable outcome.

Finally, let us note that several related applications could use, with required modifications, the proposed method—for example, for assessing the contribution of programmers and AI agents in tasks distributed and published over the Internet. One scenario with some similarity to the considered problem is the following.

In sports, some media “power rank” (PR) teams during the season. The PR is usually consistent with the number of wins the teams have achieved, but not always (there might be other factors such as recent injuries, recent performance, etc.).

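Therefore, suppose that the numbers of wins of two teams A and B, $n$ and $m$, are such that $n=m+1$, while team B is power ranked before A. Thus, $1-p>p+\varepsilon\ \Rightarrow\ p^*=\frac{1}{2}-\frac{1}{2}\varepsilon$.

If $n=m+2$, $1-p>p+2\varepsilon\ \Rightarrow\ p^*=\frac{1}{2}-\varepsilon$.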

Funding

This research was partially funded by the Koret Foundation grant for “Smart Cities and Digital Living 2030”.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Shapiro D.W., Wenger N.S., Shapiro M.F. The Contributions of Authors to Multiauthored Biomedical Research Papers. J. Am. Med. Assoc. 1994;271:438–442. doi: 10.1001/jama.1994.03510300044036.
2. Abbas A. Weighted Indices for Evaluating the Quality of Research with Multiple Authorship. Scientometrics. 2011;88:107–131. doi: 10.1007/s11192-011-0389-7.
3. Tol S.J. Credit where Credit’s Due: Accounting for Co-authorship in Citation Counts. Scientometrics. 2011;89:291–299. doi: 10.1007/s11192-011-0451-5.
4. Egghe L. Mathematical Theory of the h- and g-index in Case of Fractional Counting of Authorship. J. Am. Soc. Inf. Sci. Technol. 2008;59:1608–1616. doi: 10.1002/asi.20845.
5. Liu X., Fang H. The Impact of Publications from Mainland China on Trends in Alphabetical Authorship. Scientometrics. 2014;99:865–879. doi: 10.1007/s11192-013-1219-x.
6. Jaynes E.T. Probability Theory in Science and Engineering. Colloquium Lectures in Pure and Applied Science. Field Research Laboratory, Socony Mobil Oil Company; Dallas, TX, USA: 1958.
7. Cozzolino J.M., Zahner M.J. The Maximum-Entropy Distribution of the Future Market Price of a Stock. Oper. Res. 1973;21:1200–1211. doi: 10.1287/opre.21.6.1200.
8. Cover T.M., Thomas J.A. Elements of Information Theory. 2nd ed. John Wiley & Sons; Hoboken, NJ, USA: 2006.
9. Kotz S., Johnson N.L., Balakrishnan N. Continuous Multivariate Distributions: Models and Applications. 2nd ed. John Wiley & Sons; Hoboken, NJ, USA: 2000.

Articles from Entropy are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
