PLOS One. 2025 Aug 6;20(8):e0328531. doi: 10.1371/journal.pone.0328531

Features and signals in precocious citation impact: A meta-research study

John P A Ioannidis 1,*
Editor: Robin Haunschild2
PMCID: PMC12327621  PMID: 40768479

Abstract

The current analysis aimed to evaluate the profiles of scientists who reach top citation impact in a very short time once they start publishing. Precocious citation impact was defined as rising to become a top-cited scientist within t ≤ 8 years after the first publication year. Ultra-precocious citation impact was defined similarly for t ≤ 5 years. Top-cited authors included those in the top-2% of a previously validated composite citation indicator across 174 subfields of science or in the top-100,000 authors of that composite citation indicator across all science based on Scopus. Annual data between 2017 and 2023 show a strong increase over time, with 469 precocious and 66 ultra-precocious citation impact author profiles in 2023. In-depth assessment of validated ultra-precocious scientists in 2023 showed a significantly higher frequency of less developed country affiliation; clustering in 4 high-risk subfields; high self-citations for their field; being top-cited only when self-citations were included; high citations-to-citing-papers ratio for their field; extreme publishing behavior; extreme citation orchestration metric c/h2; and high percentage of citations given to first-authored papers compared with all top-cited authors (p < 0.005 for all signals). The 17 ultra-precocious citation impact authors in the 2017–2020 top-cited lists who had retractions showed on average 4.1 of these 8 signal indicators at the time they entered the top-cited list. In conclusion, while some authors with precocious citation impact may be stellar scientists, others probably herald massive manipulative or fraudulent behaviors infiltrating the scientific literature.

Introduction

Some authors reach extremely high citation impact to their work very rapidly after they start publishing. These authors may include some of the very best scientists whose influential work quickly attracts major attention. Alternatively, implausible rapidly rising early citation impact at the beginning of one’s career may reflect fraud or inappropriate manipulation and gaming of publications and citations.

Different types of fraud and manipulations may result in precocious starts [1]. First, paper mills produce fake papers and sell authorship slots to paying authors [2–4]. Artificial intelligence may have already accelerated the production of such fake papers. Second, citation cartels may be formed among scientists who promote their work by citing each other’s papers without justified referencing [5]. Third, rogue editors and coordinated fake peer review may allow some authors to publish massively, bypassing any gatekeeping [6]. Fourth, plagiarism and duplication may allow publication proliferation. Fifth, scientists may engage in extreme self-citation and may even publish rubbish documents that simply boost citations [7,8]. Given that the Hirsch h-index in particular has acquired inappropriately large influence, citations may be placed strategically so as to maximize the h-index [9–11]. The ways that productivity and impact metrics can be gamed are almost endless in ingenuity. Occasionally several fraudulent and manipulative mechanisms may co-exist. Copious fake or meaningless publications may ensue. Most likely, only a minority of them get discovered and retracted, with substantial delays [12,13]. More systematic cleaning of the literature from fraudulent and meaningless papers is desirable.

Problematic behaviors may affect scientists at any career stage. However, when they occur in very early stages, the resulting pattern may be more readily recognizable. Inflation of metrics may pass unnoticed in late career, when one’s work has gained momentum and is already substantially cited. Detection, conversely, may be easier in beginning careers, where it is difficult to explain how beginners rapidly rise to the extreme top unless they are rare prodigies producing extremely major contributions. Citations take time to accrue. Overtaking more senior scientists in cumulative impact may take decades of strong publication presence. It is important to develop quantitative and qualitative processes that differentiate true excellence from fraud and inappropriate practices [1,14].

The current meta-research study uses data from comprehensive science-wide citation databases which get updated annually [15]. The aim is to evaluate precocious citation impact patterns, find whether they become more frequent over time, explore their distribution across countries and scientific subfields and probe whether they are associated with any other extreme metrics that may serve as signals of potential problems in these impressive CVs.

Methods

Definitions

Precocious citation impact was defined here for operational purposes as rising to become a top-cited scientist within 8 (or fewer) years after the calendar year of the first publication; and ultra-precocious citation impact was defined as rising to become a top-cited scientist within 5 (or fewer) years after the calendar year of the first publication. The use of 8- and 5-year cut-offs is arbitrary. These thresholds were pre-set in the analysis so as to capture individuals who achieve extreme acceleration of their citation impact in the very early stages of their career, placing them at the far end of the distribution of all scientists in this aspect. If one assumes that extremes are populated by a mixture of individuals with truly exceptional ability and others with manipulative (or even fraudulent) behavior, it is possible that the relative share of the latter group may be enriched when an ultra-extreme threshold is used. Employing two thresholds also allowed a comparative examination of the two resulting cohorts of authors.

Top-cited was defined by membership in the list of highly-cited scientists based on career-long impact that places the author in the top-2% of a previously validated composite citation indicator [15,16] in one of the 174 subfields of science according to the Science Metrix classification [17] or in the top-100,000 authors of that composite citation indicator across all scientific subfields according to Scopus data [18]. The composite indicator is calculated from 6 metrics that consider raw citations and the h-index, as well as co-authorship and authorship placement (single, first, last). Scientists with ≥5 full papers are considered for ranking. Ranking is performed with separate calculations that include or exclude self-citations; a scientist qualifies if they pass the top-cited thresholds with either approach.

The composite indicator has been widely used in previous work [15,16] and the respective datasets have been accessed over 4 million times to date. In principle, it tries to amalgamate information on citations, co-authorship patterns, and authorship placement in positions that usually suggest greater contribution. An author is included if they manage to be in the top-2% of the composite indicator among all authors who have the same primary scientific subfield and have published at least 5 full papers. In addition, some authors are included because they are among the very top in the composite indicator across all authors in all science, even though they may not have made it explicitly to the top-2% among those who share the same primary scientific subfield. The cut-off of at least 5 full papers has been used since the original conception of the composite indicator. It aims to exclude authors with very limited presence in the literature. Articles, reviews, and conference papers are the only types of publications that count as full papers. Editorials, letters, and commentaries are not included.

Time trends

Ranking lists of top-cited scientists have been previously published based on these thresholds annually, including citations to the end of 2023, 2022, 2021, 2020, and 2019. Analyses including citations to the end of 2018 and 2017 are also available, but they include only authors who are among the top-100,000 across all science for the composite citation indicator. All annual updated databases are available at https://elsevier.digitalcommonsdata.com/datasets/btchxktzyw/7 (versions 7, 6, 5, 3, 2, 1, 1 for the 7 annual databases, respectively). The number of top-cited authors qualifying with first publication occurring within 1, 2, 3, 4, 5, 6, 7, or 8 past calendar years was recorded for each annually released database. The total number of top-cited authors in each annually released database allowed adjustment for the increasing number of authors over time.
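The adjustment mentioned above amounts to a simple per-thousand rate; a minimal sketch (the function name is illustrative):

```python
def per_thousand(n_precocious: int, n_top_cited: int) -> float:
    """Precocious (or ultra-precocious) author profiles per thousand
    top-cited authors in one annually released database."""
    return 1000 * n_precocious / n_top_cited

# E.g., 66 ultra-precocious profiles among 217,097 top-cited authors in the
# 2023 list correspond to roughly 0.30 per thousand.
```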

In-depth evaluation of recent cohort of authors with ultra-precocious citation impact

For authors apparently meeting criteria for ultra-precocious citation impact in the most recently released database (including citation data until the end of 2023, https://elsevier.digitalcommonsdata.com/datasets/btchxktzyw/7), a more in-depth evaluation was performed. First, it was assessed whether they may have additional Scopus author ID records where early publications before 2018 (i.e., outside the 5-year frame) were included. If so, these authors were excluded, since ultra-precocious citation impact is then an artefact of Scopus inaccuracies [18]. Publication profiles of authors with verified ultra-precocious citation impact were also examined to find if any were journalists/columnists. It is well described that some journalists/columnists may publish very large numbers of items, especially in high-impact journals where even news stories and journalistic columns may rapidly attract many citations [19].

The eligible ultra-precocious scientists were then contrasted against other precocious authors (those starting to publish in 2015–2017) and the group of all top-cited authors (regardless of year of starting to publish) using the database that includes citation data to end of 2023. The following descriptive characteristics were obtained and summarized as counts for discrete variables and median for continuous ones: first publication year, country, number of papers, total citations, Hirsch h-index and co-authorship-adjusting Schreiber hm-index (with and without self-citations), proportion of self-citations, ratio of citations to citing papers (including self-citations), ratio of citations to the square of h-index (including self-citations), percentage of citations given to first-authored papers (including single-authored ones and including self-citations), percentage of citations given to first- (including single-) or last-authored papers (including self-citations), main scientific subfield per Science Metrix, and any retractions (excluding retractions without any author error or responsibility or retractions of preprints) based on linking of the Retraction Watch Database [20] to Scopus [21]. Recorded retractions may reflect either honest error or misconduct, but misconduct seems responsible for most of them [22,23].

Signal indicators for scientists with ultra-precocious citation impact

For each eligible scientist with ultra-precocious citation impact in the latest updated database (those starting to publish in 2018 or later), the following signal indicators were examined; each may increase the possibility that the citation impact does not reflect extreme excellence alone:

  1. less developed country (since it is less plausible that with fewer means a scientist may manage to achieve such extreme impact so fast); countries not classified as developed economies by the United Nations Department of Economic and Social Affairs’ World Economic Situation and Prospects 2024 report were considered “less developed” (https://www.un.org/development/desa/dpad/wp-content/uploads/sites/45/WESP_2024_Web.pdf).

  2. main Science Metrix subfield with significantly higher (p < 0.005) representation among ultra-precocious authors than among all top-cited authors (since the concentration of ultra-precocious authors in a subfield may suggest a discipline-wide problem from journals with spurious publication and citation practices).

  3. proportion of self-citations exceeding the 95th percentile among all top-cited authors in the same scientific field (since this may demonstrate over-emphasis on self-promotion); the broad Science Metrix field was used for percentile calibration instead of the subfield, because if ultra-precocious citation impact in a subfield has a strong association with extreme self-citations, this may affect the observed 95th percentile in the subfield, but any bias would be diluted at the field level.

  4. inclusion among the top-cited list of scientists only when self-citations are included in calculations (for the same reason as above);

  5. ratio of citations over number of citing papers exceeding the 95th percentile among all top-cited authors in the same scientific field (since it may suggest that too many citing papers include very large numbers of citations to that author); field-level instead of subfield-level percentile calibration was used (same reason as for the self-citations).

  6. extreme publishing behavior (defined as >60 full papers [articles, reviews, and conference papers] indexed in Scopus within a single calendar year [24], which may suggest implausible peaks of productivity); the >60 cut-off has been used as a threshold of extreme publishing behavior in previous work [24]; and

  7. citations/h2 < 2.45, which has been previously proposed for detection of citation orchestration [25,26]; i.e., authors with very low values of this metric have a disproportionately high h-index when contrasted with their total citations, and this may mean that at least some of them have self-cited or had other scientists or even fake papers cite their work in a way that maximizes the h-index, an index that is widely used (and misused) in research assessments [9–11]; in previous work [25], the proposed <2.45 value corresponds to the 1st percentile among all authors with 5 or more papers.

  8. percentage of citations given to first-authored (including single-authored) papers exceeding the 95th percentile among all top-cited authors (the cut-off of >80.8% is based on the most recent iteration of the top-cited authors’ database); this indicator was added after a comment by a peer reviewer and signals that an author has relatively little influential work where they are not the first or single author.
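Three of the eight indicators reduce to simple numerical checks; the sketch below illustrates them under the thresholds stated above (the data structures and function names are assumptions, not taken from the article):

```python
# Illustrative checks for signal indicators 3, 6, and 7, using the cut-offs
# stated in the text (field 95th percentile, >60 full papers/year, c/h^2 < 2.45).

def self_citation_flag(self_cites: int, total_cites: int, field_p95: float) -> bool:
    """Indicator 3: self-citation proportion above the field's 95th percentile."""
    return total_cites > 0 and self_cites / total_cites > field_p95

def extreme_publishing_flag(papers_per_year: dict[int, int]) -> bool:
    """Indicator 6: more than 60 full papers indexed in a single calendar year."""
    return any(n > 60 for n in papers_per_year.values())

def orchestration_flag(citations: int, h_index: int) -> bool:
    """Indicator 7: citations/h^2 below 2.45, i.e., an h-index that is
    disproportionately high relative to total citations."""
    return h_index > 0 and citations / h_index ** 2 < 2.45
```

For instance, an author with 1,000 citations and an h-index of 25 has citations/h² = 1.6 and would be flagged by the orchestration check.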

Raw data for all signal indicators were obtained from publicly available Scopus-based databases (https://elsevier.digitalcommonsdata.com/datasets/btchxktzyw/7 with data freeze of August 1, 2024 for country, subfield, and all citation metrics, and https://elsevier.digitalcommonsdata.com/datasets/kmyvjk3xmd/2 for extreme publishing behavior). For each of these signal indicators, the proportion of authors with the indicator was compared between ultra-precocious citation impact authors and all top-cited scientists. Given the exploratory nature of analyses, a conservative p-value < 0.005 was considered statistically significant. The threshold of <0.005 has been previously recommended as more appropriate for claiming statistical significance [27].
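The article does not state which statistical test was used for these comparisons; as one simple self-contained option, a pooled two-proportion z-test against the p < 0.005 threshold could be sketched as follows (the counts in the test example below are illustrative, not the article’s data):

```python
import math

def two_proportion_p(k1: int, n1: int, k2: int, n2: int) -> float:
    """One-sided p-value that group 1's proportion exceeds group 2's,
    via a pooled two-proportion z-test (normal approximation)."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability

def signal_enriched(k_ultra: int, n_ultra: int,
                    k_all: int, n_all: int, alpha: float = 0.005) -> bool:
    """True if a signal is significantly more frequent among ultra-precocious
    authors than among all top-cited authors at the chosen alpha."""
    return two_proportion_p(k_ultra, n_ultra, k_all, n_all) < alpha
```

With a cohort as small as n = 59, an exact test (e.g., Fisher’s) would be a more conservative choice; the z-test is shown only because it needs no external dependencies.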

Then, for each author with ultra-precocious citation impact whose single Scopus-selected affiliation (August 1, 2024 data freeze) was from a developed country, the full publication record was assessed to identify whether the author also had any affiliation(s) from less developed countries.

For each author with ultra-precocious citation impact, the total number of signal indicators was counted. Authors with >4 signals and those with <2 signals were also inspected in more depth for their work through online searches.

Retractions among scientists with ultra-precocious citation impact in early top-cited authors’ lists

Finally, for all authors recorded with ultra-precocious citation impact in any annual updated database of citation indicators over the 4 earlier available annual iterations (2017, 2018, 2019, and 2020), Retraction Watch Database was searched for any retractions (excluding retractions with clearly no error of the authors and preprint retractions), and, if so, the year of the earliest retraction, whether this was before or after their entry among the top-cited authors’ list, and whether the reasons listed for the retraction included mentions of paper mills, rogue editors, fake peer review or concerns with peer review.

Exploratory analysis statement

The analyses presented here used established, previously standardized databases, but were exploratory without pre-specified protocol. Exploration of precocious citation impact had to be iterative in probing this newly discovered, still poorly understood phenomenon.

Results

Overall trends over time

The number of authors with ultra-precocious or precocious citation impact has increased every year, except for an anomaly in the 2020 data (Table 1). Between 2019 and 2023, the number of Scopus author profiles with ultra-precocious citation impact increased in the annual databases from 28 to 66 (0.18 per thousand rising to 0.30 per thousand top-cited authors). The number of authors with precocious citation impact increased in the annual databases from 213 to 469 (1.3 per thousand rising to 2.2 per thousand top-cited authors). Data for 2018 and 2017 showed a very small number of authors with ultra-precocious or other precocious citation impact, but they are not directly comparable with the other years, because only scientists reaching the top-100,000 ranks across all science were included in these two years’ annual lists.

Table 1. Number of authors with precocious and ultra-precocious citation impact.

Number of authors when citations for ranking are counted until the end of the calendar year

Year of first publication                      2023    2022    2021    2020    2019   2018*   2017*
2022                                              0       –       –       –       –       –       –
2021                                              0       0       –       –       –       –       –
2020                                             10       1       0       –       –       –       –
2019                                             19       9       1       1       –       –       –
2018                                             37      12       4       1       0       –       –
2017                                             61      28      13      10       1       0       –
2016                                            119      61      24      22       5       0       0
2015                                            223     122      57      33       5       0       0
2014                                              –     183      93      62      17       1       0
2013                                              –       –     115      90      28       4       0
2012                                              –       –       –     158      45       7       3
2011                                              –       –       –       –     112      13       7
2010                                              –       –       –       –       –      23       9
2009                                              –       –       –       –       –       –      13
Total with ultra-precocious citation impact      66      50      42      67      28       5       3
Total with precocious citation impact           469     416     307     377     213      48      32
Total top-cited authors                     217,097 204,643 194,983 186,177 159,683 105,000 105,026
*

Data for 2018 and 2017 are not directly comparable with the other years, because only scientists reaching the top-100,000 ranks across all science were included in these two years’ lists.

In-depth assessment of authors with apparent ultra-precocious citation impact

Of the 66 listed authors with ultra-precocious citation impact in the latest iteration of the top-cited authors’ database, 2 were artefacts (they had published earlier papers erroneously placed in a different Scopus author ID file). Another 5 were journalists/columnists (2 in the BMJ, 1 in Nature and 2 in both BMJ and Nature). The remaining 59 were validated to be scientists with ultra-precocious citation impact.

Contrast of ultra-precocious, other precocious, and all top-cited authors

Descriptive characteristics for the 59 eligible scientists with ultra-precocious citation impact (starting publishing in 2018 or later) appear in Table 2, also contrasted against other authors with precocious citation impact (those starting publishing in 2015–2017) and all top-cited authors. As shown, authors with ultra-precocious citation impact were heavily enriched in affiliations from less developed countries, as compared with the list of all top-cited authors where highly developed countries, led prominently by the United States, had the lion’s share.

Table 2. Characteristics of ultra-precocious, other precocious, and all top-cited authors (based on top-cited authors’ list from citation data until end of 2023).

Ultra-precocious Other precocious All top-cited
N = 59 N = 403 N = 217,097
Most frequent countries*
 China 8 (14%) 66 (16%) 10,687 (4.9%)
 Turkey 7 (12%) 16 (4.0%) 1,172 (0.5%)
 USA 5 (8.5%) 53 (13%) 84,202 (39%)
 India 4 (6.8%) 37 (9.2%) 2,939 (1.4%)
 Iraq 3 (5.1%) 3 (0.7%) 48 (0.0%)
 Iran 2 (3.4%) 17 (4.2%) 1,020 (0.5%)
 Ethiopia 2 (3.4%) 0 (0.0%) 24 (0.0%)
 Russia 2 (3.4%) 5 (1.2%) 980 (0.5%)
 Vietnam 2 (3.4%) 5 (1.2%) 74 (0.0%)
 Nigeria 2 (3.4%) 6 (1.5%) 117 (0.1%)
 Pakistan 2 (3.4%) 7 (1.7%) 223 (0.1%)
 Canada 2 (3.4%) 5 (1.2%) 9,265 (4.3%)
 Australia 2 (3.4%) 12 (3.0%) 7,448 (3.4%)
 Egypt 1 (1.7%) 10 (2.5%) 495 (0.2%)
 Malaysia 1 (1.7%) 6 (1.5%) 368 (0.2%)
 Great Britain 0 (0%) 22 (5.5%) 19,638 (9.1%)
 Saudi Arabia 0 (0%) 16 (4.0%) 675 (0.3%)
 Italy 0 (0%) 15 (3.7%) 6,271 (2.9%)
 Poland 0 (0%) 12 (3.0%) 1,244 (0.6%)
 Hong Kong 0 (0%) 8 (2.0%) 1,273 (0.6%)
 Bangladesh 0 (0%) 5 (1.2%) 68 (0.0%)
Number of papers, median 77 95 162
Total citations, median 2,591 3,671 6,533
 Excluding self-citations 1,956 2,850 5,637
h-index, median 27 31 40
 Excluding self-citations 22 26 37
hm-index, median 12.6 13.9 18.5
 Excluding self-citations 11.3 11.9 17.0
% self-citations, median 19.6 18.3 11.7
Cprat**, median 1.59 1.52 1.33
Citations/(h-index)2, median 3.23 3.33 3.91
% citations to first-authored papers, median 45 46 24
% citations to first/last-authored papers, median 65 65 59
Frequent main subfields ***
Environmental Sciences 17 (29%) 29 (7.2%) 2,399 (1.1%)
Energy 10 (17%) 53 (13%) 6,619 (3.1%)
Artificial Intelligence & IP 6 (10%) 66 (16%) 8,479 (3.9%)
Mechanical Eng & Transp 5 (8.5%) 29 (7.2%) 3,064 (1.4%)
Gen & Int Medicine 4 (6.8%) 24 (6.0%) 6,889 (3.2%)
Materials 1 (1.7%) 22 (5.5%) 6,307 (2.9%)
At least one retracted paper 6 (10%) 42 (10%) 7,083 (3.3%)

IP: Image Processing, Eng: Engineering. Transp: Transports

*

countries are shown if they have at least 2 authors with ultra-precocious citation impact or at least 5 other authors with precocious citation impact; country affiliation is the one selected in Scopus as of the August 1, 2024 data freeze

**

ratio of citations over the number of citing papers (self-citations are included)

***

subfields are shown if they have at least 4 authors with ultra-precocious citation impact or at least 20 other authors with precocious citation impact.

42 of the 59 authors with ultra-precocious citation impact had affiliations from less developed countries (single affiliation chosen by Scopus in the August 1, 2024 data freeze). Affiliations in China and India were heavily enriched among authors with ultra-precocious citation impact as compared with the full list of top-cited authors. Turkey affiliations were enriched 22-fold, and several countries with only 0–0.5% representation among all top-cited scientists had 2–3 authors in the ultra-precocious cohort, i.e., 3.4–5.1% (Iraq, Iran, Ethiopia, Russia, Vietnam, Nigeria, Pakistan). Among the other precocious citation impact authors (those who started publishing in 2015–2017), there was still large enrichment in affiliations from less developed countries. In some countries (China, India, Iran, Saudi Arabia, Egypt, Italy, Poland, Hong Kong, Bangladesh) the proportion of their representation was even higher in the other precocious cohort than in the ultra-precocious cohort.

As shown also in Table 2, the number of papers and overall citation metrics tended to be lower in the ultra-precocious cohort than in the other precocious cohort, and these in turn were lower than in the full list of top-cited authors. However, the differences would be nullified and even reversed if adjusted for the number of years publishing. Self-citations and citations per citing paper were much higher in the ultra-precocious cohort than in the full list of top-cited authors, with the other precocious authors in the middle, if anything closer to the ultra-precocious cohort. The ratio of citations to the square of the h-index was much lower in the ultra-precocious cohort and in the other precocious authors than in all top-cited authors. Finally, ultra-precocious authors and other precocious authors had a larger percentage of their citations received by papers where they were first (or single) authors. The difference was less prominent when both first- (including single-) and last-authored papers were considered (Table 2).

Scientists with ultra-precocious citation impact heavily clustered in 4 scientific subfields: Environmental Sciences, Energy, Artificial Intelligence & Image Processing, and Mechanical Engineering & Transports (p < 0.005 for each of them, when their proportion in the ultra-precocious cohort was compared with the proportion among all top-cited scientists). Cumulatively, these 4 subfields accounted for 38/59 (64%) of the scientists with ultra-precocious citation impact versus only 9.5% among the full list of all top-cited scientists. These four subfields were also heavily enriched among the other precocious authors (187/403, 46%). When all authors with precocious citation impact were considered, six additional subfields (Building & Construction, Food Science, General & Internal Medicine, Materials, Nanomedicine, and Networking & Telecommunications) were also significantly more common (p < 0.005) than among all top-cited scientists.

10% (6/59) of ultra-precocious citation impact scientists and 10% (42/403) of other precocious authors already had at least one retracted paper, a proportion 3-times higher than among all top-cited authors.

Signal indicators

As shown in Fig 1, of the 59 scientists with ultra-precocious citation impact, 42 (71%) had their single Scopus-selected affiliation in a less developed country (as of the August 1, 2024 data freeze), 38 (64%) had a primary subfield that was enriched in ultra-precocious scientists, 18 (31%) had self-citations exceeding the 95th percentile for their field, 12 (20%) were top-cited only when self-citations were included, 9 (15%) had a citations-to-citing-papers ratio exceeding the 95th percentile for their field, 4 (7%) had evidence of extreme publishing behavior, 9 (15%) had a c/h2 orchestration value <2.45, and 10 (17%) had >80.8% of their citations received by first-authored (including single-authored) papers, for a total of 142 signals across the 59 scientists (mean 2.4, median 2). All 8 signal indicators were significantly enriched (p < 0.005) in the cohort of authors with ultra-precocious citation impact versus all top-cited authors (Fig 2), with the highest relative enrichment (30-fold) occurring for the c/h2 orchestration index.

Fig 1. Signal indicators in 59 authors with ultra-precocious citation impact among top-cited scientists based on citation counts to end of 2023.

Fig 1

For definitions of the 8 indicators, see Methods. Data are from Scopus and raw data can be found at https://elsevier.digitalcommonsdata.com/datasets/btchxktzyw/7 for country, subfield, and all citation metrics, and https://elsevier.digitalcommonsdata.com/datasets/kmyvjk3xmd/2 for extreme publishing behavior. Compared with the 66 authors listed in https://elsevier.digitalcommonsdata.com/datasets/btchxktzyw/7, the figure does not include 5 journalists/columnists and 2 authors from Italy and Singapore who do not really have ultra-precocious citation impact but represent data artefacts, because Scopus split their earlier publications into a separate author file. Listing of the 59 authors here is alphabetical with one line per author. For the less developed country column, a gray color means that in the Scopus data freeze of August 1, 2024, the single author affiliation selected by Scopus for that author was from a developed country, but inspection of the full publication record of that author showed also affiliation(s) of that author with institutions in less developed countries.

Fig 2. Proportion of each of the 8 signal indicators in 59 authors with ultra-precocious citation impact versus in all top-cited authors.

Fig 2

Data are from Scopus and raw data can be found in https://elsevier.digitalcommonsdata.com/datasets/btchxktzyw/7 for country, subfield, and all citation metrics, and https://elsevier.digitalcommonsdata.com/datasets/kmyvjk3xmd/2 for extreme publishing behavior. Same selection applies as in Fig 1 for the 59 authors.

Of the 17 authors with ultra-precocious citation impact whose single Scopus-selected affiliation was from a developed country, perusal of the full publication record showed that 7 in fact had additional affiliation(s) from less developed countries (Iran; Turkey/Jordan; Lebanon/China; Turkey/Lebanon/Russia; India/Iran; Iran; and Hong Kong).

Four scientists had more than 4 signals, 22 had 3–4 signals, 17 had 2 signals, 14 had 1 signal, and only 2 had 0 signals. The 4 scientists with more than 4 signals had affiliations in Syria, Iran, Iraq, and Malaysia; all 4 were working in high-risk subfields, 3 had high self-citation rates, and 3 had an extreme orchestration index. Among the 2 authors with 0 signals, one peculiar case was an author who is apparently a hospitalist. He has first-authored two narrative reviews on the epidemiology of colorectal cancer and of gastric cancer in Przeglad Gastroenterologiczny that have received 1564 and 1078 citations, respectively, within 4 years, while no other paper in the long history of that journal has accumulated more than 84 citations; he has also authored another 7 highly-cited reviews on the epidemiology of other specific cancer sites in World Journal of Oncology, Wspolczesna Onkologia, Medical Sciences, and Clinical and Experimental Hepatology that are also the most-cited, second-, or third-most-cited papers ever in these journals.

Retractions among ultra-precocious authors in early top-cited lists

The top-cited scientists’ lists of 2017, 2018, 2019, and 2020 included a total of 103 entries of ultra-precocious authors, of which 4 were artefacts and 3 were journalists, while 10 authors appeared in two different annual lists. Of the 86 different eligible authors with ultra-precocious citation impact, 17 (20%) had at least one retraction in the Retraction Watch Database as of October 8, 2024: 3/3 (100%) in the 2017 list, 3/4 (75%) in the 2018 list, 3/27 (11%) in the 2019 list, and 10/62 (16%) in the 2020 list. For 14/17 authors with retractions, the earliest retraction occurred in a year after they had reached a top-cited list; in one other case, the retraction was in 2017 and the earliest available top-cited list is from 2017.

Of the 17 scientists with ultra-precocious citation impact (Fig 3), at the time of their first appearance in a top-cited authors’ list 13 had their affiliation in a less developed country, 7 had a primary subfield enriched in ultra-precocious citation impact scientists, 12 had self-citations exceeding the 95th percentile for their field, 9 were top-cited only when self-citations were included, 15 had a citations-to-citing-papers ratio exceeding the 95th percentile for their field, 3 had evidence of extreme publishing behavior, 6 had an orchestration c/h2 value <2.45, and 4 had >80.8% of their citations received by first-authored (including single-authored) papers, for a total of 69 signals across the 17 scientists (mean = 4.1, median = 4). Of the 4 authors whose single Scopus-selected affiliation was in a developed country, 3 had additional affiliations in less developed countries (Iran; China; India).

Fig 3. Signal indicators in 17 authors with ultra-precocious citation impact among top-cited scientists based on citation counts to end of 2017, 2018, 2019, or 2020 who had retractions as of October 8, 2024 in the Retraction Watch Database (see Methods for eligible retractions).

Fig 3

For definitions of the 8 indicators, see Methods. Values are as of the end of the year for which they first appeared in a top-cited list. Data are from Scopus and raw data can be found in https://elsevier.digitalcommonsdata.com/datasets/btchxktzyw for country, subfield, and all citation metrics, https://elsevier.digitalcommonsdata.com/datasets/kmyvjk3xmd/2 for extreme publishing behavior. Listing of the 17 authors here is alphabetical with one line per author. For the less developed country column, a gray color means that in the Scopus data freeze of August 1, 2024, the single author affiliation selected by Scopus for that author was from a developed country, but inspection of the full publication record of that author showed also affiliation(s) of that author with institutions in less developed countries.

Sixteen of the 17 scientists had 2 or more signals; the single exception had 0 signals (or 1 when all country affiliations were retrieved). Interestingly, that scientist currently has numerous retractions and is widely recognized to have organized a huge number of special issues across multiple journals that massively cross-cited his work and the work of other authors, thus evading detection by the self-citation indices; he also used a USA (UC Davis) affiliation, but the respective lab at that institution denies he was ever a member (https://undark.org/2023/06/21/in-a-tipsters-note-a-view-of-science-publishings-achilles-heel/).
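The per-author tallying of the 8 binary signal indicators can be sketched as follows. This is a minimal illustration, not the actual analysis code: the dictionary field names and the example record are hypothetical, while the thresholds (95th field percentiles, c/h2 < 2.45, >80.8% first-authored citations) follow those stated in the text.

```python
# Hedged sketch: tally the 8 binary signal indicators for one author record.
# Field names and the example values are hypothetical; thresholds follow the text.

def signal_flags(author):
    """Return a dict of the 8 binary signal indicators for one author record."""
    return {
        "less_developed_country": author["less_developed_country"],
        "high_risk_subfield": author["subfield_enriched"],
        "self_citations_gt_p95": author["self_cit_pct"] > author["field_self_cit_p95"],
        "top_cited_only_with_self": author["top_cited_only_with_self_citations"],
        "cites_per_citing_paper_gt_p95": (
            author["citations_to_citing_ratio"] > author["field_ratio_p95"]
        ),
        "extreme_publishing": author["extreme_publishing"],
        "orchestration_c_over_h2_lt_2_45": author["c_over_h2"] < 2.45,
        "first_author_citations_gt_80_8pct": author["first_author_cit_share"] > 0.808,
    }

def count_signals(author):
    """Number of signal indicators present (0-8) for one author."""
    return sum(signal_flags(author).values())

# One hypothetical author record:
example = {
    "less_developed_country": True,
    "subfield_enriched": False,
    "self_cit_pct": 28.0, "field_self_cit_p95": 21.5,
    "top_cited_only_with_self_citations": True,
    "citations_to_citing_ratio": 2.9, "field_ratio_p95": 2.4,
    "extreme_publishing": False,
    "c_over_h2": 1.8,
    "first_author_cit_share": 0.62,
}
print(count_signals(example))  # -> 5
```

Applied to each of the 17 records, such a tally yields the per-author signal counts summarized above (mean = 4.1, median = 4).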

Discussion

Scientists with precocious and ultra-precocious citation impact have become a more frequent phenomenon over time. Such scientists emerge mostly from less developed countries and cluster heavily in a few specific scientific subfields. Moreover, they often carry indicator signals that may reflect problematic behavior converging towards a spuriously inflated citation record. Many of these authors also have retractions, but most retractions occur late, after the authors have reached the top ranks of citation metrics. Perusal of the Retraction Watch database [20] shows that paper mills, rogue editors, and fake or otherwise problematic peer review are frequently listed reasons in these retractions.

Typically, when these authors first become top-cited in their scientific subfields, they have no retracted papers yet. Apparently, they reach this extraordinary level of citation achievement extremely fast, while retraction is a slow process. It usually takes long to recognize flaws, and both high-impact and low-impact journals rarely act decisively and fast in retracting papers [28,29]. Therefore, several other authors in these cohorts may have papers retracted in the future. Such extraordinary CVs require more routine, careful scrutiny for potential problems. The 8 signal indicators described here offer an initial indicative assessment before in-depth scrutiny of papers published by these authors and their frequent co-authors. For authors with one or a few retracted papers, inspection of the wider published corpus and its cross-linking with other citing and cited authors and journals may reveal additional problems with papers, journals, and/or author clusters. This may allow large-scale detection of problems, beyond one-paper-at-a-time post-publication peer review [30]. In fact, while some fraudulent authors may work alone or with a few accomplices, there are currently fraudulent or manipulative enterprises that probably generate thousands of papers and involve hundreds or thousands of authors, real ones or occasionally even fake ones [31,32].

The current analysis used impact ranking based on a composite citation indicator that considers 6 citation metrics addressing raw citations, co-authorship, and relative author placement [16]. This was selected on purpose instead of using just raw citations. Raw citation counts would have selected as early top-ranked scientists the authors or co-authors of single, most heavily cited papers with the highest immediacy. These include mostly reference papers, guidelines, clinical trials, and some extraordinary achievements in basic and translational science. Among other simple metrics, a precocious h-index alone may also pick up some problematic behaviors (e.g., strategic placement of citations) [9–11,25,26]. However, it would not preferentially select the first and last authors, who may be more key players in corpora of problematic publications.
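Conceptually, a composite indicator in the spirit of [16] combines six citation metrics into a single c-score per author. The sketch below is a minimal illustration under the assumption that each metric is log-transformed, normalized by the maximum observed value of that metric, and the six terms are summed; the metric abbreviations and example records are illustrative, not the actual Scopus field names.

```python
# Hedged sketch of a composite citation indicator in the spirit of [16]:
# six metrics, each log-transformed and normalized by its maximum observed
# value across authors, then summed. Names and values are illustrative.
import math

METRICS = ["nc", "h", "hm", "ncs", "ncsf", "ncsfl"]

def composite_scores(authors):
    """authors: list of dicts holding the six raw metrics; returns one c-score per author."""
    maxima = {m: max(a[m] for a in authors) for m in METRICS}
    return [
        sum(math.log(1 + a[m]) / math.log(1 + maxima[m]) for m in METRICS)
        for a in authors
    ]

a1 = {"nc": 10000, "h": 40, "hm": 20, "ncs": 3000, "ncsf": 2500, "ncsfl": 4000}
a2 = {m: v // 2 for m, v in a1.items()}  # an author with half of every metric
scores = composite_scores([a1, a2])
# a1 holds the maximum of every metric, so each of its six terms equals 1
print(round(scores[0], 2))  # -> 6.0
```

Because each term is bounded by 1, the score rewards balanced strength across all six metrics rather than a single heavily cited paper, which is why such a composite behaves differently from raw citation counts.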

Some limitations should be acknowledged. None of the signal indicators presented here offers definitive proof of problematic behavior. However, each has substantial theoretical justification as a risk factor, and all of them are heavily enriched in the ultra-precocious cohort. Authors who reach the top within 6–8 years seem to have profiles closer to those who take 5 or fewer years than to the average top-cited scientist, who takes decades. In the latest annual iteration of the top-cited scientists database, the median time elapsed since first publication is 36 years, i.e., most accomplished scientists occupy the highest citation ranks late in their careers. Therefore, one may cautiously speculate that problematic and fraudulent behavior may also exist in a substantial share of authors who show precocious but not ultra-precocious citation impact. Moreover, the definitions of precociousness used here were set arbitrarily, for operational purposes, to limit the number of authors analyzed to manageable levels. However, problematic and fraudulent behavior may extend to other scientists regardless of whether, and how fast, they manage to become highly cited. Fraudulent and problematic behavior may far more often result in more modest inflation of CVs, which may never suffice to propel a CV to the top-2% of citation impact. However, the focus on precocious authors offers the advantage of studying the phenomenon at its most extreme, where problematic and fraudulent behavior is likely to be most heavily concentrated in its purest forms.
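Operationally, the precociousness definitions reduce to a simple threshold on the years elapsed between the first publication year and the year of first appearing in a top-cited list. A minimal sketch (the example years are hypothetical):

```python
# Hedged sketch of the operational cut-offs: "ultra-precocious" means
# top-cited within t <= 5 years of the first publication year, and
# "precocious" within t <= 8 years. Example inputs are hypothetical.

def precociousness(first_pub_year, top_cited_year):
    """Classify years-to-top-cited per the paper's operational cut-offs."""
    t = top_cited_year - first_pub_year
    if t <= 5:
        return "ultra-precocious"
    if t <= 8:
        return "precocious"
    return "not precocious"

print(precociousness(2018, 2023))  # -> ultra-precocious (t = 5)
print(precociousness(2016, 2023))  # -> precocious (t = 7)
print(precociousness(1990, 2023))  # -> not precocious
```

Different choices of the two cut-offs would shift cohort sizes, which is the sense in which the definitions are arbitrary but operationally necessary.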

It should also be stressed that many scientists with precocious citation impact may have absolutely no problematic or fraudulent behavior. They may indeed be the very exceptional early achievers who deserve only high praise for their work. Science clearly needs more early achievers, and disruption unfortunately decreases over time [33,34]. The existing patterns, where independence in research is reached in middle age, are disappointing [35,36]. In the cohorts analyzed here, several scientists obviously fulfill a definition of meritorious excellence by diverse means of assessment besides mere citation counts, e.g., wider scientific community recognition and contributions opening new frontiers. Moreover, the rapid development of highly cited emerging fields, the increased prevalence of hyper-authorship practices, and large-scale collaborative projects may contribute to the emergence of ultra-precocious citation patterns. Careful scrutiny is required on a case-by-case basis to differentiate the truly stellar scientists from those who have cut corners or even engaged in fraudulent behavior to achieve such extreme metrics so fast. In-depth assessment of single cases may also focus on the citing documents, since these may reveal additional evidence of citation cartels or other questionable practices.

Each of the presented signal indicators has imperfect specificity. For example, some scientists from less developed countries may be truly exceptional. China in particular has currently overtaken even the US in productivity and, in some fields, even in highly cited papers [37]. While some of this ascent is linked to problematic incentive structures [38,39], elements of true excellence also exist. Moreover, resources for scientific work in less developed countries are very heterogeneous: one may compare China to Syria, or Saudi Arabia to Iraq; all these countries have authors with precocious citation impact but vary vastly in resources. More importantly, non-white scientists, scientists from non-Western countries, and non-native English-speaking scientists face many disadvantages [40–42], and these inequities need to be corrected. Future applications of the indicators of precocious impact should be accompanied by safeguards to prevent bias or discrimination based solely on an author’s geographical origin. Some truly excellent research and outstanding talent may be found in less developed countries.

Similarly, it would be unfair to dismiss all scientists working in fields that happen to have high enrichment in ultra-precocious behavior. Some of these fields are indeed cutting-edge nowadays, e.g., artificial intelligence. Furthermore, early-career scientists may often legitimately have higher self-citation rates [43].

Nevertheless, the examined data suggest the presence of pervasive problems that may currently affect many countries, with complex networks of coordinated CV inflation and fraud, and discipline-wide problems that may stem from journals that are less able (or less willing) to curb problematic and fraudulent papers. Indeed, retractions occur mostly in the life sciences and a few other scientific fields, while they are rarely used in many other disciplines [21]. The proliferation of mega-journals [44,45], predatory journals (often with difficulties in detecting or even defining them) [46–48], paper mills, citation cartels, fake peer review, and rogue editors (including hijacked journals [49–51]) may create a perfect storm. A new landscape of massive misconduct is emerging, different from past patterns of misconduct that typically involved single, more insular authors. Illustratively, preliminary data suggest that currently one in 7 submitted papers may be a fake product [52]. Estimates of fraud have recently been revised upwards [53] versus prior estimates [54], with ongoing debate about the proposed values.

The presented analyses focus on precocious citation impact as an individual author phenomenon. However, the concept may conceivably be extended to research groups that reach very high levels of impact extremely swiftly. Precocious impact authors work within research teams. However, this is more difficult to study and calibrate, since there is no similar composite indicator ranking of impact for teams, as opposed to the citation metrics and rankings available for individuals. Nevertheless, some relevant precocious phenomena have been described for modest-size teams and for entire universities, e.g., the advent of abruptly rising, massive trial publications with suspect characteristics from some centers or author teams [55] and the rapid growth of some institutions in the number of affiliated highly cited scientists according to the Clarivate list of highly cited scientists [56].

Acknowledging the limitations of the proposed signal indicators, the analyzed data suggest that massive assessment of the scientific literature in standardized bibliometric datasets may offer evidence for widely prevalent problems. Judicious scientometric approaches [57] coupled with qualitative assessments can offer complementary insights. Learning from extreme cases may allow understanding how current and emerging problems operate and potentially how they can be more massively and more promptly detected.

Acknowledgments

I am grateful to Kevin Boyack and Jeroen Baas for helpful discussions.

Data Availability

All the raw data are available in pre-existing publicly available databases (https://elsevier.digitalcommonsdata.com/datasets/btchxktzyw/7 and older annual versions in same site and https://elsevier.digitalcommonsdata.com/datasets/kmyvjk3xmd/2) and in the manuscript.

Funding Statement

The work of John Ioannidis is supported by an unrestricted gift from Sue and Bob O’Donnell to Stanford University. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Ioannidis JPA, Maniadis Z. Quantitative research assessment: using metrics against gamed metrics. Intern Emerg Med. 2024;19(1):39–47. doi: 10.1007/s11739-023-03447-w [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Else H, Van Noorden R. The fight against fake-paper factories that churn out sham science. Nature. 2021;591(7851):516–9. doi: 10.1038/d41586-021-00733-5 [DOI] [PubMed] [Google Scholar]
  • 3.Candal-Pedreira C, Ross JS, Ruano-Ravina A, Egilman DS, Fernández E, Pérez-Ríos M. Retracted papers originating from paper mills: cross sectional study. BMJ. 2022;28(379):e071517. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Christopher J. The raw truth about paper mills. FEBS Lett. 2021;595(13):1751–7. doi: 10.1002/1873-3468.14143 [DOI] [PubMed] [Google Scholar]
  • 5.Fister I, Fister I, Perc M. Towards the discovery of citation cartels in citation networks. Front Phys. 2016;4:00049. [Google Scholar]
  • 6.Rivera H. Fake peer review and inappropriate authorship are real evils. J Korean Med Sci. 2018;34(2):e6. doi: 10.3346/jkms.2019.34.e6 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Van Noorden R, Singh Chawla D. Hundreds of extreme self-citing scientists revealed in new database. Nature. 2019;572(7771):578–9. doi: 10.1038/d41586-019-02479-7 [DOI] [PubMed] [Google Scholar]
  • 8.Ioannidis JPA. A generalized view of self-citation: direct, co-author, collaborative, and coercive induced self-citation. J Psychosom Res. 2015;78(1):7–11. doi: 10.1016/j.jpsychores.2014.11.008 [DOI] [PubMed] [Google Scholar]
  • 9.Bartneck C, Kokkelmans S. Detecting h-index manipulation through self-citation analysis. Scientometrics. 2011;87(1):85–98. doi: 10.1007/s11192-010-0306-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Katsaros D, Akritidis L, Bozanis P. The f index: Quantifying the impact of coterminal citations on scientists’ ranking. J Am Soc Inf Sci. 2009;60(5):1051–6. doi: 10.1002/asi.21040 [DOI] [Google Scholar]
  • 11.Fire M, Guestrin C. Over-optimization of academic publishing metrics: observing Goodhart’s Law in action. Gigascience. 2019;8(6):giz053. doi: 10.1093/gigascience/giz053 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Oransky I. Retractions are increasing, but not enough. Nature. 2022;608(7921):9. doi: 10.1038/d41586-022-02071-6 [DOI] [PubMed] [Google Scholar]
  • 13.Alexander R, Peterson CJ, Yang S, Nugent K. Article retraction rates in selected MeSH term categories in PubMed published between 2010 and 2020. Account Res. 2023;:1–14. [DOI] [PubMed] [Google Scholar]
  • 14.Ioannidis JPA, Maniadis Z. In defense of quantitative metrics in researcher assessments. PLoS Biol. 2023;21(12):e3002408. doi: 10.1371/journal.pbio.3002408 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Ioannidis JPA, Baas J, Klavans R, Boyack KW. A standardized citation metrics author database annotated for scientific field. PLoS Biol. 2019;17(8):e3000384. doi: 10.1371/journal.pbio.3000384 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Ioannidis JPA, Klavans R, Boyack KW. Multiple citation indicators and their composite across scientific disciplines. PLoS Biol. 2016;14(7):e1002501. doi: 10.1371/journal.pbio.1002501 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Archambault E, Beauchesne OH, Caruso J. “Towards a multilingual, comprehensive and open scientific journal ontology” in Proceedings of the 13th International Conference of the International Society for Scientometrics and Informetrics (ISSI), Durban, South Africa. Noyons B, Ngulube P, Leta J, editors. 2011: 66–77. [Google Scholar]
  • 18.Baas J, Schotten M, Plume A, Côté G, Karimi R. Scopus as a curated, high-quality bibliometric data source for academic research in quantitative science studies. Quantitative Science Studies. 2020;1(1):377–86. doi: 10.1162/qss_a_00019 [DOI] [Google Scholar]
  • 19.Ioannidis JPA. Prolific non-research authors in high impact scientific journals: meta-research study. Scientometrics. 2023;128(5):3171–84. doi: 10.1007/s11192-023-04687-5 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Retraction watch database. Available from: http://retractiondatabase.org/RetractionSearch.aspx?. [Google Scholar]
  • 21.Ioannidis JPA, Pezzullo AM, Cristiano A, Boccia S, Baas J. Linking citation and retraction data reveals the demographics of scientific retractions among highly cited authors. PLoS Biol. 2025;23(1):e3002999. doi: 10.1371/journal.pbio.3002999 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Hwang SY, Yon DK, Lee SW, Kim MS, Kim JY, Smith L, et al. Causes for retraction in the biomedical literature: a systematic review of studies of retraction notices. J Korean Med Sci. 2023;38(41):e333. doi: 10.3346/jkms.2023.38.e333 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Fang FC, Steen RG, Casadevall A. Misconduct accounts for the majority of retracted scientific publications. Proc Natl Acad Sci U S A. 2012;109(42):17028–33. doi: 10.1073/pnas.1212247109 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Ioannidis JPA, Collins TA, Baas J. Evolving patterns of extreme publishing behavior across science. Scientometrics. 2024;129(9):5783–96. doi: 10.1007/s11192-024-05117-w [DOI] [Google Scholar]
  • 25.Evdaimon I, Ioannidis JPA, Nikolentzos G, Chatzianastasis M, Panagopoulos G, Vazirgiannis M. Metrics to detect small-scale and large-scale citation orchestration. arXiv 2406.19219v1, doi: 10.48550/arXiv.2406.19219 [DOI] [Google Scholar]
  • 26.Fiorillo D. Detecting the impact of academics self-citations: Fi-score. Publ Res Q. 2024;40:70–9. [Google Scholar]
  • 27.Benjamin DJ, Berger JO, Johannesson M, Nosek BA, Wagenmakers E-J, Berk R, et al. Redefine statistical significance. Nat Hum Behav. 2018;2(1):6–10. doi: 10.1038/s41562-017-0189-z [DOI] [PubMed] [Google Scholar]
  • 28.Fang FC, Casadevall A. Retracted science and the retraction index. Infect Immun. 2011;79(10):3855–9. doi: 10.1128/IAI.05661-11 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Trikalinos NA, Evangelou E, Ioannidis JP. Falsified papers in high-impact journals were slow to retract and indistinguishable from nonfraudulent papers. J Clin Epidemiol. 2008;61(5):464–70. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Teixeira da Silva JA, Dobránszki J. Problems with traditional science publishing and finding a wider niche for post-publication peer review. Account Res. 2015;22(1):22–40. doi: 10.1080/08989621.2014.899909 [DOI] [PubMed] [Google Scholar]
  • 31.Parr C. Who is stronzo bestiale? THE - Times Higher Education. 2024. Available from: https://www.timeshighereducation.com/comment/opinion/who-is-stronzo-bestiale/ [Google Scholar]
  • 32.Abalkina A. Publication and collaboration anomalies in academic papers originating from a paper mill: evidence from a Russia-based paper mill. arXiv preprint. 2021. doi: 10.48550/arXiv.2112.13322 [DOI] [Google Scholar]
  • 33.Park M, Leahey E, Funk RJ. Papers and patents are becoming less disruptive over time. Nature. 2023;613(7942):138–44. doi: 10.1038/s41586-022-05543-x [DOI] [PubMed] [Google Scholar]
  • 34.Kozlov M. “Disruptive” science has declined - and no one knows why. Nature. 2023;613(7943):225. doi: 10.1038/d41586-022-04577-5 [DOI] [PubMed] [Google Scholar]
  • 35.Levitt M, Levitt JM. Future of fundamental discovery in US biomedical research. Proc Natl Acad Sci U S A. 2017;114(25):6498–503. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Lauer M. Age of principal investigators at the time of first R01-equivalent remains level with recent years in FY 2023. Available from: https://nexus.od.nih.gov/all/2024/05/06/age-of-principal-investigators-at-the-time-of-first-r01-remains-level-with-recent-years-in-fy-2023/. [Google Scholar]
  • 37.Wagner CS, Zhang L, Leydesdorff L. A discussion of measuring the top-1% most-highly cited publications: quality and impact of Chinese papers. Scientometrics. 2022;127(4):1825–39. doi: 10.1007/s11192-022-04291-z [DOI] [Google Scholar]
  • 38.Quan W, Chen B, Shu F. Publish or impoverish: An investigation of the monetary reward system of science in China (1999–2016). Aslib J Info Manage. 2017;69:486–502. [Google Scholar]
  • 39.Wang J, Halffman W, Zwart H. The Chinese scientific publication system: Specific features, specific challenges. Learned Publishing. 2020;34(2):105–15. doi: 10.1002/leap.1326 [DOI] [Google Scholar]
  • 40.Liu F, Rahwan T, AlShebli B. Non-White scientists appear on fewer editorial boards, spend more time under review, and receive fewer citations. Proc Natl Acad Sci U S A. 2023;120(13):e2215324120. doi: 10.1073/pnas.2215324120 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Amano T, Ramírez-Castañeda V, Berdejo-Espinola V, Borokini I, Chowdhury S, Golivets M, et al. The manifold costs of being a non-native English speaker in science. PLoS Biol. 2023;21(7):e3002184. doi: 10.1371/journal.pbio.3002184 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Chen H, Rider CI, Jurgens D, Teplitskiy M. Geographical Disparities in Navigating Rejection in Science Drive Disparities in its File Drawer. 2024. Available from: https://ssrn.com/abstract=4872023 [Google Scholar]
  • 43.Vincent-Lamarre P, Lariviere V. Are self-citations a normal feature of knowledge accumulation? arXiv. 2023. doi: 10.48550/arXiv.2303.02667 [DOI] [Google Scholar]
  • 44.Talley NJ, Barbour V, Lapeña JFF, Munk PL, Peh WCG. The rise and rise of predatory journals and the risks to clinical practice, health and careers: the APAME 2024 Sydney declaration on predatory or pseudo journals and publishers. Med J Aust. 2024;221(5):248–50. doi: 10.5694/mja2.52410 [DOI] [PubMed] [Google Scholar]
  • 45.Ioannidis JPA, Pezzullo AM, Boccia S. The rapid growth of mega-journals: threats and opportunities. JAMA. 2023;329(15):1253–4. doi: 10.1001/jama.2023.3212 [DOI] [PubMed] [Google Scholar]
  • 46.Björk BC. Have the “mega-journals” reached the limits to growth? PeerJ. 2015;3(26):e981. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Cukier S, Lalu M, Bryson GL, Cobey KD, Grudniewicz A, Moher D. Defining predatory journals and responding to the threat they pose: a modified Delphi consensus process. BMJ Open. 2020;10(2):e035561. doi: 10.1136/bmjopen-2019-035561 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Grudniewicz A, Moher D, Cobey KD, Bryson GL, Cukier S, Allen K, et al. Predatory journals: no definition, no defence. Nature. 2019;576(7786):210–2. doi: 10.1038/d41586-019-03759-y [DOI] [PubMed] [Google Scholar]
  • 49.Brainard J. Hijacked journals mar top database. Science. 2023;382(667). [DOI] [PubMed] [Google Scholar]
  • 50.Hegedűs M, Dadkhah M, Dávid LD. Masquerade of authority: hijacked journals are gaining more credibility than original ones. Diagnosis (Berl). 2024;11(3):235–9. doi: 10.1515/dx-2024-0082 [DOI] [PubMed] [Google Scholar]
  • 51.Ryan J. Hijacked journals are still a threat - here’s what publishers can do about them. Nature. 2024. doi: 10.1038/d41586-024- [DOI] [PubMed] [Google Scholar]
  • 52.Retraction Watch. Up to one in seven submissions to hundreds of Wiley journals flagged by new paper mill tool. Available from: https://retractionwatch.com/2024/03/14/up-to-one-in-seven-of-submissions-to-hundreds-of-wiley-journals-show-signs-of-paper-mill-activity/. [Google Scholar]
  • 53.Heathers J. How much science is fake? Available from: https://osf.io/5rf2m/. [Google Scholar]
  • 54.Fanelli D. How many scientists fabricate and falsify research? A systematic Review and meta-analysis of survey data. PLoS One. 2009;4(5):e5738. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Nielsen J, Bordewijk EM, Gurrin LC, Shivantha S, Flanagan M, Liu S, et al. Assessing the scientific integrity of the collected work of one author or author group. J Clin Epidemiol. 2025;180:111603. doi: 10.1016/j.jclinepi.2024.111603 [DOI] [PubMed] [Google Scholar]
  • 56.Bhattacharjee Y. Citation impact. Saudi universities offer cash in exchange for academic prestige. Science. 2011;334(6061):1344–5. doi: 10.1126/science.334.6061.1344 [DOI] [PubMed] [Google Scholar]
  • 57.Hicks D, Wouters P, Waltman L, de Rijcke S, Rafols I. Bibliometrics: the leiden manifesto for research metrics. Nature. 2015;520(7548):429–31. doi: 10.1038/520429a [DOI] [PubMed] [Google Scholar]
PLoS One. 2025 Aug 6;20(8):e0328531. doi: 10.1371/journal.pone.0328531.r001

Author response to Decision Letter 0


Transfer Alert

This paper was transferred from another journal. As a result, its full editorial history (including decision letters, peer reviews and author responses) may not be present.

9 Dec 2024

Decision Letter 0

Miquel Vall-llosera Camps

Dear Dr. Ioannidis,


Kind regards,

Miquel Vall-llosera Camps

Senior Staff Editor

PLOS ONE


Additional Editor Comments:

I would like to sincerely apologise for the delay you have incurred with your submission. It has been exceptionally difficult to secure reviewers to evaluate your study. We have now received two completed reviews; the comments are available below. The reviewers have raised significant scientific concerns about the study that need to be addressed in a revision.

Please revise the manuscript to address all the reviewer's comments in a point-by-point response in order to ensure it is meeting the journal's publication criteria. Please note that the revised manuscript will need to undergo further review, we thus cannot at this point anticipate the outcome of the evaluation process.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Partly

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?


Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: No

Reviewer #2: Yes

**********

Reviewer #1: The manuscript examines precocious citation impact at the individual level, i.e., researchers who attain very high citation rates early in their careers. The manuscript is overall interesting but has some issues, mainly in the methods section. Many of them have to do with presentation and a lack of information and detail. Hopefully my questions and comments can help the author improve the manuscript.

Abstract

The readability of the abstract could be increased by imposing conciseness (less detail, more substance with a focus on conclusions), better structure, and including the aim of the study.

Method

The operationalisations of “precocious citation impact” as top-cited within 8 years and “ultra-precocious citation impact” as top-cited within 5 years seem a bit arbitrary. It is not clear to me why the concept of precocious citation impact is operationalised this way. What assumptions and definitions are these operationalisations based on? My recommendation is that the author elaborate on this issue in the manuscript (page 4). The author mentions this issue in the limitations (page 15), but it should receive attention and clarification, and be discussed in the methods section.

The inclusion criteria based on “top-citedness” are not clear to me. An author is top-cited if the author is either among the top-2% of a composite citation indicator (it is unclear to me which population this percentile ranking is based on; this should be clarified in the manuscript), or among the top-100,000 authors based on the same composite citation indicator based on Scopus data. It is unclear to me how these two criteria for inclusion are related to each other. It seems as if the first is a relative cut-off threshold and the second is an absolute cut-off threshold. This makes me uncertain about the extent to which these two operationalisations of “top-citedness” are comparable, and the extent to which they might refer to two different definitions of top-citedness. My recommendation is that the author clarify and justify this operationalisation in the manuscript (page 4). Overall, there seem to be different operationalisations of top-citedness other than those mentioned above in the manuscript, e.g., data for 2018 and 2017 where only scientists reaching the top-100,000 ranks across all science are included (page 24). As a reader it is difficult to get a good overview of this issue, and I would recommend the author clarify this in the manuscript. Furthermore, the use of the composite citation indicator should be justified in the manuscript. On page 14 in the discussion the indicator is justified to some degree; this should be done fully in the methods section and further elaborated on.

The inclusion criterion for being considered for ranking, “Scientists with ≥5 papers are considered for ranking”, seems arbitrary. Why is this criterion imposed and how does it impact the ranking? What does “papers” mean here? Is it only based on peer-reviewed journal articles? My recommendation is that the author justify this criterion in the manuscript. (page 4)

The author writes: “Total number of top-cited authors in each annually released database allowed adjusting for increasing author numbers over time” (page 5). It is not clear to me why this is done or how this is done. Can the author elaborate a bit on this?

The author writes: “First, it was assessed whether they may have additional Scopus author ID records where early publications before 2018 (i.e., outside the 5-year frame) were included” (page 5). It is not clear to me how many authors were excluded due to this issue. This should be clearly stated in the manuscript. (page 5)

The author writes: “The eligible ultra-precocious scientists were then contrasted against other precocious authors (those starting to publish in 2015-2017) and the group of all top-cited authors (regardless of year of starting to publish) using the database that includes citation data to end of 2023” (page 5). Who are the authors starting in 2015–2017? I cannot see that these cohorts have been mentioned previously in the manuscript.

The author writes: “The following descriptive characteristics were obtained and summarized as counts for discrete variables and median for continuous ones” (page 6). I cannot find these descriptive statistics in the manuscript.

The cut-off threshold of >60 papers seems arbitrary (page 7). My recommendation is to justify this threshold with a reference or arguments.

The author writes: “For each of these signal indicators, the proportion of authors with the indicator was compared between ultra-precocious citation impact authors and all top-cited scientists. Given the exploratory nature of analyses, a conservative p-value<0.005 was considered statistically significant” (page 7-8). It is not clear to me why the significance level is set at 0.005 here. My recommendation is to justify it with a reference or arguments. What is the purpose of using p-values? What is the sample (what is its size), and to which population does the author wish to generalize? My recommendation is to clarify these issues in the manuscript.

To summarize, overall the methods section lacks information, clarity and detail on the definitions and operationalisations of precocious citation impact and top-citedness. These issues should be presented in the manuscript and not just referenced. Further, as a reader it is difficult to follow all the references to different databases and to get a clear view of the actual dataset/datasets used in the study and what they consist of. My recommendation is that the author make an effort to increase the information, clarity and detail in the methods section on how the dataset/datasets used in this study are constructed and what they consist of, in a structured and clear fashion, so that transparency on these issues increases in the manuscript.

General questions:

Is precocious citation impact a phenomenon that is observable first and foremost at the individual level, or is it relevant also at other levels, e.g., at the document level or maybe the research-group level, etcetera?

How do changes in the Scopus database, e.g., increased indexing over time, affect the longitudinal analyses?

This might not be possible, but why does the author not examine the citing documents and their sources? From my point of view it would seem interesting to understand the citing side of this phenomenon.

Reviewer #2: The manuscript entitled “Features and signals in precocious citation impact: a meta-research study” addresses a highly relevant and timely topic: the emergence of patterns and signals of "precocious citation impact" and their potential associations with questionable scientific practices. The study is methodologically rigorous, employing multiple indicators to detect potential anomalies in citation trajectories. The writing is clear (although at times somewhat dense), and the overall approach is strong and systematic.

However, several aspects regarding the operationalization of key concepts, the interpretation of the findings, and the presentation of the results could be further strengthened to enhance the overall robustness, clarity, and impact of the study.

In my view, the following aspects could be further strengthened to enhance the manuscript:

- Definition of "Precocious Impact": While the selection of 8 and 5 years as thresholds for defining "precocious" and "ultra-precocious" citation impact is understandable, the manuscript would benefit from a more explicit justification of these specific cutoffs within the main text to contextualize the choice (especially, considering the diversity of fields considered in the study).

- Limitations of the signal indicators. One of the main risks associated with the use of these indicators is the potential for generalization and stigmatization. Given the notable overrepresentation of authors from less developed countries among those identified with precocious citation impact, the discussion should more explicitly address the risk of reinforcing existing structural biases or inequities in global science. Emphasizing the diversity of scientific excellence across different contexts would help provide a more balanced interpretation of the findings.

Moreover, given the sensitive nature of using country affiliation as a signal, it would be important to clearly state that any future application of such indicators in research assessments must be accompanied by strong safeguards to prevent bias or discrimination based solely on an author's geographical origin.

Now I will go briefly on more details.

Abstract

The abstract is generally clear and informative. However, I would recommend simplifying some of the longer sentences to improve readability. For example, the sentence "In-depth assessment of validated ultra-precocious scientists in 2023, showed significantly higher frequency..." is quite long and could be streamlined for clarity. Similarly, the phrase "The 17 ultra-precocious citation impact authors in the 2017-2020 top-cited lists who had retractions by October 2024..." could be simplified by removing details such as the timeline of retractions, which might be more appropriate for the main text rather than the abstract.

Finally, the transition between "top-cited authors" and "Scopus author IDs" might cause minor confusion for readers unfamiliar with the methodology; clarifying or simplifying this distinction in the abstract would enhance accessibility (population should be clear).

Methods

- Regarding the composite citation indicator, it would be helpful to clarify how co-authorship is weighted and how authorship placement (single, first, last) is specifically incorporated into the calculation. Additionally, when mentioning that “scientists with ≥5 full papers are considered for ranking,” it would be important to specify whether all publication types (e.g., editorials, letters, commentaries) are included or if any types were excluded.

- Although the exploratory nature of the study is well justified, presenting a clearer flowchart or schematic overview of the analytical steps would greatly aid the reader in following the complex methodology.

- A brief explanation of the c/h² orchestration index within the main text would be beneficial, particularly for readers who may not be familiar with this metric.

- While it is acknowledged that none of the signal indicators provide definitive proof of misconduct, the manuscript would benefit from a more detailed discussion on the potential for false positives, particularly regarding legitimate scientists from less developed countries or rapidly emerging fields.

Results and discussion

- Figure 1 is informative, but its layout makes it somewhat difficult to interpret at a glance. Adding clearer visual separators, such as light grey horizontal lines between entries, could greatly improve readability. This would help readers more easily distinguish between different authors and the associated signal indicators.

- In the section stating "Self-citations and citations per citing paper were much higher in the ultra-precocious cohort than in the full list of top-cited authors and other authors were in the middle, closer if anything to the ultra-precocious cohort," it would be beneficial to also report authorship positions (e.g., first, last, single authorship) to better understand the roles and contributions associated with these citation patterns.

- The discussion would benefit from a more nuanced exploration of alternative explanations for ultra-precocious citation patterns. Potential factors to consider include the rapid development of highly cited emerging fields, the prevalence of international hyper-authorship practices, and the effects of large-scale collaborative projects on citation accumulation.
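Regarding the c/h² orchestration metric the reviewer asks about: per the prior work it references, the metric can be read as an author's total citations divided by the squared h-index (treat this reading, and the numbers below, as an illustrative assumption rather than the authoritative definition):

```python
def orchestration_ratio(total_citations, h_index):
    """c/h^2: an author's total citations divided by the squared h-index.

    By the definition of h, the h-core papers alone carry at least h^2
    citations, so the ratio is >= 1 for any author; very large values
    flag citation totals far beyond what the h-index reflects, which is
    why the metric has been used as a citation-orchestration signal.
    """
    return total_citations / h_index ** 2

# Hypothetical author: 10,000 total citations with an h-index of 20
ratio = orchestration_ratio(10_000, 20)  # -> 25.0
```

A brief verbal version of this (ratio of citations to squared h-index, with extreme values as a warning sign) is the kind of explanation the reviewer suggests adding to the main text.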

**********

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

If you choose “no”, your identity will remain anonymous but your review may still be made public. If published, this will include your full peer review and any attached files.

Reviewer #1: No

Reviewer #2: No

**********

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/ . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org . Please note that Supporting Information files do not need this step.

PLoS One. 2025 Aug 6;20(8):e0328531. doi: 10.1371/journal.pone.0328531.r003

Author response to Decision Letter 1


3 Jun 2025

May 29, 2025

Miquel Vall-llosera Camps

Senior Staff Editor

PLOS ONE

PONE-D-24-56818

Features and signals in precocious citation impact: a meta-research study

Dear Editor

I was pleased to hear that PLoS One is interested in a revised version of this manuscript. I am grateful for the insightful comments of the reviewers. I have addressed all of them in the current revision, in more detail below.

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

Reply: done

2. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match.

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

Reply: these have been aligned. Some funds have no grant number.

3. Thank you for stating the following financial disclosure:

“The work of John Ioannidis is supported by an unrestricted gift from Sue and Bob O’Donnell to Stanford University.”

Please state what role the funders took in the study. If the funders had no role, please state: "The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript." If this statement is not correct you must amend it as needed.

Please include this amended Role of Funder statement in your cover letter; we will change the online submission form on your behalf.

Reply: The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

4. Please note that funding information should not appear in the Acknowledgments section or other areas of your manuscript. We will only publish funding information present in the Funding Statement section of the online submission form. Please remove any funding-related text from the manuscript.

Reply: Removed.

5. Thank you for stating the following in the Competing Interests section:

“METRICS has been funded by grants from the Laura and John Arnold Foundation (Arnold Ventures).”

Please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials, by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include your updated Competing Interests statement in your cover letter; we will change the online submission form on your behalf.

Reply: This does not alter our adherence to PLOS ONE policies on sharing data and materials.

6. Please note that your Data Availability Statement is currently missing the repository name and direct link to access each database. If your manuscript is accepted for publication, you will be asked to provide these details on a very short timeline. We therefore suggest that you provide this information now, though we will not hold up the peer review process if you are unable.

Reply: As I state, “All the raw data are available in pre-existing publicly available databases (see links in the text) and in the manuscript.”

Additional Editor Comments:

I would like to sincerely apologise for the delay you have incurred with your submission. It has been exceptionally difficult to secure reviewers to evaluate your study. We have now received two completed reviews; the comments are available below. The reviewers have raised significant scientific concerns about the study that need to be addressed in a revision.

Please revise the manuscript to address all the reviewers' comments in a point-by-point response in order to ensure it meets the journal's publication criteria. Please note that the revised manuscript will need to undergo further review; we thus cannot at this point anticipate the outcome of the evaluation process.

Reply: Thank you for making an extra effort to secure reviewers. The comments are excellent and very helpful. Please see the point by point responses below.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

________________________________________

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: I Don't Know

Reviewer #2: Yes

________________________________________

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

________________________________________

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: No

Reviewer #2: Yes

________________________________________

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The manuscript examines precocious citation impact at the individual level, i.e., researchers who attain very high citation rates early in their career. The manuscript is overall interesting but has some issues, mainly in the methods section. Many of them have to do with presentation and a lack of information and detail. Hopefully my questions and comments can help the author to improve the manuscript.

Reply: Thank you for the very helpful comments. Please see responses and revisions delineated below.

Abstract

The readability of the abstract could be increased by imposing conciseness (less detail, more substance, with a focus on conclusions), improving structure, and including the aim of the study.

Reply: the abstract has been revised accordingly.

Method

The operationalisations of “precocious citation impact” as top-cited within 8 years and “ultra-precocious citation impact” as top-cited within 5 years seem a bit arbitrary. It is not clear to me why the concept of precocious citation impact is operationalised this way. What assumptions and definitions are these operationalisations based on? My recommendation is that the author elaborate on this issue in the manuscript. (page 4) The author mentions this issue in the limitations (page 15), but it should receive attention and clarification, and be discussed in the methods section.

Reply: I have clarified upfront that: “The use of 8- and 5-year cut-offs is arbitrary. These thresholds were pre-set in the analysis so as to capture individuals who achieve extreme acceleration of their citation impact in the very early stages of their career, placing them at the far end of the distribution of all scientists in this aspect. If one assumes that extremes are populated by a mixture of individuals with truly exceptional ability and others with manipulative (or even fraudulent) behavior, it is possible that the relative share of the latter group may be enriched when an ultra-extreme threshold is used. Employing two thresholds also allowed a comparative examination of the two resulting cohorts of authors.”

The inclusion criterion based on “top-citedness” is not clear to me. An author is top-cited if the author is either among the top-2% of a composite citation indicator (it is unclear to me which population this percentile ranking is based on; this should be clarified in the manuscript), or among the top-100,000 authors based on the same composite citation indicator based on Scopus data. It is unclear to me how these two criteria for inclusion are related to each other. It seems as if the first is a relative cut-off threshold and the second is an absolute cut-off threshold. This makes me uncertain about the extent to which these two operationalisations of “top-citedness” are comparable, and the extent to which they might refer to two different definitions of top-citedness. My recommendation is that the author clarify and justify this operationalisation in the manuscript. (page 4) Overall, there seem to be operationalisations of top-citedness other than those mentioned above in the manuscript, e.g., the data for 2018 and 2017, where only scientists reaching the top-100,000 ranks across all science are included (page 24). As a reader it is difficult to get a good overview of this issue, and I would recommend the author clarify this in the manuscript. Furthermore, the use of the composite citation indicator should be justified in the manuscript. On page 14, in the discussion, the indicator is justified to some degree; this should be done fully in the methods section and further elaborated on.

Reply: I have clarified this as follows: “The composite indicator has been widely used in previous work and the respective datasets have been accessed over 4 million times to-date. In principle, it tries to amalgamate information on citations, co-authorship patterns, and authorship placement in positions that usually suggest greater contribution (single, first, senior author positions). An author is included if they manage to be in the top-2% of the composite indicator among all authors who have the same primary scientific subfield and have published at least 5 full papers. In addition, some authors are included because they are among the top-100,000 in the composite indicator across all authors in all science (roughly top-1%), even though they may not have made it to the top-2% among those who share the same primary scientific subfield.”
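The reply above describes the composite indicator only in principle. As a rough illustration, a log-based composite over several citation and authorship-placement metrics could be sketched as follows; the six component names and the log-max rescaling follow published descriptions of the indicator, but the exact formula and the toy numbers here are assumptions, not the study's code:

```python
from math import log

# Six component metrics of the composite indicator, per the published
# descriptions (the exact component list is an assumption of this sketch):
# nc    - total citations
# h     - h-index
# hm    - co-authorship-adjusted hm-index
# ncs   - citations to single-authored papers
# ncsf  - citations to single- or first-authored papers
# ncsfl - citations to single-, first-, or last-authored papers
FIELDS = ["nc", "h", "hm", "ncs", "ncsf", "ncsfl"]

def composite_scores(authors):
    """Score authors by summing six log-transformed, max-rescaled metrics.

    Each component is mapped to log(1 + x) and divided by the maximum
    log-transformed value in the population, so every component
    contributes on a comparable 0..1 scale before summation.
    """
    maxima = {f: max(log(1 + a[f]) for a in authors) for f in FIELDS}
    return {
        a["id"]: sum(
            (log(1 + a[f]) / maxima[f]) if maxima[f] > 0 else 0.0
            for f in FIELDS
        )
        for a in authors
    }

# Hypothetical toy population (invented numbers, for illustration only)
authors = [
    {"id": "A", "nc": 1000, "h": 20, "hm": 15, "ncs": 100, "ncsf": 400, "ncsfl": 600},
    {"id": "B", "nc": 100, "h": 5, "hm": 4, "ncs": 0, "ncsf": 10, "ncsfl": 20},
]
scores = composite_scores(authors)  # A tops every component, so scores["A"] == 6.0
```

Because each rescaled component is at most 1, an author who leads on all six components scores exactly 6; ranking by this score and taking the top-2% within a subfield mirrors the selection described in the reply.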

The inclusion criterion for being considered for ranking, “Scientists with ≥5 papers are considered for ranking”, seems arbitrary. Why is this criterion imposed and how does it impact the ranking? What does “papers” mean here? Is it only based on peer-reviewed journal articles? My recommendation is that the author justify this criterion in the manuscript. (page 4)

Reply: These criteria were set long ago when creating the composite indicator and have been extensively explained in previous work. I have nevertheless added: “the cut-off of at least 5 papers has been used since the original conception of the composite indicator. It aims to exclude authors with very limited presence in the literature. Articles, reviews, and conference papers are the only types of publications that count as papers.”

The author writes: “Total number of top-cited authors in each annually released database allowed adjusting for increasing author numbers over time” (page 5). It is not clear to me why this is done or how this is done. Can the author elaborate a bit on this?

Reply: this is inherent to the selection using the top-2%, so if there are more total authors, it is still only the top-2% being selected.

The author writes: “First, it was assessed whether they may have additional Scopus author ID records where early publications before 2018 (i.e., outside the 5-year frame) were included” (page 5). It is not clear to me how many authors were excluded due to this issue. This should be clearly stated in the manuscript. (page 5)

Reply: It is already stated in the Results that 2 authors were artefacts because of such split publication records.

The author writes: “The eligible ultra-precocious scientists were then contrasted against other precocious authors (those starting to publish in 2015-2017) and the group of all top-cited authors (regardless of year of starting to publish) using the database that includes citation data to end of 2023” (page 5). Who are the authors starting in 2015–2017? I cannot see that these cohorts have been mentioned previously in the manuscript.

Reply: Other precocious authors are those who are precocious but not ultra-precocious.

The author writes: “The following descriptive characteristics were obtained and summarized as counts for discrete variables and median for continuous ones” (page 6). I cannot find these descriptive statistics in the manuscript.

Reply: this is what the Results section presents; they are all descriptive statistics.

The cut-off threshold of >60 papers seems arbitrary (page 7). My recommendation is to justify this threshold with a reference or arguments.

Reply: this is a previously used cut-off, so I clarify that “the >60 cut-off has been used as a threshold of extreme publishing behavior in previous work (24)”.

The author writes: “For each of these signal indicators, the proportion of authors with the indicator was compared between ultra-precocious citation impact authors and all top-cited scientists. Given the exploratory nature of analyses, a conservative p-value<0.005 was considered statistically significant” (page 7-8). It is not clear to me why the significance level is set at 0.005 here. My recommendation is to justify it with a reference or arguments. What is the purpose of using p-values? What is the sample (what is its size), and to which population does the author wish to generalize? My recommendation is to clarify these issues in the manuscript.

Reply: A reference has been added to the classic paper (ref. 27) suggesting the use of 0.005 instead of 0.05. The sample here is the authors with precocious behavior versus other top-cited authors. These are exploratory analyses (hence 0.005 is further justified); no power calculation was performed and I would avoid claiming one, as this would be misleading.
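Operationally, comparing the proportion of authors carrying a signal indicator between the two groups at the 0.005 threshold can be sketched as below; the counts are invented for illustration (not the study's data), and the normal-approximation z-test stands in for whatever exact procedure the paper used:

```python
from math import sqrt, erf

ALPHA = 0.005  # conservative significance threshold for exploratory analyses

def two_proportion_z(x1, n1, x2, n2):
    """Two-sided two-proportion z-test (pooled normal approximation).

    Returns (z, p). Illustrative choice only; an exact test such as
    Fisher's could equally be used for small counts.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # two-sided p-value from the standard normal CDF
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p

# Invented counts: 30/60 ultra-precocious authors show a signal indicator
# versus 5,000/200,000 of all top-cited authors.
z, p = two_proportion_z(30, 60, 5000, 200000)
significant = p < ALPHA
```

With any gap this extreme the test comfortably clears the 0.005 bar; the conservative threshold mainly guards against the many borderline comparisons an exploratory analysis generates.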

To summarize, overall the methods section lacks information, clarity and detail on the definitions and operationalisations of precocious citation impact and top-citedness. These issues should be presented in the manuscript and not just referenced. Further, as a reader it is difficult to follow all the references to different databases and to get a clear view of the actual dataset/datasets used in the study and what they consist of. My recommendation is that the author make an effort to increase the information, clarity and detail in the methods section on how the dataset/datasets used in this study are constructed and what they consist of, in a structured and clear fashion, so that transparency on these issues increases in the manuscript.

Reply: Thank you for the very helpful suggestions. Hopefully, with the additional referencing and clarifications, the information and its clarity should now be improved.

General questions:

Is precocious citation impact a phenomenon that is observable first and foremost at the individual level, or is it relevant also at other levels, e.g., at the document level or maybe the research-group level, etcetera?

Reply: I am not sure what “document level

Attachment

Submitted filename: responsesplosoneprecocious.docx

pone.0328531.s002.docx (26.6KB, docx)

Decision Letter 1

Robin Haunschild

Features and signals in precocious citation impact: a meta-research study

PONE-D-24-56818R1

Dear Dr. Ioannidis,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice will be generated when your article is formally accepted. Please note, if your institution has a publishing partnership with PLOS and your article meets the relevant criteria, all or part of your publication costs will be covered. Please make sure your user information is up-to-date by logging into Editorial Manager® and clicking the ‘Update My Information' link at the top of the page. If you have any questions relating to publication charges, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Robin Haunschild

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: (No Response)

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: (No Response)

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: (No Response)

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: (No Response)

Reviewer #2: Yes

**********

Reviewer #1: (No Response)

Reviewer #2: I appreciate the substantial revisions made by the author in response to my comments. The manuscript has been significantly improved in both clarity and structure, and I commend the author for addressing the key points raised in the initial review. The methodological section has been particularly strengthened (e.g., the inclusion of a more explicit justification for the 8- and 5-year thresholds, as well as a clearer explanation of the composite citation indicator), both of which help contextualize the rationale behind the study’s core definitions. Additionally, the discussion section now includes further clarifications that enhance the overall interpretation of the findings. In my view, the revised manuscript is now suitable for publication.

**********

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

If you choose “no”, your identity will remain anonymous but your review may still be made public. If published, this will include your full peer review and any attached files.

Reviewer #1: No

Reviewer #2: No

**********

Acceptance letter

Robin Haunschild

PONE-D-24-56818R1

PLOS ONE

Dear Dr. Ioannidis,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission,

* There are no issues that prevent the paper from being properly typeset

You will receive further instructions from the production team, including instructions on how to review your proof when it is ready. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few days to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Robin Haunschild

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.


    Data Availability Statement

    All the raw data are available in pre-existing publicly available databases (https://elsevier.digitalcommonsdata.com/datasets/btchxktzyw/7 and older annual versions in same site and https://elsevier.digitalcommonsdata.com/datasets/kmyvjk3xmd/2) and in the manuscript.

