Skip to main content
Proceedings of the National Academy of Sciences of the United States of America logoLink to Proceedings of the National Academy of Sciences of the United States of America
. 2024 Jun 11;121(25):e2320066121. doi: 10.1073/pnas.2320066121

Promotional language and the adoption of innovative ideas in science

Hao Peng a,b,1, Huilian Sophie Qiu a,b,1, Henrik Barslund Fosse c, Brian Uzzi a,b,2
PMCID: PMC11194578  PMID: 38861605

Significance

Using three longitudinal samples of funded and unfunded grant applications from three of the world’s largest funders—the NIH, the NSF, and the Novo Nordisk Foundation—we find that the percentage of promotional language in a grant proposal is associated with the grant’s probability of being funded, its estimated innovativeness, and its predicted levels of citation impact. Further, computer experiments show that promotional words may elevate a grant’s positive impressions. While scientific ideas should be evaluated by their intrinsic merit, not by their linguistic packaging, our results empirically suggest that promotional language can harmonize the substance and presentation of scientific work to effectively communicate the merits of innovative ideas.

Keywords: innovation, funding, communication, NLP, promotional language

Abstract

How are the merits of innovative ideas communicated in science? Here, we conduct semantic analyses of grant application success with a focus on scientific promotional language, which may help to convey an innovative idea’s originality and significance. Our analysis attempts to surmount the limitations of prior grant studies by examining the full text of tens of thousands of both funded and unfunded grants from three leading public and private funding agencies: the NIH, the NSF, and the Novo Nordisk Foundation, one of the world’s largest private science funding foundations. We find a robust association between promotional language and the support and adoption of innovative ideas by funders and other scientists. First, a grant proposal’s percentage of promotional language is associated with up to a doubling of the grant’s probability of being funded. Second, a grant’s promotional language reflects its intrinsic innovativeness. Third, the percentage of promotional language is predictive of the expected citation and productivity impact of publications that are supported by funded grants. Finally, a computer-assisted experiment that manipulates the promotional language in our data demonstrates how promotional language can communicate the merit of ideas through cognitive activation. With the incidence of promotional language in science steeply rising, and the pivotal role of grants in converting promising and aspirational ideas into solutions, our analysis provides empirical evidence that promotional language is associated with effectively communicating the merits of innovative scientific ideas.


Converting scientific curiosity into facts and innovation frequently depends on a research team’s ability to effectively convey the merits of new ideas. This communication process is important in most areas of science, including publications, symposia, and the media. It is especially crucial for grant funding, which is a pivotal first step in the innovation process (14). Besides being a gateway to resources that support research, grant writing is one of a scientist’s most time-consuming efforts. Annually, scientists spend 66 of their 260 working days on funding acquisition. The coveted RO1 grant (a.k.a. the “tenure award” grant), for example, typically consumes millions of hours of researchers’ and reviewers’ time each year (5, 6).

Despite the importance of funding for science, knowledge about how scientists communicate the merits of their ideas to funders is only partly understood (710). For example, while grants are expressly designed to promote innovation, recent research indicates that reviewers commonly reject novel grants (1113), a condition that may be linked to individual career setbacks (6) and to science becoming less disruptive and replicable overall (14, 15). Gaps also exist in the data available to prior empirical studies, which have generally lacked the full text and PI data on funded and unfunded applications that can overcome selection and model specification biases needed for broadly representative empirical findings (6, 1618).

A growing area of interest in science communication is the semantic analysis of large corpuses of scientific text (19). Evidence from linguistic studies suggests that the language used in communication is associated with changes in reader’s cognition of, or connections between, concepts and objects (20, 21). For example, subtle differences in the language used in college admission essays has been linked with college admission acceptance rates (22). Words in the abstracts of funded grants correlate with the size of the award (23). Evidence also suggests that promotional language may play a meaningful role in communicating a grant proposal’s merits by helping to convey its originality, novelty, and significance to reviewers (24, 25). A recent study showed that from 1985 to 2020, the use of promotional language steeply rose in accepted NIH grants (25). In 1985, there were approximately 6,000 promotional words per million words in funded NIH proposals. By 2020, that number had more than doubled to 13,000 promotional words per million words.

While the rise and fall in the use of certain words are a natural part of the evolution of language and writing, the implications of using promotional language for communicating the quality of ideas remain unknown and debated. On the one hand, if promotional language helps to convey the merits of complex and risky ideas, it can facilitate the adoption of novel ideas in science and society (26, 27). On the other hand, if promotional language overstates findings (28), it weakens science’s reputation for presenting unvarnished and reproducible facts (9, 29, 30).

To study science communication processes from the perspective of semantic analysis, we investigated the statistical relationship between a grant’s promotional language and its i) funding success, ii) inherent innovativeness, and iii) future impact. Empirically, our unique data surmount some limitations of prior work. Going beyond studies of funded proposals, we analyze tens of thousands of funded and nonfunded grant applications from three influential funding institutions: the NIH, the NSF, and the Novo Nordisk Foundation (NNF), which is one of the world’s largest private sources of scientific grants, with nearly a billion dollars awarded annually. These grant datasets are further enriched with performance data on the prior grant application and publication histories of the principal investigators (PIs) who submitted the grants. Finally, a computer-assisted experiment that manipulates the promotional language in our grant data provides one possible explanation for how promotional language may communicate the merits of novel ideas.

Data and Design

The NNF data include 13,520 grant proposals submitted between 2015 and 2022. Each proposal contains the full text of the project, as well as the program area, funding amount applied for, funding decision, and deidentified personal information, including the applicant’s self-reported age, gender, publication and citation records, and grant-supported future publications. The average award size in NNF data is $600,000 (4.2 million Danish kroner), and the positive funding rate is 16.8%. The average NNF proposal length is 2,859 words.

The NIH and NSF datasets include 2,649 NIH and 561 NSF grant proposals submitted by PI faculty members at a leading research university. These deidentified data include the application year, project description, amount of funding applied for, acceptance decision, as well as the main applicant’s gender, and publication and citation records. In our data, the average award sizes and funding rates are higher than the national average for NIH and NSF grants, which is expected, given that our sample is from a leading research university with a medical school. The national average NIH award size is $466,000 and the funding rate is 33% across all grants. In our data, the average award size is $825,000 and the funding rate is 21.4%. For NSF, the national average award size is $361,000 and the funding rate is 25%, and in our data, the average award size is $321,000 and the funding rate is 36.0%. The average NIH and NSF proposal lengths are 9,223 and 9,030 words, respectively. This study is approved by Northwestern University’s IRB office (STU00219074 & STU00215754).

Our lexicon of promotional language uses a validated dictionary of 139 science-specific promotional words such as “unique,” “revolutionary,” or “fundamental” (See the Materials and Methods section for the full lexicon). Millar et al. (25) created this dictionary of scientific promotional language by analyzing the linguistic content of all 901,717 funded NIH grant applications from 1985 to 2020 (24, 25). To manually code and validate this lexicon, Millar et al. had two independent experts identify candidate promotional words in the NIH grants. Each candidate word was evaluated based on over 500 different instances of its use in context, and on its substitutability with a neutral synonym (Cohen’s k = 0.82). The validation process resulted in a final set of 139 scientific promotional words used in grants. Table 1 displays examples of promotional words and their contextual usage, as reported by Millar et al. (25).

Table 1.

Sample sentences from funded NIH grants showing examples of promotional words used in different contexts

Example sentences using promotional words reported by Millar et al. (25) NIH grant
“Further, a unique and key aspect of this program is the sharing of common mouse strains, reagents…” R01AG032179
“There remains an imperative need for more advanced PACT breast imaging technologies.” R35CA220436
“Addressing this severe knowledge gap in one of the most fundamental aspects of cytoskeletal biology is paramount to understanding how actin functions in cells.” R35GM137959
“The proposed methods offer a revolutionary innovation and will be a game-changer in the…” R43EB027535
“These innovative and novel studies will provide essential new information about the regulation of…” R01HL084494
“We propose to go deep in analyzing a very unique and unprecedented large scale human genomic data set for aging research.” R01AG055501

Each promotional word is italicized.

Bolstering Millar et al.’s validation, we independently conducted three additional validation checks of the lexicon. First, to externally validate the dictionary, we used the multitrait–multimethod approach (MTMM) (31). According to MTMM, if promotional words operate as expected, by helping communicate the perceived originality and significance of innovative ideas to readers (25), promotional words should correlate more strongly than their neutral synonyms with words that engender cognitive engagement. A word’s valence and arousal levels have been shown to engender cognitive processing attention (32). As predicted, we found that promotional words have statistically higher average valence and arousal scores than their neutral synonyms when we use a weighted average t test, signrank test, or mvtest of means (P < 0.002 in all tests). See details in SI Appendix, section I. Second, to test the internal consistency of the dictionary, we computed Cronbach’s alpha for all 139 promotional words based on their percentage frequencies in our data, which are at acceptable levels for each dataset (Cronbach’s alpha is 0.57 for NNF, 0.53 for NSF, 0.63 for NIH) (33). Third, we found that 88% of promotional words in the dictionary had their individual word percentage frequencies statistically correlated with the total frequency of all other promotional words (P < 0.01); to address concerns of the remaining 12% words, we ran separate analyses using only 88% of the promotional words and found that the results were not statistically different from our main finding, which uses all 139 words in the dictionary.

Our three outcome variables are the grant’s i) funding success, ii) innovativeness, and iii) future publication citation impact. Funding success was coded as a binary variable (yes/no). We measured innovativeness score using a popular and validated novelty index that, based on the statistical referencing patterns of published work, quantifies the degree to which a grant combines past knowledge in ways that have been done in previously published work or in unfamiliar ways that have rarely been seen before in previously published work (14, 3439). The computational details of this measure are provided in the Materials and Methods section. The future impact of papers based on the grant is quantified as the productivity and estimated citation impact of publications that acknowledged the funded grant. Following refs. 40 and 41, we estimate a grant’s citation impact as i) the average journal impact factor (JIF) and ii) the largest JIF of publications acknowledging the grant. Productivity is measured as the number of publications supported by the grant.

Our predictor variable is operationalized as the percentage of promotional words in a grant proposal, i.e., “a grant proposal’s total occurrences of promotional words” divided by “a grant proposal’s total number of words.” We used regression models to test for statistical relationships between promotional language and our outcome variables. Our regression models include control variables for PI characteristics (self-reported demographics, prior productivity and citation impact, and prior grant application and success), grant features (writing style, readability, length, applied funding amount), and fixed effects for year, grant type, and domain specializations (6, 17). SI Appendix, Table S1 presents variable definitions, operationalizations, and data details. SI Appendix, section II shows model specifications, BIC, VIF, cross-validation statistics, and robustness checks.

Promotional Language and Grant Funding.

In our three datasets, the median percentage of promotional words per grant is one promotion word for every 100 words, or a density of one promotion word in every four sentences (an average sentence in the data contains 26 words). The average density of promotion words is higher in the first and last 500 words of a proposal (one promotional word per three sentences), where first impression and recency biases tend to affect human recall and engagement the most (42).

Fig. 1 shows distributions of the percentage of promotional words in our datasets by funding decision. Comparing the two distributions of funded and unfunded grants indicates that funded grants contain statistically more promotional words than nonfunded grants (P < 0.002 and P < 0.01 for the t test, KS test, and Epps-Singleton nonparametric test in Fig. 1A for NNF and Fig. 1B for NIH/NSF grants, respectively).

Fig. 1.

Fig. 1.

Frequency distribution of promotional language by grant funding decision. Plots (A and B) show the frequency distribution of promotional words for NNF and NIH/NSF grants. On average, the density of promotional words per grant is one promotion word in every 100 words (1%) or one promotion word for every four sentences, with the highest density in the first and last 500 words of a proposal, where promotional words appear once in every three sentences. Statistically, the plots indicate that funded NNF and NIH/NSF grants contain more promotional words than do unfunded grants (P < 0.002 and P < 0.01 for the t test, KS-test, and Epps-Singleton nonparametric test in both plots).

Logit regression demonstrates that promotional language predicts funding success. First, we regressed whether a NNF grant is funded or not on the percentage of promotional words in the grant while controlling for 13 variables representing a grant’s semantic features, the PI’s prior productivity and citation impact, prior grant experience, self-reported gender and age, and fixed effects for application year, program area, and grant type (See SI Appendix, Table S1 for variable operationalizations and SI Appendix, Table S4 for regression coefficients).

Fig. 2A shows the margins plot from the logit regression for the NNF dataset. It indicates that increases in promotional language are significantly related to increases in the probability of a grant proposal being funded (β = 37.7, P < 0.001 in SI Appendix, Table S4). SI Appendix, Table S4 indicates that a one percentage point increase in the frequency of promotional words is associated with about a 46% increase in the odds of a NNF grant being funded or a near doubling in funding probability, from a low of 11% to a high of 21%, after controlling for a grant’s general semantic features, PI’s prior productivity and citation impact, PI’s prior grant experience, age, gender, and fixed effects for year, program area, and grant type. Fig. 2A also indicates that grants containing an average level of promotional words are funded below the 16.8% base level of success (dashed line). Specifically, the medium percentage of promotional words is 1.0% in the NNF dataset, which equates to an estimated acceptance rate of 14.8%, or two percentage points below the 16.8% base rate. Proposals that do better than the average probability of success contain 1.4 to 2.0 times the median percentage of promotional words (Fig. 2A).

Fig. 2.

Fig. 2.

Promotional language predicts grant funding decision. Margins plots with 95% CIs of the predicted probability of a grant proposal being funded as a function of the percentage of promotional words in the grant and control variables for a grant’s general semantic features, the PI’s prior productivity and citation impact, PI’s prior grant experience, age, gender, and fixed effects for year, program area, and grant type. The two dashed lines represent the average acceptance rate of 16.8% in the NNF dataset (A) and 30.6% in the NSF/NIH combined dataset (B). The rug plots on the x-axis show the data’s density distribution. In both datasets, the percentage of promotional words used in a grant significantly predicts a positive funding decision [P < 0.001 in (A) and P < 0.02 in (B)] and its increase can double the likelihood of being awarded funding. In plot (B), rerunning a regression with only the data between 0.15% and 2.1% on the x-axis, which drops 26 proposals, produces a statistically significant result (β = 28.3, P < 0.05).

Fig. 2B shows the results for NIH/NSF grants (β = 29.5, P < 0.02 in SI Appendix, Table S6), which largely replicate the NNF finding. In the NIH/NSF dataset, the medium percentage of promotional words per grant is 0.9%, for which the model estimates an acceptance rate of 20.6%, or 3.4 percentage points below the 24.0% base rate of being funded. Thus, proposals with the median percentage of promotional words have an acceptance rate that is relatively 14.2% lower than the base rate. Proposals above the base acceptance rate have up to 3.0 times the median percentage of promotional words and a funding rate of up to 30% relative to proposals with the median number of promotional words. Given that there is data sparsity for promotional words above 2% in the NIH/NSF dataset as indicated in the rug plot, we reran a regression with only data in the range from 0.15 to 2.1%, which omits 26 observations in the two tails of the distribution. The result shows a slightly weaker but still statistically significant (β = 28.3, P < 0.05) link between promotional language and a positive funding decision in the NIH/NSF samples.

Finally, due to the rareness of data on failed grant applications in prior studies (6), we highlight our original findings on the associations between PI characteristics, grant features, and funding success. The regressions demonstrate that an applicant’s prior number of citations and prior grant application success are positively correlated with being funded. By contrast, their number of prior publications and total number of prior grant applications are negatively linked to being funded. A proposal’s general semantic features, including application length, reading score, and concreteness score are weakly related or unrelated to funding decision, which further highlights the unique semantic role of promotional language in funding evaluation (SI Appendix, Tables S4 and S6). Lastly, the estimated novelty of a NNF grant proposal is not statistically associated with funding acceptance (SI Appendix, Table S5), which corroborates existing evidence that grants tend to select for conventional ideas (1113, 43).

Promotional Language and a Grant’s Inherent Innovativeness.

To examine whether the concentration of promotional language reflects a grant’s inherent level of innovativeness, we regressed a grant’s innovativeness score on its percentage of promotional words while controlling for a grant’s number of references and the 13 confounds included in the previous regression using the NNF dataset, which has the bibliographic information needed to estimate a grant’s innovativeness (35). Per our measure of innovativeness, less innovative grant applications reference prior knowledge in ways that are statistically common in the literature. By contrast, innovative proposals reference past work in ways that are statistically novel or original in the literature (14, 35, 36). The median innovativeness score in our data is 7.99, and the MAD (Median Absolute Deviation) is 6.13. In our NNF data, 1,970 of 13,520 applications lacked bibliographic data, reducing our observations in this analysis to 11,550. Statistical tests of the mean of all variables used in the regression showed that there were no statistical differences between the 11,550 samples and the full data. See Materials and Methods and SI Appendix, section IV for computational details of our innovativeness measure.

Fig. 3 shows the OLS regression’s margins plot of the predicted innovativeness score as a function of a grant’s percentage of promotional words while controlling for confounds. The plot indicates that promotional language positively and significantly correlates with a grant’s innovativeness score (β = 139.7, P < 0.001; SI Appendix, Table S7 reports regression details). A proposal with the median percentage of promotional words of 1% has an estimated innovativeness score of 11.0, which is 38% higher than the median innovativeness score in the data. Proposals with 2% of promotional words in them have an estimated innovativeness score of 12.3, which is 54% higher than the median score. Over the full range of estimated innovativeness scores, each percentage point increase in promotional language is associated with a 1.4 (CI = [0.9, 1.9], P < 0.001) point increase in the predicted innovativeness score.

Fig. 3.

Fig. 3.

The frequency of promotional language is proportional to a grant’s degree of innovativeness. The margins plot from an OLS regression predicting a grant’s innovativeness score based on its percentage of promotional language and controlling for confounds indicates that promotional language is a strong predictor of a proposal’s innovativeness score (P < 0.001). On average, each percentage point increase in promotional language corresponds to a 1.4-point (CI = [0.9, 1.9], P < 0.001) increase in innovativeness, which is equivalent to 17.5% of the median score.

Promotional Language and Impact of Funded Grants.

While a paper’s novelty and impact are positively related (34, 35), it is unknown whether a grant’s promotional language is a leading indicator of the grant’s citation and productivity impact. We used the same regression specification as in Fig. 2A, except we control for funding amount, we replace the variable “funding amount applied for” with “grant funded amount.” We use OLS and Negative Binomial regressions to estimate the relationship between promotional language and a grant’s i) average JIF, ii) largest JIF, and iii) number of papers based on grant-acknowledged publications. We used JIFs to estimate a paper’s expected citations because recent papers lack the time needed to acquire citations (41) and because JIFs are good predictors of future citations when normalized by discipline (40). The grant-attributed publications are NNF-verified. At the grant level, the median average JIF is 5.7 (MAD = 2.0), the median highest JIF is 8.7 (MAD = 4.5), and the median productivity is 4.0 publications (MAD = 3.0). See SI Appendix, Tables S8–S11 for regression details.

Fig. 4 A and B show that promotional language strongly predicts the expected average JIF (β = 168.2, P < 0.001) and the highest JIF (β = 366.6, P < 0.001) of a grant’s papers (SI Appendix, Tables S8 and S9). A proposal with the median percentage of promotional language of 1% has an estimated average JIF of 6.9, a 21% increase over the median average JIF (Fig. 4A). Proposals that contain 2% of promotional words have an estimated average JIF of as high as 8.6, which is 51% higher than the median average JIF. When we examine a grant’s max JIF, the promotion-impact association intensifies (Fig. 4B). Proposals containing 2% of promotional words, compared to proposals containing 1% of promotional words, have an estimated increase in their max JIF by 30%, from 12.0 to 16.0. These results are robust (SI Appendix, Table S10) when using JIFs normalized by 14 academic disciplines defined in the UCSD map of science (44, 45). Finally, to address the concern that PIs of recent grants have not yet fully reported their publications, we run regressions on the subset of grants awarded before 2019 and find that the predictive power of promotional words for the avg/max JIF remains statistically significant (P < 0.001).

Fig. 4.

Fig. 4.

Promotional language predicts citation impact and productivity. This figure shows margins plots from OLS (A and B) and Negative Binomial model (C) that regress the (A) average JIF, (B) largest JIF, and (C) number of publications on the percentage of promotional language in the grant and controls. The frequency of promotional words strongly predicts the expected citation impact of publications (P < 0.001) and the productivity (P < 0.02) of funded NNF grants.

Promotional language also predicts productivity, but the link is statistically and substantively weaker (β = 15.7, P < 0.02; SI Appendix, Table S11) than citation impact. Fig. 4C shows the margins plot from a negative binomial regression predicting the number of publications reported by the PI while controlling for confounds. It indicates that a 1% increase in the frequency of promotional words is associated with about one more publication, with or without controlling for the grant’s estimated novelty (SI Appendix, Table S12).

Robustness Checks on Statistical Analyses.

First, to test the sensitivity of our analysis to the number of times a promotional word appears in a document, we ran the analysis with each unique promotional word counted at most once per grant proposal. Second, to test the context sensitivity of promotional words (e.g., the word “latest” can neutrally indicate a numerical ordering or promotionally indicate an idea’s originality, depending on the context in which it is used), we ran analyses that randomly dropped up to 20% of a grant’s promotional words occurrences. Third, to test our analysis’s sensitivity to the misidentification of nonpromotional words as promotional words, we ran analyses with 5 to 20% of the promotional words in the dictionary randomly dropped. The above tests were run 100 times for the latter two conditions and indicated that the reported results are robust: While the significance level dropped as the level of sensitivity test increased, promotional words continued to predict funding at P < 0.05 (see details in SI Appendix, section I-B).

Experimental Substitutions of Promotional Words and Estimated Changes in Sentiment.

To examine how promotional language may help communicate the merits of innovative ideas, we conducted computer-assisted experiments that manipulated the promotional language in our grant samples. A starting point for our analysis is the potential link between promotional language and the perceived positive impression of a grant’s merits (46). Here we used algorithms to estimate the level of a grant’s positive sentiment before and after promotional words are replaced with neutral synonyms, while leaving all other aspects of the text intact (Table 2).

Table 2.

Example sentences containing promotional words and their substitutions with randomly chosen neutral synonyms

Example sentence with promotional words Nonpromotional replica with replacement
“Further, a unique and key aspect of this program is the sharing of common mouse strains, reagents…” “Further, a specific and central aspect of this program is the sharing of common mouse strains, reagents…”
“There remains an imperative need for more advanced PACT breast imaging technologies.” “There remains a necessary need for more modern PACT breast imaging technologies.”
“The proposed methods offer a revolutionary innovation and will be a game-changer in the…” “The proposed methods offer a different innovation and will be a game-changer in the…”

We collected each promotional word’s set of synonyms from the Oxford Dictionary and employed a graduate student with 2 y of work experience in a university grant office to review and select neutral synonyms. For example, the word “necessary” is synonymous with the promotional word “critical” but has a neutral connotation, according to human raters (25). The manual review process also ensured that each synonym is valid within a scientific context (e.g., the word “handsome” is a valid synonym for the promotional word “attractive” but is out of the grant writing context). See details and the full list of nonpromotional synonyms in SI Appendix, section V. Each of the 139 promotional words has an average of 13 neutral synonyms.

We estimated a grant’s positive sentiment using SciBert, a contextual word embedding model pretrained on scientific publications that can be used to evaluate the sentence-level sentiment of scientific text (47). The model outputs a confidence score for each of three sentiment labels (positive, neutral, negative). To fine-tune SciBert for the study of sentiment in grant applications, we followed a standard fine-tuning methodology (48) by further training SciBert on sentences from grants with labeled sentiment. The sentences came from a random sample of 1,021 sentences taken from 40 NSF and 40 NIH abstracts funded in 2024. The sentences were given to three raters who manually coded the sentences as having positive, neutral, or negative sentiment. Rater 1 was a native English speaker and a psychology graduate with 2 y of work experience reviewing grant submissions at a large university. Rater 2 was a doctoral candidate who speaks English as a second language and has 2 y of research experience. Both raters were unaware of the purpose of our substitution experiment. Rater 3 was one of the authors. The three raters worked independently. Pairwise, raters agreed on 66%, 76%, and 67.5% of all sentences, and Cohen’s kappa is 0.29, 0.53, and 0.38, which is considered acceptable agreement. A sentence was labeled as the category on which two or more of the raters agreed. In the handful of cases that lacked majority agreement, rater 1 and rater 2 resolved the sentence’s coding.

The fine-tuning procedure started with 200 sentences and incremented the sample by 50 sentences at a time up to 1,020 sentences. For each sample size, we fine-tuned SciBert five times, each time using 85% of the data as training and 15% as testing. In the testing set, SciBert reached an F1 score of 0.79 when using the entire dataset. This fine-tuning process indicates that when the analyses used more than 500 sentences, the F1 score stabilizes (SI Appendix, Fig. S3).

For each promotional-word-containing sentence in each grant, we replaced the promotional word with a randomly chosen neutral synonym. We then compared whether the replicant sentences’ average positive sentiment went up, went down, or stayed the same with respect to the original sentiment. A trial was coded as a decrease in positive sentiment if the average positive sentiment went down. This process was repeated 100 times for each proposal in order to remove random noise and outlier effects. To assess whether the replicant proposal had a statistically lower sentiment score than did the original proposal, we used a binomial test (N = 100 trials, K = the number of trials) in which the original proposal had a higher average positive sentiment than its substitution versions. This entire computer experiment procedure independently replaced 25, 50, 75, and 100 percent of a proposal’s promotional words. In total, these experiments were run on 13,520 NNF, 2,649 NIH, and 561 NSF grant proposals.

Fig. 5 shows the percentage of proposals with a statistically higher level of positive sentiment based on the binomial test than their neutral replicas at four levels of substitutions. Across all three datasets, a supermajority of original proposals had significantly more positive sentiment than did the same proposals that have had their promotional language substituted with neutral synonyms, ceteris paribus. This positive sentiment premium intensified when the level of word substitutions increased. In the NSF and NIH datasets, over 69% and 68% (respectively) of proposals showed a sentiment drop after 25% of promotional words were substituted with neutral synonyms, and over 88% and 92% of proposals decreased in sentiment when 100% of promotional words were substituted. The NNF dataset showed a similar trend. These results suggest that promotional words are linked with higher estimated positive impression of a grant.

Fig. 5.

Fig. 5.

A grant proposal’s positive sentiment drops when its promotional language is experimentally replaced with neutral synonyms. The y-axis shows the percentage of grant applications that decrease in positive sentiment after substituting their proposal’s observed promotional words with corresponding nonpromotional synonyms chosen at random. Significance tests are within-proposal and based on 100 rounds of random substitutions for 13,520 NNF, 2,649 NIH, and 561 NSF proposals. At all levels of replacement, the estimated sentiment score of a proposal drops when promotional words are replaced with synonyms that lack a promotional connotation as coded by human raters.

A robustness analysis confirmed the expectation that a proposal’s average sentiment score is positively associated with funding acceptance within our data sample. Specifically, we used our fine-tuned SciBert model to estimate the average sentence-level sentiment of each NNF grant and ran three regressions to determine the covariate relationships between promotional, average sentiment, and funding decisions. Regression one regressed funding success on a grant’s average sentiment plus basic controls for year, grant type, and program area. Consistent with expectations, it showed that average grant sentiment predicts funding (P < 0.001, BIC = 12,025). Next, we ran a second regression with the same specification as the first one, except we replaced average grant sentiment with the percentage of promotional words. Regression two indicated that the percentage of promotional words predicts funding (P < 0.001, BIC = 12,007). To evaluate the covarying relationship of average sentiment and the percentage of promotional words, regression three regressed funding on both average grant sentiment and percentage of promotional words. Regression three indicated that promotional language was statistically significant (P < 0.001), the average grant sentiment was statistically insignificant (P > 0.05, BIC = 12,013). The fact that the “promotional word only” regression (regression two) had the lowest BIC of all three regressions provides further evidence that the percentage of promotional words is the stronger predictor of funding (49). Similarly, this robustness test replicates (see details in SI Appendix, Table S13) when using the full regression model presented in SI Appendix, Table S4.

Discussion

Innovation is a distinctive strength of science. Science of science researchers have accumulated much knowledge about conditions that generate innovative ideas, such as teams, networks, institutional support, training, incentives, and diversity (4, 36, 50, 51). Comparatively, little is known about how meritorious ideas are communicated to other scientists and the entities that support science (35, 52). Indeed, a lack of effective communication and factors associated with communication skills may partly explain the observed increase in career setbacks (6, 14), inequality in science (17, 53, 54), and resistance to scientific facts (55).

Our starting point for examining the communication of scientific ideas focuses on the role of promotional language in grant applications. Adopting a semantic analysis approach, we examined the link between promotional language and funding success. The density of promotional language has skyrocketed in grants and on social media (25, 56), yet claims of its role in science remain controversial. On the one hand, promotional language is thought to increase attention to, and connections between, concepts in ways that help convey the potential of new ideas. On the other hand, it may result in misleading levels of hype that turn off reviewers or hurt science’s reputation for conservatively presenting findings (28). Our study used a promotional language lexicon identified and validated by Millar et al. (25) to examine the relationship between the density of promotional language in a grant proposal and grant’s i) probability of funding, ii) degree of innovativeness, and iii) productivity and citation impact in the published literature.

Our analysis had several features that strengthened previous research on grants. First, we analyzed tens of thousands of grant applications from three diverse funding organizations—the public NIH and NSF and the private NNF, the world’s largest philanthropic foundation by endowment. Second, to surmount sample selection biases found in prior work, our data include a proposal’s full text, all funded and unfunded proposals, and longitudinal measures of PI, grant, and agency characteristics over the last 10 y to control for confounds.

Our analyses showed that across diverse datasets, the percentage of promotional language in a grant is a key predictor of a grant’s funding decision, innovativeness, and productivity and citation impact after controlling for PI and grant characteristics. Our raw data indicate that on average a grant contains one promotional word in every 100 words or about one promotional word every fourth sentence. Adjusting for confounds, proposals with above average probability of funding success contain 1.4 to 2.0 times the average percentage of promotional words more than double a grant’s chances of funding.

To understand the mechanisms through which promotional language might shape granting agency’s impression about a grant proposal’s merits, we conducted word substitution experiments to estimate the level of sentiment produced by promotional words. To predict a grant’s level of positive sentiment, these experiments used SciBert, a domain-specific and context-sensitive foundational word embedding model in the context of scientific writing, along with fine-tuning based on over a thousand sentences from grants that were manually coded for their positive and negative sentiment. By comparing a proposal’s positive sentiment before and after the substitution of promotional words with neutral synonyms, ceteris paribus, (e.g., substitute the promotional word “unique” with a neutral synonym “specific”), we found that the replacement of promotional words with their neutral synonyms results in a significant drop in a grant’s positive sentiment. While the algorithm cannot capture all nuances in academic language and phrasing and a proxy for how humans might respond to promotional language, the experimental results agree with the theory (20) that promotional language reviewers’ subjective impressions are linked (57, 58). We hope that future research will use our initial analysis to conduct direct causal tests of this mechanism and others. Related questions for future research are why promotional language has increased and how it propagates through networks in science (51).

Our findings raise important implications for scientific practice. Granting agencies have long urged that science be communicated in plain language (23). Our findings suggest a need to understand the gap between science’s espoused norm of neutrality and the practical need for effectively communicating meritorious ideas and findings. On the one hand, the conservative approach of science, which embraces the ideal that good ideas are recognized based on their face value, has served it well. Yet, the astonishing growth of scientific knowledge and changes in search technology and the proliferation of subfield journals have led to the observation that reviewers’ attention is becoming a scarce commodity—changes that can make the recognition of innovation harder and more parochial (35, 5962). These changes additionally burden researchers with the task of conveying their ideas to multiple audiences who hold different perspectives on innovativeness (30, 34, 44). From a theory and policy perspective, promotional words—and semantic analysis more generally—can help increase scientists’ knowledge of how to connect and draw attention to boundary-spanning ideas. Mindful of these cross-cutting needs, one speculation for policy analysis is to examine how promotional language can help bridge the gap between recognizing innovative ideas and the changing reality of reviewers’ attention and specialization in science.

Another policy issue concerns the longer-term implications of whether the association between promotional language and grant success can inadvertently be misused. We believe that science is built on trust and that the pervasive trustworthiness scientists show in data integrity and honesty in reporting applies to the use of promotional language. Consistent with this belief, our results showed that the percentage of promotional words in a grant statistically correlates with the grant’s intrinsic level of innovativeness. To better understand the use of promotional language and norms in science, future research might begin to investigate how promotional language varies with changing norms like data-sharing, replicability, and social media discourse.

A scope condition of our research is that we examined grants specifically. Papers and patents are also critical to academic success and to the conversion of scientific ideas into facts and inventions. It may be that the findings for grants are different in the context of papers or patents. Grants are designed to gain funding to support aspirational research ideas, so they may invite and justify the use of more promotional language than is used in papers and patents, which are geared toward presenting proof of empirical findings and concepts. Thus, while grants are uniquely important to science, the findings here may be grant-specific. It remains an open question as to whether promotional language has positive, negative, or nonlinear associations within the context of papers and patents. Consequently, an important next step is to examine how promotional language may generalize to academic papers and patents, a question we will investigate in follow-up work.

Beyond promotional language, our study speaks to the growing interest in how semantic analysis of science can improve training, accelerate discovery, and improve equity (19). While science rightly focuses on the quantification of phenomena, most of the content of a grant application, scientific paper, or patent application is text. Analysis of scientific text has been an undeveloped research asset. Our work offers a unique demonstration of a growing research area (20) by demonstrating how data analytics, linguistic theory, NLP, and machine learning technologies can provide intriguing approaches for advancing science and its contributions to society.

Materials and Methods

Promotional Words Lexicon and Validation.

We operationalized promotional words as a list of 139 hype words identified in Millar et al. (25). Millar et al. examined 901,717 NIH-funded grants from 1985 to 2020 and created a funding-specific lexicon of promotional language in NIH grant writing (25). In Millar et al.’s (25) sophisticated manual coding process, an adjective was labeled as a promotional word based on whether the word could be removed or replaced with a more objective or neutral word without changing the information within the sentence. Each candidate word was reviewed in at least 500 different instances of its use. In their hand-curation process, independent experts achieved a strong interrater agreement (Cohen’s k = 0.82) and resolved disagreement through discussion (25).

In addition to the human validations performed by Millar et al., we conducted three tests to check the validity of the promotional word dictionary and three robustness checks to test the sensitivity of our main finding to measurement errors. Both analyses are described in the main text with further details reported in SI Appendix, section I.

The lexicon of 139 scientific promotional words includes the following: compelling, critical, crucial, essential, foundational, fundamental, imperative, important, indispensable, invaluable, key, major, paramount, pivotal, significant, strategic, timely, ultimate, urgent, vital, creative, emerging, first, groundbreaking, innovative, latest, novel, revolutionary, unique, unparalleled, and unprecedented, accurate, advanced, careful, cohesive, detailed, nuanced, powerful, quality, reproducible, rigorous, robust, scientific, sophisticated, strong, systematic, accessible, actionable, deployable, durable, easy, effective, efficacious, efficient, generalizable, ideal, impactful, intuitive, meaningful, productive, ready, relevant, rich, safer, scalable, seamless, sustainable, synergistic, tailored, tangible, transformative, user-friendly, ambitious, collegial, dedicated, exceptional, experienced, intellectual, longstanding, motivated, premier, prestigious, promising, qualified, renowned, senior, skilled, stellar, successful, talented, vibrant, ample, biggest, broad, comprehensive, considerable, deeper, diverse, enormous, expansive, extensive, fastest, greatest, huge, immediate, immense, interdisciplinary, international, interprofessional, largest, massive, multidisciplinary, myriad, overwhelming, substantial, top, transdisciplinary, tremendous, vast, attractive, confident, exciting, incredible, interesting, intriguing, notable, outstanding, remarkable, surprising, alarming, daunting, desperate, devastating, dire, dismal, elusive, stark, unanswered, and unmet. See SI Appendix, Fig. S1 for the frequency of promotional words in our datasets.

Grant Innovativeness Measure.

We estimated a grant’s innovativeness using a popular and validated measure that characterizes a grant’s novelty based on the degree to which it combines past knowledge in either familiar or novel ways (35, 36). The ideas combined in a grant are reflected in the references cocited in its bibliography. Bibliographies that contain references that have been paired together many times in prior work combine ideas in conventional and familiar ways. References that have never or rarely been paired together in prior work combine ideas in unfamiliar and innovative ways.

Journal pairs in a paper’s bibliography are used to measure the pairing of knowledge. To quantify the novelty of a given journal pair, the method computes an observed cocitation frequency based on the pairs observed in past published work and compares it with the frequency expected by chance based on a null model that shuffles citations while preserving the yearly citation statistics. The preceding procedure creates a z-score for each journal pair and a distribution of z-scores for each grant (one score for each journal pairing in its bibliography). Z-scores below zero statistically happen less than chance and vice versa for z-scores above zero. Thus, scores above zero are increasingly conventional, and scores below zero are increasingly innovative. A proposal’s innovativeness score is defined as the median of its negative scores. The innovativeness score was reverse coded so that large scores indicate more innovativeness. SI Appendix, section IV presents the procedure’s full details and an example.

Supplementary Material

Appendix 01 (PDF)

Acknowledgments

We thank the NNF, Northwestern University, Clarivate’s Web of Science and OpenAlex for providing data used in this study, especially Anders A.K. Nielsen (NNF), Northwestern’s High-Performance Computing Cluster for technical contribution to this study, Emily Rosman and Zihang Lin’s research assistance, and the Center for Science of Science and Innovation workgroup at Northwestern University for feedback on earlier drafts. We offer special thanks to Susan Fiske for her editorial leadership and reviewer one (Dr. James W. Pennebaker) and anonymous reviewer two. This work is funded by the Air Force Office of Scientific Research under Minerva award number FA9550-19-1-0354, the Northwestern University Institute on Complex Systems and Data Analytics (NICO), The Ryan Institute of Complexity, and the Kellogg School of Management at Northwestern University.

Author contributions

H.P., H.S.Q., and B.U. designed research; H.P., H.S.Q., and B.U. performed research; H.P., H.S.Q., and B.U. contributed new reagents/analytic tools; H.P., H.S.Q., H.B.F., and B.U. analyzed data; and H.P., H.S.Q., and B.U. wrote the paper.

Competing interests

The authors declare no competing interest.

Footnotes

This article is a PNAS Direct Submission.

Data, Materials, and Software Availability

Some study data available. (The deidentified NNF grant application dataset is available upon request through an NDA process with the NNF. The NSF and NIH datasets used in our analysis are subject to their University's NDA process.)

Supporting Information

References

  • 1.Packalen M., Bhattacharya J., NIH funding and the pursuit of edge science. Proc. Natl. Acad. Sci. U.S.A. 117, 12011–12016 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Braun D., The role of funding agencies in the cognitive development of science. Res. Policy 27, 807–821 (1998). [Google Scholar]
  • 3.Muller R. A., Innovation and scientific funding. Science 209, 880–883 (1980). [DOI] [PubMed] [Google Scholar]
  • 4.Fortunato S., et al. , Science of science. Science 359, eaao0185 (2018).29496846 [Google Scholar]
  • 5.Herbert D. L., Barnett A. G., Clarke P., Graves N., On the time spent preparing grant proposals: An observational study of Australian researchers. BMJ Open 3, e002800 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Wang Y., Jones B. F., Wang D., Early-career setback and future career impact. Nat. Commun. 10, 4331 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Iyengar S., Massey D. S., Scientific communication in a post-truth society. Proc. Natl. Acad. Sci. U.S.A. 116, 7656–7661 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Rose K. M., Markowitz E. M., Brossard D., Scientists’ incentives and attitudes toward public communication. Proc. Natl. Acad. Sci. U.S.A. 117, 1274–1276 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Weingart P., “Is there a hype problem in science? If so, how is it addressed” in The Oxford Handbook of the Science of Science Communication, Jamieson K. H., Kahan D., Scheufele D. A., Eds. (Oxford University Press, 2017), pp. 111–118. [Google Scholar]
  • 10.Vinkers C. H., Tijdink J. K., Otte W. M., Use of positive and negative words in scientific PubMed abstracts between 1974 and 2014: Retrospective analysis. BMJ 351, h6467 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Nicholson J. M., Ioannidis J. P. A., Conform and be funded. Nature 492, 34–36 (2012). [DOI] [PubMed] [Google Scholar]
  • 12.Lane J. N., et al. , Conservatism gets funded? A field experiment on the role of negative information in novel project evaluation. Manage. Sci. 68, 4478–4495 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Ayoubi C., Pezzoni M., Visentin F., Does it pay to do novel science? The selectivity patterns in science funding. Sci. Public Policy 48, 635–648 (2021). [Google Scholar]
  • 14.Park M., Leahey E., Funk R. J., Papers and patents are becoming less disruptive over time. Nature 613, 138–144 (2023). [DOI] [PubMed] [Google Scholar]
  • 15.Youyou W., Yang Y., Uzzi B., A discipline-wide investigation of the replicability of Psychology papers over the past two decades. Proc. Natl. Acad. Sci. U.S.A. 120, e2208863120 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Ma Y., Oliveira D. F. M., Woodruff T. K., Uzzi B., Women who win prizes get less money and prestige. Nature 565, 287–288 (2019). [DOI] [PubMed] [Google Scholar]
  • 17.Oliveira D. F., Ma Y., Woodruff T. K., Uzzi B., Comparison of National Institutes of Health grant amounts to first-time male and female principal investigators. JAMA 321, 898–900 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Bol T., de Vaan M., van de Rijt A., The Matthew effect in science funding. Proc. Natl. Acad. Sci. U.S.A. 115, 4887–4890 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Liu L., Jones B. F., Uzzi B., Wang D., Data, measurement and empirical methods in the science of science. Nat. Hum. Behav. 7, 1046–1058 (2023), 10.1038/s41562-023-01562-4. [DOI] [PubMed] [Google Scholar]
  • 20.Pennebaker J. W., Mehl M. R., Niederhoffer K. G., Psychological aspects of natural language use: Our words, our selves. Annu. Rev. Psychol. 54, 547–577 (2003). [DOI] [PubMed] [Google Scholar]
  • 21.Romero D. M., Swaab R. I., Uzzi B., Galinsky A. D., Mimicry is presidential: Linguistic style matching in presidential debates and improved polling numbers. Pers. Soc. Psychol. Bull. 41, 1311–1319 (2015). [DOI] [PubMed] [Google Scholar]
  • 22.Pennebaker J. W., Chung C. K., Frazee J., Lavergne G. M., Beaver D. I., When small words foretell academic success: The case of college admissions essays. PLoS One 9, e115844 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Markowitz D. M., What words are worth: National Science Foundation grant abstracts indicate award funding. J. Lang. Soc. Psychol. 38, 264–282 (2019). [Google Scholar]
  • 24.Millar N., Salager-Meyer F., Budgell B., “It is important to reinforce the importance of…”:‘Hype’in reports of randomized controlled trials. English Specif. Purp. 54, 139–151 (2019). [Google Scholar]
  • 25.Millar N., Batalo B., Budgell B., Trends in the use of promotional language (hype) in abstracts of successful national institutes of health grant applications, 1985–2020. JAMA Network Open 5, e2228676 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Downs J. S., Prescriptive scientific narratives for communicating usable science. Proc. Natl. Acad. Sci. U.S.A. 111, 13627–13633 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Dahlstrom M. F., Using narratives and storytelling to communicate science with nonexpert audiences. Proc. Natl. Acad. Sci. U.S.A. 111, 13614–13620 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Boutron I., Ravaud P., Misrepresentation and distortion of research in biomedical literature. Proc. Natl. Acad. Sci. U.S.A. 115, 2613–2619 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Intemann K., Understanding the problem of “hype”: Exaggeration, values, and trust in science. Can. J. Philos. 52, 279–294 (2022). [Google Scholar]
  • 30.Peng H., Romero D. M., Horvát E. -Á., Dynamics of cross-platform attention to retracted papers. Proc. Natl. Acad. Sci. U.S.A. 119, e2119086119 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Campbell D. T., Fiske D. W., Convergent and discriminant validation by the multitrait-multimethod matrix. Psychol. bull. 56, 81 (1959). [PubMed] [Google Scholar]
  • 32.Citron F. M., Gray M. A., Critchley H. D., Weekes B. S., Ferstl E. C., Emotional valence and arousal affect reading in an interactive way: Neuroimaging evidence for an approach-withdrawal framework. Neuropsychologia 56, 79–89 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Nicolas G., Bai X., Fiske S. T., Automated dictionary creation for analyzing text: An illustration from stereotype content. PsyArXiv [Preprint] (2019). 10.31234/osf.io/afm8k (Accessed 3 April 2024). [DOI]
  • 34.Fontana M., Iori M., Montobbio F., Sinatra R., New and atypical combinations: An assessment of novelty and interdisciplinarity. Res. Policy 49, 104063 (2020). [Google Scholar]
  • 35.Uzzi B., Mukherjee S., Stringer M., Jones B., Atypical combinations and scientific impact. Science 342, 468–472 (2013). [DOI] [PubMed] [Google Scholar]
  • 36.Yang Y., Tian T. Y., Woodruff T. K., Jones B. F., Uzzi B., Gender-diverse teams produce more novel and higher-impact scientific ideas. Proc. Natl. Acad. Sci. U.S.A. 119, e2200841119 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Wu L., Wang D., Evans J. A., Large teams develop and small teams disrupt science and technology. Nature 566, 378 (2019). [DOI] [PubMed] [Google Scholar]
  • 38.Leahey E., Beckman C. M., Stanko T. L., Prominent but less productive: The impact of interdisciplinarity on scientists’ research. Adm. Sci. Q. 62, 105–139 (2017). [Google Scholar]
  • 39.Teplitskiy M., Peng H., Blasco A., Lakhani K. R., Is novel research worth doing? Evidence from peer review at 49 journals. Proc. Natl. Acad. Sci. U.S.A. 119, e2118046119 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Abramo G., D’Angelo C., Di Costa F., Citations versus journal impact factor as proxy of quality: Could the latter ever be preferable? Scientometrics 84, 821–833 (2010). [Google Scholar]
  • 41.Stringer M. J., Sales-Pardo M., Amaral L. A., Statistical validation of a global model for the distribution of the ultimate number of citations accrued by papers published in a scientific journal. J. Am. Soc. Inf. Sci. Technol. 61, 1377–1385 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Wood T. J., Exploring the role of first impressions in rater-based assessments. Adv. Health Sci. Educ. 19, 409–427 (2014). [DOI] [PubMed] [Google Scholar]
  • 43.Gross K., Bergstrom C. T., Why ex post peer review encourages high-risk research while ex ante review discourages it. Proc. Natl. Acad. Sci. U.S.A. 118, e2111615118 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Peng H., Ke Q., Budak C., Romero D. M., Ahn Y.-Y., Neural embeddings of scholarly periodicals reveal complex disciplinary organizations. Sci. Adv. 7, eabb9004 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Börner K., et al. , Design and update of a classification system: The UCSD map of science. PLoS One 7, e39464 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Luo J., et al. , Analyzing sentiments in peer review reports: Evidence from two science funding agencies. Quant. Sci. Stud. 2, 1271–1295 (2021). [Google Scholar]
  • 47.Beltagy I., Lo K., Cohan A., “SciBERT: A pretrained language model for scientific text” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Inui K., Jiang J., Ng V., Wan X., Eds. (Association for Computational Linguistics, 2019), pp. 3615–3620, 10.18653/v1/D19-1371. [DOI]
  • 48.Kotelnikova A. V., Vychegzhanin S. V., Kotelnikov E. V., Cross-domain sentiment analysis based on small in-domain fine-tuning. IEEE Access 11, 41061–41074 (2023). [Google Scholar]
  • 49.Raftery A. E., Bayesian model selection in social research. Soc. Methodol. 25, 111–163 (1995). [Google Scholar]
  • 50.Jin C., Ma Y., Uzzi B., Scientific prizes and the extraordinary growth of scientific topics. Nat. Commun. 12, 5619 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Guimerà R., Uzzi B., Spiro J., Amaral L. A. N., Team assembly mechanisms determine collaboration network structure and team performance. Science 308, 697–702 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Horvát E.-Á., Uzzi B., Virtual collaboration hinders a key component of creativity. Nature 605, 38–39 (2022). [DOI] [PubMed] [Google Scholar]
  • 53.Ma Y., Oliveira D. F., Woodruff T. K., Uzzi B., Women who win prizes get less money and prestige. Nature 565, 287–288 (2019). [DOI] [PubMed] [Google Scholar]
  • 54.Peng H., Teplitskiy M., Jurgens D., Author mentions in science news reveal widespread disparities across name-inferred ethnicities. Quant. Sci. Stud. 5, 1–15 (2024). [Google Scholar]
  • 55.Lazer D. M., et al. , The science of fake news. Science 359, 1094–1096 (2018). [DOI] [PubMed] [Google Scholar]
  • 56.Peng H., Teplitskiy M., Romero D. M., Horvát E.-Á., The gender gap in scholarly self-promotion on social media. arXiv [Preprint] (2022). 10.48550/arXiv.2206.05330 (Accessed 1 September 2023). [DOI]
  • 57.Lerner J. S., Li Y., Valdesolo P., Kassam K. S., Emotion and decision making. Annu. Rev. Psychol. 66, 799–823 (2015). [DOI] [PubMed] [Google Scholar]
  • 58.Liu B., Govindan R., Uzzi B., Do emotions expressed online correlate with actual changes in decision-making?: The case of stock day traders. PLoS One 11, e0144945 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Foster J. G., Rzhetsky A., Evans J. A., Tradition and innovation in scientists’ research strategies. Am. Soc. Rev. 80, 875–908 (2015). [Google Scholar]
  • 60.Jones B. F., The burden of knowledge and the “Death of the Renaissance Man”: Is innovation getting harder? Rev. Econ. Stud. 76, 283–317 (2009). [Google Scholar]
  • 61.Mukherjee S., Romero D. M., Jones B., Uzzi B., The nearly universal link between the age of past knowledge and tomorrows breakthroughs in science and technology: The hotspot. Sci. Adv. 3, 2017 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Sasahara K., et al. , Social influence and unfollowing accelerate the emergence of echo chambers. J. Comput. Soc. Sci. 4, 381–402 (2021). [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Appendix 01 (PDF)

Data Availability Statement

Some study data available. (The deidentified NNF grant application dataset is available upon request through an NDA process with the NNF. The NSF and NIH datasets used in our analysis are subject to their University's NDA process.)


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences

RESOURCES