Abstract
Interest in team diversity initiatives has grown significantly over the past decade. Some initiatives focus on creating “highly variable” teams where members bring a wide range of attributes. Others prioritize “highly atypical” teams, where members contribute attributes underrepresented within the broader organization or field, regardless of variety. These two approaches entail markedly different assumptions about maximizing team diversity’s benefits. Comparing short- and long-term outcomes provides important insights into cultivating and leveraging diverse teams. To do so, we examined the proposal submissions of all variable and atypical teams within a competitive seed grant program over six years. We assessed short-term performance based on funding outcomes following a three-stage review process and long-term viability based on team members’ tendency to collaborate more in the future. Our findings demonstrate that diversity operates differently when conceptualized as variability versus atypicality. Specifically, while team variability often resulted in neutral or even negative short-term performance, it had a mixed effect on long-term viability. Conversely, while team atypicality had a mixed impact on short-term performance, it consistently enhanced long-term viability. These results underscore the distinctive value of nurturing highly atypical teams to promote lasting collaboration success and highlight the importance of aligning diversity cultivation strategies with organizations’ short- and long-term goals.
Keywords: Teams, Diversity, Performance, Viability, Team effectiveness, Team composition
Subject terms: Human behaviour, Computational science
Introduction
Diversity in modern organizations has evolved from a progressive ideal into a strategic necessity1. Over the past decade, organizational efforts to build diverse teams have proliferated, driven by a growing recognition of diversity’s benefits to the workplace2–5. At the same time, there is an increasing awareness that no single approach to cultivating diversity can fully address the challenges faced by different organizations6,7.
Consider a hypothetical space agency planning a mission to the Moon and aiming to reap the benefits of diversity. The agency is deciding between two crew options. Option A is a balanced crew of two males and two females. Option B is an all-female four-person crew. Both options align with the agency’s commitment to diversity, but each follows a different approach to achieve it. Either option could serve as an exemplar diversity-promotion strategy, particularly considering that as of 2024, only men have set foot on the Moon8.
The two options conceptualize team diversity through different lenses. Option A reflects a “variability approach,” where diversity is a property of team members’ attributes relative to one another. This approach emphasizes variation within the team2, valuing different perspectives and constructive conflict that can enhance collaboration and problem-solving2,9–12. In contrast, Option B exemplifies an “atypicality approach,” where diversity is defined by a team’s deviation from the norm existing within the broader ecosystem of teams13. While an all-female crew lacks gender variability internally, its uniqueness relative to historical space missions highlights the benefits of inclusivity and collective intelligence, positioning this option as equally valuable for advancing diversity14,15.
The distinction between team diversity as variability and as atypicality goes beyond academic relevance. Organizations have demonstrated the value of each approach to leveraging diversity within work teams, sometimes without even realizing it. A good example of the variability approach can be seen in the context of vehicle safety. For decades, traditional safety rules mandated the use of crash test dummies modeled on the average male body-type, thereby ignoring the unique vulnerabilities experienced by women. As a result, up until 202216, women were 73% more likely to sustain injuries and 17% more likely to die in vehicle accidents compared to men17. This large difference highlights the importance of assembling a team of engineering and testing professionals from multiple different groups—diversity as variability—to identify and address issues that might go unnoticed by a homogeneous team.
On the other hand, the atypicality approach is perhaps best exemplified by the 2014 release of Bumble, a dating app that allows women to make the first move18. Bumble was led by an all-female team19 and challenged the norms of heterosexual dating by giving women the control to initiate matches. By breaking away from the convention that men are the typical initiators in dating situations, Bumble became the most popular dating app in the United States based on the number of monthly app downloads20. This success highlights the impact of assembling a team composed only of underrepresented group members—diversity as atypicality—to provide solutions to previously unmet societal needs.
Together, these examples underscore the unique value of both diversity approaches: variability for addressing critical gaps within a system and atypicality for challenging norms and driving change. However, research also shows that while all forms of diversity may enhance team effectiveness, they can also impede it. Hence, the question arises: Which approach to forming diverse teams in organizations is better for team effectiveness?
Answering this question requires us to first clarify the dimensions along which organizations measure team effectiveness. More often than not, teams get assigned a series of interdependent tasks21 or, in fields like consulting, a successful project can result in additional projects with either the same client or new ones. This calls for teams to balance achieving immediate goals with productive future collaboration. Further, given the significant investment of time and resources in recruiting and developing teams, managers should focus on forming teams that excel in the short run and remain viable in the long run. This dual focus on short-term performance and long-term viability has been recognized as crucial for overall team effectiveness in all major reviews of teams research22–25.
Evaluating the effects of team diversity, therefore, needs to be put into the context of how diversity affects performance and viability23. Whereas performance captures the degree to which a team achieves its near-term goals, viability captures the degree to which it can sustain itself as a social entity with the capacity to continue performing in the long term. Organizations want both: a successful initiative creates high-performing, viable teams. We examine the effects of diversity, as variability versus atypicality, on both of these outcomes. Figure 1 illustrates the distinction between variability and atypicality in four-member teams with different demographic proportions.
Fig. 1.
Schematic Representation of Team Diversity as Variability versus Atypicality. For a four-person team with one minority classification (e.g., female), this illustration shows how different team compositions would be considered either high variability, high atypicality, or somewhere in between.
Team diversity as variability
The variability-based definition of diversity builds upon the notions of team heterogeneity26 or variety2, where teams are seen to comprise individuals with differing perceptions and consequently improved decision-making and problem-solving ability. This advantage is supported by information processing theory10 and resource-dependence theory11. Both theories argue that teams with heterogeneous members have a greater ability to access a wider range of resources and viewpoints, thus increasing their potential to achieve goals9.
However, variability can also introduce challenges within a team. According to social categorization theory27, individuals create subgroups based on perceived commonalities, which may spur intra-group conflict28. Similarity-attraction theory29 suggests that trust develops more slowly in diverse teams, as members may feel less comfortable collaborating with those who are different from themselves4. These dynamics can cause friction, hinder communication and coordination, and potentially weaken short-term performance even in the presence of the cognitive benefits of diversity.
In terms of long-term team viability, diversity-related complexities may persist and even amplify. Research has shown that collaborators with different backgrounds face higher communication, coordination, and integration challenges30–32. Over time, these challenges tend to intensify instead of diminish33,34. One primary reason for this effect is that members of diverse teams may harbor implicit biases or stereotypes that can influence their perceptions and behaviors toward one another35,36. While one might expect these stereotypes to attenuate over time due to repeated interactions, research shows that the opposite often holds—that is, as team members gain more information about each other, “violations of prescriptive stereotypes by means of counterstereotypical behaviors can lead to social and economic reprisals, the so-called backlash effect”34.
In other words, while variable teams offer many cognitive and resource-related advantages, they also face significant social and relational challenges that may undermine short-term performance and long-term viability. Overcoming these challenges requires deliberate efforts in the development of trust, reduction of biases, and building effective team cohesion.
Team diversity as atypicality
Defining diversity as atypicality draws attention to teams that differ from the normative composition found in a broader organizational context. Atypical teams—those showing high minority presence but possibly low internal diversity—often perform well due to members’ cohesion and shared identity. When most team members belong to the minority group, they are no longer seen as outsiders or “deviants”13,37, according to social categorization theory27. This promotes inclusiveness, trust, and communication, and therefore enhances collaboration and group intelligence14,15. Such dynamics enable better team performance, specifically in cases where cohesiveness and a common objective are important3.
Yet, atypical teams face unique difficulties. Research on intergroup bias36 indicates that members of atypical teams may face more acute competition for recognition because of their shared minority status. A team composed predominantly of minority identities can amplify concerns about collective stereotyping or being marginalized in the broader organizational environment. Therefore, individuals may feel the need to highlight their own contributions, which can lead to interpersonal conflict, lower group cohesion, and lower group performance38,39. Further, the need to shine—a central motivator for some people, called individuation—is diminished in atypical teams40. For team members who value a sense of distinctiveness, unmet individuation needs would reduce collaborative behavior39,41–43. Such individual-level challenges often surface at the team level, possibly at the expense of short-term performance.
In terms of long-term team viability, atypical teams benefit from their internal homogeneity, which may compensate for some of the coordination challenges faced by variable teams. However, the very distinct features that make these teams atypical can also become a psychological burden. Research in psychology has shown that atypical teams may feel like “underdogs”44—perceived to lack credibility due to their deviation from the norms prevalent in the organization or ecosystem. Such perception may trigger a shared desire to prove others wrong, leading to increased collaboration over time. This reasoning is in line with studies on perceived discrimination as a motivator of group identification and cohesion45.
In conclusion, atypical teams experience advantages from a common identity and cohesion that improve collaboration and group intelligence, thus positioning them for sustained success. However, these teams are required to manage pressures associated with recognition, individuation, and perceptions of credibility, which may hinder short-term performance and, yet, simultaneously motivate long-term viability.
The current study
This study explores the impact of team diversity, conceptualized as variability versus atypicality, on team effectiveness in the context of the National Institutes of Health (NIH) Clinical and Translational Science Award (CTSA) Pilot Grant Program. This program solicits proposals to help advance the effective translation of basic science research to clinical contexts. We analyze all team proposals submitted to the competition at a U.S. university over a six-year period, specifically examining how the variability and atypicality of teams are associated with their short-term performance and long-term viability. Variability and atypicality are operationalized based on the team’s gender, ethnic, and racial composition. Short-term performance is operationalized as immediate funding success following a rigorous three-stage review process. Long-term viability is operationalized as the propensity for future collaboration among team members beyond the duration of the grant.
The findings of this study reveal surprising effects of team diversity, where variability and atypicality influence short-term performance and long-term viability in different ways. This study questions the received wisdom regarding the impact of diversity on team effectiveness, with implications for both organizational theory and practice. It therefore underlines the need for a more nuanced, context-sensitive understanding of team diversity46,47.
Results
We used all team proposal submissions to a Clinical and Translational Science Award (CTSA) competition hosted at a U.S. university and funded by the U.S. National Institutes of Health (NIH). A total of 217 proposals were submitted between 2014 and 2019 across nine rounds of grant competition. Given our focus on examining the performance of diverse teams, we excluded 110 solo-authored proposals. Additionally, one proposal was excluded due to data collection issues. The final dataset comprises 271 unique investigators who collaborated on 106 grant proposals.
Of the 106 proposals submitted over the six years, 16 were funded (15%), while 90 were not funded. Additional descriptive statistics and correlations for the teams are provided in SI Table S1. Figure 2 illustrates the 106 teams as a collaboration network, with nodes representing each of the 271 investigators and edges denoting collaboration on a grant submission. Viewing the data as a network offers valuable insights into collaboration patterns, making it easier to identify recurring partnerships (depicted by thicker edge weights) and key roles within the network. For example, within the largest cluster in Fig. 2, the central node represents an investigator who acted as a broker linking many other investigators. A quick examination revealed that this investigator—a Non-Hispanic, White, male—collaborated with 12 different investigators on three different proposal submissions over three rounds of grant competition, of which only one was funded.
Fig. 2.
Proposal Collaboration Network. Collaboration network of the 106 team proposals submitted over a six-year period. Nodes represent investigators, edges represent collaboration on a proposal submission, node size reflects degree, and edge weight reflects repeat collaborations. The largest cluster consists of 18 unique investigators who make up 4 different proposal teams. Of these 18 investigators, 9 were successful in receiving funding.
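The network construction described in this caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the team rosters and investigator IDs are hypothetical, and the function name `build_collaboration_network` is an assumption introduced here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical proposal teams; each set holds made-up investigator IDs.
teams = [
    {"A", "B", "C"},
    {"A", "B"},
    {"C", "D"},
]

def build_collaboration_network(teams):
    """Return (edge_weights, degree) for an undirected co-submission network."""
    edge_weights = Counter()
    for team in teams:
        # Every pair of investigators on a proposal shares one edge;
        # each repeat collaboration adds 1 to that edge's weight.
        for u, v in combinations(sorted(team), 2):
            edge_weights[(u, v)] += 1
    degree = Counter()
    for u, v in edge_weights:
        # Degree counts distinct collaborators, not repeat submissions.
        degree[u] += 1
        degree[v] += 1
    return edge_weights, degree

edge_weights, degree = build_collaboration_network(teams)
print(edge_weights[("A", "B")])  # weight of the repeated A-B collaboration
print(degree.most_common(1))     # investigator with the most distinct collaborators
```

In this toy input, investigators A and B appear together on two proposals, so their edge carries a larger weight, while C links otherwise separate collaborators, which is the kind of brokerage position discussed in the text.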
Short-term performance of variable versus atypical teams
A team’s short-term performance was measured by funding status, that is, whether a proposal was Awarded or not. Each CTSA proposal underwent a three-stage review process to determine its funding potential. First, the proposal received a Reviewers score computed as the average of the scores provided by each reviewer. Following discussion among the reviewers, the proposal received an Adjusted score. Both of these scores were based on the NIH grant application scoring system, which uses a 9-point rating scale (1 = exceptional; 9 = poor) in whole numbers48. For ease of interpretation, we reversed the scale so that a proposal receiving a “high score” corresponds to a higher point rating. Finally, an advisory panel reviewed all proposals and their scores and decided whether the proposal was awarded. Note that all three stages of the review process were included in our analysis to explore how variability and atypicality were evaluated at each stage leading up to the ultimate award decision, which served as our primary short-term team performance metric.
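The scale reversal and averaging described above can be sketched as follows. The text states only that the scale was reversed; the specific mapping used here (1 becomes 9, 9 becomes 1) and the function names are assumptions for illustration.

```python
from statistics import mean

NIH_MIN, NIH_MAX = 1, 9  # NIH scale: 1 = exceptional, 9 = poor

def reverse_score(score):
    """Flip a 1-9 NIH rating so that higher values mean stronger proposals."""
    if not NIH_MIN <= score <= NIH_MAX:
        raise ValueError("NIH scores must lie between 1 and 9")
    # Assumed mapping: mirror the score around the midpoint of the scale.
    return NIH_MIN + NIH_MAX - score

def reviewers_score(raw_scores):
    """Average the reversed scores given by the individual reviewers."""
    return mean(reverse_score(s) for s in raw_scores)

# Hypothetical raw ratings from three reviewers.
print(reviewers_score([2, 3, 4]))
```

Because the reversal is linear, averaging before or after reversing gives the same Reviewers score.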
To measure the variability versus atypicality of the proposal teams, we examined the gender, ethnic, and racial affiliations of team members. We used Blau’s index of heterogeneity26 to calculate team-level variability measures. We used the percentage of investigators occupying each of the underrepresented gender, ethnicity, and race categories to compute team-level atypicality measures. The distribution of team atypicality measures across the 106 proposal teams is provided in SI Figure S1. More details on data sources and operationalizations are provided in the Methods section.
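The two measures can be illustrated with a short sketch. Blau's index is the standard 1 − Σ p_k², where p_k is the share of team members in category k. The atypicality function below follows the description above (share of members in an underrepresented category), but its exact form, the category labels, and the function names are assumptions; the authors' operationalization is given in their Methods section.

```python
from collections import Counter

def blau_index(categories):
    """Blau's heterogeneity index: 1 minus the sum of squared category shares."""
    n = len(categories)
    if n == 0:
        raise ValueError("team must have at least one member")
    counts = Counter(categories)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def atypicality(categories, underrepresented):
    """Share of team members who belong to an underrepresented category."""
    n = len(categories)
    if n == 0:
        raise ValueError("team must have at least one member")
    return sum(1 for c in categories if c in underrepresented) / n

# The two hypothetical crews from the Introduction (labels are illustrative).
option_a = ["F", "F", "M", "M"]  # balanced crew
option_b = ["F", "F", "F", "F"]  # all-female crew

for name, crew in (("Option A", option_a), ("Option B", option_b)):
    print(name, blau_index(crew), atypicality(crew, {"F"}))
```

The balanced crew maximizes variability for two categories while sitting at mid-range atypicality, whereas the all-female crew has zero variability and maximal atypicality, which is why the two measures can point in different directions for the same team.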
To assess the impact of team variability and atypicality on short-term performance, we employed a fixed effect generalized linear regression to predict the reviewers’ score and adjusted score, as well as fixed effect probit regression to predict the final award status. Further details about these methods are available in the Analytical Approach section of the Methods section.
Our analysis revealed that higher variability had either no effect or negatively affected short-term team performance, whereas higher atypicality showed mixed effects, as summarized in Fig. 3 and SI Figure S2. Specifically, higher gender variability did not significantly affect scores or funding outcomes. Higher ethnic variability negatively impacted scores but not funding success (SI Table S2: Model 2, β = −0.935, p-value < 0.05 and Model 5, β = −1.020, p-value < 0.05). Higher racial variability did not significantly affect scores, but did significantly reduce the likelihood of funding success (SI Table S2: Model 8, β = −1.640, p-value < 0.05), with marginal effects indicating a 64% lower chance of funding for teams with high racial variability compared to teams with low racial variability (SI Table S3).
In contrast, teams with high gender atypicality were more likely to receive high scores and funding (SI Table S2: Model 3, β = 0.609, p-value < 0.05; Model 6, β = 0.900, p-value < 0.10; Model 9, β = 0.969, p-value < 0.01), with marginal effects showing a 139% higher funding likelihood for teams with higher gender atypicality (SI Table S3). On the other hand, high ethnic atypicality negatively impacted scores but had no impact on funding success (SI Table S2: Model 3, β = −1.034, p-value < 0.01 and Model 6, β = −1.302, p-value < 0.05). Finally, high racial atypicality negatively impacted scores and significantly reduced the likelihood of funding success (SI Table S2: Model 6, β = −0.283, p-value < 0.10; Model 9, β = −1.456, p-value < 0.05), with marginal effects showing a 63% lower likelihood of receiving funding (SI Table S3).
To ensure the robustness of our results, we controlled for several variables potentially affecting funding success. These include team size, prior collaboration, educational level, tenure, prior citations, external funding, cognitive similarity, and social network characteristics such as local brokerage position and global closure within the larger scientific ecosystem. We provide further details in Supplementary Information.
Effects of short-term performance on long-term viability of variable versus atypical teams
While it is critical to understand how team diversity affects short-term performance, it is also important to understand how team diversity may moderate the effect of short-term performance on team long-term viability49. Team viability refers to “a group’s potential to retain its members—a condition necessary for proper group functioning over time”22.
We examined the effects of short-term performance on long-term viability of teams composed of variable or atypical combinations of members. We operationalized long-term team viability using Publication count, i.e. the number of publications co-authored by at least two investigators on the original proposal team, published within five years after the proposal submission year. Studies have shown that it takes approximately five years after receiving funding to attain maximum publication output50.
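The Publication count measure can be sketched as below. The record layout, the function name, and the treatment of the window boundaries (strictly after the submission year, up to and including five years later) are assumptions made for illustration; all data are hypothetical.

```python
def publication_count(team, submission_year, publications, window=5):
    """Count papers co-authored by at least two team members within the window."""
    team = set(team)
    count = 0
    for year, authors in publications:
        # Assumed window: years after submission, up to `window` years later.
        in_window = submission_year < year <= submission_year + window
        # A paper qualifies when two or more original team members co-author it.
        if in_window and len(team & set(authors)) >= 2:
            count += 1
    return count

# Hypothetical publication records: (year, set of author IDs).
publications = [
    (2016, {"A", "B", "X"}),  # two team members, inside the window
    (2017, {"A", "Y"}),       # only one team member
    (2021, {"A", "B"}),       # outside the five-year window
    (2015, {"B", "C"}),       # published in the submission year itself
]

print(publication_count({"A", "B", "C"}, 2015, publications))
```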
Given the non-experimental nature of our data, we leveraged a Difference-in-Differences (DiD) approach to estimate the effect of the proposal’s award status on future collaboration among proposal team members. In an ideal experimental setting, we would randomly allocate the funding to proposal teams and ensure a balance of unobserved characteristics across both awarded and unawarded teams. This would allow us to identify the causal effect of funding by comparing the future co-authored publication counts of those who received and did not receive the award. However, the CTSA grant program does not fund proposal teams randomly, implying that funding is likely to positively correlate with some unobserved characteristics that could also affect future collaboration. Therefore, a simple comparison of future co-authored publication counts between awarded and unawarded teams is likely to be biased upwards51.
A DiD procedure requires an outcome observed for two groups and two time periods where one group is exposed to a treatment only in the second time period. In this study, the two groups were the awarded and unawarded proposal teams and the two time periods represented the five-year time windows before and after the proposal submission date. To accommodate the five-year post-submission observation window, we restricted our sample to only those proposals that were submitted before 2017, resulting in a new sample size of 47 proposals and 470 observations. By grouping this panel data into two time periods, we alleviate potential serial autocorrelation bias52.
The model formulation and information on whether the assumption of parallel trends is satisfied are detailed in the Analytical Approach section of the Methods section. We included the same control variables with the exception of prior collaboration to address endogeneity (see Control Variables in Supplementary Information). The corresponding results are summarized in Fig. 4 and SI Figure S3.
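The logic of the two-group, two-period comparison can be sketched with the basic unadjusted DiD estimator. This is not the authors' regression model, which includes control variables and moderators and is specified in their Methods section; the function name and the publication counts below are hypothetical.

```python
from statistics import mean

def did_estimate(rows):
    """Unadjusted 2x2 DiD from rows of (awarded, post_period, outcome)."""
    cells = {}
    for awarded, post, outcome in rows:
        cells.setdefault((awarded, post), []).append(outcome)
    m = {key: mean(values) for key, values in cells.items()}
    # Change over time among awarded teams ...
    treated_change = m[(True, True)] - m[(True, False)]
    # ... minus the change over time among unawarded teams.
    control_change = m[(False, True)] - m[(False, False)]
    return treated_change - control_change

# Hypothetical co-authored publication counts per team and period.
rows = [
    (True, False, 4), (True, False, 6),    # awarded, pre-submission window
    (True, True, 5), (True, True, 7),      # awarded, post-submission window
    (False, False, 2), (False, False, 4),  # unawarded, pre-submission window
    (False, True, 5), (False, True, 5),    # unawarded, post-submission window
]

print(did_estimate(rows))
```

In this toy example both groups publish more after submission, but the unawarded teams gain more, so the estimate is negative. Subtracting the unawarded teams' change is what removes time trends common to both groups, under the parallel trends assumption.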
Fig. 3.
Effects of Team Diversity as Variability versus Atypicality on Short-term Team Performance. The three columns represent the effect of team gender, ethnic, and racial variability (upper panel) and atypicality (lower panel) on proposal outcomes across three metrics: Reviewers’ Score, Adjusted Score, and Awarded Status. Outcomes were predicted using fixed effect generalized linear regression for the Reviewers’ and Adjusted scores, and fixed effect probit regression for the final Award status, accounting for within-year differences and controlling for unobserved invariant proposal characteristics. Robust standard errors were estimated to address non-independence of observations from investigators who submitted proposals over multiple years. Bars represent the 95% confidence intervals (CIs) for the estimates, calculated using these robust standard errors. + p < 0.10, * p < 0.05, ** p < 0.01, *** p < 0.001 indicate statistical significance of the predictors.
As shown in Fig. 4, we found that strong short-term team performance does not necessarily predict long-term team viability. In fact, teams that were funded tended to exhibit decreased collaboration over time (Fig. 4, SI Table S4 and SI Table S5: Model 2, coefficient = −1.016, p-value < 0.05). When examining the moderating effects of team variability and atypicality on this relationship, we found that funding negatively predicted long-term viability when the team exhibited higher gender variability (SI Table S4: Model 6, coefficient = −4.352, p-value < 0.05). Conversely, funding positively influenced long-term viability when the team displayed higher ethnic variability (SI Table S4: Model 6, coefficient = 4.335, p-value < 0.05) or higher racial variability (SI Table S4: Model 6, coefficient = 4.271, p-value < 0.05). Note that including variability as a moderator provided better model fit than the simple model as demonstrated by the difference in Chi-squared statistics (Δχ² > 12, df = 6, p-value < 0.05)53.
In terms of atypicality, the effects were consistently positive. We found that funding positively predicted long-term viability when the proposal team was characterized by higher gender atypicality (SI Table S5: Model 6, coefficient = 7.390, p-value < 0.001) or higher racial atypicality (SI Table S5: Model 6, coefficient = 7.017, p-value < 0.001). The effect for ethnic atypicality was inconclusive. Again, the inclusion of atypicality as a moderator improved the model fit (Δχ² > 22, df = 6, p-value < 0.001)53.
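The model-fit comparisons reported above rest on referring a Chi-squared difference to a Chi-squared distribution with 6 degrees of freedom. The sketch below shows how such a difference maps to a p-value, using the closed-form upper tail that holds for even degrees of freedom. The function name is an assumption, and the input values are the standard tabulated critical values for df = 6, not statistics reported by the study.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x < 0:
        raise ValueError("chi-squared statistics are non-negative")
    half = x / 2.0
    # For df = 2k: P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    series = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * series

# Tabulated critical values for df = 6 at the 0.05 and 0.001 levels.
print(chi2_sf_even_df(12.592, 6))
print(chi2_sf_even_df(22.458, 6))
```

A difference above roughly 12.59 therefore implies p < 0.05, and one above roughly 22.46 implies p < 0.001, which is consistent with the thresholds quoted for the variability and atypicality moderator models.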
Discussion
The importance of diversity in organizational teams is widely recognized, prompting an increase in interventions to enhance team diversity. However, there remains a pressing need for research that clarifies the effects of different conceptualizations of diversity. This study suggests that team diversity influences short-term team performance and long-term team viability differently when viewed through the distinct lenses of variability (diversity as a mix of characteristics) and atypicality (diversity as deviation from a norm).
Our findings reveal nuanced patterns across gender, race, and ethnicity. All-female teams outperform mixed-gender teams in both short-term performance and long-term viability. In contrast, racial diversity yields advantages under both conceptualizations (atypicality and variability), suggesting that interventions promoting either form of racial diversity can lead to meaningful benefits. For ethnic diversity, however, the benefits emerge primarily from variability, and not atypicality. Thus, while the old adage that differences make a difference remains true, we show that the nature of those differences matters as well.
Two important conclusions emerge from our study. First, how one conceptualizes diversity needs to be explicit in designing and evaluating interventions. In other words: is the goal to create teams that exhibit different perspectives, expertise, or backgrounds (variability), or to create teams where all members share an underrepresented status (atypicality)? Second, the targeted dimension of diversity—gender, race, or ethnicity—shapes team outcomes in distinct ways. For example, all-female teams produce benefits not observed among all-Hispanic teams, while mixed-race and mixed-ethnicity teams produce advantages that do not appear in mixed-gender teams.
This study contributes to the existing literature by offering a direct comparison of short- and long-term effectiveness under conditions of high variability and high atypicality across gender, race, and ethnicity. Three particularly surprising findings illustrate the utility of this study.
The first surprising finding is that high team variability does not significantly improve short-term team performance, whereas high team atypicality has a significant effect that varies based on the demographic attribute. Specifically, while high gender variability does not have a considerable impact on short-term team performance, high gender atypicality yields significant positive results. The existing literature suggests that the impact of gender diversity on performance depends on the normative acceptance of such diversity in the institutional environment54. While all teams in this study were part of the same grant competition, there are contextual factors—particular proposal requirements or implicit biases—that may have influenced views about gender diversity and deviance, bringing about divergent outcomes. A number of theories offer an explanation for these effects. Critical mass theory proposes that a person who is underrepresented—called a “token”55—does not have enough power to bring about meaningful change. In contrast, the realization of a sizable proportion of underrepresented members is a prerequisite to wield significant influence. This phenomenon is particularly relevant to the subject of gender. For example, studies of corporate boards show that women who are board members participate more actively when women hold a sizable number of seats56–58. Similarly, research in creative fields like video game design shows that gender diversity leads to creativity only when coupled with conscious inclusion efforts. One study aptly noted that “the minimal presence of a gender minority is not effective”59. In the current study, the absence of a significant effect of high gender variability on short-term performance contrasts with the presence of a positive, significant effect of high gender atypicality on short-term performance. 
In other words, this research suggests that the lack of desired impacts in teams with high gender variability indicates failure to reach a critical mass level whereas those that have high gender atypicality—that is, more women than men—began to experience benefits upon crossing this threshold.
Another reason is the difficulty of handling deep-seated power imbalances stemming from gender diversity, which might impinge on team performance. Qualitative interviews with female executives who were the “first and only” women on corporate board teams reveal that they often felt pressured to represent stereotyped conceptions of their gender or minority status, they prepared more extensively for meetings due to reduced trust from peers, and they faced heightened visibility and scrutiny60. In this study, these stressors may have hindered the performance of high gender variability teams. By contrast, greater female representation in high gender atypicality teams likely alleviated these pressures, enabling better short-term performance.
Compared to gender, high levels of racial atypicality have a strongly negative impact on immediate performance. This finding illustrates the unique challenges that come with being in racially atypical teams. Unlike all-female teams, which often benefit from the shared experience of negotiating through male-dominated spaces, racially atypical teams—say, all non-white—may have members with varying subcultural backgrounds that introduce greater complexity into team dynamics. In our study, 27% of all non-white-only teams included both Black and Asian members, with even greater distinctions likely present at a more granular level. Differences in norms, values, and communication styles across these racial subgroups can lead to misunderstandings or conflict. For example, East Asians are often stereotyped as less assertive, which is an attribute less congruent with the norms of Western leadership—thus creating the “Bamboo Ceiling” in organizational advancement. On the other hand, South Asians, who have stereotypically been viewed as more assertive, experience fewer such barriers61. Subcultural differences may then surface in teams as fault lines, complicating collaboration and hindering performance.
Moreover, the phenomenon of outgroup homogeneity bias62—the tendency to perceive minority group members as highly similar—might amplify these challenges. People who are members of racially-atypical teams often resist being grouped together with others of similar backgrounds if they fear being stereotyped or reduced to a single narrative36. As one female in a corporate team noted, “I think women in my network are becoming a little bolder about supporting each other. I think there’s still a reluctance among African Americans to do that.”60 This observation suggests that solidarity, which often emerges in gender-atypical teams, may be less prevalent in racially atypical teams due to perceived similarity or competition within the group. Overall, results for high atypicality teams across gender and race suggest that team performance is contingent not only on the achievement of critical mass but also on the level of subcultural alignment within the team. Shared cohesive experiences may benefit gender-atypical teams while subcultural diversity, coupled with perceptions of homogeneity, may present added challenges for racial-atypical teams. These observations underline the need for an advanced understanding of diversity initiatives, tailored to the specific demographic dimension and its dynamics.
The second surprising finding shows a paradoxical relation between short-term performance and long-term viability: well-performing teams in the short-term do not necessarily achieve long-term viability. This finding contradicts the standard belief that a team’s early success forms a reliable indicator of the achievement of their goals later on. Previous research in similar contexts has shown a positive link between short-term funding success and the propensity for long-term collaboration among team members63,64. However, these studies differ from ours in how they define viability, in that they consider sustained collaboration as a sign of viability. Similarly, studies within the business venture domain that demonstrate a positive relationship between short-term success and long-term viability focus on the duration of the current venture rather than the number of subsequent ventures pursued collectively65,66. Our study, therefore, has a more dynamic benchmark of viability: an increase in collaboration amongst team members after funding compared to their levels before funding. The negative association observed between short-term performance and long-term viability suggests that teams that succeed in funding may, paradoxically, collaborate less over time.
One plausible explanation for this phenomenon is that successful teams concentrate their energies after the infusion of resources, prioritizing the creation of high-value outputs (such as impactful publications) rather than pursuing a large number of collaborations. While this explanation merits further exploration, the finding raises important questions concerning the basic motivations for team assembly. Specifically, it uncovers a tension between securing funding as a proximal goal and cultivating long-term collaborative relations. In organizational terms, the results suggest that the factors driving immediate success in securing short-term project goals or resources are not necessarily congruent with the relational and adaptive capacities critical to long-term collaboration. For leaders and organizations seeking to develop long-term collaboration and adaptability, this underlines the risk of relying on short-term performance as a predictor of long-term viability in teams.
The third surprising finding is that team variability and team atypicality significantly, yet differently, influence the link between short-term performance and long-term viability. Specifically, the effects of high variability differ by demographic attribute: while high gender variability weakens long-term collaboration following short-term success, high ethnic or racial variability strengthens it. The negative moderating effect of high gender variability may stem from persistent power imbalances existing within mixed-gender teams. Prior research shows disparity in experiences: women often report feeling stifled in gender-mixed collaborations due to men’s dominance in communication, leading them to avoid similar collaborations in the future. In contrast, men in the same teams report the experience as productive and express a greater willingness to continue participating67. This disparity in experiences—where mixed-gender teams are perceived as stifling by women but productive by men—may explain why such teams, despite their short-term success, struggle to increase collaboration over time.
On the other hand, ethnic and racial variability appears to foster greater long-term collaboration, particularly when teams achieve short-term success. This positive effect may reflect the role of success in reducing initial tensions often associated with diverse teams, such as anxiety when interracial interactions are infrequent68. Additionally, pro-ingroup bias36 (the tendency to evaluate one’s own group more favorably) may strengthen loyalty within ethnically or racially variable teams. In these cases, short-term success serves as external validation, reinforcing team cohesion and increasing members’ willingness to continue working together.
In contrast to the mixed effects of high variability, high atypicality teams that succeed in the short-term show a consistently positive relationship with long-term viability, regardless of the demographic attribute. For high gender atypicality teams (e.g., all-female teams), this may be driven by the absence of power imbalances that hinder gender-variable teams, combined with a strong sense of shared identity. Prior research shows that teams with dense internal connections—where trust and communication are strong—are generally more productive than those with sparse networks69, which may explain the increased collaboration observed in these teams. Interestingly, while high racial atypicality also fosters long-term collaboration following short-term success, the mechanisms appear distinct from those driving outcomes of high gender atypicality. As noted above, racially atypical teams face challenges in short-term performance, due to subcultural differences and outgroup homogeneity bias. However, when these teams achieve short-term success, such external validation seems to drive the collective motivation to collaborate more in the future.
Importantly, the significant moderating effects of variability and atypicality on the relationship between short-term performance and long-term viability reveal that the impact of team diversity on these two outcomes is more enduring and complex than previously thought. Prior research has indicated that the influence of diversity based on surface-level attributes (such as demographic differences) tends to wane over time, unlike diversity based on deeper level attributes (such as personality or values)70. While our findings do not speak to the deep-level characteristics of team members, the significant results associated with surface-level attributes such as gender, ethnicity, and race challenge the notion that they have a transient impact. Instead, we find that effects are long-lasting. More research is needed to understand why high variability and high atypicality in teams influence long-term viability differently.
Limitations and future research
These findings need to be considered in light of several limitations. First, the generalizability of our results is limited by the small sample of 106 teams, all drawn from within the context of a seed grant competition in a specific area of scientific research. In addition, the Request for Proposals or the eligibility criteria of the grant competition might have influenced team composition—for example, targeting preferred applicants or addressing disease prevalence—thereby further limiting broader applicability. Second, while we examined surface-level attributes of team members as measures of variability and atypicality, our analysis did not account for a more comprehensive list of minority groups, such as American Indians, Alaska Natives, Native Hawaiians, or other Pacific Islanders, which may limit the scope of diversity captured. Lastly, a more in-depth investigation into the intersectionality between the racial and ethnic identities will be necessary to fully understand the experiences involved for the underrepresented minority groups within the labor force.
Several other areas warrant further investigation. First, future research should examine the effects of deep-level attributes and network dynamics on team outcomes in greater depth. In our study, we included educational level, prestige, cognitive similarity, and network capital as control variables, which revealed several significant relationships. For example, a team’s institutional tenure (a form of prestige), cognitive similarity, and brokerage position (a form of network capital) were significantly associated with short-term performance. Similarly, factors such as educational level, prior citations (another form of prestige), cognitive similarity, brokerage position, and network closure were linked to long-term team viability. These findings suggest that deep-level attributes and team structures interact with variability and atypicality in complex ways beyond surface-level diversity, influencing both short-term and long-term outcomes.
Second, our study was conducted in the context of translational science. Translational science aims to translate laboratory, clinic, and community observations into interventions in order to improve individual and public health. Extending the investigation of team diversity and its consequences for team outcomes to other settings, such as educational or corporate ones, will be valuable. Recent analyses of federal education and employment data show that demographic representation can vary significantly across different fields, suggesting that team diversity’s impacts may also manifest differently across fields. A large-scale study across many contexts would give a clearer picture of the impact of diversity on team dynamics and outcomes.
Moreover, future research should investigate the psychological mechanisms that underpin the observed effects. Our study highlighted the distinct influences of variability and atypicality on team outcomes, but not the mechanisms behind them. Understanding why atypical teams excel in situations where variable teams fail would give further insight into the cognitive and behavioral constituents of success. Further studies along these lines could help develop strategies for building more effective, cohesive, and high-performing teams. Finally, while this study focused on the overall relationship between two aspects of team diversity (variability and atypicality) and their combined effect on team effectiveness, future research should examine their individual effects on specific outcomes, such as creativity and innovation. A more in-depth analysis of these outcomes may provide greater insight into the types of diversity that are most beneficial for generating innovative solutions and breakthroughs within teams.
Methods
Data sources
Demographic information
We extracted demographic information about the investigators from the CTSA proposal cover sheets. This information was supplemented with data from investigators’ resumes and personal websites. Following the procedure of Lungeanu and Contractor71, we used text references (e.g., “her work”) and image searches on investigators’ personal websites and resumes to infer investigators’ gender information. To code ethnicity and race information, we cross-referenced three sources: NamSor72–74, a comprehensive first name database75, and a comprehensive surname database76. See Gender, Ethnicity, and Race Data Sources in Supplementary Information for more information about each source. We maintained a high degree of confidence in our ethnicity and race classifications by employing a combination of these three highly relied-on sources in our analysis. The race and ethnicity designations made by these three sources were in agreement for 178 (65.7%) of the investigators in this study. For the remaining 34.3% of investigators whose race and ethnicity designations conflicted across one or more of the three sources, a manual search of text references was performed by one co-author and two undergraduate research assistants. Importantly, we restricted our analyses to binary gender classifications and the Hispanic ethnic group due to insufficient information on proposal cover sheets and an inability to consistently infer more detailed information from researchers’ names across all classification databases.
Publication information
Next, we obtained each investigator’s publication records leading up to and after the proposal submission date using the Web of Science (WoS) database provided by Clarivate Analytics. WoS is an extensive database that includes information on authorship, location, institution, citations, journals, and keywords for all areas of science and engineering, social sciences, and humanities, with most types of data available from 1945 to today. We also supplemented WoS with Google Scholar and ScienceDirect to ensure that our bibliometric information was comprehensive and included all publications authored by each investigator pre- and post-proposal submission.
Measures
Independent variables and moderators
We operationalized diversity as variability and atypicality. We used Blau’s index of heterogeneity26 to calculate the team variability measures. Gender variability is based on female and male categories. Ethnic variability is based on Hispanic and non-Hispanic categories. Racial variability is based on White, Black, and Asian categories. Blau’s index quantifies the probability that two members randomly selected from a population will belong to different categories if the population size is infinite or if the sampling is carried out with replacement. Hence, if the index equals its minimum value (i.e., 0), all group members are classified in the same category, and there is no variability. In contrast, if the index equals its maximum value (e.g., 0.5 if there are only two categories), all group members are equally distributed among all categories, indicating the highest possible variability. We used the percentage of investigators in each of the underrepresented gender, ethnicity, and race categories to compute the team atypicality measures. Gender atypicality is the percentage of female investigators in the grant proposal team. Ethnic atypicality is measured as the percentage of Hispanic investigators in the team. Finally, racial atypicality is measured as the percentage of Non-White investigators in the team. The decision to treat Hispanic identity as an ethnicity and Non-White identity as a racial category follows established U.S. government and institutional guidelines, such as those used by the U.S. Census Bureau and NIH77–79. Atypical teams are those composed of underrepresented gender (i.e., female), ethnic (i.e., Hispanic), or racial groups (i.e., Non-White) because individuals belonging to these groups comprise such a small fraction of the overall scientific workforce80,81. Studies show that these populations are consistently underrepresented in scientific grant competitions82,83 and scientific publishing84,85. Indeed, the NIH released a statement explicitly identifying female, Hispanic, and Black scientists as three of the five underrepresented gender, ethnic, and racial populations in the U.S. biomedical, clinical, behavioral, and social sciences research enterprise86. The other two populations (American Indians or Alaska Natives and Native Hawaiians or other Pacific Islanders) are not represented in our dataset.
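To make the two measures concrete, the following is a minimal Python sketch of how Blau's index and the atypicality percentage could be computed for a single team. The function names `blau_index` and `atypicality` and the example rosters are hypothetical illustrations, not the code used in this study; atypicality is returned as a proportion between 0 and 1 rather than a percentage.

```python
from collections import Counter


def blau_index(categories):
    """Blau's index: 1 minus the sum of squared category proportions."""
    n = len(categories)
    if n == 0:
        raise ValueError("team must have at least one member")
    counts = Counter(categories)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def atypicality(categories, underrepresented):
    """Share of team members who belong to an underrepresented category."""
    n = len(categories)
    if n == 0:
        raise ValueError("team must have at least one member")
    return sum(1 for c in categories if c in underrepresented) / n


# Hypothetical rosters, for illustration only.
mixed_team = ["female", "female", "male", "male"]
all_female_team = ["female"] * 4

print(blau_index(mixed_team), atypicality(mixed_team, {"female"}))
print(blau_index(all_female_team), atypicality(all_female_team, {"female"}))
```

In this sketch the evenly mixed team reaches the two-category maximum of Blau's index, while the all-female team has no variability at all but the highest possible atypicality, which is why the two measures can rank the same team very differently.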
Analytical approach
We divided our analyses into two parts—one for each dependent variable. The first part examines the effect of variable versus atypical collaborations on funding success. The second part examines the impact of funding success on future collaboration among proposal team members and the moderating role of variable and atypical collaborations on this relationship.
Funding success
We used a fixed effect generalized linear regression to estimate the influence of variable or atypical collaboration on team proposal scores as follows:
$$Y_{i} = \beta_0 + \mathbf{X}_{i}\boldsymbol{\beta} + \gamma_{t} + \varepsilon_{i} \tag{1}$$

where, for proposal team $i$ submitting in year $t$, $Y_{i}$ is our outcome variable, which captures the team proposal scores, $\mathbf{X}_{i}$ represents the vector of proposal team characteristics, $\gamma_{t}$ captures the proposal year fixed effects, and $\beta_0$ and $\varepsilon_{i}$ are the intercept and error terms, respectively. Relatedly, we used a fixed effect probit regression to estimate the impact of variable or atypical collaboration on a team proposal’s award status, which is a binary outcome. The fixed-effects regression estimates the within-year difference controlling for unobserved invariant proposal years. Additionally, the observations are not independent because some investigators submitted proposals over multiple years. As such, we estimated robust standard errors.
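The idea of a within-year difference can be illustrated with a simplified Python sketch. It assumes a plain linear model with a single predictor and removes year fixed effects by subtracting each year's mean before fitting; it does not reproduce the generalized linear or probit models or the robust standard errors used in this study. The function name `within_year_slope` and the toy data are hypothetical.

```python
from collections import defaultdict


def within_year_slope(years, x, y):
    """OLS slope of y on x after removing year means (within estimator)."""
    # Accumulate per-year sums of x and y and the number of proposals.
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for yr, xi, yi in zip(years, x, y):
        s = sums[yr]
        s[0] += xi
        s[1] += yi
        s[2] += 1

    # Regress year-demeaned y on year-demeaned x.
    sxy = 0.0
    sxx = 0.0
    for yr, xi, yi in zip(years, x, y):
        sum_x, sum_y, n = sums[yr]
        dx = xi - sum_x / n
        dy = yi - sum_y / n
        sxy += dx * dy
        sxx += dx * dx
    if sxx == 0:
        raise ValueError("predictor does not vary within any year")
    return sxy / sxx


# Made-up data: scores are higher overall in 2011, but within each year
# the more variable team receives a score that is one point lower.
years = [2010, 2010, 2011, 2011]
variability = [0.0, 0.5, 0.0, 0.5]
score = [3.0, 2.0, 5.0, 4.0]
print(within_year_slope(years, variability, score))
```

Because each year's mean is subtracted first, the overall rise in scores between the two years does not affect the estimated slope; only comparisons among proposals submitted in the same year do.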
Future collaboration among proposal team members
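Econometrically, difference-in-differences (DiD) is typically implemented as an interaction term between time and treatment group dummy variables in a regression model as follows:

$$Y_{ip} = \beta_0 + \beta_1 \mathit{Time}_{p} + \beta_2 \mathit{Treat}_{i} + \beta_3 (\mathit{Time}_{p} \times \mathit{Treat}_{i}) + \varepsilon_{ip} \tag{2}$$

where, for proposal team $i$ in period $p$, $Y_{ip}$ is our outcome variable, which captures collaboration among proposal team members, $\mathit{Time}_{p}$ is a dummy variable representing the time period (0 = pre-proposal submission, 1 = post-proposal submission), $\mathit{Treat}_{i}$ is a dummy variable indicating the treatment group (0 = unawarded, 1 = awarded), and $\beta_0$ and $\varepsilon_{ip}$ are the intercept and error terms, respectively. The coefficient of interest is $\beta_3$, the DiD estimate, which is equal to the double difference in means between the unawarded and awarded proposal teams over time as follows:

$$\hat{\beta}_3 = \left(\bar{Y}_{\text{awarded, post}} - \bar{Y}_{\text{awarded, pre}}\right) - \left(\bar{Y}_{\text{unawarded, post}} - \bar{Y}_{\text{unawarded, pre}}\right) \tag{3}$$

In this way, the DiD estimator removes most of the biases associated with the permanent differences between the two groups across the two time periods. Equation (2) can be modified to incorporate the moderating effect, $D_{i}$, of team diversity as follows:

$$Y_{ip} = \beta_0 + \beta_1 \mathit{Time}_{p} + \beta_2 \mathit{Treat}_{i} + \beta_3 (\mathit{Time}_{p} \times \mathit{Treat}_{i}) + \beta_4 D_{i} + \beta_5 (\mathit{Time}_{p} \times D_{i}) + \beta_6 (\mathit{Treat}_{i} \times D_{i}) + \beta_7 (\mathit{Time}_{p} \times \mathit{Treat}_{i} \times D_{i}) + \varepsilon_{ip} \tag{4}$$

Here, the coefficient of interest is $\beta_7$, which is estimated as the triple difference in means. For all DiD regressions, we incorporated proposal and year fixed effects and robust standard errors for the reasons described previously.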
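The double and triple differences in means described above can be sketched in a few lines of Python. This illustration makes simplifying assumptions that are not part of the study's models: the diversity moderator is treated as a binary high/low split, and fixed effects and robust standard errors are omitted. The function names and the toy observations are hypothetical.

```python
from statistics import mean


def did_estimate(rows):
    """Double difference in means.

    rows: iterable of (awarded, post, outcome) with 0/1 dummies.
    """
    rows = list(rows)

    def cell_mean(awarded, post):
        values = [y for (a, p, y) in rows if a == awarded and p == post]
        if not values:
            raise ValueError("empty group/period cell")
        return mean(values)

    awarded_change = cell_mean(1, 1) - cell_mean(1, 0)
    unawarded_change = cell_mean(0, 1) - cell_mean(0, 0)
    return awarded_change - unawarded_change


def triple_difference(rows):
    """DiD among high-diversity teams minus DiD among low-diversity teams.

    rows: iterable of (awarded, post, high_diversity, outcome).
    """
    rows = list(rows)
    high = [(a, p, y) for (a, p, d, y) in rows if d == 1]
    low = [(a, p, y) for (a, p, d, y) in rows if d == 0]
    return did_estimate(high) - did_estimate(low)


# Made-up co-authorship counts: (awarded, post, high_diversity, outcome).
toy = [
    (1, 0, 0, 2.0), (1, 1, 0, 3.0), (0, 0, 0, 2.0), (0, 1, 0, 4.0),
    (1, 0, 1, 2.0), (1, 1, 1, 5.0), (0, 0, 1, 2.0), (0, 1, 1, 4.0),
]
print(did_estimate([(a, p, y) for (a, p, d, y) in toy if d == 0]))
print(triple_difference(toy))
```

With binary dummies and no other covariates, these cell-mean differences coincide with the interaction coefficients of the corresponding saturated regressions, which is the sense in which the regression coefficients are described as double and triple differences in means.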
Fig. 4.
Variability and Atypicality Moderate the Impact of Short-Term Team Performance on Long-Term Team Viability. Results are derived from a Difference-in-Differences (DiD) regression, with proposal and year fixed effects (details provided in the Analytical Approach section of Methods). The Main Effect (top panel) represents the impact of a proposal team being Awarded funding on the likelihood of at least two proposal team members collaborating more together in the five years post-proposal submission as compared to their pre-proposal submission collaboration levels. The moderating effects of proposal team diversity conceptualized as Variability (middle panel) versus Atypicality (bottom panel) are also shown. Bars represent the 95% confidence intervals (CIs) for the regression estimates, calculated using robust standard errors. + p < 0.10, * p < 0.05, ** p < 0.01, *** p < 0.001 indicate statistical significance of the predictors.
The validity of DiD regressions relies on meeting the parallel trends assumption, which requires that the pre-proposal trends in publication counts between awarded and unawarded teams be similar. We conducted three diagnostic tests to evaluate whether this assumption was met. First, we compared the average number of publications between awarded and unawarded teams during the pre-proposal period. The difference was not statistically significant (t-statistic = 0.55, p-value = 0.580), suggesting that there was no observable difference in the average number of publications between awarded and unawarded teams in the pre-proposal period. Second, we plotted the average number of publications for both awarded and unawarded teams over a five-year period centered around the proposal submission year (denoted as time t in Fig. 5). Figure 5a illustrates that the trends were parallel in the years preceding the proposal, with the exception of the [t–1] period, when unawarded teams began to publish more than awarded teams. However, this difference in [t–1] was also not statistically significant (t-statistic = 0.63, p-value = 0.530), reinforcing that there was no significant divergence in publication output between the groups at that time. Lastly, Fig. 5b shows that differences between awarded and unawarded teams in the average number of publications are negligible during the five years prior to the proposal time. Taken together, the results of these three tests suggest that our data meets the parallel trends assumption required for valid DiD regression analysis.
Fig. 5.
Average Count of Publications Co-Authored by Proposal Team Members Prior to Proposal Submission. Panel a: Count of publications co-authored by at least two members of the grant proposal team in the five years leading up to their proposal submission year (denoted as time t), averaged across awarded and unawarded teams. Panel b: The difference in the average count of publications between awarded and unawarded grant proposal teams over the same five-year time period. The shaded region shows the 95% confidence intervals (CIs) for the difference values.
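The first diagnostic, a comparison of pre-proposal publication averages between awarded and unawarded teams, could be sketched as follows. The text does not state which variant of the t-test was applied, so this example assumes Welch's unequal-variance t-statistic; the function name `welch_t` and the publication counts are hypothetical, and the p-value calculation is omitted.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t-statistic for the difference in means of two samples."""
    if len(sample_a) < 2 or len(sample_b) < 2:
        raise ValueError("each sample needs at least two observations")
    # Standard error of the mean difference, allowing unequal variances.
    se = sqrt(variance(sample_a) / len(sample_a)
              + variance(sample_b) / len(sample_b))
    if se == 0:
        raise ValueError("both samples have zero variance")
    return (mean(sample_a) - mean(sample_b)) / se


# Made-up pre-proposal co-authored publication counts per team.
awarded_pre = [1.0, 2.0, 3.0]
unawarded_pre = [2.0, 3.0, 4.0]
print(welch_t(awarded_pre, unawarded_pre))
```

A t-statistic close to zero in the pre-proposal period is consistent with the two groups starting from similar publication levels, which is what the diagnostics above are checking.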
Acknowledgements
We are grateful to all of the data science interns from the Science of Networks in Communities (SONIC) Research Group at Northwestern University who helped with data cleaning and processing for this study. We are also thankful for the funding support that made this research possible, and for all reviewers who provided helpful feedback along the way.
Author contributions
A.L. designed research; N.M. and A.L. performed research and analyzed data; All authors wrote and edited the manuscript.
Funding
This work was supported by the National Institutes of Health [grant number 1R01GM137410-01] and the National Science Foundation [grant number 1856090]. Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Institutes of Health or National Science Foundation.
Data availability
This paper uses restricted access data from the National Institutes of Health, protected by the Privacy Act of 1974 as amended (5 U.S.C. 552a). De-identified data necessary to reproduce all plots and statistical analyses are available upon request from the corresponding author.
Declarations
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
The online version contains supplementary material available at 10.1038/s41598-025-86483-0.
References
- 1. The Oxford Handbook of Diversity in Organizations. (Oxford University Press, Oxford, New York, 2016).
- 2. Harrison, D. A. & Klein, K. J. What’s the difference? diversity constructs as separation, variety, or disparity in organizations. Acad. Manag. Rev. 32, 1199–1228 (2007).
- 3. Horwitz, S. K. & Horwitz, I. B. The effects of team diversity on team outcomes: A meta-analytic review of team demography. J. Manag. 33, 987–1015 (2007).
- 4. Mannix, E. & Neale, M. A. What differences make a difference?: the promise and reality of diverse teams in organizations. Psychol. Sci. Public Int. 6, 31–55 (2005).
- 5. O’Reilly, C. A. III., Williams, K. Y. & Barsade, S. Group demography and innovation: Does diversity help? in Composition 183–207 (Elsevier Science/JAI Press, 1998).
- 6. Abascal, M., Xu, J. & Baldassarri, D. People use both heterogeneity and minority representation to evaluate diversity. Sci. Adv. 10.1177/000312240707200603 (2021).
- 7. Bell, J. M. & Hartmann, D. Diversity in everyday discourse: the cultural ambiguities and consequences of “Happy Talk”. Am. Soc. Rev. 72, 895–914 (2007).
- 8. Sanchez, R. The First Woman Flying to the Moon Is Turning Fear Into Focus. Harper’s BAZAAR https://www.harpersbazaar.com/culture/politics/a43507774/nasa-astronaut-christina-koch-first-woman-moon-mission-interview/ (2023).
- 9. Gibson, C. & Vermeulen, F. A healthy divide: Subgroups as a stimulus for team learning behavior. Adm. Sci. Q. 48, 202–239 (2003).
- 10. Hinsz, V. B., Tindale, R. S. & Vollrath, D. A. The emerging conceptualization of groups as information processors. Psychol. Bulletin 121, 43–64 (1997).
- 11. Pfeffer, J. & Salancik, G. R. The External Control of Organizations: A Resource Dependence Perspective (Stanford University Press, 2003).
- 12. Yang, Y., Tian, T. Y., Woodruff, T. K., Jones, B. F. & Uzzi, B. Gender-diverse teams produce more novel and higher-impact scientific ideas. Proc. Nat. Acad. Sci. 119, e2200841119 (2022).
- 13. Cutolo, D. & Ferriani, S. Atypicality: Toward An integrative framework in organizational and market settings. Acad. Manag. Annals. 10.5465/annals.2022.0005 (2023).
- 14. Bell, S. T., Villado, A. J., Lukasik, M. A., Belau, L. & Briggs, A. L. Getting specific about demographic diversity variable and team performance relationships: A meta-analysis. J. Manag. 37, 709–743 (2011).
- 15. Riedl, C., Kim, Y. J., Gupta, P., Malone, T. W. & Woolley, A. W. Quantifying collective intelligence in human groups. Proc. Nat. Acad. Sci. 118, e2005737118 (2021).
- 16. Epker, E. Fasten Your Seatbelts: A Female Car Crash Test Dummy Represents Average Women For The First Time In 60+ Years. Forbes https://www.forbes.com/sites/evaepker/2023/09/12/fasten-your-seatbelts-a-female-car-crash-test-dummy-represents-average-women-for-the-first-time-in-60-years/ (2023).
- 17. Forman, J. et al. Automobile injury trends in the contemporary fleet: Belted occupants in frontal collisions. Traffic Inj. Prev. 20, 607–612 (2019).
- 18. Alonso, T. Strategy Study: How Bumble Revolutionized Online Dating. https://www.cascade.app/studies/how-bumble-revolutionized-online-dating (2023).
- 19. Teng, M. Bumble CEO Whitney Wolfe Herd is proof that hiring and promoting female leaders can make you a billionaire. Business Insider https://www.businessinsider.com/bumble-ceo-whitney-wolfe-female-leadership-billionaire-2021-2 (2021).
- 20. Ceci, L. Dating apps: most downloaded in the U.S. 2024. Statista https://www.statista.com/statistics/1238390/most-popular-dating-apps-us-by-number-of-downloads/ (2024).
- 21. Marks, M. A., Mathieu, J. E. & Zaccaro, S. J. A Temporally Based Framework and Taxonomy of Team Processes. Acad. Manag. Rev. 26, 356–376 (2001).
- 22. Balkundi, P. & Harrison, D. A. Ties, leaders, and time in teams: strong inference about network structure’s effects on team viability and performance. AMJ 49, 49–68 (2006).
- 23. Barrick, M. R., Stewart, G. L., Neubert, M. J. & Mount, M. K. Relating member ability and personality to work-team processes and team effectiveness. J. Appl. Psychol. 83, 377–391 (1998).
- 24. Kozlowski, S. W. J. & Bell, B. S. Work groups and teams in organizations. In Handbook of psychology: Industrial and organizational psychology (ed. Weiner, I. B.) (John Wiley & Sons, 2023).
- 25. Mathieu, J., Maynard, M. T., Rapp, T. & Gilson, L. Team effectiveness 1997–2007: a review of recent advancements and a glimpse into the future. J. Manag. 34, 410–476 (2008).
- 26. Blau, P. M. Inequality and Heterogeneity: A Primitive Theory of Social Structure (Free Press, 1977).
- 27. Tajfel, H. & Turner, J. C. An integrative theory of intergroup conflict. In: The social psychology of intergroup relations (eds Austin, W. G. & Worchel, S.) (Brooks/Cole, Monterey, CA, 1979).
- 28. Brewer, M. B. In-group bias in the minimal intergroup situation: A cognitive-motivational analysis. Psychol. Bulletin 86, 307–324 (1979).
- 29. Byrne, D. Interpersonal attraction and attitude similarity. J. Abnorm. Soc. Psychol. 62, 713–715 (1961).
- 30. Byrne, D. & Griffitt, W. Interpersonal attraction. Annu. Rev. Psychol. 24, 317–336 (1973).
- 31. McPherson, M., Smith-Lovin, L. & Cook, J. M. Birds of a feather: Homophily in social networks. Annu. Rev. Soc. 27, 415–444 (2001).
- 32. Montoya, R. M. & Horton, R. S. A meta-analytic investigation of the processes underlying the similarity-attraction effect. J. Soc. Pers. Relationsh. 30, 64–94 (2013).
- 33. Srikanth, K., Harvey, S. & Peterson, R. A dynamic perspective on diverse teams: Moving from the dual-process model to a dynamic coordination-based model of diverse team performance. Acad. Manag Ann. 10, 453–493 (2016).
- 34. Van Dijk, H., Meyer, B., Van Engen, M. & Loyd, D. L. Microdynamics in diverse teams: A review and integration of the diversity and stereotyping literatures. Acad. Manag Ann. 11, 517–557 (2017).
- 35. Biernat, M. & Kobrynowicz, D. Gender- and race-based standards of competence: lower minimum standards but higher ability standards for devalued groups. J. Pers. Soc. Psychol. 72, 544–557 (1997).
- 36. Dovidio, J. F. & Gaertner, S. L. Intergroup bias. in Handbook of social psychology, Vol. 2, 5th ed (John Wiley & Sons, Inc., Hoboken, NJ, US, 2010). 1084–1121 10.1002/9780470561119.socpsy002029.
- 37. Hutchison, P., Jetten, J. & Gutierrez, R. Deviant but desirable: Group variability and evaluation of atypical group members. J. Exp. Soc. Psychol. 47, 1155–1161 (2011).
- 38. Maslach, C. Social and personal bases of individuation. J. Pers. Soc. Psychol. 29, 411–425 (1974).
- 39. van Knippenberg, D. & Schippers, M. C. Work Group Diversity. Annu. Rev. Psychol. 58, 515–541 (2007).
- 40. Kirgios, E. L., Chang, E. H. & Milkman, K. L. Going it alone: Competition increases the attractiveness of minority status. Organ. Behav. Human Decis. Proc. 161, 20–33 (2020).
- 41. Chatman, J. & O’Reilly, C. Asymmetric reactions to work group sex diversity among men and women. Acad. Manag. J. 47, 193–208 (2004).
- 42. Cox, T. H., Lobel, S. A. & McLeod, P. L. Effects of ethnic group cultural differences on cooperative and competitive behavior on a group task. AMJ 34, 827–847 (1991).
- 43. Tsui, A. S., Egan, T. D. & O’Reilly, C. A. Being different: relational demography and organizational attachment. Adm. Sci. Q. 37, 549–579 (1992).
- 44. Nurmohamed, S. The underdog effect: When low expectations increase performance. Acad. Manag. J. 63, 1106–1133 (2020).
- 45. Jetten, J., Branscombe, N. R., Schmitt, M. T. & Spears, R. Rebels with a cause: Group identification as a response to perceived discrimination from the mainstream. Pers. Soc. Psychol. Bulletin 27, 1204–1213 (2001).
- 46. DiTomaso, N. Rethinking, “Woke” and “Integrative” diversity strategies: diversity, equity, inclusion—and inequality. Acad. Manag. Perspect. 10.5465/amp.2023.0013 (2023).
- 47. Waldman, D. A. & Sparr, J. L. Rethinking diversity strategies: An application of paradox and positive organization behavior theories. AMP 37, 174–192 (2023).
- 48. Scoring Guidance. Grants & Funding | NIH Central Resource for Grants & Funding Information https://grants.nih.gov/grants/policy/review/rev_prep/scoring.htm (2016).
- 49. Bell, S. T. & Marentette, B. J. Team viability for long-term and ongoing organizational teams. Organ. Psychol. Rev. 1, 275–292 (2011).
- 50. Crespi, G. A. & Geuna, A. An empirical study of scientific production: A cross country analysis, 1981–2002. Res. Policy 37, 565–579 (2008).
- 51. Ubfal, D. & Maffioli, A. The impact of funding on research collaboration: evidence from a developing country. Res. Policy 40, 1269–1279 (2011).
- 52.Bertrand, M., Duflo, E. & Mullainathan, S. How much should we trust differences-in-differences estimates?*. Q. J. Econ.119, 249–275 (2004). [Google Scholar]
- 53.Werner, C. & Schermelleh-Engel, K. Deciding Between Competing Models: Chi-Square Difference Tests. Introduction to structural equation modeling with LISREL 1–3 (2010).
- 54.Zhang, L. An Institutional Approach to Gender Diversity and Firm Performance. Organ. Sci.31, 439–457 (2020). [Google Scholar]
- 55.Kanter, R. M. Men and Women of the Corporation (Basic Books, 1977). [Google Scholar]
- 56.Konrad, A. M., Kramer, V. & Erkut, S. Critical mass: The impact of three or more women on corporate boards. Organ. Dyn.37, 145–164 (2008). [Google Scholar]
- 57.Schwartz-Ziv, M. Gender and board activeness: the role of a critical mass. J. Financ. Quant. Anal.52, 751–780 (2017). [Google Scholar]
- 58. Torchia, M., Calabrò, A. & Huse, M. Women directors on corporate boards: from tokenism to critical mass. J. Bus. Ethics 102, 299–317 (2011).
- 59. Vedres, B. & Vásárhelyi, O. Inclusion unlocks the creative potential of gender diversity in teams. Sci. Rep. 13, 13757 (2023).
- 60. Broome, L., Conley, J. & Krawiec, K. Does Critical Mass Matter? Views From the Board Room. Seattle Univ. Law Rev. 34, 1049–1080 (2011).
- 61. Lu, J. G., Nisbett, R. E. & Morris, M. W. Why East Asians but not South Asians are underrepresented in leadership positions in the United States. Proc. Natl. Acad. Sci. 117, 4590–4600 (2020).
- 62. Boldry, J. G., Gaertner, L. & Quinn, J. Measuring the measures: A meta-analytic investigation of the measures of outgroup homogeneity. Group Proc. Int. Rel. 10, 157–178 (2007).
- 63. Ayoubi, C., Pezzoni, M. & Visentin, F. The important thing is not to win, it is to take part: What if scientists benefit from participating in research grant competitions? Res. Policy 48, 84–97 (2019).
- 64. Davies, B., Gush, J., Hendy, S. C. & Jaffe, A. B. Research funding and collaboration. Res. Policy 51, 104421 (2022).
- 65. Gimmon, E. & Levie, J. Early Indicators of Very Long-Term Venture Performance: A 20-Year Panel Study. Acad. Manag. Discov. 7, 203–224 (2021).
- 66. Shane, S. & Stuart, T. Organizational endowments and the performance of university start-ups. Manag. Sci. 48, 154–170 (2002).
- 67. Hardt, D., Mayer, L. & Rincke, J. Who does the talking here? The impact of gender composition on team interactions. Manag. Sci. https://doi.org/10.1287/mnsc.2023.03411 (2024).
- 68. National Academies of Sciences, Engineering, and Medicine et al. Diverse Work Teams: Understanding the Challenges and How STEMM Professionals Can Leverage the Strengths. in Advancing Antiracism, Diversity, Equity, and Inclusion in STEMM Organizations: Beyond Broadening Participation (National Academies Press (US), 2023).
- 69. Reagans, R. & Zuckerman, E. W. Networks, Diversity, and Productivity: The Social Capital of Corporate R&D Teams. Organ. Sci. 12, 502–517 (2001).
- 70. Harrison, D. A., Price, K. H. & Bell, M. P. Beyond relational demography: time and the effects of surface- and deep-level diversity on work group cohesion. Acad. Manag. J. 41, 96–107 (1998).
- 71. Lungeanu, A. & Contractor, N. S. The effects of diversity and network ties on innovations: The emergence of a new scientific field. Am. Behav. Sci. 59, 548–564 (2015).
- 72. Bursztyn, L., Chaney, T., Hassan, T. A. & Rao, A. The Immigrant Next Door: Long-Term Contact, Generosity, and Prejudice. Working Paper at https://doi.org/10.3386/w28448 (2021).
- 73. Lu, Y., Naik, N. Y. & Teo, M. Diverse Hedge Funds. SSRN Scholarly Paper at https://doi.org/10.2139/ssrn.3779713 (2021).
- 74. Rieke, A., Southerland, V., Svirsky, D. & Hsu, M. Imperfect Inferences: A Practical Assessment. in Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency 767–777 (Association for Computing Machinery, New York, NY, USA, 2022). https://doi.org/10.1145/3531146.3533140.
- 75. Tzioumis, K. Demographic aspects of first names. Sci. Data 5, 180025 (2018).
- 76. U.S. Census Bureau. Frequently Occurring Surnames from the 2010 Census. Census.gov https://www.census.gov/topics/population/genealogy/data/2010_surnames.html (2021).
- 77. National Institutes of Health. Race and National Origin. National Institutes of Health (NIH) https://www.nih.gov/nih-style-guide/race-national-origin (2024).
- 78. Lopez, M. H., Krogstad, J. M. & Passel, J. S. Who is Hispanic? Pew Research Center https://www.pewresearch.org/short-reads/2024/09/12/who-is-hispanic/ (2024).
- 79. U.S. Census Bureau. About the Topic of Race. Census.gov https://www.census.gov/topics/population/race/about.html (2022).
- 80. Fry, R., Kennedy, B. & Funk, C. STEM Jobs See Uneven Progress in Increasing Gender, Racial and Ethnic Diversity. Pew Research Center Science & Society https://www.pewresearch.org/science/2021/04/01/stem-jobs-see-uneven-progress-in-increasing-gender-racial-and-ethnic-diversity/ (2021).
- 81. Kozlowski, D., Larivière, V., Sugimoto, C. R. & Monroe-White, T. Intersectional inequalities in science. Proc. Natl. Acad. Sci. 119, e2113067119 (2022).
- 82. Ginther, D. K. et al. Race, ethnicity, and NIH research awards. Science 333, 1015–1019 (2011).
- 83. Ginther, D. K., Kahn, S. & Schaffer, W. T. Gender, race/ethnicity, and National Institutes of Health R01 research awards: Is there evidence of a double bind for women of color? Acad. Med. 91, 1098 (2016).
- 84. Bertolero, M. A. et al. Racial and ethnic imbalance in neuroscience reference lists and intersections with gender. Preprint at https://doi.org/10.1101/2020.10.12.336230 (2020).
- 85. Huang, J., Gates, A. J., Sinatra, R. & Barabási, A.-L. Historical comparison of gender inequality in scientific careers across countries and disciplines. Proc. Natl. Acad. Sci. 117, 4609–4616 (2020).
- 86. NOT-OD-20-031: Notice of NIH’s Interest in Diversity. NIH - National Institutes of Health Office of Extramural Research https://grants.nih.gov/grants/guide/notice-files/NOT-OD-20-031.html (2019).
Data Availability Statement
This paper uses restricted access data from the National Institutes of Health, protected by the Privacy Act of 1974 as amended (5 U.S.C. 552a). De-identified data necessary to reproduce all plots and statistical analyses are available upon request from the corresponding author.