Abstract
As scientists’ careers unfold, mobility can allow researchers to find environments where they are more productive and more effectively contribute to the generation of new knowledge. In this paper, we examine the determinants of mobility of elite academics within the life sciences, including individual productivity measures and, for the first time, measures of the peer environment and family factors. Using a unique data set compiled from the career histories of 10,051 elite life scientists in the U.S., we paint a nuanced picture of mobility. Prolific scientists are more likely to move, but this impulse is constrained by recent NIH funding. The quality of peer environments both near and far is an additional factor that influences mobility decisions. We also identify a significant role for family structure. Scientists appear to be unwilling to move when their children are between the ages of 14-17, and this appears to be more pronounced for mothers than fathers. These results suggest that elite scientists find it costly to disrupt the social networks of their children during adolescence and take these costs into account when making career decisions.
Keywords: mobility, life sciences, economics of science, innovation, productivity
I. Introduction
A central tenet of modern theories of labor markets is that worker mobility enhances economic productivity by allowing workers to find environments where their skills are put to greatest use. In scientific fields, where team efforts are particularly important, mobility may well increase the production of scientific knowledge (e.g. Hoisl, 2007; Agrawal, McHale, & Oettl, 2014; Ejermo and Ahlin, 2015; Fernández-Zubieta, Geuna, and Lawson, 2016). Yet, we know surprisingly little about what drives scientists to move in the first place.
Economic theory suggests that mobility is driven by efforts to improve employer-employee match quality, but there may be constraints to realizing these matches. After all, mobility can generate significant costs, even if only temporary, as a result of professional and personal dislocation. In this paper, we examine both the professional and more personal factors that influence the mobility of elite life scientists. Since many personal factors can influence productivity and vice versa, including both in the same analysis allows us to minimize concerns about statistical confounding and thus develop the most credible measures of each influence to date.
Our analysis builds upon earlier work that has shown the important role played by own-productivity in the propensity to move (e.g. Zucker, Darby & Torero, 2002; Hoisl, 2007; Crespi, Geuna & Nesta, 2007; Lenzi, 2009) to also examine the role played by the quality of the scientific environment more broadly. Science is increasingly a collaborative “team sport” (Wuchty, Jones & Uzzi, 2007), and we exploit novel measures of the quality of peers at local and distant institutions to provide the first systematic analysis of this influence on the decision to relocate.
Our analysis also extends beyond the professional to examine the role of children in shaping mobility decisions. Demographic research has shown that the presence of children in a household can limit scientific mobility (Shauman & Xie, 1996). Moreover, the social psychology literature suggests that it may be particularly costly to move children during adolescence, when social bonds are strongest and thus the potential for social disruption is greatest (e.g. Fowler, Henry, & Marcal, 2014 and 2015), and this period roughly coincides with secondary school attendance (hereafter “high school” as it is called in the U.S.). As such, our analyses will examine how both the number and age of children influence scientist mobility.
The mobility of elite life scientists is of interest for a number of reasons. First, these scientists are largely responsible for pushing the boundaries of the knowledge frontier in their field. Work environments that enhance the returns to their human capital and potential knowledge spillovers to their colleagues can generate sizable social returns by accelerating biomedical innovation and improving human health. Second, the conduct of research in the life sciences is a team effort that often involves expensive and highly specialized equipment, some of which is financed by external sources that are tied to institutions rather than researchers. As such, mobility may be particularly constrained in this population. Finally, the notoriety of this elite group and the public nature of their careers facilitate the collection of data on family structure that is largely unobtainable in other study populations.
We use a unique data set compiled from the career histories of over 10,000 elite life scientists to understand why and when scientists make decisions to move to new locations.1 Our rich dataset includes factors that have previously been absent from studies of the determinants of mobility, and including both professional and personal factors in one framework allows us to examine the independent role of each type. We note that while some of these professional and personal measures are likely endogenous, our analysis is focused on providing new descriptive facts about the predictors of mobility rather than providing causal estimates of the impacts.
Our analysis confirms the importance of scientist productivity as a positive predictor of moves (Zucker, Darby and Torero, 2002; Coupé, Smeets & Warzynski, 2006; Lenzi, 2009; Ganguli, 2015a). It also highlights several new professional factors that influence the propensity to move. In particular, we find that recent NIH funding serves as a deterrent to moving, likely due, in part, to the significant transaction costs associated with transferring federal research between institutions (Bernstein, 2014). We also find that the peer environment exerts a significant influence on mobility. Scientists are less likely to move when the quality of the peer environment near their home institution is high and more likely to move when the quality of the peer environment at distant institutions is high. Additional analyses suggest that the nature of the move – whether it generates a substantial upward or downward change in institutional rank – has little impact on the role played by these professional factors in shaping scientific mobility.
Turning to the non-professional side, our results reveal an important influence of family structure on mobility. We find a sizable drop in non-local mobility when scientists have children of high school age. Interestingly, scientists appear to anticipate these constraints by increasing moves just before their oldest child enters high school. Mobility accelerates once again when their youngest child is beyond high school age. These results appear more pronounced for mothers than fathers.
The remainder of the paper is organized as follows. In Section 2 we provide a review of the literature and develop hypotheses about the drivers of scientists’ mobility. Section 3 describes our data and descriptive statistics. Section 4 lays out our empirical approach. Results are presented in Section 5, while Section 6 concludes.
II. Conceptual Model and Predictions
The movement of a scientist from one institution to another is an equilibrium outcome that depends on the preferences of the scientist on the supply side as well as demand from the destination institution. On the supply side, financial compensation will clearly play a role, but non-pecuniary factors are also quite important in the science community (Foster et al. 2015; Roach & Sauermann, 2010; Stern 2004). From the scientists’ perspective, one of the key benefits from moving is the change in proximity to coauthors and a new set of interlocutors working within their field (e.g. Azoulay, Graff Zivin & Sampat, 2012; Møen, 2005; Agrawal, Cockburn, and McHale, 2006). These benefits must be weighed against the potentially large social costs of moving due to uprooting one's own family or moving away from family and friends (Dahl and Sorenson, 2010). From the institutional perspective, the demand for new scientists is largely a function of how a scholar will contribute to the prestige and intellectual reputation of the institution. In this section, we develop a number of hypotheses about the professional and non-professional drivers of mobility in this setting.2
II.A. Demand-Side Professional Factors Influencing Scientist Mobility
On the demand side, universities want to hire talented individuals who will enhance the institutional reputation and the quality of scholarship produced within the university. Determining precisely which scientists will best serve this purpose is challenging since prospective employers cannot perfectly observe the talents of a particular scientist. As such, they may rely on costly signals of worker quality, such as training pedigree (Spence, 1973) and letters of recommendation (Caplow and McGee, 1958) when making hiring decisions. For seasoned scientists, the nature of this asymmetric information problem is more nuanced. Scientists leave an extensive paper trail of accomplishments that includes publications, patents, and grants (and the citations to them), which provide a relatively clear signal about scientist quality (Jaffe and Trajtenberg, 2002; Lehmann et al., 2006). Since universities are recruiting based largely on the science that recruits will produce under their employ, the challenge here is assessing the degree to which past is prologue. Is the scientist still in a productive phase of their career? Are previous accomplishments a reasonable proxy for future ones?
Perhaps due to the fuzziness of this quality signal, empirical evidence on the relationship between individual productivity and mobility is inconsistent (Allison and Long, 1987; Zucker et al, 2002; Hoisl, 2007; Crespi et al, 2007; Lenzi, 2009). While this relationship may indeed be mixed, it may also be the result of analyses that lean too heavily on past rather than present accomplishments as a proxy for future quality. As such, our first hypothesis concerns the relationship between the productivity of a scientist and the timing of their mobility:
H1: Scientists are more likely to move when their productivity is high
II.B. Supply-Side Professional Factors Influencing Scientist Mobility
On the supply side, the professional determinants of scientific mobility will depend on financial compensation and the degree to which the move will advance one's career. While the first component is generally unobservable by the econometrician, the latter can be at least partly inferred. In particular, it is well known that academic science is a “team sport”. This team includes formal collaborators (Bercovitz and Feldman, 2011; Wuchty et al. 2007), as well as individuals with whom to exchange ideas (Azoulay, Graff Zivin & Sampat, 2012).
Despite significant advances in information technology that allow scientists to interact over great distances, a large literature documents the importance of proximity and face-to-face interactions to facilitate collaborations (Boudreau et al. 2014; Catalini 2015), knowledge sharing (Breschi and Lissoni, 2009; Azoulay, Graff Zivin & Sampat, 2012; Ganguli 2015b), and thus productivity spillovers (Glaeser, 1999; Agrawal, McHale, & Oettl, 2014). This leads to our second hypothesis:
H2: Scientists are more likely to move when the quality of the peer environment in their field is better elsewhere
It is also important to recognize that the relevant peer group at an institution represents a small fraction of the total number of scholars under its employ. If scientists are largely moving in pursuit of science or more general intellectual concerns (Merton, 1957; Cotgrove 1970), the broader prestige of the university and the letterhead it imparts should play a secondary role in the migration decision. This leads to our first corollary:
C1: The peer environment is more important than university prestige in determining scientific mobility
II.C. Social Factors Influencing Scientist Mobility
In addition to the professional factors discussed above, mobility decisions may also be shaped by social factors. Moving imposes social and psychic costs, including disrupting one's family and social networks (Bowles, 1970). The developmental ecology literature suggests that these costs are particularly acute for children. Moving disrupts routines and social networks, which can cause children to experience social isolation (Gerring 2014).
These costs are not necessarily borne evenly throughout childhood. Adolescence, when children are transitioning from childhood to adulthood, is a period when social disruption may be particularly costly. Moving during adolescence is associated with an increase in behavioral problems (Fowler et al, 2014), and a detrimental impact on long-term outcomes such as employment and earnings (Chetty, Hendren & Katz, 2016).
Empirical evidence suggests that mobility, even for highly skilled populations, is clearly impacted by social factors such as proximity to family and friends (Dahl and Sorenson, 2010 and 2012). Shauman and Xie (1996) show that scientists are less likely to move when they have children, although it is unclear whether this is due to the social factors described above or diminished productivity as a result of childcare responsibilities. They also find a differential role for fathers and mothers, with women's mobility more negatively impacted by having children, something we will explore in our empirical analysis. Thus, our third hypothesis is that, even after controlling for productivity:
H3: Scientists are less likely to make a distant move when they have adolescent children
Since local moves will disrupt some aspects of children's lives (e.g. household routines) while potentially leaving others (e.g. social networks) intact, the impacts of social factors on local moves may be less pronounced. As such, we posit the following corollary:
C2: The impact of adolescent children on scientific mobility will be smaller for local moves
III. Data and Descriptive Statistics
As described earlier, our analysis will focus on how both professional and personal factors impact the movement of elite life scientists across academic institutions. In this section, we begin with details on the construction of our scientist sample, including our measures of individual productivity. We then describe our measures of the productivity of a scientist's peer environment as well as the sources of data on the children of the elite scientists in our sample.
III.A. Scientist Sample
Our elite academic life scientist sample includes 12,935 individuals, which corresponds to roughly 5 percent of the entire relevant labor market in the United States. In our framework, a scientist is deemed elite if they satisfy at least one of the following criteria for cumulative scientific achievement: (1) highly funded scientists; (2) highly cited scientists; (3) top patenters; or (4) members of the National Academy of Sciences. Since these four criteria are based on extraordinary achievement over an entire scientific career, we add additional criteria to capture individuals who show great promise at the early and middle stages of their scientific careers, whether or not these episodes of productivity endure for long periods of time. These include: (5) NIH MERIT awardees; (6) Howard Hughes Medical Investigators; or (7) early career prize winners.3 Additional details on this sample construction can be found in Appendix A.
For each scientist in the sample, we reconstruct their career from the time they obtained their first position as independent investigators (typically after a postdoctoral fellowship) until 2006. We do so through a combination of curriculum vitae, NIH biosketches, “Who's Who” profiles, accolades/obituaries in medical journals, National Academy of Sciences biographical memoirs, and Google searches. The career sequences for academic life scientists in the US are no different from what could be observed elsewhere: graduate school/medical school, one or more postdoctoral fellowships, followed by the start of one's independent career, i.e., one's own laboratory, almost always in a tenure-track position. Only at the start of this last stage would a scientist be fully in control of his/her research agenda, and be eligible to raise funds (grants) to support that agenda.4
Our dataset includes employment history, degree held, date of degree, gender, and up to three departmental affiliations as well as a complete list of publications, patents and NIH funding obtained in each year by each scientist. Publication counts come from the open source software PublicationHarvester. This software downloads from PubMed – an online bibliographic resource from the National Library of Medicine – the entire set of English-language articles for an elite scientist, provided they are not letters to the editor, comments, or other “atypical” articles (Azoulay, Stellman, and Graff Zivin, 2006).5 Funding data are obtained from the Consolidated Grant/Applicant File (CGAF) from the U.S. National Institutes of Health (NIH), which records information about grants awarded to extramural researchers funded by the NIH since 1938. Patent data come from the US Patent and Trademark Office (USPTO) historical patent data files.
Our mobility data is extracted precisely from biographical records, rather than inferred from affiliation information in papers or patents (e.g. Agrawal et al. 2014) or from self-reported data (Bäker 2015). As such, we observe the exact timing of professional transitions even in the cases in which a scientist has ceased to be active in research, for example because s/he has moved into an administrative position. Approximately ten percent of scientists in our sample (1,051) experience multiple moves during their career.
We exclude all scientists who transition between jobs in industry and those moving to or from foreign academic institutions.6 This exclusion yields a sample of 10,051 scientists on which our study is based. Appendix A Table A1 provides a full accounting of the mobility events in the data (US academia to US academia, US academia to industry, industry to US academia, US academia to foreign academia, foreign academia to US academia). The US academia to US academia transitions which we focus on account for 91% of the mobility events in the raw data.7
Our focus on eliteness is grounded in both substantive and pragmatic reasons.8 Substantively, our sample is selected and sui generis. We believe this idiosyncratic population is of enormous interest for the study of the scientific enterprise, and is worthy of study even if we cannot extrapolate to a population of scientists of more humble repute. In this belief we march in the footsteps of previous researchers who may not have had access to such a large sample, nor the ability to analyze these rich data using modern statistical methodologies (e.g., Zuckerman 1977).
Pragmatically, alternative sampling methodologies would make it extremely difficult to analyze the determinants of mobility between academic institutions. We could pick a random sample of academics, and then follow them up prospectively, but this would not eliminate selection bias since they would have persisted in academia at least up until the time when they are sampled. Moreover, focusing on a more representative slice of the academic population would create additional challenges. In particular, the quality of the data at our disposal (carefully disambiguated output and precise recording of the timing of each move) would suffer markedly if we collected data in this way: elite scientists leave considerably more electronic trails relevant to understanding the ways in which their careers unfolded, relative to “humdrum” scientists.
Our core analysis is focused on transitions that are at least 50 miles apart (based on distance between the zip codes of the institutions) to increase the likelihood that this career change leads the scientists to change their place of residence, and thus distance them from their local professional networks and disrupt the social networks of their children.9 We also compare our main results to the effects for scientists who are local movers (moves within the 50-mile radius). The distribution of distances between institutions for those scientists that move is shown in Figure 1.
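The 50-mile rule can be illustrated with a short sketch. The text says only that distance is measured between institution zip codes; the great-circle (haversine) formula, the function names, and the approximate coordinates below are assumptions made for illustration, not a description of the procedure actually used.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def classify_move(origin, destination, threshold_miles=50.0):
    """Label a transition 'distant' if it spans at least threshold_miles."""
    distance = haversine_miles(*origin, *destination)
    return "distant" if distance >= threshold_miles else "local"


# Hypothetical zip-code centroids (approximate latitude, longitude).
boston = (42.36, -71.06)
cambridge = (42.37, -71.11)
san_diego = (32.72, -117.16)
```

Under these assumptions, a Boston to Cambridge transition is labeled local, while Boston to San Diego is labeled distant.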
Table 1 presents summary statistics for the scientists in the sample who experience at least 1 professional transition (move) over their career and those who do not. The movers represent about 35% of the sample. Movers and stayers look similar in terms of degree type (MD or PhD). Movers are older by approximately 1 year of career age and slightly less likely to be female (13% vs. 16%). While we only obtain information on children for a subsample of our elite scientist population (as detailed below), the share of scientists with child information and the number of children they have is similar across the mover and stayer samples. Movers are also slightly more productive than their more static counterparts – 10 additional career publications, or roughly 7% more, on average. When accounting for the quality of publications using Journal Impact Factor (JIF)-weighted publications, movers have approximately 24 additional publications on average.
Table 1.
Variable | Stayers: Mean | Stayers: Std. Dev. | Movers: Mean | Movers: Std. Dev. | Difference |
---|---|---|---|---|---|
Female | 0.158 | 0.365 | 0.133 | 0.340 | 0.025*** |
MD | 0.335 | 0.472 | 0.336 | 0.472 | −0.001 |
PhD | 0.566 | 0.496 | 0.570 | 0.495 | −0.004 |
MD/PhD | 0.099 | 0.298 | 0.094 | 0.292 | 0.005 |
Career Age | 32.761 | 9.818 | 33.892 | 8.489 | −1.131*** |
With Kids info | 0.307 | 0.461 | 0.337 | 0.473 | −0.031** |
No. of Kids | 2.476 | 1.124 | 2.511 | 1.105 | −0.034 |
School NIH Funding | 139,568,954 | 115,536,596 | 127,355,091 | 107,965,909 | 12,213,863*** |
Individual Productivity | |||||
Career Publications | 136.609 | 104.319 | 147.215 | 106.747 | −10.606*** |
Career JIF-Weighted Pubs | 588.617 | 535.277 | 612.618 | 543.420 | −24.001* |
Career Patents | 1.966 | 6.670 | 1.748 | 5.014 | 0.218 |
Career NIH Amount | 18,927,373 | 25,173,591 | 20,086,774 | 20,564,469 | −1,159,400* |
Observations | 6,584 | | 3,467 | | |
Notes: For each scientist in the sample, we reconstruct their career from the time they obtained their first position as independent investigators (typically after a postdoctoral fellowship) until 2006. The numbers presented in this table are based on the final year the scientist appears in the dataset. See section IIIA for more information about the sample construction and sources of data. We identify movers by extracting information on institutions from biographical records and then calculate geographic distance between the zip codes of the institutions. The stayers include scientists who do not move or only move locally (within 50 miles). We limit our empirical attention to transitions that are at least 50 miles apart to ensure that the career transition led the scientists to change their place of residence. Stars in the last column indicate the results of tests of proportions and t-tests for the equality of means
* p < 0.10
** p < 0.05
*** p < 0.01.
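As a rough check on how the stars in the Difference column arise, the sketch below recomputes two test statistics from the rounded summary statistics reported in Table 1. The table notes do not say whether the t-tests assume equal variances; the unequal-variance (Welch) form and the pooled two-proportion z-test used here are assumptions, and the function names are illustrative.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t-statistic for a difference in means, allowing unequal variances."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


def two_proportion_z(p_a, n_a, p_b, n_b):
    """z-statistic for a difference in proportions, using the pooled share."""
    pooled = (p_a * n_a + p_b * n_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / standard_error


N_STAYERS, N_MOVERS = 6584, 3467

# Career Age row of Table 1 (stayers first, then movers).
t_career_age = welch_t(32.761, 9.818, N_STAYERS, 33.892, 8.489, N_MOVERS)

# Female row of Table 1.
z_female = two_proportion_z(0.158, N_STAYERS, 0.133, N_MOVERS)
```

With these inputs, both statistics exceed the two-sided 1% critical value of roughly 2.576 in absolute value, in line with the three stars reported for these rows.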
III.B. Quality of the Scientific Environment
The quality of one's peers near and far is presumably an important determinant of scientific mobility. As such, we need measures of both peers and their quality over time. For the latter, we follow conventions within the literature and use counts of publications and NIH funding (e.g. Azoulay, Graff Zivin, & Wang, 2010). Constructing a measure of the former is far more challenging because the set of scholars that influence the work of a scientist can take many forms. As such, we construct two distinct measures of peers for each year, those defined by direct collaboration as coauthors and those who work in similar fields but who have not directly collaborated.
Collaborators
To identify collaborators, we use two open-source software programs: PublicationHarvester, described earlier, and the Stars/Colleagues Generator (S/CGen).10 In the first step, the PublicationHarvester downloads from PubMed the entire set of English-language articles for an elite scientist. From this set of publications, the S/CGen strips out the list of coauthors, eliminates duplicate names, matches each coauthor with the Faculty Roster of the Association of American Medical Colleges (AAMC),11 and stores the identifier of every coauthor for whom a match is found. The software then queries PubMed for each validated coauthor, and generates publication counts for each collaborator scientist in each year.12
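The steps in this pipeline (extract coauthors, remove duplicates, match against a roster, count publications by year) can be sketched as follows. This is not the S/CGen code: the record layout, the function names, the made-up names and identifiers, and the use of exact-name matching are all simplifying assumptions for illustration.

```python
from collections import Counter


def validated_coauthors(publications, star, roster):
    """Return roster identifiers for the distinct coauthors of `star`.

    publications: list of dicts with "year" and "authors" keys.
    roster: dict mapping an author name to a roster identifier
            (exact-name matching is a simplifying assumption).
    """
    # A set comprehension collects coauthor names and drops duplicates.
    names = {a for pub in publications for a in pub["authors"] if a != star}
    # Keep only coauthors for whom a roster match is found.
    return {roster[name] for name in names if name in roster}


def yearly_publication_counts(publications, author):
    """Count the publications that list `author`, by publication year."""
    return Counter(
        pub["year"] for pub in publications if author in pub["authors"]
    )


# Toy records with made-up names and identifiers.
star_pubs = [
    {"year": 1998, "authors": ["Star A", "Coauthor B", "Coauthor C"]},
    {"year": 1999, "authors": ["Star A", "Coauthor B"]},
]
roster = {"Coauthor B": "id-001"}  # "Coauthor C" has no roster match
coauthor_pubs = star_pubs + [
    {"year": 1999, "authors": ["Coauthor B", "Someone D"]},
]

matched = validated_coauthors(star_pubs, "Star A", roster)
counts_b = yearly_publication_counts(coauthor_pubs, "Coauthor B")
```

In practice, name matching against a roster would need to handle variants and homonyms, which this sketch ignores.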
Non-Collaborator Peers
Our measure of non-collaborating researchers that are intellectually close to the scientist of interest requires a method to delineate the boundaries of research fields. To construct such a measure, we employ a novel approach that groups scientific articles into subfields based on their intellectual content using very detailed keyword information as well as the relative frequencies of these keywords in the scientific corpus (Azoulay, Fons-Rosen, and Graff Zivin, 2015). Specifically, we use the PubMed Related Citations Algorithm (PMRA), which relies heavily on Medical Subject Headings (MeSH). The MeSH thesaurus is a controlled vocabulary of 24,767 terms arranged in a hierarchical structure. The National Library of Medicine staff use it to tag all of the articles indexed by the MEDLINE database.13 The “Related Articles” function in PubMed is used to harvest journal articles that are intellectually proximate to the elite scientists’ own papers.14 The authors of those articles are then classified as non-collaborating peers if they have never coauthored with the elite scientist of interest. More details on this approach can be found in Appendix C.
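The final classification step reduces to a set difference: authors of related articles, minus anyone who has ever coauthored with the elite scientist. The sketch below assumes the related articles have already been harvested and represents each one simply as a list of author names; the function name and the toy names are hypothetical.

```python
def non_collaborating_peers(related_articles, star, coauthors):
    """Authors of related articles who never coauthored with `star`.

    related_articles: iterable of author-name lists, one per related article.
    coauthors: names of everyone who has ever coauthored with `star`.
    """
    authors = {name for article in related_articles for name in article}
    # Remove the scientist's own name and all known coauthors.
    return authors - set(coauthors) - {star}


# Toy data with made-up names.
related = [
    ["Peer X", "Coauthor B"],
    ["Peer Y", "Peer X"],
    ["Star A", "Peer Z"],
]
peers = non_collaborating_peers(related, "Star A", {"Coauthor B"})
```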
Geography
Our measure of elite scientist location is based on the detailed biographical information we have for this sample, as detailed above. Collaborating and non-collaborating peers are mapped in physical space using affiliation data from the aforementioned Faculty Roster of the Association of American Medical Colleges (AAMC). Our measure of geography distinguishes between peers that are geographically close (less than 50 miles apart) and those that are distant (more than 50 miles apart).
In Table 2, we present descriptive statistics comparing the peer environment for scientists in the sample who experience at least one professional transition (moving more than 50 miles away) over their career to those who do not. The collaboration networks appear to be quite important. Movers have fewer and less accomplished local collaborators and more numerous distant collaborators than those that do not move, although the difference in the publications of distant collaborators is not statistically significant. Interestingly, the quantity and quality of non-collaborating peers is lower for movers than stayers, regardless of whether they are local or distant.
Table 2.
Variable | Stayers: Mean | Stayers: Std. Dev. | Movers: Mean | Movers: Std. Dev. | Difference |
---|---|---|---|---|---|
Collaborators, Number | |||||
Nb. of Peers, Colocated | 5.459 | 6.048 | 3.054 | 4.101 | 2.405*** |
Nb. of Peers, Close | 1.164 | 2.304 | 0.736 | 1.847 | 0.428*** |
Nb. of Peers, Distant | 11.524 | 10.303 | 12.892 | 11.604 | −1.368*** |
Non-collaborating Peers, Number | |||||
Nb. of Peers, Colocated | 3.221 | 4.027 | 2.489 | 3.556 | 0.732*** |
Nb. of Peers, Close | 4.229 | 6.629 | 3.377 | 6.225 | 0.853*** |
Nb. of Peers, Distant | 118.394 | 85.812 | 104.308 | 76.467 | 14.085*** |
Collaborators’ Productivity | |||||
Pubs, Colocated | 19.988 | 28.272 | 10.481 | 18.487 | 9.507*** |
Pubs, Close | 4.057 | 10.492 | 2.337 | 6.840 | 1.721*** |
Pubs, Distant | 46.480 | 57.592 | 47.791 | 56.480 | −1.311 |
Non-collaborating Peers Prod. | |||||
Pubs, Colocated | 12.652 | 20.260 | 9.395 | 17.995 | 3.257*** |
Pubs, Close | 14.295 | 26.154 | 11.073 | 23.284 | 3.221*** |
Pubs, Distant | 398.866 | 300.081 | 332.168 | 261.033 | 66.698*** |
Observations | 6,584 | | 3,467 | | |
Notes: The numbers presented in this table are based on the final year the scientist appears in the dataset. See section IIIB for more information about the measures of the peer environment and the sources of data. We identify movers by extracting information on institutions from biographical records and then calculate geographic distance between the zip codes of the institutions. The stayers include scientists who do not move or only move locally (within 50 miles). We limit our empirical attention to transitions that are at least 50 miles apart to ensure that the career transition led the scientists to change their place of residence. Stars in the last column indicate the results of tests of proportions and t-tests for the equality of means
* p < 0.10
** p < 0.05
*** p < 0.01.
III.C. Age of children
For each scientist in our sample, we hand-collected information on the number of children they had, each child's gender, and most importantly, each child's year of birth. To find this information, we obtained the names of the scientist's children from their “Who's Who” profile and in some cases from obituaries for the deceased. Using this data along with location information, we obtained the age of these children by cross-referencing information gathered from web-searches and online databases of public records (e.g. People Search Now). Out of the 10,051 scientists in the sample, 3,118 have children information, meaning a birth year for each of the children. The 6,933 scientists without children information include 6,863 scientists for whom we have no information at all, and 70 scientists for whom it is known that they do not have children.
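Birth years make it straightforward to flag, for each scientist and calendar year, whether any child is of high school age. The sketch below uses the 14-17 age band mentioned in the abstract and approximates age as the calendar year minus the birth year, since only birth years are collected; the function name and the example birth years are hypothetical.

```python
def has_child_in_age_range(birth_years, year, min_age=14, max_age=17):
    """True if any child is between min_age and max_age (inclusive) in `year`.

    Age is approximated as year minus birth year, because only the
    year of birth is available (an assumption of this sketch).
    """
    return any(min_age <= year - born <= max_age for born in birth_years)


# Hypothetical scientist with children born in 1980 and 1985.
children = [1980, 1985]
flags = {year: has_child_in_age_range(children, year)
         for year in (1993, 1995, 1999, 2003)}
```

For this example, the flag is off in 1993 (ages 13 and 8), on in 1995 and 1999, and off again in 2003 (ages 23 and 18).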
Table 3 compares the sample of scientists for whom we have obtained information about their children to those for whom we have not.15 The former group is significantly older than the latter. This is an artifact of our reliance on databases of public records, such as state driver's license records, to obtain the ages of children, which necessarily oversamples scientists old enough to have children who are eligible to appear in these records. While these older scientists have more publications and career NIH funding, normalizing by age suggests that the two groups are statistically indistinguishable. The higher share of female scientists in the sample without children information may also be a reflection of this age difference across samples, as female entry into STEM fields has steadily climbed in recent decades (Ceci, Ginther, Kahn, and Williams, 2015). Interestingly, the composition of the sample without children information is more heavily skewed toward PhDs. While all the analyses that follow will control for these demographic characteristics, caution should, nonetheless, be exercised when generalizing our findings across different populations of scientists.
Table 3.
Variable | No Kids Info: Mean | No Kids Info: Std. Dev. | Kids Info: Mean | Kids Info: Std. Dev. | Difference |
---|---|---|---|---|---|
Female | 0.171 | 0.376 | 0.104 | 0.305 | 0.067*** |
MD | 0.298 | 0.457 | 0.416 | 0.493 | −0.118*** |
PhD | 0.605 | 0.489 | 0.485 | 0.500 | 0.121*** |
MD/PhD | 0.096 | 0.295 | 0.099 | 0.298 | −0.003 |
Career Age | 31.136 | 9.381 | 37.490 | 7.832 | −6.354*** |
Career Publications | 132.705 | 98.441 | 156.548 | 117.037 | −23.844*** |
Career JIF-Weighted Pubs | 575.952 | 514.264 | 641.981 | 583.944 | −66.029*** |
Career NIH Amount | 18,140,879 | 23,592,466 | 21,881,374 | 23,703,597 | −3,740,495*** |
Career Pubs/Age | 4.364 | 3.022 | 4.283 | 3.195 | 0.081 |
Career NIH/Age | 579,173 | 734,993 | 584,215 | 598,109 | −5,042 |
Observations | 6,863 | | 3,188 | | |
Notes: The numbers presented in this table are based on the final year the scientist appears in the dataset. See section IIIC for details on how the information on children was collected. Section IIIA provides more information about the construction of the demographic and productivity measures presented. Stars in the last column indicate the results of tests of proportions and t-tests for the equality of means
* p < 0.10
** p < 0.05
*** p < 0.01.
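As a rough check on the last column of Table 3, the comparison of means can be reproduced from the reported summary statistics alone. The sketch below assumes an unequal-variance (Welch) t-statistic; the notes do not say which variant of the t-test was used, so this is an illustration rather than the authors' exact procedure.

```python
import math

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Unequal-variance t-statistic for a difference in means (assumed test form)."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error

# Group sizes from the Observations row of Table 3.
N_NO_KIDS_INFO, N_KIDS_INFO = 6863, 3188

# Career Age: the raw gap between the two samples is large relative to its standard error.
t_age = welch_t(31.136, 9.381, N_NO_KIDS_INFO, 37.490, 7.832, N_KIDS_INFO)

# Career Pubs/Age: once normalized by career age, the gap is small relative to its standard error.
t_pubs_per_age = welch_t(4.364, 3.022, N_NO_KIDS_INFO, 4.283, 3.195, N_KIDS_INFO)

print(f"Career Age t = {t_age:.2f}, Pubs/Age t = {t_pubs_per_age:.2f}")
```

Under this assumption, the age gap is far beyond conventional critical values while the age-normalized publication gap is not, which matches the pattern of stars in the table.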
IV. Empirical Approach
Estimating the determinants of faculty mobility behavior requires a statistical framework that accommodates the discrete nature of the event. Since our interest lies in analyzing the dynamics associated with the timing of mobility in scientific careers, we employ discrete-time hazard rate models (Myers, Hankey and Mantel 1973, Allison 1982). The use of discrete-time models (as opposed to continuous-time models such as the Cox model) is motivated by the lumpiness of the failure time information in our data: we observe mobility only at an annual frequency, rather than monthly or daily frequency, which results in multiple failure “times” in the data. For a researcher i during experience interval t, let the discrete time hazard rate of moving to a new academic position located at least 50 miles away be pit = Pr[Ti=t | Ti ≥ t, Xit], where Ti is the time at which researcher i experiences an event and Xit a vector of covariates. We use a logistic regression function to link the hazard rate with time and the explanatory covariates:
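One conventional way to write this link, consistent with the definitions of the hazard rate, the interval effects δt, and the covariate vector Xit given in the text, is the logit form sketched here. The coefficient vector β is notation assumed for illustration.

```latex
\ln\!\left(\frac{p_{it}}{1 - p_{it}}\right) = \delta_t + \beta' X_{it}
```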
where δt is a set of experience interval dummies (in actual fact, a full suite of calendar year indicator variables). In practice, we estimate a simple logit of the decision to change employer, where the observations corresponding to years subsequent to the mobility event have been dropped from the estimation sample.16 Specifically, the vector X above is specified as:
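Using the components described in the paragraphs that follow, the covariate index can be sketched as a sum of productivity, peer, family, life-cycle, and demographic terms. The coefficient labels β1 through β4 and the vintage effect γ are assumed notation; PROD, PEER, AGEKID, f(AGE), and Z are the terms named in the text.

```latex
\beta' X_{it} = \beta_1\,\mathrm{PROD}_{it} + \beta_2\,\mathrm{PEER}_{it}
              + \beta_3\,\mathrm{AGEKID}_{it} + f(\mathrm{AGE}_{it})
              + \beta_4 Z_i + \gamma_{v(i)}
```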
As mentioned previously, approximately ten percent of scientists in our sample experience multiple moves during their career. For these scientists, we analyze each job spell separately, such that the move is considered an absorbing state for a given mobility spell.17 As such, our dependent variable is a binary variable equal to 1 if a scientist moves in a given year and 0 for all years prior to that move within the same spell. Because the overwhelming majority of mobility events take place in the summer, we adopt the following convention: a scientist is said to move from institution A to institution B in calendar year t whenever the actual timing of the move coincided with the summer of year t-1.18
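The spell-level construction of the dependent variable can be illustrated with a short sketch. The function name, field names, and the example scientist are hypothetical; the only logic taken from the text is that the outcome equals 1 in the move year, 0 in earlier years of the same spell, and that later years are dropped.

```python
from typing import Dict, List, Optional

def spell_to_person_years(scientist_id: str, start_year: int, end_year: int,
                          move_year: Optional[int]) -> List[Dict]:
    """Expand one job spell into annual observations.

    Years after the move are not emitted, so the move is absorbing within
    the spell. A spell with no move (move_year=None) contributes only zeros.
    """
    last_year = end_year if move_year is None else min(move_year, end_year)
    rows = []
    for year in range(start_year, last_year + 1):
        rows.append({
            "scientist_id": scientist_id,  # unit at which standard errors are clustered
            "year": year,
            "moved": int(move_year is not None and year == move_year),
        })
    return rows

# Hypothetical scientist with two spells: a move in 1995, then no further move.
panel = (spell_to_person_years("s1", 1990, 2000, move_year=1995)
         + spell_to_person_years("s1", 1996, 2000, move_year=None))
```

Because both spells carry the same identifier, a scientist with multiple moves contributes several spells to the estimation sample while remaining a single cluster.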
The vector PROD includes measures of the productivity of the elite scientist, including the stock and recent flow of their publications and NIH funding levels. The vector PEER includes similar measures for both collaborating and non-collaborating peers as defined above. The term f(AGE) corresponds to a flexible function of the elite scientist's career age (years since MD or PhD degree), or dummies for each year, in order to capture life-cycle changes in the propensity to move that are not driven by the age of one's children. We also include demographic controls (Z) for gender and degree type (MD or PhD). All models include vintage fixed effects to account for differences across cohorts and a full set of year effects. Since we have 1,051 scientists in the sample with multiple moves, and each job spell is included in the sample separately, we cluster the standard errors at the scientist level.
Given our interest in the high social costs of moving children of high school age, we employ a variety of measures of AGEKID to probe this effect. In particular, we create variables that correspond to the case where one's oldest child is 12 or 13 years old and where one's youngest child is 18 or 19 years old. The former corresponds to the last window for a scientist to move before they have a child in high school, which coincides with the onset of adolescence. The latter corresponds to an ‘empty nest’ period in which all children should have graduated high school, regardless of whether they actually leave the nest. Alternative specifications include a simple measure of the number of children in high school and an indicator for having at least one child in high school.
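The AGEKID measures described above can be sketched as follows. The function and key names are hypothetical, child age is approximated as calendar year minus birth year, and high school is taken to span ages 14 to 17, the range cited in the abstract.

```python
from typing import Dict, List

HIGH_SCHOOL_AGES = range(14, 18)  # ages 14-17 (assumed high school window)

def agekid_measures(child_birth_years: List[int], year: int) -> Dict[str, int]:
    """Compute the four child-age measures for one scientist-year."""
    ages = [year - birth_year for birth_year in child_birth_years]
    n_in_high_school = sum(1 for age in ages if age in HIGH_SCHOOL_AGES)
    return {
        # Last window before the oldest child enters high school.
        "oldest_12_13": int(bool(ages) and max(ages) in (12, 13)),
        # All children should have finished high school.
        "youngest_18_19": int(bool(ages) and min(ages) in (18, 19)),
        "n_kids_hs": n_in_high_school,
        "any_kid_hs": int(n_in_high_school > 0),
    }

# Hypothetical scientist with children born in 1980 and 1983.
before_hs = agekid_measures([1980, 1983], 1993)   # ages 13 and 10
during_hs = agekid_measures([1980, 1983], 1996)   # ages 16 and 13
empty_nest = agekid_measures([1980, 1983], 2001)  # ages 21 and 18
```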
V. Results
Our core results on the demographic and professional factors that influence the mobility of scientists are presented in Table 4. We find support for H1, that scientists who are more productive in terms of publications are more likely to move, which is consistent with related work in the literature (e.g. Zucker, Darby and Torero, 2002). Interestingly, we find that recent NIH funding serves as a deterrent to moving. This comports with popular accounts of the high transaction costs associated with moving federal funding across institutions (Bernstein, 2014). Regarding H2, we find that scientists are less likely to move when the quality of the peer environment near their home institution is high and more likely to move when the quality of the peer environment at distant institutions is high. Importantly, given the focus on family factors later in the analysis, all results are nearly identical when we restrict our attention to the sample of scientists with available age of children information.
Table 4.
Variable | (1) Movers + Stayers | (2) Movers + Stayers | (3) Movers + Stayers | (4) Movers + Stayers | (5) Subsample w/Kid Info |
---|---|---|---|---|---|
Demographics | | | | | |
Female | −0.0013 (0.0012) | −0.0005 (0.0013) | −0.0001 (0.0013) | −0.0001 (0.0013) | −0.0047+ (0.0028) |
PhD (MD omitted) | −0.0014 (0.0009) | −0.0047** (0.0010) | −0.0066** (0.0010) | −0.0064** (0.0011) | −0.0081** (0.0017) |
MD/PhD | −0.0007 (0.0015) | −0.0053** (0.0017) | −0.0056** (0.0017) | −0.0045** (0.0017) | −0.0048+ (0.0027) |
Productivity Measures | | | | | |
Ln(Pubs_t-1) | | −0.0008 (0.0007) | 0.0005 (0.0007) | 0.0002 (0.0007) | −0.0009 (0.0011) |
Ln(Stock Pubs_t-2) | | 0.0031** (0.0007) | 0.0045** (0.0008) | 0.0035** (0.0008) | 0.0048** (0.0014) |
Ln(NIH Funding_t-1) | | −0.0010** (0.0001) | −0.0010** (0.0001) | −0.0010** (0.0001) | −0.0009** (0.0001) |
Ln(Stock NIH Funding_t-2) | | 0.0009** (0.0001) | 0.0008** (0.0001) | 0.0006** (0.0001) | 0.0004* (0.0002) |
Collaborators | | | | | |
Ln(Pubs), Colocated | | | −0.0043** (0.0003) | −0.0014** (0.0004) | −0.0011+ (0.0006) |
Ln(Pubs), Close | | | −0.0020** (0.0005) | −0.0022** (0.0005) | −0.0012 (0.0009) |
Ln(Pubs), Distant | | | 0.0017** (0.0004) | 0.0015** (0.0004) | 0.0016* (0.0007) |
Non-collaborating Peers | | | | | |
Ln(Pubs), Colocated | | | | −0.0073** (0.0004) | −0.0064** (0.0006) |
Ln(Pubs), Close | | | | −0.0000 (0.0003) | 0.0000 (0.0005) |
Ln(Pubs), Distant | | | | 0.0046** (0.0008) | 0.0035** (0.0012) |
Nb. of Observations | 189,884 | 174,144 | 174,107 | 174,107 | 62,181 |
Nb. of Job Spells | 10,273 | 10,273 | 10,273 | 10,273 | 3,316 |
Nb. of Scientists | 9,378 | 9,378 | 9,378 | 9,378 | 2,977 |
Notes: The dependent variable is a binary variable that takes on a value one in the year we observe the elite scientist moving to a new academic position located at least 50 miles away. Estimation is by logit and marginal effects are reported. All specifications include full age, vintage category, and year fixed effects. Robust standard errors are in parentheses, clustered at the individual level. See Section III for a full description of how the sample and variables were constructed.
+ p < 0.10
* p < 0.05
** p < 0.01
Table 5 repeats our analysis of the professional determinants of moving from Table 4, but for local moves (those within a 50-mile radius of the origin institution), as H1 and H2 may be relevant for local moves as well. The predictors of mobility are largely similar across samples, with two notable exceptions. First, female scientists are more likely to make local moves. Second, own-productivity plays a far more limited role in shaping local moves than it does for distant ones.
Table 5.
Variable | (1) Local Movers + Stayers | (2) Local Movers + Stayers | (3) Local Movers + Stayers | (4) Local Movers + Stayers | (5) Subsample w/Kid Info |
---|---|---|---|---|---|
Demographics | | | | | |
Female | 0.0026** (0.0006) | 0.0027** (0.0006) | 0.0028** (0.0006) | 0.0025** (0.0006) | 0.0022+ (0.0012) |
PhD (MD omitted) | −0.0018** (0.0004) | −0.0024** (0.0005) | −0.0031** (0.0005) | −0.0030** (0.0006) | −0.0037** (0.0009) |
MD/PhD | −0.0016+ (0.0009) | −0.0022* (0.0010) | −0.0024* (0.0010) | −0.0024* (0.0010) | −0.0023 (0.0016) |
Productivity Measures | | | | | |
Ln(Pubs_t-1) | | −0.0003 (0.0004) | 0.0001 (0.0004) | 0.0001 (0.0004) | 0.0008 (0.0006) |
Ln(Stock Pubs_t-2) | | 0.0002 (0.0004) | 0.0008+ (0.0004) | 0.0008+ (0.0005) | 0.0021* (0.0008) |
Ln(NIH Funding_t-1) | | −0.0001* (0.0001) | −0.0001* (0.0001) | −0.0001* (0.0001) | −0.0002* (0.0001) |
Ln(Stock NIH Funding_t-2) | | 0.0001 (0.0001) | 0.0000 (0.0001) | 0.0000 (0.0001) | 0.0000 (0.0001) |
Collaborators | | | | | |
Ln(Pubs), Colocated | | | −0.0015** (0.0002) | −0.0007** (0.0002) | −0.0005 (0.0003) |
Ln(Pubs), Close | | | 0.0018** (0.0002) | 0.0003 (0.0002) | 0.0003 (0.0004) |
Ln(Pubs), Distant | | | −0.0003 (0.0002) | −0.0002 (0.0002) | −0.0006 (0.0004) |
Non-collaborating Peers | | | | | |
Ln(Pubs), Colocated | | | | −0.0019** (0.0002) | −0.0020** (0.0003) |
Ln(Pubs), Close | | | | 0.0019** (0.0002) | 0.0021** (0.0003) |
Ln(Pubs), Distant | | | | −0.0001 (0.0004) | −0.0010+ (0.0006) |
Nb. of Observations | 153,813 | 143,867 | 143,833 | 143,833 | 50,445 |
Nb. of Job Spells | 6,719 | 6,719 | 6,719 | 6,719 | 2,072 |
Nb. of Scientists | 6,635 | 6,635 | 6,635 | 6,635 | 2,037 |
Notes: The dependent variable is a binary variable that takes on a value one in the year we observe the elite scientist moving to a new academic position located within 50 miles. Distant movers (scientists moving more than 50 miles away) are excluded from this analysis. Estimation is by logit and marginal effects are reported. All specifications include full age, vintage category, and year fixed effects. Robust standard errors are in parentheses, clustered at the individual level. See Section III for a full description of how the sample and variables were constructed.
+ p < 0.10
* p < 0.05
** p < 0.01
The analyses thus far have treated all moves as equivalent, but heterogeneity in the rank of institutions may mask important insights about the determinants of mobility. The drivers of moves to higher-ranked institutions may be very different from those that contribute to moves to laterally ranked or lower-ranked institutions. Next, we examine whether professional factors differentially affect mobility across different types of moves and whether the peer environment is more important than university rank (C1).
We categorize moves based on the differences between the quality of their origin and destination institutions, where our measure of quality is based on an institution's rank in percentiles of total NIH funding received (per grantee) in a given year. We define scientists as moving “up” if they moved to an institution that was at least 10 percentiles higher in the ranking of NIH funding than their prior institution. A move “down” is symmetrically defined as one where the new institution was at least 10 percentiles lower in the ranking.
Table 6 presents our results for the professional determinants of moves separately for those moving up vs. staying (column 1) and those moving down vs. staying (column 2).19 Note that in these regressions, scientists who moved to an institution that was less than 10 percentiles different (i.e. lateral movers) are excluded, such that the comparison is relative to stayers. We also include the percentile of the origin institution as a control, since the direction of moves for those beginning at either end of the quality ladder will be partially constrained. In column 3 we present results comparing lateral movers (those moving to an institution within 10 percentiles in the ranking of NIH funding) to stayers. We also present the results comparing movers up to movers down (column 4).
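The classification rule for moves can be written compactly. The sketch below uses a hypothetical function name and assumes institution quality is already expressed as a percentile rank of NIH funding per grantee; the 10-percentile threshold comes from the text.

```python
def classify_move(origin_percentile: float, destination_percentile: float,
                  threshold: float = 10.0) -> str:
    """Label a move as 'up', 'down', or 'lateral' from the change in percentile rank."""
    change = destination_percentile - origin_percentile
    if change >= threshold:
        return "up"       # at least 10 percentiles higher
    if change <= -threshold:
        return "down"     # at least 10 percentiles lower
    return "lateral"      # less than 10 percentiles different

# Hypothetical moves between institutions at the given percentile ranks.
examples = {
    (40, 55): classify_move(40, 55),
    (40, 50): classify_move(40, 50),
    (60, 45): classify_move(60, 45),
    (60, 55): classify_move(60, 55),
}
```

The rule also shows why origin percentile is a useful control: a scientist starting above the 90th percentile cannot be classified as moving up, and one starting below the 10th cannot move down.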
Table 6.
Variable | (1) Moves Up vs. Stayers | (2) Moves Down vs. Stayers | (3) Lateral vs. Stayers | (4) Moves Up vs. Down |
---|---|---|---|---|
Demographics | ||||
Female | −0.0005 (0.0014) | −0.0024 (0.0019) | −0.0049* (0.0021) | 0.1410* (0.0650) |
PhD | −0.0056** (0.0010) | −0.0047** (0.0010) | −0.0057** (0.0012) | 0.0113 (0.0363) |
MD/PhD | −0.0043* (0.0018) | −0.0021 (0.0015) | −0.0029 (0.0018) | −0.0244 (0.0521) |
Productivity Measures | ||||
Ln(Pubs_t-1) | 0.0003 (0.0006) | 0.0004 (0.0008) | −0.0007 (0.0008) | 0.0028 (0.0129) |
Ln(Stk Pubs_t-2) | 0.0003 (0.0007) | 0.0024** (0.0009) | 0.0006 (0.0010) | −0.0098 (0.0243) |
Ln(NIH Funding_t-1) | −0.0001 (0.0001) | −0.0001+ (0.0001) | −0.0004** (0.0001) | −0.0017 (0.0021) |
Ln(Stk NIH Funding_t-2) | 0.0001 (0.0001) | 0.0002* (0.0001) | 0.0006** (0.0001) | 0.0019 (0.0028) |
Collaborators | ||||
Ln(Pubs), Colocated | −0.0007+ (0.0004) | −0.0009* (0.0004) | −0.0004 (0.0005) | 0.0085 (0.0133) |
Ln(Pubs), Close | −0.0013* (0.0006) | 0.0002 (0.0006) | −0.0019** (0.0007) | −0.0304+ (0.0178) |
Ln(Pubs), Distant | 0.0004 (0.0003) | 0.0002 (0.0004) | 0.0011* (0.0005) | −0.0031 (0.0136) |
Non-collaborating Peers | ||||
Ln(Pubs), Colocated | −0.0027** (0.0004) | −0.0019** (0.0004) | −0.0032** (0.0005) | −0.0271* (0.0138) |
Ln(Pubs), Close | −0.0003 (0.0003) | −0.0006+ (0.0003) | 0.0005 (0.0004) | 0.0019 (0.0111) |
Ln(Pubs), Distant | 0.0023** (0.0006) | 0.0014+ (0.0007) | 0.0034** (0.0009) | −0.0114 (0.0323) |
Nb. of Observations | 43,487 | 43,742 | 50,646 | 5,074 |
Nb. of Job Spells | 1,956 | 2,085 | 2,336 | 566 |
Nb. of Scientists | 1,954 | 2,078 | 2,277 | 496 |
Note: The dependent variable is a binary variable that takes on a value one in the year we observe the elite scientist moving to a new academic position located within 50 miles. Moves up or down are based on a ranking of institutions by percentiles of total NIH funding received (per grantee). The sample includes scientists who moved to an institution that was 10 percentiles or more higher (lower) in the ranking of NIH funding and scientists who did not move. This means that scientists who moved to an institution that was less than 10 percentiles different (i.e. lateral movers) and scientists who moved down (up) are excluded. Estimation is by logit and marginal effects are reported. All regressions include full age, vintage category, and year fixed effects and controls for percentile of origin institution. Robust standard errors are in parentheses, clustered at the individual level. See Section III for a full description of how the sample and variables were constructed.
+ p < 0.10
* p < 0.05
** p < 0.01
The results show some difference in the mobility decisions of women, who are more likely to move up (conditional on moving) and less likely to move laterally vs. staying. Otherwise, the results across the specifications suggest that individuals generally are less likely to move (be it up, down, or laterally) when their peer environment at their home institution is better and more likely to move when the distant peer environment is better. We view the results as supporting C1: the professional drivers of scientific mobility are insensitive to the type of move, lending support to the notion that all moves in this sample of elite scientists are ‘voluntary’ and driven by a desire to locate near higher quality peers, irrespective of broader institutional quality.20 Coupled with the result from Table 1, which shows that women move less often overall, it appears that women have a higher threshold for moving.
Our analysis of the social (family) determinants of moving begins in Figures 2 and 3, which underscore the important, and previously unmeasured, role that children play in scientist mobility. Figure 2 reveals a sizable spike in distant moves just before children in the household enter high school. Figure 3 reveals a similar spike just after all children in the household have completed high school. In both figures, the relationship between age of children and local moves is remarkably flat, suggesting that this is not simply a story about scientist age. The robustness of this relationship to more sophisticated statistical scrutiny that addresses potential confounders as well as the role of professional factors in shaping moves is examined in the next section.
Table 7 subjects these family determinants of moving to regression analysis and confirms the descriptive relationship illustrated in Figures 2 and 3. In column 1 we see that having an oldest child who is finishing middle school (12 or 13 years old) increases the likelihood of moving by almost 0.9 percentage points. In column 2, we see that when the youngest child has just completed high school, the propensity to move also increases, in this case by 0.96 percentage points. Our results are very similar when we include both of these measures in the same regression. Using a simple indicator variable for having a child in high school or the number of children in high school illustrates the opposite side of this picture. Thus, we find evidence for H3, that having adolescent children constrains the mobility of scientists.
Table 7.
Variable | (1) | (2) | (3) | (4) | (5) |
---|---|---|---|---|---|
Oldest kid 12 or 13 | 0.0089** (0.0022) | | 0.0094** (0.0022) | | |
Youngest kid 18 or 19 | | 0.0096** (0.0024) | 0.0101** (0.0024) | | |
Number of kids in high school | | | | −0.0065** (0.0013) | |
At least one kid in high school | | | | | −0.0107** (0.0017) |
Nb. of Observations | 62,181 | 62,181 | 62,181 | 62,181 | 62,181 |
Nb. of Job Spells | 3,316 | 3,316 | 3,316 | 3,316 | 3,316 |
Nb. of Scientists | 2,977 | 2,977 | 2,977 | 2,977 | 2,977 |
Notes: The dependent variable is a binary variable that takes on a value one in the year we observe the elite scientist moving to a new academic position located at least 50 miles away. Estimation is by logit and marginal effects are reported. All specifications include individual productivity and peer variables, as well as full age, vintage category, and year fixed effects. Robust standard errors are in parentheses, clustered at the individual level. See Section III for a full description of how the sample and variables were constructed.
+ p < 0.10
* p < 0.05
** p < 0.01.
Next we test C2, that the role of adolescent children in constraining mobility is smaller for local moves. Table 8 presents results for the same regressions as those in Table 7, but for local movers only (moves within the 50-mile radius). The picture here is very different. Having children in high school has no effect on moving and thus we see no corresponding bump in moves when the youngest child in the household is beyond high school age. While we do see a small and marginally significant effect when the oldest child in the household is of middle-school age, it is one-third the size of the effect we see for distant moves and in the opposite direction. Overall, these results are consistent with the notion that a move to an institution within 50 miles can avoid major disruptions to children's social networks and school choices.
Table 8.
Variable | (1) | (2) | (3) | (4) | (5) |
---|---|---|---|---|---|
Oldest kid 12 or 13 | −0.0003 (0.0015) | | −0.0002 (0.0015) | | |
Youngest kid 18 or 19 | | 0.0010 (0.0014) | 0.0010 (0.0014) | | |
Number of kids in high school | | | | 0.0006 (0.0006) | |
At least one kid in high school | | | | | 0.0008 (0.0008) |
Nb. of Observations | 50,445 | 50,445 | 50,445 | 50,445 | 50,445 |
Nb. of Job Spells | 2,072 | 2,072 | 2,072 | 2,072 | 2,072 |
Nb. of Scientists | 2,037 | 2,037 | 2,037 | 2,037 | 2,037 |
Notes: The dependent variable is a binary variable that takes on a value one in the year we observe the elite scientist moving to a new academic position located within 50 miles. Distant movers (scientists moving more than 50 miles away) are excluded from this analysis. Estimation is by logit and marginal effects are reported. All specifications include individual productivity and peer variables, as well as full age, vintage category, and year fixed effects. Robust standard errors are in parentheses, clustered at the individual level. See Section III for a full description of how the sample and variables were constructed.
+ p < 0.10
* p < 0.05
** p < 0.01.
Taken as a whole, the results in Tables 7 and 8 suggest that scientists find it costly to disrupt the social networks of their children. They either postpone professional moves until after children leave high school or, interestingly, anticipate these constraints by increasing moves just before their oldest child enters high school.21
In Table 9, we run the analysis separately for men and women in panels A and B. In panel C, we include interactions of the age of children variables with a female dummy. The regressions separately by gender show that the coefficients for women tend to be much larger than for men. While only the interaction terms for “Number of kids in HS” and “At least one kid in HS” are significantly different from zero, these results suggest that the age of children effects are more pronounced for women's mobility decisions, a result that is consistent with the findings of Shauman and Xie (1996) discussed previously.
Table 9.
Variable | (1) | (2) | (3) | (4) | (5) |
---|---|---|---|---|---|
A. Women | | | | | |
Oldest kid 12 or 13 | 0.0177* (0.0077) | | 0.0181* (0.0078) | | |
Youngest kid 18 or 19 | | 0.0076 (0.0090) | 0.0089 (0.0089) | | |
Number of kids in HS | | | | −0.0143* (0.0065) | |
At least one kid in HS | | | | | −0.0212** (0.0075) |
Nb. of Observations | 4,743 | 4,743 | 4,743 | 4,743 | 4,743 |
Nb. of Job Spells | 316 | 316 | 316 | 316 | 316 |
Nb. of Scientists | 295 | 295 | 295 | 295 | 295 |
B. Men | | | | | |
Oldest kid 12 or 13 | 0.0083** (0.0023) | | 0.0088** (0.0023) | | |
Youngest kid 18 or 19 | | 0.0099** (0.0025) | 0.0103** (0.0026) | | |
Number of kids in HS | | | | −0.0061** (0.0013) | |
At least one kid in HS | | | | | −0.0102** (0.0017) |
Nb. of Observations | 56,550 | 56,550 | 56,550 | 56,550 | 56,550 |
Nb. of Job Spells | 3,000 | 3,000 | 3,000 | 3,000 | 3,000 |
Nb. of Scientists | 2,682 | 2,682 | 2,682 | 2,682 | 2,682 |
C. Female Interactions | | | | | |
Female × Oldest kid 12 or 13 | 0.0047 (0.0068) | | 0.0046 (0.0068) | | |
Female × Youngest kid 18 or 19 | | −0.0070 (0.0094) | −0.0069 (0.0094) | | |
Female × Number of kids in HS | | | | −0.0111+ (0.0060) | |
Female × At least one kid in HS | | | | | −0.0145* (0.0071) |
Nb. of Observations | 62,181 | 62,181 | 62,181 | 62,181 | 62,181 |
Nb. of Job Spells | 3,316 | 3,316 | 3,316 | 3,316 | 3,316 |
Nb. of Scientists | 2,977 | 2,977 | 2,977 | 2,977 | 2,977 |
One concern with our estimates of the influence of age of children is that they may suffer from selection bias if scientists for whom we observe children information are somehow different from those for whom we do not. While we do not believe that scientists anticipate constraints on their mobility 10-12 years in advance, which would lead to biased estimates, we next assess the magnitude of potential biases based on selection on observables. We do this by performing an inverse probability of treatment weighted (IPTW) estimation of the specifications in Table 7. First, we model the probability of observing children information in our sample as a function of gender, a full suite of birth year dummies, and degree dummies. Then, in our regression predicting mobility, we weight individuals by the inverse probability of observing children information. Under the (untestable) assumption that selection into the subsample with children information is based on these observables, the IPTW estimates provide estimates of the relationship of mobility with age of kids that are valid for the entire sample (for more details on this estimation technique in a setting relevant for the field of innovation, see Azoulay et al. (2016)). The results with IPTW estimation in Table 10 show that estimates are essentially unchanged, with even slightly larger effects, suggesting that selection is not driving our results.22
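The weighting step can be illustrated with a minimal sketch. The text models the probability of observing children information using gender, birth year dummies, and degree dummies; the version below swaps in a simpler stand-in, estimating that probability as the observed share within each covariate cell. The function name and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Hashable, List

def iptw_weights(cells: List[Hashable], observed: List[bool]) -> List[float]:
    """Return 1 / P(children info observed | cell) for observed units, 0 otherwise.

    The probability is the within-cell share of observed units, a saturated
    stand-in for a fitted selection model (an assumption of this sketch).
    """
    cell_total = defaultdict(int)
    cell_observed = defaultdict(int)
    for cell, is_observed in zip(cells, observed):
        cell_total[cell] += 1
        cell_observed[cell] += int(is_observed)

    weights = []
    for cell, is_observed in zip(cells, observed):
        probability = cell_observed[cell] / cell_total[cell]
        # Units without children information drop out of the mobility regression.
        weights.append(1.0 / probability if is_observed else 0.0)
    return weights

# Toy sample: children information is observed for 1 of 4 scientists in the
# first cell and 2 of 4 in the second.
cells = [("F", "PhD")] * 4 + [("M", "MD")] * 4
observed = [True, False, False, False, True, True, False, False]
weights = iptw_weights(cells, observed)
```

In this toy sample the weights of the observed units sum to the full sample size, which is the sense in which the reweighted subsample stands in for the entire sample under selection on observables.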
Table 10.
Variable | (1) | (2) | (3) | (4) | (5) |
---|---|---|---|---|---|
Oldest kid 12 or 13 | 0.0090** (0.0023) | | 0.0095** (0.0023) | | |
Youngest kid 18 or 19 | | 0.0110** (0.0026) | 0.0116** (0.0026) | | |
Number of kids in high school | | | | −0.0075** (0.0014) | |
At least one kid in high school | | | | | −0.0122** (0.0018) |
Nb. of Observations | 62,181 | 62,181 | 62,181 | 62,181 | 62,181 |
Nb. of Job Spells | 3,316 | 3,316 | 3,316 | 3,316 | 3,316 |
Nb. of Scientists | 2,977 | 2,977 | 2,977 | 2,977 | 2,977 |
Notes: The dependent variable is a binary variable that takes on a value one in the year we observe the elite scientist moving to a new academic position located at least 50 miles away. Estimation is by logit and marginal effects are reported. We model the probability of observing children information in our sample as a function of gender, a full suite of birth year dummies, and degree dummies; in the regression predicting mobility, we then weight individuals by the inverse of this probability. All specifications include individual productivity and peer variables, as well as full age, vintage category, and year fixed effects. Robust standard errors are in parentheses, clustered at the individual level. See Section III for a full description of how the sample and variables were constructed.
+ p < 0.10
* p < 0.05
** p < 0.01.
VI. Conclusion
In this paper, we examine the factors that shape the mobility of elite academics within the life sciences, including individual productivity measures and for the first time, measures of the peer environment and family determinants. As in previous literature, we find that prolific scientists are more likely to move, a likely reflection of the ‘demand’ for these scientists. We also provide new evidence on the ‘supply’ side, demonstrating that moves are driven in part by the scope for improvement in the quality of one's peers as a result of the move. We also highlight two significant constraints on moving. Elite scientists are less likely to move when they have recently received NIH funding, perhaps due to the high costs of transferring funds, equipment, and personnel to a new institution. Strikingly, we show that family structure is also important. Scientists appear to be unwilling to move when their children are in high school, suggesting that even elite scientists find it costly to disrupt the social networks of their children and take these costs into account when making career decisions.
Economists have long worried that labor market frictions can lead to job lock and foregone opportunities to increase social welfare. Set against that backdrop, the impacts of NIH funding on mobility are particularly interesting. If the purported benefit of federal research support is to generate new discoveries and push the frontiers of science, it is ironic that it also appears to limit scientists’ ability to locate themselves in environments where that is most likely to happen. This is a particularly germane concern in the life sciences, where public funding is the lifeblood of academic research. Thus, it appears that the design of policies and guidelines that increase the portability of research funding can help further the mission of the NIH, and perhaps other public research funding agencies as well.
Our novel findings regarding the role of peer productivity in shaping mobility are consistent with many studies documenting the importance of teams and interlocutors for scientific research. In light of this, an effective strategy for institutions for recruiting or retaining scientists may be to consider teams or group hires in the recruiting process rather than focusing only on individuals. Understanding the dynamics of team recruiting, and the degree to which it occurs in the sciences, is an important area for further research.
Our findings regarding the role of peer productivity also illustrate the challenges inherent in developing causal estimates of the impacts of mobility on knowledge production. Since mobility decisions depend upon the quality of scholars at old and new institutions, it is exceedingly difficult to infer the impacts that a scholar exerts on the productivity of his or her newly joined colleagues or those left behind. Departures may be a sign of intellectual decay even before the moving scientist left. Those s/he joined may have already been destined for greatness. In this context, our age of children results may contain a silver lining. Since scientists are reluctant to move when they have children in high school, and it is hard to imagine that they anticipate these costs more than a decade earlier when they decide to conceive, age of children offers a potential lever for studying the impacts of mobility on other outcomes of interest. This is an area ripe for exploration in future work.
The analysis in this paper has raised as many questions as it has answered. What underlies the surprising funding results? Is mobility constrained by something inherent in the contract between the NIH and the scientist's institution or might the movement of personnel and equipment be the larger concern? On family factors, what else matters for moving? Are women, too few in our elite sample to study, more impacted by those factors than men? How should all of this change the design of institutional incentives, particularly those that explicitly grant promotion and tenure based on funding histories? Together, these questions comprise a future research agenda.
Highlights.
We use a dataset of 10,051 elite life scientists to study the predictors of mobility.
Scientists with more publications and NIH funding are more likely to move.
Recent NIH funding is associated with a lower likelihood of moving.
The quality of the peer environment is an important influencer of mobility.
Scientists, especially mothers, are less likely to move when children are adolescent.
Acknowledgments
We gratefully acknowledge the financial support of the National Institutes of Health (P01-AG039347). Azoulay and Graff Zivin acknowledge the financial support of the National Science Foundation through its SciSIP Program (Award SBE-1460344). The authors also express gratitude to the Association of American Medical Colleges for providing licensed access to the AAMC Faculty Roster, and acknowledge the stewardship of Dr. Hershel Alexander (AAMC Director of Medical School and Faculty Studies). The National Institutes of Health partially supports the AAMC Faculty Roster under contract HHSN263200900009C. We thank Bruce Weinberg and participants of the NBER Innovation in an Aging Society meetings for useful discussions. All errors are our own.
Appendix A: Defining Elite Life Scientists
Highly Funded Scientists
Our first data source is the Consolidated Grant/Applicant File (CGAF) from the U.S. National Institutes of Health (NIH). This dataset records information about grants awarded to extramural researchers funded by the NIH since 1938. Using the CGAF and focusing only on direct costs associated with research grants, we compute individual cumulative totals for the decades 1977-1986, 1987-1996, and 1997-2006, deflating the earlier years by the Biomedical Research Producer Price Index. We also re-compute these totals excluding large center grants that usually fund groups of investigators (M01 and P01 grants). Scientists whose totals lie above the 95th percentile of either distribution constitute our first group of elite life scientists. In this group, the least well-funded investigator garnered $10.5 million in career NIH funding and the most well-funded received $462.6 million.23
Highly Cited Scientists
Despite the preeminent role of the NIH in the funding of public biomedical research, the above indicator of “superstardom” biases the sample towards scientists conducting relatively expensive research. We complement this first group with a second composed of highly cited scientists identified by the Institute for Scientific Information. A Highly Cited listing means that an individual was among the 250 most cited researchers for their published articles between 1981 and 1999, within a broad scientific field.24
Top Patenters
We add to these groups academic life scientists who belong in the top percentile of the patent distribution among academics – those who were granted 17 patents or more between 1976 and 2004.
Members of the National Academy of Sciences and of the Institute of Medicine
We add to these groups academic life scientists who were elected to the National Academy of Sciences or the Institute of Medicine between 1970 and 2013.
MERIT Awardees of the NIH
Initiated in the mid-1980s, the MERIT Award program extends funding for up to 5 years (but typically 3 years) to a select number of NIH-funded investigators: “Method to Extend Research in Time (MERIT) awards (R37) were created by the NIH to recognize outstanding and consistently productive investigators and to provide up to ten years of research funding” (Epstein, 2011). The specific details governing selection vary across the component institutes of the NIH, but the essential feature of the program is that only researchers holding an R01 grant in its second or later cycle are eligible. Further, the application must be scored in the top percentile in a given funding cycle.
Former and Current Howard Hughes Medical Investigators (HHMIs)
Every three years, the Howard Hughes Medical Institute selects a small cohort of mid-career biomedical scientists with the potential to revolutionize their respective subfields. Once selected, HHMIs continue to be based at their institutions, typically leading a research group of 10 to 25 students, postdoctoral associates and technicians. Their appointment is reviewed every five years, based solely on their most important contributions during the cycle.25
Early Career Prize Winners
We also included winners of the Pew, Searle, Beckman, Rita Allen, and Packard scholarships for the years 1981 through 2000. Every year, these charitable foundations provide seed funding to between 20 and 40 young academic life scientists. These scholarships are the most prestigious accolades that young researchers can receive in the first two years of their careers as independent investigators.
Table A1.
Transition Type | Share of Job Transitions (%) |
---|---|
US Academia to US Academia (Distant) | 77.06 |
US Academia to US Academia (Local) | 13.88 |
US Academia to Industry (Distant) | 0.62 |
US Academia to Industry (Local) | 0.52 |
Industry to US Academia (Distant) | 1.09 |
Industry to US Academia (Local) | 0.45 |
US Academia to Foreign Academia | 1.76 |
Foreign Academia to US Academia | 4.63 |
Job Transitions (N) | 5,793 |
Appendix B: Measuring Publication Data
The source of our publication data is PubMed, a bibliographic database maintained by the U.S. National Library of Medicine that is searchable on the web at no cost.26 PubMed contains over 14 million citations from 4,800 journals published in the United States and more than 70 other countries from 1950 to the present. The subject scope of this database is biomedicine and health, broadly defined to encompass those areas of the life sciences, behavioral sciences, chemical sciences, and bioengineering that inform research in health-related fields. In order to effectively mine this publicly available data source, we designed PubHarvester, an open-source software tool that automates the process of gathering publication information for individual life scientists (see Azoulay et al. 2006 for a complete description of the software). PubHarvester is fast, simple to use, and reliable. Its output consists of a series of reports that can be easily imported by statistical software packages.
This software tool does not obviate the two challenges faced by empirical researchers when attempting to accurately link individual scientists with their published output. The first relates to what one might term “Type I Error,” whereby we mistakenly attribute to a scientist a journal article actually authored by a namesake; the second relates to “Type II Error,” whereby we conservatively exclude from a scientist's publication roster legitimate articles:
Namesakes and Popular Names
PubMed does not assign unique identifiers to the authors of the publications it indexes. It identifies authors simply by their last name, up to two initials, and an optional suffix. This makes it difficult to unambiguously assign publication output to individual scientists, especially when their last name is relatively common.
Inconsistent Publication Names
The opposite danger, that of recording too few publications, also looms large, since scientists are often inconsistent in the names they publish under. By far the most common source of error is the haphazard use of a middle initial. Other errors stem from inconsistent use of suffixes (Jr., Sr., 2nd, etc.), or from multiple patronyms due to changes in spousal status.
To deal with these serious measurement problems, we opted for a labor-intensive approach: the design of individual search queries that rely on relevant scientific keywords, the names of frequent collaborators, journal names, as well as institutional affiliations. We are aided in the time-consuming process of query design by the availability of a reliable archival data source, namely, these scientists’ CVs and biosketches. PubHarvester provides the option to use such custom queries in lieu of a completely generic query (e.g., “azoulay p”[au] or “graff zivin js”[au]).
As an example, one can examine the publications of Scott A. Waldman, an eminent pharmacologist located in Philadelphia, PA at Thomas Jefferson University. Waldman is a relatively frequent name in the United States (with 208 researchers with an identical patronym in the AAMC Faculty Roster); the combination “waldman s” is common to 3 researchers in the same database. A simple search query for “waldman sa”[au] OR “waldman s”[au] returns 377 publications at the time of this writing. However, a more refined query, based on Professor Waldman's biosketch, returns only 256 publications.27
The above example also makes clear how we deal with the issue of inconsistent publication names. PubHarvester gives the end-user the option to choose up to four PubMed-formatted names under which publications can be found for a given researcher. For example, Louis J. Tobian, Jr. publishes under “tobian l”, “tobian l jr”, and “tobian lj”, and all three names need to be provided as inputs to generate a complete publication listing. Furthermore, even though Tobian is a relatively rare name, the search query needs to be modified to account for these name variations, as in (“tobian l”[au] OR “tobian lj”[au]).
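To make the structure of such queries concrete, the following sketch assembles a PubMed search string from a list of name variants, optionally narrowed by context terms and a publication-date range. The helper names are hypothetical and this is not PubHarvester's actual interface; only the field tags [au], [ad], and [dp] are taken from the queries quoted in this appendix.

```python
def author_clause(name_variants):
    """OR together PubMed-formatted author names, each tagged with [au]."""
    terms = [f'"{name}"[au]' for name in name_variants]
    return "(" + " OR ".join(terms) + ")"


def refined_query(name_variants, context_terms=(), years=None):
    """Narrow an author clause with context terms and a date range.

    context_terms: keywords, coauthors, or affiliations, already tagged
                   where needed (e.g. "philadelphia[ad]")
    years:         optional (start, end) tuple for the [dp] field
    """
    query = author_clause(name_variants)
    if context_terms:
        query = f"({query} AND ({' OR '.join(context_terms)}))"
    if years is not None:
        start, end = years
        query = f"({query} AND {start}:{end}[dp])"
    return query


# Reproduces the Tobian example from the text (with straight quotes):
print(author_clause(["tobian l", "tobian lj"]))
```

Adding name variants widens the author clause and guards against missed articles, while context terms and date limits narrow it and guard against namesakes.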
Appendix C: Defining Peers - PubMed Related Citations Algorithm [PMRA]
Traditionally, it has been very difficult to assign to individual scientists, or articles, a fixed address in “idea space,” but such a measure is critical in order to meaningfully assess the quality of peer environments at origin and potential destination institutions and thus the push and pull of match quality as a driver of moving.
This challenge is met here by the use of the PubMed Related Citations Algorithm [PMRA], a probabilistic, topic-based model for content similarity that underlies the “related articles” search feature in PubMed. This database feature is designed to help a typical user search through the literature by presenting a set of records topically related to any article returned by a PubMed search query.28 To assess the degree of intellectual similarity between any two PubMed records, PMRA relies crucially on MeSH keywords. MeSH is the National Library of Medicine's [NLM] controlled vocabulary thesaurus. It consists of sets of terms arranged in a hierarchical structure that permit searching at various levels of specificity. There are 27,149 descriptors in the 2013 MeSH edition. Almost every publication in PubMed is tagged with a set of MeSH terms (between 1 and 103 in the current edition of PubMed, with both the mean and median approximately equal to 11). NLM's professional indexers are trained to select indexing terms from MeSH according to a specific protocol, and consider each article in the context of the entire collection (Bachrach and Charen 1978; Neveol et al. 2010). What is key for our purposes is that the subjectivity inherent in any indexing task is confined to the MeSH term assignment process and does not involve the articles’ authors.29
Using the MeSH keywords as input, PMRA essentially defines a distance concept in idea space such that the proximity between a source article and any other PubMed-indexed publication can be assessed. The following paragraphs were extracted from a brief description of PMRA:
The neighbors of a document are those documents in the database that are the most similar to it. The similarity between documents is measured by the words they have in common, with some adjustment for document lengths. To carry out such a program, one must first define what a word is. For us, a word is basically an unbroken string of letters and numerals with at least one letter of the alphabet in it. Words end at hyphens, spaces, new lines, and punctuation. A list of 310 common, but uninformative, words (also known as stopwords) are eliminated from processing at this stage. Next, a limited amount of stemming of words is done, but no thesaurus is used in processing. Words from the abstract of a document are classified as text words. Words from titles are also classified as text words, but words from titles are added in a second time to give them a small advantage in the local weighting scheme. MeSH terms are placed in a third category, and a MeSH term with a subheading qualifier is entered twice, once without the qualifier and once with it. If a MeSH term is starred (indicating a major concept in a document), the star is ignored. These three categories of words (or phrases in the case of MeSH) comprise the representation of a document. No other fields, such as Author or Journal, enter into the calculations.
Having obtained the set of terms that represent each document, the next step is to recognize that not all words are of equal value. Each time a word is used, it is assigned a numerical weight. This numerical weight is based on information that the computer can obtain by automatic processing. Automatic processing is important because the number of different terms that have to be assigned weights is close to two million for this system. The weight or value of a term is dependent on three types of information: 1) the number of different documents in the database that contain the term; 2) the number of times the term occurs in a particular document; and 3) the number of term occurrences in the document. The first of these pieces of information is used to produce a number called the global weight of the term.
The global weight is used in weighting the term throughout the database. The second and third pieces of information pertain only to a particular document and are used to produce a number called the local weight of the term in that specific document. When a word occurs in two documents, its weight is computed as the product of the global weight times the two local weights (one pertaining to each of the documents). The global weight of a term is greater for the less frequent terms. This is reasonable because the presence of a term that occurred in most of the documents would really tell one very little about a document. On the other hand, a term that occurred in only 100 documents of one million would be very helpful in limiting the set of documents of interest. A word that occurred in only 10 documents is likely to be even more informative and will receive an even higher weight.
The local weight of a term is the measure of its importance in a particular document. Generally, the more frequent a term is within a document, the more important it is in representing the content of that document. However, this relationship is saturating, i.e., as the frequency continues to go up, the importance of the word increases less rapidly and finally comes to a finite limit. In addition, we do not want a longer document to be considered more important just because it is longer; therefore, a length correction is applied.
The similarity between two documents is computed by adding up the weights of all of the terms the two documents have in common. Once the similarity score of a document in relation to each of the other documents in the database has been computed, that document's neighbors are identified as the most similar (highest scoring) documents found. These closely related documents are pre-computed for each document in PubMed so that when one selects Related Articles, the system has only to retrieve this list. This enables a fast response time for such queries.30
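The description above does not give PMRA's exact weighting formulas, so the sketch below substitutes simple stand-ins that share the stated properties: a global weight that is larger for rarer terms (here the log of the inverse document frequency) and a local weight that saturates in term frequency and is corrected for document length. Stemming is omitted, the stopword list is a small illustrative subset, and the document layout (a dict with "title", "abstract", and optional "mesh" keys) is an assumption made for the example.

```python
import math
import re
from collections import Counter

# Illustrative subset; the quoted description uses a list of 310 stopwords.
STOPWORDS = {"a", "and", "in", "of", "the", "to"}


def words(text):
    """Unbroken runs of letters/numerals containing at least one letter."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens
            if any(ch.isalpha() for ch in t) and t not in STOPWORDS]


def term_counts(doc):
    """Abstract words once, title words twice, MeSH terms as whole phrases."""
    counts = Counter(words(doc["abstract"]))
    title_words = words(doc["title"])
    counts.update(title_words)
    counts.update(title_words)  # title words are added a second time
    for term in doc.get("mesh", []):
        term = term.lower()
        counts["mesh:" + term] += 1
        if "/" in term:  # qualified term: also enter it without the qualifier
            counts["mesh:" + term.split("/")[0]] += 1
    return counts


def pairwise_similarity(docs, k=1.0):
    """Sum, over shared terms, of global weight times the two local weights."""
    counts = {pmid: term_counts(doc) for pmid, doc in docs.items()}
    n_docs = len(counts)
    doc_freq = Counter(term for c in counts.values() for term in c)
    lengths = {pmid: sum(c.values()) for pmid, c in counts.items()}
    avg_len = sum(lengths.values()) / n_docs

    def local_weight(pmid, term):
        # Stand-in: saturating in term frequency, penalized for long documents.
        tf = counts[pmid][term]
        return tf / (tf + k * lengths[pmid] / avg_len)

    scores = {}
    ids = sorted(counts)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            shared = counts[a].keys() & counts[b].keys()
            scores[(a, b)] = sum(
                math.log(n_docs / doc_freq[t])  # stand-in global weight
                * local_weight(a, t) * local_weight(b, t)
                for t in shared
            )
    return scores
```

Under these stand-ins, a term that appears in every document receives a global weight of zero and so contributes nothing to any pair's score, which mirrors the intuition in the quoted passage.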
The algorithm uses a cut-off rule to determine the number of related citations associated with a given source article. First, the 100 most related records by similarity score are returned. Second, a reciprocity rule is applied to this list of 100 records: if Publication A is related to Publication B, Publication B must also be related to publication A. As a result, the set of related citations for a given source article may contain many more than 100 publications.
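One reading of this two-step rule, consistent with the observation that the final set can exceed 100 records, is that the top-100 lists are made symmetric: whenever A lists B, A is also added to B's list. The sketch below implements that reading. The input layout and function names are assumptions for illustration, not NLM's implementation.

```python
def top_related(scores, n=100):
    """Keep the n highest-scoring candidates for each source article.

    scores: hypothetical dict, source id -> {candidate id: similarity}
    """
    return {
        src: set(sorted(cands, key=cands.get, reverse=True)[:n])
        for src, cands in scores.items()
    }


def apply_reciprocity(related):
    """Make the relation symmetric: if A lists B, then B also lists A."""
    symmetric = {src: set(neighbors) for src, neighbors in related.items()}
    for src, neighbors in related.items():
        for other in neighbors:
            symmetric.setdefault(other, set()).add(src)
    return symmetric
```

An article that many others rank among their closest neighbors therefore accumulates a related set larger than the initial cut-off.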
Given our set of source articles, we delineate the scientific fields to which they belong by focusing on the set of articles returned by PMRA that satisfy three additional constraints: (i) they are original articles (as opposed to editorials, comments, reviews, etc.); (ii) they were published in or before 2006 (the end of our observation period); and (iii) they appear in journals indexed by the Web of Science (so that follow-on citation information can be collected).
To summarize, PMRA is a modern implementation of co-word analysis, a content analysis technique that uses patterns of co-occurrence of pairs of items (i.e., title words or phrases, or keywords) in a corpus of texts to identify the relationships between ideas within the subject areas presented in these texts (Callon et al. 1989; He 1999). One long-standing concern among practitioners of this technique has been the “indexer effect” (Whittaker 1989). Clustering algorithms such as PMRA assume that the scientific corpus has been correctly indexed. But what if the indexers who chose the keywords brought their own “conceptual baggage” to the indexing task, so that the pictures that emerge from this process are more akin to their conceptualization than to those of the scientists whose work it was intended to study?
Indexer effects could manifest themselves in three distinct ways. First, indexers may have available a lexicon of permitted keywords which is itself out of date. Second, there is an inevitable delay between the publication of an article and the appearance of an entry in PubMed. Third, indexers, in their efforts to be helpful to users of the database, may use combinations of keywords that reflect the conventional views of the field. The first two concerns are legitimate, but probably have only a limited impact on the accuracy of the relationships between articles that PMRA deems related. This is because the NLM continually revises and updates the MeSH vocabulary, precisely in an attempt to neutralize keyword vintage effects. Moreover, the time elapsed between an article's publication and the indexing task has shrunk dramatically, though time lag issues might have been a first-order challenge when MeSH was created, back in 1963. The last concern strikes us as being potentially more serious; a few studies have asked authors to validate ex post the quality of the keywords selected by independent indexers, with generally encouraging results (Law and Whittaker 1992). Inter-indexer reliability is also very high (Wilbur 1998).
Appendix D: OLS and Fixed Effect Results
In this Appendix, we reproduce all of our main results (Tables 4, 5, 7 and 8) using a linear probability (OLS) and fixed effects (where appropriate) framework as described in Section IV.
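For readers unfamiliar with the fixed effects variant, the sketch below shows the within transformation for a linear probability model with a single covariate: the move indicator and the covariate are demeaned within each scientist-spell, and the slope is estimated by OLS on the demeaned data. It is a simplified illustration with a hypothetical input layout. The specifications reported in the tables include many covariates, further fixed effects, and clustered standard errors, none of which are reproduced here.

```python
from collections import defaultdict


def within_slope(rows):
    """Fixed-effects (within) slope of a binary outcome on one covariate.

    rows: iterable of (spell_id, x, moved) tuples, with moved in {0, 1}
    """
    by_spell = defaultdict(list)
    for spell_id, x, moved in rows:
        by_spell[spell_id].append((x, moved))

    sxy = 0.0
    sxx = 0.0
    for obs in by_spell.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            # Demeaning removes anything constant within the spell.
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    if sxx == 0:
        raise ValueError("covariate does not vary within any spell")
    return sxy / sxx


# Made-up data: x = 1 when at least one child is in high school.
example = [("s1", 1, 0), ("s1", 0, 1), ("s2", 1, 0), ("s2", 0, 0)]
print(within_slope(example))
```

Because each spell's own mean is subtracted, only changes in the covariate within a spell identify the slope, which is why time-invariant characteristics drop out.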
Table D1.
Columns (1)-(4): Movers + Stayers [N=9,389]; column (5): Subsample w/Kid Info [N=2,960].
Variable | (1) | (2) | (3) | (4) | (5) |
---|---|---|---|---|---|
Demographics | | | | | |
Female | −0.001 (0.001) | −0.000 (0.001) | −0.000 (0.001) | −0.000 (0.001) | −0.004+ (0.002) |
PhD (MD omitted) | −0.001+ (0.001) | −0.004** (0.001) | −0.005** (0.001) | −0.005** (0.001) | −0.007** (0.002) |
MD/PhD | −0.001 (0.001) | −0.004** (0.002) | −0.004* (0.002) | −0.004* (0.002) | −0.004 (0.003) |
Productivity Measures | | | | | |
Ln(Pubs_t-1) | | −0.000 (0.001) | 0.000 (0.001) | 0.000 (0.001) | −0.000 (0.001) |
Ln(Stock Pubs_t-2) | | 0.003** (0.001) | 0.003** (0.001) | 0.003** (0.001) | 0.004** (0.001) |
Ln(NIH Funding_t-1) | | −0.001** (0.000) | −0.001** (0.000) | −0.001** (0.000) | −0.001** (0.000) |
Ln(Stock NIH Funding_t-2) | | 0.001** (0.000) | 0.000** (0.000) | 0.000** (0.000) | 0.000 (0.000) |
Collaborators | | | | | |
Ln(Pubs), Colocated | | | −0.001** (0.000) | −0.001** (0.000) | −0.001* (0.001) |
Ln(Pubs), Close | | | −0.002** (0.000) | −0.002** (0.000) | −0.001 (0.001) |
Ln(Pubs), Distant | | | 0.002** (0.000) | 0.002** (0.000) | 0.002** (0.001) |
Non-collaborating Peers | | | | | |
Ln(Pubs), Colocated | | | | −0.007** (0.000) | −0.007** (0.000) |
Ln(Pubs), Close | | | | −0.000 (0.000) | −0.000 (0.000) |
Ln(Pubs), Distant | | | | 0.005** (0.001) | 0.005** (0.001) |
Nb. of Observations | 210,772 | 190,226 | 190,184 | 190,184 | 66,774 |
Nb. of Job Spells | 10,273 | 10,273 | 10,273 | 10,273 | 3,316 |
Nb. of Scientists | 9,378 | 9,378 | 9,378 | 9,378 | 2,977 |
R² | 0.006 | 0.007 | 0.011 | 0.011 | 0.011 |
Notes: The dependent variable is a binary variable that takes on a value one in the year we observe the elite scientist moving to a new academic position located at least 50 miles away. Estimation is by Ordinary Least Squares (OLS). All specifications include full age, vintage category, and year fixed effects. Robust standard errors are in parentheses, clustered at the individual level. See Section III for a full description of how the sample and variables were constructed.
+ p < 0.10
* p < 0.05
** p < 0.01
Table D2.
Columns (1)-(4): Local Movers + Stayers [N=6,637]; column (5): Subsample w/Kid Info [N=2,025].
Variable | (1) | (2) | (3) | (4) | (5) |
---|---|---|---|---|---|
Demographics | | | | | |
Female | 0.003** (0.001) | 0.003** (0.001) | 0.003** (0.001) | 0.003** (0.001) | 0.002 (0.001) |
PhD (MD omitted) | −0.002** (0.000) | −0.002** (0.000) | −0.003** (0.000) | −0.003** (0.000) | −0.003** (0.001) |
MD/PhD | −0.002* (0.001) | −0.002* (0.001) | −0.002** (0.001) | −0.002** (0.001) | −0.002 (0.001) |
Productivity Measures | | | | | |
Ln(Pubs_t-1) | | −0.000 (0.000) | 0.000 (0.000) | 0.000 (0.000) | 0.001 (0.000) |
Ln(Stock Pubs_t-2) | | 0.000 (0.000) | 0.001+ (0.000) | 0.001+ (0.000) | 0.001* (0.001) |
Ln(NIH Funding_t-1) | | −0.000* (0.000) | −0.000* (0.000) | −0.000* (0.000) | −0.000* (0.000) |
Ln(Stock NIH Funding_t-2) | | 0.000 (0.000) | 0.000 (0.000) | 0.000 (0.000) | 0.000 (0.000) |
Collaborators | | | | | |
Ln(Pubs), Colocated | | | −0.001** (0.000) | −0.001** (0.000) | −0.001** (0.000) |
Ln(Pubs), Close | | | 0.002** (0.000) | 0.002** (0.000) | 0.002** (0.000) |
Ln(Pubs), Distant | | | −0.000 (0.000) | −0.000 (0.000) | −0.001* (0.000) |
Non-collaborating Peers | | | | | |
Ln(Pubs), Colocated | | | | −0.002** (0.000) | −0.002** (0.000) |
Ln(Pubs), Close | | | | 0.002** (0.000) | 0.002** (0.000) |
Ln(Pubs), Distant | | | | −0.000 (0.000) | −0.001 (0.001) |
Nb. of Observations | 172,868 | 159,430 | 159,392 | 159,392 | 55,609 |
Nb. of Job Transitions | 6,719 | 6,719 | 6,719 | 6,719 | 2,072 |
Nb. of Scientists | 6,635 | 6,635 | 6,635 | 6,635 | 2,037 |
R² | 0.002 | 0.002 | 0.003 | 0.003 | 0.004 |
Notes: The dependent variable is a binary variable that takes on a value one in the year we observe the elite scientist moving to a new academic position located within 50 miles. Distant movers (scientists moving more than 50 miles away) are excluded from this analysis. Estimation is by Ordinary Least Squares (OLS). All specifications include full age, vintage category, and year fixed effects. Robust standard errors are in parentheses, clustered at the individual level. See Section III for a full description of how the sample and variables were constructed.
+ p < 0.10
* p < 0.05
** p < 0.01
Table D3.
(1) | (2) | (3) | (4) | (5) | |
---|---|---|---|---|---|
A. OLS | |||||
Oldest kid 12 or 13 | 0.013** (0.003) | 0.013** (0.003) | |||
Youngest kid 18 or 19 | 0.010** (0.003) | 0.011** (0.003) | |||
Number of kids in high school | −0.006** (0.001) | ||||
At least one kid in high school | −0.011** (0.002) | ||||
B. OLS with Fixed Effects | |||||
Oldest kid 12 or 13 | 0.017** (0.003) | 0.018** (0.003) | |||
Youngest kid 18 or 19 | 0.009** (0.003) | 0.010** (0.003) | |||
Number of kids in high school | −0.007** (0.001) | ||||
At least one kid in high school | −0.010** (0.002) | ||||
Nb. of Observations | 66,774 | 66,774 | 66,774 | 66,774 | 66,774 |
Nb. of Job Transitions | 3,316 | 3,316 | 3,316 | 3,316 | 3,316 |
Nb. of Scientists | 2,977 | 2,977 | 2,977 | 2,977 | 2,977 |
Notes: The dependent variable is a binary variable that takes on a value one in the year we observe the elite scientist moving to a new academic position located at least 50 miles away. Estimation is by Ordinary Least Squares (OLS). All specifications include individual productivity and peer variables, as well as full age, vintage category, and year fixed effects. Robust standard errors are in parentheses, clustered at the individual level. Panel B regressions include scientist-spell fixed effects. See Section III for a full description of how the sample and variables were constructed.
+ p < 0.10
* p < 0.05
** p < 0.01
Table D4.
(1) | (2) | (3) | (4) | (5) | |
---|---|---|---|---|---|
A. OLS | |||||
Oldest kid 11, 12, or 13 | −0.000 (0.002) | −0.000 (0.002) | |||
Youngest kid 18, 19, or 20 | 0.001 (0.001) | 0.001 (0.001) | |||
Number of kids in high school | 0.001 (0.001) | ||||
At least one kid in high school | 0.001 (0.001) | ||||
B. OLS with Fixed Effects | |||||
Oldest kid 11, 12, or 13 | −0.001 (0.001) | −0.001 (0.001) | |||
Youngest kid 18, 19, or 20 | 0.001 (0.001) | 0.001 (0.001) | |||
Number of kids in high school | 0.000 (0.001) | ||||
At least one kid in high school | 0.001 (0.001) | ||||
Nb. of Observations | 55,609 | 55,609 | 55,609 | 55,609 | 55,609 |
Nb. of Job Transitions | 2,072 | 2,072 | 2,072 | 2,072 | 2,072 |
Nb. of Scientists | 2,037 | 2,037 | 2,037 | 2,037 | 2,037 |
Notes: The dependent variable is a binary variable that takes on a value one in the year we observe the elite scientist moving to a new academic position located within 50 miles. Distant movers (scientists moving more than 50 miles away) are excluded from this analysis. Estimation is by Ordinary Least Squares (OLS). All specifications include individual productivity and peer variables, as well as full age, vintage category, and year fixed effects. Panel B regressions include scientist-spell fixed effects. Robust standard errors are in parentheses, clustered at the individual level. See Section III for a full description of how the sample and variables were constructed.
+ p < 0.10
* p < 0.05
** p < 0.01
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
We are primarily focused on understanding the determinants of the timing of employer changes. Because we do not observe the set of academic institutions to which a scientist could have potentially moved, our results speak to the preferences and constraints that shape the decision to leave one's current institution. We cannot say much about the choice of a specific destination.
Our framework and empirical analysis are focused on domestic moves between US institutions. While moves to and from foreign institutions would be interesting to study, they entail being subject to a different incentive system (in particular for funding) and would require data that we do not have (peer and funding measures). However, we acknowledge that there is an emerging and interesting related literature that focuses on international mobility. These studies suggest that the most productive, or most motivated, are the ones who leave for the US from Europe (Van Bouwel, L., Lykogianni, E. and Veugelers, R., 2011) and the former USSR (Ganguli 2015). These studies also point to drivers of international mobility such as better access to funding and resources, collaboration opportunities, and career opportunities (Enders and Mugabushaka, 2004). They also suggest that previous social ties through collaborators or other colleagues abroad can lead to emigration (Ayari-Gharbi, Besson, and Mamlouk, 2014).
We also cross-reference our list of stars with alternative measures of scientific eminence. For example, the elite subsample contains every U.S.-based Nobel Prize winner in Medicine and Physiology since 1975, and a plurality of the Nobel Prize winners in Chemistry over the same time period.
The moves that occur before the start of the independent career stage are different, as the main goal is to receive training, and the costs of mobility are much lower, since they do not entail relocating equipment or other team members (though we recognize that they could entail relocating one's spouse or young children).
More details on the assignment of publications to scientists can be found in Appendix B.
The factors influencing mobility decisions abroad are likely very different from those for domestic moves. At a practical level, some of our key measures of productivity (NIH funding and peer measures when the origin is a foreign institution) are not applicable for foreign movers. The scientists moving to foreign institutions are significantly different in terms of degree type and age, with foreign movers more likely to be PhDs (rather than MDs) and tending to be younger. They also are likely to have fewer JIF-weighted publications and less NIH funding, which is understandable given that NIH funding is US-specific. However, they are no more likely to be women and have a similar number of publications. Including these foreign movers does not affect our main results.
We do not focus on transitions from or to industry as they make up a very small share of transitions (see Appendix A Table A1), although clearly these transitions are likely to be quite different from those between academic institutions. We note that the biopharmaceutical industry also employs researchers engaged in the same type of activities, and with similar training and credentials. Some of the notable differences between the two settings include soft money contracts, the ability to take projects, team members and even equipment, freedom in choice of collaborators, personnel policies, and importance of patenting.
Note that even within this highly selected sample, there remains substantial heterogeneity in achievement. For example, skewness is a feature of the distribution of career NIH funding and career publications even within our sample.
We have run the analysis using transitions that are 100 miles apart, and our results are unchanged.
PublicationHarvester and S/CGen are publicly available and can be found, along with user manuals, at http://www.stellman-greene.com/PublicationHarvester/ and http://www.stellman-greene.com/ScientificDistance/, respectively.
The roster is an annual census of all U.S. medical school faculty, where each faculty member is linked across yearly cross-sections by a unique identifier. We have licensed access to the AAMC data for the years 1975 through 2006.
See the online appendix from Azoulay et al. (2010) for details on the matching procedure, preventing inclusion of spurious coauthors, and the approach to addressing measurement error when tallying the publication output of coauthors with common names.
The National Library of Medicine's explicit statement of purpose for these MeSH terms is to “...provide a reproducible partition of concepts relevant to biomedicine for the purpose of organizing knowledge and information.”
To facilitate the harvesting of PubMed-related records on a large scale, we have developed an open-source software tool that queries PubMed and PMRA and stores the retrieved data in a MySQL database. The software is available for download at http://www.stellman-greene.com/FindRelated/.
The sample for whom we have information includes scientists who are known to have no children. There are 61 such scientists or approximately 2% of the sample of scientists for whom we have child information. Thus, the set of individuals who definitively have zero children is too small to analyze directly. Should more data on this sample become available, this would be an interesting area for future research.
Appendix D replicates this analysis using a standard ordinary least squares (OLS) framework. One advantage of the OLS framework is that we are able to include scientist-spell fixed effects, which allows us to isolate the effects of within-spell changes in covariates on the likelihood of moving, while controlling for all time-invariant characteristics of scientists (see discussion in Wooldridge, 2010). It is noteworthy that all of our results are very similar under this linear specification, with or without fixed effects.
The most itinerant star in our sample has 4 unique job spells during our study period.
This convention is adopted since most academic moves occur during the summer. Moreover, this ensures that our measure of the timing of professional transitions corresponds to the timing of the school year for children in our sample.
We have also estimated these regressions for local movers, and there do not appear to be notable differences in the role of these factors for upward vs. downward mobility.
Recall that we focus on intellectual relatedness to define peers. Therefore, it is entirely possible for an institution in our dataset to be more prestigious than one's current employer, while simultaneously offering a poorer peer environment because of the number and/or quality of the peers in the focal scientist's area of research.
Appendix D repeats the analysis corresponding to Tables 4-8 using a linear probability (OLS) and fixed effects (where appropriate) framework. Reassuringly, we find that all key results are largely unchanged.
It is also noteworthy that all of our core results on the role of professional and family determinants are largely unchanged when we limit our analysis to the sample of first move of any given scientist.
We perform a similar exercise for scientists employed by the intramural campus of the NIH. These scientists are not eligible to receive extramural funds, but the NIH keeps records of the number of “internal projects” each intramural scientist leads. We include in the elite sample the top five percentiles of intramural scientists according to this metric.
The relevant scientific fields in the life sciences are microbiology, biochemistry, psychiatry/psychology, neuroscience, molecular biology & genetics, immunology, pharmacology, and clinical medicine.
See Azoulay et al. (2011) for more details and an evaluation of this program.
(((“waldman sa”[au] NOT (ether OR anesthesia)) OR (“waldman s”[au] AND (murad OR philadelphia[ad] OR west point[ad] OR wong p[au] OR lasseter kc[au] OR colorectal))) AND 1980:2013[dp])
Lin and Wilbur (2007) report that one fifth of “non-trivial” browser sessions in PubMed involve at least one invocation of PMRA.
This is a slight exaggeration: PMRA also makes use of title and abstract words to determine the proximity of any two pairs of articles in the intellectual space. These inputs are obviously selected by authors, rather than by NLM staff. However, neither the choice of MeSH keywords nor the algorithm depends on cited references contained in publications.
Available at http://ii.nlm.nih.gov/MTI/related.shtml
Contributor Information
Pierre Azoulay, MIT Sloan School of Management and NBER, 100 Main Street, E62-487, Cambridge, MA 02142, pazoulay@mit.edu.
Ina Ganguli, University of Massachusetts Amherst, 200 Hicks Way, Thompson Hall 904, Amherst, MA 01003.
Joshua Graff Zivin, University of California San Diego and NBER, 9500 Gilman Drive, MC 0519, La Jolla, CA 92093-0519, jgraffzivin@ucsd.edu.
References
- Agrawal A, Cockburn I, McHale J. Gone But Not Forgotten: Labor Flows, Knowledge Spillovers and Enduring Social Capital. Journal of Economic Geography. 2006;6(5):571–591.
- Agrawal A, McHale J, Oettl A. Why stars matter. 2014. NBER Working Paper No. 20012.
- Ahlin L, Ejermo O. The patent productivity effects of mobility for a panel of Swedish inventors. DRUID; 2015.
- Allison PD. Discrete-Time Methods for the Analysis of Event Histories. In: Leinhardt S, editor. Sociological Methodology. Jossey-Bass; San Francisco: 1982. pp. 61–98.
- Allison PD, Long JS. Interuniversity mobility of academic scientists. American Sociological Review. 1987:643–652.
- Ayari-Gharbi Asma, Besson Dominique, Mamlouk Zeineb Ben Ammar. Management of individual expatriation: Case of the academics expatriates in France. International Journal of Innovation and Applied Studies. 2014;9(1):53–59.
- Azoulay P, Fons-Rosen C, Graff Zivin J. Does Science Advance One Funeral at a Time? 2015. NBER Working Paper No. 21788. doi: 10.1257/aer.20161574.
- Azoulay P, Graff Zivin JS, Sampat BN. The Diffusion of Scientific Knowledge across Time and Space: Evidence from Professional Transitions for the Superstars of Medicine. In: The Rate and Direction of Inventive Activity Revisited. University of Chicago Press; 2011. pp. 107–155.
- Azoulay P, Liu C, Stuart T. Social Influence Given (Partially) Deliberate Matching: Career Imprints in the Creation of Academic Entrepreneurs. American Journal of Sociology. 2016.
- Azoulay P, Stellman A, Graff Zivin J. PublicationHarvester: An Open-Source Software Tool for Science Policy Research. Research Policy. 2006;35(7):970–974.
- Azoulay P, Graff Zivin J, Wang J. Superstar Extinction. The Quarterly Journal of Economics. 2010;125(2):549–589.
- Bäker A. Non-tenured post-doctoral researchers’ job mobility and research output: An analysis of the role of research discipline, department size, and coauthors. Research Policy. 2015;44(3):634–650.
- Bercovitz J, Feldman M. The mechanisms of collaboration in inventive teams: Composition, social networks, and geography. Research Policy. 2011;40(1):81–93.
- Bernstein R. Managing a Lab Move. Science Careers. 2014 Sep 23.
- Boudreau K, Brady T, Ganguli I, Gaule P, Guinan E, Hollenberg T, Lakhani KR. A field experiment on search costs and the formation of scientific collaborations. 2014. Working Paper, SSRN 2486068. doi: 10.1162/rest_a_00676.
- Bowles S. Migration as investment: Empirical tests of the human investment approach to geographical mobility. The Review of Economics and Statistics. 1970:356–362.
- Caplow T, McGee RJ. The academic marketplace. Transaction Publishers; 1958.
- Catalini C. Microgeography and the direction of inventive activity. 2015. Rotman School of Management Working Paper 2126890.
- Ceci SJ, Ginther DK, Kahn S, Williams WM. Women in Academic Science: A Changing Landscape. Psychological Science in the Public Interest. 2014;15(3):75–141. doi: 10.1177/1529100614541236.
- Chetty R, Hendren N, Katz LF. The effects of exposure to better neighborhoods on children: New evidence from the Moving to Opportunity experiment. The American Economic Review. 2016;106(4):855–902. doi: 10.1257/aer.20150572.
- Cotgrove S. The sociology of science and technology. The British Journal of Sociology. 1970;21(1):1–15.
- Coupé T, Smeets V, Warzynski F. Incentives, sorting and productivity along the career: Evidence from a sample of top economists. Journal of Law, Economics, and Organization. 2006;22(1):137–167.
- Crane D. Scientists at major and minor universities: A study of productivity and recognition. American Sociological Review. 1965:699–714.
- Crespi GA, Geuna A, Nesta L. The mobility of university inventors in Europe. The Journal of Technology Transfer. 2007;32(3):195–215.
- Dahl MS, Sorenson O. Home sweet home: Entrepreneurs' location choices and the performance of their ventures. Management Science. 2012;58(6):1059–1071.
- Dahl MS, Sorenson O. The migration of technical workers. Journal of Urban Economics. 2010;67(1):33–45.
- Enders J, Mugabushaka A. Wissenschaft und Karriere: Erfahrungen und Werdegänge ehemaliger Stipendiaten der DFG. Forschungsgemeinschaft; Bonn: 2004.
- Epstein JA. Enhancing discovery and saving money with MERIT. The Journal of Clinical Investigation. 2011;121(4):1226. doi: 10.1172/JCI57708.
- Fernández-Zubieta A, Geuna A, Lawson C. Productivity pay-offs from academic mobility: should I stay or should I go? Industrial and Corporate Change. 2016;25(1):91–114.
- Foster JG, Rzhetsky A, Evans JA. Tradition and Innovation in Scientists’ Research Strategies. American Sociological Review. 2015;80(5):875–908.
- Fowler PJ, Henry DB, Marcal KE. Family and housing instability: Longitudinal impact on adolescent emotional and behavioral well-being. Social Science Research. 2015;53:364–374. doi: 10.1016/j.ssresearch.2015.06.012.
- Fowler PJ, Henry DB, Schoeny M, Taylor J, Chavira D. Developmental timing of housing mobility: Longitudinal effects on externalizing behaviors among at-risk youth. Journal of the American Academy of Child & Adolescent Psychiatry. 2014;53(2):199–208. doi: 10.1016/j.jaac.2013.12.003.
- Ganguli I. Who Leaves and Who Stays? Evidence on Immigrant Selection from the Collapse of Soviet Science. In: Geuna A, editor. Global Mobility of Research Scientists: The Economics of Who Goes Where and Why. Elsevier; 2015a.
- Ganguli I. Immigration and Ideas: What Did Russian Scientists “Bring” to the United States? Journal of Labor Economics. 2015b;33(S1):S257–S288.
- Glaeser EL. Learning in cities. Journal of Urban Economics. 1999;46(2):254–277.
- Hoisl K. Tracing mobile inventors—the causality between inventor mobility and inventor productivity. Research Policy. 2007;36(5):619–636.
- Jaffe AB, Trajtenberg M. Patents, citations, and innovations: A window on the knowledge economy. MIT Press; 2002.
- Lenzi C. Patterns and determinants of skilled workers’ mobility: evidence from a survey of Italian inventors. Economics of Innovation and New Technology. 2009;18(2):161–179.
- Lehmann S, Jackson AD, Lautrup BE. Measures for measures. Nature. 2006;444(7122):1003–1004. doi: 10.1038/4441003a.
- Merton RK. Priorities in scientific discovery: a chapter in the sociology of science. American Sociological Review. 1957;22(6):635–659.
- Myers MH, Hankey BF, Mantel N. A Logistic Exponential Model for Use with Response-Time Data Involving Regressor Variables. Biometrics. 1973;29:257–69.
- Roach M, Sauermann H. A taste for science? PhD scientists’ academic orientation and self-selection into research careers in industry. Research Policy. 2010;39(3):422–434.
- Shauman KA, Xie Y. Geographic mobility of scientists: Sex differences and family constraints. Demography. 1996;33(4):455–468.
- Spence M. Job market signaling. The Quarterly Journal of Economics. 1973:355–374.
- Stern S. Do scientists pay to be scientists? Management Science. 2004;50(6):835–853.
- Van Bouwel L, Lykogianni E, Veugelers R. Destination choices of mobile European researchers: Europe versus North America. 2011. Available at SSRN 2105573.
- Wooldridge JM. Econometric Analysis of Cross Section and Panel Data. MIT Press; 2010.
- Wuchty S, Jones B, Uzzi B. The increasing dominance of teams in the production of knowledge. Science. 2007;316:1036. doi: 10.1126/science.1136099.
- Zucker LG, Darby MR, Torero M. Labor Mobility from Academe to Commerce. Journal of Labor Economics. 2002;20(3):629–660.
- Zuckerman H. Scientific elite: Nobel laureates in the United States. Transaction Publishers; 1977.