. 2023 Apr 26;42(3):40. doi: 10.1007/s11113-023-09782-2

Spatial Patterns and Determinants of Inter-county Migration in California: A Multilevel Gravity Model Approach

Anqi Xu
PMCID: PMC10132804  PMID: 37128246

Abstract

Understanding migration patterns and their determinants is crucial for population estimation and resource allocation for policymakers. Utilizing residential mobility data collected by the Department of Motor Vehicles, the present study provides a spatiotemporal analysis of inter-county migration in California for the period 2014–2021. We use multilevel gravity models to address the hierarchical nature of migration data and the effects of migration flows sharing common origins, destinations, and regions, providing a substantively complete examination of push and pull forces affecting migration. Our findings show that populous counties in Southern California and the San Francisco Bay Area represent the largest origins and destinations, despite a systemic decline in intra-state migration. Migration is strongly associated with population size, geographic proximity (i.e., distance and contiguity), job availability, and industrial composition similarity between origins and destinations. Our findings also highlight the contribution of shared origins, destinations, and regions in explaining the systematic variation of migration flows. Counties vary more in the number of migrants they attract than the number they send. The proposed multilevel modeling approach is useful in identifying place-specific influences on migration and in improving estimation accuracy.

Supplementary Information

The online version contains supplementary material available at 10.1007/s11113-023-09782-2.

Keywords: Determinants of migration, Multilevel model, Gravity model, Spatial patterns

Introduction

Carefully measuring internal migration is important both for policy purposes and for an improved understanding of economic development and structural transformation. In the United States, migration is a crucial component in population estimation, and it plays a significant role in determining the appropriate distribution of state subvention funds to local governments (Brettell & Hollifield, 2014); inaccurate information will produce an inequitable distribution of funds. Likewise, knowledge of migration contributes to population projection, which is frequently used by local, state, and federal agencies for program administration and planning.

With approximately 39 million residents in 2021, California is the most populous state in the US and makes an important case study for internal migration. The share of inter-county migrants among the total population in California far exceeds the national average (Molloy et al., 2011). Given the declining birth rate (Kearney et al., 2022) and immigration slowdown (Norlander & Sørensen, 2018) across North America, governments are paying increasing attention to internal migration as a source for fueling growth, fostering innovation, and bringing in new revenue (Larrison & Raadschelders, 2019). Recent years have witnessed the state’s growing budget for census-related programs to ensure estimation accuracy regarding migration and to facilitate fair resource distribution (Murphy & Danielson, 2018); examining patterns and causes of migration informs the decision-making processes of government officials and other program planners.

One of the biggest challenges in measuring inter-county migration is data availability. Conventional measures such as Census estimates are often released with a multi-year lag and suffer from poor reliability when it comes to small and remote counties. This limits the ability of local governments to analyze migration patterns. This study utilizes the much-underused Department of Motor Vehicles (DMV) data, which track individuals’ changes of address during driver’s license or ID updates. DMV data have so far been utilized in transportation safety and market research (Vorona et al., 2011; Yavorsky et al., 2021), but rarely in migration studies (Gabriel & Mattey, 1996). This paper contributes to current literature on the application of DMV data in examining migration trends and determinants.

Driving forces of migration have been investigated extensively, with the gravity model being widely used to analyze how attributes of origins and destinations affect residential mobility (Poot et al., 2016). Yet many studies tend to focus on one level, failing to understand migration in a realistic multilevel system. Migration flows can be connected when they share common origins, destinations, or regions; place-specific context influences migration over and above factors such as population and distance (Matthews & Parker, 2013). For example, when a local government introduces a policy to attract migrants, the resulting migration inflows are likely to correlate with one another and share common elements that are centered around that policy. Region-specific context is also found to affect migration in addition to micro-level influences (Thomas et al., 2015; Williams et al., 2018). The multilevel dynamics of migration require researchers to explicitly consider the connectivity or the interdependency between flows to ensure outcome robustness. Examining whether there is more variation in migration flows within origins, destinations, and at a higher region level is informative in its own right. Failure to address the hierarchical nature of the migration data may otherwise lead to biased and inconsistent results.

Previous research has examined migration between rural and urban areas, between metropolitan and non-metropolitan areas, between states, and between counties (Ambinakudige & Parisi, 2017; Golding & Winkler, 2020; Hyatt et al., 2018; Rayer & Brown, 2001). Fewer explore intra-state migration flows, which comprise a considerable portion of domestic migration. In California, intra-state migration has substantially transformed the state’s demographic landscape in historic terms (Schafran & Wegmann, 2012). This study provides a comprehensive overview of spatial patterns and determinants of inter-county migration in California; the method and analytical strategy can also be broadly applied to migration studies of other places.

The aim of our analysis is twofold. First, we investigate current trends and spatial patterns of county-to-county migration in California utilizing rich Department of Motor Vehicles (DMV) residential mobility data. We then fit multilevel gravity models to examine push and pull factors affecting migration. Our multilevel gravity model approach provides a statistically robust analysis addressing the flow dependence in migration data. Importantly, it illustrates the contribution of origins, destinations, and regions in explaining the variation in migration flows. This component is not usually considered in previous migration literature, yet it is meaningful for cases where place-specific influences strongly shape migration. Our work offers scholars and policymakers a useful perspective for cases where migration is explained not only by single-level attributes but also by the influence of higher-level groups in which observations are nested (e.g., shared origins/destinations/regions), and ultimately advances a systemic understanding of internal migration.

In this analysis, the term “internal migration” refers to the movement of people between usual residences within state boundaries. It is used interchangeably with “intra-state migration”, encompassing both inter-county and inter-region migration within California. “Migration flows” consist of migration counts, which are unique people entering or leaving a given area during the study period as recorded by DMV.

Background: The Gravity Model and Multilevel Modeling

The gravity model has been adopted to explain flows between geographically separated regions including international trade (Van Bergeijk & Brakman, 2010), traffic flows (Zhang & Zhang, 2016), and human migration (Lamonica & Zagaglia, 2013). The model views migration flows as proportional to the product of the “mass” of each region, such as its population and economic power, and inversely proportional to the geographic distance between them (Anderson, 2011; Lewer & Van den Berg, 2008). In other words, the number of people moving between locations is a function of attributes of the locations, including population size, and the physical or socioeconomic distance between places.

There are two major advantages of using the gravity model to study migration. One is that the model incorporates attributes of the origin and the destination, accounting for both push and pull forces in migratory decision-making. The model can easily be augmented to accommodate different additional controls and policy variables. Recent scholarship has suggested the inclusion of variables such as age structure (Garcia et al., 2015; Plane & Heins, 2003), income (Wong & Celbis, 2019), unemployment rate (Karemera et al., 2000), industrial composition (Santana-Gallego et al., 2016), spatial contiguity (Van Lottum & Marks, 2012), and environmental pollution (Backhaus et al., 2015). Several other neoclassical migration models also consider the variation in traits across regions: The Harris-Todaro model explains migration decisions as an outcome of expected income differentials minus migration costs (Busso et al., 2021). The intervening opportunity model argues that the volume of migration has more to do with opportunities in each location than with distance and total population (Akkoyunlu, 2012). The radiation model, similar to the intervening opportunity model, considers whether there are neighboring regions with a significant pull (Alis et al., 2021). Despite theoretical links between these models (Almeida & Gonçalves, 2001; Hong et al., 2019), the gravity model gains more prominence in recent literature due to (1) the growing availability of dyadic flow data, (2) the model’s straightforward integration within a population projection modeling framework, and (3) the ability to meet requirements of spatial econometric interaction models (Poot et al., 2016). So far, most studies show that empirical findings are highly consistent with the gravity model.

The gravity model confronts several critiques and challenges. Poot et al. (2016) discuss the model’s limited accuracy in estimating migration flows of small areas compared to microsimulation. Beyer et al. (2022) point out that the gravity model may represent statistical artifacts rather than true mechanisms of migration and thus would be inadequate in predicting migration trajectories. Importantly, the gravity model also needs to grapple with a network perspective that emphasizes the social and network costs of migration. That is, people’s socioeconomic success is often tied to places through connections with others, and they may not relocate even though there seem to be ample opportunities/attractions in potential destinations. Varner and Young's (2012) and Young et al.'s (2016) work on elite migration offers an example in which wealthy migrants’ socioeconomic status is mostly endogenous to their origins. As a result, migration can be insensitive to factors such as income tax differentials in the gravity model. Koser Akcapar's (2010) work adds to this viewpoint by demonstrating the role of social networks in the transit area during long-distance migration of refugees. The concept of network embeddedness, both personal and structural, challenges the conventional gravity model’s ability to address place-specific effects and its oversimplified assumptions about the effect of economic push and pull forces in motivating migration.

Previous research has extensively used multilevel models (MLMs) or linear mixed modeling to account for place-specific influences (Détang‐Dessendre et al., 2008; Swain & Garasky, 2007). MLMs assume that observations in the same group are likely to be more correlated than observations from different groups because they share the same environment. This approach is particularly suitable for migration datasets that are often highly structured, organized at more than one level, or contain repeated measures (Rogers et al., 2002). Zhang et al. (2020) note that the traditional gravity model of migration inevitably exhibits some degree of dependency: Each observation provides an origin, a destination, the number of people moving between them, and attributes of the origin and the destination that may explain the flow. In the regression, these attributes will be repeated, creating group dependencies and violating the assumption of observation independence in OLS. A straightforward solution is to adopt a multilevel approach that recognizes the existence of nested data structures.

Incorporating MLMs with the gravity model provides a substantively complete picture of migration because MLMs identify places with an extraordinarily high or low origin, destination, or regional effect. Specifically, MLMs partition the variance in migration into the percentage attributable to between-group effects and within-group effects. The former reveals the effects of a level-2 group variable (i.e., the effects of shared origin/destination), while the latter reveals the remaining effects at level 1 within level-2 groups (i.e., the residual effects of each origin/destination county) after controlling for between-group effects. This step helps capture the influence of place-based determinants such as local policy. Using the UK labor mobility data and cross-classified multilevel modeling, Zhan et al. (2021) demonstrate that international graduates’ county of origin and university of study have prominent roles in predicting their employment destinations, in addition to the influence of micro-level characteristics (e.g., age, gender). They show that approximately 18% of graduates’ labor mobility outcomes can be attributed to the origin county and 10% to the university of study. They also find that graduates attending universities located within London are 77% more likely to stay in the UK to work. Similarly, Keogh (2013) analyzes the driving forces of asylum migration flows across the European Union and uses a dynamic coefficient random effect gravity model to capture country-specific variance. Zhang et al. (2020) utilize a multilevel gravity model to account for the interconnection between inter-provincial urban migration flows in China: between those sharing common origins and destinations, and where there is a reciprocal flow between places. They find that there is greater variation in the number of migrants received by provinces than there is in the number sent. Overall, these studies demonstrate the effectiveness of multilevel gravity models in investigating group effects and unpacking complex migration issues.

In this study, we are interested in three types of group effects in migration flows, which arise from (1) flows sharing a common origin (i.e., the origin effect), (2) flows sharing a common destination (i.e., the destination effect), and (3) flows sharing the same interregional or intraregional linkage (i.e., the region effect). The origin and destination effects reflect variation between counties in the number of migrants they systemically export and attract; the region effect reflects variation between regional linkages in the number of migrants they systemically motivate or mitigate.

Data

The availability of data at the county level determines the period and the spatial scale of this study. While previous scholarship has used metropolitan areas to analyze internal migration because they represent the labor market better than counties (Saks & Wozniak, 2011; Zabel, 2012), we argue there are practical and political reasons to conduct county-level analysis. State governments frequently use counties to conduct population estimation and forecasting; numerous state funds are distributed to counties based on their estimated population and migration level. Overall, our panel data consist of migration flows between all 58 counties of California and their attributes between fiscal year (FY) 2014 and 2021. A summary of variables used is given in Table 1 and specific data sources are in Online Appendix Table A. Each variable except dummy variables is log-transformed in the model but is shown in its original scale in Table 1.

Table 1. Descriptive statistics for variables in the gravity model

| Variable | Level | Observations | Mean | Std. dev. | Min | Max |
| --- | --- | --- | --- | --- | --- | --- |
| Dependent variable | | | | | | |
| Inter-county migration flow | Flow | 26,448 | 252.3 | 1389.7 | 0.0 | 36,364.0 |
| Independent variables | | | | | | |
| Total population | County | 464 | 674,903.5 | 1,459,233.0 | 1146.0 | 10,192,593.0 |
| Euclidean distance (km) | Flow pair | 3306 | 440.8 | 285.2 | 34.6 | 1,552.1 |
| Contiguity (1 contiguous, 0 otherwise) | Flow pair | 3306 | 0.1 | 0.3 | 0.0 | 1.0 |
| Median age | County | 464 | 39.9 | 6.5 | 29.4 | 55.8 |
| Median household income | County | 464 | 75,199.9 | 22,639.5 | 35,270.0 | 145,679.6 |
| Unemployment rate (%) | County | 464 | 7.5 | 3.6 | 2.1 | 25.3 |
| Industry composition similarity | Flow pair | 26,448 | 0.9 | 0.1 | 0.1 | 1.0 |
| Wildfire frequency | County | 464 | 6.9 | 9.2 | 0.0 | 64.0 |

Correlation coefficients among independent variables are below 0.52, indicating that multicollinearity should not be a concern.

Dependent Variable

The migration data used in this analysis are restricted-use change-of-address data for residents aged 16 and over for FY 2014–2021 from the California Department of Motor Vehicles (DMV).1 They consist of 26,448 migration flows between 58 counties of California during the 8-year study period. The average number of individuals moving between any two counties is 252 in our dataset (Table 1). According to the California Vehicle Code, residents must notify the DMV of an address change for a driver’s license or ID update within 10 days of the move, or they are subject to a fine and/or criminal charges. The Real ID Act requires that US residents be Real ID compliant to board domestic flights and access federal facilities such as voting centers and courthouses. Other common uses of a driver’s license or ID include banking, applying for government benefits, and employment. Having accurate information on one’s driver’s license and/or ID is important for Americans to avoid disruptions in their everyday lives.

DMV data are a reliable source for studying temporal shifts in inter-county migration because they are actual mobility data that record origin and destination locations. Other conventional data sources such as Census data products, while intended to cover the entire population, suffer from a great degree of uncertainty in estimating migration of smaller counties. This is reflected by the frequent presence of zero-value migration flows among remote counties due to their limited sample size. According to American Community Survey (ACS) 5-year estimates, approximately 15% of inter-county migration flows in California during the study period are estimated as zero. Excessive zeros in the dependent variable can be problematic for the gravity model because the model uses the log-linear specification and the logarithm of zero is undefined. As a result, potentially useful information will be discarded from the regression, introducing a high risk of biased results (Burger et al., 2009; Westerlund & Wilhelmsson, 2011). Although solutions have been proposed to handle zero values (e.g., using a Poisson estimator), scholarship warns of their limitations (Ramos, 2013). The low reliability of ACS migration data is also indicated by high margins of error, with over 40% of inter-county migration flows in California having a margin of error larger than the estimate itself. DMV data provide comparatively accurate information on migration among remote and small counties and have few occurrences of zero-value migration flows.
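The zero-flow problem described above can be illustrated with a minimal sketch. The flow counts are hypothetical and are not drawn from ACS or DMV data; the point is only that a log-linear specification cannot use zero-valued flows, so they drop out of the estimation sample.

```python
import math

# Hypothetical migration counts for five county pairs (illustrative only).
flows = [120, 0, 35, 0, 8]

# The log-linear gravity model needs ln(flow), which is undefined at zero,
# so zero-valued flows cannot enter the regression.
usable = [f for f in flows if f > 0]
log_flows = [math.log(f) for f in usable]

# Share of observations lost to the log transformation.
share_dropped = 1 - len(usable) / len(flows)
print(f"{share_dropped:.0%} of flows dropped")
```

With a source in which roughly 15% of flows are estimated as zero, a comparable share of observations would be lost in this way.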

Another strength of DMV data is that they are released on monthly and annual bases, much sooner than conventional Census-based measures that often come out with at least a one-year lag. This allows researchers and policymakers to quickly gather information and examine the demographic implications of rapid and slow-onset shocks at different time intervals. Areas that would particularly benefit from this feature include migratory responses to pandemics and natural disasters.

DMV data also improve on the known limitations of other migration data sources such as moving van data (Conway & Rork, 2022) and the Consumer Credit Panel data (DeWaard et al., 2019). Unlike moving van data, which have proven highly unreliable in forecasting, DMV data-based population projections are reasonably accurate and comparable to Census results (California State Auditor, 2022). Consumer Credit Panel data, despite their large sample size and timely updates, have accessibility issues given their proprietary nature, whereas the DMV as a government agency has made the data readily available for policymakers.

DMV datasets should be interpreted with caution due to several weaknesses. The DMV data collection excludes populations under age 16, and people who are less likely to possess a driver's license or ID (e.g., low-income population, unsheltered population, immigrants), making the data inadequate in tracking the mobility of some subgroups. There can also be considerable geographic heterogeneity in driver’s license holding—people living in urbanized and transit-rich counties are associated with a lower likelihood of driver’s license ownership (Lavieri et al., 2017). Using Census statistics as the benchmark, the authors’ calculation suggests a 16–26 percent undercount of total migration in the DMV data depending on the area. This undercount may inflate the regression coefficients in this paper. An additional disadvantage is inherent to the self-report nature of DMV data. Individuals may postpone the address update for various reasons despite the state’s 10-day rule. This affects the estimation accuracy of monthly- or quarterly-based migration analysis but should have less of an impact on the outcome of this paper, which uses yearly observations.

Independent Variables

As suggested by the gravity theory of migration, the primary independent variables include total population and distance between origins and destinations. Under the California Department of Finance, the Demographic Research Unit (DRU) is designated as the single official source of demographic data for state planning and is our data source for the total population. The distance between the origin and the destination is defined as the Euclidean distance between their spatial centers, which is measured in the ArcGIS software.

We assume mobility decisions are also influenced by the contiguity of counties, age structure, income level, job availability, industrial composition, climate, and time-specific events such as the COVID-19 pandemic and therefore incorporate these controls in our estimations. Median age is indicative of age structure and is obtained from the DRU estimates. Median household income is obtained from ACS 5-year estimates and adjusted for inflation using the Consumer Price Index. On average, the median household income in our sample is $75,199. The unemployment rate reflects the availability of jobs and is obtained from the US Bureau of Labor Statistics. To control for the effects of industrial homophily, we construct the cosine similarity index between the industry composition of the origin and the destination, based on employment sizes of industries defined at the 2-digit North American Industry Classification System (NAICS) level (e.g., agriculture, manufacturing, professional services) from the County Business Patterns datasets. The rationale behind this is to capture the phenomenon wherein workers with a certain skill set may be drawn to areas with larger and more diverse employment sectors (Liu et al., 2019).
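As an illustration of the cosine similarity index, the following minimal sketch compares employment vectors for a pair of counties. The three-sector employment counts are hypothetical and are not taken from County Business Patterns; the function name is likewise illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two employment-by-industry vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must cover the same set of industries")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("composition is undefined when employment is zero")
    return dot / (norm_a * norm_b)

# Hypothetical employment counts in three sectors (illustrative only).
origin = [1200, 300, 4500]
destination = [2400, 600, 9000]   # same industry mix at twice the scale
other = [5000, 100, 200]          # concentrated in a different sector

sim_same = cosine_similarity(origin, destination)
sim_diff = cosine_similarity(origin, other)
```

Because the index depends only on the direction of the employment vectors, two counties with the same industry mix score 1 regardless of their size, which keeps the measure distinct from the population terms in the model.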

The wildfire problem along the US West Coast has been increasingly cited as a motivation for migration, despite mixed results from empirical analysis (Sharygin, 2021; Winkler & Rouleau, 2021). We include wildfire frequency using the geocoded historical wildfire event data from the California Department of Forestry and Fire Protection. A wildfire burning across county boundaries is counted in each affected county.

Additionally, the COVID-19 pandemic increases the possibility of working remotely and may affect residential mobility. We include the time-fixed effect to control for the pandemic effect and other unobserved variables that evolve over time but are constant across counties.

Regional Division of California

The examination of the regional effect on migration flows requires a suitable regional division of California. We adopt the California Economic Regions, which have been utilized by governmental and research institutions in analyses of workforce participation, infrastructure development, school enrollment, and interstate migration (California Economic Strategy Panel, 2006; Holmes & White, 2021). The division considers factors such as population centers, commute patterns, land ownership, and labor force composition and categorizes counties into nine groups. They are the Bay Area region (Alameda, Contra Costa, Marin, Napa, San Benito, San Francisco, San Mateo, Santa Clara, Santa Cruz, Solano, and Sonoma), the Central Coast region (Monterey, San Luis Obispo, and Santa Barbara), the Central Sierra region (Alpine, Amador, Calaveras, Inyo, Mariposa, Mono, and Tuolumne), the Greater Sacramento region (El Dorado, Placer, Sacramento, Sutter, Yolo, and Yuba), the Northern California region (Del Norte, Humboldt, Lake, Lassen, Mendocino, Modoc, Nevada, Plumas, Sierra, Siskiyou, and Trinity), the Northern Sacramento Valley region (Butte, Colusa, Glenn, Shasta, and Tehama), the San Joaquin Valley region (Fresno, Kern, Kings, Madera, Merced, San Joaquin, Stanislaus, and Tulare), the Southern Border region (Imperial and San Diego), and the Southern California region (Los Angeles, Orange, Riverside, San Bernardino, and Ventura). Appendix Figure presents the map of California Economic Regions.

Methodology and Plan for Analysis

Spatial Pattern Analysis

The spatial distribution of in-migration and out-migration counts is examined using the natural breaks classification method, a widely used and robust classification method provided by ArcGIS. Class breaks are created in a way that minimizes within-group differences and maximizes between-group differences (De Smith et al., 2007). The features are divided into classes whose boundaries are set where there are relatively big differences in the data values.
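The idea behind natural breaks can be sketched as follows. This is a brute-force illustration that tries every way of cutting the sorted values into k classes and keeps the one with the smallest within-class sum of squared deviations; it is not the optimization routine ArcGIS uses, and the input values are hypothetical.

```python
from itertools import combinations

def sum_squared_dev(values):
    """Sum of squared deviations from the class mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def natural_breaks(values, k):
    """Split values into k contiguous classes minimizing within-class variation.

    Exhaustive search: only practical for small inputs.
    """
    data = sorted(values)
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")
    best_cost, best_classes = None, None
    # Each choice of k-1 cut positions defines one candidate classification.
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        classes = [data[bounds[i]:bounds[i + 1]] for i in range(k)]
        cost = sum(sum_squared_dev(c) for c in classes)
        if best_cost is None or cost < best_cost:
            best_cost, best_classes = cost, classes
    return best_classes

# Hypothetical migration counts (in thousands) for eight counties.
classes = natural_breaks([1, 2, 3, 10, 11, 12, 50, 52], k=3)
print(classes)
```

The class boundaries fall in the large gaps between 3 and 10 and between 12 and 50, which matches the description of breaks being set where the data values differ most.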

The cluster and outlier analyses are utilized to illustrate the spatial disparity in migration patterns as they detect groupings and areas with anomalies (Fotheringham & Rogerson, 2013). The analysis identifies counties that have either a high or low level of migration counts in concordance with their surroundings (i.e., clusters). It also identifies anomalous counties with migration counts that are very different from their neighbors, whether much higher or lower (i.e., outliers). There are also cases in which no associations can be made.
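The cluster and outlier labels can be sketched with a local Moran's I calculation. This sketch assumes the analysis follows the Anselin Local Moran's I statistic with row-standardized contiguity weights; the five-county layout and migration counts are hypothetical, and the significance testing that determines which labels are reported is omitted.

```python
def local_moran_quadrants(values, neighbors):
    """Return (label, local_I) per area from deviations and neighbor averages.

    Labels: HH/LL are clusters, HL/LH are outliers, 'none' if undetermined.
    No significance test is performed in this sketch.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    if m2 == 0:
        raise ValueError("values have no variation")
    results = []
    for i in range(n):
        nbrs = neighbors[i]
        # Row-standardized spatial lag: mean deviation of the neighbors.
        lag = sum(z[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        local_i = z[i] * lag / m2
        if z[i] > 0 and lag > 0:
            label = "HH"
        elif z[i] < 0 and lag < 0:
            label = "LL"
        elif z[i] > 0 and lag < 0:
            label = "HL"
        elif z[i] < 0 and lag > 0:
            label = "LH"
        else:
            label = "none"
        results.append((label, local_i))
    return results

# Five hypothetical counties arranged in a line; each borders the next.
counts = [100, 90, 10, 20, 15]
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
results = local_moran_quadrants(counts, adjacency)
labels = [label for label, _ in results]
```

In this toy layout the third county has a low count while its neighbors average above the mean, so it is labeled LH, the same type of relationship reported for an outlier county surrounded by high-migration neighbors.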

The relative measure of demographic efficiency summarizes the degree to which inflows and outflows are non-canceling. It scales net migration by the sum of in-migration and out-migration as follows:

$$\text{Demographic Efficiency} = \frac{\text{Net Migration}}{\text{In-migration} + \text{Out-migration}} \times 100 \qquad (1)$$

The absolute value of the measure ranges from 0%, when the inflow exactly matches the outflow, to 100% when all movement is unidirectional. The positive sign indicates the inflow is larger than the outflow, and the negative sign suggests the opposite.
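A minimal sketch of Eq. (1) follows, taking net migration as in-migration minus out-migration, consistent with the sign convention just described. The worked case uses the Los Angeles annual averages reported in Table 2; the function name is illustrative.

```python
def demographic_efficiency(in_migration, out_migration):
    """Net migration as a percentage of total (in plus out) migration."""
    total = in_migration + out_migration
    if total == 0:
        raise ValueError("efficiency is undefined when there is no migration")
    net = in_migration - out_migration
    return net / total * 100

# Los Angeles annual averages from Table 2: 103,241 in, 147,077 out.
la_efficiency = demographic_efficiency(103_241, 147_077)
print(f"{la_efficiency:.1f}%")
```

The negative value for Los Angeles reflects an outflow larger than the inflow, while its modest magnitude shows that most movement in and out of the county cancels.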

Multilevel Gravity Model

The basic gravity model is specified as follows, in a linearized form:

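$$\ln \text{Migration}_{ijt} = \beta_0 + \beta_1 \ln \text{Pop}_{it} + \beta_2 \ln \text{Pop}_{jt} + \beta_3 \ln \text{Distance}_{ij} + \beta X + e_{ijt} \qquad (2)$$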

Migrationijt comprises the observed migration flows from county i to county j during time period t. With 58 counties in California, there are 3306 (i.e., 58 × 57) observed values for Migrationij each year, and 26,448 in total during our 8-year study period (i.e., 3306 × 8). Popit (Popjt) represents the total population of county i (j) measured at time t. Distanceij measures the Euclidean distance between the spatial centers of counties i and j. Other demographic and socioeconomic variables are represented by matrix X and its vector of coefficients β, which are parameters to be estimated. β0 is the constant term and eijt is a randomly distributed error term.
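The structure of the basic gravity model can be sketched as follows. The sketch enumerates the ordered county pairs to confirm the observation counts and shows that the linearized form is the logarithm of a multiplicative model. The coefficient values, populations, and distances are hypothetical and are not estimates from this study; the additional controls in X are omitted.

```python
import math
from itertools import permutations

N_COUNTIES = 58
N_YEARS = 8

# Ordered origin-destination pairs (i, j) with i != j.
pairs = list(permutations(range(N_COUNTIES), 2))
n_pairs = len(pairs)          # 58 * 57
n_obs = n_pairs * N_YEARS     # one observation per pair per year

def log_flow(beta, pop_i, pop_j, dist):
    """Linearized gravity model without additional controls."""
    b0, b1, b2, b3 = beta
    return b0 + b1 * math.log(pop_i) + b2 * math.log(pop_j) + b3 * math.log(dist)

def flow(beta, pop_i, pop_j, dist):
    """Equivalent multiplicative form: exp(b0) * Pop_i^b1 * Pop_j^b2 * Dist^b3."""
    b0, b1, b2, b3 = beta
    return math.exp(b0) * pop_i ** b1 * pop_j ** b2 * dist ** b3

# Hypothetical coefficients: positive on population, negative on distance.
beta = (-8.0, 0.9, 0.8, -1.2)
near = flow(beta, 1_000_000, 500_000, 100.0)
far = flow(beta, 1_000_000, 500_000, 200.0)
```

With a distance coefficient of -1.2, doubling the distance multiplies the predicted flow by 2 to the power -1.2, which is how a coefficient in the log-log form is read as an elasticity.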

The multilevel gravity model is an extended gravity model that allows migration flows to be nested in common origins, destinations, or regional linkages. It includes cross-classified origin, destination, or region random effect to account for systematic residual variation in migration across counties or regions. It can be written as:

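$$\ln \text{Migration}_{ijt} = \beta_0 + \beta_1 \ln \text{Pop}_{it} + \beta_2 \ln \text{Pop}_{jt} + \beta_3 \ln \text{Distance}_{ij} + \beta X + \gamma_t + e_{ijt} \qquad (3)$$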

where γt denotes the origin, destination, or region random effect and eijt the revised residual. The random effects and residuals are typically assumed to be normally distributed with zero means and constant variance. Importantly, the multilevel models generate variance partition coefficients (VPCs), which can be used to quantify the relative importance of place-specific context in explaining residual migration. The value of a VPC is the proportion of unexplained differences in migration flows that is attributed to origins, destinations, or regions.
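A VPC calculation can be sketched as follows, assuming a cross-classified specification in which the origin, destination, and region random effects each have their own variance alongside the flow-level residual variance. The variance values are hypothetical and are not results from this study.

```python
def variance_partition(components):
    """Share of total unexplained variance attributed to each component."""
    if any(v < 0 for v in components.values()):
        raise ValueError("variances cannot be negative")
    total = sum(components.values())
    if total == 0:
        raise ValueError("total variance is zero")
    return {name: var / total for name, var in components.items()}

# Hypothetical estimated variances (illustrative only).
variances = {
    "origin": 0.10,
    "destination": 0.25,
    "region": 0.15,
    "residual": 0.50,
}
vpc = variance_partition(variances)
for name, share in vpc.items():
    print(f"{name}: {share:.0%}")
```

In this illustration a quarter of the unexplained variation sits at the destination level and a tenth at the origin level, the kind of contrast that would indicate counties differ more in the migrants they attract than in those they send.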

Analytic Strategy

The analysis proceeds in two main stages. First, we analyze recent trends and spatial patterns of inter-county migration in California during the period FY 2014–2021. We examine in-migration, out-migration, net migration, and demographic efficiency, and identify any spatial disparity in the ArcGIS software. Second, we utilize multilevel gravity models and compare them with the traditional OLS gravity model to explore the determinants of migration. The inclusion of multiple levels enables us to capture (1) variation in flows of common origins, (2) variation in flows of common destinations, and (3) variation in flows of common intraregional or interregional linkages. The regression is conducted in the Stata software, where the maximum-likelihood estimator is employed to fit gravity models.

The battery of robustness checks will include the use of a fixed-effect model as an alternative approach and the incorporation of a COVID-19 pandemic dummy variable to explicitly control for the pandemic effect.

Internal Migration in California and Its Spatial Patterns

A total of 6,673,766 inter-county residential address changes were reported to the DMV between FY 2014 and 2021. Total movements increased from 815,546 in 2014 to 874,295 in 2016, followed by a drop to 811,320 in 2018. Despite another increase in 2019, there was a sharp drop to 769,776 moves in 2020, likely due to state-wide COVID-related impacts. In 2021, total inter-county movements bounced back to 850,586 cases. Inter-county mobility in California overall declined compared with historical trends between the 1980s and 1990s (Flood et al., 2021), which echoes the long-run decline in interstate migration (Foster, 2017; Molloy et al., 2011).

Figure 1 shows the spatial distribution of annual average in-migration and out-migration counts. Southern California counties such as Los Angeles and San Bernardino, joined by several northern counties such as Alameda and Sacramento, exhibit a high level of in-migration reaching over 40,000 cases per year (Fig. 1a). Out-migration features a spatial pattern similar to in-migration, with six counties in the top two classes (i.e., Los Angeles, Orange, Alameda, Santa Clara, Riverside, and San Bernardino) representing nearly 50% of all out-migration cases (Fig. 1b).

Fig. 1. Annual average in-migration (a) and out-migration (b) at the county level in California, FY2014–2021 (Unit: persons). (Color figure online)

Table 2 follows up on Fig. 1 and identifies the top ten counties with the highest levels of annual average migration during the study period. Los Angeles tops the ranking with 103,241 in-migrants and 147,077 out-migrants annually and accounts for at least 10% of all inter-county migration in California. On average, 33,730 people move from Los Angeles to Orange each year, making it the largest county-to-county migration flow.

Table 2. Top ten counties with the highest values for annual average in-migration, out-migration, and inter-county migration, FY2014–2021

| Rank | County | In-migrants | County | Out-migrants | Origin | Destination | Migrants |
|---|---|---|---|---|---|---|---|
| 1 | Los Angeles | 103,241 | Los Angeles | 147,077 | Los Angeles | Orange | 33,730 |
| 2 | Riverside | 72,347 | Orange | 64,571 | Los Angeles | San Bernardino | 32,647 |
| 3 | Orange | 64,963 | Alameda | 53,671 | Orange | Los Angeles | 26,099 |
| 4 | San Bernardino | 61,665 | Santa Clara | 52,593 | Los Angeles | Riverside | 20,323 |
| 5 | Alameda | 50,565 | Riverside | 51,765 | San Bernardino | Riverside | 18,571 |
| 6 | San Diego | 45,819 | San Bernardino | 51,668 | San Bernardino | Los Angeles | 18,079 |
| 7 | Sacramento | 41,032 | San Diego | 43,909 | Riverside | San Bernardino | 15,028 |
| 8 | Santa Clara | 37,306 | San Francisco | 36,642 | Orange | Riverside | 14,013 |
| 9 | Contra Costa | 34,810 | Sacramento | 34,555 | Alameda | Contra Costa | 12,847 |
| 10 | San Francisco | 29,176 | Contra Costa | 31,621 | Riverside | Los Angeles | 11,074 |
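
The shares quoted around Table 2 can be reproduced with a short arithmetic sketch. It assumes that the statewide annual average equals the reported eight-year total divided by eight fiscal years, and that every inter-county move counts once as an out-migration and once as an in-migration; the variable names are illustrative only.

```python
# Figures quoted in the text and in Table 2 (annual averages, FY2014-2021).
TOTAL_MOVES = 6_673_766  # all inter-county address changes over the study period
N_YEARS = 8              # FY2014 through FY2021, inclusive (assumed divisor)

annual_moves = TOTAL_MOVES / N_YEARS

# Annual average out-migrants for the six largest sending counties (Table 2).
top_six_out = {
    "Los Angeles": 147_077,
    "Orange": 64_571,
    "Alameda": 53_671,
    "Santa Clara": 52_593,
    "Riverside": 51_765,
    "San Bernardino": 51_668,
}


def share(count, total=annual_moves):
    """Return count as a fraction of the statewide annual average."""
    return count / total


la_in_share = share(103_241)
la_out_share = share(147_077)
top_six_share = share(sum(top_six_out.values()))

print(f"Los Angeles share of in-migration:  {la_in_share:.1%}")
print(f"Los Angeles share of out-migration: {la_out_share:.1%}")
print(f"Top six share of out-migration:     {top_six_share:.1%}")
```

Under these assumptions, Los Angeles accounts for roughly 12% of in-migration and 18% of out-migration, and the six largest sending counties together account for about half of all out-migration.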

Figure 2a presents results from the cluster and outlier analysis of in-migration. The High-High (HH) cluster, a spatial clustering of counties with markedly higher numbers of in-migrants, is found among southern California counties including Los Angeles, San Bernardino, Ventura, Orange, and San Diego. The Low–Low (LL) clusters of in-migration are mostly in the northern inland. Imperial is identified as a Low–High (LH) outlier, meaning that the number of in-migrants in Imperial is low while the county is surrounded by counties with relatively higher in-migration. Figure 2b shows clusters and outliers of out-migration. Los Angeles, Ventura, and Orange form the HH cluster, while the LL clusters are identified among northern counties such as Lassen and Mono. These counties have a significantly low number of out-movers, and so do their neighboring counties. Santa Cruz and Imperial have an LH relationship with their neighbors, indicating that they send out significantly fewer people than their neighboring counties.

Fig. 2. Results of the cluster and outlier analysis for in-migration (a) and out-migration (b). (Color figure online)
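
The HH, LL, and LH labels can be illustrated with a minimal sketch of the local Moran statistic that typically underlies cluster and outlier analysis. The sketch assumes row-standardized contiguity weights and omits the permutation-based significance test that GIS software applies; the six-unit chain and its counts are hypothetical toy data, not values from this study.

```python
import numpy as np


def local_moran_quadrants(values, neighbors):
    """Return local Moran's I and quadrant labels for each spatial unit.

    values:    one observation per unit (e.g., in-migrant counts)
    neighbors: list of neighbor-index lists, one per unit (contiguity)
    """
    x = np.asarray(values, dtype=float)
    z = (x - x.mean()) / x.std()  # standardized values
    local_i, labels = [], []
    for i, nbrs in enumerate(neighbors):
        lag = z[list(nbrs)].mean()  # row-standardized spatial lag
        local_i.append(z[i] * lag)
        # First letter: the unit itself; second letter: its neighbors.
        labels.append(("H" if z[i] > 0 else "L") + ("H" if lag > 0 else "L"))
    return np.array(local_i), labels


# Hypothetical toy data: six units arranged in a chain.
counts = [90, 60, 5, 70, 4, 6]
chain = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]

local_i, labels = local_moran_quadrants(counts, chain)
for unit, (stat, label) in enumerate(zip(local_i, labels)):
    print(unit, label, round(float(stat), 2))
```

In this toy example the third unit has a low count but high-count neighbors, so it is labeled LH with a negative local statistic, which mirrors the way Imperial is described in the text.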

Net migration patterns provide a picture of the relative attractiveness of counties. Figure 3 illustrates that a majority of inland counties receive positive net migration, particularly in Riverside, San Bernardino, Sacramento, and Placer. In contrast, Los Angeles, Santa Clara, and San Francisco experience annual net losses of 43,836, 15,286, and 7,466 residents, respectively. This finding confirms a coast-to-inland migration trend in California (Frey, 2010).

Fig. 3. Annual average net migration at the county level in California, FY2014–2021 (Unit: persons). (Color figure online)

In terms of demographic efficiency, Fig. 4 shows that Los Angeles, Santa Clara, and San Francisco have a demographic efficiency of −10% or worse, pointing to an imbalanced exchange leading to population loss. Conversely, counties including Riverside, Amador, and San Benito exhibit a positive demographic efficiency of over 15%, indicating the predominance of inflow in these areas. A handful of other counties, such as Santa Cruz, Napa, and Orange, exhibit a relatively balanced exchange between inflow and outflow, with a demographic efficiency of nearly 0%.

Fig. 4. Demographic efficiency at the county level in California, FY2014–2021. (Color figure online)
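
As a worked illustration, demographic efficiency can be computed from the annual averages in Table 2. The sketch assumes the conventional definition, net migration as a percentage of gross migration (in-migrants plus out-migrants); the function name is illustrative.

```python
def demographic_efficiency(in_migrants, out_migrants):
    """Net migration as a percentage of gross migration (assumed definition)."""
    net = in_migrants - out_migrants
    gross = in_migrants + out_migrants
    return 100.0 * net / gross


# Annual average in- and out-migrants taken from Table 2.
table2 = {
    "Los Angeles": (103_241, 147_077),
    "Santa Clara": (37_306, 52_593),
    "San Francisco": (29_176, 36_642),
    "Riverside": (72_347, 51_765),
    "Orange": (64_963, 64_571),
}

for county, (inflow, outflow) in table2.items():
    print(f"{county}: {demographic_efficiency(inflow, outflow):+.1f}%")
```

Under this definition, the Table 2 counts put Los Angeles, Santa Clara, and San Francisco below −10%, Riverside above 15%, and Orange close to zero, which agrees with the pattern described for Fig. 4.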

Figure 5 provides a closer look at people leaving the state’s two largest economic powerhouses, Los Angeles (Fig. 5a) and San Francisco (Fig. 5b). While many out-migrants indeed move inland, a majority move no further than adjacent counties. This finding adds nuance to the coast-to-inland migration trend in California.

Fig. 5. Annual average migrants from Los Angeles (a) and San Francisco (b) received by other counties, FY2014–2021 (Unit: persons). (Color figure online)

Determinants for Internal Migration in California

Regression results from the gravity models are presented in Table 3. Model 1 lists the OLS estimates of the gravity model. Models 2, 3, and 4 are multilevel gravity models that recognize the clustering of migration flows based on origins, destinations, and regions, respectively. Although coefficients across the four models are similar in magnitude, the likelihood ratio tests suggest that Models 2–4 are preferred over Model 1. Comparisons of Akaike’s Information Criterion (AIC) and the Bayesian Information Criterion (BIC) also indicate the improved performance of the multilevel gravity models.

Table 3. OLS and multilevel gravity model estimates predicting inter-county migration flows in California

| Variables | Model 1: conventional model | Model 2: origin model | Model 3: destination model | Model 4: region model |
|---|---|---|---|---|
| | Coef. (SE) | Coef. (SE) | Coef. (SE) | Coef. (SE) |
| Origin population | 0.84 (0.01)*** | 0.75 (0.03)*** | 0.85 (0.00)*** | 0.94 (0.01)*** |
| Destination population | 0.80 (0.01)*** | 0.81 (0.00)*** | 0.49 (0.04)*** | 0.90 (0.01)*** |
| Distance | −1.24 (0.01)*** | −1.32 (0.01)*** | −1.32 (0.01)*** | −1.16 (0.02)*** |
| Contiguity | 1.31 (0.02)*** | 1.25 (0.02)*** | 1.25 (0.02)*** | 1.21 (0.02)*** |
| Origin median age | 0.29 (0.02)*** | 0.65 (0.13)*** | 0.30 (0.02)*** | 0.01 (0.02) |
| Destination median age | 0.32 (0.02)*** | 0.32 (0.02)*** | 0.44 (0.16)*** | 0.04 (0.02) |
| Origin median household income | −0.26 (0.02)*** | 0.05 (0.06) | −0.29 (0.01)*** | 0.08 (0.02)*** |
| Destination median household income | −0.48 (0.02)*** | −0.49 (0.02)*** | −0.02 (0.07) | −0.14 (0.02)*** |
| Origin unemployment rate | −0.36 (0.02)*** | 0.13 (0.05)*** | −0.36 (0.02)*** | −0.10 (0.02)*** |
| Destination unemployment rate | −0.52 (0.02)*** | −0.52 (0.02)*** | −0.11 (0.05)** | −0.25 (0.02)*** |
| Industry composition similarity | −0.23 (0.01)*** | −0.24 (0.01)*** | −0.20 (0.01)*** | −0.18 (0.01)*** |
| Origin wildfire frequency | 0.03 (0.01)*** | 0.01 (0.01) | 0.03 (0.00)*** | 0.04 (0.01)*** |
| Destination wildfire frequency | 0.05 (0.01)*** | 0.06 (0.01)*** | 0.00 (0.01) | 0.07 (0.01)*** |
| Constant | 6.79 (0.38)*** | 3.60 (0.83)*** | 5.77 (1.01)*** | −3.35 (0.52)*** |
| Year 2015 | −0.16 (0.02)*** | −0.07 (0.02)*** | −0.08 (0.02)*** | −0.03 (0.02) |
| Year 2016 | −0.30 (0.02)*** | −0.12 (0.03)*** | −0.13 (0.03)*** | −0.07 (0.02)*** |
| Year 2017 | −0.52 (0.02)*** | −0.30 (0.03)*** | −0.30 (0.03)*** | −0.23 (0.02)*** |
| Year 2018 | −0.69 (0.03)*** | −0.38 (0.04)*** | −0.38 (0.04)*** | −0.31 (0.03)*** |
| Year 2019 | −0.82 (0.03)*** | −0.43 (0.04)*** | −0.43 (0.04)*** | −0.32 (0.03)*** |
| Year 2020 | −0.99 (0.03)*** | −0.69 (0.05)*** | −0.60 (0.05)*** | −0.45 (0.03)*** |
| Year 2021 | −0.44 (0.02)*** | −0.36 (0.03)*** | −0.29 (0.03)*** | −0.24 (0.02)*** |
| Individual flow variance | 0.67 (0.38)*** | 0.61 (0.01)*** | 0.56 (0.00)*** | 0.56 (0.00)*** |
| Origin county variance | | 0.12 (0.04)*** | | |
| Destination county variance | | | 0.39 (0.15)*** | |
| Within/Across region variance | | | | 0.31 (0.07)*** |
| Origin VPC | | 16.1% | | |
| Destination VPC | | | 40.8% | |
| Region VPC | | | | 35.8% |
| R2 | 0.86 | 0.85, 0.94 | 0.80, 0.78 | 0.85, 0.90 |
| AIC | 64,489.0 | 62,343.1 | 60,169.8 | 59,855.7 |
| BIC | 64,660.8 | 62,531.2 | 60,358.0 | 60,043.9 |
| LR test | | 2149.9*** | 4323.2*** | 4637.3*** |

**p < 0.05, ***p < 0.01. The dependent variable is the log migration flow. Coef., coefficient; SE, standard error

The Snijders/Bosker R2 measure is calculated for Models 2–4. It is a commonly used R2 measure for multilevel models and reports the variance explained at each level (Snijders & Bosker, 1994). The value before the comma is the level-1 R2 and the value after the comma is the level-2 R2.

Demographic and Socioeconomic Characteristics

Our primary independent variables have a substantial effect on migration flows, with consistent outcomes across the four models. Specifically, the larger the population size of the origin and the destination, the greater the migration between them. A 10% increase in the origin population is associated with an approximately 8–9% increase in out-migration, all else being equal, while a 10% increase in the destination population is associated with an approximately 5–9% increase in in-migration. Geographic proximity also matters: as expected, migration flows are larger when the origin and the destination are closer together and when the two counties are contiguous.
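
The percentage interpretation of the population coefficients can be sketched as follows. The sketch assumes a log-log specification, in which each coefficient is an elasticity, and takes the population coefficients from Table 3; the helper name is illustrative.

```python
def pct_change_in_flow(elasticity, pct_change_in_x):
    """Exact percent change in the flow implied by a log-log coefficient."""
    return 100.0 * ((1.0 + pct_change_in_x / 100.0) ** elasticity - 1.0)


# Population coefficients for Models 1-4, copied from Table 3.
origin_coefs = [0.84, 0.75, 0.85, 0.94]
destination_coefs = [0.80, 0.81, 0.49, 0.90]

origin_effects = [pct_change_in_flow(b, 10) for b in origin_coefs]
destination_effects = [pct_change_in_flow(b, 10) for b in destination_coefs]

print("Origin population +10%:     ", [round(e, 1) for e in origin_effects])
print("Destination population +10%:", [round(e, 1) for e in destination_effects])
```

Under this assumption, a 10% larger origin population implies flows about 7.4% to 9.4% larger across the four models, and a 10% larger destination population implies flows about 4.8% to 9.0% larger, in line with the approximate ranges given in the text.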

Other independent variables, including job availability and industrial composition, also contribute substantially to migration flows. Counties with lower unemployment rates receive more migrants, which is consistent with empirical evidence that available employment opportunities exert a positive pull on migrants (Greenwood, 2021; Marré & Rupasingha, 2020). Industrial composition homophily has a repelling effect on migration: as predicted, the greater the difference in industrial structure between counties, the larger the migration flow. Additionally, migration flows systematically decreased over time, as reflected by the negative time-fixed effects. The largest negative time-fixed effect is found in FY 2020, pointing to the onset of the pandemic as an impediment to migration.

Moving from Model 1 to Model 2, where migration flows are nested in common origins, the effect of median household income at the origin becomes insignificant, suggesting that the variation in out-migration initially explained by income in Model 1 could instead be ascribed to the origin effect. The unemployment rate at the origin becomes positively associated with migration, indicating that migration can be pushed by an unfavorable job market. The VPC in Model 2 indicates that the origin context accounts for approximately 16% of the total residual variance.

Comparing Model 3 with Model 1, income at the destination becomes statistically indistinguishable from zero, suggesting a limited role of income in explaining in-migration after accounting for the destination effect. The destination-specific influence explains nearly 41% of the unexplained differences in migration flows between counties. The impact of sharing common destinations is an important feature shaping migration, but this would have gone unnoticed in Model 1.

In Model 4, where flows are nested in regional relationships, results mostly replicate those in Model 1, except that the effect of median age at the origin and destination becomes statistically indistinguishable from zero. The region VPC indicates that 36% of the unexplained differences in migration flows can be ascribed to the regional context.
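
The variance partition coefficients (VPCs) discussed above can be approximated from the variance components in Table 3. The sketch assumes the standard random-intercept definition, group-level variance divided by the sum of group-level and flow-level variance; because the inputs are the rounded table values, the results differ slightly from the reported VPCs.

```python
def vpc(group_variance, residual_variance):
    """Share of unexplained variance attributable to the grouping level."""
    return group_variance / (group_variance + residual_variance)


# (group-level variance, individual flow variance) as printed in Table 3.
components = {
    "Model 2 (origin)": (0.12, 0.61),
    "Model 3 (destination)": (0.39, 0.56),
    "Model 4 (region)": (0.31, 0.56),
}

vpcs = {name: vpc(g, r) for name, (g, r) in components.items()}
for name, value in vpcs.items():
    print(f"{name}: {value:.1%}")
```

The rounded inputs give roughly 16%, 41%, and 36%, close to the reported 16.1%, 40.8%, and 35.8%; the small gaps presumably reflect rounding of the published variance estimates.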

Contrary to popular belief, there is a limited association between wildfire and inter-county migration, as the coefficients of wildfire frequency are close to zero in Models 1–4. This is consistent with empirical analyses showing that wildfire damage mostly leads to short-distance relocation (Nawrotzki et al., 2014; Sharygin, 2021).

Origin, Destination, and Region Effects

Figure 6 maps the spatial distribution of residual differences between origins and destinations, which are the predicted random effects of origins from Model 2 (Fig. 6a) and of destinations from Model 3 (Fig. 6b). Counties with a statistically significant positive origin or destination effect depart from the theoretical gravity model by systematically sending out or receiving more migrants than predicted by population, distance, and other attributes. Counties with a statistically significant negative origin or destination effect depart from the gravity model in the opposite direction: they export or import fewer migrants than predicted. Some counties do not appear to have any strong origin or destination effect (i.e., the effect is statistically non-significant). Zero represents the theoretical mean of the normally distributed residuals. As shown in Fig. 6a, origins with above-average exporting capabilities cluster in southern, central coastal, and far northern parts of California. In contrast, northern inland counties such as Sierra and central valley counties such as Tuolumne systematically send out fewer migrants. Thirteen counties, including San Francisco, do not exhibit a strong origin effect.

Fig. 6. Spatial pattern of the predicted origin (a) and destination (b) effects. (Color figure online)

The destination effect across counties shows a similar spatial pattern (Fig. 6b), except that counties including Los Angeles, Sacramento, and Fresno exhibit noticeably stronger attraction relative to their exporting ability. Destination effects vary more than origin effects, as reflected by a wider value range (i.e., −1.74 to 1.45) and a higher standard deviation (i.e., 0.63 for destination effects vs. 0.34 for origin effects), suggesting that counties vary more in the number of migrants they attract than in the number of migrants they send.

Figure 7 follows Fig. 6 and provides a detailed comparison of the origin and destination effects of all counties. Most counties exhibit matching origin and destination effects: those that systematically send out more migrants also tend to receive more. San Diego shows the highest origin and destination effects, followed by far northern counties including Butte and Humboldt. In contrast, remote and rural counties such as Sierra and Alpine are at the bottom. A few other counties lean toward either a strong pull or a strong push. For instance, Placer exhibits an above-average destination effect (i.e., 0.35) but a minimal origin effect (i.e., −0.06), suggesting that the county attracts comparatively more strongly than it sends.

Fig. 7. Comparison of predicted origin and destination effects of counties

The predicted region effects from Model 4 reveal impacts of the macrogeographic context (Table 4). Among regional flows with below-average residuals, migration flows between the San Joaquin Valley region and the Southern California region show the lowest region effects—flows between these two regions are significantly smaller than predicted by the gravity model. On the other end are flows with relatively high region effects, which are those within the Northern Sacramento Valley region and the Northern California region. Five pairs of interregional flows do not appear to have a strong region effect that deviates from the overall average, such as flows between the Central Coast and the Southern Border region.

Table 4. Predicted regional effects in rank order

Regional migration flows Region effect
Below average
 San Joaquin Valley-Southern California − 0.96
 Bay Area-San Joaquin Valley − 0.83
 Within Southern California − 0.83
 Southern Border-Southern California − 0.82
 Central Coast-Southern California − 0.59
 Greater Sacramento-Southern California − 0.57
 Bay Area-Southern California − 0.56
 Central Coast-San Joaquin Valley − 0.48
 Greater Sacramento-San Joaquin Valley − 0.46
 Central Sierra-San Joaquin Valley − 0.39
 Within Southern Border − 0.37
 Northern Sacramento Valley-Southern California − 0.34
 Within Central Coast − 0.29
 Within Bay Area − 0.28
 Bay Area-Northern Sacramento Valley − 0.25
 Central Sierra-Southern California − 0.25
 Bay Area-Greater Sacramento − 0.21
 Within San Joaquin Valley − 0.20
 Bay Area-Central Sierra − 0.17
 Northern Sacramento Valley-San Joaquin Valley − 0.16
 Bay Area-Southern Border − 0.15
 Greater Sacramento-Southern Border − 0.14
 Northern California-Southern California − 0.12
 Bay Area-Central Coast − 0.08
 Northern California-San Joaquin Valley − 0.05
Average
 Central Coast-Greater Sacramento − 0.01
 San Joaquin Valley-Southern Border 0.02
 Northern Sacramento Valley-Southern Border 0.07
 Central Coast-Northern Sacramento Valley 0.08
 Central Coast-Southern Border 0.15
Above average
 Bay Area-Northern California 0.15
 Central Sierra-Greater Sacramento 0.16
 Central Sierra-Southern Border 0.18
 Central Coast-Central Sierra 0.19
 Northern California-Southern Border 0.25
 Central Coast-Northern California 0.36
 Central Sierra-Northern Sacramento Valley 0.41
 Within Greater Sacramento 0.50
 Greater Sacramento-Northern Sacramento Valley 0.61
 Greater Sacramento-Northern California 0.65
 Within Central Sierra 0.86
 Northern California-Northern Sacramento Valley 0.99
 Central Sierra-Northern California 1.03
 Within Northern California 1.29
 Within Northern Sacramento Valley 1.60

Robustness Check

This section collects results from the diagnostic checks and alternative models used to assess the stability of our results.

Fixed-Effect Modeling

Fixed-effect modeling offers an alternative approach that specifies a separate intercept for each group. A fixed-effect model resembling the fixed-effect portion of our gravity model can be estimated by a regression in which all variables are re-expressed as deviations from their within-group means. Results of the fixed-effect gravity model show that coefficient signs and sizes mostly remain similar, except that the unemployment rate at the origin becomes non-significant (Appendix Table B). It is also worth noting that the results of the multilevel gravity model and its fixed-effect version are not strictly comparable, because the former includes place-specific random effects, which significantly improve the fit, as indicated by much higher R2 measures.
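
The equivalence between group-specific intercepts and within-group demeaning can be illustrated with a minimal sketch on simulated data. Everything in it is hypothetical: the five groups, the true slope of 1.5, and the variable names are chosen for illustration and do not correspond to the models in Appendix Table B.

```python
import numpy as np

# Simulated data: 5 groups with their own intercepts and a common slope of 1.5.
rng = np.random.default_rng(0)
n_groups, n_per_group = 5, 40
group = np.repeat(np.arange(n_groups), n_per_group)
group_effect = rng.normal(0.0, 2.0, n_groups)
x = rng.normal(0.0, 1.0, group.size) + 0.5 * group_effect[group]
y = group_effect[group] + 1.5 * x + rng.normal(0.0, 0.5, group.size)


def demean_by_group(v, g):
    """Re-express v as deviations from its within-group means."""
    means = np.bincount(g, weights=v) / np.bincount(g)
    return v - means[g]


# Within estimator: OLS slope on the demeaned variables.
x_w = demean_by_group(x, group)
y_w = demean_by_group(y, group)
beta_within = (x_w @ y_w) / (x_w @ x_w)

# Dummy-variable estimator: one intercept per group plus the common slope.
dummies = (group[:, None] == np.arange(n_groups)).astype(float)
design = np.column_stack([x, dummies])
beta_dummy = np.linalg.lstsq(design, y, rcond=None)[0][0]

print(beta_within, beta_dummy)
```

The two slope estimates coincide up to floating-point error, which is why demeaning is a convenient way to fit a model with a separate intercept for each group.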

The COVID-19 Pandemic

Although the time-fixed effects are used to control for the impact of time-relevant events including the COVID-19 pandemic, additional analysis was conducted by replacing the time-fixed-effect variables with a pandemic dummy variable (i.e., value 0 for the pre-pandemic years before 2020, value 1 for the year 2020 and after). The results largely replicated those documented in our original models, although the coefficient for the origin population in Model 2 and the coefficient for the destination population in Model 3 shrink slightly (Appendix Table C). This suggests that the impact of the pandemic on our data sample is largely captured by the time-fixed effects.

Discussion and Conclusion

This paper examines spatial patterns of inter-county migration in California between fiscal years 2014 and 2021 and uses multilevel gravity models to analyze the driving forces of migration. It draws on DMV change-of-address data, which have remained underused in migration studies. While prior studies have greatly contributed to the understanding of internal migration, many treat each migration flow as an independent observation, failing to address both the hierarchical nature of migration data and the impact of origin-, destination-, and region-specific contexts, which can lead to inaccurate and inconsistent results. In addition, little scholarly attention has been given to intra-state migration. This paper addresses both gaps.

Our descriptive analysis shows that the Los Angeles-centered southern California and the San Francisco Bay Area represent the largest origins and destinations of inter-county migration in California (Fig. 1 and Table 2). The examination of net migration supports a coast-to-inland migration trend within California (Frey, 2010), yet many migrants move no further than adjacent counties.

The application of multilevel gravity models identifies push and pull factors and verifies the importance of group effects in explaining the systematic variation of flows. Consistent with the gravity theory (Anderson, 2011), large population sizes and geographic proximity (i.e., shorter distances, being contiguous) contribute to the volume of migration. Populous places provide and attract diverse goods and services and facilitate job creation and information exchange more than less populous areas do, while geographic closeness between places means lower moving costs and less disruption to local ties than long-distance moves. These factors help explain the exceptionally large number of people migrating between Los Angeles and Orange (Table 2), the state’s first and third most populous counties. Apart from these core factors, our findings also show that migration is economically driven by employment and industrial structure similarity between places, which is consistent with empirical studies (Marré & Rupasingha, 2020; Wang et al., 2022).

Contrary to popular belief, our results illustrate a limited association between migration and income, particularly after controlling for origin and destination random effects. This perhaps points to the embeddedness of income in localized economies: when people’s income-earning capacity substantially derives from place-based social capital (e.g., business connections with colleagues, co-founders, clients, etc.), moving away can be extremely costly (Young et al., 2016). In other words, people who expect income from place-specific networks tend not to move, despite attractions from possible destinations. This is especially true for upper-class earners in highly networked industries (e.g., technology) (Powell et al., 2002). It is important to understand that people’s embeddedness in such networks can profoundly complicate the relationship between economic incentives and migration decisions.

Another important feature of inter-county migration in California is that it is systematically declining and echoes a long-run downward trend in interstate migration across the US (Cooke, 2011; Hyatt et al., 2018; Molloy et al., 2017). Our regression results hint that declining intra-state migration also contributes to immobility among Americans during recent decades. Some scholarship suggests that Americans are increasingly stuck—unable to move when they expect to—rather than being rooted in places (Foster, 2018; Johnson et al., 2017). Furthermore, the COVID-19 pandemic and relevant policy responses (e.g., remote working options) may introduce new behavior changes in internal mobility, although researchers have not reached a consensus on the long-term outcome (Anderson et al., 2021; Cohen, 2020). We encourage future studies to grapple with the changing patterns in domestic migration at various spatial scales during the post-pandemic era.

Our multilevel model results demonstrate that origin, destination, and region effects significantly contribute to explaining migration flow residuals, which would otherwise be overlooked by traditional models. Accounting for the flow dependency due to shared origins, destinations, and regions greatly improves the fit of the model. Among counties that exhibit higher-than-average importing and exporting capability, some effects correspond to the counties' migration levels (e.g., San Diego, Los Angeles), and others do not (e.g., Butte, Humboldt). Localized economies and policies are key to explaining the unexpectedly high emissivity and attraction of places. For instance, Humboldt County, known as the state’s “cannabis capital” due to its unique growing history and environment, has experienced rapid exurban growth and increasing property conflicts as newcomers participate in the legalized cannabis industry (Bodwitch et al., 2019; Kavousi et al., 2022). In Butte County, an increasing number of residents have been displaced during post-disaster reconstruction and redevelopment projects; wealthy newcomers have taken the opportunity to purchase lots in wildfire-stricken areas from locals (Marandi & Main, 2021; Spearing & Faust, 2020), causing regional housing prices to soar and become unaffordable. In contrast, northern inland counties exhibit below-average exporting and importing ability. The fact that these communities are geographically remote and lack transportation and energy infrastructure perhaps explains why migrants respond passively to these labor markets.

The macrogeographic context also contributes to the irregular and unpredictable part of internal migration, as indicated by our region-level model outcomes. The macrogeographic influence can be related to various community ties, and/or work-based networks (Williams et al., 2018). For instance, moving within an established network is associated with more assistance with the settlement process and less perceived alienation (Nowotny & Pennerstorfer, 2019), hence motivating residential mobility. In other cases, strong community attachment may discourage the intention to relocate (Cairns & Smyth, 2011).

This study has some limitations. First, the underrepresentation of young, low-income, inner-city-living, and marginalized populations in the DMV data likely leads to an underestimation of internal migration. One possible solution is a composite method that utilizes multiple data sources for age-based subgroup estimation. For instance, school enrollment and Medicare enrollment datasets are valuable sources for estimating the mobility of children and the elderly and for overcoming the non-representativeness of the DMV data among these age groups. Future work should thoroughly assess the validity of DMV data and solutions to address its underrepresented populations. Second, our gravity model is unable to address the feedback effect between migration and income. Migration brings changes in the labor pool, causing changes in the wage rates needed to re-equilibrate the labor market (Fan et al., 2018; Treyz et al., 1993). The same mechanism also applies to the unemployment-migration relationship (Villarreal, 2014). These feedback effects need to be seriously considered to ensure estimation accuracy. Third, the analysis of migration can be sensitive to the modifiable areal unit problem (MAUP): the interpretation of migration patterns can be affected by the scale and boundaries of the units to which data are aggregated (Saks & Wozniak, 2011; Zabel, 2012). Complete knowledge of spatial–temporal patterns of migration must be attentive to the arbitrary nature of spatial data aggregation.

Despite these limitations, our findings warrant consideration from policymakers. Policies that focus on creating quality job opportunities and diversifying the industrial structure can help attract the labor force and improve regional competitiveness. Another implication is that migration estimation and forecasting should make timely adjustments to account for irregularities induced by the place-specific influence. This is particularly important when the county-level socioeconomic condition does not completely explain the heterogeneity in migration outcomes, which may indicate the existence of a contextual effect. An understanding of how much place-specific policies contribute to the importing and exporting power of places deserves great attention from both scholars and government officials.

Acknowledgements

The author thanks three anonymous reviewers for their thoughtful comments on the manuscript and the Demographic Research Unit staff at the California Department of Finance for their data support and feedback. The content is solely the responsibility of the author and does not necessarily reflect the views of the California Department of Finance.

Data availability statement

The raw DMV data are restricted-use data and only available for authorized users due to data privacy laws. Other data used in this analysis are publicly available. See Appendix Table A for links and details.

Declarations

Conflict of interest

The author declares that she has no conflict of interest.

Footnotes

1

DMV uses the fiscal year in data reporting, which runs from July 1st of a calendar year to June 30th of the following calendar year. Fiscal years 2014–2021 cover the period between July 2013 and June 2021.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  1. Akkoyunlu S. Intervening opportunities and competing migrants in Turkish migration to Germany, 1969–2008. Migration Letters. 2012;9(2):155. doi: 10.33182/ml.v9i2.104.
  2. Alis C, Legara EF, Monterola C. Generalized radiation model for human migration. Scientific Reports. 2021;11(1):1–10. doi: 10.1038/s41598-021-02109-1.
  3. Almeida LM, Gonçalves MB. A methodology to incorporate behavioral aspects in trip-distribution models with an application to estimate student flow. Environment and Planning A. 2001;33(6):1125–1138. doi: 10.1068/a33122.
  4. Ambinakudige S, Parisi D. A spatiotemporal analysis of inter-county migration patterns in the United States. Applied Spatial Analysis and Policy. 2017;10(1):121–137. doi: 10.1007/s12061-015-9171-1.
  5. Anderson B, Poeschel F, Ruhs M. Rethinking labour migration: Covid-19, essential work, and systemic resilience. Comparative Migration Studies. 2021;9(1):45. doi: 10.1186/s40878-021-00252-2.
  6. Anderson JE. The gravity model. Annual Review of Economics. 2011;3(1):133–160. doi: 10.1146/annurev-economics-111809-125114.
  7. Backhaus A, Martinez-Zarzoso I, Muris C. Do climate variations explain bilateral migration? A gravity model analysis. IZA Journal of Migration. 2015;4(1):1–15. doi: 10.1186/s40176-014-0026-3.
  8. Beyer RM, Schewe J, Lotze-Campen H. Gravity models do not explain, and cannot predict, international migration dynamics. Humanities and Social Sciences Communications. 2022;9(1):56. doi: 10.1057/s41599-022-01067-x.
  9. Bodwitch H, Carah J, Daane K, Getz C, Grantham T, Hickey G, Wilson S. Growers say cannabis legalization excludes small growers, supports illicit markets, undermines local economies. California Agriculture. 2019;73(3):177–184. doi: 10.3733/ca.2019a0018.
  10. Brettell CB, Hollifield JF. Migration theory: Talking across disciplines. Routledge; 2014.
  11. Burger M, Van Oort F, Linders G-J. On the specification of the gravity model of trade: Zeros, excess zeros and zero-inflated estimation. Spatial Economic Analysis. 2009;4(2):167–190. doi: 10.1080/17421770902834327.
  12. Busso M, Chauvin JP, Herrera LN. Rural-urban migration at high urbanization levels. Regional Science and Urban Economics. 2021;91:103658. doi: 10.1016/j.regsciurbeco.2021.103658.
  13. Cairns D, Smyth J. I wouldn’t mind moving actually: Exploring student mobility in Northern Ireland. International Migration. 2011;49(2):135–161. doi: 10.1111/j.1468-2435.2009.00533.x.
  14. California Economic Strategy Panel. (2006). California economic strategy panel regions. Retrieved September 7, 2022, from https://web.archive.org/web/20111001045752/http://www.labor.ca.gov/panel/pdf/CESP_Regions_100606.pdf
  15. California State Auditor. (2022). Finance provides reasonable population projections, but it has not provided sufficient support for its household formation projections. Retrieved September 7, 2022, from https://www.auditor.ca.gov/reports/2021-125/index.html#chapter2
  16. Cohen JH. Modeling migration, insecurity and COVID-19. Migration Letters. 2020;17(3):405–409. doi: 10.33182/ml.v17i3.986.
  17. Conway KS, Rork JC. On measuring U.S interstate migration with moving van data. Population Research and Policy Review. 2022;41:1431–1449. doi: 10.1007/s11113-022-09713-7.
  18. Cooke TJ. It is not just the economy: Declining migration and the rise of secular rootedness. Population, Space and Place. 2011;17(3):193–203. doi: 10.1002/psp.670.
  19. De Smith MJ, Goodchild MF, Longley P. Geospatial analysis: A comprehensive guide to principles, techniques and software tools. Troubador Publishing Ltd.; 2007.
  20. Détang-Dessendre C, Goffette-Nagot F, Piguet V. Life cycle and migration to urban and rural areas: Estimation of a mixed logit model on French data. Journal of Regional Science. 2008;48(4):789–824. doi: 10.1111/j.1467-9787.2008.00571.x.
  21. DeWaard J, Johnson J, Whitaker S. Internal migration in the United States: A comprehensive comparative assessment of the Consumer Credit Panel. Demographic Research. 2019;41:953–1006. doi: 10.4054/DemRes.2019.41.33.
  22. Fan Q, Fisher-Vanden K, Klaiber HA. Climate change, migration, and regional economic impacts in the United States. Journal of the Association of Environmental and Resource Economists. 2018;5(3):643–671. doi: 10.1086/697168.
  23. Flood S, King M, Rodgers R, Ruggles S, Warren JR, Westberry M. Integrated public use microdata series, current population survey: Version 9.0. IPUMS; 2021.
  24. Foster T. Decomposing American immobility: Compositional and rate components of interstate, intrastate, and intracounty migration and mobility decline. Demographic Research. 2017;37:1515–1548. doi: 10.4054/DemRes.2017.37.47.
  25. Foster TB. The persistent black-white gap in and weakening link between expecting to move and actually moving. Sociology of Race and Ethnicity. 2018;4(3):353–370. doi: 10.1177/2332649217728374.
  26. Fotheringham S, Rogerson P. Spatial analysis and GIS. CRC Press; 2013.
  27. Frey, W. H. (2010). State of metropolitan America: On the front lines of demographic transformation. Retrieved April 15, 2022, from https://www.brookings.edu/wp-content/uploads/2016/07/metro_america_report1.pdf
  28. Gabriel, S. A., & Mattey, J. P. (1996). Leaving Los Angeles: Migration, economic opportunity, and the quality-of-life. Working Papers in Applied Economic Theory, 96–10. Federal Reserve Bank of San Francisco.
  29. Garcia AJ, Pindolia DK, Lopiano KK, Tatem AJ. Modeling internal migration flows in sub-Saharan Africa using census microdata. Migration Studies. 2015;3(1):89–110. doi: 10.1093/migration/mnu036. [DOI] [Google Scholar]
  30. Golding SA, Winkler RL. Tracking urbanization and exurbs: Migration across the rural–urban continuum, 1990–2016. Population Research and Policy Review. 2020;39(5):835–859. doi: 10.1007/s11113-020-09611-w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Greenwood MJ. Migration and labor market opportunities. In: Fischer MM, Nijkamp P, editors. Handbook of regional science. Springer; 2021. pp. 467–480. [Google Scholar]
  32. Holmes, N., & White, E. (2021). Pandemic patterns: California is seeing fewer entrances and more exits. Retrieved December 2, 2021, from https://www.capolicylab.org/wp-content/uploads/2021/12/Pandemic-Patterns.-California-is-Seeing-Fewer-Entrances-and-More-Exits.pdf
  33. Hong I, Jung W-S, Jo H-H. Gravity model explained by the radiation model on a population landscape. PLoS ONE. 2019;14(6):e0218028. doi: 10.1371/journal.pone.0218028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Hyatt HR, McEntarfer E, Ueda K, Zhang A. Interstate migration and employer-to-employer transitions in the U.S.: New evidence from administrative records data. Demography. 2018;55:2161–2180. doi: 10.1007/s13524-018-0720-5. [DOI] [PubMed] [Google Scholar]
  35. Johnson KM, Curtis KJ, Egan-Robertson D. Frozen in place: Net migration in sub-national areas of the United States in the era of the Great Recession. Population and Development Review. 2017;43(4):599–623. doi: 10.1111/padr.12095. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Karemera D, Oguledo VI, Davis B. A gravity model analysis of international migration to North America. Applied Economics. 2000;32(13):1745–1755. doi: 10.1080/000368400421093. [DOI] [Google Scholar]
  37. Kavousi P, Giamo T, Arnold G, Alliende M, Huynh E, Lea J, Lucine R, Tillett Miller A, Webre A, Yee A, Champagne-Zamora A, Taylor K. What do we know about opportunities and challenges for localities from Cannabis legalization? Review of Policy Research. 2022;39(2):143–169. doi: 10.1111/ropr.12460. [DOI] [Google Scholar]
  38. Kearney MS, Levine PB, Pardue L. The puzzle of falling US birth rates since the Great Recession. Journal of Economic Perspectives. 2022;36(1):151–176. doi: 10.1257/jep.36.1.151. [DOI] [Google Scholar]
  39. Keogh G. Modelling asylum migration pull-force factors in the EU-15. The Economic and Social Review. 2013;44(3):371–399. [Google Scholar]
  40. Koser Akcapar S. Re-thinking migrants’ networks and social capital: A case study of Iranians in Turkey. International Migration. 2010;48(2):161–196. doi: 10.1111/j.1468-2435.2009.00557.x. [DOI] [Google Scholar]
  41. Lamonica GR, Zagaglia B. The determinants of internal mobility in Italy, 1995–2006: A comparison of Italians and resident foreigners. Demographic Research. 2013;29:407–440. doi: 10.4054/DemRes.2013.29.16. [DOI] [Google Scholar]
  42. Larrison J, Raadschelders JC. Understanding migration: The case for public administration. International Journal of Public Administration. 2019;43(1):37–48. doi: 10.1080/01900692.2019.1620772. [DOI] [Google Scholar]
  43. Lavieri PS, Garikapati VM, Bhat CR, Pendyala RM. Investigation of heterogeneity in vehicle ownership and usage for the millennial generation. Transportation Research Record. 2017;2664(1):91–99. doi: 10.3141/2664-10. [DOI] [Google Scholar]
  44. Lewer JJ, Van den Berg H. A gravity model of immigration. Economics Letters. 2008;99(1):164–167. doi: 10.1016/j.econlet.2007.06.019. [DOI] [Google Scholar]
  45. Liu X, Andris C, Desmarais BA. Migration and political polarization in the US: An analysis of the county-level migration network. PLoS ONE. 2019;14(11):e0225405. doi: 10.1371/journal.pone.0225405. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Marandi A, Main KL. Vulnerable city, recipient city, or climate destination? Towards a typology of domestic climate migration impacts in US cities. Journal of Environmental Studies and Sciences. 2021;11(3):465–480. doi: 10.1007/s13412-021-00712-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Marré AW, Rupasingha A. School quality and rural in-migration: Can better rural schools attract new residents? Journal of Regional Science. 2020;60(1):156–173. doi: 10.1111/jors.12437. [DOI] [Google Scholar]
  48. Matthews SA, Parker DM. Progress in spatial demography. Demographic Research. 2013;28:271–312. doi: 10.4054/DemRes.2013.28.10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Molloy R, Smith CL, Wozniak A. Internal migration in the United States. Journal of Economic Perspectives. 2011;25(3):173–196. doi: 10.1257/jep.25.3.173. [DOI] [Google Scholar]
  50. Molloy R, Smith CL, Wozniak A. Job changing and the decline in long-distance migration in the United States. Demography. 2017;54(2):631–653. doi: 10.1007/s13524-017-0551-9. [DOI] [PubMed] [Google Scholar]
  51. Murphy, P., & Danielson, C. (2018). Census-related funding in California. Retrieved September 30, 2022, from https://www.ppic.org/publication/census-related-funding-in-california/
  52. Nawrotzki RJ, Brenkert-Smith H, Hunter LM, Champ PA. Wildfire-migration dynamics: Lessons from Colorado’s Fourmile Canyon Fire. Society & Natural Resources. 2014;27(2):215–225. doi: 10.1080/08941920.2013.842275. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Norlander P, Sørensen TA. 21st century slowdown: The historic nature of recent declines in the growth of the immigrant population in the United States. Migration Letters. 2018;15(3):409–422. doi: 10.33182/ml.v15i3.362. [DOI] [Google Scholar]
  54. Nowotny K, Pennerstorfer D. Network migration: Do neighbouring regions matter? Regional Studies. 2019;53(1):107–117. doi: 10.1080/00343404.2017.1380305. [DOI] [Google Scholar]
  55. Plane DA, Heins F. Age articulation of U.S. inter-metropolitan migration flows. The Annals of Regional Science. 2003;37(1):107–130. doi: 10.1007/s001680200114. [DOI] [Google Scholar]
  56. Poot J, Alimi O, Cameron MP, Maré DC. The gravity model of migration: The successful comeback of an ageing superstar in regional science. Journal of Regional Research. 2016;36:63–86. doi: 10.2139/ssrn.2864830. [DOI] [Google Scholar]
  57. Powell WW, Koput KW, Bowie JI, Smith-Doerr L. The spatial clustering of science and capital: Accounting for biotech firm-venture capital relationships. Regional Studies. 2002;36(3):291–305. doi: 10.1080/00343400220122089. [DOI] [Google Scholar]
  58. Ramos, R. (2013). Gravity models: A tool for migration analysis. IZA Discussion Paper No. 7700. 10.15185/izawol.239
  59. Rayer S, Brown DL. Geographic diversity of inter-county migration in the United States, 1980–1995. Population Research and Policy Review. 2001;20(3):229–252. doi: 10.1023/A:1010654618859. [DOI] [Google Scholar]
  60. Rogers A, Willekens F, Little J, Raymer J. Describing migration spatial structure. Papers in Regional Science. 2002;81(1):29–48. doi: 10.1111/j.1435-5597.2002.tb01220.x. [DOI] [Google Scholar]
  61. Saks RE, Wozniak A. Labor reallocation over the business cycle: New evidence from internal migration. Journal of Labor Economics. 2011;29(4):697–739. doi: 10.1086/660772. [DOI] [Google Scholar]
  62. Santana-Gallego M, Ledesma-Rodríguez FJ, Pérez-Rodríguez JV. International trade and tourism flows: An extension of the gravity model. Economic Modelling. 2016;52:1026–1033. doi: 10.1016/j.econmod.2015.10.043. [DOI] [Google Scholar]
  63. Schafran A, Wegmann J. Restructuring, race, and real estate: Changing home values and the new California metropolis, 1989–2010. Urban Geography. 2012;33(5):630–654. doi: 10.2747/0272-3638.33.5.630. [DOI] [Google Scholar]
  64. Sharygin E. Estimating migration impacts of wildfire: California’s 2017 North Bay Fires. In: Karácsonyi D, Taylor A, Bird D, editors. The Demography of Disasters. Springer; 2021. pp. 49–70. [Google Scholar]
  65. Spearing LA, Faust KM. Cascading system impacts of the 2018 Camp Fire in California: The interdependent provision of infrastructure services to displaced populations. International Journal of Disaster Risk Reduction. 2020;50:101822. doi: 10.1016/j.ijdrr.2020.101822. [DOI] [Google Scholar]
  66. Swain LL, Garasky S. Migration decisions of dual-earner families: An application of multilevel modeling. Journal of Family and Economic Issues. 2007;28(1):151–170. doi: 10.1007/s10834-006-9046-3. [DOI] [Google Scholar]
  67. Thomas M, Stillwell J, Gould M. Modelling multilevel variations in distance moved between origins and destinations in England and Wales. Environment and Planning A: Economy and Space. 2015;47(4):996–1014. doi: 10.1068/a130327p. [DOI] [Google Scholar]
  68. Treyz GI, Rickman DS, Hunt GL, Greenwood MJ. The dynamics of U.S. internal migration. The Review of Economics and Statistics. 1993;75(2):209. doi: 10.2307/2109425. [DOI] [Google Scholar]
  69. Van Bergeijk PA, Brakman S. The gravity model in international trade: Advances and applications. Cambridge University Press; 2010. [Google Scholar]
  70. Van Lottum J, Marks D. The determinants of internal migration in a developing country: Quantitative evidence for Indonesia, 1930–2000. Applied Economics. 2012;44(34):4485–4494. doi: 10.1080/00036846.2011.591735. [DOI] [Google Scholar]
  71. Varner C, Young C. Millionaire migration in California: The impact of top tax rates. National Tax Journal. 2012;64(2):255–284. [Google Scholar]
  72. Villarreal A. Explaining the decline in Mexico-U.S. migration: The effect of the Great Recession. Demography. 2014;51(6):2203–2228. doi: 10.1007/s13524-014-0351-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Vorona RD, Szklo-Coxe M, Wu A, Dubik M, Zhao Y, Ware JC. Dissimilar teen crash rates in two neighboring southeastern Virginia cities with different high school start times. Journal of Clinical Sleep Medicine. 2011;7(2):145–151. doi: 10.5664/jcsm.28101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  74. Wang Y, Li X, Yao X, Li S, Liu Y. Intercity population migration conditioned by city industry structures. Annals of the American Association of Geographers. 2022;112(5):1441–1460. doi: 10.1080/24694452.2021.1977110. [DOI] [Google Scholar]
  75. Westerlund J, Wilhelmsson F. Estimating the gravity model without gravity using panel data. Applied Economics. 2011;43(6):641–649. doi: 10.1080/00036840802599784. [DOI] [Google Scholar]
  76. Williams AM, Jephcote C, Janta H, Li G. The migration intentions of young adults in Europe: A comparative, multilevel analysis. Population, Space and Place. 2018;24(1):e2123. doi: 10.1002/psp.2123. [DOI] [Google Scholar]
  77. Winkler RL, Rouleau MD. Amenities or disamenities? Estimating the impacts of extreme heat and wildfire on domestic US migration. Population and Environment. 2021;42(4):622–648. doi: 10.1007/s11111-020-00364-4. [DOI] [Google Scholar]
  78. Wong P-H, Celbis MG. Human rights, income and international migration. International Migration. 2019;57(3):98–114. doi: 10.1111/imig.12558. [DOI] [Google Scholar]
  79. Yavorsky D, Honka E, Chen K. Consumer search in the US auto industry: The role of dealership visits. Quantitative Marketing and Economics. 2021;19(1):1–52. doi: 10.1007/s11129-020-09229-4. [DOI] [Google Scholar]
  80. Young C, Varner C, Lurie IZ, Prisinzano R. Millionaire migration and taxation of the elite: Evidence from administrative data. American Sociological Review. 2016;81(3):421–446. doi: 10.1177/0003122416639625. [DOI] [Google Scholar]
  81. Zabel JE. Migration, housing market, and labor market responses to employment shocks. Journal of Urban Economics. 2012;72(2–3):267–284. doi: 10.1016/j.jue.2012.05.006. [DOI] [Google Scholar]
  82. Zhan M, Downey C, Dyke M. International postgraduate students’ labour mobility in the United Kingdom: A cross-classified multilevel analysis. Population, Space and Place. 2021;27(1):e2381. doi: 10.1002/psp.2381. [DOI] [Google Scholar]
  83. Zhang Y, Zhang A. Determinants of air passenger flows in China and gravity model: Deregulation, LCCs, and high-speed rail. Journal of Transport Economics and Policy. 2016;50(3):287–303. [Google Scholar]

Associated Data

Data Availability Statement

The raw DMV data are restricted-use data and only available for authorized users due to data privacy laws. Other data used in this analysis are publicly available. See Appendix Table A for links and details.


Articles from Population Research and Policy Review are provided here courtesy of Nature Publishing Group
