Author manuscript; available in PMC: 2022 Sep 6.
Published in final edited form as: Econometrica. 2020 Mar;88(2):727–797. doi: 10.3982/ECTA13734

Diversity and Conflict*

Cemal Eren Arbatli , Quamrul H Ashraf , Oded Galor §, Marc Klemp
PMCID: PMC9447842  NIHMSID: NIHMS1776416  PMID: 36071951

Abstract

This research advances the hypothesis and establishes empirically that interpersonal population diversity, rather than fractionalization or polarization across ethnic groups, has been pivotal to the emergence, prevalence, recurrence, and severity of intrasocietal conflicts. Exploiting an exogenous source of variations in population diversity across nations and ethnic groups, as determined predominantly during the exodus of humans from Africa tens of thousands of years ago, the study demonstrates that population diversity, and its impact on the degree of diversity within ethnic groups, has contributed significantly to the risk and intensity of historical and contemporary civil conflicts. The findings arguably reflect the contribution of population diversity to the non-cohesiveness of society, as reflected partly in the prevalence of mistrust, the divergence in preferences for public goods and redistributive policies, and the degree of fractionalization and polarization across ethnic, linguistic, and religious groups.

Keywords: Social conflict, population diversity, ethnic fractionalization, ethnic polarization, interpersonal trust, political preferences

JEL codes: D74, N30, N40, O11, O43, Z13

1. Introduction

Over the course of the 20th century, in the period following World War II, civil conflicts have been responsible for more than 16 million casualties worldwide, well surpassing the cumulative loss of human life associated with international conflicts. Nations plagued by civil conflict have experienced significant fatalities from violence, substantial loss of productive resources, and considerable declines in their standards of living. While the number of countries experiencing conflict has declined from its peak in the early 1990s, as many as 35 nations have been afflicted by the prevalence of civil conflict since 2010, and more than a quarter of all nations encountered the incidence of civil conflict for at least a decade during the 1960–2017 time period.

This research explores the origins of the prevailing variation in the emergence, prevalence, recurrence, and severity of intrasocietal conflicts across countries, regions, and ethnic groups. It highlights one of their deepest roots, molded during the dawn of the dispersion of anatomically modern humans across the globe and its differential impact on the level of population diversity across regions. The study advances the hypothesis and establishes empirically that interpersonal diversity within each ethnic group, rather than fractionalization or polarization across ethnic groups, is pivotal for the understanding of civil conflicts. Exploiting an exogenous source of variations in population diversity across nations and ethnic groups, as determined predominantly during the exodus of Homo sapiens from Africa tens of thousands of years ago, the study establishes that interpersonal population diversity, and its impact on the degree of diversity within ethnic groups, has contributed significantly to conflicts in the course of human history. The study further suggests that the contribution of interpersonal population diversity to the non-cohesiveness of society, as reflected partly by the prevalence of mistrust, the divergence in preferences for public goods and redistributive policies, and the degree of fractionalization and polarization across ethnic, linguistic, and religious groups, has fostered social, political, and economic instability and magnified the vulnerability of society to internal conflicts.

Population diversity at the national or subnational level may contribute to intergroup as well as intra-group conflicts through several mechanisms. First, population diversity may have an adverse effect on the prevalence of mutual trust, and excessive diversity could therefore depress the level of social capital below a threshold that could have averted the emergence of social, political, and economic grievances and, thus, prevented violent hostilities. Second, to the extent that population diversity captures interpersonal divergence in preferences for public goods and redistributive policies, highly diverse societies may find it difficult to reconcile such differences through collective action, thereby intensifying their susceptibility to conflict. Third, insofar as population diversity reflects interpersonal heterogeneity in traits that are differentially rewarded, it can potentially cultivate resentments that are rooted in inequality, thereby magnifying the vulnerability to internal belligerence.

Moreover, the prehistorical variation in the level of population diversity across regions and its potential role in facilitating the formation of ethnic groups may have contributed to the emergence of social conflicts. In particular, following the “out of Africa” migration of humans, the initial endowment of population diversity in each region may have influenced the process of group formation, reflecting the trade-off associated with the scale of the population. While a larger group may benefit from economies of scale, its productivity tends to be affected adversely by its incohesiveness. Thus, in light of the adverse impact of diversity on social cohesiveness, a larger initial endowment of population diversity has plausibly led to the emergence of a larger number of groups and, due to the forces of “cultural drift” and “biased transmission” of cultural markers (e.g., traditions, norms, and dialects), to the formation of distinct ethnic identities. The emergent fragmentation could have fueled excessive inter-group competition and dissension, and could have created fertile grounds for the use of a divide-and-rule strategy by political elites, contributing to the emergence of conflict.

The exploration of the contribution of interpersonal population diversity to conflict within nations and ethnic groups relies on a novel measure that encompasses various dimensions of population diversity – proportional representation of ethnic groups, interpersonal diversity between groups, and interpersonal diversity within groups. While some aspects of population diversity at the national level can be captured by indexes of ethnolinguistic fractionalization and polarization, these measures predominantly reflect the proportional representation of ethnic groups in the population, disregarding the importance of the degree of interpersonal diversity within each ethnic group for the overall level of diversity at the national level. These deficient measures of population diversity may thus obfuscate the true impact of population diversity on civil conflicts within nations, and they do not permit the exploration of the role of diversity within an ethnic group on either intra-group or inter-group conflicts.

Exploiting variations across countries and ethnic homelands, the analysis demonstrates that interpersonal population diversity within and between ethnic groups has contributed fundamentally – as illustrated in Figure 1 – to the emergence, prevalence, recurrence, and severity of historical and contemporary intrasocietal conflicts across countries, regions, and ethnic groups. Furthermore, the country-level analysis documents that the contribution of population diversity to intrastate conflicts has plausibly operated partly via the number of ethnic groups in the population, the prevalence of mistrust, and the degree of dispersion in political preferences.

Figure 1: The Evolution of Population Diversity in a Location and Its Impact on Conflict.

Notes: Solid arrows represent hypothesized links that are confirmed by the empirical analysis, whereas dashed arrows represent hypothesized links that do not gain consistent support. In particular, interpersonal diversity within as well as between groups affect both inter-group and intra-group conflict, partly via their adverse effect on social cohesion within and across ethnic groups.

The dual analysis at the national and at the ethnic-homeland levels has several virtues. First, it permits the exploration of the impact of population diversity on the emergence of conflicts in societies of different scales, suggesting that population diversity reduces social cohesion and increases the likelihood of social conflicts within national as well as subnational populations. Second, since the boundaries of ethnic homelands largely predate the formation of modern nation states, the ethnic-homeland level analysis mitigates potential concerns regarding the impact of population diversity and internal conflicts on contemporary national borders (Alesina and Spolaore, 2003). Third, the focus on ethnic groups as well as on national populations permits the analysis to disentangle the impact of population diversity within an ethnic group, from the impact of ethnic diversity across groups, in the emergence of inter-group as well as intra-group conflicts. Fourth, because populations within ethnic homelands have been largely native to their locations, the analysis at the ethnicity level diminishes potential concerns about the effect of conflicts on migrations across countries and on the global distribution of national population diversity.

The research employs several empirical strategies to mitigate concerns about the potential role of reverse causality, omitted cultural, geographical, and human characteristics, as well as sorting, in the observed association between population diversity and intrasocietal conflicts. In the course of human history, conflicts have plausibly altered the observed levels of diversity within ethnic groups, and the association between observed population diversity within an ethnic group and intra-group conflict may partly reflect reverse causality from conflict to diversity. Furthermore, the association between population diversity and internal conflicts at the ethnicity level may be governed by omitted cultural, geographical, and human characteristics. In order to mitigate these concerns, the empirical analysis exploits the negative association between the observed population diversity of an indigenous contemporary ethnic group and its migratory distance from East Africa, due to the serial founder effect (e.g., Harpending and Rogers, 2000; Ramachandran et al., 2005; Ashraf and Galor, 2013a), to predict population diversity for a globally representative sample of more than 900 ethnic groups.1

Nevertheless, several scenarios could a priori weaken the credibility of this methodology. First, selective migration out of Africa, or natural selection along the migratory paths, could have affected human traits and, therefore, conflict independently of the impact of migratory distance from Africa on the degree of diversity in human traits. However, while migratory distance from Africa has a significant negative association with the degree of diversity in human traits, it appears to be uncorrelated with the mean level of traits in a population, such as height, weight, and skin reflectance, conditional on distance from the equator (Ashraf and Galor, 2013a). Second, migratory distance from Africa could be correlated with distances from focal historical locations (e.g., technological frontiers) and could, therefore, capture the effect of these other distances on the process of development and the emergence of conflicts, rather than the effect of these migratory distances via population diversity. Nevertheless, conditional on migratory distance from East Africa, distances from historical technological frontiers in the years 1, 1000, and 1500 do not qualitatively alter the impact of predicted diversity on internal conflicts, further justifying the reliance on the “out of Africa” hypothesis and the serial founder effect for identifying the influence of population diversity on intrasocietal conflicts.

Moreover, a threat to identification would emerge if the actual migratory paths from Africa were correlated with geographical characteristics that are directly conducive to conflict (e.g., soil quality, ruggedness, climatic conditions, and propensity to trade). This would have required, however, that the conduciveness of these geographical characteristics to conflicts be aligned along the main root of the migratory path out of Africa as well as along each of the main forks that emerge from this primary path. In particular, in several important forks of this migration process (e.g., the Fertile Crescent and the associated eastward migration into Asia and westward migration into Europe), geographical characteristics that are conducive to conflicts would have to diminish symmetrically along these divergent secondary migratory paths. Nevertheless, the analysis establishes that the results are qualitatively unaffected when it accounts for a wide range of potentially confounding geographical characteristics of ethnic homelands, spatial dependence, as well as time-invariant unobserved heterogeneity in each region, identifying the association between interpersonal population diversity and internal conflicts across societies in the same region.

The observed association between population diversity and internal conflict at the ethnic-homeland level may further reflect the sorting of less diverse populations into geographical niches that are less conducive to conflicts. While sorting would not affect the existence of a positive association between population diversity and conflicts, it would weaken the proposed interpretation of this association. However, such sorting would require that the spatial distribution of ex-ante conflict risk be negatively correlated with migratory distance from Africa, and that the conduciveness of geographical characteristics to conflicts be negatively aligned with the primary migratory path out of Africa as well as with each of the main subsequent forks and their associated secondary migratory paths. These concerns are further mitigated by accounting for heterogeneity in a wide range of geographical characteristics across ethnic homelands, spatial autocorrelation, and regional fixed effects.

Further, to the extent that interregional migration flows in the post-1500 era, and thus the proportional representation of ethnic groups within each national population, may have been affected by historically persistent spatial patterns of conflict risk, contemporary national population diversity may be endogenous to intrastate conflicts. Thus, to mitigate these concerns, two alternative empirical strategies are developed, yielding remarkably similar results. The first strategy confines the analysis to variations in a sample of countries that belong only to the Old World (i.e., Africa, Europe, and Asia), where diversity of contemporary national populations predominantly reflects the diversity of indigenous populations that became native to their current locations well before the colonial era. This strategy rests on the observation that post-1500 population movements within the Old World did not result in the significant admixture of populations that were very distant from one another. The second strategy exploits variations in a globally representative sample of countries using an estimator in which the migratory distance of a country’s prehistorically native population from East Africa is employed as an instrumental variable for the diversity of its contemporary national population. It rests on the identifying assumption that the migratory distance of a country’s prehistorically native population from East Africa is exogenous to the risk of intrastate conflict faced by the country’s overall population in the last half-century.
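The logic of the second strategy can be illustrated with a minimal sketch of the simplest instrumental-variable estimator (one instrument, one endogenous regressor, no controls), in which the coefficient is the ratio of the instrument's covariance with the outcome to its covariance with the regressor. This is not the estimator used in the study, which conditions on a wide set of covariates. The data below are synthetic and constructed by hand, and the names `distance`, `confounder`, `diversity`, and `conflict` are hypothetical. The confounder is built to be uncorrelated with the instrument, so the IV ratio recovers the assumed effect of 0.5 while ordinary least squares does not.

```python
def mean(values):
    return sum(values) / len(values)


def cross_dev(a, b):
    """Sum of cross-products of deviations from the mean (unnormalized covariance)."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x."""
    return cross_dev(x, y) / cross_dev(x, x)


def iv_slope(z, x, y):
    """Just-identified IV (ratio) estimator: cov(z, y) / cov(z, x)."""
    return cross_dev(z, y) / cross_dev(z, x)


# Synthetic illustration only (not data from the study).
distance = [1, 2, 3, 4, 5, 6]        # instrument: migratory distance
confounder = [1, -2, 1, 1, -2, 1]    # unobserved; uncorrelated with distance by construction
# Diversity falls with distance and is shifted by the confounder.
diversity = [10 - z + u for z, u in zip(distance, confounder)]
# Assumed true effect of diversity on conflict is 0.5; the confounder also raises conflict.
conflict = [0.5 * x + 2 * u for x, u in zip(diversity, confounder)]

ols_estimate = ols_slope(diversity, conflict)
iv_estimate = iv_slope(distance, diversity, conflict)
print(f"OLS: {ols_estimate:.3f}  IV: {iv_estimate:.3f}")
```

In this constructed sample the OLS slope is biased upward because the confounder moves diversity and conflict in the same direction, whereas the IV ratio uses only the variation in diversity that is driven by distance.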

The empirical analysis at the country level establishes that, accounting for the potentially confounding effects of geographical and institutional characteristics, ethnolinguistic fragmentation, outcomes of economic development, and continent fixed effects, an increase in national population diversity that corresponds to the movement from the 10th to the 90th percentile of its global cross-country distribution (i.e., a movement from the diversity level of the Republic of Korea to that of the Democratic Republic of Congo) is associated with 2.3 new civil conflict outbreaks during the 1960–2017 time horizon (relative to a sample mean of 1.2 and a standard deviation of 1.7 new civil conflict outbreaks). In addition, this increase in diversity is also associated with (i) an increase in the likelihood of observing the incidence of civil conflict in any given 5-year interval during the 1960–2017 period from 18 percent to 34 percent; (ii) an increase in the likelihood of observing the onset of a new civil conflict in any given year during the 1960–2017 time horizon from 1 percent to 4 percent; (iii) an increase in the likelihood of observing the incidence of one or more intra-group factional conflict events in any given year during the 1985–2006 time horizon from 6 percent to 60 percent; and (iv) an increase in the intensity of social unrest by either 26 percent or 38 percent of a standard deviation of the observed distribution of intrastate conflict severity across countries in the post-1960 time period (depending on the employed measure of intrastate conflict severity).

Similarly, the analysis at the ethnic-homeland level establishes that, accounting for the potentially confounding influence of a wide range of geographical and historical factors, outcomes of economic development, and regional fixed effects, an increase in observed population diversity of an ethnic group from the 10th percentile (e.g., the Mamusi people of Oceania) to the 90th percentile (e.g., the Pare people of Eastern Africa) of its global distribution is associated with an increase in the prevalence of conflicts within the territory of a homeland over the years 1989–2008 by 0.43 (relative to a sample mean of 0.14 and a standard deviation of 0.27). Further, this change in ethnic population diversity is also associated with an increase of about 57 conflict events, 9,731 conflict-related deaths, and 924 deaths per conflict during the same time period.

2. Related Literature

This study is related to several well-established lines of inquiry. Specifically, the paper contributes to the vast literature on the determinants of civil conflict. The determinants of civil conflict have been the focus of intensive research over the past two decades, highlighting the role of social, political, and economic grievances, along with the capability of the state to subdue armed opposition groups, the conduciveness of geographical characteristics towards rebel insurgencies, and the opportunity cost of engaging in rebellions, among other contributing factors (Sambanis, 2002; Fearon and Laitin, 2003; Collier and Hoeffler, 2007; Blattman and Miguel, 2010). The present study advances the understanding of the nature of grievance-related mechanisms in civil conflict, emphasizing the role of interpersonal population diversity and its deep determinants on the emergence of intra-group as well as inter-group social divisions.

The role of fractionalization was initially at the forefront of empirical analyses of the underlying determinants of civil conflict, in light of the conventional wisdom that inter-group competition over ownership of productive resources and political power, along with conflicting preferences for public goods and redistributive policies, are more difficult to reconcile in societies that are fragmented ethnolinguistically. Nevertheless, early evidence regarding the influence of ethnic, linguistic, and religious fractionalization on the risk of civil conflict in society had been largely inconclusive (Fearon and Laitin, 2003; Collier and Hoeffler, 2007), arguably due in part to conceptual limitations associated with fractionalization indices. The introduction of polarization indices to the analyses of civil conflict has led to more affirmative findings demonstrating that inter-group grievances are indeed contributors to the risk of civil conflict in society (Montalvo and Reynal-Querol, 2005; Esteban et al., 2012).2

Nevertheless, while measures of ethnolinguistic fragmentation are unable to account for the potentially critical role of intra-group heterogeneity in augmenting the risk of conflict in society at large, a central virtue of the proposed measure of population diversity is that it captures the impact of diversity across individuals within ethnic groups. Furthermore, even as a proxy for interethnic divisions, the proposed measure generates substantial insights relative to existing proxies that are based on fractionalization and polarization indices. Specifically, the commonly used measures of ethnolinguistic fragmentation typically do not exploit information beyond the proportional representations of ethnolinguistically differentiated groups in the national population – namely, they implicitly assume that these ethnic groups are internally homogenous and culturally “equidistant” from one another.3 In contrast, the proposed measure of national population diversity incorporates information on pairwise inter-group genetic distances, as well as the genetic diversity within each ethnic group, as determined predominantly over the course of the “out of Africa” demic diffusion of humans to the rest of the globe tens of thousands of years ago.4

Moreover, the use of conventional measures of ethnolinguistic fragmentation in the exploration of the impact of fragmentation on conflict is unsatisfactory due to plausible concerns about reverse causality and measurement error. Due to the association of conflict with atrocities as well as voluntary and forced migrations, the degree of ethnolinguistic fractionalization is likely to be affected by past and potential conflicts. Although the proposed measure of population diversity exploits information on the population shares of subnational groups possessing ethnically differentiated ancestries, because the endowment of population diversity in a given location was overwhelmingly determined during the prehistoric “out of Africa” expansion of humans, the analysis is able to exploit a plausibly exogenous source of the contemporary cross-country variation in this measure, thereby mitigating the biases associated with measurement and endogeneity issues that plague the widely used proxies of ethnolinguistic fragmentation. Furthermore, in contrast to the plausibly exogenous component of population diversity, the degree of ethnolinguistic fragmentation may be systematically mismeasured in more conflict-prone societies, due to (i) the political economy of national census categorizations of subnational groups, and (ii) the endogenous constructivism of individual self-identification with an ethnic group (Eifert et al., 2010; Caselli and Coleman, 2013; Besley and Reynal-Querol, 2014).

The present study also contributes to a vast literature that explores the impact of ethnolinguistic fragmentation and interethnic economic inequality on other societal outcomes, including the rate of economic growth, the quality of national institutions, the extent of financial development, the efficiency in the provision of public goods, and the level of social capital (Easterly and Levine, 1997; Alesina and La Ferrara, 2005; Alesina et al., 2016). In particular, since population diversity encompasses the degree of heterogeneity within each ethnic group as well as the pairwise distances amongst them, the current analysis is uniquely positioned to capture the contribution of these additional dimensions of diversity to social dissonance and aggregate inefficiency.

Furthermore, in light of the view that the contemporary variation in population diversity across the globe predominantly reflects the human expansion out of Africa tens of thousands of years ago, the paper contributes to the exploration of the role of deeply rooted human characteristics in comparative economic development. In particular, the study contributes to the understanding of the importance of interpersonal population diversity for social outcomes in the course of human history (e.g., population density, urbanization, and income) as explored by Ashraf and Galor (2013a).5

Finally, the study is consistent with the primordialist theories of conflict, maintaining that ethnic conflict springs from differences in ethnic identity, as well as with the instrumentalist theories, suggesting that ethnic conflict may emerge for pragmatic reasons (e.g., inequality, security, and competition).6 In particular, since the initial endowment of interpersonal population diversity at a given location may have facilitated the endogenous formation of groups, whose collective identities diverged over time under the forces of cultural drift, a reduced-form link between the prehistorically determined diversity and the contemporary risk of interethnic conflict may well be apparent in the data, regardless of whether these groups are mobilized into conflict by primordial or instrumentalist reasons.

3. Population Diversity and Conflict at the Country Level

3.1. Empirical Framework and Strategy

This section describes the various layers of the country-level analyses of the influence of population diversity on intrastate conflicts, the key variables employed, and the strategies implemented to identify the impact of population diversity on conflict.

The analysis initially focuses on contemporary conflicts, exploiting variations in either cross-country or repeated cross-country data. It explores the explanatory power of interpersonal population diversity for (i) the average frequency of new conflict outbreaks, (ii) the persistence of conflicts, as captured by the likelihood of conflict prevalence, and (iii) the likelihood of conflict outbreak. It then analyzes the impact of interpersonal diversity on intra-group factional conflicts within a national population. Finally, it explores the influence of interpersonal diversity on conflicts in the distant past.

Following the convention in the civil conflict literature, the contemporary analysis is confined to the post-1960 time period, when most of the European colonies in Sub-Saharan Africa, the Middle East, and South and Southeast Asia had already gained independence. This time horizon thus permits an assessment of the correlates of civil conflict at the national level, independently of their interactions with the contemporaneous influence of the colonial powers. The baseline sample for the contemporary analysis contains information on 150 countries for the 1960–2017 time period, of which 123 are in the Old World.

3.1.1. Main Outcome Variables: Frequency, Incidence, and Onset of Civil Conflict

The main outcome variable in the cross-country regressions is the average number of new civil conflicts per annum during the 1960–2017 time period. It is based on conflict events listed in version 18.1 of the UCDP/PRIO Armed Conflict Dataset (Gleditsch et al., 2002; Pettersson and Eck, 2018). In this data set, a civil conflict is defined as an armed conflict between the government of a state and internal opposition groups over a given incompatibility. Recurrent episodes of the same conflict between state actors and armed opposition groups are not treated as new conflicts. The study employs the most comprehensive armed conflict coding (PRIO25), encompassing all civil conflict events that resulted in at least 25 battle-related deaths in a given year.

The country-level analysis additionally exploits the temporal dimension of armed conflict events, examining the incidence of PRIO25 civil conflicts in a repeated cross-section of countries. In this analysis, the outcome variable is an indicator, coded 1 for each country-period (a period being a 5-year time interval) in which at least one active PRIO25 civil conflict is observed, and 0 otherwise. The study also examines the predictive power of population diversity for the onset of new PRIO25 civil conflicts in annually repeated cross-country data. This variable is coded 1 for each year in which at least one new PRIO25 civil conflict had erupted, and 0 otherwise. Moreover, outbreaks of subsequent episodes of the same conflict are not considered new conflict onsets.
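The three outcome codings described above (annual frequency of new conflicts, 5-year incidence, and annual onset) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the input is a hypothetical list of `(conflict_id, year)` pairs marking years in which a conflict is active, a conflict is treated as new only in the first year its identifier appears (so later episodes of the same conflict are not new onsets), and the function names are invented for the example.

```python
def first_years(conflict_years):
    """Map each conflict id to the first year in which it is active."""
    first = {}
    for conflict_id, year in conflict_years:
        first[conflict_id] = min(year, first.get(conflict_id, year))
    return first


def onset_by_year(conflict_years, start, end):
    """1 for each year in which at least one new conflict erupts, 0 otherwise."""
    onset_years = set(first_years(conflict_years).values())
    return {year: int(year in onset_years) for year in range(start, end + 1)}


def incidence_by_period(conflict_years, start, end, width=5):
    """1 for each period with at least one active conflict-year, 0 otherwise."""
    active = {year for _, year in conflict_years}
    result = {}
    for p_start in range(start, end + 1, width):
        p_end = min(p_start + width - 1, end)
        result[(p_start, p_end)] = int(
            any(year in active for year in range(p_start, p_end + 1))
        )
    return result


def new_conflicts_per_annum(conflict_years, start, end):
    """Average number of new conflicts per year over the sample window."""
    onsets = [y for y in first_years(conflict_years).values() if start <= y <= end]
    return len(onsets) / (end - start + 1)


# Hypothetical country: conflict "A" is active in 1962-1963 and recurs in 1968;
# conflict "B" begins in 1971.
example = [("A", 1962), ("A", 1963), ("A", 1968), ("B", 1971), ("B", 1972)]

onset = onset_by_year(example, 1960, 1979)
incidence = incidence_by_period(example, 1960, 1979)
frequency = new_conflicts_per_annum(example, 1960, 1979)
```

In this example the 1968 recurrence of conflict "A" counts toward incidence in the 1965–1969 period but not as a new onset. A conflict that was already active before the sample window would need separate handling, which the sketch omits.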

3.1.2. Population Diversity: Measurement and Identification Strategy

The interpersonal population diversity of each country is captured by the measure of predicted genetic diversity developed by Ashraf and Galor (2013a). It is based on (i) the proportional representation of each of the ancestral populations of a contemporary nation, (ii) the genetic diversity of each of these ancestral populations, as predicted by its migratory distance from Africa, and (iii) the pairwise genetic distances between each pair of these ancestral populations, as predicted by their migratory distances from one another.

Observed genetic diversity at the ethnic group level is measured by an index referred to by population geneticists as expected heterozygosity. This index reflects the probability that two individuals, selected at random from the relevant population, are different from one another with respect to a given spectrum of genetic traits. The index is constructed by population geneticists using data on allelic frequencies (i.e., the frequency with which a gene variant or allele occurs in a given population).7 Expected heterozygosity, Hexp, takes the form:

$$H_{\text{exp}} = 1 - \frac{1}{m} \sum_{l=1}^{m} \sum_{i=1}^{k_l} p_i^2,$$

where $m$ is the number of genes or DNA loci in the sample, $k_l$ is the number of observed variants or alleles of gene $l$, and $p_i$ denotes the frequency of occurrence of the $i$th allele of that gene.
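As a worked illustration of the index, the following minimal sketch computes expected heterozygosity from per-locus allele frequencies: one minus the average, across loci, of the sum of squared allele frequencies. The function name and the example frequencies are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def expected_heterozygosity(allele_freqs_by_locus):
    """Expected heterozygosity: 1 - (1/m) * sum over loci of sum_i p_i**2.

    allele_freqs_by_locus: one list of allele frequencies per locus;
    the frequencies at each locus must sum to 1.
    """
    m = len(allele_freqs_by_locus)
    if m == 0:
        raise ValueError("at least one locus is required")
    total_homozygosity = 0.0
    for freqs in allele_freqs_by_locus:
        if abs(sum(freqs) - 1.0) > 1e-9:
            raise ValueError("allele frequencies at a locus must sum to 1")
        # Probability that two random draws at this locus carry the same allele.
        total_homozygosity += sum(p * p for p in freqs)
    return 1.0 - total_homozygosity / m


# Hypothetical population with two loci: two equally common alleles at the
# first locus and four equally common alleles at the second.
h_example = expected_heterozygosity([[0.5, 0.5], [0.25, 0.25, 0.25, 0.25]])
# A population with a single allele at its only locus has no diversity.
h_monomorphic = expected_heterozygosity([[1.0]])
```

Here the per-locus sums of squared frequencies are 0.5 and 0.25, so their average is 0.375 and the index equals 0.625; the monomorphic case gives 0.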

Population geneticists have computed this index of expected heterozygosity, along with pairwise genetic distances, for a sample of 53 globally representative ethnic groups from the Human Genome Diversity Cell Line Panel.8 These ethnic groups have been not only prehistorically native to their current geographical locations but also largely isolated from genetic flows from other ethnic groups. The index is constructed using data on allelic frequencies for a particular class of DNA loci called microsatellites, residing in non-protein-coding or “neutral” regions of the human genome – i.e., regions that do not directly result in phenotypic expression. Thus, this measure of observed genetic diversity has the advantage of not being tainted by the differential forces of natural selection that may have operated on these populations since their prehistoric exodus from Africa.

Nevertheless, like measures of ethnolinguistic fragmentation based on fractionalization or polarization indices, observed genetic diversity might be endogenous to civil conflict, since it could be tainted by genetic admixtures resulting from the movement of populations across space, triggered by cross-regional differences in patterns of historical conflict potential, the nature of political institutions, and levels of economic prosperity. To circumvent this concern, the analysis is based on the measure of predicted genetic diversity introduced by Ashraf and Galor (2013a). Exploiting the explanatory power of a serial founder effect associated with the “out of Africa” migration process, the diversity of a country’s prehistorically indigenous population is predicted by the coefficients obtained from an ethnic-group-level regression of expected heterozygosity on migratory distance from Addis Ababa in the aforementioned sample comprising 53 globally representative ethnic groups from the Human Genome Diversity Cell Line Panel. This measure captures the component of observed interpersonal diversity within a country’s indigenous ethnic groups that is predicted by migratory distance from Addis Ababa to the country’s modern-day capital city, along prehistoric land-connected human migration routes.9
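The prediction step can be sketched as a bivariate least-squares fit of expected heterozygosity on migratory distance, followed by evaluation of the fitted line at a country's own migratory distance. The sample below is invented for illustration and is not the 53-group data or the estimates used in the study; the function names, the units (thousands of km), and the example distance of 12 are likewise assumptions.

```python
def ols_fit(x, y):
    """Return (intercept, slope) from a bivariate OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict_diversity(distance, intercept, slope):
    """Predicted expected heterozygosity at a given migratory distance."""
    return intercept + slope * distance


# Invented ethnic-group sample (illustration only): migratory distance from
# Addis Ababa in thousands of km, and observed expected heterozygosity.
group_distance = [2.0, 6.0, 10.0, 18.0, 24.0]
group_heterozygosity = [0.76, 0.73, 0.71, 0.66, 0.61]

intercept, slope = ols_fit(group_distance, group_heterozygosity)

# Hypothetical country whose capital lies 12 thousand km from Addis Ababa
# along the migration routes.
predicted = predict_diversity(12.0, intercept, slope)
```

The fitted slope is negative in this invented sample, mirroring the serial founder effect: predicted diversity declines with migratory distance from East Africa.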

In the absence of systematic and large-scale population movements across geographically (and, thus, genetically) distant regions, as had been largely true during the precolonial era, the interpersonal diversity of the prehistorically native population in a given location serves as a good proxy for the contemporary population diversity of that location. While this continues to remain true to a large extent for nations in the Old World (i.e., Africa, Europe, and Asia), post-1500 population flows from the Old World to the New World have had a considerable impact on the ethnic composition and, thus, the contemporary interpersonal diversity of national populations in the Americas and Oceania. Thus, instead of employing the interpersonal diversity of prehistorically native populations (i.e., precolonial diversity) at the expense of limiting the analysis to the Old World, the measure of ancestry-adjusted genetic diversity from Ashraf and Galor (2013a) is employed as the main proxy for contemporary population diversity. Using the shares of different groups in a country’s modern-day population, this measure accounts for (i) the diversity within the ethnic groups that can trace their own ancestry around year 1500 to their current homelands, (ii) the diversity of those descended from immigrant settlers over the past half-millennium, and (iii) the additional component of population diversity at the national level that arises from the pairwise genetic distances amongst these different subnational groups.10

However, ancestry-adjusted population diversity may still be afflicted by endogeneity bias because it accounts for the impact of cross-country migrations in the post-1500 era on the diversity of contemporary national populations. In particular, these migrations may have been spurred by historically persistent spatial patterns of conflict. Two alternative strategies are implemented to address this issue. The first strategy is to exploit variations across countries that only belong to the Old World, where, as discussed previously, the interpersonal diversity of contemporary national populations overwhelmingly reflects the diversity within populations that have been native to their current locations since well before the colonial era. This strategy is based on the view that the great human migrations of the post-1500 era had systematically differential impacts on the genetic composition of national populations in the Old World versus the New World. Specifically, although post-1500 population flows had a dramatic effect on the interpersonal diversity of national populations in the Americas and Oceania, the diversity of populations in Africa, Europe, and Asia remained largely unaltered, primarily because native populations in the Old World were not subjected to substantial inflows of migrants who were descended from genetically distant ancestral populations. By confining the analysis to the Old World, this strategy effectively exploits the spatial variation in contemporary population diversity that largely coincides with the variation in diversity of prehistorically indigenous populations, as determined overwhelmingly by an ancient serial founder effect associated with the “out of Africa” migration process.

The second strategy employs the migratory distance of the prehistorically native populations in each country from East Africa as an instrument for the country’s contemporary population diversity. This strategy utilizes the observation that the mark of ancient population bottlenecks that occurred during the prehistoric “out of Africa” demic diffusion of humans across the globe continues to be seen in the worldwide pattern of genetic diversity across contemporary national populations, as reflected by the sizable correlation of 0.75 between the proxies for precolonial and contemporary population diversity in a global sample of countries. This strategy rests on the identifying assumption that the migratory distance of a country’s prehistorically indigenous population from East Africa has no direct effect on the potential for civil conflict faced by its modern national population, conditional on a large set of controls for the geographical and institutional determinants of conflict as well as the correlates of economic development.

3.1.3. Confounding Characteristics

The vast empirical literature on civil conflict has considered a large number of contributing factors. Drawing on this literature, a wide range of control variables are included in the baseline specifications. The discussion below describes these potential confounders. Additional control variables used in robustness checks are discussed in corresponding Appendices.11

Geographical Characteristics

The study accounts for a wide range of geographical attributes that may be correlated with prehistoric migratory distance from East Africa and can influence conflict risk through channels unrelated to population diversity. Absolute latitude and distance to the nearest waterway, for instance, can exert an influence on economic development and, thus, on conflict potential through climatological, institutional, and trade-related mechanisms.

Rugged terrains can provide safe havens for rebels and enable them to sustain continued resistance by protecting them from superior government forces (Fearon and Laitin, 2003). Moreover, in regions with rough terrains, subgroups of a regional population may be geographically more isolated. Such isolation may strengthen the forces of “cultural drift” and ethnic differentiation among these groups (Michalopoulos, 2012), thus increasing the potential for inter-group conflict. Further, in light of evidence that conditional on their respective country-level means, greater intracountry dispersion in agricultural land suitability and elevation can contribute to ethnolinguistic diversity (Michalopoulos, 2012), these natural attributes could also generate an indirect influence on conflict propensity through the ethnolinguistic fragmentation of the population.12 To account for these factors, the baseline analysis controls for terrain ruggedness, as well as the mean and range of both agricultural land suitability and elevation.

The baseline specifications also include a dummy for island nations. Due to their greater isolation in space, island nations possibly followed different historical trajectories than nations that are connected by land to one another. For example, the settlement process that took place in island nations and their relative immunity from cross-border spillovers may influence both population diversity and conflict potential. Finally, the baseline specifications additionally account for a complete set of continent fixed effects to ensure that the estimated reduced-form impact of population diversity on conflict potential is not simply reflecting the latent influence of unobserved time-invariant cultural, institutional, and geographical factors at the continent level.

Institutional Factors

Colonial legacies may have significantly shaped the political economy of interethnic cleavages in newly independent states (Posner, 2003). More generally, the heritage of colonial rule and the identity of the former colonizers may have important ramifications for the nature and stability of contemporary political institutions at the national level, thereby influencing the potential for conflict in society. Two different sets of covariates are included in the baseline specifications to account for the impact of colonial legacies. Depending on the unit of analysis, the first set comprises either binary indicators for the historical prevalence of colonial rule (as is the case in the cross-country regressions) or time-varying measures of the lagged prevalence of colonial rule (as is the case in the regressions using repeated cross-country data). In either case, a distinction is made between colonial rule by the U.K., France, and any other major colonizing power. The second set of covariates comprises time-invariant binary indicators for British and French legal origins, included to account for any latent influence of legal codes and institutions that may not necessarily be captured by colonial experience.

The baseline specifications additionally include three control variables, all based on yearly data at the country level from the Polity IV Project, in order to account for the direct influence of contemporary political institutions on the risk of civil conflict. The first variable is based on an ordinal index that reflects the degree of executive constraints in any given year, whereas the other two variables are based on binary indicators for the type of political regime, reflecting the prevalence of either democracy (when the polity score is above 5) or autocracy (when the polity score is below −5) in a given year.13
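The regime indicators defined above lend themselves to a short sketch. It applies the stated thresholds (a polity score above 5 for democracy, below −5 for autocracy) to a series of yearly scores and averages the indicators over time, in the spirit of the temporal means used in the cross-country regressions. The function names and the example scores are hypothetical and are not drawn from the Polity IV data.

```python
def classify_regime(polity_score):
    """Classify one country-year using the thresholds stated in the text."""
    if polity_score > 5:
        return "democracy"
    if polity_score < -5:
        return "autocracy"
    return "neither"


def regime_shares(polity_scores):
    """Fraction of years classified as democracy and as autocracy."""
    labels = [classify_regime(score) for score in polity_scores]
    n = len(labels)
    return {
        "democracy": labels.count("democracy") / n,
        "autocracy": labels.count("autocracy") / n,
    }


# Hypothetical yearly polity scores for a single country.
scores = [-7, -6, -5, 0, 5, 6, 8, 10]
shares = regime_shares(scores)
print(shares)
```

Scores of exactly 5 or −5 fall into neither category under the strict inequalities, so the two shares need not sum to one.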

Ethnolinguistic Fragmentation

Previous empirical findings regarding the role of ethnic fragmentation in civil conflicts have been somewhat mixed, exhibiting substantial sensitivity to model specifications and conflict codings (Fearon and Laitin, 2003). Moreover, theoretical work on the link between the ethnic composition of a society and the risk of civil conflict suggests that ethnic fractionalization by itself may be insufficient to fully capture the conflict potential that can be attributed to broader ethnolinguistic configurations of the population (Esteban and Ray, 2011a). In light of their well-grounded structural foundations, indices of polarization have gained popularity as a substitute for – or in addition to – the fractionalization measures commonly considered by empirical analyses of civil conflict. Indeed, many empirical studies find that ethnic polarization is a stronger predictor of the likelihood of civil conflict (e.g., Montalvo and Reynal-Querol, 2005; Esteban et al., 2012).

Two time-invariant controls are thus included in the baseline specifications to capture the influence of the ethnolinguistic composition of national populations on the potential for civil conflict. The first proxy is the well-known ethnic fractionalization index of Alesina et al. (2003), reflecting the probability that two individuals, randomly selected from a country’s population, will belong to different ethnic groups. The second proxy for this channel is an index of ethnolinguistic polarization, obtained from the data set of Desmet et al. (2012). The authors provide measures of several such polarization indices, constructed at different levels of aggregation of linguistic groups in a country’s population (based on hierarchical linguistic trees). The specific polarization measure employed here corresponds to the most disaggregated level of the linguistic tree and reflects the extent of polarization across subnational groups classified according to modern-day languages.14
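The probabilistic definition of ethnic fractionalization given above corresponds to one minus the sum of squared group shares, which is the chance that two independent draws from the population land in different groups. The sketch below computes that quantity; the function name and the example shares are illustrative and are not taken from the Alesina et al. (2003) data. The polarization index is not sketched here because its exact functional form is not stated in this section.

```python
def fractionalization(shares):
    """Probability that two randomly drawn individuals belong to different groups.

    `shares` are population shares of mutually exclusive groups summing to one.
    """
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("group shares must sum to one")
    return 1.0 - sum(s * s for s in shares)


# Hypothetical countries: three unequal groups, one homogeneous group,
# and four equally sized groups.
print(fractionalization([0.5, 0.3, 0.2]))
print(fractionalization([1.0]))
print(fractionalization([0.25, 0.25, 0.25, 0.25]))
```

The index is zero for a fully homogeneous population and rises as the population is split into more groups of similar size.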

Natural Resources and Development Outcomes

Natural resources can foster the risk of civil conflict by weakening political institutions and facilitating state capture, easing the financial constraints on rebel organizations (e.g., Fearon and Laitin, 2003; Dube and Vargas, 2013; Collier and Hoeffler, 2007), increasing the vulnerability of political elites to terms-of-trade shocks (e.g., Humphreys, 2005), and raising the return to regional secession (e.g., Ross, 2006). The baseline specifications thus include an indicator for the presence of oil or gas reserves.

Average living standards can influence civil conflict potential in a country through several channels. One argument, due to Grossman (1991) and Hirshleifer (1995), is that higher per-capita incomes raise the opportunity cost for potential rebels to engage in insurrections, thus predicting an inverse relationship between the level or growth rate of income, on the one hand, and the risk of civil conflict, on the other (Miguel et al., 2004; Collier and Hoeffler, 2007). Another argument, due to Hirshleifer (1991) and Grossman (1999), is that by raising the return to predation, higher per-capita incomes can contribute to the risk of rapacious activities over society’s resources, consistent with empirical findings from some of the aforementioned studies on the link between income from natural resources and conflict potential. Furthermore, to the extent that income per capita serves as a proxy for state capabilities (Fearon and Laitin, 2003), a higher level of per-capita income can reflect the notion of a state that is better able to prevent or defend itself against rebel insurgencies, an idea that has also found some recent empirical support (e.g., Bazzi and Blattman, 2014). Therefore, the baseline specifications control for GDP per capita, as reported by the World Bank’s World Development Indicators (WDI). Importantly, because population diversity, as proxied by genetic diversity, has been shown to confer a hump-shaped influence on productivity at the country level (Ashraf and Galor, 2013a), the inclusion of GDP per capita accounts for the indirect effect of population diversity on conflict potential via the income channel.

Like income per capita, population size is also a standard covariate in empirical models of conflict. One reason is that operational definitions of civil conflict typically impose a death threshold, and violence-related casualties may be mechanically related to the size of population. In addition, a larger population may imply a greater recruitment pool for rebels (Fearon and Laitin, 2003). Further, to the extent that more populous countries exhibit greater intrapopulation heterogeneity, they could also harbor stronger motives for secessionist conflicts (Alesina and Spolaore, 2003; Desmet et al., 2011). The baseline specifications thus include controls for population size.

It should be noted that many of the aforementioned controls for institutional quality, ethnolinguistic fragmentation, and the correlates of economic development are endogenous in an empirical model of civil conflict, and as such, their estimated coefficients in the regressions do not permit a causal interpretation. Nonetheless, controlling for these factors is essential to minimize specification errors and assess the extent to which the reduced-form influence of population diversity on conflict potential can be attributed to more conventional explanations in the literature.

Appendix A.4 presents the summary statistics of all the main variables exploited by the baseline cross-country analysis of civil conflict frequency.

3.2. Empirical Results

This section presents the main findings from several country-level analyses, establishing a highly significant and robust reduced-form causal influence of population diversity on various intrastate conflict outcomes over the past half-century. The exposition commences with the results of the baseline cross-country regressions that explain the annual frequency of civil conflict outbreaks in the post-1960 time period. It then discusses the results from conflict incidence and onset regressions that exploit variations in repeated cross-country data, before presenting evidence that population diversity has also been a significant predictor of contemporary intra-group conflict outcomes. The section concludes with an analysis of conflicts during the 1400–1799 period, showing that population diversity has had a deep influence on the conflict potential of societies over many centuries. The analysis of each conflict outcome includes several robustness checks. Some of these are collected and discussed in Appendix A.2 while others are relegated to Sections A.1–A.2 of the Supplemental Material.

3.2.1. Analysis of Civil Conflict Frequency in Cross-Country Data

The cross-country regressions attempt to explain the variation across countries in the annual frequency of new civil conflict onsets – i.e., the average number of new PRIO25 civil conflict eruptions per year – during the 1960–2017 time horizon. Specifically, the baseline empirical model for the cross-country analysis is as follows.

$$CF_i = \beta_0 + \beta_1 \widehat{DIV}_i + \beta_2 GEO_i + \beta_3 ETH_i + \beta_4 INS_i + \beta_5 DEV_i + \varepsilon_i, \tag{1}$$

where $CF_i$ is the (log-transformed) average number of new PRIO25 civil conflict outbreaks per year in country $i$; $\widehat{DIV}_i$ is the ancestry-adjusted population diversity of the national population; $GEO_i$, $ETH_i$, $INS_i$, and $DEV_i$ are the respective vectors of control variables for geographical characteristics (including continent dummies), ethnolinguistic fragmentation, institutional factors, and the correlates of economic development, as described in Section 3.1; and finally, $\varepsilon_i$ is a country-specific disturbance term. All time-varying controls for institutional factors and development outcomes enter the model as their respective temporal means over the 1960–2017 time horizon.

Table I presents the results from the baseline cross-country analysis. The analysis begins with a bivariate regression in Column 1, showing that population diversity is indeed a positive and highly significant correlate of the annual frequency of new civil conflict eruptions. Specifically, the estimated coefficient suggests that a move from the 10th to the 90th percentile of the cross-country distribution of population diversity is associated with an increase in conflict frequency by 0.014 new civil conflict outbreaks per year, a relationship that is statistically significant at the 1 percent level. Bearing in mind that the sample mean of the dependent variable is 0.022 outbreaks per year, this association is also of sizable economic significance, reflecting 44 percent of a standard deviation across countries in the temporal frequency of new civil conflict onsets. Next, beginning with Column 2, the analysis progressively adds an expanding set of covariates to the specification. It first incorporates exogenous geographical characteristics and then additionally accounts for measures of ethnolinguistic fragmentation, before controlling for semi-endogenous institutional factors and more endogenous outcomes of economic development in the full empirical model in Column 8.

Table I: Population Diversity and the Frequency of Civil Conflict Onset across Countries – The Baseline Analysis

Dependent variable: Log number of new PRIO25 civil conflict onsets per year, 1960–2017

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cross-country sample | Global | Global | Global | Global | Global | Global | Global | Global | Old World | Old World | Global | Global |
| Estimator | OLS | OLS | OLS | OLS | OLS | OLS | OLS | OLS | OLS | OLS | 2SLS | 2SLS |
| Population diversity (ancestry adjusted) | 0.209*** (0.066) | 0.439*** (0.104) | 0.306*** (0.115) | 0.290** (0.113) | 0.326*** (0.118) | 0.318*** (0.119) | | 0.309** (0.130) | 0.548*** (0.191) | 0.597*** (0.209) | 0.537*** (0.176) | 0.602*** (0.185) |
| Within-group population diversity | | | | | | | 0.364*** (0.140) | | | | | |
| Between-group population diversity | | | | | | | 0.284* (0.166) | | | | | |
| Ethnic fractionalization | | | | 0.011 (0.012) | | 0.004 (0.013) | 0.004 (0.013) | 0.001 (0.010) | | 0.002 (0.012) | | −0.005 (0.010) |
| Ethnolinguistic polarization | | | | | 0.016 (0.011) | 0.014 (0.011) | 0.014 (0.012) | 0.012 (0.012) | | 0.016 (0.014) | | 0.020* (0.012) |
| Absolute latitude | | −0.307** (0.124) | −0.396* (0.204) | −0.294 (0.249) | −0.435** (0.199) | −0.392 (0.244) | −0.391 (0.245) | 0.166 (0.242) | −0.319 (0.255) | 0.289 (0.305) | −0.477** (0.201) | −0.046 (0.243) |
| Ruggedness | | 0.015 (0.030) | −0.005 (0.035) | −0.001 (0.036) | 0.000 (0.035) | 0.001 (0.036) | 0.003 (0.036) | 0.031 (0.036) | 0.002 (0.040) | 0.048 (0.041) | −0.001 (0.034) | 0.028 (0.033) |
| Mean elevation | | −0.019** (0.009) | −0.018* (0.009) | −0.018* (0.010) | −0.019* (0.010) | −0.019* (0.010) | −0.020* (0.010) | −0.020** (0.009) | −0.023** (0.012) | −0.023** (0.011) | −0.019** (0.009) | −0.021** (0.009) |
| Range of elevation | | 0.011*** (0.004) | 0.012*** (0.004) | 0.011*** (0.004) | 0.011*** (0.004) | 0.011*** (0.004) | 0.011*** (0.004) | 0.004 (0.003) | 0.014*** (0.005) | 0.004 (0.004) | 0.012*** (0.004) | 0.005* (0.003) |
| Mean land suitability | | 0.014 (0.012) | 0.020 (0.013) | 0.023 (0.014) | 0.024* (0.014) | 0.024* (0.015) | 0.025 (0.015) | 0.001 (0.014) | 0.018 (0.016) | 0.000 (0.017) | 0.021* (0.012) | −0.000 (0.013) |
| Range of land suitability | | 0.014* (0.008) | 0.014 (0.010) | 0.013 (0.010) | 0.017* (0.010) | 0.017 (0.011) | 0.017 (0.011) | 0.008 (0.012) | 0.017 (0.012) | 0.007 (0.015) | 0.017* (0.010) | 0.011 (0.012) |
| Distance to nearest waterway | | 0.007 (0.010) | 0.006 (0.011) | 0.005 (0.011) | 0.006 (0.012) | 0.006 (0.012) | 0.006 (0.012) | 0.003 (0.012) | 0.005 (0.012) | 0.005 (0.012) | 0.005 (0.011) | 0.002 (0.011) |
| Island nation dummy | | −0.012 (0.007) | −0.015** (0.007) | −0.015** (0.007) | −0.015** (0.007) | −0.015** (0.007) | −0.015** (0.007) | −0.021** (0.008) | −0.008 (0.010) | −0.021* (0.011) | −0.015** (0.007) | −0.022*** (0.008) |
| Executive constraints, 1960–2017 average | | | | | | | | −0.002 (0.004) | | −0.003 (0.005) | | −0.000 (0.004) |
| Fraction of years under democracy, 1960–2017 | | | | | | | | 0.017 (0.018) | | 0.023 (0.019) | | 0.013 (0.017) |
| Fraction of years under autocracy, 1960–2017 | | | | | | | | −0.009 (0.015) | | −0.010 (0.016) | | −0.010 (0.014) |
| Oil or gas reserve discovery | | | | | | | | 0.008* (0.005) | | 0.007 (0.005) | | 0.007 (0.005) |
| Log population, 1960–2017 average | | | | | | | | 0.005** (0.003) | | 0.007** (0.003) | | 0.005** (0.002) |
| Log GDP per capita, 1960–2017 average | | | | | | | | −0.010*** (0.002) | | −0.009*** (0.003) | | −0.010*** (0.002) |
| Continent dummies | | × | × | × | × | × | × | × | × | × | × | × |
| Legal origin dummies | | | | | | | | × | | × | | × |
| Colonial history dummies | | | | | | | | × | | × | | × |
| Observations | 150 | 150 | 150 | 150 | 150 | 150 | 150 | 147 | 123 | 121 | 150 | 147 |
| Partial R2 of population diversity | | 0.128 | 0.044 | 0.040 | 0.050 | 0.046 | | 0.051 | 0.068 | 0.088 | | |
| Partial R2 of within-group | | | | | | | 0.042 | | | | | |
| Partial R2 of between-group | | | | | | | 0.015 | | | | | |
| Adjusted R2 | 0.029 | 0.189 | 0.213 | 0.212 | 0.220 | 0.215 | 0.212 | 0.358 | 0.225 | 0.392 | | |
| Effect of 10th–90th %ile move in diversity | 0.014*** (0.004) | 0.029*** (0.007) | 0.020*** (0.008) | 0.019** (0.008) | 0.022*** (0.008) | 0.021*** (0.008) | | 0.021** (0.009) | 0.026*** (0.009) | 0.026*** (0.009) | 0.036*** (0.012) | 0.041*** (0.013) |
| Effect of 10th–90th %ile move in within-group | | | | | | | 0.037*** (0.014) | | | | | |
| Effect of 10th–90th %ile move in between-group | | | | | | | 0.023* (0.013) | | | | | |
| First stage (dependent variable: Population diversity, ancestry adjusted) | | | | | | | | | | | | |
| Migratory distance from East Africa (in 10,000 km) | | | | | | | | | | | −0.068*** (0.005) | −0.065*** (0.007) |
| First-stage F statistic | | | | | | | | | | | 153.543 | 92.693 |

Notes: This table exploits cross-country variations to establish a significant positive reduced-form impact of contemporary population diversity on the annual frequency of new PRIO25 civil conflict onsets during the 1960–2017 time period, conditional on ethnic diversity measures as well as the proximate geographical, institutional, and development-related correlates of conflict. For regressions based on the global sample, the set of continent dummies includes five indicators for Africa, Asia, North America, South America, and Oceania, whereas for regressions based on the Old-World sample, the set includes two indicators for Africa and Asia. The set of legal origin dummies includes two indicators for British and French legal origins, and the set of colonial history dummies includes three indicators for experience as a colony of the U.K., France, and any other major colonizing power. The 2SLS regressions exploit prehistoric migratory distance from East Africa to the indigenous (precolonial) population of a country as an excluded instrument for the country’s contemporary population diversity. The estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its cross-country distribution is expressed in terms of the number of new conflict onsets per year. Heteroskedasticity-robust standard errors are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

Upon accounting for the potentially confounding influence of geographical characteristics in Column 2, population diversity continues to remain statistically significant at the 1 percent level, but now, its coefficient is more than twice as large as the unconditioned estimate from Column 1. This increase appears to be largely driven by the inclusion of absolute latitude and the range of elevation and of land suitability as covariates to the model, as all three variables enter the regression significantly and with expected signs.15 Based on the specification in Column 2, the scatter plots in Figure 2 depict the positive and statistically significant cross-country relationship between population diversity and the annual frequency of new civil conflict onsets, both in the full sample of countries and in a sample that omits apparently influential outliers.

Figure 2: Population Diversity and the Frequency of Civil Conflict Onset across Countries.

Notes: This figure depicts the global cross-country relationship between contemporary population diversity and the annual frequency of new PRIO25 civil conflict onsets during the 1960–2017 time period, conditional on the baseline geographical correlates of conflict, as considered by the specification in Column 2 of Table I. The relationship is depicted for either an unrestricted sample of countries (Panel (a)) or a sample that omits apparently influential outliers (Panel (b)). Each of the two panels presents an added-variable plot with a partial regression line. Given that the unrestricted sample employed by the left panel is not constrained by the availability of data on other covariates considered by the analysis in Table I, the regression coefficients reported in this panel are marginally different from those presented in Column 2 of Table I. The set of influential outliers omitted from the sample in Panel (b) includes Bosnia and Herzegovina (BIH), Ethiopia (ETH), Georgia (GEO), India (IND), and Ukraine (UKR).

As revealed by the regression in Column 3, the point estimate of the impact of population diversity on conflict becomes somewhat diminished once the specification is conditioned to only exploit intra-continental cross-country variations. However, even after including a complete set of continent dummies, the coefficient of interest remains statistically significant at the 1 percent level and larger than the unconditioned estimate from Column 1. It suggests that a move from the 10th to the 90th percentile of the cross-country distribution of population diversity is associated with an increase in conflict frequency by 0.020 civil conflict outbreaks per year, corresponding to 65 percent of a standard deviation of the cross-country conflict frequency distribution.

The regressions in Columns 4–6 indicate that when additionally subjected to controls for ethnic fractionalization and ethnolinguistic polarization, either individually or jointly, the point estimate of the coefficient on population diversity continues to remain largely stable in both magnitude and statistical precision.16 In contrast, neither ethnic fractionalization nor ethnolinguistic polarization appears to possess any significant explanatory power for the cross-country variation in the temporal frequency of civil conflict outbreaks, conditional on population diversity and the baseline set of geographical covariates.17

The analysis in Column 7 replicates the specification from Column 6 except that it decomposes the measure of overall interpersonal diversity of the national population into its two components and jointly examines their conditional associations with conflict. The two components of overall diversity capture the average interpersonal diversity within versus between groups in the contemporary national population, where the subnational groups are categorized by their ancestral origins prior to the great intercontinental migrations of the post-1500 era.18 The results indicate that the within-group component of population diversity is economically and statistically more important for explaining civil conflict. Specifically, a move from the 10th to the 90th percentile of the cross-country distribution of within-group diversity is associated with an increase in conflict frequency by 0.037 civil conflict outbreaks per year, a relationship that is statistically significant at the 1 percent level. On the other hand, a similar move along the cross-country distribution of between-group diversity is associated with a less pronounced increase of 0.023 new civil conflict onsets per year. The estimated response in the latter case is also statistically less precise, reflecting statistical significance only at the 10 percent level. The greater importance of the within-group component of population diversity is additionally reflected by a corresponding partial R2 statistic that is nearly 3 times as large as that associated with the between-group component.

The full specification in Column 8 augments the intermediate specification from Column 6 with controls for colonial legacy and contemporary institutional factors, as well as controls for the natural resource curse, population size, and GDP per capita. Reassuringly, regardless of the potential endogeneity of these additional covariates, the point estimate of the coefficient on population diversity remains remarkably stable in both magnitude and statistical significance in comparison to the estimates from previous columns. In particular, the coefficient of interest from this regression suggests that conditional on the complete set of controls for geographical characteristics, ethnolinguistic fragmentation, institutional factors, and outcomes of economic development, a move from the 10th to the 90th percentile of the cross-country distribution of population diversity is associated with an increase in conflict frequency by 0.021 new PRIO25 civil conflict outbreaks per year, or 68 percent of a standard deviation of the cross-country conflict frequency distribution. Moreover, the adjusted R2 statistic of the regression suggests that the full empirical model explains about 36 percent of the cross-country variation in conflict frequency, whereas the partial R2 statistic associated with population diversity indicates that 5 percent of the residual cross-country variation in conflict frequency can be explained by the residual cross-country variation in population diversity.
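The interpretation of the partial R2 statistic as the share of residual variation in the outcome explained by residual variation in the regressor of interest can be made concrete with a small sketch. It partials a single control out of both variables and then regresses residual on residual, which is also the construction behind an added-variable plot. The data, the variable names, and the use of one control instead of the full covariate set are assumptions made for illustration.

```python
def mean(v):
    return sum(v) / len(v)


def residualize(v, w):
    """Residuals from an OLS regression of v on an intercept and w."""
    mv, mw = mean(v), mean(w)
    slope = (sum((wi - mw) * (vi - mv) for wi, vi in zip(w, v))
             / sum((wi - mw) ** 2 for wi in w))
    return [(vi - mv) - slope * (wi - mw) for wi, vi in zip(w, v)]


def multiple_regression_slope(x, w, y):
    """Coefficient on x in an OLS regression of y on an intercept, x, and w."""
    mx, mw, my = mean(x), mean(w), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sww = sum((b - mw) ** 2 for b in w)
    sxw = sum((a - mx) * (b - mw) for a, b in zip(x, w))
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    swy = sum((b - mw) * (c - my) for b, c in zip(w, y))
    return (sxy * sww - sxw * swy) / (sxx * sww - sxw ** 2)


# Invented data: w is a control, x the regressor of interest, y the outcome.
w = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0, 7.0]

rx = residualize(x, w)  # variation in x not explained by w
ry = residualize(y, w)  # variation in y not explained by w

sxy = sum(a * b for a, b in zip(rx, ry))
sxx = sum(a * a for a in rx)
syy = sum(b * b for b in ry)

partial_slope = sxy / sxx             # slope of the added-variable regression
partial_r2 = sxy ** 2 / (sxx * syy)   # share of residual variation explained
direct_slope = multiple_regression_slope(x, w, y)
print(partial_slope, direct_slope, partial_r2)
```

The residual-on-residual slope coincides with the coefficient from the multiple regression, so the partial R2 isolates the explanatory power of the regressor of interest net of the controls.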

Addressing Endogeneity

The results thus far demonstrate a highly significant and robust cross-country association between population diversity and the temporal frequency of civil conflict onsets over the last half-century, even after conditioning the analysis on a sizable set of controls for geographical characteristics, ethnolinguistic fragmentation, institutional factors, and development outcomes. Nevertheless, this association could be marred by endogeneity bias, in light of the possibility that the large-scale human migrations of the post-1500 era – as captured by the ancestry-adjusted measure of interpersonal diversity for contemporary national populations – and the spatial pattern of conflicts in the modern era could be codetermined by common unobserved forces (e.g., the spatial pattern of historical conflicts) that may not be fully accounted for by covariates. Although the stability of the coefficient of interest across specifications suggests that selection on unobservables needs to be unreasonably strong to fully explain away the main finding, one cannot rely entirely on OLS point estimates to assess causality.19 Thus, as discussed previously in Section 3.1, the analysis exploits two alternative identification strategies to address this issue. The specifications in Columns 9–10 implement the first approach to causal identification by simply restricting the OLS estimator to exploit variations in a subsample of countries that only belong to the Old World. Then, in Columns 11–12, the analysis conducts 2SLS regressions that employ the migratory distance of the prehistorically native population in each country from East Africa as an instrument for the country’s contemporary population diversity. The identifying assumption is that migratory distance from East Africa is plausibly exogenous to the risk of civil conflict in the post-1960 time period, conditional on the sizable vector of control variables.
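The logic of the instrumental-variables strategy can be sketched on simulated data. In the illustration below, an unobserved confounder lowers the regressor and raises the outcome, so OLS is biased downward, while a two-stage procedure that uses only the variation driven by the instrument recovers the assumed coefficient. Every number here is invented: the data-generating process, the true coefficient of 0.5, and the variable names are assumptions for illustration, and the single-regressor setup omits the controls used in the paper.

```python
import random


def ols(x, y):
    """Return (intercept, slope) from a bivariate OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


random.seed(0)
n = 20000
true_beta = 0.5  # assumed causal effect of the regressor on the outcome

z, x, y = [], [], []
for _ in range(n):
    zi = random.gauss(0, 1)   # instrument (stand-in for migratory distance)
    ci = random.gauss(0, 1)   # unobserved confounder
    xi = -zi - ci + random.gauss(0, 1)              # regressor falls with the instrument
    yi = true_beta * xi + ci + random.gauss(0, 1)   # confounder enters the outcome
    z.append(zi)
    x.append(xi)
    y.append(yi)

# Naive OLS: biased because the confounder moves both x and y.
_, beta_ols = ols(x, y)

# First stage: project the regressor on the instrument.
a, b = ols(z, x)
x_hat = [a + b * zi for zi in z]

# Second stage: regress the outcome on the first-stage fitted values.
_, beta_2sls = ols(x_hat, y)
print(beta_ols, b, beta_2sls)
```

The second-stage standard errors from this manual procedure would be incorrect, so the sketch is meant only to convey why the point estimate changes, not how inference is conducted.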

As is evident from the regressions in Columns 9–12, the two alternative identification strategies yield remarkably similar results, with the point estimate of the coefficient on population diversity being noticeably larger in magnitude, relative to its less well-identified counterpart in the global-sample OLS regressions (from either Column 3 or Column 8). In particular, the coefficient is highly statistically significant across the four better-identified specifications, and as estimated by the 2SLS regression in Column 12, it suggests that a move from the 10th to the 90th percentile of the global cross-country distribution of population diversity is associated with an increase in conflict frequency by 0.041 new PRIO25 civil conflict outbreaks per year, corresponding to 133 percent of a standard deviation of the global cross-country conflict frequency distribution.
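The magnitudes quoted for Columns 1, 3, 8, and 12 can be cross-checked against one another: dividing each reported effect by its stated share of a standard deviation should recover roughly the same cross-country standard deviation of conflict frequency. The sketch uses only figures quoted in the text; the variable names are illustrative, and rounding in the reported figures limits the precision of the comparison.

```python
# (column, effect of a 10th-90th percentile move, stated share of one SD)
reported = [
    ("Column 1", 0.014, 0.44),
    ("Column 3", 0.020, 0.65),
    ("Column 8", 0.021, 0.68),
    ("Column 12", 0.041, 1.33),
]

# Standard deviation of conflict frequency implied by each pair of figures.
implied_sd = {label: effect / share for label, effect, share in reported}

for label, sd in implied_sd.items():
    print(label, round(sd, 4))
```

All four pairs imply a standard deviation of roughly 0.03 onsets per year, so the reported effects and percentages are mutually consistent up to rounding.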

There are plausibly three distinct rationales – perhaps operating in tandem – for why the better-identified point estimates of the coefficient on population diversity are larger than their less well-identified counterparts. First, the spatial pattern of social conflict may exhibit long-term persistence for reasons other than population diversity. If persistent conflict spurred emigrations and atrocities that gradually led to systematically more homogeneous populations (Fletcher and Iyigun, 2010) in conflict-prone areas, there should be a downward bias in the estimated coefficient on population diversity in an OLS regression that explains the global variation in civil conflict potential in the modern era.

A second plausible explanation is that the pattern of conflict risk in the modern era, especially across populations in the New World that experienced a substantial increase in diversity from migrations in the post-1500 era, has been influenced not so much by the higher population diversity of the immigrants but more so by the unobserved (or observed but noisily measured) human capital that European settlers brought with them, the colonization strategies that they pursued, and the socio-political institutions that they established. To the extent that these unobserved factors associated with European settlers in the New World served, in one way or another, to reduce the risk of social conflict in the modern national populations of the Americas and Oceania, they could also introduce a negative bias in the OLS estimates of the relationship between population diversity and conflict risk in a global sample of countries.

A third possible rationale is that in the end, population diversity explains the conflict propensity of a population mostly through its prehistorically determined component. This component may have contributed to the formation and ethnic differentiation of native groups in a given location and, thus, to more deeply rooted inter-ethnic divisions amongst these groups. As such, conditional on continent fixed effects that absorb any systematic differences in the pattern of post-1500 population flows into locations in the Old World versus the New World, the ancestry-adjusted measure of interpersonal diversity – which incorporates the diversity of both native and non-native groups in a contemporary national population – might be a noisy proxy for the “true” measure of prehistorically determined population diversity. Therefore, as a result of this “measurement error,” the influence of the ancestry-adjusted measure of population diversity might be attenuated in an OLS regression that exploits worldwide variations.
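The attenuation argument can be stated compactly under the textbook assumption of classical measurement error, which is an assumption added here for illustration and is not imposed by the analysis. Let $d_i^{\ast}$ denote the prehistorically determined component of diversity, $d_i$ the ancestry-adjusted measure, and $e_i$ a noise term uncorrelated with both $d_i^{\ast}$ and the regression error $\varepsilon_i$ (all three symbols are hypothetical notation):

```latex
\begin{align*}
y_i &= \alpha + \beta\, d_i^{\ast} + \varepsilon_i, \qquad d_i = d_i^{\ast} + e_i, \\
\operatorname{plim}\ \hat{\beta}_{\mathrm{OLS}}
  &= \frac{\operatorname{Cov}(d_i, y_i)}{\operatorname{Var}(d_i)}
   = \beta \cdot \frac{\sigma^{2}_{d^{\ast}}}{\sigma^{2}_{d^{\ast}} + \sigma^{2}_{e}} .
\end{align*}
```

Because the ratio on the right lies strictly between 0 and 1 whenever $\sigma^{2}_{e} > 0$, the OLS coefficient is biased toward zero in this stylized case. An instrument that is correlated with $d_i^{\ast}$ but not with $e_i$ removes this bias.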

Given that both of the identification strategies ultimately exploit the variation in population diversity across populations that have been prehistorically indigenous to their current locations, either by omitting the modern national populations of the New World from the estimation sample or by instrumenting contemporary population diversity in a globally representative sample of countries with the prehistoric migratory distance of a country’s geographical location from East Africa, the better-identified estimates mitigate all the aforementioned sources of negative bias.

Robustness Checks

The analysis in Appendix A.2 shows that population diversity possesses significant power for explaining the cross-country variation in the total count of new conflict onsets during the 1960–2017 time period (Table A.II). It also establishes the robustness of the baseline cross-country findings to accounting for: spatial dependence across observations by estimating spatial regressions (Table A.III); and the property of population diversity as a generated regressor by bootstrapping the standard errors (Table A.IV).

Further, Section A.1 of the Supplemental Material presents several robustness checks for the cross-country analysis of the influence of population diversity on the temporal frequency of civil conflict outbreaks in the post-1960 time horizon. It demonstrates that the main findings are qualitatively robust to (1) accounting for various ecological and climatic covariates, including the temporal means and volatilities of annual temperature and precipitation over the relevant sample period as well as time-invariant measures of ecological fractionalization and polarization (Table SA.I); (2) accounting for the timing of the Neolithic Revolution, state antiquity, the duration of human settlement, and distance from the regional technological frontier in 1500 (Table SA.II); (3) accounting for inequality across ethnic homelands as well as overall spatial inequality in nighttime luminosity within a country (Table SA.III); (4) accounting for linguistic rather than ethnic fractionalization as a baseline covariate (Table SA.IV); (5) accounting for alternative measures of ethnolinguistic fractionalization and polarization, based on the spatial distribution of language homelands and on gridded population data (Table SA.V); (6) accounting for the initial-year values of time-varying baseline covariates rather than their temporal means over the sample period (Table SA.VI); (7) accounting for spatial autocorrelation in unobserved heterogeneity (Table SA.VII); and (8) the elimination of world regions from the estimation sample that could have been statistically influential for generating the key empirical pattern (Table SA.VIII).

3.2.2. Analysis of Civil Conflict Incidence in Repeated Cross-Country Data

The analysis now proceeds to examine the temporal prevalence of civil conflict. Specifically, exploiting the time structure of quinquennially repeated cross-country data, it investigates the predictive power of population diversity for the likelihood of observing the incidence of one or more active conflict episodes in a given 5-year interval during the 1960–2017 time horizon. The following probit model is therefore estimated by maximum likelihood.

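$$CP^{*}_{i,t} = \gamma_0 + \gamma_1 \widehat{DIV}_i + \gamma_2' GEO_i + \gamma_3' INS_{i,t-1} + \gamma_4' ETH_i + \gamma_5' DEV_{i,t-1} + \gamma_6 C_{i,t-1} + \gamma_7' \delta_t + \eta_{i,t} \equiv \gamma' Z_{i,t} + \eta_{i,t}; \tag{2}$$
$$C_{i,t} = \mathbf{1}\left[CP^{*}_{i,t} \geq D^{*}\right]; \tag{3}$$
$$\Pr\left(C_{i,t} = 1 \mid Z_{i,t}\right) = \Pr\left(CP^{*}_{i,t} \geq D^{*} \mid Z_{i,t}\right) = \Phi\left(\gamma' Z_{i,t} - D^{*}\right), \tag{4}$$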

where $CP^{*}_{i,t}$ is a latent variable measuring the potential for an active conflict episode in country $i$ during any given 5-year interval, $t$, and it is modeled as a linear function of explanatory variables. In particular, the time-invariant explanatory variables $\widehat{DIV}_i$, $GEO_i$, and $ETH_i$ are all as previously defined, but now, the time-varying covariates included in $INS_{i,t-1}$ and $DEV_{i,t-1}$ enter as their respective temporal means over the previous 5-year interval. Further, $\delta_t$ is a vector of time-interval (5-year period) dummies, and $\eta_{i,t}$ is a country-period-specific disturbance term.20 By specifying each of the time-varying controls to enter the model with a one-period lag, the analysis aims to mitigate the concern that the use of contemporaneous measures of these covariates may exacerbate reverse-causality bias in their estimated coefficients.21 Finally, the model assumes that contemporary conflict potential additionally depends on the lagged incidence of civil conflict, $C_{i,t-1}$, which accounts for the possibility that countries with a conflict experience in the immediate past may exhibit a higher conflict potential in the current period, mainly because of the intertemporal spillovers that are common to most conflict processes – e.g., the self-reinforcing nature of past casualties on either side of a conflict.22 Because the continuous variable reflecting conflict potential, $CP^{*}_{i,t}$, is unobserved, its level can only be inferred from the binary incidence variable, $C_{i,t}$, indicating whether the latent conflict potential was sufficiently intense for the annual battle-related death threshold of a civil conflict episode to have been surpassed during a given 5-year interval. As is evident from equations (3)–(4), $D^{*}$ is the corresponding threshold for unobserved conflict potential, and it appears as an intercept in $\Phi(\cdot)$, the cumulative distribution function for the disturbance term, $\eta_{i,t}$.
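The mapping from latent conflict potential to the observed incidence probability can be checked numerically. The following sketch assumes a standard normal disturbance, as in a conventional probit; the function names and the numerical values of the linear index and the threshold are hypothetical. It compares the closed-form probability with the share of simulated draws in which the latent variable reaches the threshold.

```python
import math
import random

def std_normal_cdf(v):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

def incidence_probability(linear_index, threshold):
    """Probit probability that the latent potential is at least the threshold."""
    return std_normal_cdf(linear_index - threshold)

def simulated_incidence_share(linear_index, threshold, draws=100_000, seed=1):
    """Share of draws in which linear_index + disturbance reaches the threshold."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(draws)
        if linear_index + rng.gauss(0.0, 1.0) >= threshold
    )
    return hits / draws

# Hypothetical values for the linear index and the threshold.
index, cutoff = 0.3, 1.0
print("closed form:", round(incidence_probability(index, cutoff), 3))
print("simulated:  ", round(simulated_incidence_share(index, cutoff), 3))
```

The two numbers should agree up to simulation noise, which is the sense in which the threshold acts as an intercept inside the distribution function.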

The main results for the temporal prevalence (or incidence) of PRIO25 civil conflict episodes are presented in Columns 1–4 of Table II. In the interest of brevity, the analysis exclusively reports the better-identified point estimates – namely, from probit regressions in a sample of countries belonging only to the Old World, and from IV probit regressions that exploit migratory distance from East Africa as an instrument for contemporary population diversity in a global sample of countries. For each of these two identification strategies, two distinct specifications are estimated; one that partials out the influence of only exogenous geographical covariates (including continent fixed effects), and another that conditions the analysis on the full set of control variables from the empirical model of conflict incidence.

Table II:

Population Diversity and the Incidence or Onset of Civil Conflict in Repeated Cross-Country Data

Dependent variable: Quinquennial PRIO25 civil conflict incidence, 1960–2017 (Columns 1–4); Annual PRIO25 civil conflict onset, 1960–2017 (Columns 5–8)

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
|---|---|---|---|---|---|---|---|---|
| Cross-country sample | Old World | Old World | Global | Global | Old World | Old World | Global | Global |
| Estimator | Probit | Probit | IV Probit | IV Probit | Probit | Probit | IV Probit | IV Probit |
| Population diversity (ancestry adjusted) | 13.366*** (3.700) | 12.203*** (3.787) | 14.304*** (3.652) | 13.578*** (4.210) | 6.172** (2.576) | 6.356** (2.645) | 7.066*** (2.594) | 8.804*** (3.170) |
| Ethnic fractionalization | | −0.399 (0.353) | | −0.519 (0.332) | | −0.084 (0.252) | | −0.322 (0.280) |
| Ethnolinguistic polarization | | 0.049 (0.344) | | 0.322 (0.340) | | 0.172 (0.248) | | 0.334 (0.254) |
| Continent dummies | × | × | × | × | × | × | × | × |
| Time dummies | × | × | × | × | × | × | × | × |
| Controls for temporal spillovers | × | × | × | × | × | × | × | × |
| Controls for geography | × | × | × | × | × | × | × | × |
| Controls for institutions | | × | | × | | × | | × |
| Controls for oil, population, and income | | × | | × | | × | | × |
| Observations | 1,270 | 1,045 | 1,583 | 1,311 | 5,452 | 4,377 | 6,996 | 5,757 |
| Countries | 123 | 121 | 150 | 147 | 123 | 121 | 150 | 147 |
| Pseudo R² | 0.416 | 0.440 | | | 0.131 | 0.161 | | |
| Marginal effect of diversity | 2.553*** (0.683) | 2.261*** (0.709) | 2.817*** (0.741) | 2.595*** (0.850) | 0.324** (0.139) | 0.332** (0.140) | 0.336** (0.133) | 0.421** (0.170) |
| FIRST STAGE (dependent variable: Population diversity, ancestry adjusted) | | | | | | | | |
| Migratory distance from East Africa (in 10,000 km) | | | −0.068*** (0.006) | −0.066*** (0.006) | | | −0.068*** (0.006) | −0.066*** (0.006) |
| First-stage F statistic | | | 145.394 | 99.876 | | | 151.502 | 102.614 |

Notes: This table exploits variations in repeated cross-country data to establish a significant positive reduced-form impact of contemporary population diversity on the likelihood of observing (i) the incidence of a PRIO25 civil conflict in any given 5-year interval during the 1960–2017 time period (Columns 1–4); and (ii) the onset of a new PRIO25 civil conflict in any given year during the 1960–2017 time period (Columns 5–8), conditional on ethnic diversity measures as well as the proximate geographical, institutional, and development-related correlates of conflict. The controls for geography include absolute latitude, ruggedness, distance to the nearest waterway, the mean and range of agricultural suitability, the mean and range of elevation, and an indicator for small island nations. The controls for institutions include a set of legal origin dummies, comprising two indicators for British and French legal origins, as well as six time-dependent covariates, comprising the degree of executive constraints, two indicators for the type of political regime (democracy and autocracy), and three indicators for experience as a colony of the U.K., France, and any other major colonizing power. The control for oil presence is a time-invariant indicator for the discovery of a petroleum (oil or gas) reserve by the year 2003. The controls for population and income are the time-dependent log-transformed values of total population and GDP per capita. In Columns 1–4, all time-dependent covariates assume their average annual values over the previous 5-year interval, whereas in Columns 5–8, they assume their annual values from the previous year. To account for duration dependence and temporal spillovers in conflict outcomes, all regressions control for the lagged incidence of conflict, and the regressions in Columns 5–8 additionally control for a set of cubic splines of the number of peace years. 
For regressions based on the global sample, the set of continent dummies includes five indicators for Africa, Asia, North America, South America, and Oceania, whereas for regressions based on the Old-World sample, the set includes two indicators for Africa and Asia. The IV probit regressions exploit prehistoric migratory distance from East Africa to the indigenous (precolonial) population of a country as an excluded instrument for the country’s contemporary population diversity. The estimated marginal effect of a 1 percentage point increase in population diversity is the average marginal effect across the entire cross-section of observed diversity values, and it reflects the increase in either the quinquennial likelihood of a conflict incidence (Columns 1–4) or the annual likelihood of a conflict onset (Columns 5–8), both expressed in percentage points. Heteroskedasticity-robust standard errors, clustered at the country level, are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

As is evident from the results, interpersonal population diversity enters all four specifications with a positive and highly significant coefficient. To interpret the coefficient of interest, the IV probit regression presented in Column 4 suggests that conditional on the complete set of control variables, a 1 percentage point increase in population diversity leads to an increase in the quinquennial likelihood of a PRIO25 civil conflict incidence by 2.6 percentage points. Indeed, this sample-wide average marginal effect of population diversity is statistically significant at the 1 percent level. In addition, the economic significance of population diversity for conflict incidence is evident in the plots presented in Figure SA.1 in Section A.3 of the Supplemental Material. Based on the regressions in Columns 2 and 4, these plots illustrate how the predicted quinquennial likelihood of a civil conflict incidence varies as one moves along the cross-country distribution of population diversity in the relevant estimation sample. Specifically, a move from the 10th to the 90th percentile of the cross-country distribution of population diversity leads to an increase in the predicted quinquennial likelihood of civil conflict incidence from about 23 to 33 percent amongst countries in the Old World, and from about 18 to 34 percent in the global sample of countries.
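For readers unfamiliar with how a probit coefficient translates into the reported marginal effect, the sketch below computes a sample-wide average marginal effect as the mean of the normal density, evaluated at each observation's linear index net of the threshold, times the coefficient. This is the generic probit formula, offered as an illustration rather than as the exact procedure behind Table II (in particular, the IV probit case may involve additional steps). The list of linear indices is hypothetical; only the coefficient 13.578 is taken from Column 4.

```python
import math

def std_normal_pdf(v):
    """Standard normal density."""
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)

def average_marginal_effect(linear_indices, coefficient, step=0.01):
    """Average change in probability for a `step` increase in the regressor.

    `linear_indices` holds one fitted index (net of the threshold) per
    observation; `step=0.01` corresponds to 1 percentage point of diversity.
    """
    densities = [std_normal_pdf(v) for v in linear_indices]
    return step * coefficient * sum(densities) / len(densities)

# Hypothetical fitted indices for five observations.
hypothetical_indices = [-1.5, -1.0, -0.8, -0.5, 0.2]
ame = average_marginal_effect(hypothetical_indices, coefficient=13.578)
print(f"AME per 1 ppt of diversity: {100 * ame:.2f} percentage points")
```

Because the density is evaluated observation by observation, the average marginal effect depends on where the sample sits along the normal curve, not only on the coefficient itself.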

Robustness Checks

The analysis in Appendix A.2 shows that the influence of population diversity on the incidence or prevalence of conflict is robust to: considering alternative definitions and types of intrastate conflict as the outcome variable, such as the prevalence of large-scale civil conflicts (i.e., “civil wars”) and of intrastate conflicts involving only non-state actors (Table A.V); exploiting variations in annually rather than quinquennially repeated cross-country data (Table A.VI, Columns 1–4); and considering an alternative coding of conflict prevalence that captures the share of years with an active civil conflict in a given 5-year interval (Table A.VI, Columns 5–8).

Further, Section A.2 of the Supplemental Material demonstrates that the main findings for the impact of population diversity on civil conflict incidence are qualitatively insensitive to (1) accounting for time-invariant fractionalization and polarization indices of ecological diversity as well as time-varying climatic covariates, including the inter-annual means and volatilities of temperature and precipitation over the previous 5-year interval (Table SA.X, Columns 1–4); (2) accounting for various deep-rooted correlates of long-run economic development, such as the depth of state antiquity, the time elapsed since the Neolithic Revolution, the duration of human settlement, and distance to the year-1500 technological frontier (Table SA.XI, Columns 1–4); (3) accounting for inequality in nighttime luminosity across gridded space and across ethnic homelands within a country (Table SA.XII, Columns 1–4); (4) accounting for alternative distributional indices of intergroup diversity (Alesina et al., 2003; Fearon, 2003; Esteban et al., 2012) and for additional time-invariant geographical and historical correlates of conflict incidence potential, including the percentage of mountainous terrain, the presence of noncontiguous subnational territories, and the intensity of the disease environment (Table SA.XIII); (5) empirically modeling conflict prevalence using either classical logit or “rare events” logit (King and Zeng, 2001) estimators, in lieu of the standard probit estimator (Table SA.XIV, Columns 1–4); and (6) allowing for spatiotemporal dependence across country-time observations by exploiting two-dimensional clustering of standard errors (Table SA.XV, Columns 1–4).

Finally, akin to the current analysis of conflict prevalence, the analysis in Appendix A.1 exploits variations in quinquennially repeated cross-country data to establish interpersonal population diversity as a significant predictor of the intensity of social conflicts. In particular, it examines both ordinal and continuous measures that capture the “severity” of intrastate conflicts and of events related to general social unrest, including but not limited to armed conflict.

3.2.3. Analysis of Civil Conflict Onset in Repeated Cross-Country Data

This section examines the onset of civil conflict. Unlike the model of conflict incidence, the onset model focuses solely on explaining the outbreak of conflict events, classifying the subsequent years into which a given conflict persists as nonevent years (akin to civil peace), unless they coincide with the eruption of a “new” conflict.23 Conceptually, this analysis assesses the extent to which population diversity at the national level influences socio-political instability by triggering conflicts, rather than contributing to their perpetuation over time. The probit model for the analysis of conflict onset is similar to the one for conflict incidence, as described by equations (2)–(4), but with two notable exceptions. Specifically, following the convention in the literature, the model (i) exploits variations in annually repeated cross-country data, with the binary outcome variable assuming a value of 1 if a country-year observation coincides with the first year of a new civil conflict, and 0 otherwise; and (ii) controls for a set of cubic splines in the number of preceding years of uninterrupted peace, along with year dummies, in order to account for temporal or duration dependence (Beck et al., 1998). To mitigate issues of causal identification of the influence of population diversity on conflict onset, the analysis implements the same two strategies followed by the preceding analyses of conflict frequency and conflict incidence.
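The distinction between incidence and onset, together with the peace-years counter that feeds the cubic splines, can be sketched as follows. The code makes simplifying assumptions that are not taken from the paper: it treats a country as having at most one active conflict at a time, so an onset is any year in which incidence switches from 0 to 1 (the eruption of a "new" conflict during an ongoing one is therefore not captured), and it starts the peace counter at zero in the first sample year. The function name and the example series are hypothetical.

```python
def onset_and_peace_years(incidence):
    """Derive onset indicators and peace-year counts from annual incidence.

    incidence: list of 0/1 values, one per year, for a single country.
    Returns (onsets, peace_years), where peace_years[t] is the number of
    consecutive conflict-free years immediately preceding year t.
    """
    onsets, peace_years = [], []
    peace = 0       # running count of uninterrupted peace years
    previous = 0    # incidence in the preceding year
    for active in incidence:
        peace_years.append(peace)
        # An onset is the first year of a conflict spell.
        onsets.append(1 if active and not previous else 0)
        # Continuation years reset the counter; peaceful years extend it.
        peace = 0 if active else peace + 1
        previous = active
    return onsets, peace_years

# Hypothetical eight-year incidence series for one country.
series = [0, 0, 1, 1, 0, 1, 0, 0]
onsets, peace = onset_and_peace_years(series)
print(onsets)  # [0, 0, 1, 0, 0, 1, 0, 0]
print(peace)   # [0, 1, 2, 0, 0, 1, 0, 1]
```

In this coding, the second year of the first conflict spell counts as a nonevent year for onset even though incidence equals 1, which is the sense in which continuation years are treated as akin to civil peace.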

The main results for the onset of new PRIO25 civil conflicts are presented in Columns 5–8 of Table II. Irrespective of the identification strategy employed, or the set of covariates considered by the specification, population diversity appears to confer a statistically significant and robust positive influence on the annual likelihood of new civil conflict outbreaks. To elucidate the economic significance of this impact in the global sample of countries, the sample-wide average marginal effect estimated by the specification in Column 8 suggests that conditional on the complete set of control variables, a 1 percentage point increase in population diversity leads to an increase in the annual likelihood of a new PRIO25 civil conflict outbreak by 0.4 percentage points. Further, based on the regressions in Columns 6 and 8, the plots presented in Figure SA.2 in Section A.3 of the Supplemental Material depict how the predicted annual likelihood of a new conflict onset responds as one moves along the cross-country distribution of population diversity in the relevant estimation sample. For instance, in response to a move from the 10th to the 90th percentile of the cross-country distribution of population diversity, the predicted annual likelihood of a new conflict onset rises from about 1.9 to 3.4 percent in the Old-World sample of countries, and from about 1.1 to 3.6 percent amongst countries worldwide.

Robustness Checks

Section A.2 of the Supplemental Material demonstrates that the main findings regarding the impact of population diversity on civil conflict onset remain qualitatively unaffected after (1) accounting for time-invariant fractionalization and polarization indices of ecological diversity as well as time-varying climatic covariates, including the lagged annual values of temperature and precipitation and their inter-annual volatilities over the previous 5 years (Table SA.X, Columns 5–8); (2) accounting for various deep-rooted correlates of long-run economic development, such as the depth of state antiquity, the time elapsed since the Neolithic Revolution, the duration of human settlement, and distance to the year-1500 technological frontier (Table SA.XI, Columns 5–8); (3) accounting for inequality in nighttime luminosity across gridded space and across ethnic homelands within a country (Table SA.XII, Columns 5–8); (4) empirically modeling conflict onset using either classical logit or “rare events” logit (King and Zeng, 2001) estimators, in lieu of the standard probit estimator (Table SA.XIV, Columns 5–8); (5) allowing for spatiotemporal dependence across country-year observations by exploiting two-dimensional clustering of standard errors (Table SA.XV, Columns 5–8); (6) accounting for the influence of additional correlates of conflict onset potential, including the time-invariant “ethnic dominance” indicator of Collier and Hoeffler (2004) and the time-varying “political instability” and “new state” indicators of Fearon and Laitin (2003) (Table SA.XVI); and (7) accounting for the contemporaneous and lagged values of annual price shocks to various export commodities, as studied by Bazzi and Blattman (2014) (Table SA.XVII).

3.2.4. Analyses of Intra-group Conflict Incidence in Cross-Country and Repeated Cross-Country Data

One crucial dimension in which the advanced measure of population diversity adds value beyond standard indices of ethnolinguistic fragmentation is that it incorporates information on interpersonal heterogeneity not only across group boundaries but within such boundaries as well. Thus, in contrast to standard measures of ethnolinguistic fragmentation at the national level, to the extent that interpersonal diversity can be expected to give rise to social, political, and economic grievances that culminate in violent contention even within ethnically or linguistically homogeneous groups, the measure is naturally better suited to empirically link population diversity with intra-group conflicts in society. The analysis in this section provides evidence that supports this important aspect of the advanced measure, exploiting cross-country variations to establish a positive link between population diversity and the incidence of intra-group conflict events during the 1985–2006 time period.

The primary source of the exploited data on the incidence of intra-group conflict events across the globe is the All Minorities at Risk (AMAR) Phase 1 Sample Data (Birnir et al., 2018). The AMAR Sample Data is a single integrated data set, combining information on 291 subnational groups originally included in the Minorities at Risk (MAR) project with information on 74 new groups randomly selected from the AMAR Sample Frame of “socially relevant” subnational groups, in order to correct for potential selection issues in the original MAR data (Phases I–V). A “socially relevant” subnational group is defined as an ethnic group (majority or minority) that satisfies five criteria outlined and discussed in Birnir et al. (2015).24 For each subnational group in the AMAR Sample Data, the data set provides information on whether the group experienced one or more intra-group conflicts in each year during the 1985–2006 time horizon.

Table III presents two distinct analyses of intra-group conflict incidence. The outcome variable for the cross-country analysis (Panel A) is the share of group-years in a given country with at least one active intra-group conflict over the sample period. For the analysis based on annually repeated cross-country data (Panel B), the outcome is a binary variable that reflects whether any of the AMAR groups within a given country experienced one or more intra-group conflicts in a given year. Depending on the identification strategy from earlier sections (i.e., restricting the estimation sample to countries in the Old World versus exploiting migratory distance from East Africa as an excluded instrument for population diversity in a global sample of countries), the analysis employs either OLS or 2SLS estimators in Panel A, and either probit or IV probit estimators in Panel B. For each outcome variable, and for each of the two identification strategies, three alternative specifications are estimated. The first two of these follow from the methodology in previous sections, in that one conditions the analysis on only exogenous geographical covariates (including continent fixed effects), whereas the other partials out the influence of the full set of controls for geographical characteristics, ethnolinguistic fragmentation, institutional factors, and development outcomes. However, to account for the possibility that the AMAR groups in a given country may not be fully representative of all its subnational groups, the third specification augments the full model with additional controls for the total number and the total share of all AMAR groups in the national population. 
Finally, in line with the methodology from earlier sections, all time-varying controls for institutional factors and development outcomes enter the specifications in Panel A as their respective temporal means over the 1985–2006 time period, whereas in Panel B, these covariates assume their respective lagged annual values.
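The construction of the two outcome variables from group-year records can be sketched as below. The record layout, the function name, and the toy data are hypothetical and do not reflect the actual AMAR file format; the sketch only mirrors the verbal definitions of the Panel A share and the Panel B binary indicator.

```python
def country_outcomes(records):
    """Build the Panel A and Panel B outcomes from group-year records.

    records: iterable of (country, group, year, conflict) tuples, where
    conflict is 1 if the group experienced an intra-group conflict that year.
    Returns (share, any_conflict):
      share[country]                = conflict group-years / all group-years
      any_conflict[(country, year)] = 1 if any group had a conflict that year
    """
    group_years = {}
    conflict_group_years = {}
    any_conflict = {}
    for country, group, year, conflict in records:
        flag = int(bool(conflict))
        group_years[country] = group_years.get(country, 0) + 1
        conflict_group_years[country] = conflict_group_years.get(country, 0) + flag
        key = (country, year)
        any_conflict[key] = max(any_conflict.get(key, 0), flag)
    share = {c: conflict_group_years[c] / group_years[c] for c in group_years}
    return share, any_conflict

# Hypothetical records: country A has two groups, country B has one.
toy = [
    ("A", "g1", 1985, 0), ("A", "g1", 1986, 1),
    ("A", "g2", 1985, 0), ("A", "g2", 1986, 0),
    ("B", "g3", 1985, 0),
]
share, any_conflict = country_outcomes(toy)
print(share)         # {'A': 0.25, 'B': 0.0}
print(any_conflict)  # {('A', 1985): 0, ('A', 1986): 1, ('B', 1985): 0}
```

The share outcome weights every group-year equally, whereas the binary outcome registers a conflict year for the country as soon as any one of its groups is affected.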

Table III:

Population Diversity and Intra-group Conflict

PANEL A. Dependent variable: Share of AMAR group-years with intra-group conflict, 1985–2006

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Cross-country sample | Old World | Old World | Old World | Global | Global | Global |
| Estimator | OLS | OLS | OLS | 2SLS | 2SLS | 2SLS |
| Population diversity (ancestry adjusted) | 4.456** (1.692) | 4.267** (1.711) | 3.580** (1.694) | 5.728*** (1.761) | 5.606*** (1.879) | 5.124*** (1.894) |
| Continent dummies | × | × | × | × | × | × |
| Controls for geography | × | × | × | × | × | × |
| Controls for ethnic diversity | | × | × | | × | × |
| Controls for institutions | | × | × | | × | × |
| Controls for oil, population, and income | | × | × | | × | × |
| Controls for number/share of AMAR groups | | | × | | | × |
| Observations | 91 | 91 | 91 | 115 | 115 | 115 |
| Partial R² of population diversity | 0.079 | 0.068 | 0.051 | | | |
| Adjusted R² | 0.092 | 0.187 | 0.231 | | | |
| Effect of 10th–90th %ile move in diversity | 0.218*** (0.083) | 0.209** (0.084) | 0.175** (0.083) | 0.392*** (0.121) | 0.384*** (0.129) | 0.351*** (0.130) |
| FIRST STAGE (dependent variable: Population diversity, ancestry adjusted) | | | | | | |
| Migratory distance from East Africa (in 10,000 km) | | | | −0.061*** (0.007) | −0.057*** (0.008) | −0.057*** (0.008) |
| First-stage F statistic | | | | 83.366 | 47.887 | 47.107 |

PANEL B. Dependent variable: Annual AMAR intra-group conflict incidence, 1985–2006

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Cross-country sample | Old World | Old World | Old World | Global | Global | Global |
| Estimator | Probit | Probit | Probit | IV Probit | IV Probit | IV Probit |
| Population diversity (ancestry adjusted) | 25.350*** (9.336) | 37.535*** (9.792) | 31.687*** (10.547) | 31.929*** (7.335) | 40.579*** (7.261) | 38.375*** (7.973) |
| Controls as in same column of Panel A | × | × | × | × | × | × |
| Time dummies | × | × | × | × | × | × |
| Observations | 1,658 | 1,658 | 1,658 | 2,179 | 2,179 | 2,179 |
| Countries | 90 | 90 | 90 | 114 | 114 | 114 |
| Pseudo R² | 0.207 | 0.338 | 0.390 | | | |
| Marginal effect of diversity | 7.378*** (2.528) | 9.107*** (2.301) | 7.067*** (2.428) | 8.717*** (1.992) | 10.318*** (2.008) | 9.402*** (2.212) |
| FIRST STAGE (dependent variable: Population diversity, ancestry adjusted) | | | | | | |
| Migratory distance from East Africa (in 10,000 km) | | | | −0.061*** (0.007) | −0.060*** (0.007) | −0.060*** (0.007) |
| First-stage F statistic | | | | 74.527 | 66.911 | 66.939 |

Notes: This table exploits variations across countries and years to establish a significant positive reduced-form impact of contemporary population diversity on (i) the share of group-years of a country during the 1985–2006 time period in which an “all minorities at risk” (AMAR) ethnic group of the country experienced an intra-group conflict (Panel A); and (ii) the likelihood of observing the incidence of an intra-group conflict across a country’s AMAR ethnic groups in any given year during the 1985–2006 time period (Panel B), conditional on ethnic diversity measures, the proximate geographical, institutional, and development-related correlates of conflict, and measures capturing the number and total share of AMAR groups in the national population. The controls for geography include absolute latitude, ruggedness, distance to the nearest waterway, the mean and range of agricultural suitability, the mean and range of elevation, and an indicator for small island nations. The controls for ethnic diversity include ethnic fractionalization and polarization. The controls for institutions include a set of legal origin dummies, comprising two indicators for British and French legal origins, as well as six time-dependent covariates, comprising the degree of executive constraints, two indicators for the type of political regime (democracy and autocracy), and three indicators for experience as a colony of the U.K., France, and any other major colonizing power. The control for oil presence is a time-invariant indicator for the discovery of a petroleum (oil or gas) reserve by the year 2003. The controls for population and income are the time-dependent log-transformed values of total population and GDP per capita. In Panel A, all time-dependent covariates assume their average annual values over the entire 1985–2006 time period, whereas in Panel B, they assume their annual values from the previous year. 
For regressions based on the global sample, the set of continent dummies includes five indicators for Africa, Asia, North America, South America, and Oceania, whereas for regressions based on the Old-World sample, the set includes two indicators for Africa and Asia. The 2SLS and IV probit regressions exploit prehistoric migratory distance from East Africa to the indigenous (precolonial) population of a country as an excluded instrument for the country’s contemporary population diversity. In Panel A, the estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its cross-country distribution is expressed in terms of the share of group-years of a country in which an intra-group conflict was experienced by an AMAR ethnic group. In Panel B, the estimated marginal effect of a 1 percentage point increase in population diversity is the average marginal effect across the entire cross-section of observed diversity values, and it reflects the percentage-point increase in the annual likelihood of an intra-group conflict incidence. Heteroskedasticity-robust standard errors (clustered at the country level in Panel B) are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

The results in Table III indicate that regardless of the outcome variable examined, the set of covariates considered, or the identification strategy employed, population diversity has contributed substantially to the risk of intra-group conflicts during the 1985–2006 time period. This impact is not only highly statistically significant but considerable in terms of economic significance as well. For instance, the regression in Column 5 of Panel A suggests that conditional on the full set of baseline controls, a move from the 10th to the 90th percentile of the global cross-country distribution of population diversity is associated with an increase of 38 percentage points in the likelihood that an AMAR group of a country will have experienced an intra-group conflict at some point during the 1985–2006 time horizon. Moreover, as estimated by the regression in Column 5 of Panel B, a 1 percentage point increase in population diversity leads to an increase in the likelihood of an intra-group conflict incidence in any given country-year during this time horizon by 10 percentage points. Based on the regressions in Columns 2 and 5 of Panel B, the plots presented in Figure SA.3 in Section A.3 of the Supplemental Material illustrate the predicted annual likelihood of an intra-group conflict incidence as a function of the percentile of the cross-country distribution of population diversity in the relevant estimation sample. Specifically, a move from the 10th to the 90th percentile of this distribution is predicted to raise the annual likelihood of an intra-group conflict incidence from about 13 to 55 percent amongst countries in the Old World, and from about 9 to 62 percent in the global sample of countries.
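In the linear specifications of Panel A, the effect of a move from the 10th to the 90th percentile is simply the coefficient multiplied by the gap between those two percentiles of population diversity. The percentile values themselves are not reported in this section, so the sketch below back-calculates the implied gap from the coefficients and effects in Table III. The resulting figures are approximate, since the inputs are rounded to three decimals, and the function name is hypothetical.

```python
def implied_percentile_gap(effect_10_to_90, coefficient):
    """Gap between the 90th and 10th percentiles implied by a linear effect."""
    return effect_10_to_90 / coefficient

# (effect of 10th-90th percentile move, coefficient), from Table III, Panel A.
panel_a = {
    "Column 2 (Old World, OLS)": (0.209, 4.267),
    "Column 5 (Global, 2SLS)": (0.384, 5.606),
}
for label, (effect, coefficient) in panel_a.items():
    gap = implied_percentile_gap(effect, coefficient)
    print(f"{label}: implied diversity gap of about {gap:.3f}")
```

The implied gap is wider in the global sample than in the Old-World sample, which is one reason the same percentile move translates into a larger effect there, beyond the difference in the coefficients themselves.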

3.2.5. Analysis of Historical Conflict Outcomes in Cross-Country Data

The analysis has thus far been confined to examining intrastate conflict events in the last half-century. This restriction permitted it to focus on the post-independence time period of the former European colonies, exploit better quality data and codings for intrastate conflict events, and employ time-varying controls for institutional and development outcomes, as is standard in civil conflict regressions. However, there is no a priori reason why the conflict-promoting role of population diversity should not extend to the distant past.

This section investigates whether population diversity predicts historical conflict events in a cross-section of countries. Specifically, the analysis exploits information on the locations of violent conflicts during the 1400–1799 time period, as compiled by Brecke (1999) and geocoded by Dincecco et al. (2015), employing the geocoding of conflict locations to map these historical conflicts to territories defined by contemporary national borders. The examined time period excludes the colonial wars of the 19th and early 20th centuries, many of which were associated with the Scramble for Africa. In particular, because these wars occurred as a consequence of local resistance to the European colonizers or were triggered by the conflicting interests of the different colonial powers, they are not expected to be related to local population diversity in a meaningful way.

The definition of a violent conflict in Brecke’s data set is based on Cioffi-Revilla (1996): “An occurrence of purposive and lethal violence among 2+ social groups pursuing conflicting political goals that results in fatalities, with at least one belligerent group organized under the command of authoritative leadership. The state does not have to be an actor. Data can include massacres of unarmed civilians or territorial conflicts between warlords.” The list comprises conflicts that resulted in at least 32 fatalities.25 Further, although the data set does not systematically distinguish between intrastate and interstate conflicts, the latter appear to form the basis of the recorded conflicts. Finally, while the recorded conflicts do not necessarily represent the universe of conflict events during the sample period, the list contains almost all major conflicts that have been documented by historians.

In contrast to the analysis of modern conflicts, the explanatory variable of interest in the current analysis is the precolonial population diversity (predicted by migratory distance from East Africa) of a territory bounded by its contemporary national borders. By construction, this measure does not account for the impact of post-1500 migrations on population diversity. In addition, it is not clear at the outset if one should expect any systematic relationship between the native population diversity of a given territory and the outbreak of interstate – as opposed to internal – conflicts in that territory. However, given that the measure of precolonial population diversity is collinear in migratory distance from East Africa, if a conflict’s location is relatively close to the native territories of the warring parties in the conflict, the measure should possess some explanatory power for the onset of such conflicts. Because the conflicts examined occurred during a time period when long-distance campaigns were uncommon, due to the constraints imposed by historical transportation and warfare technologies, precolonial population diversity could in principle explain a considerable part of the variation in interstate conflicts across the globe, especially in earlier periods of the 1400–1799 time horizon.

Table IV presents the analysis of historical conflicts. For the specifications in Columns 1–5, the outcome variable captures the (log-transformed) total number of distinct conflict events in different time intervals during the 1400–1799 time horizon.26 The specification in Column 1 examines conflicts during the entire time horizon of four centuries, whereas those in Columns 2–5 focus on the conflicts recorded for each individual century of the time horizon. Indeed, the data on conflicts that occurred prior to the discovery of the New World are expected to be less contaminated by information on interstate conflicts between warring parties whose combined population diversity is not representative of the population diversity of the locations in which these conflicts occurred. The specifications in Columns 6–10 replicate the analysis from Columns 1–5, except that the outcome variable is an indicator for conflict onset that captures whether there was any recorded conflict event during the specified time interval. All specifications include the geographical controls from the earlier analysis of modern conflicts. In addition, regional dummies are included in all regressions to mitigate the concern that Brecke’s conflict data could suffer from a regional bias in coverage, due to differences across world regions in the quality of primary sources and in the nature and scale of historical conflict events.27

Table IV: Precolonial Population Diversity and the Occurrence of Historical Conflicts across Countries

Historical period: 1400–1799 1400–1499 1500–1599 1600–1699 1700–1799 1400–1799 1400–1499 1500–1599 1600–1699 1700–1799
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
OLS OLS OLS OLS OLS Probit Probit Probit Probit Probit
Outcome (Columns 1–5): Number of conflict onsets in historical period
Outcome (Columns 6–10): Onset of any conflict in historical period
Population diversity (precolonial) 16.336*** (4.264) 13.561*** (3.425) 10.919*** (3.603) 9.878*** (3.127) 6.456** (2.801) 18.211*** (5.799) 35.761*** (6.754) 17.266*** (6.241) 17.622*** (5.745) 12.508** (5.297)
Region dummies × × × × × × × × × ×
Controls for geography × × × × × × × × × ×

Observations 155 155 155 155 155 155 155 155 155 155
Partial R2 of population diversity (Columns 1–5) 0.104 0.136 0.087 0.064 0.039
Adjusted R2 (Columns 1–5) 0.354 0.367 0.356 0.251 0.231
Pseudo R2 (Columns 6–10) 0.248 0.374 0.285 0.224 0.213
Effect of 10th–90th %ile move in diversity 31.725*** (8.281) 8.352*** (2.109) 7.603*** (2.508) 5.911*** (1.871) 2.826** (1.226) 0.541*** (0.098) 0.631*** (0.045) 0.515*** (0.097) 0.560*** (0.085) 0.430*** (0.118)

Notes: This table exploits cross-country variations to establish a significant positive reduced-form impact of indigenous (precolonial) population diversity on (i) the number of conflict onsets (Columns 1–5); and (ii) the likelihood of observing one or more conflict onsets (Columns 6–10), either during the entire 1400–1799 time period (Columns 1 and 6) or in each century therein (Columns 2–5 and 7–10), conditional on the baseline geographical correlates of conflict. The controls for geography include absolute latitude, ruggedness, distance to the nearest waterway, the mean and range of agricultural suitability, the mean and range of elevation, and an indicator for small island nations. The set of region dummies includes four indicators for Sub-Saharan Africa, Middle East and North Africa, Europe and Central Asia, and South Asia. The estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its cross-country distribution is expressed in terms of either the number of conflict onsets (Columns 1–5) or the percentage-point increase in the likelihood of a conflict onset (Columns 6–10) during the time period examined by the regression. Heteroskedasticity-robust standard errors are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

The results indicate that precolonial population diversity had a statistically significant positive influence on both the number and the incidence of historical conflicts. This is true for conflicts that occurred both in the century prior to the discovery of the New World and in the centuries that followed. However, in line with the prior that the impact of native population diversity on conflicts ought to dissipate in time periods marred by mostly international or interregional conflicts (particularly, those involving ancestrally very distant warring populations like the European colonial powers versus the natives), the association between population diversity and conflicts is noticeably weaker in the centuries following the advent of the colonial era.

The OLS estimate in Column 2 implies that a move from the 10th to the 90th percentile in the cross-country distribution of population diversity is associated with 8.4 more conflicts during the 15th century. This impact is somewhat larger than those implied by the comparable specifications for modern civil conflicts.28 This could potentially reflect a waning, albeit significant, influence of population diversity in more contemporary time periods. However, it could also be a mechanical consequence of measurement issues associated with the fact that, in contrast to the earlier analyses of modern civil conflicts, the current analysis of historical conflicts does not distinguish between purely intrastate conflicts and interstate conflicts involving ancestrally proximate warring populations. As for the economic significance of population diversity for historical conflict incidence, the probit regression in Column 7 implies that a move from the 10th to the 90th percentile in the cross-country distribution of population diversity is associated with a 63 percentage point increase in the likelihood of observing a conflict during the 15th century.

In sum, beyond providing temporal external validity to the main findings from the earlier analyses of civil conflict in the contemporary era, the findings in this section attest to a deep-rooted and persistent influence of population diversity on the risk of conflict in society – an impact that is indeed apparent across many centuries.

4. Population Diversity and Conflict at the Ethnicity Level

This section explores the contribution of interpersonal population diversity to the existing variation in the prevalence and severity of conflicts within ethnic homelands. The focus on ethnic homelands permits the analysis to disentangle the impact of population diversity within an ethnic group, rather than across groups, on conflict. The ethnic-level analysis mitigates potential concerns about the confounding effects of population diversity as well as conflict on national borders.

4.1. Data

The ethnic-level analysis is conducted using two novel geo-referenced datasets of ethnic homelands across the globe. The first dataset consists of homelands of indigenous ethnic groups (largely isolated and shielded from genetic admixture) whose levels of diversity are provided by the most comprehensive source on observed genetic diversity (Pemberton et al., 2013).29 The geo-referenced dataset maps the genetic diversity of individuals within each ethnic homeland, as reported in Pemberton et al. (2013), to the geographical characteristics of this homeland.30 The data consist of 207 ethnic homelands for which genetic diversity is observed.31 The distribution of these ethnic groups across the globe is depicted in Figure 3. The level of observed genetic diversity ranges from 0.55 among ethnic groups in South America to 0.77 among those in Africa. Appendix A.6 presents the summary statistics of the sample.

Figure 3: The Spatial Distribution of Ethnic Homelands.

Notes: This map depicts the global spatial distribution of ethnic homelands in the sample. Each point represents the centroid of the historical homeland of an ethnic group. Red points depict homelands for which population diversity is observed, whereas blue points depict homelands for which population diversity is predicted.

The second geo-referenced dataset consists of all homelands of ethnic groups whose geographical territories are delineated by the GREG (“Geo-Referencing of Ethnic Groups”) dataset, as drawn from the classical Soviet Atlas Narodov Mira (Weidmann et al., 2010). Population diversity within these ethnic homelands is predicted based on prehistoric migratory distance from Addis Ababa, using the unconditional relationship between observed genetic diversity and prehistoric migratory distance from Addis Ababa derived from the sample of 207 ethnic homelands.
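The prediction step described above can be sketched as a univariate least-squares fit on the observed-diversity sample, followed by out-of-sample prediction for homelands with no genetic data. This is a minimal illustration only: the function names and all data values below are made up, and the sketch assumes that the "unconditional relationship" is a simple linear fit, which the text does not spell out.

```python
from statistics import mean


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def predict_diversity(distance, intercept, slope):
    """Predicted diversity for a homeland at a given migratory distance."""
    return intercept + slope * distance


# Hypothetical observed sample (NOT the data from Pemberton et al., 2013):
# migratory distance in 10,000 km and observed genetic diversity.
obs_distance = [0.5, 1.0, 1.5, 2.0, 2.5]
obs_diversity = [0.76, 0.73, 0.70, 0.67, 0.64]

intercept, slope = fit_line(obs_distance, obs_diversity)

# Homelands without observed diversity receive a predicted value.
new_distance = [0.8, 2.2]
predicted = [predict_diversity(d, intercept, slope) for d in new_distance]
```

Because the fit uses distance alone, the predicted measure varies across homelands only through migratory distance, which is what makes it usable where genetic data are unavailable.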

While the historical homeland of each ethnic group captures the area of the globe in which the group is predominantly residing, the vast majority of ethnic homelands tend to be fractionalized, as indicated by the fact that they are inhabited by multiple linguistic groups. Hence, the analysis of the impact of interpersonal population diversity on conflict accounts for the potentially confounding effects of the degree of linguistic fractionalization and polarization within ethnic homelands on conflict.32

The main measure of conflict used in the ethnic-level analysis is derived from the UCDP/PRIO Armed Conflict Dataset (Gleditsch et al., 2002). In particular, the analysis focuses on the average yearly share of the area of each ethnic homeland, over the period 1989–2008, that fell within the boundaries of internal armed conflict events (between the government of a state and internal opposition groups).33 Furthermore, the analysis utilizes a second measure that accounts for the number of conflict events, the number of deaths, and the number of deaths per event, as recorded within each ethnic homeland in the UCDP Georeferenced Event Dataset (Sundberg et al., 2012; Croicu and Sundberg, 2015).
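As a rough illustration of the main prevalence measure, the sketch below averages, across years, the share of a homeland's area that falls inside conflict boundaries. The function name, the inputs, and the capping of each yearly share at one are assumptions made for this example; the actual measure is built from geo-referenced polygons rather than from pre-computed areas.

```python
def conflict_prevalence(area_in_conflict_by_year, homeland_area):
    """Average yearly share of a homeland's area inside conflict boundaries.

    area_in_conflict_by_year: one entry per sample year, in the same units
    as homeland_area (hypothetical inputs for illustration).
    """
    if homeland_area <= 0:
        raise ValueError("homeland_area must be positive")
    if not area_in_conflict_by_year:
        raise ValueError("at least one year of data is required")
    # Cap each share at 1.0 in case overlapping conflict zones double-count.
    shares = [min(a / homeland_area, 1.0) for a in area_in_conflict_by_year]
    return sum(shares) / len(shares)


# Hypothetical homeland of 1,000 square km observed over 20 years (1989-2008):
# half of its territory lies in a conflict zone during 4 of those years.
yearly_areas = [500.0] * 4 + [0.0] * 16
prevalence = conflict_prevalence(yearly_areas, 1000.0)
```

In this made-up case the prevalence is 0.1, that is, on average one tenth of the homeland's territory is affected in a given year.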

4.2. Empirical Strategy

The analysis implements several empirical strategies to mitigate concerns about the potential role of reverse causality, omitted cultural, geographical and human characteristics, as well as sorting in the observed association between population diversity and civil conflicts within ethnic homelands. In particular, the positive associations between the extent of the observed population diversity within an ethnic homeland and civil conflict may reflect reverse causality from conflict to population diversity. It is not inconceivable that in the course of human history, conflicts within ethnic groups have operated towards homogenization of the population, reducing its observed levels of diversity. Hence, in order to mitigate concerns about reverse causality, as well as concerns about sample limitations, the ethnic-level analysis further exploits predicted population diversity, rather than observed diversity, to explore the effect of diversity on civil conflict. In particular, as caused by the serial founder effect (e.g., Harpending and Rogers, 2000; Ramachandran et al., 2005; Ashraf and Galor, 2013a) and depicted in Figure 4, observed population diversity within geographically indigenous contemporary ethnic groups decreases with distance along ancient migratory paths from East Africa. Hence, migratory distance from Africa is exploited to predict population diversity for all ethnic groups in the GREG.

Figure 4: Migratory Distance from East Africa and Observed Genetic Diversity across Ethnic Homelands.

Notes: This figure depicts the relationship between prehistoric migratory distance from East Africa and observed population diversity in a sample of 207 ethnic homelands. The negative relationship reflects the serial founder effect associated with expansion of humans from East Africa to the rest of the world.

Furthermore, the associations between ethnic-level population diversity and civil conflicts may be governed by omitted cultural, geographical and human characteristics. Thus, in order to mitigate these concerns, the empirical analysis exploits two related strategies. In light of the serial founder effect, the analysis exploits the migratory distance from Africa to each ethnic group as an instrumental variable for the observed level of population diversity, and as a predictor for its level of diversity. Nevertheless, there are several plausible scenarios that would weaken this identification strategy. First, selective migration out of Africa, or natural selection operating in different ways along the migratory paths, could have affected human traits and therefore conflict independently of the effect of migratory distance from Africa on the degree of diversity in human traits. Second, migratory distance from Africa could be correlated with distances from focal historical locations (e.g., technological frontiers) and could therefore capture the effect of these distances on the process of development and the emergence of conflicts, rather than the effect of these migratory distances via population diversity.
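The logic of the instrumental-variable strategy can be illustrated with the simplest just-identified case: one instrument and no controls, where the IV estimate is the ratio of two covariances. The data below are synthetic and deliberately constructed so that an unobserved factor `u` lowers diversity while raising conflict; none of the numbers come from the paper, and the actual specifications also include fixed effects and controls.

```python
from statistics import mean


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Slope from regressing y on x (with an intercept)."""
    return cov(x, y) / cov(x, x)


def iv_estimate(z, x, y):
    """Just-identified IV estimate with a single instrument z for x."""
    return cov(z, y) / cov(z, x)


# Synthetic example. z plays the role of migratory distance; u is an
# unobserved confounder constructed to be uncorrelated with z.
z = [1.0, 2.0, 3.0, 4.0]
u = [1.0, -1.0, -1.0, 1.0]
true_effect = 2.0
x = [0.8 - 0.05 * zi - 0.01 * ui for zi, ui in zip(z, u)]  # "diversity"
y = [true_effect * xi + 0.5 * ui for xi, ui in zip(x, u)]  # "conflict"

ols = ols_slope(x, y)      # biased, because x is correlated with u
iv = iv_estimate(z, x, y)  # recovers the true effect in this construction
```

In this construction the OLS slope understates the true effect while the IV estimate recovers it, which mirrors the direction of the concern raised in the text. The result rests entirely on `z` being unrelated to `u`, which is the exclusion restriction that the remainder of this subsection defends.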

These potential concerns are mitigated, however, by the following observations. First, while migratory distance from Africa has a significant negative association with the degree of genetic diversity, it has no apparent association with the mean level of human traits (Ashraf and Galor, 2013a), conditional on the distance from the equator. Second, conditional on migratory distance from East Africa, migratory distances from historical technological frontiers in the years 1, 1000, and 1500 do not affect the impact of population diversity on conflict, reinforcing the justification for the reliance on the “out of Africa” hypothesis and the serial founder effect.

Moreover, an unlikely threat to the identification strategy would emerge if the actual migration path out of Africa were correlated with geographical characteristics that are directly conducive to conflicts (e.g., soil quality, ruggedness, climatic conditions, and propensity to trade). This, however, would implausibly require that the conduciveness of these geographical characteristics to conflict be aligned along the main root of the migratory path out of Africa, as well as along each of the main forks that emerge from this primary path. In particular, in several important forks in the course of this migratory path (e.g., the Fertile Crescent and the associated eastward migration towards East Asia and westward migration towards Europe), the geographical characteristics that are conducive to conflicts would have to diminish symmetrically along these diverging migratory routes. Nevertheless, in order to further mitigate this unlikely concern, the analysis establishes that the results are qualitatively unaffected if it accounts for the potentially confounding effects of a wide range of geographical factors in the homeland of each ethnic group. In addition, in order to further mitigate concerns regarding the role of omitted variables, the analysis accounts for spatial auto-correlation as well as regional fixed effects, capturing time-invariant unobserved heterogeneity in each region and hence identifying the association between interpersonal diversity and conflict within, rather than across, regions. Furthermore, it establishes that selection on unobservables is not a concern.

The observed associations between population diversity and the extent of conflicts may further reflect the sorting of less diverse populations into geographical niches characterized by lower conflict. While this sorting would not affect the existence of a positive association between population diversity and the extent of conflict, it could weaken the proposed mechanism. However, in view of the serial founder effect and the tight negative association between migratory distance from Africa and population diversity, sorting would require the ex-ante spatial distribution of conflict to be negatively correlated with migratory distance from Africa. As argued above, this would implausibly require that the conduciveness of geographical characteristics to conflict be negatively aligned with the primary migratory path out of Africa, as well as with each of its diverging forks, diminishing symmetrically along these diverging migratory routes. Nevertheless, to further mitigate this unlikely scenario, the empirical analysis accounts for the potentially confounding effects of a wide range of geographical characteristics, as well as regional fixed effects.

4.3. Empirical Results

This subsection establishes a highly significant and robust reduced-form impact of observed and predicted diversity within an ethnic homeland on intra-societal conflicts within this homeland. The analysis explores the effect of population diversity within ethnic groups on the prevalence of conflict, as well as on its intensity at the ethnic level. The empirical specifications in the ethnic-level analysis follow rather closely the specifications in the country-level analysis, ensuring the comparability of the findings.

Tables V and VI present the results of the baseline analysis of the influence of interpersonal population diversity within an ethnic homeland on log conflict prevalence over the period 1989–2008. Table V conducts the analysis for the observed-diversity sample. In particular, Column 1 establishes a highly significant association between observed diversity across the 207 ethnic homelands and conflict prevalence, conditional on world-region fixed effects.34 Column 2 demonstrates that, as depicted in Figure 5, the association remains highly significant and even increases slightly in magnitude if one accounts for the potentially confounding effects of some exogenous geographical factors (i.e., absolute latitude, ruggedness, mean and range of elevation, mean and range of land suitability, distance from waterway, and an island dummy). Column 3 establishes that, accounting for additional exogenous climatic factors which have been shown to be relevant for conflict (i.e., temperature and precipitation), the association between observed diversity and conflict remains highly significant. The coefficient estimate suggests that an increase in population diversity from the 10th percentile (e.g., the Mamusi people of Oceania) to the 90th percentile (e.g., the Pare people of Eastern Africa) corresponds to an average increase of 0.43 in the prevalence of conflicts within the territory of a homeland over the years 1989–2008 (compared to a sample mean of 0.14 and a standard deviation of 0.27).35 Columns 4 and 5 establish that the qualitative results are unaffected by accounting for the potentially confounding effects of linguistic fractionalization and polarization. Finally, Columns 6 and 7 demonstrate that the estimates remain highly significant and stable if one accounts for a set of potentially endogenous confounders (i.e., log luminosity, malaria endemicity, and time since settlement).
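The percentile-move effects reported alongside each coefficient can be related to the coefficients with simple arithmetic. The sketch below assumes that the reported effect equals the coefficient multiplied by the gap between the 90th and 10th percentiles of diversity, which is the natural calculation for a linear specification but is an inference here rather than something the text states. The function names are hypothetical, and the implied gap is backed out from the reported figures for Columns 1–3 of Table V; it is not a number given in the source.

```python
def percentile_move_effect(coefficient, p10, p90):
    """Effect of moving the regressor from its 10th to its 90th percentile."""
    return coefficient * (p90 - p10)


def implied_gap(effect, coefficient):
    """Back out the percentile gap implied by a reported effect."""
    return effect / coefficient


# (coefficient, reported 10th-90th percentile effect), Table V, Columns 1-3.
table_v = [(28.740, 0.449), (33.896, 0.530), (27.559, 0.431)]

gaps = [implied_gap(effect, coef) for coef, effect in table_v]

# If the assumption holds, every column should imply about the same gap,
# because all columns use the same sample of 207 homelands.
for (coef, effect), gap in zip(table_v, gaps):
    print(f"coefficient={coef:.3f} effect={effect:.3f} implied gap={gap:.4f}")
```

The three implied gaps agree to about three decimal places (roughly 0.0156), which is consistent with the assumed relationship, although it does not prove that this is how the reported effects were computed.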

Table V: Observed Population Diversity and Conflict across Ethnic Homelands

Log conflict prevalence
(1) (2) (3) (4) (5) (6) (7)
OLS OLS OLS OLS OLS OLS OLS
Observed population diversity 28.740*** (9.638) 33.896*** (10.161) 27.559*** (9.634) 27.998*** (9.455) 27.619*** (9.511) 29.020*** (10.662) 28.550*** (10.727)
Ethnolinguistic fractionalization (Columns 4 and 6) 1.291** (0.626) 1.088* (0.642)
Ethnolinguistic polarization (Columns 5 and 7) 0.811 (0.523) 0.733 (0.527)
Regional dummies Yes Yes Yes Yes Yes Yes Yes
Geographical controls No Yes Yes Yes Yes Yes Yes
Climatic controls No No Yes Yes Yes Yes Yes
Development outcomes No No No No No Yes Yes
Disease environment controls No No No No No Yes Yes

Sample Observed Observed Observed Observed Observed Observed Observed
Observations 207 207 207 207 207 207 207
Effect of 10th–90th %ile move in diversity 0.449*** (0.151) 0.530*** (0.159) 0.431*** (0.151) 0.438*** (0.148) 0.432*** (0.149) 0.454*** (0.167) 0.446*** (0.168)
Adjusted R2 0.107 0.168 0.298 0.310 0.303 0.316 0.312
β* (Columns 2–7) 37.750 26.984 27.645 27.080 29.149 28.462

Notes: This table exploits variations across ethnic homelands to establish a significant positive impact of observed population diversity on the log prevalence of conflict during the 1989–2008 period, conditional on the potentially confounding effects of geographic, climatic, and development-related characteristics, as well as the disease environment. World-region fixed effects include Europe, Asia, North America, South America, Oceania, North Africa, and Sub-Saharan Africa. Geographical controls are absolute latitude, ruggedness, mean and range of elevation, and mean and range of land suitability, distance from waterway, and an island dummy. Climatic controls are the mean levels of temperature and precipitation. Development outcomes are time since settlement, presence of oil and gas, and log luminosity. The disease environment control is malaria endemicity. The estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its distribution is expressed in terms of the change in the prevalence of conflicts within the territory of a homeland over the years 1989–2008. The β* statistic is the estimated effect of population diversity, if selection on observables and unobservables are of equal proportions, and the maximal R2 is equal to 1.3 times the observed R2 (Oster, 2019). Heteroskedasticity-robust standard errors are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

Table VI: Predicted Population Diversity and Conflict across Ethnic Homelands

Log conflict prevalence
(1) (2) (3) (4) (5) (6) (7)
OLS OLS OLS OLS OLS OLS 2SLS
Predicted population diversity (Columns 1–6) 77.710*** (6.279) 77.031*** (7.282) 74.010*** (7.396) 73.581*** (7.418) 81.354*** (9.623) 80.889*** (9.735)
Observed population diversity (Column 7) 129.610*** (32.407)
Ethnolinguistic fractionalization (Columns 3 and 5) 0.347 (0.299) 0.200 (0.356)
Ethnolinguistic polarization (Columns 4 and 6) 0.457* (0.263) 0.629** (0.311)
Regional dummies Yes Yes Yes Yes Yes Yes Yes
Geographical controls No Yes Yes Yes Yes Yes Yes
Climatic controls No Yes Yes Yes Yes Yes Yes
Development outcomes No No Yes Yes Yes Yes No
Disease environment controls No No Yes Yes Yes Yes No

Sample Predicted Predicted Predicted Predicted Old World Old World Observed
Observations 901 901 901 901 697 697 207
Effect of 10th–90th %ile move in diversity 1.725*** (0.139) 1.710*** (0.162) 1.643*** (0.164) 1.633*** (0.165) 1.019*** (0.120) 1.013*** (0.122) 2.027*** (0.507)
Adjusted R2 (Columns 1–6) 0.211 0.362 0.378 0.379 0.401 0.404
β* (Columns 2–6) 76.546 71.535 70.829 73.903 73.187

First stage (Column 7):
Migratory distance from East Africa (in 10,000 km) −0.044*** (0.009)
First-stage F-statistic 26.185

Notes: This table exploits variations across ethnic homelands to establish a significant positive impact of predicted population diversity, based on prehistoric migratory distance from East Africa, on the log prevalence of conflict during the 1989–2008 period, conditional on the potentially confounding effects of geographic, climatic, and development-related characteristics, as well as the disease environment. World-region fixed effects include Europe, Asia, North America, South America, Oceania, North Africa, and Sub-Saharan Africa. Geographical controls are absolute latitude, ruggedness, mean and range of elevation, and mean and range of land suitability, distance from waterway, and an island dummy. Climatic controls are the mean levels of temperature and precipitation. Development outcomes are time since settlement, presence of oil and gas, and log luminosity. The disease environment control is malaria endemicity. The estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its distribution is expressed in terms of the change in the prevalence of conflicts within the territory of a homeland over the years 1989–2008. The 2SLS regression exploits prehistoric migratory distance from East Africa to each ethnic homeland as an excluded instrument for the observed population diversity of the ethnic group. The β* statistic is the estimated effect of population diversity, if selection on observables and unobservables are of equal proportions, and the maximal R2 is equal to 1.3 times the observed R2 (Oster, 2019). Heteroskedasticity-robust standard errors are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

Figure 5: Observed Population Diversity and Conflict across Ethnic Homelands.

Notes: This figure depicts the relationship between observed population diversity and conflict prevalence during the period 1989–2008 across 207 ethnic homelands, conditional on world-region fixed effects, and potential geographic and climatic confounders, as reported in Column 3 of Table V.

In light of the potential endogeneity of observed population diversity, Table VI presents the effect of predicted population diversity, based on prehistoric migratory distance from East Africa, on the prevalence of conflict in a sample of 901 ethnic homelands.36 In particular, Column 1 establishes a highly significant effect of predicted diversity on log conflict prevalence, conditional on world-region fixed effects. Column 2 demonstrates that, as depicted in Figure 6, the effect remains highly significant and stable if one accounts for the potentially confounding effects of some exogenous geographical factors (i.e., absolute latitude, ruggedness, mean and range of elevation, mean and range of land suitability, distance from waterway, and an island dummy) as well as additional exogenous climatic factors which have been shown to be relevant for conflict (i.e., temperature and precipitation). In particular, the coefficient estimate suggests that an increase in predicted diversity from the 10th percentile (e.g., the Boruca people of Central America) to the 90th percentile (e.g., the Wafipa people of East Africa) corresponds to an average increase of 1.71 in the prevalence of conflicts within the territory of a homeland over the years 1989–2008 (compared to a sample mean of 0.19 and a standard deviation of 0.32).

Figure 6: Predicted Population Diversity and Conflict across Ethnic Homelands.

Notes: This figure depicts the relationship between predicted population diversity and conflict prevalence during the period 1989–2008 across 901 ethnic homelands, conditional on world-region fixed effects, and potential geographic and climatic confounders, as reported in Column 2 of Table VI.

Columns 3 and 4 of Table VI establish that the qualitative results are unaffected by the potentially confounding effects of linguistic fractionalization and polarization, accounting for a set of potentially endogenous confounders (i.e., log luminosity, malaria endemicity, and time since settlement). Importantly, restricting the analysis to a sample of 697 ethnic homelands in the Old World, which are arguably less sensitive to mass migration in the post-1500 period, Columns 5 and 6 suggest that the effect of predicted diversity on conflict remains highly significant and larger, plausibly due to smaller measurement errors.

Finally, using prehistoric migratory distance from Africa as an instrumental variable for observed population diversity, the 2SLS regression analysis reported in Column 7 suggests that there exists a highly significant reduced-form impact of population diversity on conflict, accounting for world-region fixed effects as well as geographical and climatic characteristics.37 Furthermore, the results remain highly significant if one accounts for the potentially confounding effects of linguistic fractionalization and polarization in the ethnic homelands as well as development outcomes and the disease environment (results available upon request). In line with the results based on predicted diversity, once the potential change in diversity of ethnic groups due to conflict is accounted for, the estimated coefficient of interest in Column 7 suggests that an increase in population diversity from the 10th percentile (e.g., the Mamusi people of Oceania) to the 90th percentile (e.g., the Pare people of Eastern Africa) corresponds to an average increase of 2.03 in the prevalence of conflicts within the territory of a homeland over the years 1989–2008 (compared to a sample mean of 0.14 and a standard deviation of 0.27).

Table A.VIII in Appendix A.5 establishes that population diversity is a significant contributor to the total number of conflict events within an ethnic homeland during the 1989–2008 time period. Table A.IX establishes the significant impact of both observed population diversity and predicted population diversity on the number of conflicts, the number of deaths, and the number of deaths per conflict, accounting for world-region fixed effects, geographic and climatic characteristics, as well as linguistic fractionalization and polarization. Further, the baseline results with respect to the prevalence of conflicts across ethnic homelands are shown to be robust to accounting for: (i) spatial dependence across observations (Tables A.X and A.XI), and (ii) the use of predicted population diversity as a generated regressor (Table A.XII).

Finally, as established in Section B.3 of the Supplemental Material, the baseline results are qualitatively insensitive to accounting for: (i) migratory distances from historical technological frontiers (Table SB.I), and (ii) ecological diversity and ecological polarization (Tables SB.II and SB.III).

5. Potential Mediating Channels

What are the proximate factors that could explain the adverse reduced-form influence of interpersonal population diversity on different forms and dimensions of social conflict? This section explores some potential mediating channels at the national and subnational levels.

5.1. Ethnic Diversity, Interpersonal Trust, and Dispersion in Political Preferences at the Country Level

This subsection examines some hypothesized proximate mechanisms that can potentially mediate the positive reduced-form cross-country relationship between population diversity and the risk of intrastate conflict, as reflected by the annual frequency of new PRIO25 civil conflict outbreaks during the 1960–2017 time period. Specifically, it provides evidence that the main cross-country empirical finding may partly be a ramification of (i) the contribution of interpersonal population diversity to the degree of ethnolinguistic fragmentation at the country level, measured by the total number of ethnic groups in a national population (Fearon, 2003);38 (ii) the adverse influence of population diversity on social capital, based on data from the World Values Survey (2006, 2009) (henceforth, WVS) on the prevalence of generalized interpersonal trust in a country’s population;39 and (iii) the association between population diversity and heterogeneity in preferences for public goods and redistributive policies at the national level, as captured by the intra-country dispersion in self-reported individual political positions on a politically “left”–“right” categorical scale, based on data from the WVS.40

Table VII reports the findings from an empirical examination of the aforementioned three potential mechanisms through which population diversity could partly contribute to the risk of intrastate conflict in society. For each posited channel, the analysis presents the results from estimating three different OLS regressions, exploiting worldwide variations in a common sample of countries, conditioned primarily by the availability of data on the mediating variable in question. In addition, all examined specifications partial out the influence of only the baseline set of geographical covariates (including continent or regional fixed effects). Specifically, the analysis does not include potentially endogenous control variables, many of which (like GDP per capita) may well be afflicted by reverse causality from the temporal frequency of civil conflict onsets and may also be partly determined by both population diversity and the mediating variable.

Table VII:

Population Diversity and the Frequency of Civil Conflict Onset across Countries – Mediating Channels

Mediating channel: Cultural fragmentation (Columns 1–3); Interpersonal trust (Columns 4–6); Preference heterogeneity (Columns 7–9)
(1) (2) (3) (4) (5) (6) (7) (8) (9)
OLS OLS OLS OLS OLS OLS OLS OLS OLS
Dependent variable: Log number of ethnic groups (Column 1); Annual frequency of new civil conflict onsets, 1960–2017 (Columns 2–3); Prevalence of interpersonal trust (Column 4); Annual frequency of new civil conflict onsets, 1960–2017 (Columns 5–6); Variation in political attitudes (Column 7); Annual frequency of new civil conflict onsets, 1960–2017 (Columns 8–9)
Population diversity (ancestry adjusted) 5.187*** (1.887) 0.316*** (0.114) 0.294*** (0.109) −1.817** (0.848) 0.488** (0.221) 0.447* (0.236) 14.344** (6.675) 0.451** (0.219) 0.375 (0.254)
Log number of ethnic groups 0.004 (0.005)
Prevalence of interpersonal trust −0.023 (0.026)
Variation in political attitudes 0.005 (0.006)
Continent/region dummies × × × × × × × × ×
Controls for geography × × × × × × × × ×

Observations 147 147 147 84 84 84 81 81 81
Partial R2 of population diversity 0.049 0.047 0.039 0.075 0.062 0.049 0.082 0.050 0.033
Adjusted R2 0.342 0.203 0.201 0.441 0.232 0.226 0.397 0.247 0.249
Effect of 10th–90th %ile move in diversity 2.136*** (0.777) 0.021*** (0.008) 0.020*** (0.007) −0.104** (0.049) 0.029** (0.013) 0.026* (0.014) 0.824** (0.383) 0.027** (0.013) 0.022 (0.015)

Notes: This table exploits cross-country variations to demonstrate that the significant positive reduced-form influence of contemporary population diversity on the annual frequency of new PRIO25 civil conflict onsets during the 1960–2017 time period, conditional on the baseline geographical correlates of conflict, is at least partly mediated by each of three potentially conflict-augmenting proximate channels that capture the contribution of population diversity to (i) the degree of cultural fragmentation, as reflected by the number of ethnic groups in the national population (Columns 1–3); (ii) the diminished prevalence of generalized interpersonal trust at the country level (Columns 4–6); and (iii) the extent of heterogeneity in preferences for redistribution and public-goods provision, as reflected by the intra-country dispersion in individual political attitudes on a politically “left”–“right” categorical scale (Columns 7–9). For each of the three mediating channels examined, the first regression documents the impact of population diversity on the proximate variable in the channel, the second presents the reduced-form influence of population diversity on conflict, and the third runs a “horse race” between population diversity and the proximate variable to establish reductions in the magnitude and explanatory power of the reduced-form influence of population diversity on conflict. All three regressions for each channel are conducted using a common cross-country sample, conditioned by the availability of data on the relevant variables employed by the analysis of the channel in question. The controls for geography include absolute latitude, ruggedness, distance to the nearest waterway, the mean and range of agricultural suitability, the mean and range of elevation, and an indicator for small island nations. 
The regressions for the “cultural fragmentation” channel control for the full set of continent dummies (i.e., five indicators for Africa, Asia, North America, South America, and Oceania), whereas for the “trust” and “preference heterogeneity” channels, given the smaller degrees of freedom afforded by the more limited sample of countries, the regressions control for a more modest set of region dummies, including an indicator for Sub-Saharan Africa and another for Latin America and the Caribbean. Given that the unit of measurement for the variable reflecting the degree of intra-country dispersion in political attitudes has no natural interpretation, its cross-country distribution is standardized prior to conducting the relevant regressions. The estimated effect associated with increasing diversity from the tenth to the ninetieth percentile of its cross-country distribution is expressed in terms of (i) the actual number of ethnic groups in the national population in Column 1; (ii) the fraction of individuals in a country who “think that most people can be trusted” in Column 4; (iii) the number of standard deviations of the cross-country distribution of the national-level dispersion in political attitudes in Column 7; and (iv) the number of new conflict onsets per year in all the remaining columns. Heteroskedasticity-robust standard errors are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

The analysis of each mechanism proceeds by first regressing the mediating variable on population diversity. These regressions are presented in Columns 1, 4, and 7. All coefficients on population diversity in these regressions are statistically significant at the 5 percent level or better. They suggest that, conditional on exogenous geographical factors, a move from the 10th to the 90th percentile of the cross-country diversity distribution in the relevant sample is associated with (i) an increase in the total number of ethnic groups in a national population by 2.1 groups; (ii) a decrease in the prevalence of generalized interpersonal trust at the country level by 10.4 percentage points; and (iii) an increase in the intra-country dispersion in individual political attitudes by 82.4 percent of a standard deviation from the cross-country distribution of this particular measure.41

The latter two regressions in the analysis of each hypothesized channel establish that the quantitative importance of population diversity as a predictor of the risk of civil conflict becomes diminished in both magnitude and explanatory power once the reduced-form influence of population diversity on the temporal frequency of civil conflict outbreaks is conditioned on the mediating variable of interest. Specifically, a comparison of the regressions in Columns 2 versus 3 indicates that, when conditioned on the total number of ethnic groups in the national population, the influence of population diversity on conflict frequency, in terms of the response associated with a move from the 10th to the 90th percentile of the cross-country diversity distribution, is reduced in magnitude by about 5 percent (from 0.021 to 0.020 new PRIO25 civil conflict onsets per year). Moreover, the explanatory power of population diversity for conflict frequency, as reflected by the partial R2 statistic, diminishes by 17 percent. The corresponding results obtained for each of the other two posited mechanisms are qualitatively similar, and if anything, even more pronounced. In particular, when conditioned on either the prevalence of generalized interpersonal trust in the national population or the intra-country dispersion in political attitudes, the magnitude of the response in conflict frequency that is associated with a move from the 10th to the 90th percentile of the cross-country diversity distribution decreases by either 10.3 percent (Columns 5 versus 6) or 18.5 percent (Columns 8 versus 9), with the explanatory power of population diversity for conflict frequency declining by either 21 percent or 34 percent. Further, as shown in Column 9, the reduced-form influence of population diversity on the frequency of conflict outbreaks becomes statistically indistinguishable from zero when conditioned on the intra-country dispersion in political attitudes.
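The percentage reductions quoted above can be reproduced from the rounded entries of Table VII. The following sketch transcribes the 10th-to-90th percentile effects and the partial R2 statistics from the second and third regression of each channel; the dictionary layout and the helper name `pct_reduction` are illustrative choices, and small discrepancies relative to calculations based on unrounded estimates are to be expected.

```python
# Entries transcribed from Table VII, per channel:
# (effect without mediator, effect with mediator,
#  partial R2 without mediator, partial R2 with mediator)
channels = {
    "cultural fragmentation": (0.021, 0.020, 0.047, 0.039),    # Columns 2 vs 3
    "interpersonal trust": (0.029, 0.026, 0.062, 0.049),       # Columns 5 vs 6
    "preference heterogeneity": (0.027, 0.022, 0.050, 0.033),  # Columns 8 vs 9
}

def pct_reduction(before, after):
    """Percentage decline from `before` to `after`."""
    return 100.0 * (before - after) / before

results = {}
for name, (eff0, eff1, r2_0, r2_1) in channels.items():
    results[name] = (pct_reduction(eff0, eff1), pct_reduction(r2_0, r2_1))
    print(f"{name}: effect falls by {results[name][0]:.1f}%, "
          f"partial R2 falls by {results[name][1]:.0f}%")
```

With the rounded table entries, the decline in the effect is roughly 5, 10, and 19 percent across the three channels, and the decline in the partial R2 is roughly 17, 21, and 34 percent.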

One important caveat regarding the interpretation of the findings in Table VII is that the mediating variables considered here may themselves be endogenous in a model of conflict risk (Rohner et al., 2013a). Indeed, as corroborated by empirical evidence from recent studies (e.g., Fletcher and Iyigun, 2010; Rohner et al., 2013b; Cassar et al., 2013; Besley and Reynal-Querol, 2014), the unobserved historical cross-regional pattern of conflict risk may have partly contributed to the contemporary variations observed across countries in the degree of ethnolinguistic fragmentation, the prevalence of interpersonal trust, and the intra-country dispersion in revealed political preferences. In particular, past conflicts plausibly triggered movements of ethnic groups across space and reinforced extant inter-ethnic cleavages along with the social, political, and economic grievances associated with such divisions. Thus, one ought to be cautious when interpreting the findings from the current analysis as conclusive evidence of the role of these factors as mediators. In order to assess these hypothesized mechanisms more conclusively, one would need to exploit an independent exogenous source of variation for each of these proximate factors, a task that remains open for future exploration.

5.2. Interpersonal Trust at the Individual Level

The proposed hypothesis suggests that interpersonal population diversity is conducive to conflict partly due to its adverse effect on trust and social cohesiveness. This section sheds light on this suggested mechanism, exploring the relationship between interpersonal population diversity and interpersonal trust, using individual data.42 The analysis establishes that a higher degree of population diversity is indeed associated with a lower level of interpersonal trust, suggesting that the impact of diversity on the prevalence of conflict could plausibly operate through the adverse effect of diversity on trust.

5.2.1. Population Diversity and Trust: Individuals in Africa

The analysis establishes a negative association between observed population diversity in ethnic homelands in Africa and the level of trust of individuals (surveyed by the Afrobarometer) who originate from these homelands and reside either in their ethnic homelands or in other regions of Africa. This negative association is robust if one accounts for (i) host-country fixed effects, (ii) individual-level characteristics (i.e., age, gender, education, occupation, living condition, and religion), (iii) exposure to slave exports, (iv) indicators of host district characteristics (i.e., presence of school, electricity, piped water, sewage, health clinic, and urban status), and (v) ancestral country fixed effects.43 Moreover, the analysis accounts for the degree of fragmentation in the ethnic homeland as well as in the host district. Fragmentation in ethnic homelands is captured by linguistic fractionalization and polarization in these ethnic homelands, whereas fragmentation in the host district is captured by ethnic fractionalization in the district as well as the proportion of the respondent’s group in the district population.

Table VIII presents the regression analysis of trust towards other individuals within the ethnic group on the level of interpersonal population diversity in the group.44 The coefficient suggests that an increase in observed population diversity within an ethnic group from the 10th percentile of the distribution (e.g., individuals belonging to the Ashanti people) to the 90th percentile (e.g., individuals belonging to the Sukuma people) corresponds to a 0.29–0.59 point decrease in intragroup trust (compared to a sample mean of 1.52 and a standard deviation of 1.00). The analysis further suggests that ethnolinguistic fractionalization and polarization in the ethnic homeland have an adverse effect on intragroup trust.

Table VIII:

Ethnic-Homeland Population Diversity and Individual-Level Trust in Africa

Intra-group trust
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10)
Observed population diversity (ancestral) −39.496** (17.304) −38.335** (15.859) −45.303*** (12.489) −34.849** (17.698) −37.840** (17.240) −38.467** (15.572) −45.567*** (10.702) −35.190** (15.368) −64.122*** (16.540) −70.334*** (20.333)
Ethnolinguistic fractionalization (ancestral) −0.443 (0.314) −0.447 (0.306) −0.934*** (0.228)
Ethnolinguistic polarization (ancestral) −0.973*** (0.160) −0.959*** (0.211) −1.264*** (0.415)
District-level ethnic fractionalization −0.057 (0.052) 0.006 (0.185) 0.019 (0.201) 0.030 (0.213) 0.027 (0.225)
Proportion of ethnic group in district 0.076 (0.108) 0.087 (0.264) 0.071 (0.258) 0.037 (0.213) 0.029 (0.210)
Host country dummies × × × × × × × × × ×
Baseline individual controls × × × × × × × × ×
Education dummies × ×
Occupation dummies × ×
Living conditions dummies × ×
Religion dummies × ×
Slave export control × ×
Host district characteristics dummies × ×
Ancestral country dummies × ×
Urban dummy × ×

Number of Observations 3212 3212 3212 3212 3212 3212 3212 3212 2916 2916
Adjusted R2 0.218 0.225 0.230 0.234 0.225 0.226 0.230 0.234 0.289 0.287
Effect of 10th-90th %ile move in diversity −0.331** (0.145) −0.321** (0.133) −0.379*** (0.105) −0.292** (0.148) −0.317** (0.144) −0.322** (0.130) −0.382*** (0.090) −0.295** (0.129) −0.537*** (0.138) −0.589*** (0.170)

Notes: This table presents the results of an individual-level OLS regression analysis of interpersonal trust towards individuals of the same ethnicity (as reported in Nunn and Wantchekon (2011)) on observed population diversity in the ancestral ethnicity of these individuals, controlling for a range of individual characteristics (i.e., age, gender, living conditions, education, religion), the presence of a school, electricity, piped water, sewage, and a health clinic in the local area, whether the local area is urban, and the intensity of Atlantic and Indian slave exports. In addition, the analysis accounts for host country fixed effects as well as fixed effects associated with the ancestral country. The estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its distribution is expressed in terms of the change in the trust variable. Heteroskedasticity-robust standard errors, clustered multi-dimensionally at both the ancestral ethnic group and the host country, are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

5.2.2. Population Diversity and Trust: Second-Generation Migrants (U.S.)

This subsection explores the effect of population diversity in the ancestral country of second-generation migrants in the United States on their level of trust (as reported in the General Social Survey, GSS). The focus on a single country permits the analysis to account for time-invariant unobserved heterogeneity in the host country (e.g., geographical, cultural, and institutional characteristics).45 Moreover, the analysis accounts for a range of individual controls, as well as geographical characteristics, regional fixed effects, and the degree of ethnolinguistic fractionalization and polarization, all in the ancestral country of origin.46

Table IX explores the association between the trust of second-generation migrants and the degree of population diversity in their parental country of origin. Column 1 establishes a negative and highly significant association between population diversity in the parental country of origin and trust of second-generation migrants, accounting for regional fixed effects associated with the parental country of origin.47 This highly significant negative association remains largely stable if one accounts for interview-year fixed effects (Column 2), and the fixed effects associated with the respondent’s age, sex, income, education, religion, and region within the United States (i.e., where the interview was conducted) in addition to the ethnic fractionalization or polarization of the homeland (Columns 3 and 4). Moreover, the results are robust to controlling for geographical characteristics of the parental country of origin (Columns 5 and 6).48 The coefficient of interest in Column 4 suggests that an increase in population diversity in the parental country of origin from the 10th percentile of the predicted contemporary level of diversity (e.g., individuals of Mexican descent) to the 90th percentile (e.g., individuals of Austrian descent) corresponds to a decrease in trust by 0.69 points (compared to a sample mean of 1.88 and standard deviation of 0.97). The analysis further suggests that ethnolinguistic fractionalization and polarization in the parental country of origin have no significant association with trust.

Table IX:

Country-of-Origin Population Diversity and Individual-Level Trust among Second-Generation U.S. Immigrants

Trust
(1) (2) (3) (4) (5) (6)
Population diversity (ancestral) −14.670*** (4.234) −15.036*** (3.736) −10.175** (4.483) −9.820** (4.546) −12.343*** (2.368) −12.358*** (1.714)
Ethnic fractionalization (ancestral) 0.014 (0.182) 0.004 (0.202)
Ethnolinguistic polarization (ancestral) −0.028 (0.094) −0.012 (0.122)
Regional dummies (ancestral) × × × × × ×
GSS year × × × × ×
Baseline individual controls × × × ×
Income dummies × × × ×
Education dummies × × × ×
Religion dummies × × × ×
Region of interview dummies × × × ×
Geographical controls (ancestral) × × ×

Number of Observations 2294 2294 1785 1785 1785 1785
Adjusted R2 0.029 0.036 0.096 0.096 0.096 0.096
Effect of 10th-90th %ile move in diversity −1.032*** (0.298) −1.058*** (0.263) −0.716** (0.315) −0.691** (0.320) −0.868*** (0.167) −0.869*** (0.121)

Notes: This table presents the results of an individual-level OLS regression analysis of interpersonal trust among second-generation migrants in the US on population diversity in their parental country of origin (as captured by ancestry-adjusted predicted diversity; Ashraf and Galor (2013a)), accounting for a range of individual-level socioeconomic characteristics (i.e., age, gender, income, religion, education), as well as time-period fixed effects, parental-region fixed effects, and US host-region fixed effects. The estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its distribution is expressed in terms of the change in the trust variable. Heteroskedasticity-robust standard errors, clustered multi-dimensionally at both the ancestral country and the US region of interview, are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

6. Concluding Remarks

This research explores one of the deepest roots of the prevailing variations in the emergence, prevalence, recurrence, and severity of intrasocietal conflicts, molded during the dawn of the dispersion of anatomically modern humans across the globe. It advances the hypothesis and establishes empirically that interpersonal population diversity, as determined predominantly during the exodus of humans from Africa tens of thousands of years ago, has been pivotal to historical and contemporary civil conflicts. The findings arguably reflect the contribution of population diversity to the non-cohesiveness of society, as reflected partly in the prevalence of mistrust, the divergence in preferences for public goods and redistributive policies, and the degree of fractionalization and polarization across ethnic, linguistic, and religious groups. Future research ought to focus on a deeper exploration of these and other possible mechanisms in order to better inform policies geared towards the implementation of optimal educational and sociopolitical institutions that could address the contribution of diversity to the non-cohesiveness of society.

Appendix

A.1. Analysis of Intrastate Conflict Severity in Repeated Cross-Country Data

The findings in Section 3.2 indicate that population diversity is a robust and significant reduced-form contributor to the contemporary risk of conflict in society, as manifested by the frequency, prevalence, and emergence of civil conflict events in the post-1960 time period. However, the outcome variables employed by those regressions are based on binary measures that are subject to a predefined threshold of annual battle-related casualties, which needs to be surpassed for a civil conflict event to be identified as such. Therefore, broadly speaking, the earlier findings reflect the influence of interpersonal population diversity on the extensive margin of conflict. This appendix section explores the influence of population diversity on the intensive margin of conflict. In particular, it employs both ordinal and continuous measures that capture the “severity” of intrastate conflicts and of events related to general social unrest, including but not limited to armed conflict.

The first measure of conflict intensity exploits information on the apparent “magnitude scores” associated with “major episodes” of intrastate armed conflict, as reported by the Major Episodes of Political Violence (MEPV) data set (Marshall, 2017).49 According to this data set, a “major episode” of armed conflict involves both (i) a minimum of 500 directly related fatalities in total; and (ii) systematic violence at a sustained rate of at least 100 directly related casualties per year. Importantly, for each such episode of conflict, the MEPV data set provides a “magnitude score,” namely, an ordinal measure on a scale of 1 to 10 of the episode’s destructive impact on the directly affected society, incorporating information on multiple dimensions of conflict severity, including the capabilities of the state, the interactive intensity (means and goals) of the oppositional actors, the area and scope of death and destruction, the extent of population displacement, and the duration of the episode. The specific outcome variable from the MEPV data set employed by the current analysis reflects the aggregated magnitude score across all conflict episodes that are classified as one of four types of intrastate conflict, namely, civil war, civil violence, ethnic war, and ethnic violence.50 In particular, this variable is reported by the MEPV data set at the country-year level, with nonevent years for a country being coded as 0.

The second measure of conflict intensity is based on annual time-series data on a continuous index of social conflict at the country level, as reported by the Cross-National Time-Series (CNTS) Data Archive (Banks and Wilson, 2018). Rather than adopting an ad hoc fatality-related threshold for the identification of conflict events, this index provides an aggregate summary of the general level of social dissonance in any given country-year, by way of measuring a weighted average across all observed occurrences of eight different types of sociopolitical unrest, including assassinations, general strikes, guerrilla warfare, major government crises, political purges, riots, revolutions, and anti-government demonstrations.51

Table A.I:

Population Diversity and the Severity of Civil Conflict in Repeated Cross-Country Data

Cross-country sample: Old World (Columns 1–2); Global (Columns 3–4); Old World (Columns 5–6); Global (Columns 7–8)
(1) (2) (3) (4) (5) (6) (7) (8)
OLS OLS 2SLS 2SLS OLS OLS 2SLS 2SLS
Dependent variable: Quinquennial MEPV civil conflict severity, 1960–2017 (Columns 1–4); Quinquennial CNTS social conflict index, 1960–2014 (Columns 5–8)
Population diversity (ancestry adjusted) 4.241*** (1.452) 4.089** (1.803) 4.159*** (1.531) 3.981** (1.987) 5.306** (2.350) 5.619*** (1.982) 5.679** (2.599) 6.106*** (2.289)
Continent dummies × × × × × × × ×
Time dummies × × × × × × × ×
Controls for temporal spillovers × × × × × × × ×
Controls for geography × × × × × × × ×
Controls for ethnic diversity × × × ×
Controls for institutions × × × ×
Controls for oil, population, and income × × × ×

Observations 1,270 1,045 1,576 1,311 1,144 924 1,430 1,165
Countries 123 121 149 147 123 120 150 146
Partial R2 of population diversity 0.009 0.006 0.006 0.005
Adjusted R2 0.630 0.614 0.082 0.104
Effect of 10th–90th %ile move in diversity 0.199*** (0.068) 0.183** (0.081) 0.276*** (0.102) 0.264** (0.132) 0.249** (0.110) 0.264*** (0.093) 0.370** (0.169) 0.405*** (0.152)
First-stage F statistic 150.323 101.923 147.137 93.983

Notes: This table exploits variations in repeated cross-country data to establish a significant positive reduced-form impact of contemporary population diversity on the severity of conflict, as reflected by (i) the maximum value of an annual ordinal index of conflict intensity (from the MEPV data set) across all years in any given 5-year interval during the 1960–2017 time period; and (ii) the maximum value of an annual continuous index of the degree of social unrest (from the CNTS data set) across all years in any given 5-year interval during the 1960–2014 time period, conditional on other well-known diversity measures as well as the proximate geographical, institutional, and development-related correlates of conflict. Given that both measures of conflict severity are expressed in units that have no natural interpretation, their intertemporal cross-country distributions are standardized prior to conducting the regression analysis. The controls for geography include absolute latitude, ruggedness, distance to the nearest waterway, the mean and range of agricultural suitability, the mean and range of elevation, and an indicator for small island nations. The controls for ethnic diversity include ethnic fractionalization and polarization. The controls for institutions include a set of legal origin dummies, comprising two indicators for British and French legal origins, as well as six time-dependent covariates that capture the average annual values over the previous 5-year interval of the degree of executive constraints, two indicators for the type of political regime (democracy and autocracy), and three indicators for experience as a colony of the U.K., France, and any other major colonizing power. The control for oil presence is a time-invariant indicator for the discovery of a petroleum (oil or gas) reserve by the year 2003. 
The controls for population and income are the time-dependent log-transformed average annual values over the previous 5-year interval of total population and GDP per capita. To account for temporal dependence in conflict outcomes, all regressions control for the value of the outcome variable from the previous 5-year interval. For regressions based on the global sample, the set of continent dummies includes five indicators for Africa, Asia, North America, South America, and Oceania, whereas for regressions based on the Old-World sample, the set includes two indicators for Africa and Asia. The 2SLS regressions exploit prehistoric migratory distance from East Africa to the indigenous (precolonial) population of a country as an excluded instrument for the country’s contemporary population diversity. The estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its cross-country distribution is expressed in terms of the number of standard deviations of the intertemporal cross-country distribution of the outcome variable. Heteroskedasticity-robust standard errors, clustered at the country level, are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

Given that the current analysis of conflict severity follows Esteban et al. (2012), in terms of exploiting variations in quinquennially repeated cross-country data, for each country, the annual data on either measure of conflict intensity is collapsed to a quinquennial time series, by assigning to any given 5-year interval in the post-1960 sample period, the maximum level of conflict intensity reflected by that measure across all years in the 5-year interval. As in earlier analyses of civil conflict incidence and onset, the examination focuses on better-identified specifications that either (i) exploit variations in a sample of countries belonging only to the Old World, or (ii) exploit migratory distance from East Africa as an instrument for contemporary population diversity in a global sample of countries. All regressions account for temporal dependence in conflict severity by allowing both the lagged observation of the outcome variable and a full set of time-interval (5-year period) dummies to enter the specification. Further, whenever time-varying covariates are allowed to enter the specification, they do so with a one-period lag. Finally, because the units in which the proxies of conflict intensity are measured in the data have no natural interpretation, the outcome variables are standardized prior to running the regressions.
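The data-preparation steps described above (collapsing annual observations to 5-year maxima, pairing each interval with its lag, and standardizing the outcome) can be sketched as follows. The annual series is hypothetical, the function names are illustrative, and the sketch covers a single country with no gaps between intervals. The use of the population standard deviation is an assumption as well, since the text does not specify the normalization, and the paper standardizes over the pooled intertemporal cross-country distribution rather than within a country.

```python
from statistics import mean, pstdev

def collapse_to_quinquennial(annual, start_year):
    """Map each 5-year interval (keyed by its first year) to the maximum annual value in it."""
    periods = {}
    for year, value in annual.items():
        period_start = start_year + 5 * ((year - start_year) // 5)
        periods[period_start] = max(periods.get(period_start, value), value)
    return dict(sorted(periods.items()))

def with_lag(series):
    """Pair each interval's value with the previous interval's value (assumes no gaps)."""
    items = list(series.items())
    return [(p, v, prev_v) for (p, v), (_, prev_v) in zip(items[1:], items[:-1])]

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical annual severity scores for one country (assumption: illustrative only).
annual = {1960: 0, 1961: 2, 1962: 1, 1963: 0, 1964: 0,
          1965: 0, 1966: 0, 1967: 0, 1968: 0, 1969: 0,
          1970: 3, 1971: 4, 1972: 4, 1973: 1, 1974: 0}

quinquennial = collapse_to_quinquennial(annual, start_year=1960)
lagged = with_lag(quinquennial)
z_scores = standardize(list(quinquennial.values()))

print(quinquennial)  # 5-year maxima keyed by interval start
print(lagged)        # (interval, current value, lagged value)
```

The first interval drops out of the lagged sample because it has no predecessor, which is one reason the regression sample is smaller than the number of country-interval cells.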

Table A.I presents the results from the analysis of the influence of interpersonal diversity on intrastate conflict severity, as reflected by either the MEPV aggregate magnitude score of conflict intensity (Columns 1–4) or the CNTS index of social conflict (Columns 5–8).52 Regardless of the measure for conflict intensity examined, the identification strategy exploited, or the set of covariates considered by the specification, the results from the analysis of conflict severity in Table A.I establish population diversity as a qualitatively robust and significant reduced-form contributor to the intensive margin of intrastate conflict. Specifically, a move from the 10th to the 90th percentile of the cross-country distribution of population diversity in the relevant sample is associated with an increase in conflict severity by 18 to 28 percent of a standard deviation from the observed distribution of the MEPV magnitude score of conflict intensity, and with an increase in general social unrest by 25 to 41 percent of a standard deviation from the observed distribution of the CNTS index of social conflict.

A.2. Robustness Checks for the Country-Level Analyses

Selection on Observables and Unobservables

Following the methodology of Altonji et al. (2005), the current analysis exploits the idea that the amount of selection bias due to the unobserved variables in a model can be inferred from the reduction in selection bias from the inclusion of additional observed variables, thus permitting an assessment of how much larger the bias from unobserved heterogeneity needs to be, relative to the bias from observables, in order to fully explain away the coefficient on the explanatory variable of interest.53 Specifically, the analysis compares the estimated coefficient, $\hat{\beta}_1^R$, on population diversity from a restricted model (conditioned on a subset of controls) with its estimated coefficient, $\hat{\beta}_1^F$, from an augmented model (conditioned on the full set of controls), examining the Altonji et al. (2005) ratio, $\mathrm{AET} = \hat{\beta}_1^F / (\hat{\beta}_1^R - \hat{\beta}_1^F)$. Intuitively, a higher absolute value for AET suggests that the additional control variables included in the augmented model, relative to the restricted one, are not sufficient to explain away the estimated coefficient on population diversity in the full specification, and as such, this coefficient cannot be completely attributed to omitted-variable bias unless the amount of selection on unobservables is much larger than that on observables.
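As a small illustration of the ratio just described, the sketch below computes it from a restricted and an augmented coefficient. The function name and the numerical inputs are hypothetical and are not taken from the paper's tables; the ratio is written in its usual form, with the augmented coefficient divided by the gap between the restricted and augmented coefficients.

```python
def aet_ratio(beta_restricted, beta_full):
    """Ratio of the full-model coefficient to the change caused by adding controls.

    A large absolute value means the coefficient barely moves when controls
    are added, so unobservables would need to matter far more than
    observables to drive the estimate to zero.
    """
    gap = beta_restricted - beta_full
    if gap == 0:
        raise ValueError("coefficients are identical; the ratio is undefined")
    return beta_full / gap


# Hypothetical coefficients, for illustration only.
shrinking = aet_ratio(beta_restricted=0.40, beta_full=0.30)  # controls reduce the estimate
growing = aet_ratio(beta_restricted=0.25, beta_full=0.30)    # controls raise the estimate
```

When adding controls raises the coefficient, as in the second call, the ratio is negative, which is why the sign of the ratio carries information about the direction of the coefficient movement.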

The analysis additionally considers the δ and β* statistics suggested by Oster (2019). The δ statistic reflects how strongly correlated the unobservables need to be with population diversity, relative to observables, in order to account for the full size of the coefficient on population diversity. It differs from AET by accounting for the empirical relevance of the observables in explaining the variation in the outcome variable, based on the idea that including observables that do not move the R2 statistic of the regression very much leaves more room for unobservables that are correlated with the variable of interest. The β* statistic reflects the estimated value of the coefficient on population diversity if unobservables were as correlated with population diversity as the observables. Oster (2019) shows that if zero does not belong to the interval between the estimated coefficient on population diversity and β*, then one can reject the null hypothesis that the coefficient of interest is exclusively driven by unobservables.
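The logic of the β* statistic can be sketched with the simplified approximation discussed by Oster (2019), which adjusts the controlled coefficient by the coefficient movement scaled by the movement in R-squared. This is only a sketch: the estimates reported in this appendix may rely on the full estimator rather than this approximation, and the parameter names, the hypothetical inputs, and the choice of the maximum attainable R-squared (`r2_max`) are assumptions of the example.

```python
def oster_beta_star_approx(beta_short, r2_short, beta_long, r2_long,
                           r2_max, delta=1.0):
    """Approximate bias-adjusted coefficient.

    beta_short, r2_short: estimate and R-squared without the added controls.
    beta_long, r2_long:   estimate and R-squared with the added controls.
    r2_max:               assumed R-squared if all unobservables were included.
    delta:                assumed relative degree of selection on unobservables.
    """
    if r2_long <= r2_short:
        raise ValueError("controls must raise the R-squared for the adjustment")
    scale = (r2_max - r2_long) / (r2_long - r2_short)
    return beta_long - delta * (beta_short - beta_long) * scale


def interval_excludes_zero(beta_long, beta_star):
    """True when zero lies outside the interval spanned by the two values."""
    return beta_long * beta_star > 0


# Hypothetical inputs, for illustration only.
beta_star = oster_beta_star_approx(
    beta_short=0.5, r2_short=0.2, beta_long=0.4, r2_long=0.4, r2_max=0.6
)
robust = interval_excludes_zero(0.4, beta_star)
```

The approximation makes the role of the R-squared explicit: controls that shift the coefficient while adding little explanatory power leave a large `scale`, and hence a larger adjustment.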

Table A.II:

Population Diversity and the Count of Civil Conflict Onsets across Countries

Cross-country sample: Global (Columns 1–5), Old World (Columns 6–7), Global (Columns 8–9)
(1) (2) (3) (4) (5) (6) (7) (8) (9)
Estimator: Negative Binomial (Columns 1–7), Poisson (Columns 8–9)
Outcome variable: Total count of new PRIO25 civil conflict onsets, 1960–2017
Population diversity (ancestry adjusted) 10.032*** (3.878) 19.339*** (3.559) 13.092** (5.238) 14.180*** (5.232) 12.884*** (4.674) 17.968*** (6.045) 18.025*** (5.358) 13.592** (5.512) 12.884*** (4.674)
Continent dummies × × × × × × ×
Controls for geography × × × × × × × ×
Controls for ethnic diversity × × × ×
Controls for institutions × × ×
Controls for oil, population, and income × × ×

Observations 150 150 150 150 147 123 121 150 147
Pseudo R2 0.013 0.128 0.153 0.158 0.257 0.149 0.276 0.219 0.317
Marginal effect of diversity 0.114** (0.046) 0.220*** (0.051) 0.149** (0.064) 0.162** (0.065) 0.147** (0.058) 0.231*** (0.086) 0.231*** (0.075) 0.155** (0.068) 0.147** (0.058)

Notes: This table conducts a robustness check on the results from the baseline cross-country analysis of the reduced-form impact of contemporary population diversity on civil conflict onsets, as shown in Table I. Specifically, it establishes robustness to considering the total count rather than the annual frequency of civil conflict onsets over the post-1960 time period as the outcome variable. In line with the standard for analyzing over-dispersed count data, the regressions are estimated using the negative-binomial as opposed to a least-squares estimator. Given the absence of a negative-binomial estimator that permits instrumentation, however, the current analysis is unable to implement the strategy of exploiting prehistoric migratory distance from East Africa to the indigenous (precolonial) population of a country as an excluded instrument for the country’s contemporary population diversity. Thus, in lieu of implementing the instrument-based identification strategy in the global sample of countries, Columns 8–9 examine robustness to employing the Poisson rather than the negative-binomial estimator for estimating the specifications from Columns 6–7, respectively. The specifications examined in this table are otherwise identical to corresponding OLS specifications reported in Table I. The reader is therefore referred to Table I and the corresponding table notes for additional details on the baseline set of covariates considered by the current analysis. The estimated marginal effect of a 1 percentage point increase in population diversity is the average marginal effect across the entire cross-section of observed diversity values, and it reflects the increase in the total number of new conflict onsets over the post-1960 time period. Heteroskedasticity-robust standard errors are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

The analysis treats the specification from Column 3 of Table I as the restricted model. This specification includes, besides population diversity, the baseline geographical controls and continent fixed effects. Coefficient stability is assessed relative to the augmented specification presented in Column 8 that includes the full set of control variables. The resulting AET ratio is −10.3, and it suggests that selection on unobservables would have to be at least ten times larger than the selection on observables to account for the full size of the estimated coefficient for population diversity.54 On the other hand, Oster’s δ statistic is 1.93, indicating that the correlation of unobservables with population diversity needs to be almost twice as large as the correlation of population diversity with observables in order to drive the estimate down to zero. Assuming that the unobservables are equally correlated with population diversity as are the observables, and that these correlations have the same sign, the estimated coefficient for diversity, if one were able to control for all unobservables, would be β* = 1.15. Thus, the interval between the actual coefficient estimate from the full specification (0.309) and β* excludes zero.55 It is therefore rather unlikely that the main results could be explained away by omitted variables.

Table A.III:

Population Diversity and the Frequency of Civil Conflict Onset across Countries – Robustness to Accounting for Spatial Dependence

Cross-country sample: Global (Columns 1–5), Old World (Columns 6–7), Global (Columns 8–9)
(1) (2) (3) (4) (5) (6) (7) (8) (9)
Model: SARAR (all columns)
Estimator: OLS (Columns 1–7), 2SLS (Columns 8–9)
Outcome variable: Log number of new PRIO25 civil conflict onsets per year, 1960–2017
Population diversity (ancestry adjusted) 0.253** (0.099) 0.447*** (0.109) 0.320*** (0.120) 0.329*** (0.121) 0.288** (0.130) 0.717*** (0.251) 0.643*** (0.223) 0.602*** (0.219) 0.457*** (0.175)
Spatial lag AR(1) of conflict (λ) −0.633 (1.078) −0.164 (0.750) −0.226 (0.750) −0.214 (0.729) 0.362 (0.761) −1.123 (0.833) −0.199 (0.772) −0.851 (0.849) 0.317 (0.748)
Spatial lag AR(1) of error (ρ) 0.177 (0.814) 0.579 (0.846) 0.629 (0.840) 0.328 (0.842) 0.470 (0.798) 1.103 (0.817) 0.963 (0.669) 1.115 (0.821) 0.346 (0.744)
Continent dummies × × × × × × ×
Controls for geography × × × × × × × ×
Controls for ethnic diversity × × × ×
Controls for institutions × × ×
Controls for oil, population, and income × ×

Observations 150 150 150 150 147 123 121 150 147
Effect of 10th–90th %ile move in diversity 0.017** (0.007) 0.030*** (0.007) 0.021*** (0.008) 0.022*** (0.008) 0.020** (0.009) 0.035*** (0.012) 0.028*** (0.010) 0.040*** (0.015) 0.031*** (0.012)

Notes: This table conducts a robustness check on the results from the baseline cross-country analysis of the reduced-form impact of contemporary population diversity on the annual frequency of civil conflict onsets, as shown in Table I. Specifically, it establishes robustness to accounting for spatial dependence across observations by estimating spatial-autoregressive models with spatial-autoregressive disturbances (SARAR(1,1)) using a generalized spatial two-stage least-squares (GS2SLS) estimator (e.g., Drukker et al., 2013). To perform this robustness check, which involves the estimation of the AR(1) coefficients, λ and ρ, respectively associated with the spatial lags in the outcome variable and the error term, the estimator exploits an inverse-distance spatial weighting matrix for the regression sample, based on the great-circle distances between the geodesic centroids of country pairs. The specifications examined in this table are otherwise identical to corresponding ones reported in Table I. The reader is therefore referred to Table I and the corresponding table notes for additional details on the baseline set of covariates considered by the current analysis as well as the identification strategy employed by the 2SLS regressions in the last two columns. The estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its cross-country distribution is expressed in terms of the number of new conflict onsets per year. Heteroskedasticity-robust standard errors are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

Robustness to Examining the Count of Civil Conflict Onset across Countries

Given that the baseline cross-country regressions employ least-squares estimation, a log transformation is applied to the outcome variable in order to partly address the issue that its cross-country distribution is positively skewed with excess zeros, arising from the fact that new civil conflict onsets are generally rare events in cross-sectional data. An alternative approach to this issue, however, is to employ an estimation method that is tailored to the analysis of over-dispersed count data. The analysis in Table A.II considers the total count rather than the annual frequency of civil conflict onsets over the 1960–2017 time period as the outcome variable. The regressions in Columns 1–7 are estimated using the negative-binomial (as opposed to a least-squares) estimator to account for over-dispersion. Given the absence of a negative-binomial estimator that permits instrumentation, Columns 8–9 do not implement the instrument-based identification strategy in the global sample of countries; instead, they examine robustness to employing the Poisson rather than the negative-binomial estimator in that sample. To interpret the influence of population diversity, the estimate in Column 7 suggests that conditional on the full set of control variables, a 5 percentage point increase in population diversity translates roughly into an additional civil conflict amongst countries in the Old World during the 1960–2017 time period.
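To make the interpretation of the count-model estimates concrete, the sketch below shows how an average marginal effect can be obtained under a log-link conditional mean, which both the Poisson and the standard negative-binomial models share. The function name and the fitted linear predictors are hypothetical; only the Column 7 marginal effect (0.231 per percentage point) is taken from Table A.II.

```python
import math


def average_marginal_effect(beta, linear_predictors, step=0.01):
    """Average effect on the expected count of raising the regressor by `step`.

    With E[y | x] = exp(x'b), the derivative with respect to the regressor is
    beta * exp(x'b). Averaging over observations and scaling by `step`
    (0.01 = one percentage point) gives a first-order approximation.
    """
    mean_fitted = sum(math.exp(xb) for xb in linear_predictors) / len(linear_predictors)
    return step * beta * mean_fitted


# Hypothetical fitted linear predictors, for illustration only.
ame = average_marginal_effect(beta=10.0, linear_predictors=[0.0, 0.0])

# Reading of Table A.II, Column 7: 0.231 additional onsets per percentage
# point, so a 5 percentage point increase implies roughly one more onset.
additional_onsets = 5 * 0.231
```

The second calculation reproduces the reading in the text: five times the reported marginal effect is about 1.16, that is, roughly one additional conflict onset over the sample period.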

Table A.IV:

Population Diversity and the Frequency of Civil Conflict Onset across Countries – Robustness to Accounting for Population Diversity as a Generated Regressor

Cross-country sample: Global (Columns 1–5), Old World (Columns 6–7), Global (Columns 8–9)
(1) (2) (3) (4) (5) (6) (7) (8) (9)
Estimator: OLS (Columns 1–7), 2SLS (Columns 8–9)
Outcome variable: Log number of new PRIO25 civil conflict onsets per year, 1960–2017
Population diversity (ancestry adjusted) 0.209*** (0.066) 0.439*** (0.103) 0.306*** (0.118) 0.318*** (0.123) 0.309** (0.138) 0.548*** (0.189) 0.597*** (0.227) 0.537*** (0.184) 0.602*** (0.223)
Continent dummies × × × × × × ×
Controls for geography × × × × × × × ×
Controls for ethnic diversity × × × ×
Controls for institutions × × ×
Controls for oil, population, and income × × ×

Observations 150 150 150 150 147 123 121 150 147
Adjusted R2 0.029 0.189 0.213 0.215 0.358 0.225 0.392
Effect of 10th–90th %ile move in diversity 0.014*** (0.005) 0.029*** (0.007) 0.020** (0.008) 0.021** (0.009) 0.021** (0.010) 0.026** (0.011) 0.026** (0.012) 0.036*** (0.013) 0.041** (0.016)

Notes: This table conducts a robustness check on the results from the baseline cross-country analysis of the reduced-form impact of contemporary population diversity on the annual frequency of civil conflict onsets, as shown in Table I. Specifically, it establishes robustness of the standard-error estimates to accounting for the fact that the country-level measure of contemporary population diversity is a generated regressor in the empirical specifications, because it is projected from implicit zeroth-stage relationships (a) between prehistoric migratory distance from East Africa and expected heterozygosity in the HGDP-CEPH sample of 53 ethnic groups, and (b) between pairwise migratory distance and pairwise FST genetic distance across all pairs of ethnic groups in this sample. To perform this robustness check, the current analysis adopts the two-step bootstrapping technique implemented by Ashraf and Galor (2013a) for computing the standard-error estimates, so the reader is referred to that work for additional details on the technique. The specifications examined in this table are otherwise identical to corresponding ones reported in Table I. The reader is therefore referred to Table I and the corresponding table notes for additional details on the baseline set of covariates considered by the current analysis as well as the identification strategy employed by the 2SLS regressions in the last two columns. The estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its cross-country distribution is expressed in terms of the number of new conflict onsets per year. Bootstrapped standard errors, accounting for the use of a generated regressor, are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

Robustness to Accounting for Spatial Dependence

To account for spatial dependence across country observations, the analysis in Table A.III replicates the key specifications from Table I using spatial-autoregressive models with spatial-autoregressive disturbances (SARAR(1,1)), estimated by a generalized spatial two-stage least-squares (GS2SLS) estimator (e.g., Drukker et al., 2013). These spatial regressions involve the estimation of AR(1) coefficients, λ and ρ, that are respectively associated with the spatial lags in the outcome variable and the error term. To perform this robustness check, the estimator exploits an inverse-distance spatial weighting matrix for the regression sample, based on the great-circle distances between the geodesic centroids of country pairs. Reassuringly, all of the main cross-country findings remain qualitatively intact, indicating that spatial dependence across country observations is not a confounding issue.
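The construction of an inverse-distance weighting matrix from centroid coordinates can be sketched as below. This is an illustrative sketch rather than the authors' procedure: the haversine formula, the Earth radius constant, the absence of any row normalization, and the example coordinates are all assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (degrees) via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def inverse_distance_weights(centroids):
    """Return an n-by-n matrix with w_ij = 1 / d_ij and zeros on the diagonal.

    `centroids` is a list of (latitude, longitude) pairs. Coincident
    centroids are not handled and would raise ZeroDivisionError.
    """
    n = len(centroids)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = great_circle_km(*centroids[i], *centroids[j])
                weights[i][j] = 1.0 / d
    return weights


# Hypothetical centroids placed on the equator, for illustration only.
W = inverse_distance_weights([(0.0, 0.0), (0.0, 90.0), (0.0, 180.0)])
```

Under this weighting, nearer countries receive more weight in the spatial lags of both the outcome and the error term, which is the sense in which the SARAR model allows for spillovers between neighbors.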

Robustness to Accounting for Population Diversity as a Generated Regressor

The measure of contemporary population diversity is a generated regressor in the main specifications, because it is projected from implicit zeroth-stage relationships (i) between prehistoric migratory distance from East Africa and expected heterozygosity in the HGDP-CEPH sample of 53 ethnic groups, and (ii) between pairwise migratory distance and pairwise FST genetic distance across all pairs of ethnic groups in this sample. Table A.IV therefore checks the robustness of the standard-error estimates to accounting for potential bias due to the use of a generated regressor. To perform this robustness check, the analysis replicates the key specifications from Table I, adopting the two-step bootstrapping technique implemented by Ashraf and Galor (2013a) for estimating the standard errors. The reader is referred to that work for additional details on the technique. As expected, the bootstrapped standard errors are indeed somewhat larger than their robust counterparts from Table I, but reassuringly, the statistical significance of the coefficients on population diversity remains unaffected.
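The general idea of bootstrapping through a generated regressor can be sketched as follows. This is a generic, heavily simplified illustration and not the procedure of Ashraf and Galor (2013a), which should be consulted for the actual technique: it uses a single zeroth-stage relationship, a bivariate second stage, and synthetic data, and all names are hypothetical.

```python
import random
from statistics import mean, stdev


def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def two_step_bootstrap_se(groups, countries, reps=500, seed=0):
    """Bootstrap standard error of the second-stage slope.

    groups:    (distance, observed diversity) pairs for the zeroth stage.
    countries: (distance, outcome) pairs for the second stage.
    Each replication resamples both data sets, re-estimates the zeroth
    stage, regenerates predicted diversity, and re-runs the second stage.
    """
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        g = [rng.choice(groups) for _ in groups]
        c = [rng.choice(countries) for _ in countries]
        a0, a1 = ols_fit([d for d, _ in g], [h for _, h in g])
        predicted = [a0 + a1 * d for d, _ in c]
        _, slope = ols_fit(predicted, [y for _, y in c])
        slopes.append(slope)
    return stdev(slopes)


# Synthetic data, for illustration only.
_rng = random.Random(1)
groups = [(i / 10, 0.80 - 0.06 * (i / 10) + _rng.gauss(0, 0.01)) for i in range(1, 21)]
countries = [(i / 15, 0.50 - 0.30 * (i / 15) + _rng.gauss(0, 0.05)) for i in range(1, 31)]
se = two_step_bootstrap_se(groups, countries, reps=200)
```

Because the zeroth stage is re-estimated in every replication, the resulting standard error reflects the sampling uncertainty in the projection itself, which conventional robust standard errors ignore.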

Table A.V:

Population Diversity and the Incidence of Civil Conflict in Repeated Cross-Country Data – Robustness to Examining Alternative Measures of Conflict Incidence

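Cross-country sample: Old World (Columns 1–2), Global (Columns 3–4), Old World (Columns 5–6), Global (Columns 7–8)
(1) (2) (3) (4) (5) (6) (7) (8)
Estimator: Probit (Columns 1–2), IV Probit (Columns 3–4), Probit (Columns 5–6), IV Probit (Columns 7–8)
Outcome variable: Quinquennial PRIO1000 civil war incidence, 1960–2017 (Columns 1–4); Quinquennial UCDP nonstate conflict incidence, 1989–2017 (Columns 5–8)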
Population diversity (ancestry adjusted) 16.221*** (4.285) 11.251** (5.482) 17.090*** (4.256) 16.327*** (5.808) 24.499*** (5.399) 25.186*** (6.408) 22.511*** (4.992) 24.662*** (5.563)
Continent dummies × × × × × × × ×
Time dummies × × × × × × × ×
Controls for temporal spillovers × × × × × × × ×
Controls for geography × × × × × × × ×
Controls for ethnic diversity × × × ×
Controls for institutions × × × ×
Controls for oil, population, and income × × × ×

Observations 1,270 1,026 1,551 1,262 717 670 879 824
Countries 123 121 147 144 123 121 150 147
Pseudo R2 0.392 0.390 0.436 0.459
Marginal effect of diversity 1.850*** (0.540) 1.212** (0.617) 2.005*** (0.631) 1.786** (0.777) 3.835*** (0.837) 3.568*** (0.927) 3.790*** (0.911) 3.839*** (1.013)
First-stage F statistic 168.723 113.194 148.632 120.800

Notes: This table conducts a robustness check on the results from the baseline analysis of the reduced-form impact of contemporary population diversity on the quinquennial incidence of intrastate conflict in repeated cross-country data, as shown in Columns 1–4 of Table II. Specifically, it establishes robustness to considering the temporal incidence of alternative forms of intrastate conflict as the outcome variable, including the incidence of (i) a high-intensity PRIO1000 civil war in any given 5-year interval during the 1960–2017 time period (Columns 1–4); and (ii) a low-intensity conflict involving nonstate actors in any given 5-year interval during the 1989–2017 time period (Columns 5–8). The specifications examined in this table are otherwise identical to corresponding ones reported in Columns 1–4 of Table II. The reader is therefore referred to Table II and the corresponding table notes for additional details on the baseline set of covariates considered by the current analysis, the identification strategy employed by the IV probit regressions, and the estimation and interpretation of the marginal effect of population diversity on the incidence of conflict. Heteroskedasticity-robust standard errors, clustered at the country level, are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

Robustness to Examining Alternative Measures of Conflict Incidence

As shown in Columns 1–4 of Table II, population diversity is positively and significantly associated with the quinquennial incidence of a PRIO25 civil conflict (with at least 25 battle-related deaths in a year) in the post-1960 time period. The analysis in Table A.V examines whether the same result holds when considering the temporal incidence of alternative forms of intrastate conflict as the outcome variable, including the incidence in any given 5-year interval of (i) a high-intensity PRIO1000 civil war (with at least 1,000 battle-related deaths in a year) during the 1960–2017 time period (Columns 1–4); and (ii) a low-intensity conflict (with at least 25 battle-related deaths in a year) involving only nonstate actors during the 1989–2017 time period (Columns 5–8). The findings indicate that regardless of the covariates included in the specification or the identification strategy exploited, population diversity exerts a positive and significant influence on the quinquennial incidence of either of the aforementioned types of intrastate conflict. To interpret the coefficient of interest, the IV probit regressions presented in Columns 4 and 8 suggest that conditional on the full set of control variables, a 1 percentage point increase in population diversity increases the quinquennial likelihoods of conflict incidence by 1.8 percentage points for PRIO1000 civil wars and by 3.8 percentage points for internal conflicts involving nonstate actors.

Table A.VI:

Population Diversity and the Incidence of Civil Conflict in Repeated Cross-Country Data – Robustness to Examining the Annual Incidence or Quinquennial Prevalence of Conflict

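Cross-country sample: Old World (Columns 1–2), Global (Columns 3–4), Old World (Columns 5–6), Global (Columns 7–8)
(1) (2) (3) (4) (5) (6) (7) (8)
Estimator: Probit (Columns 1–2), IV Probit (Columns 3–4), OLS (Columns 5–6), 2SLS (Columns 7–8)
Outcome variable: Annual PRIO25 civil conflict incidence, 1960–2017 (Columns 1–4); Quinquennial PRIO25 civil conflict prevalence, 1960–2017 (Columns 5–8)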
Population diversity (ancestry adjusted) 9.301*** (3.015) 9.763*** (3.203) 10.762*** (3.121) 12.848*** (3.914) 1.710*** (0.558) 1.737*** (0.637) 1.773*** (0.565) 1.988*** (0.716)
Continent dummies × × × × × × × ×
Time dummies × × × × × × × ×
Controls for temporal spillovers × × × × × × × ×
Controls for geography × × × × × × × ×
Controls for ethnic diversity × × × ×
Controls for institutions × × × ×
Controls for oil, population, and income × × × ×

Observations 6,280 5,221 7,801 6,569 1,270 1,045 1,583 1,311
Countries 123 121 150 147 123 121 150 147
Pseudo R2 0.597 0.602
Adjusted R2 0.621 0.598
Marginal effect of diversity 0.976*** (0.329) 0.973*** (0.339) 1.125*** (0.367) 1.297*** (0.463)
Effect of 10th–90th %ile move in diversity 0.080*** (0.026) 0.078*** (0.028) 0.115*** (0.037) 0.132*** (0.047)
First-stage F statistic 155.509 103.745 151.471 104.807

Notes: This table conducts a robustness check on the results from the baseline analysis of the reduced-form impact of contemporary population diversity on the temporal incidence or prevalence of civil conflict in repeated cross-country data, as shown in Columns 1–4 of Table II. Specifically, it establishes robustness to considering (i) the annual incidence of conflict, by examining annual rather than quinquennial repetitions of the cross-section (Columns 1–4); and (ii) the quinquennial prevalence of conflict, by examining the share of years with an active civil conflict in any given 5-year interval (Columns 5–8). The specifications examined in this table are essentially identical to corresponding ones reported in Columns 1–4 of Table II, with the exception that in Columns 1–4 of the current analysis, the time-dependent baseline controls for institutions (i.e., executive constraints, indicators for the type of political regime, and indicators for colonial experience by identity of the colonizing power), total population, GDP per capita, and temporal spillovers are all appropriately adjusted to assume their respective lagged annual values, rather than their values corresponding to the previous 5-year interval. The reader is therefore referred to Table II and the corresponding table notes for additional details on the baseline set of covariates considered by the current analysis as well as the identification strategy employed by the IV probit or 2SLS regressions. In Columns 1–4, the estimated marginal effect of a 1 percentage point increase in population diversity is the average marginal effect across the entire cross-section of observed diversity values, and it reflects the increase in the annual likelihood of a conflict incidence, expressed in percentage points. In Columns 5–8, the estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its cross-country distribution is expressed in terms of the share of years with an active conflict in any given 5-year interval. Heteroskedasticity-robust standard errors, clustered at the country level, are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

Robustness to Examining the Annual Incidence or Quinquennial Prevalence of Civil Conflict

The analysis in Table A.VI checks the robustness of the baseline results for the incidence of civil conflict, as shown in Columns 1–4 of Table II, to considering alternative outcomes of conflict incidence or prevalence, including (i) the annual incidence of conflict, by examining annual rather than quinquennial repetitions of the cross-section (Columns 1–4); and (ii) the quinquennial prevalence of conflict, by examining the share of years with an active civil conflict in any given 5-year interval (Columns 5–8). The specifications examined in this table are essentially identical to corresponding ones reported in Columns 1–4 of Table II, with the exception that in Columns 1–4 of the current analysis, the time-dependent baseline controls for institutions (i.e., executive constraints, indicators for the type of political regime, and indicators for colonial experience by identity of the colonizing power), total population, GDP per capita, and temporal spillovers are all appropriately adjusted to assume their respective lagged annual values, rather than their values corresponding to the previous 5-year interval. As is evident from the results in Table A.VI, regardless of the identification strategy exploited or the covariates included in the specification, population diversity contributes positively and significantly to both the annual incidence and the quinquennial prevalence of civil conflict during the 1960–2017 time period. Specifically, the global average marginal effect estimated by the specification in Column 4 suggests that conditional on the full set of control variables, a 1 percentage point increase in population diversity increases the annual likelihood of a conflict incidence by 1.3 percentage points. 
Further, the specification in Column 8 suggests that conditional on all covariates, a move from the 10th to the 90th percentile of the global cross-country distribution of population diversity is associated with an increase of 13 percentage points in the fraction of years with an active conflict in any given 5-year interval.
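The prevalence outcome used in Columns 5–8 can be sketched as the share of conflict years within each 5-year interval. The function name, the alignment of the intervals to 1960, and the treatment of the final, shorter interval (2015–2017) are assumptions of this example.

```python
def quinquennial_prevalence(active_years, start_year=1960, end_year=2017, width=5):
    """Share of years with an active conflict in each interval, for one country.

    `active_years` is a set of calendar years with an active conflict.
    The last interval is truncated at `end_year` and its share is computed
    over the years it actually contains.
    """
    shares = {}
    for period in range(start_year, end_year + 1, width):
        years = range(period, min(period + width, end_year + 1))
        shares[period] = sum(year in active_years for year in years) / len(years)
    return shares


# Hypothetical conflict years for one country, for illustration only.
prevalence = quinquennial_prevalence({1960, 1961, 1966, 2016})
```

Unlike the binary incidence indicator, this measure distinguishes an interval with one conflict year from an interval in which conflict was active throughout.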

A.3. Appendix Figures for the Country-Level Analyses

Figure A.1: Population Diversity and Proximate Determinants of the Frequency of Civil Conflict Onset across Countries.

Notes: This figure depicts the global cross-country relationship between contemporary population diversity and each of three potentially conflict-augmenting proximate channels, including (i) the degree of cultural fragmentation, as reflected by the number of ethnic groups in the national population (Panel (a)); (ii) the prevalence of generalized interpersonal trust at the country level (Panel (b)); and (iii) the extent of heterogeneity in preferences for redistribution and public-goods provision, as reflected by the intra-country dispersion in individual political attitudes on a politically “left”–“right” categorical scale (Panel (c)), conditional on the baseline geographical correlates of conflict, as considered by the analysis in Table VII. Each of Panels (a), (b), and (c) presents an added-variable plot with a partial regression line, corresponding to the estimated coefficient associated with population diversity in Columns 1, 4, and 7, respectively, of Table VII.

A.4. Descriptive Statistics at the Country Level

Table A.VII:

Summary Statistics of Variables from the Baseline Cross-Country Analysis

Percentile
Mean SD 10th 90th
PANEL A Old World sample (N = 121)
New civil conflict onsets per year, 1960–2017 0.025 0.033 0.000 0.069
Population diversity (ancestry adjusted) 0.735 0.018 0.712 0.754
Migratory distance from East Africa (in 10,000 km) 0.515 0.244 0.262 0.831
Absolute latitude 0.029 0.017 0.007 0.052
Ruggedness 0.124 0.134 0.016 0.286
Mean elevation 0.610 0.584 0.106 1.265
Range of elevation 1.550 1.322 0.281 3.043
Mean land suitability 0.359 0.234 0.035 0.669
Range of land suitability 0.701 0.259 0.345 0.974
Distance to nearest waterway 0.383 0.483 0.039 1.036
Island nation dummy 0.033 0.180 0.000 0.000
Ethnic fractionalization 0.476 0.264 0.115 0.812
Ethnolinguistic polarization 0.491 0.220 0.181 0.747
Ever a U.K. colony dummy 0.264 0.443 0.000 1.000
Ever a French colony dummy 0.207 0.407 0.000 1.000
Ever a non-U.K./non-French colony dummy 0.198 0.400 0.000 1.000
British legal origin dummy 0.256 0.438 0.000 1.000
French legal origin dummy 0.405 0.493 0.000 1.000
Executive constraints, 1960–2017 average 3.983 1.875 1.684 7.000
Fraction of years under democracy, 1960–2017 0.367 0.381 0.000 1.000
Fraction of years under autocracy, 1960–2017 0.390 0.327 0.000 0.900
Oil or gas reserve discovery 0.669 0.472 0.000 1.000
Log population, 1960–2017 average 16.072 1.459 14.385 17.873
Log GDP per capita, 1960–2017 average 7.638 1.567 5.649 9.940
PANEL B Global sample (N = 147)
New civil conflict onsets per year, 1960–2017 0.022 0.031 0.000 0.064
Population diversity (ancestry adjusted) 0.728 0.027 0.685 0.752
Migratory distance from East Africa (in 10,000 km) 0.806 0.679 0.295 2.088
Absolute latitude 0.027 0.017 0.006 0.051
Ruggedness 0.125 0.126 0.018 0.278
Mean elevation 0.594 0.552 0.104 1.250
Range of elevation 1.701 1.389 0.283 3.752
Mean land suitability 0.386 0.246 0.046 0.718
Range of land suitability 0.715 0.264 0.317 0.994
Distance to nearest waterway 0.353 0.458 0.036 1.010
Island nation dummy 0.048 0.214 0.000 0.000
Ethnic fractionalization 0.467 0.254 0.115 0.792
Ethnolinguistic polarization 0.452 0.241 0.097 0.747
Ever a U.K. colony dummy 0.259 0.439 0.000 1.000
Ever a French colony dummy 0.190 0.394 0.000 1.000
Ever a non-U.K./non-French colony dummy 0.320 0.468 0.000 1.000
British legal origin dummy 0.252 0.435 0.000 1.000
French legal origin dummy 0.463 0.500 0.000 1.000
Executive constraints, 1960–2017 average 4.145 1.827 1.839 7.000
Fraction of years under democracy, 1960–2017 0.408 0.377 0.000 1.000
Fraction of years under autocracy, 1960–2017 0.352 0.323 0.000 0.879
Oil or gas reserve discovery 0.673 0.471 0.000 1.000
Log population, 1960–2017 average 16.087 1.431 14.423 17.877
Log GDP per capita, 1960–2017 average 7.703 1.489 5.705 9.937
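The percentiles above can be used to translate a regression coefficient into the effect of a move from the 10th to the 90th percentile of population diversity. The sketch below applies a simple linear rescaling, which is an assumption about how the reported effects are obtained rather than a documented procedure. The inputs are the global-sample percentiles from Panel B and the Column 5 coefficient from Table A.IV; the function name is hypothetical.

```python
def percentile_move_effect(coefficient, p10, p90):
    """Linear effect of moving the regressor from its 10th to its 90th percentile."""
    return coefficient * (p90 - p10)


# Global sample (Panel B): 10th percentile 0.685, 90th percentile 0.752.
# Coefficient on population diversity in Table A.IV, Column 5: 0.309.
effect = percentile_move_effect(0.309, p10=0.685, p90=0.752)
rounded_effect = round(effect, 3)
```

Under this rescaling, the result is about 0.021, in line with the effect reported for that column; small discrepancies for other columns would be expected wherever the regression sample differs from the sample summarized here.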

A.5. Robustness Checks for the Ethnicity-Level Analyses

Table A.VIII:

Population Diversity and the Number of Conflicts across Ethnic Homelands

Number of conflict events
(1) (2) (3) (4) (5) (6) (7) (8)
Poisson Poisson Poisson Poisson Poisson Poisson Poisson Poisson
Observed population diversity 58.949*** (18.286) 54.959*** (18.020) 63.413*** (16.750) 61.608*** (16.685)
Predicted population diversity 53.230*** (9.061) 43.748*** (6.670) 49.242*** (7.403) 51.593*** (7.799)
Ethnolinguistic fractionalization 0.236 (0.498) −0.784** (0.356)
Ethnolinguistic polarization −0.190 (0.430) −0.895** (0.441)
Regional dummies Yes Yes Yes Yes Yes Yes Yes Yes
Geographical controls No Yes Yes Yes No Yes Yes Yes
Climatic controls No No Yes Yes No No Yes Yes
Development outcomes No No Yes Yes No No Yes Yes
Disease environment controls No No Yes Yes No No Yes Yes

Sample Observed Observed Observed Observed Predicted Predicted Predicted Predicted
Observations 207 207 207 207 901 901 901 901
Pseudo R2 0.250 0.327 0.463 0.463 0.215 0.451 0.519 0.520
Effect of 10th-90th %ile move in diversity 61.225*** (22.925) 57.081*** (21.677) 65.859*** (21.558) 63.985*** (21.168) 60.379*** (15.693) 49.623*** (10.515) 55.855*** (11.546) 58.521*** (12.213)

Notes: This table exploits variations across ethnic homelands to establish a significant positive reduced-form impact of contemporary population diversity on the number of conflict events during the 1989–2008 period, conditional on the baseline control variables (i.e., proximate geographical and development-related correlates of conflict). The set of continent and regional dummies includes indicators for Europe, Asia, North America, South America, Oceania, North Africa, and Sub-Saharan Africa. Additional climatic covariates refer to the average diurnal temperature range, average cloud cover, and average temperature range in the homeland. The estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its distribution is expressed in terms of the change in the number of conflict events. Heteroskedasticity-robust standard errors are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

Table A.IX:

Population Diversity and Alternative Conflict Outcomes across Ethnic Homelands

Log number of conflicts
Log number of deaths
Log number of deaths per conflict
(1) (2) (3) (4) (5) (6)
OLS OLS OLS OLS OLS OLS
Observed population diversity 6.037*** (2.284) 26.119*** (9.789) 20.082** (7.792)
Predicted population diversity 9.173*** (1.918) 40.406*** (8.581) 31.233*** (6.932)
Ethnolinguistic fractionalization 0.552* (0.317) 0.094 (0.113) 3.152** (1.421) 0.576 (0.492) 2.600** (1.162) 0.482 (0.398)
Ethnolinguistic polarization −0.439* (0.255) −0.171* (0.092) −2.489** (1.221) −0.758* (0.397) −2.050** (1.006) −0.587* (0.318)
Regional dummies Yes Yes Yes Yes Yes Yes
Geographical controls Yes Yes Yes Yes Yes Yes
Climatic controls Yes Yes Yes Yes Yes Yes

Sample Observed Predicted Observed Predicted Observed Predicted
Observations 207 901 207 901 207 901
Effect of 10th-90th %ile move in diversity 1.319*** (0.499) 2.232*** (0.467) 9969.713*** (3736.546) 10211.867*** (2168.784) 948.445*** (368.027) 1008.051*** (223.744)
Adjusted R2 0.201 0.300 0.241 0.275 0.241 0.253

Notes: This table exploits variations across ethnic homelands to establish a significant positive impact of contemporary population diversity, predicted by prehistoric migratory distance from East Africa, on the log number of UCDP/GED conflicts, the log number of UCDP/GED deaths, and the log number of UCDP/GED deaths per conflict, during the 1989–2008 period, accounting for geographical and development-related correlates of conflict. The set of continent and regional dummies includes indicators for Europe, Asia, North America, South America, Oceania, North Africa, and Sub-Saharan Africa. Additional climatic covariates refer to the average diurnal temperature range, average cloud cover, and average temperature range in the homeland. The estimated effects associated with increasing population diversity from the tenth to the ninetieth percentile of its distribution are expressed in terms of the non-logged levels of the respective outcome variables. Heteroskedasticity-robust standard errors are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

Table A.X:

Observed Population Diversity and Conflict across Ethnic Homelands – Robustness to Accounting for Spatial Dependence

Log conflict prevalence
(1) (2) (3) (4) (5) (6) (7)
OLS OLS OLS OLS OLS OLS OLS
Observed population diversity 31.788*** (8.819) 41.070*** (8.392) 37.111*** (8.261) 37.333*** (8.203) 37.148*** (8.222) 41.745*** (8.428) 41.403*** (8.439)
Ethnolinguistic fractionalization 0.881* (0.504) 0.804 (0.497)
Ethnolinguistic polarization 0.593 (0.426) 0.562 (0.417)
Regional dummies Yes Yes Yes Yes Yes Yes Yes
Geographical controls No Yes Yes Yes Yes Yes Yes
Climatic controls No No Yes Yes Yes Yes Yes
Development outcomes No No No No No Yes Yes
Disease environment controls No No No No No Yes Yes

Sample Observed Observed Observed Observed Observed Observed Observed
Direct impact of genetic diversity 32.803*** (9.165) 43.792*** (9.362) 38.509*** (8.756) 38.722*** (8.691) 38.550*** (8.717) 43.734*** (9.165) 43.391*** (9.180)
Direct effect of 10th-90th %ile move in diversity 0.513*** (0.143) 0.685*** (0.146) 0.602*** (0.137) 0.605*** (0.136) 0.603*** (0.136) 0.684*** (0.143) 0.678*** (0.144)
Observations 207 207 207 207 207 207 207

Notes: This table exploits variations across ethnic homelands to establish a significant positive reduced-form impact of contemporary population diversity on the log conflict prevalence during the 1989–2008 period, conditional on the baseline control variables (i.e., proximate geographical and development-related correlates of conflict) and accounting for spatial dependence using a spatial autoregressive (SARAR(1,1)) model, with a spectral-normalized inverse-distance weighting matrix, estimated with maximum-likelihood estimation, with a spatial lag of the dependent variable and a spatially lagged error. The model treats errors as heteroskedastic. Variables relating to observations associated with the same homeland polygon are averaged and a single observation is kept for each polygon. The set of continent and regional dummies includes indicators for Europe, Asia, North America, South America, Oceania, North Africa, and Sub-Saharan Africa. Additional climatic covariates refer to the average diurnal temperature range, average cloud cover, and average temperature range in the homeland. The estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its distribution is expressed in terms of the change in the prevalence of conflicts within the territory of a homeland over the years 1989–2008. Standard errors are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

Table A.XI:

Predicted Population Diversity and Conflict across Ethnic Homelands – Robustness to Accounting for Spatial Dependence

Log conflict prevalence
(1) (2) (3) (4) (5) (6) (7)
OLS OLS OLS OLS OLS OLS OLS
Predicted population diversity 57.609*** (6.447) 87.327*** (7.269) 88.759*** (7.230) 88.623*** (7.387) 86.281*** (7.305) 83.671*** (7.255) 84.245*** (7.288)
Ethnolinguistic fractionalization 0.517** (0.216) 0.332 (0.215)
Ethnolinguistic polarization 0.007 (0.189) −0.004 (0.188)
Regional dummies Yes Yes Yes Yes Yes Yes Yes
Geographical controls No Yes Yes Yes Yes Yes Yes
Climatic controls No No Yes Yes Yes Yes Yes
Development outcomes No No No No No Yes Yes
Disease environment controls No No No No No Yes Yes

Sample Predicted Predicted Predicted Predicted Predicted Predicted Predicted
Direct impact of genetic diversity 60.406*** (6.930) 87.543*** (7.509) 87.488*** (7.510) 79.433*** (8.977) 84.390*** (7.690) 81.962*** (7.597) 82.463*** (7.643)
Effect of 10th-90th %ile move in diversity 1.341*** (0.154) 1.943*** (0.167) 1.942*** (0.167) 1.763*** (0.199) 1.873*** (0.171) 1.819*** (0.169) 1.830*** (0.170)
Observations 901 901 901 901 901 901 901

Notes: This table exploits variations across ethnic homelands to establish a significant positive reduced-form impact of predicted population diversity on the log conflict prevalence during the 1989–2008 period, conditional on the baseline control variables (i.e., proximate geographical and development-related correlates of conflict) and accounting for spatial dependence using a spatial autoregressive (SARAR(1,1)) model, with a spectral-normalized inverse-distance weighting matrix, estimated with maximum-likelihood estimation, with a spatial lag of the dependent variable and a spatially lagged error. The model treats errors as heteroskedastic. Variables relating to observations associated with the same homeland polygon are averaged and a single observation is kept for each polygon. The set of continent and regional dummies includes indicators for Europe, Asia, North America, South America, Oceania, North Africa, and Sub-Saharan Africa. Additional climatic covariates refer to the average diurnal temperature range, average cloud cover, and average temperature range in the homeland. The estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its distribution is expressed in terms of the change in the prevalence of conflicts within the territory of a homeland over the years 1989–2008. Standard errors are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

Table A.XII:

Predicted Population Diversity and Conflict across Ethnic Homelands – Robustness to Accounting for Predicted Diversity as a Generated Regressor

Log conflict prevalence
(1) (2) (3) (4)
OLS OLS OLS OLS
Predicted population diversity 77.710*** (6.279) 77.031*** (7.282) 74.010*** (7.396) 73.581*** (7.418)
Ethnolinguistic fractionalization 0.347 (0.299)
Ethnolinguistic polarization 0.457* (0.263)
Regional dummies Yes Yes Yes Yes
Geographical controls No Yes Yes Yes
Climatic controls No Yes Yes Yes
Development outcomes No No Yes Yes
Disease environment controls No No Yes Yes

Sample Predicted Predicted Predicted Predicted
Observations 901 901 901 901
Effect of 10th-90th %ile move in diversity 1.725*** (0.139) 1.710*** (0.162) 1.643*** (0.164) 1.633*** (0.165)
Adjusted R2 0.211 0.362 0.378 0.379
Bootstrapped standard error (7.128)*** (8.196)*** (8.244)*** (8.266)***

Notes: This table exploits variations across ethnic homelands to establish a significant positive impact of predicted population diversity on the log conflict prevalence during the 1989–2008 period, conditional on ecological diversity and ecological polarization as well as the baseline control variables. The set of continent and regional dummies includes indicators for Europe, Asia, North America, South America, Oceania, North Africa, and Sub-Saharan Africa. Additional climatic covariates refer to the average diurnal temperature range, average cloud cover, and average temperature range in the homeland. To perform this robustness check, the current analysis adopts the two-step bootstrapping technique implemented by Ashraf and Galor (2013a) for computing the standard-error estimates, so the reader is referred to that work for additional details on the technique. The specifications examined in this table are otherwise identical to corresponding ones reported in Table VI. The reader is therefore referred to Table VI and the corresponding table notes for additional details on the baseline set of covariates considered by the current analysis as well as the identification strategy employed by the 2SLS regressions in the last two columns. The estimated effect associated with increasing population diversity from the tenth to the ninetieth percentile of its distribution is expressed in terms of the change in the prevalence of conflicts within the territory of a homeland over the years 1989–2008. Heteroskedasticity-robust standard errors are reported in parentheses.

*** denotes statistical significance at the 1 percent level, ** at the 5 percent level, and * at the 10 percent level.

A.6. Descriptive Statistics at the Ethnicity Level

Table A.XIII:

Summary Statistics

Percentile
Mean SD 10th 90th
PANEL A Observed population diversity sample (N = 207)
Population diversity (observed) 0.72 0.05 0.65 0.76
Population diversity (predicted) 0.72 0.04 0.65 0.76
Conflict prevalence 0.14 0.27 0.00 0.63
Number of conflicts 1.04 2.78 0.00 3.00
Number of deaths (in thousands) 3.56 39.32 0.00 1.49
Ethnolinguistic fractionalization 0.26 0.30 0.00 0.74
Ethnolinguistic polarization 0.33 0.36 0.00 0.85
Absolute latitude 15.15 15.09 1.85 38.02
Ruggedness 133.37 144.14 14.69 299.79
Elevation 0.75 0.75 0.07 1.67
Range of elevation 1.60 1.25 0.31 3.36
Mean land suitability 8.50 3.50 3.69 12.38
Range of land suitability 5.09 4.42 0.36 11.76
Small island dummy 0.01 0.10 0.00 0.00
Distance to nearest waterway 56.45 60.96 0.00 140.47
Temperature 21.08 7.79 8.94 27.20
Precipitation 123.06 100.31 31.34 285.66
Years since settlement (centuries from present) 104.94 31.86 40.19 120.19
Malaria 0.16 0.19 0.00 0.49
Oil or gas discovery 0.27 0.45 0.00 1.00
Luminosity 1.20 2.95 0.00 3.70
PANEL B Predicted population diversity sample (N = 901)
Population diversity (predicted) 0.71 0.04 0.64 0.75
Conflict prevalence 0.19 0.32 0.00 0.76
Number of conflicts 1.13 4.30 0.00 3.00
Number of deaths (in thousands) 2.22 20.67 0.00 1.62
Ethnolinguistic fractionalization 0.49 0.28 0.02 0.83
Ethnolinguistic polarization 0.55 0.28 0.04 0.87
Absolute latitude 21.69 17.08 2.92 48.23
Ruggedness 172.23 176.69 16.32 403.90
Elevation 0.73 0.86 0.07 1.75
Range of elevation 1.84 1.37 0.34 3.69
Mean land suitability 8.24 3.61 2.09 12.21
Range of land suitability 5.56 4.64 0.55 13.25
Small island dummy 0.03 0.16 0.00 0.00
Distance to nearest waterway 43.72 56.33 0.00 94.88
Temperature 18.82 9.36 3.83 26.67
Precipitation 118.85 75.53 32.58 225.72
Years since settlement (centuries from present) 112.52 23.61 90.19 120.19
Malaria 0.10 0.15 0.00 0.37
Oil or gas discovery 0.35 0.48 0.00 1.00
Luminosity 1.47 3.69 0.00 3.76

Footnotes

*

We thank the editor, four anonymous referees, Ran Abramitzky, Alberto Alesina, Yann Algan, Sascha Becker, Moshe Buchinsky, Matteo Cervellati, Carl-Johan Dalgaard, David de la Croix, Emilio Depetris-Chauvin, Paul Dower, Joan Esteban, James Fenske, Raquel Fernández, Boris Gershman, Avner Greif, Pauline Grosjean, Elhanan Helpman, Murat Iyigun, Noel Johnson, Garett Jones, Mark Koyama, Stelios Michalopoulos, Steven Nafziger, Nathan Nunn, John Nye, Ömer Özak, Elias Papaioannou, Sergey Popov, Stephen Smith, Enrico Spolaore, Uwe Sunde, Mathias Thoenig, Nico Voigtländer, Joachim Voth, Romain Wacziarg, Fabian Waldinger, David Weil, Ludger Wößmann, Noam Yuchtman, Alexei Zakharov, and seminar participants at George Mason University, George Washington University, HSE/NES Moscow, the AEA Annual Meeting, the conference on “Deep Determinants of International Comparative Development” at Brown University, the workshop on “Income Distribution and Macroeconomics” at the NBER Summer Institute, the conference on “Culture, Diversity, and Development” at HSE/NES Moscow, the conference on “The Long Shadow of History: Mechanisms of Persistence in Economics and the Social Sciences” at LMU Munich, the fall meeting of the NBER Political Economy Program, the session on “Economic Growth” at the AEA Continuing Education Program, the workshop on “Biology and Behavior in Political Economy” at HSE Moscow, and the Economic Workshop at IDC Herzliya for valuable comments. Steven Brownstone, Gregory Casey, Ashwin Narayan, Daniel Prinz, and Jeffrey Wang provided excellent research assistance. Arbatlı acknowledges research support from the Center for Advanced Studies (CAS) at the National Research University Higher School of Economics (HSE). Ashraf acknowledges research support from the NSF (SES-1338738), the Hellman Fellows Program, and the Oakley Center for Humanities and Social Sciences at Williams College.
Galor acknowledges research support from the NSF (SES-1338426) and the Population Studies and Training Center (PSTC) at Brown University. The PSTC receives core support from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (5R24HD041020). Klemp acknowledges research support from the Carlsberg Foundation, the Danish Research Council (grant numbers 1329–00093 and 1327–00245), and the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Global Fellowship (grant agreement number 753615).

1

The contemporary worldwide distribution of observed population diversity across indigenous ethnic groups overwhelmingly reflects a serial founder effect – i.e., a chain of ancient population bottlenecks – originating in East Africa. In particular, because the spatial diffusion of humans to the rest of the world occurred in a stepwise migration process beginning around 90,000–60,000 BP, where in each step, a subgroup of individuals left their parental colony to establish a new settlement farther away, carrying with them only a subset of the diversity of their parental colony, the population diversity of a prehistorically indigenous ethnic group as observed today decreases with the distance along ancient human migratory paths from East Africa.

2

However, in network-based models of conflict involving multiple groups (e.g., König et al., 2017), greater inter-group divergence could mitigate conflict propensity by reducing the strength of inter-group network alliances within one side or another of such conflicts.

3

More sophisticated measures of ethnolinguistic fragmentation – such as (i) the Greenberg index of “cultural diversity,” as measured by Fearon (2003) and Desmet et al. (2009), or (ii) the ethnolinguistic polarization index, as measured by Desmet et al. (2009) and by Esteban et al. (2012) – incorporate information on pairwise linguistic distances, wherein pairwise linguistic proximity monotonically increases in the number of shared branches between any two languages in a hierarchical linguistic tree. This information, however, is constrained by the nature of a hierarchical linguistic tree, where languages residing at the same level of branching of the tree are necessarily equidistant from one another.
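The equidistance constraint described in this footnote can be illustrated with a minimal sketch. The tree, the language labels, and the helper names below are hypothetical and serve only to show how a shared-branch count behaves; they are not taken from any of the cited data sets.

```python
from itertools import combinations

# Hypothetical classifications: each language is described by the path of
# nodes leading from the root of a linguistic tree down to the language.
TREE_PATHS = {
    "lang_a": ("family_1", "branch_x", "sub_1"),
    "lang_b": ("family_1", "branch_x", "sub_2"),
    "lang_c": ("family_1", "branch_y", "sub_3"),
    "lang_d": ("family_2", "branch_z", "sub_4"),
}


def shared_branches(path_1, path_2):
    """Count the leading tree nodes that two classification paths share."""
    count = 0
    for node_1, node_2 in zip(path_1, path_2):
        if node_1 != node_2:
            break
        count += 1
    return count


def proximity_table(paths):
    """Return the shared-branch count for every unordered pair of languages."""
    return {
        (first, second): shared_branches(paths[first], paths[second])
        for first, second in combinations(sorted(paths), 2)
    }


if __name__ == "__main__":
    for pair, shared in proximity_table(TREE_PATHS).items():
        print(pair, shared)
```

In this toy tree, lang_a and lang_b receive the same shared-branch count with respect to lang_c, so a tree-based proximity cannot distinguish which of the two is closer to lang_c.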

4

The genetic distance between any two ethnic groups in a contemporary national population predominantly reflects the prehistoric migratory distance between their respective ancestral populations (from the precolonial era), and as follows from the continuity of geographical distances, the proposed population diversity measure captures continuous inter-group distances. Spolaore and Wacziarg (2016) document a negative relationship between genetic distance and interstate warfare. They argue that if genetic relatedness proxies for unobserved similarity in preferences over rival and excludable goods, then conflict over the control of such resources would be more likely to arise between nations that are genetically closer to one another.

5

The importance of prehistorically determined human characteristics is further explored by Spolaore and Wacziarg (2013) and Ashraf and Galor (2013b, 2018).

6

In addition, the modernist viewpoint (Bates, 1983; Gellner, 1983; Wimmer, 2002) stresses that interethnic conflict arises from increased competition over scarce resources, especially when previously marginalized groups that were excluded from the nation-building process experience socioeconomic modernization and, thus, begin to challenge the status quo.

8

The Human Genome Diversity Cell Line is compiled by the Human Genome Diversity Project (HGDP) in collaboration with the Centre d’Etudes du Polymorphisme Humain (CEPH).

9

Consistent with the serial founder effects associated with the prehistoric “out of Africa” migration process, expected heterozygosity in microsatellites declines with migratory distance from East Africa across ethnic groups. Mounting evidence from the fields of physical and cognitive anthropology, surveyed in Ashraf and Galor (2018), additionally reflects the influence of serial founder effects on various forms of intra-group phenotypic and cognitive diversity, including phonemic diversity and interpersonal diversity in skeletal features pertaining to cranial characteristics, dental attributes, and pelvic traits. Thus, the association of heterozygosity in neutral genetic markers with socioeconomic outcomes may plausibly reflect the influence of diversity in various observed and unobserved phenotypic characteristics.

10

The data on the population shares of these different subnational groups at the country level are obtained from the World Migration Matrix, 1500–2000 of Putterman and Weil (2010), who compile for each country in their data set, the share of the country’s population in 2000 that is descended from the population of every other country in 1500. For an in-depth discussion of the methodology underlying the construction of the ancestry-adjusted measure of genetic diversity, the reader is referred to the data appendix of Ashraf and Galor (2013a).

11

The definitions and data sources of all variables employed by the analysis at the country level are listed in Section A.4 of the Supplemental Material.

12

Although these measures of ethnolinguistic fragmentation are directly accounted for, their exogenous geographical determinants may still explain some unobserved component of intrapopulation heterogeneity in ethnic and cultural traits, thereby exerting some influence on the potential for conflict in society.

13

The prevalence of anocracy, occurring when the polity score is between −5 and 5, therefore serves as the omitted political regime category.

14

The choice of Desmet et al. (2012) as the data source for ethnolinguistic polarization is primarily due to the more comprehensive geographical coverage of their data set, relative to other potential data sources such as Montalvo and Reynal-Querol (2005) or Esteban et al. (2012).

15

Specifically, countries located farther from the equator have seen fewer conflict outbreaks on average, while those with greater dispersion in their respective land endowments have experienced such outbreaks more frequently, a result that plausibly reflects the conflict-promoting role of ethnolinguistic fragmentation, following the rationale provided by the findings of Michalopoulos (2012).

16

By restricting both fractionalization and polarization measures to enter the regressions linearly, the current approach follows Esteban et al. (2012). Nevertheless, a robustness check of the main finding to employing alternative specifications that allow for both a linear and a quadratic term in ethnic fractionalization yielded qualitatively similar results (not reported).

17

The analysis in Table SA.IX in Section A.1 of the Supplemental Material shows that although the two measures of ethnolinguistic fragmentation do independently possess some explanatory power for the temporal frequency of conflict onsets after accounting for geographical confounders, these conditional relationships are not statistically robust to the inclusion of continent dummies in the specifications.

18

Thus, for a given contemporary national population, the within-group component of overall diversity reflects the weighted average group-level interpersonal diversity, using the population shares of these subnational groups as weights, whereas the between-group component reflects the residual fraction of overall diversity that is unexplained by the within-group component. The latter component therefore corresponds to an aggregate measure of intergroup distances amongst all subnational groups in the national population.
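As a minimal sketch of the decomposition described in this footnote, the following computes the within-group component as a share-weighted average and the between-group component as the residual. The function name and the numbers in the usage example are hypothetical, and treating the residual as a simple difference from overall diversity is an assumption made for illustration.

```python
def decompose_diversity(overall, shares, within_group):
    """Split overall diversity into within-group and between-group components.

    overall      -- overall diversity of the national population
    shares       -- population shares of the subnational groups (sum to one)
    within_group -- interpersonal diversity of each group, in the same order
    """
    if len(shares) != len(within_group):
        raise ValueError("shares and within_group must have equal length")
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to one")
    # Within-group component: share-weighted average of group-level diversity.
    within = sum(s * d for s, d in zip(shares, within_group))
    # Between-group component: residual left unexplained by the within part.
    between = overall - within
    return within, between


if __name__ == "__main__":
    # Hypothetical population with three subnational groups.
    within, between = decompose_diversity(
        overall=0.74,
        shares=[0.6, 0.3, 0.1],
        within_group=[0.70, 0.72, 0.65],
    )
    print(round(within, 3), round(between, 3))
```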

19

For a more formal analysis of selection on observables and unobservables, see Appendix A.2.

20

The robustness of the current analysis of conflict incidence to exploiting variations in annually (rather than quinquennially) repeated cross-country data is confirmed in Appendix A.2. Naturally, in those regressions, the time-dependent covariates enter as their lagged annual values (instead of their lagged 5-year temporal means) and time fixed effects are captured by a set of year dummies.

21

An alternative method to address the reverse-causality problem, in the context of quinquennially repeated cross-country data, would have been to control for time-dependent covariates as measured in the initial year of each 5-year interval. Although this method would retain the first-period observation for each country, which is dropped under the current specification, it leaves open the possibility that the presence or absence of an active conflict in the first year of each period may still exert a direct influence on the time-varying controls.

22

In adopting this strategy, the current analysis of conflict incidence follows Esteban et al. (2012). It may also be noted here that because the measure of population diversity is time-invariant (as is the case with all known measures of ethnolinguistic fragmentation, based on fractionalization or polarization indices), the analysis is unable to either account for country fixed effects or exploit dynamic panel estimation methods, despite the time dimension of the repeated cross-country data. In all regressions exploiting such data, however, the robust standard errors of the estimated coefficients are always clustered at the country level.

23

A “new” civil conflict in a country is defined as one involving a previously unobserved set of actors and/or a previously unobserved set of contentious issues.

24

These criteria are as follows: (1) Membership in the group is determined primarily by descent by both members and non-members; (2) Membership in the group is recognized and viewed as important by members and/or non-members, where importance may be psychological, normative, and/or strategic; (3) Members share some distinguishing cultural features, such as common language, religion, occupational niche, and customs with respect to other groups in the country; (4) One or more of these cultural features are either practiced by a majority of the group or preserved and studied by a set of members who are broadly respected by the wider membership for so doing; and (5) The group has at least 100,000 members or constitutes one percent of the national population.

25

This fatality level corresponds to a magnitude of 1.5 or higher on Richardson’s (1960) base-10 log conflict scale.
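The mapping between fatalities and a base-10 log magnitude can be sketched as follows. The function names are illustrative; the only assumption is that the magnitude equals the base-10 logarithm of the number of fatalities, as the footnote's description of the scale suggests.

```python
import math


def richardson_magnitude(fatalities):
    """Base-10 log magnitude associated with a given number of fatalities."""
    if fatalities <= 0:
        raise ValueError("fatalities must be positive")
    return math.log10(fatalities)


def fatalities_at_magnitude(magnitude):
    """Number of fatalities implied by a given base-10 log magnitude."""
    return 10.0 ** magnitude


if __name__ == "__main__":
    # Fatality count implied by a magnitude of 1.5 on this scale.
    print(fatalities_at_magnitude(1.5))
```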

26

The log transformation is applied to one plus the total number of conflicts in order to retain observations without any recorded conflict.
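A minimal sketch of this transformation is given below. The use of the natural logarithm is an assumption, since the footnote does not state the base, and the function name is hypothetical.

```python
import math


def log_one_plus_conflicts(n_conflicts):
    """Return log(1 + n), which is defined even when no conflict is recorded."""
    if n_conflicts < 0:
        raise ValueError("the number of conflicts cannot be negative")
    return math.log1p(n_conflicts)


if __name__ == "__main__":
    # An observation with zero conflicts maps to zero rather than being lost.
    for n in (0, 1, 3):
        print(n, log_one_plus_conflicts(n))
```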

27

For example, primary sources on historical warfare in Sub-Saharan Africa are relatively scarce (Reid, 2014), and unlike the large-scale campaigns common in European warfare, historical conflicts in Africa more often took the form of raiding wars.

28

For instance, in Column 3 of Table I, the estimated impact of the same move in the cross-country distribution of population diversity is 0.02 additional civil conflict outbreaks per year – i.e., 2 additional conflicts per century.

29

Pemberton et al. (2013) combine eight human genetic diversity datasets based on the 645 loci that they share, including the HGDP-CEPH Human Genome Diversity Cell Line Panel used by Ashraf and Galor (2013a). Since ethnic groups have been largely native to their ethnic homelands, at least since the pre-colonial era, the measure of population diversity within the ethnic groups properly captures the degree of population diversity within the ethnic homelands.

30

Further details on the construction of the dataset are presented in Section B.1 of the Supplemental Material.

31

The analysis includes all ethnic groups in Pemberton et al. (2013) that can be mapped into an ethnic homeland, excluding the Surui of South America. Population geneticists view the Surui as an extreme outlier in terms of genetic diversity. In particular, Ramachandran et al. (2005) omit the Surui, as “an extreme outlier in a variety of previous analyses”. Including this observation, nevertheless, does not affect the qualitative results.

32

As elaborated in Section B.2 of the Supplemental Material, the measures of the degree of ethnolinguistic fractionalization and polarization in ethnic homelands are based on the proportional representation of each linguistic group within the ethnic homeland.
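To illustrate how indices of this kind can be built from group shares alone, the sketch below uses two standard share-based formulas: a Herfindahl-based fractionalization index and the polarization index of Montalvo and Reynal-Querol (2005). That these are the exact formulas used for the homeland-level measures is an assumption, since the definitions in Section B.2 are not reproduced here; the function names are hypothetical.

```python
def _check_shares(shares):
    """Validate that the group shares are non-negative and sum to one."""
    if any(s < 0 for s in shares):
        raise ValueError("shares must be non-negative")
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to one")


def fractionalization(shares):
    """Probability that two randomly drawn individuals belong to different groups."""
    _check_shares(shares)
    return 1.0 - sum(s * s for s in shares)


def polarization(shares):
    """Share-based polarization, 4 * sum(s^2 * (1 - s)); peaks with two equal groups."""
    _check_shares(shares)
    return 4.0 * sum(s * s * (1.0 - s) for s in shares)


if __name__ == "__main__":
    # Hypothetical homelands: two equal groups versus three equal groups.
    for shares in ([0.5, 0.5], [1 / 3, 1 / 3, 1 / 3]):
        print(fractionalization(shares), polarization(shares))
```

Moving from two to three equally sized groups raises fractionalization but lowers polarization, which is why the two indices can enter a regression with different signs.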

33

This measure is calculated using the gridded PRIO data (PRIO-GRID version 1.01) as reported by Tollefsen et al. (2012) based on the UCDP/PRIO Armed Conflict Dataset.

34

The observed sample of 207 ethnic homelands disproportionately represents sub-Saharan Africa. Moreover, while the prevalence of conflict in ethnic homelands in Africa is significantly above the worldwide average, in the observed sample the prevalence of conflict is below the world average, introducing undesirable biases in the estimation and necessitating the use of regional fixed effects, and in particular a Sub-Saharan dummy variable, to account for these regional anomalies. In contrast, in the representative predicted sample, considered in Table VI, the positive association between population diversity and conflict, within as well as between continents, can be identified.

35

See Appendix A.6.

36

The larger coefficient estimates for the impact of diversity on conflict in the predicted sample (relative to the observed sample) plausibly reflect the more representative spatial coverage of conflicts across the globe. Further, these larger estimates for predicted diversity are in line with the fact that the 2SLS estimates of instrumented observed diversity are also larger than their OLS counterparts.

37

The first-stage F-statistic indicates that prehistoric migratory distance is a strong instrument for the level of observed population diversity. The large 2SLS coefficient on observed diversity, relative to its OLS counterpart from Column 3 of Table V, may be explained by the following two reasons. First, the OLS estimates may be afflicted by attenuation bias due to the possibility that observed diversity in neutral genetic markers is merely a noisy proxy of interpersonal diversity in unobserved traits that are relevant for socioeconomic interactions. Second, in line with the interpretation of a local average treatment effect (LATE), there could be certain ethnic groups in the observed sample that are not perfect compliers of the “migratory distance” treatment, in the sense that their population diversities improperly reflect the legacy of the serial founder effect (due to some degree of admixture from non-native populations in the era of European colonization).

38

Unlike measures of ethnolinguistic fragmentation that are based on fractionalization or polarization indices, the number of ethnic groups in the national population is potentially less endogenous in an empirical model of the risk of civil conflict, in light of the fact that this measure is not additionally tainted by the incorporation of information on the endogenous shares of the different subnational groups.

39

In particular, this well-known measure of social capital reflects the proportion in a given country of all respondents (from across five different waves of the WVS, conducted over the 1981–2009 time horizon) that opted for the answer “Most people can be trusted” (as opposed to “Can’t be too careful”) when responding to the survey question “Generally speaking, would you say that most people can be trusted or that you need to be very careful in dealing with people?”

40

Specifically, this country-level measure of heterogeneity in political attitudes reflects the intra-country standard deviation across all respondents (sampled over five different waves of the WVS during the 1981–2009 time horizon) of their self-reported positions on a categorical scale from 1 (politically “left”) to 10 (politically “right”) when answering the survey question “In political matters, people talk of ‘the left’ and ‘the right.’ How would you place your views on this scale, generally speaking?” Given that this variable’s unit of measurement does not possess a natural interpretation, the cross-country distribution of this variable is standardized prior to conducting the regressions.
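The construction described in this footnote can be sketched in two steps: compute the within-country standard deviation of the self-reported positions, then standardize the resulting cross-country distribution. The survey responses below are hypothetical, and the use of the population (rather than sample) standard deviation is an assumption.

```python
from statistics import mean, pstdev

# Hypothetical responses on the 1 ("left") to 10 ("right") scale, by country.
SAMPLE_RESPONSES = {
    "country_a": [1, 5, 10, 5],
    "country_b": [5, 5, 6, 5],
    "country_c": [3, 7, 4, 8],
}


def within_country_dispersion(responses_by_country):
    """Standard deviation of self-reported political positions in each country."""
    return {
        country: pstdev(responses)
        for country, responses in responses_by_country.items()
    }


def standardize(values_by_country):
    """Rescale a cross-country variable to zero mean and unit standard deviation."""
    values = list(values_by_country.values())
    center, spread = mean(values), pstdev(values)
    if spread == 0:
        raise ValueError("cannot standardize a constant variable")
    return {
        country: (value - center) / spread
        for country, value in values_by_country.items()
    }


if __name__ == "__main__":
    dispersion = within_country_dispersion(SAMPLE_RESPONSES)
    for country, z_score in standardize(dispersion).items():
        print(country, round(z_score, 3))
```

After standardization, a regression coefficient on this variable can be read as the effect of a one-standard-deviation change in the cross-country distribution.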

41

The three scatter plots presented in Figure A.1 of Appendix A.3 depict these statistically significant cross-country relationships, conditional on the baseline set of geographical covariates (including continent or region fixed effects). Specifically, they show the relationship between population diversity and (i) the total number of ethnic groups in a national population (Panel (a)); (ii) the prevalence of generalized interpersonal trust at the country level (Panel (b)); and (iii) the intra-country dispersion in political attitudes (Panel (c)).

42

Summary statistics for the trust analysis samples can be found in Section B.4 of the Supplemental Material.

43

Since a third of the observations in the sample are individuals who are currently residing in Nigeria, and since Nigeria has the lowest level of trust among the 9 countries in the sample, possibly due to omitted variables (e.g., corruption), and since the level of genetic diversity in Nigeria is not among the highest in the sample, the actual relationship between diversity and trust may be masked in the absence of country dummies. Thus, all columns of the table account for host country fixed effects.

44

The classification of individuals and their association with various ethnic homelands is based on Nunn and Wantchekon (2011).

45

In addition, the focus on second-generation rather than first-generation migrants allows the analysis to exploit the individual-level variation in trust that plausibly mostly reflects the trust attitudes transmitted intergenerationally from parents rather than from society at large.

46

Since the sample of second-generation migrants consists of 76% immigrants from Europe, 3% immigrants from Asia, and 21% immigrants from the Americas, since individuals from Europe have the highest level of trust among these three groups, possibly due to omitted variables (e.g., income), and since the level of genetic diversity in Europe is the highest among the three groups, an artificially positive relationship between trust and population diversity may appear in the sample as a whole in the absence of ancestral regional dummies. Thus, all columns of the table account for ancestral regional fixed effects. Moreover, since migrants from the Americas in the sample originate from either Canada or Mexico, and since Canada is significantly more diverse (due to a larger European population) and significantly more trusting (possibly due to higher income), the use of only a North America dummy would affect the significance of the results. Hence, all columns of the table account for Latin American regional fixed effects.

47

Since the sample is composed of individuals from European countries, Asian countries, and three countries in the Americas: Canada, Mexico, and Puerto Rico, the regional dummies distinguish between European, Asian, and Latin American countries.

48

The inclusion of geographical characteristics of the ancestral homeland reduces the sample due to the absence of some of the relevant data for Puerto Rico.

49

The version of the MEPV data set employed provides annual information for a total of 179 countries over the 1946–2017 time period. See http://www.systemicpeace.org/inscr/MEPVcodebook2016.pdf for further details on the measure of conflict intensity from the MEPV data set.

50

Specifically, all episodes of intrastate conflict in the MEPV data set are categorized along two dimensions. With respect to the first dimension, an episode may be considered either (i) one of “civil” conflict, involving rival political groups; or (ii) one of “ethnic” conflict, involving the state agent and a distinct ethnic group. In terms of the second dimension, however, an episode may be either (i) one of “violence,” involving the use of instrumental force, without necessarily possessing any exclusive goals; or (ii) one of “war,” involving violent activities between distinct groups, with the intent to impose a unilateral result to the contention.

51

The specific weights (reported in parentheses) assigned to the different types of sociopolitical unrest considered by the index are as follows: assassinations (25), general strikes (20), guerrilla warfare (100), major government crises (20), political purges (20), riots (25), revolutions (150), and anti-government demonstrations (10). This weighting methodology is based on Rummel (1963). For further details, the reader is referred to the codebook of the CNTS data archive, available at http://www.cntsdata.com/.
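For concreteness, the weighting scheme can be sketched as a weighted sum of annual event counts. The weights below are those listed above; the dictionary keys, the function name `weighted_unrest_score`, and the input layout are hypothetical, and any further rescaling applied by the CNTS archive is not described here and is therefore omitted.

```python
# Weights as listed in the text; the key names are illustrative labels.
UNREST_WEIGHTS = {
    "assassinations": 25,
    "general_strikes": 20,
    "guerrilla_warfare": 100,
    "major_government_crises": 20,
    "political_purges": 20,
    "riots": 25,
    "revolutions": 150,
    "anti_government_demonstrations": 10,
}


def weighted_unrest_score(counts):
    """Weighted sum of event counts for one country-year.

    `counts` maps an event type to its number of occurrences; event types
    that are absent from `counts` are treated as zero. Unknown event types
    raise a KeyError so that misspelled labels are not silently ignored.
    """
    for event in counts:
        if event not in UNREST_WEIGHTS:
            raise KeyError(f"unknown event type: {event}")
    return sum(weight * counts.get(event, 0) for event, weight in UNREST_WEIGHTS.items())


if __name__ == "__main__":
    print(weighted_unrest_score({"riots": 2, "revolutions": 1}))
```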

52

Despite the fact that the measure of conflict intensity from the MEPV data set is ordinal rather than continuous in nature, the analysis pursues least-squares (as opposed to maximum-likelihood) estimation methods when examining this particular outcome variable, primarily because this permits the implementation of both of the key identification strategies. Specifically, although the main findings from Columns 1–2 can be qualitatively replicated using ordered probit rather than OLS regressions (results not shown), the absence of a readily available IV counterpart of the ordered probit regression model precludes conducting a similar robustness check on the main findings from Columns 3–4.

53

Altonji et al. (2005) develop this method for the case where the explanatory variable of interest is binary in nature, while Bellows and Miguel (2009) consider the case of a continuous explanatory variable. Roughly speaking, the assumption in assessments of this type is that the covariation of the outcome variable with observables, on the one hand, and its covariation with unobservables, on the other, are identically related to the explanatory variable of interest. Altonji et al. (2005) provide some sufficient conditions for such an assumption to hold.

54

The negative sign indicates that selection on unobservables needs to move the coefficient estimate in the opposite direction, compared to selection on observables.
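The sign convention can be made concrete with the ratio commonly used in applications that follow Bellows and Miguel (2009): the coefficient from the specification with the full set of controls, divided by the difference between the coefficient from the restricted specification and the coefficient from the full one. That this is the exact form used in the reported tables is an assumption, and the function name `selection_ratio` and the numbers below are purely illustrative.

```python
def selection_ratio(beta_restricted, beta_full):
    """Ratio of selection on unobservables to selection on observables.

    Computed as beta_full / (beta_restricted - beta_full). A negative value
    arises when adding observable controls moves the coefficient away from
    zero, so unobservables would have to act in the opposite direction to
    explain away the estimate.
    """
    difference = beta_restricted - beta_full
    if difference == 0:
        raise ValueError("coefficients are identical; the ratio is undefined")
    return beta_full / difference


if __name__ == "__main__":
    # Controls shrink the coefficient: positive ratio.
    print(selection_ratio(beta_restricted=1.0, beta_full=0.75))  # 3.0
    # Controls enlarge the coefficient: negative ratio.
    print(selection_ratio(beta_restricted=0.5, beta_full=0.75))  # -3.0
```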

55

The reported Oster statistics are computed under the most conservative assumption that Rmax = 1; i.e., that the entire cross-country variation in conflict frequency would be explained by the estimated model if one could include in the model all unobservables correlated with population diversity.

References

  1. Alesina A, Devleeschauwer A, Easterly W, Kurlat S, and Wacziarg R (2003): “Fractionalization,” Journal of Economic Growth, 8, 155–194.
  2. Alesina A and La Ferrara E (2005): “Ethnic Diversity and Economic Performance,” Journal of Economic Literature, 43, 762–800.
  3. Alesina A, Michalopoulos S, and Papaioannou E (2016): “Ethnic Inequality,” Journal of Political Economy, 124, 428–488.
  4. Alesina A and Spolaore E (2003): The Size of Nations, Cambridge, MA: MIT Press.
  5. Altonji JG, Elder TE, and Taber CR (2005): “Selection on Observed and Unobserved Variables: Assessing the Effectiveness of Catholic Schools,” Journal of Political Economy, 113, 151–184.
  6. Ashraf Q and Galor O (2013a): “The “Out of Africa” Hypothesis, Human Genetic Diversity, and Comparative Economic Development,” American Economic Review, 103, 1–48.
  7. Ashraf Q and Galor O (2013b): “Genetic Diversity and the Origins of Cultural Fragmentation,” American Economic Review: Papers & Proceedings, 103, 528–533.
  8. Ashraf QH and Galor O (2018): “The Macrogenoeconomics of Comparative Development,” Journal of Economic Literature, 56, 1119–1155.
  9. Banks AS and Wilson KA (2018): “Cross-National Time-Series Data Archive [Data file],” Databanks International, Jerusalem, Israel. https://www.cntsdata.com/.
  10. Bates RH (1983): “Modernization, Ethnic Competition, and the Rationality of Politics in Contemporary Africa,” in State versus Ethnic Claims: African Policy Dilemmas, ed. by Rothchild D and Olorunsola VA, Boulder, CO: Westview Press, 152–171.
  11. Bazzi S and Blattman C (2014): “Economic Shocks and Conflict: Evidence from Commodity Prices,” American Economic Journal: Macroeconomics, 6, 1–38.
  12. Beck N, Katz JN, and Tucker R (1998): “Taking Time Seriously: Time-Series–Cross-Section Analysis with a Binary Dependent Variable,” American Journal of Political Science, 42, 1260–1288.
  13. Bellows J and Miguel E (2009): “War and Local Collective Action in Sierra Leone,” Journal of Public Economics, 93, 1144–1157.
  14. Besley T and Reynal-Querol M (2014): “The Legacy of Historical Conflict: Evidence from Africa,” American Political Science Review, 108, 319–336.
  15. Birnir JK, Laitin DD, Wilkenfeld J, Waguespack DM, Hultquist AS, and Gurr TR (2018): “Introducing the AMAR (All Minorities at Risk) Data,” Journal of Conflict Resolution, 62, 203–226.
  16. Birnir JK, Wilkenfeld J, Fearon JD, Laitin DD, Gurr TR, Brancati D, Saideman SM, Pate A, and Hultquist AS (2015): “Socially Relevant Ethnic Groups, Ethnic Structure, and AMAR,” Journal of Peace Research, 52, 110–115.
  17. Blattman C and Miguel E (2010): “Civil War,” Journal of Economic Literature, 48, 3–57.
  18. Brecke P (1999): “Violent Conflicts 1400 A.D. to the Present in Different Regions of the World,” Paper presented at the 1999 Annual Meeting of the Peace Science Society, October 8–10.
  19. Caselli F and Coleman WJ II (2013): “On the Theory of Ethnic Conflict,” Journal of the European Economic Association, 11, 161–192.
  20. Cassar A, Grosjean P, and Whitt S (2013): “Legacies of Violence: Trust and Market Development,” Journal of Economic Growth, 18, 285–318.
  21. Cioffi-Revilla C (1996): “Origins and Evolution of War and Politics,” International Studies Quarterly, 40, 1–22.
  22. Collier P and Hoeffler A (2004): “Greed and Grievance in Civil War,” Oxford Economic Papers, 56, 563–595.
  23. Collier P and Hoeffler A (2007): “Civil War,” in Handbook of Defense Economics, Vol. 2: Defense in a Globalized World, ed. by Sandler T and Hartley K, Amsterdam, The Netherlands: Elsevier, North-Holland, 711–740.
  24. Croicu M and Sundberg R (2015): “UCDP Georeferenced Event Dataset Codebook version 4.0,” Department of Peace and Conflict Research, Uppsala University. http://ucdp.uu.se/downloads/ged/ucdp-ged-40-codebook.pdf.
  25. Desmet K, Breton ML, Ortuño-Ortín I, and Weber S (2011): “The Stability and Breakup of Nations: A Quantitative Analysis,” Journal of Economic Growth, 16, 183–213.
  26. Desmet K, Ortuño-Ortín I, and Wacziarg R (2012): “The Political Economy of Linguistic Cleavages,” Journal of Development Economics, 97, 322–338.
  27. Desmet K, Ortuño-Ortín I, and Weber S (2009): “Linguistic Diversity and Redistribution,” Journal of the European Economic Association, 7, 1291–1318.
  28. Dincecco M, Fenske J, and Onorato MG (2015): “Is Africa Different? Historical Conflict and State Development,” IMT Lucca EIC Working Paper No. 08/2015, IMT Institute for Advanced Studies Lucca.
  29. Drukker DM, Prucha IR, and Raciborski R (2013): “Maximum Likelihood and Generalized Spatial Two-Stage Least-Squares Estimators for a Spatial-Autoregressive Model with Spatial-Autoregressive Disturbances,” Stata Journal, 13, 221–241.
  30. Dube O and Vargas JF (2013): “Commodity Price Shocks and Civil Conflict: Evidence from Colombia,” Review of Economic Studies, 80, 1384–1421.
  31. Easterly W and Levine R (1997): “Africa’s Growth Tragedy: Policies and Ethnic Divisions,” Quarterly Journal of Economics, 112, 1203–1250.
  32. Eifert B, Miguel E, and Posner DN (2010): “Political Competition and Ethnic Identification in Africa,” American Journal of Political Science, 54, 494–510.
  33. Esteban J, Mayoral L, and Ray D (2012): “Ethnicity and Conflict: An Empirical Study,” American Economic Review, 102, 1310–1342.
  34. Esteban J and Ray D (2011a): “A Model of Ethnic Conflict,” Journal of the European Economic Association, 9, 496–521.
  35. Fearon JD (2003): “Ethnic and Cultural Diversity by Country,” Journal of Economic Growth, 8, 195–222.
  36. Fearon JD and Laitin DD (2003): “Ethnicity, Insurgency, and Civil War,” American Political Science Review, 97, 75–90.
  37. Fletcher E and Iyigun M (2010): “The Clash of Civilizations: A Cliometric Investigation,” http://www.colorado.edu/economics/courses/iyigun/fractionalization013109.pdf.
  38. Gellner E (1983): Nations and Nationalism, Ithaca, NY: Cornell University Press.
  39. Gleditsch NP, Wallensteen P, Eriksson M, Sollenberg M, and Strand H (2002): “Armed Conflict 1946–2001: A New Dataset,” Journal of Peace Research, 39, 615–637.
  40. Grossman HI (1991): “A General Equilibrium Model of Insurrections,” American Economic Review, 81, 912–921.
  41. Grossman HI (1999): “Kleptocracy and Revolutions,” Oxford Economic Papers, 51, 267–283.
  42. Harpending H and Rogers A (2000): “Genetic Perspectives on Human Origins and Differentiation,” Annual Review of Genomics and Human Genetics, 1, 361–385.
  43. Hirshleifer J (1991): “The Technology of Conflict as an Economic Activity,” American Economic Review: Papers & Proceedings, 81, 130–134.
  44. Hirshleifer J (1995): “Anarchy and its Breakdown,” Journal of Political Economy, 103, 26–52.
  45. Humphreys M (2005): “Natural Resources, Conflict, and Conflict Resolution: Uncovering the Mechanisms,” Journal of Conflict Resolution, 49, 508–537.
  46. King G and Zeng L (2001): “Logistic Regression in Rare Events Data,” Political Analysis, 9, 137–163.
  47. König MD, Rohner D, Thoenig M, and Zilibotti F (2017): “Networks in Conflict: Theory and Evidence From the Great War of Africa,” Econometrica, 85, 1093–1132.
  48. Marshall MG (2017): “Major Episodes of Political Violence (MEPV) and Conflict Regions, 1946–2017,” Center for Systemic Peace, Vienna, VA. Data retrieved at http://www.systemicpeace.org/inscrdata.html.
  49. Michalopoulos S (2012): “The Origins of Ethnolinguistic Diversity,” American Economic Review, 102, 1508–1539.
  50. Miguel E, Satyanath S, and Sergenti E (2004): “Economic Shocks and Civil Conflict: An Instrumental Variables Approach,” Journal of Political Economy, 112, 725–753.
  51. Montalvo JG and Reynal-Querol M (2005): “Ethnic Polarization, Potential Conflict, and Civil Wars,” American Economic Review, 95, 796–816.
  52. Nunn N and Wantchekon L (2011): “The Slave Trade and the Origins of Mistrust in Africa,” American Economic Review, 101, 3221–3252.
  53. Oster E (2019): “Unobservable Selection and Coefficient Stability: Theory and Evidence,” Journal of Business & Economic Statistics, 37, 187–204.
  54. Pemberton TJ, DeGiorgio M, and Rosenberg NA (2013): “Population Structure in a Comprehensive Genomic Data Set on Human Microsatellite Variation,” G3: Genes, Genomes, and Genetics, 3, 891–907.
  55. Pettersson T and Eck K (2018): “Organized Violence, 1989–2017,” Journal of Peace Research, 55, 535–547.
  56. Posner DN (2003): “The Colonial Origins of Ethnic Cleavages: The Case of Linguistic Divisions in Zambia,” Comparative Politics, 35, 127–146.
  57. Putterman L and Weil DN (2010): “Post-1500 Population Flows and The Long-Run Determinants of Economic Growth and Inequality,” Quarterly Journal of Economics, 125, 1627–1682.
  58. Ramachandran S, Deshpande O, Roseman CC, Rosenberg NA, Feldman MW, and Cavalli-Sforza LL (2005): “Support from the Relationship of Genetic and Geographic Distance in Human Populations for a Serial Founder Effect Originating in Africa,” Proceedings of the National Academy of Sciences, 102, 15942–15947.
  59. Reid R (2014): “The Fragile Revolution: Rethinking War and Development in Africa’s Violent Nineteenth Century,” in Africa’s Development in Historical Perspective, ed. by Akyeampong E, Bates RH, Nunn N, and Robinson JA, Cambridge, UK: Cambridge University Press, 393–423.
  60. Rohner D, Thoenig M, and Zilibotti F (2013a): “War Signals: A Theory of Trade, Trust, and Conflict,” Review of Economic Studies, 80, 1114–1147.
  61. Rohner D, Thoenig M, and Zilibotti F (2013b): “Seeds of Distrust: Conflict in Uganda,” Journal of Economic Growth, 18, 217–252.
  62. Ross M (2006): “A Closer Look at Oil, Diamonds, and Civil War,” Annual Review of Political Science, 9, 265–300.
  63. Rummel RJ (1963): “Dimensions of Conflict Behavior Within and Between Nations,” General Systems Yearbook, 8, 1–50.
  64. Sambanis N (2002): “A Review of Recent Advances and Future Directions in the Quantitative Literature on Civil War,” Defence and Peace Economics, 13, 215–243.
  65. Spolaore E and Wacziarg R (2013): “How Deep Are the Roots of Economic Development?” Journal of Economic Literature, 51, 325–369.
  66. Spolaore E and Wacziarg R (2016): “War and Relatedness,” Review of Economics and Statistics, 98, 925–939.
  67. Sundberg R, Eck K, and Kreutz J (2012): “Introducing the UCDP Non-State Conflict Dataset,” Journal of Peace Research, 49, 351–362.
  68. Tollefsen AF, Strand H, and Buhaug H (2012): “PRIO-GRID: A Unified Spatial Data Structure,” Journal of Peace Research, 49, 363–374.
  69. Weidmann NB, Rød JK, and Cederman L-E (2010): “Representing Ethnic Groups in Space: A New Dataset,” Journal of Peace Research, 47, 491–499.
  70. Wimmer A (2002): Nationalist Exclusion and Ethnic Conflict: Shadows of Modernity, Cambridge, UK: Cambridge University Press.
  71. World Values Survey (2006): “European and World Values Surveys, Four-Wave Integrated Data File, 1981–2004, version 20060423,” The World Values Survey Association, Stockholm, Sweden. Data retrieved at http://www.worldvaluessurvey.org.
  72. World Values Survey (2009): “World Values Survey, 1981–2008 Official Aggregate, version 20090914,” The World Values Survey Association, Stockholm, Sweden. Data retrieved at http://www.worldvaluessurvey.org.

