Abstract
This paper introduces a model-based approach for measuring heterogeneity in sex preferences using birth history records. The approach identifies the combinations of preferences over the sex and number of children that best explain observed childbearing. Empirical estimates indicate that a majority of parents in Africa, Asia, and the Americas consider the sex of children when making childbearing decisions. Many parents prefer sons and many prefer daughters. Comparisons with reported preferences suggest that survey respondents tend to underreport the degree to which they prefer sons or daughters. Estimates indicate that, although sex preferences are widespread, they have little effect on aggregate fertility levels.
JEL codes: J13, J16
Keywords: Sex Preferences, Fertility, Gender, Partial Identification
1. Introduction
When parents particularly want sons or daughters, the decision to have additional children depends on the sex of previous children. For example, in many countries in Asia, parents with only daughters are more likely to have another child than are parents with only sons, suggesting that parents tend to prefer sons (Arnold et al. 1998, Haughton and Haughton 1998). Unfortunately, these comparisons cannot measure heterogeneity (Haughton and Haughton 1998). In much of Africa and the Americas, parents are similarly likely to have another child after having sons as after having daughters, and there is little evidence that parents overwhelmingly prefer sons or daughters (Arnold 1997, Bongaarts 2013). Parents in these regions may not have sex preferences, or some parents may prefer sons while others prefer daughters.
In this paper, I introduce a revealed preference approach that, for the first time, measures heterogeneity in sex preferences within a population using observed sequences of sons and daughters in completed families. I begin with a model of childbearing in which parents have preferences over the share of children that are sons and the total number of children. These preferences govern the decision to have another child after every son or daughter is born. Using a large collection of birth history surveys, I identify the combinations of preferences that best explain an observed distribution of children across families.
Three features of this approach permit measurement of heterogeneity. First, I develop a simple model of childbearing that allows for many possible preferences over the sex and number of children, and structures how these preferences govern parents’ childbearing decisions. There is no single sex preference; rather, there are many ways that parents may value sons and daughters. By first establishing a set of possible preferences, it is then possible to isolate the preferences that best explain observed childbearing.
Second, I measure the importance of the sex of children relative to the number of children. This relative importance determines how parents weigh potentially competing objectives over the sex and number of their children. For example, a couple may want one son, but whether the couple has a second child after a first-born daughter depends on the relative importance of having a small family versus having a son. If the couple would always stop after one child regardless of the child’s sex, then the couple’s desire for a son has no bearing on its fertility decisions. Sex preferences matter only if the sex of previous children influences the decision to have another child.
Third, and perhaps most crucially, the approach allows for partial identification. Because family composition is discrete and most parents have only a few children, parents with different preferences can make the same childbearing decisions, and many different combinations of preferences may best explain observed childbearing. I calculate bounds on preferences across all of these combinations. As noted by Ben-Porath and Welch (1976, page 292), “If populations are heterogeneous … observed patterns will be blurred.” Although blurry, observed patterns can still reveal information about the distribution of preferences in a population.
Standard statistics can only answer the question, what is the prevailing sex preference, if any? The new approach answers additional questions, such as, what share of parents prefer sons, and what share prefer daughters? For example, I estimate that between 57 percent and 93 percent of parents in Africa have and act upon sex preferences. Between 24 percent and 74 percent prefer sons and between 18 percent and 70 percent prefer daughters. Although these intervals are wide, at just the lower bound they suggest that sex preferences are more widespread and heterogeneous than previously established. For more than half of parents in Africa, Asia, and the Americas, the sex of previous children influences the decision to have additional children. In each region, at least 24 percent of parents prefer sons and at least 18 percent prefer daughters.
Many surveys directly ask parents to report their preferred numbers of sons and daughters. A comparison with preferences estimated using observed childbearing suggests that parents tend to underreport the degree to which they prefer sons or daughters. For example, 40 percent of women in Africa report that they prefer sons or prefer daughters, but at least 61 percent of parents are estimated to actually prefer sons or prefer daughters.
Sex preferences can matter for public policy if they affect the level of aggregate fertility (Mutharayappa et al. 1997). Some parents may have many children in order to have a son or daughter, but others may stop childbearing early upon reaching a particularly desirable combination of sons and daughters. These two effects roughly offset in aggregate, and I estimate that eliminating all sex preferences would change aggregate fertility levels around the world by less than 0.2 children per woman. Although sex preferences are widespread and govern many parents’ childbearing decisions, these estimates suggest that they do not drive aggregate fertility levels.
2. Sex Preferences and the Distribution of Children across Families
I follow standard convention and refer to desire for sons or daughters as sex preferences. These preferences are a reduced-form representation of the various tastes, incentives, and constraints that determine whether parents want sons or daughters, such as payments at marriage and expectations about old age support. I do not disentangle the components of sex preferences in this paper. Additionally, I use data from birth history surveys that are administered to women and do not record the identity of each child’s father. Although I show that husbands and wives often report different preferences, I do not address inter-partner bargaining over childbearing decisions, and for convenience I refer to preferences as belonging to couples.
In all populations, the natural likelihood that a conceived child is a boy is about 0.51. Although this likelihood can vary slightly by ancestry and environmental conditions, there remains no widely agreed-upon and adopted method by which parents can predetermine the sex of a fetus (Novitski and Sandler 1956, James 1971, Pickles et al. 1982, James 1990, Bongaarts 2013). However, ultrasound and amniocentesis technologies allow parents to identify the sex of a fetus and make sex-selective abortion possible. The share of births that are boys remained at or below 0.519 in all countries before 1980, but has since risen above 0.519 in Armenia, Azerbaijan, China, Georgia, India, Pakistan, South Korea, and Vietnam, suggesting substantial selective abortion of girls (World Bank DataBank 2015). Girls in many of these countries are also more likely than boys to die in childhood (Arnold 1997, Bongaarts and Guilmoto 2015). Because no country exhibits corresponding selection against boys, the study of sex preferences during childbearing is overwhelmingly the study of son preference.
Outside of Asia, the sex ratio at birth remains around the natural level. Particularly in Sub-Saharan Africa, abortion is heavily stigmatized and infanticide is historically less common than in Asia (Maharaj and Cleland 2006, Kumar et al. 2009). However, even where sex-selective abortion is rare, sex preferences can influence whether parents continue having children after each son or daughter is born. In such cases, the presence of additional children depends on the sex of previous children. Sex preferences therefore influence the distribution of children across families. In this paper, I measure sex preferences using observed sequences of sons and daughters in completed families, collected in settings in which sex-selective abortion is unavailable or rarely used.
When the sex of each birth is stochastic, two couples with the same preferences can end up with different sequences of sons and daughters, and two couples with different preferences can end up with the same sequence (Haughton and Haughton 1998). For example, a couple with a single son may have wanted a single son, or may simply have wanted one child and that child happened to be a son. It is therefore not possible to determine a couple’s preferences from its sequence of sons and daughters. Preferences can only be inferred at the group level, if at all.
A parity progression ratio measures the share of couples with a particular number of sons and daughters that have another child. For example, if all couples keep having children until they have a son or have two children, zero percent of couples with one son will have a second child, while 100 percent of couples with one daughter will have a second child. These unequal parity progression ratios mean that the distribution of children across families is not random: all one-child families have only a single son, while all two-child families have either a daughter and then a son or two daughters. Using simulated populations, appendix A demonstrates how parity progression ratios can signal son or daughter preference in homogeneous populations, but can struggle when preferences are heterogeneous.
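The stopping-rule example above, and the way heterogeneity can mask it, can be illustrated with a minimal sketch. The function name `parity_progression_ratios`, the two stopping rules, and the 50/50 population mix are hypothetical choices for illustration; the sketch also assumes that every couple that wants another child has one.

```python
from collections import defaultdict

def parity_progression_ratios(population, prob_son=0.51, max_children=3):
    """Exact parity progression ratios by (sons, daughters).

    population: list of (share, stops) pairs, where stops(s, d) returns True
    if a couple of that type stops childbearing with s sons and d daughters.
    """
    reached = defaultdict(float)    # mass of couples that ever hold (s, d)
    continued = defaultdict(float)  # mass of those that have another child
    for share, stops in population:
        frontier = [(0, 0, share)]
        while frontier:
            s, d, mass = frontier.pop()
            reached[(s, d)] += mass
            if s + d >= max_children or stops(s, d):
                continue
            continued[(s, d)] += mass
            frontier.append((s + 1, d, mass * prob_son))
            frontier.append((s, d + 1, mass * (1 - prob_son)))
    return {k: continued[k] / reached[k]
            for k in list(reached) if 0 < sum(k) < max_children}

# Hypothetical stopping rules: continue until a son (or a daughter) or two children.
def son_rule(s, d):
    return s >= 1 or s + d >= 2

def daughter_rule(s, d):
    return d >= 1 or s + d >= 2

homogeneous = parity_progression_ratios([(1.0, son_rule)])
mixed = parity_progression_ratios([(0.5, son_rule), (0.5, daughter_rule)])
print(homogeneous[(1, 0)], homogeneous[(0, 1)])
print(mixed[(1, 0)], mixed[(0, 1)])
```

In the homogeneous population the ratios after one son and after one daughter differ sharply. In the mixed population they coincide, even though every couple in it has a sex preference.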
Table 1 presents parity progression ratios calculated from a large sample of birth history surveys described in section 4. In Asia, women with only daughters are consistently more likely to have another child than are women with only sons, suggesting son preference. For example, 95.2 percent of women with one daughter have a second child, compared with 95.1 percent of women with one son.1 In Africa and the Americas, parity progression ratios are more mixed: at some parities, couples with only daughters are most likely to have another child; at other parities, couples with only sons are most likely to have another child. Even in Asia, having additional daughters is not always associated with a greater likelihood of having another child: 86.9 percent of women with two sons have a third child, while only 85.4 percent of women with one son and one daughter have a third child. These comparisons suggest that, especially outside of Asia but perhaps even within Asia, some couples may not have sex preferences while others prefer sons, daughters, or a balance of sons and daughters. Some couples may want many children while others want just one or two.
Table 1:
Parity Progression Ratios
| | Africa | Asia | North America | South America |
|---|---|---|---|---|
| One son | 0.9518 | 0.9507 | 0.9243 | 0.9223 |
| One daughter | 0.9509 | 0.9521 | 0.9210 | 0.9176 |
| Two sons | 0.9159 | 0.8687 | 0.8419 | 0.7961 |
| One son and one daughter | 0.9070 | 0.8543 | 0.8127 | 0.7674 |
| Two daughters | 0.9181 | 0.8913 | 0.8265 | 0.7976 |
| Three sons | 0.8647 | 0.8016 | 0.7389 | 0.6988 |
| Two sons and one daughter | 0.8505 | 0.7749 | 0.7162 | 0.6863 |
| One son and two daughters | 0.8612 | 0.8057 | 0.7220 | 0.6859 |
| Three daughters | 0.8760 | 0.8521 | 0.7314 | 0.7190 |
3. Measuring Heterogeneity in Sex Preferences
Haughton and Haughton (1995) present a model in which a couple receives utility separately from the difference between their numbers of sons and daughters and from their total number of children. In this section, I develop a similar model in which a couple values the share of children that are sons and the total number of children. This simple model structures how preferences guide childbearing decisions. From a large set of possible combinations of preferences, I calculate bounds on the combinations that best explain observed childbearing. Sex preferences are often characterized as yielding stopping rules, in which a couple stops childbearing upon reaching a target number of sons or daughters (Keyfitz and Caswell 2005). In appendix B, I demonstrate that the model in this section rationalizes common stopping rules.
3.1. A Model of Childbearing in which Parents Value the Sex and Number of Children
A couple has T childbearing periods. In each period, the couple tries or does not try to have a child, and then has a son, a daughter, or no child (there are no twin births). Effort at conception, realization of the child’s sex, and childbirth all occur in the same period. The couple conceives with likelihood p when trying to get pregnant, and with likelihood q when not trying to get pregnant. Each conceived child is a boy with likelihood l and is carried to full term. Therefore, each period, a couple that tries to get pregnant has a son with likelihood pl, a daughter with likelihood p(1–l), and no child with likelihood 1–p. A couple that does not try to get pregnant has a son with likelihood ql, a daughter with likelihood q(1–l), and no child with likelihood 1–q.
A couple’s childbearing decisions are governed by the following Bellman equation:
| (1) |
where
In every childbearing period t = 1, 2, …, T, the couple tries or does not try to have a child in order to maximize expected utility at the end of its childbearing career (period T+1). If trying and not trying to have a child yield the same expected utility, the couple tries to have a child. The decision in each period depends on the number of sons, s, and daughters, d, to which the couple has already given birth.
The couple receives bliss point utility in period T+1 equal to the sum of two terms: the squared difference between the couple’s actual share of children that are sons, s/(s+d), and the couple’s ideal share of children that are sons, r*; and the squared difference between the couple’s actual number of children, s+d, and the couple’s ideal number of children, c*.2 So that both terms have a maximum value of one, each is divided by a scaling factor equal to its largest possible value: r* or 1–r* for the first term, and c* or T–c* for the second term. If the couple remains childless, s/(s+d) is undefined, and the first term is set equal to one. The importance of the sex of children relative to the number of children, α, weights the two terms.
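One way to write the recursion and terminal payoff described in this section is sketched below. The symbol V for the value function, the negative sign on the loss, and the squaring of the scaling factors are assumptions made for illustration. The text specifies only that each term is scaled to a maximum value of one and that α weights the two terms. For a childless couple, the first fraction is replaced by one.

```latex
\begin{aligned}
V_t(s,d) &= \max\Big\{\, p\,l\,V_{t+1}(s+1,d) + p(1-l)\,V_{t+1}(s,d+1) + (1-p)\,V_{t+1}(s,d), \\
&\qquad\quad\;\; q\,l\,V_{t+1}(s+1,d) + q(1-l)\,V_{t+1}(s,d+1) + (1-q)\,V_{t+1}(s,d) \Big\},
\qquad t = 1,\dots,T, \\[4pt]
V_{T+1}(s,d) &= -\left[\, \alpha\,\frac{\left(\tfrac{s}{s+d} - r^{*}\right)^{2}}{\max(r^{*},\,1-r^{*})^{2}}
\;+\; (1-\alpha)\,\frac{(s+d-c^{*})^{2}}{\max(c^{*},\,T-c^{*})^{2}} \right].
\end{aligned}
```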
I refer to the couple’s ideal share of children that are sons, r*, as its sex preference; the ideal number of children, c*, as family size preference, and the relative importance of the sex of children, α, as the strength of sex preference. When deciding whether to have additional children, a couple with α=1 values only the share of children that are sons, a couple with α=0 values only the number of children, and a couple with 0<α<1 values both the sex and number of children. Sequential childbearing decisions are fundamentally economic, balancing potentially competing desires (the sex and number of children) under constraints (a finite number of childbearing periods and the randomness of the sex of each child).
Equation 1 is solved using backward induction. In period T, a couple tries to have a child if and only if the expected utility from trying to have a child weakly exceeds the expected utility from not trying to have a child. When facing the same choices in period T–1, the couple knows, for each possible outcome, what its decision will be in period T. Similar calculations govern decisions in all earlier periods.3
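The backward induction can be sketched in a few lines. The function names and the exact form of the terminal utility (a negative weighted sum of squared deviations, with squared scaling factors) are assumptions based on the verbal description, not the author's code. The default values of T, p, q, and l are those stated in sections 4 and 5.

```python
from functools import lru_cache

def terminal_utility(s, d, r_star, c_star, alpha, T):
    """Assumed bliss-point utility in period T+1 (0 is best, -1 is worst)."""
    n = s + d
    # The sex term is set to one for a childless couple, as stated in the text.
    sex_loss = 1.0 if n == 0 else (s / n - r_star) ** 2 / max(r_star, 1 - r_star) ** 2
    size_loss = (n - c_star) ** 2 / max(c_star, T - c_star) ** 2
    return -(alpha * sex_loss + (1 - alpha) * size_loss)

def solve_policy(r_star, c_star, alpha, T=8, p=0.9, q=0.1, l=0.51):
    """Return tries(t, s, d): True if the couple tries to conceive in period t."""

    @lru_cache(maxsize=None)
    def value(t, s, d):
        if t == T + 1:
            return terminal_utility(s, d, r_star, c_star, alpha, T)
        return max(expected(t, s, d, p), expected(t, s, d, q))

    def expected(t, s, d, conceive):
        # Expected continuation value given the likelihood of conception.
        return (conceive * l * value(t + 1, s + 1, d)
                + conceive * (1 - l) * value(t + 1, s, d + 1)
                + (1 - conceive) * value(t + 1, s, d))

    def tries(t, s, d):
        # Ties are broken in favor of trying, as in the text.
        return expected(t, s, d, p) >= expected(t, s, d, q)

    return tries

# A couple that wants one child, ideally a son, and weights both goals equally.
tries = solve_policy(r_star=1.0, c_star=1, alpha=0.5)
print(tries(8, 0, 1), tries(8, 1, 0))
```

Under these assumptions, the couple in the usage example tries for another child in the last period after a daughter but not after a son, so its sex preference changes its fertility decisions.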
Imperfect control over conception is the only reason to defer childbearing in the model. For example, consider a couple that wants only one child, can always get pregnant while trying, but may accidentally become pregnant when not trying. The couple will reduce the risk of having too many children by waiting until later periods to start trying to have a child. Parents may in fact defer childbearing for education, employment, marriage, and other reasons, but the model interprets any deferral as resulting from concern over the likelihood of conception. To avoid such misinterpretation, I collapse the expected distribution over the full sequence of periods in which a son, daughter, or no child is born into just the sequence of sons and daughters.4 In appendix C, I provide an example of this collapse.
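The collapse from period-by-period outcomes to a sequence of sons and daughters can be sketched as a forward recursion. The function `sequence_distribution` and the policy `until_son_or_two` are hypothetical. The policy is an illustrative stopping rule, not one derived from the model.

```python
from collections import defaultdict

def sequence_distribution(tries, T, p, q, l):
    """Distribution over birth sequences such as "DS" after T periods.

    Periods with no birth leave the sequence unchanged, so all period-level
    paths that produce the same ordered births are summed into one entry.
    """
    dist = {"": 1.0}
    for t in range(1, T + 1):
        nxt = defaultdict(float)
        for seq, prob in dist.items():
            s, d = seq.count("S"), seq.count("D")
            conceive = p if tries(t, s, d) else q
            nxt[seq + "S"] += prob * conceive * l
            nxt[seq + "D"] += prob * conceive * (1 - l)
            nxt[seq] += prob * (1 - conceive)
        dist = dict(nxt)
    return dist

def until_son_or_two(t, s, d):
    """Illustrative policy: try until a son is born or there are two children."""
    return s == 0 and s + d < 2

dist = sequence_distribution(until_son_or_two, T=3, p=0.9, q=0.1, l=0.51)
exact = sequence_distribution(until_son_or_two, T=2, p=1.0, q=0.0, l=0.51)
print(sorted(dist.items(), key=lambda kv: -kv[1])[:3])
```

Because the dictionary is keyed by the birth sequence alone, the timing of births drops out of the result, which is the point of the collapse.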
I make several assumptions so that the model is tractable. Only the three preferences can vary across couples. All other components of equation 1, such as number of childbearing periods, are assumed to be the same across all couples. In reality, women differ in the number of years they are fecund, in how quickly they can become pregnant again after giving birth, and therefore in their number of childbearing periods. Couples may exert a range of effort at trying to conceive, or trying to avoid conception. Access to contraception and other factors that influence control over conception can vary across populations, across couples within a population, and over time for individual couples (Henry 1961, Di Renzo et al. 2007, Clifton 2010). Couples may vary in their natural likelihoods of having a boy, although this variation is small (James 2009). Finally, not all conceived children are carried to full term; some are miscarried or aborted.
The model also assumes that a couple’s utility depends only on the numbers of sons and daughters. Couples may in fact also care about the order in which their sons and daughters are born. For example, China’s one-child policy permitted couples in rural areas to have a second child if their first child was a daughter. These couples therefore had reason to prefer a first-born daughter if they wanted two total children (Goodkind 2017). Additionally, the model assumes that a couple receives utility once, after the last childbearing period. There is likely variation in the timing of when couples enjoy and bear the costs of having children. Some couples may have reason to delay childbearing in order to attend school or begin a career. Some may continue to particularly prefer sons or daughters long after they finish having children, such as when payments of dowry or brideprice are required at the child’s marriage. Again, so that the model can be estimated, I assume that a couple receives utility once at the end of the childbearing career, rather than over time during and long after its childbearing career.
3.2. Calculate Bounds on Preferences that Best Explain Observed Childbearing
With T childbearing periods, there are M = 2^(T+1) – 1 possible sequences of sons and daughters. I consider a discrete number of possible values for the preferences: sex preference r* ∈ {0, 0.1, 0.2, …, 1}, family size preference c* ∈ {0, 1, 2, …, T}, and strength of sex preference α ∈ {0, 0.1, 0.2, …, 1}. This discretization allows for N = 121(T+1) possible combinations of preferences. A is an M×N matrix in which each element a_mn is the likelihood that a couple with combination of preferences n has sequence of children m, as calculated in section 3.1. D is an M×1 matrix in which each element d_m is the share of couples with sequence of children m in an observed population. The candidate share, S, is an N×1 matrix in which each element s_n is the estimated share of couples with combination of preferences n. The empirical challenge is to identify the elements of S such that the predicted distribution of sequences of children, AS, most closely matches the observed distribution, D.
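The dimensions M and N can be checked by direct enumeration. This is a small sketch; the string encoding of sequences ("S" for a son, "D" for a daughter) and the variable names are illustrative assumptions.

```python
from itertools import product

T = 8  # number of childbearing periods used for the estimates (section 4)

# All ordered sequences of sons and daughters with 0 to T births.
sequences = ["".join(seq) for k in range(T + 1) for seq in product("SD", repeat=k)]
M = len(sequences)

# Discretized preference grid: (r*, c*, alpha).
r_grid = [i / 10 for i in range(11)]
c_grid = list(range(T + 1))
alpha_grid = [i / 10 for i in range(11)]
preferences = list(product(r_grid, c_grid, alpha_grid))
N = len(preferences)

print(M, 2 ** (T + 1) - 1)   # number of sequences vs. the closed form
print(N, 121 * (T + 1))      # number of preference combinations vs. the closed form
```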
Multiple combinations of preferences can lead couples to make the same childbearing decisions. As I discuss in appendix D, this collinearity can prevent identification of a unique candidate share that alone best matches the observed population. There may be an infinite number of such optimal candidate shares. In the remainder of this section, I develop a two-stage procedure for calculating bounds on preferences across all optimal candidate shares without requiring identification of each individual optimal candidate share.
The first stage identifies the smallest possible difference between observed and predicted populations:
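$$ e_{\min} \;=\; \min_{S}\; \mathbf{1}^{\top}\,\lvert D - AS \rvert \quad \text{subject to} \quad S \ge 0, \;\; \mathbf{1}^{\top} S = 1 \tag{2} $$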
Minimum value function 2 (which I refer to as equation 2) chooses a candidate share, S, that minimizes the sum of absolute deviations between the observed population, D, and the predicted population, AS, subject to the constraints that the shares of couples with each possible combination of preferences are non-negative and sum to one. Equation 2 identifies the smallest possible sum of absolute deviations between the observed and predicted populations, e_min = 1ᵀ|D − AS|, where 1 is an M×1 vector of ones and the absolute value is taken element by element. I use the sum of absolute deviations, rather than the sum of squared deviations or another non-linear norm, so that this minimized sum can enter as a linear constraint in the second stage.
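A standard reformulation shows why the absolute-deviation objective keeps the problem linear. The auxiliary variables e_m below are not part of the original notation and are introduced here only for illustration.

```latex
\begin{aligned}
\min_{S,\,e}\quad & \sum_{m=1}^{M} e_m \\
\text{subject to}\quad & -e_m \;\le\; d_m - \sum_{n=1}^{N} a_{mn}\, s_n \;\le\; e_m, && m = 1,\dots,M, \\
& s_n \ge 0, && n = 1,\dots,N, \\
& \sum_{n=1}^{N} s_n = 1 .
\end{aligned}
```

At an optimum, each e_m equals the absolute deviation for sequence m, so the objective equals the minimized sum of absolute deviations. Holding the sum of the e_m at that minimum is itself a linear restriction, which is what a later optimization stage needs.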
The second stage calculates bounds on a summary measure of preferences across all candidate shares that best match the observed population:
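$$ \min_{S}\; F^{\top} S \quad \text{subject to} \quad S \ge 0, \;\; \mathbf{1}^{\top} S = 1, \;\; \mathbf{1}^{\top}\,\lvert D - AS \rvert = e_{\min} \tag{3} $$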
F is an N×1 matrix in which each element f_n is a summary measure of combination of preferences n. For example, if f_n is the sex preference in combination of preferences n, then equation 3 chooses a candidate share to minimize the average sex preference, FᵀS. In addition to the same proportionality constraints as in equation 2, equation 3 requires that the sum of absolute deviations using the chosen candidate share equals the minimized sum of absolute deviations from the first stage. Although there may be many such candidate shares, it does not matter which one is selected, because they all yield the same estimated minimum average sex preference, FᵀŜ. Multiplying F by −1 and rerunning equation 3 yields the maximum average sex preference, −FᵀŜ, that best explains the observed population. I use the linear programming simplex algorithm to solve equations 2 and 3.
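The two-stage procedure can be sketched with an off-the-shelf linear programming solver. This sketch uses SciPy's `linprog` with the HiGHS method rather than the simplex algorithm named in the text. It also relaxes the second-stage equality to "at most e_min plus a small tolerance" for numerical stability, and uses a made-up 3×3 matrix A in which two preference types are observationally identical. The function name, the tolerance, and the toy data are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linprog

def preference_bounds(A, D, F, tol=1e-9):
    """Return (e_min, lower bound, upper bound) on F'S over best-fitting shares S."""
    M, N = A.shape
    I = np.eye(M)
    # Decision variables: x = [S (N shares), e (M absolute deviations)], all >= 0.
    A_ub = np.block([[A, -I], [-A, -I]])      # encodes |D - AS| <= e
    b_ub = np.concatenate([D, -D])
    A_eq = np.concatenate([np.ones(N), np.zeros(M)])[None, :]  # shares sum to one
    b_eq = [1.0]
    sum_e = np.concatenate([np.zeros(N), np.ones(M)])

    # Stage 1: smallest possible sum of absolute deviations.
    stage1 = linprog(sum_e, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                     bounds=(0, None), method="highs")
    e_min = stage1.fun

    # Stage 2: optimize the summary measure while holding the fit at e_min.
    A_ub2 = np.vstack([A_ub, sum_e])
    b_ub2 = np.concatenate([b_ub, [e_min + tol]])
    obj = np.concatenate([F, np.zeros(M)])
    low = linprog(obj, A_ub=A_ub2, b_ub=b_ub2, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    high = linprog(-obj, A_ub=A_ub2, b_ub=b_ub2, A_eq=A_eq, b_eq=b_eq,
                   bounds=(0, None), method="highs")
    return e_min, low.fun, -high.fun

# Toy example: preference types 1 and 2 imply identical sequence distributions.
A = np.array([[0.5, 0.5, 0.2],
              [0.3, 0.3, 0.3],
              [0.2, 0.2, 0.5]])
D = A @ np.array([0.3, 0.3, 0.4])   # "observed" population from known shares
F = np.array([0.0, 1.0, 0.5])       # e.g. the ideal share of sons of each type

e_min, lower, upper = preference_bounds(A, D, F)
print(e_min, lower, upper)
```

In the toy example the data pin down the combined share of the first two types but not how it splits between them. The average of F is therefore only bounded, which mirrors the partial identification described above.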
4. Birth History Surveys
Birth history surveys record the order and sex of all of a woman’s live births. I compile birth histories collected by the following eight surveys: China Two-per-thousand Fertility Survey, Demographic and Health Surveys, Family Life Surveys, India Rural Economic and Demographic Survey, Japanese General Social Survey, Multiple Indicator Cluster Surveys, United States Integrated Fertility Survey Series, and World Fertility Surveys. The compiled dataset draws from 330 individual surveys conducted between 1970 and 2015 in 92 countries in Africa, Asia, and the Americas. Appendix E lists the citation for each survey and the individual surveys by country.
The model-based approach for measuring sex preferences in section 3 requires that observed sequences of children represent completed families, that each child is born in a separate period, and that the sex of each child is stochastic. I therefore limit the sample to women who are aged 40 and older, who provide complete birth histories, and who do not have any twins or other multiple births. For countries in which the share of births that are boys has ever exceeded 0.519, indicating extensive sex-selective abortion, I exclude women who reached age 40 after the final year in which the share of births was less than or equal to 0.519.5 Few surveys record birth histories from women who have never been married, and typically only if they have given birth. These women are therefore not representative of all women who have never been married, and I restrict the sample to only women who have ever been married. Finally, all estimates are calculated allowing T=8 childbearing periods to facilitate computation of equations 1 through 3, so I restrict the sample to women who have given birth eight or fewer times.
Of 895,683 women aged 40 and older in the full sample, these restrictions leave a final sample of 621,435 women. Panel (a) of Table 2 compares the sizes of the full and final samples on each continent. The final sample is most substantially reduced in Asia because it excludes women who were of childbearing age when sex-selective abortion was widespread. Panel (b) compares women included in and excluded from the final sample across basic demographic characteristics. Women in the final sample are slightly younger on average, have more years of schooling and fewer children on average, and are more likely to live in an urban area. Appendix E repeats this comparison for each individual sample restriction.
Table 2:
Sample Characteristics
(a) Sample

| | Countries | Surveys | Women age 40+: full sample | Women age 40+: final sample |
|---|---|---|---|---|
| Africa | 42 | 171 | 282,935 | 197,656 |
| Asia | 31 | 89 | 445,237 | 287,442 |
| North America | 11 | 37 | 61,671 | 49,628 |
| South America | 8 | 33 | 105,840 | 86,709 |
| Total | 92 | 330 | 895,683 | 621,435 |

(b) Characteristics of final sample

| | In final sample | Not in final sample | Difference |
|---|---|---|---|
| Women age 40+ | 621,435 | 274,248 | |
| Avg. age | 44.06 | 44.10 | –0.04 (0.01) |
| Avg. years of schooling | 5.23 | 3.40 | 1.83 (0.02) |
| Share live in urban area | 0.47 | 0.34 | 0.14 (0.002) |
| Avg. no. of children | 4.25 | 6.85 | –2.60 (0.01) |
Notes: This table presents characteristics of birth history survey respondents. Panel (a) counts the number of surveys, countries in which the surveys were conducted, and number of women aged 40 and older on each continent. The final sample is reached by excluding women who provide incomplete birth histories, who have ever had twins or another multiple birth, who reached age 40 after the last recorded year in which the sex ratio at birth was at the natural level in their country, who have never been married, and who have more than eight children. For women included in and excluded from the final sample, panel (b) compares the average age, average years of completed schooling, share living in an urban area, and average number of children. The final column of panel (b) presents a t-test of equality of the two means, with standard errors in parentheses. All p-values are less than 0.001. Appendix Table E.2 repeats panel (b) for each individual restriction. See section 4. Data source: Birth history surveys, described in section 4.
Birth history surveys are retrospective and ask women to report births that may have occurred several decades ago. Failure to report births is a particular concern if boys or girls are especially likely to go unreported. Although it is not possible to directly measure omitted births, it is possible to measure the accuracy of the reported timing of births by sex. When misreported, ages tend to be clustered at multiples of five (Newell 1988). For example, across the entire sample, 21.1 percent of children are reported to have an age ending in zero or five. This value is above the expected 20 percent, suggesting age heaping. However, age heaping is nearly identical for boys and girls (21.2 percent and 21.0 percent). There is similarly little difference in the incidence of age heaping between boys and girls who are still alive (21.4 percent versus 21.1 percent), and between boys and girls who have died (20.4 percent versus 20.5 percent). These comparisons suggest that misreporting is not associated with the sex of children.
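The age-heaping comparison amounts to computing, separately for boys and girls, the share of reported ages that end in zero or five. A minimal sketch follows. The function name and the sample ages are made up for illustration and are not taken from the surveys.

```python
def heaping_share(ages):
    """Share of reported ages (in whole years) that end in 0 or 5."""
    ages = list(ages)
    return sum(age % 5 == 0 for age in ages) / len(ages)

# Hypothetical reported ages, by sex of child.
reported = {
    "boys": [3, 5, 7, 10, 12, 14, 15, 18, 21, 23],
    "girls": [2, 4, 5, 9, 10, 13, 16, 20, 22, 27],
}

for sex, ages in reported.items():
    # With no heaping, roughly 20 percent of final digits would be 0 or 5.
    print(sex, heaping_share(ages))
```

Similar shares for boys and girls, even when both exceed 20 percent, are what the text interprets as misreporting that is unrelated to the sex of the child.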
5. Main Results
Larsen (2005) reports that approximately 90% of women in northern Tanzania are able to get pregnant within two years when trying, and I assume that every couple conceives when trying to get pregnant with likelihood p=0.9. To allow for a small chance of accidentally becoming pregnant, I assume that every couple conceives when not trying to get pregnant with likelihood q=0.1. Because the natural likelihood that each birth is a boy is about 0.51, I assume that every child is a boy with likelihood of l=0.51. I calculate the distribution of sequences of children using sampling weights, with each survey’s weight rescaled to have a mean value of one. I follow McKean and Sievers (1987) to calculate the coefficient of determination for the minimized sum of absolute deviations in equation 2.
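The weighting step can be sketched as follows: each respondent's sampling weight is divided by the mean weight in her survey, and the rescaled weights are then summed by sequence of children. The record layout `(survey_id, weight, sequence)` and the function name are assumptions for illustration.

```python
from collections import defaultdict

def observed_distribution(records):
    """Weighted share of each birth sequence, with weights rescaled by survey.

    records: iterable of (survey_id, weight, sequence) tuples.
    """
    records = list(records)
    total = defaultdict(float)
    count = defaultdict(int)
    for survey, weight, _ in records:
        total[survey] += weight
        count[survey] += 1

    mass = defaultdict(float)
    for survey, weight, seq in records:
        mean_weight = total[survey] / count[survey]
        mass[seq] += weight / mean_weight   # rescaled weight has mean one per survey

    grand_total = sum(mass.values())
    return {seq: m / grand_total for seq, m in mass.items()}

# Hypothetical records from two surveys with very different weight scales.
records = [("A", 2.0, "S"), ("A", 6.0, "D"),
           ("B", 100.0, "S"), ("B", 100.0, "SD")]
D = observed_distribution(records)
print(D)
```

Rescaling keeps a survey with large raw weights from dominating the pooled distribution, while preserving relative weights within each survey.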
5.1. Estimated Bounds on Average Values of Preferences
Figure 1 presents estimated average values of preferences, by continent. The dark rectangles provide estimated bounds calculated using equations 1 through 3. The light rectangles provide 95-percent confidence intervals around the unknown true average value, calculated using subsampling following Romano and Shaikh (2008) with 1,000 subsamples. The confidence intervals in panel (a) indicate that the average ideal share of children that are sons, r*, is between 0.28 and 0.78 in Africa, between 0.48 and 0.79 in Asia, between 0.32 and 0.70 in North America, and between 0.34 and 0.77 in South America. Values below 0.5 indicate that couples on average prefer more daughters than sons, while values above 0.5 indicate son preference. In Asia, nearly the entire interval lies above 0.5 and suggests average son preference. Elsewhere, the interval substantially spans 0.5, indicating that average daughter preference or average son preference can explain observed childbearing equally well. Additionally, the four confidence intervals overlap one another, preventing identification of any statistically significant differences between continents in the unknown true average sex preference.6
Figure 1: Estimated Bounds on Average Values of Preferences, by Continent.
Notes: The dark rectangles provide bounds on average values of preferences, calculated according to section 3 with parameter values given in section 5. The light rectangles provide 95-percent confidence intervals around the unknown true value. The coefficient of determination is 0.94 in Africa, 0.95 in Asia, 0.94 in North America, and 0.96 in South America. See section 5.1. Data source: Birth history surveys, described in section 4.
The 95-percent confidence intervals in panel (b) are much narrower and indicate that the average ideal number of children, c*, is approximately 5.0 in Africa, 4.2 in Asia, 3.7 in North America, and 3.3 in South America.7 The estimates in panel (c) indicate that, across all four continents, the average strength of sex preference, α, is at least 0.16 and at most 0.35. The lower bound on this confidence interval is 0.26 in Asia, and the upper bound is 0.25 in South America, indicating that the difference in average strength of sex preference on the two continents is statistically distinguishable at the five-percent level. Again, if a couple’s decision to have an additional child never depends on the sex of previous children, then that couple is indistinguishable from a couple that places no weight on the sex of children. Only when the strength of sex preference is greater than zero, α>0, does a couple’s ideal share of children that are sons actually influence the couple’s childbearing decisions. The finding that average strength of sex preference is greater than zero on all continents therefore suggests that the sex of children matters for childbearing decisions around the world.
In appendix G, I demonstrate the sensitivity of the estimates in Asia to model and sample specification. Although the estimated bounds vary, they generally yield the same conclusion: average ideal share of children that are sons is substantially greater than zero and less than one, meaning that couples on average prefer a mix of sons and daughters; and average strength of sex preference is greater than zero, meaning that the sex of children matters for childbearing decisions.
5.2. Estimated Bounds on Additional Summary Measures of Sex Preferences
I estimate bounds on additional summary measures of preferences using alternative specifications of F in equation 3. For example, to estimate the share of couples for whom the sex of children matters (α>0), each element of F equals one if the corresponding combination of preferences places any weight on the sex of children and equals zero otherwise. The estimates in panel (a) of Figure 2 indicate that between 57 percent and 93 percent of couples in Africa have strength of sex preference greater than zero. Confidence intervals run from 76 percent to 95 percent in Asia, from 70 percent to 93 percent in North America, and from 64 percent to 90 percent in South America. At just the lower bound, these estimates suggest that more than half of couples on each continent have and act upon sex preferences.
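Alternative specifications of F can be sketched as simple functions of the preference grid. The ordering of the grid, the use of integer tenths for r* and α, and the variable names are assumptions for illustration. The uniform candidate share is a placeholder, not an estimate.

```python
from itertools import product

T = 8
# Each combination is (r* in tenths, c*, alpha in tenths); integers avoid
# floating-point comparisons when testing alpha > 0.
grid = list(product(range(11), range(T + 1), range(11)))

# Indicator that the sex of children matters (alpha > 0).
F_sex_matters = [1.0 if a > 0 else 0.0 for (r, c, a) in grid]

# Ideal share of children that are sons, used for the average sex preference.
F_ideal_share = [r / 10 for (r, c, a) in grid]

def summary(F, S):
    """Population-level summary measure F'S for a candidate share S."""
    return sum(f * s for f, s in zip(F, S))

# Placeholder candidate share: equal weight on every combination.
S_uniform = [1.0 / len(grid)] * len(grid)
share_sex_matters = summary(F_sex_matters, S_uniform)
average_ideal_share = summary(F_ideal_share, S_uniform)
print(share_sex_matters, average_ideal_share)
```

Swapping one F vector for another changes only the objective in the second-stage problem. The constraints and the first-stage fit are unchanged.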
Figure 2: Additional Summary Measures of Sex Preferences.
Notes: The dark rectangles provide estimated bounds on summary measures of preferences. The light rectangles provide 95-percent confidence intervals around the unknown true value. The coefficients of determination are given in the notes to Figure 1. See section 5.2. Data source: Birth history surveys, described in section 4.
As given in panel (b), at most 52 percent of couples in Africa prefer a balance of sons and daughters. Elsewhere, this upper bound is substantially below 50 percent, suggesting that most couples prefer sons or prefer daughters. The intensity with which couples prefer sons or daughters, measured as the magnitude of the difference between each couple’s ideal share of children that are sons and one-half, is correspondingly large, with a minimum value in panel (c) of at least 0.17 on all continents. On average, couples prefer at most 33 percent or at least 67 percent of their children to be sons – a ratio of at least two daughters per son or vice versa.
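The arithmetic behind this ratio can be written out as follows. The sketch takes the bound of 0.17 from panel (c) and, as an illustrative simplification, considers a couple whose intensity equals that bound exactly:

```latex
\left| r^* - \tfrac{1}{2} \right| \ge 0.17
\;\Longrightarrow\;
r^* \le 0.33 \quad \text{or} \quad r^* \ge 0.67,
\qquad
\frac{1 - 0.33}{0.33} \approx 2.0 .
```

At r* = 0.33 the ratio is about two daughters per son; by symmetry, at r* = 0.67 it is about two sons per daughter.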
Panels (d) and (e) of Figure 2 present bounds on the shares of couples that prefer sons and prefer daughters. Although these intervals are wide, they indicate that at least 24 percent of couples everywhere prefer sons, and at least 18 percent prefer daughters. The difference between these two shares, given in panel (f), substantially spans zero in Africa and the Americas. In Asia, nearly the entire confidence interval for this difference suggests that more couples prefer sons than prefer daughters.
The final three panels of Figure 2 present estimates on average ideal number of sons, number of daughters, and the difference between the two. On all continents, couples want an average of at least 1.05 sons and at least 0.62 daughters. Again, in Africa and the Americas, the difference between these two values could be negative or positive. In Asia, the identified set for this difference is between 0.25 and 2.68, and the 95-percent confidence interval is between −0.03 and 3.01, suggesting son preference. Panel (b) of Figure 1 and panel (i) of Figure 2 suggest that, of the 4.17 or more total children that the average couple in Asia wants, at most 3.01 are desired to be sons. Among all children desired in Asia, at most 72 percent (3.01/4.17) are desired to be sons. Yet, as given in panel (a) of Figure 1, the average couple in Asia prefers at most 79 percent of its children to be sons. This comparison suggests that son preference in Asia may be concentrated among couples that want small families.
Although the bounds are wide, the estimates in Figure 2 illustrate what a model-based approach to measuring sex preferences can achieve. Parity progression ratios and other standard techniques can identify populations in which the sex of previous children matters for at least some couples’ future childbearing decisions. However, only a model-based approach permits calculation of bounds on the share of couples for whom the choice to have additional children depends on the sex of previous children, and bounds on the shares of couples who prefer sons, daughters, or a balance of the two. Even at the lower bound, the estimates in this section indicate that more than half of couples in Africa, Asia, and the Americas have and act upon sex preferences. At least 24 percent prefer sons, at least 18 percent prefer daughters, and at least four percent prefer a balance of sons and daughters. Observed childbearing is best explained by sex preferences that are widespread and heterogeneous.
5.3. Comparison between Estimated and Reported Preferences
In this section, I compare preferences estimated from birth history surveys using the model-based approach to reported preferences collected by the same surveys. Many Demographic and Health Surveys ask women to report the number of sons, daughters, and total children they would choose if they could return to the start of their childbearing careers (Arnold 1997). From these reports, I calculate each woman’s ideal share of children that are sons and ideal family size. So that the estimated and reported preferences are drawn from the same people, I restrict the sample introduced in section 4 to the 273,518 women who report their desired numbers of sons, daughters, and total children. This sample yields different estimated bounds than does the larger sample in sections 5.1 and 5.2.
As given in panel (a) of Figure 3, average reported ideal family size on all four continents is within 0.4 children of the estimated bounds on average ideal family size, suggesting that women accurately report ideal family size. As given in panels (b) and (c), in Africa the estimated bounds span the reported ideal numbers of sons and daughters. In Asia and the Americas, reported average ideal number of sons is between 0.2 and 0.4 sons below the estimated lower bound, and reported average ideal number of daughters is near the estimated upper bound. As a result, panel (d) suggests that, in Asia and the Americas, women on average underreport their ideal share of children that are sons. In all cases, though, reported values are close to the estimated bounds.
Figure 3: Comparison between Estimated and Reported Preferences.
Notes: The hollow rectangles provide estimated bounds on summary measures of preferences. The solid circles provide average reported values of these summary measures. The coefficient of determination is 0.93 in Africa, 0.94 in Asia, 0.91 in North America, and 0.95 in South America. See section 5.3. Data source: Birth history surveys, described in section 4 and section 5.3.
The remaining panels of Figure 3 reveal more substantial differences between reported and estimated preferences. As given in panel (e), less than 25 percent of women on each continent report that they prefer sons, substantially below the estimated lower bound of at least 32 percent in Africa, 44 percent in North America, 50 percent in South America, and 53 percent in Asia. Panel (f) similarly suggests that, in Africa, Asia, and North America, too few women report preference for daughters. This underreporting of son and daughter preference means that, as given in panel (g), women overreport desire for equal numbers of sons and daughters. For example, 62 percent of women in Asia report wanting a balance of sons and daughters, yet estimates indicate that at most 20 percent of couples in fact have balance preference. Using a more expansive definition of balance preference that includes any ideal share of children that are sons between 0.2 and 0.8, panel (h) similarly finds that women overreport balance preference. More than 83 percent of women report that they want less than four times as many sons as daughters and vice versa, yet estimates indicate that at most 53 percent of couples do so. Similarly, panel (i) indicates that women underreport the intensity with which they prefer sons or daughters, measured as the magnitude of the difference between ideal share of children that are sons and one-half.
When the sex of children is stochastic, it is not possible to determine a couple’s sex preference from its sequence of sons and daughters; only the distribution of preferences in a population can be inferred from a collection of birth histories. Surveys that elicit reported preferences have the advantage of isolating a single parent’s preferences. However, the comparisons in this section suggest that women tend to underreport the degree to which they prefer sons or daughters. There are several potential explanations for this difference between reported and estimated preferences. First, the reported preferences used in this section are collected from women alone, but actual childbearing reflects a combination of men’s and women’s preferences. In appendix H, I compare preferences reported by husbands and wives using a smaller sample of surveys that also record men’s reported preferences. Particularly in Africa, husbands report more son preference than do their wives. However, both husbands’ and wives’ reported preferences fall well short of the lower bound on estimated intensity. While husbands and wives may differ on whether they prefer sons or daughters, this finding suggests that they both tend to underreport the degree to which they prefer sons or prefer daughters.
Second, when asked about sex preferences after having children, a respondent may feel pressure to report that she wants the family composition that she actually had (Lightbourne 1985, Bongaarts 2013). In appendix H, I compare preferences reported by women near the start and near the end of their childbearing careers in Indonesia. Women aged 29 or younger are nearly twice as likely to report son preference or daughter preference as are women aged 40 and older. Reported intensity of preferences is also approximately twice as high at the start as at the end of childbearing careers. Although estimated using different samples, these comparisons suggest that reported preferences weaken as women age. Birth history surveys record behavior from across the childbearing career, and preferences estimated from birth histories may more accurately represent the preferences that drove a couple’s childbearing decisions.
Finally, and perhaps most importantly, reported preferences do not address the sequential nature of childbearing. A couple’s decision to have another child depends not only on its ideal composition of sons and daughters but also on the relative undesirability of other combinations of sons and daughters. By jointly measuring sex preference, family size preference, and strength of sex preference, a model-based approach more fully captures the tradeoffs that parents face when making childbearing decisions.
5.4. Estimated Bounds on Preferences by Country
Preferences estimated by continent reveal heterogeneity in sex preferences around the world, but wide bounds generally prevent identification of differences between continents. Country-level estimates can better identify pockets of sex preferences.8 Panel (a) of Figure 4 presents estimated bounds on average ideal share of children that are sons, sorted by the share of women with two daughters that have a third child minus the share of women with two sons that have a third child. This difference ranges from nearly −0.06 in Congo to nearly 0.12 in Nepal, suggesting that couples in Congo are satisfied once they have daughters, while couples in Nepal prefer sons. The estimated bounds on average sex preference vary substantially, but generally suggest greater son preference as the parity progression ratio difference rises. In Namibia, the average ideal share of children that are sons is at most 0.47, while in Nepal the average ideal share is at least 0.66. China has the greatest possible son preference, with an upper bound of 0.87. Four countries – Bolivia, Ecuador, Namibia, and Swaziland – have an upper bound below 0.5, suggesting average daughter preference, and none of these countries are in Asia. Of the 17 countries with a lower bound above 0.5, suggesting average son preference, 12 are in Asia.
Figure 4: Estimated Bounds on Preferences by Country.
Notes: Each panel presents estimated bounds on summary measures of preferences, measured along the x-axis, by alternative ways of signaling sex preferences, measured along the y-axis. The y-axis in panels (a) and (b) records the difference between parity progression ratios among couples with two daughters and couples with two sons. The y-axis in panel (c) records the difference between parity progression ratios among couples with two children of the same sex (two sons or two daughters) and couples with one son and one daughter. The y-axis in panel (d) records the average magnitude of the difference between ideal share of sons and one-half, as reported by women who also provide birth history surveys. See section 5.4. Data source: Birth history surveys, described in section 4 and section 5.4.
Panel (b) presents estimated bounds on the share of couples that value the sex of children, sorted by the same parity progression ratio difference as in panel (a). Greater variation in parity progression ratios, yielding values on the y-axis further from zero, suggests a greater role for the sex of children in childbearing decisions. The estimated lower bounds on the prevalence of sex preferences correspondingly rise as the parity progression ratio difference gets further away from zero. In Mali, there is little difference in parity progression ratios, and estimates indicate that at most 82 percent of couples value the sex of children. In Namibia, China, and Nepal, parity progression ratio differences are greater, and at least 87 percent of couples value the sex of children. Together, panels (a) and (b) demonstrate that the model-based approach identifies greater son preference and more widespread sex preferences in many of the same countries signaled by parity progression ratios.
Panel (c) demonstrates the limitation of using parity progression ratios to infer sex preferences. This panel presents estimated bounds on the share of couples that prefer less than four times as many sons as daughters and vice versa (a generous definition of balance preference), sorted by a different parity progression measure: the share of women with two children of the same sex who have a third child, minus the share of women with one son and one daughter who have a third child. Positive values for this difference between parity progression ratios are commonly interpreted as a signal of balance preference (Ben-Porath and Welch 1976, Angrist and Evans 1998). For example, 63 percent of women in the United States with two sons or two daughters have a third child, while only 59 percent of women with one of each have a third child. However, the model-based approach estimates that at most 42 percent of couples in the United States prefer a balance of sons and daughters. In 44 other countries, parity progression ratios similarly suggest balance preference, yet less than one-half of couples are estimated to have balance preference. A plurality of couples in these countries may prefer balance, but these findings suggest that a majority do not. Even though the estimated bounds using a model-based approach are wide, this comparison demonstrates that they can provide a more precise description of underlying preferences than do parity progression ratios.
Finally, panel (d) of Figure 4 repeats the comparison in panel (i) of Figure 3 between estimated and reported intensity of sex preferences, calculated as the magnitude of the difference between ideal share of children that are sons and one-half, |r*–½|. Average reported intensity varies from 0.045 in Tajikistan to 0.16 in Brazil, but the lower bound on the estimated average intensity is greater than 0.17 in every country. This comparison suggests that women in every country underreport the intensity of sex preferences on average.
Figure 5 demonstrates that estimated bounds on preferences generally widen for more diverse populations. Panel (a) compares the width of estimated bounds on average ideal share of children that are sons to population size in each country (World Bank 2018). The scatterplot suggests little relationship, and the slope of the regression line is not statistically different from zero. However, panel (b) indicates a positive and statistically significant relationship between the width of bounds and ethnic diversity, measured using Alesina et al.’s (2003) index of ethnic fractionalization, which records the likelihood that two randomly-selected people belong to different ethnic groups. For example, South Korea and Uganda both have large populations (47 million people and 24 million people in 2000). South Korea is the least ethnically diverse country in the dataset, Uganda is the most diverse, and estimated bounds on average sex preference have a width of 0.06 in South Korea and 0.52 in Uganda. The width of the estimated bounds is greater than 0.4 in 10 countries. Nine of these countries, all located in Africa, have an index of ethnic fractionalization greater than 0.6; the tenth, Yemen, does not have an index of fractionalization available, and is excluded from panel (b). As presented in Figures 1 through 3, estimated bounds on preferences are generally widest in Africa. The relationship in Figure 5 suggests that population diversity, rather than size, contributes to the width of these bounds.
Figure 5: Width of Estimated Bounds on Preferences by Country.
Notes: This figure presents country-level scatterplots of the widths of estimated bounds on the average ideal share of children that are sons, by population in panel (a), and by ethnic diversity in panel (b). The straight line in each panel is a regression line. See section 5.4. Data sources: Birth history surveys, described in section 4 and section 5.4; Population from World Bank (2018); Ethnic fractionalization from Alesina et al. (2003).
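The fractionalization measure used in panel (b) is described above as the likelihood that two randomly-selected people belong to different ethnic groups. A minimal sketch of that calculation follows, assuming the usual form of one minus the sum of squared group shares. The function name and the example populations are hypothetical and are not data from this paper.

```python
def fractionalization(shares):
    """Likelihood that two randomly selected people belong to different
    groups, given each group's population share: 1 - sum(s_i ** 2)."""
    total = sum(shares)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("group shares must sum to one")
    return 1.0 - sum(s * s for s in shares)

# Hypothetical populations, for illustration only.
homogeneous = fractionalization([0.99, 0.01])  # one dominant group
diverse = fractionalization([0.2] * 5)         # five equally sized groups

print(round(homogeneous, 4), round(diverse, 4))
```

A population dominated by one group yields an index near zero, while a population split evenly across many groups yields an index near one.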
5.5. Sex Preferences and Aggregate Fertility Levels
Sex preferences can inflate fertility. For example, a couple whose ideal family composition is two sons may continue having children if the first two children are daughters. Mutharayappa et al. (1997) and Bhat and Zavier (2003) propose that weakening sex preferences can therefore decrease aggregate fertility by giving parents less reason to have many children. On the other hand, as noted by Freedman and Coombs (1974), the couple could also stop early upon reaching a particularly desirable combination of sons and daughters. If the couple has a first-born son, it may stop short of its ideal family size of two children, unwilling to risk having a daughter. In this case, weakening sex preferences would lead the couple to have a second child, raising fertility. Whether changes in sex preferences cause a decrease, increase, or no change in aggregate fertility is therefore an empirical question.
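The two opposing effects can be illustrated with simple stopping rules. The sketch below is an illustration only and is not the model in section 3: it assumes that each birth is a son with likelihood 0.5, that couples have perfect control over conception, and that the couple seeking two sons will have at most four children (a hypothetical cap). The function name is likewise hypothetical.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)  # assumed likelihood that each birth is a son

def expected_children(stop, max_children):
    """Expected family size for a couple that has children one at a time
    until stop(sons, daughters) is true or max_children is reached."""
    total = Fraction(0)
    # Enumerate every equally likely sex sequence of full length and
    # find where the couple would stop along that sequence.
    for seq in product("BG", repeat=max_children):
        sons = daughters = n = 0
        while n < max_children and not stop(sons, daughters):
            if seq[n] == "B":
                sons += 1
            else:
                daughters += 1
            n += 1
        total += HALF ** max_children * n
    return total

# Benchmark: two children regardless of sex.
no_preference = expected_children(lambda s, d: s + d >= 2, 2)
# Sex preference inflates fertility: continue until two sons (cap of four).
seeks_two_sons = expected_children(lambda s, d: s >= 2, 4)
# Sex preference lowers fertility: stop early after a first-born son.
stops_at_son = expected_children(lambda s, d: s >= 1, 2)

print(no_preference, seeks_two_sons, stops_at_son)  # 2 13/4 3/2
```

Relative to the benchmark of two children, the first rule raises expected fertility and the second lowers it, so the net effect in a mixed population depends on how common each type of couple is.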
I estimate the expected change in aggregate fertility under counterfactual scenarios in which preferences change: all couples prefer a balance of sons and daughters, all couples care only about their number of children, or all couples want one less child. In each case, the couples’ other preferences remain unchanged. I estimate these bounds using alternative specifications of F in equation 3. For example, to calculate the expected change in children per couple if all couples prefer a balance of sons and daughters, I set each element of F equal to the expected change in fertility if a couple with the corresponding combination of preferences retains the same family size preference and strength of sex preference but prefers half of children to be sons.
In the first two counterfactual scenarios in panels (a) and (b) of Figure 6, the average number of children per couple would change by magnitudes of at most 0.25 and 0.17. These estimates suggest that, although encouraging couples to have balanced or weakened sex preferences could substantially raise or lower fertility for many individual couples, neither change would dramatically affect aggregate fertility. Panel (c) of Figure 6 indicates that, on all continents, fertility would fall by between 0.53 and 0.70 children per couple if all couples want one less child than before. The magnitude of the decline is less than one because some couples already want zero children. Imperfect control over conception also tempers the decline: couples who want fewer children spend more childbearing periods trying not to get pregnant, but during these periods they still run a risk of conception. Together, these estimates suggest that sex preferences alone do not shape aggregate fertility levels. Factors that influence the number of children that couples want to have, such as child mortality rates and economic opportunities for women, may offer more effective policy levers for reducing fertility.
Figure 6: Estimated Change in Children per Couple as Preferences Change.
Notes: The dark rectangles provide bounds on the estimated change in overall fertility levels as preferences change. The light rectangles provide 95-percent confidence intervals around the unknown true value. The coefficients of determination are given in the notes to Figure 1. See section 5.5. Data source: Birth history surveys, described in section 4.
6. Conclusion
This paper answers the following question: what information about sex preferences can be inferred from the distribution of sequences of sons and daughters in completed families? Previous research has found evidence that parents tend to want sons in several countries, particularly in Asia. However, little is known about heterogeneity in the sex preferences that may motivate childbearing decisions around the world. In this paper, I introduce a model-based approach for inferring sex preferences from widely-available birth history records.
Starting with a model of childbearing allows for nuance in measured sex preferences. There is no single “son preference,” nor is there only a binary of son preference versus daughter preference. A couple may prefer one of many possible combinations of sons and daughters, just as a couple may prefer one of many possible numbers of total children. Couples can also differ on how strongly the sex of previous children influences the decision to have additional children. I model childbearing decisions using a simple utility specification with quadratic loss functions over the share of children that are sons and the total number of children.
Collinearity of preferences presents an empirical challenge: because different combinations of preferences can lead a couple to make the same childbearing decisions, it is generally not possible to identify a single set of preferences that best explains observed childbearing. I use linear programming to efficiently estimate bounds on preferences. By estimating bounds rather than requiring point estimates, this paper ties the study of sex preferences into a growing literature on partial identification in economics, reviewed by Tamer (2010).
Empirical estimates indicate that sex preferences influence the decision to have additional children for more than half of couples in Africa, Asia, and the Americas. At least 24 percent of couples prefer sons, and at least 18 percent prefer daughters. Although the exact lower bounds vary across alternative model and sample specifications, they are consistently greater than zero. Sex preferences are more widespread and heterogeneous than previously established. Estimates also suggest that women tend to overreport wanting a balance of sons and daughters. In Africa, Asia, and the Americas, the share of women that report wanting a balance of sons and daughters is greater than the upper bound estimated using observed childbearing. Finally, although preferences are widespread, they do not substantially drive aggregate fertility levels. Estimated aggregate fertility would change by less than 0.2 children per couple if all couples no longer valued the sex of their children.
These findings suggest three areas for further research. First, all estimates in this paper are calculated starting from the same initial, broad set of possible preferences that have substantial collinearity. Ruling out certain preferences ahead of time could limit collinearity among the remaining preferences and narrow the estimated bounds. However, which preferences, if any, can reasonably be omitted varies by context. For example, in the 2001 Demographic and Health Survey in Nepal, less than 0.2 percent of women report that they want only daughters. In the 2000 Demographic and Health Survey in Colombia, more than 12 percent of women report that they want only daughters. Ruling out complete daughter preference, r*=0, may be appropriate in Nepal but not in Colombia.
Second, computing capacity shapes the specification of equations 1 through 3. Solving equation 1 using backward induction is computationally intensive. Because equations 2 and 3 involve large matrices and must be solved several thousand times to determine confidence intervals, they are structured to be solved efficiently using linear programming. Additional computing resources would allow for more flexibility in the empirical approach described in section 3. A more nuanced model could permit preferences to change over time, allow couples to receive utility over time rather than all at once in period T+1, or permit variation across couples in the steepness of the utility penalty that comes from not reaching ideal family composition. Larger matrices would accommodate more childbearing periods or consideration of spacing between births. Non-linear optimization of equations 2 and 3 would allow greater exploration of the correlation between preferences.
Third, heterogeneity in sex preferences within a population implies heterogeneity in the underlying tastes, incentives, and constraints that shape sex preferences. Differential treatment of boys and girls is widespread, as are disparities in opportunities available to boys and girls (Arnold 1997). Parents jointly determine the quantity and quality of their children (Becker 1960), and differences in the returns to investment in quality of boys versus girls may in turn influence the number of sons and daughters that parents want. There remain important avenues of research into the relationship between preferences for sons and daughters and sex differences in factors, such as returns to schooling, that may lead parents to treat boys and girls differently.
Acknowledgements
Thank you to Barbara Anderson, Martha Bailey, Jacob Bastian, Hoyt Bleakley, Charlie Brown, Eric Chyn, Aaron Flaaen, Jeremy Fox, Andrew Goodman-Bacon, Morgan Henderson, Evan Herrnstadt, Joshua Hyman, Max Kapustin, Justin Ladner, David Lam, Jeff Smith, Bryan Stuart, Caroline Theoharides, James Wang, various seminar participants, and two anonymous referees.
Funding sources
This work was supported by the NICHD (T32 HD007339) as part of the University of Michigan Population Studies Center training program. I gratefully acknowledge the use of the services and facilities of the Population Studies Center (funded by NICHD Center Grant R24 HD041028). This work was also supported by the University of Michigan Institute for Teaching and Research in Economics. The opinions and conclusions expressed herein are solely mine and do not represent the opinions or policy of these funders or any agency of the federal government.
Appendix A: Parity Progression in Simulated Populations
Standard techniques reliably signal sex preferences in populations in which all parents have the same preference. For example, consider a simulated population with son preference in which all couples stop after their first son and have up to two children. This population is represented in column 1 of Table A.1. As given in panel (a), every couple in this population with one son stops childbearing, while every couple with one daughter has a second child. No couple has a third child. Because the likelihood of having a second child depends on the sex of the first child, the distribution of sequences of children in completed families is not even. Panel (b) presents this distribution assuming that each birth is a boy with likelihood 0.5. All one-child families have only sons, and all two-child families have either a daughter and a son or two daughters.
Heterogeneity can mask the presence of sex preferences. For example, consider a mixed population in which half of couples have up to two children in order to have a son, and the other half have up to two children in order to have a daughter. As given in column 3, parity progression ratios in this population are even across couples with the same number of children: couples are equally likely to have a second child after having a son as after having a daughter, and couples are equally likely to have a third child no matter the sex composition of the first two children. The distribution of children across completed families is also even: half of one-child families have sons and half have daughters, and couples are equally likely to have each of the four possible sequences of two children. This population is identical to a population in which half of couples have one child, half have two children, and the sex of children does not matter.
Although heterogeneity can mask the presence of sex preferences, this masking is not necessarily complete. For example, consider an alternative mixed population in which 60 percent of couples have up to two children in order to have a son, and the remaining 40 percent have up to three children in order to have a daughter. Couples in this mixed population vary not just in their preference for sons or daughters but also in their willingness to have a third child. As given in column 5, parity progression ratios provide mixed signals: couples are more likely to have a second child after having a daughter than after having a son, suggesting son preference, but only couples with two sons have a third child, suggesting daughter preference. The distribution of children across completed families is correspondingly uneven. As presented in Table 1, actual populations can have similarly mixed parity progression ratios, suggesting variety in preferences over the sex and number of children.
Table A.1:
Parity Progression in Simulated Populations
| | | | Population 1 | Population 2 | Population 3 | Population 4 | Population 5 |
|---|---|---|---|---|---|---|---|
| | | | All couples stop after 1st son and have up to 2 children | All couples stop after 1st daughter and have up to 2 children | 50% of couples from population 1, 50% from population 2 | All couples stop after 1st daughter and have up to 3 children | 60% of couples from population 1, 40% from population 4 |
| (a) Parity progression ratios | | | | | | | |
| 1st child | 2nd child | | | | | | |
| Boy | – | | 0 | 1 | 0.5 | 1 | 0.4 |
| Girl | – | | 1 | 0 | 0.5 | 0 | 0.6 |
| Boy | Boy | | – | 0 | 0 | 1 | 1 |
| Boy | Girl | | – | 0 | 0 | 0 | 0 |
| Girl | Boy | | 0 | – | 0 | – | 0 |
| Girl | Girl | | 0 | – | 0 | – | 0 |
| (b) Share of couples with each sequence of children | | | | | | | |
| 1st child | 2nd child | 3rd child | | | | | |
| Boy | – | – | 0.5 | – | 0.25 | – | 0.3 |
| Girl | – | – | – | 0.5 | 0.25 | 0.5 | 0.2 |
| Boy | Boy | – | – | 0.25 | 0.125 | – | – |
| Boy | Girl | – | – | 0.25 | 0.125 | 0.25 | 0.1 |
| Girl | Boy | – | 0.25 | – | 0.125 | – | 0.15 |
| Girl | Girl | – | 0.25 | – | 0.125 | – | 0.15 |
| Boy | Boy | Boy | – | – | – | 0.125 | 0.05 |
| Boy | Boy | Girl | – | – | – | 0.125 | 0.05 |
| Boy | Girl | Boy | – | – | – | – | – |
| Boy | Girl | Girl | – | – | – | – | – |
| Girl | Boy | Boy | – | – | – | – | – |
| Girl | Boy | Girl | – | – | – | – | – |
| Girl | Girl | Boy | – | – | – | – | – |
| Girl | Girl | Girl | – | – | – | – | – |
Notes: Among parents that start with each given sequence of sons and daughters, parity progression ratios record the share that have at least one more child. The distribution of sequences of children in completed families is calculated assuming a likelihood of one-half that each birth is a boy. See appendix A.
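The simulated populations in Table A.1 can be reproduced with a short enumeration. The sketch below follows the assumptions stated in this appendix (each birth is a boy with likelihood one-half, and couples follow the stated stopping rules); the function and variable names are hypothetical, and the code is an illustration and not the estimation procedure used in the paper.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)  # likelihood that each birth is a boy

def completed_families(stop_sex, max_children):
    """Distribution of completed sequences for couples that stop after the
    first child of sex stop_sex ('B' or 'G') or at max_children. Use
    stop_sex=None for couples that ignore sex and have max_children."""
    dist = {}
    for seq in product("BG", repeat=max_children):
        # Truncate each full-length sequence at the first desired child.
        cut = seq.index(stop_sex) + 1 if stop_sex in seq else max_children
        key = "".join(seq[:cut])
        dist[key] = dist.get(key, Fraction(0)) + HALF ** max_children
    return dist

def mix(*parts):
    """Mixture of populations, given (weight, distribution) pairs."""
    out = {}
    for weight, dist in parts:
        for seq, share in dist.items():
            out[seq] = out.get(seq, Fraction(0)) + weight * share
    return out

def progression_ratio(dist, prefix):
    """Share of couples starting with prefix that have another child."""
    reach = sum(s for seq, s in dist.items() if seq.startswith(prefix))
    go_on = sum(s for seq, s in dist.items()
                if seq.startswith(prefix) and len(seq) > len(prefix))
    return go_on / reach if reach else None

pop1 = completed_families("B", 2)  # stop after 1st son, up to 2 children
pop2 = completed_families("G", 2)  # stop after 1st daughter, up to 2
pop4 = completed_families("G", 3)  # stop after 1st daughter, up to 3
pop3 = mix((HALF, pop1), (HALF, pop2))
pop5 = mix((Fraction(3, 5), pop1), (Fraction(2, 5), pop4))

# Population 3 matches a population in which sex does not matter.
no_pref = mix((HALF, completed_families(None, 1)),
              (HALF, completed_families(None, 2)))
print(pop3 == no_pref)                                            # True
print(progression_ratio(pop5, "B"), progression_ratio(pop5, "G"))  # 2/5 3/5
```

The equality check restates the masking result for population 3, and the two ratios for population 5 match the mixed signals reported in column 5 of panel (a).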
Appendix B: Correspondence between Bliss Point Utility Model and Stopping Rules
Table B.1 demonstrates that the bliss point utility model introduced in section 3.1 rationalizes several standard stopping rules. Each column represents a combination of preferences and stopping rule, and each row reports the likelihood that a couple with those preferences or following that rule has the indicated sequence of children. These distributions are calculated assuming that each child is a son with likelihood 0.5, couples have perfect control over conception, couples in column 3 have three childbearing periods, and all other couples have two childbearing periods. For example, column 1 represents a couple that faces two childbearing periods, prefers all children to be sons, prefers to have two total children, and only values the sex of children. This couple has a son with likelihood 0.5, a daughter and then a son with likelihood 0.25, and two daughters with likelihood 0.25. A couple that has one child, and then a second child only if the first is a daughter, has the same expected distribution of sequences of children.
Table B.1:
Correspondence between Bliss Point Utility Model and Stopping Rules
| | | | Son preference | Daughter preference | Balance preference | No preference |
|---|---|---|---|---|---|---|
| Bliss point preferences | | | | | | |
| Childbearing periods, T | | | 2 | 2 | 3 | 2 |
| Ideal share of children that are sons, r* | | | 1 | 0 | 0.5 | 0 |
| Ideal number of children, c* | | | 2 | 2 | 2 | 2 |
| Relative importance of the sex of children, α | | | 1 | 1 | 1 | 0 |
| Stopping rule | | | | | | |
| Minimum number of children | | | 1 | 1 | 1 | 2 |
| Maximum number of children | | | 2 | 2 | 3 | 2 |
| Stop after | | | 1st son | 1st daughter | 1 son and 1 daughter | |
| Share of couples with each sequence of children | | | | | | |
| 1st child | 2nd child | 3rd child | | | | |
| Boy | – | – | 0.5 | – | – | – |
| Girl | – | – | – | 0.5 | – | – |
| Boy | Boy | – | – | 0.25 | – | 0.25 |
| Boy | Girl | – | – | 0.25 | 0.25 | 0.25 |
| Girl | Boy | – | 0.25 | – | 0.25 | 0.25 |
| Girl | Girl | – | 0.25 | – | – | 0.25 |
| Boy | Boy | Boy | – | – | 0.125 | – |
| Boy | Boy | Girl | – | – | 0.125 | – |
| Boy | Girl | Boy | – | – | – | – |
| Boy | Girl | Girl | – | – | – | – |
| Girl | Boy | Boy | – | – | – | – |
| Girl | Boy | Girl | – | – | – | – |
| Girl | Girl | Boy | – | – | 0.125 | – |
| Girl | Girl | Girl | – | – | 0.125 | – |
Notes: This table demonstrates that the bliss point model in section 3.1 rationalizes several common stopping rules. The predicted distributions of sequences of children for couples following several specifications of the bliss point model match the predicted distributions for several common stopping rules. See section B.
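The stopping-rule half of Table B.1 can be checked by direct enumeration. The Python sketch below is illustrative only: the function name `stopping_rule_distribution` and the encoding of sequences as strings of "B" (son) and "G" (daughter) are assumptions introduced here, and the code enumerates the stopping rules rather than solving the bliss point model of section 3.1. As in the table, it assumes that each child is a son with likelihood 0.5 and that couples have perfect control over conception.

```python
from fractions import Fraction


def stopping_rule_distribution(min_children, max_children, stop, p_boy=Fraction(1, 2)):
    """Return {sequence: likelihood} for a couple that follows a stopping rule.

    Sequences are strings of "B" (son) and "G" (daughter). `stop` takes the
    sequence so far and returns True once the couple's target has been met.
    """
    distribution = {}

    def extend(sequence, likelihood):
        n = len(sequence)
        # Stop at the maximum, or once the minimum is reached and the target is met.
        done = n >= max_children or (n >= min_children and stop(sequence))
        if done:
            distribution[sequence] = distribution.get(sequence, 0) + likelihood
            return
        extend(sequence + "B", likelihood * p_boy)
        extend(sequence + "G", likelihood * (1 - p_boy))

    extend("", Fraction(1))
    return distribution


# The four stopping rules listed in Table B.1.
son_rule = stopping_rule_distribution(1, 2, lambda s: "B" in s)
daughter_rule = stopping_rule_distribution(1, 2, lambda s: "G" in s)
balance_rule = stopping_rule_distribution(1, 3, lambda s: "B" in s and "G" in s)
no_preference = stopping_rule_distribution(2, 2, lambda s: True)

for name, dist in [("son", son_rule), ("daughter", daughter_rule),
                   ("balance", balance_rule), ("none", no_preference)]:
    print(name, {seq: float(p) for seq, p in sorted(dist.items())})
```

Exact fractions are used so that the enumerated shares can be compared with the table entries without rounding error.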
Appendix C: Collapse of Childbearing Periods into Sequence of Sons and Daughters
Table C.1 demonstrates how, in section 3.2, childbearing periods are collapsed into a sequence of sons and daughters, omitting periods without a birth. Panel (a) presents the likelihood that a couple has each possible sequence of childbearing periods with no child, a son, or a daughter, given that the couple wants one child regardless of its sex (α=0, c*=1, r* does not matter), faces two childbearing periods (T=2), has perfect control over conception when trying to get pregnant (p=1), and can accidentally conceive when not trying to get pregnant (0<q<1). Each child is a boy with likelihood l. The couple’s childbearing decisions are governed by equation 1. The couple maximizes its utility by not trying to get pregnant in the first period, and then by trying to get pregnant in the second period only if the couple did not accidentally have a child in the first period. Panel (b) collapses these likelihoods by each unique sequence of sons and daughters.
Table C.1:
Collapse of Childbearing Periods into Sequence of Sons and Daughters
| (a) Original | | | | (b) Collapsed by sequence of sons and daughters | | |
|---|---|---|---|---|---|---|
| 1st period | 2nd period | Likelihood | | 1st child | 2nd child | Likelihood |
| No child | No child | 0 | → | – | – | 0 |
| No child | Boy | (1–q)l | → | Boy | – | (1–q)l + ql(1–q) |
| Boy | No child | ql(1–q) | | | | |
| No child | Girl | (1–q)(1–l) | → | Girl | – | (1–q)(1–l) + q(1–l)(1–q) |
| Girl | No child | q(1–l)(1–q) | | | | |
| Boy | Boy | qlql | → | Boy | Boy | qlql |
| Boy | Girl | qlq(1–l) | → | Boy | Girl | qlq(1–l) |
| Girl | Boy | q(1–l)ql | → | Girl | Boy | q(1–l)ql |
| Girl | Girl | q(1–l)q(1–l) | → | Girl | Girl | q(1–l)q(1–l) |
Notes: The model introduced in Section 3.1 yields a predicted distribution of sequences of periods in which a son, daughter, or no child is born. This table provides an example of how each predicted distribution is collapsed to just the sequence of sons and daughters, assuming two childbearing periods and perfect control over conception when trying to get pregnant. See section C.
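The collapse in Table C.1 can be reproduced numerically. The following Python sketch is a minimal illustration, not the paper's implementation: the function name `collapse_periods` and the outcome labels are assumptions, and the couple's behavior (not trying in the first period, then trying in the second period only if no child has been born) is hard-coded from the description in this appendix rather than derived from equation 1.

```python
from collections import defaultdict
from itertools import product


def collapse_periods(q, l, p=1.0):
    """Return (original, collapsed) distributions for the two-period example.

    q: likelihood of conceiving when not trying
    p: likelihood of conceiving when trying
    l: likelihood that a child is a boy
    """
    outcomes = ("none", "boy", "girl")

    def period_likelihood(outcome, conceive):
        if outcome == "none":
            return 1 - conceive
        return conceive * (l if outcome == "boy" else 1 - l)

    original = {}
    for first, second in product(outcomes, repeat=2):
        # Period 1: the couple is not trying, so conception has likelihood q.
        # Period 2: the couple tries (p) only if no child was born in period 1.
        conceive_second = p if first == "none" else q
        original[(first, second)] = (period_likelihood(first, q)
                                     * period_likelihood(second, conceive_second))

    # Drop periods without a birth and add up likelihoods by sequence of children.
    collapsed = defaultdict(float)
    for periods, likelihood in original.items():
        children = tuple(o for o in periods if o != "none")
        collapsed[children] += likelihood
    return original, dict(collapsed)


q, l = 0.1, 0.5  # illustrative values only
original, collapsed = collapse_periods(q, l)
for children, likelihood in sorted(collapsed.items()):
    print(children, round(likelihood, 6))
```

With these illustrative values, the likelihood of a single boy equals (1–q)l + ql(1–q), the sum of the two rows of panel (a) that collapse to the same sequence.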
Appendix D: Collinearity of Preferences
This section demonstrates that, because of collinearity, there may be many combinations of preferences that each explains observed childbearing equally well. For example, let there be two childbearing periods (T=2), perfect control over conception (p=1, q=0), and an even likelihood that each child is a son (l=0.5). Table D.1 presents A, in which each element a_mn is the likelihood that a couple with combination of preferences n has sequence of children m. Consider a large population in which half of couples have a single son, one-quarter have a daughter and then a son, and one-quarter have two daughters. Table D.1 also presents D, this observed distribution of sequences of children.
If all couples have the final combination of preferences listed in A (r*=1, c*=2, α=1), then half of couples will have one son, one-quarter will have a daughter and then a son, and one-quarter will have two daughters. This distribution exactly matches the actual observed distribution of sequences of children, D. However, if all couples have the next-to-last combination of preferences listed in A (r*=1, c*=2, α=0.9), the resulting population will also exactly match the observed population. Figure D.1 describes all combinations of preferences that each exactly match the observed population. The dots in panels (b) through (f) each represent a population in which all couples have the same combination of preferences. In each of these populations, the expected distribution of sequences of children exactly matches the observed population. For example, the combination of preferences listed in the final column of A in Table D.1 (r*=1, c*=2, α=1) exactly matches the observed population D and is given by the dot in the upper-right corner of panel (f).
Each vertex of the polyhedron in panel (g) corresponds to a dot in panels (b) through (f) and is a unique combination of preferences that, in expectation, exactly matches the observed population. Any population that consists of mixtures of couples with any of these unique combinations of preferences also exactly matches the observed population, so the polyhedron in panel (g) is a convex set of all possible combinations of average values of sex preference, family size preference, and strength of sex preference that exactly match the observed population. Each of the polygons in panels (c) through (f) provides a slice of the polyhedron in panel (g), holding average sex preference constant.
This example is a specific case when at least one combination of preferences has a predicted distribution of sequences of children that exactly matches the observed distribution. In the more general case when no combination of preferences exactly predicts the observed population, it is not possible to create a polyhedron formed by vertices that represent single combinations of preferences. The edges of the convex hull may be determined by an infinite number of mixtures of combinations of preferences. Section 3.2 instead develops a two-stage procedure for calculating bounds on summary measures of preferences without constructing the entire convex hull.
Table D.1:
Collinearity of Preferences
Notes: Each column of A represents a strategy defined by a unique combination of preferences r*, c*, and α. Each row of A provides the likelihood that a couple following that strategy has the corresponding sequence of children. Because of collinearity, several of these strategies (such as those in the final two columns of A) exactly match the observed distribution, D. See section D.
Figure D.1: Combinations of Strategies that Best Explain an Observed Population.

Notes: This figure depicts all combinations of strategies from A in Table D.1 that exactly predict the observed population D in Table D.1. The two-dimensional figures in panels (a) through (f) depict combinations of family size preference and strength of sex preference, holding sex preference constant. The three-dimensional figure in panel (g) depicts all combinations of average values of the three preferences. See appendix D.
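The collinearity described in this appendix can be illustrated with the two combinations of preferences that the text identifies as matching D exactly (r*=1, c*=2, with α=0.9 or α=1). The Python sketch below is a hedged illustration: the names `COLUMNS` and `mixture` are introduced here, and only these two columns of A are included because the remaining columns are not reproduced in the text.

```python
from fractions import Fraction as F

# Sequences of children with up to two births ("B" = son, "G" = daughter).
SEQUENCES = ["B", "G", "BB", "BG", "GB", "GG"]

# Observed distribution D: half have one son, one-quarter a daughter and
# then a son, one-quarter two daughters.
D = [F(1, 2), F(0), F(0), F(0), F(1, 4), F(1, 4)]

# Two columns of A, keyed by alpha. Both predict the same distribution,
# as stated in the text for (r*=1, c*=2, alpha=0.9) and (r*=1, c*=2, alpha=1).
COLUMNS = {
    F(9, 10): [F(1, 2), F(0), F(0), F(0), F(1, 4), F(1, 4)],
    F(1): [F(1, 2), F(0), F(0), F(0), F(1, 4), F(1, 4)],
}


def mixture(share_alpha_one):
    """Predicted distribution and average alpha for a mixed population."""
    weights = {F(1): share_alpha_one, F(9, 10): 1 - share_alpha_one}
    predicted = [sum(weights[a] * COLUMNS[a][m] for a in COLUMNS)
                 for m in range(len(SEQUENCES))]
    average_alpha = sum(weights[a] * a for a in COLUMNS)
    return predicted, average_alpha


for share in (F(0), F(1, 3), F(1)):
    predicted, average_alpha = mixture(share)
    print(float(share), predicted == D, float(average_alpha))
```

Every mixture reproduces D, so in this restricted example the data alone cannot distinguish an average α of 0.9 from an average of 1 or any value in between.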
Appendix E: Birth History Surveys and Restrictions for Final Sample
Table E.1 lists birth history surveys by country. Sources for the surveys are as follows: China Two-per-thousand Fertility Survey (China Fertility Survey 1988), Demographic and Health Survey (ICF International 1985–2017), RAND Family Life Surveys (Frankenberg and Karoly 1995, Rahman et al. 1999, Frankenberg and Thomas 2000, Strauss et al. 2004, Strauss et al. 2009, Strauss et al. 2014), India Rural Economic and Demographic Survey (National Council of Applied Economic Research 1982), Japanese General Social Survey (Institute of Regional Studies at Osaka University of Commerce 2000–2010), United States Integrated Fertility Survey Series (Smock et al. 1955–2002), Multiple Indicator Cluster Survey (UNICEF 2006–2014), and World Fertility Survey (International Statistics Institute 1974–1981).
The China Fertility Survey is a 10 percent sample of the 1988 “Two-per-thousand” survey and is missing records from Hebei Province, Shanxi Province, and Inner Mongolia Autonomous Region. Of the Family Life Surveys compiled by the RAND Corporation, I use a cross-sectional survey conducted in 1996 in the Matlab region of Bangladesh, and a panel survey conducted between 1993 and 2014 in Indonesia. All other surveys from all sources are cross-sectional. The Japanese General Social Surveys are designed and carried out at the Institute of Regional Studies at Osaka University of Commerce in collaboration with the Institute of Social Science at the University of Tokyo under the direction of Ichiro Tanioka, Michio Nitta, Hiroki Sato and Noriko Iwai with Project Manager, Minae Osawa. The project is financially assisted by Gakujutsu Frontier Grant from the Japanese Ministry of Education, Culture, Sports, Science and Technology for 1999–2003 academic years, and the datasets are compiled with cooperation from the SSJ Data Archive, Information Center for Social Science Research on Japan, Institute of Social Science, University of Tokyo.
Table E.2 replicates panel (b) of Table 2 for each restriction that yields the final sample of birth history surveys: women aged 40 and older who provide a complete birth history, who do not have twins or another multiple birth, who did not reach age 40 after the final year in which the sex ratio at birth was at the natural level in their country, who have been married, and who have eight or fewer children. There is no consistent pattern across women included in and excluded from the final sample. Some restrictions exclude women who are older, have more years of schooling, are more likely to live in an urban area, or who have more children. Other restrictions do the reverse.
Table E.1:
Birth History Surveys
| Country | Surveys |
|---|---|
| Africa | |
| Angola | DHS(2011) |
| Benin | DHS(1996,2001,2006,2011) WFS(1981) |
| Burkina Faso | DHS(1993,1998,2003,2010) |
| Burundi | DHS(1987,2010) |
| Cameroon | DHS(1991,1998,2004,2011) WFS(1978) |
| Central African Rep. | DHS(1994) |
| Chad | DHS(1996,2004,2014) |
| Comoros | DHS(1996,2012) |
| Congo | DHS(2005,2011) |
| Congo, Dem. Rep. | DHS(2007,2013) |
| Cote d’Ivoire | DHS(1994,1998,2011) WFS(1980) |
| Egypt | DHS(1988,1992,1995,2000,2003,2005,2008,2014) WFS(1980) |
| Ethiopia | DHS(2000,2005,2011) |
| Gabon | DHS(2000,2012) |
| Ghana | DHS(1988,1993,1998,2003,2008,2014) MICS(2011) WFS(1979) |
| Guinea | DHS(1999,2005,2012) |
| Kenya | DHS(1989,1993,1998,2003,2008,2014) MICS(2009,2011) WFS(1977) |
| Lesotho | DHS(2004,2009,2014) WFS(1977) |
| Liberia | DHS(1986,2007,2009,2013) |
| Madagascar | DHS(1992,1997,2003,2008) MICS(2012) |
| Malawi | DHS(1992,2000,2004,2010,2015) MICS(2006,2013) |
| Mali | DHS(1987,1995,2001,2006,2012) |
| Mauritania | MICS(2011) WFS(1981) |
| Morocco | DHS(1987,1992,2003) WFS(1980) |
| Mozambique | DHS(1997,2003,2011) |
| Namibia | DHS(1992,2000,2006,2013) |
| Niger | DHS(1992,1998,2006,2012) |
| Nigeria | DHS(1990,2003,2008,2010,2013) |
| Rwanda | DHS(1992,2000,2005,2007,2010,2014) |
| Sao Tome & Principe | DHS(2008) |
| Senegal | DHS(1986,1992,1997,2005,2008,2010,2012,2015) WFS(1978) |
| Sierra Leone | DHS(2008,2013) |
| Somalia | MICS(2006,2011) |
| South Africa | DHS(1998) |
| Sudan | DHS(1990) MICS(2010) WFS(1978) |
| Swaziland | DHS(2006) MICS(2010) |
| Tanzania | DHS(1991,1996,1999,2004,2010,2015) |
| Togo | DHS(1988,1998,2013) |
| Tunisia | DHS(1988) MICS(2011) WFS(1978) |
| Uganda | DHS(1988,1995,2000,2006,2009,2011) |
| Zambia | DHS(1992,1996,2001,2007,2013) |
| Zimbabwe | DHS(1988,1994,1999,2005,2010,2015) MICS(2009,2014) |
| Asia | |
| Afghanistan | DHS(2015) |
| Armenia | DHS(2000,2005,2010) |
| Azerbaijan | DHS(2006) |
| Bangladesh | DHS(1993,1996,1999,2004,2007,2011,2014) FLS(1996) WFS(1975) |
| Cambodia | DHS(2000,2005,2010,2014) |
| China | CFS(1988) |
| India | DHS(1992,1998,2005) REDS(1982) |
| Indonesia | DHS(1987,1991,1994,1997,2002,2007,2012) FLS(1993) WFS(1976) |
| Iraq | MICS(2006,2011) |
| Japan | JGSS(2001,2002,2005,2006,2008,2010,2012) |
| Jordan | DHS(1990,1997,2002,2007,2009,2012) |
| Kazakhstan | DHS(1995,1999) |
| Korea | WFS(1974) |
| Kyrgyzstan | DHS(1997,2012) |
| Laos | MICS(2012) |
| Lebanon | MICS(2011) |
| Malaysia | WFS(1974) |
| Maldives | DHS(2009) |
| Nepal | DHS(1996,2001,2006,2011) MICS(2014) WFS(1976) |
| Pakistan | DHS(1990,2006,2012) WFS(1975) |
| Palestine | MICS(2010) |
| Philippines | DHS(1993,1998,2003,2008,2013) WFS(1978) |
| Sri Lanka | DHS(1987) WFS(1975) |
| Syria | WFS(1978) |
| Tajikistan | DHS(2012) |
| Thailand | DHS(1987) |
| Timor-Leste | DHS(2009) |
| Turkey | DHS(1993,1998,2003) WFS(1978) |
| Uzbekistan | DHS(1996) |
| Vietnam | DHS(1997,2002) |
| Yemen | DHS(1991) MICS(2006) WFS(1979) |
| North America | |
| Costa Rica | WFS(1976) |
| Dominican Rep. | DHS(1986,1991,1996,1999,2002,2007,2013) WFS(1975) |
| Guatemala | DHS(1987,1995,1998,2014) |
| Haiti | DHS(1994,2000,2005,2012) WFS(1977) |
| Honduras | DHS(2005,2011) |
| Jamaica | WFS(1975) |
| Mexico | DHS(1987) WFS(1976) |
| Nicaragua | DHS(1998,2001) |
| Panama | WFS(1975) |
| Trinidad & Tobago | DHS(1987) WFS(1977) |
| United States | IFSS(1970,1973,1976,1982,1988,1995,2002) |
| South America | |
| Bolivia | DHS(1989,1994,1998,2003,2008) |
| Brazil | DHS(1986,1991,1996) |
| Colombia | DHS(1986,1990,1995,2000,2005,2010,2015) WFS(1976) |
| Ecuador | DHS(1987) WFS(1979) |
| Guyana | DHS(2009) WFS(1975) |
| Paraguay | DHS(1990) WFS(1979) |
| Peru | DHS(1986,1991,1996,2000,2004,2009,2010,2011,2012) WFS(1977) |
| Venezuela | WFS(1977) |
Notes: See section E. Sources: China Fertility Survey (CFS), Demographic and Health Survey (DHS), India Rural Economic and Demographic Survey (REDS), Family Life Surveys (FLS), Japanese General Social Survey (JGSS), United States Integrated Fertility Survey Series (IFSS), Multiple Indicator Cluster Survey (MICS), and World Fertility Survey (WFS).
Table E.2:
Restrictions for Final Sample
| | Complete birth history | Birth history error | Difference |
|---|---|---|---|
| Women age 40+ | 894,366 | 1,317 | |
| Avg. age | 44.07 | 42.85 | 1.22 (0.26) |
| Avg. years of schooling | 4.72 | 9.78 | –5.06 (0.56) |
| Share live in urban area | 0.44 | 0.69 | –0.25 (0.07) |
| Avg. no. of children | 4.98 | 6.94 | –1.96 (0.69) |
| | No multiple births | Multiple birth | Difference |
| Women age 40+ | 848,384 | 47,299 | |
| Avg. age | 44.06 | 44.15 | –0.09 (0.02) |
| Avg. years of schooling | 4.80 | 3.41 | 1.39 (0.03) |
| Share live in urban area | 0.44 | 0.35 | 0.09 (0.003) |
| Avg. no. of children | 4.81 | 7.78 | –2.97 (0.02) |
| | Normal sex ratio | Distorted sex ratio | Difference |
| Women age 40+ | 782,051 | 113,632 | |
| Avg. age | 44.13 | 43.44 | 0.69 (0.02) |
| Avg. years of schooling | 4.78 | 4.07 | 0.72 (0.03) |
| Share live in urban area | 0.44 | 0.36 | 0.08 (0.003) |
| Avg. no. of children | 5.06 | 4.17 | 0.89 (0.01) |
| | Ever married | Never married | Difference |
| Women age 40+ | 870,007 | 25,676 | |
| Avg. age | 44.08 | 43.86 | 0.22 (0.02) |
| Avg. years of schooling | 4.64 | 6.76 | –2.12 (0.05) |
| Share live in urban area | 0.43 | 0.59 | –0.16 (0.004) |
| Avg. no. of children | 5.08 | 2.50 | 2.58 (0.03) |
| | ≤8 children | ≥9 children | Difference |
| Women age 40+ | 780,597 | 115,086 | |
| Avg. age | 43.98 | 44.62 | –0.64 (0.01) |
| Avg. years of schooling | 5.19 | 1.67 | 3.52 (0.01) |
| Share live in urban area | 0.47 | 0.22 | 0.24 (0.002) |
| Avg. no. of children | 4.19 | 10.15 | –5.96 (0.01) |
Notes: This table repeats panel (b) of Table 2 for each individual restriction that yields the final sample of birth history surveys. All p-values are less than 0.001. See appendix E and notes to Table 2. Data source: Birth history surveys, described in section 4.
Appendix F: Estimates in Africa by Women’s Birth Cohort
The estimated bounds on average values of preferences, presented in section 4.1, are wide, particularly for average sex preference. These wide bounds can obscure any changes in preferences over time. Figure F.1 presents estimated average values of preferences by women’s birth cohort in Africa. Among women born in the 1940s, average ideal share of children that are sons is between 0.19 and 0.81. The estimated bounds vary for later cohorts, but all intervals overlap, as do estimated bounds on strength of sex preference in panel (c). These overlapping bounds prevent identification of changes over time; the true unknown average values of these preferences may have increased, stayed the same, or decreased. Only when estimated bounds on preferences are narrow is it possible to identify changes over time. As given in panel (b), estimated ideal family size fell from nearly 5.4 children on average for the 1940s cohort to approximately 4.8 for the 1970s cohort.
Figure F.1: Estimates in Africa by Women’s Birth Cohort.
Notes: This figure presents estimated bounds on preferences in Africa by women’s birth cohort. Coefficients of determination (CD) are given in parentheses. See appendix F. Data source: Birth history surveys, described in section 4.
Appendix G: Sensitivity of Estimates in Asia to Model and Sample Specification
Figure G.1 demonstrates the sensitivity of estimated bounds in Asia to the specification of equation 1. The solid rectangles provide the identified set from Figure 1 calculated using the main specification, and the hollow rectangles provide the identified set under alternative specifications. Equation 1 assumes symmetric, quadratic loss around both bliss points. The first comparison in Figure G.1 demonstrates that the bounds vary substantially with other specifications of the loss function, narrowing with an absolute value loss and widening with higher-order power polynomial loss. The second comparison demonstrates that the bounds also vary substantially if loss comes from only undershooting or only overshooting each bliss point.
So that each term of equation 1 has a maximum value of one, each is divided by the maximum possible loss. The third comparison in Figure G.1 demonstrates that estimated bounds on average strength of sex preference increase and widen substantially if these scaling factors are removed. Equation 1 sets the sex preference term to equal one when a couple has no children and the share of children that are sons is undefined. Estimated bounds vary little if this term is instead set to zero. Equation 1 also assumes that, when indifferent between having and not having a child, a couple tries to conceive. Estimated bounds vary only slightly if the couple instead is assumed to not have a child.
The main estimates assume eight childbearing periods, a likelihood of 0.51 that each birth is a boy, a likelihood of 0.9 that a couple conceives when trying, and a likelihood of 0.1 that a couple conceives when not trying. As given in the sixth through ninth comparisons in Figure G.1, estimated bounds widen considerably if couples are assumed to have less chance of conceiving when trying, or if couples are assumed to have perfect ability to prevent conception. These comparisons suggest that couples face a small likelihood of infecundity and a small likelihood of accidentally becoming pregnant. Imperfect control over conception introduces some randomness into actual sequences of sons and daughters. A model that assumes perfect control over conception cannot account for this randomness and struggles to explain observed childbearing.
The final comparison in Figure G.1 presents bounds estimated using different discretizations of possible sex preference and the strength of sex preference. Smaller increments between possible preferences permit identification of finer differences in preferences across couples, but at a cost of increased scale and slower computation of equations 2 and 3. Estimated bounds on average sex preference narrow substantially moving from an increment of 0.50 to an increment of 0.25, but then remain similar with an increment of 0.10 or 0.05. Estimated bounds on average strength of sex preference rise slightly with smaller increments.
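The growth in scale with finer increments can be seen by counting candidate combinations of preferences. The Python sketch below is a rough illustration under stated assumptions: it supposes that ideal share of sons and strength of sex preference each lie on an evenly spaced grid from 0 to 1, and that ideal family size takes integer values from 0 through 8 (a hypothetical range chosen to match the eight childbearing periods in the main estimates). The function names are introduced here, and the actual set of candidate combinations used in the estimates may differ.

```python
def unit_grid(increment):
    """Evenly spaced values from 0 to 1 inclusive at the given increment."""
    steps = round(1 / increment)
    return [i / steps for i in range(steps + 1)]


def count_combinations(increment, family_sizes=range(0, 9)):
    """Number of (r*, c*, alpha) combinations on the assumed grid."""
    n_sex_preference = len(unit_grid(increment))  # values of r*
    n_strength = len(unit_grid(increment))        # values of alpha
    return n_sex_preference * len(family_sizes) * n_strength


for increment in (0.50, 0.25, 0.10, 0.05):
    print(increment, count_combinations(increment))
```

Under these assumptions the number of candidate combinations grows with the square of the number of grid points, which is one way to see why smaller increments slow computation.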
Figure G.2 performs similar comparisons across various sample specifications. The main estimates are reached using a sample of women aged 40 and older who have ever been married. The first two comparisons in Figure G.2 demonstrate that the estimates vary little if the minimum age at observation is increased, but the estimated bounds on average sex preference and average family size preference are lower for previously married women than for currently married women. The main sample includes all live births, and the third comparison demonstrates that the estimated bounds on average sex preference narrow slightly if deceased children are omitted. Sampling weights are scaled to have a mean of one by survey. The fourth comparison demonstrates that the estimated bounds on average family size preference drop substantially if the original survey weights are used. This drop is due to large scaling of sampling weights provided in the surveys from Japan, where fertility is very low. The main estimates are reached from a sample of all women in Asia grouped together, and the fifth comparison demonstrates that the estimated bounds vary slightly if estimates are instead performed at the country level and then averaged.
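Scaling sampling weights to have a mean of one by survey, as described above, can be sketched as follows. This is a minimal Python illustration with hypothetical survey identifiers and weights; the function name `rescale_weights` and the record layout are assumptions, not the paper's code.

```python
from collections import defaultdict


def rescale_weights(records):
    """Rescale weights so that they average one within each survey.

    records: list of (survey_id, weight) pairs.
    Returns the rescaled weights in the same order as the input.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for survey, weight in records:
        totals[survey] += weight
        counts[survey] += 1
    # Dividing by the survey mean (total / count) gives a mean of one per survey.
    return [weight * counts[survey] / totals[survey] for survey, weight in records]


# Hypothetical example: survey "B" reports weights on a much larger scale.
records = [("A", 2.0), ("A", 6.0), ("B", 1_000_000.0), ("B", 3_000_000.0)]
rescaled = rescale_weights(records)
print(rescaled)
```

After rescaling, a survey whose original weights are on a much larger scale no longer dominates a pooled estimate, which is the concern raised for the fourth comparison.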
The final comparison in Figure G.2 demonstrates the sensitivity of the estimates in Asia to the restriction imposed on women from countries with substantial evidence of sex-selective abortion. Ultrasound technology began to spread widely in the 1980s. The natural share of births that are boys is about 0.51 around the world, but has risen above 0.519 in eight countries, all in Asia. The main estimates are reached excluding women who reached age 40 after the final year in which the share of births that are boys in their country was 0.519 or below. The estimated bounds vary slightly if all women who reached age 40 after 1980 in these eight countries are excluded, or if all women who reached age 40 after 1980 are excluded, regardless of country.
Figure G.1: Robustness of Estimates in Asia to Model Specification.
Notes: This figure compares the estimated bounds on average preferences in Asia (solid rectangles) with estimated bounds under alternative assumptions about the specification of equation 1 (hollow rectangles). Calculations are performed according to section 3, with the main specification given in section 5. Coefficients of determination (CD) are given in parentheses. See appendix G. Data source: Birth history surveys, described in section 4.
Figure G.2: Robustness of Estimates in Asia to Sample Specification.
Notes: This figure compares the estimated bounds on average preferences in Asia (solid rectangles) with estimated bounds under alternative sample specifications (hollow rectangles). Calculations are performed according to section 3, with the main specification given in section 5. Coefficients of determination (CD) are given in parentheses. See appendix G. Data source: Birth history surveys, described in section 4.
Appendix H: Reported Preferences in Alternative Samples
Many Demographic and Health surveys ask women to report the number of sons and daughters they would want if they could return to the start of their childbearing careers. Several of these surveys also ask men to report preferences. Figure H.1 compares estimated preferences, preferences reported by women, and preferences reported by men, using data from couples in which the husband and wife each report preferences and each has only one spouse. This sample is 10 percent the size of the full sample of women that report preferences, and the bounds on estimated preferences are different than in Figure 3.
Reported average ideal number of children in panel (a) of Figure H.1 is nearly equal among men and women in Asia and the Americas, but is 0.8 children higher for men than for women in Africa. This difference in Africa is due to desired sons: as given in panel (b), men in Africa report wanting 3.4 sons on average, while women want only 2.7 on average. There is little difference in number of desired daughters in panel (c). As given in panels (d) through (f), men and women on average both want half of their children to be sons, but men are more likely to report that they prefer sons, while women are more likely to report that they prefer daughters. As given in panel (g), women are more likely than men to report balance preference everywhere except in South America. Finally, panels (h) and (i) demonstrate that men and women are equally likely to indicate nearly-balance preference, and average intensity of sex preferences reported by men and women is well below the estimated lower bound.
Because Demographic and Health Surveys are cross-sectional, women observed after age 40 can only report the number of sons and daughters they would want if they could return to the start of their childbearing careers. Figure H.2 compares estimated and reported preferences calculated using Demographic and Health Surveys in Indonesia to preferences calculated from the Indonesian Family Life Survey, a panel study that asked women about preferences in 1993 and then recorded births through 2014. From this study, I calculate average preferences reported in 1993 by women aged 29 and younger, and bounds on estimated preferences using birth history records from the same women observed in 2014 aged 40 and older. Compared to Demographic and Health Surveys administered in Indonesia, the Indonesian Family Life Survey sample is considerably smaller (998 compared to 24,993 women), and the estimated bounds on preferences are wider. Average reported ideal number of children, number of sons, number of daughters, and share of children that are sons are nearly equal at the start and end of women’s childbearing careers. However, greater shares of women report preferring sons and preferring daughters at the start of their childbearing careers than at the end, and average reported intensity of preferences is greater among younger women. Reported and estimated preferences are closest for the sample of women whose preferences are reported before age 30.
Figure H.1: Estimated Preferences, and Preferences Reported by Women and Men.
Notes: This figure presents bounds on estimated preferences, and preferences reported by women and men. The coefficient of determination is 0.81 in Africa, 0.84 in Asia, 0.64 in North America, and 0.80 in South America. See appendix H. Data source: Birth history surveys, described in section 4 and appendix H.
Figure H.2: Preferences Reported Early and Later in Life in Indonesia.
Notes: This figure compares bounds on estimated preferences to preferences reported by women. Reported preferences before age 30 are recorded in the first wave of the Indonesia Family Life Survey. Reported preferences after age 40 are recorded in Demographic and Health Surveys. Estimated preferences are measured using birth histories collected after age 40 from the sample of women who report preferences in each case. The coefficient of determination is 0.71 for women whose preferences are reported before age 30, and 0.91 for women whose preferences are reported after age 39. See appendix H. Data source: Birth history surveys, described in section 4 and appendix H.
Footnotes
A binomial test can determine the likelihood that sampling error explains any unevenness in parity progression ratios. For example, of 146,042 women in the sample in Asia with a first-born son, 138,848 have a second child. Of 131,619 women in Asia with a first-born daughter, 125,315 have a second child. Overall, of 277,661 women in Asia with one child, 264,163 have a second child. Assuming that the true likelihood of having a second child is 264,163/277,661 no matter the sex of the first child, the likelihood of observing that 138,848 or fewer women with one son have a second child is 0.1267, and the likelihood of observing that 125,315 or more women with one daughter have a second child is 0.1142. Both of these likelihoods are large enough to indicate that sampling error alone could plausibly explain observed variation in parity progression ratios among women with one child in Asia. Similarly, in Africa and the Americas, sampling error could plausibly explain variation in parity progression ratios among women with one child. However, on each continent, p-values at higher parities are less than 0.01, indicating that sampling error alone cannot explain variation in parity progression ratios. At family size two and larger, the sex of previous children influences whether women have additional children.
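The binomial test in this footnote can be sketched with the counts reported for Asia. The Python code below is a minimal illustration that uses only the standard library; the helper names are introduced here, and the tail likelihoods are computed by summing exact binomial terms in logs, which may differ slightly from the routine used for the reported values of 0.1267 and 0.1142.

```python
from math import exp, lgamma, log


def binom_log_pmf(k, n, p):
    """Log of the binomial likelihood of exactly k successes in n trials."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))


def lower_tail(k, n, p):
    """Likelihood of k or fewer successes."""
    return sum(exp(binom_log_pmf(i, n, p)) for i in range(0, k + 1))


def upper_tail(k, n, p):
    """Likelihood of k or more successes."""
    return sum(exp(binom_log_pmf(i, n, p)) for i in range(k, n + 1))


# Counts for Asia reported in the footnote.
first_son, first_son_second_child = 146_042, 138_848
first_daughter, first_daughter_second_child = 131_619, 125_315

# Pooled likelihood of a second child, assumed not to depend on the first child's sex.
pooled = ((first_son_second_child + first_daughter_second_child)
          / (first_son + first_daughter))

p_son = lower_tail(first_son_second_child, first_son, pooled)
p_daughter = upper_tail(first_daughter_second_child, first_daughter, pooled)
print(round(p_son, 4), round(p_daughter, 4))
```

Both tail likelihoods are well above conventional significance thresholds, consistent with the footnote's conclusion for women with one child.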
Although I have no theoretical justification for the quadratic form of these utility penalties, bliss point utility functions are often formulated assuming a quadratic loss (Alesina 1988, Levine et al. 2008). In appendix G, I compare alternative specifications of the utility penalty. The estimated bounds on preferences widen considerably using quartic or higher-order loss functions, and also vary substantially using asymmetric loss functions.
For example, consider a couple that has two childbearing periods (T=2), has an even likelihood that each child is a boy (l=0.5), has perfect control over conception (p=1, q=0), and cares only that all children are sons (α=1, r*=1, c* does not matter and can take any value). In the first period, the couple has a child. If the first-born child is a son, another child cannot raise and might reduce the share of children that are sons, so the couple does not have a second child. If the first-born child is a daughter, another child cannot reduce and might raise the share of children that are sons, so the couple has a second child. Therefore, with likelihood l the couple has a son and then stops childbearing, with likelihood (1–l)l the couple has a daughter and then a son, and with likelihood (1–l)(1–l) the couple has two daughters. The couple has no chance of having any other sequence of children.
Shorter spacing after the birth of daughters in some countries signals that parents are eager to have sons (Jayachandran and Kuziemko 2011). Collapsing to the sequence of sons and daughters avoids misinterpreting the timing of childbearing, but at a cost of ignoring information about spacing between births. Milazzo (2014) notes that shorter spacing between births can increase maternal mortality. If women with sex preferences have shorter birth spacing, then they are more likely to die in childbirth and fail to be observed in birth history surveys. Because the sample in this paper is drawn from birth history surveys, the findings may therefore underestimate the prevalence of sex preferences.
This exclusion drops from the sample women born since 1942 and living in China or South Korea, women born since 1950 and living in India, women born since 1952 and living in Armenia or Azerbaijan, women born since 1957 and living in Pakistan, and women born since 1962 and living in Vietnam. This exclusion therefore omits nearly all women who made childbearing decisions under China’s one-child policy, which began in 1979.
In appendix F, I present estimated bounds on average preferences in Africa by women’s decade of birth. The wide bounds on average sex preference similarly prevent identification of changes over time in the unknown true average value.
Again, birth histories collected from younger women who live in high-sex ratio countries are excluded. Because these women are concentrated in Asia and generally have few children, the actual average ideal number of children across all couples in Asia is likely less than 4.2.
These estimates are calculated for the 76 countries in which at least 1,000 birth history surveys are collected. The sample is restricted to surveys that have national or nearly-national coverage.
Works Cited
- Alesina Alberto, Devleeschauwer Arnaud, Easterly William, Kurlat Sergio, and Wacziarg Romain. 2003. “Fractionalization.” Journal of Economic Growth, 8(2): 155–194.
- Alesina Alberto. 1988. “Credibility and Policy Convergence in a Two-Party System with Rational Voters.” American Economic Review, 78(4): 796–805.
- Angrist Joshua, and Evans William. 1998. “Children and Their Parents’ Labor Supply: Evidence from Exogenous Variation in Family Size.” American Economic Review, 88(3): 450–477.
- Arnold Fred, Choe Minja Kim, and Roy TK. 1998. “Son Preference, the Family-Building Process and Child Mortality in India.” Population Studies, 52(3): 301–315.
- Arnold Fred. 1997. Gender Preferences for Children. DHS Comparative Studies No. 23. Calverton, Maryland: Macro International Inc.
- Becker Gary. 1960. “An Economic Analysis of Fertility.” In Demographic and Economic Change in Developed Countries: A Conference of the Universities-National Bureau Committee for Economic Research. Princeton: Princeton University Press.
- Ben-Porath Yoram, and Welch Finis. 1976. “Do Sex Preferences Really Matter?” Quarterly Journal of Economics, 90(2): 285–307.
- Bhat Mari, and Zavier Francis. 2003. “Fertility Decline and Gender Bias in Northern India.” Demography, 40(4): 637–657.
- Bongaarts John, and Guilmoto Christophe Z. 2015. “How Many More Missing Women? Excess Female Mortality and Prenatal Sex Selection, 1970–2050.” Population and Development Review, 41(2): 241–269.
- Bongaarts John. 2013. “The Implementation of Preferences for Male Offspring.” Population and Development Review, 39(2): 185–208.
- China Fertility Survey. 1988. Ten percent sample of the Two-per-thousand survey.
- Clifton VL. 2010. “Review: Sex and the Human Placenta: Mediating Differential Strategies of Fetal Growth and Survival.” Placenta, 31(3): S33–S39.
- Di Renzo Gian Carlo, Rosati Alessia, Sarti Roberta Donati, Cruciani Laura, and Cutuli Antonio Massimo. 2007. “Does Fetal Sex Affect Pregnancy Outcome?” Gender Medicine, 4(1): 19–30.
- Frankenberg Elizabeth, and Thomas Duncan. 2000. “The Indonesia Family Life Survey (IFLS): Study Design and Results from Waves 1 and 2.” RAND Corporation, DRU-2238/1-NIA/NICHD.
- Frankenberg Elizabeth, and Karoly Lynn A. 1995. “The 1993 Indonesian Family Life Survey: Overview and Field Report.” RAND Corporation, DRU-1195/1-NICHD/AID.
- Freedman Ronald, and Coombs Lolagene C. 1974. Cross-Cultural Comparisons: Data on Two Factors in Fertility Behavior. New York: Population Council.
- Goodkind Daniel. 2017. “The Big Short: Four Reasons to Bet on a Collapse in China’s Excess of Boys.” Working paper, presented at the 2017 Annual Meeting of the Population Association of America.
- Haughton Jonathan, and Haughton Dominique. 1995. “Son Preference in Vietnam.” Studies in Family Planning, 26(6): 325–337.
- Haughton Jonathan, and Haughton Dominique. 1998. “Are Simple Tests of Son Preference Useful? An Evaluation Using Data from Vietnam.” Journal of Population Economics, 11(4): 495–516.
- Henry Louis. 1961. “Some Data on Natural Fertility.” Biodemography and Social Biology, 8(2): 81–91.
- ICF International. 1985–2017. Demographic and Health Surveys (various) [Datasets]. Calverton, Maryland: ICF International [Distributor]. Available at <http://dhsprogram.com>.
- Institute of Regional Studies at Osaka University of Commerce. 2000–2010. Japanese General Social Surveys [Datasets]. Ann Arbor, Michigan: Inter-university Consortium for Political and Social Research [Distributor]. Available at <https://opr.princeton.edu/archive/wfs>.
- International Statistics Institute. 1974–1981. “World Fertility Surveys.” Available at <https://opr.princeton.edu/archive/wfs>.
- James William H. 1971. “Cycle Day of Insemination, Coital Rate, and Sex Ratio.” Lancet, 297(7690): 112–114.
- James William H. 1990. “On the Magnitude of Variation in the Human Sex Ratio at Birth.” Current Anthropology, 31(4): 419–420.
- James William H. 2009. “Variation in the Probability of a Male Birth Within and Between Sibships.” Human Biology, 81(1): 13–22.
- Jayachandran Seema, and Kuziemko Ilyana. 2011. “Why Do Mothers Breastfeed Girls Less than Boys? Evidence and Implications for Child Health in India.” Quarterly Journal of Economics, 126(3): 1–54.
- Keyfitz Nathan, and Caswell Hal. 2005. Applied Mathematical Demography. Third edition. New York: Springer.
- Kumar Anuradha, Hessini Leila, and Mitchell Ellen M.H. 2009. “Conceptualising Abortion Stigma.” Culture, Health, & Sexuality, 11(6): 625–639.
- Larsen Ulla. 2005. “Research on Infertility: Which Definition Should We Use?” Fertility and Sterility, 83(4): 846–852.
- Levine Paul, Pearlman Joseph, and Pierse Richard. 2008. “Linear-quadratic Approximation, External Habit, and Targeting Rules.” Journal of Economic Dynamics and Control, 32(10): 3315–3349.
- Lightbourne Robert E. 1985. “Individual Preferences and Fertility Behaviour.” In Reproductive Change in Developing Countries: Insights from the World Fertility Survey, edited by Cleland John and Hobcraft John, 165–198. Oxford: Oxford University Press.
- Maharaj Pranitha, and Cleland John. 2006. “Condoms Become the Norm in the Sexual Culture of College Students in Durban, South Africa.” Reproductive Health Matters, 14(28): 104–112.
- McKean Joseph W., and Sievers Gerald L. 1987. “Coefficients of Determination for Least Absolute Deviation Analysis.” Statistics & Probability Letters, 5(1): 49–54.
- Milazzo Annamaria. 2014. “Son Preference, Fertility and Family Structure: Evidence from Reproductive Behavior among Nigerian Women.” World Bank Policy Research Working Paper 6869.
- Mutharayappa Rangamuthia, Choe Minja Kim, Arnold Fred, and Roy TK. 1997. “Son Preference and Its Effect on Fertility in India.” National Family Health Survey Subject Reports, Number 3.
- National Council of Applied Economic Research. 1982. Rural Economic and Demographic Survey. Available at <http://adfdell.pstc.brown.edu/arisreds_data>.
- Newell Colin. 1988. Methods and Models in Demography. New York: Guilford Press.
- Novitski E, and Sandler L. 1956. “The Relationship between Parental Age, Birth Order, and the Secondary Sex Ratio in Humans.” Annals of Human Genetics, 21(2): 123–131.
- Pickles AR, Crouchley R, and Davies RB. 1982. “New Methods for the Analysis of Sex Ratio Data Independent of the Effects of Family Limitation.” Annals of Human Genetics, 46(1): 75–81.
- Rahman Omar, Menken Jane, Foster Andrew, Peterson Christine E., Khan Mohammad Nizam, Kuhn Randall, and Gertler Paul. 1999. “The Matlab Health and Socioeconomic Survey: Overview and User’s Guide.” RAND Corporation, DRU-2018/1.
- Romano Joseph, and Shaikh Azeem. 2008. “Inference for Identifiable Parameters in Partially Identified Econometric Models.” Journal of Statistical Planning and Inference, 138(9): 2786–2807.
- Smock Pamela, Granda Peter, and Hoelter Lynette. 1955–2002. Integrated Fertility Survey Series, Release 7 [Datasets]. Ann Arbor, Michigan: Inter-university Consortium for Political and Social Research [Distributor]. Available at <http://www.icpsr.umich.edu>.
- Strauss John, Witoelar Firman, and Sikoki Bondan. 2016. “The Fifth Wave of the Indonesia Family Life Survey (IFLS5): Overview and Field Report.” RAND Corporation, WR-1143/1-NIA/NICHD.
- Strauss John, Witoelar Firman, Sikoki Bondan, and Wattie Anna Marie. 2009. “The Fourth Wave of the Indonesian Family Life Survey (IFLS4): Overview and Field Report.” RAND Corporation, WR-675/1-NIA/NICHD.
- Strauss John, Beegle Kathleen, Sikoki Bondan, Dwiyanto Agus, Herawati Yulia, and Witoelar Firman. 2004. “The Third Wave of the Indonesia Family Life Survey (IFLS): Overview and Field Report.” RAND Corporation, WR-144/1-NIA/NICHD.
- Tamer Elie. 2010. “Partial Identification in Econometrics.” Annual Review of Economics, 2(1): 167–195.
- UNICEF. 2006–2014. Multiple Indicator Cluster Surveys. Available at <http://mics.unicef.org>, accessed August 23, 2015.
- World Bank DataBank. 2015. “Sex Ratio at Birth.” Available at <https://data.worldbank.org/>, downloaded April 3, 2015.
- World Bank. 2018. “Population, Total.” Available at <https://data.worldbank.org/>, downloaded June 6, 2018.