Author manuscript; available in PMC: 2019 Nov 5.
Published in final edited form as: Sociol Methods Res. 2014 Dec 1;44(4):555–584. doi: 10.1177/0049124114554460

PROSPECTIVE VERSUS RETROSPECTIVE APPROACHES TO THE STUDY OF INTERGENERATIONAL SOCIAL MOBILITY

XI SONG 1, ROBERT D MARE 2
PMCID: PMC6830566  NIHMSID: NIHMS1053755  PMID: 31693004

Abstract

Most intergenerational social mobility studies are based upon retrospective data, in which samples of individuals report socioeconomic information about their parents, an approach that provides representative data for offspring but not the parental generation. When available, prospective data on intergenerational mobility, which are based on a sample of respondents who report on their progeny, have conceptual and practical advantages. Prospective data are especially useful for studying social mobility across more than two generations and for developing joint models of social mobility and demographic processes. Because prospective data remain relatively scarce, we propose a method that corrects retrospective mobility data for the unrepresentativeness of the parental generation, and thus permits them to be used for models of social mobility and demographic processes. We illustrate this method using both simulated data and data from the Panel Study of Income Dynamics. In our examples, this method removes more than 95% of the bias in the retrospective data.

Keywords: social mobility, prospective data, retrospective data, multigenerational study, demographic processes

Introduction

This paper illustrates the advantages of using a prospective approach in two-generation and three-generation social mobility studies, and provides an adjustment method for conducting prospective analyses when only retrospective data are available. Traditional social mobility studies typically examine the parent-offspring association in socioeconomic status by asking a group of individuals about the characteristics of their parents retrospectively. Such an approach has been the basis of a large and successful literature on intergenerational social mobility (e.g., Beller 2009; Blau and Duncan 1967; Breen 2004; Erikson and Goldthorpe 1992; Featherman and Hauser 1978; Hauser et al. 1975; Hout 1983, 1988). Modern longitudinal studies, however, which follow a sample of adults from the birth and growth to adulthood of their offspring and descendants, afford the possibility of studying social mobility prospectively, including, in some cases, mobility across multiple generations (Mare 2011).

In Duncan’s (1966) classic article on methodology of social mobility studies, he concluded: “Although data in the typical mobility study are collected retrospectively (by questioning the respondent about the past), this is only a convenience in data collection. While it introduces problems of data reliability and validity, it does not commit the analyst to a backward-looking conceptual framework.” Whereas this conclusion is, strictly speaking, correct, it leaves unsaid much about the relationship between these two approaches. Important statistical and conceptual issues arise when one considers their respective possibilities and limitations.

From a statistical point of view, mobility estimates, such as those based on odds ratios, are not necessarily the same in the two approaches. The discrepancy between statistical estimates based on retrospective and prospective surveys, which we term “retrospective sampling bias,” arises because individuals can be recalled only if they have surviving offspring who can be sampled in retrospective surveys and the parents who are recalled have higher fertility than a random selection of parents from their generation (Glass 1954; Duncan 1966; Allan and Bytheway 1973). Thus, the respondents are representative of the offspring generation, but their parents are not representative of the parents’ generation. In contrast, for prospective data, in which the survey asks individuals to provide information about themselves and their offspring, the respondents are representative of people in the parents’ generation, regardless of whether they have children. However, their offspring may not be representative of their own generation, either because the respondents are asked about only a selected child rather than all children, or because the respondents have not finished having children. Prospective data provide representative samples for both the parent and the offspring generation only when parents provide information about all their children. Retrospective and prospective estimates of social mobility agree only when family size is unrelated to the intergenerational transmission of socioeconomic characteristics. One of the contributions of this paper is to provide a method of reconciling mobility estimates from the two approaches.

Although prospective mobility data, like retrospective data, are useful for the analysis of associations between the socioeconomic statuses of parents and offspring, prospective data also afford a wider range of analytic possibilities. For example, the reproduction of inequality from parents to offspring is not only about intergenerational transmission of status, but also depends on differential fertility among socioeconomic groups (Mare 1997; Mare and Maralani 2006). The intergenerational persistence of inequality involves not only inequality among those who have offspring, the typical focus of a traditional mobility table approach, but also inequality among those who have different numbers of offspring, including those who have no offspring at all. Thus, the prospective approach provides a more complete understanding of intergenerational transmission of inequality because it takes into account the interdependence of mobility and demographic processes, such as differentials in timing and levels of marriage, fertility, and mortality. Some prior studies have relied on a prospective logic to investigate the joint impact of social mobility and demographic processes on population dynamics (Preston 1974; Preston and Campbell 1993), intergenerational reproduction of education (Mare and Maralani 2006) and changes in occupational structures (Matras 1961, 1967).

Moreover, prospective data that follow families over three or more generations permit researchers to go beyond the traditional two-generation paradigm in social mobility and demographic studies (Mare 2011). Examples of multigenerational, longitudinal data include the Panel Study of Income Dynamics (PSID) and the Wisconsin Longitudinal Study (WLS). But multigenerational data are still rare in the social sciences because collecting them is both costly and time-consuming: surveys must follow families for more than 50 years. Despite the scarcity of prospective data spanning three or more generations, many cross-sectional and longitudinal surveys have asked retrospective questions about family histories of social status (Featherman 1979). Some of these retrospective data provide potentially useful sources for studying multigenerational processes (e.g., Szelenyi and Treiman 1994; Treiman, Moeno and Schlemmer 1996; Treiman and Walder 1998).

In this paper, we illustrate the conceptual and practical advantages of using a prospective approach in social mobility analyses that incorporate both mobility and demographic processes. We propose an adjustment method that permits researchers to adopt the prospective approach using only retrospective data. The adjustment method corrects biases in retrospective data caused by the unrepresentativeness of individuals in the parent generation who have no or few offspring. We illustrate our methods using both simulated data and empirical data from the Panel Study of Income Dynamics. Our results suggest that the adjustment method removes most of the bias in the retrospective joint demographic and mobility effect. The adjustment methods proposed in this paper are well-suited for not only two-generation social mobility studies but multigenerational mobility analyses as well.

We divide the remainder of the paper into five sections. The first section discusses the statistical relationship between mobility tables constructed from retrospective and prospective data. In the second section, we present a joint model of social mobility and demographic reproduction. We then describe our adjustment method, explaining how to conduct the prospective analyses when only retrospective data are available. In the fourth section, we illustrate the method using simulated two-generation and multigenerational data. We then use the Panel Study of Income Dynamics to illustrate the effectiveness of the adjustment method for real data. The conclusion section reviews the capabilities and limitations of our method.

Retrospective and Prospective Mobility Table

Most mobility studies use tables based on retrospective surveys that ask respondents about their own and their parents’ socioeconomic characteristics. The mobility table shows the parent-offspring association based on a tabulation of a socioeconomic characteristic of an individual and his or her parents (e.g., Hout 1983). Some widely used cross-sectional, retrospective mobility data include the Occupational Changes in a Generation surveys (OCG) (Blau and Duncan 1967; Featherman and Hauser 1978), the General Social Survey (GSS) (Hout 1988; Beller 2009) and the Comparative Analysis of Social Mobility in Industrial Nations Project (CASMIN) (e.g., Erikson, Goldthorpe and Portocarero 1979; Erikson and Goldthorpe 1992). Based on the retrospective data, the mobility table is constructed from the perspective of adult individuals, who are a representative sample of their own generation. These individuals report on their parents, but the sample of parents is not representative of any cohort or population in “some definite prior moment in time,” as parents vary in their timing and level of childbearing (Duncan 1966). In addition, because this design overrepresents parents who have more offspring and fails to include any members of the parental generation who are childless, the resulting data do not provide a representative sample of the parental generation.

As an alternative, it is possible to construct the mobility table based on prospective data. There are two kinds of prospective mobility data. The first type of data asks respondents about their own socioeconomic status and that of a selected child. For example, the Wisconsin Longitudinal Study (WLS) adopts this design for collecting intergenerational occupational information. The WLS respondents are representative of their generation regardless of their number of sons, but sons of high fertility respondents are underrepresented in their generation. The second type of data asks respondents about all of their offspring, either on a one time basis or longitudinally. Examples of this prospective design include the collection of educational information for the WLS, and intergenerational socioeconomic information in the PSID and newer panel studies in other countries such as the German Socioeconomic Panel (started in 1984), the British Household Panel Survey (1991), the Canadian Survey of Labor and Income Dynamics (1993), the Korean Labor and Income Panel Study (1998), the Swiss Household Panel (1999), the Australian Household, Income and Labor Dynamics (2001), and the Chinese Family Panel Studies (2010). These prospective data include a parent sample and an offspring sample, both of which are representative of their own generations if we ignore the effects of immigration and emigration.

The algebraic relationship between odds ratios in mobility tables estimated from the first type of prospective data and those from traditional retrospective data is as follows. Given a retrospective mobility table in which the rows represent fathers’ occupations and the columns represent sons’ occupations, with I and J categories respectively (typically, I = J), the odds ratios ($OR_r$) are defined as $\frac{n_{ij}\, n_{i'j'}}{n_{ij'}\, n_{i'j}}$, where $n_{ij}$ denotes the number of men in the jth occupation category whose fathers are in the ith occupation category. This odds ratio is the ratio of the odds of being in occupation $j$ rather than $j'$, given that one’s father is in occupation $i$, to the odds of being in occupation $j$ rather than $j'$, given that one’s father is in occupation $i'$. We can obtain the same odds ratio based on the first type of prospective data by adjusting for the completed fertility of fathers. If the kth father in occupation $i$ has $r_{ij(k)}$ sons in occupation $j$, and $\bar{r}_{ij}$ denotes the average number of sons in occupation $j$ for fathers in occupation $i$, that is, $\bar{r}_{ij} = \frac{1}{n_{ij}} \sum_{k=1}^{n_{ij}} r_{ij(k)}$, the relationship between the retrospective and prospective odds ratios is:

$$OR_r = \frac{\sum_{k=1}^{n_{ij}} r_{ij(k)} \cdot \sum_{k=1}^{n_{i'j'}} r_{i'j'(k)}}{\sum_{k=1}^{n_{ij'}} r_{ij'(k)} \cdot \sum_{k=1}^{n_{i'j}} r_{i'j(k)}} = \frac{(n_{ij} \cdot \bar{r}_{ij})(n_{i'j'} \cdot \bar{r}_{i'j'})}{(n_{ij'} \cdot \bar{r}_{ij'})(n_{i'j} \cdot \bar{r}_{i'j})} = OR_p \cdot \frac{\bar{r}_{ij} \cdot \bar{r}_{i'j'}}{\bar{r}_{ij'} \cdot \bar{r}_{i'j}} \tag{1}$$

Equation (1) shows that to construct retrospective odds ratios from prospective data, we can weight the prospective data so that fathers with more sons are over-represented in exact proportion to their fertility. More specifically, we weight each father by his fertility, or equivalently, weight each son by one plus his number of siblings. The analyses are thus based on the weighted frequencies rather than the original unweighted frequencies.¹

When the weighting ratio $\frac{\bar{r}_{ij} \cdot \bar{r}_{i'j'}}{\bar{r}_{ij'} \cdot \bar{r}_{i'j}} = 1$, the retrospective and prospective odds ratios yield the same conclusions about mobility. This means that if father’s fertility is associated only with the marginal distributions of father’s occupation and/or son’s occupation, then no weighting is needed. Only when there is a three-way interaction among father’s occupation, son’s occupation and father’s fertility (or son’s number of siblings), namely, $\frac{\bar{r}_{ij} \cdot \bar{r}_{i'j'}}{\bar{r}_{ij'} \cdot \bar{r}_{i'j}} \neq 1$, do the retrospective and prospective odds ratios differ (Clogg and Eliason 1987; Fienberg 1980). For example, when the fertility of the fathers of immobile sons is greater than the fertility of the fathers of mobile sons (i.e., $\frac{\bar{r}_{ii} \cdot \bar{r}_{i'i'}}{\bar{r}_{ii'} \cdot \bar{r}_{i'i}} > 1$), the estimate of $OR_p$ is smaller than $OR_r$. Although there is abundant evidence about the two-way interactions between father’s socioeconomic status and fertility, as well as between father’s fertility and son’s status (e.g., Blake 1981; Blau and Duncan 1967), relatively few studies have examined the three-way interaction among father’s occupation, son’s occupation and father’s fertility.
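The relationship between the two odds ratios can be checked numerically. The following Python sketch uses a hypothetical 2×2 table of fathers (one selected son each) and hypothetical cell-specific average fertility; none of the numbers come from the paper. It weights each cell by average fertility and compares the resulting odds ratio with the prospective odds ratio times the weighting ratio.

```python
def odds_ratio(table):
    """Odds ratio of a 2x2 table laid out as [[ij, ij'], [i'j, i'j']]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# Hypothetical prospective table: fathers' occupation (rows) by the
# occupation of one selected son (columns).
n = [[60, 40],
     [30, 70]]

# Hypothetical average number of sons per father in each cell (r-bar).
# Fathers of immobile sons (diagonal) are assumed to have more sons.
r_bar = [[2.0, 1.5],
         [1.5, 2.0]]

or_p = odds_ratio(n)

# Weight each father by his fertility: cell count times average fertility.
weighted = [[n[i][j] * r_bar[i][j] for j in range(2)] for i in range(2)]
or_r = odds_ratio(weighted)

# The weighting ratio is itself an "odds ratio" of the r-bar table.
weighting_ratio = odds_ratio(r_bar)

print(or_p, or_r, or_p * weighting_ratio)
```

Because the diagonal cells have higher assumed fertility, the weighting ratio exceeds 1 and the retrospective odds ratio is larger than the prospective one, as the text describes.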

For the second type of prospective data, we can construct mobility tables based on the offspring sample, and link offspring with their parents in the parent sample. If we use data from all sons, the resulting mobility table has the same structure as a traditional mobility table constructed from retrospective data. In this case, the prospective and retrospective odds ratios are equal and no weighting adjustment is required.

The Demography of Social Mobility

Beyond the Mobility Table: the Joint Effects of Fertility and Mobility

As we show above, prospective data can be used to construct traditional mobility tables after a proper weighting of the data. But this is not the only reason for using prospective mobility data. An important advantage of prospective data is that they also permit analyses to go beyond the intergenerational correlation of socioeconomic status and incorporate demographic mechanisms into our models of the intergenerational reproduction of inequality. Most mobility research focuses on the associations between parents’ and their offspring’s characteristics conditional on the existence of the offspring. However, a more complete understanding of intergenerational influence considers the process of how offspring come into existence as part of parents’ effect on their children (Mare and Maralani 2006). The mobility table itself is inadequate for showing how a socioeconomic distribution persists or changes because the mobility process is interdependent with the differentials in timing and levels of fertility, mortality, and migration (Duncan 1966). In the discussion below, we focus on the role of fertility in a one-sex model for men. More comprehensive versions of these models take account of other demographic processes, including marriage, divorce, remarriage, parental and child survival, adoption, migration, and timing of these events, both for women and for two-sex populations (For related discussions, see Bartholomew 1982; Lam 1986; Mare 1997; Mare and Schwartz 2006; Maralani and Mare 2005; Matras 1961, 1967; Musick and Mare 2004; Preston 1974; Preston and Campbell 1993).

A Joint Demographic and Mobility Model

Our discussion below builds upon the one-sex joint demographic and mobility model in Mare and Maralani (2006).² The model specifies the effect of a man’s socioeconomic position in one generation (compared to other positions of men in that generation) on the expected number of sons in a given socioeconomic position in the next generation. This model shows how social mobility and fertility contribute to transformations of the socioeconomic distribution of a population. The model is written as

$$S_{Y_2|Y_1} = F_{Y_1} \cdot f_{Y_1} \cdot p_{Y_2|Y_1} \tag{2}$$

where $S_{Y_2|Y_1}$ denotes the number of men in the offspring generation who are in position $Y_2$ and have fathers in position $Y_1$; $F_{Y_1}$ denotes the number of men in the paternal generation who are in position $Y_1$; $f_{Y_1}$ denotes the expected number of sons who are born to a man in position $Y_1$ and survive to adulthood; and $p_{Y_2|Y_1}$ denotes the probability that a son born to a man in position $Y_1$ will enter position $Y_2$.

We define the joint demographic and mobility effect as

$$E(f_{Y_1} \, p_{Y_2|Y_1}) - E(f_{Y_1'} \, p_{Y_2|Y_1'}) \tag{3}$$

This effect suggests the expected relative advantages of a man in position Y1 over a man in position Y1' in reproducing sons in position Y2.
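As a small worked illustration of the two-generation model and the joint effect, the Python sketch below uses made-up values for the number of fathers, expected fertility, and mobility probabilities in a two-position setting (1 = low, 2 = high). The names `F`, `f`, and `p` mirror the symbols in equation (2); the values are hypothetical.

```python
# Hypothetical inputs; none of these values are taken from the paper.
F = {1: 700, 2: 300}                      # men per position, fathers' generation
f = {1: 1.0, 2: 1.4}                      # expected surviving sons per man
p = {1: {1: 0.8, 2: 0.2},                 # p[y1][y2]: mobility probabilities
     2: {1: 0.4, 2: 0.6}}

def expected_sons(y1, y2):
    """Expected number of sons in position y2 per father in position y1."""
    return f[y1] * p[y1][y2]

# Equation (2): sons in position y2 with fathers in position y1.
S = {(y1, y2): F[y1] * expected_sons(y1, y2) for y1 in F for y2 in (1, 2)}

# Equation (3): advantage of a high-status father over a low-status father
# in producing high-status sons, expressed as a difference.
joint_effect = expected_sons(2, 2) - expected_sons(1, 2)

print(S)
print(round(joint_effect, 2))
```

In this made-up example, a high-status father is expected to have 0.64 more high-status sons than a low-status father, partly because he has more sons and partly because each son is more likely to reach the high position.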

The model can also incorporate potential influences from grandparents, great grandparents, and earlier generations of ancestors in both the fertility and the mobility components. The multigenerational form of the model specifies that:

$$S_{Y_t|\bar{Y}_{t-1}} = F_{\bar{Y}_{t-1}} \cdot f_{\bar{Y}_{t-1}} \cdot p_{Y_t|\bar{Y}_{t-1}} \tag{4}$$

where $\bar{Y}_{t-1} = (Y_1, Y_2, \ldots, Y_{t-1})$ is a vector that denotes the family history of positions and $t$ denotes the generation sequence; $S_{Y_t|\bar{Y}_{t-1}}$ denotes the number of men in generation $t$ who are in position $Y_t$ and have fathers in position $Y_{t-1}$, grandfathers in position $Y_{t-2}$, and so forth; $F_{\bar{Y}_{t-1}}$ denotes the number of men in generation $t-1$ and those in prior generations in the positional history $\bar{Y}_{t-1}$; $f_{\bar{Y}_{t-1}}$ denotes the expected number of sons of men in generation $t-1$, given that positions of earlier generations are $\bar{Y}_{t-1}$; and $p_{Y_t|\bar{Y}_{t-1}}$ denotes the probability that a son born to a family in positional history $\bar{Y}_{t-1}$ achieves position $Y_t$. Accordingly, the joint demographic and mobility effect in the multigenerational form is:

$$E(f_{\bar{Y}_{t-1}} \cdot p_{Y_t|\bar{Y}_{t-1}}) - E(f_{\bar{Y}'_{t-1}} \cdot p_{Y_t|\bar{Y}'_{t-1}}) \tag{5}$$

which is the difference in the expected number of individuals in position $Y_t$ in a population from families with positional history $\bar{Y}_{t-1}$ compared to $\bar{Y}'_{t-1}$. Standard errors of the joint effect, for either the two-generational or multigenerational model, can be estimated by the delta or bootstrap method.

Estimation of the Model Parameters

To estimate the fertility and mobility parameters of the model, we rely on regression-based methods. For the fertility component, we assume that the number of offspring, conditional upon a set of covariates, follows a Poisson or negative binomial distribution (e.g., Long 1997). The Poisson or the negative binomial model assumes that the proportion of observed counts at each level of fertility, including zero children, in the data matches the proportions predicted by the respective distribution. This assumption may be problematic for the distinction between childless individuals and those with children because different mechanisms may account for the influence of an individual’s characteristics on the probability of having no offspring, and conditional on having at least one offspring, the probabilities of having different numbers of offspring. For example, in most developed societies individuals with high socioeconomic status tend to have fewer children than those low in status, whereas socioeconomic status may have a positive or negative association with childlessness (e.g., Heaton, Jacobson and Holland 1999; Abma and Martinez 2006).

To allow parents’ and grandparents’ characteristics to have separate effects on the probability of being childless and, conditional on having at least one offspring, on the total number of offspring, we use a mixture Poisson or mixture negative binomial distribution that allows the two parts of the fertility distribution to follow distinct processes, yet models these processes jointly (Johnson, Kemp and Kotz 2005). Suppose that $\pi$ is the probability of childlessness. The model specifies that for the kth individual,

$$P[f_k = 0 \mid Z_k] = \pi, \qquad P[f_k = n \mid X_k, f_k > 0] = (1 - \pi) \frac{p_n}{1 - p_0} \tag{6}$$

where $P[f_k = n]$ is the probability that the number of offspring for the kth individual is $n$; $Z$ is the set of covariates that predict having no offspring and $X$ is the set of covariates that predict the positive number of offspring; and $p_n$ (and $p_0$) is the probability of having a given number of offspring in the Poisson or negative binomial distribution.

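We use a generalized linear model with logit link to predict $P[f_k = 0]$ and assume that nonzero fertility $P[f_k = n \mid f_k > 0]$ follows a truncated negative binomial distribution. Thus, we can model them jointly by the mixture logit and negative binomial model:

$$P[f_k = 0 \mid Z_k] = \frac{1}{1 + \exp(Z_k'\gamma)}$$

and

$$P[f_k = n \mid X_k, f_k > 0] = \frac{1 - P[f_k = 0 \mid Z_k]}{1 - \left(1 + \frac{\mu_k}{\theta_k}\right)^{-\theta_k}} \cdot \frac{\Gamma(f_k + \theta_k)}{\Gamma(\theta_k) \cdot \Gamma(f_k + 1)} \cdot \frac{\mu_k^{f_k} \cdot \theta_k^{\theta_k}}{(\mu_k + \theta_k)^{f_k + \theta_k}} \tag{7}$$

where $\Gamma(\cdot)$ is the gamma function. Under the negative binomial distribution, the probability of having zero offspring is $\left(1 + \frac{\mu_k}{\theta_k}\right)^{-\theta_k}$. The mean of the negative binomial distribution is $\mu_k$, and the variance is $\mu_k + \frac{\mu_k^2}{\theta_k}$, where $\theta_k$ is the dispersion parameter.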
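The mixture logit and negative binomial probabilities can be sketched in a few lines of Python. The function names and the parameter values in the usage lines are hypothetical; `z_gamma` stands for the linear predictor $Z_k'\gamma$, and the logit part follows the parameterization written above, in which $P[f_k = 0] = 1/(1 + \exp(Z_k'\gamma))$.

```python
import math

def nb_pmf(n, mu, theta):
    """Negative binomial probability of n offspring (mean mu, dispersion theta)."""
    log_p = (math.lgamma(n + theta) - math.lgamma(theta) - math.lgamma(n + 1)
             + n * math.log(mu) + theta * math.log(theta)
             - (n + theta) * math.log(mu + theta))
    return math.exp(log_p)

def mixture_pmf(n, z_gamma, mu, theta):
    """Probability of n offspring under the mixture logit / truncated NB model."""
    p_zero = 1.0 / (1.0 + math.exp(z_gamma))       # logit part: childlessness
    if n == 0:
        return p_zero
    p0_nb = (1.0 + mu / theta) ** (-theta)          # NB probability of zero
    # Positive counts: truncated NB, rescaled by the probability of any offspring.
    return (1.0 - p_zero) / (1.0 - p0_nb) * nb_pmf(n, mu, theta)

# Hypothetical parameter values, for illustration only.
probs = [mixture_pmf(n, z_gamma=0.5, mu=2.0, theta=1.5) for n in range(400)]
total = sum(probs)
print(total)
```

The probabilities sum to one (up to truncation of the infinite support), which is a quick check that the rescaling of the truncated part is consistent with the logit part.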

This model reduces to a regular negative binomial model if all zero and nonzero observations in the fertility distribution are generated by the same negative binomial process. The mixture logit and negative binomial model allows us to examine whether the mechanisms that determine individuals’ decisions about whether to have offspring are the same ones that determine how many offspring they have. In addition, separating zero fertility from positive fertility can improve the accuracy of the adjustment method we propose for retrospective data. We illustrate the latter point in the next section.

We estimate the mobility probabilities from a multinomial logit model or an ordered logit model, depending on whether the socioeconomic outcome is purely categorical or ordered. Taken together, the joint demographic and mobility model specifies the number of men in a given position in the fathers’ generation, the expected number of sons born to each man in that position, and the probability that a son with a father in that position will attain a specific position. Thus, the mobility probabilities in the model are estimated by giving an equal weight to each man in the sons’ generation. This implies that retrospective data would yield the same results as prospective data if the data are the second type of prospective data we described earlier, which include fathers and all their sons. By contrast, when the prospective data include only a single randomly selected son for each father (the first type of prospective data), we need to adjust the mobility estimates by weighting for differential fertility.

An Adjustment Method for Retrospective Fertility Data

When we use retrospective data to estimate fertility parameters in the joint demographic and mobility model in equation (2), the estimates suffer from two problems. First, men in the fathers’ generation who have multiple offspring are overrepresented in retrospective samples. Second, men in the fathers’ generation who have no offspring are omitted from retrospective samples. We refer to the bias in the estimates caused by these two problems as “retrospective sampling bias.” We propose a two-step method to adjust for the bias.

Step 1: Correction for Overrepresentation in the Parent Generation.

We rewrite the expected prospective fertility in the parent generation as the combination of two parts,

$$E(f_k \mid X_k) = P(f_k = 0 \mid X_k) \cdot 0 + \left(1 - P(f_k = 0 \mid X_k)\right) \cdot E(f_k \mid X_k, f_k > 0) \tag{8}$$

The first source of the bias comes from the biased estimate of $E(f_k \mid X_k, f_k > 0)$. We can correct the bias by the inverse probability weighting method (Horvitz and Thompson 1952), namely, weighting each respondent by the inverse of the respondent’s number of siblings plus one (the respondent himself), that is, for the kth individual, $w_k = 1/(sibs_k + 1)$. The relationship between the expected value of a variable measured in the weighted retrospective sample, $X'$, and the same variable measured in the original retrospective sample, $X$, is as follows:

$$E(X'_k) = \sum_{k=1}^{m} \frac{1}{m} \cdot X'_k = \frac{\sum_{k=1}^{n} w_k \cdot X_k}{\sum_{k=1}^{n} w_k} = \frac{\sum_{k=1}^{n} \frac{1}{sibs_k + 1} \cdot X_k}{\sum_{k=1}^{n} \frac{1}{sibs_k + 1}} \tag{9}$$

where $n$ refers to the original sample size of the retrospective sample, and $m$ is the weighted sample size ($m = \sum_{k=1}^{n} w_k$).
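The effect of the sibship weight can be seen in a small deterministic example. The Python sketch below builds a hypothetical set of six fathers, constructs the retrospective sample as one record per son, and compares unweighted and weighted means of father’s fertility. In this one-sex setting `sibs` counts brothers only; all numbers are invented for illustration.

```python
# Hypothetical fathers' generation: number of sons for each of six men.
sons_per_father = [0, 1, 1, 2, 3, 5]

# Retrospective sample: every son is a respondent who reports his father's
# fertility (X) and his own number of brothers (sibs). Childless men never appear.
records = [{"father_sons": s, "sibs": s - 1}
           for s in sons_per_father for _ in range(s)]

# Unweighted retrospective mean: fathers with many sons are counted many times.
unweighted_mean = sum(r["father_sons"] for r in records) / len(records)

# Weights w_k = 1 / (sibs_k + 1), so each father's sons jointly carry weight 1.
weights = [1.0 / (r["sibs"] + 1) for r in records]
m = sum(weights)  # weighted sample size
weighted_mean = sum(w * r["father_sons"] for w, r in zip(weights, records)) / m

# Prospective benchmark: mean fertility among fathers with at least one son.
positive = [s for s in sons_per_father if s > 0]
prospective_positive_mean = sum(positive) / len(positive)

print(unweighted_mean, weighted_mean, prospective_positive_mean)
```

The weighted sample size `m` equals the number of fathers with at least one son, and the weighted mean reproduces the prospective mean among those fathers. The childless father remains unrecoverable at this step, which is why a second correction is needed.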

After weighting the data based on equation (9), we can estimate E(fk|Xk,fk>0) from a truncated negative binomial model, that is,

$$E(f_k \mid X_k, f_k > 0) = \frac{\mu_k}{P(f_k > 0 \mid X_k)} = \frac{\mu_k}{1 - \left(1 + \frac{\mu_k}{\theta_k}\right)^{-\theta_k}} \tag{10}$$

where $\mu_k = \exp(X_k'\beta)$. In principle, this adjusted retrospective estimate of $f_k$ ($> 0$) should be the same as the prospective estimate from the truncated part in the mixture logit and negative binomial model.

Step 2: Correction for the Childlessness in the Parent Generation.

The second source of bias in the retrospective data comes from the missing information of P(fk=0|Xk) for the parent generation in retrospective data. Two feasible alternatives to approximate the childlessness probability for the parents include (1) relying on external data on fertility of individuals by social groups in the parent cohorts based on cross-sectional data such as the U.S. census and (2) relying on the proportion of childless adults in the offspring generation to approximate that for the parent generation. These two solutions both have drawbacks. The weakness of the first approximation is that because parents vary in their levels and timing of childbearing, individuals in the parent generation are not representative of any population living in a definite time frame in the past (Duncan 1966). Therefore, it may be difficult to recover fertility of the parent generation based on cohort fertility estimates. The weaknesses of the second method are that fertility levels and differentials may have changed across generations and that some individuals in the offspring generation may have not completed their childbearing by the time of the interview. Thus, our estimates of the childlessness probability in the offspring generation need to take account of individuals’ length of time exposure to reproductive ages.

In our empirical example, we rely on the second method to approximate the childlessness probability in the parent generation and show that, even though the childlessness probability has changed over time, the approximation still effectively reduces the retrospective sampling bias. We rely on a logit model to approximate the zero fertility in the parents’ generation, $P(f_k = 0 \mid X_k)$, by the zero fertility in the offspring generation, $P(f_k' = 0 \mid Z_k)$. When this approximation is accurate, the retrospective estimate of $P(f_k' = 0 \mid Z_k)$ should be close to the prospective estimate from the logit part of the mixture negative binomial model in equation (7).

Overall, the adjusted retrospective estimate of fertility in the fathers’ generation in equation (8) can be expressed as

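$$E(f_k \mid X_k) = \left(1 - P(f_k = 0 \mid X_k)\right) \cdot \frac{\mu_k}{1 - \left(1 + \frac{\mu_k}{\theta_k}\right)^{-\theta_k}} = \left(1 - \frac{1}{1 + \exp(Z_k'\gamma)}\right) \cdot \frac{\exp(X_k'\beta)}{1 - \left(1 + \frac{\exp(X_k'\beta)}{\theta_k}\right)^{-\theta_k}} \tag{11}$$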
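Once the two parts have been estimated, the adjusted expected fertility is a direct function of the two linear predictors and the dispersion parameter. The Python sketch below is one way to evaluate it; the function name and the input values are hypothetical, and `z_gamma` and `x_beta` stand for $Z_k'\gamma$ and $X_k'\beta$.

```python
import math

def adjusted_expected_fertility(z_gamma, x_beta, theta):
    """Expected fertility combining the logit and truncated NB parts."""
    p_zero = 1.0 / (1.0 + math.exp(z_gamma))   # approximated childlessness
    mu = math.exp(x_beta)                      # NB mean from the weighted data
    p0_nb = (1.0 + mu / theta) ** (-theta)     # NB probability of zero sons
    mean_given_positive = mu / (1.0 - p0_nb)   # truncated NB mean
    return (1.0 - p_zero) * mean_given_positive

# Hypothetical values, for illustration only.
theta = 1.5
x_beta = math.log(2.0)
estimate = adjusted_expected_fertility(z_gamma=1.2, x_beta=x_beta, theta=theta)

# Special case: if the logit part implies the same zero probability as the
# negative binomial itself, the expression collapses to the NB mean mu.
p0_nb = (1.0 + 2.0 / theta) ** (-theta)
z_same = math.log((1.0 - p0_nb) / p0_nb)
collapsed = adjusted_expected_fertility(z_same, x_beta, theta)
print(estimate, collapsed)
```

The special case mirrors the earlier remark that the mixture model reduces to a regular negative binomial model when zeros and positive counts come from the same process.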

To evaluate the performance of the adjustment method, we define the bias in the retrospective estimates of the joint demographic and mobility effect as

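$$B = \left(E(f_{\bar{Y}_{t-1}} \cdot p_{Y_t|\bar{Y}_{t-1}}) - E(f_{\bar{Y}'_{t-1}} \cdot p_{Y_t|\bar{Y}'_{t-1}})\right)_{p} - \left(E(f_{\bar{Y}_{t-1}} \cdot p_{Y_t|\bar{Y}_{t-1}}) - E(f_{\bar{Y}'_{t-1}} \cdot p_{Y_t|\bar{Y}'_{t-1}})\right)_{r} \tag{12}$$

where the subscripts $p$ and $r$ denote the prospective and retrospective estimates. The adjusted bias and the percent reduction in bias ($\Delta$) are defined accordingly:

$$B_{adj} = \left(E(f_{\bar{Y}_{t-1}} \cdot p_{Y_t|\bar{Y}_{t-1}}) - E(f_{\bar{Y}'_{t-1}} \cdot p_{Y_t|\bar{Y}'_{t-1}})\right)_{p} - \left(E(f_{\bar{Y}_{t-1}} \cdot p_{Y_t|\bar{Y}_{t-1}}) - E(f_{\bar{Y}'_{t-1}} \cdot p_{Y_t|\bar{Y}'_{t-1}})\right)_{adj} \tag{13}$$

$$\Delta = 100\,(1 - |\gamma|)\,\% \quad \text{where} \quad B_{adj} = \gamma B \tag{14}$$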
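To make the bias measures concrete, the Python sketch below applies them to the joint effects reported for the two-generation simulation later in the paper: 0.54 (prospective), 0.77 (unadjusted retrospective), and 0.59 (retrospective after weighting). The function name is hypothetical.

```python
def percent_bias_reduction(prospective, retrospective, adjusted):
    """Percent reduction in bias, Delta = 100 * (1 - |gamma|), with B_adj = gamma * B."""
    bias = prospective - retrospective        # B, equation (12)
    adjusted_bias = prospective - adjusted    # B_adj, equation (13)
    gamma = adjusted_bias / bias
    return 100.0 * (1.0 - abs(gamma))         # equation (14)

# Joint effects quoted in the simulation results for the two-generation model.
delta = percent_bias_reduction(prospective=0.54, retrospective=0.77, adjusted=0.59)
print(round(delta, 1))
```

The result is about 78 percent, in line with the "approximately 80%" reduction reported for the weighting step alone.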

Simulation Example

In this section, we simulate several prospective data sets under different fertility and mobility assumptions using Monte Carlo methods. We generate samples of data from a joint fertility-mobility model with known parameters. Given these simulated data we first obtain the prospective estimates of the joint demography and mobility model. Then we treat the data sets retrospectively, assuming that information from the childless group is missing, and that individuals with different numbers of offspring are disproportionately represented in the sample. This procedure shows the extent of the retrospective sampling bias in the joint demographic and mobility effect estimated from retrospective data. Our illustrations include a two-generation model, which only focuses on father-son associations, and a three-generation model, which takes both fathers and grandfathers into account. Models with four or more generations can be simulated in a similar fashion.

Data Generating Process

We generate three variables, F (number of sons), Y (socioeconomic position), and U (a random variable that summarizes personal attributes, such as ability, genetic endowment, and experiences) for each of 10,000 subjects in the initial generation. We first generate the personal attributes variable U for the initial generation, which is drawn from a standard normal distribution. We assume the socioeconomic position is a dichotomous variable with two categories {1=low, 2=high}, and for each subject, we draw the variable from a Bernoulli distribution with the mean conditional on the exogenous variable (U). Then we draw fertility (F) from a Poisson distribution whose mean is determined by the socioeconomic position variable (Y) of the concurrent generation. We then generate a dichotomous variable D, indicating whether the fertility is zero (D = 0) or positive (D = 1). Since we do not count daughters, we assume that the number of men without any offspring equals the number of men without sons. Once the variables for the initial generation are generated, all the subsequent generations can be generated by the fertility and mobility rules specified in the equations below.

We first simulate a two-generation data set, in which we assume a man’s fertility at the tth generation ($F_t$) depends on his socioeconomic position ($Y_t$), as shown in equation (15) below, and his socioeconomic position ($Y_t$) depends only on his father’s position ($Y_{t-1}$), given that his father has at least one son ($D_{t-1} = 1$) (equation (16)). We then simulate a three-generation data set, for which we assume that a man’s fertility ($F_t$) depends on the socioeconomic positions and fertility of all prior generations ($Y_{t-1}, \ldots, Y_1$ and $F_{t-1}, \ldots, F_1$) as well as his own socioeconomic position ($Y_t$) (equation (17)). Also, a man’s socioeconomic position ($Y_t$) depends on the socioeconomic positions of all prior generations ($Y_{t-1}, \ldots, Y_1$) and his father’s fertility ($F_{t-1}$), given that his father has at least one son ($D_{t-1} = 1$) (equation (18)). Appendix A gives a more detailed description of the simulation procedures. The equations for the models are as follows:

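Two-generation model:

$$E(F_2 \mid Y_2, D_1 = 1) = \exp\left(\beta_0 + \beta_1 (Y_2 - \bar{Y}_2)\right) = \exp\left(\log(1.1) + 0.6 \cdot (Y_2 - \bar{Y}_2)\right) \tag{15}$$

$$\text{logit}(P[Y_2 = 2 \mid U_2, Y_1, D_1 = 1]) = \delta_0 + \delta_1 \cdot U_2 + \delta_2 \cdot Y_1 = \log\left(\frac{0.2}{0.8}\right) + \log(2) \cdot U_2 + \log(2.5) \cdot Y_1 \tag{16}$$

Three-generation model:

$$E(F_3 \mid Y_3, Y_2, Y_1, F_2, F_1, D_1 = 1) = \exp\left(\zeta_0 + \zeta_1 (Y_3 - \bar{Y}_3) + \zeta_2 (Y_2 - \bar{Y}_2) + \zeta_3 (Y_1 - \bar{Y}_1) + \zeta_4 (F_2 - \bar{F}_2) + \zeta_5 (F_1 - \bar{F}_1)\right)$$
$$= \exp\left(\log(1.1) + 0.36 (Y_3 - \bar{Y}_3) + 0.20 (Y_2 - \bar{Y}_2) + 0.10 (Y_1 - \bar{Y}_1) + 0.10 (F_2 - \bar{F}_2) + 0.03 (F_1 - \bar{F}_1)\right) \tag{17}$$

$$\text{logit}(P[Y_3 = 2 \mid U_3, Y_2, U_2, F_2, Y_1, U_1, D_2 = 1]) = \lambda_0 + \lambda_1 U_3 + \lambda_2 Y_2 + \lambda_3 U_2 + \lambda_4 F_2 + \lambda_5 Y_1 + \lambda_6 U_1$$
$$= \log\left(\frac{0.15}{0.85}\right) + \log(1.8) \cdot U_3 + \log(2.0) \cdot Y_2 + \log(1.3) \cdot U_2 + \log(1.1) \cdot F_2 + \log(1.5) \cdot Y_1 + \log(1.1) \cdot U_1 \tag{18}$$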

In the two-generation prospective sample all the variables, F1, F2, D1, D2, Y1, Y2, are observed, whereas in the retrospective sample we only observe F1 > 0 (i.e., given D1 = 1), F2, D2, Y1 (given D1 = 1), and Y2. We use the proportion of childless adults in the sons’ generation (D2 = 0) to approximate that of the fathers’ generation (D1 = 0) in the adjusted retrospective method. Likewise, in the three-generation prospective sample all the variables F1, F2, F3, D1, D2, D3, Y1, Y2, and Y3 are observed, whereas in the retrospective sample we only observe F1 > 0 (given D1 = 1 and D2 = 1), F2 > 0 (given D2 = 1), F3, D3, Y1 (given D1 = 1 and D2 = 1), Y2 (given D2= 1), and Y3. We need to use the proportion of childless adults in the sons’ generation (D3 = 0) to approximate that of the fathers’ generation (D2 = 0) in the adjusted retrospective method.

We randomly generate 1,000 data sets and obtain Monte Carlo estimates of the joint demographic and mobility effect from the prospective data and from the unadjusted retrospective data. We then adjust the retrospective estimates by down-weighting the overrepresented fathers and approximating the number of childless adults in the fathers’ generation.

Simulation Results

We simulate a two-generation prospective data set for men only. The first column in Table 1 presents the prospective fertility results for men in the first generation and the predicted mobility probabilities for their sons based on the correct model that was used to generate the data. On average, a man in position 2 produces 0.54 more sons who are in position 2 than a man in position 1 does. Next we treat the data retrospectively by analyzing men in the second generation and their reported fathers’ positions and fertility. The results presented in model 1.2 show that the retrospective method overestimates the fertility of the fathers’ generation, for men in both high and low socioeconomic positions. As a result, the joint effect increases from 0.54 to 0.77: the retrospective model implies that not only do high status fathers produce more high status sons than low status fathers do, but also that the degree of their advantage is greater than in the correct model. In model 1.3, we adjust for the overrepresentation of sons in the retrospective models; that is, we weight each son by the inverse of his father’s fertility. In model 1.4, we approximate the proportion of childless men in the fathers’ generation by the proportion of childless men in the sons’ generation. The mobility estimates are the same across the four models, because, as discussed above, mobility probabilities estimated from retrospective and prospective data are the same when prospective data include all the sons of each father. After weighting the fertility of fathers who have multiple sons, the retrospective fertility estimates decrease from 2.66 to 2.05, and the joint effect decreases from 0.77 to 0.59, a reduction of approximately 80% of the bias in the retrospective estimate of the joint effect. After we also adjust for the proportion of men without any sons in the fathers’ generation, the estimates from model 1.4 become virtually equal to the prospective estimates.
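The size of the retrospective bias in the fertility estimates can be anticipated with a short size-biased-sampling calculation. The derivation below is a sketch that assumes fertility within a socioeconomic position is exactly Poisson with mean μ, as in the simulation design; the symbols μ and π0 are introduced here for illustration and are not the authors’ notation.

```latex
\begin{align*}
% Retrospective data: a father with F sons appears F times in the sons' sample
E_{\text{retro}}(F) &= \frac{E(F^2)}{E(F)} = \frac{\mu + \mu^2}{\mu} = 1 + \mu \\
% Weighting each son by 1/F counts every father with F > 0 exactly once
E(F \mid F > 0) &= \frac{\mu}{1 - e^{-\mu}} \\
% Restoring the childless share \pi_0 = P(F = 0) = e^{-\mu} recovers the prospective mean
(1 - \pi_0)\, E(F \mid F > 0) &= \mu
\end{align*}
```

With μ = 0.910 these expressions give 1.910, 1.523, and 0.910, which is in line with the fertility estimates for position 1 reported for models 1.2, 1.3, and 1.4 in Table 1. The last step is only approximate in the simulation, because the childless share is taken from the sons’ generation rather than from the fathers’ generation.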

Table 1.

Two-Generation Prospective Models and Unadjusted and Adjusted Retrospective Models based on Monte Carlo Simulation

| | 1.1 Prospective Method | 1.2 Retrospective Method | 1.3 Adjusted Retrospective Method (weighting only) | 1.4 Adjusted Retrospective Method (weighting + zero fertility) |
|---|---|---|---|---|
| Fertility $f_{G_1}$ | | | | |
| $f_1$ | 0.910 (0.012) | 1.909 (0.021) | 1.523 (0.012) | 0.911 (0.011) |
| $f_2$ | 1.656 (0.024) | 2.655 (0.034) | 2.047 (0.023) | 1.658 (0.023) |
| Mobility $p_{G_2 \mid G_1}$ * | | | | |
| $p_{2 \mid 1}$ | 0.207 (0.005) | 0.207 (0.005) | 0.207 (0.005) | 0.207 (0.005) |
| $p_{2 \mid 2}$ | 0.440 (0.007) | 0.440 (0.007) | 0.440 (0.007) | 0.440 (0.007) |
| The joint demographic and mobility effect $f_2\,p_{2 \mid 2} - f_1\,p_{2 \mid 1}$ | 0.541 (0.016) | 0.774 (0.026) | 0.586 (0.019) | 0.541 (0.016) |
| Bias (ref. model 1.1) | -- | 0.233 | 0.045 | 0.000 |

Note: Figures in parentheses are standard errors.

* The mobility estimates are the same across all the models, because the prospective and the retrospective data yield the same results.
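The joint effect row of Table 1 can be checked directly from the fertility and mobility rows. The snippet below is a minimal sketch: the numbers are the point estimates copied from Table 1, while the function and variable names are illustrative and not part of the original analysis. Because the inputs are rounded to three decimals, the results agree with the reported joint effects only up to rounding.

```python
# Point estimates (f1, f2) copied from Table 1; mobility is identical across models.
FERTILITY = {
    "1.1 prospective": (0.910, 1.656),
    "1.2 retrospective": (1.909, 2.655),
    "1.3 weighted": (1.523, 2.047),
    "1.4 weighted + zero fertility": (0.911, 1.658),
}
P2_GIVEN_1, P2_GIVEN_2 = 0.207, 0.440


def joint_effect(f1, f2, p21=P2_GIVEN_1, p22=P2_GIVEN_2):
    """Expected number of position-2 sons of a position-2 father minus
    that of a position-1 father: f2 * p(2|2) - f1 * p(2|1)."""
    return f2 * p22 - f1 * p21


effects = {name: joint_effect(f1, f2) for name, (f1, f2) in FERTILITY.items()}
reference = effects["1.1 prospective"]
for name, value in effects.items():
    # Bias is measured against the prospective estimate, as in Table 1.
    print(f"{name:32s} joint effect = {value:.3f}  bias = {value - reference:+.3f}")
```

The same function applies to Table 2 if the fertility and mobility values for the {1,1} and {2,2} lineages are substituted.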

Next, we simulate a three-generation data set with lagged effects of grandfathers on father’s fertility and son’s mobility. The results in Table 2 suggest similar patterns to those shown in Table 1, except that the joint demographic and mobility effects are greater than in the two-generation model. The weighting step reduces the bias in the retrospective estimates of the joint effect by 72.1%. The adjustment for the childless population accounts for the remaining 27.9% of the bias. On average, a father in position 2 has 0.75 more sons who achieve position 2 than a father in position 1. The retrospective estimates are again very close to the prospective estimates after we adjust for the overrepresentation of high fertility fathers and the omission of men without sons.

Table 2.

Three-Generation Prospective Models and Unadjusted and Adjusted Retrospective Models based on Monte Carlo Simulation

| | 2.1 Prospective Method | 2.2 Retrospective Method | 2.3 Adjusted Retrospective Method (weighting only) | 2.4 Adjusted Retrospective Method (weighting + zero fertility) |
|---|---|---|---|---|
| Fertility $f_{\{G_2,G_1\}}$ | | | | |
| $f_{\{1,1\}}$ | 0.844 (0.014) | 1.851 (0.025) | 1.484 (0.015) | 0.845 (0.014) |
| $f_{\{2,2\}}$ | 1.680 (0.026) | 2.711 (0.040) | 2.077 (0.025) | 1.677 (0.026) |
| Mobility $p_{G_3 \mid \{G_2,G_1\}}$ * | | | | |
| $p_{2 \mid \{1,1\}}$ | 0.162 (0.006) | 0.162 (0.006) | 0.162 (0.006) | 0.162 (0.006) |
| $p_{2 \mid \{2,2\}}$ | 0.530 (0.008) | 0.530 (0.008) | 0.530 (0.008) | 0.530 (0.008) |
| The joint demographic and mobility effect $f_{\{2,2\}}\,p_{2 \mid \{2,2\}} - f_{\{1,1\}}\,p_{2 \mid \{1,1\}}$ | 0.754 (0.021) | 1.138 (0.034) | 0.861 (0.024) | 0.753 (0.021) |
| Bias (ref. model 2.1) | -- | 0.384 | 0.107 | −0.001 |

Note: Figures in parentheses are standard errors.

* The mobility estimates are the same across all the models, because the prospective and the retrospective data yield the same results.

Empirical Example

Data

The simulated example shows the theoretical performance of the adjustment method when the underlying probabilities of positive fertility and childlessness are fixed over generations. To illustrate the adjustment method for real data when the true parameters are unknown, we apply it to a retrospective sample constructed from prospective data, the Panel Study of Income Dynamics (PSID 1968–2009). The PSID began in 1968 with a household sample of more than 18,000 Americans from roughly 5,000 families. Original panel members were followed prospectively each year through 1997 and biennially thereafter. The study follows targeted respondents according to a genealogical design. All household members recruited into the PSID in 1968 carry the PSID “gene” and are targeted for collection of detailed socioeconomic information. Members of new households created by the offspring of original targeted household heads retain the PSID “gene” themselves and become permanent PSID respondents. Original panel members are asked questions about the social and economic circumstances of their families of origin. As those original panel members’ children grow older, the PSID also includes information about the social and economic circumstances of multiple generations within families. The data have been widely used in intergenerational mobility studies (e.g., Corcoran et al. 1992; Smeeding, Jäntti and Erikson 2011; Solon 1992; Torche 2011). The design of the PSID is similar to the second type of prospective data that we discussed above, which includes information on respondents and all their offspring.

We construct our multigenerational sample through the PSID Family Identification Mapping System (FIMS). The FIMS sample links the PSID respondents with their parents and grandparents who are also PSID sample members. We then merge the person ID in the FIMS sample with the yearly individual files, and keep only the latest available fertility and educational information for all the individuals. We restrict our sample to men in the fathers’ generation who were born between 1930 and 1950, so that we can get reasonable retrospective estimates for the sons, since they have reached adulthood by the last wave of the survey in 2009.

We estimate the joint demographic and mobility model with respect to educational attainment, which is transformed from the variable “years of education” into an ordinal variable with four levels: 1 (0–11 years of schooling), 2 (12 years of schooling), 3 (13–15 years of schooling), 4 (16+ years of schooling). We rely on the question about an individual’s number of live births to estimate his fertility. Because the question does not identify the sex of births, we estimate the proportion of men without any offspring, rather than men without sons. We obtain the predicted number of sons in the joint demographic and mobility model by dividing the predicted number of offspring by 2, assuming that the sex ratio is 1. In the three-generation model, we assume a man’s educational attainment depends on both his father’s and grandfather’s attainments, while his fertility depends on his own and his father’s educational attainment.
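The two data transformations described above can be written compactly. This is a minimal sketch, assuming years of schooling are recorded as whole numbers; the function names are hypothetical and do not correspond to PSID variable names.

```python
def schooling_level(years):
    """Map completed years of schooling to the four ordinal levels used in the text:
    1 (0-11 years), 2 (12 years), 3 (13-15 years), 4 (16+ years)."""
    if years < 0:
        raise ValueError("years of schooling cannot be negative")
    if years <= 11:
        return 1
    if years == 12:
        return 2
    if years <= 15:
        return 3
    return 4


def predicted_sons(predicted_offspring):
    """Convert a predicted number of offspring to a predicted number of sons,
    assuming a sex ratio at birth of 1 (half of all births are sons)."""
    return predicted_offspring / 2.0


print([schooling_level(y) for y in (8, 11, 12, 14, 16)])
print(predicted_sons(3.0))
```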

We first estimate the prospective joint demographic and mobility effect from the PSID sample. Then we treat the data retrospectively, which provides information about each son’s education, his numbers of siblings and offspring, and his father’s and grandfather’s education. As we discussed above, because the retrospective sample omits childless men in the fathers’ generation, we rely either on the proportion of childless adults in the sons’ generation to approximate that for the fathers’ generation, or on childlessness information from external sources.

If we rely on the childlessness probability from the sons’ generation, the PSID sample of sons suffers from a right-censoring problem because many adults in the sons’ generation may not have finished childbearing. This problem also exists in most retrospective data, as cross-sectional surveys normally sample adult men from age 18 to 69, in proportion to the age distribution of the adult population. To estimate the childlessness probability of these adults at the end of their reproductive span, we need to adjust for individuals’ censored exposure time during reproductive ages, for example by controlling for the respondent’s age (e.g., Kalbfleisch and Prentice 2002: 334). The sons’ sample also suffers from a truncation problem because the PSID has not been running long enough to include sons who are aged 60 or above (the age by which we assume that men have finished their reproduction). This truncation problem may affect the precision of our adjustment for the right-censoring problem, but it is specific to the PSID sample rather than a feature of retrospective samples in general. In the example below, we use the childlessness estimates from the entire PSID data set, not only from the sample of the sons’ generation, to address the truncation problem.

Empirical Results

Table 3 reports results from the prospective and retrospective fertility and mobility models. In model 3.1, we estimate a negative binomial model for the fertility of men in the fathers’ generation. A test that we do not report here suggests that the negative binomial model fits the data better than the Poisson model, because of over-dispersion of the fertility distribution. The education coefficients of the sons and fathers show a clear negative educational gradient in fertility; that is, highly educated sons, especially those whose fathers are also highly educated, tend to have fewer offspring. In model 3.2, we differentiate between childless men and those with at least one child and estimate the level of fertility with a mixture logit and negative binomial regression model. Positive coefficients from the logit regression imply higher odds of having at least one child vs. being childless. Coefficients from the truncated negative binomial regression represent effects on the total number of offspring, conditional on having at least one child. A man’s own education has no impact on whether he has offspring or not, but it strongly reduces his total number of offspring given that he has at least one child. By contrast, his father’s education affects both his chance of having any offspring and his total number of offspring.

Table 3.

Three-Generation Fertility and Mobility Models for Men Born Between 1930 and 1950 and Their Offspring, PSID

| | Prospective fertility: 3.1 Number of children (NegBin) | Prospective fertility: 3.2 Mixture NegBin model, logit regression (fF > 0 vs. 0) | Prospective fertility: 3.2 Mixture NegBin model, truncated NB (fF > 0) | Retrospective fertility: 3.3 Offspring’s number of siblings + 1, truncated NB (fF > 0) | Retrospective fertility: 3.3 Offspring’s number of siblings + 1, truncated NB after weighting | Retrospective fertility: 3.4 Zero fertility, logit regression ((fS > 0) = 1) | Mobility: 3.5 Grandfather-father-son, ordered logit regression |
|---|---|---|---|---|---|---|---|
| Individuals’ schooling | | | | | | | |
| 0–8/9–11 | - | - | - | - | - | - | - |
| 12 | - | - | - | - | - | −0.183** (0.064) | - |
| 13–15 | - | - | - | - | - | −0.429*** (0.071) | - |
| 16+ | - | - | - | - | - | −0.461*** (0.078) | - |
| Schooling of men born 1930–1950 | | | | | | | |
| 0–8/9–11 | - | - | - | - | - | - | - |
| 12 | −0.169*** (0.034) | 0.140 (0.188) | −0.213*** (0.035) | −0.201*** (0.019) | −0.223*** (0.025) | - | 0.865*** (0.107) |
| 13–15 | −0.200*** (0.040) | 0.010 (0.209) | −0.238*** (0.043) | −0.294*** (0.024) | −0.303*** (0.030) | - | 1.596*** (0.135) |
| 16+ | −0.335*** (0.042) | −0.170 (0.203) | −0.388*** (0.046) | −0.486*** (0.026) | −0.409*** (0.029) | - | 2.603*** (0.137) |
| Schooling of the fathers | | | | | | | |
| 0–8/9–11 | - | - | - | - | - | - | - |
| 12 | −0.188*** (0.037) | −0.660*** (0.160) | −0.140*** (0.040) | −0.114*** (0.023) | −0.144*** (0.029) | 0.028 (0.054) | 0.147 (0.113) |
| 13–15 | −0.275*** (0.067) | −0.502 (0.267) | −0.300*** (0.077) | −0.283*** (0.045) | −0.229*** (0.044) | −0.034 (0.073) | 0.132 (0.198) |
| 16+ | −0.178*** (0.060) | −0.638** (0.237) | −0.135* (0.067) | −0.062 (0.037) | −0.104* (0.411) | −0.172* (0.072) | 0.148 (0.178) |
| Age group | | | | | | | |
| 20–29 | - | - | - | - | - | - | - |
| 30–39 | - | - | - | - | - | 1.045*** (0.066) | 0.035 (0.155) |
| 40–49 | - | - | - | - | - | 1.639*** (0.066) | −0.038 (0.155) |
| 50–59 | - | - | - | - | - | 2.273*** (0.073) | 0.223 (0.185) |
| 60+ | - | - | - | - | - | 2.732*** (0.080) | - |
| Intercept | 1.218*** (0.024) | 2.384*** (0.138) | 1.271*** (0.246) | 1.554*** (0.012) | 1.270*** (0.018) | −0.412*** (0.076) | - |
| Cut Point 1 | - | - | - | - | - | - | −1.044 (0.168) |
| Cut Point 2 | - | - | - | - | - | - | 1.390 (0.169) |
| Cut Point 3 | - | - | - | - | - | - | 2.881 (0.177) |
| N | 2,526 | 2,526 | | 5,201 | 5,201 | 13,092 | 2,141 |
| Log-likelihood | −4874.82 | −4858.85 | | −10225.22 | −3053.73 | −6791.63 | −2514.11 |
| log(theta) | 2.610 | | 3.244 | 3.978 | 4.239 | | |

Data source: The Panel Study of Income Dynamics 1968–2009.

Note: *** p < 0.001; ** p < 0.01; * p < 0.05. The parameter theta refers to the dispersion parameter in the negative binomial distribution. Only three cases in the mobility sample are aged above 60, so we exclude them from the analysis.

In the retrospective sample, we approximate the fertility of men in the fathers’ generation by the number of siblings of men in the sons’ generation. In model 3.3 we show the results both with and without weighting men in the sons’ generation by the inverse of one plus their number of siblings. Comparing the coefficients with those in the prospective, truncated negative binomial results in model 3.2, we see that education coefficients in neither the weighted nor the unweighted models are very similar to those in the prospective model, but the intercept estimate from the weighted method becomes very close to that in the prospective model. Because we need to use the intercept to estimate the fertility of fathers, a good approximation of the intercept in the truncated model plays a large part in the effectiveness of our adjustment method.

In model 3.4, we approximate the probability of childlessness in mixture model 3.2 with a logit regression that makes use of fertility information from all PSID respondents. We control for individuals’ age groups in the model to adjust for their censored exposure times. With this adjustment, we can estimate the childlessness probability at the end of each individual’s reproductive span. A comparison between models 3.4 and 3.2 shows that the educational coefficients in the two models differ from each other, meaning that using the childlessness probability from the whole PSID sample does not provide a good approximation for the childlessness probability in the fathers’ generation. However, in Table 4 below we show that the discrepancy between the two childlessness probabilities has little impact on the effectiveness of our adjustment method.
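To make the age adjustment concrete, the sketch below converts the model 3.4 logit coefficients in Table 3 into predicted childlessness probabilities for the age group 60+, the group for which Table 4 reports them. The coefficients are copied from Table 3; the function name and dictionary layout are illustrative. The outcome of model 3.4 is having at least one child, so childlessness is one minus the predicted probability.

```python
import math

# Logit coefficients for having at least one child, copied from model 3.4 in Table 3.
INTERCEPT = -0.412
AGE_60_PLUS = 2.732  # end of the assumed reproductive span
OWN_SCHOOLING = {1: 0.0, 2: -0.183, 3: -0.429, 4: -0.461}
FATHER_SCHOOLING = {1: 0.0, 2: 0.028, 3: -0.034, 4: -0.172}


def childless_probability(own, father):
    """Predicted probability of remaining childless at age 60+ for a man with
    schooling level `own` whose father has schooling level `father`."""
    eta = INTERCEPT + AGE_60_PLUS + OWN_SCHOOLING[own] + FATHER_SCHOOLING[father]
    p_any_child = 1.0 / (1.0 + math.exp(-eta))
    return 1.0 - p_any_child


for own, father in [(1, 1), (1, 4), (4, 1), (4, 4)]:
    print((own, father), round(childless_probability(own, father), 3))
```

Up to rounding of the published coefficients, these values agree with the binary logit childlessness probabilities reported in Table 4 (0.090, 0.105, 0.135, and 0.156).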

Table 4.

Three-Generation Prospective Models and Unadjusted and Adjusted Retrospective Models, PSID

| | (1) Prospective Approach: Negative Binomial | (1) Prospective Approach: Mixture Negative Binomial | (2) Retrospective Approach: Truncated NB | (2) Retrospective Approach: Weighted Truncated NB | (2) Retrospective Approach: Weighted Truncated NB + Zero Fertility Adjusted |
|---|---|---|---|---|---|
| Fertility $f_{\{G_2,G_1\}}$ | | | | | |
| $f_{\{1,1\}}$ | 1.691 (1.012) | 1.692 (0.036) | 2.366 (0.006) | 1.781 (0.009) | 1.657 (0.043) |
| $f_{\{1,4\}}$ | 1.415 (0.032) | 1.400 (0.080) | 2.224 (0.019) | 1.605 (0.022) | 1.482 (0.098) |
| $f_{\{4,1\}}$ | 1.209 (0.018) | 1.211 (0.037) | 1.455 (0.012) | 1.183 (0.012) | 1.107 (0.039) |
| $f_{\{4,4\}}$ | 1.012 (0.028) | 1.008 (0.043) | 1.368 (0.019) | 1.067 (0.019) | 0.990 (0.052) |
| Mobility $p_{G_3 \mid \{G_2,G_1\}}$, Ordered Logit* | | | | | |
| $p_{4 \mid \{1,1\}}$ | 0.054 (0.005) | 0.054 (0.005) | 0.054 (0.005) | 0.054 (0.005) | 0.054 (0.005) |
| $p_{4 \mid \{1,4\}}$ | 0.063 (0.012) | 0.063 (0.012) | 0.063 (0.012) | 0.063 (0.012) | 0.063 (0.012) |
| $p_{4 \mid \{4,1\}}$ | 0.437 (0.026) | 0.437 (0.026) | 0.437 (0.026) | 0.437 (0.026) | 0.437 (0.026) |
| $p_{4 \mid \{4,4\}}$ | 0.473 (0.043) | 0.473 (0.043) | 0.473 (0.043) | 0.473 (0.043) | 0.473 (0.043) |
| Probability of being childless $\pi_{\{G_2,G_1\}}$ | | Binary Logit from Mixture NB | | | Binary Logit |
| $\pi_{\{1,1\}}$ | | 0.084 (0.138) | | | 0.090 (0.069) |
| $\pi_{\{1,4\}}$ | | 0.149 (0.266) | | | 0.105 (0.101) |
| $\pi_{\{4,1\}}$ | | 0.098 (0.161) | | | 0.135 (0.077) |
| $\pi_{\{4,4\}}$ | | 0.171 (0.206) | | | 0.156 (0.083) |
| The joint demographic and mobility effect $f_{\{4,4\}}\,p_{4 \mid \{4,4\}} - f_{\{1,1\}}\,p_{4 \mid \{1,1\}}$ | 0.387 (0.073) | 0.385 (0.611) | 0.519 (0.068) | 0.409 (0.091) | 0.379 (0.070) |
| Bias (ref. model 4.2) | 0.002 | -- | 0.134 | 0.024 | −0.006 |

Note: The bias equals the difference of the joint effect between the mixture prospective estimate in model 4.2 and the estimate from an alternative model. The predicted probability of mobility is estimated for the mean age group. The predicted childlessness probability is estimated for the age group 60+. Standard errors of the predicted fertility, the mobility probabilities, and the joint demographic and mobility effect are estimated by the delta method using the mean estimates of f and p, and the variance-covariance matrix of these estimates.

* The mobility estimates from the ordered logit regression are the same across all the models. The prospective and the retrospective data yield the same results.
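The delta-method standard error mentioned in the note to Table 4 can be sketched as follows for the joint effect. The function below is a generic illustration, not the authors’ code: it takes the four estimates and their 4x4 variance-covariance matrix, which the paper does not report. The usage example sets all covariances to zero purely for illustration, so its result need not match the standard errors in Table 4.

```python
import numpy as np


def joint_effect_se(f_hi, p_hi, f_lo, p_lo, cov):
    """Delta-method standard error of g = f_hi * p_hi - f_lo * p_lo.

    `cov` is the 4x4 variance-covariance matrix of the estimates,
    ordered as (f_hi, p_hi, f_lo, p_lo)."""
    # Gradient of g with respect to (f_hi, p_hi, f_lo, p_lo).
    gradient = np.array([p_hi, f_hi, -p_lo, -f_lo], dtype=float)
    cov = np.asarray(cov, dtype=float)
    return float(np.sqrt(gradient @ cov @ gradient))


# Illustration only: standard errors from the last column of Table 4,
# with covariances assumed to be zero (an assumption not made in the paper).
standard_errors = np.array([0.052, 0.043, 0.043, 0.005])
se = joint_effect_se(0.990, 0.473, 1.657, 0.054, np.diag(standard_errors ** 2))
print(round(se, 4))
```

Because fertility and mobility estimates from the same sample are generally correlated, the off-diagonal terms matter in practice, which is why the full variance-covariance matrix is needed.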

In model 3.5, we estimate the educational mobility for men in the sons’ generation by their fathers’ and grandfathers’ education, and their age group using an ordered logit model. The results show that, for our sample of PSID families, a son’s level of education does not depend on his grandfather’s education when his father’s education is taken into account. Again, the mobility model is the same for the prospective and retrospective methods, because the PSID follows all sons of a father and we estimate the mobility probabilities by giving an equal weight to each man in the sons’ generation.

Based on the model estimates in Table 3, we calculate the level of total fertility for men (from which we estimate the number of sons), their probability of having any offspring, and the mobility probability for their sons. The calculations are reported in Table 4. The first panel shows the average number of sons by education of men and their fathers across different models. The second panel shows the mobility probability, which is the same across all the models. The third panel presents the probability of childlessness based on the mixture model for the prospective data and the binary logit model for the retrospective data. The fertility estimates from the negative binomial model are very close to those from the mixture model. For example, in both models, the average number of sons is roughly 1.7 for men in the first educational group whose fathers are also in this group. The childlessness proportion for these men is roughly 8%. The retrospective estimates are shown in the last three columns. Model estimates from the truncated negative binomial model in the third column present naïve estimates without any adjustment. The last column presents the results from our final adjustment method. The unadjusted estimate suggests that the average number of sons is roughly 2.4 for men who along with their fathers are both in the first educational group. This estimate declines to 1.8 after we adjust for the overrepresentation of fathers with more offspring, and further declines to 1.7 after we adjust for the omission of the childless group.
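The step from the mixture model coefficients in Table 3 to the predicted number of sons in Table 4 can be sketched as below. The coefficients are copied from model 3.2. Two points are assumptions of this sketch rather than statements from the paper: the negative binomial is parameterized with variance μ + μ²/θ, so that P(0) = (θ/(θ + μ))^θ, and the subscript {G2, G1} is read as (man’s own schooling, his father’s schooling). Names are illustrative.

```python
import math

# Coefficients copied from model 3.2 in Table 3 (level 1 is the reference category).
LOGIT = {"intercept": 2.384,
         "own": {1: 0.0, 2: 0.140, 3: 0.010, 4: -0.170},
         "father": {1: 0.0, 2: -0.660, 3: -0.502, 4: -0.638}}
COUNT = {"intercept": 1.271,
         "own": {1: 0.0, 2: -0.213, 3: -0.238, 4: -0.388},
         "father": {1: 0.0, 2: -0.140, 3: -0.300, 4: -0.135}}
THETA = math.exp(3.244)  # dispersion, from the log(theta) row of Table 3


def expected_sons(own, father):
    """Expected number of sons under the logit + truncated NB mixture."""
    eta = LOGIT["intercept"] + LOGIT["own"][own] + LOGIT["father"][father]
    p_any_child = 1.0 / (1.0 + math.exp(-eta))
    mu = math.exp(COUNT["intercept"] + COUNT["own"][own] + COUNT["father"][father])
    p_zero = (THETA / (THETA + mu)) ** THETA      # NB probability of zero births
    offspring = p_any_child * mu / (1.0 - p_zero)  # mean of the zero-truncated NB, scaled
    return offspring / 2.0                         # sex ratio assumed to be 1


for own, father in [(1, 1), (1, 4), (4, 1), (4, 4)]:
    print((own, father), round(expected_sons(own, father), 3))
```

Under these assumptions the results agree, up to rounding of the published coefficients, with the mixture negative binomial fertility estimates in Table 4 (1.692, 1.400, 1.211, and 1.008).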

We report the joint demographic and mobility effect in the fourth panel, and compare our preferred model in column 2 with alternative models in columns 1, 3, 4, and 5. Here we present the joint effect based on one pair of father’s and grandfather’s characteristics—namely, both in the top educational group or both in the bottom group—out of many possibilities. The results for the preferred model show that families with both the father and grandfather in the top educational group produce 0.39 more offspring in the top educational group than families in which both the father and grandfather are in the bottom educational group, despite the lower fertility of more educated fathers and grandfathers. The effect is 0.52 for the simple retrospective model, 0.41 for the weighted retrospective model without adjusting for the childlessness probability, and 0.38 for our final adjustment model. Compared to the bias in the unadjusted retrospective model in column 3, our weighting adjustment model in column 4 eliminates 82.1% of the bias, and the final model in column 5 eliminates 95.5% of the bias based on equation ().
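The percentages of bias eliminated follow from the bias row of Table 4. The sketch below assumes the share is defined as one minus the ratio of absolute biases; this definition is an assumption that reproduces the reported figures, and the function name is illustrative.

```python
def share_of_bias_removed(bias_unadjusted, bias_adjusted):
    """Share of the unadjusted retrospective bias removed by an adjustment,
    assumed here to be 1 - |adjusted bias| / |unadjusted bias|."""
    if bias_unadjusted == 0:
        raise ValueError("unadjusted bias must be nonzero")
    return 1.0 - abs(bias_adjusted) / abs(bias_unadjusted)


# Bias values copied from Table 4 (reference: mixture prospective model).
weighting_only = share_of_bias_removed(0.134, 0.024)
final_model = share_of_bias_removed(0.134, -0.006)
print(f"{weighting_only:.1%}, {final_model:.1%}")  # prints "82.1%, 95.5%"
```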

Conclusion

In general, prospective, longitudinal data are superior to retrospective, cross-sectional data, because they include complete fertility information for the early generations of interviewed families, which allows researchers to examine the joint effects of net fertility and intergenerational transmission in the reproduction of inequality. Although some recent studies provide prospective data for more than two generations and allow analyses of multigenerational social mobility, such data remain scarce. Thus it remains desirable to exploit traditional retrospective survey data for estimating joint models of social mobility and demographic processes. However, analyses based on retrospective data may suffer from retrospective sampling bias because of the omission of the childless population in early generations, as well as the overrepresentation of persons in earlier generations who have more offspring and descendants. We have shown that the differences between these two approaches with respect to odds ratios in intergenerational mobility tables depend on how prospective data are collected and on the interaction between fertility and social mobility. When prospective data are obtained for all offspring of each individual, these data yield the same estimates as those based on retrospective data. However, if the prospective data only contain information on a single member of the offspring generation, the odds ratios estimated from retrospective and prospective data may differ. The inconsistency between the two approaches depends on the three-way association among parents’ fertility, parents’ status, and offspring’s status. In the presence of the three-way interaction, we propose a weighting method shown in equation (1) to convert the prospective odds ratio to the traditional, retrospective odds ratio.

Retrospective sampling bias becomes a more salient problem when researchers are interested in not only the mobility table, but also the persistence or changes in socioeconomic distributions across generations. Demographic pathways, such as marriage, assortative mating, geographic mobility, and fertility, modify the extent to which inequality in one generation is reproduced in the next. Traditional mobility tables focus on the inequality between parents and offspring restricted to those who have offspring, but the process of producing offspring itself also involves social inequality associated with parents’ socioeconomic status. Based on the joint demographic and mobility model proposed by Mare and Maralani (2006), we provide estimates of the joint demographic and mobility effect in two-generation and multigenerational mobility examples and propose a feasible and effective adjustment method for obtaining the prospective estimates using retrospective data.

A Monte Carlo study comparing the prospective approach with the adjusted and unadjusted retrospective approach shows that the adjustment method removes almost all the difference between the prospective and the biased retrospective estimates. Specifically, the weighting method removes more than 70 percent of the bias, while the remaining bias is eliminated by accounting for childlessness.

Our illustrative analyses of the PSID show how to adjust retrospective mobility data using prospective data and to estimate the joint demographic and mobility model using a mixture logit and negative binomial model. The results suggest that overall the adjustment method removes more than 95% of the bias in the retrospective estimates. The methods proposed in this paper are potentially applicable to a wide range of models that include a broader variety of demographic processes and socioeconomic outcomes than those presented here. Compared to the retrospective approach in traditional social mobility studies, the prospective approach provides a broader view of intergenerational inequality. We show in this paper what kind of new knowledge a forward-looking conceptual framework can offer to social mobility studies and how such a prospective approach can be achieved with limited information from retrospective data.

Acknowledgments

We are grateful to Sung Park, William Rosales, Judith Seltzer, Ying Nian Wu, Yu Xie, Hua Ye, and three anonymous reviewers, for their valuable suggestions; to Dwight Davis and Benjamin Jarvis, for insightful comments and editorial assistance on early drafts. An earlier version of this paper was presented at the meeting of ISA Research Committee on Social Stratification (RC28), Hong Kong, May 10–13, 2012. This research was supported by the National Science Foundation (SES-1260456). The authors benefited from facilities and resources provided by the California Center for Population Research at UCLA (CCPR), which receives core support (R24-HD041022) from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD).

Appendix A: Simulation Details

This appendix provides the details for the simulation examples. For the two-generation model, we assume that a man’s fertility depends on his own socioeconomic position, and that his socioeconomic position depends only on his father’s position. There is no lagged effect from the grandfather in either the fertility or the mobility equation. The data are generated in the following order according to the specified probability models:

  • 1.1. The exogenous variable U1 for the fathers’ generation is drawn from a standard normal distribution, U1 ~ N(0, 1).

  • 1.2. We then generate the father’s position Y1 (1 vs. 2) for each of the 10,000 subjects:

    $$\mathrm{logit}\bigl(P[Y_1 = 2 \mid U_1]\bigr) = \alpha_0 + \alpha_1 U_1 = \log\left(\frac{0.3}{0.7}\right) + \log(2)\,U_1 \tag{A-1}$$

  • 1.3. The conditional distribution of a father’s fertility F1 given Y1 is a Poisson distribution whose mean satisfies the equation below.

    $$E(F_1 \mid Y_1) = \exp\bigl(\beta_0 + \beta_1 (Y_1 - \bar{Y}_1)\bigr) = \exp\bigl(\log(1.1) + 0.6\,(Y_1 - \bar{Y}_1)\bigr) \tag{A-2}$$

    We then generate a dichotomous variable D1 based on F1, where D1 = 1 if F1 > 0 and D1 = 0 if F1 = 0.

  • 1.4. A son’s variable U2, given U1 and D1, is drawn from a normal distribution whose standard deviation is fixed at 1 and whose mean satisfies the equation below.

    $$E(U_2 \mid U_1, D_1 = 1) = \gamma_1 (U_1 - \bar{U}_1) = 0.8\,(U_1 - \bar{U}_1) \tag{A-3}$$

  • 1.5. The conditional distribution of a son’s socioeconomic position Y2 given U2, D1, and Y1 is a Bernoulli distribution.

    $$\mathrm{logit}\bigl(P[Y_2 = 2 \mid U_2, Y_1, D_1 = 1]\bigr) = \delta_0 + \delta_1 U_2 + \delta_2 Y_1 = \log\left(\frac{0.2}{0.8}\right) + \log(2)\,U_2 + \log(2.5)\,Y_1 \tag{A-4}$$

  • 1.6. The conditional distribution of a son’s fertility F2 given Y2 and D1 is a Poisson distribution whose mean satisfies the equation below.

    $$E(F_2 \mid Y_2, D_1 = 1) = \exp\bigl(\beta_0 + \beta_1 (Y_2 - \bar{Y}_2)\bigr) = \exp\bigl(\log(1.1) + 0.6\,(Y_2 - \bar{Y}_2)\bigr) \tag{A-5}$$

    We then generate a dichotomous variable D2 based on F2, where D2 = 1 if F2 > 0 and D2 = 0 if F2 = 0.

In the prospective sample, all the variables F1, F2, D1, D2, Y1, Y2 are observed, while in the retrospective sample, we only know F1 > 0 (i.e., D1 = 1), F2, D2, Y1, Y2. We need to use the proportion of childless adults in the sons’ generation (D2 = 0) to approximate that of the fathers’ generation (D1 = 0) in the adjusted retrospective method.
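Steps 1.1 to 1.6 and the two adjustments can be sketched in a few lines of NumPy. This is an illustrative single-replication sketch, not the authors’ simulation code. It makes three assumptions that the appendix does not state: the sample means used for centering are taken over all individuals in the generation being generated; the position enters the mobility logit as an indicator for position 2 (a 1/2 coding would imply noticeably higher mobility probabilities than those in Table 1); and the childless share of the sons’ generation is computed separately by position. Function names are hypothetical.

```python
import numpy as np


def expit(x):
    return 1.0 / (1.0 + np.exp(-x))


def simulate_two_generations(n_fathers=10_000, seed=0):
    rng = np.random.default_rng(seed)
    u1 = rng.standard_normal(n_fathers)                                  # step 1.1
    y1 = 1 + rng.binomial(1, expit(np.log(0.3 / 0.7) + np.log(2) * u1))  # step 1.2
    f1 = rng.poisson(np.exp(np.log(1.1) + 0.6 * (y1 - y1.mean())))       # step 1.3
    father = np.repeat(np.arange(n_fathers), f1)   # one row per son; childless fathers drop out
    u2 = rng.normal(0.8 * (u1[father] - u1.mean()), 1.0)                 # step 1.4
    # Step 1.5: father's position coded as an indicator for position 2 (assumption).
    eta = np.log(0.2 / 0.8) + np.log(2) * u2 + np.log(2.5) * (y1[father] == 2)
    y2 = 1 + rng.binomial(1, expit(eta))
    f2 = rng.poisson(np.exp(np.log(1.1) + 0.6 * (y2 - y2.mean())))       # step 1.6
    return y1, f1, father, y2, f2


def fertility_by_method(y1, f1, father, y2, f2, position):
    """Mean fertility of fathers in `position` under the four approaches."""
    prospective = f1[y1 == position].mean()            # all fathers, including childless
    reported = f1[father][y1[father] == position]      # father's fertility as seen by each son
    retrospective = reported.mean()                    # large families are overrepresented
    weights = 1.0 / reported                           # inverse of father's fertility
    weighted = np.sum(weights * reported) / np.sum(weights)
    childless_share = np.mean(f2[y2 == position] == 0)  # borrowed from the sons' generation
    adjusted = weighted * (1.0 - childless_share)
    return prospective, retrospective, weighted, adjusted


data = simulate_two_generations()
for position in (1, 2):
    print(position, np.round(fertility_by_method(*data, position), 3))
```

With a single replication the four numbers for each position will differ from the Monte Carlo averages in Table 1 by sampling error, but they should show the same ordering: the unadjusted retrospective estimate is the largest, weighting brings it down, and the childlessness adjustment brings it close to the prospective value.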

For the three-generation model, we assume that a man’s fertility depends on the socioeconomic positions and fertility of all prior generations, as well as his own socioeconomic position. In addition, we assume a man’s socioeconomic position depends on the socioeconomic positions of all prior generations. We generate the data by the following steps.

  • 2.1. The exogenous variable U1 for the grandfathers’ generation follows a standard normal distribution, U1 ~ N(0, 1).

  • 2.2. The conditional distribution of a grandfather’s socioeconomic position Y1 (1 vs. 2) given U1 is a Bernoulli distribution.

    $$\mathrm{logit}\bigl(P[Y_1 = 2 \mid U_1]\bigr) = \alpha_0 + \alpha_1 U_1 = \log\left(\frac{0.3}{0.7}\right) + \log(2)\,U_1 \tag{A-6}$$

  • 2.3. The conditional distribution of a grandfather’s fertility F1 given Y1 is a Poisson distribution whose mean satisfies the equation below.

    $$E(F_1 \mid Y_1) = \exp\bigl(\beta_0 + \beta_1 (Y_1 - \bar{Y}_1)\bigr) = \exp\bigl(\log(1.1) + 0.6\,(Y_1 - \bar{Y}_1)\bigr) \tag{A-7}$$

    We generate a dichotomous variable D1 based on F1, where D1 = 1 if F1 > 0 and D1 = 0 if F1 = 0.

  • 2.4. The conditional distribution of a father’s variable U2 given U1 and D1 is a normal distribution whose standard deviation is fixed at 1 and whose mean satisfies the equation below.

    $$E(U_2 \mid U_1, D_1 = 1) = \gamma_1 (U_1 - \bar{U}_1) = 0.8\,(U_1 - \bar{U}_1) \tag{A-8}$$

  • 2.5. The conditional distribution of a father’s position Y2 given U2, F1, D1, and Y1 is a Bernoulli distribution.

    $$\mathrm{logit}\bigl(P[Y_2 = 2 \mid U_2, Y_1, D_1 = 1]\bigr) = \delta_0 + \delta_1 U_2 + \delta_2 Y_1 + \delta_3 F_1 = \log\left(\frac{0.2}{0.8}\right) + \log(2)\,U_2 + \log(2.5)\,Y_1 + \log(1.1)\,F_1 \tag{A-9}$$

  • 2.6. The conditional distribution of a father’s fertility F2 given Y2, Y1, F1, and D1 is a Poisson distribution whose mean satisfies the equation below.

    $$E(F_2 \mid Y_2, Y_1, F_1, D_1 = 1) = \exp\bigl(\theta_0 + \theta_1 (Y_2 - \bar{Y}_2) + \theta_2 (Y_1 - \bar{Y}_1) + \theta_3 (F_1 - \bar{F}_1)\bigr) = \exp\bigl(\log(1.1) + 0.4\,(Y_2 - \bar{Y}_2) + 0.2\,(Y_1 - \bar{Y}_1) + 0.1\,(F_1 - \bar{F}_1)\bigr) \tag{A-10}$$

    We generate a dichotomous variable D2 based on F2, where D2 = 1 if F2 > 0 and D2 = 0 if F2 = 0.

  • 2.7. The conditional distribution of a son’s variable U3 given U2, U1, and D2 is a normal distribution whose standard deviation is fixed at 1 and whose mean satisfies the equation below.

    $$E(U_3 \mid U_2, U_1, D_2 = 1) = \pi_1 (U_2 - \bar{U}_2) + \pi_2 (U_1 - \bar{U}_1) = 0.6\,(U_2 - \bar{U}_2) + 0.2\,(U_1 - \bar{U}_1) \tag{A-11}$$

    Note that when D2 = 1, we must have D1 = 1.

  • 2.8. The conditional distribution of a son’s position Y3 given U3, Y2, U2, F2, Y1, U1, and D2 is a Bernoulli distribution.

    $$\mathrm{logit}\bigl(P[Y_3 = 2 \mid U_3, Y_2, U_2, F_2, Y_1, U_1, D_2 = 1]\bigr) = \lambda_0 + \lambda_1 U_3 + \lambda_2 Y_2 + \lambda_3 U_2 + \lambda_4 F_2 + \lambda_5 Y_1 + \lambda_6 U_1$$
    $$= \log\left(\frac{0.15}{0.85}\right) + \log(1.8)\,U_3 + \log(2.0)\,Y_2 + \log(1.3)\,U_2 + \log(1.1)\,F_2 + \log(1.5)\,Y_1 + \log(1.1)\,U_1 \tag{A-12}$$

  • 2.9. The conditional distribution of a son’s fertility F3 given Y3, Y2, Y1, F2, F1, and D2 is a Poisson distribution whose mean satisfies the equation below.

    $$E(F_3 \mid Y_3, Y_2, Y_1, F_2, F_1, D_1 = 1) = \exp\bigl(\zeta_0 + \zeta_1 (Y_3 - \bar{Y}_3) + \zeta_2 (Y_2 - \bar{Y}_2) + \zeta_3 (Y_1 - \bar{Y}_1) + \zeta_4 (F_2 - \bar{F}_2) + \zeta_5 (F_1 - \bar{F}_1)\bigr)$$
    $$= \exp\bigl(\log(1.1) + 0.36\,(Y_3 - \bar{Y}_3) + 0.20\,(Y_2 - \bar{Y}_2) + 0.10\,(Y_1 - \bar{Y}_1) + 0.10\,(F_2 - \bar{F}_2) + 0.03\,(F_1 - \bar{F}_1)\bigr) \tag{A-13}$$

Footnotes

1

Conversely, if we want to construct a prospective odds ratio based on retrospective data in which fathers with more sons contribute multiple observations to the sample, we need to reduce the weights of these observations so that all fathers contribute equally to the sample. This weighting procedure resembles Horvitz and Thompson’s (1952) inverse probability weighting strategy. In practice, this method requires that all fathers have completed their fertility or that data on incomplete fertility have been appropriately adjusted.

2

Whereas Mare and Maralani (2006) present a model for women, we discuss the model for a male population. Although the details of these models may differ in their substantive applications, whether the model is formulated for men or women does not affect the methods proposed in this paper.

Contributor Information

XI SONG, University of California, Los Angeles.

ROBERT D. MARE, University of California, Los Angeles

References

  1. Abma Joyce C., and Martinez Gladys M. 2006. “Childlessness among Older Women in the United States: Trends and Profiles.” Journal of Marriage and Family 68(4): 1045–56.
  2. Allan Boris, and Bytheway Bill. 1973. “The Effects of Differential Fertility on Sampling in Studies of Intergenerational Social Mobility.” Sociology 7(2): 273–76.
  3. Bartholomew David J. 1982. Stochastic Models for Social Processes (3rd edition). New York: Wiley.
  4. Beller Emily. 2009. “Bringing Intergenerational Social Mobility Research into the Twenty-First Century: Why Mothers Matter.” American Sociological Review 74(4): 507–28.
  5. Blake Judith. 1981. “Family Size and the Quality of Children.” Demography 18(4): 421–42.
  6. Blau Peter M., and Duncan Otis Dudley. 1967. The American Occupational Structure. New York: Wiley.
  7. Breen Richard (ed.). 2004. Social Mobility in Europe. Oxford: Oxford University Press.
  8. British Household Panel Survey (BHPS) User Manual Codebook. 1997. Institute for Social & Economic Research.
  9. Chinese Family Panel Studies (CFPS). 2010. Chinese Family Panel Studies: A PSC Research Project (http://www.psc.isr.umich.edu/research/project-detail/34795).
  10. Clogg Clifford C., and Eliason Scott R. 1987. “Some Common Problems in Log-Linear Analysis.” Sociological Methods and Research 16(1): 8–44.
  11. Corcoran Mary, Gordon Roger, Laren Deborah, and Solon Gary. 1992. “The Association between Men’s Economic Status and Their Family and Community Origins.” The Journal of Human Resources 27(4): 575–601.
  12. Duncan Otis Dudley. 1966. “Methodological Issues in the Analysis of Social Mobility.” Pp. 51–97 in Smelser NJ and Lipset SM (eds.), Social Structure and Mobility in Economic Development. Chicago: Aldine.
  13. Erikson Robert, and Goldthorpe John H. 1992. The Constant Flux: A Study of Class Mobility in Industrial Societies. Oxford University Press, USA.
  14. Erikson Robert, Goldthorpe John H., and Portocarero Lucienne. 1979. “Intergenerational Class Mobility in Three Western European Societies: England, France and Sweden.” British Journal of Sociology 30(4): 415–41.
  15. Featherman David L. 1979. “Retrospective Longitudinal Research: Methodological Considerations.” University of Wisconsin-Madison CDE Working Paper 79–19.
  16. Featherman David L., and Hauser Robert M. 1978. Opportunity and Change. New York: Academic Press.
  17. Fienberg Stephen E. 2007. The Analysis of Cross-Classified Categorical Data (2nd ed.). Springer.
  18. Glass David V. (ed.). 1954. Social Mobility in Britain. London: Routledge and Kegan Paul.
  19. Hauser Robert M., Koffel John N., Travis Harry P., and Dickinson Peter J. 1975. “Temporal Change in Occupational Mobility: Evidence for Men in the United States.” American Sociological Review 40(3): 279–97.
  20. Heaton Tim B., Jacobson Cardell K., and Holland Kimberlee. 1999. “Persistence and Change in Decisions to Remain Childless.” Journal of Marriage and the Family 61: 531–39.
  21. Horvitz Daniel G., and Thompson Donovan J. 1952. “A Generalization of Sampling without Replacement from a Finite Universe.” Journal of the American Statistical Association 47(260): 663–85.
  22. Household, Income and Labour Dynamics in Australia (HILDA). 2002. “The Household, Income and Labour Dynamics in Australia Survey: Wave 1.” Australian Economic Review 35(3): 339–48.
  23. Hout Michael. 1983. Mobility Tables. Beverly Hills: Sage.
  24. Hout Michael. 1988. “More Universalism, Less Structural Mobility: The American Occupational Structure in the 1980s.” American Journal of Sociology 93(6): 1358–1400.
  25. Johnson Norman L., Kemp Adrienne W., and Kotz Samuel. 2005. Univariate Discrete Distributions (3rd edition). New Jersey: John Wiley & Sons, Inc.
  26. Kalbfleisch John D., and Prentice Ross L. 2002. The Statistical Analysis of Failure Time Data. New York: John Wiley & Sons, Inc.
  27. Korean Labor & Income Panel Study (KLIPS). 2012. An Overview of the KLIPS (http://www.kli.re.kr/klips/en/about/introduce.jsp).
  28. Lam David. 1986. “The Dynamics of Population Growth, Differential Fertility, and Inequality.” American Economic Review 76: 1103–16.
  29. Long Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Sage Publications.
  30. Maralani Vida. 2013. “The Demography of Social Mobility: Black-White Differences in the Process of Educational Reproduction.” American Journal of Sociology 118(6): 1509–58.
  31. Maralani Vida, and Mare Robert D. 2005. “Demographic Pathways of Intergenerational Effects: Fertility, Mortality, Marriage and Women’s Schooling in Indonesia.” CCPR Working Paper.
  32. Mare Robert D. 1997. “Differential Fertility, Intergenerational Educational Mobility, and Racial Inequality.” Social Science Research 26(3): 263–91.
  33. Mare Robert D. 2011. “A Multigenerational View of Inequality.” Demography 48(1): 1–23.
  34. Mare Robert D., and Schwartz Christine R. 2006. “Educational Assortative Mating and the Family Background of the Next Generation: A Formal Analysis.” Riron to Hoho (Sociological Theory and Methods) 21: 253–77.
  35. Mare Robert D., and Maralani Vida. 2006. “The Intergenerational Effects of Changes in Women’s Educational Attainments.” American Sociological Review 71(4): 542–64.
  36. Matras Judah. 1961. “Differential Fertility, Intergenerational Occupational Mobility, and Change in Occupational Distribution: Some Elementary Interrelationships.” Population Studies 15: 187–97.
  37. Matras Judah. 1967. “Social Mobility and Social Structure: Some Insights from the Linear Model.” American Sociological Review 32: 608–14.
  38. Musick Kelly, and Mare Robert D. 2004. “Family Structure, Intergenerational Mobility, and the Reproduction of Poverty: Evidence for Increasing Polarization?” Demography 41(4): 629–48.
  39. Preston Samuel H. 1974. “Differential Fertility, Unwanted Fertility, and Racial Trends in Occupational Achievement.” American Sociological Review 39: 492–506.
  40. Preston Samuel H., and Campbell Cameron. 1993. “Differential Fertility and the Distribution of Traits: The Case of IQ.” American Journal of Sociology 98: 997–1019.
  41. PSID Main Interview User Manual: Release 2012.1. Institute for Social Research, University of Michigan, January 23, 2012.
  42. Smeeding Timothy M., Jäntti Markus, and Erikson Robert (eds.). 2011. Persistence, Privilege, and Parenting: The Comparative Study of Intergenerational Mobility. Russell Sage Foundation.
  43. Socio-Economic Panel (SOEP) Group. 2000. “The German Socio Economic Panel After More than 15-Years Overview.” Vierteljahrshefte zur Wirtschaftsforschung 70: 7–14.
  44. Solon Gary. 1992. “Intergenerational Income Mobility in the United States.” American Economic Review 82(3): 393–408.
  45. Survey of Labour and Income Dynamics (SLID, Canada). 2012. A Survey Overview (http://www5.statcan.gc.ca/bsolc/olc-cel/olc-cel?lang=eng&catno=75F0011X).
  46. Swiss Household Panel (SHP). 2012. Swiss Household Panel Documentations (http://www.swisspanel.ch/index.php?lang=en).
  47. Szelenyi Ivan, and Treiman Donald J. 1994. Social Stratification in Eastern Europe after 1989: General Population Survey. Provisional codebook. Unpublished manuscript. Department of Sociology, University of California, Los Angeles.
  48. Torche Florencia. 2011. “Is a College Degree Still the Great Equalizer? Intergenerational Mobility across Levels of Schooling in the United States.” American Journal of Sociology 117(3): 763–807.
  49. Treiman Donald J., and Walder Andrew G. 1996. Life Histories and Social Change in Contemporary China. Distributed by the UCLA Social Science Data Archive (http://www.sscnet.ucla.edu/issr/da/).
  50. Treiman Donald J., Moeno Sylvia, and Schlemmer Lawrence. 1996. Survey of Socioeconomic Opportunity and Achievement in South Africa. Codebook. Unpublished manuscript. Department of Sociology, University of California, Los Angeles.
  51. Wisconsin Longitudinal Study (WLS) Handbook. 2006. Wisconsin Longitudinal Study: Tracking the Life Course (http://www.ssc.wisc.edu/wlsresearch/documentation/handbook/WLS_Handbook.pdf).
