Author manuscript; available in PMC: 2016 Oct 1.
Published in final edited form as: J Labor Econ. 2016 Feb 11;34(Suppl 2):s31–s65. doi: 10.1086/684121

Decomposing Trends in Inequality in Earnings into Forecastable and Uncertain Components

Flavio Cunha 1, James Heckman 2
PMCID: PMC4827721  NIHMSID: NIHMS747455  PMID: 27087741

Abstract

A substantial empirical literature documents the rise in wage inequality in the American economy. It is silent on whether the increase in inequality is due to components of earnings that are predictable by agents or whether it is due to greater uncertainty facing them. These two sources of variability have different consequences for both aggregate and individual welfare. Using data on two cohorts of American males we find that a large component of the rise in inequality for less skilled workers is due to uncertainty. For skilled workers, the rise is less pronounced.

1 Introduction

A large literature documents an increase in wage inequality in the American economy over the past 40 years. This increase in wage inequality occurred both within and across education-experience groups. (See, e.g., the surveys in Katz and Autor, 1999, and Acemoglu and Autor, 2011.)

Variability in wages across people and over time for the same people is not necessarily the same as uncertainty in wages. Some of the variability may be due to predictable components observed by agents early in their adult lives but not observed by the analyst. Cunha, Heckman, and Navarro (2005), henceforth CHN, estimate that roughly half of all variability in lifetime earnings across people is due to uncertainty as perceived at the time they make college-going decisions. They estimate uncertainty for one cohort of workers.1 In this paper, we apply their methodology to estimate how much of the increase in wage inequality in the late 20th century is due to an increase in components predictable by agents at the age they make their college attendance decisions and how much is due to components that are unpredictable at that age.

A large literature in empirical labor economics starting with the pioneering work of Friedman and Kuznets (1945) uses panel data to decompose earnings into permanent and transitory components. This literature has developed rich descriptions of earnings dynamics.2 Using such statistical decompositions, Gottschalk and Moffitt (1994, 2009) document an increase in measured earnings instability in recent decades. The variance of transitory components greatly increases from the period 1970–1978 to the period 1979–1987. However, purely statistical decompositions cannot distinguish uncertainty from other sources of variability. Transitory components as measured by a statistical decomposition may be perfectly predictable by agents, partially predictable, or totally unpredictable.

This paper uses data on schooling choices and realized future earnings for two birth cohorts of white males spanning the mid-1960s to the mid-2000s to estimate the evolution of uncertainty in the labor market. We show that unforecastable components in labor income have increased in recent years, especially for less skilled workers. Our findings support the analysis of Ljungqvist and Sargent (1998, 2008) that turbulence has increased in unskilled labor markets. This increase is not revealed in traditional measures of earnings inequality which do not distinguish between predictable and unpredictable components.

Our approach is based on the following simple idea. A decision variable C1, say consumption of an agent in the first period of life, may depend on incomes Y1, …, YT over horizon T that are realized after the consumption choice is taken. Abstracting from measurement errors, under the permanent income hypothesis the correlation between C1 and future Yt is a measure of how much of future Yt is known and acted on when agents make their consumption decisions. (See, e.g., Flavin, 1981.)

Agents only imperfectly predict their future earnings using information set $\mathcal{I}_1$. Suppose that $C_1$ depends on future $Y_t$ only through the expected present value, $E(PV_1 \mid \mathcal{I}_1)$, where “E” denotes expectation, $PV_1 = \sum_{t=1}^{T} \frac{Y_t}{(1+\rho)^{t-1}}$, and $\rho$ is the discount rate. This framework assumes that there is an asset market in which agents can lend or borrow against verifiable future income. If, after the choice of $C_1$ is made, we actually observe $Y_1, \ldots, Y_T$, we can construct $PV_1$ ex post. If the information set is properly specified, the residual corresponding to the component of $PV_1$ that is not forecastable in the first period, $V_1 = PV_1 - E(PV_1 \mid \mathcal{I}_1)$, should not predict $C_1$. $E(PV_1 \mid \mathcal{I}_1)$ is predictable. $V_1$ arises from uncertainty. The variance in $PV_1$ that is unpredictable using $\mathcal{I}_1$ is a measure of uncertainty as of the first period.3
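The present value and its unforecastable residual can be illustrated with a minimal sketch. The three agents, their income streams, and the 10% discount rate below are hypothetical values chosen for illustration; the function names are not from the paper. The sketch assumes the agent's forecast of each $Y_t$ is given, so that the expected present value is the present value of the expected incomes.

```python
from statistics import pvariance


def present_value(incomes, rho):
    """Discount Y_1..Y_T to period 1: sum of Y_t / (1 + rho)**(t - 1)."""
    # enumerate starts at 0, which matches the exponent t - 1 for t = 1, 2, ...
    return sum(y / (1.0 + rho) ** k for k, y in enumerate(incomes))


def forecast_residual(realized, expected, rho):
    """V_1 = PV_1 - E(PV_1 | I_1); the expectation is linear in the Y_t."""
    return present_value(realized, rho) - present_value(expected, rho)


# Hypothetical agents: (realized incomes, incomes expected in period 1).
agents = [
    ([100.0, 121.0], [100.0, 110.0]),  # positive surprise in period 2
    ([100.0, 99.0], [100.0, 110.0]),   # negative surprise in period 2
    ([80.0, 88.0], [80.0, 88.0]),      # no surprise
]
rho = 0.10  # hypothetical discount rate

residuals = [forecast_residual(y, ey, rho) for y, ey in agents]
# Cross-sectional variance of the unforecastable part of PV_1.
uncertainty = pvariance(residuals)
```

In this toy example all cross-sectional variation in the residuals comes from period-2 surprises; variation in the forecasts themselves would count as predictable heterogeneity rather than uncertainty.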

This paper uses college attendance choices as its decision variable to estimate uncertainty. Accordingly, we measure uncertainty at only one stage of the life cycle. In principle, we could use decisions at later stages to chart the evolution of information over the life cycle but we do not do so in this paper.4 Following Becker (1964), college choices depend on comparisons of earnings in the schooling level chosen and in alternative states.

We modify the simple procedure just described to account for measurement error and the economists' inability to measure expected earnings in schooling states not selected by agents. We account for the selection bias that arises because we observe earnings streams for a given educational level only for people who select into that level (see, e.g., Heckman, 1976, 1979; Willis and Rosen, 1979).

Using college choice data combined with earnings data and data on test scores, we find that both predictable and unpredictable components of earnings variance have increased in recent years. The increase in uncertainty is largely microeconomic in nature, and is much greater for unskilled workers. Macroeconomic uncertainty decreased over the period studied (which predates the 2008 downturn), especially for less skilled workers. For them, roughly 60% of the increase in wage variability within schooling groups is due to micro uncertainty associated with turnover and job loss. For more skilled workers, only 8% of the increase in inequality is due to uncertainty. Roughly 26% of the increase in the variance of returns to schooling is due to increased uncertainty.

The rest of this paper is in three parts. Part 2 summarizes the strategy used to obtain our estimates. It is based on the analysis of CHN and Cunha and Heckman (2008).5 Part 3 presents and interprets our empirical analysis. Part 4 concludes.

2 The Model

To identify the forecastable components of future earnings and how they have changed over time, we draw on the analysis of CHN and Cunha and Heckman (2008), which we briefly summarize.

2.1 Earnings Equations

Using the Roy model (1951) and its generalizations (see Heckman and Vytlacil, 2007a,b), agents possess two lifetime potential earnings streams, (Y0,t, Y1,t), t = 1, …, T, for schooling levels “0” and “1” respectively. Earnings are assumed to have finite means. For conditioning variables X, we write:

$$Y_{0,t} = X\beta_{0,t} + U_{0,t} \tag{1}$$
$$Y_{1,t} = X\beta_{1,t} + U_{1,t}, \quad t = 1, \ldots, T, \tag{2}$$

where the error terms Us,t are defined to satisfy E (Us,t | X) = 0, s = 0, 1, t = 1, …, T. Allowing for age-specific returns incorporates post-school investment as a determinant of earnings. For any individual, we only observe one of the two possible earnings streams. This is the standard switching regression model (Quandt, 1958, 1972).

2.2 Choice Equations

The human capital model of Becker (1964) is based on present value income maximization. We extend that model by assuming that agents are risk neutral and make schooling choices based on maximizing the expected value of the return to schooling given information set I1. Write the index I of the difference in present values as

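$$I = E\left[\sum_{t=1}^{T}\left(\frac{1}{1+\rho}\right)^{t-1}\left(Y_{1,t} - Y_{0,t}\right) - C \,\middle|\, \mathcal{I}_1\right], \tag{3}$$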

where C is the cost of attending college. Costs include both pecuniary and psychic costs, which may or may not be fully known at the time schooling decisions are made. Psychic costs play an important role in explaining college enrollment decisions (see, e.g., Carneiro, Hansen, and Heckman, 2003; Abbott, Gallipoli, Meghir, and Violante, 2013; and Eisenhauer, Heckman, and Mosso, 2015). Let $Z$ and $U_C$ denote, respectively, the directly measured and the unmeasured (by the analyst) determinants of costs. We assume that costs can be written as

$$C = Z\gamma + U_C. \tag{4}$$

Defining

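$$\mu_I(X, Z) = \sum_{t=1}^{T}\left(\frac{1}{1+\rho}\right)^{t-1} X\left(\beta_{1,t} - \beta_{0,t}\right) - Z\gamma$$

and

$$U_I = \sum_{t=1}^{T}\left(\frac{1}{1+\rho}\right)^{t-1}\left(U_{1,t} - U_{0,t}\right) - U_C,$$

and substituting (1), (2), and (4) into decision rule (3), we obtain

$$I = E\left[\mu_I(X, Z) + U_I \mid \mathcal{I}_1\right]. \tag{5}$$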

$E(U_I \mid \mathcal{I}_1)$ is the error term in the choice equation and it may or may not include $U_{1,t}$, $U_{0,t}$, or $U_C$, depending on what is in the agent's information set. Similarly, $\mu_I(X, Z)$ may only be based on expectations of future X and Z at the time schooling decisions are made. People go to college if the index I, the expected present value of the earnings gain from college net of costs, is nonnegative:

$$S = \mathbf{1}[I \geq 0]. \tag{6}$$
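The earnings equations (1) and (2) and the schooling rule (6) can be illustrated with a small simulation. This is a sketch under strong simplifying assumptions that are not the paper's specification: a single period, an intercept as the only covariate, independent standard normal errors, a known constant cost, and agents who know both potential earnings. All parameter values are hypothetical. It shows why college earnings observed only for those who choose college overstate the population mean of college earnings.

```python
import random


def simulate_roy(n=20000, beta0=10.0, beta1=12.0, cost=2.0, seed=0):
    """One-period Roy model sketch with perfect foresight (illustrative only)."""
    rng = random.Random(seed)
    y1_all, y1_college = [], []
    for _ in range(n):
        u0 = rng.gauss(0.0, 1.0)
        u1 = rng.gauss(0.0, 1.0)
        y0 = beta0 + u0            # high school earnings, eq. (1) with X = 1
        y1 = beta1 + u1            # college earnings, eq. (2) with X = 1
        index = (y1 - y0) - cost   # choice index, eq. (3) with T = 1
        y1_all.append(y1)
        if index >= 0:             # schooling rule, eq. (6)
            y1_college.append(y1)  # Y1 is observed only when S = 1
    return {
        "college_share": len(y1_college) / n,
        "mean_y1_population": sum(y1_all) / n,
        "mean_y1_observed": sum(y1_college) / len(y1_college),
    }


result = simulate_roy()
# Positive gap: people who select into college have above-average U1.
selection_gap = result["mean_y1_observed"] - result["mean_y1_population"]
```

With these hypothetical parameters the mean gain equals the cost, so about half of the simulated agents choose college, and the observed mean of college earnings exceeds the population mean because selection operates on the unobservables.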

2.3 Cognitive Ability

In estimating our model and decomposing realized earnings into forecastable and unforecastable components, we control for cognitive ability. Cognitive ability is known to affect both earnings and college choices. (See, e.g., Chamberlain and Griliches, 1975; Taubman, 1977). We have access to data on scores on tests of cognitive ability.6 Let Mk denote an agent's score on the kth test. Assume that the Mk have finite means and can be expressed in terms of conditioning variables XM. Write

$$M_k = X^M\beta_k^M + U_k^M \quad \text{and} \quad E\left(U_k^M \mid X^M\right) = 0, \quad k = 1, 2, \ldots, K. \tag{7}$$

Test scores facilitate but are not essential to our identification strategy. They enable us to proxy unobserved components of ability that affect earnings and schooling choices.

2.4 Heterogeneity and Uncertainty

The earnings of agents of schooling level s at age t can be decomposed into predictable and unpredictable components as of period 1:

$$Y_{s,t} = E\left(Y_{s,t} \mid \mathcal{I}_1\right) + V_{s,t}, \quad s = 0, 1, \quad t = 1, \ldots, T.$$

$E(Y_{s,t} \mid \mathcal{I}_1)$ is available to the agent to help predict schooling choices. It is a component of realized earnings. The component $V_{s,t}$ does not enter the schooling choice equation because it is unknown at the time schooling decisions are made. However, it determines realized earnings.

To determine which components are in the information set of the agent, we need to determine which specification of the information set I1 best characterizes the dependence between schooling choices and future earnings. CHN and Cunha and Heckman (2008) use factor structure approximations to the error terms to decompose earnings residuals into predictable and unpredictable components. Other approximations such as ARMA models might be used (see, e.g., MaCurdy, 1982, 2007). However, factor structures are computationally and conceptually convenient and can approximate general error processes (see Heckman, 1981). There is an extensive literature on their identification and estimation (see, e.g., Abbring and Heckman, 2007; Chamberlain and Griliches, 1975). One advantage of factor models is that they enable analysts to partition realized earnings into orthogonal components. Some of these components may be known by the agent when schooling choices are made and some components may not be known. By factor analyzing earnings and choice equations we can determine which components (factors) of realized earnings appear in the choice equations. To show this, following CHN, we introduce an explicit factor structure for the disturbance terms.

2.5 Factor Models

We now present our factor model, starting with the earnings and choice equations. We decompose the error terms in the earnings equations into factors and idiosyncratic error terms. Let factors and factor loadings be θ = (θ1, …, θK) and αs,t = (α1,s,t, …, αK,s,t), respectively. The idiosyncratic error terms, εs,t, s ∈ {0, 1}, t ∈ {1, ⋯, T}, affect only the period-t, schooling-s earnings equation. The εs,t are mutually independent and independent of θ, X and Z. The factors, in turn, are assumed to be independent of X, Z.7 We assume that U0,t and U1,t can be represented in factor-structure form:

$$U_{s,t} = \theta\alpha_{s,t} + \varepsilon_{s,t}, \quad s = 0, 1, \quad t = 1, \ldots, T. \tag{8}$$

We assume that factors are mutually independent and independent of X and εs,t for all s, t. The εℓ,t, ℓ = 0, 1 and t = 1, …, T, are mutually independent.

The equation for psychic and pecuniary cost is decomposed in a fashion similar to the earnings equations, so that (4) can be written as

$$C = Z\gamma + \theta\alpha_C + \varepsilon_C, \tag{9}$$

where εC is independent of θ, X, Z, εs,t, s = 0, 1, t = 1, …, T. Given the factor representation (8) and (9), we can represent the choice index I for schooling as

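$$I = E\left[\sum_{t=1}^{T}\left(\frac{1}{1+\rho}\right)^{t-1} X\left(\beta_{1,t} - \beta_{0,t}\right) - Z\gamma + \theta\alpha_I + \sum_{t=1}^{T}\left(\frac{1}{1+\rho}\right)^{t-1}\left(\varepsilon_{1,t} - \varepsilon_{0,t}\right) - \varepsilon_C \,\middle|\, \mathcal{I}_1\right] \tag{10}$$

where we define

$$\alpha_I = \sum_{t=1}^{T}\frac{1}{(1+\rho)^{t-1}}\left(\alpha_{1,t} - \alpha_{0,t}\right) - \alpha_C.$$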

2.5.1 Test Score Equations

Following a long tradition in the literature (see, e.g., the papers in Taubman, 1977), we include measures of ability in the earnings and choice equations. Let the first component of θ, θ1, correspond to cognitive ability. It is extracted from data on test scores. There are additional errors unique to test score equation k, $\varepsilon_k^M$. In this notation, we can write equation (7) as

$$M_k = X^M\beta_k^M + \theta_1\alpha_k^M + \varepsilon_k^M, \quad k = 1, \ldots, K \tag{11}$$

where the $\alpha_k^M$ are “factor loadings”, i.e., coefficients that map θ1 into $M_k$, and the $\varepsilon_k^M$ are mutually independent “uniquenesses” independent of all other right hand side variables. Modeling test scores in this fashion recognizes that they are noisy measures of cognitive ability.8 While we do not require test scores to identify the model (see, e.g., Abbring and Heckman, 2007), they facilitate identification, allow us to give a specific interpretation to one component of θ, and link our analysis to a large literature in labor economics and the economics of education.

2.6 The Estimation of Predictable Components of Future Earnings

We now illustrate how to apply the method of CHN to determine which components of realized earnings are known to the agent when schooling choices are made. For full details on the econometrics used to extract the estimates reported in this paper see CHN, Abbring and Heckman (2007), and Cunha and Heckman (2008). For expositional simplicity, in this section alone we assume that X, Z, βs,t (s = 0, 1, t = 1, …, T) and εC are in the information set I1.9 To fix ideas, suppose that there are two factors, θ1 (ability) and θ2. In the empirical work reported below we use more factors and find that 3 are required to fit the data.

Suppose that it is claimed that both θ1 and θ2 are known by the agent when schooling choices are made but the $\varepsilon_{s,t}$ are not, i.e., $\{\theta_1, \theta_2\} \subset \mathcal{I}_1$ but $\varepsilon_{s,t} \notin \mathcal{I}_1$ for all s and t. If this is true, the index function governing schooling choices is

I=μI(X,Z)+α1,Iθ1+α2,Iθ2+εC. (12)

Using standard results in the theory of discrete choice (see Matzkin, 1992, or Heckman and Vytlacil, 2007a, for precise conditions), we can proceed as if we observe I in equations (6) and (12) up to an unknown positive scale. Thus from the discrete choices on schooling we observe the index generating the choices up to scale. From the correlation between S and realized incomes, we can form (up to scale) the covariance between I and Ys,t, t = 1, …, T for s = 0 or 1. Conditional on X, Z this covariance is

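$$\mathrm{Cov}\left(I, Y_{s,t} \mid X, Z\right) = \alpha_{1,I}\,\alpha_{1,s,t}\,\sigma_{\theta_1}^2 + \alpha_{2,I}\,\alpha_{2,s,t}\,\sigma_{\theta_2}^2, \quad s = 0, 1, \quad t = 1, \ldots, T. \tag{13}$$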

Suppose next that θ2 is not known, or is known and not acted on by the agent when schooling choices are made. In this case, α2,I = 0. If neither θ2 nor θ1 is known, or acted on by the agent, α1,I = α2,I = 0. For panels of earnings histories of length 3 or more (T ≥ 3) and with three or more measures of cognition (K ≥ 3), we can use the system of covariances in (13) joined with the information from the covariances between Mk and I and Mk and Ys,t to identify the model and infer the number of factors.
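The logic of equation (13) can be checked numerically. The following sketch treats the index I as directly observed, which is an assumption made for illustration (in the text it is recovered only up to scale from schooling choices). It also assumes unit-variance normal factors and errors, and the loadings are hypothetical. It compares the covariance between I and earnings when the agent acts on both factors with the case in which θ2 is not acted on, so that its loading in the index is zero.

```python
import random


def sample_cov(x, y):
    """Unbiased sample covariance of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def simulate_cov_index_earnings(alpha_I, alpha_Y, n=40000, seed=2):
    """Sample Cov(I, Y) with I = theta.alpha_I + eps_C and Y = theta.alpha_Y + eps."""
    rng = random.Random(seed)
    index, earnings = [], []
    for _ in range(n):
        theta = [rng.gauss(0.0, 1.0) for _ in alpha_Y]  # independent factors
        index.append(sum(a * t for a, t in zip(alpha_I, theta)) + rng.gauss(0.0, 1.0))
        earnings.append(sum(a * t for a, t in zip(alpha_Y, theta)) + rng.gauss(0.0, 1.0))
    return sample_cov(index, earnings)


alpha_Y = [1.0, 2.0]  # hypothetical earnings loadings on theta_1, theta_2

# Both factors acted on: population covariance is 0.5 * 1.0 + 1.5 * 2.0 = 3.5.
cov_both_known = simulate_cov_index_earnings([0.5, 1.5], alpha_Y)

# theta_2 not acted on (its index loading is zero): population covariance is 0.5.
cov_theta2_unknown = simulate_cov_index_earnings([0.5, 0.0], alpha_Y)
```

The second factor still moves earnings in both cases; it simply stops contributing to the covariance with the choice index once the agent does not act on it, which is the pattern the covariance system exploits.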

CHN, Heckman, Lochner, and Todd (2006), Abbring and Heckman (2007), and Cunha and Heckman (2008) present the details on how to use the covariances among schooling, test scores, and earnings to identify the factor loadings and the distribution of the factors in test score and earnings equations (11), (8), and (9) using self-selected samples.10 Self selection arises because analysts only observe the earnings stream associated with s for persons who choose s. The cited papers establish conditions for identifying $\sigma_{\theta_1}^2$, $\sigma_{\theta_2}^2$, $\alpha_{1,s,t}$ and $\alpha_{2,s,t}$, s = 0, 1, t = 1, …, T. We review their conditions in the Web Appendix.11

Putting these ingredients together, we can determine which components (factors) that determine realized earnings and the test scores are correlated with I. If component (factor) θ1 appears in the period t earnings equation ($\alpha_{1,s,t} \neq 0$), is correlated with I, and is acted on by the agent in making schooling choices (so $\alpha_{1,I} \neq 0$), then θ1 is predictable (in $\mathcal{I}_1$) as of the time schooling decisions are made. If earnings component θ2 is uncorrelated with I, then $\alpha_{2,I} = 0$, θ2 is not acted on by the agent in making schooling choices, and we say that it is unpredictable at the time schooling choices are made.12

3 Empirical Results

In order to study the evolution of uncertainty and inequality in labor earnings in the U.S. economy, we analyze and compare two demographically comparable, temporally separated samples. We study white males born between 1957 and 1964, sampled in the National Longitudinal Survey of Youth (NLSY/1979).13 We also study an earlier sample of white males born between 1941 and 1952, surveyed in the National Longitudinal Survey (NLS/1966).14 In what follows, we refer to the samples as NLSY/1979 and NLS/1966, respectively. These data are described in detail in the Web Appendix.15 Because we only analyze white males, we do not present a comprehensive investigation of the increase in inequality in the U.S. arising from all within-group and between-group comparisons. However, in focusing on white males, we can abstract from influences that operate differentially on various demographic groups. We focus on the rise of inequality that is due to forecastable versus unforecastable components for one important demographic group.16

We analyze two schooling choices: high school and college graduation. Use s = 0 to denote those who stop at high school and s = 1 to denote those who graduate college. We present descriptive statistics on the NLSY/1979 and NLS/1966 samples in Web Appendix Tables 1.1 and 1.2, respectively. In both samples, college graduates have higher test scores, fewer siblings, and parents with higher levels of education than those who stop at high school. In the NLSY/1979, college graduates are more likely to live in locations where the tuition for four-year college is lower. This is not true for the college graduates in NLS/1966.17

We analyze the evolution of labor income from ages 22 to 36. Reliable data are not available after that age for the NLS/1966 sample. Thus we study earnings over the years 1963–1988 for the NLS/1966 sample and the years 1979–2005 for the NLSY/1979 sample. Web Appendix Figures 1.1 and 1.2 display, respectively, the mean earnings by age of high school and college graduates for NLSY/1979 and NLS/1966.18 In both data sets and for both cohorts, college graduates start off with lower mean labor income than high school graduates but overtake them. This is consistent with the analysis of Mincer (1974). The appendix also plots the standard deviation of earnings by age for high school graduates and college graduates for both cohorts.19 The standard deviation of earnings increases with age for high school and college graduates in both data sets. The standard deviation of earnings by age is uniformly greater in the later cohort, for both high school and college graduates. Thus our data are consistent with a vast literature documenting the increase in inequality of earnings.

Both data sets have measures of cognitive test scores that can be used to proxy ability.20 For the NLSY/1979, we use five components of the ASVAB test battery: arithmetic reasoning, word knowledge, paragraph comprehension, math knowledge and coding speed. We dedicate the first element of θ (θ1) to this test score system, and exclude other factors from it, so θ1 is a measure of cognitive ability.

In the NLS/1966 there are many different achievement tests, but in our empirical work we use the two most commonly reported ones: the OTIS/BETA/GAMMA and the California Test of Mental Maturity (CTMM). One problem with the NLS/1966 sample is that there are no respondents for whom we observe scores from two or more achievement tests. That is, for each respondent we observe at most one test score. We supplement the information from these test scores by using additional proxies for cognitive achievement.21

We model test score j, $M_j$, by equation (11). The covariates $X^M$ include family background variables, year of birth dummies, and characteristics of the individuals at the time of the test.22 To set the scale of θ1, we normalize $\alpha_1^M = 1$. Using factor models, instead of working directly with test scores, recognizes that test scores may be noisy measures of cognitive skills.
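Once the first loading is normalized to one, the remaining loadings and the variance of θ1 follow from ratios of covariances among three test scores. The sketch below illustrates this standard one-factor argument, assuming three scores per person, no covariates, and normal errors; the loadings and error variance are hypothetical. It is not the estimation procedure used in the paper, which relies on Markov Chain Monte Carlo methods and, for the NLS/1966, must deal with at most one observed test per respondent.

```python
import random


def loadings_from_covariances(c12, c13, c23):
    """Solve for (alpha_2, alpha_3, Var(theta_1)) given alpha_1 = 1.

    With M_k = alpha_k * theta_1 + eps_k and independent eps_k:
    Cov(M1, M2) = alpha_2 * v, Cov(M1, M3) = alpha_3 * v,
    Cov(M2, M3) = alpha_2 * alpha_3 * v, where v = Var(theta_1).
    """
    alpha2 = c23 / c13
    alpha3 = c23 / c12
    var_theta = c12 * c13 / c23
    return alpha2, alpha3, var_theta


def sample_cov(x, y):
    """Unbiased sample covariance of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


rng = random.Random(1)
true_alpha = (1.0, 0.8, 1.2)  # hypothetical loadings; the first is the normalization
scores = [[], [], []]
for _ in range(50000):
    theta1 = rng.gauss(0.0, 1.0)  # latent ability with variance 1
    for k in range(3):
        scores[k].append(true_alpha[k] * theta1 + rng.gauss(0.0, 0.7))

estimates = loadings_from_covariances(
    sample_cov(scores[0], scores[1]),
    sample_cov(scores[0], scores[2]),
    sample_cov(scores[1], scores[2]),
)
```

Because the test-specific errors are independent, they drop out of every cross-test covariance, which is why noisy scores can still pin down the loadings.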

Salient features of our data are presented in Table 1. Fewer males graduate college in the later cohort. This is consistent with a large body of evidence that shows enhanced college participation in earlier cohorts to avoid the Vietnam War draft.23 For a variety of specifications, Mincer returns increase for the later cohorts. This is consistent with a large body of evidence on the returns to schooling (Acemoglu and Autor, 2011; Katz and Autor, 1999).

Table 1. Schooling Choice and Rates of Return per Year of College: Comparison Across Cohorts

| | NLS/66 | NLSY/79 |
|---|---|---|
| High School Graduates | 58.17% | 64.19% |
| College Graduates | 41.83% | 35.81% |
| Mincer Returns to College (1) | 9.01% | 11.96% |
| Mincer Returns to College (2) | 10.17% | 12.41% |
| Mincer Returns to College (3) | 8.17% | 11.00% |

(1) Pooled OLS Regression, controlling only for Mincer Experience and Mincer Experience Squared.

(2) Pooled OLS Regression, controlling for Mincer Experience, Mincer Experience Squared, and Year Dummies.

(3) Pooled OLS Regression, controlling for Mincer Experience, Mincer Experience Squared, Cognitive Skills, Urban and South Residence at Age 14, and Year Dummies (Dependent Variable: Log Earnings).

Qualitatively similar models characterize both samples. For both cohorts, a three factor model is sufficient to fit the data on ex-post earnings, test scores and schooling choice.24 The identification of the model requires the normalization of some factor loadings because the scales of the components of θ are otherwise indeterminate. Web Appendix Table 2.2 shows the factor loading normalizations imposed in both data sets. In both samples, the covariates X are urban residence at age 14, year effects, and an intercept.

The covariates Z in the cost function are urban residence at age 14, dummies for year of birth, and variables that affect the costs of going to college but do not affect outcomes Ys,t after controlling for ability. Examples of such exclusions are mother's education, father's education, number of siblings, and local tuition.25 Because in both samples we only have earnings data into the middle 30s, the truncated discounted earnings after the periods of observation (denoted t = 1, …, T*) are absorbed into the definition of expected C in equation (3). Thus C estimated from the choice equation is not a pure measure of costs. We discuss this issue further in Section 3.3.

Each factor θk is assumed to be generated by a mixture of Jk normal distributions,

$$\theta_k \sim \sum_{j=1}^{J_k} p_{k,j}\,\phi\left(\theta_k \mid \mu_{k,j}, \lambda_{k,j}\right),$$

where $\phi(\eta \mid \mu_j, \lambda_j)$ is a normal density for η with mean $\mu_j$ and variance $\lambda_j$, $\sum_{j=1}^{J_k} p_{k,j} = 1$, and $p_{k,j} > 0$ for each j.26 The $\varepsilon_{s,t}$ are also assumed to be generated by mixtures of normals. We estimate the model using Markov Chain Monte Carlo methods as described in Carneiro, Hansen, and Heckman (2003). For all factors, a four-component model ($J_k = 4$, k = 1, …, 3) is adequate. For all $\varepsilon_{s,t}$ we use a three-component model.27
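A mixture of normals is easy to sample from and has closed-form first and second moments. The sketch below uses a hypothetical two-component mixture (the paper uses four components for the factors and three for the idiosyncratic errors) and treats the second parameter of each component as a variance, as in the definition above. The function names are illustrative.

```python
import random


def mixture_moments(weights, means, variances):
    """Mean and variance of a finite mixture of normals."""
    mean = sum(p * m for p, m in zip(weights, means))
    second_moment = sum(p * (v + m * m) for p, m, v in zip(weights, means, variances))
    return mean, second_moment - mean * mean


def draw_mixture(rng, weights, means, variances):
    """Draw a component label j with probability p_j, then draw from N(mu_j, lambda_j)."""
    j = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[j], variances[j] ** 0.5)  # gauss takes a standard deviation


# Hypothetical bimodal example: equal weights, means -1 and 1, unit variances.
weights, means, variances = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
mix_mean, mix_var = mixture_moments(weights, means, variances)

rng = random.Random(3)
draws = [draw_mixture(rng, weights, means, variances) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
```

The mixture variance exceeds the within-component variance because the spread between component means adds to it, which is one way such mixtures accommodate non-normal factor distributions.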

The dependent variable in our analysis is earnings and not log earnings. Under risk neutrality, agents make college choices based on expected earnings. The traditional argument for fitting log earnings is based on goodness of fit considerations.28 Using a nonparametric estimation method for determining the error distribution, our model fits the earnings data.

3.1 Model Fit

The Web Appendix reports model fit overall and in subsamples disaggregated by education and age.29 When we perform formal tests of equality of predicted versus actual densities, we pass these tests for both schooling groups for most ages.30 The model fits the NLS/1966 data marginally better than it fits the NLSY/1979 data. The estimated factor distributions are non-normal.31

Our analysis reveals that agents know θ1 and θ2 but not θ3 at the time that they make their schooling decisions. Thus the third factor is revealed after schooling choices are made. In addition, they do not know the εs,t, s = 0, 1, t = 1, …, T*, or the year dummies in the earnings equations corresponding to future macro shocks. Otherwise agents know the variables in X and Z described in the previous subsection.

3.2 The Evolution of Joint Distributions of Earnings and the Returns to College

The conventional approach to estimating the distribution of earnings in counterfactual schooling states (e.g., the distributions of college earnings for people who choose to be high school graduates under a particular policy regime) assumes that college and high school distributions are the same except for an additive constant — the coefficient of a schooling dummy in an earnings regression conditioned on covariates. Using the methods developed in CHN and reviewed in Part III of the Web Appendix, we can identify both ex-ante and ex-post joint distributions without making this strong assumption or the other strong assumptions conventionally used to identify joint distributions of counterfactuals.32 We present and discuss our estimates of ex-ante and ex-post joint distributions in Web Appendix 3.5.

Knowledge of the joint distributions allows analysts to compare factual with counterfactual distributions. In the Web Appendix, we compare the density of the present value of realized ex-post earnings in the high school sector for high school graduates with the density of the present value of earnings they would obtain in the college sector. We also compare the density of realized present value earnings of college graduates with the density of their counterfactual present value of earnings in the high school sector.33 For both data sets, high school graduates would have higher earnings if they had chosen to be college graduates. For college graduates, the densities of high school present value of earnings are to the left of the college densities. These distributions are consistent with economic rationality because psychic costs are estimated to be substantially negative for college attendees and large and positive for those who stop at high school. See the evidence reported in CHN.34

From our model, we can generate the distributions of the ex-post gross rate of return R to college (excluding costs) defined as

$$R = \frac{Y_1 - Y_0}{Y_0}$$

where

$$Y_s = \sum_{t=1}^{T^*} \frac{Y_{s,t}}{(1+\rho)^{t}}, \quad s \in \{0, 1\},$$

where t ∈ {1, …, 15} corresponds to ages 22 to 36 (T* = 15), earnings are discounted back to age 22, and ρ = .03. The mean high school graduate would have had annual gross returns per year of schooling of around 6% for a college education in the earlier cohort and around 9.5% for the later cohort. (See Table 2.) For the mean college graduate, the annual return per year of schooling is around 8.7% for the earlier cohort and 13.5% for the later cohort. For individuals at the margin of attending college, these figures are 7.5% and 11.8% respectively. The returns to college for high school and college graduates for both cohorts are plotted in Figure 1.

Table 2. Mean Rates of Return per Year of College by Schooling Group

| Schooling Group | NLS/66 Mean Returns | NLS/66 Standard Error | NLSY/79 Mean Returns | NLSY/79 Standard Error |
|---|---|---|---|---|
| High School Graduates | 0.0592 | 0.0046 | 0.0955 | 0.0063 |
| College Graduates | 0.0877 | 0.0070 | 0.1355 | 0.0080 |
| Individuals at the Margin | 0.0750 | 0.0178 | 0.1184 | 0.0216 |

Figure 1. Densities of Returns to College

3.3 The Evolution of Uncertainty and Heterogeneity

Under risk neutrality, the valuation or net utility function for schooling is

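$$I = E\left(\sum_{t=1}^{T^*} \frac{Y_{1,t} - Y_{0,t}}{(1+\rho)^{t-1}} \,\middle|\, \mathcal{I}_1\right) - E\left(C_{T^*} \mid \mathcal{I}_1\right),$$

where

$$C_{T^*} = -\sum_{t=T^*+1}^{T} \frac{1}{(1+\rho)^{t-1}}\left(Y_{1,t} - Y_{0,t}\right) + C.$$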

Because of the age truncation of lifetime earnings in our data, the estimated cost includes a component due to the expected return realized after period T*. Individuals go to college if I > 0. As previously explained, the correlation between schooling choices and realized future income allows the analyst to disentangle predictable components from uncertainty. For both cohorts, we test, and do not reject, the hypothesis that at the time they make college going decisions individuals know their Z and the factors θ1 and θ2. They do not know the time dummies (year effects) in X, the factor θ3 or εs,t, s = 0, 1, t = 1, …, T*, at the time they make their educational choices. We now explore the implications of our estimates for the growth of uncertainty in the American economy prior to the 2008 recession.
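The way the truncated earnings are absorbed into the cost term can be made explicit. The following short derivation is a sketch that assumes only that the full horizon T exceeds the observed horizon T* and that conditional expectation is linear; it starts from the choice index in equation (3) and splits the sum at T*.

```latex
\begin{align*}
I &= E\left[\sum_{t=1}^{T} \frac{Y_{1,t} - Y_{0,t}}{(1+\rho)^{t-1}} - C \,\middle|\, \mathcal{I}_1\right] \\
  &= E\left[\sum_{t=1}^{T^*} \frac{Y_{1,t} - Y_{0,t}}{(1+\rho)^{t-1}} \,\middle|\, \mathcal{I}_1\right]
   - E\left[\,C - \sum_{t=T^*+1}^{T} \frac{Y_{1,t} - Y_{0,t}}{(1+\rho)^{t-1}} \,\middle|\, \mathcal{I}_1\right].
\end{align*}
```

The term inside the second expectation is the cost net of discounted earnings gains realized after the observation window, so a large expected post-window gain lowers the cost recovered from the choice equation.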

3.3.1 Total Residual Variance and Variance of Unforecastable Components

The unforecastable component of the residual is the sum of the components that are not in the information set of the agent at the time schooling choices are made. For both data sets, the unforecastable component of the present value of earnings estimates up to age T* is

$$P_s = \sum_{t=1}^{T^*} \frac{\theta_3\alpha_{3,s,t} + T_t\phi + \varepsilon_{s,t}}{(1+\rho)^{t-1}}, \tag{14}$$

where the Tt are the year dummies in the future earnings equations that we estimate to be unknown to agents at the time they make their schooling choices. The variance of the unforecastable component in the present value of earnings up to age T* for schooling level s is Var (Ps).

Table 3 displays the total variance and the variance of the unforecastable components for each schooling level for both NLS/1966 and NLSY/1979. Total variance of the present value of college earnings up to age T* increases from 195.9 (NLS/1966) to 292.4 (NLSY/1979). This implies an increase of almost 50% in the total variance. The increase is smaller for the variance of the present value of high school earnings up to age 36: it goes from 137 in NLS/1966 to 165 in NLSY/1979, an increase of about 21%.

Table 3. Evolution of Uncertainty

| | College | High School | Returns |
|---|---|---|---|
| NLS/1966 | | | |
| Total Variance | 195.882 | 136.965 | 611.245 |
| Variance of Unforecastable Components | 76.332 | 31.615 | 167.187 |
| Variance of Forecastable Components | 119.550 | 105.350 | 444.058 |
| NLSY/1979 | | | |
| Total Variance | 292.368 | 165.350 | 823.200 |
| Variance of Unforecastable Components | 84.464 | 48.137 | 221.976 |
| Variance of Forecastable Components | 207.904 | 117.214 | 601.223 |
| Evolution | | | |
| Percentage Increase in Total Variance | 49.26% | 20.72% | 34.68% |
| Percentage Increase in Variance of Unforecastable Components | 10.65% | 52.26% | 32.77% |
| Percentage Increase in Variance of Forecastable Components | 73.90% | 11.26% | 35.39% |
| Percentage Increase in Total Variance by Source | | | |
| Percentage Increase in Total Variance due to Unforecastable Components | 8.43% | 58.20% | 25.85% |
| Percentage Increase in Total Variance due to Forecastable Components | 91.57% | 41.80% | 74.15% |

The variance of the unforecastable components up to age 36 has also increased. For college earnings, it is 76.3 in the early cohort and becomes 84.5 in the more recent cohort. For high school earnings, it is 31.6 in the NLS/1966 and becomes 48.1 in the NLSY/1979. In percentage terms, this implies that the variance of the unforecastable component increased by about 10.7% for college and 52% for high school. Table 3 shows that total variance in the present value of gross returns to college up to age 36 increased from 611 in NLS/1966 to 823 in NLSY/1979, an increase of about 35%. The variance of the unforecastable components increased from 167 to 222, or roughly 33%.

The increase in the variance of the unforecastable components of earnings is a key element in explaining the increase in the total variance in earnings for high school graduates. It is much less of a driving force in explaining the increase in the variance of college earnings.
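The arithmetic behind the "Evolution" and "by Source" panels of Table 3 can be sketched as follows. This is a minimal illustration, assuming that forecastable variance equals total variance minus unforecastable variance (as the table rows imply) and that the share "due to" a component is its change divided by the change in total variance. The function and variable names are hypothetical and are not taken from the paper.

```python
# Variances copied from Table 3, as (NLS/1966, NLSY/1979) pairs.
TABLE3 = {
    "College": {"total": (195.882, 292.368), "unforecastable": (76.332, 84.464)},
    "High School": {"total": (136.965, 165.350), "unforecastable": (31.615, 48.137)},
    "Returns": {"total": (611.245, 823.200), "unforecastable": (167.187, 221.976)},
}


def pct_growth(early, late):
    """Percentage increase from the early cohort to the late cohort."""
    return 100.0 * (late - early) / early


def share_of_increase(total, component):
    """Share (in %) of the rise in total variance accounted for by one component."""
    return 100.0 * (component[1] - component[0]) / (total[1] - total[0])


def decompose(row):
    total, unf = row["total"], row["unforecastable"]
    # Assumption: forecastable variance is the remainder (total - unforecastable).
    fore = (total[0] - unf[0], total[1] - unf[1])
    return {
        "growth_total": pct_growth(*total),
        "growth_unforecastable": pct_growth(*unf),
        "growth_forecastable": pct_growth(*fore),
        "share_unforecastable": share_of_increase(total, unf),
        "share_forecastable": share_of_increase(total, fore),
    }


results = {name: decompose(row) for name, row in TABLE3.items()}
for name, r in results.items():
    print(name, {key: round(value, 2) for key, value in r.items()})
```

Up to rounding of the published inputs, the computed shares match the last panel of Table 3: uncertainty accounts for a small part of the increase for college earnings and for more than half of the increase for high school earnings.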

Figures 2A and 2B plot the densities of realized and unforecastable present values of high school earnings for the 1979 and 1966 samples, respectively. Figures 3A and 3B make the analogous comparison for present values of college earnings for the 1979 and 1966 samples, respectively. Finally, Figures 4A and 4B show the corresponding figures for returns. Unforecastable components are a major component of total earnings variance.

Figure 2.

The Densities of Total Residual vs. Unforecastable Components in Present Value of High School Earnings

Figure 3.

The Densities of Total Residual vs. Unforecastable Components in Present Value of College Earnings

Figure 4.

The Densities of Total Residual vs. Forecastable Components Returns College vs. High School

Table 3 also presents the total variance and the variance of forecastable components for each schooling level for both NLS/1966 and NLSY/1979. In the recent cohort, individuals who attend college have become more diverse in predictable ways, possibly associated with greater possibilities for specialization in the modern economy. There is only a small change in the predictability of high school earnings. For college earnings, the variance of forecastable components is 119.5 for the NLS/1966 and 207.9 for the NLSY/1979, corresponding to a 74% increase. For high school earnings, it is 105 for the NLS/1966 and 117.2 for the NLSY/1979, which implies an increase of only 11%. There is a substantial increase in the variance of predictable returns to college for the more recent cohort.

In summary, our analysis shows that about 8% of the increase in the variability in college earnings, about 58% of the increase in the variability in high school earnings, and about 26% of the increase in the variability of gross returns to college is due to an increase in uncertainty in the American labor market. We next turn to an analysis of how the increase in variance is apportioned by age.

3.3.2 The Variance of the Unforecastable and Forecastable Components by Age

The increase in uncertainty is not uniform across age groups. Figure 5A plots the variances of unforecastable components by age in high school earnings in NLS/1966 and NLSY/1979. They are flat until age 27/28 and increase with age thereafter in both cohorts. Until age 36, the NLSY/1979 cohort experiences a much more rapid increase in variances with age than does the NLS/1966 cohort. The college sample (Figure 5B) shows a similar flat pattern until age 27. Again, components due to uncertainty increase with age, but the only divergence between the younger cohort and the older cohort is in the age range 28–31.

Figure 5.

Profile of Variance of Uncertainty

The age profile of the variance of forecastable components is different. (See Figures 6A and 6B.) For both college and high school graduates it rises up to age 27 and then declines somewhat. For high school graduates, the increase is greater for the more recent cohort up to age 27 but then the two curves coincide. For college graduates, the predictable components of variance are uniformly higher at each age for the more recent cohort.

Figure 6.

Profile of Variance of Heterogeneity

3.3.3 Accounting for Macro Uncertainty

The literature in macroeconomics documents that aggregate instability steadily decreased in the post-World War II period prior to the 2008 meltdown (see Gordon, 2005). To capture the reduction in macro uncertainty, we introduce time dummies into the earnings equation.35 Our tests indicate that the time dummies in the ex-post earnings equations do not enter the schooling choice equation. Thus, we estimate that macro uncertainty is not forecastable by agents at the time schooling choices are made. Macro uncertainty decreased by 90% for later cohorts of high school educated workers (see Table 4). Macro shocks have decreased slightly if at all for college educated workers. These estimates are consistent with the evidence that US business cycle volatility decreased in the years prior to 2008. At the same time, macro uncertainty is a tiny fraction of total uncertainty for both cohorts (6.8% for 1966, 3.3% for 1979).

Table 4.

Share of Variance of Business Cycle in Total Variance of Unforecastable Components

NLS/1966 NLSY/1979
Point Estimate Standard Error Point Estimate Standard Error
High School 0.1111 0.0147 0.0156 0.0020
College 0.0452 0.0077 0.0392 0.0052
Overall 0.0679 0.0107 0.0328 0.0042

3.3.4 Risk Aversion and More General Market Structures

In deriving the estimates presented in this paper, we have assumed risk neutrality and access to credit markets. It would be informative to estimate a more general model with risk–averse agents trading in incomplete markets. Introducing risk aversion and different credit market structures into our analysis raises a general set of questions about the identification of the model of CHN.

A basic question, first posed by CHN (2005), is: what can be identified in more general environments? In the absence of perfect certainty or perfect risk sharing, preferences and credit market environments also determine schooling choices. The separation theorem used in this paper, which allows consumption and schooling decisions to be analyzed in isolation from each other, breaks down.

If we a priori postulate information arrival processes, and assume that preferences are known up to some unknown parameters as in Flavin (1981), Blundell and Preston (1998), and Blundell, Pistaferri, and Preston (2008), we can identify departures from specified market structures. Flavin (1981), Blundell and Preston (1998), and Blundell, Pistaferri, and Preston (2008) specify explicit time series processes for the unobservables (e.g., ARMA or fixed effect/AR-1 models) with unknown coefficients but prespecified serial correlation structures and assume that the innovations in these processes are the uncertainty components while the predictable components are known to agents.36

One can add consumption data to the schooling choice and earnings data to secure identification of risk preference parameters (within a parametric family) and information sets, and to test among alternative models of market environments. Navarro (2011) analyzes consumption and earnings data using a CRRA utility function (assumed to be the same for all persons) and an Aiyagari (1994) borrowing constraint. Doing so has substantial effects on the educational choices and estimates of the contribution of uncertainty to earnings variability. Adding these features substantially reduces the estimated level of uncertainty for both college and high school states but especially so for the college state. He estimates that fully 81% of the variance in observed college earnings is predicted as opposed to 44% of the variance in high school earnings.37

Alternative assumptions about what analysts know produce different interpretations of the same evidence. An open question, not yet fully resolved in the literature, is how far one can go in nonparametrically jointly identifying preferences, market structures and agent information sets. The lack-of-full-insurance interpretation given in the empirical analyses of Flavin (1981) and Blundell, Pistaferri, and Preston (2008) may instead be a consequence of their misspecification of the generating processes of agent information sets.

3.3.5 Accounting for Inequality

Instead of estimating a model with risk aversion, in this paper we draw on a large literature on inequality measurement that evaluates alternative distributions of earnings using a variety of indices and social welfare functions.38 These criteria embody social preferences toward inequality aversion. We contribute to this literature by distinguishing the contributions to inequality arising from uncertainty and the contributions arising from predictable components. These are measured with respect to information sets at the college going age.

We simulate the distribution of the observed present value of age-truncated earnings and compute the Gini coefficient, the Theil Entropy Index, and the Atkinson Index under different scenarios. For each cohort $k$, we write the observed earnings of individual $i$ at time $t$ as $Y_{k,i,t}$ and the earnings of that individual at schooling level $s$ as $Y_{k,s,i,t}$. Let $S_{k,i} = 1$ if person $i$ graduates college and $S_{k,i} = 0$ if person $i$ graduates high school. We may write

$$Y_{k,i,t} = S_{k,i} Y_{k,1,i,t} + (1 - S_{k,i}) Y_{k,0,i,t}$$

and

$$Y_{k,i} = \sum_{t=1}^{T} \frac{Y_{k,i,t}}{(1+\rho)^{t-1}}.$$
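As a minimal sketch of the two expressions above, the following code builds observed earnings from the schooling indicator and discounts them to a present value. The earnings paths and the discount rate are hypothetical values chosen only for illustration, and the function names are not from the paper.

```python
def observed_earnings(s, college, high_school):
    """Switching rule: Y_t = S * Y_{1,t} + (1 - S) * Y_{0,t} for each period t."""
    return [s * y1 + (1 - s) * y0 for y1, y0 in zip(college, high_school)]


def present_value(earnings, rho):
    """Discount Y_1..Y_T to period 1: sum over t of Y_t / (1 + rho)**(t - 1)."""
    # enumerate starts at 0, so the exponent equals t - 1 for t = 1..T.
    return sum(y / (1.0 + rho) ** power for power, y in enumerate(earnings))


# Hypothetical three-period earnings paths and discount rate.
college_path = [18.0, 30.0, 42.0]
high_school_path = [24.0, 26.0, 28.0]
rho = 0.03

pv_if_college = present_value(observed_earnings(1, college_path, high_school_path), rho)
pv_if_high_school = present_value(observed_earnings(0, college_path, high_school_path), rho)
print(round(pv_if_college, 2), round(pv_if_high_school, 2))
```

Only one of the two present values is observed for any individual, which is why the inequality indices below are computed on the observed $Y_{k,i}$.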

Inequality measures for the distribution of $Y_{k,i}$ in each cohort are displayed in the first row of Table 5A (for the Gini index), Table 5B (for the Theil index) and Table 5C (for the Atkinson index). The NLSY/1979 cohort is more unequal than the NLS/1966 cohort under every inequality measure we use. The Gini coefficient (Table 5A) grows by 16% from the earlier cohort to the later cohort.39 Table 5B shows that the Theil Entropy Index T grew by 38% from the NLS/1966 to the NLSY/1979. One advantage of the Theil Index is that it decomposes overall inequality into inequality within and between schooling groups. Within-group inequality grew by about 29% and between-group inequality grew by about 450%.
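The within/between decomposition of the Theil index can be sketched as below. This is a hedged illustration using the standard Theil T formula with earnings-share weights. The two small samples are hypothetical, and the paper's own computation may differ in implementation details.

```python
import math


def theil_t(values):
    """Theil T index: mean of (y / mu) * ln(y / mu) over strictly positive values."""
    mu = sum(values) / len(values)
    return sum((y / mu) * math.log(y / mu) for y in values) / len(values)


def theil_decomposition(groups):
    """Split the overall Theil T into (within, between) group components.

    `groups` maps a group label to a list of strictly positive values.
    """
    pooled = [y for ys in groups.values() for y in ys]
    n = len(pooled)
    mu = sum(pooled) / n
    within = 0.0
    between = 0.0
    for ys in groups.values():
        mu_g = sum(ys) / len(ys)
        share = (len(ys) * mu_g) / (n * mu)  # group's share of total earnings
        within += share * theil_t(ys)
        between += share * math.log(mu_g / mu)
    return within, between


# Hypothetical present values of earnings by schooling group.
sample = {"high_school": [30.0, 40.0, 50.0], "college": [45.0, 70.0, 95.0, 110.0]}
within, between = theil_decomposition(sample)
overall = theil_t([y for ys in sample.values() for y in ys])
print(round(within, 4), round(between, 4), round(overall, 4))
```

The two components sum to the overall index, which is the property visible in Table 5B (for example, 0.0491 + 0.0011 = 0.0502 for NLS/1966).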

Table 5.

Predictable Heterogeneity

A. Gini Decomposition

NLS/66 NLSY/79 % Growth
Factual Economy: Predictable Heterogeneity and Uncertainty1 0.1803 0.2088 15.85%
Counterfactual: Fixing Schooling Choices as in Factual Economy Predictable Heterogeneity Only2 0.1591 0.1825 14.73%

B. The Theil Entropy Index T (Overall)

NLS/66 NLSY/79 % Growth
Factual Economy: Predictable Heterogeneity and Uncertainty1 0.0502 0.0693 37.98%
Counterfactual: Fixing Schooling Choices as in Factual Economy Predictable Heterogeneity Only2 0.0390 0.0522 33.76%

Within Schooling Groups

NLS/66 NLSY/79 % Change
Factual Economy: Predictable Heterogeneity and Uncertainty1 0.0491 0.0631 28.53%
Counterfactual: Fixing Schooling Choices as in Factual Economy Predictable Heterogeneity Only2 0.0378 0.0465 22.85%

Between Schooling Groups

NLS/66 NLSY/79 % Change
Factual Economy: Predictable Heterogeneity and Uncertainty1 0.0011 0.0062 447.37%
Counterfactual: Fixing Schooling Choices as in Factual Economy Predictable Heterogeneity Only2 0.0011 0.0057 394.22%
1
Let $Y_{k,s,t,i}$ denote the earnings of an agent $i$, $i = 1, \ldots, n_k$, at age $t$, $t = 22, \ldots, 36$, in schooling level $s$, $s$ = high school, college, and cohort $k$, $k$ = NLS/1966, NLSY/1979. We model earnings $Y_{k,s,t,i}$ as:
$$Y_{k,s,t,i} = \mu_{s,k}(X_k) + \theta_{1,k,i}\alpha_{1,k,s,t,i} + \theta_{2,k,i}\alpha_{2,k,s,t,i} + \theta_{3,k,i}\alpha_{3,k,s,t,i} + \varepsilon_{k,s,t,i}. \quad \text{(i)}$$
The present value of earnings at schooling level $s$, $Y_{k,s,i}$, is $Y_{k,s,i} = \sum_{t=1}^{T} \frac{Y_{k,s,t,i}}{(1+\rho)^{t-1}}$. The observed present value of earnings satisfies $Y_{k,i} = S_{k,i} Y_{k,1,i} + (1 - S_{k,i}) Y_{k,0,i}$ where $S_{k,i} = 1$ if agent $i$ in cohort $k$ graduates college, and $S_{k,i} = 0$ if the person graduates high school. Let $C_{k,i}$ denote the direct costs for individual $i$ in cohort $k$. The schooling choice is:
$$S_{k,i} = 1 \iff E\left(Y_{k,1,i} - Y_{k,0,i} - C_{k,i} \mid \mathcal{I}_k\right) \geq 0. \quad \text{(ii)}$$
This is the factual economy. In this row, we show the inequality measure in the subtitle.
2
We simulate the economy by replacing (i) with:
$$Y^{h}_{k,s,t,i} = \mu_{s,k}(X_k) + \theta_{1,k,i}\alpha_{1,k,s,t,i} + \theta_{2,k,i}\alpha_{2,k,s,t,i},$$
where $Y^{h}_{k,s,t,i}$ are the individual earnings when idiosyncratic uncertainty is completely shut down. The present value of earnings when only heterogeneity is accounted for is constructed in a similar manner: $Y^{h}_{k,s,i} = \sum_{t=1}^{T} \frac{Y^{h}_{k,s,t,i}}{(1+\rho)^{t-1}}$. The schooling choices are as determined in (ii). In this row, we show the inequality measure for the concept given in the subtitle for the observed truncated present value of earnings $Y^{h}_{k,s,i}$ when we constrain schooling choices to be the same as in the economy that generates the first row.
Table 5C.

Atkinson Index

ε = 0.5 ε = 1.0
NLS/66 NLSY/79 % Change NLS/66 NLSY/79 % Change
Factual Economy: Predictable Heterogeneity and Uncertainty1 0.0276 0.0389 0.4111 0.0586 0.0847 0.4446
Counterfactual: Fixing Schooling Choices as in Factual Economy Predictable Heterogeneity Only2 0.0213 0.0286 0.3437 0.0447 0.0604 0.3503

ε = 1.5 ε = 2.0
NLS/66 NLSY/79 % Change NLS/66 NLSY/79 % Change
Factual Economy: Predictable Heterogeneity and Uncertainty1 0.0968 0.1467 0.5147 0.1627 0.2627 0.6149
Counterfactual: Fixing Schooling Choices as in Factual Economy Predictable Heterogeneity Only2 0.0716 0.0980 0.3687 0.1060 0.1506 0.4205
1
Let $Y_{k,s,t,i}$ denote the earnings of an agent $i$, $i = 1, \ldots, n_k$, at age $t$, $t = 1, \ldots, T$, in schooling level $s$, $s$ = high school, college, and cohort $k$, $k$ = NLS/1966, NLSY/1979. We model earnings $Y_{k,s,t,i}$ as:
$$Y_{k,s,t,i} = \mu_{s,k}(X_k) + \theta_{1,k,i}\alpha_{1,k,s,t,i} + \theta_{2,k,i}\alpha_{2,k,s,t,i} + \theta_{3,k,i}\alpha_{3,k,s,t,i} + \varepsilon_{k,s,t,i}. \quad \text{(i)}$$
The present value of earnings in schooling level $s$, $Y_{k,s,i}$, is $Y_{k,s,i} = \sum_{t=1}^{T} \frac{Y_{k,s,t,i}}{(1+\rho)^{t-1}}$. The observed truncated present value of earnings is $Y_{k,i} = S_{k,i} Y_{k,1,i} + (1 - S_{k,i}) Y_{k,0,i}$. Let $C_{k,i}$ denote the direct costs for individual $i$ in cohort $k$. The schooling choice is:
$$S_{k,i} = 1 \iff E\left(Y_{k,1,i} - Y_{k,0,i} - C_{k,i} \mid \mathcal{I}_k\right) \geq 0. \quad \text{(ii)}$$
This is the factual economy. We then compute the average present value of earnings across all individuals in cohort $k$, $\mu_k = \frac{1}{n_k}\sum_{i=1}^{n_k} Y_{k,i}$. For a given inequality aversion parameter $\epsilon$, we compute the level of permanent income $Y_k(\epsilon)$ that generates the same welfare as the social welfare of the actual distribution in cohort $k$:
$$\frac{[Y_k(\epsilon)]^{1-\epsilon} - 1}{1-\epsilon} = \frac{1}{n_k}\sum_{i=1}^{n_k} \frac{(Y_{k,i})^{1-\epsilon} - 1}{1-\epsilon}$$
For each value of $\epsilon$, the Atkinson Index is $A(\epsilon) = 1 - \frac{Y_k(\epsilon)}{\mu_k}$. In this row, we show the Atkinson Index for the observed present value of earnings $Y_{k,i}$ for different values of $\epsilon$.
2
We simulate the economy by replacing (i) with:
$$Y^{h}_{k,s,t,i} = \mu_{s,k}(X_k) + \theta_{1,k,i}\alpha_{1,k,s,t,i} + \theta_{2,k,i}\alpha_{2,k,s,t,i},$$
where $Y^{h}_{k,s,t,i}$ are the individual earnings when idiosyncratic uncertainty is completely shut down. The present value of earnings when only predictable heterogeneity is accounted for is constructed in a similar manner: $Y^{h}_{k,s,i} = \sum_{t=1}^{T} \frac{Y^{h}_{k,s,t,i}}{(1+\rho)^{t-1}}$. The schooling choices are as determined in (ii). In this row, we show the Atkinson Index for the observed present value of earnings $Y^{h}_{k,i}$ for different values of $\epsilon$ when we constrain schooling choices, $S_{k,i}$, to be those observed in the factual economy.

An explicit social welfare approach to measuring earnings inequality proceeds by constructing indexes based on social welfare functions defined over earnings distributions (see Cowell, 2000; Foster and Sen, 1997).40 For each cohort k, let μk denote the average income level computed over incomes of agents i in all schooling groups,

$$\mu_k = \frac{1}{n_k}\sum_{i=1}^{n_k} Y_{k,i},$$

where $n_k$ is the number of persons in our samples of cohort $k$. Given a social welfare function $U(Y_{k,i})$, the Atkinson (1970) index is built from the equally distributed equivalent income: the per-capita level of present value of income $Y_k$ that, if equally distributed, would generate the same level of social welfare as the distribution of earnings in cohort $k$. That is, for the social welfare function advocated by Atkinson (1970), $Y_k$ satisfies:

$$\frac{(Y_k)^{1-\epsilon} - 1}{1-\epsilon} = \frac{1}{n_k}\sum_{i=1}^{n_k} \frac{Y_{k,i}^{1-\epsilon} - 1}{1-\epsilon}.$$

The parameter $\epsilon$ is a measure of inequality aversion ($\epsilon = 0$ corresponds to no inequality aversion; $\epsilon \to \infty$ corresponds to Rawlsian inequality aversion). The Atkinson index $A$ is defined as:

$$A = 1 - \left(\frac{Y_k}{\mu_k}\right).$$

Table 5C reports the Atkinson Index for each cohort and its growth, for different values of the inequality aversion parameter ϵ. For every value of ϵ considered, inequality as measured by the Atkinson Index increased by between roughly 41% and 61%.
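The Atkinson index defined above can be computed directly from a sample of present values. The sketch below is a minimal illustration under the stated welfare function; the case ϵ = 1 is handled through its logarithmic limit, and the sample values are hypothetical.

```python
import math


def atkinson_index(values, eps):
    """Atkinson index A = 1 - Y_ede / mu for strictly positive values and eps >= 0."""
    n = len(values)
    mu = sum(values) / n
    if eps == 1.0:
        # As eps -> 1 the welfare function tends to log utility, so the
        # equally distributed equivalent income is the geometric mean.
        y_ede = math.exp(sum(math.log(y) for y in values) / n)
    else:
        mean_power = sum(y ** (1.0 - eps) for y in values) / n
        y_ede = mean_power ** (1.0 / (1.0 - eps))
    return 1.0 - y_ede / mu


# Hypothetical present values of earnings for three individuals.
sample = [10.0, 20.0, 40.0]
for eps in (0.5, 1.0, 1.5, 2.0):
    print(eps, round(atkinson_index(sample, eps), 4))
```

For a fixed distribution the index rises with ϵ, which mirrors the pattern across the columns of Table 5C.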

Our previous analysis established that some portion of the inequality in observed present value of earnings is predictable at the age college decisions are made using the information in I1. We can compare the inequality that is produced by predictable factors (heterogeneity) with overall earnings inequality. This allows us to determine the contribution of uncertainty to overall inequality using a variety of measures. We simulate counterfactual economies in which uncertainty is eliminated. Eliminating uncertainty can be accomplished by simulating an economy in which the unforecastable components are set at their means. We could keep schooling choices fixed at their values in the factual economy or allow agents to re-optimize and see how that affects these measures of inequality. We do both, but differences arising from re-optimized schooling choice are of second order. See the tables in Appendix 3.6–3.8. In the text, we report results holding schooling fixed at its value in the factual economy.
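The logic of the counterfactual can be illustrated with a deliberately simplified simulation. This is a sketch under strong assumptions, not the paper's estimated model: present values are the sum of a common mean, a forecastable component, and an independent mean-zero unforecastable component, all with hypothetical normal distributions, and schooling is ignored. The Gini coefficient is computed with the standard sorted-values formula.

```python
import random


def gini(values):
    """Gini coefficient via the sorted-values formula (non-negative values)."""
    ys = sorted(values)
    n = len(ys)
    total = sum(ys)
    # G = 2 * sum_i(i * y_(i)) / (n * sum(y)) - (n + 1) / n, with ranks i = 1..n.
    weighted = sum(rank * y for rank, y in enumerate(ys, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 5000
mean_pv = 100.0  # hypothetical mean present value of earnings

forecastable = [rng.gauss(0.0, 10.0) for _ in range(n)]   # in agents' information sets
unforecastable = [rng.gauss(0.0, 8.0) for _ in range(n)]  # mean-zero shocks

factual = [mean_pv + f + u for f, u in zip(forecastable, unforecastable)]
# Counterfactual: unforecastable components set at their mean (zero here),
# holding everything else fixed.
counterfactual = [mean_pv + f for f in forecastable]

g_factual = gini(factual)
g_counterfactual = gini(counterfactual)
print(round(g_factual, 4), round(g_counterfactual, 4))
```

Comparing the two Gini coefficients within a cohort gives the contribution of uncertainty to the level of inequality; repeating the exercise for each cohort gives its contribution to the growth in inequality, which is the comparison made across the rows of Tables 5A to 5C.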

The second row of Table 5A presents the Gini coefficient for the economy without uncertainty in future earnings, fixing schooling choices as in the factual economy. In this case, the Gini coefficient for the NLS/1966 would be 0.16 and for the NLSY/1979 would be 0.18, which represents a growth of less than 15% in inequality as measured by the Gini coefficient. The analogous calculation for the Theil index reported in Table 5B shows that the Overall Theil Index would have grown by 34% if uncertainty were eliminated, while the Within and Between Theil Indexes would have grown by 23% and 394%, respectively. The analogous exercise for the Atkinson index predicts an increase of between 34% and 42% (see Table 5C).

These calculations show that rising inequality in the aggregate as measured by conventional inequality indices is largely driven by rising heterogeneity. However, as documented in Table 3, there are sharp differences in the contribution of rising uncertainty to inequality for different schooling groups. The rise in high school graduate earnings variability is due to a substantial rise in inequality due to uncertainty. Uncertainty in college graduate earnings has not increased substantially, although predictable components have become more variable.

4 Summary and Conclusion

This paper investigates the sources of rising wage inequality in the US labor market for white males in a period ranging from the mid-1960s to 2005, prior to the 2008 meltdown. We find that increasing inequality arises both from increasing micro uncertainty and increasing predictable components of variation. The latter could arise from increased specialization in labor markets, but we present no direct evidence on this question. Both predictable and unpredictable components of earnings have increased since the mid-1960s. The fraction of the variability due to micro uncertainty has increased, especially for less skilled workers. Aggregate uncertainty decreased prior to the 2008 meltdown, especially for unskilled workers. Micro uncertainty dwarfs macro uncertainty. Our evidence of substantially increased uncertainty at the micro level for recent cohorts of unskilled labor supports the increased turbulence hypothesis of Ljungqvist and Sargent (1998, 2008). Conventional measures of aggregate inequality do not reveal the substantial contribution of the rise in the uncertainty of the earnings of less skilled workers to the observed rise in the inequality of their earnings.

Supplementary Material

Appendix

Acknowledgments

This research was supported in part by: the American Bar Foundation; the Pritzker Children's Initiative; the Buffett Early Childhood Fund; NIH grants NICHD R37HD065072, NICHD R01HD54702, and NIA R24AG048081; an anonymous funder; Successful Pathways from School to Work, an initiative of the University of Chicago's Committee on Education funded by the Hymen Milgrom Supporting Organization; and the Human Capital and Economic Opportunity Global Working Group, an initiative of the Center for the Economics of Human Development, affiliated with the Becker Friedman Institute for Research in Economics, and funded by the Institute for New Economic Thinking. The views expressed in this paper are solely those of the authors and do not necessarily represent those of the funders or the official views of the National Institutes of Health. Cunha is grateful to the Claudio Haddad Dissertation Fund at the University of Chicago and Rob Dugger for research support. This article builds on research reported in Cunha, Heckman, and Navarro (2005) and NIH Grant R01HD073221. We thank the editor, Paul Oyer, for comments on this draft. We also thank Ray Fair, Lars Hansen, Pat Kehoe, Robert Lucas, Salvador Navarro, Tom Sargent, Robert Shimer, Robert Townsend and Kenneth Wolpin for comments on various drafts of this paper. 
We have benefited from comments received at the Money and Banking Workshop, University of Chicago, from comments received by participants at the Ely Lectures at Johns Hopkins University, the 9th Econometric Society World Congress at University College London, the Economic Dynamics Working Group at the University of Chicago, the Empirical Dynamic General Equilibrium Conference at the Centre for Applied Microeconometrics, the Macroeconomics of Imperfect Risk Sharing Conference at the University of California at Santa Barbara, the Society for Economic Dynamics, the Koopmans Memorial Lectures at Yale, the Federal Reserve Bank of Minneapolis Applied Micro Workshop, Tom Sargent's Macro Reading Group at New York University, a UCL Micro Workshop 2009, Student Working Group 2012, and numerous cohorts of Econ 350 students at the University of Chicago. The website for the supplementary material to this paper is http://jenni.uchicago.edu/evo-earn/.

Footnotes

1

Keane and Wolpin (1997) estimate that 90% of lifetime variability is predictable by young adults. Johnson (2013) reports estimates consistent with those reported in CHN.

3

The Sims (1972) test for noncausality is based on a related idea in a linear prediction framework. Whereas Sims tests whether future Yt predict current C1, we measure what fraction of future Yt predict current C1 and use a more general prediction process.

4

In other work (Cunha and Heckman, 2011), we use annual labor supply to estimate information sets at multiple stages of the life cycle.

5

A Web Appendix presents semiparametric proofs of identification based on their work.

6

See Almlund, Duckworth, Heckman, and Kautz (2011) for a discussion of cognitive tests.

7

Alternatively, we can interpret the factors as residualized versions of θ controlling for X and Z.

8

Applying the analyses of Schennach (2004) and Cunha, Heckman, and Schennach (2010), identification of the model can be secured under much weaker conditions.

9

In our empirical analysis, we test for the presence or absence of components in X, Z, and the εs,t that are in ex-ante information sets.

10

In our Web Appendix, we restate their formal proofs of identification. They identify the distributions of factors nonparametrically. Test score data are not strictly required to secure identification. See, e.g., Abbring and Heckman (2007).

11

See Part 3 of the Web Appendix.

12

CHN interpret the factor loadings in the earnings equations as prices of unobserved skills that they interpret as factors. In this paper we do not adopt that interpretation. We allow agents to be uncertain about their future skills, future prices, or both. We interpret the factor loadings as convenient statistical devices for representing the components of realized earnings no matter what their source. Thus we do not maintain the perfect foresight assumption about future skill prices used by CHN.

13

See Miller (2004) for a description of the NLSY data.

14

See documentation at http://www.bls.gov/nls/handbook/2005/nlshc6.pdf for a description of the NLS data.

15

http://jenni.uchicago.edu/evo-earn/. The Web Appendix has five parts: Web Appendix 1 contains a description of the samples; Web Appendix 2 presents a description of the estimated model, including the goodness of fit tests; Web Appendix 3 provides a review on the identification of the model; Web Appendix 3.5 discusses the estimates of the joint distribution of outcomes; Web Appendix 3.6 presents the results of the schooling choice on our measures of aggregate inequality.

16

In this paper, we do not take a position on the sources of predictable variability or uncertainty. The former might come from cost of living differentials (e.g., Black, Kolesnikova, and Taylor, 2009; Moretti, 2013) or from variance arising from life cycle investment (see Mincer, 1974 or Lemieux, 2006). Both components could have changed as the labor market became more demographically diverse and the white males we study faced increasing competition. Katz and Autor (1999) and Acemoglu and Autor (2011) discuss other factors contributing to the observed rise in wage inequality.

17

See Cameron and Heckman (2001) for details on the construction of our tuition variables.

18

Earnings figures are adjusted for inflation using the CPI and we take the year 2000 as the base year.

19

See Web Appendix Figures 1.3 and 1.4 respectively.

20

M in the notation of section 2.

21

We use information from three different tests from the “Knowledge of the World of Work” survey. The first is a question regarding occupation: the respondent is asked about the duties of a given profession, say draftsman. For this specific example, there are three possible answers: (a) makes scale drawings of products or equipment for engineering or manufacturing purposes, (b) mixes and serves drinks in a bar or tavern, (c) pushes or pulls a cart in a factory or warehouse. The second test is a test that asks for each occupation in the first test, the level of education associated with that occupation. The third test is an earnings comparison test. Specifically, it asks the respondent who he/she believes makes more in a year, comparing two different occupations. In Web Appendix Table 2.1 we show that even after controlling for parental education, number of siblings, urban residence at age 14, and dummies for year of birth, the “Knowledge of the World of Work” test scores are correlated with the cognitive test scores. The correlation with OTIS/BETA/GAMMA and CTMM is stronger for the occupation and education tests than for the earnings-comparison test.

22

In our analyses of both the NLSY/1979 and NLS/1966 data we include mother's education, father's education, number of siblings, urban residence at age 14, dummies for year effects and an intercept. In the NLSY/1979 sample we also control for whether the test taker is enrolled in school and the highest grade completed at the time of the test. In the NLS/1966 all of the respondents were enrolled in school at the time of the test (in fact, the test score is obtained in a survey from schools). We do not know the highest grade completed at the time of the test for the NLS/1966 sample.

24

In the next subsection and at our website, we discuss the goodness-of-fit measures used to select the appropriate model for each sample.

25

Because we control for ability and other unobservables captured by the factors, our parsimonious specification of the earnings equations is less controversial.

26

Ferguson (1983) shows that mixtures of normals with a large number of components approximate any distribution of θk arbitrarily well in the ℓ1 norm.

27

Additional components do not improve the goodness of fit of the model to the data.

29

The Web Appendix shows fits for all ages. See Web Appendix Figures 2.1 through 2.90 for the overall, high school, and college earnings, for both the NLSY/1979 and NLS/1966.

31

Figures 2.97–2.102 plot the estimated densities of the factors for the NLS 1966 and 1979 NLSY samples by attained schooling level.

32

Abbring and Heckman (2007) discuss a variety of alternative assumptions used to identify joint counterfactual distributions.

33

See Figures 2.91–2.92 for high school and college earnings, respectively for the NLSY/1979 cohort and Figures 2.94–2.95 for the corresponding figures for the NLS/1966 cohort.

34

This is a recurrent finding in the literature. See Abbott, Gallipoli, Meghir, and Violante (2013) and Eisenhauer, Heckman, and Mosso (2015).

35

We face the standard problem of the lack of simultaneous identification of age, period and cohort effects so we cannot identify cohort effects in the presence of age and time effects. Thus our estimates of uncertainty of time effects can also be interpreted as estimates of uncertainty of cohort effects. See Heckman and Robb (1985) for a discussion of this problem and a demonstration of the interactions that can be identified.

36

Hansen (1987) shows a fundamental nonidentification result for the Flavin model estimated on aggregate data. Our use of micro panel data circumvents the problem he raises.

37

Navarro's sample corresponds most closely to our NLSY/1979 sample. He estimates the model for a single cohort and so he does not address the issue of the evolution of uncertainty discussed in this paper. He also does not report separate estimates of the effects of allowing for risk aversion and adding credit constraints to CHN.

39

The low level of the Gini coefficient arises from the averaging of incomes that occurs in constructing present values, from the fact that we study white males only, and from the truncation of the present value term due to data limitations.

40

Anand (1983) presents a useful summary of the indices used in this literature.

References

  1. Abbott Brant, Gallipoli Giovanni, Meghir Costas, Violante Giovanni L. Working paper 18782. NBER; 2013. Education policy and intergenerational transfers in equilibrium. [Google Scholar]
  2. Abbring Jaap H., Heckman James J. Econometric evaluation of social programs, part III: Distributional treatment effects, dynamic treatment effects, dynamic discrete choice, and general equilibrium policy evaluation. In: Heckman J, Leamer E, editors. Handbook of Econometrics. 6B. Elsevier; Amsterdam: 2007. pp. 5145–5303. [Google Scholar]
  3. Acemoglu Daron, Autor David. Skills, tasks and technologies: Implications for employment and earnings. In: Ashenfelter Orley, Card David., editors. Handbook of Labor Economics, Handbooks in Economics. Vol. 4. Elsevier; Amsterdam: 2011. pp. 1043–1171. Part B. chap. 12. [Google Scholar]
  4. Aiyagari S. Rao. Uninsured idiosyncratic risk and aggregate saving. Quarterly Journal of Economics. 1994;109(3):659–684. [Google Scholar]
  5. Almlund Mathilde, Duckworth Angela, Heckman James J., Kautz Tim. Personality psychology and economics. IZA Discussion Paper, no. No. 5500. 2011 http://ftp.iza.org/dp5500.pdf.
  6. Anand Sudhir. Inequality and Poverty in Malaysia: Measurement and Decomposition. Published for the World Bank by Oxford University Press; New York: 1983. [Google Scholar]
  7. Atkinson Anthony B. On the measurement of inequality. Journal of Economic Theory. 1970;2(3):244–266. [Google Scholar]
  8. Atkinson Anthony B., Bourguignon François. Introduction: Income distribution and economics. In: Atkinson Anthony B., Bourguignon François., editors. Handbook of Income Distribution. Vol. 1. North-Holland; Amsterdam: 2000. pp. 1–58. [Google Scholar]
  9. Becker Gary Stanley. Human Capital: A Theoretical and Empirical Analysis, with Special Reference to Education. National Bureau of Economic Research; 1964. [Google Scholar]
  10. Black Dan, Kolesnikova Natalia, Taylor Lowell. Earnings functions when wages and prices vary by location. Journal of Labor Economics. 2009;27(1):21–47. [Google Scholar]
  11. Blundell Richard, Pistaferri Luigi, Preston Ian. Consumption inequality and partial insurance. American Economic Review. 2008;98(5):1887–1921. [Google Scholar]
  12. Blundell Richard, Preston Ian. Consumption inequality and income uncertainty. Quarterly Journal of Economics. 1998;113(2):603–640. [Google Scholar]
  13. Cameron Stephen V., Heckman James J. The dynamics of educational attainment for black, hispanic, and white males. Journal of Political Economy. 2001;109(3):455–499. [Google Scholar]
  14. Carneiro P, Hansen KT, Heckman JJ. 2001 Lawrence R. Klein Lecture Estimating Distributions of Treatment Effects with an Application to the Returns to Schooling and Measurement of the Effects of Uncertainty on College Choice. International Economic Review. 2003;44(2):361–422. [Google Scholar]
  15. Chamberlain Gary, Griliches Zvi. Unobservables with a variance-components structure: Ability, schooling, and the economic success of brothers. International Economic Review. 1975;16(2):422–449.
  16. Cowell Frank A. Measurement of inequality. In: Atkinson Anthony B., Bourguignon François., editors. Handbook of Income Distribution. Vol. 1. North-Holland; Amsterdam: 2000. pp. 87–166.
  17. Cunha Flavio, Heckman James J. A new framework for the analysis of inequality. Macroeconomic Dynamics. 2008;12(Supplement 2):315–354.
  18. Cunha Flavio, Heckman James J. The evolution of inequality, heterogeneity and uncertainty in labor earnings in the U.S. economy. University of Pennsylvania; 2011. Unpublished manuscript.
  19. Cunha Flavio, Heckman James J., Navarro Salvador. Separating uncertainty from heterogeneity in life cycle earnings, The 2004 Hicks Lecture. Oxford Economic Papers. 2005;57(2):191–261.
  20. Cunha Flavio, Heckman James J., Schennach Susanne M. Estimating the technology of cognitive and noncognitive skill formation. Econometrica. 2010;78(3):883–931. doi: 10.3982/ECTA6551.
  21. Eisenhauer Philipp, Heckman James, Mosso Stefano. Estimation of dynamic discrete choice models by maximum likelihood and the simulated method of moments. Forthcoming, International Economic Review. 2015 doi: 10.1111/iere.12107.
  22. Ferguson Thomas S. Bayesian density estimation by mixtures of normal distributions. In: Chernoff H, Rizvi MH, Rustagi J, Siegmund D, editors. Recent Advances in Statistics: Papers in Honor of Herman Chernoff on his Sixtieth Birthday. Academic Press; New York: 1983. pp. 287–302.
  23. Flavin Marjorie A. The adjustment of consumption to changing expectations about future income. Journal of Political Economy. 1981;89(5):974–1009.
  24. Foster James E., Sen Amartya K. On Economic Inequality. Oxford University Press; New York: 1997.
  25. Friedman Milton, Kuznets Simon Smith. Income from Independent Professional Practice. National Bureau of Economic Research; New York: 1945.
  26. Gordon Robert J. What caused the decline in U.S. business cycle volatility? In: Kent Christopher, Norman David., editors. The Changing Nature of the Business Cycle. Economics Group, Reserve Bank of Australia; Sydney, Australia: 2005. pp. 61–104. Proceedings of a conference held at the H.C. Coombs Centre for Financial Studies, Kirribilli, Australia on 11–12 July 2005.
  27. Gottschalk Peter, Moffitt Robert. The growth of earnings instability in the U.S. labor market. Brookings Papers on Economic Activity. 1994;2:217–254.
  28. Gottschalk Peter, Moffitt Robert. The rising instability of U.S. earnings. Journal of Economic Perspectives. 2009;23(4):3–24.
  29. Haider Steven J. Earnings instability and earnings inequality of males in the United States: 1967–1991. Journal of Labor Economics. 2001;19(4):799–836.
  30. Hansen Lars Peter. Calculating asset prices in three example economies. In: Bewley Truman F., editor. Advances in Econometrics: Fifth World Congress. Vol. 1. Cambridge University Press; New York: 1987. pp. 207–243.
  31. Heckman James J. A life-cycle model of earnings, learning, and consumption. Journal of Political Economy. 1976;84(4):S11–S44. Part 2. Journal Special Issue: Essays in Labor Economics in Honor of H. Gregg Lewis.
  32. Heckman James J. Sample selection bias as a specification error. Econometrica. 1979;47(1):153–162.
  33. Heckman James J. Statistical models for discrete panel data. In: Manski C, McFadden D, editors. Structural Analysis of Discrete Data with Econometric Applications. MIT Press; Cambridge, MA: 1981. pp. 114–178.
  34. Heckman James J., LaFontaine Paul A. The American high school graduation rate: Trends and levels. Review of Economics and Statistics. 2010;92(2):244–262. doi: 10.1162/rest.2010.12366.
  35. Heckman James J., Lochner Lance J., Todd Petra E. Earnings functions, rates of return and treatment effects: The Mincer equation and beyond. In: Hanushek Eric A., Welch Frank., editors. Handbook of the Economics of Education. Vol. 1. Elsevier; Amsterdam: 2006. pp. 307–458. chap. 7.
  36. Heckman James J., Polachek Solomon. Empirical evidence on the functional form of the earnings-schooling relationship. Journal of the American Statistical Association. 1974;69(346):350–354.
  37. Heckman James J., Robb Richard. Using longitudinal data to estimate age, period and cohort effects in earnings equations. In: Mason William M., Fienberg Stephen E., editors. Cohort Analysis in Social Research: Beyond the Identification Problem. Springer-Verlag; New York: 1985.
  38. Heckman James J., Vytlacil Edward J. Econometric evaluation of social programs, part I: Causal models, structural models and econometric policy evaluation. In: Heckman J, Leamer E, editors. Handbook of Econometrics. 6B. Elsevier; Amsterdam: 2007a. pp. 4779–4874.
  39. Heckman James J., Vytlacil Edward J. Econometric evaluation of social programs, part II: Using the marginal treatment effect to organize alternative economic estimators to evaluate social programs and to forecast their effects in new environments. In: Heckman J, Leamer E, editors. Handbook of Econometrics. 6B. Elsevier; Amsterdam: 2007b. pp. 4875–5143. chap. 71.
  40. Jensen Shane T., Shore Stephen H. Semiparametric Bayesian modeling of income volatility heterogeneity. Journal of the American Statistical Association. 2011;106(496):1280–1290.
  41. Johnson Matthew T. Borrowing constraints, college enrollment, and delayed entry. Journal of Labor Economics. 2013;31(4):669–725.
  42. Katz Lawrence F., Autor David H. Changes in the wage structure and earnings inequality. In: Ashenfelter O, Card D, editors. Handbook of Labor Economics. Vol. 3. North-Holland; New York: 1999. pp. 1463–1555. chap. 26.
  43. Keane Michael P., Wolpin Kenneth I. The career decisions of young men. Journal of Political Economy. 1997;105(3):473–522.
  44. Lemieux Thomas. Increasing residual wage inequality: Composition effects, noisy data, or rising demand for skill? American Economic Review. 2006;96(3):461–498.
  45. Ljungqvist Lars, Sargent Thomas J. The European unemployment dilemma. Journal of Political Economy. 1998;106(3):514–550.
  46. Ljungqvist Lars, Sargent Thomas J. Two questions about European unemployment. Econometrica. 2008;76(1):1–29.
  47. MaCurdy Thomas E. The use of time series processes to model the error structure of earnings in a longitudinal data analysis. Journal of Econometrics. 1982;18(1):83–114.
  48. MaCurdy Thomas E. A practitioner's approach to estimating intertemporal relationships using longitudinal data: Lessons from applications in wage dynamics. In: Heckman James J., Leamer Edward., editors. Handbook of Econometrics, Handbooks in Economics. Vol. 6. Elsevier; Amsterdam: 2007. pp. 4057–4167. chap. 62.
  49. Matzkin Rosa L. Nonparametric and distribution-free estimation of the binary threshold crossing and the binary choice models. Econometrica. 1992;60(2):239–270.
  50. Meghir Costas, Pistaferri Luigi. Income variance dynamics and heterogeneity. Econometrica. 2004;72(1):1–32.
  51. Meghir Costas, Pistaferri Luigi. Earnings, consumption and life cycle choices. In: Ashenfelter Orley, Card David., editors. Handbook of Labor Economics. Vol. 4. 2011.
  52. Miller Shaum. The National Longitudinal Surveys NLSY79 User's Guide 1979–2002. Bureau of Labor Statistics, U.S. Department of Labor; Washington, DC: 2004.
  53. Mincer Jacob. Schooling, Experience and Earnings. Columbia University Press for National Bureau of Economic Research; New York: 1974.
  54. Moretti Enrico. Real wage inequality. American Economic Journal: Applied Economics. 2013;5(1):65–103.
  55. Navarro Salvador. Working Paper 20118. University of Western Ontario, CIBC Centre for Human Capital and Productivity; 2011. Using observed choices to infer agent's information: Reconsidering the importance of borrowing constraints, uncertainty and preferences in college attendance. URL http://ideas.repec.org/p/uwo/hcuwoc/20118.html.
  56. Quandt Richard E. The estimation of the parameters of a linear regression system obeying two separate regimes. Journal of the American Statistical Association. 1958;53(284):873–880.
  57. Quandt Richard E. A new approach to estimating switching regressions. Journal of the American Statistical Association. 1972;67(338):306–310.
  58. Roy AD. Some thoughts on the distribution of earnings. Oxford Economic Papers. 1951;3(2):135–146.
  59. Schennach Susanne M. Estimation of nonlinear models with measurement error. Econometrica. 2004;72(1):33–75.
  60. Sen Amartya K. Social justice and the distribution of income. In: Atkinson Anthony B., Bourguignon François., editors. Handbook of Income Distribution. Vol. 1. North-Holland; Amsterdam: 2000. pp. 59–85.
  61. Sims Christopher A. Money, income, and causality. American Economic Review. 1972;62(4):540–552.
  62. Taubman Paul. Kinometrics: Determinants of Socioeconomic Success Within and Between Families. North-Holland Publishing Company; New York: 1977.
  63. Willis Robert J., Rosen Sherwin. Education and self-selection. Journal of Political Economy. 1979;87(5):S7–S36. Part 2.
