Author manuscript; available in PMC: 2014 Sep 27.
Published in final edited form as: Hum Hered. 2013 Sep 27;75(0):204–212. doi: 10.1159/000352007

Propagation of Obesity Across Generations: The Roles of Differential Realized Fertility and Assortative Mating by Body Mass Index

John A Dawson 1, Emily J Dhurandhar 2, Ana I Vazquez 3, Bo Peng 4, David B Allison 5
PMCID: PMC4010105  NIHMSID: NIHMS534652  PMID: 24081235

Abstract

Background/Aims

To quantify the extent to which the increase in obesity observed across recent generations of the American population is associated with the individual or combined effects of assortative mating for body mass index (BMI; kg/m²) and differential realized fertility by BMI.

Methods

A Monte Carlo framework is formed and informed using data collected from the National Longitudinal Survey of Youth (NLSY). The model has two portions, one that generates childbirth events on an annual basis and another that produces a BMI for each child. Once the model is informed using the data, a reference distribution of offspring BMIs is simulated. We quantify the effects of our factors of interest by removing them from the model and comparing the resulting offspring BMI distributions with that of the baseline scenario.

Results

The NLSY data show an association between maternal BMI and number of offspring, as well as evidence of assortative mating. These two factors combined are associated with increased mean BMI (+0.067, 95% C.I. [0.056, 0.078]), increased BMI variance (+0.578, 95% C.I. [0.418, 0.736]) and increased prevalence of obesity (RR 1.032, 95% C.I. [1.023, 1.041]) and BMIs over 40 (RR 1.085, 95% C.I. [1.053, 1.118]) among offspring.

Conclusion

Our investigation suggests that both differential realized fertility and assortative mating by BMI appear to play a role in the increasing prevalence of obesity in America.

Keywords: Obesity, Body Mass Index, Assortative Mating, Realized Fertility, Monte Carlo Simulation

Introduction

The prevalence of obesity has increased substantially since 1980, with more than 60% of the United States population now overweight or obese (1, 2). The contribution of both environmental and genetic factors to the population increase in obesity is clear (3). Still, it has been argued that several decades is an insufficient amount of time for the gene pool to change enough to influence obesity (4). It is plausible, however, that genetic factors have influenced the prevalence of obesity in several ways. In a closed population and in the absence of disturbing forces, if a population is in Hardy-Weinberg Equilibrium, then allele frequencies will remain constant across generations. The disturbing forces that may break the equilibrium include mutation, selection, non-random mating, genetic drift, and migration. Moreover, non-random mating can change genotype frequencies even without changing allele frequencies.

Assortative Mating for BMI

First, fairly consistent evidence suggests that human mate choice is phenotypically assortative for body mass index (BMI; kg/m²) (5–19). Intermate correlation for BMI ranges from about 0.05 to 0.25 across many studies, and averages around a correlation of 0.15 in industrialized societies. Intermate correlation of fat mass is higher, at 0.41 (18). The intermate correlation in adiposity does not increase during cohabitation, and appears to be due to mate choice rather than factors such as selective divorce (5, 13, 15, 19). Assortative mating increases genetic variance in offspring (20), and can impact the distribution of phenotypes in subsequent generations. For example, Redden et al. determined that existing intermate BMI correlations alone could increase the prevalence of obesity by 2% over 11 generations if acting in isolation (21).

Differential Realized Fertility by BMI

Realized fertility is another factor that may define the extent to which genes contribute to obesity prevalence (22). Fecundity is the potential to reproduce, or an individual or population’s capacity to produce offspring. However, we are interested not in capacity per se, but rather the realization of that capacity. Since the terminology used in the literature is inconsistent, we shall use realized fertility to describe this factor of interest: the number of children born to a given woman over her lifetime.

Realized fertility is higher in individuals with higher levels of adiposity (23–25). Epidemiological evidence consistently shows that women and couples with higher than average BMI produce more offspring. Because BMI is heritable, increased realized fertility among those with higher BMI may contribute to increased prevalence of obesity by increasing the frequency of BMI-increasing alleles in subsequent generations.

Increased realized fertility in overweight and obese individuals as well as assortative mating may have acted independently or in concert to contribute to the increasing prevalence of obesity over time. We aim to examine the extent to which these factors are associated with increased mean BMI and prevalence of obesity through an empirical Monte Carlo framework for childbirth events and offspring body composition.

Methods

Data Collection

Longitudinal data were collected from the National Longitudinal Survey of Youth (NLSY), 1979 and 1997 cohorts, conducted by the United States Bureau of Labor Statistics (26). The 1979 cohort follows individuals who were aged 14–22 in 1979 from that time forward to the present (we consider data collected up to 2010); the 1997 cohort similarly follows people who were aged 12–17 in 1997. Some information was exceedingly limited in the 1979 survey cohort; in particular, neither height, weight, nor BMI was recorded for partners of the survey respondents. However, note that the mid-1980s was both the time frame in which the 1997 cohort youths were born and the prime child-bearing years for the individuals in the 1979 cohort. For this reason, we assumed that the matings in both cohorts were contemporaneous, and therefore may be considered generally comparable for our purposes. Thus this analysis will primarily make use of the 1979 cohort but will utilize data from the 1997 cohort to inform relationships among mothers, fathers and offspring.

Variables collected from each of the 1979 cohort surveys include (parental) age, height, weight, gender, race/ethnicity, total number of children born to date, total years of education to date and total net family income. BMI values were calculated using self-reported height and weight. Variables collected from the 1997 cohort surveys include (offspring) age, height, weight, race/ethnicity and gender, as well as maternal height, maternal weight, paternal height, paternal weight and maternal gravid age. Data on age, height and weight were used to calculate maternal and paternal BMIs, as well as BMI at age 18 for the offspring. Details related to data cleaning and missing data handling may be found in the Appendix.

Approach

Our overall goal was to simulate the effects of assortative mating (AM) and differential realized fertility (DRF) on the distribution of offspring BMIs. In order to accomplish this, we built a simulation model that reflects AM and DRF and generates offspring BMIs; we will refer to this as the ‘baseline model’. This framework must then be modified to produce offspring BMIs under the removal of AM (random mating; RM), the removal of DRF (no DRF), or both. We will then simulate from the framework under each of four scenarios (RM and no DRF; RM and DRF; AM and no DRF; AM and DRF) and use the results to assess our factors of interest.

In order to make it as realistic as possible, we used data from the NLSY cohorts to inform all aspects of the baseline model. The process of generating a collection of offspring BMIs under the baseline model (an instantiation of this model) is as follows (see Figure 1 for an overview):

Figure 1.

A flow diagram for the operation of the Monte Carlo simulation framework. For a given woman, we generate some number N of offspring and, if any, a paternal BMI under assortative mating, for each year under consideration. We then move forward in time by one year and repeat until we have no more data for the woman under consideration. Once all birth events have been generated, offspring gender(s) and offspring BMI(s) are then simulated twice, once under these pairings (assortative mating present), and once where the paternal BMIs have been shuffled across birth events to enforce random mating.

  1. A group of women will be followed across their lifetimes from adolescence, and their children (if any) will be simulated. Rather than simulating the complex and collinear covariates of these women in an imperfect manner, we use a representative, cross-sectional population of women from the 1979 NLSY cohort as the basis for our baseline model, since we have empirical covariate information across the lifetimes of these women.

  2. In a given year and for a particular woman, the number of children born in our simulation is based on a mapping m1 of time-dependent maternal covariates to an expected number of children; we will refer to this quantity as λt. The number of children born is drawn from a Poisson with expectation λt. If any children are born in a year, this is called a birth event. The mapping m1 is informed using the 1979 cohort.

  3. For each birth event, a paternal BMI is drawn under assortative mating; recall that we do not have corresponding paternal BMIs for the women in the 1979 cohort. This link between maternal covariates and a paternal BMI is encoded in the mapping m2 and is informed using both the 1979 and the 1997 cohorts.

  4. Given all birth events for all women, offspring genders are chosen via biased coin flips, where the probability of a male child is 0.5122 (27).

  5. Lastly, offspring BMIs at age 18 are sampled through the mapping m3, which maps parental covariates and offspring gender to an offspring BMI; this mapping is informed by the 1997 cohort.
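Steps 2 and 4 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the analysis (which was written in R): the function names, the data layout and the callable `expected_births` (a stand-in for the mapping m1) are hypothetical, and the Poisson sampler is a simple textbook method chosen to keep the example dependency-free.

```python
import math
import random

P_MALE = 0.5122  # probability of a male child, as given in step 4


def draw_poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_woman(yearly_covariates, expected_births, rng):
    """Return the birth events of one woman as a list of (year, n_children).

    yearly_covariates: chronological list of (year, covariates) pairs.
    expected_births: hypothetical stand-in for m1; it receives the covariates
    for the year, the children born to date and the children born in the
    previous year, and returns lambda_t.
    """
    events = []
    born_to_date = 0
    born_last_year = 0
    for year, covariates in yearly_covariates:
        lam = expected_births(covariates, born_to_date, born_last_year)
        n_children = draw_poisson(lam, rng)
        if n_children > 0:  # a year with at least one birth is a birth event
            events.append((year, n_children))
        born_to_date += n_children
        born_last_year = n_children
    return events


def assign_sexes(events, rng):
    """Biased coin flip for every child once all birth events are generated."""
    return [
        (year, ["M" if rng.random() < P_MALE else "F" for _ in range(n)])
        for year, n in events
    ]
```

Passing a seeded `random.Random` instance as `rng` makes an instantiation reproducible; the paternal BMI draw (step 3) and the offspring BMI draw (step 5) are deliberately left out of this sketch.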

Once the baseline model has been established, our factors of interest can be removed from it. Assortative mating may be removed by shuffling paternal BMIs within race/ethnicity after step 3 has been completed for all women, producing random mating with respect to BMI. Differential realized fertility may be removed from the model by replacing m1 with a new mapping m1*, which does not use maternal BMI as an input.
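The removal of assortative mating amounts to a permutation of paternal BMIs within each race/ethnicity group. A minimal Python sketch of that idea follows; the dictionary keys `race_ethnicity` and `paternal_bmi` are hypothetical names introduced for illustration.

```python
import random
from collections import defaultdict


def shuffle_paternal_bmis(birth_events, rng):
    """Return a copy of birth_events with paternal BMIs permuted within group.

    Each event is a dict with the (hypothetical) keys "race_ethnicity" and
    "paternal_bmi"; every other field, including maternal covariates, is
    left untouched.
    """
    indices_by_group = defaultdict(list)
    for i, event in enumerate(birth_events):
        indices_by_group[event["race_ethnicity"]].append(i)

    shuffled = [dict(event) for event in birth_events]
    for indices in indices_by_group.values():
        bmis = [birth_events[i]["paternal_bmi"] for i in indices]
        rng.shuffle(bmis)  # random mating with respect to BMI within the group
        for i, bmi in zip(indices, bmis):
            shuffled[i]["paternal_bmi"] = bmi
    return shuffled
```

Because only the pairing changes, the marginal distribution of paternal BMIs within each group is identical in the shuffled and unshuffled versions, which is what makes the two sets of outcomes paired.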

Informing the Model Mappings

The details of the three mappings are as follows: m1 models the number of children born in a given year (λt) as a function of maternal age, BMI, race/ethnicity, income and education as well as the number of children born to date and born in the previous year. This number can be (and usually is) zero and some women will never have any children over their simulated lifetimes. Since the model will draw numbers of children born from a Poisson, we use Poisson regression to map a set of maternal covariates to a value of λt. We consider quadratic, cubic and quartic effects in addition to linear effects when modeling age, maternal BMI, children born to date and number of children born in the previous year, as linear effects alone are insufficient to capture the data. Years of education are reduced to whether or not more than a high school education has been achieved, as the more expansive measure of education was not a better predictor in the regression. Similarly, total net family income is trichotomized into less than $25,000, $25,000–$75,000 and more than $75,000 in a given year. Covariate BMIs were centered at 25 kg/m², the traditional boundary between normal and overweight (28), for ease of interpretation.
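The covariate handling described for m1 can be illustrated with a short Python sketch. Several details are assumptions made for the example rather than facts from the analysis: "more than a high school education" is read as more than 12 years of schooling, incomes exactly on a boundary are placed in the middle band, age is used on its raw scale, and all names are hypothetical. The log link reflects the use of Poisson regression.

```python
import math


def encode_maternal_covariates(age, bmi, years_education, family_income):
    """Encode one woman-year along the lines described for the mapping m1.

    Assumptions (not from the source): more than 12 years of education counts
    as "more than high school", and an income exactly on a boundary falls in
    the middle band.
    """
    bmi_centered = bmi - 25.0  # centre at the normal/overweight boundary
    if family_income < 25_000:
        income_band = "<25K"
    elif family_income <= 75_000:
        income_band = "25-75K"
    else:
        income_band = "75K+"
    return {
        # candidate linear, quadratic, cubic and quartic terms
        "age_powers": [age ** k for k in range(1, 5)],
        "bmi_powers": [bmi_centered ** k for k in range(1, 5)],
        "more_than_high_school": years_education > 12,
        "income_band": income_band,
    }


def expected_births(linear_predictor):
    """Poisson regression uses a log link, so lambda_t = exp(linear predictor)."""
    return math.exp(linear_predictor)
```

Centering BMI at 25 means the intercept and the non-BMI main effects describe a woman at the normal/overweight boundary, which is the interpretive convenience mentioned above.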

The mapping m2 needs to map a mother’s BMI and race/ethnicity, as recorded in the 1979 cohort, to a comparable paternal BMI under assortative mating. Recall that mate BMI information is not available in the 1979 cohort, but BMIs for the parents of the 1997 cohort are available. While we have a suitable pool of male BMIs to draw upon in the 1979 cohort (from the men of the representative, cross-sectional sample), m2 must match each mother with a mate in a manner that maintains the degree of assortative mating observed in the 1997 cohort. As there are differences in BMI distributions across race/ethnicity, within this mapping we restrict all pairings to individuals of the same race/ethnicity.

The mapping m2 is as follows; we provide Figure 2 as a visual aid (and will reference its labels herein). Given a maternal BMI, we calculate the percentile that it corresponds to among all BMIs recorded in the 1979 cohort of women; call it QM (A). We then gather all mothers in the 1997 cohort that have a recorded BMI within a one-percentile window of QM (B) and choose one at random (C). For example, if an African-American mother’s BMI corresponds to the 80th percentile in the 1979 cohort, we gather all African-American mothers from the 1997 cohort whose BMIs fall between the 79th and 81st percentile and choose one at random. The BMI of the selected woman’s mate is then converted into a percentile QP (D), as calculated from the 1997 distribution of fathers. We then conclude the mapping by gathering all men in the 1979 cohort with BMIs within a one-percentile window of QP (E) and choose one at random (F), to be the mate for our original 1979 cohort mother. While laborious, this approach allows us to generate pairings within the 1979 cohort under assortative mating.
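The percentile-matching logic of m2 can be sketched in Python as follows, with the labels (A) to (F) matching those used in the text. This is an illustration only: the function names and data layout are hypothetical, percentiles are computed as the share of values at or below the BMI in question, the inputs are assumed to have been restricted to a single race/ethnicity beforehand, and the sketch does not handle the case where a one-percentile window contains no records.

```python
import bisect
import random


def percentile_of(value, sorted_values):
    """Percentile (0-100) of value: share of sorted_values at or below it."""
    return 100.0 * bisect.bisect_right(sorted_values, value) / len(sorted_values)


def within_window(candidates, key, sorted_reference, target_pct, half_width=1.0):
    """Candidates whose percentile (by key) is within half_width of target_pct."""
    return [
        c for c in candidates
        if abs(percentile_of(key(c), sorted_reference) - target_pct) <= half_width
    ]


def draw_mate_bmi(maternal_bmi, women_1979, couples_1997, men_1979, rng):
    """Draw a 1979-cohort paternal BMI for a 1979-cohort mother.

    women_1979, men_1979: lists of BMIs.
    couples_1997: list of (maternal_bmi, paternal_bmi) pairs.
    All inputs are assumed to come from one race/ethnicity group.
    """
    women_sorted = sorted(women_1979)
    mothers_sorted = sorted(m for m, _ in couples_1997)
    fathers_sorted = sorted(p for _, p in couples_1997)
    men_sorted = sorted(men_1979)

    q_m = percentile_of(maternal_bmi, women_sorted)                           # (A)
    pool = within_window(couples_1997, lambda c: c[0], mothers_sorted, q_m)   # (B)
    _, mate_bmi_1997 = rng.choice(pool)                                       # (C)
    q_p = percentile_of(mate_bmi_1997, fathers_sorted)                        # (D)
    men_pool = within_window(men_1979, lambda b: b, men_sorted, q_p)          # (E)
    return rng.choice(men_pool)                                               # (F)
```

Working on the percentile scale rather than on raw BMI is what lets the degree of assortment seen in the 1997 parents be carried over to the 1979 cohort even if the two BMI distributions differ.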

Figure 2.

An overview of the m2 mapping. Given a maternal BMI, we calculate the percentile that it corresponds to among all BMIs recorded in the 1979 cohort of women; call it QM (A). We then gather all mothers in the 1997 cohort that have a recorded BMI within a one-percentile window of QM (B) and choose one at random (C). The BMI of the selected woman’s mate is then converted into a percentile QP (D), as calculated from the 1997 distribution of fathers. We then conclude the mapping by gathering all men in the 1979 cohort with BMIs within a one-percentile window of QP (E) and choose one at random (F), to be the mate for our original 1979 cohort mother.

For m3, drawing an offspring BMI given parental BMIs, gravid age and offspring gender/race/ethnicity was performed by gathering all 1997 parent-offspring triads that match the given set of covariates and randomly selecting one of them. We allowed the parental BMIs and gravid ages in our selection pool to lie within a window around the actual covariate values. These windows were set to be small (plus or minus 0.5 kg/m² or 0.5 years, as appropriate) but were allowed to grow if an insufficient number (fewer than 50) of matching 1997 records was available.
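A minimal Python sketch of this windowed draw is given below. The text does not specify how the windows were enlarged, so the rule used here (all three windows grow together in steps of 0.5) is an assumption, as are the function name, the dictionary keys and the guard for groups with fewer than 50 records in total.

```python
import random


def draw_offspring_bmi(target, triads, rng, start=0.5, step=0.5, min_records=50):
    """Sample an offspring BMI from 1997-cohort triads near the target covariates.

    target: dict with the (hypothetical) keys maternal_bmi, paternal_bmi,
    gravid_age, sex and race_ethnicity.
    triads: list of dicts with the same keys plus offspring_bmi.
    The growth rule (equal steps for all three windows) is an assumption.
    """
    same_group = [
        t for t in triads
        if t["sex"] == target["sex"]
        and t["race_ethnicity"] == target["race_ethnicity"]
    ]
    if not same_group:
        raise ValueError("no triads share the target sex and race/ethnicity")

    # Guard added for the sketch: never demand more records than exist.
    needed = min(min_records, len(same_group))
    half_width = start
    while True:
        pool = [
            t for t in same_group
            if all(
                abs(t[k] - target[k]) <= half_width
                for k in ("maternal_bmi", "paternal_bmi", "gravid_age")
            )
        ]
        if len(pool) >= needed:
            return rng.choice(pool)["offspring_bmi"]
        half_width += step  # widen the windows and try again
```

Sampling an observed offspring BMI from the matched pool, rather than predicting one from a fitted curve, is what keeps this part of the framework free of parametric assumptions about the offspring BMI distribution.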

For both m2 and m3, we determined the covariates to be used in the mappings by variable selection via linear regression applied to the 1997 cohort data; recall m1 was informed through Poisson regression. In all regressions, a full model containing all pairwise interactions was initially considered and then pared down using backwards selection until only terms that had p-values less than 0.1 remained; terms required by the hierarchy principle (e.g., the constituent main effects of an interaction) were also retained. As noted above, in addition to m1, a mapping m1* was derived for scenarios without DRF, by removing the main effect for maternal BMI as well as all of its interactions from the backwards selection; see Table 1. All regressions, model selection procedures and random draws using the empirical mappings were performed in R 2.14.2 (29).

Table 1.

Model fit for log number of offspring (Mappings m1 and m1*)

Model Term | Coefficient (SE) for m1 | Coefficient (SE) for m1* (with Maternal BMI removed)
Intercept | −3.451 (0.369) | −3.710 (0.340)
Age | 0.562 (0.222) | 0.611 (0.213)
Age^2 | −0.074 (0.041) | −0.079 (0.040)
Age^3 | 0.004 (0.003) | 0.004 (0.003)
Age^4 | −0.00008 (0.00007) | −0.00009 (0.00007)
Body Mass Index (BMI) | 0.066 (0.030) | 0*
BMI^2 | −0.003 (0.0004) | 0*
BMI^3 | 0.00006 (0.00001) | 0*
Children Born to Date (BTD) | 1.501 (0.071) | 1.523 (0.071)
BTD^2 | −1.183 (0.067) | −1.194 (0.067)
BTD^3 | 0.292 (0.020) | 0.295 (0.020)
BTD^4 | −0.022 (0.002) | −0.022 (0.002)
Children Born in the Previous Year (PRV) | −0.732 (0.156) | −0.731 (0.155)
PRV^2 | 0.338 (0.140) | 0.339 (0.139)
Income 25–75K | −1.228 (0.717) | −1.216 (0.740)
Income 75K+ | 0.625 (1.882) | 0.727 (1.727)
Race/Ethnicity is African American (R/Eth (AA)) | 0.191 (0.041) | 0.233 (0.040)
Race/Ethnicity is Hispanic (R/Eth (Hisp)) | 0.150 (0.046) | 0.177 (0.046)
Education | −0.659 (0.043) | −0.702 (0.042)
Age x BMI | −0.019 (0.012) | 0*
Age^2 x BMI | 0.002 (0.001) | 0*
Age^3 x BMI | −0.0001 (0.00007) | 0*
Age^4 x BMI | 0.000002 (0.000001) | 0*
Age x Income 25–75K | −0.255 (0.372) | −0.273 (0.388)
Age x Income 75K+ | −1.062 (0.519) | −1.122 (0.479)
Age^2 x Income 25–75K | 0.088 (0.063) | 0.092 (0.066)
Age^2 x Income 75K+ | 0.181 (0.059) | 0.189 (0.056)
Age^3 x Income 25–75K | −0.006 (0.004) | −0.006 (0.004)
Age^3 x Income 75K+ | −0.010 (0.003) | −0.010 (0.003)
Age^4 x Income 25–75K | 0.0001 (0.00009) | 0.0001 (0.0001)
Age^4 x Income 75K+ | 0.0002 (0.00007) | 0.0002 (0.00007)
BMI x Education | 0.026 (0.005) | 0*
BTD x Income 25–75K | −0.201 (0.031) | −0.213 (0.031)
BTD x Income 75K+ | −0.303 (0.081) | −0.317 (0.081)
BTD x R/Eth (AA) | −0.048 (0.026) | −0.060 (0.026)
BTD x R/Eth (Hisp) | 0.027 (0.028) | 0.024 (0.028)
BTD x Education | 0.226 (0.027) | 0.246 (0.027)
R/Eth (AA) x Income 25–75K | −0.474 (0.067) | −0.464 (0.066)
R/Eth (AA) x Income 75K+ | −0.259 (0.192) | −0.226 (0.190)
R/Eth (Hisp) x Income 25–75K | −0.209 (0.068) | −0.206 (0.068)
R/Eth (Hisp) x Income 75K+ | −0.229 (0.182) | −0.229 (0.182)
Educ. x Income 25–75K | 0.502 (0.056) | 0.500 (0.056)
Educ. x Income 75K+ | 0.611 (0.136) | 0.588 (0.136)

* This term was explicitly set to zero.

Comparing Scenarios through Simulation

We wish to compare four different scenarios through simulation to quantify generational effects of assortative mating (AM) and differential realized fertility (DRF). These scenarios are: AM on a background of no DRF, random mating (RM) on a background of no DRF, AM on a background of DRF, and RM on a background of DRF. DRF may be included or excluded from the simulation through the use of m1 or m1*, respectively, and paternal BMIs may be shuffled to induce random mating. We utilized observed covariate information from a representative sample of women aged 14 – 22 in 1979 in order to make our simulations realistic; this ensured that covariates remained collinear and changed over time, without having to build these complex dynamics into our simulation. However, since we are simulating birth events, all empirical data detailing those events were stripped out, so that no women enter into our simulation with prior birth events.

Because the simulation framework is not a deterministic process under any of the four scenarios, the distribution of offspring BMIs at age 18 varies across realizations of the simulation process. Therefore, the Monte Carlo simulation model was instantiated three hundred times per scenario, in order to quantify and minimize this uncertainty. The resulting distributions of observed BMIs are compared based on differences in mean BMI; differences in the spread about that mean (BMI variance); and differences in the prevalence of obesity and morbid obesity. Significance and 95% confidence intervals for these comparisons across scenarios (for example, between the baseline scenario and one without assortative mating by BMI) were obtained via Welch’s t-tests unless otherwise noted; note that in a common DRF / no DRF background, instantiations of RM and AM outcomes are paired and hence paired t-tests are used in these cases. As these tests are assessing stochastic variation arising from the non-deterministic nature of our model (see Discussion) and significant results can hence be associated with smaller and smaller p-values by increasing the number of simulation instantiations, we do not report p-values; these are all less than 0.05 whenever significance is indicated and usually on the order of 10^−5 to 10^−29.

For prevalence of obesity, we present the relative risk (RR) of obesity, where the risk is relative to the scenario without one or both factors of interest. In all cases inference is based on the log RR and then back-transformed. In comparisons where instantiations are paired, we have a RR estimate for each instantiation and make statistical comparisons via t-tests on the log RR. In cases where instantiations are not paired, we pool subjects within scenarios, across instantiations, and treat these as two big populations. A point estimate of the relative risk and the asymptotic standard error of the log RR are then based on the four counts of obese and not-obese over the two scenarios under consideration, using a Z-test.
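For the unpaired case, the calculation from four pooled counts can be sketched in Python as below. The sketch uses the standard large-sample (Wald) standard error of the log relative risk, which is an assumption about the exact formula used; the function and argument names are hypothetical.

```python
import math


def relative_risk(obese_a, total_a, obese_b, total_b, z=1.959964):
    """RR of obesity in scenario A relative to scenario B, with a Wald interval.

    Inference is done on log(RR) and back-transformed. The standard error is
    the usual large-sample expression based on the four counts; z defaults to
    the two-sided 95% standard normal quantile.
    """
    risk_a = obese_a / total_a
    risk_b = obese_b / total_b
    log_rr = math.log(risk_a / risk_b)
    se = math.sqrt(1 / obese_a - 1 / total_a + 1 / obese_b - 1 / total_b)
    lower = math.exp(log_rr - z * se)
    upper = math.exp(log_rr + z * se)
    return math.exp(log_rr), (lower, upper)
```

Because the interval is built on the log scale, it is symmetric around the point estimate multiplicatively rather than additively: the point estimate is the geometric mean of the two limits.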

Results

The regression results that inform the m1 and m1* mappings may be found in Table 1. Figure 3 illustrates some results under m1, specifically the probabilities of having at least one child as a function of maternal age for Caucasian women with BMI equal to 25 and education at the high school level or less under different conditions. The left panel shows curves for poorer households (total net family income less than $25K) and the right panel is for wealthier households (total net family income greater than $75K). In each panel, plotted curves are given for childless women, women who had their first child last year, women with one child not born last year, women with two children and one of them was born last year and women with two children, neither of them born last year.

Figure 3.

Selected curves denoting probability of having any children for Caucasian women with BMIs of 25 and a high school education or less, as a function of maternal age and previous history of childbirth events. Poorer households (total net family income less than $25K) are shown in the left panel (A), wealthier households (total net family income greater than $75K) are shown in the right panel (B).

We note some characteristics of our baseline model, which reflect the NLSY data sets as closely as possible. First, existing offspring are more associated with new births in poor families than in wealthy households. Thus, a mother with two children is more likely to have another than a childless woman if she is poor, but the reverse holds if she is wealthy. Second, higher BMIs are associated with higher overall realized fertility, but in a nonlinear fashion: The highest realized fertilities are among overweight women (BMI roughly between 25 and 30), second highest among ‘normal’ and obese women (BMIs between 20–25 and 30–35), third highest for the morbidly obese (BMI greater than 35) and least among the low end of ‘normal’ and the underweight (BMI < 20). Lastly, a correlation of 0.2 is observed between maternal and paternal BMIs in the 1997 cohort data. Thus our model, as informed by the NLSY data sets, reflects aforementioned relationships that agree with the literature (12, 23).

Comparisons of Scenarios

Monte Carlo computation quantifies the model effects of these relationships on the next generation. Recall that the resulting distributions of observed offspring BMIs will be compared based on differences in mean BMI; differences in the spread about that mean (BMI variance); and differences in the prevalence of obesity and morbid obesity, which we will now present. Visual summaries of the point estimates and their corresponding 95% confidence intervals are presented in Figure 4.

Figure 4.

Visual summary of change in mean BMI, change in BMI variance and relative risk (RR) for BMIs greater than 30 and 40, in each of five scenarios. These scenarios are: assortative mating (AM) on a background of no differential realized fertility (DRF), random mating (RM) on a background of no DRF, AM on a background of DRF, RM on a background of DRF, and a comparison of RM combined with no DRF against AM with DRF. Units are kg/m² or unitless (for RRs).

First, we compare the presence and absence of DRF on distributions of offspring BMI. Under RM, DRF is associated with increased mean offspring BMI (+0.046 kg/m², 95% C.I. [0.035, 0.057]) and an increased prevalence of obesity (RR 1.020, 95% C.I. [1.011, 1.028]) compared to no DRF. The variances of the two distributions are not significantly different (+0.116, 95% C.I. [−0.038, 0.270]). Under AM, the association with mean BMI is somewhat stronger while that with obesity prevalence is essentially unchanged: DRF is associated with increased mean offspring BMI (+0.056 kg/m², 95% C.I. [0.045, 0.067]) and an increased prevalence of obesity (RR 1.019, 95% C.I. [1.011, 1.028]) compared to no DRF; the difference in variances remains non-significant (+0.102, 95% C.I. [−0.068, 0.271]).

Next, we compare random and assortative mating; recall that these comparisons are paired because they arise from the same set of birth events in every instantiation. When the birth events were generated with DRF excluded from the model (under m1*), the distributions of offspring BMI under RM versus AM differ slightly in mean BMI (+0.011, 95% C.I. [0.0003, 0.0217]). The spread about that mean, however, is significantly higher under AM (+0.475 kg/m², 95% C.I. [0.322, 0.629]), due to higher prevalence of very low and very high BMIs, relative to the mean. This is therefore associated with an increased prevalence of BMIs greater than 30 (RR 1.012, 95% C.I. [1.004, 1.021]) and greater than 40 kg/m² (RR 1.074, 95% C.I. [1.039, 1.109]). When DRF influences birth events, AM is associated with an increase in mean BMI (+0.021, 95% C.I. [0.010, 0.032]) as well as spread about that mean (+0.461, 95% C.I. [0.300, 0.622]), and the prevalence of BMIs greater than 30 (RR 1.012, 95% C.I. [1.004, 1.021]) and 40 kg/m² increases by an even greater degree (RR 1.101, 95% C.I. [1.067, 1.136]).

Lastly, we compare the scenarios of no DRF under RM versus DRF under AM, in order to examine them in concert. We find that both factors together are associated with increased mean BMI (+0.067, 95% C.I. [0.056, 0.078]), increased BMI variance (+0.578, 95% C.I. [0.418, 0.736]) and increased prevalence of BMIs greater than 30 (RR 1.032, 95% C.I. [1.023, 1.041]) and BMIs over 40 (RR 1.085, 95% C.I. [1.053, 1.118]).

Discussion

We found that differential realized fertility by BMI and assortative mating by BMI were both significantly associated with the distribution of BMIs in the next generation. Both the mean BMI and the prevalence of obesity were shifted to higher levels in the presence of these factors.

Our approach has several strengths. Our framework is empirical and informed by large, nationally representative samples of longitudinal data collected over several decades. It closely emulates the empirical data so that, for instance, realized fertility declines to zero with advancing maternal age. In addition, we are not restricted to married couples or families where there are children: we follow teenaged women forward in time, and they may get married or not and they may have children or not. We empirically modeled offspring BMI rather than making parametric model assumptions that might have invalidated our inference. Lastly, simulation of this nature is one of the few ethical ways to examine factors that may influence obesity over generations, as randomized controlled trials that examine these factors are not feasible in humans.

On the other hand, our approach has some limitations that should be noted. As described above, we used the 1997 NLSY cohort to inform some portions of our model since mate BMI information was not available in the 1979 cohort, and we further assumed that the two NLSY data sets were comparable for these purposes. Parental BMIs were only recorded once in the 1997 cohort and so we assumed that their BMIs at a given youth’s birth were comparable to that pair of measurements; our use of percentiles when moving across cohorts aimed to address this concern.

The model reflects the data well, with a few caveats. We had to put in an explicit stopping rule for the rare times when λt was calculated to be greater than 3 (which occurred in fewer than 1 out of every 13 million calculations of λt) to prevent unduly large (e.g., triple-digit) multiple birth events in a single year. Our modeling of offspring BMIs in the third mapping was limited by the covariate information that was available in both cohorts. We took into account differential birth rates for the two genders (27) but did not take into account higher mortality before age 18 in male youths (30).

Other factors inherent to our model should be taken into consideration when interpreting our results. First, BMI is an imperfect indicator of adiposity. Second, extension of these results outside of the populations and generations sampled by NLSY must be done with caution, as we only had one generation of matings to consider. Lastly, we cannot make causal claims, as the longitudinal surveys are observational in nature.

There are two sources of variation in our model for the point estimates of mean BMI shifts and increases in the prevalence of obesity. One of these is the stochastic nature of the simulation model, and our p-values reflect uncertainty due to this source; because of our large number of simulations per scenario, this variation is fairly minimal and well accounted for. In addition, however, there is uncertainty associated with the fact that our inference is based on a single representative sample, and the simulation framework described previously does not take that into account. Properly accounting for this source of variation, for example via bootstrapping, is a topic for future research.

Conclusion

In summary, a small portion of the increase in the prevalence of obesity from one generation to the next is associated with the presence of differential realized fertility and assortative mating by BMI. Further research using multigenerational data sets is required to fully ascertain the degree of association of these factors with population-level obesity.

Acknowledgments

This work was supported by Grant Number T32HL072757 from the National Heart, Lung, and Blood Institute and the Obesity Training Program (T32 DK062710) at University of Alabama at Birmingham.

Appendix. Data Cleaning and Missing Data

Like all longitudinal surveys, the NLSY data sets contain errors, omissions and values that are implausible or logically inconsistent or both. These issues needed to be addressed and we briefly outline our methods for doing so below. When we say that a value was ‘rejected’ we mean that it was replaced by a missing value indicator, to be addressed through multiple imputation.

  • Implausible values: Heights shorter than 2′6″ or taller than 8′ were rejected. Any weight recorded as 996 was rejected; while not explicitly decoded in the NLSY codebooks, a value of 996 appears to be an ‘out-of-range’ indicator.

  • Logically inconsistent values: Some of the measurements under consideration, such as years of education or total number of offspring born to date, are naturally monotone: they should not decrease as one moves forward in time. This property was used to fill in values that were nominally missing but could be deductively inferred. For example, if a mother has had two children total in 1981 and also in 1983 but a missing value is recorded for 1982, we replace that omission with the value ‘2’. Similarly, there are data entries that are not missing but exhibit non-monotone behaviors, such as offspring vanishing from the longitudinal record. Roughly half of these violations reflect discrepancies in the 1979 – 1981 offspring totals compared to those taken after 1981; in these cases we used the 1982 values carried backward. In the remaining cases where the offenses were not as systematic, invalid values were rejected.

  • Non-survey years: Our framework models birth events on an annual basis but the NLSY surveys were not undertaken in every year, and furthermore, some variables were not collected in more than a handful of years (e.g., in the 1979 cohort height was only recorded in six calendar years). To ameliorate these issues, height was carried forward and all other values were interpolated when possible and considered to be missing values when not.

  • 1997 parental BMIs: The measurements used to calculate parental BMIs in the 1997 cohort were not recorded as ‘maternal height’ or the like. Rather, information on sex, height and weight was recorded for the respondent adult and possibly for non-respondent biological parents (NRBPs) #1 and #2 (sometimes the adult queried at the household was not a biological parent of the youth in question). In cases where only NRBP #1 was indicated but that parent’s sex matched that of the primary respondent, only the information for NRBP #1 was used to ascribe either maternal or paternal BMI information (that is to say, we did not consider the respondent to be a biological parent). When there were two NRBPs and both were listed as the same sex, both parental BMIs were treated as missing. Fortunately, the gravid age of the mother was recorded separately and did not need to be inferred in this manner.

  • Youth BMI at age 18: BMI was missing at age 18 for some youths. If BMI could be calculated for age 17, age 19 or both, an average of the available BMIs was used. If not, an average of the youth’s available BMIs for ages 15 – 21 was used, if possible. If those were all missing, a missing value was recorded and was later addressed during the multiple imputation.

  • Missing data: In order to address the presence of missing data, we performed multiple imputation in SAS v. 9.3 (31) on the 1979 and 1997 data sets separately after cleaning. Specifically, the MCMC (Markov Chain Monte Carlo) incarnation of PROC MI was used, with the number of imputations chosen in order to obtain 95% efficiency, based on the highest fraction of missing information by EM (32). This resulted in four and thirteen imputations for the 1979 and 1997 cohort data sets, respectively. Additionally, predictive mean matching was employed so that the imputed values would be plausible.
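
The choice of the number of imputations can be illustrated with a short sketch. It assumes the standard relative-efficiency formula for multiple imputation, RE = 1 / (1 + γ/m), where γ is the fraction of missing information and m is the number of imputations; the text above does not state the formula explicitly, so this is our reading of "95% efficiency". The function names and the γ values in the usage note are hypothetical and are not taken from the NLSY analysis.

```python
def relative_efficiency(gamma: float, m: int) -> float:
    """Relative efficiency of m imputations versus infinitely many,
    for a fraction of missing information gamma (assumed formula)."""
    return 1.0 / (1.0 + gamma / m)


def min_imputations(gamma: float, target: float = 0.95) -> int:
    """Smallest m whose relative efficiency reaches the target.

    Assumes 0 <= gamma and 0 < target < 1, so the loop terminates.
    """
    m = 1
    while relative_efficiency(gamma, m) < target:
        m += 1
    return m


# Hypothetical fractions of missing information, chosen only to show
# how counts such as four and thirteen could arise under this rule.
print(min_imputations(0.20))
print(min_imputations(0.65))
```

Under this rule, four imputations would suffice for any γ up to roughly 0.21 and thirteen for any γ up to roughly 0.68; the actual fractions of missing information are not reported here.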

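Three of the cleaning rules listed above (implausible values, deductive filling of monotone series, and the age-18 BMI fallback) can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the analysis: the function names, the use of `None` as the missing value indicator, and the units (heights in inches, weights in pounds) are assumptions made for the example.

```python
from typing import Dict, List, Optional, Sequence, Tuple

MIN_HEIGHT_IN = 30          # 2'6" expressed in inches (assumed unit)
MAX_HEIGHT_IN = 96          # 8' expressed in inches (assumed unit)
OUT_OF_RANGE_WEIGHT = 996   # apparent 'out-of-range' code for weight


def reject_implausible(height: Optional[float],
                       weight: Optional[float]
                       ) -> Tuple[Optional[float], Optional[float]]:
    """Replace implausible heights and the 996 weight code with None."""
    if height is not None and not (MIN_HEIGHT_IN <= height <= MAX_HEIGHT_IN):
        height = None
    if weight == OUT_OF_RANGE_WEIGHT:
        weight = None
    return height, weight


def fill_monotone_gaps(values: Sequence[Optional[int]]) -> List[Optional[int]]:
    """Fill a gap in a non-decreasing series when the nearest recorded
    values on either side agree, so the missing value is forced."""
    filled = list(values)
    for i, value in enumerate(values):
        if value is not None:
            continue
        before = next((v for v in reversed(values[:i]) if v is not None), None)
        after = next((v for v in values[i + 1:] if v is not None), None)
        if before is not None and before == after:
            filled[i] = before
    return filled


def bmi_at_18(bmi_by_age: Dict[int, Optional[float]]) -> Optional[float]:
    """Age-18 BMI, falling back to ages 17/19 and then to ages 15-21."""
    def mean_available(ages) -> Optional[float]:
        vals = [bmi_by_age[a] for a in ages if bmi_by_age.get(a) is not None]
        return sum(vals) / len(vals) if vals else None

    if bmi_by_age.get(18) is not None:
        return bmi_by_age[18]
    nearby = mean_available((17, 19))
    if nearby is not None:
        return nearby
    return mean_available(range(15, 22))  # None if nothing is available


# The offspring-count example from the text: 1981, 1982 (missing), 1983.
print(fill_monotone_gaps([2, None, 2]))
```

A gap bracketed by different values (for example one child in 1981 and three in 1983) is left missing in this sketch, since the intermediate value cannot be deduced.
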
Contributor Information

John A. Dawson, Office of Energetics, School of Public Health, University of Alabama at Birmingham

Emily J. Dhurandhar, Office of Energetics, School of Public Health, University of Alabama at Birmingham

Ana I. Vazquez, Department of Biostatistics, School of Public Health, University of Alabama at Birmingham

Bo Peng, Department of Genetics, The University of Texas MD Anderson Cancer Center, Houston, TX.

David B. Allison, Office of Energetics, School of Public Health, Nutrition Obesity Research Center, University of Alabama at Birmingham

References

  • 1. Flegal KM, Carroll MD, Kit BK, Ogden CL. Prevalence of obesity and trends in the distribution of body mass index among US adults, 1999–2010. JAMA: the journal of the American Medical Association. 2012;307(5):491–7. doi: 10.1001/jama.2012.39. Epub 2012/01/19.
  • 2. Kuczmarski RJ, Flegal KM, Campbell SM, Johnson CL. Increasing prevalence of overweight among US adults. The National Health and Nutrition Examination Surveys, 1960 to 1991. JAMA: the journal of the American Medical Association. 1994;272(3):205–11. doi: 10.1001/jama.272.3.205. Epub 1994/07/20.
  • 3. Bouchard C. Gene-environment interactions in the etiology of obesity: defining the fundamentals. Obesity (Silver Spring) 2008;16 (Suppl 3):S5–S10. doi: 10.1038/oby.2008.528. Epub 2008/12/17.
  • 4. Poston WS, 2nd, Foreyt JP. Obesity is an environmental issue. Atherosclerosis. 1999;146(2):201–9. doi: 10.1016/s0021-9150(99)00258-0. Epub 1999/10/26.
  • 5. Ajslev TA, Angquist L, Silventoinen K, Gamborg M, Allison DB, Baker JL, et al. Assortative marriages by body mass index have increased simultaneously with the obesity epidemic. Frontiers in genetics. 2012;3:125. doi: 10.3389/fgene.2012.00125. Epub 2012/10/12.
  • 6. Davey G, Ramachandran A, Snehalatha C, Hitman GA, McKeigue PM. Familial aggregation of central obesity in Southern Indians. International journal of obesity and related metabolic disorders: journal of the International Association for the Study of Obesity. 2000;24(11):1523–7. doi: 10.1038/sj.ijo.0801408. Epub 2000/01/11.
  • 7. Di Castelnuovo A, Quacquaruccio G, Donati MB, de Gaetano G, Iacoviello L. Spousal concordance for major coronary risk factors: a systematic review and meta-analysis. American journal of epidemiology. 2009;169(1):1–8. doi: 10.1093/aje/kwn234. Epub 2008/10/11.
  • 8. Garn SM, Sullivan TV, Hawthorne VM. Educational level, fatness, and fatness differences between husbands and wives. The American journal of clinical nutrition. 1989;50(4):740–5. doi: 10.1093/ajcn/50.4.740. Epub 1989/10/01.
  • 9. Ginsburg E, Livshits G, Yakovenko K, Kobyliansky E. Major gene control of human body height, weight and BMI in five ethnically different populations. Annals of human genetics. 1998;62(Pt 4):307–22. doi: 10.1046/j.1469-1809.1998.6240307.x. Epub 1999/01/30.
  • 10. Hur YM. Assortative mating for personality traits, educational level, religious affiliation, height, weight, and body mass index in parents of Korean twin sample. Twin research: the official journal of the International Society for Twin Studies. 2003;6(6):467–70. doi: 10.1375/136905203322686446. Epub 2004/02/20.
  • 11. Jacobson P, Torgerson JS, Sjostrom L, Bouchard C. Spouse resemblance in body mass index: effects on adult obesity prevalence in the offspring generation. American journal of epidemiology. 2007;165(1):101–8. doi: 10.1093/aje/kwj342. Epub 2006/10/17.
  • 12. Katzmarzyk PT, Hebebrand J, Bouchard C. Spousal resemblance in the Canadian population: implications for the obesity epidemic. International journal of obesity and related metabolic disorders: journal of the International Association for the Study of Obesity. 2002;26(2):241–6. doi: 10.1038/sj.ijo.0801870. Epub 2002/02/19.
  • 13. Knuiman MW, Divitini ML, Bartholomew HC, Welborn TA. Spouse correlations in cardiovascular risk factors and the effect of marriage duration. American journal of epidemiology. 1996;143(1):48–53. doi: 10.1093/oxfordjournals.aje.a008656. Epub 1996/01/01.
  • 14. Knuiman MW, Divitini ML, Welborn TA, Bartholomew HC. Familial correlations, cohabitation effects, and heritability for cardiovascular risk factors. Annals of epidemiology. 1996;6(3):188–94. doi: 10.1016/1047-2797(96)00004-x. Epub 1996/05/01.
  • 15. Konnov MV, Dobordzhginidze LM, Deev AD, Gratsianskii NA. Spousal concordance for factors related to metabolic syndrome in families of patients with premature coronary heart disease. Kardiologiia. 2010;50(2):4–8. Epub 2010/02/12.
  • 16. Maes HH, Neale MC, Eaves LJ. Genetic and environmental factors in relative body weight and human adiposity. Behavior genetics. 1997;27(4):325–51. doi: 10.1023/a:1025635913927. Epub 1997/07/01.
  • 17. Silventoinen K, Kaprio J, Lahelma E, Viken RJ, Rose RJ. Assortative mating by body height and BMI: Finnish twins and their spouses. American journal of human biology: the official journal of the Human Biology Council. 2003;15(5):620–7. doi: 10.1002/ajhb.10183. Epub 2003/09/04.
  • 18. Speakman JR, Djafarian K, Stewart J, Jackson DM. Assortative mating for obesity. The American journal of clinical nutrition. 2007;86(2):316–23. doi: 10.1093/ajcn/86.2.316. Epub 2007/08/09.
  • 19. Zietsch BP, Verweij KJ, Heath AC, Martin NG. Variation in human mate choice: simultaneously investigating heritability, parental influence, sexual imprinting, and assortative mating. The American naturalist. 2011;177(5):605–16. doi: 10.1086/659629. Epub 2011/04/22.
  • 20. Fisher RA. The correlation between relatives on the supposition of Mendelian inheritance. Trans Roy Soc Edinb. 1918;52:399–433.
  • 21. Redden DT, Allison DB. The effect of assortative mating upon genetic association studies: spurious associations and population substructure in the absence of admixture. Behavior genetics. 2006;36(5):678–86. doi: 10.1007/s10519-006-9060-0. Epub 2006/03/04.
  • 22. McAllister EJ, Dhurandhar NV, Keith SW, Aronne LJ, Barger J, Baskin M, et al. Ten putative contributors to the obesity epidemic. Critical reviews in food science and nutrition. 2009;49(10):868–913. doi: 10.1080/10408390903372599. Epub 2009/12/05.
  • 23. Weng HH, Bastian LA, Taylor DH, Jr, Moser BK, Ostbye T. Number of children associated with obesity in middle-aged women and men: results from the health and retirement study. J Womens Health (Larchmt) 2004;13(1):85–91. doi: 10.1089/154099904322836492. Epub 2004/03/10.
  • 24. Bastian LA, West NA, Corcoran C, Munger RG. Number of children and the risk of obesity in older women. Preventive medicine. 2005;40(1):99–104. doi: 10.1016/j.ypmed.2004.05.007. Epub 2004/11/09.
  • 25. Rosenberg L, Palmer JR, Wise LA, Horton NJ, Kumanyika SK, Adams-Campbell LL. A prospective study of the effect of childbearing on weight gain in African-American women. Obesity research. 2003;11(12):1526–35. doi: 10.1038/oby.2003.204. Epub 2003/12/25.
  • 26. Bureau of Labor Statistics. National Longitudinal Surveys. United States Department of Labor; [cited 2012]; Available from: http://www.bls.gov/nls/
  • 27. CIA. The World Factbook. Available from: http://www.cia.gov/library/publications/the-world-factbook/fields/2018.html.
  • 28. WHO. Report of a WHO Expert Committee. Geneva: World Health Organization; 1995. Physical Status: the use and interpretation of anthropometry.
  • 29. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing; Vienna, Austria: 2012.
  • 30. Moller-Leimkuhler AM. The gender gap in suicide and premature death or: why are men so vulnerable? European archives of psychiatry and clinical neuroscience. 2003;253(1):1–8. doi: 10.1007/s00406-003-0397-6. Epub 2003/03/29.
  • 31. SAS Institute Inc. SAS 9.3 Product Documentation. 2013 [cited 2013]; Available from: http://support.sas.com/documentation/93/index.html.
  • 32. Schafer JL. Analysis of Incomplete Multivariate Data. London: Chapman & Hall; 1997.
