Abstract
Cross-sectional studies of total gestational weight gain (GWG) and perinatal outcomes have used different approaches to operationalize GWG and adjust for duration of gestation. Using birth records from California (2007–2017), Nevada (2010–2017), and Oregon (2008–2017), we compared 3 commonly used approaches to estimate associations between GWG and cesarean delivery, small-for-gestational-age birth, and low birth weight (LBW): 1) the Institute of Medicine–recommended GWG ranges at a given gestational week, 2) total weight gain categories directly adjusting for gestational age as a covariate, and 3) weight-gain-for-gestational-age z scores derived from an external longitudinal reference population. Among 5,461,130 births, the 3 methods yielded similar conclusions for cesarean delivery and small-for-gestational-age birth. However, for LBW, some associations based on z scores were in the opposite direction of methods 1 and 2, paradoxically suggesting that higher GWG increases risk of LBW. This was due to a greater proportion of preterm births among those with high z scores, and controlling for gestational age in the z score model brought the results in line with the other methods. We conclude that the use of externally derived GWG z scores based on ongoing pregnancies can yield associations confounded by duration of pregnancy when the outcome is strongly associated with gestational age at delivery.
Keywords: cesarean delivery, gestational age, gestational weight gain, low birth weight, small-for-gestational-age birth, z scores
Abbreviations
- BMI: body mass index
- CI: confidence interval
- GAC: gestational-age–controlled
- GWG: gestational weight gain
- IOM: Institute of Medicine
- LBW: low birth weight
- RR: risk ratio
- SGA: small for gestational age
Gestational weight gain (GWG) has been identified as an important predictor of adverse maternal and infant outcomes (1). However, there is a correlation between total GWG and duration of gestation; women who deliver at later gestational ages have more time to accumulate weight. Despite this known association, few investigators take gestational age into account when examining GWG as a cause of adverse perinatal outcomes, and without accounting for this, estimates can be biased (2). In a recent publication discussing the best practices for studying weight gain in pregnancy, Hutcheon and Bodnar (3) recommended that investigators adjust for gestational age, either directly or indirectly, in order to estimate the causal effects of total GWG. However, outstanding questions remain about which adjustment methods minimize potential bias in a given context and under what scenarios these methods might produce different results.
The most commonly used operationalization of GWG is based on the 2009 Institute of Medicine (IOM) classification of GWG (4, 5). These guidelines provide separate total weight gain recommendations for each prepregnancy body mass index (BMI; weight (kg)/height (m)²) category (underweight, normal-weight, overweight, and obese). They also provide weekly rates of weight gain in the second and third trimesters, which can be used to indirectly adjust for gestational age (assuming linear weight gain throughout the second and third trimesters) and to classify women as below, within, or exceeding the recommended weight gain (6). These predetermined cutoffs, based on expert opinion, were designed to minimize adverse maternal and neonatal outcomes and were not explicitly designed to force independence between gestational age and GWG (6).
A second approach is to directly adjust for gestational age at delivery by including it as a covariate when modeling associations with total GWG (7–10). This approach is the most straightforward and most directly accounts for the length of pregnancy, but it cannot be applied validly for some pregnancy outcomes, such as preterm birth (defined by gestational age), and outcomes diagnosed before delivery (which can affect ultimate gestational duration and total GWG) (3). Additionally, conditioning on gestational age at delivery is often debated, because it could introduce collider bias (2, 7, 11, 12). Collider bias can occur if there is unmeasured or residual confounding between gestational age and the outcome, which would open a backdoor path when gestational age is controlled (12, 13). For example, in Figure 1A, a simplified causal graph adapted from Hinkle et al. (7), if there is an unmeasured confounder (C) between the outcome and gestational age, such as an underlying pregnancy complication, conditioning on gestational age would open the backdoor path between GWG and the outcome (14).
Figure 1.

Directed acyclic graphs (DAGs) showing associations between gestational weight gain (GWG) and 3 pregnancy outcomes: cesarean delivery, small-for-gestational-age (<10th percentile) birth, and low birth weight (<2,500 g). U represents the longitudinal patterns of GWG throughout pregnancy; C represents potential confounders of the gestational age (GA)–outcome relationship. The dashed line represents the hypothesized relationship between GA and each outcome. There is a direct arrow from GA to the outcomes cesarean delivery and low birth weight, but there would not be an arrow for the outcome of small-for-gestational-age birth. A) DAG for using absolute weight gain and adjusting for GA in the model (adapted from Hinkle et al. (7)); B) DAG for using standardized z scores for GA, with deterministic arrows between GA and z score and between GWG and z score (adapted from Hinkle et al. to include an arrow from GA to the outcome).
A newer method with which to indirectly adjust for gestational age is to use a standardized weight-gain-for-gestational-age z score based on longitudinal measurements across pregnancy. Z score charts provide a modeled mean value and standard deviation for cumulative weight gain among ongoing pregnancies at each gestational week, rather than births at a given gestational week (15, 16). Using z scores to quantify GWG is appealing because z scores should provide a gestational-age–independent measure of GWG and may avoid collider bias (3). However, Hinkle et al. show that when the z score model is misspecified, residual confounding by duration of gestation can occur (7). As is shown in Figure 1B, residual confounding can occur when backdoor paths between z score and the outcome are not fully blocked. While some have noted that confounding tends to be stronger in magnitude than collider bias (17, 18), it is not clear in this study setting which is a greater threat to validity. This method has been used as an alternative when direct adjustment is thought to be problematic (19–22) but is also being used for outcomes where direct adjustment is justified (10, 23–25).
Because of data availability issues, most studies to date have characterized GWG with a single measurement reflecting cumulative weight gain from conception to delivery (total GWG). The optimal approach to adjusting for gestational age when modeling total GWG in such cross-sectional epidemiologic studies is unclear because, under various scenarios, proposed approaches could potentially remove, induce, or amplify bias. It is therefore useful to assess whether these approaches yield comparable results when applied to the same data. Here we build upon the previous simulation study by Hinkle et al. (7) by comparing results using 3 methods of adjusting for gestational age in a study of total GWG and observed pregnancy outcomes at the time of delivery, using multistate birth records.
METHODS
Data for this study were cross-sectional at the time of birth and were obtained from birth records in California (2007–2017), Nevada (2010–2017), and Oregon (2008–2017) based on when each state implemented the 2003 revision of the US Standard Certificate of Live Birth. Birth records were included in the data set if the estimated gestational age was 24–44 weeks and there were complete data on prepregnancy height and weight, GWG, cesarean delivery, low birth weight (LBW), small-for-gestational-age (SGA) birth, and covariates (n = 5,473,509).
Gestational weight gain
Total GWG was calculated by subtracting prepregnancy weight from delivery weight. Women with implausible weight gain (<−40 pounds (<−18.2 kg) or >100 pounds (>45.4 kg)) were excluded (n = 12,379 (0.2%)). IOM categories and z scores were calculated within prepregnancy BMI groups. BMI was calculated based on prepregnancy height and weight, as reported on the birth record, and divided into 6 BMI categories: underweight (<18.5), normal-weight (18.5–24.9), overweight (25.0–29.9), class I obese (30.0–34.9), class II obese (35.0–39.9), and class III obese (≥40.0).
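The exposure preparation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the use of half-open intervals (for example, treating a BMI of 24.95 as normal-weight) is an assumption, because the text reports category bounds only to 1 decimal place.

```python
from typing import Optional


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def bmi_category(bmi_value: float) -> str:
    """Assign one of the 6 prepregnancy BMI categories used in the study.

    Assumption: half-open intervals close the gaps between the published
    bounds (e.g., 24.9 and 25.0).
    """
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 25.0:
        return "normal-weight"
    if bmi_value < 30.0:
        return "overweight"
    if bmi_value < 35.0:
        return "class I obese"
    if bmi_value < 40.0:
        return "class II obese"
    return "class III obese"


def total_gwg_lb(prepregnancy_lb: float, delivery_lb: float) -> Optional[float]:
    """Total GWG in pounds, or None when the value is implausible.

    Implausible values (< -40 pounds or > 100 pounds) were excluded.
    """
    gwg = delivery_lb - prepregnancy_lb
    if gwg < -40 or gwg > 100:
        return None
    return gwg
```

For instance, `bmi_category(bmi(60, 1.65))` falls in the normal-weight group, and a record with a 101-pound gain would be dropped.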
A summary measure of GWG was calculated in 3 different ways. First, GWG was operationalized as below, within, or exceeding the IOM-recommended ranges, based on the BMI-specific weight gain for a given gestational age. The IOM-recommended range of GWG was calculated based on published expected first-trimester weight gain and mean weekly weight gain in the second and third trimesters (6). The range of recommended GWG, in pounds, was calculated as [1.1 + (gestational week − 13) × mean GWG_L] to [4.4 + (gestational week − 13) × mean GWG_U], where mean GWG_L is the lower bound of the mean weekly weight gain based on BMI (1.0 pound (0.45 kg), 0.8 pound (0.36 kg), 0.6 pound (0.27 kg), and 0.5 pound (0.23 kg) for underweight, normal-weight, overweight, and obese, respectively) and mean GWG_U is the upper bound (1.3 pounds (0.59 kg), 1.0 pound (0.45 kg), 0.7 pound (0.32 kg), and 0.6 pound (0.27 kg), respectively) (6). In a second approach, GWG was categorized using absolute total weight gain (<10 pounds (<4.5 kg), 10–19 pounds (4.5–9.0 kg), 20–29 pounds (9.1–13.5 kg), 30–39 pounds (13.6–18.1 kg), or ≥40 pounds (≥18.2 kg)), and duration of gestation was included as a set of gestational-week indicator variables to allow for flexible control of gestational age at birth in subsequent models. Total GWG categories were chosen to facilitate comparison across methods, but a continuous measure was also examined.
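The IOM range calculation can be illustrated with a short sketch. The weekly rates below are copied from the text as stated; the function names, the dictionary layout, and the treatment of the range endpoints as "within" are assumptions made for illustration.

```python
from typing import Tuple

# Expected first-trimester gain (pounds): lower and upper bounds from the text.
FIRST_TRIMESTER_LB = (1.1, 4.4)

# Lower and upper mean weekly gain (pounds) in the second and third
# trimesters, as listed in the text for each BMI group.
WEEKLY_RATE_LB = {
    "underweight": (1.0, 1.3),
    "normal-weight": (0.8, 1.0),
    "overweight": (0.6, 0.7),
    "obese": (0.5, 0.6),
}


def iom_range_lb(bmi_group: str, gestational_week: int) -> Tuple[float, float]:
    """Recommended GWG range (pounds) at a given gestational week."""
    low_rate, high_rate = WEEKLY_RATE_LB[bmi_group]
    weeks_past_first_trimester = gestational_week - 13
    lower = FIRST_TRIMESTER_LB[0] + weeks_past_first_trimester * low_rate
    upper = FIRST_TRIMESTER_LB[1] + weeks_past_first_trimester * high_rate
    return lower, upper


def iom_category(gwg_lb: float, bmi_group: str, gestational_week: int) -> str:
    """Classify GWG as under, within, or exceeding the recommended range.

    Assumption: values exactly on a bound count as "within".
    """
    lower, upper = iom_range_lb(bmi_group, gestational_week)
    if gwg_lb < lower:
        return "under"
    if gwg_lb > upper:
        return "exceed"
    return "within"
```

Under these inputs, a normal-weight woman delivering at 40 weeks has a recommended range of about 22.7 to 31.4 pounds, so a 30-pound gain is classified as within the recommendation.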
Third, GWG was operationalized using weight-gain-for-gestational-age z scores. Z scores were calculated using the formula

$$z = \frac{\ln(\mathrm{GWG} + c) - \mathrm{mean}_{\mathrm{GA}}}{\mathrm{SD}_{\mathrm{GA}}}$$

where the mean and SD are the chart values for log-transformed weight gain at the gestational age of delivery (15). Mean values, standard deviations (SDs), and the constant (c) were taken from published BMI-specific z score charts derived using serial weight measurements from a Pittsburgh, Pennsylvania, population of uncomplicated term pregnancies (15, 16). For these charts, z scores are only available up to 40 weeks’ gestation for underweight and normal BMIs and up to 41 weeks for overweight and obese classes I–III BMIs. If the estimated gestational age was greater than these ages, the last available week was used (e.g., the z score for a normal-BMI woman who delivered at 43 weeks is calculated using the 40-week z score estimates) (19). Z scores were grouped into 5 categories (<−1.50, −1.50 to −0.51, −0.50 to 0.50, 0.51 to 1.50, or >1.50). If a woman lost more weight than the constant c, she was excluded from z score analyses (n = 1,094 (0.02%)). To examine the sensitivity of the z score approach to the selected external population, we also applied z score charts derived from a Swedish population to our data using the same methodology as the Pittsburgh z score calculations (26). Internal z scores were also calculated based on births in our population at each gestational week.
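A hedged sketch of the external z score calculation follows. It assumes the log-transformed form z = [ln(GWG + c) − mean] / SD, which is consistent with the exclusion of women whose weight loss exceeded c; the chart values passed in are hypothetical inputs, not the published chart entries, and GWG must be in the same units as the chart. The function names and the rounding to 2 decimal places before categorization are illustrative assumptions.

```python
import math
from typing import Optional


def chart_week(gestational_week: int, last_chart_week: int) -> int:
    """Use the last available chart week for deliveries beyond the chart."""
    return min(gestational_week, last_chart_week)


def gwg_z_score(gwg: float, mean_ln: float, sd_ln: float, c: float) -> Optional[float]:
    """Weight-gain-for-gestational-age z score on the log scale.

    mean_ln and sd_ln are the chart mean and SD of ln(GWG + c) for the
    woman's BMI group and gestational week. Returns None when the weight
    loss is at least c, because the logarithm is then undefined.
    """
    shifted = gwg + c
    if shifted <= 0:
        return None
    return (math.log(shifted) - mean_ln) / sd_ln


def z_category(z: float) -> str:
    """Group a z score into the 5 categories used in the analysis."""
    z = round(z, 2)  # assumption: categories apply to 2-decimal z scores
    if z < -1.50:
        return "<-1.50"
    if z <= -0.51:
        return "-1.50 to -0.51"
    if z <= 0.50:
        return "-0.50 to 0.50"
    if z <= 1.50:
        return "0.51 to 1.50"
    return ">1.50"
```

For a normal-BMI woman who delivered at 43 weeks, `chart_week(43, 40)` selects the 40-week chart row before `gwg_z_score` is applied.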
Outcomes
Three pregnancy outcomes (cesarean delivery, SGA birth, and LBW) were selected to reflect varying degrees of dependence on gestational age; gestational age is strongly associated with LBW (mean gestational age at birth (LBW vs. not LBW) = 34.7 weeks vs. 39.0 weeks), moderately associated with cesarean delivery (mean (cesarean delivery vs. vaginal birth) = 38.5 weeks vs. 38.9 weeks), and not associated with SGA birth (by definition, mean (SGA vs. not SGA) = 38.8 weeks vs. 38.8 weeks). We did not consider outcomes that are defined by gestational duration (e.g., preterm birth) or outcomes that could affect both total weight gain and gestational duration (e.g., gestational diabetes, preeclampsia), because analyses of such outcomes cannot validly be adjusted directly for gestational age at delivery (3). Cesarean delivery was categorized based on the reported method of delivery. The distribution of birth weights at each gestational week was calculated from our data set, and SGA birth was internally defined as having a birth weight below the 10th percentile for a given gestational age at birth. LBW was defined as a birth weight less than 2,500 g.
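The internal SGA definition and the LBW cutoff can be sketched as below. The percentile estimator is an assumption (the text does not state which one was used); here the inclusive method of Python's `statistics.quantiles` stands in for it, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles
from typing import Dict, Iterable, Tuple


def sga_thresholds(births: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """10th percentile of birth weight (g) at each gestational week.

    births holds (gestational_week, birth_weight_g) pairs from the study
    data set itself, so the SGA reference is internal.
    """
    weights_by_week = defaultdict(list)
    for week, weight_g in births:
        weights_by_week[week].append(weight_g)
    return {
        week: quantiles(weights, n=10, method="inclusive")[0]
        for week, weights in weights_by_week.items()
    }


def classify_birth(week: int, weight_g: float, thresholds: Dict[int, float]) -> Dict[str, bool]:
    """Flag SGA (below the week-specific 10th percentile) and LBW (<2,500 g)."""
    return {
        "sga": weight_g < thresholds[week],
        "lbw": weight_g < 2500,
    }
```

Because the SGA threshold moves with gestational week while the LBW cutoff is fixed, an infant born early can be LBW without being SGA, which is why the two outcomes differ in their dependence on gestational age.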
Covariates
Potential confounders collected from the birth record included maternal age (10–19, 20–24, 25–29, 30–34, 35–39, or ≥40 years), maternal race/ethnicity (non-Hispanic White, Hispanic, or other), maternal education (less than high school diploma, high school diploma, some college, or bachelor’s degree or more), source of payment (Medicaid, private insurance, or other), infant sex (male or female), birth order (firstborn or secondborn/higher), and birth weight (<2,500 g, 2,500–4,000 g, or >4,000 g). Prepregnancy BMI was treated as an effect modifier in all models because IOM recommendations and z score charts are BMI-specific, and it has previously been established that the risks of adverse birth outcomes associated with GWG differ according to prepregnancy BMI (3, 6).
Statistical analysis
Risk ratios (RRs) and 95% confidence intervals (CIs) were estimated using modified Poisson regression with robust variance estimation (Poisson regression applied to binomial outcome data using sandwich estimation) (27). Analyses of cesarean delivery included all of the covariates listed above; SGA birth and LBW analyses included all covariates except birth weight. Three methods were used to adjust for gestational age: 1) indirectly, using IOM guidelines for a given gestational age to categorize GWG (under, within, or exceeding the recommendations); 2) directly, by including gestational age as a covariate (gestational week indicator variables), with absolute total weight gain categories; and 3) indirectly, using a weight-gain-for-gestational-age z score. Throughout this paper, these methods will be referred to as the IOM-cat model, the gestational-age–controlled (GAC) model, and the z score model, respectively. Because the interpretations of the RRs derived from each method are not directly comparable, we looked at the direction and magnitude of the RRs in similar GWG categories; reference categories selected for all methods had similar mean GWGs. All models were stratified by prepregnancy BMI (3).
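To make the estimator concrete, the sketch below covers only the simplest special case: one binary exposure and no covariates. In that case, Poisson regression with a log link reproduces the ratio of observed risks, and the sandwich variance of the log RR reduces to (1 − p1)/a1 + (1 − p0)/a0, where a1 and a0 are the event counts in the exposed and reference groups. The study's models also included covariates and gestational-week indicators, which requires fitting the regression in statistical software; the function name and inputs here are hypothetical.

```python
import math
from typing import Tuple


def risk_ratio_robust(
    exposed_events: int,
    exposed_n: int,
    ref_events: int,
    ref_n: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """RR and 95% CI for a single binary exposure.

    Equivalent to modified Poisson regression (robust variance) when the
    model contains only an intercept and one exposure indicator.
    """
    risk_exposed = exposed_events / exposed_n
    risk_ref = ref_events / ref_n
    rr = risk_exposed / risk_ref
    # Robust (sandwich) variance of log(RR) in this saturated model.
    se_log_rr = math.sqrt(
        (1 - risk_exposed) / exposed_events + (1 - risk_ref) / ref_events
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper
```

The robust variance matters because binary outcomes have variance p(1 − p) rather than the Poisson variance p, so model-based Poisson standard errors would be too wide.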
We conducted sensitivity analyses for the z score approach to examine how results differed if births beyond the range of gestational weeks provided in the external z score standard were excluded (past week 40 for underweight and normal BMIs and past week 41 for overweight and obese classes I–III BMIs) (15, 16) and to investigate how results changed based on the reference population chosen to derive the z score. We used the Swedish z score charts (also based on serial measurements of ongoing pregnancies) to explore the sensitivity of results to the external population used, and we assessed the sensitivity of including gestational age as a covariate in both the Pittsburgh and Swedish z score models. Lastly, we assigned internally derived z scores (based on gestational age at delivery) which standardized GWG among infants born in a given gestational week as opposed to among ongoing pregnancies.
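The internally derived z scores can be illustrated as follows. This is a sketch under stated assumptions: it standardizes untransformed GWG using the mean and SD among births in the same BMI group and gestational week, whereas the text does not specify the transformation or summary statistics the authors used. Names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev
from typing import List, Optional, Tuple

Record = Tuple[str, int, float]  # (bmi_group, gestational_week, gwg)


def internal_z_scores(records: List[Record]) -> List[Optional[float]]:
    """Standardize GWG among births delivered in the same gestational week.

    Returns one z score per record, or None when the BMI-by-week cell has
    fewer than 2 births or no variation in GWG.
    """
    cells = defaultdict(list)
    for bmi_group, week, gwg in records:
        cells[(bmi_group, week)].append(gwg)

    cell_stats = {}
    for key, values in cells.items():
        if len(values) >= 2:
            sd = stdev(values)
            if sd > 0:
                cell_stats[key] = (mean(values), sd)

    z_scores = []
    for bmi_group, week, gwg in records:
        stats = cell_stats.get((bmi_group, week))
        z_scores.append(None if stats is None else (gwg - stats[0]) / stats[1])
    return z_scores
```

Because every gestational week is centered on its own births, such z scores are uncorrelated with gestational age at delivery by construction, unlike scores standardized to ongoing pregnancies.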
RESULTS
There were 5,461,130 births (88.2% from California, 4.4% from Nevada, and 7.4% from Oregon) in the study in total. Across the 3 states, 31% of births occurred via cesarean delivery, 5% of newborns were LBW, and 10% were SGA. Distributions of covariates are shown in Table 1. Overall prepregnancy BMI distributions were as follows: 4.0% of women were underweight (n = 217,545), 48.4% had a normal weight (n = 2,645,102), 25.9% were overweight (n = 1,414,005), 12.9% had class I obesity (n = 706,812), 5.5% had class II obesity (n = 303,319), and 3.3% had class III obesity (n = 180,568). Figure 2 represents the distribution of GWG categories across BMI categories according to the 3 methods. Using the IOM classification, the majority of women exceeded the recommendation for their BMI. Using absolute total GWG categories, women in higher BMI categories gained the least amount of weight, but the proportion of women gaining 20–29 pounds (9.1–13.5 kg) was similar across BMI categories. The distribution of externally derived z scores did not differ greatly across BMI categories. However, the correlation between z scores and gestational age at delivery did differ across BMI categories (ρ = 0.07, ρ = 0.05, ρ = 0.01, ρ = −0.01, ρ = −0.04, and ρ = −0.05, respectively), with some categories showing nonlinear patterns not captured by a simple linear correlation (Figure 3). Additionally, the proportion of women who delivered preterm varied across externally derived z score categories (within BMI category) but, as expected, did not differ across internally derived z score categories (Figure 3). Mean GWG in each category by BMI class is shown in Table 2.
Table 1.
Maternal and Infant Characteristics (%) of Births Taking Place in California (2007–2017), Nevada (2010–2017), and Oregon (2008–2017)
| Characteristic | California (n = 4,818,508) | Nevada (n = 238,442) | Oregon (n = 404,180) | Total (n = 5,461,130) |
|---|---|---|---|---|
| Maternal Characteristics | ||||
| Age group, years | ||||
| 10–19 | 7.1 | 7.1 | 6.4 | 7.1 |
| 20–24 | 20.1 | 23.3 | 21.3 | 20.4 |
| 25–29 | 26.8 | 29.2 | 29.3 | 27.1 |
| 30–34 | 27.0 | 24.9 | 26.9 | 26.9 |
| 35–39 | 15.1 | 12.2 | 13.2 | 14.8 |
| ≥40 | 3.8 | 3.2 | 2.9 | 3.7 |
| Race/ethnicity | ||||
| Non-Hispanic White | 28.0 | 41.5 | 68.8 | 31.6 |
| Hispanic | 50.4 | 37.3 | 19.2 | 47.5 |
| Other | 21.6 | 21.2 | 12.0 | 20.9 |
| Education | ||||
| Less than high school | 20.4 | 21.1 | 16.3 | 20.2 |
| High school | 26.0 | 30.1 | 23.2 | 26.0 |
| Some college | 25.7 | 29.4 | 31.8 | 26.3 |
| Bachelor’s degree or more | 27.9 | 19.3 | 28.7 | 27.6 |
| Health insurance | ||||
| Medicaid | 45.8 | 36.4 | 44.6 | 45.3 |
| Private | 47.1 | 38.5 | 51.5 | 47.1 |
| Other | 7.1 | 25.1 | 3.9 | 7.6 |
| Prepregnancy body mass indexᵃ | ||||
| Underweight (<18.5) | 4.0 | 4.5 | 3.3 | 4.0 |
| Normal-weight (18.5–24.9) | 48.5 | 47.6 | 48.2 | 48.4 |
| Overweight (25.0–29.9) | 26.0 | 25.3 | 24.7 | 25.9 |
| Obese class I (30.0–34.9) | 12.9 | 13.1 | 12.9 | 12.9 |
| Obese class II (35.0–39.9) | 5.4 | 5.9 | 6.4 | 5.5 |
| Obese class III (≥40.0) | 3.2 | 3.6 | 4.5 | 3.3 |
| Infant Characteristics | ||||
| Male sex | 51.2 | 51.3 | 51.1 | 51.2 |
| Birth order | ||||
| Firstborn | 39.5 | 37.5 | 39.7 | 39.4 |
| Secondborn or higher | 60.5 | 62.5 | 60.3 | 60.6 |
| Gestational age, weeks | ||||
| <37 (preterm) | 6.7 | 8.3 | 5.9 | 6.8 |
| ≥37 (full-term) | 93.3 | 91.7 | 94.1 | 93.3 |
| Cesarean section delivery | 31.0 | 32.9 | 27.8 | 30.9 |
| Low birth weight (<2,500 g) | 5.0 | 6.4 | 4.6 | 5.0 |
| SGA birth (<10th percentile) | 9.9 | 11.3 | 8.8 | 9.9 |
Abbreviation: SGA, small for gestational age.
a Weight (kg)/height (m)².
Figure 2.
Distribution of gestational weight gain (GWG) categories for 3 methods of accounting for gestational age according to prepregnancy body mass index (weight (kg)/height (m)²) in California (2007–2017), Nevada (2010–2017), and Oregon (2008–2017). The Institute of Medicine (IOM) categories model used IOM guidelines for a given gestational age to categorize GWG as falling under, within, or exceeding the guidelines. The gestational-age–controlled (GAC) model categorized GWG using absolute total weight gain during pregnancy (<10 pounds (<4.5 kg), 10–19 pounds (4.5–9.0 kg), 20–29 pounds (9.1–13.5 kg), 30–39 pounds (13.6–18.1 kg), or ≥40 pounds (≥18.2 kg)). The z score model calculated results using z score for gestational age. BMI categories: underweight, <18.5; normal-weight, 18.5–24.9; overweight, 25.0–29.9; class I obese, 30.0–34.9; class II obese, 35.0–39.9; class III obese, ≥40.0.
Figure 3.

Proportions of preterm births in externally derived (black) and internally derived (gray) z score categories within prepregnancy body mass index (weight (kg)/height (m)²) categories (underweight (A), normal-weight (B), overweight (C), obese class I (D), obese class II (E), or obese class III (F)) in California (2007–2017), Nevada (2010–2017), and Oregon (2008–2017). Dashed lines represent the prevalence of preterm birth within each body mass index category.
Table 2.
Mean (Standard Deviation) Gestational Weight Gain (in Poundsᵃ) for 3 Different Methods of Accounting for Gestational Age, by Prepregnancy Body Mass Indexᵇ, California (2007–2017), Nevada (2010–2017), and Oregon (2008–2017)
| Method and GWG Category | Underweight (<18.5) (n = 217,545) | Normal-Weight (18.5–24.9) (n = 2,645,102) | Overweight (25.0–29.9) (n = 1,414,005) | Obese Class I (30.0–34.9) (n = 706,812) | Obese Class II (35.0–39.9) (n = 301,319) | Obese Class III (≥40.0) (n = 180,568) |
|---|---|---|---|---|---|---|
| IOM guidelines | ||||||
| Under | 21.2 (5.2) | 15.9 (5.6) | 7.2 (7.0) | 3.5 (7.9) | 2.1 (8.7) | −0.1 (10.1) |
| Within | 32.2 (3.5) | 26.4 (2.9) | 18.7 (2.6) | 14.4 (1.8) | 14.3 (1.8) | 14.2 (1.8) |
| Exceed | 47.0 (9.6) | 41.0 (9.6) | 36.6 (11.4) | 32.3 (12.3) | 31.6 (12.4) | 31.3 (12.6) |
| Total GWGᶜ | ||||||
| <10 pounds (<4.5 kg) | 5.1 (4.1) | 4.0 (5.8) | 2.6 (7.3) | 1.6 (7.9) | 0.4 (8.6) | −1.8 (9.9) |
| 10–19 pounds (4.5–9.0 kg) | 15.8 (2.7) | 15.4 (2.8) | 15.0 (2.8) | 14.7 (2.9) | 14.5 (2.9) | 14.3 (2.9) |
| 20–29 pounds (9.1–13.5 kg) | 24.8 (2.8) | 24.6 (2.9) | 24.2 (2.9) | 24.0 (2.9) | 23.8 (2.9) | 23.7 (2.9) |
| 30–39 pounds (13.6–18.1 kg) | 33.9 (2.9) | 33.8 (2.9) | 33.8 (2.9) | 33.6 (2.9) | 33.5 (2.9) | 33.5 (2.9) |
| ≥40 pounds (≥18.2 kg) | 48.6 (9.3) | 48.4 (8.9) | 49.5 (9.8) | 50.1 (10.5) | 50.4 (10.8) | 50.8 (11.1) |
| z score for GA | ||||||
| <−1.50 | 13.9 (4.3) | 13.3 (5.1) | 6.0 (6.7) | −2.1 (8.0) | −12.3 (8.5) | −21.5 (7.2) |
| −1.50 to −0.51 | 23.7 (2.8) | 24.6 (3.2) | 19.9 (3.8) | 13.2 (3.9) | 5.5 (4.1) | −2.7 (4.4) |
| −0.50 to 0.50 | 33.1 (3.6) | 34.7 (3.8) | 33.1 (4.9) | 26.8 (5.2) | 21.0 (5.8) | 15.4 (6.6) |
| 0.51 to 1.50 | 45.0 (4.7) | 47.2 (4.7) | 49.4 (6.1) | 44.3 (6.4) | 40.3 (7.5) | 37.5 (8.9) |
| >1.50 | 63.9 (10.0) | 65.5 (8.8) | 71.6 (9.2) | 68.6 (9.9) | 68.0 (10.7) | 72.2 (10.5) |
Abbreviations: GA, gestational age; GWG, gestational weight gain; IOM, Institute of Medicine.
a 1 pound = 0.45 kg.
b Weight (kg)/height (m)².
c Used for GA-controlled models.
Main results
Primary results for associations between the 3 approaches and the outcomes are shown graphically in Figure 4, and numerical results are shown in Web Tables 1–3 (available at https://doi.org/10.1093/aje/kwac120). Sensitivity analyses depicting results for different z score models are partially shown in Figure 5, and numerical results are shown in Web Tables 4–6.
Figure 4.
Risk ratios and 95% confidence intervals (bars) for the associations between gestational weight gain (GWG) and 3 pregnancy outcomes (cesarean delivery (A), small-for-gestational-age (<10th percentile) birth (B), and low birth weight (<2,500 g) (C)) according to body mass index (BMI; weight (kg)/height (m)²) category, derived using 3 different methods (the Institute of Medicine (IOM) categories (IOM-cat) model (rows 1 and 2); the gestational-age–controlled (GAC) model (rows 3–6); and the z score for gestational age (GA) model (rows 7–10)), California (2007–2017), Nevada (2010–2017), and Oregon (2008–2017). The IOM-cat model used IOM guidelines for a given gestational age to categorize GWG. The GAC model categorized GWG using absolute total weight gain (<10 pounds (<4.5 kg), 10–19 pounds (4.5–9.0 kg), 20–29 pounds (9.1–13.5 kg), 30–39 pounds (13.6–18.1 kg), or ≥40 pounds (≥18.2 kg)). The z score model calculated results using z score for gestational age. The reference group for the IOM-cat model was “within guidelines”; the reference group for the GAC model was 20–29 pounds (9.1–13.5 kg); and the reference group for the z score model was −0.50 to 0.50. BMI categories: underweight, <18.5; normal-weight, 18.5–24.9; overweight, 25.0–29.9; class I obese, 30.0–34.9; class II obese, 35.0–39.9; class III obese, ≥40.0. Narrow 95% confidence intervals are hidden by point estimate markers; numerical results are shown in Web Tables 1–3.
Figure 5.
Risk ratios and 95% confidence intervals (bars) for the associations between gestational weight gain (GWG) and 3 pregnancy outcomes (cesarean delivery (A), small-for-gestational-age (<10th percentile) birth (B), and low birth weight (<2,500 g) (C)) according to body mass index (BMI; weight (kg)/height (m)²) category, derived using the z score model (circles) and z score adjusted for gestational age (GA) (triangles), California (2007–2017), Nevada (2010–2017), and Oregon (2008–2017). The z score model included the z score and covariates; the z score + GA model included the z score, GA, and covariates. The reference group was z score −0.50 to 0.50. BMI categories: underweight, <18.5; normal-weight, 18.5–24.9; overweight, 25.0–29.9; class I obese, 30.0–34.9; class II obese, 35.0–39.9; class III obese, ≥40.0. Narrow 95% confidence intervals are hidden by point estimate markers; numerical results are shown in Web Tables 4–6.
Cesarean delivery.
Across all approaches, there was an increased risk of cesarean delivery among women who were in higher categories of GWG and a decreased risk among women in lower categories of GWG (Figure 4, Web Table 1). For the highest categories of GWG (IOM model, exceeded recommendation; GAC model, ≥40 pounds (≥18.2 kg); z score model, >1.50) versus the referent group, there was a consistent pattern of higher RRs among lower BMI groups. Inclusion of gestational age as a covariate in the z score model did not meaningfully change RRs (Figure 5, Web Table 4).
SGA birth.
Similar to cesarean delivery, SGA results were comparable across methods (Figure 4, Web Table 2). Women who gained less weight were more likely to have an infant who was born SGA. In general, RRs differed across BMI categories; women who had an underweight BMI had RRs further from the null, whereas women with class III obesity had RRs closer to the null. RRs were not meaningfully different when gestational age was included in the z score model (Figure 5, Web Table 5).
Low birth weight.
Results for LBW analyses were not consistent across methods (Figure 4, Web Table 3). The IOM-cat model and the GAC model produced similar results; women who gained more weight were less likely to have an infant who was LBW. However, results from the primary z score model indicated that women in the higher BMI classes who had higher z scores were more likely to have an LBW infant. Women with z scores of 0.51 to 1.50, compared with those with z scores of −0.50 to 0.50, had the following RRs for each BMI category, from lowest to highest BMI, respectively: RR = 0.79 (95% CI: 0.75, 0.83); RR = 0.87 (95% CI: 0.86, 0.89); RR = 1.06 (95% CI: 1.03, 1.09); RR = 1.05 (95% CI: 1.02, 1.08); RR = 1.16 (95% CI: 1.11, 1.20); and RR = 1.12 (95% CI: 1.07, 1.17). This pattern was more pronounced among women who had a z score greater than 1.50. RRs changed substantially, often changing direction to be more similar to the IOM-cat and GAC results, when gestational week was included in the z score model (Figure 5, Web Table 6). For example, among women with class III obesity, the RR for those with a z score of 0.51 to 1.50 (referent: −0.50 to 0.50) was 0.89 with adjustment, as compared with 1.12 without adjustment.
Sensitivity analysis
Results did not differ when analyses excluded women who delivered at gestational ages that exceeded the z score charts (e.g., excluding women with a normal BMI who delivered at 41 weeks). Using the z score charts derived from the Swedish population (correlation with gestational age: ρ = 0.003; P < 0.0001) did not meaningfully change results or interpretations from the primary external standard, and the inclusion of gestational age as a covariate similarly corrected the counterintuitive findings for LBW (Web Tables 4–6). Use of internally derived z scores, based on distributions of GWG among births at each gestational age, yielded results consistent with those from the IOM-cat and GAC models, showing a protective association of gaining more weight (higher z score) with LBW (Table 3, Web Tables 4–6). Results from the GAC model using a continuous measure of GWG showed similar patterns (Web Table 7).
Table 3.
Risk Ratios for the Association Between Weight-Gain-for-Gestational-Age z Scoreᶜ and Low Birth Weight Among Obese Class IIIᵃ (Body Mass Indexᵇ ≥40.0) Births, Derived Using Different z Score Standards, California (2007–2017), Nevada (2010–2017), and Oregon (2008–2017)
| Model | RR (z <−1.50) | 95% CI | RR (z −1.50 to −0.51) | 95% CI | RR (z 0.51 to 1.50) | 95% CI | RR (z >1.50) | 95% CI |
|---|---|---|---|---|---|---|---|---|
| External z scoreᵈ,ᵉ | 1.40 | 1.28, 1.54 | 1.00 | 0.93, 1.07 | 1.12 | 1.07, 1.17 | 1.85 | 1.63, 2.09 |
| External z score + GAᶠ,ᵍ | 1.20 | 1.10, 1.32 | 1.09 | 1.03, 1.16 | 0.89 | 0.85, 0.93 | 0.81 | 0.71, 0.91 |
| External z score (Sweden)ᵈ,ᵉ | 1.45 | 1.35, 1.56 | 1.05 | 0.99, 1.12 | 1.37 | 1.30, 1.44 | 2.52 | 2.35, 2.70 |
| External z score + GA (Sweden)ᶠ,ᵍ | 1.14 | 1.06, 1.23 | 1.08 | 1.01, 1.15 | 0.92 | 0.88, 0.97 | 0.85 | 0.80, 0.92 |
| Internal z scoreᵈ,ʰ | 1.15 | 1.05, 1.24 | 1.06 | 1.00, 1.11 | 0.83 | 0.79, 0.88 | 0.75 | 0.69, 0.82 |
Abbreviations: CI, confidence interval; GA, gestational age; RR, risk ratio.
a Full results for other prepregnancy body mass index categories are presented in Web Table 6.
b Weight (kg)/height (m)².
c The reference group for the z score model was −0.50 to 0.50.
d Models adjusted for maternal age, race, education, type of health insurance, child’s sex, and birth order.
f Models additionally adjusted for GA at delivery.
g External Swedish z scores were taken from Johansson et al. (26).
h Internal z scores were calculated based on GA at delivery.
DISCUSSION
Motivated by the current debate in the literature surrounding preferred methods for studying total GWG, we applied 3 different approaches to observed cross-sectional data addressing the relationship between GWG and 3 pregnancy outcomes, within categories of prepregnancy BMI. In our study, we found that for cesarean delivery and SGA, outcomes that are modestly associated with or independent of gestational duration, the choice of method did not affect the overall interpretation of the results. For LBW, an outcome that is strongly associated with gestational duration, the IOM-cat model and the GAC model yielded similar patterns of RRs; however, the z score models standardized to external reference populations of ongoing pregnancies produced results that were meaningfully different from the other models and previous literature (28), because of residual correlation between gestational duration and z score. Across all methods, we found that higher GWG was associated with an increased risk of cesarean delivery and a decreased risk of SGA birth. For all outcomes, RRs varied across BMI categories, including obesity subclasses, supporting the recommendation to investigate effect modification by prepregnancy BMI in studies of GWG, and highlighting the need for future studies to stratify by obesity subclass when possible.
We observed meaningful differences in results and conclusions between the externally derived z score approach and the other approaches for the outcome of LBW. Our study utilized real-world exposure and outcome data, so without simulation we cannot directly identify which method is least biased when results differ. Nonetheless, several factors suggest that the use of externally derived z scores in this context is confounded by gestational duration. Our work built on that of Hinkle et al. (7), who used simulated pregnancy outcome data to demonstrate a potential for bias when the z score model used to define GWG is misspecified. In our study, when a commonly used external standard was used to calculate z scores, the estimated associations between z score and LBW were biologically implausible and differed from those of the other approaches: Women who had the highest z scores were estimated to be more likely to have an LBW infant. This was due to a higher proportion of preterm births among those with the highest z scores; accordingly, adding control for gestational duration to the z score model yielded results that were consistent with the other 2 approaches. For example, among women with class III obesity, those with a z score greater than 1.50 had 1.85 times the risk of delivering an LBW infant as women with a z score of −0.50 to 0.50, and preterm births constituted 19.7% of the former group, as compared with 8% of the latter. A higher-than-expected prevalence of preterm birth within specific z score categories also explains the strong positive associations seen in other BMI categories, such as among underweight women with a z score less than −1.50. This result was also strongly attenuated when gestational age was controlled.
The use of external z scores is becoming more common in epidemiologic research to remove the correlation between gestational duration and GWG (19–22, 24, 25, 29). However, within our data the externally derived z scores were associated with gestational age at delivery, with the specific pattern of association varying by prepregnancy BMI. Our BMI-specific internally derived z scores, calculated to have the same distribution of z scores at each gestational week with a median z score of 0, yielded results more consistent with the other approaches. Although we are not advocating the use of internal z scores in future studies (which are conceptually similar to direct adjustment for gestational age at birth), the comparison highlighted how the choice of risk set used to calculate the z score has implications for the presence of a residual association between z score and gestational age at delivery. Gestational-week–specific externally derived z scores calculated longitudinally based on risk sets of ongoing pregnancies are appealing for outcomes for which the risk set is considered to be ongoing pregnancies rather than births at a given gestational age. Nonetheless, whenever the distribution of GWG differs between births and ongoing pregnancies at a given gestational age, z scores based on ongoing pregnancies will not be independent of gestational age at delivery. Thus, if gestational age at birth is also a cause of the outcome, confounding should be expected with the use of external z scores. In addition, the external reference populations were restricted to uncomplicated, full-term pregnancies, thus excluding pregnancies ultimately resulting in preterm birth from the risk set; this also may have contributed to a residual association between z score and gestational duration (7). 
This could be an important consideration in studies utilizing the z score that focus on comparisons between women, as the residual association between z score and gestational age at delivery could vary across groups—for example, across BMI groups in our study.
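The idea behind an internally derived z score can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: it standardizes GWG to mean 0 and standard deviation 1 within each combination of BMI category and gestational week at delivery, whereas the study's internal z scores were defined to have a median of 0 at each week. The field names (`bmi_cat`, `ga_week`, `gwg_kg`) and the toy records are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def internal_z_scores(records):
    """Standardize GWG within each (BMI category, gestational week) stratum.

    Returns z scores in the same order as the input records.
    """
    strata = defaultdict(list)
    for r in records:
        strata[(r["bmi_cat"], r["ga_week"])].append(r["gwg_kg"])

    # Stratum-specific mean and (population) standard deviation.
    stats = {key: (mean(vals), pstdev(vals)) for key, vals in strata.items()}

    z_scores = []
    for r in records:
        m, s = stats[(r["bmi_cat"], r["ga_week"])]
        # A stratum with no variation gets z = 0 to avoid division by zero.
        z_scores.append((r["gwg_kg"] - m) / s if s > 0 else 0.0)
    return z_scores


# Toy data: births at 40 weeks have gained more weight than births at 36 weeks.
records = [
    {"bmi_cat": "normal", "ga_week": 36, "gwg_kg": 10.0},
    {"bmi_cat": "normal", "ga_week": 36, "gwg_kg": 12.0},
    {"bmi_cat": "normal", "ga_week": 36, "gwg_kg": 14.0},
    {"bmi_cat": "normal", "ga_week": 40, "gwg_kg": 13.0},
    {"bmi_cat": "normal", "ga_week": 40, "gwg_kg": 15.0},
    {"bmi_cat": "normal", "ga_week": 40, "gwg_kg": 17.0},
]
z = internal_z_scores(records)
```

Because the reference distribution at each week is the set of births at that week, the z scores have the same center and spread at every gestational week by construction. A reference built from ongoing pregnancies carries no such guarantee when applied to births.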
It is also possible that the unexpected associations were due to the oversimplification of GWG as a cross-sectional measurement at the time of birth. There are several pathways that could potentially be biasing the relationship between z score and LBW (Figure 1B), and the interrelated causal effects of remaining in utero and GWG are complicated. Even though GWG is a trajectory over time, one simplification, conceived by Hinkle et al. (7), replaces the complex causal effects over time with an unmeasured variable U that captures the resulting correlations (Figure 1). Under this simplification, use of the z score without adjustment for gestational age at delivery requires the strong assumption that the total effects of gestational duration and GWG are captured in the z score (Figure 1B, if dotted line were removed). With the adjustment for gestational age at delivery, the assumption that the z score fully captures the effect of gestational age can be dropped, bringing the assumptions, and limitations, more in line with those needed for use of the GAC approach. It is possible that after adjustment for gestational duration, the z score estimates were more similar to the GAC estimates because both models had collider bias due to conditioning on gestational age at delivery (Figure 1). However, collectively our analyses suggest that residual confounding by gestational age explains differences in results across methods for the association between z score and LBW.
We included the IOM guidelines as a method in our analyses because they are the most commonly used classification of GWG in the literature, they are frequently used in clinical practice, and they indirectly adjust for gestational age by assuming linear weight gain in the second and third trimesters. Although conceptually similar to z scores in that they adjust indirectly for gestational length using an external reference, the IOM guidelines were not designed to be completely independent of gestational age and are only able to crudely categorize women as below, within, or exceeding the current guidelines. Additionally, whereas the IOM thresholds were designed to reduce adverse perinatal outcomes, it is unclear what range of z scores would reduce these outcomes. In contrast to what we observed for the external z scores, the proportion of preterm births did not vary across IOM categories (Web Figure 1), indicating that in our study the IOM-cat models may have been less affected by residual confounding by gestational duration.
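The way a linear second- and third-trimester gain translates into a gestational-week-specific range can be sketched as below. The weekly rates and the assumed first-trimester gain of 0.5 to 2.0 kg are taken from the 2009 IOM guidelines (6). The formula itself, including the 13-week cutoff for the end of the first trimester and the function names, is an assumption of this sketch; the exact operationalization used in the study is defined in its Methods and may differ.

```python
# IOM (2009) recommended weekly gain in the second and third trimesters,
# in kg/week, as (lower, upper) by prepregnancy BMI category.
IOM_RATES_KG_PER_WEEK = {
    "underweight": (0.44, 0.58),
    "normal": (0.35, 0.50),
    "overweight": (0.23, 0.33),
    "obese": (0.17, 0.27),
}
FIRST_TRIMESTER_GAIN_KG = (0.5, 2.0)  # gain assumed by the IOM
FIRST_TRIMESTER_WEEKS = 13            # cutoff assumed by this sketch


def iom_range(bmi_cat, ga_weeks):
    """Recommended (lower, upper) total gain in kg at a given gestational week."""
    low_rate, high_rate = IOM_RATES_KG_PER_WEEK[bmi_cat]
    weeks_after_first = max(ga_weeks - FIRST_TRIMESTER_WEEKS, 0)
    lower = FIRST_TRIMESTER_GAIN_KG[0] + low_rate * weeks_after_first
    upper = FIRST_TRIMESTER_GAIN_KG[1] + high_rate * weeks_after_first
    return lower, upper


def classify_gwg(bmi_cat, ga_weeks, gwg_kg):
    """Classify total gain as below, within, or above the week-specific range."""
    lower, upper = iom_range(bmi_cat, ga_weeks)
    if gwg_kg < lower:
        return "below"
    if gwg_kg > upper:
        return "above"
    return "within"


# The same 14 kg gain is classified differently at 40 and 34 weeks.
print(classify_gwg("normal", 40, 14.0))  # within
print(classify_gwg("normal", 34, 14.0))  # above
```

Under this formulation the thresholds scale with gestational age at delivery, which is how the categories account indirectly for duration of gestation while still yielding only three coarse groups.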
Comparing RRs across methods in our study was challenging because of inherent differences in how each method quantified GWG. To aid in comparison, we reported RRs that were based on comparing different groups of women instead of using continuous measures of GWG. Contrasts were made between categories that had similar mean GWGs; mean GWG was similar across reference categories for all methods. Although there was certainly some misclassification due to self-reporting of prepregnancy weight, the primary purpose of the study was to compare results across different methods commonly implemented in the literature, and all 3 methods would be similarly affected by this misclassification.
In this study population of more than 5 million births in 3 US states, we showed that 3 commonly implemented methods of accounting for duration of gestation in studies of GWG yielded similar overall conclusions, except when the outcome was strongly related to gestational age (LBW). Under such conditions, the use of an externally derived z score can lead to meaningfully different conclusions than adjusting for gestational age as a covariate or using the IOM-recommended ranges. In light of previous work (7), these discrepancies likely reflect residual confounding by gestational duration in the z score approach. Our analysis was limited to total weight gain in relation to outcomes defined at the time of birth that, under certain assumptions, could be validly estimated by all 3 methods. The specific research question and study context have a large impact on the preferred approach, and perinatal outcomes such as preterm birth (gestational duration itself) or gestational diabetes, for which the z score is theoretically preferred, were beyond the scope of this analysis. Nonetheless, our comparative analysis can help inform the design and interpretation of the many studies of total GWG in relation to these outcomes. In conclusion, when only measures of total GWG are available, future investigators should consider the potential biases introduced by each method and the correlation of the outcome with gestational age at delivery, to best estimate the risks of adverse perinatal outcomes associated with GWG.
Supplementary Material
ACKNOWLEDGMENTS
Author affiliations: School of Public Health, University of Nevada, Reno, Nevada, United States (Megan Richards, Matthew J. Strickland, Lyndsey A. Darrow); Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, Georgia, United States (W. Dana Flanders); and Department of Environmental and Occupational Health, Rollins School of Public Health, Emory University, Atlanta, Georgia, United States (Mitchel Klein).
This work was supported by the National Institute of Environmental Health Sciences (grant R01ES028346).
The data for this study are available upon request from the individual states’ health departments.
This work was presented at the 53rd Annual Meeting of the Society for Epidemiologic Research (virtual), December 15–18, 2020.
Conflict of interest: none declared.
REFERENCES
- 1. LifeCycle Project-Maternal Obesity and Childhood Outcomes Study Group, Voerman E, Santos S, et al. Association of gestational weight gain with adverse maternal and infant outcomes. JAMA. 2019;321(17):1702–1715.
- 2. Hutcheon JA, Bodnar LM, Abrams B. Untangling gestational weight gain from gestational age in infant mortality studies. Am J Public Health. 2014;104(9):e1–e2.
- 3. Hutcheon JA, Bodnar LM. Good practices for observational studies of maternal weight and weight gain in pregnancy. Paediatr Perinat Epidemiol. 2018;32(2):152–160.
- 4. Goldstein RF, Abell SK, Ranasinha S, et al. Association of gestational weight gain with maternal and infant outcomes: a systematic review and meta-analysis. JAMA. 2017;317(21):2207–2225.
- 5. Kominiarek MA, Saade G, Mele L, et al. Association between gestational weight gain and perinatal outcomes. Obstet Gynecol. 2018;132(4):875–881.
- 6. Institute of Medicine (US) and National Research Council (US) Committee to Reexamine IOM Pregnancy Weight Guidelines, Rasmussen KM, Yaktine AL, eds. Weight Gain During Pregnancy: Reexamining the Guidelines. Washington, DC: National Academies Press; 2010.
- 7. Hinkle SN, Mitchell EM, Grantz KL, et al. Maternal weight gain during pregnancy: comparing methods to address bias due to length of gestation in epidemiological studies. Paediatr Perinat Epidemiol. 2016;30(3):294–304.
- 8. Harpsøe MC, Basit S, Bager P, et al. Maternal obesity, gestational weight gain, and risk of asthma and atopic disease in offspring: a study within the Danish National Birth Cohort. J Allergy Clin Immunol. 2013;131(4):1033–1040.
- 9. Leermakers ETM, Sonnenschein-van der Voort AMM, Gaillard R, et al. Maternal weight, gestational weight gain and preschool wheezing: the Generation R Study. Eur Respir J. 2013;42(5):1234–1243.
- 10. Polinski KJ, Liu J, Boghossian NS, et al. Maternal obesity, gestational weight gain, and asthma in offspring. Prev Chronic Dis. 2017;14:E109.
- 11. Davis RR, Hofferth SL, Shenassa ED. Gestational weight gain and risk of infant death in the United States. Am J Public Health. 2014;104(suppl 1):S90–S95.
- 12. Wilcox AJ, Weinberg CR, Basso O. On the pitfalls of adjusting for gestational age at birth. Am J Epidemiol. 2011;174(9):1062–1068.
- 13. Ananth CV, VanderWeele TJ. Placental abruption and perinatal mortality with preterm delivery as a mediator: disentangling direct and indirect effects. Am J Epidemiol. 2011;174(1):99–108.
- 14. VanderWeele TJ, Robins JM. Directed acyclic graphs, sufficient causes, and the properties of conditioning on a common effect. Am J Epidemiol. 2007;166(9):1096–1104.
- 15. Hutcheon JA, Platt RW, Abrams B, et al. A weight-gain-for-gestational-age z score chart for the assessment of maternal weight gain in pregnancy. Am J Clin Nutr. 2013;97(5):1062–1067.
- 16. Hutcheon JA, Platt RW, Abrams B, et al. Pregnancy weight gain charts for obese and overweight women. Obesity (Silver Spring). 2015;23(3):532–535.
- 17. Greenland S. Quantifying biases in causal models: classical confounding vs collider-stratification bias. Epidemiology. 2003;14(3):300–306.
- 18. Whitcomb BW, Schisterman EF, Perkins NJ, et al. Quantification of collider-stratification bias and the birthweight paradox. Paediatr Perinat Epidemiol. 2009;23(5):394–402.
- 19. Pickens CM, Hogue CJ, Howards PP, et al. The association between gestational weight gain z score and stillbirth: a case-control study. BMC Pregnancy Childbirth. 2019;19(1):451.
- 20. Bodnar LM, Himes KP, Abrams B, et al. Early-pregnancy weight gain and the risk of preeclampsia: a case-cohort study. Pregnancy Hypertens. 2018;14:205–212.
- 21. Freese KE, Himes KP, Hutcheon JA, et al. Excessive gestational weight gain is associated with severe maternal morbidity. Ann Epidemiol. 2020;50:52–56.e1.
- 22. Leonard SA, Abrams B, Main EK, et al. Weight gain during pregnancy and the risk of severe maternal morbidity by prepregnancy BMI. Am J Clin Nutr. 2020;111(4):845–853.
- 23. Widen EM, Nichols AR, Kahn LG, et al. Prepregnancy obesity is associated with cognitive outcomes in boys in a low-income, multiethnic birth cohort. BMC Pediatr. 2019;19(1):507.
- 24. Matias SL, Pearl M, Lyall K, et al. Maternal prepregnancy weight and gestational weight gain in association with autism and developmental disorders in offspring. Obesity (Silver Spring). 2021;29(9):1554–1564.
- 25. Pugh SJ, Hutcheon JA, Richardson GA, et al. Gestational weight gain, prepregnancy body mass index and offspring attention-deficit hyperactivity disorder symptoms and behaviour at age 10. BJOG. 2016;123(13):2094–2103.
- 26. Johansson K, Hutcheon JA, Stephansson O, et al. Pregnancy weight gain by gestational age and BMI in Sweden: a population-based cohort study. Am J Clin Nutr. 2016;103(5):1278–1284.
- 27. Zou G. A modified Poisson regression approach to prospective studies with binary data. Am J Epidemiol. 2004;159(7):702–706.
- 28. McDonald SD, Han Z, Mulla S, et al. High gestational weight gain and the risk of preterm birth and low birth weight: a systematic review and meta-analysis. J Obstet Gynaecol Can. 2011;33(12):1223–1233.
- 29. Badon SE, Dublin S, Nance N, et al. Gestational weight gain and adverse pregnancy outcomes by pre-pregnancy BMI category in women with chronic hypertension: a cohort study. Pregnancy Hypertens. 2021;23:27–33.