American Journal of Epidemiology. 2019 Oct 2;189(2):133–145. doi: 10.1093/aje/kwz213

Exercise During the First Trimester and Infant Size at Birth: Targeted Maximum Likelihood Estimation of the Causal Risk Difference

Samantha F Ehrlich 1,2, Romain S Neugebauer 1, Juanran Feng 1, Monique M Hedderson 1, Assiamira Ferrara 1
PMCID: PMC7156138  PMID: 31577030

Abstract

This cohort study sought to estimate the differences in risk of delivering infants who were small or large for gestational age (SGA or LGA, respectively) according to exercise during the first trimester of pregnancy (vs. no exercise) among 2,286 women receiving care at Kaiser Permanente Northern California in 2013–2017. Exercise was assessed by questionnaire. SGA and LGA were determined by the sex- and gestational-age-specific birthweight distributions of the 2017 US Natality file. Risk differences were estimated by targeted maximum likelihood estimation, with and without data-adaptive prediction (machine learning). Analyses were also stratified by prepregnancy weight status. Overall, exercise at the cohort-specific 75th percentile was associated with an increased risk of SGA of 4.5 (95% CI: 2.1, 6.8) per 100 births, and decreased risk of LGA of 2.8 (95% CI: 0.5, 5.1) per 100 births; similar findings were observed among the underweight and normal-weight women, but no associations were found among those with overweight or obesity. Meeting Physical Activity Guidelines was associated with increased risk of SGA and decreased risk of LGA but only among underweight and normal-weight women. Any vigorous exercise reduced the risk of LGA in underweight and normal-weight women only and was not associated with SGA risk.

Keywords: exercise, infant size, pregnancy

Abbreviations

IPTW: inverse probability of treatment weighting
LGA: large for gestational age
MET: metabolic equivalent
PA: physical activity
PETALS: Pregnancy Environment and Lifestyle Study
PPAQ: Pregnancy Physical Activity Questionnaire
SGA: small for gestational age
TMLE: targeted maximum likelihood estimation

The American College of Obstetricians and Gynecologists (1) and the Physical Activity Guidelines for Americans (2) recommend that pregnant women participate in 30 minutes of moderate intensity activity most days of the week. Meta-analyses (3, 4) of exercise intervention trials initiated, primarily, late in the first trimester or early in the second trimester, reported no association between exercise and delivering an infant that is small for gestational age (SGA) and either a lower risk of delivering an infant that is large for gestational age (LGA) (3) or no association with LGA (4); however, the evidence presented in the latter meta-analysis was rated very low quality (4). In terms of public health messaging, self-report of exercise behavior in population-based cohorts is particularly informative compared with structured exercise interventions in motivated trial participants. A meta-analysis of cohort studies found no association between leisure-time physical activity (PA) and SGA and reported there were too few studies to examine LGA (3). Only 2 studies (5, 6) included in this meta-analysis (3) considered leisure-time PA performed during the first trimester, and they reported no association with SGA (5, 6).

There is a paucity of data on exercise early in gestation. These data would be useful to clinicians advising women who are planning a pregnancy. Therefore, in a prospective cohort study, we sought to examine whether exercise during the first trimester was associated with delivering an infant who was SGA or LGA. We also set out, a priori, to examine potential differences in the relationship according to prepregnancy weight status. We use causal inference methods to address this question, specifically targeted maximum likelihood estimation (TMLE) (7–9) of causal risk differences, including TMLE with data-adaptive estimation (machine learning). These methods rely on specific assumptions (described below) under which observational data can emulate inference from a perfect randomized trial (i.e., no noncompliance or loss to follow-up); when these assumptions hold, the resulting measures (i.e., population causal effect estimates) can be interpreted causally (10). Because almost half of US pregnancies are unintended (11), it would be inappropriate to address this research question with a randomized trial; thus, these methods represent the best available alternative for obtaining an estimate of the causal risk difference.

METHODS

The setting for this study is Kaiser Permanente Northern California, a health-care delivery system serving 3.6 million members. The Kaiser Permanente Northern California patient population is racially/ethnically and socioeconomically diverse, and it is representative of the underlying population served (12).

Data come from the Pregnancy Environment and Lifestyle Study (PETALS), a longitudinal birth cohort study (13). Beginning October 2013, women initiating prenatal care at Kaiser Permanente Northern California at <11 gestational weeks were invited to attend a study visit at 10–13 weeks; the participation rate was 75% (13).

Data collection

The visit included a Pregnancy Physical Activity Questionnaire (PPAQ) (14–17), on which women reported time spent in 36 population-appropriate activities during the previous 2 months (including questions on yoga/Pilates, use of cardiovascular exercise machines, aerobic exercise classes, weight lifting/resistance exercises, and team sports (15, 16)). Participants selected one of 6 responses regarding the amount of time spent in each activity (e.g., none, <1/2 hour per day, 1/2 to almost 1 hour per day, 1 to almost 2 hours per day, 2 to almost 3 hours per day, or ≥3 hours per day). The minimum of the time category selected (duration and frequency) multiplied by the intensity, measured in metabolic equivalents (METs) allotted to that activity, gave an estimate of the volume of PA (MET-hours per day). MET values for walking and household tasks came from field-based measurements of pregnant women (18); otherwise Compendium-based MET values were used (19).
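The volume calculation described above can be sketched as follows. This is a minimal illustration, not the study's scoring code: the lower bound assigned to each response category (in particular the "<1/2 hour per day" category), the activity names, and the MET values paired with each activity are all assumptions made for the example.

```python
# Lower bound (hours per day) of each PPAQ response category.
# Assumed coding for illustration; the text does not say how
# "<1/2 hour per day" was scored, so 0.0 here is a placeholder.
CATEGORY_MIN_HOURS_PER_DAY = {
    "none": 0.0,
    "<1/2 hour per day": 0.0,
    "1/2 to almost 1 hour per day": 0.5,
    "1 to almost 2 hours per day": 1.0,
    "2 to almost 3 hours per day": 2.0,
    ">=3 hours per day": 3.0,
}


def met_hours_per_day(responses, met_values):
    """Sum (category lower bound in hours/day) x (MET value) over activities."""
    return sum(
        CATEGORY_MIN_HOURS_PER_DAY[category] * met_values[activity]
        for activity, category in responses.items()
    )


# Hypothetical participant and hypothetical activity-to-MET pairing.
example_responses = {
    "walking": "1/2 to almost 1 hour per day",
    "swimming": "1 to almost 2 hours per day",
    "jogging": "none",
}
example_mets = {"walking": 3.2, "swimming": 6.0, "jogging": 7.0}

daily_volume = met_hours_per_day(example_responses, example_mets)
print(f"{daily_volume:.1f} MET-hours per day")
```

For this hypothetical participant, the volume is 0.5 × 3.2 + 1.0 × 6.0 = 7.6 MET-hours per day.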

The study questionnaire provided information on age, race/ethnicity, marital status, parity, and education. It included a Block food frequency questionnaire (20, 21), assessing diet over the past 2 months, and a validated questionnaire assessing nausea and vomiting since the beginning of pregnancy (22). Body mass index was calculated as weight (kilograms) divided by height (meters) squared and classified according to standard thresholds (23). Infant sex, weight, and gestational age at birth were obtained from the electronic health record.

Exposure

The present study focused on sports and exercise, or activity that is intentional for health, wellness, or to increase fitness, and results in energy expenditure beyond the demands of everyday living. This domain encompasses 10 PPAQ activities of moderate intensity (range of 3.2–6.0 METs, for walking, swimming, etc.) and 2 of vigorous intensity (6.5 and 7.0 METs for walking quickly up hills and jogging, respectively). The volume of all activities summed provided an overall estimate of moderate to vigorous intensity sports and exercise activity (hereafter, “exercise”), in MET-hours per week.

The exposure was examined in 3 ways. First, because PA questionnaire data are most appropriate for ranking individuals with respect to the volume of activity performed, a “high” level of exercise was defined as meeting or exceeding the cohort-specific 75th percentile (i.e., ≥13.3 MET-hours per week) (5). Second, to inform public health messaging regarding exercise during pregnancy, exercise was defined as meeting or exceeding the lower bound of the Physical Activity Guidelines (2) recommendation of 150–300 minutes per week of moderate exercise, 75–150 minutes per week of vigorous exercise, or an equivalent combination of moderate/vigorous exercise (i.e., ≥7.5 MET-hours per week) (2, 5, 24). Finally, to inform recommendations pertaining to higher intensity exercise during pregnancy, performing any amount of vigorous exercise during the first trimester, regardless of volume, was examined.
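The three exposure definitions can be expressed as simple indicator variables. The sketch below uses the cutoffs stated in the text (13.3 and 7.5 MET-hours per week); the function names are hypothetical, and the 3-MET scoring of moderate activity used to recover the 7.5 figure is an assumption that happens to be consistent with the stated threshold (150 minutes × 3 METs = 450 MET-minutes = 7.5 MET-hours).

```python
HIGH_EXERCISE_CUTOFF = 13.3  # cohort-specific 75th percentile, MET-hours/week (from the text)
GUIDELINE_CUTOFF = 7.5       # guideline lower bound, MET-hours/week (from the text)


def guideline_lower_bound(minutes_per_week=150, mets=3.0):
    """Convert weekly minutes at an assumed MET value into MET-hours per week."""
    return minutes_per_week / 60 * mets


def classify_exposures(total_met_hours_week, vigorous_met_hours_week):
    """Return the three binary exposures examined in the study."""
    return {
        "high_exercise": total_met_hours_week >= HIGH_EXERCISE_CUTOFF,
        "meets_guidelines": total_met_hours_week >= GUIDELINE_CUTOFF,
        "any_vigorous": vigorous_met_hours_week > 0,
    }


# Hypothetical woman: some vigorous exercise, but below the guideline volume.
example = classify_exposures(total_met_hours_week=5.0, vigorous_met_hours_week=0.8)
print(guideline_lower_bound(), example)
```

Because the third exposure ignores volume, a woman can be classified as performing vigorous exercise without meeting the guideline threshold, as in the hypothetical example.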

Outcomes

SGA (≤10th percentile) and LGA (≥90th percentile) were based on the sex- and gestational-age-specific birthweight distributions of the 2017 US Natality files, which provide cutpoints for births occurring at 22–42 weeks’ gestation (25).

Participants

PETALS participants with PPAQ data who delivered before October 31, 2017, were eligible for the present study (n = 2,501). Women missing data on infant size at birth (n = 137) were excluded (e.g., pregnancy losses, stillbirth, those with unknown pregnancy outcomes). Those with a contraindication to PA during pregnancy (1) that was diagnosed before the study visit were excluded (n = 6), as were 72 women with implausible volumes of PA (15, 16). The final analytical cohort consisted of 2,286 mother-infant pairs.

Statistical analyses

Targeted maximum likelihood estimation (7–9) (TMLE) was used to estimate the average difference in risk of SGA and LGA for the self-reported levels of exercise. Inferences from the more commonly used inverse probability of treatment weighted (IPTW) (10, 26) estimators are also provided for comparison. All estimates are interpreted as the difference in risk had all women exercised at a specific level minus the risk had none exercised at that level. Missing covariate data were addressed using the missingness indicator method (27, 28) (i.e., an indicator of missing values was included in the adjustment set, and missing values were replaced with the median of observed values, which amounts to creating a separate missingness level for categorical variables).
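As an illustration of the missingness indicator method, the sketch below fills missing continuous values with the observed median and returns a companion 0/1 indicator; for a categorical variable, missing values simply become their own level. The function names and the toy values are hypothetical, and this is not the study's code.

```python
from statistics import median


def missing_indicator(values):
    """Median-fill a continuous covariate and flag which entries were missing."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    filled = [fill if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return filled, indicator


def add_missing_level(categories, label="missing"):
    """Treat missing categorical values as a separate level."""
    return [label if c is None else c for c in categories]


# Toy data for illustration only.
age = [31.0, None, 27.5, 35.0, None, 29.0]
age_filled, age_missing = missing_indicator(age)
parity = add_missing_level(["0", None, "1", ">=2"])

print(age_filled, age_missing, parity)
```

Both the filled covariate and its indicator would then enter the adjustment set together.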

For the IPTW analyses, the propensity for exercise was estimated by logistic regression, with main terms only, for the baseline covariates: age (continuous), prepregnancy body mass index (continuous), daily caloric intake (kcal; continuous), vomiting (vomited ≥5 times on an average day vs. not (reference)), marital status (married (reference), not married, and missing (n = 4)), race/ethnicity (Hispanic (largest group, reference), white, Asian/Pacific Islander, black, and other), parity (0 (reference), 1, ≥2, and missing (n = 2)), and education (high school or less, some college (reference), college graduate, postgraduate, and missing (n = 2)). Women missing caloric intake (n = 54) or with implausible estimates of daily caloric intake (i.e., <400 calories or >6,000 calories; n = 20) were assigned the cohort median value of 1,438 kcal.

Unadjusted (i.e., unweighted) and IPTW (with stabilized, untruncated weights) estimates were obtained using Proc Genmod with an independence structure for the variance-covariance matrix (SAS, version 9.3; SAS Institute, Inc., Cary, North Carolina) to derive robust (sandwich) estimates of variance and provide conservative inference. Stabilized weights were not truncated because all were within the range of 0.24–8.52 (Web Table 1, available at https://academic.oup.com/aje) (29).
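The logic of a stabilized IPTW risk difference can be sketched on toy data. This is a simplified stand-in for the analysis described above, not a reproduction of it: the propensity score is estimated by stratum proportions on a single binary covariate W rather than by logistic regression on the full covariate set, no variance is computed, and the data are made up for illustration.

```python
def make_rows(w, a, n, events):
    """n records with covariate w and exposure a; the first `events` have y = 1."""
    return [(w, a, 1 if i < events else 0) for i in range(n)]


# Toy data (w, a, y); counts are invented for illustration.
data = (
    make_rows(0, 1, 2, 1) + make_rows(0, 0, 8, 2)
    + make_rows(1, 1, 6, 3) + make_rows(1, 0, 4, 2)
)


def stabilized_weights(data):
    """sw_i = P(A = a_i) / P(A = a_i | W = w_i), with stratum-based propensities."""
    p_exposed = sum(a for _, a, _ in data) / len(data)
    weights = []
    for w, a, _ in data:
        stratum = [ai for wi, ai, _ in data if wi == w]
        propensity = sum(stratum) / len(stratum)
        numerator = p_exposed if a == 1 else 1 - p_exposed
        denominator = propensity if a == 1 else 1 - propensity
        weights.append(numerator / denominator)
    return weights


def weighted_risk(data, weights, arm):
    """Weighted proportion with the outcome among records in one exposure arm."""
    num = sum(sw * y for (_, a, y), sw in zip(data, weights) if a == arm)
    den = sum(sw for (_, a, _), sw in zip(data, weights) if a == arm)
    return num / den


sw = stabilized_weights(data)
iptw_rd = weighted_risk(data, sw, 1) - weighted_risk(data, sw, 0)
print(f"weight range: {min(sw):.2f}-{max(sw):.2f}, IPTW risk difference: {iptw_rd:.3f}")
```

Inspecting the range of the weights, as done here, mirrors the check reported in the text before deciding against truncation.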

To potentially improve estimation efficiency over IPTW (30), R, version 3.3.3 (R Foundation for Statistical Computing, Vienna, Austria), was used to implement a doubly robust, locally efficient estimator, TMLE (7–9). Valid inferences from TMLE depend upon correctly estimating the model for the propensity score or the outcome regression, and TMLE’s efficiency can be improved over that of IPTW. To avoid incorrect TMLE inference from misspecified parametric models (i.e., logistic) for propensity score and outcome regression, data-adaptive methods, such as SuperLearner (31), have been proposed to estimate nuisance parameters (9, 32–35). Three approaches were used to estimate the propensity score and the outcome regression portions of the likelihood: 1) the same logistic modeling approach described above, 2) data-adaptively with SuperLearner (31) using the default “learners” (7, 8) (i.e., prediction algorithms) only (i.e., Wrapper for Glm, Choose a Model by AIC in a Stepwise Algorithm, and Wrapper Function for SuperLearner Prediction Algorithm), and 3) data-adaptively with SuperLearner, using the defaults and the following extra learners: SL Wrapper For Biglasso, Discrete Bayesian Additive Regression Tree Sampler, ExtraTrees SuperLearner Wrapper, Elastic Net Regression Including Lasso and Ridge, Wrapper for Kernlab’s SVM Algorithm, Wrapper for Lm, SL Wrapper for Ranger, and Wrapper for Speedglm.
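The targeting step that distinguishes TMLE can be illustrated with a small, self-contained sketch. It is not the R implementation used in the study: the data are invented, the propensity score is estimated by stratum proportions on one binary covariate W, and the initial outcome regression deliberately ignores W so that the update has something to correct. All function names are hypothetical.

```python
import math


def expit(x):
    return 1 / (1 + math.exp(-x))


def logit(p):
    return math.log(p / (1 - p))


def make_rows(w, a, n, events):
    """n records with covariate w and exposure a; the first `events` have y = 1."""
    return [(w, a, 1 if i < events else 0) for i in range(n)]


# Toy data (w, a, y); counts are invented for illustration.
data = (
    make_rows(0, 1, 2, 1) + make_rows(0, 0, 8, 2)
    + make_rows(1, 1, 6, 3) + make_rows(1, 0, 4, 2)
)


def tmle_risk_difference(data, max_iter=50, tol=1e-12):
    # Step 1: initial outcome regression, here the arm-specific mean (ignores W).
    def arm_mean(arm):
        ys = [y for _, a, y in data if a == arm]
        return sum(ys) / len(ys)

    q_init = {1: arm_mean(1), 0: arm_mean(0)}

    # Step 2: propensity score P(A = 1 | W) from stratum proportions.
    g = {}
    for w in {w for w, _, _ in data}:
        exposures = [a for wi, a, _ in data if wi == w]
        g[w] = sum(exposures) / len(exposures)

    # Step 3: "clever covariate" H(A, W) = A / g(W) - (1 - A) / (1 - g(W)).
    def h(a, w):
        return a / g[w] - (1 - a) / (1 - g[w])

    # Step 4: fit the fluctuation parameter by Newton's method, i.e. a logistic
    # regression of Y on H with offset logit(initial prediction), no intercept.
    eps = 0.0
    for _ in range(max_iter):
        score, info = 0.0, 0.0
        for w, a, y in data:
            p = expit(logit(q_init[a]) + eps * h(a, w))
            score += h(a, w) * (y - p)
            info += h(a, w) ** 2 * p * (1 - p)
        step = score / info
        eps += step
        if abs(step) < tol:
            break

    # Step 5: updated predictions under each exposure level, averaged over W.
    def q_star(a, w):
        return expit(logit(q_init[a]) + eps * h(a, w))

    return sum(q_star(1, w) - q_star(0, w) for w, _, _ in data) / len(data)


tmle_rd = tmle_risk_difference(data)
print(f"TMLE risk difference: {tmle_rd:.3f}")
```

In this toy setting the crude risk difference is about 0.167, whereas the covariate-standardized difference is 0.125. Because the stratum-based propensity score is correct here, the targeting step moves the misspecified initial fit to the standardized value, which illustrates the double robustness described above.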

Analyses stratified by prepregnancy weight status (underweight/normal weight vs. overweight/obese) estimated subgroup-specific associations. Post hoc analyses estimated the association of exercise with the risk of SGA plus respiratory distress syndrome, hypoglycemia, or hyperbilirubinemia, because decreased substrate availability and oxygenation during pregnancy, potential consequences of exercise, could hypothetically increase neonatal hypoglycemia and hyperbilirubinemia, and SGA is a risk factor for respiratory distress syndrome (see Web Table 2).

The study was approved by the institutional review boards of Kaiser Permanente Northern California and the University of Tennessee Knoxville.

RESULTS

Cohort characteristics (n = 2,286) are presented in Table 1. The PPAQ was completed at a median of 13.0 weeks’ gestation (interquartile range, 3.0). Prepregnancy weight (for body mass index) was either a clinic-measured weight recorded in the electronic health record from up to 1 year prior to the last menstrual period through that date (78.5%, n = 1,794), a clinic-measured pregnancy weight ascertained before 10 weeks’ gestation (18.9%, n = 433), or self-reported at the study visit if both of the electronic health record weights were missing (2.6%, n = 59). Those who met or exceeded the cohort-specific 75th percentile for exercise were less likely to report vomiting ≥5 times on an average day, attained higher levels of education, and had a smaller proportion of nonwhite participants. Overall, the prevalence of SGA was 9.6% (n = 219) and LGA 10.6% (n = 243; Table 1).

Table 1.

Characteristics of the Analytical Cohort, Pregnancy Environment and Lifestyle Study (n = 2,286), Kaiser Permanente Northern California, 2013–2017

Columns: Characteristic; Total No. (n = 2,286); Overall (No., %); Met the Cohort-Specific 75th Percentile for Moderate to Vigorous Intensity Exercise: Yes a (n = 571) (No., %); No b (n = 1,715) (No., %)
Age, years 2,286
 ≤25.0 years 468 20.5 106 18.6 362 21.1
 25.1–34.9 1,329 58.1 332 58.1 997 58.1
 ≥35.0 489 21.4 133 23.3 356 20.8
Prepregnancy BMIc 2,286
 Underweight 62 2.7 16 2.8 46 2.7
 Normal weight 920 40.2 224 39.2 696 40.6
 Overweight 657 28.7 167 29.3 490 28.6
 Obese 647 28.3 164 28.7 483 28.2
Diet, kcal 2,212
 <1,000 409 18.5 96 17.2 313 18.9
 1,000–2,000 1,342 60.7 328 58.7 1,014 61.3
 >2,000 461 20.8 135 24.2 326 19.7
Vomited ≥5 times on an average dayd 2,283 69 3.0 10 1.8 59 3.4
Married 2,282 1,552 68.0 395 69.4 1,157 67.5
Race/ethnicitye 2,286
 Hispanic 953 41.7 229 40.1 724 42.2
 White 496 21.7 175 30.7 321 18.7
 Asian American/Pacific Islander 538 23.5 96 16.8 442 25.8
 Black 220 9.6 48 8.4 172 10.0
 Other 79 3.5 23 4.0 56 3.3
Parity 2,284
 0 998 43.7 257 45.1 741 43.2
 1 833 36.5 203 35.6 630 36.8
 ≥2 453 19.8 110 19.3 343 20.0
Educatione 2,283
 High school or less 319 14.0 59 10.4 260 15.2
 Some college 881 38.6 195 34.2 686 40.1
 College graduate 645 28.3 189 33.2 456 26.6
 Postgraduate 438 19.2 127 22.3 311 18.2
SGA 2,286 219 9.6 67 11.7 152 8.9
LGA 2,286 243 10.6 52 9.1 191 11.1

Abbreviations: BMI, body mass index; LGA, large for gestational age; MET, metabolic equivalent; SGA, small for gestational age.

a Met: ≥13.3 MET-hours per week.

b Not met: <13.3 MET-hours per week.

c Weight (kg)/height (m)². Categories: underweight, <18.5; normal weight, 18.5–24.9; overweight, 25.0–29.9; obese, ≥30.0.

d χ² P < 0.05.

e χ² P < 0.0001.

In the full cohort, 40.8% (n = 932) met the Physical Activity Guidelines and 36.9% (n = 843) reported participating in some amount of vigorous intensity exercise. Similar overall volumes of exercise were performed by those who met the Physical Activity Guidelines and those participating in vigorous intensity exercise (median 15.6 (interquartile range, 13.0) vs. 13.1 (interquartile range, 16.1) MET-hours per week, respectively). There were 240 women (10.5%) who participated in vigorous intensity exercise but did not meet the Physical Activity Guidelines; they reported a median 0.8 (interquartile range, 0.1) MET-hours per week of vigorous intensity exercise (comparable to running for approximately 13 minutes per week).

Estimates of the differences in SGA and LGA risk for meeting the cohort-specific 75th percentile for exercise, versus not, are presented in Table 2. In the full cohort, in the crude and all adjusted analyses, statistically significant increases in SGA risk were observed for exercise at or above the cohort-specific 75th percentile for exercise; the adjusted estimate obtained from the data-adaptive TMLE with extra learners was 4.5 (95% CI: 2.1, 6.8) additional cases of SGA per 100 births. For LGA, only the estimate obtained from the data-adaptive TMLE with extra learners attained statistical significance, indicating 2.8 (95% CI: 0.5, 5.1) fewer cases of LGA per 100 births if all had met the 75th percentile for exercise versus not (Table 2).
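The crude full-cohort estimates can be checked directly from the SGA and LGA counts in Table 1, and the per-100-births figures quoted in the text are the risk differences multiplied by 100. The short sketch below does this arithmetic; the function names are hypothetical, and confidence intervals are not computed because the published intervals come from robust variance estimates.

```python
def risk_difference(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Risk among the exposed minus risk among the unexposed."""
    return events_exposed / n_exposed - events_unexposed / n_unexposed


def per_100_births(rd):
    """Express a risk difference as cases per 100 births."""
    return round(rd * 100, 1)


# Counts from Table 1: met the 75th percentile (n = 571) vs. not (n = 1,715).
crude_sga_rd = risk_difference(67, 571, 152, 1715)
crude_lga_rd = risk_difference(52, 571, 191, 1715)

print(round(crude_sga_rd, 4), round(crude_lga_rd, 4))
print(per_100_births(0.0447), per_100_births(-0.0279))
```

The crude values, 0.0287 for SGA and −0.0203 for LGA, match the crude row of Table 2, and the adjusted estimates of 0.0447 and −0.0279 correspond to 4.5 additional and 2.8 fewer cases per 100 births.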

Table 2.

Estimates of the Causal Risk Differences for Delivering Small- and Large-for-Gestational-Age Neonates If All Women Had Exercised at or Above the Cohort-Specific 75th Percentilea Versus Not, in the Full Cohort and Stratified by Prepregnancy Weight Status, Pregnancy Environment and Lifestyle Study, Kaiser Permanente Northern California, 2013–2017

Estimation Method SGA LGA
RD 95% CI P Value RD 95% CI P Value
Full cohort
   Crude 0.0287 0.000835, 0.0566 0.04 −0.0203 −0.0495, 0.00890 0.17
   Adjusted
      Stabilized IPTW with logistic modelsb 0.0427 0.00910, 0.0763 0.01 −0.0246 −0.0530, 0.00390 0.09
   TMLE
      User-specified logistic modelsc 0.0429 0.0101, 0.0756 0.01 −0.0246 −0.0524, 0.00319 0.08
      Data-adaptive with default learnersd 0.0441 0.0111, 0.0771 0.009 −0.0239 −0.0519, 0.00417 0.10
      Data-adaptive with default and extra learnerse 0.0447 0.0212, 0.0681 0.0002 −0.0279 −0.0505, −0.00540 0.02
 Stratified by prepregnancy weight status
 Underweight and normal weight
  Crude 0.0462 −0.000233, 0.0927 0.05 −0.0224 −0.0618, 0.0170 0.27
  Adjusted
   Stabilized IPTW with logistic modelsb 0.0869 0.0203, 0.154 0.01 −0.0368 −0.0713, −0.00230 0.04
  TMLE
   User-specified logistic modelsc 0.0879 0.0273, 0.148 0.004 −0.0345 −0.0692, 0.000160 0.05
   Data-adaptive with default learnersd 0.0850 0.0260, 0.144 0.005 −0.0331 −0.0683, 0.00216 0.07
   Data-adaptive with default and extra learnerse 0.0850 0.0370, 0.133 0.0005 −0.0349 −0.0648, −0.00500 0.02
 Overweight and obese
  Crude 0.0166 −0.0176, 0.0507 0.34 −0.0198 −0.0613, 0.0218 0.35
  Adjusted
   Stabilized IPTW with logistic modelsb 0.0165 −0.0206, 0.0535 0.38 −0.0108 −0.0541, 0.0325 0.63
  TMLE
   User-specified logistic modelsc 0.0169 −0.0197, 0.0536 0.37 −0.0101 −0.0523, 0.0320 0.64
   Data-adaptive with default learnersd 0.0172 −0.0197, 0.0541 0.36 −0.00971 −0.0510, 0.0316 0.65
   Data-adaptive with default and extra learnerse 0.0163 −0.00968, 0.0422 0.22 −0.0126 −0.0439, 0.0188 0.43

Abbreviations: CI, confidence interval; IPTW, inverse probability of treatment weighting; LGA, large for gestational age; MET, metabolic equivalent; RD, risk difference; SGA, small for gestational age; TMLE, targeted maximum likelihood estimation.

a ≥13.3 MET-hours per week.

b Logistic regression for the propensity score model included the baseline covariates maternal age, prepregnancy body mass index, marital status, race/ethnicity, educational attainment, parity, daily caloric intake, and vomiting; P values and 95% CIs from Proc Genmod with independence as the identity matrix for robust/sandwich estimator of the variance to provide conservative inference.

c Logistic regression with the same baseline covariates to specify the propensity score and outcome models.

d SuperLearner (31) with default learners only (i.e., Wrapper for Glm, Choose a Model by AIC in a Stepwise Algorithm, and Wrapper Function for SuperLearner Prediction Algorithm) and the same baseline covariates to specify the propensity score and outcome models.

e SuperLearner (31) with defaults plus additional learners (i.e., SL Wrapper for Biglasso, Discrete Bayesian Additive Regression Tree Sampler, ExtraTrees SuperLearner Wrapper, Elastic Net Regression Including Lasso and Ridge, Wrapper for Kernlab’s SVM Algorithm, Wrapper for Lm, SL Wrapper for Ranger, and Wrapper for Speedglm) and the same baseline covariates to specify the propensity score and outcome models.

In underweight and normal-weight women (n = 982), statistically significant increases in SGA risk were observed for meeting the cohort-specific 75th percentile, versus not, in all adjusted analyses; the crude estimate was of borderline significance (P = 0.05) (Table 2). The estimate obtained from the model with data-adaptive TMLE with extra learners was 8.5 (95% CI: 3.7, 13.3) additional cases of SGA per 100 births if all underweight and normal-weight women had exercised at this level versus not. In underweight and normal-weight women, meeting the cohort-specific 75th percentile for exercise statistically significantly decreased LGA risk, but only in the models with adjusted IPTW and the data-adaptive TMLE with extra learners. The latter revealed 3.5 (95% CI: 0.5, 6.5) fewer cases of LGA per 100 births if all underweight and normal-weight women had exercised at this level versus not. In women with overweight or obesity (n = 1,304), meeting the cohort-specific 75th percentile for exercise was not statistically significantly associated with changes in SGA risk or changes in LGA risk (Table 2).

Risk difference estimates for meeting the Physical Activity Guidelines (i.e., ≥7.5 MET-hours per week), versus not, are presented in Table 3. No risk difference estimates for exercise at this level attained statistical significance in the full cohort, nor did they among women with overweight or obesity. In underweight and normal-weight women, statistically significant increases in SGA risk and decreases in LGA risk were observed for meeting the Physical Activity Guidelines in the adjusted IPTW and all TMLE analyses. The model with data-adaptive TMLE with extra learners revealed 5.1 (95% CI: 1.6, 8.7) additional cases of SGA and 3.5 (95% CI: 0.7, 6.3) fewer cases of LGA per 100 births had all underweight and normal-weight women met the Physical Activity Guidelines versus not (Table 3).

Table 3.

Estimates of the Causal Risk Differences for Delivering Small- and Large-for-Gestational-Age Neonates If All Women Had Exercised at or Above the Physical Activity Guidelinesa Versus Not, in the Full Cohort and Stratified by Prepregnancy Weight Status, Pregnancy Environment and Lifestyle Study, Kaiser Permanente Northern California, 2013–2017

SGA LGA
Estimation Method RD 95% CI P Value RD 95% CI P Value
Full cohort
 Crude 0.0122 −0.0124, 0.0367 0.33 −0.0128 −0.0385, 0.0129 0.33
 Adjusted
  Stabilized IPTW with logistic modelsb 0.0213 −0.00510, 0.0478 0.11 −0.0166 −0.0422, 0.00910 0.21
 TMLE
  User-specified logistic modelsc 0.0219 −0.00439, 0.0481 0.10 −0.0179 −0.0432, 0.00727 0.16
  Data-adaptive with default learnersd 0.0219 −0.00439, 0.0481 0.10 −0.0178 −0.0431, 0.00749 0.17
  Data-adaptive with default and extra learnerse 0.0214 −0.00403, 0.0469 0.10 −0.0168 −0.0407, 0.00701 0.17
Stratified by prepregnancy weight status
 Underweight and normal weight
  Crude 0.0205 −0.0202, 0.0612 0.32 −0.0205 −0.0549, 0.0140 0.24
  Adjusted
   Stabilized IPTW with logistic modelsb 0.0557 0.00630, 0.105 0.03 −0.0370 −0.0697, −0.00420 0.03
  TMLE
   User-specified logistic modelsc 0.0551 0.00788, 0.102 0.02 −0.0353 −0.0677, −0.00291 0.03
   Data-adaptive with default learnersd 0.0550 0.00735, 0.103 0.02 −0.0340 −0.0667, −0.00141 0.04
   Data-adaptive with default and extra learnerse 0.0514 0.0160, 0.0867 0.004 −0.0349 −0.0626, −0.00718 0.01
 Overweight and obese
  Crude 0.00583 −0.0244, 0.0361 0.71 −0.00696 −0.0437, 0.0298 0.71
  Adjusted
   Stabilized IPTW with logistic modelsb 0.00520 −0.0261, 0.0365 0.74 −0.00400 −0.0413, 0.0334 0.84
  TMLE
   User-specified logistic modelsc 0.00605 −0.0249, 0.0370 0.70 −0.00288 −0.0394, 0.0336 0.88
   Data-adaptive with default learnersd 0.00563 −0.0252, 0.0365 0.72 −0.00331 −0.0399, 0.0333 0.86
   Data-adaptive with default and extra learnerse 0.00600 −0.0226, 0.0346 0.68 −0.00345 −0.0393, 0.0324 0.85

Abbreviations: CI, confidence interval; IPTW, inverse probability of treatment weighting; LGA, large for gestational age; MET, metabolic equivalent; RD, risk difference; SGA, small for gestational age; TMLE, targeted maximum likelihood estimation.

a ≥7.5 MET-hours per week.

b Logistic regression for the propensity score model included the baseline covariates maternal age, prepregnancy body mass index, marital status, race/ethnicity, educational attainment, parity, daily caloric intake, and vomiting; P values and 95% CIs from Proc Genmod with independence as the identity matrix for robust/sandwich estimator of the variance to provide conservative inference.

c Logistic regression with the same baseline covariates to specify the propensity score and outcome models.

d SuperLearner (31) with default learners only (i.e., Wrapper for Glm, Choose a Model by AIC in a Stepwise Algorithm, and Wrapper Function for SuperLearner Prediction Algorithm) and the same baseline covariates to specify the propensity score and outcome models.

e SuperLearner (31) with defaults plus additional learners (i.e., SL Wrapper for Biglasso, Discrete Bayesian Additive Regression Tree Sampler, ExtraTrees SuperLearner Wrapper, Elastic Net Regression Including Lasso and Ridge, Wrapper for Kernlab’s SVM Algorithm, Wrapper for Lm, SL Wrapper for Ranger, and Wrapper for Speedglm) and the same baseline covariates to specify the propensity score and outcome models.

Table 4 presents risk difference estimates for performing any amount of vigorous intensity exercise versus none. None of the risk difference estimates for SGA attained statistical significance with exercise at this intensity. In underweight and normal-weight women, any vigorous intensity exercise resulted in statistically significant decreases in LGA risk in all analyses. According to the data-adaptive TMLE with extra learners, there would be 5.9 (95% CI: 2.8, 9.0) fewer cases of LGA per 100 births if all of the underweight and normal-weight women had performed some vigorous intensity exercise versus none (Table 4).

Table 4.

Estimates of the Causal Risk Differences for Delivering Small- and Large-for-Gestational-Age Neonates If All Women Had Performed Any Vigorous Intensity Exercise, Versus None, in the Full Cohort and Stratified by Prepregnancy Weight Status, Pregnancy Environment and Lifestyle Study, Kaiser Permanente Northern California, 2013–2017

SGA LGA
Estimation Method RD 95% CI P Value RD 95% CI P Value
Full cohort
 Crude 0.0155 −0.00954, 0.0405 0.23 −0.00866 −0.0349, 0.0176 0.52
 Adjusted
  Stabilized IPTW with logistic modelsa 0.0232 −0.00380, 0.0501 0.09 −0.00950 −0.0361, 0.0172 0.49
 TMLE
  User-specified logistic modelsb 0.0237 −0.00303, 0.0504 0.08 −0.00990 −0.0360, 0.0162 0.46
  Data-adaptive with default learnersc 0.0237 −0.00303, 0.0504 0.08 −0.0100 −0.0362, 0.0162 0.45
  Data-adaptive with default and extra learnersd 0.0227 −0.00223, 0.0476 0.07 −0.00896 −0.0326, 0.0146 0.46
Stratified by prepregnancy weight status
 Underweight and normal weight
  Crude 0.00897 −0.0326, 0.0506 0.67 −0.0453 −0.0804, −0.0101 0.01
  Adjusted
   Stabilized IPTW with logistic modelsa 0.0418 −0.0113, 0.0949 0.12 −0.0600 −0.0924, −0.0275 0.0003
  TMLE
   User-specified logistic modelsb 0.0380 −0.0112, 0.0872 0.13 −0.0609 −0.0928, −0.0289 0.0002
   Data-adaptive with default learnersc 0.0366 −0.0108, 0.0840 0.13 −0.0598 −0.0921, −0.0275 0.0003
   Data-adaptive with default and extra learnersd 0.0294 −0.0107, 0.0695 0.15 −0.0591 −0.0902, −0.0280 0.0002
 Overweight and obese
  Crude 0.0210 −0.00968, 0.0517 0.18 0.0176 −0.0197, 0.0550 0.35
  Adjusted
   Stabilized IPTW with logistic modelsa 0.0228 −0.00990, 0.0554 0.17 0.0240 −0.0151, 0.0630 0.23
  TMLE
   User-specified logistic modelsb 0.0228 −0.00938, 0.0550 0.16 0.0241 −0.0141, 0.0622 0.22
   Data-adaptive with default learnersc 0.0213 −0.0104, 0.0530 0.19 0.0237 −0.0140, 0.0614 0.22
   Data-adaptive with default and extra learnersd 0.0224 −0.00399, 0.0488 0.10 0.0265 −0.00520, 0.0583 0.10

Abbreviations: CI, confidence interval; IPTW, inverse probability of treatment weighting; LGA, large for gestational age; RD, risk difference; SGA, small for gestational age; TMLE, targeted maximum likelihood estimation.

a Logistic regression for the propensity score model included the baseline covariates maternal age, prepregnancy body mass index, marital status, race/ethnicity, educational attainment, parity, daily caloric intake, and vomiting; P values and 95% CIs from Proc Genmod with independence as the identity matrix for robust/sandwich estimator of the variance to provide conservative inference.

b Logistic regression with the same baseline covariates to specify the propensity score and outcome models.

c SuperLearner (31) with default learners only (i.e., Wrapper for Glm, Choose a Model by AIC in a Stepwise Algorithm, and Wrapper Function for SuperLearner Prediction Algorithm) and the same baseline covariates to specify the propensity score and outcome models.

d SuperLearner (31) with defaults plus additional learners (i.e., SL Wrapper for Biglasso, Discrete Bayesian Additive Regression Tree Sampler, ExtraTrees SuperLearner Wrapper, Elastic Net Regression Including Lasso and Ridge, Wrapper for Kernlab’s SVM Algorithm, Wrapper for Lm, SL Wrapper for Ranger, and Wrapper for Speedglm) and the same baseline covariates to specify the propensity score and outcome models.

There were 2 infants identified with SGA plus respiratory distress syndrome, 6 with SGA plus neonatal hypoglycemia, and 89 with SGA plus hyperbilirubinemia, for a total of 97 infants with SGA plus an adverse infant outcome. No estimate for the risk of SGA plus an adverse infant outcome attained statistical significance in the post hoc analyses (Web Table 2).

DISCUSSION

Our results suggest that performing high levels of moderate to vigorous intensity exercise during the first trimester of pregnancy is associated with an increase in the risk of delivering an SGA infant and a decrease in the risk of delivering an LGA infant, essentially shifting the distribution of birthweight for gestational age to the left, particularly among underweight and normal-weight women. In underweight and normal-weight women only, meeting the lower exercise threshold recommended by the Physical Activity Guidelines for Americans also appears to increase the risk of SGA and decrease the risk of LGA. Performing any amount of vigorous intensity exercise during the first trimester appears to be associated with a decrease in the risk of LGA in underweight and normal-weight women, although without a concurrent increase in SGA risk. There was no evidence for an association between exercise and the risks of SGA or LGA among women with overweight or obesity. It is possible that an association with in utero exposure to maternal exercise is more difficult to detect in the infants of women with overweight or obesity, because these infants are likely exposed to excessive fuel substrates due to the maternal overweight or obesity (36). Importantly, in the overall cohort, exercise performed during the first trimester did not appear to increase the risk of SGA plus an adverse neonatal outcome.

Previous prospective cohort studies investigating the relationship between exercise activity during the first trimester and infant size at birth have estimated odds ratios, complicating comparisons with the findings of the present study. Nonetheless, a study of 1,040 predominantly Puerto Rican women in the United States that also used the PPAQ reported no association between moderate to vigorous intensity exercise in early pregnancy and SGA (6). There was also no evidence of effect modification (on the multiplicative scale) by prepregnancy weight status in that study (6). A prospectively followed, population-based study (5) of 826 mother-infant pairs in Colorado used the PPAQ and found that exercise meeting the Physical Activity Guidelines for Americans in early pregnancy was not associated with SGA. However, meeting the Physical Activity Guidelines for Americans was not evaluated with respect to LGA (5). A meta-analysis of randomized trials that included 166,094 births reported no association between exercise interventions and SGA or LGA, both overall and among women with overweight or obesity (4).

An observational study of 76 prospectively followed, lean endurance exercisers found that continuing high levels of exercise into late pregnancy increased the risk of SGA (37). A randomized trial of 75 women who regularly exercised prior to pregnancy similarly found that infants of women assigned to high volumes of exercise in the second half of pregnancy were statistically significantly lighter at birth and had lower fat mass than those assigned to low exercise volumes (38). In the Colorado cohort (5), those in the highest quartile of total PA during late pregnancy delivered infants with statistically significantly less fat mass at birth compared with those in the lowest quartile. To our knowledge, the present study is the first to report an association between exercise performed during the first trimester and infant size at birth, and to observe that meeting the Physical Activity Guidelines for Americans during early pregnancy increases the risk of SGA and decreases the risk of LGA in underweight and normal-weight women. However, a limitation of our study is that we lacked late-pregnancy exercise data, so our inability to assess decreases in exercise over the course of pregnancy might have masked a true relationship with LGA.

Hypothesized mechanisms for exercise’s association with infant size at birth include increased maternal insulin sensitivity and, particularly in late pregnancy, nutrient partitioning to the fetus. Mechanisms for exercise performed early in pregnancy are more likely to involve early gestational processes. It has been suggested that exercise during early pregnancy could induce epigenetic changes that lead to smaller size at birth, but this hypothesis has yet to be investigated in humans (39, 40).

The results of the present study must be evaluated in terms of the assumptions for valid causal inference with TMLE or IPTW estimation. The first assumption is referred to as the positivity or experimental treatment assignment assumption, which posits that all manifestations of the exposure must be possible, conditional on the baseline covariates. As such, women with contraindications to exercise during early pregnancy were excluded from the present study, and for women without contraindications, it is reasonable to assume that any level of exercise in early pregnancy would be possible, as is supported by the distribution of the inverse probability weights (Web Table 1). A second, untestable assumption is that of no unmeasured confounding. The present study had high-quality data on several key confounding factors, including maternal prepregnancy weight, diet and vomiting in early pregnancy, and a proxy for socioeconomic status (e.g., education), but as in any observational study, the possibility of residual confounding remains. In addition, valid IPTW inference relies on consistent estimation of the propensity score (i.e., the logistic model utilized in this study being correctly specified). Valid TMLE inference relies on consistent estimation of either the propensity score or the outcome regression. The present study used machine learning (i.e., SuperLearner (31)) in the TMLE analyses to mitigate concerns over violation of this last assumption. The machine learning additionally served to improve statistical efficiency over the IPTW analyses and the TMLE analyses with user-specified parametric models. Additional strengths of the present study include the prospective design, the size and racial/ethnic diversity of the cohort, and the availability of measured prepregnancy weight for most participants.
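The role of the propensity score and the positivity assumption in IPTW estimation can be sketched with a small worked example. The code below is an illustration only and is not the analysis performed in this study: it uses a hypothetical toy cohort with a single binary confounder, estimates the propensity score by the observed proportion exposed within each stratum (rather than by a logistic model or SuperLearner), and does not implement TMLE. All function names and data are assumptions introduced for illustration.

```python
from collections import defaultdict


def build_toy_cohort():
    """Hypothetical cohort as (stratum, exposed, outcome) records."""
    # Each cell: (confounder stratum, exposed, number of women, number of events)
    cells = [(0, 1, 20, 4), (0, 0, 80, 8), (1, 1, 60, 24), (1, 0, 40, 12)]
    records = []
    for w, a, n, events in cells:
        records += [(w, a, 1)] * events + [(w, a, 0)] * (n - events)
    return records


def propensity_by_stratum(records):
    """Estimate P(exposed | stratum) as the observed proportion exposed."""
    n_total = defaultdict(int)
    n_exposed = defaultdict(int)
    for w, a, _ in records:
        n_total[w] += 1
        n_exposed[w] += a
    return {w: n_exposed[w] / n_total[w] for w in n_total}


def crude_risk_difference(records):
    """Unadjusted risk difference (exposed minus unexposed)."""
    y1 = [y for _, a, y in records if a == 1]
    y0 = [y for _, a, y in records if a == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


def iptw_risk_difference(records):
    """Normalized (Hajek) IPTW risk difference and the weights used."""
    ps = propensity_by_stratum(records)
    # Practical positivity check: both exposure levels must occur in every stratum.
    for w, p in ps.items():
        if p <= 0.0 or p >= 1.0:
            raise ValueError(f"positivity violated in stratum {w}")
    weights = [1.0 / ps[w] if a == 1 else 1.0 / (1.0 - ps[w]) for w, a, _ in records]
    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for wt, (_, a, y) in zip(weights, records):
        num[a] += wt * y
        den[a] += wt
    return num[1] / den[1] - num[0] / den[0], weights


cohort = build_toy_cohort()
rd_iptw, iptw_weights = iptw_risk_difference(cohort)
print(f"crude RD per 100: {100 * crude_risk_difference(cohort):.1f}")
print(f"IPTW RD per 100: {100 * rd_iptw:.1f}")
print(f"largest weight: {max(iptw_weights):.2f}")
```

In this toy cohort the exposure is more common in the higher-risk stratum, so the crude risk difference is larger than the weighted one, which recovers the stratum-specific difference. Inspecting the largest weight mirrors the use of the weight distribution as a practical check on positivity: very large weights would indicate strata in which one exposure level is rarely observed.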

In conclusion, a high volume of moderate to vigorous intensity exercise performed during the first trimester of pregnancy was associated with an increased risk of delivering an infant who was SGA and a decreased risk of delivering an infant who was LGA, particularly among underweight and normal-weight women. Increases in SGA risk and decreases in LGA risk were also observed for exercise meeting the Physical Activity Guidelines for Americans, but only among underweight and normal-weight women, suggesting that additional research on exercise during the first trimester is warranted. Performing any vigorous intensity exercise during the first trimester was associated with a reduced risk of LGA in underweight and normal-weight women but was not associated with SGA risk. In women with overweight or obesity, there was no evidence of an association of exercise during the first trimester with the risks of SGA and LGA. Fortunately, the increases in SGA risk observed in the cohort overall did not translate into increases in the risk of SGA plus adverse neonatal outcomes associated with disrupted maternal metabolism and small infant size at birth.

Supplementary Material

AJE-00243-2019_Ehrlich_Web_Material_kwz213

ACKNOWLEDGMENTS

Author affiliations: Division of Research, Kaiser Permanente Northern California, Oakland, California (Samantha F. Ehrlich, Romain S. Neugebauer, Juanran Feng, Monique M. Hedderson, Assiamira Ferrara); and Department of Public Health, the University of Tennessee Knoxville, Knoxville, Tennessee (Samantha F. Ehrlich).

This work was supported by the National Institute of Diabetes and Digestive and Kidney Diseases (grants K01 DK105106 to S.F.E. and P30 DK092924 to A.F.) and the National Institute of Environmental Health Sciences (grant R01ES019196 to A.F.).

A poster describing this work was presented at the 2018 Annual Meeting of the Society for Epidemiological Research, June 20–22, 2018, Baltimore, Maryland.

Conflict of interest: none declared.

REFERENCES

  • 1. American College of Obstetricians and Gynecologists. ACOG Committee opinion no. 650: physical activity and exercise during pregnancy and the postpartum period. Obstet Gynecol. 2015;126(6):e135–e142.
  • 2. Piercy K, Troiano R, Ballard R, et al. The physical activity guidelines for Americans. JAMA. 2018;320(19):2020–2028.
  • 3. da Silva SG, Ricardo LI, Evenson KR, et al. Leisure-time physical activity in pregnancy and maternal-child health: a systematic review and meta-analysis of randomized controlled trials and cohort studies. Sports Med. 2017;47(2):295–317.
  • 4. Davenport MH, Meah VL, Ruchat SM, et al. Impact of prenatal exercise on neonatal and childhood outcomes: a systematic review and meta-analysis. Br J Sports Med. 2018;52(21):1386–1396.
  • 5. Harrod CS, Chasan-Taber L, Reynolds RM, et al. Physical activity in pregnancy and neonatal body composition: the Healthy Start study. Obstet Gynecol. 2014;124(2 Pt 1):257–264.
  • 6. Gollenberg AL, Pekow P, Bertone-Johnson ER, et al. Physical activity and risk of small-for-gestational-age birth among predominantly Puerto Rican women. Matern Child Health J. 2011;15(1):49–59.
  • 7. Gruber S, Van der Laan M. TMLE: targeted maximum likelihood estimation. https://CRAN.R-project.org/package=tmle. Accessed October 15, 2018.
  • 8. Gruber S, Van der Laan M. TMLE: an R package for targeted maximum likelihood estimation. J Stat Softw. 2012;51(13):1–35.
  • 9. van der Laan MJ, Rose S. Targeted Learning: Causal Inference for Observational and Experimental Data. New York, NY: Springer Science+Business Media; 2011.
  • 10. Hernán MA, Robins JM. Estimating causal effects from epidemiological data. J Epidemiol Community Health. 2006;60(7):578–586.
  • 11. Finer LB, Zolna MR. Declines in unintended pregnancy in the United States, 2008–2011. N Engl J Med. 2016;374(9):843–852.
  • 12. Gordon N, Lin T. The Kaiser Permanente Northern California adult member health survey. Perm J. 2016;20(4):34–42.
  • 13. Zhu Y, Hedderson MM, Feng J, et al. The Pregnancy Environment and Lifestyle Study (PETALS): a population-based longitudinal multi-racial birth cohort. BMC Pregnancy Childbirth. 2017;17(1):122.
  • 14. Chasan-Taber L, Schmidt MD, Roberts DE, et al. Development and validation of a pregnancy physical activity questionnaire. Med Sci Sports Exerc. 2004;36(10):1750–1760.
  • 15. Ehrlich SF, Sternfeld B, Krefman AE, et al. Moderate and vigorous intensity exercise during pregnancy and gestational weight gain in women with gestational diabetes. Matern Child Health J. 2016;20(6):1247–1257.
  • 16. Ehrlich SF, Hedderson MM, Brown SD, et al. Moderate intensity sports and exercise is associated with glycaemic control in women with gestational diabetes. Diabetes Metab. 2017;43(5):416–423.
  • 17. Evenson KR, Chasan-Taber L, Symons Downs D, et al. Review of self-reported physical activity assessments for pregnancy: summary of the evidence for validity and reliability. Paediatr Perinat Epidemiol. 2012;26(5):479–494.
  • 18. Chasan-Taber L, Freedson PS, Roberts DE, et al. Energy expenditure of selected household activities during pregnancy. Res Q Exerc Sport. 2007;78(1):133–137.
  • 19. Ainsworth BE, Haskell WL, Whitt MC, et al. Compendium of physical activities: an update of activity codes and MET intensities. Med Sci Sports Exerc. 2000;32(9 Suppl):S498–S504.
  • 20. Block G, Hartman AM, Dresser CM, et al. A data-based approach to diet questionnaire design and testing. Am J Epidemiol. 1986;124(3):453–469.
  • 21. Harley K, Eskenazi B, Block G. The association of time in the US and diet during pregnancy in low-income women of Mexican descent. Paediatr Perinat Epidemiol. 2005;19(2):125–134.
  • 22. Lacasse A, Rey E, Ferreira E, et al. Validity of a modified Pregnancy-Unique Quantification of Emesis and Nausea (PUQE) scoring index to assess severity of nausea and vomiting of pregnancy. Am J Obstet Gynecol. 2008;198(1):71.e1–71.e7.
  • 23. World Health Organization. Obesity: preventing and managing the global epidemic. Report of a WHO Consultation. WHO Technical Report Series. Geneva, Switzerland: World Health Organization; 2000.
  • 24. Shiroma EJ, Sesso HD, Moorthy MV, et al. Do moderate-intensity and vigorous-intensity physical activities reduce mortality rates to the same extent? J Am Heart Assoc. 2014;3(5):e000802.
  • 25. Aris IM, Kleinman KP, Belfort MB, et al. A 2017 United States Reference for Singleton Birth Weight Percentiles Using Obstetric Estimates of Gestational Age. 2019. https://izzuddin-aris.shinyapps.io/BW-for-GA_z-score_webapp/.
  • 26. Mortimer KM, Neugebauer R, van der Laan M, et al. An application of model-fitting procedures for marginal structural models. Am J Epidemiol. 2005;162(4):382–388.
  • 27. Stuart EA. Matching methods for causal inference: a review and a look forward. Stat Sci. 2010;25(1):1–21.
  • 28. Blake HA, Leyrat C, Mansfield KE, et al. Propensity scores using missingness pattern information: a practical guide. arXiv. 2019. arXiv:1901.03981v1.
  • 29. Cole SR, Hernán MA. Constructing inverse probability weights for marginal structural models. Am J Epidemiol. 2008;168(6):656–664.
  • 30. Neugebauer R, Schmittdiel JA, van der Laan MJ. Targeted learning in real-world comparative effectiveness research with time-varying interventions. Stat Med. 2014;33(14):2480–2520.
  • 31. van der Laan MJ, Polley EC, Hubbard AE. Super learner. Stat Appl Genet Mol Biol. 2007;6: article 25.
  • 32. Neugebauer R, Schmittdiel JA, van der Laan MJ. A case study of the impact of data-adaptive versus model-based estimation of the propensity scores on causal inferences from three inverse probability weighting estimators. Int J Biostat. 2016;12(1):131–155.
  • 33. Lee BK, Lessler J, Stuart EA. Improving propensity score weighting using machine learning. Stat Med. 2010;29(3):337–346.
  • 34. Neugebauer R, Fireman B, Roy JA, et al. Super learning to hedge against incorrect inference from arbitrary parametric assumptions in marginal structural modeling. J Clin Epidemiol. 2013;66(8 Suppl):S99–S109.
  • 35. van der Laan MJ. Targeted estimation of nuisance parameters to obtain valid statistical inference. Int J Biostat. 2014;10(1):29–57.
  • 36. King JC. Maternal obesity, metabolism, and pregnancy outcomes. Annu Rev Nutr. 2006;26:271–291.
  • 37. Clapp JF 3rd, Dickstein S. Endurance exercise and pregnancy outcome. Med Sci Sports Exerc. 1984;16(6):556–562.
  • 38. Clapp JF 3rd, Kim H, Burciu B, et al. Continuing regular exercise during pregnancy: effect of exercise volume on fetoplacental growth. Am J Obstet Gynecol. 2002;186(1):142–147.
  • 39. Donovan EL, Miller BF. Exercise during pregnancy: developmental origins of disease prevention? Exerc Sport Sci Rev. 2011;39(3):111.
  • 40. Chalk TE, Brown WM. Exercise epigenetics and the fetal origins of disease. Epigenomics. 2014;6(5):469–472.


Articles from American Journal of Epidemiology are provided here courtesy of Oxford University Press
