Risk of selection bias due to non‐participation in a cohort study on pubertal timing

Nis Brix; Andreas Ernst; Lea Lykke Braskhøj Lauridsen; Erik Thorlund Parner; Onyebuchi A Arah; Jørn Olsen; Tine Brink Henriksen; Cecilia Høst Ramlau‐Hansen

2020 Apr 21;34(6):668–677. doi: 10.1111/ppe.12679

PMCID: PMC7754153 PMID: 32319135

Abstract

Background

Non‐participation in aetiologic studies of pubertal timing is frequent. However, little effort has been given to explore the risk and potential impact of selection bias in studies of pubertal timing.

Objective

We aimed to explore the risk of selection bias due to non‐participation in a newly established puberty cohort.

Methods

We evaluated whether three maternal exposures chosen a priori (pre‐pregnancy obesity, smoking, and alcohol drinking during pregnancy) were associated with participation, whether pubertal timing was associated with participation, and whether selection bias influenced the associations between these exposures and pubertal timing. In total, 22 439 children from the Danish National Birth Cohort born 2000–2003 were invited to the Puberty Cohort and 15 819 (70%) participated. Exposures were self‐reported during pregnancy. Pubertal timing was measured using a previously validated marker, “the height difference in standard deviations” (HD:SDS), which is the difference between pubertal height and adult height, both in standard deviations. For this study, pubertal height at around 13 years in sons and around 11 years in daughters was obtained from an external database, and adult height was predicted based on parental height reported by mothers.

Results

Participation was associated with most exposures but not with pubertal timing, measured by HD:SDS. The associations between exposures and HD:SDS were comparable for participants only and all invited for participation.

Conclusion

In conclusion, the risk of selection bias in aetiologic studies on pubertal timing in the Puberty Cohort appears minimal.

Keywords: cohort studies, lost to follow‐up, menarche, puberty, selection bias, sexual maturation

Synopsis

Study question

Does non‐participation in aetiologic cohort studies on pubertal timing cause selection bias?

What's already known

Studies on pubertal timing are prone to non‐participation because sensitive information needs to be collected. However, little effort has been given to explore the risk and potential impact of selection bias in studies of pubertal timing.

What this study adds

The findings from this study suggest that the risk of selection bias in aetiologic cohort studies on self‐reported pubertal timing is most likely minimal.

1. BACKGROUND

In aetiologic studies on pubertal timing, we have to collect sensitive data on puberty. However, not everyone is willing to share this information, which may result in non‐participation. Early puberty has been linked to increased risk of internalising and externalising symptoms in both boys and girls as well as increased risk of depressive disorders and eating disorders in girls, whereas late puberty has been linked to subclinical internalising and externalising symptoms, although less consistently. ¹ Therefore, both earlier and later timing of puberty might be related to lower participation rates. If the exposures studied are also associated with participation, this may result in selection bias. ² Despite this apparent threat, little effort has been given to explore the risk and potential impact of selection bias due to non‐participation in aetiologic studies of pubertal timing.

In aetiologic cohort studies of pubertal timing, it may be possible to estimate the association between exposures collected at baseline and participation in data collection on pubertal development. As information on pubertal timing is only available for participants, it is impossible to directly estimate the association between pubertal timing and participation, unless another marker of pubertal timing is obtained from other external resources for the entire cohort including both participants and non‐participants. The “height difference in standard deviations” (HD:SDS) may be such a marker of pubertal timing, ³ because it can readily be obtained for the entire cohort if external registry information on height is available. HD:SDS is the difference between pubertal height, measured around the mean age at peak height velocity, in standard deviation scores (SDS) and adult height in SDS. ³ A child with earlier puberty will, on average, have an earlier age at peak height velocity, leading to a higher pubertal height, and consequently a higher HD:SDS than a child with later puberty. HD:SDS correlates well with the pubertal markers, age at peak height velocity and onset of the growth spurt. ³ Using HD:SDS as a measure of pubertal timing on both participants and non‐participants, it is possible to assess how participation is associated with the exposure under study as well as with pubertal timing. ² This may also provide bias parameters for bias analysis to adjust for selection bias. ⁴ , ⁵

We aimed to explore the potential impact of selection bias due to non‐participation in a newly established puberty cohort. First, we assessed whether maternal lifestyle factors during pregnancy and other baseline characteristics were associated with participation. Second, we assessed whether pubertal timing, measured by HD:SDS, was associated with participation. Third, we investigated the impact of potential selection bias on the associations between three maternal lifestyle exposures (pre‐pregnancy body mass index [BMI], smoking, and alcohol drinking during pregnancy) chosen a priori and pubertal timing, measured by HD:SDS. This was done by comparing the associations estimated in the entire Puberty Cohort (participants and non‐participants) with the association estimated among participants only. The directed acyclic graph in Figure S1 shows the potential mechanisms for selection bias in the present study.

2. METHODS

2.1. Cohort selection

This validation study is based on the Puberty Cohort, nested within the Danish National Birth Cohort (DNBC), and the Children’s Database. ⁶ The DNBC has been described in detail elsewhere. ⁷ Briefly, about 50% of the general practitioners in Denmark agreed to recruit pregnant women in early gestation from 1996 through 2002. In total, 101 042 pregnancies were included, corresponding to around 60% of invited pregnant women or around 30% of all pregnancies in Denmark during the study period. Women provided information on their pregnancy and their children twice during pregnancy and 6 months, 18 months, 7 years, and 11 years post‐partum.

Eligible children for the Puberty Cohort were liveborn singletons born from 2000 to 2003, whose mothers participated in the first interview during pregnancy and had not withdrawn from the DNBC by May 2012 (n = 56 641). To obtain large exposure contrasts, the Puberty Cohort was sampled independently according to 12 different prenatal exposures hypothesised to be prenatal causes of pubertal timing such as pre‐pregnancy obesity, maternal smoking during pregnancy, and alcohol drinking during pregnancy. ⁸ If a child was sampled more than once (ie from different exposure categories), the child was only included in the analyses once. Based on rules from probability theory and the sampling fractions, we were able to calculate exact sampling probabilities. The inverses of these sampling probabilities were used as sampling weights to reweight the data. The mathematical derivation of the sampling weights has been described in detail previously. ⁸ In total, 22 439 of 56 641 children were sampled to constitute the Puberty Cohort. These children were invited to provide information on their current pubertal development as part of the 11‐year follow‐up in the DNBC and as part of half‐yearly questionnaires on puberty from 11.5 years of age until full maturity (defined as Tanner stage 5 for genital and pubic hair development for sons and Tanner stage 5 for breast and pubic hair development for daughters) or 18 years of age. The puberty information was provided through web‐based questionnaires on pubertal markers such as Tanner stages (pubic hair and genital or breast development), ⁹ , ¹⁰ voice break, first ejaculation, and menarche. In total, 15 819 children (7696 sons and 8123 daughters) participated in the Puberty Cohort by returning at least one questionnaire on puberty (participation rate 70%). We will refer to the participating children as participants, and we will refer to the remaining 6620 children of the Puberty Cohort as non‐participants (Figure 1).
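The reweighting idea can be illustrated with a minimal Python sketch. It assumes that the subsamples were drawn independently, so that a child's probability of being sampled at least once is one minus the product of the probabilities of not being sampled in each subsample for which the child was eligible. The function names and the sampling fractions are hypothetical and chosen only for illustration; the actual derivation is the one given in reference 8.

```python
from math import prod


def sampling_probability(fractions):
    """Probability of being drawn at least once across independent subsamples.

    `fractions` holds the (hypothetical) sampling fraction of each subsample
    for which the child was eligible.
    """
    return 1.0 - prod(1.0 - f for f in fractions)


def sampling_weight(fractions):
    """Inverse of the sampling probability, used to reweight the data."""
    p = sampling_probability(fractions)
    if p <= 0.0:
        raise ValueError("child had no chance of being sampled")
    return 1.0 / p


# Hypothetical child eligible for two subsamples with fractions 0.5 and 0.2:
# probability 1 - 0.5 * 0.8 = 0.6, so the weight is 1 / 0.6.
example_weight = sampling_weight([0.5, 0.2])
```

A child sampled with certainty gets a weight of 1, whereas children from sparsely sampled categories count for more in the weighted analyses.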

Figure 1.

Flow of participants and non‐participants, the Puberty Cohort, Denmark, 2012‐2018. Abbreviations: HD:SDS, height difference in standard deviations; PHV, peak height velocity; SDS, standard deviation score

The Children’s Database systematically collects health‐related data on children in Denmark from school nurses and general practitioners. It was initiated in April 2009, and reporting has been mandatory since December 2011. ⁶ Data were extracted from this database for all children in the Puberty Cohort in July 2017. In total, 20 408 (91%) of 22 439 children from the Puberty Cohort had at least one height measurement in the Children’s Database (Figure 1).

2.2. Exposures and other baseline characteristics

Exposures of interest in this study were maternal pre‐pregnancy BMI, smoking during pregnancy, and alcohol drinking during pregnancy. These lifestyle exposures reflect social and behavioural patterns that might affect the children’s participation in the Puberty Cohort. Information on these exposures was collected during the first interview in the DNBC around gestational week 17. Other baseline characteristics included maternal age at menarche, also obtained from the first interview in the DNBC; maternal age at delivery and parity, both obtained from the Danish Medical Birth Registry; and highest social class of parents, based on the International Standard Class of Occupation and Education codes (ISCO‐88 and ISCED), obtained from Statistics Denmark. All baseline characteristics including the exposures were categorised as shown in Tables 1 and 2.

Table 1.

Participation rate according to baseline characteristics in 11 445 sons, the Puberty Cohort, Denmark, 2012‐2018

Baseline characteristics	Non‐participants (n = 3749)	Participants (n = 7696)	Unadjusted participation rate ratio	Adjusted participation rate ratio ^a
Baseline characteristics	N (%)	N (%)	uPRR (95% CI)	aPRR (95% CI)
Pre‐pregnancy BMI (kg/m²)
<18.5	255 (33.2)	513 (66.8)	0.95 (0.90, 1.00)	1.00 (0.94, 1.05)
18.5‐24.9	2076 (30.4)	4750 (69.6)	1.00 (Reference)	1.00 (Reference)
25‐29.9	830 (34.5)	1574 (65.5)	0.94 (0.91, 0.97)	0.97 (0.93, 1.00)
30+	519 (40.6)	758 (59.4)	0.84 (0.80, 0.88)	0.88 (0.83, 0.92)
Smoking in pregnancy
Non‐smoker	2306 (29.3)	5556 (70.7)	1.00 (Reference)	1.00 (Reference)
Stopped smoking	359 (32.9)	731 (67.1)	0.94 (0.90, 0.99)	0.96 (0.91, 1.01)
1‐9 daily cigarettes	495 (40.1)	738 (59.9)	0.83 (0.79, 0.88)	0.87 (0.82, 0.91)
10‐14 daily cigarettes	305 (44.5)	380 (55.5)	0.78 (0.72, 0.84)	0.83 (0.77, 0.89)
15+ daily cigarettes	278 (49.2)	287 (50.8)	0.71 (0.65, 0.77)	0.79 (0.73, 0.86)
Alcohol in pregnancy
Abstainers	2164 (35.4)	3947 (64.6)	1.00 (Reference)	1.00 (Reference)
<1 unit weekly	1022 (29.8)	2402 (70.2)	1.06 (1.03, 1.10)	1.03 (1.00, 1.06)
1‐3 units weekly	381 (29.1)	927 (70.9)	1.06 (1.02, 1.11)	1.04 (1.00, 1.09)
>3 units weekly	180 (30.2)	416 (69.8)	1.03 (0.97, 1.10)	1.03 (0.97, 1.10)
Highest social class of parents
High‐grade professional	649 (26.1)	1834 (73.9)	1.00 (Reference)	1.00 (Reference)
Low‐grade professional	950 (27.5)	2505 (72.5)	0.97 (0.94, 1.00)	0.98 (0.95, 1.02)
Skilled worker	1215 (36.5)	2117 (63.5)	0.86 (0.83, 0.90)	0.90 (0.87, 0.94)
Unskilled worker	768 (42.7)	1032 (57.3)	0.79 (0.75, 0.83)	0.86 (0.81, 0.91)
Student	95 (39.3)	147 (60.7)	0.82 (0.72, 0.92)	0.84 (0.74, 0.96)
Economically inactive	57 (58.8)	40 (41.2)	0.50 (0.37, 0.68)	0.56 (0.41, 0.77)
Maternal age at delivery (y)
<20	51 (58.0)	37 (42.0)	0.67 (0.51, 0.88)	0.75 (0.57, 1.00)
20‐29.9	1911 (34.9)	3564 (65.1)	1.00 (Reference)	1.00 (Reference)
30‐39.9	1711 (30.2)	3947 (69.8)	1.06 (1.03, 1.09)	1.03 (1.00, 1.07)
40+	70 (32.7)	144 (67.3)	0.99 (0.88, 1.11)	0.99 (0.88, 1.12)
Parity
First child	1729 (30.6)	3918 (69.4)	1.00 (Reference)	1.00 (Reference)
Second child or more	2020 (34.8)	3778 (65.2)	0.95 (0.92, 0.97)	0.95 (0.92, 0.98)
Maternal age at menarche
Earlier than peers	1016 (34.5)	1927 (65.5)	0.99 (0.96, 1.02)	1.00 (0.97, 1.04)
Same time as peers	2141 (32.8)	4390 (67.2)	1.00 (Reference)	1.00 (Reference)
Later than peers	565 (30.0)	1316 (70.0)	1.03 (0.99, 1.07)	1.01 (0.98, 1.05)

Abbreviations: aPRR, adjusted participation rate ratio; BMI, body mass index; CI, confidence interval; uPRR, unadjusted participation rate ratio.

^a Adjusted for all other variables in this table.

Table 2.

Participation rate according to baseline characteristics in 10 991 daughters, the Puberty Cohort, Denmark, 2012‐2018

Baseline characteristics	Non‐participants (n = 2868)	Participants (n = 8123)	Unadjusted participation rate ratio	Adjusted participation rate ratio ^a
Baseline characteristics	N (%)	N (%)	uPRR (95% CI)	aPRR (95% CI)
Pre‐pregnancy BMI (kg/m²)
<18.5	220 (28.8)	543 (71.2)	0.93 (0.88, 0.97)	0.97 (0.92, 1.01)
18.5‐24.9	1564 (24.2)	4906 (75.8)	1.00 (Reference)	1.00 (Reference)
25‐29.9	620 (26.4)	1731 (73.6)	0.96 (0.93, 0.99)	0.97 (0.94, 1.00)
30+	415 (33.4)	827 (66.6)	0.86 (0.82, 0.90)	0.88 (0.84, 0.92)
Smoking in pregnancy
Non‐smoker	1738 (23.2)	5766 (76.8)	1.00 (Reference)	1.00 (Reference)
Stopped smoking	255 (24.5)	785 (75.5)	0.98 (0.93, 1.02)	0.99 (0.94, 1.03)
1‐9 daily cigarettes	363 (30.8)	816 (69.2)	0.92 (0.88, 0.96)	0.93 (0.89, 0.97)
10‐14 daily cigarettes	253 (37.3)	426 (62.7)	0.80 (0.75, 0.86)	0.83 (0.77, 0.89)
15+ daily cigarettes	255 (44.0)	324 (56.0)	0.72 (0.66, 0.77)	0.75 (0.69, 0.81)
Alcohol in pregnancy
Abstainers	1663 (28.0)	4273 (72.0)	1.00 (Reference)	1.00 (Reference)
<1 unit weekly	795 (24.2)	2494 (75.8)	1.04 (1.01, 1.07)	1.00 (0.98, 1.03)
1‐3 units weekly	283 (22.6)	969 (77.4)	1.09 (1.05, 1.12)	1.04 (1.01, 1.08)
>3 units weekly	125 (24.5)	386 (75.5)	1.01 (0.95, 1.07)	1.00 (0.94, 1.07)
Highest social class of parents
High‐grade professional	480 (20.6)	1854 (79.4)	1.00 (Reference)	1.00 (Reference)
Low‐grade professional	732 (21.4)	2690 (78.6)	1.00 (0.97, 1.03)	1.01 (0.98, 1.05)
Skilled worker	932 (29.4)	2236 (70.6)	0.91 (0.87, 0.94)	0.94 (0.91, 0.98)
Unskilled worker	611 (35.4)	1117 (64.6)	0.83 (0.79, 0.87)	0.90 (0.86, 0.94)
Student	64 (27.9)	165 (72.1)	0.96 (0.89, 1.04)	1.02 (0.94, 1.10)
Economically inactive	40 (44.0)	51 (56.0)	0.80 (0.66, 0.96)	0.87 (0.72, 1.05)
Maternal age at delivery (y)
<20	49 (53.8)	42 (46.2)	0.62 (0.47, 0.82)	0.68 (0.52, 0.90)
20‐29.9	1451 (27.8)	3774 (72.2)	1.00 (Reference)	1.00 (Reference)
30‐39.9	1320 (24.1)	4150 (75.9)	1.05 (1.03, 1.08)	1.04 (1.01, 1.07)
40+	41 (20.9)	155 (79.1)	1.13 (1.05, 1.21)	1.12 (1.04, 1.20)
Parity
First child	1335 (24.8)	4047 (75.2)	1.00 (Reference)	1.00 (Reference)
Second child or more	1533 (27.3)	4076 (72.7)	0.98 (0.96, 1.01)	0.98 (0.95, 1.01)
Maternal age at menarche
Earlier than peers	789 (27.5)	2084 (72.5)	0.99 (0.96, 1.02)	1.01 (0.98, 1.04)
Same time as peers	1583 (25.6)	4598 (74.4)	1.00 (Reference)	1.00 (Reference)
Later than peers	470 (25.4)	1381 (74.6)	1.00 (0.97, 1.04)	0.98 (0.95, 1.02)

Abbreviations: aPRR, adjusted participation rate ratio; BMI, body mass index; CI, confidence interval; uPRR, unadjusted participation rate ratio.

^a Adjusted for all other variables in this table.

2.3. Outcome: HD:SDS

Height in SDS describes how far, in SD scores, a person’s height is from the height expected for that person’s age and sex. HD:SDS is calculated as pubertal height in SDS minus adult height in SDS. ³ The rationale is that the pubertal height reflects both pubertal timing and the genetic growth potential, whereas adult height reflects only the genetic growth potential. Adult height SDS is, therefore, subtracted from pubertal height SDS in an attempt to exclude the genetic contribution to pubertal height, and the resulting measure, HD:SDS, reflects pubertal timing. ³ A higher HD:SDS is indicative of earlier pubertal timing and vice versa. Pubertal height should be measured around the mean age at peak height velocity at the population level. ³ Therefore, we chose the height measure from the Children’s Database closest to 13 years in sons and 11 years in daughters (+/−2 years) (n = 17 276) based on recent Danish normal material. ¹¹ These height measures were converted to pubertal height SDS. ¹¹ We excluded pubertal height SDS >4 or <−4 (n = 12), resulting in 17 264 children with information on pubertal height SDS.

Adult height was not available because the children were not fully grown at the time of data collection. Thus, we predicted the children’s adult height from their parents’ height using a prediction model derived from Swedish data ¹² and then converted adult height to SDS. ¹¹ We excluded children with no measure of maternal or paternal height (n = 498). We were able to create HD:SDS for 16 766 children (75%) (Figure 1). This modified HD:SDS relied on predicted adult height SDS and has previously been validated against the age at attaining the pubertal milestones collected in the Puberty Cohort. ¹³
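The construction of HD:SDS can be written out as a short Python sketch. The plain z‐score used for the SDS conversion, the function names, and the reference values in the example are assumptions made for illustration only; the growth reference ¹¹ and the adult height prediction model ¹² used in the study are not reproduced here.

```python
def height_sds(height_cm, ref_mean_cm, ref_sd_cm):
    """Distance from the age- and sex-specific reference mean, in SD units.

    A plain z-score is assumed; real growth references may use other methods.
    """
    return (height_cm - ref_mean_cm) / ref_sd_cm


def is_plausible_sds(sds, limit=4.0):
    """Mirror the exclusion of pubertal height SDS above 4 or below -4."""
    return -limit <= sds <= limit


def hd_sds(pubertal_height_sds, adult_height_sds):
    """HD:SDS: pubertal height SDS minus (predicted) adult height SDS."""
    return pubertal_height_sds - adult_height_sds


# Hypothetical boy measured at about 13 years: 162 cm against an invented
# reference mean of 158 cm and SD of 8 cm, with a predicted adult SDS of 0.1.
pubertal = height_sds(162.0, 158.0, 8.0)
example_hd_sds = hd_sds(pubertal, 0.1) if is_plausible_sds(pubertal) else None
```

In this invented example the boy is taller for his age than his predicted adult height would suggest, giving a positive HD:SDS, which points towards earlier pubertal timing.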

2.4. Statistical methods

To assess whether baseline characteristics were associated with participation in the Puberty Cohort, we estimated unadjusted and baseline characteristics‐adjusted participation rate ratios with 95% confidence intervals (CIs) using log‐binomial regression.
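To make the participation rate ratio concrete, the following Python sketch computes a crude, unweighted ratio with a Wald‐type 95% CI on the log scale from the counts in Table 1. This is a simplification and not the analysis used in the study: it ignores the sampling weights, the robust SEs, and the adjustment, so it is not expected to reproduce the tabulated uPRR exactly. The function name is hypothetical.

```python
from math import exp, log, sqrt


def crude_rate_ratio(part_exposed, n_exposed, part_ref, n_ref, z=1.96):
    """Crude participation rate ratio with a Wald CI on the log scale."""
    ratio = (part_exposed / n_exposed) / (part_ref / n_ref)
    # Standard error of the log ratio of two independent proportions.
    se_log = sqrt(1 / part_exposed - 1 / n_exposed + 1 / part_ref - 1 / n_ref)
    lower = exp(log(ratio) - z * se_log)
    upper = exp(log(ratio) + z * se_log)
    return ratio, lower, upper


# Sons, pre-pregnancy BMI 30+ versus 18.5-24.9 (counts taken from Table 1).
ratio, lower, upper = crude_rate_ratio(758, 758 + 519, 4750, 4750 + 2076)
```

The crude ratio comes out at about 0.85, close to but not identical to the weighted uPRR of 0.84 reported in Table 1.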

To assess the association between pubertal timing and participation in the Puberty Cohort (yes/no), we estimated baseline characteristics‐adjusted participation rate as a function of HD:SDS using a restricted cubic spline with 7 knots at HD:SDS of −3, −2, −1, 0, 1, 2, and 3 using logistic regression. Further, we estimated the mean adjusted differences in HD:SDS between the participants and non‐participants using linear regression.
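The spline terms entering such a model can be sketched in Python as follows. The sketch uses the unscaled truncated power form of a restricted cubic spline, which is an assumption; statistical software may scale the terms differently, and the source does not state which parameterisation was used. The function name is hypothetical.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for a single value x.

    Returns [x, s_1(x), ..., s_(k-2)(x)] for k knots. The nonlinear terms are
    constructed so that the fitted curve is linear beyond the outer knots.
    """
    t = sorted(knots)
    if len(t) < 3:
        raise ValueError("need at least three knots")

    def cube_plus(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    last, penultimate = t[-1], t[-2]
    span = last - penultimate
    terms = [float(x)]
    for tj in t[:-2]:
        terms.append(
            cube_plus(x - tj)
            - cube_plus(x - penultimate) * (last - tj) / span
            + cube_plus(x - last) * (penultimate - tj) / span
        )
    return terms


# Knots at HD:SDS of -3, -2, -1, 0, 1, 2, and 3, as described above.
KNOTS = [-3, -2, -1, 0, 1, 2, 3]
row = rcs_basis(0.5, KNOTS)
```

With 7 knots, each child contributes one linear term and five nonlinear terms, which would then enter the logistic regression alongside the baseline characteristics.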

Then, we assessed the impact of potential selection bias on associations between the three exposures of interest and pubertal timing by comparing the estimates for the entire Puberty Cohort with the estimates for the participants only. Linear regression was used to estimate the difference in HD:SDS as a function of the exposures. The exposures were first included as indicator variables and then as linear terms (pre‐pregnancy BMI in kg/m², maternal smoking in first trimester as a grouped ordered variable (non‐smoker, stopped smoking, 1–9 daily cigarettes, 10–14 daily cigarettes, 15+ daily cigarettes), and alcohol drinking in first trimester in weekly units). All analyses were adjusted for maternal age at menarche, highest social status of parents, parity, maternal age at delivery, and the other exposures. These potential confounding factors were chosen based on the directed acyclic graph shown in Figure S2. Then, we obtained the difference in the estimates for the participants and for the Puberty Cohort as a way to quantify the potential selection bias. The 95% CIs were computed using a stratified non‐parametric bootstrap approach: (a) we drew a bootstrap sample from participants (n = 15 819) and another bootstrap sample from non‐participants (n = 6620); (b) we ran the aetiologic analyses for the entire Puberty Cohort using both bootstrap samples combined and for participants only using the participants’ bootstrap sample alone, in both cases using the fixed sampling weights derived previously; ⁸ and (c) we saved the difference between the estimates for the two populations. We repeated steps (a) to (c) in 10 000 replications to obtain the 95% CI.
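The bootstrap steps can be sketched in Python as follows. Several simplifications are assumed and should not be read as the study's implementation: the estimator is a weighted difference in mean outcome between exposed and unexposed children rather than an adjusted linear regression, the CI is taken from percentiles of the bootstrap distribution, the difference is defined as participants minus entire cohort, and the data are simulated. All names are hypothetical.

```python
import random


def weighted_mean_difference(rows):
    """Weighted mean outcome in exposed minus unexposed.

    Each row is (exposed, outcome, weight); weights stay fixed per child.
    """
    sums = {True: [0.0, 0.0], False: [0.0, 0.0]}
    for exposed, outcome, weight in rows:
        sums[exposed][0] += weight * outcome
        sums[exposed][1] += weight
    return sums[True][0] / sums[True][1] - sums[False][0] / sums[False][1]


def resample(rows, rng):
    """Draw len(rows) rows with replacement."""
    return [rows[rng.randrange(len(rows))] for _ in rows]


def bootstrap_difference_ci(participants, non_participants, estimator,
                            reps=1000, seed=1, alpha=0.05):
    """Percentile CI for (participants-only estimate) - (cohort estimate)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        boot_p = resample(participants, rng)        # (a) participants stratum
        boot_np = resample(non_participants, rng)   # (a) non-participants stratum
        cohort_estimate = estimator(boot_p + boot_np)   # (b) entire cohort
        participant_estimate = estimator(boot_p)        # (b) participants only
        diffs.append(participant_estimate - cohort_estimate)  # (c)
    diffs.sort()
    return diffs[int(alpha / 2 * reps)], diffs[int((1 - alpha / 2) * reps) - 1]


def simulate(n, rng):
    """Simulated children: 30% exposed, small exposure effect, unit weights."""
    rows = []
    for _ in range(n):
        exposed = rng.random() < 0.3
        rows.append((exposed, rng.gauss(0.1 if exposed else 0.0, 1.0), 1.0))
    return rows


data_rng = random.Random(42)
participants = simulate(300, data_rng)
non_participants = simulate(120, data_rng)
lower, upper = bootstrap_difference_ci(
    participants, non_participants, weighted_mean_difference, reps=200
)
```

Resampling within each stratum keeps the numbers of participants and non‐participants fixed in every replication, and reusing the same participant resample for both estimates preserves the dependence between the two estimates being compared.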

Sampling weights reflect the inverse probability of being sampled and were included in all analyses to account for the sampling strategy in the Puberty Cohort. ⁸ Robust SEs were applied to account for clustering of siblings (166 male‐male and 136 female‐female siblings) and the use of sampling weights. All analyses were performed in Stata 15.1 MP software (StataCorp).

2.5. Missing data

A maximum of 1.5% of the data were missing for exposures and other baseline characteristics. Therefore, complete case analyses were employed.

2.6. Sensitivity analysis

In a sensitivity analysis, we estimated unadjusted and baseline characteristics‐adjusted prevalence ratios of having information on HD:SDS (yes/no) according to the baseline characteristics using log‐binomial regression to assess whether having missing information on HD:SDS was independent of other variables.

2.7. Ethics approval

The Committee for Biomedical Research Ethics in Denmark approved the collection of data in the DNBC ((KF)01‐471/94). Written informed consent was obtained from the mother upon recruitment, including permission for follow‐up until the child turned 18 years of age. The present study was approved by The Danish Data Protection Agency (2012‐41‐0379 and 2015‐57‐0002) and the Steering Committee of the DNBC (2012‐04 and 2015‐47).

3. RESULTS

3.1. Associations between exposures and participation

Mothers of sons who agreed to participate in the Puberty Cohort were more often of normal weight, non‐smokers, of higher social status, and older at delivery than mothers of non‐participating sons (Table 1). Similar patterns were observed for mothers of the daughters (Table 2).

3.2. Association between pubertal timing and participation

The adjusted participation rates as spline functions of HD:SDS were relatively flat between −2 and 2, which includes approximately 95% of the observations, and they were not statistically significantly different from the null (Figure 2). Similarly, the adjusted difference in HD:SDS between participants and non‐participants was comparable in both sons (0.02 [95% CI: −0.04, 0.07]) and daughters (0.02 [95% CI: −0.04, 0.07]).

Figure 2.

Adjusted participation rates as spline functions of HD:SDS in 8969 sons (A) and 7797 daughters (B), the Puberty Cohort, Denmark, 2012‐2018. Solid lines represent estimates, and dashed lines represent 95% confidence intervals. Adjusted participation rate for a reference person that was the first‐born child whose mother was a non‐smoker, alcohol abstainer, normal weight and high‐grade professional during pregnancy, gave birth at 25 years and had maternal age at menarche at the same time as her peers

3.3. Risk of selection bias for associations between exposures and pubertal timing

Figures 3 and 4 and Tables S1 and S2 show adjusted measures of associations between exposures and HD:SDS among the entire Puberty Cohort (including both participants and non‐participants) and participants only. In both sons and daughters, most associations in the entire Puberty Cohort were comparable to those obtained among participants only. Statistically significant differences in estimates were only observed for maternal light smoking (1‐9 daily cigarettes) in sons, but no consistent patterns were seen across exposure categories. In daughters, we observed a tendency towards different estimates for maternal heavy smoking (15+ daily cigarettes), but when smoking was modelled as a linear term, no difference between the entire Puberty Cohort and the participants was observed.

Figure 3.

Measures of associations between three maternal exposures and pubertal timing, measured by HD:SDS, in sons in the Puberty Cohort and the participants only, Denmark, 2012‐2018. Adjusted for maternal age at menarche, highest social class of parents, parity, maternal age at delivery and the two other exposures

Figure 4.

Measures of associations between three maternal exposures and pubertal timing, measured by HD:SDS, in daughters in the Puberty Cohort and the participants only, Denmark, 2012‐2018. Adjusted for maternal age at menarche, highest social class of parents, parity, maternal age at delivery and the two other exposures

3.4. Associations between exposures and having information on HD:SDS

In a sensitivity analysis, we assessed the prevalence ratios of having information on HD:SDS according to baseline characteristics (Tables S3 and S4). Most baseline characteristics were not or only weakly associated with having information on HD:SDS; the exceptions were heavy smoking in pregnancy, highest social class of parents, and maternal age at delivery.

3.5. Comment

3.5.1. Principal findings

In this validation study on the risk of selection bias due to non‐participation in a puberty cohort, we found that participation was associated with maternal exposures and other baseline characteristics but not with pubertal timing, measured by HD:SDS. Hence, non‐participation appears to result in only minimal risk of selection bias in aetiologic studies of pubertal timing conducted in this puberty cohort. However, selection bias may still be present, if pubertal timing is associated with participation within strata of the exposure. ² Therefore, we also investigated the impact of potential selection bias on three a priori defined associations. Overall, the results were compatible with no or only weak selection bias. For sons, non‐participation appeared to bias the estimate for maternal smoking of 1–9 daily cigarettes towards earlier puberty, measured by higher HD:SDS. However, no consistent pattern was seen across maternal smoking categories, and no association was observed when modelling maternal smoking as a linear term. These results indicate that the finding for maternal smoking of 1–9 daily cigarettes might represent a chance finding.

3.5.2. Strengths of the study

The main strength of the present study is the availability of height measures in the Children’s Database, making it possible to compare associations obtained for participants only to those obtained for the entire Puberty Cohort (non‐participants and participants). This supported a direct assessment of the risk of selection bias for specific exposure‐outcome associations. Furthermore, we had a large sample with almost complete information on baseline characteristics.

3.5.3. Limitations of the data

Missing information on HD:SDS (25%) could have biased the associations if this missingness was related to both baseline characteristics and pubertal timing. However, the missingness was not or only weakly associated with baseline characteristics, most likely resulting in minimal bias.

The validity of HD:SDS as a marker of pubertal timing has been supported by high correlations between HD:SDS and age at peak height velocity of 0.84 for boys and 0.78 for girls, and correlations between HD:SDS and age at onset of the growth spurt of 0.75 for boys and 0.71 for girls. ³ Hence, HD:SDS is a measure of the accelerated linear growth that occurs during puberty, and it appears to be a surrogate measure for age at peak height velocity. Our measure of HD:SDS was further limited by lack of data on adult height as the children were not fully grown yet, and we had to rely on predictions based on parental height, which may introduce additional measurement error. ¹² Reassuringly, we have previously estimated correlations between HD:SDS and age at attaining the pubertal milestones in the expected range between −0.20 and −0.53. These correlations are within the expected range because (a) they are comparable to correlations between age at peak height velocity and other pubertal markers (the correlation coefficient with age at menarche is 0.48 and with age at onset of breast development is 0.27), ¹³ and (b) they are also comparable to the correlations between the other pubertal milestones in the Puberty Cohort (the correlation between age at first ejaculation and age at attaining the other pubertal milestones in boys ranged from 0.28 to 0.41, and the correlations between age at menarche and age at attaining the other pubertal milestones in girls ranged from 0.37 to 0.71). ¹³ Lastly, another study has also found intercorrelations between HD:SDS, genital development, pubic hair, voice breaking, and axillary hair in 14‐year‐old boys to be similar. ¹⁴ In conclusion, this suggests that HD:SDS constitutes a reasonable measure of pubertal timing, although we cannot rule out that measurement error on HD:SDS might have affected the results.

The lack of data on the children’s adult height might give rise to the following bias. If the studied exposure affects postnatal growth, the prediction algorithm may overestimate or underestimate final adult height among the exposed only. This may lead to measurement error of HD:SDS being differential with respect to the maternal exposure. However, the potential measurement error may be similar for participants and the entire Puberty Cohort, and therefore, the bias might cancel out when studying the risk of selection bias.

3.5.4. Interpretation

The finding of minimal risk of selection bias is supported by two other validation studies within the DNBC. ¹⁵ , ¹⁶ The first study assessed selection bias due to non‐participation at the 7‐year follow‐up and suggested negligible selection bias in five a priori defined exposure‐outcome associations, except for attention deficit hyperactivity disorder after prenatal exposure to maternal smoking. ¹⁵ The second study assessed bias due to non‐participation in a follow‐up of mothers from the DNBC 14 years after the index pregnancy and found that non‐participation had no or relatively small impact on their a priori associations, including an association between maternal smoking and ischaemic heart disease. ¹⁶ Similar findings of minimal selection bias due to loss to follow‐up on a priori defined associations have been reported in other cohorts. ¹⁷ , ¹⁸ , ¹⁹

In this study, we have addressed potential selection bias due to non‐participation of sons and daughters in a follow‐up cohort on pubertal timing, but we have not addressed potential bias due to selection of the pregnant women into the DNBC. Around 60% of the pregnant women invited to the DNBC participated. ⁷ However, whether the mother participated at baseline was probably unrelated to her unborn child’s future pubertal timing and need not introduce selection bias. This is supported by findings from Nohr et al. ²⁰ Using register‐based information, they found that non‐participation at baseline in the DNBC did not bias three associations between prenatal exposures and adverse pregnancy outcomes defined a priori. ²⁰ Similarly, no or only minor indication of selection bias was found for eight associations in the Norwegian Mother and Child Cohort Study. ²¹

In the Puberty Cohort, the children were invited to participate through an e‐mail to the mothers with a hyperlink to the web‐based puberty questionnaire, and the children took part in the final decision to participate. As the children were largely unaware of the maternal factors, these factors may have had a smaller impact on the children’s decision to participate than if the decision to participate were made solely by the mothers. This might lead to reduced risk of selection bias in measures of association for maternal lifestyle and pregnancy‐related factors.

Participation appeared to be more strongly associated with parental social class for sons than for daughters. This suggests that sons’ selection patterns may be more affected by social class than daughters’ selection patterns. Another explanation may be chance as the “student” and “economically inactive” categories were small and the only categories where a sex difference was observed. We are not aware of any other reports on sex differences for the association between social class and participation.

Our results might apply to similar cohorts with longitudinal self‐reported information on pubertal development by the children, although investigators should be aware of cultural and behavioural differences that may affect selection patterns. In contrast, our results might not apply to studies using clinical pubertal staging as such studies generally have much lower participation rates and might have different factors responsible for participation.

4. CONCLUSIONS

In conclusion, most maternal exposures and baseline characteristics were associated with participation in the Puberty Cohort as expected, but pubertal timing was not. This suggests that the non‐participation likely causes minimal selection bias in aetiologic studies on pubertal timing conducted in the Puberty Cohort. This was corroborated by similar associations between three a priori defined maternal exposures and pubertal timing in the entire Puberty Cohort (participants and non‐participants) and the participants, indicating no or negligible selection bias for the associations studied, although we cannot rule out that measurement error on HD:SDS may have affected the results. Our results are reassuring for future studies on prenatal risk factors for pubertal timing relying on cohorts with self‐assessed pubertal information.
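The reasoning behind this conclusion can be illustrated with a minimal sketch. It is not the analysis performed in this study, and all numbers are invented for illustration: the `invited` records, the participation probabilities, and the function names are hypothetical. The sketch compares the exposed-minus-unexposed difference in mean HD:SDS among all invited with the difference expected among participants, first when participation depends on exposure only and then when it also depends on the outcome among the exposed.

```python
# Hypothetical illustration only: these values are invented, not Puberty Cohort data.

def mean_difference(records, participation_prob=None):
    """Expected exposed-minus-unexposed difference in mean HD:SDS.

    Each record is weighted by its probability of participating, which gives
    the difference expected among participants. Without a probability
    function, every record has weight 1 (all invited children).
    """
    def weighted_mean(exposed):
        rows = [r for r in records if r["exposed"] == exposed]
        weights = [
            1.0 if participation_prob is None else participation_prob(r)
            for r in rows
        ]
        return sum(w * r["hd_sds"] for w, r in zip(weights, rows)) / sum(weights)

    return weighted_mean(True) - weighted_mean(False)


# Invented cohort: five unexposed and five exposed children.
invited = (
    [{"exposed": False, "hd_sds": y} for y in (-1.0, -0.5, 0.0, 0.5, 1.0)]
    + [{"exposed": True, "hd_sds": y} for y in (-0.8, -0.3, 0.2, 0.7, 1.2)]
)


def depends_on_exposure_only(record):
    # Exposed children participate less often, regardless of their HD:SDS.
    return 0.60 if record["exposed"] else 0.75


def depends_on_exposure_and_outcome(record):
    # Among the exposed, participation also rises with HD:SDS.
    if record["exposed"]:
        return 0.60 + 0.20 * record["hd_sds"]
    return 0.75


diff_all_invited = mean_difference(invited)
diff_exposure_only = mean_difference(invited, depends_on_exposure_only)
diff_exposure_and_outcome = mean_difference(invited, depends_on_exposure_and_outcome)

print(f"All invited:                         {diff_all_invited:.3f}")
print(f"Participation depends on exposure:   {diff_exposure_only:.3f}")
print(f"Participation depends on both:       {diff_exposure_and_outcome:.3f}")
```

In this toy example, the difference among participants equals the difference among all invited (0.2) when participation depends on exposure only, and departs from it once participation also depends on the outcome within an exposure group. The sketch assumes known participation probabilities and no measurement error, which is a simplification of the situation in the Puberty Cohort.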

Supporting information

Fig S1 (PDF, 391.5 KB)

Fig S2 (PDF, 353.9 KB)

Table S1 (PDF, 55.6 KB)

Table S2 (PDF, 53.6 KB)

Table S3 (PDF, 53.7 KB)

Table S4 (PDF, 53.5 KB)

ACKNOWLEDGEMENTS

The Danish National Birth Cohort was established with a significant grant from the Danish National Research Foundation. Additional support was obtained from the Danish Regional Committees, the Pharmacy Foundation, the Egmont Foundation, the March of Dimes Birth Defects Foundation, the Health Foundation and other minor grants. The DNBC Biobank has been supported by the Novo Nordisk Foundation and the Lundbeck Foundation. Follow‐up of mothers and children has been supported by the Danish Medical Research Council (SSVF 0646, 271‐08‐0839/06‐066023, O602‐01042B, 0602‐02738B), the Lundbeck Foundation (195/04, R100‐A9193), The Innovation Fund Denmark 0603‐00294B (09‐067124), the Nordea Foundation (02‐2013‐2014), Aarhus Ideas (AU R9‐A959‐13‐S804), University of Copenhagen Strategic Grant (IFSV 2012), and the Danish Council for Independent Research (DFF ‐ 4183‐00594 and DFF ‐ 4183‐00152).

Brix N, Ernst A, Lauridsen LLB, et al. Risk of selection bias due to non‐participation in a cohort study on pubertal timing. Paediatr Perinat Epidemiol. 2020;34:668–677. 10.1111/ppe.12679

Funding Information

This work was supported by the Danish Council for Independent Research (DFF 4183‐00152 to C.H.R.H.), the Faculty of Health at Aarhus University (N.B.), Jorck’s Foundation (16‐SU‐0637 to N.B.), the Denmark‐America Foundation (N.B.) and the Oticon Foundation (17‐3050 to N.B.).

REFERENCES

1. Graber JA. Pubertal timing and the development of psychopathology in adolescence and beyond. Horm Behav. 2013;64:262–269.
2. Greenland S. Response and follow‐up bias in cohort studies. Am J Epidemiol. 1977;106:184–187.
3. Wehkalampi K, Silventoinen K, Kaprio J, et al. Genetic and environmental influences on pubertal timing assessed by height growth. Am J Hum Biol. 2008;20:417–423.
4. Thompson CA, Arah OA. Selection bias modeling using observed data augmented with imputed record‐level probabilities. Ann Epidemiol. 2014;24:747–753.
5. Lash TL, Fox MP, Fink AK. Applying Quantitative Bias Analysis to Epidemiologic Data. New York, NY: Springer Science+Business Media; 2009.
6. Høghsbro C. http://www.esundhed.dk/dokumentation/Registre/Sider/Register.aspx?rp:A_Register=20&rp:Visning=0&. Accessed 25 Oct 2018.
7. Olsen J, Melbye M, Olsen SF, et al. The Danish National birth cohort – its background, structure and aim. Scand J Public Health. 2001;29:300–307.
8. Brix N, Ernst A, Lauridsen LLB, et al. Maternal smoking during pregnancy and timing of puberty in sons and daughters: a population‐based cohort study. Am J Epidemiol. 2019;188:47–56.
9. Marshall WA, Tanner JM. Variations in pattern of pubertal changes in girls. Arch Dis Child. 1969;44:291–303.
10. Marshall WA, Tanner JM. Variations in the pattern of pubertal changes in boys. Arch Dis Child. 1970;45:13–23.
11. Tinggaard J, Aksglaede L, Sorensen K, et al. The 2014 Danish references from birth to 20 years for height, weight and body mass index. Acta Paediatr. 2014;103:214–224.
12. Luo ZC, Albertsson‐Wikland K, Karlberg J. Target height as predicted by parental heights in a population‐based study. Pediatr Res. 1998;44:563–571.
13. Brix N, Ernst A, Lauridsen LLB, et al. Maternal pre‐pregnancy body mass index, smoking in pregnancy, and alcohol intake in pregnancy in relation to pubertal timing in the children. BMC Pediatr. 2019;19:338.
14. Ong KK, Bann D, Wills AK, et al. Timing of voice breaking in males associated with growth and weight gain across the life course. J Clin Endocrinol Metab. 2012;97:2844–2852.
15. Greene N, Greenland S, Olsen J, Nohr EA. Estimating bias from loss to follow‐up in the Danish National Birth Cohort. Epidemiology. 2011;22:815–822.
16. Bliddal M, Liew Z, Pottegard A, Kirkegaard H, Olsen J, Nohr EA. Examining non‐participation to the maternal follow‐up within the Danish National birth cohort. Am J Epidemiol. 2018;187:1511–1519.
17. Winding TN, Andersen JH, Labriola M, Nohr EA. Initial non‐participation and loss to follow‐up in a Danish youth cohort: implications for relative risk estimates. J Epidemiol Community Health. 2014;68:137–144.
18. Powers J, Loxton D. The impact of attrition in an 11‐year prospective longitudinal study of younger women. Ann Epidemiol. 2010;20:318–321.
19. Carter KN, Imlach‐Gunasekara F, McKenzie SK, Blakely T. Differential loss of participants does not necessarily cause selection bias. Aust N Z J Public Health. 2012;36:218–222.
20. Nohr EA, Frydenberg M, Henriksen TB, Olsen J. Does low participation in cohort studies induce bias? Epidemiology. 2006;17:413–418.
21. Nilsen RM, Vollset SE, Gjessing HK, et al. Self‐selection and bias in a large prospective pregnancy cohort in Norway. Paediatr Perinat Epidemiol. 2009;23:597–608.
