Abstract
The ABCD study is a new and ongoing project of very substantial size and scale involving 21 data acquisition sites. It aims to recruit 11,500 children and follow them for ten years with extensive assessments at multiple timepoints. To deliver on its potential to adequately describe adolescent development, it is essential that it adopt recruitment procedures that are efficient and effective and will yield a sample that reflects the nation’s diversity in an epidemiologically informed manner. Here, we describe the sampling plans and recruitment procedures of this study. Participants are largely recruited through the school systems with school selection informed by gender, race and ethnicity, socioeconomic status, and urbanicity. Procedures for school selection designed to mitigate selection biases, dynamic monitoring of the accumulating sample to correct deviations from recruitment targets, and a description of the recruitment procedures designed to foster a collaborative attitude between the researchers, the schools and the local communities, are provided.
Keywords: Adolescent Brain Cognitive Development, Recruitment, Adolescence, Study design
1. Overall design
The Adolescent Brain Cognitive Development study (ABCD) aims to characterize psychological and neurobiological development from pre-adolescence to young adulthood. A baseline cohort of 11,500 nine and ten year old children (and their parents/guardians) is being recruited and will be followed for ten years with annual lab-based assessments including biennial Magnetic Resonance Imaging (MRI). The final cohort will include 9780 single births and 1720 twins. This paper focuses on recruitment of the single birth participants, as twin recruitment is described elsewhere in this issue (Iacono et al., 2017). The sample size ensures sufficient power, allowing for an anticipated 10% attrition, to detect medium to small effects over the study’s duration. An extensive test battery assesses numerous factors that impact development and aims to provide a comprehensive understanding of change across these critical periods of human development. While a longitudinal study of a single cohort entails a substantial time and financial commitment, the ABCD study’s design provides an unprecedentedly large sample with baseline assessments that will precede many of the clinically relevant behaviors and key outcomes of interest that emerge during adolescence. While following a single sample of nine and ten year olds may be subject to specific cohort effects (i.e., developmental effects specific to this age range at this moment in history) and will be slower to yield developmental trajectories than might be found with an accelerated longitudinal design (i.e., a longitudinal study of multiple cohorts that vary across a wider range of ages; Thompson et al., 2011), the ABCD Study has the advantage of providing baseline (pre-adolescent) assessments on all its participants. Consequently, statistical power is maximized to identify the precursors of the range of developmental outcomes that emerge over the adolescent years.
Indeed, it is the very large variation in developmental trajectories that warrants the large sample size. While a sample this size is more than adequate to describe “typical” (i.e., average) neurodevelopment, it is our interest in individual differences (e.g., stratification by gender, genotype, exposure to enrichment vs. adversity, health, education, and parenting practices and interactions among variables of this kind) that makes so large a sample invaluable.
2. Population neuroscience
With the relatively recent creation of large-sample, neuroscientific studies such as IMAGEN (Schumann et al., 2010), PING (Jernigan et al., 2016), NCANDA (Brown et al., 2015), HCP (Glasser et al., 2016), UK Biobank (Sudlow et al., 2015), Generation R (Kooijman et al., 2016), GUSTO (Soh et al., 2014), and the Philadelphia Neurodevelopment Cohort (Satterthwaite et al., 2016) or data pooling efforts such as ENIGMA (Thompson et al., 2014) comes the possibility to apply epidemiological principles to neuroscientific investigations. Population neuroscience refers to the application of epidemiological practices such as purposeful sampling to neuroscience research which, heretofore, has typically studied relatively small and homogenous convenience samples (Falk et al., 2013; Paus, 2013). In contrast, a population neuroscience approach embraces the variation that exists across members of the population and strives to identify the broad range of biological, social and environmental influences on neurobiological function and development. Such an approach may ultimately reveal a much more complex tapestry of etiological mechanisms than are typically derived from averaging over a relatively homogenous sample. For example, the neurobiological factors that may underlie individual differences in psychological functioning, including risk for mental illness, may differ across demographic groups or environmental exposures. As the ABCD study aims to provide a comprehensive characterization of adolescent development in a sample that reflects the demographic variation of the US population, it is inevitable that it should adopt a population neuroscientific perspective that de-emphasizes sample averages in favor of understanding the very many mediating and moderating factors that may influence the broad range of adolescent development. In so doing, it aims to disentangle confounding factors from those genuinely involved in neurodevelopment. 
Our hope is that the size and the demographic diversity of our sample will contain sufficient inter-individual variability to allow us to statistically isolate what might be the critical sources of variance amidst correlated variables.
3. Adolescent diversity and national representativeness
A very important motivation for the ABCD study is that its sample should reflect, as best as possible, the sociodemographic variation of the US population. Studies such as the National Health and Nutrition Examination Survey (NHANES) that are designed to provide unbiased representation of the U.S. population and its major subpopulations utilize multi-stage probability sampling (Heeringa et al., 2017) which, in theory, provides all eligible participants a known, non-zero probability of inclusion in the cohort sample. The probability sampling design of such studies enables researchers to use “design-based” statistical methods to make unbiased or nearly unbiased inferences to the population from which the sample was selected. With one important departure, the ABCD cohort recruitment emulates a multi-stage probability sample of eligible children: a nationally distributed set of 21 primary stage study sites, a probability sampling of schools within the defined catchment areas for each site, and recruitment of eligible children in each sample school. The major departure from traditional probability sampling of U.S. children originates in how participating neuroimaging sites were chosen for the study. Although the 21 ABCD study sites are well-distributed nationally (see Table 1), the selection of collaborating sites is not a true probability sample of primary sampling units (PSUs) but was constrained by the grant review selection process and the requirement that selected locations have both the research expertise and the neuroimaging equipment needed for the study protocol. As a consequence, neuroimaging research centers are more likely to be located in urban areas, resulting in a potential under-representation of rural youth.
Table 1.
Census Region of ABCD Study Site | Total Study Sites | Single Birth Baseline Cohort (c) | Race/Ethnicity of Child (a) | | | | | Rural School Students |
---|---|---|---|---|---|---|---|---|
 | | | White | African-American | Hispanic | Asian | All Other (b) | |
Northeast | 4 | 1900 | 1252 | 315 | 207 | 68 | 58 | 358 |
Midwest | 4 | 1385 | 795 | 321 | 124 | 94 | 49 | 210 |
South | 6 | 2710 | 1114 | 694 | 596 | 85 | 221 | 310 |
West | 7 | 3520 | 1546 | 251 | 1284 | 257 | 195 | 320 |
Total ABCD Baseline | 21 | 9515 | 4707 | 1581 | 2211 | 504 | 512 | 1198 |
% of ABCD Single Birth Total in Target Sample | NA | 100.0% | 49.5% | 16.6% | 23.2% | 5.3% | 5.4% | 12.3% |
% Total ABCD 21 Site Age 9–10 Children in Public and Private Schools | NA | 100.0% | 51.3% | 14.9% | 23.4% | 5.6% | 4.9% | 12.3% |
% Total U.S. Population Age 9–10 Children in Public and Private Schools | NA | 100.0% | 50.7% | 14.5% | 25.1% | 5.0% | 4.7% | 17.5% |
(a) Assumes a 50:50 female to male allocation for each race/ethnicity category.
(b) Includes children of Native Hawaiian, Pacific Islander, Alaskan Native, American Indian and multiple races.
(c) An additional 265 subjects, which will bring the single birth total to 9780, are not yet allocated. Their allocation will be made to ensure that the final sample matches the target demographics.
That said, fortuitously, the recruitment catchment areas of the 21 participating sites encompass over 20% of the entire US population of nine and ten year olds. Moreover, a carefully designed sampling and recruitment process within sites, described below, aims to ensure both local randomization and representativeness while also yielding a final combined ABCD sample that we hope will provide a close approximation to national sociodemographics. The sociodemographic factors on which the sample is recruited include age, gender, race and ethnicity, socio-economic status, and urbanicity. The sociodemographic sample size targets for the ABCD baseline cohort come from a combination of two sources: 1) the American Community Survey (ACS), a large scale survey of approximately 3.5 million households conducted annually by the U.S. Census Bureau; and 2) annual 3rd and 4th grade school enrollment data maintained by the National Center for Education Statistics. The ACS is one of the primary sources of demographic data for the nation as a whole and for smaller areas as well. The NCES data sources provide aggregate counts of students for simple demographic classifications of children at the school district and individual school level.
As illustrated in the final two rows of Table 1, the marginal race and ethnic composition for children living within the site-defined recruitment catchment areas of these 21 sites (described below) closely matches the distribution of the U.S. as a whole. Consequently, across the individual sites there is little need for significant disproportionate sampling of single birth children in order to achieve an ABCD race/ethnic distribution that closely approximates that for the nation at large. Within specific race/ethnicity categories, a natural 50:50 distribution across age level (9, 10) and sex (male, female) is expected.
Table 1 illustrates the national and regional ABCD target sample sizes for children who were single births in each of five major race/ethnicity classifications. Relative to national proportions, the ABCD sample targets incorporate a slight oversampling of African-American (16.6% vs. actual population percentage of 14.5%) and Other Race (smaller racial groups and multiple-races; 5.4% vs. 4.7%) children and a corresponding slight undersampling of White (49.5% vs. 50.7%) and Hispanic (23.2% vs. 25.1%) children. These small deviations from a strict percentage match are related to special study objectives (e.g., recruitment of Native American children in several study sites) as well as the analytic aim of achieving a more optimal balance in the numbers of African American children across study sites.
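The target percentages can be recomputed directly from the counts in Table 1. The short Python sketch below is illustrative only: the counts and national percentages are copied from Table 1, while the function and variable names are our own and not part of any ABCD software.

```python
# Target counts from the "Total ABCD Baseline" row of Table 1.
targets = {
    "White": 4707,
    "African-American": 1581,
    "Hispanic": 2211,
    "Asian": 504,
    "All Other": 512,
}

# National percentages from the final row of Table 1.
national_pct = {
    "White": 50.7,
    "African-American": 14.5,
    "Hispanic": 25.1,
    "Asian": 5.0,
    "All Other": 4.7,
}


def target_shares(counts):
    """Return each group's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {group: round(100.0 * n / total, 1) for group, n in counts.items()}


shares = target_shares(targets)

# Positive values indicate oversampling relative to the national share,
# negative values indicate undersampling (percentage points).
differences = {g: round(shares[g] - national_pct[g], 1) for g in shares}
```

Under these inputs, `differences` is positive for the African-American, Asian and All Other groups and negative for the White and Hispanic groups.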
Recruiting a sample that reflects national proportions on these sociodemographic factors is of utmost importance for the ABCD study. There is a growing appreciation that, rather than being restricted to convenience samples that often exclude harder-to-recruit socio-economic groups, research studies should strive to increase the diversity of their samples (Keiding and Louis, 2016). Consistent with this aim of achieving greater “external validity” for both experimental and observational study findings, there is a growing body of statistical literature and methods that aim to maximize population representativeness even when study conditions do not permit a full application of traditional probability sampling methods (see DuGoff et al., 2014; O’Muircheartaigh and Hodges, 2014; Stuart et al., 2015; Elliott and Valliant, 2017). While there now exists institutional commitment to ensuring sex differences are incorporated into study designs (https://grants.nih.gov/grants/guide/notice-files/NOT-OD-15-102.html), the same is not yet true for other factors such as race and ethnicity, SES and urbanicity. Studies that have addressed the role of sex in, for example, the structural brain associations with substance use give strong evidence for substantial differences between males and females (Lind et al., 2017). Similarly, there is evidence that SES can have substantial effects on neurodevelopment (Noble et al., 2015).
Nevertheless, the full extent to which the determinants and the trajectories of neurodevelopment are influenced by these factors remains largely unknown. However, there is a firm commitment that these questions should be addressed by a major national study with the scope of ABCD. Not only is it important to identify these potentially moderating variables but a study of the magnitude of ABCD has the scope to elucidate the mechanisms that may underlie sociodemographic differences: Are sex differences driven by genetics, hormones, timing of puberty or socialization factors? Are SES differences driven by stress, nutrition, access to educational resources and so forth?
Designing the ABCD sample demographics to match those of the national target population does not guarantee that the sample will be representative across all of the many dimensions (demographics, family and individual factors, community and environment, behaviors, exposures) that may influence a child’s development. However, by exerting control over this smaller but still important set of socio-demographic attributes in the baseline recruitment, ABCD hopes to minimize a later need to statistically adjust (e.g., through weighting, propensity score methods, statistical modeling) for selection bias due to those factors that cannot be easily assessed prior to baseline recruitment or can only be observed as the ABCD cohort ages and is repeatedly observed.
3.1. Twins recruitment
Table 2 summarizes the race/ethnicity targets for the n = 1720 twins that will be recruited in the four study sites that are participating in the ABCD twin study component. Based on estimates derived from state vital statistics (birth registries) or twin registries maintained by the sites, the race/ethnic composition for the twin births in the four ABCD study sites is disproportionately more white (69.0%) than the national distributions of 2006–2008 twin births (60.2% white) or the current percentage of eligible school children who are white (50.4%). In setting the race/ethnic targets for the four twin sites, the sample was disproportionately allocated to increase the percentage of Hispanic and Asian, Native Hawaiian, Pacific Islander, Alaskan Native, American Indian and Other Race children to more closely approximate the national percentages of twin and total births in the 2006–2008 birth cohorts. The final allocation illustrated in Table 2 was constrained by the numbers of eligible twins in each study population as well as the analytic aim of balancing, as far as possible, the numbers of children in specific race/ethnic groups that would be recruited from the individual sites.
Table 2.
ABCD Population/Sample Summary | Total | Race/Ethnicity Category (b) | | | | |
---|---|---|---|---|---|---|
 | | White | African-American | Hispanic | Asian, NHP, AIAN | Mixed Race, Not Reported |
Twin Births in ABCD Twin Study Populations | 100.0% | 66.0% | 15.7% | 8.1% | 3.8% | 6.4% |
Target ABCD Twin Allocation (n) | 1720 | 1002 | 296 | 198 | 106 | 118 |
Target ABCD Twin Allocation (percent) | 100.0% | 58.3% | 17.2% | 11.5% | 6.1% | 6.9% |
US Vital Statistics Births 2006–2008 (c) | | | | | | |
% 2006–2008 US Twin Births | 100.0% | 60.2% | 16.6% | 16.7% | 5.6% | 0.9% |
% 2006–2008 Total Births | 100.0% | 53.7% | 14.6% | 24.5% | 6.5% | 0.7% |
(a) Numbers represent individual twins.
(b) Assume approximately 50:50 sex ratio for enrolled twins.
(c) U.S. Vital Statistics Race/Ethnicity as recorded for the mother. Not all states report father’s race/ethnicity.
4. Sampling and recruitment procedures
The primary recruitment approach of the ABCD study is through elementary schools, both public (including charter schools) and private. Almost the entire population of nine and ten year olds and their families (youth must be aged 9 or 10 at the time of their baseline assessments, all of which occur between September 1st 2016 and August 31st 2018) can be reached through their involvement in the school system. The demographic composition of individual schools is available and a school sampling procedure, using this information, can be employed in a standardized manner across all 21 participating sites. Reaching families through schools also has the advantage that it can enable face-to-face interactions between the researchers and the children and their families (e.g., through classroom presentations, PTA meetings, parent nights) and can elicit enthusiastic involvement from teachers, all of which, in turn, can aid long-term retention.
4.1. Sampling strategy
The ABCD consortium considered a full range of options for sampling and recruiting the baseline cohort of eligible 9 and 10 year-olds. Clinical samples, convenience sample methods and other non-probability sampling strategies were ruled out due to concerns that they were vulnerable to unknown selection biases (Baker et al., 2013). Various forms of probability sample screening were also ruled out due to concerns over low response rates (direct mail solicitation), lack of specificity in targeting the eligible age group and/or high cost (telephone screening of RDD samples, in-person screening of housing unit address samples).
After reviewing the full range of options, ABCD decided to employ probability sampling of U.S. schools within the 21 catchment areas as the primary method for contacting and recruiting eligible children and their parents. Sampling of schools and consenting students/parents has also been used to recruit cohorts for a number of major national studies. Since 1975, Monitoring the Future (MTF, Bachman et al., 2011) has used school-based recruitment to study patterns of drug use among 12th grade students and to establish a panel of teens who are followed into young adulthood. The Add Health Study (Chantala and Tabor, 2010), the National Comorbidity Replication-Adolescent Supplement (NCSR-A, Conway et al., 2016) and the National Education Longitudinal Studies (NELS, Ingels et al., 1990) are additional examples of national programs of research that have used school-based sampling and recruitment to launch a longitudinal study designed to represent the experiences of U.S. children and teens as they transition through the childhood years to adulthood.
Given the natural constraint to work within the catchment area populations of the 21 study sites, properties of the ABCD school-based sampling strategy that contribute to the study’s representation of U.S. population of children in the eligible birth cohorts include:
• The catchment areas represented by the 21 study sites are geographically distributed across the nation’s four major regions (see Table 1) and are demographically and socio-economically diverse. In terms of coverage of the U.S. population, over 1 in 5 of all eligible U.S. 9 and 10 year olds live within the geographic catchment areas of the 21 study sites.
• Annual data bases compiled and maintained by the National Center for Education Statistics (NCES) provide a comprehensive sampling frame of public and private schools for the nation and specifically for the 21 study sites. The frame contains data on the socio-demographic characteristics of the students attending each school. This enables a careful stratification of sample schools and a sample allocation plan that permits ABCD to recruit a cohort that achieves the specified demographic targets.
• ABCD has used the SAS V9.4 software system and the SAS Proc SurveySelect program to select a stratified, probability sample of schools from the sampling frame for each of the 21 sites, ensuring that systematic sampling biases in recruitment at the school level can be minimized. Using data available on the NCES data bases and ancillary data (e.g., Census information) for the geographic area in which schools are located, characteristics of sample schools that do and do not consent to participate can be compared.
• Nationally representative data from sources such as the U.S. Census American Community Survey (ACS) Public Use Micro-Sample (PUMS) files allow ABCD statisticians to compare the ABCD cohort’s demographic, socio-economic and other household characteristics with those of the population to which the study aims to generalize. This comparison further enables estimation of nonsampling biases (due to selective nonparticipation) and post hoc analytic corrections to be made. In the analysis of ABCD data, propensity weights (DuGoff et al., 2014) will be developed for each participating child and used to compensate for differential rates of participation by schools and parents in the baseline recruitment process.
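To illustrate the general idea of weighting a sample back toward a reference population, the Python sketch below computes simple cell-based adjustment weights (population share divided by sample share). This is a deliberately simplified stand-in: it is not the propensity weighting procedure that will be used for ABCD, which would model participation on many characteristics, and the function name and example cells are hypothetical.

```python
def cell_weights(sample_counts, population_pct):
    """Weight for each demographic cell = population share / sample share.

    sample_counts: number of enrolled children per cell.
    population_pct: reference population percentage per cell (e.g., from ACS).
    Simplifying assumption: one categorical variable, no empty cells.
    """
    n = sum(sample_counts.values())
    weights = {}
    for cell, count in sample_counts.items():
        if count == 0:
            raise ValueError(f"cell {cell!r} is empty and cannot be weighted")
        sample_share = count / n
        weights[cell] = (population_pct[cell] / 100.0) / sample_share
    return weights


# Hypothetical example: cell "A" is over-represented in the sample,
# so its members receive a weight below 1; cell "B" is weighted up.
example = cell_weights({"A": 60, "B": 40}, {"A": 50.0, "B": 50.0})
```

After weighting, each cell's weighted count matches its population share of the sample total, which is the property that any adjustment of this kind aims for.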
While the above properties of the ABCD sampling plan aim to minimize systematic biases in the sampling, it is well understood that self-selection by families into the study will likely be a major and unavoidable source of sampling bias. Consequently, any sampling strategy benefits from additional efforts to ensure outreach to and engagement with historically under-represented segments of the population. Knowledge, during recruitment, of how the accumulating sample might be deviating from target demographics (gender, race and ethnicity, SES, and urbanicity) helps identify when and where these outreach efforts need to be enhanced. Thus, active recruitment and monitoring of the accumulating sample go hand in hand over the two years of ABCD recruitment (September 2016–August 2018). The recruitment and monitoring procedures of the ABCD study are guided by the experience of the University of Michigan Institute for Social Research (ISR). ISR has a 70 year history of conducting population based research including major longitudinal programs of research such as Monitoring The Future (MTF), the Panel Study of Income Dynamics (PSID), and the Health and Retirement Study (HRS).
4.2. Sampling procedures
Recruitment procedures commenced with defining the catchment area where each research site wished to recruit. Typically, these were geographical areas centered on the research facility (e.g., all schools within 50 miles of the research institution). There are 21 recruitment sites in the ABCD study spread throughout the US (https://abcdstudy.org/contact.html). An analysis of the demographics of the nine and ten year olds within those catchment areas was performed as was an analysis of the demographics of the students at each elementary school. Each school was coded according to its geographical location, its racial, ethnic and gender composition, and, as a proxy for SES, its percentage of students receiving free or subsidized lunches.
Sampling statisticians from ISR used geographic information system (GIS) software and the Common Core of Data (CCD) and Private School Survey (PSS) national data bases from the National Center for Education Statistics (NCES) to convert the catchment area boundaries for each site into school districts that serve the public school student population in the original catchment area. To be included as part of the revised catchment area, districts that straddled the original boundary were required to have a minimum of 50% of schools physically located inside the boundary. Three data bases were prepared for each of the 21 ABCD study sites: 1) a data base of public school districts corresponding to the catchment area for the site; 2) a data base of all public schools in those school districts that served 3rd and 4th grade students (the modal grade levels of nine and ten year olds); and 3) a data base of all private schools with an address ZIP code that was within a site’s catchment area.
Across the 21 ABCD geographic study sites, it is important that the collective recruitment effort produces a baseline cohort that maximally reflects the distributions of demographic and socio-economic characteristics in the U.S. population of 9–10 year-old children. This study-wide objective to maximize the cohort’s representation of demographic and socio-economic groups has implications for the sample designs in the 21 sites. For example, to ensure that the ABCD baseline cohort achieves proportionate representation of African American adolescents, schools with a student body that was greater than 10% African American were oversampled by approximately 50% relative to similar size schools that fall below the 10% threshold. In ABCD site catchment areas that include rural/non-urban school districts, schools in these districts were also oversampled by a factor of approximately 50% to partially address a major under-representation of rural residents in the 21 ABCD study site catchment areas.
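The oversampling rule can be sketched as a selection weight applied to each school before a probability draw. The Python example below is a minimal illustration, not the SAS procedure used by the study. It assumes that selection is proportional to enrollment, that the approximately 50% boost is applied once even when a school qualifies on both criteria, and that no single school's weight exceeds the sampling interval (otherwise a school could be drawn twice). All field and function names are hypothetical.

```python
import random


def selection_weight(school, oversample=1.5):
    """Relative chance of selection for one school (illustrative rule)."""
    weight = float(school["enrollment"])
    # Assumption: the ~50% boost is applied once, whether the school
    # qualifies on African American enrollment, rural location, or both.
    if school["pct_african_american"] > 10.0 or school["rural"]:
        weight *= oversample
    return weight


def systematic_pps(schools, n, rng):
    """Draw n schools with probability proportional to selection_weight,
    using systematic sampling with a random start."""
    weights = [selection_weight(s) for s in schools]
    step = sum(weights) / n
    start = rng.random() * step  # random start in [0, step)
    picks, cumulative, k = [], 0.0, 0
    for school, weight in zip(schools, weights):
        cumulative += weight
        # Select the school whose cumulative range contains the next point.
        while k < n and start + k * step < cumulative:
            picks.append(school)
            k += 1
    return picks
```

In practice the draw would be carried out separately within each stratum of the sampling frame; the sketch omits stratification to keep the weighting step visible.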
4.3. Construction of school lists
Over the two-year course of school-based recruitment, the selection criteria for school inclusion enable the sites to: 1) control the total number of eligible students that are contacted, consented and enrolled; and 2) control departures due to sampling variability, non-consent and so on from the desired demographic composition for the site. To manage the overall numerical uncertainty that can arise from variability in the local rates of school participation and parental consent, the sample of schools selected for each site is assigned to four random subsamples or “replicates”. Each of the four replicates of the full sample for the study site is itself a proper one quarter of the larger sample. The first year of baseline recruitment for each study site began with random replicates 1 and 2. The lists of schools in replicates 1 and 2 were released to sites and the local researchers then made contact with the school superintendents and principals to elicit their cooperation with recruitment. Sites are instructed to contact all schools in replicates 1 and 2 (e.g., they should not introduce a selection bias by ignoring schools farthest from the research center) before moving on to replicates 3 and 4. For reasons explained next, schools in sample replicates 3 and 4 were not contacted in the initial phase (approximately Year 1) of recruitment.
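The replicate assignment can be illustrated with a short Python sketch that shuffles a site's sampled schools and deals them into four groups of (near) equal size. This is an assumption-laden simplification: the function name and seed argument are hypothetical, and a production design would typically assign replicates within strata so that each replicate preserves the stratification.

```python
import random


def assign_replicates(school_ids, n_replicates=4, seed=None):
    """Randomly assign each sampled school to a replicate numbered 1..n."""
    rng = random.Random(seed)
    shuffled = list(school_ids)
    rng.shuffle(shuffled)
    # Dealing the shuffled list in turn keeps replicate sizes within one
    # school of each other.
    return {sid: (i % n_replicates) + 1 for i, sid in enumerate(shuffled)}


# Hypothetical site sample of 20 schools: replicates 1 and 2 would be
# released in the first year, replicates 3 and 4 held in reserve.
replicates = assign_replicates(range(20), seed=2016)
first_year = sorted(s for s, r in replicates.items() if r in (1, 2))
```

Because every replicate is a random quarter of the full sample, releasing replicates 1 and 2 first yields a random half of the schools rather than a convenience subset.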
4.4. Monitoring and dynamic adjustment of the accumulating sample
As noted above, factors outside of the researchers’ control, such as the noncooperation of state or city departments, schools or parents, could have large effects on the characteristics of the eventual sample. However, by starting with probability samples of elementary school students as the basis for the recruitment process, ABCD minimizes the risks of unmeasured selection bias inherent in a nonrandom pool of potential recruits and maximizes the opportunity to evaluate and potentially adjust for such bias.
As the ABCD sample accumulates, its demographics are carefully monitored. Emerging deviations from targets are quickly identified and strategies successful at one site are shared among the consortium (a 10% departure is the typical threshold for intervention). Should a significant departure from demographic targets be observed at a site then the composition of schools in replicates 3 and 4 for that site can be adjusted so as to increase the representation of that segment of the population that has been under-recruited in replicates 1 and 2. For example, should a site find that it is recruiting fewer Hispanic participants than anticipated, the list of schools in replicates 3 and 4 can be adjusted so as to include more schools with higher numbers of Hispanic students. A centralized data pooling allows for these adjustments to occur both within and between sites, thus increasing the likelihood that the consortium’s final sample will have a demographic profile that matches national proportions.
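The monitoring step can be sketched as a comparison of the accumulating sample against its targets. The Python example below flags any group whose observed share departs from its target by more than 10%. Whether the 10% threshold is applied in relative or absolute terms is not specified here, so the sketch assumes a relative departure; the function name and the example enrollment counts are hypothetical.

```python
def flag_departures(enrolled, target_pct, threshold=0.10):
    """Return groups whose observed share deviates from target by more than
    `threshold`, expressed as a relative departure (assumption)."""
    total = sum(enrolled.values())
    if total == 0:
        raise ValueError("no participants enrolled yet")
    flags = {}
    for group, pct in target_pct.items():
        observed = 100.0 * enrolled.get(group, 0) / total
        relative = (observed - pct) / pct
        if abs(relative) > threshold:
            flags[group] = round(relative, 3)
    return flags


# Target percentages from Table 1; enrollment counts are made up.
target_pct = {"White": 49.5, "African-American": 16.6, "Hispanic": 23.2,
              "Asian": 5.3, "All Other": 5.4}
enrolled = {"White": 55, "African-American": 17, "Hispanic": 18,
            "Asian": 5, "All Other": 5}
flags = flag_departures(enrolled, target_pct)
```

In this made-up example, the White group is flagged as over-recruited and the Hispanic group as under-recruited, which would prompt an adjustment to the schools listed in replicates 3 and 4.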
4.5. Approaching schools
Once schools are identified, sites contact school district superintendents and school principals requesting their cooperation in distributing materials (hard and electronic copies) to the nine and ten year olds in their school. In some sites, approval from the state’s department of education is required before such contact can be established. Forging a strong relationship with principals, teachers and parents is central to maximizing enrollment and subsequent retention (see more details in this issue, Feldstein Ewing et al., 2017). Consequently, researcher-led presentations on adolescent development to teachers, students and parents are typically offered, and additional efforts to disseminate the study goals (e.g., distributing study brochures, staffing desks at school entrances) are employed. The ABCD Outreach and Dissemination working group centrally prepares high-quality recruitment materials that are used at all sites. This approach also ensures standardized messaging across all sites. All recruitment materials are given to all the children in the targeted age range (typically, 3rd, 4th, and 5th graders) to take home to their families. If schools agree, electronic copies are also sent through classroom mailing lists and information is included in the school newsletter. The information directs families to the research site (school-specific webpage; phone and e-mail contact information), thereby reducing any burden on school staff. Subsequently, interested families complete a brief telephone screening and, if eligible, are enrolled and scheduled for the baseline assessment. The assessments occur at the research centers at times that are convenient for the families, including after school, on weekends, and during school breaks. Guardians and children are reimbursed for their participation. Reimbursement rates vary across sites given variation in costs of living.
The typical compensation is $200 to the parent/guardian who accompanies the child and $100 worth of gifts and gift cards to the child. In addition, travel costs and childcare for other siblings are offered to families to increase study accessibility.
5. Additional recruitment procedures
A number of additional recruitment procedures have been employed as supplements to the standard school-based approach. Combined, these methods will be the source of less than 10% of the final sample; this cap is to minimize the influence of any systematic sampling biases associated with these supplementary procedures. Demographics of the participants recruited using these methods are monitored to assess any departure from the demographics of youth recruited through schools. Some of these alternatives enable us to recruit children who might otherwise be excluded, such as home-schooled children who would not be reached through the standard school-based recruitment method. In addition to these active recruitment methods, all sites engage in considerable effort to publicize the study through local media and outreach activities. These efforts are especially important to disseminate information about the study to segments of the population that are typically under-represented in research or that may be more distrustful of research goals. Interactions between researchers and, for example, church leaders and congregations or respected members of minority groups are very valuable in ensuring that the ABCD sample achieves its goal of being as diverse as the national population.
5.1. Mailing lists
As the school-based recruitment required a ramp-up period (obtaining permission from state departments and school superintendents, establishing links with principals, distributing materials to families), some sites initially utilized commercial mailing lists to distribute their materials to households within their catchment areas.
5.2. Affiliates
The sampling scheme is designed to enable control over who is approached and invited to participate in the study. However, self-selection into the study is allowed for individuals who are affiliated with an enrolled participant (e.g., a neighbor, friend or relation). The rationale is that the joint enrollment of affiliates will likely facilitate retention as sites can schedule annual assessments to accommodate the pair.
5.3. Referrals
A snowballing referral mechanism is employed whereby enrolled families are compensated for referring another family to join the study. This approach maximizes word-of-mouth enrollment: those in the study can serve as ambassadors for the project and provide a personal endorsement, which may prove useful in segments of the population that are harder to reach or perhaps distrustful of research studies.
5.4. Summer recruitment
To avoid a potential lag in recruitment during the summer months when schools are out, recruitment continues with outreach to summer activities such as Boys and Girls clubs, YMCAs, and summer meals programs. Researchers at all sites are encouraged to use their local knowledge to reach out to segments of their population who may be under-recruited or who historically prove harder to recruit (such as lower-income or minority families). In addition, summer recruitment activities that are based within schools in the defined samples are prioritized, for example, summer reading programs at elementary schools.
5.5. Twin registry
Given their lower prevalence, an alternative procedure is employed for twin recruitment. As described above and in more detail elsewhere in this issue, twins (n = 1720 comprising 860 twin pairs) are recruited through direct contact using parental name and contact information from birth registries and rigorous tracking of parents and twins to their current residential address (Iacono et al., 2017).
6. Predicting mental health outcomes
One very important motivation for the ABCD study is to identify the precursors and trajectories of adolescent mental health. Many mental health problems are considered neurodevelopmental in nature, emerging during adolescence, a period of substantial neurobiological, physiological, psychological and social change. To ensure that the study has the inter-individual variation at baseline to provide sufficient statistical power to characterize these different developmental trajectories, we aim to recruit a sample with a large number of children who show early signs of externalizing and internalizing symptoms. The full details of these criteria, their predictive value, and the screening procedures are described elsewhere in this special issue.
There is, of course, a tension between enriching the study by over-recruiting those children with signs of externalizing/internalizing symptoms and maintaining the goal of recruiting a sample that reflects the sociodemographic diversity of the US population of 9–10 year olds. The initial intention was for 50% of the youth enrolled in the single birth sample to exhibit elevated externalizing/internalizing symptoms at baseline. So far, in the absence of selective recruitment of individuals showing elevated symptoms, over 40% of the emerging sample satisfies these criteria. We are therefore optimistic that a sample that has the desirable inter-individual variation will be enrolled without deviating from our current sampling strategies and without using symptom profiles for subject selection and enrollment.
7. Statistical adjustments
Despite the efforts to control the ABCD sample, factors that are largely uncontrollable, such as the noncooperation of specific schools or individual families, may result in an eventual sample that departs from national proportions. In this case, knowing the demographics of the target population and of those who were invited to enroll, in comparison to those who comprise the final cohort, enables us to estimate systematic biases in recruitment. For descriptive analyses and modeling of the ABCD data, propensity-based methods (Rosenbaum and Rubin, 1983; Heeringa et al., 2017) will be employed to develop case-specific analysis weights that are designed to compensate for any selectivity (due to non-observation) that remains in the ABCD sample. Propensity analyses will consist of individual subject inverse-probability-of-sampling scores computed based on known population vs. study proportions of 9–10 year olds in the various demographic categories described above. These scores will be included as weights in statistical analyses, thereby yielding less biased, population-valid estimates of regression coefficients and other analytical outputs. Using standard statistical packages such as SAS®, Stata® or the R system, these analysis weights are readily incorporated in the estimation and model fitting required for the numerous analyses afforded by the study’s broad range of assessments.
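To make the weighting idea concrete, the following is a minimal illustrative sketch in Python. It is not the ABCD weighting procedure. It assumes the simplest possible case, in which each participant falls into a single demographic cell and the weight for a cell is the ratio of its known population proportion to its observed sample proportion. The function names, the two cells ("urban", "rural"), the proportions and the outcome values are all hypothetical.

```python
from collections import Counter


def inverse_probability_weights(sample_cells, population_props):
    """Return one weight per participant.

    Assumed scheme: weight = population share of the participant's
    demographic cell / sample share of that cell. Cells that are
    under-represented in the sample receive weights above 1.
    """
    n = len(sample_cells)
    counts = Counter(sample_cells)
    weights = []
    for cell in sample_cells:
        sample_prop = counts[cell] / n
        weights.append(population_props[cell] / sample_prop)
    return weights


def weighted_mean(values, weights):
    """Weighted mean of an outcome using the analysis weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical toy data: the population is split evenly between two
# cells, but the sample over-represents the "urban" cell (3 of 4).
population_props = {"urban": 0.5, "rural": 0.5}
sample_cells = ["urban", "urban", "urban", "rural"]
outcome = [10.0, 10.0, 10.0, 20.0]

weights = inverse_probability_weights(sample_cells, population_props)
unweighted = sum(outcome) / len(outcome)
weighted = weighted_mean(outcome, weights)
# The weighted estimate is pulled toward the under-sampled cell:
# roughly 15 here, compared with an unweighted mean of 12.5.
print(unweighted, weighted)
```

In practice, weights of this kind would be built from several demographic variables jointly and passed to survey-aware estimation routines in the packages named above rather than to a hand-written mean.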
8. Conclusions
The ABCD study is a landmark project in its scale and scope and in its potential to advance knowledge. It will generate a database, open to the global scientific community, containing extensive multi-modal data on an especially large sample and, with a ten year developmental focus, will characterize the changes from childhood to young adulthood in depth. Considerable attention has been paid to ensuring that its sample reflects the diversity of the national population, thereby greatly increasing the generalizability of its findings. An important set of analyses will focus on assessing the influence of demographic variables on adolescent development. While the number of variables that might influence development (including higher-order interactions among variables) is extremely large, the open science model of ABCD, wherein all data are made available to the scientific community, should facilitate a full exploration of the influence of sociodemographics on the outcome measures. The degree to which these demographic variables influence developmental trajectories remains to be clarified. Of note, recruiting a sample that is broad in both its sociodemographic diversity and inter-individual variation will allow us to disentangle factors that are all too often confounded, such as race, urbanicity and SES.
Disclaimer
The views and opinions expressed in this manuscript are those of the authors only and do not necessarily represent the views, official policy or position of the U.S. Department of Health and Human Services or any of its affiliated institutions or agencies.
Conflict of Interest
None.
References
- Bachman J.G., Johnston L.D., O’Malley P.M., Schulenberg J.E. MTF Occasional Paper Series, Paper 76. Institute for Social Research, University of Michigan; Ann Arbor: 2011. The monitoring the future project after thirty-seven years: design and procedures.
- Baker R., Brick J.M., Bates N.A., Battaglia M., Couper M.P., Dever J.A., Gile K.J., Tourangeau R. 2013. Report of the AAPOR Task Force on Non-Probability Sampling. https://www.aapor.org/AAPOR_Main/media/MainSiteFiles/NPS_TF_Report_Final_7_revised_FNL_6_22_13.pdf
- Brown S.A., Brumback T., Tomlinson K., Cummins K., Thompson W.K., Nagel B.J., De Bellis M.D., Hooper S.R., Clark D.B., Chung T., Hasler B.P., Colrain I.M., Baker F.B., Prouty D., Pfefferbaum A., Sullivan E.V., Pohl K.M., Rohlfing T., Nichols B.N., Chu W., Tapert S.F. The National Consortium on Alcohol and NeuroDevelopment in Adolescence (NCANDA): a multi-site study of adolescent development and substance use. J. Stud. Alcohol. 2015;76:895–908. doi: 10.15288/jsad.2015.76.895.
- Chantala K., Tabor J. Carolina Population Center, University of North Carolina; Chapel Hill: 2010. National Longitudinal Study of Adolescent Health. Technical Report.
- Conway K.P., Swendsen J., Husky M.M., He J.-P., Merikangas K.R. Association of lifetime mental disorders and subsequent alcohol and illicit drug use: results from the National Comorbidity survey-adolescent supplement. J. Am. Acad. Child Adolesc. Psychiatry. 2016;55(4):280–288. doi: 10.1016/j.jaac.2016.01.006.
- DuGoff E.H., Schuler M., Stuart E. Generalizing observational study results: applying propensity score methods to complex surveys. Health Serv. Res. 2014;49(1):284–303. doi: 10.1111/1475-6773.12090.
- Elliott M.R., Valliant R. Inference for nonprobability samples. Stat. Sci. 2017;32(2):249–264.
- Falk E.B., Hyde L.W., Mitchell C., Faul J., Gonzalez R., Heitzeg M.M., Keating D.P., Langa K.M., Martz M.E., Maslowsky J., Morrison F.J., Noll D.C., Patrick M.E., Pfeffer F.T., Reuter-Lorenz P.A., Thomason M.E., Davis-Kean P., Monk C.S., Schulenberg J. What is a representative brain? Neuroscience meets population science. Proc. Natl. Acad. Sci. U. S. A. 2013;110(44):17615–17622. doi: 10.1073/pnas.1310134110. 29.
- Feldstein Ewing S.W., Chang L., Cottler L.B., Tapert S.F., Dowling G.J., Brown S.A. Approaching retention within the ABCD study. Dev. Cogn. Neurosci. 2017;(November) doi: 10.1016/j.dcn.2017.11.004. pii: S1878-9293(17)30097-X [Epub ahead of print] Review. PMID: 29150307.
- Glasser M.F., Coalson T.S., Robinson E.C., Hacker C.D., Harwell J., Yacoub E., Ugurbil K., Andersson J., Beckmann C.F., Jenkinson M., Smith S.M., Van Essen D.C. A multi-modal parcellation of human cerebral cortex. Nature. 2016;536(7615):171–178. doi: 10.1038/nature18933.
- Heeringa S.G., West B.T., Berglund P.A. second edition. Chapman and Hall; 2017. Applied Survey Data Analysis.
- Iacono W.G., Heath A.C., Hewitt J.K., Neale M.C., Banich M.T., Luciana M.M., Madden P.A., Barch D.M., Bjork J.M. The utility of twins in developmental cognitive neuroscience research: how twins strengthen the ABCD research design. Dev. Cogn. Neurosci. 2017;(September) doi: 10.1016/j.dcn.2017.09.001. pii: S1878-9293(17)30113-5 [Epub ahead of print] Review. PMID: 29107609.
- Ingels S.J., Abraham S.Y., Karr R., Spenser B.D., Frankel M.R. Technical Report. National Opinion Research Center, University of Chicago; 1990. National education longitudinal survey of 1988.
- Jernigan T.L., Brown T.T., Hagler D.J., Jr., Akshoomoff N., Bartsch H., Newman E., Thompson W.K., Bloss C.S., Murray S.S., Schork N., Kennedy D.N., Kuperman J.M., McCabe C., Chung Y., Libiger O., Maddox M., Casey B.J., Chang L., Ernst T.M., Frazier J.A., Gruen J.R., Sowell E.R., Kenet T., Kaufmann W.E., Mostofsky S., Amaral D.G., Dale A.M. pediatric imaging, neurocognition and genetics study. The pediatric imaging, neurocognition, and genetics (PING) data repository. Neuroimage. 2016;124(Pt B):1149–1154. doi: 10.1016/j.neuroimage.2015.04.057.
- Keiding N., Louis T.A. Perils and potentials of self-selected entry to epidemiological studies and surveys. J. R. Stat. Soc. A. 2016;179(2):319–376.
- Kooijman M.N., Kruithof C.J., van Duijn C.M., Duijts L., Franco O.H., van IJzendoorn M.H., de Jongste J.C., Klaver C.C., van der Lugt A., Mackenbach J.P., Moll H.A., Peeters R.P., Raat H., Rings E.H., Rivadeneira F., van der Schroeff M.P., Steegers E.A., Tiemeier H., Uitterlinden A.G., Verhulst F.C., Wolvius E., Felix J.F., Jaddoe V.W. The generation R study: design and cohort update 2017. Eur J. Epidemiol. 2016;31:1243–1264. doi: 10.1007/s10654-016-0224-9.
- Lind K.E., Gutierrez E.J., Yamamoto D.J., Regner M.F., McKee S.A., Tanabe J. Sex disparities in substance abuse research: evaluating 23 years of structural neuroimaging studies. Drug Alcohol. Depend. 2017;173:92–98. doi: 10.1016/j.drugalcdep.2016.12.019.
- O’Muircheartaigh C., Hodges L.V. Generalizing from unrepresentative experiments: a stratified propensity score approach. Appl. Stat. 2014;63(2):195–210.
- Paus T. Springer-Verlag; Berlin Heidelberg: 2013. Population Neuroscience.
- Noble K.G., Houston S.M., Brito N.H., Bartsch H., Kan E., Kuperman J.M., Akshoomoff N., Amaral D.G., Bloss C.S., Libiger O., Schork N.J., Murray S.S., Casey B.J., Chang L., Ernst T.M., Frazier J.A., Gruen J.R., Kennedy D.N., Van Zijl P., Mostofsky S., Kaufmann W.E., Kenet T., Dale A.M., Jernigan T.L., Sowell E.R. Family income, parental education and brain structure in children and adolescents. Nat. Neurosci. 2015;18(5):773–778. doi: 10.1038/nn.3983.
- Rosenbaum P.R., Rubin D.B. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70(1):41–55.
- Satterthwaite T.D., Connolly J.J., Ruparel K., Calkins M.E., Jackson C., Elliott M.A., Roalf D.R., Ryan Hopsona K.P., Behr M., Qiu H., Mentch F.D., Chiavacci R., Sleiman P.M., Gur R.C., Hakonarson H., Gur R.E. The philadelphia neurodevelopmental cohort: a publicly available resource for the study of normal and abnormal brain development in youth. Neuroimage. 2016;124(Pt B):1115–1119. doi: 10.1016/j.neuroimage.2015.03.056.
- Schumann G., Loth E., Banaschewski T., Barbot A., Barker G., Büchel C., Conrod P.J., Dalley J.W., Flor H., Gallinat J., Garavan H., Heinz A., Itterman B., Lathrop M., Mallik C., Mann K., Martinot J.L., Paus T., Poline J.B., Robbins T.W., Rietschel M., Reed L., Smolka M., Spanagel R., Speiser C., Stephens D.N., Ströhle A., Struve M., IMAGEN consortium The IMAGEN study: reinforcement-related behaviour in normal brain function and psychopathology. Mol. Psychiatry. 2010;15(12):1128–1139. doi: 10.1038/mp.2010.4.
- Soh S.E., Chong Y.S., Kwek K., Saw S.M., Meaney M.J., Gluckman P.D., Holbrook J.D., Godfrey K.M., GUSTO Study Group Insights from the growing up in Singapore towards healthy outcomes (GUSTO) cohort study. Ann. Nutr. Metab. 2014;64(3–4):218–225. doi: 10.1159/000365023.
- Stuart E.A., Bradshaw C.P., Leaf P.J. Assessing the generalizability of randomized trial results to target populations. Prev. Sci. 2015;16(3):475–485. doi: 10.1007/s11121-014-0513-z.
- Sudlow C., Gallacher J., Allen N., Beral V., Burton P., Danesh J., Downey P., Elliott P., Green J., Landray M., Liu B., Matthews P., Ong G., Pell J., Silman A., Young A., Sprosen T., Peakman T., Collins R. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 2015;12(3):e1001779. doi: 10.1371/journal.pmed.1001779.
- Thompson P.M., Stein J.L., Medland S.E., Hibar D.P., Vasquez A.A., Renteria M.E. The ENIGMA consortium: large-scale collaborative analyses of neuroimaging and genetic data. Brain Imaging Behav. 2014;8(2):153–182. doi: 10.1007/s11682-013-9269-5.
- Thompson Wesley K., Hallmayer Joachim, O’Hara Ruth. Design considerations for characterizing psychiatric trajectories across the lifespan: application to effects of APOE-ε4 on cerebral cortical thickness in Alzheimer’s disease. Am. J. Psychiatry. 2011;168(9):894–903. doi: 10.1176/appi.ajp.2011.10111690.