Abstract
Cognitive abilities are closely tied to mental health from early childhood. This study explores how neurobiological units of analysis of cognitive abilities—multimodal neuroimaging and polygenic scores (PGS)—represent this connection. Using data from over 11,000 children (ages 9–10) in the Adolescent Brain Cognitive Development (ABCD) Study, we applied multivariate models to predict cognitive abilities from mental health, neuroimaging, PGS, and environmental factors. Neuroimaging included 45 MRI-derived features (e.g. task/resting-state fMRI, structural MRI, diffusion imaging). Environmental factors encompassed socio-demographics (e.g. parental income/education), lifestyle (e.g. sleep, extracurricular activities), and developmental adverse events (e.g. parental use of alcohol/tobacco, pregnancy complications). Cognitive abilities were predicted by mental health (r = 0.36), neuroimaging (r = 0.54), PGS (r = 0.25), and environmental factors (r = 0.49). Commonality analyses showed that neuroimaging (66%) and PGS (21%) explained most of the cognitive–mental health link. Environmental factors accounted for 63% of the cognitive–mental health link, with neuroimaging and PGS explaining 58% and 21% of this environmental contribution, respectively. These patterns remained consistent over two years. Findings highlight the importance of neurobiological units of analysis for cognitive abilities in understanding the cognitive–mental health connection and its overlap with environmental factors.
Research organism: Human
Introduction
Cognitive abilities across various domains, such as attention, working memory, declarative memory, verbal fluency, and cognitive control, are often altered in several psychiatric disorders (Millan et al., 2012). This is evident in recent meta-analyses of case-control studies involving patients with mood and anxiety disorders, obsessive-compulsive disorder, posttraumatic stress disorder, and attention-deficit/hyperactivity disorder (ADHD), among others (Abramovitch et al., 2021; East-Richard et al., 2020). Beyond typical case-control studies, the association between cognitive abilities and mental health is also observed when mental health varies from normal to abnormal in normative samples (Morris et al., 2022). For instance, our previous study (Pat et al., 2022a) found an association between cognitive abilities and mental health in a relatively large, non-referred sample of 9–10-year-old children from the ABCD study (Casey et al., 2018). In that study, we measured cognitive abilities using behavioural performance across cognitive tasks (Luciana et al., 2018) while measuring mental health using a broad range of emotional and behavioural problems (Achenbach et al., 2017). Thus, cognitive abilities are frequently considered crucial for understanding mental health issues throughout life, beginning in childhood (Abramovitch et al., 2021; Hankin et al., 2016; Morris and Cuthbert, 2012).
According to the National Institute of Mental Health’s Research Domain Criteria (RDoC) framework (Insel et al., 2010), cognitive abilities should be investigated not only behaviourally but also neurobiologically, from the brain to genes. It remains unclear to what extent the relationship between cognitive abilities and mental health is represented in part by different neurobiological units of analysis, such as neural and genetic levels measured by multimodal neuroimaging and polygenic scores (PGS). To fully comprehend the role of neurobiology in the relationship between cognitive abilities and mental health, we must also consider how these neurobiological units capture variations due to environmental factors, such as socio-demographics, lifestyles, and childhood developmental adverse events (Morris et al., 2022). Our study investigated the extent to which (a) environmental factors explain the relationship between cognitive abilities and mental health, and (b) cognitive abilities at the neural and genetic levels capture these associations due to environmental factors. Specifically, we conducted these investigations in a large normative group of children from the ABCD study (Casey et al., 2018). We chose to examine children because, while their emotional and behavioural problems might not meet full diagnostic criteria (Kessler et al., 2007), issues at a young age often forecast adult psychopathology (Reef et al., 2010; Roza et al., 2003). Moreover, the associations among different emotional and behavioural problems in children reflect transdiagnostic dimensions of psychopathology (Michelini et al., 2019; Pat et al., 2022a), making children an appropriate population to study the transdiagnostic aetiology of mental health, especially within a framework that emphasises normative variation from normal to abnormal, such as the RDoC (Morris et al., 2022).
Recently, several neuroscientists have developed predictive models using neuroimaging data from brain magnetic resonance imaging (MRI) of various modalities in the so-called Brain-Wide Association Studies (BWAS) (Marek et al., 2022; Sui et al., 2020). BWAS aims to create models from MRI data that can accurately predict behavioural phenotypes in participants not included in the model-building process (Dadi et al., 2021). In one of the most extensive BWAS benchmarks to date, Marek et al., 2022 concluded, ‘More robust BWAS effects were detected for functional MRI (versus structural), cognitive tests (versus mental health questionnaires), and multivariate methods (versus univariate).’ This benchmark has significant implications for using neuroimaging as a neural unit of analysis for cognitive abilities. First, while current BWAS may not be robust enough to predict mental health directly, it is more suitable for predicting cognitive abilities (see Zhi et al., 2024 for a similar conclusion). This aligns with the Research Domain Criteria (RDoC) framework, which emphasises neurobiological units of analysis for functional domains, such as cognitive abilities, rather than mental health itself (Cuthbert and Insel, 2013). RDoC’s functional domains capture basic human functioning and include cognitive abilities along with negative/positive valence, arousal, and regulation, and social and sensory processes (Morris et al., 2022). Accordingly, the current study conducted BWAS to capture cognitive abilities rather than mental health.
The second implication of the Marek et al., 2022 benchmark is the support it provides for using multivariate algorithms, which draw MRI information simultaneously across regions/voxels, over massively univariate algorithms that draw data from one area/voxel at a time. Similar to the Marek et al., 2022 study, which focused on resting-state functional MRI (rs-fMRI), our recent study on task-fMRI also found that multivariate algorithms outperformed massively univariate algorithms, by up to several fold, in predicting cognitive abilities (Pat et al., 2023). The third implication is that the performance of neuroimaging in predicting cognitive abilities depends on MRI modalities. Previous research has used brain MRI data of different modalities to predict cognitive abilities (Vieira et al., 2022). For instance, many studies have used rs-fMRI, which reflects functional connectivity between regions during rest (Dubois et al., 2018; Keller et al., 2023; Rasero et al., 2021; Sripada et al., 2020; Sripada et al., 2021). Others have utilised structural MRI (sMRI), which reflects anatomical morphology based on thickness, area, and volume in cortical/subcortical areas, and diffusion tensor imaging (DTI), which reflects diffusion distribution within white matter tracts (Mihalik et al., 2019; Rasero et al., 2021). While less common, task-fMRI, which reflects blood-oxygen-level-dependent (BOLD) activity relevant to each task condition, shows relatively good predictive performance, especially from specific contrasts, such as the 2-Back vs 0-Back from the N-Back working-memory task (Barch et al., 2013) (Makowski et al., 2024; Pat et al., 2023; Pat et al., 2022b; Sripada et al., 2020; Tetereva et al., 2022; Zhao et al., 2023). A recent meta-analysis estimated the performance of multivariate methods in predicting cognitive abilities from MRI of different modalities at around an out-of-sample r of 0.42 (Vieira et al., 2022).
However, we and others found that this predictive performance could be further boosted by drawing information across different MRI modalities, rather than relying on only one modality (Pat et al., 2022b; Rasero et al., 2021; Tetereva et al., 2022; Tetereva and Pat, 2024). Therefore, the current study used opportunistic stacking (Engemann et al., 2020; Pat et al., 2022b). This multivariate modelling technique allowed us to combine information across MRI modalities with the added benefit of handling missing values. With opportunistic stacking, we created a ‘proxy’ measure of cognitive abilities (i.e. predicted value from the model) at the neural unit of analysis using multimodal neuroimaging.
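The missing-value handling that gives opportunistic stacking its name can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes that each modality's first-layer predictions are duplicated, with missing entries coded once as a very large value and once as a very small value, before being passed to a tree-based second-layer model. The function name `code_missing`, the sentinel constants, and the toy modality names are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical sentinel values; the exact constants are an assumption.
LARGE = 1000.0
SMALL = -1000.0


def code_missing(
    first_layer: Dict[str, List[Optional[float]]]
) -> Dict[str, List[float]]:
    """Duplicate each modality's predictions into two columns.

    Missing predictions (None) are set to LARGE in one copy and to SMALL in
    the other, so a tree-based second-layer model can split them away from
    the observed range on either side.
    """
    stacked: Dict[str, List[float]] = {}
    for modality, preds in first_layer.items():
        stacked[f"{modality}_large"] = [LARGE if p is None else p for p in preds]
        stacked[f"{modality}_small"] = [SMALL if p is None else p for p in preds]
    return stacked


# Toy first-layer predictions for three children. The second child has no
# usable task-fMRI data and the third has no usable structural data.
first_layer = {
    "nback_2_vs_0": [0.4, None, -0.2],
    "cortical_thickness": [0.1, 0.3, None],
}
second_layer_inputs = code_missing(first_layer)
```

Under this coding, a child with one missing modality still contributes a complete row to the stacking layer, which is why no participant needs to be dropped for lacking a single scan type.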
Geneticists, like neuroscientists, have conducted Genome-Wide Association Studies (GWAS) to explore the links between single-nucleotide polymorphisms (SNPs) and various behavioural phenotypes (Bogdan et al., 2018). Similar to BWAS, GWAS can develop predictive models from genetic profiles, resulting in polygenic scores (PGS) that predict behavioural phenotypes in participants not included in the model-building process (Choi et al., 2020). Several large-scale GWAS on cognitive abilities have been conducted, with some studies involving over 250,000 participants (Davies et al., 2018; Lee et al., 2018; Savage et al., 2018). Recently, researchers have used these large-scale GWAS to compute PGS for cognitive abilities and applied these scores to predict cognitive abilities in children (Allegrini et al., 2019; Pat et al., 2022b). For example, Allegrini et al., 2019 found that PGS based on Savage et al.’s (2018) GWAS accounted for approximately 5.3% of the variance in cognitive abilities among 12-year-old children. The current study adopted this approach with children of a similar age in the ABCD study, creating a proxy measure of cognitive abilities at the genetic unit of analysis using PGS.
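As a rough illustration of what a polygenic score is, the sketch below computes it as a weighted sum of effect-allele counts, with weights taken from a discovery GWAS. The SNP weights and allele counts are invented for illustration, and real pipelines add steps (e.g. quality control and handling of linkage disequilibrium) that are not shown here. The last line converts the 5.3% of variance reported above into the equivalent correlation, assuming a single-predictor linear model.

```python
from typing import Sequence


def polygenic_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of effect-allele counts (0, 1, or 2 per SNP)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical GWAS effect sizes for four SNPs and one child's allele counts.
gwas_weights = [0.02, -0.01, 0.03, 0.005]
child_dosages = [2, 1, 0, 1]
score = polygenic_score(child_dosages, gwas_weights)

# With one predictor, variance explained equals the squared correlation,
# so 5.3% of variance corresponds to a correlation of about 0.23.
implied_r = 0.053 ** 0.5
```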
Environmental factors, broadly defined, significantly influence cognitive abilities (Duyme et al., 1999; Pietschnig and Voracek, 2015). A classic example is the Flynn Effect (Flynn, 1984; Flynn, 2009; Rundquist, 1936; Williams, 2013), which describes the observed rise in cognitive abilities, as measured by various cognitive tasks, across generations in the general population over time, particularly in high-income countries during the 20th century (Pietschnig and Voracek, 2015; Trahan et al., 2014; Wongupparaj et al., 2017). Experts attribute the Flynn Effect to environmental factors such as improved living standards and better education (Baker et al., 2015; Rindermann et al., 2017). Recently, researchers have used multivariate algorithms to create proxy measures of cognitive abilities in children based on environmental factors, similar to approaches used in neuroimaging and polygenic scores (PGS) (Kirlic et al., 2021; Pat et al., 2022b). These environmental factors often include socio-demographic variables (e.g. parental income/education, area deprivation index, parental marital status), lifestyle factors (e.g. screen/video game use, extracurricular activities), and developmental adverse events (e.g. parental use of alcohol/tobacco before and after pregnancy, birth complications). Studies, including ours (Kirlic et al., 2021; Pat et al., 2022b), have applied multivariate algorithms to predict cognitive abilities from various environmental factors in the ABCD study (Casey et al., 2018). In these predictive models, parental income/education, area deprivation index, and extracurricular activities are particularly important predictors of cognitive abilities (Kirlic et al., 2021; Pat et al., 2022b). Following this approach, the current study created another proxy measure of cognitive abilities based on socio-demographics, lifestyles, and developmental adverse events.
In this study, inspired by RDoC (Insel et al., 2010), we (a) focused on cognitive abilities as a functional domain, (b) created predictive models to capture the continuous individual variation (as opposed to distinct categories) in cognitive abilities, (c) computed two neurobiological units of analysis of cognitive abilities: multimodal neuroimaging and PGS, and (d) investigated the potential contributions of environmental factors. To operationalise cognitive abilities, we estimated a latent variable representing behavioural performance across various cognitive tasks, commonly referred to as general cognitive ability or the g-factor (Deary, 2012). The g-factor was computed from various cognitive tasks pertinent to RDoC constructs, including attention, working memory, declarative memory, language, and cognitive control. However, using the g-factor to operationalise cognitive abilities caused this study to diverge from the original conceptualisation of RDoC, which emphasises studying separate constructs within cognitive abilities (Morris et al., 2022; Morris and Cuthbert, 2012). Recent studies suggest that including a general factor, such as the g-factor, in the model, rather than treating each construct separately, improved model fit (Beam et al., 2021; Quah et al., 2025). The g-factor in children is also longitudinally stable and can forecast future health outcomes (Calvin et al., 2017; Deary et al., 2013). Notably, our previous research found that neuroimaging predicts the g-factor more accurately than predicting performance from separate individual cognitive tasks (Pat et al., 2023). Accordingly, we decided to conduct predictive models on the g-factor while keeping the RDoC’s holistic, neurobiological, and basic-functioning characteristics.
Using the ABCD study (Casey et al., 2018), we first developed predictive models to estimate the cognitive abilities of unseen children based on their mental health. These models enabled us to quantify the relationship between cognitive abilities and mental health, thereby creating a proxy measure of cognitive abilities derived from mental health data. The mental health variables included children’s emotional and behavioural problems (Achenbach et al., 2017) and temperaments, such as behavioural inhibition/activation (Carver and White, 1994) and impulsivity (Zapolski et al., 2010). These temperaments are linked to externalising and internalising aspects of mental health and are associated with disorders like depression, anxiety, and substance use (Carver and Johnson, 2018; Johnson et al., 2003). Next, we built predictive models of cognitive abilities using neuroimaging, polygenic scores (PGS), and socio-demographic, lifestyle, and developmental adverse event data, resulting in various proxy measures of cognitive abilities. For neuroimaging, we included 45 types of brain MRI data from task-fMRI, rs-fMRI, sMRI, and DTI. For PGS, we used three definitions of cognitive abilities based on previous large-scale GWAS (Davies et al., 2018; Lee et al., 2018; Savage et al., 2018). For socio-demographic, lifestyle, and developmental adverse events, we included 44 features, covering variables such as parental income/education, screen use, and birth/pregnancy complications. Finally, we conducted a series of commonality analyses (Nimon et al., 2008) using these proxy measures of cognitive abilities to address three specific questions. First, we examined the extent to which the relationship between cognitive abilities and mental health was represented in part by cognitive abilities at the neural and genetic levels, as measured by multimodal neuroimaging and PGS, respectively. 
Second, we assessed the extent to which this relationship was partly explained by environmental factors, as measured by socio-demographic, lifestyle, and developmental adverse events. Third, we tested whether the two neurobiological units of analysis for cognitive abilities, measured by multimodal neuroimaging and PGS, could account for the variance due to environmental factors. To ensure the stability of our results, we repeated the analyses at two time points (ages 9–10 and 11–12).
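The logic of a commonality analysis can be sketched for the simplest case of two predictors. The example below is an assumption-laden illustration rather than the study's analysis: it uses ordinary least squares on toy data, whereas the models actually fitted may differ. It partitions the variance in an outcome `y` into a part unique to each proxy measure and a part they share. The variable names and toy values are hypothetical.

```python
from math import sqrt
from typing import Dict, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def commonality(
    y: Sequence[float], x1: Sequence[float], x2: Sequence[float]
) -> Dict[str, float]:
    """Two-predictor commonality analysis under ordinary least squares."""
    r1, r2, r12 = pearson(y, x1), pearson(y, x2), pearson(x1, x2)
    r2_x1 = r1 ** 2  # R^2 of the model with x1 only
    r2_x2 = r2 ** 2  # R^2 of the model with x2 only
    # R^2 of the model with both predictors, from the three correlations.
    r2_both = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return {
        "unique_x1": r2_both - r2_x2,
        "unique_x2": r2_both - r2_x1,
        "common": r2_x1 + r2_x2 - r2_both,
        "total": r2_both,
    }


# Toy data: y stands for observed cognitive abilities, x1 for the proxy
# based on mental health, and x2 for the proxy based on neuroimaging.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x1 = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
parts = commonality(y, x1, x2)

# Share of the x1-y relationship that x2 also captures.
share_captured = parts["common"] / (parts["common"] + parts["unique_x1"])
```

Read this way, a statement such as "neuroimaging explained a given percentage of the cognitive–mental health link" corresponds to the common variance divided by the total variance that mental health explains on its own.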
Results
Predictive modelling
Predicting cognitive abilities from mental health
Figure 1a and Table 1 illustrate the predictive performance of the Partial Least Squares (PLS) models in predicting cognitive abilities from mental health features. These features included: (1) emotional and behavioural problems assessed by the Child Behaviour Checklist (CBCL) (Achenbach et al., 2017), and (2) children’s temperaments assessed by the Behavioural Inhibition System/Behavioural Activation System (BIS/BAS) (Carver and White, 1994) and the Urgency, Premeditation, Perseverance, Sensation seeking, and Positive urgency (UPPS-P) impulsive behaviour scale (Zapolski et al., 2010). Using these two sets of mental health features separately resulted in moderate predictive performance, with correlation coefficients ranging from r=0.24 to r=0.31. Combining them into a single set of features, termed ‘mental health,’ improved the performance to approximately r=0.36, consistent across the two time points.
Figure 1. Predictive models, predicting cognitive abilities from mental-health features via Partial Least Squares (PLS).
(a) Predictive performance of the models, indicated by scatter plots between observed vs predicted cognitive abilities based on mental health. Cognitive abilities are based on the second-order latent variable, the g-factor, based on a confirmatory factor analysis of six cognitive tasks. All data points are from test sets. r is the average Pearson’s r across 21 test sites. The parentheses following the r indicate bootstrapped 95% CIs, calculated based on observed vs predicted cognitive abilities from all test sites combined. The UPPS-P Impulsive Behaviour Scale and the Behavioural Inhibition System/Behavioural Activation System (BIS/BAS) were used for child temperaments, conceptualised as risk factors for mental health issues. Mental health includes features from CBCL and child temperaments. (b) Feature importance of mental health, predicting cognitive abilities via PLS. The features were ordered based on the loading of the first PLS component. Univariate correlations were Pearson’s r between each mental-health feature and cognitive abilities. Error bars reflect 95% CIs of the correlations. CBCL = Child Behaviour Checklist (in green), reflecting children’s emotional and behavioural problems; UPPS-P = Urgency, Premeditation, Perseverance, Sensation seeking, and Positive urgency Impulsive Behaviour Scale; BAS = Behavioural Activation System (in orange).
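The bootstrapped 95% CIs mentioned in the caption can be approximated with a percentile bootstrap over pairs of observed and predicted values. The sketch below is one common way to do this and is not necessarily the procedure used for the figure. The number of resamples, the seed, and the simulated data are all assumptions made for illustration.

```python
import random
from math import sqrt
from typing import List, Sequence, Tuple


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def bootstrap_ci(
    observed: Sequence[float],
    predicted: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap CI for Pearson's r, resampling children with replacement."""
    rng = random.Random(seed)
    n = len(observed)
    stats: List[float] = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(
            pearson([observed[i] for i in idx], [predicted[i] for i in idx])
        )
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Simulated observed and predicted scores pooled across all test sites.
data_rng = random.Random(1)
observed = [data_rng.gauss(0, 1) for _ in range(60)]
predicted = [0.5 * o + data_rng.gauss(0, 1) for o in observed]
ci_lower, ci_upper = bootstrap_ci(observed, predicted)
```

Note that, as the caption states, the interval is computed on the pooled test-set predictions, whereas the reported r is an average of per-site correlations, so the two summaries need not be centred on exactly the same value.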
Table 1. Performance metrics for predictive models, predicting cognitive abilities from mental health, neuroimaging, polygenic scores, and socio-demographics, lifestyles, and developmental adverse events.
The metrics were averaged across test sites with standard deviations in parentheses.
| Features | Correlation | R2 | MAE | RMSE |
|---|---|---|---|---|
| Baseline | ||||
| Mental Health | 0.353 (0.051) | 0.124 (0.038) | 0.736 (0.019) | 0.934 (0.02) |
| CBCL | 0.272 (0.048) | 0.074 (0.028) | 0.758 (0.014) | 0.961 (0.015) |
| Child personality | 0.268 (0.058) | 0.071 (0.034) | 0.759 (0.019) | 0.962 (0.017) |
| Neuroimaging | 0.539 (0.073) | 0.291 (0.082) | 0.658 (0.039) | 0.839 (0.05) |
| Polygenic scores | 0.252 (0.056) | 0.02 (0.075) | 0.696 (0.055) | 0.884 (0.066) |
| Socio-demo Life Dev Adv | 0.486 (0.081) | 0.239 (0.084) | 0.686 (0.041) | 0.87 (0.049) |
| Follow-up | ||||
| Mental Health | 0.36 (0.07) | 0.116 (0.061) | 0.715 (0.043) | 0.903 (0.051) |
| CBCL | 0.24 (0.056) | 0.043 (0.034) | 0.746 (0.045) | 0.94 (0.053) |
| Child personality | 0.311 (0.076) | 0.084 (0.059) | 0.728 (0.046) | 0.919 (0.051) |
| Neuroimaging | 0.524 (0.097) | 0.266 (0.112) | 0.645 (0.038) | 0.818 (0.053) |
| Polygenic scores | 0.25 (0.075) | 0.031 (0.068) | 0.672 (0.053) | 0.854 (0.068) |
| Socio-demo Life Dev Adv | 0.488 (0.093) | 0.226 (0.096) | 0.664 (0.044) | 0.843 (0.05) |
R2=coefficient of determination; MAE = mean-absolute error; RMSE = root mean square error.
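For readers less familiar with these metrics, the sketch below shows how the four columns could be computed for one test site and then summarised across sites. It assumes the standard definitions, with the coefficient of determination computed as one minus the ratio of residual to total sum of squares; the toy values and the function name `performance` are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Dict, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def performance(
    observed: Sequence[float], predicted: Sequence[float]
) -> Dict[str, float]:
    """Correlation, R2, MAE, and RMSE for one test site."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    grand_mean = sum(observed) / n
    ss_res = sum(e ** 2 for e in residuals)
    ss_tot = sum((o - grand_mean) ** 2 for o in observed)
    return {
        "r": pearson(observed, predicted),
        "r2": 1 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in residuals) / n,
        "rmse": sqrt(ss_res / n),
    }


# Toy test site: predictions track the observed order but are compressed
# towards the mean, so r is high while R2 is much lower than r squared.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.8, 3.0, 3.2, 3.5]
site_metrics = performance(observed, predicted)

# Summarising a metric across (hypothetical) test sites, as in the table.
per_site_r = [0.50, 0.55, 0.60]
r_mean, r_sd = mean(per_site_r), stdev(per_site_r)
```

Because out-of-sample R2 penalises errors in scale and offset while the correlation does not, R2 can fall well below the squared correlation. This may be why, for example, the polygenic-score row pairs a correlation of about 0.25 with an R2 of only about 0.02.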
Figure 1b illustrates the loadings and the proportion of variance in cognitive abilities explained by each PLS component. The first PLS component accounted for the highest proportion of variance, ranging from 22.3 to 25.7%. This component was primarily influenced by factors such as attention and social problems, rule-breaking and aggressive behaviours, and behavioural activation system drive. A similar pattern was observed across both time points.
Predicting cognitive abilities from neuroimaging
Figure 2, Figure 2—figure supplements 1 and 2, and Tables 1–3 illustrate the predictive performance of the opportunistic stacking models in predicting cognitive abilities from 45 sets of neuroimaging features. The predictive performance of each set of neuroimaging features varied substantially, with correlation coefficients ranging from approximately 0 (ENBack: Negative vs. Neutral Face) to around 0.4 (ENBack: 2-Back vs. 0-Back). Combining information from all 45 sets of neuroimaging features into a stacked model improved the performance to approximately r=0.54, consistent across both time points. The stacked model (R² ≈0.29) explained almost twice as much variance in cognitive abilities as the model based on the best single set of neuroimaging features (ENBack: 2-Back vs. 0-Back, R² ≈0.15). Figures 2 and 3, Figure 3—figure supplements 1–11 highlight the feature importance of the opportunistic stacking models. Across both time points, the top contributing neuroimaging features, as indicated by SHAP values, were ENBack task-fMRI contrasts, rs-fMRI, and cortical thickness.
Figure 2. Predictive models predicting cognitive abilities from neuroimaging via opportunistic stacking and polygenic scores via Elastic Net.
(a) Scatter plots between observed vs predicted cognitive abilities based on neuroimaging and polygenic scores. Cognitive abilities are based on the second-order latent variable, the g-factor, based on a confirmatory factor analysis of six cognitive tasks. All data points are from test sets. r is the average Pearson’s r across 21 test sites. The parentheses following the r indicate bootstrapped 95% CIs, calculated based on observed vs predicted cognitive abilities from all test sites combined. (b) Feature importance of the stacking layer of neuroimaging, predicting cognitive abilities via Random Forest. For the stacking layer of neuroimaging, the feature importance was based on the absolute value of SHapley Additive exPlanations (SHAP), averaged across test sites. A higher absolute value of SHAP indicates a higher contribution to the prediction. Error bars reflect standard deviations across sites. Different sets of neuroimaging features were filled with different colours: pink for dMRI, orange for fMRI, purple for resting-state functional MRI (rs-fMRI), and green for structural MRI (sMRI). (c) Feature importance of polygenic scores, predicting cognitive abilities via Elastic Net. For polygenic scores, the feature importance was based on the Elastic Net coefficients, averaged across test sites. We also plotted Pearson’s correlations between each polygenic score and cognitive abilities computed from the full data. Error bars reflect 95% CIs of these correlations.
Figure 2—figure supplement 1. Scatter plots between observed vs predicted cognitive abilities based on each set of 45 neuroimaging features in the baseline data.
Figure 2—figure supplement 2. Scatter plots between observed vs predicted cognitive abilities based on each set of 45 neuroimaging features in the follow-up data.
Table 2. Performance metrics for predictive models, predicting cognitive abilities from the 45 sets of neuroimaging features in the baseline data.
The metrics were averaged across test sites with standard deviations in parentheses.
| Features | Correlation | R2 | MAE | RMSE |
|---|---|---|---|---|
| Neuroimaging | 0.539 (0.073) | 0.291 (0.082) | 0.658 (0.039) | 0.839 (0.05) |
| ENback 2back vs 0back | 0.393 (0.048) | 0.147 (0.042) | 0.661 (0.038) | 0.841 (0.045) |
| ENback 2back | 0.367 (0.06) | 0.128 (0.048) | 0.667 (0.036) | 0.848 (0.043) |
| rsfMRI temporal variance | 0.3 (0.094) | 0.09 (0.054) | 0.728 (0.04) | 0.921 (0.045) |
| rsfMRI cortical FC | 0.299 (0.055) | 0.088 (0.034) | 0.734 (0.027) | 0.929 (0.032) |
| ENback emotion | 0.277 (0.06) | 0.07 (0.041) | 0.689 (0.031) | 0.876 (0.035) |
| Cortical thickness | 0.265 (0.1) | 0.072 (0.055) | 0.756 (0.026) | 0.96 (0.03) |
| T2 gray matter avg intensity | 0.264 (0.106) | 0.069 (0.064) | 0.752 (0.032) | 0.953 (0.035) |
| T1 gray matter avg intensity | 0.263 (0.103) | 0.063 (0.071) | 0.761 (0.033) | 0.965 (0.039) |
| ENback 0back | 0.261 (0.058) | 0.061 (0.038) | 0.688 (0.031) | 0.878 (0.035) |
| T1 white matter avg intensity | 0.26 (0.103) | 0.067 (0.063) | 0.76 (0.029) | 0.963 (0.035) |
| rsfMRI subcortical-network FC | 0.258 (0.083) | 0.066 (0.043) | 0.743 (0.033) | 0.94 (0.035) |
| ENback place | 0.239 (0.065) | 0.049 (0.041) | 0.695 (0.032) | 0.886 (0.038) |
| T2 white matter avg intensity | 0.238 (0.103) | 0.056 (0.056) | 0.756 (0.03) | 0.96 (0.031) |
| T2 normalised intensity | 0.236 (0.082) | 0.057 (0.041) | 0.755 (0.021) | 0.96 (0.024) |
| DTI | 0.23 (0.074) | 0.042 (0.048) | 0.762 (0.027) | 0.967 (0.029) |
| Cortical volume | 0.228 (0.095) | 0.053 (0.044) | 0.767 (0.02) | 0.971 (0.024) |
| MID Small Rew vs Neu anticipation | 0.223 (0.049) | 0.048 (0.022) | 0.743 (0.017) | 0.938 (0.02) |
| Cortical area | 0.218 (0.101) | 0.049 (0.046) | 0.768 (0.021) | 0.973 (0.025) |
| T1 normalised intensity | 0.215 (0.109) | 0.047 (0.049) | 0.769 (0.022) | 0.974 (0.028) |
| MID Reward vs Neutral anticipation | 0.214 (0.062) | 0.043 (0.028) | 0.745 (0.022) | 0.944 (0.024) |
| MID Loss vs Neutral anticipation | 0.214 (0.075) | 0.043 (0.034) | 0.745 (0.025) | 0.944 (0.028) |
| MID Small Loss vs Neu anticipation | 0.203 (0.073) | 0.038 (0.03) | 0.747 (0.026) | 0.945 (0.026) |
| MID Pos vs Neg Punishment Feedback | 0.202 (0.066) | 0.037 (0.027) | 0.745 (0.021) | 0.945 (0.026) |
| T1 subcortical avg intensity | 0.2 (0.087) | 0.037 (0.043) | 0.773 (0.023) | 0.979 (0.026) |
| MID Large Rew vs Neu anticipation | 0.2 (0.072) | 0.037 (0.03) | 0.747 (0.021) | 0.946 (0.024) |
| MID Pos vs Neg Reward Feedback | 0.198 (0.05) | 0.036 (0.02) | 0.748 (0.022) | 0.945 (0.028) |
| T1 summations | 0.196 (0.08) | 0.009 (0.059) | 0.784 (0.029) | 0.992 (0.033) |
| Sulcal depth | 0.18 (0.095) | 0.032 (0.039) | 0.777 (0.02) | 0.984 (0.026) |
| MID Large Loss vs Neu anticipation | 0.173 (0.066) | 0.026 (0.026) | 0.749 (0.022) | 0.95 (0.025) |
| Subcortical volume | 0.17 (0.078) | 0.028 (0.029) | 0.775 (0.018) | 0.982 (0.021) |
| SST Any Stop vs Correct Go | 0.164 (0.065) | 0.022 (0.025) | 0.736 (0.038) | 0.935 (0.043) |
| T2 subcortical avg intensity | 0.158 (0.057) | 0.023 (0.023) | 0.77 (0.018) | 0.977 (0.02) |
| ENback Face vs Place | 0.148 (0.076) | 0.014 (0.028) | 0.712 (0.027) | 0.904 (0.034) |
| SST Incorrect Stop vs Correct Go | 0.147 (0.059) | 0.017 (0.02) | 0.738 (0.035) | 0.937 (0.04) |
| SST Correct Stop vs Correct Go | 0.145 (0.056) | 0.017 (0.018) | 0.739 (0.033) | 0.936 (0.038) |
| SST Correct Go vs Fixation | 0.145 (0.053) | 0.017 (0.017) | 0.74 (0.033) | 0.938 (0.036) |
| MID Large Rew vs Small anticipation | 0.133 (0.05) | 0.015 (0.014) | 0.757 (0.022) | 0.956 (0.025) |
| T2 summations | 0.114 (0.053) | 0.008 (0.022) | 0.777 (0.018) | 0.984 (0.016) |
| SST Incorrect Go vs Correct Go | 0.11 (0.061) | 0.008 (0.015) | 0.744 (0.034) | 0.94 (0.038) |
| SST Correct Stop vs Incorrect Stop | 0.096 (0.068) | 0.005 (0.018) | 0.744 (0.033) | 0.943 (0.036) |
| MID Large vs Small Loss anticipation | 0.093 (0.063) | 0.006 (0.014) | 0.756 (0.024) | 0.96 (0.026) |
| SST Incorrect Go vs Incorrect Stop | 0.061 (0.039) | 0 (0.008) | 0.744 (0.032) | 0.943 (0.036) |
| ENback Positive vs Neutral Face | 0.024 (0.06) | –0.007 (0.012) | 0.716 (0.027) | 0.908 (0.034) |
| ENback Emotion vs Neutral Face | 0.019 (0.058) | –0.007 (0.01) | 0.716 (0.026) | 0.908 (0.033) |
| ENback Negative vs Neutral Face | 0.002 (0.058) | –0.007 (0.009) | 0.718 (0.024) | 0.911 (0.03) |
R2=coefficient of determination; MAE = mean-absolute error; RMSE = root mean square error.
Table 3. Performance metrics for predictive models, predicting cognitive abilities from the 45 sets of neuroimaging features in the follow-up data.
| Features | Correlation | R2 | MAE | RMSE |
|---|---|---|---|---|
| Neuroimaging | 0.524 (0.097) | 0.266 (0.112) | 0.645 (0.038) | 0.818 (0.053) |
| ENback 2back vs 0back | 0.402 (0.092) | 0.15 (0.075) | 0.671 (0.032) | 0.844 (0.041) |
| ENback 2back | 0.39 (0.083) | 0.14 (0.071) | 0.676 (0.036) | 0.848 (0.045) |
| ENback place | 0.32 (0.073) | 0.089 (0.049) | 0.695 (0.038) | 0.874 (0.047) |
| ENback emotion | 0.319 (0.076) | 0.089 (0.05) | 0.696 (0.04) | 0.876 (0.047) |
| rsfMRI cortical FC | 0.309 (0.093) | 0.081 (0.071) | 0.718 (0.037) | 0.908 (0.046) |
| ENback 0back | 0.299 (0.078) | 0.077 (0.057) | 0.7 (0.045) | 0.881 (0.052) |
| rsfMRI temporal variance | 0.297 (0.111) | 0.077 (0.071) | 0.718 (0.045) | 0.903 (0.052) |
| rsfMRI subcortical-network FC | 0.265 (0.092) | 0.056 (0.059) | 0.732 (0.039) | 0.92 (0.048) |
| Cortical thickness | 0.259 (0.106) | 0.055 (0.062) | 0.738 (0.034) | 0.932 (0.041) |
| Cortical volume | 0.243 (0.091) | 0.046 (0.049) | 0.744 (0.034) | 0.936 (0.039) |
| T1 white matter avg intensity | 0.243 (0.09) | 0.044 (0.057) | 0.742 (0.035) | 0.937 (0.042) |
| T1 gray matter avg intensity | 0.241 (0.105) | 0.04 (0.069) | 0.742 (0.039) | 0.939 (0.047) |
| Cortical area | 0.233 (0.092) | 0.041 (0.05) | 0.746 (0.032) | 0.939 (0.04) |
| T2 gray matter avg intensity | 0.226 (0.112) | 0.04 (0.064) | 0.743 (0.037) | 0.939 (0.049) |
| DTI | 0.218 (0.065) | 0.022 (0.052) | 0.747 (0.034) | 0.944 (0.041) |
| T2 white matter avg intensity | 0.213 (0.099) | 0.033 (0.057) | 0.747 (0.036) | 0.942 (0.045) |
| T1 summations | 0.213 (0.062) | 0.011 (0.046) | 0.756 (0.039) | 0.954 (0.044) |
| MID Pos vs Neg Punish Feedback | 0.208 (0.058) | 0.025 (0.033) | 0.743 (0.044) | 0.933 (0.049) |
| MID Pos vs Neg Reward Feedback | 0.196 (0.071) | 0.021 (0.042) | 0.742 (0.038) | 0.933 (0.042) |
| T2 normalised intensity | 0.195 (0.077) | 0.025 (0.035) | 0.749 (0.039) | 0.946 (0.045) |
| T1 subcortical avg intensity | 0.191 (0.094) | 0.002 (0.083) | 0.759 (0.039) | 0.957 (0.046) |
| Sulcal depth | 0.185 (0.087) | 0.018 (0.048) | 0.756 (0.034) | 0.95 (0.043) |
| MID Reward vs Neutral anticipation | 0.185 (0.078) | 0.016 (0.039) | 0.746 (0.037) | 0.937 (0.04) |
| SST Any Stop vs Correct Go | 0.184 (0.079) | 0.018 (0.034) | 0.745 (0.047) | 0.934 (0.054) |
| T1 normalised intensity | 0.181 (0.077) | 0.018 (0.036) | 0.752 (0.038) | 0.95 (0.045) |
| ENback Face vs Place | 0.179 (0.075) | 0.019 (0.03) | 0.721 (0.039) | 0.907 (0.044) |
| Subcortical volume | 0.178 (0.062) | 0.016 (0.032) | 0.752 (0.036) | 0.949 (0.041) |
| SST Correct Stop vs Correct Go | 0.175 (0.062) | 0.015 (0.026) | 0.746 (0.048) | 0.936 (0.053) |
| MID Large Rew vs Neu anticipation | 0.172 (0.055) | 0.012 (0.028) | 0.747 (0.04) | 0.939 (0.044) |
| SST Incorrect Stop vs Correct Go | 0.17 (0.085) | 0.015 (0.032) | 0.746 (0.051) | 0.936 (0.059) |
| T2 subcortical avg intensity | 0.157 (0.085) | 0.011 (0.033) | 0.755 (0.039) | 0.952 (0.043) |
| MID Small Rew vs Neu anticipation | 0.154 (0.086) | 0.007 (0.04) | 0.75 (0.04) | 0.941 (0.044) |
| MID Loss vs Neutral anticipation | 0.147 (0.07) | 0.004 (0.024) | 0.75 (0.04) | 0.942 (0.043) |
| SST Correct Go vs Fixation | 0.138 (0.065) | 0.005 (0.026) | 0.749 (0.046) | 0.938 (0.054) |
| SST Incorrect Go vs Correct Go | 0.122 (0.072) | 0.001 (0.03) | 0.752 (0.053) | 0.944 (0.059) |
| MID Large Loss vs Neu anticipation | 0.121 (0.074) | –0.004 (0.03) | 0.752 (0.04) | 0.942 (0.044) |
| T2 summations | 0.116 (0.07) | –0.003 (0.029) | 0.763 (0.041) | 0.96 (0.048) |
| MID Small Loss vs Neu Anticipation | 0.106 (0.071) | –0.005 (0.021) | 0.755 (0.041) | 0.948 (0.044) |
| SST Correct Stop vs Incorrect Stop | 0.09 (0.086) | –0.006 (0.023) | 0.754 (0.049) | 0.947 (0.057) |
| MID Large vs Small Loss Anticipation | 0.064 (0.07) | –0.012 (0.025) | 0.756 (0.043) | 0.948 (0.048) |
| MID Large vs Small Rew anticipation | 0.063 (0.059) | –0.012 (0.018) | 0.759 (0.042) | 0.952 (0.046) |
| SST Incorrect Go vs Incorrect Stop | 0.038 (0.067) | –0.014 (0.019) | 0.756 (0.052) | 0.95 (0.059) |
| ENback Positive vs Neutral Face | 0.006 (0.069) | –0.013 (0.018) | 0.732 (0.037) | 0.919 (0.044) |
| ENback Negative vs Neutral Face | –0.012 (0.031) | –0.012 (0.015) | 0.735 (0.039) | 0.923 (0.043) |
| ENback Emotion vs Neutral Face | –0.027 (0.067) | –0.014 (0.016) | 0.733 (0.038) | 0.921 (0.045) |
The metrics were averaged across test sites with standard deviations in parentheses. R2=coefficient of determination; MAE = mean-absolute error; RMSE = root mean square error.
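As an illustration of how the four metrics in this table relate to one another, the following minimal sketch computes them per held-out site and then summarises them as mean (SD) across sites. The toy observed and predicted values and the helper names are assumptions made for illustration; they are not the study's data or code. Because R2 is computed as one minus the ratio of residual to total sum of squares, it can fall below zero out of sample, as in the bottom rows of the table.

```python
import math
from statistics import mean, stdev


def pearson_r(obs, pred):
    """Pearson correlation between observed and predicted scores."""
    mo, mp = mean(obs), mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    sp = math.sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)


def r_squared(obs, pred):
    """Coefficient of determination; negative when predictions do worse than the mean."""
    mo = mean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mo) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def mae(obs, pred):
    """Mean absolute error."""
    return mean(abs(o - p) for o, p in zip(obs, pred))


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(mean((o - p) ** 2 for o, p in zip(obs, pred)))


def summarise_across_sites(per_site):
    """per_site maps a site label to (observed, predicted); returns metric -> (mean, SD)."""
    metrics = {"r": pearson_r, "R2": r_squared, "MAE": mae, "RMSE": rmse}
    summary = {}
    for name, fn in metrics.items():
        values = [fn(obs, pred) for obs, pred in per_site.values()]
        summary[name] = (mean(values), stdev(values))
    return summary


# Hypothetical toy data for two held-out sites (not ABCD data).
toy = {
    "site_A": ([0.0, 1.0, 2.0, 3.0], [0.5, 1.0, 1.5, 2.0]),
    "site_B": ([1.0, 2.0, 3.0, 4.0], [1.0, 2.5, 2.5, 4.0]),
}
summary = summarise_across_sites(toy)
for name, (avg, sd) in summary.items():
    print(f"{name}: {avg:.3f} ({sd:.3f})")
```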
Figure 3. Feature importance of each set of neuroimaging features, predicting cognitive abilities in the baseline data.
The feature importance was based on the Elastic Net coefficients, averaged across test sites. We did not order these sets of neuroimaging features according to their contribution to the stacking layer (see Figure 2). Larger versions of the feature importance for each set of neuroimaging features can be found in Figure 3—figure supplements 1–11. MID = Monetary Incentive Delay task; SST = Stop Signal Task; DTI = Diffusion Tensor Imaging; FC = functional connectivity.
Figure 3—figure supplement 1. Feature importance of each set of neuroimaging features, predicting cognitive abilities in the follow-up data via Elastic Net.
Figure 3—figure supplement 2. Feature importance of Nback task-fMRI features, predicting cognitive abilities in the baseline data via Elastic Net.
Figure 3—figure supplement 3. Feature importance of MID task-fMRI features, predicting cognitive abilities in the baseline data via Elastic Net.
Figure 3—figure supplement 4. Feature importance of SST task-fMRI features, predicting cognitive abilities in the baseline data via Elastic Net.
Figure 3—figure supplement 5. Feature importance of resting-state functional MRI (rs-fMRI) features, predicting cognitive abilities in the baseline data via Elastic Net.
Figure 3—figure supplement 6. Feature importance of structural MRI (sMRI) and dMRI features, predicting cognitive abilities in the baseline data via Elastic Net.
Figure 3—figure supplement 7. Feature importance of Nback task-fMRI features, predicting cognitive abilities in the follow-up data via Elastic Net.
Figure 3—figure supplement 8. Feature importance of monetary incentive delay (MID) task-fMRI features, predicting cognitive abilities in the follow-up data via Elastic Net.
Figure 3—figure supplement 9. Feature importance of SST task-fMRI features, predicting cognitive abilities in the follow-up data via Elastic Net.
Figure 3—figure supplement 10. Feature importance of resting-state functional MRI (rs-fMRI) features, predicting cognitive abilities in the follow-up data via Elastic Net.
Figure 3—figure supplement 11. Feature importance of structural MRI (sMRI) and dMRI features, predicting cognitive abilities in the follow-up data via Elastic Net.
Predicting cognitive abilities from polygenic scores
Figure 2a and Table 1 illustrate the predictive performance of the Elastic Net models in predicting cognitive abilities using three polygenic scores (PGSs). The predictive accuracy of these PGSs was r=0.25 at both baseline and follow-up. Figure 2c highlights the feature importance within these models, indicating a stronger contribution from the PGS based on the Savage et al., 2018 GWAS.
Predicting cognitive abilities from socio-demographics, lifestyles, and developmental adverse events
Figure 4a and Table 1 illustrate the predictive performance of the PLS models in predicting cognitive abilities from socio-demographics, lifestyles, and developmental adverse events. Using 44 features covering these areas, the predictive performance was around r=0.49, consistent across the two time points. Figure 4b shows the loadings and the proportion of variance explained by these PLS models. The first PLS component accounted for the highest proportion of variance (around 10%).
Figure 4. Predictive models, predicting cognitive abilities from socio-demographics, lifestyles, and developmental adverse events via Partial Least Squares (PLS).
(a) Scatter plots between observed vs predicted cognitive abilities based on socio-demographics, lifestyles, and developmental adverse events. Cognitive abilities are based on the second-order latent variable, the g-factor, based on a confirmatory factor analysis of six cognitive tasks. All data points are from test sets. r is the average Pearson’s r across 21 test sites. The parentheses following the r indicate bootstrapped 95% CIs, calculated based on observed vs predicted cognitive abilities from all test sites combined. (b) Feature importance of socio-demographics, lifestyles, and developmental adverse events, predicting cognitive abilities via Partial Least Squares. The features were ordered based on the loading of the first component. Univariate correlations were Pearson’s correlations between each feature and cognitive abilities. Error bars reflect 95% CIs of the correlations. Different types of environmental factors were filled with different colours: orange for socio-demographics, purple for developmental adverse events and green for lifestyle. A dashed horizontal line in the follow-up feature importance figure distinguishes whether the variables were collected at baseline or follow-up.
Based on its loadings, this first component was (a) positively influenced by features such as parental income and education, neighbourhood safety, and extracurricular activities, and (b) negatively influenced by features such as area deprivation, having a single parent, screen use, economic insecurities, lack of sleep, playing mature video games, watching mature movies, and lead exposure.
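The bootstrapped 95% CIs mentioned in the Figure 4 caption can be sketched as follows. This is a minimal percentile bootstrap over pooled observed vs predicted scores; the percentile method, the number of resamples, and the simulated data are assumptions for illustration, since the caption does not specify them.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def bootstrap_ci(obs, pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for r, resampling participants with replacement."""
    rng = random.Random(seed)
    n = len(obs)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(pearson_r([obs[i] for i in idx], [pred[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Simulated stand-in for pooled test-set scores (hypothetical, not ABCD data).
rng = random.Random(1)
obs = [rng.gauss(0.0, 1.0) for _ in range(200)]
pred = [0.5 * o + rng.gauss(0.0, 1.0) for o in obs]

r_hat = pearson_r(obs, pred)
lo, hi = bootstrap_ci(obs, pred)
print(f"r = {r_hat:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```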
Commonality analyses
We conducted four separate sets of commonality analyses.
Commonality analyses for proxy measures of cognitive abilities based on mental health and neuroimaging
At baseline, having both proxy measures based on mental health and neuroimaging in a linear mixed model explained 27% of the variance in cognitive abilities. Specifically, 9.8% of the variance in cognitive abilities was explained by mental health, which included the common effect between the two proxy measures (6.48%) and the unique effect of mental health (3.32%) (see Tables 4–5 and Figure 5). This indicates that 66% of the relationship between cognitive abilities and mental health, i.e., (6.48 ÷ 9.8) × 100, was shared with neuroimaging. The common effects varied considerably across different sets of neuroimaging features, ranging from approximately 0.08% to 2.78%, with the highest being the ENBack task fMRI: 2-Back vs. 0-Back (see Figure 5—figure supplement 1). The pattern of results was consistent across both time points.
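The arithmetic behind a two-predictor commonality analysis can be sketched as below, using the rounded baseline marginal R2 values reported in Table 4 (0.272 for the joint model, 0.098 for mental health alone, 0.238 for neuroimaging alone). The function name is hypothetical. Because the tabulated values are rounded to three decimals, the output (about 6.4% common, 3.4% unique, 65% shared) differs slightly from the unrounded figures quoted in the text (6.48%, 3.32%, 66%).

```python
def commonality_two(r2_full, r2_a, r2_b):
    """Unique and common effects for two predictors from three model R2 values."""
    unique_a = r2_full - r2_b          # variance only predictor A explains
    unique_b = r2_full - r2_a          # variance only predictor B explains
    common = r2_a + r2_b - r2_full     # variance explained by both
    return {
        "unique_a": unique_a,
        "unique_b": unique_b,
        "common": common,
        "share_of_a_common": common / r2_a,
    }


# A = mental health, B = neuroimaging; rounded marginal R2 from Table 4.
res = commonality_two(r2_full=0.272, r2_a=0.098, r2_b=0.238)
for key, value in res.items():
    print(f"{key}: {100 * value:.1f}%")
```

The three effects sum to the joint model's R2, which provides a quick consistency check on any reported decomposition.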
Table 4. Results of linear-mixed models using proxy measures of cognitive abilities based on mental health and/or neuroimaging as regressors to explain cognitive abilities across test sites in the baseline.
| Response | Cognitive abilities | | | Cognitive abilities | | | Cognitive abilities | | |
|---|---|---|---|---|---|---|---|---|---|
| Regressors | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 0.02 | –0.00–0.03 | 0.058 | 0.02 | –0.00–0.04 | 0.057 | 0.02 | –0.00–0.03 | 0.067 |
| mental savg | 0.00 | –0.02–0.02 | 0.895 | 0.00 | –0.02–0.02 | 0.985 | | | |
| mental cws | 0.19 | 0.17–0.20 | <0.001 | 0.31 | 0.29–0.33 | <0.001 | | | |
| neuroimaging savg | –0.01 | –0.02–0.01 | 0.507 | | | | –0.01 | –0.02–0.01 | 0.523 |
| neuroimaging cws | 0.43 | 0.41–0.44 | <0.001 | | | | 0.48 | 0.47–0.50 | <0.001 |
| Random Effects | | | | | | | | | |
| σ2 | 0.55 | | | 0.54 | | | 0.57 | | |
| τ00 | 0.17 SITE_ID_L:REL_FAMILY_ID | | | 0.35 SITE_ID_L:REL_FAMILY_ID | | | 0.18 SITE_ID_L:REL_FAMILY_ID | | |
| ICC | 0.24 | | | 0.39 | | | 0.24 | | |
| N | 21 SITE_ID_L | | | 21 SITE_ID_L | | | 21 SITE_ID_L | | |
| | 9001 REL_FAMILY_ID | | | 9001 REL_FAMILY_ID | | | 9001 REL_FAMILY_ID | | |
| Observations | 10728 | | | 10728 | | | 10728 | | |
| Marginal R2 | 0.272 | | | 0.098 | | | 0.238 | | |
| Conditional R2 | 0.444 | | | 0.452 | | | 0.423 | | |
cws = values centred within each site; savg = values averaged within each site.
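The random-effect summaries in Table 4 are tied together by simple variance ratios, sketched below. The sketch assumes a random-intercept model and the common Nakagawa-style definitions of marginal and conditional R2; these definitions are an assumption here rather than something stated in this section, although they reproduce the tabulated ICC and conditional R2 values to within rounding. The function names are hypothetical.

```python
def icc(tau00, sigma2):
    """Intraclass correlation: share of residual variance due to the grouping factor."""
    return tau00 / (tau00 + sigma2)


def conditional_r2(marginal_r2, tau00, sigma2):
    """Conditional R2 implied by the marginal R2 and the two variance components."""
    # Solve marginal = var_fixed / (var_fixed + tau00 + sigma2) for var_fixed.
    var_fixed = marginal_r2 * (tau00 + sigma2) / (1.0 - marginal_r2)
    return (var_fixed + tau00) / (var_fixed + tau00 + sigma2)


# (marginal R2, tau00, sigma2) for the three baseline models in Table 4.
table4 = {
    "mental health + neuroimaging": (0.272, 0.17, 0.55),
    "mental health only": (0.098, 0.35, 0.54),
    "neuroimaging only": (0.238, 0.18, 0.57),
}

for label, (marginal, tau00, sigma2) in table4.items():
    print(
        f"{label}: ICC = {icc(tau00, sigma2):.2f}, "
        f"conditional R2 = {conditional_r2(marginal, tau00, sigma2):.3f}"
    )
```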
Table 5. Results of linear-mixed models using proxy measures of cognitive abilities based on mental health and/or neuroimaging as regressors to explain cognitive abilities across test sites in the follow-up.
| Response | Cognitive abilities | | | Cognitive abilities | | | Cognitive abilities | | |
|---|---|---|---|---|---|---|---|---|---|
| Regressors | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 0.82 | 0.80–0.84 | <0.001 | 0.82 | 0.80–0.85 | <0.001 | 0.82 | 0.80–0.84 | <0.001 |
| mental savg | 0.02 | 0.00–0.04 | 0.047 | 0.02 | 0.00–0.05 | 0.037 | | | |
| mental cws | 0.19 | 0.17–0.21 | <0.001 | 0.31 | 0.29–0.33 | <0.001 | | | |
| neuroimaging savg | 0.02 | 0.00–0.05 | 0.021 | | | | 0.03 | 0.01–0.05 | 0.012 |
| neuroimaging cws | 0.42 | 0.40–0.44 | <0.001 | | | | 0.47 | 0.45–0.49 | <0.001 |
| Random Effects | | | | | | | | | |
| σ2 | 0.41 | | | 0.45 | | | 0.42 | | |
| τ00 | 0.24 SITE_ID_L:REL_FAMILY_ID | | | 0.37 SITE_ID_L:REL_FAMILY_ID | | | 0.27 SITE_ID_L:REL_FAMILY_ID | | |
| ICC | 0.37 | | | 0.46 | | | 0.40 | | |
| N | 21 SITE_ID_L | | | 21 SITE_ID_L | | | 21 SITE_ID_L | | |
| | 5434 REL_FAMILY_ID | | | 5434 REL_FAMILY_ID | | | 5434 REL_FAMILY_ID | | |
| Observations | 6315 | | | 6315 | | | 6315 | | |
| Marginal R2 | 0.286 | | | 0.104 | | | 0.245 | | |
| Conditional R2 | 0.552 | | | 0.513 | | | 0.545 | | |
cws = values centred within each site; savg = values averaged within each site.
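The cws and savg regressors defined in the table notes can be illustrated with a short sketch that splits a proxy measure into a between-site part (the site average) and a within-site part (the deviation from that average). The function name and toy values are assumptions for illustration; any further scaling the authors may have applied is not shown.

```python
from collections import defaultdict


def split_within_between(values, sites):
    """Return (cws, savg): values centred within site, and the site averages."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, site in zip(values, sites):
        sums[site] += value
        counts[site] += 1
    site_mean = {site: sums[site] / counts[site] for site in sums}
    savg = [site_mean[site] for site in sites]
    cws = [value - site_mean[site] for value, site in zip(values, sites)]
    return cws, savg


# Hypothetical proxy scores for four children at two sites.
values = [1.0, 3.0, 10.0, 14.0]
sites = ["A", "A", "B", "B"]
cws, savg = split_within_between(values, sites)
print(cws)   # within-site deviations
print(savg)  # site averages repeated per child
```

Entering both parts as separate regressors lets the within-site slope (cws) be estimated independently of differences between sites (savg).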
Figure 5. Venn diagrams showing common and unique effects of proxy measures of cognitive abilities based on mental health, neuroimaging, polygenic scores, and/or socio-demographics, lifestyles and developmental adverse events in explaining cognitive abilities across test sites.
We computed the common and unique effects in % based on the marginal R2 of four sets of linear-mixed models.
Figure 5—figure supplement 1. Stacked bar plots showing common and unique effects of proxy measures of cognitive abilities based on each set of neuroimaging features in explaining cognitive abilities across test sites.
Commonality analyses for proxy measures of cognitive abilities based on mental health and PGSs
At baseline, having both proxy measures based on mental health and PGSs in a linear mixed model explained 9.8% of the variance in cognitive abilities (marginal R2=0.098; Table 6). Specifically, 9.21% of the variance in cognitive abilities was explained by mental health, which included the common effect between the two proxy measures (1.93%) and the unique effect of mental health (7.28%) (see Tables 6–7 and Figure 5). This indicates that 21% of the relationship between cognitive abilities and mental health, i.e., (1.93 ÷ 9.21) × 100, was shared with PGSs. The pattern of results was consistent across both time points.
Table 6. Results of linear-mixed models using proxy measures of cognitive abilities based on mental health and/or polygenic scores as regressors to explain cognitive abilities across test sites in the baseline.
| Response | Cognitive abilities | | | Cognitive abilities | | | Cognitive abilities | | |
|---|---|---|---|---|---|---|---|---|---|
| Regressors | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 0.23 | 0.21–0.26 | <0.001 | 0.23 | 0.21–0.25 | <0.001 | 0.23 | 0.21–0.26 | <0.001 |
| mental savg | 0.06 | 0.02–0.09 | 0.004 | 0.13 | 0.10–0.15 | <0.001 | | | |
| mental cws | 0.25 | 0.23–0.27 | <0.001 | 0.25 | 0.23–0.27 | <0.001 | | | |
| PGS savg favg | –0.08 | –0.12 to –0.05 | <0.001 | | | | –0.13 | –0.15 to –0.10 | <0.001 |
| PGS cws cwf | 0.05 | 0.03–0.07 | <0.001 | | | | 0.06 | 0.04–0.08 | <0.001 |
| Random Effects | | | | | | | | | |
| σ2 | 0.51 | | | 0.52 | | | 0.53 | | |
| τ00 | 0.27 SITE_ID_L:REL_FAMILY_ID | | | 0.26 SITE_ID_L:REL_FAMILY_ID | | | 0.32 SITE_ID_L:REL_FAMILY_ID | | |
| ICC | 0.34 | | | 0.33 | | | 0.38 | | |
| N | 21 SITE_ID_L | | | 21 SITE_ID_L | | | 21 SITE_ID_L | | |
| | 4734 REL_FAMILY_ID | | | 4734 REL_FAMILY_ID | | | 4734 REL_FAMILY_ID | | |
| Observations | 5766 | | | 5766 | | | 5766 | | |
| Marginal R2 | 0.098 | | | 0.092 | | | 0.026 | | |
| Conditional R2 | 0.408 | | | 0.394 | | | 0.394 | | |
cws = values centred within each site; savg = values averaged within each site; cws,cwf = values centred within each family first and then within each site; savg,favg = values averaged within each family first and then within each site. PGS = polygenic scores.
Table 7. Results of linear-mixed models using proxy measures of cognitive abilities based on mental health and/or polygenic scores as regressors to explain cognitive abilities across test sites in the follow-up.
| Response | Cognitive abilities | | | Cognitive abilities | | | Cognitive abilities | | |
|---|---|---|---|---|---|---|---|---|---|
| Regressors | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 1.06 | 1.03–1.09 | <0.001 | 1.06 | 1.03–1.09 | <0.001 | 1.06 | 1.03–1.09 | <0.001 |
| mental savg | 0.03 | –0.00–0.07 | 0.063 | 0.07 | 0.05–0.10 | <0.001 | | | |
| mental cws | 0.22 | 0.19–0.25 | <0.001 | 0.22 | 0.20–0.25 | <0.001 | | | |
| PGS savg favg | –0.07 | –0.10 to –0.04 | <0.001 | | | | –0.09 | –0.12 to –0.06 | <0.001 |
| PGS cws cwf | 0.04 | 0.02–0.06 | <0.001 | | | | 0.05 | 0.03–0.07 | <0.001 |
| Random Effects | | | | | | | | | |
| σ2 | 0.42 | | | 0.43 | | | 0.43 | | |
| τ00 | 0.32 SITE_ID_L:REL_FAMILY_ID | | | 0.31 SITE_ID_L:REL_FAMILY_ID | | | 0.37 SITE_ID_L:REL_FAMILY_ID | | |
| ICC | 0.43 | | | 0.42 | | | 0.46 | | |
| N | 21 SITE_ID_L | | | 21 SITE_ID_L | | | 21 SITE_ID_L | | |
| | 3370 REL_FAMILY_ID | | | 3370 REL_FAMILY_ID | | | 3370 REL_FAMILY_ID | | |
| Observations | 4036 | | | 4036 | | | 4036 | | |
| Marginal R2 | 0.075 | | | 0.068 | | | 0.013 | | |
| Conditional R2 | 0.470 | | | 0.460 | | | 0.469 | | |
cws = values centred within each site; savg = values averaged within each site; cws,cwf = values centred within each family first and then within each site; savg,favg = values averaged within each family first and then within each site. PGS = polygenic scores.
Commonality analyses for proxy measures of cognitive abilities based on mental health and socio-demographics, lifestyles, and developmental adverse events
At baseline, having both proxy measures based on mental health and socio-demographics, lifestyles, and developmental adverse events in a linear mixed model explained 24.9% of the variance in cognitive abilities. Specifically, 9.75% of the variance in cognitive abilities was explained by mental health, which included the common effect between the two proxy measures (6.12%) and the unique effect of mental health (3.63%) (see Tables 8–9 and Figure 5). This indicates that approximately 63% of the relationship between cognitive abilities and mental health, i.e., (6.12 ÷ 9.75) × 100, was shared with socio-demographics, lifestyles, and developmental adverse events. The pattern of results was consistent across both time points.
Table 8. Results of linear-mixed models using proxy measures of cognitive abilities based on mental health and/or socio-demographics, lifestyles, and developmental adverse events as regressors to explain cognitive abilities across test sites in the baseline.
| Response | Cognitive abilities | | | Cognitive abilities | | | Cognitive abilities | | |
|---|---|---|---|---|---|---|---|---|---|
| Regressors | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 0.01 | –0.01–0.02 | 0.525 | 0.01 | –0.01–0.03 | 0.385 | 0.01 | –0.01–0.02 | 0.558 |
| mental savg | –0.00 | –0.02–0.02 | 0.917 | –0.00 | –0.02–0.02 | 0.930 | | | |
| mental cws | 0.20 | 0.18–0.22 | <0.001 | 0.31 | 0.29–0.33 | <0.001 | | | |
| sdl savg | 0.00 | –0.02–0.02 | 0.819 | | | | 0.00 | –0.01–0.02 | 0.792 |
| sdl cws | 0.40 | 0.38–0.41 | <0.001 | | | | 0.46 | 0.44–0.48 | <0.001 |
| Random Effects | | | | | | | | | |
| σ2 | 0.52 | | | 0.53 | | | 0.54 | | |
| τ00 | 0.22 SITE_ID_L:REL_FAMILY_ID | | | 0.35 SITE_ID_L:REL_FAMILY_ID | | | 0.24 SITE_ID_L:REL_FAMILY_ID | | |
| ICC | 0.30 | | | 0.40 | | | 0.31 | | |
| N | 21 SITE_ID_L | | | 21 SITE_ID_L | | | 21 SITE_ID_L | | |
| | 9390 REL_FAMILY_ID | | | 9390 REL_FAMILY_ID | | | 9390 REL_FAMILY_ID | | |
| Observations | 11294 | | | 11294 | | | 11294 | | |
| Marginal R2 | 0.249 | | | 0.098 | | | 0.213 | | |
| Conditional R2 | 0.474 | | | 0.458 | | | 0.456 | | |
cws = values centred within each site; savg = values averaged within each site; sdl = socio-demographics, lifestyles and developmental adverse events.
Table 9. Results of linear-mixed models using proxy measures of cognitive abilities based on mental health and/or socio-demographics, lifestyles and developmental adverse events as regressors to explain cognitive abilities across test sites in the follow-up.
| Response | Cognitive abilities | | | Cognitive abilities | | | Cognitive abilities | | |
|---|---|---|---|---|---|---|---|---|---|
| Regressors | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 0.83 | 0.81–0.85 | <0.001 | 0.83 | 0.81–0.86 | <0.001 | 0.83 | 0.81–0.85 | <0.001 |
| mental savg | 0.01 | –0.01–0.03 | 0.185 | 0.01 | –0.01–0.04 | 0.198 | | | |
| mental cws | 0.20 | 0.18–0.22 | <0.001 | 0.30 | 0.28–0.32 | <0.001 | | | |
| sdl savg | 0.00 | –0.02–0.02 | 0.957 | | | | 0.00 | –0.02–0.02 | 0.757 |
| sdl cws | 0.39 | 0.37–0.41 | <0.001 | | | | 0.44 | 0.42–0.47 | <0.001 |
| Random Effects | | | | | | | | | |
| σ2 | 0.42 | | | 0.45 | | | 0.43 | | |
| τ00 | 0.27 SITE_ID_L:REL_FAMILY_ID | | | 0.37 SITE_ID_L:REL_FAMILY_ID | | | 0.30 SITE_ID_L:REL_FAMILY_ID | | |
| ICC | 0.39 | | | 0.45 | | | 0.41 | | |
| N | 21 SITE_ID_L | | | 21 SITE_ID_L | | | 21 SITE_ID_L | | |
| | 6217 REL_FAMILY_ID | | | 6217 REL_FAMILY_ID | | | 6217 REL_FAMILY_ID | | |
| Observations | 7382 | | | 7382 | | | 7382 | | |
| Marginal R2 | 0.256 | | | 0.099 | | | 0.213 | | |
| Conditional R2 | 0.543 | | | 0.508 | | | 0.535 | | |
cws = values centred within each site; savg = values averaged within each site; sdl = socio-demographics, lifestyles and developmental adverse events.
Commonality analyses for proxy measures of cognitive abilities based on mental health, neuroimaging, PGSs and socio-demographics, lifestyles, and developmental adverse events
At baseline, having all four proxy measures based on mental health, neuroimaging, PGSs, and socio-demographics, lifestyles, and developmental adverse events in a linear mixed model explained 24.2% of the variance in cognitive abilities. Of the 8.97% of the variance in cognitive abilities explained by mental health, 7.05% represented common effects with the other proxy measures. This indicates that 79%, i.e., (7.05 ÷ 8.97) × 100, of the relationship between cognitive abilities and mental health was shared with the three other proxy measures (see Tables 10–11 and Figure 5). Additionally, among the variance that socio-demographics, lifestyles, and developmental adverse events accounted for in the relationship between cognitive abilities and mental health, neuroimaging could capture 58%, while PGSs could capture 21%. The pattern of results was consistent across both time points.
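With more than two proxy measures, each commonality coefficient is an alternating sum of R2 values over nested models. The sketch below implements the general rule for any number of predictors and, for brevity, demonstrates it with three predictors and hypothetical R2 values (a full four-predictor analysis needs the R2 of all 15 predictor subsets, which are not tabulated here). All names and numbers in the example are illustrative assumptions, not the study's results.

```python
from itertools import combinations


def all_subsets(items):
    """Yield every subset of items (including the empty set) as a frozenset."""
    items = tuple(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def commonality(r2, predictors):
    """Map each non-empty predictor subset to its commonality coefficient.

    r2 maps frozenset(predictor names) -> R2 of the model with those predictors.
    """
    full = frozenset(predictors)
    r2 = dict(r2)
    r2[frozenset()] = 0.0  # a model with no predictors explains nothing
    coefs = {}
    for subset in all_subsets(predictors):
        if not subset:
            continue
        rest = full - subset
        total = 0.0
        for part in all_subsets(sorted(subset)):
            sign = 1.0 if len(part) % 2 == 1 else -1.0  # (-1) ** (len(part) + 1)
            total += sign * r2[rest | part]
        coefs[subset] = total
    return coefs


# Hypothetical R2 values for every subset of three proxy measures.
hyp = {
    frozenset({"mental"}): 0.10,
    frozenset({"neuro"}): 0.24,
    frozenset({"env"}): 0.21,
    frozenset({"mental", "neuro"}): 0.27,
    frozenset({"mental", "env"}): 0.25,
    frozenset({"neuro", "env"}): 0.30,
    frozenset({"mental", "neuro", "env"}): 0.32,
}
coefs = commonality(hyp, ["mental", "neuro", "env"])

unique_mental = coefs[frozenset({"mental"})]
explained_by_mental = sum(v for s, v in coefs.items() if "mental" in s)
shared = explained_by_mental - unique_mental
print(f"shared with other proxies: {100 * shared / explained_by_mental:.0f}%")
```

Two identities are useful checks: all coefficients sum to the full model's R2, and the coefficients involving a given predictor sum to that predictor's R2 when entered alone.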
Table 10. Results of linear-mixed models using proxy measures of cognitive abilities based on mental health, neuroimaging, polygenic scores and/or socio-demographics, lifestyles and developmental adverse events as regressors to explain cognitive abilities across test sites in the baseline.
| Response | Cognitive abilities | | | Cognitive abilities | | |
|---|---|---|---|---|---|---|
| Regressors | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 0.24 | 0.21–0.26 | <0.001 | 0.24 | 0.21–0.26 | <0.001 |
| mental savg | 0.00 | –0.05–0.05 | 0.975 | 0.09 | 0.05–0.12 | <0.001 |
| mental cws | 0.14 | 0.11–0.16 | <0.001 | 0.18 | 0.15–0.20 | <0.001 |
| neuroimaging savg | 0.01 | –0.03–0.05 | 0.533 | 0.05 | 0.01–0.09 | 0.006 |
| neuroimaging cws | 0.26 | 0.24–0.29 | <0.001 | 0.31 | 0.28–0.33 | <0.001 |
| PGS savg favg | –0.04 | –0.08–0.00 | 0.070 | | | |
| PGS cws cwf | 0.05 | 0.03–0.07 | <0.001 | | | |
| sdl savg | 0.09 | 0.03–0.16 | 0.006 | | | |
| sdl cws | 0.18 | 0.16–0.21 | <0.001 | | | |
| σ2 | 0.50 | | | 0.52 | | |
| τ00 | 0.15 SITE_ID_L:REL_FAMILY_ID | | | 0.17 SITE_ID_L:REL_FAMILY_ID | | |
| ICC | 0.23 | | | 0.25 | | |
| N | 21 SITE_ID_L | | | 21 SITE_ID_L | | |
| Observations | 5520 | | | 5520 | | |
| Marginal R2 | 0.241 | | | 0.197 | | |
| Conditional R2 | 0.416 | | | 0.395 | | |
| Regressors | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 0.24 | 0.21–0.26 | <0.001 | 0.24 | 0.21–0.26 | <0.001 |
| mental savg | 0.06 | 0.03–0.10 | 0.001 | 0.00 | –0.04–0.05 | 0.890 |
| mental cws | 0.24 | 0.22–0.27 | <0.001 | 0.19 | 0.16–0.21 | <0.001 |
| neuroimaging savg | | | | | | |
| neuroimaging cws | | | | | | |
| PGS savg favg | –0.08 | –0.12 to –0.05 | <0.001 | | | |
| PGS cws cwf | 0.06 | 0.04–0.08 | <0.001 | | | |
| sdl savg | | | | 0.14 | 0.09–0.19 | <0.001 |
| sdl cws | | | | 0.25 | 0.22–0.27 | <0.001 |
| σ2 | 0.51 | | | 0.52 | | |
| τ00 | 0.27 SITE_ID_L:REL_FAMILY_ID | | | 0.20 SITE_ID_L:REL_FAMILY_ID | | |
| ICC | 0.34 | | | 0.28 | | |
| N | 21 SITE_ID_L | | | 21 SITE_ID_L | | |
| | 4571 REL_FAMILY_ID | | | 4571 REL_FAMILY_ID | | |
| Observations | 5520 | | | 5520 | | |
| Marginal R2 | 0.097 | | | 0.163 | | |
| Conditional R2 | 0.408 | | | 0.395 | | |
cws = values centred within each site; savg = values averaged within each site; cws,cwf = values centred within each family first and then within each site; savg,favg = values averaged within each family first and then within each site; PGS = polygenic scores; sdl = socio-demographics, lifestyles and developmental adverse events.
Table 11. Results of linear-mixed models using proxy measures of cognitive abilities based on mental health, neuroimaging, polygenic scores and/or socio-demographics, lifestyles, and developmental adverse events as regressors to explain cognitive abilities across test sites in the follow-up.
| Response | Cognitive abilities | | | Cognitive abilities | | |
|---|---|---|---|---|---|---|
| Regressors | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 1.05 | 1.02–1.08 | <0.001 | 1.05 | 1.02–1.08 | <0.001 |
| mental savg | 0.05 | –0.01–0.10 | 0.100 | 0.06 | 0.03–0.10 | <0.001 |
| mental cws | 0.13 | 0.11–0.16 | <0.001 | 0.17 | 0.14–0.20 | <0.001 |
| neuroimaging savg | 0.00 | –0.06–0.06 | 0.935 | 0.03 | –0.01–0.06 | 0.146 |
| neuroimaging cws | 0.27 | 0.24–0.30 | <0.001 | 0.31 | 0.28–0.33 | <0.001 |
| PGS savg favg | 0.00 | –0.03–0.04 | 0.833 | | | |
| PGS cws cwf | 0.04 | 0.02–0.06 | <0.001 | | | |
| sdl savg | 0.04 | –0.04–0.12 | 0.349 | | | |
| sdl cws | 0.20 | 0.17–0.23 | <0.001 | | | |
| σ2 | 0.38 | | | 0.40 | | |
| τ00 | 0.23 SITE_ID_L:REL_FAMILY_ID | | | 0.25 SITE_ID_L:REL_FAMILY_ID | | |
| ICC | 0.38 | | | 0.39 | | |
| N | 21 SITE_ID_L | | | 21 SITE_ID_L | | |
| | 2930 REL_FAMILY_ID | | | 2930 REL_FAMILY_ID | | |
| Observations | 3423 | | | 3423 | | |
| Marginal R2 | 0.242 | | | 0.190 | | |
| Conditional R2 | 0.527 | | | 0.506 | | |
| Regressors | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 1.05 | 1.02–1.08 | <0.001 | 1.05 | 1.02–1.08 | <0.001 |
| mental savg | 0.08 | 0.04–0.11 | <0.001 | 0.05 | –0.00–0.10 | 0.074 |
| mental cws | 0.23 | 0.20–0.26 | <0.001 | 0.18 | 0.15–0.21 | <0.001 |
| neuroimaging savg | | | | | | |
| neuroimaging cws | | | | | | |
| PGS savg favg | 0.00 | –0.03–0.04 | 0.844 | | | |
| PGS cws cwf | 0.05 | 0.03–0.07 | <0.001 | | | |
| sdl savg | | | | 0.04 | –0.01–0.09 | 0.092 |
| sdl cws | | | | 0.25 | 0.22–0.28 | <0.001 |
| σ2 | 0.41 | | | 0.42 | | |
| τ00 | 0.33 SITE_ID_L:REL_FAMILY_ID | | | 0.27 SITE_ID_L:REL_FAMILY_ID | | |
| ICC | 0.45 | | | 0.39 | | |
| N | 21 SITE_ID_L | | | 21 SITE_ID_L | | |
| | 2930 REL_FAMILY_ID | | | 2930 REL_FAMILY_ID | | |
| Observations | 3423 | | | 3423 | | |
| Marginal R2 | 0.076 | | | 0.153 | | |
| Conditional R2 | 0.491 | | | 0.486 | | |
cws = values centred within each site; savg = values averaged within each site; cws,cwf = values centred within each family first and then within each site; savg,favg = values averaged within each family first and then within each site; PGS = polygenic scores; sdl = socio-demographics, lifestyles and developmental adverse events.
Discussion
We aimed to understand the extent to which the relationship between cognitive abilities and mental health is represented in part by cognitive abilities at the neural and genetic levels of analysis. We began by quantifying the relationship between cognitive abilities and mental health, finding a medium-sized out-of-sample correlation of approximately r=0.36. This relationship was shared with neuroimaging (66% at baseline) and PGS (21% at baseline), based on two separate sets of commonality analyses. This suggests that these two neurobiological units of analysis play significant roles in shaping the relationship between cognitive abilities and mental health (Morris and Cuthbert, 2012). We also found that the relationship between cognitive abilities and mental health was partly shared with environmental factors, as measured by socio-demographics, lifestyles, and developmental adverse events (63% at baseline). In another set of commonality analyses, this variance due to socio-demographics, lifestyles, and developmental adverse events was explained by neuroimaging and PGS at 58% and 21%, respectively, at baseline. Accordingly, the neurobiological units of analysis for cognitive abilities captured the environmental factors, consistent with RDoC’s viewpoint (Morris et al., 2022). Notably, this pattern of results remained stable over two years in early adolescence.
Our predictive modelling revealed a medium-sized predictive relationship between cognitive abilities and mental health. This finding aligns with recent meta-analyses of case-control studies that link cognitive abilities and mental disorders across various psychiatric conditions (Abramovitch et al., 2021; East-Richard et al., 2020). Unlike previous studies, we estimated the predictive, out-of-sample relationship between cognitive abilities and mental disorders in a large normative sample of children. Although our predictive models, like other cross-sectional models, cannot determine the directionality of the effects, the strength of the relationship between cognitive abilities and mental health estimated here should be more robust than when calculated using the same sample as the model itself, known as in-sample prediction/association (Marek et al., 2022; Yarkoni and Westfall, 2017). Examining the PLS loadings of our predictive models revealed that the relationship was driven by various aspects of mental health, including thought and externalising symptoms, as well as motivation. This suggests that there are multiple pathways—encompassing a broad range of emotional and behavioural problems and temperaments—through which cognitive abilities and mental health are linked.
Our predictive modelling created proxy measures of cognitive abilities based on two neurobiological units of analysis: neuroimaging and PGS (Morris and Cuthbert, 2012). For neuroimaging, inspired by recent BWAS benchmarks (Engemann et al., 2020; Marek et al., 2022), we used a multivariate modelling technique called opportunistic stacking, which integrates information across various MRI features and modalities. Combining 45 sets of neuroimaging features resulted in relatively high predictive performance (out-of-sample r=0.54 at baseline), compared to using any single set. This finding aligns with previous research that pooled multiple neuroimaging modalities (Engemann et al., 2020; Rasero et al., 2021; Tetereva et al., 2022). This level of predictive performance is numerically higher than that found in a recent meta-analysis, which mainly included studies using only one set of neuroimaging features, with an r of 0.42 (Vieira et al., 2022). Moreover, this performance level in predicting cognitive abilities is nearly the same as our previous attempt using a similar stacking technique to integrate MRI modalities in young adult samples from the Human Connectome Project (HCP) (Van Essen et al., 2013), which achieved an out-of-sample r=0.57 (Tetereva et al., 2022). Similarly, in the current study, the top contributing set of neuroimaging features, the 2-Back vs. 0-Back task fMRI, was consistent with previous studies using the HCP (Sripada et al., 2020; Tetereva et al., 2022). Altogether, this demonstrates the robustness of our proxy measure of cognitive abilities based on multimodal neuroimaging. In addition to predictive performance, opportunistic stacking offers the added benefit of handling missing values (Engemann et al., 2020; Pat et al., 2022b), allowing us to retain data from 10,754 participants who completed the cognitive tasks at baseline and had at least one set of neuroimaging features. Consequently, with opportunistic stacking, we were more likely to retain MRI data from participants with higher fMRI noise, such as those with socioeconomic disadvantages (Cosgrove et al., 2022). More importantly, we demonstrated that the proxy measure based on multimodal neuroimaging explained the majority of the variance in the relationship between cognitive abilities and mental health, underscoring its significant role as a neurobiological unit of analysis for cognitive abilities (Morris and Cuthbert, 2012).
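One way the missing-value handling in opportunistic stacking is commonly described is sketched below: each first-level prediction is duplicated, and missing entries are coded once with a very small and once with a very large constant, so that a tree-based second-level learner can split them off in either direction. This description, the constants of ±1000, and the function name are assumptions for illustration and are not taken from this section; the second-level model itself is not shown.

```python
def code_missing(rows, low=-1000.0, high=1000.0):
    """Duplicate each column, coding missing values (None) as low in one copy and high in the other."""
    coded = []
    for row in rows:
        new_row = []
        for value in row:
            new_row.append(low if value is None else value)
            new_row.append(high if value is None else value)
        coded.append(new_row)
    return coded


# Hypothetical first-level predictions from two sets of neuroimaging features;
# the second participant lacks the second feature set.
first_level = [
    [0.3, -0.1],
    [0.8, None],
]
second_level_input = code_missing(first_level)
for row in second_level_input:
    print(row)
```

Under this coding, participants with only some sets of neuroimaging features remain in the analysis instead of being dropped listwise.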
For PGS, we created a proxy measure based on three large-scale GWAS on cognitive abilities (Davies et al., 2018; Lee et al., 2018; Savage et al., 2018). Using PGS resulted in a numerically weaker predictive performance (out-of-sample r=0.25 at baseline) compared to multimodal neuroimaging. However, this predictive strength is still comparable to previous research. For instance, Allegrini et al., 2019 used a different cohort of children and found R²=0.053 when applying PGS based on Savage et al., 2018 to predict the cognitive abilities of 12-year-old children. Given that PGS based on Savage et al., 2018 also drove the prediction in the current study, as seen in its feature importance, this similar level of predictive performance between Allegrini et al., 2019 and our study suggests consistency in the predictive performance of PGS. Despite this level of performance, PGS was able to explain some variance (21% at baseline) in the relationship between cognitive abilities and mental health, indicating some capacity of PGS as a neurobiological unit of analysis for cognitive abilities.
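To place the two effect sizes on the same scale, a rough conversion can be made under the assumption that the reported R² behaves like a squared correlation between predicted and observed scores (an assumption that need not hold exactly for out-of-sample R²):

```latex
R^2 = 0.053 \;\Rightarrow\; r \approx \sqrt{0.053} \approx 0.23,
\qquad
r = 0.25 \;\Rightarrow\; r^2 \approx 0.0625
```

On this approximate scale, the two estimates differ by about 0.02 in r.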
There are multiple potential reasons why PGS performed much more poorly than multimodal neuroimaging. First, unlike genes, the brain changes throughout development and lifespan (Bethlehem et al., 2022), and so do cognitive abilities (Hartshorne and Germine, 2015). This dynamic nature might make multimodal neuroimaging a better tool for tracing cognitive abilities. Second, there might be a mismatch in the age of participants between the original GWAS (Davies et al., 2018; Lee et al., 2018; Savage et al., 2018) and the current study. While the original GWAS conducted meta-analyses pooling data from participants aged 5–102, these studies might draw more heavily from older cohorts with large participant numbers, such as the UK Biobank (Sudlow et al., 2015). Allegrini et al., 2019 also demonstrated that PGS performs better in predicting cognitive abilities in older children (aged 16) compared to younger ones (aged 12). Therefore, a more child-specific PGS might be needed to explain more variance in children. Third, the PGS used here included only common SNPs and not rare variants. Recent studies using whole-genome sequence data have found that rare variants contribute to the heritability of complex traits, such as height and body mass index (Wainschtein et al., 2022). Given that cognitive abilities are also complex traits, future studies might need to examine whether including rare variants can improve the predictive performance of PGS.
Similarly, our predictive modelling created proxy measures of cognitive abilities for environmental factors based on socio-demographics, lifestyles, and developmental adverse events. In line with previous work (Kirlic et al., 2021; Pat et al., 2022b), we could predict unseen children’s cognitive abilities based on their socio-demographics, lifestyles, and developmental adverse events with a medium-to-high out-of-sample r=0.49 (at baseline). This prediction was driven more strongly by socio-demographics (e.g. parent’s income and education, neighbourhood safety, area deprivation, single parenting), somewhat weaker by lifestyles (e.g. extracurricular activities, sleep, screen time, video gaming, mature movie watching, and parental monitoring), and much weaker by developmental adverse events (e.g. pregnancy complications). Importantly, proxy measures based on socio-demographics, lifestyles, and developmental adverse events captured a large proportion of the relationship between cognitive abilities and mental health. Furthermore, this variance captured by socio-demographics, lifestyles, and developmental adverse events overlapped mainly with the neurobiological proxy measures. This reiterates RDoC’s central tenet that understanding the neurobiology of a functional domain, such as cognitive abilities, could help us understand the extent to which environments influence mental health (Cuthbert and Insel, 2013; Insel et al., 2010). More importantly, all the results regarding neuroimaging, PGS, and socio-demographics, lifestyles, and developmental adverse events were reliable across two years during a sensitive period for adolescents.
This study has several limitations that might affect its generalisability. Firstly, the range of mental health variables was not exhaustive. While we covered various emotional and behavioural problems (Achenbach et al., 2017) and temperaments, including behavioural inhibition/activation (Carver and White, 1994) and impulsivity (Zapolski et al., 2010), we may still have missed other critical mental health variables, such as psychotic-like experiences, eating disorder symptoms, and mania. Similarly, our ABCD samples were young and community-based, likely limiting the severity of their psychopathological issues (Kessler et al., 2007). Future work needs to test if the results found here are generalisable to adults and participants with greater severity. Next, for cognitive abilities, while the six cognitive tasks (Luciana et al., 2018; Thompson et al., 2019) covered most of the RDoC cognitive abilities/systems constructs, we still missed variability in some domains, such as perception (Morris and Cuthbert, 2012). Additionally, many children (3274) did not complete all six cognitive tasks at follow-up, which might create a discrepancy between baseline and follow-up samples. However, the differences in social demographics, lifestyles, and developmental adverse events between participants who provided cognitive scores in the follow-up and those who did not were minimal (Cohen’s d ranging from 0.007 to 0.092, see Table 12). Moreover, given that we found a similar pattern of predictive performance across the two time points, we believe excluding the children who did not complete the cognitive tasks at follow-up should not alter our conclusions.
Table 12. The differences in social demographics, lifestyles, and developmental adverse events between participants who provided cognitive scores in the follow-up and those who did not.
We used social demographics, lifestyles, and developmental adverse events collected at baseline.
| Variable names | Having cognitive scores in the follow-up | Not having cognitive scores in the follow-up | Test statistics |
|---|---|---|---|
| Age in months | Mean (sd): 119.3 (7.5) | Mean (sd): 118.3 (7.6) | Yuen’s t(3783)=6.05, p < 0.001, Cohen’s d = 0.092 |
| Sex | Male = 3918 (52.4%) Female = 3564 (47.6%) Intersex-Male = 1 (0.0%) Intersex-Female = 0 (0.0%) Do not know = 0 (0.0%) | Male = 1776 (53.2%) Female = 1563 (46.8%) Intersex-Male = 2 (0.1%) Intersex-Female = 0 (0.0%) Do not know = 0 (0.0%) | X2(4, N=10824)=6, p=0.199 |
| Body Mass Index | Mean (sd): 18.7 (4.1) | Mean (sd): 18.9 (4.4) | Yuen’s t (3658)=1.605, p=0.109, Cohen’s d=0.023 |
| Race | White = 4190 (56.0%) Black = 918 (12.3%) Hispanic = 1441 (19.3%) Asian = 157(2.1%) Other = 777 (10.4%) | White = 1611 (48.2%) Black = 612 (18.3%) Hispanic = 689 (20.6%) Asian = 68(2.0%) Other = 360 (10.8%) | X2(16, N=10823)=20, p = 0.22 |
| Bilingual Use | Mean (sd): 1 (1.7) | Mean (sd): 1 (1.7) | Yuen’s t(3776)=0.696, p=0.486, Cohen’s d=0.011 |
| Parent Marital Status | Married = 5239 (70.5%) Widowed = 59(0.8%) Divorced = 684 (9.2%) Separated = 264 (3.6%) NeverMarried = 806(10.8%) LivingWithPartner = 381 (5.1%) | Married = 2194 (66.0%) Widowed = 29(0.9%) Divorced = 290 (8.7%) Separated = 135 (4.1%) NeverMarried = 460(13.8%) LivingWithPartner = 214 (6.4%) | X2(25, N=10755)=30, p=0.224 |
| Parents' Education | Mean (sd): 16.6 (2.6) | Mean (sd): 16.3 (2.8) | Yuen’s t(3262)=4.175, p<0.001, Cohen’s d=0.068 |
| Parents' Income | Mean (sd): 7.4 (2.3) | Mean (sd): 7.2 (2.5) | Yuen’s t(2854)=2.243, p=0.025, Cohen’s d=0.034 |
| Household Size | Mean (sd): 4.7 (1.5) | Mean (sd): 4.7 (1.6) | Yuen’s t(3718)=0.39, p=0.697, Cohen’s d=0.007 |
| Economics Insecurities | Mean (sd): 0.4 (1.1) | Mean (sd): 0.5 (1.1) | Yuen’s t(1982)=2.65, p=0.008, Cohen’s d=0.033 |
| Area Deprivation Index | Mean (sd): 94.6 (20.7) | Mean (sd): 94.9 (21.2) | Yuen’s t(3297)=1.686, p=0.092, Cohen’s d=0.029 |
| Lead Risk | Mean (sd): 5 (3.1) | Mean (sd): 5.1 (3.1) | Yuen’s t(3374)=1.797, p=0.072, Cohen’s d=0.027 |
| Uniform Crime Reports | Mean (sd): 12.1 (5.5) | Mean (sd): 12 (6.1) | Yuen’s t(3370)=0.873, p=0.383, Cohen’s d=0.014 |
| Parent reported Neighbourhood Safety | Mean (sd): 11.8 (2.9) | Mean (sd): 11.6 (3) | Yuen’s t(3382)=1.799, p=0.072, Cohen’s d=0.025 |
| Child reported Neighbourhood Safety | Mean (sd): 4.1 (1.1) | Mean (sd): 4 (1.1) | Yuen’s t(3786)=2.258, p=0.024, Cohen’s d=0.036 |
| School Environment | Mean (sd): 20 (2.8) | Mean (sd): 19.8 (2.9) | Yuen’s t(3787)=1.763, p=0.078, Cohen’s d=0.029 |
| School Involvement | Mean (sd): 13.1 (2.3) | Mean (sd): 12.9 (2.4) | Yuen’s t(3790)=3.203, p=0.001, Cohen’s d=0.05 |
| School Disengagement | Mean (sd): 3.7 (1.4) | Mean (sd): 3.8 (1.5) | Yuen’s t(3800)=2.171, p=0.03, Cohen’s d=0.035 |
| Lack of Sleep | Mean (sd): 1.7 (0.8) | Mean (sd): 1.7 (0.8) | Yuen’s t(3860)=3.084, p=0.002, Cohen’s d=0.05 |
| Sleep Disturbance | Mean (sd): 1.9 (1) | Mean (sd): 1.9 (1) | Yuen’s t(3877)=1.567, p=0.117, Cohen’s d=0.025 |
| Sleep Initiating Maintaining | Mean (sd): 11.7 (3.7) | Mean (sd): 11.9 (3.8) | Yuen’s t(3862)=2.481, p=0.013, Cohen’s d=0.038 |
| Sleep Breathing Disorders | Mean (sd): 3.7 (1.2) | Mean (sd): 3.8 (1.3) | Yuen’s t(3834)=1.43, p=0.153, Cohen’s d=0.022 |
| Sleep Arousal Disorders | Mean (sd): 3.4 (0.9) | Mean (sd): 3.4 (1) | Yuen’s t(3885)=0.966, p=0.334, Cohen’s d=0.013 |
| Sleep Wake Transition Disorders | Mean (sd): 8.2 (2.6) | Mean (sd): 8.1 (2.6) | Yuen’s t(3828)=1.198, p=0.231, Cohen’s d=0.022 |
| Sleep Excessive Somnolence | Mean (sd): 6.9 (2.4) | Mean (sd): 7 (2.5) | Yuen’s t(3836)=0.131, p=0.896, Cohen’s d=0.007 |
| Sleep Hyperhidrosis | Mean (sd): 2.4 (1.2) | Mean (sd): 2.5 (1.2) | Yuen’s t(4375)=1.755, p=0.079, Cohen’s d=0.029 |
| Individual Physical Extracurricular Activities | Mean (sd): 5 (5.7) | Mean (sd): 4.7 (5.4) | Yuen’s t(4173)=2.933, p=0.003, Cohen’s d=0.044 |
| Team Physical Extracurricular Activities | Mean (sd): 8.4 (7.7) | Mean (sd): 7.8 (7.4) | Yuen’s t(4007)=3.604, p<0.001, Cohen’s d=0.055 |
| Non Physical Extracurricular Activities | Mean (sd): 5.1 (6.3) | Mean (sd): 4.8 (6.1) | Yuen’s t(4075)=2.961, p=0.003, Cohen’s d=0.047 |
| Physically Active | Mean (sd): 3.5 (2.3) | Mean (sd): 3.4 (2.3) | Yuen’s t(3838)=2.094, p=0.036, Cohen’s d=0.033 |
| Mature Video Games Play | Mean (sd): 0.5 (0.8) | Mean (sd): 0.6 (0.9) | Yuen’s t(3816)=1.396, p=0.163, Cohen’s d=0.022 |
| Mature Movies Watch | Mean (sd): 0.4 (0.6) | Mean (sd): 0.4 (0.7) | Yuen’s t(3728)=4.038, p<0.001, Cohen’s d=0.065 |
| Weekday Screen Use | Mean (sd): 3.3 (3) | Mean (sd): 3.6 (3.3) | Yuen’s t(3220)=4.161,p<0.001, Cohen’s d=0.069 |
| Weekend Screen Use | Mean (sd): 4.5 (3.5) | Mean (sd): 4.8 (3.7) | Yuen’s t(3521)=3.218, p=0.001, Cohen’s d=0.053 |
| Tobacco Before Pregnant | No = 6328 (86.7%) Yes = 974 (13.3%) | No = 2838 (86.7%) Yes = 436 (13.3%) | X2(1, N=10576)=0, p=1 |
| Tobacco After Pregnant | No = 6968 (95.2%) Yes = 351 (4.8%) | No = 3081 (94.2%) Yes = 190 (5.8%) | X2(1, N=10590)=0, p=1 |
| Alcohol Before Pregnant | No = 5174 (73.4%) Yes = 1871 (26.6%) | No = 2380 (75.4%) Yes = 775 (24.6%) | X2(1, N=10200)=0, p=1 |
| Alcohol After Pregnant | No = 7096 (97.1%) Yes = 210 (2.9%) | No = 3175 (97.4%) Yes = 85 (2.6%) | X2(1, N=10566)=0, p=1 |
| Marijuana Before Pregnant | No = 6874 (94.5%) Yes = 399 (5.5%) | No = 3044 (93.9%) Yes = 199 (6.1%) | X2(1, N=10516)=0, p=1 |
| Marijuana After Pregnant | No = 7182 (98.2%) Yes = 130 (1.8%) | No = 3191 (97.7%) Yes = 74 (2.3%) | X2(1, N=10577)=0, p=1 |
| Developmental Prematurity | No = 5945 (80.3%) Yes = 1458 (19.7%) | No = 2735 (83.0%) Yes = 561 (17.0%) | X2(1, N=10699)=0, p=1 |
| Birth Complications | Mean (sd): 0.4 (0.8) | Mean (sd): 0.4 (0.7) | Yuen’s t(3591)=0.121, p=0.904, Cohen’s d=0.007 |
| Pregnancy Complications | Mean (sd): 0.6 (1) | Mean (sd): 0.6 (1) | Yuen’s t(3543)=1.19, p=0.234, Cohen’s d=0.018 |
| Parental Monitoring | Mean (sd): 4.4 (0.5) | Mean (sd): 4.4 (0.5) | Yuen’s t(3810)=0.451, p=0.652, Cohen’s d=0.009 |
| Parent-reported Family Conflict | Mean (sd): 2.5 (1.9) | Mean (sd): 2.6 (2) | Yuen’s t(3805)=1.404, p=0.16, Cohen’s d=0.023 |
| Child report Family Conflict | Mean (sd): 2 (1.9) | Mean (sd): 2.1 (2) | Yuen’s t(3809)=1.751, p=0.08, Cohen’s d=0.026 |
| Parent reported Prosocial | Mean (sd): 1.8 (0.4) | Mean (sd): 1.8 (0.4) | Yuen’s t(3817)=0.288, p=0.774, Cohen’s d=0.007 |
| Child reported Prosocial | Mean (sd): 1.7 (0.4) | Mean (sd): 1.7 (0.4) | Yuen’s t(3849)=2.529, p=0.011, Cohen’s d=0.041 |
Furthermore, while we used comprehensive multimodal MRI from 45 sets of features for neuroimaging, the three fMRI tasks were not chosen based on their relevance to cognitive abilities (Casey et al., 2018). It is possible that other fMRI tasks would yield higher predictive performance. For all analyses involving PGS, we limited our participants to children of European ancestry due to the lack of summary statistics from well-powered GWAS for cognitive abilities in non-European participants. This prevented us from fully leveraging the diverse samples in the ABCD study (Garavan et al., 2018). Future GWAS work with more diverse samples is needed to ensure equity and fairness in developing neurobiological units of analysis for cognitive abilities. Lastly, we relied on 44 variables of socio-demographics, lifestyles, and developmental adverse events included in the study, which might have missed some variables relevant to cognitive abilities (e.g. nutrition). The ABCD study (Casey et al., 2018) is ongoing, and future data might address some of these limitations.
Overall, aligning with the RDoC perspective (Morris and Cuthbert, 2012), our findings support the use of neurobiological units of analysis for cognitive abilities, as assessed through multimodal neuroimaging and Polygenic Scores (PGS). These measures explain (a) the relationship between cognitive abilities and mental health and (b) the variance in this cognitive-ability-and-mental-health relationship attributable to environmental factors. Our results emphasise the importance of considering both neurobiology and environmental factors, such as socio-demographics, lifestyles, and adverse childhood events, to gain a comprehensive understanding of the aetiology of mental health (Insel et al., 2010; Morris et al., 2022).
Materials and methods
The ABCD study
We used data from the ABCD Study Curated Annual Release 5.1 (DOI:10.15154/z563-zd24) from two time points. The baseline included data from 11,868 children (5677 females and 3 others, aged 9–10 years), while the two-year follow-up included data from the same children two years later (10,908 children, 5181 females and 3 others). Although the ABCD collected data from 22 sites across the United States, we excluded data from Site 22 since this site only provided data from 35 children at baseline and none at follow-up (Garavan et al., 2018). We also excluded 69 children based on the Snellen Vision Screener (Luciana et al., 2018; Snellen, 1862). These children either could not read any line on the chart, could only read the largest line, or could read up to the fourth line clearly but had difficulty reading stimuli on an iPad used for administering cognitive tasks (explained below). We listed the number of participants following each inclusion and exclusion criterion for each variable in Figure 6, Figure 6—figure supplement 1 and Tables 13–14. Institutional Review Boards at each site approved the study protocols. Please see Clark et al., 2018 for ethical details, such as informed consent and confidentiality.
Figure 6. Flow diagram of participants’ inclusion and exclusion criteria.
Here, we show the criteria for cognitive abilities and mental health across the two time points.
Figure 6—figure supplement 1. Flow diagram of participants’ inclusion and exclusion criteria.
Table 13. Exclusion criteria for neuroimaging features in the baseline.
| Neuroimaging features | Data provided | Did not pass quality control | Had vision problems | From site 22 | Had any missing feature | Flagged as outliers | Observations kept |
|---|---|---|---|---|---|---|---|
| ENback 0back | 11771 | 3996 | 38 | 21 | 8 | 292 | 7416 |
| ENback 2back | 11771 | 3996 | 38 | 21 | 12 | 281 | 7423 |
| ENback 2back vs 0back | 11771 | 3996 | 38 | 21 | 10 | 397 | 7309 |
| ENback emotion | 11771 | 3996 | 38 | 21 | 10 | 303 | 7403 |
| ENback Emotion vs Neutral Face | 11771 | 3996 | 38 | 21 | 11 | 480 | 7225 |
| ENback Face vs Place | 11771 | 3996 | 38 | 21 | 10 | 391 | 7315 |
| ENback Negative vs Neutral Face | 11771 | 3996 | 38 | 21 | 11 | 454 | 7251 |
| ENback Positive vs Neutral Face | 11771 | 3996 | 38 | 21 | 10 | 500 | 7206 |
| ENback place | 11771 | 3996 | 38 | 21 | 11 | 331 | 7374 |
| MID Reward vs Neutral anticipation | 11771 | 2596 | 51 | 22 | 11 | 250 | 8841 |
| MID Loss vs Neutral anticipation | 11771 | 2596 | 51 | 22 | 11 | 245 | 8846 |
| MID Positive vs Negative Reward Feedback | 11771 | 2596 | 51 | 22 | 12 | 338 | 8752 |
| MID Positive vs Negative Punishment Feedback | 11771 | 2596 | 51 | 22 | 10 | 334 | 8758 |
| MID Large Reward vs Neutral anticipation | 11771 | 2596 | 51 | 22 | 12 | 241 | 8849 |
| MID Small Reward vs Neutral anticipation | 11771 | 2596 | 51 | 22 | 10 | 270 | 8822 |
| MID Large Reward vs Small Reward anticipation | 11771 | 2596 | 51 | 22 | 13 | 266 | 8823 |
| MID Large Loss vs Neutral anticipation | 11771 | 2596 | 51 | 22 | 11 | 250 | 8841 |
| MID Small Loss vs Neutral anticipation | 11771 | 2596 | 51 | 22 | 11 | 282 | 8809 |
| MID Large Loss vs Small Loss anticipation | 11771 | 2596 | 51 | 22 | 12 | 307 | 8783 |
| SST Any Stop vs Correct Go | 11771 | 3672 | 45 | 20 | 14 | 227 | 7793 |
| SST Correct Go vs Fixation | 11771 | 3672 | 45 | 20 | 13 | 262 | 7759 |
| SST Correct Stop vs Correct Go | 11771 | 3672 | 45 | 20 | 13 | 236 | 7785 |
| SST Correct Stop vs Incorrect Stop | 11771 | 3672 | 45 | 20 | 14 | 292 | 7728 |
| SST Incorrect Go vs Correct Go | 11771 | 3672 | 45 | 20 | 15 | 481 | 7538 |
| SST Incorrect Go vs Incorrect Stop | 11771 | 3672 | 45 | 20 | 14 | 366 | 7654 |
| SST Incorrect Stop vs Correct Go | 11771 | 3672 | 45 | 20 | 13 | 246 | 7775 |
| rsfMRI temporal variance | 11771 | 2397 | 62 | 25 | 14 | 682 | 8591 |
| rsfMRI subcortical-network FC | 11771 | 2397 | 62 | 25 | 14 | 1 | 9272 |
| rsfMRI cortical FC | 11771 | 2397 | 62 | 25 | 14 | 3 | 9270 |
| T1 subcortical avg intensity | 11771 | 501 | 66 | 27 | 0 | 60 | 11117 |
| T1 white matter avg intensity | 11771 | 501 | 66 | 27 | 12 | 13 | 11152 |
| T1 gray matter avg intensity | 11771 | 501 | 66 | 27 | 12 | 11 | 11154 |
| T1 normalised intensity | 11771 | 501 | 66 | 27 | 12 | 2 | 11163 |
| T1 summations | 11771 | 501 | 66 | 27 | 12 | 34 | 11131 |
| cortical thickness | 11771 | 501 | 66 | 27 | 12 | 2 | 11163 |
| cortical area | 11771 | 501 | 66 | 27 | 12 | 1 | 11164 |
| cortical volume | 11771 | 501 | 66 | 27 | 12 | 0 | 11165 |
| subcortical volume | 11771 | 501 | 66 | 27 | 0 | 215 | 10962 |
| sulcal depth | 11771 | 501 | 66 | 27 | 12 | 1106 | 10059 |
| T2 subcortical avg intensity | 11771 | 1217 | 58 | 25 | 0 | 67 | 10404 |
| T2 white matter avg intensity | 11771 | 1217 | 58 | 25 | 10 | 56 | 10405 |
| T2 gray matter avg intensity | 11771 | 1217 | 58 | 25 | 10 | 55 | 10406 |
| T2 normalised intensity | 11771 | 1217 | 58 | 25 | 10 | 12 | 10449 |
| T2 summations | 11771 | 1217 | 58 | 25 | 10 | 14 | 10447 |
| DTI | 11771 | 1577 | 57 | 13 | 0 | 24 | 10100 |
Table 14. Exclusion criteria for neuroimaging features in the follow-up.
| Neuroimaging features | Data provided | Did not pass quality control | Had vision problems | Had any missing feature | Flagged as outliers | Observations kept |
|---|---|---|---|---|---|---|
| ENback 0back | 8123 | 1804 | 35 | 11 | 216 | 6057 |
| ENback 2back | 8123 | 1804 | 35 | 13 | 186 | 6085 |
| ENback 2back vs 0back | 8123 | 1804 | 35 | 14 | 294 | 5976 |
| ENback emotion | 8123 | 1804 | 35 | 13 | 202 | 6069 |
| ENback Emotion vs Neutral Face | 8123 | 1804 | 35 | 13 | 347 | 5924 |
| ENback Face vs Place | 8123 | 1804 | 35 | 13 | 295 | 5976 |
| ENback Negative vs Neutral Face | 8123 | 1804 | 35 | 11 | 355 | 5918 |
| ENback Positive vs Neutral Face | 8123 | 1804 | 35 | 13 | 342 | 5929 |
| ENback place | 8123 | 1804 | 35 | 12 | 234 | 6038 |
| MID Reward vs Neutral anticipation | 8123 | 1379 | 40 | 8 | 153 | 6543 |
| MID Loss vs Neutral anticipation | 8123 | 1379 | 40 | 8 | 154 | 6542 |
| MID Positive vs Negative Reward Feedback | 8123 | 1379 | 40 | 9 | 192 | 6503 |
| MID Positive vs Negative Punishment Feedback | 8123 | 1379 | 40 | 9 | 197 | 6498 |
| MID Large Reward vs Neutral anticipation | 8123 | 1379 | 40 | 8 | 142 | 6554 |
| MID Small Reward vs Neutral anticipation | 8123 | 1379 | 40 | 8 | 163 | 6533 |
| MID Large Reward vs Small Reward anticipation | 8123 | 1379 | 40 | 8 | 155 | 6541 |
| MID Large Loss vs Neutral anticipation | 8123 | 1379 | 40 | 8 | 150 | 6546 |
| MID Small Loss vs Neutral anticipation | 8123 | 1379 | 40 | 9 | 179 | 6516 |
| MID Large Loss vs Small Loss anticipation | 8123 | 1379 | 40 | 9 | 173 | 6522 |
| SST Any Stop vs Correct Go | 8123 | 2036 | 33 | 7 | 123 | 5924 |
| SST Correct Go vs Fixation | 8123 | 2036 | 33 | 7 | 173 | 5874 |
| SST Correct Stop vs Correct Go | 8123 | 2036 | 33 | 7 | 163 | 5884 |
| SST Correct Stop vs Incorrect Stop | 8123 | 2036 | 33 | 7 | 187 | 5860 |
| SST Incorrect Go vs Correct Go | 8123 | 2036 | 33 | 7 | 345 | 5702 |
| SST Incorrect Go vs Incorrect Stop | 8123 | 2036 | 33 | 7 | 267 | 5780 |
| SST Incorrect Stop vs Correct Go | 8123 | 2036 | 33 | 7 | 131 | 5916 |
| rsfMRI temporal variance | 8123 | 1152 | 49 | 14 | 512 | 6396 |
| rsfMRI subcortical-network FC | 8123 | 1152 | 49 | 14 | 3 | 6905 |
| rsfMRI cortical FC | 8123 | 1152 | 49 | 14 | 3 | 6905 |
| T1 subcortical avg intensity | 8123 | 227 | 51 | 0 | 32 | 7813 |
| T1 white matter avg intensity | 8123 | 227 | 51 | 10 | 8 | 7827 |
| T1 gray matter avg intensity | 8123 | 227 | 51 | 10 | 9 | 7826 |
| T1 normalised intensity | 8123 | 227 | 51 | 10 | 0 | 7835 |
| T1 summations | 8123 | 227 | 51 | 10 | 19 | 7816 |
| cortical thickness | 8123 | 227 | 51 | 10 | 2 | 7833 |
| cortical area | 8123 | 227 | 51 | 10 | 0 | 7835 |
| cortical volume | 8123 | 227 | 51 | 10 | 0 | 7835 |
| subcortical volume | 8123 | 227 | 51 | 0 | 112 | 7733 |
| sulcal depth | 8123 | 227 | 51 | 10 | 890 | 6945 |
| T2 subcortical avg intensity | 8123 | 600 | 50 | 0 | 39 | 7434 |
| T2 white matter avg intensity | 8123 | 600 | 50 | 10 | 47 | 7416 |
| T2 gray matter avg intensity | 8123 | 600 | 50 | 10 | 49 | 7414 |
| T2 normalised intensity | 8123 | 600 | 50 | 10 | 5 | 7458 |
| T2 summations | 8123 | 600 | 50 | 10 | 14 | 7449 |
| DTI | 8123 | 638 | 47 | 0 | 15 | 7423 |
Measures: cognitive abilities
Cognitive abilities were assessed using six cognitive tasks collected with an iPad during a 70 min session outside of MRI at baseline and two-year follow-up (Luciana et al., 2018; Thompson et al., 2019). The first task was Picture Vocabulary, which measured language comprehension (Gershon et al., 2014). The second task was Oral Reading Recognition, which measured language decoding (Bleck et al., 2013). The third task was Flanker, which measured conflict monitoring and inhibitory control (Eriksen and Eriksen, 1974). The fourth task was Pattern Comparison Processing, which measured the speed of processing patterns (Carlozzi et al., 2013). The fifth task was Picture Sequence Memory, which measured episodic memory (Bauer et al., 2013). The sixth task was Rey-Auditory Verbal Learning, which measured memory recall after distraction and a short delay (Daniel and Wahlstrom, 2014). Rey-Auditory Verbal Learning was sourced from Pearson Assessment, while the other five cognitive tasks were from the NIH Toolbox (Bleck et al., 2013; Luciana et al., 2018). The ABCD study administered the Dimensional Change Card Sort and List Sorting Working Memory tasks from the NIH Toolbox (Bleck et al., 2013) only at baseline, not at the two-year follow-up (see DOI: 10.15154/z563-zd24). Consequently, these two tasks were not analysed in the current study. Additionally, 3,274 children at follow-up did not complete some of these tasks and were therefore excluded from the follow-up data analysis.
We operationalised individual differences in cognitive abilities across the six cognitive tasks as a factor score of a latent variable, the ‘g-factor.’ To estimate this factor score, we fit the standardised performance of the six cognitive tasks to a second-order confirmatory factor analysis (CFA) of a ‘g-factor’ model, similar to previous work (Ang et al., 2020; Pat et al., 2022a; Pat et al., 2022b; Thompson et al., 2019). In this CFA, we treated the g-factor as the second-order latent variable that underpinned three first-order latent variables, each with two manifest variables: (1) ‘language,’ underlying Picture Vocabulary and Oral Reading Recognition, (2) ‘mental flexibility,’ underlying Flanker and Pattern Comparison Processing, and (3) ‘memory recall,’ underlying Picture Sequence Memory and Rey-Auditory Verbal Learning.
We fixed the variance of the latent factors to one and applied the Maximum Likelihood with Robust standard errors (MLR) approach with Huber-White standard errors and scaled test statistics. To provide information about the internal consistency of the g-factor, we calculated OmegaL2 (Jorgensen et al., 2022). We used the lavaan (Rosseel, 2012) (version 0.6–15), semTools (Jorgensen et al., 2022), and semPlot (Epskamp, 2015) packages for this CFA of cognitive abilities.
We found the second-order ‘g-factor’ model to fit cognitive abilities well across the six cognitive tasks. This is evidenced by several indices if we apply the model to the whole baseline data: scaled and robust CFI (0.994), TLI (0.986), RMSEA (0.031, 90% CI [0.024-0.037]), robust SRMR (0.013), and OmegaL2 (0.78). See Figure 7 for the standardised weights of this CFA model. This enabled us to use the factor score of the latent variable ‘g-factor’ as the target for our predictive models.
Figure 7. Standardised weights of the second-order ‘g-factor’ model.
These weights were derived from confirmatory factor analysis, fitted on cognitive abilities across six cognitive tasks from the entire baseline dataset. The actual weights used for predictive modelling were slightly different, as the predictive modelling was based on leave-one-site-out cross-validation, which trained on data from all but one site.
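As an illustration of the leave-one-site-out scheme mentioned in this caption, the following minimal sketch generates train/test index splits in which each site is held out once. The function name `leave_one_site_out` and the toy site labels are hypothetical and are not taken from the study's code; the sketch only shows how the folds are formed, not how the CFA or the predictive models were fitted.

```python
from collections import defaultdict


def leave_one_site_out(site_labels):
    """Yield (held_out_site, train_idx, test_idx), holding out each site once."""
    by_site = defaultdict(list)
    for idx, site in enumerate(site_labels):
        by_site[site].append(idx)
    for held_out in sorted(by_site):
        test_idx = by_site[held_out]
        # Training indices are all participants from every other site.
        train_idx = sorted(
            i for site, idxs in by_site.items() if site != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


# Toy example with hypothetical site labels (one label per participant).
sites = ["site01", "site02", "site01", "site03", "site02"]
folds = list(leave_one_site_out(sites))
for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

Any quantity estimated from the data (for example, factor loadings or standardisation parameters) would be fitted on the training indices only and then applied to the held-out site.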
Measures: mental health
Mental health was assessed using two sets of features. The first set involved parental reports of children’s emotional and behavioural problems, as measured by the Child Behaviour Checklist (CBCL) (Achenbach et al., 2017). We used eight summary scores: anxious/depressed, withdrawn, somatic complaints, social problems, thought problems, attention problems, rule-breaking behaviours, and aggressive behaviours. For CBCL, caretakers rated each item as 0=not true (as far as you know), 1=somewhat or sometimes true, and 2=very true or often true. The second set assessed children’s temperaments, conceptualised as risk factors for mental issues (Johnson et al., 2003; Whiteside and Lynam, 2003), using the Urgency, Premeditation, Perseverance, Sensation Seeking, and Positive Urgency (UPPS-P) Impulsive Behaviour Scale (Zapolski et al., 2010) and the Behavioural Inhibition System/Behavioural Activation System (BIS/BAS) (Carver and White, 1994). We used nine summary scores: negative urgency, lack of planning, sensation seeking, positive urgency, lack of perseverance, BIS, BAS reward responsiveness, BAS drive, and BAS fun. Supplementary file 1 and Supplementary file 2 provide summary statistics, histograms, and missing values for measures of mental health. They also include the actual variable names listed in the data dictionary and their calculations.
Measures: neuroimaging
Neuroimaging data were based on the tabulated brain-MRI data pre-processed by the ABCD. We organized the brain-MRI data into 45 sets of neuroimaging features, covering task-fMRI (including ENBack, stop signal (SST), and monetary incentive delay (MID) tasks), resting-state fMRI, structural MRI, and diffusion tensor imaging (DTI). The ABCD provided details on MRI acquisition and image processing elsewhere (Hagler et al., 2019; Yang and Jernigan, 2023).
The ABCD study provided recommended exclusion criteria for neuroimaging data based on automated and manual quality control (Yang and Jernigan, 2023). Specifically, the study created an exclusion flag for each neuroimaging feature (with the prefix ‘imgincl’ in the ‘abcd_imgincl01’ table) based on criteria involving image quality, MR neurological screening, behavioural performance, and the number of repetition times (TRs), among others. We removed the entire set of neuroimaging features from each participant if any of its features were flagged or missing. We also detected outliers, defined as values more than three interquartile ranges from the nearest quartile, for each neuroimaging feature. We excluded a particular set of neuroimaging features from each participant when outliers made up over 5% of the total number of neuroimaging features in that set. For instance, for the 2-Back vs 0-Back contrast from the ENBack task-fMRI, we had 167 features (i.e. brain regions) based on the brain parcellation atlas used by the ABCD. If (a) one of the 167 features had an exclusion flag, (b) a participant had a vision problem, (c) any of the 167 features was missing, or (d) at least nine features (i.e. over 5%) were outliers, then we would remove this 2-Back vs 0-Back contrast from a particular participant but still keep other sets of neuroimaging features that did not meet these criteria (see Tables 13 and 14 for the number of participants after each exclusion criterion for each set of neuroimaging features).
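The outlier rule described above can be sketched as follows. This is a minimal illustration rather than the study's actual code: the function names are hypothetical, the data are toy values, and the quartile definition (Python's `statistics.quantiles` with `method="inclusive"`) is an assumption, since the text does not specify how quartiles were computed.

```python
import math
import statistics


def iqr_outlier_flags(values, k=3.0):
    """Flag values lying more than k IQRs below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


def sets_to_drop(feature_matrix, k=3.0, max_outlier_fraction=0.05):
    """feature_matrix: rows are participants, columns are features of one set.

    Returns one boolean per participant: True if the share of that
    participant's features flagged as outliers exceeds max_outlier_fraction.
    """
    n_features = len(feature_matrix[0])
    counts = [0] * len(feature_matrix)
    for j in range(n_features):
        column = [row[j] for row in feature_matrix]
        for i, flagged in enumerate(iqr_outlier_flags(column, k)):
            counts[i] += flagged
    return [c / n_features > max_outlier_fraction for c in counts]


# Toy data: 10 participants x 3 features; the last participant has one
# extreme value in the first feature.
toy = [[i, 2 * i, i + 0.5] for i in range(9)] + [[1000, 18, 9.5]]
dropped = sets_to_drop(toy)
print(dropped)

# Smallest outlier count exceeding 5% of a 167-feature set.
min_outliers_167 = math.floor(0.05 * 167) + 1
print(min_outliers_167)
```

With 167 features, 5% corresponds to 8.35 features, so nine is the smallest count that exceeds the threshold, in line with criterion (d).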
We standardised each neuroimaging feature across participants and harmonised variation across MRI scanners using ComBat (Fortin et al., 2017; Johnson et al., 2007; Nielson et al., 2018). Note that under predictive modelling, we discuss strategies we implemented to avoid data leakage and to model the data with missing values using the opportunistic stacking technique (Engemann et al., 2020; Pat et al., 2022b).
Sets of neuroimaging features 1-26: task-fMRI
We used unthresholded generalised-linear model (GLM) contrasts, averaged across two runs (Bolt et al., 2017; Pat et al., 2023; Pat et al., 2022b) for task-fMRI sets of features. These contrasts were embedded in the brain parcels based on the FreeSurfer’s atlases (Dale et al., 1999): 148 cortical-surface Destrieux parcels (Destrieux et al., 2010) and subcortical-volumetric 19 ASEG parcels (Fischl et al., 2002), resulting in 167 features in each task-fMRI set.
Sets of neuroimaging features 1-9: ENBack task-fMRI
The ‘ENBack’ or emotional n-back task was designed to elicit fMRI activity related to working memory to neutral and emotional stimuli (Barch et al., 2013). Depending on the block, the children were asked whether an image matched the image shown two trials earlier (2-Back) or at the beginning (0-Back). In this task version, the images shown included emotional faces and places. Thus, in addition to working memory, the task also allowed us to extract fMRI activity related to emotion processing and facial processing. We used the following contrasts as nine separate sets of neuroimaging features for ENBack task-fMRI: 2-Back vs 0-Back, Face vs Place, Emotion vs Neutral Face, Positive vs Neutral Face, Negative vs Neutral Face, 2-Back, 0-Back, Emotion, and Place.
Sets of neuroimaging features 10-19: MID task-fMRI
The MID task was designed to elicit fMRI activity related to reward processing (Knutson et al., 2000). In this task, children responded to a stimulus shown on a screen. If they responded before the stimulus disappeared, they could either win $5 (Large Reward), win $0.2 (Small Reward), lose $5 (Large Loss), lose $0.2 (Small Loss), or not win or lose any money (Neutral), depending on the conditions. At the end of each trial, they were shown feedback on whether they won money (Positive Reward Feedback), did not win money (Negative Reward Feedback), avoided losing money (Positive Punishment Feedback), or lost money (Negative Punishment Feedback). We used the following contrasts as ten separate sets of neuroimaging features for MID task-fMRI: Large Reward vs Small Reward anticipation, Small Reward vs Neutral anticipation, Large Reward vs Neutral anticipation, Large Loss vs Small Loss anticipation, Small Loss vs Neutral anticipation, Large Loss vs Neutral anticipation, Loss vs Neutral anticipation, Reward vs Neutral anticipation, Positive vs Negative Reward Feedback, and Positive vs Negative Punishment Feedback.
Sets of Neuroimaging Features 20-26: Stop-Signal Task (SST) task-fMRI
The SST was designed to elicit fMRI activity related to inhibitory control (Whelan et al., 2012). Children were asked to withhold or interrupt their motor response to a ‘Go’ stimulus whenever they saw a ‘Stop’ signal. We used two additional quality-control exclusion criteria for the SST task: tfmri_sst_beh_glitchflag and tfmri_sst_beh_violatorflag, which notified glitches as recommended (Bissett et al., 2021; Garavan et al., 2018). We used the following contrasts as seven separate sets of neuroimaging features for SST task-fMRI: Incorrect Go vs Incorrect Stop, Incorrect Go vs Correct Go, Correct Stop vs Incorrect Stop, Any Stop vs Correct Go, Incorrect Stop vs Correct Go, Correct Stop vs Correct Go, and Correct Go vs Fixation.
Sets of neuroimaging features 27-29: rs-fMRI
The ABCD study collected rs-fMRI data for 20 min while children viewed a crosshair. The study described the pre-processing procedure elsewhere (Hagler et al., 2019). The investigators parcellated the cortical surface into 333 regions and the subcortical volume into 19 regions using Gordon’s (Gordon et al., 2016) and ASEG (Fischl et al., 2002) atlases, respectively. They grouped the cortical-surface regions into 13 predefined large-scale cortical networks (Gordon et al., 2016). These large-scale cortical networks included auditory, cingulo-opercular, cingulo-parietal, default-mode, dorsal-attention, frontoparietal, none, retrosplenial-temporal, salience, sensorimotor-hand, sensorimotor-mouth, ventral-attention, and visual networks. Note that the term ‘None’ refers to regions that did not belong to any network. They then correlated time series from these regions and applied Fisher’s z-transformation to the correlations. We included three sets of neuroimaging features for rs-fMRI. The first set was cortical functional connectivity (FC) with 91 features, including the mean values of the correlations between pairs of regions within the same large-scale cortical network and between large-scale cortical networks. The second set was subcortical-network FC with 247 features, including the mean values of the correlations between each of the 19 subcortical regions and the 13 large-scale cortical networks. The third set was temporal variance with 352 features (i.e. 333 cortical and 19 subcortical regions), representing the variance across time calculated for each parcellated region. Temporal variance reflects the magnitude of low-frequency oscillations (Yang and Jernigan, 2023).
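The feature counts for the three rs-fMRI sets follow directly from the numbers of networks and regions given above. The short sketch below reproduces this arithmetic and shows Fisher's z-transformation; the constant and function names are illustrative only.

```python
import math

N_NETWORKS = 13     # large-scale cortical networks, including 'none'
N_SUBCORTICAL = 19  # ASEG subcortical regions
N_CORTICAL = 333    # Gordon cortical regions

# Cortical FC: 13 within-network means plus one mean for each unordered
# pair of distinct networks (13 choose 2 = 78).
cortical_fc = N_NETWORKS + math.comb(N_NETWORKS, 2)

# Subcortical-network FC: each subcortical region paired with each network.
subcortical_fc = N_SUBCORTICAL * N_NETWORKS

# Temporal variance: one value per parcellated region.
temporal_variance = N_CORTICAL + N_SUBCORTICAL


def fisher_z(r):
    """Fisher's z-transformation of a correlation coefficient with |r| < 1."""
    return math.atanh(r)


print(cortical_fc, subcortical_fc, temporal_variance)
print(round(fisher_z(0.5), 4))
```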
Sets of neuroimaging features 30-44: sMRI
The ABCD study collected T1-weighted and T2-weighted 3D sMRI images and quantified them into various measures, mainly through FreeSurfer v7.1.1 (Yang and Jernigan, 2023). Similar to task-fMRI, we used the Destrieux atlas (Destrieux et al., 2010) with 148 cortical-surface regions and the ASEG atlas (Fischl et al., 2002) with 19 subcortical-volumetric regions, resulting in 167 features. We included 15 sets of neuroimaging features for sMRI: cortical thickness, cortical area, cortical volume, sulcal depth, T1 white-matter averaged intensity, T1 grey-matter averaged intensity, T1 normalised intensity, T2 white-matter averaged intensity, T2 grey-matter averaged intensity, T2 normalised intensity, T1 summations, T2 summations, T1 subcortical averaged intensity, T2 subcortical averaged intensity and subcortical volume. Note: see Figure 3 for the neuroimaging features included in T1 and T2 summations; these plots are enlarged in Figure 3—figure supplements 6 and 11 for baseline and follow-up, respectively.
Set of neuroimaging features 45: DTI
We included fractional anisotropy (FA) derived from DTI as another set of neuroimaging features. FA characterizes the directionality of diffusion within white matter tracts, which is thought to indicate the density of fiber packing (Alexander et al., 2007). The ABCD study used AtlasTrack (Hagler et al., 2009; Hagler et al., 2019) to segment major white matter tracts. These included the corpus callosum, forceps major, forceps minor, cingulate and parahippocampal portions of the cingulum, fornix, inferior fronto-occipital fasciculus, inferior longitudinal fasciculus, pyramidal/corticospinal tract, superior longitudinal fasciculus, temporal lobe portion of the superior longitudinal fasciculus, anterior thalamic radiations, and uncinate. Given that ten tracts were separately labelled for each hemisphere, there were 23 neuroimaging features in this set.
Measures: polygenic scores
Genetic profiles were constructed based on PGS of cognitive abilities. The ABCD study provides detailed notes on genotyping elsewhere (Uban et al., 2018). Briefly, the study genotyped saliva and whole-blood samples using the Smokescreen Array. The investigators then quality-controlled the data using calling signals and variant call rates, applied the Ricopili pipeline and imputed the data with TOPMED (see https://topmedimpute.readthedocs.io/). The study also identified problematic plates and data points with a subject-matching issue. Additional quality control was applied to these data: excluding SNPs with a minor allele frequency of < 5%, removing SNPs with excessive deviation from Hardy-Weinberg equilibrium (p < 10⁻¹⁰), and finally removing individuals with excessive homozygosity or heterozygosity, defined as heterozygosity more than 4 standard deviations above or below the sample mean.
We calculated PGS using three definitions from three large-scale genome-wide association studies (GWAS) on cognitive abilities: n=300,486 participants aged 16–102 (Davies et al., 2018), n=257,84 participants aged 8–96 (Lee et al., 2018) and n=269,867 participants aged 5–98 (Savage et al., 2018). These GWAS synthesised findings from different cohorts that collected cognitive tasks. Due to the diversity in cognitive tasks used across cohorts, each GWAS defined cognitive abilities in its own way. For instance, Lee et al., 2018 utilised principal component analysis to consolidate various cognitive task scores into a single measure within each cohort from the Cognitive Genomics Consortium (COGENT) (Lencz et al., 2014), but only focused on the verbal-numerical reasoning (VNR) test within the UK Biobank cohort (Sudlow et al., 2015). In a similar approach, Davies et al., 2018 employed principal component analysis to capture cognitive abilities from different cohorts within both the CHARGE (Psaty et al., 2009) and COGENT (Lencz et al., 2014) consortia. They also focused on VNR testing within UK Biobank (Sudlow et al., 2015). Similarly, Savage et al., 2018 calculated a singular score for cognitive abilities using ‘a single sum score, mean score, or factor score’ collated from various tasks across thirteen cohort studies, alongside logistic regression in one case-control study.
Participants in these GWAS were of European ancestry. Because PGS has a lower predictive ability when target samples (i.e. in our case, ABCD children) do not have the same ancestry as those of the discovery GWAS sample (Duncan et al., 2019), we restricted all analyses involving PGS to 5776 children of European ancestry. These children were within four standard deviations from the mean of the top four principal components (PCs) of the super-population individuals in the 1000 Genomes Project Consortium Phase 3 reference (Auton et al., 2015).
We employed the P-threshold approach (Choi et al., 2020). In this approach, we defined ‘risk’ alleles as those associated with cognitive abilities in the three discovery GWASs (Davies et al., 2018; Lee et al., 2018; Savage et al., 2018) at ten different PGS thresholds: 0.5, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001, 0.000001, 0.0000001, 0.00000001. We then computed PGS as the Z-scored, weighted mean number of linkage-independent risk alleles in approximate linkage equilibrium derived from imputed autosomal SNPs. We selected the best PGS threshold for each of the three definitions by choosing the PGS threshold that demonstrated the strongest correlation between its PGS and cognitive abilities in the ABCD (i.e. the g-factor factor score). Refer to the section on predictive modelling below for strategies we implemented to avoid data leakage due to this selection of the PGS threshold and the family structure in the ABCD.
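The threshold-selection step can be sketched as follows. This is a minimal illustration under stated assumptions, not our analysis code: the function names, the toy PGS values and the three thresholds shown are hypothetical, and the selection is assumed to use only training-set data so that the chosen threshold can then be carried over to the test set.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def select_threshold(pgs_by_threshold, target):
    """Return the threshold whose PGS correlates most strongly with the target."""
    return max(pgs_by_threshold, key=lambda t: pearson_r(pgs_by_threshold[t], target))


# Hypothetical training-set values: g-factor scores and PGS at three thresholds.
g_train = [-1.2, -0.4, 0.1, 0.6, 1.3]
pgs_train = {
    0.5: [-1.0, -0.5, 0.2, 0.4, 1.1],
    0.05: [0.3, -0.9, 0.8, -0.2, 0.5],
    0.001: [0.9, 0.1, -0.4, 0.2, -0.6],
}

best_threshold = select_threshold(pgs_train, g_train)
print(best_threshold)
```

Selecting the threshold inside the training set, rather than on the full sample, keeps the test set untouched by this choice.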
Measures: sociodemographics, lifestyles, and developmental adverse events
Environmental factors were based on 44 features, covering socio-demographics, lifestyles, and developmental adverse events. This included (a) 14 features for child socio-demographics (Zucker et al., 2018), including bilingual use (Dick et al., 2019), parental marital status, parental education, parental income, household size, economic insecurities, area deprivation index (Kind et al., 2014), lead risk (Frostenson, 2016), crime report (Federal Bureau Of Investigation, 2012), neighbourhood safety (Echeverria et al., 2004), school environment, involvement and disengagement (Stover et al., 2010), (b) five features for child social interactions from Parent Monitoring scale (Chilcoat and Anthony, 1996), Child Report of Behaviour Inventory (Schaefer, 1965), Strength and Difficulties Questionnaire (Goodman et al., 2003) and Moos Family Environment Scale (Moos et al., 1974), (c) eight features for child’s sleep problems based on the Sleep Disturbance scale (Bruni et al., 1996), (d) four features for child’s physical activities from Youth Risk Behaviour Survey (Dolsen et al., 2019; Hunsberger et al., 2015), (e) four features for child screen use (Bagot et al., 2018), (f) six features for parental use of alcohol, tobacco and marijuana before and after pregnancy from the Developmental History Questionnaire (Kessler et al., 2009; Merikangas et al., 2009), and (g) three features for developmental adverse events from the Developmental History Questionnaire, including prematurity and birth and pregnancy complications (Kessler et al., 2009; Merikangas et al., 2009). Note that we treated developmental adverse events from the Developmental History Questionnaire as environmental factors, as these events are either parental behaviours (e.g. parental use of alcohol, tobacco and marijuana) or parental medical conditions (e.g. pregnancy complications) that affect children.
Supplementary file 3 and Supplementary file 4 provide summary statistics, histograms, and missing values for measures of socio-demographics, lifestyles and developmental adverse events. They also include the actual variable names listed in the data dictionary and their calculations.
Predictive modelling
For building predictive multivariate models, we implemented nested leave-one-site-out cross-validation. Specifically, we treated one out of 21 sites as a test set and the rest as a training set for training predictive models. We then repeated the model-building process until every site had been a test set once and reported overall predictive performance across all test sites. Within each training set, we applied 10-fold cross-validation to tune the hyperparameters of the predictive models. The nested leave-one-site-out cross-validation allowed us to ensure the generalisability of our predictive models to unseen sites. This is important because different sites involved different MRI machines, experimenters, and participants with different demographics (Garavan et al., 2018). Moreover, data from children from the same family were collected from the same site. Accordingly, using leave-one-site-out also prevented data leakage due to family structure, which might inflate the predictive performance of the models, particularly those involving polygenic scores. Still, given the different number of participants in each site, one drawback of the nested leave-one-site-out cross-validation is that we ended up with some test sets with fewer participants than others. Accordingly, we provided a supplemental analysis using the classical nested cross-validation, which included ten non-overlapping outer folds, randomly chosen without considering the site information, as test sets and ten inner folds for hyperparameter tuning (see Figure 8). Briefly, the results of the leave-one-site-out cross-validation and classical nested cross-validation were close to each other, although classical nested cross-validation had slightly higher performance.
Figure 8. Predictive performance of leave-one-site-out cross-validation vs 10-fold cross-validation.
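The splitting logic of nested leave-one-site-out cross-validation can be sketched as below. This is an illustrative outline only: the helper names, the toy site labels and the random seed are hypothetical, and the model fitting itself is left as a comment.

```python
import random


def leave_one_site_out(site_labels):
    """Yield (held-out site, training indices, test indices), one site at a time."""
    for site in sorted(set(site_labels)):
        test_idx = [i for i, s in enumerate(site_labels) if s == site]
        train_idx = [i for i, s in enumerate(site_labels) if s != site]
        yield site, train_idx, test_idx


def k_fold(indices, k=10, seed=0):
    """Yield (fit indices, validation indices) for k inner folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, validation


# Hypothetical toy sample: three sites of unequal size (the study had 21).
sites = ["site01"] * 12 + ["site02"] * 20 + ["site03"] * 15

outer_splits = list(leave_one_site_out(sites))
inner_splits = {site: list(k_fold(train_idx)) for site, train_idx, _ in outer_splits}

for site, train_idx, test_idx in outer_splits:
    # Hyperparameters would be tuned on inner_splits[site], the model refit on
    # train_idx, and performance evaluated once on the held-out test_idx.
    print(site, len(train_idx), len(test_idx), len(inner_splits[site]))
```

Because children from the same family share a site, holding out whole sites also keeps family members on the same side of each outer split.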
To demonstrate the stability of the results across two years, we built the predictive models (including hyperparameter tuning) separately for baseline and follow-up data. We separately applied standardisation to the baseline training and test sets for both the target and features to prevent data leakage between training and test sets. To ensure similarity in the data scale across two time points, we used the mean and standard deviation of the baseline training and test sets to standardise the follow-up training and test sets, respectively. For cognitive abilities, which were used as the target for all predictive models, we applied this standardisation strategy both before CFA (i.e. to the behavioural performance of the six cognitive tasks) and after CFA (i.e. to the g-factor factor scores). Moreover, we only estimated the CFA of cognitive abilities using the baseline training set to ensure that the predictive models of the two time points had the same target. We then applied this estimated CFA model to the baseline test set and follow-up training and test sets. We examined the predictive performance of the models via the relationship between predicted and observed cognitive abilities, using Pearson’s correlation (r), coefficient of determination (R², calculated using the sum-of-squares definition), mean absolute error (MAE), and root mean square error (RMSE).
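The standardisation strategy and the four performance metrics can be illustrated with a small, self-contained sketch. The function names and toy numbers are hypothetical, and the sample standard deviation (n − 1 denominator) is an assumption of this sketch.

```python
import math


def mean_sd(values):
    """Sample mean and standard deviation (n - 1 denominator, an assumption here)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, math.sqrt(var)


def standardise(values, mean, sd):
    return [(v - mean) / sd for v in values]


def performance(observed, predicted):
    """Pearson's r, sum-of-squares R2, MAE and RMSE between observed and predicted."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    ss_pred = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    return {
        "r": cov / math.sqrt(ss_tot * ss_pred),
        "R2": 1 - ss_res / ss_tot,
        "MAE": sum(abs(o - p) for o, p in zip(observed, predicted)) / n,
        "RMSE": math.sqrt(ss_res / n),
    }


# Hypothetical feature values: follow-up data are scaled with baseline statistics.
baseline_train = [2.0, 4.0, 6.0]
followup_train = [4.0, 6.0, 8.0]
base_mean, base_sd = mean_sd(baseline_train)
z_baseline = standardise(baseline_train, base_mean, base_sd)
z_followup = standardise(followup_train, base_mean, base_sd)

# Hypothetical observed and predicted targets in one test set.
metrics = performance([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5])
print(z_followup, metrics)
```

Note that the sum-of-squares R² can be negative when predictions are worse than the test-set mean, so it is not simply the square of r.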
Predicting cognitive abilities from mental health
We developed predictive models to predict cognitive abilities from two sets of mental health features: CBCL and temperaments. We separately modelled each of these two sets and also simultaneously modelled the two sets by concatenating them into one set of features called ‘mental health.’ We implemented PLS (Wold et al., 2001) as a multivariate algorithm for these predictive models. Note that while PLS is sometimes used for reducing the dimensionality of features within a dataset, here we utilised PLS in a predictive framework: we tuned and estimated PLS loadings in each training set and applied the final model to the corresponding test set. PLS decomposes features into components that capture not only the features' variance but also the target’s variance (Wold et al., 2001). PLS has an advantage in dealing with collinear features (Dormann et al., 2013), which are typical for mental health issues (Caspi and Moffitt, 2018).
PLS has one hyperparameter, the number of components. In our grid search, we tested the number of components, ranging from one to the total number of features. We selected the number of components based on the drop in root mean square error (RMSE): we kept increasing the number of components until an additional component no longer reduced the RMSE by at least 0.1% of the total RMSE. We fit PLS using the mixOmics package (Rohart et al., 2017) with the tidymodels package as a wrapper (Kuhn and Wickham, 2025).
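The stopping rule for the number of PLS components can be sketched as follows. This is one possible reading of the rule, not our analysis code: the function name and the toy RMSE values are hypothetical, and the 0.1% drop is assumed here to be measured relative to the RMSE of the model with one fewer component.

```python
def select_n_components(rmse_by_k, min_rel_drop=0.001):
    """Pick the number of components from cross-validated RMSE values.

    rmse_by_k[i] is the RMSE obtained with i + 1 components. Components are
    added until the next one lowers the RMSE by less than min_rel_drop
    (assumed here to be relative to the RMSE of the preceding model).
    """
    chosen = 1
    for i in range(1, len(rmse_by_k)):
        rel_drop = (rmse_by_k[i - 1] - rmse_by_k[i]) / rmse_by_k[i - 1]
        if rel_drop < min_rel_drop:
            break
        chosen = i + 1
    return chosen


# Hypothetical cross-validated RMSE for one to five components.
cv_rmse = [0.980, 0.950, 0.940, 0.9395, 0.9394]
chosen = select_n_components(cv_rmse)
print(chosen)
```

In this toy example the fourth component lowers the RMSE by only about 0.05%, so three components are retained.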
To understand how PLS made predictions, we examined loadings and the proportion of variance explained. Loadings for each PLS component show how much each feature contributes to each PLS component. The proportion of variance explained shows how much variance each PLS component captures compared to the total variance. We then compared loadings and the proportion of variance explained with the univariate Pearson’s correlation between each feature and the target. Note that because we could not guarantee that each training set would result in the same PLS components, we calculated loadings and the proportion of variance explained on the full data without splitting them into training and test sets. It is important to note that the loadings and the proportion of variance explained are for understanding the models, but for assessing the predictive performance and computing a proxy measure of cognitive abilities (i.e., the predicted values), we still relied on the nested leave-one-site-out cross-validation.
Predicting cognitive abilities from neuroimaging
We developed predictive models to predict cognitive abilities from 45 sets of neuroimaging features. To avoid data leakage, we detected the outliers separately in the baseline training, baseline test, follow-up training, and follow-up test sets. Similarly, to harmonise neuroimaging features across different sites while avoiding data leakage, we applied ComBat (Fortin et al., 2017; Johnson et al., 2007; Nielson et al., 2018) to the training set. We then applied ComBat to the test set, using the ComBatted training set as a reference batch.
Unlike the PLS used above for predictive models from mental health, we chose to apply opportunistic stacking (Engemann et al., 2020; Pat et al., 2022b) when building predictive models from neuroimaging. As we showed previously (Pat et al., 2022b), opportunistic stacking allowed us to handle missingness in the neuroimaging data without sacrificing predictive performance. Missingness in children’s MRI data is expected, given high levels of noise (e.g. movement artifact) (Fassbender et al., 2017). For the ABCD, if we applied listwise exclusion using the study’s exclusion criteria and outlier detection, we would have to exclude every child with any flagged set of neuroimaging features, amounting to around 68% and 74% of the children with MRI data at baseline and follow-up, respectively (Pat et al., 2022b) (see Figure 9). With opportunistic stacking, we only required each participant to have at least one out of 45 sets of neuroimaging features available. Therefore, we needed to exclude just around 9% and 41% of the children at baseline and follow-up, respectively (see Figure 9). Our opportunistic stacking method kept 10,754 and 6412 participants at baseline and follow-up, respectively, while listwise deletion only kept 3784 and 2788 participants, respectively. We previously showed that the predictive performance of the models with opportunistic stacking is similar to that with listwise exclusion (Pat et al., 2022b).
Figure 9. Illustration of data missingness (black) versus presence (grey) across different sets of neuroimaging features.
This figure compares the number of observations in the analysis. Opportunistic stacking (referred to as stacking here) requires at least one set of neuroimaging features to be present, thus allowing the inclusion of more neuroimaging features compared to listwise deletion.
Opportunistic stacking (Engemann et al., 2020; Pat et al., 2022b) involves two layers of modelling: set-specific and stacking layers. In the set-specific layer, we predicted cognitive abilities separately from each set of neuroimaging features using Elastic Net (Zou and Hastie, 2005). While being a linear and non-interactive algorithm, Elastic Net performs relatively well in predicting behaviours from neuroimaging MRI, often on par with, if not better than, other non-linear and interactive algorithms, such as support vector machine with non-linear kernel, XGBoost and Random Forest (Pat et al., 2023; Tetereva et al., 2022; Vieira et al., 2022). Moreover, Elastic Net coefficients are readily explainable, enabling us to explain how the models drew information from each neuroimaging feature when making a prediction (Molnar, 2019; Pat et al., 2023).
Elastic Net minimises the prediction error together with a weighted sum of the magnitudes of the features’ coefficients. Its loss function can be written as:
$$L_{enet}(\beta) = \frac{\sum_{i=1}^{n}\left(y_i - x_i\beta\right)^2}{2n} + \lambda\left(\frac{1-\alpha}{2}\sum_{j=1}^{m}\beta_j^2 + \alpha\sum_{j=1}^{m}\left|\beta_j\right|\right) \tag{1}$$
where $y_i$ is the target (i.e. cognitive abilities) of observation i, $x_i$ is a row vector of all the features in observation i, and β is a column vector of the features’ coefficients. There are two hyperparameters: (1) the penalty (λ) constraining the magnitude of the coefficients and (2) the mixture (α) deciding whether the model is more of a sum of squared coefficients (known as Ridge) or a sum of absolute values of the coefficients (known as Least Absolute Shrinkage and Selection Operator, LASSO). Using grid search, we chose the pair of penalty and mixture based on the lowest root mean square error (RMSE). The penalty was selected from 20 numbers, ranging from to 10, equally spaced with the scale, and the mixture was selected from 11 numbers, ranging from 0 to 1 on a linear scale.
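To make the roles of the two hyperparameters concrete, the sketch below evaluates an Elastic Net loss on toy data. It assumes the parameterisation used by glmnet, namely the squared error divided by 2n plus penalty × ((1 − mixture)/2 × sum of squared coefficients + mixture × sum of absolute coefficients); the function name and the toy values are hypothetical.

```python
def elastic_net_loss(X, y, beta, penalty, mixture):
    """Elastic Net loss under the glmnet-style parameterisation (an assumption)."""
    n = len(y)
    squared_error = sum(
        (y_i - sum(x_ij * b_j for x_ij, b_j in zip(x_i, beta))) ** 2
        for x_i, y_i in zip(X, y)
    )
    ridge_term = sum(b * b for b in beta)
    lasso_term = sum(abs(b) for b in beta)
    return squared_error / (2 * n) + penalty * (
        (1 - mixture) / 2 * ridge_term + mixture * lasso_term
    )


# Hypothetical toy data: two observations, two features.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1.0, 2.0]
beta = [1.0, 1.0]

lasso_loss = elastic_net_loss(X, y, beta, penalty=0.5, mixture=1.0)  # pure LASSO
ridge_loss = elastic_net_loss(X, y, beta, penalty=0.5, mixture=0.0)  # pure Ridge

# Eleven mixture values from 0 to 1 on a linear scale, as described in the text.
mixture_grid = [i / 10 for i in range(11)]
print(lasso_loss, ridge_loss, mixture_grid)
```

With the mixture at 1 only the absolute-value term contributes to the penalty, and with the mixture at 0 only the squared term does.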
Training the set-specific layer resulted in the predicted values of cognitive abilities, one from each set of neuroimaging features. The stacking layer, then, took these predicted values across 45 sets of neuroimaging features and treated them as features to predict cognitive abilities, thereby drawing information across (as opposed to within) sets of neuroimaging features. Importantly, we used the same training set across both layers, ensuring no data leakage between training and test sets. Opportunistic stacking dealt with missing values from each set of neuroimaging features by, first, duplicating each feature (i.e. each of 45 predicted values from the set-specific layer) into two features, resulting in 90 features. We then replaced the missing values in the duplicated features with either unrealistically large (1000) or small (–1000) values. Accordingly, we could keep the data as long as at least one set of neuroimaging features had no missing value. Using these duplicated and imputed features, we predicted cognitive abilities from different sets of neuroimaging features using Random Forest (Breiman, 2001). Ultimately, the stacking layer resulted in a predicted value of cognitive abilities based on 45 sets of neuroimaging features.
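The duplication-and-imputation step of opportunistic stacking can be sketched as follows. This is a minimal illustration, assuming missing set-specific predictions are coded as None; the function name is hypothetical.

```python
LARGE, SMALL = 1000.0, -1000.0  # unrealistically large and small fill values


def duplicate_and_impute(set_predictions):
    """Turn per-set predicted values (None = missing set) into stacking features.

    Each predicted value is duplicated; a missing value is replaced by LARGE in
    one copy and SMALL in the other, so tree splits can isolate missingness.
    """
    if all(value is None for value in set_predictions):
        raise ValueError("at least one set of neuroimaging features is required")
    features = []
    for value in set_predictions:
        features.append(LARGE if value is None else value)
        features.append(SMALL if value is None else value)
    return features


# Hypothetical participant with three sets, the second of which is missing.
print(duplicate_and_impute([0.3, None, -0.1]))
```

With 45 set-specific predictions per participant, this yields the 90 features passed to the Random Forest.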
Random Forest generates several regression trees by bootstrapping observations and including a random subset of features at each split (Breiman, 2001). To make a prediction, Random Forest aggregates predicted values across bootstrapped trees, known as bagging. We used 500 trees and tuned two hyperparameters. First, ‘mtry’ was the number of features selected at each branch. Second, ‘min_n’ was the minimum number of observations in a node needed for the node to be split further. Using a Latin hypercube grid search of 3000 numbers (Dupuy et al., 2015; Sacks et al., 1989; Santner et al., 2003), we chose the pair of mtry, ranging from 1 to 90, and min_n, ranging from 2 to 2000, based on the lowest root mean square error (RMSE).
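A Latin hypercube grid spreads candidate hyperparameter pairs evenly over each range by placing exactly one point in each of n equal-width strata per dimension. The sketch below is a generic illustration of that idea, not the grid generator used in tidymodels; the function names and the seed are hypothetical.

```python
import random
from collections import Counter


def latin_hypercube_unit(n_points, n_dims, seed=0):
    """Sample n_points in the unit cube with one point per stratum in each dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each of the n_points equal-width strata.
        column = [(i + rng.random()) / n_points for i in range(n_points)]
        rng.shuffle(column)  # decouple the strata across dimensions
        columns.append(column)
    return [tuple(column[i] for column in columns) for i in range(n_points)]


def to_int_range(u, low, high):
    """Map u in [0, 1) onto the integers low..high."""
    return low + min(int(u * (high - low + 1)), high - low)


unit_points = latin_hypercube_unit(3000, 2)
# Candidate (mtry, min_n) pairs over the ranges given in the text.
grid = [(to_int_range(u, 1, 90), to_int_range(v, 2, 2000)) for u, v in unit_points]
mtry_counts = Counter(mtry for mtry, _ in grid)
print(grid[:3])
```

Each candidate pair would then be evaluated in the inner cross-validation, and the pair with the lowest RMSE retained.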
To understand how opportunistic stacking made predictions, we plotted Elastic Net coefficients for the set-specific layer and SHapley Additive exPlanations (SHAP) (Lundberg and Lee, 2017) for the stacking layer, averaged across 21 test sites. For the set-specific layer, Elastic Net made a prediction based on the linear summation of its regularised, estimated coefficients, and thus, plotting the coefficient of each neuroimaging feature allowed us to understand the contribution of that feature. For the stacking layer, it is difficult to trace the contribution of each feature in Random Forest directly, given the use of bagging. To overcome this, we computed Shapley values instead (Roth, 1988). Shapley values indicate the weighted differences in a model output when each feature is included versus not included in all possible subsets of features. SHAP (Lundberg and Lee, 2017) is a method to estimate Shapley values efficiently. Thus, SHAP allowed us to visualise the contribution of each set of neuroimaging features to the prediction in the stacking layer. Given that we duplicated the predicted values from each set of neuroimaging features in the stacking layer, we combined the magnitude of SHAP across the duplicates.
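The definition of Shapley values can be illustrated by computing them exactly for a very small model. This brute-force sketch enumerates all subsets, which is feasible only for a handful of features; it is not the approximation used by SHAP or fastshap. The toy model, the baseline of 0 for excluded features and all names are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values for a value function defined on feature subsets."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            weight = (
                factorial(size) * factorial(n_features - size - 1) / factorial(n_features)
            )
            for subset in combinations(others, size):
                included = frozenset(subset)
                # Weighted change in output when feature i joins the subset.
                phi[i] += weight * (value(included | {i}) - value(included))
    return phi


# Hypothetical toy model with an interaction between features 1 and 2.
x = [1.0, 2.0, 3.0]


def toy_model(z):
    return 2.0 * z[0] + z[1] * z[2]


def value(included):
    # Features outside the subset are set to a baseline of 0 (an assumption).
    masked = [x[j] if j in included else 0.0 for j in range(len(x))]
    return toy_model(masked)


phi = shapley_values(value, len(x))

# If two columns were the duplicated copies of one set's predicted values,
# their magnitudes would be summed into a single contribution.
combined_magnitude = abs(phi[1]) + abs(phi[2])
print(phi, combined_magnitude)
```

The values sum to the difference between the model output for the observation and the output at the baseline, which is the property that makes them convenient for attributing a prediction to features.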
We fit Elastic Net and Random Forest using the glmnet (Friedman et al., 2010) and ranger (Wright and Ziegler, 2017) packages, respectively, with the tidymodels (Kuhn and Wickham, 2025) package as a wrapper. We approximated the Shapley values (Lundberg and Lee, 2017) using the fastshap package (Greenwell, 2023). The brain plots were created via the ggseg, ggsegDesterieux, ggsegJHU, and ggsegGordon packages (Mowinckel and Vidal-Piñeiro, 2020).
Predicting cognitive abilities from polygenic scores
We developed predictive models to predict cognitive abilities from polygenic scores, as reflected by PGS of cognitive abilities from three definitions (Davies et al., 2018; Lee et al., 2018; Savage et al., 2018). We first selected the PGS threshold for each of the three definitions that demonstrated the strongest correlation with cognitive abilities within the training set. This left three PGSs as features for our predictive models, one for each definition. To control for population stratification in genetics, we regressed each PGS on four genetic principal components separately for the training and test sets. Later, we treated the residuals of this regression for each PGS as each feature in our predictive models. Similar to the predictive models for the set-specific layer of the neuroimaging features, we used Elastic Net here as an algorithm. Given that the genetic data do not change over time, we used the same genetic features for baseline and follow-up predictive models. We selected participants based on ancestry for predictive models involving polygenic scores, leaving us with a much smaller number of children (n=5776 vs. n=11,868 in the baseline).
Predicting cognitive abilities from socio-demographics, lifestyles, and developmental adverse events
We developed predictive models to predict cognitive abilities from socio-demographics, lifestyles and developmental adverse events, reflected in the 44 features. We implemented PLS (Wold et al., 2001) as the algorithm, as we did for the mental health features. To deal with missing values, we applied the following steps separately for baseline training, baseline test, follow-up training, and follow-up test sets. We first imputed categorical features using the mode and converted them into dummy variables. We then standardised all features and imputed them using K-nearest neighbours with five neighbours. Note that in a particular site, the value of a specific feature was 0 for all of the observations (e.g., site 3 having a crime report at 0 for all children), making it impossible for us to standardise this feature when using this site as a test set. In this case, we kept the value of this feature at 0 and did not standardise it.
Note that the ABCD study provided some features only in the baseline, but not the follow-up. Accordingly, we treated these baseline features as features in our follow-up predictive models and combined them with the other features collected in the follow-up. Supplementary file 4 lists all of the variables and their calculation.
Commonality analyses
Following the predictive modelling procedure above, we extracted predicted values from different sets of features at each test site and treated them as proxy measures of cognitive abilities (Dadi et al., 2021). The out-of-sample relationship between observed and proxy measures of cognitive abilities based on specific features reflects variation in cognitive abilities explained by those features. For instance, the relationship between observed and proxy measures of cognitive abilities based on mental health indicates the variation in cognitive abilities that could be explained by mental health. Capitalising on this variation, we then used commonality analyses (Nimon et al., 2008) to demonstrate the extent to which other proxy measures captured similar variance of cognitive abilities as mental health.
First, to control for the influences of biological sex, age at interview and medication information, we residualised those variables from observed cognitive abilities and each proxy measure of cognitive abilities. We defined medication using the su_y_plus table and generated dummy variables based on the medication’s functionality, as categorized by the Anatomical Therapeutic Chemical (ATC) Classification System (refer to Table 15). We then applied random-intercept, linear-mixed models (Raudenbush and Bryk, 2002) to the data from all test sites, using the lme4 package (Bates et al., 2015). In these models, we considered families to be nested within each site, which allowed different families from each site to have a unique intercept. We treated different proxy measures of cognitive abilities as fixed-effect regressors to explain cognitive abilities. We then estimated marginal R² from the linear-mixed models, which describes the variance explained by all fixed effects included in the models (Nakagawa and Schielzeth, 2013; Vonesh et al., 1996), and multiplied the marginal R² by 100 to obtain a percentage. By including and excluding each proxy measure in the models, we were able to decompose marginal R² into unique (i.e. attributed to the variance uniquely explained by a particular proxy measure) and common (i.e. attributed to the variance jointly explained by a group of proxy measures) effects (Nimon et al., 2008). We focused on the common effects between a proxy measure based on mental health and other proxy measures in four sets of commonality analyses. Note that each of the four sets of commonality analyses used different numbers of participants, depending on the data availability.
Table 15. Medication reports in the baseline and follow-up.
This report is derived from the su_y_plus table and utilises the Anatomical Therapeutic Chemical (ATC) Classification System to group medications according to their functionality.
| Functionality | Baseline | Follow-up |
|---|---|---|
| Alimentary tract and metabolism | 144 | 145 |
| Blood and blood forming organs | 12 | 22 |
| Cardiovascular system | 124 | 142 |
| Dermatologicals | 108 | 64 |
| Genitourinary system and sex hormones | 72 | 76 |
| Systemic hormonal preparations, excl. sex hormones and insulins | 26 | 24 |
| Anti-infectives for systemic use | 56 | 35 |
| Antineoplastic and immunomodulating agents | 5 | 5 |
| Musculo-skeletal system | 145 | 183 |
| Nervous system | 710 | 729 |
| Antiparasitic products, insecticides and repellents | 5 | 4 |
| Respiratory system | 721 | 538 |
| Sensory organs | 42 | 42 |
| Various | 1 | 3 |
Commonality analyses for proxy measures of cognitive abilities based on mental health and neuroimaging
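Here, we included proxy measures of cognitive abilities based on mental health and/or neuroimaging. Specifically, for each proxy measure, we added two regressors in the models: the values centred within each site (denoted $cws$) and the site average (denoted $savg$). For instance, we applied the following lme4 syntax for the models with both proxy measures:
$$\text{cognitive abilities} \sim \text{mental health}_{cws} + \text{mental health}_{savg} + \text{neuroimaging}_{cws} + \text{neuroimaging}_{savg} + (1 \mid \text{site}:\text{family}) \tag{2}$$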
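We computed unique and common effects (Nimon et al., 2008) as follows:
$$\begin{aligned}
\text{Unique}_{mh} &= R^2_{mh,b} - R^2_{b} \\
\text{Unique}_{b} &= R^2_{mh,b} - R^2_{mh} \\
\text{Common}_{mh,b} &= R^2_{mh} + R^2_{b} - R^2_{mh,b}
\end{aligned} \tag{3}$$
where the subscript of $R^2$ indicates which proxy measures were included in the model (mh for mental health and b for neuroimaging).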
In addition to using the proxy measures based on neuroimaging from the stacking layer, we also conducted commonality analyses on proxy measures based on neuroimaging from each set of neuroimaging features. This allows us to demonstrate which sets of neuroimaging features showed higher common effects with the proxy measures based on mental health. Note that to include as many participants in the models as possible, we dropped missing values based on the availability of data in each set of neuroimaging features included in the models (i.e., not applying listwise deletion across sets of neuroimaging features).
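The two regressors derived from each proxy measure, the site average and the value centred within each site, can be computed as in the sketch below. The function name and toy values are hypothetical.

```python
from collections import defaultdict


def centre_within_site(values, sites):
    """Return (values centred within each site, site average for each observation)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for value, site in zip(values, sites):
        sums[site] += value
        counts[site] += 1
    site_avg = [sums[site] / counts[site] for site in sites]
    centred = [value - avg for value, avg in zip(values, site_avg)]
    return centred, site_avg


# Hypothetical proxy-measure values for four children at two sites.
proxy = [1.0, 3.0, 10.0, 14.0]
site = ["A", "A", "B", "B"]
centred, site_avg = centre_within_site(proxy, site)
print(centred, site_avg)
```

Splitting each proxy measure this way separates within-site variation from differences between sites, so the two sources enter the mixed model as separate fixed effects.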
Commonality analyses for proxy measures of cognitive abilities based on mental health and polygenic scores
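Here, we included proxy measures of cognitive abilities based on mental health and/or polygenic scores. Since family members had more similar genetics than non-members, we changed our centring strategy for polygenic scores. With the proxy measure based on polygenic scores, we applied (1) centring on two levels: centring its values within each family first and then within each site (denoted $cws,cwf$), and (2) averaging on two levels: averaging its values within each family first and then within each site (denoted $savg,favg$). The proxy measure based on mental health kept the site-level regressors, namely the values centred within each site ($cws$) and the site average ($savg$). Accordingly, we used the following lme4 syntax for the models with both proxy measures:
$$\text{cognitive abilities} \sim \text{mental health}_{cws} + \text{mental health}_{savg} + \text{PGS}_{cws,cwf} + \text{PGS}_{savg,favg} + (1 \mid \text{site}:\text{family}) \tag{4}$$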
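We computed unique and common effects as follows:
$$\begin{aligned}
\text{Unique}_{mh} &= R^2_{mh,g} - R^2_{g} \\
\text{Unique}_{g} &= R^2_{mh,g} - R^2_{mh} \\
\text{Common}_{mh,g} &= R^2_{mh} + R^2_{g} - R^2_{mh,g}
\end{aligned} \tag{5}$$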
Commonality analyses for proxy measures of cognitive abilities based on mental health and socio-demographics, lifestyles, and developmental adverse events
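Here, we included proxy measures of cognitive abilities based on mental health and/or socio-demographics, lifestyles, and developmental adverse events. We applied the following lme4 syntax for the models with both proxy measures, with $cws$ and $savg$ denoting the values centred within each site and the site average, respectively:
$$\text{cognitive abilities} \sim \text{mental health}_{cws} + \text{mental health}_{savg} + \text{soc lif dev}_{cws} + \text{soc lif dev}_{savg} + (1 \mid \text{site}:\text{family}) \tag{6}$$
where soc lif dev is short for socio-demographics, lifestyles, and developmental adverse events. We computed unique and common effects (Nimon et al., 2008) as follows:
$$\begin{aligned}
\text{Unique}_{mh} &= R^2_{mh,s} - R^2_{s} \\
\text{Unique}_{s} &= R^2_{mh,s} - R^2_{mh} \\
\text{Common}_{mh,s} &= R^2_{mh} + R^2_{s} - R^2_{mh,s}
\end{aligned} \tag{7}$$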
Commonality analyses for proxy measures of cognitive abilities based on mental health, neuroimaging, polygenic scores and socio-demographics, lifestyles, and developmental adverse events
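Here, we included proxy measures of cognitive abilities based on mental health, neuroimaging, polygenic scores and/or socio-demographics, lifestyles, and developmental adverse events. We applied the following lme4 syntax for the model with all proxy measures included, using site-level centring and averaging ($cws$ and $savg$) for all proxy measures except polygenic scores, which used family-then-site centring and averaging ($cws,cwf$ and $savg,favg$):
$$\begin{aligned}
\text{cognitive abilities} \sim{} & \text{mental health}_{cws} + \text{mental health}_{savg} + \text{neuroimaging}_{cws} + \text{neuroimaging}_{savg} \\
& + \text{PGS}_{cws,cwf} + \text{PGS}_{savg,favg} + \text{soc lif dev}_{cws} + \text{soc lif dev}_{savg} + (1 \mid \text{site}:\text{family})
\end{aligned} \tag{8}$$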
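We computed unique and common effects (Nimon et al., 2008; Nimon et al., 2017) as follows. For any non-empty subset $S$ of the four proxy measures, with $\bar{S}$ denoting the remaining proxy measures and $R^2_{\emptyset} = 0$:
$$\text{Effect}_{S} = -R^2_{\bar{S}} + \sum_{\emptyset \neq T \subseteq S} (-1)^{|T|+1} R^2_{T \cup \bar{S}} \tag{9}$$
For example, $\text{Unique}_{mh} = R^2_{mh,b,g,s} - R^2_{b,g,s}$ and $\text{Common}_{mh,b} = R^2_{mh,g,s} + R^2_{b,g,s} - R^2_{g,s} - R^2_{mh,b,g,s}$. Applying this to every non-empty subset gives 15 effects: four unique effects and 11 common effects.
Here, mh, b, g, and s denote mental health, brain (i.e. neuroimaging), genetic profile (i.e. polygenic scores) and socio-demographics, lifestyles, and developmental adverse events, respectively.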
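The decomposition into unique and common effects can be sketched in a few lines once the marginal R² of every sub-model is available. This is a generic implementation of the standard commonality formulas, not our analysis code; the function name and the R² values are hypothetical.

```python
from itertools import combinations


def commonality(r2, predictors):
    """Decompose R2 into unique and common effects.

    r2 maps a frozenset of predictor names to the marginal R2 (in %) of the
    model containing exactly those predictors; the empty model has R2 = 0.
    """
    def get(names):
        return r2[frozenset(names)] if names else 0.0

    effects = {}
    for size in range(1, len(predictors) + 1):
        for subset in combinations(predictors, size):
            rest = [p for p in predictors if p not in subset]
            effect = -get(rest)
            # Inclusion-exclusion over the non-empty parts of the subset.
            for part_size in range(1, size + 1):
                for part in combinations(subset, part_size):
                    effect += (-1) ** (part_size + 1) * get(list(part) + rest)
            effects[subset] = effect
    return effects


# Hypothetical marginal R2 values (%) for mental health (mh) and neuroimaging (b).
r2_values = {
    frozenset(["mh"]): 10.0,
    frozenset(["b"]): 25.0,
    frozenset(["mh", "b"]): 28.0,
}
effects = commonality(r2_values, ["mh", "b"])
print(effects)
```

In this toy example, the unique effects are 3 and 18, the common effect is 7, and the three effects sum to the R² of the full model (28). The same function handles three or four proxy measures as long as the R² of every sub-model is supplied.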
Acknowledgements
Data used in the preparation of this article were obtained from the Adolescent Brain Cognitive Development (ABCD) Study (https://abcdstudy.org), held in the NIMH Data Archive (NDA). This is a multisite, longitudinal study designed to recruit more than 10,000 children aged 9–10 and follow them over 10 years into early adulthood. The ABCD Study is supported by the National Institutes of Health and additional federal partners under award numbers U01DA041048, U01DA050989, U01DA051016, U01DA041022, U01DA051018, U01DA051037, U01DA050987, U01DA041174, U01DA041106, U01DA041117, U01DA041028, U01DA041134, U01DA050988, U01DA051039, U01DA041156, U01DA041025, U01DA041120, U01DA051038, U01DA041148, U01DA041093, U01DA041089, U24DA041123, U24DA041147. A full list of supporters is available at https://abcdstudy.org/federal-partners.html. A listing of participating sites and a complete listing of the study investigators can be found at https://abcdstudy.org/consortium_members/. ABCD consortium investigators designed and implemented the study and/or provided data but did not necessarily participate in the analysis or writing of this report. This manuscript reflects the views of the authors and may not reflect the opinions or views of the NIH or ABCD consortium investigators. The ABCD data repository grows and changes over time. The ABCD data used in this report came from DOI:10.15154/z563-zd24. The authors wish to acknowledge the use of New Zealand eScience Infrastructure (NeSI) high-performance computing facilities, consulting support, and/or training services as part of this research. New Zealand’s national facilities are provided by NeSI and funded jointly by NeSI’s collaborator institutions and through the Ministry of Business, Innovation & Employment’s Research Infrastructure programme. URL https://www.nesi.org.nz. 
Yue Wang and Narun Pat were supported by Health Research Council Funding (21/618 and 24/838), the Neurological Foundation of New Zealand (grant number 2350 PRG) and the University of Otago.
Funding Statement
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Contributor Information
Narun Pat, Email: narun.pat@otago.ac.nz.
Jason P Lerch, University of Oxford, Oxford, United Kingdom.
Jonathan Roiser, University College London, London, United Kingdom.
Funding Information
This paper was supported by the following grants:
Health Research Council of New Zealand 21/618 to Narun Pat.
Health Research Council of New Zealand 24/838 to Narun Pat.
Neurological Foundation of New Zealand 2350 PRG to Narun Pat.
University of Otago to Narun Pat.
Additional information
Competing interests
No competing interests declared.
Author contributions
Conceptualization, Data curation, Formal analysis, Visualization, Methodology, Writing – original draft.
Software, Validation.
Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Validation, Methodology, Writing – original draft, Project administration, Writing – review and editing.
Ethics
Human subjects: We used data from the Adolescent Brain Cognitive Development (ABCD) Study Curated Annual Release 5.1 (DOI:10.15154/z563-zd24) from two time points. Institutional Review Boards at each site approved the study protocols. Please see Clark et al., 2018 for ethical details, such as informed consent and confidentiality.
Additional files
Med = Median; IQR = interquartile range; CV = Coefficient of variation; CBCL = Child Behavioural Checklist, reflecting children’s emotional and behavioural problems; UPPS-P = Urgency, Premeditation, Perseverance, Sensation seeking, and Positive urgency Impulsive Behaviour Scale; BAS = Behavioural Activation System. Under the variable names, there is information about the method used to compute these variables and the original variable names in the ABCD data dictionary.
Med = Median; IQR = interquartile range; CV = Coefficient of variation; CBCL = Child Behavioural Checklist, reflecting children’s emotional and behavioural problems; UPPS-P = Urgency, Premeditation, Perseverance, Sensation seeking, and Positive urgency Impulsive Behaviour Scale; BAS = Behavioural Activation System. Under the variable names, there is information about the method used to compute these variables and the original variable names in the ABCD data dictionary.
Med = Median; IQR = interquartile range; CV = Coefficient of variation. Under the variable names, there is information about the method used to compute these variables and the original variable names in the ABCD data dictionary.
Here, we only provide variables that were collected again at the follow-up. Med = Median; IQR = interquartile range; CV = Coefficient of variation. Under the variable names, there is information about the method used to compute these variables and the original variable names in the ABCD data dictionary.
Data availability
We used publicly available ABCD 5.1 data (DOI: 10.15154/z563-zd24) provided by the ABCD Study and held in the NIMH Data Archive. The R analysis script and detailed outputs are available at https://github.com/HAM-lab-Otago-University/Commonality-Analysis-ABCD5.1 (copy archived at HAM-lab-Otago-University, 2025).
References
- Abramovitch A, Short T, Schweiger A. The C Factor: Cognitive dysfunction as a transdiagnostic dimension in psychopathology. Clinical Psychology Review. 2021;86:102007. doi: 10.1016/j.cpr.2021.102007.
- Achenbach TM, Ivanova MY, Rescorla LA. Empirically based assessment and taxonomy of psychopathology for ages 1½-90+ years: Developmental, multi-informant, and multicultural findings. Comprehensive Psychiatry. 2017;79:4–18. doi: 10.1016/j.comppsych.2017.03.006.
- Alexander AL, Lee JE, Lazar M, Field AS. Diffusion tensor imaging of the brain. Neurotherapeutics. 2007;4:316–329. doi: 10.1016/j.nurt.2007.05.011.
- Allegrini AG, Selzam S, Rimfeld K, von Stumm S, Pingault JB, Plomin R. Genomic prediction of cognitive traits in childhood and adolescence. Molecular Psychiatry. 2019;24:819–827. doi: 10.1038/s41380-019-0394-4.
- Ang YS, Frontero N, Belleau E, Pizzagalli DA. Disentangling vulnerability, state and trait features of neurocognitive impairments in depression. Brain. 2020;143:3865–3877. doi: 10.1093/brain/awaa314.
- Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA, Abecasis GR, 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. 2015;526:68–74. doi: 10.1038/nature15393.
- Bagot KS, Matthews SA, Mason M, Squeglia LM, Fowler J, Gray K, Herting M, May A, Colrain I, Godino J, Tapert S, Brown S, Patrick K. Current, future and potential use of mobile and wearable technologies and social media data in the ABCD study to increase understanding of contributors to child health. Developmental Cognitive Neuroscience. 2018;32:121–129. doi: 10.1016/j.dcn.2018.03.008.
- Baker DP, Eslinger PJ, Benavides M, Peters E, Dieckmann NF, Leon J. The cognitive impact of the education revolution: A possible cause of the Flynn Effect on population IQ. Intelligence. 2015;49:144–158. doi: 10.1016/j.intell.2015.01.003.
- Barch DM, Burgess GC, Harms MP, Petersen SE, Schlaggar BL, Corbetta M, Glasser MF, Curtiss S, Dixit S, Feldt C, Nolan D, Bryant E, Hartley T, Footer O, Bjork JM, Poldrack R, Smith S, Johansen-Berg H, Snyder AZ, Van Essen DC, WU-Minn HCP Consortium. Function in the human connectome: task-fMRI and individual differences in behavior. NeuroImage. 2013;80:169–189. doi: 10.1016/j.neuroimage.2013.05.033.
- Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. Journal of Statistical Software. 2015;67:01. doi: 10.18637/jss.v067.i01.
- Bauer PJ, Dikmen SS, Heaton RK, Mungas D, Slotkin J, Beaumont JL. III. NIH Toolbox Cognition Battery (CB): measuring episodic memory. Monographs of the Society for Research in Child Development. 2013;78:34–48. doi: 10.1111/mono.12033.
- Beam E, Potts C, Poldrack RA, Etkin A. A data-driven framework for mapping domains of human neurobiology. Nature Neuroscience. 2021;24:1733–1744. doi: 10.1038/s41593-021-00948-9.
- Bethlehem RAI, Seidlitz J, White SR, Vogel JW, Anderson KM, Adamson C, Adler S, Alexopoulos GS, Anagnostou E, Areces-Gonzalez A, Astle DE, Auyeung B, Ayub M, Bae J, Ball G, Baron-Cohen S, Beare R, Bedford SA, Benegal V, Beyer F, Blangero J, Blesa Cábez M, Boardman JP, Borzage M, Bosch-Bayard JF, Bourke N, Calhoun VD, Chakravarty MM, Chen C, Chertavian C, Chetelat G, Chong YS, Cole JH, Corvin A, Costantino M, Courchesne E, Crivello F, Cropley VL, Crosbie J, Crossley N, Delarue M, Delorme R, Desrivieres S, Devenyi GA, Di Biase MA, Dolan R, Donald KA, Donohoe G, Dunlop K, Edwards AD, Elison JT, Ellis CT, Elman JA, Eyler L, Fair DA, Feczko E, Fletcher PC, Fonagy P, Franz CE, Galan-Garcia L, Gholipour A, Giedd J, Gilmore JH, Glahn DC, Goodyer IM, Grant PE, Groenewold NA, Gunning FM, Gur RE, Gur RC, Hammill CF, Hansson O, Hedden T, Heinz A, Henson RN, Heuer K, Hoare J, Holla B, Holmes AJ, Holt R, Huang H, Im K, Ipser J, Jack CR, Jr, Jackowski AP, Jia T, Johnson KA, Jones PB, Jones DT, Kahn RS, Karlsson H, Karlsson L, Kawashima R, Kelley EA, Kern S, Kim KW, Kitzbichler MG, Kremen WS, Lalonde F, Landeau B, Lee S, Lerch J, Lewis JD, Li J, Liao W, Liston C, Lombardo MV, Lv J, Lynch C, Mallard TT, Marcelis M, Markello RD, Mathias SR, Mazoyer B, McGuire P, Meaney MJ, Mechelli A, Medic N, Misic B, Morgan SE, Mothersill D, Nigg J, Ong MQW, Ortinau C, Ossenkoppele R, Ouyang M, Palaniyappan L, Paly L, Pan PM, Pantelis C, Park MM, Paus T, Pausova Z, Paz-Linares D, Pichet Binette A, Pierce K, Qian X, Qiu J, Qiu A, Raznahan A, Rittman T, Rodrigue A, Rollins CK, Romero-Garcia R, Ronan L, Rosenberg MD, Rowitch DH, Salum GA, Satterthwaite TD, Schaare HL, Schachar RJ, Schultz AP, Schumann G, Schöll M, Sharp D, Shinohara RT, Skoog I, Smyser CD, Sperling RA, Stein DJ, Stolicyn A, Suckling J, Sullivan G, Taki Y, Thyreau B, Toro R, Traut N, Tsvetanov KA, Turk-Browne NB, Tuulari JJ, Tzourio C, Vachon-Presseau É, Valdes-Sosa MJ, Valdes-Sosa PA, Valk SL, van Amelsvoort T, Vandekar SN, Vasung L, Victoria LW, Villeneuve S, Villringer A, Vértes PE, Wagstyl K, Wang YS, Warfield SK, Warrier V, Westman E, Westwater ML, Whalley HC, Witte AV, Yang N, Yeo B, Yun H, Zalesky A, Zar HJ, Zettergren A, Zhou JH, Ziauddeen H, Zugman A, Zuo XN, 3R-BRAIN. AIBL. Alzheimer’s Disease Neuroimaging Initiative. Alzheimer’s Disease Repository Without Borders Investigators. CALM Team. Cam-CAN. CCNP. COBRE. cVEDA. ENIGMA Developmental Brain Age Working Group. Developing Human Connectome Project. FinnBrain. Harvard Aging Brain Study. IMAGEN. KNE96. Mayo Clinic Study of Aging. NSPN. POND. PREVENT-AD Research Group. VETSA. Bullmore ET, Alexander-Bloch AF. Brain charts for the human lifespan. Nature. 2022;604:525–533. doi: 10.1038/s41586-022-04554-y.
- Bissett PG, Hagen MP, Jones HM, Poldrack RA. Design issues and solutions for stop-signal data from the Adolescent Brain Cognitive Development (ABCD) study. eLife. 2021;10:e60185. doi: 10.7554/eLife.60185.
- Bleck TP, Nowinski CJ, Gershon R, Koroshetz WJ. What is the NIH toolbox, and what will it mean to neurology? Neurology. 2013;80:874–875. doi: 10.1212/WNL.0b013e3182872ea0.
- Bogdan R, Baranger DAA, Agrawal A. Polygenic risk scores in clinical psychology: bridging genomic risk to individual differences. Annual Review of Clinical Psychology. 2018;14:119–157. doi: 10.1146/annurev-clinpsy-050817-084847.
- Bolt T, Nomi JS, Yeo BTT, Uddin LQ. Data-driven extraction of a nested model of human brain function. The Journal of Neuroscience. 2017;37:7263–7277. doi: 10.1523/JNEUROSCI.0323-17.2017.
- Breiman L. Random Forests. Machine Learning. 2001;45:5–32. doi: 10.1023/A:1010933404324.
- Bruni O, Ottaviano S, Guidetti V, Romoli M, Innocenzi M, Cortesi F, Giannotti F. The Sleep Disturbance Scale for Children (SDSC). Construction and validation of an instrument to evaluate sleep disturbances in childhood and adolescence. Journal of Sleep Research. 1996;5:251–261. doi: 10.1111/j.1365-2869.1996.00251.x.
- Calvin CM, Batty GD, Der G, Brett CE, Taylor A, Pattie A, Čukić I, Deary IJ. Childhood intelligence in relation to major causes of death in 68 year follow-up: prospective population study. BMJ. 2017;357:j2708. doi: 10.1136/bmj.j2708.
- Carlozzi NE, Tulsky DS, Kail RV, Beaumont JL. NIH toolbox cognition battery (CB): measuring processing speed. Monographs of the Society for Research in Child Development. 2013;78:88–102. doi: 10.1111/mono.12036.
- Carver CS, White TL. Behavioral inhibition, behavioral activation, and affective responses to impending reward and punishment: The BIS/BAS Scales. Journal of Personality and Social Psychology. 1994;67:319–333. doi: 10.1037/0022-3514.67.2.319.
- Carver CS, Johnson SL. Impulsive reactivity to emotion and vulnerability to psychopathology. The American Psychologist. 2018;73:1067–1078. doi: 10.1037/amp0000387.
- Casey BJ, Cannonier T, Conley MI, Cohen AO, Barch DM, Heitzeg MM, Soules ME, Teslovich T, Dellarco DV, Garavan H, Orr CA, Wager TD, Banich MT, Speer NK, Sutherland MT, Riedel MC, Dick AS, Bjork JM, Thomas KM, Chaarani B, Mejia MH, Hagler DJ, Daniela Cornejo M, Sicat CS, Harms MP, Dosenbach NUF, Rosenberg M, Earl E, Bartsch H, Watts R, Polimeni JR, Kuperman JM, Fair DA, Dale AM, ABCD Imaging Acquisition Workgroup. The Adolescent Brain Cognitive Development (ABCD) study: imaging acquisition across 21 sites. Developmental Cognitive Neuroscience. 2018;32:43–54. doi: 10.1016/j.dcn.2018.03.001.
- Caspi A, Moffitt TE. All for one and one for all: mental disorders in one dimension. The American Journal of Psychiatry. 2018;175:831–844. doi: 10.1176/appi.ajp.2018.17121383.
- Chilcoat HD, Anthony JC. Impact of parent monitoring on initiation of drug use through late childhood. Journal of the American Academy of Child & Adolescent Psychiatry. 1996;35:91–100. doi: 10.1097/00004583-199601000-00017.
- Choi SW, Mak TS-H, O’Reilly PF. Tutorial: a guide to performing polygenic risk score analyses. Nature Protocols. 2020;15:2759–2772. doi: 10.1038/s41596-020-0353-1.
- Clark DB, Fisher CB, Bookheimer S, Brown SA, Evans JH, Hopfer C, Hudziak J, Montoya I, Murray M, Pfefferbaum A, Yurgelun-Todd D. Biomedical ethics and clinical oversight in multisite observational neuroimaging studies with children and adolescents: The ABCD experience. Developmental Cognitive Neuroscience. 2018;32:143–154. doi: 10.1016/j.dcn.2017.06.005.
- Cosgrove KT, McDermott TJ, White EJ, Mosconi MW, Thompson WK, Paulus MP, Cardenas-Iniguez C, Aupperle RL. Limits to the generalizability of resting-state functional magnetic resonance imaging studies of youth: An examination of ABCD Study baseline data. Brain Imaging and Behavior. 2022;16:1919–1925. doi: 10.1007/s11682-022-00665-2.
- Cuthbert BN, Insel TR. Toward the future of psychiatric diagnosis: the seven pillars of RDoC. BMC Medicine. 2013;11:126. doi: 10.1186/1741-7015-11-126.
- Dadi K, Varoquaux G, Houenou J, Bzdok D, Thirion B, Engemann D. Population modeling with machine learning can enhance measures of mental health. GigaScience. 2021;10:giab071. doi: 10.1093/gigascience/giab071.
- Dale AM, Fischl B, Sereno MI. Cortical Surface-Based Analysis. NeuroImage. 1999;9:179–194. doi: 10.1006/nimg.1998.0395.
- Daniel MH, Wahlstrom D. Equivalence of Q-interactive and Paper Administrations of Cognitive Tasks: WISC–V. 2014.
- Davies G, Lam M, Harris SE, Trampush JW, Luciano M, Hill WD, Hagenaars SP, Ritchie SJ, Marioni RE, Fawns-Ritchie C, Liewald DCM, Okely JA, Ahola-Olli AV, Barnes CLK, Bertram L, Bis JC, Burdick KE, Christoforou A, DeRosse P, Djurovic S, Espeseth T, Giakoumaki S, Giddaluru S, Gustavson DE, Hayward C, Hofer E, Ikram MA, Karlsson R, Knowles E, Lahti J, Leber M, Li S, Mather KA, Melle I, Morris D, Oldmeadow C, Palviainen T, Payton A, Pazoki R, Petrovic K, Reynolds CA, Sargurupremraj M, Scholz M, Smith JA, Smith AV, Terzikhan N, Thalamuthu A, Trompet S, van der Lee SJ, Ware EB, Windham BG, Wright MJ, Yang J, Yu J, Ames D, Amin N, Amouyel P, Andreassen OA, Armstrong NJ, Assareh AA, Attia JR, Attix D, Avramopoulos D, Bennett DA, Böhmer AC, Boyle PA, Brodaty H, Campbell H, Cannon TD, Cirulli ET, Congdon E, Conley ED, Corley J, Cox SR, Dale AM, Dehghan A, Dick D, Dickinson D, Eriksson JG, Evangelou E, Faul JD, Ford I, Freimer NA, Gao H, Giegling I, Gillespie NA, Gordon SD, Gottesman RF, Griswold ME, Gudnason V, Harris TB, Hartmann AM, Hatzimanolis A, Heiss G, Holliday EG, Joshi PK, Kähönen M, Kardia SLR, Karlsson I, Kleineidam L, Knopman DS, Kochan NA, Konte B, Kwok JB, Le Hellard S, Lee T, Lehtimäki T, Li SC, Lill CM, Liu T, Koini M, London E, Longstreth WT, Jr, Lopez OL, Loukola A, Luck T, Lundervold AJ, Lundquist A, Lyytikäinen LP, Martin NG, Montgomery GW, Murray AD, Need AC, Noordam R, Nyberg L, Ollier W, Papenberg G, Pattie A, Polasek O, Poldrack RA, Psaty BM, Reppermund S, Riedel-Heller SG, Rose RJ, Rotter JI, Roussos P, Rovio SP, Saba Y, Sabb FW, Sachdev PS, Satizabal CL, Schmid M, Scott RJ, Scult MA, Simino J, Slagboom PE, Smyrnis N, Soumaré A, Stefanis NC, Stott DJ, Straub RE, Sundet K, Taylor AM, Taylor KD, Tzoulaki I, Tzourio C, Uitterlinden A, Vitart V, Voineskos AN, Kaprio J, Wagner M, Wagner H, Weinhold L, Wen KH, Widen E, Yang Q, Zhao W, Adams HHH, Arking DE, Bilder RM, Bitsios P, Boerwinkle E, Chiba-Falek O, Corvin A, De Jager PL, Debette S, Donohoe G, Elliott P, Fitzpatrick AL, Gill M, Glahn DC, Hägg S, Hansell NK, Hariri AR, Ikram MK, Jukema JW, Vuoksimaa E, Keller MC, Kremen WS, Launer L, Lindenberger U, Palotie A, Pedersen NL, Pendleton N, Porteous DJ, Räikkönen K, Raitakari OT, Ramirez A, Reinvang I, Rudan I, Schmidt R, Schmidt H, Schofield PW, Schofield PR, Starr JM, Steen VM, Trollor JN, Turner ST, Van Duijn CM, Villringer A, Weinberger DR, Weir DR, Wilson JF, Malhotra A, McIntosh AM, Gale CR, Seshadri S, Mosley TH, Jr, Bressler J, Lencz T, Deary IJ, Dan R. Study of 300,486 individuals identifies 148 independent genetic loci influencing general cognitive function. Nature Communications. 2018;9:2098. doi: 10.1038/s41467-018-04362-x.
- Deary IJ. Intelligence. Annual Review of Psychology. 2012;63:453–482. doi: 10.1146/annurev-psych-120710-100353.
- Deary IJ, Pattie A, Starr JM. The stability of intelligence from age 11 to age 90 years: the Lothian birth cohort of 1921. Psychological Science. 2013;24:2361–2368. doi: 10.1177/0956797613486487.
- Destrieux C, Fischl B, Dale A, Halgren E. Automatic parcellation of human cortical gyri and sulci using standard anatomical nomenclature. NeuroImage. 2010;53:1–15. doi: 10.1016/j.neuroimage.2010.06.010.
- Dick AS, Garcia NL, Pruden SM, Thompson WK, Hawes SW, Sutherland MT, Riedel MC, Laird AR, Gonzalez R. No evidence for a bilingual executive function advantage in the ABCD study. Nature Human Behaviour. 2019;3:692–701. doi: 10.1038/s41562-019-0609-3.
- Dolsen EA, Deardorff J, Harvey AG. Salivary pubertal hormones, sleep disturbance, and an evening circadian preference in adolescents: risk across health domains. The Journal of Adolescent Health. 2019;64:523–529. doi: 10.1016/j.jadohealth.2018.10.003.
- Dormann CF, Elith J, Bacher S, Buchmann C, Carl G, Carré G, Marquéz JRG, Gruber B, Lafourcade B, Leitão PJ, Münkemüller T, McClean C, Osborne PE, Reineking B, Schröder B, Skidmore AK, Zurell D, Lautenbach S. Collinearity: a review of methods to deal with it and a simulation study evaluating their performance. Ecography. 2013;36:27–46. doi: 10.1111/j.1600-0587.2012.07348.x.
- Dubois J, Galdi P, Paul LK, Adolphs R. A distributed brain network predicts general intelligence from resting-state human neuroimaging data. Philosophical Transactions of the Royal Society B. 2018;373:20170284. doi: 10.1098/rstb.2017.0284.
- Duncan L, Shen H, Gelaye B, Meijsen J, Ressler K, Feldman M, Peterson R, Domingue B. Analysis of polygenic risk score usage and performance in diverse human populations. Nature Communications. 2019;10:3328. doi: 10.1038/s41467-019-11112-0.
- Dupuy D, Helbert C, Franco J. DiceDesign and DiceEval: Two R Packages for design and analysis of computer experiments. Journal of Statistical Software. 2015;65:1–38. doi: 10.18637/jss.v065.i11.
- Duyme M, Dumaret AC, Tomkiewicz S. How can we boost IQs of “dull children”?: A late adoption study. PNAS. 1999;96:8790–8794. doi: 10.1073/pnas.96.15.8790.
- East-Richard C, Mercier RA, Nadeau D, Cellard C. Transdiagnostic neurocognitive deficits in psychiatry: A review of meta-analyses. Canadian Psychology / Psychologie Canadienne. 2020;61:190–214. doi: 10.1037/cap0000196.
- Echeverria SE, Diez-Roux AV, Link BG. Reliability of self-reported neighborhood characteristics. Journal of Urban Health. 2004;81:682–701. doi: 10.1093/jurban/jth151.
- Engemann DA, Kozynets O, Sabbagh D, Lemaître G, Varoquaux G, Liem F, Gramfort A. Combining magnetoencephalography with magnetic resonance imaging enhances learning of surrogate-biomarkers. eLife. 2020;9:e54055. doi: 10.7554/eLife.54055.
- Epskamp S. semPlot: unified visualizations of structural equation models. Structural Equation Modeling. 2015;22:474–483. doi: 10.1080/10705511.2014.937847.
- Eriksen BA, Eriksen CW. Effects of noise letters upon the identification of a target letter in a nonsearch task. Perception & Psychophysics. 1974;16:143–149. doi: 10.3758/BF03203267.
- Fassbender C, Mukherjee P, Schweitzer JB. Minimizing noise in pediatric task-based functional MRI; Adolescents with developmental disabilities and typical development. NeuroImage. 2017;149:338–347. doi: 10.1016/j.neuroimage.2017.01.021.
- Federal Bureau of Investigation. Uniform crime reporting program data: county-level detailed arrest and offense data, 2010: Version 2. ICPSR - Interuniversity Consortium for Political and Social Research. 2012;10:33523. doi: 10.3886/ICPSR33523.V2.
- Fischl B, Salat DH, Busa E, Albert M, Dieterich M, Haselgrove C, van der Kouwe A, Killiany R, Kennedy D, Klaveness S, Montillo A, Makris N, Rosen B, Dale AM. Whole brain segmentation: automated labeling of neuroanatomical structures in the human brain. Neuron. 2002;33:341–355. doi: 10.1016/s0896-6273(02)00569-x.
- Flynn JR. The mean IQ of Americans: Massive gains 1932 to 1978. Psychological Bulletin. 1984;95:29–51. doi: 10.1037/0033-2909.95.1.29.
- Flynn JR. What Is Intelligence? Beyond the Flynn Effect. Cambridge University Press; 2009.
- Fortin JP, Parker D, Tunç B, Watanabe T, Elliott MA, Ruparel K, Roalf DR, Satterthwaite TD, Gur RC, Gur RE, Schultz RT, Verma R, Shinohara RT. Harmonization of multi-site diffusion tensor imaging data. NeuroImage. 2017;161:149–170. doi: 10.1016/j.neuroimage.2017.08.047.
- Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software. 2010;33:1–22.
- Frostenson S. Where is the lead exposure risk in your community? 2016. [June 27, 2023]. http://www.vox.com/a/lead-exposure-risk-map
- Garavan H, Bartsch H, Conway K, Decastro A, Goldstein RZ, Heeringa S, Jernigan T, Potter A, Thompson W, Zahs D. Recruiting the ABCD sample: Design considerations and procedures. Developmental Cognitive Neuroscience. 2018;32:16–22. doi: 10.1016/j.dcn.2018.04.004.
- Gershon RC, Cook KF, Mungas D, Manly JJ, Slotkin J, Beaumont JL, Weintraub S. Language measures of the NIH toolbox cognition battery. Journal of the International Neuropsychological Society. 2014;20:642–651. doi: 10.1017/S1355617714000411.
- Goodman R, Meltzer H, Bailey V. The Strengths and Difficulties Questionnaire: a pilot study on the validity of the self-report version. International Review of Psychiatry. 2003;15:173–177. doi: 10.1080/0954026021000046137.
- Gordon EM, Laumann TO, Adeyemo B, Huckins JF, Kelley WM, Petersen SE. Generation and evaluation of a cortical area parcellation from resting-state correlations. Cerebral Cortex. 2016;26:288–303. doi: 10.1093/cercor/bhu239.
- Greenwell B. Fastshap: fast approximate shapley values. 0.1.0. CRAN. 2023. https://cran.r-project.org/package=fastshap
- Hagler DJ, Ahmadi ME, Kuperman J, Holland D, McDonald CR, Halgren E, Dale AM. Automated white-matter tractography using a probabilistic diffusion tensor atlas: Application to temporal lobe epilepsy. Human Brain Mapping. 2009;30:1535–1547. doi: 10.1002/hbm.20619.
- Hagler DJ, Jr, Hatton S, Cornejo MD, Makowski C, Fair DA, Dick AS, Sutherland MT, Casey BJ, Barch DM, Harms MP, Watts R, Bjork JM, Garavan HP, Hilmer L, Pung CJ, Sicat CS, Kuperman J, Bartsch H, Xue F, Heitzeg MM, Laird AR, Trinh TT, Gonzalez R, Tapert SF, Riedel MC, Squeglia LM, Hyde LW, Rosenberg MD, Earl EA, Howlett KD, Baker FC, Soules M, Diaz J, de Leon OR, Thompson WK, Neale MC, Herting M, Sowell ER, Alvarez RP, Hawes SW, Sanchez M, Bodurka J, Breslin FJ, Morris AS, Paulus MP, Simmons WK, Polimeni JR, van der Kouwe A, Nencka AS, Gray KM, Pierpaoli C, Matochik JA, Noronha A, Aklin WM, Conway K, Glantz M, Hoffman E, Little R, Lopez M, Pariyadath V, Weiss SR, Wolff-Hughes DL, DelCarmen-Wiggins R, Feldstein Ewing SW, Miranda-Dominguez O, Nagel BJ, Perrone AJ, Sturgeon DT, Goldstone A, Pfefferbaum A, Pohl KM, Prouty D, Uban K, Bookheimer SY, Dapretto M, Galvan A, Bagot K, Giedd J, Infante MA, Jacobus J, Patrick K, Shilling PD, Desikan R, Li Y, Sugrue L, Banich MT, Friedman N, Hewitt JK, Hopfer C, Sakai J, Tanabe J, Cottler LB, Nixon SJ, Chang L, Cloak C, Ernst T, Reeves G, Kennedy DN, Heeringa S, Peltier S, Schulenberg J, Sripada C, Zucker RA, Iacono WG, Luciana M, Calabro FJ, Clark DB, Lewis DA, Luna B, Schirda C, Brima T, Foxe JJ, Freedman EG, Mruzek DW, Mason MJ, Huber R, McGlade E, Prescot A, Renshaw PF, Yurgelun-Todd DA, Allgaier NA, Dumas JA, Ivanova M, Potter A, Florsheim P, Larson C, Lisdahl K, Charness ME, Fuemmeler B, Hettema JM, Maes HH, Steinberg J, Anokhin AP, Glaser P, Heath AC, Madden PA, Baskin-Sommers A, Constable RT, Grant SJ, Dowling GJ, Brown SA, Jernigan TL, Dale AM. Image processing and analysis methods for the Adolescent Brain Cognitive Development Study. NeuroImage. 2019;202:116091. doi: 10.1016/j.neuroimage.2019.116091.
- HAM-lab-Otago-University. Commonality-analysis-ABCD5.1. swh:1:rev:14a26e3301a1700265ee605ace77316e424683d4. Software Heritage. 2025. https://archive.softwareheritage.org/swh:1:dir:3d00e696493420407582d67394d1fa445a061469;origin=https://github.com/HAM-lab-Otago-University/Commonality-Analysis-ABCD5.1;visit=swh:1:snp:e5ea4a90a3d0d5f43533926bd0170d27445d1e05;anchor=swh:1:rev:14a26e3301a1700265ee605ace77316e424683d4
- Hankin BL, Snyder HR, Gulley LD. In: Developmental Psychopathology. Cicchetti D, editor. Elsevier; 2016. Cognitive risks in developmental psychopathology; pp. 1–74.
- Hartshorne JK, Germine LT. When does cognitive functioning peak? The asynchronous rise and fall of different cognitive abilities across the life span. Psychological Science. 2015;26:433–443. doi: 10.1177/0956797614567339.
- Hunsberger M, O’Malley J, Block T, Norris JC. Relative validation of Block Kids Food Screener for dietary assessment in children and adolescents. Maternal & Child Nutrition. 2015;11:260–270. doi: 10.1111/j.1740-8709.2012.00446.x.
- Insel T, Cuthbert B, Garvey M, Heinssen R, Pine DS, Quinn K, Sanislow C, Wang P. Research domain criteria (RDoC): toward a new classification framework for research on mental disorders. The American Journal of Psychiatry. 2010;167:748–751. doi: 10.1176/appi.ajp.2010.09091379.
- Johnson SL, Turner RJ, Iwata N. BIS/BAS levels and psychiatric disorder: an epidemiological study. Journal of Psychopathology and Behavioral Assessment. 2003;25:25–36. doi: 10.1023/A:1022247919288.
- Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007;8:118–127. doi: 10.1093/biostatistics/kxj037.
- Jorgensen TD, Pornprasertmanit S, Schoemann AM, Rosseel Y. SemTools: useful tools for structural equation modeling. 0.5-6. CRAN. 2022. https://cran.r-project.org/package=semTools
- Keller AS, Pines AR, Shanmugan S, Sydnor VJ, Cui Z, Bertolero MA, Barzilay R, Alexander-Bloch AF, Byington N, Chen A, Conan GM, Davatzikos C, Feczko E, Hendrickson TJ, Houghton A, Larsen B, Li H, Miranda-Dominguez O, Roalf DR, Perrone A, Shetty A, Shinohara RT, Fan Y, Fair DA, Satterthwaite TD. Personalized functional brain network topography is associated with individual differences in youth cognition. Nature Communications. 2023;14:8411. doi: 10.1038/s41467-023-44087-0.
- Kessler RC, Amminger GP, Aguilar-Gaxiola S, Alonso J, Lee S, Ustün TB. Age of onset of mental disorders: a review of recent literature. Current Opinion in Psychiatry. 2007;20:359–364. doi: 10.1097/YCO.0b013e32816ebc8c.
- Kessler RC, Avenevoli S, Costello EJ, Green JG, Gruber MJ, Heeringa S, Merikangas KR, Pennell BE, Sampson NA, Zaslavsky AM. Design and field procedures in the US National Comorbidity Survey Replication Adolescent Supplement (NCS-A). International Journal of Methods in Psychiatric Research. 2009;18:69–83. doi: 10.1002/mpr.279.
- Kind AJH, Jencks S, Brock J, Yu M, Bartels C, Ehlenbach W, Greenberg C, Smith M. Neighborhood socioeconomic disadvantage and 30-day rehospitalization: a retrospective cohort study. Annals of Internal Medicine. 2014;161:765–774. doi: 10.7326/M13-2946.
- Kirlic N, Colaizzi JM, Cosgrove KT, Cohen ZP, Yeh HW, Breslin F, Morris AS, Aupperle RL, Singh MK, Paulus MP. Extracurricular activities, screen media activity, and sleep may be modifiable factors related to children’s cognitive functioning: evidence from the ABCD study. Child Development. 2021;92:2035–2052. doi: 10.1111/cdev.13578.
- Knutson B, Westdorp A, Kaiser E, Hommer D. FMRI visualization of brain activity during a monetary incentive delay task. NeuroImage. 2000;12:20–27. doi: 10.1006/nimg.2000.0593.
- Kuhn M, Wickham H. Tidymodels: easily install and load the “tidymodels” packages. 1.4.1. CRAN. 2025. https://cran.r-project.org/package=tidymodels
- Lee JJ, Wedow R, Okbay A, Kong E, Maghzian O, Zacher M, Nguyen-Viet TA, Bowers P, Sidorenko J, Karlsson Linnér R, Fontana MA, Kundu T, Lee C, Li H, Li R, Royer R, Timshel PN, Walters RK, Willoughby EA, Yengo L, 23andMe Research Team. COGENT (Cognitive Genomics Consortium) Social Science Genetic Association Consortium. Alver M, Bao Y, Clark DW, Day FR, Furlotte NA, Joshi PK, Kemper KE, Kleinman A, Langenberg C, Mägi R, Trampush JW, Verma SS, Wu Y, Lam M, Zhao JH, Zheng Z, Boardman JD, Campbell H, Freese J, Harris KM, Hayward C, Herd P, Kumari M, Lencz T, Luan J, Malhotra AK, Metspalu A, Milani L, Ong KK, Perry JRB, Porteous DJ, Ritchie MD, Smart MC, Smith BH, Tung JY, Wareham NJ, Wilson JF, Beauchamp JP, Conley DC, Esko T, Lehrer SF, Magnusson PKE, Oskarsson S, Pers TH, Robinson MR, Thom K, Watson C, Chabris CF, Meyer MN, Laibson DI, Yang J, Johannesson M, Koellinger PD, Turley P, Visscher PM, Benjamin DJ, Cesarini D. Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals. Nature Genetics. 2018;50:1112–1121. doi: 10.1038/s41588-018-0147-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lencz T, Knowles E, Davies G, Guha S, Liewald DC, Starr JM, Djurovic S, Melle I, Sundet K, Christoforou A, Reinvang I, Mukherjee S, DeRosse P, Lundervold A, Steen VM, John M, Espeseth T, Räikkönen K, Widen E, Palotie A, Eriksson JG, Giegling I, Konte B, Ikeda M, Roussos P, Giakoumaki S, Burdick KE, Payton A, Ollier W, Horan M, Donohoe G, Morris D, Corvin A, Gill M, Pendleton N, Iwata N, Darvasi A, Bitsios P, Rujescu D, Lahti J, Hellard SL, Keller MC, Andreassen OA, Deary IJ, Glahn DC, Malhotra AK. Molecular genetic evidence for overlap between general cognitive ability and risk for schizophrenia: a report from the Cognitive Genomics consorTium (COGENT). Molecular Psychiatry. 2014;19:168–174. doi: 10.1038/mp.2013.166. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Luciana M, Bjork JM, Nagel BJ, Barch DM, Gonzalez R, Nixon SJ, Banich MT. Adolescent neurocognitive development and impacts of substance use: Overview of the adolescent brain cognitive development (ABCD) baseline neurocognition battery. Developmental Cognitive Neuroscience. 2018;32:67–79. doi: 10.1016/j.dcn.2018.02.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lundberg SM, Lee SI. A Unified Approach to Interpreting Model Predictions. Advances in Neural Information Processing Systems 30 (NIPS 2017). 2017. [Google Scholar]
- Makowski C, Brown TT, Zhao W, Hagler DJ, Parekh P, Garavan H, Nichols TE, Jernigan TL, Dale AM. Leveraging the adolescent brain cognitive development study to improve behavioral prediction from neuroimaging in smaller replication samples. Cerebral Cortex. 2024;34:bhae223. doi: 10.1093/cercor/bhae223. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Marek S, Tervo-Clemmens B, Calabro FJ, Montez DF, Kay BP, Hatoum AS, Donohue MR, Foran W, Miller RL, Hendrickson TJ, Malone SM, Kandala S, Feczko E, Miranda-Dominguez O, Graham AM, Earl EA, Perrone AJ, Cordova M, Doyle O, Moore LA, Conan GM, Uriarte J, Snider K, Lynch BJ, Wilgenbusch JC, Pengo T, Tam A, Chen J, Newbold DJ, Zheng A, Seider NA, Van AN, Metoki A, Chauvin RJ, Laumann TO, Greene DJ, Petersen SE, Garavan H, Thompson WK, Nichols TE, Yeo BTT, Barch DM, Luna B, Fair DA, Dosenbach NUF. Reproducible brain-wide association studies require thousands of individuals. Nature. 2022;603:654–660. doi: 10.1038/s41586-022-04492-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Merikangas KR, Avenevoli S, Costello EJ, Koretz D, Kessler RC. National comorbidity survey replication adolescent supplement (NCS-A): I. Background and measures. Journal of the American Academy of Child and Adolescent Psychiatry. 2009;48:367–379. doi: 10.1097/CHI.0b013e31819996f1. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Michelini G, Barch DM, Tian Y, Watson D, Klein DN, Kotov R. Delineating and validating higher-order dimensions of psychopathology in the Adolescent Brain Cognitive Development (ABCD) study. Translational Psychiatry. 2019;9:261. doi: 10.1038/s41398-019-0593-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mihalik A, Brudfors M, Robu M, Ferreira FS, Lin H, Rau A, Wu T, Blumberg SB, Kanber B, Tariq M, Garcia ME, Zor C, Nikitichev DI, Mourão-Miranda J, Oxtoby NP. ABCD neurocognitive prediction challenge 2019: predicting individual fluid intelligence scores from structural MRI using probabilistic segmentation and kernel ridge regression. Adolescent Brain Cognitive Development Neurocognitive Prediction. ABCD-NP 2019; 2019. pp. 133–142. [DOI] [Google Scholar]
- Millan MJ, Agid Y, Brüne M, Bullmore ET, Carter CS, Clayton NS, Connor R, Davis S, Deakin B, DeRubeis RJ, Dubois B, Geyer MA, Goodwin GM, Gorwood P, Jay TM, Joëls M, Mansuy IM, Meyer-Lindenberg A, Murphy D, Rolls E, Saletu B, Spedding M, Sweeney J, Whittington M, Young LJ. Cognitive dysfunction in psychiatric disorders: characteristics, causes and the quest for improved therapy. Nature Reviews. Drug Discovery. 2012;11:141–168. doi: 10.1038/nrd3628. [DOI] [PubMed] [Google Scholar]
- Molnar C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. GitHub; 2019. [Google Scholar]
- Moos RH, Insel PM, Humphrey B. Preliminary Manual for Family Environment Scale, Work Environment Scale, Group Environment Scale. Consulting Psychologists Press; 1974. [Google Scholar]
- Morris SE, Cuthbert BN. Research Domain Criteria: cognitive systems, neural circuits, and dimensions of behavior. Dialogues in Clinical Neuroscience. 2012;14:29–37. doi: 10.31887/DCNS.2012.14.1/smorris. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Morris SE, Sanislow CA, Pacheco J, Vaidyanathan U, Gordon JA, Cuthbert BN. Revisiting the seven pillars of RDoC. BMC Medicine. 2022;20:024140. doi: 10.1186/s12916-022-02414-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mowinckel AM, Vidal-Piñeiro D. Visualization of brain statistics with R Packages ggseg and ggseg3d. Advances in Methods and Practices in Psychological Science. 2020;3:466–483. doi: 10.1177/2515245920928009. [DOI] [Google Scholar]
- Nakagawa S, Schielzeth H. A general and simple method for obtaining R2 from generalized linear mixed-effects models. Methods in Ecology and Evolution. 2013;4:133–142. doi: 10.1111/j.2041-210x.2012.00261.x. [DOI] [Google Scholar]
- Nielson DM, Pereira F, Zheng CY, Migineishvili N, Lee JA, Thomas AG, Bandettini PA. Detecting and harmonizing scanner differences in the ABCD study - annual release 1.0. bioRxiv. 2018 doi: 10.1101/309260. [DOI]
- Nimon K, Lewis M, Kane R, Haynes RM. An R package to compute commonality coefficients in the multiple regression case: an introduction to the package and a practical example. Behavior Research Methods. 2008;40:457–466. doi: 10.3758/brm.40.2.457. [DOI] [PubMed] [Google Scholar]
- Nimon K, Lewis M, Kane R, Haynes RM. Erratum to: An R package to compute commonality coefficients in the multiple regression case: An introduction to the package and a practical example. Behavior Research Methods. 2017;49:08532. doi: 10.3758/s13428-017-0853-2. [DOI] [PubMed] [Google Scholar]
- Pat N, Riglin L, Anney R, Wang Y, Barch DM, Thapar A, Stringaris A. Motivation and cognitive abilities as mediators between polygenic scores and psychopathology in children. Journal of the American Academy of Child and Adolescent Psychiatry. 2022a;61:782–795. doi: 10.1016/j.jaac.2021.08.019. [DOI] [PubMed] [Google Scholar]
- Pat N, Wang Y, Anney R, Riglin L, Thapar A, Stringaris A. Longitudinally stable, brain-based predictive models mediate the relationships between childhood cognition and socio-demographic, psychological and genetic factors. Human Brain Mapping. 2022b;43:5520–5542. doi: 10.1002/hbm.26027. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pat N, Wang Y, Bartonicek A, Candia J, Stringaris A. Explainable machine learning approach to predict and explain the relationship between task-based fMRI and individual differences in cognition. Cerebral Cortex. 2023;33:2682–2703. doi: 10.1093/cercor/bhac235. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pietschnig J, Voracek M. One century of global IQ gains: a formal meta-analysis of the Flynn effect (1909–2013). Perspectives on Psychological Science. 2015;10:282–306. doi: 10.1177/1745691615577701. [DOI] [PubMed] [Google Scholar]
- Psaty BM, O’Donnell CJ, Gudnason V, Lunetta KL, Folsom AR, Rotter JI, Uitterlinden AG, Harris TB, Witteman JCM, Boerwinkle E, CHARGE Consortium. Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium: design of prospective meta-analyses of genome-wide association studies from 5 cohorts. Circulation. Cardiovascular Genetics. 2009;2:73–80. doi: 10.1161/CIRCGENETICS.108.829747. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Quah SKL, Jo B, Geniesse C, Uddin LQ, Mumford JA, Barch DM, Fair DA, Gotlib IH, Poldrack RA, Saggar M. A data-driven latent variable approach to validating the research domain criteria framework. Nature Communications. 2025;16:55831-z. doi: 10.1038/s41467-025-55831-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rasero J, Sentis AI, Yeh FC, Verstynen T. Integrating across neuroimaging modalities boosts prediction accuracy of cognitive ability. PLOS Computational Biology. 2021;17:e1008347. doi: 10.1371/journal.pcbi.1008347. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Raudenbush SW, Bryk AS. Hierarchical Linear Models: Applications and Data Analysis Methods. 2nd ed. Sage Publications; 2002. [Google Scholar]
- Reef J, Diamantopoulou S, van Meurs I, Verhulst F, van der Ende J. Predicting adult emotional and behavioral problems from externalizing problem trajectories in a 24-year longitudinal study. European Child & Adolescent Psychiatry. 2010;19:577–585. doi: 10.1007/s00787-010-0088-6. [DOI] [PubMed] [Google Scholar]
- Rindermann H, Becker D, Coyle TR. Survey of expert opinion on intelligence: The Flynn effect and the future of intelligence. Personality and Individual Differences. 2017;106:242–247. doi: 10.1016/j.paid.2016.10.061. [DOI] [Google Scholar]
- Rohart F, Gautier B, Singh A, Lê Cao K-A. mixOmics: An R package for ’omics feature selection and multiple data integration. PLOS Computational Biology. 2017;13:e1005752. doi: 10.1371/journal.pcbi.1005752. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Rosseel Y. lavaan: An R package for structural equation modeling. Journal of Statistical Software. 2012;48:1–36. doi: 10.18637/jss.v048.i02. [DOI] [Google Scholar]
- Roth AE. The Shapley Value: Essays in Honor of Lloyd S. Shapley. Cambridge University Press; 1988. [DOI] [Google Scholar]
- Roza SJ, Hofstra MB, van der Ende J, Verhulst FC. Stable prediction of mood and anxiety disorders based on behavioral and emotional problems in childhood: a 14-year follow-up during childhood, adolescence, and young adulthood. The American Journal of Psychiatry. 2003;160:2116–2121. doi: 10.1176/appi.ajp.160.12.2116. [DOI] [PubMed] [Google Scholar]
- Rundquist EA. Intelligence test scores and school marks of high school seniors in 1929 and 1933. School & Society. 1936;43:301–304. [Google Scholar]
- Sacks J, Welch WJ, Mitchell TJ, Wynn HP. Design and analysis of computer experiments. Statistical Science. 1989;4:409–423. doi: 10.1214/ss/1177012413. [DOI] [Google Scholar]
- Santner TJ, Williams BJ, Notz WI. The Design and Analysis of Computer Experiments. Springer; 2003. [DOI] [Google Scholar]
- Savage JE, Jansen PR, Stringer S, Watanabe K, Bryois J, de Leeuw CA, Nagel M, Awasthi S, Barr PB, Coleman JRI, Grasby KL, Hammerschlag AR, Kaminski JA, Karlsson R, Krapohl E, Lam M, Nygaard M, Reynolds CA, Trampush JW, Young H, Zabaneh D, Hägg S, Hansell NK, Karlsson IK, Linnarsson S, Montgomery GW, Muñoz-Manchado AB, Quinlan EB, Schumann G, Skene NG, Webb BT, White T, Arking DE, Avramopoulos D, Bilder RM, Bitsios P, Burdick KE, Cannon TD, Chiba-Falek O, Christoforou A, Cirulli ET, Congdon E, Corvin A, Davies G, Deary IJ, DeRosse P, Dickinson D, Djurovic S, Donohoe G, Conley ED, Eriksson JG, Espeseth T, Freimer NA, Giakoumaki S, Giegling I, Gill M, Glahn DC, Hariri AR, Hatzimanolis A, Keller MC, Knowles E, Koltai D, Konte B, Lahti J, Le Hellard S, Lencz T, Liewald DC, London E, Lundervold AJ, Malhotra AK, Melle I, Morris D, Need AC, Ollier W, Palotie A, Payton A, Pendleton N, Poldrack RA, Räikkönen K, Reinvang I, Roussos P, Rujescu D, Sabb FW, Scult MA, Smeland OB, Smyrnis N, Starr JM, Steen VM, Stefanis NC, Straub RE, Sundet K, Tiemeier H, Voineskos AN, Weinberger DR, Widen E, Yu J, Abecasis G, Andreassen OA, Breen G, Christiansen L, Debrabant B, Dick DM, Heinz A, Hjerling-Leffler J, Ikram MA, Kendler KS, Martin NG, Medland SE, Pedersen NL, Plomin R, Polderman TJC, Ripke S, van der Sluis S, Sullivan PF, Vrieze SI, Wright MJ, Posthuma D. Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. Nature Genetics. 2018;50:912–919. doi: 10.1038/s41588-018-0152-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Schaefer ES. A configurational analysis of children’s reports of parent behavior. Journal of Consulting Psychology. 1965;29:552–557. doi: 10.1037/h0022702. [DOI] [PubMed] [Google Scholar]
- Snellen H. Letterproeven, Tot Bepaling Der Gezigtsscherpte. J. Greven; 1862. [Google Scholar]
- Sripada C, Angstadt M, Rutherford S, Taxali A, Shedden K. Toward a “treadmill test” for cognition: Improved prediction of general cognitive ability from the task activated brain. Human Brain Mapping. 2020;41:3186–3197. doi: 10.1002/hbm.25007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sripada C, Angstadt M, Taxali A, Clark DA, Greathouse T, Rutherford S, Dickens JR, Shedden K, Gard AM, Hyde LW, Weigard A, Heitzeg M. Brain-wide functional connectivity patterns support general cognitive ability and mediate effects of socioeconomic status in youth. Translational Psychiatry. 2021;11:01704-0. doi: 10.1038/s41398-021-01704-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Stover PJ, Harlan WR, Hammond JA, Hendershot T, Hamilton CM. PhenX: a toolkit for interdisciplinary genetics research. Current Opinion in Lipidology. 2010;21:136–140. doi: 10.1097/MOL.0b013e3283377395. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, Downey P, Elliott P, Green J, Landray M, Liu B, Matthews P, Ong G, Pell J, Silman A, Young A, Sprosen T, Peakman T, Collins R. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLOS Medicine. 2015;12:e1001779. doi: 10.1371/journal.pmed.1001779. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Sui J, Jiang R, Bustillo J, Calhoun V. Neuroimaging-based individualized prediction of cognition and behavior for mental disorders and health: methods and promises. Biological Psychiatry. 2020;88:818–828. doi: 10.1016/j.biopsych.2020.02.016. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Tetereva A, Li J, Deng JD, Stringaris A, Pat N. Capturing brain-cognition relationship: Integrating task-based fMRI across tasks markedly boosts prediction and test-retest reliability. NeuroImage. 2022;263:119588. doi: 10.1016/j.neuroimage.2022.119588. [DOI] [PubMed] [Google Scholar]
- Tetereva A, Pat N. Brain age has limited utility as a biomarker for capturing fluid cognition in older individuals. eLife. 2024;12:RP87297. doi: 10.7554/eLife.87297. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Thompson WK, Barch DM, Bjork JM, Gonzalez R, Nagel BJ, Nixon SJ, Luciana M. The structure of cognition in 9 and 10 year-old children and associations with problem behaviors: Findings from the ABCD study’s baseline neurocognitive battery. Developmental Cognitive Neuroscience. 2019;36:100606. doi: 10.1016/j.dcn.2018.12.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Trahan LH, Stuebing KK, Fletcher JM, Hiscock M. The Flynn effect: a meta-analysis. Psychological Bulletin. 2014;140:1332–1360. doi: 10.1037/a0037173. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Uban KA, Horton MK, Jacobus J, Heyser C, Thompson WK, Tapert SF, Madden PAF, Sowell ER, Adolescent Brain Cognitive Development Study. Biospecimens and the ABCD study: Rationale, methods of collection, measurement and early data. Developmental Cognitive Neuroscience. 2018;32:97–106. doi: 10.1016/j.dcn.2018.03.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Van Essen DC, Smith SM, Barch DM, Behrens TEJ, Yacoub E, Ugurbil K, WU-Minn HCP Consortium. The WU-Minn human connectome project: an overview. NeuroImage. 2013;80:62–79. doi: 10.1016/j.neuroimage.2013.05.041. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Vieira BH, Pamplona GSP, Fachinello K, Silva AK, Foss MP, Salmon CEG. On the prediction of human intelligence from neuroimaging: A systematic review of methods and reporting. Intelligence. 2022;93:101654. doi: 10.1016/j.intell.2022.101654. [DOI] [Google Scholar]
- Vonesh EF, Chinchilli VM, Pu K. Goodness-of-fit in generalized nonlinear mixed-effects models. Biometrics. 1996;52:572–587. [PubMed] [Google Scholar]
- Wainschtein P, Jain D, Zheng Z, Cupples LA, Shadyab AH, McKnight B, Shoemaker BM, Mitchell BD, Psaty BM, Kooperberg C, Liu C-T, Albert CM, Roden D, Chasman DI, Darbar D, Lloyd-Jones DM, Arnett DK, Regan EA, Boerwinkle E, Rotter JI, O’Connell JR, Yanek LR, de Andrade M, Allison MA, McDonald M-LN, Chung MK, Fornage M, Chami N, Smith NL, Ellinor PT, Vasan RS, Mathias RA, Loos RJF, Rich SS, Lubitz SA, Heckbert SR, Redline S, Guo X, Chen Y-DI, Laurie CA, Hernandez RD, McGarvey ST, Goddard ME, Laurie CC, North KE, Lange LA, Weir BS, Yengo L, Yang J, Visscher PM, TOPMed Anthropometry Working Group, NHLBI Trans-Omics for Precision Medicine Consortium. Assessing the contribution of rare variants to complex trait heritability from whole-genome sequence data. Nature Genetics. 2022;54:263–273. doi: 10.1038/s41588-021-00997-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Whelan R, Conrod PJ, Poline J-B, Lourdusamy A, Banaschewski T, Barker GJ, Bellgrove MA, Büchel C, Byrne M, Cummins TDR, Fauth-Bühler M, Flor H, Gallinat J, Heinz A, Ittermann B, Mann K, Martinot J-L, Lalor EC, Lathrop M, Loth E, Nees F, Paus T, Rietschel M, Smolka MN, Spanagel R, Stephens DN, Struve M, Thyreau B, Vollstaedt-Klein S, Robbins TW, Schumann G, Garavan H, IMAGEN Consortium. Adolescent impulsivity phenotypes characterized by distinct brain networks. Nature Neuroscience. 2012;15:920–925. doi: 10.1038/nn.3092. [DOI] [PubMed] [Google Scholar]
- Whiteside SP, Lynam DR. Understanding the role of impulsivity and externalizing psychopathology in alcohol abuse: application of the UPPS impulsive behavior scale. Experimental and Clinical Psychopharmacology. 2003;11:210–217. doi: 10.1037/1064-1297.11.3.210. [DOI] [PubMed] [Google Scholar]
- Williams RL. Overview of the Flynn effect. Intelligence. 2013;41:753–764. doi: 10.1016/j.intell.2013.04.010. [DOI] [Google Scholar]
- Wold S, Sjöström M, Eriksson L. PLS-regression: a basic tool of chemometrics. Chemometrics and Intelligent Laboratory Systems. 2001;58:109–130. doi: 10.1016/S0169-7439(01)00155-1. [DOI] [Google Scholar]
- Wongupparaj P, Wongupparaj R, Kumari V, Morris RG. The Flynn effect for verbal and visuospatial short-term and working memory: A cross-temporal meta-analysis. Intelligence. 2017;64:71–80. doi: 10.1016/j.intell.2017.07.006. [DOI] [Google Scholar]
- Wright MN, Ziegler A. ranger: a fast implementation of random forests for high dimensional data in C++ and R. Journal of Statistical Software. 2017;77:1–17. doi: 10.18637/jss.v077.i01. [DOI] [Google Scholar]
- Yang R, Jernigan T. Adolescent Brain Cognitive Development Study (ABCD) - Annual Release 5.1. NIMH Data Archive; 2023. [DOI]
- Yarkoni T, Westfall J. Choosing prediction over explanation in psychology: lessons from machine learning. Perspectives on Psychological Science. 2017;12:1100–1122. doi: 10.1177/1745691617693393. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zapolski TCB, Stairs AM, Settles RF, Combs JL, Smith GT. The measurement of dispositions to rash action in children. Assessment. 2010;17:116–125. doi: 10.1177/1073191109351372. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhao W, Makowski C, Hagler DJ, Garavan HP, Thompson WK, Greene DJ, Jernigan TL, Dale AM. Task fMRI paradigms may capture more behaviorally relevant information than resting-state functional connectivity. NeuroImage. 2023;270:119946. doi: 10.1016/j.neuroimage.2023.119946. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhi D, Jiang R, Pearlson G, Fu Z, Qi S, Yan W, Feng A, Xu M, Calhoun V, Sui J. Triple interactions between the environment, brain, and behavior in children: an ABCD study. Biological Psychiatry. 2024;95:828–838. doi: 10.1016/j.biopsych.2023.12.019. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zou H, Hastie T. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society Series B. 2005;67:301–320. doi: 10.1111/j.1467-9868.2005.00503.x. [DOI] [Google Scholar]
- Zucker RA, Gonzalez R, Feldstein Ewing SW, Paulus MP, Arroyo J, Fuligni A, Morris AS, Sanchez M, Wills T. Assessment of culture and environment in the Adolescent Brain and Cognitive Development Study: Rationale, description of measures, and early data. Developmental Cognitive Neuroscience. 2018;32:107–120. doi: 10.1016/j.dcn.2018.03.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
























