Abstract
Predictive models in neuroimaging are increasingly designed with the intent to improve risk stratification and support interventional efforts in psychiatry. Many of these models are developed in samples of school-aged children or older. Yet despite growing evidence that altered brain maturation during the fetal, infant, and toddler (FIT) period modulates the risk for poor mental health outcomes in childhood, these models are rarely implemented in FIT samples. Applying predictive models at these ages provides an opportunity to develop powerful tools for improved characterization of the neural mechanisms underlying development. To facilitate the broader use of predictive models in FIT neuroimaging, we present a brief systematic review and primer on the methods used in current FIT predictive modeling studies. Reflecting on current practices in more than 100 studies conducted over the past decade, we provide an overview of the topics, modalities, and methods commonly used in the field, as well as under-researched areas. We then outline recommendations and ethical considerations for neuroimaging researchers interested in predicting health outcomes in early life who may be relatively new to advanced machine learning methods or to working with FIT data. Altogether, the last decade of machine learning research in FIT samples provides a foundation for accelerating the prediction of early life trajectories across the full spectrum of illness and health.
1. Introduction
Fetal, infant, and toddler (FIT) neuroimaging is a critical tool for understanding (a)typical brain development in vivo and the developmental origins of psychopathology. Thus far, FIT studies have primarily examined the relationship between phenotypic measures and brain structure or function (i.e., brain-behavior associations and group contrasts) using classical statistical inference (1). Such methods can be powerful but do not explicitly quantify the generalizability of findings to novel samples (2,3). In contrast, machine learning—or predictive—models are defined and validated with independent data, thus promising more generalizable brain-behavior associations (4,5) and even individual-level prediction. Additionally, machine learning procedures often capture more complex associations than classical statistical inference, a widely recognized necessity for biomedical research (6). Accordingly, these methods are becoming mainstays in neuroimaging studies of older individuals but remain scarce in FIT studies.
To facilitate the broader use of predictive models in FIT neuroimaging, we present a brief primer on current predictive modeling methods in FIT studies and a systematic review of the existing literature. We also discuss how ethical considerations may differ for FIT samples compared to older individuals, current challenges of predictive modeling with FIT datasets, and future directions for this line of research. Throughout the manuscript, we use ‘prediction’ to mean analyses that test models in a novel sample (e.g., via cross-validation or external validation), rather than merely associations with a future phenotype. This review encompasses multiple neuroimaging modalities, including magnetic resonance imaging (MRI), functional MRI (fMRI), electroencephalography (EEG), and magnetoencephalography (MEG). Finally, we define a ‘FIT sample’ as one comprising individuals younger than three years of age.
2. Primer on predictive modeling for FIT neuroimaging
First, we provide a primer to introduce FIT neuroimaging researchers to the fundamental concepts of machine learning, highlighting special considerations for working with FIT datasets in general machine learning workflows. While many different approaches exist, most follow the same analytic template outlined here (Figure 1). We refer interested readers to existing work for in-depth explanations of the fundamental mathematics underlying machine learning (4,7) and for best practices in using these approaches with neuroimaging data (8–11).
Figure 1. A typical flowchart of neuroimaging-based predictive modeling using machine learning.

Although the specific implementations may vary across studies, a typical predictive modeling workflow generally includes acquisition and preprocessing of neuroimaging data, model building, model evaluation, interpretation/generalization, and model sharing.
2.1. Predictive modeling workflow
Typically, predictive modeling aims to create a model that estimates an individual’s phenotypic characteristics (e.g., executive function, diagnostic category) from their neuroimaging data. Various machine learning algorithms are leveraged to combine neuroimaging features to estimate continuous (regression) or categorical (classification) outcome measures. While the phenotypes used vary widely, common neuroimaging features include task-evoked activation patterns (fMRI, EEG, MEG, fNIRS), functional connectivity (fMRI, EEG, MEG, fNIRS), brain morphometry (sMRI), and structural connectivity (DTI). The validation strategy and algorithm are selected based on the overall goals and available data. Then, the model is trained, tested, and evaluated.
Ensuring that the data used to build a model (i.e., training data) and the data used to evaluate it (i.e., testing data) remain independent is a fundamental step for prediction (12). Typically, cross-validation and/or external validation are used. Most cross-validation strategies employ some form of k-fold cross-validation, where the dataset is split into k equally sized, non-overlapping subsets. A machine learning algorithm then “learns” a model in k-1 folds; individuals in these k-1 folds comprise the training data. The “learned” model is tested in the left-out fold, labeled the testing data. This process is repeated so that each fold serves once as the testing data. Common choices include 10-fold (i.e., a 90/10 split for training and testing), 5-fold (i.e., an 80/20 split), leave-one-out (k=sample size), and split-half (k=2) cross-validation, where the latter two are special cases of the general k-fold approach. Leave-one-out cross-validation is exhaustive (every possible split of the data into k folds is used), whereas other choices of k are not; repeating the random splits (e.g., 100–1,000 times) is therefore recommended to obtain more stable estimates than a single split provides. External validation—showing that a “learned” model generalizes to externally collected datasets—provides a more robust test than cross-validation, but it requires a second dataset, which may not always be available.
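To make this concrete, below is a minimal sketch of repeated k-fold cross-validation in Python with scikit-learn. The feature matrix and phenotype are simulated stand-ins for real FIT data, and the ridge regression model and fold settings are illustrative assumptions rather than recommendations:

```python
# A minimal sketch of repeated 10-fold cross-validation (scikit-learn).
# X simulates a subjects-by-features neuroimaging matrix; y simulates a phenotype.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import RepeatedKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 500))            # 100 subjects, 500 brain features
y = 0.5 * X[:, 0] + rng.standard_normal(100)   # synthetic phenotype

# 10-fold CV repeated 100 times yields more stable estimates than a single split
cv = RepeatedKFold(n_splits=10, n_repeats=100, random_state=0)
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=cv, scoring="r2")
print(f"Mean cross-validated R^2: {scores.mean():.3f}")
```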
During model training, a machine learning algorithm learns a mathematical function that maps neuroimaging data to the phenotype of interest. Standard algorithms include support vector machines/regression (SVM/R), partial least squares regression (PLSR), penalized regression (LASSO, elastic net, and ridge regression), tree-based approaches, and deep learning. There is no “one size fits all” approach, as each method has specific strengths and limitations (Table 1), and selection depends mainly on the scientific question of interest. For example, SVM is designed for classification, and LASSO produces sparse models in which only a few features contribute to prediction. Neuroimaging data typically have more features than samples, which can hinder algorithm performance. While many algorithms can perform well in these situations (e.g., PLSR reduces the original features to a lower-dimensional space), feature selection can reduce dimensionality by retaining only the most informative subset of features. To maintain the separation of training and testing data, algorithm-specific parameters (or hyperparameters) must be tuned with additional, nested cross-validation within the training data (5).
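As a concrete illustration of hyperparameter tuning without leakage, the sketch below nests a grid search for the LASSO penalty inside each training fold; the data are simulated and the penalty grid is an arbitrary assumption chosen for demonstration:

```python
# A sketch of nested cross-validation: the inner loop tunes the LASSO
# penalty on training data only; the outer loop estimates performance.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 500))                  # simulated brain features
y = X[:, :5].sum(axis=1) + rng.standard_normal(100)  # simulated phenotype

inner_cv = KFold(n_splits=5, shuffle=True, random_state=1)  # tunes alpha
outer_cv = KFold(n_splits=5, shuffle=True, random_state=2)  # estimates performance

model = GridSearchCV(Lasso(max_iter=10000),
                     param_grid={"alpha": [0.01, 0.1, 1.0]},
                     cv=inner_cv)
scores = cross_val_score(model, X, y, cv=outer_cv, scoring="r2")
print(f"Nested CV R^2: {scores.mean():.3f} +/- {scores.std():.3f}")
```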
Table 1.
Overview of common machine learning approaches in FIT neuroimaging.
| Method | Model Type | Description | Advantages | Disadvantages | Methodology reference | Example FIT paper | # FIT papers |
|---|---|---|---|---|---|---|---|
| Linear regression | Regr. | Estimates a linear relationship between the dependent and independent variables | Simple; easy to implement and interpret; penalized versions exist for more features than observations | Assumes a linear relationship between variables; sensitive to outliers | Linear Regression Analysis: Part 14 of a Series on Evaluation of Scientific Publications (85) | Neonatal multi-modal cortical profiles predict 18-month developmental outcomes (64) | 24 |
| Logistic regression | Class. | Estimates a linear relationship between a continuous independent variable and a binary dependent variable ({0,1}) using a sigmoid transformation | Useful for predicting binary outcomes; extends to multiclass classification; penalized versions exist for more features than observations; no distributional assumption, increasing robustness to outliers compared to LDA | Assumes a linear relationship between variables; parameter estimates are unstable with large separation between classes; other methods (e.g., discriminant analysis) may be more accurate with small, normally distributed samples | An Introduction to Statistical Learning with Applications in R (86) | General factors of white matter microstructure from DTI and NODDI in the developing brain (87) | 15 |
| Support vector machine/regression | Both | For classification: finds the hyperplane (i.e., a plane in high-dimensional space, a line in 2D space) that separates classes by maximizing the distance to the observations closest to the separating hyperplane. For regression: finds the hyperplane that keeps predictions within a specified error band, where observations just outside the band define the hyperplane | Works well in low- or high-dimensional data; requires far fewer observations than neural networks; extends to non-linear relationships (e.g., kernel SVM) | SVM only handles binary classification in its base form; kernel methods are less interpretable than simpler methods (e.g., linear regression) | SVM: Machine Learning Methods and Applications to Brain Disorders, Chapter 6 (88); SVR: Machine Learning Methods and Applications to Brain Disorders, Chapter 7 (89) | EEG-based neonatal seizure detection with Support Vector Machines (90) | 47 |
| Discriminant analysis (linear or quadratic) | Class. | Fits densities for each class and finds a linear or quadratic decision boundary that approximates the Bayes decision boundary | Simple and computationally inexpensive (closed-form solution); inherently multiclass; dimension reduction allows for visualization and interpretation | Predictors must be normally distributed | Scikit-learn: Machine learning in Python (91); An Introduction to Statistical Learning with Applications in R (86) | Electrophysiological auditory responses and language development in infants with periventricular leukomalacia (92) | 8 |
| Tree-based methods (random forest, decision trees) | Both | Partitions the feature space into a set of branching choices and then fits a simple model (e.g., a constant) in each partition; more complex variants (e.g., random forest) combine many trees | Conceptually simple; highly interpretable; no need for data normalization; aggregating over many trees can achieve high performance (e.g., random forest, bagging) | Single-tree models may be unstable or sensitive to outliers; building trees can be slow and computationally expensive | The Elements of Statistical Learning: Data Mining, Inference, and Prediction (93) | Brain MRI radiomics analysis may predict poor psychomotor outcome in preterm neonates (94) | 22 |
| Bayesian methods | Both | A function estimation approach that computes a posterior distribution; can be combined with other methods to provide principled estimation and inference of model parameters | Flexible and usable alongside other machine learning approaches; unlike frequentist approaches, prior knowledge can be specified a priori | High computational cost; difficulty specifying the model a priori | An Introduction to Statistical Learning with Applications in R (86); Bayesian Methods for Machine Learning (95) | Subject-specific prediction using nonlinear population modeling: application to early brain maturation from DTI (96) | 8 |
| Deep learning | Both | Learns multiple layers of complex, non-linear representations of the data | Learns very complex relationships; works with many features; highly customizable (loss function, architecture, etc.) | Requires large sample sizes for training; less interpretable than simpler methods (e.g., linear regression); harder to implement; long training time | Deep learning (97) | Predicting motor outcome in preterm infants from very early brain diffusion MRI using a deep learning convolutional neural network (CNN) model (98) | 39 |
Notes: LDA = linear discriminant analysis, SVM = support vector machine, Regr = regression, Class = classification.
Once a model is built, it is applied to unseen data, and the resulting predicted values are compared to the observed ones. Notably, no model parameters should be modified during this step. How well the model predicts a phenotype can be measured in several ways, and multiple measures are generally reported. For regression, Pearson or Spearman correlations between predicted and observed values are common statistics for model evaluation; note that correlation can overestimate prediction performance under cross-validation, and adjusted measures exist (e.g., cross-validated R^2 or q^2; (5)). To measure unstandardized error, the mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE) quantify how much the predicted values deviate from the observed values in the units of the phenotypic data (13). For classification, popular summary metrics include accuracy (the proportion of the sample correctly classified), specificity, sensitivity, and area under the receiver operating characteristic curve (ROC AUC). In either case, any evaluation statistic should be compared to a null distribution to assess statistical significance. Due to the dependency created by cross-validation, null distributions are often generated by permutation testing, where the labels linking the neuroimaging and phenotypic data are shuffled to estimate chance-level prediction nonparametrically; for classification, McNemar’s test can also be used. Finally, groups are often unbalanced, which must be considered when measuring performance (5).
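As one illustration, scikit-learn’s permutation_test_score implements the shuffling procedure described above; the model, simulated data, and number of permutations here are illustrative assumptions:

```python
# A sketch of permutation testing for a cross-validated regression model:
# shuffling y breaks the brain-phenotype link, giving a chance distribution
# against which the observed score is compared.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, permutation_test_score

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 200))        # simulated brain features
y = X[:, 0] + rng.standard_normal(100)     # simulated phenotype

score, perm_scores, p_value = permutation_test_score(
    Ridge(), X, y,
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    n_permutations=1000, scoring="r2")
print(f"Observed R^2 = {score:.3f}, permutation p = {p_value:.3f}")
```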
Confounds in neuroimaging are idiosyncratic to each modality. Any confound considered in a standard explanatory analysis should also be considered in a predictive analysis, since one needs to know whether signal or artifact drives prediction (5). Common potential confounds in FIT samples include age, sex, head motion, and variability in head coils for multi-site studies. Some groups use dedicated infant head coils (14) to increase the signal-to-noise ratio, while others use adult-sized head coils (15). Although bespoke infant head coils have advantages, these must be balanced against their high cost and low availability, and variability in head coils and scanners may confound prediction estimates (16). Confounds can be addressed by harmonizing data across sites (17), attenuating their effects at the preprocessing stage (e.g., via artifact regression), adjusting the input data accordingly (e.g., balancing the confound across groups), including the confound in the model (e.g., estimating the added predictive power of a phenotype; e.g., (18)), or other approaches (19).
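To illustrate one of these strategies, the sketch below regresses a simulated age-at-scan confound out of each feature, fitting the confound model on training data only to avoid leakage into the test set; the variables and settings are simulated assumptions, not an established FIT pipeline:

```python
# A sketch of confound regression: estimate each feature's linear dependence
# on the confound in the training set, then remove it from train and test.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
age = rng.uniform(0, 36, 100)                              # simulated age in months
X = rng.standard_normal((100, 50)) + 0.05 * age[:, None]   # age-contaminated features

train, test = np.arange(80), np.arange(80, 100)

# Fit per-feature confound effects on the training data only
conf_model = LinearRegression().fit(age[train].reshape(-1, 1), X[train])

# Subtract the confound's predicted contribution from both splits
X_train_clean = X[train] - conf_model.predict(age[train].reshape(-1, 1))
X_test_clean = X[test] - conf_model.predict(age[test].reshape(-1, 1))
print(X_train_clean.shape, X_test_clean.shape)
```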
Should a model significantly predict a phenotype, interrogating the contribution of individual brain features is the next, but often overlooked, step (11,20). Model interpretation usually starts with visualizing the model weights for each brain feature. Typically, features with larger weights are considered more important for prediction. However, as with standard linear regression, the size of a weight depends on the specific feature(s) under examination, and larger does not always mean more important (21). Normalizing the model weights, or standardizing the brain features (i.e., z-scoring), can increase the interpretability of model weights. Another approach to understanding feature importance is to iterate through all features, removing a single feature at each iteration and quantifying the resulting change in prediction performance; features that degrade performance the most upon removal have greater importance in the model. As neuroimaging analyses often involve many features, reducing their number through feature selection or algorithms that enforce sparsity (e.g., LASSO, which sets most model weights to zero) can simplify model interpretation. However, enforcing sparsity on highly correlated features (as neuroimaging features typically are) can lead to model instability (22).
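The feature-removal procedure described above might look as follows; this is a sketch on simulated data with far fewer features than a real neuroimaging analysis:

```python
# A sketch of leave-one-feature-out importance: drop each feature in turn
# and record the loss in cross-validated performance relative to baseline.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(4)
X = rng.standard_normal((100, 20))
y = 2 * X[:, 0] + X[:, 1] + rng.standard_normal(100)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
baseline = cross_val_score(Ridge(), X, y, cv=cv, scoring="r2").mean()

importance = []
for j in range(X.shape[1]):
    X_reduced = np.delete(X, j, axis=1)   # remove feature j
    score = cross_val_score(Ridge(), X_reduced, y, cv=cv, scoring="r2").mean()
    importance.append(baseline - score)   # performance lost without feature j

print("Most important feature index:", int(np.argmax(importance)))
```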
Finally, the resulting outcomes, data, and models should be shared when all analyses are complete to facilitate external validation. Importantly, models that do not generalize to external data are not necessarily invalid: demographic factors (e.g., development, sex, cultural factors) may differ between datasets, affecting brain-phenotype associations and the resulting models (23).
2.2. Considerations for adapting a machine learning workflow for FIT data
While the fundamental concepts remain consistent, adapting standard machine learning workflows for FIT data requires special considerations. We highlight five key considerations here, though others may exist.
First, expected features for brain-based predictive models might not be available within a particular age group. For example, task-evoked brain activation patterns are standard features for predictive models, but task-based paradigms are challenging to implement in fetuses and young infants. Similarly, because myelination changes across the first year of life, the gray-white matter boundary has poor contrast around 6 months of age, making features like cortical thickness difficult to estimate. Thus, it is not easy to create models that pool several neuroimaging modalities, which are typically the best-performing models (24).
Second, despite its limited duration, the FIT period is the most rapid and dynamic epoch of brain and behavior maturation in the lifespan. The shape of the hemodynamic response function (25) and the character of respiratory artifacts (26) change rapidly across the FIT period. Therefore, it may not be possible to combine brain features from individuals who are only a few weeks or months apart in age. In addition, many behaviors may not be measurable during the early FIT period—particularly in fetal and neonatal subjects—and longitudinal follow-up is often needed to establish brain-behavior models. Thus, FIT studies generally use a scan collected at one time point to predict a future phenotype. Predicting a future phenotype is more challenging than predicting a concurrent phenotype (as is more typical of non-FIT studies) since individuals can change dramatically between time points. Finally, many milestones, such as walking, ordinarily have broad temporal distributions across individuals, so the prediction of inter-individual variability at this stage may or may not have clinical utility for long-term outcomes.
Third, specialized analytic methods and associated software for FIT neuroimaging analysis are still being built and optimized (27). Preprocessing pipelines developed on adult data typically perform poorly on FIT data (27,28). Moreover, FIT studies lack a widely used common space (such as MNI space) for analyses (29). Encouragingly, this area continues to progress rapidly, with recent advances in fetal sMRI and fMRI imaging sequences (30,31), in-scanner monitoring of infant motion (32), and FIT-specific software (33,34). Nevertheless, the comparatively immature analytic infrastructure for FIT data makes various aspects of predictive modeling workflows more difficult.
Fourth, small samples can limit the choice of algorithms (e.g., many deep learning methods require larger training samples than more traditional methods like SVM/R). Because the adequate sample size depends on many factors, including modality, age (e.g., scanning toddlers with MRI is more challenging than scanning neonates (35)), research design, algorithm, and target accuracy, we lack universal guidelines for sample sizes in FIT imaging. In older cohorts, reproducible findings have been reported with 25 participants (36), yet hundreds of participants (or more) have been recommended (5,37,38). Open-source FIT datasets (see below) might reduce this limitation.
Finally, the interpretation of predictive models in FIT samples is more complex than in older samples. Our understanding of neuroimaging features is primarily based on findings from older individuals, yet these findings do not necessarily generalize to FIT samples. For example, many canonical large-scale brain networks are still forming, and the neurotransmitter GABA is excitatory, not inhibitory, in young infants (39,40). Neurovascular coupling also differs between adults and neonates, with infants showing delayed and either less pronounced or negative hemodynamic responses to stimuli (41). Given the rapid neurodevelopment in FIT samples, even slight differences in age across a sample could drive prediction performance. Likewise, individual differences in intracranial volume or myelination could influence functional measures such that structural, rather than functional, differences drive prediction performance. It is therefore imperative to mitigate confounding age effects so that observed differences between groups do not merely reflect developmental processes. Additionally, the individual’s state should be considered for functional data: most infant fMRI data are collected during natural sleep, which affects functional connectivity patterns and may confound comparisons to other individuals (42).
3. Systematic review of the current FIT studies using predictive models
We systematically reviewed published articles to summarize the state of the science in applying predictive models to FIT neuroimaging datasets. Eligible articles were empirical papers, published between 2010 and 2022 and written in English, that collected neuroimaging data from FIT participants. We excluded articles with the following characteristics: 1) published before January 1, 2010 (due to the limited number of predictive modeling FIT neuroimaging studies before this date); 2) animal model studies, review articles, meta-analyses, case reports, imaging of non-brain organs, and methodological articles (e.g., novel brain segmentation tools); 3) studies where all participants were older than 3 years in chronological age; and 4) studies not using predictive modeling approaches with external or cross-validation.
We collated FIT predictive modeling articles that met the eligibility criteria mentioned above using a PubMed search on May 9, 2022, with the search string: (“fetal” OR “preterm” OR “premature” OR “newborn” OR “neonate*” OR “neonatal” OR “perinatal” OR “infant*” OR “toddler*”) AND (“MRI” OR “magnetic resonance imaging” OR “fMRI” OR “rsfmri” OR “resting state” OR “resting-state” OR “DWI” OR “diffusion weighted imaging” OR “DTI” OR “diffusion tensor imaging” OR “MRS” OR “magnetic resonance spectroscopy” OR “fNIRS” OR “infrared spectroscopy” OR “EEG” OR “electroencephalography” OR “PET” OR “positron emission tomography” OR “neuroimaging” OR “connectome*” OR “functional connectivity”) AND (“prediction” OR “cross validation” OR “machine learning” OR “external validation”). This initial search yielded 1055 articles, which the authors screened for eligibility using the online platform Rayyan (43). Following the initial screening, the abstract and full text of each article were screened by two raters to assess eligibility for inclusion. Each rater made a binary decision of either “include” or “exclude.” A subset of 111 articles was deemed eligible by both raters, and another 134 articles received conflicting ratings, which were adjudicated by a third, blind rater. Ultimately, 134 articles met our eligibility criteria (Figure 2). We estimated inter-rater reliability for included articles using the R package “irr” (44), identifying ‘good’ agreement between the raters (Cohen’s k = 0.602, p < .001).
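For reference, while we computed agreement in R with “irr”, an equivalent Cohen’s kappa calculation in Python (shown with hypothetical rating vectors, purely for illustration) is:

```python
# A sketch of inter-rater agreement via Cohen's kappa (scikit-learn).
from sklearn.metrics import cohen_kappa_score

# Hypothetical include (1) / exclude (0) decisions from two raters
rater_1 = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
rater_2 = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]
print(f"Cohen's kappa: {cohen_kappa_score(rater_1, rater_2):.3f}")
```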
Figure 2. PRISMA diagram for the systematic review of predictive models in fetal, infant, toddler (FIT) neuroimaging datasets (Modified from (99)).

3.1. Prediction target category
Since 2010, the number of FIT predictive modeling papers has steadily increased (Figure 3a). To quantify the phenotypic measures used in FIT prediction studies, we classified articles into four categories based on the prediction target: 1) neurological phenotypes (e.g., seizures); 2) neurodiverse characteristics (e.g., autism, language-learning disorders); 3) physical characteristics (e.g., age, brain maturation); and 4) cognitive and developmental outcomes (e.g., Bayley scores, sleep, motor skills). Most studies have predicted cognitive outcomes (31.3%), physical characteristics (29.9%), and neurological phenotypes (26.9%), whereas only a small portion of the literature has focused on predicting neurodiverse characteristics (11.9%; Figure 3c). Several key themes emerged when examining the literature within each of the four categories, as detailed below.
Figure 3. Trends in FIT predictive modeling research since 2010.

A) The number of peer-reviewed FIT predictive modeling articles has increased over the past decade, with more rapid growth since 2019. B) Distribution of sample size across all studies. The majority of studies had an overall sample size (i.e., combined training and testing data) of less than N = 150 participants, as denoted by the red dashed line. C) Predictive modeling methods have been applied to four broad categories of phenotypes: neurological disorders and their cardinal symptoms, neurodiversity, physical characteristics (e.g., brain morphometry, age), and cognitive outcomes. While a wide array of phenotypes has been successfully predicted, most related to normative developmental milestones rather than clinical phenotypes. D) A range of neuroimaging modalities has been used for FIT prediction, yet most studies have implemented EEG, sMRI, or multimodal approaches, whereas fMRI, MEG, ASL, and ultrasound remain underused in this area. Abbreviations: ASL = arterial spin labeling, DTI = diffusion tensor imaging, EEG = electroencephalography, MEG = magnetoencephalography, sMRI = structural magnetic resonance imaging, fMRI = functional magnetic resonance imaging.
Neurological.
Research on neurological phenotypes has overwhelmingly used predictive modeling to classify fetuses, infants, or toddlers with various conditions, including neonatal encephalopathy (45,46), epilepsy (47), and West syndrome (48). Several studies of cardinal neurological symptoms have implemented machine learning techniques to automate seizure detection (49,50) from EEG recordings.
Neurodiversity.
Of the few articles predicting neurodiversity, most have predicted phenotypes associated with autism (e.g., predicting future autism diagnoses (51), classifying infants who are either at elevated- or normal-likelihood for autism (52,53)). Only one study has predicted cross-cutting psychopathology measures, like Child Behavioral Checklist scores (54), from FIT neuroimaging data.
Physical.
Prediction of physical characteristics in FIT samples has primarily focused on gestational or postmenstrual age and brain morphometric phenotypes (e.g., volume (55), sulcal depth (56)). While much of this work aimed to generate normative models of pre- and postnatal brain development, others have focused on classifying preterm (57) and term-born infants.
Cognitive.
Studies of cognitive outcomes have largely predicted language, cognitive, or motor outcomes in infants ranging from typically developing to neurodiverse populations (e.g., prematurity, neonatal encephalopathy, congenital hearing loss). Although most articles predicted phenotypic data collected at the same time as the neuroimaging data (58), several leveraged neuroimaging data collected in infancy to predict cognitive domains in early or mid-childhood (e.g., 59–61). These phenotypes were primarily indexed by the Bayley Scales of Infant and Toddler Development, Third Edition (62), yet other measures, such as a child’s sleep states (63), social-emotional development (64), and developmental quotient (65), have also been predicted.
3.2. Neuroimaging modality
Most FIT predictive modeling papers have used either EEG (38.8%) or structural MRI (27.6%), yet multimodal imaging (11.9%), diffusion-weighted imaging (11.9%), and fMRI (7.5%) have become increasingly popular in recent years (Figure 3d). Few studies used arterial spin labeling, ultrasound, or MEG for prediction in FIT samples (each <1%); however, several multimodal studies incorporated ultrasound and structural MRI for fetal neuroimaging. Since EEG and structural MRI are routinely collected in clinical settings, it is perhaps unsurprising that this work has heavily featured those modalities. The growing use of multimodal imaging may be necessary for predicting cognitive and psychiatric phenotypes, particularly since multimodal approaches improve model performance when predicting these phenotypes in older cohorts (12,66).
3.3. Machine learning methods
Many machine learning methods have been applied to FIT neuroimaging data, and many papers used multiple complementary approaches to address their research question. Support vector machines/regression (SVM/R; 35.1%), deep learning (29.1%), penalized linear regression (17.9%), and tree-based (16.4%) approaches were among the most common. Other methods included logistic regression, discriminant analysis, Bayesian approaches, and more bespoke algorithms or non-machine-learning methods with external validation. Most studies reported prediction accuracy above chance level (range: 45–100%), with some approaching 100% (see Supplementary Table 1). In addition to the study aims, methodological choices likely reflected the diverse sample sizes across these articles, which ranged from N=10–1851 participants (median=86). Of the nine papers with sample sizes greater than 500, five used EEG data in neonates, three used fetal sMRI, and one used infant fMRI.
4. Ethics of prediction
In addition to the technical concepts discussed above, several ethical concerns exist when applying prediction to FIT neuroimaging data (67). Developing trustworthy and fair models is an established best practice. Trust refers to a model’s robustness to adversarial data manipulations intended to fool, evade, or mislead it in ways that “game the system.” In the worst-case scenario, data manipulations could be exploited to “prove” a negative stereotype of a particular population or to support a fraudulent model developed with fabricated data. Fair models are free of “any prejudice or biases toward an individual or group based on their inherent and acquired characteristics” (68). In other words, an unfair model will perform worse in certain groups, often underrepresented ones (69). General strategies to combat unfair models include creating and testing on diverse, pooled datasets that represent the wider population. However, as bias can stem from nearly limitless sources (70), it may not be possible for a model to be completely bias-free (71).
The development of fair and trustworthy models holds promise for improved risk stratification, early detection, and intervention for various conditions, such as autism, where differential diagnosis from other forms of neurodiversity may facilitate more appropriate support or accommodations (72). Similarly, early detection in FIT samples may improve the quality of life for those struggling with reading impairment (73), language delay (74), or psychopathology (75).
That said, even if fair and trustworthy models can be developed, additional ethical considerations arise when making predictions in FIT samples (76–79). Historically, the prediction of infant outcomes was used to promote harmful ideologies, including eugenics (80). Predicting poorer developmental outcomes or a future clinical diagnosis, under any circumstance, can influence many aspects of a child’s life, including their family care, social integration, and potential clinical interventions. Additionally, these predictions may have lasting effects that extend into adulthood. Given these risks, FIT researchers should be particularly cautious and transparent about quantifying the error of their predictions, and in some cases, they may need to prioritize reducing false positives over false negatives or vice versa. A further concern is that phenotypes of interest may not be stable across the lifespan due to the high degree of developmental variability in early life (81). Individuals may fluctuate between clinically significant and remitted states, or experience diagnostic crossover, throughout childhood, raising questions about not only which phenotype is best for prediction but also when to measure it. While predictive models using FIT data are generally far from clinical use, questions regarding their effect on clinical outcomes should still be considered. For example, some current interventions are expensive, inaccessible, or of limited efficacy for some individuals. Therefore, even if a specific infant outcome could be predicted with perfect accuracy, it may be ethically dubious to predict a poor developmental outcome, or future diagnosis, without the assurance of an effective intervention. A more tractable, immediate goal would be to improve predictive models of cognitive and behavioral phenotypes, such as those identified by the Research Domain Criteria, that can form the foundation of future models of clinical disorders.
5. Future considerations
As the use of predictive modeling with FIT neuroimaging data grows, several future considerations may challenge the field. Recent efforts have created large, open-source datasets, such as the Developing Human Connectome Project, the Baby Connectome Project, and the forthcoming HEALthy Brain and Child Development Study, that promise to lead FIT neuroimaging into the realm of “Big Data”. However, overreliance on such datasets contributes to data decay: as more investigators analyze the same dataset, the number of false positives inadvertently increases. Data decay limits the generalizability of one’s findings and their subsequent interpretation (82). External validation of models from these datasets helps to ensure that a model has not been fit to dataset-specific features, and it represents a critical step in determining the reliability of predictive models. Thus, the need for continued data collection in FIT samples remains beyond these large-scale efforts.
Machine learning and open science practices often go hand in hand. As such, implementing best practices in open science will improve the overall quality of FIT predictive modeling findings, and their interpretation, by facilitating external validation of a model in an independent, external dataset. These practices include sharing and reporting raw data, the preprocessing pipeline used (i.e., parcellations, atlases, code, and processed data), the predictive modeling pipeline (including input features, such as connectivity and phenotypic data, predictive modeling methods, tested hyperparameters, and validation methods), and software (including version numbers). Moreover, data sharing will increase the diversity of datasets, which is needed to detect biases in models. Nevertheless, open science initiatives in FIT neuroimaging are less mature than in other neuroimaging fields.
Finally, given the difficulties and the cost of FIT neuroimaging data acquisition (27), researchers may be hesitant to embrace data sharing fully (83). However, as funding bodies move toward mandating open science practices, taking advantage of these opportunities will be essential for generating larger sample sizes by pooling data (84). Larger sample sizes will, in turn, improve the quality of FIT research by allowing researchers to account for potential biases (see Section 4), properly estimate effect sizes, and use methods, like deep neural nets, that require extensive amounts of data.
6. Conclusion
In summary, we examined the current state of the literature on predictive modeling in early life, and we outlined best practices and ethical considerations for applying machine learning methods to FIT neuroimaging data. Studies have shown that various cutting-edge machine learning approaches can be successfully implemented across neuroimaging modalities to predict developmental milestones in typically developing children and those with health conditions. The emergence of large-scale, open-source FIT neuroimaging datasets will provide exciting new opportunities to extend this work to additional phenotypes, particularly those related to cognition, social development, and psychiatric symptoms, which remain understudied. However, researchers still need to complement these efforts with increased data sharing to improve the accuracy, generalizability, and trustworthiness of model predictions. Embracing collaboration will also be central to overcoming existing data collection challenges, which require interdisciplinary expertise from neuroscientists, physicists, engineers, and psychologists to optimize FIT protocols. Together, decades of advances in FIT neuroimaging and machine learning can now be leveraged to move predictive modeling toward advancing knowledge of early life development in illness and in health.
Supplementary Material
IN THIS ISSUE Statement.
By quantifying brain structure and function throughout the most rapid period of brain development, fetal, infant, and toddler (FIT) neuroimaging promises insight into the earliest antecedents of psychiatric risk. Nevertheless, machine learning—a powerful class of data-driven methods—is rarely implemented in FIT neuroimaging. We present a primer on these methods and a review of the current literature to stimulate their broader use in FIT neuroimaging.
Acknowledgements.
This work was supported by CTSA Grant Number TL1 TR001864 from the National Center for Advancing Translational Science (NCATS), a component of the National Institutes of Health (NIH), and by grants T32 DA022975, K99 MH130894, R01 MH126133, T32 GM100884, and T32 NS041228.
Footnotes
Financial disclosures. All authors report no biomedical financial interests or potential conflicts of interest.
References
- 1. Pollatou A, Filippi CA, Aydin E, Vaughn K, Thompson D, Korom M, et al. (2022): An ode to fetal, infant, and toddler neuroimaging: Chronicling early clinical to research applications with MRI, and an introduction to an academic society connecting the field. Dev Cogn Neurosci 54: 101083.
- 2. Yarkoni T, Westfall J (2017): Choosing Prediction Over Explanation in Psychology: Lessons From Machine Learning. Perspect Psychol Sci 12: 1100–1122.
- 3. Whelan R, Garavan H (2014): When optimism hurts: inflated predictions in psychiatric neuroimaging. Biol Psychiatry 75: 746–748.
- 4. Jordan MI, Mitchell TM (2015): Machine learning: Trends, perspectives, and prospects. Science 349: 255–260.
- 5. Scheinost D, Noble S, Horien C, Greene AS, Lake EM, Salehi M, et al. (2019): Ten simple rules for predictive modeling of individual differences in neuroimaging. Neuroimage 193: 35–45.
- 6. Goecks J, Jalili V, Heiser LM, Gray JW (2020): How Machine Learning Will Transform Biomedicine. Cell 181: 92–101.
- 7. Biermann AW (1986): Fundamental mechanisms in machine learning and inductive inference. In: Bibel W, Jorrand P, editors. Fundamentals of Artificial Intelligence: An Advanced Course. Berlin, Heidelberg: Springer, pp 133–169.
- 8. Davatzikos C (2019): Machine learning in neuroimaging: Progress and challenges. Neuroimage 197: 652–656.
- 9. Janssen RJ, Mourão-Miranda J, Schnack HG (2018): Making Individual Prognoses in Psychiatry Using Neuroimaging and Machine Learning. Biol Psychiatry Cogn Neurosci Neuroimaging 3: 798–808.
- 10. Nielsen AN, Barch DM, Petersen SE, Schlaggar BL, Greene DJ (2020): Machine Learning With Neuroimaging: Evaluating Its Applications in Psychiatry. Biol Psychiatry Cogn Neurosci Neuroimaging 5: 791–798.
- 11. Kohoutová L, Heo J, Cha S, Lee S, Moon T, Wager TD, Woo C-W (2020): Toward a unified framework for interpreting machine-learning models in neuroimaging. Nat Protoc 15: 1399–1435.
- 12. Genon S, Eickhoff SB, Kharabian S (2022): Linking interindividual variability in brain structure to behaviour. Nat Rev Neurosci 23: 307–318.
- 13. Poldrack RA, Huckins G, Varoquaux G (2020): Establishment of Best Practices for Evidence for Prediction: A Review. JAMA Psychiatry 77: 534–540.
- 14. Hughes EJ, Winchman T, Padormo F, Teixeira R, Wurie J, Sharma M, et al. (2017): A dedicated neonatal brain imaging system. Magn Reson Med 78: 794–804.
- 15. Dubois J, Alison M, Counsell SJ, Hertz-Pannier L, Hüppi PS, Benders MJN (2021): MRI of the Neonatal Brain: A Review of Methodological Challenges and Neuroscientific Advances. J Magn Reson Imaging 53: 1318–1343.
- 16. Panman JL, To YY, van der Ende EL, Poos JM, Jiskoot LC, Meeter LHH, et al. (2019): Bias Introduced by Multiple Head Coils in MRI Research: An 8 Channel and 32 Channel Coil Comparison. Front Neurosci 13: 729.
- 17. Yamashita A, Yahata N, Itahashi T, Lisi G, Yamada T, Ichikawa N, et al. (2019): Harmonization of resting-state functional MRI data across multiple imaging sites via the separation of site differences into sampling bias and measurement bias. PLoS Biol 17: e3000042.
- 18. Spisak T (2021): Statistical quantification of confounding bias in predictive modelling. arXiv. https://arxiv.org/abs/2111.00814
- 19. More S, Eickhoff SB, Caspers J, Patil KR (2021): Confound Removal and Normalization in Practice: A Neuroimaging Based Sex Prediction Case Study. In: Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track, pp 3–18.
- 20. Jiang R, Woo C-W, Qi S, Wu J, Sui J (2022): Interpreting Brain Biomarkers: Challenges and solutions in interpreting machine learning-based predictive neuroimaging. IEEE Signal Process Mag 39: 107–118.
- 21. Haufe S, Meinecke F, Görgen K, Dähne S, Haynes J-D, Blankertz B, Bießmann F (2014): On the interpretation of weight vectors of linear models in multivariate neuroimaging. Neuroimage 87: 96–110.
- 22. Kamkar I, Gupta SK, Phung D, Venkatesh S (2015): Exploiting feature relationships towards stable feature selection. In: 2015 IEEE International Conference on Data Science and Advanced Analytics (DSAA). https://doi.org/10.1109/dsaa.2015.7344859
- 23. Greene AS, Shen X, Noble S, Horien C, Hahn CA, Arora J, et al. (2022): Brain–phenotype models fail for individuals who defy sample stereotypes. Nature 609: 109–118.
- 24. Gao S, Greene AS, Constable RT, Scheinost D (2019): Combining multiple connectomes improves predictive modeling of phenotypic measures. Neuroimage 201: 116038.
- 25. Arichi T, Fagiolo G, Varela M, Melendez-Calderon A, Allievi A, Merchant N, et al. (2012): Development of BOLD signal hemodynamic responses in the human brain. Neuroimage 63: 663–673.
- 26. Kaplan S, Meyer D, Miranda-Dominguez O, Perrone A, Earl E, Alexopoulos D, et al. (2022): Filtering respiratory motion artifact from resting state fMRI data in infant and toddler populations. Neuroimage 247: 118838.
- 27. Korom M, Camacho MC, Filippi CA, Licandro R, Moore LA, Dufford A, et al. (2022): Dear reviewers: Responses to common reviewer critiques about infant neuroimaging studies. Dev Cogn Neurosci 53: 101055.
- 28. Fitzgibbon SP, Harrison SJ, Jenkinson M, Baxter L, Robinson EC, Bastiani M, et al. (2020): The developing Human Connectome Project (dHCP) automated resting-state functional processing framework for newborn infants. Neuroimage 223: 117303.
- 29. Dufford AJ, Hahn CA, Peterson H, Gini S, Mehta S, Alfano A, Scheinost D (2022): (Un)common space in infant neuroimaging studies: A systematic review of infant templates. Hum Brain Mapp 43: 3007–3016.
- 30. Seshamani S, Cheng X, Fogtmann M, Thomason ME, Studholme C (2014): A method for handling intensity inhomogenieties in fMRI sequences of moving anatomy of the early developing brain. Med Image Anal 18: 285–300.
- 31. Liao Y, Li X, Jia F, Ye Z, Ning G, Liu S, et al. (2022): Optimization of the image contrast for the developing fetal brain using 3D radial VIBE sequence in 3 T magnetic resonance imaging. BMC Med Imaging 22: 11.
- 32. D’Andrea CB, Kenley JK, Montez DF, Mirro AE, Miller RL, Earl EA, et al. (2022): Real-time motion monitoring improves functional MRI data quality in infants. bioRxiv. https://doi.org/10.1101/2021.11.10.468084
- 33. Rutherford S, Sturmfels P, Angstadt M, Hect J, Wiens J, van den Heuvel MI, et al. (2021): Automated Brain Masking of Fetal Functional MRI with Open Data. Neuroinformatics. https://doi.org/10.1007/s12021-021-09528-5
- 34. Zöllei L, Iglesias JE, Ou Y, Grant PE, Fischl B (2020): Infant FreeSurfer: An automated segmentation and surface extraction pipeline for T1-weighted neuroimaging data of infants 0–2 years. Neuroimage 218: 116946.
- 35. Hendrix CL, Thomason ME (2022): A survey of protocols from 54 infant and toddler neuroimaging research labs. Dev Cogn Neurosci 54: 101060.
- 36. Rosenberg MD, Finn ES, Scheinost D, Papademetris X, Shen X, Constable RT, Chun MM (2016): A neuromarker of sustained attention from whole-brain functional connectivity. Nat Neurosci 19: 165–171.
- 37. Schnack HG, Kahn RS (2016): Detecting Neuroimaging Biomarkers for Psychiatric Disorders: Sample Size Matters. Front Psychiatry 7: 50.
- 38. Varoquaux G (2018): Cross-validation failure: Small sample sizes lead to large error bars. Neuroimage 180: 68–77.
- 39. Ben-Ari Y (2002): Excitatory actions of gaba during development: the nature of the nurture. Nat Rev Neurosci 3: 728–739.
- 40. Ben-Ari Y, Khalilov I, Represa A, Gozlan H (2004): Interneurons set the tune of developing networks. Trends Neurosci 27: 422–427.
- 41. Hendrikx D, Smits A, Lavanga M, De Wel O, Thewissen L, Jansen K, et al. (2019): Measurement of Neurovascular Coupling in Neonates. Front Physiol 10: 65.
- 42. Mitra A, Snyder AZ, Tagliazucchi E, Laufs H, Elison J, Emerson RW, et al. (2017): Resting-state fMRI in sleeping infants more closely resembles adult sleep than adult wakefulness. PLoS One 12: e0188122.
- 43. Ouzzani M, Hammady H, Fedorowicz Z, Elmagarmid A (2016): Rayyan—a web and mobile app for systematic reviews. Syst Rev 5. https://doi.org/10.1186/s13643-016-0384-4
- 44. Gamer M, Lemon J, Fellows I, Singh P (2019): irr: Various Coefficients of Interrater Reliability and Agreement. R package.
- 45. Raurale SA, Nalband S, Boylan GB, Lightbody G, O’Toole JM (2019): Suitability of an inter-burst detection method for grading hypoxic-ischemic encephalopathy in newborn EEG. Conf Proc IEEE Eng Med Biol Soc 2019: 4125–4128.
- 46. Jeong J-W, Lee M-H, Fernandes N, Deol S, Mody S, Arslanturk S, et al. (2022): Neonatal encephalopathy prediction of poor outcome with diffusion-weighted imaging connectome and fixel-based analysis. Pediatr Res 91: 1505–1515.
- 47. Glass HC, Grinspan ZM, Li Y, McNamara NA, Chang T, Chu CJ, et al. (2020): Risk for infantile spasms after acute symptomatic neonatal seizures. Epilepsia 61: 2774–2784.
- 48. Rocha PL, Silva WLS, da Silva Sousa P, da Silva AAM, Barros AK (2022): Discrimination of secondary hypsarrhythmias to Zika virus congenital syndrome and west syndrome based on joint moments and entropy measurements. Sci Rep 12: 7389.
- 49. O’Shea A, Lightbody G, Boylan G, Temko A (2020): Neonatal seizure detection from raw multi-channel EEG using a fully convolutional architecture. Neural Netw 123: 12–25.
- 50. Ansari AH, Cherian PJ, Dereymaeker A, Matic V, Jansen K, De Wispelaere L, et al. (2016): Improved multi-stage neonatal seizure detection using a heuristic classifier and a data-driven post-processor. Clin Neurophysiol 127: 3014–3024.
- 51. Bosl WJ, Tager-Flusberg H, Nelson CA (2018): EEG Analytics for Early Detection of Autism Spectrum Disorder: A data-driven approach. Sci Rep 8: 6828.
- 52. Peck FC, Gabard-Durnam LJ, Wilkinson CL, Bosl W, Tager-Flusberg H, Nelson CA (2021): Prediction of autism spectrum disorder diagnosis using nonlinear measures of language-related EEG at 6 and 12 months. J Neurodev Disord 13: 57.
- 53. Stahl D, Pickles A, Elsabbagh M, Johnson MH, BASIS Team (2012): Novel machine learning methods for ERP analysis: a validation from research on infants at risk for autism. Dev Neuropsychol 37: 274–298.
- 54. Wee C-Y, Tuan TA, Broekman BFP, Ong MY, Chong Y-S, Kwek K, et al. (2017): Neonatal neural networks predict children behavioral profiles later in life. Hum Brain Mapp 38: 1362–1373.
- 55. Gui L, Loukas S, Lazeyras F, Hüppi PS, Meskaldji DE, Borradori Tolsa C (2019): Longitudinal study of neonatal brain tissue volumes in preterm infants and their ability to predict neurodevelopmental outcome. Neuroimage 185: 728–741.
- 56. de Vareilles H, Rivière D, Sun Z, Fischer C, Leroy F, et al. (2022): Shape variability of the central sulcus in the developing brain: a longitudinal descriptive and predictive study in preterm infants. bioRxiv. https://doi.org/10.1101/2021.12.15.472770
- 57. Ball G, Aljabar P, Arichi T, Tusor N, Cox D, Merchant N, et al. (2016): Machine-learning to characterise neonatal functional connectivity in the preterm brain. Neuroimage 124: 267–275.
- 58. Sui J, Jiang R, Bustillo J, Calhoun V (2020): Neuroimaging-based Individualized Prediction of Cognition and Behavior for Mental Disorders and Health: Methods and Promises. Biol Psychiatry 88: 818–828.
- 59. He L, Li H, Holland SK, Yuan W, Altaye M, Parikh NA (2018): Early prediction of cognitive deficits in very preterm infants using functional connectome data in an artificial neural network framework. Neuroimage Clin 18: 290–297.
- 60. Adeli E, Meng Y, Li G, Lin W, Shen D (2019): Multi-task prediction of infant cognitive scores from longitudinal incomplete neuroimaging data. Neuroimage 185: 783–792.
- 61. Tan L, Holland SK, Deshpande AK, Chen Y, Choo DI, Lu LJ (2015): A semi-supervised Support Vector Machine model for predicting the language outcomes following cochlear implantation based on pre-implant brain fMRI imaging. Brain Behav 5: e00391.
- 62. Bayley N (2006): Bayley Scales of Infant and Toddler Development: Administration Manual.
- 63. Čić M, Šoda J, Bonković M (2013): Automatic classification of infant sleep based on instantaneous frequencies in a single-channel EEG signal. Comput Biol Med 43: 2110–2117.
- 64. Fenchel D, Dimitrova R, Robinson EC, Batalle D, Chew A, Falconer S, et al. (2021): Neonatal multi-modal cortical profiles predict 18-month developmental outcomes. bioRxiv. https://doi.org/10.1101/2021.09.23.461464
- 65. De Ridder J, Lavanga M, Verhelle B, Vervisch J, Lemmens K, Kotulska K, et al. (2020): Prediction of Neurodevelopment in Infants With Tuberous Sclerosis Complex Using Early EEG Characteristics. Front Neurol 11: 582891.
- 66. Meng X, Jiang R, Lin D, Bustillo J, Jones T, Chen J, et al. (2017): Predicting individualized clinical measures by a generalized prediction framework and multimodal fusion of MRI data. Neuroimage 145: 218–229.
- 67. Tejavibulya L, Peterson H, Greene A, Gao S, Rolison M, Noble S, Scheinost D (2022): Large-scale differences in functional organization of left- and right-handed individuals using whole-brain, data-driven analysis of connectivity. Neuroimage 252: 119040.
- 68. Mehrabi N, Morstatter F, Saxena N, Lerman K, Galstyan A (2021): A Survey on Bias and Fairness in Machine Learning. ACM Comput Surv 54: 1–35.
- 69. Li J, Bzdok D, Chen J, Tam A, Ooi LQR, Holmes AJ, et al. (2022): Cross-ethnicity/race generalization failure of behavioral prediction from resting-state functional connectivity. Sci Adv 8: eabj1812.
- 70. Olteanu A, Castillo C, Diaz F, Kıcıman E (2019): Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries. Front Big Data 2. https://doi.org/10.3389/fdata.2019.00013
- 71. Saxena NA, Huang K, DeFilippis E, Radanovic G, Parkes DC, Liu Y (2020): How do fairness definitions fare? Testing public attitudes towards three algorithmic definitions of fairness in loan allocations. Artif Intell 283: 103238.
- 72. Emerson RW, Adams C, Nishino T, Hazlett HC, Wolff JJ, Zwaigenbaum L, et al. (2017): Functional neuroimaging of high-risk 6-month-old infants predicts a diagnosis of autism at 24 months of age. Sci Transl Med 9. https://doi.org/10.1126/scitranslmed.aag2882
- 73. Jasińska KK, Shuai L, Lau ANL, Frost S, Landi N, Pugh KR (2021): Functional connectivity in the developing language network in 4-year-old children predicts future reading ability. Dev Sci 24: e13041.
- 74. Vassar R, Schadl K, Cahill-Rowley K, Yeom K, Stevenson D, Rose J (2020): Neonatal Brain Microstructure and Machine-Learning-Based Prediction of Early Language Development in Children Born Very Preterm. Pediatr Neurol 108: 86–92.
- 75. Díaz-Arteche C, Rakesh D (2020): Using neuroimaging to predict brain age: insights into typical and atypical development and risk for psychopathology. J Neurophysiol 124: 400–403.
- 76. Nicholls SG, Wilson BJ, Etchegary H, Brehaut JC, Potter BK, Hayeems R, et al. (2014): Benefits and burdens of newborn screening: public understanding and decision-making. Per Med 11: 593–607.
- 77. Kelly N, Makarem DC, Wasserstein MP (2016): Screening of Newborns for Disorders with High Benefit-Risk Ratios Should Be Mandatory. J Law Med Ethics 44: 231–240.
- 78. Esquerda M, Palau F, Lorenzo D, Cambra FJ, Bofarull M, Cusi V, et al. (2021): Ethical questions concerning newborn genetic screening. Clin Genet 99: 93–98.
- 79. MacDuffie KE, Estes AM, Peay HL, Pruett JR, Wilfond BS (2021): The Ethics of Predicting Autism Spectrum Disorder in Infancy. J Am Acad Child Adolesc Psychiatry 60: 942–945.
- 80. Eugenics and Scientific Racism (n.d.): National Human Genome Research Institute. Retrieved from https://www.genome.gov/about-genomics/fact-sheets/Eugenics-and-Scientific-Racism
- 81. Baca-Garcia E, Perez-Rodriguez MM, Basurte-Villamor I, Del Moral ALF, Jimenez-Arriero MA, De Rivera JLG, et al. (2007): Diagnostic stability of psychiatric disorders in clinical practice. Br J Psychiatry 190: 210–216.
- 82. Horien C, Noble S, Greene A, Lee K, Barron D, Gao S, et al. (2021): A Hitchhiker’s Guide to Working with Large, Open-Source Neuroimaging Datasets. Preprints. https://doi.org/10.20944/preprints202007.0153.v1
- 83. Vaughn KA, Arichi T, Aydin E, Catalina Camacho M, Dapretto M, Ford A, et al. (2022): An Opportunity to Increase Collaborative Science in Fetal, Infant, and Toddler Neuroimaging. Biol Psychiatry. https://doi.org/10.1016/j.biopsych.2022.07.005
- 84. Wachinger C, Rieckmann A, Pölsterl S, Alzheimer’s Disease Neuroimaging Initiative and the Australian Imaging Biomarkers and Lifestyle flagship study of ageing (2021): Detect and correct bias in multi-site neuroimaging datasets. Med Image Anal 67: 101879.
- 85. Schneider A, Hommel G, Blettner M (2010): Linear regression analysis: part 14 of a series on evaluation of scientific publications. Dtsch Arztebl Int 107: 776–782.
- 86. James G, Witten D, Hastie T, Tibshirani R (2013): An Introduction to Statistical Learning: With Applications in R. Springer Science & Business Media.
- 87. Vaher K, Galdi P, Cabez MB, Sullivan G, Stoye DQ, Quigley AJ, et al. (2021): General factors of white matter microstructure from DTI and NODDI in the developing brain. bioRxiv. https://doi.org/10.1101/2021.11.29.470344
- 88. Pisner DA, Schnyer DM (2020): Support vector machine. In: Machine Learning: Methods and Applications to Brain Disorders, pp 101–121.
- 89. Zhang F, O’Donnell LJ (2020): Support vector regression. In: Machine Learning: Methods and Applications to Brain Disorders, pp 123–140.
- 90. Temko A, Thomas E, Marnane W, Lightbody G, Boylan G (2011): EEG-based neonatal seizure detection with Support Vector Machines. Clin Neurophysiol 122: 464–473.
- 91. Pedregosa F, Varoquaux G, Gramfort A, et al. (2011): Scikit-learn: Machine learning in Python. J Mach Learn Res 12: 2825–2830.
- 92. Avecilla Ramirez GN, Ruiz-Correa S, Marroquin JL, Harmony T, Alba A, Mendoza-Montoya O (2011): Electrophysiological auditory responses and language development in infants with periventricular leukomalacia. https://doi.org/10.1037/e512592013-571
- 93. Hastie T, Tibshirani R, Friedman J (2013): The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Science & Business Media.
- 94. Shin Y, Nam Y, Shin T, Choi JW, Lee JH, Jung DE, et al. (2021): Brain MRI radiomics analysis may predict poor psychomotor outcome in preterm neonates. Eur Radiol. https://doi.org/10.1007/s00330-021-07836-7
- 95. Neal RM (2004): Bayesian Methods for Machine Learning. NIPS tutorial. Retrieved from http://media.nips.cc/Conferences/2004/Tutorials/slides/radfordSlides.pdf
- 96. Sadeghi N, Fletcher PT, Prastawa M, Gilmore JH, Gerig G (2014): Subject-Specific Prediction Using Nonlinear Population Modeling: Application to Early Brain Maturation from DTI. In: Medical Image Computing and Computer-Assisted Intervention – MICCAI 2014, pp 33–40.
- 97. LeCun Y, Bengio Y, Hinton G (2015): Deep learning. Nature 521: 436–444.
- 98. Saha S, Pagnozzi A, Bourgeat P, George JM, Bradford D, Colditz PB, et al. (2020): Predicting motor outcome in preterm infants from very early brain diffusion MRI using a deep learning convolutional neural network (CNN) model. Neuroimage 215: 116807.
- 99. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. (2021): The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. https://doi.org/10.31222/osf.io/v7gm2