Abstract
Purpose
To develop a multichannel deep neural network (mcDNN) classification model based on multiscale brain functional connectome data and demonstrate the value of this model by using attention deficit hyperactivity disorder (ADHD) detection as an example.
Materials and Methods
In this retrospective case-control study, existing data from the Neuro Bureau ADHD-200 dataset consisting of 973 participants were used. Multiscale functional brain connectomes based on both anatomic and functional criteria were constructed. The mcDNN model used the multiscale brain connectome data and personal characteristic data (PCD) as joint features to detect ADHD and identify the most predictive brain connectome features for ADHD diagnosis. The mcDNN model was compared with single-channel deep neural network (scDNN) models and the classification performance was evaluated through cross-validation and hold-out validation with the metrics of accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC).
Results
In the cross-validation, the mcDNN model using combined features (fusion of the multiscale brain connectome data and PCD) achieved the best performance in ADHD detection with an AUC of 0.82 (95% confidence interval [CI]: 0.80, 0.83) compared with scDNN models using the features of the brain connectome at each individual scale and PCD, independently. In the hold-out validation, the mcDNN model achieved an AUC of 0.74 (95% CI: 0.73, 0.76).
Conclusion
An mcDNN model was developed for multiscale brain functional connectome data, and its utility for ADHD detection was demonstrated. By fusing the multiscale brain connectome data, the mcDNN model improved ADHD detection performance considerably over the use of a single scale.
© RSNA, 2019
Summary
By fusing the multiscale brain connectome data, the multichannel deep neural network model improved attention deficit hyperactivity disorder detection performance considerably compared with the use of a single scale.
Key Points
■ The proposed multichannel deep neural network, which fuses multiscale brain connectome data, improved performance compared with the use of a single scale in attention deficit hyperactivity disorder detection.
■ The constructed brain functional connectome that spans multiple scales (based on both anatomic and functional criteria) may provide supplementary information for the complementary depiction of networks across the entire brain.
■ The predictive power of using brain connectome at the individual scale was comparable with that of personal characteristic data.
Introduction
Advances in functional MRI techniques have facilitated quantitative mapping of the connections within and between brain networks. This comprehensive mapping of brain functional network connections is referred to as the functional connectome and has opened a window for observing the human mind (1,2). Functional connectome analysis involves investigating functional networks using graph theory–based techniques. The constructed networks span spatial scales ranging from individual voxels to aggregations of voxels (ie, parcels or parcellations of interest). Brain parcellations can be defined based on anatomic criteria (3), functional criteria (4), or both (5). One can interrogate the brain at different scales based on different brain parcellations. Among recent developments is the consensus that although network analyses at the multiscale level may be partly redundant, they can also provide supplementary information for the complementary understanding and depiction of networks within different brain regions as well as across the entire brain (6). In other words, we expect to find similar network properties across different scales, and we also expect to find properties unique to each scale-specific network.
The brain’s functional connectome is considered a key to understanding the exact etiology behind many brain disorders (7–11). Attempts have been made to use functional connectomes as features to classify various brain disorders; one example of a disorder to classify is attention deficit hyperactivity disorder (ADHD). ADHD is a heritable, chronic, neurobehavioral disorder among children and adolescents (12). It is estimated that this condition is diagnosed in 5% of American preschool and school-aged children (12). Children or adolescents with ADHD are at high risk of failing in academics and in building social relationships, which can result in financial hardships for families and create a tremendous burden on society. The diagnosis of ADHD remains challenging. To date, behavior-based tests are the standard clinical approach to diagnosing ADHD (13). Graph-based measures of functional connectivity, such as regional homogeneity, have been used as important features for distinguishing those with ADHD from control subjects (14). Studies have also used whole-brain connectivity features to detect ADHD (7,9,15). However, the functional connectivity analyses in these cited studies lack the consideration that brain networks are fundamentally multiscale entities (6).
Multiscale whole-brain connectivity features are inherently noisy and high dimensional, containing widely distributed but potentially less robust information. A classification model must therefore capture the embedded salient information that is useful for classifying an individual subject. Recent advances in deep neural network approaches have attracted increasing interest in their application to the classification and prediction of brain disorders (8,16,17). Compared with traditional machine learning methods, deep learning methods have been shown to produce physiologically meaningful features and reveal new associations from high-dimensional neuroimaging data (8,17). Traditional machine learning methods have been extensively investigated over the years; however, there has been a lack of studies using deep learning approaches for ADHD detection.
To this end, we proposed a multichannel deep neural network (mcDNN) model analyzing multiscale functional connectome data and tested the value of this model by using ADHD detection as an example. Multiscale functional connectivity features are extracted at coarse-to-fine spatial scales (6) according to both structurally (automated anatomic labeling [AAL]) (3) and functionally (CC200 and CC400) (4) defined parcellations. The CC200 and CC400 functional brain parcellations were constructed using a two-stage spatially constrained spectral clustering procedure. We set out to test the hypothesis that the predictive power of using brain connectome data would be comparable among the different individual scales, but the fusion of multiscale brain connectome improves classification performance. We also included personal characteristic data (PCD) in the proposed model, which have been demonstrated to be potential biomarkers for ADHD classification (7,18). The major contributions of this article can be summarized as follows: first, we depicted the brain connectome at multiple spatial scales; second, we developed an mcDNN model to take multiscale brain connectome data and PCD as joint features; third, we implemented the proposed model to detect ADHD versus control subjects; and finally, we explored the most predictive brain connectome features for ADHD diagnosis.
Materials and Methods
Whole-Brain Functional Connectome at Multiple Spatial Scales
A brain connectome is a comprehensive map of neural connections in the brain. Mathematically, a connectome is a graph, representing the brain connectivity (described as a set of edges) between pairs of brain regions of interest (ROIs; described as a set of nodes). Spatial scale refers to the granularity at which its nodes and edges are defined. For MRI data, spatial scale can range from individual voxels to brain regions and/or parcellations. At the voxel level, data tend to be noisy, and computational costs are expensive owing to the large number of voxels. Therefore, it makes sense to aggregate voxels into parcels or ROIs and analyze the average properties of the ROIs. Questions remain pertaining to the choice of parcellations (ie, number of ROIs) and how the brain connectome properties at different spatial scales are related to each other (19).
The choice of parcellations has implications for the network’s topology (20). The number of ROIs that have been investigated ranges from approximately 60 (21) to approximately 1000 (22) for the whole brain. In this study, we constructed brain connectomes based on a publicly available preprocessed ADHD-200 dataset, where parcellations were done using one anatomically defined template (AAL, 90 ROIs) (3) and two functionally defined templates, CC200 (190 ROIs) and CC400 (351 ROIs). On the basis of different brain parcellations, brain connectome data can be encoded as adjacency matrices with different sizes (ie, 90 × 90, 190 × 190, and 351 × 351, respectively). We defined functional connectivity as the temporal correlation between spatially distinct brain parcels or ROIs (6). The AAL template has been used in numerous anatomic and functional neuroimaging–based research studies to describe regions of cerebral activation. However, it has been reported that the AAL template exhibits poor ROI homogeneity and may not accurately reproduce connectivity patterns (4). The CC200 and CC400 templates were constructed using a two-stage spatially constrained spectral clustering procedure described by Craddock et al (4). Although these well-known structural and functional brain atlases, including the aforementioned AAL, CC200 and CC400, and several other existing parcellations (eg, Eickhoff-Zilles parcellation [https://www.fz-juelich.de/inm/inm-1/EN/Service/Download/download_node.html], Talairach and Tournoux parcellation [http://www.talairach.org], Harvard-Oxford parcellation [https://www.fmrib.ox.ac.uk/fsl/]) have been applied in a majority of brain connectivity analyses, the selection of the proper brain atlas(es) for a particular study or cohort remains an open question. In such a situation, considering multiscale parcellation data simultaneously, as we propose in this study, might be a preferred solution.
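The construction described above can be illustrated with a minimal sketch. It assumes that functional connectivity is the Pearson correlation between ROI-averaged time series and that each symmetric adjacency matrix is reduced to its strictly upper-triangular entries; the synthetic signals, the scan length, and the function names are hypothetical and are not taken from the ADHD-200 data.

```python
import numpy as np


def build_connectome(roi_timeseries: np.ndarray) -> np.ndarray:
    """Pearson correlation between every pair of ROI time series.

    roi_timeseries has shape (n_timepoints, n_rois); the result is a
    symmetric (n_rois, n_rois) adjacency matrix with ones on the diagonal.
    """
    # np.corrcoef treats rows as variables, so pass (n_rois, n_timepoints).
    return np.corrcoef(roi_timeseries.T)


def vectorize_upper_triangle(adjacency: np.ndarray) -> np.ndarray:
    """Keep only the strictly upper-triangular entries (i < j) as a vector."""
    rows, cols = np.triu_indices(adjacency.shape[0], k=1)
    return adjacency[rows, cols]


# Synthetic stand-in for preprocessed ROI signals (assumption, not real data).
rng = np.random.default_rng(0)
n_timepoints = 120  # hypothetical scan length
feature_lengths = {}
for name, n_rois in {"AAL": 90, "CC200": 190, "CC400": 351}.items():
    signals = rng.standard_normal((n_timepoints, n_rois))
    connectome = build_connectome(signals)
    feature_lengths[name] = vectorize_upper_triangle(connectome).size

print(feature_lengths)  # {'AAL': 4005, 'CC200': 17955, 'CC400': 61425}
```

A parcellation with n ROIs yields n(n − 1)/2 unique connections, which is why the three scales produce 4005, 17 955, and 61 425 features, respectively.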
mcDNN Model
Figure 1 shows a schematic diagram of our proposed mcDNN model. Multiple parallel channels take multiple inputs, then fuse them into one single channel to extract combined high-level features, and finally feed the high-level features to a classifier.
Figure 1:
Schematic diagram of the proposed multichannel deep neural network model analyzing multiscale functional brain connectome for a classification task. rs-fMRI = resting-state functional MRI.
ADHD Detection
Datasets: The data used to test our hypothesis are from the publicly available de-identified Neuro Bureau ADHD-200 dataset (http://preprocessed-connectomes-project.org/adhd200/) (23), which consists of 973 participants in total. The participants were scanned at eight different sites: Peking University, Bradley Hospital (Brown University), Kennedy Krieger Institute, The Donders Institute, New York University Child Study Center, Oregon Health and Science University, University of Pittsburgh, and Washington University in St Louis. For every site, the research ethics review boards approved each cohort, and signed written informed consent was obtained from participants (or legal guardians) before participation. No participant had a history of a psychiatric, neurologic, or medical disorder other than ADHD. The ADHD diagnostic criteria are described in the overview of the ADHD-200 dataset (https://fcon_1000.projects.nitrc.org/indi/adhd200/index.html). Site information was included as a feature to control for site bias. The large number of subjects scanned using different scanners and parameters across sites makes the dataset very diverse and offers a unique opportunity for developing robust diagnostic models. The resting-state functional MRI preprocessing was performed using the Athena pipeline (https://neurobureau.projects.nitrc.org/ADHD200/Data.html) (23). A detailed description of the Athena pipeline can be found in Bellec et al (24). Besides resting-state functional MRI data, the ADHD-200 dataset also provides PCD for each subject, such as age, sex, handedness, and intelligence quotients. In this study, we included all 592 subjects who underwent resting-state functional MRI and had complete PCD. The demographic and performance information is shown in Table 1.
Table 1:
Demographic and Performance Information of Subjects Included in This Study, Who Have Undergone Resting-State Functional MRI and Complete Personal Characteristic Data

Model Architecture
The detailed architecture of the proposed mcDNN is presented in Figure 2. Our mcDNN has four input channels. Each input channel contains several neural network blocks. The four input channels are eventually fused into one output channel through two neural network blocks. Each block reduces the feature dimensions. In addition to the fully connected layer, each block includes batch normalization and dropout regularization. Dropout is a regularization technique that randomly selects a certain ratio of neurons and ignores them during training. The “dropped-out” neurons do not contribute to the feedforward process, and the weights of these neurons are not updated in backpropagation. Dropout regularization helps avoid model overfitting. The functional connectome data are complex and very high in dimension (up to 10^5 features), which may result in an overfitted model. When neurons are randomly dropped out during each iteration of training, each neuron has a chance to be involved in updating the weights and to contribute to the representation of the features. In this way, the model learns multiple, independent, internal representations and is less sensitive to the training samples. Batch normalization is used to address the internal covariate shift problem. Similar to feature scaling, batch normalization adjusts and scales the hidden unit values across hidden layers, which avoids activations that are either too high or too low. Batch normalization also speeds up the training process when handling a large number of features.
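The two regularization components described above can be sketched as follows. This is an illustrative version only: it assumes the common "inverted" dropout variant (surviving units are rescaled during training) and batch normalization computed from the statistics of the current batch; the dropout rate and the helper names are hypothetical and are not reported in the text.

```python
import numpy as np


def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a random fraction `rate` of units in training."""
    if not training or rate == 0.0:
        return x  # at inference time every neuron is kept
    keep_mask = rng.random(x.shape) >= rate
    # Rescale surviving units so the expected activation is unchanged.
    return x * keep_mask / (1.0 - rate)


def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then apply scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta


rng = np.random.default_rng(0)
batch = rng.standard_normal((32, 8)) * 5.0 + 3.0   # poorly scaled activations
normalized = batch_norm(batch, gamma=np.ones(8), beta=np.zeros(8))
dropped = dropout(normalized, rate=0.5, rng=rng)   # rate 0.5 is an assumption
```

After normalization each feature has approximately zero mean and unit variance over the batch, and about half of the entries of `dropped` are zero while the rest are doubled.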
Figure 2:
The architecture of the proposed multichannel deep neural network (mcDNN). The mcDNN has four input channels that take vectors of 1 × 7 for personal characteristic data and 1 × 4005, 1 × 17 955, and 1 × 61 425 for connectome data, consisting of the upper triangular values of the symmetric connectome matrices at different scales. The four channels are eventually fused into one output channel. Each block reduces the feature dimensions. The number of feature dimensions is shown on the bottom of each block. PCD = personal characteristic data.
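The channel-and-fusion layout can be made concrete with a simplified forward pass. The sketch below is not the authors' implementation: the input dimensions come from Figure 2, but the hidden widths, the use of a single block per channel, fusion by concatenation, the leaky rectified linear unit slope, and the sigmoid output are all assumptions chosen for illustration, and batch normalization and dropout are omitted for brevity.

```python
import numpy as np

INPUT_DIMS = {"PCD": 7, "AAL": 4005, "CC200": 17955, "CC400": 61425}
CHANNEL_HIDDEN = 32  # hypothetical width of each channel output
FUSED_HIDDEN = 16    # hypothetical width of the fused block


def leaky_relu(z, slope=0.01):  # slope is an assumed value
    return np.where(z > 0, z, slope * z)


def init_layer(rng, n_in, n_out):
    """Random weights and zero biases for one fully connected layer."""
    weights = rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in)
    return weights, np.zeros(n_out)


def init_mcdnn(rng):
    params = {name: init_layer(rng, dim, CHANNEL_HIDDEN)
              for name, dim in INPUT_DIMS.items()}
    params["fusion"] = init_layer(
        rng, CHANNEL_HIDDEN * len(INPUT_DIMS), FUSED_HIDDEN)
    params["classifier"] = init_layer(rng, FUSED_HIDDEN, 1)
    return params


def mcdnn_forward(params, inputs):
    """Inference-mode forward pass returning one probability per subject."""
    channel_outputs = []
    for name in INPUT_DIMS:
        weights, bias = params[name]
        channel_outputs.append(leaky_relu(inputs[name] @ weights + bias))
    # Fuse the four channels by concatenating their reduced features.
    fused_input = np.concatenate(channel_outputs, axis=1)
    weights, bias = params["fusion"]
    fused = leaky_relu(fused_input @ weights + bias)
    weights, bias = params["classifier"]
    logits = fused @ weights + bias
    return 1.0 / (1.0 + np.exp(-logits[:, 0]))  # sigmoid


rng = np.random.default_rng(0)
params = init_mcdnn(rng)
batch = {name: rng.standard_normal((4, dim))
         for name, dim in INPUT_DIMS.items()}
probabilities = mcdnn_forward(params, batch)
print(probabilities.shape)  # (4,)
```

Reducing each channel to the same width before fusion keeps the 61 425-dimensional CC400 input from dominating the 7-dimensional PCD input purely by size, which is one plausible motivation for the per-channel blocks.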
Model Training
We implemented the supervised backpropagation algorithm for model training (25). Suppose we have a feature matrix $X = [x^{(1)}, x^{(2)}, \ldots, x^{(N)}]$, labeled with $Y = [y^{(1)}, y^{(2)}, \ldots, y^{(N)}]$, where N is the number of subjects. The activation output of each neuron in each hidden layer $h_{W,b}(x)$ is defined by:

$$h_{W,b}(x) = f(z)$$

where $f$ is the activation function (ie, leaky rectified linear units in this study), $z = \sum_{i=1}^{j} W_i x_i + b$, and j is the number of inputs to this neuron. $W_i$, $x_i$, and $b$ represent weight, input, and bias, respectively.

Assuming that $h_{W,b}(x^{(i)})$ and $1 - h_{W,b}(x^{(i)})$ are the predicted values for input $x^{(i)}$, then the cost function can be defined as:

$$J(W,b) = \frac{1}{N}\sum_{i=1}^{N} L\left(h_{W,b}(x^{(i)}), y^{(i)}\right) + \frac{\lambda}{2}\sum_{l}\sum_{j}\left\lVert W_{j}^{(l)}\right\rVert^{2}$$

$$L\left(h_{W,b}(x^{(i)}), y^{(i)}\right) = -\left[y^{(i)}\log h_{W,b}(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_{W,b}(x^{(i)})\right)\right]$$

where $L$ is the cross-entropy loss function. The second part of cost function is the penalty term (weight-decay term). Here, we adopted L2 regularization as our penalty term, where l represents the hidden layers and j is the number of neurons in this hidden layer. W represents the weights of the model and λ is the weight control parameter for the penalty term. L2 regularization is a stable solution for regularization and more computationally efficient. We optimized the proposed mcDNN model by using an Adam optimizer (26) which computes adaptive learning rates for each parameter.
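As a worked illustration of the training objective, the sketch below evaluates a single neuron with a leaky rectified linear unit activation and a cost made of mean binary cross-entropy plus an L2 weight-decay penalty. The leaky slope of 0.01, the λ/2 scaling of the penalty, and the toy numbers are assumptions made for this example; they are not values reported in the study.

```python
import numpy as np


def leaky_relu(z, slope=0.01):  # slope value is an assumption
    return np.where(z > 0, z, slope * z)


def neuron_output(weights, inputs, bias):
    """h_{W,b}(x) = f(sum_i W_i x_i + b) for a single neuron."""
    return leaky_relu(np.dot(weights, inputs) + bias)


def cost(predicted, labels, weight_matrices, lam):
    """Mean binary cross-entropy plus an L2 (weight-decay) penalty."""
    eps = 1e-12  # guards against log(0)
    p = np.clip(predicted, eps, 1.0 - eps)
    cross_entropy = -np.mean(
        labels * np.log(p) + (1 - labels) * np.log(1 - p))
    # Assumed convention: lambda / 2 times the sum of squared weights.
    penalty = 0.5 * lam * sum(np.sum(w ** 2) for w in weight_matrices)
    return cross_entropy + penalty


# Toy neuron: z = 0.5 * 2 + (-1) * 3 + 0.5 = -1.5, so the leaky branch is used.
h = neuron_output(np.array([0.5, -1.0]), np.array([2.0, 3.0]), 0.5)

# Toy cost for two subjects (label 1 = ADHD, 0 = control).
predicted = np.array([0.9, 0.2])
labels = np.array([1, 0])
total = cost(predicted, labels, [np.array([[1.0, -2.0]])], lam=0.01)
print(float(h), round(float(total), 4))
```

Here the cross-entropy term is the mean of −log 0.9 and −log 0.8, and the penalty is 0.5 × 0.01 × (1 + 4) = 0.025; larger λ shifts the optimum toward smaller weights.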
Model Validation
We evaluated the classification (ADHD detection) performance of our proposed model through fivefold cross-validation and hold-out validation (Fig 3) with the metrics of accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). We compared the proposed mcDNN model with single-channel deep neural network (scDNN) models, taking the features of the brain connectome at each individual scale and PCD independently. The scDNN architecture designs are listed in Table 2.
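The four reported metrics can be computed as in the following sketch. It assumes binary labels (1 = ADHD, 0 = control), a decision threshold of 0.5 on the predicted probability, and an AUC computed with the rank-based (Mann-Whitney) formulation; the toy labels and scores are made up for illustration.

```python
import numpy as np


def confusion_metrics(labels, predictions):
    """Accuracy, sensitivity, and specificity for binary labels (1 = ADHD)."""
    labels = np.asarray(labels)
    predictions = np.asarray(predictions)
    tp = np.sum((labels == 1) & (predictions == 1))
    tn = np.sum((labels == 0) & (predictions == 0))
    fp = np.sum((labels == 0) & (predictions == 1))
    fn = np.sum((labels == 1) & (predictions == 0))
    return {
        "accuracy": (tp + tn) / labels.size,
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }


def auc(labels, scores):
    """Probability that a random positive scores above a random negative."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    positives = scores[labels == 1]
    negatives = scores[labels == 0]
    # Compare every positive with every negative; ties count as one half.
    greater = (positives[:, None] > negatives[None, :]).sum()
    ties = (positives[:, None] == negatives[None, :]).sum()
    return (greater + 0.5 * ties) / (positives.size * negatives.size)


toy_labels = np.array([1, 1, 1, 0, 0, 0, 0])
toy_scores = np.array([0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1])
toy_metrics = confusion_metrics(toy_labels, (toy_scores >= 0.5).astype(int))
toy_auc = auc(toy_labels, toy_scores)
```

For these toy values, two of three positive subjects and three of four control subjects are classified correctly, and 11 of the 12 positive-control pairs are ranked in the right order.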
Figure 3:
Schemes of cross-validation and hold-out validation.
Table 2:
Single-Channel Deep Neural Network Architecture Designs

In fivefold cross-validation, we randomly partitioned subjects into five equal-sized subsets. In each iteration, one subset (20% of subjects) was retained as a validation set for testing the model, and we built the classification model on the remaining four subsets (the complementary 80% of subjects, referred to as the training set). Then, the average performance across all five iterations was computed. To reduce variability, 30 rounds of fivefold cross-validation were performed, and the validation results were averaged over the rounds. To better assess the generalization of our proposed approach, we also performed a hold-out validation. Specifically, we trained the proposed models using a cross-validation cohort and then tested the model using an independent hold-out cohort.
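The repeated cross-validation scheme can be sketched as an index generator. The version below assumes plain random partitioning without stratification by diagnosis or site, which the text does not specify; the function name and the seed are hypothetical, and the cohort size of 495 is the number of training subjects reported for this study.

```python
import numpy as np


def repeated_kfold_indices(n_subjects, n_folds=5, n_rounds=30, seed=0):
    """Yield (round, fold, train_idx, val_idx) for repeated k-fold CV."""
    rng = np.random.default_rng(seed)
    for round_id in range(n_rounds):
        shuffled = rng.permutation(n_subjects)     # new partition each round
        folds = np.array_split(shuffled, n_folds)  # near-equal-sized subsets
        for fold_id, val_idx in enumerate(folds):
            train_idx = np.concatenate(
                [fold for i, fold in enumerate(folds) if i != fold_id])
            yield round_id, fold_id, train_idx, val_idx


splits = list(repeated_kfold_indices(495))
print(len(splits))  # 150
```

Thirty rounds of five folds give 150 train-validation splits; with 495 subjects each validation fold holds 99 subjects and each training set holds 396. A model would be fitted on `train_idx` and scored on `val_idx` for every split, and the scores averaged.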
Feature Ranking
We implemented a feature-ranking approach (27) designed for deep learning algorithms to unveil which functional connections at each individual spatial scale were the most predictive of the ADHD outcome. Specifically, we calculated the partial derivatives of the mcDNN output with regard to the connectivity weights from the brain connectome. Given mcDNN output y, for each individual channel, the partial derivative is $\partial y / \partial x_{ij}$, where $x_{ij}$, $i \neq j$, is the connectivity between ROIs i and j. A higher absolute value of the partial derivative indicates a higher level of importance for ADHD prediction.
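The derivative-based ranking can be illustrated on a toy network. The sketch below substitutes a randomly initialized one-hidden-layer network for a trained mcDNN channel and evaluates the gradient at a single input; the network, its sizes, the leaky slope, and the helper names are hypothetical, and how gradients were aggregated across subjects in the study is not stated in the text.

```python
import numpy as np


def leaky_relu(z, slope=0.01):
    return np.where(z > 0, z, slope * z)


def network_output(x, w1, b1, w2, b2):
    """Toy one-hidden-layer network standing in for a trained channel."""
    hidden = leaky_relu(x @ w1 + b1)
    return float(hidden @ w2 + b2)


def input_gradient(x, w1, b1, w2, slope=0.01):
    """Analytic dy/dx for the toy network, obtained with the chain rule."""
    pre_activation = x @ w1 + b1
    activation_grad = np.where(pre_activation > 0, 1.0, slope)
    return w1 @ (activation_grad * w2)


def rank_features(gradient, top_k):
    """Indices of the top_k features by |dy/dx|, most important first."""
    return np.argsort(-np.abs(gradient))[:top_k]


rng = np.random.default_rng(0)
n_features, n_hidden = 4005, 8  # 4005 matches the AAL scale; 8 is arbitrary
w1 = rng.standard_normal((n_features, n_hidden)) / np.sqrt(n_features)
b1 = rng.standard_normal(n_hidden)
w2 = rng.standard_normal(n_hidden)
b2 = 0.1
x = rng.standard_normal(n_features)  # stand-in connectivity vector

gradient = input_gradient(x, w1, b1, w2)
top_connections = rank_features(gradient, top_k=40)
```

In a deep learning framework the same quantity would normally come from automatic differentiation rather than a hand-written chain rule; the analytic form is used here only to keep the example self-contained.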
Results
Cross-Validation
As shown in Figure 4, the scDNN models detected ADHD with AUCs of 0.67 (95% confidence interval [CI]: 0.66, 0.68), 0.69 (95% CI: 0.67, 0.70), 0.67 (95% CI: 0.66, 0.68), and 0.77 (95% CI: 0.76, 0.78) using the brain connectome with parcellations of AAL, CC200, and CC400 and using PCD, respectively. The mcDNN model with combined features (fusion of the multiscale brain connectome data and PCD) achieved an AUC of 0.82 (95% CI: 0.80, 0.83), which was significantly (P < .001) and clinically higher than that of each of the previously mentioned scDNN models for ADHD detection. More detailed performance results are listed in Table 3. Our mcDNN model with fusion of the single-scale brain connectome data and PCD also outperformed the scDNN model with only PCD.
Figure 4:
Cross-validation. Receiver operating characteristic (ROC) curves of four single-channel deep neural networks (scDNNs) using brain connectome with parcellations of AAL, CC200, CC400, and PCD, independently, and our proposed multichannel deep neural network (mcDNN) using combined features (fusion of the multiscale brain connectome data and PCD). Areas under the ROC curves are 0.67, 0.69, 0.67, 0.77, and 0.82, respectively. Red dotted line signifies the “random guess”. AAL = automated anatomic labeling, CC200, CC400 = brain connectome constructed on functionally defined parcellations, PCD = personal characteristic data (age, sex, handedness, and three individual measures of intelligence quotients).
Table 3:
Cross-Validation: Performance of Single- and Multichannel Deep Neural Networks on Attention Deficit Hyperactivity Disorder Classification

Most Discriminative Brain Functional Connections
To visualize which brain connections and regions were most predictive of ADHD for brain parcellation at each individual scale identified by our mcDNN model, we calculated partial derivatives of output with regard to 4005, 17 955, and 61 425 functional connections, respectively, for AAL, CC200, and CC400. The connections with larger partial derivative values were more important in the classification. At each spatial scale, we identified the top 40 most predictive connections (Fig 5). Figure 5b summarizes the relevant functional connections and abbreviations.
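To report a top-ranked connection as a pair of brain regions, its position in the feature vector has to be mapped back to ROI indices. The sketch below assumes the feature vector was formed from the strictly upper-triangular entries in row-major order with zero-based ROI indices; that ordering and the function name are assumptions, not details given in the text.

```python
import numpy as np


def connection_to_roi_pair(feature_index, n_rois):
    """Map a feature-vector position back to ROI indices (i, j) with i < j."""
    rows, cols = np.triu_indices(n_rois, k=1)
    return int(rows[feature_index]), int(cols[feature_index])


# AAL scale: 90 ROIs and 4005 connections.
first_pair = connection_to_roi_pair(0, 90)     # (0, 1)
next_row = connection_to_roi_pair(89, 90)      # (1, 2)
last_pair = connection_to_roi_pair(4004, 90)   # (88, 89)
print(first_pair, next_row, last_pair)
```

The ROI indices would then be looked up in the atlas label table to obtain region names such as those summarized in Figure 5.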
Figure 5a:
Top discriminative functional connections. (a) By registering the atlases of CC200 and CC400 to AAL (mapping the spatial locations of top predictive connections of CC200 and CC400 onto the AAL), five common connections were identified at all scales (red), 11 common connections identified at the scale of AAL and CC200 (green and red); 10 common connections identified at the scale of AAL and CC400 (blue and red); and five common connections identified at the scale of CC200 and CC400 (red). (b) Table format shows top discriminative functional connections (from brain region A to brain region B) and their abbreviations. AAL = automated anatomic labeling, CC200, CC400 = brain connectome constructed on functionally defined parcellations, L = left, R = right.
Hold-Out Validation
A total of 495 subjects were used for training, and 97 hold-out subjects provided by ADHD-200 competition (https://fcon_1000.projects.nitrc.org/indi/adhd200/index.html) were used for hold-out validation. The proposed mcDNN model using combined features was able to correctly classify patients with ADHD with an AUC of 0.74. This model achieved an accuracy of 73%, with a sensitivity of 63% and specificity of 82%. By using only multiscale functional data, the proposed model achieved an accuracy of 70.1%, specificity of 80.9%, and sensitivity of 60%.
Discussion
Brain connectome studies have shown that abnormal network properties may be useful as discriminative features for the diagnosis of various neurologic conditions (28–30). However, there has been a lack of studies about the predictive power of brain connectomes at multiple spatial scales. Although there is redundant information among brain connectomes at the multiscale level, the additional supplementary information that we can obtain across scales is very helpful for the complementary understanding and depicting of networks within different brain regions and across the entire brain (6). There is still no consensus on how the brain should be parcellated for functional connectivity analyses. This is partly because different types of parcellations can provide some unique information. Therefore, assessing connectivity based on structural versus functional parcellation or functional versus more detailed functional parcellation, as we proposed, provides unique features that likely enhanced our classification accuracy. In this study, we have demonstrated that the predictive power of using brain connectome is comparable among individual scales, but the fusion of multiscale brain connectome improves prediction performance. Our model was first validated using a cross-validation strategy. In addition, the proposed model achieved similar results to cross-validation using the hold-out dataset.
Many investigations based on the publicly available ADHD-200 dataset have attempted to improve ADHD detection (9–11,18,31). Using the data from all ADHD-200 sites for the two-class classification task of identifying patients with ADHD versus control subjects, our approach produced promising performance on the hold-out dataset with only functional MRI data (accuracy of 70.1%, specificity of 80.9%, and sensitivity of 60%), which is better than the most recent best performance (accuracy of 67.3%, specificity of 85.1%, and sensitivity of 45.5%) reported by Sen et al (15). Using the combination of imaging data and PCD, we achieved an accuracy of 78.3%, a specificity of 84.2%, and a sensitivity of 70%, which is better than the previous best performance reported by Sidhu et al (10), with an accuracy of 76% (specificity and sensitivity were not reported in their study). Classification of complex behavioral conditions such as ADHD requires a multimodal approach. Our assessment of deep learning for ADHD classification offers a more objective approach than current clinical practice, but it is still only one of several approaches that will be required to create a multimodal model that accurately classifies unknown cases as ADHD (or not). More work is needed to reach a classification accuracy of 90% or higher for such a model to be useful in clinical practice. Assessment of additional clinical features, other MRI features (eg, structural connectivity), and/or further improvements in deep learning models are likely required to reach this level of classification accuracy.
By adopting a feature-ranking strategy, we found that several decision-making and recognition regions, including middle frontal gyrus, inferior orbitofrontal cortex, and fusiform gyrus, significantly contributed to ADHD detection, which have been reported by multiple previous ADHD studies (32–34). These consistent results highlight the data mining capability of our proposed mcDNN model. Our findings highlight some of the key functional brain regions that are involved in patients with ADHD. Furthermore, five of 40 top predictive brain connections were identified from all three scale connectomes. This indicated that brain connectomes with different scales contain both common and unique information, suggesting that the integration of different scale connectomes may take advantage of the complementary information, so as to improve the performance of ADHD detection.
PCD have demonstrated promising predictive power for ADHD (18) and compare favorably with functional connectome features. This is consistent with what we found in our study. We also demonstrated that the combination of PCD with functional data improves ADHD diagnosis. These results demonstrate the importance of accounting for differences in personal characteristics in diagnostic imaging research. It is possible that the use of other features, such as medical tests and patient or family history of disease, might further boost performance to a clinically useful level. We, like others (35), also noted that sex is an important feature that shows differences between typical non-ADHD control subjects and subjects with ADHD. Using a feature-ranking approach designed for deep learning algorithms (27), we found that sex was ranked fifth among the seven personal characteristic features that we considered (the three intelligence quotients and age were ranked in the top four). As a secondary analysis, we also evaluated our mcDNN model in male subjects and achieved an accuracy of 82.3% (95% CI: 80.6%, 83.9%), a sensitivity of 76.7% (95% CI: 73.5%, 79.4%), a specificity of 84.6% (95% CI: 82.5%, 86.6%), and an AUC of 0.85 (95% CI: 0.84, 0.86). This level of model performance was better than when we included both male and female subjects.
There are limitations to be considered in this study. First, we designed the architecture of our proposed mcDNN by brute-force searching a limited space (ie, limited combinations of the number of layers and neurons in each layer) for a reasonable (but not necessarily optimal) performance. Ideally, one would search an “unlimited space” for the optimal architecture, that is, try all possible combinations of the number of layers and neurons, but this would be very computationally intensive. Although increasing efforts have been made to reduce the search space and make objective architecture optimization tractable (25,36), generalization to different applications remains challenging. Second, our study was based on the publicly available preprocessed ADHD-200 dataset. We recognize that it may be suboptimal to analyze brain connectivity in children and young adults based on brain parcellations built on adults. To further improve ADHD detection accuracy, it would be better to use brain parcellations derived from children and young adults. Third, although a sample size of 592 subjects is considered large for clinical research in this field, and our hold-out validation demonstrated no overfitting of our proposed model by producing performance similar to that of cross-validation, larger datasets, including subjects varying in age, sex, ethnicity, and race, may still be needed to yield a generalizable model that is clinically useful. Finally, our study focused only on brain functional connectomes. An integration of additional brain structural connectomes derived from diffusion tensor imaging may further improve the classification of ADHD.
In summary, we developed an mcDNN model and tested the model on an application of ADHD detection. We demonstrated the feasibility of using deep learning techniques to analyze multiscale brain connectome data and capture the individual variability inherent in the brain by validating the hypotheses that (a) the predictive power of using PCD is comparable with that of using functional connectivity features; (b) the predictive power of using brain connectome data is comparable among individual scales; and (c) the fusion of multiscale brain connectome data improves prediction performance. The full potential of deep learning models can be achieved and more reliable conclusions drawn when this approach can be applied to much larger neuroimaging datasets.
Acknowledgments
We thank the ADHD-200 project investigators for making their data publicly available.
Work supported by National Institutes of Health grants (R21-HD094085, R01-NS094200, and R01-NS096037) and a Trustee grant from Cincinnati Children’s Hospital Medical Center. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Disclosures of Conflicts of Interest: M.C. disclosed no relevant relationships. H.L. disclosed no relevant relationships. J.W. disclosed no relevant relationships. J.R.D. disclosed no relevant relationships. N.A.P. disclosed no relevant relationships. L.H. disclosed no relevant relationships.
Abbreviations:
- AAL = automated anatomical labeling
- ADHD = attention deficit hyperactivity disorder
- AUC = area under the receiver operating characteristic curve
- CI = confidence interval
- mcDNN = multichannel deep neural network
- PCD = personal characteristic data
- ROI = region of interest
- scDNN = single-channel deep neural network
References
- 1. Glasser MF, Smith SM, Marcus DS, et al. The Human Connectome Project’s neuroimaging approach. Nat Neurosci 2016;19(9):1175–1187.
- 2. Sporns O. The human connectome: origins and challenges. Neuroimage 2013;80:53–61.
- 3. Tzourio-Mazoyer N, Landeau B, Papathanassiou D, et al. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage 2002;15(1):273–289.
- 4. Craddock RC, James GA, Holtzheimer PE 3rd, Hu XP, Mayberg HS. A whole brain fMRI atlas generated via spatially constrained spectral clustering. Hum Brain Mapp 2012;33(8):1914–1928.
- 5. Glasser MF, Coalson TS, Robinson EC, et al. A multi-modal parcellation of human cerebral cortex. Nature 2016;536(7615):171–178.
- 6. Betzel RF, Bassett DS. Multi-scale brain networks. Neuroimage 2017;160:73–83.
- 7. Riaz A, Asad M, Alonso E, Slabaugh G. Fusion of fMRI and non-imaging data for ADHD classification. Comput Med Imaging Graph 2018;65:115–128.
- 8. He L, Li H, Holland SK, Yuan W, Altaye M, Parikh NA. Early prediction of cognitive deficits in very preterm infants using functional connectome data in an artificial neural network framework. Neuroimage Clin 2018;18:290–297.
- 9. dos Santos Siqueira A, Biazoli Junior CE, Comfort WE, Rohde LA, Sato JR. Abnormal functional resting-state networks in ADHD: graph theory and pattern recognition analysis of fMRI data. BioMed Res Int 2014;2014:380531.
- 10. Sidhu GS, Asgarian N, Greiner R, Brown MR. Kernel principal component analysis for dimensionality reduction in fMRI-based diagnosis of ADHD. Front Syst Neurosci 2012;6:74.
- 11. Fair DA, Nigg JT, Iyer S, et al. Distinct neural signatures detected for ADHD subtypes after controlling for micro-movements in resting state functional connectivity MRI data. Front Syst Neurosci 2013;6:80.
- 12. Biederman J. Attention-deficit/hyperactivity disorder: a selective overview. Biol Psychiatry 2005;57(11):1215–1220.
- 13. American Psychiatric Association. Diagnostic and statistical manual of mental disorders. 5th ed. Washington, DC: American Psychiatric Publishing, 2013.
- 14. Zhu CZ, Zang YF, Cao QJ, et al. Fisher discriminative analysis of resting-state brain function for attention-deficit/hyperactivity disorder. Neuroimage 2008;40(1):110–120.
- 15. Sen B, Borle NC, Greiner R, Brown MRG. A general prediction model for the detection of ADHD and Autism using structural and functional MRI. PLoS One 2018;13(4):e0194856.
- 16. Ambastha AK, Leong TY; Alzheimer’s Disease Neuroimaging Initiative. A deep learning approach to neuroanatomical characterisation of Alzheimer’s disease. Stud Health Technol Inform 2017;245:1249.
- 17. Li H, Parikh NA, He L. A novel transfer learning approach to enhance deep neural network classification of brain functional connectomes. Front Neurosci 2018;12:491.
- 18. Brown MR, Sidhu GS, Greiner R, et al. ADHD-200 Global Competition: diagnosing ADHD using personal characteristic data can outperform resting state fMRI measurements. Front Syst Neurosci 2012;6:69.
- 19. van den Heuvel OA, van Wingen G, Soriano-Mas C, et al. Brain circuitry of compulsivity. Eur Neuropsychopharmacol 2016;26(5):810–827.
- 20. Zalesky A, Fornito A, Harding IH, et al. Whole-brain anatomical networks: does the choice of nodes matter? Neuroimage 2010;50(3):970–983.
- 21. Desikan RS, Ségonne F, Fischl B, et al. An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage 2006;31(3):968–980.
- 22. Diez I, Bonifazi P, Escudero I, et al. A novel brain partition highlights the modular skeleton shared by structure and function. Sci Rep 2015;5(1):10532.
- 23. ADHD-200 Consortium. The ADHD-200 Consortium: a model to advance the translational potential of neuroimaging in clinical neuroscience. Front Syst Neurosci 2012;6:62.
- 24. Bellec P, Chu C, Chouinard-Decorte F, Benhajali Y, Margulies DS, Craddock RC. The Neuro Bureau ADHD-200 Preprocessed repository. Neuroimage 2017;144:275–286.
- 25. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521(7553):436–444.
- 26. Kingma DP, Ba J. Adam: A method for stochastic optimization. arXiv:1412.6980. [preprint] https://arxiv.org/abs/1412.6980. Posted December 22, 2014. Revised January 30, 2017. Accessed October 5, 2018.
- 27. Simonyan K, Vedaldi A, Zisserman A. Deep inside convolutional networks: visualising image classification models and saliency maps. arXiv:1312.6034. [preprint] https://arxiv.org/abs/1312.6034. Posted December 20, 2013. Revised April 19, 2014. Accessed June 12, 2018.
- 28. Fei F, Jie B, Zhang D. Frequent and discriminative subnetwork mining for mild cognitive impairment classification. Brain Connect 2014;4(5):347–360.
- 29. Khazaee A, Ebrahimzadeh A, Babajani-Feremi A. Application of advanced machine learning methods on resting-state fMRI network for identification of mild cognitive impairment and Alzheimer’s disease. Brain Imaging Behav 2016;10(3):799–817.
- 30. Wee CY, Yap PT, Shen D. Diagnosis of autism spectrum disorders using temporally distinct resting-state functional connectivity networks. CNS Neurosci Ther 2016;22(3):212–219.
- 31. Dey S, Rao AR, Shah M. Attributed graph distance measure for automatic detection of attention deficit hyperactive disordered subjects. Front Neural Circuits 2014;8:64.
- 32. Tafazoli S, O’Neill J, Bejjani A, et al. 1H MRSI of middle frontal gyrus in pediatric ADHD. J Psychiatr Res 2013;47(4):505–512.
- 33. Valera EM, Spencer RM, Zeffiro TA, et al. Neural substrates of impaired sensorimotor timing in adult attention-deficit/hyperactivity disorder. Biol Psychiatry 2010;68(4):359–367.
- 34. Tian L, Jiang T, Liang M, et al. Enhanced resting-state brain activities in ADHD patients: a fMRI study. Brain Dev 2008;30(5):342–348.
- 35. Kaczkurkin AN, Raznahan A, Satterthwaite TD. Sex differences in the developing brain: insights from multimodal neuroimaging. Neuropsychopharmacology 2019;44(1):71–85.
- 36. Bengio Y. Learning deep architectures for AI. Found Trends Mach Learn 2009;2(1):1–127.









