Abstract
Parkinson’s Disease (PD) is one of the most prevalent neurodegenerative diseases that affects tens of millions of Americans. PD is highly progressive and heterogeneous. Quite a few studies have been conducted in recent years on predictive or disease progression modeling of PD using clinical and biomarkers data. Neuroimaging, as another important information source for neurodegenerative disease, has also arisen considerable interests from the PD community. In this paper, we propose a deep learning method based on Graph Convolutional Networks (GCN) for fusing multiple modalities of brain images in relationship prediction which is useful for distinguishing PD cases from controls. On Parkinson’s Progression Markers Initiative (PPMI) cohort, our approach achieved 0.9537±0.0587 AUC, compared with 0.6443±0.0223 AUC achieved by traditional approaches such as PCA.
1. Introduction
Parkinson’s Disease (PD) [1] is one of the most prevalent neurodegenerative diseases, which occur when nerve cells in the brain or peripheral nervous system lose function over time and ultimately die. PD affects predominately dopaminergic neurons in substantia nigra, which is a specific area of the brain. PD is a highly progressive disease, with related symptoms progressing slowly over the years. Typical PD symptoms include bradykinesia, rigidity, and rest tremor, which affect speech, hand coordination, gait, and balance. According to the statistics from National Institute of Environmental Healths (NIEHS), at least 500,000 Americans are living with PD1. The Centers for Disease Control and Prevention (CDC) rated complications from PD as the 14th cause of death in the United States [2].
The cause of PD remains largely unknown. There is no cure for PD and its treatments include mainly medications and surgery. The progression of PD is highly heterogeneous, which means that its clinical manifestations vary from patient to patient. In order to understand the underlying disease mechanism of PD and develop effective therapeutics, many large-scale cross-sectional cohort studies have been conducted. The Parkinson’s Progression Markers Initiative (PPMI) [3] is one such example including comprehensive evaluations of early stage (idiopathic) PD patients with imaging, biologic sampling, and clinical and behavioral assessments. The patient recruitment in PPMI is taking place at clinical sites in the United States, Europe, Israel, and Australia. This injects enough diversity into the PPMI cohort and makes the downstream analysis/discoveries representative and generalizable.
Quite a few computational studies have been conducted on PPMI data in recent years. For example, Dinov et al. [4] built a big data analytics pipeline on the clinical, biomarker and assessment data in PPMI to perform various prediction tasks. Schrag et al. [5] predicted the cognitive impairment of the patients in PPMI with clinical variables and biomarkers. Nalls et al. [6] developed a diagnostic model with clinical and genetic classifications with PPMI cohort. We also developed a sequential deep learning based approach to identify the subtypes of PD on the clinical variables, biomarkers and assessment data in PPMI, and our solution won the PPMI data challenge in 2016 [7]. These studies provided insights to PD researchers in addition to the clinical knowledge.
So far research on PPMI has been mostly utilizing its clinical, biomarker and assessment information. Another important part but under-utilized part of PPMI is its rich neuroimaging information, which includes Magnetic Resonance Imaging (MRI), functional MRI, Diffusion Tensor Imaging (DTI), CT scans, etc. During the last decade, neuroimaging studies including structural, functional and molecular modalities have also provided invaluable insights into the underlying PD mechanism [8]. Many imaging based biomarkers have been demonstrated to be closely related to the progression of PD. For example, Chen et al. [9] identified significant volumetric loss in the olfactory bulbs and tracts of PD patients versus controls from MRI scans, and the inverse correlation between the global olfactory bulb volume and PD duration. Different observations have been made on the volumetric differences in substantia nigra (SN) on MRI [10, 11]. Decreased Fractional Anisotropy (FA) in the SN is commonly observed in PD patients using DTI [12]. With high-resolution DTI, greater FA reductions in caudal (than in middle or rostral) regions of the SN were identified, distinguishing PD from controls with 100% sensitivity and specificity [13]. One can refer to [14] for a comprehensive review on imaging biomarkers for PD. Many of these neuroradiology studies are strongly hypothesis driven, based on the existing knowledge on PD pathology.
In recent years, with the arrival of the big data era, many computational approaches have been developed for neuroimaging analysis [15, 16, 17]. Different from conventional hypothesis driven radiology methods, these computational approaches are typically data driven and hypothesis free – they derive features and evidences directly from neuroimages and utilize them in the derivation of clinical insights on multiple problems such as brain network discovery [18, 19] and imaging genomics [20, 21]. Most of these algorithms are linear [22] or multilinear [23], and they work on a single modality of brain images.
In this paper, we develop a computational framework for analyzing the neuroimages in PPMI data based on Graph Convolutional Networks (GCN) [24]. Our framework learns pairwise relationships with the following steps.
Graph Construction. We parcel the structural MRI brain images of each acquisition into a set of Region-of-Interests (ROIs). Each region is treated as a node on a Brain Geometry Graph (BGG), which is undirected and weighted. The weight associated with each pair of nodes is calculated according to the average distance between the geometric coordinates of them in each acquisition. All acquisitions share the same BGG.
Feature Construction. We use different brain tractography algorithms on the DTI parts of the acquisitions to obtain different Brain Connectivity Graphs (BCGs), which are used as the features for each acquisition. Each acquisition has a BCG for each type of tractography.
Relationship Prediction. For each acquisition, we learn a feature matrix from each of its BCG through a GCN. Then all the feature matrices are aggregated through element-wise view pooling. Finally, the feature matrices from each acquisition pair are aggregated into a vector, which is fed into a softmax classifier for relationship prediction.
It is worthwhile to highlight the following aspects of the proposed framework.
Pairwise Learning. Instead of performing sample-level learning, we learn pairwise relationships, which is more flexible and weaker (sample level labels can always be transformed to pairwise labels but not vice versa). Importantly, such a pairwise learning strategy can increase the training sample size (because each pair of training samples becomes an input), which is very important to learning algorithms that need large-scale training samples (e.g., deep learning).
Nonlinear Feature Learning. As we mentioned previously, most of the existing machine learning approaches for neuroimaging analysis are based on either linear or multilinear models, which have a limited capacity of exploring the information contained in neuroimages. We leverage GCN, which is a powerful tool that can explore graph characteristics at a spectrum of frequency bands. This brings our framework more potential to achieve good performance.
Multi-Graph Fusion. Different from conventional approaches that focus on a single graph (image modality), our framework fuses 1) spatial information on the BGG obtained from the MRI part of each acquisition; 2) the features obtained from different BCGs obtained from the DTI part of each acquisition. This effectively leverages the complementary information scattered in different sources.
2. Methodology
In this section, we first describe the problem setting and then present the details of our proposed approach. To facilitate the description, we denote scalars by lowercase letters (e.g., x), vectors by boldfaced lowercase letters (e.g., x), and matrices by boldface uppercase letters (e.g., X). We also use lowercase letters i, j as indices. We write xi to denote the ith entry of a vector x, and xi,j the entry with row index i and column index j in a matrix X. All vectors are column vectors unless otherwise specified.
2.1. Problem Setting
Suppose we have a population of N acquisitions, where each acquisition is subject-specific and associated with M BCGs obtained from different measurements or views. A BCG can be represented as an undirected weighted graph G = (V, E). The vertex set V = {υ1, …, υn} consists of ROIs in the brain and each edge in E is weighted by a connectivity strength, where n is the number of ROIs. We represent edge weights by an n × n similarity matrix X with xi,j denoting the connectivity between ROI i and ROI j. We assume that the vertices remain the same while the edges vary with views. Thus, for each subject, we have M BCGs: . A group of similarity matrices {X(1),… , X(k),… , X(M)} can be derived.
An undirected weighted BGG is defined based on the geometric information of the region coordinates, which is a K-Nearest Neighbor (K-NN) graph. The graph has ROIs as vertices V = {υ1, … , υn}, where each ROI is associated with coordinates of its center. Edges are weighted by the Gaussian similarity function of Euclidean distances, i.e., . We identify the set of vertices Ni that are neighbors to the vertex υi using K-NN, and connect υi and υj if υi ∈ Nj or if υj ∈ Ni. An adjacency matrix à can then be associated with representing the similarity to nearest similar ROIs for each ROI, with the elements:
Our goal is to learn a feature representation for each subject by fusing its BCGs and the shared BGG, which captures both the local traits of each individual subject and the global traits of the population of subjects. Specifically, we develop a customized Multi-View Graph Convolutional Network (MVGCN) model to learn feature representations on neuroimaging data.
2.2. Our Approach
Overview. Fig. 1 provides an overview of the MVGCN framework we develop for relationship prediction on multiview brain graphs. Our model is a deep neural network consisting of three main components: the first component is a multi-view GCN for extracting the feature matrices from each acquisition, the second component is a pairwise matching strategy for aggregating the feature matrices from each pair of acquisitions into feature vectors, and the third component is a softmax predictor for relationship prediction. All of these components are trained using backpropagation and stochastic optimization. Note that MVGCN is an end-to-end architecture without extra parameters involved for view pooling and pairwise matching, Also, all branches of the used views share the same parameters in the multi-view GCN component. We next give details of each component.
• C1: Multi-View GCN. Traditional convolutional neural networks (CNN) rely on the regular grid-like structure with a well-defined neighborhood at each position in the grid (e.g . 2D and 3D images). On a graph structure there is usually no natural choice for an ordering of the neighbors of a vertex, therefore it is not trivial to generalize the convolution operation to the graph setting. Shuman et al. showed that this generalization can be made feasible by defining graph convolution in the spectral domain and proposed a GCN. Motivated by the fact that GCN can effectively model the nonlinearity of samples in a population and has superior capability to explore graph characteristics at a spectrum of frequency bands, we propose a multi-view GCN for an effective fusion of populations of graphs with different views. It consists of two fundamental steps: (i) the design of convolution operator on multiple graphs across views, (ii) a view pooling operation that groups together multi-view graphs.
Graph Convolution. An essential point in GCN is to define graph convolution in the spectral domain based on Laplacian matrix and graph Fourier transform (GFT). We consider the normalized graph Laplacian L = I – D–1/2AD–1/2 where A ∈ ℝn×n is the adjacency matrix associated with the graph, D ∈ ℝn×n is the diagonal degree matrix with , and I ∈ ℝn×n is the identity matrix. As L is a real symmetric positive semidefinite matrix, it can be decomposed as L = UΛUT, where U ∈ ℝn×n is the matrix of eigenvectors with UUT = I (referred to as the Fourier basis) and Λ ∈ ℝn×n is the diagonal matrix of eigenvalues . The eigenvalues represent the frequencies of their associated eigenvectors, i.e. eigenvectors associated with larger eigenvalues oscillate more rapidly between connected vertices. Specifically, in order to obtain a unique frequency representation for the signals on the set of graphs, we define the Laplacian matrix on the BGG , as all graphs share a common structure with adjacency matrix Ã.
Let x ∈ ℝn be a signal defined on the vertices of a graph G, where xi denotes the value of the signal at the i-th vertex. The GFT is defined as , which converts signal x to the spectral domain spanned by the Fourier basis U. Then the graph convolution can be defined as:
(1) |
where θ ∈ ℝn is a vector of Fourier coefficients to be learned, and gθ is called the filter which can be regarded as a function of ∧. To render the filters s-localised in space and reduce the computational complexity, gθ can be approximated by a truncated expansion in terms of Chebyshev polynomials of order s [24]. That is,
(2) |
where the parameter θ ∈ ℝs is a vector of Chebyshev coefficients and is the Chebyshev polynomial of order s evaluated at , a diagonal matrix of scaled eigenvalues that lies in [–1,1].
Substituting Eq. (2) into Eq. (1) yields , where . Denoting , we can use the recurrence relation to compute with and . Finally, the jth output feature map in a GCN is given by:
(3) |
yielding Fin × Fout vectors of trainable Chebyshev coefficients θi,j ∈ ℝk, where xi denotes the input feature maps from a graph. In our case, xi corresponds to the i-th row of the respective similarity matrix X to each graph, and (number of brain ROIs). The outputs are collected into a feature matrix , where each row represents the extracted features of an ROI.
View Pooling. For each subject, the output of GCN are M feature matrices {Y(1), …, Y(k), …, Y(M)}, where each matrix Y(k) ∈ ℝn×Fout corresponds to a view. Similar to the view-pooling layer in the multi-view CNN [25], we use element-wise maximum operation across all M feature matrices in each subject to aggregate multiple views together, producing a shared feature matrix Z. This view-pooling is similar to the view-pooling layer in the multi-view CNN [25]. An alternative is an element-wise mean operation, but it is not as effective in our experiments (see Table 2). The reason may be that the maximum operation learns to combine the views instead of averaging, and thus can use the more informative views of each feature while ignoring others.
Table 2.
Architectures | AUC | NMI |
PCA100-M-S | 64.43±2.23 | 0.39 |
FCN1024-M-FCN64-S | 82.53±4.74 | 0.87 |
GCN128-M-S | 93.75±5.39 | 0.98 |
MVGCN128-M-Smean | 94.74±5.62 | 1.00 |
MVGCN128-M-Smax | 95.37±5.87 | 1.00 |
Fig. 2 gives the flowchart of our multi-view GCN. Based on this multi-view GCN, different views of BCGs can be progressively fused in accordance with their similarity matrices, which can capture both local and global structural information from BCGs and BGG.
C2: Pairwise Matching. Training deep learning model requires a large amount of training data, but usually very few data are available from clinical practice. We take advantage of the pairwise relationships between subjects to guide the process of deep learning [26, 17]. Similarity is an important type of pairwise relationship that measures the relatedness of two subjects. The basic assumption is that, if two subjects are similar, they should have a high probability to have the same class label.
Let Zp and Zq be the feature matrices for any subject pair obtained from multi-view GCN, we can use them to compute a ROI-ROI similarity score. To do so, we first normalize each matrix so that the sum of squares of each row is equal to 1, and then define the following pairwise similarity measure using the row-wise inner product operator:
(4) |
where and are the i-th row vectors of the normalized matrices Zp and Zq, respectively.
C3: Softmax. For each pair, the output of the pairwise matching layer is a feature vector r, where each element is given by Eq. (4). Then, this representation is passed to a fully connected softmax layer for classification. It computes the probability distribution over the labels:
(5) |
where wk is the weight vector of the k-th class, and r is the final abstract representation of the input example obtained by a series of transformations from the input layer through a series of convolution and pooling operations.
3. Experiments and Results
In order to evaluate the effectiveness of our proposed approach, we conduct extensive experiments on real-life Parkinsons Progression Markers Initiative (PPMI) data for relationship prediction and compare with several state-of-the-art methods. In the following, we introduce the datasets used and describe details of the experiments. Then we present the results as well as the analysis.
Data Description. We consider the DTI acquisition on 754 subjects, where 596 subjects are Parkinson’s Disease (PD) patients and the rest 158 are Healthy Control (HC) ones. Each subject’s raw data were aligned to the b0 image using the FSL2 eddy-correct tool to correct for head motion and eddy current distortions. The gradient table is also corrected accordingly. Non-brain tissue is removed from the diffusion MRI using the Brain Extraction Tool (BET) from FSL. To correct for echo-planar induced (EPI) susceptibility artifacts, which can cause distortions at tissue-fluid interfaces, skull-stripped b0 images are linearly aligned and then elastically registered to their respective preprocessed structural MRI using Advanced Normalization Tools (ANTs3) with SyN nonlinear registration algorithm. The resulting 3D deformation fields are then applied to the remaining diffusion-weighted volumes to generate full preprocessed diffusion MRI dataset for the brain network reconstruction. In the meantime, 84 ROIs are parcellated from T1-weighted structural MRI using Freesufer4 and each ROI’s coordinate is defined using the mean coordinate for all voxels in that ROI.
Based on these 84 ROIs, we reconstruct six types of BCGs for each subject using six whole brain tractography algorithms, including four tensor-based deterministic approaches: Fiber Assignment by Continuous Tracking (FACT) [27], the 2nd-order Runge-Kutta (RK2) [28], interpolated streamline (SL) [29], the tensorline (TL) [30], one Orientation Distribution Function (ODF)-based deterministic approach [31]: ODF-RK2 and one ODF-based probabilistic approach: Hough voting [32]. Please refer to [33] for the details of whole brain tractography computations. Each resulted network for each subject is 84 × 84. To avoid computation bias in the later feature extraction and evaluation sections, we normalize each brain network by the maximum value in the matrix, as matrices derived from different tractography methods have different scales and ranges.
Experimental Settings. To learn similarities between graphs, brain networks in the same group (PD or HC) are labeled as matching pairs while brain networks from different groups are labeled as non-matching pairs. Hence, we have 283, 881 pairs in total, with 189, 713 matching samples and 94,168 non-matching samples. 5-fold cross validation is adopted in all of our experiments by separating the sample pairs into 5 stratified randomized sets. Using the coordinate information of ROIs in DTI, we construct a 10-NN BGG in our method, which has 84 vertices and 527 edges. For graph convolutional layers, the order of Chebyshev polynomials s = 30 and the output feature dimension Fout = 128 are used. For fully connected layers, the number of feature dimensions is 1024 in the baseline of one fully connected layer, and those are set as 1024 and 64 for the baseline of two layers. The Adam optimizer [34] is used with the initial learning rate 0.005. The above parameters are optimal settings for all the methods by performing cross-validation. MVGCN code and scripts are available on a public repository (https://github.com/sheryl-ai/MVGCN).
Results. Since our target is to predict relations (matching vs. non-matching) between pairwise BCGs, the performance of binary classification are evaluated using the metric of Area Under the Curve (AUC). Table 1 provides the results of individual views using the following methods: raw edges-weights, PCA, feed-forward fully connected networks (FCN and FCN2l), and graph convolutional network (GCN), where FCN2l is a two-layer FCN. Through the compared methods, the feature representation of each subject in pairs can be learned. For a fair comparison, pairwise matching component and software component are utilized for all the methods. The best performance of GCN-based method achieves an AUC of 93.90%. It is clear that GCN outperforms the raw edges-weights, conventional linear dimension reduction method PCA and nonlinear neural networks FCN and FCN2l.
Table 1.
Methods | Views | |||||
---|---|---|---|---|---|---|
FACT | RK2 | SL | TL | ODF-RK2 | Hough | |
Raw Edges | 58.47±4.05 | 62.54±6.88 | 59.39±5.99 | 61.94±5.00 | 60.93±5.60 | 64.49±3.56 |
PCA | 64.10±2.10 | 63.40±2.72 | 64.43±2.23 | 62.46±1.46 | 60.93±2.63 | 63.46±3.52 |
FCN | 66.17±2.00 | 65.11±2.63 | 65.00±2.29 | 64.33±3.34 | 68.80±2.80 | 61.91±3.42 |
FCN2l | 82.36±1.87 | 81.02±4.28 | 81.68±2.49 | 81.99±3.44 | 82.53±4.74 | 81.77±3.74 |
GCN | 92.67±4.94 | 92.99±4.95 | 92.68±5.32 | 93.75±5.39 | 93.04±5.26 | 93.90±5.48 |
Table 2 reports the performance on classification and acquisition clustering of our proposed MVGCN with three baselines. The architectures of neural networks by the output dimensions of the corresponding hidden layers are presented. M denotes the matching layer based on Eq. (4), S denotes the softmax operation in Eq. (5). The numbers denote the dimensions of extracted features at different layers. For our study, we evaluate both element-wise max pooling and mean pooling in the view pooling component. Specifically, to test the effectiveness of the learned similarities, we also evaluate the clustering performance in terms of Normalized Mutual Information (NMI). The acquisition clustering algorithm we used is K-means (K = 2, PD and HC). The results show that our MVGCN outperforms all baselines on both classification and acquisition clustering tasks, with an AUC of 95.37% and an NMI of 1.00.
In order to test whether the prediction results are meaningful for distinguishing brain networks as PD or HC, we visualize the Euclidean distance for the given 754 DTI acquisitions. Since the output values of all the matching models can indicate the pairwise similarities between acquisitions, we map it into a 2D space with t-SNE [35]. Fig. 3 compares the visualization results with different approaches. The feature extraction by PCA cannot separate the PD and HC perfectly. The result of FCN2l in the view ODF-RK2 that has the best AUC is much better, and two clusters can be observed with a few overlapped acquisitions. Compared with PCA and FCN2l, the visualization result of MVGCN with max view pooling clearly shows two well-separated and relatively compact groups.
Furthermore, we investigate the extracted pairwise feature vectors of the proposed MVGCN. After the ROI-ROI based pairwise matching, the output for each pair is a feature vector embedding the similarity of the given two acquisitions, with each element associated with a ROI. By visualizing the value distribution over ROIs, we can interpret the learned pairwise feature vector of our model. Fig. 4 reports the most 10 similar or dissimilar ROI for PD or HC groups. The similarities are directly extracted from the output representations of the pairwise matching layer. We compute the averaged values of certain groups. For instance, the similarity distributions are computed given the pairwise PD samples, and the values of the top-10 ROI are shown in Fig. 4(a). According to the results, lateral orbitofrontal area, middle temporal and amygdala areas are the three most similar ROIs for PD patients, while important ROIs such as caudate and putamen areas are discriminative to distinguish PD and HC (see Fig. 4(d)). The observations demonstrate that the learned pairwise feature vectors are consistent with some clinical discoveries [36] and thus verify the effectiveness of the MVGCN for neuroimage analysis.
4. Discussion
The underlying rationale of the proposed method is modeling the multiple brain connectivity networks (BCGs) and a geometry graph (BGG) based on the common ROI coordinate simultaneously. Since BCGs are non-Euclidean, it is not straightforward to use a standard convolution that has impressive performances on the grid. Additionally, multi-view graph fusion methods [37] allow us to explore various aspects of the given data. Our non-parametric view pooling is promising in practice. Furthermore, the pairwise learning strategies can satisfy the “data hungry” neural networks with few acquisitions [26]. Our work has demonstrated strong potentials of graph neural networks on the scenario of multiple graph-structured neuroimages. Meanwhile, the representations learned by our approach can be straightforwardly interpreted. However, there are still some limitations. The current approach is completely data- driven without utilization of any clinical domain knowledge. The clinical data such as Electronic Health Records are not considered in the analysis of the disease. In the future, we will continue our research specifically along these directions.
5. Conclusion
We propose a multi-view graph convolutional network method called MVGCN in this paper, which can directly take brain graphs from multiple views as inputs and do prediction on that. We validate the effectiveness of MVGCN on real-world Parkinson’s Progression Markers Initiative (PPMI) data for predicting the pairwise matching relations. We demonstrate that our proposed MVGCN can not only achieve good performance, but also discover interesting predictive patterns.
Acknowledgement
The work is supported by NSFIIS-1716432 (FW), NSFIIS-1650723 (FW), NSFIIS-1750326 (FW), NSFIIS-1718798 (KC), and MJFF14858 (FW). Data used in the preparation of this article were obtained from the Parkinson’s Progression Markers Initiative (PPMI) database (http://www.ppmi-info.org/data). For up-to-date information on the study, visit http://www.ppmi-info.org. PPMI – a public-private partnership – is funded by the Michael J. Fox Foundation for Parkinson’s Research and funding partners, including Abbvie, Avid, Biogen, Bristol-Mayers Squibb, Covance, GE, Genentech, GlaxoSmithKline, Lilly, Lundbeck, Merk, Meso Scale Discovery, Pfizer, Piramal, Roche, Sanofi, Servier, TEVA, UCB and Golub Capital.
Footnotes
References
- [1].William Dauer, Serge Przedborski. Parkinson’s disease: mechanisms and models. Neuron. 2003;39(6):889–909. doi: 10.1016/s0896-6273(03)00568-3. [DOI] [PubMed] [Google Scholar]
- [2].Kenneth D Kochanek, Sherry L Murphy, Jiaquan Xu, Betzaida Tejada-Vera. Deaths: final data for 2014. National vital statistics reports: from the Centers for Disease Control and Prevention National Center for Health Statistics National Vital Statistics System. 2016;65(4):1–122. [PubMed] [Google Scholar]
- [3].Frasier M, Chowdhury S, Sherer T, Eberling J, Ravina B, Siderowf A, Scherzer C, Jennings D, Tanner C, Kieburtz K, et al. The parkinson’s progression markers initiative: a prospective biomarkers study. Movement Disorders. 2010;25:S296. [Google Scholar]
- [4].Ivo D Dinov, Ben Heavner, Ming Tang, Gustavo Glusman, Kyle Chard, Mike Darcy, Ravi Madduri, Judy Pa, Cathie Spino, Carl Kesselman, et al. Predictive big data analytics: a study of parkinsons disease using large complex, heterogeneous, incongruent, multi-source and incomplete observations. PloS one. 2016;11(8):e0157077. doi: 10.1371/journal.pone.0157077. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [5].Anette Schrag, Uzma Faisal Siddiqui, Zacharias Anastasiou, Daniel Weintraub, Jonathan M Schott. Clinical variables and biomarkers in prediction of cognitive impairment in patients with newly diagnosed parkinson’s disease: a cohort study. The Lancet Neurology. 2017;16(1):66–75. doi: 10.1016/S1474-4422(16)30328-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [6].Mike A Nalls, Cory Y McLean, Jacqueline Rick, Shirley Eberly, Samantha J Hutten, Katrina Gwinn, Margaret Sutherland, Maria Martinez, Peter Heutink, Nigel M Williams, et al. Diagnosis of parkinson’s disease on the basis of clinical and genetic classification: a population-based modelling study. The Lancet Neurology. 2015;14(10):1002–1009. doi: 10.1016/S1474-4422(15)00178-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [7].University of california san francisco and weill cornell medicine researchers named winners of 2016 parkinson’s progression markers initiative data challenge. https://www.michaeljfox.org/foundation/publication-detail.html?id=625&category=7.
- [8].Marios Politis. Neuroimaging in parkinson disease: from research setting to clinical practice. Nature Reviews Neurology. 2014;10(12):708. doi: 10.1038/nrneurol.2014.205. [DOI] [PubMed] [Google Scholar]
- [9].Shun Chen, Hong-yu Tan, Zhuo-hua Wu, Chong-peng Sun, Jian-xun He, Xin-chun Li, Ming Shao. Imaging of olfactory bulb and gray matter volumes in brain areas associated with olfactory function in patients with parkinson’s disease and multiple system atrophy. European journal of radiology. 2014;83(3):564–570. doi: 10.1016/j.ejrad.2013.11.024. [DOI] [PubMed] [Google Scholar]
- [10].Hirobumi Oikawa, Makoto Sasaki, Yoshiharu Tamakawa, Shigeru Ehara, Koujiro Tohyama. The substantia nigra in parkinson disease: proton density-weighted spin-echo and fast short inversion time inversion-recovery mr indings. American Journal of Neuroradiology. 2002;23(10):1747–1756. [PMC free article] [PubMed] [Google Scholar]
- [11].Patrice Peran, Andrea Cherubini, Francesca Assogna, Fabrizio Piras, Carlo Quattrocchi, Antonella Peppe, Pierre Celsis, Olivier Rascol, Jean-Francois Demonet, Alessandro Stefani, et al. Magnetic resonance imaging markers of parkinsons disease nigrostriatal signature. Brain. 2010;133(11):3423–3433. doi: 10.1093/brain/awq212. [DOI] [PubMed] [Google Scholar]
- [12].Claire J Cochrane, Klaus P Ebmeier. Diffusion tensor imaging in parkinsonian syndromes a systematic review and meta-analysis. Neurology. 2013;80(9):857–864. doi: 10.1212/WNL.0b013e318284070c. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [13].DE Vaillancourt, MB Spraker, J Prodoehl, I Abraham, DM Corcos, XJ Zhou, CL Comella, DM Little. High-resolution diffusion tensor imaging in the substantia nigra of de novo parkinson disease. Neurology. 2009;72(16):1378–1384. doi: 10.1212/01.wnl.0000340982.01727.6e. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [14].Usman Saeed, Jordana Compagnone, Richard I Aviv, Antonio P Strafella, Sandra E Black, Anthony E Lang, Mario Masel-lis. Imaging biomarkers in parkinsons disease and parkinsonian syndromes: current and emerging concepts. Translational neurodegeneration. 2017;6(1):8. doi: 10.1186/s40035-017-0076-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [15].Francisco Pereira, Tom Mitchell, Matthew Botvinick. Machine learning classiiers and fmri: a tutorial overview. Neu-roimage. 2009;45(1):S199–S209. doi: 10.1016/j.neuroimage.2008.11.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [16].Miles N Wernick, Yongyi Yang, Jovan G Brankov, Grigori Yourganov, Stephen C Strother. Machine learning in medical imaging. IEEE signal processing magazine. 2010;27(4):25–38. doi: 10.1109/MSP.2010.936730. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [17].Sofia Ira Ktena, Sarah Parisot, Enzo Ferrante, Martin Rajchl, Matthew Lee, Ben Glocker, Daniel Rueckert. Distance metric learning using graph convolutional networks: Application to functional brain networks. In International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer; 2017. pp. 469–477. [Google Scholar]
- [18].Zilong Bai, Peter Walker, Anna Tschiffely, Fei Wang, Ian Davidson. Unsupervised network discovery for brain imaging data. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; ACM; 2017. pp. 55–64. [Google Scholar]
- [19].Xinyue Liu, Xiangnan Kong, Ann B Ragin. Unified and contrasting graphical lasso for brain network discovery. In Proceedings of the 2017 SIAM International Conference on Data Mining; SIAM; 2017. pp. 180–188. [Google Scholar]
- [20].Ahmad R Hariri, Daniel R Weinberger. Imaging genomics. British medical bulletin. 2003;65(1):259–270. doi: 10.1093/bmb/65.1.259. [DOI] [PubMed] [Google Scholar]
- [21].Paul M Thompson, Nicholas G Martin, Margaret J Wright. Imaging genomics. Current opinion in neurology. 2010;23(4):368. doi: 10.1097/WCO.0b013e32833b764c. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [22].Srikanth Ryali, Kaustubh Supekar, Daniel A Abrams, Vinod Menon. Sparse logistic regression for whole-brain classii-cation of fmri data. NeuroImage. 2010;51(2):752–764. doi: 10.1016/j.neuroimage.2010.02.040. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [23].Paul Sajda, Shuyan Du, Truman R Brown, Radka Stoyanova, Dikoma C Shungu, Xiangling Mao, Lucas C Parra. Non-negative matrix factorization for rapid recovery of constituent spectra in magnetic resonance chemical shift imaging of the brain. IEEE transactions on medical imaging. 2004;23(12):1453–1465. doi: 10.1109/TMI.2004.834626. [DOI] [PubMed] [Google Scholar]
- [24].Michael Defferrard, Xavier Bresson, Pierre Vandergheynst. In Advances in Neural Information Processing Systems. 2016. Convolutional neural networks on graphs with fast localized spectral iltering; pp. 3844–3852. [Google Scholar]
- [25].Hang Su, Subhransu Maji, Evangelos Kalogerakis, Erik Learned-Miller. Multi-view convolutional neural networks for 3d shape recognition. In Proceedings ofthe IEEE international conference on computer vision; 2015. pp. 945–953. [Google Scholar]
- [26].Gregory Koch. 2015. Siamese neural networks for one-shot image recognition. [Google Scholar]
- [27].Susumu Mori, Barbara J Crain, Vadappuram P Chacko, Peter Van Zijl. Three-dimensional tracking of axonal projections in the brain by magnetic resonance imaging. Annals of neurology. 1999;45(2):265–269. doi: 10.1002/1531-8249(199902)45:2<265::aid-ana21>3.0.co;2-3. [DOI] [PubMed] [Google Scholar]
- [28].Peter J Basser, Sinisa Pajevic, Carlo Pierpaoli, Jeffrey Duda, Akram Aldroubi. In vivo iber tractography using dt-mri data. Magnetic resonance in medicine. 2000;44(4):625–632. doi: 10.1002/1522-2594(200010)44:4<625::aid-mrm17>3.0.co;2-o. [DOI] [PubMed] [Google Scholar]
- [29].Thomas E Conturo, Nicolas F Lori, Thomas S Cull, Erbil Akbudak, Abraham Z Snyder, Joshua S Shimony, Robert C McK-instry, Harold Burton, Marcus E Raichle. Tracking neuronal iber pathways in the living human brain. Proceedings of the National Academy of Sciences. 1999;96(18):10422–10427. doi: 10.1073/pnas.96.18.10422. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [30].Mariana Lazar, David M Weinstein, Jay S Tsuruda, Khader M Hasan, Konstantinos Arfanakis, M Elizabeth Meyerand, Ben-ham Badie, Howard A Rowley, Victor Haughton, Aaron Field, et al. White matter tractography using diffusion tensor deflection. Human brain mapping. 2003;18(4):306–321. doi: 10.1002/hbm.10102. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [31].Iman Aganj, Christophe Lenglet, Guillermo Sapiro, Essa Yacoub, Kamil Ugurbil, Noam Harel. Reconstruction of the orientation distribution function in single-and multiple-shell q-ball imaging within constant solid angle. Magnetic Resonance in Medicine. 2010;64(2):554–566. doi: 10.1002/mrm.22365. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [32].Iman Aganj, Christophe Lenglet, Neda Jahanshad, Essa Yacoub, Noam Harel, Paul M Thompson, Guillermo Sapiro. A hough transform global probabilistic approach to multiple-subject diffusion mri tractography. Medical image analysis. 2011;15(4):414–425. doi: 10.1016/j.media.2011.01.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [33].Liang Zhan, Jiayu Zhou, Yalin Wang, Yan Jin, Neda Jahanshad, Gautam Prasad, Talia M Nir, Cassandra D Leonardo, Jieping Ye, Paul M Thompson, et al. Comparison of nine tractography algorithms for detecting abnormal structural brain networks in alzheimers disease. Frontiers in aging neuroscience. 2015;7(48) doi: 10.3389/fnagi.2015.00048. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [34].Diederik P Kingma, Jimmy Ba. Adam: A method for stochastic optimization. 2014. arXivpreprint arXiv.1412.6980.
- [35].Laurens van der Maaten, Geoffrey Hinton. Visualizing data using t-sne. Journal of machine learning research. 2008. (Nov) pp. 2579–2605.
- [36].Rui Gao, Guangjian Zhang, Xueqi Chen, Aimin Yang, Gwenn Smith, Dean F Wong, Yun Zhou. Csf biomarkers and its associations with 18f-av133 cerebral vmat2 binding in parkinsons diseasea preliminary report. PloS one. 2016;11(10) doi: 10.1371/journal.pone.0164762. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [37].Tengfei Ma, Cao Xiao, Jiayu Zhou, Fei Wang. Drug similarity integration through attentive multi-view graph auto-encoders. 2018. arXiv preprint arXiv:1804.10850.