Abstract
Background
We recently described a genomic pathway approach to study complex diseases. We demonstrated that models constructed using single nucleotide polymorphisms (SNPs) within axon guidance pathway genes were highly predictive of Parkinson disease (PD) susceptibility, survival free of PD, and age at onset of PD within two independent whole-genome association datasets. We also demonstrated that several axon guidance pathway genes represented by SNPs within our final models were differentially expressed in PD.
Methodology/Principal Findings
Here we employed our genomic pathway approach to analyze data from a whole-genome association dataset of amyotrophic lateral sclerosis (ALS) and demonstrated that models constructed using SNPs within axon guidance pathway genes were highly predictive of ALS susceptibility (odds ratio = 1739.73, p = 2.92×10^−60), survival free of ALS (hazard ratio = 149.80, p = 1.25×10^−74), and age at onset of ALS (R2 = 0.86, p = 5.96×10^−66). We also extended our analyses of a whole-genome association dataset of PD, which shared 320,202 genomic SNPs in common with the whole-genome association dataset of ALS. We compared, for ALS and PD, the genes represented by SNPs in the final models for susceptibility, survival free of disease, and age at onset of disease and noted that 52.2%, 37.8%, and 34.9% of the genes were shared, respectively.
Conclusions/Significance
Our findings for the axon guidance pathway and ALS have prior biological plausibility, overlap partially with PD, and may provide important insight into the causes of these and related neurodegenerative disorders.
Introduction
We recently described a genomic pathway approach as a method to predict complex diseases, and demonstrated that models of single nucleotide polymorphisms (SNPs) within axon guidance pathway genes were highly predictive of Parkinson disease (PD) susceptibility, survival free of PD, and age at onset of PD [1]. Our findings suggested that mechanisms such as a neurodevelopmental defect in brain wiring, or lifelong defects in axonal maintenance and repair, could contribute to the pathogenesis of PD.
Amyotrophic lateral sclerosis (ALS) is also a complex, aging-related disease of the nervous system affecting motor control. Both ALS and PD are characterized by degeneration of neurons with long axonal projections. Indeed, for ALS, the neurons that degenerate have some of the longest projections within the nervous system, extending from near the surface of the brain through the length of the spinal cord, or from the spinal cord segments to the muscles of the distal extremities. As with PD, it is plausible that a defect in axonal guidance, maintenance, or repair could predispose to ALS [2].
Indeed, prior evidence suggests that axon guidance factors may contribute to the pathogenesis of ALS. Gene ontology pathways relating to axonal function are differentially expressed in ALS, including pathways that relate to the cytoskeleton (processes including axonal outgrowth and transport) and to neuronal maintenance and signaling (processes including axonal differentiation, plasticity, maintenance, and repair) [3]. Furthermore, studies in experimental models and in patients have implicated specific axon guidance pathway genes, or their transcripts or proteins, in the pathogenesis of ALS. These include the genes CDC42 (GeneID 998), CDK5 (GeneID 1020), CXCL12 (GeneID 6387), CXCR4 (GeneID 7852), EPHB2 (GeneID 2048), L1CAM (GeneID 3897), MET (GeneID 4233), PAK1 (GeneID 5058), PAK3 (GeneID 5063), RAC1 (GeneID 5879), RHOA (GeneID 387), and SEMA3A (GeneID 10371) [4]–[18]. Here and throughout the publication, gene names and accession numbers (GeneIDs) were assigned using the Entrez Gene website (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene).
In light of this prior biological plausibility, we postulated that models constructed using SNPs within axon guidance pathway genes might predict ALS outcomes as they had predicted PD outcomes; and that the final predictive models for ALS and PD might include SNPs within some of the same axon guidance pathway genes. To test these hypotheses, we analyzed two whole-genome association datasets, one for ALS and one for PD [19], [20]. We chose to compare the datasets for these two studies specifically because the samples were selected from the same biospecimens repository (indeed, the ALS and PD cases were compared to the same unrelated controls); because the sample sizes employed by the two studies were similar (∼270 cases and ∼270 unrelated controls each); and because the same laboratory genotyped the samples using the same platform and overlapping SNP arrays.
Materials and Methods
Bioinformatic Methods
To formally test our hypothesis, we consulted the Kyoto Encyclopedia of Genes and Genomes (KEGG) [21]–[23]. The KEGG PATHWAY database is a bioinformatics resource that provides wiring diagrams of molecular interactions, reactions, and relations. There are at least 270 pathways in KEGG related to Homo sapiens and diseases. This includes a detailed summary of the axon guidance pathway, updated as recently as October 3, 2005 (http://www.genome.jp/dbget-bin/www_bget?path:hsa04360). We identified all of the genes that encoded proteins within the KEGG axon guidance pathway via Entrez Gene (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Gene) and consulted the UniGene database (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene) to determine which of the genes were expressed in the human brain (n = 128). We then mined available whole-genome association datasets for ALS and PD to identify those SNPs that were genotyped in brain-expressed, axon-guidance pathway genes as part of those studies [19], [20]. Specifically, in those studies, 555,352 unique SNPs were used from the Illumina Infinium II HumanHap550 SNP assay for ALS and more than 408,000 unique SNPs were used from the Illumina Infinium I and HumanHap300 assays for PD. Details regarding the sampling methods for the ALS and PD datasets have been previously reported [19], [20]. In brief, samples were obtained from the Coriell Institute for Medical Research, NJ, USA. Cases fulfilled pre-specified diagnostic criteria for ALS or PD, and controls were neurologically normal (no history of ALS or PD or other neurological disorders). The case and control subjects were of the same ethnic origin, but were not matched for age or sex.
Statistical Methods (ALS)
All statistical tests were two-tailed, and considered significant at the conventional alpha level of 0.05. All statistical analyses were performed in SAS v. 9.1 (SAS Institute Inc., Cary, NC) or S-Plus v. 7 (Insightful Corp., Seattle, WA). We considered three outcomes of interest: 1) ALS susceptibility, 2) survival free of ALS, and 3) age at onset of ALS. We sought to identify joint action models of SNPs from the axon guidance pathway that predicted each of the three outcomes.
For the first outcome, we used unconditional logistic regressions to examine associations of the SNPs with ALS susceptibility while adjusting for age and gender [24]. For each SNP, we calculated odds ratios (ORs), 95% confidence intervals (CIs), and p values. Goodness-of-fit was assessed through measuring concordance and visually through histograms of predicted probabilities [25]. We estimated overall ORs by categorizing the predicted probability of ALS from the model into four groups (<0.25, 0.25–0.50, 0.50–0.75, and >0.75), and then calculating the ORs for each group relative to the <0.25 group. We used a likelihood ratio test to assess the significance of the overall model, and calculated a 95% bias-corrected bootstrap CI for the associated p value using 10,000 re-samples.
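The grouped odds ratio calculation described above can be illustrated with a short Python sketch. The analyses in this study were run in SAS and S-Plus, so this is not the code that was used; the function names, the treatment of a probability falling exactly on a cut point, and the Woolf (log-scale) confidence interval are assumptions made for the example.

```python
import math

# Cut points are taken from the text. Which group a value exactly on a
# boundary joins is an assumption (here: the higher group).
CUTS = (0.25, 0.50, 0.75)

def probability_group(p):
    """Return 0..3 for predicted probability <0.25, 0.25-0.50, 0.50-0.75, >0.75."""
    return sum(p >= c for c in CUTS)

def grouped_odds_ratios(probs, is_case):
    """Odds ratio of each group versus the lowest group, with a Woolf 95% CI.

    Returns a list of (OR, lower, upper); the reference group is (1.0, None, None).
    """
    cases = [0, 0, 0, 0]
    controls = [0, 0, 0, 0]
    for p, case in zip(probs, is_case):
        g = probability_group(p)
        if case:
            cases[g] += 1
        else:
            controls[g] += 1

    a0, b0 = cases[0], controls[0]
    results = [(1.0, None, None)]
    for g in range(1, 4):
        a, b = cases[g], controls[g]
        if min(a, b, a0, b0) == 0:
            # The odds ratio is undefined when any cell is empty.
            results.append((None, None, None))
            continue
        odds_ratio = (a / b) / (a0 / b0)
        se = math.sqrt(1 / a + 1 / b + 1 / a0 + 1 / b0)  # SE of log(OR)
        lower = math.exp(math.log(odds_ratio) - 1.96 * se)
        upper = math.exp(math.log(odds_ratio) + 1.96 * se)
        results.append((odds_ratio, lower, upper))
    return results
```

With most cases in the highest group and most controls in the lowest, the reference cell counts are small, which is one reason the resulting intervals can be wide.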
For the second outcome, we used Cox proportional hazards models to test for associations of the SNPs with survival free of ALS [26]. For each SNP, we calculated hazard ratios (HRs), 95% CIs, and p values. Concordance was again calculated for the proportional hazards models, and Kaplan-Meier plots of categorized scores predicting risk of ALS were generated to provide visual gauges for goodness-of-fit [27]. We also calculated HRs for risk groups categorized at the quartiles, using the lowest risk group as reference. We used a likelihood ratio test to assess the significance of the overall model, and calculated a 95% bias-corrected bootstrap CI for the associated p value using 10,000 re-samples.
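To make the risk grouping and concordance steps concrete, the following Python sketch categorizes risk scores at the quartiles and computes a simple concordance. It is an illustration only, not the SAS or S-Plus code used in this study. The function names are hypothetical, the quartile interpolation is the default of Python's statistics.quantiles, and the concordance assumes that every subject has an observed age at onset (no censoring).

```python
import statistics

def risk_score(coefs, genotypes):
    """Linear predictor of a fitted model: sum of coefficient x coded genotype."""
    return sum(b * x for b, x in zip(coefs, genotypes))

def quartile_groups(scores):
    """Assign 0 (lowest risk) to 3 (highest risk) by cutting at Q1, Q2, and Q3."""
    q1, q2, q3 = statistics.quantiles(scores, n=4)
    return [sum(s > q for q in (q1, q2, q3)) for s in scores]

def concordance(scores, onset_ages):
    """Fraction of usable pairs in which the higher score has the earlier onset.

    Pairs with tied onset ages are skipped; tied scores count as 0.5.
    """
    agree = 0.0
    usable = 0
    n = len(scores)
    for i in range(n):
        for j in range(i + 1, n):
            if onset_ages[i] == onset_ages[j]:
                continue
            usable += 1
            if scores[i] == scores[j]:
                agree += 0.5
            elif (scores[i] > scores[j]) == (onset_ages[i] < onset_ages[j]):
                agree += 1.0
    return agree / usable
```

A concordance of 0.5 corresponds to chance ordering and 1.0 to perfect ordering of onset ages by risk score.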
For the third outcome, we predicted the reported age at onset of ALS using multiple regression models [28]. Goodness-of-fit was described through the model R2 values and plots of the predicted vs. observed ages at onset. We used an F test to assess the significance of the overall model, and calculated a 95% bias-corrected bootstrap CI for the associated p value and the R2 using 10,000 re-samples. Assumptions were tested throughout. We performed tests of linkage disequilibrium in unrelated controls for the SNPs in the final models for the three outcomes using LDSELECT v. 1.0 (copyright 2004 by Deborah A. Nickerson, Mark Rieder, Chris Carlson, Qian Yi, University of Washington) with a threshold R2 of 0.80.
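A bias-corrected bootstrap CI of the kind mentioned above can be sketched in Python as follows. The exact procedure used in SAS or S-Plus is not described in this paper, so the sketch assumes the standard bias-corrected percentile method without an acceleration term. For brevity it also computes R2 for a single predictor rather than for a multiple regression model. All function and parameter names are hypothetical.

```python
import random
from statistics import NormalDist, mean

def r_squared(pairs):
    """R2 of a one-predictor least-squares fit (squared Pearson correlation)."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

def bc_bootstrap_ci(data, stat, n_boot=10_000, level=0.95, seed=0):
    """Return (observed statistic, lower, upper) using bias-corrected percentiles."""
    rng = random.Random(seed)
    n = len(data)
    observed = stat(data)
    # Resample subjects with replacement and recompute the statistic each time.
    boots = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)]) for _ in range(n_boot)
    )
    # Bias correction: how far the observed value sits from the bootstrap median.
    below = sum(b < observed for b in boots)
    frac = min(max(below / n_boot, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    nd = NormalDist()
    z0 = nd.inv_cdf(frac)
    alpha = (1 - level) / 2
    lo_p = nd.cdf(2 * z0 + nd.inv_cdf(alpha))
    hi_p = nd.cdf(2 * z0 + nd.inv_cdf(1 - alpha))
    lower = boots[min(n_boot - 1, int(lo_p * n_boot))]
    upper = boots[min(n_boot - 1, int(hi_p * n_boot))]
    return observed, lower, upper
```

The default of 10,000 re-samples follows the text; a smaller n_boot is adequate for trying the function on toy data.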
Figure S1 summarizes the scheme used to develop models for each outcome. Since the modes of expression of the alleles in the SNPs of interest were not known, we first looked at each SNP using three coding schemes: log-additive, Mendelian dominant, and Mendelian recessive (Step 1). We simplified subsequent analyses by removing from further consideration those SNPs with no significant main effects in any coding scheme (Step 2). Removing those SNPs was a conservative approach, potentially biasing our tests towards the null hypotheses, since the SNPs were prevented from possibly entering the joint action models after adjustment for other variables. We generally coded the remaining SNPs using the schemes which produced the smallest p values, since these provided our best estimates of the modes of expression in our data (Step 3). For each outcome, we then created multiple sets of SNPs, where each set contained only SNPs with at least a certain number of non-missing values (Step 4). This was done to address issues due to missing values.
While most SNPs had fairly complete data, others had missing values from substantial (up to 28%) numbers of subjects. We therefore chose an approach where we constructed candidate models using sets of SNPs with fairly complete data (effective sample sizes close to the maximum) to explain as much of the outcomes as possible, then checked to see if adding other SNPs on top of the candidate models would contribute significantly. We constructed the candidate models for each set using standard automated procedures (Step 5), and selected a final candidate model for each outcome based on significance and goodness-of-fit (Step 6). We then added other SNPs, which were significant given the candidate models (Step 7), and significant pair-wise interactions (Step 8).
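Steps 1 through 4 of the scheme can be illustrated with a brief Python sketch. It is not the code used in this study. It assumes that genotypes are stored as counts of the minor allele (0, 1, or 2, with None for a missing call), which the text does not specify, and the function names are hypothetical.

```python
def code_genotype(minor_allele_count, scheme):
    """Step 1: map a 0/1/2 minor-allele count to a model covariate."""
    if minor_allele_count is None:
        return None  # a missing genotype stays missing
    if scheme == "log-additive":
        return minor_allele_count            # 0, 1, or 2 copies
    if scheme == "dominant":
        return int(minor_allele_count >= 1)  # at least one copy
    if scheme == "recessive":
        return int(minor_allele_count == 2)  # two copies
    raise ValueError(f"unknown coding scheme: {scheme}")

def best_scheme(p_values, alpha=0.05):
    """Steps 2 and 3: drop the SNP (None) if no scheme is significant,
    otherwise keep the scheme with the smallest p value."""
    scheme = min(p_values, key=p_values.get)
    return scheme if p_values[scheme] < alpha else None

def snps_with_min_complete(genotypes_by_snp, min_non_missing):
    """Step 4: keep SNPs with at least min_non_missing observed genotypes."""
    return [
        snp
        for snp, calls in genotypes_by_snp.items()
        if sum(c is not None for c in calls) >= min_non_missing
    ]
```

Calling snps_with_min_complete with several thresholds yields the multiple sets of SNPs described in Step 4, each with a different effective sample size.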
Statistical Methods (PD)
Our statistical methods for PD were identical to those for ALS. We considered three outcomes of interest: 1) PD susceptibility, 2) survival free of PD, and 3) age at onset of PD. We sought to identify joint action models of SNPs from the axon guidance pathway that predicted each of the three outcomes. The methods employed to study the association of SNPs with each outcome and the scheme used to develop predictive models for each outcome are described above.
Results
Results (ALS)
The primary whole-genome association study dataset employed by this study included 275 ALS cases and 269 unrelated controls. The median age at onset of ALS among the cases was 54 years (range 26–87). Details regarding the SNP markers genotyped were previously reported [19]. Our bioinformatic methods identified 128 brain-expressed axon guidance pathway genes and our SNP dataset included 4,133 SNPs within 124 of those genes.
ALS Susceptibility
Of the 4,133 SNPs within brain-expressed genes of the axon-guidance pathway, 442 SNPs (10.7%) were individually associated with susceptibility to ALS, as detailed in Text S1. Table S1A contains results for the final model produced by running SNPs through the multi-stage process to predict ALS susceptibility. This model used data from 542 unmatched ALS patients and unrelated controls (2 subjects were missing data on one or more SNPs). The ORs (95% CIs) for the groups defined by predicted ALS probability of <0.25, 0.25–0.50, 0.50–0.75, and >0.75 were as follows: 1 (reference), 17.60 (5.70–54.36), 112.00 (35.45–353.83), and 1739.73 (523.53–5781.32), respectively. Since we were interested in the significance of the pathway, rather than individual SNPs, the p value for the overall model was of primary importance. In this case, the model had an overall p value of 2.92×10^−60 (95% CI 8.34×10^−52 to 1.16×10^−68). This model significantly predicted whether an individual was a case or an unrelated control. The predicted probabilities of ALS were very high (towards 1) for most of the cases, and very low (towards 0) for most of the unrelated controls (Figure 1A). Indeed 78% of the cases had predicted probabilities above 0.9, and 77% of unrelated controls had predicted probabilities below 0.1. As shown by Figure 1A, the model did not completely distinguish the two groups; some ALS cases had low predicted probabilities, and some controls had high predicted probabilities. However, the concordance for the model was about 0.99, indicating excellent agreement between predicted and observed case/control status.
Survival Free of ALS
Of the 4,133 SNPs, 451 (10.9%) were individually associated with survival free of ALS (hazard function) using Cox proportional hazards models, as detailed in Text S1. Table S1B contains results for the final proportional hazards model produced by running SNPs through the multi-stage process to predict survival free of ALS. This model used data from 274 ALS patients (1 patient was missing data on one or more SNPs). In this case, the model had an overall p value of 1.25×10^−74 (95% CI 2.22×10^−60 to 1.24×10^−89). By contrast, the model was not significant at predicting survival (age at study) in the unrelated controls (p = 0.15). This last finding suggests that the model predicts survival free of ALS (hazard function), but not survival in general, and that the model is specific for ALS cases.
Figure 1B shows a Kaplan-Meier plot to describe the results of the model. The groups were formed by calculating a risk score for each ALS patient using the equation from the proportional hazards model, then categorizing the score at the 25th (Q1), 50th (Q2), and 75th (Q3) percentiles. The survival curves separated nicely right from the earliest ages of onset. By age 50, only 9% of ALS patients in the predicted highest risk group were still free of ALS, whereas 96% of ALS patients in the predicted lowest risk group were free of ALS. By age 55, none of the ALS patients in the predicted highest risk group were still free of ALS, whereas 93% of ALS patients in the predicted lowest risk group were free of ALS. The median ages at onset for each group, from lowest risk group to highest risk group, were 71, 59, 52, and 43, a difference in survival free of ALS of 28 years from lowest to highest. The concordance for this model was 0.86. The HRs (95% CIs) for the four groups, from lowest to highest risk, were 1 (reference), 7.35 (4.50–12.00), 37.29 (20.49–67.87), and 149.80 (77.21–290.63).
Age at Onset of ALS
Of the 4,133 SNPs, 487 (11.8%) were individually associated with age at onset of ALS using linear regression models, as detailed in Text S1. Table S1C contains results for the final model produced by running SNPs through the multi-stage process to predict age at onset of ALS. This model used data from 272 ALS patients (3 patients were missing data on one or more SNPs). In this case, the model had an overall p value of 5.96×10^−66 (95% CI 4.95×10^−51 to 8.42×10^−80). By contrast, the set of SNPs was not significant at predicting age at study of the unrelated controls (p = 0.55). This last finding suggests that the model predicts age at onset of ALS, not age at the time of the study, and that the model is specific for ALS cases.
Figure 1C shows a plot of predicted age at onset vs. reported age at onset to summarize the results of the model. The data are distributed in an elliptical pattern, reflecting the model R2 of 0.86 (95% CI 0.83–0.88). The model explained about 86% of the variability in age at onset of ALS.
Other combinations of SNPs from the axon guidance pathway also performed quite well in predicting ALS susceptibility, survival free of ALS, and age at onset of ALS. Although the models reported in this manuscript provided good fits to our data, our results do not preclude other combinations of axon guidance pathway SNPs as significant predictors of ALS. The SNPs in the final models that we selected showed no significant linkage disequilibrium in unrelated controls.
Results (PD)
The primary whole-genome association study dataset employed by this study included 269 PD cases and 267 unrelated controls. The median age at onset of PD among the cases was 64 years (range 13–84). Details regarding the SNP markers genotyped were previously reported [20]. Our bioinformatic methods identified 128 brain-expressed axon guidance pathway genes and our SNP dataset included 3,095 SNPs within 122 of those genes.
PD Susceptibility
Of the 3,095 SNPs within brain-expressed genes of the axon-guidance pathway, 295 SNPs (9.5%) were individually associated with susceptibility to PD, as detailed in Text S2. Table S2A contains results for the final model produced by running SNPs through the multi-stage process to predict PD susceptibility. This model used data from 516 unmatched PD patients and unrelated controls (20 subjects were missing data on one or more SNPs). The ORs (95% CIs) for the groups defined by predicted PD probability of <0.25, 0.25–0.50, 0.50–0.75, and >0.75 were as follows: 1 (reference), 1.85 (0.58–5.92), 19.03 (8.56–42.30), and 391.82 (157.94–972.06), respectively. Since we were interested in the significance of the pathway, rather than individual SNPs, the p value for the overall model was of primary importance. In this case, the model had an overall p value of 8.10×10^−71 (95% CI 2.34×10^−64 to 1.67×10^−76). This model significantly predicted whether an individual was a case or an unrelated control. The predicted probabilities of PD were very high (towards 1) for most of the cases, and very low (towards 0) for most of the unrelated controls (Figure 2A). Indeed, 72% of the cases had predicted probabilities above 0.9, and 69% of unrelated controls had predicted probabilities below 0.1. As shown by Figure 2A, the model did not completely distinguish the two groups; some PD cases had low predicted probabilities, and some controls had high predicted probabilities. However, the concordance for the model was about 0.98, indicating excellent agreement between predicted and observed case/control status.
Survival Free of PD
Of the 3,095 SNPs, 327 (10.6%) were individually associated with survival free of PD (hazard function) using Cox proportional hazards models, as detailed in Text S2. Table S2B contains results for the final proportional hazards model produced by running SNPs through the multi-stage process to predict survival free of PD. This model used data from 264 PD patients (4 patients were missing data on one or more SNPs, and one patient with a questionable age at onset of 13 years was removed from the analyses). In this case, the model had an overall p value of 9.02×10^−58 (95% CI 1.48×10^−46 to 7.90×10^−70). By contrast, the model was not significant at predicting survival (age at study) of the unrelated controls (p = 0.80). This last finding suggests that the model predicts survival free of PD (hazard function), but not survival in general, and that the model is specific for PD cases.
Figure 2B shows a Kaplan-Meier plot to describe the results of the model. The groups were formed by calculating a risk score for each PD patient using the equation from the proportional hazards model, then categorizing the score at the 25th (Q1), 50th (Q2), and 75th (Q3) percentiles. The survival curves separated nicely right from the earliest ages of onset. By age 60, only 21% of PD patients in the predicted highest risk group were still free of PD, whereas 98% of PD patients in the predicted lowest risk group were free of PD. By age 70, none of the PD patients in the predicted highest risk group were still free of PD, whereas 74% of PD patients in the predicted lowest risk group were free of PD. The median ages at onset for each group, from lowest risk group to highest risk group, were 75, 68, 63, and 58, a difference in survival free of PD of 17 years from lowest to highest. The concordance for this model was 0.85. The HRs (95% CIs) for the four groups, from lowest to highest risk, were 1 (reference), 4.65 (3.05–7.08), 17.36 (10.54–28.57), and 72.90 (41.52–128.00).
Age at Onset of PD
Of the 3,095 SNPs, 326 (10.5%) were individually associated with age at onset of PD using linear regression models, as detailed in Text S2. Table S2C contains results for the final model produced by running SNPs through the multi-stage process to predict age at onset of PD. This model used data from 261 PD patients (7 patients were missing data on one or more SNPs, and one patient with a questionable age at onset of 13 years was removed from the analyses). In this case, the model had an overall p value of 4.52×10^−61 (95% CI 8.97×10^−46 to 2.97×10^−75). By contrast, the set of SNPs was not significant at predicting age at study of the unrelated controls (p = 0.98). This last finding suggests that the model predicts age at onset of PD, not age at the time of the study, and that the model is specific for PD cases.
Figure 2C shows a plot of predicted age at onset vs. reported age at onset to summarize the results of the model. The data are distributed in an elliptical pattern, reflecting the model R2 of 0.86 (95% CI 0.83–0.89). The model explained about 86% of the variability in age at onset of PD.
Other combinations of SNPs from the axon guidance pathway also performed quite well in predicting PD susceptibility, survival free of PD, and age at onset of PD. Although the models reported in this manuscript provided good fits to our data, our results do not preclude other combinations of axon guidance pathway SNPs as significant predictors of PD. The SNPs in the final models that we selected showed no significant linkage disequilibrium in unrelated controls.
Results (Comparison of Final Models for ALS and PD)
The final model predicting ALS susceptibility contained 31 genes, and the final model predicting PD susceptibility contained 39 genes. The SNPs and genes included in the final models are listed in the supporting information (Tables S1A and S2A). Combined, the models for ALS and PD contained 46 genes, of which 24 (52.2%) were shared, and 22 (47.8%) were not. Of the 22 genes not shared, 7 were in the ALS model but not the PD model, and 15 were in the PD model but not in the ALS model.
The final model predicting survival free of ALS contained 34 genes, and the final model predicting survival free of PD contained 28 genes. The SNPs and genes included in the final models are listed in the supporting information (Tables S1B and S2B). Combined, the models for ALS and PD contained 45 genes, of which 17 (37.8%) were shared, and 28 (62.2%) were not. Of the 28 genes not shared, 17 were in the ALS model but not the PD model, and 11 were in the PD model but not in the ALS model.
The final model predicting age at onset of ALS contained 30 genes, and the final model predicting age at onset of PD contained 28 genes. The SNPs and genes included in the final models are listed in the supporting information (Tables S1C and S2C). Combined, the models for ALS and PD contained 43 genes, of which 15 (34.9%) were shared, and 28 (65.1%) were not. Of the 28 genes not shared, 15 were in the ALS model but not the PD model, and 13 were in the PD model but not in the ALS model.
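The overlap figures in the three paragraphs above follow from simple set arithmetic, which the short Python sketch below reproduces from the gene counts given in the text. The function name and the dictionary keys are hypothetical.

```python
def overlap_summary(n_als, n_pd, n_shared):
    """Summarize overlap between two gene lists from their sizes and shared count."""
    combined = n_als + n_pd - n_shared  # size of the union of the two lists
    return {
        "combined": combined,
        "shared_pct": round(100 * n_shared / combined, 1),
        "als_only": n_als - n_shared,
        "pd_only": n_pd - n_shared,
    }

# Gene counts as reported in the text: (ALS model, PD model, shared).
models = {
    "susceptibility": (31, 39, 24),
    "survival free of disease": (34, 28, 17),
    "age at onset": (30, 28, 15),
}
for outcome, counts in models.items():
    print(outcome, overlap_summary(*counts))
```

For the susceptibility models, for example, this gives 46 combined genes with 52.2% shared, 7 genes in the ALS model only, and 15 genes in the PD model only.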
Figure 3 displays the distributions of the significant SNPs in the genes from the ALS and PD final models (as listed in Tables S1A-S2C). While the samples for the ALS and PD whole-genome association studies were genotyped in the same laboratory using an Illumina platform, the SNPs assayed overlapped only partially. Specifically, 320,202 SNPs were common to both the ALS and PD datasets, but 234,957 SNPs were unique to the ALS dataset and 88,599 SNPs were unique to the PD dataset. In other words, of the SNPs present in the ALS dataset, only 57.6% were also present in the PD dataset; and of the SNPs present in the PD dataset, only 78.3% were also present in the ALS dataset. We again note that the final models were not exclusive; other combinations of SNPs (and genes) also had predictive value for the three outcomes for either ALS or PD (data not shown).
Discussion
These results suggest that common gene variants in the axon guidance pathway contribute to the pathogenesis of ALS as well as PD. The axon guidance pathway includes several ligands, receptors, and intermediary proteins that provide a complex and dynamic set of cues that either repel or attract axons toward their synaptic targets during brain and spinal cord development. Moreover, the same pathway maintains and repairs axons and their connections throughout life [29], [30]. All major families of axon guidance pathway ligands (ephrins, netrins, semaphorins, slits), their receptors, and several intermediary proteins have been reported to play a role in the axon guidance of motor neurons [31]–[53]. Hence it is not surprising that our final models for ALS susceptibility, survival free of ALS, and age at onset of ALS included SNPs from many axon guidance pathway genes (see Tables S1A-C). It is possible that motor neurons and dopamine neurons are selectively vulnerable to neurodegenerative processes in part because they have long and/or highly collateralized axons as compared to less vulnerable neurons [54]–[56]. Hence, normal axon guidance pathway function may be of importance to develop and maintain nervous system health.
We observed only partial overlap of axon guidance pathway gene signatures for ALS and PD outcomes. Conceivably, there are sub-pathway patterns of difference in the gene signatures for the two disorders (e.g., genes encoding ephrin versus netrin or semaphorin or slit families of ligands and receptors; genes encoding ligands versus receptors or intermediary proteins; genes encoding proteins that are primarily chemoattractant versus chemorepellent; etc.). However, we did not recognize any definite patterns of difference in the gene signatures for ALS and PD. Such patterns may become apparent by comparing ALS and PD predictive SNP models across multiple datasets. While there are two other whole-genome association studies of ALS published to date, they are not suitable for our genomic pathway analyses, which require individual-level genotyping data. Specifically, one study performed pooled genotyping of cases and controls [57]. The other study performed individual-level genotyping, but at this time only summary data are available (the approved protocols did not permit public release of individual genotyping data) [58].
We also note that, while the gene signatures for ALS and PD are partially overlapping, the genes associated with these two diseases may have been highlighted by different SNPs and with different directions of effect (ORs, HRs, or R2 values greater or less than 1). Unfortunately, the functional effects of most of the SNPs included in either whole-genome association study (or the disease alleles that they map) are unknown [19], [20], and it is unknown whether the genes that were mapped by the SNPs in the final models are expressed specifically by the vulnerable cell types in ALS (e.g., motor neurons) or in PD (e.g., dopaminergic neurons). Furthermore, we note that, while the axon guidance pathway SNPs genotyped in the ALS and PD datasets were partially overlapping, there were many SNPs unique to either dataset. Some linkage disequilibrium bins within axon guidance pathway genes may not have been mapped by the SNPs included on the genotyping arrays.
Our final SNP models for ALS and PD had high model concordance and R2 values, and the sensitivity and specificity of the susceptibility model for ALS were 94% and 94%, and for PD were 92% and 92%. The p values for the three outcomes for ALS and PD were robust, as demonstrated by their 95% CIs (internal validity of the findings). It would be tempting to consider the development for clinical practice of predictive multiplex SNP assays based on our final models. However, we caution that our models were constructed using data from individual SNPs, which typically have small effect sizes. With as few as 269 cases and 267 controls, we could detect odds ratios as small as 1.8 in SNPs with allele frequencies of 0.10 or higher (alpha = 0.05, beta = 0.10). While 80% of SNPs in the ALS dataset and 86% of SNPs in the PD dataset had allele frequencies of 0.10 or higher, the available sample size limits our ability to detect significant main effects of individual SNPs. However, for our joint effects SNP models we had ample power, as fewer than 100 subjects total were required to detect the observed effect sizes. The small available sample size biased our results conservatively to the null hypothesis. With larger sample sizes, additional SNPs with lower allele frequencies and smaller main effects may be included in the final models. Furthermore, individual SNPs may have variable frequencies and carry different linkage disequilibrium information content across populations. Some disease gene loci may be population specific. We note that for each of the three outcomes for ALS and PD, multiple models of SNPs were predictive (we only illustrated the models that provided the best fit to the available data). 
Efforts to replicate our findings for ALS or PD might therefore consider genotyping at least all of the SNPs that were in axon guidance pathway genes in the featured whole-genome association datasets [20], [59], as listed in Text S1 and S2, and might also consider genotyping additional SNPs within the axon guidance pathway (e.g., at least one SNP per linkage disequilibrium bin; for 6,014 bins and an r2≥0.9, about 8,200 tag SNPs in Caucasian samples). The replication standard might be based upon the final model concordance and p values for a given pathway and outcome, rather than upon the findings for a pre-specified list of SNPs.
Our findings for axon guidance and ALS or PD do not preclude the role of other genomic pathways in the etiology of these disorders. However, the high model concordance and R2 values that we observed would suggest that within the available samples, axon guidance pathway gene variability played a major role. Indeed, a large-scale pathways-based association study in ALS recently failed to exhibit significant findings for other candidate pathways [60]. Our findings also do not preclude the role of environmental factors in the etiology of these disorders [61]. However, we would encourage epidemiological and toxicological studies of ALS and PD to also consider neurodevelopmental mechanisms. Indeed, beta-N-methylamino alanine has been proposed to be the exogenous toxin cause of Guamanian ALS and Parkinsonism [62], and the toxin has a biphasic dose effect on neuritic outgrowth in vitro [63].
Until now, genetic studies of complex diseases such as ALS and PD have largely considered the main effects of SNPs, which are small and of limited attributable risk [19], [20], [64], [65]. By contrast, our genomic pathway approach suggests that even for disorders that appear sporadic, genetic factors may still play a major role.
Supporting Information
Acknowledgments
We dedicate this paper to our Mayo Clinic colleagues the emeritus Donald W. Mulder, MD and the late Leonard T. Kurland, MD, in recognition of their seminal studies of ALS and Parkinsonism in Guam. This study used clinical and genetic data from the SNP Database at the NINDS Human Genetics Resource Center DNA and Cell Line Repository (http://ccr.coriell.org/ninds/). Drs. Katrina Gwinn (NINDS) and Roderick Corriveau (Coriell Institute) provided oversight on data access and use. The original genotyping was performed in the laboratory of Drs. Andrew Singleton and John Hardy (NIA, LNG), Bethesda, MD USA. The initial whole-genome association study results for ALS and PD were reported by the publications that we cited in the references [19], [20], and we thank the authors of those studies. We also thank our collaborator John Ioannidis for useful discussion and advice regarding our genomic pathway approach.
Footnotes
Competing Interests: TGL and DMM report a provisional application for patent under 37 CFR § 1.53 (c) entitled ‘Predicting Parkinson's Disease.’ No monies have been awarded to date.
Funding: This work was supported in part by grants from the National Institutes of Health (R01 ES10751 to DMM), and the Michael J. Fox Foundation (Linked Efforts to Accelerate Parkinson Solutions award to DMM). The funders had no role in the design and conduct of the study, in the collection, analysis, and interpretation of the data, and in the preparation, review, or approval of the manuscript.
References
- 1. Lesnick TG, Papapetropoulos S, Mash DC, Ffrench-Mullen J, Shehadeh L, et al. A genomic pathway approach to a complex disease: axon guidance and Parkinson disease. PLoS Genet. 2007;3:e98. doi: 10.1371/journal.pgen.0030098.
- 2. Fischer LR, Culver DG, Tennant P, Davis AA, Wang M, et al. Amyotrophic lateral sclerosis is a distal axonopathy: evidence in mice and man. Exp Neurol. 2004;185:232–240. doi: 10.1016/j.expneurol.2003.10.004.
- 3. Lederer CW, Torrisi A, Pantelidou M, Santama N, Cavallaro S. Pathways and genes differentially expressed in the motor cortex of patients with sporadic amyotrophic lateral sclerosis. BMC Genomics. 2007;8:26. doi: 10.1186/1471-2164-8-26.
- 4. Nakamura S, Kawamoto Y, Nakano S, Ikemoto A, Akiguchi I, et al. Cyclin-dependent kinase 5 in Lewy body-like inclusions in anterior horn cells of a patient with sporadic amyotrophic lateral sclerosis. Neurology. 1997;48:267–270. doi: 10.1212/wnl.48.1.267.
- 5. Bajaj NP, Al-Sarraj ST, Anderson V, Kibble M, Leigh N, et al. Cyclin-dependent kinase-5 is associated with lipofuscin in motor neurones in amyotrophic lateral sclerosis. Neurosci Lett. 1998;245:45–48. doi: 10.1016/s0304-3940(98)00176-1.
- 6. Bajaj NP, al-Sarraj ST, Leigh PN, Anderson V, Miller CC. Cyclin dependent kinase-5 (CDK-5) phosphorylates neurofilament heavy (NF-H) chain to generate epitopes for antibodies that label neurofilament accumulations in amyotrophic lateral sclerosis (ALS) and is present in affected motor neurones in ALS. Prog Neuropsychopharmacol Biol Psychiatry. 1999;23:833–850. doi: 10.1016/s0278-5846(99)00044-5.
- 7. Bajaj NP. Cyclin-dependent kinase-5 (CDK5) and amyotrophic lateral sclerosis. Amyotroph Lateral Scler Other Motor Neuron Disord. 2000;1:319–327. doi: 10.1080/146608200300079563.
- 8. Nguyen MD, Lariviere RC, Julien JP. Deregulation of Cdk5 in a mouse model of ALS: toxicity alleviated by perikaryal neurofilament inclusions. Neuron. 2001;30:135–147. doi: 10.1016/s0896-6273(01)00268-9.
- 9. Patzke H, Tsai LH. Cdk5 sinks into ALS. Trends Neurosci. 2002;25:8–10. doi: 10.1016/s0166-2236(00)02000-2.
- 10. Nguyen MD, Boudreau M, Kriz J, Couillard-Despres S, Kaplan DR, et al. Cell cycle regulators in the neuronal death pathway of amyotrophic lateral sclerosis caused by mutant superoxide dismutase 1. J Neurosci. 2003;23:2131–2140. doi: 10.1523/JNEUROSCI.23-06-02131.2003.
- 11. Kato S, Funakoshi H, Nakamura T, Kato M, Nakano I, et al. Expression of hepatocyte growth factor and c-Met in the anterior horn cells of the spinal cord in the patients with amyotrophic lateral sclerosis (ALS): immunohistochemical studies on sporadic ALS and familial ALS with superoxide dismutase 1 gene mutation. Acta Neuropathol (Berl). 2003;106:112–120. doi: 10.1007/s00401-003-0708-z.
- 12. Hu JH, Chernoff K, Pelech S, Krieger C. Protein kinase and protein phosphatase expression in the central nervous system of G93A mSOD over-expressing mice. J Neurochem. 2003;85:422–431. doi: 10.1046/j.1471-4159.2003.01669.x.
- 13. Tudor EL, Perkinton MS, Schmidt A, Ackerley S, Brownlees J, et al. ALS2/Alsin regulates Rac-PAK signaling and neurite outgrowth. J Biol Chem. 2005;280:34735–34740. doi: 10.1074/jbc.M506216200.
- 14. Chung YH, Joo KM, Lim HC, Cho MH, Kim D, et al. Immunohistochemical study on the distribution of phosphorylated extracellular signal-regulated kinase (ERK) in the central nervous system of SOD1G93A transgenic mice. Brain Res. 2005;1050:203–209. doi: 10.1016/j.brainres.2005.05.060.
- 15. Ignacio S, Moore DH, Smith AP, Lee NM. Effect of neuroprotective drugs on gene expression in G93A/SOD1 mice. Ann N Y Acad Sci. 2005;1053:121–136. doi: 10.1196/annals.1344.010.
- 16. Jacquier A, Buhler E, Schafer MK, Bohl D, Blanchard S, et al. Alsin/Rac1 signaling controls survival and growth of spinal motoneurons. Ann Neurol. 2006;60:105–117. doi: 10.1002/ana.20886.
- 17. De Winter F, Vo T, Stam FJ, Wisman LA, Bar PR, et al. The expression of the chemorepellent Semaphorin 3A is selectively induced in terminal Schwann cells of a subset of neuromuscular synapses that display limited anatomical plasticity and enhanced vulnerability in motor neuron disease. Mol Cell Neurosci. 2006;32:102–117. doi: 10.1016/j.mcn.2006.03.002.
- 18. Corti S, Locatelli F, Papadimitriou D, Del Bo R, Nizzardo M, et al. Neural stem cells LewisX+ CXCR4+ modify disease progression in an amyotrophic lateral sclerosis model. Brain. 2007;130:1289–1305. doi: 10.1093/brain/awm043.
- 19. Schymick JC, Scholz SW, Fung HC, Britton A, Arepalli S, et al. Genome-wide genotyping in amyotrophic lateral sclerosis and neurologically normal controls: first stage analysis and public release of data. Lancet Neurol. 2007;6:322–328. doi: 10.1016/S1474-4422(07)70037-6.
- 20. Fung HC, Scholz S, Matarin M, Simon-Sanchez J, Hernandez D, et al. Genome-wide genotyping in Parkinson's disease and neurologically normal controls: first stage analysis and public release of data. Lancet Neurol. 2006;5:911–916. doi: 10.1016/S1474-4422(06)70578-6.
- 21. Kanehisa M. A database for post-genome analysis. Trends Genet. 1997;13:375–376. doi: 10.1016/s0168-9525(97)01223-7.
- 22. Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28:27–30. doi: 10.1093/nar/28.1.27.
- 23. Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, et al. From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res. 2006;34:D354–357. doi: 10.1093/nar/gkj102.
- 24. Breslow NE, Day NE. Statistical methods in cancer research. Volume I - The analysis of case-control studies. IARC Sci Publ. 1980:5–338.
- 25. Harrell FE, Jr., Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996;15:361–387. doi: 10.1002/(SICI)1097-0258(19960229)15:4<361::AID-SIM168>3.0.CO;2-4.
- 26. Cox D. Regression models of life-tables (with discussion). J R Stat Soc [Ser B]. 1972;34:187–220.
- 27. Kaplan E, Meier P. Non-parametric estimation from incomplete observations. J Am Stat Assoc. 1958;53:457–481.
- 28. Draper N, Smith H. Applied regression analysis. 2nd Edition. New York: Wiley; 1981. p. 709.
- 29. Chilton JK. Molecular mechanisms of axon guidance. Dev Biol. 2006;292:13–24. doi: 10.1016/j.ydbio.2005.12.048.
- 30. Gomez TM, Zheng JQ. The molecular basis for calcium-dependent axon pathfinding. Nat Rev Neurosci. 2006;7:115–125. doi: 10.1038/nrn1844.
- 31. Colamarino SA, Tessier-Lavigne M. The axonal chemoattractant netrin-1 is also a chemorepellent for trochlear motor axons. Cell. 1995;81:621–629. doi: 10.1016/0092-8674(95)90083-7.
- 32. Mitchell KJ, Doyle JL, Serafini T, Kennedy TE, Tessier-Lavigne M, et al. Genetic analysis of Netrin genes in Drosophila: Netrins guide CNS commissural axons and peripheral motor axons. Neuron. 1996;17:203–215. doi: 10.1016/s0896-6273(00)80153-1.
- 33. Wang HU, Anderson DJ. Eph family transmembrane ligands can mediate repulsive guidance of trunk neural crest migration and motor axon outgrowth. Neuron. 1997;18:383–396. doi: 10.1016/s0896-6273(00)81240-4.
- 34. Kaufmann N, Wills ZP, Van Vactor D. Drosophila Rac1 controls motor axon guidance. Development. 1998;125:453–461. doi: 10.1242/dev.125.3.453.
- 35. Yu HH, Araj HH, Ralls SA, Kolodkin AL. The transmembrane Semaphorin Sema I is required in Drosophila for embryonic motor and CNS axon guidance. Neuron. 1998;20:207–220. doi: 10.1016/s0896-6273(00)80450-x.
- 36. Allan DW, Greer JJ. Polysialylated NCAM expression during motor axon outgrowth and myogenesis in the fetal rat. J Comp Neurol. 1998;391:275–292. doi: 10.1002/(sici)1096-9861(19980216)391:3<275::aid-cne1>3.0.co;2-z.
- 37. Lim YS, Mallapur S, Kao G, Ren XC, Wadsworth WG. Netrin UNC-6 and the regulation of branching and extension of motoneuron axons from the ventral nerve cord of Caenorhabditis elegans. J Neurosci. 1999;19:7048–7056. doi: 10.1523/JNEUROSCI.19-16-07048.1999.
- 38. Barrett C, Guthrie S. Expression patterns of the netrin receptor UNC5H1 among developing motor neurons in the embryonic rat hindbrain. Mech Dev. 2001;106:163–166. doi: 10.1016/s0925-4773(01)00415-4.
- 39. Patel K, Nash JA, Itoh A, Liu Z, Sundaresan V, et al. Slit proteins are not dominant chemorepellents for olfactory tract and spinal motor axons. Development. 2001;128:5031–5037. doi: 10.1242/dev.128.24.5031.
- 40. Eberhart J, Swartz ME, Koblar SA, Pasquale EB, Krull CE. EphA4 constitutes a population-specific guidance cue for motor neurons. Dev Biol. 2002;247:89–101. doi: 10.1006/dbio.2002.0695.
- 41. Xiao T, Shoji W, Zhou W, Su F, Kuwada JY. Transmembrane sema4E guides branchiomotor axons to their targets in zebrafish. J Neurosci. 2003;23:4190–4198. doi: 10.1523/JNEUROSCI.23-10-04190.2003.
- 42. Huang X, Huang P, Robinson MK, Stern MJ, Jin Y. UNC-71, a disintegrin and metalloprotease (ADAM) protein, regulates motor axon guidance and sex myoblast migration in C. elegans. Development. 2003;130:3147–3161. doi: 10.1242/dev.00518.
- 43. Lindholm T, Skold MK, Suneson A, Carlstedt T, Cullheim S, et al. Semaphorin and neuropilin expression in motoneurons after intraspinal motoneuron axotomy. Neuroreport. 2004;15:649–654. doi: 10.1097/00001756-200403220-00015.
- 44. Franz CK, Rutishauser U, Rafuse VF. Polysialylated neural cell adhesion molecule is necessary for selective targeting of regenerating motor neurons. J Neurosci. 2005;25:2081–2091. doi: 10.1523/JNEUROSCI.4880-04.2005.
- 45. Cohen S, Funkelstein L, Livet J, Rougon G, Henderson CE, et al. A semaphorin code defines subpopulations of spinal motor neurons during mouse development. Eur J Neurosci. 2005;21:1767–1776. doi: 10.1111/j.1460-9568.2005.04021.x.
- 46. Hammond R, Vivancos V, Naeem A, Chilton J, Mambetisaeva E, et al. Slit-mediated repulsion is a key regulator of motor axon pathfinding in the hindbrain. Development. 2005;132:4483–4495. doi: 10.1242/dev.02038.
- 47. Huber AB, Kania A, Tran TS, Gu C, De Marco Garcia N, et al. Distinct roles for secreted semaphorin signaling in spinal motor axon guidance. Neuron. 2005;48:949–964. doi: 10.1016/j.neuron.2005.12.003.
- 48. Sato-Maeda M, Tawarayama H, Obinata M, Kuwada JY, Shoji W. Sema3a1 guides spinal motor axons in a cell- and stage-specific manner in zebrafish. Development. 2006;133:937–947. doi: 10.1242/dev.02268.
- 49. Kramer ER, Knott L, Su F, Dessaud E, Krull CE, et al. Cooperation between GDNF/Ret and ephrinA/EphA4 signals for motor-axon pathway selection in the limb. Neuron. 2006;50:35–47. doi: 10.1016/j.neuron.2006.02.020.
- 50. Burgess RW, Jucius TJ, Ackerman SL. Motor axon guidance of the mammalian trochlear and phrenic nerves: dependence on the netrin receptor Unc5c and modifier loci. J Neurosci. 2006;26:5756–5766. doi: 10.1523/JNEUROSCI.0736-06.2006.
- 51. Boulin T, Pocock R, Hobert O. A novel Eph receptor-interacting IgSF protein provides C. elegans motoneurons with midline guidepost function. Curr Biol. 2006;16:1871–1883. doi: 10.1016/j.cub.2006.08.056.
- 52. Lucanic M, Kiley M, Ashcroft N, L'Etoile N, Cheng HJ. The Caenorhabditis elegans P21-activated kinases are differentially required for UNC-6/netrin-mediated commissural motor axon guidance. Development. 2006;133:4549–4559. doi: 10.1242/dev.02648.
- 53. Feldner J, Reimer MM, Schweitzer J, Wendik B, Meyer D, et al. PlexinA3 restricts spinal exit points and branching of trunk motor nerves in embryonic zebrafish. J Neurosci. 2007;27:4978–4983. doi: 10.1523/JNEUROSCI.1132-07.2007.
- 54. Braak H, Ghebremedhin E, Rub U, Bratzke H, Del Tredici K. Stages in the development of Parkinson's disease-related pathology. Cell Tissue Res. 2004;318:121–134. doi: 10.1007/s00441-004-0956-9.
- 55. Parent M, Parent A. Relationship between axonal collateralization and neuronal degeneration in basal ganglia. J Neural Transm Suppl. 2006:85–88. doi: 10.1007/978-3-211-45295-0_14.
- 56. Schneider VA, Granato M. Motor axon migration: a long way to go. Dev Biol. 2003;263:1–11. doi: 10.1016/s0012-1606(03)00329-4.
- 57. Dunckley T, Huentelman MJ, Craig DW, Pearson JV, Szelinger S, et al. Whole-genome analysis of sporadic amyotrophic lateral sclerosis. N Engl J Med. 2007;357:775–788. doi: 10.1056/NEJMoa070174.
- 58. van Es MA, Van Vught PW, Blauw HM, Franke L, Saris CG, et al. ITPR2 as a susceptibility gene in sporadic amyotrophic lateral sclerosis: a genome-wide association study. Lancet Neurol. 2007. doi: 10.1016/S1474-4422(07)70222-3.
- 59. Maraganore DM, de Andrade M, Lesnick TG, Strain KJ, Farrer MJ, et al. High-resolution whole-genome association study of Parkinson disease. Am J Hum Genet. 2005;77:685–693. doi: 10.1086/496902.
- 60. Kasperaviciute D, Weale ME, Shianna KV, Banks GT, Simpson CL, et al. Large-scale pathways-based association study in amyotrophic lateral sclerosis. Brain. 2007;130:2292–2301. doi: 10.1093/brain/awm055.
- 61. Brown RC, Lockwood AH, Sonawane BR. Neurodegenerative diseases: an overview of environmental risk factors. Environ Health Perspect. 2005;113:1250–1256. doi: 10.1289/ehp.7567.
- 62. Garruto RM. A commentary on neuronal degeneration and cell death in Guam ALS and PD: an evolutionary process of understanding. Curr Alzheimer Res. 2006;3:397–401. doi: 10.2174/156720506778249425.
- 63. Abdulla EM, Campbell IC. Use of neurite outgrowth as an in vitro method of assessing neurotoxicity. Ann N Y Acad Sci. 1993;679:276–279. doi: 10.1111/j.1749-6632.1993.tb18308.x.
- 64. Schymick JC, Talbot K, Traynor BJ. Genetics of sporadic amyotrophic lateral sclerosis. Hum Mol Genet. 2007;16. doi: 10.1093/hmg/ddm215.
- 65. van Es MA, van Vught PWJ, Blauw HM, Franke L, Saris CGJ, et al. Genetic variation in DPP6 is associated with susceptibility to amyotrophic lateral sclerosis. Nat Genet. 2007. doi: 10.1038/ng.2007.52.