Abstract
The aim of this paper is to develop a functional mixed effects modeling (FMEM) framework for the joint analysis of high-dimensional imaging data in a large number of locations (called voxels) of a three-dimensional volume with a set of genetic markers and clinical covariates. Our FMEM is useful for efficiently carrying out the candidate gene approaches in imaging genetic studies. FMEM consists of two novel components: a mixed effects model for modeling nonlinear genetic effects on imaging phenotypes by introducing the genetic random effects at each voxel, and a jumping surface model for modeling the variance components of the genetic random effects and fixed effects as piecewise smooth functions of the voxels. Moreover, FMEM naturally accommodates the correlation structure of genetic markers at each voxel, while the jumping surface model explicitly incorporates the intrinsic spatial smoothness of the imaging data. We propose a novel two-stage adaptive smoothing procedure to spatially estimate the piecewise smooth functions, particularly the irregular functional genetic variance components, while preserving their edges among different piecewise-smooth regions. We develop weighted likelihood ratio tests and derive their exact approximations to test the effect of the genetic markers across voxels. Simulation studies show that FMEM significantly outperforms voxel-wise approaches in terms of higher sensitivity and specificity to identify regions of interest for carrying out candidate genetic mapping in imaging genetic studies. Finally, FMEM is used to identify brain regions affected by three candidate genes, CR1, CD2AP, and PICALM, with the aim of shedding light on the pathological interactions between these candidate genes and brain structure and function.
Keywords: adaptive smoothing, candidate genetic mapping, functional mixed effects models, jumping surface model, likelihood ratio statistic, variance components
1 Introduction
Common mental and neurological disorders, such as autism and schizophrenia, are highly heritable and strongly associated with brain structure and function, but it has been difficult to unravel the genetic factors of these complex illnesses because many genetic factors may contribute to the susceptibility of the disease while the contribution of each factor is small. Since imaging data provide the most effective measures of brain function and structure, such data may serve as an important intermediate phenotype that ultimately can lead to discoveries of genes for these complex disorders. Imaging genetic studies, which collect both imaging and genetic data, have recently attracted extensive research interest for dissecting the genetic basis of neurological disorders [Gilmore and et al, 2010, Loth et al., 2011, Savitz and Drevets, 2009]. The common and important themes of both the imaging and genetic data include ultra-high dimensionality and complex correlation structures determined by the physical location. However, most of the existing methods for genetic association studies focus on low dimensional phenotypes (e.g. case-control status) and thus cannot handle high-dimensional imaging phenotypes or account for the spatial smoothness of the imaging measurements. On the other hand, most existing association methods in the neuroimaging literature do not study the joint effects of multiple genetic markers or accommodate their correlations due to linkage disequilibrium (LD).
There are two main genetic association approaches for correlating imaging phenotype with genotype at one or more polymorphic markers in order to uncover genetic predispositions to disease: i) the candidate gene approaches and ii) the genome-wide association studies (GWAS) [Zhu and Zhao, 2007, Amos et al., 2011]. Each approach has its own advantages and disadvantages. Candidate genes are commonly selected for study based on either a priori knowledge of their biological functional impact on the phenotype or disease in question or previous GWAS studies, such as those in the NHGRI GWAS catalog [Hindor et al., 2009]. The candidate gene approach tends to have rather high statistical power, but is incapable of discovering new genes or gene combinations. A standard statistical method for the candidate approach in imaging genetics is the voxel-wise analysis (VWA) framework. The voxel-wise analysis consists of Gaussian smoothing of the imaging data and subsequent fitting of a statistical model at each voxel. However, the voxel-wise analysis is generally not optimal in power and the use of Gaussian smoothing may introduce a substantial bias in the statistical results [Jones et al., 2005, T. Ball et al., 2012, Li et al., 2011, 2013].
Most imaging genetic studies have used the candidate gene approach, although more recently, more studies are beginning to scan the entire genome for common genetic variation [Thompson et al., 2013, Zhang et al., 2014]. A standard statistical method for GWAS in imaging genetics is the massive univariate linear modelling (MULM) framework [Hibar and et al, 2011, Shen et al., 2010]. This approach repeatedly fits a linear regression model for each pair of imaging voxels and genetic markers. MULM entails a large number of comparisons, and thus MULM can only detect extremely significant imaging-marker pairs. Moreover, MULM ignores both the spatial information of the imaging data and the correlation among genetic markers. See more detailed discussions in [Vounou et al., 2010, Ge et al., 2012, Thompson et al., 2013]. Recently, in Ge et al. [2012], a cluster-wide, marker set association framework was proposed by integrating cluster size inferences based on random field theory in order to utilize the spatial smoothness of the imaging data [Worsley et al., 2004] and a marker set analysis based on least-squares kernel machines in order to assess the joint association of potentially correlated and interacting loci [Liu et al., 2007].
The aim of this paper is to develop a functional mixed effects modeling (FMEM) framework for the joint analysis of high-dimensional imaging data with a set of genetic markers and clinical covariates. Our FMEM is useful for efficiently carrying out the candidate gene approaches in imaging genetic studies. FMEM consists of two novel components: a mixed effects model and a jumping surface model. Specifically, at each voxel, we use the mixed effects model with genetic random effects to assess the nonlinear association of potentially correlated and interacting loci with imaging phenotypes and the variance component (VC) of genetic random effects to detect the nonlinear effects of a marker set on imaging measures across voxels [Liu et al., 2007, Tzeng and Zhang, 2007, Wang and Chen, 2012]. To account for the spatial smoothness of the imaging data, we use the jumping surface model to explicitly model both the genetic variance component and fixed effects as piecewise smooth functions of the voxels with unknown edges and possible jumps. We develop a novel two-stage adaptive smoothing procedure to spatially estimate the genetic variance component function, while preserving its edges among different piecewise-smooth regions. We also develop weighted likelihood ratio tests and derive their exact approximations to test the effect of the genetic markers across the brain. Our numerical examples show that FMEM significantly outperforms voxel-wise approaches in terms of detection of meaningful effect regions.
The rest of the paper is organized as follows. In Section 2, we describe the proposed FMEM and its adaptive estimation procedure. Then, we develop our hypothesis testing procedure to assess the genetic effects as well as the effect of clinical variables on the imaging phenotypes. In Section 3, we evaluate the finite-sample performance of FMEM by using simulation studies and analyzing a real data set from the Alzheimer’s Disease Neuroimaging Initiative (ADNI). A few concluding remarks are given in the Discussion section.
2 Methods
2.1 Functional Mixed Effects Model
Suppose that we observe imaging measures, clinical variables, and genetic markers from n unrelated subjects. Let V be the whole brain and v be a voxel in V. For each individual i (i = 1, …, n), an NV × 1 vector of imaging measures is observed and denoted by {yi(v) : v ∈ V}. For notational simplicity, we only consider univariate imaging measures and thus, NV equals the number of voxels in V. Moreover, a K × 1 vector of clinical covariates xi = (xi1, …, xiK)T and a G × 1 vector gi = (gi1, …, giG)T for genetic markers are also collected for each individual. For instance, imaging measures can be brain structural and functional data at each location [Friston, 2009, Ashburner and Friston, 2000], and genetic markers can be various polymorphism types, such as single nucleotide polymorphisms (SNPs), block substitutions, and copy number variants [Liu et al., 2007, Tzeng and Zhang, 2007, Wang and Chen, 2012]. The objective of this paper is to develop FMEM to quantify genetic contributions to high-dimensional imaging measures.
Our FMEM framework consists of two novel components: a mixed effects model (MEM) at each voxel and a jumping surface model (JSM) for varying coefficient functions across the brain. First, at each voxel v in V, a mixed effects model is introduced as

$$y_i(v) = x_i^T\beta(v) + z_i^T\gamma(v) + e_i(v), \tag{1}$$

where β(v) = (β1(v), …, βK(v))T is a K × 1 vector, $\gamma(v) \sim N(0, \sigma_\gamma^2(v) I_L)$, $e_i(v) \sim N(0, \sigma_e^2(v))$, and the $e_i(v)$ are independent across subjects i and independent of γ(v) for all v in V, in which IL is an identity matrix. Let Z = (z1, …, zn) be an L × n matrix, X = (x1, …, xn), and Y(v) = (y1(v), …, yn(v))T be an n × 1 vector. Thus, under model (1), we have

$$Y(v) \sim N\big(X^T\beta(v),\ \sigma_\gamma^2(v)\, Z^TZ + \sigma_e^2(v)\, I_n\big).$$
Model (1) can be regarded as an alternative representation of the variance component models used in the literature [Liu et al., 2007, Tzeng and Zhang, 2007, Kang et al., 2010, Wang and Chen, 2012]. For instance, when ZTZ equals a kernel matrix K = (K(gi, gi′)), model (1) reduces to the linear mixed model considered in [Liu et al., 2007, Ge et al., 2012], where K(·,·) is a kernel function, such as the polynomial kernel, identity-by-state (IBS) kernel, or the Gaussian kernel. For instance, the IBS kernel is used in [Ge et al., 2012].
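To make the kernel matrix K = (K(gi, gi′)) concrete, the sketch below computes an IBS kernel from additively coded genotypes (0, 1, or 2 copies of an allele). The formula used here, the average over markers of (2 − |gis − gi′s|)/2, is one common definition of the IBS kernel and is an assumption, since the text does not spell it out; the function name `ibs_kernel` and the toy genotypes are hypothetical.

```python
def ibs_kernel(genotypes):
    """Return the n x n IBS kernel matrix for genotypes coded 0/1/2.

    genotypes: list of n subjects, each a list of G marker codes.
    Assumed definition: K(g_i, g_j) = sum_s (2 - |g_is - g_js|) / (2 G).
    """
    n = len(genotypes)
    n_markers = len(genotypes[0])
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Number of alleles shared identical-by-state, summed over markers.
            shared = sum(2 - abs(a - b) for a, b in zip(genotypes[i], genotypes[j]))
            kernel[i][j] = kernel[j][i] = shared / (2.0 * n_markers)
    return kernel


# Toy data (hypothetical): 3 subjects, 4 markers.
toy_genotypes = [
    [0, 1, 2, 0],
    [0, 1, 2, 0],
    [2, 1, 0, 2],
]
K = ibs_kernel(toy_genotypes)
```

Under this definition, subjects with identical genotypes have kernel value 1, and the matrix would play the role of ZTZ in model (1).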
Model (1) has a strong connection with a nonparametric fixed effects model given by
$$y_i(v) = x_i^T\beta(v) + h(g_i; v) + e_i(v), \tag{2}$$

where h(·; v) is an unknown centered function corresponding to the genetic effects at voxel v. To estimate the unknown functions h(·; v) in model (2), a common approach is to express them as linear combinations of some pre-specified basis functions (e.g., splines or kernels), that is, $h(g_i; v) = z_i^T\gamma(v)$, where γ(v) = (γ1(v), …, γL(v))T is an L × 1 vector for genetic random effects and zi is a pre-specified L × 1 vector of functions of gi. The variation of h(gi; v) is then controlled by the variation of the basis coefficients in γ(v). Moreover, a penalty function (e.g., L2 or L1) coupled with a tuning parameter is usually introduced to impose certain constraints on γ(v), and such a penalty function can be regarded as a prior distribution of the random effects. Thus, this connection provides a simple way to link the nonparametric fixed effects model (2) with model (1) [Liu et al., 2007, Tzeng and Zhang, 2007, Kang et al., 2010, Wang and Chen, 2012]. We focus on model (1) from now on.
We also assume spatial correlation as follows. For v and v′ in V, we assume
where 1(·) is an indicator function and ρe(v, v′) characterizes the spatial correlation between the measurement errors. Therefore, the covariance structure of yi(v) is given by
(3) |
Following Zhu et al. [2014], we propose a JSM for the genetic varying coefficient function $\sigma_\gamma^2(\cdot)$ and the fixed effect varying coefficient functions βj(·) for j = 1, …, K. For notational simplicity, we only introduce JSM for $\sigma_\gamma^2(\cdot)$ as follows:
(i) (Disjoint Partition) There is a finite and disjoint partition of V such that .
(ii) (Piecewise Smoothness) is smooth within each for l = 1, …, Lγ, but is discontinuous on , where is the boundary of and the jumping surface of .
A similar JSM can be defined for each βj(·) for j = 1, …, K.
We use JSM to explicitly delineate the fact that imaging data can be regarded as a noisy version of a piecewise-smooth function of with the possible existence of jumps or edges. In many neuroimaging datasets, those jumps or edges often reflect the functional and/or structural changes, such as white matter (WM) and grey matter (GM), across the brain. Therefore, the varying coefficient functions in model (1) may inherit the piecewise-smooth features from the imaging data. Furthermore, it is more reasonable to assume that different varying coefficient functions have different jumps or edges, since they may play different roles in characterizing the piecewise-smooth pattern of the imaging data.
Our FMEM consisting of model (1) and JSM can be regarded as a novel extension of the existing FMEMs and varying coefficient models in the literature [Zhu et al., 2011, Yuan et al., 2014, Morris and Carroll, 2006, Guo, 2002, Greven et al., 2010, Zhu et al., 2014, 2012], all of which are developed to model functional responses (of time or voxel) measured either cross-sectionally or longitudinally. In our FMEM, the jumping surface model is introduced to spatially characterize the piecewise smoothness of the imaging data. In contrast, most existing FMEMs and varying coefficient models reduce to a parametric model at each voxel, and most FMEMs do not explicitly model the piecewise smoothness of the imaging data. Moreover, the primary interest of our FMEM is to estimate the genetic variance component function, whereas that of other FMEMs and varying coefficient models is to estimate the fixed effects varying coefficient functions βj(·). Estimating the genetic variance component function is computationally and theoretically much harder than estimating the varying coefficient functions, since it must satisfy a nonnegativity constraint.
2.2 Two-stage Estimation Procedure
We propose a two-stage estimation procedure to estimate all varying coefficient functions and test their effects on imaging phenotypes. The key ideas of each stage are given as follows:
Stage I. Estimate the variance components, develop an adaptive smoothing method to estimate the genetic variance component function, and test the null hypothesis of no genetic effect across all voxels.
Stage II. Develop an adaptive smoothing method to spatially and adaptively estimate β(v) and then test associated hypotheses.
Moreover, after calculating , and , we can estimate , where and X = (x1, …,xn) is a p × n matrix. To approximate ρe(v, v′), we calculate the empirical correlation between and where . Since γ(v) and ρe(v, v′) are nuisance parameters, we do not focus on them throughout. Since our primary interest lies in the genetic effect, we focus on Stage I and only briefly discuss Stage II for the sake of space.
The key novelty of our estimation procedure is the adaptive smoothing method in Stage I for smoothing . Since the true variance components σγ(v) can be zero in some regions of interest and their estimates are always non-negative, directly applying standard smoothing methods, such as splines or kernel smoothing, to these nonnegative variance component estimates can introduce substantial bias in the estimation of functional genetic variance components. Thus, most existing smoothing methods cannot be successfully used in such smoothing problems [Zhu et al., 2011, Yuan et al., 2014, Morris and Carroll, 2006, Guo, 2002, Greven et al., 2010, Zhu et al., 2014, 2012, Polzehl and Spokoiny, 2000, Polzehl et al., 2010, Yue et al., 2010].
2.2.1 Stage I
The first stage consists of three major steps as follows:
Step I.1. Calculate the restricted maximum likelihood (REML) estimator of η(v) across all voxels v in V.
Step I.2. Spatially and adaptively re-estimate the genetic variance component by incorporating information from neighboring voxels.
Step I.3. Construct weighted likelihood ratio statistics and derive their approximate distributions to test the null hypothesis of no genetic effect across all voxels.
In Step I.1, we calculate the REML estimator of η(v) across voxels. There exists an (n – p) × n matrix Kx such that KxXT = 0 and rank(Kx) = n – p. A mixed effects model for Y* (v) = KxY(v) is given by
(4) |
where E(v) = (e1(v), …, en (v))T. Based on the distributional assumptions in (1), we have Y* (v) ~ N(0, ΣY* (v)), where . Thus, at each voxel v, the REML estimate of η(v), denoted by , is to maximize the REML function given by
(5) |
Since our primary interest lies in the genetic variance component, we fix the error variance at its REML estimate from here on.
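The voxel-wise REML step can be illustrated with a small sketch. It assumes that KxKxT is the identity, that the eigenvalues dk of KxZTZKxT and the rotated data w = UTY*(v) have already been computed, and that the variance ratio is searched on a grid; these choices, the profile-likelihood form, and all function names are illustrative assumptions rather than the implementation used in the paper.

```python
import math


def profile_reml(lam, d, w):
    """Profile restricted log-likelihood (up to a constant) at ratio lam.

    d: eigenvalues d_k (assumed precomputed); w: rotated data U^T Y*(v).
    lam = sigma_gamma^2 / sigma_e^2; sigma_e^2 is profiled out in closed form.
    """
    m = len(d)
    sigma_e2 = sum(wk * wk / (1.0 + lam * dk) for dk, wk in zip(d, w)) / m
    log_det = sum(math.log(1.0 + lam * dk) for dk in d)
    return -0.5 * (m * math.log(sigma_e2) + log_det), sigma_e2


def reml_grid_search(d, w, lam_max=10.0, step=0.01):
    """Maximize the profile REML over a grid of lam >= 0 (boundary included)."""
    best = None
    n_steps = int(round(lam_max / step))
    for i in range(n_steps + 1):
        lam = i * step
        value, sigma_e2 = profile_reml(lam, d, w)
        if best is None or value > best[0]:
            best = (value, lam, sigma_e2)
    _, lam_hat, sigma_e2_hat = best
    # Return (sigma_gamma^2 estimate, sigma_e^2 estimate).
    return lam_hat * sigma_e2_hat, sigma_e2_hat


# Toy input (hypothetical), constructed so that w_k^2 = 1 + 2 d_k.
d_toy = [0.5, 1.0, 2.0, 4.0]
w_toy = [math.sqrt(1.0 + 2.0 * dk) for dk in d_toy]
sigma_gamma2_hat, sigma_e2_hat = reml_grid_search(d_toy, w_toy)
```

Because the grid starts at zero, the search can return an estimate exactly on the boundary, which is the situation that motivates the non-standard testing theory discussed later.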
In Step I.2, we construct a weighted REML function to estimate by incorporating the spatial information in a neighborhood B(v, h) for each voxel v with a specific radius h as follows:
(6) |
where ωγ(v, v′; h) is a weight function of voxels v, v′, and the radius h. We maximize the weighted REML function in (6) in order to calculate the weighted REML estimator of the genetic variance component. The weight function ωγ(v, v′; h) measures the data similarity between the two voxels v and v′ such that Σv′∈B(v,h)ωγ(v, v′; h) = 1 and ωγ(v, v′; h) ≥ 0. A large value of ωγ(v, v′; h) means that the information contained in the voxels v and v′ is very similar, whereas ωγ(v, v′; h) ≈ 0 indicates that the data in voxel v′ carry little information about σγ(v). The adaptive weight ωγ(v, v′; h) plays a critical role in preventing over-smoothing of the estimated genetic variance component and in preserving the edges of its significant regions.
In Step I.3, to assess the genetic effects on imaging phenotypes across all voxels, we formulate it as testing the following null and alternative hypotheses:
(7) |
We test (7) by using the weighted REML ratio statistic defined by
(8) |
Since all the subjects share the same random effect γ(v), the standard asymptotic results in Stram and Lee [1994] are invalid and can perform very poorly even for the unweighted REML ratio statistics for testing random effects in model (1). However, we provide an exact null distribution of RLRTn(v) below.
2.2.2 Step I.2: Adaptive Estimation of the Genetic Variance Component
Following the adaptive estimation (AET) procedure proposed in [Li et al., 2011], we adaptively determine ωγ(v, v′; h) and then calculate the weighted REML estimate as h increases from h0 = 0 to a predetermined value hS = r0. The key novelty of AET is to build a sequence of for h0 = 0 < h1 < … < hS = r0 at each voxel and then sequentially determine ωγ(v, v′; hs) for all v′ ∈ B(v, hs) based on for all and s = 1, …, S. However, one cannot apply many existing smoothing methods, such as local kernel or the propagation-separation method [Zhu et al., 2014, Polzehl and Spokoiny, 2000, Polzehl et al., 2010, Fan and Gijbels, 1996], to directly smooth the REML estimates of the genetic variance component in the non-activation region. Specifically, since these estimates are always non-negative even in the voxels of the non-activation region, directly calculating their weighted means does not reduce the bias there. A path diagram of AET is given as follows:
The three key steps of AET, including weight adaptation, estimation, and termination checking, are presented as follows.
- In the weight adaptation step (i), we select a series of radii with ch ∈ (1, 2), say ch = 1.125. We use a relatively small ch in order to increase estimation robustness and prevent oversmoothing. We then set s = 1 and h1 = ch. The adaptive weights in (6) are given by

$$\omega_\gamma(v, v'; h_s) = \frac{K_{loc}(\|v - v'\|_2/h_s)\, K_{st}(D_\gamma(v, v'; h_{s-1})/C_n)}{\sum_{v'' \in B(v, h_s)} K_{loc}(\|v - v''\|_2/h_s)\, K_{st}(D_\gamma(v, v''; h_{s-1})/C_n)},$$

where Kloc(u) = (1 – u)+ and Kst(u) = min(1, 2(1 – u²))+, and ∥·∥2 denotes the Euclidean norm of a vector (or a matrix). Moreover, Dγ (v, v′; hs−1) is set as
(9)
so that the difference between consecutive is within the precision of , where is estimated by using the inverse of the Fisher information matrix of from the likelihood function (5) with h = h0. We choose $C_n = n^{1/3}\chi^2(1)_{0.5}$ for Dγ (v, v′; hs−1) defined in (9), where $\chi^2(1)_{0.5}$ is the 0.5-percentile of the χ²(1) distribution. These quantities are fixed for subsequent updates of hs.
The rationale for choosing different tuning parameters given above is given as follows. The weight Kst(Dγ (v, v′; hs−1)/Cn) downweights the role of a voxel v′ ∈ B(v, hs) in the REML functions if Dγ (v, v′; hs−1) is large. The weight Kloc(∥v – v′∥2/hs) gives less weight to the voxel v′ ∈ B(v, hs), whose location is far from the voxel v. The scale Cn is used to penalize the similarity between any two voxels v and v′ in a similar manner to bandwidth, and an appropriate choice of Cn is crucial for the behavior of the adaptive smoothing method in Stage I. As discussed in Zhu et al. [2014] and Li et al. [2011], Cn should satisfy Cn/n = o(1) and .
- In the estimation step (ii), for each voxel v in V and for the radius hs given ωγ(v, v′; hs), we calculate the weighted REML estimate by maximizing the weighted REML function defined in equation (6).
- In the termination checking step (iii), after the S0–th iteration, we calculate a stopping criterion based on a distance between and given by
(10)
for s > S0. Then, we compare with a benchmark, denoted by , for s > S0. If , then we set and the estimation for this voxel v is terminated. If s = S and , is set as and the estimation process terminates. The algorithm stops when the estimation is finished for all v in V. If s ≤ S0 or for s < S0 S – 1, then we go back to the weight adaptation step (i) with an increased radius . Throughout the paper, we set S0 = 2, , and S = 10. Note that is a decreasing function in s which makes the stopping criteria more and more stringent when the radius increases in order to prevent over-smoothing.
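The weight adaptation step described above can be sketched as follows. The sketch assumes that the radii form the geometric series hs = ch^s, that the weights are the normalized product of the two kernels, and that the similarity statistics Dγ(v, v′; hs−1) and the scale Cn are supplied by the caller; the function names and the toy inputs are hypothetical.

```python
import math


def k_loc(u):
    """Location kernel K_loc(u) = (1 - u)_+."""
    return max(0.0, 1.0 - u)


def k_st(u):
    """Similarity kernel K_st(u) = min(1, 2(1 - u^2))_+."""
    return max(0.0, min(1.0, 2.0 * (1.0 - u * u)))


def adaptive_weights(center, neighbors, d_stats, h, c_n):
    """Normalized weights omega(v, v'; h) over the neighborhood B(v, h).

    center: coordinates of voxel v; neighbors: coordinates of voxels v'
    (including v itself); d_stats: D(v, v'; h_{s-1}) for each neighbor.
    """
    raw = []
    for coords, d_stat in zip(neighbors, d_stats):
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(center, coords)))
        raw.append(k_loc(dist / h) * k_st(d_stat / c_n))
    total = sum(raw)  # positive because v itself has raw weight 1
    return [r / total for r in raw]


# Radii h_1 < ... < h_S with c_h = 1.125 and S = 10 (geometric series assumed).
radii = [1.125 ** s for s in range(1, 11)]

# Toy neighborhood (hypothetical values): the third voxel is very dissimilar.
weights = adaptive_weights(
    center=(0, 0, 0),
    neighbors=[(0, 0, 0), (1, 0, 0), (0, 1, 0)],
    d_stats=[0.0, 0.5, 100.0],
    h=radii[1],
    c_n=2.0,
)
```

A neighbor with a large similarity statistic receives zero weight, which is how the procedure avoids averaging across an edge between regions.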
2.2.3 Step I.3: Testing
We perform hypothesis testing in (7) by using the testing statistics and their corresponding p-values. Let be the spectral decomposition of such that D0 = diag(d1, …, dn-p) is the diagonal matrix of eigenvalues dk and U is an (n – p) × (n – p) orthonormal matrix. Without loss of generality, we choose Kx such that . We obtain the following theorem, whose proof is included in the supplementary document.
Theorem 1. Under model (1), RLRTn(v) can be written as
(11) |
where and D(v′; t) is given by
(12) |
Moreover, under the null hypothesis H0,γ, (v), we have
(13) |
where means equality in distribution and the δl(v)’s are i.i.d N(0, 1) random variables.
Although Theorem 1 provides an efficient way of approximating the null distribution of RLRTn(v), a difficulty arises from the complex spatial correlations across voxels v′ ∈ B(v, h). One approach for dealing with such an issue is to estimate the spatial correlation for any pair of voxels, which can be computationally intensive. To avoid calculating spatial correlations, we develop a wild bootstrap method to efficiently simulate the null finite sample distribution of RLRTn(v). The detailed steps of this bootstrap method are presented in the supplementary document. After the p-values for all voxels are computed, either a false discovery rate (FDR) method or random field theory (RFT) is applied to correct for multiple comparisons [Ge et al., 2012].
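As an illustration of the multiple-comparison step, the sketch below applies the Benjamini-Hochberg step-up procedure to a vector of voxel-wise p-values. The text only states that an FDR method is used, so the choice of Benjamini-Hochberg, the level q = 0.05, and the function name are assumptions made for the example.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking the hypotheses rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    # Find the largest rank k with p_(k) <= q * k / m.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            largest_rank = rank
    rejected = [False] * m
    for idx in order[:largest_rank]:
        rejected[idx] = True
    return rejected


# Toy p-values (hypothetical) for five voxels.
toy_p = [0.001, 0.045, 0.02, 0.5, 0.008]
toy_rejected = benjamini_hochberg(toy_p, q=0.05)
```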
2.3 Stage II
The second stage is to estimate β(v) and carry out statistical inference on β(v). At each voxel v, we consider model (1) given by
(14) |
After calculating , we can calculate . Since all components of β(v) and ΣY (v) are statistically orthogonal to each other, we fix ΣY (v) at from here on. Since the true β(v) is not on the boundary of parameter space, different adaptive smoothing methods can be used here [Zhu et al., 2014, Polzehl and Spokoiny, 2000, Polzehl et al., 2010]. For simplicity, we put the detailed steps of Stage II in the supplementary document.
3 Results
3.1 Simulation Studies
We simulated data at all NV = 5,808 voxels on a 44×44×3 phantom image. Each z-slice contains the same effect regions. At each voxel, we simulated the univariate imaging measures according to model (1) with β(v) = (β0(v), β1(v), β2(v), β3(v))T and xi = (1, xi1, xi2, xi3)T. Moreover, the covariates xi1, xi2, and xi3 were generated from a Gaussian distribution with mean 40 and standard deviation 10, a Bernoulli distribution with success probability 0.5, and a Bernoulli distribution with success probability 0.3, respectively. These three covariates were designed to mimic the common clinical variables: age, gender, and disease status. For a slice of the phantom image, the effect areas for β0(v) were divided into 16 regions with 4 different values ranging from 0.02 to 0.08, increasing by 0.02 (Figure 1(a)); for β1(v), the effect regions were divided into 25 regions ranging from 10−2.5 to 10−12.5, decreasing by a rate of 10−2.5 (Figure 1(b)); for β2(v), the whole space was separated into 3 regions with values 0, 0.05, and 0.1 (Figure 1(c)); the effect area of β3(v) was divided into 9 regions with values ranging from 0 to 0.1, increasing by differences of 0.025 (Figure 1(d)).
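The covariate design above can be sketched in a few lines. The sample size n = 100, the seed, and the function name `simulate_covariates` are choices made for this illustration only.

```python
import random


def simulate_covariates(n, seed=0):
    """Draw x_i = (1, age, gender, disease status) for n subjects."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.gauss(40.0, 10.0)              # Gaussian, mean 40, SD 10
        gender = 1 if rng.random() < 0.5 else 0  # Bernoulli(0.5)
        disease = 1 if rng.random() < 0.3 else 0 # Bernoulli(0.3)
        rows.append((1.0, age, gender, disease))
    return rows


phantom_shape = (44, 44, 3)
n_voxels = phantom_shape[0] * phantom_shape[1] * phantom_shape[2]
covariates = simulate_covariates(n=100)
```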
The genetic information was simulated according to the SNP data obtained from the publicly accessible data of the Alzheimer’s Disease Neuroimaging Initiative (ADNI). ADNI is an ongoing longitudinal study with the primary purpose of exploring the genetic and neuroimaging information associated with late-onset Alzheimer’s disease (LOAD). The study recruited elderly subjects older than 65 years of age, consisting of about 400 subjects with mild cognitive impairment (MCI), about 200 subjects with Alzheimer’s disease (AD), and around 200 healthy controls. Each subject was followed for at least 3 years. During the study period, the subjects were assessed with magnetic resonance imaging (MRI) measures and psychiatric evaluation to determine the diagnosis status at each time point. Genetic information was also collected from each subject at baseline and was genotyped by the Illumina 610 Quad array with more than 620,000 single nucleotide polymorphisms (SNPs). More information on ADNI is provided in the real data analysis of Section 3.2. We simulated the genetic information based on the following two scenarios:
Scenario I. To preserve the linkage disequilibrium among SNPs, we utilized all of the SNPs on chromosome 1 from 197 Caucasian controls to generate the genetic effects. After eliminating the SNPs with minor allele frequency (MAF) less than 5%, there were 31554 out of 45627 SNPs left. Then, we randomly chose 20 SNPs and 100 subjects among the 197 healthy controls as the simulated genetic data zi in (1). In this case, n = 100. If any of these 20 SNPs had MAF less than 5% in the selected subjects, then the genetic data were resampled until all of the 20 SNPs had MAF ≥ 5%.
Scenario II. To evaluate the performance of FMEM in the case of high LD, we selected the SNPs from the same gene in the second scenario. Searching the SNPs on the gene PICALM, which has been found to be relevant for Alzheimer’s disease in many studies [Harold et al., 2009] using the gene list “glist-hg18” provided by PLINK, there were 23 SNPs on PICALM with MAF larger than 5%. After eliminating the missing values, there are 176 healthy controls with complete genotype data at these 23 SNPs. We randomly selected 7 SNPs from 75 healthy controls to be zi in (1). Although there is strong LD among these 7 SNPs, no SNP has perfect correlation (1 or −1) with any other SNPs in these 75 subjects. In this case, n = 75.
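The MAF screening used in both scenarios can be illustrated with the following sketch. It assumes genotypes are coded as 0, 1, or 2 copies of one allele with no missing values; the function names and the toy genotype matrix are hypothetical.

```python
def minor_allele_frequency(column):
    """MAF of one SNP from genotypes coded as 0/1/2 copies of an allele."""
    freq = sum(column) / (2.0 * len(column))
    return min(freq, 1.0 - freq)


def filter_snps_by_maf(genotypes, threshold=0.05):
    """Return the indices of SNPs whose MAF is at least the threshold.

    genotypes: list of subjects, each a list of SNP codes.
    """
    n_snps = len(genotypes[0])
    keep = []
    for s in range(n_snps):
        column = [row[s] for row in genotypes]
        if minor_allele_frequency(column) >= threshold:
            keep.append(s)
    return keep


# Toy data (hypothetical): 10 subjects, 3 SNPs; the first SNP is monomorphic.
toy_snps = [
    [0, 0, 2], [0, 1, 2], [0, 2, 2], [0, 1, 2], [0, 0, 1],
    [0, 1, 2], [0, 2, 2], [0, 1, 2], [0, 0, 1], [0, 2, 2],
]
kept_snps = filter_snps_by_maf(toy_snps, threshold=0.05)
```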
In both scenarios, the SNP effects were assumed to be additive. The γ(v)’s were generated from a multivariate normal distribution with mean zero and covariance matrix $\sigma_\gamma^2(v) I_L$. Different values of the genetic variance component, which represent different signal-to-noise ratios, were chosen to examine the finite sample performance of our method and also to test whether FMEM can perform well for different shapes. See Figure 3 (b) and Figure 3 (e) for Scenarios I and II. Moreover, we overlay some of the effect areas of β3(v) and the genetic variance component in order to account for the fact that the brain phenotype is an intermediate expression of disease progression. The genetic variance components of the effect regions in Scenario I ranged from 0.005 to 0.025, increasing by 0.0025, whereas those of the effect regions in Scenario II ranged from 0.005 to 0.045, increasing by 0.005. The random error ei(v) was independently distributed as a univariate normal distribution with mean 0 and standard deviation 3 for all voxels. We set the number of bootstrap samples M and the number of repetitions to be 200.
Tables 1 and 2 summarize the estimation results for the genetic variance component obtained from FMEM and the voxel-wise method for both scenarios. The tables include the average absolute value of the bias (|BIAS|), the root mean square error (RMS), the standard deviation (SD), and the ratio of RMS to SD (RE). RMS is based on the empirical mean and the SD is based on the theoretical mean. As shown in both tables, FMEM produces smaller estimation bias, RMS, and SD compared with the voxel-wise method, which indicates that FMEM yields much more accurate estimation.
Table 1.
| True value | FMEM \|BIAS\| | FMEM RMS | FMEM SD | FMEM RE | Voxel-wise \|BIAS\| | Voxel-wise RMS | Voxel-wise SD | Voxel-wise RE |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.001 | 0.002 | 0.002 | 1 | 0.007 | 0.005 | 0.005 | 1 |
| 0.005 | 2.36e-06 | 0.003 | 0.003 | 1 | 0.005 | 0.005 | 0.005 | 1 |
| 0.0075 | 0.0005 | 0.003 | 0.003 | 1 | 0.005 | 0.006 | 0.006 | 1 |
| 0.01 | 0.001 | 0.004 | 0.004 | 1 | 0.006 | 0.008 | 0.008 | 1 |
| 0.0125 | 0.001 | 0.004 | 0.004 | 1 | 0.006 | 0.008 | 0.008 | 1 |
| 0.015 | 0.002 | 0.005 | 0.005 | 1 | 0.008 | 0.010 | 0.010 | 1 |
| 0.0175 | 0.003 | 0.005 | 0.005 | 1 | 0.008 | 0.010 | 0.010 | 1 |
| 0.020 | 0.002 | 0.006 | 0.006 | 1 | 0.010 | 0.012 | 0.012 | 1 |
| 0.0225 | 0.003 | 0.006 | 0.006 | 1 | 0.010 | 0.013 | 0.013 | 1 |
| 0.025 | 0.004 | 0.006 | 0.006 | 1 | 0.010 | 0.014 | 0.014 | 1 |
Table 2.
| True value | FMEM \|BIAS\| | FMEM RMS | FMEM SD | FMEM RE | Voxel-wise \|BIAS\| | Voxel-wise RMS | Voxel-wise SD | Voxel-wise RE |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.002 | 0.003 | 0.003 | 1 | 0.003 | 0.008 | 0.008 | 1 |
| 0.005 | 0.0001 | 0.004 | 0.004 | 1 | 0.007 | 0.126 | 0.126 | 1 |
| 0.010 | 0.001 | 0.006 | 0.006 | 1 | 0.011 | 0.016 | 0.016 | 1 |
| 0.015 | 0.002 | 0.008 | 0.008 | 1 | 0.014 | 0.020 | 0.02 | 1 |
| 0.020 | 0.002 | 0.010 | 0.010 | 1 | 0.024 | 0.017 | 0.017 | 1 |
| 0.025 | 0.003 | 0.020 | 0.020 | 1 | 0.020 | 0.029 | 0.029 | 1 |
| 0.030 | 0.005 | 0.013 | 0.013 | 1 | 0.023 | 0.032 | 0.032 | 1 |
| 0.035 | 0.004 | 0.014 | 0.014 | 1 | 0.026 | 0.035 | 0.035 | 1 |
| 0.040 | 0.015 | 0.006 | 0.006 | 1 | 0.028 | 0.040 | 0.040 | 1 |
| 0.045 | 0.007 | 0.016 | 0.016 | 1 | 0.031 | 0.040 | 0.040 | 1 |
We tested the hypotheses in (7) for all voxels in V based on FMEM, its corresponding voxel-wise method, and the score test based on the IBS kernel used in Ge et al. [2012]. Moreover, we evaluated their finite-sample performance in cluster-based thresholding [Silver et al., 2011]. Specifically, we first thresholded the p-values for all voxels in V by using an initial p-value of 0.01 as suggested by Silver et al. [2011] in order to identify clusters of contiguous supra-threshold voxels. Then, the thresholded clusters were matched with the 9 true effect areas in Figure 3 (b) or (e). If a specific thresholded cluster overlaps with at least one voxel in any of the 9 true effect regions, we call such a cluster a “true positive”. In contrast, if a specific thresholded cluster does not overlap with any voxels of the 9 effect regions, we call it a “false positive”. We summarized the hypothesis testing results by the average dice overlap ratio (DOR), the average number of false positive clusters, and the average size, in number of voxels, of the false positive clusters. DOR is the ratio of the number of true positive clusters to the true number of effect areas, which is 9 in this simulation setting. Thus, a higher DOR means a higher probability of detecting the true effect regions.
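The cluster-based evaluation described above can be sketched on a single two-dimensional slice. The sketch assumes 4-connectivity for contiguity and a strict threshold p < 0.01, neither of which is specified in the text; DOR is computed literally as the number of true positive clusters divided by the number of true effect regions. The function names and the toy p-value map are hypothetical.

```python
from collections import deque


def find_clusters(p_map, p_threshold=0.01, min_size=1):
    """Contiguous clusters (4-connectivity assumed) of voxels with p < threshold."""
    n_rows, n_cols = len(p_map), len(p_map[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    clusters = []
    for r in range(n_rows):
        for c in range(n_cols):
            if seen[r][c] or p_map[r][c] >= p_threshold:
                continue
            seen[r][c] = True
            queue, cluster = deque([(r, c)]), []
            while queue:  # breadth-first search over supra-threshold neighbors
                i, j = queue.popleft()
                cluster.append((i, j))
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    a, b = i + di, j + dj
                    if (0 <= a < n_rows and 0 <= b < n_cols
                            and not seen[a][b] and p_map[a][b] < p_threshold):
                        seen[a][b] = True
                        queue.append((a, b))
            if len(cluster) >= min_size:  # cluster size threshold
                clusters.append(cluster)
    return clusters


def summarize_clusters(clusters, true_regions):
    """Return (DOR, number of false positive clusters)."""
    true_voxels = set().union(*true_regions)
    n_true_pos = sum(1 for cl in clusters if any(v in true_voxels for v in cl))
    return n_true_pos / len(true_regions), len(clusters) - n_true_pos


# Toy slice (hypothetical p-values) with two true effect regions.
toy_p_map = [
    [0.001, 0.002, 0.5, 0.5, 0.5],
    [0.5, 0.5, 0.5, 0.5, 0.004],
    [0.5, 0.5, 0.5, 0.5, 0.5],
    [0.003, 0.5, 0.5, 0.5, 0.5],
]
toy_regions = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(3, 0), (3, 1)}]
dor_1, fp_1 = summarize_clusters(find_clusters(toy_p_map, min_size=1), toy_regions)
dor_2, fp_2 = summarize_clusters(find_clusters(toy_p_map, min_size=2), toy_regions)
```

Raising the cluster size threshold removes isolated false positive voxels but can also remove small true positive clusters, which is the trade-off examined in Tables 3 and 4.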
As shown in Tables 3 and 4, if we set the cluster size threshold as one voxel, FMEM has smaller DOR and a smaller number of false positive clusters compared with the voxel-wise method. When the cluster size threshold increases to 10 voxels, FMEM has a similar DOR value as that of the no threshold case, whereas the DOR of the voxel-wise approach reduces by about 20%. The score test based on the IBS kernel has little power to detect the nine effect regions with subtle effects. It may be caused by both the relatively low sensitivity of the score test itself and the misspecified IBS kernel.
Table 3.
Scenario I

| Cluster size threshold | Measure | FMEM Mean | FMEM SD | Voxel-wise Mean | Voxel-wise SD | Score Mean | Score SD |
|---|---|---|---|---|---|---|---|
| Voxel Size = 1 | DOR | 0.94 | 0.05 | 0.99 | 0.02 | 0 | 0 |
| Voxel Size = 1 | False Positive Cluster Number | 1.88 | 6.12 | 21.30 | 12.29 | 0 | 0 |
| Voxel Size = 1 | False Positive Cluster Size | 1.03 | 0.04 | 1.06 | 0.06 | NA | NA |
| Voxel Size = 10 | DOR | 0.91 | 0.04 | 0.83 | 0.10 | 0 | 0 |
| Voxel Size = 10 | False Positive Cluster Number | 0 | 0 | 0 | 0 | 0 | 0 |
| Voxel Size = 10 | False Positive Cluster Size | NA | NA | NA | NA | NA | NA |
Table 4.
Threshold | Measure | FMEM Mean | FMEM SD | Voxel-wise Mean | Voxel-wise SD | Score Mean | Score SD
---|---|---|---|---|---|---|---
Scenario II | | | | | | |
Voxel Size = 1 | DOR | 0.86 | 0.06 | 0.996 | 0.02 | 0.49 | 0.21
Voxel Size = 1 | False Positive Cluster Number | 1.35 | 4.85 | 15.45 | 12.92 | 0 | 0
Voxel Size = 1 | False Positive Cluster Size | 1.07 | 0.08 | 1.05 | 0.07 | NA | NA
Voxel Size = 10 | DOR | 0.85 | 0.07 | 0.78 | 0.11 | 0.01 | 0.03
Voxel Size = 10 | False Positive Cluster Number | 0 | 0 | 0 | 0 | 0 | 0
Voxel Size = 10 | False Positive Cluster Size | NA | NA | NA | NA | NA | NA
Table 5 summarizes the average number of significant voxels identified by the three methods in each effect region of Scenarios I and II. In Table 5, FMEM identifies fewer voxels in the non-effect regions, while detecting more voxels in the effect regions in both scenarios, with the exception of the first effect region in Scenario II (8.11 voxels versus 27.08 for the voxel-wise method). The computational time of FMEM is around 30 minutes for the first scenario and 20 minutes for the second. Table 5 also confirms that the score test based on the IBS kernel has little statistical power in detecting the nine effect regions. Therefore, FMEM significantly outperforms the voxel-wise method and the score test based on the IBS kernel [Ge et al., 2012] in terms of detecting the true effect regions and controlling the false positive error rate.
Table 5.
Number of Voxels | 4749 | 207 | 135 | 111 | 147 | 75 | 48 | 144 | 111 | 81
---|---|---|---|---|---|---|---|---|---|---
Scenario I | | | | | | | | | |
Effect size | 0 | 0.005 | 0.0075 | 0.010 | 0.0125 | 0.015 | 0.0175 | 0.020 | 0.0225 | 0.025
Score | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
Voxel-wise | 123.91 | 36.3 | 38.25 | 43.08 | 72.245 | 42.98 | 31.535 | 104.32 | 86.1 | 65.97
FMEM | 68.43 | 50.52 | 77.32 | 83.09 | 133.7 | 69.51 | 45.57 | 142.26 | 110.07 | 80.54
Scenario II | | | | | | | | | |
Effect size | 0 | 0.005 | 0.010 | 0.015 | 0.020 | 0.025 | 0.030 | 0.035 | 0.040 | 0.045
Score | 0.165 | 0.905 | 2.475 | 5.11 | 11.065 | 7.91 | 6.715 | 24.135 | 21.655 | 17.81
Voxel-wise | 110.93 | 27.08 | 33.105 | 40.05 | 65.92 | 39.06 | 27.51 | 90.69 | 73.42 | 56.59
FMEM | 76.57 | 8.11 | 46.05 | 71.82 | 124.42 | 66.15 | 43.182 | 139.51 | 108.35 | 79.49
3.2 ADNI Data Analysis
The ADNI study began in 2004 and has had three phases thus far: ADNI-1, ADNI GO, and ADNI-2. The overall objective of the ADNI study is to determine the relationships among the clinical, cognitive, imaging, genetic, and biochemical biomarker characteristics of the entire spectrum of AD as the pathology evolves from normal aging through early mild cognitive impairment, to mild cognitive impairment, to late mild cognitive impairment, to dementia or AD. The Principal Investigator of this initiative is Dr. Michael W. Weiner, MD, VA Medical Center and University of California San Francisco. For up-to-date information, see www.adni-info.org.
The aim of this ADNI data analysis is to use FMEM to identify brain regions affected by candidate genes, in the hope of shedding light on the pathological interactions between these candidate genes and brain structure and function. The data we used to evaluate the performance of FMEM were from ADNI-1. About 800 subjects older than 65 were recruited and followed for at least 3 years. The 800 subjects included 200 healthy controls, 400 subjects with different levels of mild cognitive impairment (MCI), and 200 subjects with Alzheimer’s disease (AD). Besides the SNPs and the T1-weighted MRI measurements, demographic information and psychiatric examination scores were collected to determine the diagnostic status at each scheduled visit.
The raw magnetic resonance imaging (MRI) data were collected from a variety of 1.5 Tesla MRI scanners with protocols individualized for each scanner, including standard T1-weighted images obtained using volumetric 3-dimensional sagittal MPRAGE or equivalent protocols with varying resolutions. The typical protocol included: repetition time (TR) = 2400 ms, inversion time (TI) = 1000 ms, flip angle = 8°, field of view (FOV) = 24 cm, with a 256 × 256 × 170 acquisition matrix in the x-, y-, and z-dimensions yielding a voxel size of 1.25 × 1.26 × 1.2 mm³.
The T1-weighted MRI images were preprocessed by standard image processing steps including anterior commissure (AC) and posterior commissure (PC) correction, bias field correction, skull-stripping, intensity inhomogeneity correction, cerebellum removal, segmentation, and nonlinear registration [Wang et al., 2011]. The brain was segmented into four different tissues: grey matter (GM), white matter (WM), ventricle (VN), and cerebrospinal fluid (CSF). We quantified the local volumetric group differences by generating RAVENS maps [Davatzikos et al., 2001] for the whole brain and for each of the segmented tissue types (GM, WM, VN, and CSF), using the deformation field obtained during registration. The RAVENS methodology is based on a volume-preserving spatial transformation, since this process changes an individual’s brain morphology to conform it to the morphology of the Jacob template.
We are interested in detecting meaningful brain regions of interest that are associated with several candidate genes. We included only the subjects whose diagnostic status was healthy control or Alzheimer’s disease at baseline and who had no status change during ADNI-1. After screening, the total number of subjects we included was 372 (195 healthy controls (HCs) and 177 ADs). The clinical covariates of interest included gender, baseline age, square of baseline age, handedness, education, baseline intracranial volume, and the risk of Apolipoprotein E (APOE). Handedness was treated as a binary variable, education was the self-reported number of years of education, and the risk of APOE was assumed to be additive. Specifically, the APOE risk of a subject was 3 if he/she carried ε4 at both alleles, 2 if he/she carried ε3 and ε4, and 0 if the two APOE alleles were ε2 and ε3; all other combinations of APOE alleles were assigned risk 1.
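The APOE risk coding described above amounts to a small lookup on the unordered pair of alleles. The following Python sketch illustrates it; the function name `apoe_risk` and the string labels "e2", "e3", and "e4" for the ε2, ε3, and ε4 alleles are hypothetical choices made for this example, not part of the original analysis.

```python
VALID_ALLELES = {"e2", "e3", "e4"}  # hypothetical labels for ε2, ε3, ε4


def apoe_risk(allele1, allele2):
    """Return the APOE risk score for an unordered pair of alleles.

    Coding taken from the text: ε4/ε4 -> 3, ε3/ε4 -> 2, ε2/ε3 -> 0,
    and every other combination -> 1.
    """
    if allele1 not in VALID_ALLELES or allele2 not in VALID_ALLELES:
        raise ValueError("alleles must be one of 'e2', 'e3', 'e4'")
    genotype = tuple(sorted((allele1, allele2)))  # order of alleles is irrelevant
    if genotype == ("e4", "e4"):
        return 3
    if genotype == ("e3", "e4"):
        return 2
    if genotype == ("e2", "e3"):
        return 0
    return 1  # ε2/ε2, ε2/ε4, ε3/ε3


print(apoe_risk("e4", "e3"))  # 2
print(apoe_risk("e3", "e3"))  # 1
```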
Many genes have been reported to be causal in the progression of Alzheimer’s disease. We selected three candidate causal genes, CR1 on chromosome 1, CD2AP on chromosome 6, and PICALM on chromosome 11, due to their strong association with the progression of Alzheimer’s disease [Harold et al., 2009, Naj et al., 2011, Lambert et al., 2009]. Specifically, PICALM encodes the phosphatidylinositol-binding clathrin assembly protein and is highly correlated with the emergence of late-onset AD, possibly because a perturbation at the synapse triggers a change in its function [Harold et al., 2009]. The gene CD2AP encodes the CD2-associated protein and is involved in cell membrane processes, including endocytosis, that play critical roles in neurodegeneration and Aβ clearance from the brain [Naj et al., 2011]. The gene CR1 encodes the complement component (3b/4b) receptor 1, and the pathways involving CR1 are involved in the AD process, specifically in the clearance of Aβ peptides, which are the primary component of amyloid plaques [Lambert et al., 2009].
We first matched the SNPs in ADNI with the gene list “glist-hg18” provided by PLINK [Purcell et al., 2007] and were able to locate 16, 15, and 23 SNPs on the selected CR1, CD2AP, and PICALM genes, respectively. All these SNPs passed the quality control procedure with minor allele frequency (MAF) > 5% and Hardy-Weinberg Equilibrium (HWE) test p-value > 0.01. The MAFs of the SNPs of the selected genes vary from 0.1 to 0.5. After deleting missing values, there were 335, 299, and 328 subjects corresponding to the CR1, CD2AP, and PICALM genes, respectively. The MAFs of all selected SNPs and demographic information are included in the supplementary document.
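As an illustration of the quality control criteria above, the following Python sketch checks a single SNP against the two thresholds (MAF > 5% and HWE p-value > 0.01) given its genotype counts. The text does not state which HWE test was applied, so a one-degree-of-freedom chi-square goodness-of-fit test is used here as a simple stand-in; the function names and arguments are assumptions of this sketch.

```python
import math


def maf_and_hwe(n_aa, n_ab, n_bb):
    """Return (minor allele frequency, HWE chi-square p-value).

    n_aa, n_ab, n_bb are the counts of the three genotypes of a biallelic SNP.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped subjects")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    maf = min(p, q)
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)  # counts under HWE
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return maf, p_value


def passes_qc(n_aa, n_ab, n_bb, maf_min=0.05, hwe_min=0.01):
    """Apply the two thresholds quoted in the text to one SNP."""
    maf, p_hwe = maf_and_hwe(n_aa, n_ab, n_bb)
    return maf > maf_min and p_hwe > hwe_min


print(passes_qc(25, 50, 25))  # True: MAF 0.5, genotypes exactly in HWE
print(passes_qc(50, 0, 50))   # False: no heterozygotes, strong HWE departure
print(passes_qc(0, 4, 96))    # False: MAF 0.02 is below 5%
```

The chi-square approximation can be unreliable for rare alleles or small samples, which is one reason an exact test may be preferred in practice.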
For each selected gene, we fitted FMEM (1) with z coded as the number of minor alleles in order to detect its associated brain regions of interest (ROIs). For comparison, we applied the classical voxel-wise method and Ge’s method to the same data sets. To formally detect significant ROIs, following Ge et al. [2012], we used a cluster-forming threshold of 0.1% with a minimum cluster size of 50 voxels. The names of the brain regions are included in Tables 6-8 of the supplementary document. FMEM is able to detect 45, 45, and 27 significant clusters for CR1, CD2AP, and PICALM, respectively, whereas the standard voxel-wise method identifies only 6, 14, and 2 significant clusters, and Ge’s method identifies none. We also fitted FMEM separately to the HC samples only and to the AD samples only to investigate white noise signal. For HC samples only, FMEM detected 15, 8, and 31 significant clusters for CR1, CD2AP, and PICALM, respectively. For AD samples only, FMEM detected 9, 8, and 41 significant clusters for CR1, CD2AP, and PICALM, respectively. Although there are some discrepancies between the results based on the HC-only and AD-only samples and those based on the combined sample, the results are highly similar to each other. The results obtained from the combined sample are generally more significant due to the larger sample size.
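The additive coding of z mentioned above (the number of minor alleles carried by each subject) can be sketched in Python as follows. The input format (two-character genotype strings such as "AG", with None for a missing call) and the function name `additive_coding` are assumptions made for illustration; the minor allele is taken to be the less frequent allele in the sample at hand.

```python
from collections import Counter


def additive_coding(genotypes):
    """Code genotypes of one biallelic SNP as minor allele counts (0, 1, or 2).

    genotypes: list of two-character strings such as "AG"; None marks a
    missing call and is passed through unchanged.
    Returns (minor_allele, codes). For a monomorphic SNP the minor allele
    is None and all codes are 0. Ties in allele frequency are broken by
    whichever allele appears first in the data.
    """
    counts = Counter(a for g in genotypes if g is not None for a in g)
    if len(counts) > 2:
        raise ValueError("more than two alleles observed")
    if len(counts) < 2:
        return None, [None if g is None else 0 for g in genotypes]
    minor = min(counts, key=counts.get)  # least frequent allele in the sample
    codes = [None if g is None else g.count(minor) for g in genotypes]
    return minor, codes


minor, z = additive_coding(["AA", "AG", "GG", "AA", "AA", None])
print(minor, z)  # G [0, 1, 2, 0, 0, None]
```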
Finally, we overlapped these significant clusters with the 96 predefined ROIs in the Jacob template and were able to detect several predefined ROIs for CR1, CD2AP, and PICALM. For CD2AP, based on the combined sample, FMEM identified relatively large clusters with more than 150 voxels of right superior temporal gyrus, left and right inferior temporal gyrus, left and right precentral gyrus, left and right middle frontal gyrus, right fusiform, left angular, left inferior frontal gyrus, left inferior occipital gyrus, left and right postcentral gyrus, left and right superior frontal gyrus, left anterior cingulate and paracingulate gyri, left median cingulate and paracingulate gyri, right calcarine fissure and surrounding cortex, right cuneus, right superior occipital gyrus, right middle occipital gyrus, right caudate, and right middle temporal gyrus.
For CR1, based on the combined sample, FMEM identified relatively large clusters with more than 150 voxels of right superior temporal gyrus, left and right putamen, left inferior temporal gyrus, left angular, left inferior occipital gyrus, right postcentral gyrus, right superior frontal gyrus, left anterior cingulate and paracingulate gyri, left median cingulate and paracingulate gyri, left cuneus, left middle occipital gyrus, and right caudate.
For PICALM, based on the combined sample, FMEM identified relatively large clusters with more than 150 voxels of right inferior frontal gyrus (triangular and orbital parts), right insula, right fusiform, right superior temporal gyrus, right temporal pole, right middle temporal gyrus, right inferior temporal gyrus, right precentral gyrus, right supramarginal gyrus, right middle occipital gyrus, right angular, and right middle frontal gyrus.
As shown in the supplementary document, we were able to detect several major ROIs, such as superior temporal gyrus, inferior temporal gyrus, middle frontal gyrus, angular, anterior cingulate and paracingulate gyri, hippocampus, putamen, and fusiform. Our findings for these ROIs are highly similar to previous reports on brain morphology in the AD literature [Ohnishi et al., 2001, Convit et al., 2000, Jones et al., 2006, Fennema-Notestine et al., 2009]. The superior temporal gyrus is an essential structure involved in auditory processing, in social cognition processes, and in language function. The inferior temporal gyrus is one of the higher levels of the ventral stream of visual processing. The middle frontal gyrus plays a role in sustaining attention and working memory. The angular gyrus is involved in a number of processes related to language, number processing and spatial cognition, memory retrieval, attention, and theory of mind. The anterior cingulate and paracingulate gyri are involved in rational cognitive functions, such as reward anticipation, decision-making, empathy, impulse control, and emotion. The hippocampus is known to be associated with memory and cognition. The fusiform is associated with color recognition and with word and body recognition, and the putamen is associated with motor skills. Figure 4 shows the −log10(p) map of selected slices with significant clusters for testing the genetic effects of CD2AP on RAVENS images identified by FMEM.
4 Discussion
We have developed FMEM to carry out an association analysis between neuroimaging phenotypes and a group of genetic markers, while adjusting for the clinical variables of interest. We have proposed a multiscale adaptive procedure with three features: spatial, hierarchical, and adaptive. Our simulation results have shown substantial gains in parameter estimation precision and in statistical power for detecting the true effect ROIs compared with the voxel-wise method. More research is needed to optimize the choice of the tuning parameters in FMEM. We will borrow some key ideas of FMEM to develop a fast procedure to carry out GWAS for imaging genetic studies. We will also develop fast FMEM for the joint analysis of neuroimaging and genetic data with rare or common variants [Fan et al., 2013, 2012].
Acknowledgment
Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (National Institutes of Health Grant U01 AG024904). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; BioClinica, Inc.; Biogen Idec Inc.; Bristol-Myers Squibb Company; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; GE Healthcare; Innogenetics, N.V.; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Medpace, Inc.; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Synarc Inc.; and Takeda Pharmaceutical Company. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of California, Los Angeles. This research was also supported by NIH grants P30 AG010129 and K01 AG030514.
Footnotes
Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp_content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf
References
- Amos W, Driscoll E, Hoffman JI. Candidate genes versus genome-wide associations: which are better for detecting genetic susceptibility to infectious disease? Proc Biol Sci. 2011;278:1183–1188. doi: 10.1098/rspb.2010.1920.
- Ashburner J, Friston KJ. Voxel-based morphometry: the methods. Neuroimage. 2000;11:805–821. doi: 10.1006/nimg.2000.0582.
- Convit A, de Asis J, de Leon MJ, Tarshish CY, De Santi S, Rusinek H. Atrophy of the medial occipitotemporal, inferior, and middle temporal gyri in non-demented elderly predict decline to Alzheimer’s disease. Neurobiol Aging. 2000;21:19–26. doi: 10.1016/s0197-4580(99)00107-4.
- Davatzikos C, Genc A, Xu D, Resnick SM. Voxel-based morphometry using the RAVENS maps: Methods and validation using simulated longitudinal atrophy. NeuroImage. 2001;14:1361–1369. doi: 10.1006/nimg.2001.0937.
- Fan J, Gijbels I. Local Polynomial Modelling and Its Applications. Chapman and Hall; London: 1996.
- Fan Ruzong, Zhang Yiwei, Albert Paul S, Liu Aiyi, Wang Yuanjia, Xiong Momiao. Longitudinal association analysis of quantitative traits. Genetic Epidemiology. 2012;36(8):856–869. doi: 10.1002/gepi.21673.
- Fan Ruzong, Wang Yifan, Mills James L, Wilson Alexander F, Bailey-Wilson Joan E, Xiong Momiao. Functional linear models for association analysis of quantitative traits. Genetic Epidemiology. 2013;37(7):726–742. doi: 10.1002/gepi.21757.
- Fennema-Notestine C, Hagler DJ, Jr., McEvoy LK, Fleisher AS, Wu EH, Karow DS, Dale AM, ADNI. Structural MRI biomarkers for preclinical and mild Alzheimer’s disease. Hum Brain Mapp. 2009;30:3238–3253. doi: 10.1002/hbm.20744.
- Friston KJ. Modalities, modes, and models in functional neuroimaging. Science. 2009;326:399–403. doi: 10.1126/science.1174521.
- Ge T, Feng J, Hibar DP, Thompson PM, Nichols TE. Increasing power for voxel-wise genome-wide association studies: The random field theory, least square kernel machines and fast permutation procedures. Neuroimage. 2012;63:858–873. doi: 10.1016/j.neuroimage.2012.07.012.
- Gilmore JH, Schmitt JE, et al. Genetic and environmental contributions to neonatal brain structure: A twin study. Human Brain Mapping. 2010;31:1174–1182. doi: 10.1002/hbm.20926.
- Greven S, Crainiceanu C, Caffo B, Reich D. Longitudinal functional principal component analysis. Electron. J. Statist. 2010;4:1022–1054. doi: 10.1214/10-EJS575.
- Guo W. Functional mixed effects models. Biometrics. 2002;58:121–128. doi: 10.1111/j.0006-341x.2002.00121.x.
- Harold D, Abraham R, Hollingworth P, Sims R, et al. Genome-wide association study identifies variants at CLU and PICALM associated with Alzheimer’s disease. Nat Genet. 2009;41:1088–1093. doi: 10.1038/ng.440.
- Hibar D, Stein JL, et al. Voxelwise gene-wide association study (vGeneWAS): Multi-variate gene-based association testing in 731 elderly subjects. Neuroimage. 2011;56:1875–1891. doi: 10.1016/j.neuroimage.2011.03.077.
- Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS, Manolio TA. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proceedings of the National Academy of Sciences of the United States of America. 2009;106:9362–9367. doi: 10.1073/pnas.0903103106.
- Jones BF, Barnes J, Uylings HBM, Fox NC, Frost C, Witter MP, Scheltens P. Differential regional atrophy of the cingulate gyrus in Alzheimer disease: a volumetric MRI study. Cereb. Cortex. 2006;16:1701–1708. doi: 10.1093/cercor/bhj105.
- Jones DK, Symms MR, Cercignani M, Howard RJ. The effect of filter size on VBM analyses of DT-MRI data. NeuroImage. 2005;26:546–554. doi: 10.1016/j.neuroimage.2005.02.013.
- Kang HM, Sul JH, Service SK, Zaitlen NA, Kong S, Freimer NB, Sabatti C, Eskin E. Variance component model to account for sample structure in genome-wide association studies. Nature Genetics. 2010;42:348–354. doi: 10.1038/ng.548.
- Lambert J, Heath S, Even G, Campion D, Sleegers K, Hiltunen M, Combarros O, Zelenika D, Bullido MJ, Tavernier B, Letenneur L, Bettens K, Berr C, Pasquier F, et al. Genome-wide association study identifies variants at CLU and CR1 associated with Alzheimer’s disease. Nature Genetics. 2009;41:1094–1099. doi: 10.1038/ng.439.
- Li Y, Zhu H, Shen D, Lin W, Gilmore JH, Ibrahim JG. Multiscale adaptive regression models for neuroimaging data. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2011;73:559–578. doi: 10.1111/j.1467-9868.2010.00767.x.
- Li Y, Gilmore JH, Shen D, Styner M, Lin W, Zhu H. Multiscale adaptive generalized estimating equations for longitudinal neuroimaging data. NeuroImage. 2013;72:91–105. doi: 10.1016/j.neuroimage.2013.01.034.
- Liu D, Lin X, Ghosh D. Semiparametric regression of multidimensional genetic pathway data: Least-squares kernel machines and linear mixed models. Biometrics. 2007;63:1079–1088. doi: 10.1111/j.1541-0420.2007.00799.x.
- Loth E, Carvalho F, Schumann G. The contribution of imaging genetics to the development of predictive markers for addictions. Trends in Cognitive Sciences. 2011;15:436–446. doi: 10.1016/j.tics.2011.07.008.
- Morris JS, Carroll RJ. Wavelet-based functional mixed models. J. R. Stat. Soc. Ser. B Stat. Methodol. 2006;68:179–199. doi: 10.1111/j.1467-9868.2006.00539.x.
- Naj AC, Jun G, Beecham GW, Wang L, Vardarajan BN, Buros J, Gallins PJ, Buxbaum JD, Jarvik GP, Crane PK, Larson EB, Bird TD, et al. Common variants at MS4A4/MS4A6E, CD2AP, CD33 and EPHA1 are associated with late-onset Alzheimer’s disease. Nature Genetics. 2011;43:436–441. doi: 10.1038/ng.801.
- Ohnishi T, Matsuda H, Tabira T, Asada T, Uno M. Changes in brain morphology in Alzheimer disease and normal aging: is Alzheimer disease an exaggerated aging process? AJNR Am J. Neuroradiol. 2001;22:1680–1685.
- Polzehl J, Spokoiny VG. Adaptive weights smoothing with applications to image restoration. J. R. Statist. Soc. B. 2000;62:335–354.
- Polzehl J, Voss HU, Tabelow K. Structural adaptive segmentation for statistical parametric mapping. NeuroImage. 2010;52:515–523. doi: 10.1016/j.neuroimage.2010.04.241.
- Purcell S, Neale B, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics. 2007;81:559–575. doi: 10.1086/519795.
- Savitz JB, Drevets WC. Imaging phenotypes of major depressive disorder: Genetic correlates. Neuroscience. 2009;164:300–330. doi: 10.1016/j.neuroscience.2009.03.082.
- Shen L, Kim S, Risacher SL, Nho K, Swaminathan S, West JD, Foroud TM, Pankratz ND, Moore JH, Sloan SD, Huentelman MJ, Craig DW, DeChairo BM, Potkin SG, Jack CR, Weiner MW, Saykin AJ, ADNI. Whole genome association study of brain-wide imaging phenotypes for identifying quantitative trait loci in MCI and AD: A study of the ADNI cohort. Neuroimage. 2010;53:1051–1063. doi: 10.1016/j.neuroimage.2010.01.042.
- Silver M, Montana G, Nichols TE, ADNI. False positives in neuroimaging genetics using voxel-based morphometry data. NeuroImage. 2011;54:992–1000. doi: 10.1016/j.neuroimage.2010.08.049.
- Stram DO, Lee JW. Variance components testing in the longitudinal mixed effects model. Biometrics. 1994;50:1171–1177.
- Ball T, Breckel TPK, Mutschler I, Aertsen A, Schulze-Bonhage A, Hennig J, Speck O. Variability of fMRI-response patterns at different spatial observation scales. Human Brain Mapping. 2012;33:1155–1171. doi: 10.1002/hbm.21274.
- Thompson PM, Ge T, Glahn DC, Jahanshad N, Nichols TE. Genetics of the connectome. NeuroImage. 2013;80:475–488. doi: 10.1016/j.neuroimage.2013.05.013.
- Tzeng JY, Zhang D. Haplotype-based association analysis via variance component score test. The American Journal of Human Genetics. 2007;81:927–938. doi: 10.1086/521558.
- Vounou M, Nichols TE, Montana G, Alzheimer’s Disease Neuroimaging Initiative. Discovering genetic associations with high-dimensional neuroimaging phenotypes: A sparse reduced-rank regression approach. Neuroimage. 2010;53:1147–1159. doi: 10.1016/j.neuroimage.2010.07.002.
- Wang Y, Chen H. On testing an unspecified function through a linear mixed effects model with multiple variance components. Biometrics. 2012;68:1113–1125. doi: 10.1111/j.1541-0420.2012.01790.x.
- Wang Y, Nie J, Yap PT, Shi F, Guo L, Shen D. Robust deformable-surface-based skull-stripping for large-scale studies. In: Fichtinger G, Martel A, Peters T, editors. Medical Image Computing and Computer-Assisted Intervention. Vol. 6893. Springer; Toronto, Canada: Berlin / Heidelberg: 2011. pp. 635–642.
- Worsley KJ, Taylor JE, Tomaiuolo F, Lerch J. Unified univariate and multivariate random field theory. NeuroImage. 2004;23:189–195. doi: 10.1016/j.neuroimage.2004.07.026.
- Yuan Y, Gilmore JH, Geng X, Styner M, Chen K, Wang JL, Zhu H. FMEM: Functional mixed effects modeling for the analysis of longitudinal white matter tract data. NeuroImage. 2014;84:753–764. doi: 10.1016/j.neuroimage.2013.09.020.
- Yue Y, Loh JM, Lindquist MA. Adaptive spatial smoothing of fMRI images. Statistics and its Interface. 2010;3:3–14.
- Zhang Yiwei, Xu Zhiyuan, Shen Xiaotong, Pan Wei. Testing for association with multiple traits in generalized estimation equations, with application to neuroimaging data. NeuroImage. 2014. doi: 10.1016/j.neuroimage.2014.03.061.
- Zhu H, Brown PJ, Morris JS. Robust, adaptive functional regression in functional mixed model framework. Journal of the American Statistical Association. 2011;106:1167–1179. doi: 10.1198/jasa.2011.tm10370.
- Zhu HT, Li RZ, Kong LL. Multivariate varying coefficient model for functional responses. Annals of Statistics. 2012;40:2634–2666. doi: 10.1214/12-AOS1045SUPP.
- Zhu HT, Fan JQ, Kong LL. Spatially varying coefficient model for neuroimaging data with jump discontinuities. Journal of the American Statistical Association. 2014;109. doi: 10.1080/01621459.2014.881742. In press.
- Zhu M, Zhao S. Candidate gene identification approach: progress and challenges. Int J Biol Sci. 2007;3:420–427. doi: 10.7150/ijbs.3.420.