GN-SCCA: GraphNet based Sparse Canonical Correlation Analysis for Brain Imaging Genetics

Lei Du; Jingwen Yan; Sungeun Kim; Shannon L Risacher; Heng Huang; Mark Inlow; Jason H Moore; Andrew J Saykin; Li Shen

doi:10.1007/978-3-319-23344-4_27

Author manuscript; available in PMC: 2016 Jan 1.

Published in final edited form as: Brain Inform Health (2015). 2015;9250:275–284. doi: 10.1007/978-3-319-23344-4_27

Lei Du ¹, Jingwen Yan ¹, Sungeun Kim ¹, Shannon L Risacher ¹, Heng Huang ², Mark Inlow ³, Jason H Moore ⁴, Andrew J Saykin ¹, Li Shen ¹,²,*, for the Alzheimer’s Disease Neuroimaging Initiative

PMCID: PMC4663463 NIHMSID: NIHMS703745 PMID: 26636135

Abstract

Identifying associations between genetic variants and neuroimaging quantitative traits (QTs) is a popular research topic in brain imaging genetics. Sparse canonical correlation analysis (SCCA) has been widely used to reveal complex multi-SNP-multi-QT associations. Several SCCA methods explicitly incorporate prior knowledge into the model and intend to uncover the hidden structure informed by the prior knowledge. We propose a novel structured SCCA method using the Graph-constrained Elastic-Net (GraphNet) regularizer to not only discover important associations, but also induce smoothness between coefficients that are adjacent in the graph. In addition, the proposed method incorporates the covariance structure information usually ignored by most SCCA methods. Experiments on simulated and real imaging genetic data show that the proposed method not only outperforms a widely used SCCA method but also yields easy-to-interpret biological findings.

1 Introduction

Brain imaging genetics, which intends to discover the associations between genetic factors (e.g., the single nucleotide polymorphisms, SNPs) and quantitative traits (QTs, e.g., those extracted from neuroimaging data), is an emerging research topic. While single-SNP-single-QT association analyses have been widely performed [17], several studies have used regression techniques [9] to examine the joint effect of multiple SNPs on one or a few QTs. Recently, bi-multivariate analyses [6, 12, 7, 18], which aim to identify complex multi-SNP-multi-QT associations, have also received much attention.

Sparse canonical correlation analysis (SCCA) [14, 19], a type of bi-multivariate analysis, has been successfully used for analyzing imaging genetics data [12, 6] and other biology data [4, 5, 14, 19]. To simplify the problem, most existing SCCA methods assume the covariance matrix of the data to be the identity matrix. The Lasso [14, 19] or group Lasso [6, 12] regularized problem is then often solved using the soft-thresholding method. Although this assumption usually leads to a reasonable result, it is worth pointing out that the relationships between the variables within either modality are ignored. For neuroimaging genetic data, correlations usually exist among regions of interest (ROIs) in the brain and among linkage disequilibrium (LD) blocks in the genome. Therefore, simply treating the data covariance matrices as identity or diagonal ones will limit the performance of identifying meaningful structured imaging genetic associations.

Witten et al. [19, 20] proposed an SCCA method which employs penalized matrix decomposition (PMD) to yield two sparse canonical loadings. Lin et al. [12] extended Witten’s SCCA model to incorporate non-overlapping group knowledge by imposing an l₂,₁-norm regularizer on both canonical loadings. Chen et al. [3] proposed the ssCCA approach, which imposes a smoothness penalty on one canonical loading of the taxa based on their relationship on the phylogenetic tree. Chen et al. [4, 5] treated the feature space as an undirected graph where each node corresponds to a variable and r_ij is the edge weight between nodes i and j. They proposed a network-based SCCA which penalizes the l₁ norm of $r_{i j}^{2} (u_{i} - sign (r_{i j}) u_{j})$ to encourage the weight values u_i and u_j to be similar if r_ij > 0, or dissimilar if r_ij < 0. A common limitation of these SCCA models is that they approximate X^TX by an identity or diagonal matrix. Du et al. [7] proposed an S2CCA algorithm that overcomes this limitation, but requires users to explicitly specify non-overlapping group structures. Yan et al. [21] proposed KG-SCCA, which uses the l₂ norm of $r_{i j}^{2} (u_{i} - sign (r_{i j}) u_{j})$ in place of the l₁ norm in Chen’s model [4, 5]. KG-SCCA also requires the structure information to be explicitly defined. Note that an inaccurate sign of r_ij may introduce bias [10].

In this paper, we incorporate the Graph-constrained Elastic Net (GraphNet) [8] into the SCCA model and propose a new GraphNet constrained SCCA (GN-SCCA). Our contributions are twofold: (1) GN-SCCA estimates the covariance matrix directly instead of approximating it by the identity matrix I; (2) GN-SCCA employs a graph penalty, built with a data-driven technique, to induce smoothness by penalizing the pairwise differences between adjacent features. Thorough experiments on both simulated and real imaging genetic data show that our method outperforms a widely used SCCA implementation [19]⁵ by identifying stronger imaging genetic associations and more accurate canonical loading patterns.

2 Preliminaries

2.1 Sparse CCA

We use boldface lowercase letters to denote vectors and boldface uppercase letters to denote matrices. The i-th row and j-th column of M = (m_ij) are represented as mⁱ and m_j, respectively. Let X = {x¹; …; xⁿ} ⊆ ℝ^p be the SNP data and Y = {y¹; …; yⁿ} ⊆ ℝ^q be the QT data, where n, p and q are the numbers of subjects, SNPs and QTs, respectively.

The SCCA model presented in [19, 20] is as follows:

\max_{u, v} u^{T} X^{T} Y v \quad \text{s.t.} \quad {‖ u ‖}_{2}^{2} \leq 1, {‖ v ‖}_{2}^{2} \leq 1, {‖ u ‖}_{1} \leq c_{1}, {‖ v ‖}_{1} \leq c_{2},

(1)

where the two constraints ${‖ u ‖}_{2}^{2} \leq 1$ and ${‖ v ‖}_{2}^{2} \leq 1$ relax the equalities ${‖ u ‖}_{2}^{2} = 1$ and ${‖ v ‖}_{2}^{2} = 1$ , which in turn approximate ${‖ X u ‖}_{2}^{2} = 1$ and ${‖ Y v ‖}_{2}^{2} = 1$ to simplify computation. This simplification approximates the covariance matrices X^TX and Y^TY by the identity matrix I (or sometimes a diagonal matrix), assuming that the features are independent. Most SCCA methods employ this simplification [3–5, 12, 19, 20]. In addition, the constraints ∥u∥₁ ≤ c₁ and ∥v∥₁ ≤ c₂ control the sparsity of the canonical loadings. Besides the Lasso (l₁-norm), the fused Lasso can also be used [5, 14, 19, 20].

2.2 Graph Laplacian

The Graph Laplacian, also called the Laplacian matrix, has been widely used in the spectral clustering techniques and spectral graph theory [2], owing to its advantage in clustering those correlated features automatically. We denote a weighted undirected graph as G = (V; E; W), where V is the set of vertices corresponding to features of X or Y, E is the set of edges with e_i,j indicating that two features v_i and v_j are connected, and w_i,j is the weight of edge e_i,j. Here we consider G as a complete graph and thus every two vertices are connected.

Formally, the adjacency matrix of G is defined as:

A (i, j) = \begin{cases} w_{i, j}, & \text{if } i \neq j, \\ 0, & \text{otherwise} . \end{cases}

(2)

Generally, w_i,j is set to |r_i,j|^d, where r_i,j is the sample correlation between the i-th and j-th variables. In this work, for simplicity, we set d = 2, i.e. $w_{i, j} = r_{i, j}^{2}$ . It can also be decided by domain experts in other applications.

Let D be a diagonal degree matrix with the following diagonal entries: D(i, i) = Σ_j A(i, j). Then the Laplacian matrix L is defined as L = D – A [8]. L has many merits such as the symmetry and the positive semi-definite structure. Most importantly, it can map a weighted graph onto a new space such that connected vertices stay as close as possible.
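The construction in Eq. (2) and L = D − A can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `correlation_laplacian` and the toy data are hypothetical, and features are assumed to be stored as columns.

```python
import numpy as np

def correlation_laplacian(data, d=2):
    """Build L = D - A for the complete graph with edge weights |r_ij|**d."""
    # data has shape (n_samples, n_features); each column is one graph vertex.
    r = np.corrcoef(data, rowvar=False)      # sample correlation matrix
    adjacency = np.abs(r) ** d               # w_ij = |r_ij|^d (d = 2 in the paper)
    np.fill_diagonal(adjacency, 0.0)         # A(i, i) = 0, as in Eq. (2)
    degree = np.diag(adjacency.sum(axis=1))  # D(i, i) = sum_j A(i, j)
    return degree - adjacency

# Hypothetical toy data, used only to exercise the function.
rng = np.random.default_rng(0)
features = rng.standard_normal((100, 5))
laplacian = correlation_laplacian(features)
```

The result is symmetric with rows summing to zero, and its eigenvalues are non-negative up to rounding, which matches the symmetry and positive semi-definiteness noted above.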

3 GraphNet based SCCA (GN-SCCA)

Inspired by the Graph Laplacian [11] and the GraphNet [8] technique, we define the penalty P(u) and P(v) as follows:

\begin{array}{l} P (u) = {‖ u ‖}_{GN} = u^{T} L_{1} u \leq c_{1}, \\ P (v) = {‖ v ‖}_{GN} = v^{T} L_{2} v \leq c_{2} . \end{array}

(3)

where L₁ and L₂ are the Laplacian matrices of two complete undirected graphs defined by the sample correlation matrices of the SNP and QT training data, respectively. The terms u^TL₁u and v^TL₂v penalize the difference between the loadings of each pair of features smoothly, with a weight determined by the correlation between the two features.
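To see why these quadratic terms induce smoothness, it may help to expand them. For any Laplacian L = D − A with symmetric weights w_i,j as defined in Section 2.2, the following standard identity holds (it uses only the symbols of Eq. (2) and assumes nothing beyond symmetry of the weights):

```latex
u^{T} L u = \sum_{i} D (i, i) u_{i}^{2} - \sum_{i \neq j} w_{i, j} u_{i} u_{j} = \frac{1}{2} \sum_{i \neq j} w_{i, j} {(u_{i} - u_{j})}^{2} \geq 0 .
```

A pair of strongly correlated features (large w_i,j) with very different loadings therefore contributes a large penalty, while weakly correlated pairs are left almost unconstrained.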

Applying the two penalties above, the GN-SCCA model takes the form:

\min_{u, v} - u^{T} X^{T} Y v \quad \text{s.t.} \quad {‖ X u ‖}_{2}^{2} \leq 1, {‖ Y v ‖}_{2}^{2} \leq 1, P (u) \leq c_{1}, P (v) \leq c_{2}, {‖ u ‖}_{1} \leq c_{3}, {‖ v ‖}_{1} \leq c_{4},

(4)

where the terms ∥u∥₁ ≤ c₃ and ∥v∥₁ ≤ c₄ are used to induce sparsity, and P(u) and P(v) are the Graph Laplacian based GraphNet constraints [8]. Note that we use ${‖ X u ‖}_{2}^{2} \leq 1$ instead of ${‖ u ‖}_{2}^{2} \leq 1$ , which is typically used in other models (and likewise ${‖ Y v ‖}_{2}^{2} \leq 1$ for v), and thus our model takes into consideration the full covariance information.

Using Lagrange multipliers and writing the penalties in matrix form, the objective function of GN-SCCA is as follows:

ℒ (u, v) = - u^{T} X^{T} Y v + \frac{λ_{1}}{2} {‖ u ‖}_{G N} + \frac{λ_{2}}{2} {‖ v ‖}_{G N} + \frac{β_{1}}{2} {‖ u ‖}_{1} + \frac{β_{2}}{2} {‖ v ‖}_{1} + \frac{γ_{1}}{2} {‖ X u ‖}_{2}^{2} + \frac{γ_{2}}{2} {‖ Y v ‖}_{2}^{2}

(5)

where (λ₁, λ₂, β₁, β₂) are tuning parameters corresponding to (c₁, c₂, c₃, c₄), and γ₁ and γ₂ are the multipliers of the two covariance constraints. Taking the derivative with respect to u and v separately and setting each to zero gives:

(λ_{1} L_{1} + β_{1} D_{1} + γ_{1} X^{T} X) u = X^{T} Y v,

(6)

(λ_{2} L_{2} + β_{2} D_{2} + γ_{2} Y^{T} Y) v = Y^{T} X u,

(7)

where D₁ is a diagonal matrix with the k₁-th element as $\frac{1}{2 {‖ u^{k_{1}} ‖}_{1}} (k_{1} \in [1, p])$ , and D₂ is a diagonal matrix with the k₂-th element as $\frac{1}{2 {‖ v^{k_{2}} ‖}_{1}} (k_{2} \in [1, q])$ ⁶.
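The diagonal matrices D₁ and D₂ come from differentiating the l₁ terms of Eq. (5). A brief sketch for one nonzero entry u^{k₁}, treated here as a scalar (the zero case is the one handled by footnote 6):

```latex
\frac{\partial}{\partial u^{k_{1}}} \frac{β_{1}}{2} | u^{k_{1}} | = \frac{β_{1}}{2} sign (u^{k_{1}}) = β_{1} \frac{1}{2 | u^{k_{1}} |} u^{k_{1}} = β_{1} D_{1} (k_{1}, k_{1}) u^{k_{1}} ,
\qquad
\frac{\partial ℒ}{\partial u} = - X^{T} Y v + λ_{1} L_{1} u + β_{1} D_{1} u + γ_{1} X^{T} X u = 0 .
```

Rearranging the second expression gives Eq. (6); Eq. (7) follows in the same way for v.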

Since D₁ relies on u and D₂ relies on v, we introduce an iterative procedure to solve this objective. In each iteration, we first fix v and solve for u, and then fix u and solve for v. The procedure repeats until a predefined stopping criterion is satisfied. Algorithm 1 shows the pseudocode of the GN-SCCA algorithm.

3.1 Convergence Analysis of GN-SCCA

We first introduce Lemma 1 described in [13].


Algorithm 1 GraphNet based Structure-aware SCCA (GN-SCCA)

Require:
	X = {x₁, …, x_n}^T, Y = {y₁, …, y_n}^T
Ensure:
	Canonical vectors u and v.
1:	Initialize u ∈ ℝ^{p×1}, v ∈ ℝ^{q×1}; compute L₁ = D_u − A_u and L₂ = D_v − A_v from the training data only;
2:	while not converged do
3:	while not converged regarding u do
4:	Calculate the diagonal matrix D₁, where the k₁-th element is $\frac{1}{2 {‖ u^{k_{1}} ‖}_{1}}$ ;
5:	Update u = (λ₁L₁ + β₁D₁ + γ₁X^TX)⁻¹ X^TYv;
6:	end while
7:	while not converged regarding v do
8:	Calculate the diagonal matrix D₂, where the k₂-th element is $\frac{1}{2 {‖ v^{k_{2}} ‖}_{1}}$ ;
9:	Update v = (λ₂L₂ + β₂D₂ + γ₂Y^TY)⁻¹ Y^TXu;
10:	end while
11:	end while
12:	Scale u so that ∥Xu∥₂ = 1;
13:	Scale v so that ∥Yv∥₂ = 1.
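Algorithm 1 can be sketched in NumPy as follows. This is an illustrative sketch under several assumptions, not the authors' code: the function names, default parameter values and toy data are hypothetical; the smoothing constant ζ of footnote 6 is applied to every diagonal entry rather than only to exact zeros; the data are assumed to be column-centered; and u and v are rescaled after every pass (not only in Steps 12–13) to keep the alternating updates numerically stable.

```python
import numpy as np

def correlation_laplacian(data):
    """L = D - A with weights w_ij = r_ij**2 (Section 2.2, d = 2)."""
    adjacency = np.corrcoef(data, rowvar=False) ** 2
    np.fill_diagonal(adjacency, 0.0)
    return np.diag(adjacency.sum(axis=1)) - adjacency

def update_loading(w, rhs, lap, gram, lam, beta, gamma, tol, max_iter, zeta):
    """Inner loop (Steps 3-6 or 7-10): repeat the reweighted linear solve."""
    for _ in range(max_iter):
        # Diagonal entries 1 / (2 |w_k|); zeta keeps them finite (assumption:
        # smoothing is applied to all entries, not only to exact zeros).
        d = np.diag(0.5 / np.sqrt(w ** 2 + zeta))
        w_new = np.linalg.solve(lam * lap + beta * d + gamma * gram, rhs)
        converged = np.max(np.abs(w_new - w)) <= tol
        w = w_new
        if converged:
            break
    return w

def gn_scca(x, y, lam=(1.0, 1.0), beta=(1.0, 1.0), gamma=(1.0, 1.0),
            tol=1e-5, max_iter=100, zeta=1e-8):
    """Alternate the u and v updates of Eqs. (6)-(7) on column-centered data."""
    lap_x, lap_y = correlation_laplacian(x), correlation_laplacian(y)
    xtx, yty, xty = x.T @ x, y.T @ y, x.T @ y
    u = np.ones(x.shape[1])
    v = np.ones(y.shape[1])
    u = u / np.linalg.norm(x @ u)
    v = v / np.linalg.norm(y @ v)
    for _ in range(max_iter):
        u_old, v_old = u, v
        u = update_loading(u, xty @ v, lap_x, xtx,
                           lam[0], beta[0], gamma[0], tol, max_iter, zeta)
        u = u / np.linalg.norm(x @ u)   # assumption: rescale on every pass
        v = update_loading(v, xty.T @ u, lap_y, yty,
                           lam[1], beta[1], gamma[1], tol, max_iter, zeta)
        v = v / np.linalg.norm(y @ v)
        # Stopping rule: largest absolute change in u and in v is at most tol.
        if max(np.max(np.abs(u - u_old)), np.max(np.abs(v - v_old))) <= tol:
            break
    return u, v

# Hypothetical toy data: the first three features of X and Y share a latent z.
rng = np.random.default_rng(0)
n, p, q = 200, 10, 10
z = rng.standard_normal(n)
u_true = np.r_[np.ones(3), np.zeros(p - 3)]
v_true = np.r_[np.ones(3), np.zeros(q - 3)]
x = np.outer(z, u_true) + rng.standard_normal((n, p))
y = np.outer(z, v_true) + rng.standard_normal((n, q))
x -= x.mean(axis=0)
y -= y.mean(axis=0)
u, v = gn_scca(x, y, beta=(10.0, 10.0))
corr = np.corrcoef(x @ u, y @ v)[0, 1]
```

Each inner solve is a dense p × p (or q × q) linear system, so this sketch is only practical for moderate numbers of features; the parameter values used in the usage lines are arbitrary and would be tuned by cross-validation in practice.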


Lemma 1

The following inequality holds for any two nonzero vectors ũ, u with the same length,

{‖ \tilde{u} ‖}_{2} - \frac{{‖ \tilde{u} ‖}_{2}^{2}}{2 {‖ u ‖}_{2}} \leq {‖ u ‖}_{2} - \frac{{‖ u ‖}_{2}^{2}}{2 {‖ u ‖}_{2}} .

(8)
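Lemma 1 is quoted from [13] without proof. A short justification, writing a = ∥ũ∥₂ ≥ 0 and b = ∥u∥₂ > 0 (shorthand introduced only for this sketch), is:

```latex
{(a - b)}^{2} \geq 0 \;\Rightarrow\; 2 a b - a^{2} \leq b^{2} \;\Rightarrow\; a - \frac{a^{2}}{2 b} \leq \frac{b}{2} = b - \frac{b^{2}}{2 b} .
```

The second step divides both sides by 2b, which is positive because u is nonzero.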

Lemma 2

For any real number ũ and any nonzero real number u, we have

{‖ \tilde{u} ‖}_{1} - \frac{{‖ \tilde{u} ‖}_{1}^{2}}{2 {‖ u ‖}_{1}} \leq {‖ u ‖}_{1} - \frac{{‖ u ‖}_{1}^{2}}{2 {‖ u ‖}_{1}} .

(9)

Proof

The result follows directly from Lemma 1, since for real numbers ∥ũ∥₁ = ∥ũ∥₂ = |ũ| and ∥u∥₁ = ∥u∥₂ = |u|.

Theorem 1

In each iteration, Algorithm 1 decreases the value of the objective function till the algorithm converges.

Proof

The proof consists of two phases. (1) Phase 1: For Steps 3–6, u is the only variable to estimate. The objective function Eq. (5) is equivalent to

ℒ (u, v) = - u^{T} X^{T} Y v + \frac{λ_{1}}{2} {‖ u ‖}_{G N} + \frac{β_{1}}{2} {‖ u ‖}_{1} + \frac{γ_{1}}{2} {‖ X u ‖}_{2}^{2}

From Step 5, we denote the updated value as ũ. Then we have

\begin{array}{l} - {\tilde{u}}^{T} X^{T} Y v + λ_{1} {\tilde{u}}^{T} L_{1} \tilde{u} + β_{1} {\tilde{u}}^{T} D_{1} \tilde{u} + γ_{1} {\tilde{u}}^{T} X^{T} X \tilde{u} \\ \leq - u^{T} X^{T} Y v + λ_{1} u^{T} L_{1} u + β_{1} u^{T} D_{1} u + γ_{1} u^{T} X^{T} X u \end{array}

According to the definition of D₁, we obtain

\begin{array}{l} - {\tilde{u}}^{T} X^{T} Y v + λ_{1} {\tilde{u}}^{T} L_{1} \tilde{u} + β_{1} \sum_{k_{1}} \frac{{‖ {\tilde{u}}^{k_{1}} ‖}_{1}^{2}}{2 {‖ u^{k_{1}} ‖}_{1}} + γ_{1} {\tilde{u}}^{T} X^{T} X \tilde{u} \\ \leq - u^{T} X^{T} Y v + λ_{1} u^{T} L_{1} u + β_{1} \sum_{k_{1}} \frac{{‖ u^{k_{1}} ‖}_{1}^{2}}{2 {‖ u^{k_{1}} ‖}_{1}} + γ_{1} u^{T} X^{T} X u \end{array}

(10)

Then, applying Eq. (9) to each component u^{k₁}, multiplying by β₁, summing over k₁, and adding the result to Eq. (10), we obtain

- {\tilde{u}}^{T} X^{T} Y v + λ_{1} {\tilde{u}}^{T} L_{1} \tilde{u} + β_{1} {‖ \tilde{u} ‖}_{1} + γ_{1} {‖ X \tilde{u} ‖}_{2}^{2} \leq - u^{T} X^{T} Y v + λ_{1} u^{T} L_{1} u + β_{1} {‖ u ‖}_{1} + γ_{1} {‖ X u ‖}_{2}^{2}

Letting $λ_{1}^{*} = 2 λ_{1}$ , $γ_{1}^{*} = 2 γ_{1}$ , $β_{1}^{*} = 2 β_{1}$ , we arrive at

- {\tilde{u}}^{T} X^{T} Y v + \frac{λ_{1}^{*}}{2} {‖ \tilde{u} ‖}_{G N} + \frac{β_{1}^{*}}{2} {‖ \tilde{u} ‖}_{1} + \frac{γ_{1}^{*}}{2} {‖ X \tilde{u} ‖}_{2}^{2} \leq - u^{T} X^{T} Y v + \frac{λ_{1}^{*}}{2} {‖ u ‖}_{G N} + \frac{β_{1}^{*}}{2} {‖ u ‖}_{1} + \frac{γ_{1}^{*}}{2} {‖ X u ‖}_{2}^{2} .

(11)

Thus, the objective value decreases during Phase 1: ℒ(ũ, v) ≤ ℒ(u, v).

(2) Phase 2: For Steps 7–10, v is the variable to estimate. Similarly, we have

- {\tilde{u}}^{T} X^{T} Y \tilde{v} + \frac{λ_{2}^{*}}{2} {‖ \tilde{v} ‖}_{G N} + \frac{β_{2}^{*}}{2} {‖ \tilde{v} ‖}_{1} + \frac{γ_{2}^{*}}{2} {‖ Y \tilde{v} ‖}_{2}^{2} \leq - {\tilde{u}}^{T} X^{T} Y v + \frac{λ_{2}^{*}}{2} {‖ v ‖}_{G N} + \frac{β_{2}^{*}}{2} {‖ v ‖}_{1} + \frac{γ_{2}^{*}}{2} {‖ Y v ‖}_{2}^{2}

(12)

Thus, the objective value decreases during Phase 2: $ℒ (\tilde{u}, \tilde{v}) \leq ℒ (\tilde{u}, v)$ .

Applying the transitive property of inequalities, we obtain $ℒ (\tilde{u}, \tilde{v}) \leq ℒ (u, v)$ . Therefore, Algorithm 1 decreases the objective function in each iteration.

We set the stopping criterion of Algorithm 1 as max{|δ| | δ ∈ (u_t₊₁−u_t)} ≤ τ and max{|δ| | δ ∈ (v_t₊₁−v_t)} ≤ τ, where τ is the desired error tolerance. In this paper, τ = 10⁻⁵ was chosen empirically for the experiments.

4 Experimental Results

4.1 Results on Simulation Data

We used four simulated data sets to compare the performance of GN-SCCA with that of a widely used SCCA implementation [19]. We applied two different methods to generate these data with distinct structures to ensure diversity. The first two data sets (both with n = 1000 and p = q = 50, but with different built-in correlations) were generated as follows: 1) We created a random positive definite group-structured covariance matrix M. 2) Data set Y with covariance structure M was calculated by Cholesky decomposition. 3) Data set X was created similarly. 4) Canonical loadings u and v were created so that the variables in one group share the same weights based on the group structures of X and Y respectively. 5) The portion of the specified group in Y was replaced based on u, v, X and the assigned correlation. The last two data sets (with different n, p, q and built-in correlations) were created using the simulation procedure described in [5]: 1) Predefined structure information was used to create u and v. 2) A latent vector z was generated from N(0, I_n×n). 3) X was created with each x_i ~ N(z_iu, I_p×p) and Y with each y_i ~ N(z_iv, Σ_y), where ${(\Sigma_{y})}_{j k} = \exp (- | v_{j} - v_{k} |)$ .
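The second generation procedure (the one following [5]) can be sketched as below. This is a hedged illustration only: the function name `simulate_latent_data`, the seed, and the example loadings are hypothetical, and the actual n, p, q and loading patterns used for Data3 and Data4 are not given in the text.

```python
import numpy as np

def simulate_latent_data(u, v, n, seed=0):
    """Draw X and Y that share one latent vector z, following steps 1)-3)."""
    rng = np.random.default_rng(seed)
    p, q = len(u), len(v)
    z = rng.standard_normal(n)                          # z ~ N(0, I_n)
    x = np.outer(z, u) + rng.standard_normal((n, p))    # x_i ~ N(z_i u, I_p)
    # (Sigma_y)_jk = exp(-|v_j - v_k|)
    sigma_y = np.exp(-np.abs(v[:, None] - v[None, :]))
    noise_y = rng.multivariate_normal(np.zeros(q), sigma_y, size=n)
    y = np.outer(z, v) + noise_y                        # y_i ~ N(z_i v, Sigma_y)
    return x, y, sigma_y

# Hypothetical loadings with a simple block structure.
u_true = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
v_true = np.array([0.0, 0.0, 1.0, 1.0, 1.0])
x_sim, y_sim, sigma_y = simulate_latent_data(u_true, v_true, n=50)
```

Features of Y with equal loadings get perfectly correlated noise under this Σ_y, which is what gives the simulated Y its built-in group structure.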

According to Eqs. (6–7), six parameters need to be decided for GN-SCCA. We chose the tuning ranges based on two considerations: 1) Chen and Liu [4] showed that the results were insensitive to γ₁ and γ₂ in a similar study; 2) the major difference between traditional CCA and SCCA lies in the penalty terms, so the two give the same results when the penalty parameters are small. Accordingly, we tuned γ₁ and γ₂ over the small set {1, 10, 100}, and tuned the remaining parameters from 10⁻¹ to 10³ through nested 5-fold cross-validation.

The true signals and estimated u and v are shown in Fig. 1. The estimated canonical loadings u and v of GN-SCCA were consistent with the ground truth on all simulated data sets, while SCCA only found an incomplete portion of the true signals. Shown in Table 1 are the cross-validation performances of the two methods. The left part of the table shows that GN-SCCA estimated correlation coefficients closer to the true values than SCCA on all four data sets, with significant differences on all data sets except Data4 (p = 6.82E-02). The right part of Table 1 presents the area under ROC (AUC), where GN-SCCA significantly outperformed SCCA on all data sets. These results demonstrate that GN-SCCA identifies the correlations and signal locations more accurately and more stably than SCCA.

Table 1.

5-fold nested cross-validation results on synthetic data: Mean±std. is shown for estimated correlation coefficients and AUC regarding the estimated canonical loadings. p-values of paired t-test between GN-SCCA and SCCA are also shown.

Data set (true CC)	CC: SCCA	CC: GN-SCCA	CC: p	AUC (u): SCCA	AUC (u): GN-SCCA	AUC (u): p	AUC (v): SCCA	AUC (v): GN-SCCA	AUC (v): p
Data1(0.80)	0.48±0.03	0.80±0.01	1.52E-05	0.65±0.02	1.00±0.00	1.44E-06	0.81±0.04	1.00±0.00	2.65E-04
Data2(0.90)	0.56±0.04	0.90±0.01	8.38E-06	0.66±0.01	1.00±0.00	3.15E-07	0.79±0.04	1.00±0.00	1.79E-04
Data3(0.92)	0.55±0.15	0.89±0.06	1.54E-03	0.67±0.01	0.89±0.04	2.49E-04	0.81±0.04	1.00±0.00	2.23E-04
Data4(0.98)	0.97±0.01	0.98±0.01	6.82E-02	0.89±0.05	0.98±0.03	3.45E-03	0.69±0.01	1.00±0.00	1.88E-07


4.2 Results on Real Neuroimaging Genetics Data

We used the real neuroimaging and SNP data downloaded from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database to assess the performances of GN-SCCA and SCCA. One goal of ADNI is to test whether serial MRI, positron emission tomography, other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early AD. Please see www.adni-info.org for more details.

Both the SNP and MRI data were downloaded from the LONI website (adni.loni.usc.edu). There are 204 healthy control (HC), 363 MCI and 176 AD participants. The structural MRI scans were processed with voxel-based morphometry (VBM) in SPM8 [1, 15]. Briefly, scans were aligned to a T1-weighted template image, segmented into gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF) maps, normalized to MNI space, and smoothed with an 8 mm FWHM kernel. We subsampled the whole brain, yielding 465 voxels spanning all brain ROIs. These VBM measures were pre-adjusted to remove the effects of baseline age, gender, education, and handedness, using the regression weights derived from HC participants. We investigated SNPs from the top 5 AD risk genes [16] and APOE e4, for a total of 379 SNPs in this study. Our task was to examine correlations between the voxels (GM density measures) and genetic biomarker SNPs.
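The covariate pre-adjustment can be sketched as an ordinary least-squares fit on the HC rows followed by subtraction of the fitted covariate effects from every participant. This is one plausible reading of the description above, not the authors' pipeline: the function name `adjust_for_covariates`, the choice to keep the intercept, and the synthetic data are all assumptions.

```python
import numpy as np

def adjust_for_covariates(measures, covariates, is_control):
    """Remove covariate effects using regression weights fitted on controls only."""
    # Design matrix with an intercept column followed by the covariates.
    design = np.column_stack([np.ones(len(covariates)), covariates])
    # Ordinary least squares restricted to the healthy-control rows.
    weights, *_ = np.linalg.lstsq(design[is_control], measures[is_control], rcond=None)
    # Subtract the covariate contribution (assumption: the intercept is kept).
    return measures - covariates @ weights[1:]

# Synthetic example: measures are an exact linear function of four covariates.
rng = np.random.default_rng(0)
covariates = rng.standard_normal((30, 4))       # stand-ins for age, gender, etc.
effects = rng.standard_normal((4, 3))
measures = 2.0 + covariates @ effects           # three hypothetical voxel measures
is_control = np.arange(30) < 12                 # first 12 rows play the HC group
adjusted = adjust_for_covariates(measures, covariates, is_control)
```

Fitting on controls only means disease-related variation in the MCI and AD groups is not absorbed into the covariate weights.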

Shown in Table 2 are the 5-fold cross-validation results of GN-SCCA and SCCA. GN-SCCA significantly and consistently outperformed SCCA in terms of identifying stronger correlations from the training data. On the testing data, SCCA did not perform well, possibly due to over-fitting, while GN-SCCA consistently outperformed it. Fig. 2 shows the heat maps of the canonical loadings learned in each cross-validation fold. Both weight vectors estimated by GN-SCCA, i.e. u and v, were quite sparse and presented a clear pattern that is easier to interpret, whereas SCCA identified many signals that are harder to explain. The strongest genetic signal identified by GN-SCCA was the APOE e4 SNP rs429358, and the strongest imaging signals came from the hippocampus. They were negatively correlated with each other. This confirms that our method identified a well-known correlation between APOE and hippocampal morphometry in an AD cohort. These results show the capability of GN-SCCA to identify biologically meaningful imaging genetic associations.

Table 2.

5-fold nested cross-validation results on real data: The models learned from training data were used to estimate the correlation coefficients for both training and testing cases. p-values of paired t-tests between GN-SCCA and SCCA are shown.

Correlation coefficients	SCCA F1	SCCA F2	SCCA F3	SCCA F4	SCCA F5	SCCA mean±std.	GN-SCCA F1	GN-SCCA F2	GN-SCCA F3	GN-SCCA F4	GN-SCCA F5	GN-SCCA mean±std.	p
Training	0.22	0.23	0.24	0.20	0.21	0.22±0.02	0.28	0.27	0.28	0.26	0.27	0.27±0.01	2.25E-4
Testing	0.07	0.04	0.09	0.05	0.16	0.07±0.03	0.21	0.28	0.24	0.31	0.27	0.26±0.04	9.14E-4
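The paired comparison behind the p-values can be illustrated with the per-fold training correlations from Table 2. The sketch below computes only the paired t statistic and its degrees of freedom; turning the statistic into a p-value requires the Student t distribution, and whether the reported test was one- or two-sided is not stated, so that step is left out. Variable names are illustrative.

```python
import numpy as np

# Per-fold training correlations copied from Table 2.
scca_train = np.array([0.22, 0.23, 0.24, 0.20, 0.21])
gn_scca_train = np.array([0.28, 0.27, 0.28, 0.26, 0.27])

# Paired t statistic: mean fold-wise difference over its standard error.
diff = gn_scca_train - scca_train
t_stat = diff.mean() / (diff.std(ddof=1) / np.sqrt(len(diff)))
dof = len(diff) - 1
```

Pairing by fold removes the fold-to-fold variation that both methods share, which is why the test can be decisive with only five folds.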


Fig. 2. Comparisons of estimated canonical loadings using 5-fold cross-validation on real data. The SCCA results (top row) and GN-SCCA results (bottom row) are shown. For each panel pair, the 5 estimated u’s are shown on the left panel, and the 5 estimated v’s are shown on the right.

5 Conclusions

We proposed a GraphNet constrained SCCA (GN-SCCA) to mine imaging genetic associations, and incorporated the covariance information ignored by many existing SCCA methods. The GraphNet term induces smoothness by penalizing the pairwise differences between adjacent features in a complete graph or a user-given graph (the correlation matrix was used in this study). Our experimental study showed that GN-SCCA accurately discovered the true signals from the simulation data and obtained improved performance and biologically meaningful findings from real data. In this work, we only compared GN-SCCA with a widely used SCCA method [19]. We have observed many recent developments in structured SCCA models. Some (e.g., [6, 18, 19, 14, 12]) ignored the covariance structure information of the input data, which is usually helpful in imaging genetics applications. A few other models (e.g., [7, 21]) overcome this limitation but impose different sparsity structures. Work is in progress to compare the proposed GN-SCCA with these structured SCCA models. Given the mathematically simple formulation of GN-SCCA, we feel it is a valuable addition that is complementary to the existing SCCA models.

Acknowledgments

This work was supported by NIH R01 LM011360, U01 AG024904 (details available at http://adni.loni.usc.edu), RC2 AG036535, R01 AG19771, P30 AG10133, and NSF IIS-1117335 at IU, by NSF CCF-0830780, CCF-0917274, DMS-0915228, and IIS-1117965 at UTA, and by NIH R01 LM011360, R01 LM009012, and R01 LM010098 at Dartmouth.

Footnotes

⁵

SCCA in the PMA software package is widely used as a benchmark algorithm. Here we simply use SCCA to denote the SCCA method in this software package. See http://cran.r-project.org/web/packages/PMA/ for details.

⁶

If ${‖ u^{k_{1}} ‖}_{1} = 0$ or ${‖ v^{k_{2}} ‖}_{1} = 0$ , we approximate it with $\sqrt{{‖ u^{k_{1}} ‖}_{2}^{2} + ζ}$ or $\sqrt{{‖ v^{k_{2}} ‖}_{2}^{2} + ζ}$ , where ζ is a very small non-zero value. According to [13], this regularization will not affect the result when ζ → 0.

References

1. Ashburner J, Friston KJ. Voxel-based morphometry–the methods. Neuroimage. 2000;11(6):805–21. doi: 10.1006/nimg.2000.0582.
2. Belkin M, Niyogi P. Proceedings of the 18th annual conference on Learning Theory. Springer-Verlag; 2005. Towards a theoretical foundation for laplacian-based manifold methods; pp. 486–500.
3. Chen J, Bushman FD, et al. Structure-constrained sparse canonical correlation analysis with an application to microbiome data analysis. Biostatistics. 2013;14(2):244–258. doi: 10.1093/biostatistics/kxs038.
4. Chen X, Liu H. An efficient optimization algorithm for structured sparse cca, with applications to eqtl mapping. Statistics in Biosciences. 2012;4(1):3–26.
5. Chen X, Liu H, Carbonell JG. Structured sparse canonical correlation analysis. International Conference on Artificial Intelligence and Statistics. 2012.
6. Chi E, Allen G, et al. Imaging genetics via sparse canonical correlation analysis. Biomedical Imaging (ISBI), 2013 IEEE 10th Int Sym on. 2013:740–743. doi: 10.1109/ISBI.2013.6556581.
7. Du L, et al. A novel structure-aware sparse learning algorithm for brain imaging genetics. International Conference on Medical Image Computing and Computer Assisted Intervention. 2014:329–336. doi: 10.1007/978-3-319-10443-0_42.
8. Grosenick L, et al. Interpretable whole-brain prediction analysis with graphnet. NeuroImage. 2013;72:304–321. doi: 10.1016/j.neuroimage.2012.12.062.
9. Hibar DP, Kohannim O, et al. Multilocus genetic analysis of brain images. Front Genet. 2011;2:73. doi: 10.3389/fgene.2011.00073.
10. Kim S, Xing EP. Statistical estimation of correlated genome associations to a quantitative trait network. PLoS genetics. 2009;5(8). doi: 10.1371/journal.pgen.1000587.
11. Li C, Li H. Network-constrained regularization and variable selection for analysis of genomic data. Bioinformatics. 2008;24(9):1175–1182. doi: 10.1093/bioinformatics/btn081.
12. Lin D, Calhoun VD, Wang YP. Correspondence between fMRI and SNP data by group sparse canonical correlation analysis. Med Image Anal. 2013. doi: 10.1016/j.media.2013.10.010.
13. Nie F, Huang H, Cai X, Ding CH. Efficient and robust feature selection via joint l₂,₁-norms minimization. Advances in Neural Information Processing Systems. 2010:1813–1821.
14. Parkhomenko E, Tritchler D, Beyene J. Sparse canonical correlation analysis with application to genomic data integration. Statistical Applications in Genetics and Molecular Biology. 2009;8(1):1–34. doi: 10.2202/1544-6115.1406.
15. Risacher SL, Saykin AJ, et al. Baseline MRI predictors of conversion from MCI to probable AD in the ADNI cohort. Curr Alzheimer Res. 2009;6(4):347–61. doi: 10.2174/156720509788929273.
16. Shah RD, Samworth RJ. Variable selection with error control: another look at stability selection. Journal of the Royal Statistical Society: Series B (Statistical Methodology). 2013;75(1):55–80.
17. Shen L, Kim S, et al. Whole genome association study of brain-wide imaging phenotypes for identifying quantitative trait loci in MCI and AD: A study of the ADNI cohort. Neuroimage. 2010;53(3):1051–63. doi: 10.1016/j.neuroimage.2010.01.042.
18. Vounou M, Nichols TE, Montana G. Discovering genetic associations with high-dimensional neuroimaging phenotypes: A sparse reduced-rank regression approach. NeuroImage. 2010;53(3):1147–59. doi: 10.1016/j.neuroimage.2010.07.002.
19. Witten DM, Tibshirani R, Hastie T. A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics. 2009;10(3):515–34. doi: 10.1093/biostatistics/kxp008.
20. Witten DM, Tibshirani RJ. Extensions of sparse canonical correlation analysis with applications to genomic data. Statistical Applications in Genetics and Molecular Biology. 2009;8(1):1–27. doi: 10.2202/1544-6115.1470.
21. Yan J, Du L, et al. Transcriptome-guided amyloid imaging genetic analysis via a novel structured sparse learning algorithm. Bioinformatics. 2014;30(17):i564–i571. doi: 10.1093/bioinformatics/btu465.

