Abstract
Integrative analysis of multiple data types can take advantage of their complementary information and therefore may provide higher power to identify potential biomarkers that would be missed using individual data analysis. However, because diverse data modalities differ in nature, data integration is challenging. Here we address the data integration problem by developing a generalized sparse model (GSM) that uses weighting factors to integrate multi-modality data for biomarker selection. As an example, we applied the GSM to a joint analysis of two types of schizophrenia data sets: 759075 SNPs and 153594 functional magnetic resonance imaging (fMRI) voxels in 208 subjects (92 cases/116 controls). To solve this small-sample-large-variable problem, we developed a novel sparse representation based variable selection (SRVS) algorithm, with the primary aim of identifying biomarkers associated with schizophrenia. To validate the effectiveness of the selected variables, we performed multivariate classification followed by ten-fold cross validation. We compared our proposed SRVS algorithm with an earlier sparse model based variable selection algorithm for integrated analysis. In addition, we compared with traditional statistical methods for univariate data analysis (Chi-squared test for SNP data and ANOVA for fMRI data). Results showed that our proposed SRVS method can identify novel biomarkers with stronger capability in distinguishing schizophrenia patients from healthy controls. Moreover, better classification ratios were achieved using biomarkers from both types of data, suggesting the importance of integrative analysis.
Keywords: Sparse representations, SNP, fMRI, variable selection, schizophrenia
I. INTRODUCTION
Schizophrenia has been hypothesized to arise from a number of genetic factors and environmental effects. To date, many studies have investigated the role of critical genes or single nucleotide polymorphisms (SNPs) associated with schizophrenia. Many genes of great significance have been identified as potential causal genetic markers for schizophrenia, such as G72/G30 on chromosome 13q, DISC1, GRIK3, EFNA5, AKAP5 and CACNG2 (Badner and Gershon, 2002; Callicott et al., 2005; Sutrala et al., 2007). Besides genetic studies, functional magnetic resonance imaging (fMRI) is another widely used tool for the study of schizophrenia, as it can identify both structural and functional abnormalities in the brain regions of schizophrenia patients (Meda et al., 2008; Szycik et al., 2009). Therefore, the identification of biomarkers from a joint analysis of fMRI and SNP data is of tremendous importance for disease diagnosis and treatment (Liu et al., 2009; Lin et al., 2011).
In this paper we propose a generalized sparse model (GSM) to integrate multi-modality data (e.g., SNP and fMRI data) for biomarker selection. Sparse representation, particularly compressive sensing, has received great attention in recent years (Gribonval and Nielsen, 2003; Tropp et al., 2003; Donoho and Elad, 2003; Kidron et al., 2007; Tang et al., 2012; Cao et al., 2012a; Cao et al., 2012b). For example, Kidron et al. used sparsity-based cross-modal localization to find sound-related regions in video (Kidron et al., 2007). We recently developed sparse representation based classification algorithms for sub-typing of leukemia from gene expression data (Tang et al., 2012), for chromosome image segmentation (Cao et al., 2012a) and for integrative analysis of gene copy number variation and gene expression data (Cao et al., 2012b).
The GSM can be solved by many existing algorithms, such as the Homotopy method (Donoho and Tsaig, 2008), the orthogonal matching pursuit (OMP) algorithm (Davis et al., 1997; Tropp, 2004; Cai and Wang, 2011), the single best replacement (SBR) algorithm (Soussen et al., 2011), and the FOCUSS method (Cotter et al., 2005). However, in compressive sensing theory, exact recovery of an s-sparse signal typically requires a large number of samples (Davenport et al., 2011). Here an s-sparse signal refers to a vector having at most s nonzero entries; the entries with large amplitudes correspond to the variables to be selected (Cai and Wang, 2011). When the number of variables n greatly exceeds the number of samples m, exact signal recovery becomes difficult (Hsu et al., 2009; Davenport et al., 2011). One of the most commonly used conditions for exact signal recovery is the restricted isometry property (RIP) (Davenport et al., 2011). However, whether a measurement matrix satisfies the RIP is hard to verify in practice. An alternative is to use the coherence of the matrix X (Donoho, 2004; Candes and Tao, 2006), which is required to be small. Moreover, even when a matrix X satisfies the signal recovery condition, the number of signals to be recovered or variables to be selected using these traditional sparse representation methods will generally be equal to or less than the number of samples (Li et al., 2009). To address this problem, Li et al. proposed a sparse representation based variable selection method, aiming to achieve a sparse solution for the GSM when the sample number is large (e.g., larger than the number of variables to be selected; Li et al., 2009).
In this work, as in many other practical cases, the number of samples (92 cases/116 controls) is far smaller than the number of variables (i.e., 759075 SNPs and 153594 fMRI voxels). As a consequence, the small coherence condition on the data matrix is hard to satisfy (Hsu et al., 2009), and directly applying existing compressive sensing methods may fail (Blankertz et al., 2011; Parra et al., 2005; Zien et al., 2009). To overcome the difficulty caused by this small-m-large-n problem, we propose a novel sparse representation based variable selection (SRVS) algorithm, which can select significant variables regardless of the coherence condition of the measurement matrix. Moreover, the proposed SRVS algorithm is proven to have a multi-resolution property, selecting variables at different significance levels. Instead of solving the GSM directly, the SRVS algorithm solves sub-matrix based Lp norm minimization problems and assembles them into a sparse solution of the GSM. In a preliminary work (Cao et al., 2012c), we studied the orthogonal matching pursuit (OMP) based SRVS algorithm. Our preliminary results showed that, even with a small number of samples, SRVS is capable of identifying a number of biomarkers for schizophrenia, leading to improved identification accuracy.
Here, we extend that work by applying the proposed SRVS algorithm to the GSM with a more general penalization term (Lp norm, 0 ≤ p ≤ 1), aiming to identify more effective joint biomarkers for schizophrenia. Specifically, we tested and compared three models with p = 0, 0.5 and 1. For the Lp (0 ≤ p ≤ 1) based model, we proved that the proposed SRVS method can identify significant variables at different significance levels and recover signals with high probability regardless of the coherence of the measurement matrix. We also showed the convergence and effectiveness of the proposed SRVS algorithm. We then applied SRVS to the GSM integrating 759075 SNPs and 153594 fMRI voxels in 208 subjects (92 cases and 116 controls) to identify biomarkers for schizophrenia. To test the predictive power of the selected biomarkers, we used the selected variables to distinguish schizophrenia patients from healthy controls with a 10-fold cross-validation. We evaluated the three models with different penalization terms (i.e., Lp norm with p = 0, 0.5 and 1) and compared them with the biomarker selection approach proposed by Li et al. (Li et al., 2009) for integrated analysis. In addition, we compared our method with traditional statistical methods for uni-type data analysis (i.e., Chi-squared test for SNP data and ANOVA for fMRI data).
II. Materials and Methods
The proposed variable selection approach includes three steps, as shown in Fig. 1: 1.) Data combination: the GSM is used to combine the two types of data. 2.) Variable selection: the SRVS algorithm is used to solve the sparse linear system in the GSM. 3.) Validation of the selected variables: a multivariate classification approach is employed to test the effectiveness of the selected variables, and cross validation is used to select the optimal parameters of the GSM.
Fig. 1.
The flowchart of our proposed variable selection for integrative analysis of two types of data
2.1 A sparse model for data combination
The sparse representation of a signal can be modeled as
y = Xδ + ε    (1)
where y ∈ Rm×1 is the observation vector; X ∈ Rm×n represents the measurement matrix; and ε ∈ Rm×1 is the measurement error or noise. The goal of sparse representation is to recover the unknown sparse vector δ ∈ Rn×1 from y and X, and the non-zero entries of δ correspond to selected variables/columns in X.
To represent multi-modality data (e.g., SNP data and fMRI data from the same group of subjects), we propose the generalized sparse model (GSM) in Eq. (2).
y = [α1X1, α2X2] [δ1; δ2] + ε = Xδ + ε    (2)
where y ∈ Rm×1 is the observation vector (phenotypes of the subjects; e.g., 1 for disease case, 0 for healthy control); X1 ∈ Rm×n1 and X2 ∈ Rm×n2 are the measurements of the two data types (e.g., numerical SNP values (0, 1, 2) and fMRI voxel values) for m samples, with n1 (or n2) features per sample; each column is normalized to have unit L2 norm; X = [α1X1, α2X2] ∈ Rm×n; α1 + α2 = 1 and α1, α2 > 0 are the weight factors for the two types of data; and ε ∈ Rm×1 is the measurement error. The problem of variable selection then becomes identifying the unknown sparse vector δ = [δ1; δ2] ∈ Rn×1 from y and X, where δ1 ∈ Rn1×1, δ2 ∈ Rn2×1 and n = n1 + n2. The optimal weighting factors α1 and α2 can be determined by cross validation, i.e., as the weighting factors that generate the best classification ratio (CR).
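To make the construction of X concrete, the following minimal Python sketch (illustrative only; build_gsm_matrix is not part of the released toolbox) normalizes each column of the two modality matrices to unit L2 norm and stacks them with the weight factors α1 and α2 = 1 − α1:

```python
import numpy as np

def build_gsm_matrix(X1, X2, alpha1):
    """Assemble the GSM measurement matrix X = [alpha1*X1, alpha2*X2]
    after normalizing every column to unit L2 norm (Eq. (2))."""
    alpha2 = 1.0 - alpha1

    def unit_columns(X):
        norms = np.linalg.norm(X, axis=0)
        norms[norms == 0] = 1.0            # guard against all-zero columns
        return X / norms

    return np.hstack([alpha1 * unit_columns(X1), alpha2 * unit_columns(X2)])

# Toy example: y holds phenotypes (1 = case, 0 = control),
# X1 holds SNP codes (0/1/2) and X2 holds fMRI voxel values.
rng = np.random.default_rng(0)
X1 = rng.integers(0, 3, size=(208, 50)).astype(float)
X2 = rng.standard_normal((208, 30))
X = build_gsm_matrix(X1, X2, alpha1=0.5)   # shape (208, 80)
```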
2.2 Variable selection with SRVS
By integrating multi-modality data, the GSM model given by Eq. (2) offers the potential to detect more significant and reliable biomarkers (Rhodes and Chinnaiyan, 2005; Liu et al., 2009; Cao et al., 2012d). However, in genomic and bio-imaging data analysis (e.g. SNP and fMRI), usually n ≫ m and the linear system defined by Eq. (2) is underdetermined, and the solution of the system is not unique. To overcome the problem, a sparse constraint is usually imposed on the model. An example is given in Eq. (3) by using the L0 norm based penalty (Cai and Wang, 2011; Tropp, 2004; Davis et al., 1997; Soussen et al., 2011), which measures the number of nonzero elements.
min ||δ||0  subject to  ||y − Xδ||2 ≤ ε    (3)
Unfortunately, this penalty results in a combinatorial problem that is NP-hard. Thus, the L1 norm penalty (Donoho and Tsaig, 2008) is often used instead:
min ||δ||1  subject to  ||y − Xδ||2 ≤ ε    (4)
Detailed discussions of the differences between the L0 and L1 norm penalties can be found in (Sharon et al., 2007) and (Donoho and Tsaig, 2006). In recent years, the Lp norm penalty (0 < p < 1) has also been studied (Cotter et al., 2005; Xu et al., 2012), as it can lead to an even sparser solution. The model is formulated as
min ||δ||p  subject to  ||y − Xδ||2 ≤ ε,  0 < p < 1    (5)
Several algorithms have been proposed to solve this problem (Cotter et al., 2005; Foucart and Lai, 2009; Wang et al., 2011).
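As a concrete illustration of one such solver, the sketch below implements a regularized FOCUSS-style reweighting iteration for the Lp problem in Eq. (5) (a simplified reading of Cotter et al., 2005; the regularization lam, iteration count and stopping tolerance are illustrative choices, not values from this paper):

```python
import numpy as np

def focuss_lp(y, X, p=0.5, lam=1e-3, n_iter=50, tol=1e-6):
    """Regularized FOCUSS-style iteration for min ||delta||_p s.t. y ~ X delta."""
    m, n = X.shape
    # start from the regularized minimum-norm solution
    delta = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(m), y)
    for _ in range(n_iter):
        w = np.abs(delta) ** (1.0 - p / 2.0)           # reweighting from previous iterate
        Xw = X * w                                      # same as X @ diag(w)
        new = w * (Xw.T @ np.linalg.solve(Xw @ Xw.T + lam * np.eye(m), y))
        if np.linalg.norm(new - delta) < tol:
            delta = new
            break
        delta = new
    return delta                                        # small entries shrink toward zero
```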
Nevertheless, when applying model (5) to signal recovery/variable selection, the measurement matrix X ∈ Rm×n is usually required to satisfy the RIP condition for exact signal recovery (Davenport et al., 2011; Donoho, 2004; Candes and Tao, 2006 and Hsu et al., 2009). The RIP is defined as follows.
A matrix X ∈ Rm×n is said to satisfy the restricted isometry property (RIP) of order s if there exists a τs ∈ (0,1) such that
(1 − τs)||δ||2² ≤ ||Xδ||2² ≤ (1 + τs)||δ||2²    (6)
holds for all s-sparse vectors δ ∈ Rn.
When a matrix X ∈ Rm×n satisfies the RIP of order s, an s-sparse vector δ ∈ Rn can be recovered from the m samples. A necessary condition on m and n for X ∈ Rm×n to satisfy the RIP is given by the following theorem (Davenport et al., 2011).
Theorem
If X ∈ Rm×n satisfies the RIP of order 2s with constant τ2s ∈ (0, 1/2], then
m ≥ C·s·log(n/s)    (7)
where C = 1/(2 log(√24 + 1)) ≈ 0.28 (Davenport et al., 2011). Thus, for a data set with m samples, exact recovery of an s-sparse vector δ ∈ Rn requires
n ≤ s·e^(m/(Cs))    (8)
In genomic or medical imaging data analysis, the number of biomarkers to be detected is generally at least as large as the number of samples m (Li et al., 2004; Li et al., 2009), while traditional sparse methods recover at most m nonzero entries; taking s = m as the limiting case, Eq. (8) simplifies to
n ≤ m·e^(1/C) ≈ 35m    (9)
Eq. (9) suggests that, for a given sample size m, the number of columns n of the measurement matrix X ∈ Rm×n should be less than about 35m for the sparse solution to be recoverable.
However, in practice this condition cannot be satisfied, because the number of features (SNPs/fMRI voxels) is often far larger than the number of samples. In our case, with m = 208 samples, Eq. (9) would require n < 35 × 208 ≈ 7300, far below the more than 900,000 SNPs and fMRI voxels considered here.
Eq. (9) describes a necessary condition for a matrix X ∈ Rm×n to satisfy the signal recovery requirement. Since it is difficult to verify whether a matrix X ∈ Rm×n satisfies the RIP, the coherence of X is used instead (Donoho, 2004). The coherence μ(X) is defined as:
μ(X) = max over 1 ≤ i < j ≤ n of |⟨xi, xj⟩| / (||xi||2 · ||xj||2)    (10)
The coherence given by Eq. (10) always lies in the range √((n − m)/(m(n − 1))) ≤ μ(X) ≤ 1, where the lower bound is known as the Welch bound (Davenport et al., 2011). Some reconstruction algorithms require a strong condition of bounded coherence (Donoho, 2004; Hsu et al., 2009):
μ(X) < 1/(2s − 1)    (11)
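As a quick numerical illustration (toy sizes, not the real data), the snippet below computes μ(X) for a random small-m-large-n matrix and compares it with the Welch bound; the empirical coherence sits far above the bound, so small-coherence recovery conditions are hard to meet in this regime:

```python
import numpy as np

def coherence(X):
    """Mutual coherence: largest |inner product| between distinct unit-norm columns (Eq. (10))."""
    Xn = X / np.linalg.norm(X, axis=0)
    G = np.abs(Xn.T @ Xn)                      # Gram matrix of normalized columns
    np.fill_diagonal(G, 0.0)
    return G.max()

m, n = 208, 2000                               # toy small-m-large-n setting
X = np.random.default_rng(0).standard_normal((m, n))
welch_bound = np.sqrt((n - m) / (m * (n - 1)))
print(coherence(X), welch_bound)               # coherence is several times the Welch bound
```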
However, this small-coherence condition is hard to satisfy in our problem. In fact, when n ≫ m, some columns of the small-m-large-n measurement matrix will inevitably have large coherence (Candes and Tao, 2006). In our study, n = 759075 + 153594 (SNPs + fMRI voxels) and m = 208 (92 cases/116 controls). Therefore, we propose the SRVS algorithm to find an approximate solution of the GSM given by Eq. (2). The SRVS algorithm is described below, and its properties are presented in Appendix A.
SRVS Algorithm (http://hongbaocao.weebly.com/software-for-download.html)
Step 1. Initialize δ(0) = 0 and l = 1;
Step 2. Randomly choose k columns from X = {x1, …, xn} ∈ Rm×n to construct an m × k sub-matrix Xl ∈ Rm×k, and denote the index vector of the selected columns as Il;
Step 3. Given the sub-matrix Xl, solve the following Lp minimization problem to obtain the optimal sparse solution δl ∈ Rk×1:
δl = arg min ||δ||p  subject to  ||y − Xlδ||2 ≤ ε    (12)
Step 4. Update δ(l) ∈ Rn×1 with δl: δ(l)(Il) = δ(l−1)(Il) + δl, where δ(l)(Il) and δ(l−1)(Il) denote the Il-th entries of δ(l) and δ(l−1), respectively; the remaining entries of δ(l) keep their values from δ(l−1);
Step 5. If the stopping rule is not satisfied, set l = l + 1 and go to Step 2. Otherwise, set δ = δ(l)/l and stop. The non-zero entries of δ correspond to the selected column vectors, i.e., the selected variables.
In Step 2, one way to achieve random selection of k columns from X is to shuffle the columns with the Fisher–Yates algorithm (Fisher and Yates, 1948) and then use a window of length k to select variables randomly (Cao et al., 2012c). Note that in each iteration a different sub-matrix Xl is randomly drawn (there are (n choose k) possible combinations in total), so the procedure is not a simple split of X into several subsets.
In Step 3, there are many well-established methods for solving the Lp minimization problem, such as the Homotopy algorithm (Donoho and Tsaig, 2008) for p = 1, the orthogonal matching pursuit (OMP) algorithm (Cai and Wang, 2011; Tropp, 2004; Davis et al., 1997) and the single best replacement (SBR) algorithm (Soussen et al., 2011) for p = 0, and the FOCUSS method (Cotter et al., 2005) for 0 ≤ p ≤ 1.
In Step 5, we use the following two stop rules: 1. ||δ(l)/l − δ(l−1)/(l−1)||2 < α, where α is a predefined threshold; 2. the probability that each column in X has been evaluated at least once should be greater than 1 − pstop. The algorithm terminates when both rules are satisfied, which determines the total number of iterations. In this work, we set α = 0.01 and pstop = 1e−4. With these stop rules, the total number of iterations was around 200 for the simulated data with n = 1e6 features and around 300 for the real data sets (759075 SNPs and 153594 fMRI voxels) tested in this work. The effect of the stop rules on the number of iterations is evaluated in Sec. B of Appendix A, where the convergence of the algorithm is also proved.
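The reference implementation is the authors' Matlab toolbox linked above; purely for illustration, the following Python sketch reproduces the SRVS iteration for the p = 0 case with OMP as the inner solver (scikit-learn's OrthogonalMatchingPursuit), keeping only the first stop rule and using illustrative parameter values:

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def srvs_omp(y, X, k, n_nonzero=10, alpha=0.01, max_iter=300, seed=0):
    """Sketch of SRVS: repeatedly solve a sparse fit on a random k-column
    sub-matrix of X and average the accumulated coefficients."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    delta_sum = np.zeros(n)                       # running sum of sub-problem solutions
    prev_avg = np.zeros(n)
    for l in range(1, max_iter + 1):
        idx = rng.choice(n, size=k, replace=False)            # Step 2: random sub-matrix X^l
        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero, fit_intercept=False)
        omp.fit(X[:, idx], y)                                  # Step 3: L0-type inner solve (Eq. (12))
        delta_sum[idx] += omp.coef_                            # Step 4: update on selected indices
        avg = delta_sum / l
        if l > 1 and np.linalg.norm(avg - prev_avg) < alpha:   # Step 5: stop rule 1 only
            break
        prev_avg = avg
    return delta_sum / l             # non-zero entries rank the candidate variables
```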
We present the discussion and proofs of the properties of the proposed SRVS algorithm in Appendix A, including: 1.) independence from the coherence condition of the data matrix X; 2.) convergence and effectiveness of SRVS; 3.) the multi-resolution property of SRVS; and 4.) sparsity control using ε. The Matlab based software toolbox for the proposed SRVS algorithm is available online: http://hongbaocao.weebly.com/software-for-download.html.
2.3 Validation of selected variables
To test the predictive power of the selected biomarkers (SNPs/fMRI voxels), we performed a multivariate classification followed by ten-fold cross-validation to distinguish schizophrenia patients from healthy controls. Results from four models were compared: the SRVS algorithm with different Lp norm penalties (p = 0, 0.5, 1) and Li et al.'s method (Li et al., 2009).
Furthermore, we compared several classifiers, including the sparse representation based classifier (SRC), the fuzzy c-means (FCM) classifier and a support vector machine (SVM) based classifier; the SRC gave the best performance (see Appendix B, Fig. B1). The SRC has been proven effective for many tasks such as face recognition (Wright et al., 2009), speech recognition (Gemmeke et al., 2011), signal classification for brain computer interfaces (Shin et al., 2012) and image classification (Cao et al., 2012a). The SRC algorithm is described as follows.
Sparse Representation-based Classification (SRC) algorithm
Inputs: a matrix of training samples A = [A1, A2, …, Ac] ∈ Rn×s for c classes; and a test sample st ∈ Rn.
Normalize the columns of A to have unit L2-norm;
Solve the L1 norm minimization problem: x̂ = arg min||x||1, subject to Ax = st;
Calculate the residuals ri(st) = ||st − Aδi(x̂)||2 for i = 1, …, c;
ClassID (st) = arg mini ri(st)
The inputs of the SRC algorithm are: 1. st ∈ Rn, the feature vector of subject t; 2. A ∈ Rn×s, the feature vectors from c = 2 groups in a total of s samples/subjects. δi(·) is an Rs → Rs transformation that keeps only the coefficients associated with the i-th class. The output is the ClassID of subject t.
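For illustration, a compact Python sketch of SRC is shown below; the L1 step uses scikit-learn's Lasso as a practical stand-in for the equality-constrained L1 minimization in the listing, and l1_penalty is an illustrative setting rather than a value from the paper:

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(A, labels, s_t, l1_penalty=0.01):
    """SRC sketch: sparsely represent the test sample over the training
    columns, then assign the class with the smallest reconstruction residual.
    labels[j] gives the class of the j-th column of A."""
    A = A / np.linalg.norm(A, axis=0)                # unit L2-norm columns
    s_t = s_t / np.linalg.norm(s_t)
    # Lasso approximates min ||x||_1 s.t. A x = s_t from the listing above.
    x_hat = Lasso(alpha=l1_penalty, fit_intercept=False, max_iter=10000).fit(A, s_t).coef_
    residuals = {}
    for c in np.unique(labels):
        x_c = np.where(labels == c, x_hat, 0.0)      # delta_i(x_hat): keep class-c coefficients
        residuals[c] = np.linalg.norm(s_t - A @ x_c)
    return min(residuals, key=residuals.get)         # ClassID with the smallest residual
```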
In each run of the 10-fold cross-validation, 90 percent of the subjects from both cases and controls were randomly selected for variable/biomarker selection, while the rest were used for testing. For each method, we carried out 100 runs and the average classification ratio was used as the final identification accuracy.
We also used cross-validation to determine the optimal weighting factors in Eq. (2). Different pairs of weighting factors lead to different selected variable groups and hence different classification ratios. Using the cross validation described above, we therefore selected the weighting factors that led to the highest classification ratio.
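A minimal sketch of this selection loop is given below; select_and_classify is a placeholder callable standing for the whole GSM + SRVS + SRC pipeline on one train/test split, and the stratified 10-fold splitter is one reasonable way to realize the cross validation described above:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def choose_alpha1(y, X1, X2, alphas, select_and_classify, n_splits=10, seed=0):
    """Return the SNP weight alpha1 whose selected biomarkers give the
    best mean classification ratio under 10-fold cross-validation."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    mean_cr = []
    for a1 in alphas:
        fold_cr = [select_and_classify(y, X1, X2, a1, train, test)
                   for train, test in cv.split(X1, y)]
        mean_cr.append(np.mean(fold_cr))
    best = int(np.argmax(mean_cr))
    return alphas[best], mean_cr

# Example grid matching Sec. 3.2: alpha1 from 0.3 to 0.6 in steps of 0.02
# best_alpha1, scores = choose_alpha1(y, X1, X2, np.arange(0.30, 0.61, 0.02), select_and_classify)
```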
To test the effectiveness of integrative analysis, we compared our method with two traditional statistical methods for uni-type data analysis (i.e. Chi-squared test for SNP data and ANOVA for fMRI data). We provide the top 200 selected SNPs and fMRI voxels and the classification ratios in Appendix C and Appendix D, respectively.
III. Results
This section is organized as follows. We first describe the data in Sec. 3.1. In Sec. 3.2 we present the variables (SNPs/fMRI voxels) selected using the GSM with different weighting factors. In Sec. 3.3 we compare the variables selected by the different models (SRVS with three different penalties and Li et al.'s method). Finally, in Sec. 3.4, we provide the cross validation results for the selection of the weighting factors.
3.1 Data Collection
In this study, participant recruitment and data collection were conducted by the Mind Clinical Imaging Consortium (MCIC). Two types of data (SNP and fMRI) were collected from 208 subjects, including 96 schizophrenia patients (age: 34 ± 1, 22 females) and 112 healthy controls (age: 32 ± 1, 44 females). All participants provided written informed consent. Healthy participants were free of any medical, neurological or psychiatric illness and had no history of substance abuse. Patients met the criteria for DSM-IV-TR schizophrenia on the basis of a clinical interview for DSM-IV-TR disorders (Pascual-Leone et al., 2002; Kumari et al., 2012) or the comprehensive assessment of symptoms and history (Onitsuka et al., 2004; Meier et al., 2008). Antipsychotic history was collected as part of the psychiatric assessment.
3.1.1 fMRI Data Collection and Preprocessing
The fMRI data were collected during a sensorimotor task, a block-design motor response to auditory stimulation. During the on-block, 200 msec tones were presented with a 500 msec stimulus onset asynchrony (SOA). A total of 16 different tones were presented in each on-block, with frequencies ranging from 236 Hz to 1318 Hz. The fMRI images were acquired on Siemens 3T Trio scanners and a 1.5T Sonata with echo-planar imaging (EPI) sequences using the following parameters: TR = 2000 msec, TE = 30 msec (3.0T)/40 msec (1.5T), field of view = 22 cm, slice thickness = 4 mm, 1 mm skip, 27 slices, acquisition matrix = 64 × 64, flip angle = 90°. Four scanners were used, with roughly equal numbers of patients and controls at each site. Data were pre-processed in SPM5 (http://www.fil.ion.ucl.ac.uk/spm): images were realigned, spatially normalized and re-sliced to 3×3×3 mm3, smoothed with a 10×10×10 mm3 Gaussian kernel to reduce spatial noise, and analyzed by multiple regression with the stimulus and its temporal derivative plus an intercept term as regressors. Finally, the stimulus-on versus stimulus-off contrast images were extracted with 53 × 63 × 46 voxels, and all voxels with missing measurements were excluded.
3.1.2 SNP Data
A blood sample was obtained from each participant and DNA was extracted. Genotyping for all participants was performed at the Mind Research Network using the Illumina Infinium HumanOmni1-Quad assay covering 1,140,419 SNP loci. BeadStudio was used to make the final genotype calls. The PLINK software package (http://pngu.mgh.harvard.edu/~purcell/plink) was then used to perform a series of standard quality control procedures, resulting in a final dataset spanning 759075 SNP loci. Each SNP was categorized into one of three genotype classes and represented with a discrete number: 0 for 'BB' (no minor allele), 1 for 'AB' (one minor allele) and 2 for 'AA' (two minor alleles).
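For clarity, the genotype-to-number coding described above can be written as a one-line mapping (toy example; array contents are illustrative):

```python
import numpy as np

coding = {'BB': 0, 'AB': 1, 'AA': 2}                 # minor-allele count per genotype
genotypes = np.array([['AB', 'BB', 'AA'],
                      ['AA', 'AB', 'BB']])           # toy calls: 2 subjects x 3 SNPs
numeric = np.vectorize(coding.get)(genotypes)        # -> [[1 0 2], [2 1 0]]
```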
3.2 Variable Selection with Generalized Sparse Model
Based on the generalized sparse model (Eq. (2)), we applied the proposed SRVS algorithm to select biomarkers for schizophrenia from the combination of the two data sets (SNP data and fMRI data), where the weight factors α1 and α2 (α1 + α2 = 1) reflect the levels of contribution of the SNP and fMRI data sets, respectively. When α1 = 1 or α2 = 1, variable selection is performed on only one type of data. We varied α1 from 0.3 to 0.6 with a step length of 0.02, giving a total of 16 trials, and we set k = 0.05n. To focus on the most important biomarkers, we selected 200 biomarkers in each trial using the proposed SRVS method under three models with different Lp norms (p = 0, 0.5, 1). We also compared with Li et al.'s method (Li et al., 2009). Fig. 2 plots the numbers of SNPs and fMRI voxels selected against the weight factor α1 for these four models. As shown in Fig. 2, the weight factor has similar effects on the variables selected by the four models. Interestingly, even though the number of SNPs was much larger than that of fMRI voxels (759075 vs. 153594), similar numbers of variables were selected from both data sets when α1 was around 0.4 to 0.5 (0.38 for SRVS with the L1/2 norm, 0.46 for SRVS with the L0 norm, 0.47 for SRVS with the L1 norm, and 0.47 for Li et al.'s method).
Fig. 2.
Variable selection with generalized sparse model using different models, where the number of selected fMRI voxels is in red color and the number of selected SNPs is in blue color. The ‘Weight factor’ in the plots refers to the weight factor α1 (for SNP data set), and the weight factor α2 = 1 − α1 (for fMRI data set).
In addition, Fig. 2 shows that when α1 took a small value, only a few SNPs were selected. Those SNPs can be viewed as the most important biomarkers, since they were still selected from the combined data even though the SNP data carried a small weight. Likewise, when α1 took a large value (α2 was small), only a few fMRI voxels were selected; for the same reason, these voxels should be the most important ones. To further understand the relationships between the groups of variables selected in each trial, we analyzed the newly selected variables as the corresponding weight factor decreases, as shown in Fig. 3.
Fig. 3.
The newly selected variables in each trial with the decrease of the corresponding weight factor. The ‘Weight factor’ in the plots refers to the weight factor α1, and the weight factor α2 = 1 − α1.
In Fig. 3, the newly selected variables shown for each trial have no overlap with the variables from any other trial. When the weight factors have large values (up to 0.6 for the SNP data set and 0.7 for the fMRI data set), the selected groups are relatively large and the variables come mostly from one type of data. These are the variables that can be identified when using one type of data alone. As the weight factor decreases, fewer new variables are detected. However, these variables should not be viewed as less significant than those selected with larger weight factors, since they were selected in competition with variables from both types of data despite their smaller weights.
3.3 Comparison of the Variables Selected Using Different Methods
We further compared the variables (SNPs/fMRI voxels) selected by the different methods: SRVS with L0, L1/2 and L1 penalties and Li et al.'s method, as shown in Table 1. With 16 trials and 200 variables selected per trial, a total of 3200 selections were made. However, as shown in Fig. 3, only a few new variables were selected in each trial, resulting in smaller numbers of final selected variables (807, 888, 1092 and 1939 for the four models, respectively). The overlaps among the variable groups selected by the SRVS models with different Lp norm penalties are around 50% (458, 447 and 514, as shown in Table 1 and Fig. 4), and 349 variables were selected by all three SRVS models. In contrast, only a small percentage (<10%) of the variables selected by Li et al.'s method overlapped with those of the SRVS models (67, 87 and 79, respectively). A total of 48 variables were selected by all four models. The first 50 SNPs and the corresponding genes identified by the four methods are listed in Table A.1.
Table 1.
Comparison of the numbers of variables (SNPs/fMRI voxels) selected by the four models: SRVS with L0, L1/2 and L1 penalties and Li et al.'s method. Diagonal entries give the total number of variables selected by each method; off-diagonal entries give pairwise overlaps; the last column gives the number of variables common to all three SRVS models.
| | SRVS (L1/2) | SRVS (L0) | SRVS (L1) | Li et al.'s method | Three SRVS methods |
|---|---|---|---|---|---|
| SRVS (L1/2) | 807 | 458 | 447 | 67 | 349 |
| SRVS (L0) | / | 888 | 514 | 87 | |
| SRVS (L1) | / | / | 1092 | 79 | |
| Li et al.'s method | / | / | / | 1939 | / |
| All four methods | 48 | | | | |
Fig. 4.
Comparison of the selected variables (SNPs/fMRI voxels) using a Venn diagram. A, B and C are the variables selected using SRVS with L1/2, L0 and L1 norm penalties, respectively.
We also compared the selected genes with the top-ranked 45 schizophrenia genes reported at http://www.szgene.org/default.asp (see Table A.2). Selecting 200 variables in each trial, we identified 4 to 5 of the reported genes with the proposed SRVS method under the L0 and L1 norm penalties and with Li et al.'s method, and 6 reported genes with the SRVS method under the L1/2 norm, as shown in Table 2. The genes/SNPs identified by each model were different. Notably, although the OPCML gene was identified by all four models, the SNPs from which the gene was identified differed between models. If more variables are selected in each run, corresponding to a larger s-sparsity, more of the reported genes can be identified. This was shown in our previous work, in which we selected around 800 variables in each of the 16 trials (α1 from 0.3 to 0.6; step length = 0.02) and identified 20 reported genes (e.g., PRSS16, NOTCH4, PDE4B, TCF4) (Cao et al., 2012c).
Table 2.
Comparison with the top 45 reported schizophrenia genes (http://www.szgene.org/default.asp)
| SRVS (L0) Genes | SRVS (L0) SNPs | SRVS (L1/2) Genes | SRVS (L1/2) SNPs | SRVS (L1) Genes | SRVS (L1) SNPs | Li's method Genes | Li's method SNPs |
|---|---|---|---|---|---|---|---|
| PDE4B | rs10846559 | DRD2 | rs10800893 | HIST1H2BJ | rs11220916 | PRSS16 | rs13399561 |
| NRG1 | rs12097254 | NRG1 | rs16956192 | DRD2 | rs16828456 | DAOA | rs16869700 |
| PLXNA2 | rs4811326 | RGS4 | rs1293448 | NRG1 | rs10846559 | RPP21 | rs1836942 |
| OPCML | rs3026883 | PPP3CC | rs6637088 | PLXNA2 | rs4632116 | NRG1 | rs10833482 |
| / | / | PLXNA2 | rs4072729 | OPCML | rs11807403 | OPCML | rs1745939 |
| / | / | OPCML | rs11772714 | / | / | / | / |
We also compared the fMRI voxels selected by the proposed SRVS method with those selected by Li et al.'s method (Li et al., 2009), as shown in Fig. 5. It is evident that the voxels selected by the SRVS method tended to cluster together in specific regions such as the temporal lobe, lateral frontal lobe, occipital lobe and motor cortex, which are schizophrenia-related brain regions (Pascual-Leone et al., 2002; Kumari et al., 2012; Onitsuka et al., 2004). In contrast, the voxels selected by Li et al.'s method tended to form small regions scattered over the whole brain. This may be because voxels within the same brain region cannot be detected simultaneously by their method.
The above comparisons show large differences among the biomarkers selected by the four models. To compare and test the effectiveness of these different groups of biomarkers, we used them to classify schizophrenia patients versus normal controls; the results are provided in Sec. 3.4.
3.4 Cross validation for the selection of weighting factor
We used the SRC as the classifier, with ten-fold cross validation, to test the predictive power of the variables/biomarkers selected by the four models (SRVS with the L1/2 penalty, SRVS with the L0 penalty, SRVS with the L1 penalty, and Li et al.'s method). We also used cross-validation to select the best weighting factors for the GSM (i.e., the weighting factors corresponding to the highest classification accuracy). Fig. 6(a) shows the ten-fold cross validation results for the 16 trials given in Fig. 2. As seen in Fig. 6(b), the proposed SRVS method under all three Lp norm penalties provides much higher classification ratios than Li et al.'s method (Li et al., 2009) (p-value < 1e−8). In addition, the SRVS method with the L1/2 norm penalty gives the highest classification ratio among the four tested models. This is consistent with the finding that L1/2 norm based sparse models provide the best data fitting and modeling performance among the Lp norms with p ∈ (0, 1], as demonstrated in Xu et al.'s work (Xu et al., 2012).
Fig. 6.
A comparison of classification results of using four sparse models. (a) gives the classification ratio of differentiating schizophrenia from healthy controls using four models with different weight factors; (b) is the box plot generated with ANOVA analysis of the classification ratios using four different models.
The cross validation results showed that, for the four models tested, the best weighting factors were: 1.) SRVS with the L1/2 penalty, α1 = 0.58, CR = 89.7%; 2.) SRVS with the L0 penalty, α1 = 0.48, CR = 81.5%; 3.) SRVS with the L1 penalty, α1 = 0.38, CR = 82.1%; and 4.) Li et al.'s method, α1 = 0.46, CR = 62.1%. Note that the best CRs and the corresponding optimal weighting factors were achieved with biomarkers selected from both types of data.
Compared with the traditional statistical methods for uni-type data analysis (Chi-squared test for SNP data and ANOVA for fMRI data), using the top (200~1000) selected SNPs alone reached an identification accuracy of (83.11 ± 1.32)%, while using the top (200~1000) fMRI voxels alone gave an accuracy of (63.13 ± 0.74)%. Please refer to Appendix C and Appendix D for the selected top features and the classification results.
IV. Discussion and Conclusion
This work aimed at biomarker identification using multi-modality data, i.e., SNP and fMRI data. To achieve this goal, we proposed a generalized sparse model (GSM) solved by a novel SRVS algorithm. The selected biomarkers were then tested by applying them to the classification of schizophrenia patients with ten-fold cross validation.
The GSM given in Eq. (2) uses multi-modality data for integrative analysis to detect biomarkers that cannot be identified using one type of data alone. The two weighting factors in the GSM represent the levels of contribution of the different data types, and their best values can be determined by cross-validation. As shown in Sec. 3.4, the best weighting factors for all four models lie in the range [0.38, 0.58]; combining both data sets at these values leads to the highest classification ratios. This demonstrates the advantage of integrating multi-modality data for the diagnosis of schizophrenia. In addition, compared to uni-type data analysis, the proposed SRVS method with the L1/2 norm led to significantly higher identification accuracy (p-value < 0.001) than using one type of biomarker alone (i.e., only SNP or only fMRI data for classification). This further demonstrates the advantage of using multiple data modalities.
The ten-fold validation results showed that features selected by the proposed SRVS algorithm gave higher classification accuracy than those selected by Li et al.'s method (see Fig. 6(b), p-value < 1e−8), indicating its effectiveness when the number of samples is much smaller than the number of variables. The comparison suggests that, although Li's method is valid for data with large sample sizes, the proposed SRVS is more suitable for data with small sample sizes.
For small-m-large-n data sets, traditional sparse models may fail. By randomly sampling the columns of the original measurement matrix into smaller sub-matrices, the proposed SRVS method can overcome this difficulty. We proved that a significant variable from the original data set will be selected with high probability, regardless of the coherence conditions (Appendix A, Sec. A). Moreover, we showed the convergence and effectiveness of the proposed SRVS method in Appendix A, Sec. B. As shown in Fig. 5, voxels from the same clusters were identified simultaneously; such variables may not be recovered by traditional sparse models (e.g., the method used by Li et al. (Li et al., 2009)).
One advantage of the proposed SRVS algorithm is its multi-resolution property. We show in Appendix A, Sec. C that the variables selected using a larger window length k are a subset of those selected with a smaller k. When k = n, the model reduces to the traditional sparse model given by Eq. (5), and no more than m biomarkers can be identified (Li et al., 2009; Cai and Wang, 2011). For two different parameters k1 > k2, a larger group of variables will be selected using k2, and it includes the variables selected using k1. Thus, as long as n > k > m (e.g., k = 0.05n), the same top (n/k) × m variables will always be selected. The relationship between the window length k and the number of variables to be selected is discussed in Appendix A, Sec. D. In addition, the selected variables can be ranked in order of significance (i.e., by the amplitudes of the corresponding entries of the solution δ; Cai and Wang, 2011; see Appendix A, Fig. A3). The residual ε can then be used to determine how many variables should be selected (i.e., sparsity control; see Appendix A, Sec. E).
Our multivariate classification results show that the variables selected by the proposed SRVS algorithm, especially with the L1/2 norm based penalty, generated the highest classification accuracy in discriminating schizophrenia patients from healthy controls. This suggests that the L1/2 norm may be the best choice of penalization term for the proposed SRVS method. However, the multivariate classification is not an independent validation step, but rather a way to assess how predictive the variables selected by the SRVS approach are. Moreover, the SRVS method proposed in this work is data driven and does not directly interpret the physiological meaning of the selected variables; as suggested by Haufe et al., signals detected using a general linear model may also reflect noise (Haufe et al., 2014). Therefore, variables that have not been reported by previous studies need further validation on independent data sets to establish their physiological significance.
In summary, we presented an effective multi-modality data integration method for biomarker selection. Using this method, we are able to integrate different types of data with a large number of variables but a small number of samples. However, due to the limited sample size, further biological experiments are needed to validate the biomarkers identified in this paper.
Supplementary Material
Figure 5.
A comparison of the selected fMRI voxels between SRVS (L1/2) and Li et al.'s method (Li et al., 2009). The value of a voxel represents the frequency with which it was selected across the 16 trials.
Acknowledgments
This work has been partially supported by both NIH and NSF. Drs. Cao and Shugart are supported by the intramural Program of NIMH, National Institutes of Health.
References
- Badner JA, Gershon ES. Meta-analysis of whole-genome linkage scans of bipolar disorder and schizophrenia. Mol Psychiatry. 2002;7(4):405–411. doi: 10.1038/sj.mp.4001012.
- Cai TT, Wang L. Orthogonal matching pursuit for sparse signal recovery. IEEE Trans Inf Theory. 2011;57(7):1–26.
- Callicott JH, Straub RE, Pezawas L, Egan MF, Mattay VS, Hariri AR, Verchinski BA, Meyer-Lindenberg A, Balkissoon R, Kolachana B, Goldberg TE, Weinberger DR. Variation in DISC1 affects hippocampal structure and function and increases risk for schizophrenia. Proc Natl Acad Sci U S A. 2005;102(24):8627–8632. doi: 10.1073/pnas.0500515102.
- Candès E, Tao T. Near optimal signal recovery from random projections: universal encoding strategies? IEEE Trans Inf Theory. 2006;52(12):5406–5425.
- Cao H, Deng H, Li M, Wang YP. Classification of multicolor fluorescence in-situ hybridization (M-FISH) images with sparse representation. IEEE Trans Nanobioscience. 2012a;11(2):111–118. doi: 10.1109/TNB.2012.2189414.
- Cao H, Duan J, Lin D, Wang YP. Sparse representation based clustering for integrated analysis of gene copy number variation and gene expression data. IJCA. 2012b;19(2):131.
- Cao H, Duan J, Lin D, Calhoun V, Wang YP. Biomarker identification for diagnosis of schizophrenia with integrated analysis of fMRI and SNPs. In: Bioinformatics and Biomedicine (BIBM), 2012 IEEE International Conference on; Oct. 4–7, 2012c; Philadelphia, PA, USA. pp. 1–6.
- Cao H, Lei S, Deng HW, Wang YP. Identification of genes for complex diseases using integrated analysis of multiple types of genomic data. PLoS One. 2012d;7(9):e42755. doi: 10.1371/journal.pone.0042755.
- Cotter SF, Rao BD, Engan K, Kreutz-Delgado K. Sparse solutions to linear inverse problems with multiple measurement vectors. IEEE Trans Signal Processing. 2005;53(7):2477–2488.
- Davenport M, Duarte M, Hegde C, Baraniuk R. Introduction to compressive sensing. Connexions; 2011. http://cnx.org/content/m37172/1.7/
- Davis G, Mallat S, Avellaneda M. Greedy adaptive approximation. J Constr Approx. 1997;13(1):57–98.
- Donoho DL, Elad M. Maximal sparsity representation via L1 minimization. Proc Natl Acad Sci U S A. 2003;100:2197–2202. doi: 10.1073/pnas.0437847100.
- Donoho DL, Tsaig Y. Fast solution of L1-norm minimization problems when the solution may be sparse. IEEE Trans Inf Theory. 2008;54:4789–4812.
- Donoho DL. Compressed sensing. Technical Report, Stanford University; 2004.
- Donoho DL, Elad M, Temlyakov VN. Stable recovery of sparse overcomplete representations in the presence of noise. IEEE Trans Inf Theory. 2006;52(1):6–18.
- Fisher RA, Yates F. Statistical tables for biological, agricultural and medical research. 3rd ed. London: Oliver & Boyd; 1948. pp. 26–27.
- Foucart S, Lai MJ. Sparsest solutions of underdetermined linear systems via ℓq minimization for 0 < q ≤ 1. Applied and Computational Harmonic Analysis. 2009;26(3):395–407.
- Gemmeke JF, Virtanen T, Hurmalainen A. Exemplar-based sparse representations for noise robust automatic speech recognition. IEEE Trans Audio Speech Lang Process. 2011;19(7):2067–2080.
- Gilbert AC, Muthukrishnan S, Strauss MJ. Improved sparse approximation over quasi-incoherent dictionaries. In: Proc 2003 IEEE Int Conf Image Processing; Barcelona, Spain; 2003. pp. 137–140.
- Gribonval R, Nielsen M. Sparse decompositions in unions of bases. IEEE Trans Inf Theory. 2003;49(12):3320–3325.
- Haufe S, Meinecke F, Görgen K, Dähne S, Haynes JD, Blankertz B, Bießmann F. On the interpretation of weight vectors of linear models in multivariate neuroimaging. Neuroimage. 2014;87:96–110. doi: 10.1016/j.neuroimage.2013.10.067.
- Hsu D, Kakade S, Langford J, Zhang T. Multi-label prediction via compressed sensing. In: Neural Information Processing Systems (NIPS); 2009.
- Kidron E, Schechner YY, Elad M. Cross-modal localization via sparsity. IEEE Trans Signal Processing. 2007;55(4):1390–1404.
- Kumari V, Gray JA, Honey GD, Soni W, Bullmore ET, Williams SC, Ng VW, Vythelingum GN, Simmons A, Suckling J, Corr PJ, Sharma T. Procedural learning in schizophrenia: a functional magnetic resonance imaging investigation. Schizophrenia Research. 2012;57(1):97–107. doi: 10.1016/s0920-9964(01)00270-5.
- Li Y, Cichocki A, Amari S. Analysis of sparse representation and blind source separation. Neural Comput. 2004;16(6):1193–1234. doi: 10.1162/089976604773717586.
- Li Y, Namburi P, Yu Z, Guan C, Feng J, Gu Z. Voxel selection in fMRI data analysis based on sparse representation. IEEE Trans Biomed Eng. 2009;56(10):2439–2451. doi: 10.1109/TBME.2009.2025866.
- Lin D, Cao H, Calhoun VD, Wang Y. Classification of schizophrenia patients with combined analysis of SNP and fMRI data based on sparse representation. In: Bioinformatics and Biomedicine (BIBM), 2011 IEEE International Conference on; Atlanta, GA, USA; 2011.
- Liu J, Pearlson G, Windemuth A, Ruano G, Perrone-Bizzozero NI, Calhoun VD. Combining fMRI and SNP data to investigate connections between brain function and genetics using parallel ICA. Hum Brain Mapp. 2009;30(1):241–255. doi: 10.1002/hbm.20508.
- Meda SA, Bhattarai M, Morris NA, Astur RS, Calhoun VD, Mathalon DH, Kiehl KA, Pearlson GD. An fMRI study of working memory in first-degree unaffected relatives of schizophrenia patients. Schizophr Res. 2008;104(1–3):85–95. doi: 10.1016/j.schres.2008.06.013.
- Meier L, van de Geer S, Bühlmann P. The group lasso for logistic regression. J R Statist Soc B. 2008;70(1):53–71.
- Onitsuka T, Shenton ME, Salisbury DF, Dickey CC, Kasai K, Toner SK, Frumin M, Kikinis R, Jolesz FA, McCarley RW. Middle and inferior temporal gyrus gray matter volume abnormalities in chronic schizophrenia: an MRI study. Am J Psychiatry. 2004;161(9):1603–1611. doi: 10.1176/appi.ajp.161.9.1603.
- Pascual-Leone A, Manoach DS, Birnbaum R, Goff DC. Motor cortical excitability in schizophrenia. Biol Psychiatry. 2002;52(1):24–31. doi: 10.1016/s0006-3223(02)01317-3.
- Ramirez I, Sprechmann P, Sapiro G. Classification and clustering via dictionary learning with structured incoherence and shared features. In: Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on; San Francisco, CA, USA; June 2010. pp. 3501–3508.
- Rhodes DR, Chinnaiyan AM. Integrative analysis of the cancer transcriptome. Nat Genet. 2005;37:31–37. doi: 10.1038/ng1570.
- Sharon Y, Wright J, Ma Y. Computation and relaxation of conditions for equivalence between ℓ1 and ℓ0 minimization. UIUC Tech Rep UILU-ENG-07-2008; 2007.
- Shin Y, Lee S, Lee J, Lee HN. Sparse representation-based classification scheme for motor imagery-based brain computer interface systems. J Neural Eng. 2012;9(5):056002. doi: 10.1088/1741-2560/9/5/056002.
- Soussen C, Idier J, Brie D, Duan J. From Bernoulli-Gaussian deconvolution to sparse signal restoration. IEEE Trans Signal Processing. 2011;59(10):4572–4584.
- Sutrala SR, Norton N, Williams NM, Buckland PR. Gene copy number variation in schizophrenia. Am J Med Genet B Neuropsychiatr Genet. 2008;147B(5):606–611. doi: 10.1002/ajmg.b.30645.
- Szycik GR, Münte TF, Dillo W, Mohammadi B, Samii A, Emrich HM, Dietrich DE. Audiovisual integration of speech is disturbed in schizophrenia: an fMRI study. Schizophr Res. 2009;110(1–3):111–118. doi: 10.1016/j.schres.2009.03.003.
- Tang W, Cao H, Duan J, Wang YP. A compressed sensing based approach for subtyping of leukemia from gene expression data. J Bioinform Comput Biol. 2011;9(5):631–645. doi: 10.1142/s0219720011005689.
- Tropp JA. Greed is good: algorithmic results for sparse approximation. IEEE Trans Inf Theory. 2004;50(10):2231–2242.
- Wang J, Yang CY, Chen B. Sparse signal recovery based on lq (0 < q ≤ 1) minimization. In: 2011 International Conference on Multimedia and Signal Processing; Guilin, Guangxi, China; May 2011.
- Wright J, Yang AY, Ganesh A, Sastry SS, Ma Y. Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell. 2009;31(2):210–227. doi: 10.1109/TPAMI.2008.79.
- Xu Z, Chang X, Xu F. L1/2 regularization: a thresholding representation theory and a fast solver. IEEE Trans Neural Networks and Learning Systems. 2012;23(7):1013–1027. doi: 10.1109/TNNLS.2012.2197412.