Human Brain Mapping. 2001 Oct 30;15(1):54–66. doi: 10.1002/hbm.1061

Fast and precise independent component analysis for high field fMRI time series tailored using prior information on spatiotemporal structure

Kiyotaka Suzuki 1,2, Tohru Kiryu 2, Tsutomu Nakada 1,3
PMCID: PMC6872029  PMID: 11747100

Abstract

Independent component analysis (ICA) has been shown to be a promising tool for the analysis of functional magnetic resonance imaging (fMRI) time series. Previous studies, however, used general-purpose algorithms for performing ICA, and neither the computational efficiency nor the accuracy of the elicited neuronal activations has been discussed in much detail. We have previously proposed a direct search method for improving computational efficiency. The method, which is based on independent component-cross correlation-sequential epoch (ICS) analysis, utilizes a form of the fixed-point ICA algorithm and considerably reduces the time required for extracting desired components. At the same time, the accuracy of detecting physiologically meaningful components is much improved by tailoring the contrast function used in the algorithm. In this study, the direct search method was further improved by integrating an optimal contrast function. The functional resolution of activation maps could be controlled by a suitable selection of the contrast function derived from prior knowledge of the spatial patterns of the physiologically desired components. A simple skewness-weighted contrast function was verified to extract sufficiently precise activation maps from fMRI time series acquired with a 3.0 Tesla MRI system. Hum. Brain Mapping 15:54–66, 2001. © 2001 Wiley-Liss, Inc.

Keywords: fMRI, independent component analysis, fixed‐point algorithm, contrast function, sequential epoch analysis, overcomplete representation

INTRODUCTION

Independent component analysis (ICA) [Bell and Sejnowski, 1995; Comon, 1994] is a novel statistical signal processing technique whose applications have recently received much attention in both research and industrial fields. Compared with principal component analysis (PCA), which removes second-order correlations from observed signals, ICA further removes higher-order dependencies. McKeown et al. [1998a,b] originally showed that ICA enables powerful exploratory analysis of functional magnetic resonance imaging (fMRI) data by extracting spatially independent patterns, or maps, of both task-related activations and artifactual components. In contrast to univariate analytic methods, the most prominent feature of ICA as applied to fMRI analysis is that voxels, or brain regions, are correlated automatically. The problem of how to select the maps of physiological interest from among numerous statistically independent patterns, however, still has to be addressed in each experiment. As a solution to this issue, a hybrid technique combining ICA with sequential epoch analysis (SEA), termed independent component-cross correlation-sequential epoch (ICS) analysis [Nakada et al., 2000], was introduced.

SEA was devised to allow multi-state contrast analyses on blood oxygenation level dependent (BOLD) fMRI data from a single experimental session [Nakada et al., 1998]. The key idea of SEA in its original form is represented by a set of possible combinations, termed sequential epoch patterns, of fundamental contrasts resulting from multiple univariate regression analyses such as statistical parametric mapping (SPM) [Friston et al., 1995a,b]. A ‘fundamental contrast’ is interpreted as a specified pair of states between which conditional effects are to be statistically assessed. A reference state, e.g., the resting state, is usually assigned to one member of each pair of states, much as a common origin is set for the axes of a coordinate system. Sequential epoch patterns represent possible logical operations on the resultant activation maps of the fundamental contrasts. A given sequential epoch pattern is used to elucidate the specific function of an activated area of interest. ICS analysis is based on the fact that sequential epoch patterns are actually observed in the raw fMRI time series of sequential epoch paradigms that have an acceptable level of pixel misalignment. In ICS analysis, each sequential epoch pattern is used in turn as a hypothesized temporal reference function of physiologically significant components. Spatially independent patterns whose associated time courses correlate with a designated sequential epoch pattern in excess of a predefined threshold, e.g., r > 0.7, may be considered potential activation maps of functionally discrete areas that participate in the task specified by the sequential epoch pattern. The direct, non-invasive, in vivo demonstration of the dual representation of the human primary motor cortex [Nakada et al., 2000] is a convincing example of the enhanced capability of identifying brain activation afforded by ICS analysis. ICS analysis indeed appears to be a highly reliable and versatile method.

Due to the immensity of the data set, ICS analysis imposes as heavy a computational load as other ICA-based analyses. To mitigate this, we have proposed a direct exploratory method [Suzuki et al., 2000] using a fixed-point ICA algorithm [Hyvärinen, 1999]. Our choice of this algorithm for performing ICS analysis is based on the following features. The algorithm realizes a fast and reliable one-unit search by combining information-theoretic indices with the exploratory projection pursuit (EPP) approach. We have shown that, with appropriate initial unmixing weights assigned to the iteration process of the fixed-point algorithm, activation maps significantly relevant to a specified sequential epoch pattern can be selectively retrieved from the data one by one. This direct search technique represents a unique application of ICA in which prior knowledge of the mixing weights, or temporal patterns, of the desired components is used to control the pursuit direction. In contrast to adaptive neural algorithms, calculations in the fixed-point iteration are made in batch mode; in other words, a large number of data points, or the entire data set if possible, is used in each step of the iteration. In addition, the fixed-point algorithm has no arbitrary constants such as a learning rate, whereas the convergence of general neural network algorithms depends strongly on a deliberate choice of the learning rate; even with an optimal value, convergence is much slower than with the proposed algorithm. These features of the fixed-point algorithm yield fast and stable convergence. The class of neural algorithms based on stochastic gradient descent does have the advantage of adaptation in a non-stationary system. Adaptability, however, is an unnecessary property for the current fMRI analyses.

The contrast function for the one-unit fixed-point iteration should be carefully chosen because it plays an important role in the accurate separation of components. This is particularly important for the direct search technique because, for the purpose of overcomplete decomposition, each estimated independent component is removed from the original data prior to the subsequent search, and thus the accuracy of an extracted component cumulatively affects the accuracy of subsequent ones. Three definite forms of general-purpose contrast functions have been proposed for the fixed-point iteration [Hyvärinen, 1999]. In our experience, however, none of these is perfectly suitable for ICS analysis.

The principal aims of the present study can be summarized as follows: 1) to explore an appropriate contrast function for the identification of highly resolved activation maps using empirical knowledge of the density distribution of voxel values in the desired maps; 2) to provide an efficient and accurate ICA algorithm by integrating the resultant contrast function with the direct search method; and 3) to validate its performance with a representative fMRI data set.

METHODS

ICA Data Model and Decomposition Algorithm

Assuming that fMRI data are linear mixtures of spatially independent components [McKeown and Sejnowski, 1998], the ICA model can be represented by

$\mathbf{x}(k) = \mathbf{A}\,\mathbf{s}(k)$ (1)

where k is the index of voxels, $\mathbf{s}(k) = (s_1(k), \ldots, s_N(k))^T$ is a set of spatially independent maps, $\mathbf{x}(k) = (x_1(k), \ldots, x_N(k))^T$ is the observed fMRI time series, and $\mathbf{A}$ is an N × N unknown time-invariant mixing matrix. The matrix $\mathbf{A}$ is assumed to be invertible, or non-singular. Each column of $\mathbf{A}$ represents the time course of the corresponding map and is also referred to as the basis vector of the map if it is normalized to a unit vector. It should be noted that, unlike in PCA, the bases are allowed to be mutually non-orthogonal. Accordingly, a slight difference between time courses can be discriminated, giving a finer segregation of active brain regions.
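As a toy numerical illustration of the model in equation (1), the following NumPy sketch mixes synthetic independent maps with a random mixing matrix (all sizes and distributions here are hypothetical, chosen only for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 4, 1000                     # N "time points", K voxels (toy sizes)

# Rows of s are spatially independent maps s(k), indexed by voxel k.
s = rng.laplace(size=(N, K))

# Unknown time-invariant mixing matrix A; each column plays the role of
# the time course (basis vector) of the corresponding map.
A = rng.normal(size=(N, N))

# Observed "fMRI time series": each voxel's time course is a mixture of maps.
x = A @ s                          # x(k) = A s(k)
```

Because $\mathbf{A}$ is invertible, the maps can in principle be recovered by inverting the mixing; ICA estimates this inverse without knowing $\mathbf{A}$.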

In practice, ICA seeks an unmixing matrix W so that the vector

$\mathbf{y}(k) = \mathbf{W}\,\mathbf{x}(k)$ (2)

is an estimate of the independent maps $\mathbf{s}(k)$, up to permutation, signs, and amplitudes. In the learning process of neural algorithms, generally, the whole of $\mathbf{W}$ is updated at once, and the iteration minimizes the dependence among the elements of $\mathbf{y}$ as far as possible by evaluating mutual information or joint entropy. The one-unit fixed-point algorithm, conversely, finds a single independent component at a time in the manner of exploratory projection pursuit (EPP). In fixed-point ICA, an approximation of negentropy is used as the measure of non-Gaussianity, which is the natural index of the EPP approach and is essentially connected with ICA. The contrast function for finding one component is given by

$J_G(\mathbf{w}) = \left[ E\{G(\mathbf{w}^T \mathbf{x})\} - E\{G(v)\} \right]^2$ (3)

where E{·} denotes the expectation operator, $\mathbf{w} = (w_1, \ldots, w_N)^T$ is a weight vector, corresponding to one row of the matrix $\mathbf{W}$, under the constraint $E\{(\mathbf{w}^T \mathbf{x})^2\} = 1$, G is a non-quadratic function, and v is a standardized Gaussian variable. A single independent component, $y = \mathbf{w}^T \mathbf{x}$, is found at a local maximum of the function $J_G(\mathbf{w})$. If the data have been sphered, in other words uncorrelated up to second-order statistics so that $E\{\mathbf{x}\mathbf{x}^T\} = \mathbf{I}$, the algorithm for finding the maxima can be simplified. PCA, which is related to the singular value decomposition (SVD) in matrix decomposition theory, can be used to sphere the data, giving a sphering matrix $\mathbf{B}$ as

$\mathbf{B} = \mathbf{D}^{-1} \mathbf{E}$ (4)

where $\mathbf{E}$ denotes the transposed matrix of eigenvectors of the covariance $E\{\mathbf{x}\mathbf{x}^T\}$ and $\mathbf{D}$ the diagonal matrix whose elements are the square roots of the corresponding eigenvalues. The sphered data $\hat{\mathbf{x}}$ and the unmixing vector $\hat{\mathbf{w}}$ for a single component in the sphered space relate to those in the original space as

$\hat{\mathbf{x}} = \mathbf{B}\mathbf{x}, \qquad \mathbf{w} = \mathbf{B}^T \hat{\mathbf{w}}$ (5)

respectively. The maxima of $J_G(\hat{\mathbf{w}})$ are obtained at the optima of $E\{G(\hat{\mathbf{w}}^T \hat{\mathbf{x}})\}$, which are given by the solutions of the following fixed-point equation:

$E\{\hat{\mathbf{x}}\, G'(\hat{\mathbf{w}}_0^T \hat{\mathbf{x}})\} = E\{\hat{\mathbf{w}}_0^T \hat{\mathbf{x}}\; G'(\hat{\mathbf{w}}_0^T \hat{\mathbf{x}})\}\, \hat{\mathbf{w}}_0$ (6)

where G′ is the derivative of G with respect to $y\ (= \hat{\mathbf{w}}^T \hat{\mathbf{x}})$ and $\hat{\mathbf{w}}_0$ is the value of $\hat{\mathbf{w}}$ at the optimum. Newton's method, whose convergence is quadratic, can be applied to derive the following iteration algorithm [Hyvärinen, 1999]:

$\hat{\mathbf{w}}^* = E\{\hat{\mathbf{x}}\, G'(\hat{\mathbf{w}}^T \hat{\mathbf{x}})\} - E\{G''(\hat{\mathbf{w}}^T \hat{\mathbf{x}})\}\, \hat{\mathbf{w}}, \qquad \hat{\mathbf{w}} \leftarrow \hat{\mathbf{w}}^* / \|\hat{\mathbf{w}}^*\|$ (7)

where ∥·∥ denotes the Euclidean vector norm and $\hat{\mathbf{w}}^*$ is the updated value of $\hat{\mathbf{w}}$. Because an independent component is given by a projection of the multidimensional sphered data onto the axis defined by the weight vector $\hat{\mathbf{w}}$, the vector $\hat{\mathbf{w}}$ is also referred to as the basis vector of the corresponding independent component. The basis vector here is defined in the sphered space and has a one-to-one correspondence to the aforementioned basis vector defined in the original space; it is the sphering matrix that expresses this spatial transformation. The unmixing matrix $\mathbf{W}$ in the original space can be represented as

$\mathbf{W} = \hat{\mathbf{W}} \mathbf{B}$ (8)

where $\hat{\mathbf{W}} = (\hat{\mathbf{w}}_1, \ldots, \hat{\mathbf{w}}_N)^T$ is an orthonormal, or rotation, matrix satisfying $\hat{\mathbf{W}} \hat{\mathbf{W}}^T = \mathbf{I}$. Accordingly, the multi-unit fixed-point algorithm is readily obtained. When n (< N) independent components, i.e., the n vectors $\hat{\mathbf{w}}_1, \ldots, \hat{\mathbf{w}}_n$, have been estimated, the subsequent component $\hat{\mathbf{w}}_{n+1}$ can be estimated by subtracting from $\hat{\mathbf{w}}_{n+1}$ its projections onto the previous n vectors after every iteration step [Hyvärinen, 1999], as follows:

$\hat{\mathbf{w}}_{n+1} \leftarrow \hat{\mathbf{w}}_{n+1} - \sum_{j=1}^{n} (\hat{\mathbf{w}}_{n+1}^T \hat{\mathbf{w}}_j)\, \hat{\mathbf{w}}_j, \qquad \hat{\mathbf{w}}_{n+1} \leftarrow \hat{\mathbf{w}}_{n+1} / \|\hat{\mathbf{w}}_{n+1}\|$ (9)

This procedure, however, was not used in our direct search method, because we had found that overcomplete bases, or quasi‐orthogonal basis vectors, could be obtained automatically by removing the estimated components sequentially from the data, and that the flexibility of separating fMRI data into components could be increased. Although it seems natural to say that if the sources were exactly independent then projecting out independent components would lead to orthogonal bases, this statement is correct only when the number of true sources is less than or equal to the number of fMRI time steps and, at the same time, a perfect contrast function for distinguishing the sources is used. A fine distinction between the components, which is our aim in this study, implies that the total number of latent components, including noise, may possibly be larger than that of the fMRI time steps. A set of quasi‐orthogonal bases leads to an overcomplete representation and is generally applicable to the data extending over an extremely high dimensional space. To make this quasi‐orthogonal search reliable, again, accurate estimation of a single component is required.
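The two building blocks above, PCA sphering (eq. 4) and the Newton-type one-unit update (eq. 7), can be sketched as follows. This is a minimal NumPy illustration on hypothetical super-Gaussian data, not the authors' implementation; the nonlinearity used here is the log cosh contrast $G_a$ (eq. 10) with $a_1 = 1$, for which $G' = \tanh$ and $G'' = 1 - \tanh^2$:

```python
import numpy as np

def sphere(x):
    """PCA sphering (eq. 4): returns x_hat = B x with E{x_hat x_hat^T} = I."""
    cov = x @ x.T / x.shape[1]
    vals, vecs = np.linalg.eigh(cov)
    B = np.diag(1.0 / np.sqrt(vals)) @ vecs.T   # D^{-1} E, D_ii = sqrt(eigenvalue)
    return B @ x, B

def one_unit_fastica(x_hat, w, g, g_prime, n_iter=200, tol=1e-10):
    """One-unit fixed-point iteration (eq. 7):
    w* = E{x_hat G'(w^T x_hat)} - E{G''(w^T x_hat)} w, then renormalize."""
    w = w / np.linalg.norm(w)
    for _ in range(n_iter):
        u = w @ x_hat
        w_new = (x_hat * g(u)).mean(axis=1) - g_prime(u).mean() * w
        w_new /= np.linalg.norm(w_new)
        if 1.0 - abs(w_new @ w) < tol:          # converged up to sign
            return w_new
        w = w_new
    return w

# Derivatives of the log cosh contrast G_a with a1 = 1.
g = np.tanh
g_prime = lambda u: 1.0 - np.tanh(u) ** 2

# Synthetic demo: two super-Gaussian (Laplacian) sources, random mixing.
rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 20000))
A = rng.normal(size=(2, 2))
x_hat, B = sphere(A @ s)
w = one_unit_fastica(x_hat, rng.normal(size=2), g, g_prime)
y = w @ x_hat   # one recovered independent component (sign/scale ambiguous)
```

The recovered component should closely match one of the sources up to sign and amplitude, reflecting the indeterminacies noted after equation (2).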

Determining the contrast function is equivalent to making a selection of the nonlinear transfer function G in equation (3). The following choices for non‐linearity have been proposed [Hyvärinen, 1999; Hyvärinen and Oja, 1997]:

$G_a(u) = \frac{1}{a_1} \log \cosh(a_1 u)$ (10)
$G_b(u) = -\frac{1}{a_2} \exp(-a_2 u^2 / 2)$ (11)
$G_c(u) = \frac{u^4}{4}$ (12)

where $1 \le a_1 \le 2$ and $a_2 \approx 1$ are constants. Roughly speaking, functions growing more slowly than quadratically are suitable for estimating super-Gaussian densities, and conversely, functions growing faster than quadratically may fit sub-Gaussian densities [Hyvärinen, 1999]. The benefits of these functions are summarized from a theoretical point of view as follows: $G_a$ is a good general-purpose function; $G_b$ is robust and better for highly super-Gaussian components; $G_c$ is useful only for estimating sub-Gaussian components without outliers.
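In NumPy, these three general-purpose nonlinearities and the first derivatives $G'$ that appear inside the fixed-point update (eq. 7) read as follows (a minimal sketch; $a_1$ and $a_2$ are simply set to 1 here):

```python
import numpy as np

a1, a2 = 1.0, 1.0   # constants with 1 <= a1 <= 2 and a2 ~ 1

def G_a(u): return np.log(np.cosh(a1 * u)) / a1       # eq. (10), general purpose
def G_b(u): return -np.exp(-a2 * u**2 / 2.0) / a2     # eq. (11), robust/super-Gaussian
def G_c(u): return u**4 / 4.0                         # eq. (12), kurtosis (sub-Gaussian)

# First derivatives G', used in the fixed-point update (eq. 7):
def g_a(u): return np.tanh(a1 * u)
def g_b(u): return u * np.exp(-a2 * u**2 / 2.0)
def g_c(u): return u**3
```

Note how $G_a$ grows more slowly than quadratically and $G_c$ faster, matching their suitability for super- and sub-Gaussian densities, respectively.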

Candidates for the Optimal Contrast Function

Figure 1 shows a sample activation map revealed by ICS analysis with an infomax neural algorithm [Amari, 1998; Bell and Sejnowski, 1995; Linsker, 1992], together with the intensity distribution of the map. The colored pixels that assemble to form a cluster corresponding to an active brain region were selected with the intensity threshold indicated by the arrow in the distribution. Less than one percent of all pixels exceeded the cut-off value and thus contributed significantly to the map. It follows that, in activation maps in general, the major part of the distribution is governed by less significant background pixels. The point is that the robustness of an ICA algorithm leads to a lack of sensitivity to such peculiar distributions. The principal features of non-Gaussian distributions are asymmetry and sparsity. For the intensity distribution shown in Figure 1, the distinctive feature is the asymmetry created by outlying intensities rather than sparsity. Theoretically, odd transfer functions can be used to measure asymmetry directly, and even functions to measure sparsity and bimodality. Equations (10)–(12) are all even, however, reflecting the fact that real-world signals such as voices mostly have symmetric distributions; thus, none of these functions may perfectly fit our case, and an optimal transfer function should be investigated for the analysis of fMRI data.

Figure 1.


Representative activation map (a) and its intensity distribution (b) obtained by independent component, cross correlation, sequential epoch (ICS) analysis in a hand motion study. The abscissa in (b) indicates pixel intensity and the ordinate indicates relative frequency of the intensity actually counted on the map. The distribution was normalized to zero mean and unit variance. The arrow in the distribution represents the cut‐off value used for selecting the significant pixels. The pixels above the cut‐off value were colored as indicated by the scale on the right side of the map.

A fundamental statistical measure of asymmetry is skewness, or the third-order cumulant, which can be estimated with the following simple odd function:

$G_d(u) = \frac{u^3}{3}$ (13)

Similarly, it is now obvious that equation (12) gives the principal measure of sparsity, namely kurtosis, or the fourth-order cumulant. It has been mathematically proven that the square of the kurtosis can serve as an approximation of negentropy [Comon, 1994] and that its local maxima provide exactly the independent components [Delfosse and Loubaton, 1995]. By truncating the Edgeworth expansion series of the KL divergence at fourth order, the marginal negentropy J is approximated as follows [Comon, 1994]:

$J \approx \frac{1}{12} k_3^2 + \frac{1}{48} k_4^2$ (14)

where $k_n$ denotes the nth-order cumulant of the distribution. Equation (12) was derived under the assumption that the distribution is symmetric, so that the third-order cumulant is negligible. In practice, however, the higher-order cumulants often provide a poor approximation of entropy, for the following reasons: 1) finite-sample estimates of higher-order cumulants are highly sensitive to outlying signals; in other words, the estimates are easily affected by a few erroneous observations of relatively large absolute value; 2) the higher-order cumulants mainly measure the tail of the distribution and are largely unaffected by its central part, even if their values were perfectly estimated [Friedman, 1987; Hyvärinen, 1998]. Historically, equations (10) and (11) were introduced to provide more reliable approximations of negentropy than the cumulant-based approaches, with increased sensitivity to the center of the distribution.
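The outlier sensitivity noted in point 1 is easy to demonstrate numerically. In the following sketch (with illustrative names and sizes), a single large value appended to 10,000 standardized Gaussian samples barely moves the variance but shifts the sample skewness and kurtosis dramatically:

```python
import numpy as np

def k3(u):
    """Sample third-order cumulant (skewness) of standardized data."""
    return np.mean(u**3)

def k4(u):
    """Sample fourth-order cumulant (kurtosis) of standardized data."""
    return np.mean(u**4) - 3.0

rng = np.random.default_rng(1)
z = rng.normal(size=10_000)
z = (z - z.mean()) / z.std()        # normalize to zero mean, unit variance

z_out = np.append(z, 15.0)          # inject one outlying observation

# For the Gaussian sample, k3 and k4 are near zero. The single outlier
# contributes about 15^3/n to k3 and 15^4/n to k4, swamping both estimates,
# while the variance changes by only about 15^2/n ~ 0.02.
```

This is precisely the "weakness" that, as argued below, becomes a desirable property when the informative part of an activation map's distribution is its tail.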

In our view, however, the weaknesses of the cumulant-based estimation described above are precisely the properties we require of a projection pursuit index. Assuming that the true sparsity, given without outlying values, is approximately the same among the significant components, the skewness measure $G_d$ is expected to perform better than $G_a$, $G_b$, and $G_c$ in extracting highly resolved activation maps. Moreover, $G_d$ is the fastest to compute. The corresponding fixed-point iteration is simply given as

$\hat{\mathbf{w}}^* = E\{\hat{\mathbf{x}}\, (\hat{\mathbf{w}}^T \hat{\mathbf{x}})^2\}$ (15)

where the normalization condition is dropped for simplicity. In practice, the expectation E{·} is replaced by the sample mean.
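A minimal NumPy sketch of this skewness-based iteration (eq. 15), applied to synthetic skewed sources for illustration only (the data, dimensions, and mixing here are hypothetical, not those of the fMRI study; renormalization is reinstated after each step):

```python
import numpy as np

def skew_fixed_point(x_hat, w, n_iter=100):
    """Skewness-based one-unit iteration (eq. 15): w* = E{x (w^T x)^2},
    renormalized after every step; the sample mean substitutes for E{.}."""
    w = w / np.linalg.norm(w)
    for _ in range(n_iter):
        u = w @ x_hat
        w = (x_hat * u**2).mean(axis=1)
        w /= np.linalg.norm(w)
    return w

# Demo on two skewed (exponential, zero-mean) sources.
rng = np.random.default_rng(2)
s = rng.exponential(size=(2, 20000)) - 1.0
A = np.array([[1.0, 0.6], [0.4, 1.0]])
x = A @ s

# PCA sphering (eq. 4).
vals, vecs = np.linalg.eigh(x @ x.T / x.shape[1])
B = np.diag(1.0 / np.sqrt(vals)) @ vecs.T
x_hat = B @ x

w = skew_fixed_point(x_hat, rng.normal(size=2))
y = w @ x_hat          # recovered skewed component
```

Because the update involves only a squaring and one matrix-vector mean, each step is cheaper than the tanh- or Gaussian-based updates, consistent with the remark that $G_d$ is the fastest to compute.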

Still higher-order odd functions may also be used. To investigate this possibility, the fifth-order cumulant measure was also examined. The exact forms of the transfer function and the derived fixed-point algorithm are

$G_e(u) = \frac{u^5}{5}$ (16)
$\hat{\mathbf{w}}^* = E\{\hat{\mathbf{x}}\, (\hat{\mathbf{w}}^T \hat{\mathbf{x}})^4\} - 4 E\{(\hat{\mathbf{w}}^T \hat{\mathbf{x}})^3\}\, \hat{\mathbf{w}}$ (17)

respectively. Although the fifth-order cumulant does not by itself provide a good approximation of entropy, and thus may not be an information-theoretic pursuit index, its extreme sensitivity to the tail of the distribution may provide a subtle distinction among the components. Figure 2 shows the nonlinear transfer functions mentioned above plotted over the representative intensity distribution.

Figure 2.


Nonlinear transfer functions (dashed lines) examined in this study. The solid line indicates the intensity distribution shown in Figure 1. It is obvious that the cumulant-based functions $G_c$, $G_d$, and $G_e$ give more weight to the tail of the intensity distribution than the robust functions $G_a$ and $G_b$.

fMRI Data Acquisition

An 18‐year‐old right‐handed normal male volunteer participated in our sequential epoch hand motion study. Informed consent was obtained from the subject and the study was performed according to the human research guidelines of the Internal Review Board of the University of Niigata. The subject was instructed to perform grasp motions with either one or both hands at approximately one grasp per second during an epoch. Visual cues were presented to indicate state change. Each session consisted of nine 30 sec epochs in the sequence of r‐R‐L‐B‐r‐R‐L‐B‐r, where r, R, L, and B represent rest, right hand motion, left hand motion, and bilateral hand motion, respectively. Gradient echo echo‐planar images (EPI) were obtained using a General Electric SIGNA 3.0 Tesla system equipped with an Advanced NMR EPI module. The following parameters were used for data acquisition: FOV 40 × 20 cm; matrix 128 × 64; slice thickness 5 mm; inter‐slice gap 2.5 mm; TR 1 sec. Four slices were consecutively acquired within each TR. FMRI time series consisted of 270 consecutive EPI images for each slice. Spatial resolution was approximately 3 × 3 × 5 mm.

Component Extraction

The hand motion paradigm has six possible sequential epoch patterns, as shown in Figure 3. When each pattern is used as a temporal reference function in ICS analysis, it should be shifted backward by about 6 sec to take the hemodynamic delay into account. Our direct search technique was utilized to extract the physiologically significant components relevant to a specified sequential epoch pattern. The technique initiates the fixed-point iteration (7) with the basis vector given below:

$\hat{\mathbf{w}}_{\mathrm{init}} = \mathbf{B}\mathbf{b} / \|\mathbf{B}\mathbf{b}\|$ (18)

where $\mathbf{b}$ is a column vector representing the target sequential epoch pattern normalized to zero mean, and $\mathbf{B}$ is the sphering matrix given by equation (4). It should be noted that a priori information on the temporal pattern of a desired component is used here. The single spatial pattern whose associated time course has the highest correlation with the designated sequential epoch pattern $\mathbf{b}$ among the potential components may thus be obtained. If the correlation meets the predetermined criterion, r > 0.7, the extracted spatial pattern is possibly a significant map showing the activated areas relevant to the specific neurophysiological function suggested by the sequential epoch pattern. Its time series, i.e., the spatiotemporal component, is then removed from the data for the subsequent search. Once the correlation falls below the criterion, no other significant components may remain in the data.
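A simplified sketch of this direct search loop is given below, using the skewness iteration (eq. 15) and the initialization of equation (18) on hypothetical synthetic data. Re-sphering the residual data after each removal is one possible reading of the sequential-removal step, not necessarily the authors' exact implementation:

```python
import numpy as np

def direct_search(x, b, r_thresh=0.7, max_components=10, n_iter=100):
    """Direct search: initialize from the epoch pattern b (eq. 18), run the
    skewness fixed-point iteration (eq. 15), keep the component if its time
    course correlates with b above r_thresh, remove it, and repeat."""
    b = b - b.mean()
    maps = []
    while len(maps) < max_components:
        # PCA sphering of the (residual) data, dropping null directions.
        vals, vecs = np.linalg.eigh(x @ x.T / x.shape[1])
        keep = vals > 1e-10 * vals.max()
        B = np.diag(1.0 / np.sqrt(vals[keep])) @ vecs[:, keep].T
        x_hat = B @ x
        w = B @ b
        w /= np.linalg.norm(w)                 # initial basis vector (eq. 18)
        for _ in range(n_iter):                # skewness iteration (eq. 15)
            u = w @ x_hat
            w = (x_hat * u**2).mean(axis=1)
            w /= np.linalg.norm(w)
        y = w @ x_hat                          # candidate spatial map
        t = x @ y / (y @ y)                    # associated time course
        r = np.corrcoef(t, b)[0, 1]
        if abs(r) < r_thresh:                  # no significant map remains
            break
        maps.append(y)
        x = x - np.outer(t, y)                 # remove spatiotemporal component
    return maps

# Hypothetical demo: one skewed task-related map plus Gaussian background.
rng = np.random.default_rng(3)
T, K = 40, 2000
b = np.tile(np.repeat([0.0, 1.0], 10), 2)      # toy epoch pattern
task_map = rng.exponential(size=K) * (rng.random(K) < 0.05)
x = np.outer(b - b.mean(), task_map)
x += 0.3 * rng.normal(size=(T, 8)) @ rng.normal(size=(8, K))
maps = direct_search(x, b)
```

On this toy data, the loop extracts the embedded task-related map and then terminates once the correlation criterion fails, mirroring the stopping rule described above.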

Figure 3.


Possible sequential epoch patterns. The letters r, R, L, and B correspond to rest, right hand motion, left hand motion, and bilateral hand motion, respectively. Six patterns of task‐related activation can be considered in this study: (A) right hand motion in general, (B) left hand motion in general, (C) non‐specific hand motion, (D) exclusive right hand motion, (E) exclusive left hand motion, and (F) bilateral hand motion. Provided that each pattern has been embedded in the data, a pattern can be used as a temporal reference function of significant components. ICS analysis, a hybrid method in which sequential epoch analysis is combined with ICA, has shown capability of detailed neurophysiological assessment [Nakada et al., 2000].

The following processing was carried out before ICS analysis: 1) pixels outside a rectangular bound containing the whole cross section of the brain were excluded from the analysis, leaving 1,188 pixels; 2) spatial smoothing was applied by convolution with a 5 mm full width at half maximum (FWHM) Gaussian kernel [Friston et al., 1995b] to minimize the effects of pixel misalignment due to brain motion during the experimental session, whereas no temporal smoothing was applied, so that the intrinsic temporal resolution was retained to allow a fine distinction between components; 3) the mean value of the time series was subtracted individually at each pixel, to conform to the direct search method.
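As an illustration of steps 2 and 3, a 1-D Gaussian kernel for a given FWHM and the per-pixel temporal mean removal might look like the following sketch. The pixel size is assumed from the stated 3 × 3 mm in-plane resolution, and actual in-plane smoothing would of course convolve in 2-D:

```python
import numpy as np

def gaussian_kernel_1d(fwhm_mm, pixel_mm):
    """1-D Gaussian smoothing kernel; sigma = FWHM / (2 sqrt(2 ln 2))."""
    sigma = fwhm_mm / (2.0 * np.sqrt(2.0 * np.log(2.0))) / pixel_mm
    radius = int(np.ceil(3.0 * sigma))
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t**2 / (2.0 * sigma**2))
    return k / k.sum()                  # normalized to unit sum

k = gaussian_kernel_1d(5.0, 3.0)        # 5 mm FWHM on 3 mm pixels

# Step 3: subtract the temporal mean individually at each pixel
# (illustrative random data shaped like 270 scans x 1188 pixels).
rng = np.random.default_rng(4)
x = rng.normal(loc=5.0, size=(270, 1188))
x_demeaned = x - x.mean(axis=0, keepdims=True)
```

Removing the temporal mean at every pixel is what makes the zero-mean assumption behind the sphering step (eq. 4) and the direct search initialization (eq. 18) hold.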

The 6‐sec delayed sequential epoch pattern of “left hand motion in general” (denoted B in Fig. 3) was chosen for the representative temporal reference function for validation of the cumulant‐based contrast functions.

RESULTS AND DISCUSSION

Figure 4 shows the activation maps and associated time courses extracted with each contrast function. Only pixels whose intensity u exceeded 5.0 were colored, as indicated in Figure 1. The intensity distribution can be roughly regarded as a Gaussian distribution with unit variance, and thus Gaussian random field theory [Worsley, 1994] is applicable to the statistical inference for the intensity threshold under the null hypothesis that no pixel correlates significantly with a designated sequential epoch pattern. In our case, the cut-off condition u > 5.0 gives approximately $p < 10^{-3}$. The subject's own anatomical image, obtained at the identical location, was used for anatomical identification of activated areas. Only one component was obtained with either $G_a$ or $G_b$, whereas two components were separately extracted with $G_c$. The component obtained with $G_a$ was almost identical to that of $G_b$, and the two independent areas revealed by $G_c$ were consistently parts of the area revealed by $G_a$ or $G_b$. The number of separated maps was further increased to four by the skewness measure $G_d$. Indeed, these accurately separated components, except the one with the lowest correlation coefficient, could be assigned to functionally discrete areas within the right primary sensorimotor cortex responsible for left hand motion. The physiological interpretation of these components has been described in detail by Nakada et al. [2000]. Furthermore, five independent maps were obtained with the fifth-order function $G_e$. The cluster sizes, however, were too small for physiological assessment in this hand motion study, given the current state of knowledge regarding signal sources governed by a complex mechanism. Statistically valid comparisons between the results of different contrast functions are virtually impossible, because our decomposition method, like other ICA-based methods, relies on an ‘assumption’ about the distribution forms of the source signals. It is of course possible that the clusters may prove to be pure mathematical products without significant physiological meaning. Nevertheless, because the ultimate unit of human activation, a phenomenon that represents spatial independence of function, is axiomatically considered to be the single neuron, the ability to resolve such minutely separated maps no doubt holds high potential for determining the underlying neuronal mechanisms of cortical function in the future.

Figure 4.


Significant spatial patterns (r > 0.7) and associated time courses extracted for the representative sequential epoch pattern B. Red lines represent the 6‐sec delayed temporal reference function normalized to zero mean and unit norm. Blue lines show the extracted time courses. The pixels of relatively high magnitude (u > 5.0) are colored as shown in Figure 1. Cumulant‐based contrast functions extracted more significant components than the robust general‐purpose contrast functions.

Figure 5 shows the voxels shared by components extracted using the cumulant-based contrast functions. As the number of extracted maps increased, the number of voxels whose raw time course was decomposed into more than one component also increased. This observation suggests that slightly different responses mixed in the time course of a single voxel can be recovered by contrast functions with an extreme sensitivity to the tail of the distribution. It is interesting that, regardless of the contrast function chosen, a Gaussian-like form in the central part of the density distribution was common to all the extracted maps (Fig. 6).

Figure 5.


The areas shared by more than one extracted map. The yellow pixels indicate areas significant on only one map. The pixels colored green and red indicate areas shared by two and three maps, respectively. As the capability of the contrast function to separate out spatial patterns increases, the shared areas extend; correspondingly, slighter differences in time courses can be identified.

Figure 6.


Intensity distributions of extracted maps. One representative distribution is plotted for each cumulant‐based contrast function. Gaussian‐like distributions were obtained regardless of the lower sensitivity of the contrast functions to the center of distribution.

To evaluate the convergence properties, the evolution of the average absolute error in the elements of the estimated basis vector was recorded during the iteration process (Fig. 7). The number of iteration steps required for convergence is shown in Table I for each of the extracted components. The average error of the basis vector was calculated as

$e = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{w}_i - \hat{w}_{c,i} \right|$ (19)

where $\hat{\mathbf{w}}_c$ represents the basis vector obtained at convergence. Unsettled convergence was evident for both $G_a$ and $G_b$, implying that these transfer functions yielded a large step size. The oscillating error sequence could be stabilized by a multiplicative learning constant of less than unity; many more iteration steps would then be needed, however, owing to the loss of the quadratic convergence of Newton's method. By contrast, stable convergence was shown for $G_c$; that is, after a fast initial decrease, the error gradually decreased to zero. This is possibly because $G_c$ is sensitive to the tail of the distribution, by which the extracted maps are well characterized. Both $G_d$ and $G_e$ showed stable convergence comparable to that of $G_c$ on every component.

Figure 7.


Convergence of the absolute error. Every cumulant‐based contrast function exhibits stable and fast convergence.

Table I.

Iteration numbers at which a true basis was found

Nonlinearity   Component   Iterations
G_a            —           41
G_b            —           45
G_c            A           30
               B           98
G_d            A           42
               B           97
               C           43
               D           47
G_e            A           23
               B           29
               C           10
               D           19
               E           23

The angles between the basis vectors estimated with $G_d$ and $G_e$ were calculated for all possible pairs to determine whether the quasi-orthogonality condition held. The angle θ was computed as

$\theta = \cos^{-1}(\hat{\mathbf{w}}_i^T \hat{\mathbf{w}}_j)$ (20)

As listed in Table II, every angle was close to, though not exactly, 90°, indicating that quasi-orthogonal bases were successfully obtained.

Table II.

Angles between basis vectors (degrees)

       B      C      D      E
G_d
 A     79.9   89.4   87.3
 B            89.1   85.4
 C                   85.8
G_e
 A     84.1   86.7   87.8   83.4
 B            87.4   89.5   84.0
 C                   87.6   82.9
 D                          85.4
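Equation (20) is the usual inner-product formula for the angle between unit vectors; a small helper (illustrative only) that reproduces this computation for all pairs at once:

```python
import numpy as np

def pairwise_angles_deg(W):
    """Angles in degrees between all pairs of row vectors of W (eq. 20)."""
    Wn = W / np.linalg.norm(W, axis=1, keepdims=True)   # ensure unit norm
    cosines = np.clip(Wn @ Wn.T, -1.0, 1.0)             # guard arccos domain
    return np.degrees(np.arccos(cosines))
```

Angles near, but not exactly, 90° are exactly what a quasi-orthogonal (overcomplete) set of basis vectors produces.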

Information-theoretic approaches to projection pursuit suggest that, given the true density f(u) of a source signal to be extracted, the optimal transfer function is of the form G(u) = log f(u). Because it is difficult to assume or correctly estimate the true density a priori, alternative indices have been investigated for projection pursuit. Several effective contrast functions possessing both sensitivity to the center of the distribution and tolerance of outliers have been proposed, and these contrast functions firmly connected projection pursuit to ICA. Real-world signals are mostly well fitted by the robust contrast functions; fMRI time series, however, are an exception. As shown in Figure 1, the actual distribution is highly super-Gaussian, so that either log cosh(u) (= $G_a$) or the Gaussian function (= $G_b$) would be recommended within the information-theoretic framework. One should be aware, however, that the degree of regional specificity obtained with these functions was very low; that is to say, the robustness of these functions gives only robust solutions. In our view, the tail of the intensity distribution of a map provides the major information regarding active brain regions, whereas the central part carries little information and is thus relatively insignificant. For this reason, the classical cumulant measures, which are mostly less useful for separating real signals because of fluctuations caused by excessively large erroneous observations, were examined in this study to determine whether they have a superior capability for extracting activation maps from fMRI time series when used as projection indices.

As expected, higher‐order cumulant measures, which give more weight to the tail of the distribution, performed better in our fMRI study than the robust transfer functions. Moreover, the fact that the third‐order function u^3 extracted more significant maps than the fourth‐order function u^4 suggested that asymmetry of the intensity distribution is also a key index for decomposition. Indeed, a group of outlying intensities on one side of the distribution was the exact cause of this asymmetry. Regional specificity was further increased by the fifth‐order function u^5. Given the spatial smoothness of fMRI time series, the mean extent of the clusters obtained with the fifth‐order function was extremely small. Although the utility of such finely resolved maps must still be evaluated from the neurophysiological aspect, it is technically significant that the resolution of the resulting activation maps can be adjusted as desired. In conclusion, our study showed that the skewness measure, G_d(u) = u^3/3, was the most appropriate transfer function among those examined for use as a projection index, because it provided a degree of functional resolution high enough that each functionally independent area could be assessed physiologically.

The use of skew distributions for extracting images from biomedical data has been described by Porrill et al. [2000], and the current study is based on a similar concept. Nevertheless, our algorithm exhibits several advantages. Beyond representing the first successful application of the skewness measure to the analysis of fMRI, the study clearly demonstrates that the functional resolution of significant components can be controlled by the choice of contrast function used in the direct search method. Furthermore, it demonstrates the importance of overcomplete representation for the analysis of fMRI data.

A brief comment on the advantages of our method over classical hypothesis‐based methods such as SPM may be helpful for practical application. To extract physiologically meaningful components with such methods, one must rely on a priori knowledge of the expected temporal variations; that is, a complete set of basis functions representing the signal components has to be determined in advance. Accurately specifying all the temporal variations, however, is virtually impossible, so analytical errors are inevitable in classical hypothesis‐driven methods. Furthermore, similar bases may share weights and are not necessarily separated into distinct physiological components. Our method can overcome these principal limitations of classical hypothesis‐driven methods.

In summary, we have introduced a tailored ICA algorithm, especially suitable for the analysis of high field fMRI time series, obtained by investigating a class of contrast functions based on classical higher‐order cumulant measures. The fixed‐point iteration is launched from an optimal starting point derived from a specified temporal pattern and then guided to converge efficiently on significant components with sufficient accuracy by employing the optimal contrast function, which is derived from prior knowledge of the intensity distribution of the desired spatial patterns.
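As a rough illustration of the overall scheme, the sketch below runs a one‐unit fixed‐point iteration in the style of Hyvärinen and Oja [1997] with the skewness contrast G_d(u) = u^3/3 (so g(u) = u^2, g′(u) = 2u), started from a seed direction rather than at random, echoing the direct‐search idea of launching from a point derived from prior information. The toy sources, mixing matrix, and seed vector are our own assumptions, not the paper's fMRI pipeline:

```python
import numpy as np

def one_unit_fixed_point(Z, w0, g, g_prime, max_iter=200, tol=1e-8):
    """One-unit fixed-point ICA update on whitened data Z (dims x samples),
    started from the seed vector w0."""
    w = w0 / np.linalg.norm(w0)
    for _ in range(max_iter):
        u = w @ Z                                    # current source estimate
        w_new = (Z * g(u)).mean(axis=1) - g_prime(u).mean() * w
        w_new /= np.linalg.norm(w_new)
        if 1.0 - abs(w_new @ w) < tol:               # converged up to sign
            return w_new
        w = w_new
    return w

# Skewness-based contrast G_d(u) = u^3/3: g(u) = u^2, g'(u) = 2u.
g = lambda u: u**2
g_prime = lambda u: 2.0 * u

# Toy data: one strongly skewed source (the "activation map") and one
# symmetric source, linearly mixed.
rng = np.random.default_rng(1)
s1 = rng.exponential(1.0, 20_000)                    # skewed source
s2 = rng.standard_normal(20_000)                     # symmetric source
S = np.vstack([(s1 - s1.mean()) / s1.std(), s2])
A = np.array([[2.0, 1.0], [1.0, 2.0]])               # hypothetical mixing
X = A @ S

# Whiten the mixtures.
X -= X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(X))
Z = (E / np.sqrt(d)).T @ X

# Seed from a rough prior direction (in the paper, the seed comes from a
# specified temporal pattern) instead of a random start.
w = one_unit_fixed_point(Z, np.array([1.0, 0.2]), g, g_prime)
u = w @ Z
print("skewness of recovered component:", np.mean(u**3))
```

Because the contrast is odd, the iteration is drawn to the positively skewed direction, recovering the skewed source without scanning all components, which is what makes the direct search efficient.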

REFERENCES

  1. Amari S (1998): Natural gradient works efficiently in learning. Neural Comput 10: 251–276.
  2. Bell AJ, Sejnowski TJ (1995): An information‐maximization approach to blind separation and blind deconvolution. Neural Comput 7: 1129–1159.
  3. Comon P (1994): Independent component analysis, a new concept? Signal Process 36: 287–314.
  4. Delfosse N, Loubaton P (1995): Adaptive blind separation of independent sources: a deflation approach. Signal Process 45: 59–83.
  5. Friedman JH (1987): Exploratory projection pursuit. J Am Stat Assoc 82: 249–266.
  6. Friston KJ, Holmes AP, Worsley KJ, Poline JP, Frith CD, Frackowiak RSJ (1995a): Statistical parametric maps in functional imaging: a general linear approach. Hum Brain Mapp 2: 189–210.
  7. Friston KJ, Holmes AP, Poline JP, Grasby PJ, Williams SCR, Frackowiak RSJ, Turner R (1995b): Analysis of fMRI time series revisited. Neuroimage 2: 45–53.
  8. Hyvärinen A (1998): New approximation of differential entropy for independent component analysis and projection pursuit. In: Jordan MI, Kearns MJ, Solla SA, editors. Advances in neural information processing systems 10. Cambridge, MA: MIT Press; p 273–279.
  9. Hyvärinen A (1999): Fast and robust fixed‐point algorithms for independent component analysis. IEEE Trans Neural Networks 10: 626–634.
  10. Hyvärinen A, Oja E (1997): A fast fixed‐point algorithm for independent component analysis. Neural Comput 9: 1483–1492.
  11. Linsker R (1992): Local synaptic learning rules suffice to maximize mutual information in a linear network. Neural Comput 4: 691–702.
  12. McKeown MJ, Jung T‐P, Makeig S, Brown GG, Kindermann SS, Lee T‐W, Sejnowski TJ (1998a): Spatially independent activity patterns in functional MRI data during the Stroop color‐naming task. Proc Natl Acad Sci USA 95: 803–810.
  13. McKeown MJ, Makeig S, Brown GG, Jung T‐P, Kindermann SS, Bell AJ, Sejnowski TJ (1998b): Analysis of fMRI data by blind separation into independent spatial components. Hum Brain Mapp 6: 160–188.
  14. McKeown MJ, Sejnowski TJ (1998): Independent component analysis of fMRI data: examining the assumptions. Hum Brain Mapp 6: 368–372.
  15. Nakada T, Fujii Y, Suzuki K, Kwee IL (1998): High‐field (3.0 T) functional MRI sequential epoch analysis: an example for motion control analysis. Neurosci Res 32: 355–362.
  16. Nakada T, Suzuki K, Fujii Y, Matsuzawa H, Kwee IL (2000): Independent component, cross correlation, sequential epoch (ICS) analysis of high‐field fMRI time series: direct visualization of dual representation of the primary motor cortex in human. Neurosci Res 37: 237–244.
  17. Porrill J, Stone JV, Berwick J, Mayhew J, Coffey P (2000): Analysis of optical imaging data using weak models and ICA. In: Girolami M, editor. Advances in independent component analysis. Berlin, London, New York: Springer‐Verlag; p 217–233.
  18. Suzuki K, Kiryu T, Nakada T (2000): An efficient method for independent component, cross correlation, sequential epoch analysis of functional magnetic resonance imaging. In: Pajunen P, Karhunen J, editors. Proceedings of Second International Workshop on Independent Component Analysis and Blind Signal Separation (ICA 2000). Helsinki, Finland: Helsinki University of Technology; p 309–314.
  19. Worsley KJ (1994): Local maxima and the expected Euler characteristic of excursion sets of χ2, F‐ and t‐fields. Adv Appl Probab 26: 13–41.
