Discussion on “Distributional independent component analysis for diverse neuroimaging modalities” by Ben Wu, Subhadip Pal, Jian Kang, and Ying Guo

Kan Keeratimahat; Thomas E Nichols

doi:10.1111/biom.13591

Author manuscript; available in PMC: 2023 Sep 1.

Published in final edited form as: Biometrics. 2021 Nov 15;78(3):1113–1117. doi: 10.1111/biom.13591

PMCID: PMC9107521 NIHMSID: NIHMS1750922 PMID: 34780664

Summary:

Wu et al. (2021) have made an important contribution to the methodology for data-driven analysis of MRI data. However, we wish to challenge the authors on new potential applications of their approach beyond diffusion tensor imaging data, and to think carefully about the impact of random initialization implicit in their method. We illustrate the variability found from re-analyzing the supplied demonstration data multiple times, finding that the discovered independent components have a wide range of reliability, from nearly perfect overlap to no overlap at all.

Keywords: DTI, fMRI, Independent Component Analysis, Multimodality neuroimaging

We congratulate Wu et al. (2021) on their work that advances probabilistic approaches for Independent Components Analysis (ICA). ICA has long played an essential role in the analysis of resting Functional Magnetic Resonance Imaging (fMRI) data (Jung et al., 2001) but it is now being used with other types of MRI data (Xu et al., 2009). They propose a Distributional ICA (DICA) that uses a generic mixture model to allow ICA to be applied on structured data, like that from Diffusion Tensor Imaging (DTI). We have a number of comments on the work and ideas for future directions.

The Wishart mixture for the DTI data is well-motivated, as the diffusion tensors are exactly the covariance of a Gaussian diffusion model. However, we were hoping to see some other examples where this approach could be deployed. Have they considered other types of highly structured imaging data that would be suitable for this method? For example, the current trend in diffusion MRI is towards macroscopic models that account for the heterogeneous diffusion in a voxel, such as Neurite Orientation Dispersion and Density Imaging (NODDI) (Zhang et al., 2012). NODDI produces multiple measures with complex non-Gaussian interdependence, but not in a manner consistent with a standard parametric model. Do they have ideas for mixture models when physics does not dictate the stochastic structure?

We wondered if another potential application is multisubject binary brain data, whether representing white matter hyperintensities (WMH) (Wardlaw et al., 2015), multiple sclerosis lesions (Bakshi et al., 2008) or stroke lesions (Karnath et al., 2018). After intersubject alignment, an ICA decomposition would be incredibly useful to discover distinct patterns of lesions over subjects. For example, for WMH there is an average pattern of anterior and posterior periventricular lesions, but what if half of the subjects mainly have anterior lesions and the other half mainly have posterior lesions? An ICA decomposition ideally would discover these as two distinct modes of variation. However, univariate Bernoulli mixtures are not identifiable and so we don’t see how the authors’ DICA strategy would be applicable to this important use case.
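The non-identifiability can be made concrete with a short worked calculation. The notation below (weights π_k and success probabilities p_k for a K-component mixture at a single voxel) is ours, introduced only for illustration.

```latex
% A K-component mixture of Bernoulli distributions for one binary voxel X
\Pr(X = 1) \;=\; \sum_{k=1}^{K} \pi_k \, p_k \;=\; \bar{p},
\qquad
\Pr(X = 0) \;=\; 1 - \bar{p},
\qquad
\sum_{k=1}^{K} \pi_k = 1 .

% Hence X \sim \mathrm{Bernoulli}(\bar{p}) whatever the individual (\pi_k, p_k).
% Two different parameter sets with the same marginal distribution:
%   K = 2:\ \pi = (0.5,\, 0.5),\ p = (0.2,\, 0.8) \ \Rightarrow\ \bar{p} = 0.5
%   K = 1:\ \pi = (1),\ p = (0.5)                 \ \Rightarrow\ \bar{p} = 0.5
```

Because the marginal distribution depends on the parameters only through the weighted mean, the data at a single voxel cannot distinguish between these parameter sets.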

An eternal question for all work with mixture models and ICA is: how many components are needed? We were expecting the authors to dispense with at least one tuning parameter by avoiding the PCA dimension reduction for fMRI and directly estimating mixtures with the original fMRI data. While perhaps impractical computationally, won’t direct mixture modeling of the full-size fMRI data perhaps produce a more robust method?

And finally, we were disappointed that the issues of stability and robustness were not addressed more thoroughly. Both ICA and mixture models are notoriously sensitive to initial conditions, with random initialization sometimes producing quite different results (Himberg et al., 2004). While it is not discussed explicitly in the paper, the ICA tool used in the authors’ R software (icaimax from the ica package) uses a fixed initialization of the mixing matrix. In contrast, it is common to use a random initialization, as in FSL’s MELODIC and “fastica” in R and Matlab. In particular, we worry that the extra source of random initialization (mixture model + ICA) may lead to greater instability in DICA relative to usual ICA.

In the remainder of this comment we use the authors’ software and example datasets to examine the stability of the estimated ICA results, comparing DICA to standard ICA. We also explore the similarity and interpretability of the discovered DICA fMRI components in comparison to usual ICA.

1. Methods

The DICA method consists of two stages, first a mixture model fitting and then ICA on the mlogit-transformed mixture weights. In the current implementation the initialization of the mixture model via k-means is random and leads to different results with each run, but the icaimax ICA initializes the mixing matrix to the identity; other ICA implementations initialize the mixing matrix with i.i.d. random Gaussian entries. We consider measuring both the stability of the results over multiple re-runs of the algorithm, and similarity between the DICA and usual ICA results. We consider three variants of methods: DICA as originally implemented (k-means-only randomness), DICA with random initialization of the ICA mixing matrix as well, and the usual ICA with random initialization.
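The two ICA initialization schemes contrasted above can be sketched as follows. This is only an illustration of the difference between a fixed (identity) and a random (i.i.d. Gaussian) starting point, not the code of icaimax or of any other package; the function name `init_mixing_matrix` and its arguments are hypothetical.

```python
import numpy as np

def init_mixing_matrix(n_components, mode="fixed", rng=None):
    """Return a starting value for an n_components x n_components mixing matrix.

    mode="fixed"  : identity matrix, so every run starts from the same point.
    mode="random" : i.i.d. standard Gaussian entries, so runs differ unless
                    the random generator is seeded identically.
    """
    if mode == "fixed":
        return np.eye(n_components)
    if mode == "random":
        rng = np.random.default_rng() if rng is None else rng
        return rng.standard_normal((n_components, n_components))
    raise ValueError("mode must be 'fixed' or 'random'")
```

With the fixed scheme, any run-to-run variation in DICA can only come from the k-means initialization of the mixture model; with the random scheme, both stages contribute.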

Interpretation of spatial independent components (ICs) is based on thresholded maps; we follow the DICA example code which thresholded the ICs at the 95th percentile in absolute value. As correlation of thresholded maps would largely be driven by overlap, we instead measure similarity as the spatial overlap on the basis of thresholded IC signs (−1,0,1) at each voxel. We measure overlap with the multi-class Dice coefficient, taking the better of the original comparison and a comparison with one map sign flipped. The L ICs from each run are unordered and may not correspond at all between different runs or between methods. We use a greedy matching algorithm, where we find the pair of ICs with the best Dice coefficient, then remove that pair from consideration, and repeat until all L IC pairs are matched. Each IC then has a Dice value for an optimally matched IC. To assess stability we use 5 runs of one method on a dataset, producing L × (5 choose 2) = 10L Dice values; for similarity we use 5 runs for each method and then have L × 5² = 25L Dice values for each pair of methods.
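The thresholding, sign-invariant overlap and greedy matching described above can be sketched as below. This is a minimal illustration, not the code used for our analysis. In particular, the text does not spell out the multi-class Dice formula, so the sketch assumes one plausible form: twice the number of voxels sharing the same nonzero sign, divided by the total number of nonzero voxels in the two maps. All function names are hypothetical, and both runs are assumed to have the same number L of ICs.

```python
import numpy as np

def threshold_signs(ic, pct=95.0):
    """Keep voxels whose |value| exceeds the pct-th percentile; return signs in {-1, 0, 1}."""
    cut = np.percentile(np.abs(ic), pct)
    return np.where(np.abs(ic) > cut, np.sign(ic), 0).astype(int)

def multiclass_dice(a, b):
    """Assumed form: 2 * (# voxels with the same nonzero sign) / (# nonzero in a + # nonzero in b)."""
    agree = np.sum((a == b) & (a != 0))
    total = np.sum(a != 0) + np.sum(b != 0)
    return 2.0 * agree / total if total > 0 else 0.0

def dice_sign_invariant(a, b):
    """Better of the original comparison and the comparison with one map sign-flipped."""
    return max(multiclass_dice(a, b), multiclass_dice(a, -b))

def greedy_match(maps_a, maps_b):
    """Greedily pair the rows of two (L, V) sign-map arrays by best Dice.

    Returns a list of (index_in_a, index_in_b, dice), one entry per IC.
    """
    d = np.array([[dice_sign_invariant(a, b) for b in maps_b] for a in maps_a])
    pairs = []
    for _ in range(maps_a.shape[0]):
        i, j = np.unravel_index(np.argmax(d), d.shape)
        pairs.append((int(i), int(j), float(d[i, j])))
        d[i, :] = -1.0  # remove the matched pair from further consideration
        d[:, j] = -1.0
    return pairs
```

Greedy matching is simple but not guaranteed to maximize the total Dice over all pairings; an assignment solver would be an alternative design choice.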

We use the data provided with the authors’ software, one fMRI dataset with 71 volumes and 271,633 voxels in mask, and one DTI dataset provided as eigenvalues (3 volumes) and eigenvectors (3 3-volume images) with 195,589 voxels in mask. For fMRI we ran DICA as illustrated by the authors (40 principal components (PCs), K = 20 mixture components, L = 14 ICs), and ran ICA similarly (40 PCs, L = 14 ICs). For DTI we likewise ran DICA as illustrated (K = 20, L = 14) and with reduced dimension (K = 20, L = 6); for ICA we used the 6 unique values of the tensor (L = 6 ICs).
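As a sketch of how the 6 unique tensor values can be obtained from data stored as eigenvalues and eigenvectors, each voxel's symmetric 3 × 3 tensor is rebuilt as D = E diag(λ) Eᵀ and its upper triangle is extracted. The array layout assumed here (eigenvalues as a V × 3 array, eigenvectors as a V × 3 × 3 array with eigenvectors in columns) and the function name are illustrative assumptions, not the format of the authors' files.

```python
import numpy as np

def tensor_unique_values(evals, evecs):
    """Rebuild diffusion tensors and return their 6 unique elements.

    evals : (V, 3) array of eigenvalues per voxel.
    evecs : (V, 3, 3) array; column k of evecs[v] is the k-th eigenvector.
    Returns a (V, 6) array ordered as Dxx, Dxy, Dxz, Dyy, Dyz, Dzz.
    """
    # D_ij = sum_k E_ik * lambda_k * E_jk, computed for every voxel at once
    tensors = np.einsum("vik,vk,vjk->vij", evecs, evals, evecs)
    rows, cols = np.triu_indices(3)  # upper triangle, including the diagonal
    return tensors[:, rows, cols]
```

The resulting V × 6 matrix plays the role of the data matrix passed to the usual ICA.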

Finally, we manually identified the primary sensory networks, visual, motor and auditory, in each of the 5 runs of DICA (full random initialization). These are the strongest of resting state fMRI networks and should be identified with some consistency.

2. Results

The stability results are presented in Figure 1, where “FI” stands for fixed initialization of ICA and “RI” for random initialization of ICA. The Dice values are sobering: similarity between re-runs on the very same data produces Dice values that range fully from zero to one. For fMRI (Fig. 1 left), DICA-FI has greater stability (higher Dice) than DICA-RI, as expected; remarkably, even DICA-RI’s degraded similarity, median Dice around 0.4, is better than ICA-RI. This suggests that the mixture modelling is providing a better latent representation than the PCA dimension reduction alone. For DTI (Fig. 1 right), with DICA and the suggested L = 14 components, RI has little impact; when running with L = 6 components to allow comparison with usual ICA, median Dice values are higher and if anything ICA is more reliable.

The similarity results are presented in Figure 2, where DICA is compared to ICA, without (FI) and with (RI) random ICA initialization. The Dice values are worse here for both fMRI (Fig. 2 left) with a median around 0.3 and DTI (Fig. 2 right) with a median of 0.1; however, for this L = 6 decomposition, there are a few outlier components that show high agreement.

Figure 3 shows manually identified components for 5 re-runs (columns), the visual network (top two rows, DICA then ICA) and auditory network (bottom two rows). For the visual network, close examination between columns shows some differences in the lateral extent of the thresholded regions, though there is more variability in ICA than DICA. For the auditory network, there is more evident variation, with the 3rd run for DICA picking up little of auditory cortex, while other runs add areas quite distant (e.g. the 1st run of DICA includes lateral occipital cortex), and there is variable negative weighting on the lingual gyrus (blue area near the midline). The motor network runs had variability similar to the auditory network’s (not shown).

3. Conclusions

We have highlighted the important issue of stability of solutions obtained from mixture model and ICA fits. On fMRI DICA was more stable than usual ICA while on DTI, for the case of L = 6 ICs, ICA was more stable. Visual comparisons of hand-identified fMRI ICs look grossly similar, though close inspection reveals some appreciable differences between re-runs on the same data, and these differences are reflected quantitatively in quite low Dice coefficients. For fMRI, though, the Dice results suggest DICA may deliver greater inter-run stability than ICA alone.

These findings are, however, at best illustrative. We used but 5 re-runs and used a single dataset that may not reflect current data quality. For example, typical resting state fMRI data has much longer time series and is usually subjected to extensive cleaning. However, we hope this exploration helps throw a spotlight on the issue of stability in the development of data-driven methodology for fMRI.

Acknowledgements

KK and TEN are supported by NIH grant R01DA048993; KK is a part of EPSRC Health Data Science Centre for Doctoral Training, EP/S02428X/1.

Contributor Information

Kan Keeratimahat, Department of Computer Science, Parks Road, University of Oxford.

Thomas E. Nichols, Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Population Health, University of Oxford

References

Bakshi R, Thompson AJ, Rocca MA, Pelletier D, Dousset V, Barkhof F, et al. (2008). MRI in multiple sclerosis: current status and future prospects. Lancet Neurology 7, 615–25.
Himberg J, Hyvärinen A, and Esposito F (2004). Validating the independent components of neuroimaging time series via clustering and visualization. NeuroImage 22, 1214–1222.
Jung T-P, Makeig S, McKeown M, Bell A, Lee T-W, and Sejnowski T (2001). Imaging brain dynamics using independent component analysis. Proceedings of the IEEE 89, 1107–1122.
Karnath HO, Sperber C, and Rorden C (2018). Mapping human brain lesions and their functional consequences. NeuroImage 165, 180–189.
Wardlaw JM, Valdés Hernández MC, and Muñoz-Maniega S (2015). What are white matter hyperintensities made of? Relevance to vascular cognitive impairment. Journal of the American Heart Association 4, 001140.
Wu B, Pal S, Kang J, and Guo Y (2021). Distributional Independent Component Analysis for Diverse Neuroimaging Modalities. Biometrics pages 1–23.
Xu L, Groth KM, Pearlson G, Schretlen DJ, and Calhoun VD (2009). Source-based morphometry: The use of independent component analysis to identify gray matter differences with application to schizophrenia. Human Brain Mapping 30, 711–724.
Zhang H, Schneider T, Wheeler-Kingshott CA, and Alexander DC (2012). NODDI: Practical in vivo neurite orientation dispersion and density imaging of the human brain. NeuroImage 61, 1000–1016.
