We are glad that our paper (1) has generated intense discussion in the fMRI field (2–4) on how to analyze fMRI data and how to correct for multiple comparisons. The goal of the paper was not to disparage any specific fMRI software, but to point out that parametric statistical methods are based on a number of assumptions that are not always valid for fMRI data, and that nonparametric statistical methods (5) are a good alternative. With AFNI’s introduction of nonparametric statistics in the function 3dttest++ (3, 6), the three most common fMRI software packages now all support nonparametric group inference [SPM through the SnPM toolbox (www2.warwick.ac.uk/fac/sci/statistics/staff/academic-research/nichols/software/snpm) and FSL through the function randomise].
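As an illustration of what such a nonparametric group analysis looks like in practice, the sketch below (in Python, with placeholder file names) launches a one-sample group-mean test with FSL's randomise using sign flipping and TFCE. The flags shown are standard randomise options, but the exact call depends on the design and software version, so this is a sketch rather than a recipe.

```python
import subprocess

# Placeholder file names; a nonparametric one-sample (group mean) analysis
# with FSL's randomise, using 5,000 sign flips and TFCE-based inference.
cmd = [
    "randomise",
    "-i", "all_subjects_copes.nii.gz",  # 4D image: one contrast map per subject
    "-o", "group_onesample",            # output basename
    "-m", "group_mask.nii.gz",          # analysis mask
    "-1",                               # one-sample group-mean test
    "-T",                               # threshold-free cluster enhancement (TFCE)
    "-n", "5000",                       # number of permutations (sign flips)
]
subprocess.run(cmd, check=True)
```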
Cox et al. (3) correctly point out that the bug in the AFNI function 3dClustSim only had a minor impact on the false-positive rate (FPR). This was also covered in our original paper (1): “We note that FWE [familywise error] rates are lower with the bug-fixed 3dClustSim function. As an example, the updated function reduces the degree of false positives from 31.0% to 27.1% for a CDT [cluster-defining threshold] of P = 0.01, and from 11.5% to 8.6% for a CDT of P = 0.001.” It is unfortunate that several media outlets focused extensively on this bug, when the main problem was found to be violations of the assumptions in the statistical models.
The statement that AFNI had particularly high FPRs compared with SPM and FSL is supported, for example, by figure S1A in our original paper (1) (Beijing data, two-sample t test with 20 subjects, CDT P = 0.01). For 8-mm smoothing, the FPR for AFNI is 23–31%, whereas it is 13–20% for SPM and 14–18% for FSL OLS. To understand the higher FPRs, we investigated how the 3dClustSim function works, which eventually led us to find the bug in 3dClustSim. However, we agree that AFNI did not produce higher FPRs for all parameter combinations.
The 70% FPR comes from figure S9C in our original report (1) (Oulu data, one-sample t test with 40 subjects, CDT P = 0.01, FSL OLS with 4-mm smoothing) and not, as some readers believed, from figure 2 in the original paper (1), which shows results for the ad hoc clustering approach. The main reason for using the highest observed FPR was to give the reader an idea of how severe the problem can be, but we agree that it led to an overly pessimistic view.
As pointed out by Cox et al. (3), the nonparametric approach also performed suboptimally for the one-sample t test, especially for the Oulu data. As discussed in our paper (1), the one-sample t test has an assumption of symmetrically distributed errors that can be violated by outliers in small samples. Our current research is therefore focused on how to improve the nonparametric one-sample test. Regarding the flexibility of permutation testing, recent work has shown that virtually any regression model with independent errors can be accommodated (5), and even longitudinal and repeated-measures data can be analyzed with a related bootstrap approach (7).
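To make the symmetry assumption concrete, here is a minimal sketch (in Python, with hypothetical per-subject contrast values) of the sign-flipping permutation test for a one-sample design; it is not the implementation evaluated in our paper, and the comments note where the symmetry assumption enters.

```python
import numpy as np

def sign_flip_test(contrasts, n_perm=5000, seed=0):
    """One-sample group test via sign flipping.

    `contrasts` holds one contrast value per subject (e.g., a single voxel
    or a cluster summary). Randomly flipping signs is only valid if the
    errors are symmetrically distributed around zero under the null; this
    is the same assumption that outliers in small samples can violate.
    Returns a two-sided permutation P value for the group mean.
    """
    rng = np.random.default_rng(seed)
    contrasts = np.asarray(contrasts, dtype=float)
    observed = contrasts.mean()

    null = np.empty(n_perm)
    for i in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=contrasts.size)  # flip each subject's sign
        null[i] = (signs * contrasts).mean()

    # Proportion of sign-flipped means at least as extreme as the observed mean
    return (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)

# Example with 20 hypothetical subject-level contrast values
data = np.random.default_rng(1).normal(0.2, 1.0, size=20)
print(sign_flip_test(data))
```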
Kessler et al. (4) extend our evaluations with a (nonparametric) cluster-based false-discovery rate (FDR) analysis of task data, to better understand how existing parametric FWE-based cluster P values should be interpreted. For the problematic CDT of P = 0.01, Kessler et al. conclude that a cluster FWE-corrected P value smaller than 0.00001 survives FDR correction at q = 0.05. Indeed, this information makes it easier to interpret existing results in the fMRI literature, but it should be noted that it is not straightforward to generalize these results to other studies. For example, the fMRI software used, the MR sequence used (EPI or multiband), the degree of smoothing, and the number of subjects are all likely to affect this cutoff. The only way to retrospectively evaluate existing results is, in our opinion, to reanalyze the original fMRI data [e.g., made available through OpenfMRI (8)] or to apply a new threshold to the statistical maps [e.g., made available through NeuroVault (9)].
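For readers who want to see how such a cutoff is applied, the sketch below (in Python, with hypothetical cluster-level P values) implements the standard Benjamini-Hochberg step-up procedure for FDR control at level q; the nonparametric ingredient in Kessler et al.'s approach concerns how the cluster P values themselves are obtained, which is not shown here.

```python
import numpy as np

def bh_fdr(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Given one P value per cluster, return a boolean mask of the clusters
    that survive FDR control at level q. The cluster P values used in the
    example below are hypothetical.
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    # Largest k such that p_(k) <= (k / m) * q; reject hypotheses 1..k
    below = ranked <= (np.arange(1, m + 1) / m) * q
    survive = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        survive[order[:k + 1]] = True
    return survive

# Example: hypothetical cluster-level P values
print(bh_fdr([0.0004, 0.012, 0.03, 0.2, 0.6], q=0.05))  # [ True  True  True False False]
```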
Finally, we would like to note the importance of data and code sharing. Cox et al. (3, 6) replicated and extended our findings using the same open fMRI data (10) as in our original paper (1) (and made use of our processing scripts, available on GitHub at https://github.com/wanderine/ParametricMultisubjectfMRI), ultimately resulting in improvements to the AFNI software. Furthermore, we would never have been able to identify the bug in 3dClustSim were AFNI not open-source software. Kessler et al. (4) also used the same task datasets from OpenfMRI (8) to find the empirical cluster FDR. Together, these examples show the importance of data sharing (11, 12), open-source software (13), code sharing (14, 15), and reproducibility (16).
Acknowledgments
This research was supported by the Neuroeconomic Research Initiative at Linköping University, Swedish Research Council Grant 2013-5229 (“Statistical Analysis of fMRI Data”), the Information Technology for European Advancement 3 Project BENEFIT (better effectiveness and efficiency by measuring and modelling of interventional therapy), the Swedish Research Council Linnaeus Center CADICS (control, autonomy, and decision-making in complex systems), and the Wellcome Trust.
Footnotes
The authors declare no conflict of interest.
References
- 1. Eklund A, Nichols TE, Knutsson H. Cluster failure: Why fMRI inferences for spatial extent have inflated false-positive rates. Proc Natl Acad Sci USA. 2016;113:7900–7905. Erratum in: Proc Natl Acad Sci USA. 2016;113:E4929.
- 2. Brown EN, Behrmann M. Controversy in statistical analysis of functional magnetic resonance imaging data. Proc Natl Acad Sci USA. 2017;114:E3368–E3369. doi: 10.1073/pnas.1705513114.
- 3. Cox RW, Chen G, Glen DR, Reynolds RC, Taylor PA. fMRI clustering and false positive rates. Proc Natl Acad Sci USA. 2017;114:E3370–E3371. doi: 10.1073/pnas.1614961114.
- 4. Kessler D, Angstadt M, Sripada CS. Reevaluating “cluster failure” in fMRI using nonparametric control of the false discovery rate. Proc Natl Acad Sci USA. 2017;114:E3372–E3373. doi: 10.1073/pnas.1614502114.
- 5. Winkler AM, Ridgway GR, Webster MA, Smith SM, Nichols TE. Permutation inference for the general linear model. Neuroimage. 2014;92:381–397. doi: 10.1016/j.neuroimage.2014.01.060.
- 6. Cox RW, Reynolds RC, Taylor PA. AFNI and clustering: False positive rates redux. bioRxiv. 2016. doi: 10.1101/065862.
- 7. Guillaume B, Hua X, Thompson PM, Waldorp L, Nichols TE; Alzheimer’s Disease Neuroimaging Initiative. Fast and accurate modelling of longitudinal and repeated measures neuroimaging data. Neuroimage. 2014;94:287–302. doi: 10.1016/j.neuroimage.2014.03.029.
- 8. Poldrack RA, et al. Toward open sharing of task-based fMRI data: The OpenfMRI project. Front Neuroinform. 2013;7:12. doi: 10.3389/fninf.2013.00012.
- 9. Gorgolewski KJ, et al. NeuroVault.org: A repository for sharing unthresholded statistical maps, parcellations, and atlases of the human brain. Neuroimage. 2016;124:1242–1244. doi: 10.1016/j.neuroimage.2015.04.016.
- 10. Biswal BB, et al. Toward discovery science of human brain function. Proc Natl Acad Sci USA. 2010;107:4734–4739. doi: 10.1073/pnas.0911855107.
- 11. Poldrack RA, Gorgolewski KJ. Making big data open: Data sharing in neuroimaging. Nat Neurosci. 2014;17:1510–1517. doi: 10.1038/nn.3818.
- 12. Poline JB, et al. Data sharing in neuroimaging research. Front Neuroinform. 2012;6:9. doi: 10.3389/fninf.2012.00009.
- 13. Ince DC, Hatton L, Graham-Cumming J. The case for open computer programs. Nature. 2012;482:485–488. doi: 10.1038/nature10836.
- 14. Baker M. Why scientists must share their research code. Nature. 2016. doi: 10.1038/nature.2016.20504.
- 15. Eglen S, et al. Towards standard practices for sharing computer code and programs in neuroscience. bioRxiv. 2016. doi: 10.1101/045104.
- 16. Gorgolewski KJ, et al. BIDS apps: Improving ease of use, accessibility, and reproducibility of neuroimaging data analysis methods. PLoS Comput Biol. 2017;13:e1005209. doi: 10.1371/journal.pcbi.1005209.